negbinomial {VGAM} | R Documentation |
Maximum likelihood estimation of the two parameters of a negative binomial distribution.
negbinomial(lmu = "loge", lk = "loge", emu = list(), ek = list(), ik = NULL, cutoff = 0.995, Maxiter = 5000, deviance.arg = FALSE, method.init = 1, shrinkage.init = 0.95, zero = -2)
lmu, lk |
Link functions applied to the mu and k parameters.
See Links for more choices.
|
emu, ek |
List. Extra argument for each of the links.
See earg in Links for general information.
|
ik |
Optional initial values for k.
If convergence fails, try different values (and/or use
method.init).
For an S-column response, ik can be of length S.
A value NULL means an initial value for each response is
computed internally using a range of values.
This argument is ignored if used within cqo; see
the iKvector argument of qrrvglm.control instead.
|
cutoff |
A numeric which is close to 1 but never exactly 1.
Used to specify how many terms of the infinite series
for computing the second diagonal element of the expected information
matrix are actually used.
The probabilities are summed until they reach this value or more
(but no more than Maxiter terms are allowed).
It is like specifying p in an imaginary function
qnegbin(p).
|
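As a rough illustration of what cutoff controls, the sketch below uses base R's qnbinom as a stand-in for the imaginary qnegbin(p). The values of mu and k are illustrative assumptions, and the code only mimics the idea; it is not what VGAM runs internally.

```r
# Illustrative values (assumptions): mean mu and index parameter k.
mu <- 5
k <- 2
cutoff <- 0.995

# Smallest y whose cumulative probability reaches the cutoff.
# Since y starts at 0, about y.max + 1 terms of the series are needed.
y.max <- qnbinom(cutoff, size = k, mu = mu)
n.terms <- y.max + 1

# Check: the probabilities of 0, ..., y.max sum to at least the cutoff.
cum.prob <- sum(dnbinom(0:y.max, size = k, mu = mu))
```

If n.terms came out larger than Maxiter, the series would be truncated early and the computed element could be inaccurate.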
Maxiter |
Integer. The maximum number of terms allowed when computing
the second diagonal element of the expected information matrix.
In theory, the value involves an infinite series.
If this argument is too small then the value may be inaccurate.
|
deviance.arg |
Logical. If TRUE, the deviance function
is attached to the object. Under ordinary circumstances, it should
be left alone because it really assumes the index parameter is at
the maximum likelihood estimate. Consequently, one cannot use that
criterion to minimize within the IRLS algorithm.
It should be set TRUE only when used with cqo
under the fast algorithm.
|
method.init |
An integer with value 1 or 2 which
specifies the initialization method for the mu parameter.
If convergence fails, try the other value,
and/or specify a value for shrinkage.init,
and/or specify a value for ik.
|
shrinkage.init |
How much shrinkage is used when initializing mu.
The value must be between 0 and 1 inclusive, and
a value of 0 means the individual response values are used,
and a value of 1 means the median or mean is used.
This argument is used in conjunction with method.init.
|
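The effect of shrinkage.init can be pictured as a weighted average between each response and a central value. The helper shrink.mu below is hypothetical and the exact formula VGAM uses internally is an assumption here; the sketch only reproduces the two endpoints described above (0 gives the responses, 1 gives the mean).

```r
# Illustrative counts (assumption).
y <- c(0, 1, 1, 3, 10)

# Hypothetical helper: shrink each response toward the overall mean.
shrink.mu <- function(y, shrinkage.init) {
  stopifnot(shrinkage.init >= 0, shrinkage.init <= 1)
  shrinkage.init * mean(y) + (1 - shrinkage.init) * y
}

mu0 <- shrink.mu(y, 0)      # the individual response values
mu1 <- shrink.mu(y, 1)      # every value equals mean(y)
mu95 <- shrink.mu(y, 0.95)  # mostly the mean, nudged toward each response
```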
zero |
Integer-valued vector, usually assigned -2 or 2 if used
at all. Specifies which of the two linear/additive predictors is
modelled as an intercept only. By default, the k parameter
(after lk is applied) is modelled as a single unknown
number that is estimated. It can be modelled as a function of the
explanatory variables by setting zero=NULL. A negative value
means that the value is recycled, so setting -2 means all k
are intercept-only.
|
The negative binomial distribution can be motivated in several ways, e.g., as a Poisson distribution with a mean that is gamma distributed. There are several common parametrizations of the negative binomial distribution. The one used here uses the mean mu and an index parameter k, both of which are positive. Specifically, the density of a random variable Y is
f(y;mu,k) = C_{y}^{y + k - 1} [mu/(mu+k)]^y [k/(k+mu)]^k
where y=0,1,2,...,
and mu > 0 and k > 0.
Note that the dispersion parameter is
1/k, so that as k approaches infinity the negative
binomial distribution approaches a Poisson distribution.
The response has variance Var(Y)=mu*(1+mu/k).
When fitted, the fitted.values
slot of the object contains
the estimated value of the mu parameter, i.e., of the mean
E(Y).
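The density and variance formulas above can be checked numerically in base R. The sketch below assumes that k corresponds to the size argument of dnbinom (as the Examples note, "k is size"); the function name dnegbin.sketch and the values of mu and k are illustrative assumptions.

```r
# Density as written above, with the binomial coefficient expressed through
# log-gamma functions so that non-integer k is allowed.
dnegbin.sketch <- function(y, mu, k) {
  exp(lgamma(y + k) - lgamma(k) - lgamma(y + 1)) *
    (mu / (mu + k))^y * (k / (k + mu))^k
}

mu <- 3; k <- 1.5; y <- 0:200   # illustrative values (assumptions)
p <- dnegbin.sketch(y, mu, k)

# Agreement with base R's (size, mu) parametrization.
max.diff <- max(abs(p - dnbinom(y, size = k, mu = mu)))

# Mean and variance computed from the (truncated) density; these should be
# close to mu and mu * (1 + mu / k) respectively.
m <- sum(y * p)
v <- sum((y - m)^2 * p)
```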
The negative binomial distribution can be coerced into the classical
GLM framework, with one of the parameters being of interest and the
other treated as a nuisance/scale parameter (and implemented in the
MASS library). This VGAM family function negbinomial treats
both parameters on the same footing, and estimates them both by full
maximum likelihood estimation.
The parameters mu and k are independent (diagonal expected information matrix), and the confidence region for k is extremely skewed so that its standard error is often of no practical use. The parameter 1/k has been used as a measure of aggregation.
This VGAM function handles multivariate responses, so
that a matrix can be used as the response. The number of columns is the
number of species, say, and setting zero=-2 means that all
species have a k equalling a (different) intercept only.
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm and vgam.
The Poisson model corresponds to k equalling infinity.
If the data is Poisson or close to Poisson, numerical problems will
occur. Choosing a log-log link may help in such cases;
otherwise use poissonff.
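The Poisson limit can be seen directly from the densities. The following sketch, using base R only and an illustrative mean (an assumption), shows the largest pointwise gap between the negative binomial and Poisson probabilities shrinking as k grows, which is why k is poorly determined for near-Poisson data.

```r
mu <- 3                        # illustrative mean (assumption)
y <- 0:20
k.values <- c(1, 10, 100, 1e6)

# Largest absolute difference between the two probability functions on y.
gaps <- sapply(k.values, function(k) {
  max(abs(dnbinom(y, size = k, mu = mu) - dpois(y, lambda = mu)))
})

print(data.frame(k = k.values, max.abs.diff = signif(gaps, 3)))
```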
This function is fragile; the maximum likelihood estimate of the
index parameter is fraught (see Lawless, 1987). In general, the
quasipoissonff is more robust than this function.
Assigning values to the ik argument may lead to a local solution,
and smaller values are preferred over large values when using this argument.
Yet to do: write a family function which uses the method of moments estimator for k.
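In the meantime, a moment estimate of k can be computed by hand by equating the sample mean and variance to mu and mu*(1+mu/k). The sketch below does this for the apple tree data used in the Examples; the name k.mom is illustrative, and this is not a VGAM function. Such a value could plausibly serve as a starting point for ik.

```r
# Apple tree mite counts: y = count per leaf, w = number of leaves.
y <- 0:7
w <- c(70, 38, 17, 10, 9, 3, 2, 1)

n <- sum(w)
ybar <- sum(w * y) / n                  # sample mean
s2 <- sum(w * (y - ybar)^2) / (n - 1)   # sample variance

# Solve Var(Y) = mu * (1 + mu / k) for k, with mu = ybar and Var(Y) = s2.
# Only meaningful when s2 > ybar, i.e., when the data are overdispersed.
k.mom <- ybar^2 / (s2 - ybar)
```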
This function can be used by the fast algorithm in cqo; however, setting
EqualTolerances=TRUE and ITolerances=FALSE is recommended.
In the first example below (Bliss and Fisher, 1953), from each of 6 McIntosh apple trees in an orchard that had been sprayed, 25 leaves were randomly selected. On each of the leaves, the number of adult female European red mites was counted.
Thomas W. Yee
Lawless, J. F. (1987) Negative binomial and mixed Poisson regression. The Canadian Journal of Statistics 15, 209–225.
Bliss, C. and Fisher, R. A. (1953) Fitting the negative binomial distribution to biological data. Biometrics 9, 174–200.
quasipoissonff, poissonff, cao, cqo, zinegbinomial, posnegbinomial, invbinomial, rnbinom, nbolf.
y = 0:7 # Example 1: apple tree data
w = c(70, 38, 17, 10, 9, 3, 2, 1)
fit = vglm(y ~ 1, negbinomial, weights=w)
summary(fit)
coef(fit, matrix=TRUE)
Coef(fit)

## Not run:
x = runif(n <- 500) # Example 2: simulated data with multivariate response
y1 = rnbinom(n, mu=exp(3+x), size=exp(1)) # k is size
y2 = rnbinom(n, mu=exp(2-x), size=exp(0))
fit = vglm(cbind(y1,y2) ~ x, negbinomial, trace=TRUE)
coef(fit, matrix=TRUE)
## End(Not run)