vglm {VGAM} | R Documentation |
vglm is used to fit vector generalized linear models (VGLMs).
This is a large class of models that includes
generalized linear models (GLMs) as special cases.
vglm(formula, family, data = list(), weights = NULL, subset = NULL,
     na.action = na.fail, etastart = NULL, mustart = NULL,
     coefstart = NULL, control = vglm.control(...), offset = NULL,
     method = "vglm.fit", model = FALSE, x.arg = TRUE, y.arg = TRUE,
     contrasts = NULL, constraints = NULL, extra = list(),
     qr.arg = FALSE, smart = TRUE, ...)
In the following, M is the number of linear predictors.
formula |
a symbolic description of the model to be fit.
The RHS of the formula is applied to each linear predictor. Different
variables in each linear predictor can be chosen by specifying
constraint matrices.
|
family |
a function of class "vglmff" (see vglmff-class)
describing what statistical model is to be fitted. This is called a
"VGAM family function". See CommonVGAMffArguments
for general information about many types of arguments found in this
type of function.
|
data |
an optional data frame containing the variables in the model.
By default the variables are taken from
environment(formula), typically the environment from which
vglm is called.
|
weights |
an optional vector or matrix of (prior) weights
to be used in the fitting process.
If weights is a matrix, then it must be in
matrix-band form, whereby the first M
columns of the matrix are the
diagonals, followed by the upper-diagonal band, followed by the
band above that, etc. In this case, there can be up to M(M+1)/2
columns, with the last column corresponding to the (1,M) elements
of the weight matrices.
|
subset |
an optional logical vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when
the data contain NAs.
The default is set by the na.action setting
of options, and is na.fail if that is unset.
The "factory-fresh" default is na.omit.
|
etastart |
starting values for the linear predictors.
It is an M-column matrix. If M = 1 then it may be a vector.
|
mustart |
starting values for the
fitted values. It can be a vector or a matrix.
Some family functions do not make use of this argument.
|
coefstart |
starting values for the coefficient vector.
|
control |
a list of parameters for controlling the fitting process.
See vglm.control for details.
|
offset |
a vector or M-column matrix of offset values. These are a
priori known and are added to the linear predictors during fitting.
|
method |
the method to be used in fitting the model. The default (and
presently only) method vglm.fit uses iteratively reweighted
least squares (IRLS).
|
model |
a logical value indicating whether the
model frame
should be assigned in the model slot.
|
x.arg, y.arg |
logical values indicating whether
the model matrix and response vector/matrix used in the fitting
process should be assigned in the x and y slots.
Note the model matrix is the LM model matrix; to get the VGLM
model matrix type model.matrix(vglmfit) where
vglmfit is a vglm object.
|
contrasts |
an optional list. See the contrasts.arg argument
of model.matrix.default.
|
constraints |
an optional list of constraint matrices.
Each component of the list must be named after the term it corresponds
to (the name must match the term label exactly, as a character string).
Each constraint matrix must have M rows and be of full column
rank. By default, constraint matrices are the M by M
identity
matrix unless arguments in the family function itself override
these values.
If constraints is used it must contain all the
terms; an incomplete list is not accepted.
|
extra |
an optional list with any extra information that might be needed by
the VGAM family function.
|
qr.arg |
logical value indicating whether
the qr slot, which holds the QR decomposition of the
VLM model matrix, is returned on the object.
|
smart |
logical value indicating whether smart prediction
(smartpred) will be used.
|
... |
further arguments passed into vglm.control.
|
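The matrix-band form described under the weights argument can be sketched in base R as follows. This is only an illustration of the column ordering for a single symmetric M x M weight matrix with made-up values; the helper name band_index is hypothetical and is not part of VGAM.

```r
# A made-up symmetric 3 x 3 weight matrix for one observation (M = 3)
M <- 3
W <- matrix(c(4.0, 1.0, 0.5,
              1.0, 3.0, 0.2,
              0.5, 0.2, 2.0), M, M, byrow = TRUE)

# Band ordering: the diagonal (1,1), (2,2), (3,3); then the first
# upper band (1,2), (2,3); then the next band (1,3).
# 'band_index' is a hypothetical helper, not a VGAM function.
band_index <- do.call(rbind, lapply(0:(M - 1), function(k) {
  cbind(row = seq_len(M - k), col = seq_len(M - k) + k)
}))

# One row of a matrix-band weights matrix for this observation
w_band <- W[band_index]
w_band
length(w_band)   # 3 + 2 + 1 = 6 columns when M = 3
```

For n observations, each row of the weights matrix would hold one such vector, so the last column always carries the (1,M) elements.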
A vector generalized linear model (VGLM) is loosely defined as a statistical model that is a function of M linear predictors. The central formula is given by
eta_j = beta_j^T x
where x is a vector of explanatory variables (sometimes just a 1 for an intercept), and beta_j is a vector of regression coefficients to be estimated. Here, j=1,...,M where M is finite. Then one can write eta=(eta_1,...,eta_M)^T as a vector of linear predictors.
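The central formula can be checked numerically with a small base R sketch. All values below are made up for illustration, and the names B, H_parallel and H_first are hypothetical; the second part shows, under the usual reading of constraint matrices, how an M-row matrix spreads one estimated coefficient across the M linear predictors.

```r
M <- 2
x <- c(1, 0.5, -1.2)            # intercept, then two covariates (made up)

# Columns of B are the coefficient vectors beta_1, ..., beta_M
B <- cbind(beta_1 = c( 0.3, 1.0, -0.5),
           beta_2 = c(-1.0, 0.2,  0.7))

# eta_j = beta_j^T x for j = 1, ..., M
eta <- drop(crossprod(B, x))
eta

# Constraint matrices have M rows. Two common shapes:
H_parallel <- matrix(1, M, 1)        # same coefficient in every predictor
H_first    <- matrix(c(1, 0), M, 1)  # covariate enters eta_1 only

drop(H_parallel %*% 0.8)   # coefficient 0.8 applied to eta_1 and eta_2
drop(H_first %*% 0.8)      # coefficient 0.8 applied to eta_1, zero for eta_2
```

With the default M by M identity constraint matrix, each linear predictor gets its own freely estimated coefficient for that term.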
Most users will find vglm similar in flavour to glm.
The function vglm.fit actually does the work.
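The method argument states that vglm.fit uses iteratively reweighted least squares (IRLS). The base R sketch below shows the idea for the simplest case only (M = 1, Poisson response with a log link) using the data from Example 1. It is not the code inside vglm.fit; the starting values and stopping rule are assumptions chosen for brevity.

```r
# Dobson (1990) data, as in Example 1
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

X <- model.matrix(~ outcome + treatment)
y <- counts

beta <- rep(0, ncol(X))
eta  <- log(y + 0.5)                 # assumed starting linear predictor
for (iter in seq_len(25)) {
  mu <- exp(eta)                     # inverse of the log link
  z  <- eta + (y - mu) / mu          # working (adjusted dependent) response
  w  <- mu                           # working weights for Poisson/log
  beta_old <- beta
  # Weighted least squares step: solve (X'WX) beta = X'Wz
  beta <- drop(solve(crossprod(X, w * X), crossprod(X, w * z)))
  eta  <- drop(X %*% beta)
  if (max(abs(beta - beta_old)) < 1e-8) break
}

# The result should agree with glm() up to convergence tolerance
glm_fit <- glm(counts ~ outcome + treatment, family = poisson)
max(abs(beta - coef(glm_fit)))
```

In a VGLM the same scheme is applied with M linear predictors per observation, so the scalar working weights become M x M weight matrices.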
An object of class "vglm", which has the
following slots. Some of these may not be assigned to save
space, and will be recreated if necessary later.
extra |
the list extra at the end of fitting. |
family |
the family function (of class "vglmff"). |
iter |
the number of IRLS iterations used. |
predictors |
an M-column matrix of linear predictors. |
assign |
a named list which matches the columns and the (LM) model matrix terms. |
call |
the matched call. |
coefficients |
a named vector of coefficients. |
constraints |
a named list of constraint matrices used in the fitting. |
contrasts |
the contrasts used (if any). |
control |
list of control parameters used in the fitting. |
criterion |
list of convergence criteria evaluated at the final IRLS iteration. |
df.residual |
the residual degrees of freedom. |
df.total |
the total degrees of freedom. |
dispersion |
the scaling parameter. |
effects |
the effects. |
fitted.values |
the fitted values, as a matrix.
This is usually the mean but may be quantiles, or the location
parameter, e.g., in the Cauchy model.
|
misc |
a list to hold miscellaneous parameters. |
model |
the model frame. |
na.action |
a list holding information about missing values. |
offset |
if non-zero, an M-column matrix of offsets. |
post |
a list where post-analysis results may be put. |
preplot |
used by plotvgam; the plotting parameters
may be put here. |
prior.weights |
initially supplied weights. |
qr |
the QR decomposition used in the fitting. |
R |
the R matrix in the QR decomposition used in the fitting. |
rank |
numerical rank of the fitted model. |
residuals |
the working residuals at the final IRLS iteration. |
rss |
residual sum of squares at the final IRLS iteration with the adjusted dependent vectors and weight matrices. |
smart.prediction |
a list of data-dependent parameters (if any)
that are used by smart prediction.
|
terms |
the terms object used. |
weights |
the weight matrices at the final IRLS iteration. This is in matrix-band form. |
x |
the model matrix (linear model LM, not VGLM). |
xlevels |
the levels of the factors, if any, used in fitting. |
y |
the response, in matrix form. |
This slot information is repeated at vglm-class.
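As a brief illustration of the slots listed above, the sketch below refits the model from Example 1 and reads a few slots with the "@" operator. It assumes the VGAM package is installed; only slots named in the list above are accessed.

```r
library(VGAM)  # assumes the VGAM package is installed

# Dobson (1990) data, as in Example 1
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
d.AD <- data.frame(treatment, outcome, counts)

fit <- vglm(counts ~ outcome + treatment, family = poissonff, data = d.AD)

# Slots are read with "@"
fit@iter              # number of IRLS iterations used
dim(fit@predictors)   # n rows and M columns of linear predictors
fit@df.residual       # residual degrees of freedom

# Extractor functions are usually more convenient than raw slots
coef(fit)
head(fitted(fit))
```

Where an extractor function exists it is generally the safer choice, since some slots may be left unassigned to save space.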
This function can fit a wide variety of statistical models. Some of
these are harder to fit than others because of inherent numerical
difficulties associated with some of them. Successful model fitting
benefits from cumulative experience. Varying the values of arguments
in the VGAM family function itself is a good first step if
difficulties arise, especially if initial values can be supplied.
A second, more general step is to vary the values of arguments in
vglm.control.
A third step is to make use of arguments such as etastart,
coefstart and mustart.
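The second and third steps might look like the following sketch, which assumes the VGAM package is installed and reuses the data from Example 1. Taking starting values from a glm fit is only one possible choice, made here for illustration; this well-behaved model does not actually need the help.

```r
library(VGAM)  # assumes the VGAM package is installed

counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
d.AD <- data.frame(treatment, outcome, counts)

# Second step: adjust the fitting controls via vglm.control()
fit_ctrl <- vglm(counts ~ outcome + treatment, poissonff, data = d.AD,
                 control = vglm.control(maxit = 50, epsilon = 1e-8))

# Third step: supply starting values for the coefficient vector.
# Here they come from an ordinary glm() fit (an illustrative choice).
start <- unname(coef(glm(counts ~ outcome + treatment,
                         family = poisson, data = d.AD)))
fit_start <- vglm(counts ~ outcome + treatment, poissonff, data = d.AD,
                  coefstart = start)

# Both fits should reach essentially the same estimates
max(abs(coef(fit_ctrl) - coef(fit_start)))
```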
Some VGAM family functions end in "ff" to avoid
interference with other functions, e.g., binomialff,
poissonff, gaussianff, gammaff. This is because VGAM family
functions are incompatible with glm (and also gam
in the gam library and gam in the mgcv library).
The smart prediction (smartpred) library is bundled with
the VGAM library.
The theory behind the scaling parameter is currently being made more rigorous, but it should give the same value as the scale parameter for GLMs.
In Example 5 below, the xij argument is used to illustrate covariates
that are specific to a linear predictor. Here, lop/rop are
the ocular pressures of the left/right eye (artificial data). Variables
leye and reye might be the presence/absence of a particular
disease on the LHS/RHS eye respectively. See fill
for more details and examples.
Thomas W. Yee
Yee, T. W. and Hastie, T. J. (2003) Reduced-rank vector generalized linear models. Statistical Modelling, 3, 15–41.
Yee, T. W. and Wild, C. J. (1996) Vector generalized additive models. Journal of the Royal Statistical Society, Series B, Methodological, 58, 481–493.
Documentation accompanying the VGAM package at http://www.stat.auckland.ac.nz/~yee contains further information and examples.
vglm.control, vglm-class, vglmff-class, smartpred, vglm.fit, fill, rrvglm, vgam.
Methods functions include coef.vlm, predict.vglm, summary.vglm, etc.
# Example 1. Dobson (1990) Page 93: Randomized Controlled Trial :
counts = c(18,17,15,20,10,20,25,13,12)
outcome = gl(3,1,9)
treatment = gl(3,3)
print(d.AD <- data.frame(treatment, outcome, counts))
vglm.D93 = vglm(counts ~ outcome + treatment, family=poissonff)
summary(vglm.D93)

# Example 2. Multinomial logit model
data(pneumo)
pneumo = transform(pneumo, let=log(exposure.time))
vglm(cbind(normal, mild, severe) ~ let, multinomial, pneumo)

# Example 3. Proportional odds model
fit = vglm(cbind(normal,mild,severe) ~ let, cumulative(par=TRUE), pneumo)
coef(fit, matrix=TRUE)
constraints(fit)
fit@x # LM model matrix
model.matrix(fit) # Larger VGLM model matrix

# Example 4. Bivariate logistic model
data(coalminers)
fit = vglm(cbind(nBnW, nBW, BnW, BW) ~ age, binom2.or, coalminers, trace=TRUE)
coef(fit, matrix=TRUE)
fit@y

# Example 5. The use of the xij argument
n = 1000
eyes = data.frame(lop = runif(n), rop = runif(n))
eyes = transform(eyes,
                 leye = ifelse(runif(n) < logit(-1+2*lop, inverse=TRUE), 1, 0),
                 reye = ifelse(runif(n) < logit(-1+2*rop, inverse=TRUE), 1, 0))
fit = vglm(cbind(leye,reye) ~ lop + rop + fill(lop),
           binom2.or(exchangeable=TRUE, zero=3),
           xij = op ~ lop + rop + fill(lop), data=eyes)
coef(fit)
coef(fit, matrix=TRUE)
coef(fit, matrix=TRUE, compress=FALSE)

# Here's one method to handle the xij argument with a term that
# produces more than one column in the model matrix.
POLY3 = function(x, ...) {
    # A cubic
    poly(c(x,...), 3)[1:length(x),]
}
fit = vglm(cbind(leye,reye) ~ POLY3(lop,rop) + POLY3(rop,lop) +
           fill(POLY3(lop,rop)),
           binom2.or(exchangeable=TRUE, zero=3), data=eyes,
           xij = POLY3(op) ~ POLY3(lop,rop) + POLY3(rop,lop) +
                 fill(POLY3(lop,rop)))
coef(fit)
coef(fit, matrix=TRUE)
coef(fit, matrix=TRUE, compress=FALSE)
predict(fit)[1:4,]