uqo {VGAM}R Documentation

Fitting Unconstrained Quadratic Ordination (UQO)

Description

An unconstrained quadratic ordination (UQO) (equivalently, noncanonical Gaussian ordination) model is fitted using the quadratic unconstrained vector generalized linear model (QU-VGLM) framework. In this documentation, M is the number of linear predictors or species.

Usage

uqo(formula, family, data = list(), weights = NULL, subset = NULL,
    na.action = na.fail, etastart = NULL, mustart = NULL,
    coefstart = NULL, control = uqo.control(...), offset = NULL,
    method = "uqo.fit", model = FALSE, x.arg = TRUE, y.arg = TRUE,
    contrasts = NULL, constraints = NULL, extra = NULL,
    qr.arg = FALSE, ...)

Arguments

formula a symbolic description of the model to be fit. Since there is no x_2 vector by definition, the RHS of the formula has all terms belonging to the x_1 vector.
family a function of class "vglmff" describing what statistical model is to be fitted. Currently two families are supported: Poisson and binomial.
data an optional data frame containing the variables in the model. By default the variables are taken from environment(formula), typically the environment from which uqo is called.
weights an optional vector or matrix of (prior) weights to be used in the fitting process. This argument should not be used.
subset an optional logical vector specifying a subset of observations to be used in the fitting process.
na.action a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The ``factory-fresh'' default is na.omit.
etastart starting values for the linear predictors. It is a M-column matrix. If M=1 then it may be a vector.
mustart starting values for the fitted values. It can be a vector or a matrix. Some family functions do not make use of this argument.
coefstart starting values for the coefficient vector.
control a list of parameters for controlling the fitting process. See uqo.control for details.
offset a vector or M-column matrix of offset values. This argument should not be used.
method the method to be used in fitting the model. The default (and presently only) method uqo.fit uses iteratively reweighted least squares (IRLS).
model a logical value indicating whether the model frame should be assigned in the model slot.
x.arg, y.arg logical values indicating whether the model matrix and response matrix used in the fitting process should be assigned in the x and y slots. Note the model matrix is the LM model matrix.
contrasts an optional list. See the contrasts.arg of model.matrix.default.
constraints an optional list of constraint matrices. This argument should not be used.
extra an optional list with any extra information that might be needed by the family function.
qr.arg logical value indicating whether the slot qr, which returns the QR decomposition of the VLM model matrix, is returned on the object. This argument should not be set TRUE.
... further arguments passed into uqo.control.

Details

Unconstrained quadratic ordination models fit symmetric bell-shaped response curves/surfaces to response data, but the latent variables are largely free parameters and are not constrained to be linear combinations of the environmental variables. This poses a difficult optimization problem. The current algorithm is very simple and will often fail (even for Rank=1) but hopefully this will be improved in the future.

The central formula is given by

eta = B_1^T x_1 + A nu + sum_{m=1}^M (nu^T D_m nu) e_m

where x_1 is a vector (usually just a 1 for an intercept), nu is a R-vector of latent variables, e_m is a vector of 0s but with a 1 in the mth position. The eta are a vector of linear/additive predictors, e.g., the mth element is eta_m = log(E[Y_m]) for the mth species. The matrices B_1, A, and D_m are estimated from the data, i.e., contain the regression coefficients. Also, nu is estimated. The tolerance matrices satisfy T_s = -(0.5 D_s^(-1). Many important UQO details are directly related to arguments in uqo.control; see also cqo and qrrvglm.control.

Currently, only Poisson and binomial VGAM family functions are implemented for this function, and dispersion parameters for these are assumed known. Thus the Poisson is catered for by poissonff, and the binomial by binomialff. Those beginning with "quasi" have dispersion parameters that are estimated for each species, hence will give an error message here.

Value

An object of class "uqo" (this may change to "quvglm" in the future).

Warning

Local solutions are not uncommon when fitting UQO models. To increase the chances of obtaining the global solution, set ITolerances=TRUE or EqualTolerances=TRUE and increase the value of the argument Bestof in uqo.control. For reproducibility of the results, it pays to set a different random number seed before calling uqo (the function set.seed does this).

The function uqo is very sensitive to initial values, and there is a lot of room for improvement here.

UQO is computationally expensive. It pays to keep the rank to no more than 2, and 1 is much preferred over 2. The data needs to conform closely to the statistical model.

Currently there is a bug with the argument Crow1positive in uqo.control. This argument might be interpreted as controlling the sign of the first site score, but currently this is not done.

Note

The site scores are centered. When R>1, they are uncorrelated and should be unique up to a rotation.

The argument Bestof in uqo.control controls the number of models fitted (each uses different starting values) to the data. This argument is important because convergence may be to a local solution rather than the global solution. Using more starting values increases the chances of finding the global solution. Local solutions arise because the optimization problem is highly nonlinear.

In the example below, a CQO model is fitted and used for providing initial values for a UQO model.

Author(s)

Thomas W. Yee

References

Yee, T. W. (2004) A new technique for maximum-likelihood canonical Gaussian ordination. Ecological Monographs, 74, 685–701.

Yee, T. W. (2005) On constrained and unconstrained quadratic ordination. Manuscript in preparation.

Yee, T. W. (2006) Constrained additive ordination. Ecology, 87, 203–213.

See Also

uqo.control, cqo, qrrvglm.control, rcqo, poissonff, binomialff, Coef.uqo, lvplot.uqo, persp.uqo, trplot.uqo, vcov.uqo, set.seed, hspider.

Examples

## Not run: 
data(hspider)
set.seed(123)  # This leads to the global solution
hspider[,1:6] = scale(hspider[,1:6]) # Standardized environmental vars
p1 = cqo(cbind(Alopacce, Alopcune, Alopfabr, Arctlute, Arctperi,
               Auloalbi, Pardlugu, Pardmont, Pardnigr, Pardpull,
               Trocterr, Zoraspin) ~
         WaterCon + BareSand + FallTwig + CoveMoss + CoveHerb + ReflLux,
         ITolerances = TRUE, fam = poissonff, data = hspider, 
         Crow1positive=TRUE, Bestof=3, trace=FALSE)
if(deviance(p1) > 1589.0) stop("suboptimal fit obtained")

set.seed(111)
up1 = uqo(cbind(Alopacce, Alopcune, Alopfabr, Arctlute, Arctperi,
                Auloalbi, Pardlugu, Pardmont, Pardnigr, Pardpull,
                Trocterr, Zoraspin) ~ 1,
          family = poissonff, data = hspider,
          ITolerances = TRUE,
          Crow1positive = TRUE, lvstart = -lv(p1))
if(deviance(up1) > 1310.0) stop("suboptimal fit obtained")

nos = ncol(up1@y) # Number of species
clr = (1:(nos+1))[-7]  # to omit yellow
lvplot(up1, las=1, y=TRUE, pch=1:nos, scol=clr, lcol=clr, 
       pcol=clr, llty=1:nos, llwd=2)
legend(x=2, y=135, dimnames(up1@y)[[2]], col=clr, lty=1:nos,
       lwd=2, merge=FALSE, ncol=1, x.inter=4.0, bty="l", cex=0.9)

# Compare the site scores between the two models
plot(lv(p1), lv(up1), xlim=c(-3,4), ylim=c(-3,4), las=1)
abline(a=0, b=-1, lty=2, col="blue", xpd=FALSE)
cor(lv(p1, ITol=TRUE), lv(up1))

# Another comparison between the constrained and unconstrained models
# The signs are not right so they are similar when reflected about 0 
par(mfrow=c(2,1))
persp(up1, main="Red/Blue are the constrained/unconstrained models",
      label=TRUE, col="blue", las=1)
persp(p1, add=FALSE, col="red")
1-pchisq(deviance(p1) - deviance(up1), df=52-30)
## End(Not run)

[Package VGAM version 0.7-7 Index]