lms.yjn {VGAM} | R Documentation |
LMS quantile regression with the Yeo-Johnson transformation to normality.
lms.yjn(percentiles = c(25, 50, 75), zero = NULL,
        link.lambda = "identity", link.sigma = "loge",
        elambda = list(), esigma = list(),
        dfmu.init = 4, dfsigma.init = 2,
        init.lambda = 1, init.sigma = NULL,
        rule = c(10, 5), yoffset = NULL,
        diagW = FALSE, iters.diagW = 6)
lms.yjn2(percentiles = c(25, 50, 75), zero = NULL,
         link.lambda = "identity", link.mu = "identity", link.sigma = "loge",
         elambda = list(), emu = list(), esigma = list(),
         dfmu.init = 4, dfsigma.init = 2,
         init.lambda = 1.0, init.sigma = NULL,
         yoffset = NULL, nsimEIM = 250)
In the following, n is the number of (independent) observations.
percentiles |
A numerical vector containing values between 0 and 100,
which are the percentiles to be estimated. They are returned as the `fitted values'.
|
zero |
An integer-valued vector specifying which
linear/additive predictors are modelled as intercepts only.
The values must be from the set {1,2,3}.
The default value, NULL, means that all of them are
functions of the covariates.
|
link.lambda, link.mu, link.sigma |
Parameter link functions applied to the first, second and third
linear/additive predictors.
See Links for more choices.
|
elambda, emu, esigma |
Lists. Extra arguments for each of the links.
See earg in Links for general information.
|
dfmu.init |
Degrees of freedom for the cubic smoothing spline fit applied to
get an initial estimate of mu.
See vsmooth.spline.
|
dfsigma.init |
Degrees of freedom for the cubic smoothing spline fit applied to
get an initial estimate of sigma.
See vsmooth.spline.
This argument may be assigned NULL to obtain an initial value
using some other algorithm.
|
init.lambda |
Initial value for lambda.
If necessary, it is recycled to be a vector of length n.
|
init.sigma |
Optional initial value for sigma.
If necessary, it is recycled to be a vector of length n.
The default value, NULL, means an initial value is computed
in the @initialize slot of the family function.
|
rule |
Number of abscissae used in the Gaussian integration
scheme to work out elements of the weight matrices.
The values given are the possible choices, with the first value
being the default.
The larger the value, the more accurate the approximation is likely
to be, but the greater the computational expense.
|
yoffset |
A value to be added to the response y, for the purpose
of centering the response before fitting the model to the data.
The default value, NULL, means -median(y) is used, so that
the response actually used has median zero. The yoffset is
saved on the object and used during prediction.
|
diagW |
Logical.
This argument is offered because the expected information matrix may not
be positive-definite. Using only the diagonal elements of this matrix gives
a higher chance of it being positive-definite, but convergence will
be very slow.
If TRUE, then the first iters.diagW iterations will
use the diagonal of the expected information matrix.
The default is FALSE, meaning faster convergence.
|
iters.diagW |
Integer. Number of iterations in which the
diagonal elements of the expected information matrix are used.
Only used if diagW = TRUE.
|
nsimEIM |
See CommonVGAMffArguments for more information.
|
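The trade-off described under rule (more abscissae, better accuracy, more work) can be seen in a small base-R sketch. The help text does not say which Gaussian rule the family function uses, so Gauss-Hermite quadrature is assumed here purely for illustration. The helper names gauss_hermite and normal_expectation are hypothetical and are not part of VGAM.

```r
# Illustrative only: Gauss-Hermite nodes and weights via the Golub-Welsch
# eigenvalue method. This is an assumed stand-in, not VGAM's internal code.
gauss_hermite <- function(n) {
  off <- sqrt(seq_len(n - 1) / 2)          # off-diagonal of the Jacobi matrix
  J <- matrix(0, n, n)
  J[cbind(1:(n - 1), 2:n)] <- off
  J[cbind(2:n, 1:(n - 1))] <- off
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values,
       weights = sqrt(pi) * e$vectors[1, ]^2)
}

# Approximate E[g(Z)] for Z ~ N(0, 1) using n abscissae.
normal_expectation <- function(g, n) {
  gh <- gauss_hermite(n)
  sum(gh$weights * g(sqrt(2) * gh$nodes)) / sqrt(pi)
}

exact <- exp(-0.5)                          # E[cos(Z)] for a standard normal
err5  <- abs(normal_expectation(cos, 5)  - exact)
err10 <- abs(normal_expectation(cos, 10) - exact)
c(err5 = err5, err10 = err10)               # the 10-point rule is more accurate
```

The two values of n mirror the choices offered by rule = c(10, 5); the cost of each evaluation grows with the number of abscissae.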
Given a value of the covariate, this function applies a Yeo-Johnson
transformation to the response, chosen so that the transformed response is
as close to normally distributed as possible. The parameters
of the transformation are estimated by maximum likelihood or penalized
maximum likelihood.
The function lms.yjn2() estimates the expected information
matrices using simulation (and is consequently slower) while
lms.yjn() uses numerical integration.
If one function fails, try the other.
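For readers unfamiliar with the transformation itself, the following base-R sketch writes out the Yeo-Johnson family as defined in Yeo and Johnson (2000). The function name yeo_johnson is hypothetical; it is not the code used inside lms.yjn(), and it takes a single scalar lambda rather than a fitted lambda function of the covariate.

```r
# Illustrative Yeo-Johnson transformation (hypothetical helper, not VGAM code).
# y may contain negative values; lambda is a single number.
yeo_johnson <- function(y, lambda, tol = 1e-8) {
  out <- numeric(length(y))
  pos <- y >= 0
  if (abs(lambda) > tol) {
    out[pos] <- ((y[pos] + 1)^lambda - 1) / lambda
  } else {
    out[pos] <- log(y[pos] + 1)             # limit as lambda -> 0
  }
  if (abs(lambda - 2) > tol) {
    out[!pos] <- -((1 - y[!pos])^(2 - lambda) - 1) / (2 - lambda)
  } else {
    out[!pos] <- -log(1 - y[!pos])          # limit as lambda -> 2
  }
  out
}

yeo_johnson(c(-2, -0.5, 0, 0.5, 2), lambda = 1)    # lambda = 1 leaves y unchanged
yeo_johnson(c(-2, -0.5, 0, 0.5, 2), lambda = 0.5)
```

With lambda = 1 the transformation is the identity, which is consistent with init.lambda = 1 being the default starting value.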
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm and vgam.
The computations are not simple, so convergence may fail. If it does, try different starting values.
The generic function predict, when applied to a lms.yjn fit, does not add back the yoffset value.
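As a rough illustration of the centring that yoffset refers to, the base-R sketch below applies the documented default, -median(y), to a made-up response vector and then undoes it. The vector y and the variable names are hypothetical, and nothing here calls VGAM; it only shows the arithmetic involved when a value on the centred scale has to be moved back by hand.

```r
# Hypothetical response values (not taken from bminz).
y <- c(18.2, 21.5, 24.9, 27.3, 33.0)

# Documented default: yoffset = -median(y).
yoffset <- -median(y)

# The response actually modelled is y + yoffset, which has median zero.
y_centred <- y + yoffset
median(y_centred)

# Subtracting yoffset moves values back to the original scale.
y_back <- y_centred - yoffset
all.equal(y_back, y)
```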
The response may contain both positive and negative values. In contrast, the LMS-Box-Cox-normal and LMS-Box-Cox-gamma methods only handle a positive response because the Box-Cox transformation cannot handle negative values.
In general, the lambda and sigma functions should be smoother
than the mean function. Setting zero=1, zero=3 or zero=c(1,3)
is often a good idea.
See the example below.
While it is usual to regress the response against a single covariate, it is possible to add other explanatory variables, e.g., sex. See http://www.stat.auckland.ac.nz/~yee for further information and examples about this feature.
Thomas W. Yee
Yeo, I.-K. and Johnson, R. A. (2000) A new family of power transformations to improve normality or symmetry. Biometrika, 87, 954–959.
Yee, T. W. (2004) Quantile regression via vector generalized additive models. Statistics in Medicine, 23, 2295–2315.
Yee, T. W. (2002) An Implementation for Regression Quantile Estimation. Pages 3–14. In: Haerdle, W. and Ronz, B., Proceedings in Computational Statistics COMPSTAT 2002. Heidelberg: Physica-Verlag.
Documentation accompanying the VGAM package at http://www.stat.auckland.ac.nz/~yee contains further information and examples.
lms.bcn, lms.bcg, qtplot.lmscreg, deplot.lmscreg, cdf.lmscreg, bminz, alsqreg.
data(bminz)
fit = vgam(BMI ~ s(age, df=4), fam=lms.yjn(zero=c(1,3)), data=bminz, trace=TRUE)
predict(fit)[1:3,]
fitted(fit)[1:3,]
bminz[1:3,]
# Person 1 is near the lower quartile of BMI amongst people his age
cdf(fit)[1:3]

## Not run: 
# Quantile plot
par(bty="l", mar=c(5,4,4,3)+0.1, xpd=TRUE)
qtplot(fit, percentiles=c(5,50,90,99), main="Quantiles",
       xlim=c(15,90), las=1, ylab="BMI", lwd=2, lcol=4)

# Density plot
ygrid = seq(15, 43, len=100)  # BMI ranges
par(mfrow=c(1,1), lwd=2)
a = deplot(fit, x0=20, y=ygrid, xlab="BMI", col="black",
    main="Density functions at Age = 20 (black), 42 (red) and 55 (blue)")
a
a = deplot(fit, x0=42, y=ygrid, add=TRUE, llty=2, col="red")
a = deplot(fit, x0=55, y=ygrid, add=TRUE, llty=4, col="blue", Attach=TRUE)
a@post$deplot  # Contains density function values

## End(Not run)