alaplace {VGAM} | R Documentation |
Maximum likelihood estimation of the 1, 2 and 3-parameter asymmetric Laplace distributions (ALDs). The 1-parameter ALD may be used for quantile regression.
alaplace1(tau = NULL, llocation = "identity", elocation = list(), ilocation = NULL, kappa = sqrt(tau/(1 - tau)), Scale.arg = 1, shrinkage.init = 0.95, parallelLocation = FALSE, digt = 4, dfmu.init = 3, method.init = 1, zero = NULL) alaplace2(tau = NULL, llocation = "identity", lscale = "loge", elocation = list(), escale = list(), ilocation = NULL, iscale = NULL, kappa = sqrt(tau/(1 - tau)), shrinkage.init = 0.95, parallelLocation = FALSE, digt = 4, sameScale = TRUE, dfmu.init = 3, method.init = 1, zero = "(1 + M/2):M") alaplace3(llocation = "identity", lscale = "loge", lkappa = "loge", elocation = list(), escale = list(), ekappa = list(), ilocation = NULL, iscale = NULL, ikappa = 1, method.init = 1, zero = 2:3)
tau, kappa |
Numeric vectors with
0 < tau < 1 and kappa >0.
Most users will only specify tau since the estimated
location parameter corresponds to the tauth
regression quantile, which is easier to understand.
See below for details.
|
llocation, lscale, lkappa |
Character.
Parameter link functions for
location parameter xi,
scale parameter b,
asymmetry parameter kappa.
See Links for more choices.
For example, the argument llocation can help handle
count data by restricting the quantiles to be positive
(use llocation="loge" ).
|
elocation, escale, ekappa |
List. Extra argument for each of the links.
See earg in Links for general information.
|
ilocation, iscale, ikappa |
Optional initial values. If given, it must be numeric and values are recycled to the appropriate length. The default is to choose the value internally. |
parallelLocation |
Logical.
Should the quantiles be parallel on the transformed scale (argument
llocation )?
Assigning this argument to TRUE circumvents the
seriously embarrassing quantile crossing problem.
|
sameScale |
Logical.
Should the scale parameters be equal? It is advised to keep
sameScale=TRUE unchanged because it does not make sense to
have different values for each tau value.
|
method.init |
Initialization method.
Either the value 1, 2, 3 or 4.
|
dfmu.init |
Degrees of freedom for the cubic smoothing spline fit applied to
get an initial estimate of the location parameter.
See vsmooth.spline .
Used only when method.init=3 .
|
shrinkage.init |
How much shrinkage is used when initializing xi.
The value must be between 0 and 1 inclusive, and
a value of 0 means the individual response values are used,
and a value of 1 means the median or mean is used.
This argument is used only when method.init=4 .
|
Scale.arg |
The value of the scale parameter b.
This argument may be used to compute quantiles at different
tau values from an existing fitted alaplace2() model
(practical only if it has a single value).
If the model has parallelLocation = TRUE then only the intercept
need be estimated; use an offset.
See below for an example.
|
digt |
Passed into Round as the digits argument
for the tau values;
used cosmetically for labelling.
|
zero |
See CommonVGAMffArguments for more information.
Where possible,
the default is to model all the b and kappa
as an intercept-only term.
|
These VGAM family functions implement one variant of asymmetric Laplace distributions (ALDs) suitable for quantile regression. Kotz et al. (2001) call it the ALD. Its density function is
f(y) = (sqrt(2)/b) * (kappa/(1+ kappa^2)) * exp( - (sqrt(2) / (b * kappa)) * |y-xi| )
for y <= xi, and
f(y) = (sqrt(2)/b) * (kappa/(1+ kappa^2)) * exp( - (sqrt(2) * kappa / b) * |y-xi| )
for y > xi.
Here, the ranges are
for all real y and xi, positive b and
positive kappa.
The special case kappa=1 corresponds to the
(symmetric) Laplace distribution of Kotz et al. (2001).
The mean is xi +
b * (1/kappa - kappa) / sqrt(2)
and the variance is
b^2 * (1 +
kappa^4) / (2 * kappa^2).
The enumeration of the linear/additive predictors used here is to
first have all the location parameters, followed by all the
scale parameters. Finally, for alaplace3()
, the last one
is the asymmetry parameter.
It is known that the maximum likelihood estimate of the
location parameter xi corresponds to the regression
quantile estimate
of the classical quantile regression approach of Koenker and Bassett
(1978). An important property of the ALD is that
P(Y <= xi) = tau where
tau = kappa^2 / (1 + kappa^2)
so that
kappa = sqrt(tau / (1-tau)).
Thus alaplace1()
may be used as an alternative to rq
in the quantreg package.
In general the response must be a vector or a 1-column matrix.
For alaplace1()
and alaplace2()
the number of linear/additive predictors is dictated by the
length of tau
or kappa
.
An object of class "vglmff"
(see vglmff-class
).
The object is used by modelling functions such as vglm
and vgam
.
In the extra
slot of the fitted object are some list
components which are useful, e.g., the sample proportion of
values which are less than the fitted quantile curves.
The MLE regularity conditions do not hold for this distribution
so that misleading inferences may result,
e.g., in the summary
and vcov
of the object.
Care is needed with tau
values which are too small, e.g.,
for count data with llocation="loge"
and if the sample
proportion of zeros is greater than tau
.
These VGAM family functions use Fisher scoring.
Convergence may be slow and
half-stepping is usual (although one can use trace=TRUE
to see
which is the best model and then use maxit
to fit that model).
For large data sets it is a very good idea to keep the length of
tau
/kappa
low to avoid large memory requirements.
Then
for parallelLoc=FALSE
one can repeatedly fit a model with
alaplace1()
with one tau at a time;
and
for parallelLoc=TRUE
one can refit a model with
alaplace1()
with one tau at a time but
using offsets and an intercept-only model.
A second method for solving the noncrossing quantile problem is illustrated below in Example 3. This is called the accumulative quantile method (AQM) and details are in Yee (2008). It does not make the strong parallelism assumption.
The functions alaplace2()
and laplace
differ slightly in terms of the parameterizations.
Thomas W. Yee
Koenker, R. and Bassett, G. (1978) Regression quantiles. Econometrica, 46, 33–50.
Kotz, S., Kozubowski, T. J. and Podgorski, K. (2001) The Laplace distribution and generalizations: a revisit with applications to communications, economics, engineering, and finance, Boston: Birkhauser.
Yee, T. W. (2008) Quantile regression for counts and binomial proportions. In preparation.
ralaplace
,
laplace
,
lms.bcn
,
alsqreg
.
# Example 1: quantile regression with smoothing splines alldat = data.frame(x = sort(runif(n <- 500))) mymu = function(x) exp(-2 + 6*sin(2*x-0.2) / (x+0.5)^2) alldat = transform(alldat, y = rpois(n, lambda=mymu(x))) mytau = c(0.25, 0.75); mydof = 4 fit = vgam(y ~ s(x, df=mydof), alaplace1(tau=mytau, llocation="loge", parallelLoc=FALSE), data=alldat, trace=TRUE) fitp = vgam(y ~ s(x, df=mydof), alaplace1(tau=mytau, llocation="loge", parallelLoc=TRUE), data=alldat, trace=TRUE) ## Not run: par(las=1) mylwd = 1.5 with(alldat, plot(x, jitter(y, factor=0.5), col="red", main="Example 1; green: parallelLoc=TRUE", ylab="y", pch="o", cex=0.75)) with(alldat, matlines(x, fitted(fit), col="blue", lty="solid", lwd=mylwd)) with(alldat, matlines(x, fitted(fitp), col="green", lty="solid", lwd=mylwd)) finexgrid = seq(0, 1, len=1001) for(ii in 1:length(mytau)) lines(finexgrid, qpois(p=mytau[ii], lambda=mymu(finexgrid)), col="blue", lwd=mylwd) ## End(Not run) fit@extra # Contains useful information # Example 2: regression quantile at a new tau value from an existing fit # Nb. regression splines are used here since it is easier. fitp2 = vglm(y ~ bs(x, df=mydof), family = alaplace1(tau=mytau, llocation="loge", parallelLoc=TRUE), data=alldat, trace=TRUE) newtau = 0.5 # Want to refit the model with this tau value fitp3 = vglm(y ~ 1 + offset(predict(fitp2)[,1]), family = alaplace1(tau=newtau, llocation="loge"), data=alldat) ## Not run: with(alldat, plot(x, jitter(y, factor=0.5), col="red", ylab="y", pch="o", cex=0.75, main="Example 2; parallelLoc=TRUE")) with(alldat, matlines(x, fitted(fitp2), col="blue", lty="solid", lwd=mylwd)) with(alldat, matlines(x, fitted(fitp3), col="black", lty="solid", lwd=mylwd)) ## End(Not run) # Example 3: noncrossing regression quantiles using a trick: obtain # successive solutions which are added to previous solutions; use a log # link to ensure an increasing quantiles at any value of x. mytau = seq(0.1, 0.9, by=0.1) answer = matrix(0, nrow(alldat), length(mytau)) # Stores the quantiles alldat = transform(alldat, offsety=y*0) usetau = mytau for(ii in 1:length(mytau)) { # cat("\n\nii =", ii, "\n") alldat = transform(alldat, usey=y-offsety) iloc = ifelse(ii==1, with(alldat, median(y)), 1.0) # Well-chosen! mydf = ifelse(ii==1, 5, 3) # Maybe less smoothing will help lloc = ifelse(ii==1, "loge", "loge") # 2nd value must be "loge" fit3 = vglm(usey ~ ns(x, df=mydf), data=alldat, trace=TRUE, fam=alaplace1(tau=usetau[ii], lloc=lloc, iloc=iloc)) answer[,ii] = (if(ii==1) 0 else answer[,ii-1]) + fitted(fit3) alldat = transform(alldat, offsety=answer[,ii]) } # Plot the results. ## Not run: with(alldat, plot(x, y, col="blue", main=paste("Noncrossing and nonparallel; tau =", paste(mytau, collapse=", ")))) with(alldat, matlines(x, answer, col="red", lty=1)) # Zoom in near the origin. with(alldat, plot(x, y, col="blue", xlim=c(0, 0.2), ylim=0:1, main=paste("Noncrossing and nonparallel; tau =", paste(mytau, collapse=", ")))) with(alldat, matlines(x, answer, col="red", lty=1)) ## End(Not run)