dirichlet {VGAM} | R Documentation |
Fits a Dirichlet distribution to a matrix of compositions.
dirichlet(link = "loge", earg=list(), zero=NULL)
In the following, the response is assumed to be an M-column matrix with positive values whose rows each sum to unity. Such data can be thought of as compositional data. There are M linear/additive predictors eta_j.
link |
Link function applied to each of the M (positive) shape
parameters alpha_j.
See Links for more choices.
The default gives eta_j=log(alpha_j).
|
earg |
List. Extra argument for the link.
See earg in Links for general information.
|
zero |
An integer-valued vector specifying which
linear/additive predictors are modelled as intercepts only.
The default is none of them.
If used, choose values from the set {1,2,...,M}.
|
The Dirichlet distribution is commonly used to model compositional data, including applications in genetics. Suppose (Y_1,...,Y_M)^T is the response. Then it has a Dirichlet distribution if (Y_1,...,Y_{M-1})^T has density
(Gamma(alpha_+) / prod_{j=1}^M Gamma(alpha_j)) prod_{j=1}^M y_j^(alpha_j - 1)
where alpha_+= alpha_1 + ... + alpha_M, alpha_j > 0, and the density is defined on the unit simplex
Delta_M = { (y_1,...,y_M)^T : y_1 > 0, ..., y_M > 0, sum_{j=1}^M y_j = 1 }.
One has E(Y_j) = alpha_j / alpha_{+}, which are returned as the fitted values. For this distribution Fisher scoring corresponds to Newton-Raphson.
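The density and mean above can be evaluated directly in base R. The following is a minimal sketch, not VGAM code: the helper names ddirichlet_log and dirichlet_mean are hypothetical and chosen only for illustration. It works on the log scale with lgamma() to avoid overflow for large shape parameters.

```r
# Hypothetical helpers (not part of VGAM), written in base R only.

# Log-density of the Dirichlet distribution at one composition y,
# following the formula above: y must be positive and sum to 1.
ddirichlet_log <- function(y, alpha) {
  stopifnot(length(y) == length(alpha),
            all(y > 0), all(alpha > 0),
            abs(sum(y) - 1) < 1e-8)
  lgamma(sum(alpha)) - sum(lgamma(alpha)) + sum((alpha - 1) * log(y))
}

# Mean vector E(Y_j) = alpha_j / alpha_+.
dirichlet_mean <- function(alpha) alpha / sum(alpha)

alpha <- exp(c(-1, 1, 0))   # shapes implied by eta = (-1, 1, 0) under a log link
y     <- c(0.2, 0.5, 0.3)   # an arbitrary point on the unit simplex

ddirichlet_log(y, alpha)
dirichlet_mean(alpha)
```

Under the default log link, alpha_j = exp(eta_j), so the mean in this sketch equals exp(eta_j) / (exp(eta_1) + ... + exp(eta_M)).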
The Dirichlet distribution can be motivated by considering random variables G_1,...,G_M that are independent, with G_j following a gamma distribution with shape parameter alpha_j and unit scale, i.e., with density f(g_j) = g_j^(alpha_j - 1) e^(-g_j) / Gamma(alpha_j). Then the Dirichlet distribution arises when Y_j = G_j / (G_1 + ... + G_M).
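The gamma construction can be checked by simulation. The sketch below uses base R only and is an illustration, not a description of how VGAM generates Dirichlet variates; the values of n and alpha and the seed are arbitrary assumptions.

```r
set.seed(1)                 # arbitrary seed, for reproducibility only
n     <- 5000               # number of simulated compositions (assumed)
alpha <- c(2, 3, 5)         # illustrative shape parameters (assumed)

# Column j holds n independent Gamma(shape = alpha_j, rate = 1) draws.
g <- sapply(alpha, function(a) rgamma(n, shape = a, rate = 1))

# Normalise each row by its total: Y_j = G_j / (G_1 + ... + G_M).
y <- g / rowSums(g)

range(rowSums(y))           # every row sums to 1 (up to rounding)
colMeans(y)                 # sample means
alpha / sum(alpha)          # theoretical means alpha_j / alpha_+
```

The sample means should be close to alpha_j / alpha_+, with agreement improving as n grows.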
An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, rrvglm and vgam.
When fitted, the fitted.values slot of the object contains the M-column matrix of means.
The response should be a matrix of positive values whose rows each sum to unity. For count data, which are similar in structure, a multinomial logit model (multinomial) may be more appropriate. Another distribution similar to the Dirichlet is the Dirichlet-multinomial (see dirmultinomial).
Thomas W. Yee
Lange, K. (2002) Mathematical and Statistical Methods for Genetic Analysis, 2nd ed. New York: Springer-Verlag.
Evans, M., Hastings, N. and Peacock, B. (2000) Statistical Distributions, 3rd ed. New York: Wiley-Interscience.
rdiric, dirmultinomial, multinomial.
y = rdiric(n=1000, shape=exp(c(-1,1,0)))
fit = vglm(y ~ 1, dirichlet, trace = TRUE, crit="c")
Coef(fit)
coef(fit, matrix=TRUE)
fitted(fit)[1:2,]