14 JOHN AITCHISON

D-dimensional real space $R^D$. If we insist on a symmetric set of log ratios then we may take

(10.3) $z_i = \log\{x_i/g(x)\}, \quad i = 1, \ldots, D,$

with inverse

(10.4) $x_i = \exp(z_i)/\{\exp(z_1) + \cdots + \exp(z_D)\}, \quad i = 1, \ldots, D,$

where $g(x)$ is the geometric mean of the components of $x$. This is a transformation between the unit simplex $S^d$ and the hyperplane $z_1 + \cdots + z_D = 0$ in D-dimensional real space $R^D$. The new constraint on the transformed composition is not a transfer of the so-called constant-sum constraint but a penalty for the insistence on a symmetric treatment of the components of the composition. It is linked to the use of the singular centered log ratio covariance matrix $\Gamma(x)$ at (6.8). In practice this singularity causes no interpretational or computational problem.

There are essentially four steps in any log ratio analysis of compositional data.

(1) Reformulate the compositional problem in terms of log ratios of the components.
(2) Transform the compositional data set into compatible log ratio vectors.
(3) Since the log ratio vectors are in real space and free of the constant-sum constraint, simply apply the appropriate multivariate methodology associated with unconstrained vectors.
(4) Reinterpret the inference from the statistical analysis of the log ratios in terms of the compositions.

A wide variety of compositional problems which can be studied through the above log ratio transformation techniques is described in Aitchison [A5]. These include tests of distributional form, log linear modeling to take account of experimental design and concomitant factors, testing various forms of pseudo-independence, discriminant analysis, and log contrast principal component analysis. Moreover, the link to the multivariate normal allows simple Bayesian analysis, including the use of predictive distributions.

A question that often arises in the use of form (10.1) of the log ratio transformation is whether the inference is sensitive to the choice of divisor.
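As a numerical aside, the following minimal Python sketch illustrates the symmetric transformation (10.3), its inverse (10.4) and the resulting zero-sum constraint, and hints at why the choice of divisor carries no extra information. It assumes that form (10.1), which is not reproduced in this excerpt, takes log ratios of the components against a single chosen component. The function names and the three-part composition are hypothetical and serve only as illustration.

```python
import math


def clr(x):
    """Symmetric log ratios as in (10.3): z_i = log(x_i / g(x))."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)  # logarithm of the geometric mean g(x)
    return [value - mean_log for value in logs]


def clr_inverse(z):
    """Inverse as in (10.4): x_i = exp(z_i) / (exp(z_1) + ... + exp(z_D))."""
    exps = [math.exp(v) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def divisor_log_ratios(x, divisor):
    """Log ratios against one chosen component (assumed shape of form (10.1))."""
    return [math.log(v / x[divisor]) for i, v in enumerate(x) if i != divisor]


def centered_from_divisor(y, divisor):
    """Rebuild the symmetric log ratios from divisor-based log ratios."""
    # The divisor component has log ratio log(x_k / x_k) = 0 against itself.
    full = y[:divisor] + [0.0] + y[divisor:]
    mean = sum(full) / len(full)
    return [value - mean for value in full]


x = [0.2, 0.3, 0.5]  # hypothetical composition on the unit simplex, D = 3
z = clr(x)
x_back = clr_inverse(z)

# The same centered vector is recovered whichever component is the divisor.
rebuilt = [centered_from_divisor(divisor_log_ratios(x, k), k) for k in range(len(x))]

print("symmetric log ratios:", z)
print("sum of symmetric log ratios:", sum(z))
print("recovered composition:", x_back)
```

Recovering the same centered vector from every divisor is an illustration for one composition only, not a proof of the invariance results.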
Aitchison [A5] demonstrates that all these procedures are invariant under the group of permutations of the components, and so in particular under the choice of divisor. Rather than reiterate these procedures we concentrate on some more recent developments.

11. Graphical display of compositional data

The biplot [G1], [G2] is a well-established graphical aid in other branches of statistical analysis. Its adaptation for compositional and probability statement data is simple and can prove a useful exploratory and expository tool. For the compositional data set (6.3) the biplot is based on a singular value decomposition of the doubly centered log ratio matrix $Z = [z_{ri}]$, where

$z_{ri} = \log\{x_{ri}/g(x_r)\} - N^{-1}\sum_{r=1}^{N}\log\{x_{ri}/g(x_r)\}, \quad i = 1, \ldots, D, \quad r = 1, \ldots, N.$

Let $Z = U\,\mathrm{diag}(k_1, \ldots, k_R)\,V^T$ be the singular value decomposition, where $R$ is the rank of $Z$, in practice usually $R = d$, and where the singular values $k_1, \ldots, k_R$ are in descending order of magnitude. The biplot