
JOHN AITCHISON

There is an analogy here with the use of the lognormal distribution $\Lambda(\mu, \Sigma)$ to describe the pattern of variability of a positive quantity and the use of the geometric mean $\exp\{E(\log z)\} = \exp(\mu)$, dating back to [Me].

We shall refer to $\xi$ as the geometric center and note that, for any fixed perturbation $p$, $\mathrm{cen}(p \circ x) = p \circ \mathrm{cen}(x)$, in analogy with $E(t + y) = E(y) + t$ in (4.2) for unconstrained variability in $\mathcal{R}^D$.

Note also that with this definition we have a power result $\mathrm{cen}(a \cdot x) = a \cdot \mathrm{cen}(x)$, in analogy with $E(ay) = aE(y)$ for unconstrained variability. Moreover, $\mathrm{cen}(x \circ y) = \mathrm{cen}(x) \circ \mathrm{cen}(y)$, whether or not $x$ and $y$ are independent, in conformity with a similar $\mathcal{R}^D$ result. We may digress here to note the practical implications of this simple choice of center. For a compositional data set

$$\mathcal{D} = \{x_r = (x_{r1}, \dots, x_{rD}) : r = 1, \dots, N\}, \tag{6.3}$$

standard practice seems to be to take the arithmetic center

$$\bar{x} = (x_{\cdot 1}, \dots, x_{\cdot D}), \quad \text{where } x_{\cdot i} = N^{-1} \sum_{r=1}^{N} x_{ri}. \tag{6.4}$$

A consequence of the above analysis is the advocacy of

$$\mathcal{C}(g_1, \dots, g_D) \tag{6.5}$$

as center of the compositional data set, where $g_i = \left(\prod_{r=1}^{N} x_{ri}\right)^{1/N}$ is the geometric mean of the $i$th component over all $N$ cases. There can be a substantial difference in the use of these different centers. For such examples, see [A6], [A12].
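As a minimal sketch of how the arithmetic center (6.4) and the geometric center can differ, the Python below computes both for a small made-up data set and checks the perturbation property numerically. The helper names (`closure`, `perturb`, `arithmetic_center`, `geometric_center`) and the data values are illustrative assumptions, not taken from the paper, and the closure operation is assumed to rescale a positive vector to unit sum.

```python
import numpy as np

def closure(w):
    """Rescale positive parts so each row sums to 1 (assumed form of the closure C)."""
    w = np.asarray(w, dtype=float)
    return w / w.sum(axis=-1, keepdims=True)

def perturb(p, x):
    """Perturbation p o x: componentwise product followed by closure."""
    return closure(np.asarray(p, dtype=float) * np.asarray(x, dtype=float))

def arithmetic_center(X):
    """Componentwise arithmetic mean over the N cases, as in (6.4)."""
    return np.asarray(X, dtype=float).mean(axis=0)

def geometric_center(X):
    """Closed vector of the componentwise geometric means g_i."""
    g = np.exp(np.log(np.asarray(X, dtype=float)).mean(axis=0))
    return closure(g)

# Hypothetical 3-part data set (rows are compositions); the values are made up.
X = closure([[0.70, 0.20, 0.10],
             [0.10, 0.60, 0.30],
             [0.02, 0.08, 0.90]])
p = np.array([2.0, 0.5, 1.0])  # an arbitrary fixed perturbation

print("arithmetic center:", arithmetic_center(X))
print("geometric center: ", geometric_center(X))

# Check cen(p o x) = p o cen(x) for each choice of center.
geo_ok = np.allclose(geometric_center(perturb(p, X)),
                     perturb(p, geometric_center(X)))
ari_ok = np.allclose(arithmetic_center(perturb(p, X)),
                     perturb(p, arithmetic_center(X)))
print("geometric center commutes with perturbation: ", geo_ok)
print("arithmetic center commutes with perturbation:", ari_ok)
```

On this data set the geometric center passes the check and the arithmetic center does not, which is one concrete way to see why the choice of center matters in practice.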

6.2. Distributional characteristics: measures of dispersion and dependence. There are a number of criteria which dictate the choice of any measure $V(x)$ of dispersion and dependence, which forms the basis of characteristics of compositional variability in terms of second-order moments.

(a) Interpretability in relation to the specific hypotheses and problems of interest in fields of application.

(b) Conformability with the definition of center in (6.2).

(c) Invariance under the group of perturbations. Can we ensure that $V(p \circ x) = V(x)$ for every constant perturbation $p$? (Recall the result in (4.2) that for $y \in \mathcal{R}^D$ the covariance matrix $V$ is invariant under translation: $V(t + y) = V(y)$.)

(d) Satisfaction of the power transformation relationship $V(a \cdot x) = a^2 V(x)$, in a way similar to $V(ay) = a^2 V(y)$ for $\mathcal{R}^D$ variability.

(e) Mathematical tractability.
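Criteria (c) and (d) are modeled on two familiar properties of the covariance matrix for unconstrained variability. As a quick numerical reminder of those two properties, the sketch below checks them on simulated data. The sample, the translation `t`, and the scalar `a` are arbitrary choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.normal(size=(200, 3))      # 200 simulated observations in R^3
t = np.array([1.0, -2.0, 0.5])     # arbitrary translation vector
a = 3.0                            # arbitrary scalar multiple

V = np.cov(Y, rowvar=False)        # sample covariance matrix of y

# Translation invariance: V(t + y) = V(y)
translation_ok = np.allclose(np.cov(Y + t, rowvar=False), V)

# Scaling: V(a y) = a^2 V(y)
scaling_ok = np.allclose(np.cov(a * Y, rowvar=False), a**2 * V)

print(translation_ok, scaling_ok)
```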

Criterion (a) clearly requires that we work in terms of ratios of the components of compositions to ensure scale invariance. At first thought, this might suggest the use of variances and covariances of the form $\operatorname{var}(x_i/x_j)$ and $\operatorname{cov}(x_i/x_j, x_k/x_l)$. These, however, are mathematically intractable since there is no exact or even simple approximate relationship between $\operatorname{var}(x_i/x_j)$ and $\operatorname{var}(x_j/x_i)$. Fortunately, criterion (b) suggests that logarithms of ratios are more appropriate for our purpose, to conform with the definition of the geometric center. We are thus led to consider the use of such dispersion characteristics as

$$\operatorname{cov}\{\log(x_i/x_j), \log(x_k/x_l)\}. \tag{6.6}$$

Obvious advantages of this are simple relationships such as $\operatorname{var}\{\log(x_i/x_j)\} = \operatorname{var}\{\log(x_j/x_i)\}$ and $\operatorname{cov}\{\log(x_i/x_j), \log(x_k/x_l)\} = \operatorname{cov}\{\log(x_j/x_i), \log(x_l/x_k)\}$.
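These relationships, together with the invariance and power criteria (c) and (d), can be checked numerically for log-ratio variances and covariances. The sketch below uses simulated compositions. The helper names (`closure`, `perturb`, `power`, `logratio_var`, `logratio_cov`), the simulation itself, and the reading of $a \cdot x$ as the closed componentwise power are illustrative assumptions rather than the paper's own definitions.

```python
import numpy as np

def closure(w):
    """Rescale positive parts so each row sums to 1 (assumed form of the closure)."""
    w = np.asarray(w, dtype=float)
    return w / w.sum(axis=-1, keepdims=True)

def perturb(p, X):
    """Perturbation p o x applied to every row of X."""
    return closure(np.asarray(p, dtype=float) * X)

def power(a, X):
    """Power transformation a . x, assumed to be the closed componentwise power."""
    return closure(X ** a)

def logratio_var(X, i, j):
    """Sample variance of log(x_i / x_j) over the rows of X."""
    return np.var(np.log(X[:, i] / X[:, j]), ddof=1)

def logratio_cov(X, i, j, k, l):
    """Sample covariance of log(x_i / x_j) and log(x_k / x_l)."""
    u = np.log(X[:, i] / X[:, j])
    v = np.log(X[:, k] / X[:, l])
    return np.cov(u, v)[0, 1]

rng = np.random.default_rng(1)
# Simulated 4-part compositions; the location values are arbitrary.
X = closure(np.exp(rng.normal(loc=[0.0, 1.0, -0.5, 2.0], size=(500, 4))))
p = np.array([5.0, 1.0, 0.2, 2.0])   # arbitrary perturbation
a = 2.0                              # arbitrary power

# Raw ratios, for comparison: the two variances are generally different.
raw_ij = np.var(X[:, 0] / X[:, 1], ddof=1)
raw_ji = np.var(X[:, 1] / X[:, 0], ddof=1)
print("var(x_i/x_j), var(x_j/x_i):", raw_ij, raw_ji)

# Symmetry relations for log-ratios.
sym_var = np.isclose(logratio_var(X, 0, 1), logratio_var(X, 1, 0))
sym_cov = np.isclose(logratio_cov(X, 0, 1, 2, 3), logratio_cov(X, 1, 0, 3, 2))

# Criterion (c): invariance under perturbation.
invariant = np.isclose(logratio_var(perturb(p, X), 0, 1), logratio_var(X, 0, 1))

# Criterion (d): power transformation scales the variance by a^2.
scaled = np.isclose(logratio_var(power(a, X), 0, 1), a**2 * logratio_var(X, 0, 1))

print(sym_var, sym_cov, invariant, scaled)
```

The invariance check works because a perturbation only adds the constant $\log(p_i/p_j)$ to each log-ratio, and the power check works because the power transformation multiplies each log-ratio by $a$.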