SIMPLICIAL INFERENCE


subcomposition based on parts $(1, 2, \ldots, C)$ of a $D$-part composition $(x_1, \ldots, x_D)$ is the $(1, 2, \ldots, C)$-subcomposition $(s_1, \ldots, s_C)$ defined by

$$(s_1, \ldots, s_C) = (x_1, \ldots, x_C)/(x_1 + \cdots + x_C). \tag{3.1}$$

This operation $\mathcal{C} : \mathcal{S}^d \to \mathcal{S}^c$, where $c = C - 1$, is commonly referred to as the closure operation by geologists and allows (3.1) to be expressed as

$$(s_1, \ldots, s_C) = \mathcal{C}(x_1, \ldots, x_C). \tag{3.2}$$
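The closure operation and the formation of a subcomposition can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `closure` and `subcomposition`, the 0-based part indices, and the example composition are assumptions made here, not notation from the text.

```python
from typing import Sequence


def closure(w: Sequence[float]) -> list[float]:
    """Rescale a vector of non-negative parts so that it sums to 1."""
    total = sum(w)
    if total <= 0 or any(v < 0 for v in w):
        raise ValueError("parts must be non-negative with a positive sum")
    return [v / total for v in w]


def subcomposition(x: Sequence[float], parts: Sequence[int]) -> list[float]:
    """Select the given parts (0-based indices) of x and apply the closure."""
    return closure([x[i] for i in parts])


# Illustrative 4-part composition (an assumed example).
x = (0.1, 0.2, 0.1, 0.6)

# Subcomposition on the first three parts, as in (3.1) with C = 3, D = 4.
s = subcomposition(x, [0, 1, 2])
print([round(v, 3) for v in s])  # [0.25, 0.5, 0.25]
```

The selected parts are divided by their own total, so the result again sums to 1 whatever the scale of the input.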

Less familiar is another logical necessity of compositional analysis, namely subcompositional coherence. Consider two scientists A and B interested in soil samples, which have been divided into aliquots. For each aliquot A records a 4-part composition (animal, vegetable, mineral, water); B first dries each aliquot without recording the water content and arrives at a 3-part composition (animal, vegetable, mineral). Let us further assume for simplicity the ideal situation where the aliquots in each pair are identical and where the two scientists are accurate in their determinations. Then clearly B's 3-part composition $(s_1, s_2, s_3)$ for an aliquot will be a subcomposition of A's 4-part composition $(x_1, x_2, x_3, x_4)$ for the corresponding aliquot, related as in (3.2) above with $C = 3$, $D = 4$. It is then obvious that any compositional statements that A and B make about the common parts, animal, vegetable and mineral, must agree. This is the nature of subcompositional coherence.

Ignoring this principle of subcompositional coherence has been a source of great confusion in compositional data analysis. The literature, even currently, is full of attempts to explain the dependence of components of compositions in terms of product-moment correlation of raw components. Consider the simple data set:

Full compositions $(x_1, x_2, x_3, x_4)$    Subcompositions $(s_1, s_2, s_3)$
(0.1, 0.2, 0.1, 0.6)                        (0.250, 0.500, 0.250)
(0.2, 0.1, 0.1, 0.6)                        (0.500, 0.250, 0.250)
(0.3, 0.3, 0.2, 0.2)                        (0.375, 0.375, 0.250)

Scientist A would report the correlation between animal and vegetable as $\mathrm{corr}(x_1, x_2) = 0.5$, whereas B would report $\mathrm{corr}(s_1, s_2) = -1$. There is thus incoherence of the product-moment correlation between raw components as a measure of dependence. Note, however, that the ratio of two components remains unchanged when we move from full composition to subcomposition: $s_i/s_j = x_i/x_j$, so that, as long as we work with scale-invariant functions, or equivalently express all our statements about compositions in terms of ratios, we shall be subcompositionally coherent.
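The two correlations and the invariance of ratios can be checked numerically on the three-sample data set. The following is a minimal sketch using only the standard library; the helper names `closure` and `pearson` are assumptions introduced for illustration.

```python
from math import sqrt


def closure(w):
    """Rescale a vector of positive parts so that it sums to 1."""
    total = sum(w)
    return [v / total for v in w]


def pearson(u, v):
    """Product-moment correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


# Scientist A's full compositions (animal, vegetable, mineral, water).
full = [
    (0.1, 0.2, 0.1, 0.6),
    (0.2, 0.1, 0.1, 0.6),
    (0.3, 0.3, 0.2, 0.2),
]

# Scientist B's subcompositions: drop water, then apply the closure.
sub = [closure(x[:3]) for x in full]

corr_full = pearson([x[0] for x in full], [x[1] for x in full])
corr_sub = pearson([s[0] for s in sub], [s[1] for s in sub])
print(round(corr_full, 3), round(corr_sub, 3))  # 0.5 -1.0

# The animal/vegetable ratio is the same for A and B in every sample.
ratios_full = [x[0] / x[1] for x in full]
ratios_sub = [s[0] / s[1] for s in sub]
print(all(abs(a - b) < 1e-12 for a, b in zip(ratios_full, ratios_sub)))  # True
```

The raw correlation changes sign between the two scientists, while every statement built from the ratio $x_1/x_2$ is identical for both.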

Similar considerations apply to probabilistic statements. A clinician may be faced with a differential diagnostic problem among five forms (1, 2, 3, 4, 5) of which 1, 2, 3 are malignant and 4, 5 benign. At a stage in the diagnostic process the clinician, having ruled out the benign forms 4 and 5, may wish to make a conditional probabilistic statement involving only the malignant states 1, 2, 3. The process of moving from the full probabilistic statement to the conditional probability statement is exactly analogous to the closure operation of forming a subcomposition from a full composition. Moreover, clearly there is also a principle of conditional coherence, analogous to the subcompositional coherence principle, that must apply here.
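The analogy with conditioning can be illustrated with a small sketch. The probabilities below are hypothetical values chosen only for the example, and `closure` is an assumed helper name; nothing here comes from clinical data.

```python
def closure(w):
    """Rescale a vector of positive parts so that it sums to 1."""
    total = sum(w)
    return [v / total for v in w]


# Hypothetical probabilities for the five forms 1..5 (assumed values).
p = [0.30, 0.15, 0.05, 0.40, 0.10]

# Ruling out the benign forms 4 and 5 means conditioning on {1, 2, 3},
# which is the closure of the first three probabilities.
p_malignant = closure(p[:3])
print([round(v, 3) for v in p_malignant])  # [0.6, 0.3, 0.1]

# Odds between two malignant forms are unchanged by the conditioning.
odds_full = p[0] / p[1]
odds_conditional = p_malignant[0] / p_malignant[1]
print(abs(odds_full - odds_conditional) < 1e-12)  # True
```

As with subcompositions, statements expressed through ratios (here, odds) agree before and after conditioning.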