NATURE, April 30, 1949, Vol. 163, p. 688

Measurement of Diversity

THE 'characteristic' defined by Yule¹ and the 'index of diversity' defined by Fisher² are two measures of the degree of concentration or diversity achieved when the individuals of a population are classified into groups. Both are defined as statistics to be calculated from sample data and not in terms of population constants. The index of diversity has so far been used chiefly with the logarithmic distribution. It cannot be used everywhere, as it does not always give values which are independent of sample size; it cannot do so, for example, when applied to an infinite population of individuals classified into a finite number of groups. Williams³ has pointed out a relationship between the characteristic and the index of diversity when both are applied to a logarithmic distribution. The present purpose is to define and examine a measure of concentration in terms of population constants.

Consider an infinite population such that each individual belongs to one of $Z$ groups, and let $\pi_1 \ldots \pi_Z$ ($\sum\pi = 1$) be the proportions of individuals in the various groups. Then $\lambda$, defined as $\sum\pi^2$, is a measure of the concentration of the classification. It can take any value between $1/Z$ and 1, the former representing the smallest concentration or largest diversity possible with $Z$ groups, and the latter complete concentration, all the individuals being in a single group. $\lambda$ can be simply interpreted as the probability that two individuals chosen at random and independently from the population will be found to belong to the same group.

Now suppose a sample of $N$ individuals to be chosen at random from a population of this kind, and let $n_1, n_2 \ldots n_Z$ ($\sum n = N$) be the numbers of individuals falling into the various groups. It is easily shown that

$$l = \frac{\sum n(n-1)}{N(N-1)}$$

is an unbiased estimator of $\lambda$; this is almost obvious since $\tfrac{1}{2}N(N-1)$ is the number of pairs in the sample and $\tfrac{1}{2}\sum n(n-1)$ is the number of pairs drawn from the same group. $l$ is also an unbiased estimate of $\lambda$ when the sample size varies, provided no samples of size 0 or 1 are included and that the probability of the sample $(n_1, n_2 \ldots n_Z)$ splits into these two factors:

$$P(n_1 \ldots n_Z) = P(N)\,\frac{N!}{n_1!\,n_2!\,\ldots}\,(\pi_1)^{n_1}(\pi_2)^{n_2}\ldots,$$

where $P(N)$ gives the probability distribution of the sample size, $2 \le N \le \infty$. This is true in particular when samples are obtained by the 'fixed-exposure' method common in biological work, $N$ having then a Poisson distribution adjusted for the absence of the first two terms.

If repeated samples of size $N$ are drawn from the same population, the values of $l$ obtained will be distributed about $\lambda$ with variance

$$\frac{4N(N-1)(N-2)\sum\pi^3 + 2N(N-1)\sum\pi^2 - 2N(N-1)(2N-3)\left(\sum\pi^2\right)^2}{[N(N-1)]^2};$$

or, if $N$ be very large, approximately

$$\frac{4}{N}\left[\sum\pi^3 - \left(\sum\pi^2\right)^2\right].$$

The third and fourth cumulants of the distribution of $l$ have also been calculated exactly. They indicate that as $N$ increases, the distribution tends to normality except when $\lambda = 1/Z$; in that case the distribution of $lNZ$ tends to that of $\chi^2$ with $Z - 1$ degrees of freedom, but with its mean moved from $Z - 1$ to $N$.

The characteristic defined by Yule¹ is, in the notation used above, $1{,}000\,\sum n(n-1)/N^2$, which differs from $l$, the sample estimator of $\lambda$, only in having $N$ instead of $N - 1$ in the denominator and in the scale factor of 1,000.

Now let us see what value $\lambda$ takes for a population containing $Z$ groups the frequencies of which are $\pi_i = w_i/\sum w$, where the $w_i$ are chosen at random and independently from the Type III distribution

$$dF = \frac{1}{(k-1)!}\,e^{-w}\,w^{k-1}\,dw.$$

This may be called a 'negative binomial population', since samples drawn from it by the 'fixed-exposure' method will obey the negative binomial distribution. The value of $\lambda$ appropriate to it is obtained by averaging $\sum w_i^2/\left(\sum w_i\right)^2$ over all sets $(w_1, w_2 \ldots w_Z)$ which can be drawn from the population of values of $w$. Thus

$$\lambda = \int_0^\infty \!\!\cdots\! \int_0^\infty \frac{e^{-\sum w_i}\,[w_1 \ldots w_Z]^{k-1}}{[(k-1)!]^Z}\cdot\frac{\sum w_i^2}{\left(\sum w_i\right)^2}\,dw_1 \ldots dw_Z = \frac{k+1}{Zk+1}.$$

The Poisson distribution is the special case of the negative binomial distribution in which $k$ tends to infinity. Under this condition, $\lambda = 1/Z$. This is as we would expect, since the Poisson distribution arises from a population in which all groups are equally represented, and so the probability that two individuals chosen at random will be found to belong to the same group must be $1/Z$.

The other extreme case of the negative binomial is the logarithmic population, which is obtained by letting $Z$ tend to infinity and $k$ tend to zero simultaneously so that the product $Zk$ remains finite and tends to a quantity called $\alpha$. (This is not quite the same derivation as that used by Fisher², but the quantity $\alpha$ is the same as his index of diversity.) The value obtained for $\lambda$ under this limiting process is $1/(\alpha + 1)$.

It will be noticed that this last value is not consistent with the equation given by Williams³, namely, that Yule's characteristic had the value $1{,}000/\alpha$ when applied to the logarithmic distribution. His result was obtained by applying Yule's formula to a series of expected values, whereas the present procedure is equivalent to applying the formula first and then averaging the result. Some support for the new equation is found by considering the ranges of the variables concerned. Since the characteristic cannot exceed 1,000, the earlier equation would deny to $\alpha$ all values less than 1; but the present one allows it the range $0 \le \alpha \le \infty$, while $1 \ge \lambda \ge 0$.

E. H. SIMPSON
3 West End Avenue,
Pinner.
Jan. 29.

¹ Yule, "Statistical Study of Literary Vocabulary" (Cambridge, 1944).
² Fisher, Corbet and Williams, J. Animal Ecol., 12, 42 (1943).
³ Williams, Nature, 157, 482 (1946).

© 1949 Nature Publishing Group.
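
As an illustration only, the quantities discussed in this note (the estimator l, Yule's characteristic, the exact variance of l, and the value of λ for a negative binomial population) might be computed as in the following minimal Python sketch. The function names are assumptions introduced here and do not come from the original note; the formulas are those stated in the text.

```python
def simpson_l(counts):
    """Estimator l = sum n(n-1) / (N(N-1)) of lambda = sum pi^2."""
    N = sum(counts)
    if N < 2:
        # The note excludes samples of size 0 or 1.
        raise ValueError("sample must contain at least two individuals")
    return sum(n * (n - 1) for n in counts) / (N * (N - 1))


def yule_characteristic(counts):
    """Yule's characteristic, 1000 * sum n(n-1) / N^2, in the note's notation."""
    N = sum(counts)
    if N < 1:
        raise ValueError("sample must not be empty")
    return 1000 * sum(n * (n - 1) for n in counts) / N**2


def variance_of_l(pi, N):
    """Exact variance of l for samples of fixed size N from proportions pi."""
    s2 = sum(p**2 for p in pi)  # sum pi^2, i.e. lambda
    s3 = sum(p**3 for p in pi)  # sum pi^3
    numerator = (4 * N * (N - 1) * (N - 2) * s3
                 + 2 * N * (N - 1) * s2
                 - 2 * N * (N - 1) * (2 * N - 3) * s2**2)
    return numerator / (N * (N - 1)) ** 2


def lambda_negative_binomial(Z, k):
    """lambda = (k + 1) / (Zk + 1) for a negative binomial population."""
    return (k + 1) / (Z * k + 1)


if __name__ == "__main__":
    counts = [3, 2, 1]  # hypothetical sample: three groups, N = 6
    print("l =", simpson_l(counts))
    print("Yule characteristic =", yule_characteristic(counts))
    print("Var(l), pi = (1/2, 1/2), N = 2:", variance_of_l([0.5, 0.5], 2))
    print("lambda for Z = 4, k = 1:", lambda_negative_binomial(4, 1))
```

For a sample of two individuals from two equally represented groups, l is 1 or 0 with equal probability, so the variance formula should give 1/4, which provides a simple check on the sketch.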