Methods of Psychological Research Online 1997, Vol. 2, No. 1 © 1998 Pabst Science Publishers
Internet: http://www.pabst-publishers.de/mpr/
The Modification of the Phi-coefficient
Reducing its Dependence on the Marginal
Distributions
Peter V. Zysno
Abstract
The Phi-coefficient is a well-known measure of correlation for dichotomous
variables. It is worth noting that the extreme values ±1 occur only in
the case of consistent responses and symmetric marginal frequencies.
Consequently, low correlations may be due to inconsistent data, unequal
response frequencies, or both. To overcome this somewhat confusing
situation, various alternatives have been proposed, most of which remained
rather unsatisfactory. Here, a system of requirements is first developed in
order to evaluate these measures. Only one of the well-known coefficients
satisfies the underlying demands. In accordance with these criteria, the
Phi-coefficient is accompanied by a formally similar modification which is
independent of the marginal frequency distributions. Both can be easily
computed from the actual data. If the original data are not available (as is
usual in publications) but the intercorrelations and response frequencies of
the variables are, then the degrees of association for asymmetric
distributions can be calculated subsequently.
Keywords: Phi-coefficient, independent marginal distributions, dichotomous
variables
1 Introduction
At the beginning of this century the Phi-coefficient (Yule, 1912) was developed as a
correlational measure for dichotomous variables. Its essential features can be quickly
outlined. N elements of two variables i and j fall into the classes "+" and "-" with
frequencies $n_i$, $u_i = N - n_i$, $n_j$ and $u_j = N - n_j$, respectively. Consequently,
within a pair of variables four dyadic events may occur: (++), (+-), (-+) and (--). The
corresponding joint frequencies are denoted by the initial letters of the alphabet,
a, b, c and d. The parameters are usually presented in a fourfold table (Table 1).
Table 1: Fourfold table of the Phi-coefficient for two binary variables i and j with classes
+ and -.
               j
           +       -
i    +     a       b      n_i
     -     c       d      u_i
          n_j     u_j      N

Table 2: Phi-correlations of 6 items with 3 levels of difficulty
       1      2      3      4      5      6
1      -     1.00   0.43   0.43   0.18   0.18
2     1.00    -     0.43   0.43   0.18   0.18
3     0.43   0.43    -     1.00   0.48   0.48
4     0.43   0.43   1.00    -     0.48   0.48
5     0.18   0.18   0.48   0.48    -     1.00
6     0.18   0.18   0.48   0.48   1.00    -

The degree of coherence between the two variables is defined as the quotient
of two frequencies. The numerator represents the difference of the products of
the congruent (++, --) and the incongruent (+-, -+) dyads; the denominator
represents the corresponding expected value:

$$\varphi = \frac{ad - bc}{\sqrt{n_i u_i n_j u_j}} \tag{1}$$

Guilford (1956, 3rd ed., p. 511) has shown that this is a product-moment coefficient.
Therefore, it may also be described as the ratio of covariance and total variance, i.e.
$\varphi = \mathrm{cov}/(s_i s_j)$, with $\mathrm{cov} = (ad - bc)/N^2$,
$s_i = [n_i u_i/N^2]^{1/2}$ and $s_j = [n_j u_j/N^2]^{1/2}$.
Using the relative frequencies $p_i = n_i/N$, $q_i = u_i/N$, $p_j = n_j/N$, $q_j = u_j/N$
and $p_{ij} = a/N$, these terms can be replaced by $\mathrm{cov} = p_{ij} - p_i p_j$,
$s_i = [p_i q_i]^{1/2}$ and $s_j = [p_j q_j]^{1/2}$. In
short, this coefficient exhibits advantages such as clarity of interpretation, simple
applicability, statistical testability and equivalence to Pearson's product-moment
correlation. However, there is one property which has repeatedly given rise to
discussion. Measures of association (cf. Kubinger, 1990) in general owe their creation
to empiricists' desire to make covariation more tangible and to capture more precisely
coincidental events or facts which cannot be grasped adequately by
graphic or linguistic means. For this reason not only an objectification function
is desired, but also a descriptive one. The measure should allow substantive
interpretation. For this purpose the degree of association is usually limited to the
interval [-1, +1], with ±1 denoting perfect association and zero indicating its
absence. The Phi-coefficient satisfies these claims. However, the extreme values can
only be reached with symmetric distributions. This fact may lead to potentially
significant misjudgment. An illustration can be found in the responses to
six intelligence items answered consistently by 90 subjects; the analysis of the data
may have resulted in the frequencies $n_1 = 80$, $n_2 = 80$, $n_3 = 50$, $n_4 = 50$,
$n_5 = 20$, $n_6 = 20$. Because of the high conformity of the answers to the model, the
matrix of intercorrelations is expected to show predominantly ones, except for small
declines in the case of differing levels of difficulty. In fact, most of the values range
from .48 to .18 (Table 2). Only in the case of symmetric frequencies, i.e. $n_i = n_j$ if
$ad > bc$, or $n_i = N - n_j$ if $ad < bc$, are the entries equal to ±1.00. A factor
analysis will show as many factors as there are levels of difficulty (Ferguson, 1941).
Obviously the Phi-coefficient is considerably diminished when differences between
the difficulty levels increase. With real data this technical effect is confounded with
actual response inconsistencies of the subjects. The observer is usually unable to
decide to what degree a value below one is affected by the distributional asymmetry
of the variables or by some incontingency within the observed events. Hence, the
interpretation of this measure of association is generally complicated in that
values below one may be caused by response inconsistencies, levels of difficulty, or
both. These problems are not restricted to gradations of theoretical test
constructs. A symmetric partition of two sets of events will also be the exception in
natural binary variables. For instance, for mammals there is a complete biological
association between sex and capability of childbearing. Nevertheless there will not
be a Phi-correlation of one. A possible explanation for this might be that some
of the males show childbearing capability, too. In reality the reduced correlation
originates from the fact that some females are unable to bear children for
sex-unspecific reasons (illness, accident). Actually, the two variables sex and
childbearing capability do not generate equivalent classes. In this case the reasons for
the diminished correlation will be readily elucidated, as the result, contrary
to expectation, will prompt inquiries. However, such prior knowledge will not
often be available as a safeguard. On the contrary, for the most part, measures
of association support a primarily explorative procedure which allows for the
acquisition of relational knowledge only after data analysis. As an example, consider a
fictitious examination of the association between sex and subscription to a women's
journal, in which a correlation of $\varphi = .50$ may have been found. It might be
concluded that, although this magazine is more often read by women, it is also read
by men. Without access to the original data there is little reason to distrust this
interpretation. However, a more precise inspection of the data would reveal that the
sample of 1000 individuals was made up, on the one hand, of 500 female and 500 male
subjects and, on the other hand, of 800 non-subscribers and 200 exclusively female
subscribers. The cells of the fourfold table might contain the following frequencies:
a = 200, b = 300, c = 0, d = 500. Actually a very strong association between sex and
magazine subscription is given: the subscribers are exclusively female; male
subjects do not take the journal. This fact should be expressed by a coefficient of 1,
but the Phi-coefficient fails to do so.
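The magazine example can be retraced numerically. The following is a minimal sketch of formula [1] in Python; the function name `phi` and the argument order (a, b, c, d as in Table 1) are choices made here for illustration only, and all four marginal sums are assumed to be non-zero.

```python
from math import sqrt

def phi(a: int, b: int, c: int, d: int) -> float:
    """Phi-coefficient of a fourfold table, formula [1].

    Assumes that none of the marginal sums is zero.
    """
    n_i, u_i = a + b, c + d      # row sums of variable i
    n_j, u_j = a + c, b + d      # column sums of variable j
    return (a * d - b * c) / sqrt(n_i * u_i * n_j * u_j)

# Fictitious magazine data from the text: a=200, b=300, c=0, d=500
print(round(phi(200, 300, 0, 500), 2))   # 0.5
```

Although cell c is empty, i.e. no male subscriber exists, the coefficient stays at .50 because the marginals (500/500 versus 200/800) are unequal.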
Is it possible to formulate an alternative, perhaps complementary, measure of
association? Logic offers an approach. A perfect Phi-correlation contains an
equivalence proposition: $p(a) \Leftrightarrow p(b)$. If a person p has a feature a then
he also has feature b and vice versa; if he solves task a then he will solve b as well,
and if he does not solve a, then he will be unable to solve b. Here it is presupposed that
the two variables are dichotomized symmetrically, i.e. that equivalent binary
partitions exist on both sets of events. What, however, can be done if the real conditions
are not in accordance with these presumptions or with the test requirements? The sets
of positive responses are now not in an equal, but in a partial set relation, and the
following implication holds: $p(b) \Rightarrow p(a)$ and $\neg p(a) \Rightarrow \neg p(b)$.
If subject p solves the more difficult item b then he will solve the easier item a, and if
he does not solve the easier a, then he will not solve the more difficult b.
What follows from this distinction? The classical Phi-coefficient is appropriate
only if a theoretically symmetric distribution of the variables can be assumed.
Possible asymmetries indicate deficiencies in model adequacy and consequently reduce the
correlation. The assumption of symmetry, however, is not a compelling condition. For
this reason a distributionally independent measure would be desirable. Under such a
measure the intercorrelation matrix of Table 2 would contain only ones, showing that a
completely homogeneous item collection is given. Generally, aside from equivalences,
implicative connections in dichotomous variables could be revealed by technical means.
But what would such a measure look like and what are the desired properties?
2 Demands on a Phi-coefficient independent of
marginal distributions
A look into the literature teaches us that alternatives to the Phi-coefficient have
been sought almost since its publication. Already at the beginning of this century
attempts were made to find more suitable measures of association. The basic idea
of the Phi-correlation was to relate the difference $ad - bc$ of the products of congruent
(++, --) and incongruent (+-, -+) response patterns to the complete set of
expected response patterns. This expected value corresponds to the geometric mean
of the products of the marginal frequencies in the symmetric case. For asymmetric
marginal distributions this reference term exceeds the possible maximum number
of different response patterns. The effect increases with the extent of asymmetry.
The search for alternatives was therefore principally directed toward more suitable
reference terms. In general, coefficients were proposed which kept $ad - bc$ in the
numerator while modifying the nominal term (nom). This type of coefficient is
referred to as a Phi-modification:

$$\varphi^* = \frac{ad - bc}{nom} \quad\Big|\quad |ad - bc| \le nom \le \sqrt{n_i u_i n_j u_j} \tag{2}$$

The more detailed denominator properties will be illuminated in the further discourse.
However, the admissible domain can already be defined. On the one hand, the nominal
term must not fall below the absolute value of the numerator ($|ad - bc| \le nom$),
because this would violate the convention of the ±1 limitation. On the other hand,
$|\varphi^*|$ should be at least as large as $|\varphi|$, since the diminishing influence of
different difficulty levels is to be reduced; hence the nominal term must not be larger
than the denominator of the original Phi-coefficient. The restriction of the nominal
term to the positive range implies equal signs for $\varphi^*$ and $\varphi$.
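As a brief worked step, the two bounds on the nominal term can be chained to show why they deliver the intended behaviour. Here $\varphi^*$ denotes the Phi-modification $(ad - bc)/nom$, and $D = \sqrt{n_i u_i n_j u_j}$ abbreviates the denominator of formula [1]; the symbol $D$ is introduced only for this illustration. For $ad \ne bc$:

$$|\varphi| = \frac{|ad - bc|}{D} \;\le\; \frac{|ad - bc|}{nom} = |\varphi^*| \;\le\; \frac{|ad - bc|}{|ad - bc|} = 1$$

The left inequality uses $nom \le D$, the right one uses $nom \ge |ad - bc|$. For $ad = bc$ both coefficients are zero.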
Equation [2] can be regarded as a conceptual frame for any Phi-modification.
More explicit specifications are directed to the appropriate profile of requirements,
which should be oriented to the Phi-coefficient as far as possible. Undoubtedly
the following three conditions are part of this. First, the marginal frequencies must
not be changed (A1). Second, all entries in the fourfold table are frequencies and
therefore must not be negative (A2). Third, the inversion of a variable will change
the sign but must not change the absolute value (A3).
A1: Invariance of the marginal distributions
A2: Non-negativity of the cell frequencies
A3: Invariance of the absolute value for inverted variables
While these three demands emphasize common ground with the Phi-coefficient,
the aforementioned interval restriction [2] has to be refined. Here the idea of
Torgerson (1958) and Cureton (1959) of relating the Phi-coefficient
to its possible maximum can be included:

$$\varphi^* = \frac{\varphi}{\varphi_{\max}} \tag{3}$$

A preliminary notion of the maximal Phi is provided by solving for
$\varphi_{\max} = \varphi/\varphi^*$ and inserting formulas [1] and [2] at the corresponding
positions. The following definition results:

$$\varphi_{\max} = \frac{nom}{\sqrt{n_i u_i n_j u_j}} \quad\Big|\quad |ad - bc| \le nom \le \sqrt{n_i u_i n_j u_j} \tag{4}$$

The developmental advantage of this expression as compared to formula [2] lies
in the elimination of the cell parameters a, b, c, d. For reasons of research
economy it is desirable to formulate a nominal term based solely on the marginal
frequencies. In this way the maximal values of earlier published Phi-coefficients can
be found even if the original data are unknown. In accordance with formula [3], the
distributionally independent coefficient can then also be determined. Of course, things
have not yet come to this point. First, the formal requirements will be addressed more
precisely. If the lower interval bound $|ad - bc|$ is taken for the nominal term, then
$\varphi_{\max}$ cannot fall below the absolute value of $\varphi$. Doing the same for the
upper bound yields a value of 1. Hence the maximal Phi is limited to the interval
$[|\varphi|, 1]$. This assumption is stated in A4. The conditions for reaching these
interval bounds can be stated explicitly. The upper bound (supremum) 1 can only be
reached if the marginal distributions are symmetric. According to formula [3],
$\varphi$ is in this case related to one, hence $\varphi^*$ equals the original
Phi-coefficient. Technically, the symmetry of the marginal distributions becomes
apparent in the fourfold table by the fact that one of the row sums $n_i$ or $u_i$ is
equal to the column sum $n_j$ (A4').
On the other hand, variables with asymmetric marginals are in a partial set relation
if no incongruent dyads occur. Formally this appears in the fourfold table if one of
the cells is equal to zero (A4''). The Phi-modification in this case should result in
$\varphi^* = \pm 1$, consequently $\varphi_{\max} = |\varphi|$.
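A4: Interval boundaries

$$\varphi_{\max} \in [\,|\varphi|,\; 1\,]$$

A4': Supremum for marginal symmetry

$$\left.\begin{array}{ll} n_i = n_j & |\; ad \ge bc \\ n_i = N - n_j & |\; ad < bc \end{array}\right\} \;\Rightarrow\; \varphi_{\max} = 1$$

A4'': Identity for the partial set relation

$$\exists\, x \in \{a, b, c, d\}:\; x = 0 \;\Rightarrow\; \varphi_{\max} = |\varphi|$$
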
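As a brief illustration of A4'', consider again the fictitious magazine data of the introduction (a = 200, b = 300, c = 0, d = 500, $\varphi = .50$). Cell c is empty, so the partial set relation holds and the requirement fixes the maximal Phi at the observed value; relating the coefficient to this maximum as in formula [3] then expresses the perfect association:

$$c = 0 \;\Rightarrow\; \varphi_{\max} = |\varphi| = .50, \qquad \frac{\varphi}{\varphi_{\max}} = \frac{.50}{.50} = 1$$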
3 Previous Proposals
As already mentioned, a considerable collection of Phi-modifications has been
developed. To examine them all with respect to requirements A1 through A4 would be
a rather arduous task. Fortunately, they can be roughly divided into two periods,
namely up to 1948 and from 1949 onward. The early authors (Forbes 1907, Yule
1911, Hacker 1920, Ferguson 1941, Dice 1945, Loevinger 1947) proposed nominal
terms which were related more or less explicitly to a fixed data constellation, for
example $nom = n_i n_j$ (Forbes), $nom = ad + bc$ (Yule) or $nom = n_j (N - n_i)$
(Loevinger). They do not satisfy the invariance under item inversion and the interval
restriction.
Beginning with Cole (1949), attempts were made to meet the perceived problems with
suitable constraints. These attempts will be briefly outlined in the following.
Table 3 shows the maximal Phi for a set of exemplary data consisting of two groups.
In the first, two homogeneous asymmetric variables are inverted systematically and
have a correlation of ±.33. The maximal Phi, then, must also be equal to .33. In
the second group, two moderately correlated variables with symmetric marginals are
systematically inverted. The Phi-coefficient is $\varphi = \pm .25$; for the maximal
counterpart a value of 1 should result. Cole's (1949) formula with equality constraints
reads:
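
$$nom = \begin{cases} n_i (N - n_j) & |\; ad \ge bc \\ n_i n_j & |\; bc > ad,\; c \ge a \\ (N - n_i)(N - n_j) & |\; bc > ad,\; a > c \end{cases} \tag{5}$$

Example (group 1, data set 1): $\varphi_{\max} = 9 \cdot 9/(3 \cdot 9) = 3$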
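The example value can be retraced with a few lines of Python. This is only a sketch of the first case of formula [5] ($ad \ge bc$); the function name is hypothetical, and the cell frequencies a = 3, b = 6, c = 0, d = 3 are read off the first data set of group 1 in Table 3.

```python
from math import sqrt

def cole_phi_max_positive(a: int, b: int, c: int, d: int) -> float:
    """Maximal Phi implied by the first case of formula [5] (ad >= bc only)."""
    if a * d < b * c:
        raise ValueError("only the case ad >= bc is sketched here")
    n = a + b + c + d
    n_i, u_i = a + b, c + d          # row sums
    n_j, u_j = a + c, b + d          # column sums
    nom = n_i * (n - n_j)            # nominal term for ad >= bc
    return nom / sqrt(n_i * u_i * n_j * u_j)

# Table 3, group 1, data set 1
print(cole_phi_max_positive(3, 6, 0, 3))   # 3.0
```

A value of 3 lies outside the interval demanded by A4, whereas a maximal Phi of .33 is expected for these data.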
Table 3: Invariance of the absolute value of the Phi-correlation if items are inverted
            Data group 1                     Data group 2
       i j    i j    i j    i j         i j    i j    i j    i j
       1 1    1 0    0 1    0 0         1 1    1 0    0 1    0 0
       1 1    1 0    0 1    0 0         1 1    1 0    0 1    0 0
       1 1    1 0    0 1    0 0         1 1    1 0    0 1    0 0
       1 0    1 1    0 0    0 1         1 1    1 0    0 1    0 0
       1 0    1 1    0 0    0 1         1 1    1 0    0 1    0 0
       1 0    1 1    0 0    0 1         1 1    1 0    0 1    0 0
       1 0    1 1    0 0    0 1         1 0    1 1    0 0    0 1
       1 0    1 1    0 0    0 1         1 0    1 1    0 0    0 1
       1 0    1 1    0 0    0 1         0 1    0 0    1 1    1 0
       0 0    0 1    1 0    1 1         0 1    0 0    1 1    1 0
       0 0    0 1    1 0    1 1         0 0    0 1    1 0    1 1
       0 0    0 1    1 0    1 1         0 0    0 1    1 0    1 1
Phi   +.33   -.33   -.33   +.33        +.25   -.25   -.25   +.25

Cureton's (1959) proposal was based on the idea of relating Phi to the possible
maximum. He presented two nominal expressions, one of which comprises two
alternatives dependent on the sign. In the case of negative associations, regard
should be given to the secondary condition that for $n_i + n_j < N$ the term in
brackets is put to zero:

$$nom = \begin{cases} N \min(n_i, n_j) - n_i n_j & |\; ad \ge bc \\ n_i n_j - N\,[\,n_i + n_j - N\,] & |\; ad < bc \end{cases} \tag{6}$$

This modification was developed by Cureton in a somewhat curious style of
argumentation and was finally presented in formulas which appear rather involved and
difficult to interpret in terms of correlations or variances. It will be difficult for the
user to assess the performance of the formalism. Presumably such uncertainties are the
main reason why this modification has been overlooked. Actually, the above conditions
appear to be met. A confirmation of this lies in the fact that the expected values are
attained. A thorough examination of the requirements would be difficult to conduct at
this point and will be taken up later, after the clarification of additional essentials
which will help to provide a better understanding.
A third alternative has been presented by Guilford (1965). He proposed
$\varphi_{\max} = [p_j q_i/(p_i q_j)]^{1/2}$ in the case of positive and
$\varphi_{\min} = -[q_i q_j/(p_i p_j)]^{1/2}$ in the case of negative correlations. The
symbols $p_i$ and $q_i$ correspond to the relative frequencies $n_i/N$ and $u_i/N$.
Unfortunately, again there is no invariance under inversion (group 1, column 4):
$\varphi_{\max} = [.75 \cdot .75/(.25 \cdot .25)]^{1/2} = 3$.
A fourth modification with three equality constraints, obviously influenced by
Guilford's proposal, was presented by Bortz, Lienert and Boehnke (1990, p. 278):

$$\varphi_{\max} = \sqrt{\frac{p_i q_j}{q_i p_j}} \tag{7}$$

with three constraints relating $p_i$ to $q_i$, $q_j$ to $p_j$ and $p_i$ to $p_j$.
The following example illustrates the missing invariance under inversion (group 2,
column 1): $\varphi_{\max} = [8 \cdot 8/(4 \cdot 4)]^{1/2} = 2$. The form with only the
constraints relating $p_i$ to $q_i$ and $q_j$ to $p_j$ (cf. Bortz 1993, p. 211) is also
not convincing; Hannover and Steyer (1994) have warned of these deficiencies.
Examination of the previous Phi-modifications leads to the conclusion that, except
for Cureton's proposal, the alternatives are not invariant with respect to inversions
or, in the case of symmetric marginals, are not identical with the Phi-coefficient.
Most desirable, however, is a maximal Phi which satisfies the above requirements,
is conceptually clear and does not presuppose information other than the marginal
frequencies.
4 Formal Development
The basic idea of the Phi-modification is to relate the real difference $ad - bc$ to the
maximum possible difference, or extreme difference. Indicating the ideal joint
frequencies by an asterisk, it is defined as $\mathit{diff}_{ext} = a^* d^* - b^* c^*$.
Its sign is equal to that of the real difference. The index "ext" was chosen
intentionally instead of "max" in order to make clear that it may be negative or
positive, just like the real difference. However, because of the restriction to the
positive range, only the respective absolute value is applied. Hence, the neutral Phi is
defined as follows:

$$\varphi^* = \frac{ad - bc}{|\mathit{diff}_{ext}|} = \frac{ad - bc}{|a^* d^* - b^* c^*|} \tag{8}$$

Obviously this is a coefficient of the type $\varphi^* = \varphi/\varphi_{\max}$. The
nominal term of the maximal Phi according to formula [4] is thus given concrete form by
the expression $|a^* d^* - b^* c^*|$. A technical interpretation in terms of variance is
possible as well, i.e. as the proportion of the covariance to the maximal possible
covariance: $\varphi^* = \mathrm{cov}/|\mathrm{cov}_{ext}|$.
Now the task is to determine the extreme difference from the parameters $a^*$, $b^*$,
$c^*$ and $d^*$. To accomplish this only one of them must be known, because the others
can be computed via the marginals. But which one should it be? A first clue is
given by the above observation that two completely homogeneous variables with
different levels of difficulty are accompanied by a zero cell in the fourfold table. Of
course, for the most part, the response sets of the variables will not form a partial
set relation, and therefore a zero cell will not appear. However, in order to find
the maximum difference, one of the elements should be interpreted as the rate of
violations of a perfect association ("errors").
It is important that only one cell assumes this role; otherwise a definite solution
cannot exist for this problem. Any uncertainty as to this uniqueness can be quickly
eliminated. If a cell is reduced to zero, the adjacent cells must increase and the
diagonal cell must decrease equally, so that according to A1 the marginal frequencies
remain constant. Since the reduction of a cell implies the reduction of the diagonal
cell, only the smaller of the two may become zero; otherwise the non-negativity of the
frequencies (A2) would be violated. In which of the diagonal products should this
smaller element be sought? Again it is the lesser of the two. If it were sought in the
larger product, the maximum difference $a^* d^* - b^* c^*$ would have the opposite sign
of the real difference and would contradict the sign identity of real and maximum
difference according to definition [8]. Therefore the error rate e corresponds to the
lesser of the two cells b or c for positive and a or d for negative associations; it
always lies in the lesser of the two diagonal products:

$$e = \begin{cases} \min(b, c) & |\; ad \ge bc \\ \min(a, d) & |\; bc \ge ad \end{cases} \tag{9}$$

Stated so as to be easily remembered: The error cell is the smaller element of the
smaller diagonal product. If the products are equal, either of the minimal values can
be chosen. Since the Phi-coefficient in this case is equal to zero, it cannot be
upgraded anyway. The error rate is unimportant under this condition as long as it
remains larger than zero. Technically, the above dual form is easy to handle. It is,
however, more difficult to integrate such expressions into more complex formulas. For
convenience a convex combination of the minimum terms (here the weighted sums) can be
formed:

$$K = g \cdot \mathrm{Min}_{ad \ge bc} + (1 - g) \cdot \mathrm{Min}_{ad \le bc} \quad\Big|\quad g = (1 + \mathrm{sgn}\,\varphi)/2 \tag{10}$$

The weight g takes the value 1 for positive and 0 for negative real differences. In
this way one of the two summands will be 0, while the other delivers the respective
minimum. If the real difference equals zero, then the arithmetic mean of the two
minimum terms is formed. Here the above described freedom is used to choose the rate
of errors pragmatically in the case of zero differences.

Table 4: The nominal term in dependence of the error cell
                                                Nominal term
Diff        Error cell       I                        II                          III
ad >= bc    e = b; b* = 0    |a*d* - b*c*| = a*d*     (a + b)(b + d) = n_i u_j    ad - bc + bN
(Phi >= 0)  e = c; c* = 0    |a*d* - b*c*| = a*d*     (a + c)(c + d) = n_j u_i    ad - bc + cN
ad <= bc    e = a; a* = 0    |a*d* - b*c*| = b*c*     (a + b)(a + c) = n_i n_j    bc - ad + aN
(Phi <= 0)  e = d; d* = 0    |a*d* - b*c*| = b*c*     (b + d)(c + d) = u_i u_j    bc - ad + dN

Since it has now been determined that, in principle, a definite solution to the
problem exists, one might try to find the extreme difference or the maximal Phi
(cf. Bosser, 1979) by determining $a^*$, $b^*$, $c^*$ and $d^*$. However, a more general
solution on the basis of the marginal frequencies is desired. For this reason the
asterisk parameters are transformed into the corresponding marginal parameters.
The formal step-by-step execution of this process is provided in Table 4. Each cell
of the fourfold table is successively regarded as the potential error element. For
example, for positive real differences either $b^*$ or $c^*$ will be zero; the nominal
term is reduced to $a^* d^*$. In the next step this product is replaced by the
corresponding real parameters (col. II in Table 4). The error element is added to the
neighbouring cells because of the marginal identity (A1). The resulting equivalences
$(a + b)(b + d)$ and $(a + c)(c + d)$ correspond to the marginal products $n_i u_j$ and
$n_j u_i$, respectively. The ideal product $a^* d^*$ therefore equals one of these two
products. Which of them is the right one?
According to the interval restriction, the nominal term must not exceed the geometric
mean of the marginal products. If the products $n_i u_j$ and $n_j u_i$ are different,
one of them will be less than, the other larger than the geometric mean. Consequently
the maximum product of the main diagonal is given by
$a^* d^* = \min[n_i u_j, n_j u_i]$.
The minimum of the secondary diagonal is analogously determined. Hence the
maximal absolute difference of the products is provided by the following equation:

$$|a^* d^* - b^* c^*| = \begin{cases} \min(n_i u_j,\; n_j u_i) & |\; \varphi \ge 0 \\ \min(n_i n_j,\; u_i u_j) & |\; \varphi \le 0 \end{cases} \tag{11}$$

Substituting the two nominal terms as numerator in equation [4] yields the
corresponding dual form for the maximal Phi:

$$\varphi_{\max} = \begin{cases} \dfrac{\min[n_i u_j,\; n_j u_i]}{\sqrt{n_i u_i n_j u_j}} = \min\left(\sqrt{\dfrac{n_i u_j}{n_j u_i}},\; \sqrt{\dfrac{n_j u_i}{n_i u_j}}\right) & |\; \varphi \ge 0 \\[2ex] \dfrac{\min[n_i n_j,\; u_i u_j]}{\sqrt{n_i u_i n_j u_j}} = \min\left(\sqrt{\dfrac{n_i n_j}{u_i u_j}},\; \sqrt{\dfrac{u_i u_j}{n_i n_j}}\right) & |\; \varphi < 0 \end{cases} \tag{12}$$

With a little algebra each minimum term is converted into two inversely equivalent
radical fractions. The numerator of the first equals the denominator of the second
and vice versa. Hence one of the radicals is $\le 1$, the other $\ge 1$. However, since
$\varphi_{\max}$ must not exceed one, a simple two-step algorithm can be used for
computer programs:
1. IF $\varphi \ge 0$ THEN $\varphi_{\max} = [n_i u_j/(n_j u_i)]^{1/2}$ ELSE $\varphi_{\max} = [n_i n_j/(u_i u_j)]^{1/2}$.
2. IF $\varphi_{\max} > 1$ THEN $\varphi_{\max} = 1/\varphi_{\max}$.
The maximal Phi follows a simple logic: It equals the smaller radical fraction of
the cross-over products $n_i u_j$, $u_i n_j$ for positive and of the parallel products
$n_i n_j$, $u_i u_j$ for negative real differences. With formula [12] the primary
objective is attained: the maximal Phi is determined exclusively by the marginal
frequencies. The frequencies n and u may be replaced by the probabilities p and q. Of
course the minimal term may also be entered into the convex form [10]. At this point
the initial requirements should be reviewed.
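The two-step rule for the maximal Phi stated above lends itself to a short program. The following Python sketch is one possible reading of it; the function names `phi_max` and `neutral_phi` are illustrative choices, the sign of Phi is passed in as a flag, and all marginal frequencies are assumed to be non-zero. The second function simply relates a given Phi to its maximum as in formula [3].

```python
from math import sqrt

def phi_max(n_i: int, n_j: int, n: int, positive: bool) -> float:
    """Maximal Phi from the marginal frequencies (two-step rule).

    Assumes n_i, n_j and their complements n - n_i, n - n_j are non-zero.
    """
    u_i, u_j = n - n_i, n - n_j
    # Step 1: radical fraction of the cross-over products (Phi >= 0)
    # or of the parallel products (Phi < 0).
    if positive:
        r = sqrt((n_i * u_j) / (n_j * u_i))
    else:
        r = sqrt((n_i * n_j) / (u_i * u_j))
    # Step 2: take the reciprocal if the fraction exceeds one.
    return 1.0 / r if r > 1.0 else r

def neutral_phi(phi: float, n_i: int, n_j: int, n: int) -> float:
    """Phi related to its maximum, formula [3]."""
    return phi / phi_max(n_i, n_j, n, positive=(phi >= 0))

# Magazine example: Phi = .50, n_i = 500, n_j = 200, N = 1000
print(phi_max(500, 200, 1000, positive=True))   # 0.5
print(neutral_phi(0.5, 500, 200, 1000))         # 1.0
```

Only a published Phi value and the response frequencies are needed here, which is the situation described for data taken from the literature.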
The invariance of the marginal distributions (A1) and the non-negativity of the
frequencies (A2) are not violated, by design. The invariance of the absolute value in
case of inversions (A3) is achieved since the decisive quantity for extreme
differences, the "error cell" (the smaller element of the smaller product), remains
unchanged. The interval restriction (A4) holds since the absolute value of the extreme
difference can never be less than the real difference and never larger than the
geometric mean of the marginal products.
Finally, the neutral $\varphi^*$ has to be determined. If Phi-coefficients are
available, they will be related to the maximal Phi. However, if the original data are
available, the fraction of real difference and maximum difference [11] may be formed. A
more elegant way of performing this is by replacing the marginal frequencies with the
corresponding joint frequencies (last column of Table 4). The smaller value of
$\min[ad - bc + bN,\; ad - bc + cN]$ now only depends on the smaller cell b or c. The
maximum difference simplifies to $ad - bc + N \min(b, c)$. By employing a similar
process for the minimum difference and substituting into the convex combination we
obtain:

$$\begin{aligned} |a^* d^* - b^* c^*| &= g\,[\,ad - bc + N \min(b, c)\,] + (1 - g)\,[\,bc - ad + N \min(a, d)\,] \\ &= |ad - bc| + N\,[\,g \min(b, c) + (1 - g) \min(a, d)\,] \\ &= |ad - bc| + N e \end{aligned} \tag{13}$$

As the terms $g(ad - bc)$ and $(1 - g)(bc - ad)$ are never negative, they are replaced
by $|ad - bc|$ and then factored out. For the remaining quantity in brackets the
symbol e was already introduced in formula [9]. A very condensed formula results:

$$\varphi^* = \frac{ad - bc}{|ad - bc| + N e} \tag{14}$$

Here it becomes apparent that the extreme difference consists of the absolute real
difference plus the "error component".
5 Discussion
The starting point of this discourse was the statement that the incongruence of
the marginals diminishes the Phi-coefficient.
This is plausible as long as the hypothesis of association proceeds from symmetric
distributions: the more skewed the distribution, the lower the appropriateness and,
consequently, the correlation. This is different, however, if the structural objectives
or the empirical facts require asymmetry. The reduction, then, on formal grounds, is a
methodological artefact. This problem was recognized early on and was followed by a
series of Phi-modifications which left the user somewhat helpless in that they were
difficult to judge without a suitable frame of reference. Which one is logically
correct, technically sound and substantively clear to interpret? For comparison, four
formal requirements have been put forth. The first two, invariance of marginals and
non-negativity of the cell frequencies, are in general uncritical. However, the
invariance of the absolute value is nearly always violated. The only exception was
presented by Cureton (1959). At this point the question as to the adequacy of his
modification can be answered rather easily. The nominal terms in formula [11] can be
transformed algebraically:

$$\varphi \ge 0:\quad \min[n_i u_j,\; n_j u_i] = \min[n_i N - n_i n_j,\; n_j N - n_i n_j] = N \min(n_i, n_j) - n_i n_j$$

$$\varphi \le 0:\quad \min[n_i n_j,\; u_i u_j] = \min[n_i n_j,\; N^2 - N n_i - N n_j + n_i n_j] = \min[n_i n_j,\; n_i n_j - N(n_i + n_j - N)]$$

For a positive Phi the concluding expression corresponds formally to Cureton's
proposal, apart from the notation of the variables. In principle, this is true for a
negative Phi as well. In fact, he only offers the second alternative of the minimum,
i.e. the difference of the products $n_i n_j$ and $N(n_i + n_j - N)$, and requires that
the second product be set to zero if $n_i + n_j < N$; this corresponds to the term
$n_i n_j$. Therefore, his modification adheres to the requirements.
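The equivalence of the two nominal terms can also be checked by brute force. The sketch below compares the minimum form of formula [11] with Cureton's form [6] for every pair of marginal frequencies of one illustrative sample size; the function names and the choice N = 30 are assumptions made for this illustration. Integer arithmetic is used so that the comparison is exact.

```python
def nom_minimum(n_i: int, n_j: int, n: int, positive: bool) -> int:
    """Nominal term of formula [11], from the marginal frequencies."""
    u_i, u_j = n - n_i, n - n_j
    if positive:
        return min(n_i * u_j, n_j * u_i)
    return min(n_i * n_j, u_i * u_j)

def nom_cureton(n_i: int, n_j: int, n: int, positive: bool) -> int:
    """Nominal term of formula [6]; the bracket is zeroed if n_i + n_j < N."""
    if positive:
        return n * min(n_i, n_j) - n_i * n_j
    return n_i * n_j - n * max(0, n_i + n_j - n)

# Exhaustive comparison over all marginals for one illustrative N.
N = 30
mismatches = [
    (n_i, n_j, positive)
    for n_i in range(N + 1)
    for n_j in range(N + 1)
    for positive in (True, False)
    if nom_minimum(n_i, n_j, N, positive) != nom_cureton(n_i, n_j, N, positive)
]
print(len(mismatches))   # 0
```

An empty mismatch list for one N is of course no proof; it merely mirrors the algebraic transformation numerically.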
Upon consideration of this statement one might side with Cureton by asking whether
an updated treatment of this problem is even necessary. The following points offer some
support for this:
1. Catalogue of requirements, examination of uniqueness, solution. As compared
with earlier approaches, a more global problem analysis was adopted. In the first
step a catalogue of requirements for a Phi-modification was developed on a generalized
level; in the second step it was shown that a solution must exist for a coefficient
with these properties, since the rate of error for the nominal term is uniquely fixed.
On this basis a measure was developed satisfying the requirements. Such an analytical
procedure also allows other coefficients to be classified. Only Cureton's approach
meets the formal criteria. He developed his modification rather intuitively, on
technical considerations concerning the possible variations of the relative marginal
frequencies in the fourfold table. The resulting formulas are rather clumsy and bound
to more constraints. Insight into his procedure and the rather opaque formalism does
not suggest itself. Therefore it is not surprising that even in the recent past
further Phi-modifications were proposed and discussed (cf. Guilford, 1965; Bosser,
1979; Kubinger, 1990; Bortz, 1993; Hannover & Steyer, 1994; Kubinger, 1995; Bortz,
1995). With every advancement, some considerations go off track and become obviously
insufficient. But of course there are also old "true" elements gaining new meaning in a
new framework of thinking.
2. The developed system helps to evaluate and choose suitable modifications.
Up to now Phi-modifications were not easily discussed because principles of
evaluation (for instance, a list of requirements) existed at best implicitly.
Hardly any author has explained in what respect his modification is superior to
the previous ones; this holds for Cureton, too.
Consequently, there was no sufficient reason for the user to prefer or reject a certain coefficient. Here the nominal expression turned out to have four slightly different appearances, depending on the inversion of the variables. The published solutions, however, with one exception, contain only a partial set. This underscores the merit of the system: not only could the insufficiency of most of the proposals be shown, but also the formal equivalence of one nearly forgotten modification. Above all, the danger of being seduced into false statements of association and false conclusions through the use of unsuitable Phi-modifications is averted.

3. Better understanding and ease of reception through clarity of interpretation. The proposed modification can be adopted by the user in the most accepted form of interpretation. According to the definition, it may be regarded as the ratio of the real difference to the maximum possible difference of the products of congruent and incongruent dyads. But it may also be read, following Loevinger's (1948) homogeneity, as the ratio of covariance to maximal covariance, or, following Cureton (1959), as the quotient of real Phi and maximal Phi. The most substantial part of the Phi-modification, the nominal term, is easy to memorize. It equals the minimum of the crossover products $n_i u_j$ and $u_i n_j$ for positive correlations and the minimum of the parallel products $n_i n_j$ and $u_i u_j$ for negative correlations.

4. Ease of handling. The modified coefficient can be computed on the basis of the joint frequencies or by means of the Phi-coefficient and the marginal frequencies. Both versions can easily be implemented in computer programs.

However, in the end, a deficit has to be stated: since little is known about the distribution of the coefficient, a statistical test of significance cannot yet be recommended. Goodman's (1959) criterion may possibly help.
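The two ways of computing the modified coefficient mentioned under "Ease of handling" can be sketched as follows. This is a minimal, hedged illustration in Python, not the author's implementation. It assumes the fourfold-table convention of the introduction (a, b, c, d are the joint frequencies of ++, +-, -+, --), marginals taken as n_i = a + b and n_j = a + c, and non-degenerate marginals for the classical Phi. All function names are hypothetical.

```python
import math


def nominal_term(n_i, n_j, N, positive):
    """Minimum of the crossover products (positive) or parallel products (negative)."""
    u_i, u_j = N - n_i, N - n_j
    if positive:
        return min(n_i * u_j, n_j * u_i)
    return min(n_i * n_j, u_i * u_j)


def phi_classical(a, b, c, d):
    """Classical Phi-coefficient; assumes no marginal frequency is zero."""
    n_i, u_i = a + b, c + d
    n_j, u_j = a + c, b + d
    return (a * d - b * c) / math.sqrt(n_i * u_i * n_j * u_j)


def phi_modified_from_table(a, b, c, d):
    """Version 1: modified coefficient from the joint frequencies."""
    N = a + b + c + d
    n_i, n_j = a + b, a + c
    diff = a * d - b * c
    if diff == 0:
        return 0.0
    return diff / nominal_term(n_i, n_j, N, diff > 0)


def phi_modified_from_phi(phi, n_i, n_j, N):
    """Version 2: modified coefficient from a published Phi and the marginals."""
    if phi == 0:
        return 0.0
    u_i, u_j = N - n_i, N - n_j
    # Recover ad - bc from Phi, then divide by the nominal term.
    diff = phi * math.sqrt(n_i * u_i * n_j * u_j)
    return diff / nominal_term(n_i, n_j, N, phi > 0)
```

As a usage example under these assumptions, the table a = 30, b = 0, c = 40, d = 30 describes consistent responses with unequal marginals: the classical Phi is 900/2100, about 0.43, whereas `phi_modified_from_table(30, 0, 40, 30)` gives 1.0, and the second version reproduces this value up to floating-point rounding.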
For the present it will perhaps be best to decide conservatively by testing the significance of the corresponding classical Phi-coefficient. Finally, it should be remembered that the modified coefficient should not be used in a factor analysis, since the asymmetry contradicts the presupposed normal distribution of the data.

References

[1] Bortz, J. (1993). Statistik für Sozialwissenschaftler (4th ed.). Berlin: Springer.
[2] Bortz, J., Lienert, G.A., Boehnke, K. (1990). Verteilungsfreie Methoden in der Biostatistik. Berlin: Springer.
[3] Bosser, T. (1979). PHI-Interkorrelation seltener Symptome. Psychologische Beiträge, 21, 343-348.
[4] Cole, L.C. (1949). The measurement of interspecific association. Ecology, 30, 411-424.
[5] Cureton, E.E. (1959). Note on $\phi/\phi_{\max}$. Psychometrika, 24, 89-91.
[6] Dice, L.R. (1945). Measures of the amount of ecologic association between species. Ecology, 26, 297-302.
[7] Ferguson, G.A. (1941). The factorial interpretation of test difficulty. Psychometrika, 41, 323-329.
[8] Forbes, S.A. (1907). On the local distribution of certain Illinois fishes. An essay in statistical ecology. Bulletin of the Illinois State Laboratory of Natural History, 7, 273-303.
[9] Goodman, L.A. (1959). Simple statistical methods for scalogram analysis. Psychometrika, 24, 29-43.
[10] Guilford, J.P. (1965a). Fundamental statistics in psychology and education (4th ed.). New York: McGraw-Hill.
[11] Guilford, J.P. (1965b). The minimal PHI coefficient and the maximal PHI. Educational and Psychological Measurement, 25, 3-8.
[12] Hannover, W., Steyer, R. (1994). Zur Korrektur des φ-Koeffizienten. Newsletter der Fachgruppe Methoden, II, 4-5.
[13] Kubinger, K.D. (1990). Übersicht und Interpretation der verschiedenen Assoziationsmaße. Psychologische Beiträge, 32, 290-346.
[14] Kubinger, K.D. (1995). Entgegnung: Zur Korrektur des φ-Koeffizienten. Newsletter der Fachgruppe Methoden, II, 3-4.
[15] Loevinger, J.A. (1947).
A systematic approach to the construction and evaluation of tests of ability. Psychological Monograph, 61, 1-49.
[16] Loevinger, J. (1948). The technique of homogeneous tests compared with some aspects of "scale analysis" and factor analysis. Psychological Bulletin, 45, 507-529.
[17] Torgerson, W.S. (1958). Theory and methods of scaling. New York: Wiley.
[18] Yule, G.U. (1911). An introduction to the theory of statistics. London: Griffin & Co.
[19] Yule, G.U. (1912). On the methods of measuring the association between two attributes. Journal of the Royal Statistical Society, 75, 579-652.

First submitted January 1, 1997. Revised June 16, 1997. Accepted June 17, 1997.