SW 983 - DISCRIMINANT ANALYSIS

Discriminant analysis (DA) can be considered from two perspectives: that of multivariate analysis of variance (MANOVA) and that of multiple regression (MR). Seen from both, DA serves as a link between MR and MANOVA.

From the MANOVA Perspective

DA provides a tool for describing how groups differ with respect to multiple (continuous) dependent variables. It permits one to judge the relative contribution of each variable to a linear function that maximizes between-group differences.

D = B0 + B1X1 + B2X2 + ... + BpXp

The B's are chosen so that the values of the discriminant function differ as much as possible between the groups; that is, so that for the discriminant scores the ratio of the between-groups sum of squares to the within-groups sum of squares is a maximum.
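The maximization criterion can be illustrated with a minimal sketch for the two-group, two-variable case. It assumes made-up data and uses the standard two-group solution in which the weights solve W b = (difference between group mean vectors), where W is the pooled within-groups sums-of-squares-and-cross-products matrix. The data and all function names are assumptions for illustration only. The scaling of the B's is arbitrary, and the constant B0 is omitted because it does not affect the ratio.

```python
# Made-up illustrative data: each case is (X1, X2); two groups of five cases.
group_a = [(2.0, 3.0), (3.0, 5.0), (4.0, 4.0), (5.0, 6.0), (6.0, 7.0)]
group_b = [(5.0, 2.0), (6.0, 4.0), (7.0, 3.0), (8.0, 5.0), (9.0, 5.0)]


def mean_vector(rows):
    """Mean of each of the two variables."""
    return [sum(r[j] for r in rows) / len(rows) for j in range(2)]


def sscp(rows):
    """2x2 sums of squares and cross-products about the group's own means."""
    m = mean_vector(rows)
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows)
             for j in range(2)] for i in range(2)]


def ss_ratio(scores_a, scores_b):
    """Between-groups SS divided by within-groups SS for a single variable."""
    mean_a = sum(scores_a) / len(scores_a)
    mean_b = sum(scores_b) / len(scores_b)
    grand = (sum(scores_a) + sum(scores_b)) / (len(scores_a) + len(scores_b))
    between = (len(scores_a) * (mean_a - grand) ** 2
               + len(scores_b) * (mean_b - grand) ** 2)
    within = (sum((s - mean_a) ** 2 for s in scores_a)
              + sum((s - mean_b) ** 2 for s in scores_b))
    return between / within


# Pooled within-groups SSCP matrix W and the difference between group means.
w_a, w_b = sscp(group_a), sscp(group_b)
W = [[w_a[i][j] + w_b[i][j] for j in range(2)] for i in range(2)]
m_a, m_b = mean_vector(group_a), mean_vector(group_b)
diff = [m_a[0] - m_b[0], m_a[1] - m_b[1]]

# Solve W b = diff for the 2x2 case (Cramer's rule).
det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
b1 = (W[1][1] * diff[0] - W[0][1] * diff[1]) / det
b2 = (W[0][0] * diff[1] - W[1][0] * diff[0]) / det


def discriminant_score(row):
    """D = B1*X1 + B2*X2 (constant B0 omitted; it does not change the ratio)."""
    return b1 * row[0] + b2 * row[1]


ratio_d = ss_ratio([discriminant_score(r) for r in group_a],
                   [discriminant_score(r) for r in group_b])
ratio_x1 = ss_ratio([r[0] for r in group_a], [r[0] for r in group_b])
ratio_x2 = ss_ratio([r[1] for r in group_a], [r[1] for r in group_b])

print(f"B1 = {b1:.3f}, B2 = {b2:.3f}")
print(f"between/within SS ratio: D = {ratio_d:.3f}, "
      f"X1 alone = {ratio_x1:.3f}, X2 alone = {ratio_x2:.3f}")
```

With these made-up data the ratio for the discriminant scores is considerably larger than the ratio for either variable taken alone, which is what the choice of B's is meant to achieve.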

Discriminant scores are obtained for each observation by substituting that observation's values on the independent variables into the discriminant function.

The discriminant function can be used to classify cases from the same sample and determine the "correct classification rates".
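Scoring and classification can be sketched as follows. The data and the coefficients B0, B1, B2 are hypothetical values chosen for illustration, and the rule used here (assign each case to the group whose centroid is nearest on D) is one simple option that assumes equal prior probabilities; statistical packages may apply other rules.

```python
# Made-up illustrative data: each case is (X1, X2), keyed by its actual group.
groups = {
    "A": [(2.0, 3.0), (3.0, 5.0), (4.0, 4.0), (5.0, 6.0), (6.0, 7.0)],
    "B": [(5.0, 2.0), (6.0, 4.0), (7.0, 3.0), (8.0, 5.0), (9.0, 5.0)],
}

# Hypothetical coefficients; B0 only shifts every score by the same amount.
B0, B1, B2 = 0.0, -0.87, 0.90


def discriminant_score(x1, x2):
    """D = B0 + B1*X1 + B2*X2 for a single observation."""
    return B0 + B1 * x1 + B2 * x2


# Discriminant score for every case, kept with its actual group.
scores = {label: [discriminant_score(x1, x2) for x1, x2 in rows]
          for label, rows in groups.items()}

# Group centroid = mean discriminant score of the group.
centroids = {label: sum(v) / len(v) for label, v in scores.items()}


def classify(d):
    """Assign a score to the group with the nearest centroid (equal priors)."""
    return min(centroids, key=lambda label: abs(d - centroids[label]))


correct = sum(classify(d) == label
              for label, v in scores.items() for d in v)
total = sum(len(v) for v in scores.values())
rate = correct / total

print(f"centroids: {centroids}")
print(f"correct classification rate: {rate:.0%}")
```

Because the same cases are used both to build the function and to evaluate it, a rate obtained this way tends to be optimistic compared with the rate for new cases.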

The number of possible discriminant functions is equal to the number of groups minus one or the number of predictor variables, whichever is smaller.
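As a small arithmetic illustration of this rule, the helper below (a hypothetical function name, not taken from any package) returns the maximum number of discriminant functions.

```python
def max_discriminant_functions(n_groups, n_predictors):
    """Smaller of (number of groups - 1) and the number of predictors."""
    if n_groups < 2 or n_predictors < 1:
        raise ValueError("need at least two groups and one predictor")
    return min(n_groups - 1, n_predictors)


print(max_discriminant_functions(2, 4))  # two groups: always one function
print(max_discriminant_functions(3, 5))  # limited by groups - 1
print(max_discriminant_functions(5, 2))  # limited by number of predictors
```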

Statistics

eta = canonical correlation = measure of the degree of association between the discriminant scores and the groups.

lambda = ratio of the within-groups SS to the total SS = proportion of the total variance in the discriminant scores not explained by differences among groups.

lambda + eta^2 = 1

lambda can be transformed into a chi-square or F statistic (the latter for two groups only).

group centroids = discriminant functions evaluated at the group means.

structure coefficients = correlations of the discriminant score with each of the original variables. The square of a structure coefficient indicates the proportion of variance of the corresponding variable that is accounted for by the given discriminant function.
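The statistics above can be computed directly from the discriminant scores. The sketch below is only an illustration under several assumptions: the data and coefficients are made up, the chi-square uses Bartlett's approximation, -(N - 1 - (p + g)/2) ln(lambda) with p(g - 1) degrees of freedom, and the structure coefficients are computed as pooled within-groups correlations (some texts use total-sample correlations instead).

```python
import math

# Made-up illustrative data: each case is (X1, X2), keyed by group.
groups = {
    "A": [(2.0, 3.0), (3.0, 5.0), (4.0, 4.0), (5.0, 6.0), (6.0, 7.0)],
    "B": [(5.0, 2.0), (6.0, 4.0), (7.0, 3.0), (8.0, 5.0), (9.0, 5.0)],
}
B1, B2 = -0.87, 0.90  # hypothetical discriminant coefficients (B0 omitted)


def mean(values):
    return sum(values) / len(values)


scores = {g: [B1 * x1 + B2 * x2 for x1, x2 in rows]
          for g, rows in groups.items()}
all_scores = [s for v in scores.values() for s in v]
grand_mean = mean(all_scores)

# Group centroids: the discriminant function evaluated at the group means,
# which equals the mean discriminant score within each group.
centroids = {g: mean(v) for g, v in scores.items()}

ss_within = sum((s - centroids[g]) ** 2 for g, v in scores.items() for s in v)
ss_total = sum((s - grand_mean) ** 2 for s in all_scores)
ss_between = ss_total - ss_within

wilks_lambda = ss_within / ss_total          # unexplained proportion
eta = math.sqrt(ss_between / ss_total)       # canonical correlation

# Bartlett's chi-square approximation (an assumed choice of transformation).
n_cases, n_vars, n_groups = len(all_scores), 2, len(groups)
chi_square = -(n_cases - 1 - (n_vars + n_groups) / 2) * math.log(wilks_lambda)
df = n_vars * (n_groups - 1)


def structure_coefficient(j):
    """Pooled within-groups correlation of variable j with the scores."""
    sxy = sxx = syy = 0.0
    for g, rows in groups.items():
        x_bar = mean([r[j] for r in rows])
        for r, s in zip(rows, scores[g]):
            dx, dy = r[j] - x_bar, s - centroids[g]
            sxy += dx * dy
            sxx += dx * dx
            syy += dy * dy
    return sxy / math.sqrt(sxx * syy)


structure = [structure_coefficient(j) for j in range(n_vars)]

print(f"lambda = {wilks_lambda:.4f}, eta = {eta:.4f}")
print(f"lambda + eta^2 = {wilks_lambda + eta ** 2:.4f}")
print(f"chi-square = {chi_square:.2f} on {df} df")
print(f"centroids = {centroids}")
print(f"structure coefficients = {[round(r, 3) for r in structure]}")
```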

Relationship to Multiple Regression Analysis

From the MR perspective, we can think of group membership as the dependent variable, which we try to predict from a set of continuous independent variables. In fact, in performing DA, linear combinations of the independent (sometimes called predictor) variables are formed and serve as the basis for classifying cases into one of the groups.

Two-group linear discriminant analysis is closely related to multiple linear regression analysis. If the binary grouping variable is treated as the dependent variable (dummy coded) and the predictor variables as the independent variables, the multiple regression coefficients will be proportional to the discriminant function coefficients. The exact constant of proportionality varies from data set to data set, but the two sets of coefficients are always proportional. This holds only for two-group discriminant analysis.
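The proportionality can be checked numerically. The following sketch uses made-up data with two predictors, computes the discriminant weights from the pooled within-groups SSCP matrix, computes the regression slopes for a 0/1 dummy-coded group variable from the total SSCP matrix, and compares the two sets coefficient by coefficient. All names are hypothetical, and the 2x2 solver is written out by hand to keep the example self-contained.

```python
# Made-up illustrative data: each case is (X1, X2); two groups of five cases.
group_a = [(2.0, 3.0), (3.0, 5.0), (4.0, 4.0), (5.0, 6.0), (6.0, 7.0)]
group_b = [(5.0, 2.0), (6.0, 4.0), (7.0, 3.0), (8.0, 5.0), (9.0, 5.0)]


def mean_vector(rows):
    return [sum(r[j] for r in rows) / len(rows) for j in range(2)]


def sscp(rows, center):
    """2x2 sums of squares and cross-products about `center`."""
    return [[sum((r[i] - center[i]) * (r[j] - center[j]) for r in rows)
             for j in range(2)] for i in range(2)]


def solve_2x2(m, v):
    """Solve m x = v for a 2x2 matrix m (Cramer's rule)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(m[1][1] * v[0] - m[0][1] * v[1]) / det,
            (m[0][0] * v[1] - m[1][0] * v[0]) / det]


all_rows = group_a + group_b
y = [1.0] * len(group_a) + [0.0] * len(group_b)  # dummy-coded group variable

m_a, m_b, m_all = mean_vector(group_a), mean_vector(group_b), mean_vector(all_rows)

# Discriminant weights: solve W b = (mean of A - mean of B).
w_a, w_b = sscp(group_a, m_a), sscp(group_b, m_b)
W = [[w_a[i][j] + w_b[i][j] for j in range(2)] for i in range(2)]
disc = solve_2x2(W, [m_a[0] - m_b[0], m_a[1] - m_b[1]])

# Regression slopes: solve T beta = s_xy, with T the total SSCP matrix of the
# predictors and s_xy the cross-products of each predictor with y.
T = sscp(all_rows, m_all)
y_bar = sum(y) / len(y)
s_xy = [sum((r[j] - m_all[j]) * (yi - y_bar) for r, yi in zip(all_rows, y))
        for j in range(2)]
slopes = solve_2x2(T, s_xy)

# If the two sets are proportional, these ratios are equal.
ratios = [slopes[j] / disc[j] for j in range(2)]
print(f"discriminant weights: {disc}")
print(f"regression slopes:    {slopes}")
print(f"slope / weight:       {ratios}")
```

The two printed ratios agree up to rounding error, so the regression slopes are a constant multiple of the discriminant weights for these data. Changing the data changes the constant but not the agreement.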