POPULATION AND SEX DETERMINATION BASED ON MEASUREMENTS OF THE TALUS

Terri B. Torres

A Thesis

Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of

Master of Science

Committee:

Richard N. McGrath, John T. Chen, Nancy Boudreau


ABSTRACT

Dr. R.N. McGrath, Advisor

Categorizing remains by sex and race has long been a challenge for the medicolegal profession. Logistic regression models and multicategory logit models can be used to accurately place individuals into their respective groups using measurements of the talus.

Successful placement was achieved in 93% of the individuals for the response variable of sex.

The percentage of correct placement for race alone was 95%. The correct placement for sex with race ranged from 76% to 95% depending on the group. The goal of this paper was to show that, in addition to the pelvis and skull, the talus can be used with equal or greater success rates to classify human remains.


ACKNOWLEDGEMENTS

At this time I wish to express my thanks to Dr. R. N. McGrath, Dr. Nancy Boudreau and Dr. John Chen for their unselfish support and encouragement that helped me understand and love statistics.

Terri Torres


TABLE OF CONTENTS

ABSTRACT ...... ii

ACKNOWLEDGEMENTS ...... iii

LIST OF TABLES ...... v

LIST OF FIGURES ...... vi

LIST OF APPENDICES ...... vii

INTRODUCTION ...... 1

MATERIALS AND METHODS ...... 2

DISCUSSION

1. Sex ...... 4

2. Race ...... 15

3. Sex with Race ...... 20

CONCLUSION ...... 26

FIGURES ...... 28

APPENDICES ...... 35

LITERATURE CITED ...... 72


LIST OF TABLES

Table Page

1.1 Descriptive Statistics and Analysis for “Sex” ...... 4

1.2 Model Selection for “Sex” ...... 13

2.1 Descriptive Statistics and Analysis for “Race” ...... 15

2.2 Model Selection for “Race” ...... 18

2.3 Model Refinement for “Race” ...... 19

3.1 Summary of ANOVA for “Sex” with “Race” ...... 20

3.2 Summary of Placement for Validating Set ...... 24

3.3 Summary of Placement for Large Set ...... 24

3.4 Comparison of Models ...... 25


LIST OF FIGURES

Figure

1 of the Human Foot ...... 28

2 Superior View of the Right Talus ...... 29

3 Medial View of the Right Talus ...... 30

4 Anterior View of the Right Talus ...... 31

5 Lateral View of the Right Talus ...... 32

6 Inferior View of Talus ...... 33

7 Superior View of Talus Showing Neck Length ...... 34


LIST OF APPENDICES

Appendix Page

A Box Plots for “Sex” ...... 35

B Correlation Matrix for “Sex” ...... 36

C Matrix Plot for Eleven Variables ...... 37

D SAS Model Selection for “Sex” ...... 38

E MINITAB Five variable Model Selection for “Sex” ...... 40

F Measures of Association for “Sex” ...... 41

G Box Plots for “Sex”, Validating Set ...... 42

H Box Plots for “Sex”, Entire Set ...... 43

I Matrix Plot for Model Independent Variables, Validating Set ...... 44

J Logit Model for Model Set without Influential Values ...... 45

K Boxplots for “Race” ...... 46

L SAS Model Selection for “Race” ...... 47

M MINITAB Model for “Race” ...... 50

N MINITAB Output for Interaction Models for “Race” ...... 51

O ANOVA and Tukey Comparisons for “Sex” with “Race” ...... 55

P SAS Output for Model Selection for “Sex” with “Race” ...... 65

Q MINITAB Output for “Sex” with “Race” Models ...... 70

R SAS Code for Model Selection ...... 71


INTRODUCTION

Within a medicolegal context, the objective of the forensic scientist working with recovered skeletal remains is the determination of sex, stature, age, and ancestry. Several techniques in the field of forensic anthropology make it possible to determine the demographics (e.g., sex and race) of the skeletal material under investigation. One method relies on basic visual inspection and the recognition of physical characteristics that are unique to individual elements of the human skeleton (Byers, 2008). A more quantitative approach (osteometry) requires taking measurements directly from the bone (Introna et al., 1997; Bass, 1995; Trotter, 1970; Trotter & Gleser, 1958; McKern & Stewart, 1957). For example, the human skull possesses several unique and quantifiable traits, and one method often employed by forensic anthropologists in cranial studies is discriminant function analysis (De Vito & Saunders, 1990; Giles & Elliot, 1963). Discriminant function analysis is also applicable to the postcranial skeleton, and several studies have focused specifically on elements of the foot (Bidmos & Dayal, 2004; Bidmos & Asala, 2004; Steele, 1976). The results from previous studies on the postcranial skeleton, notably the elements of the foot, provided accuracy values of 89% for correct sex determination (i.e., male or female). However, an alternative method for determining sex and estimating ancestry from unknown skeletal material is logistic regression. The objective of this study is to demonstrate that logistic regression can predict both sex and race with high accuracy based exclusively on measurements taken directly from the talus (Figure 1). These measurements will also be used with a multicategory logit model to predict the four categories of sex with race. In this paper the only two races included are black and white.
Surface or even shallow burials often result in the loss of skeletal material, which can greatly impede an investigation. Unlike the skull and long bones such as the femur or humerus, the talus is compact and closely associated with soft tissue (ligaments), which makes it more resistant to taphonomic factors and thus increases its chances of preservation and eventual field recovery. In situations requiring post-mortem identification where recovered skeletal material may be limited, this quality makes the talus an appropriate alternative for osteological analysis.


MATERIALS AND METHODS

The osteological material used in this study is part of the Hamann-Todd Osteological Collection, housed at the Cleveland Museum of Natural History, Cleveland, Ohio. The collection contains more than 3,100 human skeletons from Cuyahoga County, Ohio, which were acquired during the early part of the 1900s. To maintain consistency throughout the study, only the right talus was used for data collection. A total of 270 specimens were randomly selected. Any tali that exhibited missing bone or displayed any noticeable pathology were eliminated from the sample. Two specimens had arthritis, so measurements could not be made properly. One specimen had not been preserved adequately. Therefore 267 specimens remained. A subset with a sample size of 10 was randomly selected from each of the four subpopulations (white female, black female, white male, and black male) to be used for validation of the models. The remaining 227 individuals were used to build the models in this study: 113 females and 114 males. Both the female and male samples were further divided into two separate subsamples. The females comprised 59 white females ranging in age from 12 through 93 years and 54 black females ranging in age from 20 through 87 years. The 114 males comprised 60 white males ranging in age from 23 through 81 years and 54 black males between the ages of 20 and 76 years. The categories of black and white are consistent with the manner in which these terms were used during the early part of the 1900s when the specimens were collected. The term Black referred to a group of individuals not of Asian or Caucasian descent. White was used as a category of individuals neither Black nor Asian. A total of 11 measurements were taken for each right talus and recorded to the nearest 0.01 mm using a Mitutoyo digital caliper, model number CD-8" CSX. The 11 measurements are a modification of those used in previous osteological studies (Bidmos & Dayal, 2004).
The 11 measurements (Figures 2-7) include: talus length (TL), talus width (TW), talus height (TH), neck length (NL), neck width (NW), trochlea length (TRL), trochlea width (TRW), calcaneus articular surface length (CASL), calcaneus articular surface width (CASW), navicular articular surface length (NASL), and navicular articular surface width (NASW). The data were then organized into white male, white female, black male and black female and subjected to standard descriptive statistical analysis (Table 1). This was then followed by a more rigorous logistic regression analysis using SAS 4.1 and Minitab 15, first comparing males to females, second comparing blacks to whites, and finally using multicategory procedures to simultaneously determine the probability of each sex-race category previously mentioned.


Sex

As can be seen from the boxplots, the eleven variables appear to have significantly different spreads based on sex. There are outliers in most of the variables, but it was decided to keep all values in the data set after these values were checked for recording errors. See Appendix A.

A t-test with null hypothesis $H_0: \mu_{male} = \mu_{female}$ was performed for each of the variables. See Table 1.1. The mean values of ten of the eleven variables are significantly different with respect to sex when based on the p-values alone. The variable “NH” does not have a statistically significant difference.

Table 1.1 Descriptive Statistics and Analysis for “Sex”

Variable  Male Mean  SD Male  Female Mean  SD Female  t-test Statistic  p-value
TL        60.96      3.61     53.75        3.37       -15.54            0.000
TW        45.28      2.86     39.30        3.25       -14.71            0.000
TH        33.06      2.11     29.01        2.34       -13.66            0.000
NL        18.41      2.65     16.03        4.86       -4.57             0.000
NH        20.89      4.70     18.7         13.8       -1.57             0.119
TRW       35.26      3.10     31.18        3.93       -8.67             0.000
TRL       36.71      2.73     32.43        3.38       -10.48            0.000
CASW      33.74      2.20     29.88        3.08       -10.89            0.000
CASL      22.82      2.01     19.67        2.59       -10.22            0.000
NASW      32.43      4.22     28.11        3.23       -8.65             0.000
NASH      26.65      3.90     23.17        2.65       -7.88             0.000
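As a rough check on how the t-test statistics in Table 1.1 relate to the tabled means and standard deviations, the following Python sketch recomputes the statistic for “TL” from summary values alone. It assumes an unpooled (Welch) two-sample statistic and the group sizes of 113 females and 114 males given in Materials and Methods; the thesis does not state which variant of the t-test was used, and the function name `welch_t` is only illustrative.

```python
import math

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Unpooled two-sample t statistic for (mean_a - mean_b)."""
    standard_error = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return (mean_a - mean_b) / standard_error

# Group sizes from Materials and Methods (assumed to apply to every variable).
N_FEMALE, N_MALE = 113, 114

# TL row of Table 1.1, taken as female minus male to match the table's sign.
t_tl = welch_t(53.75, 3.37, N_FEMALE, 60.96, 3.61, N_MALE)
print(round(t_tl, 2))
```

The result lands close to the tabled value of -15.54; small differences are expected because the tabled means and standard deviations are rounded.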

A correlation matrix was calculated for the variables as well. See Appendix B. Most of the variables are correlated with each other (calculated correlation coefficient close to -1 or 1 and p-value less than 0.05). This is reasonable since they are eleven measurements of an individual's talus, and one would expect them to be correlated. This means that when the variables for the model are actually selected, the variation captured by one variable may overlap with the variation captured by other variables chosen.

The strongest correlations, those with a correlation coefficient greater than 0.70, are between the variables “TL” and “TW”, “TL” and “TH”, “TW” and “TH”, and “TW” and “CASW”, with correlation coefficients of 0.764, 0.794, 0.747, and 0.734 respectively. These strongest correlations are verified by observing the scatterplot matrix. See Appendix C. The diagonal oval shape that signifies a stronger correlation can be seen in all of the above mentioned associations.

When all eleven variables were entered into the SAS stepwise selection procedure in logistic analysis with a 0.15 level to enter and a 0.10 level to exit, five variables were selected for a model for the response of “sex”. The variables were: “TL”, “TW”, “TH”, “NASW”, and “NASH”. The model is

$$\operatorname{logit}\left[\hat{P}(Y=1)\right] = -50.62 + 0.22x_{TL} + 0.31x_{TW} + 0.43x_{TH} + 0.21x_{NASW} + 0.20x_{NASH}$$

where the event of success, Y=1, is the category of male. This leads to the equation

$$\hat{\pi}(Y=1) = \frac{e^{-50.62 + 0.22x_{TL} + 0.31x_{TW} + 0.43x_{TH} + 0.21x_{NASW} + 0.20x_{NASH}}}{1 + e^{-50.62 + 0.22x_{TL} + 0.31x_{TW} + 0.43x_{TH} + 0.21x_{NASW} + 0.20x_{NASH}}}$$

with $\hat{\pi}$ as the estimate of the probability of success. This second equation can then be utilized to predict the success for the event “male”, where if the probability is greater than or equal to 0.50 the individual is categorized as “male”. Likewise, if the probability is less than 0.50 the individual is classified as “female”. See Appendix D.
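The classification rule just described can be sketched in Python as follows. The coefficients are the rounded values quoted in the text, taken with a negative intercept and positive slopes; that sign pattern is an assumption here, chosen to agree with the later statement that all of the odds ratios for this model exceed one. The inputs are the group means from Table 1.1 and serve only as illustrative values, not as real specimens, and the function names are hypothetical.

```python
import math

# Rounded coefficients of the five-variable model for "sex" (Y = 1 is male).
# Sign pattern (negative intercept, positive slopes) is an assumption.
INTERCEPT = -50.62
SLOPES = {"TL": 0.22, "TW": 0.31, "TH": 0.43, "NASW": 0.21, "NASH": 0.20}

def prob_male(measurements_mm):
    """Estimated probability of 'male' from talus measurements in mm."""
    logit = INTERCEPT + sum(SLOPES[k] * measurements_mm[k] for k in SLOPES)
    return 1.0 / (1.0 + math.exp(-logit))

def classify(measurements_mm, cutoff=0.50):
    """Assign 'male' when the estimated probability reaches the cutoff."""
    return "male" if prob_male(measurements_mm) >= cutoff else "female"

# Illustrative inputs: group means from Table 1.1, not actual specimens.
male_means = {"TL": 60.96, "TW": 45.28, "TH": 33.06, "NASW": 32.43, "NASH": 26.65}
female_means = {"TL": 53.75, "TW": 39.30, "TH": 29.01, "NASW": 28.11, "NASH": 23.17}

print(classify(male_means), classify(female_means))
```

Because the coefficients are rounded to two decimals, probabilities for individual specimens computed this way may differ slightly from those in the SAS output.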

The probability that $\hat{Y}=1$ (the prediction is “male”) given that $Y=1$ (the individual is actually male) is called sensitivity. With this model it was found that $P(\hat{Y}=1 \mid Y=1) = 18/20$ or 0.90 using a validating set of 20 male individuals that were not utilized in the selection of the model.

As stated earlier, a subset of the original 267 individuals was reserved as a validating set. Ten specimens were randomly selected from each of the four categories of sex and race. This validating set was removed from all calculations to serve as an independent set to test the resulting equations.

The specificity, $P(\hat{Y}=0 \mid Y=0)$, was $17/20$ or 0.85 for female individuals from the validating set. Overall 35 of the 40 individuals were categorized properly. This means that approximately 88% of the individuals were correctly classified.
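The three validation rates quoted above follow directly from the counts of correctly placed individuals. A minimal Python sketch, using only the counts stated in the text (18 of 20 males and 17 of 20 females correct); the helper name `rates` is illustrative:

```python
def rates(true_pos, n_pos, true_neg, n_neg):
    """Return (sensitivity, specificity, overall accuracy) from counts."""
    sensitivity = true_pos / n_pos
    specificity = true_neg / n_neg
    accuracy = (true_pos + true_neg) / (n_pos + n_neg)
    return sensitivity, specificity, accuracy

# "male" is the success category: 18 of 20 males and 17 of 20 females correct.
sens, spec, acc = rates(18, 20, 17, 20)
print(sens, spec, acc)
```

The overall accuracy is 35/40 = 0.875, which the text reports as approximately 88%.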

One test for this particular model is the one that uses the likelihood function, (L), by calculating the ratio of the maximizations. The numerator consists of the maximum over the possible parameter values assuming that the null hypothesis is correct. The null hypothesis is that all 5 variables have slopes equal to zero. The denominator is the maximum over the larger set of values that allows either the null or the alternate to be true. The values used for the slopes in the alternate are the maximum likelihood estimates for the parameters calculated by SAS and/or MINITAB.

The model had a likelihood ratio statistic of $G = 202.839$. This value is calculated by computing

$$-2\ln\left(\frac{\text{maximum likelihood when the parameters satisfy the null}}{\text{maximum likelihood when the parameters are unrestricted}}\right)$$

or

$$G = -2\ln L_{\text{intercept only}} - \left(-2\ln L_{\text{intercept and covariates}}\right)$$

The null hypothesis is

$$H_0: \beta_i = 0 \text{ for all } i$$

whereas the alternate is

$$H_a: \beta_i \neq 0 \text{ for some } i.$$

The maximum of the likelihood when the parameter values assume the null should always be less than or equal to the maximum when the parameter values assume the alternate, because the latter maximizes over a larger set of possible parameter values. Therefore, since the numerator is no greater than the denominator, the ratio will be at most 1 and the log of the ratio will be at most 0. When the log of the ratio is multiplied by -2 the result will be nonnegative. If the numerator is much less than the denominator then the test statistic will be a large positive number, signifying that there is strong evidence against the null hypothesis. The justification for taking the log and multiplying by -2 is that the resulting statistic has an approximate chi-square sampling distribution (Agresti 2007).

In this model the -2logL for the intercept only is 314.684, so the logL is -157.342. See Appendix D. The logL value for this five-variable model is -55.923. See Appendix E. The logL test statistic is computed as

$$G = 314.684 - (2 \times 55.923) = 202.838$$

The degrees of freedom are the number of parameters in the full model minus the number of parameters for the intercept alone. In this case it would be

$$6 - 1 = 5.$$

The logL test statistic follows a chi-square distribution leading to a p-value of less than 0.0001.

It can then be concluded that this model is useful, that is, that the null should be rejected, meaning that there is reason to believe that at least one of the slopes is not equal to zero.
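The likelihood ratio calculation above can be reproduced with a short Python sketch. The two log-likelihood values are those quoted from Appendices D and E; the helper `chi2_sf` is not from SAS or MINITAB but a small standard-library routine written for this illustration, using the usual recurrence for the chi-square upper tail with integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df.

    Starts from the closed forms for df = 1 and df = 2 and applies
    Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2:
        tail, k = math.erfc(math.sqrt(half)), 1
    else:
        tail, k = math.exp(-half), 2
    while k < df:
        tail += math.exp((k / 2) * math.log(half) - half - math.lgamma(k / 2 + 1))
        k += 2
    return tail

neg2_logl_null = 314.684   # -2 logL, intercept only (Appendix D)
logl_model = -55.923       # logL, five-variable model (Appendix E)

G = neg2_logl_null - (-2 * logl_model)
df = 6 - 1                 # six parameters in the model minus the intercept
p_value = chi2_sf(G, df)
print(round(G, 3), df, p_value < 0.0001)
```

The computed statistic agrees with the 202.838 shown above, and the tail probability is far below 0.0001.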

A goodness-of-fit test for this model utilizes the deviance. Although this test uses the maximized logL, as the previous test does, it is slightly different. In this case the second portion of the equation for deviance uses the logL for the saturated model. The deviance is defined as:

-2(maximized logL for a simple model – maximized logL for a saturated model).

The saturated model has a separate parameter for each observation. It is a perfect fit to the data and is the most complex model possible (Agresti 2007). The null hypothesis is that all of the parameters that are in the saturated model but not in the reduced model are equal to zero. The alternate here is that these parameters are not equal to zero.

The degrees of freedom value for this test statistic is the degrees of freedom for the complex model minus the degrees of freedom for the simpler model. Deviance for this model is

$$\text{deviance} = -2(-55.923 - 0) = 111.845$$

with degrees of freedom equal to 227 - 6 = 221, where 227 is the number of observations in the data set. Deviance is again an approximately chi-square test statistic. The p-value is approximately 1, signifying that the model fits adequately.

Kendall’s Tau-a is a measure of association. With 227 observations, the total number of distinct pairs, where no value can be paired with itself, is 227(226)/2 = 25,651. Of these 25,651 possible pairings it was calculated that 12,882 have a different value on the response variable (“male” or “female”) and 25,651 - 12,882 = 12,769 have the same value in the response. See Appendix F. The set of 12,882 is further divided into those that are concordant, discordant and tied. A pair is concordant if the observation “female” (Y=0) has a lower predicted mean score than the observation “male” (Y=1). A pair is discordant if the observation of “female” has a higher predicted mean score than that of “male”. Another way to explain this is that if the sign of the difference between the explanatory variables is the same as the sign of the difference between the response variables then the pair is concordant. If the signs of the differences are opposite then the pair is labeled discordant. If a pair is neither concordant nor discordant it is classified as a tie. With this model 12,405 are concordant, 466 are discordant and 11 are tied. Kendall’s tau-a is defined as the ratio of the difference between the count of concordant pairs and the count of discordant pairs to the total number of possible pairs. The calculated t, an estimate of $\tau$, for this model is given as

$$t = \frac{12405 - 466}{25651} = .4654$$

If there is perfect agreement between the pairs the coefficient has a value of 1. If there is perfect disagreement the coefficient has a value of -1. If the values are completely independent the coefficient has a value of 0.

The odds ratio, $\pi_C/\pi_D$, of the concordant to discordant observations is $\frac{1+\tau}{1-\tau}$ with $\tau$ being equal to $\pi_C - \pi_D$. The estimate of the probability of concordant is $\frac{12405}{25651} = .484$. The estimate of the probability of discordant is $\frac{466}{25651} = 0.018$. For this situation the odds ratio is 2.7, meaning a pair is almost 2.7 times more likely to be concordant than discordant (Wilkie 1980).

An alternate, simpler measure of association that is used to determine the strength and direction of the relation between the independent and dependent variables is Somers’ D. The values fall in a range between -1 and 1, where -1 means all pairs disagree and 1 means all pairs agree. Somers’ D is calculated by subtracting the number of pairs that are discordant from the number of pairs that are concordant and dividing by the total number of pairs with different responses. In this situation the number of pairs that are concordant is 12405 and the number of pairs that are discordant is 466. There are 11 ties. The total is 12882. See Appendix E. This leads to a calculation for Somers’ D of

$$\frac{12405 - 466}{12882} = .93$$

signifying that the relationship is both strong and positive (Appendix F).
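Both measures of association can be recomputed from the pair counts reported in the text. The Python sketch below uses only those counts; the variable names are illustrative.

```python
n = 227
total_pairs = n * (n - 1) // 2                 # all distinct pairs of observations
concordant, discordant, tied = 12405, 466, 11  # counts reported in Appendix F
different_response_pairs = concordant + discordant + tied

# Kendall's tau-a divides by all pairs; Somers' D divides by the
# pairs whose members have different responses.
tau_a = (concordant - discordant) / total_pairs
somers_d = (concordant - discordant) / different_response_pairs

print(total_pairs, different_response_pairs, round(tau_a, 4), round(somers_d, 2))
```

The only difference between the two statistics is the denominator, which is why Somers’ D (.93) is roughly twice Kendall’s tau-a (.4654) here: about half of all pairs share the same response and cannot be concordant or discordant. The count of 12,882 also equals 113 × 114, the number of female-male pairs.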

A correlation coefficient can be used in a way similar to its use in the general linear model. Cox and Snell (1989) suggest the following formula for the coefficient of determination:

$$R^2 = 1 - \left(\frac{L_{\text{intercept only}}}{L_{\text{model}}}\right)^{2/n}$$

where $L_{\text{intercept only}}$ is the likelihood for the model with the intercept only, $L_{\text{model}}$ is the likelihood of the particular model, and n is the sample size. Nagelkerke (1991) suggested the adjusted coefficient of

$$\bar{R}^2 = \frac{R^2}{1 - \left(L_{\text{intercept only}}\right)^{2/n}}$$

where $\bar{R}^2$ can have a maximum of one. For this model

$$R^2 = 1 - \left(\frac{e^{-157.34}}{e^{-55.923}}\right)^{2/227} = .5908$$

and

$$\bar{R}^2 = \frac{.5908}{1 - \left(e^{-157.34}\right)^{2/227}} = 0.7877.$$

See Appendix D. This value does not have the traditional meaning of $R^2$; therefore it is used only to compare models.
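The two coefficients can be checked numerically. The Python sketch below works on the log scale, since the likelihoods themselves (for example $e^{-157.342}$) are extremely small; the log-likelihood values are those quoted in the text, and the variable names are illustrative.

```python
import math

n = 227
logl_null = -157.342    # logL, intercept only
logl_model = -55.923    # logL, five-variable model

# Cox and Snell: 1 - (L_null / L_model)^(2/n), evaluated via logs.
r2_cox_snell = 1 - math.exp((2 / n) * (logl_null - logl_model))

# Nagelkerke: divide by the largest value Cox and Snell's R^2 can reach.
r2_max = 1 - math.exp((2 / n) * logl_null)
r2_nagelkerke = r2_cox_snell / r2_max

print(round(r2_cox_snell, 4), round(r2_nagelkerke, 4))
```

The maximum attainable Cox and Snell value is close to 0.75 for these data, which is why the rescaled coefficient is noticeably larger than the unadjusted one.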

The 40 validation values were used to test this model. The accuracy rate was approximately 88%. To put this in context, Holland (1986) and Bass (1995) both agree that the pelvis and skull are the best sections of the skeleton to use to classify gender. However, according to Graw (2001), with five measurements of the skull only 91% of the skulls could be correctly sorted. With these five measurements of the talus the model was able to correctly identify approximately the same proportion.

Of the 40 data values of the validation set, 5 were not correctly classified. These are specimens 1755, 1739, 1111, 1767 and 27. Only one data value, specimen 1111, had a probability close to the cut-off value of 0.50, with it being 0.46. All other values were clearly removed from 0.50. Only one of the misclassified values, specimen 27, was considered an outlier according to the boxplot, with respect to the validating set for the two variables “TL” and “NASW”. See Appendix G. This specimen is also an outlier for “TL” and “NASW” when utilizing the entire data set according to the boxplot. See Appendix H.

One method for detecting influential values is the use of a “matrix plot”. If values seem to be far enough away from the relationship of the two variables then these values may be outliers with respect to one or more variables. These values could cause the model to not be as accurate at predicting as hoped for. The entire data set of 227 values was used in this “matrix plot”. See Appendix I. The possible outliers with respect to the relationship between two explanatory variables are those with specimen numbers 225, 155, 134, 70 and 59. When the “logit” equation was calculated again, after removing the outliers, the resulting equation was

$$\hat{\pi}(Y=1) = \frac{e^{-51.51 + 0.31x_{TL} + 0.33x_{TW} + 0.32x_{TH} + 0.15x_{NASW} + 0.21x_{NASH}}}{1 + e^{-51.51 + 0.31x_{TL} + 0.33x_{TW} + 0.32x_{TH} + 0.15x_{NASW} + 0.21x_{NASH}}}$$

with a logL value of -53.581. See Appendix J. Using this second model the success rate was similar for the validation set. Four of the five misclassified values were still misclassified. Interestingly the change was in the “close” specimen, specimen 1111. This entry was now correctly classified. The new model was different enough to move the probability from 0.46 to 0.56, thereby allowing this white male to be correctly classified as “male”. Possibly the probabilities should be rounded to the nearest tenth. For this specimen a .50 would be correctly classified as “male”. This would correct this situation without affecting the other correctly placed values. Since the influential values are not influential for all five model variables there is not sufficient reason for eliminating them from the data set.

Although this model is adequate, the question arose as to whether the number of variables could be reduced. For simplicity of the model all interactions were excluded. A simple model that fits well has the advantage of model parsimony along with ease of use. To help determine which model was best, understanding that there is really no best model but rather a good model for what is required, the deviance and AIC were used as selection aids. Recall that deviance was used to compare the previous model with the saturated model. It can now be used again to compare models. According to Agresti (2007), the test statistic is the difference between the deviances for the models, where each deviance is $-2(L_m - L_s)$, with $L_m$ being the maximized log likelihood for the simpler model of interest, M, and $L_s$ the maximized log likelihood for the more complex, saturated, model. This value has an approximate chi-square distribution with the degrees of freedom equal to the degrees of freedom for the second model minus the degrees of freedom for the first model. The hypothesis statements are:

$$H_0: \text{all parameters in the more complex model but not in the simpler model equal } 0$$

$$H_a: \text{not } H_0.$$

In the case of model 2 being compared to model 1, which uses all eleven variables, the null is not rejected, meaning that the simpler, restricted model is an appropriate model. See Table 1.2 below. When model 2 was compared with the reduced model 3, which uses only four variables, the null was rejected, showing that the five-variable model, model 2, was a better fit. The variable “NASW” was eliminated to form model 3 because it had the highest p-value when five variables were included in model 2.


Table 1.2 Model Selection for “Sex”

Model  Predictors              Deviance  df   AIC      Percent Correct  Models Compared  Deviance Difference  p-value
1      All eleven variables    109.157   215  133.157  92.5
2      TL+TW+TH+NASW+NASH      111.845   221  123.845  87.5             1-2              2.688 df(6)          .846
3      TL+TW+TH+NASH           116.626   222  126.626  90.0             2-3              4.78 df(1)           .029
4      TL+TW+TH                120.393   223  128.393  87.5             3-4              3.767 df(1)          .052
5      TL+TW+NASH              122.109   223  130.109  87.5             3-5              5.48 df(1)           .019
6      TW+NASH                 142.790   224  148.790  87.5             5-6              20.68 df(1)          <0.000
7      TW                      153.060   206  165.378  85.0             6-7              10.27 df(18)         .920
8      NASH                    231.941   203  262.987  65.0             6-8              89.15 df(21)         <0.000
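The deviance-difference comparisons in Table 1.2 can be reproduced from the tabled deviances. The Python sketch below recomputes the first three comparisons; `chi2_sf` is a small standard-library helper written for this illustration (it is not a SAS or MINITAB routine), and the p-values it returns should match the table only up to rounding.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df (recurrence form)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2:
        tail, k = math.erfc(math.sqrt(half)), 1
    else:
        tail, k = math.exp(-half), 2
    while k < df:
        tail += math.exp((k / 2) * math.log(half) - half - math.lgamma(k / 2 + 1))
        k += 2
    return tail

# (label, deviance of simpler model, deviance of more complex model, df difference)
comparisons = [
    ("1-2", 111.845, 109.157, 6),
    ("2-3", 116.626, 111.845, 1),
    ("3-4", 120.393, 116.626, 1),
]

p_values = {}
for label, dev_simple, dev_complex, df_diff in comparisons:
    difference = dev_simple - dev_complex
    p_values[label] = chi2_sf(difference, df_diff)
    print(label, round(difference, 3), df_diff, round(p_values[label], 3))
```

A large p-value (comparison 1-2) indicates the dropped terms are not needed, while a small one (comparison 2-3) indicates the simpler model fits noticeably worse.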

The Akaike information criterion (AIC) is a tool that can aid in the selection of the model. The AIC penalizes for adding predictors and therefore rewards model parsimony (Kutner et al., 2005). According to Agresti (2007), the AIC judges a model by how close its fitted values tend to be to the true expected values; the preferred model is the one that minimizes

AIC = -2(log likelihood – p)

where p is the number of parameters in the model.

The AIC is similar to the Mallows' $C_p$ that is used in linear regression. As can be observed from Table 1.2, the accuracy ranged from approximately 65 to 90 percent for the validating set, depending on the model. The AIC value is the lowest for the five-variable model, the same model that was selected using SAS and selected based on the deviance difference.
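The AIC values in Table 1.2 can be recovered from the tabled deviances, since for ungrouped binary data the saturated log likelihood is 0 and the deviance equals -2 log likelihood. The Python sketch below does this for the first three models; the parameter counts (slopes plus intercept) and the dictionary labels are written out here for illustration.

```python
def aic(deviance, n_parameters):
    """AIC = -2(log likelihood - p), using deviance = -2 log likelihood."""
    return deviance + 2 * n_parameters

# (deviance, number of parameters including the intercept)
models = {
    "1: all eleven variables": (109.157, 12),
    "2: TL+TW+TH+NASW+NASH": (111.845, 6),
    "3: TL+TW+TH+NASH": (116.626, 5),
}

aic_values = {name: aic(dev, p) for name, (dev, p) in models.items()}
best = min(aic_values, key=aic_values.get)
for name, value in aic_values.items():
    print(name, round(value, 3))
print("lowest AIC:", best)
```

Model 1 has the smallest deviance of the three but pays a penalty of 24 for its twelve parameters, so the five-variable model ends up with the lowest AIC.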

An interpretation of each slope in the model is that the further the estimated odds, $\hat{\pi}/(1-\hat{\pi})$, or odds ratio, is from 1, the stronger the association is between the explanatory variable and the response variable. In this selected model all of the odds ratios are greater than one. The confidence interval for the odds ratio is calculated by using the confidence interval for the slope. In each case the 95% confidence interval for the odds ratio does not contain the value of one. See Appendix D. For multiple logistic regression the estimated odds ratio assumes that all other variables in the model are held constant. For example, with all other variables held constant, the odds of the response being male are increased on average by 25% for every unit increase in “TL”, where a unit in this situation is one mm. Since all of the estimated odds ratios are greater than one, the probability of being “male” increases with larger measurement values. If the odds ratio is equal to one the variable is not considered to be essential, or in other words X and Y are independent. The odds ratio value is the multiplicative change in the odds in favor of success (“male” in this case) for an individual with a unit increase in one variable while the other four are held constant. This is calculated by observing that if the one variable increased by a unit of 1 the formulas would be

$$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_{TL} + \beta_2 x_{TW} + \beta_3 x_{TH} + \beta_4 x_{NASW} + \beta_5 x_{NASH}$$

for the original formula and

$$\ln\left(\frac{p'}{1-p'}\right) = \beta_0 + \beta_1 (x_{TL} + 1) + \beta_2 x_{TW} + \beta_3 x_{TH} + \beta_4 x_{NASW} + \beta_5 x_{NASH}$$

for the increased formula when the first variable is increased by one, where $p'$ denotes the probability after the increase. If we subtract the original from the increased it yields

$$\ln\left(\frac{p'}{1-p'}\right) - \ln\left(\frac{p}{1-p}\right) = \beta_1$$

$$\ln\left(\frac{p'/(1-p')}{p/(1-p)}\right) = \beta_1$$

Therefore the ratio of the increased odds to the original odds is

$$\frac{p'/(1-p')}{p/(1-p)} = e^{\beta_1}.$$

Race

Descriptive statistics were obtained for each of the eleven variables along with the boxplots for each variable grouped by race using Minitab 15. See Appendix K. There are strong outliers for the variables “NH”, “NASW”, and “NL”. These values were kept in the data set at this stage because there was no justification for removing an entire value. Using this information, t-tests were performed for each variable with the null of no difference between the categories white and black.

Table 2.1 Descriptive Statistics and Analysis for “Race”

Variable  Mean White  SD White  Mean Black  SD Black  t-test statistic  p-value
TL        57.28       5.19      57.48       4.84      -0.30             0.765
TW        41.44       4.30      43.26       4.07      -3.27             0.001
TH        31.29       3.45      30.78       2.42      1.30              0.196
NL        17.03       3.40      17.44       4.72      -0.74             0.458
NH        17.32       13.54     22.57       2.95      -4.13             <0.001
TRW       34.68       4.42      31.63       2.94      6.17              <0.001
TRL       35.16       3.64      33.95       3.78      2.45              0.015
CASW      30.92       3.61      32.81       2.58      -4.57             <0.001
CASL      20.96       2.91      21.57       2.64      -1.67             0.097
NASW      28.93       4.75      31.76       3.24      -5.30             <0.001
NASH      26.95       3.90      22.68       1.87      10.67             <0.001

As one can see, the p-values appear to be statistically significant for all variables except for “TL”, “TH”, “NL” and “CASL”. See Table 2.1.

Binary logistic regression was implemented using a chi-square test for stepwise variable selection using SAS Enterprise. When all eleven variables were entered into the SAS stepwise selection procedure in logistic analysis with a 0.15 level to enter and a 0.10 level to exit, seven variables were selected for a model for the response of race. The variables were: “TL”, “TRW”, “NH”, “TRL”, “CASW”, “NASW”, “NASH”. See Appendix L.


The selected model is:

 ˆ  ln   5.87.403xTL .564xTRW .0825xNH .277xTRL 1.384xCASW .225xNASW 1.736xNASH 1ˆ 

or

e5.87 .40xTL  .56 x TRW  .08 x NH  .28 x TRL  1.38 x CASW  .23 x NASW  1.74 x NASH ˆ  1 e5.87 .40xTL  .56 x TRW  .08 x NH  .28 x TRL  1.38 x CASW  .23 x NASW  1.74 x NASH

Both of these models are used to predict the event of “white”. Note that SAS models for success Y=0, the event “white”, whereas MINITAB models for Y=1 or “black”. The estimates will be the same but with opposite signs on the coefficients. The model chosen above was selected from the SAS output; therefore it predicts the event of “white”.

The probability that $\hat{Y}=0$ (the prediction is “white”) given that $Y=0$ is the sensitivity. With this model it was found to be $P(\hat{Y}=0 \mid Y=0) = 20/20$ or 100% using a validating set of 20 white individuals that were not utilized in the selection of the model. The specificity, $P(\hat{Y}=1 \mid Y=1)$, was $17/20$ or 0.85 for black individuals from the validating set. Overall 37 of the 40 individuals, or 93%, were placed in the right category. When the independent variables yield a $\hat{\pi}$, the probability of success, greater than or equal to 0.50 these individuals were assigned “white”, and where $\hat{\pi}$ is less than 0.50 these individuals were assigned “black”. No values fell exactly at 50%.

The test for the model, $H_0: \beta_i = 0$ for all i versus $H_a: \beta_i \neq 0$ for some i, has a log-likelihood test statistic G of 258.73 with an associated p-value less than 0.001. This signifies that the model has significant coefficients. See Appendix L. The deviance is 55.425 with 219 degrees of freedom. The associated p-value is approximately one, signifying again that the model does not have a lack of fit.
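The reported G and deviance can be cross-checked against the sample composition. For an intercept-only model the fitted probability is simply the sample proportion, so its log likelihood can be computed from the group counts alone. The Python sketch below assumes the model-building set contains 59 + 60 = 119 white and 54 + 54 = 108 black individuals, as described in Materials and Methods; the variable names are illustrative.

```python
import math

n_white = 59 + 60    # white females + white males
n_black = 54 + 54    # black females + black males
n = n_white + n_black

# Intercept-only log likelihood: every individual gets the sample proportion.
logl_null = n_white * math.log(n_white / n) + n_black * math.log(n_black / n)
neg2_logl_null = -2 * logl_null

deviance_model = 55.425              # seven-variable "race" model
G = neg2_logl_null - deviance_model  # likelihood ratio statistic

print(n, round(neg2_logl_null, 2), round(G, 2))
```

The value obtained this way agrees with the reported G of 258.73 to within rounding, which suggests the reported statistics are internally consistent.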

Kendall’s Tau-a is 0.49, which is actually higher than the value of .47 for the “sex” model. The Somers’ D value is .98, suggesting that the relationship is both strong and positive. The Cox and Snell coefficient of determination is $R^2 = .68$ whereas the Nagelkerke adjusted coefficient is $\bar{R}^2 = .91$. See Appendix L.

Unlike the “sex” model, the odds ratio estimates are not all greater than one, i.e., the coefficients are not all positive. Nonetheless, none of the 95% confidence intervals for the odds ratios includes the value of one. See Appendix M. Each variable's odds ratio value is the odds in favor of success (“white” in this case) for an individual with a one millimeter increase in the variable while the other six are held constant.

In the present situation the odds ratio in favor of an individual being white with a unit increase in “TL”, with all other variables held constant, is $e^{-0.403}$ or .67. For each unit increase in “TL”, the odds of “white” are multiplied by .67; that is, the odds decrease by 33%.

The goal of modeling for “race” was to select a model that was as simple as possible, and that was done. The seven variables chosen from the eleven seem to perform sufficiently. As has been addressed earlier, the variables may have interactions amongst themselves. The most reasonable solution was to examine two-way interactions among those variables thought to be most related. From an anatomical point of view the following interactions were considered to be the most likely. Three-way interactions were not added because it was felt that this would create a model too complicated to be practical, since all accompanying two-way terms would need to be included as well. The table below summarizes the results. See Appendix N.

Table 2.2 Model Selection for “Race”

| Model | Predictors | Deviance | df | AIC | Percent Correct | Models Compared | Deviance Difference | p-value |
|---|---|---|---|---|---|---|---|---|
| 1 | All eleven variables | 53.34 | 215 | 75.34 | 90 | | | |
| 2 | TL+TRW+NH+TRL+CASW+NASW+NASH | 55.43 | 219 | 71.43 | 93 | 1-2 | 2.09 (df 4) | 0.719 |
| 3 | TL+TRW+NH+TRL+CASW+NASW+NASH+TL*TRW | 53.42 | 218 | 71.41 | 88 | 2-3 | 2.01 (df 1) | 0.156 |
| 4 | TL+TRW+NH+TRL+CASW+NASW+NASH+NASW*NASH | 47.95 | 218 | 65.95 | 90 | 2-4 | 7.48 (df 1) | 0.006 |
| 5 | TL+TRW+NH+TRL+CASW+NASW+NASH+NASW*NASH+TL*TRW | 47.67 | 217 | 67.56 | 90 | 2-5 | 7.76 (df 2) | .021 |
| 6 | TL+TRW+NH+TRL+CASW+NASW+TRW*CASW+NASH+NH*NASH | 28.91 | 217 | 48.91 | 95 | 2-6 | 26.52 (df 2) | <0.000 |
| 7 | TL+TRW+NH+TRL+CASW+NASW+NASH+TRW*CASW | 44.12 | 218 | 62.12 | 95 | 2-7 | 11.31 (df 1) | <0.000 |
| | | | | | | 6-7 | 15.21 (df 1) | <0.000 |
| 8 | TL+TRW+NH+TRL+CASW+NASW+NASH+NH*NASH | 34.79 | 218 | 52.79 | 95 | 2-8 | 20.64 (df 1) | <0.000 |
| | | | | | | 6-8 | 5.88 (df 1) | .015 |

According to the Akaike information criterion (AIC), the model of choice would be model number 6 because it has the lowest value.
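The AIC column of Table 2.2 can be related to the deviance column. The sketch below assumes that, for these ungrouped binary data, the deviance equals -2 log L, so that AIC is the deviance plus twice the number of estimated parameters; the function name and the subset of models shown are illustrative choices.

```python
N_OBS = 227  # observations in the model-building set

def aic_from_deviance(deviance, residual_df, n_obs=N_OBS):
    """AIC = deviance + 2 * (number of estimated parameters)."""
    n_params = n_obs - residual_df  # intercept plus slopes
    return deviance + 2 * n_params

# (deviance, residual df) as listed in Table 2.2
models = {2: (55.43, 219), 4: (47.95, 218), 6: (28.91, 217),
          7: (44.12, 218), 8: (34.79, 218)}
aic = {m: aic_from_deviance(dev, df) for m, (dev, df) in models.items()}
best = min(aic, key=aic.get)  # model with the smallest AIC
print({m: round(v, 2) for m, v in aic.items()}, best)
```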

When the null hypothesis that the simpler model is adequate was tested against the alternative that the more complex model fits better, the null was rejected for model 6 compared with model 2. Possibly this model with two interaction terms could be simplified. Comparing model 6 with model 7 shows that the “NH*NASH” term is significant. A similar comparison between models 6 and 8 shows that model 6 is preferable to model 8.
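These nested-model comparisons refer the difference in deviance to a chi-square distribution whose degrees of freedom equal the number of added terms. A minimal sketch follows, using closed-form chi-square tail probabilities that hold only for 1 or 2 degrees of freedom; the function names are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability for df = 1 or 2 (closed forms)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 are handled in this sketch")

def deviance_test(dev_simple, df_simple, dev_complex, df_complex):
    """Return (deviance difference, df difference, p-value)."""
    diff = dev_simple - dev_complex
    ddf = df_simple - df_complex
    return diff, ddf, chi2_sf(diff, ddf)

d26, k26, p26 = deviance_test(55.43, 219, 28.91, 217)  # model 2 vs model 6
d86, k86, p86 = deviance_test(34.79, 218, 28.91, 217)  # model 8 vs model 6
print(round(d26, 2), k26, p26 < 0.001)
print(round(d86, 2), k86, round(p86, 3))
```

A small p-value favors the more complex model, which is the reading applied to the "Models Compared" columns of Table 2.2.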

Once model 6 was selected, it was noticed that not all of the individual terms were significant when the interactions were included. The variables “TL”, “TRW”, “TRL” and “NASH” are not significant. The variables “TRW” and “NASH” should remain in the model since they are included in the interaction terms. Further refinement was required.

Table 2.3 Model Refinement for “Race”

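| Model | Predictors | $R^2$ (Cox and Snell) | $R^2$ (Nagelkerke) | Deviance | df | AIC | Percent Correct | Models Compared | Deviance Difference | p-value |
|---|---|---|---|---|---|---|---|---|---|---|
| 6 | TL+TRW+NH+TRL+CASW+NASW+TRW*CASW+NASH+NH*NASH | .72 | .96 | 28.91 | 217 | 48.91 | 95 | | | |
| 6b | TRW+NH+TRL+CASW+NASW+TRW*CASW+NASH+NH*NASH | .71 | .95 | 31.91 | 218 | 49.91 | 95 | 6-6b | 3 (df 1) | .08 |
| 6c | TRW+NH+CASW+NASW+TRW*CASW+NASH+NH*NASH | .71 | .95 | 31.95 | 219 | 47.95 | 95 | 6b-6c | .04 (df 1) | .84 |
| | | | | | | | | 6-6c | 3.04 (df 2) | .22 |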

According to the results shown above, the most preferred model is model 6c. In each case when model 6c is compared to models 6 and 6b the null is not rejected, meaning that the simpler model is adequate. The calculated average accuracy rates for models 6, 6b and 6c are all 95%. The Nagelkerke coefficient of determination is .95, signifying 95% of the variation is explained by the model. The AIC is smallest for model 6c as well. The model is

$$\ln\left(\frac{\hat{\pi}}{1-\hat{\pi}}\right) = 174.03 - 2.34x_{TRW} - 4.68x_{NH} - 4.08x_{CASW} - .33x_{NASW} + .09x_{TRW*CASW} - 1.95x_{NASH} + .17x_{NH}x_{NASH}$$

or

$$\hat{\pi} = \frac{e^{174.03 - 2.34x_{TRW} - 4.68x_{NH} - 4.08x_{CASW} - .33x_{NASW} + .09x_{TRW*CASW} - 1.95x_{NASH} + .17x_{NH}x_{NASH}}}{1 + e^{174.03 - 2.34x_{TRW} - 4.68x_{NH} - 4.08x_{CASW} - .33x_{NASW} + .09x_{TRW*CASW} - 1.95x_{NASH} + .17x_{NH}x_{NASH}}}.$$

Again, this is a model for the event “white”. See Appendix N.
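As an illustration of how model 6c might be applied to a single talus, the sketch below evaluates the fitted logit with the unrounded estimates from Appendix N. It assumes that the MINITAB output treats race coded 1 as the event, so the logit is negated to obtain the event “white”; that reading of the coding, the function name, and the use of the group means from Table 3.1 as example inputs are assumptions made for illustration only.

```python
import math

# Model 6c estimates from Appendix N (MINITAB; event assumed to be race = 1)
COEF_6C = {
    "const": -174.030, "TRW": 2.33923, "NH": 4.67721, "CASW": 4.07708,
    "NASW": 0.330359, "TRW*CASW": -0.0937474, "NASH": 1.95380,
    "NH*NASH": -0.173871,
}

def prob_white(trw, nh, casw, nasw, nash, coef=COEF_6C):
    """Estimated probability of the event "white" under model 6c."""
    eta_event1 = (coef["const"] + coef["TRW"] * trw + coef["NH"] * nh
                  + coef["CASW"] * casw + coef["NASW"] * nasw
                  + coef["TRW*CASW"] * trw * casw + coef["NASH"] * nash
                  + coef["NH*NASH"] * nh * nash)
    eta_white = -eta_event1  # flip the sign to model "white" instead
    return 1.0 / (1.0 + math.exp(-eta_white))

# Example inputs: white male and black male group means (mm) from Table 3.1
p_wm = prob_white(trw=36.74, nh=18.52, casw=33.38, nasw=31.42, nash=29.46)
p_bm = prob_white(trw=33.61, nh=23.52, casw=34.14, nasw=33.54, nash=23.53)
print(p_wm > 0.5, p_bm > 0.5)
```

The unrounded estimates are used because the interaction terms multiply two measurements, so rounding a coefficient to two decimals can shift the logit by several units.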

Sex with Race

A model that could predict both the race and the sex of an individual at the same time would be helpful to the forensic scientist in those instances when both are unknown. The response categories are: white female (wf), white male (wm), black female (bf) and black male (bm). In this study only the races of Black and White are considered.

The descriptive statistics were completed earlier with the previous two models. Tests for the differences of the means were performed for each of the eleven variables. The hypotheses below apply to each individual test.

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$

$H_A$: not all means are equal

In each instance the hypothesis that all means are equal across the four categories was rejected based on the low p-value and the high F test statistic. See Table 3.1.

Table 3.1 Summary of ANOVA for “Sex” with “Race”

| Variable | wf mean | SD wf | wm mean | SD wm | bf mean | SD bf | bm mean | SD bm | F test statistic | p-value |
|---|---|---|---|---|---|---|---|---|---|---|
| TL | 53.43 | 3.32 | 61.06 | 3.72 | 54.11 | 3.43 | 60.84 | 3.52 | 80.53 | 0.000 |
| TW | 38.18 | 3.01 | 44.65 | 2.63 | 40.53 | 3.08 | 45.99 | 2.97 | 87.43 | 0.000 |
| TH | 28.93 | 2.68 | 33.61 | 2.38 | 29.11 | 1.93 | 32.44 | 1.57 | 66.71 | 0.000 |
| NL | 15.36 | 3.32 | 18.67 | 2.60 | 16.76 | 6.06 | 18.11 | 2.69 | 8.48 | 0.000 |
| NH | 16.09 | 18.71 | 18.52 | 4.41 | 21.62 | 1.94 | 23.52 | 3.47 | 6.12 | 0.001 |
| TRW | 32.58 | 4.70 | 36.74 | 2.95 | 29.65 | 1.98 | 33.61 | 2.36 | 47.42 | 0.000 |
| TRL | 32.69 | 2.84 | 37.58 | 2.54 | 32.15 | 3.90 | 35.75 | 2.64 | 41.99 | 0.000 |
| CASW | 28.41 | 2.98 | 33.38 | 2.22 | 31.48 | 2.30 | 34.14 | 2.12 | 63.45 | 0.000 |
| CASL | 18.93 | 2.56 | 22.95 | 1.55 | 20.47 | 2.39 | 22.67 | 2.44 | 41.22 | 0.000 |
| NASW | 26.39 | 2.67 | 31.42 | 5.04 | 29.99 | 2.72 | 33.54 | 2.71 | 42.77 | 0.000 |
| NASH | 24.39 | 2.85 | 29.46 | 3.07 | 21.83 | 1.57 | 23.53 | 1.76 | 105.36 | 0.000 |
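The F statistics in Table 3.1 can be approximated from group summaries alone. The sketch below does this for talar length, taking the group sizes (54, 54, 59, 60), means and standard deviations from the MINITAB output in Appendix O; the function name is illustrative, and small differences from the reported value are expected because the summaries are rounded.

```python
def anova_f(groups):
    """One-way ANOVA F statistic from (n, mean, sd) summaries per group."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * s ** 2 for n, _, s in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Talar length: (n, mean, sd) for bf, bm, wf, wm
tl_groups = [(54, 54.110, 3.431), (54, 60.842, 3.516),
             (59, 53.429, 3.316), (60, 61.060, 3.715)]
f_tl = anova_f(tl_groups)
print(round(f_tl, 1))
```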

For each variable the simultaneous Tukey pairwise confidence intervals were calculated for all six combinations. See Appendix O. As can be seen, not all variables had differences for every response but the results were promising.

Models were chosen using the SAS stepwise selection procedure for a nominal multicategory response variable, with a 0.15 significance level to enter and a 0.10 level to exit. See Appendix P.

Seven of the original eleven variables were chosen for the models. See Appendix P. The multicategory logit option selects models for all of the equations simultaneously. If it were not so, the standard error for the coefficients would be larger.

To test whether any of the seven variables was significant in any of the models in predicting the four categories, a chi-square test was used. The G-test statistic is 467.84 with an accompanying chi-square p-value of less than .0001. See Appendix Q. The conclusion is that there is sufficient evidence to reject the null hypothesis and to indicate that at least one of these variables is a good predictor for distinguishing between the four categories.

The models are given below, where $\hat{\pi}_0$ is the estimated probability of white female, $\hat{\pi}_1$ the estimated probability of black female, $\hat{\pi}_2$ the estimated probability of black male, and $\hat{\pi}_3$ the estimated probability of white male. The reference event is white male. The estimates of the models are:

4.58+.24 +.06 +.19 -.45 +.78 +.24 -1.75

45.77-.09 -.25 +.17 -.54 +.80 +.05 -1.72

62.49-.25 -.45 +.11 -.05 -.29 -.48 -.25

For each additional unit increase in, say, talar length (TL), the odds of a black male classification over a white male classification are multiplied by $e$ raised to the TL coefficient in the black male equation. Similar calculations could be done for each equation and each variable.

If an equation is needed to compare “white female” with “black female”, it can be calculated by realizing that $\ln(\hat{\pi}_0/\hat{\pi}_1) = \ln(\hat{\pi}_0/\hat{\pi}_3) - \ln(\hat{\pi}_1/\hat{\pi}_3)$, which is

(62.49-.25 -.45 +.11 -.05 -.29 -.48 -.25 )-(

45.77-.09 -.25 +.17 -.54 +.80 +.05 -1.72 ).

Then,

-.20 -.06 +.50 -1.09 -.52 +1.47

is the estimated logit model for comparing white females to black females. Likewise, for every unit increase in the measurement of the variable labeled NASH, the odds of a white female classification over a black female classification are multiplied by $e^{1.47}$, or 4.35.
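The subtraction can be sketched numerically. The lists below hold the printed coefficients by position only, because the variable labels are not legible in the equations as reproduced here; it is assumed that the equation beginning 62.49 is the white female logit, that the equation beginning 45.77 is the black female logit, and that the last position is NASH, as the 4.35 figure in the text implies.

```python
import math

# Printed coefficients (intercept first), each relative to the reference "white male"
wf_vs_wm = [62.49, -.25, -.45, .11, -.05, -.29, -.48, -.25]
bf_vs_wm = [45.77, -.09, -.25, .17, -.54, .80, .05, -1.72]

# Difference of the two logits gives the white female vs black female logit
wf_vs_bf = [round(a - b, 2) for a, b in zip(wf_vs_wm, bf_vs_wm)]

# Odds multiplier for a one-unit increase in the last variable (assumed NASH)
nash_odds = math.exp(wf_vs_bf[-1])
print(wf_vs_bf, round(nash_odds, 2))
```

Some differences computed this way depart from the printed equation in the second decimal, which is consistent with the printed coefficients having been rounded before subtraction.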

The remaining two equations are given below.

-.51 -.08 -1.08 -.72 +1.50

-.31 -.02 -.10 +.01 -.20 +.03

Let

A = 4.58+.24 +.06 +.19 -.45 +.78 +.24 -1.75

B = 45.77-.09 -.25 +.17 -.54 +.80 +.05 -1.72

C = 62.49-.25 -.45 +.11 -.05 -.29 -.48 -.25

-.51 -.08 -1.08 -.72 +1.50

-.31 -.02 -.10 +.01 -.20 +.03

-.20 -.06 +.50 -1.09 -.52 +1.47

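Solving these equations simultaneously, the following equations for the predicted probability of each event were obtained:

$$\hat{\pi}_1 = \frac{e^{B}}{1 + e^{A} + e^{B} + e^{C}}$$

$$\hat{\pi}_2 = \frac{e^{A}}{1 + e^{A} + e^{B} + e^{C}}$$

$$\hat{\pi}_0 = \frac{e^{C}}{1 + e^{A} + e^{B} + e^{C}}$$

$$\hat{\pi}_3 = \frac{1}{1 + e^{A} + e^{B} + e^{C}}.$$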

These equations were then used on the forty data values that were reserved as a validating set. The probability of each event was calculated for each data entry for all four equations. The individuals were then classified by the equation that had the highest probability.
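The classification rule described above can be sketched as follows. The inputs `a`, `b` and `c` stand for the values of the three logits A, B and C evaluated for one individual; following the probability equations, A is taken to belong to black male, B to black female and C to white female, with white male as the reference. The function names and the example logit values are hypothetical.

```python
import math

def category_probs(a, b, c):
    """Predicted probabilities from logits A, B, C (reference: white male)."""
    denom = 1.0 + math.exp(a) + math.exp(b) + math.exp(c)
    return {
        "bm": math.exp(a) / denom,  # black male
        "bf": math.exp(b) / denom,  # black female
        "wf": math.exp(c) / denom,  # white female
        "wm": 1.0 / denom,          # white male (reference)
    }

def classify(a, b, c):
    """Assign the category with the highest predicted probability."""
    probs = category_probs(a, b, c)
    return max(probs, key=probs.get)

example = category_probs(-2.0, -1.0, 1.5)  # made-up logit values
print(classify(-2.0, -1.0, 1.5), round(sum(example.values()), 6))
```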

Of the 10 individuals who were categorized as “white females”, the equations correctly classified 7. Two of the individuals were incorrectly classified as “white males” and 1 was classified as a “black female”. Of the 10 who were cataloged as “black female”, the process correctly grouped 8. The two that were misclassified were labeled “white female” instead of “black female”. The group “black male” was correctly sorted 7 of the 10 times, with two mistakes being grouped as “black female” and 1 as “white male”. The highest success rate was seen in classifying “white male”: every individual was correctly classified. This is an overall success rate of 80%. See Table 3.2.

Table 3.2 Summary of Placement for Validating Set

| Category | Total Count | Number Correct | Misplaced Values | Percentage Correct |
|---|---|---|---|---|
| white female (wf) | 10 | 7 | 1 bf, 2 wm | 70 |
| black male (bm) | 10 | 7 | 2 bf, 1 wm | 70 |
| white male (wm) | 10 | 10 | | 100 |
| black female (bf) | 10 | 8 | 2 wf | 80 |
| Total | 40 | 32 | 8 | 80 |

If the original 227 values are classified according to the four equations that were selected, the following placement results. This again is based on the category with the highest probability of the four. See table below.

Table 3.3 Summary of Placement for Large Set

| Category | Total Count | Number Correct | Misplaced Values | Percent Correct |
|---|---|---|---|---|
| white female (wf) | 59 | 51 | 4 bf, 4 wm | 86 |
| black male (bm) | 54 | 41 | 12 bf, 1 wm | 76 |
| white male (wm) | 60 | 57 | 1 bf, 1 bm, 1 wf | 95 |
| black female (bf) | 54 | 51 | 3 bm | 94 |
| Total | 227 | 200 | 27 | 88 |
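The percentages in Tables 3.2 and 3.3 follow directly from the counts. A short sketch, with hypothetical helper names, that recomputes the per-category and overall rates:

```python
def percent_correct(correct, total):
    """Percentage of individuals placed in their true category."""
    return 100.0 * correct / total

def summarize(table):
    """Per-category and overall percent correct, rounded to whole percents."""
    per_group = {k: round(percent_correct(c, t)) for k, (c, t) in table.items()}
    n_correct = sum(c for c, _ in table.values())
    n_total = sum(t for _, t in table.values())
    return per_group, round(percent_correct(n_correct, n_total))

# (number correct, total count) per category
validating = {"wf": (7, 10), "bm": (7, 10), "wm": (10, 10), "bf": (8, 10)}
large = {"wf": (51, 59), "bm": (41, 54), "wm": (57, 60), "bf": (51, 54)}

print(summarize(validating))
print(summarize(large))
```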

These values are most likely higher than one would see from the validating set since this is the same set used to generate the equations.

When various interaction terms were added to the sex/race models either the terms themselves or single terms were not significant. For this reason no interaction terms were added.

Suppose either the sex or the race of an individual was known and the other needed to be determined. An interesting question is whether it is better to use the prediction equation for race or sex alone or the prediction equations for the combination of sex and race. Which approach leads to the highest number of correct placements? The accuracy rates for the individual race and sex models, using the validating set, are given in the respective sections. To answer this question the set used is the total set of 267 values, which includes the validating set of 40 and the larger set of 227. See table below.

Table 3.4 Comparison of models

| Model | Predicted | Percentage correctly placed (total set of 267 values) |
|---|---|---|
| sex/race model | sex and race | 87 |
| sex/race model given gender is known | race | 95 |
| sex/race model given race is known | sex | 92 |
| race model followed by sex model | sex and race | 73 |
| race model for race alone | race | 91 |
| sex model for sex alone | sex | 89 |

If both the race and sex are not known, a greater accuracy rate results from using the combination “sex/race” model. If the race of the individual is known, the “sex/race” model is better at predicting the sex than the “sex” model. The “sex/race” model is also superior to the “race” model at predicting race. The only advantage to the individual “race” and “sex” models is simplicity.

CONCLUSION

The results of this study indicate that sex, race and sex/race determination can be accomplished using measurements of the talus with accuracy comparable to techniques based on other skeletal elements. The talus is a robust bone that is likely to be well-preserved and recovered.

The talus has now been shown to be a good indicator of sex, race and the four categories of sex with race. The successful placement of individuals in the category of sex was 93% using the model selected. The percentage of those correctly classified using the race model with interactions was 95%. The more complicated situation was one that attempted to classify an individual by both sex and race. Four equations were used, with an overall success rate of 80% in the validating set and 88% in the larger set of 227 values. The success rates for the four individual groups ranged from 76% to 95%.

These methods classify individuals based on different subsets of 8 of the original 11 measurements made on the talus. Classification is thought to be based primarily on size. The talus is a weight-bearing bone; therefore its size and shape will be affected by the size and weight of the body it supports. Some individuals may be misclassified by a model because they have talus measurements that are either larger or smaller than those of other members of the same group. This, however, is true for the most popular cranial and pelvic techniques as well. Even with complete skeletons some individuals will remain ambiguous.

This paper dealt only with the categories of male and female among the populations of Black and White. In this research no data for Hispanics, Asians or American Indians are given. It should be understood that the predictive ability would be less if additional groups were included. Nevertheless, these models compare favorably with prior models: fewer bones were used with the same or better results. Thus, in the absence of preferred remains or identification, the talus can provide evidence that should result in a good probability of correct classification for race, sex and sex with race.

Figure 1 Bones of the Human Foot

(http:// www.britannia.com/EBchecked/topicart/547358)

Figure 2 Superior View of the Right Talus

Figure 3 Medial View of the Right Talus

Figure 4 Anterior View of the Right Talus

Anterior view right talus showing the measurements of Navicular Articular Surface Height (8), Navicular Articular Surface Width (9).

Figure 5 Lateral View of the Right Talus

Lateral view of right talus showing measurements of Talar Height (10), and Neck Height (11)

Figure 6 Inferior View of Talus

Inferior view right talus showing measurements of Calcaneus Articular Surface Length (6), Calcaneus Articular Surface Width (7)

Figure 7 Superior View of Talus showing Neck Length

Superior view right talus showing measurements of Neck Length (3), Trochlear Length (4), Trochlear Width (5)

Appendix A: Box Plot for “Sex”, All Eleven Variables

[Box plots, one panel per variable (talarlength, talarwidth, talarheight, necklength, neckheight, trochleawidth, trochlealength, caswidth, caslength, naswidth, nasheight), grouped by gender (0 = female, 1 = male).]

Appendix B: Correlations

Cell contents: Pearson correlation (p-value)

| | talarlength | talarwidth | talarheight | necklength |
|---|---|---|---|---|
| talarwidth | 0.764 (0.000) | | | |
| talarheight | 0.790 (0.000) | 0.747 (0.000) | | |
| necklength | 0.391 (0.000) | 0.309 (0.000) | 0.379 (0.000) | |
| neckheight | 0.137 (0.031) | 0.194 (0.002) | 0.133 (0.035) | 0.094 (0.138) |
| trochleawidth | 0.585 (0.000) | 0.505 (0.000) | 0.654 (0.000) | 0.261 (0.000) |
| trochlealength | 0.665 (0.000) | 0.634 (0.000) | 0.671 (0.000) | 0.267 (0.000) |
| caswidth | 0.637 (0.000) | 0.734 (0.000) | 0.616 (0.000) | 0.385 (0.000) |
| caslength | 0.676 (0.000) | 0.679 (0.000) | 0.641 (0.000) | 0.303 (0.000) |
| naswidth | 0.572 (0.000) | 0.633 (0.000) | 0.528 (0.000) | 0.239 (0.000) |
| nasheight | 0.526 (0.000) | 0.399 (0.000) | 0.567 (0.000) | 0.237 (0.000) |

| | neckheight | trochleawidth | trochlealength | caswidth |
|---|---|---|---|---|
| trochleawidth | -0.001 (0.985) | | | |
| trochlealength | 0.111 (0.078) | 0.583 (0.000) | | |
| caswidth | 0.197 (0.002) | 0.403 (0.000) | 0.552 (0.000) | |
| caslength | 0.155 (0.014) | 0.435 (0.000) | 0.524 (0.000) | 0.610 (0.000) |
| naswidth | 0.210 (0.001) | 0.325 (0.000) | 0.445 (0.000) | 0.600 (0.000) |
| nasheight | -0.014 (0.820) | 0.551 (0.000) | 0.573 (0.000) | 0.331 (0.000) |

| | caslength | naswidth |
|---|---|---|
| naswidth | 0.525 (0.000) | |
| nasheight | 0.367 (0.000) | 0.240 (0.000) |

Appendix C: Matrix Plot for Eleven Variables

Matrix Plot of TL, TW, TH, NL, NH, TRW, TRL, CASW, CASL, NASW, NASH

[Matrix plot panels for TL, TW, TH, NL, NH, TRW, TRL, CASW, CASL, NASW and NASH, with points marked by gender (0 = female, 1 = male).]

Appendix D: Model Selection for “Sex”

The LOGISTIC Procedure

Response Variable gender

Number of Response Levels 2

Model binary logit

Optimization Technique Fisher's scoring

Number of Observations Read 227

Number of Observations Used 227

Response Profile

| Ordered Value | gender | Total Frequency |
|---|---|---|
| 1 | 0 | 113 |
| 2 | 1 | 114 |

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

-2 Log L = 314.684

R-Square 0.5908 Max-rescaled R-Square 0.7877

Analysis of Maximum Likelihood Estimates

| Parameter | DF | Estimate | Standard Error | Wald Chi-Square | Pr > ChiSq |
|---|---|---|---|---|---|
| Intercept | 1 | 50.6156 | 7.1038 | 50.7673 | <.0001 |
| tlength | 1 | -0.2214 | 0.0926 | 5.7152 | 0.0168 |
| twidth | 1 | -0.3127 | 0.1055 | 8.7858 | 0.0030 |
| theight | 1 | -0.4289 | 0.1869 | 5.2680 | 0.0217 |
| nasw | 1 | -0.2064 | 0.0950 | 4.7151 | 0.0299 |
| nash | 1 | -0.1993 | 0.0887 | 5.0498 | 0.0246 |

Odds Ratio Estimates

| Effect | Point Estimate | 95% Wald Lower Limit | 95% Wald Upper Limit |
|---|---|---|---|
| tlength | 0.801 | 0.668 | 0.961 |
| twidth | 0.731 | 0.595 | 0.899 |
| theight | 0.651 | 0.451 | 0.939 |
| nasw | 0.814 | 0.675 | 0.980 |
| nash | 0.819 | 0.689 | 0.975 |

Association of Predicted Probabilities and Observed Responses

Percent Concordant 96.3 Somers' D 0.927

Percent Discordant 3.6 Gamma 0.927

Percent Tied 0.1 Tau-a 0.465

Pairs 12882 c 0.963

Appendix E: MINITAB Five Variable Model Selection for “Sex”

Link Function: Logit

Response Information

| Variable | Value | Count |
|---|---|---|
| gender 0-female | 1 | 114 (Event) |
| | 0 | 113 |
| | Total | 227 |

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -50.6158 | 7.10382 | -7.13 | 0.000 | | | |
| talarlength | 0.221434 | 0.0926254 | 2.39 | 0.017 | 1.25 | 1.04 | 1.50 |
| talarwidth | 0.312681 | 0.105489 | 2.96 | 0.003 | 1.37 | 1.11 | 1.68 |
| talarheight | 0.428923 | 0.186876 | 2.30 | 0.022 | 1.54 | 1.06 | 2.21 |
| nasheight | 0.199346 | 0.0887087 | 2.25 | 0.025 | 1.22 | 1.03 | 1.45 |
| naswidth | 0.206386 | 0.0950450 | 2.17 | 0.030 | 1.23 | 1.02 | 1.48 |
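The odds ratio and confidence limit columns can be recovered from the coefficient and its standard error. The sketch below assumes a Wald interval with a normal critical value of 1.96 and uses the talarlength row as the example; the function name is illustrative. The SAS estimates in Appendix D carry the opposite sign, so their odds ratios are the reciprocals of those shown here.

```python
import math

def odds_ratio_ci(coef, se, z=1.96):
    """Odds ratio with a Wald 95% confidence interval."""
    return (math.exp(coef),
            math.exp(coef - z * se),
            math.exp(coef + z * se))

# talarlength row of the MINITAB table
or_tl, lo_tl, hi_tl = odds_ratio_ci(0.221434, 0.0926254)
or_sas = 1.0 / or_tl  # corresponds to the tlength point estimate in Appendix D
print(round(or_tl, 2), round(lo_tl, 2), round(hi_tl, 2), round(or_sas, 3))
```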

Log-Likelihood = -55.923

Test that all slopes are zero: G = 202.839, DF = 5, P-Value = 0.000

Goodness-of-Fit Tests

Method Chi-Square DF P

Pearson 188.129 221 0.947

Deviance 111.845 221 1.000

Hosmer-Lemeshow 5.315 8 0.723

Brown:

Appendix F: Association for “Sex”

Measures of Association:

(Between the Response Variable and Predicted Probabilities)

| Pairs | Number | Percent |
|---|---|---|
| Concordant | 12405 | 96.3 |
| Discordant | 466 | 3.6 |
| Ties | 11 | 0.1 |
| Total | 12882 | 100.0 |

| Summary Measures | Value |
|---|---|
| Somers' D | 0.93 |
| Goodman-Kruskal Gamma | 0.93 |
| Kendall's Tau-a | 0.47 |
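The summary measures can be derived from the pair counts. The sketch below uses the standard definitions of Somers' D, Goodman-Kruskal gamma, Kendall's tau-a and the c statistic, with the 227 observations of the model-building set; the function name is hypothetical.

```python
def association(concordant, discordant, ties, n_obs):
    """Rank-association measures from concordant/discordant pair counts."""
    pairs = concordant + discordant + ties
    diff = concordant - discordant
    return {
        "somers_d": diff / pairs,
        "gamma": diff / (concordant + discordant),
        "tau_a": diff / (n_obs * (n_obs - 1) / 2),  # all possible pairs
        "c": (concordant + 0.5 * ties) / pairs,
    }

sex = association(12405, 466, 11, 227)
print({k: round(v, 2) for k, v in sex.items()})
```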

Appendix G: Boxplot for “Sex”, Validating Data Set

Boxplot of talarlength, talarwidth, talarheight, naswidth, nasheight

[Box plots, one panel per variable (talarlength, talarwidth, talarheight, naswidth, nasheight), grouped by gender (0 = female, 1 = male).]

Appendix H: Boxplot for “Sex”, Entire Data Set

Boxplot of talarlength, talarwidth, talarheight, naswidth, nasheight

[Box plots, one panel per variable (talarlength, talarwidth, talarheight, naswidth, nasheight), grouped by gender (0 = female, 1 = male).]

Appendix I: Matrix Plot for Independent, Validating Set

Matrix Plot of TL, TW, TH, NASW, NASH

[Matrix plot panels for TL, TW, TH, NASW and NASH, with points marked by gender (0 = female, 1 = male).]

Appendix J: Logit Model for Model Set without Influential Values

Logit function for “Sex” minus influential values

Response Information

| Variable | Value | Count |
|---|---|---|
| gender 0-female | 1 | 110 (Event) |
| | 0 | 112 |
| | Total | 222 |

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -51.5071 | 7.22748 | -7.13 | 0.000 | | | |
| TL | 0.308017 | 0.104751 | 2.94 | 0.003 | 1.36 | 1.11 | 1.67 |
| TW | 0.326852 | 0.108747 | 3.01 | 0.003 | 1.39 | 1.12 | 1.72 |
| TH | 0.320752 | 0.192126 | 1.67 | 0.095 | 1.38 | 0.95 | 2.01 |
| NASW | 0.149525 | 0.0998896 | 1.50 | 0.134 | 1.16 | 0.95 | 1.41 |
| NASH | 0.214386 | 0.0916645 | 2.34 | 0.019 | 1.24 | 1.04 | 1.48 |

Log-Likelihood = -53.581

Test that all slopes are zero: G = 200.577, DF = 5, P-Value = 0.000

Goodness-of-Fit Tests

| Method | Chi-Square | DF | P |
|---|---|---|---|
| Pearson | 175.736 | 216 | 0.979 |
| Deviance | 107.162 | 216 | 1.000 |
| Hosmer-Lemeshow | 6.739 | 8 | 0.565 |

Measures of Association:

(Between the Response Variable and Predicted Probabilities)

| Pairs | Number | Percent |
|---|---|---|
| Concordant | 11884 | 96.5 |
| Discordant | 431 | 3.5 |
| Ties | 5 | 0.0 |
| Total | 12320 | 100.0 |

| Summary Measures | Value |
|---|---|
| Somers' D | 0.93 |
| Goodman-Kruskal Gamma | 0.93 |
| Kendall's Tau-a | 0.47 |

Appendix K: Boxplots for “Race”

Boxplot of talarlength, talarwidth, talarheight, necklength, ...

[Box plots, one panel per variable (talarlength, talarwidth, talarheight, necklength, neckheight, trochleawidth, trochlealength, caswidth, caslength, naswidth, nasheight), grouped by race (0 = white).]

Appendix L: SAS Model Selection for “Race”

The LOGISTIC Procedure

Response Variable race

Number of Response Levels 2

Model binary logit

Optimization Technique Fisher's scoring

Number of Observations Read 227

Number of Observations Used 227

Response Profile

| Ordered Value | race | Total Frequency |
|---|---|---|
| 1 | 0 | 119 |
| 2 | 1 | 108 |

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

-2 Log L = 314.156

Residual Chi-Square Test

| Chi-Square | DF | Pr > ChiSq |
|---|---|---|
| 167.4166 | 11 | <.0001 |

R-Square 0.6801 Max-rescaled R-Square 0.9075

Testing Global Null Hypothesis: BETA=0

| Test | Chi-Square | DF | Pr > ChiSq |
|---|---|---|---|
| Likelihood Ratio | 91.7985 | 1 | <.0001 |
| Score | 73.2184 | 1 | <.0001 |
| Wald | 50.0299 | 1 | <.0001 |

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

| Criterion | Intercept Only | Intercept and Covariates |
|---|---|---|
| AIC | 316.156 | 71.425 |
| SC | 319.581 | 98.825 |
| -2 Log L | 314.156 | 55.425 |

Testing Global Null Hypothesis: BETA=0

| Test | Chi-Square | DF | Pr > ChiSq |
|---|---|---|---|
| Likelihood Ratio | 258.7303 | 7 | <.0001 |
| Score | 163.6509 | 7 | <.0001 |
| Wald | 25.3750 | 7 | 0.0007 |

Residual Chi-Square Test

| Chi-Square | DF | Pr > ChiSq |
|---|---|---|
| 1.9016 | 4 | 0.7538 |

Analysis of Maximum Likelihood Estimates

| Parameter | DF | Estimate | Standard Error | Wald Chi-Square | Pr > ChiSq |
|---|---|---|---|---|---|
| Intercept | 1 | 5.8665 | 4.5462 | 1.6652 | 0.1969 |
| TL | 1 | -0.4030 | 0.1338 | 9.0704 | 0.0026 |
| NH | 1 | -0.0825 | 0.0230 | 12.8632 | 0.0003 |
| TRW | 1 | 0.5641 | 0.1221 | 21.3556 | <.0001 |
| TRL | 1 | 0.2770 | 0.1301 | 4.5350 | 0.0332 |
| CASW | 1 | -1.3843 | 0.3251 | 18.1357 | <.0001 |
| NASW | 1 | -0.2251 | 0.0781 | 8.3093 | 0.0039 |
| NASH | 1 | 1.7362 | 0.3717 | 21.8229 | <.0001 |

Appendix M: MINITAB Model for “Race”

Response Information

| Variable | Value | Count |
|---|---|---|
| race 0-white | 1 | 108 (Event) |
| | 0 | 119 |
| | Total | 227 |

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -5.86648 | 4.54621 | -1.29 | 0.197 | | | |
| TL | 0.403015 | 0.133816 | 3.01 | 0.003 | 1.50 | 1.15 | 1.95 |
| NH | 0.0824947 | 0.0230013 | 3.59 | 0.000 | 1.09 | 1.04 | 1.14 |
| TRW | -0.564088 | 0.122065 | -4.62 | 0.000 | 0.57 | 0.45 | 0.72 |
| TRL | -0.277032 | 0.130089 | -2.13 | 0.033 | 0.76 | 0.59 | 0.98 |
| CASW | 1.38427 | 0.325052 | 4.26 | 0.000 | 3.99 | 2.11 | 7.55 |
| NASW | 0.225123 | 0.0780976 | 2.88 | 0.004 | 1.25 | 1.07 | 1.46 |
| NASH | -1.73619 | 0.371654 | -4.67 | 0.000 | 0.18 | 0.09 | 0.37 |

Log-Likelihood = -27.713

Test that all slopes are zero: G = 258.730, DF = 7, P-Value = 0.000

Goodness-of-Fit Tests

| Method | Chi-Square | DF | P |
|---|---|---|---|
| Pearson | 326.584 | 219 | 0.000 |
| Deviance | 55.425 | 219 | 1.000 |
| Hosmer-Lemeshow | 19.272 | 8 | 0.013 |

Measures of Association:

(Between the Response Variable and Predicted Probabilities)

| Pairs | Number | Percent |
|---|---|---|
| Concordant | 12718 | 99.0 |
| Discordant | 123 | 1.0 |
| Ties | 11 | 0.1 |
| Total | 12852 | 100.0 |

| Summary Measures | Value |
|---|---|
| Somers' D | 0.98 |
| Goodman-Kruskal Gamma | 0.98 |
| Kendall's Tau-a | 0.49 |

Appendix N: MINITAB Output for Interaction Models for “Race”

Model 1

Response Information

| Variable | Value | Count |
|---|---|---|
| race 0-white | 1 | 108 (Event) |
| | 0 | 119 |
| | Total | 227 |

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -5.88423 | 5.15744 | -1.14 | 0.254 | | | |
| TL | 0.372019 | 0.177541 | 2.10 | 0.036 | 1.45 | 1.02 | 2.05 |
| TW | 0.219364 | 0.182083 | 1.20 | 0.228 | 1.25 | 0.87 | 1.78 |
| TH | -0.221514 | 0.279355 | -0.79 | 0.428 | 0.80 | 0.46 | 1.39 |
| NL | 0.0885253 | 0.185287 | 0.48 | 0.633 | 1.09 | 0.76 | 1.57 |
| NH | 0.0815155 | 0.0233930 | 3.48 | 0.000 | 1.08 | 1.04 | 1.14 |
| TRW | -0.533651 | 0.123625 | -4.32 | 0.000 | 0.59 | 0.46 | 0.75 |
| TRL | -0.267512 | 0.121328 | -2.20 | 0.027 | 0.77 | 0.60 | 0.97 |
| CASW | 1.27186 | 0.343717 | 3.70 | 0.000 | 3.57 | 1.82 | 7.00 |
| CASL | -0.0351722 | 0.311329 | -0.11 | 0.910 | 0.97 | 0.52 | 1.78 |
| NASW | 0.220865 | 0.0787801 | 2.80 | 0.005 | 1.25 | 1.07 | 1.46 |
| NASH | -1.69468 | 0.383163 | -4.42 | 0.000 | 0.18 | 0.09 | 0.39 |

Log-Likelihood = -26.670

Test that all slopes are zero: G = 260.816, DF = 11, P-Value = 0.000

Model 2

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -5.86648 | 4.54621 | -1.29 | 0.197 | | | |
| TL | 0.403015 | 0.133816 | 3.01 | 0.003 | 1.50 | 1.15 | 1.95 |
| TRW | -0.564088 | 0.122065 | -4.62 | 0.000 | 0.57 | 0.45 | 0.72 |
| NH | 0.0824947 | 0.0230013 | 3.59 | 0.000 | 1.09 | 1.04 | 1.14 |
| TRL | -0.277032 | 0.130089 | -2.13 | 0.033 | 0.76 | 0.59 | 0.98 |
| CASW | 1.38427 | 0.325052 | 4.26 | 0.000 | 3.99 | 2.11 | 7.55 |
| NASW | 0.225123 | 0.0780976 | 2.88 | 0.004 | 1.25 | 1.07 | 1.46 |
| NASH | -1.73619 | 0.371654 | -4.67 | 0.000 | 0.18 | 0.09 | 0.37 |

Log-Likelihood = -27.713

Test that all slopes are zero: G = 258.730, DF = 7, P-Value = 0.000

Model 3

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -70.4107 | 47.4082 | -1.49 | 0.137 | | | |
| TL | 1.56644 | 0.873843 | 1.79 | 0.073 | 4.79 | 0.86 | 26.55 |
| TRW | 1.35961 | 1.39260 | 0.98 | 0.329 | 3.89 | 0.25 | 59.69 |
| NH | 0.0782883 | 0.0228892 | 3.42 | 0.001 | 1.08 | 1.03 | 1.13 |
| TRL | -0.293272 | 0.130935 | -2.24 | 0.025 | 0.75 | 0.58 | 0.96 |
| CASW | 1.32653 | 0.317887 | 4.17 | 0.000 | 3.77 | 2.02 | 7.03 |
| NASW | 0.273463 | 0.0907205 | 3.01 | 0.003 | 1.31 | 1.10 | 1.57 |
| NASH | -1.67194 | 0.365908 | -4.57 | 0.000 | 0.19 | 0.09 | 0.38 |
| TL*TRW | -0.0347235 | 0.0254446 | -1.36 | 0.172 | 0.97 | 0.92 | 1.02 |

Log-Likelihood = -26.707

Test that all slopes are zero: G = 260.741, DF = 8, P-Value = 0.000

Model 4

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -43.2576 | 16.1723 | -2.67 | 0.007 | | | |
| TL | 0.293835 | 0.142617 | 2.06 | 0.039 | 1.34 | 1.01 | 1.77 |
| TRW | -0.510258 | 0.124402 | -4.10 | 0.000 | 0.60 | 0.47 | 0.77 |
| NH | 0.0736815 | 0.0243052 | 3.03 | 0.002 | 1.08 | 1.03 | 1.13 |
| TRL | -0.304224 | 0.122738 | -2.48 | 0.013 | 0.74 | 0.58 | 0.94 |
| CASW | 1.26853 | 0.344443 | 3.68 | 0.000 | 3.56 | 1.81 | 6.98 |
| NASW | 1.75790 | 0.622647 | 2.82 | 0.005 | 5.80 | 1.71 | 19.65 |
| NASH | -0.168535 | 0.665501 | -0.25 | 0.800 | 0.84 | 0.23 | 3.11 |
| NASW*NASH | -0.0509867 | 0.0203612 | -2.50 | 0.012 | 0.95 | 0.91 | 0.99 |

Log-Likelihood = -23.973

Test that all slopes are zero: G = 266.210, DF = 8, P-Value = 0.000

Model 5

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -70.4201 | 48.1188 | -1.46 | 0.143 | | | |
| TL | 0.844742 | 0.929689 | 0.91 | 0.364 | 2.33 | 0.38 | 14.40 |
| TRW | 0.381881 | 1.47785 | 0.26 | 0.796 | 1.47 | 0.08 | 26.53 |
| NH | 0.0718397 | 0.0240468 | 2.99 | 0.003 | 1.07 | 1.03 | 1.13 |
| TRL | -0.309997 | 0.122392 | -2.53 | 0.011 | 0.73 | 0.58 | 0.93 |
| CASW | 1.23494 | 0.339124 | 3.64 | 0.000 | 3.44 | 1.77 | 6.68 |
| NASW | 1.66850 | 0.635239 | 2.63 | 0.009 | 5.30 | 1.53 | 18.42 |
| NASH | -0.244549 | 0.676724 | -0.36 | 0.718 | 0.78 | 0.21 | 2.95 |
| NASW*NASH | -0.0472719 | 0.0210906 | -2.24 | 0.025 | 0.95 | 0.92 | 0.99 |
| TL*TRW | -0.0161221 | 0.0267750 | -0.60 | 0.547 | 0.98 | 0.93 | 1.04 |

Log-Likelihood = -23.783

Test that all slopes are zero: G = 266.589, DF = 9, P-Value = 0.000

Model 6

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -186.245 | 61.9372 | -3.01 | 0.003 | | | |
| TL | 0.356783 | 0.225091 | 1.59 | 0.113 | 1.43 | 0.92 | 2.22 |
| TRW | 2.73686 | 1.46008 | 1.87 | 0.061 | 15.44 | 0.88 | 270.05 |
| NH | 4.07752 | 1.25364 | 3.25 | 0.001 | 59.00 | 5.06 | 688.59 |
| TRL | -0.159763 | 0.289506 | -0.55 | 0.581 | 0.85 | 0.48 | 1.50 |
| CASW | 4.48686 | 1.68700 | 2.66 | 0.008 | 88.84 | 3.26 | 2424.48 |
| NASW | 0.364759 | 0.152780 | 2.39 | 0.017 | 1.44 | 1.07 | 1.94 |
| NASH | 1.44764 | 0.833928 | 1.74 | 0.083 | 4.25 | 0.83 | 21.80 |
| TRW*CASW | -0.109724 | 0.0523412 | -2.10 | 0.036 | 0.90 | 0.81 | 0.99 |
| NH*NASH | -0.150902 | 0.0469249 | -3.22 | 0.001 | 0.86 | 0.78 | 0.94 |

Log-Likelihood = -14.453

Test that all slopes are zero: G = 285.249, DF = 9, P-Value = 0.000

Model 7

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -118.408 | 38.8410 | -3.05 | 0.002 | | | |
| TL | 0.519677 | 0.175085 | 2.97 | 0.003 | 1.68 | 1.19 | 2.37 |
| TRW | 2.59920 | 1.03909 | 2.50 | 0.012 | 13.45 | 1.76 | 103.11 |
| NH | 0.0771210 | 0.0249617 | 3.09 | 0.002 | 1.08 | 1.03 | 1.13 |
| TRL | -0.253359 | 0.195440 | -1.30 | 0.195 | 0.78 | 0.53 | 1.14 |
| CASW | 4.84838 | 1.30846 | 3.71 | 0.000 | 127.53 | 9.81 | 1657.33 |
| NASW | 0.447722 | 0.124477 | 3.60 | 0.000 | 1.56 | 1.23 | 2.00 |
| NASH | -1.66806 | 0.399147 | -4.18 | 0.000 | 0.19 | 0.09 | 0.41 |
| TRW*CASW | -0.111364 | 0.0378820 | -2.94 | 0.003 | 0.89 | 0.83 | 0.96 |

Log-Likelihood = -22.060

Test that all slopes are zero: G = 270.035, DF = 8, P-Value = 0.000

Model 8

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -84.8241 | 25.2873 | -3.35 | 0.001 | | | |
| TL | 0.216887 | 0.169544 | 1.28 | 0.201 | 1.24 | 0.89 | 1.73 |
| TRW | -0.417955 | 0.133794 | -3.12 | 0.002 | 0.66 | 0.51 | 0.86 |
| NH | 4.68841 | 1.34447 | 3.49 | 0.000 | 108.68 | 7.79 | 1515.62 |
| TRL | -0.220138 | 0.219200 | -1.00 | 0.315 | 0.80 | 0.52 | 1.23 |
| CASW | 1.27680 | 0.429933 | 2.97 | 0.003 | 3.59 | 1.54 | 8.33 |
| NASW | 0.158234 | 0.0809558 | 1.95 | 0.051 | 1.17 | 1.00 | 1.37 |
| NASH | 1.59608 | 0.834809 | 1.91 | 0.056 | 4.93 | 0.96 | 25.34 |
| NH*NASH | -0.173346 | 0.0501833 | -3.45 | 0.001 | 0.84 | 0.76 | 0.93 |

Log-Likelihood = -17.395

Test that all slopes are zero: G = 279.365, DF = 8, P-Value = 0.000

Model 6b

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -178.402 | 58.6779 | -3.04 | 0.002 | | | |
| TRW | 2.44898 | 1.43540 | 1.71 | 0.088 | 11.58 | 0.69 | 192.94 |
| NH | 4.68995 | 1.27673 | 3.67 | 0.000 | 108.85 | 8.91 | 1329.21 |
| TRL | 0.0537632 | 0.240891 | 0.22 | 0.823 | 1.06 | 0.66 | 1.69 |
| CASW | 4.17632 | 1.65971 | 2.52 | 0.012 | 65.13 | 2.52 | 1684.70 |
| NASW | 0.342444 | 0.143878 | 2.38 | 0.017 | 1.41 | 1.06 | 1.87 |
| TRW*CASW | -0.0979236 | 0.0508754 | -1.92 | 0.054 | 0.91 | 0.82 | 1.00 |
| NASH | 1.94470 | 0.805299 | 2.41 | 0.016 | 6.99 | 1.44 | 33.89 |
| NH*NASH | -0.174379 | 0.0477548 | -3.65 | 0.000 | 0.84 | 0.76 | 0.92 |

Log-Likelihood = -15.953

Test that all slopes are zero: G = 282.251, DF = 8, P-Value = 0.000

Model 6c

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -174.030 | 55.0453 | -3.16 | 0.002 | | | |
| TRW | 2.33923 | 1.36156 | 1.72 | 0.086 | 10.37 | 0.72 | 149.59 |
| NH | 4.67721 | 1.26515 | 3.70 | 0.000 | 107.47 | 9.00 | 1282.94 |
| CASW | 4.07708 | 1.59646 | 2.55 | 0.011 | 58.97 | 2.58 | 1347.67 |
| NASW | 0.330359 | 0.131910 | 2.50 | 0.012 | 1.39 | 1.07 | 1.80 |
| TRW*CASW | -0.0937474 | 0.0474839 | -1.97 | 0.048 | 0.91 | 0.83 | 1.00 |
| NASH | 1.95380 | 0.798224 | 2.45 | 0.014 | 7.06 | 1.48 | 33.73 |
| NH*NASH | -0.173871 | 0.0473085 | -3.68 | 0.000 | 0.84 | 0.77 | 0.92 |

Log-Likelihood = -15.977

Test that all slopes are zero: G = 282.201, DF = 7, P-Value = 0.000

Model 6d

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Constant | -94.2384 | 23.3412 | -4.04 | 0.000 | | | |
| NH | 5.37793 | 1.24694 | 4.31 | 0.000 | 216.57 | 18.80 | 2494.71 |
| CASW | 0.791293 | 0.266798 | 2.97 | 0.003 | 2.21 | 1.31 | 3.72 |
| NASW | 0.119105 | 0.0680674 | 1.75 | 0.080 | 1.13 | 0.99 | 1.29 |
| NASH | 2.27766 | 0.761190 | 2.99 | 0.003 | 9.75 | 2.19 | 43.36 |
| NH*NASH | -0.200110 | 0.0465938 | -4.29 | 0.000 | 0.82 | 0.75 | 0.90 |

Log-Likelihood = -24.915

Test that all slopes are zero: G = 264.325, DF = 5, P-Value = 0.000

Appendix O: ANOVA Tables and Tukey Comparisons for “Sex with Race”

One-way ANOVA: talarlength versus response

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| response | 3 | 2958.5 | 986.2 | 80.53 | 0.000 |
| Error | 223 | 2730.7 | 12.2 | | |
| Total | 226 | 5689.2 | | | |

S = 3.499 R-Sq = 52.00% R-Sq(adj) = 51.36%

| Level | N | Mean | StDev |
|---|---|---|---|
| bf | 54 | 54.110 | 3.431 |
| bm | 54 | 60.842 | 3.516 |
| wf | 59 | 53.429 | 3.316 |
| wm | 60 | 61.060 | 3.715 |

Pooled StDev = 3.499

Tukey 95% Simultaneous Confidence Intervals

All Pairwise Comparisons among Levels of response

Individual confidence level = 98.97%

| Subtracted level | response | Lower | Center | Upper |
|---|---|---|---|---|
| bf | bm | 4.989 | 6.732 | 8.475 |
| bf | wf | -2.387 | -0.682 | 1.024 |
| bf | wm | 5.251 | 6.950 | 8.649 |
| bm | wf | -9.119 | -7.414 | -5.708 |
| bm | wm | -1.481 | 0.218 | 1.917 |
| wf | wm | 5.971 | 7.632 | 9.292 |
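The derived quantities in this output (F, S, R-Sq and R-Sq(adj)) follow from the sums of squares and degrees of freedom. A minimal sketch with a hypothetical function name, applied to the talar length table above:

```python
import math

def anova_summary(ss_between, df_between, ss_error, df_error):
    """F, pooled standard deviation and R-squared values from an ANOVA table."""
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    ss_total = ss_between + ss_error
    df_total = df_between + df_error
    return {
        "F": ms_between / ms_error,
        "S": math.sqrt(ms_error),
        "R-Sq": 100 * ss_between / ss_total,
        "R-Sq(adj)": 100 * (1 - ms_error / (ss_total / df_total)),
    }

tl = anova_summary(2958.5, 3, 2730.7, 223)
print({k: round(v, 2) for k, v in tl.items()})
```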

One-way ANOVA: talarwidth versus response

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| response | 3 | 2237.07 | 745.69 | 87.43 | 0.000 |
| Error | 223 | 1902.06 | 8.53 | | |
| Total | 226 | 4139.13 | | | |

S = 2.921 R-Sq = 54.05% R-Sq(adj) = 53.43%

| Level | N | Mean | StDev |
|---|---|---|---|
| bf | 54 | 40.528 | 3.075 |
| bm | 54 | 45.991 | 2.966 |
| wf | 59 | 38.181 | 3.011 |
| wm | 60 | 44.648 | 2.632 |

Pooled StDev = 2.921

Tukey 95% Simultaneous Confidence Intervals

All Pairwise Comparisons among Levels of response

Individual confidence level = 98.97%

| Subtracted level | response | Lower | Center | Upper |
|---|---|---|---|---|
| bf | bm | 4.009 | 5.464 | 6.918 |
| bf | wf | -3.770 | -2.347 | -0.923 |
| bf | wm | 2.702 | 4.120 | 5.538 |
| bm | wf | -9.234 | -7.810 | -6.387 |
| bm | wm | -2.761 | -1.344 | 0.074 |
| wf | wm | 5.081 | 6.467 | 7.852 |

One-way ANOVA: talarheight versus response

Source     DF       SS      MS      F      P
response    3   966.34  322.11  66.71  0.000
Error     223  1076.77    4.83
Total     226  2043.12

S = 2.197   R-Sq = 47.30%   R-Sq(adj) = 46.59%

Individual 95% CIs For Mean Based on Pooled StDev

Level   N    Mean  StDev
bf     54  29.112  1.926
bm     54  32.443  1.569
wf     59  28.925  2.681
wm     60  33.607  2.375

[Character interval plot of group means, axis 28.5 to 33.0]

Pooled StDev = 2.197

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of response

Individual confidence level = 98.97%

[Character interval plots of pairwise differences, axis -3.0 to 6.0]

response = bf subtracted from:

response   Lower  Center   Upper
bm         2.236   3.331   4.425
wf        -1.258  -0.187   0.884
wm         3.429   4.495   5.562

response = bm subtracted from:

response   Lower  Center   Upper
wf        -4.589  -3.518  -2.447
wm         0.098   1.165   2.231

response = wf subtracted from:

response   Lower  Center   Upper
wm         3.639   4.682   5.725

One-way ANOVA: necklength versus response

Source     DF      SS     MS     F      P
response    3   384.5  128.2  8.48  0.000
Error     223  3370.5   15.1
Total     226  3755.0

S = 3.888   R-Sq = 10.24%   R-Sq(adj) = 9.03%

Individual 95% CIs For Mean Based on Pooled StDev

Level   N    Mean  StDev
bf     54  16.762  6.061
bm     54  18.113  2.690
wf     59  15.360  3.324
wm     60  18.669  2.602

[Character interval plot of group means, axis 15.0 to 19.5]

Pooled StDev = 3.888

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of response

Individual confidence level = 98.97%

[Character interval plots of pairwise differences, axis -3.0 to 6.0]

response = bf subtracted from:

response   Lower  Center   Upper
bm        -0.585   1.351   3.288
wf        -3.297  -1.402   0.493
wm         0.020   1.907   3.794

response = bm subtracted from:

response   Lower  Center   Upper
wf        -4.648  -2.753  -0.858
wm        -1.331   0.556   2.443

response = wf subtracted from:

response   Lower  Center   Upper
wm         1.464   3.309   5.154

One-way ANOVA: neckheight versus response

Source     DF       SS     MS     F      P
response    3   1835.0  611.7  6.12  0.001
Error     223  22285.4   99.9
Total     226  24120.4

S = 9.997   R-Sq = 7.61%   R-Sq(adj) = 6.36%

Individual 95% CIs For Mean Based on Pooled StDev

Level   N    Mean   StDev
bf     54  21.624   1.936
bm     54  23.518   3.465
wf     59  16.092  18.709
wm     60  18.522   4.413

[Character interval plot of group means, axis 14.0 to 24.5]

Pooled StDev = 9.997

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of response

Individual confidence level = 98.97%

[Character interval plots of pairwise differences, axis -7.0 to 14.0]

response = bf subtracted from:

response    Lower  Center   Upper
bm         -3.085   1.894   6.873
wf        -10.404  -5.532  -0.659
wm         -7.955  -3.102   1.751

response = bm subtracted from:

response    Lower  Center   Upper
wf        -12.298  -7.425  -2.553
wm         -9.849  -4.996  -0.143

response = wf subtracted from:

response    Lower  Center   Upper
wm         -2.314   2.429   7.173

One-way ANOVA: trochleawidth versus response

Source     DF      SS     MS      F      P
response    3  1465.6  488.5  47.42  0.000
Error     223  2297.3   10.3
Total     226  3762.8

S = 3.210   R-Sq = 38.95%   R-Sq(adj) = 38.13%

Individual 95% CIs For Mean Based on Pooled StDev

Level   N    Mean  StDev
bf     54  29.649  1.979
bm     54  33.609  2.361
wf     59  32.580  4.698
wm     60  36.743  2.951

[Character interval plot of group means, axis 30.0 to 37.5]

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of response

Individual confidence level = 98.97%

[Character interval plots of pairwise differences, axis -4.0 to 8.0]

response = bf subtracted from:

response   Lower  Center   Upper
bm         2.361   3.959   5.558
wf         1.367   2.931   4.495
wm         5.536   7.094   8.652

response = bm subtracted from:

response   Lower  Center   Upper
wf        -2.593  -1.029   0.536
wm         1.576   3.134   4.693

response = wf subtracted from:

response   Lower  Center   Upper
wm         2.640   4.163   5.686

One-way ANOVA: trochlealength versus response

Source     DF       SS      MS      F      P
response    3  1143.73  381.24  41.99  0.000
Error     223  2024.76    9.08
Total     226  3168.49

S = 3.013   R-Sq = 36.10%   R-Sq(adj) = 35.24%

Individual 95% CIs For Mean Based on Pooled StDev

Level   N    Mean  StDev
bf     54  32.149  3.900
bm     54  35.747  2.641
wf     59  32.689  2.843
wm     60  37.580  2.538

[Character interval plot of group means, axis 32.0 to 38.0]

Pooled StDev = 3.013

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of response

Individual confidence level = 98.97%

[Character interval plots of pairwise differences, axis -3.5 to 7.0]

response = bf subtracted from:

response   Lower  Center   Upper
bm         2.098   3.599   5.099
wf        -0.929   0.540   2.009
wm         3.968   5.431   6.894

response = bm subtracted from:

response   Lower  Center   Upper
wf        -4.527  -3.058  -1.590
wm         0.370   1.833   3.295

response = wf subtracted from:

response   Lower  Center   Upper
wm         3.461   4.891   6.321

One-way ANOVA: caswidth versus response

Source     DF       SS      MS      F      P
response    3  1129.55  376.52  63.45  0.000
Error     223  1323.24    5.93
Total     226  2452.79

S = 2.436   R-Sq = 46.05%   R-Sq(adj) = 45.33%

Individual 95% CIs For Mean Based on Pooled StDev

Level   N    Mean  StDev
bf     54  31.477  2.302
bm     54  34.143  2.120
wf     59  28.411  2.975
wm     60  33.382  2.220

[Character interval plot of group means, axis 28.0 to 34.0]

Pooled StDev = 2.436

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of response

Individual confidence level = 98.97%

[Character interval plots of pairwise differences, axis -7.0 to 3.5]

response = bf subtracted from:

response   Lower  Center   Upper
bm         1.452   2.665   3.879
wf        -4.253  -3.066  -1.879
wm         0.722   1.905   3.087

response = bm subtracted from:

response   Lower  Center   Upper
wf        -6.919  -5.731  -4.544
wm        -1.943  -0.761   0.422

response = wf subtracted from:

response   Lower  Center   Upper
wm         3.815   4.971   6.127

One-way ANOVA: caslength versus response

Source     DF       SS      MS      F      P
response    3   632.01  210.67  41.22  0.000
Error     223  1139.72    5.11
Total     226  1771.72

S = 2.261   R-Sq = 35.67%   R-Sq(adj) = 34.81%

Individual 95% CIs For Mean Based on Pooled StDev

Level   N    Mean  StDev
bf     54  20.472  2.393
bm     54  22.670  2.436
wf     59  18.931  2.562
wm     60  22.948  1.546

[Character interval plot of group means, axis 19.5 to 24.0]

Pooled StDev = 2.261

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of response

Individual confidence level = 98.97%

[Character interval plots of pairwise differences, axis -3.0 to 6.0]

response = bf subtracted from:

response   Lower  Center   Upper
bm         1.072   2.198   3.324
wf        -2.644  -1.542  -0.440
wm         1.379   2.476   3.574

response = bm subtracted from:

response   Lower  Center   Upper
wf        -4.841  -3.739  -2.637
wm        -0.819   0.278   1.376

response = wf subtracted from:

response   Lower  Center   Upper
wm         2.945   4.018   5.091


One-way ANOVA: naswidth versus response

Source     DF      SS     MS      F      P
response    3  1548.1  516.0  42.77  0.000
Error     223  2690.9   12.1
Total     226  4239.0

S = 3.474   R-Sq = 36.52%   R-Sq(adj) = 35.67%

Individual 95% CIs For Mean Based on Pooled StDev

Level   N    Mean  StDev
bf     54  29.989  2.721
bm     54  33.539  2.712
wf     59  26.393  2.667
wm     60  31.423  5.036

[Character interval plot of group means, axis 27.5 to 35.0]

Pooled StDev = 3.474

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of response

Individual confidence level = 98.97%

[Character interval plots of pairwise differences, axis -5.0 to 10.0]

response = bf subtracted from:

response   Lower  Center   Upper
bm         1.820   3.550   5.280
wf        -5.290  -3.597  -1.904
wm        -0.253   1.433   3.120

response = bm subtracted from:

response   Lower  Center   Upper
wf        -8.840  -7.147  -5.454
wm        -3.803  -2.117  -0.431

response = wf subtracted from:

response   Lower  Center   Upper
wm         3.382   5.030   6.678

One-way ANOVA: nasheight versus response

Source     DF       SS      MS       F      P
response    3  1873.34  624.45  105.36  0.000
Error     223  1321.65    5.93
Total     226  3195.00

S = 2.434   R-Sq = 58.63%   R-Sq(adj) = 58.08%

Individual 95% CIs For Mean Based on Pooled StDev

Level   N    Mean  StDev
bf     54  21.832  1.569
bm     54  23.532  1.761
wf     59  24.392  2.852
wm     60  29.462  3.067

[Character interval plot of group means, axis 22.5 to 30.0]

Pooled StDev = 2.434

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of response

Individual confidence level = 98.97%

[Character interval plots of pairwise differences, axis -4.0 to 8.0]

response = bf subtracted from:

response   Lower  Center   Upper
bm         0.487   1.700   2.912
wf         1.373   2.560   3.746
wm         6.448   7.630   8.812

response = bm subtracted from:

response   Lower  Center   Upper
wf        -0.327   0.860   2.047
wm         4.749   5.930   7.112

response = wf subtracted from:

response   Lower  Center   Upper
wm         3.915   5.070   6.226


Appendix P: SAS Output for Model Selection

The CATMOD Procedure

Data Summary

Response            final        Response Levels     4
Weight Variable     None         Populations       227
Data Set            TALUSREAL    Total Frequency   227
Frequency Missing   0            Observations      227

Number of Observations Read   227
Number of Observations Used   227

Response Profile

Ordered Value   final   Total Frequency
1               0                    59
2               1                    54
3               2                    54
4               3                    60


Model Fit Statistics

Criterion   Intercept Only   Intercept and Covariates
AIC                634.836                    208.993
SC                 645.111                    291.191
-2 Log L           628.836                    160.993

Testing Global Null Hypothesis: BETA=0

Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio     467.8435   21       <.0001
Score                326.0345   21       <.0001
Wald                  78.7224   21       <.0001

Residual Chi-Square Test

Chi-Square   DF   Pr > ChiSq
   11.0888   12       0.5213
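The fit statistics above are linked by simple arithmetic. The sketch below recomputes AIC, SC and the likelihood ratio chi-square from the two -2 Log L values, assuming the standard definitions (AIC = -2 Log L + 2k, SC = -2 Log L + k ln n) and assuming k counts 3 intercepts for the intercept-only model and 3 intercepts plus 21 slopes for the fitted model, which is consistent with DF = 21. The function names are illustrative.

```python
import math

n_obs = 227                               # observations used
neg2ll_null, k_null = 628.836, 3          # intercept-only model (assumed k = 3)
neg2ll_full, k_full = 160.993, 24         # assumed k = 3 intercepts + 21 slopes

def aic(neg2ll, k):
    """Akaike information criterion."""
    return neg2ll + 2 * k

def sc(neg2ll, k, n):
    """Schwarz criterion (BIC)."""
    return neg2ll + k * math.log(n)

# Likelihood ratio test of the global null hypothesis BETA=0
lr_chi_square = neg2ll_null - neg2ll_full
lr_df = k_full - k_null

print(f"AIC: {aic(neg2ll_null, k_null):.3f} vs {aic(neg2ll_full, k_full):.3f}")
print(f"SC:  {sc(neg2ll_null, k_null, n_obs):.3f} vs {sc(neg2ll_full, k_full, n_obs):.3f}")
print(f"LR chi-square = {lr_chi_square:.3f} on {lr_df} DF")
```

Any last-digit disagreement with the printed output would come from the -2 Log L values being rounded to three decimals.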


Analysis of Maximum Likelihood Estimates

Parameter   final   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept   0        1    62.4872          10.6985           34.1141       <.0001
Intercept   1        1    45.7701          10.6915           18.3267       <.0001
Intercept   2        1     4.5831          10.4022            0.1941       0.6595
tlength     0        1    -0.2482           0.1651            2.2615       0.1326
tlength     1        1    -0.0892           0.1743            0.2619       0.6088
tlength     2        1     0.2419           0.1771            1.8655       0.1720
twidth      0        1    -0.4532           0.2084            4.7278       0.0297
twidth      1        1    -0.2527           0.2141            1.3929       0.2379
twidth      2        1     0.0603           0.2127            0.0803       0.7769
nheight     0        1     0.1133           0.0881            1.6538       0.1984
nheight     1        1     0.1739           0.0899            3.7374       0.0532
nheight     2        1     0.1905           0.0932            4.1785       0.0409
trocw       0        1    -0.0464           0.1457            0.1014       0.7502
trocw       1        1    -0.5475           0.1747            9.8194       0.0017
trocw       2        1    -0.4498           0.1882            5.7113       0.0169
casw        0        1    -0.2944           0.1538            3.6638       0.0556
casw        1        1     0.7961           0.3505            5.1573       0.0231
casw        2        1     0.7834           0.3724            4.4242       0.0354
nasw        0        1    -0.4759           0.1708            7.7653       0.0053
nasw        1        1     0.0465           0.1331            0.1220       0.7269
nasw        2        1     0.2422           0.0914            7.0172       0.0081
nash        0        1    -0.2473           0.1699            2.1185       0.1455
nash        1        1    -1.7180           0.3957           18.8537       <.0001
nash        2        1    -1.7470           0.4137           17.8342       <.0001
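Each Wald Chi-Square in the table is, under the usual definition, the squared ratio of the estimate to its standard error, referred to a chi-square distribution with 1 DF. The sketch below checks this for the twidth row with final = 0. The helper names are illustrative, and the small gap from the printed 4.7278 reflects the estimate and standard error being rounded to four decimals.

```python
import math

def wald_chi_square(estimate, std_error):
    """Wald statistic for a single parameter: (estimate / SE) squared."""
    return (estimate / std_error) ** 2

def chi_square_1df_p_value(statistic):
    """Upper-tail probability of a chi-square(1) variable, via erfc."""
    return math.erfc(math.sqrt(statistic / 2.0))

# twidth, final = 0 (values copied from the table above)
estimate, std_error = -0.4532, 0.2084
chi_sq = wald_chi_square(estimate, std_error)
p_value = chi_square_1df_p_value(chi_sq)

print(f"Wald chi-square = {chi_sq:.4f}, Pr > ChiSq = {p_value:.4f}")
```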

Odds Ratio Estimates

Effect    final   Point Estimate   95% Wald Confidence Limits
tlength   0                0.780        0.565        1.078
tlength   1                0.915        0.650        1.287
tlength   2                1.274        0.900        1.802
twidth    0                0.636        0.422        0.956
twidth    1                0.777        0.510        1.182
twidth    2                1.062        0.700        1.611
nheight   0                1.120        0.942        1.331
nheight   1                1.190        0.998        1.419
nheight   2                1.210        1.008        1.452
trocw     0                0.955        0.717        1.270
trocw     1                0.578        0.411        0.815
trocw     2                0.638        0.441        0.922
casw      0                0.745        0.551        1.007
casw      1                2.217        1.115        4.407
casw      2                2.189        1.055        4.542
nasw      0                0.621        0.445        0.868
nasw      1                1.048        0.807        1.360
nasw      2                1.274        1.065        1.524
nash      0                0.781        0.560        1.089
nash      1                0.179        0.083        0.390
nash      2                0.174        0.077        0.392


Appendix Q: MINITAB Output for “Sex” with “Race” Models

Nominal Logistic Regression: 0wf 1bf 2bm 3wm versus TL, TW, ...

Response Information

Variable          Value   Count
0wf 1bf 2bm 3wm   3          60   (Reference Event)
                  2          54
                  1          54
                  0          59
                  Total     227

Logistic Regression Table

                                                     Odds     95% CI
Predictor             Coef    SE Coef      Z      P  Ratio  Lower  Upper
Logit 1: (2/3)
Constant           4.58305    10.4022   0.44  0.660
TL                0.241902   0.177109   1.37  0.172   1.27   0.90   1.80
TW               0.0602647   0.212705   0.28  0.777   1.06   0.70   1.61
NH                0.190524  0.0932051   2.04  0.041   1.21   1.01   1.45
TRW              -0.449825   0.188224  -2.39  0.017   0.64   0.44   0.92
CASW              0.783369   0.372432   2.10  0.035   2.19   1.05   4.54
NASW              0.242209  0.0914339   2.65  0.008   1.27   1.07   1.52
NASH              -1.74697   0.413674  -4.22  0.000   0.17   0.08   0.39
Logit 2: (1/3)
Constant           45.7701    10.6915   4.28  0.000
TL              -0.0891939   0.174301  -0.51  0.609   0.91   0.65   1.29
TW               -0.252728   0.214141  -1.18  0.238   0.78   0.51   1.18
NH                0.173859  0.0899314   1.93  0.053   1.19   1.00   1.42
TRW              -0.547465   0.174708  -3.13  0.002   0.58   0.41   0.81
CASW              0.796051   0.350533   2.27  0.023   2.22   1.12   4.41
NASW             0.0464980   0.133142   0.35  0.727   1.05   0.81   1.36
NASH              -1.71797   0.395655  -4.34  0.000   0.18   0.08   0.39
Logit 3: (0/3)
Constant           62.4872    10.6985   5.84  0.000
TL               -0.248226   0.165063  -1.50  0.133   0.78   0.56   1.08
TW               -0.453237   0.208446  -2.17  0.030   0.64   0.42   0.96
NH                0.113274  0.0880824   1.29  0.198   1.12   0.94   1.33
TRW             -0.0464089   0.145745  -0.32  0.750   0.95   0.72   1.27
CASW             -0.294440   0.153826  -1.91  0.056   0.74   0.55   1.01
NASW             -0.475891   0.170777  -2.79  0.005   0.62   0.44   0.87
NASH             -0.247323   0.169922  -1.46  0.146   0.78   0.56   1.09

Log-Likelihood = -80.496
Test that all slopes are zero: G = 467.843, DF = 21, P-Value = 0.000
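To illustrate how the three logits translate into group membership probabilities, the sketch below evaluates each logit against the reference group (3 = wm) and normalizes the exponentiated scores, as in a baseline-category logit model. The input is illustrative only: it uses the wm group means from Appendix O and assumes the abbreviations map as TL = talarlength, TW = talarwidth, NH = neckheight, TRW = trochleawidth, CASW = caswidth, NASW = naswidth and NASH = nasheight. The function name `group_probabilities` is hypothetical.

```python
import math

PREDICTORS = ["TL", "TW", "NH", "TRW", "CASW", "NASW", "NASH"]

# (constant, slopes) per logit, copied from the table above; reference group is wm
LOGITS = {
    "bm": (4.58305, [0.241902, 0.0602647, 0.190524, -0.449825,
                     0.783369, 0.242209, -1.74697]),
    "bf": (45.7701, [-0.0891939, -0.252728, 0.173859, -0.547465,
                     0.796051, 0.0464980, -1.71797]),
    "wf": (62.4872, [-0.248226, -0.453237, 0.113274, -0.0464089,
                     -0.294440, -0.475891, -0.247323]),
}

def group_probabilities(measurements):
    """Convert baseline-category logits into probabilities for all four groups."""
    scores = {"wm": 0.0}  # the reference group has a logit of zero by definition
    for group, (constant, slopes) in LOGITS.items():
        scores[group] = constant + sum(
            b * measurements[name] for b, name in zip(slopes, PREDICTORS)
        )
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {g: math.exp(s - top) for g, s in scores.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

# Illustrative input: wm group means from Appendix O (assumed variable mapping)
example = {"TL": 61.060, "TW": 44.648, "NH": 18.522, "TRW": 36.743,
           "CASW": 33.382, "NASW": 31.423, "NASH": 29.462}
probabilities = group_probabilities(example)
predicted = max(probabilities, key=probabilities.get)
print(predicted, {g: round(p, 4) for g, p in probabilities.items()})
```

An individual would be placed in the group with the largest probability; for this illustrative input all three logits are negative, so the reference group wm is selected.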

Goodness-of-Fit Tests

Method     Chi-Square    DF      P
Pearson       349.564   657  1.000
Deviance      160.993   657  1.000

Appendix R: SAS Code for Model Selection

data talusreal;
  input spec age tlength twidth theight nlength nheight trocw trocl
        casw casl nasw nash gender race final;
  datalines;
DATA;

/* Stepwise binary logistic regression for sex */
proc logistic simple;
  model gender = tlength twidth theight nlength nheight trocw trocl
                 casw casl nasw nash
        / link=logit selection=stepwise SLE=.15 SLS=.10;
run;

/* Stepwise binary logistic regression for race */
proc logistic;
  model race = tlength twidth theight nlength nheight trocw trocl
               casw casl nasw nash
        / link=logit selection=stepwise SLE=.15 SLS=.10;
run;

/* Generalized logits for the four sex-with-race groups, all predictors */
proc catmod;
  response logits;
  direct tlength twidth theight nlength nheight trocw trocl
         casw casl nasw nash;
  model final = tlength twidth theight nlength nheight trocw trocl
                casw casl nasw nash;
run;

/* Stepwise generalized logit model for the four sex-with-race groups */
proc logistic simple;
  model final = tlength twidth theight nlength nheight trocw trocl
                casw casl nasw nash
        / link=glogit selection=stepwise SLE=.15 SLS=.10
          aggregate scale=none;
run;


LITERATURE CITED

Agresti, A. An Introduction to Categorical Data Analysis (2nd edition). Hoboken, New Jersey: John Wiley & Sons, Inc.; 2007.

Bass, W.M. Human Osteology: A Laboratory and Field Manual (3rd edition). Columbia, Missouri: Missouri Archaeological Society; 1995.

Bidmos, M.A., and Asala, S.A. (2004) Sexual dimorphism of the calcaneus of South African Blacks. Journal of Forensic Sciences 49(3):446-450.

Bidmos, M.A., and Dayal, M.R. (2004) Further evidence to show population specificity of discriminant function equations for sex determination using the talus of South African Blacks. Journal of Forensic Sciences 49(6):1165-1170.

Byers, S.N. Introduction to Forensic Anthropology (3rd edition). Boston, Massachusetts: Pearson Education, Inc; 2008.

Cox, D.R., and Snell, E.J. (1989) The Analysis of Binary Data (2nd edition). London: Chapman and Hall.

De Vito, C., and Saunders, S.R., (1990) A discriminant function analysis of deciduous teeth to determine sex. Journal of Forensic Sciences 35:845-858.

Giles, E., and Elliot, O. (1963) Sex determination by discriminant function analysis of crania. American Journal of Physical Anthropology 21:53-68.

Graw, M. (2001) Significance of the classical morphological criteria for identifying gender using recent skulls. Forensic Science Communications 3(1):1-8.

Holland, T. D. (1986) Sex determination of fragmentary crania by analysis of the cranial base. American Journal of Physical Anthropology 70:203-208.

Introduction to SAS. UCLA: Academic Technology Services, Statistical Consulting Group. From http://ats.ucla.edu/stat/sas/notes2/ (accessed November 24, 2007).

Introna, F., Jr., Di Vella, G., Campobasso, C.P., and Dragone, M. (1997) Sex determination by discriminant analysis of calcanei measurements. Journal of Forensic Sciences 42:725-728.

Kutner, M., Nachtsheim, C., Neter, J., and Li, W. Applied Linear Statistical Models. McGraw-Hill/Irwin; 2004.

McKern, T.W., and Stewart, T.D. (1957) Skeletal age changes in young American males. Natick, Massachusetts: Quartermaster Research and Development Command Technical Report EP-45.

Nagelkerke, N.J.D. (1991) A note on a general definition of the coefficient of determination. Biometrika 78:691-692.

Steele, D.G. (1976) The estimation of sex on the basis of the talus and calcaneus. American Journal of Physical Anthropology 45:581-588.


Trotter, M. Estimation of stature from intact long bones. In: T.D. Stewart (Ed.) Personal Identification in Mass Disasters. Washington, D.C.: Smithsonian Institution Press; 1970:71-83.

Trotter, M., and Gleser, G.C. (1958) A re-evaluation of estimation of stature based on measurements of stature taken during life and of long bones after death. American Journal of Physical Anthropology 16:79-123.

Wilkie, D. (2008) Pictorial Representation of Kendall's Rank Correlation Coefficient. Teaching Statistics 2:76-78.