TWOTRAN

This macro is designed to test, using randomization, whether or not the means for two independent samples are equal.

RUNNING THE MACRO

Calling statement
twotran c1 c2 ;
  nran k1 (999) ;
  differences c1 ;
  tstatistics c1.

Input
C1  Data for both groups
C2  Group indicator

C1 and C2 must both be columns containing only numerical data, and they must be of the same length. The column c2 should contain group markers; these should be any two distinct numerical values (for example, 1 and 2).
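The stacked layout described above can be illustrated with a short Python sketch. It is not part of the macro: the function name `split_by_group` and the sample values are hypothetical, and treating the smaller marker as the first group is an assumption made for the illustration.

```python
def split_by_group(data, groups):
    """Split a stacked data column into two groups using a numeric indicator column."""
    if len(data) != len(groups):
        raise ValueError("data and group columns must have the same length")
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("group column must contain exactly two distinct values")
    first, second = labels  # assumption: the smaller marker denotes the first group
    group1 = [x for x, g in zip(data, groups) if g == first]
    group2 = [x for x, g in zip(data, groups) if g == second]
    return group1, group2


# Hypothetical six-row worksheet: c1 holds the data, c2 the group markers.
c1 = [5.1, 4.8, 6.0, 5.5, 4.9, 6.2]
c2 = [1, 2, 1, 2, 2, 1]
group1, group2 = split_by_group(c1, c2)
print(group1)  # [5.1, 6.0, 6.2]
print(group2)  # [4.8, 5.5, 4.9]
```

Any two distinct numbers work as markers; only the pairing of each observation with its marker matters.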

Subcommands
nran         Number of randomizations used.
differences  Specify a column in which to store differences between simulated group means.
tstatistics  Specify a column in which to store t-statistics for differences between simulated group means.

Output Basic summary statistics (numbers of observations, group means & standard deviations) are given, along with the observed t-statistic and difference in sample means. Randomization p-values are given for both one-sided hypotheses, and for the two-sided hypothesis.

Speed of macro : FAST

ALTERNATIVE PROCEDURES

Other macros
This macro uses randomization, but two bootstrapping versions of the test are available (depending upon whether variances are pooled):
TWOTPOOLBOOT    Bootstrap test with pooling of variances
TWOTUNPOOLBOOT  Bootstrap test without pooling of variances

This macro is suitable when data for the two groups are contained in the same column, with a separate column denoting which group each observation belongs to. If data for the two groups are contained in separate columns, TWOSAMPLERAN should be used.

Standard procedures
twot [C1][C2].

This performs a two-sample t-test of the hypothesis that the mean of the data for the first group is equal to the mean of the data for the second group. The data are provided in c1, and group labels are provided in c2. Variances are not pooled, so this is appropriate if the variances for the two groups cannot be assumed to be equal.

twot [C1][C2];
  pooled.

This performs the same two-sample t-test, with the data in c1 and group labels in c2, but variances are pooled, so it is only appropriate if the variances for the two groups can be assumed to be equal.

TECHNICAL DETAILS

Null hypothesis : We test the null hypothesis that the mean for the first group is equal to the mean for the second group.

Randomization procedure : We fix the data value for each individual, and fix the size of the groups. We then randomize the allocation of individuals to groups, since under the null hypothesis this allocation will be random.

Test-statistic : We use the difference between the two sample group means as the test-statistic.
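The procedure above can be sketched in Python as follows. This is an illustration under stated assumptions, not the macro's source: the function name `randomization_test` and the sample data are hypothetical, and counting the observed allocation as one of the nran + 1 outcomes when forming p-values is an assumed convention.

```python
import random
from statistics import mean


def randomization_test(data, groups, nran=999, seed=None):
    """Two-sample randomization test using the difference in group means.

    data   : stacked observations for both groups
    groups : indicator column containing exactly two distinct values
    """
    first, second = sorted(set(groups))  # assumes exactly two markers
    x = [v for v, g in zip(data, groups) if g == first]
    y = [v for v, g in zip(data, groups) if g == second]
    observed = mean(x) - mean(y)

    rng = random.Random(seed)
    pooled = list(data)  # data values stay fixed
    n1 = len(x)          # group sizes stay fixed
    diffs = []
    for _ in range(nran):
        rng.shuffle(pooled)  # random reallocation of individuals to groups
        diffs.append(mean(pooled[:n1]) - mean(pooled[n1:]))

    # Assumption: the observed allocation counts as one of nran + 1 outcomes.
    total = nran + 1
    p_upper = (1 + sum(d >= observed for d in diffs)) / total
    p_lower = (1 + sum(d <= observed for d in diffs)) / total
    p_two = (1 + sum(abs(d) >= abs(observed) for d in diffs)) / total
    return {"observed": observed, "diffs": diffs,
            "p_upper": p_upper, "p_lower": p_lower, "p_two": p_two}


# Hypothetical data, not taken from the worked example.
data = [10, 11, 12, 13, 14, 15, 1, 2, 3, 4, 5, 6]
groups = [1] * 6 + [2] * 6
result = randomization_test(data, groups, nran=999, seed=1)
print(result["observed"])    # 9.0
print(len(result["diffs"]))  # 999
```

Under the assumed convention, nran = 999 yields p-values that are multiples of 0.001. The list of simulated differences plays the role of the column stored by the differences subcommand.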

REFERENCES

MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 6).

WORKED EXAMPLE FOR TWOTRAN

Name of dataset MANDIBLES

Description The data are mandible lengths (mm) for 10 male and 10 female golden jackals.

Our source MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Original source HIGHAM, C.F.W., KIJNGAM, A. & MANLY, B.F.J. (1980), An analysis of prehistoric canid remains from Thailand. Journal of Archaeological Science, 7, pp. 149-165.

The data
Male (group 1)    120 107 110 116 114 111 113 117 114 112
Female (group 2)  110 111 107 108 110 105 107 106 111 111

The worksheet
C1  Mandible lengths for both sexes (column named Data)
C2  Group indicator: 1 = male, 2 = female (column named Group)

Aims of analysis To investigate whether mandible lengths are different for males and females.

Standard procedure (without pooling)

MTB > Retrieve "N:\resampling\Examples\Mandibles.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Mandibles.MTW
# Worksheet was saved on 28/08/01 11:00:13

Results for: Mandibles.MTW

MTB > twot c1 c2

Two-Sample T-Test and CI: Data, Group

Two-sample T for Data

Group   N    Mean  StDev  SE Mean
1      10  113.40   3.72     1.2
2      10  108.60   2.27     0.72

Difference = mu (1) - mu (2)
Estimate for difference: 4.80
95% CI for difference: (1.85, 7.75)
T-Test of difference = 0 (vs not =): T-Value = 3.48 P-Value = 0.004 DF = 14

Standard procedure (with pooling)

MTB > twot c1 c2 ;
SUBC> pooled.

Two-Sample T-Test and CI: Data, Group

Two-sample T for Data

Group   N    Mean  StDev  SE Mean
1      10  113.40   3.72     1.2
2      10  108.60   2.27     0.72

Difference = mu (1) - mu (2)
Estimate for difference: 4.80
95% CI for difference: (1.91, 7.69)
T-Test of difference = 0 (vs not =): T-Value = 3.48 P-Value = 0.003 DF = 18
Both use Pooled StDev = 3.08
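The T-values, degrees of freedom and pooled standard deviation reported by the two standard procedures can be checked from the data with a short Python sketch. It uses the usual unpooled (Welch) and pooled formulas; that the reported DF = 14 is the unpooled value truncated to an integer is an inference from the numbers, not something the output states.

```python
from math import floor, sqrt
from statistics import mean, variance

male = [120, 107, 110, 116, 114, 111, 113, 117, 114, 112]
female = [110, 111, 107, 108, 110, 105, 107, 106, 111, 111]

n1, n2 = len(male), len(female)
v1, v2 = variance(male), variance(female)  # sample variances
diff = mean(male) - mean(female)

# Without pooling: separate variance estimates, approximate degrees of freedom.
se_unpooled = sqrt(v1 / n1 + v2 / n2)
t_unpooled = diff / se_unpooled
df_unpooled = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)

# With pooling: one common variance estimate, n1 + n2 - 2 degrees of freedom.
sp = sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
t_pooled = diff / (sp * sqrt(1 / n1 + 1 / n2))
df_pooled = n1 + n2 - 2

print(f"difference in means  {diff:.2f}")         # 4.80
print(f"unpooled t           {t_unpooled:.2f}")   # 3.48
print(f"unpooled df          {df_unpooled:.2f}")  # 14.89
print(f"pooled StDev         {sp:.2f}")           # 3.08
print(f"pooled t             {t_pooled:.2f}")     # 3.48
print(f"pooled df            {df_pooled}")        # 18
```

The two T-values coincide because the groups have equal sizes; only the degrees of freedom, and hence the p-values and confidence intervals, differ.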

Randomization procedure

MTB > Retrieve "N:\resampling\Examples\Mandibles.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Mandibles.MTW
# Worksheet was saved on 05/07/01 15:04:34

Results for: Mandibles.MTW

MTB > % N:\resampling\library\twotran c1 c2 ;
SUBC> nran 999 ;
SUBC> differences c4 ;
SUBC> tstatistics c6.
Executing from file: N:\resampling\library\twotran.MAC

Two-sample randomization test

Data Display (WRITE)

Number of observations in group 1    10
Number of observations in group 2    10
Data mean for group 1                113.4
Data mean for group 2                108.6
Standard deviation for group 1       3.718
Standard deviation for group 2       2.271

Observed difference in means         4.800
Observed t-statistic                 3.48

Number of randomization samples      999
P-value for one-sided test with alternative: mean(group 1) > mean(group2)    0.0020
P-value for one-sided test with alternative: mean(group 2) < mean(group1)    1.0000
P-value for two-sided test                                                   0.0040

Modified worksheet
C4  A column containing 999 differences between sample means, one for each randomized dataset
C6  A column containing 999 t-statistics for differences, one for each randomized dataset

Discussion All methods agree that there is clear evidence of a difference in mandible lengths between sexes. Two-sided p-values are 0.004 for standard methods (without pooling) and for randomization, and 0.003 for standard methods (with pooling). Looking at the data, we see that males (group 1) have longer mandibles.
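With only 20 observations, the randomization p-value can also be compared against an exact calculation that enumerates every way of splitting the data into two groups of 10. The Python sketch below is an illustration only and is not something the macro does; the macro samples 999 random allocations, which estimates the same quantity. Integer sums are used in place of means to avoid rounding issues, which is valid here because the group sizes are equal.

```python
from itertools import combinations

male = [120, 107, 110, 116, 114, 111, 113, 117, 114, 112]
female = [110, 111, 107, 108, 110, 105, 107, 106, 111, 111]

pooled = male + female
n1 = len(male)
total = sum(pooled)

# With equal group sizes, the difference in means equals (2 * S1 - total) / n1,
# where S1 is the sum for group 1, so allocations can be compared via 2 * S1 - total.
observed_dev = 2 * sum(male) - total

n_splits = 0
count_upper = 0  # allocations with a difference at least as large as observed
count_two = 0    # allocations at least as extreme in either direction
for group1 in combinations(pooled, n1):  # positions are distinct, so ties are kept
    dev = 2 * sum(group1) - total
    n_splits += 1
    if dev >= observed_dev:
        count_upper += 1
    if abs(dev) >= abs(observed_dev):
        count_two += 1

p_upper_exact = count_upper / n_splits
p_two_exact = count_two / n_splits
print(n_splits)  # 184756
print(p_upper_exact, p_two_exact)
```

For larger samples the number of allocations grows too quickly to enumerate, which is why random sampling of allocations is used in practice.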
