LPM and Logit Example

PROC IMPORT OUT=WORK.woodwc
    DATAFILE="D:\data\logitex.xls"
    DBMS=EXCEL REPLACE;
  RANGE="data";
  GETNAMES=YES;
  MIXED=NO;
  SCANTEXT=YES;
  USEDATE=YES;
  SCANTIME=YES;
RUN;

DATA logitex;
  SET WORK.woodwc;
RUN;

PROC SORT DATA=logitex;
  BY GPA;
RUN;

PROC REG DATA=logitex;
  TITLE 'OLS Estimation of Personalized Instruction Model';
  MODEL GRADE = GPA TUCE PSI;
RUN;

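COMMENT 'What follows is Weighted Least Squares. Do not use this code unless you are doing WLS';

/* PROC REG is interactive, so this OUTPUT statement still applies to the OLS model above. */
OUTPUT OUT=RESFILE PREDICTED=YHAT;
RUN;

DATA TWO;
  MERGE logitex RESFILE;
  BY GPA;
  /* Keep fitted probabilities strictly inside (0,1). The 0.001 floor matches the
     printed output; the 0.999 cap for YHAT >= 1 is assumed by symmetry. */
  YHATC = (YHAT*(0<YHAT<1)) + (0.999*(YHAT>=1)) + (0.001*(YHAT<=0));
  W = SQRT(YHATC*(1-YHATC));   /* estimated error standard deviation in the LPM */
  RECIPW = 1/W;                /* transformed intercept column */
  GRADEW = GRADE/W;
  GPAW = GPA/W;
  TUCEW = TUCE/W;
  PSIW = PSI/W;
RUN;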

PROC PRINT; RUN;

PROC REG DATA=TWO;
  TITLE 'Weighted Least Squares Estimation of Linear Probability Model';
  MODEL GRADEW = RECIPW GPAW TUCEW PSIW / NOINT;
RUN;

COMMENT 'End Weighted Least Squares code';

PROC LOGISTIC DATA=TWO DESCENDING;
  TITLE 'Logit Estimation of Personalized Instruction Model';
  MODEL GRADE = GPA TUCE PSI / CTABLE PPROB=0.5;
RUN;

COMMENT 'PROC QLIM is a promising new procedure for getting marginal effects';

PROC QLIM DATA=TWO;
  MODEL GRADE = GPA TUCE PSI / DISCRETE(D=LOGIT);
  OUTPUT OUT=meffects MARGINAL;
RUN;

PROC MEANS DATA=meffects;
  VAR meff:;
RUN;

OLS Estimation of Personalized Instruction Model

The REG Procedure
Model: MODEL1
Dependent Variable: GRADE GRADE

Number of Observations Read    32
Number of Observations Used    32

Analysis of Variance

                             Sum of        Mean
Source             DF       Squares      Square    F Value    Pr > F
Model               3       3.00228     1.00076       6.65    0.0016
Error              28       4.21647     0.15059
Corrected Total    31       7.21875

Root MSE            0.38806    R-Square    0.4159
Dependent Mean      0.34375    Adj R-Sq    0.3533
Coeff Var         112.88935

Parameter Estimates

                                Parameter    Standard
Variable     Label        DF     Estimate       Error    t Value    Pr > |t|
Intercept    Intercept     1     -1.49802     0.52389      -2.86      0.0079
GPA          GPA           1      0.46385     0.16196       2.86      0.0078
TUCE         TUCE          1      0.01050     0.01948       0.54      0.5944
PSI          PSI           1      0.37855     0.13917       2.72      0.0111

Obs     GPA    TUCE    PSI    GRADE        YHAT      YHATC          W
  1    2.06      22      1        0     0.06696    0.06696    0.24996
  2    2.39      19      1        1     0.18855    0.18855    0.39115
  3    2.63      20      0        0    -0.06818    0.00100    0.03161
  4    2.66      20      0        0    -0.05427    0.00100    0.03161
  5    2.67      24      1        0     0.37090    0.37090    0.48305
  6    2.74      19      0        0    -0.02766    0.00100    0.03161
  7    2.75      25      0        0     0.03995    0.03995    0.19585
  8    2.76      17      0        0    -0.03937    0.00100    0.03161
  9    2.83      19      0        0     0.01409    0.01409    0.11786
 10    2.83      27      1        1     0.47661    0.47661    0.49945
 11    2.86      17      0        0     0.00702    0.00702    0.08347
 12    2.87      21      0        0     0.05363    0.05363    0.22530
 13    2.89      22      0        0     0.07341    0.07341    0.26080
 14    2.89      14      1        0     0.36800    0.36800    0.48226
 15    2.92      12      0        0    -0.01763    0.00100    0.03161
 16    3.03      25      0        0     0.16983    0.16983    0.37548
 17    3.10      21      1        0     0.53888    0.53888    0.49849
 18    3.12      23      1        0     0.56914    0.56914    0.49520
 19    3.16      25      1        1     0.60869    0.60869    0.48804
 20    3.26      25      0        1     0.27652    0.27652    0.44728
 21    3.28      24      0        0     0.27530    0.27530    0.44666
 22    3.32      23      0        0     0.28336    0.28336    0.45063
 23    3.39      17      1        1     0.63141    0.63141    0.48242
 24    3.51      26      1        0     0.78153    0.78153    0.41321
 25    3.53      26      0        0     0.41225    0.41225    0.49224
 26    3.54      24      1        1     0.77446    0.77446    0.41794
 27    3.57      23      0        0     0.39932    0.39932    0.48976

Weighted Least Squares Estimation of Linear Probability Model

The REG Procedure
Model: MODEL1
Dependent Variable: GRADEW

NOTE: No intercept in model. R-Square is redefined.

Analysis of Variance

                               Sum of        Mean
Source               DF       Squares      Square    F Value    Pr > F
Model                 4      74.64082    18.66020      22.98    <.0001
Error                28      22.73882     0.81210
Uncorrected Total    32      97.37964

Root MSE           0.90117    R-Square    0.7665
Dependent Mean     0.91932    Adj R-Sq    0.7331
Coeff Var         98.02486

Parameter Estimates

                    Parameter    Standard
Variable    DF       Estimate       Error    t Value    Pr > |t|
RECIPW       1       -1.30873     0.28849      -4.54      <.0001
GPAW         1        0.39817     0.08783       4.53      <.0001
TUCEW        1        0.01216     0.00454       2.68      0.0123
PSIW         1        0.38782     0.10518       3.69      0.0010

Logit Estimation of Personalized Instruction Model

The LOGISTIC Procedure

Model Information

Data Set                     WORK.TWO
Response Variable            GRADE        GRADE
Number of Response Levels    2
Model                        binary logit
Optimization Technique       Fisher's scoring

Response Profile

Ordered                   Total
  Value    GRADE      Frequency
      1    1                 11
      2    0                 21

Probability modeled is GRADE=1.

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

                            Intercept
              Intercept           and
Criterion          Only    Covariates
AIC              43.183        33.779
SC               44.649        39.642
-2 Log L         41.183        25.779

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio       15.4042     3        0.0015
Score                  13.3088     3        0.0040
Wald                    8.3762     3        0.0388

Analysis of Maximum Likelihood Estimates

                               Standard          Wald
Parameter    DF    Estimate       Error    Chi-Square    Pr > ChiSq
Intercept     1    -13.0204      4.9310        6.9723        0.0083
GPA           1      2.8259      1.2629        5.0072        0.0252
TUCE          1      0.0951      0.1415        0.4518        0.5015
PSI           1      2.3785      1.0645        4.9925        0.0255

Odds Ratio Estimates

             Point           95% Wald
Effect    Estimate      Confidence Limits
GPA         16.877      1.420     200.567
TUCE         1.100      0.833       1.451
PSI         10.789      1.339      86.917
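The odds ratios above are the exponentiated logit coefficients, and the Wald limits are the exponentiated ends of the coefficient's confidence interval. A minimal sketch of that arithmetic follows. The variable names are hypothetical, the coefficients are copied (rounded) from the Analysis of Maximum Likelihood Estimates table, and 1.96 is the usual normal critical value assumed for a 95% interval.

```sas
/* Sketch only: rebuild the odds ratio estimates from the rounded logit output. */
DATA _NULL_;
  OR_GPA  = EXP(2.8259);               /* odds ratio for a one-point rise in GPA */
  OR_TUCE = EXP(0.0951);
  OR_PSI  = EXP(2.3785);
  /* 95% Wald limits for GPA: exponentiate estimate -/+ 1.96 standard errors */
  LO_GPA  = EXP(2.8259 - 1.96*1.2629);
  HI_GPA  = EXP(2.8259 + 1.96*1.2629);
  PUT OR_GPA= OR_TUCE= OR_PSI= LO_GPA= HI_GPA=;
RUN;
```

Because the inputs are rounded to four decimals, the values written to the log should agree with the table only up to rounding.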

Association of Predicted Probabilities and Observed Responses

Percent Concordant     88.3    Somers' D    0.771
Percent Discordant     11.3    Gamma        0.774
Percent Tied            0.4    Tau-a        0.359
Pairs                   231    c            0.885

Classification Table

             Correct          Incorrect                  Percentages
 Prob              Non-             Non-              Sensi-    Speci-    False    False
Level    Event    Event    Event    Event    Correct  tivity    ficity      POS      NEG
0.500        6       18        3        5       75.0    54.5      85.7     33.3     21.7
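The percentages in the classification table follow directly from its four counts. As a worked check, using only the counts printed above (11 observed events and 21 observed non-events, per the response profile):

```latex
\begin{align*}
\text{Sensitivity} &= \frac{6}{6+5} = \frac{6}{11} \approx 54.5\% \\
\text{Specificity} &= \frac{18}{18+3} = \frac{18}{21} \approx 85.7\% \\
\text{False POS}   &= \frac{3}{6+3} = \frac{3}{9} \approx 33.3\% \\
\text{False NEG}   &= \frac{5}{18+5} = \frac{5}{23} \approx 21.7\%
\end{align*}
```

Here 6 and 18 are the correctly classified events and non-events, 3 is the number of non-events classified as events, and 5 is the number of events classified as non-events at the 0.5 cutoff.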

The QLIM Procedure

Discrete Response Profile of GRADE

Index    Value    Frequency    Percent
    1        0           21      65.63
    2        1           11      34.38

Model Fit Summary

Number of Endogenous Variables              1
Endogenous Variable                     GRADE
Number of Observations                     32
Log Likelihood                      -12.88963
Maximum Absolute Gradient          3.82282E-6
Number of Iterations                       17
Optimization Method              Quasi-Newton
AIC                                  33.77927
Schwarz Criterion                    39.64221

Goodness-of-Fit Measures

Measure                   Value    Formula
Likelihood Ratio (R)     15.404    2 * (LogL - LogL0)
Upper Bound of R (U)     41.183    - 2 * LogL0
Aldrich-Nelson            0.325    R / (R+N)
Cragg-Uhler 1            0.3821    1 - exp(-R/N)
Cragg-Uhler 2            0.5278    (1-exp(-R/N)) / (1-exp(-U/N))
Estrella                 0.4528    1 - (1-R/U)^(U/N)
Adjusted Estrella        0.2251    1 - ((LogL-K)/LogL0)^(-2/N*LogL0)
McFadden's LRI            0.374    R / U
Veall-Zimmermann         0.5774    (R * (U+N)) / (U * (R+N))
McKelvey-Zavoina         0.7915
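Several of these measures can be reproduced by hand from the formulas QLIM prints, using N = 32 and the -2 Log L values reported by PROC LOGISTIC (41.183 for the intercept-only model, 25.779 with covariates). The arithmetic below is a sketch using those rounded figures:

```latex
\begin{align*}
R &= 2(\log L - \log L_0) = 41.183 - 25.779 = 15.404 \\
U &= -2 \log L_0 = 41.183 \\
\text{McFadden's LRI} &= \frac{R}{U} = \frac{15.404}{41.183} \approx 0.374 \\
\text{Aldrich-Nelson} &= \frac{R}{R+N} = \frac{15.404}{47.404} \approx 0.325 \\
\text{Cragg-Uhler 1} &= 1 - \exp(-R/N) = 1 - \exp(-0.4814) \approx 0.382
\end{align*}
```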

N = # of observations, K = # of regressors

Algorithm converged.

Parameter Estimates

                                  Standard                 Approx
Parameter    DF      Estimate        Error    t Value    Pr > |t|
Intercept     1    -13.021347     4.931350      -2.64      0.0083
GPA           1      2.826113     1.262912       2.24      0.0252
TUCE          1      0.095158     0.141555       0.67      0.5014
PSI           1      2.378688     1.064557       2.23      0.0255

The MEANS Procedure

Variable        Label                                                      N          Mean
------------------------------------------------------------------------------------------
Meff_P1_GPA     Marginal effect of GPA on the probability of GRADE=1      32    -0.3625809
Meff_P2_GPA     Marginal effect of GPA on the probability of GRADE=2      32     0.3625809
Meff_P1_TUCE    Marginal effect of TUCE on the probability of GRADE=1     32    -0.0122084
Meff_P2_TUCE    Marginal effect of TUCE on the probability of GRADE=2     32     0.0122084
Meff_P1_PSI     Marginal effect of PSI on the probability of GRADE=1      32    -0.3051777
Meff_P2_PSI     Marginal effect of PSI on the probability of GRADE=2      32     0.3051777
------------------------------------------------------------------------------------------

Variable          Std Dev       Minimum       Maximum
-----------------------------------------------------
Meff_P1_GPA     0.2354968    -0.7055222    -0.0674638
Meff_P2_GPA     0.2354968     0.0674638     0.7055222
Meff_P1_TUCE    0.0079294    -0.0237555    -0.0022716
Meff_P2_TUCE    0.0079294     0.0022716     0.0237555
Meff_P1_PSI     0.1982133    -0.5938252    -0.0567830
Meff_P2_PSI     0.1982133     0.0567830     0.5938252
-----------------------------------------------------

In the logit analysis, gradepoint average and tutoring both had positive effects on student grades.

The higher a student’s overall gradepoint average, the more likely the student was to receive an A in intermediate macroeconomics. Averaged over the 32 students in the sample, an increase of one entire gradepoint was associated with a 36.3 percentage point increase in the probability of receiving an A. The coefficient on gradepoint average was statistically significant (p = 0.0252).

Students who received tutoring were also more likely to receive A grades in intermediate macroeconomics. Averaged over the sample, receiving tutoring was associated with a 30.5 percentage point increase in the probability of receiving an A. The coefficient on the tutoring dummy variable was statistically significant (p = 0.0255).
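The marginal effects quoted here are the means of the Meff_P2_ variables from PROC QLIM. Judging from the Discrete Response Profile, index 2 corresponds to GRADE = 1, so the "GRADE=2" labels refer to receiving an A. For a logit, each observation's marginal effect is the coefficient times p(1 - p), where p is that student's fitted probability. A minimal sketch of the same calculation done by hand follows. The data set names PRED and AME and the variables PHAT, DENS and ME_ are hypothetical, the coefficients are copied (rounded) from the LOGISTIC output, and data set TWO is assumed to exist (substitute logitex if the WLS code was skipped).

```sas
/* Sketch only: average marginal effects computed by hand from the logit fit. */
PROC LOGISTIC DATA=TWO DESCENDING;
  MODEL GRADE = GPA TUCE PSI;
  OUTPUT OUT=PRED PREDICTED=PHAT;     /* fitted P(GRADE=1) for each student */
RUN;

DATA AME;
  SET PRED;
  DENS    = PHAT*(1 - PHAT);          /* logistic density at the fitted index */
  ME_GPA  = 2.8259*DENS;              /* coefficient times density, per student */
  ME_TUCE = 0.0951*DENS;
  ME_PSI  = 2.3785*DENS;
RUN;

PROC MEANS DATA=AME MEAN STD MIN MAX;
  VAR ME_GPA ME_TUCE ME_PSI;          /* means are the average marginal effects */
RUN;
```

Since p(1 - p) never exceeds 0.25, no student's GPA effect can exceed about 2.826 × 0.25 ≈ 0.71, which is consistent with the maximum of 0.7055 in the MEANS output. PSI is a dummy variable, so its derivative-based effect is only an approximation to the change in probability from switching tutoring on.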

The coefficient on the TUCE score was not statistically significant (p = 0.5015).

The overall equation fit well, with an Rp² equal to 0.75 and a highly statistically significant likelihood ratio test (p = 0.0015, χ² = 15.40).
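Both fit figures can be traced back to the output, assuming Rp² here denotes the share of observations classified correctly at the 0.5 cutoff (the "Correct" percentage in the classification table). A worked check using the printed values:

```latex
\begin{align*}
\chi^2_{LR} &= (-2 \log L_0) - (-2 \log L) = 41.183 - 25.779 = 15.404, \qquad df = 3 \\
R_p^2 &= \frac{\text{correct events} + \text{correct non-events}}{N} = \frac{6 + 18}{32} = 0.75
\end{align*}
```

The three degrees of freedom correspond to the three slope coefficients (GPA, TUCE, PSI) restricted to zero under the null hypothesis.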