LPM and Logit Example
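/* Read the named range "data" from the Excel workbook into WORK.woodwc. */
PROC IMPORT OUT=WORK.woodwc
    DATAFILE="D:\data\logitex.xls"
    DBMS=EXCEL REPLACE;
  RANGE="data";
  GETNAMES=YES;
  MIXED=NO;
  SCANTEXT=YES;
  USEDATE=YES;
  SCANTIME=YES;
RUN;

DATA logitex;
  SET WORK.woodwc;
RUN;

/* Sort by GPA so the printed listing below is ordered by GPA. */
PROC SORT DATA=logitex;
  BY GPA;
RUN;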
PROC REG DATA=logitex;
  TITLE 'OLS Estimation of Personalized Instruction Model';
  MODEL GRADE = GPA TUCE PSI;
  /* Save the fitted values; they are needed only by the WLS code below. */
  OUTPUT OUT=RESFILE PREDICTED=YHAT;
RUN;
QUIT;

/* What follows is Weighted Least Squares. Do not use this code unless you are doing WLS. */
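DATA TWO;
  /* RESFILE already holds the original variables plus YHAT, so no merge is needed. */
  SET RESFILE;
  /* Clip fitted values into (0,1) so the variance estimate below stays positive. */
  YHATC = (YHAT*(0<YHAT<1)) + (0.999*(YHAT>=1)) + (0.001*(YHAT<=0));
  /* W is the estimated standard deviation of the LPM error term. */
  W = SQRT(YHATC*(1-YHATC));
  /* Divide every variable, including the constant, by W. */
  RECIPW = 1/W;
  GRADEW = GRADE/W;
  GPAW = GPA/W;
  TUCEW = TUCE/W;
  PSIW = PSI/W;
RUN;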
PROC PRINT;
RUN;
PROC REG DATA=TWO;
  TITLE 'Weighted Least Squares Estimation of Linear Probability Model';
  /* RECIPW replaces the intercept, hence NOINT. */
  MODEL GRADEW = RECIPW GPAW TUCEW PSIW / NOINT;
RUN;
QUIT;
/* End Weighted Least Squares code */
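PROC LOGISTIC DATA=TWO DESCENDING;   /* DESCENDING models Pr(GRADE=1) */
  TITLE 'Logit Estimation of Personalized Instruction Model';
  MODEL GRADE = GPA TUCE PSI / CTABLE PPROB=0.5;
RUN;

/* PROC QLIM is a promising newer procedure for getting marginal effects. */
PROC QLIM DATA=TWO;
  MODEL GRADE = GPA TUCE PSI / DISCRETE(D=LOGIT);
  OUTPUT OUT=meffects MARGINAL;      /* marginal effects for each observation */
RUN;

/* Average the per-observation marginal effects (variables prefixed Meff). */
PROC MEANS DATA=meffects;
  VAR meff:;
RUN;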
OLS Estimation of Personalized Instruction Model
The REG Procedure
Model: MODEL1
Dependent Variable: GRADE GRADE
Number of Observations Read 32
Number of Observations Used 32
Analysis of Variance

                              Sum of        Mean
Source             DF        Squares      Square    F Value    Pr > F
Model               3        3.00228     1.00076       6.65    0.0016
Error              28        4.21647     0.15059
Corrected Total    31        7.21875

Root MSE            0.38806    R-Square    0.4159
Dependent Mean      0.34375    Adj R-Sq    0.3533
Coeff Var         112.88935

Parameter Estimates

                               Parameter    Standard
Variable     Label        DF    Estimate       Error    t Value    Pr > |t|
Intercept    Intercept     1    -1.49802     0.52389      -2.86      0.0079
GPA          GPA           1     0.46385     0.16196       2.86      0.0078
TUCE         TUCE          1     0.01050     0.01948       0.54      0.5944
PSI          PSI           1     0.37855     0.13917       2.72      0.0111
OLS Estimation of Personalized Instruction Model

Obs     GPA    TUCE    PSI    GRADE        YHAT      YHATC          W
  1    2.06      22      1        0     0.06696    0.06696    0.24996
  2    2.39      19      1        1     0.18855    0.18855    0.39115
  3    2.63      20      0        0    -0.06818    0.00100    0.03161
  4    2.66      20      0        0    -0.05427    0.00100    0.03161
  5    2.67      24      1        0     0.37090    0.37090    0.48305
  6    2.74      19      0        0    -0.02766    0.00100    0.03161
  7    2.75      25      0        0     0.03995    0.03995    0.19585
  8    2.76      17      0        0    -0.03937    0.00100    0.03161
  9    2.83      19      0        0     0.01409    0.01409    0.11786
 10    2.83      27      1        1     0.47661    0.47661    0.49945
 11    2.86      17      0        0     0.00702    0.00702    0.08347
 12    2.87      21      0        0     0.05363    0.05363    0.22530
 13    2.89      22      0        0     0.07341    0.07341    0.26080
 14    2.89      14      1        0     0.36800    0.36800    0.48226
 15    2.92      12      0        0    -0.01763    0.00100    0.03161
 16    3.03      25      0        0     0.16983    0.16983    0.37548
 17    3.10      21      1        0     0.53888    0.53888    0.49849
 18    3.12      23      1        0     0.56914    0.56914    0.49520
 19    3.16      25      1        1     0.60869    0.60869    0.48804
 20    3.26      25      0        1     0.27652    0.27652    0.44728
 21    3.28      24      0        0     0.27530    0.27530    0.44666
 22    3.32      23      0        0     0.28336    0.28336    0.45063
 23    3.39      17      1        1     0.63141    0.63141    0.48242
 24    3.51      26      1        0     0.78153    0.78153    0.41321
 25    3.53      26      0        0     0.41225    0.41225    0.49224
 26    3.54      24      1        1     0.77446    0.77446    0.41794
 27    3.57      23      0        0     0.39932    0.39932    0.48976
Weighted Least Squares Estimation of Linear Probability Model
The REG Procedure
Model: MODEL1
Dependent Variable: GRADEW
Number of Observations Read 32
Number of Observations Used 32
NOTE: No intercept in model. R-Square is redefined.
Analysis of Variance

                                Sum of        Mean
Source               DF        Squares      Square    F Value    Pr > F
Model                 4       74.64082    18.66020      22.98    <.0001
Error                28       22.73882     0.81210
Uncorrected Total    32       97.37964

Root MSE            0.90117    R-Square    0.7665
Dependent Mean      0.91932    Adj R-Sq    0.7331
Coeff Var          98.02486

Parameter Estimates

                  Parameter    Standard
Variable    DF     Estimate       Error    t Value    Pr > |t|
RECIPW       1     -1.30873     0.28849      -4.54      <.0001
GPAW         1      0.39817     0.08783       4.53      <.0001
TUCEW        1      0.01216     0.00454       2.68      0.0123
PSIW         1      0.38782     0.10518       3.69      0.0010
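dividing every variable (including the constant) by W is one way to run WLS. As a cross-check, the same estimates should also come from the WEIGHT statement in PROC REG applied to the untransformed variables, with each weight set to the reciprocal of the estimated error variance. The following is a minimal sketch, assuming the data set TWO and the clipped fitted values YHATC from the program above still exist; the names TWOW and WT are introduced here only for illustration.

```sas
/* Sketch: WLS through the WEIGHT statement instead of hand-transformed variables. */
/* Assumes WORK.TWO (with YHATC) exists; TWOW and WT are illustrative names.       */
DATA TWOW;
  SET TWO;
  WT = 1 / (YHATC*(1-YHATC));   /* weight = 1 / estimated error variance = 1/W**2 */
RUN;

PROC REG DATA=TWOW;
  WEIGHT WT;
  MODEL GRADE = GPA TUCE PSI;   /* the intercept here plays the role of RECIPW */
RUN;
QUIT;
```

The coefficient estimates and standard errors should agree with the RECIPW, GPAW, TUCEW, and PSIW rows above. The R-Square will not agree, because the transformed regression has no intercept and SAS redefines R-Square in that case.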
Logit Estimation of Personalized Instruction Model
The LOGISTIC Procedure
Model Information
Data Set WORK.TWO
Response Variable GRADE GRADE
Number of Response Levels 2
Model binary logit
Optimization Technique Fisher's scoring
Number of Observations Read 32
Number of Observations Used 32
Response Profile

Ordered                  Total
  Value     GRADE    Frequency
      1         1           11
      2         0           21

Probability modeled is GRADE=1.
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics

                             Intercept
              Intercept            and
Criterion          Only     Covariates
AIC              43.183         33.779
SC               44.649         39.642
-2 Log L         41.183         25.779

Testing Global Null Hypothesis: BETA=0

Test                 Chi-Square    DF    Pr > ChiSq
Likelihood Ratio        15.4042     3        0.0015
Score                   13.3088     3        0.0040
Wald                     8.3762     3        0.0388
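The likelihood ratio statistic is the drop in -2 Log L between the intercept-only model and the model with covariates. Its p-value comes from a chi-square distribution with 3 degrees of freedom, one per slope coefficient. The following is a minimal sketch of that arithmetic using the values printed above; the variable names are illustrative.

```sas
/* Sketch: reproduce the likelihood ratio test from the -2 Log L values in the listing. */
DATA _NULL_;
  NEG2LL0 = 41.183;                 /* -2 Log L, intercept only           */
  NEG2LL1 = 25.779;                 /* -2 Log L, intercept and covariates */
  LR      = NEG2LL0 - NEG2LL1;      /* likelihood ratio chi-square        */
  P       = 1 - PROBCHI(LR, 3);     /* upper-tail probability, 3 df       */
  PUT LR= 8.3 P= 8.4;               /* written to the SAS log             */
RUN;
```

The values written to the log should be close to the Likelihood Ratio row of the listing.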
Analysis of Maximum Likelihood Estimates

                               Standard          Wald
Parameter    DF    Estimate       Error    Chi-Square    Pr > ChiSq
Intercept     1    -13.0204      4.9310        6.9723        0.0083
GPA           1      2.8259      1.2629        5.0072        0.0252
TUCE          1      0.0951      0.1415        0.4518        0.5015
PSI           1      2.3785      1.0645        4.9925        0.0255

Odds Ratio Estimates

             Point           95% Wald
Effect    Estimate       Confidence Limits
GPA         16.877       1.420     200.567
TUCE         1.100       0.833       1.451
PSI         10.789       1.339      86.917
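Each odds ratio is the exponential of the corresponding logit coefficient, and the Wald limits are the exponentials of the coefficient plus or minus about 1.96 standard errors. The following is a minimal sketch of that calculation, with estimates and standard errors copied from the listing above; the array and variable names are illustrative.

```sas
/* Sketch: odds ratios and 95% Wald limits from the logit estimates in the listing. */
DATA _NULL_;
  ARRAY B{3}  _TEMPORARY_ (2.8259 0.0951 2.3785);       /* estimates        */
  ARRAY SE{3} _TEMPORARY_ (1.2629 0.1415 1.0645);       /* standard errors  */
  ARRAY NM{3} $ 4 _TEMPORARY_ ('GPA' 'TUCE' 'PSI');     /* effect names     */
  Z = PROBIT(0.975);                                    /* about 1.96       */
  DO I = 1 TO 3;
    NAME  = NM{I};
    ODDS  = EXP(B{I});                /* point estimate of the odds ratio */
    LOWER = EXP(B{I} - Z*SE{I});      /* lower 95% Wald limit             */
    UPPER = EXP(B{I} + Z*SE{I});      /* upper 95% Wald limit             */
    PUT NAME= ODDS= 8.3 LOWER= 8.3 UPPER= 8.3;
  END;
RUN;
```

The log output should agree with the Odds Ratio Estimates table up to rounding of the inputs.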
Association of Predicted Probabilities and Observed Responses

Percent Concordant    88.3    Somers' D    0.771
Percent Discordant    11.3    Gamma        0.774
Percent Tied           0.4    Tau-a        0.359
Pairs                  231    c            0.885

Classification Table

              Correct          Incorrect                    Percentages
 Prob              Non-              Non-               Sensi-    Speci-    False    False
Level    Event    Event    Event    Event    Correct    tivity    ficity      POS      NEG
0.500        6       18        3        5       75.0      54.5      85.7     33.3     21.7
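The percentages in the classification table follow directly from its four counts. The following is a minimal sketch of the arithmetic; the names TP, TN, FP, and FN (true and false positives and negatives) are introduced here and do not appear in the SAS output.

```sas
/* Sketch: rebuild the classification percentages from the four counts above. */
DATA _NULL_;
  TP = 6;     /* events correctly classified as events          */
  TN = 18;    /* non-events correctly classified as non-events  */
  FP = 3;     /* non-events incorrectly classified as events    */
  FN = 5;     /* events incorrectly classified as non-events    */
  CORRECT     = 100*(TP+TN)/(TP+TN+FP+FN);
  SENSITIVITY = 100*TP/(TP+FN);   /* share of actual events caught           */
  SPECIFICITY = 100*TN/(TN+FP);   /* share of actual non-events caught       */
  FALSE_POS   = 100*FP/(TP+FP);   /* share of predicted events that are wrong     */
  FALSE_NEG   = 100*FN/(TN+FN);   /* share of predicted non-events that are wrong */
  PUT CORRECT= 6.1 SENSITIVITY= 6.1 SPECIFICITY= 6.1;
  PUT FALSE_POS= 6.1 FALSE_NEG= 6.1;
RUN;
```

Note that the False POS and False NEG columns are computed relative to the number of predicted events and predicted non-events, not relative to the actual counts.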
The QLIM Procedure
Discrete Response Profile of GRADE

Index    Value    Frequency    Percent
    1        0           21      65.63
    2        1           11      34.38
Model Fit Summary
Number of Endogenous Variables 1
Endogenous Variable GRADE
Number of Observations 32
Log Likelihood -12.88963
Maximum Absolute Gradient 3.82282E-6
Number of Iterations 17
Optimization Method Quasi-Newton
AIC 33.77927
Schwarz Criterion 39.64221
Goodness-of-Fit Measures

Measure                  Value    Formula
Likelihood Ratio (R)    15.404    2 * (LogL - LogL0)
Upper Bound of R (U)    41.183    - 2 * LogL0
Aldrich-Nelson           0.325    R / (R+N)
Cragg-Uhler 1           0.3821    1 - exp(-R/N)
Cragg-Uhler 2           0.5278    (1-exp(-R/N)) / (1-exp(-U/N))
Estrella                0.4528    1 - (1-R/U)^(U/N)
Adjusted Estrella       0.2251    1 - ((LogL-K)/LogL0)^(-2/N*LogL0)
McFadden's LRI           0.374    R / U
Veall-Zimmermann        0.5774    (R * (U+N)) / (U * (R+N))
McKelvey-Zavoina        0.7915

N = # of observations, K = # of regressors
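Most of the goodness-of-fit measures above are simple functions of R, U, and N, so they can be recomputed from the first two rows of the table. The following is a minimal sketch that applies the formulas exactly as printed; the variable names are illustrative.

```sas
/* Sketch: recompute several pseudo R-square measures from R, U, and N in the listing. */
DATA _NULL_;
  R = 15.404;    /* likelihood ratio, 2*(LogL - LogL0) */
  U = 41.183;    /* upper bound of R, -2*LogL0         */
  N = 32;        /* number of observations             */
  ALDRICH_NELSON = R / (R+N);
  CRAGG_UHLER1   = 1 - EXP(-R/N);
  CRAGG_UHLER2   = CRAGG_UHLER1 / (1 - EXP(-U/N));
  ESTRELLA       = 1 - (1 - R/U)**(U/N);
  MCFADDEN       = R / U;
  VEALL_ZIMM     = (R*(U+N)) / (U*(R+N));
  PUT ALDRICH_NELSON= 8.4 CRAGG_UHLER1= 8.4 CRAGG_UHLER2= 8.4;
  PUT ESTRELLA= 8.4 MCFADDEN= 8.4 VEALL_ZIMM= 8.4;
RUN;
```

The results should match the table to the precision shown. The Adjusted Estrella and McKelvey-Zavoina measures are left out because they need quantities (K and the latent-variable variance) that this sketch does not take from the listing.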
Algorithm converged.
Parameter Estimates

                                   Standard                 Approx
Parameter    DF       Estimate        Error    t Value    Pr > |t|
Intercept     1     -13.021347     4.931350      -2.64      0.0083
GPA           1       2.826113     1.262912       2.24      0.0252
TUCE          1       0.095158     0.141555       0.67      0.5014
PSI           1       2.378688     1.064557       2.23      0.0255
The MEANS Procedure
Variable        Label
------------------------------------------------------------------------
Meff_P1_GPA     Marginal effect of GPA on the probability of GRADE=1
Meff_P2_GPA     Marginal effect of GPA on the probability of GRADE=2
Meff_P1_TUCE    Marginal effect of TUCE on the probability of GRADE=1
Meff_P2_TUCE    Marginal effect of TUCE on the probability of GRADE=2
Meff_P1_PSI     Marginal effect of PSI on the probability of GRADE=1
Meff_P2_PSI     Marginal effect of PSI on the probability of GRADE=2
------------------------------------------------------------------------

Variable         N          Mean       Std Dev       Minimum       Maximum
--------------------------------------------------------------------------
Meff_P1_GPA     32    -0.3625809     0.2354968    -0.7055222    -0.0674638
Meff_P2_GPA     32     0.3625809     0.2354968     0.0674638     0.7055222
Meff_P1_TUCE    32    -0.0122084     0.0079294    -0.0237555    -0.0022716
Meff_P2_TUCE    32     0.0122084     0.0079294     0.0022716     0.0237555
Meff_P1_PSI     32    -0.3051777     0.1982133    -0.5938252    -0.0567830
Meff_P2_PSI     32     0.3051777     0.1982133     0.0567830     0.5938252
--------------------------------------------------------------------------
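In the logit analysis, grade point average (GPA) and personalized instruction (PSI, referred to below as tutoring) both had positive, statistically significant effects on the probability of receiving an A. In the PROC MEANS output, the Meff_P2 rows are the ones to read: they refer to the second response level in the QLIM response profile, which is GRADE = 1.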
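The higher a student’s overall grade point average, the more likely the student was to receive an A in intermediate macroeconomics. Averaged over the 32 students in the sample (the mean of Meff_P2_GPA), a one-point increase in grade point average was associated with a probability of receiving an A roughly 36.3 percentage points higher. This figure is the sample average of the individual marginal effects, not the effect evaluated at the mean of the regressors, and because it is a derivative it is only a linear approximation for a change as large as a full grade point. The coefficient on grade point average was statistically significant (p = 0.0252).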
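Students who received tutoring were also more likely to receive A grades in intermediate macroeconomics. Averaged over the sample (the mean of Meff_P2_PSI), receiving tutoring was associated with a probability of receiving an A roughly 30.5 percentage points higher. PROC QLIM computes this as a derivative even though PSI is a 0/1 dummy, so it approximates, rather than equals, the difference in predicted probability between PSI = 1 and PSI = 0. The coefficient on the tutoring dummy variable was statistically significant (p = 0.0255).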
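The marginal effects reported by PROC MEANS are per-student derivatives averaged over the sample. For a logit model, the derivative of the predicted probability with respect to a regressor is the coefficient times p(1 - p), where p is the student's fitted probability, so the averages can be cross-checked without PROC QLIM. The following is a minimal sketch, assuming the data set TWO from the program above and the PROC LOGISTIC estimates in the listing; PHAT, AME, P, and the ME_ variables are names introduced here.

```sas
/* Sketch: average marginal effects for the logit model, computed by hand. */
/* Assumes WORK.TWO exists; PHAT, AME, P and ME_* are illustrative names.  */
PROC LOGISTIC DATA=TWO DESCENDING NOPRINT;
  MODEL GRADE = GPA TUCE PSI;
  OUTPUT OUT=PHAT PREDICTED=P;     /* P = fitted probability that GRADE=1 */
RUN;

DATA AME;
  SET PHAT;
  /* Coefficients copied from the PROC LOGISTIC listing. */
  ME_GPA  = 2.8259 * P * (1-P);
  ME_TUCE = 0.0951 * P * (1-P);
  ME_PSI  = 2.3785 * P * (1-P);
RUN;

PROC MEANS DATA=AME N MEAN STD MIN MAX;
  VAR ME_GPA ME_TUCE ME_PSI;
RUN;
```

The means should be close to the Meff_P2 rows of the PROC MEANS listing, with small differences from rounding of the coefficients.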
The coefficient on the TUCE score was not statistically significant (p = 0.5015).
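The overall equation fit well: the model classified 75 percent of the observations correctly at the 0.5 cutoff (R²p = 0.75, or 24 of 32 students in the classification table), and the likelihood ratio test was highly statistically significant (χ² = 15.40 with 3 degrees of freedom, p = 0.0015).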