Fitting Regression Models Containing Categorical Factors

Presented by Dr. Neil W. Polhemus

Virginia Piedmont

Type of grass affects population of wild birds
• Cool season grasses
• Warm season grasses

Outline
1. Linear models with a single predictor
2. Linear models with multiple predictors
3. Logistic regression
4. Nonlinear regression models
5. Life data regression

Example: Pricing of Diamonds
• The file diamonds.sgd contains information on 308 diamonds. (JSE Data Archive, Singfat Chu, National University of Singapore)

Models with a single categorical predictor
• Dependent variable: Y = Price
• Independent variable: X = Carat weight
• Categorical variable: C = Color (6 levels)

Coded Scatterplot

Transformation

Transformed Data

Statistical Model

$$Y = \beta_0 + \beta_1 I_1 + \beta_2 I_2 + \beta_3 I_3 + \beta_4 I_4 + \beta_5 I_5 + \beta_6 X + \beta_7 I_1 X + \beta_8 I_2 X + \beta_9 I_3 X + \beta_{10} I_4 X + \beta_{11} I_5 X$$

where $Y = \mathrm{LOG}(\text{Price})$ and $X = (\text{Carat weight})^{0.33}$

Color   I1   I2   I3   I4   I5
D        0    0    0    0    0
E        1    0    0    0    0
F        0    1    0    0    0
G        0    0    1    0    0
H        0    0    0    1    0
I        0    0    0    0    1

Model by Color
Color D: $Y = \beta_0 + \beta_6 X$
Color E: $Y = (\beta_0 + \beta_1) + (\beta_6 + \beta_7) X$
Color F: $Y = (\beta_0 + \beta_2) + (\beta_6 + \beta_8) X$
Color G: $Y = (\beta_0 + \beta_3) + (\beta_6 + \beta_9) X$
Color H: $Y = (\beta_0 + \beta_4) + (\beta_6 + \beta_{10}) X$
Color I: $Y = (\beta_0 + \beta_5) + (\beta_6 + \beta_{11}) X$

Comparison of Regression Lines

Fitted Regression Model

Test for Differences in the Slopes and Intercepts
• Conditional sums of squares
• Simplify the model using Analysis Options

Parallel Regression Lines

Fitted Model

Multiple Predictors
• When dealing with multiple categorical and quantitative predictors, we can use either of 2 procedures:
  – Multiple Regression (have to type in expressions for each indicator variable)
  – GLM: General Linear Model (automatically generates the indicator variables)
• Be careful: the indicator variables are set up differently in GLM (as well as the DOE procedures).
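The caution about differing indicator set-ups can be illustrated with a small sketch. It is not Statgraphics code and it does not use diamonds.sgd: the data are synthetic, and the function names (`reference_coding`, `effects_coding`, `design_matrix`, `lines_by_color`) are hypothetical helpers introduced only for this illustration. It fits the separate-lines model under two codings, one with color D as an all-zero baseline and one with the last level (I) coded -1 in every indicator, and then converts each coefficient vector back into one intercept and slope per color.

```python
import numpy as np

LEVELS = ["D", "E", "F", "G", "H", "I"]


def reference_coding(level):
    """Indicators I1..I5 with D as the baseline (all zeros)."""
    row = [0.0] * 5
    k = LEVELS.index(level)
    if k > 0:
        row[k - 1] = 1.0
    return row


def effects_coding(level):
    """Indicators I1..I5 with the last level (I) coded -1 throughout."""
    k = LEVELS.index(level)
    if k == 5:
        return [-1.0] * 5
    row = [0.0] * 5
    row[k] = 1.0
    return row


def design_matrix(colors, x, coder):
    """Columns: 1, I1..I5, X, I1*X..I5*X (coefficients b0..b11)."""
    ind = np.array([coder(c) for c in colors])
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones(len(x)), ind, x, ind * x[:, None]])


def lines_by_color(b, coder):
    """Intercept and slope of each color's line implied by coefficients b."""
    out = {}
    for c in LEVELS:
        i = np.array(coder(c))
        out[c] = (b[0] + i @ b[1:6], b[6] + i @ b[7:12])
    return out


# Synthetic data for illustration only (assumed values, not the diamonds file).
rng = np.random.default_rng(0)
true_lines = {c: (6.0 - 0.1 * k, 3.0 + 0.05 * k) for k, c in enumerate(LEVELS)}
colors = [c for c in LEVELS for _ in range(20)]
x = rng.uniform(0.5, 1.0, size=len(colors))
y = np.array([true_lines[c][0] + true_lines[c][1] * xi
              for c, xi in zip(colors, x)])
y = y + rng.normal(0.0, 0.05, size=len(y))

fits = {}
for name, coder in [("reference", reference_coding),
                    ("effects", effects_coding)]:
    A = design_matrix(colors, x, coder)
    b, *_ = np.linalg.lstsq(A, y, rcond=None)  # ordinary least squares
    fits[name] = lines_by_color(b, coder)

# Per-color (intercept, slope) under each coding, side by side.
for c in LEVELS:
    print(c, np.round(fits["reference"][c], 3), np.round(fits["effects"][c], 3))
```

The individual coefficients differ between the two codings, but the per-color lines agree, because both design matrices describe the same family of six separate lines. Only the interpretation of b1 through b5 and b7 through b11 changes.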
Coding Comparison
• Comparison of Regression Lines

Color   I1   I2   I3   I4   I5
D        0    0    0    0    0
E        1    0    0    0    0
F        0    1    0    0    0
G        0    0    1    0    0
H        0    0    0    1    0
I        0    0    0    0    1

• GLM and DOE

Color   I1   I2   I3   I4   I5
D        1    0    0    0    0
E        0    1    0    0    0
F        0    0    1    0    0
G        0    0    0    1    0
H        0    0    0    0    1
I       -1   -1   -1   -1   -1

Model Comparison
• Comparison of Regression Lines

Color   Intercept   Slope
D       b0          b6
E       b0 + b1     b6 + b7
F       b0 + b2     b6 + b8
G       b0 + b3     b6 + b9
H       b0 + b4     b6 + b10
I       b0 + b5     b6 + b11

• GLM and DOE

Color   Intercept                  Slope
D       b0 + b1                    b6 + b7
E       b0 + b2                    b6 + b8
F       b0 + b3                    b6 + b9
G       b0 + b4                    b6 + b10
H       b0 + b5                    b6 + b11
I       b0 - (b1+b2+b3+b4+b5)      b6 - (b7+b8+b9+b10+b11)

Data Input Dialog Box

Model Specification

Results

Simplified Model

Predicting New Observations

Least Squares Means

Least Squares Means

Residual Plot

Logistic Regression
Can we predict how well Barry Bonds would do when he came to bat?

Bonds Data from 2001
n = 648 at-bats
Source: JSE Data Archive, Jerome P. Reiter, Duke University

Model
• Y = 1 if Bonds reached base and 0 otherwise
• Predictors:
  – ERA of opposing pitcher
  – Runs already scored that inning
  – Opposing team’s score
  – Inning
  – # of outs when he came to bat
  – Whether a runner was on first base
  – Whether a runner was on second base
  – Whether a runner was on third base
  – Whether it was a home game

Data Input Dialog Box

Analysis Options

Results

Plot of Probability of Reaching Base

Nonlinear Models
• Data file: 93cars.sgd
  – Y: MPG Highway
  – X: weight
  – C: manual
• The relationship between Y and X is nonlinear.

Multiplicative Model

Nonlinear Regression

Results
MPG Highway = EXP(9.66687 - 0.783745*LOG(weight) - 0.0440352*manual)

Life Data Regression
• Data file: methadone.sgd
  Caplehorn, J. (1991). Methadone dosage and retention of patients in maintenance treatment. Medical Journal of Australia.
Distribution Fitting

Distribution Fitting

Data Input Dialog Box

Analysis Options

Results
Days in clinic = exp(5.29378 + 0.0244232*Dose - 0.709329*Clinic=1 + 0.229509*Prison?=0)

Percentile Plot

Percentile Plot

More Information
• Video, slides and sample data may be found at www.statgraphics.com/webinars.
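Both fitted equations reported on the Results slides for 93cars.sgd and methadone.sgd have the form exp(linear predictor), so each categorical term acts as a multiplicative factor on the prediction. The sketch below simply evaluates those two equations with the coefficients as printed. The function names are hypothetical, `manual` is assumed to be a 0/1 column, the input values (a weight of 3000, doses of 40 and 50) are made up for illustration, and the numeric values taken by the `Clinic=1` and `Prison?=0` terms are left to the caller because the slides do not state which indicator coding that procedure uses.

```python
import math


def mpg_highway(weight, manual):
    """Fitted multiplicative model from the 93cars.sgd Results slide.

    Assumes `manual` is coded 0/1; LOG is taken as the natural log.
    """
    return math.exp(9.66687 - 0.783745 * math.log(weight) - 0.0440352 * manual)


def days_in_clinic(dose, clinic_term, prison_term):
    """Fitted life data regression from the methadone.sgd Results slide.

    clinic_term and prison_term are the numeric values of the 'Clinic=1'
    and 'Prison?=0' indicator terms; their coding is an assumption the
    caller must supply.
    """
    return math.exp(5.29378 + 0.0244232 * dose
                    - 0.709329 * clinic_term + 0.229509 * prison_term)


# Effect of the categorical term in the cars model: a constant ratio,
# independent of weight.
ratio_manual = mpg_highway(3000, 1) / mpg_highway(3000, 0)
print("manual vs. non-manual multiplier:", round(ratio_manual, 4))

# Effect of raising Dose by 10 units with the indicator terms held fixed.
ratio_dose = days_in_clinic(50, 0, 0) / days_in_clinic(40, 0, 0)
print("multiplier per 10 units of Dose:", round(ratio_dose, 4))
```

Because the ratios do not depend on the other inputs, they are a convenient way to read a log-scale coefficient: exp(coefficient) is the factor by which the prediction changes per unit of that term.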
