SAS/STAT® 13.2 User’s Guide: The REG Procedure

This document is an individual chapter from SAS/STAT® 13.2 User’s Guide. The correct bibliographic citation for the complete manual is as follows: SAS Institute Inc. 2014. SAS/STAT® 13.2 User’s Guide. Cary, NC: SAS Institute Inc.

Copyright © 2014, SAS Institute Inc., Cary, NC, USA. All rights reserved. Produced in the United States of America. August 2014.

Chapter 85
The REG Procedure

Contents

Overview: REG Procedure
Getting Started: REG Procedure
    Simple Linear Regression
    Polynomial Regression
    Using PROC REG Interactively
Syntax: REG Procedure
    PROC REG Statement
    ADD Statement
    BY Statement
    CODE Statement
    DELETE Statement
    FREQ Statement
    ID Statement
    MODEL Statement
    MTEST Statement
    OUTPUT Statement
    PRINT Statement
    REFIT Statement
    RESTRICT Statement
    REWEIGHT Statement
    STORE Statement
    TEST Statement
    VAR Statement
    WEIGHT Statement
Details: REG Procedure
    Missing Values
    Input Data Sets
    Output Data Sets
    Interactive Analysis
    Model-Selection Methods
    Criteria Used in Model-Selection Methods
    Limitations in Model-Selection Methods
    Parameter Estimates and Associated Statistics
    Predicted and Residual Values
    Models of Less Than Full Rank
    Collinearity Diagnostics
    Model Fit and Diagnostic Statistics
    Influence Statistics
    Reweighting Observations in an Analysis
    Testing for Heteroscedasticity
    Testing for Lack of Fit
    Multivariate Tests
    Autocorrelation in Time Series Data
    Computations for Ridge Regression and IPC Analysis
    Construction of Q-Q and P-P Plots
    Computational Methods
    Computer Resources in Regression Analysis
    Displayed Output
    Plot Options Superseded by ODS Graphics
    ODS Table Names
    ODS Graphics
Examples: REG Procedure
    Example 85.1: Modeling Salaries of Major League Baseball Players
    Example 85.2: Aerobic Fitness Prediction
    Example 85.3: Predicting Weight by Height and Age
    Example 85.4: Regression with Quantitative and Qualitative Variables
    Example 85.5: Ridge Regression for Acetylene Data
    Example 85.6: Chemical Reaction Response
References

Overview: REG Procedure

The REG procedure is one of many regression procedures in the SAS System. It is a general-purpose procedure for regression, while other SAS regression procedures provide more specialized applications.

Other SAS/STAT procedures that perform at least one type of regression analysis are the CATMOD, GENMOD, GLM, LOGISTIC, MIXED, NLIN, ORTHOREG, PROBIT, RSREG, and TRANSREG procedures. SAS/ETS procedures are specialized for applications in time series or simultaneous systems. These other SAS/STAT regression procedures are summarized in Chapter 4, “Introduction to Regression Procedures,” which also contains an overview of regression techniques and defines many of the statistics computed by PROC REG and other regression procedures.
PROC REG provides the following capabilities:

• multiple MODEL statements
• nine model-selection methods
• interactive changes both in the model and the data used to fit the model
• linear equality restrictions on parameters
• tests of linear hypotheses and multivariate hypotheses
• collinearity diagnostics
• predicted values, residuals, studentized residuals, confidence limits, and influence statistics
• correlation or crossproduct input
• requested statistics available for output through output data sets
• ODS Graphics. For more information, see the section “ODS Graphics.”

Nine model-selection methods are available in PROC REG. In the simplest method, PROC REG fits the complete model that you specify. The other eight methods involve various ways of including or excluding variables from the model. You specify these methods with the SELECTION= option in the MODEL statement. The methods are identified in the following list and are explained in detail in the section “Model-Selection Methods.”

NONE        no model selection. This is the default. The complete model specified in the MODEL statement is fit to the data.
FORWARD     forward selection. This method starts with no variables in the model and adds variables.
BACKWARD    backward elimination. This method starts with all variables in the model and deletes variables.
STEPWISE    stepwise regression. This is similar to the FORWARD method except that variables already in the model do not necessarily stay there.
MAXR        forward selection to fit the best one-variable model, the best two-variable model, and so on. Variables are switched so that R square is maximized.
MINR        similar to the MAXR method, except that variables are switched so that the increase in R square from adding a variable to the model is minimized.
RSQUARE     finds a specified number of models with the highest R square in a range of model sizes.
ADJRSQ      finds a specified number of models with the highest adjusted R square in a range of model sizes.
CP          finds a specified number of models with the lowest Cp in a range of model sizes.

Getting Started: REG Procedure

Simple Linear Regression
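As a minimal sketch of the ideas introduced so far (this is not the guide's own example), the following step fits a simple linear regression and then, through a second MODEL statement in the same step, requests forward selection with the SELECTION= option. The data set Sashelp.Class, its variables Weight, Height, and Age, and the labels SimpleFit and ForwardSel are assumptions chosen for illustration; substitute your own data set and variable names.

```sas
/* Illustrative sketch only; not an example taken from the guide.      */
/* Assumes the Sashelp.Class sample data set (variables Weight,        */
/* Height, Age) is available at your site.                             */
proc reg data=sashelp.class;

   /* Simple linear regression: one response, one regressor.           */
   /* SELECTION=NONE is the default, so the complete model is fit.     */
   SimpleFit: model Weight = Height;

   /* A second MODEL statement in the same step. Forward selection     */
   /* starts with no variables in the model and adds variables.        */
   ForwardSel: model Weight = Height Age / selection=forward;

run;
quit;
```

Because PROC REG can be used interactively, the step remains active after the RUN statement; the QUIT statement ends it. Replacing FORWARD with another keyword from the list of methods (for example, BACKWARD or STEPWISE) requests that method instead.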
