SM222: Modeling Business Decisions Test #1
SM222 SECTION B6: Modeling Business Decisions Midterm
BOSTON UNIVERSITY School of Management
Fall 2013
Sign the following statement. Grades will not be given to students who do not do so. I have not cheated or helped anyone else cheat on this exam.
______Signature
Name:______DO NOT WRITE YOUR NAME ANYWHERE ELSE ON THIS TEST.
Professor and time of class:______
NOTE: WE GIVE LOTS OF PARTIAL CREDIT ON TESTS. Always say something.
IF YOU HAVE NOT BROUGHT A REGRESSION OF YOUR OWN TO ANALYZE, YOU SHOULD ANSWER THE REGULAR TEST EXCLUDING 4 B AND C AND EXCLUDING THE QUESTION STARTING “WHAT IMPORTANT FACTS ABOUT THE TIME TRENDS IN UNEMPLOYMENT RATES DO WE LEARN FROM” … (This will either be 1e or 2e.)
QUESTION 1 Your Regression
Answer the following questions regarding the main regression that you have brought with you from your project. You can write the answers either on the regression itself or on this page. Label each answer (a), (b), etc. Be sure to put your name on the page with your regression. When you complete the test, staple your regression sheet to this test.
a. What is the main question that this regression addresses and what is the answer to this question? How do you learn this answer from your regression? In your answer, be sure to discuss how sure you are (statistically) of your answer. Note: You will need to define your variables (including your dependent variable) in this explanation. IF YOU ABSOLUTELY NEED 2 REGRESSIONS TO ANSWER YOUR QUESTION, YOU CAN INCLUDE 2.
b. What null hypothesis does the t-statistic on the first variable in your regression test?
c. What does each observation in your data set represent? (In a few words at most.)
d. For each other variable in your regression, explain what role it plays in the regression. (If you have multiple dummy variables for a single category, explain the category as a group.)
e. For each variable, in intuitive words, explain specifically what we learn from the coefficient, in other words, why you felt you should include it. (Again, if you have multiple dummy variables for a single category, explain the category as a group.)
f. If your R-squared is low, explain why it is low and whether it casts doubt on your answer (in part a). If your R-squared is high, explain why it is so much higher than the R-squareds in so many other regressions.
g. Consider one possibly confounding variable that you included in your regression. Explain why there would be missing variable bias if you excluded this variable from your regression.
IF YOU HAVE NO VARIABLE THAT YOU CONSIDER POSSIBLY CONFOUNDING IN YOUR REGRESSION: What is one possibly confounding variable that is NOT in your regression? Explain why this would cause missing variable bias. The best answer would give the direction of the missing variable bias (positive or negative) and explain how you know.
QUESTION 2 Other Questions
A. Answer questions on only one of the two student projects on this page. Do not answer the question from your project.
Project I: In this project, the dependent variable was the person’s income. The project tested whether the person’s family background (in particular, the family’s income) affected the person’s income. Three of the explanatory variables were:
incom16_hi: a dummy = 1 if the respondent’s family income when the respondent was 16 years old was far above average
male: a dummy = 1 if male
maleXincome16_hi: an interaction term formed by multiplying the other two variables together.
(There were other variables such as education, age, hours worked.) That part of the equation was:
Income = -8893 incom16_hi + 17907 male + 36529 maleXincome16_hi + ……….
         (10462)            (1860)       (13046)
(standard errors in parentheses)
Project II: In this project, the dependent variable was the water flow downstream. Three of the explanatory variables were:
Snow: the total snow in winter
June2: a dummy for the second week in June
SnowJune2: an interaction term formed by multiplying the other two variables together.
(There were many other variables.) That part of the equation was:
Water flow = .198 Snow - 21.56 June2 + 4.354 SnowJune2
             (.088)      (18.13)       (.968)
(standard errors in parentheses)
About which project are you answering:______
Which coefficients are you >=95% certain affect the dependent variable?
Which coefficients are you 68% but not 95% certain affect the dependent variable?
Which coefficients are you <68% certain affect the dependent variable?
In intuitive words (rather than statistical terms), what do we learn from the coefficient on the interaction term?
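The three certainty questions above turn on how many standard errors each coefficient lies from zero. The following is a minimal sketch of that arithmetic, assuming the usual rule of thumb that about 1 standard error corresponds to 68% confidence and about 2 standard errors to 95%. The cutoffs 1 and 2 and the helper name `certainty_band` are illustrative assumptions, not part of the test.

```python
def certainty_band(coef, std_err):
    """Return (t, band): t is coef / std_err, band is a rule-of-thumb label.

    Assumed cutoffs: |t| >= 2 for ">=95%", |t| >= 1 for "68% to 95%".
    """
    t = coef / std_err
    size = abs(t)
    if size >= 2:
        band = ">=95%"
    elif size >= 1:
        band = "68% to 95%"
    else:
        band = "<68%"
    return t, band


# (coefficient, standard error) pairs copied from the two project equations.
projects = {
    "Project I": {
        "incom16_hi": (-8893, 10462),
        "male": (17907, 1860),
        "maleXincome16_hi": (36529, 13046),
    },
    "Project II": {
        "Snow": (0.198, 0.088),
        "June2": (-21.56, 18.13),
        "SnowJune2": (4.354, 0.968),
    },
}

for project, terms in projects.items():
    for name, (coef, std_err) in terms.items():
        t, band = certainty_band(coef, std_err)
        print(f"{project:10s} {name:17s} t = {t:6.2f}  {band}")
```

The sketch only sorts coefficients by the size of their t-ratios; explaining what the interaction term means still has to be done in words.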
B. In this project, the dependent variable is sales of cereal brands. The explanatory X variables are obvious except that you should know that the 4 regions of the US are South, Northeast, West and Midwest.
Use the number 1.611 (the coefficient on summer) in a sentence that explains what it means in terms that are as commonsense and intuitive as possible. (Don’t use statistics terms.) The more intuitive and straightforward, the better.
In this regression with just dummy variables, the number 31.4245 (the constant or intercept) represents something specific and intuitive. What does it represent? (No statistics or math terms. The more intuitive and straightforward, the better.)
c. What predicts who smokes? Do cigarette prices affect it? Below is a regression where the dependent variable is a dummy variable of whether or not the person smoked. Two of the explanatory variables were age and age-squared. That part of the equation was:
Person smoked = .0206 age - .0003 age-squared + …………………………….
                (3.7559)    (-4.479)
(t-stats in parentheses)
In intuitive words (rather than statistical terms), what do we learn from the coefficient and t-statistic on age-squared?
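When both age and age-squared appear in an equation, the effect of one more year of age is not constant: it equals the age coefficient plus twice the age-squared coefficient times age. The following is a rough sketch of that arithmetic using the rounded coefficients printed above, so the results are approximate. The helper names `marginal_effect` and `turning_point` are hypothetical.

```python
# Rounded coefficients copied from the smoking equation in part c.
B_AGE = 0.0206
B_AGE_SQ = -0.0003


def marginal_effect(age, b_age=B_AGE, b_age_sq=B_AGE_SQ):
    """Change in the predicted value for one more year of age, at a given age."""
    return b_age + 2 * b_age_sq * age


def turning_point(b_age=B_AGE, b_age_sq=B_AGE_SQ):
    """Age at which the marginal effect of age is zero."""
    return -b_age / (2 * b_age_sq)


for age in (20, 30, 40, 50):
    print(f"age {age}: marginal effect = {marginal_effect(age):+.4f}")
print(f"turning point at about age {turning_point():.1f}")
```

Because the printed coefficient on age-squared is rounded to one significant digit, the turning point from this sketch should be read as a ballpark figure only.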
d. Here is a regression of salaries of SMG graduates in 2012 on dummies for the highest and lowest paying concentrations and for whether the student was from the US (domestic) or from abroad:

. regress BaseSalary fin_or_acc marketing domestic international
note: international omitted because of collinearity

       Source |       SS       df       MS              Number of obs =     263
--------------+------------------------------           F(  3,   259) =   17.10
        Model |  6.9469e+09     3  2.3156e+09           Prob > F      =  0.0000
     Residual |  3.5072e+10   259   135413574           R-squared     =  0.1653
--------------+------------------------------           Adj R-squared =  0.1557
        Total |  4.2019e+10   262   160377747           Root MSE      =   11637

-------------------------------------------------------------------------------
   BaseSalary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
   fin_or_acc |   8251.386   1615.468     5.11   0.000     5070.262    11432.51
    marketing |  -3754.324   2325.157    -1.61   0.108    -8332.943    824.2956
     domestic |  -4404.168   2273.778    -1.94   0.054    -8881.613    73.27668
international |          0  (omitted)
        _cons |   49704.64   2489.461    19.97   0.000     44802.48     54606.8
-------------------------------------------------------------------------------
Exactly why did Stata omit the coefficient on international? Explain without using statistical terms if possible.
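The following small sketch shows what "omitted because of collinearity" looks like in the data, assuming (as the question states) that every graduate is either domestic or from abroad. The six rows are made up for illustration and are not taken from the SMG data set; the helper name `is_exact_combination` is hypothetical.

```python
# Hypothetical graduates: 1 = domestic, 0 = from abroad (made-up data).
domestic = [1, 0, 1, 1, 0, 1]

# Under the stated assumption, international is always the opposite of domestic.
international = [1 - d for d in domestic]

# The regression constant is a column of ones.
constant = [1] * len(domestic)


def is_exact_combination(const, dom, intl):
    """True if intl equals const - dom on every row."""
    return all(i == c - d for c, d, i in zip(const, dom, intl))


redundant = is_exact_combination(constant, domestic, international)
print("international is fully determined by the other columns:", redundant)
```

When one column can be computed exactly from the others, as here, it carries no additional information, so no separate coefficient can be estimated for it.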