Problem of the Day Make a Scatter Plot of the Data. Is Linear Regression
Problem of the Day

Make a scatter plot of the data. Is linear regression appropriate? Why or why not?

Weight in motion (kg)    Static weight (kg)
26                       27.9
29.9                     29.1
39.5                     38.0
25.1                     27.0
31.6                     30.3
36.2                     34.5
25.1                     27.8
31.0                     29.6
35.6                     33.1
40.2                     35.5

Problem of the Day

Is it appropriate to use linear regression to predict salary from year? Why or why not?

Player             Year    Salary (in millions)
Nolan Ryan         1980     1.0
George Foster      1982     2.0
Kirby Puckett      1990     3.0
Jose Canseco       1991     4.7
Roger Clemens      1996     5.3
Ken Griffey, Jr    1997     8.5
Albert Belle       1997    11.0
Pedro Martinez     1998    12.5
Mike Piazza        1999    12.5
Mo Vaughn          1999    13.3
Kevin Brown        1999    15.0
Carlos Delgado     2001    17.0
Alex Rodriguez     2001    22.0
Manny Ramirez      2004    22.5
Alex Rodriguez     2005    26.0

Chapter 10
Re-Expressing Data: Get It Straight!

Linear regression: the easiest of methods. How can we make our data linear in appearance?

Can we re-express data?
- Change functions or add a function?
- Can we think about data differently?
- What is the meaning of the y-units?

Why do we need to re-express?

Methods to deal with data that we have learned:
1.
2.

Goal 1: make data symmetric
Goal 2: make spreads more alike (centers are not necessarily alike), less spread out
Goal 3 (most used): make data appear more linear
Goal 4 (similar to Goal 3): make the data in a scatter plot more spread out

Ladder of Powers (pg 227)

Straightening is good, but limited
- multimodal data cannot be "straightened"
- multiple models is really the only way to deal with this data

Things to Remember
- we want linear regression because it is easiest (curves are possible, but beyond the scope of our class)
- don't choose a model based on r or R²
- don't go too far from the Ladder of Powers
- negative values or multimodal data are difficult to re-express

Find an appropriate linear model for the data. Use it to predict the highest paid player in 1989 and 2003.

Player             Year    Salary (in millions)
Nolan Ryan         1980     1.0
George Foster      1982     2.0
Kirby Puckett      1990     3.0
Jose Canseco       1991     4.7
Roger Clemens      1996     5.3
Ken Griffey, Jr    1997     8.5
Albert Belle       1997    11.0
Pedro Martinez     1998    12.5
Mike Piazza        1999    12.5
Mo Vaughn          1999    13.3
Kevin Brown        1999    15.0
Carlos Delgado     2001    17.0
Alex Rodriguez     2001    22.0
Manny Ramirez      2004    22.5
Alex Rodriguez     2005    26.0

Percentiles from Correlation

Week on the Job (x) and Time to Complete a Task (y) have a correlation of 0.968. If you have a week in the 90th percentile, what percentile is the corresponding time expected to be in?

Chapter 10
Readings and Examples: pgs 222-238
Homework: pgs 239-243: 1-4, 6, 7, 11, 13, 21, 25-27, 32 (join Twin Club), 37.
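One possible way to work the salary problem is sketched below. The slides do not say which re-expression to use, so the choice of log10(salary) is an assumption (it is one rung of the Ladder of Powers), as is measuring time in years since 1980. The helper names least_squares and predict_salary are made up for this illustration. The sketch fits a least-squares line to the re-expressed data and back-transforms the predictions for 1989 and 2003.

```python
import math

# Salary data from the slide: (year, salary in millions).
DATA = [
    (1980, 1.0), (1982, 2.0), (1990, 3.0), (1991, 4.7), (1996, 5.3),
    (1997, 8.5), (1997, 11.0), (1998, 12.5), (1999, 12.5), (1999, 13.3),
    (1999, 15.0), (2001, 17.0), (2001, 22.0), (2004, 22.5), (2005, 26.0),
]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Assumption: re-express salary as log10(salary) and count years since 1980.
years_since_1980 = [year - 1980 for year, _ in DATA]
log_salary = [math.log10(salary) for _, salary in DATA]
intercept, slope = least_squares(years_since_1980, log_salary)


def predict_salary(year):
    """Predict salary (millions) by undoing the log10 re-expression."""
    return 10 ** (intercept + slope * (year - 1980))


for year in (1989, 2003):
    print(year, round(predict_salary(year), 1))
```

In keeping with the reminder not to choose a model based on r or R², the fit should still be judged by a residual plot of the re-expressed data, which this sketch does not draw.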
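The percentile question can be sketched numerically as well. This assumes both variables are reasonably described by a Normal model, which the slide does not state, and uses the fact that the regression line in standardized units is z_y = r * z_x. The function name predicted_percentile is hypothetical.

```python
from statistics import NormalDist


def predicted_percentile(r, x_percentile):
    """Percentile of the predicted y, assuming Normal models for x and y."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_x = std_normal.inv_cdf(x_percentile / 100)  # z-score of the x percentile
    z_y = r * z_x  # regression in standardized units: z_y = r * z_x
    return 100 * std_normal.cdf(z_y)


print(round(predicted_percentile(0.968, 90), 1))
```

Because r is slightly less than 1, the predicted time sits a little closer to the mean than the week does: it comes out at about the 89th percentile rather than the 90th.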