Problem of the Day

Make a scatter plot of the data. Is linear regression appropriate? Why or why not?

Weight in motion (kg)   Static weight (kg)
26                      27.9
29.9                    29.1
39.5                    38.0
25.1                    27.0
31.6                    30.3
36.2                    34.5
25.1                    27.8
31.0                    29.6
35.6                    33.1
40.2                    35.5

Problem of the Day

When using midterm exam scores to predict a student’s final grade in a class, the student would prefer to have a

A) positive residual, because that means the student’s final grade is higher than we would predict with the model.

B) positive residual, because that means the student’s final grade is lower than we would predict with the model.

C) residual equal to zero, because that means the student’s final grade is exactly what we would predict with the model.

D) negative residual, because that means the student’s final grade is lower than we would predict with the model.

E) negative residual, because that means the student’s final grade is higher than we would predict with the model.

Problem of the Day

A scatterplot of vs. x shows a strong positive linear pattern. It is probably true that

A) the correlation between X and Y is near +1.0.

B) the scatterplot of Y vs X also shows a linear pattern.

C) the residuals plot for regression of Y on X shows a curved pattern.

D) large values of X are associated with large values of Y.

E) accurate predictions can be made for Y even if extrapolation is involved.

Problem of the Day

The model = 3.30 + 0.235(speed) can be used to predict the stopping distance (in feet) for a car traveling at a specific speed (in mph). According to this model, about how much distance will a car going 65 mph need to stop?

A) 4.3 feet   B) 18.6 feet   C) 27.0 feet   D) 345.0 feet   E) 729.0 feet

Problem of the Day

Create an appropriate model for the data. Use it to predict the MPG for a speed of 57 mph.

Speed (mph)   35    40    45    50    55    60    65    70    75
MPG           25.9  27.7  28.5  29.5  29.2  27.4  26.4  24.2  22.8

Problem of the Day

Is it appropriate to use linear regression to predict salary from year? Why or why not?

Player            Year   Salary (in millions)
Nolan Ryan        1980    1.0
George Foster     1982    2.0
Kirby Puckett     1990    3.0
Jose Canseco      1991    4.7
Roger Clemens     1996    5.3
Ken Griffey, Jr   1997    8.5
Albert Belle      1997   11.0
Pedro Martinez    1998   12.5
Mike Piazza       1999   12.5
Mo Vaughn         1999   13.3
Kevin Brown       1999   15.0
Carlos Delgado    2001   17.0
Alex Rodriguez    2001   22.0
Manny Ramirez     2004   22.5
Alex Rodriguez    2005   26.0
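One possible way to explore the salary question (a sketch, not part of the original slides) is to fit the least-squares line and read the residuals in year order. The helper name `least_squares` is made up for this illustration. If the residuals share one sign at both ends and the opposite sign in the middle, that points to a curve rather than a straight line.

```python
# Salary data from the table above: year and salary in millions.
years = [1980, 1982, 1990, 1991, 1996, 1997, 1997, 1998,
         1999, 1999, 1999, 2001, 2001, 2004, 2005]
salaries = [1.0, 2.0, 3.0, 4.7, 5.3, 8.5, 11.0, 12.5,
            12.5, 13.3, 15.0, 17.0, 22.0, 22.5, 26.0]


def least_squares(x, y):
    """Return (slope, intercept) of the least-squares line for y on x.

    Hypothetical helper written for this example only.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares(years, salaries)

# Residual = observed salary - salary predicted by the line.
residuals = [y - (intercept + slope * x) for x, y in zip(years, salaries)]

for x, y, e in zip(years, salaries, residuals):
    print(f"{x}  salary {y:5.1f}  residual {e:+6.2f}")
```

The same residual check can be repeated after re-expressing salary to see whether the pattern goes away.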

Chapter 10 Re-Expressing Data: Get It Straight!

Linear regression is the easiest of methods. How can we make our data linear in appearance?

Can we re-express data? Change functions or add a function? Can we think about data differently? What is the meaning of the y-units? Why do we need to re-express?

Methods to deal with data that we have learned

1.

2.

Goal 1

- make the data more symmetric

Goal 2

- make spreads more alike (centers are not necessarily alike), less spread out

Goal 3 (most used)

- make data appear more linear

Goal 4 (similar to Goal 2)

- make the data in a scatter plot more evenly spread out

Ladder of Powers (pg 227)

Straightening is good, but limited

- multi-modal data cannot be "straightened"

- using multiple models is really the only way to deal with this data

Things to Remember

- we want linear regression because it is easiest (curves are possible, but beyond the scope of our class)

- don't choose a model based on r or R² alone

- don't go too far from the Ladder of Powers

- negative values or multi-modal data are difficult to re-express

Find an appropriate linear model for the data. Use it to predict the salary of the highest paid player in 1989 and 2003.

Player            Year   Salary (in millions)
Nolan Ryan        1980    1.0
George Foster     1982    2.0
Kirby Puckett     1990    3.0
Jose Canseco      1991    4.7
Roger Clemens     1996    5.3
Ken Griffey, Jr   1997    8.5
Albert Belle      1997   11.0
Pedro Martinez    1998   12.5
Mike Piazza       1999   12.5
Mo Vaughn         1999   13.3
Kevin Brown       1999   15.0
Carlos Delgado    2001   17.0
Alex Rodriguez    2001   22.0
Manny Ramirez     2004   22.5
Alex Rodriguez    2005   26.0

Practice Problems

Find an appropriate model for the data. Use it to predict the f/stop of a camera with a shutter speed of .05 sec.

Shutter speed (in seconds)   .001  .002  .004  .008  .0167  .033  .067  .125
f/stop (in mm)               2.8   4     5.6   8     11     16    22    32

Find an appropriate model and predict the weight of fruit for a farm with 325 trees.

# of orange trees   Average weight of fruit
50                  .6
100                 .58
150                 .56
200                 .55
250                 .53
300                 .52
350                 .50
400                 .49
450                 .48
500                 .46
600                 .44
700                 .42
800                 .40
900                 .38
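For the shutter-speed problem above, one re-expression worth trying is taking base-10 logs of both variables and fitting a line on the log scale. This is only a sketch under that assumption: the choice of logs is one option among several, and the helper name `fit_line` is made up for this illustration.

```python
import math

# Data from the shutter speed table above.
shutter = [0.001, 0.002, 0.004, 0.008, 0.0167, 0.033, 0.067, 0.125]
fstop = [2.8, 4, 5.6, 8, 11, 16, 22, 32]


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line for y on x.

    Hypothetical helper written for this example only.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Re-express both variables (assumed choice: log10 of x and log10 of y).
log_shutter = [math.log10(t) for t in shutter]
log_fstop = [math.log10(f) for f in fstop]

slope, intercept = fit_line(log_shutter, log_fstop)

# Predict on the log scale, then undo the log to get back to f/stop units.
log_prediction = intercept + slope * math.log10(0.05)
predicted_fstop = 10 ** log_prediction

print(f"log10(f/stop) = {intercept:.3f} + {slope:.3f} * log10(shutter speed)")
print(f"predicted f/stop at 0.05 sec: {predicted_fstop:.1f}")
```

A residual plot on the re-expressed scale is still needed before settling on this model, and the same steps can be tried on the orange tree data with a different rung of the Ladder of Powers.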


Percentiles from Correlation

Week on the Job (x) and Time to Complete a Task (y) have a correlation of -0.968. If you have a week in the 90th percentile, in what percentile would you expect the corresponding time to be?

Percentiles from Correlation

Distance (x) and airfare (y) have a correlation of 0.694. If you have a distance in the 80th percentile, in what percentile would you expect the corresponding airfare to be?

Chapter 10

Readings and Examples pgs 222-238

Homework pgs 239-243: 1-4, 6, 7, 11, 13, 19, 21, 25, 27

Group Problems

1. Write a brief description of the problem (you do not have to copy the whole problem).

2. Re-express the data as you see fit.

3. Show your final scatter plot with residuals.

4. Show your final model and use it to predict a value.

5. Discuss what other functions you may have tried and why you did not use them.

Chapter 9 FRQ

Chapter 12

Readings and Examples pgs 268-287

Turn your homework in with your quiz.