Matching Histograms, Box Plots, and Normality Plots

 WHAT IF YOU DO NOT KNOW THE STANDARD DEVIATION OF THE POPULATION?

 What are t-distributions?

When you don’t know the standard deviation of the population, you use the standard deviation of the sample. Most of the time you do not know the standard deviation of the population, so on the MEAN side you will almost always be using the t-distribution.

On the mean side you will be using a t-distribution, and on the proportion side you will be using a z-score. The t-distribution goes only with mean problems because there you are estimating a standard deviation from the sample. On the proportion side you are CREATING the standard deviation from the proportion and the sample size.

When dealing with a t-distribution there are two values that have variation, the mean and the standard deviation. Every sample will most likely have a different mean and different standard deviation based on the sample.

The CLT tells us that the sampling distribution of sample means approaches a Normal model as n increases. That's based on the assumption that we know sigma. In the real world, that's exceedingly rare. Not knowing sigma forces us to use the sample's s, and that means we're no longer playing by CLT rules. We have to take this extra uncertainty into account, and Gosset's t does exactly that. The critical issue in deciding between z and t is the fact that we don't (and indeed can't) know sigma. In the real world, inference for means ALWAYS requires t.

The rule of thumb suggesting that one should use z if n > 30 is a pre-technology approximation that's no longer necessary. No matter how large n is, we use t. Before we had stats software or calculators, finding values of t for any number of df would have required many pages of tables. The easy out was to print only one page of the t-table (usually up to about 30 df) and then tell people to switch over to z. That's not because the sampling distribution miraculously changed, it's because the t distribution approaches a Normal as n increases, and 30 was a convenient time to say "okay, it's close enough now".

A book suggesting this approach is out of date. We no longer have to resort to this approximation, because the calculator can work with any number of df, and that lets us always (and properly) use t for inference for means. Suggesting that we should sometimes use t and sometimes z confuses kids. Don't go there. We won't know sigma. t is for means, z is for proportions. Keep it simple.

Jeane Swaynos AP Workshop July 2008

There is a family of graphs for the t-distributions and each one depends on the size of the sample.

T distributions

This is a graph of the NORMAL distribution

This is a graph of the t-distribution with 2 degrees of freedom. Do you notice that the tails are thicker on the t-distribution?

This is a t-distribution with 10 degrees of freedom. As the degrees of freedom increase, the t-distribution gets closer to a Normal distribution. The degrees of freedom are found by taking your sample size minus 1. For example, if your sample size is 15, your degrees of freedom would be 14, so you would look at the distribution for df = 14.
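The thicker tails can be checked numerically. The sketch below is an illustration only and is not part of the original handout: it evaluates the standard t density formula with Python's math module and compares the height of each curve at 3 units from the center. The function names t_pdf and normal_pdf are made up for this example.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom at x."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    """Density of the standard Normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Height of each curve far out in the tail (x = 3).
for df in (2, 10, 100):
    print(f"t, df = {df:3d}: {t_pdf(3, df):.5f}")
print(f"Normal     : {normal_pdf(3):.5f}")
```

The heights shrink toward the Normal value as the degrees of freedom grow, which is the same pattern the graphs show.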

Look at the chart in your book and find the t-distribution table. Do you notice that there are values along the left side? These represent the degrees of freedom.

There are additional assumptions when using the “t” distribution. If you are given the data points, you must graph them and discuss what you see on the graph. If you are not given the data points, you will have to assume that these conditions hold. Here are the conditions:

1. n < 15: the distribution must be approximately Normal.

If your sample size is less than 15, you have to assume that the population of interest is NORMAL. Think about this: it would be difficult to show normality with such a small sample size. If the data are provided, the reader still expects you to display the data and comment on the histogram.

2. 15 <= n < 30: If your sample size is between 15 and 30, your data do not necessarily have to be normal, but you cannot have skewness or outliers. Again, if the data are given, show a histogram and describe what you see.

3. n >= 30: If your sample size is 30 or more, the data do not have to be approximately normal and can also have some skewness in either direction. You will still have to be cautious of outliers, so you will want to state this as part of your assumptions.
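The sample-size rule above can be sketched as a small lookup. This is only an illustration: the function name normality_condition is hypothetical, the returned wording is a paraphrase of the three conditions, and treating n = 30 as part of the large-sample case is an assumption.

```python
def normality_condition(n):
    """Return the condition to state for a one-sample t procedure with sample size n."""
    if n < 15:
        return "Assume the population is approximately Normal; graph the data if given."
    if n < 30:
        return "Data need not be Normal, but check for skewness and outliers."
    # Assumption: n = 30 is treated as the large-sample case.
    return "Some skewness is acceptable; still check for outliers."

print(normality_condition(11))
```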

You will choose one of the assumptions above based on your sample size. You will still need to address the other two assumptions:

• You assume this is an SRS from your population of interest.
• You assume that your sample is independent, which means that it is less than 10% of the entire population.

The assumptions on the AP test are graded as right or wrong. If one little piece is missing, for example if you do not define the population in the SRS or you do not state the correct information involving your sample size, then it is wrong; there is no partial credit.

One sample T Confidence Interval

Mrs. Swaynos took a random sample of gas prices around the area and found the price per gallon of regular gasoline was as follows:

3.12 3.21 3.34 3.67 3.78 3.10 3.12 3.32 3.45 3.56 3.78

Make a 95% confidence interval for the data and interpret the results. Show all parts

Define the statistic, because that is what you are using in the formula:

\( \bar{x} \) = the average price of regular gasoline per gallon FROM MY SAMPLE

Assumptions
1. I assume this is a simple random sample of locations.
2. I assume that the 11 samples are independent of each other.

Since n < 15 and I am using a t-distribution, I will assume the population of interest is normal. Below is a normality plot of the 11 data points, and the histogram shows a slightly skewed distribution. It is difficult to show data are normal with such a small sample size. I will, therefore, assume that the population of all gas prices is normally distributed.

Be sure to describe your graph.

You could either make a normality plot or a histogram to show the data is normal.

Mechanics (Name or Formula with substitution, df, and the specific interval)

Note: at the start of the year I do not give the students a choice; they must give the formula with the correct substitutions. If they do not have t-inverse on their calculator, have them use the chart.

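\[ 95\%\ \text{CI} = \bar{x} \pm t^* \frac{\text{sd}}{\sqrt{n}} \]

\[ 95\%\ \text{CI} = 3.4 \pm 2.228 \left( \frac{.26}{\sqrt{11}} \right) = (3.23 \text{ to } 3.58) \]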
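As a check on the mechanics, here is a minimal sketch using only Python's standard library. The critical value 2.228 is taken from the t-table for 95% confidence with df = 11 - 1 = 10, exactly as in the formula above; the variable names are illustrative.

```python
import math
import statistics

prices = [3.12, 3.21, 3.34, 3.67, 3.78, 3.10, 3.12, 3.32, 3.45, 3.56, 3.78]

n = len(prices)
mean = statistics.mean(prices)
sd = statistics.stdev(prices)   # sample standard deviation (divides by n - 1)
t_star = 2.228                  # from the t-table: 95% confidence, df = 10

margin = t_star * sd / math.sqrt(n)
low, high = mean - margin, mean + margin
print(f"mean = {mean:.2f}, sd = {sd:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

The rounded endpoints agree with the interval (3.23 to 3.58) given above.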

Conclusion

Interpretation of the INTERVAL: I am 95% confident that the true mean price of regular gasoline in the area is between $3.23 and $3.58 per gallon.

Interpretation of the LEVEL: If I did this process again and again, approximately 95% of the resulting intervals would capture the true mean price of gas.

Two Sample T-Test

A teacher wants to know if the method of instruction affects how well students learn. Using two classes of the same level of statistics, she teaches one class using lecture only and the other class using lecture and group work. She measures the level of learning by giving both classes the same test. Assuming that the two classes are representative of all statistics students, what type of inference procedures should be used? State the hypotheses for the appropriate test and identify the inference procedure you would use. Justify your response and include comments on the design of the study.

Solution

Procedure type: Two-sample t-test

\( H_0: \mu_L = \mu_G \)
\( H_a: \mu_L \neq \mu_G \)

where \( \mu_L \) represents the mean score of the tests in the class where lecture only was used and \( \mu_G \) represents the mean score of the tests in the class where lecture and group work were both used.

Notes

The response variable, scores of individual students, is numerical, and there are two independent groups, classes with lecture only and classes with both lecture and group work. This leads us to conclude it is a difference of means two-sample t problem. The teacher is looking to see if one method is different from the other, which would indicate a two-tailed test.

Matched Pair T- Test Situation

Having done poorly on their math final exam in June, six students repeat the course in summer school, and then take another exam in August. If we consider these students representative of all students who might attend this summer school in other years, do these results provide evidence that the program is worthwhile? Show all parts

June   54   49   68   66   62   62
Aug    50   65   74   64   68   72

This is a matched pair because we are taking two measurements from one experimental unit. The treatment is the summer course remediation. Any pre- and posttest is a Matched Pair design and we look at the Difference. This is really a one-sample t-test on the difference. We call it a matched pair because each difference comes from two data points. These data points, however, are NOT independent. The same person is taking the test in June and in August.

What to look for when making a decision about whether to use a “Matched Pair” test for the difference or a “Two sample” test:

• Data must be paired for a matched pair test. Pairing is not a problem, it is an opportunity. The independence assumption is violated, but we can actually do much better than the two-sample t-test. After all, we should be focusing on the changes.
• You make a decision about whether the data are paired from understanding how they were collected and what they mean. There is no test to determine whether the data are paired. This comes from reading the problem.
• Mechanically, a matched pair t-test is just a one-sample t-test for the mean of the paired differences. The sample size is the number of pairs.

Things to remember
• Don’t use a two-sample t-test for paired data.
• Don’t use a paired-t method when the samples aren’t paired. When two groups do not have the same number of values, it’s pretty easy to see that they can’t be paired. But just because two groups have the same number of observations doesn’t mean they can be paired, even if they are shown side-by-side in a table. We might have 25 men and 25 women in the study, but they might be completely independent of one another.
• There is most often less variability in the matched pair design than in the two-sample t-test.
• Matching pairs generally removes so much extra variation that it more than compensates for having only half the degrees of freedom.

Decide whether the following situations are paired or two samples.

Define Parameter

\( \mu_D \) - This is the AVERAGE DIFFERENCE: Test score in August – Test score in June. Mean of the differences!

Null and Alternative in terms of the PARAMETER

\( H_0: \mu_D = 0 \)

\( H_A: \mu_D > 0 \)

Assumptions
1. This is a simple random sample of students.
2. The six students are independent of each other.
3. Since I am using a t-distribution with only 6 points and I have the data points, I MUST SHOW A GRAPH OF THE DATA. I will assume the population is NORMAL.

OR

Since the Normality plot shows an outlier, I will assume that the POPULATION of interest is normal. The histogram does not appear normal because there is a gap. It is difficult to show that a sample this small is normal. That is why we assume that the population of interest is normal. Many students did not include a normality plot or a histogram.

Test statistic, type of test by name or formula, p-value, degrees of freedom

Matched Pair t-test
t score is 1.75
degrees of freedom is 5
P(t > 1.75) = .0699

\[ t = \frac{5.33 - 0}{7.44 / \sqrt{6}} = 1.75 \]

Conclusion

There is approximately a .07 probability that a mean DIFFERENCE of 5.33 or larger would happen by chance alone if the true mean difference in scores (August – June) were 0. I will NOT reject the Null at \( \alpha = .05 \). There is not sufficient evidence that the summer school program raises scores.
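The matched pair mechanics can be sketched with the standard library alone. The t statistic is computed from the differences; the p-value is found by numerically integrating the t density with Simpson's rule, which is an assumption of this sketch (the handout uses the calculator). The helper names t_pdf and upper_tail are made up for illustration.

```python
import math
import statistics

june = [54, 49, 68, 66, 62, 62]
august = [50, 65, 74, 64, 68, 72]

# Work with the paired differences (August - June), one per student.
diffs = [a - j for a, j in zip(august, june)]
n = len(diffs)
df = n - 1
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)
t = (mean_d - 0) / (sd_d / math.sqrt(n))

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom at x."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on the density from 0 to t."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_value = upper_tail(t, df)
print(f"t = {t:.2f}, df = {df}, p-value = {p_value:.4f}")
```

Note that the two lists are never treated as separate samples; only the six differences enter the calculation.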

Proportion Problem (Two Sample)

Notes about the standard error for the two-proportion test and two-proportion confidence interval

The standard error for a two-proportion test is

\[ \sqrt{ p_C (1 - p_C) \left( \frac{1}{n_1} + \frac{1}{n_2} \right) } \]

\( p_C \): This is the combination of the proportions. The book does not use this notation, but I use it to say it is a combination of both proportions. For example, if

\[ p_1 = \frac{12}{30} \text{ and } p_2 = \frac{15}{40}, \text{ then } p_C = \frac{27}{70} \]

When you are doing a TEST you are stating that the two proportions are equal so you can pool the standard error. When you pool the proportions you combine both the values and create a proportion based on both samples.
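A small sketch of the pooling step, using the 12/30 and 15/40 example from these notes. The variable names are illustrative, and the standard error line simply applies the pooled formula for a two-proportion test.

```python
import math

x1, n1 = 12, 30   # successes and sample size, first sample
x2, n2 = 15, 40   # successes and sample size, second sample

# Pool: combine the successes and combine the sample sizes (27/70 here).
p_c = (x1 + x2) / (n1 + n2)
se_pooled = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
print(f"p_C = {p_c:.4f}, pooled SE = {se_pooled:.4f}")
```

Notice that the pooled proportion is not the average of 12/30 and 15/40; it weights each sample by its size.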

Confidence Interval

When you are doing a confidence interval this is not the case. In this situation the standard error is

\[ \sqrt{ \frac{p_1 (1 - p_1)}{n_1} + \frac{p_2 (1 - p_2)}{n_2} } \]

So the formula for a two-proportion confidence interval is

\[ \text{\_\_\_\_\% CI} = (p_1 - p_2) \pm z^* \sqrt{ \frac{p_1 (1 - p_1)}{n_1} + \frac{p_2 (1 - p_2)}{n_2} } \]

The z* has to match the confidence level. Remember how to find z*: compute (1 - c) / 2, where c is the confidence level. This gives the area under the curve in each tail. Use InvNorm to find the z* matching the confidence level.
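The z* lookup and the unpooled interval can be sketched as follows. Python's statistics.NormalDist().inv_cdf plays the role of InvNorm here; the function names z_star and two_prop_ci are hypothetical, and the 12/30 and 15/40 counts are reused from the pooling example purely as an illustration.

```python
import math
from statistics import NormalDist

def z_star(confidence):
    """Critical value with area (1 - confidence) / 2 in each tail."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

def two_prop_ci(x1, n1, x2, n2, confidence):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z_star(confidence) * se
    return (p1 - p2) - margin, (p1 - p2) + margin

print(round(z_star(0.95), 3))
print(two_prop_ci(12, 30, 15, 40, 0.95))
```

The interval does not pool the proportions, because a confidence interval makes no assumption that the two proportions are equal.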

Remember for a confidence interval you will be defining the p from your sample.

Situation A Gallup Poll taken in May 2000 asked the question: “In general, do you feel that the laws covering the sale of firearms should be made: more strict, less strict, or kept as they are now?” Of the n = 493 men who responded, 52% said “more strict,” while of the n = 538 women who responded, 72% said “more strict.” Assuming these respondents constitute random samples of U.S. men and women, is there sufficient evidence to conclude that a higher proportion of women than men in the population think these laws should be made stricter? Justify your answer.

Procedure type: Difference of two proportions z test

\( H_0: p_w - p_m = 0 \)    OR    \( H_0: p_w = p_m \)
\( H_a: p_w - p_m > 0 \)          \( H_a: p_w > p_m \)

where \( p_m \) and \( p_w \) represent the proportion of men and women respectively who support “more strict” laws in the sale of firearms.

Notes

Performing a large-sample difference of two proportions test solves this problem. This is evident because there are two populations that we are studying, men and women. The problem also asks students to find evidence of a higher proportion of women than men who think that the laws on the sale of firearms should be stricter, indicating a one-tailed test. Students may need to be reminded that, in calculating the test statistic for the hypothesis test, they must use a pooled estimate for the proportion. In a hypothesis test we are assuming the null hypothesis is true, and the null hypothesis assumes population proportions for men and women are equal.

Students need to check the condition that the sample size is large enough. One rule for checking this would be: \( n_w p_w \geq 10 \), \( n_w (1 - p_w) \geq 10 \) and \( n_m p_m \geq 10 \), \( n_m (1 - p_m) \geq 10 \), where \( n_m \) and \( n_w \) represent the number of men and women respectively. Some texts use: \( n_w \hat{p} \geq 10 \), \( n_w (1 - \hat{p}) \geq 10 \) and \( n_m \hat{p} \geq 10 \), \( n_m (1 - \hat{p}) \geq 10 \), where \( \hat{p} \) represents the pooled estimate of p. Other texts use 5 in place of 10.

Possible incorrect solutions would include difference of two means or Chi-Square. Difference of means cannot be correct because we have proportions as opposed to average percentages. If students are confused on this issue, ask them what the original data must look like. Is it numerical or categorical? For answers of categorical, tests of proportion are correct. For answers of numerical, t-tests are appropriate. In this case the raw data must be in the form “more strict,” “less strict,” or “kept as they are now.” This is categorical data. If our original data were presented as a list of percentages for many different samples (which is numerical), a t-test would be used.

Students could also attempt a solution using Chi-Square. Since the Chi-Square alternative is always two-sided (it cannot test for a particular direction) and we are doing a one-tailed test, this would not be appropriate. Chi-Square could work as an alternative solution if this example were not one-tailed.

It is important that you define the Null and Alternative Hypothesis in terms of the PARAMETER

State Null and Alternative in terms of parameters

HO: Pm = Pf
HA: Pm < Pf

Define Parameters in context

Pm = The proportion of males who think that the laws about gun control should be MORE STRICT
Pf = The proportion of females who think the laws about gun control should be made more strict

Give Assumptions for both groups Individually

Males
1. I assume this is a simple random sample of males from the population
2. I assume that the 493 males are independent
3. np > 10: .52(493) > 10, 256 > 10 and n(1-p) > 10: 493(.48) > 10, 236 > 10

Show numerical values

Females
1. I assume that this is a simple random sample of females from the population
2. I assume that the sample of 538 females are independent of each other
3. np > 10: .72(538) > 10, 387 > 10 and n(1-p) > 10: .28(538) > 10, 150 > 10

Show numerical values

4. I assume that the 493 males sampled are independent of the 538 females

Give formula with test statistic, zscore and pvalue

All the values you need for the formula are on the calculator screen

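\[ z = \frac{\hat{p}_m - \hat{p}_f - 0}{\sqrt{ p_C (1 - p_C) \left( \frac{1}{n_F} + \frac{1}{n_M} \right) }} = \frac{.52 - .72 - 0}{\sqrt{ .62(.38) \left( \frac{1}{538} + \frac{1}{493} \right) }} = -6.623 \]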

If you had entered the females first (females – males), the z-score would be +6.623 and the p-value would be the area in the right tail. The p-value would be the same. Remember the p-value will never be exactly zero even though that is what the calculator gives. There is always some tiny area, so state the p-value as APPROXIMATELY equal to zero.

Conclusion (p-value, difference, context, reject or not reject, alpha level)

There is approximately a 0% probability that a DIFFERENCE of .20 or larger would happen by chance alone if the true difference between the proportions of males and females that prefer a stricter gun control law were 0. I will reject the Null at \( \alpha = .05 \).
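The test statistic can be reproduced with a short sketch. The counts 256 and 387 are the rounded values of .52(493) and .72(538) shown in the assumptions; using whole counts is an assumption of this example, and NormalDist().cdf stands in for the calculator's normal-area function.

```python
import math
from statistics import NormalDist

x_m, n_m = 256, 493   # men who said "more strict" (about 52% of 493)
x_f, n_f = 387, 538   # women who said "more strict" (about 72% of 538)

p_m, p_f = x_m / n_m, x_f / n_f
p_c = (x_m + x_f) / (n_m + n_f)                        # pooled proportion
se = math.sqrt(p_c * (1 - p_c) * (1 / n_m + 1 / n_f))  # pooled standard error
z = (p_m - p_f - 0) / se
p_value = NormalDist().cdf(z)   # left-tail area, matching HA: Pm < Pf
print(f"z = {z:.3f}, p-value = {p_value:.2e}")
```

The printed p-value is tiny but not zero, which is why the conclusion says "approximately".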

Chi Square Test

Chi Square test is used when you have categorical data and are comparing a number of proportions. There are three basic types of Chi Square Test:
• Test of Goodness of Fit
• Test for Independence
• Test for Homogeneity

The Goodness of Fit is testing whether a particular distribution is as stated. This is comparing many different proportions. Only one of the proportions has to be statistically off, not all of the proportions. These types of problems only have one row and 2 or more columns.

The Test for Independence, also known as the Test for Association, is testing whether there is a relationship between two categorical variables, for example gender and grades. The problem will have more than one row.

(Frog Problem 2009 #3)

What does the Chi Square graph look like?

Chi Square for each degree of freedom has a different type of graph.

As the degrees of freedom increase, the graph becomes less skewed and is not as tall. Because the shape changes, the same Chi Square value gives a different tail area (p-value) for different degrees of freedom. You will only be using a one-sided (right-tail) test for Chi Square.

The Goodness of Fit test has one row, and the Null states that the distribution is as the manufacturer states, or perhaps as the newspaper states.

Example Problem

Arnold Palmer states that the percentages of hair color for newborn babies are as follows: 10% Red, 20% Black, 20% Brown and 50% Bald. You take a random sample of 200 babies and find the result

25 red 60 black 50 Brown 65 Bald

You could do a Chi Square Test for Goodness of Fit

HO: The distribution of newborn hair color is as Arnold Palmer states: 10% red, 20% Black, 20% Brown, and 50% Bald

HA: The distribution of newborn hair color is NOT as Arnold Palmer states: 10% red, 20% Black, 20% Brown, and 50% Bald

Assumptions
1. The data are COUNTS
2. We assume this is a simple random sample of 200 newborn babies
3. We assume the 200 babies are independent of each other
4. All expected counts are greater than 5 (20, 40, 40, 100). YOU MUST SHOW THE EXPECTED COUNTS.

There are two ways to look at assumption 4. We can say that all expected counts are greater than 1 and no more than 20% of the expected counts are less than 5. This is about EXPECTED counts not the actual data.

The degrees of freedom is the number of columns you have minus 1: column – 1

List the data in a table.

            Red            Black          Brown          Bald
Observed    25             60             50             65
Expected    .1(200) = 20   .2(200) = 40   .2(200) = 40   .5(200) = 100

Chi Square is found by taking the total of (Observed - Expected)^2 / Expected. This is the symbol for Chi Square:

\[ \chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(25 - 20)^2}{20} + \frac{(60 - 40)^2}{40} + \frac{(50 - 40)^2}{40} + \frac{(65 - 100)^2}{100} = 1.25 + 10 + 2.5 + 12.25 = 26 \]
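The same arithmetic can be sketched in a few lines. The dictionary names are illustrative; the claimed percentages and observed counts are the ones from the hair color example.

```python
claimed = {"Red": 0.10, "Black": 0.20, "Brown": 0.20, "Bald": 0.50}
observed = {"Red": 25, "Black": 60, "Brown": 50, "Bald": 65}

n = sum(observed.values())
# Expected count = claimed proportion times the sample size.
expected = {color: share * n for color, share in claimed.items()}
# Add up (Observed - Expected)^2 / Expected over all categories.
chi_square = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)
df = len(observed) - 1
print(expected)
print(f"chi-square = {chi_square:.2f}, df = {df}")
```

With four categories the degrees of freedom are 4 - 1 = 3.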

Parts of the Chi Square Test

• State the Null and Alternative (if this is written in words then you have already defined the variables)
• Give the assumptions
  1. The data are counts
  2. SRS of ______ from _____
  3. The samples are independent
  4. All expected counts are greater than 5 (show expected counts)
• Chi Square value, degrees of freedom and P-value
• Conclusion

There is a ____% probability that a chi square value of ____ or larger would happen by chance alone if the true distribution were _____. I ______ reject the Null at \( \alpha = .05 \). (Be sure to connect the conclusion to the context of the problem.)

OR

There is a ___% probability that I would get this Chi Square value or larger by chance alone. Therefore, I _____ reject the Null at alpha = .05. Therefore it appears that ---- and ---- are or are not independent.

M&M Activity (on your own)

Show all parts for this activity, including the table with observed and expected counts. Show all the mechanical steps to find your Chi Square value

Color     M&M Plain   M&M Peanut   Peanut Butter (Both Types)
Brown     13%         12%          10
Yellow    14%         15%          20
Red       13%         12%          10
Blue      24%         23%          20
Orange    20%         23%          20
Green     16%                      20

Dark Choc.: Uniform Distribution

Review of Chi Square

There are several procedures to compare the distribution of categorical data. When a categorical variable has multiple categories, when there are two categorical variables under consideration, or when there are multiple populations under study, a Chi Square test is used.

Goodness of Fit: One Row. One categorical variable with multiple categories from ONE population

Test for Homogeneity: More than one row. (One categorical variable with multiple categories from two or more populations.) The test compares the distribution of sample counts with the distribution expected if the populations have identical distributions.

Test of Independence: More than one row. (Two categorical variables with multiple categories from ONE population.) The test compares the distribution of sample counts with the distribution expected if the two variables are independent.

Test for Homogeneity

Situation: A certain brand of bite-size candies comes in three varieties: creamy, crispy, and chewy. The manufacturer is interested in whether preference for the types of candies differs between three school age groups: elementary, middle, and high school. Random samples at three local schools, one for each age group, are taken and the sample data compiled in the table below.

                      Variety
Population     Creamy   Crispy   Chewy
Elementary       33       14       19
Middle           21       16       17
High School      16       12       32

*Do you notice that I have three different samples here, so I am looking at three different populations. This is what tells me it is a test for Homogeneity and not Association.

HO: P creamy in elementary = P creamy in middle school = P creamy in high school
    P crispy in elementary = P crispy in middle school = P crispy in high school
    P chewy in elementary = P chewy in middle school = P chewy in high school

HA : At least one of the statements in HO is not true

Chi Square test for Homogeneity

Assumptions IN CONTEXT
1. Data come from independent simple random samples: OK – each group of students was randomly selected from their respective schools.
2. The samples are independent, or the size of each sample is less than 10% of its population size: OK as long as there are at least 660 elementary, 540 middle, and 600 high school students in the respective populations.
3. All expected counts are at least 5 – see the expected counts below. It is important that you indicate exactly where to find the expected counts.

Calculate the Chi Square test statistic

The expected counts are found by taking (Row Total)(Column Total) and dividing this by the Grand Total.

The degrees of freedom is (# of rows – 1) ( # of columns – 1)

               Creamy   Crispy   Chewy   TOTAL
Elementary       33       14       19      66
Middle           21       16       17      54
High School      16       12       32      60
TOTAL            70       42       68     180

Expected Counts

               Creamy          Crispy          Chewy
Elementary     (66)(70)/180    (66)(42)/180    (66)(68)/180
Middle         (54)(70)/180    (54)(42)/180    (54)(68)/180
High School    (60)(70)/180    (60)(42)/180    (60)(68)/180

Chi Square is found by taking the total of (Observed - Expected)^2 / Expected. This is the symbol for Chi Square:

\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]

The calculator will automatically store the expected counts in Matrix B. This will only work when you have more than one row.

The conclusion

There is approximately a 2% probability that a Chi Square value of 11.552 or larger would happen by chance alone if all the proportions of creamy, crispy and chewy candy were the same for the three types of schools. I will reject the Null at \( \alpha = .05 \).

OR

There is approximately a 2% probability that, if the preference for candies were distributed in the same proportion among all three age groups of students, a chi-square statistic of 11.552 or larger would occur. Reject the Null at \( \alpha = .05 \).
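The expected counts, the statistic and the p-value for the candy table can be sketched as below. The p-value line uses a closed form for the right-tail area that holds only for 4 degrees of freedom; that shortcut is an assumption of this sketch, not the calculator method described in these notes.

```python
import math

table = {
    "Elementary": [33, 14, 19],
    "Middle": [21, 16, 17],
    "High School": [16, 12, 32],
}   # columns: Creamy, Crispy, Chewy

row_totals = {school: sum(counts) for school, counts in table.items()}
col_totals = [sum(col) for col in zip(*table.values())]
grand_total = sum(col_totals)

chi_square = 0.0
for school, counts in table.items():
    for observed, col_total in zip(counts, col_totals):
        # Expected count = (row total)(column total) / grand total.
        expected = row_totals[school] * col_total / grand_total
        chi_square += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(col_totals) - 1)
# For df = 4 only, the right-tail area is e^(-x/2) * (1 + x/2).
p_value = math.exp(-chi_square / 2) * (1 + chi_square / 2)
print(f"chi-square = {chi_square:.3f}, df = {df}, p-value = {p_value:.3f}")
```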

Test for Independence

The Chi Square test of independence is used to compare the distribution of sample counts of two categorical variables from a SINGLE population to see if there is an association between the variables. For instance, parents of incoming freshmen in a large school district were asked if they supported school uniforms. Parents were classified by whether or not they favored uniforms, and by the type of uniform policy.

Situation

According to the Orlando Sentinel the proportion of schools that allow cell phones depends on the population of students at the school. This is the data they have collected. Complete an appropriate significance test. Show all parts.

                   Cell Phone
Population         Yes    No
Less than 500       12    23
500-1000            11     5
1001-1500           16    34
1501-2000           15    17
2001-2500           18    24
2501-3000           12    32
Over 3000           34    30

Solution

This is a Chi Square test for Independence. There is a subtle difference in this problem. We are assuming this is all from one sample, and then we separated the schools by size and gathered the data. If this had been taken from individual samples, then we would do a test for Homogeneity and be comparing the proportions. Both versions of the Null and Alternative below will be accepted, because the stem of the problem was not clear about how the sample was gathered.

Null and Alternative

Ho: Whether a school allows cell phones is independent of the size of the school
HA: Whether a school allows cell phones is not independent of the size of the school

OR

Ho: The proportion of schools that allow cell phone use is the same for each of the 7 different size schools
HA: The proportion of schools that allow cell phone use is NOT the same for at least one of the 7 different size schools

Assumptions
1. I assume this is a simple random sample of schools
2. I assume that the schools in the sample are independent of each other
3. All expected counts are greater than 5. The approximate value of each is as follows: 14, 20, 6, 9, 20, 29, 13, 18, 17, 24, 18, 25, 26, 37

Show the expected counts. You get the expected counts by looking at Matrix B. You do not have to do anything except put the values in Matrix A. The calculator will do the rest. You need to remember how the expected values are calculated because this could be asked on a free response. They could also show you a computer printout of the values.

Define Chi Square test, give test statistic, degrees of freedom and p-value

Chi Square test for independence
Chi Square is 15.12
Degrees of freedom: 6
\( P(\chi^2_6 > 15.12) = .019 \)

Conclusion

There is a .019 probability that I would get a Chi Square value of 15.12 or larger by chance alone if cell phone policy were independent of school size. I will reject the Null at \( \alpha = .05 \). It appears that the size of the school is not independent of cell phone use.
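For a table of any size, the calculation can be sketched with two small helpers. Both function names are hypothetical. The tail-area helper uses the standard series for the regularized lower incomplete gamma function, which is an assumption of this sketch standing in for the calculator's Chi Square area function.

```python
import math

def chi_square_test(rows):
    """Return (statistic, df, expected counts) for a two-way table of counts."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    total = sum(row_totals)
    expected = [[rt * ct / total for ct in col_totals] for rt in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(rows, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df, expected

def chi_square_upper_tail(x, df):
    """P(chi-square > x) for x > 0, via the lower incomplete gamma series."""
    a, z = df / 2, x / 2
    term = 1 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1 - lower

# Rows: school size groups; columns: cell phones allowed (Yes, No).
schools = [[12, 23], [11, 5], [16, 34], [15, 17], [18, 24], [12, 32], [34, 30]]
stat, df, expected = chi_square_test(schools)
p_value = chi_square_upper_tail(stat, df)
print(f"chi-square = {stat:.2f}, df = {df}, p-value = {p_value:.3f}")
```

The expected list plays the role of Matrix B: it holds (row total)(column total) / (grand total) for every cell.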

Test for Goodness of Fit Example

According to USA Today, 20% of children are blond, 40% have brown hair, 10% have red hair, and 30% have black hair. Mrs. Swaynos has taken a random sample of children and found the following results:

32 Blond 48 Brown 15 Red 40 Black

Is there evidence to think USA Today is not correct?

Parts to Chi Square for Goodness of Fit

1. Define Null and Alternative in words
2. Give the assumptions
   • SRS of _____ from ____
   • Data are counts
   • n < 10% of the population
   • All expected counts are greater than 1
   • Not more than 20% of expected counts are less than 5
3. Give the Chi Square value showing the work
4. Give the p-value
5. Give the conclusion

There is a ______% probability that I would get a result of ____ OR LARGER by chance alone if the true proportion were as stated. I ______reject the Null at alpha = ______. Make a statement that connects to the problem.

Notes about Chi Square

There are three types of Chi Square problems: Goodness of Fit, Test for Independence, and Test for Homogeneity.

The difference between Independence and Homogeneity is that Independence comes from one sample from one population, while Homogeneity comes from two or more separate samples. The way the sample is taken and the question you are trying to answer dictate the type of test. A test for Goodness of Fit has only ONE row. A test for Independence or Homogeneity has at least two rows. Tests for Independence and Homogeneity can be done on the calculator. The goodness of fit must be done by hand.

Example for Goodness of Fit

According to USA Today, Skittles colors are equally distributed. Mrs. Swaynos has opened a package of Skittles and found the following results: 12 yellow, 14 red, 22 green and 28 purple. Do a statistical test for this.

Chi Square Test for Independence

According to Ms. Michalik, the number of juniors and seniors attending prom, by gender and amount spent, is as follows:

           Spend 0-100   Spend 101-250   Spend 251-400   Spend over 400
Males          21             32              12               8
Females         4             32              23              30

Chi Square Test for Homogeneity

Ms. Michalik took a sample from Seminole, Oviedo, and Lake Mary and found out how much people spend on prom

             Spend 0-100   Spend 101-250   Spend 251-400   Spend over 400
Oviedo           34             56             123              200
Seminole         12             25             231              129
Lake Mary         5             34             129              321

Assumptions
1. SRS of ______ from ______
2. Sample < 10% of the population
3. All expected counts are > 1
4. No more than 20% of Expected Counts can be less than 5
5. Make sure the data are counts.

Degrees of Freedom (r-1) (c-1) for independence and homogeneity. For a goodness of fit it is (c – 1) for degrees of freedom.

Conclusion

There is a __(p-value)__% probability that a result of __(value from your sample)__ or larger would happen by chance alone if the true __(Null)__ were ______. I __(will or will not)__ reject the Null at alpha = .05

Conclusion must always connect to the context of the problems.

Hypothesis Test for Slope

The hypothesis test for slope allows us to determine if there is a useful linear relationship between x and y in the population. That is, does the slope of the population model differ from 0 – does y tend to change linearly with changes in x? If there is a linear relationship between the two variables, the slope should not equal 0. The Null will state:

\( H_0: \beta = 0 \)

Here \( \beta \) is used as the symbol for the slope of the population regression line. This is the same symbol that is often used for the probability of a Type II error, so be careful and make sure you define the variable of interest.

The Alternative will most often say \( \beta \neq 0 \), but it could also say \( \beta < 0 \) or \( \beta > 0 \). The formula to compute the test statistic (t score) is

\[ t = \frac{b_1}{SE_{b_1}} \]

where \( b_1 \) is the slope of the sample and \( SE_{b_1} \) is the standard error of the slope.

The number of degrees of freedom is n-2 because you are looking at two variables for each data set. The P-value will be computed using the t-distribution.
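To show where the slope, its standard error and the t score come from, here is a minimal sketch. The data are hypothetical, invented only to illustrate the arithmetic, and the variable names are not from the handout. The formulas are the usual least-squares ones: slope = Sxy / Sxx and SE of the slope = s / sqrt(Sxx), where s is the standard deviation of the residuals with n - 2 in the denominator.

```python
import math

# Hypothetical data, invented only to illustrate the arithmetic.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = s_xy / s_xx            # sample slope
b0 = y_bar - b1 * x_bar     # sample intercept
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
s = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))   # residual standard deviation
se_b1 = s / math.sqrt(s_xx)                               # standard error of the slope
t = b1 / se_b1              # test statistic for a null slope of 0
df = n - 2
print(f"b1 = {b1:.3f}, SE = {se_b1:.4f}, t = {t:.2f}, df = {df}")
```

On a computer printout, b1 and its standard error usually appear side by side in the row for the explanatory variable, and dividing one by the other gives the t score.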

The assumptions
1. SRS of ____ from _____
2. The scatter plot looks linear
3. The residuals do not show a pattern
4. The residuals follow a NORMAL distribution

Conclusion

There is a _____% probability I would get a slope of ______ given the true slope is 0. I ______ reject the Null at α = .05. (Include a statement that ties this in with the CONTEXT of the problem.)

Key Words in Advanced Placement Statistics Questions

If it says... / Then you must address...

Describe the distribution (numerical)
• Center
• Shape
• Spread
• Gaps
• Outliers
G-SOCS

Describe the distribution (categorical) / compare
• Make sure the graph is appropriate
• Compare each of the sets (2009 #1)
• Label and scale
• If the data are not the same size then you must use relative frequency as the y axis

Compare the distributions
• Use words like bigger, smaller, wider
• You must compare the center, shape and spread
• When comparing the shape just list the types of distribution
• Be very careful using the word NORMAL; not all symmetric distributions are NORMAL but all NORMAL distributions are symmetric.

Design an experiment
• Random allocation of experimental units to treatments (RAT)
• Define what tool you will use to randomize
• Same size groups
• Clearly define groups
• Replication: enough experimental units
• Control: control for lurking variables (often by blocking)
• Draw picture
• Define what you will compare and give units
• Write in sentences

What additional information...
• You must state NEW information, not what is previously stated

Based on the diagram above...
• You must use the graph or diagram that is provided

Based on parts a and b...
• You must use your answers from the previous parts

Give appropriate statistical evidence / Provide evidence / Conduct an appropriate analysis
• Must perform a test of significance. Hypothesis test choices:
  o One sample t
  o Two sample t
  o One sample proportion z test
  o Two sample proportion z test
  o Chi Square
  o Linear Regression
• Must show all parts (Null & Alternative, name test, assumptions, test statistic, p value, df, conclusion)

In context of the question, drawing, table, graph, etc.
• You must connect back to the stem of the problem.

Explain the slope
• ___ change in y is PREDICTED for every 1-unit change in x. CONTEXT
• Fudge factor or wiggle word must be there

Explain the coefficient of determination
• R²% of the variation in Y is explained by the LSRL of Y on X. CONTEXT

Explain "r"
• Strength, direction, context

At most
• Means x or less

No more than
• Means x or less

At least
• Means x or greater

Conclusion/linkage
• P-value, alpha level, context, reject or not reject, conditional probability statement
• Never accept the alternative
• Never reject the alternative
• Never accept the Null

Predict
• Use a linear regression

Confidence Interval
• Do not make your interval special. You are ____% confident that the true ____ is between ___ and ____

Confidence Level
• This is referring to the process and all the other intervals that were obtained.
• If I did this process again and again I would capture the TRUE ____ approximately ___% of the time in the various INTERVALS.

Underlined words
• Few words are underlined; pay attention to what words the author chooses to underline. They direct you to the answer they are looking for.

Describe scatterplots
• Strength, direction, placement of the data, outliers, CONTEXT

Assumptions and Conditions

Each assumption is listed with the condition that supports it.

Proportions (z)

One Sample
1. Assumption: Individuals are independent.
   Condition: SRS of ____ from _____ and n < 10% of the population.
2. Assumption: Samples are sufficiently large.
   Condition: np > 10 and n(1-p) > 10. This is the same as saying there are at least 10 successes and 10 failures in each sample.

Two Samples
1. Assumption: Samples are independent.
   Condition: (Think about how the data was collected.)
2. Assumption: Data in each sample are independent.
   Condition: Both samples are SRS of ___ from ____ and n < 10% of the population OR Random Allocation of Treatment.
3. Assumption: Both samples are sufficiently large.
   Condition: np > 10 and n(1-p) > 10 for both samples, or there are at least 10 successes and 10 failures in each of the two samples.

Means (t)

One sample (df = n-1)
1. Assumption: Individuals are independent.
   Condition: SRS of ___ from ___ and n < 10% of the population.
2. Assumption: Population distribution is normal.
   • If data is given you must show a normality plot or histogram and check the data.
   Condition: For n < 15 the data comes from a normal population. For 15 < n < 30 the data has no outliers or skewness. For n > 30 the sample is considered sufficiently large and outliers could be a problem. The data does not have to be normal.

Matched Pair (df = n-1)

1. Assumption: Data are matched.
   Condition: (Think about how the data was collected.)
2. Assumption: Individuals are independent.
   Condition: SRS of ___ from ___ and n < 10% of the population OR Random Allocation of Treatment.
3. Assumption: Population of differences is Normal.
   • If data is given you must show a normality plot or histogram and check the data of the differences.
   Condition: For n < 15 the data comes from a normal population. For 15 < n < 30 the data has no outliers or skewness. For n > 30 the sample is considered sufficiently large and outliers could be a problem. The data does not have to be normal.

Two independent samples (df from technology)

1. Assumption: Samples are independent.
   Condition: (Think about the design.)
2. Assumption: Data in each sample are independent.
   Condition: SRS of ___ from ___ and n < 10% of the population OR Random Allocation of Treatment for each of the samples.
3. Assumption: Both populations have a Normal distribution.
   Condition: The data for each sample is approximately normal for n < 15. For 15 < n < 30 both data sets have no outliers or skewness. For n > 30 both data sets are considered sufficiently large and outliers could be a problem. The data does not have to be normal.
4. Assumption: Data is independent of each other.
   Condition: (Think about how the data was collected.)

AP Statistics Writing Conclusions and Interpretations for Statistical Inference

Note: All conclusions must be connected to the context of the problem. They must include the p-value, alpha level, reject or not reject, and must state "given the Null".

Interpretation of R-square

______% of the variation in ______ is explained by the least squares regression line of ___ on _____ (y, y, x)

Interpretation of SLOPE

The ______ will change by APPROXIMATELY _____ as ____ increases by 1.

Example: y = 3 + 14x, where y = mileage and x = gas

The mileage will increase by approximately 14 as the gas increases by 1 gallon.
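If it helps to see the slope as a change in the PREDICTED value, here is a tiny sketch using the example line y = 3 + 14x. The function name `predicted_mileage` and the gas values plugged in are just for illustration.

```python
def predicted_mileage(gas):
    """Predicted mileage from the example line y = 3 + 14x (x = gallons of gas)."""
    return 3 + 14 * gas


for gallons in (5, 6):
    print(gallons, "gallons -> predicted mileage", predicted_mileage(gallons))

# One more gallon changes the prediction by the slope
print("change for 1 more gallon:", predicted_mileage(6) - predicted_mileage(5))
```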

You may also describe the change as PREDICTED instead of approximate.

Know how to read a computer printout. Know how to find the standard deviation of the residuals from the printout and how to interpret this in context to the problem

Interpretation of r

The correlation describes a LINEAR relationship and has strength and direction. You must address all three of these issues.

Residual

Residual = Observed minus Expected (the predicted value). Points above the line are an UNDER prediction and points below the line are an OVER prediction.
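A quick sketch of the residual rule, using made-up observed and predicted values. The function names `residual` and `describe` are hypothetical and only here for illustration.

```python
def residual(observed, predicted):
    """Residual = observed value minus predicted value."""
    return observed - predicted


def describe(observed, predicted):
    """Say whether the line under predicts or over predicts this point."""
    r = residual(observed, predicted)
    if r > 0:
        return "point is above the line: UNDER prediction"
    if r < 0:
        return "point is below the line: OVER prediction"
    return "point is on the line"


# Hypothetical values: the line predicted 73, but we observed 80
print(residual(80, 73), describe(80, 73))
# Hypothetical values: the line predicted 73, but we observed 70
print(residual(70, 73), describe(70, 73))
```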

Confidence Interval

I am _____% confident that the true ______ is between ____ and ______

Confidence Level

If I repeated this process again and again I would capture the true ______ approximately ___% of the time in the various intervals.

Conclusion for a Linear Regression Slope

With a p-value of ______I will or will not reject the Null at the ___% level. I would get this t score of ____ OR LARGER, SMALLER OR BOTH DEPENDING ON THE PROBLEM by chance alone _____% of the time given that the TRUE slope was 0.

You could also say given that the x and y are independent.

Conclusion for one sample t test or z test

There is a _____ probability that I would get a result of ____ or (larger, smaller, or both) by chance alone given the true mean of ______ (context) ______ is ______. I will or will not reject the Null at the ___% level.

CONTEXT CONTEXT CONTEXT CONTEXT CONTEXT CONTEXT CONTEXT CONTEXT CONTEXT CONTEXT CONTEXT

Conclusion for a two sample t test or z test

There is a ______ probability that I would get a difference of ______ or (larger, smaller, or both) by chance alone given the true difference in ______ (context) is 0. I will or will not reject the Null at alpha equal to .05.

Conclusion for one sample proportion

There is a _____ probability that I would get a proportion of ______ or (larger, smaller, or both) by chance alone given the true proportion of ______ (context) ______ is _____. I will or will not reject the Null at alpha equal to .05.

CONTEXT CONTEXT CONTEXT CONTEXT CONTEXT CONTEXT CONTEXT CONTEXT CONTEXT CONTEXT

Conclusion for two sample proportion

There is a ______ probability that I would get this proportion difference of _____ or LARGER by chance alone given the true difference ______ (context) ______ is 0. I will or will not reject the Null at the ___% level.

Conclusion for Chi Square TEST FOR INDEPENDENCE (you have one sample and are comparing two different categories within the one sample)

There is a _____ probability I would get a Chi Square value of ___ or larger. I will or will not reject the Null at alpha equal to .05. There is a _____ probability I would get (give observed values here and context).

With a p-value of ____ I will or will not reject the Null at the __% level. I would get a chi square value of ___ by chance alone ___% of the time if the ___ and ___ were independent (or I could say if the ___ and ___ had no association).

Conclusion for Chi-Square for TEST OF GOODNESS OF FIT (you have one sample and are comparing the distribution to something that the manufacturer claims)

There is a ____ probability that I would get this Chi Square value ______ or larger by chance alone. I will or will not reject the Null at alpha equal to .05. There is a ____ probability I would get (give the observed values and the context here) by chance alone given the true proportions were (give expected here with context).

Conclusion for Chi-Square for TEST OF HOMOGENEITY (you have two samples and are comparing the distributions of each)

With a p-value of _____ I will or will not reject the Null at the ___% level. I would get a chi square value of ___ or larger by chance alone ___% of the time given that the two (or more) distributions were the same. CONTEXT

Here are a few pointers and reminders to help you do well on the AP Statistics Exam.

The Exam

The AP Stat exam has 2 sections that take 90 minutes each. The first section is 40 multiple choice questions, and the second section is 6 (technically, 4 to 7, but it’s always been 6) free response questions. Each section counts for half of the overall score. The last free response question counts for 25% of the Section II score. You are allowed to use your calculator(s) throughout the exam, and a standard set of formulas and tables is printed right in the test booklet for your use.

General tips for writing free response answers

Understand your obligation as a test taker

You are being evaluated not only on the correctness of your answers, but also on your ability to communicate the methods you used to reach them. The answer is everything you write down, not just the last line or number at the end. Convince the reader that you understand the key concepts in the question. Don’t just give them the numbers and hope they will assume you understand the concepts.

Be smart about multi-part questions

Most AP Stat questions have several parts. Read all the parts before you start answering and think about how they might be related (sometimes they aren’t). If the last part asks you to answer a question based on your results to the previous parts, be sure to actually use your prior results to answer. If you couldn’t do one of the previous parts, make up an answer and explain what you would have done.

Answer the question you are asked

The test writers spend over a year writing these questions. They word them carefully and specifically. Spend more time reading and less time writing to make sure you really understand what is being asked. When you have answered the question asked, stop writing. They give you much more space than you need. Don’t panic because you haven’t used all the space provided.

Answer in context

Most, if not all, AP Stat problems will have a real life context. Make sure your answers include the context. This is especially important when defining symbols/variables and writing conclusions.

Use vocabulary carefully

This isn’t English class. There’s no poetic license here. Terms like normal, independent, and sampling distribution have specific meanings. Don’t say “normal” if you mean “approximately normal” and don’t mix up populations and samples in either words or symbols.

Leave enough time for the last question

The last free response question counts for more points and is designed to take 20 to 30 minutes. At least read it first, and if you feel OK about it, go ahead and answer. If it looks hard, you can save it for the end, but no matter what, when there are 30 minutes left in the test, stop and go to the last question.

Relax

Having met many of the people who write the exam and grading standards, I can assure you they are not out to trick you. They write challenging but straightforward questions designed to give you an opportunity to demonstrate what you have learned. Seize the opportunity and do your best. Keep in mind that you only need to earn roughly 65 to 70% (it varies from year to year) of the points on the exam to get a 5.

Collecting Data

There are 2 broad areas of data collection we cover in AP Stat: Experiments and Sampling. You are expected to know some general concepts and specific techniques related to each area.

Experiments vs. Samples

Many students confuse experimentation with sampling or try to incorporate ideas from one into the other. This is not totally off-base since some concepts appear in both areas, but it is important to keep them straight.

The purpose of sampling is to estimate a population parameter by measuring a representative subset of the population. We try to create a representative sample by selecting subjects randomly using an appropriate technique.

The purpose of an experiment is to demonstrate a cause and effect relationship by controlling extraneous factors. Experiments are rarely performed on random samples because both ethics and practicality make it impossible to do so. For this reason, there is always a concern of how far we can generalize the results of an experiment. Generalizing results to a population unlike the subjects in the experiment is very dangerous.

Blocking vs. Stratifying

Students (and teachers) often ask, "What is the difference between blocking and stratifying?" The simple answer is that blocking is done in experiments and stratifying is done with samples. There are similarities between the two, namely the dividing up of subjects before random assignment or selection, but the words are definitely not interchangeable.

Blocking

In blocking we divide our subjects up in advance based on some factor we know or believe is relevant to the study and then randomly assign treatments within each block. The key things to remember:

1. You don't just block for the heck of it. You block based on some factor that you think will impact the response to the treatment.
2. The blocking is not random. The randomization occurs within each block, essentially creating 2 or more miniature experiments.
3. Blocks should be homogeneous (i.e. alike) with respect to the blocking factor.

For example, I want to find out if playing classical music during tests will result in higher mean scores. I could randomly assign half my students to the room with the music and the other half to the normal room, but I know that my juniors consistently score higher than my seniors, and I want to account for this source of variation in the results. I block according to grade by separating the juniors and seniors first and then randomly assigning half the juniors to the music room and the other half to the normal room. I do the same with the seniors. For this design to be valid, I have to expect that each grade will respond to the music similarly. In other words, I know that juniors will score higher, but I expect to see a similar improvement or decline in both groups as a result of having the music. At the end of my study I can subtract out the effect of grade level to reduce the unaccounted for variation in the results.
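The classical music design above can be sketched in a few lines. This is only an illustration: the student labels, group sizes, and the function name `assign_within_blocks` are made up, and a fixed seed is used just so the run can be repeated.

```python
import random


def assign_within_blocks(blocks, treatments=("music", "no music"), seed=None):
    """Randomly assign half of each block to each treatment.

    The blocks themselves are NOT random; the randomization happens inside each block.
    """
    rng = random.Random(seed)
    assignment = {}
    for block_name, subjects in blocks.items():
        shuffled = list(subjects)
        rng.shuffle(shuffled)              # randomize the order within this block only
        half = len(shuffled) // 2
        for subject in shuffled[:half]:
            assignment[subject] = (block_name, treatments[0])
        for subject in shuffled[half:]:
            assignment[subject] = (block_name, treatments[1])
    return assignment


# Hypothetical class: 8 juniors and 6 seniors
blocks = {
    "juniors": [f"J{i}" for i in range(1, 9)],
    "seniors": [f"S{i}" for i in range(1, 7)],
}
assignment = assign_within_blocks(blocks, seed=1)
for subject, (block, treatment) in sorted(assignment.items()):
    print(subject, block, treatment)
```

Notice that every block ends up with both treatments, which is what lets you compare music to no music separately within juniors and within seniors.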

You have learned how to analyze the results of one special type of blocked design, namely, matched pairs. In matched pairs you subtract each pair of values which eliminates the variation due to the subject. Similar techniques are available for fancier blocked designs.
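As a rough sketch of the matched pairs idea, the subtraction and the one sample t score on the differences look like this. The before/after scores are made up for illustration; the p-value would still come from the t-distribution with n-1 degrees of freedom.

```python
import math
import statistics

# Hypothetical scores for the same 5 subjects measured twice
before = [72, 80, 65, 90, 78]
after = [75, 84, 66, 91, 83]

# Subtracting within each pair removes the subject-to-subject variation
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)    # sample standard deviation of the differences

# One sample t score on the differences, testing a true mean difference of 0
t = mean_diff / (sd_diff / math.sqrt(n))
df = n - 1

print("differences:", differences)
print("mean difference =", mean_diff, " t =", round(t, 2), " df =", df)
```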

Stratified Sampling vs. Cluster Sampling

Many students confuse stratified and cluster sampling since both of them involve groups of subjects. There are 2 key differences between them. First, in stratified sampling we divide up the population based on some factor we believe is important, but in cluster sampling the groups are naturally occurring (I picture schools of fish). Second, in stratified sampling we randomly select subjects from each stratum, but in cluster sampling we randomly select one or more clusters and measure every subject in each selected cluster. (Note: There are more advanced techniques in which samples are taken within the cluster(s).)
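The second difference is easy to see in code. In this sketch the student labels and the function names `stratified_sample` and `cluster_sample` are made up for illustration; the same three groups are used both ways only to show how the selection step differs.

```python
import random


def stratified_sample(strata, per_stratum, rng):
    """Randomly select `per_stratum` subjects from EVERY stratum."""
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and keep EVERY subject in each selected cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical students grouped by school
schools = {
    "Oviedo": ["O1", "O2", "O3", "O4"],
    "Seminole": ["S1", "S2", "S3", "S4"],
    "Lake Mary": ["L1", "L2", "L3", "L4"],
}

rng = random.Random(3)
stratified = stratified_sample(schools, per_stratum=2, rng=rng)
cluster = cluster_sample(schools, n_clusters=1, rng=rng)

print("stratified:", stratified)   # some students from every school
print("cluster:", cluster)         # every student from the chosen school(s)
```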

Final Thoughts

It is especially important to stay focused when answering questions about design. Too many students get caught up in minor details but miss the big ideas of randomization and control. Always remember that your mission in responding to questions is to demonstrate your understanding of the major concepts of the course.

Describing Data

IQR is a number

Many students write things like "The IQR goes from 15 to 32". Every AP grader knows exactly what you mean, namely, "The box in my boxplot goes from 15 to 32.", but this statement is not correct. The IQR is defined as Q3 - Q1, which gives a single value. Writing the statement above is like saying "17 goes from 15 to 32." It just doesn't make sense.
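A small sketch makes the point that the IQR is one number. The data set is made up, and Python's `statistics.quantiles` uses one particular quartile convention, so your calculator may give slightly different quartiles for the same data.

```python
import statistics

# Hypothetical data set, for illustration only
data = [12, 15, 18, 21, 22, 25, 28, 32, 40]

# quantiles(..., n=4) returns the three cut points: Q1, the median, and Q3
q1, median, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1   # a single value, not a range

print("Q1 =", q1, " Q3 =", q3)
print("IQR =", iqr)
```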

Be able to construct graphs by hand

You may be asked to draw boxplots (including outliers), stemplots, histograms, or other graphs by hand. The test writers have become very clever and present problems in such a way that you cannot depend on your calculator to graph for you.

Label, Label, Label

Any graph you are asked to draw should have clearly labeled axes with appropriate scales. If you are asked to draw side-by-side boxplots, be sure to label which boxplot is which.

Refer to graphs explicitly

When answering questions based on a graph(s), you need to be specific. Don't just say, "The female times are clearly higher than the male times." Instead say, "The median female time is higher than the first quartile of the male times." You can back up your statements by marking on the graph. The graders look at everything you write, and, often, marks on the graph make the difference between 2 scores.

Look at all aspects of data

When given a set of data or summaries of data, be sure to consider the Center, Spread, Shape, and Outliers/Unusual Features. Often a question will focus on one or two of these areas. Be sure to focus your answer to match.

It's skewed which way?

A distribution is skewed in the direction that the tail goes, not in the direction where the peak is. This sounds backwards to most people, so be careful.

Slow down

The describing data questions appear easy, so many students dive in and start answering without making sure they know what the problem is about. Make sure you know what variable(s) are being measured and read the labels on graphs carefully. You may be given a type of graph that you have never seen before.

Inference

Not every problem involves inference

You have spent most if not all of this semester on inference procedures. This leads many students to try to make every problem an inference problem. Be careful not to turn straightforward probability or normal distribution questions into full-blown hypothesis tests.

Hypotheses are about populations

The point of a hypothesis test is to reach a conclusion about a population based on a sample from it. We don't need to make hypotheses about the sample. When writing hypotheses, conclusions, and formulas, be careful with your wording and symbols so that you do not get the population and sample mixed up. For example, don't write "Ho: x = 12" or "µ = mean heart rate of study participants".

Check Assumptions/Conditions

Checking assumptions/conditions is not the same thing as stating them. Checking means actually showing that the assumptions are met by the information given in the problem. For example, don't just write "np>10". Write "np=150(.32)=48>10". Everyone knows you can do the math in your head or on your calculator, but writing it down makes it very clear to the reader that you're tying the assumption to the problem rather than just writing a list of things you memorized.
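The difference between stating and checking can be sketched like this. The function name `large_counts_check` is made up, and the numbers are the n = 150, p = .32 example from above; the printed lines show the kind of work you would actually write on your paper.

```python
def large_counts_check(n, p, minimum=10):
    """Show the arithmetic for np and n(1-p) and report whether both exceed the minimum."""
    successes = n * p
    failures = n * (1 - p)
    print(f"np = {n}({p:g}) = {successes:g}  (need > {minimum})")
    print(f"n(1-p) = {n}({1 - p:g}) = {failures:g}  (need > {minimum})")
    return successes > minimum and failures > minimum


condition_met = large_counts_check(150, 0.32)
print("condition met:", condition_met)
```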

Confidence intervals have assumptions too

Confidence intervals have the same assumptions as their matching tests, and you need to check them just as carefully.

Link conclusions to your numbers

Don't just say "I reject Ho and conclude that the mean heart rate for males is greater than 78." This sentence doesn't tell us why you rejected Ho. Instead, say "Since the p-value of .0034 is less than .05, I reject Ho and ..."

Be consistent

Make sure your hypotheses and conclusion match. If you find an error in your computations, change your conclusion if necessary. Even if your numbers are wrong, you will normally get credit for a conclusion that is correct for your numbers. If you get totally stuck and can't come up with a test statistic or p-value, make them up and say what you would conclude from them.

Interpreting a confidence interval is different than interpreting the confidence level

Interpreting the confidence interval usually goes something like, "I am 95% confident that the proportion of AP Statistics students who are highly intelligent is between 88% and 93%" or "The superintendent should give seniors Fridays off since we are 99% confident that between 72% and 81% of parents support this plan."

Interpreting a confidence level usually goes something like "If this procedure were repeated many times, approximately 95% of the intervals produced would contain the true proportion of parents who support the plan."
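If you want to see what "repeated many times" looks like, here is a simulation sketch. Everything in it is an assumption made for illustration: the true proportion of 0.75, the sample size of 200, the 2000 repetitions, and the function name `capture_rate`. In real life you never know the true proportion, which is exactly why the confidence level describes the process and not any one interval.

```python
import math
import random


def capture_rate(true_p=0.75, n=200, reps=2000, z_star=1.96, seed=2008):
    """Fraction of 95% one sample z intervals for a proportion that contain true_p."""
    rng = random.Random(seed)
    captured = 0
    for _ in range(reps):
        # Draw one random sample of size n and count the successes
        successes = sum(1 for _ in range(n) if rng.random() < true_p)
        p_hat = successes / n

        # Interval: p_hat plus or minus z* times the standard error
        margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            captured += 1
    return captured / reps


rate = capture_rate()
print("fraction of intervals that captured the true proportion:", rate)
```

The fraction printed should land close to 0.95, but it will not be exactly 0.95 because the simulation itself is random.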

Regression

Graph First, Calculate Later

The most important part of the regression process is looking at plots. Regression questions will frequently provide a scatterplot of the original data along with a plot of residuals from a linear regression. Look at these plots before answering any part of the question and make sure you understand the scales used.

Is it linear?

Remember that an r value is only useful for data we have already decided is linear. Therefore, an r value does not help you decide if data is linear. To determine if data is linear, look at a scatterplot of the original data and the residuals from a linear regression. If a line is an appropriate model, the residuals should appear to be randomly scattered.

Computer Output

It is very likely that you will be given computer output for a linear regression. If you can read the output correctly, these questions are normally easy. You should be able to write the regression equation using the coefficients in the output and also be able to find the values of r and r². Most software packages provide the value of r². If you are asked for the value of r, you will need to take the square root and look at the slope to determine if r should be positive or negative.
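The square root step can be sketched as follows. The function name `r_from_r_squared` and the example values are made up; note that printouts often report r² as a percent, so convert it to a decimal first.

```python
import math


def r_from_r_squared(r_squared, slope):
    """Recover r from r-squared (as a decimal) using the sign of the slope."""
    magnitude = math.sqrt(r_squared)
    return magnitude if slope >= 0 else -magnitude


# Hypothetical printout values: R-Sq = 64.0%, slope coefficient = -2.5
r = r_from_r_squared(0.64, -2.5)
print("r =", round(r, 3))
```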

Interpreting r

If asked to interpret an r value, be sure to include strength, direction, type, and the context. A good interpretation will be something like, “There is a weak positive linear relationship between the number of math classes a person has taken and yearly income.”

After you make a 5, be sure to take more statistics in college.

Jeane Swaynos AP Workshop July 2008