
Lecture 19: Introduction to …

Do what you’ve got to do!

† … for the data project;
† Problem set 9 is already posted;
† [For section 4, PM] exam schedule.

… analysis for project

† Association between two quantitative variables (say, hours of sleep and hours of study): consider regression; read chapter 27 for inferential procedures. Graph: scatterplots.
† More quantitative variables: read chapter 30 for multiple regression.
† Association between two categorical variables (such as major versus political affiliation): read chapter 26 for the chi-square test. Graph: back-to-back bar plots or pie charts.
† Association between a quantitative variable and a categorical variable: read chapter 28 for one-way ANOVA. Graph: back-to-back boxplots.

Imagine …

† Alex and Paul are playing a gambling game. They each roll a die. The one who gets the larger value wins $1. (If they tie, they roll again.)
† Alex noticed that since the start of the game, Paul has rolled seven “6”s in ten rolls.
† He pulls out his calculator and punches in several numbers. The calculator says that, if Paul’s die is a fair one, the probability of getting seven or more 6’s out of ten is 0.00027, i.e., 2.7 out of 10000.
† He shows this to Paul. Paul shrugs and says, “I guess I am just having a lucky day!”
† Or is he?

Recap …

† Data: seven “6”s out of 10 rolls.
† An assumption: the die is a fair one.
† A probability value: 0.00027.
† A suspicion: maybe the die is not fair.
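
The calculator's number in the dice story can be reproduced with a short binomial tail sum. This is a minimal sketch, assuming the ten rolls are independent and each shows a 6 with probability 1/6; the function name `binomial_upper_tail` is made up for illustration.

```python
from math import comb

def binomial_upper_tail(k_min, n, p):
    """Return P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Probability that a fair die shows seven or more 6's in ten rolls.
p_lucky = binomial_upper_tail(7, 10, 1 / 6)
print(f"{p_lucky:.5f}")  # prints 0.00027
```

The exact value is 16176 / 6^10, which rounds to the 0.00027 quoted in the story.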

Testing a hypothesis

† A hypothesis proposes a model for the sampling distribution of a statistic.
† Data generate a value of that statistic.
† Evidence for or against a hypothesis should be a measure of “consistency” between the model and the data.

The Good News!

Hypothesis → complete model for the sampling distribution
Data → one observation from the true sampling distribution

[Figure: n = 50; hypothesis: p = 0.5; true value: p = 0.8. Two curves: the sampling distribution assuming the hypothesis is true, N(0.5, 0.071), centered at p = 0.5, and the true sampling distribution.]
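
The N(0.5, 0.071) label can be checked by hand: the second number is the standard deviation of the sample proportion under the hypothesis. The sketch below recomputes it from n = 50 and p = 0.5 and also reports how many of those standard deviations separate the true value 0.8 from the hypothesized center. The variable names are illustrative only.

```python
from math import sqrt

n = 50
p_null = 0.5  # value proposed by the hypothesis
p_true = 0.8  # true value (known here only because the slide states it)

# Standard deviation of the sample proportion if the hypothesis is true.
sd_null = sqrt(p_null * (1 - p_null) / n)

# Distance from the hypothesized center to the true value, in null-model SDs.
distance_in_sds = (p_true - p_null) / sd_null

print(round(sd_null, 3), round(distance_in_sds, 2))  # prints 0.071 4.24
```

A sample proportion near 0.8 would sit more than four standard deviations from the center of the hypothesized model, which is why data from the true distribution would look inconsistent with it.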

Testing a hypothesis (cont’d)

† Terms
„ Null hypothesis: statement that we are testing against.
„ Alternative hypothesis: statement that we are collecting evidence to support.
† Test statistic: standardized “distance” between the data and the center of the sampling distribution model determined by the null hypothesis.
† The “further” apart the model and the data are, the stronger the evidence that the null hypothesis is not true.

Significance

† How far will be far enough?
† The test statistics we described were actually z-scores.
† We use the corresponding tail probability of the calculated test statistic (z-score) as a measure of “consistency” between the model and the data.
† If this measure of “consistency” is smaller than a previously decided threshold, we reject the null hypothesis. Or we can say “the difference between the data and the model is statistically significant!”
† This measure of “consistency” is called the P-value.
† The previously decided threshold is called the significance level.
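
The decision rule described here (turn the z-score into a tail probability, then compare it with the significance level) can be sketched as follows. This is an illustration under stated assumptions: it uses the upper tail only, which corresponds to one particular one-sided alternative, and the z-score and significance level are hypothetical values, not taken from the lecture.

```python
from math import erfc, sqrt

def upper_tail_probability(z):
    """Return P(Z >= z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))

def is_significant(z, alpha):
    """Reject the null hypothesis when the tail probability is below alpha."""
    return upper_tail_probability(z) < alpha

z_example = 2.0  # hypothetical test statistic
alpha = 0.05     # hypothetical significance level

p_value = upper_tail_probability(z_example)
print(f"P-value = {p_value:.4f}, significant: {is_significant(z_example, alpha)}")
```

A larger z-score gives a smaller tail probability, so "far enough" simply means far enough for the P-value to drop below the chosen threshold.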

Testing a hypothesis (cont’d)

† Steps
„ State the hypotheses and a significance level;
„ Calculate the test statistic;
„ Calculate the P-value that corresponds to the test statistic and the alternative hypothesis;
„ State the conclusion based on the P-value and the significance level.

Logic behind significance

The null hypothesis proposes a value of the parameter: $p_0$
True population value: $p$
Sampling distribution model: $N\left(p_0, \sqrt{p_0(1-p_0)/n}\right)$
Data → Statistic $\hat{p}$
Test statistic: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
P-value
Large P-value: not against the null hypothesis;
Small P-value: against the null hypothesis.

Alternative hypothesis

† Statement that proposes a value (or values) of the population parameter.
† If the null hypothesis is rejected, we will accept the alternative hypothesis.
† Different kinds of alternatives:
„ Two-sided alternatives
„ One-sided alternatives

Significance level

† Threshold at which we reject the null hypothesis and claim statistical significance.
† It is a level of P-value.

P-values for different alternative hypotheses

† See board.

Example

† Ten years ago, about 10% of country A’s population was obese. This year, 200 individuals were selected and evaluated, and 30 of them were regarded as obese. Is there evidence that the country’s obesity proportion has changed?
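
One way to work through the obesity example is a one-proportion z-test, with the P-value computed according to the alternative hypothesis. The sketch below is not the solution presented on the board. It assumes a two-sided alternative (the question asks whether the proportion has "changed") and a significance level of 0.05, which the slide does not specify. The function name and its `alternative` labels are made up for illustration.

```python
from math import erfc, sqrt

def upper_tail(z):
    """Return P(Z >= z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))

def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, p_value) for H0: p = p0 against the given alternative."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    if alternative == "two-sided":   # HA: p != p0, both tails
        p_value = 2 * upper_tail(abs(z))
    elif alternative == "greater":   # HA: p > p0, upper tail
        p_value = upper_tail(z)
    elif alternative == "less":      # HA: p < p0, lower tail
        p_value = upper_tail(-z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z, p_value

alpha = 0.05  # assumed significance level, not given on the slide
z, p_value = one_proportion_z_test(30, 200, 0.10)
print(f"z = {z:.2f}, P-value = {p_value:.3f}, reject H0: {p_value < alpha}")
```

Under these assumptions the sample proportion 0.15 lies about 2.36 standard deviations above 0.10 and the two-sided P-value is roughly 0.018, so the null hypothesis would be rejected at the 0.05 level. With the one-sided alternative "greater", the P-value would be half as large.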

Reading

† Chapter 20
