Study Guide for FINAL EXAM ~ STAT 210
Chapter 1 ~ What is Statistics?
Some of the ideas in this section of the study guide come from the introductory PowerPoint presentations shown in class. You should review the content of these presentations in preparation for the exam. These presentations are available in the Additional Links section of the course website.
Polls and Surveys
Why do we sample?
What do we mean by sampling/random/chance error?
What are nonsampling errors?
What is the consequence of nonsampling errors?
What are the different types of nonsampling errors?
Given the description of a study, be able to identify the following:
- target population
- study population (and whether it is the same as the target population based on the sampling scheme used)
- variables being measured (e.g. gender, smoking status, GPA, income, etc.)
- purpose of the study
- any conclusions reached/inferences made from the study
- any potential problems with the described study (sources of bias etc.)
Discuss how you would conduct a simple survey.
Experiments
What is a completely randomized design?
When experimenting with humans, understand the following:
- control group
- role of randomization
- placebo
- double blind
- placebo effect
Discuss how you would conduct a simple experiment given a research goal.
Observational Studies
What is a prospective study?
What is a retrospective study?
What are the limitations of observational studies vs. experimental studies?
What is controlling for a factor? Given a study description, be able to identify some potential factors that one might want to control for.
Use of Simulations to Make Decisions
e.g. Swain vs. Alabama, Medical Success Rate, Castaneda vs. Partida
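The simulation idea behind examples such as Swain vs. Alabama can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in class: the function name `simulate_tail_probability` and all of the numbers (a 30% eligible share, a panel of 40 with only 3 members from the group) are hypothetical and are not the figures from the actual cases.

```python
import random


def simulate_tail_probability(n_draws, success_prob, observed, n_sims=10_000, seed=1):
    """Estimate P(count <= observed) when each of n_draws picks is an
    independent success with probability success_prob."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_sims):
        # Count how many of the n_draws picks come from the group.
        count = sum(rng.random() < success_prob for _ in range(n_draws))
        if count <= observed:
            hits += 1
    return hits / n_sims


# Hypothetical scenario: 30% of the eligible population belongs to the group,
# but only 3 of the 40 people selected do.
estimate = simulate_tail_probability(n_draws=40, success_prob=0.30, observed=3)
print(f"Estimated chance of 3 or fewer by luck alone: {estimate:.4f}")
```

If results as extreme as the observed one almost never show up in the simulated selections, chance alone is a poor explanation for what was observed.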
Concept of Sampling Variation and Standard Error when estimating proportions (see Chapter 1 sections 1.3, 1.4 and Homework 1)
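As a rough illustration of sampling variation, the sketch below computes the standard error of a sample proportion using the large-sample formula sqrt(p(1 - p)/n). The helper name and the sample sizes are assumptions chosen for the example, not values from Homework 1.

```python
import math


def standard_error_of_proportion(p, n):
    """Standard error of a sample proportion based on n observations."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical setting: a true proportion of 0.5 and increasing sample sizes.
for n in (100, 400, 1600):
    se = standard_error_of_proportion(0.5, n)
    print(f"n = {n:5d}  standard error = {se:.4f}")
```

Each fourfold increase in the sample size cuts the standard error in half, which is why larger samples give estimates that vary less from sample to sample.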
Chapter 2 ~ How to Describe and Summarize Data
Types of variables – Given a list of variables you should be able to classify them as being numerical (continuous or discrete) or categorical.
Graphical displays for continuous variables. For each be able to construct, read, and draw conclusions from them.
o Dot plots – construct by hand
o Stem-and-leaf plots – be able to read only.
o Histograms (Be able to comment on the following from a histogram: typical value, variability/spread, and distributional shape)
o Outlier boxplots
o CDF Plots
Numerical summaries for continuous variables
o Sample mean – be able to calculate by hand and interpret.
o Sample median – be able to calculate by hand and interpret.
o Five number summary – be able to interpret.
o Range – be able to calculate by hand and interpret.
o Interquartile Range (IQR) – be able to interpret and use it to compare spread across groups.
o Standard deviation – be able to calculate by hand and interpret. Also be able to use it to compare spread across groups.
o Empirical Rule for approximately normal data – be able to apply this rule.
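The numerical summaries listed above can be checked against hand calculations with a short script. This is only a sketch on a small made-up data set. The quartiles come from Python's `statistics.quantiles`, whose default convention may differ slightly from the one used by JMP or the textbook.

```python
import statistics

# Hypothetical data set, small enough to verify by hand.
data = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(data)          # sum of the values divided by n
median = statistics.median(data)      # middle value of the sorted data
data_range = max(data) - min(data)    # largest minus smallest
sd = statistics.stdev(data)           # sample SD (divides by n - 1)

# Quartiles (default "exclusive" method) and the interquartile range.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print("mean:", mean, " median:", median, " range:", data_range)
print("standard deviation:", round(sd, 4), " IQR:", iqr)

# Empirical Rule intervals (meaningful only for approximately normal data):
# about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 SDs of the mean.
for k in (1, 2, 3):
    print(f"mean +/- {k} SD: ({mean - k * sd:.2f}, {mean + k * sd:.2f})")
```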
Graphical and numerical summaries for discrete data
o Sample mean and standard deviation – these would be given to you; you do not need to know how to compute them from a table.
o Bar graphs and frequency distribution tables – be able to read and draw conclusions from a frequency distribution table. Also make sure you understand frequency, relative frequency, and cumulative relative frequency. This relates to material presented in Chapter 4 ~ Discrete Probability Distributions.
Graphical and numerical summaries for categorical data
Be able to read and draw conclusions from:
o Bar graphs
o Frequency and Relative Frequency Tables
o Pie charts
Extra Graphical Displays from Homework #2
Examining the Relationship Between Two Numeric Variables
Scatter plots – be able to read and draw conclusions from them. Be able to discuss trend, scatter, association (both in terms of direction and strength), groups/clusters of points, gaps, and outliers. (see Problem #7 from HW #2)
Comparing Values of a Numeric Variable Across Groups/Populations
Comparative Boxplots – be able to read and draw conclusions from vertical dot plots with box plots added, as we have looked at in JMP. In particular be able to comment on typical value, spread/variation, distributional shape, and within-group outliers. Also, if given histograms that were plotted on the same scale, be able to compare and contrast the groups. (see Problem #7 from HW #2)
Chapter 3 – Probability
If you haven’t done so yet, read Chapter 3 Probability. The most important concepts from this chapter are those of independence, conditional probabilities, and Bayes' Rule. Also, the use of tree diagrams to map out probabilities associated with two-stage experiments should be reviewed.
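Bayes' Rule for a screening test, which is what a two-stage tree diagram maps out, can be sketched as follows. The function name and the prevalence, sensitivity, and specificity values are hypothetical and are not taken from the homework problems.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(disease | positive test) via Bayes' Rule.

    The two products below are the tree-diagram branches that end in a
    positive test result.
    """
    true_positive = prevalence * sensitivity               # disease, test +
    false_positive = (1 - prevalence) * (1 - specificity)  # no disease, test +
    return true_positive / (true_positive + false_positive)


# Hypothetical screening test: 1% prevalence, 95% sensitivity, 90% specificity.
ppv = positive_predictive_value(prevalence=0.01, sensitivity=0.95, specificity=0.90)
print(f"P(disease | positive test) = {ppv:.3f}")
```

Even with a fairly accurate test, a low prevalence means most positive results come from the much larger disease-free branch of the tree.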
Be sure you are able to do the following:
Construct and use a tree diagram to find probabilities of interest. See example 3.3.2 pgs. 105 – 106, problem 3.52, and Bayes' Rule/screening test problems like those on your homework.
Apply Bayes' Rule – you had several problems where this was used on your third assignment. If you did not get them all worked out correctly be sure to read through my solutions on the course website.
Given a contingency table be able to find probabilities of events of interest. Also be able to compute the relative risk (RR) associated with a potential risk factor for an adverse outcome. Review the probability PowerPoint we went through in class and the additional problem from Assignment #3 that looked at the association between smoking and birth weight.
Chapter 4 – Discrete Random Variables
If you have not done so, read sections 4.1 – 4.6 in your text. Also be sure to look at the solutions to the homework problems assigned from this section.
Be sure you are able to do the following:
Given a discrete probability distribution (i.e. the possible values for the random variable X and their associated probabilities of occurrence $f(x)$), be able to find the following:
o Probabilities associated with specific events.
o Probability histogram
o The cumulative distribution function, $F(x) = P(X \le x)$, and graph it.
o The expectation $E(X)$, the variance $Var(X)$, and the standard deviation $SD(X)$.
o The expectation, variance, and standard deviation of linear functions of X, i.e. $E(aX + b)$, $Var(aX + b)$, and $SD(aX + b)$.
For a simple experiment be able to find the discrete probability function $f(x)$ and then the items in the list above.
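The quantities in the list above can be computed directly from a table of values and probabilities. The sketch below uses a made-up distribution and made-up constants a and b. It relies on the standard rules E(aX + b) = aE(X) + b, Var(aX + b) = a²Var(X), and SD(aX + b) = |a|SD(X).

```python
import math

# Hypothetical distribution: possible values of X and their probabilities f(x).
values = [0, 1, 2, 3]
probs = [0.1, 0.3, 0.4, 0.2]
assert abs(sum(probs) - 1.0) < 1e-12  # a valid f(x) must sum to 1

# Cumulative distribution function F(x) = P(X <= x) at each possible value.
cdf = []
running_total = 0.0
for p in probs:
    running_total += p
    cdf.append(running_total)

# E(X) is the probability-weighted average of the values;
# Var(X) is the probability-weighted average squared deviation from E(X).
expectation = sum(x * p for x, p in zip(values, probs))
variance = sum((x - expectation) ** 2 * p for x, p in zip(values, probs))
sd = math.sqrt(variance)

# Linear function aX + b with hypothetical constants.
a, b = 2, 3
lin_expectation = a * expectation + b
lin_variance = a ** 2 * variance
lin_sd = abs(a) * sd

print("F(x):", [round(c, 2) for c in cdf])
print("E(X), Var(X), SD(X):", round(expectation, 2), round(variance, 2), round(sd, 2))
print("E, Var, SD of 2X + 3:", round(lin_expectation, 2), round(lin_variance, 2), round(lin_sd, 2))
```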
Chapter 5 – Random Variables for Success/Failure Experiments (Binomial Random Variable)
Read Sections 5.1 and 5.2 of your text and review the solutions to your Chapter 5 homework posted on the web.
Be sure you are able to do the following:
Use the binomial probability function to find probabilities associated with an arbitrary binomial random variable K. You need to be able to do this with your calculator and your brain for the exam.
$P(K = k) = \binom{n}{k}\,\pi^{k}(1-\pi)^{n-k}, \quad k = 0, 1, 2, \ldots, n$
Find the expectation E(K), variance Var(K), and standard deviation SD(K) of an arbitrary binomial random variable.
Be able to read the output from the Binomial Table Generator from JMP.
Chapter 6 – Introduction to Hypothesis Testing
Read Chapter 6 – Sections 6.1 – 6.5.
Be sure you are able to do the following:
Be able to state the null and alternative hypotheses for a given situation.
Be able to state the Type I and Type II errors for a given situation.
Be able to perform a Sign Test (e.g. reading comprehension course, men’s heights) for the population proportion/probability of success ($\pi$) and the population median (Med).
Be able to perform an Exact Binomial Test for a 2 by 2 contingency table.
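A sign test for a median reduces to a binomial calculation: under the null hypothesis, each observation falls above the hypothesized median with probability 0.5. The sketch below assumes a one-sided (upper) alternative, no observations exactly equal to the hypothesized median, and made-up counts. The function names are illustrative only.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(K = k) for a binomial random variable with n trials and success chance p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def upper_tail_p_value(successes, n, p=0.5):
    """P(K >= successes) when the null hypothesis value of p is true."""
    return sum(binomial_pmf(k, n, p) for k in range(successes, n + 1))


# Hypothetical data: 9 of 12 observations fall above the hypothesized median.
p_value = upper_tail_p_value(successes=9, n=12)
print(f"One-sided sign test p-value: {p_value:.4f}")
```

The same two functions give exact binomial probabilities for any n and success probability, so they can also be used to check calculator work on binomial problems.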
Chapter 8 & 11 ~ Normal Distribution and Central Limit Theorems (Note: Book refers to these as The Law of Averages)
Given $X \sim N(\mu, \sigma^2)$ be able to find probabilities and quantiles associated with X. Practice Problems: 8.11, 8.13
Normal approximation to the Binomial Distribution. Practice Problems: 8.29, 8.43, 8.49
$K \sim N(n\pi,\; n\pi(1-\pi))$ provided n is sufficiently large ($n\pi \ge 5$ and $n(1-\pi) \ge 5$).
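To see how the normal approximation compares with the exact binomial calculation, the sketch below evaluates P(K ≤ 25) both ways. The values n = 100 and success probability 0.3 are hypothetical. No continuity correction is applied, since the guide does not mention one.

```python
from math import comb, sqrt
from statistics import NormalDist

# Hypothetical binomial setting.
n, success_prob = 100, 0.3

# Check the "sufficiently large n" conditions before approximating.
assert n * success_prob >= 5 and n * (1 - success_prob) >= 5

# Approximating normal distribution: mean n*pi, variance n*pi*(1 - pi).
approx_dist = NormalDist(mu=n * success_prob,
                         sigma=sqrt(n * success_prob * (1 - success_prob)))
approx_prob = approx_dist.cdf(25)

# Exact binomial probability P(K <= 25) for comparison.
exact_prob = sum(comb(n, k) * success_prob ** k * (1 - success_prob) ** (n - k)
                 for k in range(0, 26))

print(f"normal approximation: {approx_prob:.4f}")
print(f"exact binomial:       {exact_prob:.4f}")
```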
Know what the central limit theorem for the sample proportion says and how to apply it. Practice Problem: 11.37
$\hat{P} \sim N\left(\pi,\; \frac{\pi(1-\pi)}{n}\right)$ provided n is sufficiently large ($n\pi \ge 5$ and $n(1-\pi) \ge 5$).
Know what the central limit theorem for the SUM says and how to apply it. Practice Problems: 11.40, 11.41, 11.43
$SUM \sim N(n\mu,\; n\sigma^2)$ provided X is normal to begin with or n is “large” ($n \ge 40$).
Know what the central limit theorem for the sample mean says and how to apply it. Practice Problems: 12.5
$\bar{X} \sim N\left(\mu,\; \frac{\sigma^2}{n}\right)$ provided X is normal to begin with or n is “large” ($n \ge 40$).
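Applying the central limit theorem mostly comes down to using the right standard deviation: σ/√n for the sample mean and σ√n for the sum. The sketch below uses made-up values (μ = 100, σ = 15, n = 49) and Python's `statistics.NormalDist` in place of a normal table.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical population mean, population SD, and sample size.
mu, sigma, n = 100, 15, 49

# Sampling distribution of the sample mean: N(mu, sigma^2 / n).
mean_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))
prob_mean_above_104 = 1 - mean_dist.cdf(104)

# Sampling distribution of the sum: N(n * mu, n * sigma^2).
sum_dist = NormalDist(mu=n * mu, sigma=sigma * sqrt(n))
prob_sum_above_5000 = 1 - sum_dist.cdf(5000)

print(f"SD of the sample mean: {mean_dist.stdev:.4f}")
print(f"P(sample mean > 104) = {prob_mean_above_104:.4f}")
print(f"P(SUM > 5000)        = {prob_sum_above_5000:.4f}")
```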
Chapter 12 – z and t Tests of Hypotheses
Be able to conduct a z-test “by hand”. Specifically, be able to set up the hypotheses to be tested, compute the test statistic, find the associated p-value, and state your conclusions correctly in both statistical and non-statistical terms. Practice Problems: 12.5, 12.17
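The steps of a z-test done by hand can be mirrored in code as a check on the arithmetic. The sketch below assumes a test about a population mean with a known population standard deviation and a two-sided alternative. The function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(sample_mean, null_mean, sigma, n):
    """Return the z statistic and two-sided p-value for H0: mu = null_mean."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    # Two-sided p-value: area in both tails beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: H0: mu = 50 vs. Ha: mu != 50, with sigma = 8 and n = 64.
z, p_value = z_test_mean(sample_mean=52, null_mean=50, sigma=8, n=64)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

For a one-sided alternative, use a single tail area (for example `1 - NormalDist().cdf(z)` for an upper-tailed test) in place of the doubled value.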
Be able to interpret output from a t-test conducted in JMP and interpret output from the t-Probability calculator. This includes being able to read a normal quantile plot.
Chapter 13 – Estimation with Confidence (Confidence Intervals)
Be able to construct and interpret a $100(1-2\alpha)\%$ CI for a population mean ($\mu$) using the t-table in your text to find the appropriate t-quantile, e.g. $t_{.975}$.
$\bar{x} \pm t_{1-\alpha}\,\frac{s}{\sqrt{n}}$ (two-sided)
$\bar{x} + t_{1-2\alpha}\,\frac{s}{\sqrt{n}}$ (one-sided upper)
$\bar{x} - t_{1-2\alpha}\,\frac{s}{\sqrt{n}}$ (one-sided lower)
Practice Problems: 13.11, 13.13, 13.15 (a.) ($\bar{x} = 924.8$, $s = 136.6$), 13.41, 13.51, 13.53
Be able to construct and interpret a $100(1-2\alpha)\%$ CI for a population proportion ($\pi$). You only need to know the large sample case:
$\hat{p} \pm z_{1-\alpha}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ (two-sided)
$\hat{p} + z_{1-2\alpha}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ (one-sided upper)
$\hat{p} - z_{1-2\alpha}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ (one-sided lower)
Practice Problems: 13.29, 13.49
Given an estimate of $\sigma$ you should be able to determine the sample size needed to have a given margin of error E when estimating the mean with a $100(1-2\alpha)\%$ CI. (see pages 367-368)
Required sample size $n = \left(\frac{z_{1-\alpha}\,\sigma}{E}\right)^{2}$
Practice Problems: 13.19, 13.21
You should be able to determine the sample size needed to have a given margin of error E when estimating the population proportion with a $100(1-2\alpha)\%$ CI. (see pages 371-372)
Required sample size $n = \left(\frac{z_{1-\alpha}}{2E}\right)^{2}$ (conservative)
Required sample size $n = \left(\frac{z_{1-\alpha}}{E}\right)^{2}\pi(1-\pi)$ (prior knowledge for $\pi$)
Practice Problem: 13.27
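The interval and sample-size calculations for this chapter can be sketched as below. All data values are hypothetical. The t-quantile is typed in from a t-table (2.064 is the .975 quantile for 24 degrees of freedom) rather than computed, and rounding the required sample size up to the next whole number is an assumption of this sketch, so follow the textbook's rule if it differs.

```python
from math import ceil, sqrt
from statistics import NormalDist

z_975 = NormalDist().inv_cdf(0.975)  # about 1.96, used for two-sided 95% intervals


def t_interval(sample_mean, s, n, t_quantile):
    """Two-sided CI for a mean; t_quantile is looked up in a t-table."""
    margin = t_quantile * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin


def proportion_interval(p_hat, n, z_quantile):
    """Large-sample two-sided CI for a population proportion."""
    margin = z_quantile * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def sample_size_for_mean(sigma, margin, z_quantile):
    """Smallest whole n with z * sigma / sqrt(n) <= margin."""
    return ceil((z_quantile * sigma / margin) ** 2)


def sample_size_for_proportion(margin, z_quantile, prior=None):
    """Conservative size (proportion taken as 0.5) unless a prior guess is given."""
    if prior is None:
        return ceil((z_quantile / (2 * margin)) ** 2)
    return ceil((z_quantile / margin) ** 2 * prior * (1 - prior))


# Hypothetical examples, all at 95% confidence.
mean_ci = t_interval(sample_mean=50, s=10, n=25, t_quantile=2.064)
prop_ci = proportion_interval(p_hat=0.4, n=600, z_quantile=z_975)
n_mean = sample_size_for_mean(sigma=10, margin=2, z_quantile=z_975)
n_prop = sample_size_for_proportion(margin=0.03, z_quantile=z_975)

print("95% CI for the mean:", tuple(round(v, 3) for v in mean_ci))
print("95% CI for the proportion:", tuple(round(v, 4) for v in prop_ci))
print("n needed for the mean:", n_mean, " n needed for the proportion:", n_prop)
```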
Chapter 14 – Two Sample Inference
14.1 ~ Matched-Pairs/Dependent Samples Testing
Be able to perform a sign test (see Ch. 6 above).
Be able to interpret the results of a dependent samples comparison in JMP (see BMI study from Take Home Exam).
Be able to construct and interpret a CI for the mean paired difference ($\mu_d$) given the sample mean and standard deviation of the paired differences. Practice Problems: 14.3 and 14.4 pg. 388
For 14.4 you can assume that the sample mean and sample standard deviation of the paired differences would be given to you (although you should in theory be able to find them yourself).
14.2 - 14.4 ~ Independent Samples Testing ($\mu_X$ vs. $\mu_Y$)
Be able to interpret the results of an independent samples comparison in JMP (see Birth Weight study from Take Home Exam).
Be able to construct and interpret a CI for the difference in two population means ($\mu_X - \mu_Y$) if given the sample sizes, sample means, and sample standard deviations from both groups. (Equal variance case only).
Be able to perform a z-test for comparing two population proportions ($\pi_X$ vs. $\pi_Y$). Practice Problems: 14.31 and 14.32 pg. 401
Be able to construct and interpret a CI for the difference in two population proportions ($\pi_X - \pi_Y$). (Example: see last HW).
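A two-proportion comparison can be sketched as follows. The counts are made up, and the function names are illustrative. The sketch uses the common convention of a pooled standard error for the test statistic and an unpooled standard error for the confidence interval, so check this against the formulas in your text.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """z statistic and two-sided p-value for H0: the two proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion assumed under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def difference_interval(x1, n1, x2, n2, confidence=0.95):
    """Large-sample two-sided CI for the difference in proportions."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # unpooled
    z_quantile = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_quantile * se, diff + z_quantile * se


# Hypothetical counts: 45 of 100 successes in group X, 30 of 100 in group Y.
z, p_value = two_proportion_z_test(45, 100, 30, 100)
low, high = difference_interval(45, 100, 30, 100)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
print(f"95% CI for the difference: ({low:.4f}, {high:.4f})")
```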
Chapter 16 – Analysis of Categorical Data
Be able to perform a Goodness-of-Fit test similar to the homicide season and candidate preference examples/problems.
Be able to conduct a test of independence for a small contingency table and discuss the results.
Realize that Fisher’s Exact Test, the Binomial Test, the z-test for comparing two population proportions, and the Chi-square test for independence are all ways to analyze the results of a study where the ultimate goal is to compare two population proportions.
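For a 2 by 2 table, the chi-square test of independence can be carried out with nothing more than the observed and expected counts. The sketch below uses a made-up table. With 1 degree of freedom, the p-value can be written with the complementary error function, because a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
from math import erfc, sqrt

# Hypothetical 2 x 2 table: rows are groups, columns are success / failure.
observed = [[45, 55],
            [30, 70]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: (row total * column total) / grand total.
chi_square = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (observed[i][j] - expected) ** 2 / expected

# p-value for 1 degree of freedom: P(Z^2 > chi_square) = erfc(sqrt(chi_square / 2)).
p_value = erfc(sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.2f}, p-value = {p_value:.4f}")
```

For a 2 by 2 table this statistic equals the square of the pooled two-proportion z statistic, which is one way to see that the two tests address the same question.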
STUDY FOR THIS EXAM!
DO NOT RELY TOO MUCH ON YOUR NOTES AND BOOK ETC.
YOU SHOULD BE ABLE TO DO MOST OF THE PROBLEMS WITH THE ASSISTANCE OF A FORMULA SHEET THAT YOU CREATE.