Uppsala Upcerg Workshop

Quantitative Methods in STEM Education Research
Topic 4: Inferential statistics

Judy Sheard
Faculty of Information Technology
Monash University, Australia
[email protected]

QM STEM Ed 2018

Overview of topic 4

Hypothesis testing
Central limit theorem
Level of significance
Z-scores
Confidence intervals
Categories of statistical tests

Descriptive vs. inferential statistics

Descriptive statistics: used to describe sets of quantitative data. This involves descriptions of distributions of data and relationships between variables.

Inferential statistics: used to make inferences about populations from analysis of subsets (samples) of the population.

Inferential statistics

“In inferential statistics, statistics are measures of the sample and parameters are measures of the population. Inferences are made about the parameters from the statistics”. (Wiersma, 1995, p.363)

Inferences are made about a population based on a subset or random sample of that population.

Note that in educational research it is often not possible to have a random sample. Instead we attempt to show that the sample is typical of the population by comparing demographics, e.g. gender, age, educational background.

Hypothesis testing

In inferential statistics, a hypothesis is used to determine whether an observation has an underlying cause or whether it was due to some random fluctuation or error in a sample.

The researcher will test to see if the hypothesis is consistent with the sample data. If not, the hypothesis is rejected.

Two different ways of stating a hypothesis:
Looking for a difference between groups;
Looking for relationships between groups.

Hypothesis testing

On what basis do we accept or reject a hypothesis? Consider this example:

A set of exercises was designed to encourage reflection on program design. It was hypothesized that these exercises improved students’ skills in program design.
This method was used on a class of 30 students. In a test on program design, the class scored a mean of 60% with a standard deviation of 10. The same test on another class that had not used these exercises resulted in a mean score of 55% with a standard deviation of 12.

Does the hypothesis seem reasonable?
What if the class mean was 70%? What about 57%?

Null hypothesis

In inferential statistics we test the opposite of a research hypothesis using the null hypothesis. For example:

Research hypothesis: Skills in program design will be improved with the use of exercises to encourage reflection on program design.
Null hypothesis: There will be no difference in skill levels in program design between students who have completed exercises to encourage reflection on program design and those who have not.

Research hypothesis: The performance of introductory programming students is related to prior programming experience.
Null hypothesis: There is no relationship between programming performance and prior programming experience.

If your study finds there is a difference or some relationship then you can reject the null hypothesis (H0) and you can state that there is support for your research hypothesis (H1).

Sampling distribution

We need more than intuition here. We will connect probability with a statistic, using the concept of a sampling distribution of the statistic.

A sampling distribution consists of the values of a statistic computed from all possible samples of a given size. (Wiersma, 2005, p.375)

Note that the sampling distribution is not the sample distribution.

What does this mean?

We have a population.
We can take a sample of size n from the population and compute a statistic of this sample, e.g. the mean.
We take all possible samples of size n and compute the statistic of these samples.
We now have a distribution of the statistic.
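
As a small illustration of this procedure, the Python sketch below enumerates every possible sample of size n = 2 from a tiny made-up population and computes the mean of each. The population values, the variable names, and the choice to sample with replacement are assumptions made for illustration only. Real populations are far too large to enumerate like this.

```python
from itertools import product
from statistics import mean, pstdev

# Hypothetical population of five scores (illustrative values only).
population = [2.0, 4.0, 6.0, 8.0, 10.0]
n = 2  # sample size

# Every possible ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# The statistic (here the mean) computed for each sample.
sample_means = [mean(s) for s in samples]

print("number of samples:", len(samples))
print("population mean:", mean(population))
print("mean of sample means:", mean(sample_means))
print("population sd:", pstdev(population))
print("sd of sample means:", pstdev(sample_means))
```

The list sample_means is the sampling distribution of the mean for this population and sample size. Its mean matches the population mean, and its spread is smaller than the spread of the population.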
Central limit theorem

The shape, location (central tendency) and variability (dispersion) of the sampling distribution are described by the central limit theorem.

The central limit theorem (CLT) states: Given any population, the distribution of the sample mean is approximately a normal distribution, provided the sample size is large.

This is the key theorem in statistics!

Central limit theorem

The central limit theorem specifies that the sampling distribution of the mean has a mean equal to the population mean (μ), a standard deviation equal to σ/√n, and is approximately normally distributed when the sample size n is large. (σ is the standard deviation of the population.)

Some simulations to illustrate this:
http://www.stat.sc.edu/~west/javahtml/CLT.html
http://www.rand.org/statistics/applets/clt.html
http://en.wikipedia.org/wiki/Concrete_illustration_of_the_central_limit_theorem

Level of significance

The level of significance is a probability used in testing hypotheses. It is a criterion used in making a decision about the hypothesis.

The common level used in educational research is 0.05. Occasionally other levels are used: 0.01, 0.001 and 0.1.

A level of 0.05 means that when the probability of obtaining the observed result under the null hypothesis is lower than 0.05, the null hypothesis is rejected. It then follows that if the null hypothesis is true it will be rejected only 5% of the time.

We now connect the sampling distribution with the level of significance.

The “68.3 - 95.5 - 99.7” rule

For a normal distribution, about 68.3% of values lie within one standard deviation of the mean, about 95.5% within two, and about 99.7% within three.

Z-score

The z-score (also called standard score) indicates how far, and in what direction, a score deviates from its distribution's mean, expressed in units of the distribution's standard deviation.

The formula for creating z-scores is:

$$ z = \frac{x - \mu}{\sigma} $$

Where:
x is a raw score to be standardized
μ is the mean of the population
σ is the standard deviation of the population

Standard z-score

The z-score indicates if a score was above or below the distribution mean.
A z-score of +1 indicates one standard deviation above the population mean.
A z-score of -1 indicates one standard deviation below the population mean.

For example, a mark of 53 on a test where the mean of all marks was 67 and the standard deviation of marks was 7 would give a standard score of (53 - 67) / 7 = -2.0.

Properties of standard scores

A z-score makes it possible to compare scores from different distributions. z-scores have the following properties:

The mean of any set of z-scores is zero.
The standard deviation of any set of z-scores is always equal to 1.
The distribution of z-scores has the same shape as the distribution of raw scores from which they were derived.

Confidence intervals

A confidence interval specifies a range within which we can have some degree of confidence of finding another value, usually the population mean.

To construct a confidence interval based on the normal distribution we need:
a random sample of size n
the sample mean
the standard deviation of the population
a level of confidence

Defining confidence intervals

To find the lower (L) and upper (U) limits for a confidence interval we use the following:

$$ L = \bar{x} - z \frac{\sigma}{\sqrt{n}} $$
$$ U = \bar{x} + z \frac{\sigma}{\sqrt{n}} $$

Where:
x̄ is the sample mean
σ is the standard deviation
n is the sample size
z is a z-score indicating the confidence level

Confidence intervals

Increasing the confidence level widens the confidence interval.
Increasing the sample size narrows the confidence interval.
Increasing the standard deviation makes the interval wider.
Common confidence levels are 90%, 95% and 99%, but we can specify any level below 100%.

Choosing the z-score

For 95% confidence we choose a central area of 0.95 on the standard normal curve. This area lies between z = -1.96 and z = +1.96.

For 90% confidence we choose a central area of 0.90 on the standard normal curve.
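
The choice of z-score and the limits L and U defined above can be sketched in Python as follows. This is a minimal sketch and not part of the original slides. The function names z_for_confidence and confidence_interval are made up for illustration, and the inputs in the usage lines (sample mean 70.72, σ = 10, n = 10) are the values from the worked example in this topic. The sketch uses NormalDist from Python's standard statistics module to find the z-score that bounds the chosen central area.

```python
from math import sqrt
from statistics import NormalDist


def z_for_confidence(level):
    """z-score that bounds a central area of `level` on the standard normal curve."""
    # A central area of `level` leaves (1 - level) / 2 in each tail.
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


def confidence_interval(sample_mean, sigma, n, level=0.95):
    """Lower and upper limits: sample mean -/+ z * sigma / sqrt(n)."""
    margin = z_for_confidence(level) * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


print(round(z_for_confidence(0.95), 3))  # z for 95% confidence
print(round(z_for_confidence(0.90), 3))  # z for 90% confidence

lower, upper = confidence_interval(70.72, sigma=10, n=10, level=0.95)
print(round(lower, 2), round(upper, 2))
```

Calling confidence_interval with a higher level, a smaller n, or a larger sigma gives a wider interval, in line with the properties listed above.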
The central area of 0.90 lies between z = -1.645 and z = +1.645.

The “68.3-95.5-99.7” rule

Example

The numbers below were randomly drawn from a normal population with σ = 10.

56.87, 73.96, 59.77, 75.89, 71.60, 81.94, 69.11, 80.07, 74.70, 63.32

The sample mean = 70.72 and we want a 95% confidence interval. So,

$$ L = 70.72 - 1.96 \times \frac{10}{\sqrt{10}} = 64.52 $$
$$ U = 70.72 + 1.96 \times \frac{10}{\sqrt{10}} = 76.92 $$

Example cont.

So we are 95% confident that the population mean is between 64.52 and 76.92.

What does this really mean?
Would you get the same result from another random sample of size 10?
What if you took another 100 samples and constructed 100 confidence intervals?
They would all be different and about 5% of them would not even contain the population mean.

The standard error

The standard error of the sample mean is:

$$ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} $$

You can see that the standard error gets smaller as the sample size increases.

The standard error also shows up in the confidence interval formula:

$$ \bar{x} \pm z \frac{\sigma}{\sqrt{n}} $$

This is why the interval gets smaller as n increases.

Null hypothesis

The null hypothesis H0 is the hypothesis of no difference or no relationship.
But there is a possibility of a wrong decision.
Reducing the risk of one error increases the risk of another error.

                        “State of the world”: the actual situation
Researcher’s decision   H0 True                        H0 False
Accept H0               Correct                        Error (Type II error, p = β)
Reject H0               Error (Type I error, p = α)    Correct (power, p = 1 - β)

Type I and Type II errors

Type I error occurs when the decision is to reject the null hypothesis when it is actually true.
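
The definition of a Type I error can be checked by simulation. The sketch below is an illustration under stated assumptions, not a method from the slides. It repeatedly draws samples from a normal population in which the null hypothesis is true (the population mean really is mu0), applies a two-sided z-test with known σ at level α, and counts how often the null hypothesis is wrongly rejected. The function name type_i_error_rate and all parameter values (mu0 = 50, σ = 10, n = 30, 10,000 trials, the random seed) are hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_i_error_rate(mu0=50.0, sigma=10.0, n=30, alpha=0.05,
                      trials=10_000, seed=1):
    """Proportion of samples that reject H0: mu = mu0 when H0 is actually true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical z-score
    standard_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 holds (true mean is mu0).
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu0) / standard_error
        if abs(z) > z_crit:
            rejections += 1  # H0 rejected although it is true: a Type I error
    return rejections / trials


rate = type_i_error_rate()
print("observed Type I error rate:", rate)
```

The returned proportion should be close to α. This is the sense in which a true null hypothesis is rejected only about 5% of the time at the 0.05 level.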
