Common Statistical Tests and Applications in Epidemiological Literature
Total Page:16
File Type:pdf, Size:1020Kb
ERIC NOTEBOOK SERIES Second Edition Common Statistical Tests and Applications in Epidemiological Literature Second Edition Authors: Any individual in the medical field distribution, location and variation. will, at some point, encounter There are three different types of Lorraine K. Alexander, DrPH instances when epidemiological data: nominal, ordinal, and methods and statistics will be Brettania Lopes, MPH continuous data. Nominal data do valuable tools in addressing not have an established order or rank research questions of interest. Kristen Ricchetti-Masterson, MSPH and contain a finite number of values. Karin B. Yeatts, PhD, MS Examples of such questions might Gender and race are examples of include: nominal data. Ordinal data have a limited number of values between Will treatment with a new anti- which no other possible values exist. hypertensive drug significantly Number of children and stage of lower mean systolic blood disease are good examples of ordinal pressure? data. It should be noted that ordinal Is a visit with a social worker, in data do not have to have evenly addition to regular medical spaced values as occurs with visits, associated with greater continuous data, however, there is an satisfaction of care for cancer implied underlying order. Since both patients as compared to those ordinal and nominal data have a finite who only have regular medical number of possible values, they are visits? also referred to as discrete data. The last type of data is continuous data There are a number of steps in which are characterized by having an evaluating data before actually infinite number of evenly spaced addressing the above questions. values. Blood pressure and age fall These steps include description of into this category. It should be noted your data as well as determining for data collection and analysis that what the appropriate tests are for continuous, ordinal, or nominal values your data. can be grouped. Grouped data are Description of data often referred to as categorical. The type of data one has determines Possible categories might include: the statistical procedures that are low, medium, high, or those utilized. Data are typically described representing a numerical range. in a number of ways: by type, ERIC at the UNC CH Department of Epidemiology Medical Center ERIC NOTEBOOK PA G E 2 A second characteristic of data description, distribution, refers the alternative hypothesis is that there is a difference in to the frequencies or probabilities with which values occur the mean blood pressure of the standard treatment and within our population. Discrete data are often represented new drug group following therapy. The alternative graphically with bar graphs like the one below (Figure 1). hypothesis might also be described as your "best guess" as to what the values are. However, in statistical analysis, the null hypothesis is the main interest, and is the one actually being tested. In statistical testing, we assume that the null hypothesis is correct and determine how likely we are to have obtained the sample (or values) we actually obtained in our study under the condition of the null. If we determine that the probability of obtaining the sample we observed is sufficiently small, then we can reject the null hypothesis. Since we are able to reject the null hypothesis, we have Figure 1. Bar graph evidence that the alternative hypothesis may be true. Continuous data are commonly assumed to have a symmetric, On the other hand, if the probability of obtaining our study bell-shaped curve as shown below (Figure 2). This is known as a results is not small, we fail to reject the assumption that Gaussian distribution, the most commonly assumed the null hypothesis is true. It should be noted that we are distribution in statistical analysis. not concluding that the null is true. This is a small, but important distinction. A test that fails to reject the null hypothesis should be considered inconclusive. An example will help to illustrate this point. In a sealed bag, we have 100 blue marbles and 20 red marbles. (This bag is essentially representing the entire population). One individual formulates the null hypothesis that “all the marbles are blue”, and the alternative which is “all the marbles are not blue”. To test this hypothesis, 10 marbles are sampled from the bag. All Figure 2. Gaussian distribution ten marbles selected are indeed blue. Thus the individual has failed to reject the null that all the marbles Hypothesis testing in the bag are blue. However, because all of the marbles Hypothesis testing, also known as statistical inference or were not sampled, you cannot conclude that all the significance testing, involves testing a specified hypothesized marbles in the bag are blue. (We happen to know this is condition for a population’s parameter. This condition is best not true, but it is impossible to know in the real world with described as the null hypothesis. For example, in a clinical trial populations too large to fully evaluate). If another of a new anti-hypertensive drug, the null hypothesis would state individual selects 10 marbles from the bag and finds that that there is no difference in effect when comparing the new 8 are blue and 2 are red, we can reject the null hypothesis drug to the current standard treatment. Contrary to the null is that all the marbles are blue since we have selected at the alternative hypothesis, which generally defines the possible least one red marble. values for a parameter of interest. For the previous example, ERIC at the UNC CH Department of Epidemiology Medical Center ERIC NOTEBOOK PA G E 3 Error in statistical testing Example Earlier, we indicated that we can reject the null hypothesis To evaluate if drug Z reduces mean systolic blood pressure, a randomized clinical trial will be performed if the probability of obtaining a sample like the one where 12 individuals receive drug Z and 8 receive a observed in our study is sufficiently small. You may ask placebo. The null hypothesis to be tested is that there “What is sufficiently small?” “How small” is determined by is no difference in the mean systolic blood pressure of the experimental and placebo groups. The alternative how willing we are to reject the null hypothesis when it hypothesis is that there is a difference between the accurately reflects the population from which it is means of the two groups. The type I error for your trial sampled. This type of error is called a Type I error. will be 5%. This error is also commonly called alpha (α). Alpha is the probability of rejecting the null hypothesis when the null is Results true. This probability is selected by the researcher and is Below is the group assignments and resulting systolic typically set at 0.05. It is important to remember that this blood pressure (SBP) is an arbitrary cut-point and should be taken into Patient Assignment Systolic BP consideration when making conclusions about the results of the study. 1 Drug Z 100 3 Drug Z 110 There is a second type of error that can be made during statistical testing. It is known as Type II error, which is the 5 Drug Z 122 probability of not rejecting the null when the alternative 7 Drug Z 109 hypothesis is indeed true, or in other words, failing to 9 Drug Z 108 reject the null when the null hypothesis is false. Type II 11 Drug Z 111 error is commonly known as β. Beta relates to another 13 Drug Z 118 important parameter in statistical testing which is power. Power is equal to (1-β) and is essentially the ability to 15 Drug Z 105 avoid making a type II error. Like α, power is also defined 17 Drug Z 115 by the researcher, and is typically set at 0.80. Below is a 18 Drug Z 119 schematic of the relationships between α, β and power. 19 Drug Z 106 20 Drug Z 109 2 Placebo 129 Decision Truth Null True Null False 4 Placebo 125 Reject Null α power 6 Placebo 136 Accept Null β 8 Placebo 129 10 Placebo 135 Students’ T test 12 Placebo 134 This test is most commonly used to test the difference between 14 Placebo 140 the means of the dependent variables of two groups. For example, this test would be appropriate if one wanted to 16 Placebo 128 evaluate whether or not a new anti-hypertensive drug reduces mean systolic blood pressure. meandrug = 100 + 110 + …+ 109 = 111 mm Hg 12 ERIC at the UNC CH Department of Epidemiology Medical Center ERIC NOTEBOOK PA G E 4 this quantity would = 0 when there is no difference meanplacebo = 129 + 125 + … + 128 = 132 mm Hg expected between the drug and placebo groups). 8 t = (meandrug - meanplacebo) - (*meandrug - *meanplacebo) meandrug – meanplacebo = - 21 mm Hg SE (meandrug – meanplacebo) Now that we have determined the difference between t = -21 - 0 = -7.8 = |-7.8| = 7.8 means, we need to determine the standard error for that 2.69 difference which is calculated using the pooled estimate of We now compare our calculated value to a table of critical the variance ( 2). σ values for the Students' T distribution (found in most The formula for the standard error of the drug Z group is: basic statistics books). The table also requires that we know the degrees of freedom and the value of a we have selected. Degrees of freedom (df) refers to the amount of 2 σ drug = ∑ (SBPdrug –meandrug)2 = information that a sample has in estimating the variance. It is generally the sample size minus one. The df for our ndrug - 1 calculation is 12 + 8 - 2 = 18 (the sample size for each 2 σ drug =[(100-111)2 + (110-111)2 + ...+ (109-111)2] = 40.9 group - 1).