BIOSTATISTICS NURS 3324

Assessing normality

Many statistical tests (the parametric tests) are based on the assumption that the data are normally distributed. However, when we collect data for a study, we rarely see perfectly normal distributions. Thus, it is important to evaluate how well the data set is approximated by a normal distribution. Generally, there are two main ways of assessing normality: graphically and numerically. This section presents some statistical tools for checking whether a given set of data is normally distributed.

1. Construct charts

• For small or moderate-sized data sets, construct a stem-and-leaf display and a box-and-whisker plot; if the data are approximately normal, both will look symmetric.
• For large data sets, construct a histogram or polygon and see whether the distribution is bell-shaped or deviates grossly from a bell-shaped normal distribution. Look for skewness and asymmetry. Look for gaps in the distribution, that is, intervals with no observations.

However, remember that normality requires more than just symmetry; the fact that the histogram is symmetric does not guarantee that the data come from a normal distribution. Also, data sampled from a normal distribution will sometimes look distinctly different from the parent distribution. So we need techniques that allow us to determine whether data are significantly different from a normal distribution.
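As a rough illustration of the first chart, the sketch below builds a text stem-and-leaf display. It uses the student heights listed in the worked example of this section; the function name `stem_and_leaf` and the choice of whole inches as stems are assumptions made for this illustration only.

```python
def stem_and_leaf(data):
    """Return the lines of a text stem-and-leaf display.

    Stems are the whole-number parts; leaves are the first decimal digit.
    """
    groups = {}
    for x in sorted(data):
        stem, leaf = divmod(round(x * 10), 10)  # e.g. 68.5 -> stem 68, leaf 5
        groups.setdefault(stem, []).append(str(leaf))
    # Include empty stems so that gaps in the distribution stay visible.
    return [
        f"{stem:>3} | {''.join(groups.get(stem, []))}"
        for stem in range(min(groups), max(groups) + 1)
    ]

# Student heights (inches) as listed in the worked example of this section.
heights = [71.0, 69.0, 70.0, 72.5, 73.0, 70.0, 71.5, 70.5, 72.0, 71.0, 68.5, 69.0,
           69.0, 68.5, 74.0, 67.0, 69.0, 71.5, 66.0, 70.0, 68.5, 74.0, 74.5, 74.0]

display = stem_and_leaf(heights)
print("\n".join(display))
```

Reading the display from top to bottom gives a quick impression of symmetry, skewness, and gaps without any plotting software.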

2. Compute descriptive summary measures

a. The mean, median, and mode will have similar values.
b. The interquartile range will be approximately equal to 1.33 s.
c. The range will be approximately equal to 6 s.

Here s is the sample standard deviation.
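The checks above can be sketched in a few lines of Python. This is only an illustration: the function name `summary_checks` is hypothetical, the data are the student heights from the worked example of this section, and the quartiles use the default method of the standard-library `statistics.quantiles`, so other software may give slightly different quartile values.

```python
import statistics

def summary_checks(data):
    """Return the summary measures used in the informal normality check."""
    s = statistics.stdev(data)                   # sample standard deviation
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles (default method)
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "iqr_over_s": (q3 - q1) / s,                  # compare with 1.33
        "range_over_s": (max(data) - min(data)) / s,  # compare with 6
    }

# Student heights (inches) as listed in the worked example of this section.
heights = [71.0, 69.0, 70.0, 72.5, 73.0, 70.0, 71.5, 70.5, 72.0, 71.0, 68.5, 69.0,
           69.0, 68.5, 74.0, 67.0, 69.0, 71.5, 66.0, 70.0, 68.5, 74.0, 74.5, 74.0]

checks = summary_checks(heights)
for name, value in checks.items():
    print(f"{name}: {value:.2f}")
```

In a sample this small the range usually falls well short of 6 s, because extreme values are rarely observed, so check (c) is more informative for large data sets.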

3. Normal Counts method

Count the number of observations within 1, 2, and 3 standard deviations of the mean and compare the results with what is expected for a normal distribution under the 68-95-99.7 rule. According to the rule:

• 68% of the observations lie within one standard deviation of the mean.
• 95% of the observations lie within two standard deviations of the mean.
• 99.7% of the observations lie within three standard deviations of the mean.

Example: As part of a demonstration one semester, I collected data on the heights of a sample of 24 IUG students. These data are presented in the table below. Does the sample appear to have been drawn from a normally distributed population?

Table. Heights, in inches, of 24 IUG biostatistics students.

71.0 69.0 70.0 72.5 73.0 70.0 71.5 70.5 72.0 71.0 68.5 69.0 69.0 68.5 74.0 67.0 69.0 71.5 66.0 70.0 68.5 74.0 74.5 74.0

Solution: For the Normal Counts method, carry out the following steps.

Arrange the values in ascending order:

66, 67, 68.5, 68.5, 68.5, 69, 69, 69, 69, 70, 70, 70, 70.5, 71, 71, 71.5, 71.5, 72, 72.5, 73, 74, 74, 74, 74.5

Calculate the mean: $\bar{x} = 70.6$.
Calculate the standard deviation: $s = 2.3$.
Find $\bar{x} \pm s$: $\bar{x} - s = 68.3$ and $\bar{x} + s = 72.9$. Thus $\bar{x} \pm s$ is the interval from 68.3 to 72.9.

Count the number of values that lie between 68.3 and 72.9:

68.5, 68.5, 68.5, 69, 69, 69, 69, 70, 70, 70, 70.5, 71, 71, 71.5, 71.5, 72, 72.5

Total = 17. That is, 17 out of the 24 observations (17/24 ≈ 0.71 = 71%) fall within $\bar{x} \pm s$, i.e. between 68.3 and 72.9, which is approximately equal to the expected 68%. Thus, there is no reason to doubt that the sample is drawn from a normal population.
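The same count can be reproduced, and extended to 2 and 3 standard deviations, with a short Python sketch. The function name `normal_counts` is hypothetical, and the code uses the unrounded mean and standard deviation, so the interval limits differ slightly from the rounded 68.3 and 72.9 used above.

```python
import statistics

def normal_counts(data):
    """Return {k: proportion of observations within k standard deviations of the mean}."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    n = len(data)
    return {
        k: sum(mean - k * s <= x <= mean + k * s for x in data) / n
        for k in (1, 2, 3)
    }

# Student heights (inches) from the table above.
heights = [71.0, 69.0, 70.0, 72.5, 73.0, 70.0, 71.5, 70.5, 72.0, 71.0, 68.5, 69.0,
           69.0, 68.5, 74.0, 67.0, 69.0, 71.5, 66.0, 70.0, 68.5, 74.0, 74.5, 74.0]

expected = {1: 0.68, 2: 0.95, 3: 0.997}  # the 68-95-99.7 rule
proportions = normal_counts(heights)
for k in (1, 2, 3):
    print(f"within {k} s: observed {proportions[k]:.1%}, expected {expected[k]:.1%}")
```

With only 24 observations, each value accounts for about 4% of the sample, so small departures from the expected percentages should not be over-interpreted.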

What to do if Not Normal?

According to some researchers, violations of normality are sometimes not problematic for running parametric tests. When a variable is not normally distributed (a distributional requirement for many different analyses), we can create a transformed variable and test it for normality. If the transformed variable is normally distributed, we can substitute it in our analysis.

Data transformation

Data transformation involves performing a mathematical operation on each of the scores in a set of data, thereby converting the data into a new set of scores, which are then used to analyze the results of an experiment.

To solve for Positive Skew

Square roots, logarithmic, and inverse (1/X) transforms "pull in" the right side of the distribution in toward the middle and normalize right (positive) skew. Inverse transforms are stronger than logarithmic, which are stronger than roots.
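To see the relative strength of the three transforms, the sketch below applies each one to a small set of hypothetical right-skewed scores and reports the skewness before and after. The scores are invented for illustration, and skewness is measured here by the moment coefficient m3 / m2^1.5, which is one common definition and an assumption of this example. All three transforms require positive scores.

```python
import math

def skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5 (0 for a symmetric sample)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed scores (all positive, as these transforms require).
scores = [11, 12, 14, 18, 26, 42, 74]

skew = {
    "raw": skewness(scores),
    "square root": skewness([math.sqrt(x) for x in scores]),
    "logarithm": skewness([math.log(x) for x in scores]),
    "inverse (1/X)": skewness([1 / x for x in scores]),
}
for name, value in skew.items():
    print(f"{name:>14}: skewness = {value:.2f}")
```

Note that 1/X reverses the order of the scores, so large raw scores become small transformed scores; this should be kept in mind when interpreting results based on an inverse-transformed variable.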
