The Normal Probability Distribution

BIOSTATISTICS NURS 3324

Assessing normality

Many statistical tests (the parametric tests) assume that the data are normally distributed. In practice, data collected for a study rarely follow a perfectly normal distribution, so it is important to evaluate how well a normal distribution approximates the data set. There are two main ways of assessing normality: graphically and numerically. This section presents some statistical tools for checking whether a given set of data is normally distributed.

1. Construct charts

For small or moderate-sized data sets, construct a stem-and-leaf display or a box-and-whisker plot and check whether it looks symmetric. For large data sets, construct a histogram or polygon and see whether the distribution is bell-shaped or deviates grossly from a bell-shaped normal distribution. Look for skewness and asymmetry. Look for gaps in the distribution, that is, intervals with no observations.

Remember, however, that normality requires more than symmetry: a symmetric histogram does not by itself mean that the data come from a normal distribution. Conversely, data sampled from a normal distribution will sometimes look distinctly different from the parent distribution. We therefore need techniques that allow us to determine whether data differ significantly from a normal distribution.

2. Compute descriptive summary measures

If the data are approximately normal, where s is the sample standard deviation:

a. The mean, median, and mode will have similar values.
b. The interquartile range will be approximately equal to 1.33 s (for an exactly normal distribution the ratio is about 1.35).
c. The range will be approximately equal to 6 s.

3. Normal counts method

Count the number of observations within 1, 2, and 3 standard deviations of the mean and compare the results with what the 68-95-99.7 rule predicts for a normal distribution. According to the rule, about 68% of the observations lie within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations.
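The summary-measure checks in step 2 lend themselves to a quick script. The following is a minimal Python sketch, not part of the original handout: the function name `summary_checks` and the simulated sample (1000 values drawn from a normal distribution with mean 70 and standard deviation 2.5) are assumptions made purely for illustration. The mode is left out because it is rarely informative for continuous measurements.

```python
import random
import statistics


def summary_checks(data):
    """Return the step-2 summary measures for a sample (hypothetical helper)."""
    s = statistics.stdev(data)                    # sample standard deviation
    q1, _, q3 = statistics.quantiles(data, n=4)   # the three quartile cut points
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "iqr_over_s": (q3 - q1) / s,                   # compare with about 1.33
        "range_over_s": (max(data) - min(data)) / s,   # compare with about 6
    }


# Simulated data, for illustration only (not from the handout).
random.seed(0)
simulated = [random.gauss(70, 2.5) for _ in range(1000)]

checks = summary_checks(simulated)
for name, value in checks.items():
    print(f"{name}: {value:.2f}")
```

For data that are close to normal, the mean and median should nearly coincide and the IQR ratio should be near 1.33. The range ratio grows slowly with sample size, so 6 is best treated as a rough guide rather than a strict target.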
Example: As part of a demonstration one semester, I collected data on the heights of a sample of 25 IUG biostatistics students. The data are presented in the table below. Could this sample have been drawn from a normally distributed population?

Table. Heights, in inches, of 25 IUG biostatistics students.

71.0  69.0  70.0  72.5  73.0  70.0
71.5  70.5  72.0  71.0  68.5  69.0
69.0  68.5  74.0  67.0  69.0  71.5
66.0  70.0  68.5  74.0  74.5  74.0

Only 24 values appear in the table, and the solution below is based on these 24 observations.

Solution: For the normal counts method, proceed as follows.

Arrange the values in ascending order:

66, 67, 68.5, 68.5, 68.5, 69, 69, 69, 69, 70, 70, 70, 70.5, 71, 71, 71.5, 71.5, 72, 72.5, 73, 74, 74, 74, 74.5

Calculate the arithmetic mean, x̄ = 70.6, and the standard deviation, s = 2.3.

Find x̄ ± s:

x̄ − s = 70.6 − 2.3 = 68.3
x̄ + s = 70.6 + 2.3 = 72.9

Thus x̄ ± s is the interval from 68.3 to 72.9.

Count the number of values that lie between 68.3 and 72.9:

68.5, 68.5, 68.5, 69, 69, 69, 69, 70, 70, 70, 70.5, 71, 71, 71.5, 71.5, 72, 72.5

Total = 17

So 17 of the 24 observations, i.e. 17/24 ≈ 0.71 = 71%, fall within x̄ ± s (between 68.3 and 72.9), which is approximately equal to the expected 68%. Thus, there is no reason to doubt that the sample was drawn from a normal population.

What to do if not normal?

According to some researchers, violations of normality are sometimes not problematic for running parametric tests. When a variable is not normally distributed (a distributional requirement of many analyses), we can create a transformed variable and test it for normality. If the transformed variable is normally distributed, we can substitute it in our analysis.

Data transformation

Data transformation involves performing a mathematical operation on each of the scores in a set of data, thereby converting the data into a new set of scores, which are then used to analyze the results of an experiment.
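Returning to the worked example of the heights data, the normal counts calculation can be checked with a short script. This is a minimal Python sketch rather than part of the original solution: the helper name `count_within` is hypothetical, the data are the 24 heights listed in the table, and the script uses the unrounded mean and standard deviation instead of 70.6 and 2.3.

```python
import statistics

# The 24 heights (inches) listed in the table.
heights = [71.0, 69.0, 70.0, 72.5, 73.0, 70.0, 71.5, 70.5, 72.0, 71.0,
           68.5, 69.0, 69.0, 68.5, 74.0, 67.0, 69.0, 71.5, 66.0, 70.0,
           68.5, 74.0, 74.5, 74.0]

mean = statistics.mean(heights)
s = statistics.stdev(heights)  # sample standard deviation (n - 1 denominator)


def count_within(data, center, spread, k):
    """Count the observations within k standard deviations of the mean."""
    low, high = center - k * spread, center + k * spread
    return sum(low <= x <= high for x in data)


# Proportions predicted by the 68-95-99.7 rule.
expected = {1: 0.68, 2: 0.95, 3: 0.997}
counts = {k: count_within(heights, mean, s, k) for k in expected}

print(f"mean = {mean:.1f}, s = {s:.1f}")
for k, inside in counts.items():
    share = inside / len(heights)
    print(f"within {k} SD: {inside}/{len(heights)} = {share:.0%} "
          f"(rule: {expected[k]:.1%})")
```

With these data the script finds 17 of the 24 values within one standard deviation, matching the hand count. It also reports the two and three standard deviation bands, which the worked solution does not check; all 24 values fall inside both.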
To correct positive skew

Square root, logarithmic, and inverse (1/X) transformations "pull in" the right side of the distribution toward the middle and normalize right (positive) skew. Inverse transformations are stronger than logarithmic ones, which are stronger than square roots.
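The relative strength of these transformations can be illustrated numerically. The sketch below is an assumption-laden example, not taken from the handout: the right-skewed sample is simulated from a lognormal distribution, the `skewness` helper is a simple moment-based coefficient written for this illustration, and the inverse is computed as -1/X (negated so that the order of the scores is preserved and the skewness values stay comparable).

```python
import math
import random


def skewness(values):
    """Moment-based skewness coefficient (hypothetical helper)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n   # third central moment
    return m3 / m2 ** 1.5


# Simulated positively skewed scores, for illustration only.
random.seed(1)
raw = [random.lognormvariate(0.0, 0.8) for _ in range(2000)]

transformed = {
    "raw": raw,
    "sqrt": [math.sqrt(x) for x in raw],
    "log": [math.log(x) for x in raw],
    # Negated inverse: keeps small scores small and large scores large.
    "neg_inverse": [-1.0 / x for x in raw],
}

skew = {name: skewness(vals) for name, vals in transformed.items()}
for name, value in skew.items():
    print(f"{name:>12}: skewness = {value:.2f}")
```

The skewness falls at each step from raw to square root to logarithm to inverse. For this simulated sample the logarithm brings the skewness close to zero, while the inverse overshoots into negative skew, which is why the mildest transformation that achieves approximate normality is usually preferred.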
