Chapter 5: The Importance of Measuring Why is Variability important? Variability Simple Example: Suppose you wanted to • Measures of Central Tendency - know how satisfied CNAs are with their Numbers that describe what is typical or job and you found that the average average (central) in a distribution (e.g., mean, mode, median). score (on a five po int sca l)le) was “3”.

• Measures of Variability - Numbers that describe diversity or variability in the How would knowing the variability help distribution (e.g., , index of qualitative you to understand how satisfied CNAs variation, , box plot, are with their job? , ).

Answer: It would help you to see whether the average score of “3” means that the majority of CNAs are neutral about their jobs or that there is a split with CNAs either feeling very good (score of 5) or very bad (score of 1) about their job (average of 1’s and 5’s = 3).

Another example.

A measure of variability: Understanding the Index of Index of Qualitative Variation Qualitative Variation (IQV) • The IQV is a single number that expresses – A measure of variability for nominal variables. the diversity of a distribution.

- A measure that shows how diverse a particular category •The IQV ranges from 0 to 1 of a variable is. • An IQV of 0 would indicate that the distribution has NO diversity at all. For example, when looking at ethnic diversity among •An IQV of 1 would indicate that the states, some states are very diverse with multiple ethnic distribution is maximally diverse. groups and lots of people in each group. Other states may have multiple ethnic groups but most of the state’s residents are in only one of the ethnic groups. IQV (NOTE: You will not be responsible for calculating would provide a measure for how diverse each state the IQV but you will be responsible for being able actually is. to interpret an IQV) Index of Qualitative Variation in Real Life: Diversity in the U.S. The Range (Numbers differ somewhat from text)

State IQV State IQV State IQV Hawaii 0.82 North Carolina 0.56 Utah 0.30 California 0.72 Alabama 0.54 Indiana 0.30 New Mexico 0.68 Delaware 0.53 Wisconsin 0.28 New York 0.67 Oklahoma 0.48 Nebraska 0.28 • Range – A measure of variation in Texas 0.66 Colorado 0.46 South Dakota 0.26 interval-ratio variables. Maryland 0.65 Connecticut 0.46 Minnesota 0.26 Georgia 0.63 Arkansas 0.43 Idaho 0.25 Mississippi 0.62 Michigan 0.43 Kentucky 0.23 • It is the difference between the Louisiana 0.62 Tennessee 0.42 Wyoming 0.23 New Jersey 0.62 Washington 0.41 Montana 0.22 highest (maximum) and the lowest Florida 0.60 Massachusetts 0.38 North Dakota 0.18 South Carolina 0.59 Rhode Island 0.38 Iowa 0.17 (minimum) scores in the distribution. Illinois 0.58 Kansas 0.35 West Virginia 0.12 Nevada 0.58 Pennsylvania 0.35 New Hampshire 0.11 Arizona 0.58 Missouri 0.34 Vermont 0.07 Alaska 0.58 Ohio 0.34 Maine 0.07 Range = highest score - lowest score Virginia 0.56 Oregon 0.33

What is the range for these Index of Qualitative What is the range for these Index of Qualitative Variation (IQV) scores? Variation (IQV) scores? (higher number means more diversity) (higher number means more diversity) Steps to determine: subtract the lowest score ____from the Steps to determine: subtract the lowest score _.06___from highest _____to obtain the range of IQV scores_____. the highest _____to obtain the range of IQV scores_____.

State IQV State IQV State IQV State IQV State IQV State IQV California 0.80 Alabama 0.51 Indiana 0.27 California 0.80 Alabama 0.51 Indiana 0.27 New Mexico 0.76 North Carolina 0.51 Utah 0.26 New Mexico 0.76 North Carolina 0.51 Utah 0.26 Texas 0.74 Delaware 0.49 Nebraska 0.24 Texas 0.74 Delaware 0.49 Nebraska 0.24 New York 0.66 Colorado 0.45 South Dakota 0.24 New York 0.66 Colorado 0.45 South Dakota 0.24 Hawaii 0.64 Oklahoma 0.44 Wisconsin 0.24 Hawaii 0.64 Oklahoma 0.44 Wisconsin 0.24 Maryland 0.62 Connecticut 0.42 Idaho 0.23 Maryland 0.62 Connecticut 0.42 Idaho 0.23 New Jersey 0.61 Arkansas 0.40 Wyoming 0.22 New Jersey 0.61 Arkansas 0.40 Wyoming 0.22 Louisiana 0.61 Michigan 0.40 Kentucky 0.20 Louisiana 0.61 Michigan 0.40 Kentucky 0.20 Arizona 0.61 Tennessee 0.39 Minnesota 0.20 Arizona 0.61 Tennessee 0.39 Minnesota 0.20 Florida 0.61 Washington 0.37 Montana 0.20 Florida 0.61 Washington 0.37 Montana 0.20 Mississippi 0.61 Massachusetts 0.34 North Dakota 0.17 Mississippi 0.61 Massachusetts 0.34 North Dakota 0.17 Georgia 0.59 Missouri 0.31 Iowa 0.13 Georgia 0.59 Missouri 0.31 Iowa 0.13 Nevada 0.57 Ohio 0.31 West Virginia 0.11 Nevada 0.57 Ohio 0.31 West Virginia 0.11 Illinois 0.57 Pennsylvania 0.31 New Hampshire 0.08 Illinois 0.57 Pennsylvania 0.31 New Hampshire 0.08 South Carolina 0.56 Kansas 0.30 Maine 0.06 South Carolina 0.56 Kansas 0.30 Maine 0.06 Alaska 0.56 Rhode Island 0.30 Vermont 0.06 Alaska 0.56 Rhode Island 0.30 Vermont 0.06 Virginia 0.53 Oregon 0.28 Virginia 0.53 Oregon 0.28

What is the range for these Index of Qualitative What is the range for these Index of Qualitative Variation (IQV) scores? Variation (IQV) scores? (higher number means more diversity) (higher number means more diversity)? Steps to determine: subtract the lowest score _.06___from Steps to determine: subtract the lowest score _.06___from the the highest __.80___to obtain the range of IQV scores_____. highest __.80___to obtain the range of IQV scores__.74___. State IQV State IQV State IQV State IQV State IQV State IQV California 0.80 Alabama 0.51 Indiana 0.27 California 0.80 Alabama 0.51 Indiana 0.27 New Mexico 0.76 North Carolina 0.51 Utah 0.26 New Mexico 0.76 North Carolina 0.51 Utah 0.26 Texas 0.74 Delaware 0.49 Nebraska 0.24 Texas 0.74 Delaware 0.49 Nebraska 0.24 New York 0.66 Colorado 0.45 South Dakota 0.24 New York 0.66 Colorado 0.45 South Dakota 0.24 Hawaii 0.64 Oklahoma 0.44 Wisconsin 0.24 Hawaii 0.64 Oklahoma 0.44 Wisconsin 0.24 Maryland 0.62 Connecticut 0.42 Idaho 0.23 Maryland 0.62 Connecticut 0.42 Idaho 0.23 New Jersey 0.61 Arkansas 0.40 Wyoming 0.22 New Jersey 0.61 Arkansas 0.40 Wyoming 0.22 Louisiana 0.61 Michigan 0.40 Kentucky 0.20 Louisiana 0.61 Michigan 0.40 Kentucky 0.20 Arizona 0.61 Tennessee 0.39 Minnesota 0.20 Arizona 0.61 Tennessee 0.39 Minnesota 0.20 Florida 0.61 Washington 0.37 Montana 0.20 Florida 0.61 Washington 0.37 Montana 0.20 Mississippi 0.61 Massachusetts 0.34 North Dakota 0.17 Mississippi 0.61 Massachusetts 0.34 North Dakota 0.17 Georgia 0.59 Missouri 0.31 Iowa 0.13 Georgia 0.59 Missouri 0.31 Iowa 0.13 Nevada 0.57 Ohio 0.31 West Virginia 0.11 Nevada 0.57 Ohio 0.31 West Virginia 0.11 Illinois 0.57 Pennsylvania 0.31 New Hampshire 0.08 Illinois 0.57 Pennsylvania 0.31 New Hampshire 0.08 South Carolina 0.56 Kansas 0.30 Maine 0.06 South Carolina 0.56 Kansas 0.30 Maine 0.06 Alaska 0.56 Rhode Island 0.30 Vermont 0.06 Alaska 0.56 Rhode Island 0.30 Vermont 0.06 Virginia 0.53 Oregon 0.28 Virginia 0.53 Oregon 0.28 Inter-quartile Range Inter-quartile Range

• Inter-quartile range (IQR) – The width of the • Inter-quartile range (IQR) – The width of the middle 50 percent of the distribution. middle 50 percent of the distribution.

• The IQR helps us to ggpet a better picture of • It is deffffined as the difference between the the variation in the data than the range. upper and lower quartiles.

• The shortcoming of the range is that an • IQR = the larger quartile minus the smaller “outlying” case at the top or bottom can quartile or Q1 – Q3 increase the range substantially.

What is the Inter-Quartile Range (IQR) for these Index of Qualitative Variation (IQV) Scores? What is the Inter-Quartile Range (IQR) for the State IQV State IQV State IQV California 0.80 Alabama 0.51 Indiana 0.27 Index of Qualitative Variation (IQV) Scores? New Mexico 0.76 North Carolina 0.51 Utah 0.26 Texas 0.74 Delaware 0.49 Nebraska 0.24 New York 0.66 Colorado 0.45 South Dakota 0.24 Steps to determine the IQR (Q1 – Q3): Hawaii 0.64 Oklahoma 0.44 Wisconsin 0.24 Maryland 0.62 Connecticut 0.42 Idaho 0.23 New Jersey 0.61 Arkansas 0.40 Wyoming 0.22 1. Order the categories from highest to lowest (or, if you order Louisiana 0.61 Michigan 0.40 Kentucky 0.20 the cas es from lowest to high est then you would subt ract Q1 Arizona 0.61 Tennessee 0.39 Minnesota 0.20 from Q3) Florida 0.61 Washington 0.37 Montana 0.20 2. To obtain Q1, begin by dividing N (total number of categories or Mississippi 0.61 Massachusetts 0.34 North Dakota 0.17 states) by 4 (or alternatively multiply N by .25). This Georgia 0.59 Missouri 0.31 Iowa 0.13 Nevada 0.57 Ohio 0.31 West Virginia 0.11 equals______? Illinois 0.57 Pennsylvania 0.31 New Hampshire 0.08 South Carolina 0.56 Kansas 0.30 Maine 0.06 Alaska 0.56 Rhode Island 0.30 Vermont 0.06 Virginia 0.53 Oregon 0.28

(Steps are provided on the next slides)

What is the Inter-Quartile Range (IQR) for the Index What is the Inter-Quartile Range (IQR) for the of Qualitative Variation (IQV) Scores? Index of Qualitative Variation (IQV) Scores?

Steps to determine the IQR (Q1 – Q3): Steps to determine the IQR (Q1 – Q3): 1. Order the categories from highest to lowest (or vice versa) 1. Order the categories from highest to lowest (or vice versa) 2. To obtain Q1, begin by dividing N (total number of categories or 2. To obtain QQ,1, be gin b y dividing N ( total num ber of categ ories or stat es) by 4 (or a lternati ve ly mu ltiply N by . 25). This states) by 4 (or alternatively multiply N by .25). This equals___12.5___? equals___12.5___? 3. We now know that Q1 falls between the 12th and 13th category 3. We now know that Q1 falls between the 12th and 13th category or, in this case, states. or, in this case, states. 4. To find the exact number for Q1, determine the midpoint 4. To find the exact number for Q1, determine the midpoint between the 12th and 13th states or between .59 and .57 (Georgia between the 12th and 13th states or between .59 and .57) and Nevada) 5. Q1 = __.58__ 5. Q1 = ____ What is the Inter-Quartile Range (IQR) for the Index of What is the Inter-Quartile Range (IQR) for the Index of Qualitative Variation (IQV) Scores? Qualitative Variation (IQV) Scores? Steps to determine the IQR (Q1 – Q3):

Steps to determine the IQR (Q1 – Q3): 6. To obtain Q3, begin by multiplying 12.5 by 3 (or alternatively multiply N (50) by . 75). This will give us__37. 5__. 6. To obtain Q3, begin by multiplying 12.5 by 3 (or alternatively multiply N (50) by .75). This will give us____. 7. Based on this number, Q3 falls between the 37th and 38th states (Nebraska and South Dakota).

8. Determine the midpoint between these two states. This equals_____.

The Box Plot What is the Inter-Quartile Range (IQR) for the Index of •The Box Plot is a graphic device provided by PASW Qualitative Variation (IQV) Scores? (and other statistical software programs) that visually presents the following elements: the range, Steps to determine the IQR (Q1 – Q3): the IQR, the median, the quartiles, the minimum (lowest value,) and the maximum (highest value) and 6. To obtain Q3, begin by multiplying 12.5 by 3 (or alternatively skewness vs symmetrical data. mmpy()y)ultiply N (50) by .75). This will g ive us__ 37.5 __ . Maximum th th 7. Based on this number, Q3 falls between the 37 and 38 states. Q3

8. Determine the midpoint between these two states. This equals___.24__. This tells us that 50% of the cases fall Range IQR Median between .58 and .24.

9. To obtain the IQR subtract Q3 from Q1 which equals_.34____ Q1

Minimum

Measures of Variability: Measures of Variability: Shortcomings of the Range and IQR the Variance

• The range is based on only two • The variance allows us to account for the total categories (the highest and lowest) amount of variation that includes the variation of all the categories. • The amount of variation in each category is • Likewise, only two categories are used to considered when calculating the variance. calculate the inter-quartile range. • The variance is an important statistic that is used in most other sophisticated statistics. • Neither allows us to know how much Therefore, it is important for you to give it variation there is among all the particular attention. categories. Determining Variation in the “Percentage Increase” in the Variation in the “Percentage Increase” in the Nursing Home Population, 1980-1990 Nursing Home Population, 1980-1990

Nine Regions of U.S. Percentage Nine Regions of U.S. Percentage

Pacific 15.7 Pacific 15.7 West North Central 16.2 West North Central 16.2 New England 17.6 New England 17.6 East North Central 23.2 East North Central 23.2 West South Central 24.3 WtSthCtlWest South Central 24. 3 Middle Atl anti c 28. 5 A“%i”Average “% increase” Middle Atlantic 28.5 East South Central 38.0 East South Central 38.0 Mountain 47.9 Mountain 47.9 South Atlantic 71.7 South Atlantic 71.7 ∑ f Y = 283.1 mean = ∑ f Y = 31.45 Y = N

How might we take into account the variation that exists for each category? Answer: the “mean” is the mid-point so you could measure how far each category is from the mid-point and then add up all these distances. The larger this sum the more variation.

Percentage Change in the Nursing Home Percentage Change in the Nursing Home Population, 1980-1990 Population, 1980-1990

Nine Regions of U.S. Percentage Y - Y Nine Regions of U.S. Percentage Y - Y

Pacific 15.7 15.7 - 31.5 = -15.8 Pacific 15.7 15.7 - 31.5 = -15.8 West North Central 16.2 16.2 - 31.5 = -15.3 West North Central 16.2 16.2 - 31.5 = -15.3 New England 17.6 17.6 - 31.5 = -13.9 New England 17.6 17.6 - 31.5 = -13.9 East North Central 23.2 23.2 - 31.5 = - 8.3 East North Central 23.2 23.2 - 31.5 = - 8.3 West South Cen tr al 24.3 24.3 - 3351.5 = - 7.2 West South Cen tr al 24.3 24.3 - 3351.5 = - 7.2 Middle Atlantic 28.5 28.5 - 31.5 = - 3.0 Middle Atlantic 28.5 28.5 - 31.5 = - 3.0 East South Central 38.0 38.0 - 31.5 = 6.5 East South Central 38.0 38.0 - 31.5 = 6.5 Mountain 47.9 47.9 - 31.5 = 16.4 Mountain 47.9 47.9 - 31.5 = 16.4 South Atlantic 71.7 71.7 - 31.5 = 40.2 South Atlantic 71.7 71.7 - 31.5 = 40.2

(mean = 31.5)∑ Y = 283.1∑ (Y – Y) = 0 (mean = 31.5)∑ Y = 283.1∑ (Y – Y) = 0

How might we take into account the variation that exists for each category? •One solution would be to take the absolute value for each number (ignore the Problem: when you add up the distances you end up with zero rather than the minus signs). Unfortunately, absolute values are very difficult to work with total variation from all the categories. mathematically. • Fortunately, there is another alternative.

Percentage Change in the Nursing Home Population, 1980-1990 Measures of Variability: the Variance Nine Regions of U.S. Percentage Y – Y ( Y – Y)2 (squared deviations) Pacific 15.7 15.7 - 31.5 = -15.8 249.64 The Variance is the average of the squared deviations from West North Central 16.2 16.2 - 31.5 = -15.3 234.09 the mean. 2 New England 17.6 17.6 - 31.5 = -13.9 193.21 East North Central 23.2 23.2 - 31.5 = - 8.3 68.89 ∑(Y −Y) West South Cen tr al 24.3 24.3 - 3351.5 = - 7.2 5581.84 2 Middle Atlantic 28.5 28.5 - 31.5 = - 3.0 9.00 sY = East South Central 38.0 38.0 - 31.5 = 6.5 42.25 N −1 Mountain 47.9 47.9 - 31.5 = 16.4 268.96 In our example we would take the sum of the squared South Atlantic 71.7 71.7 - 31.5 = 40.2 1616.04 deviations (2733.92) and divide this number by the total (mean = 31.5)∑ Y = 283.1∑ (Y – Y)2 = 2733.92 number of cases minus one (9 – 1 = 8). This would give us _341.74___ or the variance for the Percent Increase in •The best solution is to square the differences before adding them up (when two negative numbers are multiplied the resulting product is a positive number). the Nursing Home population by region. Measures of Variability: Measures of Variability: The Variance Standard Deviation vs Variance

To Sum Up: • One problem with the variance is that the final number The Variance is the average of the obtained is in a squared form squared deviations from the mean. (that is, we squared all the deviations from the mean and so th e final num ber is st ill “in flate d” in t his way The Variance is a measure of variability making it difficult to interpret) for interval-ratio variables. • One solution is to take the square root of the variance 2 so that the number is no longer in a squared form (or “inflated”) and it is back to its original form. The square ∑(Y −Y) root of the variance is called the Standard Deviation. s2 = Y N −1

Measures of Variability: Measures of Variability: Standard Deviation Standard Deviation In Sum • To obtain the square root of the variance simply enter the number (variance) into your calculator and then push • Standard Deviation – A measure of variation the square root button. for interval-ratio variables; it is equal to •If the variance is 341.74 the standard deviation would be the square root of the variance. ___18.49_____. This tells us that the percent of change in the nursing home population for the nine regions is 2 widely dispersed around the mean (mean = 31.45). ∑(Y −Y) •Thus, the standard deviation is a measure of the average 2 amount of variation (or deviation) around the mean. s = s = Y N −1

Measures of Variability: Considerations for Choosing a Standard Deviation Measure of Variability (a look at what’s to come in future chapters) •For ordinal variables, you can calculate the We will see later that when the data are IQR (inter-quartile range.) “normally distributed” around the mean (produce a normal curve), 34% of the scores • For interval-ratio variables, you can use will be one standard deviation above the IQR, or the variance/standard deviation. mean and 34% will be one standard deviation The standard deviation (also variance) below the mean. provides the most information, since it uses all of the values in the distribution in Scores are often “normally distributed” its calculation. around the mean when a random sample has been used to obtain the scores or there are a large number of cases. 2 Fini ∑(Y −Y) N −1

Cumulative Percentage Distribution

Minimum Cumulative Age Frequency Percentage Percentage 14 1 3.7 3.7

15 1 3.7 7.4

16 9 33.3 40.7

∑ 17 4 14.8 55.5

18 12 44.4 99.9* Total N 27 99.9*

∑ f Y ∑ f Y ∑ f Y Y = Y = Y = * Doesn’t total to 100% due to rounding N N N