Replacing Missing Values for Items Within a Scale

Suppose you have a ten-item scale where each item has a nine-point Likert-type response set, with values from 1 to 9. Some respondents failed to answer some items. After looking at the distribution of missing values, you decide to discard any case with more than one missing value. For cases with only one missing value, you will replace the missing value with the mean of the non-missing responses for that case on the items within that scale. This is reasonable to do, IMHO, if you assume that every item measures the same characteristic that every other item measures. I shall use an unreasonably small sample to illustrate how to replace missing values. If any of the items need to be reverse-scored, be sure to reflect them prior to replacing missing values.
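The replacement rule just described can be sketched in a few lines of Python, purely as an illustration of the logic and not as part of the PASW or SAS procedures in this handout. The function name impute_person_mean, the use of None to mark a missing response, and the item values for the example respondent are all hypothetical.

```python
from statistics import mean

def impute_person_mean(responses, max_missing=1):
    """Replace missing items (None) with the respondent's own item mean.

    Returns None when the case has more than `max_missing` missing items,
    which signals that the case should be discarded.
    """
    observed = [r for r in responses if r is not None]
    n_missing = len(responses) - len(observed)
    if n_missing > max_missing or not observed:
        return None
    person_mean = mean(observed)
    return [person_mean if r is None else r for r in responses]

# Hypothetical respondent: nine answered items, tenth item missing.
case = [7, 6, 5, 7, 6, 6, 7, 5, 6, None]
filled = impute_person_mean(case)
```

Here the nine observed responses sum to 55, so the value filled in for the tenth item is 55/9, about 6.11. A case with two or more missing items comes back as None rather than as a partially imputed list.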

PASW (previously SPSS)

Notice that Case 4 is missing a response to Q10 but has answered the other nine items. Case 5 has more than one missing value. Click Transform, Compute and issue the command

COMPUTE ScaleScore=10*MEAN.9(Q1 to Q10).

The .9 suffix tells MEAN to return a value only when at least 9 of its arguments are non-missing. For each respondent with at least 9 non-missing values, the ScaleScore will be the sum of the item scores, with each missing value replaced by the mean of the respondent’s scores on the other items in the scale. Case 5 gets a missing value for the ScaleScore, as it had fewer than 9 non-missing values. Case 4 was missing a response for item 10 only. The mean for this case on items 1 through 9 is 6.11, and that is the imputed value for Q10, yielding a summative ScaleScore of 61.11.

Psychologists are addicted to summing item scores. Frankly, I prefer to compute means rather than sums, so I would just do this:

COMPUTE ScaleScore=MEAN.9(Q1 to Q10).

Why do I prefer means? Well, that makes the scale score more interpretable, IMHO, because it is in the metric of the item response. For example, on a nine-point Likert-type scale (1 = very strongly disagree, 9 = very strongly agree), a mean response of 8.1 clearly indicates strong agreement, and you do not have to remember how many items were on the scale to interpret that scale score. A scale score of 81 might represent strong agreement or strong disagreement (or anything in between), depending on how many items were on the scale.

Please do not confuse the procedure explained above with that produced by using PASW’s Replace Missing Values function. If you were to use this function to replace missing item scores, the imputed score for Case 4 would be the mean of other cases on Q10 (5), not the mean item score for Case 4 (6.11). IMHO, it is rarely, if ever, appropriate to replace missing scores with the mean score of all other cases.

SAS

In a data step, use this syntax:

ScaleScore=10*MEAN(of Q1-Q10); If NMISS(of Q1-Q10) > 1 then ScaleScore = . ;

OR

ScaleScore=MEAN(of Q1-Q10); If NMISS(of Q1-Q10) > 1 then ScaleScore = . ;
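As a cross-check on the arithmetic, the scoring rule used in both packages (take the mean of the non-missing items, set the score to missing when fewer than 9 items were answered, and optionally multiply by the number of items) can be sketched in Python. This is only an illustration under stated assumptions: the function scale_score, its parameters, and the item values below are hypothetical, chosen so that the nine answered items have a mean of 6.11 as in the example of Case 4.

```python
def scale_score(responses, min_valid=9, as_sum=True):
    """Score a scale from item responses, with None marking a missing item.

    Returns None when fewer than `min_valid` items were answered.
    With as_sum=True the result is (number of items) * (mean of answered
    items), which equals the item sum after each missing item is replaced
    by the respondent's own mean. With as_sum=False it is the mean itself.
    """
    observed = [r for r in responses if r is not None]
    if len(observed) < min_valid:
        return None
    person_mean = sum(observed) / len(observed)
    return len(responses) * person_mean if as_sum else person_mean

# Hypothetical item values: nine answered items, tenth missing.
one_missing = [7, 6, 5, 7, 6, 6, 7, 5, 6, None]
two_missing = [7, 6, 5, 7, 6, 6, 7, 5, None, None]

summed = scale_score(one_missing)                 # about 61.11
averaged = scale_score(one_missing, as_sum=False) # about 6.11
dropped = scale_score(two_missing)                # None: too many missing
```

The summed score is always the number of items times the averaged score, so the two versions rank respondents identically. Only the metric differs.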


Karl L. Wuensch, 20. September 2009