Standard Scores
Richard S. Balkin, Ph.D., LPC-S, NCC
1 Normal Distributions
While Best and Kahn (2003) indicated that the normal curve does not actually exist, measures of populations tend to approximate this distribution. It is based on probability: the chance of certain events occurring.
R. S. Balkin, 2008

Normal Curve
The curve is symmetrical: 50% of scores fall above the mean and 50% below.
The mean, median, and mode have the same value.
Scores cluster around the center.
Normal distribution
"68-95-99" rule One standard deviation away from the mean in either direction (red) includes about 68% of the data values. Another standard deviation out on both sides includes about 27% more of the data (green). The third standard deviation out adds another 4% of the data (blue).
The Normal Curve

[figure: the normal curve]

Interpretations of the normal curve
Percentage of total area included between the mean and a given standard deviation (z distance from the mean)
Percentage of cases (n) that fall between the mean and a given standard deviation
Probability that an event will occur between the mean and a given standard deviation
Calculate the percentile rank of scores in a normal distribution
Normalize a frequency distribution
Test the significance of observed measures in an experiment
Interpretations of the normal curve

The normal curve conveys two important pieces of information:
Information about where scores fall, shown on the number line at the bottom of the curve. I refer to this as the score world. It can be expressed in raw scores or standard scores (scores expressed in standard deviation units).
Information about probability, percentages, and placement under the normal curve. I refer to this as the area world.
Interpretations of the normal curve

[figure: normal curve with the area world (area under the curve) and the score world (number line) labeled]
Does the normal curve really exist?
"…the normal distribution does not actually exist. It is not a fact of nature. Rather, it is a mathematical model—an idealization—that can be used to represent data collected in behavioral research" (Shavelson, 1996, p. 120).
Does the normal curve really exist?

Glass & Hopkins (1996): "God loves the normal curve" (p. 80).
No set of empirical observations is ever perfectly described by the normal distribution, but an independent measure taken repeatedly will eventually resemble a normal distribution.
Many variables are clearly not normally distributed (e.g., SES).
"The normal curve has a smooth, altogether handsome countenance—a thing of beauty" (p. 83).
Nonnormal distributions
Positively skewed: the majority of the scores are near the lower numbers.
Negatively skewed: the majority of the scores are near the higher numbers.
Bimodal distributions have two modes.
Positively skewed
If a test was very difficult and almost everyone in the class did very poorly on it, the resulting distribution would most likely be positively skewed. In a positively skewed distribution, the mode is smaller than the median, which is smaller than the mean. The mode is the point on the x-axis corresponding to the highest point of the curve, that is, the score with the greatest frequency. The median is the point on the x-axis that cuts the distribution in half, such that 50% of the area falls on each side. The mean is pulled by the extreme scores on the right.

Negatively skewed
A negatively skewed distribution is asymmetrical and points in the negative direction, such as would result with a very easy test. On an easy test, almost all students would perform well and only a few would do poorly. The order of the measures of central tendency would be the opposite of the positively skewed distribution, with the mean being smaller than the median, which is smaller than the mode.
Normal Curve Summary
Unimodal
Symmetry
Points of inflection
Tails that approach but never quite touch the horizontal axis as they deviate from the mean
Standard scores
Standard scores assume a normal distribution. They provide a method of expressing any score in a distribution in terms of its distance from the mean in standard deviation units. Two common examples are the z score and the T score.
Z-score
A raw score by itself is rather meaningless. What gives the score meaning is its deviation from the mean. A z score expresses this distance in standard deviation units.
Z-score

z = (X − X̄) / σ  or  z = x / σ

where
X = raw score
X̄ = mean
σ = standard deviation
x = X − X̄ (the deviation score)

Z score formula
Z score computation

Given X = 76, X̄ = 82, σ = 4:

z = (76 − 82) / 4 = −6 / 4 = −1.50

T score
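The z score computation above can be sketched in Python; this is a minimal illustration, with a helper name of my own choosing.

```python
def z_score(raw, mean, sd):
    """Express a raw score as its distance from the mean in standard deviation units."""
    return (raw - mean) / sd

# The slide's worked example: X = 76, mean = 82, sd = 4
print(z_score(76, 82, 4))  # -1.5
```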
Another version of a standard score
Converts from the z score
Avoids the use of negative numbers and decimals
T score formula

T = 50 + 10z
Always rounded to the nearest whole number.

For a z score of 1.27:
T = 50 + 10(1.27) = 50 + 12.70 = 62.70 ≈ 63
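The T score conversion can be sketched the same way (an illustrative helper, not from the slides):

```python
def t_score(z):
    """Convert a z score to a T score (mean 50, sd 10), rounded to the nearest whole number."""
    return round(50 + 10 * z)

# The slide's example: z = 1.27 gives T = 62.70, rounded to 63
print(t_score(1.27))  # 63
```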
More on standard scores

Any standard score can be converted to standard deviation units. A test with a mean of 500 and a standard deviation of 100 would be expressed as:

500 + 100(x/σ) = 500 + 100z
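This general pattern (mean plus standard deviation times z) can be written as a small function; the defaults below use the slide's 500/100 scale, and the function name is my own.

```python
def scaled_score(z, mean=500, sd=100):
    """Place a z score on a scale with the given mean and standard deviation."""
    return mean + sd * z

print(scaled_score(1.5))   # 650.0 (one and a half sd above the mean)
print(scaled_score(-1.0))  # 400.0 (one sd below the mean)
```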
R. S. Balkin, 2008 25 Calculating the distribution that fall before, between, or beyond the mean and standard deviation
From this point on represents above z, 16%
From this point za and below represents zb, R. S. Balkin, 2008 84th percentile 26 Confidence Intervals
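The percentile rank of any z score can be computed from the normal cumulative distribution; this sketch (my own helper, not from the slides) reproduces the 84th-percentile example.

```python
from math import erf, sqrt

def percentile_rank(z):
    """Percentage of the normal distribution falling at or below a given z score."""
    return 100 * 0.5 * (1 + erf(z / sqrt(2)))

print(round(percentile_rank(1.0)))  # 84  (so about 16% of scores lie above z = 1)
print(round(percentile_rank(0.0)))  # 50  (the mean is the 50th percentile)
```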
Confidence intervals provide a range of values given error in a score. For example, if the mean = 20, we can be:
68% confident that the score will be within 1 sd of the mean
95% confident that the score will be within 2 sd of the mean
99% confident that the score will be within 3 sd of the mean
Confidence intervals

A mean of 20 with a sd of 5:
68% confident that the score will be between 15 and 25
95% confident that the score will be between 10 and 30
99% confident that the score will be between 5 and 35
Confidence Intervals

A mean of 48 and a sd of 2.75:
68% confident that the score will be between ____ and ____
95% confident that the score will be between ____ and ____
99% confident that the score will be between ____ and ____

Answers:
68% confident that the score will be between 45.25 and 50.75
95% confident that the score will be between 42.5 and 53.5
99% confident that the score will be between 39.75 and 56.25
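The interval arithmetic above is just mean ± k standard deviations; a short sketch (helper name my own) reproduces the answers.

```python
def interval(mean, sd, k):
    """Score range within k standard deviations of the mean."""
    return (mean - k * sd, mean + k * sd)

# The slide's exercise: mean = 48, sd = 2.75
for k in (1, 2, 3):
    print(interval(48, 2.75, k))
# (45.25, 50.75)
# (42.5, 53.5)
# (39.75, 56.25)
```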
Correlation

Relationship between two or more paired variables or two or more data sets
Correlation is symbolized by r or ρ (rho)
Correlations range from -1.00 (perfect negative correlation) to +1.00 (perfect positive correlation)
A perfect correlation indicates that for every unit of increase (or decrease) in one variable there is a corresponding increase (or decrease) in the other variable
Positive correlation

Negative correlation

Low correlation
Types of correlations

Pearson Product-Moment Coefficient of Correlation
Known as Pearson's r
Most commonly used
Spearman Rank Order Coefficient of Correlation
Known as Spearman's rho (ρ)
Only utilized with ordinal values
Interpreting a correlation coefficient

Be aware of outliers: scores that differ markedly from the rest of the sample
Look at direction (positive or negative) and magnitude (the actual number)
Correlation does not imply cause and effect
See the table on p. 388 for interpreting a correlation coefficient
Interpreting a correlation coefficient

Using the table on p. 388, interpret the following correlation coefficients:
+.52
-.78
+.12
Interpreting a correlation coefficient

Using the table on p. 388, interpret the following correlation coefficients:
+.52 Moderate
-.78 Substantial
+.12 Negligible
Computing a Pearson r

r = Σxy / √((Σx²)(Σy²))

where
Σx² = Σ(X − X̄)²
Σy² = Σ(Y − Ȳ)²
Σxy = Σ(X − X̄)(Y − Ȳ)
Computing Pearson r

 X   Y     x    x²    y    y²    xy
 9   8     3     9    3     9     9
 5   6    -1     1    1     1    -1
 3   1    -3     9   -4    16    12
 8   6     2     4    1     1     2
 5   4    -1     1   -1     1     1

X̄ = 6, Ȳ = 5;  Σx² = 24, Σy² = 28, Σxy = 23
Computing Pearson r

r = 23 / √((24)(28)) = 23 / √672 = .8872
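The hand computation above can be checked with a short function implementing the same deviation-score formula (the function name is my own):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r via the deviation-score formula: r = Σxy / sqrt(Σx² · Σy²)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# The slide's five score pairs
print(round(pearson_r([9, 5, 3, 8, 5], [8, 6, 1, 6, 4]), 4))  # 0.8872
```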
Correlational Designs

We use correlations to explore relationships between two variables. We can also use a correlation to predict an outcome; in statistics this is known as a regression analysis.
Correlational Designs

For example, a correlation of .60 means that for every one standard deviation unit increase in X there is a .60 standard deviation unit increase in Y.
Correlational designs are different from experimental designs:
In a correlational design, we explore relationships between two or more variables that are interval or ratio.
In a correlational design we do not compare groups, and we do not have random assignment (though we do have random sampling).
Correlational Designs

For example, maybe we want to know the relationship between self-esteem and depression. We could use two instruments, one that measures self-esteem and one that measures depression. Then we can conduct a regression analysis and see if the scores on one instrument predict scores on the other.

I would hypothesize that high scores in depression correlate with low scores in self-esteem.
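As a minimal sketch of what a regression analysis computes (not the author's procedure), the code below fits a least-squares prediction line Y' = a + bX, reusing the five score pairs from the Pearson r example earlier in the slides:

```python
def regression(xs, ys):
    """Least-squares line for predicting Y from X: returns (intercept a, slope b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # slope = Σxy / Σx² using deviation scores, as in the Pearson r computation
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx  # the line passes through the point (X̄, Ȳ)
    return a, b

a, b = regression([9, 5, 3, 8, 5], [8, 6, 1, 6, 4])
print(round(a, 3), round(b, 3))  # -0.75 0.958
```

With standardized (z) scores, the slope of this line equals r itself, which is why a correlation of .60 implies a .60 standard deviation unit change in Y per standard deviation unit change in X.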