Reliability & Validity


Another bit of jargon…
• More is better
• Properties of a "good measure"
  – Standardization
  – Reliability
  – Validity
• Reliability
  – Inter-rater
  – Internal
  – External
• Validity
  – Criterion-related
  – Face
  – Content
  – Construct

Sampling and measurement follow the same logic. In sampling, we are trying to represent a population of individuals: we select participants, and the resulting sample of participants is intended to represent the population. In measurement, we are trying to represent a domain of behaviors: we select items, and the resulting scale/test of items is intended to represent the domain. For both, "more is better" -- more gives greater representation.

More is Better

Whenever we've considered research designs and statistical conclusions, we've always been concerned with "sample size." We know that larger samples (more participants) lead to...
• more reliable estimates of the mean and std, r, F & χ²
• more reliable statistical conclusions, quantified as fewer Type I and Type II errors

The same principle applies to scale construction -- "more is better" -- but now it applies to the number of items comprising the scale. More (good) items lead to a better scale because they...
• more adequately represent the content/construct domain
• provide a more consistent total score (a respondent can change more items before the total changes much)

Desirable Properties of Psychological Measures
• Reliability (agreement or consistency)
• Validity (consistent accuracy)
• Standardization
• Population norms
• Interpretability of individual and group scores

Reliability (Agreement or Consistency)

Inter-rater (inter-observer) reliability
• do multiple observers/coders score an item the same way?
• critical whenever using subjective measures
• dependent upon standardization

Internal reliability -- do the items measure a central "thing"?
• Cronbach's alpha, α, ranges from .00 to 1.00; higher is better
• more highly correlated items, and more items, produce a higher α

External reliability -- stability of scale/test scores over time
• test-retest reliability -- correlate scores from the same test given 3-18 weeks apart
• alternate-forms reliability -- correlate scores from two "versions" of the test

Assessing Internal Reliability

The corrected item-total correlation is the correlation between each item and a total comprised of all the other items. Negative item-total correlations indicate either a very "poor" item or a reverse-keying problem. "Alpha if deleted" is what the alpha would be if that item were dropped; drop items whose alpha-if-deleted is larger than the scale's alpha. The coefficient alpha reported for the table is the internal consistency of this set of items. Usually do several "passes" rather than dropping several items at once. (A computational sketch of these quantities follows the tables below.)

Item   corrected item-total r   alpha if deleted
i1      .1454                   .65
i2      .2002                   .58
i3     -.2133                   .71
i4      .1882                   .59
i5      .1332                   .68
i6      .2112                   .56
i7      .1221                   .60
Coefficient Alpha = .58

Pass #1
Item   corrected item-total r   alpha if deleted
i1      .0854                   .65
i2      .2002                   .58
i3     -.2133                   .71
i4      .1882                   .59
i5      .0832                   .68
i6      .0712                   .56
i7      .0621                   .60
Coefficient Alpha = .58
• all items with negative item-total correlations are "bad"
• check to see that they have been keyed correctly
• if they have been correctly keyed -- drop them
• i3 would likely be dropped
• recheck on the next "pass"
• it is better to drop 1-2 items on each of several "passes"

Pass #2, etc.
Item   corrected item-total r   alpha if deleted
i1      .0812                   .73
i2      .2202                   .68
i4      .1822                   .70
i5      .0877                   .74
i6      .2343                   .64
i7      .0621                   .78
Coefficient Alpha = .71
• look for items with alpha-if-deleted values that are substantially higher than the scale's alpha value
• don't drop too many at a time -- probably i7, but probably not i1 & i5
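The quantities in these tables can be computed directly from a respondents-by-items score matrix. Below is a minimal sketch in Python/NumPy using hypothetical data (200 respondents, 7 Likert-type items scored 1-5); the simulated scores are random, so the printed values only illustrate the mechanics, not the numbers in the tables above.

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for an (n_respondents x k_items) array of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the total score
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

def item_analysis(items):
    """Corrected item-total r and alpha-if-deleted for each item."""
    items = np.asarray(items, dtype=float)
    rows = []
    for j in range(items.shape[1]):
        others = np.delete(items, j, axis=1)                  # all the *other* items
        r = np.corrcoef(items[:, j], others.sum(axis=1))[0, 1]
        rows.append((f"i{j + 1}", r, cronbach_alpha(others)))
    return rows

# Hypothetical data: 200 respondents answering 7 Likert-type items scored 1-5
rng = np.random.default_rng(0)
scores = rng.integers(1, 6, size=(200, 7)).astype(float)

print(f"Coefficient alpha = {cronbach_alpha(scores):.2f}")
for name, r, a_del in item_analysis(scores):
    print(f"{name}  item-total r = {r: .4f}   alpha if deleted = {a_del:.2f}")
```

On real scale data, rerunning item_analysis after dropping one or two of the worst items each time mirrors the multi-pass procedure described above.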
Validity (Consistent Accuracy)

Criterion-related validity -- does the test correlate with a "criterion"?
• statistical -- requires a criterion that you "believe in"
• predictive, concurrent, and postdictive validity

Face validity -- do the items come from the "domain of interest"?
• non-statistical -- decision of the "target population"

Content validity -- do the items come from the "domain of interest"?
• non-statistical -- decision of "experts in the field"

Construct validity -- does the test relate to other measures as it should?
• non-statistical -- does the measure match the theory of the construct?
• statistical -- discriminant validity
  – convergent validity -- correlates (+/- r) with selected tests as it should?
  – divergent validity -- r near 0 with other tests, as it should?

Criterion-related Validity

"Is the test valid?" Jum Nunnally (one of the founders of modern psychometrics) claimed this was a "silly question"! The point wasn't that tests shouldn't be "valid," but that a test's validity must be assessed relative to...
• the construct it is intended to measure
• the population for which it is intended (e.g., age, level)
• the application for which it is intended (e.g., for classifying folks into categories vs. assigning them quantitative values)
So the real question is, "Is this test a valid measure of this construct, for this population, in this application?" That question can be answered!

Do the test scores correlate with criterion behavior scores? The label depends on when the criterion behavior occurs (before, now, or later) relative to when the test is taken (now):
• concurrent -- a test taken now "replaces" a criterion measured now
  – often the goal is to substitute a "shorter" or "cheaper" test
  – e.g., the written drivers test replaces the road test
• predictive -- a test taken now predicts a criterion measured later
  – we want to estimate what will happen before it does
  – e.g., your GRE score (taken now) predicts grad school performance (later)
• postdictive -- a test taken now captures behavior & affect from before
  – most of the behavior we study "has already happened"
  – e.g., adult memories of childhood feelings, or medical history

Conducting a Predictive Validity Study

Example: a test designed to identify qualified "front desk personnel" for a major hotel chain -- 200 applicants and 20 position openings. A "proper" predictive validity study would...
• give each applicant the test (and "seal" the results)
• give each applicant a job working at a front desk
• assess work performance after 6 months (the criterion)
• correlate the test (predictor) and work performance (criterion)
Anybody see why the chain might not be willing to apply this design?

Substituting Concurrent Validity for Predictive Validity
• assess the work performance of all the folks currently doing the job
• give each of them the test
• correlate the test (predictor) and work performance (criterion)
Problems?
• not working with the population of interest (applicants)
• range restriction -- work performance and test score variability are "restricted" by this approach

What happens to the sample?
• Applicant pool -- the target population
• Selected (hired) folks -- current hiring practice is probably not "random"; assuming the selection basis is somewhat reasonable/functional
• Sample used in the concurrent validity study -- good workers "move up" and poor ones "move out": the worst of those hired have been "released," and the best have "changed jobs"
Range restriction will artificially lower the validity coefficient (r).

What happens to the validity coefficient, r? [Scatterplot of the criterion (job performance) against the predictor (interview/measure): across the full applicant pool, r = .75; within the range-restricted sample of hired folks actually used in the validity study, r = .20.]
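That collapse is easy to reproduce with a small simulation. The sketch below (Python/NumPy) uses assumed numbers: a pool correlation of about .75, the top 10% of 200 applicants hired, and the weakest quarter of hires gone by the time of the concurrent study. The exact restricted value varies from run to run, but it is reliably far below the applicant-pool value.

```python
import numpy as np

rng = np.random.default_rng(42)
n_applicants = 200

# Applicant pool: predictor (test score) and criterion (later job performance)
# drawn so that they correlate about .75 across the whole pool.
pool_r = 0.75
test, performance = rng.multivariate_normal(
    mean=[0.0, 0.0], cov=[[1.0, pool_r], [pool_r, 1.0]], size=n_applicants
).T

# Validity coefficient in the population we actually care about (applicants)
r_pool = np.corrcoef(test, performance)[0, 1]

# Concurrent-validity shortcut: only current employees are available.
# Assume hiring took (roughly) the top 10% on the test, and the weakest
# quarter of those hired have since been "released" or "moved out."
hired = test >= np.quantile(test, 0.90)
kept = performance >= np.quantile(performance[hired], 0.25)
study_sample = hired & kept

r_restricted = np.corrcoef(test[study_sample], performance[study_sample])[0, 1]

print(f"Applicant pool:          r = {r_pool:.2f}")
print(f"Range-restricted sample: r = {r_restricted:.2f}")
```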
Face Validity

Does the test "look like" a measure of the construct of interest?
• it "looks like" a measure of the desired construct to a member of the target population
• will someone recognize the type of information they are responding to?
• possible advantage of face validity: if respondents know what information we are looking for, they can use that "context" to help interpret the questions and provide more useful, accurate answers
• possible limitation of face validity: if respondents know what information we are looking for, they might try to "bend & shape" their answers to what they think we want -- to "fake good" or "fake bad"

Continuum of content expertise: target population members -- researchers -- content experts
• target population members assess face validity
• content experts assess content validity
• researchers should evaluate the validity evidence provided about the scale, rather than the scale items, unless they are truly content experts

Content Validity

Does the test contain items from the desired "content domain"?
• based on assessment by "subject matter experts" (SMEs) in that content domain
• especially important when a test is designed to have low face validity, e.g., tests of "honesty" used for hiring decisions
• generally simpler for "achievement tests" than for "psychological constructs" (or other "less concrete" ideas); e.g., it is a lot easier for "math experts" to agree whether or not an item should be on an algebra test than for "psychological experts" to agree whether or not an item should be on a measure of depression
• content validity is not "tested for"; rather, it is "assured" by the informed item selections made by experts in the domain

Construct Validity

Does the test correspond with the theory of the construct, and does it interrelate with other tests as a measure of this construct should?
• we use the term construct to remind ourselves that many of the terms we use do not have an objective, concrete reality; rather, they are "made up" or "constructed" by us in our attempts to organize and make sense of behavior and other psychological processes
• attention to construct validity reminds us that our defense of the constructs we create is really…

The statistical assessment of construct validity: discriminant validity
• does the test show the "right" pattern of interrelationships with other variables? -- this has two parts
• convergent validity -- the test correlates with other measures of similar constructs
• divergent validity -- the test isn't correlated with measures of other, different constructs
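On the statistical side, a convergent/divergent check is, at its simplest, a pair of correlations with measures chosen in advance. The sketch below uses hypothetical scale names (a new depression scale, an established anxiety scale standing in for a "similar" construct, and a math test standing in for a "different" construct) purely to illustrate the pattern one hopes to see.

```python
import numpy as np

def construct_validity_check(new_scale, similar_measure, different_measure):
    """Convergent and divergent correlations for a new scale."""
    r_conv = np.corrcoef(new_scale, similar_measure)[0, 1]
    r_div = np.corrcoef(new_scale, different_measure)[0, 1]
    print(f"convergent validity: r = {r_conv: .2f}  (should be substantial)")
    print(f"divergent validity:  r = {r_div: .2f}  (should be near zero)")

# Hypothetical scores for 100 respondents
rng = np.random.default_rng(7)
new_depression_scale = rng.normal(size=100)
established_anxiety_scale = 0.6 * new_depression_scale + rng.normal(scale=0.8, size=100)
math_test = rng.normal(size=100)

construct_validity_check(new_depression_scale, established_anxiety_scale, math_test)
```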