Journal of Economic Perspectives—Volume 24, Number 3—Summer 2010—Pages 97–118

Searching for Effective Teachers with Imperfect Information

Douglas O. Staiger and Jonah E. Rockoff

Teaching may be the most-scrutinized occupation in the economy. Over the past four decades, empirical researchers, many of them economists, have accumulated an impressive amount of evidence on teachers: the heterogeneity in teacher productivity, the rise in productivity associated with teaching credentials and on-the-job experience, rates of turnover, the costs of recruitment, the relationship between supply and quality, the effect of class size, and the monetary value of academic achievement gains over a student's lifetime. Since the passage of the No Child Left Behind Act, along with a number of state-level educational initiatives, the data needed to estimate individual teacher performance based on student achievement gains have become more widely available. However, there have been relatively few efforts to examine the implications of this voluminous literature on teacher performance. In this paper, we ask what the existing evidence implies for how school leaders might recruit, evaluate, and retain teachers.
We begin by summarizing the evidence on five key points, referring to existing work and to evidence we have accumulated from our research with the nation's two largest school districts: Los Angeles and New York City. First, teachers display considerable heterogeneity in their effects on student achievement gains. The standard deviation across teachers in their impact on student achievement gains is on the order of 0.1 to 0.2 student-level standard deviations, which would improve the

■ Douglas O. Staiger is John French Professor of Economics, Dartmouth College, Hanover, New Hampshire. Jonah E. Rockoff is Sidney Taurel Associate Professor of Business, Columbia Business School, New York City, New York. Their e-mail addresses are 〈[email protected]@dartmouth.edu〉 and 〈[email protected]@columbia.edu〉. doi=10.1257/jep.24.3.97

median student's test score 4 to 8 percentiles in a single year.1 Second, estimates of teacher effectiveness based on student achievement data are noisy measures and can be thought of as having reliability in the range of 30 to 50 percent. Third, teachers' effectiveness rises rapidly in the first year or two of their teaching careers but then quickly levels out. Fourth, the primary cost of teacher turnover is not the direct cost of hiring and firing, but rather is the loss to students who will be taught by a novice teacher rather than one with several years of experience. Fifth, it is difficult to identify those teachers who will prove more effective at the time of hire. As a result, better teachers can only be identified after some evidence on their actual job performance has accumulated. We then explore what these facts imply for how principals and school districts should act, using a simple model in which schools must search for teachers using noisy signals of teacher effectiveness.
Due to a lack of information available at the time of hire, we will argue for a hiring process that is not highly selective; that is, while it might require evidence of general educational achievement like a college degree, it would not require individuals to make costly up-front specific investments before being permitted to teach. We then argue that, given the substantial observed heterogeneity of teacher effects, the modest rise in productivity with on-the-job experience, and the fact that tenure is a lifetime job, tenure protections should be limited to those who meet a very high bar. Even with the imprecise estimates of teacher effectiveness currently available, our simulations suggest that a strategy that would sample extensively from the pool of potential teachers but offer tenure only to a small percentage could yield substantial annual gains in student achievement. The implications of our analysis are strikingly different from current practice.
Schools and school districts attempt to screen at the point of hiring and require significant investment in education-specific coursework but then grant tenure status to teachers as a matter of course after two to three years on the job. Performance evaluation is typically a perfunctory exercise and, at least officially, very few teachers are considered ineffective (Weisberg, Sexton, Mulhern, and Keeling, 2009). Rather than screening at the time of hire, the evidence on heterogeneity of teacher performance suggests a better strategy would be identifying large differences between teachers by observing the first few years of teaching performance and retaining only the highest-performing teachers.

1 The metric of standard deviations is commonly used to assess the effect of educational interventions, and we will use it throughout this paper. To provide some context for readers unversed in this literature, the gap in achievement between poor and nonpoor students (or between black and white students) in the United States is roughly 0.8–0.9 standard deviations (authors' calculations based on data from the 2009 National Assessment of Educational Progress).
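The introduction's claim that a gain of 0.1 to 0.2 student-level standard deviations would move the median student up by 4 to 8 percentiles can be checked with a short calculation. The sketch below assumes test scores are normally distributed, which is an assumption of this illustration rather than something the text states; the function name `percentile_gain` is hypothetical.

```python
from statistics import NormalDist


def percentile_gain(effect_sd: float, start_percentile: float = 50.0) -> float:
    """Percentile points gained by a student who starts at start_percentile
    and whose score rises by effect_sd student-level standard deviations.

    Assumes normally distributed scores (an assumption of this sketch).
    """
    dist = NormalDist()  # standard normal: mean 0, standard deviation 1
    start_z = dist.inv_cdf(start_percentile / 100.0)
    return 100.0 * dist.cdf(start_z + effect_sd) - start_percentile


if __name__ == "__main__":
    for effect in (0.1, 0.2):
        gain = percentile_gain(effect)
        print(f"{effect:.1f} SD -> about {gain:.1f} percentile points")
```

Under the normality assumption the two gains come out just under 4 and just under 8 percentile points, in line with the range quoted in the text.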

Five Facts about Teacher Effectiveness

Any approach to recruiting and retaining teachers is based, at least implicitly, on a set of beliefs. Here, we describe the evidence on five key parameters regarding teacher effectiveness.

Fact 1: Teacher Productivity Based on Gains in Student Achievement is Heterogeneous.

The fact that teachers are heterogeneous in their productivity suggests that there are potentially large gains to students if it is possible for school leaders to attract and retain highly effective teachers, and conversely to discourage or at least to avoid giving tenure to ineffective teachers. More than three decades ago, Hanushek (1971) and Murnane (1975) were the first economists to report large differences in student achievement in different teachers' classrooms, even after controlling for students' prior achievement and characteristics. That literature has accelerated in recent years. Especially following the No Child Left Behind Act, many states and school districts began collecting annual data on students and matching it to teachers.2 Research has produced remarkably consistent estimates of the heterogeneity in teacher impacts in different sites.
For example, using data from Texas, Rivkin, Hanushek, and Kain (2005) find that a standard deviation in teacher quality is associated with 0.11 student-level standard deviations in math and 0.095 standard deviations in reading. Using data from two school districts in New Jersey, Rockoff (2004) reports that one standard deviation in teacher effects is associated with a 0.1 student-level standard deviation in achievement. Using data from Chicago, Aaronson, Barrow, and Sander (2007) report that a standard deviation in teacher quality is associated with a difference in math performance of 0.09 to 0.16 student-level standard deviations.3 How much should we care about these differences in effectiveness across teachers? To attach an approximate dollar value to them, one needs an estimate of the value of student achievement over the course of a student's lifetime. There is a long tradition in labor economics estimating the relationship between various types of test scores and the earnings of early-career workers (for example, Murnane,

2 The data requirements for measuring heterogeneity in teaching effectiveness are high. First, one needs longitudinal data on achievement for individual students matched to specific teachers. Second, achievement data are needed on an annual basis to be able to track gains for each student over a single school year. (Prior to the No Child Left Behind legislation, many states tested at longer intervals, such as fourth and eighth grade.) Third, panel data on teachers are required as well, to be able to track performance of individual teachers over time. Teacher-level panel data are needed to account for school-level or classroom-level shocks to student achievement that contribute to the measurement error in classroom-level measures. In this journal, Kane and Staiger (2002) showed that conventional estimates of sampling error cannot account for the lack of persistence in school-level value-added estimates. There appear to be other school-level and classroom-level sources of error.

3 Aaronson, Barrow, and Sander (2007) report the variance in teacher quality to be .02 to .06 grade-level equivalents (adjusted for sampling error). In table 1, they report the standard deviation in grade-level equivalents of eighth grade students to be 1.55 (√.02/1.55 = .09, √.06/1.55 = .16). Their study adjusted for sampling variation, but not for other classroom-level sources of error.

Willett, and Levy, 1995; Neal and Johnson, 1996).4 Kane and Staiger (2002) estimated that the value of a one standard deviation gain in math scores would have been worth $110,000 at age 18 using the Murnane et al. estimates, and $256,000 using the Neal and Johnson results. This implies that a one standard deviation increase in teacher effectiveness (that is, one that leads to an increase of about 0.15 standard deviations of student achievement for 20 students) has a value of around $330,000 to $760,000. As several recent papers remind us, the statistical assumptions required for the identification of causal teacher effects with observational data are extraordinarily strong and rarely tested (Andrabi, Das, Khwaja, and Zajonc, 2009; McCaffrey, Lockwood, Koretz, Louis, and Hamilton, 2004; Raudenbush, 2004; Rothstein, 2010; Rubin, Stuart, and Zanutto, 2004; Todd and Wolpin, 2003).
Teachers may be assigned classrooms of students that differ in unmeasured ways (such as consisting of more motivated students, or students with stronger unmeasured prior achievement or more engaged parents) that result in varying student achievement gains. Despite these concerns, several pieces of evidence suggest that the magnitude of variation in teacher effects is driven by real differences in teacher quality. First, estimates tend to be highly correlated across a wide variety of specifications (Harris and Sass, 2006). Second, researchers have consistently found strong correlations between teacher effect estimates and evaluations made by school principals and other professional educators (Murnane, 1975; Jacob and Lefgren, 2008; Harris and Sass, 2009; Rockoff and Speroni, 2010; Tyler, Taylor, Kane, and Wooten, 2010).
Third, while most studies of teacher effects rely on assumptions regarding matching of students with teachers at the classroom level, Rivkin, Hanushek, and Kain (2005) use a completely different approach that does not rely on this assumption and find similar estimates to the rest of the literature. Finally, two studies based on random assignment of teachers to classrooms have found variation in teacher effects consistent with nonexperimental estimates, suggesting that estimated differences in teacher effectiveness are not driven by student sorting. Nye, Konstantopoulos, and Hedges (2004) reexamined data from the Tennessee STAR classroom size experiment, in which teachers were randomly assigned to classes of a given size. The differences in classroom-level student achievement that emerged within given class size groups were larger than would have been expected to occur due to chance and strikingly similar in magnitude to those estimated in nonexperimental studies.
Kane and Staiger (2008) study a recent experiment in Los Angeles Unified School District in which pairs of teachers were

4 Murnane, Willett, and Levy (1995) estimate that a one standard deviation difference in math test performance is associated with an 8 percent hourly wage increase for men and a 12.6 percent increase for women. These estimates may understate the value of academic achievement since the authors also control for years of schooling completed. Neal and Johnson (1996), who do not condition on educational attainment, estimate that an improvement of one standard deviation in test performance is associated with 18.7 and 25.6 percent increases in hourly wages for men and women, respectively. Of course, the cross-sectional relationship between tested achievement and earnings may overstate the causal value of academic achievement. However, while there have been attempts to estimate the causal value of years of schooling, we are not aware of estimates of the causal value of academic achievement.

randomly assigned to classrooms within the same elementary school and grade. They found that nonexperimental value-added estimates from a pre-experimental period were able to predict student achievement differences following random assignment: a one-point difference between randomly-assigned teachers in pre-experimental value added was associated with a one-point difference in student achievement following random assignment. Thus, the nonexperimental estimates for individual teachers were unbiased predictors of a teacher's impact on student achievement in the experiment.
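The dollar valuation given under Fact 1 is a product of three numbers quoted in the text: the value of a one standard deviation gain in a student's math score, the roughly 0.15 student-level standard deviations produced by a one standard deviation better teacher, and a class of 20 students. A minimal sketch of that arithmetic follows; the constant and function names are illustrative, and the calculation covers a single class in a single year because that is how the text states it.

```python
# Per-student value at age 18 of a one standard deviation gain in math scores,
# as quoted in the text from Kane and Staiger (2002).
VALUE_PER_STUDENT_SD = {
    "Murnane, Willett, and Levy (1995)": 110_000,
    "Neal and Johnson (1996)": 256_000,
}

TEACHER_EFFECT_SD = 0.15   # student-level SDs per one SD of teacher effectiveness
STUDENTS_PER_CLASS = 20    # class size used in the text


def value_of_teacher_sd(value_per_student_sd: float,
                        effect_sd: float = TEACHER_EFFECT_SD,
                        students: int = STUDENTS_PER_CLASS) -> float:
    """Dollar value of a one SD better teacher for one class of students."""
    return value_per_student_sd * effect_sd * students


if __name__ == "__main__":
    for source, value in VALUE_PER_STUDENT_SD.items():
        print(f"{source}: ${value_of_teacher_sd(value):,.0f}")
```

The first estimate reproduces the $330,000 figure. The second gives $768,000 with these rounded inputs, close to the "around $760,000" reported in the text.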

Fact 2: Estimates of Heterogeneous Teacher Effects Include a Substantial Noise Component.

Ideally, estimates of the amount that teachers affect student achievement would be the same across classrooms or from year to year within the same teacher, but this does not hold true in practice. The error in estimates of teacher effects on student achievement derives from at least two sources. The first is sampling variation. The typical elementary classroom may have 20 to 25 students per year (although middle and high school teachers have somewhat larger classes and typically teach multiple sections). With samples of such modest size, naturally occurring variation in the make-up of a teacher's classroom from year to year will produce variation in a teacher's estimated effect. However, volatility in teacher (and school) effects exceeds that predicted by sampling error alone (Kane and Staiger, 2002; Kane, Rockoff, and Staiger, 2008).
The source of this second type of error, which can perhaps more accurately be thought of as nonpersistent variation in estimates of teacher effects on student achievement, could include a broad range of factors influencing the measured achievement gains of groups of students: for example, interactions between a specific teacher's lesson plans and the test used in a given year, an (unpredictably) disruptive student who drags down his or her classmates, a dog barking in the parking lot on the day of the test, or more mysterious forces that fall under the broad category of "classroom chemistry." For present purposes, any nonpersistent variation in a teacher's measured impact on student achievement represents estimation error. One approach to estimating the proportion of variance due to nonpersistent sources is to study the correlation in estimated impacts across classrooms taught by the same teacher.
If a teacher's estimated impact, $Y_t$, represents the sum of a persistent component, $\mu$, and an uncorrelated nonpersistent error, $\varepsilon_t$, then the correlation between the estimated effect this year and last year (that is, between $Y_t$ and $Y_{t-1}$) represents an estimate of the reliability of the teacher-level estimate in any given year. Table 1 reports the standard deviation in estimated teacher effects, the estimated reliability (as measured by the correlation across classrooms taught by the same teacher), and implied standard deviation in true teacher impacts ($\sigma_\mu$) for teachers in two school districts: Los Angeles Unified and New York City. When reported in terms of the student-level standard deviation in test scores in a given grade and subject, the standard deviation in estimated value added for teachers was remarkably similar in the two districts, with estimates in both math and English Language Arts in the

Table 1
Evidence on Teacher Value Added from Schools in Los Angeles and New York City
(teacher value added measured in standard deviations of student performance)

                                                            Los Angeles         New York City
                                                            Math      English   Math      English
                                                                      Language            Language
                                                                      Arts                Arts

Variation in Teacher Value Added:
  Standard deviation of annual value-added measure          0.27      0.23      0.25      0.23
  Reliability of annual value-added measure                 0.50      0.37      0.39      0.28
  Implied standard deviation of persistent teacher effect   0.19      0.14      0.15      0.12

Difference in Value Added Relative to Teachers with 3+ Years Experience:
  No experience teaching (novice)                           -0.08     -0.06     -0.07     -0.07
  One year of experience teaching                           -0.02     -0.01     -0.03     -0.04
  Two years of experience teaching                          -0.01     -0.01     -0.02     -0.02

Notes: Teacher value-added estimates are from analysis of data on fourth and fifth graders in years 2000–2003 for Los Angeles and 2000–2005 for New York City. Teacher value added is measured in standard deviations of student performance. Reliability of the value-added measure refers to the correlation of the value-added measure across classrooms taught by the same teacher. Estimates are based on regressions of student achievement that include student-level controls for baseline test scores, race/ethnicity, special education, English Language Learners (ELL), and free lunch status; classroom peer means of the student-level characteristics; and grade-by-year fixed effects.

narrow range from .23 to .27. Although the estimated reliability of teacher effects was higher in math than in English Language Arts, and higher in Los Angeles than in New York City, all the reliability estimates suggest that there is considerable error or volatility in the teacher impact estimates. Indeed, more than half of the variation in estimated impacts in math and English Language Arts is nonpersistent. The standard deviation of the persistent teacher effect is between .12 and .19, similar to that found in the previous literature discussed above.
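The link between reliability and the implied persistent standard deviation in Table 1 follows from the additive model described under Fact 2. Writing the persistent component as mu (a label used here for illustration), the covariance between two years of estimates equals Var(mu), so the year-to-year correlation is Var(mu) / Var(Y), and the persistent standard deviation is the annual standard deviation times the square root of the reliability. The sketch below applies this to the Table 1 inputs and checks the identity with a small simulation. The function names, the normal distributions, and the sample size are assumptions of this illustration, not the authors' estimation procedure.

```python
import math
import random

# (SD of annual value-added measure, reliability) as reported in Table 1.
TABLE_1 = {
    ("Los Angeles", "Math"): (0.27, 0.50),
    ("Los Angeles", "English Language Arts"): (0.23, 0.37),
    ("New York City", "Math"): (0.25, 0.39),
    ("New York City", "English Language Arts"): (0.23, 0.28),
}


def implied_persistent_sd(sd_annual: float, reliability: float) -> float:
    """SD of the persistent effect: sd_annual * sqrt(reliability)."""
    return sd_annual * math.sqrt(reliability)


def pearson(xs, ys) -> float:
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulated_reliability(sd_persistent: float, sd_noise: float,
                          n_teachers: int = 20_000, seed: int = 0) -> float:
    """Correlation of two noisy yearly estimates sharing one persistent effect.

    Normal draws are an assumption of this sketch.
    """
    rng = random.Random(seed)
    this_year, last_year = [], []
    for _ in range(n_teachers):
        mu = rng.gauss(0.0, sd_persistent)               # persistent component
        this_year.append(mu + rng.gauss(0.0, sd_noise))  # Y_t
        last_year.append(mu + rng.gauss(0.0, sd_noise))  # Y_{t-1}
    return pearson(this_year, last_year)


if __name__ == "__main__":
    for (district, subject), (sd, rel) in TABLE_1.items():
        print(f"{district}, {subject}: {implied_persistent_sd(sd, rel):.3f}")
    print(f"simulated reliability: {simulated_reliability(0.19, 0.19):.3f}")
```

Because the table's inputs are rounded to two decimals, the computed values match the reported implied standard deviations only to within about 0.01.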

Fact 3: Teachers Improve Substantially in Their First Few Years on the Job.

Table 1 also reports the degree to which average teacher effects on student achievement differ from those of experienced teachers during the first few years on the job in these same two districts. In both Los Angeles and New York, teacher effects on student achievement appear to rise rapidly during the first several years on the job and then flatten out. This finding has been replicated in a number of states and districts (Rivkin, Hanushek, and Kain, 2005; Clotfelter, Ladd, and Vigdor, 2006; Harris and Sass, 2006; Jacob, 2007). When assigned to a first-year teacher, the average student gains .06 to .08 standard deviations of achievement less than observably similar students assigned to experienced teachers. However, the achievement gains of students assigned to second-year teachers lagged those in

more experienced teachers' classrooms by only .01 to .04 standard deviations. In Los Angeles, students of third-year teachers saw gains comparable to those of more experienced teachers, while there was a small difference for third-year teachers in New York (.01 to .02 standard deviations).

Fact 4: The Main Cost of Teacher Turnover is the Reduction in Student Achievement when an Experienced Teacher is Replaced by a Novice, not Direct Hiring Costs. Milanowski and Odden (2007) carefully studied costs of teacher recruitment and hiring in a large urban Midwestern school district. They estimate total costs of roughly $8,200: recruiting costs per vacancy of $1,100 in central office staff time and $2,600 in school-level staff time, plus $4,500 for the cost of training a new teacher. In addition, some of these costs will be defrayed by the salaries earned by new teachers, which are typically lower than the salaries of the teachers they replace. Based on the gains that teachers make in their first few years of experience, every time a school district loses an experienced teacher with two or more years of experience and is forced to hire a novice teacher, the students assigned to the novice teacher over the first two years of that teacher's career lose roughly .10 standard deviations in student achievement. As discussed above, estimates suggest a .10 standard deviation gain in math scores has a value of roughly $10,000 to $25,000 per student. Thus, the economic cost of lost academic achievement when replacing an experienced elementary teacher with a novice would be roughly $10,000 to $25,000 times 20 students per class, or $200,000 to $500,000. This is obviously a back-of-the-envelope calculation, but it dwarfs the direct costs of teacher hiring.
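The back-of-the-envelope comparison above can be reproduced in a few lines. This is only an illustrative sketch using the dollar figures quoted in the text; the constant and function names are hypothetical, and the calculation takes the per-student value of a .10 standard deviation gain as given.

```python
# Dollar figures quoted in the text; names are illustrative, not from the paper.
CENTRAL_OFFICE_TIME = 1_100      # recruiting cost per vacancy, central office staff
SCHOOL_STAFF_TIME = 2_600        # recruiting cost per vacancy, school-level staff
TRAINING = 4_500                 # cost of training a new teacher
VALUE_PER_STUDENT = (10_000, 25_000)  # value of a .10 SD math gain, per student
STUDENTS_PER_CLASS = 20


def direct_hiring_cost():
    """Direct cost of filling one vacancy, in dollars."""
    return CENTRAL_OFFICE_TIME + SCHOOL_STAFF_TIME + TRAINING


def achievement_cost_range():
    """Low and high estimates of lost achievement for one class, in dollars."""
    low, high = VALUE_PER_STUDENT
    return low * STUDENTS_PER_CLASS, high * STUDENTS_PER_CLASS


if __name__ == "__main__":
    low, high = achievement_cost_range()
    print("direct hiring cost:", direct_hiring_cost())
    print("lost achievement:", low, "to", high)
    print("ratio at the low end:", round(low / direct_hiring_cost(), 1))
```

Even at the low end of the range, the implied achievement cost is more than twenty times the direct hiring cost, which is the sense in which it "dwarfs" that cost.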

Fact 5: School Leaders Have Very Little Ability to Select Effective Teachers During the Initial Hiring Process. Reliable screening at the hiring stage would be an efficient tool for raising student achievement because it avoids the cost of placing ineffective teachers in front of students. Unfortunately, there is scant evidence that school districts or principals can reliably distinguish effective from ineffective teachers when they make hiring decisions. Indeed, this notion is supported by the fact that most of the variation in teacher effects occurs among teachers hired into the same school.

One of the most interesting pieces of evidence on this topic comes from a natural experiment that occurred in California in the late 1990s (Kane and Staiger, 2005). Beginning in the academic year 1996–1997, the state of California provided cash incentives to school districts to keep class sizes in kindergarten through third grade to a maximum of 20 children. To take advantage of the state incentive, school districts throughout the state dramatically increased hiring of new elementary teachers. In the years before 1997, the Los Angeles Unified School District hired 1,200 to 1,400 elementary school teachers per year, but in 1997 Los Angeles nearly tripled the number of elementary school teachers it hired, to 3,335, and continued to hire at more than double its earlier level for the next five years.

If the district were able to discern teacher effectiveness in the hiring process, we would have expected a large increase in hiring to have had a negative effect on the average effectiveness of the teachers hired. Such an effect would likely have been heightened by the fact that nearly every other school district in California was on a hiring spree because of the same state law, the fact that teacher compensation in Los Angeles did not increase more than usual during this period, and the fact that the proportion of new hires in L.A. without teaching credentials rose from 59 percent to 72 percent.5 However, Kane and Staiger (2005) find that, despite the size of the hiring bubble, value added in the period 2001–2004 for teachers hired in 1997 was no worse than for teachers hired in the years immediately before 1997.6 Overall, there was no evidence that tripling the number of new hires had any effect on their average effectiveness in the classroom.7

Other evidence on this issue comes from decades of work in which researchers have tried, unsuccessfully, to link teacher characteristics observable to both researchers and principals to student outcomes (see reviews by Hanushek, 1986, 1997; Jacob, 2007). With the exception of teaching experience, there is little to suggest that the credentials commonly used to determine teacher certification and pay are related to teachers' impacts on student outcomes. Some studies find that a teacher's academic background (like college grade point average or SAT scores) is related to student outcomes, but Ballou (1996) finds that teaching applicants with strong academic records are no more likely to be hired by school principals. More recent work suggests that selecting teaching candidates who are likely to be effective is difficult, but not impossible.
For example, several studies have estimated the effect of novice teachers recruited under the Teach for America program (Decker, Mayer, and Glazerman, 2004; Boyd, Grossman, Lankford, Loeb, and Wyckoff, 2006; Kane, Rockoff, and Staiger, 2008). Teach for America is highly selective, drawing applicants from the top universities in the country and offering positions to only a small fraction of the thousands of individuals who apply. However, these applicants have not generally taken college courses in K–12 education, nor have they majored in education. Decker, Mayer, and Glazerman (2004) use random assignment to estimate the effect of the program in elementary schools and find that students assigned to Teach for America members scored 2 percentile points (0.095 standard deviations) higher in math and no higher in reading than those assigned to other teachers. Using nonexperimental data from New York City, in Kane, Rockoff, and Staiger (2008), we find positive effects of Teach for America teachers in math of .02 standard deviations and no statistically significant effect in English Language Arts. Boyd, Grossman, Lankford, Loeb, and Wyckoff (2006) report comparable results, also using data from New York City.

More evidence comes from studies collecting data on recently hired novice math teachers in New York City. In Rockoff, Jacob, Kane, and Staiger (forthcoming), we collected information on a number of nontraditional predictors of effectiveness (including teaching-specific content knowledge, cognitive ability, personality traits, feelings of self-efficacy, and scores on a commercially available teacher selection instrument) and then used these to predict a teacher's effect on math achievement. When the variables were combined into two primary factors summarizing cognitive and noncognitive skills, teachers who were one standard deviation higher on either the cognitive or noncognitive factor were found to raise student achievement in math by .033 student-level standard deviations more than teachers with average skill levels. Those who were one standard deviation higher on both measures were estimated to raise achievement by .066 standard deviations.

Rockoff and Speroni (2010) examined the achievement of students assigned to teachers recruited through an alternative certification program, the New York City Teaching Fellows, and asked whether achievement gains were higher for students assigned to teachers rated as more attractive candidates by the certification program's interview protocol. They found no significant relationship with English Language Arts test scores and a small positive relationship with math test scores: a one standard deviation increase in interview score was associated with a .013 standard deviation higher math achievement gain.

5 It may seem surprising that the fraction of teachers without credentials didn't rise by more, but the number of individuals with teaching certification who do not teach is quite large. Data from the Baccalaureate and Beyond study indicate that roughly one in five college graduates receive teaching certification in the ten years after graduation, but 45 percent of college graduates who obtained teaching certification are not teaching, and 15 percent have never taught (author's calculations using National Center for Education Statistics QuickStats on 6/8/2010).

6 Their analysis focuses on grades two through five in Los Angeles from 2001 through 2004. By 2001, roughly two-thirds of both the 1996 and 1997 hiring cohorts were still employed by the district; thus, there is little evidence to suggest any differential selective attrition for the larger cohort. Also, while their value-added model controls for baseline scores and other student characteristics, there was virtually no difference in the types of students to which the cohorts had been assigned.

7 This evidence runs counter to the prevailing wisdom among some policy analysts that it was a decline in the average quality of the teaching force that accounts for the failure to see an increase in achievement in California resulting from the class size reduction (Bohrnstedt and Stecher, 2002).

Implications for How We Should (and Should Not) Search for Effective Teachers

Here, we first lay out a way of thinking about the appropriate search strategy for school leaders based on these empirical findings. Based on this approach, we then present simulation estimates of how different strategies would affect average teacher productivity.

A Reservation Value or Cut-off Score Model

Suppose that school districts do not observe any useful pre-hire signal: there are a substantial number of potential applicants for teaching jobs who appear to have the general skill level to succeed in teaching, but we cannot tell in advance which ones will actually succeed. However, after teachers accept a job, the school can observe the gains that students make in test scores. Thus, the principal faces a search problem: the principal draws teachers from the applicant pool, observes noisy signals over time about teacher productivity, and decides whether to dismiss unproductive teachers and start the process over again. In this kind of model, the optimal decision rule has a reservation property: at the end of a year, the principal dismisses a teacher if the expected effectiveness of that teacher, given the information to date, lies below a reservation value.8

At a broad level, the principal should set the cut-off score where the productivity of the marginal teacher is expected to be equal to the productivity of the average teacher. In other words, this decision rule tells principals to keep only the rookies who are expected to be better than the average teacher. Imagine if this were not true; that is, suppose the marginal teacher were less productive than the average teacher. Then the school district could raise average performance by raising its standard for new hires by a small amount.
Likewise, if the marginal teacher accepted under the standard were more productive than the average teacher, then the district could raise average performance by lowering the cut-off score for new hires and adding one more above-average teacher. This result is analogous to the usual result that average costs are minimized at the point where marginal cost equals average cost.

However, determining the reservation value or cut-off score in practice will be complex. The optimal reservation value depends on a set of underlying parameters similar to those already discussed: the extent of variation in performance across teachers, the return to experience, the number of years before tenure, the exogenous turnover rate, the size of the applicant pool, and the magnitude of other hiring and firing costs. For example, if teachers are more heterogeneous, then the potential benefits of greater selection are higher. However, if there is more noise (and thus uncertainty) in the estimates of teacher heterogeneity, then the benefits of selection are lower.
If the gains from teacher experience are worth more, then the cost of dismissing experienced teachers and replacing them with novices is larger. If the exogenous turnover rate of teachers is high, then the optimal cut-off for tenure falls because there is less benefit to giving tenure to highly effective teachers if they do not stay long. Overall, the principal must set the bar to trade off the short-term cost of replacing an experienced teacher with a rookie against the long-term benefit of selecting only the most effective teachers.

In what follows, we report the results of Monte Carlo simulations that examine the consequences of different approaches to teacher evaluation and retention. We use evidence on key underlying parameters to calibrate the model; all of these values lie in the middle of the estimates reported for Los Angeles and New York City in Table 1.
We set the standard deviation of the persistent teacher effect (in student-level standard deviation units) equal to 0.15, and the reliability of the value-added measure (the ratio of the persistent variance to total variance) equal to 40 percent. For the return to experience, we assume that the value added of first- and second-year teachers is 0.07 and 0.02 student standard deviations, respectively, below the value added of teachers in their third year or higher. We ignore the direct costs of hiring a new teacher. Finally, we assume a maximum teaching career of 30 years and an exogenous turnover rate of 5 percent, which is approximately the proportion of experienced teachers who leave the Los Angeles and New York City districts each year.

8 For a simple algebraic presentation of this model, with some discussion of its links to search models with imperfect information in labor markets, see the online appendix available with this paper at 〈http://www.e-jep.org〉.

Figure 1
Effect of Dismissing a Given Proportion of Novice Teachers Based on One Year of Data

[Figure: x-axis, proportion dismissed (0 to 1); left axis, average value added (0 to .08), solid line; right axis, proportion novice (0 to .8), dashed line.]

Notes: Proportion dismissed (x-axis) refers to the proportion of teachers with the lowest value-added estimates that are dismissed after their first year. The solid line (and left axis) shows the steady state impact of each proportion dismissed on the value added of the average teacher, including those in their first year of teaching. Teacher value added is measured in standard deviations of student performance. The dashed line (and right axis) shows the steady state impact of each proportion dismissed on the proportion of the teacher workforce in the first (or novice) year of teaching.
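The calibration described above is enough to sketch the steady-state trade-off in closed form. What follows is a minimal sketch, not the simulation code behind the paper. It assumes that persistent teacher effects and measurement noise are normally distributed, that a fixed bottom proportion of first-year teachers (ranked by their one-year value-added estimate) is dismissed, and that exogenous turnover applies to every retained teacher at the end of every year, including the first. All function and variable names are hypothetical.

```python
from statistics import NormalDist

SD_PERSISTENT = 0.15             # SD of persistent teacher effect (student SD units)
RELIABILITY = 0.40               # persistent variance / total variance, one year of data
PENALTY = {1: -0.07, 2: -0.02}   # value added relative to teachers in year 3 or later
MAX_CAREER = 30                  # maximum career length in years
TURNOVER = 0.05                  # exogenous annual exit rate (assumed to apply to all)


def steady_state(p_dismiss):
    """Return (average value added, proportion novice) in steady state.

    p_dismiss is the share of first-year teachers dismissed, in [0, 1).
    """
    std_normal = NormalDist()
    if p_dismiss <= 0.0:
        retained_mean = 0.0  # no selection: retained teachers are average
    else:
        z = std_normal.inv_cdf(p_dismiss)  # cut-off in SDs of the noisy estimate
        # Mean persistent effect among teachers whose estimate is above the
        # cut-off: sd * sqrt(reliability) * pdf(z) / (1 - p) under normality.
        retained_mean = (SD_PERSISTENT * RELIABILITY ** 0.5
                         * std_normal.pdf(z) / (1.0 - p_dismiss))

    # Expected teacher-years per hire, by year of experience.
    total_years = 1.0            # every hire teaches a first year
    total_value = PENALTY[1]     # the average hire has persistent effect zero
    for year in range(2, MAX_CAREER + 1):
        share = (1.0 - p_dismiss) * (1.0 - TURNOVER) ** (year - 1)
        total_years += share
        total_value += share * (retained_mean + PENALTY.get(year, 0.0))
    return total_value / total_years, 1.0 / total_years


if __name__ == "__main__":
    for p in (0.0, 0.2, 0.4, 0.6, 0.8, 0.9):
        avg, novice = steady_state(p)
        print(f"dismiss {p:.0%}: average value added {avg:.3f}, novice share {novice:.2f}")
```

Under these assumptions the calculation puts average value added at about 0.08 when 80 percent of first-year teachers are dismissed, with roughly a quarter of the workforce in its first year. The closed form leaves out everything the text lists as complications (transition dynamics, discounting, recruiting costs, spillovers), so it should be read as a consistency check on the calibration rather than as a substitute for the simulations.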

Tenure or Dismiss after One Year

We begin with a basic example in which the principal must either dismiss or tenure a teacher after one year of teaching based on just one year of student value-added data. Figure 1 reports the expected steady-state impact of dismissing a given proportion of teachers (the bottom axis) on the value added of the average teacher (left axis, solid line) and on the proportion of the teacher workforce who are in their first year of teaching (right axis, dashed line).

The implications of Figure 1 are stark. The simulation suggests there are substantial gains from using value-added information to dismiss ineffective teachers and that the principal should set a very high bar for tenure. To maximize average value added, about 80 percent of teachers should be dismissed after their first year. This aggressive strategy would raise the average value added of teachers in the school to just over 0.08; put differently, the effectiveness of the average teacher (including the rookies) would be greater than roughly 70 percent of the tenured teachers under the old system. Moreover, it is not the case that most of the gain comes from dismissing the very lowest-performing teachers. Indeed, until the principal reaches the optimum, the gain to being increasingly selective in who receives tenure is roughly linear. For example, if the principal dismissed the 40 percent of first-year teachers with the lowest value added, rather than 80 percent, the average value added among teachers in the school would increase by roughly 0.045 in the long run.

While these results are surprising relative to current practice, there are a number of clear reasons why principals might choose to dismiss a large proportion of novice teachers.
Even unreliable performance measures such as value added can identify substantial and lasting differences across teachers. Differences in teacher effects are large and persistent relative to the short-lived costs of hiring a new teacher. Since the typical teacher getting tenure will teach for ten years or more, the benefit from setting a high tenure bar will be large. Of course, such unreliable measures make mistakes. But the long-run cost of retaining an ineffective teacher far outweighs the short-run cost of dismissing an effective teacher. Moreover, because of the uncertainty at the time of hire, new teachers have considerable option value; for every five new hires, one will be identified as a highly effective teacher and provide many years of valuable service.

There are many reasons why these simulations could overstate the benefits or understate the costs of such an aggressive tenure policy, and we have tried to enumerate a number of them here. In general, for reasonable variations in the parameter values, these issues do not alter our qualitative conclusions.
First, we may have understated the hiring and firing costs facing a principal. As we discussed earlier, the main cost of turnover is the lower effectiveness of new teachers, which corresponds to a cost of well over $100,000 in terms of foregone future student earnings. However, even if we double the difference in value added between rookies and experienced teachers (that is, from 0.07 to 0.14 student-level standard deviations), the optimal dismissal rate remains over 75 percent.

Second, we may have understated turnover rates among tenured teachers, especially if principals focus on their own school (rather than the district as a whole) and highly effective teachers are more likely to move to other schools. Similarly, principals may discount the future more highly because of their own likelihood of leaving the school, or because they believe that teacher effects will not persist into the future (although the evidence suggests otherwise). However, if we double the exogenous annual turnover rate from 5 to 10 percent, the optimal dismissal rate remains over 70 percent.
Third, we may have understated the cost of recruiting teachers. The simulation indicates that a dismissal rate of 80 percent would result in more than 20 percent of the workforce being novice teachers at any time, more than double the current proportion of novices in Los Angeles and New York City. Districts would have to hire many more teachers to accommodate this strategy, and these new hires would presumably demand higher wages to compensate for the substantial risk of being dismissed. This is particularly true if we continue to require costly up-front teaching-specific training. However, even a doubling of current teacher salaries would not be enough to offset the benefits of an aggressive dismissal policy, since a .08 annual increase in student achievement is worth more than $100,000 per teacher.

Fourth, our simulations focus on the steady state, and we have ignored what happens along the way. This may be important if we discount the earnings of future children relative to current children. Specifically, for a school that starts with all new teachers, an aggressive dismissal policy will result in high fractions of inexperienced teachers in the short run even if the equilibrium percentage of rookies is lower.
FForor example,example, iiff wwee ccompareompare a ppolicyolicy ooff rretainingetaining tthehe ttopop 2200 ppercentercent ooff nnewew tteacherseachers wwithith rretainingetaining 9090 ppercent,ercent, aaverageverage tteachereacher eeffectivenessffectiveness wouldwould bebe slightlyslightly llowerower iinn tthehe fi rstrst twotwo yearsyears afterafter implementingimplementing tthehe mmoreore aaggressiveggressive ppolicy,olicy, tturnurn ppositiveositive inin yearyear three,three, andand approachapproach thethe steadysteady statestate gainsgains afterafter rroughlyoughly tenten years.years. NNevertheless,evertheless, wewe fi nndd tthehe optimaloptimal dismissaldismissal raterate isis stillstill aabovebove 7070 percentpercent fforor aannualnnual ddiscountiscount rratesates inin a reasonablereasonable rangerange (say(say 2 ttoo 8 ppercent),ercent), aandnd ffallsalls ttoo 5500 ppercentercent oonlynly wwithith aannualnnual ddiscountiscount rratesates onon thethe orderorder ofof 1155 ppercent.ercent. FFifth,ifth, ttherehere mmayay bbee sspilloverpillover eeffectsffects iinn tteaching,eaching, wwherehere ggoodood tteacherseachers hhelpelp rraiseaise thethe achievementachievement ofof theirtheir ccolleagues’olleagues’ studentsstudents ((JacksonJackson aandnd BBruegmann,ruegmann, 22009;009; KKoedell,oedell, forthcoming).forthcoming). TThishis hhasas ttwowo ooffsettingffsetting eeffectsffects iinn tthehe ccontextontext ooff oourur aanalysis.nalysis. SpilloverSpillover effectseffects implyimply thatthat ourour estimateestimate ofof tthehe vvariationariation iinn hhowow tteacherseachers iimpactmpact ttheirheir own studentsstudents isis overstatedoverstated (because(because somesome ofof thethe observedobserved effecteffect isis duedue ttoo theirtheir colleagues).colleagues). 
However,However, ifif thisthis biasbias isis roughlyroughly 2020 percent,percent, asas suggestedsuggested bbyy JJacksonackson aandnd BBruegmannruegmann (2009),(2009), soso thatthat thethe truetrue sstandardtandard ddeviationeviation iinn ppersistentersistent tteachereacher eeffectsffects waswas 0.120.12 ratherrather thanthan 0.15,0.15, thethe optimaloptimal dismissaldismissal raterate wouldwould stillstill bebe 7979 ppercentercent iinn oourur mmodel.odel. MMoreover,oreover, sspilloverpillover eeffectsffects wwillill aalsolso iincreasencrease tthehe bbenefienefi tsts ofof fi llinglling sschoolschools eentirelyntirely wwithith hhighlyighly eeffectiveffective teachersteachers andand couldcould easilyeasily implyimply hhigherigher ooptimalptimal ddismissalismissal rrates.ates. SSixth,ixth, wewe havehave assumedassumed thatthat teachersteachers whowho dodo notnot receivereceive tenuretenure exitexit thethe teachingteaching wworkforce.orkforce. IIff tteachingeaching eeffectivenessffectiveness iiss mmeasuredeasured wwithith eerrorrror aandnd pprincipalsrincipals hhaveave nnoo ppowerower toto screenscreen amongamong candidates,candidates, tthenhen ddismissedismissed teachersteachers couldcould movemove toto anotheranother sschoolchool andand hopehope forfor betterbetter luckluck inin theirtheir eevaluation.valuation. IIff tthishis ooccurred,ccurred, thethe averageaverage qualityquality ooff thethe applicantapplicant poolpool wouldwould decline.decline. 
ThisThis typetype ofof phenomenon—poorlyphenomenon—poorly performingperforming tteacherseachers movingmoving ttoo nnewew sschools,chools, ttypicallyypically tthosehose sservingerving mmoreore ddisadvantagedisadvantaged studentsstudents ((discusseddiscussed inin Boyd,Boyd, Grossman,Grossman, Lankford,Lankford, Loeb,Loeb, andand Wyckoff,Wyckoff, 22008)—is008)—is alreadyalready well-well- kknownnown iinn eeducation,ducation, aandnd iiss rreferredeferred ttoo aass ““thethe ddanceance ooff tthehe llemons.”emons.” TThishis ssuggestsuggests tthathat principalsprincipals ccouldould bbenefienefi t theirtheir ccolleaguesolleagues atat otherother schoolsschools byby sharingsharing performanceperformance iinformationnformation oonn tteachers.eachers. HHowever,owever, iiff sschoolschools wwereere hhighlyighly sselectiveelective inin grantinggranting ttenure,enure, iitt mmightight aalsolso bbee ttruerue tthathat tteacherseachers wwhoho rreceiveeceive a bbadad ssignalignal rregardingegarding ttheirheir eeffectivenessffectiveness wwouldould hhaveave llessess iincentivencentive ttoo ““shopshop aaround.”round.”9

9 A similar complication arises if teaching skills are partially related to subject matter, grade level, or the teacher–school match. The evidence on the specificity of teaching skill is mixed (Boyd, Grossman, Lankford, Loeb, and Wyckoff, 2008; Lockwood and McCaffrey, 2009; Jackson, 2010), but it is reasonable to believe that, say, a mediocre teacher of high school physics in Harlem may have made a good fifth grade math teacher in Brooklyn Heights, or vice versa. If principals do have access to information on the past performance of teachers who did not make tenure, specificity in skill can simply be interpreted as an additional source of error. This may lead principals to be more willing to “take a chance” on a teacher who just missed tenure by trying them in a different subject, grade, or teaching environment.

Seventh, there is evidence that the impact of a teacher on the achievement of current students may fade out over time as those students progress through their remaining years of school (Jacob, Lefgren, and Sims, 2008; Kane and Staiger, 2008). It is unclear whether this greatly weakens the case for raising teacher quality. We do not know whether fade-out is a general phenomenon or whether it is a result of current levels of heterogeneity in teacher effectiveness. For example, having a highly effective teacher in the previous year may do students little good if they are placed with a highly ineffective teacher this year, or if their classmates were placed with a highly ineffective teacher the previous year. In other words, teacher effects might not fade out if a group of students is given a sequence of highly effective teachers. Moreover, raising a student’s academic achievement as measured in a particular grade may still be valuable even if the student’s scores in later grade levels do not remain at the improved level. For instance, a better knowledge of numerical operations may still be valuable in the labor market even if it does not lead to better test scores in algebra. Nevertheless, we do view “fade-out” as a primary empirical issue, not just in studies of teachers, but in studies of educational interventions more broadly (for example, Currie and Thomas, 1995).

Finally, we may have overstated the reliability of value-added measures. We address this issue in considerable detail below. However, even if we cut the reliability of value added in half (from 40 to 20 percent), the optimal dismissal rate remains over 70 percent.

The Effect of Changing the Time to Tenure Review

In our next set of simulations, we evaluate how changing the time until tenure review affects the optimal dismissal rate and the average value added of teachers. The first column of Table 2 repeats the results from our benchmark simulations in which dismissal could only occur at the end of the first year. The next three columns allow the principal to delay tenure review until the second, third, or fourth year, and to gather more information about teacher effectiveness before making a decision regarding dismissal. The next three columns require a delay in tenure review for 2 to 4 years, so that dismissal can occur only after multiple years of value-added data are available to the principal.

Table 2
Effect of Delaying Tenure Decisions beyond the First Year, Options vs. Requirements

                        Baseline:      Dismissal allowed at      Dismissal required
                        dismissal at   any time until            only at time
                        T = 1          T = 2   T = 3   T = 4     T = 2   T = 3   T = 4
Average value added     0.080          0.095   0.099   0.101     0.075   0.068   0.061
% Dismissed overall     81%            83%     84%     84%       75%     71%     68%
% Dismissed annually
  At T = 1              81%            67%     67%     67%
  At T = 2                             16%     8%      8%        75%
  At T = 3                                     9%      4%                71%
  At T = 4                                             5%                        68%

Notes: Average value added refers to the average level of teachers’ value-added estimates in the steady state under an optimal dismissal policy, and it includes both untenured and tenured teachers. (Teacher value added is measured in standard deviations of student performance.) “Percentages dismissed” refers to the percent of a single cohort of newly hired teachers dismissed during years leading up to when a tenure decision is required (year T).

Not surprisingly, giving a principal the option of waiting to gather more information produces some benefits. Average value added rises to about 0.10 standard deviations with the possibility of delaying tenure review to the fourth year, with most of the gain coming from delaying tenure until the second year. Even with the option to delay the tenure decision, the principal would still dismiss two-thirds of new hires after the first year, but would wait to dismiss some teachers for whom there is a reasonable chance that an additional year of data could lead to a better decision. Intuitively, the cut-off score for dismissing a new teacher rises with time on the job, because the option value of waiting to dismiss a teacher declines as the principal accumulates better information. In other words, to avoid unnecessary turnover the principal may choose to wait a year before dismissing a teacher who the principal believes is “below the bar” so long as there is a reasonable chance that this evaluation could change. Thus, the principal dismisses teachers whose expected effectiveness lies below a bar that increases with teacher experience.

Overall dismissal rates do not change much as the principal is allowed to wait until year 2, 3, or 4 to make a decision, but the extra time allows the principal to better identify the remaining subset of teachers for tenure. In contrast, requiring principals to delay tenure review (that is, removing the option of dismissal until year 2, 3, or 4) would lead to lower average teacher value added, relative to the baseline case. Essentially, this policy forces principals to retain low-performing teachers additional years, and this outweighs the benefits of the additional information the principal would obtain by waiting to see additional years of performance data. Note that this policy also leads to fewer teachers being dismissed overall, since the option value of hiring a new teacher (who may turn out to be ineffective and must be retained for several years) has fallen.
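The idea that the principal accumulates better information with each additional year can be illustrated with a short sketch. It rests on assumptions that the text does not spell out: each annual value-added measure equals a teacher's persistent effect plus independent noise of the same variance every year, and both are normally distributed. Under those assumptions the reliability of a multi-year average follows the Spearman-Brown formula. The function names are illustrative; the 0.15 standard deviation and the 40 percent annual reliability are the benchmark values quoted in the text.

```python
import math

SD_PERSISTENT = 0.15       # SD of persistent teacher effects (benchmark in the text)
ANNUAL_RELIABILITY = 0.40  # reliability of one year of value added (benchmark)


def reliability_of_average(annual_reliability, years):
    """Reliability of the mean of `years` annual measures (Spearman-Brown).

    Assumes each annual measure = persistent effect + independent noise
    with the same variance in every year (a simplifying assumption).
    """
    r = annual_reliability
    return years * r / (1.0 + (years - 1) * r)


def expected_effect(mean_measure, annual_reliability, years):
    """Best guess of the persistent effect: the observed mean shrunk toward zero."""
    return reliability_of_average(annual_reliability, years) * mean_measure


def remaining_sd(annual_reliability, years, sd_persistent=SD_PERSISTENT):
    """SD of the persistent effect that is still uncertain after `years` of data."""
    rel = reliability_of_average(annual_reliability, years)
    return sd_persistent * math.sqrt(1.0 - rel)


if __name__ == "__main__":
    for t in range(1, 5):
        rel = reliability_of_average(ANNUAL_RELIABILITY, t)
        sd_left = remaining_sd(ANNUAL_RELIABILITY, t)
        print(f"years={t}  reliability={rel:.3f}  remaining SD={sd_left:.3f}")
```

Under these assumptions the reliability of the averaged measure rises from 0.40 after one year to about 0.73 after four, and the largest single increase comes from the second year of data. As the remaining uncertainty shrinks, an additional year of observation is less likely to reverse the principal's judgment, which is one way to read the declining option value of waiting.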

Obtaining More Reliable Information at the Time of Hire

We have assumed that principals have no useful information at the time of hire. This implies that radical increases in hiring rates (as required by a dismissal rate of 80 percent) do not affect the quality of new hires: each individual is a random draw from a generally qualified applicant pool. But many districts and principals put substantial effort into screening and interviewing new hires, suggesting that even small amounts of information at the time of hire may be valuable.

Figure 2
Effect of Increasing the Reliability of the Pre-hire Performance Signal on Value Added of Average Teacher and Proportion of Teachers Dismissed after One Year

[Figure omitted. Horizontal axis: reliability of pre-hire performance signal (0 to 1). Left axis: average value added (solid line). Right axis: proportion dismissed (dashed line).]

Notes: Reliability (x-axis) refers to the proportion of variance in the pre-hire performance signal that is due to the persistent component of teacher performance. The solid line (and left axis) shows the steady state impact of reliability on the value added of the average teacher, including those in their first year of teaching, based on the optimal proportion of teachers dismissed after one year. (Teacher value added is measured in standard deviations of student performance.) The dashed line (and right axis) shows the steady state impact of reliability on the optimal proportion dismissed after one year.

Figure 2 shows how changing the reliability of the pre-hire signal affects the optimal dismissal rate (right axis, dashed line), and the resulting value added of the average teacher in the school (left axis, solid line). For these simulations, we assumed that the principal could only dismiss teachers after the first year (T = 1). We also assumed that the pool of potential applicants was ten times the number needed to replace teachers leaving through exogenous turnover, corresponding to estimates that New York City and Los Angeles currently have about 10 applicants for each position. Our baseline simulation corresponds to a reliability of 0 in the pre-hire signal, at the far left in this figure.

Figure 2 suggests that pre-hire information on teacher effectiveness is potentially quite valuable. Compared to having no information at the time of hire, a perfect pre-hire signal with 100 percent reliability would nearly triple the gain to the value added of the teacher workforce, and of course would eliminate the need to dismiss teachers after hire. More interestingly, even a low reliability signal of 20 percent at the time of hire doubles the gain to the value added of the teacher workforce relative to the benchmark case with no pre-hire information. However, access to a pre-hire signal does not eliminate the need to dismiss additional teachers after hire. As long as there is remaining uncertainty about teacher effectiveness among the teachers that are hired, there will be a benefit to dismissing additional teachers after observing classroom performance.
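To illustrate why even a noisy pre-hire signal has value, the following sketch simulates a principal who hires the applicant with the highest signal out of ten. It is a simplified stand-in for the simulation described above, not a reproduction of it: it ignores the first-year experience penalty and any dismissal after hire, so its output is not comparable to Figure 2. Effects and signals are assumed to be normally distributed, and the function name, trial count, and seed are illustrative choices.

```python
import random

SD_PERSISTENT = 0.15         # SD of persistent teacher effects (benchmark in the text)
APPLICANTS_PER_OPENING = 10  # about ten applicants per position, as in the text


def mean_effect_of_hire(reliability, trials=20000, seed=0):
    """Average persistent effect of the hire when the principal picks the
    applicant with the highest pre-hire signal.

    `reliability` is the share of signal variance due to the persistent
    effect (0 = pure noise, 1 = perfect signal).
    """
    rng = random.Random(seed)
    weight_true = reliability ** 0.5
    weight_noise = (1.0 - reliability) ** 0.5
    total = 0.0
    for _ in range(trials):
        best_signal = None
        best_effect = 0.0
        for _ in range(APPLICANTS_PER_OPENING):
            z_true = rng.gauss(0.0, 1.0)   # standardized persistent effect
            z_noise = rng.gauss(0.0, 1.0)  # standardized signal noise
            signal = weight_true * z_true + weight_noise * z_noise
            if best_signal is None or signal > best_signal:
                best_signal = signal
                best_effect = SD_PERSISTENT * z_true
        total += best_effect
    return total / trials


if __name__ == "__main__":
    for rel in (0.0, 0.2, 0.4, 1.0):
        print(f"reliability={rel:.1f}  mean effect of hire={mean_effect_of_hire(rel):.3f}")
```

With a reliability of 0 the hire is effectively a random draw, so the expected effect is close to zero. In this setup the expected effect of the hire is proportional to the square root of reliability, which is why the first increments of pre-hire information are worth the most.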

Obtaining More Reliable Measures of On-the-Job Performance

Figure 3 shows how changing the reliability of the on-the-job signal affects the optimal timing of tenure (regions delineated by dotted lines, labeled at top), the optimal dismissal rate (right axis, dashed line), and the resulting value added of the average teacher in the school (left axis, solid line). For these simulations, we assumed that the principal could only dismiss teachers at tenure time (T). Many school districts are currently engaged in efforts to improve the reliability with which they can measure teacher performance, through the use of additional information from classroom observation, student work, and student or parent surveys.

Figure 3 suggests that more reliable measures of teacher performance are quite valuable. Relative to our baseline simulation, in which reliability of the annual performance measure was 40 percent (.4 in the figure), a measure with perfect reliability would nearly double the gains from selecting effective teachers (to 0.14 standard deviations) while having little impact on the proportion of teachers dismissed. If districts relied on performance measures that were less reliable than our baseline case, the gains from selecting effective teachers would be reduced, and it would become optimal for the principal to wait longer before dismissing a teacher. Interestingly, the proportion of teachers dismissed does not decline much until the reliability of the performance measure drops below 5 percent (.05 in the figure). Even very weak signals of teacher performance eventually identify differences between teachers that make the benefits of selectively awarding tenure swamp the cost of having to hire additional inexperienced teachers.
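The trade-off between selection gains and the cost of additional novices can be sketched in a deliberately stylized steady-state model. This is not the simulation used in the text. It fixes the tenure decision at the end of the first year, assumes normally distributed effects and noise, and uses two hypothetical parameters that do not come from the text: a first-year penalty of 0.07 student standard deviations and a 5 percent annual exit rate among tenured teachers. The function names are also illustrative.

```python
from statistics import NormalDist

SD_PERSISTENT = 0.15   # SD of persistent teacher effects (benchmark in the text)
NOVICE_PENALTY = 0.07  # HYPOTHETICAL: first-year shortfall, in student SDs
EXIT_RATE = 0.05       # HYPOTHETICAL: annual exogenous exit rate of tenured teachers

_STD_NORMAL = NormalDist()


def average_value_added(dismiss_share, reliability):
    """Steady-state average value added when `dismiss_share` of each novice
    cohort is dismissed after one year on the basis of a noisy signal."""
    retain = 1.0 - dismiss_share
    if retain <= 0.0:
        return -NOVICE_PENALTY  # nobody is tenured, so every teacher is a novice
    # Steady state: flow into tenure (retain * novices) equals
    # flow out of tenure (EXIT_RATE * tenured).
    novice_share = EXIT_RATE / (retain + EXIT_RATE)
    if retain >= 1.0:
        selection_gain = 0.0  # no one is screened out
    else:
        cutoff = _STD_NORMAL.inv_cdf(dismiss_share)
        # Mean persistent effect of teachers whose signal exceeds the cutoff.
        selection_gain = (SD_PERSISTENT * reliability ** 0.5
                          * _STD_NORMAL.pdf(cutoff) / retain)
    return (1.0 - novice_share) * selection_gain - novice_share * NOVICE_PENALTY


def best_dismissal_share(reliability):
    """Grid search (0 to 99 percent) for the value-maximizing dismissal share."""
    grid = [k / 100 for k in range(0, 100)]
    return max(grid, key=lambda d: average_value_added(d, reliability))


if __name__ == "__main__":
    for rel in (0.05, 0.2, 0.4, 1.0):
        d = best_dismissal_share(rel)
        print(f"reliability={rel:.2f}  best dismissal share={d:.2f}  "
              f"average value added={average_value_added(d, rel):.3f}")
```

With these illustrative parameters the value-maximizing dismissal share stays high even when the signal is weak, while the achievable average value added grows with reliability. The qualitative pattern matches the discussion above, but the specific numbers depend on the hypothetical penalty and exit rate and should not be read as estimates.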

CConclusiononclusion

IInn thethe ongoingongoing ddebateebate ooverver hhowow ttoo iimprovemprove teachingteaching qqualityuality iinn ppublicublic sschools,chools, ttherehere hhaveave beenbeen conflconfl iictingcting cclaimslaims rregardingegarding thethe usefulnessusefulness ofof currentlycurrently availableavailable mmeasureseasures ooff tteachereacher eeffectiveness.ffectiveness. ForFor example,example, iinn rreferenceeference ttoo a pproposedroposed initia-initia- ttiveive ttoo measuremeasure teacherteacher effectivenesseffectiveness usingusing studentstudent testtest sscores,cores, tthehe hheadead ooff tthehe NNewew YYorkork CityCity teachers’teachers’ unionunion RandiRandi WeingartenWeingarten stated:stated: “There“There isis nono wayway thatthat anyany ofof thisthis ccurrenturrent datadata couldcould actually,actually, fairly,fairly, honestlyhonestly oror withwith anyany integrityintegrity bebe usedused toto isolateisolate thethe ccontributionsontributions ooff aann iindividualndividual tteacher”eacher” ((asas rreportedeported bbyy MMedina,edina, 22008).008). IInn ccontrast,ontrast, tthehe UU.S..S. SSecretaryecretary ofof EducationEducation AArnerne DDuncanuncan ((2009)2009) hhasas sstated:tated: “I“I havehave anan openopen mindmind 114 Journal of Economic Perspectives

Figure 3 Effect of Reliability of the Annual Performance Measure on Optimal Timing of Tenure, Optimal Dismissal Rate, and Value Added of Average Teacher

T .16 T > 3 T = 3 T = 2 = 1 .8

.14 .7

.12 .6 Proportion dismissed

.1 .5

.08 .4

.06 .3

.04 .2 Average value added Average Average value added (left axis) Proportion dismissed (right axis) .02 .1

0 0

0 .1 .2 .3 .4 .5 .6 .7 .8 .9 1 Reliability of annual performance measure

Notes: Figure 3 shows how changing the reliability of the on-the-job signal affects the optimal timing of tenure (regions delineated by dotted lines, labeled at top), the optimal dismissal rate (right axis, dashed line), and the resulting value added of the average teacher in the school (left axis, solid line). For these simulations, we assumed that the principal could only dismiss teachers at tenure time (T ). Reliability (x-axis) refers to the correlation of the annual performance measure across years within the same teacher. The dashed line (and right axis) shows the steady state impact of reliability on the proportion dismissed. The solid line (and left axis) shows the steady state impact of reliability on the value added of the average teacher, including those in their fi rst year of teaching, based on the optimal proportion of teachers dismissed after the optimal waiting period (T ). (Teacher value added is measured in standard deviations of student performance.) The regions separated by horizontal dotted lines denote the optimal year in which the tenure decision should be made (T ) for different levels of reliability. aaboutbout teacherteacher evaluation,evaluation, butbut wwee nneedeed toto fi ndnd a wayway toto measuremeasure classroomclassroom successsuccess aandnd tteachereacher effectiveness.effectiveness. 
PretendingPretending thatthat studentstudent ooutcomesutcomes aarere nnotot ppartart ooff tthehe eequationquation iiss llikeike ppretendingretending tthathat pprofessionalrofessional bbasketballasketball hhasas nnothingothing ttoo ddoo wwithith tthehe sscore.”core.” GGiveniven thethe availableavailable evidence,evidence, wwee hhaveave ttriedried toto evaluateevaluate systematicallysystematically hhowow sschoolchool leadersleaders shouldshould uusese tthehe ccurrentlyurrently aavailablevailable bbutut iimperfectmperfect measuresmeasures ofof teacherteacher eeffectivenessffectiveness toto recruit,recruit, evaluate,evaluate, andand retainretain tteachers.eachers. OOurur ssimulationsimulations ssuggestuggest tthathat uusingsing existingexisting iinformationnformation oonn tteachereacher pperformanceerformance ttoo aaggressivelyggressively sselectelect tteacherseachers wwouldould yyieldield ssubstantialubstantial annualannual gainsgains inin academicacademic achievementachievement ofof aaroundround 00.08.08 sstudenttudent llevelevel sstandardtandard ddeviations.eviations. TThesehese aarere comparablecomparable toto thethe annualannual testtest sscorecore gainsgains foundfound inin recentrecent experimentalexperimental eevaluationsvaluations ooff ccharterharter schoolsschools (Hoxby(Hoxby aandnd MMurarka,urarka, 22009;009; AAbdulkadiroglu,bdulkadiroglu, Angrist,Angrist, Dynarski,Dynarski, Kane,Kane, andand Pathak,Pathak, 2009)2009) Douglas O. Staiger and Jonah E. Rockoff 115

aandnd ccomparableomparable ttoo tthehe eestimatedstimated aannualnnual iimpactmpact ooff rreducingeducing cclasslass ssizeize inin earlyearly eelementarylementary ggradesrades ffoundound iinn PProjectroject SSTARTAR ((Krueger,Krueger, 11999).999). OOurur aanalysisnalysis aalsolso ssuggestsuggests thatthat therethere areare substantialsubstantial returnsreturns toto investinginvesting inin betterbetter informationinformation aboutabout tteachereacher eeffectiveness,ffectiveness, bothboth atat thethe timetime ofof hhireire aandnd iinn tthehe fi rrstst fewfew yearsyears onon thethe job.job. OOtherther measuresmeasures ofof teacherteacher performance,performance, suchsuch aass eevaluationsvaluations bbasedased oonn cclassroomlassroom oobservations,bservations, mmayay bbee vveryery uuseful.seful. Finally,Finally, therethere maymay bebe otherother usesuses ofof tthishis iinforma-nforma- ttionion thatthat wwee ddidid notnot considerconsider iinn oourur aanalysis,nalysis, ssuchuch aass fforor pperformance-basederformance-based ppayay oorr ttargetedargeted professionalprofessional development,development, whichwhich wouldwould yieldyield eveneven largerlarger gains.gains. Systemati-Systemati- ccallyally exploringexploring tthehe ppotentialotential ggainsains ffromrom tthesehese ootherther uusesses wwouldould bbee vvaluable.aluable. TTherehere areare manymany practicalpractical obstaclesobstacles toto implementingimplementing a ppolicyolicy tthathat ddeniesenies ttenureenure toto a largelarge proportionproportion ofof newnew teachers.teachers. First,First, largelarge upfrontupfront investmentsinvestments iinn teachingteaching ccredentialsredentials mmakeake vveryery hhighigh rratesates ooff tterminationserminations hhardard ttoo ssupportupport iinn eequilibrium.quilibrium. 
GivenGiven thatthat ttherehere iiss llittleittle eevidencevidence tthathat ssuchuch ccredentialsredentials aarere rrelatedelated ttoo teacherteacher eeffectiveness,ffectiveness, ourour resultsresults suggestsuggest thatthat anan aggressiveaggressive dismissaldismissal policypolicy sshouldhould bbee ccomplimentedomplimented bbyy aann eeasyasy eentryntry ppolicy.olicy. FForor eexample,xample, aass aann aalternativelternative ttoo oobtainingbtaining ccredentials,redentials, districtsdistricts ccouldould ccreatereate anan alternativealternative pportort ooff eentryntry iinn wwhichhich aanyny ccollegeollege ggraduateraduate ((withoutwithout a ccriminalriminal rrecord)ecord) ccouldould bbecomeecome ccertifiertifi eded iiff ttheyhey pperformederformed wellwell onon thethe jobjob inin theirtheir fi rstrst yearyear oror two.two. OneOne couldcould imagineimagine suchsuch aann aalternativelternative certificertifi ccationation rrouteoute bbeingeing aann aattractivettractive ooptionption forfor manymany applicants,applicants, aandnd tthehe tteacherseachers oobtainingbtaining tthehe rresultingesulting certificertifi cationcation beingbeing hhighlyighly vvaluedalued bbyy sschools.chools. IIff aapplicantspplicants wwereere sstilltill uuneasyneasy aaboutbout iinvestingnvesting ttimeime aandnd eeffortffort iinn tthehe ddiffiiffi cultcult fi rstrst yyearsears ofof teaching,teaching, ddistrictsistricts ccouldould rredesignedesign thethe processprocess toto limitlimit thethe up-frontup-front costscosts ooff a “tryout”“tryout” (for(for example,example, iinitiallynitially eevaluatingvaluating nnewew tteacherseachers uusingsing bbriefrief ssummerummer sschoolchool ccourses),ourses), andand allowallow themthem toto gaingain betterbetter pre-hirepre-hire informationinformation atat lowlow cost.cost. 
SSimilarly,imilarly, iitt iiss iinterestingnteresting ttoo cconsideronsider a wworkforceorkforce ddevelopmentevelopment mmodelodel tthathat ttriesries bbothoth ttoo mminimizeinimize tthehe eexposurexposure ooff sstudentstudents ttoo uuntestedntested tteacherseachers aandnd ggenerateenerate eearly-arly- ccareerareer informationinformation onon teacherteacher effectiveness.effectiveness. ForFor example,example, insteadinstead ofof givinggiving nnewew tteacherseachers a ffullull lloadoad ooff sstudentstudents aand/ornd/or ccourses,ourses, pprincipalsrincipals ccouldould aassignssign tthemhem ttoo a ssmallmall groupgroup ofof studentsstudents oorr a ssingleingle ccourseourse aandnd tthenhen uusese tthishis llimitedimited tteachingeaching rroleole ttoo collectcollect performanceperformance information.information.1100 OOff ccourse,ourse, wwee ddoo nnotot kknownow hhowow rreliablyeliably suchsuch iinformationnformation wwouldould ppredictredict laterlater performanceperformance withwith a fullfull setset ooff tteachingeaching rresponsi-esponsi- bbilities,ilities, butbut wwee ssuspectuspect itit wouldwould bebe moremore informativeinformative tthanhan kknowingnowing wwherehere someonesomeone aattendedttended collegecollege oror whatwhat theythey scoredscored onon a standardizedstandardized certificertifi cationcation examination.examination. DDespiteespite tthesehese issuesissues andand obstacles,obstacles, tthehe ggeneraleneral mmessageessage ooff oourur analysisanalysis rremains.emains. TThehe ccurrenturrent ssystem,ystem, wwhichhich ffocusesocuses oonn ccredentialsredentials atat thethe timetime ofof hhireire aandnd ggrantsrants ttenureenure aass a mmatteratter ofof course,course, isis atat oddsodds withwith decadesdecades ooff eevidencevidence oonn tteachereacher eeffectiveness.ffectiveness. 
Instead, teacher recruitment and retention policies should focus on improving our methods of teacher evaluation and use admittedly imperfect measures of teacher effectiveness to identify and retain only the best teachers early in their teaching careers.
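The logic of retaining teachers on the basis of an imperfect measure can be illustrated with a small simulation. The sketch below is not the model estimated in this paper; it is a minimal illustration under assumed conditions: true effectiveness and measurement noise are independent normal draws, and the standard deviations, the retained fraction, and the function name `simulate_retention` are all hypothetical choices made for the example.

```python
import random
import statistics


def simulate_retention(n_teachers=100_000, sd_true=1.0, sd_noise=1.5,
                       keep_fraction=0.2, seed=0):
    """Retain the top `keep_fraction` of teachers ranked by a noisy measure.

    All parameter values are hypothetical and chosen only for illustration.
    Returns (mean true effect of retained teachers, mean true effect of all).
    """
    rng = random.Random(seed)
    # True effectiveness is never observed directly by the evaluator.
    true_effects = [rng.gauss(0.0, sd_true) for _ in range(n_teachers)]
    # The evaluator sees true effectiveness plus independent noise.
    measured = [t + rng.gauss(0.0, sd_noise) for t in true_effects]

    # Rank teachers by the noisy measure, best first, and keep the top share.
    ranked = sorted(zip(measured, true_effects), reverse=True)
    n_keep = int(n_teachers * keep_fraction)
    retained_true = [t for _, t in ranked[:n_keep]]

    return statistics.fmean(retained_true), statistics.fmean(true_effects)


if __name__ == "__main__":
    retained, everyone = simulate_retention()
    print(f"mean true effect, all teachers:      {everyone:+.3f}")
    print(f"mean true effect, retained teachers: {retained:+.3f}")
```

Under these assumptions, the retained group has higher average true effectiveness than the full pool even though the measure is mostly noise, and setting `sd_noise=0.0` shows how much more a perfect measure would deliver. The size of the gain depends entirely on the assumed parameters.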

10 Just five percent of recently hired teachers claim to have received a reduced workload in the first year of their careers according to the 2003 School and Staffing Survey.

■ We thank , Robert Gibbons, , Bentley Macleod, Richard Murnane, conference participants at Columbia and the University of Chicago, and the editors of this journal for many useful comments and suggestions. Financial support for Doug Staiger was provided by Institute of Education Sciences, U.S. Department of Education grant #R305C090023.
