Regression Discontinuity Design on Model Schools' Value-Added Effects
Regression Discontinuity Design on Model Schools' Value-Added Effects: Empirical Evidence from Rural Beijing

Kai Hong
CentER Graduate School, Tilburg University
April 2010

Abstract

In this study we examine the value-added effects of model schools on students' achievements. We apply a regression discontinuity design to data from Daxin District in rural Beijing. Both parametric and nonparametric approaches are adopted, and the estimation results are heterogeneous. For science students, significant positive effects ranging from 20 to 80 are found, while for art students we find little evidence to support positive effects. Three robustness checks, covering additional covariates, school-specific cutoffs and peer effects, are also performed and further confirm the robustness of our results. Two policy-related issues are also discussed: compliers versus noncompliers, and the partially fuzzy design, by which we find smaller effects for the full population and almost the same effects for eligible participants.

Keywords: regression discontinuity design, value-added effect, model school, economics of education

1 Introduction

How to allocate limited educational resources to obtain maximum achievement has long been a controversial issue. On the one hand, education is crucial for further development. For example, in economics, education plays an important role in both promoting economic growth and eliminating inequality [1]. On the other hand, educational resources are usually limited. For example, public spending on education in most countries accounts for less than 5% of GDP. In China in 2005 this percentage was only 2.82%. How to balance these two considerations and guarantee that limited educational resources are used efficiently is the central problem in the economics of education. In China, establishing key schools or model schools is recognized as an effective way to solve this problem. Since 1994, policies regarding model schools have been implemented.
Now, 15 years later, these model schools perform very well in almost all fields, especially in students' achievements. However, because whether a student can be enrolled in a model school largely depends on his or her previous performance, students in model schools usually have excellent achievements even before they enter their current schools and would likely have obtained similar achievements in normal schools. In that case, given the generous educational inputs these schools receive, whether the value-added effects of model schools on students' achievements are large enough is still questionable. In this paper we examine the effects of model high schools on students' achievements in rural Beijing with the regression discontinuity design (RD design for short).

The effects of teachers or schools have been drawing attention from researchers for several decades. Such effects are usually known as "value-added" effects and are interpreted from either a descriptive or a causal perspective, the latter as treatment effects of certain policies concerning teachers or schools. Many specific topics, from theory to practice, are involved in this field, such as the realization of treatments, how to obtain reliable causal estimates of these treatment effects, and how to deal with certain specific econometric techniques [2]. The main question that needs to be answered is usually presented as "what are the effects on students of being in school A on their subsequent test scores", or "how much value has a particular school or teacher added to their students' test scores". To answer such questions we usually need to compare the post-test scores with test scores before the treatment assignment and identify to what extent the difference reflects a causal effect. Before going deeper into this field, it is necessary to review the methods used to identify and estimate causal effects, with an emphasis on a special one called regression discontinuity design.
The rest of the paper is organized as follows. The second section is an intensive review of the RD design in economics of education, including an introduction to randomized experiments, which are recognized as the standard method for estimating causal effects; the development of the RD design, which is commonly recognized as a quasi-experimental design; a few recent applications of this design to value-added analysis; and a brief conclusion with several relevant prospects. The third section introduces the empirical background, including the relevant exams in Beijing, the model school policies and a description of the data. The fourth section deals with evidence of validity, where discontinuities in variables are analyzed graphically and the density of the treatment-determining variable is further tested. The fifth section presents the empirical analysis. After an introduction to the RD design framework, both parametric and nonparametric estimations are performed. Robustness checks, including additional covariates, multiple cutoffs and peer effects, are also discussed. The sixth section discusses policy extensions dealing with compliers, noncompliers and eligible participants. The seventh section concludes.

[1] For the relation between education and economic growth, see Stevens and Weale (2003). For the relation between education and eliminating inequality, see Mickelson (1987).
[2] See Donald B. Rubin et al. (2004) for a short review of value-added assessment in education.

2 RD Design in Economics of Education

2.1 Basic Settings

Usually the RD design begins with a population of objects $N = \{1, 2, \ldots, I\}$. An object in it is denoted by $i$. Such objects can be individuals, households, schools and so on. For each $i$ several attributes are observed. One is the outcome $Y_i$, whose variation across objects we want to explain. Another is the treatment $T_i$.
If we assume for simplicity that there are only two levels of treatment, we have $T_i = 1$ for objects in the treatment group and $T_i = 0$ for those in the control group. Other characteristics are denoted by $X_i$. The treatment effect is measured by the difference between the outcomes of the same object in the treatment group and in the control group, with the other characteristics $X_i$ unchanged. For a given individual, we have the following formulation:

$Y_i = \alpha + \beta_i T_i + \epsilon_i$,

where $\alpha$ is a constant term and $\epsilon_i$ is the error term. If $T_i = 0$, we have $Y_i = \alpha + \epsilon_i$. So with the assumption that $E(\epsilon_i) = 0$, we have $\alpha = E(Y_i^0)$, where $Y_i^0$ is the outcome without the treatment, and $E(Y_i^1) = \alpha + E(\beta_i)$, where $Y_i^1$ is the outcome with the treatment. Then we have $E(Y_i^1) - E(Y_i^0) = E(\beta_i)$, which is the average treatment effect of the treatment $T$.

2.2 Random Experiment

The central idea of causal inference on treatment effects goes back to Rubin (1974), where the ideas of the randomized experiment and of potential outcomes were introduced. As pointed out there, the basic expression of the effect of a treatment $T$ on individual $i$ can be written as $Y_i^1 - Y_i^0$, where $Y_i^1$ and $Y_i^0$ are the outcomes with and without the treatment, respectively. However, it is usually impossible to observe both outcomes for a given individual simultaneously. Holland (1986) summarizes two potential ways to deal with this problem: the scientific solution and the statistical solution. Which one is useful depends on the validity of their assumptions [3].

In the scientific solution, some special assumptions, such as the untestable unit homogeneity assumption, are proposed. For example, we assume that both outcomes, without and with the treatment, are the same for two objects, which means that $Y_1^0 = Y_2^0$ and $Y_1^1 = Y_2^1$. Then, if object 1 is in the treatment group and object 2 is in the control group, we can observe $Y_1^1$ and $Y_2^0$, and the treatment effect can be measured by the observed values $Y_1^1 - Y_2^0$.
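The formulation of Section 2.1 and the observability problem described above can be illustrated with a minimal simulation sketch. All numerical values here (the constant 50, the mean effect 10, the standard deviations, the sample size and the seed) are hypothetical choices made for illustration only and do not come from the data used in this paper.

```python
import random
from statistics import mean

def simulate_potential_outcomes(n=10_000, alpha=50.0, seed=0):
    """Draw (Y0, Y1, beta) for n units from Y_i = alpha + beta_i * T_i + eps_i.

    The distributions below are illustrative assumptions, not estimates.
    """
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        eps = rng.gauss(0.0, 5.0)    # error term with E(eps) = 0
        beta = rng.gauss(10.0, 3.0)  # heterogeneous individual effect beta_i
        y0 = alpha + eps             # outcome without the treatment (T_i = 0)
        y1 = alpha + beta + eps      # outcome with the treatment (T_i = 1)
        units.append((y0, y1, beta))
    return units

def observed_outcome(y0, y1, t):
    """Only one potential outcome is observed for a unit, depending on T_i."""
    return y1 if t == 1 else y0

units = simulate_potential_outcomes()

# In a simulation both potential outcomes are known, so the identity
# E(Y1) - E(Y0) = E(beta) can be checked directly.
ate_from_outcomes = mean(y1 for _, y1, _ in units) - mean(y0 for y0, _, _ in units)
ate_from_betas = mean(b for _, _, b in units)
```

In real data only `observed_outcome` is available for each unit, which is why the identity cannot be checked directly and additional assumptions are needed.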
In the statistical solution, the average treatment effect is identified under certain conditions, such as the well-known randomization or independence assumption [4]. In more detail, individuals are assigned randomly so that the treatment is independent of all other variables, such as backgrounds or the outcomes themselves. Mathematically, we have the independence assumption $(Y^0, Y^1) \perp T$. In this case we have $E(Y^1) = E(Y^1 \mid T = 1)$ and $E(Y^0) = E(Y^0 \mid T = 0)$. Then the average treatment effect can be expressed as $E(Y^1 \mid T = 1) - E(Y^0 \mid T = 0)$.

Usually it is difficult to realize such pure random assignment, which calls for well-designed experiments. If such experiments are not available, selection bias in the average treatment effect may arise. One common case occurs when the assignment to treatment is determined by a predictor [5], which can be denoted by $S_i$. There is a cutoff point $S^0$ of this covariate [6], and units are treated if the value of the covariate is on one side of this point and are not treated if it is on the other side. So we have $T_i = 1$ if $S_i \geq S^0$ and $T_i = 0$ if $S_i < S^0$. This idea leads to another analytical framework, named regression discontinuity design. The following section reviews the development of this design, with a focus on applications to the economics of education.

[3] The former is commonly used in the science laboratory and the latter is usually preferred in social experiments.
[4] Of course, there are also some additional assumptions needed to make causal inference simpler, such as the Stable Unit Treatment Value Assumption (SUTVA) argued by Rubin (1986). The SUTVA contains two components: one is that all objects in a certain group, such as the treatment group or the control group, should receive the same treatment; the other is that the potential outcomes of a certain object should not be affected by the treatment status of another object.
[5] There are also many other methods to address the problem of non-random experiments. Sometimes researchers can replicate a random experiment by matching methods on observables, or by IV strategies on unobservables. For details see Heckman, Ichimura and Todd (1998) and Imbens and Angrist (1994) respectively.
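The contrast between random assignment and assignment by a cutoff can be sketched in a small simulation. This is only an illustration under stated assumptions: the data-generating process, the constant effect of 10, the cutoff at 0 and all function names are hypothetical and are not taken from this paper's data or estimation procedure.

```python
import random
from statistics import mean

def naive_difference(outcomes, treated):
    """Difference in mean observed outcomes: E(Y | T = 1) - E(Y | T = 0)."""
    y_treated = [y for y, t in zip(outcomes, treated) if t == 1]
    y_control = [y for y, t in zip(outcomes, treated) if t == 0]
    return mean(y_treated) - mean(y_control)

def simulate(n=20_000, true_effect=10.0, cutoff=0.0, seed=1):
    rng = random.Random(seed)
    # S_i: the predictor (e.g. a previous score); it also drives the outcome.
    s = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y0 = [50.0 + 5.0 * si + rng.gauss(0.0, 2.0) for si in s]
    y1 = [v + true_effect for v in y0]  # constant effect, for simplicity

    # Random assignment: T is independent of (Y0, Y1).
    t_random = [rng.randint(0, 1) for _ in range(n)]
    # Cutoff assignment: T_i = 1 if S_i >= S^0, else 0.
    t_cutoff = [1 if si >= cutoff else 0 for si in s]

    y_random = [b if t == 1 else a for a, b, t in zip(y0, y1, t_random)]
    y_cutoff = [b if t == 1 else a for a, b, t in zip(y0, y1, t_cutoff)]
    return (naive_difference(y_random, t_random),
            naive_difference(y_cutoff, t_cutoff))

est_random, est_cutoff = simulate()
```

Under these assumptions `est_random` should lie close to the assumed effect, while `est_cutoff` overstates it, because treated units have higher values of $S_i$ and therefore higher outcomes even without the treatment. This is the selection bias that motivates comparing units close to the cutoff.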