Fixed Effects Models and Neighborhood Studies
Vartanian: SW 541

There are a number of ways of using fixed effect models.

Say you want to examine the effects of union membership on particular outcomes. You have a theoretical reason for believing that people who work at specific plants or factories have different outcomes than people who work at other plants or factories. You may have information only on the plant or factory where people work, rather than on the characteristics of that plant or factory that would help explain these differences. You may then simply include a dummy variable for where a person works, which will “difference out” the effects of working in specific places. Thus, in this type of model, you simply use dummy variables.

You are examining the effects of a number of variables on welfare outcomes. You believe that states differ widely in the way they treat welfare recipients but don’t have comparable data for the different states. You could then include 49 state dummy variables, with one state being the reference group (or 50 dummies, with Washington, D.C. being one of them and another state being the reference group), which will factor out all differences between states in your analysis. This again is called a fixed effect model.

You believe that neighborhood conditions affect outcomes. You have a sample of siblings and wish to examine how growing up in particular types of neighborhoods affects adult outcomes. You can examine the difference in neighborhood conditions for the siblings and the difference in their adult outcomes. Differences in neighborhood conditions between siblings may arise because the household moved to a different neighborhood or because conditions in the same neighborhood changed between one sibling's childhood and another's. We can then relate the difference in neighborhood conditions (along with the differences in all other independent variables) to the difference in outcomes.

You can examine the same individual over repeated time periods and have an individual-based fixed effect model. Thus, instead of looking at siblings, you can look at individuals over periods of time and see how living in different types of conditions affects outcomes. Variables such as race will be differenced out because the same individual will have the same race over time. Only if conditions change for the individual will we have non-zero data for the independent or dependent variables (depending on how these are measured) for that individual.

Much of the rest of this handout is taken from Vartanian and Buck (2005), “Childhood and Adolescent Neighborhood Effects on Adult Income: Using Siblings to Examine Differences in OLS and Fixed Effect Models.”

Fixed effect models help reduce omitted variable bias (omitted variables end up in the error term and, if they are correlated with the included variables, produce biased b coefficients). They may also help reduce the effects of endogeneity. Endogeneity may occur when we examine non-random processes and attribute the effects of these processes to a particular characteristic. For example, when examining neighborhoods, we should consider the possibility that neighborhoods affect individuals at the same time that individuals affect neighborhoods. Simultaneity refers to the exogenous and endogenous social interactions that presumably occur between individuals, families, and their neighborhoods: the extent to which people influence their neighborhoods and vice versa (Duncan, Connell & Klebanov, 1997). These types of transactional relationships are difficult to estimate and thus present empirical challenges in any neighborhood research (Duncan & Raudenbush, 2001).

Omitted variable bias, also known as unmeasured or unobserved variable bias, occurs when a study lacks important information, primarily because of data set constraints (Leventhal & Brooks-Gunn, 2000). Selection bias, or endogeneity, occurs when study participants, instead of being randomly assigned to neighborhoods, choose them for unmeasured reasons (Duncan, et al., 1997). The effect of the unknown factors that influence residential decisions may therefore be improperly assumed to be a neighborhood effect. Fixed effects models address the endogeneity that arises because neighborhoods are not randomly assigned and families choose them for unobserved reasons.

Fixed effects models present interesting options for reducing bias. They are often used with sibling data so that permanent unobserved family characteristics are held constant while families move to different neighborhoods and neighborhood characteristics change. Some studies that have compared OLS estimates to fixed effects estimates have found that OLS models tend to overstate neighborhood effects (Levy & Duncan, 2000; Plotnick & Hoffman, 1999; Weinberg, et al. 2002). Aaronson (1998), however, found that “measuring the unmeasured” does not necessarily show that OLS results are biased upward. He finds that estimates for the effects of the neighborhood poverty rate on educational outcomes are sometimes even larger in the fixed effects models than in the OLS models. It cannot be assumed, therefore, that leaving unobservable variables out routinely leads to overestimated neighborhood effects; rather, there is the potential for both upward and downward biases.

In this work, OLS and fixed effect models are compared. Statistically speaking, the general form for determining neighborhood effects in OLS models is:

$$Y_i = \alpha + \beta_1 FP_i + \beta_2 FIV_i + \gamma N_i + \mu_i,$$

where FP is the set of permanent family variables, FIV is the set of varying family and individual variables, N is the set of neighborhood variables, μ is the error term, β1, β2, and γ are the coefficients for the respective permanent family, varying family and individual, and neighborhood variables, and α is the intercept. A fixed effect model is better able to control for differences among families than OLS models. The fixed effect model takes on the following form:

$$Y_{ij} = \alpha_j + \beta_1 FP_j + \beta_2 FIV_{ij} + \gamma N_{ij} + \mu_{ij},$$

where i denotes the individual child and j denotes the child’s family. The constant now takes on a family-specific value. Having a different constant for each family provides a control for those factors that are permanent features of families, or that are present in the family for each of the children being examined, but are not explicitly included in the statistical model. Such uncontrolled factors may include family values or aspirations for the children of the family, parental skills not captured by the educational variables included in the models, or the emotional well-being of the parents.

As Levy and Duncan (2000) and others (Aaronson 1998; Plotnick and Hoffman 1999) note, there are permanent components and variable components to family characteristics. The family fixed effect gets differenced out (that is, held constant) in fixed effect models; one example is the effect of parental intelligence. Varying family effects (such as those of income) remain because such effects will be different for each child. Thus, the fixed effect model does not control for unobserved family variables that vary across children, but it does control for unobserved variables that are permanent or the same across children. These unobserved permanent family variables may bias the comparable OLS estimates.

In fixed effect models, sibling differences in unobserved family characteristics bias the key coefficients only if such differences affect the dependent variable (in this case, the log of the family income-to-needs ratio as an adult) and are correlated with sibling differences in the characteristics of the neighborhood. These unobserved variables among siblings include ability, ambition, and parental expectations (Aaronson 1998). As Aaronson points out, parents may learn how to parent better in caring for subsequent children and, thus, may choose better neighborhoods (e.g., for better schools) with each additional child. Thus, younger children may benefit from this parental learning, biasing the estimates. Also, if parents favor one child over another, the family may move to a neighborhood with better schools and other such characteristics when the favored child starts attending school. If parents favor that child in other ways (hiring tutors, giving more homework help) that are unobserved, the effects of all of these factors will be attributed to the neighborhood conditions.

Like Aaronson (1998), the current article includes controls for birth order of children and, thus, explicitly controls for possible better parenting for subsequent children. However, other unobserved variables may still affect children differently and may be reflected in the neighborhoods where they live. In addition, the PSID does not contain enough information to distinguish between biological and nonbiological siblings in all cases. Thus, fixed effect models cannot control for these differential and unobservable family factors. Accordingly, caution must be used in interpreting the estimates from the fixed effect models. Expressed in deviations from family means, the fixed effect model takes the following form:

$$Y_{ij} - Y_{.j} = (\alpha_j - \alpha_j) + \beta_1 (FP_j - FP_{.j}) + \beta_2 (FIV_{ij} - FIV_{.j}) + \gamma (N_{ij} - N_{.j}) + (\mu_{ij} - \mu_{.j}).$$

The constant and the permanent family factors drop out of the equation because they are the same for every sibling in a family, so each equals its own family mean. Also in this equation, the term (FIVij – FIV.j), as well as the other subtractions of .j terms, indicates that family means are subtracted from individual values for both independent and dependent variables.

In the current model, the dependent variable is the log of the average family income-to-needs ratio when the child becomes an adult and is at least 25 years old. The family income-to-needs ratio is a measure of income relative to the poverty line that adjusts for family size. The value of this variable is averaged over all years when the individual is 25 years or older. This fixed effect model is estimated by regressing the differences in sibling outcomes on the differences in their observed family, neighborhood, and other variables.

Both OLS regression analysis and the fixed effect models are used to examine the dependent variable, the log of family income-to-needs as an adult. OLS models are used as comparisons to the fixed effect models to determine whether OLS modeling produces large differences in the coefficient estimates for the neighborhood and other variables relative to the fixed effect models. Bivariate and multivariate models that control for a number of family and individual factors during childhood are used to determine whether the independent effects of family-varying variables affect the relationship between neighborhood variables and the dependent variable. A set of models also controls for a number of adult factors, such as marital status, area of residence, and family size. The multivariate fixed effect models do not explicitly control for permanent parental variables, such as level of education for the head of household, race, and region of residence, because only variables that vary across siblings can have non-zero values once family means are subtracted.
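The subtraction of family means described above can be sketched in Stata on simulated data. This is a minimal illustration, not the analysis from the article: the variable names (famid, alpha, nbhd, y), the sample size, and the coefficient values are all made up, and alpha stands in for an unobserved permanent family factor that is correlated with neighborhood quality.

```stata
* Sketch: family fixed effects by hand (deviations from family means).
* All names and numbers below are hypothetical.
clear
set seed 541
set obs 300                           // 300 simulated families
generate famid = _n
generate double alpha = rnormal()     // unobserved permanent family factor
expand 3                              // three siblings per family
* Neighborhood quality differs across siblings and is correlated with alpha
generate double nbhd = 0.5*alpha + rnormal()
generate double y = 1 + 0.3*nbhd + alpha + rnormal()

* Pooled OLS ignores alpha, so its slope picks up part of the family effect
regress y nbhd, cluster(famid)
scalar b_ols = _b[nbhd]

* Subtract family means from the outcome and from the regressor
bysort famid: egen double ybar = mean(y)
bysort famid: egen double nbar = mean(nbhd)
generate double y_dev = y - ybar
generate double n_dev = nbhd - nbar

* Regression on the deviations: alpha has been differenced out
regress y_dev n_dev, noconstant
scalar b_manual = _b[n_dev]

* Built-in within estimator for comparison
xtset famid
xtreg y nbhd, fe
display "OLS: " b_ols "   by hand: " b_manual "   xtreg, fe: " _b[nbhd]
```

Under these assumptions the hand-demeaned regression and xtreg, fe report the same slope on nbhd, while the pooled OLS slope is larger because it absorbs part of alpha. The standard errors from the hand-demeaned regression do not account for the estimated family means, so the ones from xtreg are the ones to use.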

****************************************************************************

The following is taken from the Princeton University Library, Data and Statistics Services (see the web address below).

Fixed effects regression is the model to use when you want to control for omitted variables that differ between cases but are constant over time. It lets you use the changes in the variables over time to estimate the effects of the independent variables on your dependent variable, and is the main technique used for analysis of panel data.

The command for a linear regression on panel data with fixed effects in Stata is xtreg with the fe option, used like this:

xtreg dependentvar independentvar1 independentvar2 independentvar3 ... , fe

If you prefer to use the menus, the command is under Statistics > Cross-sectional time series > Linear models > Linear regression.

This is equivalent to generating dummy variables for each of your cases and including them in a standard linear regression to control for these fixed "case effects". It works best when you have relatively fewer cases and more time periods, as each dummy variable removes one degree of freedom from your model.
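The equivalence between case dummies and xtreg with the fe option can be checked on a small simulated panel. The sketch below is only an illustration: the variable names (id, t, u, x, y) and the data-generating values are hypothetical, and areg is included as a third route that absorbs the dummies instead of reporting them.

```stata
* Sketch: case dummies in OLS versus xtreg, fe (hypothetical simulated panel)
clear
set seed 2005
set obs 50                            // 50 cases
generate id = _n
generate double u = rnormal()         // unobserved, time-constant case effect
expand 6                              // six time periods per case
bysort id: generate t = _n
generate double x = 0.7*u + rnormal()
generate double y = 2 + 0.5*x + u + rnormal()

* (1) One dummy per case, written with factor-variable notation
regress y x i.id
scalar b_lsdv = _b[x]

* (2) Same model, with the case dummies absorbed rather than listed
areg y x, absorb(id)
scalar b_areg = _b[x]

* (3) The within (fixed effects) estimator
xtset id t
xtreg y x, fe
display "dummies: " b_lsdv "   areg: " b_areg "   xtreg, fe: " _b[x]
```

All three commands should give the same coefficient on x. The dummy-variable version uses up one degree of freedom per case, which is the cost the paragraph above refers to.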

Between Effects

Regression with between effects is the model to use when you want to control for omitted variables that change over time but are constant between cases. It allows you to use the variation between cases to estimate the effect of the omitted independent variables on your dependent variable.

The command for a linear regression on panel data with between effects in Stata is xtreg with the be option.

Running xtreg with between effects is equivalent to taking the mean of each variable for each case across time and running a regression on the collapsed dataset of means. As this results in loss of information, between effects are not used much in practice. Researchers who want to look at time effects without considering panel effects generally will use a set of time dummy variables, which is the same as running time fixed effects.
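The statement that between effects amount to a regression on case means can be illustrated with a short simulated example. The names (id, t, u, x, y) and values are hypothetical, and the panel is balanced with no missing data so that collapsing to means matches what xtreg does internally.

```stata
* Sketch: xtreg, be versus OLS on a dataset collapsed to case means
clear
set seed 1997
set obs 40                            // 40 cases
generate id = _n
generate double u = rnormal()
expand 5                              // five time periods per case
bysort id: generate t = _n
generate double x = u + rnormal()
generate double y = 1 + 0.5*x + u + rnormal()

xtset id t
xtreg y x, be                         // between-effects estimator
scalar b_be = _b[x]

* Replace the data with one row per case holding the means of y and x
collapse (mean) y x, by(id)
regress y x
display "xtreg, be: " b_be "   OLS on means: " _b[x]
```

The two slopes should agree, and the collapsed dataset has only 40 observations, which shows the loss of information mentioned above.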

The between effects estimator is mostly important because it is used to produce the random effects estimator.

Random Effects

If you have reason to believe that some omitted variables may be constant over time but vary between cases, and others may be fixed between cases but vary over time, then you can include both types by using random effects. Stata's random-effects estimator is a weighted average of fixed and between effects.

The command for a linear regression on panel data with random effects in Stata is xtreg with the re option.

Choosing Between Fixed and Random Effects

The generally accepted way of choosing between fixed and random effects is running a Hausman test.

Statistically, fixed effects are always a reasonable thing to do with panel data (they always give consistent results) but they may not be the most efficient model to run. Random effects will give you better P-values as they are a more efficient estimator, so you should run random effects if it is statistically justifiable to do so.

The Hausman test checks a more efficient model against a less efficient but consistent model to make sure that the more efficient model also gives consistent results.

To run a Hausman test comparing fixed with random effects in Stata, you need to first estimate the fixed effects model, save the coefficients so that you can compare them with the results of the next model, estimate the random effects model, and then do the comparison.

. xtreg dependentvar independentvar1 independentvar2 independentvar3 ... , fe
. estimates store fixed
. xtreg dependentvar independentvar1 independentvar2 independentvar3 ... , re
. estimates store random
. hausman fixed random

The Hausman test tests the null hypothesis that the coefficients estimated by the efficient random effects estimator are the same as the ones estimated by the consistent fixed effects estimator. If they are (insignificant P-value, Prob>chi2 larger than .05) then it is safe to use random effects. If you get a significant P-value, however, you should use fixed effects.
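The sequence above can be tried end to end on one of Stata's example datasets. The sketch below is an illustration only: the choice of the nlswork dataset (loaded with webuse, which needs an internet connection) and of the regressors age, tenure, and union is an assumption made for the example, not part of the source, and the .05 cutoff is the one stated in the text.

```stata
* Sketch: Hausman test on an example panel (dataset and regressors are illustrative)
webuse nlswork, clear
xtset idcode year

quietly xtreg ln_wage age tenure union, fe
estimates store fixed

quietly xtreg ln_wage age tenure union, re
estimates store random

hausman fixed random
scalar p_h = r(p)                     // keep the P-value before r() is overwritten

* Apply the decision rule described in the text
if p_h < 0.05 {
    display "Significant difference in coefficients: use fixed effects"
}
else {
    display "No significant difference: random effects are acceptable"
}
```

The stored names fixed and random are arbitrary labels; hausman expects the consistent estimator first and the efficient one second, which is the order used here.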

Source: http://dss.princeton.edu/online_help/analysis/panel.htm

****************************************************************************

You can also run these models with dummy dependent variables, but only when there is variation in outcomes within a set of siblings. Sibling sets in which every sibling has the same outcome are not used because there is no variability in the outcome.
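One common way to fit a fixed effect model with a dummy dependent variable in Stata is conditional (fixed effects) logit. The handout does not say which estimator was used, so the sketch below is an assumption, run on simulated sibling data with hypothetical names (famid, alpha, nbhd, grad). It also counts the sibling sets that drop out because their outcome never varies.

```stata
* Sketch: fixed effects logit on simulated sibling data (all names hypothetical)
clear
set seed 64
set obs 500                           // 500 simulated families
generate famid = _n
generate double alpha = rnormal()     // unobserved permanent family factor
expand 3                              // three siblings per family
generate double nbhd = 0.5*alpha + rnormal()
generate byte grad = runiform() < invlogit(0.8*nbhd + alpha)

* How many sibling sets have no variation in the outcome?
bysort famid: egen byte grad_min = min(grad)
bysort famid: egen byte grad_max = max(grad)
egen byte onepf = tag(famid)          // flags one row per family
count if onepf & grad_min == grad_max

* Two equivalent ways to request the conditional logit
xtset famid
xtlogit grad nbhd, fe
clogit grad nbhd, group(famid)
```

Both commands report how many groups were dropped for having all-positive or all-negative outcomes, and the estimation sample is correspondingly smaller than the full 1,500 observations.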

While you can run these models in SAS and SPSS, I find that the easiest way of running fixed effect models is in Stata. The code for this in Stata is the following:

xtreg logefmns femhh move birthord varinc ynghh fipln perafdc hdag unemrate ///
    limited ownhome gotsepdv gotwid gotmarr nevmarr sepdiv widow kds und6 ///
    maxage agesq female yr6872 bigcity urbany city3y suby ///
    if count>1, fe i(newid)

xtreg indicates a fixed effect model (or another similar type of model). The first variable after xtreg is the dependent variable (the log of fmns). All other variables are independent variables. On the last line, if count>1 indicates that the regression should use only cases where there are at least two family members. Count is a variable I created to indicate the number of brothers and sisters in the family who lived together throughout their childhood years. At the end of the statement, fe indicates a fixed effect model and i(newid) indicates that observations should be grouped by a variable I created called newid.

I am looking at the effects of the % of female headed families during childhood on the log of FMNS as adults.

First, the OLS model:

reg logefmns femhh, cluster(newid)

Regression with robust standard errors                 Number of obs =    4627
                                                       F(  1,  2238) =  353.43
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1485
Number of clusters (newid) = 2239                      Root MSE      =  .70634

------------------------------------------------------------------------------
             |               Robust
    logefmns |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       femhh |  -.0214611   .0011416   -18.80   0.000    -.0236998   -.0192225
       _cons |   1.141882   .0225106    50.73   0.000     1.097739    1.186026
------------------------------------------------------------------------------

Next, the Fixed Effects Model.

xtreg logefmns femhh if count>1, fe i(newid)

Fixed-effects (within) regression               Number of obs      =      3652
Group variable (i): newid                       Number of groups   =      1264

R-sq:  within  = 0.0079                         Obs per group: min =         2
       between = 0.2380                                        avg =       2.9
       overall = 0.1596                                        max =        10

                                                F(1,2387)          =     18.95
corr(u_i, Xb)  = 0.3018                         Prob > F           =    0.0000

------------------------------------------------------------------------------
    logefmns |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       femhh |  -.0102226   .0023484    -4.35   0.000    -.0148277   -.0056175
       _cons |   .9261333   .0451412    20.52   0.000     .8376132    1.014653
-------------+----------------------------------------------------------------
     sigma_u |  .57748155
     sigma_e |  .56947051
         rho |  .50698429   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(1263, 2387) =     2.70          Prob > F = 0.0000

The coefficient estimate on femhh in the FE model (-.0102) is around ½ of what the OLS model shows (-.0215). This is in part due to the fact that the FE model is essentially controlling for the effects of unobserved “permanent” family variables that are shared among siblings.
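The comparison can be checked with two lines of arithmetic in Stata, using only numbers copied from the two outputs above. The first line is the ratio of the FE slope to the OLS slope; the second recomputes rho from sigma_u and sigma_e, using the usual definition of rho as the share of total variance due to u_i.

```stata
* Ratio of the FE coefficient on femhh to the OLS coefficient
display -.0102226 / -.0214611

* rho = sigma_u^2 / (sigma_u^2 + sigma_e^2)
display .57748155^2 / (.57748155^2 + .56947051^2)
```

The ratio comes to about 0.48, and the second expression reproduces the reported rho of about 0.507, meaning roughly half of the variance not explained by femhh is attributed to differences between families.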

Table 2
Coefficient Estimates for OLS and Fixed Effect Bivariate Models for the Natural Log of Family Income-to-Needs Ratio as an Adult

Variable | Ages 0-4 | Ages 5-8 | Ages 9-13 | Ages 14-18

OLS models:
Female headed families (%) | -.021 (.002)*** | -.022 (.002)*** | -.022 (.001)*** | -.022 (.001)***
Households receiving public assistance income (%) | -.031 (.004)*** | -.031 (.002)*** | -.031 (.002)*** | -.031 (.002)***
In poverty (%) | -.020 (.002)*** | -.022 (.001)*** | -.022 (.001)*** | -.023 (.001)***
Household income < $15,000 a (%) | -2.667 (.279)*** | -2.957 (.206)*** | -2.832 (.171)*** | -2.876 (.152)***
Household income > $60,000 a (%) | 1.595 (.138)*** | 1.722 (.111)*** | 1.712 (.089)*** | 1.787 (.079)***
Income above respondent’s income (%) | -.550 (.097)*** | -1.001 (.084)*** | -1.051 (.070)*** | -1.136 (.060)***
Income same as respondent’s (%) | 1.536 (.390)*** | 1.768 (.318)*** | 1.677 (.234)*** | 1.159 (.229)***
Neighborhood index 1 | .136 (.012)*** | .154 (.009)*** | .157 (.007)*** | .164 (.007)***
Neighborhood index 2 | -.087 (.015)*** | -.163 (.017)*** | -.169 (.015)*** | -.158 (.013)***
Splines (% of neighborhood):
Top 10 | -.328 (.132)** | -.265 (.114)** | -.098 (.107) | -.109 (.120)
Top 11-25 | -.195 (.068)** | -.228 (.049)*** | -.351 (.042)*** | -.347 (.036)***
Top 26-50 | -.164 (.052)** | -.304 (.042)*** | -.215 (.032)*** | -.248 (.032)***
Top 51-75 | -.111 (.048)* | .014 (.047) | -.068 (.035)* | -.008 (.031)
Top 76-90 | -.187 (.119) | -.015 (.108) | -.222 (.101)* | -.242 (.090)**
Bottom 10 | -.067 (.044) | -.096 (.031)** | -.031 (.028) | -.058 (.024)*
Log of family income-to-needs ratio | .514 (.033)*** | .569 (.027)*** | .549 (.023)*** | .535 (.019)***
N | 1,660 | 2,683 | 3,818 | 4,949
Number of groups | 1,199 | 1,660 | 2,043 | 2,319
R2 for neighborhood index models | .1258 | .1785 | .1886 | .2004

Fixed effect models:
Female headed families (%) | -.006 (.011) | -.001 (.006) | -.010 (.003)** | -.010 (.003)***
Households receiving public assistance income (%) | -.029 (.013)* | .001 (.007) | -.011 (.005)** | -.012 (.003)***
In poverty (%) | -.005 (.009) | .001 (.005) | -.006 (.003)+ | -.008 (.003)**
Household income < $15,000 a (%) | -.745 (1.142) | -.555 (.571) | -.665 (.415) | -.576 (.313)+
Household income > $60,000 a (%) | 1.084 (.691) | .543 (.396) | .224 (.277) | .622 (.209)**
Income above respondent’s income (%) | .806 (.320)** | -.421 (.208)* | -.312 (.146)* | -.028 (.113)
Income same as respondent’s (%) | 1.470 (1.395) | 1.663 (.759)* | .600 (.534) | .496 (.404)
Neighborhood index 1 | .072 (.062) | .057 (.033)+ | .074 (.024)** | .071 (.018)***
Neighborhood index 2 | .067 (.066) | -.135 (.042)*** | -.073 (.031)* | -.026 (.022)
Splines (% of neighborhood):
Top 10 | .091 (.719) | -.135 (.323) | -.230 (.261) | -.061 (.281)
Top 11-25 | -.234 (.239) | -.251 (.168) | -.038 (.121) | -.181 (.091)*
Top 26-50 | -.092 (.140) | -.149 (.094) | -.027 (.068) | .005 (.055)
Top 51-75 | .105 (.101) | .010 (.063) | -.076 (.044)+ | -.018 (.034)
Top 76-90 | .467 (.337) | .100 (.188) | -.060 (.137) | -.145 (.110)
Bottom 10 | -.468 (.166)** | .022 (.065) | -.079 (.049) | -.094 (.038)**
Log of family income-to-needs ratio | -.298 (.171) | .281 (.099)** | .079 (.066) | -.105 (.045)*
N | 831 | 1,723 | 2,795 | 3,961
Number of groups | 370 | 700 | 1,020 | 1,331
Average number of observations per group | 2.2 | 2.5 | 2.7 | 3.0
Within R2 for neighborhood index models | .0068 | .0109 | .0069 | .0058

***p <= .001; **p <= .01;*p <= .05; +p <= .10; all for two-tailed tests. a Income expressed in 2001 dollars. Note.—Neighborhood index variables are determined through principal components analysis (see Appendix table 1). Generally, each coefficient and standard error in the table comes from a separate regression model. Neighborhood variables run in the same models include the two neighborhood index variables; income above respondent’s income (%) and same income as respondent’s (%); and the spline variables.

Table 3
Coefficient Estimates and Standard Errors for Multivariate Models for the Natural Log of Family Income-to-Needs Ratio as an Adult

Variable | Ages 0-4 | Ages 5-8 | Ages 9-13 | Ages 14-18

OLS models:
Female headed families (%) | -.002 (.003) | -.001 (.002) | -.003 (.002)+ | -.005 (.001)***
Households receiving public assistance income (%) | -.009 (.004)* | -.005 (.003)+ | -.004 (.002) | -.006 (.002)***
In poverty (%) | -.003 (.003) | -.003 (.002)+ | -.004 (.001)** | -.005 (.001)***
Household income < $15,000 a (%) | -.643 (.342)+ | -.646 (.246)** | -.444 (.187)* | -.555 (.166)***
Household income > $60,000 a (%) | .378 (.170)* | .335 (.137)** | .311 (.112)** | .442 (.098)***
Income above respondent’s income (%) | .214 (.114)+ | -.103 (.100) | -.222 (.083)** | -.255 (.070)***
Income same as respondent’s (%) | -.102 (.397) | .591 (.311)+ | .581 (.244)* | .184 (.228)
Neighborhood index 1 | .023 (.016) | .038 (.012)*** | .042 (.010)*** | .049 (.009)***
Neighborhood index 2 | .024 (.019) | -.046 (.017)** | -.064 (.016)*** | -.049 (.013)***
Splines (% of neighborhood):
Top 10 | -.163 (.130) | -.098 (.111) | .004 (.115) | .033 (.128)
Top 11-25 | -.021 (.062) | -.050 (.049) | -.137 (.041)*** | -.096 (.036)**
Top 26-50 | .008 (.054) | -.078 (.041)+ | -.013 (.033) | -.069 (.029)*
Top 51-75 | -.082 (.044)+ | .018 (.041) | -.052 (.031)+ | .018 (.028)
Top 76-90 | .056 (.119) | .101 (.101) | -.085 (.091) | -.049 (.083)
Bottom 10 | -.059 (.045) | -.048 (.031) | .015 (.027) | -.032 (.023)
Log of family income-to-needs ratio | .177 (.073)** | .296 (.043)*** | .259 (.034)*** | .292 (.029)***
N | 1,660 | 2,683 | 3,818 | 4,949
Number of groups | 1,199 | 1,660 | 2,043 | 2,319
R2 for neighborhood index models | .2423 | .2811 | .2842 | .2990

Fixed effect models:
Female headed families (%) | -.004 (.013) | .001 (.006) | -.007 (.004)+ | -.006 (.003)*
Households receiving public assistance income (%) | -.030 (.015)* | .002 (.007) | -.006 (.005) | -.008 (.004)*
In poverty (%) | -.010 (.009) | .001 (.005) | -.007 (.004)* | -.008 (.003)***
Household income < $15,000 a (%) | -1.912 (1.244) | -.810 (.615) | -.903 (.437)* | -.768 (.326)*
Household income > $60,000 a (%) | 1.124 (.738) | .483 (.416) | .061 (.296) | .399 (.219)+
Income above respondent’s income (%) | 1.103 (.517)* | -.269 (.292) | -.332 (.193)+ | -.063 (.132)
Income same as respondent’s (%) | 1.656 (1.639) | 1.671 (.875)+ | .412 (.613) | -.018 (.461)
Neighborhood index 1 | .109 (.066)+ | .059 (.036)+ | .064 (.026)* | .054 (.019)**
Neighborhood index 2 | .018 (.078) | -.118 (.044)** | -.066 (.033)* | -.010 (.023)
Splines (% of neighborhood):
Top 10 | .333 (.728) | -.215 (.331) | -.280 (.266) | .058 (.288)
Top 11-25 | -.141 (.250) | -.183 (.172) | -.013 (.124) | -.121 (.093)
Top 26-50 | -.111 (.146) | -.124 (.096) | -.011 (.069) | .015 (.055)
Top 51-75 | .136 (.102) | .018 (.064) | -.082 (.044)+ | -.019 (.034)
Top 76-90 | .461 (.360) | .121 (.191) | -.040 (.139) | -.162 (.110)
Bottom 10 | -.541 (.177)** | -.007 (.066) | -.068 (.051) | -.068 (.040)+
Log of family income-to-needs ratio | -.379 (.202)+ | .284 (.120)* | .170 (.081)* | .008 (.060)
N | 831 | 1,723 | 2,795 | 3,961
Number of groups | 370 | 700 | 1,020 | 1,331
Average number of observations per group | 2.2 | 2.5 | 2.7 | 3.0
Within R2 for neighborhood index models | .0780 | .0524 | .0313 | .0273

Source: Panel Study of Income Dynamics and the 1970, 1980, and 1990 Census. ***p <= .001; **p <= .01; *p <= .05; +p <= .10; all for two-tailed tests. a Income expressed in 2001 dollars. Note.—All models control for the full set of control variables (see the first two columns of Appendix tables 3 to 6 for a full list of variables included in each of the models). Neighborhood index variables are determined through principal components analysis (see Appendix table 1). Generally, each coefficient and standard error in the table comes from a separate regression model. Neighborhood variables run in the same models include the two neighborhood index variables; income above respondent’s income (%) and same income as respondent’s (%); and the spline variables.

Table 4
Coefficient Estimates and Standard Errors for Multivariate Models for the Natural Log of Family Income-to-Needs Ratio as an Adult, with Adult Variables Included

Variable | Ages 0-4 | Ages 5-8 | Ages 9-13 | Ages 14-18

OLS models:
Female headed families (%) | -.001 (.003) | .001 (.002) | -.002 (.001) | -.004 (.001)***
Households receiving public assistance income (%) | -.008 (.003)* | -.002 (.002) | -.001 (.002) | -.004 (.002)*
In poverty (%) | -.002 (.002) | -.002 (.002) | -.003 (.001)* | -.004 (.001)***
Household income < $15,000 a (%) | -.494 (.303)+ | -.341 (.212) | -.320 (.155)* | -.409 (.143)**
Household income > $60,000 a (%) | .336 (.149)* | .195 (.117)+ | .233 (.093)** | .334 (.082)***
Income above respondent’s income (%) | .170 (.102)+ | -.081 (.084) | -.157 (.068)* | -.178 (.057)**
Income same as respondent’s (%) | -.183 (.355) | .289 (.266) | .292 (.203) | -.009 (.190)
Neighborhood index 1 | .018 (.014) | .018 (.010)+ | .028 (.008)*** | .034 (.007)***
Neighborhood index 2 | .023 (.017) | -.027 (.015)+ | -.040 (.013)** | -.028 (.011)**
Splines (% of neighborhood):
Top 10 | -.175 (.108)+ | -.027 (.093) | .039 (.092) | .025 (.102)
Top 11-25 | -.003 (.056) | -.035 (.041) | -.087 (.034)** | -.050 (.030)+
Top 26-50 | .005 (.051) | -.060 (.036)+ | -.013 (.028) | -.062 (.025)**
Top 51-75 | -.043 (.040) | .036 (.035) | -.027 (.027) | .005 (.024)
Top 76-90 | .070 (.101) | .132 (.084) | -.067 (.077) | -.033 (.070)
Bottom 10 | -.067 (.040)+ | -.042 (.025)+ | .012 (.022) | -.017 (.019)
Log of family income-to-needs ratio | .152 (.068)* | .231 (.037)*** | .192 (.030)*** | .224 (.024)***
N | 1,660 | 2,683 | 3,818 | 4,949
Number of groups | 1,199 | 1,660 | 2,043 | 2,319
R2 for neighborhood index models | .3800 | .4373 | .4436 | .4671

Fixed effect models:
Female headed families (%) | -.005 (.012) | .004 (.006) | -.008 (.004)* | -.006 (.003)*
Households receiving public assistance income (%) | -.031 (.014)* | .004 (.007) | -.007 (.005) | -.005 (.003)
In poverty (%) | -.009 (.009) | .002 (.005) | -.008 (.003)** | -.006 (.002)**
Household income < $15,000 a (%) | -1.098 (1.187) | -.503 (.566) | -1.003 (.395)** | -.515 (.295)+
Household income > $60,000 a (%) | .920 (.701) | .347 (.383) | .244 (.268) | .321 (.198)+
Income above respondent’s income (%) | .795 (.500) | -.160 (.268) | -.078 (.175) | -.037 (.120)
Income same as respondent’s (%) | .513 (1.574) | .970 (.809) | .530 (.555) | .062 (.416)
Neighborhood index 1 | .069 (.063) | .027 (.033) | .064 (.024)** | .040 (.018)*
Neighborhood index 2 | .067 (.070) | -.071 (.041)+ | -.039 (.030) | -.010 (.021)
Splines (% of neighborhood):
Top 10 | .518 (.696) | -.114 (.303) | -.257 (.240) | .050 (.260)
Top 11-25 | -.079 (.238) | -.081 (.159) | -.036 (.112) | -.121 (.084)
Top 26-50 | -.086 (.138) | -.141 (.088) | -.021 (.063) | -.001 (.050)
Top 51-75 | .119 (.097) | .028 (.059) | -.060 (.040) | -.028 (.031)
Top 76-90 | .577 (.341)+ | .250 (.177) | -.147 (.125) | -.164 (.100)+
Bottom 10 | -.556 (.168)*** | -.017 (.061) | -.044 (.046) | -.023 (.036)
Log of family income-to-needs ratio | -.292 (.193) | .217 (.111)* | .132 (.073)+ | -.015 (.055)
N | 831 | 1,723 | 2,795 | 3,961
Number of groups | 370 | 700 | 1,020 | 1,331
Average number of observations per group | 2.2 | 2.5 | 2.7 | 3.0
Within R2 for neighborhood index models | .1938 | .2137 | .2173 | .2090

Source: Panel Study of Income Dynamics and the 1970, 1980, and 1990 Census. ***p <= .001; **p <= .01; *p <= .05; +p <= .10; all for two-tailed tests. a Income expressed in 2001 dollars. Note.—All models control for the full set of control variables (see the last two columns of Appendix tables 3 to 6 for a full list of variables included in each of the models). Neighborhood index variables are determined through principal components analysis (see Appendix table 1). Generally, each coefficient and standard error in the table comes from a separate regression model. Neighborhood variables run in the same models include the two neighborhood index variables; income above respondent’s income (%) and same income as respondent’s (%); and the spline variables.

Appendix Table 3
Full Models, Ages 0 to 4

Variable | No Adult Variables: OLS | No Adult Variables: Fixed Effect | Includes Adult Variables: OLS | Includes Adult Variables: Fixed Effect
Neigh Index 1 | .035 (.015)* | .112 (.048)* | .022 (.014) | .051 (.043)
Neigh Index 2 | .009 (.018) | .041 (.059) | .014 (.017) | .064 (.049)
# OF MOVES | -.046 (.014)*** | -.042 (.039) | -.020 (.012)+ | -.022 (.035)
Child Order | .012 (.020) | -.024 (.048) | -.008 (.017) | -.033 (.043)
INCOME VARIANCE | .000 (.000) | .000 (.000) | .000 (.000) | .000 (.000)
HD/WF BEFORE AGE 18 | -.403 (.124)*** | -.350 (.203)+ | -.216 (.120) | -.210 (.184)
FAM INCOME-TO-NEEDS | .062 (.020)** | -.233 (.192) | .059 (.021)** | -.042 (.053)
% OF YRS WITH AFDC | -.290 (.110)** | -.469 (.293) | -.215 (.095)* | -.243 (.258)
AGE OF THE HEAD | -.005 (.003) | .018 (.012) | -.004 (.003) | .022 (.011)*
CNTY UNEM RATE | -.002 (.010) | .015 (.034) | .012 (.009) | .016 (.030)
HD PHYS/EMOT LIMIT | .022 (.041) | -.019 (.088) | .041 (.034) | -.079 (.080)
HOME OWNERSHIP | .045 (.051) | -.063 (.094) | .043 (.043) | -.012 (.083)
gotsepdv | -.253 (.081)** | -.161 (.139) | -.161 (.069)* | -.206 (.124)+
gotwid | -.138 (.198) | -.580 (.782) | -.145 (.178) | .000 (.711)
gotmarr | .114 (.072) | .167 (.148) | .076 (.060) | .170 (.130)
nevmarr | .046 (.152) | -.432 (.316) | .061 (.116) | -.269 (.280)
sepdiv | -.097 (.113) | -.137 (.264) | -.049 (.099) | -.251 (.237)
widow | -.251 (.155) | -.293 (1.045) | -.114 (.103) | .206 (.952)
# OF KIDS | -.023 (.017) | .014 (.066) | .000 (.014) | .072 (.056)
Children Under 6 (dummy) | (dropped) (.000) | (dropped) (.000) | (dropped) (.000) | (dropped) (.000)
MAX AGE OF R | .104 (.115) | .066 (.207) | -.031 (.103) | -.131 (.187)
Max Age of R Squared | -.001 (.002) | .000 (.003) | .001 (.002) | .003 (.003)
FEMALE | -.027 (.033) | -.068 (.055) | .014 (.037) | -.057 (.060)
HS DROPOUT | -.230 (.051)*** |  | -.061 (.048) | 
HS GRADUATE | -.170 (.046)*** |  | -.086 (.040)* | 
SOME COLLEGE | -.130 (.049)** |  | -.071 (.043)+ | 
AFAM | -.278 (.058)*** |  | -.083 (.053) | 
OTHRACE | -.079 (.109) |  | -.013 (.086) | 
SOUTH | -.002 (.039) |  | .047 (.043) | 
BIG CITY (500,000+) | .058 (.054) | -.046 (.196) | .086 (.047)+ | -.039 (.176)
CITY2 (100,000-499,999) | .037 (.043) | .013 (.161) | .037 (.039) | .054 (.143)
CITY3 (50,000-99,999) | -.030 (.056) | .047 (.195) | -.002 (.050) | .100 (.176)
CITY4 (25,000-49,999) | .135 (.063)* | -.090 (.235) | .112 (.054)* | -.137 (.212)
Started HH as a wife (dummy) |  |  | .004 (.045) | .006 (.084)
% YRS MARRIED |  |  | .789 (.053)*** | .648 (.092)***
FAMILY SIZE |  |  | -.192 (.019)*** | -.151 (.032)***
HS DROPOUT |  |  | -.350 (.046)*** | -.337 (.085)***
HS GRADUATE |  |  | -.282 (.039)*** | -.230 (.084)**
SOME COLLEGE |  |  | -.159 (.038)*** | -.254 (.084)**
STUDENT AFTER AGE 25 |  |  | .097 (.083) | .074 (.142)
County/State Unemploy Rate |  |  | -.096 (.015)*** | -.019 (.034)
SOUTH AS AN ADULT |  |  | -.068 (.041)+ | -.219 (.095)*
LIVE IN SMSA |  |  | .067 (.034)* | .241 (.062)***
yr6872 | -.010 (.051) | .003 (.088) | -.013 (.047) | -.011 (.080)
yr7377 | … | … | … | …
yr7882 … …
yr8387 … …
_cons | -1.009 (1.726) | -1.095 (3.108) | 1.561 (1.539) | 1.881 (2.784)
N | 1491 | 748 | 1491 | 748
# of groups |  | 333 |  | 333
Avg obs per group |  | 2.2 |  | 2.2
R2/Within R2 | .2917 | .1028 | .4771 | .264

***: p<=.001; **: p<=.01; *: p<=.05; +: p<=.10

Appendix Table 4: Full Models, Ages 5 to 8

Variable | OLS (no adult variables) | Fixed Effect (no adult variables) | OLS (adult variables included) | Fixed Effect (adult variables included)
Neigh Index 1 | .055 (.013)*** | .055 (.036) | .034 (.011)** | .040 (.032)
Neigh Index 2 | -.026 (.017) | -.114 (.035)** | -.016 (.014) | -.083 (.031)**
# OF MOVES | -.057 (.016)*** | .025 (.029) | -.039 (.013)** | .003 (.026)
Child Order | -.001 (.015) | -.050 (.027)+ | -.008 (.012) | -.035 (.024)
INCOME VARIANCE | .000 (.000) | .000 (.000) | .000 (.000) | .000 (.000)
HD/WF BEFORE AGE 18 | -.303 (.084)*** | .022 (.124) | -.067 (.077) | .157 (.109)
FAM INCOME-TO-NEEDS | .046 (.012)*** | -.004 (.041) | .035 (.012)** | -.009 (.036)
% OF YRS WITH AFDC | -.252 (.082)** | .194 (.161) | -.177 (.066)** | .259 (.142)+
AGE OF THE HEAD | -.002 (.002) | .003 (.007) | -.003 (.002) | .001 (.006)
CNTY UNEM RATE | .007 (.008) | -.005 (.018) | .014 (.007)* | -.007 (.016)
HD PHYS/EMOT LIMIT | -.083 (.034)* | -.060 (.055) | -.025 (.029) | -.047 (.049)
HOME OWNERSHIP | .027 (.039) | -.014 (.060) | .013 (.032) | -.027 (.053)
Gotsepdv | -.049 (.086) | -.060 (.131) | -.022 (.072) | -.084 (.115)
Gotwid | -.197 (.199) | -.591 (.276)* | -.270 (.161)+ | -.518 (.243)*
Gotmarr | -.061 (.068) | -.052 (.130) | .002 (.059) | -.005 (.114)
Nevmarr (all years) | .115 (.124) | -.798 (.277)** | .033 (.098) | -.612 (.244)**
Sepdiv (all years) | .031 (.086) | -.052 (.145) | -.020 (.071) | -.151 (.128)
Widow (all years) | -.146 (.156) | -.049 (.256) | -.094 (.117) | -.193 (.225)
# OF KIDS | -.024 (.014)+ | -.004 (.033) | -.002 (.011) | -.006 (.029)
Children Under 6 (dummy) | (dropped) (.000) | (dropped) (.000) | (dropped) (.000) | (dropped) (.000)
MAX AGE OF R | .111 (.061)+ | .176 (.080)* | .023 (.053) | .058 (.071)
Max Age of R Squared | -.002 (.001)+ | .000 (.001) | -.001 (.001)
FEMALE | -.001 (.001) | -.029 (.037) | .025 (.029) | -.016 (.039)
HS DROPOUT | -.258 (.044)*** | | -.101 (.039)** |
HS GRADUATE | -.119 (.039)** | | -.064 (.033)+ |
SOME COLLEGE | -.100 (.044)* | | -.066 (.038)+ |
AFAM | -.300 (.047)*** | | -.135 (.040)*** |
OTHRACE | -.023 (.099) | | .029 (.084) |
South | .058 (.035)+ | | .096 (.041)* |
BIG CITY (500,000+) | .034 (.044) | -.104 (.146) | .092 (.038)* | .023 (.129)
CITY2 (100,000-499,999) | .016 (.039) | -.180 (.149) | .036 (.035) | -.124 (.132)
CITY3 (50,000-99,999) | -.079 (.047)+ | -.471 (.194)* | -.041 (.043) | -.328 (.171)+
CITY4 (25,000-49,999) | .052 (.056) | -.062 (.194) | .031 (.047) | -.178 (.172)
Started HH as a wife (dummy) | | | -.067 (.032)* | -.034 (.052)
% of YRS MARRIED | | | .909 (.044)*** | .784 (.061)***
FAMILY SIZE | | | -.201 (.015)*** | -.136 (.020)***
HS DROPOUT | | | -.411 (.035)*** | -.289 (.055)***
HS GRADUATE | | | -.285 (.030)*** | -.159 (.050)**
SOME COLLEGE | | | -.154 (.029)*** | -.114 (.053)*
STUDENT AFTER AGE 25 | | | .099 (.061) | .117 (.085)
County/State Unemploy Rate | | | -.084 (.012)*** | -.062 (.020)**
Live in South as Adult | | | -.086 (.037)* | -.152 (.071)*
Live in SMSA as Adult | | | .066 (.028)* | .134 (.043)**
yr6872 | -.042 (.061) | -.241 (.112)* | -.083 (.057) | -.261 (.101)**
yr7377 | -.010 (.049) | -.134 (.079)+ | -.051 (.045) | -.161 (.071)*
yr7882 | | | |
yr8387 | | | |
_cons | -1.132 (.963) | -2.091 (1.313) | .814 (.847) | .233 (1.174)
N | 2473 | 1581 | 2473 | 1581
# of groups | 1543 | 650 | 1543 | 650
Avg obs per group | | 2.2 | | 2.4
R2/Within R2 | .3153 | .0666 | .5098 | .2728

***: p<=.001; **: p<=.01; *: p<=.05; +: p<=.10
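The significance markers in the table legend can be related to the printed coefficients and standard errors with a short calculation. The sketch below assumes a two-sided test of coefficient divided by standard error against a standard normal distribution. That is an approximation chosen for illustration, not necessarily how the original estimates were tested, and because the printed values are rounded it will not reproduce every marker in the tables. The function names two_sided_p and marker are hypothetical.

```python
import math

def two_sided_p(coef, se):
    """Two-sided p-value for coef/se under a standard normal approximation."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2))

def marker(p):
    """Map a p-value to the legend's marker: ***, **, *, + or none."""
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    if p <= 0.10:
        return "+"
    return ""

# (coefficient, standard error) pairs as printed in Appendix Table 4,
# no-adult-variables models.
examples = {
    "Neigh Index 1, OLS": (0.055, 0.013),
    "Neigh Index 2, Fixed Effect": (-0.114, 0.035),
    "Child Order, Fixed Effect": (-0.050, 0.027),
    "# OF MOVES, Fixed Effect": (0.025, 0.029),
}

for label, (coef, se) in examples.items():
    p = two_sided_p(coef, se)
    print(f"{label}: z = {coef / se:.2f}, p = {p:.4f}, marker = '{marker(p)}'")
```

For these four entries the approximation gives the same markers that the table prints; entries whose ratio sits near a cutoff may differ.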

Appendix Table 5: Full Models, Ages 9 to 13

Variable | OLS (no adult variables) | Fixed Effect (no adult variables) | OLS (adult variables included) | Fixed Effect (adult variables included)
Neigh Index 1 | .048 (.011)*** | .068 (.024)** | .035 (.009)*** | .071 (.021)***
Neigh Index 2 | -.059 (.016)*** | -.058 (.030)* | -.040 (.013)*** | -.044 (.026)+
# OF MOVES | -.051 (.012)*** | .014 (.023) | -.029 (.011)** | .016 (.020)
Child Order | -.012 (.012) | -.025 (.020) | -.012 (.010) | -.023 (.018)
INCOME VARIANCE | .000 (.000) | .000 (.000) | .000 (.000) | .000 (.000)
HD/WF BEFORE AGE 18 | -.239 (.079)** | .004 (.107) | -.033 (.071) | .076 (.094)
FAM INCOME-TO-NEEDS | .033 (.011)** | .040 (.030) | .028 (.009)** | .026 (.026)
% OF YRS WITH AFDC | -.182 (.069)** | .124 (.128) | -.088 (.056) | .186 (.112)+
AGE OF THE HEAD | -.001 (.002) | .010 (.005)+ | .000 (.002) | .011 (.005)*
CNTY UNEM RATE | -.007 (.007) | -.010 (.015) | .002 (.006) | -.018 (.013)
HD PHYS/EMOT LIMIT | -.029 (.031) | -.115 (.047)* | -.013 (.026) | -.112 (.041)**
HOME OWNERSHIP | .035 (.039) | -.031 (.056) | -.018 (.033) | -.012 (.049)
Gotsepdv | -.039 (.061) | .073 (.083) | .004 (.049) | .087 (.072)
Gotwid | -.070 (.080) | .277 (.167)+ | -.004 (.076) | .220 (.147)
Gotmarr | -.070 (.050) | .032 (.090) | -.047 (.042) | .036 (.079)
Nevmarr (all years) | .091 (.099) | .056 (.213) | .104 (.075) | .272 (.186)
Sepdiv (all years) | .003 (.066) | -.039 (.102) | .012 (.054) | -.008 (.089)
Widow (all years) | -.019 (.105) | .214 (.201) | -.019 (.083) | .184 (.175)
# OF KIDS | -.023 (.012)+ | .052 (.023)* | -.004 (.010) | .044 (.021)*
Children Under 6 (dummy) | .033 (.028) | -.004 (.041) | .046 (.023)* | .017 (.036)
MAX AGE OF R | .064 (.044) | .117 (.053)* | .022 (.039) | .066 (.047)
Max Age of R Squared | -.001 (.001) | -.001 (.001)+ | .000 (.001) | -.001 (.001)
FEMALE | -.043 (.024)+ | -.043 (.030) | .024 (.025) | -.017 (.032)
HS DROPOUT | -.265 (.038)*** | | -.109 (.033)*** |
HS GRADUATE | -.165 (.037)*** | | -.092 (.031)*** |
SOME COLLEGE | -.085 (.038)* | | -.033 (.033) |
AFAM | -.318 (.041)*** | | -.150 (.035)*** |
OTHRACE | .016 (.074) | | .059 (.060) |
South | .032 (.030) | | .032 (.034) |
BIG CITY (pop. 500,000+) | .064 (.038)+ | -.087 (.117) | .097 (.032)** | -.045 (.102)
CITY2 (100,000-499,999) | .020 (.034) | -.018 (.104) | .048 (.029)+ | .028 (.091)
CITY3 (50,000-99,999) | -.020 (.042) | -.105 (.130) | .009 (.039) | -.035 (.114)
CITY4 (25,000-49,999) | .063 (.044) | .145 (.132) | .028 (.039) | .073 (.115)
Started HH as a wife (dummy) | | | -.047 (.028)+ | -.022 (.042)
% YRS MARRIED | | | .894 (.039)*** | .806 (.050)***
FAMILY SIZE | | | -.189 (.013)*** | -.136 (.016)***
HS DROPOUT | | | -.444 (.033)*** | -.300 (.044)***
HS GRADUATE | | | -.284 (.026)*** | -.170 (.038)***
SOME COLLEGE | | | -.183 (.025)*** | -.125 (.041)**
STUDENT AFTER AGE 25 | | | .081 (.052) | .027 (.066)
County/State Unemploy Rate | | | -.074 (.010)*** | -.067 (.015)***
Live in the South as Adult | | | -.040 (.033) | -.127 (.057)*
Live in SMSA as Adult | | | .043 (.025)+ | .163 (.037)***
yr6872 | -.107 (.065)+ | -.090 (.119) | -.156 (.059)** | -.136 (.107)
yr7377 | -.041 (.060) | -.020 (.099) | -.128 (.055)* | -.086 (.089)
yr7882 | -.015 (.050) | -.021 (.080) | -.097 (.046)* | -.045 (.102)
yr8387 | | | |
_cons | -.264 (.715) | -1.925 (.892)* | .911 (.637) | -.063 (.071)
N | 3215 | 2248 | 3215 | 2248
# of groups/clusters | 1831 | 866 | 1831 | 866
Avg obs per group | | 2.6 | | 2.6
R2/Within R2 | .3286 | .044 | .5160 | .2899

***: p<=.001; **: p<=.01; *: p<=.05; +: p<=.10

Appendix Table 6: Full Models, Ages 14 to 18

Variable | OLS (no adult variables) | Fixed Effect (no adult variables) | OLS (adult variables included) | Fixed Effect (adult variables included)
Neigh Index 1 | .055 (.008)*** | .042 (.018)* | .042 (.007)*** | .033 (.016)*
Neigh Index 2 | -.052 (.012)*** | -.020 (.021) | -.033 (.010)*** | -.014 (.019)
# OF MOVES | -.065 (.013)*** | .017 (.019) | -.032 (.010)** | .010 (.017)
Child Order | -.020 (.010)* | -.020 (.017) | -.010 (.008) | -.016 (.015)
INCOME VARIANCE | .000 (.000)*** | .000 (.000) | .000 (.000)*** | .000 (.000)
HD/WF BEFORE AGE 18 | -.183 (.061)** | -.077 (.089) | .011 (.053) | -.007 (.079)
FAM INCOME-TO-NEEDS | .046 (.007)*** | .013 (.018) | .036 (.006)*** | .006 (.016)
% OF YRS WITH AFDC | -.124 (.062)* | .022 (.090) | -.071 (.050) | .035 (.080)
AGE OF THE HEAD | -.002 (.002) | .001 (.003) | -.002 (.001) | -.001 (.003)
CNTY UNEM RATE | -.005 (.005) | .004 (.011) | .004 (.004) | -.009 (.010)
HD PHYS/EMOT LIMIT | -.039 (.026) | .038 (.038) | -.033 (.021) | .012 (.034)
HOME OWNERSHIP | .016 (.035) | .045 (.049) | -.039 (.029) | .017 (.043)
Gotsepdv | -.028 (.050) | .023 (.068) | -.046 (.040) | .011 (.060)
Gotwid | .054 (.075) | .016 (.110) | .101 (.069) | .028 (.098)
Gotmarr | -.101 (.047)* | -.090 (.066) | -.040 (.037) | -.074 (.058)
Nevmarr (all years) | .082 (.088) | -.101 (.162) | .076 (.076) | -.088 (.144)
Sepdiv (all years) | .030 (.057) | .033 (.074) | -.002 (.046) | .043 (.066)
Widow (all years) | -.019 (.061) | .066 (.116) | -.013 (.046) | .034 (.103)
# OF KIDS | -.015 (.009)+ | .005 (.019) | .003 (.007) | .006 (.017)
Children Under 6 (dummy) | -.030 (.031) | -.026 (.036) | -.004 (.028) | -.020 (.032)
MAX AGE OF R | .063 (.023)** | .049 (.030) | .044 (.020)* | .051 (.027)+
Max Age of R Squared | -.001 (.000)* | -.001 (.000) | .000 (.000) | -.001 (.000)
FEMALE | -.034 (.020)+ | -.038 (.024) | .035 (.021)+ | .032 (.027)
HS DROPOUT | -.236 (.030)*** | | -.096 (.026)*** |
HS GRADUATE | -.087 (.033)** | | -.029 (.028) |
SOME COLLEGE | -.045 (.032) | | -.001 (.025) |
AFAM | -.284 (.033)*** | | -.129 (.027)*** |
OTHRACE | .016 (.058) | | .074 (.043)+ |
South | .020 (.027) | | .029 (.031) |
BIG CITY (pop. 500,000+) | .059 (.034)+ | .041 (.090) | .088 (.029)** | .055 (.080)
CITY2 (100,000-499,999) | .006 (.029) | .063 (.089) | .018 (.026) | .047 (.079)
CITY3 (50,000-99,999) | .003 (.036) | .081 (.097) | -.008 (.032) | .065 (.087)
CITY4 (25,000-49,999) | .090 (.041)* | .026 (.112) | .067 (.033)* | .100 (.100)
Started HH as a wife (dummy) | | | -.047 (.023)* | -.061 (.034)+
% YRS MARRIED | | | .917 (.030)*** | .835 (.041)***
FAMILY SIZE | | | -.190 (.010)*** | -.154 (.012)***
HS DROPOUT | | | -.451 (.027)*** | -.324 (.036)***
HS GRADUATE | | | -.275 (.021)*** | -.187 (.031)***
SOME COLLEGE | | | -.187 (.023)*** | -.156 (.034)***
STUDENT AFTER AGE 25 | | | .050 (.041) | -.063 (.055)
County/State Unemploy Rate | | | -.061 (.008)*** | -.086 (.012)***
Live in South as Adult | | | -.048 (.029)+ | -.105 (.048)*
Live in SMSA as Adult | | | .054 (.020)** | .072 (.030)*
yr6872 | -.039 (.056) | .009 (.119) | -.130 (.053)* | -.059 (.108)
yr7377 | -.066 (.053) | -.026 (.105) | -.157 (.049)** | -.087 (.096)
yr7882 | -.007 (.052) | .027 (.093) | -.136 (.048)** | -.061 (.085)
yr8387 | .014 (.043) | .039 (.076) | -.079 (.041)* | -.047 (.069)
_cons | -.272 (.392) | -.461 (.542) | .532 (.342) | .405 (.490)
N | 4626 | 3652 | 4626 | 3652
# of groups/clusters | 2238 | 1264 | 2238 | 1264
Avg obs per group | | 2.9 | | 2.9
R2/Within R2 | .3289 | .0268 | .5146 | .2338

***: p<=.001; **: p<=.01; *: p<=.05; +: p<=.10
