
A FRAMEWORK FOR THE META-ANALYSIS OF SURVEY DATA

by

Karla Michelle Phillips Cooper Fox

A thesis submitted to the Department of Mathematics and Statistics

In conformity with the requirements for

the degree of Doctor of Philosophy

Queen’s University

Kingston, Ontario, Canada

(December, 2011)

Copyright © Karla Michelle Phillips Cooper Fox, 2011

Abstract

The work outlined in this thesis covers various statistical issues relating to the meta-analysis of survey data. These issues include the creation of an original comprehensive methodological framework for combining survey data, a comparison of this framework with the traditional one proposed by Cochran for the combination of experiments, a proposal for a new weighting method that takes into account the differences in variability due to the sampling plan, an examination of the convergence of meta-analytic estimators, and a discussion of the numerous implicit assumptions researchers make when they use meta-analysis methods with survey data, along with guidelines for completing and reporting reviews when the data come from surveys.

Acknowledgements

It is a pleasure to thank those who made this thesis possible. First, I wish to express my gratitude to my supervisor, Dr. David Steinsaltz, who was instrumental in starting this research; he was helpful and offered significant assistance, support, and guidance throughout. Gratitude is also due to the members of the supervisory committee and staff at Queen's University, in particular Jennifer Read for her timely help with all of my queries. Additionally, I would like to thank Statistics Canada for supporting this research. I am indebted to my many colleagues at Statistics Canada who assisted me; in particular I would like to thank Georgia Roberts, Sander Post and Ioana Schiopu-Kratina for their insights and consultations. I owe my deepest gratitude to three extraordinary women, Isobel Barnes, Gaye Taylor and Maxine Morrison, who helped in many different ways throughout this process and without whose support, editing, and counsel this thesis would not have been possible. Finally, I would also like to express my deepest love and gratitude to my husband, Michael Fox, for his understanding, patience and endless belief in me through the long duration of my studies.

For my father Jack Phillips Cooper

Statement of Originality

I hereby certify that all of the work described within this thesis is the original work of the author.

Any published (or unpublished) ideas and/or techniques from the work of others are fully acknowledged in accordance with standard referencing practices.

(Karla Michelle Phillips Cooper Fox)

(December, 2011)

Table of Contents

Abstract
Acknowledgements
Statement of Originality
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Existing Meta-Analysis Techniques
1.1.1 Hypothesis Tests
1.1.2 Fixed Effect Models
1.1.3 Random Effect Models
1.1.4 Regression Methods
1.1.5 Likelihood Based Methods
1.2 Probability Sampling
1.3 Superpopulations
1.4 Combining Surveys
Chapter 2 Design and Model Based Methods in Sampling
2.1 Design Based Sampling
2.1.1 Simple Random Sampling
2.1.2 Horvitz-Thompson Estimation
2.1.3 Stratified Sampling
2.1.4 Cluster Sampling
2.1.5 Two-phase Sampling
2.1.6 Complex Sampling
2.2 Models and the Superpopulation
2.3 Inference
Definitions
2.4 Estimation
Ignorability and Informativeness
2.5 …
2.6 Mathematical Examples
Some Theory of Sampling: W. Deming
Answers to Criticisms of Using Design-Based Methods for Inference: D. Binder
Chapter 3 The Framework
3.1 The Framework for the Meta-Analysis of Survey Data
Experiments as Generated in a Superpopulation Survey Context
Case 1: Combining Finite Population Estimates
…
Case 2: Combining Finite Parameters from more than one Finite Population
Case 3: Combining Finite Parameters from Multiple Superpopulations
3.2 Asymptotic Techniques and Definitions
3.3 Convergence of Combined Estimates under the Sampling Framework
3.4 Testing Heterogeneity
3.5 Adjusted Weights
Survey Design Adjusted Weights
Chapter 4 …
Notation
Simulations
4.1 Cases 1a and 1b: Combining several estimates from different surveys of the same finite population
4.1.1 Case 1a: Mixing Identical Studies
Simple Random Sampling
Stratified Sampling
Cluster Sampling
4.1.2 Case 1a: Mixing Surveys of Different Types
Ignoring the Design Weights
Compared to Pooling all the Individual Data
4.1.3 Case 1b: Mixing Identical Designs
Simple Random Sampling
Stratified Sampling
Cluster Sampling
4.1.4 Case 1b: Mixing Different Designs
4.2 Case 2: Combining several estimates from different surveys generated from the same superpopulation but different finite populations
4.2.1 Case 2: Mixing Identical Designs
Simple Random Sampling
Stratified Sampling
Cluster Sampling
4.2.2 Case 2: Mixing Different Surveys
4.3 Case 3: Combining several estimates from different surveys generated from different superpopulations under the same hyper-superpopulation
4.4 Small Sample Properties
4.5 Plots
Forest Plots
Funnel Plots
Chapter 5 Example – Partnership Status and Sex Ratio
5.1 Partnership Status and Sex Ratio
Sex Ratios – Birth Outcomes
Selection of Studies
Target Populations and Study Designs
Inference using Longitudinal and Cross-sectional Designs
The US Population over Time
Cause, Odds Ratios and Risk Differences
Reproducing Results – Validating and Recoding Data
Combining (or Not Combining) the Data
Chapter 6 Guidelines
6.1 Data Quality
6.2 Data Comparability and Consistency
6.3 Reporting
6.4 Conclusion
Bibliography
Appendix A Programs
Appendix B …

List of Figures

Figure 2.1 Random Sampling from a Finite Population
Figure 2.2 Stratified Sampling
Figure 2.3 Cluster Sampling
Figure 2.4 Two-Stage Cluster Sampling
Figure 2.5 Two-Phase Sampling
Figure 2.6 Superpopulation
Figure 2.7 Literacy
Figure 3.1 Experimental Framework for a Fixed-Effect Model using Sampling Concepts
Figure 3.2 Experimental Framework for a Random-Effect Model using Sampling Concepts
Figure 3.3 A Fixed-Effect Model, Case 1a, using Sampling Concepts
Figure 3.4 Fixed-Effect Model, Case 2, using Sampling Concepts
Figure 3.5 Random-Effect Model, Case 3, using Sampling Concepts
Figure 4.1 Convergence of a Sequence of Simple Random Samples from a Single Finite Population using Fixed Effect Models with Three Different Weights, $\hat{\theta}_F(F)$
Figure 4.2 Convergence to the Finite Population using Simple Random Samples of Size 1000 from a Finite Population of Size 10,000, $\hat{\theta}_F(F)$
Figure 4.3 Convergence of a Sequence of Stratified Samples from a Single Finite Population using Fixed Effect Models with Four Different Weights, $\hat{\theta}_F(F)$
Figure 4.4 Convergence of a Sequence of Cluster Samples from a Single Finite Population using Fixed Effect Models with Four Different Weights, $\hat{\theta}_F(F)$
Figure 4.5 Convergence of a Sequence of Samples from Different Designs from a Single Finite Population using Fixed Effect Models with Four Different Weights, $\hat{\theta}_F(F)$
Figure 4.6 Convergence of a Sequence of Stratified Samples from a Single Finite Population using Fixed Effect Models with Three Different Weights – Ignoring the Design
Figure 4.7 Convergence of a Sequence of Simple Random Samples from a Single Finite Population using Fixed Effect Models with Three Different Weights, $\hat{\theta}_F(S)$
Figure 4.8 Convergence of a Sequence of Stratified Random Samples from a Single Finite Population using Fixed Effect Models with Four Different Weights
Figure 4.9 Correlation of the Mean and the Superpopulation in Unequal Probability Sampling
Figure 4.10 Convergence of a Sequence of Cluster Samples from a Single Finite Population using Fixed Effect Models with Four Different Weights, $\hat{\theta}_F(S)$
Figure 4.11 Convergence of a Sequence of Random Samples from Different Designs from a Single Finite Population using Fixed Effect Models with Four Different Weights, $\hat{\theta}_F(S)$
Figure 4.12 Convergence of a Sequence of Simple Random Samples from Several Finite Populations, Generated from a Single Superpopulation, using Fixed Effect Models with Three Different Weights, $\hat{\theta}_F(S)$
Figure 4.13 Convergence of a Sequence of Stratified Samples from Several Finite Populations, Generated from a Single Superpopulation, using Fixed Effect Models with Four Different Weights, $\hat{\theta}_F(S)$
Figure 4.14 Convergence of a Sequence of Cluster Samples from Several Finite Populations, Generated from a Single Superpopulation, using Fixed Effect Models with Four Different Weights
Figure 4.15 Convergence of a Sequence of Random Samples from Different Designs from Several Finite Populations, Generated from a Single Superpopulation, using Fixed Effect Models with Four Different Weights, $\hat{\theta}_F(S)$
Figure 4.16 Convergence of a Sequence of Simple Random Samples from Several Superpopulations, Generated from a Hyper-Superpopulation, using a Random Effect Model, $\hat{\theta}_F(S)$
Figure 4.17 Convergence of a Sequence of Stratified Random Samples from Several Superpopulations, Generated from a Hyper-Superpopulation, using a Random Effect Model
Figure 4.18 Convergence of a Sequence of Cluster Samples from Several Superpopulations, Generated from a Hyper-Superpopulation, using a Random Effect Model, $\hat{\theta}_F(S)$
Figure 4.19 Convergence of a Sequence of Samples from Different Designs from a Single Finite Population using Fixed Effect Models with Four Different Weights (seed 1), $\hat{\theta}_F(F)$
Figure 4.20 Convergence of a Sequence of Samples from Different Designs from a Single Finite Population using Fixed Effect Models with Four Different Weights (seed 2), $\hat{\theta}_F(F)$
Figure 4.21 Convergence of a Sequence of Samples from Different Designs from a Single Finite Population using Fixed Effect Models with Four Different Weights (seed 3), $\hat{\theta}_F(F)$
Figure 4.22 Convergence of a Sequence of Random Samples from Different Designs from a Single Finite Population, Generated from a Single Superpopulation, using Fixed Effect Models with Four Different Weights, $\hat{\theta}_F(S)$
Figure 4.23 … of a …
Figure 4.24 … of 9 Simulated Studies Selected using Simple Random Sampling, Showing the Finite Population and Superpopulation
Figure 4.25 Forest Plot of 9 Studies Selected using Simple Random Sampling
Figure 4.26 Forest Plot of 200 Studies Generated from Simple Random Samples
Figure 5.1 Study Time Periods
Figure 5.2 Births, Infant Deaths, Marriage and Divorce in the United States: 1957–1995

List of Tables

Table 1.1 Events for the ith Study from which the Odds Ratio is Calculated
Table 2.1 Estimation and Inference
Table 3.1 Estimates of daily smokers aged 12 to 19, Canadian Community Health Survey, cycles 1.1 to 3.1, Durham Health Region
Table 4.1 Finite and Superpopulation Means with Different Sample Sizes
Table 4.2 Finite and Superpopulation Means with Fixed Sample Sizes
Table 4.3 Finite Population Means and Variances of Y by Strata
Table 4.4 Individual Data and Pooled Data Estimation of a Mean
Table 5.1 Counterfactual Contingency Table
Table 5.2 Proportions in Counterfactual Contingency Table
Table 5.3 Partnership status and likelihood of a male child: NSFG
Table 5.4 Partnership status and likelihood of a male child: NLSY79
Table 5.5 Partnership status and likelihood of a male child: NLSYW
Table 5.6 Partnership status and likelihood of a male child: PSID
Table 6.1 A Reporting Checklist for Authors, Editors, and Reviewers of Meta-analyses of Surveys

Chapter 1 Introduction

The advantages of meta-analysis include seeing a body of research whole: discerning patterns within it and detecting effects too small to be picked up by individual studies. Researchers who use survey data are understandably keen to exploit these advantages. When we do a meta-analysis, the body of research must fit into a coherent and plausible theoretical framework, both for the subject under study and for the statistical methods used in the meta-analysis. To date, no such methodological framework exists for pooling survey data. The urgent need for research into meta-analysis methods for survey data is apparent: while the literature on studies that pool survey data is growing at an astounding rate, the literature on how to pool these data is not. This thesis presents a clear theoretical framework for the meta-analysis of survey data and explains the assumptions, and the consequences of those assumptions, that researchers unwittingly make.

This chapter provides an introduction to meta-analysis techniques and the methods presently used to combine experimental data. The next section introduces the concept of meta-analysis, outlines the issues associated with combining survey data, and sets the aims and scope of this thesis. Section 1.1 describes existing meta-analysis techniques such as fixed effect and random effect models. The framework for probability sampling and the idea of the superpopulation are introduced in Section 1.2 and Section 1.3, respectively. Methods for combining survey data are presented in Section 1.4.


Background, Aims, and Scope of Thesis

Reviews attempt to integrate the evidence from multiple primary research studies by first assembling and assessing these studies, with the objective of minimizing and controlling, whenever possible, random errors. For example, researchers might be interested in the effect of smoking on cardiovascular health. As the literature in this field is vast, they would first narrow the subject under review to something like the impact of smoking on heart disease, specifically myocardial infarction (MI) for men aged 40 to 65. They would search the literature for studies that have looked at smoking in relation to the whole of cardiovascular health. They would read the articles and group them in different ways: perhaps by amount of exposure, type of study, or type of cardiovascular outcome. They would eliminate those articles that do not address the smoking-MI issue directly, for example, broadly focused articles reporting on all cardiovascular events together. They would have a predefined rule to determine acceptance of a published study's results, based on aspects of quality (sample size, etc.), and remove those which do not meet these criteria. They would then summarize the remaining studies qualitatively.

In some cases this process may include quantitatively combining results in order to come to a conclusion about a body of research. For example, the researcher may pool all the studies that looked at the risk of myocardial infarction for smokers and attempt to summarize the risk using an average over the studies. The use of such a quantitative approach to statistically combine data is known as meta-analysis. First used in 1976 for the combination of educational studies (Glass 1976), meta-analysis is now common in the medical literature, having expanded from randomized controlled trials and other experimental studies to include epidemiological quasi-experimental studies (cohort and case-control), diagnostic tests, bioassays and, most recently, surveys.

As meta-analysis is of interest to many different fields of research, each of which uses different notation and language, we will clarify the language and notation used in this thesis before discussing its background and aims. Survey statisticians refer to estimation under probability sampling as design-based. They use the terms classical statistics, a classical approach, or model-based statistics to refer to several cases, often interchangeably: when one invokes a superpopulation that generated the finite population, when one ignores the sampling design even though there is a probability sample, or when there is no sampling, as is the case with experimental data. Within this thesis, classical statistics will refer to the experimental data case, model-based statistics will refer to the case where we invoke a superpopulation that generates the finite population, and design-based will refer to the case where we estimate finite population parameters under a standard design-consistent sampling framework.

Throughout this thesis we will consistently use notation following the survey tradition. For example, we will use $\theta$ with no subscript to represent the true parameter of interest under a superpopulation model (that is, the model parameter), and $\theta_N$ as the true value of the finite population parameter of interest (that is, the census parameter). We will assume that our finite populations consist of $N$ objects, that our samples consist of $n$ objects, and that we are combining $r$ different studies. We will use hats on parameters to denote estimates: for example, $\hat{\theta}_n$ will denote the estimate of the census parameter based on the sample. We will also use the convention of differentiating the expectation over the design, written $E_p(\theta)$, from the expectation under the model, written $E_\xi(\theta)$. We will introduce new notation in Chapter 3 to aid the reader in differentiating each of the cases in our framework.

This thesis will look at the case of meta-analysis estimation using fixed and random effect models when the studies being combined are probability samples. A study that is designed using a probability sample is commonly referred to as a survey. It differs from an experiment in that the participants are randomly selected from the survey population; an experiment is a study in which the participants are randomly assigned. There are several critical differences between the experimental and survey cases to examine. We will see in Sections 1.2 and 1.3, and again in Chapter 2, that in finite population sampling it is the selection of individuals that is the random process and the characteristics we are estimating that are fixed, while in an experiment we consider the characteristic we are studying (often a treatment effect) to be a random variable. We will see that whereas in the experimental case we have only one level of inference (the treatment effect under the model), in surveys we have two levels of inference: the superpopulation model and the finite population. The finite population refers to the entire target population under study. For example, if we were studying the height of Canadians, the finite population would be all Canadians (at a particular time point). The superpopulation is the stochastic process or model that generated the finite population. For example, we might think that the heights of Canadians are normally distributed and that this model generated the Canadian population (possibly over several time points). In Chapter 2, we will present how inference and estimation are linked. In Chapter 3, we will see how this link between inference and estimation in the survey case leads to three theoretical cases, whereas in the experimental case there are only two.

The difference between the sampling case and the experimental case is fundamental to the problem of this dissertation and will be discussed in detail throughout.

Another important concept to clarify for the reader is that of convergence. As in classical statistics, we define consistency in sampling in terms of the limiting behavior of an estimate as the sample size is increased to infinity (convergence in probability). Thus, estimation is called design-consistent if the estimate becomes equal to the census value when n = N. When we add in the concept of the superpopulation, we let both the sample size n and the population size N go to infinity with n < N; in Chapters 3 and 4 we must require that the population size and sample size be allowed to increase while maintaining their structure. This again is an important distinction from the experimental case. We will provide mathematical examples in Chapter 2 that illustrate the difference between estimation from a model-based approach and estimation under a design-based approach. The distinction between model- and design-based approaches will be used as the basis of the framework for estimation in Chapter 3.

In Chapter 3, we will present a new weight for both fixed and random effect models that is optimal, in the minimum variance sense, when there is clustering in the data. We will show what assumptions are necessary for convergence under this framework, taking into account design consistency, and illustrate how to appropriately apply either a fixed or a random effect model depending on the underlying assumptions for a particular group of surveys. We will also show that these are very strong assumptions, and we will demonstrate for the reader the consequences of not meeting them. Furthermore, we will illustrate why differences in survey design should not be considered a reason for adopting a random-effects model: it is the assumptions about the target populations that determine whether one should use a fixed or a random effect model. Using simulations, Chapter 4 will illustrate how estimation converges under different sampling plans, different assumptions and different sample sizes. Furthermore, we will demonstrate that the traditional heterogeneity test (which uses the sample estimates of variance) is not appropriate in the survey case, as there is often clustering in the data that is not present in the experimental case.

Chapter 5 presents an example that illustrates some misconceptions that can arise from applying standard meta-analysis approaches to survey data. In this example, Karen Norberg pooled 5 data sources (4 of which are surveys) in an attempt to study the impact of marital status on the sex ratio at birth. This example covers many of the common errors researchers may make when pooling survey data. Finally, Chapter 6 contains a discussion of data quality and its importance when determining study comparability, and provides new reporting guidelines for researchers wishing to do a meta-analysis of survey data.

1.1 Existing Meta-Analysis Techniques

When a single study is of insufficient size, does not cover a domain completely, or contains rare events, estimates from multiple sources can be combined to arrive at a conclusion about a body of research (Peto 1987). Additionally, data from multiple sources are often combined even when the sample size from a single source is sufficient. In clinical trials it is common for the results of a single randomized trial not to be accepted as the only truth without some additional evidence from other trials. Ideally, these studies are similar in methods and design. In the best case, we wish to statistically summarize the studies (either with a hypothesis test or an estimate), identify heterogeneity and differences in subgroups, explore the reasons for the differences, test the effect of explanatory variables on the differences, and test whether all variations are essentially explained (Hedges 1992, Thompson 1994). We approach meta-analysis in two ways: by hypothesis testing or by estimation (Fisher 1950, Cochran 1954).

1.1.1 Hypothesis Tests

In the meta-analysis of experimental data, a hypothesis test can be done to see if each of the treatment effects is zero. Here the null hypothesis is that the true treatment effect $\theta_i$ in each trial $i$, $i = 1, \ldots, n$, is equal to zero: $H_0: \theta_1 = \cdots = \theta_n = 0$. It is assumed that, asymptotically, the estimate of $\theta_i$ has a normal distribution with mean $\theta_i$ and variance $v_i$, and that therefore under the null hypothesis $\hat{\theta}_i$ is a normal random variable with zero mean and variance $v_i$. Consequently, if any one of the trials is significantly different from zero, the null hypothesis is rejected.

To derive an appropriate test statistic under this null hypothesis, both Fisher (1950, 1948) and Pearson (1938), working independently, used an approach of combining p-values: they looked at the issue of statistically summarizing the results of n independent tests of the same hypothesis using the product of the probabilities from the different trials. If the natural logarithm of this product is multiplied by $-2$, then we have a $\chi^2$ statistic with $2n$ degrees of freedom:

$$ \chi^2 = -2 \sum_{i=1}^{n} \ln p_i \,. $$
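To make the mechanics concrete, the following is a minimal sketch of Fisher's combined test in Python; the p-values are hypothetical, and scipy's chi-square survival function supplies the upper tail.

import numpy as np
from scipy.stats import chi2

p_values = np.array([0.08, 0.12, 0.30, 0.04])  # p-values from n independent trials

# chi-square statistic: -2 * sum(ln p_i), with 2n degrees of freedom
stat = -2.0 * np.sum(np.log(p_values))
df = 2 * len(p_values)
combined_p = chi2.sf(stat, df)  # upper-tail probability

print(f"chi2 = {stat:.3f} on {df} df, combined p = {combined_p:.4f}")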

Under the null hypothesis, when we are not concerned about the direction of the alternative, a test statistic can be derived from the squares of the individual observed treatment effects. If we divide $\hat{\theta}_i^2$ by the variance $v_i$, standardizing the squared estimates, then $\hat{\theta}_i^2 / v_i$ has a $\chi^2$ distribution with 1 degree of freedom. The sum over the n independent trials then has a $\chi^2$ distribution with n degrees of freedom (Cochran 1954).

Following in this same direction, Winer (1971) proposed a combined test based on the sampling distribution of independent t-tests, and Stouffer extended this idea by converting the p-values to z-scores instead of t-scores and then summing them. Here, for each study under the null hypothesis, $\hat{\theta}_i / v_i$ has a normal distribution with zero mean and variance $1/v_i$. Then $\sum_{i=1}^{n} \hat{\theta}_i / v_i$ is a normal random variable with mean zero and variance $\sum_{i=1}^{n} 1/v_i$, and thus the test statistic

$$ \frac{\sum_{i=1}^{n} \hat{\theta}_i / v_i}{\sqrt{\sum_{i=1}^{n} 1/v_i}} $$

is a standard normal random variable. Both test statistics can be used for continuous and binary data. However, these tests do not inform the experimenter about the size or the direction of the effect.
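As a quick illustration, here is a hedged sketch of the weighted z statistic above; the effect estimates and variances are invented for the example.

import numpy as np
from scipy.stats import norm

theta_hat = np.array([0.21, 0.35, -0.05, 0.18])  # estimated effects
v = np.array([0.04, 0.09, 0.02, 0.05])           # known sampling variances

# weighted z statistic: sum(theta_i / v_i) / sqrt(sum(1 / v_i))
z = np.sum(theta_hat / v) / np.sqrt(np.sum(1.0 / v))
p_two_sided = 2 * norm.sf(abs(z))
print(f"z = {z:.3f}, two-sided p = {p_two_sided:.4f}")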


For the assessment of directional alternatives, Mantel and Haenszel (1959) and later Peto (1987) suggested a test that can be used in situations where the outcome measure is binary. Again the null hypothesis is that the treatment effect is zero for all studies. If we consider a two-by-two table of the number of patients within a study (Table 1.1), then for each individual trial the observed number of events in the treatment group, $a_i$, conditional on the total number of patients, has a hypergeometric distribution with mean $n_{i1} m_{i1} / N_i$ and variance $n_{i1} n_{i2} m_{i1} m_{i2} / \{N_i^2 (N_i - 1)\}$.

Table 1.1 Events for the ith Study from which the Odds Ratio is Calculated

            | Yes                    | No                     | Total
Treatment   | $a_i$                  | $b_i$                  | $a_i + b_i = n_{i1}$
Control     | $c_i$                  | $d_i$                  | $c_i + d_i = n_{i2}$
Total       | $a_i + c_i = m_{i1}$   | $b_i + d_i = m_{i2}$   | $N_i$

If in a meta-analysis we consider the n individual trials as strata in an experiment, the total number of observed events $\sum_{i=1}^{n} a_i$ has mean $\sum_{i=1}^{n} n_{i1} m_{i1} / N_i$ and variance $\sum_{i=1}^{n} n_{i1} n_{i2} m_{i1} m_{i2} / \{N_i^2 (N_i - 1)\}$. Now if we consider the number of observed and the number of expected cases, the variance of the difference between the observed and expected number of events, $\sum_{i=1}^{n} \{a_i - (n_{i1} m_{i1} / N_i)\}$, is equal to the sum of the individual variances of $a_i - (n_{i1} m_{i1} / N_i)$ (Mantel-Haenszel 1959). Then the appropriate test statistic is

$$ \frac{\left[\sum_{i=1}^{n} \left\{a_i - (n_{i1} m_{i1} / N_i)\right\}\right]^2}{\sum_{i=1}^{n} n_{i1} n_{i2} m_{i1} m_{i2} / \left\{N_i^2 (N_i - 1)\right\}} \,, $$

which has an asymptotic $\chi^2$ distribution with 1 degree of freedom.
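The following sketch assembles the Mantel-Haenszel statistic from a stack of 2x2 tables laid out as in Table 1.1; the counts are hypothetical, and no continuity correction is applied.

import numpy as np
from scipy.stats import chi2

tables = np.array([  # each row: a_i, b_i, c_i, d_i
    [15, 85, 25, 75],
    [ 8, 42, 12, 38],
    [20, 180, 30, 170],
])
a, b, c, d = tables.T
n1, n2 = a + b, c + d          # treatment / control row totals
m1, m2 = a + c, b + d          # event / non-event column totals
N = n1 + n2

expected = n1 * m1 / N
var = n1 * n2 * m1 * m2 / (N**2 * (N - 1))

stat = np.sum(a - expected) ** 2 / np.sum(var)
print(f"MH chi2 = {stat:.3f}, p = {chi2.sf(stat, 1):.4f}")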

1.1.2 Fixed Effect Models

Often, though, we are interested neither in some significance level, nor in the direction of the alternative, but rather in an estimate of the effect size. As with hypothesis testing, every study should ideally be as similar as possible: all variables the same, all with the same study design, all randomized in the same fashion, with the same controls and protocols, thus ensuring that the differences are, in fact, due to random variation within the 'population'. We must also make explicit that in this ideal case, we also assume that the underlying 'population' being studied is the same.

In these ideal conditions we can consider a fixed effect model. Here, the effect size observed in a study is assumed to estimate the corresponding 'population' effect with a random error that stems only from the chance associated with individual-level sampling. The only difference among the studies lies in the power to detect the outcome of interest. Again, as with the hypothesis test, we assume asymptotically that the estimate of the treatment effect $\theta_i$ is a random variable with a normal distribution with mean $\theta_i$ and variance $v_i$. We also assume that there is homogeneity of treatment effects across all studies, that is, $\theta_1 = \theta_2 = \cdots = \theta_r = \theta$. Linear combinations of unbiased estimates are themselves unbiased estimates, and therefore the simplest approach is to calculate a weighted average from all the studies. The weighted average takes the general form

$$ g(\hat{\theta}) = \frac{\sum_{i=1}^{r} w_i \hat{\theta}_i}{\sum_{i=1}^{r} w_i} \,, $$

where the $\hat{\theta}_i$ are the individual study estimates and the weights $w_i$ are chosen to satisfy some criterion, such as $g(\hat{\theta})$ having minimum variance. Here the weights themselves are selected, potentially as a function of the data.

Many researchers have addressed the optimum choice of weights in the experimental case.

Mosteller and Bush (1954) recommended weighting the studies by the sample size. The work of Hedges (1992) and Rosenthal and Rubin (1982) has extended this simple weighting by looking at the sampling distribution of the effect size. In general, we wish to give the greatest weight to studies that have the most precise estimates. If we want to minimize the variance, then the minimum variance estimate is the one obtained by taking the weight for the ith study to be the inverse of the variance (Hedges 1992).

Considering the case of testing a log odds ratio, it is the logarithm of the odds ratio that is commonly taken as $\theta$, as this transformation improves normality. In this case, Armitage and Berry, in Statistical Methods in Medical Research (1994), show that taking $\hat{\theta}_i = \ln(a_i d_i / b_i c_i)$, the variance of $\hat{\theta}_i$ is then $1/a_i + 1/b_i + 1/c_i + 1/d_i$. Then if we assume that $\hat{\theta}_i$ is approximately normally distributed with mean $\theta$ and variance $v_i$, and that the $v_i$ are all known, an estimate of the variance of the overall log odds ratio is $1 / \sum_{i=1}^{r} w_i$, where $w_i = 1/v_i$. Thus, a 95% confidence interval for $g(\hat{\theta})$ is $g(\hat{\theta}) \pm 1.96 \sqrt{\mathrm{var}(g(\hat{\theta}))}$.

It is assumed here (and in all subsequent fixed effect models) that the weights are known and fixed when, in fact, they are generally estimated from the data. Hardy (1995) states that "it is clear from the simulations that when calculating a variance for an overall fixed effect estimate in a practical situation, using the standard estimate $1 / \sum_{i=1}^{r} w_i$ is likely to produce a value which is too small." Although she provides an adjustment for the variance, she states that using the unadjusted variance will not be problematic when the number of observations within each trial is large.

The Mantel-Haenszel estimate, used for dichotomous outcomes, is a weighted average of the individual odds ratios. Here, as with the Mantel-Haenszel test, we can consider the results of the individual trials as separate strata. If we do so, the weighted average is calculated with the weight for each study given by $b_i c_i / N_i$, and thus the estimate of the odds ratio is

$$ \widehat{OR}_{MH} = \frac{\sum_{i=1}^{r} a_i d_i / N_i}{\sum_{i=1}^{r} b_i c_i / N_i} \,. $$

Several researchers have looked at the variance in this case under different limiting conditions. Robins, Breslow, and Greenland (1986) demonstrate in both sparse-data and large-strata limiting models that if we let

$$ P_i = (a_i + d_i)/N_i, \quad Q_i = (b_i + c_i)/N_i, \quad R_i = a_i d_i / N_i \quad \text{and} \quad S_i = b_i c_i / N_i, $$

then the variance is given by

$$ \mathrm{Var}\!\left(\log \widehat{OR}_{MH}\right) = \frac{\sum_{i=1}^{r} P_i R_i}{2\left(\sum_{i=1}^{r} R_i\right)^2} + \frac{\sum_{i=1}^{r} (P_i S_i + Q_i R_i)}{2\left(\sum_{i=1}^{r} R_i\right)\left(\sum_{i=1}^{r} S_i\right)} + \frac{\sum_{i=1}^{r} Q_i S_i}{2\left(\sum_{i=1}^{r} S_i\right)^2} \,. $$

Then, as with all fixed effect models, we can calculate an appropriate confidence interval for the overall odds ratio.
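A sketch of the Mantel-Haenszel pooled odds ratio with the Robins-Breslow-Greenland variance above, again on hypothetical counts:

import numpy as np

tables = np.array([
    [15, 85, 25, 75],
    [ 8, 42, 12, 38],
    [20, 180, 30, 170],
])
a, b, c, d = tables.T.astype(float)
N = a + b + c + d

R, S = a * d / N, b * c / N
P, Q = (a + d) / N, (b + c) / N

or_mh = R.sum() / S.sum()
var_log = (np.sum(P * R) / (2 * R.sum()**2)
           + np.sum(P * S + Q * R) / (2 * R.sum() * S.sum())
           + np.sum(Q * S) / (2 * S.sum()**2))
se = np.sqrt(var_log)
log_or = np.log(or_mh)
print(f"OR_MH = {or_mh:.3f}, 95% CI "
      f"({np.exp(log_or - 1.96*se):.3f}, {np.exp(log_or + 1.96*se):.3f})")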

Peto (1987) considers a variation on the Mantel-Haenszel estimate where the observed and expected cell counts are pooled across studies. The line of reasoning is that if there were no treatment effect, the observed minus the expected number of cases for each trial would have an equal chance of being positive or negative, and the total of all the differences across all the studies would be close to zero. If, however, a treatment were effective, then the differences $(O_i - E_i)$ would tend to be negative. Peto felt that although this negative difference might not be noticeable in a single study, it would be present in the total across many studies. Moreover, dividing the difference by its variance is a reasonable approximation to the log odds ratio when the odds ratio is not far from unity, because the approximation is the first Newton-Raphson step, centered at zero, of the maximum likelihood estimate (as in generalized linear models). Peto states that, in general, a parameter may be estimated by $Z_i / V_i$, where $Z_i$ is the efficient score statistic and $V_i$ is Fisher's information. Here $(O_i - E_i)$ is the efficient score statistic for the log odds ratio, and its variance $V_i$ is Fisher's information. Then we have

$$ \hat{\psi}_{pool} = \exp\!\left(\frac{\sum_{i=1}^{r} (O_i - E_i)}{\sum_{i=1}^{r} V_i}\right), \quad \text{where} $$

$$ O_i = a_i, \qquad E_i = \frac{(a_i + b_i)(a_i + c_i)}{N_i}, \qquad V_i = \frac{(a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i)}{N_i^2 (N_i - 1)} \,. $$

Once again, it can be shown that this is a special case of a general weighted mean, with

$$ \hat{\theta}_i = \frac{O_i - E_i}{V_i} \quad \text{and} \quad w_i = V_i = \frac{1}{\mathrm{var}(\hat{\theta}_i)} \,. $$
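The corresponding computation for Peto's pooled odds ratio, sketched on the same hypothetical counts:

import numpy as np

tables = np.array([
    [15, 85, 25, 75],
    [ 8, 42, 12, 38],
    [20, 180, 30, 170],
])
a, b, c, d = tables.T.astype(float)
N = a + b + c + d

O = a
E = (a + b) * (a + c) / N
V = (a + b) * (c + d) * (a + c) * (b + d) / (N**2 * (N - 1))

log_psi = np.sum(O - E) / np.sum(V)   # efficient score over Fisher information
se = 1.0 / np.sqrt(np.sum(V))
print(f"Peto OR = {np.exp(log_psi):.3f}, 95% CI "
      f"({np.exp(log_psi - 1.96*se):.3f}, {np.exp(log_psi + 1.96*se):.3f})")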

An example of a meta-analysis that uses a Mantel-Haenszel approach to estimation is Sullivan, Neale and Kendler's (2000) meta-analysis of family studies (twin and adoption), which looks at the genetic epidemiology of major depression. The authors identified all relevant primary studies on MEDLINE and found five family studies that met their predefined selection criteria. Though they do not make explicit the differences in combining quasi-experimental data and experimental data, they did test the hypothesis that the odds ratios from these studies were homogeneous. They used a Mantel-Haenszel approach to combine the estimates to show that the five studies provided consistent evidence in support of the genetic component of major depression.

1.1.3 Random Effect Models


However, fixed effect methods do not provide unbiased results when the heterogeneity of θ is

not explained by known differences in the studies. For example, if the patients under treatment in

one center had differential exposures from those in other centers and these confounders were not

measured, a fixed effect model would be biased. In general, there are differences in study

designs, patient type, treatment duration, compliance, recruitment and regimen. Consequently,

heterogeneity is present and important if the differences cannot be accounted for in the analysis.

If we start by assuming homogeneity, most researchers agree that a formal test for heterogeneity should be done, and if there is heterogeneity the reasons for it should be examined (Cochran 1954, Higgins and Thompson 2002, Higgins et al. 2003, Egger et al. 1997a). We agree that this is the case. We can test for heterogeneity using a test statistic derived from the squared deviations of the study estimates from the true overall mean. If we consider the study variances fixed and let $w_i = 1/v_i$, then $w_i (\hat{\theta}_i - \theta)^2$ has a $\chi^2$ distribution with 1 degree of freedom. Summing across all r studies gives a statistic with a $\chi^2$ distribution with r degrees of freedom. As the true value is not known, we must estimate it with the average of all the studies, thus losing a degree of freedom. We now have a test statistic with r − 1 degrees of freedom (Cochran's Q):

$$ Q = \sum_{i=1}^{r} w_i (\hat{\theta}_i - \hat{\theta})^2 \quad \text{where} \quad \hat{\theta} = \frac{\sum_{i=1}^{r} w_i \hat{\theta}_i}{\sum_{i=1}^{r} w_i} \,. $$


This assumption of fixed weights means that the power of the test is artificially increased. If we consider that the weights are not fixed, then the test statistic is not $\chi^2_{r-1}$ and the test is not valid (Hardy 1995).
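A minimal sketch of Cochran's Q on hypothetical study estimates, treating the weights as fixed (exactly the assumption criticized above):

import numpy as np
from scipy.stats import chi2

theta_hat = np.array([0.21, 0.35, -0.05, 0.18])
v = np.array([0.04, 0.09, 0.02, 0.05])
w = 1.0 / v

pooled = np.sum(w * theta_hat) / np.sum(w)
Q = np.sum(w * (theta_hat - pooled) ** 2)
df = len(theta_hat) - 1
print(f"Q = {Q:.3f} on {df} df, p = {chi2.sf(Q, df):.4f}")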

When we break the assumption of homogeneity underlying the fixed effect model, the assumption becomes that the true treatment effect of each trial is normally distributed with mean $\theta$ and between-study variance $\sigma_B^2$. Inference is based on the assumption that effects are independently and identically sampled from an overall population of all such effects. Although such an infinite population does not exist, it is an important assumption in the classical parametric approach. The assumption that the effects are random implies that we must include a between-study as well as a within-study component of variation when estimating an effect size and a confidence interval (Demets 1987). We could then assume that the variability beyond the individual sampling error is random: that it stems from random differences among studies whose sources cannot be identified. This will normally mean that the confidence interval is at least as wide as that from a fixed effect model. Thus random-effects methods will tend to be more conservative.

If the researcher believes that there is a reasonably coherent model that the experimental data fit into when the data are not homogeneous, she can still attempt to measure an average effect. However, this decision is not of a statistical nature, but is rather conceptual. Here we no longer have a common effect, but we believe that the effects from the studies are still linked. When a common effect cannot be assumed because we have heterogeneity, yet the effects can be assumed to be similar, it is appropriate to use a random-effects model (DerSimonian and Laird 1986) or a maximum likelihood estimate (Normand 1999, Higgins et al. 2001). DerSimonian and Laird (1986) construct an estimator using the method of moments, equating the heterogeneity test statistic to its expected value. They assume that each trial's estimator has a normal distribution with mean $\theta_i$ and variance $v_i$. Let $\hat{\theta}$ be a pooled estimate of effect size (with weights chosen to be the inverse of the variances); then, to calculate the DerSimonian-Laird estimator, use

$$ E(\chi^2) = (r - 1) + \tau^2 \left( \sum_{i=1}^{r} w_i - \frac{\sum_{i=1}^{r} w_i^2}{\sum_{i=1}^{r} w_i} \right) $$

and

$$ \hat{\tau}^2 = \max\left( 0, \; \frac{\chi^2 - (r - 1)}{\sum_{i=1}^{r} w_i - \sum_{i=1}^{r} w_i^2 \big/ \sum_{i=1}^{r} w_i} \right). $$

Using this approach (DerSimonian and Laird 1986, Whitehead and Whitehead 1991), the estimate of the mean effect is

$$ \hat{\theta}_{DL} = \frac{\sum_{i=1}^{r} \hat{\theta}_i / (\hat{\tau}^2 + v_i)}{\sum_{i=1}^{r} 1 / (\hat{\tau}^2 + v_i)} \,, $$

where $\hat{\tau}^2$ is a point estimate of the heterogeneity variance. This is a generalization of the fixed effect approach, which we recover when $\hat{\tau}^2 = 0$. However, it should be noted that the weights are themselves estimates and, as with fixed effects, this will lead to an overestimation of the between-study variance (Hardy 1995). As with the fixed-effect models, using the normal approximation, an approximate 95% confidence interval may be obtained as $\hat{\theta}_{DL} \pm 1.96 \sqrt{\mathrm{var}(\hat{\theta}_{DL})}$.
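The method-of-moments calculation can be sketched in a few lines; the estimates and variances are hypothetical:

import numpy as np

theta_hat = np.array([0.21, 0.35, -0.05, 0.18])
v = np.array([0.04, 0.09, 0.02, 0.05])
w = 1.0 / v

pooled_fe = np.sum(w * theta_hat) / np.sum(w)
Q = np.sum(w * (theta_hat - pooled_fe) ** 2)
r = len(theta_hat)

# method-of-moments estimate of the between-study variance
tau2 = max(0.0, (Q - (r - 1)) / (w.sum() - (w**2).sum() / w.sum()))

w_re = 1.0 / (tau2 + v)
pooled_re = np.sum(w_re * theta_hat) / np.sum(w_re)
se = np.sqrt(1.0 / np.sum(w_re))
print(f"tau2 = {tau2:.4f}, pooled = {pooled_re:.3f} "
      f"(95% CI {pooled_re - 1.96*se:.3f}, {pooled_re + 1.96*se:.3f})")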

The principal inference in a random effects model is an estimate of the underlying mean effect; this is the average effect and not the overall effect. One can also use this type of model to test whether an effect exists in any study, has a consistent direction, or is predicted to exist in a new study. Although one can estimate the study-specific effects if the studies are distinct from one another in ways that cannot be quantified, a more important feature of a random effects model is that it assumes that the studies come from a distribution. This implies that they can be used for prediction. Higgins et al. (2009) state that 'predictions are one of the most important outcomes of a meta-analysis, since the purpose of reviewing research is generally to put knowledge gained into future application'. However, this prediction is limited by the precision of the parametric specification of the random effects model.

Li and Begg (1994) have looked at using random-effects models for combining results from controlled and uncontrolled studies. They state that combining is advisable only if the uncontrolled studies are no more biased than the controlled ones. They describe a model where the baseline effects vary from study to study and the treatment effect is assumed constant. They make no assumptions on the distribution of the baseline effects, except that the first and second moments are finite. However, they assume that the sampling variances are known and the studies are independent. Then the weighted estimator is the difference of two weighted estimators of the controlled and uncontrolled studies, where the weights depend on the between-study variation. Suppose there are n comparative trials, where $x_i$ is the observed effect of treatment 1 and $y_i$ is the observed effect of treatment 2. Then let there be k uncontrolled studies of treatment 1 with observed effects $u_i$, and m uncontrolled studies of treatment 2 with observed effects $v_i$. Let $z = (x', y', u', v')'$; then under the aforementioned assumptions we have

$$ E(z) = X\phi, \qquad \mathrm{var}(z) = \Sigma, $$

where

$$ X = \begin{pmatrix} 1_n & 0 \\ 1_n & 1_n \\ 1_k & 0 \\ 1_m & 1_m \end{pmatrix}, \qquad \phi = \begin{pmatrix} \mu \\ \delta \end{pmatrix}, $$

and

$$ \Sigma = \begin{pmatrix} D(\sigma^2 + S_i^2(x)) & \sigma^2 I & 0 & 0 \\ \sigma^2 I & D(\sigma^2 + S_i^2(y)) & 0 & 0 \\ 0 & 0 & D(\sigma^2 + S_i^2(u)) & 0 \\ 0 & 0 & 0 & D(\sigma^2 + S_i^2(v)) \end{pmatrix}, $$

where $1_n$ is the n-vector $(1, \ldots, 1)'$, $D(\sigma^2 + S_i^2(x))$ is a diagonal matrix with diagonal elements $\{\sigma^2 + S_1^2(x), \ldots, \sigma^2 + S_n^2(x)\}$ (and similarly for $y$, $u$ and $v$), $\sigma^2$ is the model variance, the $S_i^2(\cdot)$ are the sampling variances estimated from each study, and $I$ is the identity matrix.

1.1.4 Regression Methods

When there is heterogeneity present and we can model the effect using covariates, a meta-regression is recommended. Meta-regression is an extension of fixed and random model meta-analysis, and a generalization of subgroup analyses (Thompson and Higgins 2002). It examines the relationship between one or more study-level characteristics and the treatment effect observed in the studies. The experiments included in a meta-analysis often differ in ways that would be expected to influence the outcome. Covariates in the analysis might include risk stratification, aspects of the design, dose of the drug, length of the treatment, or choice of patients (Smith 1997).

(Smith 1997).

In the simplest case, we can extend a fixedeffect model to a fixedeffects metaregression. We

still assume that the estimate of the treatment effect θi is considered a random variable and has a

normal distribution with mean θi and variance vi . We also assume that there is homogeneity of

treatment effects across all studies, that is θ1 = θ 2 = ... = θ n = θ . Then we would also assume a

2 model of the form Y = X 1β1 + ... + X k β k + ε , where the E(ε ) = 0 and E( εε ′) = σ Φ . Here

we assume that we can explain all betweenstudy heterogeneity with the observed covariates. A

fixed effects metaregression model thus allows for within study variability, but no between study

variability because all studies have expected fixed effect size. Again, parameter estimates are biased if between study variation cannot be ignored. Furthermore, as with all fixed effects models one cannot generalize to the whole population.
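A minimal sketch of such a fixed-effects meta-regression as a weighted least squares fit, with a hypothetical study-level covariate (dose) and weights $1/v_i$:

import numpy as np

theta_hat = np.array([0.21, 0.35, -0.05, 0.18])
v = np.array([0.04, 0.09, 0.02, 0.05])
dose = np.array([10.0, 20.0, 5.0, 15.0])  # hypothetical study-level covariate

X = np.column_stack([np.ones_like(dose), dose])
W = np.diag(1.0 / v)

# weighted least squares: beta = (X'WX)^{-1} X'W theta
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ theta_hat)
print(f"intercept = {beta[0]:.3f}, slope per unit dose = {beta[1]:.4f}")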

A random effects regression approach is a logical extension of the random effects model, where the underlying heterogeneity is modeled using covariates. When we cannot explain all the between-study heterogeneity, we should use random effects meta-regression. A key methodological problem is that meta-regression should be weighted to take account of both within-study variances and the residual between-study heterogeneity, in this case the heterogeneity not explained by the covariates in the regression (Higgins and Thompson 2002). This analysis is often done in conjunction with a careful exploration of the sources of heterogeneity. However, as with all modeling exercises, the choice of model and covariates, and uncertainty about what to do with confounders, are important issues.

There are various approaches to obtaining parameter estimates from the random effects regression model. Berkey et al. (1995) suggest a random effects model for the synthesis of 2x2 tables and introduce an iterative approach which alternates between estimating $\tau^2$ and performing a weighted least squares analysis. Others have suggested using maximum likelihood estimates; these are covered in more detail in Section 1.1.5. The area of individual data meta-analysis is relatively new. It is rare to have access to the complete data sets from many studies, and so the first approaches have used a two-stage approach for meta-analysis models with individual patient data (Higgins et al. 2001, Whitehead et al. 2001). When the individual data come from case-control studies, a similar two-stage approach has been suggested that adjusts for confounders (Stukel et al. 2001). When there are both individual patient data and study-level covariates, Higgins suggests the use of a multilevel model. It may be used to describe the effects of both patient-level and study-level covariates, with random effects for the coefficients at the between-trial level. Here Higgins suggests the following hierarchical linear model:

$$ y_i = \sum_{k} \beta_k x_{ik} + \delta_i + \varepsilon_i \,, $$

where $x_{i0} = 1$ (so that $\beta_0$ acts as an intercept), $\delta_i \sim N(0, \tau^2)$ and $\varepsilon_i \sim N(0, \sigma_i^2)$. This is a two-level model, with one level for observations within studies and one for variation between studies (Goldstein et al. 2000, Turner et al. 2001).

Up to this stage we have not discussed time-to-event data as an outcome we would want to pool. As time to event is a common outcome from a clinical trial, there has been research on meta-analysis regression methods for time-to-event data. The most common approach in the analysis of time-to-event individual patient data has been a stratified logrank analysis. For example, we may have survival curves calculated from each of the trials we wish to combine; we would then look into ways of averaging over those curves. Smith and Williamson (2007) compare the logrank analysis with an inverse-variance weighted average of Cox model estimates and a stratified Cox regression. They find that when the hazard ratio is close to 1 in the presence of little heterogeneity, all three methods have similar results. None of the methods had good coverage when there was low or moderate heterogeneity present.

1.1.5 Likelihood Based Methods

Maximum likelihood theory is widely used for estimation and inference. Brockwell and Gordon (2001) suggest a likelihood approach to meta-analysis in place of a random-effects model. Both the fixed and random-effect models previously described were set up with the distributional assumption of normality. Under the random-effects model, the marginal distribution of each individual estimate $\hat{\theta}_i$ is therefore normal with mean $\theta$ and variance $v_i + \sigma_B^2$, and the contribution of each study to the likelihood is

$$ L_i(\theta, \sigma_B^2) = \frac{1}{\sqrt{2\pi (v_i + \sigma_B^2)}} \exp\!\left\{ \frac{-(\hat{\theta}_i - \theta)^2}{2(v_i + \sigma_B^2)} \right\}. $$

The full likelihood is the product of the individual study likelihoods. In practice, it is the log-likelihood that we use, for computational reasons:

$$ l(\theta, \sigma_B^2) = -\frac{1}{2} \sum_{i=1}^{r} \ln\!\left\{ 2\pi (v_i + \sigma_B^2) \right\} - \sum_{i=1}^{r} \frac{(\hat{\theta}_i - \theta)^2}{2(v_i + \sigma_B^2)} \,. $$

Maximum likelihood estimates of $\theta$ and $\sigma_B^2$ are derived by setting the partial derivatives of this log-likelihood to zero, and subsequently we have

$$ \hat{\theta}_{ml} = \frac{\sum_{i=1}^{r} \hat{\theta}_i / (v_i + \hat{\sigma}_{Bml}^2)}{\sum_{i=1}^{r} 1 / (v_i + \hat{\sigma}_{Bml}^2)} $$

and

$$ \hat{\sigma}_{Bml}^2 = \frac{\sum_{i=1}^{r} \left\{ (\hat{\theta}_i - \hat{\theta}_{ml})^2 - v_i \right\} / (v_i + \hat{\sigma}_{Bml}^2)^2}{\sum_{i=1}^{r} 1 / (v_i + \hat{\sigma}_{Bml}^2)^2} \,. $$

We can then use an EM algorithm to obtain the maximum likelihood estimates (MLEs) iteratively, starting from an initial value for the between-study variance; SAS, S-Plus and R can all compute MLEs directly from the log-likelihood. We use the fact that $-2\{ l(\theta, \sigma_B^2) - l(\hat{\theta}_{ml}, \hat{\sigma}_{Bml}^2) \}$ has a $\chi^2_2$ distribution to calculate a joint confidence region for $\theta$ and $\sigma_B^2$. The approximate 95% likelihood-based confidence region is then the set of $(\theta, \sigma_B^2)$ with $l(\theta, \sigma_B^2) > l(\hat{\theta}_{ml}, \hat{\sigma}_{Bml}^2) - 5.991/2$, where 5.991 is the 95% cut point of the $\chi^2_2$ distribution.
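Rather than coding the EM iteration, the log-likelihood above can be maximized numerically; a hedged sketch using scipy's general-purpose optimizer on hypothetical data:

import numpy as np
from scipy.optimize import minimize

theta_hat = np.array([0.21, 0.35, -0.05, 0.18])
v = np.array([0.04, 0.09, 0.02, 0.05])

def neg_loglik(params):
    # negative of the random-effects log-likelihood given above
    theta, sigma2_b = params
    total_var = v + sigma2_b
    return 0.5 * np.sum(np.log(2 * np.pi * total_var)
                        + (theta_hat - theta) ** 2 / total_var)

res = minimize(neg_loglik, x0=[theta_hat.mean(), 0.01],
               bounds=[(None, None), (0.0, None)])  # sigma2_B >= 0
theta_ml, sigma2_ml = res.x
print(f"theta_ml = {theta_ml:.3f}, sigma2_B_ml = {sigma2_ml:.4f}")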

The marginal and conditional likelihoods are available assuming certain distributions, the normal being one of these. The profile log-likelihood (that is, the likelihood which takes into account that the other parameter is unknown and must be estimated) can be used in all circumstances, and the maximum profile likelihood estimate is equal to the overall maximum likelihood estimate. The profile likelihood can then be plotted against the other parameter to find the maximal point (Hardy and Thompson 1996).

All the methods outlined above were derived under assumptions made in the experimental case. We will see in the next chapters how we need to link the ideas of probability sampling to the experimental framework. With those links, we will provide modifications to hypothesis testing and guidance on when to use the different types of estimation when combining survey data.

1.2 Probability Sampling

In an experiment, we randomly assign treatment(s) and a control to individuals to create two (or more) probabilistically equivalent groups. We use random allocation to reduce bias and to ensure that the results of the trial are internally valid, and we assume the treatment effect is a random variable for analysis purposes. In the case of random sampling, the population is finite and its characteristics are fixed: it is the selection of individuals that is random. Random selection is used to generalize the results to the complete finite population.

Consider a finite population of size N, each unit of which can be identified by a label, denoted by the set U = {1, …, N}. From this finite population we generate each possible sample from a sampling design $S_1, S_2, S_3, \ldots, S_v$ with a known probability $p(S_i)$ (for simplicity we will assume each sample has a fixed sample size of n units, though this is not necessary). We select one of the $S_i$ by a random process in which each sample $S_i$ is selected with its corresponding probability. We can see that if the selection probabilities for each sample are known, then each unit in the population has a known probability $\pi_i$ of being in the sample. Additionally, we can also know the joint probability $\pi_{ij}$ of any two units appearing in our sample. The sample is chosen with a fixed design (and therefore known probabilities) in order to estimate some attribute of the finite population. If we also have $\pi_i > 0$ and $\pi_{ij} > 0$ for all $i \neq j$, then the sampling design is called measurable, and we can calculate valid design-based variance estimates and confidence intervals from the survey data (Cochran 1977, Sarndal et al. 1992, Thompson 1997, Lohr 1999). Design-based survey estimation and inference is discussed in more detail in Chapter 2.
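As a small illustration of estimation with known inclusion probabilities, here is a sketch of the Horvitz-Thompson estimator of a population total (introduced formally in Chapter 2); the sample values and inclusion probabilities are hypothetical:

import numpy as np

y = np.array([12.0, 7.5, 20.1, 15.3])    # observed values for sampled units
pi = np.array([0.10, 0.05, 0.20, 0.10])  # known inclusion probabilities

# each sampled unit "represents" 1/pi_i units of the finite population
t_ht = np.sum(y / pi)
print(f"Horvitz-Thompson estimate of the total: {t_ht:.1f}")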

1.3 Superpopulations

Though design-based sampling theory is concerned with inference for finite population parameters, we can extend inference to a general or superpopulation model. Superpopulation modeling occurs when the researcher specifies or assumes a random mechanism that generated the finite population. This is called model-based, as opposed to design-based, inference, where the model is the statistical conceptualization of a superpopulation.

Several authors agree that this framework can extend the traditional statistical approach to sampling by considering the random variables $Y_1, Y_2, \ldots, Y_N$ to be generated from some model (Rubin-Bleuer and Schiopu-Kratina 2005, Royall 1970, Royall et al. 1973a, 1973b, Godambe and Thompson 1986, Fuller 1975, Cochran 1977, Hajek 1964, Graubard and Korn 2002, Sarndal et al. 1992). The values $y_1, y_2, \ldots, y_N$ for the finite population are then one realization of these random variables. The joint distribution of $Y_1, Y_2, \ldots, Y_N$ is considered to link the sampled values to the units not sampled. This concept is considered in more detail in Chapter 2.

1.4 Combining Surveys

Survey methodologists have also developed methods of combining data or estimates from surveys, though they do not generally refer to it as meta-analysis. Perhaps the most vocal advocate of combining survey data was Leslie Kish, who discussed such combination in a series of articles from the seventies until the end of his career (1979, 1981, 1983, 1986, 1990, 1994, 1999). There he defines the idea of a rolling sample, where one jointly selects k non-overlapping probability samples (panels), each of which constitutes 1/F of the population. One panel is interviewed each time period until all k surveys have been completed. The final estimate depends on the domain size and precision needed, and he suggested that one could cumulate panels right up to the total k/F of the population. A rolling sample design with F = k is a rolling census. He also introduces the distinct ideas of cumulating cases and combining population surveys in these papers, stating that cumulating is the term one should deploy when working with periodic or rolling samples, while combining should be the term of choice when one is dealing with multi-populations or multi-domains. However, this is an approach wherein the design of the samples is integral to the idea of combining, which is not the case in meta-analysis. In meta-analysis we are doing a form of analysis for which the original samples were not designed.

Kish does talk about combining data when the surveys were not originally designed with that in mind, mostly in the context of combining international surveys. For this case, Kish (1994) suggests different weighting methods, mathematically similar to fixed effect models (i.e., weighted means), dependent on the estimation and data problem, much as is done in meta-analysis. He describes the idea of combining different surveys from different countries using a pooled or average estimate that is weighted by the population size of the country. Morton (1999) notes that "Kish's population weights approach is akin to equal effects; his suggestion to give each country's estimate an equal weight is fixed effects; and his use of country weights to optimize least-square loss is in the spirit of random effects." Kish urged his colleagues in the 1999 paper to extend these ideas into a "theory of combining populations", although the focus was to estimate the finite population characteristics. In Chapter 3, we will see how Morton is incorrect in his assumption that a country weight is akin to random effects. In the case of survey data, it is the assumptions about the mechanism which generated the data that determine when one should use a random effects model.


Following Kish's work, survey statisticians have considered several different cases where they want to pool data. The first and most straightforward example is when the different surveys are subpopulations, mutually exclusive and exhaustive, and the selection probabilities are known for all units in the target population. In these cases, the combined data can be estimated by treating the separate populations as sampling strata (Cochran 1977). This procedure is comparable to the Mantel-Haenszel approach for fixed effect meta-analysis. Generally, though, this is done as part of the design stage of a survey, whereas meta-analysis is done after estimation. Another well-studied case for survey statisticians is that of combining surveys whose sampled populations overlap. Then, as in the case of mutually exclusive and exhaustive strata, the essential assumption required to produce design-consistent estimates is that all the inclusion probabilities are known. When we have groups that are not mutually exclusive, the researcher must be able to subtract out the overlapping portions in order to properly estimate the finite population parameter. This problem has been considered in detail in the context of double sampling (Cochran 1977) and survey design with multiple overlapping frames (Lohr and Rao 2000, Fuller 1972). It too resembles a case of fixed effect meta-analysis. However, this situation in the survey context is studied when we are designing surveys, not analyzing data. And although some of the methods used to combine surveys at the design stage could be potentially useful for analysts doing meta-analysis, it is almost impossible for an analyst to properly estimate the overlap between surveys.

Associated with the idea of overlapping surveys is the problem of multiple frames. A frame is a

list of individuals from which we want to sample, that is, a list of our target population. Hartley

(1962), Fuller (1972) and Burmiester (1972), Bankier (1986), Kalton and Anderson (1986),

Skinner and Rao (1996) and Lohr and Rao (2000) among others, have studied the problem of 28 inference when there is more than one frame or list of all units in the population. In sampling it is often the case in practice that the frame is incomplete or contains units that are not part of the target population. For example, if one were using a phone book as the frame, one would have to take into account that not all persons have a phone, as well as the fact that some people have unlisted phone numbers. Furthermore, in a typical phone book, there are businesses listed along with people and if one was interested only in people one would have to remove those units from the sample. In this situation, with multiple frames, the researcher is interested in combining estimates from the same finite population and variance estimation when samples are drawn independently from different frames. The level of inference is the finite population and methods are generally designbased. We will see in Chapter 3 that although a multiframe problem is similar to a fixed effect metaanalysis problem, one should not use standard metaanalysis techniques to combine these estimates because of the added issues of overcoverage and undercoverage of the target population.

Cochran (1977) presents the problem of repeated sampling of the same population "apart from changes that the passage of time introduces". This general problem of sampling on more than two occasions has been studied by Yates (1960), Rao, Hartley and Cochran (1962), Patterson (1950), Cassel and Sarndal (1972), Cassel et al (1977), and Sarndal (1980). Cochran (1977) discusses the question of the replacement of the sample over different time periods, depending on whether the researcher is interested in the change in Y from one occasion to another, the average value of Y over all occasions, or the average value of Y for the most recent occasion. He suggests using a double sampling technique, noting that "for estimating change, it is best to retain the sample throughout all occasions; for estimating the average over all occasions, it is best to draw a new sample on each occasion; and for current estimates, equal precision is obtained either by keeping the same sample or by changing it on every occasion; replacement of part of the sample may be better than these alternatives." This case can be thought of as a fixed effect meta-analysis, as estimation here is for a census parameter and not the superpopulation parameter. This is discussed in more detail in Chapter 3.

When we are at the design stage, as we design the survey to estimate a parameter from a finite population, we can correctly assume that we have the same target population and that our inference is for the finite population (albeit a large and oddly defined one in Kish's international survey pooling example). However, as we will see in Chapters 3 and 4, we may not want to estimate something about the finite population; we may be more interested in the superpopulation. For the experimental case it is clear: we are always doing analysis about the underlying model, whereas in the survey case the analyst needs to differentiate between the model and the finite population. An analyst is more often interested in the model that generated the finite population, not the finite population itself. And as we will see in Chapters 3 and 4, after the survey is done we cannot assume that target populations are the same: equivalent target populations is a key assumption that will need to be verified.

The next area that links to meta-analysis is small area estimation. Small area estimation concerns the estimation of finite population characteristics when the domain size is too small for estimation of adequate precision. Marker (1999), Rao (2003) and Merkouris (2004) explore the use of borrowing strength from several surveys of the same population that have collected comparable information for the small domain. Marker suggests that it is possible to use forms of dual-frame estimation to combine the national survey with supplements in specific areas to produce direct estimates. Rao (2003) discusses both direct and indirect estimation (regression using auxiliary data), and Merkouris proposes a generalized estimating equation approach to direct estimation using multiple surveys. Here again most of the estimation methods are design-based and estimation is of the finite population parameter.

Zieschang (1990) and Renssen and Nieuwenbroek (1997) also discuss combining information from multiple surveys using generalized regression estimators. Wu (2004) extends these ideas using the pseudo empirical likelihood to calibrate the estimate using multiple surveys. These authors do not discuss or consider what the inferential model is, nor do they consider if there should be a test of homogeneity for the data. This is unfortunate, as both of these ideas are of great importance in meta-analysis and will be discussed in more detail in Chapter 3. We note that these survey approaches implicitly use the framework of Isaki and Fuller (1982), which invokes a superpopulation model and consists of constructing nested populations and sampling from each population, increasing the sample size each time. This is the approach we will adopt when establishing that convergence is design consistent. It relates to the estimation approaches in more complex meta-analysis. However, the estimation objective of meta-analysis can differ from these, and more details of the difference will be provided in Chapter 3.

Several authors have discussed methods to combine surveys in order to address methodological problems within a single survey. Schenker et al (2007), Schenker and Raghunathan (2007) and Schenker et al (2002) study several issues related to combining surveys. They illustrate how to extend the coverage of a single survey, and how to use survey data and statistical modelling to enhance data and reduce nonsampling errors. They also demonstrate how to combine information from an examination survey and an interview survey using models, and how to combine information from two surveys to enhance small-area estimation. Although they present some of the concepts related to meta-analysis, these researchers only discuss inference for the finite population, and like other authors do not discuss testing homogeneity as is done in meta-analysis.

Survey statisticians who are involved in the design of national surveys have discussed how to combine surveys for analysts. These papers are focused on a particular set of surveys and should not be interpreted in a broader sense or generalized to all surveys. For example, Thomas and Wannell (2007) discuss how an analyst might combine different cycles of the Canadian Community Health Survey (CCHS). The authors present a simple comparison of pooling estimates to pooling individual data. Thomas and Wannell present a fixed effect model and contrast that with a reweighting technique. However, they do not take note of several important assumptions. They do not discuss the issues with the target population, the model, or the finite population. They also do not discuss the issue of heterogeneity. For their weighting adjustment the authors make a strong assumption that the calibrated weights adequately represent the rare population they are measuring. We will use this example more in Chapter 3.

Reweighting data is an approach borrowed from methods to adjust or pool data at the design stage (e.g. poststratification, dual frame surveys). This approach is only possible when the analyst has access to all the individual data and the design. For example, O'Muircheartaigh and Pedlow (2002) compared two different weighting strategies for combining a cross-sectional sample from the US National Longitudinal Survey of Youth 97 with a supplemental sample. The supplemental sample oversampled Hispanic children in order to get more accurate estimates for this subpopulation. They compared creating weights based on cumulating all the cases and deriving appropriate inclusion probabilities for the overall sample, with a combined sample estimate using weights that were adjusted as a linear combination of the two weights. They concluded for this data set that cumulating cases is better than combining the samples. Again, we must stress that for a general audience these approaches make assumptions. As noted above, the researcher needs to know the inclusion probabilities for all the frames, the coverage of the surveys, the underlying target population, and the independence of the two samples.

Another example by Botman and Jack (1995) describes the issues surrounding combining US National Health Interview Survey (NHIS) datasets. The NHIS collects basic health information, and each year there are supplemental surveys on different current health topics. Each of the supplements uses different target populations, sampling mechanisms, and estimation procedures. They note that analysts need to allow for these differences when preparing estimates based on multiple NHIS survey components. They explain the need to consider two types of combined datasets: combined datasets are said to be based on accumulated samples if each constituent dataset is for the same NHIS survey component, but these survey components were fielded in different survey years. Combined datasets are said to be based on overlapping samples if each constituent dataset is for a different survey component, and these survey components used the same sampling mechanism.

For accumulated samples, Botman and Jack note that there can be adverse effects due to exogenous or temporal changes in the data. For example, a change in diagnostic testing or health procedure could affect the measures. There could also be changes in the wording of a question over time. They note that datasets of accumulated samples may be too small to assess whether these temporal or exogenous changes in the questionnaire affect estimation. They also note that one must design and calculate appropriate sampling weights to estimate aggregates of any kind, as the original weights are usually inappropriate for the combined dataset. Finally, they state that the standard design-based variance estimation procedures using the existing generalized variance equations will not be appropriate in this case. This is due to the fact that we have more than one design, and thus more than one probability space over which we are estimating our variances.

In the case of overlapping samples, Botman and Jack note that, as with the accumulated samples, the sampling weights are incorrect and need adjustment. They state that the sample size for this combined sample is no larger than the sum of all the individual datasets and may be much smaller in practice. And again they affirm that the standard design-based variance procedures will be inappropriate. As noted in the previous survey examples, it is important in this type of research to clearly define its scope and methodological limits. Botman and Jack limit themselves to combining data from the same survey, and when the data come from more than one year they note that temporal changes could impact analysis.

A few authors have directly looked at the challenges of data analysis using more than one survey in a general context. Korn and Graubard (1999) discuss analysis using multiple surveys. They note that it may be useful to use multiple surveys when one needs to increase sample size, to assess a time trend or change in a population, or to improve the coverage of a survey. This is often the reason for meta-analysis. They differentiate between the case of estimating differences and an analysis that pools the data, and propose methods to adjust the survey weights when pooling. This is analogous to individual patient data meta-analysis techniques. Again there is little discussion of comparability or heterogeneity, and the methods to adjust weights rely heavily on information that the analyst may not have.

Rao et al (2008) discuss the issues of combining survey data in a meta-analysis. They focus on methodology as it relates to the selection of studies and issues related to heterogeneity. Although they note that the major challenge to combining information is heterogeneity, they fail to note that the application of standard tests will not take into account the possibility of clustering in the data with complex sampling designs. They apply the standard meta-analysis techniques without reference to the different inferential or estimation frameworks.

Roberts and Binder (2009) also discuss analysis based on combining information from multiple surveys. They discuss the type of inference (descriptive, descriptive under assumed relationships, and analytic), and also address the question of estimation under two common scenarios. The first approach is having an overall estimator that is a combination of separate estimators, which corresponds to a traditional meta-analysis approach. The second offers a 'pooled' approach wherein the individual records from all the surveys are combined and the original weights might be rescaled. The new data set is then analyzed as if it were a single sample. This corresponds to an individual patient data meta-analysis. Roberts and Binder discuss the need for an analytical framework, either design based or model based, upon which to frame your estimation question. However, they do not discuss heterogeneity, nor what the expected values of their proposed estimators are.


Combining survey data may seem like a straightforward logical extension of the meta-analytic or standard design based estimation techniques, but researchers often combine survey data erroneously, because there is currently no broad theory available that would clearly delineate constraints for their methodology. Survey statisticians are not taking note that the different surveys that they are combining come from different designs, and thus are defined (in terms of expectation) under different stochastic processes or probability spaces. The recent increase in straightforward access to large survey datasets has led to a proliferation of analyses that group or pool multiple survey results. This ready access, combined with user-friendly computer software, has led researchers to try to borrow strength from the many surveys in an attempt to either improve precision or address research questions for which the original studies were not designed. Researchers have started to employ many different techniques to combine data from complex national surveys. However, in the case of meta-analysis of survey results, a simple application of an experimental framework is not appropriate.

For example, Karen Norberg in 2004 combined the individual data from four large, publicly available American prospective surveys and one American retrospective survey in an attempt to study the effect of partnership status on the human sex ratio at birth. However, she makes no note in her article of the issues related to the meta-analysis of survey data: the finite population, the target population(s), oversampling, ignorability or informativeness of the designs, clustering, or the superpopulation(s). We will discuss these concepts in Chapters 2, 3 and 6. Furthermore, Norberg does not address issues of homogeneity between studies, except to note that because of differences in the study designs she (incorrectly) does not use the weights. Naively, she analyzes the pooled data as if it were a single large dataset, using either an ordinary or a conditional regression model depending on the outcome. She seems ignorant of the role of sample size in significance testing and, furthermore, seems unaware that she is, in fact, conducting a form of meta-analysis (analogous to an individual patient data meta-analysis). This study will be discussed in more detail in Chapter 5.

Although some basic research has begun in this area, no one has directly linked the ideas of meta-analysis and survey data in a comprehensive framework. This thesis will link the two areas, and provide a methodology for estimation of either fixed effect or random effect models that can be easily extended to more complicated estimation techniques.


Chapter 2 Design and Model Based Methods in Sampling

The primary purpose of classical statistics is to provide mathematical models for random experiments. Let us say, for example, that a medical researcher is interested in the effect of a drug that is to be administered to a patient population. The only way he or she can obtain information about this effect is to perform an experiment which will have an associated outcome. Prior to each experiment, the outcome cannot be known with certainty; the researcher may be able, however, to describe what the possible outcomes could be. Imagine we have such an experiment, where we cannot know the outcome with certainty, but where we can describe every possible outcome prior to making the experiment. If we can also repeat this experiment and collect all possible outcomes, the process is called a random experiment and the collection of all possible outcomes is called the experimental space. So here we have an outcome that is random and an experimental space that can be infinite, whereas in probability sampling we have an outcome that is fixed and a sampling space that is finite. One of the simplest examples of such an experiment would be the tossing of a coin. We know that a fair coin should produce an equal number of heads and tails. If we repeatedly toss the coin, we can describe not only the outcome of each toss as being either heads or tails, but we can produce a model for the coin tossing and, within this framework, make inferences about such things as the fairness of the coin.

Treating the outcomes collected from the survey as fixed, survey statisticians characterize randomness as the selection of units into the sample. Also, survey statisticians in a design based framework do not sample from an infinite population. Since design based or probability sampling deals with finite rather than infinite populations, and it is the selection of units rather than the outcomes that is random, estimation and interpretation in design based sampling are distinct from other areas in statistics such as experimental design. It is important to clarify these differences for readers unfamiliar with survey sampling, and this chapter thus provides background on common methods and concepts used in sampling.

2.1 Design Based Sampling

In design based or probability sampling, the sample is chosen with a fixed design (and therefore known probabilities) in order to estimate some attribute of the finite population. Estimation is done using a nonparametric approach where we assume that the sampling is random and the attributes that we are measuring are fixed and finite. As noted in Section 1.3 and in many introductory texts, probability sampling is an approach to selecting samples that satisfies the following conditions (Sarndal):

1. We can define a distinct set of samples, $S_1, S_2, \dots, S_v$, which the sampling procedure is capable of selecting if applied to a specific population. This means we can say precisely which units belong to $S_1$, to $S_2$, and so on.

2. Each possible sample $S_i$ has assigned to it a known probability of selection $p(S_i)$, and the probabilities of the possible samples sum to 1.

3. This procedure gives every unit a known nonzero probability of selection.

4. We select one of the $S_i$ by a random process in which each sample has its appropriate probability of selection $p(S_i)$.

More formally, suppose that there is a finite population consisting of $N$ units, where $N$ is known and usually regarded as a constant rather than a random variable, denoted by the index set $U = \{1, 2, \dots, N\}$. Let each index be associated with a unique vector $(y_i, x_i, z_i)$, where the $y_i$ are the dependent variables, a characteristic of interest of our population for the $i$th unit; the $x_i$ are our independent variables or auxiliary information for the population; and the $z_i$ are the prior information available at the time of the design of the survey, often called design variables. We will want to estimate the characteristics $y$'s and $x$'s from our population, and we will use the design variables ($z$'s) directly in the calculations of the probabilities for each unit. For example, in a business survey we might stratify by geography, where the $z$'s would be something like province or city, and within each stratum we might take a sample with probability proportional to size, where $z$ is a size variable like profit. Here we might be looking at estimating the average salary within certain businesses, which would be a ratio ($y/x$) of the total salary over the total number of employees.

To continue, to frame finite sampling problems in a more formal way, we define $S$, our sample, to be a subset of $U$ and let $\mathcal{S}$ be the collection of subsets of $U$ that contains all possible samples. Let $P[S = s]$ denote the probability that $s$, $s \in \mathcal{S}$, is selected. Then we can define a sampling design as a function $p(\cdot)$ that maps $s$ to $[0, 1]$ such that $p(s) = P[S = s]$ for any $s \in \mathcal{S}$. As with the experimental case, an important set for us to consider is the set of all possible samples, where $S_1, S_2, S_3, \dots, S_v$ denotes each possible sample from the population under a specific sampling procedure (for simplicity we will assume each sample has a fixed sample size of $n$ units, though this is not necessary). Next, we define $C(\mathcal{S})$ to be the power set of $\mathcal{S}$ (that is, the set of all subsets of $\mathcal{S}$) and $p(S_i)$ as the probability of selecting the $i$th sample. Then we have that a fixed sampling design is a function $p : C(\mathcal{S}) \times \mathbb{R}^{q \times N} \to [0, 1]$ such that:

for all $S_i \in \mathcal{S}$, $p(S_i, \cdot)$ is Borel-measurable;

for $z_N \in \mathbb{R}^{q \times N}$, $p(\cdot, z_N)$ is a probability measure on $C(\mathcal{S})$;

therefore we have that $(\mathcal{S}, C(\mathcal{S}), p)$ is a design probability space, where $p(S_i, \cdot) \ge 0$, $S_i \in \mathcal{S}$.

Finally, the census parameter $\theta_N$, that is, the estimate we would have calculated using all the units in the finite universe, is a function of $y$, $x$, $z$ and is a Borel-measurable function defined on the subset of real numbers spanned by $(y_i, x_i, z_i)$. An estimator $\hat{\theta}_N$ of this finite population parameter associated with a particular design is then a function $\hat{\theta}_N : \mathcal{S} \times \mathbb{R}^N \to \mathbb{R}$, where $\hat{\theta}_N(s, \cdot)$ is also Borel-measurable (Rubin-Bleuer and Schiopu-Kratina 2005).

As noted above, each possible sample has a known probability $p(S_i)$. We select one of the $S_i$ by a random process in which each sample $S_i$ is selected with its corresponding probability. Expectation is then defined across all possible samples of the same design. For example, if we wanted to estimate the total for the population, $t$, where
\[
\theta_N = t = \sum_{i=1}^{N} y_i,
\]
we can define an estimator of the statistic $t$ as $\hat{\theta}_N = \hat{t}_S = N\bar{y}_S$, where $\bar{y}_S$ is the sample mean. Then the expected value of the estimator is defined using the sampling distribution of the statistic and the probability framework:
\[
E[\hat{\theta}_N] = E[\hat{t}\,] = \sum_{S} p(S)\,\hat{t}_S = \sum_{k} k\,P(\hat{t} = k),
\]
where
\[
P\{\hat{t} = k\} = \sum_{S \,:\, \hat{t}_S = k} p(S).
\]
Then we can define the variance of the sampling distribution of $\hat{t}$ as
\[
V[\hat{\theta}_N] = V[\hat{t}\,] = E\big[(\hat{t} - E[\hat{t}\,])^2\big] = \sum_{\text{all possible samples } S} p(S)\,\big(\hat{t}_S - E[\hat{t}\,]\big)^2.
\]
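Because the design expectation is a finite sum over all possible samples, these formulas can be verified by brute force on a toy population. The sketch below is a minimal illustration with made-up y values: it enumerates every simple random sample of size n and confirms that the expansion estimator N times the sample mean is design-unbiased for t.

```python
from itertools import combinations
from statistics import mean

# Toy finite population: the y values are fixed; only the sample is random.
y = [3, 7, 1, 9, 5]          # hypothetical population values
N, n = len(y), 2
t = sum(y)                   # census total

samples = list(combinations(range(N), n))
p = 1 / len(samples)         # under SRS every sample has equal probability

# t_hat(S) = N * sample mean; expectation and variance are sums over samples.
t_hats = [N * mean(y[i] for i in s) for s in samples]
E_t_hat = sum(p * th for th in t_hats)
V_t_hat = sum(p * (th - E_t_hat) ** 2 for th in t_hats)

print(f"census total t = {t}, design expectation E[t_hat] = {E_t_hat}")
print(f"design variance V[t_hat] = {V_t_hat:.4f}")
```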

Figure 2.1 Random Sampling from a Finite Population

[Diagram: a finite population of $N$ units with census parameter $\theta_N$; a sample of $n$ units is drawn with design $p(S)$ and yields the estimate $\hat{\theta}_N$.]

Thus in practice in sampling we have the situation in Figure 2.1, where we have a single finite population and from that population we draw a sample using a fixed design $p(S)$. In Chapter 3 we will see that unbiased design-based estimation is also a special case of a fixed-effect meta-analysis approach.


In order to perform a meta-analysis, as we saw in Chapter 1, we need to have a point estimate and the variance of that point estimate for every study. In the experimental case, the point estimate and variance estimate are based upon the model assumptions about the outcome. In sampling, however, the estimator of the point estimate and the estimator of the variance are both dependent on the sampling design and on any assumptions about the presence or absence of a superpopulation. It is therefore useful at this point to review point and variance estimation for some common survey designs.

2.1.1 Simple Random Sampling

In general, we will not be using all possible samples for estimation. Instead, we will use the fact that if the selection probabilities for each sample are known, then each unit in the population has a known probability $\pi_i$ of being in the sample. As it is the selection of individuals that is random, we can define an indicator variable as follows:
\[
I_i = \begin{cases} 1 & \text{if unit } i \text{ is in the sample} \\ 0 & \text{otherwise,} \end{cases}
\]
and we can see that the probability that element $i$ is included in the sample is obtained from a given design as follows:
\[
\pi_i = P(i \in S) = P(I_i = 1) = \sum_{s \ni i} p(s).
\]

Now consider the calculation of a mean from a simple random sample (SRS). Cornfield (1944) defines
\[
I_i = \begin{cases} 1 & \text{if unit } i \text{ is in the sample} \\ 0 & \text{otherwise;} \end{cases}
\]
then we can see the sample mean is
\[
\bar{y} = \sum_{i \in S} \frac{y_i}{n} = \sum_{i=1}^{N} I_i \frac{y_i}{n}.
\]
Again we note that, as $I_i$ is the only random variable in the equation, the values of $y$ in our population are fixed. For a simple random sample of $n$ units out of the $N$ units in the population, $\{I_1, \dots, I_N\}$ are identically distributed Bernoulli random variables with the probability of selection being
\[
\pi_i = P(I_i = 1) = P(\text{select unit } i \text{ in sample}) = \binom{N-1}{n-1} \bigg/ \binom{N}{n} = \frac{n}{N}.
\]
As a consequence,
\[
E[I_i^2] = E[I_i] = \frac{n}{N}
\]
and
\[
E[\bar{y}\,] = E\left[\sum_{i=1}^{N} I_i \frac{y_i}{n}\right] = \sum_{i=1}^{N} \frac{n}{N}\,\frac{y_i}{n} = \sum_{i=1}^{N} \frac{y_i}{N} = \bar{y}_{\text{pop}}.
\]

Again, as it is $I_i$ that is random, we note that for a simple random sample
\[
\mathrm{Var}[I_i] = E[I_i^2] - (E[I_i])^2 = \frac{n}{N} - \left(\frac{n}{N}\right)^2 = \frac{n}{N}\left(1 - \frac{n}{N}\right)
\]
and, for $i \neq j$,
\[
E[I_i I_j] = P(I_i = 1 \text{ and } I_j = 1) = P(I_j = 1 \mid I_i = 1)\,P(I_i = 1) = \left(\frac{n-1}{N-1}\right)\left(\frac{n}{N}\right),
\]
and consequently, for $i \neq j$,
\[
\mathrm{Cov}[I_i, I_j] = E[I_i I_j] - E[I_i]\,E[I_j] = \frac{n-1}{N-1}\,\frac{n}{N} - \left(\frac{n}{N}\right)^2 = -\frac{1}{N-1}\,\frac{n}{N}\left(1 - \frac{n}{N}\right).
\]
Then
\[
\mathrm{Var}[\bar{y}\,] = \mathrm{Var}\left[\sum_{i=1}^{N} I_i \frac{y_i}{n}\right] = \frac{1}{n^2}\sum_{i=1}^{N}\sum_{j=1}^{N} \mathrm{Cov}[I_i y_i, I_j y_j] = \left(1 - \frac{n}{N}\right)\frac{S^2}{n},
\]
where
\[
S^2 = \sum \frac{(y_i - \bar{y}_{\text{pop}})^2}{N-1}.
\]
As with the experimental case, we do not know the true value of $S^2$, so we use an unbiased estimate calculated from the survey data:
\[
s^2 = \sum_{i \in S} \frac{(y_i - \bar{y})^2}{n-1}.
\]

Although it is common practice in the experimental case to say we have a random sample of units or outcomes from the population, there is an important difference between the experimental random sample case and the finite sampling SRS case. We can see that the estimator for the variance under SRS is not the same as the estimator used in the experimental case ($S^2/n$): the design-based estimator for the variance includes a finite population correction term, or fpc, $(1 - n/N)$. In the experimental case, we are dealing with a theoretically infinite population from which we can select independent units. In the sampling case, we have a finite population and therefore the selection of the units is not independent. We account for that in the above calculation through the covariance of $I_i, I_j$, which is what gives rise to the fpc.

2.1.2 Horvitz-Thompson Estimation

Design-unbiased estimates can be generated from any sampling design by using the sampling or design weights. As noted above, the probability that element $i$ is included in the sample is obtained from a given design as follows:
\[
\pi_i = P(i \in S) = P(I_i = 1) = \sum_{s \ni i} p(s).
\]
If we once again look at the probability of inclusion in the sample for a more general design, we can see that when a fixed sample of $n$ units is selected without replacement, the following relations hold:
\[
\sum_{i=1}^{N} \pi_i = n,
\]
\[
\sum_{j=1,\, j \neq i}^{N} \pi_{ij} = (n-1)\,\pi_i, \quad \text{and}
\]
\[
\sum_{i=1}^{N} \sum_{j > i} \pi_{ij} = \frac{1}{2}\,n(n-1).
\]

Under this framework, Horvitz and Thompson (1952) suggested the following design-based estimator for the population total:
\[
\hat{y}_{HT} = \sum_{i=1}^{n} \frac{t_i}{\pi_i},
\]
where $1/\pi_i$ is the weight for each sampled unit. This weight is interpreted as the number of units in the population that the sampled unit represents. These sampling or design weights are different from the weights used in meta-analysis, and throughout the text we will refer to them solely as design weights or adjusted design weights so as to avoid confusion. The Horvitz-Thompson or design weighted estimator is design-unbiased for $t$:
\[
E[\hat{y}_{HT}] = E\left[\sum_{i=1}^{N} I_i \frac{t_i}{\pi_i}\right] = \sum_{i=1}^{N} \pi_i\,\frac{t_i}{\pi_i} = t.
\]

We can then calculate valid design based variance estimates and confidence intervals based on the survey data using these design weights. We can also sometimes calculate the joint probability $\pi_{ij}$ of any two items appearing in our sample. If we also have the case where $\pi_i > 0$ and $\pi_{ij} > 0$ for all $i \neq j$, then the sampling design is called measurable.

The variance of the Horvitz-Thompson estimator is given by
\[
V[\hat{Y}_{HT}] = \sum_{i=1}^{N} \frac{1-\pi_i}{\pi_i}\,t_i^2 + \sum_{i=1}^{N}\sum_{k \neq i} \frac{\pi_{ik} - \pi_i\pi_k}{\pi_{ik}}\,t_i t_k + \sum_{i=1}^{N} \frac{V(\hat{t}_i)}{\pi_i}
= \frac{1}{2}\sum_{i=1}^{N}\sum_{k \neq i} (\pi_i\pi_k - \pi_{ik})\left(\frac{t_i}{\pi_i} - \frac{t_k}{\pi_k}\right)^2 + \sum_{i=1}^{N} \frac{V(\hat{t}_i)}{\pi_i},
\]
where the second expression is the Sen-Yates-Grundy form (Yates and Grundy 1953, Sen 1953).
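The sketch below illustrates Horvitz-Thompson estimation for the single-stage case, in which the $t_i$ are observed without error and the $V(\hat{t}_i)$ term vanishes, together with the sample analogue of the Sen-Yates-Grundy form as a variance estimator. The data and all inclusion probabilities are invented for the example; a real design would supply every first- and second-order probability.

```python
# Horvitz-Thompson estimation for a hypothetical single-stage design where
# the unit values t_i are observed exactly (so the V(t_hat_i) term is zero).

t_obs = [40.0, 25.0, 60.0]   # observed values for the sampled units
pi = [0.5, 0.25, 0.4]        # first-order inclusion probabilities (invented)
# Second-order inclusion probabilities for the sampled pairs (also invented;
# a measurable design requires all of them to be strictly positive).
pi_jk = {(0, 1): 0.10, (0, 2): 0.15, (1, 2): 0.08}

# Point estimate: sum of design-weighted values, with weight 1/pi_i.
t_ht = sum(t / p for t, p in zip(t_obs, pi))

# Sen-Yates-Grundy variance estimator for a fixed-size design:
# sum over sampled pairs of (pi_j*pi_k - pi_jk)/pi_jk times the squared
# difference of the expanded values.
v_syg = 0.0
for (j, k), pjk in pi_jk.items():
    coeff = (pi[j] * pi[k] - pjk) / pjk
    v_syg += coeff * (t_obs[j] / pi[j] - t_obs[k] / pi[k]) ** 2

print(f"HT estimate of the total: {t_ht:.1f}")
print(f"SYG variance estimate:    {v_syg:.1f}")
```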


2.1.3 Stratified Sampling

Simple random sampling is the simplest form of probability sampling. However, if we know that there are subgroups in our population which differ in terms of what we are measuring, or that are subgroups of interest for which we want reliable estimates, or that we have additional information that can be used to design a more cost-effective or lower variance sampling scheme, stratified sampling, instead of simple random sampling, should be used. For example, if we are measuring incomes of people, we would want a stratified sample since men and women generally have different income distributions. Having a stratified sample in this case would protect us from having a 'bad' sample of only men or only women, would allow us to have a known precision for both men and women, and would give us a lower variance estimate for overall income than simple random sampling.

If we partition $U$ into $L$ mutually exclusive and exhaustive subpopulations, called strata, denoted by $U_1, U_2, \dots, U_L$, where $U_h = \{k : k \text{ belongs to stratum } h\}$, and we select units independently within each of the strata, we have stratified sampling, as illustrated in Figure 2.2.

Figure 2.2 Stratified Sampling

[Diagram: a finite population partitioned into $L$ strata $U_1, \dots, U_L$; independent samples $S_1, \dots, S_L$ are drawn with designs $p_1(S_1), \dots, p_L(S_L)$, so that $p(S) = p_1(S_1)\,p_2(S_2)\cdots p_L(S_L)$.]

As the strata are mutually exclusive and exhaustive and the sampling between strata is independent, we can decompose the population total as
\[
T = \sum_{h=1}^{L} T_h = \sum_{h=1}^{L} N_h\,\bar{y}_{hN},
\]
where $T_h$ is the population total for the $h$th stratum, $\bar{y}_{hN}$ is the stratum mean, and $N_h$ is the stratum size, with the population variance for each stratum given by
\[
S_h^2 = \sum_{j=1}^{N_h} \frac{(y_{hj} - \bar{y}_{hN})^2}{N_h - 1}.
\]
As we can specify any independent sampling plans within each stratum, we can use a Horvitz-Thompson estimator for the stratum totals and sum across the strata to get population totals. The Horvitz-Thompson estimator for the stratum total is
\[
\hat{t}_h^{HT} = \sum_{i=1}^{n_h} \frac{y_i}{\pi_i},
\]
the population total is estimated by
\[
\hat{t}_{str} = \sum_{h=1}^{L} \hat{t}_h^{HT},
\]
which has variance
\[
V(\hat{t}_{str}) = \sum_{h=1}^{L} V(\hat{t}_h^{HT}).
\]
We can see from the formulas that the variance of the estimate depends on the variability within the strata, and thus when subpopulations vary significantly it is useful to sample the subpopulations individually. For greatest precision, while the individual elements within each stratum should be homogeneous, the stratum means should differ from each other as much as possible.

As with other types of research, designing the survey is probably the most important part. In order to design a stratified survey we need to allocate the total sample to the different strata. We can choose to allocate the sample to each of the strata in different ways, the most common being equal, proportional, and optimal allocation schemes. In equal allocation, we assign the same number of units to each stratum. This method is straightforward to implement and is useful when we do not have any idea of the variability within the strata. In proportional allocation, we partition the sample size among the strata proportional to the size of the strata. It is well known that the variance of a stratified sample which uses proportional allocation will always be, at most, that of a simple random sample. Thus, proportional allocation will do at least as well as taking a simple random sample, and in general the more separated the strata, the more efficient proportional sampling will be. In optimal allocation, we choose stratum sample sizes to minimize the variance for a given total sample size n. Of course, this will lead to additional observations being sampled from the more variable strata, and it therefore has an advantage over proportional allocation when the variability in the different strata is large.
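A minimal sketch of the three allocation schemes, using invented stratum sizes and standard deviations, is given below; the optimal (Neyman) allocation shown assumes equal per-unit costs across strata.

```python
# Allocating a total sample of n = 600 across three hypothetical strata.
N_h = [5000, 3000, 2000]   # stratum population sizes (invented)
S_h = [10.0, 25.0, 50.0]   # stratum standard deviations (invented)
n, L = 600, len(N_h)

equal = [n // L] * L                                    # equal allocation

n_total = sum(N_h)
proportional = [round(n * Nh / n_total) for Nh in N_h]  # n_h proportional to N_h

# Optimal (Neyman) allocation with equal per-unit costs: n_h proportional
# to N_h * S_h, which puts more sample in the more variable strata.
# (Rounding may require a small adjustment to hit n exactly.)
weights = [Nh * Sh for Nh, Sh in zip(N_h, S_h)]
optimal = [round(n * w / sum(weights)) for w in weights]

print("equal:        ", equal)
print("proportional: ", proportional)
print("optimal:      ", optimal)
```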

2.1.4 Cluster Sampling

In both simple random sampling and stratified sampling we assume that we can identify all individual units in the population at the time of sampling. If we do not have a list of units, or we cannot make a list because it is difficult, expensive, or impossible, we can still sometimes construct a probability sample. In the case where units in the population are aggregated into mutually exclusive larger units and these units are easily defined, we can use these larger units as our basis for sampling. For example, if we were interested in surveying students about attitudes towards video game violence, a SRS or a stratified sample would require that we have a list of all the students registered across the country. Such a list does not exist and constructing one would not be feasible. However, we could find or construct a list of all the schools, and each school is an aggregation of students. We could then sample schools (primary sampling units, or psu's), and the individual students (secondary sampling units, or ssu's) in the population would be allowed in the sample only if they belong to the schools that are chosen. This type of sampling is called cluster sampling.

Cluster sampling bears a superficial resemblance to stratification: similarly, we partition the population $U$ into $N$ mutually exclusive and exhaustive subpopulations, called primary sampling units (psu's), denoted by $U_1, U_2, \dots, U_N$, where $U_i = \{k : k \text{ belongs to cluster } i\}$. $S$ now denotes a sample of psu's chosen from this population of psu's. However, here the cluster, not the individual, is the sampling unit. We then select a sample of $n$ psu's or clusters. See Figure 2.3 below.

Figure 2.3 Cluster Sampling

[Diagram: a finite population of $N$ clusters (psu's); a sample of $n$ psu's is drawn with design $p(S)$.]


As it could be costly, we need not sample everyone in our psu. Instead, we can extend Figure 2.3 to have a second stage of sampling where, rather than selecting everyone within a psu, we select a sample of units from the $i$th psu. As before, the psu is the sampling unit for the first stage and the individual is the sampling unit for the second stage. Here we also have $S_i$, which denotes the sample of secondary sampling units chosen from the $i$th psu, and $p_i(S_i)$, which denotes the probability for sample $S_i$. See Figure 2.4 below.

Figure 2.4 Two-Stage Cluster Sampling

[Diagram: a finite population of $N$ clusters; in the first phase $n$ psu's are drawn with design $p(S)$, and in the second phase ssu's are drawn within each selected psu with design $p_i(S_i)$.]

But there are several differences worth emphasizing. First, the sample size using this approach is not the number of individuals, but the number of primary sampling units. For example, if we were studying schools and took a sample of 20 schools from a school district and then sent questionnaires to 100 students selected by a simple random sample in each school, the sample size for analyzing this study would be 20, not 2000. Equating the sample size to the number of primary sampling units corresponds to the idea in meta-analysis that it is the number of studies that is our sample size, not the number of individuals contained in those studies. Therefore, to decrease the variability we must select more clusters in the sample, not more individuals within a cluster, as the variance depends on the variability between cluster means. Secondly, when we want the greatest precision, in contrast to stratification, we want the individual elements within each cluster to be heterogeneous and each of the clusters similar to one another. Finally, another important difference is that the individual responses from the ssu's are not independent observations and are, in fact, likely to be positively correlated (Lohr 1999, Cochran 1977). If this correlation is ignored when doing analysis, differences will be found to be statistically significant when they are not.

Such two-stage cluster sampling can be expanded to multi-stage designs. The notation for two-stage cluster sampling is as follows. Let $N$ denote the number of primary sampling units in the population and $M_i$ the number of secondary sampling units in the $i$th psu. Then the total number of ssu's is defined as
\[
K = \sum_{i=1}^{N} M_i.
\]

Analogous to stratified sampling, the total in the $i$th psu is
\[
t_i = \sum_{j=1}^{M_i} y_{ij},
\]
the population total is the sum of all the totals across the psu's, given by
\[
t = \sum_{i=1}^{N} t_i = \sum_{i=1}^{N} \sum_{j=1}^{M_i} y_{ij},
\]
and the population variance of the psu totals is then
\[
S_t^2 = \sum_{i=1}^{N} \frac{\left(t_i - \dfrac{t}{N}\right)^2}{N-1}.
\]

For cluster sampling, the probability of selection is as follows: P($j$th ssu in the $i$th psu is selected) = P($i$th psu is selected) P($j$th ssu is selected $\mid$ $i$th psu is selected), where
\[
P(i\text{th psu is selected}) = \pi_i,
\]
\[
P(j\text{th ssu is selected} \mid i\text{th psu is selected}) = \pi_{j|i},
\]
\[
P(j\text{th ssu in the } i\text{th psu is selected}) = \pi_{j(i)} = \pi_i\,\pi_{j|i}.
\]
Then an estimator for the individual psu totals is
\[
\hat{t}_i = \sum_{j \in S_i} \frac{y_{ij}}{\pi_{j|i}},
\]
and an estimator for the population total would be
\[
\hat{t} = \sum_{i \in S} \sum_{j \in S_i} \frac{y_{ij}}{\pi_{j(i)}}.
\]

When we look back at Figure 2.4, we see that this two-stage approach implies that the $\hat{t}_i$'s are now random variables. As noted before, the expectation in a design based framework is across all possible samples of a certain design. In the case of two-stage designs, we must first average the estimate over all second-stage selections that can be drawn from a fixed set of $n$ psu's. Then we average over all the possible selections of $n$ psu's. This can be expressed as
\[
E(\hat{\theta}) = E_1\big[E_2(\hat{\theta})\big],
\]
where $E$ denotes the expectation across all samples, $E_1$ the expectation across all possible first-stage selections, and $E_2$ the expectation across all possible second-stage selections from a fixed set of psu's. This approach gives us
\[
V(\hat{\theta}) = V_1\big[E_2(\hat{\theta})\big] + E_1\big[V_2(\hat{\theta})\big],
\]
where $V_2(\hat{\theta})$ is the variance over all possible subsamples for a given set of psu's. Thus we have two components: one that takes into account the variability between psu's and another that accounts for the variability of ssu's within psu's.

So in the case of a simple random sample of psu's and a simple random sample of ssu's, we would have
\[
P(j\text{th ssu in the } i\text{th psu is selected}) = \pi_{j(i)} = \frac{n}{N}\,\frac{m_i}{M_i}.
\]
Then in this case we have the following estimators:
\[
\hat{t}_i = \frac{M_i}{m_i} \sum_{j \in S_i} y_{ij}, \qquad \hat{t} = \frac{N}{n} \sum_{i \in S} \hat{t}_i,
\]
with
\[
V(\hat{t}\,) = N^2\left(1 - \frac{n}{N}\right)\frac{S_t^2}{n} + \frac{N}{n}\sum_{i=1}^{N}\left(1 - \frac{m_i}{M_i}\right) M_i^2\,\frac{S_i^2}{m_i},
\]
where $S_t^2$ is the population variance of the psu totals and $S_i^2$ is the population variance among the ssu's within psu $i$. This approach naturally extends to three or more stages. For three-stage sampling, we have variance of the following form:
\[
V(\hat{\theta}) = V_1\big\{E_2\big[E_3(\hat{\theta})\big]\big\} + E_1\big\{V_2\big[E_3(\hat{\theta})\big]\big\} + E_1\big\{E_2\big[V_3(\hat{\theta})\big]\big\}.
\]
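The two-stage formulas above can be illustrated numerically. The sketch below is a toy example with invented psu counts and y values, assuming SRS at both stages; it computes the estimated psu totals, the estimate of the population total, and the corresponding two-component variance estimate.

```python
from statistics import mean, variance

# Two-stage sampling sketch: SRS of n psu's from N, then SRS of m_i ssu's
# from the M_i units in each selected psu. All data below are invented.
N = 50                       # psu's in the population
clusters = {                 # sampled psu -> (M_i, sampled y values)
    "psu_a": (120, [4.0, 6.0, 5.0, 7.0]),
    "psu_b": (80,  [10.0, 12.0, 11.0]),
    "psu_c": (200, [2.0, 3.0, 1.0, 2.0, 4.0]),
}
n = len(clusters)

# Estimated psu totals: t_hat_i = M_i * (sample mean within the psu).
t_hat = {k: M * mean(ys) for k, (M, ys) in clusters.items()}

# Estimator of the population total: t_hat = (N / n) * sum of t_hat_i.
t_est = (N / n) * sum(t_hat.values())

# Variance estimate: between-psu component plus within-psu component.
s2_t = variance(list(t_hat.values()))    # spread of estimated psu totals
between = N**2 * (1 - n / N) * s2_t / n
within = (N / n) * sum(
    (1 - len(ys) / M) * M**2 * variance(ys) / len(ys)
    for M, ys in clusters.values()
)

print(f"estimated total: {t_est:.0f}")
print(f"variance estimate: {between + within:.0f}")
```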

2.1.5 Two-phase Sampling

It may sometimes be the case, however, that while we know that a stratified sample would be most appropriate, we do not have the stratification information on our list or frame. To handle the case where a stratification variable is not available on the sampling frame, we may use two-phase sampling, or double sampling, first proposed by Neyman in 1938. This method consists of two successive sampling steps, as illustrated in Figure 2.5.

Figure 2.5 Two-Phase Sampling

[Diagram: a first-phase sample is drawn from the population and classified into strata; a second-phase subsample is then drawn from each stratum.]

First, you select a simple random sample of $n^{(1)}$ units, called the first-phase sample. For the sampled units, you then measure the stratification variable and use this information to stratify. Then you select a second sample from each stratum defined in the first sampling step. Using this approach, you get an unbiased estimate of the strata weights from the first phase and an unbiased estimate of the strata means in the second phase. This method can be used when the variable of interest is expensive to measure but a correlated variable can be measured easily and used to improve precision. It is also often used in dealing with nonresponse problems.

In this case we would let $W_h = N_h / N$ be the proportion of the population falling in stratum $h$ and $w_h = n'_h / n'$ be the proportion of the first sample falling in stratum $h$; then, since $w_h$ is an unbiased estimator of $W_h$, we have
\[
\hat{\bar{y}} = \sum_{h=1}^{L} w_h\,\bar{y}_h.
\]
As illustrated in Cochran (1977), under a design based framework we look across all possible samples and redraw both the first- and second-phase samples. Here, then, $w_h$, $n_h$ and $\bar{y}_h$ are random variables. Therefore, to calculate the expectation of $\hat{\bar{y}} = \sum_{h=1}^{L} w_h\,\bar{y}_h$, we condition on the weights $w_h$ and then take the expectation over different selections of the first sample:
\[
E(\bar{y}^{(2)}) = E\big[E(\bar{y}^{(2)} \mid w_h)\big] = E\Big[E\Big(\sum_h w_h\,\bar{y}_h \,\Big|\, w_h\Big)\Big] = E\Big(\sum_h w_h\,\bar{Y}_h\Big) = \sum_h W_h\,\bar{Y}_h = \bar{Y}.
\]
The variance is again computed conditionally:
\[
V(\bar{y}) = V\big(E[\bar{y}^{(2)} \mid w_h]\big) + E\big[V(\bar{y}^{(2)} \mid w_h)\big] = V(\bar{y}^{(1)}) + E\big(V[\bar{y}^{(2)} \mid w_h]\big).
\]
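A minimal numerical sketch of double sampling for stratification is given below; the phase-1 stratum counts and phase-2 measurements are invented for the example.

```python
from statistics import mean

# Double sampling for stratification (invented data). Phase 1: a large SRS
# of n1 units is classified into strata; phase 2: a subsample is measured.
n1 = 1000
phase1_counts = {"h1": 700, "h2": 300}   # stratum counts in the phase-1 sample

# Phase-2 measurements of y within each stratum (hypothetical values).
phase2 = {
    "h1": [20.0, 22.0, 19.0, 21.0],
    "h2": [35.0, 33.0, 36.0],
}

# w_h = n'_h / n' estimates the stratum weight W_h; the phase-2 stratum
# means estimate the stratum means. The combined estimator is unbiased.
y_hat = sum((phase1_counts[h] / n1) * mean(phase2[h]) for h in phase2)
print(f"double-sampling estimate of the mean: {y_hat:.2f}")
```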

2.1.6 Complex Sampling

Survey designs can combine any of the features that have been presented. Take, for example, the Canadian Labour Force Survey (LFS). The LFS uses a probability sample that is based on a stratified two-stage sample design. In the first stage, they select a sample of geographical regions as the primary sampling units (psu's). Stratification involves assigning each psu to a single stratum. For each psu selected, surveyors make a frame by listing all the dwellings contained within that psu. The sample is selected independently with probability proportional to size in each stratum, which makes it possible to adopt a different selection method for each one. There are several methods for selecting a psu sample with probability proportional to size. The LFS uses two of these methods: the Rao-Hartley-Cochran (RHC) method (Rao, Hartley and Cochran 1962) and the systematic sampling method with probability proportional to size and random order. The RHC method is used with most strata because it allows us to update the selection probabilities when we observe strong growth in some psu's. The method described in Keyfitz (1951) can be combined with the RHC method to update the probabilities while maximizing the overlap of the selected psu's before and after the update.

Point and variance estimation in this context is, however, not straight forward. Calibration is a method to adjust design weights according to some distance function, such that calibration weighted estimates for a vector of auxiliary variables that are exactly equal to the vector of known population totals. The LFS uses composite calibration (or regression composite estimation). This is essentially the same as calibration, except that some control totals are estimates from the previous month, and the auxiliary variables associated with these estimated

58 control totals are not known for all people and are thus imputed. The LFS uses the jackknife method to estimate the sampling variance of the regression composite estimator (or composite calibration estimator) used for each of the ten provinces, and of the generalized regression

estimator used for each of the three territories. Jackknife is a variance estimation first introduced in 1949 by Quenouille as a method to estimate bias. In 1958, Tukey suggested that it be used as a variance estimation procedure. The jackknife method consists of creating replicate data sets by deleting an observation, reweighting, recalculating the estimate using the replicate data. The jackknife variance is then calculated as a measure of spread across these replicates.
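The delete-one jackknife is easy to sketch for a simple nonlinear statistic. The example below is illustrative only: it applies the jackknife to a ratio estimate from invented, unweighted data, whereas the LFS applies it to design-weighted replicates within its actual design.

```python
from statistics import mean

# Delete-one jackknife for the variance of a nonlinear estimator (here a
# ratio y/x), using invented data and ignoring the design for simplicity.
y = [12.0, 9.0, 15.0, 11.0, 14.0, 10.0]
x = [6.0, 5.0, 8.0, 5.0, 7.0, 6.0]
n = len(y)

theta_full = sum(y) / sum(x)   # estimate from the full sample

# Replicates: drop one observation at a time and recompute the estimate.
replicates = [
    sum(y[:i] + y[i + 1:]) / sum(x[:i] + x[i + 1:]) for i in range(n)
]

# Jackknife variance: scaled spread of the replicates around their mean.
rep_mean = mean(replicates)
v_jack = (n - 1) / n * sum((r - rep_mean) ** 2 for r in replicates)

print(f"ratio estimate: {theta_full:.4f}")
print(f"jackknife variance: {v_jack:.6f}")
```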

Normally, estimation methods for complex surveys are designed to produce consistent and unbiased design based estimates. However, since the way in which the design and the estimator interact can be difficult to describe mathematically (take, for example, the description of the LFS above), variance estimation for complex surveys is done using a variety of methods, including linearization, replication (such as the aforementioned jackknife and the bootstrap), and generalized variance equations. When the estimator is complex, for example a ratio (or some other nonlinear function), we would use Taylor linearization to aid us with variance estimation. Also, if the design were complex and the estimator nonlinear, we might use a replication method such as the jackknife or bootstrap. We would decide which replication method to use based on what we were producing: if we were going to produce a public data file we might choose the bootstrap, whereas if we were going to produce only tables we might decide to use the jackknife. However, for all these methods, as with simple survey designs, if the design information is not used in the estimation process, the estimates from the survey cannot be considered estimates of the related population totals.


2.2 Models and the Superpopulation

Though design based sampling theory is concerned with inference for finite population parameters, we can extend inference to a general or superpopulation model. Superpopulation modeling is where the researcher specifies or assumes a random mechanism that generated the finite population (Royall, Godambe and Thompson 1986, Cochran 1977, Hajek 1964, Graubard and Korn 2002). Where the model is thus the statistical conceptualization of a superpopulation, we have model-based rather than design-based inference. We now consider the case in Figure 2.6 where there is a stochastic process that generates the finite population (which we do not know explicitly) and then we sample units from this finite population using a probability sample.

Figure 2.6 Superpopulation

[Diagram: a superpopulation model with parameter $\theta$ generates the finite population with census parameter $\theta_N$; a sample of $n$ units then yields the estimate $\hat{\theta}_n$.]

Now we can define the two stages of randomization: one belonging to the design and one belonging to the model. The design stage is denoted by the subscript $p$ and the model by the subscript $\xi$. Here we define $E_\xi$ and $V_\xi$ as the expectation and variance with respect to the model, and $E_p$ and $V_p$ as the expectation and variance with respect to the design (i.e. we use repeated sampling of the same design for any measurable sampling plan for these two expectations).


2.3 Inference

Definitions

Before exploring how estimation and inference are linked in a survey context, we must review for the reader how inference is sometimes defined in traditional meta-analysis contexts. For example, in experimental meta-analysis the acronym PICO was developed by faculty at McMaster University to aid in evidence-based medical decision making, and is used to help define the inference of a systematic review (McLellan 2010). The P stands for the patient population, namely, the group about whom the researcher needs information. Such a population could consist of any group, from postmenopausal women, for example, to hypertensive patients, to everyone in Canada. The I stands for the type of intervention, or medical event, whose effects researchers wish to study. (Note: in PECO, the E is for exposure.) In the case of postmenopausal women, such an intervention might be estrogen replacement; in hypertensive patients, the medical event might be use of ACE inhibitors; and for a patient population of all Canadians, I might be a preventative strategy like healthy eating. The C stands for comparison, or the process by which the researcher examines whether the proposed intervention produces better or worse results than either another intervention or no intervention at all. In the examples mentioned above, such comparisons might involve, respectively, no estrogen replacement, no ACE inhibitors, and a diet high in fat. Finally, O stands for the outcome, or the researcher's exploration of the effect of the intervention. In the case of estrogen replacement therapy, we might choose to look at that therapy's effect on osteoporosis; with hypertensive patients we might focus on the effect of ACE inhibitors on blood pressure; following the implementation of healthy eating strategies nationwide, we would be looking, perhaps, for a reduction in BMI.


Just as there are definitions used to describe inference in the systematic reviews of the medical literature, there are definitions needed to define inference in sampling. In sampling, we first define the observational unit as the object about which we make observations. For example, where human populations are the subject of study, individuals serve as the observational units. Having defined our observational unit, we need next to define the target population, or the complete collection of units that we would like to study. For example, we could define our target population as all postmenopausal women, or all hypertensive patients, or all Canadians. Our sample would then be a subset of this target population. Following from this, we then define the sampled population as the collection of all possible observational units that might have been chosen for the sample. Finally, the sampling unit is the unit we actually sample. For example, although individuals may be our observational unit, we may end up sampling households or hospitals, as both allow access to a list of individuals, because we often do not possess and cannot access a complete list of the observational units. A list of the sampling units is called a frame, which in the three cases noted previously would be a list of all postmenopausal women, all hypertensive patients, or all Canadians.

We must also explicitly note and make use of the fact that frames are not perfect and in some cases do not even exist. It might be the case, for example, that not all the units in our target population are included in our frame (undercoverage), or that there are units on the frame which are, in fact, not part of our target population (overcoverage). We must also recognize that when we approach the problem from a survey perspective, we need to define our target population precisely. For example, would our target populations of postmenopausal women or hypertensive patients include only Canadians, or all persons from North America, or only those from developed countries? If we have no list, that is, a list of postmenopausal women from developed countries now residing in Canada, we will have to consider options such as cluster or two-phase approaches. For example, we are unlikely to find a list of either all postmenopausal women or all hypertensive patients, but we might be able to find a list of addresses or houses within a city. We could then sample houses to get households to find residents with the conditions or characteristics we are interested in studying.

Another difference is that in sampling we also formally address the problem of nonresponse which, like confounding in experimental studies, can bias estimates. And finally, in the case of the survey, we also need to clearly state whether we are interested in finite population parameters or superpopulation parameters. The idea of the superpopulation versus the finite population is the central problem of inference in sampling, and it is sometimes discussed (as it is in Deming (1950)) as a distinction between enumerative and analytical uses of survey data. Deming states that the purpose of both types of studies is action. In the enumerative study, we are interested in estimating a number of something within our population for the purpose of doing some action with those cases. For example, if we are providing vaccinations to children in a health region in Canada, we would want to have an idea of how many children are at an age to receive these vaccinations so that we can plan inventories of vaccines. By contrast, in an analytical study, where we are interested in prediction or cause, we are investigating in order to do something about the cause. For example, if we were instead interested in the relationship between healthy eating, childhood activity and adult obesity, we would want to follow children, review their eating and activity across time, and see how these factors help to predict adult obesity. This division for inference can be directly thought of as the difference between inference for the finite parameter (description or enumeration) and the superpopulation (prediction or cause).


Finite population parameters describe the finite population from which we draw our sample. Estimation is then about characteristics of this population $U$. These quantities always portray the population at a given time point (the time of sampling), and are purely descriptive in nature (Sarndal et al 1992, Thompson 1997, Deming 1950, Fuller 2009, Hartley 1962). With a census of all the units in $U$ (with no nonresponse and no error), a finite population parameter can be established exactly. In most cases, if the researcher were able to collect information from the entire population, he or she would do so. However, the researcher can rarely take such a census of the population of interest, and probability sampling is therefore done to produce estimates of the finite population of sufficient precision at minimum cost.

A superpopulation model describes the relationship between the variables $(Y_N, X_N, Z_N)$ by means of a model, $\xi$. Here the model builder is interested not in the finite population $U$ at the present point in time, but rather in the causal system described by the model. So in this superpopulation case, even a census will not estimate these model parameters exactly.

To illustrate these concepts, we will use the National Longitudinal Survey of Children and Youth (NLSCY). The NLSCY covers the non-institutionalized civilian population (aged 0 to 11 at the time of their selection) in Canada's 10 provinces (Special Surveys Division 2009). However, due to cost and other implementation reasons, the survey excludes children living on Indian reserves or Crown lands, residents of institutions, full-time members of the Canadian Armed Forces, and residents of some remote regions. The fact that we are sampling a portion of the population means that we must be simultaneously aware of the target population, the coverage of the frame, and all exclusions in order to make proper inference. Here we would be cautious in using the findings of the NLSCY for aboriginal populations who live on reserves, as they are not part of the sampled population even though they are conceptually part of the target population.

The NLSCY is a longitudinal survey consisting of several longitudinal and cross-sectional samples. A longitudinal survey is one that follows the same respondents over time (with each survey administration called a wave), whereas a repeated cross-sectional survey has different samples selected over time (with each survey administration called a cycle). The sample for the original cohort is a mixture of pre-1994 Labour Force Survey (LFS) rotational groups and the 1994 LFS redesign. The NLSCY collects data from a representative prospective cohort of individuals. The sampling plan is complex. The target population is well defined, as are its exclusions. The parameters of interest might include exposures or risk factors, such as the socio-economic status of the child and the parenting styles of the mother. We might also consider outcomes such as date of smoking initiation, BMI measurements, and math or Peabody scores.

The longitudinal samples are representative of the original longitudinal populations (i.e., the populations at the time of sample selection). Each year, the sample is recontacted, and each annual set of data is called a wave. Estimation for the original longitudinal population is done either for all individuals who have responded to all waves, or for individuals who have partial response in any wave. In both cases, the researcher uses an appropriate set of weights that have been adjusted for nonresponse and poststratified by province, age and sex to match known population totals at the time of sample selection.


In the case of the NLSCY, we could be interested in inference about the finite population or about the superpopulation. An example of a finite population characteristic would be the average literacy score for young adults aged 18 or 19, broken down by reading competency at age 8 or 9. In Figure 2.7, we can see that data from the NLSCY showed that children aged 8 or 9 whose reading ability in school was better than that of their peers had significantly higher literacy test scores a decade later.

Figure 2.7 Literacy: better reading in school at age 8 or 9 linked to higher literacy scores at age 18 or 19. [Bar chart of average literacy scores at age 18 or 19 for children who did poorly to very poorly, average, well, or very well in reading at age 8 or 9. Source: Statistics Canada, The Daily, Tuesday, December 5, 2006.]

If instead we were interested in examining a model of the mechanisms through which neighborhood socio-economic conditions affect children's verbal and behavioral outcomes, we would be interested in the model, or superpopulation. Kohen et al. (2008) found that neighborhood disadvantage manifested itself through lower neighborhood cohesion, which was associated with maternal depression and family dysfunction. These were, in turn, related to less consistent, less stimulating, and more punitive parenting behaviors, and finally to poorer child outcomes such as literacy.

2.4 Estimation

As noted in sections 2.1 and 2.2, we can approach estimation using either finite sampling or a superpopulation. If we consider the level of inference, whether from a superpopulation or a finite population, simultaneously with our estimation frameworks, we have the following table (Hartley and Sielken 1975):

Table 2.1 Estimation and Inference

                                          Estimation framework
Parameter of interest           Sampling from a fixed              Two-stage sampling from a
                                finite population                  superpopulation
------------------------------------------------------------------------------------------
Finite population               Classical finite population        Superpopulation theory for the
(descriptive) parameters        sampling, $\hat{\theta}^{F(F)}$    finite population, $\hat{\theta}^{F(S)}$
Infinite superpopulation        Impossible                         Inference about superpopulation
(model) parameters                                                 parameters from two-stage
                                                                   sampling, $\hat{\theta}^{S(S)}$

For clarity, we introduce the following new notation to represent the level of inference and the methodology used to estimate a parameter: $\hat{\theta}^{S(S)}$ denotes an estimator of a superpopulation parameter, where the first superscript S denotes that the parameter of interest is from the superpopulation, while the second superscript S, enclosed in parentheses, denotes the two-stage superpopulation framework used to estimate the parameter. Hartley and Sielken, and later Sarndal, refer to this as case 3. Likewise, $\hat{\theta}^{F(S)}$ is the finite population parameter estimate under a two-stage superpopulation framework (case 2 of Hartley and Sielken), and $\hat{\theta}^{F(F)}$ is the finite population parameter estimate under classical finite population sampling (case 1 of Hartley and Sielken). As the estimation of a superpopulation parameter under finite population sampling is not possible, the superscript S(F) is not used. In later chapters and in the simulations, we will combine only estimates that share the same level of inference and the same estimation procedure, as it is not logical to combine estimates that are inferentially different or calculated under different assumptions. As noted previously, we need both a point estimate and the variance of that estimate under the appropriate inferential model.

As another example, when studying the health of individuals we may want to know the average BMI of Canadian women in 2010. Inference here would be on a descriptive parameter of the finite population. With this first type of inference, the simplest approach is to use design-based estimation, such as a Horvitz-Thompson estimator $\hat{\theta}^{F(F)}$ and its associated variance. Here inference is based on repeated sampling of the population. We may believe an analytical model exists behind the data, but we do not know what it is, and we do not use it in any way.
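To make the design-based calculation concrete, the following minimal sketch (in Python; the function name and the BMI figures are our own invention, not Statistics Canada data) computes a survey-weighted mean in the Hajek ratio form $\sum w_i y_i / \sum w_i$ that reappears in the design effect discussion below, where each survey weight is the inverse of the unit's inclusion probability.

import numpy as np

def weighted_mean(y, w):
    # Hajek (ratio) form of the Horvitz-Thompson estimator of a finite
    # population mean: sum(w_i * y_i) / sum(w_i), where the survey
    # weight w_i is the inverse of the inclusion probability.
    y, w = np.asarray(y, float), np.asarray(w, float)
    return np.sum(w * y) / np.sum(w)

# Toy illustration: BMI values and hypothetical design weights.
bmi = [22.4, 27.1, 31.0, 24.5, 26.2]
weights = [1500, 2200, 800, 1900, 1600]
print(round(weighted_mean(bmi, weights), 2))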

If instead we are studying factors associated with BMI, we may want to consider some of the factors that contribute to obesity. A relatively new area of obesity research is the study of sedentary behaviours as distinct from physical activity (Spanier et al. 2006). In such cases we would consider a typical regression model ξ such as

$Y_i \mid x_i = x_i^T \beta + \varepsilon_i$,

where $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_N$ are iid normal random variables with $E[\varepsilon_i] = 0$ and $\mathrm{Var}[\varepsilon_i] = \sigma^2$ for all i.

Our superpopulation model would have as x variables, or covariates, things like the time spent watching television (including videos), using a computer (including playing computer games and using the Internet), and reading, along with any other covariates and confounders that the researcher felt were relevant. Even in the case of the estimation of the mean, we could invoke a superpopulation model here and calculate the estimator $\hat{\theta}^{F(S)}$. When invoking a superpopulation model, we would want the estimator to take into account both the model and the sampling design.

Therefore, conceptually, for inference about the finite population under a superpopulation model, we would select an estimator for which $E_\xi E_p[(\hat{\theta}^{F(S)} - \theta_N)^2]$ is minimized, where the subscript ξ indicates that the expectation is taken over the model, and the subscript p indicates that the expectation is taken over the sampling plan. When we combine estimates, we need to understand the context and have both a point estimate and a variance estimate in order to combine the data.

There are several theoretical approaches that authors have taken to estimating finite population characteristics under a model. Fuller (1975) presents theory and estimators for the case of regression under a superpopulation framework, where we assume that the finite population is a simple random sample from a superpopulation with finite fourth moments and the sample is drawn from this finite population. Hartley and Sielken (1975) provide more on inference about finite population regression parameters under a superpopulation model. Burdick and Sielken (1979) look at variance estimation under a superpopulation model in two-stage sampling. Isaki and Fuller (1982) expand these ideas to construct model unbiased and design consistent regression predictors.

Under the approach of Isaki and Fuller (1982), the function of the sample $\hat{\theta}^{F(S)}$ acts as a predictor. A predictor is conditionally model unbiased for the finite population parameter $\theta_N$, given the sample S, if

$E_\xi[(\hat{\theta}^{F(S)} - \theta_N) \mid S] = 0$,

where once again $E_\xi$ is the expectation with respect to the superpopulation. It is important to note that the conditioning is on the realized sample, not on a characteristic of the population. The predictor is unbiased with respect to the design P(S) if $E_p[\hat{\theta}^{F(S)}] = \theta_N$, where the expectation is over the design. Isaki and Fuller then define the best model unbiased predictor of $\theta_N$ as the one with minimum model variance; that is, $\hat{\theta}^{F(S)*}$ satisfies

$V(\hat{\theta}^{F(S)*} - \theta_N \mid S) \le V(\hat{\theta}^{F(S)} - \theta_N \mid S)$ for all $\hat{\theta}^{F(S)}$ in the class of functions C, where

$V(\hat{\theta}^{F(S)} - \theta_N \mid S) = E_\xi\left\{\left[(\hat{\theta}^{F(S)} - \theta_N) - E_\xi\left((\hat{\theta}^{F(S)} - \theta_N) \mid S\right)\right]^2 \,\middle|\, S\right\}$.


One can then define the anticipated variance of $(\hat{\theta}^{F(S)} - \theta_N)$, the variance of this random variable under the design and the superpopulation model together, as

$AV(\hat{\theta}^{F(S)} - \theta_N) = E_\xi E_p\left[(\hat{\theta}^{F(S)} - \theta_N)^2\right] - \left[E_\xi\left\{E_p(\hat{\theta}^{F(S)} - \theta_N)\right\}\right]^2$.

If we have a superpopulation with finite moments and a design that is independent of the y values used in estimating $\hat{\theta}^{F(S)}$, we can interchange the order of expectation to obtain

$AV(\hat{\theta}^{F(S)} - \theta_N) = E_p\left\{V_\xi\left[\hat{\theta}^{F(S)} - \theta_N \mid S\right]\right\} + V_p\left[E_\xi\left\{\hat{\theta}^{F(S)} - \theta_N \mid S\right\}\right]$.

If the estimator is design unbiased, the anticipated variance is the model expectation of the design variance, given by

$AV(\hat{\theta}^{F(S)} - \theta_N) = E_\xi\left\{V_p\left[\hat{\theta}^{F(S)}\right]\right\} = E_\xi E_p\left\{\left[\hat{\theta}^{F(S)} - E_p(\hat{\theta}^{F(S)})\right]^2\right\}$.

However, the AV is not always simple to calculate, nor is it produced by any standard software, and thus it may be a difficult quantity for analysts to use in meta-analysis.

Graubard and Korn (2002) also propose a framework for performing inference under a superpopulation. They use a sample weighted estimator because doing so requires fewer assumptions on the model. For convenience, they derive superpopulation variance estimates based on corrections to repeated sampling variance estimators. In the case of stratified sampling, they suggest obtaining an unbiased variance estimator for the mean by use of the following:

$\mathrm{Var}_{SP}[\bar{y}] = \mathrm{var}_{wo}(\bar{y}) + \mathrm{var}[\bar{Y}]$

where

$\mathrm{var}[\bar{Y}] = \frac{1}{K-1}\sum_{h=1}^{L}\frac{K_h}{K}(\bar{y}_h - \bar{y})^2 + \frac{1}{K}\sum_{h=1}^{L}\frac{K_h}{K}\left(1 - \frac{K - K_h}{(K-1)k_h}\right)s_h^2$

and $\mathrm{var}_{wo}(\bar{y})$ is the standard without-replacement variance estimator for stratified sampling from the L strata.

They also provide derivations of cluster sampling superpopulation variances that do not require the specification of joint inclusion probabilities. Here the population consists of K primary clusters, the ith of which consists of $N_i$ secondary clusters. At the first stage of sampling, $k_h$ primary clusters are selected from the $K_h$ primary clusters in stratum h as a pps sample without replacement. Graubard and Korn then suggest using the following estimator:

$\mathrm{Var}_{SP}[\bar{y}] = \mathrm{var}_{wr}(\bar{y}) + \mathrm{var}_b - \mathrm{var}_w$

where

$\mathrm{var}_b = \frac{1}{d^2}\sum_{h=1}^{L}\frac{1}{K_h}\left[\sum_{i=1}^{k_h}(t_{hi} - d\,\bar{y}_{hi})\right]^2$

and

$\mathrm{var}_w = \frac{1}{d^2}\sum_{h=1}^{L}\frac{k_h}{K_h(k_h - 1)}\sum_{i=1}^{k_h}\left[(t_{hi} - d\,\bar{y}_{hi}) - \frac{1}{k_h}\sum_{j=1}^{k_h}(t_{hj} - d\,\bar{y}_{hj})\right]^2$

and $\mathrm{var}_{wr}(\bar{y})$ is the with-replacement estimator of variance under stratified sampling. This estimator has the additional benefit over the method of Isaki and Fuller of being relatively easy to calculate with slight modifications to standard software routines, and this is why we have chosen to use it in our simulations.
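As a concrete illustration of the stratified case, the following sketch (Python; the function name, the stratified simple random sampling assumption, and the toy data are ours) computes $\mathrm{Var}_{SP}[\bar{y}] = \mathrm{var}_{wo}(\bar{y}) + \mathrm{var}[\bar{Y}]$ for stratified SRS with known stratum sizes; a production implementation would use the survey's actual design variance for $\mathrm{var}_{wo}(\bar{y})$.

import numpy as np

def gk_superpop_variance(strata):
    # `strata` is a list of (K_h, y_h) pairs: the known stratum size K_h
    # and the array y_h of sampled values. Assumes stratified simple
    # random sampling, so the design variance below is the usual
    # without-replacement (var_wo) form.
    Kh = [size for size, _ in strata]
    kh = [len(y) for _, y in strata]
    K = sum(Kh)
    L = len(strata)
    ybar_h = [float(np.mean(y)) for _, y in strata]
    s2_h = [float(np.var(y, ddof=1)) for _, y in strata]
    # Overall stratified mean.
    ybar = sum(Kh[h] / K * ybar_h[h] for h in range(L))
    # Standard without-replacement design variance of the mean, var_wo.
    var_wo = sum((Kh[h] / K) ** 2 * (1 - kh[h] / Kh[h]) * s2_h[h] / kh[h]
                 for h in range(L))
    # Correction var[Ybar] for the generation of the finite population
    # from the superpopulation, as in the displayed formula above.
    var_Ybar = (sum(Kh[h] / K * (ybar_h[h] - ybar) ** 2 for h in range(L))
                / (K - 1)
                + sum(Kh[h] / K * (1 - (K - Kh[h]) / ((K - 1) * kh[h]))
                      * s2_h[h] for h in range(L)) / K)
    return var_wo + var_Ybar

rng = np.random.default_rng(1)
strata = [(500, rng.normal(10, 2, 25)), (300, rng.normal(12, 3, 15))]
print(gk_superpop_variance(strata))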

Finally, if we are interested in inference from this model, and we wish to extend our inference to the superpopulation and its parameters, Molina et al. (2001) suggest that inference should be made using the joint process of model and design. They propose a joint model-design process satisfying, for all samples S that are possible under p,

$E(Y \mid Z_S = Z_s) = E_\xi(Y), \qquad E(YY^T \mid Z_S = Z_s) = E_\xi(YY^T)$.

These conditions are satisfied when Z and Y are independent. If this condition does not hold, then they suggest modeling the conditional mean and variance not on the observed sample but on all possible samples under p. Of course, this is not always straightforward to calculate, and it requires the correct specification of the model. Chapter 3 will demonstrate that where such model specification is impossible to achieve, we can use meta-analysis to provide model inference with survey data from different finite populations.

Ignorability and Informativeness

As noted in Molina et al. (2001) and Rubin-Bleuer and Schiopu-Kratina (2005), whenever we are interested in the analytical model that generated the finite population, we must consider the joint process of the model and the design. As experimenters do not have this second stochastic element when they run experiments, often the first reaction from a classical perspective is to ignore it. This is, in essence, what Barnard is thinking when he states that "since a simple random sample of a simple random sample, is itself a simple random sample the problems of inference can be dealt with along classical lines" (Thompson and Godambe 1971). Although Barnard was correct in noticing that a simple random sample of a simple random sample is also a simple random sample, most surveys are not simple random samples. If the selection probabilities depend on y through more than the covariates, then we will reach erroneous conclusions if we do not consider the design (Rubin 1976, Little 1982, Sugden and Smith 1984, Pfeffermann 1988, 1993, 1996, 2007, Rao and Scott 1981). In sampling, two different concepts deal with the question of whether or not we can disregard the random sampling: informativeness and ignorability. We consider a sampling design to be non-informative if the sampling mechanism does not depend on y (our outcome), and informative if it does so depend. If it is informative, then we must take it into account in our estimation procedure. If we approach the estimation problem by assuming that the sampling design P(S) plays no role (that it is non-informative), as some authors have proposed (Royall 1970, Royall et al. 1973a, 1973b, Thompson 1997, Brewer 1979, Konijn 1973), then we are stating that the relationship in the model holds for every unit in the population U and that the sample design should have no effect on estimation. As noted above, this can be the case as long as the probability of selection depends on y only through the x's. If the design and inclusion probabilities play no role in estimation, then design unbiasedness is unimportant. Ignoring the design in this way is often referred to as a frequentist approach and/or pure model-based inference.

This frequentist approach is related to the idea of ignorability. As Binder and Roberts (2001) state, a design is ignorable for a particular analysis if the inference based on all the known information is equivalent to the inference based on the same information excluding the outcomes of the random variables related to the design. They note that ignorability is a concept tied to a particular analysis, and that one can have ignorability under both informative and non-informative designs. Thus we can group a design with a particular analysis using the notation design+analysis, where the design is that of the survey and the analysis is specific: for example, a stratified sample with a regression using age, gender and education to predict income. If in this case the inference based on design-based procedures is equivalent to that based on model-based procedures, the design is ignorable. All non-informative designs are ignorable, but not every ignorable design+analysis combination comes from a non-informative design (Binder and Roberts 2003).

In practice, however strongly the importance of the design may be debated, the analyst generally does not have the exact sampling plan and should not assume that the sampling plan is independent of y. For example, most large national surveys oversample rare populations of interest. Oversampling has occurred when the sample contains a larger percentage of units of a particular kind than is present in the whole finite population. Often these rare populations do not behave like the rest of the population, and if the oversampling is not accounted for in the estimation, then the analysis could be misleading.

As we are now dealing with both a design and a model, just as in classical statistics, model misspecification can lead to incorrect conclusions, particularly for large samples. As Sharon Lohr (1999) states, "theoretically derived models known to hold for all observations do not often exist for survey situations". Thus researchers must be careful about how the joint effects of the design and model are considered when they analyze survey data and when they combine survey data. Researchers would have used the entire finite population, had such a thing been possible, in order to do their analysis; thus, from a practical standpoint, a model calculated using all the units in the finite population is their goal. As a design consistent estimator will converge to the finite population parameter regardless of the model, we suggest that estimators be design based and design consistent even when we are attempting to make model inferences.

2.5 Design Effect

We have noted more than once that when we are doing a meta-analysis we would like comparable estimates. We have already shown that under different inferential models and under different designs we can get different estimators even using the same sample. Next we will show how, in the survey context, we can scale and compare variances under different designs. Kish (1965) suggested that a way to summarize the effect of the sample design on the variance of an estimate is the following ratio, which he named the design effect (deff):

$\mathrm{deff}(\hat{\theta}_N) = \frac{V(\hat{\theta}_N \ \text{under design } p \text{ with } n \text{ observations})}{V(\hat{\theta}_N \ \text{under SRS with } n \text{ observations})} = \frac{V_p(\hat{\theta}_N)}{V_{SRS}(\hat{\theta}_N)}$,

where $\hat{\theta}_N$ is a design-consistent estimator of $\theta_N$ based on the survey weights from the respective survey design.

Thus deff is a measure of the precision gained or lost by using a complex sample design instead of a simple random sample (SRS). Stratification usually increases the precision, while clustering and weighting decrease it. Typically, for stratified multistage cluster sampling the values of deff are greater than 1, indicating a loss of precision compared to SRS. The values of deff are largest for estimates of population totals, means and proportions, and can be substantially smaller for regression coefficient estimates (Kish 1987, 1995; Park and Lee 2004; Heeringa et al. 2010).

When design information is not available to estimate $V_p(\hat{\theta}_N)$ but the design effect is known, the analyst can use deff to adjust the $V_{SRS}(\hat{\theta}_N)$ produced by standard software and so obtain confidence intervals or test statistics for finite population proportions, means and totals (Lohr 1999). For example, a 95% confidence interval for the mean of a variable y, when the sample size n is large, can be estimated as

$\hat{\bar{y}} \pm 1.96\sqrt{\mathrm{deff}(\hat{\bar{y}})\,\hat{S}^2/n}$.

The point estimator $\hat{\bar{y}} = \hat{\theta}_N = \sum_{i=1}^{n} w_i y_i \big/ \sum_{i=1}^{n} w_i$ is a design-consistent estimator of $\bar{Y}_N = \theta_N = \sum_{i=1}^{N} y_i \big/ N$, where $w_i$ is the survey weight of individual i, and $\hat{S}^2$ is an estimator of the finite population variance of the variable y, $S^2 = \sum_{i=1}^{N}(y_i - \bar{Y}_N)^2 \big/ (N-1)$.
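A minimal sketch of this adjustment follows (Python; the published figures shown are hypothetical): the SRS variance is simply scaled by the published design effect before the interval is formed.

import math

def deff_adjusted_ci(ybar_hat, s2_hat, n, deff, z=1.96):
    # 95% CI for a finite population mean: ybar +/- z * sqrt(deff * S^2 / n),
    # using a published design effect in place of full design information.
    half = z * math.sqrt(deff * s2_hat / n)
    return ybar_hat - half, ybar_hat + half

# Hypothetical published figures: mean 26.3, S^2-hat 19.4, n = 1200, deff = 2.1.
print(deff_adjusted_ci(26.3, 19.4, 1200, 2.1))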

When complex survey data are used to fit a regression model, the variance estimates for individual regression coefficients produced by standard software need to be adjusted by the individual design effects to obtain confidence intervals

$\hat{\beta}_k \pm t_\kappa \sqrt{\mathrm{deff}(\hat{\beta}_k)\,\hat{V}_{SRS}(\hat{\beta}_k)}, \quad k = 1, \ldots, p$.

The estimator $\hat{\beta}_k$ is assumed to be the weighted least squares (WLS) estimate of the finite population coefficient $\beta_{N,k}$ and also of the model parameter $\beta_k$. The weights used in the WLS estimation are the survey weights, not weights for correcting heterogeneity of the error variance. For a complex design, the number of degrees of freedom for the Student t distribution is usually approximated by κ = #clusters − #strata (Heeringa et al. 2010). Korn and Graubard (1998) propose using the Bonferroni method for simultaneous inference about regression parameters. For the overall F-test, one would need a design-consistent estimate of the p×p covariance matrix, $\hat{V}_p(\hat{\beta})$, or a p×p matrix of design effects together with $\hat{V}_{SRS}(\hat{\beta})$.
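The same adjustment for a single regression coefficient might look as follows (Python; scipy supplies the t quantile, and all the numbers are invented): the SRS variance from standard software is scaled by the coefficient's design effect, with κ = #clusters − #strata degrees of freedom.

import math
from scipy.stats import t as student_t

def deff_adjusted_beta_ci(beta_hat, v_srs, deff, n_clusters, n_strata,
                          level=0.95):
    # CI for one coefficient: beta_hat +/- t_kappa * sqrt(deff * V_SRS),
    # with kappa = #clusters - #strata (Heeringa et al. 2010).
    kappa = n_clusters - n_strata
    tval = student_t.ppf(1 - (1 - level) / 2, df=kappa)
    half = tval * math.sqrt(deff * v_srs)
    return beta_hat - half, beta_hat + half

# Invented values: beta = 0.42, V_SRS = 0.004, deff = 1.8, 120 clusters, 30 strata.
print(deff_adjusted_beta_ci(0.42, 0.004, 1.8, 120, 30))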

As we will be looking at estimation under a superpopulation model in some of our cases, we now define a design effect under a superpopulation model, $\mathrm{deff}_{\sup}$:

$\mathrm{deff}_{\sup}(\hat{\theta}) = \frac{V(\hat{\theta} \ \text{under the model-design, design } p\text{, with } n \text{ observations})}{V(\hat{\theta} \ \text{under SRS with replacement with } n \text{ observations})} = \frac{V_{\sup}(\hat{\theta})}{V_{SRSWR}(\hat{\theta})}$,

where $\hat{\theta}$ is a design-consistent, model unbiased estimator of our model parameter θ.

For contingency tables, various methods have been proposed to account for the survey design when testing for homogeneity of population proportions or for independence of variables. The first order correction proposed by Rao and Scott (1981, 1984) uses the design effects of the estimated cell, row, and column proportions. We will discuss this approach further in Chapter 3, where we adjust the traditional test for heterogeneity using a Rao-Scott correction.


2.6 Mathematical Examples

As noted above, the finite population and the analytical model are different levels of inference. In the experimental case, we are always considering the analytical model and only ever thinking of combining model data, whereas in the survey case we could be thinking about estimating quantities from the finite population or from the joint model-design. We do not recommend combining survey data under a purely model based or frequentist approach, because in general the analyst will not have the design information necessary to determine whether the design is ignorable. Moreover, we feel that if given the option between a finite population estimate and the census estimate, the analyst would consider the census parameter 'the best' estimate from a single fixed population. Chapter 3 will show what we must assume in order to get a better estimate of the superpopulation model using meta-analysis.

Some Theory of Sampling: W. Deming

The first example illustrating this difference comes from chapter 7 of William Deming's (1950) textbook Some Theory of Sampling. Using a simple example, Deming illustrates the differences between estimation of the finite population (descriptive) parameter and the superpopulation model (analytic) parameter. If we consider the case of a production line of containers that contain either white or red chips, we can think of the following two cases:


Analytic or model inference
Purpose: from a sample drawn from the lot container, estimate p, the proportion of red chips in the production process.
Method: let a succession of lots be drawn from the supply with replacement, and from each lot let a sample be drawn without replacement.
Estimation (assumes a model and uses the sample to estimate its parameters): if r denotes the number of red chips in a sample of size n, then

$E\left[\frac{r}{n}\right] = p$ and $CV\left(\frac{r}{n}\right) = \sqrt{\frac{q}{np}}$, where $q = 1 - p$.

Descriptive or finite population inference
Purpose: from a sample drawn from the lot container, estimate the proportion x/N of red chips in the lot container.
Method: let a series of samples be drawn without replacement from a given lot. In this lot, x chips are red and N − x chips are white.
Estimation (assumes sampling is random): if r denotes the number of red chips in a sample of size n, then

$E\left[\frac{r}{n}\right] = \frac{x}{N}$ and $CV\left(\frac{r}{n}\right) = \sqrt{\frac{N-n}{N-1}\cdot\frac{N-x}{nx}}$.

Here we can see that the inference and the methods are different in each case. If we have a census, we can see explicitly that in the descriptive finite population case the coefficient of variation, CV, will be 0. For the analytical model inference, however, the CV with a census will be $\sqrt{q/(Np)}$, which, although small, is not 0.
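The contrast is easy to verify numerically. The following sketch (Python, with an invented lot) evaluates both closed-form CVs; note that setting n = N drives the descriptive CV to zero, while the model CV remains $\sqrt{q/(Np)}$.

import math

N, x, n = 1000, 300, 50      # lot size, red chips in the lot, sample size
p = x / N                    # model (process) proportion of red chips
q = 1 - p

# Analytic (model) CV of r/n: r ~ Binomial(n, p) under the process model.
cv_model = math.sqrt(q / (n * p))
# Descriptive (finite population) CV of r/n: sampling without replacement
# from this particular lot, so r is hypergeometric.
cv_finite = math.sqrt((N - n) / (N - 1) * (N - x) / (n * x))
print(cv_model, cv_finite)

# With a census (n = N): the descriptive CV is exactly 0, the model CV is not.
print(math.sqrt((N - N) / (N - 1) * (N - x) / (N * x)),
      math.sqrt(q / (N * p)))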


Answers to Criticisms of Using Design-Based Methods for Inference: D. Binder

The second example comes from a presentation by David Binder, Answers to Criticisms of Using Design-Based Methods for Inference, given at Statistics Canada in 2008. The following data have been generated randomly using 100 independent and identically distributed Bernoulli trials (M or F), where the "unknown" Bernoulli parameter is θ = 0.51:

{M M M M F F M M M F F F F F M F M F M M F M F M F M F F M F F F F F F M F M M F

F M M M M M F M M M M M M F F M M F F M F M F M M F M F F M F M M M M M M F

F M M M M F F F M F M M F F F M F F F M M M}

The set consists of 46 F's and 54 M's. Using a classical model based approach to estimate the proportion of females, we let $y_i = 1$ if F and $y_i = 0$ if M. We then calculate the estimate $\hat{\theta}_\xi = \sum y_i / n$, where n is the number of Bernoulli trials. The expectation of $\bar{y}$ (the proportion of females) under the assumed model (Bernoulli trials) is $E_\xi(\hat{\theta}_\xi) = \theta = 0.51$, and the variance under the assumed model is $V_\xi(\hat{\theta}_\xi) = \theta(1-\theta)/n$, with the estimated variance being $\hat{V}_\xi(\hat{\theta}_\xi) = \hat{\theta}_\xi(1-\hat{\theta}_\xi)/(n-1)$. In this case, what we assume to be random is the independent Bernoulli trials, and we are trying to estimate θ = 0.51.


Now, if we arrange the observations into households using a non-random process, so that large households contain more individuals of the same sex than would be expected from a purely random assignment of units to households of the given sizes, we have the following:

{(M M M M F), (F M), (M M F), (F F F F M), (F M), (F M), (M F), (M F), (M F), (M F), (F M), (F F F F F F M), (F M), (M F), (F M), (M M M M F), (M M M M M M F), (F M), (M F), (F M), (F M), (F M), (M F), (M F), (F M), (F M), (M M M M M F), (F M), (M M M F), (F F M), (F M), (M F), (F F M), (F F F M), (M M)}

Our households now have the following compositions, with multiplicities:

(1M, 1F) × 23; (2M, 0F) × 1; (2M, 1F) × 1; (3M, 1F) × 1; (4M, 1F) × 2; (5M, 1F) × 1; (6M, 1F) × 1; (1M, 2F) × 2; (1M, 3F) × 1; (1M, 4F) × 1; (1M, 6F) × 1.

We can consider the persons in these 35 households as the finite population of interest, and can estimate the proportion of females by taking a cluster sample. We select m households from the 35 households at random, with replacement, using selection probabilities proportional to household size, and then enumerate all persons in the selected households. Suppose that in the ith selected household we observe $h_i$ household members, of whom $x_i$ are female. Here we are trying to estimate the census parameter for the proportion of females, which is $\theta_p = 0.46$. This is conceptually different from the model parameter 0.51, and it is important for the analyst to recognize this difference. The standard probability weighted estimate of the proportion of females in the finite population is

$\hat{\theta}_p = \frac{1}{m}\sum_{S}\frac{x_i}{h_i}$,

which for our sample equals 0.492. We know that this is a design unbiased estimator of the census parameter, $E_p(\hat{\theta}_p) = 0.46$, where the design based expectation is across all possible samples under the same design. The randomness is thus due to the repetition of samples, each consisting of a selection of m households from the 35 households; the number of males and females in each of the 35 households is fixed.

If we assume instead that the sample is generated from independent, identically distributed Bernoulli trials then, as in the example from Deming, the standard estimator of the proportion of females is the unweighted estimator

$\hat{\theta}_\xi = \frac{\sum_S x_i}{\sum_S h_i}$,

where the numerator is the number of females in the sample S and the denominator is the number of persons in the households in the sample S. Here, the model expectation is the model parameter, that is, $E_\xi(\hat{\theta}_\xi) = \theta$. If instead of i.i.d. Bernoulli trials we assumed a different model, we would not have model unbiased estimates. For our particular sample, we have $\hat{\theta}_\xi = 0.5$. However, the model based estimator is not design unbiased: under repeated sampling, $E_p(\hat{\theta}_\xi) = 0.445$, which is neither the census nor the model parameter. The design bias of the model estimator is $E_p(\hat{\theta}_\xi) - \theta_p = 0.445 - 0.46 = -0.015$.

This bias is due to the fact that the selection probabilities are directly related to household size. If household size were uncorrelated with the number of females in the household, the model estimate would be design unbiased. Since the correlation is small, the bias is small. Here the design is informative, and the design is not ignorable for this particular estimate.
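The design bias is easy to reproduce by simulation. The sketch below (Python; the number of sampled households m is not specified in the presentation, so we take m = 5 as an assumption) draws repeated pps-with-replacement cluster samples from the 35 fixed households and averages the two estimators; the weighted estimator centres near the census parameter 0.46 and the unweighted one near 0.445.

import numpy as np

# Household compositions (males, females) and their multiplicities,
# taken from the example above.
comps = [((1, 1), 23), ((2, 0), 1), ((2, 1), 1), ((3, 1), 1), ((4, 1), 2),
         ((5, 1), 1), ((6, 1), 1), ((1, 2), 2), ((1, 3), 1), ((1, 4), 1),
         ((1, 6), 1)]
households = [(m + f, f) for (m, f), c in comps for _ in range(c)]
sizes = np.array([h for h, _ in households], dtype=float)    # h_i
females = np.array([f for _, f in households], dtype=float)  # x_i
probs = sizes / sizes.sum()       # pps selection probabilities

rng = np.random.default_rng(0)
m, reps = 5, 100_000              # m is our assumption, not from the text
theta_p = np.empty(reps)
theta_xi = np.empty(reps)
for r in range(reps):
    s = rng.choice(len(households), size=m, replace=True, p=probs)
    theta_p[r] = np.mean(females[s] / sizes[s])      # probability weighted
    theta_xi[r] = females[s].sum() / sizes[s].sum()  # unweighted, model based

print(theta_p.mean())    # close to the census parameter 0.46
print(theta_xi.mean())   # close to 0.445: the design bias of -0.015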

If we approach meta-analysis naively in the survey context, we miss these important differences in inference. We need to state clearly what we wish to estimate and to ensure that the estimation from each study conforms to our needs.


Chapter 3 The Framework

‘Do the values of the estimate agree among themselves within the limits of their experimental errors?’ William Cochran 1954

3.1 The Framework for the Meta-Analysis of Survey Data

In this chapter, we will review the framework proposed by William Cochran for combining experiments, match the traditional estimation framework for meta-analysis to survey concepts, show how the traditional framework is inappropriate in the survey case, and propose a new framework for that case. A meta-analysis of survey data, like that of experimental data, will consist of a systematic review of the literature and an approach to comparing and synthesizing the data. Distinct from the experimental case, however, where there is either a fixed effect or a random effect model, we have three possible estimation models in the survey case: two fixed effect and one random effect. In what follows, we will demonstrate, by way of examples, how a meta-analysis done with survey data under the traditional meta-analysis framework will be spurious. We will illustrate the assumptions that are needed for each estimation model. In order to clarify the concepts in the framework, we will explore asymptotic convergence for each estimation model. In any meta-analysis we would want to test the model assumptions, in particular that of homogeneity. Unlike the experimental case, we can have unexplained heterogeneity due not only to different underlying models but also to sampling, particularly when there is clustering present in the data. This affects both estimation and model testing, and we will illustrate how a naive application of meta-analysis methods in the presence of clustering results in incorrect quantifications of heterogeneity. We will therefore propose a new weight that is adjusted for the additional variance in cluster samples.

Experiments as Generated in a Superpopulation Survey Context

After the researcher has reviewed the literature, found all the appropriate studies, and summarized their similarities and differences qualitatively, he is ready to start exploring models he can use to combine the data. In the case of the meta-analysis of experiments, if the researcher finds the estimates to agree among themselves within the limits of their experimental errors, then he would use a fixed effect model of the form $\theta_i = \mu + \varepsilon_i$. As noted previously, he is assuming that there is homogeneity of effect across all studies, that is, $\theta_1 = \theta_2 = \cdots = \theta_r = \theta_N$, and he may also (for the purposes of testing) assume that, asymptotically, the estimate $\hat{\theta}_i$ of the treatment effect is a random variable having a normal distribution with mean $\theta_i$ and variance $\sigma_i^2$. Under the null hypothesis, the estimate $\hat{\theta}_i$ is then a normal random variable with mean μ and variance $\sigma_i^2$. Using a superpopulation framework like that used in sampling, this model would correspond to Figure 3.1, where the experiments are generated from a single superpopulation, with mean μ and variances $\sigma_i^2$, with no finite sampling step.


Figure 3.1 Experimental framework for a fixed-effect model using sampling concepts. [Diagram: a single superpopulation with parameter μ generates the experiments directly, yielding the estimates $\hat{\theta}_1$, $\hat{\theta}_2$, $\hat{\theta}_3$; there is no finite sampling step.]

If, however, the researcher thinks that the studies differ by more than what he feels can be attributed to experimental error, but he still feels that the studies come from an overall population of possible experiments, he has a model of the form $\theta_i = \mu_i + \varepsilon_i$, where $\mu_i \sim N(\mu, \sigma_\mu^2)$ and $\varepsilon_i \sim N(0, \sigma_i^2)$, with $\mu_i$ and $\varepsilon_i$ independent random variables. Inference is based on the assumption that the effects are independently and identically sampled from an overall population of all such effects.
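To fix ideas, a random effect data set can be generated in two stages exactly as the model states; in the sketch below (Python, with invented hyperparameters and variances) each study mean $\mu_i$ is drawn from the overall population of effects and the observed effect adds the within-study error.

import numpy as np

rng = np.random.default_rng(42)
r = 8                                   # number of studies
mu, sigma_mu = 0.5, 0.1                 # invented hyperparameters
sigma_i = rng.uniform(0.05, 0.15, r)    # within-study standard deviations

mu_i = rng.normal(mu, sigma_mu, r)      # mu_i ~ N(mu, sigma_mu^2)
theta_hat = rng.normal(mu_i, sigma_i)   # theta_i = mu_i + eps_i

# Naive fixed effect pooling: correct only if sigma_mu = 0; under
# heterogeneity its standard error understates the true uncertainty.
w = 1 / sigma_i**2
print((w * theta_hat).sum() / w.sum())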

It is important to see that this assumption, though presented mathematically, must have conceptual validity. In the presence of between-study heterogeneity the mean, in the experimental case, can only be viewed as the center of the distribution of the treatment effect. Therefore, it is necessary to understand what this center, or overall effect, actually represents for the particular subject under study. We should not proceed if we cannot explain what the pooled estimate will mean, or clearly define how it should be used and, possibly more importantly, how it should not. If, however, we can explain the overall effect, we then need an estimation approach that accounts for between-study heterogeneity. Under a superpopulation framework, this corresponds to Figure 3.2, where we have several superpopulations that were generated from a hyper-superpopulation, and we call the parameters of the hyper-superpopulation hyperparameters.

Figure 3.2 Experimental framework for a random-effect model using sampling concepts. [Diagram: a hyper-superpopulation, with $\mu_i \sim N(\mu, \sigma_\mu^2)$ and $\varepsilon_i \sim N(0, \sigma_i^2)$, generates superpopulations 1 to 3, each with $\theta_i = \mu_i + \varepsilon_i$; the experiments yield the estimates $\hat{\theta}_1$, $\hat{\theta}_2$, $\hat{\theta}_3$, from which the hyperparameter estimates $\hat{\mu}$ and $\hat{\sigma}_\mu^2$ are formed.]

To determine which case best represents the data, and therefore which model to use, we would test for heterogeneity. Several researchers note that one of the key parts of meta-analysis is testing to determine whether the study results are in fact homogeneous (Higgins et al. 2003, Huedo-Medina et al. 2006, Breslow and Day 1980). When we test for heterogeneity we are, in fact, trying to assess whether there is consistency across the studies. The test assesses whether, under the null hypothesis, the variability of the data can be attributed to what researchers call 'sampling variability', which is in fact the variability from the generation of the data (from the superpopulation) and not from sampling in the sense used in sampling theory. It is important to note that we are testing model structure: that is, are we in the first case, where we have one superpopulation which generates multiple experiments? Or are we in the second case, where we have several superpopulations which generate the different experimental results, and which are connected through a hyper-superpopulation?

Case 1: Combining Finite Population Estimates

In the case of the meta-analysis of survey data, we must clarify this concept of model structure and sampling variability further, as we are adding stochastic steps, namely the generation of the finite population and the sampling procedure. This makes the problem distinct from the experimental case.

To construct the possible models under which we are combining samples, we first look at the case where we have a single finite population. If we had information on all the units in the finite population, we would have a census. However, it is often impossible to get information from all units in a population (due to cost or speed), and a sample is used to produce estimates of the finite population characteristics. As noted in Chapter 2, we can get an estimate of a census parameter $\theta_N$ using survey data under a design based framework, $\hat{\theta}^{F(F)}$, or by invoking a superpopulation that generated the finite population, $\hat{\theta}^{F(S)}$. If the data from a single survey are not sufficient for our research needs, we might consider pooling several different estimates of the finite population. Using diagrams in the style of Binder, in the case of a single finite population the two inferential frameworks for estimation lead to the two different situations shown in Figures 3.3a and 3.3b.


Figure 3.3a Fixed-effect model, Case 1a, using sampling concepts. [Diagram: Superpopulation 1, with $\theta \sim N(\mu, \sigma^2)$, generates a finite population with parameter $\theta_N$; repeated samples from this finite population yield the design-based estimates $\hat{\theta}_1^{F(F)}$, $\hat{\theta}_2^{F(F)}$, $\hat{\theta}_3^{F(F)}$.]

Figure 3.3b Fixed-effect model, Case 1b, using sampling concepts. [Diagram: the same structure, with the samples yielding the estimates $\hat{\theta}_1^{F(S)}$, $\hat{\theta}_2^{F(S)}$, $\hat{\theta}_3^{F(S)}$, calculated under the superpopulation framework.]

In Figure 3.3a the superpopulation is shaded to indicate that, although it is there, we are not using it for inference. We can also see from Figure 3.3a that if we consider the case where we take all possible samples of a single design from the finite population, we are in fact in the setting where classical design theory starts. The key difference in this thesis from a simple application of design based theory, however, is that these samples need not all come from the same design. In this problem we have distinct samples, and thus a distinct probability space for each arrow in our figures. In this first case, even when we have different designs (i.e., different stochastic processes leading up to the samples), since we only have replicates of the finite population we will only ever converge to the census parameter, regardless of the inferential model. It also means that we have homogeneity across all studies, that is, $\theta_1 = \theta_2 = \cdots = \theta_r = \theta_N$.


We would not, however, ever combine estimates using both $\hat{\theta}^{F(S)}$ and $\hat{\theta}^{F(F)}$ in the same study. This is because, as noted in Chapter 2, the calculation of $\theta^{F(S)}$ and $\theta^{F(F)}$ differs. Point and variance estimation for a finite population parameter under finite sampling takes into account the stochastic error due to sampling, not the error in the model; inference is based on repeated sampling of the population. In the case of the finite population parameter under a superpopulation, on the other hand, inference is based on a model conditioned on the sample. We can see trivially that, even with the same sample, we could calculate different values for the point estimate and for the variance estimate of $\theta^{F(S)}$ and $\theta^{F(F)}$. It is therefore imperative to know which framework we are using for estimation. Additionally, even though we can estimate a finite population characteristic under a superpopulation model, published estimates from surveys are design based, that is, of the form $\theta^{F(F)}$. Therefore, any combination of estimates for a single finite population using published data will only be of this type.

Although combining data from many surveys of the same finite population can be thought of as a special case of the traditional fixed effect approach in meta-analysis, estimation of the census parameter is usually not the objective of a meta-analysis. Rather, estimation of the census parameter is an important estimation problem for survey statisticians and is frequently explored in the survey context. As noted in Chapter 1, this census estimation problem via the combination of multiple estimates includes double sampling of a population, overlapping frames, cumulating cases, estimation of the finite population where the finite population is partitioned into either overlapping or mutually exclusive subpopulations, and small area estimation. Small area estimation is an area of sampling that deals with problems of estimating rare populations. As discussed in Chapter 1, a small area in a survey context is any domain for which a direct (design based) survey estimate of adequate precision cannot be produced. As noted earlier, however, these methods often require knowledge of such things as the design or the joint inclusion probabilities, data which analysts may not have. Analysts may, therefore, choose to combine finite population estimates using meta-analysis.

If we are going to use meta-analysis, however, we must first clarify which inferential case we are in. If we are interested in inference about the census parameter, and want to use a meta-analysis approach, then we should be combining estimates from different surveys of the type $\hat{\theta}^{F(F)}$. That is, each estimate should be a design consistent finite population estimator, calculated under a finite population inference model. We will call this Case 1a. The appropriate estimation model is then a fixed effect model. This fixed effect model is a weighted average and takes the general form

$g_{1a}(\hat{\theta}_N) = \frac{\sum_{i=1}^{r} w_i \hat{\theta}_i^{F(F)}}{\sum_{i=1}^{r} w_i}$,

where $w_i$ is the weight given to each study. We will extend the discussion of weights in the survey case, and give a weight that adjusts for different sampling plans, namely the inverse of the variance divided by the design effect, in section 3.5. If the weights are independent of the estimators $\hat{\theta}_i^{F(F)}$, we will have an unbiased estimate of the census parameter (for details, see section 3.3 below). Chapter 4 will illustrate, using simulations under different sampling plans, the convergence of the estimator $g_{1a}(\hat{\theta}_N)$ to the census parameter as the number of studies grows large.


There exist in the literature only a few examples of meta-analysis methods that illustrate how to combine finite population survey estimates. A case in point is Thomas and Wannell (2009), who present a similar estimator in their article on combining data from different cycles of the Canadian Community Health Survey (CCHS). The CCHS is a cross-sectional survey that collects information related to health status, health care utilization and health determinants for the Canadian population. Each year of the cross-sectional survey is called a cycle, and for its first three cycles the survey was biennial. After a redesign in 2007, however, the survey became annual. Each cycle has a core component along with occasional supplements on specific health topics such as healthy aging.

Although the CCHS has a large sample and was designed so that estimates can be produced at the health region level, sometimes a single cycle does not have enough respondents in a particular subpopulation to meet a researcher's analytical needs. Such was the case in Thomas and Wannell, who looked at the prevalence of smoking among youth aged 12 to 19 in a single health region. As we see in Table 3.1, in any particular cycle for the Durham Health Region we do not have many 12 to 19 year old respondents, and only a very small percentage of these identify themselves as smokers. In the same table, we can see that from 2001 to 2005 there was a decrease in the percentage of smokers. This is not an inaccuracy in the data, but an indication of a change in the percentage smoking over time.


Table 3.1 Estimates of daily smokers aged 12 to 19, Canadian Community Health Survey, cycles 1.1 to 3.1, Durham Health Region

                                 Cycle 1.1      Cycle 2.1      Cycle 3.1
Sample aged 12 to 19 (n)         187            210            214
Estimated total aged 12 to 19    61,220         66,523         70,380
Proportion of daily smokers      12.38%         6.91%          7.26%
Variance of the proportion       0.08%          0.04%          0.05%
Design effect                    2.34           2.77           2.5

Source: 2000/2001 Canadian Community Health Survey, cycle 1.1; 2003 Canadian Community Health Survey, cycle 2.1; 2005 Canadian Community Health Survey, cycle 3.1.

The authors used an unweighted fixed effect model and estimated the average percentage of smokers to be 8.67%. As meta-analysis weights were not used, they treated each sample as being of equal importance and equal precision, which is not the case, and this error results in an overestimation of the average effect. The average percentage of smokers is an estimate across the whole time period, and the authors do, in fact, note that the combined figure masks the change in youth smoking over the period. Such a situation, however, should have been a red flag for the authors, prompting them to choose a different example for pooling, since a pooled estimate using a fixed effect model should not mask anything; rather, it should estimate the average for the finite population it represents.

We feel that in many ways the Thomas and Wannell paper misses important ideas (and assumptions) in the meta-analysis of survey data. Using the framework in this thesis, we would first regard the published estimates from the CCHS as design consistent, each of the form $\hat{\theta}^{F(F)}$. Next, we would consider the possibility that the three years are not samples from the same finite population. We would want to test our assumption that these three estimates were generated from the same superpopulation model (though we would be cautious of test results based on three data points). In order to do so, we would first calculate a fixed effect estimate for use in our heterogeneity test statistic.

Furthermore, unlike Thomas and Wannell, we would not assume that the three cycles are equally precise. In Table 3.1, we can see that each cycle has a different variance and, moreover, that because this is a rare population the variance is not stable across surveys. Next, as we are combining different years of the same survey, we would not assume that, in the absence of a redesign, the design effect for each cycle is the same. The CCHS calculates its published design effects as the 75th percentile of the design effects from all standard published estimates. As the design effect varies with what we are estimating, an overall design effect for any survey is thus either an average or a percentile value. The reported CCHS design effects are 2.34 for cycle 1.1, 2.77 for cycle 2.1 and 2.5 for cycle 3.1. In a fixed effect model we would therefore weight each sample by the inverse of its variance divided by its design effect. Our pooled estimate is then

(12.38% × 2.34/0.08% + 6.91% × 2.77/0.04% + 7.26% × 2.5/0.05%) / (2.34/0.08% + 2.77/0.04% + 2.5/0.05%) = 8.14%.

As we would expect with three data points, our estimate is similar to that reported by Thomas and Wannell. However, the variance of their estimate was 0.02%, so their confidence interval would not include 8.14%. We note that our estimate takes into account that cycle 2.1 has the smallest value, the smallest variance, and the largest design effect.


We would then move on to testing the assumption that the three estimates are similar. Following the testing procedure derived in section 3.4, we calculate a design consistent Cochran's Q of 6.69. Comparing this value to a chi-square distribution with 2 degrees of freedom gives a p-value of 0.0352. We must note, however, that the test is probably not accurate, for two reasons: first, we are measuring a rare condition with high variability in each sample estimate and, second, we have a small number of samples (Cochran (1954) suggested having at least 6 estimates for the original test). Given these caveats, we would be reluctant to reject the null hypothesis that the estimates are homogeneous on the strength of this test alone; but we would also remark that, in addition to testing similarity statistically, we would want the average estimate to make conceptual sense. A pooled estimate (either ours or that of Thomas and Wannell) has several conceptual problems: what does an average smoking rate of 8.14% (or 8.67%) over three years mean, and what use does it have? Furthermore, if we were planning health care interventions in this region we would want to know each year's estimate, and we would also want to know how the smoking rate changed each year. All of this information is lost in the average. It is an oft-cited problem in meta-analysis that analysts combine data without any thought as to what the combined estimate means or how it should be used.
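The pooled estimate and the heterogeneity statistic above can be reproduced from Table 3.1 with a few lines (Python; note that the rounded table entries give a pooled value of about 8.1%, slightly below the 8.14% quoted, which was presumably computed from unrounded estimates).

from scipy.stats import chi2

# Published CCHS figures from Table 3.1, written as proportions.
est  = [0.1238, 0.0691, 0.0726]
var  = [0.0008, 0.0004, 0.0005]
deff = [2.34, 2.77, 2.5]

# Weight each cycle by the inverse of its variance divided by its deff.
w = [d / v for v, d in zip(var, deff)]
pooled = sum(wi * e for wi, e in zip(w, est)) / sum(w)

# Design consistent Cochran's Q (section 3.4) against a chi-square with
# (number of studies - 1) degrees of freedom.
Q = sum(wi * (e - pooled) ** 2 for wi, e in zip(w, est))
print(pooled)                          # about 0.081
print(Q, chi2.sf(Q, df=len(est) - 1))  # about 6.69 and p of about 0.035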

Although Thomas and Wannell do not note that average smoking rates may have no practical use, they do explain that, due to changes in content, geography, coverage and methodology, it may not make conceptual sense to combine different surveys. They note that in many cases "the results of different cross-sectional health surveys may not be comparable, and in most situations, should not be combined." In spite of this warning, however, several other authors, apparently anxious to seize upon a simple method for combining surveys, have used Thomas and Wannell's approach to pool survey data. Further remarks on these examples follow later in the thesis.

Because, prior to this work, the only framework for meta-analysis was the experimental one, most authors do not differentiate between the simple case of combining finite population design based estimates and the other, more complex cases that we will outline below. The following case (call it Case 1b), for example, is never mentioned. If, instead of a finite population viewpoint, we adopt a superpopulation point of view for our inference, we have the alternative estimator

$g_{1b}(\hat{\theta}_N) = \frac{\sum_{i=1}^{r} w_i \hat{\theta}_i^{F(S)}}{\sum_{i=1}^{r} w_i}$,

where once again $w_i$ is the weight given to each study. The reader should note that if the samples are all selected from the same single finite population, we will converge to the census parameter even though we have invoked a superpopulation: hence the subscript N. Again, if the weights are independent of the estimators $\hat{\theta}_i^{F(S)}$, we will have an unbiased estimate of the census parameter (for details see section 3.3).

A reader may ask why we would even consider combining finite population estimates under a superpopulation when we believe we have only one finite population and we are converging to the finite population parameter. The answer is that for many applications the estimates of interest are associated with the stochastic process, or model, that generated the data, rather than with functions of the data themselves. In this case, we would want an estimate that describes the model and, as noted in Godambe and Thompson (1986), if we are using an optimal estimating equation for estimation, the census estimate $\theta_N$ is an estimate of the superpopulation parameter.

As noted previously, we may mistakenly think that it does not matter which case we use or, worse, that we can use both types of estimators together. Cases 1a and 1b differ in a few important ways. Although the point estimators $\hat{\theta}_i^{F(F)}$ and $\hat{\theta}_i^{F(S)}$ can be mathematically identical (for example, the weighted estimator of the mean under stratified sampling), the variance calculations are not. Case 1a is strictly design based, and its variance could be calculated by several methods, including the bootstrap, the jackknife, generalized variance formulas, or a standard variance formula for a particular design, such as the following variance estimator for the Horvitz-Thompson estimator of the mean:

$\hat{V}\left(\hat{\bar{Y}}_{HT}\right) = \frac{1}{N^2}\left[\sum_{i=1}^{N}\sum_{k \neq i}(\pi_i \pi_k - \pi_{ik})\left(\frac{t_i}{\pi_i} - \frac{t_k}{\pi_k}\right)^2 + \sum_{i=1}^{N}\frac{V(\hat{t}_i)}{\pi_i}\right]$.

For Case 1b, however, we need a two-phase variance estimator calculated using both the superpopulation model and the sampling design.

As published estimates from surveys are of the form $\hat{\theta}_i^{F(F)}$, in order to combine estimates under a superpopulation model an analyst would need to calculate $\hat{\theta}_i^{F(S)}$. Several authors have suggested estimation methods using a superpopulation model. This thesis uses the estimator suggested by Graubard and Korn (2002) because, unlike others in the literature, it does not require the calculation of joint inclusion probabilities and is estimable from a survey master file when the sampling design information (i.e., cluster and stratification information) is available.

Additionally, it makes use of simple modifications of design-based estimators and does not carry strong model assumptions. For example, with stratified sampling, let $K_h$ be the known number of observations in the hth stratum, h = 1, ..., L, and let $k_h$ be the number of sampled units in the hth stratum. The estimator of Graubard and Korn has two components, accounting for the generation of the finite population from the superpopulation and for the variance from sampling, and is of the form

$\mathrm{Var}_{SP}[\bar{y}] = E_\xi\left(\mathrm{Var}_d(\hat{\bar{Y}})\right) + \mathrm{Var}_\xi\left(E_d(\hat{\bar{Y}})\right) = \mathrm{var}_{wo}(\bar{y}) + \mathrm{var}[\bar{Y}]$

where

$\mathrm{var}[\bar{Y}] = \frac{1}{K-1}\sum_{h=1}^{L}\frac{K_h}{K}(\bar{y}_h - \bar{y})^2 + \frac{1}{K}\sum_{h=1}^{L}\frac{K_h}{K}\left(1 - \frac{K - K_h}{(K-1)k_h}\right)s_h^2$.

As noted by Korn and Graubard (1998), to obtain an unbiased estimator of the variance of $\bar{y}$ for the superpopulation, we either add an estimate of $\mathrm{var}[\bar{Y}]$ (as it is an estimator of $\mathrm{Var}_\xi[E_d(\hat{\bar{Y}})]$) to the $\mathrm{var}_{wo}(\bar{y})$ estimator, or add a correction to the with-replacement estimator for stratified sampling from the L strata.

Variances

If the estimated variances of the data are not constant across studies, using weighted least squares with weights inversely proportional to the variance of each study will provide the most precise parameter estimates possible. To calculate the variance of the pooled estimator, we therefore apply weighted least squares theory, illustrating the calculations for the reader with two samples. To begin, suppose we have two uncorrelated estimates $\hat{\theta}_1$ and $\hat{\theta}_2$ of our parameter (for simplicity we drop the superscripts). We can easily show that the value of φ that minimizes the variance of the combination is inversely proportional to the variance of each estimate (Roberts 2011). Let $g(\theta) = \phi\hat{\theta}_1 + (1-\phi)\hat{\theta}_2$, where $\hat{\theta}_1$ and $\hat{\theta}_2$ are independent and φ is known. Then

$V(g(\theta)) = \phi^2 V(\hat{\theta}_1) + (1-\phi)^2 V(\hat{\theta}_2) = \phi^2 V_1 + (1-\phi)^2 V_2$.

Next, to minimize the variance, we solve

$\frac{\partial V}{\partial \phi} = 2\phi V_1 + 2(1-\phi)(-1)V_2 = 0$.

We then have $\phi V_1 - (1-\phi)V_2 = 0$, which implies $\phi(V_1 + V_2) = V_2$, so that

$\phi = \frac{V_2}{V_1 + V_2} \quad\text{and}\quad 1 - \phi = \frac{V_1}{V_1 + V_2}$.

We then note that

$\phi = \frac{V_2}{V_1 + V_2} = \frac{V_2/(V_1 V_2)}{V_1/(V_1 V_2) + V_2/(V_1 V_2)} = \frac{1/V_1}{1/V_2 + 1/V_1}$

and

$1 - \phi = \frac{V_1}{V_1 + V_2} = \frac{V_1/(V_1 V_2)}{V_1/(V_1 V_2) + V_2/(V_1 V_2)} = \frac{1/V_2}{1/V_2 + 1/V_1}$,

and thus $\phi = c/V_1 \propto 1/V_1$ and $1 - \phi = c/V_2 \propto 1/V_2$. Therefore, to minimize the variance, we let

$g(\theta) = \frac{c\hat{\theta}_1}{V_1} + \frac{c\hat{\theta}_2}{V_2}$.

If the weights are the inverses of the variances, then the variance of the pooled effect $g_{1a}(\hat{\theta}_N)$ or $g_{1b}(\hat{\theta}_N)$ is

$v(g(\hat{\theta})) = \frac{1}{\sum_{i=1}^{k} w_i}$,

where $w_i$ is the weight for the ith study.
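As a quick numerical check of this derivation (with toy numbers of our own): with $V_1 = 1$ and $V_2 = 4$,

$\phi = \frac{V_2}{V_1 + V_2} = \frac{4}{5} = 0.8, \qquad
V(g(\theta)) = (0.8)^2(1) + (0.2)^2(4) = 0.64 + 0.16 = 0.8 = \frac{1}{1/V_1 + 1/V_2},$

so the pooled variance equals $1/\sum_i (1/V_i)$ and is smaller than the variance of either estimate alone.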

Unfortunately, however, these optimal weights, which are based on the true variances of each data point, are unknowable, and estimated weights must be used instead. Of course, when estimated weights are used, the optimality properties associated with known weights no longer strictly apply. Yet we know that if we can estimate the weights with enough precision, they can significantly improve the parameter estimates compared with the results obtained when all the data points are equally weighted. We also know that we can have added variability, due to the sampling design, that can mask differences between estimates. We can, however, adjust for the excess variability from clusters in the sample. Therefore, we suggest the use of either $1/v_i$, where $v_i$ is the variance of the ith study, when there is little to no clustering, or the improved weight $1/(v_i/\mathrm{deff}_i)$ (as will be discussed in section 3.5), where $\mathrm{deff}_i$ is the design effect and $v_i$ is the estimated variance of the ith study, when there is clustering. Either weight is proportional to the inverse variance of each study's estimate. We would then interpret the variances as being either an average of finite population variances, as in Case 1a, or an average of superpopulation variances, as in Case 1b.

To review, under Cases 1a and 1b we are combining survey estimates of the finite population. Even in this simplest of cases, we make several strong assumptions not cited in the literature on combining survey data. For example, we assume that the samples are independent, but for survey data from large statistical agencies such an assumption may be invalid. In both the United States and Canada, for instance, the national statistical agencies use their labour force surveys as frames for subsequent surveys. In point of fact, Laflamme (2009) showed that the covariance between the LFS and the CCHS is not 0 and, more importantly, that the covariance term is not of order n.

Furthermore, if we are combining finite samples for estimation of the census parameter we are also assuming that the surveys cover the same target population. Using the CCHS example above, we would believe that the survey’s target subpopulation of 1219 year olds is the same for all surveys. Although it is clearly the case with the CCHS example that at each cycle we are surveying 1219 year olds, we could for a particular analysis not, in fact, be covering the same target population. For example, consider that each year we have a new group of children becoming 12 and a new group becoming 20. Across the cycles we are therefore, in fact, surveying

different birth cohorts. Similarly, if we are interested in the effects on a birth cohort, or if there is

a period effect of importance, that is if there was a change that affected the entire population at

the same time (i.e., a legislative change for cigarette packaging), then as we are pooling across

time, we are not pooling the same target population.

An additional point is that it is simple to show that when the individual estimates are unbiased

and the weights are independent of the point estimates, the pooled estimate is also unbiased. What is not

so obvious for an audience without expert knowledge of surveys, however, is that we often think

more globally of measurement errors in the survey case (Groves 2004). In a measurement error

framework, it is assumed that Y is not actually observed, but rather Y*, where Y* is Y plus some

additive error e. Under this type of model we can have correlations between the different surveys

occurring because of correlations between errors associated with selected units in the different

surveys. An important example of this occurs where the errors are introduced by interviewers,

and an interviewer’s assignment covers more than one group. This could, for example, be the

case when you combine multiple surveys from the same organization. Ideally, to avoid this possibility you would need to ensure that interviewer assignments were arranged entirely within a

single survey, but this is not the case in practice. For example, the same interviewers are often used to contact individuals for the previous National Population Health Survey and the Canadian

Community Health Survey in Canada. Therefore, when we combine estimates from large statistical agencies we must not ignore the possibility of this correlation in our data.

Next, if we apply the traditional fixed effect model from the experimental case with our weights as our inverse variances, we are also assuming that the weights are independent of the estimates in each survey. In the experimental case, however, this independence assumption depends on the estimator, not on the experimental design we are using. For example, if we are pooling means under a normal model, the estimate of the variance and the estimate of the mean are independent, whereas they are not if we are pooling correlations. With sampling, independence between the estimates of the variance and the estimates of the mean rests not just on the estimators but on the designs.

For example, in some common designs such as probability proportional to size sampling (pps) or

Poisson sampling, the mean and variance are not independent. Additionally, if we are using a superpopulation framework, we must also note that the variance may not be independent of the mean under certain superpopulations. We will see this illustrated in Chapter 4.

Another important point is that we are only considering combining design unbiased estimates in this case. We will not, in most cases, converge to the census parameter without using a design unbiased estimate. The only time, therefore, that we could ignore the sampling weights in estimation would be if the design itself was ignorable for all analyses (or, equally, if the design was noninformative). We illustrate this in Chapter 4. Finally, we must also make clear to the reader that in Cases 1a and 1b we will not converge to the superpopulation parameter no matter how many extra samples we obtain. Even if we account for the extra variability due to the

superpopulation in each point estimate, adding more samples from the same finite population will

not provide us a better estimate of the model as a whole. It will only provide a better estimate of

the model conditional on the one finite population we are measuring.

Case 2: Combining Finite Parameters from more than one Finite Population

As noted throughout this thesis, researchers are often interested in prediction or cause and

therefore they are interested in the analytical model that generated these data, not the finite population. It is thus understood, for example, that if instead of just adding samples from the

same finite population into our combined estimate, we added in more samples from different

finite populations that were all generated from the same analytical model, we would get a better

estimate of the model parameters. Therefore, our second case occurs when we have different

finite populations that are generated from the same superpopulation, as shown in Figure 3.4.

Figure 3.4 Fixed Effect Model Case 2 using Sampling Concepts

[Diagram: a single superpopulation with parameter $\theta$ generates Finite Populations 1, 2, and 3; from each finite population a sample is drawn, yielding the estimates $\hat{\theta}^{F(S)}_1$, $\hat{\theta}^{F(S)}_2$, and $\hat{\theta}^{F(S)}_3$.]


Assuming once again that there is homogeneity across all studies, that is $\theta_1 = \theta_2 = \dots = \theta_n = \theta_N$, we are here attempting to use the estimated finite population parameters $\hat{\theta}^{F(S)}_i$ to estimate $\theta$, the superpopulation or model parameter. This estimate in a survey context is assumed to estimate the corresponding 'population' effect with a random error that stems only from the chance associated with the model and the sampling error (model-sampling error). That is, the model-sampling error is the joint variation due to the doubly stochastic process where first we generate the finite population from the superpopulation and then we randomly select a sample from the different finite populations. This estimation model still corresponds to the idea of a fixed effect model, as the parameters for the different finite populations are generated from the same distribution or model. We then would use the following estimator:

$$g_2(\hat{\theta}) = \frac{\sum_{i=1}^{n} w_i \hat{\theta}^{F(S)}_i}{\sum_{i=1}^{n} w_i}.$$

We would always combine estimators of our finite population parameter under a superpopulation model as we are combining surveys across different finite populations. Therefore, we must take into account the two stages in the estimation of the individual studies' variance. Again, by least squares theory, the variance of the pooled effect $g_2(\hat{\theta})$ is

$$v(g_2(\hat{\theta})) = \frac{1}{\sum_{i=1}^{k} w_i},$$

where $w_i$ is the weight for the $i$th study. For weights we suggest the use of either $1/v_i$, where $v_i$ is the variance of the $i$th study, when there is little to no clustering, or the adjusted weight $\mathrm{deff}_i/v_i$ (as will be discussed in section 3.5), where $\mathrm{deff}_i$ is the design effect and $v_i$ is the estimated variance for the $i$th

study when there is clustering. Now, unlike Case 1, we have a pooled estimate that will

converge to the superpopulation parameter, whose variance represents the average variance under

a superpopulation model. Again, if the weights are independent of our estimator $\hat{\theta}^{F(S)}_i$ we will

have an unbiased estimate of the census parameter (for details see section 3.3).

To summarize Case 2, we assume that the samples are independent. As noted previously,

however, this is a very strong assumption. We also assume that the variance and the point

estimates are independent, which as noted with Case 1 is anchored in the sampling design not the

estimator. We also assume that each of the sample’s target populations is the same for each set of

samples from a particular finite population. Finally, we assume that each finite population is

generated from the same superpopulation. These assumptions mean that we are combining

different finite realizations of the same superpopulation, via finite populations, via samples.

Although we are imposing a model structure, testing for heterogeneity will only show if there is

evidence that the samples are generated from the same model. The test for heterogeneity using

only samples (and not censuses) will not show if we are in Case 1 or Case 2 (as both assume

homogeneity visàvis the model). This note is important to the researcher as in practice it will be

very difficult to distinguish Case 1 from Case 2, and thus to distinguish in practice if we are

converging to the Census Parameter or the Superpopulation Parameter with estimates from

samples. If instead of having survey estimates you have the census estimates, then clearly you are in either Case 2 or Case 3, and thus you can use a test of homogeneity to determine which case you are in. Additionally, we could, with caution, determine which case we are under based on an understanding of how all the surveys fit together.

If, for example, we have several samples of the same population at the same time point, we are most likely in Case 1. If, however, we have multiple surveys over different individual target populations, but which have a common interpretation when pooled, we may be in Case 2. For example, a series of international surveys taken at the same time, where the estimates are homogeneous, might be considered a series of different finite populations generated under the same superpopulation model. Therefore, we suggest that the researcher clearly define the different finite populations that she feels she is combining and, if at all possible, test hypotheses of homogeneity using census parameters. For our CCHS example to be in Case 2, we would be assuming something such as that each year is a separate realization of the finite population. That is, we would be assuming that the youth have the same probabilities of starting, continuing, and quitting smoking in all three years.

However, the idea that we are in Case 2 at all is a very strong assumption, stronger even than that of independence. Here we are making two assumptions: first, we are assuming that a model generated the finite populations, and, in the case of most analytical modeling, we are also assuming that our model is correct. However, these model assumptions, whether structural or distributional, are often not completely verifiable. That said, even if we do get the model wrong, combining design consistent estimates will get us to the superpopulation, provided we use enough samples of sufficient size. For this reason, we suggest that we always combine design consistent estimates.


Case 3: Combining Finite Parameters from Multiple Superpopulations

Finally, in the third case, we could be combining estimates from different finite populations that are generated from multiple superpopulations, as in Figure 3.5 below.

Figure 3.5 Random Effect Model Case 3 using Sampling Concepts

[Diagram: a hyper-level statistical model with fixed hyperparameters $(\theta, \sigma^2)$ generates three statistical models (superpopulations) with parameters $(\theta_1, \sigma^2_1)$, $(\theta_2, \sigma^2_2)$, and $(\theta_3, \sigma^2_3)$; each model in turn generates finite populations $N_1$ and $N_2$, from which samples are drawn to give the estimate $\hat{\theta}$.]

What is important to note is that we consider each $\hat{\theta}^{F(S)}_i$ to be generated from a different superpopulation $i$, and that there is a hyper-superpopulation, with fixed hyperparameters, that generates the individual superpopulations. This corresponds to a random effects model in metaanalysis. Here, each superpopulation is related through a global mean that is generated from the hypersuperpopulation, and each superpopulation then generates a different finite population. We then sample within each finite population. From this process, we can see that the finite populations must be linked together in a conceptual model that accounts for the variability between both finite populations and sample populations.

In the simplest case, the assumptions are analogous to those of a randomeffects model (with a normal model). That is, the estimate of each superpopulation model is normally distributed with

mean $\theta$ and a betweensuperpopulation variance $\sigma^2_B$. Inference is based on the assumption that

finite populations are independently and identically generated, and that the samples are

independently sampled from each finite population. Although such an infinite population does

not exist, it is an important assumption in the classical parametric approach. The assumption that

the effects are random implies that we must include a betweensuperpopulation as well as a

within superpopulation component of variation when estimating an effect size and a confidence

interval. Here we assume that the variability beyond the individual model-sampling error is

random, that is, that it stems from random differences among studies whose sources cannot be

identified.

Unlike the experimental case, when we are attempting to combine surveys in either this random

or the previous fixed effect case we have to take into consideration two different stochastic processes: the model (or superpopulations) which generated the finite population, and the

sampling procedure which selected the individuals in each study. And as we are using survey

data, we must also be aware of the inference and estimation framework under which our estimates

were calculated.


If we look back at Figure 3.5, we see that the estimates have both a within and a between variability. As this case is parallel to a random effects case in traditional metaanalysis, we will use a formulation similar to the DerSimonianLaird (1986) estimator under the survey case. We

do this in order to take into account the between variance estimate explicitly. Here let $g_3(\hat{\theta}_H)$ be a pooled estimate of means, where the subscript $H$ refers to the hypersuperpopulation parameter. Then, to calculate the random effects model analogous to the DerSimonianLaird estimator, we would use

$$g_3(\hat{\theta}_H) = \frac{\sum_{i=1}^{k} \hat{\theta}_i/(\hat{\tau}^2 + v_i)}{\sum_{i=1}^{k} 1/(\hat{\tau}^2 + v_i)},$$

where

$$\hat{\tau}^2 = \max\left\{0,\; \frac{\chi^2 - (n-1)}{\sum_{i=1}^{k} w_i - \dfrac{\sum_{i=1}^{k} w_i^2}{\sum_{i=1}^{k} w_i}}\right\},$$

$\chi^2$ being our proposed adjusted chi-square test statistic as described in section 3.4. $\hat{\tau}^2$ is the estimated between variance from our samples.

We can use least squares theory and construct a variance estimator that is analogous to the variance of the pooled effect for Cases 1 and 2,

$$v(g_3(\hat{\theta}_H)) = \frac{1}{\sum_{i=1}^{k} w_i},$$

where $w_i = 1/(\hat{\tau}^2 + v_i)$ is the weight for the $i$th study, $\hat{\tau}^2$ is the estimated variance between superpopulations, and $v_i$ is the variance of the finite population estimate calculated under a superpopulation model.

Let us summarize the assumptions of the third case. First, we need independence between the superpopulations. Then we assume that the finite populations generated from the superpopulations are independent and that the samples taken from each finite population are independent. We also need either independence between our weights, our estimates, and our model, or we assume that, conditioned on the weight, the estimator satisfies certain distributional assumptions: i.e., the right conditional expectation (so in particular the weight and estimate are uncorrelated), but also bounds on the conditional second and third moments. As we are using a superpopulation approach for variance estimation and imposing a hypersuperpopulation model, we could easily violate that assumption through the sampling designs or the model. Thus, it is essential to check that we do not observe a correlation between our variances and our estimates in our data. Finally, we assume that the samples cover the same target populations, that these target populations are our finite populations, and that the superpopulations are linked by a hypersuperpopulation. This last assumption means we conceptually think that it is appropriate to summarize these data even though they are not homogeneous. This is a very important point. We need to be certain that combining these data makes sense when these data themselves are not homogeneous. Making recourse to the CCHS example for the last time, if the analyst were interested in birth cohorts we would not think that

combining these data is appropriate. We will return to this question in more depth in Chapter 5

with another example.


3.2 Asymptotic Techniques and Definitions

Another important methodological distinction between sampling and experimental statistics is

found in the framing of asymptotic inference. In the pages following, we shall demonstrate that the

concept of convergence of sequences of random variables cannot be naively applied in a sampling

framework. In the survey case, we do not have a simple sequence of random variables that we

can allow to go to infinity. We have samples embedded in finite populations and the samples are

what are random, and we need to take this into account when we define convergence. If we first

naively think we let the sample go to infinity, we must also quickly realize that this means the

finite population goes to infinity. Thus, defining the concept of convergence in finite population

sampling requires not only that the finite population size be allowed to increase, that is, that N tend to infinity, but also that the sample size n tend to infinity. We will also need to have the same sampling structure as the limits increase (as we are looking at the limit in relation to the design). So, we must have a suitable sequence where not only the population size is allowed to increase, but also the sample size is allowed to increase while simultaneously preserving the sampling plan.

Fuller (2009) provides an example of this type of sequence. Consider a sequence of sets of Nj =

10j elements, where j = 1, 2,… Let iid Bernoulli random variables be associated with the indexes

1,2, ...,N such that

$$x_i = \begin{cases} 1 & \text{with probability } p \\ 0 & \text{with probability } 1-p. \end{cases}$$


We can see that if we let j = 1, 2, 3, 4, 5, …, then we have the sequence of population sizes N = 10, 20, 30, 40, 50, …; from each realized population of size 10j we select a simple random sample without replacement of size j. That is, we would have a sequence of sample sizes n = 1, 2, 3, 4, 5, etc. Then the distribution of

$$X_N = \sum_{i=1}^{N} x_i$$

is a binomial random variable with parameters (N, p) and

 N  a n−a P{}X N = a =   p 1( − p) . a 

Then we can see that any finite population $F_N$ derived from the above sequence has $X_N$ elements equal to 1. We can also see that the conditional distribution of the sample total $X_n$ given $F_N$ is a hypergeometric distribution and

$$P\{X_n = a \mid F_N\} = \binom{X_N}{a}\binom{N - X_N}{j - a}\binom{N}{j}^{-1}.$$

As we can see, in order to preserve the sampling plan in an asymptotic framework, the sequence will need to have some conditions on the relationship between n and N, and/or the sampling weights.
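A minimal sketch of this nested sequence may help make the construction concrete (the value of p and the seed are illustrative): each finite population is a fresh set of iid Bernoulli draws, and the design, simple random sampling without replacement with sampling fraction 1/10, is preserved as j grows.

    import numpy as np

    rng = np.random.default_rng(2011)
    p = 0.3  # Bernoulli probability for the superpopulation (illustrative)

    # Fuller-style sequence: the j-th finite population has N_j = 10j iid
    # Bernoulli(p) elements; from it we draw an SRSWOR of size j, so the
    # sampling fraction stays fixed at 1/10 as both n and N grow.
    for j in [10, 100, 1000, 10000]:
        population = rng.binomial(1, p, size=10 * j)            # realized F_N
        sample = rng.choice(population, size=j, replace=False)  # SRSWOR of size j
        # X_N ~ Binomial(N, p); given F_N, the sample total is hypergeometric.
        print(10 * j, population.mean(), sample.mean())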

It is clear from the different figures above that we are (depending on the model) attempting to estimate either in Case 1a and 1b the finite population parameters, Case 2 the superpopulation parameters, or in Case 3 the hyperparameters from the hypersuperpopulation. We would like in all cases to be able to show that we are converging to our expected results. In order to do so we

will first need to introduce some important definitions from sampling that we will use in the next sections.

Definition 3.1 Consider a finite population consisting of N units, where N is known, denoted by the index set $U = \{1, 2, \dots, N\}$. An estimator $\hat{\theta}$ is design unbiased for the finite population parameter $\theta_N = \theta(y_1, y_2, \dots, y_N)$ if $E[\hat{\theta}] = \theta_N$, where the expectation is taken over all samples possible under the design for the finite population.

Definition 3.2 (Godambe and Thompson 1986) Consider a finite population consisting of N units, where N is known, denoted by the index set $U = \{1, 2, \dots, N\}$, and the population vector $y = (y_1, y_2, \dots, y_N)$ where $y_i$ is the variate associated with unit i. We assume that y is generated from a distribution $\xi$, where $\xi$ is known to be a member of a class $C = \{\xi\}$. The class C is then called the superpopulation model.

Definition 3.3 (Godambe and Thompson 1986) Consider a finite population consisting of N units, where N is known, denoted by the index set $U = \{1, 2, \dots, N\}$, and the population vector $y = (y_1, y_2, \dots, y_N)$ where $y_i$ is the variate associated with unit i. Let $\theta$ be a real superpopulation parameter; that is, a real valued function defined on C. Then an unbiased estimating function for $\theta$ based on the finite population vector y would be a function $g(y, \theta(\xi))$ such that for all $\xi$ in C, $E_{\xi}[g(y, \theta(\xi))] = 0$, where $E_{\xi}$ denotes expectation under the model.


Definition 3.4 (Godambe 1960) An unbiased estimating function is termed optimal if the quantity $E_{\xi}(g^2)/\{E_{\xi}(\partial g/\partial\theta)_{\theta=\theta(\xi)}\}^2$ is minimized among unbiased g, and we denote this optimal estimating function by $g^*$.

Definition 3.5 (Godambe and Thompson 1986) Consider a finite population consisting of N units, where N is known, denoted by the index set $U = \{1, 2, \dots, N\}$, and the population vector $y = (y_1, y_2, \dots, y_N)$ where $y_i$ is the variable associated with unit i. The solution $\theta_N$ of

$$g^*(y, \theta_N) = 0$$

possesses two properties simultaneously.

I) When all the components of the vector y are known (i.e., a census), $\theta_N$ is an estimate of the superpopulation parameter $\theta$.

II) When the survey population vector y is not known (i.e., a sample), $\theta_N$ defines a parameter for the survey population.

Definition 3.6 Given a sequence of finite populations and an associated sequence of sample designs, the estimator $\hat{\theta}$ is design consistent for the finite population parameter $\theta_N = \theta(y_1, y_2, \dots, y_N)$ if

$$\operatorname{plim}_{n\to\infty,\,N\to\infty}\left[\hat{\theta}_n - \theta_N\right] = 0,$$

where the probability is determined by the sampling design.


3.3 Convergence of Combined Estimates under the Sampling Framework

Although it may seem clear from each of the figures how these data must converge if we read the

diagrams left to right, in practice we start on the right hand side with samples and try and see

which model we are under and to what we are converging. Therefore, in order to show how

strong a postulate we are making when we decide that we are in one case or another, we will look

at what assumptions we need for convergence in each case. For the first case, when we are

estimating the finite population parameter using a series of samples from the same finite population as the number of replicates increases, we have that the estimator converges in probability to the finite population mean.

Convergence under a sampling framework requires constraints to ensure that the pooled estimates will converge and be unbiased. For our first case (Cases 1a and 1b), we can see that we have a single finite population and that we are selecting several samples from this population. For Case 1a, all we need to ensure convergence to the finite population parameter is that the samples are independent, the weights and the estimates are independent, and the individual estimates are design unbiased (we do not need to place any restrictions on the variance of the finite population estimate as it will be finite).

Theorem 3.1a Let $S_1, S_2, S_3, \dots, S_r$ be a series of independent probability samples taken from a single finite population with mean $\theta_N$ and finite variance. Let $\hat{\theta}_1, \hat{\theta}_2, \dots, \hat{\theta}_r$ be a series of design unbiased estimates, and let

$$\hat{\theta} = \frac{\sum_{i=1}^{r} w_i \hat{\theta}_i}{\sum_{i=1}^{r} w_i}$$

be an estimate of the finite population mean $\theta_N$, where $v_i = 1/w_i$. Consider first the case when the weights are fixed and known. Then for all $\varepsilon > 0$,

$$\lim_{r\to\infty} P\left(\left|\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i} - \theta_N\right| > \varepsilon\right) = 0.$$

Proof: It is clear that the weak law of large numbers holds in this case. We see that

$$E_p\left[\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i}\right] = \theta_N \quad\text{and, by least squares estimation,}\quad \operatorname{Var}\left[\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i}\right] = \frac{1}{\sum_{i=1}^{r} w_i};$$

then, as the estimates are design unbiased, we know $E_p(\hat{\theta}_i) = \theta_N$ and $v_i = 1/w_i < \infty$, and using Chebychev's inequality we have

$$P\left(\left|\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i} - \theta_N\right| > \varepsilon\right) \le \frac{1}{\varepsilon^2 \sum_{i=1}^{r} w_i} \to 0 \text{ as } r \to \infty.$$

We could also note that $E(\bar{\theta}_r) \to \theta_N$ and $\operatorname{Var}(\bar{\theta}_r) = 1/\sum_{i=1}^{r} w_i \to 0$ to show that $\bar{\theta}_r$ tends in probability to $\theta_N$.


If instead we feel that the weights are in fact random, though again independent of the estimator, we could condition on the weights and then have the following result.

Theorem 3.1b Let $S_1, S_2, S_3, \dots, S_r$ be a series of independent probability samples taken from a single finite population with mean $\theta_N$ and finite variance. Let $\hat{\theta}_1, \hat{\theta}_2, \dots, \hat{\theta}_r$ be a series of design consistent estimates, and let

$$\hat{\theta} = \frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i}$$

be an estimate of the finite population mean $\theta_N$, where $v_i = 1/w_i$ and the $w_i$ are bounded. Then for all $\varepsilon > 0$,

$$\lim_{r\to\infty} P\left(\left|\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i} - \theta_N\right| > \varepsilon\right) = 0.$$

Proof: Let $\mathcal{W}$ be the $\sigma$-algebra generated by the weights $w_i$. By independence and design consistency, $E[\hat{\theta}_i \mid \mathcal{W}] = \theta_N$. Let $Y_i = w_i\hat{\theta}_i/W$, where $W = \sum w_i$. Then we have

$$E\left[\sum_{i=1}^{r} Y_i\right] = \sum_{i=1}^{r} E\left[\frac{w_i}{W}\hat{\theta}_i\right] = \sum_{i=1}^{r} E\left[\frac{w_i}{W} E(\hat{\theta}_i \mid \mathcal{W})\right] = \sum_{i=1}^{r} E\left[\frac{w_i}{W}\theta_N\right] = E\left[\sum_{i=1}^{r}\frac{w_i}{W}\theta_N\right] = \theta_N.$$

Next we see that if the weights are bounded we can again use Chebyshev to show

$$P\left(\left|\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i} - \theta_N\right| > \varepsilon\right) \le \frac{1}{\varepsilon^2 \sum_{i=1}^{r} w_i} \to 0 \text{ as } r \to \infty.$$

We also note to the reader that if the variances and the third moments of the estimates are bounded, then the CLT also holds.

Case 2:

For our second case, convergence under a sampling framework requires slightly different constraints to ensure that the pooled estimates will converge and be unbiased. We can see that we have many finite populations and that we are selecting several samples from these populations. For Case 2, what is needed to ensure convergence to the superpopulation parameter is that the finite populations are independent, the samples are independent, the weights and the estimates are independent, and the individual estimates are design consistent. If, additionally, the estimating function is optimal (Definition 3.4), the estimator will also have the smallest variance in the class of functions that are model unbiased.

Theorem 3.2a Let $S_1, S_2, S_3, \dots, S_r$ be a series of probability samples taken from different finite populations under a single superpopulation model with mean $\theta$ and finite variance. Let $\hat{\theta}_1, \hat{\theta}_2, \dots, \hat{\theta}_r$ be a series of estimates that are both optimal (Definition 3.4) and design consistent. Let

$$\hat{\theta} = \frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i}$$

be defined for all $\theta$ in the parameter space, where $v_i = 1/w_i$. Consider first the case when the weights are fixed and known. Then for all $\varepsilon > 0$,

$$\lim_{r\to\infty} P\left(\left|\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i} - \theta\right| > \varepsilon\right) = 0.$$

Proof: As with the first case, it is clear that the weak law of large numbers holds. We see that each of the estimates is optimal and, therefore, model unbiased, and design consistent and thus design unbiased. That is,

$$E_\xi E_p(\hat{\theta}) = E_\xi E_p\left[\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i}\right] = \frac{\sum_{i=1}^{r} w_i E_\xi E_p(\hat{\theta}_i)}{\sum_{i=1}^{r} w_i} = \frac{\sum_{i=1}^{r} w_i E_\xi(\theta_{N_i})}{\sum_{i=1}^{r} w_i} = \theta,$$

and by least squares estimation

$$\operatorname{Var}\left[\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i}\right] = \frac{1}{\sum_{i=1}^{r} w_i};$$

then, as the estimates are design consistent and model optimal and the superpopulation has a finite variance, we can once again use Chebychev's inequality to show

$$P\left(\left|\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i} - \theta\right| > \varepsilon\right) \le \frac{1}{\varepsilon^2 \sum_{i=1}^{r} w_i} \to 0 \text{ as } r \to \infty.$$


If instead we feel that the weights are in fact random, though again independent of the estimator

we could condition on the weights and then have the following result:

Theorem 3.2b Let $S_1, S_2, S_3, \dots, S_r$ be a series of independent probability samples taken from different finite populations, generated under a superpopulation model with mean $\theta$ and finite variance. Let $\hat{\theta}_1, \hat{\theta}_2, \dots, \hat{\theta}_r$ be a series of design consistent estimates, and let

$$\hat{\theta} = \frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i}$$

be an estimate of our superpopulation mean $\theta$, and let the $w_i$ be bounded. Then for all $\varepsilon > 0$,

$$\lim_{r\to\infty} P\left(\left|\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i} - \theta\right| > \varepsilon\right) = 0.$$

Proof: Let $\mathcal{W}$ be the $\sigma$-algebra generated by the weights $w_i$. By independence and design consistency, $E[\hat{\theta}_i \mid \mathcal{W}] = \theta_{N_i}$. Let $Y_i = w_i\hat{\theta}_i/W$, where $W = \sum w_i$. Then we have

$$E\left[\sum_{i=1}^{r} Y_i\right] = \sum_{i=1}^{r} E\left[\frac{w_i}{W}\hat{\theta}_i\right] = \sum_{i=1}^{r} E\left[\frac{w_i}{W} E(\hat{\theta}_i \mid \mathcal{W})\right] = \sum_{i=1}^{r} E\left[\frac{w_i}{W}\theta_{N_i}\right] = E\left[\sum_{i=1}^{r}\frac{w_i}{W}\theta_{N_i}\right] = E(\theta_{N_i}) = \theta.$$

Next we see that if the weights are bounded we can again use Chebyshev to show

$$P\left(\left|\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i} - \theta\right| > \varepsilon\right) \le \frac{1}{\varepsilon^2 \sum_{i=1}^{r} w_i} \to 0 \text{ as } r \to \infty.$$

We also note to the reader that if the variances and the third moments of the estimates are bounded, then the CLT also holds.

Case 3:

We will also introduce a superscript S to indicate that we are dealing with different superpopulations. That is, we let $S_1, S_2, S_3, \dots, S_l$ be a series of probability samples taken from the first finite population, generated under the first superpopulation model with mean $\theta_1$ and finite variance. Then we continue this sequence to have $S_{l+1}, \dots, S_k$ be a series of probability samples taken from the second finite population, generated under the second superpopulation model with mean $\theta_2$ and finite variance. So, in effect, we now have a sequence of superpopulations $\zeta_1, \dots, \zeta_p$ from which we have generated a series of finite populations 1 to q, and from each finite population we have drawn some number of probability samples. We may think we are in a case very similar to that of stratified sampling where the strata are not i.i.d.; however, because we feel everything is generated under the hypersuperpopulation, we still have the case where all the estimates have the same expectation. We then have the following results.

Theorem 3.3a Let $S_1, S_2, S_3, \dots, S_r$ be a series of independent probability samples taken from different finite populations, each generated under a different superpopulation model, each of which was generated under a single hypersuperpopulation with mean $\theta_H$ (where H denotes the parameter for the hypersuperpopulation) and finite variance. Let $\hat{\theta}_1, \hat{\theta}_2, \dots, \hat{\theta}_r$ be a series of design consistent, model optimal estimates, and let

$$\hat{\theta}_H = \frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i},$$

where $v_i = 1/w_i$. First consider the case where the variances are known and fixed. Then for all $\varepsilon > 0$,

$$\lim_{r\to\infty} P\left(\left|\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i} - \theta_H\right| > \varepsilon\right) = 0.$$

Proof: Once again it is clear that the weak law of large numbers holds. We know that the expectation for each superpopulation is $\theta_H$, and by independence and design consistency, $E[\hat{\theta}_i \mid \mathcal{W}] = \theta_H$. Let $Y_i = w_i\hat{\theta}_i/W$, where $W = \sum w_i$. Then we have

$$E\left[\sum_{i=1}^{r} Y_i\right] = \sum_{i=1}^{r} E\left[\frac{w_i}{W}\hat{\theta}_i\right] = \sum_{i=1}^{r} E\left[\frac{w_i}{W} E(\hat{\theta}_i \mid \mathcal{W})\right] = \sum_{i=1}^{r} E\left[\frac{w_i}{W}\theta_H\right] = E\left[\sum_{i=1}^{r}\frac{w_i}{W}\theta_H\right] = \theta_H.$$

Once again we have $v_i = 1/w_i < \infty$, and using Chebychev's inequality we show

$$P\left(\left|\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i} - \theta_H\right| > \varepsilon\right) \le \frac{1}{\varepsilon^2 \sum_{i=1}^{r} w_i} \to 0 \text{ as } r \to \infty.$$

Next we consider the case where the variances are not known and fixed.

Theorem 3.3b Let $S_1, S_2, S_3, \dots, S_r$ be a series of independent probability samples taken from different finite populations, each generated under a different superpopulation model, each of which was generated under a single hypersuperpopulation with mean $\theta_H$ (where H denotes the parameter for the hypersuperpopulation) and finite variance. Let $\hat{\theta}_1, \hat{\theta}_2, \dots, \hat{\theta}_r$ be a series of design consistent, model optimal estimates, and let

$$\hat{\theta}_H = \frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i},$$

where $v_i = 1/w_i$ and where $E[\operatorname{Var}(\hat{\theta}_i \mid F_i)]$ is bounded above. Then for all $\varepsilon > 0$,

$$\lim_{r\to\infty} P\left(\left|\frac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i} - \theta_H\right| > \varepsilon\right) = 0.$$

Proof: Let $\mathcal{F}$ be the $\sigma$-algebra generated by the finite populations $F_i$. We assume that the weights are random, depending on the finite population $F_i$, and we let $w_i = 1/\operatorname{Var}(\hat{\theta}_i \mid F_i)$. Let us again condition on the $\sigma$-algebra generated by all the finite populations. Let $Y_i \equiv E[w_i\hat{\theta}_i/W \mid F_i]$. Then we have

$$E\left[\sum_{i=1}^{r} Y_i\right] = \sum_{i=1}^{r} E\left[\frac{w_i}{W}\hat{\theta}_i\right] = \sum_{i=1}^{r} E\left[\frac{w_i}{W} E(\hat{\theta}_i \mid \mathcal{F})\right] = \sum_{i=1}^{r} E\left[\frac{w_i}{W}\theta_H\right] = E\left[\sum_{i=1}^{r}\frac{w_i}{W}\theta_H\right] = \theta_H.$$

The weak law of large numbers will hold if $E[W^{-1}] \to 0$. By Jensen we have $E[W^{-1}] \ge E[W]^{-1}$, so we need $E[W] \to \infty$. It suffices to have $E[\operatorname{Var}(\hat{\theta}_i \mid F_i)^{-1}]$ bounded below, which follows if $E[\operatorname{Var}(\hat{\theta}_i \mid F_i)]$ is bounded above.

We also note to the reader that if there are upper and lower bounds on the conditional variance $\sigma^2$ (that is, conditional on the $\sigma$-algebra generated by all the finite populations) and on the conditional third moment $\tau^3$ of the estimator, then we can bound the error term in terms of both $\sigma$ and $\tau$ and show that the CLT holds.

3.4 Testing Heterogeneity

As Cochran pointed out, it is important in metaanalysis to determine if the individual point

estimates “agree among themselves within the limits of their experimental errors”. Unlike the

experimental case, we can have unexplained heterogeneity due not only to different underlying

models but also due to sampling, particularly when there is clustering present in these data. We

will illustrate how a naive application of metaanalysis methods, that is, using s as opposed to a design based estimator of variance when there is clustering, will result in incorrect quantifications of


heterogeneity. We note that Cochran’s Q will be chisquared if we use a designbased estimator

of variance in place of s.

Kish (1994) discusses the effect of heterogeneity on combining surveys and emphasizes that it is the survey design that is at issue, not other effects. However, in the case of survey data, more than just the design is an issue. Also important is the level of inference. In this thesis

we have two cases that are fixed effect models: case one, where we consider that we are

combining estimates from the same finite population (which was generated from a single

superpopulation model) and case two, where we are combining estimates from different finite populations all thought to be generated from the same superpopulation. The third case is a random effects model.

We have already noted that different survey designs can have different estimators. If we apply the standard test for heterogeneity using a test statistic derived from the squared deviation of the difference between the study estimates and the true overall mean without noting the differences in survey data as compared to experimental data, we would reach erroneous conclusions. As noted by Scott and Rao (1999), in general, large national surveys are multistage cluster samples, and the clustering of individuals needs to be taken into account when we test for homogeneity. An adjustment using the design effect has been used in testing goodness-of-fit hypotheses for multinomial proportions, for analyzing loglinear models in contingency tables, for corrections to

χ 2 statistics for logit models by Rao and Scott (1981, 1984, 1987), and for logistic regression by

Roberts et al. (1987). Unlike these approaches, where the variance of the estimator is not a


direct input into the test statistic, we can directly use the design based variance estimator in

Cochran’s Q.

We will explore in detail the case where we assume that the weights are known and fixed and n is

large and these data come from different samples drawn at random from a finite population U

with N units. In Cases 1a, 1b, and 2, we are combining estimates from the same superpopulation

model. Let us have a single superpopulation model and assume that there is homogeneity across

all studies; that is, $\theta_1 = \theta_2 = \dots = \theta_r = \theta_N$. We will also assume, asymptotically, that the estimate of the mean $\hat{\theta}_i$ from each survey is considered a random variable and has a normal distribution with mean $\theta_i$ and variance $v_i$. Then, under the model's null hypothesis, our estimate $\hat{\theta}_i$ is a normal random variable with mean $\theta_N$ and standard error $\sqrt{v_i}$.

A formal test for heterogeneity is normally done using a test statistic derived from the squared deviation of the difference between the study estimates and the true overall mean. If we assume normality, consider the study variances fixed, and let $w_i = 1/v_i$, where $v_i$ is a design consistent variance for the $i$th survey, then $w_i(\hat{\theta}_i - \theta)^2$ has a $\chi^2$ distribution with 1 degree of freedom (Rao 2011). Summing across all r studies, we will have a statistic with a $\chi^2$ distribution with r degrees of freedom. As the true value is not known, we must estimate it with the (weighted) average of all the studies, thus losing a degree of freedom. We then have a test statistic with $r-1$ degrees of freedom (Cochran's Q).
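As a concrete illustration, the sketch below computes Cochran's Q with design-based variances plugged in directly, as proposed here; the input estimates and variances are illustrative values, not data from the thesis.

    import numpy as np
    from scipy.stats import chi2

    # Cochran's Q with design consistent variances: under homogeneity the
    # statistic is referred to a chi-square distribution with r - 1 degrees
    # of freedom (one lost for estimating the overall mean).
    theta_hat = np.array([0.51, 0.47, 0.56, 0.50])   # illustrative survey estimates
    v = np.array([0.012, 0.009, 0.015, 0.011])       # design-based variances

    w = 1.0 / v
    theta_bar = np.sum(w * theta_hat) / np.sum(w)    # pooled weighted mean
    Q = np.sum(w * (theta_hat - theta_bar) ** 2)

    df = len(theta_hat) - 1
    print(Q, df, chi2.sf(Q, df))                     # statistic, df, p-value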


3.5 Adjusted Weights

One of the assumptions in unweighted models is that each data point gives the same amount of information about the deterministic part of the total variation in the model. Unlike the weights in surveys, which represent the inverse of the probability of selection, the weights in a metaanalysis are used to give more weight to the more precise estimates when pooling. For example, if the sample sizes of the studies differ, then estimates from larger studies will be more precise than those from smaller studies and we would want to give more weight to those studies. The idea is borrowed

from weighted least squares regression. There are many other different choices of weights

discussed in the literature for the experimental case. Weighting by the sample size is one way to

add in the reliability of a study, because in general as the study size increases so does the

accuracy. However, the most common choice is to weight by the inverse of the variance. In the

experimental case Hedges (1992) has shown that weighting by the standard error is optimal for

differences. Brannick et al. (2008) have compared unit, sample size and inverse variance

weighting procedures and have found that the optimal choice of weight depended on the test

statistic.

We know that if the errors are not constant across all studies, using weighted least squares with weights that are inversely proportional to the variance at each study yields the most precise parameter estimates possible (though not always unbiased). However, in the survey case, variability arises not just from the model that generated these data, nor from the sample size, but from the sampling procedure itself. When there is clustering in these data, the variance under a design based framework will be greater than under simple random sampling. When there is efficient stratification, there will be less variance than under simple random sampling. Therefore, in the survey case we present a new weight that uses the design effect. The design effect summarizes the effect of the sampling design on the variance of the estimate. This adjusted weight is a more appropriate choice in the survey case as it removes the differences in variance that are due to the design and is still proportional to the variances.

Survey Design Adjusted Weights

By least squares theory, the variance of the pooled effect is

$$v_{\text{fixed}} = \frac{1}{\sum_{i=1}^{k} w_i}.$$

The optimal combined estimate will be found by weighting the independent estimates inversely proportional to their variance. It can easily be seen that if all the design effects are less than one, we have

$$\frac{1}{\sum_{i=1}^{k} \dfrac{\mathrm{deff}_i}{v_i}} \ge \frac{1}{\sum_{i=1}^{k} \dfrac{1}{v_i}},$$

and if the design effects are greater than one we will have

$$\frac{1}{\sum_{i=1}^{k} \dfrac{\mathrm{deff}_i}{v_i}} \le \frac{1}{\sum_{i=1}^{k} \dfrac{1}{v_i}}.$$

As clustered samples will, in general, have design effects greater than one, weighting by the variance adjusted by the design effect will give the minimum variance estimator.


We also need to adjust the weights for our third case, when we have a random effects model. In the case of random effects models, the weight for each study combines both the within and between variances. That is,

$$w_i^{\text{random}} = \frac{1}{v_i + \tau^2},$$

where

$$\tau^2 = \begin{cases}\dfrac{Q - df}{C} & \text{if } Q > df \\ 0 & \text{otherwise,}\end{cases} \qquad C = \sum \tilde{w}_i - \frac{\sum \tilde{w}_i^2}{\sum \tilde{w}_i}, \qquad \tilde{w}_i = w_i/\delta_i.$$

Now these weights for each study combine both the within and between variances, adjusted for the sampling design.
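A sketch of this design-adjusted random effects weighting is given below; reading $\delta_i$ as the design effect is an assumption of the sketch, and all numeric inputs are illustrative.

    import numpy as np

    theta_hat = np.array([0.51, 0.47, 0.56, 0.50])  # survey estimates (illustrative)
    v = np.array([0.012, 0.009, 0.015, 0.011])      # design-based variances
    delta = np.array([1.8, 1.2, 2.0, 1.5])          # design effects (assumed = delta_i)

    w = (1.0 / v) / delta                           # adjusted weights w~_i = w_i/delta_i
    theta_bar = np.sum(w * theta_hat) / np.sum(w)
    Q = np.sum(w * (theta_hat - theta_bar) ** 2)    # heterogeneity statistic

    df = len(theta_hat) - 1
    C = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = (Q - df) / C if Q > df else 0.0          # truncated at zero, as in the text

    w_random = 1.0 / (v + tau2)                     # combined within + between variance
    pooled = np.sum(w_random * theta_hat) / np.sum(w_random)
    print(tau2, pooled, 1.0 / np.sum(w_random))     # estimate and its variance 1/sum(w)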

In final summary, before a researcher decides what model she will use to combine her data, she

must first clearly define her question. Next, she will conduct a systematic review to obtain data

relating to this question, deciding upon which studies or data sets to use. She will see if the

design can be ignored in estimation from each of the surveys (in practice this is rare). She will

then begin to consider if she should combine these data. If she feels she can combine these data,

and there are clear conceptual interpretations of the pooled findings, then the researcher must

determine which model best represents her data. This decision is not simple as we do not know

with certainty which model holds. It is both a mathematical and conceptual choice. The methods

to combine the data are mathematically straightforward, but rely heavily on very strong

assumptions that most analysts are neither aware of nor examine. In all our cases we are

assuming the point estimates are design unbiased and in case 2 and case 3 model unbiased – and

the assumption of model unbiasedness is not testable. Assumptions like independence between

samples and between weights and estimates can, and should, be examined. Ultimately, the

assumption of homogeneity needs testing (with an adjusted test statistic). And as with other

metaanalysis, if there is heterogeneity, it needs to be studied and explained.


Chapter 4 Simulations

Notation

Simple Random Sampling:

Under design based sampling, we have a finite population consisting of N units from which we select a sample of size n, with

$\pi_i$ = the inclusion probability = n/N,

$$\hat{Y} = \sum_{i=1}^{n}\frac{y_i}{n\pi_i} = \text{the sample mean},$$

$$\widehat{\operatorname{Var}}\left[\hat{Y}\right] = \left(1 - \frac{n}{N}\right)\frac{s^2}{n}.$$

Stratified Sampling: Under design based sampling, we have partitioned the population of N sampling units into H strata, where we have the following:

$y_{hi}$ = the value of the ith unit in stratum h, h = 1, …, H,

$N_h$ = the number of sampling units in stratum h,

$n_h$ = the number of units selected in stratum h,

$$\hat{y}_h = \frac{\sum_{i\in S} y_{hi}}{n_h} = \text{the sample mean in stratum } h,$$

$$\hat{y} = \sum_{h=1}^{H}\frac{N_h \hat{y}_h}{N} = \text{the overall sample mean},$$

$$\widehat{\operatorname{Var}}[\hat{y}] = \sum_{h=1}^{H}\left(\frac{N_h}{N}\right)^2\left(1 - \frac{n_h}{N_h}\right)\frac{s_h^2}{n_h} = \text{the variance of the mean}.$$

With stratification under a superpopulation, we follow the notation of Graubard and Korn; this notation uses K instead of the traditional N to ensure that the reader recognizes that we are estimating for stratified samples under a superpopulation:

$K_h$ = the number of observations in the hth stratum, h = 1, …, L,

$k_h$ = the number of sampled units in the hth stratum,

$$\operatorname{Var}_{SP}[\bar{y}] = E_\xi(\operatorname{Var}_d(\hat{Y})) + \operatorname{Var}_\xi(E_d(\hat{Y})) = \operatorname{var}_{wo}(\bar{y}) + \operatorname{var}[\bar{Y}] = \text{the variance of the mean under a superpopulation model},$$

$$\operatorname{var}[\bar{Y}] = \frac{1}{K-1}\sum_{h=1}^{L}\frac{K_h}{K}(\bar{y}_h - \bar{y})^2 + \frac{1}{K}\sum_{h=1}^{L}\frac{K_h}{K}\left(1 - \frac{K - K_h}{(K-1)k_h}\right)s_h^2 = \operatorname{var}_\xi\left(E_d(\hat{Y})\right),$$

$\operatorname{var}_{wo}(\bar{y})$ = the standard without replacement estimator for stratified sampling from the L strata.

Cluster Sampling: Under design based sampling, we have N primary sampling units (clusters) and the following notation:

$y_{ij}$ = the value of the jth unit in the ith psu,

N = the number of psu's in the population,

$M_i$ = the number of sampling units in the ith psu,

$$K = \sum_{i=1}^{N} M_i = \text{the total number of ssu's in the population},$$

n = the number of psu's in the sample,

$m_i$ = the number of elements in the sample from the ith psu,

$\pi_{j(i)}$ = the inclusion probability of the jth unit in the ith psu,

$$\hat{t}_i = \sum_{j\in S_i}\frac{M_i}{m_i} y_{ij} = \text{estimated total for the ith psu},$$

$$\hat{t}_{unb} = \sum_{i\in S}\frac{N}{n}\hat{t}_i = \text{unbiased estimator of the population total},$$

$$s_t^2 = \frac{1}{n-1}\sum_{i\in S}\left(\hat{t}_i - \frac{\hat{t}_{unb}}{N}\right)^2 = \text{estimated variance of the psu totals},$$

$$s_i^2 = \sum_{j\in S_i}\frac{(y_{ij} - \bar{y}_i)^2}{m_i - 1} = \text{the sample variance within the ith psu},$$

$$\hat{Y} = \sum_{i\in S}\sum_{j\in S_i}\frac{y_{ij}}{K\pi_{j(i)}} = \text{the estimated mean for the sample},$$

$$\hat{V}(\hat{Y}) = \frac{N^2}{K^2}\left(1 - \frac{n}{N}\right)\frac{s_t^2}{n} + \frac{N}{K^2 n}\sum_{i=1}^{n}\left(1 - \frac{m_i}{M_i}\right)M_i^2\frac{s_i^2}{m_i} = \text{the estimated variance for the sample mean}.$$
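A small sketch implementing these two-stage cluster estimators is given below; the made-up psu data mirror the cluster simulation settings used later in this chapter (8 psu's of 12500 units, 2 selected).

    import numpy as np

    def cluster_mean_and_var(samples, M_i, N, K):
        """samples: one array per sampled psu; M_i: psu sizes;
        N: psu's in the population; K: total units in the population."""
        n = len(samples)
        M_i = np.array(M_i, dtype=float)
        m_i = np.array([len(s) for s in samples])
        t_i = np.array([M * s.mean() for M, s in zip(M_i, samples)])  # psu totals
        t_unb = (N / n) * np.sum(t_i)              # unbiased total estimator
        s_t2 = np.var(t_i, ddof=1)                 # variance of the psu totals
        s_i2 = np.array([s.var(ddof=1) for s in samples])
        var_total = N**2 * (1 - n / N) * s_t2 / n \
            + (N / n) * np.sum((1 - m_i / M_i) * M_i**2 * s_i2 / m_i)
        return t_unb / K, var_total / K**2         # mean and its variance

    rng = np.random.default_rng(1)
    samples = [rng.normal(1.0, 5.0, 50), rng.normal(1.2, 5.0, 50)]
    print(cluster_mean_and_var(samples, M_i=[12500, 12500], N=8, K=100000))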

Under a superpopulation viewpoint, we use the following notation from Graubard and Korn:

K = the number of psu's,

$K_h$ = the number of psu's in stratum h,

$N_i$ = the number of secondary clusters in psu i,

$k_h$ = the number of primary clusters selected from the $K_h$ primary clusters in stratum h as a pps sample without replacement,

$$\hat{Y} = \sum_{i\in S}\sum_{j\in S_i}\frac{y_{ij}}{K\pi_{j(i)}} = \text{the estimated mean for the sample},$$

$$\operatorname{Var}_{SP}[\bar{y}] = \operatorname{var}_{wr}(\bar{y}) + \operatorname{var}_b - \operatorname{var}_w = \text{the variance under the superpopulation model},$$

where

$$\operatorname{var}_b = \frac{1}{d^2}\sum_{h=1}^{L}\frac{1}{K_h}\left[\sum_{i=1}^{k_h}(t_{hi} - d\bar{y}_{hi})\right]^2$$

and

$$\operatorname{var}_w = \frac{1}{d^2}\sum_{h=1}^{L}\frac{1}{K_h(k_h-1)}\sum_{i=1}^{k_h}\left[(t_{hi} - d\bar{y}_{hi}) - \frac{1}{k_h}\sum_{j=1}^{k_h}(t_{hj} - d\bar{y}_{hj})\right]^2.$$

Case 1a:

$$g_{1a}(\hat{\theta}_N) = \frac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(F)}}{\sum_{i=1}^{r} w_i}$$

Case 1b:

$$g_{1b}(\hat{\theta}_N) = \frac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(S)}}{\sum_{i=1}^{r} w_i}$$

Case 2:

$$g_2(\hat{\theta}) = \frac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(S)}}{\sum_{i=1}^{r} w_i}$$

Case 3:

$$g_3(\hat{\theta}_H) = \frac{\sum_{i=1}^{k}\hat{\theta}_i^{F(S)}/(\hat{\tau}^2 + v_i)}{\sum_{i=1}^{k} 1/(\hat{\tau}^2 + v_i)}, \qquad \hat{\tau}^2 = \max\left\{0,\; \frac{\chi^2 - (n-1)}{\sum_{i=1}^{k} w_i - \dfrac{\sum_{i=1}^{k} w_i^2}{\sum_{i=1}^{k} w_i}}\right\},$$

where $\chi^2$ is our proposed adjusted chi-square test statistic as described in section 3.4.

Simulations

In this chapter, we provide some simulations to illustrate the issues that arise when combining data from surveys. The simulations were done using SAS and the programs are in Appendix 1.

The chapter is divided into several sections. For each case in our framework we simulate the situation of first mixing identical studies then mixing studies with different designs. We also discuss the importance of using design based estimates (preferably unbiased) and what can

happen when the design is ignored. Midway in the chapter we also present how individual pooled survey data relates to the models and issues presented in Chapter 3.

We first make note of several concepts that are important in the context of this problem. Metaanalysis in the experimental case deals with the theoretical superpopulation in either the fixed or random effect cases. However, this is not the case with the metaanalysis of survey data: we cannot directly leap to the idea that our combined estimate converges to the superpopulation parameter. We know that under some very broad conditions the limiting distribution of estimates based on samples from finite populations is normal (Binder 1983). The first of these conditions is that, as the finite population increases in size, the higher moments do not increase too rapidly compared to the variance, for sufficiently large samples and large finite populations. The second condition is that the ratio of the size of the sample to the size of the finite population is bounded away from 1.

These two conditions can be simply illustrated by generating a series of finite populations of increasing size and comparing the finite population mean to the superpopulation mean as in Table

4.1. Here the superpopulation is normal with mean 1 and variance 25.

Table 4.1 Finite and Superpopulation Means with Different Sample Sizes

Sample Size   Finite Population Mean   Superpopulation Mean   Difference
100           1.977690928              1                      0.97769093
1000          1.150726103              1                      0.1507261
10000         0.981675699              1                      0.0183243
100000        0.992229047              1                      0.00777095
1000000       0.998663829              1                      0.00133617
10000000      0.999287491              1                      0.00071251

We also note that the rate of convergence of the finite population mean is dependent on the variability of the data and hence the need for the second condition. We can illustrate this by keeping the sample size fixed and changing the variability of the data as in Table 4.2.

Table 4.2 Finite and Superpopulation Means with Fixed Sample Sizes

Variance   Sample Size   Finite Population Mean   Superpopulation Mean   Difference
1          100000        0.998445809              1                      0.00155419
5          100000        0.992229047              1                      0.00777095
10         100000        0.984458095              1                      0.01554191
15         100000        0.976687142              1                      0.02331286
20         100000        0.96891619               1                      0.03108381
25         100000        0.961145237              1                      0.03885476
30         100000        0.953374285              1                      0.04662572

We also need to ensure that the sampling fraction, that is the ratio of the sample size to the finite population, is bounded away from 1 because design based finite population variances are zero

when we have a census. Having asymptotic bounds for each finite population statistic using a

superpopulation should not be confused with the idea that a series of metaanalysis estimates will

converge to some superpopulation parameter. In the case of metaanalysis, we have a series of

estimates that we are pooling and it is that sequence that we want to make inference about.
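A short sketch in the spirit of Table 4.1 is given below (in Python rather than the SAS programs of Appendix 1, so the realized values differ from those tabulated):

    import numpy as np

    # Finite populations of increasing size generated from a normal
    # superpopulation with mean 1 and variance 25 (sigma = 5); the finite
    # population mean approaches the superpopulation mean as N grows.
    rng = np.random.default_rng(123)
    for N in [100, 1_000, 10_000, 100_000, 1_000_000]:
        finite_pop = rng.normal(1.0, 5.0, N)
        print(N, finite_pop.mean(), finite_pop.mean() - 1.0)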

4.1 Cases 1a and 1b: Combining several estimates from different surveys of the same finite population.

In the first case, we have a single fixed finite population that was generated from a single superpopulation and we are letting neither the sample size nor the finite population size increase; that is, we assume that each size is fixed. We have two subcases: the first where we do not invoke a superpopulation and the second where we do. In practice, we will have a number of samples and from each of these we will produce an estimate; we will then take an average of these estimates. Therefore, for our simulations we have repeated samples taken from the same finite population and the parameter of interest is the finite population parameter (the inference can be either under finite sampling or a superpopulation model). This is a likely case in practice as, in general, most national surveys are design based samples from a single finite population.

4.1.1 Case 1a: Mixing Identical Studies

Case 1a is when we have a series of samples from the same finite population that we are combining in order to estimate the finite population parameter. We can see intuitively that we will ultimately converge to the finite population parameter if we average enough different samples. And we note this is where traditional design based theory starts, by looking at the expectation across all designs of a single kind. To illustrate convergence of our estimator $g(\hat{\theta}_N)$ as the number of replicates increases (that is, as r, the number of replicates or different study estimates, increases for fixed N and n), we will simulate data from different sampling plans, calculate an estimate for each, calculate an average across the samples, and show how the estimator performs as the number of samples increases. As a researcher could be combining data under any sampling plan, we will first illustrate convergence across four different sampling methods: simple random, stratified, cluster, and, finally, a combination of all three types. We will also simulate how the estimator works with different weights in each case.


Simple Random Sampling

Wanting to combine estimates from multiple samples from the same finite population can happen in practice. To illustrate, in Canada there are only 110,000 North American Indian Children

under the age of 14. A large national survey would thus be unlikely to get reliable estimates for

this particular subgroup without actively oversampling. However, we may find many different surveys that have small samples of these children and by combining these different estimates we may achieve greater reliability.

To replicate something similar to this case, we simulated a small finite population of 100,000 units from a normal superpopulation, with parameters mean 1 and sigma 5. We then generated

500 different simple random samples of size 500 from this finite population and we looked at the convergence (as $r \to \infty$) numerically using a fixed effect model with our different weights. For each random sample, we calculate the weighted mean $\hat{\theta}^{F(F)} = \sum_{i=1}^{n} y_i/(n\pi_i)$ (a design consistent estimator of the finite population or census mean) and the design consistent variance, given by $\widehat{\operatorname{Var}}[\bar{y}] = (1 - n/N)s^2/n$. Again, as noted in Chapter 3, the appropriate estimation model will be a fixed effect model, as all the samples are taken from the same finite population. This fixed effect model is a weighted average and takes the general form:

$$g_{1a}(\hat{\theta}_N) = \frac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(F)}}{\sum_{i=1}^{r} w_i},$$

where $w_i$ is the weight given to each study. We can also easily illustrate how the estimate would vary if we chose different weighting options. To illustrate the impact of different weights on the point estimate we will run the simulations with three weights. The first weight is the inverse of the sample size, the second is the inverse of the variance (the optimal weight in the case of simple random sampling) and the third is a weight of one, which is equivalent to a simple average of the parameters. We can see how this estimator performs with different numbers of samples in Figure 4.1.

Figure 4.1 Convergence of a Sequence of Simple Random Samples from a Single Finite Population using Fixed Effect Models with Three Different Weights, $\hat{\theta}^{F(F)}$

[Figure: pooled mean (y axis, roughly 0.92 to 1.06) against the number of replicates combined (x axis, 16 to 496), with series for the finite population mean, Weights 1 to 3, and the superpopulation mean.]

Here the y axis represents the mean value for our variable and the x axis represents the number of surveys we are combining. We expect that as we combine more surveys we should converge to the census mean, not the superpopulation mean, and this is the case.


To allow the reader to see more easily the convergence of this type of model to the census parameter, we will simulate another simple random sample case. In this case, the finite population is small, of size N = 10,000, so that we will have a larger difference between the

superpopulation mean and the finite population mean. The superpopulation is normal, mean 5

and sigma 2.5 (CV = .5). Here, the finite population mean is 4.995. Next, we generate 1000

independent samples of size n = 1,000 from this finite population. For the ith sample, we calculate the weighted mean $\hat{\theta}^{F(F)} = \sum_{j=1}^{n} y_j/(n\pi_j)$ (a design consistent estimator of the finite population or census mean) and the design consistent variance, given by $\widehat{\operatorname{Var}}[\bar{y}] = (1 - n/N)s^2/n$. As noted in Chapter 3, when we have only one superpopulation it is appropriate to use a fixed effect model. This fixed effect model is a weighted average and takes the general form:

$$g_{1a}(\hat{\theta}_N) = \frac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(F)}}{\sum_{i=1}^{r} w_i},$$

where $w_i$ is the weight given to each study and i = 1, 2, …, r indexes the studies or replicates. In the case of simple random sampling, we have the possibility of three weights. As with the previous example, the first weight is the inverse of the sample size, the second, the optimal weight in the case of simple random sampling, is the inverse of the variance, and the third is a weight of one, which is equivalent to a simple average of the parameters.

This example illustrates one of the more important conceptual points in this thesis. That is, no matter how many more samples we add to the average (whether we invoke the superpopulation or not) we will never converge to the superpopulation if we are combining samples from a single finite population. We will only get a more accurate estimate of the finite population parameter.


This fact confirms that an analyst who combines finite population estimates from the same finite population should not expect to get at the 'underlying model' using metaanalysis.

Figure 4.2 Convergence to the Finite Population Mean using Simple Random Samples of Size 1000 from a Finite Population of Size 10,000, $\hat{\theta}^{F(F)}$

[Figure: pooled mean (y axis, roughly 4.90 to 5.02) against the number of replicates combined (x axis, 32 to 992), with series for the finite population mean, Weights 1 to 3, and the superpopulation mean.]

Here again, the y axis represents the mean value for our variable and the x axis represents the number of surveys we are combining. We can see easily in Figure 4.2 that, regardless of the weight (as the weights are all uncorrelated with the mean), $g_{1a}(\hat{\theta}_N)$ converges to the finite population mean, not the superpopulation mean.


Stratified Sampling

Instead of combining several simple random samples (which is unlikely in practice), we want to illustrate our estimator for stratified samples. Using the framework in this thesis, if the point estimates were not already available, we would first calculate for each stratified sample the

H ˆ F (F ) N h yh weighted mean θθθ = ∑ (or another design consistent estimator of the finite population h=1 N

H 2 ˆ  nh  nh  sh or census mean) and design consistent variance, now given by Varˆθθθ  = 1 −   .   ∑     h=1  N h  N h  nh

Again, the appropriate estimation model will be a fixed effect model. This fixed effect model is a weighted average and takes the general form:

$g_{1a}(\hat{\theta}_N) = \dfrac{\sum_{i=1}^{r} w_i \hat{\theta}_i^{F(F)}}{\sum_{i=1}^{r} w_i}$, where $w_i$ is the weight given to each study.

In this simulation, the finite population has N = 100,000 and there are eight strata, each of equal size; within each stratum, simple random samples were selected. In our simulation, the stratum variances were largest for strata 1 and 8 and smallest for strata 4 and 5, and we used the following unequal sampling ratios (0.009, 0.005, 0.005, 0.001, 0.001, 0.005, 0.005, and 0.009) to generate a sample of size 500 for each of the 500 replicates. Unlike the simple random sample case, we have the possibility of four weights: weight 1 is the inverse of the sample size, weight 2 is the inverse of the variance, weight 3 is the design effect divided by the variance, and weight 4 is one.
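A hedged sketch of the per-sample stratified estimate used here, with equal-size strata formed by ordering one simulated population on y (the stratum means and variances are whatever that ordering produces; the sampling ratios are those listed above):

import numpy as np

rng = np.random.default_rng(2)

pop = np.sort(rng.normal(1.0, 5.0, 100_000))
strata = np.split(pop, 8)                       # eight equal-size strata ordered on y
fractions = [0.009, 0.005, 0.005, 0.001, 0.001, 0.005, 0.005, 0.009]

def stratified_estimate(strata, fractions, rng):
    """Design consistent mean and variance for one stratified sample."""
    N = sum(len(s) for s in strata)
    theta, var = 0.0, 0.0
    for s_pop, f in zip(strata, fractions):
        N_h = len(s_pop)
        n_h = max(2, round(f * N_h))            # at least 2 units for a variance
        s = rng.choice(s_pop, size=n_h, replace=False)
        theta += (N_h / N) * s.mean()
        var += (N_h / N) ** 2 * (1 - n_h / N_h) * s.var(ddof=1) / n_h
    return theta, var

print(stratified_estimate(strata, fractions, rng))   # near the census mean (~1)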


Figure 4.3 Convergence of a Sequence of Stratified Samples from a Single Finite Population using Fixed Effect Models with Four Different Weights, $\hat{\theta}^{F(F)}$

[Figure: pooled mean (y-axis, 0.90 to 1.08) against the number of replicates combined (x-axis, 16 to 496); series: Weight 1, Weight 2, Weight 3, Weight 4, Superpopulation Mean, Finite Population Mean.]

Here again, the y-axis represents the mean value for our variable and the x-axis represents the number of surveys we are combining. We can see in Figure 4.3 that, regardless of the weight (as none are correlated with the mean), $g_{1a}(\hat{\theta}_N)$ converges when we are combining stratified samples. As with the simple random sample case, we converge to the finite population parameter.


Cluster Sampling

If instead of having stratified samples we had cluster samples, we would again first calculate for each cluster sample the weighted mean $\hat{\theta}^{F(F)} = \sum_{i\in S}\sum_{j\in S_i}\frac{y_{ij}}{K\pi_{j(i)}}$ (or another design consistent estimator of the finite population or census mean) and the design consistent variance, in this case given by $\hat{V}(\hat{\theta}) = \frac{1}{K^2}\left[N^2\left(1-\frac{n}{N}\right)\frac{S_t^2}{n} + \frac{N}{n}\sum_{i=1}^{n}\left(1-\frac{m_i}{M_i}\right)M_i^2\frac{s_i^2}{m_i}\right]$. Again, the appropriate estimation model will be a fixed effect model. This fixed effect model is a weighted average and takes the general form:

$g_{1a}(\hat{\theta}_N) = \dfrac{\sum_{i=1}^{r} w_i \hat{\theta}_i^{F(F)}}{\sum_{i=1}^{r} w_i}$, where $w_i$ is the weight given to each study.

For simplicity, in the simulation of cluster samples we define 8 primary sampling units (PSUs) or clusters, each containing 12,500 units. From these 8 primary sampling units we select 2 using simple random sampling. Here we again have the possibility of four weights: weight 1 is the inverse of the sample size, weight 2 is the inverse of the variance, weight 3 is the design effect divided by the variance, which is now optimal, and weight 4 is one.

It is interesting to note the large variability of the fixed effect estimate under cluster sampling, such variability being due to the fact that we are sampling a small number of primary sampling units. Variance in cluster sampling is driven not by the total sample size, which in this case is 25,000, but rather by the number of primary sampling units, which is two. This is analogous to meta-analysis, where it is not the sample size of each study that is important but rather the number of studies that we are combining.
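The following is a minimal sketch (assuming, as above, 8 equal-size PSUs with all units in a selected PSU observed, so that only the between-PSU term contributes) of a single cluster-sample estimate:

import numpy as np

rng = np.random.default_rng(3)

K, M = 8, 12_500
# PSU means drawn apart so that natural clustering is present
psus = [rng.normal(mu, 1.0, M) for mu in rng.normal(1.0, 5.0, K)]

def cluster_estimate(psus, k=2):
    """Mean and design variance from an SRS of k of the K equal-size PSUs."""
    chosen = rng.choice(len(psus), size=k, replace=False)
    psu_means = np.array([psus[i].mean() for i in chosen])
    theta = psu_means.mean()                  # self-weighting: PSUs of equal size
    # with whole PSUs enumerated, only the first-stage term remains
    var = (1 - k / len(psus)) * psu_means.var(ddof=1) / k
    return theta, var

print(cluster_estimate(psus))   # highly variable: only two PSUs are sampled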

Figure 4.4 Convergence of a Sequence of Cluster Samples from a Single Finite Population using Fixed Effect Models with Four Different Weights, $\hat{\theta}^{F(F)}$

[Figure: pooled mean (y-axis, 0 to 8) against the number of replicates combined (x-axis, 15 to 495); series: Finite Population Mean, Weight 1, Weight 2, Weight 3, Weight 4, Superpopulation Mean.]

Here again, the y-axis represents the mean value for our variable and the x-axis represents the number of surveys we are combining. We can see again that, regardless of the weight (as none are correlated with the mean), when we combine cluster samples using $g_{1a}(\hat{\theta}_N)$, the estimate converges to the finite population mean, not the superpopulation mean.


4.1.2 Case 1a Mixing Surveys of Different Types

Of course, it is unlikely that in practice we would be combining samples from a single survey design. When we combine surveys of different designs, we have a different category of problem than in traditional design based sampling. In design based sampling, our framework is based on a single type of design (i.e., expectation and variance are taken across all possible samples under a single design), whereas in meta-analysis we can have different designs. Generalizing the previous examples, we can look at the case where we have surveys of different designs, all generated from the same superpopulation and sampled from a single finite population. We simulate the situation of combining simple random samples, stratified samples, and cluster samples all together. We note that this is the case where using the weight adjusted by the design effect is most useful.

As noted in Chapter 3, we have optimal variance properties when the weight is inversely proportional to the variance. However, as noted in Chapter 2, sampling designs also affect the variance of an estimate. How then would we compare data where the variances are different not because the underlying model is different, but because the design is different? Take for instance data in a finite population that was generated using a Poisson model. We expect under this model that the mean-variance relationship will hold for any units in the population. But when there is natural clustering in the data and we sample using clusters, this mean-variance relation will often not hold: we will instead have overdispersion, which is said to occur when the data display more variability than is predicted by the variance-mean relationship for the assumed superpopulation model. This issue has long been discussed in relation to overdispersion in regression problems with Poisson, multinomial and other response model data when using survey data (Fitzmaurice et al. 1997, Molina 2001). In most large surveys, where clustering in the population occurs, overdispersion is likely to be important. We can see that if we have the correct exponential model and we adjust the individual variances by the design effect, we reduce the overdispersion caused by the sampling design and thus put the variances on the same footing (that is, differences in the variance will not be due to the sampling plan).

In order to use this method, we first calculate for each sample the appropriate weighted mean (or another design consistent estimator of the finite population or census mean) and the appropriate design consistent variance. Then the appropriate estimation model will be a fixed effect model. This fixed effect model is a weighted average and takes the general form:

$g_{1a}(\hat{\theta}_N) = \dfrac{\sum_{i=1}^{r} w_i \hat{\theta}_i^{F(F)}}{\sum_{i=1}^{r} w_i}$, where $w_i$ is the weight given to each study.

In this simulation, the finite population has N = 100,000 and there are eight strata, each of equal size; for the stratified sample, simple random samples were selected within each stratum (in practice we could have alternative sampling designs within strata). In our simulation, the stratum variances were largest for strata 1 and 8 and smallest for strata 4 and 5, and the following unequal sampling ratios (0.009, 0.005, 0.005, 0.001, 0.001, 0.005, 0.005, and 0.009) were used to generate a sample of size 500 for each of the 1,000 replicates. For the cluster sample, we again have 8 primary sampling units as in the previous example. For the simple random sample, 1,000 samples of size 500 were selected, and for the cluster sample, 1,000 samples of two of the eight clusters were selected. This illustrates convergence when we combine different designs.
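A small sketch of the pooling step when designs are mixed; the design effect entries below are illustrative numbers, not output from the thesis simulations:

import numpy as np

def pool_fixed_effect(thetas, variances, deffs):
    """g_1a with weight 3: the design effect divided by the design variance,
    which puts estimates from different designs on an SRS-equivalent footing."""
    w = np.asarray(deffs) / np.asarray(variances)
    return np.sum(w * np.asarray(thetas)) / np.sum(w)

# e.g. one SRS, one stratified and one cluster estimate (illustrative values):
print(pool_fixed_effect([1.02, 0.99, 1.30], [0.050, 0.030, 0.400], [1.0, 0.8, 6.0]))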

Figure 4.5 Convergence of a Sequence of Samples from Different Designs from a Single Finite Population using Fixed Effect Models with Four Different Weights, $\hat{\theta}^{F(F)}$

[Figure: pooled mean (y-axis, 0 to 8) against the number of replicates combined (x-axis, 80 to 2,960); series: Weight 1, Weight 2, Weight 3, Weight 4, Superpopulation Mean, Finite Population Mean.]

Here again, the y-axis represents the mean value for our variable and the x-axis represents the number of surveys we are combining. As we showed in Chapter 3, $g_{1a}(\hat{\theta}_N)$ converges to the finite population mean, not the superpopulation mean, as long as we are combining design consistent estimates. We can see that, regardless of the weight (as none are correlated with the point estimate), combining design consistent estimates of the mean from different surveys using a fixed effect model will converge to the census mean.


Ignoring the Design Weights

Throughout this framework we propose to use estimators that are design based. It is important to illustrate to the reader that ignoring the design when attempting to pool data from several finite population samples is the easiest way to get erroneous results. As noted in Chapter 2, expectation in design based sampling is taken across all samples, and from this notion we construct design based estimators, usually using the weights. There is, however, much debate about the need to weight data, especially when the researcher's objective is to understand the ‘model or superpopulation’ that generated the data.

Here we illustrate the idea of design informativeness and non-ignorability in the context of meta-analysis. We consider a design to be non-informative if the sampling mechanism does not depend on y, and to be informative if it does (Pfeffermann 1988, 1993, 1996, 2002, 2007). And as noted previously, a design is ignorable for a particular analysis if the inference based on all the known information is equivalent to the inference based on the same information excluding the outcomes of the random variables related to the design (Binder and Roberts 2001).

In this simulation, the finite population of size N = 100,000 was generated from a single superpopulation. The finite population once again has eight strata, each of equal size, and within each stratum simple random samples were selected. As the stratum variances were largest for strata 1 and 8, and smallest for strata 4 and 5, the following unequal sampling ratios (0.009, 0.005, 0.005, 0.001, 0.001, 0.005, 0.005, and 0.009) were used to generate a sample of size 500 for each of the 500 replicates. In this simulation (unlike the previous examples), our stratification was related to our outcome y, and since we use non-proportional allocation, the weighted and unweighted estimates differ substantially. That is, as the design does depend on y, the design is informative and therefore not ignorable for our analysis. Specifically, we have the following means and standard deviations for our finite population strata (Table 4.3).

Table 4.3 Finite Population Means and Standard Deviations of Y by Strata

Stratum   Stratum size   Mean    Standard deviation
1         12,500         -7.22   2.10
2         12,500         -3.47   0.68
3         12,500         -1.42   0.51
4         12,500          0.23   0.45
5         12,500          1.80   0.45
6         12,500          3.47   0.51
7         12,500          5.48   0.51
8         12,500          9.23   2.16

We can see that the strata are just equal sized ordered groups of our overall simulated normal random variable. In practice this could be something like weight or age. The first and last strata are the tails, and we have purposely oversampled these two tails. This often happens in practice, where a small part of the population is oversampled in order to produce more reliable estimates for that subpopulation. If we do not take this oversampling into account, we will have incorrect results, as we will see.

For these simulations, we simulate the case where for each sample we calculate an unweighted estimator for our mean, $\hat{\theta} = \sum_{i=1}^{n}\frac{y_i}{n}$. As we are no longer considering the design in our individual estimates, we have only the possibility of three weights (we no longer have a design effect, therefore our optimal weight is omitted). Thus weight 1 is the inverse of the sample size, weight 2 is the inverse of the variance, and weight 3 is now one. For this simulation, we use our fixed effect model of the form:

$g(\hat{\theta}) = \dfrac{\sum_{i=1}^{r} w_i\hat{\theta}_i}{\sum_{i=1}^{r} w_i}$, where $w_i$ is the weight given to each study.
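A minimal sketch of the weighted-versus-unweighted contrast this simulation illustrates; here one tail is deliberately oversampled (illustrative fractions, not the ones used in the thesis) so that the distortion of the unweighted mean is plainly visible:

import numpy as np

rng = np.random.default_rng(4)

pop = np.sort(rng.normal(1.0, 5.0, 100_000))
strata = np.split(pop, 8)                      # strata are ordered groups of y
fractions = [0.001, 0.001, 0.001, 0.001, 0.001, 0.005, 0.005, 0.009]

ys, wts = [], []
for s_pop, f in zip(strata, fractions):
    n_h = round(f * len(s_pop))
    ys.append(rng.choice(s_pop, size=n_h, replace=False))
    wts.append(np.full(n_h, 1.0 / f))          # design weight = inverse inclusion prob.

y, w = np.concatenate(ys), np.concatenate(wts)
print("unweighted:", y.mean())                 # pulled far toward the oversampled tail
print("weighted:  ", np.sum(w * y) / np.sum(w))   # near the census mean (~1)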

Figure 4.6 Convergence of a Sequence of Stratified Samples from a Single Finite Population using Fixed Effect Models with Three Different Weights – Ignoring the Design

[Figure: pooled mean (y-axis, 0 to 1.2) against the number of replicates combined (x-axis, 14 to 490); series: fpmean, Weight 1, Weight 2, Weight 3, popmean.]

In Figure 4.6 we can easily see that we are converging neither to the finite population mean nor to the superpopulation mean (1.01 and 1 respectively), but rather to the unweighted average of our samples (0.018), which in this case is meaningless. Once again we must note: the simpler the model, the more assumptions we are making. An unweighted survey estimate is simpler than a weighted one, and in order for the two to be equivalent, we must assume that the design is independent of the outcome under study. This assumption, in the context of meta-analysis, is easily broken, as it must hold for all units in all samples.

Compared to Pooling all the Individual Data

Although we have not formally presented pooling individual data from surveys, we will illustrate some important concepts for the reader using simulations. We do this because, as the simulated datasets have all the individual data points, it is easy to compare the final estimate under the finite population model to an estimate obtained where we treat all the data as one large dataset. We can consider two separate cases when using individual data: the case where reweighting is performed and the case where weights are ignored.

The two cases are related in that for each one the analyst will create a large meta-dataset or meta-sample from all the individual data. Reweighting in this meta-analysis context is an adjustment of the sampling weights that reapportions the probabilities of selection for the units from the separate samples into those of a single sample from the finite population. The goal is to have the ‘new’ weight reflect the inclusion probability of the unit in the meta-sample. The researcher then runs the analysis on this new, very large, reweighted meta-dataset. Ignoring the weights in a meta-dataset means that none of the sampling weights are used in the analysis (essentially ignoring the design). The researcher then creates the meta-dataset with no weights and runs the analysis as if it were a single study.


In order to link these two methods in our framework, we first note that reweighting only makes sense when we consider each sample to be from the same finite population. Our inclusion probabilities refer to a single finite population. If we do not believe that the data come from a single finite population, we should not consider reweighting. Moreover, reweighting samples involves many assumptions. We would need to know things like the design information from all the surveys, the sample overlap (or, for rare populations, how they overlap in the samples), and whether we can assume that the samples are selected independently of each other. Reweighting data is also presented in Thomas and Wannell. They present it as “pooling data” and note that it is “an attractive option because of the power of the increased sample size, and because, once combined, it is not necessary to return to the individual datasets.” Once again, they present a simple procedure where they recommend that each of the different cycles' weights be scaled by a constant factor, $\alpha_i = 1/k$, where k is the number of cycles. A cycle is a single cross-section of a repeated cross-sectional survey. They then judge that “although the assumption that each CCHS cycle can be used to estimate the same population parameter is questionable,” the reweighted estimate is fine, and that the resulting estimates can be interpreted as representing the characteristics of the average population. Unfortunately, we cannot agree with this without additional explicit assumptions. Although Thomas and Wannell are combining estimates from a single survey across cycles, they do not point out that this type of reweighting assumes that the individuals in each sample are equivalent. Moreover, they do not note that in the general case this also implies that the designs are equivalent and that there is no overlap. This may be the case in a single survey (though for rare populations probably not); however, there is no reason to assume that it is the case for different surveys.
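A minimal sketch of the cycle-weight rescaling described above (the weight values shown are hypothetical):

import numpy as np

def pool_cycles(cycle_weights):
    """Scale each cycle's design weights by alpha = 1/k and stack the cycles."""
    k = len(cycle_weights)
    return np.concatenate([w / k for w in cycle_weights])

# two hypothetical cycles of design weights
meta_w = pool_cycles([np.array([200.0, 150.0, 150.0]),
                      np.array([180.0, 170.0])])
print(meta_w.sum())   # the total weight becomes the average of the cycle totals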


Citing Thomas and Wannell, other authors have used this approach to combine not just cycles of the CCHS but other surveys as well. For example, Chiu et al. (2010) use this method to reweight a combined sample from the 1996 National Population Health Survey (NPHS) with the first three waves of the Canadian Community Health Survey. The NPHS collects information related to the health of the Canadian population and related socio-demographic information. It is a longitudinal sample that included 17,276 persons of all ages in 1994/1995, and these same persons are interviewed every two years. The CCHS, as described previously, is a cross-sectional dataset. The authors use this reweighting method in order to estimate the prevalence of cardiovascular risk factors for ethnic groups in Canada (White, Chinese, South Asian and Black) over the period from 1996 to 2007. The issue with this approach in this example, however, is that these different surveys do not describe the same finite population – except as an average over this time period.

Moreover, this average for these subpopulations (and thus any overall estimate) is problematic to interpret, in part because Canada does not have a stable ethnic composition: there was a large net increase in all three visible minority groups in Canada over this time period. That is, the change in percentage between the 1996 and 2001 censuses was 19% for Chinese Canadians, 38% for South Asian Canadians, and 15% for Black Canadians. Between the 2001 and 2006 censuses, the percentage of Chinese Canadians increased again by 18%, of South Asian Canadians by another 37%, and of Black Canadians by 18%. These changes are not all due to increased births; immigration between 1996 and 2007 has had an impact on the ethnic distribution in Canada. Additionally, as these surveys are all stratified by province, we could also note that there were dramatic changes in the ethnic proportions in each province over this time period. For example, between 1996 and 2001 there was an increase of 32% in the Chinese population in British Columbia and a 45% reduction in the Chinese population in Newfoundland.


This ‘fictional’ population used by the NPHS and CCHS is thus a constantly changing mixture of recent and later immigrants whose generational status from the point of view of immigration affects health outcomes. The NPHS had already produced results showing that recent immigrants from non-European countries were twice as likely as the Canadian-born to report deterioration in their health over an eight year period (Ng et al. 2005). But reweighting the data to represent some ‘overarching’ fictional distribution is in this case unreasonable. Prevalence of a condition is a concept related to a single stable population over a time period. The p-values for most of the characteristics related to the prevalence are significant as a result of the larger sample size, a situation which makes this type of pooling attractive to researchers. But again, when we cannot define the population, we must ask ourselves: what does a significant test statistic mean?

Another example that uses the simple reweighting method is a study by Y. Yon (2011) on the comparison of middle-aged and geriatric emotional spousal abuse. Here the author pools data from the General Social Surveys of 1999 and 2004. As with the first example, the author ‘pools’ the data by dividing each respondent's weight by the ‘appropriate’ α, which is two. However, in this case Yon's main goal is to assess a model; that is, he wants to better understand the contributing factors to spousal abuse. Here again, the idea that the populations described in 1999 and 2004 in Canada are the ‘same’ is probably a stretch. As the author is interested in the model, he must assume that the process underlying spousal abuse over this time period has not changed, even if the population may have done so. This situation is different from the first example, where the researchers are trying to describe the population; here, the researcher is attempting to describe the model. However, both sets of researchers pool the data to get more power.


Like the assumption of an ‘underlying population’, the assumption of an underlying constant model is strong. By comparing cycles of the GSS we know that the overall fear of victimization was still higher for women than men, but the gap was shrinking between 1999 and 2005 (The Daily, July 7, 2005). And there are other important questions to be raised. Does this change in overall fear relate to a change in the levels or kinds of emotional spousal abuse? If so, does that mean that there were changes in the model explaining spousal abuse? When we are dealing with a model, the model must remain stable between all datasets (of a reasonable size) if it is to be a fair description of the process that generated the data. Although we may not be able to determine significance, with a reasonable sample size we should not have a huge variation in parameters for an accurate superpopulation model. The overall meta-sample had just over 17,000 cases, which implies the individual samples have roughly 8,500 each. It would be interesting to see if the results for each sample hold. This would help answer the questions posed above and also ensure that there was some homogeneity in the model structures of the two samples, so that combining them for use in a single model would make sense.

Both examples reweighted data in the meta-sample in order to do estimation. In the first case, as we are attempting to describe some population using surveys (inference on the finite population), we would always use weights. In the second case, if we can assume the model holds for all observations, sampled or not, then we do not need to use weights. However, as illustrated above, this is perhaps our strongest assumption when we use survey data. If the weighted and unweighted estimates are very different, we have evidence that unweighted model parameters will not converge to what we are expecting.


Additionally, the two examples calculate significance levels, or p-values. They use the implicit postulate that equal p-values represent equal amounts of evidence. However, neither paper seems to appreciate the fact that the p-value depends on the sample size. This is described by Bakan (1966) when he states that “the rejection of the null hypothesis when the number of cases is small speaks of a more dramatic effect…and if the p-value is the same, the probability of committing a type one error remains the same. Thus one can be more confident with a small N than a large N.” This interpretation is often used to differentiate statistical and clinical significance. For example, a significant finding at the p ≤ 0.05 level may be so small in size (e.g., an odds ratio of 1.05) that it is of no clinical significance if the study groups are enormous.

In order to illustrate the similarities between creating a meta-sample and averaging over different samples, we perform a simulation. Here, once again, the superpopulation is normal with mean 1 and sigma 5, while the finite population has a mean of 1.03 with a sigma near 5. Using the stratified design, where we have 12,500 individuals in each stratum, the stratification is related to y, and the sampling fraction is not equal across strata (0.009, 0.005, 0.005, 0.001, 0.001, 0.005, 0.005, 0.009), we can estimate a fixed effect model using the 500 samples. We do this in two ways: first using our estimator with the inverse of the variance as the weight (as the design effect is not greater than one, this is the optimal weight in this case), and second by pooling all the data into a single large dataset. Here we find the following:


Table 4.4 Individual Data and Pooled Data Estimation of a Mean

                                Mean      Sigma
Superpopulation Parameters      1         5
Finite Population Parameters    1.0341    4.99

                           Estimated Mean   Estimated Sigma   95% Confidence Interval
Fixed Effect, weighted     1.03071          0.0083            1.0143 to 1.0471
Fixed Effect, unweighted   0.1266           0.1876            -0.2410 to 0.4944
Pooled Data, weighted      1.0309           0.0037            1.0234 to 1.0383
Pooled Data, unweighted    0.1258           4.1952            -8.0967 to 8.3484

As we are estimating the mean and not the total, we can compare different strategies for point estimation and still have comparable estimates. We can see that if the design is related to the outcome of interest, then the sampling weights need to be used. We also note that, as expected, the pooled approach and the fixed effect approach give consistent results.

4.1.3 Case 1b: Mixing Identical Designs

Despite the fact that several authors discuss inference on finite population parameters under a superpopulation framework, in practice published estimates as in our Case 1b are rare. Estimation software calculates design based estimates or simple model based estimates, not variances under a superpopulation model. Nevertheless, Graubard and Korn (2002) show how to carry out this approach and illustrate it using the 1987 US National Health Interview Survey (NHIS), the US National Health and Nutrition Examination Survey (NHANES III) and the 1986 National Hospital Discharge Survey (NHDS). We can simulate convergence under this scenario, which is nearly identical to that of Case 1a. We note to the reader that the variance under a superpopulation framework will be larger than that under the design alone, and this is the reason we separate these cases.

Simple Random Sampling

Again, we simulate a finite population of 100,000 units generated from a normal superpopulation with mean 1 and sigma of 5, select 500 units in each sample using simple random sampling, and repeat the sampling 500 times. The point estimator for each study is the weighted sample mean. Here, under simple random sampling, the recommended variance under a superpopulation model is the simple random sample variance without the finite population correction factor (Fuller 1975, Cochran 1977, Graubard and Korn 2002). As we are adopting a superpopulation point of view for our inference, we have this estimator:

$g_{1b}(\hat{\theta}_N) = \dfrac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(S)}}{\sum_{i=1}^{r} w_i}$, where once again $w_i$ is the weight given to each study, i = 1, …, r. We have three weights: weight 1 is the inverse of the sample size, weight 2 is the inverse of the variance (under a superpopulation model), and weight 3 is one.
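A two-line sketch of the distinction this case introduces: the same SRS data yield a design variance (with the finite population correction) and a larger superpopulation variance (without it):

import numpy as np

def srs_variances(sample, N):
    """Return (design variance with fpc, superpopulation variance without fpc)."""
    n, s2 = len(sample), sample.var(ddof=1)
    return (1 - n / N) * s2 / n, s2 / n

print(srs_variances(np.random.default_rng(0).normal(1.0, 5.0, 500), 100_000))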


Figure 4.7 Convergence of a Sequence of Simple Random Samples from a Single Finite Population using Fixed Effect Models with Three Different Weights, $\hat{\theta}^{F(S)}$

[Figure: pooled mean (y-axis, 0.80 to 1.20) against the number of replicates combined (x-axis, 14 to 490); series: Finite Population Mean, Weight 1, Weight 2, Weight 3, Superpopulation Mean.]

Here again, the y-axis represents the mean value for our variable, and the x-axis represents the number of surveys we are combining. As we have already explained, we can see that, regardless of the weight (as none are correlated with the point estimate), combining design consistent estimates of the mean from different surveys using a fixed effect model will converge to the census mean – not the superpopulation parameter, even when we are working under a superpopulation framework.


Stratified Sampling

As with the first case, we probably will not find simple random samples in practice. In the case of stratified sampling under a superpopulation framework, we again use the approach of Graubard and Korn (2002). Here we have the design based estimate of the mean, which is mathematically equivalent to that of Case 1a, but now the variance is constructed under the superpopulation. For the stratified case, let $K_h$ be the known number of observations in the hth stratum, h = 1, …, L, and let $k_h$ be the number of sampled units in the hth stratum. The estimator of Graubard and Korn has two components, which take into account the generation of the finite population from the superpopulation and the variance from sampling, and is of the form:

$\mathrm{Var}_{SP}[\bar{y}] = E_\xi(\mathrm{Var}_d(\hat{\bar{Y}})) + \mathrm{Var}_\xi(E_d(\hat{\bar{Y}})) = \mathrm{var}_{wo}(\bar{y}) + \mathrm{var}[\bar{Y}]$,

where

$\mathrm{var}[\bar{Y}] = \dfrac{1}{K-1}\sum_{h=1}^{L}\dfrac{K_h}{K}(\bar{y}_h-\bar{y})^2 + \dfrac{1}{K}\sum_{h=1}^{L}\dfrac{K_h}{K}\left(1-\dfrac{K-K_h}{(K-1)k_h}\right)s_h^2$,

$\mathrm{var}_{wo}(\bar{y})$ is the standard without-replacement estimator for stratified sampling from the L strata, and $\mathrm{var}[\bar{Y}]$ is an estimator of $\mathrm{Var}_\xi(E_d(\hat{\bar{Y}}))$. As we are adopting a superpopulation point of view for our inference, we have this estimator:

$g_{1b}(\hat{\theta}_N) = \dfrac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(S)}}{\sum_{i=1}^{r} w_i}$, where once again $w_i$ is the weight given to each study. We have four weights: weight 1 is the inverse of the sample size, weight 2 is the inverse of the variance, weight 3 is the design effect divided by the variance, and weight 4 is one. Here we simulate a small finite population of size 100,000 generated from a normal superpopulation with mean 1 and sigma 5. There are 8 strata, each of equal size, and within each stratum simple random samples were selected with the unequal sampling ratios of 0.009, 0.005, 0.005, 0.001, 0.001, 0.005, 0.005, and 0.009, to generate a sample of size 500 for each of the 500 replicates.
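A hedged sketch of the two-component stratified variance, implementing the formulas as reconstructed above (with K_h the stratum population sizes and k_h the stratum sample sizes):

import numpy as np

def var_sp_stratified(samples, K_h):
    """Superpopulation variance: without-replacement design part plus the
    population-generation part, per the decomposition above."""
    K_h = np.asarray(K_h, dtype=float)
    K = K_h.sum()
    W = K_h / K
    ybar_h = np.array([s.mean() for s in samples])
    s2_h = np.array([s.var(ddof=1) for s in samples])
    k_h = np.array([len(s) for s in samples], dtype=float)
    ybar = np.sum(W * ybar_h)
    var_wo = np.sum(W**2 * (1 - k_h / K_h) * s2_h / k_h)        # design part
    var_Y = (np.sum(W * (ybar_h - ybar)**2) / (K - 1)           # generation part
             + np.sum(W * (1 - (K - K_h) / ((K - 1) * k_h)) * s2_h) / K)
    return var_wo + var_Y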

Figure 4.8 Convergence of a Sequence of Stratified Random Samples from a Single Finite Population using Fixed Effect Models with Four Different Weights

[Figure: pooled mean (y-axis, 0.970 to 1.010) against the number of replicates combined (x-axis, 14 to 490); series: Weight 1, Weight 2, Weight 3, Weight 4, fpmean, popmean.]

We noted in Chapter 3 that if the weights are not correlated with our estimator $\hat{\theta}_i^{F(S)}$, we will have an unbiased estimate of the census parameter. A small bias can be detected in this graph when the inverse of the superpopulation variance is used as the weight, as we can see that weight 2 and weight 4 remain equidistant. This is because the first and last strata have the smallest and largest means respectively, and they also have the largest variability. We can make the variance of the estimate relate to the mean by altering the sampling fraction. The bias is stronger in the superpopulation case, as the variance is larger and depends more upon the within-stratum variances. This non-independence can be seen by plotting the estimated means against the superpopulation variances, as shown in Figure 4.9. Therefore, we must make sure that the estimate and the weight are not correlated.

Figure 4.9 Correlation of the Mean and the Superpopulation Variance in Unequal Probability Sampling

[Figure: scatter plot of the estimated mean (y-axis, 0 to 5) against the superpopulation variance (x-axis, 0 to 1.4).]

So we again note to the reader that although in theory the superpopulation variance should be independent of the mean, in practice there can be correlation due to the design. Therefore, we always recommend confirming this assumption before estimation.


Cluster Sampling

In practice, it is more likely that we will encounter cluster samples. In the case of cluster sampling, Graubard and Korn (2002) also provide estimators of the variance under a superpopulation. Here, the population consists of K primary clusters, the ith of which consists of $N_i$ secondary clusters. At the first stage of sampling, $k_h$ primary clusters are selected from the $K_h$ primary clusters in stratum h as a pps sample without replacement. We have the following estimator:

$\mathrm{Var}_{SP}[\bar{y}] = \mathrm{var}_{wr}(\bar{y}) + \mathrm{var}_b - \mathrm{var}_w$, where

$\mathrm{var}_b = \dfrac{1}{d^2}\sum_{h=1}^{L}\dfrac{1}{K_h}\sum_{i=1}^{k_h}(t_{hi} - d_{hi}\bar{y})^2$, and

$\mathrm{var}_w = \dfrac{1}{d^2}\sum_{h=1}^{L}\dfrac{k_h}{K_h(k_h-1)}\sum_{i=1}^{k_h}\left((t_{hi} - d_{hi}\bar{y}) - \dfrac{1}{k_h}\sum_{j=1}^{k_h}(t_{hj} - d_{hj}\bar{y})\right)^2$,

and $\mathrm{var}_{wr}(\bar{y})$ is the with-replacement estimator of the variance under stratified sampling. As we are adopting a superpopulation point of view for our inference, we have this estimator:

$g_{1b}(\hat{\theta}_N) = \dfrac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(S)}}{\sum_{i=1}^{r} w_i}$, where once again $w_i$ is the weight given to each study.

Once again, we simulate a cluster design, noting that with PSUs of the same size, pps sampling is equivalent to simple random sampling. We once again have only 8 primary sampling units (PSUs), each containing 12,500 units. From these 8 primary sampling units we select 2 using pps sampling. We have the possibility of four weights: weight 1 is the inverse of the sample size, weight 2 is the inverse of the variance, weight 3 is the design effect divided by the variance, which is now optimal, and weight 4 is one.

Figure 4.10 Convergence of a Sequence of Cluster Samples from a Single Finite Population using Fixed Effect Models with Four Different Weights, $\hat{\theta}^{F(S)}$

[Figure: pooled mean (y-axis, 0 to 7) against the number of replicates combined (x-axis, 16 to 496); series: fpmean, Weight 1, Weight 2, Weight 3, Weight 4, popmean.]

Here the y-axis represents the mean and the x-axis represents the number of studies we are combining. As we combine more studies we should converge, and as with Case 1a under cluster sampling we see a large variation in the estimates and a small bias in the estimators using the superpopulation variance as the weight. As with the stratified case under a superpopulation model, because the first and last clusters have the largest variability, we have a small correlation between the variance and the mean, and thus a small bias when we use the inverse of the variance as our weight.


4.1.4 Case 1b: Mixing Different Designs

We are probably more interested in the general case of combining several different surveys using this variance estimation under a superpopulation model. We are again simulating the combination of simple random samples, stratified samples, and cluster samples. As before, in this simulation the finite population has N = 100,000 and there are eight strata, each of equal size; for the stratified sample, simple random samples were selected within each stratum. As the stratum variances were largest for strata 1 and 8, and smallest for strata 4 and 5, the unequal sampling ratios 0.009, 0.005, 0.005, 0.001, 0.001, 0.005, 0.005, and 0.009 were used to generate a sample of size 500 for each of the 1,000 replicates. For the simple random sample, 1,000 samples of size 500 were selected, and for the cluster sample, 1,000 samples of two of the eight clusters were selected. The estimation for each type of survey was done as in the previous examples under Case 1b. As we are adopting a superpopulation point of view for our inference, we have this estimator:

$g_{1b}(\hat{\theta}_N) = \dfrac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(S)}}{\sum_{i=1}^{r} w_i}$, where once again $w_i$ is the weight given to each study.

These three groups of surveys were combined to illustrate convergence under different designs under a superpopulation. The four different weights are the inverse of the sample size, the inverse of the variance, the design effect times the inverse of the variance, and one.


Figure 4.11 Convergence of a Sequence of Random Samples from Different Designs from a Single Finite Population using Fixed Effect Models with Four Different Weights, $\hat{\theta}^{F(S)}$

[Figure: pooled mean (y-axis, 0 to 3.5) against the number of replicates combined (x-axis, 94 to 2,914); series: Weight 1, Weight 2, Weight 3, Weight 4, Superpopulation Mean, Finite Population Mean.]

Here the y-axis represents the mean and the x-axis represents the number of studies we are combining. As we combine more studies we should converge, and as with Case 1a under different sampling designs we see a large variation in the estimates. Once again, there is a small bias in the estimators using the superpopulation variance as the weight, due to the correlation between the mean and the variance. This once again reinforces the need to validate our model assumptions before we blindly estimate.


4.2 Case 2: Combining several estimates from different surveys generated from the same superpopulation but different finite populations.

As the reader is now aware, even if the estimate is calculated under a superpopulation framework, if the data come only from a single finite population, any combined estimate will only converge to the finite population parameter. If, however, instead of being interested in the finite population parameter, we are interested in the superpopulation parameter and we are fortunate enough to have multiple finite populations generated from the same superpopulation, we can combine them to get a better estimate of the superpopulation parameter.

As noted in Chapter 3, an estimate in this survey context is assumed to estimate the corresponding ‘population’ effect with a random error that stems only from the chance associated with the model and the sampling error (the model-sampling error). That is, the model-sampling error is the joint variation due to the doubly stochastic process in which we first generate the finite population from the superpopulation and then randomly select a sample from the different finite populations. This second case still corresponds to the idea of a fixed effect model, as the parameters for the different finite populations are generated from the same distribution or model. However, we must take into account the two stages in both the estimation of the point estimates

and the variance. Thus we use the results of Godambe and Thompson (1984): that is, that the

census parameter is an estimator of the superpopulation parameter, and use a variance estimation

that includes the superpopulation.


4.2.1 Case 2: Mixing Identical Designs

As with Case 1, we first consider the case where we are combining surveys of the same design, for simple random sampling, stratified sampling and cluster sampling.

Simple Random Sampling

We first consider the simplest case: that of combining simple random samples. We can illustrate this numerically by simulating 1,000 finite populations, all of size N = 100,000, generated from the same superpopulation. From each finite population, we select 5 simple random samples of size 500. We calculate design consistent estimators for each sample. Here, under simple random sampling, the recommended variance under a superpopulation model is the simple random sample variance without the finite population correction factor (Fuller 1975, Cochran 1977, Graubard and Korn 2002). As we are adopting a superpopulation point of view for our inference, we have this estimator:

$g_{2}(\hat{\theta}) = \dfrac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(S)}}{\sum_{i=1}^{r} w_i}$, where once again $w_i$ is the weight given to each study, i = 1, …, r. We have now dropped the N from the subscript of the estimator as we are no longer estimating a census parameter, but rather a superpopulation parameter. The three weights we use in the simple random sample case are as follows: weight 1 is the inverse of the sample size, weight 2 is the inverse of the variance, and weight 3 is one.
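A minimal sketch of this Case 2 simulation (with fewer populations than the text uses, to keep the sketch light):

import numpy as np

rng = np.random.default_rng(5)

thetas, variances = [], []
for _ in range(200):                               # fresh finite populations
    fp = rng.normal(1.0, 5.0, 100_000)             # generated from the superpopulation
    for _ in range(5):                             # five SRS draws from each
        s = rng.choice(fp, size=500, replace=False)
        thetas.append(s.mean())
        variances.append(s.var(ddof=1) / 500)      # no fpc under the superpopulation view

w = 1.0 / np.array(variances)                      # weight 2: inverse variance
print(np.sum(w * np.array(thetas)) / np.sum(w))    # ~1: the superpopulation mean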


Figure 4.12 Convergence of a Sequence of Simple Random Samples from Several Finite Populations, Generated from a Single Superpopulation, using Fixed Effect Models with Three Different Weights, $\hat{\theta}^{F(S)}$

[Figure: pooled mean (y-axis, 0.80 to 1.05) against the number of replicates combined (x-axis, 136 to 4,896); series: Weight 1, Weight 2, Weight 3, popmean.]

Here the y-axis represents the mean and the x-axis represents the number of studies we are combining. As we combine more studies from these different finite populations, we converge not to any of the finite population means, but instead to the superpopulation mean. This is the case that most researchers doing meta-analysis hope they are estimating: the true superpopulation model.


Stratified Sampling

Next, we consider the case where we are combining samples that are stratified by a variable related to the outcome. We can illustrate convergence by simulating 1,000 finite populations, all of size N = 100,000, generated from the same superpopulation. We select 5 stratified samples of size 500 from each and calculate a design consistent estimate of the mean. In the case of stratified sampling under a superpopulation framework, we again use the approach of Graubard and Korn (2002), who suggest obtaining an unbiased estimator of the variance of the mean by using the following:

$\mathrm{Var}_{SP}[\bar{y}] = \mathrm{var}_{wo}(\bar{y}) + \mathrm{var}[\bar{Y}]$,

where

$\mathrm{var}[\bar{Y}] = \dfrac{1}{K-1}\sum_{h=1}^{L}\dfrac{K_h}{K}(\bar{y}_h-\bar{y})^2 + \dfrac{1}{K}\sum_{h=1}^{L}\dfrac{K_h}{K}\left(1-\dfrac{K-K_h}{(K-1)k_h}\right)s_h^2$,

and $\mathrm{var}_{wo}(\bar{y})$ is the standard without-replacement estimator for stratified sampling from the L strata. As we are adopting a superpopulation point of view for our inference, we have this estimator:

$g_{2}(\hat{\theta}) = \dfrac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(S)}}{\sum_{i=1}^{r} w_i}$, where once again $w_i$ is the weight given to each study. The four weights appropriate for stratified samples are as follows: weight 1 is the inverse of the sample size, weight 2 is the inverse of the variance, weight 3 is the design effect multiplied by the inverse of the variance, and weight 4 is one.

Figure 4.13 Convergence of a Sequence of Stratified Samples from Several Finite Populations, Generated from a Single Superpopulation, using Fixed Effect Models with Four Different Weights, $\hat{\theta}^{F(S)}$

[Figure: pooled mean (y-axis, 0.80 to 1.05) against the number of replicates combined (x-axis, 173 to 4,844); series: Weight 1, Weight 2, Weight 3, Weight 4, popmean.]

Here, as with the simple random sample case, the y-axis represents the mean and the x-axis represents the number of studies we are combining. As we combine more studies from these different finite populations, we converge not to any of the finite population means, but instead to the superpopulation mean.


Cluster Sampling

To illustrate combining cluster samples, we again simulate 1,000 finite populations generated from the same superpopulation. From each finite population, we sample 5 times using a cluster sample and then look at the resulting fixed effect model using four weights. Each sample estimate is design consistent and model unbiased, and as we are adopting a superpopulation point of view for our inference we have this estimator:

$g_{2}(\hat{\theta}) = \dfrac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(S)}}{\sum_{i=1}^{r} w_i}$, where once again $w_i$ is the weight given to each study, namely, first, the inverse of the sample size, then the inverse of the variance, then the design effect multiplied by the inverse of the variance, and finally a weight of one.


Figure 4.14 Convergence of a Sequence of Cluster Samples from Several Finite Populations, Generated from a Single Superpopulation, using Fixed Effect Models with Four Different Weights

[Figure: pooled mean (y-axis, 0.5 to 4) against the number of replicates combined (x-axis, 156 to 4,992); series: Weight 1, Weight 2, Weight 3, Weight 4, popmean.]

4.2.2 Case 2: Mixing Different Surveys

Again we are interested in the more general case of combining several different surveys using this variance estimation under a superpopulation model, and once more we simulate the combination of simple random samples, stratified samples, and cluster samples. As before, in this simulation the finite population has N = 100,000 and there are eight strata, each of equal size; for the stratified sample, simple random samples were selected within each stratum. As the stratum variances were largest for strata 1 and 8 and smallest for strata 4 and 5, the unequal sampling ratios 0.009, 0.005, 0.005, 0.001, 0.001, 0.005, 0.005, and 0.009 were used to generate a sample of size 500 for each of the 1,000 replicates. For the simple random sample, 1,000 samples of size 500 were selected, and for the cluster sample, 1,000 samples of two of the eight clusters were selected. These three groups of surveys were combined to illustrate convergence under different designs under a superpopulation. Each sample estimate is design consistent and model unbiased, and as we are adopting a superpopulation point of view for our inference we have this estimator:

$g_{2}(\hat{\theta}) = \dfrac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(S)}}{\sum_{i=1}^{r} w_i}$, where once again $w_i$ is the weight given to each study, namely, the inverse of the sample size, the inverse of the variance, the design effect times the inverse of the variance, and one.

Figure 4.15 Convergence of a Sequence of Random Samples from Different Designs from Several Finite Populations, Generated from a Single Superpopulation, using Fixed Effect Models with Four Different Weights, $\hat{\theta}^{F(S)}$

[Figure: pooled mean (y-axis, -1.5 to 1.5) against the number of replicates combined (x-axis, 41 to 1,476); series: Weight 1, Weight 2, Weight 3, Weight 4, Superpopulation mean, Finite population mean.]


4.3 Case 3: Combining several estimates from different surveys generated from different superpopulations under the same hyper-superpopulation.

In the case of the random effects model, the combined estimate is not an estimate of one parameter, but rather is meant to be the average of a distribution of values. If the fixed effect assumption is rejected, either by assumption or by test, then the analyst must account for the variation between studies in the analysis. We can also think of the random effects model as a weighted average. Here a random effects analysis decomposes the observed variance into its two component parts, within-studies and between-studies, and then uses both parts when assigning the weights.

We simulate a hyper-superpopulation with a mean of 2 and sigma of 5 that generates several superpopulations, each with sigma of 5. If we look at 1,000 different superpopulations, each generating a single finite population, and then take 5 independent simple random samples of size 500 from each of those finite populations, we have the following convergence with a random effects model. To calculate the random effects model analogous to the DerSimonian-Laird estimator (see Chapter 1 for detail) we use

$g_{3}(\hat{\theta}) = \dfrac{\sum_{i=1}^{k}\hat{\theta}_i/(\hat{\tau}^2+v_i^2)}{\sum_{i=1}^{k}1/(\hat{\tau}^2+v_i^2)}$, where $\hat{\tau}^2 = \max\left(\dfrac{\chi^2-(n-1)}{\sum_{i=1}^{k}w_i - \dfrac{\sum_{i=1}^{k}w_i^2}{\sum_{i=1}^{k}w_i}},\ 0\right)$, and where $\chi^2$ is our proposed adjusted chi-square test statistic as described in section 3.4. One way to approach this decomposition, and the method used in the simulations, is to start with Q, a value that reflects the total variance. If the only source of variation were within-study, then the expected value for the test statistic would be its degrees of freedom. So we can decompose Q to get tau, the between-studies variance:

$\tau^2 = \begin{cases}\dfrac{Q - df}{C} & \text{if } Q > df\\ 0 & \text{otherwise}\end{cases}$, where $C = \sum w_i - \dfrac{\sum w_i^2}{\sum w_i}$.

Now the weight for each study combines both the within-study and between-studies variances. That is, $w_i = \dfrac{\mathrm{deff}}{v_i^2 + \tau^2}$. The variance of the pooled random effects estimate is the same as in the fixed effect model with the random effect weights used, $v_{\mathrm{random}} = \dfrac{1}{\sum_{i=1}^{k} w_i}$, and $SE(\theta_{\mathrm{random}}) = \sqrt{v_{\mathrm{random}}}$.
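A hedged sketch of this random effects computation; for simplicity it falls back on the classical Q statistic where the thesis would use the adjusted chi-square of section 3.4, and it omits the design effect from the weights:

import numpy as np

def random_effects(thetas, variances, q_stat=None):
    """DerSimonian-Laird-style pooling: estimate tau^2, reweight, and pool."""
    thetas, variances = np.asarray(thetas), np.asarray(variances)
    k = len(thetas)
    w = 1.0 / variances                            # fixed effect weights
    if q_stat is None:                             # classical Q as a stand-in
        fe = np.sum(w * thetas) / np.sum(w)
        q_stat = np.sum(w * (thetas - fe) ** 2)
    C = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max((q_stat - (k - 1)) / C, 0.0)        # between-studies variance
    w_re = 1.0 / (variances + tau2)                # within + between variances
    theta_re = np.sum(w_re * thetas) / np.sum(w_re)
    return theta_re, np.sqrt(1.0 / np.sum(w_re))   # pooled estimate and its SE

print(random_effects([2.1, 1.8, 2.5, 1.4], [0.04, 0.05, 0.06, 0.03]))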


Figure 4.16 Convergence of a Sequence of Simple Random Samples from Several Superpopulations, Generated from a Hyper-Superpopulation, using a Random Effects Model, $\hat{\theta}^{F(S)}$

[Figure: pooled mean (y-axis, 1.0 to 2.4) against the number of replicates combined (x-axis, 135 to 4,995); series: Random Weight, Hyper Population Mean.]

Here, the x-axis represents the number of replicated studies and the y-axis represents the mean. We see that we do, in fact, converge to the hyper-superpopulation mean using a random effects model in this case.

Next, using the same 1,000 finite populations, we select 5 stratified samples, in this case where the strata are not related to the outcome of interest, but still with the unequal within-stratum selection fractions of 0.009, 0.005, 0.005, 0.001, 0.001, 0.005, 0.005, and 0.009, used to generate a sample of size 500. We now ensure we are using design consistent estimators given a stratified design and again calculate the pooled estimate as above:

$g_{3}(\hat{\theta}) = \dfrac{\sum_{i=1}^{k}\hat{\theta}_i/(\hat{\tau}^2+v_i^2)}{\sum_{i=1}^{k}1/(\hat{\tau}^2+v_i^2)}$, where $\hat{\tau}^2 = \max\left(\dfrac{\chi^2-(n-1)}{\sum_{i=1}^{k}w_i - \dfrac{\sum_{i=1}^{k}w_i^2}{\sum_{i=1}^{k}w_i}},\ 0\right)$, and where $\chi^2$ is our proposed adjusted chi-square test statistic as described in section 3.4.

Figure 4.17 Convergence of a Sequence of Stratified Random Samples from Several Superpopulations, Generated from a Hyper-Superpopulation, using a Random Effects Model

[Figure: pooled mean (y-axis, 1.0 to 2.4) against the number of replicates combined (x-axis, 135 to 4,995); series: Random Weight, Hyper Population Mean.]

Again, we can easily see that the random effects model converges to the hyper-superpopulation parameters when we are pooling design consistent, model unbiased estimates. Next, to illustrate a random effects model with cluster samples, we simulate the case of 5 cluster samples, with n = 2 clusters selected from each of the 1,000 finite populations. Again, we use the random effects pooled estimate:

$g_{3}(\hat{\theta}) = \dfrac{\sum_{i=1}^{k}\hat{\theta}_i/(\hat{\tau}^2+v_i^2)}{\sum_{i=1}^{k}1/(\hat{\tau}^2+v_i^2)}$, where $\hat{\tau}^2 = \max\left(\dfrac{\chi^2-(n-1)}{\sum_{i=1}^{k}w_i - \dfrac{\sum_{i=1}^{k}w_i^2}{\sum_{i=1}^{k}w_i}},\ 0\right)$, and where $\chi^2$ is our proposed adjusted chi-square test statistic as described in section 3.4.

Figure 4.18 Convergence of a Sequence of Cluster Samples from Several Superpopulations, Generated from a Hyper-Superpopulation, using a Random Effects Model, $\hat{\theta}^{F(S)}$

[Figure: pooled mean (y-axis, 1.0 to 2.4) against the number of replicates combined (x-axis, 135 to 4,995); series: Random Weight, Hyper Population Mean.]

Here we again converge to the hyper-superpopulation mean when we combine design consistent, model unbiased estimates.


4.4 Small Sample Properties

Although all the weighted estimators converge, it is important to make the distinction between having a small number of studies and a large number of studies. Such a distinction is similar to that made in the case of cluster samples or sampling in two phases: the variability is related to the number of primary sampling units, which in our case are the surveys, not the number of individuals within each survey. And in general, we will not have hundreds of studies to combine. We will see in these numerical examples that convergence is slow.

To illustrate this, we first look at the case of combining three different survey designs from a single finite population generated from a single superpopulation. If we have only 15 surveys, that is, 5 simple random samples of size 500, 5 stratified samples of size 500 using the sampling ratios of 0.009, 0.005, 0.005, 0.001, 0.001, 0.005, 0.005, and 0.009, and 5 cluster samples where we select 2 of eight clusters, we see that the convergence to the population mean is not rapid. We also see that in small samples any weight can appear to be optimal, due to chance, by changing the random seed in the simulation. In Figures 4.19, 4.20 and 4.21 we are using this pooled estimator:

$g_{1a}(\hat{\theta}_N) = \dfrac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(F)}}{\sum_{i=1}^{r} w_i}$.


Figure 4.19 Convergence of a Sequence of Samples from Different Designs from a Single Finite Population using Fixed Effect Models with Four Different Weights (seed 1), $\hat{\theta}^{F(F)}$

[Figure: pooled mean (y-axis, 0 to 8) against the number of replicates combined (x-axis, 1 to 14); series: Weight 1, Weight 2, Weight 3, Weight 4, Superpopulation Mean, Finite Population Mean.]

We see that when the number of replicates is small (and likewise the number of units sampled), we do not converge quickly to the finite population mean. In fact, in the first simulation, only 16% of the data points are within 10% of the finite population mean. In general, while analysts sometimes use only point estimates, lay people are frequently guilty of relying solely on them. Moreover, when they have some notion of how the estimate behaves in the limiting case, they assume that their particular estimate is ‘close’ to this theoretical parameter. Although in all the cases we have presented the estimator converges, in the limit, to the appropriate parameter, in general when we pool 5 or 10 or 15 cases we are not ‘close’ to the limiting value.


We can see in the next two figures that, by using different starting seeds, we can illustrate the typical behavior of a series of this type of pooled estimate when there is a small number of replicates.

Figure 4.20 Convergence of a Sequence of Samples from Different Designs from a Single Finite Population using Fixed Effect Models with Four Different Weights (seed 2), $\hat{\theta}^{F(F)}$

[Figure: pooled mean (y-axis, 0 to 7) against the number of replicates combined (x-axis, 1 to 14); series: Weight 1, Weight 2, Weight 3, Weight 4, Superpopulation Mean, Finite Population Mean.]


Figure 4.21 Convergence of a Sequence of Samples from Different Designs from a Single Finite Population using Fixed Effect Models with Four Different Weights (seed 3), $\hat{\theta}^{F(F)}$

[Figure: pooled mean (y-axis, -5 to 2) against the number of replicates combined (x-axis, 1 to 14); series: Weight 1, Weight 2, Weight 3, Weight 4, Superpopulation Mean, Finite Population Mean.]

We note that, in general, the convergence is poor for cases where there are fewer than 15 surveys

when we are calculating the variance from just the finite population.

In the case of estimation when we invoke a superpopulation, we need to be more conservative, as each estimate is more variable and therefore the convergence is even slower. If we consider the case where we have calculated finite population estimates under a superpopulation model, we have the following typical pattern in small samples. Here we now have 7 samples from each type of survey as described above, and we see that convergence even after 21 samples is not necessarily as accurate as the researcher may like. Again, we are using the Case 1b estimator:


$g_{1b}(\hat{\theta}_N) = \dfrac{\sum_{i=1}^{r} w_i\hat{\theta}_i^{F(S)}}{\sum_{i=1}^{r} w_i}$.

Figure 4.22 Convergence of a Sequence of Random Samples from Different Designs from a Single Finite Population, Generated from a Single Superpopulation, using Fixed Effect Models with Four Different Weights, $\hat{\theta}^{F(S)}$

[Figure: pooled mean (y-axis, 0 to 3.5) against the number of replicates combined (x-axis, 1 to 20); series: Weight 1, Weight 2, Weight 3, Weight 4, Superpopulation Mean, Finite Population Mean.]

The reader should note that a small number of studies will not ensure that the estimate is ‘close’

to an asymptotic limit.


4.5 Plots

There are many ways of analyzing and displaying data arising from a meta-analysis. Two of the more common plots used in meta-analysis are discussed here for the survey case. In the case of experimental data, the confidence intervals, tests, and overall estimate all concern the superpopulation or model. For example, the confidence interval for the mean from a single superpopulation can be interpreted using repeated sampling: if we generate values for the population over and over again using the model and construct the 95% confidence interval for each resulting sample, we expect that 95% of the confidence intervals will contain the true superpopulation parameter. We need to contrast this idea with the interpretation of a design based confidence interval. Again, we can think of repeated samples, but here, if we repeat the sample, the confidence interval is interpreted through the expected proportion of confidence intervals that will include the finite population (census) value $\bar{Y}$.

To illustrate the difference, we can simulate data from a simple superpopulation model. Here, we simulate a small finite population of 20 elements from a normal random variable with mean 0 and variance 1. For the 20 elements generated, the population mean is 0.39164. If we consider all possible samples of size 5 under simple random sampling, we would have 15504 different samples. In a design-based framework, we can look at all 15504 possible samples and see that the expectation of the sample mean across all the samples is the finite population mean, not the superpopulation mean.
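This check is easy to reproduce; the sketch below (a minimal illustration: the dataset and variable names are ours, and the seed is arbitrary) generates one such finite population and enumerates all 20-choose-5 = 15504 simple random samples:

data pop;
   call streaminit(2);                    /* arbitrary seed */
   do id = 1 to 20;
      y = rand('normal', 0, 1);           /* superpopulation: N(0,1) */
      output;
   end;
run;

data all_srs(keep=smean);
   array yv[20] _temporary_;
   do i = 1 to 20;                        /* load the finite population */
      set pop point=i;
      yv[i] = y;
   end;
   do i1 = 1 to 16;                       /* enumerate all samples of size 5 */
    do i2 = i1 + 1 to 17;
     do i3 = i2 + 1 to 18;
      do i4 = i3 + 1 to 19;
       do i5 = i4 + 1 to 20;
          smean = (yv[i1] + yv[i2] + yv[i3] + yv[i4] + yv[i5]) / 5;
          output;
   end; end; end; end; end;
   stop;
run;

/* the mean of the 15504 sample means equals the finite population
   mean exactly, whatever the superpopulation mean happens to be */
proc means data=all_srs n mean;
   var smean;
run;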


Figure 4.23 Sampling Distribution of a Simple Random Sample

[Figure: histogram (percent) of the sample means of all 15504 possible samples of size 5.]

Forest Plots

Forest plots, also known as confidence interval plots, are used to show the individual study information that went into the meta-analysis (Clarke and Lewis 2001). They show the amount of variation between studies and an estimate of the overall result. In general, the plot displays each study's mean and confidence interval and usually has two reference lines: one for no effect and one for the overall estimate. If we use the data we simulated above, select 9 samples at random, and construct a forest plot with the superpopulation value, we would see that we do not have anywhere near the expected coverage of 95%. This result occurs because the confidence interval for the design-based estimator is based on the repeated sampling of the finite population. If, however, we look at the design-based confidence intervals, we would see that there is good coverage of the census parameter and that the fixed effect parameter is near the census figure. Plots were constructed using Comprehensive Meta Analysis® software.

Figure 4.24 Forest Plot of 9 Simulated Studies Selected using Simple Random Sampling, Showing the Finite Population and Superpopulation

We suggest that researchers insert the finite population parameter(s) and the model assumptions on such graphs to help readers better analyze pooled survey estimates.
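Such an annotated plot is straightforward to produce; a minimal sketch in SAS (the dataset studies and its variables study, mean, lcl, and ucl are hypothetical placeholders), using the finite population mean 0.39164 and the superpopulation mean 0 from the simulation above:

proc sgplot data=studies noautolegend;
   scatter y=study x=mean / xerrorlower=lcl xerrorupper=ucl;  /* study means and CIs */
   refline 0.39164 / axis=x label="Finite population mean";
   refline 0       / axis=x lineattrs=(pattern=dash) label="Superpopulation mean";
   xaxis label="Estimate";
   yaxis type=discrete label="Study";
run;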

Funnel Plots

This plot is a simple scatter plot of the effects estimated from individual studies (usually on the horizontal axis) against a measure of study size, often the variance or standard error (on the vertical axis). A funnel plot is used in meta-analysis to detect publication bias (Egger 1997). Publication bias may lead to asymmetrical funnel plots, but it is important to realize that publication bias is only one of a number of possible causes of funnel-plot asymmetry. Furthermore, these plots, along with other tools, should be used as a means of examining study effects rather than as a tool to diagnose specific types of bias. If we plot only the 9 samples from above, with the mean on the x axis and the standard error on the y axis, we get the following.

Figure 4.25 Funnel Plot of 9 Studies Selected using Simple Random Sampling

[Figure: funnel plot of standard error (0.0 at the top to 0.4) by mean (−2.0 to 2.0).]
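Such a plot is also simple to produce; a minimal sketch in SAS (again assuming a hypothetical dataset studies with the estimate mean and its standard error se):

proc sgplot data=studies;
   scatter x=mean y=se;
   refline 0 / axis=x lineattrs=(pattern=dash);  /* the superpopulation value, 0 here */
   yaxis reverse label="Standard Error";         /* most precise studies at the top   */
   xaxis label="Mean";
run;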

Figure 4.25 would correctly be interpreted as showing no publication-type bias. Unlike the experimental case, where there are infinitely many realizations of a model, here we can look at all the possible samples of a single design to see what the funnel plot would look like for a fixed design with a fixed sample size. If we randomly select a larger set of 200 samples, we can see that with data taken from studies with a small sample size we may interpret the resulting asymmetry in the funnel plot as bias when there may not be any.


Figure 4.26 Funnel Plot of 200 Studies Generated from Simple Random Samples

[Figure: funnel plot of standard error (0.0 at the top to 0.6) by mean (−2.0 to 2.0).]

Therefore, we must again take care not to interpret these plots in the context of experimental meta-analysis, but in terms of the survey design.


Chapter 5 Example – Partnership Status and Sex Ratio

This chapter critiques a published example of a poorly executed pooling of a number of surveys. This extended examination of a single meta-analysis is presented to facilitate the reader's understanding of the overall methodology, estimation methods, and tests described in the previous sections, while also highlighting conceptual issues. The meta-analysis dissected in the following pages is that of Karen Norberg (2004), who studied the effect of partnership status on sex ratio using 5 different studies, pooling the data from three longitudinal surveys, one cross-sectional survey cycle, and one epidemiological study. Her paper was published in the Proceedings of the Royal Society of London, Series B, and cited in The Economist.

5.1 Partnership Status and Sex Ratio

The purpose of Norberg's research is to examine the relationship between partnership status and the human sex ratio at birth (the secondary sex ratio) using a "partnership status hypothesis," a theory which postulates that family structure has different fitness consequences (i.e. survival) for sons and for daughters.

Sex Ratios – Birth Outcomes

There are two sex ratios: the primary ratio, which is the sex ratio at conception, and the secondary ratio, which for humans is defined as the number of male births over the number of total births in a given time period. This secondary sex ratio has long been a subject of considerable interest. It was in 1662, for example, that John Graunt, one of the first demographers, first noted the difference in mortality and births between male and female children in his Natural and Political Observations Made upon the Bills of Mortality. In 1710, John Arbuthnott became the first researcher to look at the sex ratio over time, noting that from 1629 to 1710 it had been relatively stable. In his An Argument for Divine Providence, using data from the christening registries and bills of mortality, Arbuthnott published the first paper in inferential statistics, showing that more males are born than females than would be expected by chance. Pierre-Simon Laplace in the late 1700s also studied the sex ratios of children born in Paris and London, applying his Bayesian-type methods to again conclude that there were more boys than girls 'due to a greater opportunity in the birth of boys' (Laplace 1781). Two hundred years later, Corrado Gini (1955) would also study large collections of sex ratio data in order to see if there were any discernible changes at different time scales. Building on the work of his forebears, Ronald Fisher, in The Genetical Theory of Natural Selection (1930), examined the reason for the discrepancy between the expected ratio of 1:1 for males to females and the actual ratio. He used the concepts of reproductive value and natural selection as explanations for this difference. Observing that under this framework the "selection of males would raise the sex ratio until the expenditure upon males became equal to females," Fisher proposed that the differences in the sex ratio could be explained by the higher death rate in males before adulthood (including in utero). The expenditure of the parents would then be the same on average for boys that survive to adulthood, but less for each boy born.

This idea of parental expenditure as it relates to natural selection is at the foundation of the Trivers and Willard hypothesis (Trivers and Willard 1973), which holds that evolution will favor systematic deviations from a population sex ratio of 1 based on the condition of the mother. This effect has been found in other mammals, such as deer: during times of stress and lack of food, the sex ratio of deer shifts dramatically towards female births. According to Trivers and Willard, mothers in good condition would have more sons, while less healthy mothers would have more daughters.

The human sex ratio is thought to be affected by a wide range of biological and environmental factors, including race, birth control, birth order, maternal and paternal age and hormone levels, the timing of conception, ovulation induction, environmental toxins, and socioeconomic status (James 1977, 1980, 1981, 1987, 2008; Ulizzi 1994, 2001; Erickson 1976; Teitelbaum 1971; Lloyd et al. 1984; Davies 1997). These factors are of two kinds: those determining the primary sex ratio and those determining the survival of the embryo in utero. Working in the 1930s, Fisher noted the large inequality in the sex ratio at conception but, without the benefit of modern genetics, diagnostics, testing, and imaging, was unable to articulate the mechanisms producing such an effect.

We now believe that there are possibly several mechanisms that can alter the primary sex ratio. First, the Y-bearing and X-bearing sperm may have different motility and different survival times (Sarkar 1984). Some researchers also speculate that external stress can affect sperm production and quality, which in turn could affect the primary sex ratio (Fukuda 1996). The survival of a quality egg is even more of a gamble. For one thing, unlike sperm, which are in constant production, the age of the ovum at fertilization is precisely that of the mother. Furthermore, the chemical balance of a woman's genital tract changes as she ages and also fluctuates over her monthly cycle. Each of these factors can have an effect on the sex ratio at conception (Guerrero 1975). For example, the likelihood of conceiving a male child appears to be higher early and late in the menstrual cycle (Harlap 1979). In other research into the question, Renkonen (1970), Martin (1995), and James (2008) examine the impact of coital day on sex ratios and note that timing can play a role. There is also evidence that parity (that is, the number of children a woman has) affects sex ratios: James (1977) found significant linear declines in sex ratio with both increasing parity and increasing paternal age, and a curvilinear relationship with maternal age. In terms of the time in utero, prenatal male mortality is not uniform over gestation, being highest between gestational months 3 to 5, lower between months 6 and 8, and higher again at term.

There are several plausible mechanisms through which single parenting may be linked to gestational outcomes. As socioeconomic conditions are linked with mortality, higher socioeconomic status implies lower prenatal mortality for both sexes (World Health Organization), though this could be differential for the sexes. Maternal nutrition is associated with socioeconomic conditions (including marital status), which in turn affect the length of gestation and birth weight (Institute of Medicine).

Furthermore, there exists a considerable body of research supporting the idea that pregnant women who experience stress during pregnancy have worse birth outcomes than women without stress. For example, Paarlberg et al. (1995) have shown that maternal stress is implicated in fetal loss and low birth weights. Regarding the former, Hobel et al. (2008) suggest that fetal loss may be increased by stress-induced hypertension in the mother. As to the triggers or sources of such stress, there are many: for example, a single parent is more likely to have to deal with a job loss or change in job status due to a pregnancy, and is more likely to work longer hours yet have less income than her married counterpart. Such a change in employment status would very likely mean a subsequent loss in economic resources, leading to increased stress, which could in turn result in worse nutrition, overall health, and self-care, and other signs of negative coping (Rogers 1998). And some negative coping mechanisms have a demonstrated link with sex ratios. For example, smoking is associated with differential sex ratios: Beratis et al. (2008) found that among women who smoked, significantly more male than female offspring were born to primiparous women (women having their first child), whereas women who had 3 or more children gave birth to more female offspring. The pregnancy may also bring a change or loss in social networks: Weinraub (1983), for example, showed that single parents tend to be more socially isolated than married parents, and thus receive less emotional support. All of the aforementioned constitute psychological stressors which can in turn affect maternal health and nutrition. This said, we must note that Norberg herself does not make direct reference to a possible stress causal pathway which could account for a difference in sex ratios for single parents.

Chemicals in the body also appear to play a role in influencing the sex ratio, both at conception and in utero; for example, high levels of circulating gonadotropins may mean a lower sex ratio at birth (James 1981). Thus, Melero-Montes (2001) has shown that proportionally more females are born to women who develop hyperemesis gravidarum (severe vomiting) in the first trimester of pregnancy, and suggested that the serum human chorionic gonadotropin (hCG) levels associated with the condition are possibly linked to or influenced by the sex of the fetus (hCG is higher with female fetuses). There is also a link between higher levels of gonadotropins and certain races: a racial effect can be biologically determined at conception, as Hispanic, black, and Asian women have higher levels of circulating gonadotropin and therefore a higher probability of conceiving girls (Benn 1997). On the other hand, Basso (2001) found that the low estrogen levels associated with pre-eclampsia were associated with more male births. (The direction of the causation is still not perfectly known, however, as female fetuses produce gonadotropins and males may affect the production of estrogen.) Finally, antibodies can also play a role in utero; Nielson et al. (2010) reported that HY antibody titers are increased in patients with unexplained secondary recurrent miscarriage and are associated with a low male to female ratio in subsequent live births.

Other authors give possible mechanisms through which such changes in the secondary sex ratio could take place. Catalano et al. (2006) consider them likely due to increased fetal death when there is a large external stressful event such as September 11. Fukuda (1998) and Zorn (2002) consider changes in the primary sex ratio due to sperm characteristics after events like earthquakes and war.

Parental choice also plays a role. Goodman (1961) reports that "the overall effect of birth control on the sex ratio will depend upon, among other things, the particular kinds of preferences for male offspring that influence the married couples in the population, the particular kinds of preferences for female offspring that influence the married couples in the population, the particular ways in which these preferences affect the parents' decisions as to whether or not to have another child (if they make such decisions), the relation between the parents' decisions and their probability, $p_i$, of bearing a boy, the effectiveness with which these decisions are carried out, the relation between family size and $p_i$, the distribution of the $p_i$ in the population, and the nature of the $v_i$ as a function of all relevant variables." Along with birth control, selective abortion can also skew the sex ratio.


Selection of Studies

In any meta-analysis, we must choose the studies that can answer the analytical question. We should always explain why we are choosing (and, just as importantly, not choosing) certain studies, since the inclusion or exclusion of studies could potentially be a source of bias. Unfortunately, however, Norberg gives no reason for her choice of studies. Norberg uses data from the 1959 National Collaborative Perinatal Project (NCPP), the 1968 Panel Study of Income Dynamics, the 1969 National Longitudinal Survey of Young Women, the 1979 National Longitudinal Survey of Youth, and the 1995 National Survey of Family Growth. Furthermore, she seems to believe that all her chosen studies are surveys, apparently unaware that the NCPP does not fall into this category. Nor does she discuss the potential selection bias associated with selecting only these studies.

This is a problem because, as a simple internet review (using Google with the key words fertility, survey, and data) reveals, there exist several other publicly available studies, either national surveys or registry data, that she could have considered instead or as well. These include, but are not limited to, the National Survey of Fertility and Health, the annual supplement to the Current Population Survey on fertility, and Vital Statistics data from 1968 to 1992. The National Survey of Fertility and Health, the fieldwork for which took place between March 1987 and May 1988, consists of interviews with 13,017 women and men aged 19 and older, of all marital statuses. Its data set includes detailed information on respondents' cohabitation, marriage, and fertility histories, in addition to a rich array of social and economic indicators. The Current Population Survey supplement on fertility similarly includes data on marital history and fertility. Men aged 15 and over who were ever married (currently widowed, divorced, separated, or married) were asked to report the number of times they had been married and whether their first marriage had ended in widowhood or divorce. If a woman identified herself as married and was over 15, she was asked to report the number of times she had been married and the dates of marriage, widowhood, or divorce. In the case of divorce, the woman was asked to report the date of the break-up of the household. Questions on fertility were then asked of ever-married women 15 years and over, and of never-married women 18 years and over. Amongst the data collected were the number of live-born children per woman, and their dates of birth, genders, and current residences for as many as five children. Vital statistics registries carry information on every birth in the country, including information on the mother such as education, marital status, and parity. Information on the birth includes the date of birth, the gender of the child, and whether the birth was a multiple. All of these studies could also have been used to help answer the question of the impact of marital status on the human secondary sex ratio.

Target Populations and Study Designs

As noted previously, the target populations of the studies should be compatible. Ideally, they would be the same; if not the same, comparable; and if not comparable, complementary. We explain each study's target population and design in detail in the next section. As we will see, the target populations of the individual studies vary widely, and it is difficult to pool them in a way that would create an interpretable population of inference without some model assumptions. Additionally, we note that each study has a different design, with very different survey administration, and we also reiterate for the reader that the NCPP is not a survey; thus the methods in this framework would not apply to it. Oblivious to the need for comparability, Norberg makes no note of the target populations of the different studies (instead stating that they are all nationally representative, which is not the case for the NCPP), nor of how the differences in these target populations or designs could affect her analysis.

NCPP – The National Collaborative Perinatal Project

The 1959 National Collaborative Perinatal Project (NCPP; National Institute of Neurological and Communicative Diseases and Stroke 1985) was conducted to enable biomedical and behavioral research (Klebanoff 2009). From the study's documentation we see that it "was designed to link any later appearance of physical or neurological disorder to patterns during the perinatal period in order to develop strategies for prevention and intervention." The data contained in the NCPP come from over 55,000 pregnancies and the children born of these pregnancies at fourteen obstetrical or pediatric departments in 12 universities. These departments were located around the country and provided care mostly to low-income women. The women were recruited prospectively, and the data cover the period from 1959 to 1966.

Numerous questions were asked of the mothers, and tests were performed on them, resulting in a rich epidemiological dataset that includes the majority of known factors affecting sex ratios. Enrolled in the study on their first visit, the mothers were then followed throughout their pregnancies and into the postpartum period. For example, each mother's blood was collected approximately every eight weeks, at delivery, and six weeks postpartum. Other variables that were recorded include, but are not limited to, marital status, parity, instances of premature birth, abortion, fetal death, stillbirth, and number of prior viable births, trimester of pregnancy at registration, maximum weight gain, low maternal haematocrit, proteinuria (up to 24 weeks), and number of prenatal visits. Levels of maternal smoking and drug use during pregnancy were also measured, and children were systematically assessed for the presence of birth defects and other outcomes through to the age of eight years.

The data constitute an internally valid prospective cohort. The researchers at each center collected background information about each pregnant woman admitted for delivery, thus allowing for a comparison between those enrolled and those not. The researchers found no evidence of an internal selection bias. However, a lack of internal selection bias is not the same concept as external validity, or generalizability to the national population. The NCPP does not constitute a representative sample of the American population and was not intended to be one: as Hardy (2003) notes, the study's "identification of potentially causal relationships between perinatal events and life course outcomes to age 7 or 8 years was of greater importance than overall representativeness of a national sample."

On a final note, as the NCPP is not a survey, it does not have survey weights, and one therefore cannot calculate design-based estimates using its data set. Admittedly, if we were to take the point of view that the model relationship between marital status and sex ratio holds for everyone in the population and is independent of any and all sampling mechanisms, we would not use weights. However, as noted previously, to proceed in this way we would need first to illustrate that the sex ratio of the study population and of all its subpopulations is similar to that of the rest of the population. To illustrate that this model assumption does not hold for the NCPP, we observe that in 1995, Liederman and Flannery discussed the sex ratio of neurodevelopmentally disordered


(ND) children using the NCPP. It had been conjectured that women who give birth to ND children have hormonal differences that bias them towards having male children. One key point of discussion raised by this paper was the appropriate expected sex ratio to use for comparison with the study's findings. The ratio in this study's subpopulation was 1.11. As noted by James (1996) in a review letter, this is very high in comparison with the general population. In order to make a valid comparison, Liederman and Flannery were able to construct a suitable control group from study participants who were non-ND. The sex ratio for the non-ND NCPP control group was 1.10 (Flannery and Liederman 1996). This again is very different from the population values, which gives further credence to the notion that the cohort, though internally valid, is not representative, and it illustrates that the model does not hold for all subpopulations.

PSID – The Panel Study of Income Dynamics

The Panel Study of Income Dynamics (PSID) is a longitudinal survey (Hill 1992). A longitudinal survey is similar to a prospective cohort in that it follows the same group of individuals over time. However, the original sample in a longitudinal survey is a representative probability sample of a population, whereas a prospective cohort need not be. The PSID is a representative sample of U.S. individuals and families in 1968, and was conducted at the Survey Research Center, Institute for Social Research, University of Michigan. The PSID original sample consists of two independent samples: a cross-sectional national sample and a national sample of low-income families. As the SRC website reveals, the cross-sectional sample was drawn by the Center itself and constitutes an equal probability sample of households from the 48 contiguous states, designed to yield about 3,000 completed interviews. The second sample came from the Survey of Economic Opportunity (SEO), conducted by the Bureau of the Census for the Office of Economic Opportunity. In the mid-1960s, the PSID selected from amongst SEO respondents about 2,000 low-income families, each headed by an individual under the age of sixty. Known as the SEO sample, this national sample of low-income families was confined to Standard Metropolitan Statistical Areas (SMSAs) in the North and non-SMSAs in the Southern region. Both samples are probability samples, and their combination is also a probability sample.

In the PSID, for each household selected, a single household representative responds, usually the husband or male adult head of the household. This respondent provides information for himself directly and acts as a proxy respondent for the other members of the household. The data available from this study include variables related to fecundity: specifically, there is a special dataset with all birth and adoption information, the Childbirth and Adoption History File (CAH). However, for the initial years of the survey (1968–1977) the PSID does not make the distinction between marriage and cohabitation. Only after 1977 can one distinguish married couples from those cohabiting or single.

Additionally, the initial PSID sample is not representative of individuals who immigrated to the US after 1968. As the original study was designed to study low income and poverty dynamics, the study has a disproportionately large subsample of households living in poverty in the 1960s. Since poverty and race are not independent in the United States, there is a larger proportion of black families in this study.

The user guide notes that weighting is needed for estimation, both to compensate for the unequal probability sampling and to attempt to compensate for differential nonresponse and attrition in 1968 and subsequent waves. These weights are called the Core/Immigrant individual and family longitudinal sample weights. The Core (Core/Immigrant) longitudinal weights are designed to enable unbiased estimation of characteristics of the U.S. individuals and families that were eligible for the PSID survey population at the start of the survey. This population, that is, the individuals and families that were eligible for the PSID in 1968, is the population of inference for this study and is widely different from the clinical population in the NCPP.

NLSYW – The National Longitudinal Survey of Young Women

The National Longitudinal Survey of Young Women (NLSYW) was designed by the United States Department of Labor (BLS 2011). It is one of six surveys that comprise the National Longitudinal Survey (NLS) series (Pergamit et al. 2001). The original purpose of the survey was to study the employment patterns of women in their teens and early 20s who were completing school, making initial career and job decisions, and starting families. The target population for the study was young women aged 14 to 24 as of December 31, 1967, representing the civilian, non-institutionalized population residing in the United States. The initial sample consisted of 5,533 young women, 5,159 of whom (approximately 93 percent) participated in the 1968 survey. It was a multi-stage probability sample drawn by the Census Bureau from 1,900 primary sampling units (PSUs) that had originally been selected for the Monthly Labor Survey conducted between early 1964 and late 1966. Like the PSID, the Young Women's sample also has an oversample of blacks in order to better estimate labor disparities by race.


Respondents to the NLSYW were asked questions on a number of subjects germane to sex-ratio research, including marital status, children, other dependents, and fertility history. Specific questions focused on current marital status; the dates, duration, and reason for the end of previous marriages; the total number of children; the number of adopted children; the number of children living at home; the ages, dates of birth, and genders of children; and the timing of the respondent's marriage with respect to work, school, and the birth of her first child. However, as the concept of cohabitation was not prevalent, or some may say acceptable, in 1967, the earlier years of the study cannot be used to identify cohabitation. And as in the PSID, data on cohabiting mothers were not collected, and cohabiting mothers could have been classified as either married or single.

Each cross-sectional wave has its own weight, which can be used to produce nationally representative design-based descriptive estimates for that wave. Because the weights are specific to a single cross-sectional year of the study, and respondents occasionally miss an interview but are contacted in a subsequent wave, we have a problem similar to item nonresponse (i.e. nonresponse for just a single question as opposed to the whole survey) when the data are used longitudinally. So, unlike the PSID, the default weights are not longitudinal, and the technical documentation for the survey notes that "in principle, if a user wished to apply weights to multiple wave data, weights would have to be recomputed based upon the persons for whom complete data are available." However, longitudinal custom weights can be requested by data users. The documentation also notes that for regression analysis the weights may or may not produce correct estimates. It suggests that when a researcher believes that subgroups could have followed significantly different regression specifications, the preferred method of analysis is to estimate a separate regression for each group. So we see that neither the weighted cross-sectional populations nor the longitudinal population of the NLSYW corresponds to the population of inference for the PSID or the NCPP.


NLSY – The National Longitudinal Survey of Youth

This is another of the six surveys that comprise the National Longitudinal Survey (NLS) series. The National Longitudinal Survey of Youth (NLSY) is a nationally representative sample of 12,686 young men and women who were 14–22 years old when they were first surveyed in 1979. As noted in the technical documentation, the sample consists of three subsamples:

• A cross-sectional sample of 6,111 youths designed to be representative of non-institutionalized civilian youths living in the United States in 1979 and born between January 1, 1957, and December 31, 1964 (ages 14 to 21 as of December 31, 1978).

• A supplemental sample of 5,295 youths designed to oversample civilian Hispanic, black, and economically disadvantaged non-black/non-Hispanic youths living in the United States during 1979 and born between January 1, 1957, and December 31, 1964.

• A military sample of 1,280 youths designed to represent the population born between January 1, 1957, and December 31, 1961 (ages 17 to 21 as of December 31, 1978), and enlisted in one of the four branches of the armed forces.

However, some important changes to the design have taken place over time. Of greatest importance is the fact that the entire economically disadvantaged, non-black/non-Hispanic subsample was dropped following the 1990 survey.

As with the National Longitudinal Survey of Young Women, the primary focus of the NLSY is labor force behavior; however, the content of the survey is considerably broader. For example, the survey includes detailed questions on educational attainment, training investments, income and assets, health conditions, workplace injuries, insurance coverage, alcohol and substance abuse, sexual activity, and marital and fertility histories. According to the detailed description, the "fertility data include information on all pregnancies/live births, a cumulative inventory of all children reported and the residence status of all children, contraceptive methods utilized, birth expectations and wantedness information, confidential abortion reports, as well as ages at menarche and first intercourse. A supplemental set of constructed and edited fertility variables provides: (a) revisions to dates of birth, gender, and usual living arrangements for all respondents' children; (b) constructed variables such as beginning and ending dates of marriages, ages at first marriage/first birth, spacing between births and between marriage and first birth; and (c) a variable evaluating the consistency of each respondent's longitudinal fertility record between the 1979 and 1982 survey years." Again, as with the NLSYW, the notion of cohabitation is problematic, and discussion with survey methodologists at the Bureau of Labor Statistics highlighted that prior to the 1997 wave it is impossible to correctly classify cohabiting individuals (NLS 2011).

Like the other NLS surveys, each cross-sectional wave has its own weight, which can be used to produce nationally representative design-based descriptive estimates for that wave. Because the weights are specific to a single cross-sectional year of the study, and respondents occasionally miss an interview but are contacted in a subsequent wave, we again have a problem similar to item nonresponse when the data are used longitudinally. Using the cross-sectional weights will result in nationally representative design-based estimates; these annual weights correct the analysis for the complex survey design in a particular year. However, as with other longitudinal surveys, when a research project spans multiple survey waves one needs to create a custom set of survey weights that adjust both for the complexity of the survey design and for the use of data from multiple waves. So we can see that the populations of inference for the NLSY longitudinal and cross-sectional analyses are distinct from those for the NLSYW cross-sections, which, as we have already noted, are not the same as the populations of inference for the PSID or the NCPP.

NSFG – The National Survey of Family Growth

The National Survey of Family Growth (NSFG) is not a longitudinal survey but a cross-sectional one, with a new cycle (as opposed to wave) each year (CDC 2011). Each cycle has a different multi-stage probability sample (Lepkowski et al. 2010). The target population of the NSFG consists of the civilian population of women 15–44 years of age living in the conterminous United States who are currently married, previously married, or single mothers with offspring of their own living in households at the time of the interview, in this case 1995.

The NSFG is designed and administered by the National Center for Health Statistics (NCHS), an agency of the Department of Health and Human Services. The purpose of this survey is to produce national estimates of health conditions and an information base on factors affecting pregnancy, such as sexual activity, contraceptive use and sources of family planning services, infertility, and the health of women and infants. It is important to note that some of the pregnancy data collected required long-term recall on the part of the respondent: although Norberg chose to include only births from 1991 to 1995, the data file covers births from 1968 to 1995. Once again, Norberg gives no justification for her decision not to use the early years of data.

As with all surveys, design-based point and variance estimation must use the sampling weights. However, the original design weight can be calibrated, post-stratified, or otherwise adjusted to compensate for things like nonresponse. The technical documentation for the NSFG states that data analysts should use the "fully adjusted sampling weight." This adjusted weight is derived from the sampling design and has been adjusted to compensate for nonresponse and calibrated to agree with external totals, such as those by age, race/ethnicity, marital status, and parity from the Bureau of the Census. We make a final observation that the cross-sectional population of inference for the NSFG is not the same as that of any of the other studies.

Inference using Longitudinal and Cross-sectional Designs

As with other forms of analysis, we need to consider the study design and the type of data we are analyzing. However, unlike in a single study, in meta-analysis we need to note the differences in study design and data type across studies, and we have to understand how the different types of data lead to different populations of inference. In this case, we have one cross-sectional survey (NSFG) and 4 longitudinal studies: one a prospective epidemiological cohort (NCPP) and the other three longitudinal surveys (PSID, NLSY, NLSYW). As noted above, a longitudinal study is one that collects data from the same sample elements on multiple occasions over time. As Lynn (2009) observes, there are many advantages to a longitudinal design, as it offers a number of options both for data collection and for analysis which are not possible with a cross-sectional survey. For example, we can analyze unit-level change, take measures of stability or instability, make time-related analyses such as of the duration of an event, and even attempt to look at causality.

Next, the population of inference for a longitudinal study is different from that of a cross-sectional study, which is designed to produce estimates for a particular population at a particular time point and uses cross-sectional weights to ensure design-consistent estimates. A longitudinal survey, in general, has two populations that it can cover and thus two sets of weights: one that can be used to produce cross-sectional estimates and one for longitudinal analysis. The longitudinal weights represent the original study population, whereas the cross-sectional weights represent the annual population. In general, however, cross-sectional analysis from longitudinal data is inferior, even with the correct weights. Changes in the population, in particular additions from immigration, are often not well represented in future waves of a longitudinal survey unless adjustments are made to the design to compensate for this; such is the case with the PSID. Furthermore, data collection differs for the two types of study. Although both can produce longitudinal data, cross-sectional surveys that have longitudinal data obtain it by asking respondents about events in the past. This type of data is, however, susceptible to recall bias.

Moreover, as Plewis (2007) notes, the defining characteristic of a longitudinal study is that we repeatedly measure the same individuals (or cases). This implies that the datasets are hierarchically structured; that is, we have repeated measures that can be thought of as correlated measures within each individual. Thus, our analysis must somehow take this within-subject correlation into account. Otherwise, as Roberts (2009) observes, inferences on our model parameters could be invalid because of the underestimation of the standard errors used for p-values and t-tests.

As we noted previously, in meta-analysis we would like comparable data sets. Norberg's article has one study with a target population consisting of patients from a series of hospitals and clinics over the years 1959–1966; another with a target population of young women aged 14 to 24 in 1967; another with a target population of households in 1969; yet another with a target population of young men and women who were 14–22 years old in 1979; and, finally, one with a target population of women aged 15–44 with offspring living in households in 1995. The five studies do not cover the same finite population. Nor do they cover complementary portions of a larger, well-defined finite population. If we look across the time periods of the studies, we see that we do not even have representation of every single cross-sectional year, never mind all the birth cohorts over this period (see Figure 5.1). We also know that most of the surveys oversampled a portion of the US population. So constructing a finite population of inference would be complex, if not impossible, in this situation. Thus, we would never create a single dataset out of these five studies unless we could show that the sample design in each of the surveys was ignorable.

Figure 5.1 Study Time Periods

[Figure: timeline, 1955–2000, showing the periods covered by the NLSYW, PSID, NLSY, NSFG, and NCPP.]

As we can see, neither the target populations, nor the time periods, nor the designs of these studies are the same, yet Norberg takes none of these differences into account. Instead, she simply puts all the individual data into one large dataset, treats all the observations as a single study, and assumes in her analysis that the cases are independent and identically distributed. At best, one could consider combining the four surveys, provided we made strong assumptions about the model and did not try to define a finite population. We start our reanalysis by determining that we will use the longitudinal weights for each of the longitudinal surveys and the cross-sectional weight for the NSFG, so that we represent the target population of each survey at the time of the study. We cannot, however, use the NCPP data, as it is not a survey (and therefore we cannot do design-based estimation), and, as noted previously, the model assumptions we would need do not hold. For this reason, our reanalysis does not include the NCPP data.

The US Population over Time

The period covered by the datasets used by Norberg saw drastic changes in the legislative, sociodemographic, and environmental conditions of the United States. As a consequence, demographic and biological variables that can affect the sex ratio have also changed dramatically. In an analysis over time, one must consider the effect of changes in the population in addition to the direct effects of the explanatory variables on the sex ratio at birth. If we look at Figure 5.2, we can see that over the time periods covered by each study, as shown at the top of the graph, the rates of births, infant deaths, marriages, and divorces in the United States are not stable.


Figure 5.2 Births, Infant Deaths, Marriage and Divorce in the United States: 1957–1995

[Figure: rates per 1,000 population, 1955–2000, for births, infant deaths, marriages, and divorces, with the periods of the NCPP, NLSYW, PSID, NLSY, and NSFG marked along the top.]

Source: US Census Bureau, Statistical Abstract of the United States

Cause, Odds Ratios and Risk Differences

In addition to objecting to Norberg's failure to appreciate the issues and assumptions involved in combining these data, we feel she is using the wrong point estimate. As she is interested in the effect of marital status on sex ratios, her research is addressing a causal question and thus a model, though we caution readers about using meta-analysis for causal questions. Such caution is necessary because unless we expect similar distributions of causal factors across populations or time periods (which, as noted previously, is not the case here), we should not be surprised if effect measures vary. However, if hypothetically we had similar distributions of the causal factors across populations or time periods, it is just as likely that uncontrolled confounders have similar distributions, meaning that the consistency, and thus homogeneity, of an association across data sets does not provide logical support for causality (we make a formal note that causality implies homogeneity, not vice versa) (Kaufman 2011). For these reasons, meta-analyses should not, when we are interested in an underlying model, estimate a fictional common effect, but rather search for sources of systematic variation and similarity among study results, and this search should be done with the most appropriate model form.

One way to approach a causal question is to think about counterfactuals. A statement about what would have happened under conditions contrary to the actual ones is a counterfactual conditional (Lewis 1986). For example, if we had one population to which we gave a treatment, and we considered the possible outcomes had it never received the treatment, we would have a counterfactual conditional population. In practice, however, we do not know this population and therefore use a proxy in its place.

We can use a counterfactual representation of the population to help illustrate our choice of estimator (where we have outcomes in the cases that the event did or did not happen). Following Copas (1973) or Robins and Greenland (1989), we break the population into 4 parts: a portion that would be doomed regardless of the event, those whom the event would help, those whom the event would hurt, and those who would be fine regardless of whether the event happened or not. Applying the same ideas to sex ratios, we have the following causal framework for the effect of marital status on sex ratios, where the first column (unmarried) is our actual event and the second column is our counterfactual:

Table 5.1 Counterfactual Contingency Table

Type   Unmarried   Married     Interpretation
1      Baby girl   Baby girl   "Doomed": will have girls regardless
2      Baby girl   Baby boy    "Causative": being unmarried has a direct effect on gender
3      Baby boy    Baby girl   "Preventative": being unmarried has a direct effect on gender
4      Baby boy    Baby boy    "Immune": will always have boys

Using the counterfactual framework, we can see from Table 5.1 that the proportions within each case can be listed as follows:

Table 5.2 Proportions in Counterfactual Contingency Table

Type   Unmarried   Married     Interpretation
1      P1          Q1          Proportion "doomed" to have girls
2      P2          Q2          Proportion "causative": being unmarried has a direct effect on gender
3      P3          Q3          Proportion "preventative": being unmarried has a direct effect on gender
4      P4          Q4          Proportion "immune": will always have boys

We can then use this table to illustrate some of the main points that Greenland makes in his 1987 article on the interpretation and choice of effect measures, and why we suggest using a risk difference. Greenland (1987) argues that for summarizing exposure impact on risk, the risk difference and/or the risk ratio should be used. Using our counterfactual table above, the risk ratio is defined as

$$ RR_{causal} = \frac{P_{doomed} + P_{causative}}{P_{doomed} + P_{preventative}} , $$

whereas the risk difference is calculated as

$$ RD_{causal} = P_{causative} - P_{preventative} , $$

where $P_{doomed}$ is the proportion who would have girls regardless of marital status, $P_{causative}$ is the proportion who have girls only if unmarried, and $P_{preventative}$ is the proportion who have girls only if married. If the population has a large number of doomed individuals, then the risk ratio will tend towards 1, which means the ability to determine the effect of the event on the population will be limited. This is our case, as having either a boy or a girl is not a rare condition. Greenland notes that in the absence of bias these two measures possess direct interpretations in terms of exposure effect on average risk.
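A toy calculation makes the point about doomed individuals concrete (the proportions below are invented for illustration only and are not estimates from any of the studies):

data causal_measures;
   /* hypothetical proportions for the unmarried column of Table 5.2 */
   p_doomed       = 0.45;   /* girls regardless of marital status */
   p_causative    = 0.05;   /* girls only if unmarried            */
   p_preventative = 0.04;   /* girls only if married              */
   rr_causal = (p_doomed + p_causative) / (p_doomed + p_preventative);
   rd_causal = p_causative - p_preventative;
   put rr_causal= 6.4 rd_causal= 6.4;   /* RR is near 1 (about 1.02),  */
run;                                    /* while RD (0.01) is direct   */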

Greenland also stresses that the odds ratio is "biologically interpretable only insofar as it estimates the incidence-proportion or incidence-density ratio." The odds ratio will only approximate the incidence proportions if the condition is rare, which in our case it is not (Greenland also shows that the rare-condition assumption is not adequate to interpret the odds ratio as the change in average risk odds).

The odds ratio, on the other hand, can be calculated in different ways, and thus its intended interpretation should always be confirmed before it is calculated. Here, we suppose that every individual $i$ has a probability of our outcome under exposure, labeled $r_{1i}$, and another probability of our outcome under non-exposure, labeled $r_{0i}$. First we have the incidence-odds ratio


$$ \frac{\displaystyle\sum_{i:\,exposed} r_{1i} \Big/ \displaystyle\sum_{i:\,exposed} \left(1 - r_{1i}\right)}{\displaystyle\sum_{i:\,unexposed} r_{0i} \Big/ \displaystyle\sum_{i:\,unexposed} \left(1 - r_{0i}\right)} $$

(where the sums in the numerator and denominator are over the exposed and the unexposed, respectively).

If confounding is absent, this version of the odds ratio is equivalent to the counterfactual contrast of interest (when we have our true populations); that is, it is the proportionate change in the incidence odds among the exposed that is due to the exposure.

However, we cannot equate this odds ratio to an average. The odds formulation above is not the proportionate change in the average odds in the exposed that is due to the exposure (which is a different calculation), nor is it the average of the individual odds ratios (yet another calculation). However, if the individual risk-odds ratios are constant (as we assume in a logistic model), then the second two formulations of the odds ratio become equivalent, although they still do not generally equal the incidence-odds ratio above. And, as Greenland notes, the odds ratio will not estimate the average change in odds among exposed individuals either, except under implausible restrictions. Additionally, it is risk ratios (not odds ratios) that will not change if adjustment is made for a variable that is not a confounder (Cummings 2009).

So, instead of the odds ratio or risk ratio, we consider the risk difference. Again, we suppose that every individual $i$ has a probability of our outcome under exposure, labeled $r_{1i}$, and another probability of our outcome under non-exposure, labeled $r_{0i}$. The risk difference for each individual is then $r_{1i} - r_{0i}$. Over an entire population, therefore, we can construct the risk difference in two ways: either by taking each individual's difference and averaging over the population, or by taking the average risk under exposure and subtracting the average risk under non-exposure. These two methods of calculating a risk difference are equivalent, and thus we can interpret the risk difference both as the absolute change in the average risk of the exposed cohort that is due to exposure (the difference of the average individual risks) and as the average absolute change in risk that is produced by exposure among exposed individuals (the average of the individual risk differences). It is for these reasons, namely, the issues with the odds-ratio interpretation, the assumption of constant risk-odds ratios, and the non-rarity of the condition, that we suggest using a risk difference. Of course, this assumes that we have no effect modification and that the additive model assumption holds. We confirm these assumptions in our data analysis.
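The equivalence of the two constructions is simply the linearity of the average; for a population of $N$ individuals,

$$ \frac{1}{N}\sum_{i=1}^{N}\left(r_{1i} - r_{0i}\right) \;=\; \frac{1}{N}\sum_{i=1}^{N} r_{1i} \;-\; \frac{1}{N}\sum_{i=1}^{N} r_{0i} , $$

so the average of the individual risk differences and the difference of the average risks coincide. No analogous identity holds for odds ratios, which is the source of the interpretive problems described above.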

Reproducing Results – Validating and Recoding Data

In her paper, Norberg classifies partnership status as married, cohabiting, or alone. But as we observed at the beginning of the reanalysis, in none of the studies was marital status collected in a way that maps easily onto this classification. It is therefore apparent that Norberg recoded the data in order to make it fit this classification. As partnership status was the exposure of interest in her study, and the exposure was not measured as per her classification, we must discuss the possible effects of measurement error on the analysis.

Armstrong (1998) summarizes what is known about the effects of measurement error in the exposure on the results of a study. As the exposure here is a categorical variable (i.e. married, single, widowed, cohabiting, divorced, or other), measurement error is in effect misclassification. There are two types of measurement error: random and systematic. Random misclassification of an exposure usually biases the effect measure towards the null, whereas systematic misclassification (especially in subgroups) can bias the estimate in either direction. Once again, Norberg makes no reference to the potential sources of bias related to her recoding.

NSFG

Starting with the NSFG, we attempted first to recreate the tables in the paper. As the studies did not have a category for cohabiting, the data had to be recoded where possible (and in several datasets it is not actually possible to recode without strong assumptions that cannot be proven valid). We contacted Norberg to discuss the recode, and for the NSFG we acquired her computer programs (Norberg 2005). However, even with her programs and some email exchange, we were not able to reconstruct the values in the published tables. We found several errors in the programs, and we noted that there could be potential bias due to the recoding (misclassification). We discussed this with the author; she felt that the programs must have been older versions, though she did not send an update, and she stopped responding to our queries. As this was the case, we decided instead to run the analysis for each study as closely as possible to the concepts that Norberg was attempting to use. All recoding and data analysis was done with SAS 9.1.3® and is given in Appendix 2.

The NSFG has a variable FMARCON5, which is the formal marital status at conception. There are five levels: 1 = married, 2 = divorced, 3 = widowed, 4 = separated, 5 = never married. Unfortunately, the never married category could represent both women who have always been without partners and those who are cohabiting but not married, an issue that arises in all the other datasets as well. Fortunately, however, the NSFG asked a wealth of questions regarding the start and end of relationships, and the derivation of cohabiting status was possible using the start and end dates of relationships as reported by the respondent. We note, though, that these were all recall questions and could be subject to bias. There were 5 marital start dates (1st marriage through 5th marriage), 5 marital divorce dates (1st marriage through 5th marriage), 82 cohabitation start dates (1 through 82), and 104 variables relating to the end of a relationship or marriage, including separations and deaths. Each pregnancy was related to the cohabiting status at the time of conception to determine whether the woman was married, cohabiting, or without a partner. This recoded variable (with outcomes married, cohabiting, or without a partner) was then recoded again into a binary variable: living together or living apart.
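A sketch of the kind of recode involved is given below. We stress that the dataset and variable names (nsfg_preg, mar_beg1-mar_beg5, coh_beg1-coh_beg82, concep_dt, and so on) are illustrative placeholders, not the actual NSFG names, and that a real recode must also handle the separation and death variables and ongoing unions with missing end dates:

data nsfg_recode;
   set nsfg_preg;
   array mb[5]  mar_beg1-mar_beg5;    array me[5]  mar_end1-mar_end5;
   array cb[82] coh_beg1-coh_beg82;   array ce[82] coh_end1-coh_end82;
   length status $12;
   status = 'alone';
   do j = 1 to 82;                    /* cohabiting at conception?  */
      if cb[j] > .z and cb[j] <= concep_dt <= ce[j] then status = 'cohabiting';
   end;
   do j = 1 to 5;                     /* marriage takes precedence  */
      if mb[j] > .z and mb[j] <= concep_dt <= me[j] then status = 'married';
   end;
   living_apart = (status = 'alone'); /* binary: together vs apart  */
run;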

Norberg's article reports that the unweighted, unadjusted odds ratio was 1.195 and indicates that it was significant. As we have already noted throughout this thesis, we cannot ensure that the unweighted estimates are unbiased for the finite population, and we again note that tests or p-values based on the unweighted data are not accurate. In our reanalysis, we found several differences between our results and Norberg's: first, we have 63 more births in the NSFG sample. We also did not reproduce exactly the same proportions of cohabiting women, and thus did not recreate the proportions living together and living apart. What we did find is summarized in Table 5.3. First, we found that the risk difference is never significant, whether for the subset of the data from 1991–1995 or for the whole period, weighted or unweighted. To decide whether the weighting is needed, we want two things: that the point estimate is nearly the same and that the variance is also similar. In this case, in spite of the non-significant findings, we would not suggest ignoring the weights, as the confidence intervals and estimates differ with and without weighting, and our design-based estimates have a clear interpretation.

Table 5.3 Partnership status and likelihood of a male child: NSFG

                            Paper results   Our results    Our results    Our results    Our results
                            unweighted      unweighted     weighted       unweighted     weighted
Years                       1991–1995       1991–1995      1991–1995      1964–1995      1964–1995
n (births)                  3,294           3,357          17,052,586     14,906         73,937,791

Household type
  Married                   64.5%           60.2%          65.1%          61.2%          66.3%
  Cohabiting                10.6%           10.2%          8.7%           7.9%           6.7%
  Living apart              24.9%           29.5%          26.2%          31.9%          37.0%

Percentage of male births
  Parents share household   51.2%           52.1%          52.2%          52.6%          52.8%
  Living apart              49.9%           49.2%          48.6%          53.0%          52.9%

Odds ratio*                 1.195
Risk difference                             0.0286         0.0357†        0.0032         0.00477†
Standard error                              0.0189         0.0228†        0.0089         0.0102†
CI                                          (−0.0085,      (−0.0009,      (−0.0141,      (−0.0153,
                                             0.0656)        0.0808)        0.0206)        0.0249)

* The paper gave only a range, not the actual p-value; p-value < 0.05.
† The risk difference and variance are of the form $\hat{\theta}^{F(F)}$.

In our tables, we note that the weighted sample size reflects the number of births in the target population. Thus we conclude, using the NSFG data, that there is no crude (or unadjusted) effect of partnership status on the secondary sex ratio.


One may ask if this effect is stable if we control for other confounders. Norberg pools the data and then examines the impact of other confounders on the effect of partnership status on the secondary sex ratio. She looked at the impact of parents' age, race, parents' education, child's birth order, child's birth year and family income. As the data came from a cross-sectional survey, all the covariates used in Norberg's analysis were from the same time point, the survey interview date. So unlike the other longitudinal data, where things like income and education could be treated as time-varying covariates, here we have only a fixed set of covariates that reflect the covariate status at the time of the survey. This is yet another difference between these studies and makes the notion of pooling them in order to run a large regression unreasonable. We used SAS proc surveyreg in order to produce design-based estimates of the risk differences adjusted for parental education, parity, child's birth year, parents' race, parental age and family income. We also confirmed that the assumption of an additive model held and that none of the covariates was an effect modifier. As with the crude differences, the adjusted risk differences were not significant. We found that the adjusted risk difference for the births between 1991 and 1995 was 0.0277 (SE 0.0281; CI -0.0278, 0.0833) and that the adjusted risk difference for the whole data set was 0.00083 (SE 0.0117; CI -0.0223, 0.0240).
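For concreteness, a minimal sketch of such a call follows, with hypothetical dataset and variable names (the actual NSFG design variables belong in the STRATA, CLUSTER and WEIGHT statements); fitting a linear probability model to the binary outcome makes the living-apart coefficient the adjusted risk difference.

proc surveyreg data=nsfg_births;
  strata strat;                  /* design strata (hypothetical name)    */
  cluster psu;                   /* first-stage sampling units           */
  weight finalwgt;               /* final survey weight                  */
  class race parent_educ;        /* categorical covariates               */
  model male = living_apart parent_educ parity birth_year race parent_age
               family_income / solution clparm;
run;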

NLSY79

The next data set we reanalyzed was the National Longitudinal Survey of Youth 1979 (NLSY79). As we did not have Norberg's programs as a starting point, we began with a detailed review of the data questionnaires and dictionaries. In the course of this review we noted that at several time points either the actual question, the categories or the wording changed between survey dates. This is unfortunately very common in longitudinal studies. As these changes could constitute a large source of measurement error, we attempted to find variables that reflected the concepts in Norberg's analysis but required the least manipulation. For both the NLSY79 and the NLSYW we consulted (via email) with a methodologist from the Bureau of Labor Statistics about how to recode variables, in particular how to derive cohabitation. It became clear that there was no way to identify the actual cohabitation status in the earlier years of the survey (NLS 2011). The methodologist felt that it was only reliable to do so with the NLSY97 data. Although marital status does not account for persons who are cohabiting (unless the respondent self-identifies as married), we decided to use marital status as defined on the data file, rather than make greater assumptions and recode the data for half the time period.

Once again we ran the analysis using both the subset of dates that Norberg chose and all births in the file. As with the NSFG, we found no significant risk difference in any of our analyses. Table 5.4 summarizes our findings, and as with the NSFG, we do not recommend ignoring the weights in this analysis: there are differences in both the point estimate and the variance estimate between the weighted and unweighted analyses, and we feel that the design should not be ignored.


Table 5.4 Partnership status and likelihood of a male child: NLSY79

                           Paper results   Our results   Our results   Our results    Our results
                           (unweighted)    (unweighted)  (weighted)    (unweighted)   (weighted)
Years                      1980-1998       1980-1998     1980-1998     1970-1998      1970-1998
                                                                       (all years)    (all years)
n (births)                 17,152          13,090        52,107,967    13,599         53,768,941
Household type
  Married                  64.1%           61.3%         72.3%         60.7%          72.0%
  Living apart             35.9%           38.7%         27.7%         39.3%          30.0%
Percentage of male births
  Parents share household  51.6%           51.6%         50.1%         51.6%          50.0%
  Living apart             50.0%           50.4%         49.9%         50.1%          50.0%
Odds ratio*                1.066
Risk difference                            0.0163        0.0051†       0.0143         0.0057†
Standard error                             0.0090        0.0120†       0.0088         0.0117†
CI                                         (-0.0013,     (-0.0184,     (-0.0029,      (-0.0172,
                                            0.0339)       0.0287)       0.0315)        0.0287)

* p < 0.05; † risk difference and variance are of the form $\hat{\theta}^{F}(F)$.

Unlike the NSFG, we do not have a single time period for our covariates: rather, here we have covariates that change over time. Once again we attempted to take variables that were as consistent as possible over time. We used custom longitudinal survey weights (for respondents from all waves) and, as there is no cluster or stratification variable on the dataset, we used the parent as the cluster (as there are repeated measures for each parent). We found that after accounting for parental education, parity, child's birth year, parents' race, parental age and family income, the risk difference for the births from 1980-1998 was -0.0224 (SE 0.0414; CI -0.1037, 0.0589) and the adjusted risk difference for the whole data was 0.00052 (SE 0.0362; CI -0.0706, 0.0716). This once again reaffirms our previous analysis: there is no evidence to support an effect of parents living together or apart on the secondary sex ratio.

NLSYW

The next data set we reanalyzed was the National Longitudinal Survey of Young Women (NLSYW). It is very similar in structure to the NLSY79. We followed the same basic procedure, reading the questionnaires and survey guides and discussing the survey with methodologists at the Bureau of Labor Statistics. As with the NLSY79, there was no way to identify the actual cohabitation status in the earlier years of the survey. The methodologist felt that for the early years the number of cohabiting women would be small, and that in later years there was the possibility that cohabitation was self-identified as marriage. Although marital status does not account for persons who are cohabiting (unless the respondent self-identifies as married), we decided to use marital status as defined on the data file, rather than make greater assumptions and recode the data for half the time period.

Once again we ran the analysis using both the subset of dates that Norberg chose and all births in the file. As with the two previous surveys, we found no significant risk difference in any of our analyses. Table 5.5 summarizes our findings, and as with the other surveys, we do not recommend ignoring the weights in this analysis. There are differences in both the point estimate and the variance estimate between the weighted and unweighted analyses, and we feel that the design should not be ignored.


Table 5.5 Partnership status and likelihood of a male child: NLSYW

                           Paper results   Our results   Our results   Our results    Our results
                           (unweighted)    (unweighted)  (weighted)    (unweighted)   (weighted)
Years                      1970-97         1970-97       1970-97       1952-1997      1952-1997
                                                                       (all years)    (all years)
n (births)                 6697            3146          10,942,602    3558           12,391,957
Household type
  Married                  72.2%           85.1%         80.1%         81.4%          79.1%
  Living apart             27.8%           14.9%         29.1%         18.5%          20.2%
Percentage of male births
  Parents share household  53.2%           50.9%         51.7%         50.8%          51.6%
  Living apart             50.8%           48.9%         51.3%         49.2%          50.7%
Odds ratio*                1.100
Risk difference                            0.0196        0.0033†       0.0163         0.0087†
Standard error                             0.025         0.0287†       0.0226         0.0266†
CI                                         (-0.0293,     (-0.0530,     (-0.0281,      (-0.0434,
                                            0.0685)       0.0598)       0.0607)        0.0610)

* p < 0.10; † risk difference and variance are of the form $\hat{\theta}^{F}(F)$.

Like the NLSY79, we have time-varying covariates here; that is, the levels or values of the covariates change over time. Once again we attempted to take variables that were as consistent as possible over time. We used custom longitudinal survey weights (for respondents from all waves) and, as there is no cluster or stratification variable on the dataset, we used the parent as the cluster (as there are repeated measures for each parent). We found that after accounting for parental education, parity, child's birth year, parents' race, parental age and family income, the risk difference for the births from 1970 to 1997 was 0.0087 (SE 0.07434; CI -0.0394, 0.2532) and the adjusted risk difference for the whole data was 0.0805 (SE 0.0642; CI -0.0458, 0.2068). This once again reaffirms our previous analysis: there is no evidence to support an effect of parents living together or apart on the secondary sex ratio.

PSID

The final data set we reanalyzed was the Panel Study of Income Dynamics (PSID). We began with a detailed review of the data questionnaires and dictionaries. In the course of this review, we noted that at several time points either the actual question, the categories, or the wording changed between survey dates. The PSID has a data retrieval site, and its documentation is exemplary; it was particularly useful for this analysis. In addition to complete data sets for all respondents, the PSID provides users with specialized data files (PSID 2011). One such file was the Child Adoption and Birth file, which contains information on all births and is already constructed from the longitudinal survey data. Clearly, from the data counts and the absence of male respondents, Norberg did not use this supplementary file and must have selected births in a different manner, either using the family information or the household information. As recoding is another source of error for an analyst, this type of file, created directly by the data provider, is invaluable. In addition to avoiding spurious recodes, this data file was easily linked to individual respondent data, since education, parental race and income were only available on the individual file. As with the other studies, cohabitation was not asked about or defined until later waves in the PSID, and we therefore once again used the survey coding for marital status.

Once again we ran the analysis using both the subset of dates that Norberg chose and all births in the file. As with the other data sets, we found no significant risk difference in any of our analyses. Table 5.6 summarizes our findings, and as with the other surveys, we do not recommend ignoring the weights in this analysis. As with the other data sets, there are differences in both the point estimate and the variance estimate between the weighted and unweighted analyses, and we feel that the design should not be ignored.

Table 5.6 Partnership status and likelihood of a male child: PSID

                           Paper results   Our results   Our results   Our results    Our results
                           (unweighted)    (unweighted)  (weighted)    (unweighted)   (weighted)
Years                      1969-97         1969-1997     1969-1997     1958-1997      1958-1997
n (births)                 6187            31,628        247,487       48,746
Household type
  Married                  51.7%           70%           81.9%         75.9%          88.5%
  Living apart             48.3%           30%           18.1%         24.1%          11.5%
Percentage of male births
  Parents share household  51.1%           51.4%         50.1%         51.3%          50.4%
  Living apart             49.9%           50.8%         49.9%         50.7%          49.6%
Odds ratio*                1.048
Risk difference                            0.0058        0.0036†       0.0059         0.006†
Standard error                             0.0061        0.0153†       0.0053         0.0130†
CI                                         (-0.0062,     (-0.0264,     (-0.0045,      (-0.0188,
                                            0.0178)       0.0337)       0.0163)        0.0324)

* NS; † risk difference and variance are of the form $\hat{\theta}^{F}(F)$.

Unlike the NSFG, we do not have a single time period for our covariates: here we have covariates that change over time. Once again we attempted to take variables that were as consistent as possible over time. The PSID provides information across survey waves, grouping consistent questions for the analyst, which facilitated the analysis. We used the longitudinal survey weights from 1993 and the cluster and stratification variables to ensure we calculated appropriate design-based estimates. We found that after accounting for parental education, parity, child's birth year, parents' race, parental age and family income, the risk difference for the births from 1969-1997 was -0.0147 (SE 0.03109; CI -0.0757, 0.04624) and the adjusted risk difference for the whole data was -0.0132 (SE 0.03101; CI -0.07407, 0.04756). This is once again consistent with our previous analyses: there is no evidence to support an effect of parents living together or apart on the secondary sex ratio.

Combining (or Not Combining) the Data

When we pool data into a single large data set or use meta-analysis, we need to have an idea about which model we are assuming and we need to know what form of point estimate (finite population or superpopulation) we need. As we have already discussed, in the survey case, in order to pool the individual data into one large dataset we need either data that represent a clearly defined finite population (in which case we would need to adjust the weights to reflect that finite population), or data where the survey design is ignorable for the analysis for all datasets. We have neither of those conditions here. Most analysts will prefer to have a larger data set, and thus this is an important and often overlooked part of meta-analysis: deciding not to combine data.
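For the first condition, one common convention in the survey-pooling literature (not applicable here, since the condition fails) is to rescale each survey's weights by composite factors before concatenating the files:

\[
w^{*}_{ki} = \alpha_{k}\, w_{ki}, \qquad \sum_{k=1}^{K} \alpha_{k} = 1,
\]

where $w_{ki}$ is the weight of unit $i$ in survey $k$; the simplest choice is $\alpha_{k} = 1/K$, so that the pooled weights still sum to the size of the shared finite population.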

As the different data sets do not come from a single clearly defined finite population, the only way to combine them is using our models in Case 2 or Case 3. In Case 2, we assume that a single superpopulation model generated different finite populations and that the surveys are samples from these conceptually linked finite populations. In this case, if we make the assumption that the finite populations from which all four surveys were selected were all generated under the same model, we would suggest testing for homogeneity of effect to determine if we are estimating the parameters of the superpopulation or of the hyper-superpopulation. As the weighted effect is not different from zero in each study, our model would be that the effect of partnership status is the same across studies and is zero for all finite populations generated under this model.

In theory, we would test the hypothesis using the superpopulation variances for each point estimate (adjusted by the design effect of the superpopulation variance); however, for practical reasons we can start with the finite population estimates. If we find evidence to support the conclusion that the effects are homogeneous (and we are in Case 2), that conclusion will still hold under the superpopulation, as the variance under the superpopulation model (jointly with the design) is always larger than the variance under finite sampling. However, if we do not find evidence of homogeneity using the design-based estimates, we should confirm that the conclusion still holds with the larger superpopulation variances. In our case, using the design-based estimates and design effects calculated in SAS, Q* is 0.4785 and is not significant (p = 0.9); however, as with the previous example by Thomas and Wannell, we are only combining four studies, so the test results should be interpreted with caution. With caution, then, we will make the assumption that the model holds and that we are in Case 2. So, refining our model, as we have consistent estimates of no risk difference, the hypothesized superpopulation model would be one where the secondary sex ratio is independent of partnership status.
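The computation itself is straightforward once the estimates, their variances and the design effects are in hand. The following is a minimal SAS/IML sketch with placeholder inputs, not the actual values behind Q* above (the correction used in this thesis is developed in Chapter 4; this sketch simply inflates each variance by its design effect):

proc iml;
  est  = {0.005, -0.006, 0.009, 0.006};   /* placeholder design-based risk differences */
  se   = {0.010, 0.012, 0.027, 0.013};    /* placeholder standard errors               */
  deff = {1.8, 2.1, 1.5, 1.9};            /* placeholder design effects                */
  w    = 1/((se##2)#deff);                /* weights from deff-inflated variances      */
  tbar = sum(w#est)/sum(w);               /* inverse-variance pooled estimate          */
  Q    = sum(w#(est-tbar)##2);            /* homogeneity statistic                     */
  p    = 1 - probchi(Q, nrow(est)-1);     /* reference chi-square on K-1 df            */
  print tbar Q p;
quit;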


We now move to the issue of calculating the variances for each estimate under a superpopulation model since, as noted in Chapter 3, we should be combining estimates of the form $\hat{\theta}^{F}(S)$. There are some practical obstacles to these calculations for an external data user. In order to estimate the correction factors used in Korn and Graubard we need to have (in addition to the sampling design, when the design is a stratified multistage cluster design using simple random sampling in the first stage) the total number of clusters, the number of clusters in each stratum in the population and the number of observations in each cluster within each stratum. For stratified multistage cluster sampling with PPS sampling at the first phase we also need to know the selection probabilities of the clusters at the first phase.
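The reason the superpopulation variance dominates the design-based one can be stated compactly. Under the joint design-model distribution, a standard decomposition (written here in the notation of Chapter 3) is

\[
\operatorname{Var}_{\xi p}\bigl(\hat{\theta}^{F} - \theta^{\xi}\bigr)
 \;=\; E_{\xi}\!\bigl[\operatorname{Var}_{p}\bigl(\hat{\theta}^{F}\mid F\bigr)\bigr]
 \;+\; \operatorname{Var}_{\xi}\bigl(\theta^{F}\bigr),
\]

so the variance appropriate to $\hat{\theta}^{F}(S)$ adds the model variability of the finite population parameter $\theta^{F}$ to the expected design variance and can never be smaller than the design variance alone.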

In the case of our four surveys we do not have this information available. In particular, for the NLSY79 and the NLSYW there is no longer any cluster or stratum variable on the dataset. The methodologist at the BLS noted that, due to the mobility of the cohort, they no longer kept track of which cluster the respondent was in at the time of selection. This makes adjusting these estimates correctly impossible. We can, however, estimate the variances in SAS using the Taylor-linearization option, which approximates the with-replacement variance estimator; this will always be closer to the correct model variance than the without-replacement variance estimator.
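In SAS this is simply the default variance method. A minimal sketch, with hypothetical variable names (custwt for the custom longitudinal weight, male for the male-birth indicator, parent_id for the parent identifier):

proc surveymeans data=nlsy79 mean stderr clm;
  weight custwt;        /* custom longitudinal weight (hypothetical name) */
  cluster parent_id;    /* parent as cluster: repeated births per parent  */
  var male;             /* binary male-birth indicator                    */
run;

/* With no STRATA statement and the default VARMETHOD=TAYLOR, the variance
   is the with-replacement Taylor-linearization approximation described
   above. */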

Next, for the PSID, we attempted to get some of these values. First, according to the documentation on the design, the PSU was a county (or a similar grouping of geographic data); there were roughly 3000 counties in the 48 contiguous states in 1968. And we can use the estimated stratum totals for the population within each cluster-stratum group. However, in the documentation the original design had 76 strata, whereas the analytical file has only 54. We can find no documentation on which strata were combined or dropped. We do know the strata were collapsed to allow for variance calculations (and we also know that some of these strata were part of the sample that was dropped in the redesign). Nor can we find the PPS sampling weights. We have only heavily calibrated, adjusted weights (calibrated for attrition, death, nonresponse and population controls), so attempting to use the correction with the correct sampling design (PPS at the first phase) is not possible.

The same holds true for the NSFG: although we have documentation indicating that the PSU is defined as a group of counties and independent cities, of which there were approximately 1900, several other issues arise. Sampling at the first phase is broken into two segments: a self-representative part (take-all strata) and a non-self-representative part. The non-self-representative portion was divided into 35 strata and selection was then PPS within strata. However, as with the PSID, we do not have the selection probabilities, only the adjusted weights, and so once again we cannot calculate the superpopulation variance. Again, as with the PSID, in the NSFG we know that when a single PSU is selected in each stratum the strata need to be collapsed to allow for variance calculations. However, we do not know how these strata were combined. We note to the reader that this will often be the case, and a discussion with statistical agencies and survey practitioners needs to take place so that this information could be readily provided for all public release data, allowing superpopulation variances to be estimated.

For our reanalysis, although it may be tempting (and simple) to average the estimates of $\hat{\theta}^{F}_{i}(F)$, we will not pool these data, as doing so would estimate an incorrect variance for the pooled estimator. Once again, it is an important decision not to average the estimates, since we do not have the correct variances. Even without pooling, the individual studies provide evidence that there is no effect of partnership status on secondary sex ratios, and we end our reanalysis here.

This example provided an opportunity to review the assumptions and steps we should take when doing a meta-analysis with survey data. In addition to the errors Norberg made in coding and analyzing the data, there were mistakes related to pooling and meta-analysis in general. First, she did not do a systematic review of all possible data sources. Next, she did not understand the differences in the studies' target populations and populations of inference. Additionally, she did not properly assess whether the design of each survey was ignorable. Moreover, she clearly did not understand the importance of the study designs, the assumptions of homogeneity and model form, or the analytical issues with odds ratios. In conclusion, it is disquieting that, notwithstanding these many significant errors, Norberg's article was published.


Chapter 6 Guidelines

The research outlined in this thesis has covered various statistical issues relating to the meta-analysis of survey data. These issues include the creation of an original comprehensive methodological framework for combining survey data, a comparison of this framework with the traditional one proposed by Cochran for the combination of experiments, a proposal for a new weighting method that takes into account the differences in variability due to the sampling plan, a first-order correction to the traditional homogeneity test statistic, an examination of the convergence of meta-analytic estimators, and a discussion of the numerous implicit assumptions researchers make when they use meta-analysis methods with survey data. While offering primarily a review and conclusion, this final chapter also contains a discussion of data quality in section 6.1 and of comparability in section 6.2. Lastly, as an aid to researchers, in section 6.3 we expand the guidelines for completing and reporting reviews proposed by groups like the Cochrane Collaboration and the Meta-analysis of Observational Studies in Epidemiology (MOOSE) group (Stroup et al. 2000).

6.1 Data Quality

The issues relating to the identification, selection, and quality of trials for meta-analysis have been widely discussed (Higgins and Green 2009), and survey methodologists have started to catalog the analytical and data quality issues related to combining survey data in a more general meta-analysis framework (see, for example, Schenker et al. and Rao et al. 2008). Such investigations are vital, as researchers need to be able to determine the extent to which accuracy and other quality factors affect their analysis. They can do this by verifying the conceptual framework and definitions used in the survey design and in the collection and processing of the data. Without such verification, researchers cannot be certain that the data they analyze meet their needs. Regardless of the method they choose to employ, they will make significant implicit assumptions when they blindly apply meta-analysis methods in their research, and thus, without review, they will be unable to investigate the appropriateness of their assumptions.

Data quality has many interrelated dimensions, including accessibility, precision, relevance, unbiasedness, reliability, accuracy, interpretability, coherence, coverage, completeness, timeliness, and consistency (Brackstone 1999). To this list some researchers might add methodological soundness and, in the case of meta-analysis, comparability. While most discussion of data quality in traditional meta-analysis focuses on accessibility, precision and coverage, we would like to extend this conversation to include the other aspects listed. Moreover, for each dimension of data quality considered, we want not just the individual estimates, but also the final pooled estimate, to have sufficient data quality.

First and foremost, the data from each study must be relevant to the question that we are posing. Relevance is the degree to which the data meet the needs of the analyst, or, in other words, the usability of the data set for a particular analytical question. Such usability must be assessed on a study-by-study basis, but from a more global perspective as well. For example, Schenker et al. describe a method to use data from supplementary surveys (one household and one institutional) to estimate the prevalence of certain health conditions. They saw that each data set was relevant, albeit incomplete, and they used both together to get a more reliable estimate.

To define reliability we start with three concepts: unbiasedness, precision and accuracy, each of which can be used to convey the reliability of the data. An estimate is unbiased if the expected value of the estimator is the true value of the parameter. Bias is a systematic error that can arise in many different ways and is widely discussed in the literature (and commented upon in this thesis). In Chapters 3 and 4, we noted that we can have biased estimates if the weights are not independent of the estimates. In practice, in the survey context there are some designs, like Poisson and PPS sampling, where this is the case. Additionally, we may have an estimator, such as a ratio estimator, which is biased in the survey context. We can also have bias if the surveys we are combining are not independent. Moreover, we could have bias within a total survey error framework from things like interviewer effects, mode effects and nonresponse. With a meta-analysis it may thus not be possible to have unbiased estimation, even when it is desired.

The next two components of reliability are the precision and the accuracy of the estimate. Precision refers to the variance of the estimate, while accuracy is the degree to which the information correctly describes the phenomena it was designed to measure. Precision is often the only feature of reliability that is summarized in a traditional meta-analysis, vis-à-vis a variance calculation. In the main, there has not been a thorough discussion in traditional meta-analysis of accuracy, which is the combination of both the variability of the data and its bias. However, as there is often bias, accuracy may be a more appropriate measure of reliability. Mathematically, accuracy is measured with the mean-squared error (MSE); the smaller the MSE, the more accurate the estimate. Ideally, we would know all the sources of systematic error, or bias, and the variance due to sampling. When this is not possible, we can also describe the accuracy of data by investigating major sources of potential inaccuracy, for example, coverage, sampling and nonresponse. Research into minimum-MSE estimators would be an interesting area of future study in the survey case.
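In symbols, accuracy combines the two preceding concepts:

\[
\operatorname{MSE}(\hat{\theta}) \;=\; E\bigl[(\hat{\theta}-\theta)^{2}\bigr]
 \;=\; \operatorname{Var}(\hat{\theta}) + \operatorname{Bias}(\hat{\theta})^{2},
\]

so an estimator can be precise (small variance) yet inaccurate (large bias), which is exactly the situation a variance-only summary hides.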

In addition to connecting directly to reliability, certain biases are also related to other concepts in data quality. Publication bias, for example, is related to accessibility, an important concept which denotes the ease with which data can be found and obtained. This includes such things as whether the information is in print or online, its cost and user-friendliness, and the suitability of the medium of the data (e.g. a SAS dataset is more suitable than old microfiche). As is the case with data from clinical trials, not all survey data are published or accessible. Therefore, just as in the experimental case, the magnitude of bias due to accessibility (e.g. publication bias) must be evaluated.

Another bias-quality relation is that between selection bias (or nonresponse) and coverage, a term which describes how well the target population is represented by the study population. As noted in Chapter 2, we must clearly define the target population not just for each study but also for the overall question we are trying to answer.

A further critical element of data quality is interpretability, a term which denotes the availability and sufficiency of the supplementary information and metadata necessary to interpret and utilize the survey data appropriately. Such information normally covers the underlying concepts, the variables, the coding and classifications used, the methodology of collection, and the indicators of the accuracy of the statistical information. Without such information we could not ensure that any of the assumptions we make in our meta-analysis hold. And as noted in the example in Chapter 5, that information, along with occasional contact with the survey methodologists, is needed in order to understand the data one wishes to combine.

In addition to having adequate information so that we can assess key concepts like precision, bias and coverage, for any meta-analysis we must also assess the level of coherence between the statistical information from each survey, namely, the extent to which the data can be brought together with other statistical information in a broad analytic framework and over time. In fact, coherence is an essential concept in a meta-analysis. Although coherence between studies does not imply identical study design and implementation, the goal should be to have similar concepts, classifications, target populations and methodology. Thus, coherence relates broadly, and critically, to other dimensions of data quality like coverage, completeness, interpretability and methodological soundness, all of which readily apply when determining comparability and consistency between studies, two more key concepts which are discussed in the following section.

6.2 Data Comparability and Consistency

Comparability and consistency are interconnected ideas relating to similarity. Two or more studies are comparable if they are similar in terms of the phenomenon under study, i.e. the design, the implementation, the methods, the coverage, the completeness, the biases, the interpretability and the methodological soundness. Two or more studies are consistent if they produce similar results under similar or comparable conditions; that is, if the studies have similar precision, accuracy and/or reliability.

We would like to ensure that the information we are combining is comparable. We have made use of a mathematical framework to ensure that the concepts of inference and estimation are clearly defined. Even so, many other aspects of each survey remain that must be reviewed to ensure comparability, even if, for example, we are certain that we have a single superpopulation and multiple finite populations that we are combining.

First, we must consider possible sources of substantive (often called clinical) heterogeneity. This use of heterogeneity does not refer to the outcome of the study, but to differences in study design. Unlike Rao et al. (2008), we limit substantive heterogeneity to differences in the coverage and completeness of the data: that is, the target population, the respondents, the nonrespondents, and differences in the questionnaires. If there are substantial differences in the comparability of these elements, then it is unlikely that any summarizing of the estimates would be meaningful. However, we feel that differences in survey design can be attenuated with the use of the design effect; we also suggest that differences in administration (such as mode) should be considered sources of measurement error. We need to ensure that we can use an appropriate model and, when needed, describe its sources of heterogeneity.

In order to ensure comparability it is important to assess the coverage and completeness of the data. With surveys, this means defining the target population, the population surveyed (i.e. the actual participants), and the sampling frame. The target population denotes the conceptual universe from which the sample is selected, and is often different from the population actually surveyed. These differences must be noted. Although we can pool surveys that do not cover the whole population, Schenker correctly points out that when combining complementary survey data the researcher needs to know "to what extent are the target populations of the surveys mutually exclusive and exhaustive subsets of the overall domain of interest." In general, we need to know the target population for each study and then assess if these population definitions overlap. Following Schenker's lead, we would want to know if the surveys differentially cover their target populations, and how these definitions relate to the target population of the meta-analysis. In addition to assessing the target population, we would also like to know the sampling frame(s) used in the studies, as the frame is a more direct link to the population surveyed and can inform us of gaps or differences between studies. Knowing the sampling frames allows us to assess the over- and/or undercoverage of the target population.

Coverage relies on a description of the participants in the survey. The Cochrane Collaboration suggests setting selection criteria that include how to describe the study participants. For example, in an experimental setting, they consider it important to assess what characteristics describe the participants, what the relevant demographic factors are, and which individuals should thus be excluded from the review. For instance, if we were pooling experimental studies on HRT, we would define the study participants as postmenopausal women. The same would hold in the survey case. For example, we may want to know about the differences in quality of life between those women who are on HRT and those who are not. We may have individual surveys, and we would then define our study participants as postmenopausal women.

Next, researchers must determine if any portion of the population under study was excluded for administrative or logistical reasons, and judge whether this exclusion affects their conclusions. This is related to coverage and completeness as noted above. For example, in Canada it is often the case in national surveys that the population living on First Nations reserves, the residents of institutional collectives, individuals living on military bases, Canadian Armed Forces vessels, merchant and coast guard vessels, and campgrounds and parks are excluded for operational reasons. If our population of interest includes these subpopulations, we would not want to compare surveys with and without these subpopulations.

Another characteristic of a survey that might lead to incomparable data is proxy responses. Proxy responses are often grouped into two categories: those of necessity and those of convenience (Shields 2000, 2004). A proxy of necessity occurs when the designated respondent is unable to answer the questions, oftentimes because of physical or mental infirmity. Allowing proxy responses in such cases is essential as, otherwise, we would not have the information and would, of course, have biased results. On the other hand, a proxy of convenience occurs when the respondent could respond but is not readily available, or is unwilling. Acceptance of these proxies is often an efficient use of money and time. Shields (2004) notes that with proxy responses, "if information about everyone in a household is collected from one person, it is possible to obtain a large sample size with just one contact, thereby improving response rates and reducing costs". This said, however, proxy respondents are more likely to underreport health conditions, and thus bias incidence and prevalence rates. For example, Shields found that the prevalence of arthritis is different for proxy respondents. This underreporting does not just happen in the area of health. Biggs (1992) found that proxy respondents to the Survey of Labour and Income Dynamics tended to underestimate participation in government income support programs, have higher item nonresponse rates, and offer more consistent responses to sensitive subject matter, but have difficulty reporting details and events of short duration. These issues should be considered when using data with proxy responses.

Having considered our target populations, the actual coverage and completeness of the data in relation to the target population, and whether the data are from the actual respondent or a proxy, we need next to consider the methodological soundness of the surveys. The elements of methodological soundness include the sampling and questionnaire design, error detection methods, imputation methods, estimation methods, variance estimation methods, quality methods, disclosure control methods, and data revision. Clearly, studies without sound methods should be excluded. It is important to note that large methodological differences could lead to spurious conclusions if the estimates are blindly combined.

Agreement is widespread that with nonrandomized studies we must consider the questionnaire as a potential source of study incomparability (i.e. statistical heterogeneity) (Schenker 2007). While generally defined as a set of questions designed to collect information from a respondent, a questionnaire should be specifically designed to take into account the nature of the respondent population, especially as it pertains to literacy levels and the incidence of specialized language. Furthermore, the questions employed should be tested to ensure that there is consistency in the standard concepts and definitions, as the erroneous ascertainment of a characteristic of interest can be a source of bias (Willis 2005). Questionnaires have a major impact on data quality, particularly response accuracy, and they play a central role in data collection and comparability. Ideally, therefore, questionnaires should not only be identical, but should also be validated, with potential sources of response and nonresponse errors minimized. As survey instruments do vary a great deal, the separate questionnaires can themselves be tested to ensure that there is coherence between survey instruments, using such things as cognitive testing, split-sample testing, or qualitative focus groups. Even when the questions are identical, however, we need to evaluate whether the sources of information used in the different surveys are compatible. For instance, if one survey uses a diagnosis provided by a health care practitioner, while another asks the respondent about their illness, while yet another uses a caregiver as a proxy respondent, the sources of information can yield different results (Edwards).

Unlike a single cross-sectional survey (or a wave of a longitudinal survey), which happens at a single fixed point in time, pooling data requires us to consider two temporal dimensions together: the data collection time period and the age of the respondent. These two values are linked, since each birth cohort is defined as the time period minus the age (e.g. 2011 - 45 = the 1966 birth cohort). We need to take these two time effects into account because, with survey or observational data, changes in estimates over time can arise from both period and cohort effects. First, there is the possibility of a period effect, that is, something that affects all individuals at the same time. For example, a war in a region would affect every person in the area. Or we could have a change in technology, such as the introduction of a new disease test or imaging process. Such changes would affect the whole population, and pooling estimates prior to the event with those taken in its wake may not be meaningful. Next, it is very much possible that certain birth cohorts have a differential exposure to risks throughout their lives. For example, some children born between 1957 and 1961 in North America were exposed to Thalidomide, a drug prescribed to pregnant women to reduce nausea that was later proven to produce numerous and devastating birth defects. If we were studying birth defect rates both before 1957 and during the period 1957-1961, we would have to take into account cohort exposure to Thalidomide. Another example of a birth cohort effect is differential smoking rates: Laaksonen (1999) shows that younger male birth cohorts in Finland have a lower probability of starting smoking than older cohorts.
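In symbols, with $t$ the data collection period and $a$ the age at collection, the birth cohort is

\[
c = t - a, \qquad \text{e.g. } 2011 - 45 = 1966,
\]

and because of this linear identity, age, period and cohort effects are confounded unless the pooled data allow them to be separated.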

In addition to these more global time effects, Schenker adds that it is also important to assess which aspects of survey designs and procedures have changed and how these changes have affected the accuracy and relevance of the study. No two studies are identical, and although it is often reported that research should be reproducible, the reality is that research evolves. Since studies try to improve accuracy over time, trends may be identified that are, in fact, only artifacts of changes in process or processing over time (Peto 1987). Any analysis using data sets from different points in time needs to be interpreted with caution, as the researcher could produce very precise but spurious results (Egger et al. 1997a).

After we have decided the studies are comparable, based on an investigation of all the potential sources of differences noted above, we would next look at how consistent the findings are. Although we agree with Higgins (2003) when he remarks that the assessment of consistency is an essential part of any meta-analysis, we do not agree with his second point, that "unless we know how consistent the results of the studies are, we cannot determine the generalizability of the findings in the meta-analysis." We could have a series of studies whose effects are consistent and biased; in such a case we would not want to generalize. Or we could have a series of consistent unbiased effects on a very select population, which again we would not want to generalize to a larger population. Some investigators in research synthesis (see Kulik, Kulik & Cohen 1979) have recognized the potential for inconsistent effect sizes and suggest grouping studies that share similar characteristics into classes. Analysis is then done within a class. How we then go about generalizing beyond a class, however, seems a difficult question. Thus, we do not recommend using consistency as a gauge of generalizability, but as a measure of homogeneity. Consistency is therefore a concept relating to homogeneity and thus to model structure. Under our framework, we would test our model assumptions using the estimates from the different but comparable studies. Then we would ensure that the combined estimate (which is consistent with our proposed model) has a sound interpretation.

Generalizing that interpretation beyond the combined studies is difficult in meta-analysis, though generalizing our results is often the goal. We know that having a larger number of research studies does not always lead to clearer or broader insights about the phenomena under investigation. We can find differences for a myriad of reasons, ranging from the underlying true model structure, to the survey design, to the concepts and definitions, to assorted biases in each study. If we cover 95% of our intended population, can we say the results apply to all of our population? Do our proxy responses generalize to all in our target population? So, although we can agree that consistent data are necessary for some models, we prefer to use the concept of inference instead of generalizability. The population of inference for a survey is the population to which the results are meant to generalize (Lavrakas 2008). We feel generalizability to a finite population is a result of the random selection of individuals from our different finite populations. For meta-analysis, then, if the combined estimate is from a series of surveys of the same finite population, with no reference to a superpopulation, we cannot make inference about the mechanism that generated the finite population, only about the finite population itself. Likewise, if we have data from several finite populations all generated under the same superpopulation model, we need to invoke the superpopulation and use it in both point and variance estimation in order to have valid inference for the superpopulation parameters. Then we can make inference on our superpopulation. And we again assert that, no matter how tempting, we should never generalize (or infer) beyond the restrictions of our meta-analysis model.

6.3 Reporting

Authors agree that when combining experiments there should exist a clearly defined research question which the systematic review undertakes to answer (Higgins and Green 2009, Hardy 1995, Hedges 1992, Peto 1987). This stipulation also holds for the survey case. Is the analysis investigating a rare finite population (as is often the case in small area estimation)? Is it seeking to make more accurate inference about the superpopulation? Or is it testing to see if a body of evidence on a problem is consistent? Once the question is defined, the researcher must decide on his or her methodology. Will she use a small area estimation procedure, or will she pool data? Once the decision is made, researchers should provide a rationale or justification for using their chosen method to analyze the data.

As with experiments, the selection of surveys should follow a procedure that is clearly defined prior to the acquisition of either point estimates or data sets. Whether we use a research-based electronic facility (e.g. PubMed, the Canadian Research Index, the Social Sciences Index, POPLINE, PapersFirst or JSTOR), a web-based search engine such as Google or Google Scholar, or contact a statistical agency directly, researchers should set up and document the protocols used to select surveys. These protocols should include the keywords, the search criteria, the years under study and a description of the databases, contacts and websites that were searched. The report should also include a list of those who completed the searches, along with their qualifications and their search strategies. Furthermore, researchers should document the effort taken to include all possible sources, the software used, the list of citations found, the methods used to deal with other languages, unpublished studies or datasets without open access, and any contact the researcher had with corresponding authors and/or agencies.

After compiling the list of studies found with this well-documented protocol, the researcher should next apply clear criteria for inclusion and exclusion to each of the studies or datasets found in the review. These criteria should be based on the research question and be designed to ensure that the studies are comparable and that the individual studies are of reasonable quality. To this end, the authors should include a description of the relevance of the surveys included, as well as the rationale for their selection and any possible bias these selection criteria may cause. And as with experimental studies, the outcome should not be used as a criterion for inclusion or exclusion.

As in the experimental case, the researcher needs to report the rationale for the selection and coding of outcomes. Whereas in a clinical setting we would need to assess whether the studies used sound clinical definitions, self-reports, or chart abstractions, in the context of surveys the researcher needs to assess how the questions were asked, noting how differences in the questionnaire could influence the outcome. For example, within a group of surveys conducted to collect data on depression, one survey might contain a question that asked about mood disorders diagnosed by a health professional, such as depression, bipolar disorder, mania or dysthymia, while another might have a question concerning each disorder, and a third might have a question asking if the respondent ever felt depressed. Although each question relates to depression, it would be unwise for a researcher working on depression simply to assume that the outcomes are the same and therefore pool them indiscriminately.

Moving on to the fundamental problem of comparability, as noted previously, we must also be clear as to how each of the studies defined its population of inference (target population) and its participants. Via the research question and the methodology, the researcher must define the population of inference, the target population, the model or superpopulation, and the type of analysis being planned. After describing the population of inference for each survey, the analyst must also confirm that these individual definitions fit the methodological model of inference for the meta-analysis: the hyper-superpopulation, the superpopulation or a finite population at a particular time point. Furthermore, the analyst must ensure that the two levels of inference, that of the individual study and that of the global analysis, are compatible. For instance, if the researcher is interested in the superpopulation parameter but only has surveys taken from a single finite population, we know that the pooled result will more accurately measure the single finite population parameter (not the superpopulation parameter). By contrast, if the researcher believes he has several surveys from different finite populations generated under the same model, a combined estimate will reflect the superpopulation parameter.
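The combined estimate referred to here is, in the simplest fixed-effect case, the usual inverse-variance weighted mean (a generic sketch only; the survey-adjusted weighting is developed in Chapters 3 and 4):

\[
\hat{\theta} \;=\; \frac{\sum_{k=1}^{K} \hat{\theta}_{k}/v_{k}}{\sum_{k=1}^{K} 1/v_{k}},
\qquad
\operatorname{Var}(\hat{\theta}) \;=\; \Bigl(\sum_{k=1}^{K} 1/v_{k}\Bigr)^{-1},
\]

and whether $\hat{\theta}$ estimates a finite population parameter or a superpopulation parameter depends entirely on which variances $v_{k}$ (and which populations) enter the sums.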


Following the lead of the Cochrane Collaboration, where in an experiment we set clear definitions of the types of participants, interventions, and outcomes under study, in the survey case we need to ask who the respondent is and whether there are proxy responses. For example, in some home surveys nurses fill out questionnaires for some patients, enacting the situation of proxy of necessity. Similarly, it is sometimes the case in household surveys that a single respondent (usually a parent) responds for other members of the household, acting as a proxy of convenience. The researcher would then have to determine what impact such proxy responses have on his research question.

At this point, the researcher should have enough information to evaluate any between-study heterogeneity. When combining studies, some substantive heterogeneity is expected and may, in fact, be indistinguishable from statistical heterogeneity (i.e., the observed variability of effects being greater than that expected by chance alone). Heterogeneity can be tested and in some cases adjusted for in the meta-analysis, as discussed in Chapters 3 and 4. When heterogeneity is found, the reasons for it should be examined. As Hardy (1995) notes, however, this examination will always be done post hoc, and so hypotheses and conclusions about the variation are made after reviewing the data and should therefore be managed with care.

The researcher must then give a description of the statistical methods used in the meta-analysis, along with appropriate tables and graphics. These should include things like the weighted and unweighted survey estimates, comments on the importance of the sampling design (or its non-importance, with justification), descriptive information from each survey, model information, results of sensitivity analysis, and estimates of statistical uncertainty. Finally, the report should note other possible explanations for the findings, present conclusions, point towards future research, and include a disclosure of funding sources.

Table 6.1 presents a checklist for researchers doing a meta-analysis of survey data. It is a modified version of the reporting checklist suggested by the Meta-analysis of Observational Studies in Epidemiology (MOOSE) group, expanded to the survey case.

Table 6.1 A Reporting Checklist for Authors, Editors, and Reviewers of Meta-analyses of Surveys

Reporting of background should include:
- Problem definition
- Hypothesis statement
- Description of survey outcome(s)
- Rationale for meta-analysis
- Population of inference:
  - Finite population (definition: study elements, geographic boundaries and time periods)
  - Superpopulation (model specification, study elements)
  - Hyper-superpopulation (model specification, study elements)

Reporting of search strategy should include:
- Qualifications of searchers (e.g., librarians and investigators)
- Search strategy, including time period included in the synthesis and keywords
- Effort to include all available studies, including contact with authors and agencies
- Databases and registries searched
- Search software used, name and version, including special features used (e.g., explosion)
- Use of hand searching (e.g., reference lists of obtained articles)
- List of citations located and those excluded, including justification
- Method of addressing surveys published in languages other than English
- Method of handling abstracts, unpublished studies and surveys that are not publicly available
- Description of any contact with authors
- Description of any contact with agencies

Reporting of methods should include:
- Description of the relevance or appropriateness of the studies assembled for assessing the research question
- Rationale for the selection and coding of outcomes (e.g., sound clinical definitions, complete scales, self-reported, derived variables)
- Documentation of how data were collected
- Assessment of heterogeneity between studies
- Description of statistical methods (e.g., complete description of fixed or random effects models, justification of each) in sufficient detail to be replicated
- Provision of appropriate tables and graphics

Reporting of results should include:
- Graphic summarizing individual study estimates and the overall estimate, weighted and unweighted
- Table giving descriptive information for each study included
- Tables giving model information (superpopulation, hyper-superpopulation)
- Results of sensitivity testing (e.g., subgroup analysis)
- Indication of statistical uncertainty of findings

Reporting of discussion should include:
- Quantitative assessment of bias (e.g., publication bias)
- Justification for exclusions
- Assessment of the quality of the included studies
- Assessment of the impact of the sampling design (informativeness and ignorability)
- Assessment of the impact of variations in questionnaire wording
- Assessment of the impact of variations in mode
- Assessment of the impact of proxy responses
- Assessment of the impact of nonresponse and imputation
- Assessment of the impact of poststratification and calibration

Reporting of conclusions should include:
- Consideration of alternative explanations for observed results
- Generalization of the conclusions (i.e., appropriate for the data presented and within the domain of the literature review)
- Guidelines for future research
- Disclosure of funding source


6.4 Conclusion

Meta-analysis is a form of research that uses other primary research studies as its subjects or units of observation. As Naylor (1997) states, it has been widely recognized that "methodology is less important than determining which results are to be aggregated." We agree in the experimental case. In the case of survey data, however, the method matters just as much as the data quality, because here (unlike the experimental case) we have two inferential models: that of the finite population and that of the superpopulation. Good data combined under the wrong inferential model (for example, pooling samples from the same finite population and treating the result as if it described the superpopulation, or ignoring the sampling design when it is informative) is as spurious as pooling poor data under the correct inferential model.

The idea of pooling data may arise in many types of research, for instance, research dealing with a series of surveys from a single country, a series of surveys over time, or a series of surveys from different countries. To date, none of the applied papers combining or pooling large survey datasets written by non-statisticians has been completely correct. And the pooling examples done by statisticians have missed important theoretical assumptions. We have noted that when the method is not mathematically challenging, the assumptions will be strong and numerous (for example, those concerning homogeneity, ignorability and independence). Unfortunately, we have also seen that the simpler the model, the more likely it is to be used, and used incorrectly. Therefore, we hope that we will help researchers minimize errors by breaking down the cases into their simplest forms and formally stating what is assumed in each (both explicitly and implicitly).


This thesis presents a framework for the meta-analysis of survey data and considers the statistical analysis appropriate to a series of surveys derived from a single finite population, a single superpopulation or a hyper-superpopulation model. These three models are the only models that can exist; any other model form can be reduced to these summary cases. This fact makes this research an important starting point for all other work in this area. For example, future research could investigate techniques that minimize the MSE when we have biased estimates, or how to approach pooling data with measurement errors using this framework.

Although we present the framework in a standardized fashion, we would not want the reader to approach the problem of combining data mechanistically, following the framework unreflectively as one might a recipe in a cookbook. Like other forms of research, meta-analysis has many steps, from planning through implementation and on to dissemination. Care must be taken in all aspects of meta-analysis because, as we have illustrated, misunderstanding or oversimplifying the problem of data combination can lead to invalid conclusions.


Bibliography

Arbuthnott, J. 1710, "An Argument for Divine Providence, taken from the constant Regularity observ'd in the Births of both Sexes", Philosophical Transactions, vol. 27, no. 328, pp. 186.

Armitage, P., Berry, G. & Matthews, J. 2002, Statistical methods in medical research, Blackwell Science.

Armstrong, B.G. 1998, "Effect of measurement error on epidemiological studies of environmental and occupational exposures", Occupational and Environmental Medicine, vol. 55, no. 10, pp. 651-656.

Bakan, D. 1966, "The test of significance in psychological research", Psychological Bulletin, vol. 66, no. 6, pp. 423-437.

Bankier, M.D. 1986, "Estimators Based on Several Stratified Samples with Applications to Multiple Frame Surveys", Journal of the American Statistical Association, vol. 81, no. 396, pp. 1074-1079.

Basso, O. & Olsen, J. 2001, "Sex Ratio and Twinning in Women with Hyperemesis or Pre-Eclampsia", Epidemiology, vol. 12, no. 6, pp. 747-749.

Bellhouse, D.R. 1987, "Model-Based Estimation in Finite Population Sampling", The American Statistician, vol. 41, no. 4, pp. 260-262.

Benn, P.A., Clive, J.M. & Collins, R. 1997, "Medians for second-trimester maternal serum alpha-fetoprotein, human chorionic gonadotropin, and unconjugated estriol; differences between races or ethnic groups", Clinical Chemistry, vol. 43, no. 2, pp. 333-337.

Beratis, N.G., Asimacopoulou, A. & Varvarigou, A. 2008}, "Association of secondary sex ratio with smoking and parity", FERTILITY AND STERILITY}, vol. 89}, no. 3}, pp. 662667}.

Berkey, C.S., Hoaglin, D.C., Mosteller, F. & Colditz, G.A. 1995, "A randomeffects regression model for metaanalysis", Statistics in medicine, vol. 14, no. 4, pp. 395411.

Biggs, B. 1992, Self/Proxy Respondent Rules and Data Quality , Income Statistics Canada, Statistics Canada, Ottawa.

Binder, D.A. 1983, "On the Variances of Asymptotically Normal Estimators from Complex Surveys", International Statistical Review / Revue Internationale de Statistique, vol. 51, no. 3, pp. pp. 279292.

Binder, D.A. & Roberts, G.R. 2003, " for Survey Data Analysis", , pp. 568.

Binder, D.A. & Roberts, G.R. 2001, Can informative designs be ignorable? .

Binder, D. 2008, "Answers to Criticisms of Using DesignBased Methods for Inference", Statistics Canada, .

BLS , NLS documentation [Homepage of Bureau of Labor Statistics], [Online]. Available: http://www.bls.gov/nls/#overview .

255

Botman, S.L. & Jack, S.S. 1995, "Combining national health interview survey datasets: Issues and approaches", Statistics in medicine, vol. 14, no. 57, pp. 669677.

Brackstone, G. 1999, "Managing data quality in a statistical agency", Survey Methodology, vol. 25, no. 2, pp. 139149.

Brannick, M., Yan, L. & Cafri, G. 2008, "Comparison of weights for metaanalysis of r and d under realistic conditions", .

Breslow, N.E. & Day, N.E. 1980, Statistical methods in cancer research, International Agency for Research on Cancer, Lyon.

Brewer, K.R.W. 1979, "A Class of Robust Sampling Designs for LargeScale Surveys", Journal of the American Statistical Association, vol. 74, no. 368, pp. 911915.

Brewer, K. R. W. , Hanif Muhammad & Tam, S. M. 1988, "How Nearly Can ModelBased Prediction and DesignBased Estimation Be Reconciled?", Journal of the American Statistical Association, vol. 83, no. 401, pp. 128132.

Brockwell, S.E. & Gordon, I.R. 2001, "A comparison of statistical methods for metaanalysis", Statistics in medicine, vol. 20, no. 6, pp. 825840.

Cassel, C. & Sarndal, C. 1972, "A Model for Studying Robustness of Estimators and Informativeness of Labels in Sampling with Varying Probabilities", Journal of the Royal Statistical Society.Series B (Methodological), vol. 34, no. 2, pp. pp. 279289.

Cassel, C., Saerndal, C.E. & Wretman, J.H. 1977, Foundations of inference in , Wiley, New York, NY.

Catalano, R., Bruckner, T., Marks, A.R. & Eskenazi, B. 2006, "Exogenous shocks to the human sex ratio: the case of September 11, 2001 in New York City", Human Reproduction}, vol. 21, no. 12, pp. 31273131.

CDC , NSFG Documentation [Homepage of Center for Disease Control], [Online]. Available: http://www.cdc.gov/nchs/nsfg.htm .

Chiu, M. 2010, "Comparison of cardiovascular risk profiles among ethnic groups using population health surveys between 1996 and 2007", CMAJ, vol. 182, no. 8, pp. E301E310.

Cochran, W.G. 1954, "The Combination of Estimates from Different Experiments", Biometrics, vol. 10, no. 1, pp. pp. 101129.

Cochran, W.G. 1977, Sampling Techniques, John Wiley and Sons, New York.

Copas, J.B. 1973, "Randomization models for the Matched and Unmatched 2 × 2 Tables", Biometrika, vol. 60, no. 3, pp. 467476.

Cornfield, J. 1944, "On Samples from Finite Populations", Journal of the American Statistical Association, vol. 39, no. 226, pp. pp. 236239.

Cummings, P. 2009, "The relative merits of risk ratios and odds ratios", Archives of Pediatrics & Adolescent Medicine, vol. 163, no. 5, pp. 438445. 256

Davies, J.B. & Zhang, J. 1997, "The Effects of Gender Control on Fertility and Children's Consumption", Journal of Population Economics, vol. 10, no. 1, pp. pp. 6785.

Demets, D.L. 1987, "Methods for combining randomized clinical trials: strengths and limitations", Statistics in medicine, vol. 6, no. 3, pp. 341.

Deming, W.E. 1950, "Some theory of sampling", General Pulishing Company, Toronto

DerSimonian, R. & Laird, N. 1986, "Metaanalysis in clinical trials.", Controlled clinical trials, vol. 7, no. 3, pp. 177188.

Egger, M., Smith, G.D. & Phillips, A.N. 1997, "Metaanalysis: Principles and procedures", BMJ}, vol. 315, no. 7121, pp. 15331537.

Egger, M., Smith, G.D., Schneider, M. & Minder, C. 1997, "Bias in metaanalysis detected by a simple, graphical test", BMJ}, vol. 315, no. 7109, pp. 629634.

Erickson, J.D. 1976, "The secondary sex ratio in the United States 196971: association with race, parental ages, birth order, paternal education and legitimacy", Annals of Human Genetics, vol. 40, no. 2, pp. 205212.

Fisher, R.A. 1950, Statistical Methods for Research Workers, .

Fisher, R.A.,Sir 1930, "The genetical theory of natural selection", .

Fitzmaurice, G.M., Heath, A.F. & Cox, D.R. 1997, "Detecting Overdispersion in Large Scale Surveys: Application to a Study of Education and Social Class in Britain", Journal of the Royal Statistical Society.Series C (Applied Statistics), vol. 46, no. 4, pp. pp. 415432.

Flannery, K.A. & Liederman, J. 1996, "A reexamination of the sex ratios of families with a neurodevelopmentally disordered child", Journal of child and psychiatry, and allied disciplines, vol. 37, no. 5, pp. 621623.

Fukuda, M., Fukuda, K., Shimizu, T., Yomura, W. & Shimizu, S. 1996, "Kobe earthquake and reduced sperm motility", Human reproduction (Oxford, England), vol. 11, no. 6, pp. 12441246.

Fukuda, M., Fukuda, K., Shimizu, T. & Møller, H. 1998, "Decline in sex ratio at birth after Kobe earthquake", Human reproduction (Oxford, England), vol. 13, no. 8, pp. 23212322.

Fuller, W.A. & Burmeister, L.F. 1972, "Estimators of samples selected from two overlapping frames", , 1972, pp. 245.

Fuller, W.A. 1975, "Regression Analysis for Sample Surveys", Sankhyā: The Indian Journal of Statistics, Series C, vol. 37, pp. 37, 117132.132.

Fuller, W.A. 2009, "Sampling statistics", .

Gini, C. 1955, "Sulla probabilita che x termini di una serie erratica sieno tutti crescenti (o non decrescenti) ovvero tutti decrescenti (o non crescenti) con applicazioni ai rapporti dei sessi nelle nascite umane in intervalli successivi e alle disposizioni sessi nelle fratellanze umane", Metron, vol. 17, pp. 34.

257

Glass, G.V. 1976, "Primary, Secondary, and MetaAnalysis of Research", Educational Researcher, vol. 5, no. 10, pp. pp. 38.

Godambe, V.P. 1960, "An Optimum Property of Regular Maximum Likelihood Estimation", The Annals of , vol. 31, no. 4, pp. 12081211.

Godambe, V.P. & Thompson, M.E. 1986, "Parameters of Superpopulation and Survey Population: Their Relationships and Estimation", International Statistical Review / Revue Internationale de Statistique, vol. 54, no. 2, pp. pp. 127138.

Goldstein, H., Yang, M., Omar, R., Turner, R. & Thompson, S. 2000, "Metaanalysis using multilevel models with an application to the study of class size effects", JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES CAPPLIED STATISTICS, vol. 49, pp. 399412.

Goodman, L.A. 1961, "Some possible effects of birth control on the human sex ratio", Annals of Human Genetics, vol. 25, pp. 75.

Graubard, B.I. & Korn, E.L. 2002, "Inference for Superpopulation Parameters Using Sample Surveys", Statistical Science, vol. 17, no. 1, pp. pp. 7396.

Graunt, J. & Petty, W., Sir 1662, Natural and political observations mentioned in a following index, and made upon the bills of mortality , Printed by Tho. Roycroft for John Martin, James Allestry, and Tho. Dicas, London.

Greenland, S. 1987, "Interpretation and choice of effect measures in epidemiologic analyses", American Journal of Epidemiology, vol. 125, no. 5, pp. 761768.

Groves, R.M. 2004, "Survey errors and survey costs", Wiley and Sons, New Jersey.

Grundy, P. M. & Yates, F. 1953, "Selection Without Replacement from Within Strata with Probability Proportional to Size", Journal of the Royal Statistical Society.Series B (Methodological), vol. 15, no. 2, pp. 253261.

Guerrero, R. 1975, "Type and time of insemination within the menstrual cycle and the human sex ratio at birth", Studies in family planning, vol. 6, no. 10, pp. 367371.

Hájek, J. 1964, "Asymptotic Theory of Rejective Sampling with Varying Probabilities from a Finite Population", The Annals of Mathematical Statistics, vol. 35, no. 4, pp. pp. 14911523.

Hardy, J.B. 2003, "The Collaborative Perinatal Project: lessons and legacy", Annals of Epidemiology, vol. 13, no. 5, pp. 303311.

Hardy, R.J. 1995, MetaAnalysis Techniques in Medical Research: A Statistical Perspective ., Unpublished Doctoral Thesis

Hardy, R.J. & Thompson, S.G. 1996, "A Likelihood Approach to Metaanalysis with Random Effects", Statistics in medicine, vol. 15, no. 6, pp. 619629.

Harlap, S. 1979, "Gender of infants conceived on different days of the menstrual cycle", The New England journal of medicine, vol. 300, no. 26, pp. 14451448.

Hartley, H.O. 1962, "Multiple Frame Surveys", pp. 203. 258

Hartley, H.O. & Sielken, R.L.,Jr. 1975, "A "SuperPopulation Viewpoint" for Finite Population Sampling", Biometrics, vol. 31, no. 2, pp. pp. 411422.

Hedges, L.V. 1983, "A random effects model for effect sizes", Psychological bulletin, vol. 93, no. 2, pp. 388395.

Hedges, L 1992, "MetaAnalysis", Journal of Educational Statistics, vol. 17, no. 4, pp. 279296.

Heeringa, S.G., West, B.T. & Berglund, P.A. 2010, Applied Survey Data Analysis, Chapman & Hall, CRC.

Higgins, J.P., Whitehead, A., Turner, R.M., Omar, R.Z. & Thompson, S.G. 2001, "Metaanalysis of continuous outcome data from individual patients", Statistics in medicine, vol. 20, no. 15, pp. 2219 2241.

Higgins, J.P.T. & Thompson, S.G. 2002, "Quantifying heterogeneity in a metaanalysis", Statistics in medicine, vol. 21, no. 11, pp. 15391558.

Higgins, J.P.T., Thompson, S.G., Deeks, J.J. & Altman, D.G. 2003, "Measuring inconsistency in metaanalyses", BMJ}, vol. 327, no. 7414, pp. 557560.

Higgins, Julian P. T., Spiegelhalter, David J. & Thompson, Simon G. 2009, "A ReEvaluation of RandomEffects MetaAnalysis", Journal of the Royal Statistical Society.Series A (Statistics in Society), vol. 172, no. 1, pp. 137159.

Higgins, J.P.T. & Green, S. (eds) 2009, Cochrane Handbook for Systematic Reviews of Interventions Version 5.0.2 [updated September 2009].

Hill, M.S. 1992, "The panel study of income dynamics: a user's guide", vol. 002.

Hobel, C.J., Goldstein, A. & Barrett, E.S. 2008, "Psychosocial stress and pregnancy outcome", Clinical obstetrics and gynecology, vol. 51, no. 2, pp. 333348.

Horvitz, D. & Thompson D. 1952, "A Generalization of Sampling Without Replacement From a Finite Universe", Journal of the American Statistical Association, vol. 47, no. 260, pp. 663685.

HuedoMedina, T.B., SánchezMeca, J., MarínMartínez, F. & Botella, J. 2006, "Assessing heterogeneity in metaanalysis: Q statistic or I² index?", Psychological methods, vol. 11, no. 2, pp. 193206.

Isaki, C.T. & Fuller, W.A. 1982, "Survey Design Under the Regression Superpopulation Model", Journal of the American Statistical Association, vol. 77, no. 377, pp. pp. 8996.

James, W.H. 1981, "Gonadotrophin and the Human Secondary Sex Ratio", Obstetrical & gynecological survey, vol. 36, no. 6, pp. 321322.

James, W.H. & Rostron, J. 1977, "Maternal age, parity, social class and sex ratio", Annals of Human Genetics, vol. 41, no. 2, pp. 205217.

James, W.H. 1996, "Debate and argument: the sex ratio of the sibs of neurodevelopmentally disordered children", Journal of child psychology and psychiatry, and allied disciplines, vol. 37, no. 5, pp. 619619. 259

James, W.H. 2008, "The variations of human sex ratio at birth with time of conception within the cycle, coital rate around the time of conception, duration of time taken to achieve conception, and duration of gestation: a synthesis", Journal of theoretical biology, vol. 255, no. 2, pp. 199204.

Kalton, G. & Anderson, D.W. 1986, "Sampling Rare Populations", Journal of the Royal Statistical Society.Series A (General), vol. 149, no. 1, pp. pp. 6582.

Kaufman, J. May 27th, 2011, Workshop in Causal Inference for Epidemiologic Research .

Keyfitz, N. 1951, "Sampling with Probabilities Proportional to Size: Adjustment for Changes in the Probabilities", Journal of the American Statistical Association, vol. 46, no. 253, pp. pp. 105109.

Kish, L. 1965, Survey sampling, Wiley, New York.

Kish, L. 1979, "Samples and Censuses", International Statistical Review / Revue Internationale de Statistique, vol. 47, no. 2, pp. pp. 99109.

Kish, L. 1981, Using Cumulated Rolling Samples to Integrate Census and Survey Operations of the Census Bureau .

Kish, L. 1983, "Data Collection for Details over Space and Time" in Statistical Methods and the Improvement of Data Quality , ed. T. Wright, Academic, New York, pp. 72 84.

Kish, L. 1986, "Complete Censuses and Samples", Journal of , vol. 2, pp. 381393.

Kish, L. 1990, "Rolling Samples and Censuses", Survey Methodology, vol. 16, pp. 6379.

Kish, L. 1994, "Multipopulation survey designs: five types with seven shared aspects", International statistical review = Revue internationale de statistique, vol. 62, no. 2, pp. 167186.

Kish, L. 1999, "Combining/Cumulating Population Surveys", Sur, vol. 25, pp. 129138.

Kish, L. 1995, "Methods for Design Effects", Journal of Official Statistics, vol. 11, no. 1, pp. 5577.

Klebanoff, M.A. 2009, "The Collaborative Perinatal Project: a 50year retrospective", Paediatric and perinatal epidemiology, vol. 23, no. 1, pp. 28.

Kohen, D.E., Leventhal, T., Dahinten, V.S. & McIntosh, C.N. 2008, "Neighborhood Disadvantage: Pathways of Effects for Young Children", Child development, vol. 79, no. 1, pp. 156169.

Konjin, H.S. 1973, of sample survey design and analysis, NorthHolland, Amsterdam.

Korn, E.L. & Graubard, B.I. 1990, "Simultaneous Testing of Regression Coefficients with Complex Survey Data: Use of Bonferroni t Statistics", American Statistician, vol. 44, no. 4, pp. 270276.

Korn, E.L. & Graubard, B.I. 1999, Analysis of health surveys, Wiley, New York.

Korn, E. & Graubard, B. 1998, "Variance estimation for superpopulation parameters", Statistica Sinca, vol. 8, no. 4, pp. 11311151.

260

Kulik, J.A., Kulik, C.C. & Cohen, P.A. 1979, "A metaanalysis of outcome studies of Keller's personalized system of instruction", American , vol. 34, no. 4, pp. 307318.

Laaksonen, M., Uutela, A., Vartiainen, E., Jousilahti, P., Helakorpi, S. & Puska, P. 1999, "Development of smoking by birth cohort in the adult population in eastern Finland 197297", Tobacco control, vol. 8, no. 2, pp. 161168.

Laplace, P.S. 1781, "Mémoire sur les probabilités.", Mémoires de l’Académie Royale des sciences de Paris, , pp. 227332.

Lavrakas, P.J. 2008, Encyclopedia of survey research methods, SAGE Publications, Thousand Oaks, Calif.

Lepkowski, J.M., Mosher, W.D., Davis, K.E., Groves, R.M. & Van Hoewyk, J. 2010, "The 20062010 National Survey of Family Growth: sample design and analysis of a continuous survey", Vital and health statistics.Series 2, Data evaluation and methods research, , no. 150, pp. 1.

Lewis, D.K. 1986, "Counterfactuals", .

Liederman, J. & Flannery, K.A. 1995, "The sex ratios of families with a neurodevelopmentally disordered child", Journal of child psychology and psychiatry, and allied disciplines, vol. 36, no. 3, pp. 511 517.

Little, R.J.A. 1982, "Models for Nonresponse in Sample Surveys", Journal of the American Statistical Association, vol. 77, no. 378, pp. 237250.

Lloyd, O.L., Lloyd, M.M., Holland, Y. & Lyster, W.R. 1984, "An unusual sex ratio of births in an industrial town with mortality problems", BJOG: An International Journal of Obstetrics & Gynaecology, vol. 91, no. 9, pp. 901907.

Lohr, S.L. 1999, Sampling: Design and Analysis, .

Mantel, N. & Haenszel, W. 1959, "Statistical aspects of the analysis of data from retrospective studies of disease.", Journal of the National Cancer Institute, vol. 22, no. 4, pp. 719748.

Marker, D.A. 1999, "Organization of Small Area Estimators Using a Generalized Framework", Journal of Official Statistics, vol. 15, no. 1, pp. 124.

Marsha Weinraub & Barbara M. Wolf 1983, "Effects of Stress and Social Supports on MotherChild Interactions in Single and TwoParent Families", Child development, vol. 54, no. 5, pp. 12971311.

Marson, A.G., Smith, C.T. & Williamson, P.R. 2005, "Investigating heterogeneity in an individual patient data meta ‐analysis of time to event outcomes", Statistics in medicine, vol. 24, no. 9, pp. 13071319.

McLellan, A. October 2010, , The Clinical Question [Homepage of McMaster University, Health Sciences Library], [Online]. Available: http://hsl.mcmaster.ca/resources/program/medicine/documents/ClinicalQuestionSearchStrat egy_Oct20_2010_000.pdf .

Methodology of the Canadian Labour Force Survey 2008, Minister of Industry, Ottawa.

261

MeleroMontes, M.d.M. & Jick, H. 2001, "Hyperemesis Gravidarum and the Sex of the Offspring", Epidemiology, vol. 12, no. 1, pp. pp. 123124.

Merkouris, Takis 2004, "Combining Independent Regression Estimators from Multiple Surveys", Journal of the American Statistical Association, vol. 99, no. 468, pp. 11311139.

Molina, E.A, T. M. F. Smith & R. A. Sugden 2001, "Modelling Overdispersion for Complex Survey Data", International Statistical Review / Revue Internationale de Statistique, vol. 69, no. 3, pp. 373384.

Morton, S.C. 1999, Combining Surveys From A Metaanalysis Perspective , The RAND Corporation, Statistics Group, Santa Monica, California.

Mosteller, F., Bush, R.R., Green, B.F. & Lindzey, G. 1954, Selected quantitative techniques, and, Addison Wesley, Reading, Mass.

Neyman, J. 1938, "Contribution to the Theory of Sampling Human Populations", Journal of the American Statistical Association, vol. 33, no. 201, pp. 101116.

Naylor D. 1997, "MetaAnalysis and the MetaEpidemiology of Clinical Research: MetaAnalysis Is an Important Contribution to Research and Practice but It's Not a Panacea", BMJ: British Medical Journal, vol. 315, no. 7109, pp. 617619.

Ng, E., Wilkins, R., Gendron, F. & Berthelot, J. 2005, "Dynamics of Immigrants' Health in Canada: Evidence from the National Population Health Survey", .

Nielsen, H.S., Wu, F., Aghai, Z., Steffensen, R., van Halteren, A.G., Spierings, E., Christiansen, O.B., Miklos, D. & Goulmy, E. 2010, "HY antibody titers are increased in unexplained secondary recurrent miscarriage patients and associated with low male : female ratio in subsequent live births", Human reproduction (Oxford, England), vol. 25, no. 11, pp. 27452752.

NLS methodologists 2011, Email discussion on recode, variance and datasets .

Norberg, K. 2005, Discussion of programs and models via email .

Norberg, K. 2004, "Partnership status and the human sex ratio at birth", Proceedings: Biological Sciences, vol. 271, no. 1555, pp. 24032410.

Normand, S.T. 1999, "Meta ‐analysis: formulating, evaluating, combining, and reporting", Statistics in medicine, vol. 18, no. 3, pp. 321359.

O'Muircheartaigh, C. & Pedlow, S. 2002, "Combining Samples vs. Cumulating Cases: A Comparison of Two Weighting Strategies in the NLSY97", , 2002, pp. 2557.

Paarlberg, K.M., Vingerhoets, A.J., Passchier, J., Dekker, G.A. & Van Geijn, H.P. 1995, "Psychosocial factors and pregnancy outcome: a review with emphasis on methodological issues", Journal of psychosomatic research, vol. 39, no. 5, pp. 563595.

Park, I. & Lee, H. 2004, "Design Effects for the Weighted Mean and Total Estimators Under Complex Survey Sampling", Survey Methodology, vol. 30, no. 2, pp. 183193.

Patterson, H.D. 1950, "Sampling on Successive Occasions with Partial Replacement of Units", Journal of the Royal Statistical Society.Series B (Methodological), vol. 12, no. 2, pp. 241255. 262

Pearson, K. 1933, "On a Method of Determining whether a Sample of Size n Supposed to have been Drawn from a Parent Population having known probability integral has probably been drawn at random ", Biometrika, vol. 25, no. 34, pp. 379410.

Pergamit, M.R., Pierret, C.R., Rothstein, D.S. & Veum, J.R. 2001, "Data Watch: The National Longitudinal Surveys", Journal of Economic Perspectives, vol. 15, no. 2, pp. 239253.

Peto, R. 1987, "Why do we need systematic overviews of randomized trials?", Statistics in medicine, vol. 6, no. 3, pp. 233.

Pfeffermann, D 1993, "The Role of Sampling Weights When Modeling Survey Data", International Statistical Review / Revue Internationale de Statistique, vol. 61, no. 2, pp. 317337.

Pfeffermann, D. 1996, "The use of sampling weights for survey data analysis", Statistical methods in medical research, vol. 5, no. 3, pp. 239261.

Pfeffermann, D 1988, "The Effect of Sampling Design and Response Mechanism on Multivariate RegressionBased Predictors", Journal of the American Statistical Association, vol. 83, no. 403, pp. 824833.

Pfeffermann, D. 2002, "Small Area Estimation: New Developments and Directions", International Statistical Review / Revue Internationale de Statistique, vol. 70, no. 1, pp. 125143.

Pfeffermann, D. & Sverchkov, M. 2007, "Smallarea estimation under informative probability sampling of areas and within the selected areas", JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, vol. 102, no. 480, pp. 14271439.

Plewis, I. 2007, Predictions, Explanations and Causal Effects from Longitudinal Data Institute of Education Pubulications, University of London.

Quenouille, M.H. 1949, "Approximate Tests of Correlation in TimeSeries", Journal of the Royal Statistical Society.Series B (Methodological), vol. 11, no. 1, pp. 6884.

Rao, J.N.K, Hartley, H. O. & Cochran, W. G. 1962, "On a Simple Procedure of Unequal Probability Sampling without Replacement", Journal of the Royal Statistical Society.Series B (Methodological), vol. 24, no. 2, pp. 482491.

Rao, J. N. K. & Scott, A 1981, "The Analysis of Categorical Data From Complex Sample Surveys: ChiSquared Tests for and Independence in TwoWay Tables", Journal of the American Statistical Association, vol. 76, no. 374, pp. 221230.

Rao, J.N.K. & Scott, A.J. 1984, "On ChiSquared Tests for Multiway Contingency Tables with Cell Proportions Estimated from Survey Data", The Annals of Statistics, vol. 12, no. 1, pp. pp. 4660.

Rao, J.N.K. & Scott, A.J. 1987, "On Simple Adjustments to ChiSquare Tests with Sample Survey Data", The Annals of Statistics, vol. 15, no. 1, pp. pp. 385397.

Rao, J.N.K. 2003, Small Area Estimation, Wiley, New York.

Rao, J.N.K 2011 Personal Communication

263

Rao, S.R., Zaslavsky, A.M., Louis, T.A., Graubard, B.I., Schmid, C.H., Morton, S.C. & Finkelstein, D.M. 2008, "Metaanalysis of survey data: application to health services research", Health Services and Outcomes Research Methodology, vol. 8, no. 2, pp. 98114.

Renssen, Robbert H. & Nieuwenbroek, Nico J. 1997, "Aligning Estimates for Common Variables in Two or More Sample Surveys", Journal of the American Statistical Association, vol. 92, no. 437, pp. 368374.

Roberts, G. 2011, Discussion on Variance Calculations .

Roberts, G. & Binder, D. 2009, "Analyses Based on Combining Similar Information from Multiple Surveys", .

Roberts, G., Rao, J.N.K. & Kumar, S. 1987, "Logistic Regression Analysis of Sample Survey Data", Biometrika, vol. 74, no. 1, pp. 112.

Robins, J., Breslow, N. & Greenland, S. 1986, "Estimators of the MantelHaenszel variance consistent in both sparse data and largestrata limiting models", Biometrics, vol. 42, no. 2, pp. 311323.

Robins, J. & Greenland, S. 1989, "The probability of causation under a stochastic model for individual risk", Biometrics, vol. 45, no. 4, pp. 11251138.

Rogers, I., Emmett, P., Baker, D., Golding, J. & ALSPAC Study Team 1998, "Financial difficulties, smoking habits, composition of the diet and birthweight in a population of pregnant women in the South West of England", European journal of clinical nutrition, vol. 52, no. 4, pp. 251260.

Rosenthal, R. & Rubin, D.B. 1982, "A simple, general purpose display of magnitude of experimental effect", Journal of , vol. 74, no. 2, pp. 166.

Royall, Richard M. 1970, "Finite Population SamplingOn Labels in Estimation", The Annals of Mathematical Statistics, vol. 41, no. 5, pp. 17741779.

Royall R. & Herson, J. 1973a, "Robust Estimation in Finite Populations I", Journal of the American Statistical Association, vol. 68, no. 344, pp. 880889.

Royall R. & Herson, J 1973b, "Robust Estimation in Finite Populations II: Stratification on a Size Variable", Journal of the American Statistical Association, vol. 68, no. 344, pp. 890893.

Royall, R.M. 1992, "The Model Based (Prediction) Approach to Finite Population Sampling Theory", Lecture NotesMonograph Series, vol. 17, no. , Current Issues in Statistical Inference: Essays in Honor of D. Basu, pp. pp. 225240.

Rubin, D. 1976, "Inference and Missing Data", Biometrika, vol. 63, no. 3, pp. 581592.

Rubin, D.B. 2004, "Teaching Statistical Inference for Causal Effects in Experiments and Observational Studies", Journal of Educational and Behavioral Statistics, vol. 29, no. 3, pp. 343367.

RubinBleuer, S & Schiopu Kratina, I 2005, "On the TwoPhase Framework for Joint Model and DesignBased Inference", The Annals of Statistics, vol. 33, no. 6, pp. 27892810.

264

Sarkar, S. 1984 Dec, "Motility, expression of surface antigen, and X and Y human sperm separation in in vitro fertilization medium", FERTILITY AND STERILITY, vol. 42, no. 6, pp. 899905.

Sarndal, C.E. 1980, "On πInverse Weighting Versus Best Linear Unbiased Weighting in Probability Sampling", Biometrika, vol. 67, no. 3, pp. pp. 639650.

Sarndal, C.E., Swensson, B. & Wretman, J. 1992, Model Assisted Survey Sampling, SpringerVerlag, New York.

Schenker, N., Gentleman, J., Rose, D., Hing, E. & Shimizu, I.M. 2002, "Combining Estimates from Complementary Surveys: A Using Prevalence Estimates from National Health Surveys of Households and Nursing Homes", Public Health Reports (1974), vol. 117, no. 4, pp. 393407.

Schenker, N. & Raghunathan, T.E. 2007, "Combining information from multiple surveys to enhance estimation of measures of health", Statistics in medicine, vol. 26, no. 8, pp. 18021811.

Scott, A.J. 2007, "RaoScott Corrections and their Impact", , 2007, pp. 3514.

Sharon L. Lohr & Rao, J.N.K. 2000, "Inference from Dual Frame Surveys", Journal of the American Statistical Association, vol. 95, no. 449, pp. pp. 271280.

Shields, M. 2004, "Proxy reporting of health information", Health reports / Statistics Canada, Canadian Centre for Health Information = Rapports sur la santé / Statistique Canada, Centre canadien d'information sur la santé, vol. 15, no. 3, pp. 21.

Skinner, C.J. & Rao, J.N.K. 1996, "Estimation in Dual Frame Surveys With Complex Designs", Journal of the American Statistical Association, vol. 91, no. 433, pp. pp. 349356.

Smith, G.D., Egger, M. & Phillips, A.N. 1997, "Metaanalysis. Beyond the grand mean?", BMJ (Clinical research ed.), vol. 315, no. 7122, pp. 16101614.

Spanier, P.A., Marshall, S.J. & Faulkner, G.E. 2006, "Tackling the obesity pandemic: a call for sedentary behaviour research", Canadian journal of public health.Revue canadienne de santé publique, vol. 97, no. 3, pp. 255257.

Special Surveys Division 2009, National Longitudinal Survey of Children and Youth – Survey Overview, Cycle 8.

Stroup, D.F., Berlin, J.A., Morton, S.C., Olkin, I., Williamson, G.D., Rennie, D., Moher, D., Becker, B.J., Sipe, T.A., Thacker, S.B., for the Metaanalysis Of Observational Studies in Epidemiology Group & MOOSE Grp 2000, "Metaanalysis of Observational Studies in Epidemiology: A Proposal for Reporting", JAMA: The Journal of the American Medical Association, vol. 283, no. 15, pp. 20082012.

Stukel, T.A., Demidenko, E., Karagas, M.R. & Dykes, J. 2001, "Two ‐stage methods for the analysis of pooled data", Statistics in medicine, vol. 20, no. 14, pp. 21152130.

Sugden, R.A. & Smith, T.M.F. 1984, "Ignorable and Informative Designs in Survey Sampling Inference", Biometrika, vol. 71, no. 3, pp. 495506.

265

Sullivan, P.F., Neale, M.C. & Kendler, K.S. 2000, "Genetic epidemiology of major depression: review and metaanalysis", The American Journal of Psychiatry, vol. 157, no. 10, pp. 15521562.

Teitelbaum, M.S. 1971, "Coital frequency and sex ratio", Lancet, vol. 1, no. 7703, pp. 800.

Thomas, S. & Wannell, B. 2009, "Combining cycles of the Canadian Community Health Survey", Health reports / Statistics Canada, Canadian Centre for Health Information = Rapports sur la santé / Statistique Canada, Centre canadien d'information sur la santé, vol. 20, no. 1, pp. 53.

Thompson M.E. & Godambe V. 1971, "Bayes, Fiducial and Frequency Aspects of Statistical Inference in Regression Analysis in SurveySampling", Journal of the Royal Statistical Society.Series B (Methodological), vol. 33, no. 3, pp. 361390.

Thompson, M.E. 1997, Theory of Sample Surveys, Chapman & Hall, London.

Thompson, S.G. & Higgins, J.P.T. 2002, "How should metaregression analyses be undertaken and interpreted?", Statistics in medicine, vol. 21, no. 11, pp. 15591573.

Trivers, R.L. & Willard, D.E. 1973, "Natural selection of parental ability to vary the sex ratio of offspring", Science (New York, N.Y.), vol. 179, no. 68, pp. 9092.

Tudur Smith, C. & Williamson, P.R. 2007, "A comparison of methods for fixed effects metaanalysis of individual patient data with time to event outcomes", Clinical Trials}, vol. 4, no. 6, pp. 621 630.

Tukey, J.W. 1958, "Bias and confidence in not quite large samples", Annals of Mathematical Statistics, vol. 29, pp. 614.

Turner, R.M., Thompson, S.G. & Warn, D.E. 2001, "Multilevel models for metaanalysis, and their application to absolute risk differences", Statistical methods in medical research, vol. 10, no. 6, pp. 375392.

Ulizzi, L., Astolfi, P. & Zonta, L.A. 2001, "Sex ratio at reproductive age: changes over the last century in the Italian population", Human biology; an international record of research, vol. 73, no. 1, pp. 121128.

Ulizzi, L. & Zonta, L.A. 1994, "Sex ratio and selection by early mortality in humans: fiftyyear analysis in different ethnic groups", Human biology; an international record of research, vol. 66, no. 6, pp. 1037.

Wadley, R.L. & Martin, J.F. 1997, "On Secondary Sex Ratios and Coital Frequency with an Iban Case", Current anthropology, vol. 38, no. 1, pp. 7981.

Whitehead, A. & Whitehead, J. 1991, "A general parametric approach to the metaanalysis of randomized clinical trials", Statistics in medicine, vol. 10, no. 11, pp. 1665.

Whitehead, A., Omar, R.Z., Thompson, S.G., Higgins, J.P.T., Turner, R.M. & Savaluny, E. 2001, "Meta ‐analysis of ordinal outcomes using individual patient data", Statistics in medicine, vol. 20, no. 15, pp. 22432260.

Willis, G.B. 2005, "Cognitive interviewing: a tool for improving questionnaire design", .

266

Winer, B.J. 1971, Statistical principles in experimental design, McGrawHill, New York.

Wu, C. 2004, "Combining information from multiple surveys through the empirical likelihood method", Canadian Journal of Statistics, vol. 32, no. 1, pp. 1526.

Yates, F. 1960, Sampling methods for censuses and surveys, C. Griffin, London.

Yon, Y. "A comparison of Intimate Partner Violence in MidandOld Age: Is Elder Abuse Simply a Case of Spousal Abuse Grown Old?", Unpublished Masters Thesis, Department of Gerentology, Simon Fraser University, Vancouver.

Zieschang, Kimberly D. 1990, "Sample Weighting Methods and Estimation of Totals in the Consumer Expenditure Survey", Journal of the American Statistical Association, vol. 85, no. 412, pp. 9861001.

Zhaohai Li & Colin B. Begg 1994, "Random Effects Models for Combining Results from Controlled and Uncontrolled Studies in a MetaAnalysis", Journal of the American Statistical Association, vol. 89, no. 428, pp. 15231527.

Zorn, B., Sucur, V., Stare, J. & MedenVrtovec, H. 2002, "Decline in sex ratio at birth after 10day war in Slovenia: brief communication", Human reproduction (Oxford, England), vol. 17, no. 12, pp. 31733177.

267

Appendix A Simulation Programs

File Name: Program_Caller.sas *options source2; %if &f=&finpops %then %do; * Update date - December 5, 2010; %include * %include "&dirname.\Section_1.sas"; "&dirname.\Section_SRS_case2nextb * Just a script that calls the it.sas"; various parts of the program; *options mprint mlogic symbolgen * %include * This script can be considered source2; "&dirname.\Section_Strat_case2nex section 0; tbit.sas"; %macro looper; * %include *Proc printto "&dirname.\Section_Cluster_case2n log="C:\temp\logfile.txt"; %do f=1 %to &finpops; extbit.sas"; *Run; * %include *options nonotes nosource; %do i=1 %to &loops; "&dirname.\Section_5.sas"; %include Proc printto log=log; Run; "&dirname.\Section_2.sas"; %end; %end; %end; %include %mend; data timer; "&dirname.\Section_3.sas"; day1=date(); * %include %looper; time1=time(); "&dirname.\Section_3_case2.sas"; run; * %include data timer2; "&dirname.\Section_4srs.sas"; set timer; * Global Parameters - just one; * %include seconds_in_day=24*60*60; %global dirname; "&dirname.\Section_4srs_case2.sas day2=date(); %let dirname=C:\Documents and "; time2=time(); Settings\foxkarl\Desktop\SAS * %include if day2=day1 then PROGRAMS REDONE\Current "&dirname.\Section_4SPstrata_case time_taken_in_seconds=time2- Version\Simulations Nov15 Week; 2nextbit.sas"; time1; * %include else do; *FINPOPS: number of finite "&dirname.\Section_4SPCluster_cas populations required; e2.sas"; time_taken_in_seconds=seconds_in_ %let FINPOPS=1; * %include day-time1 + (day2-day1- "&dirname.\Section_4Asrs.sas"; 1)*seconds_in_day + time2; * LOOPS should be set to 1 if you * %include * untested for multiple want one graph with however many "&dirname.\Section_4strata.sas"; days/across midnight runs but replicates. ; * %include basically: * LOOPS should be set to >1 if "&dirname.\Section_4SPstrata.sas" time until midnight of first you want the outlier loop run, ; day + seconds for whole days + the average relative bias and all * %include seconds from midnight to end that goodness run; "&dirname.\Section_4Astrata.sas"; of running on last day; *last run was 500 loops; * %include %let LOOPS=1; "&dirname.\Section_4cluster.sas"; end; * %include * Also, section 1 contains a "&dirname.\Section_4Acluster.sas" put "This program took : " bunch of specific parameters; ; time_taken_in_seconds 20.0 " * %include seconds to run."; * Assumptions: "&dirname.\Section_4SPcluster.sas run; &dirname.\..\datasets exists; "; * * %include &dirname.\..\output also "&dirname.\Section_4srs_superpopv File Name: Section_1.sas exists; ar.sas"; * Not verified; * %include * Section 1; "&dirname.\datasetfinal.sas"; * Call sections of code; * %include * Parameters and libraries; "&dirname.\ESL.sas"; 268

*user requirements * define a random seed and put it if ranuni(0) > .25 into a macro variable; then do; VARSUSED : Number of Variables Used: - this will be set to 4. data _NULL_; X&V.=int(10000+ranexp(&SEED.)*250 BIGN: N - size of the seed1=1; 00); superpopulation * define seed1 as any end; LITTLEN: n - size for the particular integer if repeatable else do; replicate samples results are desired; X&V.=0; R: number of replicates end; STRATA: number of strata desired, call %end; mostly for development symput("SEED",compress(put(seed1, %else %if &V=2 %then 5.0))); run; %do; SIGMA: (not sigma squared, for * age - assume , but maybe it should * libraries; usually survey requirements of 15 be, for clarity); up?; *libname outlib %let VARSUSED=4; "&dirname.\datasets"; X&V.=int(15+85*ranuni(&SEED.)); * %let BIGN=100000; * low for a flat distribution that does not development purposes; accurately reflect our %let LITTLEN=500; * low for File Name: Section_2.sas population; development purposes; %end; %let R=5; * Section 2; %else %if &V=3 %then %do; %let ERRORBAR=.05; * Start creating datasets; * gender.; * errorbar defined as percent X&V.=int(&PI + lines you want to see - .01 is 1 %macro section2; ranuni(&SEED.)); * Should be 0/1; percent, .1 is 10 percent, %end; etc...; * needed %else %if &V=4 %then * same value for want to see how Y %do; often convergent value is outside B * household size; of this range; Strata X&V.=int(1+ranexp(&SEED.)); %let SPMEAN=1; X1-X&VARSUSED; %end; %let ALTMEAN=1.5; %end; %let SPVAR=5; * Then x1 to x&VARSUSED; output; %let STRATA=8; * update seed each time to be a end; * do i=1 to bign loop; different stream; run; * need &STRATA numbers in brackets - sampling percentages %let SEED=%eval(&SEED+&f.); %mend; in internal order of strata; * internal order = order after data finitepopulation_&f.(keep=y %section2; running a basic proc sort on b strata x1-x&VARSUSED); strata; length strata $3. y 8.0 b 3.0; %let B1=.08; %let STRATASAMPRATE=(0.01 do i=1 to &BIGN; %let B2=.60; 0.005 %let B3=4; 0.004 * percent do loop below %let B4=2; 0.001 awkwardly written, but designed 0.001 to be easily extendable (or data finitepopulation_&f.; 0.004 retractable if < 4 desired); set finitepopulation_&f.; 0.005 0.01 %do V=1 %to &VARSUSED; ); %if &V=1 %then %do; y1=X1*&B1.+X2*&B2.+X3*&B3.+X4*&B4 * income; .+&SIGMA.*rannor(&SEED.); %let SIGMA=400; * Sigma for * 25% chance of rand = rannor(&SEED.); error terms; zero; y=&SPMEAN.+&SPVAR.*rand; %let PI=0.75; * PI for the * 75% chance of y_a = &altmean. + &Spvar.*rand; binomial variable B; exponential dist from 10000 on run; up; * Define strata; 269

set clusterFrame; call execute("proc delete proc sort keep strata; run; data=samples_strata_&f.; run;"); data=finitepopulation_&f.; end; by y; run; proc surveyselect data if =clusterFrame method=srs n=2 exist("work.samples_cluster_&f.") data finitepopulation_&f.; seed = &seed. then do; set finitepopulation_&f.; out=samples_cluster rep=&r. call execute("proc delete noprint; run; data= samples_cluster_&f.; run"); strata=1+(int(&STRATA*_N_/&BIGN)) end; run; ; data samples_cluster ; * correction for last record; set samples_cluster; * So, SRS samples; if _N_=&BIGN then samplingweight = 2/8; run; strata=&STRATA.; run; proc surveyselect proc sql; data=finitepopulation_&f. create table samples_cluster2 method=srs seed = &seed. File Name: Section_3.sas as out=samples_srs_&f. rep=&R. select samples_cluster.*, n=&LITTLEN stats noprint; run; * This section will take the finitepopulation_&f..* finitepopulation, and select from samples_cluster inner *Cluster; three different sets of join finitepopulation_&f. replicates; on proc freq data = * Based on SRS - Stratified - samples_cluster.strata=finitepopu finitepopulation_&f. noprint; cluster; lation_&f..strata tables strata order by replicate, /out=clusterFrame_&f.; run; * ensure nothing left in memory; strata; quit; data clusterFrame_&f.; data _NULL_; set clusterFrame_&F.; if exist("work.samples_srs") * Stratified samples; keep strata; run; then do; call execute("proc delete proc surveyselect proc surveyselect data data=samples_srs; run;"); data=finitepopulation_&f. =clusterFrame_&f. method=srs n=2 end; method=srs out=samples_strata seed = &seed. if exist("work.samples_strata") seed=&seed. rep=&R out=samples_cluster_&f. rep=&r. then do; SAMPRATE=&STRATASAMPRATE. stats noprint; run; call execute("proc delete noprint; data=samples_strata; run;"); strata strata; run; data samples_cluster_&f. ; end; set samples_cluster_&f.; if samplingweight = 2/8; run; exist("work.samples_cluster") File Name: Section_3_case2.sas then do; proc sql; call execute("proc delete * This section will take the create table data= samples_cluster; run"); finitepopulation, and select samples_cluster2_&f. as end; run; three different sets of select samples_cluster_&f..*, replicates; finitepopulation_&f..* * So, SRS samples; * Based on SRS - Stratified - from samples_cluster_&f. cluster; inner join finitepopulation_&f. proc surveyselect on data=finitepopulation_&f. * ensure nothing left in memory; samples_cluster_&f..strata=finite method=srs seed = &seed. population_&f..strata out=samples_srs rep=&R n=&LITTLEN data _NULL_; order by replicate, stats noprint; run; if strata; exist("work.samples_srs_&f.") quit; *Cluster; then do; call execute("proc delete * Stratified samples; proc freq data = data=samples_srs_&f.; run;"); proc surveyselect finitepopulation_&f. noprint; end; data=finitepopulation_&f. tables strata /out=clusterFrame; if method=srs run; exist("work.samples_strata_&f.") out=samples_strata_&F. then do; seed=&seed. rep=&R. data clusterFrame; 270

SAMPRATE=&STRATASAMPRATE. stats data a_deffmerge2; by mergevar; noprint; set a_deffcalc2(keep=replicate retain sumweight1 0 sumtotal1 0 strata strata; run; deff); run; sumweight2 0 sumtotal2 0 sumweight3 0 sumtotal3 * and merge deff into 0 sumweight4 0 sumtotal4 0; File Name: Section_4Acluster.sas cluster_stats; sumweight1+weight1; sumtotal1+weight1*mean; proc sort data=samples_cluster2; data a_cluster_stats; sumweight2+weight2; by replicate strata; run; merge a_cluster_stats sumtotal2+weight2*mean; a_deffmerge2; sumweight3+weight3; ods listing close; * kind of like id = _n_; sumtotal3+weight3*mean; a noprint for surveymeans - a new by replicate; run; sumweight4+weight4; procedure under the new model; sumtotal4+weight4*mean; data a2_cluster_stats; proc surveymeans set a_cluster_stats; x=_N_; * for plotting on ye data=samples_cluster2 sum varsum where id > &r.*3/4; run; olde graphe; mean var n=8 ; cum1=sumtotal1/sumweight1; by replicate; data cluster2_stats; cum2=sumtotal2/sumweight2; var y_a; set cluster_stats; cum3=sumtotal3/sumweight3; cluster strata; where id > &r.*3/4; run; cum4=sumtotal4/sumweight4; weight samplingweight; ods output data a_cluster_stats; *superpopulation ; statistics=a_cluster_stats; run; set a2_cluster_stats cluster2_stats; run; popmean =&SPMEAN.; * calculate variance for stratified design (above) and srs * Add weights to the various poptotalupper=popmean*(1+&ERRORBA (below); samples; R); * Weight 1 = sample size - a poptotallower=popmean*(1- proc surveymeans constant in our replicates; &ERRORBAR); data=samples_cluster2 sum varsum * Weight 2 = inverse of variance; mean var; * weight 3 = design effect * var; * relative bias - abs(estimate- by replicate; * weight 4 = 1; true)/true; var y_a; * true defined as ods output data a_cluster_stats; finitepopulation_&f. value; statistics=a_cluster_deff_base; set a_cluster_stats; run; weight1=1/2; relbias_fp=abs(cum1 - weight2=1/var; fpmean)/fpmean*100; ods listing; weight3=deff*(1/var); sumbias_fp+relbias_fp; weight4 = 1; *avgrelbias_fp=sumbias_fp/_N_; *rename; mergevar='1'; run; data a_cluster_deff_base; relbias_sp=abs(cum1- set a_cluster_deff_base; * Generate an finitepopulation &SPMEAN.)/&SPMEAN.*100; varbase = var; total; sumbias_sp+relbias_sp; keep replicate varbase; run; *avgrelbias_sp=sumbias_sp/_N_; proc summary df = x-1; run; * merge into one; data=finitepopulation_&f.; var y_a; data a_cluster_cumulative_&f.; data a_deffcalc2; output set a_cluster_cumulative_&f.; merge a_cluster_stats out=a_fptotal(drop=_TYPE_ _FREQ_) order=_N_; run; a_cluster_deff_base; sum=fptotal mean=fpmean; run; by replicate; run; proc sort data a_fptotal; data=a_cluster_cumulative_&f.; * deff calc simplified; set a_fptotal; by descending order; run; mergevar='1'; run; data a_deffcalc2; * Then carry the value forward; set a_deffcalc2; data deff=var/varbase; run; a_cluster_cumulative_&f.(drop=mer data a_cluster3_&f.; gevar); set a_cluster_cumulative_&f.; * keep deff; merge a_cluster_stats retain sum1 sum2 sum3 sum4 ; a_fptotal; if _N_=1 then do; sum1=cum1 ; 271

sum2=cum2; pv_2 = 1-chi_2; %if &i=1 %then %do; sum3=cum3; pv_3 = 1-chi_3; sum4=cum4; pv_4 = 1-chi_4; data a_cluster_metric_&f.; end; set h=1; run; reject_1=(pv_1 <= 0.05); a_cluster_runningtotal2; run; reject_2=(pv_2 <= 0.05); data a_cluster_pv_&f.; proc sort data = a_cluster3_&f.; reject_3=(pv_3 <= 0.05); set by order; run; reject_4=(pv_4 <= 0.05); a_cluster4_&f.(where=(x=&R./2) keep=x replicate reject: proc univariate data = sreject:); run; a_cluster3_&f. noprint; if df ge 1 then schi_1 = var deff ; cdf('CHISQUARE',sumQs1,df); else %end; output out=a_deff2 max=max; run; schi_1 = .; %else %do; if df ge 1 then schi_2 = data a_deff2; cdf('CHISQUARE',sumQs2,df); else proc append set a_deff2; schi_2 = .; base=a_cluster_metric_&f. h = 1; run; if df ge 1 then schi_3 = data=a_cluster_runningtotal2; cdf('CHISQUARE',sumQs3,df); else run; data a_cluster3_&f.; schi_3 = .; proc append merge a_cluster3_&f. a_deff2; if df ge 1 then schi_4 = base=a_cluster_pv_&f. by h; cdf('CHISQUARE',sumQs4,df); else data=a_cluster4_&f.(where=(x=&R./ *Q reg statistic; schi_4 = .; 2) keep=x replicate reject: Q1 = weight1*((mean-sum1)**2); sreject:); run; Q2 = weight2*((mean-sum2)**2); spv_1 = 1-schi_1; Q3 = weight3*((mean-sum3)**2); spv_2 = 1-schi_2; %end; Q4 = weight4*((mean-sum4)**2); spv_3 = 1-schi_3; QS1 = Q1/max; spv_4 = 1-schi_4; %mend; QS2 = Q2/max; QS3 = Q3/max; sreject_1=(spv_1 <= 0.05); %tsnnam; QS4 = Q4/max; sreject_2=(spv_2 <= 0.05); * set running total to keep track H=1; run; sreject_3=(spv_3 <= 0.05); how often we are outside ranges sreject_4=(spv_4 <= 0.05); of somethingorother; data a_cluster4_&f.; run; set a_cluster3_&f.; by H; retain sumQ1 0 sumQ2 0 sumQ3 0 data a_cluster_runningtotal1; File Name: Section_4Asrs.sas sumQ4 0 sumQS1 0 set a_cluster_cumulative_&f.; sumQS2 0 sumQS3 0 sumQS4 0; outsideflag1=ifn(cum1 < * Run surveymeans on the SRS sumQ1 +Q1; poptotallower or cum1 > sample; sumQ2 +Q2; poptotalupper,1,0); sumQ3 +Q3; outsideflag2=ifn(cum2 < ods listing close; * kind of like sumQ4 +Q4; poptotallower or cum2 > a noprint for surveymeans - a new sumQS1+QS1; poptotalupper,1,0); procedure under the new model; sumQS2+QS2; outsideflag3=ifn(cum3 < sumQS3+Qs3; poptotallower or cum3 > proc surveymeans data=samples_srs sumQs4+Qs4; poptotalupper,1,0); sum varsum mean var ; outsideflag4=ifn(cum4 < by replicate; if df ge 1 then chi_1 = poptotallower or cum4 > var y_a; cdf('CHISQUARE',sumQ1,df); else poptotalupper,1,0); run; weight samplingweight; chi_1 = .; ods output if df ge 1 then chi_2 = proc =a_srs_stats; run; cdf('CHISQUARE',sumQ2,df); else data=a_cluster_runningtotal1; chi_2 = .; var outsideflag1 outsideflag2 ods listing; if df ge 1 then chi_3 = outsideflag3 outsideflag4; cdf('CHISQUARE',sumQ3,df); else output * Add weights to the various chi_3 = .; out=a_cluster_runningtotal2(drop= samples; if df ge 1 then chi_4 = _TYPE_ _FREQ_) sum=outside_1 * Weight 1 = sample size - a cdf('CHISQUARE',sumQ4,df); else outside_2 outside_3 outside_4; constant in our replicates; chi_4 = .; run; * Weight 2 = inverse of variance; * weight 3 = design effect - for pv_1 = 1-chi_1; %macro tsnnam; srs = 1; 272

* weight 4 = 1; cum3=sumtotal3/sumweight3; cum4=sumtotal4/sumweight4; data a_srs4_&f.; data a_srs_stats; *superpopulation ; set a_srs3_&f.; by H; set a_srs_stats; popmean =&SPMEAN.; retain sumQ1 0 sumQ2 0 sumQ3 0 deff = 1; sumQ4 0 sumQS1 0 weight1=1/&LITTLEN; poptotalupper=popmean*(1+&ERRORBA sumQS2 0 sumQS3 0 sumQS4 0; weight2=1/var; R); sumQ1 +Q1; weight3=1/var; poptotallower=popmean*(1- sumQ2 +Q2; weight4 = 1; &ERRORBAR); sumQ3 +Q3; mergevar='1'; sumQ4 +Q4; id = _n_; run; * relative bias - abs(estimate- sumQS1+QS1; * Generate an finitepopulation true)/true; sumQS2+QS2; total; * true defined as sumQS3+Qs3; finitepopulation_&f. value; sumQs4+Qs4; proc summary data=finitepopulation_&f.; relbias_fp=abs(cum1 - if df ge 1 then chi_1 = var y_a; fpmean)/fpmean*100; cdf('CHISQUARE',sumQ1,df); else output sumbias_fp+relbias_fp; chi_1 = .; out=a_fptotal(drop=_TYPE_ _FREQ_) *avgrelbias_fp=sumbias_fp/_N_; if df ge 1 then chi_2 = sum=fptotal mean=fpmean; run; cdf('CHISQUARE',sumQ2,df); else relbias_sp=abs(cum1- chi_2 = .; data a_fptotal; &SPMEAN.)/&SPMEAN.*100; if df ge 1 then chi_3 = set a_fptotal; sumbias_sp+relbias_sp; cdf('CHISQUARE',sumQ3,df); else mergevar='1'; run; *avgrelbias_sp=sumbias_sp/_N_; chi_3 = .; df = x-1; run; if df ge 1 then chi_4 = data a2_srs_stats; cdf('CHISQUARE',sumQ4,df); else set a_srs_stats; data a_srs_cumulative_&f.; chi_4 = .; where id > &r.*3/4; run; set a_srs_cumulative_&f.; order=_N_; run; pv_1 = 1-chi_1; data srs2_stats; pv_2 = 1-chi_2; set srs_stats; proc sort pv_3 = 1-chi_3; where id > &r.*3/4; run; data=a_srs_cumulative_&f.; pv_4 = 1-chi_4; by descending order; run; data a_srs_stats; reject_1=(pv_1 <= 0.05); set a2_srs_stats srs2_stats; * Then carry the value forward; reject_2=(pv_2 <= 0.05); run; reject_3=(pv_3 <= 0.05); data a_srs3_&f.; reject_4=(pv_4 <= 0.05); data set a_srs_cumulative_&f.; a_srs_cumulative_&f.(drop=mergeva retain sum1 sum2 sum3 sum4 ; if df ge 1 then schi_1 = r); if _N_=1 then do; sum1=cum1 ; cdf('CHISQUARE',sumQs1,df); else merge a_srs_stats fptotal; sum2=cum2; schi_1 = .; by mergevar; sum3=cum3; if df ge 1 then schi_2 = retain sumweight1 0 sumtotal1 0 sum4=cum4; cdf('CHISQUARE',sumQs2,df); else sumweight2 0 sumtotal2 0 end; run; schi_2 = .; sumweight3 0 sumtotal3 if df ge 1 then schi_3 = 0 sumweight4 0 sumtotal4 0; proc sort data = a_srs3_&f.; cdf('CHISQUARE',sumQs3,df); else by order; run; schi_3 = .; sumweight1+weight1; if df ge 1 then schi_4 = sumtotal1+weight1*mean; data a_srs3_&f.; cdf('CHISQUARE',sumQs4,df); else sumweight2+weight2; set a_srs3_&f.; schi_4 = .; sumtotal2+weight2*mean; *Q reg statistic; sumweight3+weight3; Q1 = weight1*((mean-sum1)**2); spv_1 = 1-schi_1; sumtotal3+weight3*mean; Q2 = weight2*((mean-sum2)**2); spv_2 = 1-schi_2; sumweight4+weight4; Q3 = weight3*((mean-sum3)**2); spv_3 = 1-schi_3; sumtotal4+weight4*mean; Q4 = weight4*((mean-sum4)**2); spv_4 = 1-schi_4; QS1 = Q1/(1-&littlen./&bigN.); x=_N_; * for plotting on ye QS2 = Q2/(1-&littlen./&bigN.); sreject_1=(spv_1 <= 0.05); olde graphe; QS3 = Q3/(1-&littlen./&bigN.); sreject_2=(spv_2 <= 0.05); cum1=sumtotal1/sumweight1; QS4 = Q4/(1-&littlen./&bigN.); sreject_3=(spv_3 <= 0.05); cum2=sumtotal2/sumweight2; H=1; run; 273

sreject_4=(spv_4 <= 0.05); proc summary data = a_srs_pv_&f.; data a_deffcalc; run; var reject: sreject: ; merge a_strat_stats output out=a_srs_pv_2(drop=_TYPE_ a_strat_deff_base; data a_srs_runningtotal1; _FREQ_) sum=reject1-reject4 by replicate; run; set a_srs_cumulative_&f.; sreject1-sreject4; run; * deff calc simplified; outsideflag1=ifn(cum1 < poptotallower or cum1 > proc summary data = data a_deffcalc; poptotalupper,1,0); a_srs_metric_&f.; set a_deffcalc; outsideflag2=ifn(cum2 < var outside_1-outside_4; deff=var/varbase; run; poptotallower or cum2 > output poptotalupper,1,0); out=a_srs_averagetimesout * keep deff; outsideflag3=ifn(cum3 < (drop=_TYPE_ _FREQ_) mean= poptotallower or cum3 > outside1-outside4; run; data a_deffmerge; poptotalupper,1,0); set a_deffcalc(keep=replicate outsideflag4=ifn(cum4 < deff); run; poptotallower or cum4 > File Name: Section_4Astrata.sas poptotalupper,1,0); run; * and merge deff into * Run surveymeans on the various strat_stats; proc summary samples; data=a_srs_runningtotal1; data a_strat_stats; var outsideflag1 outsideflag2 proc sort data=samples_strata; merge a_strat_stats outsideflag3 outsideflag4; by replicate strata; run; a_deffmerge; output by replicate; out=a_srs_runningtotal2(drop=_TYP ods listing close; * kind of like id = _n_; run; E_ _FREQ_) sum=outside_1 a noprint for surveymeans - a new outside_2 outside_3 procedure under the new model; data a2_strat_stats; outside_4; run; set a_strat_stats; proc surveymeans where id > &r.*3/4; run; %macro tsnnam; data=samples_strata sum varsum %if &i=1 %then %do; mean var ; data strat2_stats; by replicate; set strat_stats; data a_srs_metric_&f.; strata strata; where id > &r.*3/4; run; set a_srs_runningtotal2; var y; run; weight samplingweight; data a_strat_stats; data a_srs_pv_&f.; ods output set a2_strat_stats strat2_stats; set statistics=a_strat_stats; run; run; a_srs4_&f(where=(order=&R./2) keep=order reject: sreject:); * calculate variance for * Add weights to the various run; stratified design (above) and srs samples; %end; (below); * Weight 1 = sample size - a %else %do; constant in our replicates; proc surveymeans * Weight 2 = inverse of variance; proc append data=samples_strata sum varsum * Weight 3 = design effect ; base=a_srs_metric_&f. mean var; * Weight 4 = 1 ; data=a_srs_runningtotal2; run; by replicate; proc append base=a_srs_pv_&f. var y; data a_strat_stats; data=a_srs4_&f.(where=(order=&R./ ods output set a_strat_stats; 2) keep=order reject: sreject:); statistics=a_strat_deff_base; weight1=1/&LITTLEN.; run; run; weight2=1/var; weight3=deff*(1/var); %end; ods listing; weight4=1; mergevar='1'; run; %mend; *rename; data a_strat_deff_base; * Generate an actual total; %tsnnam; set a_strat_deff_base; varbase = var; proc summary * set running total to keep track keep replicate varbase; run; data=finitepopulation_&f.; how often we are outside ranges var y_a; of somethingorother; * merge into one;

274

output if df ge 1 then chi_2 = out=a_fptotal(drop=_TYPE_ _FREQ_) proc sort cdf('CHISQUARE',sumQ2,df); else sum=fptotal mean=fpmean; run; data=a_strat_cumulative_&f.; chi_2 = .; by descending order; run; if df ge 1 then chi_3 = data a_fptotal; cdf('CHISQUARE',sumQ3,df); else set a_fptotal; * Then carry the value forward; chi_3 = .; mergevar='1'; run; if df ge 1 then chi_4 = data a_strat3_&f.; cdf('CHISQUARE',sumQ4,df); else data set a_strat_cumulative_&f.; chi_4 = .; a_strat_cumulative_&f.(drop=merge retain sum1 sum2 sum3 sum4 ; var); if _N_=1 then do; sum1=cum1 ; pv_1 = 1-chi_1; merge a_strat_stats a_fptotal; sum2=cum2; pv_2 = 1-chi_2; by mergevar; sum3=cum3; pv_3 = 1-chi_3; retain sumweight1 0 sumtotal1 0 sum4=cum4; pv_4 = 1-chi_4; sumweight2 0 sumtotal2 0 end; sumweight3 0 sumtotal3 h=1; run; reject_1=(pv_1 <= 0.05); 0 sumweight4 0 sumtotal4 0; reject_2=(pv_2 <= 0.05); proc sort data = a_strat3_&f.; reject_3=(pv_3 <= 0.05); sumweight1+weight1; by order; run; reject_4=(pv_4 <= 0.05); sumtotal1+weight1*mean; sumweight2+weight2; proc univariate data = sumtotal2+weight2*mean; a_deffmerge noprint; if df ge 1 then schi_1 = sumweight3+weight3; var deff ; cdf('CHISQUARE',sumQS1,df); else sumtotal3+weight3*mean; output out=a_deff1 max=max; run; schi_1 = .; sumweight4+weight4; if df ge 1 then schi_2 = sumtotal4+weight4*mean; data a_deff1; cdf('CHISQUARE',sumQS2,df); else set a_deff1; schi_2 = .; x=_N_; * for plotting on ye h = 1; run; if df ge 1 then schi_3 = olde graphe; cdf('CHISQUARE',sumQS3,df); else cum1=sumtotal1/sumweight1; data a_strat3_&f.; schi_3 = .; cum2=sumtotal2/sumweight2; merge a_strat3_&f. a_deff1; by if df ge 1 then schi_4 = cum3=sumtotal3/sumweight3; h; cdf('CHISQUARE',sumQS4,df); else cum4=sumtotal4/sumweight4; *Q reg statistic; schi_4 = .; *superpopulation ; Q1 = weight1*((mean-sum1)**2); popmean =&SPMEAN.; Q2 = weight2*((mean-sum2)**2); spv_1 = 1-schi_1; Q3 = weight3*((mean-sum3)**2); spv_2 = 1-schi_2; poptotalupper=popmean*(1+&ERRORBA Q4 = weight4*((mean-sum4)**2); spv_3 = 1-schi_3; R); QS1 = Q1/max; spv_4 = 1-schi_4; poptotallower=popmean*(1- QS2 = Q2/max; &ERRORBAR); QS3 = Q3/max; sreject_1=(spv_1 <= 0.05); QS4 = Q4/max; sreject_2=(spv_2 <= 0.05); * relative bias - abs(estimate- H=1; run; sreject_3=(spv_3 <= 0.05); true)/true; sreject_4=(spv_4 <= 0.05); * true defined as data a_strat4_&f.; run; finitepopulation_&f. value; set a_strat3_&f.; by H; retain sumQ1 0 sumQ2 0 sumQ3 0 * set running total to keep track relbias_fp=abs(cum1 - sumQ4 0 sumQS1 0 sumQS2 0 sumQS3 how often we are outside ranges fpmean)/fpmean*100; 0 sumQS4 0 ; of somethingorother; sumbias_fp+relbias_fp; sumQ1 +Q1; *avgrelbias_fp=sumbias_fp/_N_; sumQ2 +Q2; data a_strat_runningtotal1; sumQ3 +Q3; set a_strat_cumulative_&f.; relbias_sp=abs(cum1- sumQ4 +Q4; outsideflag1=ifn(cum1 < &SPMEAN.)/&SPMEAN.*100; sumQS1+QS1; poptotallower or cum1 > sumbias_sp+relbias_sp; sumQS2+Qs2; poptotalupper,1,0); *avgrelbias_sp=sumbias_sp/_N_; sumQs3+Qs3; outsideflag2=ifn(cum2 < df = x-1; run; sumQs4+Qs4; poptotallower or cum2 > poptotalupper,1,0); data a_strat_cumulative_&f.; if df ge 1 then chi_1 = outsideflag3=ifn(cum3 < set a_strat_cumulative_&f.; cdf('CHISQUARE',sumQ1,df); else poptotallower or cum3 > order=_N_; run; chi_1 = .; poptotalupper,1,0); 275

outsideflag4=ifn(cum4 < * and merge deff into poptotallower or cum4 > File Name: section_4cluster.sas strat_stats; poptotalupper,1,0); run; proc sort data=samples_cluster2; data cluster_stats; proc summary by replicate strata; run; merge cluster_stats deffmerge2; data=a_strat_runningtotal1; by replicate; run; var outsideflag1 outsideflag2 ods listing close; * kind of like outsideflag3 outsideflag4; a noprint for surveymeans - a new * Add weights to the various output procedure under the new model; samples; out=a_strat_runningtotal2(drop=_T * Weight 1 = sample size - a YPE_ _FREQ_) sum=outside_1 proc surveymeans constant in our replicates; outside_2 outside_3 outside_4; data=samples_cluster2 sum varsum * Weight 2 = inverse of variance; run; mean var n=8 ; * weight 3 = design effect * var; by replicate; * weight 4 = 1; %macro tsnnam; var y; cluster strata; data cluster_stats; %if &i=1 %then %do; weight samplingweight; set cluster_stats; ods output weight1=1/2; data a_strat_metric_&f.; statistics=cluster_stats; run; weight2=1/var; set a_strat_runningtotal2; weight3=deff*(1/var); run; * calculate variance for weight4 = 1; data a_strat_pv_&f.; stratified design (above) and srs mergevar='1'; set (below); id = _n_; run; a_strat4_&f(where=(order=&R./2) * Generate an finitepopulation keep=order reject: sreject:); proc surveymeans total; run; data=samples_cluster2 sum varsum mean var; proc summary %end; by replicate; data=finitepopulation_&f.; %else %do; var y; var y; ods output output out=fptotal(drop=_TYPE_ proc append statistics=cluster_deff_base; _FREQ_) sum=fptotal mean=fpmean; base=a_strat_metric_&f. run; run; data=a_strat_runningtotal2; run; proc append ods listing; data fptotal; base=a_strat_pv_&f. set fptotal; data=a_strat4_&f.(where=(order=&R *rename; mergevar='1'; run; ./2) keep=order reject: data cluster_deff_base; sreject:); run; set cluster_deff_base; data varbase = var; cluster_cumulative_&f.(drop=merge %end; keep replicate varbase; run; var); merge cluster_stats fptotal; %mend; * merge into one; by mergevar; run; data cluster_cumulative_&f.; %tsnnam; data deffcalc2; set cluster_cumulative_&f.; merge cluster_stats where id le &r./2; proc summary data = cluster_deff_base; retain sumweight1 0 sumtotal1 0 a_strat_pv_&f.; by replicate; run; sumweight2 0 sumtotal2 0 var reject: sreject: ; * deff calc simplified; sumweight3 0 sumtotal3 output 0 sumweight4 0 sumtotal4 0; out=a_strat_pv_2(drop=_TYPE_ data deffcalc2; _FREQ_) sum=reject1-reject4 set deffcalc2; sumweight1+weight1; sreject1-sreject4; run; deff=var/varbase; run; sumtotal1+weight1*mean; sumweight2+weight2; proc summary data = * keep deff; sumtotal2+weight2*mean; a_strat_metric_&f.; sumweight3+weight3; var outside_1-outside_4; data deffmerge2; sumtotal3+weight3*mean; output set deffcalc2(keep=replicate sumweight4+weight4; out=a_strat_averagetimesout deff); run; sumtotal4+weight4*mean; (drop=_TYPE_ _FREQ_) mean= outside1-outside4; run; 276

data cluster_cumulative_&f.;
 set cluster_cumulative_&f.;
 where id le &r./2;
 retain sumweight1 0 sumtotal1 0 sumweight2 0 sumtotal2 0
        sumweight3 0 sumtotal3 0 sumweight4 0 sumtotal4 0;
 sumweight1+weight1; sumtotal1+weight1*mean;
 sumweight2+weight2; sumtotal2+weight2*mean;
 sumweight3+weight3; sumtotal3+weight3*mean;
 sumweight4+weight4; sumtotal4+weight4*mean;
 x=_N_; * for plotting on ye olde graphe;
 cum1=sumtotal1/sumweight1; cum2=sumtotal2/sumweight2;
 cum3=sumtotal3/sumweight3; cum4=sumtotal4/sumweight4;
 *superpopulation;
 popmean=&SPMEAN.;
 poptotalupper=popmean*(1+&ERRORBAR);
 poptotallower=popmean*(1-&ERRORBAR);
 * relative bias - abs(estimate-true)/true;
 * true defined as finitepopulation_&f. value;
 relbias_fp=abs(cum1-fpmean)/fpmean*100;
 sumbias_fp+relbias_fp;
 *avgrelbias_fp=sumbias_fp/_N_;
 relbias_sp=abs(cum1-&SPMEAN.)/&SPMEAN.*100;
 sumbias_sp+relbias_sp;
 *avgrelbias_sp=sumbias_sp/_N_;
 df=x-1; run;

data cluster_cumulative_&f.; set cluster_cumulative_&f.; order=_N_; run;

proc sort data=cluster_cumulative_&f.; by descending order; run;

* Then carry the value forward;
data cluster3_&f.;
 set cluster_cumulative_&f.;
 retain sum1 sum2 sum3 sum4;
 if _N_=1 then do; sum1=cum1; sum2=cum2; sum3=cum3; sum4=cum4; end;
 h=1; run;

proc sort data=cluster3_&f.; by order; run;

proc univariate data=deffmerge2 noprint;
 var deff;
 output out=deff2 max=max; run;

data deff2; set deff2; h=1; run;

data cluster3_&f.;
 merge cluster3_&f. deff2; by h;
 *Q reg statistic;
 Q1=weight1*((mean-sum1)**2); Q2=weight2*((mean-sum2)**2);
 Q3=weight3*((mean-sum3)**2); Q4=weight4*((mean-sum4)**2);
 QS1=Q1/max; QS2=Q2/max; QS3=Q3/max; QS4=Q4/max;
 H=1; run;

data cluster4_&f.;
 set cluster3_&f.; by H;
 retain sumQ1 0 sumQ2 0 sumQ3 0 sumQ4 0 sumQS1 0 sumQS2 0 sumQS3 0 sumQS4 0;
 sumQ1+Q1; sumQ2+Q2; sumQ3+Q3; sumQ4+Q4;
 sumQS1+QS1; sumQS2+QS2; sumQS3+QS3; sumQS4+QS4;
 if df ge 1 then chi_1=cdf('CHISQUARE',sumQ1,df); else chi_1=.;
 if df ge 1 then chi_2=cdf('CHISQUARE',sumQ2,df); else chi_2=.;
 if df ge 1 then chi_3=cdf('CHISQUARE',sumQ3,df); else chi_3=.;
 if df ge 1 then chi_4=cdf('CHISQUARE',sumQ4,df); else chi_4=.;
 pv_1=1-chi_1; pv_2=1-chi_2; pv_3=1-chi_3; pv_4=1-chi_4;
 reject_1=(pv_1 <= 0.05); reject_2=(pv_2 <= 0.05);
 reject_3=(pv_3 <= 0.05); reject_4=(pv_4 <= 0.05);
 if df ge 1 then schi_1=cdf('CHISQUARE',sumQS1,df); else schi_1=.;
 if df ge 1 then schi_2=cdf('CHISQUARE',sumQS2,df); else schi_2=.;
 if df ge 1 then schi_3=cdf('CHISQUARE',sumQS3,df); else schi_3=.;
 if df ge 1 then schi_4=cdf('CHISQUARE',sumQS4,df); else schi_4=.;
 spv_1=1-schi_1; spv_2=1-schi_2; spv_3=1-schi_3; spv_4=1-schi_4;
 sreject_1=(spv_1 <= 0.05); sreject_2=(spv_2 <= 0.05);
 sreject_3=(spv_3 <= 0.05); sreject_4=(spv_4 <= 0.05);
 run;

data cluster_runningtotal1;
 set cluster_cumulative_&f.;
 outsideflag1=ifn(cum1 < poptotallower or cum1 > poptotalupper,1,0);
 outsideflag2=ifn(cum2 < poptotallower or cum2 > poptotalupper,1,0);
 outsideflag3=ifn(cum3 < poptotallower or cum3 > poptotalupper,1,0);
 outsideflag4=ifn(cum4 < poptotallower or cum4 > poptotalupper,1,0); run;

proc summary data=cluster_runningtotal1;
 var outsideflag1 outsideflag2 outsideflag3 outsideflag4;
 output out=cluster_runningtotal2(drop=_TYPE_ _FREQ_)
        sum=outside_1 outside_2 outside_3 outside_4; run;
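/* sumQ1-sumQ4 above accumulate a Cochran-type heterogeneity statistic for
   each weighting scheme,
       Q_j = sum over surveys of weight_j * (mean - pooled mean)**2,
   compared with a chi-square distribution on df = x - 1 degrees of freedom
   (one fewer than the number of surveys pooled); pv_j = 1 - chi_j is the
   p-value and reject_j flags pv_j <= 0.05.  A hedged sketch of just the test
   step, assuming variables SUMQ and DF exist on a dataset QSUMS
   (hypothetical names):

   data qtest;
      set qsums;
      if df ge 1 then p = 1 - cdf('CHISQUARE', sumq, df);
      else p = .;               * test undefined with a single survey;
      reject = (p <= 0.05);     * 5 percent decision rule;
   run;
*/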

%macro tsnnam;
%if &i=1 %then %do;
 data cluster_metric_&f.; set cluster_runningtotal2; run;
 data cluster_pv_&f.;
  set cluster4_&f(where=(x=&R./2) keep=x replicate reject: sreject:); run;
%end;
%else %do;
 proc append base=cluster_metric_&f. data=cluster_runningtotal2; run;
 proc append base=cluster_pv_&f.
  data=cluster4_&f.(where=(x=&R./2) keep=x replicate reject: sreject:); run;
%end;
%mend;
%tsnnam;


File Name: Section_4SPCluster.sas

* Variance calculation sp - calculate required ti, di, ybar values;
* put only required data into a temp dataset;
* at this point in time, cluster is in fact called strata.;
* first we are calculating varb;

data temp;
 set samples_cluster2(keep=replicate strata samplingweight y);
 d=1/samplingweight;
 t=d*y; run;

* start with di, ti;
proc summary data=temp nway;
 class replicate strata;
 var d t;
 output out=temp2(drop=_TYPE_ _FREQ_) sum=di ti; run;

* now ybar;
proc summary data=temp nway;
 class replicate;
 weight d;
 var y;
 output out=temp3(drop=_TYPE_ _FREQ_) mean=ybar; run;

* and d;
proc summary data=temp nway;
 class replicate;
 var d;
 output out=temp4(drop=_TYPE_ _FREQ_) sum=d; run;

data temp5;
 merge temp2 temp3; by replicate;
 * define the ti-ybar*di term;
 term1=ti-ybar*di; run;

proc summary data=temp5 nway;
 class replicate;
 var term1;
 output out=temp6(drop=_TYPE_ _FREQ_) sum=term1sum; run;

data cluster_varb_&f.;
 merge temp4 temp6; by replicate;
 varb=1/(d*d)*(1/&STRATA.)*(term1sum*term1sum); run;

* Now calculate varw;
data temp7;
 merge temp5 temp6; by replicate;
 term2=(term1-0.5*term1sum)*(term1-0.5*term1sum); run;

proc summary data=temp7 nway;
 class replicate;
 var term2;
 output out=temp8(drop=_TYPE_ _FREQ_) sum=term2sum; run;

data cluster_varw_&f.;
 merge temp4 temp8; by replicate;
 varw=1/(d*d)*(2/&STRATA.)*term2sum; run;

proc sort data=cluster_cumulative_&f.(keep=replicate var) out=temp9;
 by replicate; run;

data cluster_varsp_&f.;
 merge temp9 cluster_varb_&f. cluster_varw_&f.; by replicate;
 varsp=var+varb-varw; run;

proc delete data=temp; run;
proc delete data=temp2; run;
proc delete data=temp3; run;
proc delete data=temp4; run;
proc delete data=temp5; run;
proc delete data=temp6; run;
proc delete data=temp7; run;
proc delete data=temp8; run;
proc delete data=temp9; run;
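/* The temp2-temp9 datasets only stage per-replicate sums.  The target
   quantity is the superpopulation variance of the cluster estimate,
       varsp = var + varb - varw,
   where var is the design-based variance from proc surveymeans, varb is a
   between-cluster component built from term1 = ti - ybar*di, and varw is the
   corresponding within-cluster correction built from term2.  This appears to
   implement the superpopulation variance decomposition described in the body
   of the thesis. */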

*rename;
data deffcalc2SP;
 merge cluster_varsp_&f. cluster_deff_base; by replicate; run;

* deff calc simplified;
data deffcalc2SP; set deffcalc2SP; deff=varsp/varbase; run;

data deffmerge2SP; set deffcalc2SP(keep=replicate deff); run;

* and merge deff into clusterSP_stats;
data clusterSP_stats;
 merge cluster_varsp_&f. deffmerge2SP; by replicate; run;

data cluster_mean; set cluster_stats; keep replicate mean; run;

data clusterSP_stats;
 merge cluster_mean clusterSP_stats; by replicate; run;

data clusterSP_stats;
 set clusterSP_stats;
 weight1=1/25000;
 weight2=1/var;
 weight3=deff*(1/var);
 weight4=1;
 mergevar='1'; run;

proc summary data=finitepopulation_&f.;
 var y;
 output out=fptotal(drop=_TYPE_ _FREQ_) sum=fptotal mean=fpmean; run;

data fptotal; set fptotal; mergevar='1'; run;

data cluster_cumulativeSP_&f.(drop=mergevar);
 merge clusterSP_stats fptotal; by mergevar;
 retain sumweight1 0 sumtotal1 0 sumweight2 0 sumtotal2 0
        sumweight3 0 sumtotal3 0 sumweight4 0 sumtotal4 0;
 sumweight1+weight1; sumtotal1+weight1*mean;
 sumweight2+weight2; sumtotal2+weight2*mean;
 sumweight3+weight3; sumtotal3+weight3*mean;
 sumweight4+weight4; sumtotal4+weight4*mean;
 x=_N_; * for plotting on ye olde graphe;
 cum1=sumtotal1/sumweight1; cum2=sumtotal2/sumweight2;
 cum3=sumtotal3/sumweight3; cum4=sumtotal4/sumweight4;
 *superpopulation;
 popmean=&SPMEAN.;
 poptotalupper=popmean*(1+&ERRORBAR);
 poptotallower=popmean*(1-&ERRORBAR);
 * relative bias - abs(estimate-true)/true;
 * true defined as finitepopulation_&f. value;
 relbias_fp=abs(cum1-fpmean)/fpmean*100;
 sumbias_fp+relbias_fp;
 *avgrelbias_fp=sumbias_fp/_N_;
 relbias_sp=abs(cum1-&SPMEAN.)/&SPMEAN.*100;
 sumbias_sp+relbias_sp;
 *avgrelbias_sp=sumbias_sp/_N_;
 df=x-1; run;

data cluster_cumulativeSP_&f.; set cluster_cumulativeSP_&f.; order=_N_; run;

proc sort data=cluster_cumulativeSP_&f.; by descending order; run;

* Then carry the value forward;
data cluster3SP_&f.;
 set cluster_cumulativeSP_&f.;
 retain sum1 sum2 sum3 sum4;
 if _N_=1 then do; sum1=cum1; sum2=cum2; sum3=cum3; sum4=cum4; end;
 h=1; run;

proc sort data=cluster3SP_&f.; by order; run;

proc univariate data=deffmerge2SP noprint;
 var deff;
 output out=deff2SP max=max; run;

data deff2SP; set deff2SP; h=1; run;

data cluster3SP_&f.;
 merge cluster3SP_&f. deff2SP; by h;
 *Q reg statistic;
 Q1=weight1*((mean-sum1)**2); Q2=weight2*((mean-sum2)**2);
 Q3=weight3*((mean-sum3)**2); Q4=weight4*((mean-sum4)**2);
 QS1=Q1/max; QS2=Q2/max; QS3=Q3/max; QS4=Q4/max;
 H=1; run;

data cluster4SP_&f.;
 set cluster3SP_&f.; by H;
 retain sumQ1 0 sumQ2 0 sumQ3 0 sumQ4 0 sumQS1 0 sumQS2 0 sumQS3 0 sumQS4 0;
 sumQ1+Q1; sumQ2+Q2; sumQ3+Q3; sumQ4+Q4;
 sumQS1+QS1; sumQS2+QS2; sumQS3+QS3; sumQS4+QS4;
 if df ge 1 then chi_1=cdf('CHISQUARE',sumQ1,df); else chi_1=.;
 if df ge 1 then chi_2=cdf('CHISQUARE',sumQ2,df); else chi_2=.;
 if df ge 1 then chi_3=cdf('CHISQUARE',sumQ3,df); else chi_3=.;
 if df ge 1 then chi_4=cdf('CHISQUARE',sumQ4,df); else chi_4=.;
 pv_1=1-chi_1; pv_2=1-chi_2; pv_3=1-chi_3; pv_4=1-chi_4;
 reject_1=(pv_1 <= 0.05); reject_2=(pv_2 <= 0.05);
 reject_3=(pv_3 <= 0.05); reject_4=(pv_4 <= 0.05);
 if df ge 1 then schi_1=cdf('CHISQUARE',sumQS1,df); else schi_1=.;
 if df ge 1 then schi_2=cdf('CHISQUARE',sumQS2,df); else schi_2=.;
 if df ge 1 then schi_3=cdf('CHISQUARE',sumQS3,df); else schi_3=.;
 if df ge 1 then schi_4=cdf('CHISQUARE',sumQS4,df); else schi_4=.;
 spv_1=1-schi_1; spv_2=1-schi_2; spv_3=1-schi_3; spv_4=1-schi_4;
 sreject_1=(spv_1 <= 0.05); sreject_2=(spv_2 <= 0.05);
 sreject_3=(spv_3 <= 0.05); sreject_4=(spv_4 <= 0.05);
 run;

data cluster_runningtotal1SP;
 set cluster_cumulativeSP_&f.;
 outsideflag1=ifn(cum1 < poptotallower or cum1 > poptotalupper,1,0);
 outsideflag2=ifn(cum2 < poptotallower or cum2 > poptotalupper,1,0);
 outsideflag3=ifn(cum3 < poptotallower or cum3 > poptotalupper,1,0);
 outsideflag4=ifn(cum4 < poptotallower or cum4 > poptotalupper,1,0); run;

proc summary data=cluster_runningtotal1SP;
 var outsideflag1 outsideflag2 outsideflag3 outsideflag4;
 output out=cluster_runningtotal2SP(drop=_TYPE_ _FREQ_)
        sum=outside_1 outside_2 outside_3 outside_4; run;

%macro tsnnam;
%if &i=1 %then %do;
 data cluster_metricSP_&f.; set cluster_runningtotal2SP; run;
 data cluster_pvSP_&f.;
  set cluster4SP_&f(where=(replicate=&R.) keep=replicate reject: sreject:); run;
%end;
%else %do;
 proc append base=cluster_metricSP_&f. data=cluster_runningtotal2SP; run;
 proc append base=cluster_pvSP_&f.
  data=cluster4SP_&f.(where=(replicate=&R.) keep=replicate reject: sreject:); run;
%end;
%mend;
%tsnnam;


File Name: Section_4SPCluster_case2.sas

ods listing close; * kind of like a noprint for surveymeans - a new procedure under the new model;

proc surveymeans data=samples_cluster2_&f. sum varsum mean var n=8;
 by replicate;
 var y;
 cluster strata;
 weight samplingweight;
 ods output statistics=cluster_cumulative_&f.; run;

ods listing;

* Variance calculation sp - calculate required ti, di, ybar values;
* put only required data into a temp dataset;
* at this point in time, cluster is in fact called strata.;
* first we are calculating varb;

data temp;
 set samples_cluster2_&f.(keep=replicate strata samplingweight y);
 d=1/samplingweight;
 t=d*y; run;

* start with di, ti;
proc summary data=temp nway;
 class replicate strata;
 var d t;
 output out=temp2(drop=_TYPE_ _FREQ_) sum=di ti; run;

* now ybar;
proc summary data=temp nway;
 class replicate;
 weight d;
 var y;
 output out=temp3(drop=_TYPE_ _FREQ_) mean=ybar; run;

* and d;
proc summary data=temp nway;
 class replicate;
 var d;
 output out=temp4(drop=_TYPE_ _FREQ_) sum=d; run;

data temp5;
 merge temp2 temp3; by replicate;
 * define the ti-ybar*di term;
 term1=ti-ybar*di; run;

proc summary data=temp5 nway;
 class replicate;
 var term1;
 output out=temp6(drop=_TYPE_ _FREQ_) sum=term1sum; run;

data cluster_varb_&f.;
 merge temp4 temp6; by replicate;
 varb=1/(d*d)*(1/&STRATA.)*(term1sum*term1sum); run;

* Now calculate varw;
data temp7;
 merge temp5 temp6; by replicate;
 term2=(term1-0.5*term1sum)*(term1-0.5*term1sum); run;

proc summary data=temp7 nway;
 class replicate;
 var term2;
 output out=temp8(drop=_TYPE_ _FREQ_) sum=term2sum; run;

data cluster_varw_&f.;
 merge temp4 temp8; by replicate;
 varw=1/(d*d)*(2/&STRATA.)*term2sum; run;

proc sort data=cluster_cumulative_&f.(keep=replicate var) out=temp9;
 by replicate; run;

data cluster_varsp_&f.;
 merge temp9 cluster_varb_&f. cluster_varw_&f.; by replicate;
 varsp=var+varb-varw; run;

proc delete data=temp; run;
proc delete data=temp2; run;
proc delete data=temp3; run;
proc delete data=temp4; run;
proc delete data=temp5; run;
proc delete data=temp6; run;
proc delete data=temp7; run;
proc delete data=temp8; run;
proc delete data=temp9; run;

proc surveymeans data=samples_cluster2_2 sum varsum mean var;
 by replicate;
 var y;
 ods output statistics=cluster_deff_base_&f.; run;

*rename;
data cluster_deff_base_&f.;
 set cluster_deff_base_&f.;
 varbase=var;
 keep replicate varbase; run;

data deffcalc2SP_&f.;
 merge cluster_varsp_&f. cluster_deff_base_&f.; by replicate; run;

* deff calc simplified;
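/* deff in these programs is the usual design effect: the design-based
   variance of the estimator divided by the variance an SRS of the same size
   would give, here deff = varsp/varbase.  It feeds weighting scheme 3,
   weight3 = deff*(1/var).  A tiny sketch under assumed variable names
   DESIGNVAR and SRSVAR on one dataset (hypothetical names):

   data deffsketch;
      set vars;
      deff = designvar/srsvar;     * design effect;
      w3   = deff*(1/designvar);   * deff-adjusted inverse-variance weight;
   run;
*/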

data deffcalc2SP_&f.; set deffcalc2SP_&f.; deff=varsp/varbase; run;

data deffmerge2SP_&f.; set deffcalc2SP_&f.(keep=replicate deff); run;

* and merge deff into clusterSP_stats;
data clusterSP_stats_&f.;
 merge cluster_varsp_&f. deffmerge2SP_&f.; by replicate; run;

ods listing close; * kind of like a noprint for surveymeans - a new procedure under the new model;

proc surveymeans data=samples_cluster2_&f. sum varsum mean var n=8;
 by replicate;
 var y;
 cluster strata;
 weight samplingweight;
 ods output statistics=cluster_stats_&f.; run;

data cluster_mean_&f.; set cluster_stats_&f.; keep replicate mean; run;

data clusterSP_stats_&f.;
 merge cluster_mean_&f. clusterSP_stats_&f.; by replicate; run;

data clusterSP_stats_&f.;
 set clusterSP_stats_&f.;
 weight1=1/25000;
 weight2=1/var;
 weight3=deff*(1/var);
 weight4=1;
 mergevar='1'; run;

proc summary data=finitepopulation_&f.;
 var y;
 output out=fptotal_&f.(drop=_TYPE_ _FREQ_) sum=fptotal mean=fpmean; run;

data fptotal_&f.; set fptotal_&f.; mergevar='1'; run;

data cluster_cumulativeSP_&f.(drop=mergevar);
 merge clusterSP_stats_&f. fptotal_&f.; by mergevar;
 finite=&f.; run;

ods listing;


File Name: Section_4SPstrata.sas

* Run surveymeans on the various samples;

proc sort data=samples_strata; by replicate strata; run;

ods listing close; * kind of like a noprint for surveymeans - a new procedure under the new model;

data samplingrate (drop=temp: i j);
 format strata $3.;
 do j=1 to &r.;
  Replicate=j;
  strata=i;
  tempi=1;
  tempstring="&STRATASAMPRATE";
  do while(scan(tempstring,tempi," ()") > " ");
   strata=tempi;
   _rate_=input(scan(tempstring,strata," ()"),6.4);
   tempi+1;
   output;
  end;
 end; run;

proc freq data=samplingrate; tables strata*_rate_/list; run;
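/* The samplingrate step parses the macro string STRATASAMPRATE with scan(),
   writing one row per replicate and stratum with the stratum sampling rate
   in _rate_, the variable name proc surveymeans expects in a rate= dataset.
   For example, a value such as (0.0125)(0.0063)(0.0050) would be split on
   the "()" delimiters into rates 0.0125, 0.0063 and 0.0050 for strata 1-3;
   the actual value of STRATASAMPRATE is set elsewhere and is only assumed
   here for illustration. */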


proc surveymeans data=samples_strata sum varsum mean var rate=samplingrate;
 by replicate;
 strata strata;
 var y;
 weight samplingweight;
 ods output statistics=strat_stats; run;

* calculate variance for stratified design (above) and srs (below);
proc surveymeans data=samples_strata sum varsum mean var;
 by replicate;
 var y;
 ods output statistics=strat_deff_base; run;

ods listing;

*rename;
data strat_deff_base;
 set strat_deff_base;
 varbase=var;
 keep replicate varbase; run;

*calculate the additional part of SP var;
data small_strat;
 set samples_strata;
 y_w=y*samplingweight;
 keep y_w y samplingweight replicate strata; run;

proc sort data=small_strat; by replicate strata; run;

proc summary data=small_strat;
 var y_w;
 by replicate strata;
 output out=temp1 t=total; run;

data temp1; set temp1; Y_H_mean=total/12500; run;

* merge into one;
data temp2;
 merge temp1 strat_stats; by replicate;
 diff1=Y_H_mean-mean;
 square=diff1*diff1; run;

proc means data=samples_strata noprint;
 var y;
 by replicate strata;
 output out=temp4 var=sigma_h; run;

data temp5;
 merge temp2 temp4; by replicate strata;
 num1=&BIGN.-(&BIGN./&STRATA.);
 if strata=1 then den1=(&BIGN.-1)*125;
 else if strata=2 then den1=(&BIGN.-1)*63;
 else if strata=3 then den1=(&BIGN.-1)*50;
 else if strata=4 then den1=(&BIGN.-1)*13;
 else if strata=5 then den1=(&BIGN.-1)*13;
 else if strata=6 then den1=(&BIGN.-1)*50;
 else if strata=7 then den1=(&BIGN.-1)*63;
 else if strata=8 then den1=(&BIGN.-1)*125;
 p_1=(1/(&BIGN.-1))*(1/8)*square;
 p_2=(1/&BIGN.)*(1/8)*(1-(num1/den1))*sigma_h;
 v_y=p_1+p_2; run;

proc summary data=temp5;
 var v_y;
 by replicate;
 output out=temp6 t=v_yt; run;

data strat_stats2;
 merge strat_stats temp6;
 varSP=var+v_yt; run;

data deffcalcSP; merge strat_stats2 strat_deff_base; by replicate; run;

* deff calc simplified;
data deffcalcSP; set deffcalcSP; deff=varSP/varbase; run;

* keep deff;
data deffmergeSP; set deffcalcSP(keep=replicate deff); run;

* and merge deff into strat_stats;
data stratSP_stats; merge strat_stats2 deffmergeSP; by replicate; run;

* Add weights to the various samples;
* Weight 1 = sample size - a constant in our replicates;
* Weight 2 = inverse of variance;
* Weight 3 = design effect;
* Weight 4 = 1;
data stratSP_stats;
 set stratSP_stats;
 weight1=1/&LITTLEN.;
 weight2=1/varsp;
 weight3=deff*(1/varSP);
 weight4=1;
 mergevar='1'; run;

* Generate an actual total;
proc summary data=finitepopulation_&f.;
 var y;
 output out=fptotal(drop=_TYPE_ _FREQ_) sum=fptotal mean=fpmean; run;

data fptotal; set fptotal; mergevar='1'; run;

data strat_SP_cumulative_&f.(drop=mergevar);
 merge stratSP_stats fptotal; by mergevar;
 retain sumweight1 0 sumtotal1 0 sumweight2 0 sumtotal2 0
        sumweight3 0 sumtotal3 0 sumweight4 0 sumtotal4 0;
 sumweight1+weight1; sumtotal1+weight1*mean;
 sumweight2+weight2; sumtotal2+weight2*mean;
 sumweight3+weight3; sumtotal3+weight3*mean;
 sumweight4+weight4; sumtotal4+weight4*mean;
 x=_N_; * for plotting on ye olde graphe;
 cum1=sumtotal1/sumweight1; cum2=sumtotal2/sumweight2;
 cum3=sumtotal3/sumweight3; cum4=sumtotal4/sumweight4;
 *superpopulation;
 popmean=&SPMEAN.;
 poptotalupper=popmean*(1+&ERRORBAR);
 poptotallower=popmean*(1-&ERRORBAR);
 * relative bias - abs(estimate-true)/true;
 * true defined as finitepopulation_&f. value;
 relbias_fp=abs(cum1-fpmean)/fpmean*100;
 sumbias_fp+relbias_fp;
 *avgrelbias_fp=sumbias_fp/_N_;
 relbias_sp=abs(cum1-&SPMEAN.)/&SPMEAN.*100;
 sumbias_sp+relbias_sp;
 *avgrelbias_sp=sumbias_sp/_N_;
 df=x-1; run;

data strat_SP_cumulative_&f.; set strat_SP_cumulative_&f.; order=_N_; run;

proc sort data=strat_SP_cumulative_&f.; by descending order; run;

* Then carry the value forward;
data strat3_SP_&f.;
 set strat_SP_cumulative_&f.;
 retain sum1 sum2 sum3 sum4;
 if _N_=1 then do; sum1=cum1; sum2=cum2; sum3=cum3; sum4=cum4; end;
 h=1; run;

proc sort data=strat3_SP_&f.; by order; run;

proc univariate data=deffmergeSP noprint;
 var deff;
 output out=deff1SP max=max; run;

data deff1SP; set deff1SP; h=1; run;

data strat3_SP_&f.;
 merge strat3_SP_&f. deff1SP; by h;
 *Q reg statistic;
 Q1=weight1*((mean-sum1)**2); Q2=weight2*((mean-sum2)**2);
 Q3=weight3*((mean-sum3)**2); Q4=weight4*((mean-sum4)**2);
 QS1=Q1/max; QS2=Q2/max; QS3=Q3/max; QS4=Q4/max;
 H=1; run;
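/* temp1-temp6 assemble the extra superpopulation variance for the stratified
   design, per replicate and stratum:
       p_1 = square/(8*(N-1))                      (between-stratum part)
       p_2 = (1/N)*(1/8)*(1 - num1/den1)*sigma_h   (within-stratum part)
       v_y = p_1 + p_2, summed over strata into v_yt,
   so that varSP = var + v_yt adds the superpopulation component to the
   design variance.  The den1 constants 125, 63, 50 and 13 presumably mirror
   the stratum allocations used when the samples were drawn (an assumption,
   since the generating program is not shown on this page). */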

data strat4_SP_&f.;
 set strat3_SP_&f.; by H;
 retain sumQ1 0 sumQ2 0 sumQ3 0 sumQ4 0 sumQS1 0 sumQS2 0 sumQS3 0 sumQS4 0;
 sumQ1+Q1; sumQ2+Q2; sumQ3+Q3; sumQ4+Q4;
 sumQS1+QS1; sumQS2+QS2; sumQS3+QS3; sumQS4+QS4;
 if df ge 1 then chi_1=cdf('CHISQUARE',sumQ1,df); else chi_1=.;
 if df ge 1 then chi_2=cdf('CHISQUARE',sumQ2,df); else chi_2=.;
 if df ge 1 then chi_3=cdf('CHISQUARE',sumQ3,df); else chi_3=.;
 if df ge 1 then chi_4=cdf('CHISQUARE',sumQ4,df); else chi_4=.;
 pv_1=1-chi_1; pv_2=1-chi_2; pv_3=1-chi_3; pv_4=1-chi_4;
 reject_1=(pv_1 <= 0.05); reject_2=(pv_2 <= 0.05);
 reject_3=(pv_3 <= 0.05); reject_4=(pv_4 <= 0.05);
 if df ge 1 then schi_1=cdf('CHISQUARE',sumQS1,df); else schi_1=.;
 if df ge 1 then schi_2=cdf('CHISQUARE',sumQS2,df); else schi_2=.;
 if df ge 1 then schi_3=cdf('CHISQUARE',sumQS3,df); else schi_3=.;
 if df ge 1 then schi_4=cdf('CHISQUARE',sumQS4,df); else schi_4=.;
 spv_1=1-schi_1; spv_2=1-schi_2; spv_3=1-schi_3; spv_4=1-schi_4;
 sreject_1=(spv_1 <= 0.05); sreject_2=(spv_2 <= 0.05);
 sreject_3=(spv_3 <= 0.05); sreject_4=(spv_4 <= 0.05);
 run;

* set running total to keep track how often we are outside ranges of something or other;
data strat_SP_runningtotal1;
 set strat_SP_cumulative_&f.;
 outsideflag1=ifn(cum1 < poptotallower or cum1 > poptotalupper,1,0);
 outsideflag2=ifn(cum2 < poptotallower or cum2 > poptotalupper,1,0);
 outsideflag3=ifn(cum3 < poptotallower or cum3 > poptotalupper,1,0);
 outsideflag4=ifn(cum4 < poptotallower or cum4 > poptotalupper,1,0); run;

proc summary data=strat_SP_runningtotal1;
 var outsideflag1 outsideflag2 outsideflag3 outsideflag4;
 output out=strat_SP_runningtotal2(drop=_TYPE_ _FREQ_)
        sum=outside_1 outside_2 outside_3 outside_4; run;

%macro tsnnam;
%if &i=1 %then %do;
 data strat_metric_SP_&f.; set strat_SP_runningtotal2; run;
 data strat_pv_SP_&f.;
  set strat4_SP_&f(where=(replicate=&R.) keep=replicate reject: sreject:); run;
%end;
%else %do;
 proc append base=strat_metric_SP_&f. data=strat_SP_runningtotal2; run;
 proc append base=strat_pv_SP_&f.
  data=strat4_SP_&f.(where=(replicate=&R.) keep=replicate reject: sreject:); run;
%end;
%mend;
%tsnnam;

proc summary data=strat_pv_SP_&f.;
 var reject: sreject:;
 output out=strat_pv_SP2(drop=_TYPE_ _FREQ_) sum=reject1-reject4 sreject1-sreject4; run;

proc summary data=strat_metric_SP_&f.;
 var outside_1-outside_4;
 output out=strat_averagetimesoutSP(drop=_TYPE_ _FREQ_) mean=outside1-outside4; run;


File Name: Section_4SPstrata_case2nextbit.sas

* Run surveymeans on the various samples;

proc sort data=samples_strata_&f.; by replicate strata; run;

ods listing close; * kind of like a noprint for surveymeans - a new procedure under the new model;

data samplingrate (drop=temp: i j);
 format strata $3.;
 do j=1 to &r.;
  Replicate=j;
  strata=i;
  tempi=1;
  tempstring="&STRATASAMPRATE";
  do while(scan(tempstring,tempi," ()") > " ");
   strata=tempi;
   _rate_=input(scan(tempstring,strata," ()"),6.4);
   tempi+1;
   output;
  end;
 end; run;

proc freq data=samplingrate; tables strata*_rate_/list; run;

proc surveymeans data=samples_strata_&f. sum varsum mean var rate=samplingrate;
 by replicate;
 strata strata;
 var y;
 weight samplingweight;
 ods output statistics=strat_stats_&f.; run;

* calculate variance for stratified design (above) and srs (below);
proc surveymeans data=samples_strata_&f. sum varsum mean var;
 by replicate;
 var y;
 ods output statistics=strat_deff_base_&f.; run;

ods listing;

*rename;
data strat_deff_base_&f.;
 set strat_deff_base_&f.;
 varbase=var;
 keep replicate varbase; run;

*calculate the additional part of SP var;
data small_strat_&f.;
 set samples_strata_&f.;
 y_w=y*samplingweight;
 keep y_w y samplingweight replicate strata; run;

proc sort data=small_strat_&f.; by replicate strata; run;

proc summary data=small_strat_&f.;
 var y_w;
 by replicate strata;
 output out=temp1_&f. t=total; run;

data temp1_&f.; set temp1_&f.; Y_H_mean=total/12500; run;

* merge into one;
data temp2_&f.;
 merge temp1_&f. strat_stats_&f.; by replicate;
 diff1=Y_H_mean-mean;
 square=diff1*diff1; run;

proc means data=samples_strata_&f. noprint;
 var y;
 by replicate strata;
 output out=temp4_&f. var=sigma_h; run;

data temp5_&f.;
 merge temp2_&f. temp4_&f.; by replicate strata;
 num1=&BIGN.-(&BIGN./&STRATA.);
 if strata=1 then den1=(&BIGN.-1)*125;
 else if strata=2 then den1=(&BIGN.-1)*63;
 else if strata=3 then den1=(&BIGN.-1)*50;
 else if strata=4 then den1=(&BIGN.-1)*13;
 else if strata=5 then den1=(&BIGN.-1)*13;
 else if strata=6 then den1=(&BIGN.-1)*50;
 else if strata=7 then den1=(&BIGN.-1)*63;
 else if strata=8 then den1=(&BIGN.-1)*125;
 p_1=(1/(&BIGN.-1))*(1/8)*square;
 p_2=(1/&BIGN.)*(1/8)*(1-(num1/den1))*sigma_h;
 v_y=p_1+p_2; run;

proc summary data=temp5_&f.;
 var v_y;
 by replicate;
 output out=temp6_&f. t=v_yt; run;

data strat_stats2_&f.;
 merge strat_stats_&f. temp6_&f.;
 varSP=var+v_yt; run;

data deffcalcSP_&f.;
 merge strat_stats2_&f. strat_deff_base_&f.; by replicate; run;

* deff calc simplified;
data deffcalcSP_&f.; set deffcalcSP_&f.; deff=varSP/varbase; run;

* keep deff;
data deffmergeSP_&f.; set deffcalcSP_&f.(keep=replicate deff); run;

* and merge deff into strat_stats;
data stratSP_stats_&f.;
 merge strat_stats2_&f. deffmergeSP_&f.; by replicate; run;

* Add weights to the various samples;
* Weight 1 = sample size - a constant in our replicates;
* Weight 2 = inverse of variance;
* Weight 3 = design effect;
* Weight 4 = 1;
data stratSP_stats_&f.;
 set stratSP_stats_&f.;
 weight1=1/&LITTLEN.;
 weight2=1/varsp;
 weight3=deff*(1/varSP);
 weight4=1;
 mergevar='1'; run;

* Generate an actual total;
proc summary data=finitepopulation_&f.;
 var y;
 output out=fptotal_&f.(drop=_TYPE_ _FREQ_) sum=fptotal mean=fpmean; run;

data fptotal_&f.; set fptotal_&f.; mergevar='1'; run;

data strat_SP_cumulative_&f.(drop=mergevar);
 merge stratSP_stats_&f. fptotal_&f.;
 finite=&f.;
 by mergevar; run;


File Name: Section_4srs.sas

* Run surveymeans on the SRS sample;

ods listing close; * kind of like a noprint for surveymeans - a new procedure under the new model;

%let f=1;

proc surveymeans data=samples_srs sum varsum mean var;
 by replicate;
 var y;
 weight samplingweight;
 ods output statistics=srs_stats; run;

ods listing;

* Add weights to the various samples;
* Weight 1 = sample size - a constant in our replicates;
* Weight 2 = inverse of variance;
* weight 3 = design effect - for srs = 1;
* weight 4 = 1;

data srs_stats;
 set srs_stats;
 deff=1;
 weight1=1/&LITTLEN;
 weight2=1/var;
 weight3=1/var;
 weight4=1;
 mergevar='1';
 id=_n_; run;

* Generate a finitepopulation total;
proc summary data=finitepopulation_&f.;
 var y;
 output out=fptotal(drop=_TYPE_ _FREQ_) sum=fptotal mean=fpmean; run;

data fptotal; set fptotal; mergevar='1'; run;

data srs_cumulative_&f.(drop=mergevar);
 merge srs_stats fptotal; by mergevar; run;

data srs_cumulative_&f.;
 set srs_cumulative_&f.;
 where id le &r./2;
 retain sumweight1 0 sumtotal1 0 sumweight2 0 sumtotal2 0
        sumweight3 0 sumtotal3 0 sumweight4 0 sumtotal4 0;
 sumweight1+weight1; sumtotal1+weight1*mean;
 sumweight2+weight2; sumtotal2+weight2*mean;
 sumweight3+weight3; sumtotal3+weight3*mean;
 sumweight4+weight4; sumtotal4+weight4*mean;
 x=_N_; * for plotting on ye olde graphe;
 cum1=sumtotal1/sumweight1; cum2=sumtotal2/sumweight2;
 cum3=sumtotal3/sumweight3; cum4=sumtotal4/sumweight4;
 *superpopulation;
 popmean=&SPMEAN.;
 poptotalupper=popmean*(1+&ERRORBAR);
 poptotallower=popmean*(1-&ERRORBAR);
 * relative bias - abs(estimate-true)/true;
 * true defined as finitepopulation_&f. value;
 relbias_fp=abs(cum1-fpmean)/fpmean*100;
 sumbias_fp+relbias_fp;
 *avgrelbias_fp=sumbias_fp/_N_;
 relbias_sp=abs(cum1-&SPMEAN.)/&SPMEAN.*100;
 sumbias_sp+relbias_sp;
 *avgrelbias_sp=sumbias_sp/_N_;
 df=x-1; run;

data srs_cumulative_&f.; set srs_cumulative_&f.; order=_N_; run;

proc sort data=srs_cumulative_&f.; by descending order; run;

* Then carry the value forward;
data srs3_&f.;
 set srs_cumulative_&f.;
 retain sum1 sum2 sum3 sum4;
 if _N_=1 then do; sum1=cum1; sum2=cum2; sum3=cum3; sum4=cum4; end; run;

proc sort data=srs3_&f.; by order; run;

data srs3_&f.;
 set srs3_&f.;
 *Q reg statistic;
 Q1=weight1*((mean-sum1)**2); Q2=weight2*((mean-sum2)**2);
 Q3=weight3*((mean-sum3)**2); Q4=weight4*((mean-sum4)**2);
 QS1=Q1/(1-&littlen./&bigN.); QS2=Q2/(1-&littlen./&bigN.);
 QS3=Q3/(1-&littlen./&bigN.); QS4=Q4/(1-&littlen./&bigN.);
 H=1; run;

data srs4_&f.;
 set srs3_&f.; by H;
 retain sumQ1 0 sumQ2 0 sumQ3 0 sumQ4 0 sumQS1 0 sumQS2 0 sumQS3 0 sumQS4 0;
 sumQ1+Q1; sumQ2+Q2; sumQ3+Q3; sumQ4+Q4;
 sumQS1+QS1; sumQS2+QS2; sumQS3+QS3; sumQS4+QS4;
 if df ge 1 then chi_1=cdf('CHISQUARE',sumQ1,df); else chi_1=.;
 if df ge 1 then chi_2=cdf('CHISQUARE',sumQ2,df); else chi_2=.;
 if df ge 1 then chi_3=cdf('CHISQUARE',sumQ3,df); else chi_3=.;
 if df ge 1 then chi_4=cdf('CHISQUARE',sumQ4,df); else chi_4=.;
 pv_1=1-chi_1; pv_2=1-chi_2; pv_3=1-chi_3; pv_4=1-chi_4;
 if df ge 1 then reject_1=(pv_1 <= 0.05); else pv_1=.;
 if df ge 1 then reject_2=(pv_2 <= 0.05); else pv_2=.;
 if df ge 1 then reject_3=(pv_3 <= 0.05); else pv_3=.;
 if df ge 1 then reject_4=(pv_4 <= 0.05); else pv_4=.;
 if df ge 1 then schi_1=cdf('CHISQUARE',sumQS1,df); else schi_1=.;
 if df ge 1 then schi_2=cdf('CHISQUARE',sumQS2,df); else schi_2=.;
 if df ge 1 then schi_3=cdf('CHISQUARE',sumQS3,df); else schi_3=.;
 if df ge 1 then schi_4=cdf('CHISQUARE',sumQS4,df); else schi_4=.;
 if df ge 1 then spv_1=1-schi_1; else spv_1=.;
 if df ge 1 then spv_2=1-schi_2; else spv_2=.;
 if df ge 1 then spv_3=1-schi_3; else spv_3=.;
 if df ge 1 then spv_4=1-schi_4; else spv_4=.;
 sreject_1=(spv_1 <= 0.05); sreject_2=(spv_2 <= 0.05);
 sreject_3=(spv_3 <= 0.05); sreject_4=(spv_4 <= 0.05);
 run;
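/* Under SRS the scaled statistics QS_j divide Q_j by the finite population
   correction (1 - n/N) rather than by a maximum design effect as in the
   stratified and cluster programs.  With the development settings used in
   Finite_Spdiff.sas (n = 500, N = 100000) the factor is
   1 - 500/100000 = 0.995, so the scaling only matters when the sampling
   fraction is appreciable. */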

data srs_runningtotal1;
 set srs_cumulative_&f.;
 outsideflag1=ifn(cum1 < poptotallower or cum1 > poptotalupper,1,0);
 outsideflag2=ifn(cum2 < poptotallower or cum2 > poptotalupper,1,0);
 outsideflag3=ifn(cum3 < poptotallower or cum3 > poptotalupper,1,0);
 outsideflag4=ifn(cum4 < poptotallower or cum4 > poptotalupper,1,0); run;

proc summary data=srs_runningtotal1;
 var outsideflag1 outsideflag2 outsideflag3 outsideflag4;
 output out=srs_runningtotal2(drop=_TYPE_ _FREQ_)
        sum=outside_1 outside_2 outside_3 outside_4; run;

%macro tsnnam;
%if &i=1 %then %do;
 data srs_metric_&f.; set srs_runningtotal2; run;
 data srs_pv_&f.;
  set srs4_&f.(where=(replicate=&R./2) keep=replicate reject: sreject:); run;
%end;
%else %do;
 proc append base=srs_metric_&f. data=srs_runningtotal2; run;
 proc append base=srs_pv_&f.
  data=srs4_&f.(where=(replicate=&R./2) keep=replicate reject: sreject:); run;
%end;
%mend;
%tsnnam;

proc summary data=srs_pv_&f.;
 var reject: sreject:;
 output out=srs_pv_2(drop=_TYPE_ _FREQ_) sum=reject1-reject4 sreject1-sreject4; run;

proc summary data=srs_metric_&f.;
 var outside_1-outside_4;
 output out=srs_averagetimesout(drop=_TYPE_ _FREQ_) mean=outside1-outside4; run;


File Name: Section_4srs_case2.sas

* Run surveymeans on the SRS sample;

ods listing close; * kind of like a noprint for surveymeans - a new procedure under the new model;

proc surveymeans data=samples_srs_&f. mean var;
 by replicate;
 var y;
 weight samplingweight;
 ods output statistics=srs_stats_&f.; run;

ods listing;

* Add weights to the various samples;
* Weight 1 = sample size - a constant in our replicates;
* Weight 2 = inverse of variance;
* weight 3 = design effect - for srs = 1;
* weight 4 = 1;
data srs_stats_&f.;
 set srs_stats_&f.;
 deff=1;
 weight1=1/&LITTLEN;
 weight2=1/var;
 weight3=1/var;
 weight4=1;
 mergevar='1'; run;

* Generate a finitepopulation total;
proc summary data=finitepopulation_&f.;
 var y;
 output out=fptotal_&f.(drop=_TYPE_ _FREQ_) sum=fptotal mean=fpmean; run;

data fptotal_&f.; set fptotal_&f.; mergevar='1'; run;

* change after here;
data srs_cumulative_&f.(drop=mergevar);
 merge srs_stats_&f. fptotal_&f.;
 finite=&f.;
 by mergevar; run;


File Name: Section_4srs_superpopvar.sas

* Run surveymeans on the SRS sample;

ods listing close; * kind of like a noprint for surveymeans - a new procedure under the new model;

proc means data=samples_srs;
 by replicate;
 var y;
 weight samplingweight;
 output out=srs_stats_s mean=mean var=var; run;

ods listing;

* Add weights to the various samples;
* Weight 1 = sample size - a constant in our replicates;
* Weight 2 = inverse of variance;
* weight 3 = design effect - for srs = 1;
* weight 4 = 1;
data srs_stats_s;
 set srs_stats_s;
 deff=1;
 weight1=1/&LITTLEN;
 weight2=1/var;
 weight3=1/var;
 weight4=1;
 mergevar='1'; run;

* Generate a finitepopulation total;
proc summary data=finitepopulation_&f.;
 var y;
 output out=fptotal(drop=_TYPE_ _FREQ_) sum=fptotal mean=fpmean; run;

data fptotal; set fptotal; mergevar='1'; run;

data srs_cumulative_s_&f.(drop=mergevar);
 merge srs_stats_s fptotal; by mergevar;
 retain sumweight1 0 sumtotal1 0 sumweight2 0 sumtotal2 0
        sumweight3 0 sumtotal3 0 sumweight4 0 sumtotal4 0;
 sumweight1+weight1; sumtotal1+weight1*mean;
 sumweight2+weight2; sumtotal2+weight2*mean;
 sumweight3+weight3; sumtotal3+weight3*mean;
 sumweight4+weight4; sumtotal4+weight4*mean;
 x=_N_; * for plotting on ye olde graphe;
 cum1=sumtotal1/sumweight1; cum2=sumtotal2/sumweight2;
 cum3=sumtotal3/sumweight3; cum4=sumtotal4/sumweight4;
 *superpopulation;
 popmean=&SPMEAN.;
 poptotalupper=popmean*(1+&ERRORBAR);
 poptotallower=popmean*(1-&ERRORBAR);
 * relative bias - abs(estimate-true)/true;
 * true defined as finitepopulation_&f. value;
 relbias_fp=abs(cum1-fpmean)/fpmean*100;
 sumbias_fp+relbias_fp;
 *avgrelbias_fp=sumbias_fp/_N_;
 relbias_sp=abs(cum1-&SPMEAN.)/&SPMEAN.*100;
 sumbias_sp+relbias_sp;
 *avgrelbias_sp=sumbias_sp/_N_;
 df=x-1; run;

data srs_cumulative_s_&f.; set srs_cumulative_s_&f.; order=_N_; run;

proc sort data=srs_cumulative_s_&f.; by descending order; run;

title f=swissb h=14pt move=(17,23) "SRS_sup, &R. replicates,
 Finite population &F., Superpop Mean=&SPMEAN.,
 Superpop Sigma &SPVAR., n = &littlen., N = &bign.";
* weight one graph;
symbol1 color=purple interpol=join line=1 width=0.5 value=none;
* weight two graph;
symbol2 color=red interpol=join line=1 width=0.5 value=none;
* weight three graph;
symbol3 color=green interpol=join line=1 width=0.5 value=none;
* weight four graph;
symbol4 color=blue interpol=join line=1 width=0.5 value=none;
* superpopulation mean bar;
symbol5 color=black interpol=join line=1 width=0.5 value=none;
*finite population mean bar;
symbol6 color=brown interpol=join line=1 width=0.5 value=none;

proc gplot data=srs_cumulative_s_&f. gout=graph_srs;
 plot cum1*x cum2*x cum3*x cum4*x popmean*x fpmean*x/overlay;
run; quit;

data srs_graph_s;
 length x 8.;
 set srs_cumulative_s_1;
 keep x cum1-cum4 fpmean popmean; run;

proc sort data=srs_graph_s; by x; run;

PROC EXPORT DATA=WORK.srs_graph_s
 OUTFILE="C:\Documents and Settings\foxkarl\Desktop\SAS PROGRAMS REDONE\Current Version\Datasets\srs_cumulative_&littlen._1_&r._s.xls"
 DBMS=EXCEL REPLACE;
 SHEET="d";
RUN;


File Name: Section_4srscase3.sas

* Run surveymeans on the SRS sample;

ods listing close; * kind of like a noprint for surveymeans - a new procedure under the new model;

%let f=1;

proc surveymeans data=samples_srs sum varsum mean var;
 by replicate;
 var y;
 weight samplingweight;
 ods output statistics=srs_stats; run;

ods listing;

* Add weights to the various samples;
* Weight 1 = sample size - a constant in our replicates;
* Weight 2 = inverse of variance;
* weight 3 = design effect - for srs = 1;
* weight 4 = 1;
data srs_stats;
 set srs_stats;
 deff=1;
 weight1=1/&LITTLEN;
 weight2=1/var;
 weight3=1/var;
 weight4=1;
 mergevar='1';
 id=_n_;
 weight1_squared=weight1*weight1;
 weight2_squared=weight2*weight2;
 weight3_squared=weight3*weight3;
 weight4_squared=weight4*weight4; run;

* Generate a finitepopulation total;
proc summary data=finitepopulation_&f.;
 var y;
 output out=fptotal(drop=_TYPE_ _FREQ_) sum=fptotal mean=fpmean; run;

data fptotal; set fptotal; mergevar='1'; run;

data srs_cumulative_&f.(drop=mergevar);
 merge srs_stats fptotal; by mergevar; run;

data srs_cumulative_&f.;
 set srs_cumulative_&f.;
 where id le &r./2;
 retain sumweight1 0 sumtotal1 0 sumweight2 0 sumtotal2 0
        sumweight3 0 sumtotal3 0 sumweight4 0 sumtotal4 0
        sumweightsq1 0 sumweightsq2 0 sumweightsq3 0 sumweightsq4 0;
 sumweight1+weight1; sumtotal1+weight1*mean;
 sumweight2+weight2; sumtotal2+weight2*mean;
 sumweight3+weight3; sumtotal3+weight3*mean;
 sumweight4+weight4; sumtotal4+weight4*mean;
 sumweightsq1+weight1_squared;
 sumweightsq2+weight2_squared;
 sumweightsq3+weight3_squared;
 sumweightsq4+weight4_squared;
 x=_N_; * for plotting on ye olde graphe;
 cum1=sumtotal1/sumweight1; cum2=sumtotal2/sumweight2;
 cum3=sumtotal3/sumweight3; cum4=sumtotal4/sumweight4;
 c1=sumweight1-(sumweightsq1/sumweight1);
 c2=sumweight2-(sumweightsq2/sumweight2);
 c3=sumweight3-(sumweightsq3/sumweight3);
 c4=sumweight4-(sumweightsq4/sumweight4);
 *superpopulation;
 popmean=&SPMEAN.;
 poptotalupper=popmean*(1+&ERRORBAR);
 poptotallower=popmean*(1-&ERRORBAR);
 * relative bias - abs(estimate-true)/true;
 * true defined as finitepopulation_&f. value;
 relbias_fp=abs(cum1-fpmean)/fpmean*100;
 sumbias_fp+relbias_fp;
 *avgrelbias_fp=sumbias_fp/_N_;
 relbias_sp=abs(cum1-&SPMEAN.)/&SPMEAN.*100;
 sumbias_sp+relbias_sp;
 *avgrelbias_sp=sumbias_sp/_N_;
 df=x-1; run;

data srs_cumulative_&f.; set srs_cumulative_&f.; order=_N_; run;

proc sort data=srs_cumulative_&f.; by descending order; run;

* Then carry the value forward;
data srs3_&f.;
 set srs_cumulative_&f.;
 retain sum1 sum2 sum3 sum4 nc1 nc2 nc3 nc4;
 if _N_=1 then do;
  sum1=cum1; sum2=cum2; sum3=cum3; sum4=cum4;
  nc1=c1; nc2=c2; nc3=c3; nc4=c4;
 end; run;

proc sort data=srs3_&f.; by order; run;

data srs3_&f.;
 set srs3_&f.;
 *Q reg statistic;
 Q1=weight1*((mean-sum1)**2); Q2=weight2*((mean-sum2)**2);
 Q3=weight3*((mean-sum3)**2); Q4=weight4*((mean-sum4)**2);
 QS1=Q1/(1-&littlen./&bigN.); QS2=Q2/(1-&littlen./&bigN.);
 QS3=Q3/(1-&littlen./&bigN.); QS4=Q4/(1-&littlen./&bigN.);
 tau1=(Q1-df)/c1; tau2=(Q2-df)/c2; tau3=(Q3-df)/c3; tau4=(Q4-df)/c4;
 new_w2=1/(var+tau2);
 new_w3=1/(var+tau3);
 H=1; run;

data srs4_&f.;
 set srs3_&f.; by H;
 retain sumQ1 0 sumQ2 0 sumQ3 0 sumQ4 0 sumQS1 0 sumQS2 0 sumQS3 0 sumQS4 0
        sum_n_w2 0 sum_n_w3 0 sumtotal_n_w2 0 sumtotal_n_w3 0;
 sumQ1+Q1; sumQ2+Q2; sumQ3+Q3; sumQ4+Q4;
 sumQS1+QS1; sumQS2+QS2; sumQS3+QS3; sumQS4+QS4;
 sum_n_w2+new_w2; sum_n_w3+new_w3;
 sumtotal_n_w2+new_w2*mean; sumtotal_n_w3+new_w3*mean;
 cum_n1=sumtotal_n_w2/sum_n_w2;
 cum_n2=sumtotal_n_w3/sum_n_w3;
 if df ge 1 then chi_1=cdf('CHISQUARE',sumQ1,df); else chi_1=.;
 if df ge 1 then chi_2=cdf('CHISQUARE',sumQ2,df); else chi_2=.;
 if df ge 1 then chi_3=cdf('CHISQUARE',sumQ3,df); else chi_3=.;
 if df ge 1 then chi_4=cdf('CHISQUARE',sumQ4,df); else chi_4=.;
 pv_1=1-chi_1; pv_2=1-chi_2; pv_3=1-chi_3; pv_4=1-chi_4;
 if df ge 1 then reject_1=(pv_1 <= 0.05); else pv_1=.;
 if df ge 1 then reject_2=(pv_2 <= 0.05); else pv_2=.;
 if df ge 1 then reject_3=(pv_3 <= 0.05); else pv_3=.;
 if df ge 1 then reject_4=(pv_4 <= 0.05); else pv_4=.;
 if df ge 1 then schi_1=cdf('CHISQUARE',sumQS1,df); else schi_1=.;
 if df ge 1 then schi_2=cdf('CHISQUARE',sumQS2,df); else schi_2=.;
 if df ge 1 then schi_3=cdf('CHISQUARE',sumQS3,df); else schi_3=.;
 if df ge 1 then schi_4=cdf('CHISQUARE',sumQS4,df); else schi_4=.;
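/* The tau and new_w steps follow the DerSimonian-Laird moment estimator for
   between-survey variance:
       c    = sum(w) - sum(w**2)/sum(w)
       tau2 = (Q - df)/c
       w*   = 1/(var + tau2),
   so cum_n1 and cum_n2 are random-effects pooled means under weighting
   schemes 2 and 3.  Note the code, as written, neither truncates a negative
   tau at zero nor uses the accumulated sumQ in the numerator; a conventional
   stand-alone sketch, with hypothetical variables SUMW, SUMW2, SUMQ, DF, VAR:

   data dl;
      set qsums;
      c     = sumw - sumw2/sumw;
      tau2  = max(0, (sumq - df)/c);  * DerSimonian-Laird tau-squared;
      wstar = 1/(var + tau2);         * random-effects weight;
   run;
*/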

 if df ge 1 then spv_1=1-schi_1; else spv_1=.;
 if df ge 1 then spv_2=1-schi_2; else spv_2=.;
 if df ge 1 then spv_3=1-schi_3; else spv_3=.;
 if df ge 1 then spv_4=1-schi_4; else spv_4=.;
 sreject_1=(spv_1 <= 0.05); sreject_2=(spv_2 <= 0.05);
 sreject_3=(spv_3 <= 0.05); sreject_4=(spv_4 <= 0.05);
 run;

data srs_runningtotal1;
 set srs_cumulative_&f.;
 outsideflag1=ifn(cum1 < poptotallower or cum1 > poptotalupper,1,0);
 outsideflag2=ifn(cum2 < poptotallower or cum2 > poptotalupper,1,0);
 outsideflag3=ifn(cum3 < poptotallower or cum3 > poptotalupper,1,0);
 outsideflag4=ifn(cum4 < poptotallower or cum4 > poptotalupper,1,0); run;

proc summary data=srs_runningtotal1;
 var outsideflag1 outsideflag2 outsideflag3 outsideflag4;
 output out=srs_runningtotal2(drop=_TYPE_ _FREQ_)
        sum=outside_1 outside_2 outside_3 outside_4; run;

%macro tsnnam;
%if &i=1 %then %do;
 data srs_metric_&f.; set srs_runningtotal2; run;
 data srs_pv_&f.;
  set srs4_&f.(where=(replicate=&R./2) keep=replicate reject: sreject:); run;
%end;
%else %do;
 proc append base=srs_metric_&f. data=srs_runningtotal2; run;
 proc append base=srs_pv_&f.
  data=srs4_&f.(where=(replicate=&R./2) keep=replicate reject: sreject:); run;
%end;
%mend;
%tsnnam;

proc summary data=srs_pv_&f.;
 var reject: sreject:;
 output out=srs_pv_2(drop=_TYPE_ _FREQ_) sum=reject1-reject4 sreject1-sreject4; run;

proc summary data=srs_metric_&f.;
 var outside_1-outside_4;
 output out=srs_averagetimesout(drop=_TYPE_ _FREQ_) mean=outside1-outside4; run;


File Name: Section_4strata

* Run surveymeans on the various samples;

proc sort data=samples_strata; by replicate strata; run;

ods listing close; * kind of like a noprint for surveymeans - a new procedure under the new model;

data samplingrate (drop=temp: i j);
 format strata $3.;
 do j=1 to &r.;
  Replicate=j;
  strata=i;
  tempi=1;
  tempstring="&STRATASAMPRATE";
  do while(scan(tempstring,tempi," ()") > " ");
   strata=tempi;
   _rate_=input(scan(tempstring,strata," ()"),6.4);
   tempi+1;
   output;
  end;
 end; run;

proc freq data=samplingrate; tables strata*_rate_/list; run;

proc surveymeans data=samples_strata sum varsum mean var rate=samplingrate;
 by replicate;
 strata strata;
 var y;
 weight samplingweight;
 ods output statistics=strat_stats; run;

* calculate variance for stratified design (above) and srs (below);
proc surveymeans data=samples_strata sum varsum mean var;
 by replicate;
 var y;
 ods output statistics=strat_deff_base; run;

ods listing;

*rename;
data strat_deff_base;
 set strat_deff_base;
 varbase=var;
 keep replicate varbase; run;

* merge into one;
data deffcalc; merge strat_stats strat_deff_base; by replicate; run;

* deff calc simplified;
data deffcalc; set deffcalc; deff=var/varbase; run;

* keep deff;
data deffmerge; set deffcalc(keep=replicate deff); run;

* and merge deff into strat_stats;
data strat_stats; merge strat_stats deffmerge; by replicate; run;

* Add weights to the various samples;
* Weight 1 = sample size - a constant in our replicates;
* Weight 2 = inverse of variance;
* Weight 3 = design effect;
* Weight 4 = 1;

data strat_stats;
 set strat_stats;
 weight1=1/&LITTLEN.;
 weight2=1/var;
 weight3=deff*(1/var);
 weight4=1;
 mergevar='1';
 id=_n_; run;

* Generate an actual total;
proc summary data=finitepopulation_&f.;
 var y;
 output out=fptotal(drop=_TYPE_ _FREQ_) sum=fptotal mean=fpmean; run;

data fptotal; set fptotal; mergevar='1'; run;

data strat_cumulative_&f.(drop=mergevar);
 merge strat_stats fptotal; by mergevar; run;

data strat_cumulative_&F.;
 set strat_cumulative_&f.;
 where id le &r./2;
 retain sumweight1 0 sumtotal1 0 sumweight2 0 sumtotal2 0
        sumweight3 0 sumtotal3 0 sumweight4 0 sumtotal4 0;
 sumweight1+weight1; sumtotal1+weight1*mean;
 sumweight2+weight2; sumtotal2+weight2*mean;
 sumweight3+weight3; sumtotal3+weight3*mean;
 sumweight4+weight4; sumtotal4+weight4*mean;
 x=_N_; * for plotting on ye olde graphe;
 cum1=sumtotal1/sumweight1; cum2=sumtotal2/sumweight2;
 cum3=sumtotal3/sumweight3; cum4=sumtotal4/sumweight4;
 *superpopulation;
 popmean=&SPMEAN.;
 poptotalupper=popmean*(1+&ERRORBAR);
 poptotallower=popmean*(1-&ERRORBAR);
 * relative bias - abs(estimate-true)/true;
 * true defined as finitepopulation_&f. value;
 relbias_fp=abs(cum1-fpmean)/fpmean*100;
 sumbias_fp+relbias_fp;
 *avgrelbias_fp=sumbias_fp/_N_;
 relbias_sp=abs(cum1-&SPMEAN.)/&SPMEAN.*100;
 sumbias_sp+relbias_sp;
 *avgrelbias_sp=sumbias_sp/_N_;
 df=x-1; run;

data strat_cumulative_&f.; set strat_cumulative_&f.; order=_N_; run;

proc sort data=strat_cumulative_&f.; by descending order; run;

* Then carry the value forward;
data strat3_&f.;
 set strat_cumulative_&f.;
 retain sum1 sum2 sum3 sum4;
 if _N_=1 then do; sum1=cum1; sum2=cum2; sum3=cum3; sum4=cum4; end;
 h=1; run;

proc sort data=strat3_&f.; by order; run;

proc univariate data=deffmerge noprint;
 var deff;
 output out=deff1 max=max; run;

data deff1; set deff1; h=1; run;

data strat3_&f.;
 merge strat3_&f. deff1; by h;
 *Q reg statistic;
 Q1=weight1*((mean-sum1)**2); Q2=weight2*((mean-sum2)**2);
 Q3=weight3*((mean-sum3)**2); Q4=weight4*((mean-sum4)**2);
 QS1=Q1/max; QS2=Q2/max; QS3=Q3/max; QS4=Q4/max;
 H=1; run;

data strat4_&f.;
 set strat3_&f.; by H;
 retain sumQ1 0 sumQ2 0 sumQ3 0 sumQ4 0 sumQS1 0 sumQS2 0 sumQS3 0 sumQS4 0;
 sumQ1+Q1; sumQ2+Q2; sumQ3+Q3; sumQ4+Q4;
 sumQS1+QS1; sumQS2+QS2; sumQS3+QS3; sumQS4+QS4;
 if df ge 1 then chi_1=cdf('CHISQUARE',sumQ1,df); else chi_1=.;
 if df ge 1 then chi_2=cdf('CHISQUARE',sumQ2,df); else chi_2=.;
 if df ge 1 then chi_3=cdf('CHISQUARE',sumQ3,df); else chi_3=.;
 if df ge 1 then chi_4=cdf('CHISQUARE',sumQ4,df); else chi_4=.;
 pv_1=1-chi_1; pv_2=1-chi_2; pv_3=1-chi_3; pv_4=1-chi_4;
 reject_1=(pv_1 <= 0.05); reject_2=(pv_2 <= 0.05);
 reject_3=(pv_3 <= 0.05); reject_4=(pv_4 <= 0.05);
 if df ge 1 then schi_1=cdf('CHISQUARE',sumQS1,df); else schi_1=.;
 if df ge 1 then schi_2=cdf('CHISQUARE',sumQS2,df); else schi_2=.;
 if df ge 1 then schi_3=cdf('CHISQUARE',sumQS3,df); else schi_3=.;
 if df ge 1 then schi_4=cdf('CHISQUARE',sumQS4,df); else schi_4=.;
 spv_1=1-schi_1; spv_2=1-schi_2; spv_3=1-schi_3; spv_4=1-schi_4;
 sreject_1=(spv_1 <= 0.05); sreject_2=(spv_2 <= 0.05);
 sreject_3=(spv_3 <= 0.05); sreject_4=(spv_4 <= 0.05);
 run;

* set running total to keep track how often we are outside ranges of something or other;
data strat_runningtotal1;
 set strat_cumulative_&f.;
 outsideflag1=ifn(cum1 < poptotallower or cum1 > poptotalupper,1,0);
 outsideflag2=ifn(cum2 < poptotallower or cum2 > poptotalupper,1,0);
 outsideflag3=ifn(cum3 < poptotallower or cum3 > poptotalupper,1,0);
 outsideflag4=ifn(cum4 < poptotallower or cum4 > poptotalupper,1,0); run;

proc summary data=strat_runningtotal1;
 var outsideflag1 outsideflag2 outsideflag3 outsideflag4;
 output out=strat_runningtotal2(drop=_TYPE_ _FREQ_)
        sum=outside_1 outside_2 outside_3 outside_4; run;

%macro tsnnam;
%if &i=1 %then %do;
 data strat_metric_&f.; set strat_runningtotal2; run;
 data strat_pv_&f.;
  set strat4_&f(where=(replicate=&R./2) keep=replicate reject: sreject:); run;
%end;
%else %do;
 proc append base=strat_metric_&f. data=strat_runningtotal2; run;
 proc append base=strat_pv_&f.
  data=strat4_&f.(where=(replicate=&R./2) keep=replicate reject: sreject:); run;
%end;
%mend;
%tsnnam;

proc summary data=strat_pv_&f.;
 var reject: sreject:;
 output out=strat_pv_2(drop=_TYPE_ _FREQ_) sum=reject1-reject4 sreject1-sreject4; run;

proc summary data=strat_metric_&f.;
 var outside_1-outside_4;
 output out=strat_averagetimesout(drop=_TYPE_ _FREQ_) mean=outside1-outside4; run;


File Name: Section_5.sas

* Section 5 is graphing;
* Graph one, for SRS;
* Copying some examples;

goptions reset=all device=png htext=8pt xpixels=2000 ypixels=1000;
axis1 length=57 pct label=none major=none minor=none;
axis2 label=none major=none minor=none offset=(4,4);

title f=swissb h=14pt move=(17,23) "SRS, &R. replicates,
 Finite population &F., Superpop Mean=&SPMEAN.,
 Superpop Sigma &SPVAR., n = &littlen., N = &bign.";

* weight one graph;
symbol1 color=purple interpol=join line=1 width=0.5 value=none;
* weight two graph;
symbol2 color=red interpol=join line=1 width=0.5 value=none;
* weight three graph;
symbol3 color=green interpol=join line=1 width=0.5 value=none;
* weight four graph;
symbol4 color=blue interpol=join line=1 width=0.5 value=none;
* superpopulation mean bar;
symbol5 color=black interpol=join line=1 width=0.5 value=none;
*finite population mean bar;
symbol6 color=brown interpol=join line=1 width=0.5 value=none;

*proc gplot data=srs_cumulative_&f. gout=graph_srs;
*plot cum1*x cum2*x cum3*x cum4*x popmean*x fpmean*x/overlay;
* run;
* quit;

title f=swissb h=14pt move=(17,23) "STRAT, &R. replicates, Finite population &F., seed= &seed";

proc gplot data=srs_cumulative gout=graph_srs;
 plot cum1*x cum2*x cum3*x cum4*x popmean*x fpmean*x/overlay;
run; quit;

*proc gplot data=strat_cumulative_&f. gout=graph_strat;
*plot cum1*x cum2*x cum3*x cum4*x popmean*x fpmean*x/overlay;
* run;
* quit;

*title f=swissb h=14pt move=(17,23) "STRATSP, &R. replicates, Finite population &F., seed= &seed";
*proc gplot data=strat_SP_cumulative_&f. gout=graph_strat;
*plot cum1*x cum2*x cum3*x cum4*x popmean*x fpmean*x/overlay;
* run;
* quit;

proc gplot data=strat_SP_cumulative gout=graph_strat;
 plot cum1*x cum2*x cum3*x cum4*x popmean*x fpmean*x/overlay;
run; quit;
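/* Section_5.sas overlays the four cumulative pooled estimates (cum1-cum4)
   against the number of surveys combined (x), with flat reference lines at
   the superpopulation mean (popmean) and the finite-population mean (fpmean);
   convergence of the four coloured traces toward those reference lines is
   the convergence behaviour discussed in the text. */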

/*
title f=swissb h=14pt move=(17,23) "Cluster, &R. replicates, Finite population &F., seed= &seed";

proc gplot data=cluster_cumulative_&f. gout=graph_cluster;
 plot cum1*x cum2*x cum3*x cum4*x popmean*x fpmean*x/overlay; run;
quit;

title f=swissb h=14pt move=(17,23) "Cluster, &R. replicates, Finite population &F., seed= &seed";

proc gplot data=cluster_cumulativeSP_&f. gout=graph_cluster;
 plot cum1*x cum2*x cum3*x cum4*x popmean*x fpmean*x/overlay; run;
quit;
*/


File Name: comparing to pooling datasets.sas

/* compared to pooling in a large data set*/

proc means data=samples_strata;
 *by replicate;
 var y;
 output out=strat_pooled mean=ndmean var=ndvar; run;

proc univariate data=samples_strata;
 var samplingweight; run;

proc surveymeans data=samples_strata rate=samplingrate;
 var y;
 strata strata;
 *by replicate;
 weight samplingweight;
 ods output statistics=strat_pooled2; run;

proc surveyreg data=samples_strata rate=samplingrate;
 model y=samplingweight;
 estimate 'Estimate of Y under Model using Weights'
  INTERCEPT 252000 samplingweight 50000000/e; run;

proc surveyreg data=samples_strata rate=samplingrate;
 class strata;
 strata strata;
 model y=strata/solution;
 weight samplingweight;
 estimate 'Estimate of Y under Model using strata'
  INTERCEPT 252000
  strata 12500 12500 12500 12500 12500 12500 12500 12500/e; run;


File Name: datasetfinal.sas

*permanent data sets;

PROC EXPORT DATA=WORK.SRS_PV_2
 OUTFILE="F:\Thesis\OUTPUT\significance tables\srs_&R._&littlen._&loops._case2.xls"
 DBMS=EXCEL REPLACE;
 SHEET="d";
RUN;

PROC EXPORT DATA=WORK.strat_PV_SP2
 OUTFILE="F:\Thesis\OUTPUT\significance tables\strat_&R._&littlen._&loops._case2.xls"
 DBMS=EXCEL REPLACE;
 SHEET="d";
RUN;

PROC EXPORT DATA=WORK.Cluster_PV_SP2
 OUTFILE="F:\Thesis\OUTPUT\significance tables\Cluster_&R._&littlen._&loops._case2.xls"
 DBMS=EXCEL REPLACE;
 SHEET="d";
RUN;

PROC EXPORT DATA=WORK.Srs_averagetimesout
 OUTFILE="F:\Thesis\OUTPUT\significance tables\srs_averagetimesout_&R._&littlen._&loops._case2.xls"
 DBMS=EXCEL REPLACE;
 SHEET="d";
RUN;

PROC EXPORT DATA=WORK.Strat_averagetimesoutSP
 OUTFILE="F:\Thesis\OUTPUT\significance tables\strat_averagetimesout_&R._&littlen._&loops._case2.xls"
 DBMS=EXCEL REPLACE;
 SHEET="d";
RUN;

PROC EXPORT DATA=WORK.Cluster_averagetimesoutSP
 OUTFILE="F:\Thesis\OUTPUT\significance tables\cluster_averagetimesout_&R._&littlen._&loops._case2.xls"
 DBMS=EXCEL REPLACE;
 SHEET="d";
RUN;

*Graphs;

data srs_graph;
 length x 8.;
 set srs_cumulative_1;
 keep x cum1-cum4 fpmean popmean; run;

proc sort data=srs_graph; by x; run;

PROC EXPORT DATA=WORK.srs_graph
 OUTFILE="C:\Documents and Settings\foxkarl\Desktop\SAS PROGRAMS REDONE\Current Version\Datasets\srs_cumulative_&littlen._1_&r..xls"
 DBMS=EXCEL REPLACE;
 SHEET="d";
RUN;

data strat_graph;
 length x 8.;
 set strat_cumulative_1;
 keep x cum1-cum4 fpmean popmean; run;

proc sort data=strat_graph; by x; run;

PROC EXPORT DATA=WORK.strat_graph
 OUTFILE="C:\Documents and Settings\foxkarl\Desktop\SAS PROGRAMS REDONE\Current Version\Datasets\strat_cumulative_&littlen._1_&r..xls"
 DBMS=EXCEL REPLACE;
 SHEET="d";
RUN;

data cluster_graph;
 length x 8.;
 set cluster_cumulative_1;
 keep x cum1-cum4 fpmean popmean; run;

proc sort data=cluster_graph; by x; run;

PROC EXPORT DATA=WORK.cluster_graph
 OUTFILE="C:\Documents and Settings\foxkarl\Desktop\SAS PROGRAMS REDONE\Current Version\Datasets\cluster_cumulative_&littlen._4_&r..xls"
 DBMS=EXCEL REPLACE;
 SHEET="d";
RUN;


File Name: ESL.sas

*ESL;
*SRS;

ods listing close;

data alt_srs;
 set a_srs4_&f.;
 flag=1;
 keep chi_: schi_: flag;
 where x=&r./2; run;

data null_srs;
 set srs4_&f.;
 flag=0;
 keep chi_: schi_: flag;
 where x=&r./2; run;

data all_srs; set alt_srs null_srs; run;

%macro eslsrs;
%if &i=1 %then %do;
 data all_srs2; set all_srs; run;
%end;
%else %do;
 proc append base=all_srs2 data=all_srs; run;
%end;
%mend;
%eslsrs;

proc npar1way wilcoxon data=all_srs2;
 class flag;
 var chi_: schi_:;
 output out=srs_ESL wilcoxon; run;
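/* ESL.sas stacks the accumulated chi-square statistics from the alternative
   runs (the a_-prefixed datasets, flag = 1) and the corresponding null runs
   (flag = 0), then compares the two groups with a Wilcoxon rank-sum test in
   proc npar1way, giving an empirical check on the significance level of the
   pooled tests; the same pattern is repeated below for the stratified and
   cluster designs. */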

*Strata;

data alt_strat;
 set a_strat4_&f.;
 flag=1;
 keep chi_: schi_: flag;
 where x=&r./2; run;

data null_strat;
 set strat4_&f.;
 flag=0;
 keep chi_: schi_: flag;
 where x=&r./2; run;

data all_strat; set alt_strat null_strat; run;

%macro eslstrat;
%if &i=1 %then %do;
 data all_strat2; set all_strat; run;
%end;
%else %do;
 proc append base=all_strat2 data=all_strat; run;
%end;
%mend;
%eslstrat;

proc npar1way wilcoxon data=all_strat2;
 class flag;
 var chi_: schi_:;
 output out=strat_ESL wilcoxon; run;

*cluster;

data alt_cluster;
 set a_cluster4_&f.;
 flag=1;
 keep chi_: schi_: flag;
 where x=&r./2; run;

data null_cluster;
 set cluster4_&f.;
 flag=0;
 keep chi_: schi_: flag;
 where x=&r./2; run;

data all_cluster; set alt_cluster null_cluster; run;

%macro eslcluster;
%if &i=1 %then %do;
 data all_cluster2; set all_cluster; run;
%end;
%else %do;
 proc append base=all_cluster2 data=all_cluster; run;
%end;
%mend;
%eslcluster;

proc npar1way wilcoxon data=all_cluster2;
 class flag;
 var chi_: schi_:;
 output out=cluster_ESL wilcoxon; run;

ods listing;


File Name: Finite_Spdiff.sas

*To show how far FPMEAN is from SPMEAN;

%let SPMEAN=1;
%let ALTMEAN=1.5;
%let SPVAR=10;
%let Seed=666;
%let BIGN=100000; * low for development purposes;
%let LITTLEN=500; * low for development purposes;

data finitepopulation(keep=y y_a);
 do i=1 to &BIGN;
  rand=rannor(&SEED.);
  y=&SPMEAN.+&SPVAR.*rand;
  y_a=&altmean.+&Spvar.*rand;
  output;
 end; * do i=1 to bign loop;
run;
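/* Finite_Spdiff.sas draws a finite population of N = BIGN values
   y = SPMEAN + SPVAR*z with z standard normal, so the realised fpmean varies
   around SPMEAN with standard deviation SPVAR/sqrt(N); for SPVAR = 10 and
   N = 100000 that is about 10/316.2 = 0.032, which is the size of the
   finite-population versus superpopulation gap the program is built to show. */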

proc summary data=finitepopulation;
 var y;
 output out=fpsp_&spvar. mean=fpmean; run;

proc append base=fpsp_1 data=fpsp_30; run;

proc sort data=fpsp_1000; by _FREQ_; run;


File Name: Section_SRS_case2nextbit.sas

%macro appendall;

* seed with first dataset;
data srs_cumulative; set srs_cumulative_1; run;

%do i=2 %to &finpops.;
 proc append base=srs_cumulative data=srs_cumulative_&i.; run;
%end;

%mend;
%appendall;

data srs_cumulative;
 set srs_cumulative;
 retain sumweight1 0 sumtotal1 0 sumweight2 0 sumtotal2 0
        sumweight3 0 sumtotal3 0 sumweight4 0 sumtotal4 0;
 sumweight1+weight1; sumtotal1+weight1*mean;
 sumweight2+weight2; sumtotal2+weight2*mean;
 sumweight3+weight3; sumtotal3+weight3*mean;
 sumweight4+weight4; sumtotal4+weight4*mean;
 x=_N_; * for plotting on ye olde graphe;
 cum1=sumtotal1/sumweight1; cum2=sumtotal2/sumweight2;
 cum3=sumtotal3/sumweight3; cum4=sumtotal4/sumweight4;
 *superpopulation;
 popmean=&SPMEAN.;
 poptotalupper=popmean*(1+&ERRORBAR);
 poptotallower=popmean*(1-&ERRORBAR);
 * relative bias - abs(estimate-true)/true;
 * true defined as finitepopulation_&f. value;
 relbias_fp=abs(cum1-fpmean)/fpmean*100;
 sumbias_fp+relbias_fp;
 *avgrelbias_fp=sumbias_fp/_N_;
 relbias_sp=abs(cum1-&SPMEAN.)/&SPMEAN.*100;
 sumbias_sp+relbias_sp;
 *avgrelbias_sp=sumbias_sp/_N_;
 df=x-1; run;

data srs_cumulative; set srs_cumulative; order=_N_; run;

proc sort data=srs_cumulative; by descending order; run;

* Then carry the value forward;
data srs3;
 set srs_cumulative;
 retain sum1 sum2 sum3 sum4;
 if _N_=1 then do; sum1=cum1; sum2=cum2; sum3=cum3; sum4=cum4; end; run;

proc sort data=srs3; by order; run;

data srs3;
 set srs3;
 *Q reg statistic;
 Q1=weight1*((mean-sum1)**2); Q2=weight2*((mean-sum2)**2);
 Q3=weight3*((mean-sum3)**2); Q4=weight4*((mean-sum4)**2);
 QS1=Q1/(1-&littlen./&bigN.); QS2=Q2/(1-&littlen./&bigN.);
 QS3=Q3/(1-&littlen./&bigN.); QS4=Q4/(1-&littlen./&bigN.);
 H=1; run;
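/* The appendall macro stacks the per-population cumulative datasets
   (srs_cumulative_1, srs_cumulative_2, ...) into a single srs_cumulative
   before the pooling and Q computations are rerun, so this case treats every
   replicate from every finite population as one long sequence of surveys to
   be combined. */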

data srs4;
 set srs3; by H;
 retain sumQ1 0 sumQ2 0 sumQ3 0 sumQ4 0 sumQS1 0 sumQS2 0 sumQS3 0 sumQS4 0;
 sumQ1+Q1; sumQ2+Q2; sumQ3+Q3; sumQ4+Q4;
 sumQS1+QS1; sumQS2+QS2; sumQS3+QS3; sumQS4+QS4;
 if df ge 1 then chi_1=cdf('CHISQUARE',sumQ1,df); else chi_1=.;
 if df ge 1 then chi_2=cdf('CHISQUARE',sumQ2,df); else chi_2=.;
 if df ge 1 then chi_3=cdf('CHISQUARE',sumQ3,df); else chi_3=.;
 if df ge 1 then chi_4=cdf('CHISQUARE',sumQ4,df); else chi_4=.;
 pv_1=1-chi_1; pv_2=1-chi_2; pv_3=1-chi_3; pv_4=1-chi_4;
 if df ge 1 then reject_1=(pv_1 <= 0.05); else pv_1=.;
 if df ge 1 then reject_2=(pv_2 <= 0.05); else pv_2=.;
 if df ge 1 then reject_3=(pv_3 <= 0.05); else pv_3=.;
 if df ge 1 then reject_4=(pv_4 <= 0.05); else pv_4=.;
 if df ge 1 then schi_1=cdf('CHISQUARE',sumQS1,df); else schi_1=.;
 if df ge 1 then schi_2=cdf('CHISQUARE',sumQS2,df); else schi_2=.;
 if df ge 1 then schi_3=cdf('CHISQUARE',sumQS3,df); else schi_3=.;
 if df ge 1 then schi_4=cdf('CHISQUARE',sumQS4,df); else schi_4=.;
 if df ge 1 then spv_1=1-schi_1; else spv_1=.;
 if df ge 1 then spv_2=1-schi_2; else spv_2=.;
 if df ge 1 then spv_3=1-schi_3; else spv_3=.;
 if df ge 1 then spv_4=1-schi_4; else spv_4=.;
 sreject_1=(spv_1 <= 0.05); sreject_2=(spv_2 <= 0.05);
 sreject_3=(spv_3 <= 0.05); sreject_4=(spv_4 <= 0.05);
 run;

data srs_runningtotal1;
 set srs_cumulative;
 outsideflag1=ifn(cum1 < poptotallower or cum1 > poptotalupper,1,0);
 outsideflag2=ifn(cum2 < poptotallower or cum2 > poptotalupper,1,0);
 outsideflag3=ifn(cum3 < poptotallower or cum3 > poptotalupper,1,0);
 outsideflag4=ifn(cum4 < poptotallower or cum4 > poptotalupper,1,0); run;

proc summary data=srs_runningtotal1;
 var outsideflag1 outsideflag2 outsideflag3 outsideflag4;
 output out=srs_runningtotal2(drop=_TYPE_ _FREQ_)
        sum=outside_1 outside_2 outside_3 outside_4; run;

data srs_pv; %do i=2 %to &f.; * Then carry the value forward; set srs4(where=(replicate=&R.) proc append base=srs_cumulative data srs3; keep=replicate reject: sreject:); data=srs_cumulative_&i.; run; set srs_cumulative; run; retain sum1 sum2 sum3 sum4 ; %end; if _N_=1 then do; sum1=cum1 ; %end; sum2=cum2; %else %do; data srs_cumulative; sum3=cum3; set srs_cumulative; sum4=cum4; proc append base=srs_metric retain sumweight1 0 sumtotal1 0 end; run; data=srs_runningtotal2; run; sumweight2 0 sumtotal2 0 sumweight3 0 sumtotal3 proc sort data = srs3; by order; proc append base=srs_pv 0 sumweight4 0 sumtotal4 0; run; data=srs4(where=(replicate=&R.) keep=replicate reject: sreject:); sumweight1+weight1; data srs3; run; sumtotal1+weight1*mean; set srs3; sumweight2+weight2; *Q reg statistic; %end; sumtotal2+weight2*mean; Q1 = weight1*((mean-sum1)**2); sumweight3+weight3; Q2 = weight2*((mean-sum2)**2); %mend; sumtotal3+weight3*mean; Q3 = weight3*((mean-sum3)**2); sumweight4+weight4; Q4 = weight4*((mean-sum4)**2); %tsnnam; sumtotal4+weight4*mean; QS1 = Q1/(1-&littlen./&bigN.); QS2 = Q2/(1-&littlen./&bigN.); * set running total to keep track x=_N_; * for plotting on ye QS3 = Q3/(1-&littlen./&bigN.); how often we are outside ranges olde graphe; QS4 = Q4/(1-&littlen./&bigN.); of somethingorother; cum1=sumtotal1/sumweight1; H=1; run; 295

if df ge 1 then spv_3 = 1- schi_3; else spv_3 = .; %mend; data srs4; if df ge 1 then spv_4 = 1- set srs3; by H; schi_4; else spv_4 = .; %tsnnam; retain sumQ1 0 sumQ2 0 sumQ3 0 sumQ4 0 sumQS1 0 sreject_1=(spv_1 <= 0.05); * set running total to keep track sumQS2 0 sumQS3 0 sumQS4 0; sreject_2=(spv_2 <= 0.05); how often we are outside ranges sumQ1 +Q1; sreject_3=(spv_3 <= 0.05); of something or other; sumQ2 +Q2; sreject_4=(spv_4 <= 0.05); sumQ3 +Q3; run; proc summary data = srs_pv; sumQ4 +Q4; var reject: sreject: ; sumQS1+QS1; data srs_runningtotal1; output out=srs_pv_2(drop=_TYPE_ sumQS2+QS2; set srs_cumulative; _FREQ_) sum=reject1-reject4 sumQS3+Qs3; outsideflag1=ifn(cum1 < sreject1-sreject4; sumQs4+Qs4; poptotallower or cum1 > run; poptotalupper,1,0); if df ge 1 then chi_1 = outsideflag2=ifn(cum2 < proc summary data = srs_metric; cdf('CHISQUARE',sumQ1,df); else poptotallower or cum2 > var outside_1-outside_4; chi_1 = .; poptotalupper,1,0); output out=srs_averagetimesout if df ge 1 then chi_2 = outsideflag3=ifn(cum3 < (drop=_TYPE_ _FREQ_) mean= cdf('CHISQUARE',sumQ2,df); else poptotallower or cum3 > outside1-outside4; chi_2 = .; poptotalupper,1,0); run; if df ge 1 then chi_3 = outsideflag4=ifn(cum4 < cdf('CHISQUARE',sumQ3,df); else poptotallower or cum4 > %mend; chi_3 = .; poptotalupper,1,0); run; if df ge 1 then chi_4 = %appendall; cdf('CHISQUARE',sumQ4,df); else proc summary chi_4 = .; data=srs_runningtotal1; var outsideflag1 outsideflag2 pv_1 = 1-chi_1; outsideflag3 outsideflag4; File Name: pv_2 = 1-chi_2; output Section_Cluster_case2nextbit.sas pv_3 = 1-chi_3; out=srs_runningtotal2(drop=_TYPE_ pv_4 = 1-chi_4; _FREQ_) sum=outside_1 outside_2 %macro appendall; outside_3 outside_4; run; if df ge 1 then reject_1=(pv_1 * seed with first dataset; <= 0.05); else pv_1 = .; data cluster_cumulativeSP; if df ge 1 then reject_2=(pv_2 %macro tsnnam; set cluster_cumulativeSP_1 ; <= 0.05); else pv_2 = .; %if &i=1 %then %do; run; if df ge 1 then reject_3=(pv_3 <= 0.05); else pv_3 = .; data srs_metric; %do i=2 %to &finpops.; if df ge 1 then reject_4=(pv_4 set srs_runningtotal2; run; <= 0.05); else pv_4 = .; proc append data srs_pv; base=cluster_cumulativeSP if df ge 1 then schi_1 = set data=cluster_cumulativeSP_&i.; cdf('CHISQUARE',sumQs1,df); else srs4(where=(replicate=&R.) run; schi_1 = .; keep=replicate reject: sreject:); if df ge 1 then schi_2 = run; %end; cdf('CHISQUARE',sumQs2,df); else schi_2 = .; %end; %mend; if df ge 1 then schi_3 = %else %do; cdf('CHISQUARE',sumQs3,df); else %appendall; schi_3 = .; proc append base=srs_metric if df ge 1 then schi_4 = data=srs_runningtotal2; run; data cluster_cumulativeSP; cdf('CHISQUARE',sumQs4,df); else set cluster_cumulativeSP; schi_4 = .; proc append base=srs_pv data=srs4(where=(replicate=&R.) retain sumweight1 0 sumtotal1 0 if df ge 1 then spv_1 = 1- keep=replicate reject: sreject:); sumweight2 0 sumtotal2 0 schi_1; else spv_1 = .; run; sumweight3 0 sumtotal3 if df ge 1 then spv_2 = 1- 0 sumweight4 0 sumtotal4 0; schi_2; else spv_2 = .; %end; 296

sumweight1+weight1; if df ge 1 then schi_1 = sumtotal1+weight1*mean; proc univariate data = cluster3SP cdf('CHISQUARE',sumQs1,df); else sumweight2+weight2; noprint; schi_1 = .; sumtotal2+weight2*mean; var deff ; if df ge 1 then schi_2 = sumweight3+weight3; output out=deff2SP max=max; run; cdf('CHISQUARE',sumQs2,df); else sumtotal3+weight3*mean; schi_2 = .; sumweight4+weight4; data deff2SP; if df ge 1 then schi_3 = sumtotal4+weight4*mean; set deff2SP; cdf('CHISQUARE',sumQs3,df); else h = 1; run; schi_3 = .; x=_N_; * for plotting on ye if df ge 1 then schi_4 = olde graphe; data cluster3SP; cdf('CHISQUARE',sumQs4,df); else cum1=sumtotal1/sumweight1; merge cluster3SP deff2SP; by h; schi_4 = .; cum2=sumtotal2/sumweight2; *Q reg statistic; cum3=sumtotal3/sumweight3; Q1 = weight1*((mean-sum1)**2); spv_1 = 1-schi_1; cum4=sumtotal4/sumweight4; Q2 = weight2*((mean-sum2)**2); spv_2 = 1-schi_2; *superpopulation ; Q3 = weight3*((mean-sum3)**2); spv_3 = 1-schi_3; popmean =&SPMEAN.; Q4 = weight4*((mean-sum4)**2); spv_4 = 1-schi_4; QS1 = Q1/max; poptotalupper=popmean*(1+&ERRORBA QS2 = Q2/max; sreject_1=(spv_1 <= 0.05); R); QS3 = Q3/max; sreject_2=(spv_2 <= 0.05); poptotallower=popmean*(1- QS4 = Q4/max; sreject_3=(spv_3 <= 0.05); &ERRORBAR); H=1; run; sreject_4=(spv_4 <= 0.05); run; * relative bias - abs(estimate- data cluster4SP; true)/true; set cluster3SP; by H; data cluster_runningtotal1SP; * true defined as retain sumQ1 0 sumQ2 0 sumQ3 0 set cluster_cumulativeSP; finitepopulation_&f. value; sumQ4 0 sumQS1 0 outsideflag1=ifn(cum1 < sumQS2 0 sumQS3 0 sumQS4 0; poptotallower or cum1 > relbias_fp=abs(cum1 - sumQ1 +Q1; poptotalupper,1,0); fpmean)/fpmean*100; sumQ2 +Q2; outsideflag2=ifn(cum2 < sumbias_fp+relbias_fp; sumQ3 +Q3; poptotallower or cum2 > *avgrelbias_fp=sumbias_fp/_N_; sumQ4 +Q4; poptotalupper,1,0); sumQS1+QS1; outsideflag3=ifn(cum3 < relbias_sp=abs(cum1- sumQS2+QS2; poptotallower or cum3 > &SPMEAN.)/&SPMEAN.*100; sumQS3+Qs3; poptotalupper,1,0); sumbias_sp+relbias_sp; sumQs4+Qs4; outsideflag4=ifn(cum4 < *avgrelbias_sp=sumbias_sp/_N_; poptotallower or cum4 > df = x-1; run; if df ge 1 then chi_1 = poptotalupper,1,0); run; cdf('CHISQUARE',sumQ1,df); else data cluster_cumulativeSP; chi_1 = .; proc summary set cluster_cumulativeSP; if df ge 1 then chi_2 = data=cluster_runningtotal1SP; order=_N_; run; cdf('CHISQUARE',sumQ2,df); else var outsideflag1 outsideflag2 chi_2 = .; outsideflag3 outsideflag4; proc sort if df ge 1 then chi_3 = output data=cluster_cumulativeSP; cdf('CHISQUARE',sumQ3,df); else out=cluster_runningtotal2SP(drop= by descending order; run; chi_3 = .; _TYPE_ _FREQ_) sum=outside_1 if df ge 1 then chi_4 = outside_2 outside_3 outside_4; * Then carry the value forward; cdf('CHISQUARE',sumQ4,df); else run; chi_4 = .; %macro tsnnam; data cluster3SP; %if &i=1 %then %do; set cluster_cumulativeSP; pv_1 = 1-chi_1; retain sum1 sum2 sum3 sum4 ; pv_2 = 1-chi_2; data cluster_metricSP; if _N_=1 then do; sum1=cum1 ; pv_3 = 1-chi_3; set sum2=cum2; pv_4 = 1-chi_4; cluster_runningtotal2SP; run; sum3=cum3; data cluster_pvSP; sum4=cum4; reject_1=(pv_1 <= 0.05); set end; h=1; run; reject_2=(pv_2 <= 0.05); cluster4SP(where=(replicate=&R.) reject_3=(pv_3 <= 0.05); keep=replicate reject: sreject:); proc sort data = cluster3SP; by reject_4=(pv_4 <= 0.05); run; order; run; 297

%end; %mend; data strat3_SP; %else %do; set strat_SP_cumulative; %appendall; retain sum1 sum2 sum3 sum4 ; proc append if _N_=1 then do; sum1=cum1 ; base=cluster_metricSP data strat_SP_cumulative; sum2=cum2; data=cluster_runningtotal2SP; set strat_SP_cumulative; sum3=cum3; run; retain sumweight1 0 sumtotal1 0 sum4=cum4; sumweight2 0 sumtotal2 0 end; h=1; run; proc append sumweight3 0 sumtotal3 base=cluster_pvSP 0 sumweight4 0 sumtotal4 0; proc sort data = strat3_SP; data=cluster4SP(where=(replicate= by order; run; &R.) keep=replicate reject: sumweight1+weight1; sreject:); run; sumtotal1+weight1*mean; proc univariate data = strat3_SP sumweight2+weight2; noprint; %end; sumtotal2+weight2*mean; var deff ; sumweight3+weight3; output out=deff1SP max=max; run; %mend; sumtotal3+weight3*mean; sumweight4+weight4; data deff1SP; %tsnnam; sumtotal4+weight4*mean; set deff1SP; h = 1; * set running total to keep track x=_N_; * for plotting on ye run; how often we are outside ranges olde graphe; of something or other; cum1=sumtotal1/sumweight1; data strat3_SP; cum2=sumtotal2/sumweight2; merge strat3_SP deff1SP; by h; proc summary data = Cluster_pvSP; cum3=sumtotal3/sumweight3; *Q reg statistic; var reject: sreject: ; cum4=sumtotal4/sumweight4; Q1 = weight1*((mean-sum1)**2); output *superpopulation ; Q2 = weight2*((mean-sum2)**2); out=cluster_pv_SP2(drop=_TYPE_ popmean =&SPMEAN.; Q3 = weight3*((mean-sum3)**2); _FREQ_) sum=reject1-reject4 Q4 = weight4*((mean-sum4)**2); sreject1-sreject4; run; poptotalupper=popmean*(1+&ERRORBA QS1 = Q1/max; R); QS2 = Q2/max; proc summary data = poptotallower=popmean*(1- QS3 = Q3/max; cluster_metricSP; &ERRORBAR); QS4 = Q4/max; var outside_1-outside_4; H=1; run; output * relative bias - abs(estimate- out=cluster_averagetimesoutSP true)/true; data strat4_SP; (drop=_TYPE_ _FREQ_) mean= * true defined as set strat3_SP; by H; outside1-outside4; run; finitepopulation_&f. value; retain sumQ1 0 sumQ2 0 sumQ3 0 sumQ4 0 sumQS1 0 sumQS2 0 sumQS3 relbias_fp=abs(cum1 - 0 sumQS4 0 ; fpmean)/fpmean*100; sumQ1 +Q1; File Name: sumbias_fp+relbias_fp; sumQ2 +Q2; Section_Strat_case2nextbit.sas *avgrelbias_fp=sumbias_fp/_N_; sumQ3 +Q3; sumQ4 +Q4; %macro appendall; relbias_sp=abs(cum1- sumQS1+QS1; &SPMEAN.)/&SPMEAN.*100; sumQS2+Qs2; * seed with first dataset; sumbias_sp+relbias_sp; sumQs3+Qs3; data strat_SP_cumulative; *avgrelbias_sp=sumbias_sp/_N_; sumQs4+Qs4; set strat_SP_cumulative_1 ; df = x-1; run; run; if df ge 1 then chi_1 = data strat_SP_cumulative; cdf('CHISQUARE',sumQ1,df); else %do i=2 %to &finpops.; set strat_SP_cumulative; chi_1 = .; order=_N_; run; if df ge 1 then chi_2 = proc append cdf('CHISQUARE',sumQ2,df); else base=strat_SP_cumulative proc sort chi_2 = .; data=strat_SP_cumulative_&i.; data=strat_SP_cumulative; if df ge 1 then chi_3 = run; by descending order; run; cdf('CHISQUARE',sumQ3,df); else chi_3 = .; %end; * Then carry the value forward;

298

if df ge 1 then chi_4 = var outsideflag1 outsideflag2 %macro appendall; cdf('CHISQUARE',sumQ4,df); else outsideflag3 outsideflag4; chi_4 = .; output * seed with first dataset; out=strat_SP_runningtotal2(drop=_ data strat_SP_cumulative; pv_1 = 1-chi_1; TYPE_ _FREQ_) sum=outside_1 set strat_SP_cumulative_1 ; pv_2 = 1-chi_2; outside_2 outside_3 outside_4; run; pv_3 = 1-chi_3; run; pv_4 = 1-chi_4; %do i=2 %to &finpops.; %macro tsnnam; reject_1=(pv_1 <= 0.05); proc append reject_2=(pv_2 <= 0.05); %if &i=1 %then %do; base=strat_SP_cumulative reject_3=(pv_3 <= 0.05); data=strat_SP_cumulative_&i.; reject_4=(pv_4 <= 0.05); data strat_metric_SP; run; set strat_SP_runningtotal2; if df ge 1 then schi_1 = run; %end; cdf('CHISQUARE',sumQS1,df); else schi_1 = .; data strat_pv_SP; %mend; if df ge 1 then schi_2 = set cdf('CHISQUARE',sumQS2,df); else strat4_SP(where=(replicate=&R.) %appendall; schi_2 = .; keep=replicate reject: sreject:); if df ge 1 then schi_3 = run; data strat_SP_cumulative; cdf('CHISQUARE',sumQS3,df); else set strat_SP_cumulative; schi_3 = .; %end; retain sumweight1 0 sumtotal1 0 if df ge 1 then schi_4 = %else %do; sumweight2 0 sumtotal2 0 cdf('CHISQUARE',sumQS4,df); else sumweight3 0 sumtotal3 schi_4 = .; proc append 0 sumweight4 0 sumtotal4 0; base=strat_metric_SP spv_1 = 1-schi_1; data=strat_SP_runningtotal2; run; sumweight1+weight1; spv_2 = 1-schi_2; sumtotal1+weight1*mean; spv_3 = 1-schi_3; proc append base=strat_pv_SP sumweight2+weight2; spv_4 = 1-schi_4; data=strat4_SP(where=(replicate=& sumtotal2+weight2*mean; R.) keep=replicate reject: sumweight3+weight3; sreject_1=(spv_1 <= 0.05); sreject:); run; sumtotal3+weight3*mean; sreject_2=(spv_2 <= 0.05); sumweight4+weight4; sreject_3=(spv_3 <= 0.05); %end; sumtotal4+weight4*mean; sreject_4=(spv_4 <= 0.05); run; %mend; x=_N_; * for plotting on ye olde graphe; * set running total to keep track %tsnnam; cum1=sumtotal1/sumweight1; how often we are outside ranges cum2=sumtotal2/sumweight2; of somethingorother; proc summary data = strat_pv_SP; cum3=sumtotal3/sumweight3; var reject: sreject: ; cum4=sumtotal4/sumweight4; data strat_SP_runningtotal1; output *superpopulation ; set strat_SP_cumulative; out=strat_pv_SP2(drop=_TYPE_ popmean =&SPMEAN.; outsideflag1=ifn(cum1 < _FREQ_) sum=reject1-reject4 poptotallower or cum1 > sreject1-sreject4; run; poptotalupper=popmean*(1+&ERRORBA poptotalupper,1,0); R); outsideflag2=ifn(cum2 < proc summary data = poptotallower=popmean*(1- poptotallower or cum2 > strat_metric_SP; &ERRORBAR); poptotalupper,1,0); var outside_1-outside_4; outsideflag3=ifn(cum3 < output * relative bias - abs(estimate- poptotallower or cum3 > out=strat_averagetimesoutSP true)/true; poptotalupper,1,0); (drop=_TYPE_ _FREQ_) mean= * true defined as outsideflag4=ifn(cum4 < outside1-outside4; run; finitepopulation_&f. value; poptotallower or cum4 > poptotalupper,1,0); relbias_fp=abs(cum1 - run; fpmean)/fpmean*100; File Name: sumbias_fp+relbias_fp; proc summary Section_Strat_case2nextbit.sas *avgrelbias_fp=sumbias_fp/_N_; data=strat_SP_runningtotal1; 299
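The four listings above repeat one pipeline per design: accumulate a weighted running mean of the survey estimates, carry the final pooled mean back over the whole sequence, form the Cochran-style contributions Q = weight*(mean - pooled)**2, and convert the accumulated Q into a chi-square p-value on df = k-1 with a rejection flag at the 0.05 level. The following is a minimal editorial sketch, not part of the thesis programs; the dataset toy and the variables est and w are invented for illustration. It shows the core computation in isolation:

    * Editorial sketch: Cochran Q test for homogeneity of k toy survey estimates;
    data toy;
      input est w;
      datalines;
    10.1 50
    9.7 65
    10.4 40
    9.9 55
    ;

    proc sql noprint;  * pooled weighted mean across the k estimates;
      select sum(w*est)/sum(w) into :pooled from toy;
    quit;

    data qtest;
      set toy end=last;
      Q + w*((est - &pooled.)**2);  * running Q contribution;
      k + 1;
      if last then do;
        df = k - 1;
        p = 1 - cdf('CHISQUARE', Q, df);  * upper-tail chi-square p-value;
        reject = (p <= 0.05);
        output;
      end;
      keep Q df p reject;
    run;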

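Note also how the scaled statistic QS differs across the designs: in the SRS listing each Q contribution is divided by the finite-population correction, QS = Q/(1-&littlen./&bigN.), while the cluster and stratified listings divide by the largest design effect observed in the series, QS = Q/max(deff). A minimal editorial sketch of that design-effect scaling, assuming a hypothetical dataset studies carrying Q1-Q4 and deff (the names mirror the listings above):

    * Editorial sketch: rescale the Q contributions by the maximum design effect;
    proc univariate data=studies noprint;
      var deff;
      output out=deffmax max=maxdeff;
    run;

    data studies_scaled;
      if _n_=1 then set deffmax;   * maxdeff is retained onto every row;
      set studies;
      array Q{4}  Q1-Q4;
      array QS{4} QS1-QS4;
      do j=1 to 4;
        QS{j} = Q{j}/maxdeff;      * design-effect analogue of the fpc scaling;
      end;
      drop j;
    run;

The listings accomplish the same join with a constant merge key (h=1); the one-row SET shown here is an equivalent idiom.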

File Name: strata_not_designbased.sas

proc means data = samples_strata;
  by replicate;
  var y;
  output out=strat_notdesignbased mean=ndmean var=ndvar;
run;

* when we are not design based the deff is not used;
data strat_notdesignbased;
  set strat_notdesignbased;
  weight1=1/&LITTLEN.;
  weight2=1/ndvar;
  *weight3=deff*(1/var);
  weight4=1;
  mergevar='1';
run;

* Generate an actual total;
proc summary data=finitepopulation_1 print;
  var y;
  output out=fptotal(drop=_TYPE_ _FREQ_) sum=fptotal mean=fpmean var=vmean;
run;

data fptotal;
  set fptotal;
  mergevar='1';
run;

data strat_cumulativenotdesgin(drop=mergevar);
  merge strat_notdesignbased fptotal;
  by mergevar;
  retain sumweight1 0 sumtotal1 0 sumweight2 0 sumtotal2 0
         sumweight4 0 sumtotal4 0;
  sumweight1+weight1;  sumtotal1+weight1*ndmean;
  sumweight2+weight2;  sumtotal2+weight2*ndmean;
  * sumweight3+weight3;  * sumtotal3+weight3*mean;
  sumweight4+weight4;  sumtotal4+weight4*ndmean;

  x=_N_;  * for plotting on ye olde graphe;
  cum1=sumtotal1/sumweight1;
  cum2=sumtotal2/sumweight2;
  * cum3=sumtotal3/sumweight3;
  cum4=sumtotal4/sumweight4;

  * superpopulation;
  popmean=&SPMEAN.;
  poptotalupper=popmean*(1+&ERRORBAR);
  poptotallower=popmean*(1-&ERRORBAR);

  * relative bias - abs(estimate-true)/true;
  * true defined as finitepopulation_&f. value;
  relbias_fp=abs(cum1-fpmean)/fpmean*100;
  sumbias_fp+relbias_fp;
  *avgrelbias_fp=sumbias_fp/_N_;

  relbias_sp=abs(cum1-&SPMEAN.)/&SPMEAN.*100;
  sumbias_sp+relbias_sp;
  *avgrelbias_sp=sumbias_sp/_N_;

  df = x-1;
run;

data strat_notdesign_graph;
  length x 8.;
  set strat_cumulativenotdesgin;
  keep x cum1-cum4 fpmean popmean;
run;

proc sort data = strat_notdesign_graph;
  by x;
run;

PROC EXPORT DATA=WORK.strat_notdesign_graph
  OUTFILE="C:\Documents and Settings\foxkarl\Desktop\SAS PROGRAMS REDONE\Current Version\Datasets\notdesign_&littlen._4_&r..xls"
  DBMS=EXCEL REPLACE;
  SHEET="d";
RUN;
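strata_not_designbased.sas drops the design-based machinery: the study-level estimates come straight from PROC MEANS by replicate, the design-effect weight (weight3) is commented out, and three schemes remain: weight1 = 1/n, weight2 = 1/variance, and weight4 = 1. A minimal editorial sketch of the pooled means these schemes imply, assuming a hypothetical dataset estimates with variables mean, var and n (illustrative names, not from the thesis):

    * Editorial sketch: pooled means under the three non-design weighting schemes;
    data wtd;
      set estimates;
      w1 = 1/n;    * equal information per sampled unit;
      w2 = 1/var;  * inverse-variance (classical fixed-effect) weighting;
      w4 = 1;      * unweighted average of the survey estimates;
    run;

    proc sql;
      select sum(w1*mean)/sum(w1) as pooled_w1,
             sum(w2*mean)/sum(w2) as pooled_w2,
             sum(w4*mean)/sum(w4) as pooled_w4
      from wtd;
    quit;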


Appendix B Data Analysis

NLSYW

File Name: step1readindata2.sas

options nocenter validvarname=any;

*---Read in space-delimited ascii file;

data new_data;

  infile 'C:\Documents and Settings\foxkarl\Desktop\NLSYW\Karla July final un\With gender of child\Final July\karlajulyfinal.dat'
    lrecl=340 missover DSD DLM=' ' print;
  input
    R0000100 R0000102 R0003100 R0003200 R0003300 R0037000 R0072000 R0086200
    R0113300 R0134600 R0146100 R0193100 R0212600 R0253300 R0308700 R0325540
    R0326200 R0334400 R0392500 R0409900 R0417200 R0475100 R0494262 R0494266
    R0494270 R0494274 R0494278 R0494282 R0494286 R0494290 R0494300 R0519600
    R0521100 R0533300 R0550700 R0581000 R0583400 R0587500 R0666390 R0676400
    R0680300 R0681100 R0681900 R0682700 R0683500 R0710000 R0749400 R0749900
    R0756500 R0796500 R0797100 R0902000 R0913600 R0914500 R0915400 R0916300
    R0929500 R0940700 R0947400 R1043400 R1043700 R1044000 R1044300 R1050000
    R1063000 R1089400 R1089700 R1090000 R1090300 R1095800 R1097400 R1196200
    R1206400 R1206700 R1215100 R1221000 R1328300 R1339300 R1339600 R1339900
    R1346400 R1352700 R1438800 R1439200 R1439600 R1506300 R1520400 R1569900
    R3495000 R5163700 R5163800 R5163900 R5164000 R5164100 R5164200 R5164300
    R5164400 R5164500 R5164600 R5164700 R5164800 R5164900 R5165000;

  array nvarlist _numeric_;

  *---Recode missing values to SAS custom system missing. See SAS
      documentation for use of MISSING option in procedures, e.g. PROC FREQ;
  do over nvarlist;
    if nvarlist = -1 then nvarlist = .R; /* Refused */
    if nvarlist = -2 then nvarlist = .D; /* Dont know */
    if nvarlist = -3 then nvarlist = .I; /* Invalid missing */
    if nvarlist = -4 then nvarlist = .V; /* Valid missing */
    if nvarlist = -5 then nvarlist = .N; /* Non-interview */
  end;

  label R0000100 = "ID_CODE, 68";
  label R0000102 = "RANDOMIZED ID_CODE, 68";
  label R0003100 = "AGE,68";
  label R0003200 = "RACE, 68";
  label R0003300 = "MAR_STAT,68";
  label R0037000 = "TOT FAM INC L YR 68 R,H NOT ALON";
  label R0072000 = "HGC,68";
  label R0086200 = "MAR_STAT,69";
  label R0113300 = "TOT FAM INC L YR 69 R,H NOT ALON";
  label R0134600 = "HGC,69";
  label R0146100 = "MAR_STAT,70";
  label R0193100 = "TOT FAM INC L YR 70 R,H NOT ALON";
  label R0212600 = "HGC,70";
  label R0253300 = "MAR_STAT,71";
  label R0308700 = "TOT FAM INC L YR 71 R,H NOT ALON";
  label R0325540 = "HGC,71 REV";
  label R0326200 = "HGC,71";
  label R0334400 = "MAR_STAT,72";
  label R0392500 = "TOT FAM INC L YR 72 R,H NOT ALON";
  label R0409900 = "HGC, 72 REV";
  label R0417200 = "MAR_STAT,73";
  label R0475100 = "TOT FAM INC L YR 73 R,H NOT ALON";
  label R0494262 = "YR BRTH 1ST CHLD BORN BY 73 INT_DT";
  label R0494266 = "YR BRTH 2ND CHLD BORN BY 73 INT_DT";
  label R0494270 = "YR BRTH 3RD CHLD BORN BY 73 INT_DT";
  label R0494274 = "YR BRTH 4TH CHLD BORN BY 73 INT_DT";
  label R0494278 = "YR BRTH 5TH CHLD BORN BY 73 INT_DT";
  label R0494282 = "YR BRTH 6TH CHLD BORN BY 73 INT_DT";
  label R0494286 = "YR BRTH 7TH CHLD BORN BY 73 INT_DT";
  label R0494290 = "YR BRTH 8TH CHLD BORN BY 73 INT_DT";
  label R0494300 = "HGC,73 REV";
  label R0519600 = "MAR_STAT,75";
  label R0521100 = "HGC,75";
  label R0533300 = "TOT FAM INC L YR 75 R,H NOT ALON";
  label R0550700 = "HGC,77";
  label R0581000 = "MAR_STAT,77";
  label R0583400 = "TOT FAM INC L YR 77 R,H NOT ALON";
  label R0587500 = "MAR_STAT,78";
  label R0666390 = "HGC,78 REV";
  label R0676400 = "TOT FAM INC L YR 78-XCLD R,H NOT ALON";
  label R0680300 = "CHLN R SNC 73,78-#1-YR OF BRTH";
  label R0681100 = "CHLN R SNC 73,78-#2-YR OF BRTH";
  label R0681900 = "CHLN R SNC 73,78-#3-YR OF BRTH";
  label R0682700 = "CHLN R SNC 73,78-#4-YR OF BRTH";
  label R0683500 = "CHLN R SNC 73,78-#5-YR OF BRTH";
  label R0710000 = "MAR_STAT,80";
  label R0749400 = "TOT FAM INC L YR 80-XCLD R,H NOT ALON";
  label R0749900 = "HGC,80";
  label R0756500 = "MAR_STAT,82";
  label R0796500 = "TOT FAM INC L YR 82";
  label R0797100 = "HGC,82";
  label R0902000 = "TOT FAM INC L YR 83";
  label R0913600 = "CHLN R SNC 78,83-#1-YR OF BRTH";
  label R0914500 = "CHLN R SNC 78,83-#2-YR OF BRTH";
  label R0915400 = "CHLN R SNC 78,83-#3-YR OF BRTH";
  label R0916300 = "CHLN R SNC 78,83-#4-YR OF BRTH";
  label R0929500 = "HGC,83";
  label R0940700 = "MAR_STAT,83";
  label R0947400 = "MAR_STAT,85";
  label R1043400 = "CHLN R SNC L_INT 85-#1-YR OF BRTH";
  label R1043700 = "CHLN R SNC L_INT 85-#2-YR OF BRTH";
  label R1044000 = "CHLN R SNC L_INT 85-#3-YR OF BRTH";
  label R1044300 = "CHLN R SNC L_INT 85-#4-YR OF BRTH";
  label R1050000 = "TOT FAM INC L YR 85";
  label R1063000 = "MAR_STAT,87";
  label R1089400 = "CHLN R SNC L_INT 87-#1-YR OF BRTH";
  label R1089700 = "CHLN R SNC L_INT 87-#2-YR OF BRTH";
  label R1090000 = "CHLN R SNC L_INT 87-#3-YR OF BRTH";
  label R1090300 = "CHLN R SNC L_INT 87-#4-YR OF BRTH";
  label R1095800 = "TOT FAM INC L YR 87";
  label R1097400 = "HGC,87";
  label R1196200 = "TOT FAM INC L YR 88";
  label R1206400 = "CHLN R SNC L_INT 88-#1-YR OF BRTH";
  label R1206700 = "CHLN R SNC L_INT 88-#2-YR OF BRTH";
  label R1215100 = "HGC,88";
  label R1221000 = "MAR_STAT,88";
  label R1328300 = "TOT FAM INC L YR 91";
  label R1339300 = "CHLN R SNC L_INT 91-#1-YR OF BRTH";
  label R1339600 = "CHLN R SNC L_INT 91-#2-YR OF BRTH";
  label R1339900 = "CHLN R SNC L_INT 91-#3-YR OF BRTH";
  label R1346400 = "HGC,91";
  label R1352700 = "MAR_STAT,91";
  label R1438800 = "CHLN R SNC L_INT 93-#1-YR OF BRTH";
  label R1439200 = "CHLN R SNC L_INT 93-#2-YR OF BRTH";
  label R1439600 = "CHLN R SNC L_INT 93-#3-YR OF BRTH";
  label R1506300 = "TOT FAM INC L YR 93";
  label R1520400 = "HGC,93";
  label R1569900 = "MAR_STAT,93";
  label R3495000 = "ID_CODE, 97";
  label R5163700 = "GENDER OF CHILD, 99 C#01";
  label R5163800 = "GENDER OF CHILD, 99 C#02";
  label R5163900 = "GENDER OF CHILD, 99 C#03";
  label R5164000 = "GENDER OF CHILD, 99 C#04";
  label R5164100 = "GENDER OF CHILD, 99 C#05";
  label R5164200 = "GENDER OF CHILD, 99 C#06";
  label R5164300 = "GENDER OF CHILD, 99 C#07";
  label R5164400 = "GENDER OF CHILD, 99 C#08";
  label R5164500 = "GENDER OF CHILD, 99 C#09";
  label R5164600 = "GENDER OF CHILD, 99 C#10";
  label R5164700 = "GENDER OF CHILD, 99 C#11";
  label R5164800 = "GENDER OF CHILD, 99 C#12";
  label R5164900 = "GENDER OF CHILD, 99 C#13";
  label R5165000 = "GENDER OF CHILD, 99 C#14";

/*---------------------------------------------------------------*
 * Crosswalk for Reference number & Question name                *
 *---------------------------------------------------------------*
 * Uncomment and edit this RENAME statement to rename variables
 * for ease of use.  You may need to use name literal strings
 * e.g. 'variable-name'n to create valid SAS variable names, or
 * alter variables similarly named across years.
 * This command does not guarantee uniqueness.
 * See SAS documentation for use of name literals and use of the
 * VALIDVARNAME=ANY option.
 *---------------------------------------------------------------*/
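/* Editorial note, not part of the original program: the DO OVER loop
   above maps the NLS sentinel codes -1 to -5 onto the SAS special
   missing values .R, .D, .I, .V and .N.  Special missings behave like
   ordinary missing values in sums, means and WHERE conditions, but they
   remain distinguishable from one another, so, for example,

     proc freq data=new_data;
       tables R0003300 / missing;
     run;

   lists refusals (.R), "don't know" (.D), invalid (.I), valid (.V) and
   non-interview (.N) missings as separate categories instead of one
   pooled missing class. */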

/* *start* */
proc format;
value vx0f 1-5159='1 TO 5159: RESPONDENT''S RECORD NUMBER';
value vx1f 1-5159='1 TO 5159: ACTUAL (RANDOM) NUMBER';
value vx2f -99999-13='-99999 TO 13: < 14' 14='14: = 14' 15='15' 16='16' 17='17' 18='18'
  19='19' 20='20' 21='21' 22='22' 23='23' 24='24' 25='25' 26='26' 27='27' 28='28' 29='29'
  30-99999='30 TO 99999: 30+';
value vx3f 1='WHITE' 2='BLACK' 3='OTHER';
value vx4f 1='MARRIED SPOUSE PRESENT' 2='MARRIED SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx5f 1='UNDER $1,000' 2='$1,000 - $1,999' 3='$2,000 - $2,999' 4='$3,000 - $3,999'
  5='$4,000 - $4,999' 6='$5,000 - $5,999' 7='$6,000 - $7,499' 8='$7,500 - $9,999'
  9='$10,000 - $14,999' 10='$15,000 - $24,999' 11='$25,000 OR MORE';
value vx6f -99999-2='-99999 TO 2: < 3' 3='3: = 3' 4='4' 5='5' 6='6' 7='7' 8='8' 9='9'
  10='10' 11='11' 12='12' 13='13' 14='14' 15='15' 16='16' 17='17' 18='18'
  19-99999='19 TO 99999: 19+';
value vx7f 1='MARRIED SPOUSE PRESENT' 2='MARRIED SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx8f 1='UNDER $1,000' 2='$1,000 - $1,999' 3='$2,000 - $2,999' 4='$3,000 - $3,999'
  5='$4,000 - $4,999' 6='$5,000 - $5,999' 7='$6,000 - $7,499' 8='$7,500 - $9,999'
  9='$10,000 - $14,999' 10='$15,000 - $24,999' 11='$25,000 AND OVER';
value vx9f 0='NONE' 1='FIRST' 2='SECOND' 3='THIRD' 4='FOURTH' 5='FIFTH' 6='SIXTH'
  7='SEVENTH' 8='EIGHTH' 9='HIGH SCHOOL 1' 10='HIGH SCHOOL 2' 11='HIGH SCHOOL 3'
  12='HIGH SCHOOL 4' 13='COLLEGE 1' 14='COLLEGE 2' 15='COLLEGE 3' 16='COLLEGE 4'
  17='COLLEGE 5' 18='COLLEGE 6 OR MORE';
value vx10f 1='MARRIED SPOUSE PRESENT' 2='MARRIED SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx11f 1='UNDER $1,000' 2='$1,000 - $1,999' 3='$2,000 - $2,999' 4='$3,000 - $3,999'
  5='$4,000 - $4,999' 6='$5,000 - $5,999' 7='$6,000 - $7,499' 8='$7,500 - $9,999'
  9='$10,000 - $14,999' 10='$15,000 - $24,999' 11='$25,000 AND OVER';
value vx12f 0='NONE' 1='FIRST GRADE' 2='SECOND GRADE' 3='THIRD GRADE' 4='FOURTH GRADE'
  5='FIFTH GRADE' 6='SIXTH GRADE' 7='SEVENTH GRADE' 8='EIGHTH GRADE' 9='HIGH SCHOOL 1'
  10='HIGH SCHOOL 2' 11='HIGH SCHOOL 3' 12='HIGH SCHOOL 4' 13='COLLEGE 1' 14='COLLEGE 2'
  15='COLLEGE 3' 16='COLLEGE 4' 17='COLLEGE 5' 18='COLLEGE 6 OR MORE';
value vx13f 1='MARRIED, SPOUSE PRESENT' 2='MARRIED, SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx14f 1='UNDER 1,000' 2='1,000 - 1,999' 3='2,000 - 2,999' 4='3,000 - 3,999'
  5='4,000 - 4,999' 6='5,000 - 5,999' 7='6,000 - 7,499' 8='7,500 - 9,999'
  9='10,000 - 14,999' 10='15,000 - 24,999'
  11='25,000 - AND OVER NOTE: ALL AMOUNTS IN DOLLARS.';
value vx15f 0='NONE OR NEVER ATTENDED' 1='FIRST GRADE' 2='SECOND GRADE' 3='THIRD GRADE'
  4='FOURTH GRADE' 5='FIFTH GRADE' 6='SIXTH GRADE' 7='SEVENTH GRADE' 8='EIGHTH GRADE'
  9='NINTH GRADE' 10='TENTH GRADE' 11='ELEVENTH GRADE' 12='TWELFTH GRADE'
  13='FIRST YEAR OF COLLEGE' 14='SECOND YEAR OF COLLEGE' 15='THIRD YEAR OF COLLEGE'
  16='FOURTH YEAR OF COLLEGE' 17='FIFTH YEAR OF COLLEGE' 18='SIXTH+ YEAR OF COLLEGE';
value vx16f 0='NONE' 1='ELEMENTARY 1' 2='ELEMENTARY 2' 3='ELEMENTARY 3' 4='ELEMENTARY 4'
  5='ELEMENTARY 5' 6='ELEMENTARY 6' 7='ELEMENTARY 7' 8='ELEMENTARY 8' 9='HIGH SCHOOL 1'
  10='HIGH SCHOOL 2' 11='HIGH SCHOOL 3' 12='HIGH SCHOOL 4' 13='COLLEGE 1' 14='COLLEGE 2'
  15='COLLEGE 3' 16='COLLEGE 4' 17='COLLEGE 5' 18='COLLEGE 6 OR MORE';
value vx17f 1='MARRIED, SPOUSE PRESENT' 2='MARRIED, SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx18f 1='UNDER $1,000' 2='$1,000 - $1,999' 3='$2,000 - $2,999' 4='$3,000 - $3,999'
  5='$4,000 - $4,999' 6='$5,000 - $5,999' 7='$6,000 - $7,499' 8='$7,500 - $9,999'
  9='$10,000 - $14,999' 10='$15,000 - $24,999' 11='$25,000 - AND OVER';
value vx19f 0='NONE OR NEVER ATTENDED' 1='FIRST GRADE' 2='SECOND GRADE' 3='THIRD GRADE'
  4='FOURTH GRADE' 5='FIFTH GRADE' 6='SIXTH GRADE' 7='SEVENTH GRADE' 8='EIGHTH GRADE'
  9='NINTH GRADE' 10='TENTH GRADE' 11='ELEVENTH GRADE' 12='TWELFTH GRADE'
  13='FIRST YEAR OF COLLEGE' 14='SECOND YEAR OF COLLEGE' 15='THIRD YEAR OF COLLEGE'
  16='FOURTH YEAR OF COLLEGE' 17='FIFTH YEAR OF COLLEGE' 18='SIXTH+ YEAR OF COLLEGE';
value vx20f 1='MARRIED, SPOUSE PRESENT' 2='MARRIED, SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx21f 1='UNDER $1,000' 2='$1,000 - $1,999' 3='$2,000 - $2,999' 4='$3,000 - $3,999'
  5='$4,000 - $4,999' 6='$5,000 - $5,999' 7='$6,000 - $7,499' 8='$7,500 - $9,999'
  9='$10,000 - $14,999' 10='$15,000 - $24,999' 11='$25,000 - AND OVER';
value vx22f -99999-57='-99999 TO 57: < 58' 58='58: = 58' 59='59' 60='60' 61='61' 62='62'
  63='63' 64='64' 65='65' 66='66' 67='67' 68='68' 69='69' 70='70' 71='71' 72='72' 73='73'
  74-99999='74 TO 99999: 74+';
value vx23f -99999-57='-99999 TO 57: < 58' 58='58: = 58' 59='59' 60='60' 61='61' 62='62'
  63='63' 64='64' 65='65' 66='66' 67='67' 68='68' 69='69' 70='70' 71='71' 72='72' 73='73'
  74-99999='74 TO 99999: 74+';
value vx24f -99999-57='-99999 TO 57: < 58' 58='58: = 58' 59='59' 60='60' 61='61' 62='62'
  63='63' 64='64' 65='65' 66='66' 67='67' 68='68' 69='69' 70='70' 71='71' 72='72' 73='73'
  74-99999='74 TO 99999: 74+';
value vx25f -99999-57='-99999 TO 57: < 58' 58='58: = 58' 59='59' 60='60' 61='61' 62='62'
  63='63' 64='64' 65='65' 66='66' 67='67' 68='68' 69='69' 70='70' 71='71' 72='72' 73='73'
  74-99999='74 TO 99999: 74+';
value vx26f -99999-57='-99999 TO 57: < 58' 58='58: = 58' 59='59' 60='60' 61='61' 62='62'
  63='63' 64='64' 65='65' 66='66' 67='67' 68='68' 69='69' 70='70' 71='71' 72='72' 73='73'
  74-99999='74 TO 99999: 74+';
value vx27f -99999-57='-99999 TO 57: < 58' 58='58: = 58' 59='59' 60='60' 61='61' 62='62'
  63='63' 64='64' 65='65' 66='66' 67='67' 68='68' 69='69' 70='70' 71='71' 72='72' 73='73'
  74-99999='74 TO 99999: 74+';
value vx28f -99999-57='-99999 TO 57: < 58' 58='58: = 58' 59='59' 60='60' 61='61' 62='62'
  63='63' 64='64' 65='65' 66='66' 67='67' 68='68' 69='69' 70='70' 71='71' 72='72' 73='73'
  74-99999='74 TO 99999: 74+';
value vx29f -99999-55='-99999 TO 55: < 56' 56='56: = 56' 57='57' 58='58' 59='59' 60='60'
  61='61' 62='62' 63='63' 64='64' 65='65' 66='66' 67='67' 68='68' 69='69' 70='70' 71='71'
  72-99999='72 TO 99999: 72+';
value vx30f 0='NONE OR NEVER ATTENDED' 1='FIRST GRADE' 2='SECOND GRADE' 3='THIRD GRADE'
  4='FOURTH GRADE' 5='FIFTH GRADE' 6='SIXTH GRADE' 7='SEVENTH GRADE' 8='EIGHTH GRADE'
  9='NINTH GRADE' 10='TENTH GRADE' 11='ELEVENTH GRADE' 12='TWELFTH GRADE'
  13='FIRST YEAR OF COLLEGE' 14='SECOND YEAR OF COLLEGE' 15='THIRD YEAR OF COLLEGE'
  16='FOURTH YEAR OF COLLEGE' 17='FIFTH YEAR OF COLLEGE' 18='SIXTH+ YEAR OF COLLEGE';
value vx31f 1='MARRIED, SPOUSE PRESENT' 2='MARRIED, SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx32f 0='NONE' 1='FIRST GRADE' 2='SECOND GRADE' 3='THIRD GRADE' 4='FOURTH GRADE'
  5='FIFTH GRADE' 6='SIXTH GRADE' 7='SEVENTH GRADE' 8='EIGHTH GRADE' 9='NINTH GRADE'
  10='TENTH GRADE' 11='ELEVENTH GRADE' 12='TWELFTH GRADE' 13='FIRST YEAR COLLEGE'
  14='SECOND YEAR COLLEGE' 15='THIRD YEAR COLLEGE' 16='FOURTH YEAR COLLEGE'
  17='FIFTH YEAR COLLEGE' 18='SIXTH+ YEARS COLLEGE' 96='ELEMENTARY, YEAR UNSPECIFIED'
  97='HIGH SCHOOL, YEAR UNSPECIFIED' 98='COLLEGE, YEAR UNSPECIFIED' 93='KINDERGARTEN'
  94='PRESCHOOL' 95='NONACADEMIC DEGREE OR DIPLOMA EDUCATIONAL PROGRAM' 99='GED';
value vx33f -999999--6='-999999 TO -6: <0' 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999'
  2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999'
  5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999'
  8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999'
  15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999'
  50000-9999999999='50000 TO 9999999999: >50000';
value vx34f 0='NONE' 1='FIRST GRADE' 2='SECOND GRADE' 3='THIRD GRADE' 4='FOURTH GRADE'
  5='FIFTH GRADE' 6='SIXTH GRADE' 7='SEVENTH GRADE' 8='EIGHTH GRADE' 9='NINTH GRADE'
  10='TENTH GRADE' 11='ELEVENTH GRADE' 12='TWELFTH GRADE' 13='FIRST YEAR COLLEGE'
  14='SECOND YEAR COLLEGE' 15='THIRD YEAR COLLEGE' 16='FOURTH YEAR COLLEGE'
  17='FIFTH YEAR COLLEGE' 18='SIXTH+ YEARS COLLEGE' 96='ELEMENTARY, YEAR UNSPECIFIED'
  97='HIGH SCHOOL, YEAR UNSPECIFIED' 98='COLLEGE, YEAR UNSPECIFIED' 93='KINDERGARTEN'
  94='PRESCHOOL' 95='NONACADEMIC DEGREE OR DIPLOMA EDUCATIONAL PROGRAM' 99='GED';
value vx35f 1='MARRIED, SPOUSE PRESENT' 2='MARRIED, SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED' 7='SPOUSE ABSENT, REASON UNKNOWN';
value vx36f -99999--6='-99999 TO -6: <0' 0='0' 1-99='1 TO 99' 100-199='100 TO 199'
  200-299='200 TO 299' 300-399='300 TO 399' 400-499='400 TO 499' 500-599='500 TO 599'
  600-699='600 TO 699' 700-799='700 TO 799' 800-899='800 TO 899' 900-999='900 TO 999'
  1000-99999999='1000 TO 99999999: >1000';
value vx37f 1='MARRIED, SPOUSE PRESENT' 2='MARRIED, SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx38f 0='NONE' 1='FIRST GRADE' 2='SECOND GRADE' 3='THIRD GRADE' 4='FOURTH GRADE'
  5='FIFTH GRADE' 6='SIXTH GRADE' 7='SEVENTH GRADE' 8='EIGHTH GRADE' 9='NINTH GRADE'
  10='TENTH GRADE' 11='ELEVENTH GRADE' 12='TWELFTH GRADE' 13='FIRST YEAR COLLEGE'
  14='SECOND YEAR COLLEGE' 15='THIRD YEAR COLLEGE' 16='FOURTH YEAR COLLEGE'
  17='FIFTH YEAR COLLEGE' 18='SIXTH+ YEARS COLLEGE' 96='ELEMENTARY, YEAR UNSPECIFIED'
  97='HIGH SCHOOL, YEAR UNSPECIFIED' 98='COLLEGE, YEAR UNSPECIFIED' 93='KINDERGARTEN'
  94='PRESCHOOL' 95='NONACADEMIC DEGREE OR DIPLOMA EDUCATIONAL PROGRAM' 99='GED';
value vx39f 0='NOTHING, LOSS, OR BROKE EVEN' 1='UNDER $1000' 2='$1000-1999' 3='$2000-2999'
  4='$3000-3999' 5='$4000-4999' 6='$5000-5999' 7='$6000-7499' 8='$7500-9999'
  9='$10000-14999' 10='$15000-24999' 11='$25000 AND OVER' 12='DON''T KNOW';
value vx40f -99999-72='-99999 TO 72: < 73' 73='73: = 73' 74='74' 75='75' 76='76' 77='77'
  78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86='86' 87='87' 88='88'
  89-99999='89 TO 99999: 89+';
value vx41f -99999-72='-99999 TO 72: < 73' 73='73: = 73' 74='74' 75='75' 76='76' 77='77'
  78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86='86' 87='87' 88='88'
  89-99999='89 TO 99999: 89+';
value vx42f -99999-72='-99999 TO 72: < 73' 73='73: = 73' 74='74' 75='75' 76='76' 77='77'
  78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86='86' 87='87' 88='88'
  89-99999='89 TO 99999: 89+';
value vx43f -99999-72='-99999 TO 72: < 73' 73='73: = 73' 74='74' 75='75' 76='76' 77='77'
  78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86='86' 87='87' 88='88'
  89-99999='89 TO 99999: 89+';
value vx44f -99999-72='-99999 TO 72: < 73' 73='73: = 73' 74='74' 75='75' 76='76' 77='77'
  78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86='86' 87='87' 88='88'
  89-99999='89 TO 99999: 89+';
value vx45f 1='MARRIED' 2='MARRIED, SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED' 5='SEPARATED'
  6='NEVER MARRIED';
value vx46f -999999--6='-999999 TO -6: <0' 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999'
  2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999'
  5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999'
  8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999'
  15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999'
  50000-9999999999='50000 TO 9999999999: >50000';
value vx47f 0='NONE' 1='FIRST GRADE' 2='SECOND GRADE' 3='THIRD GRADE' 4='FOURTH GRADE'
  5='FIFTH GRADE' 6='SIXTH GRADE' 7='SEVENTH GRADE' 8='EIGHTH GRADE' 9='NINTH GRADE'
  10='TENTH GRADE' 11='ELEVENTH GRADE' 12='TWELFTH GRADE' 13='FIRST YEAR COLLEGE'
  14='SECOND YEAR COLLEGE' 15='THIRD YEAR COLLEGE' 16='FOURTH YEAR COLLEGE'
  17='FIFTH YEAR COLLEGE' 18='SIXTH+ YEARS COLLEGE' 96='ELEMENTARY, YEAR UNSPECIFIED'
  97='HIGH SCHOOL, YEAR UNSPECIFIED' 98='COLLEGE, YEAR UNSPECIFIED' 93='KINDERGARTEN'
  94='PRESCHOOL' 95='NONACADEMIC DEGREE OR DIPLOMA EDUCATIONAL PROGRAM' 99='GED';
value vx48f 1='MARRIED, SPOUSE PRESENT' 2='MARRIED, SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx49f -999999--6='-999999 TO -6: <0' 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999'
  2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999'
  5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999'
  8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999'
  15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999'
  50000-9999999999='50000 TO 9999999999: >50000';
value vx50f 0='NONE' 1='FIRST GRADE' 2='SECOND GRADE' 3='THIRD GRADE' 4='FOURTH GRADE'
  5='FIFTH GRADE' 6='SIXTH GRADE' 7='SEVENTH GRADE' 8='EIGHTH GRADE' 9='NINTH GRADE'
  10='TENTH GRADE' 11='ELEVENTH GRADE' 12='TWELFTH GRADE' 13='FIRST YEAR COLLEGE'
  14='SECOND YEAR COLLEGE' 15='THIRD YEAR COLLEGE' 16='FOURTH YEAR COLLEGE'
  17='FIFTH YEAR COLLEGE' 18='SIXTH+ YEARS COLLEGE' 96='ELEMENTARY, YEAR UNSPECIFIED'
  97='HIGH SCHOOL, YEAR UNSPECIFIED' 98='COLLEGE, YEAR UNSPECIFIED' 93='KINDERGARTEN'
  94='PRESCHOOL' 95='NONACADEMIC DEGREE OR DIPLOMA EDUCATIONAL PROGRAM' 99='GED';
value vx51f 1='LESS THAN 3,999' 2='4,000 - 5,999' 3='6,000 - 7,499' 4='7,500 - 9,999'
  5='10,000 - 14,999' 6='15,000 - 17,499' 7='17,500 - 19,999' 8='20,000 - 24,999'
  9='25,000 - 34,999' 10='35,000 - 49,999' 11='50,000 AND OVER' 0='NOTHING';
value vx52f -99999-67='-99999 TO 67: < 68' 68='68: = 68' 69='69' 70='70' 71='71' 72='72'
  73='73' 74='74' 75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83'
  84-99999='84 TO 99999: 84+';
value vx53f -99999-67='-99999 TO 67: < 68' 68='68: = 68' 69='69' 70='70' 71='71' 72='72'
  73='73' 74='74' 75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83'
  84-99999='84 TO 99999: 84+';
value vx54f -99999-67='-99999 TO 67: < 68' 68='68: = 68' 69='69' 70='70' 71='71' 72='72'
  73='73' 74='74' 75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83'
  84-99999='84 TO 99999: 84+';
value vx55f -99999-67='-99999 TO 67: < 68' 68='68: = 68' 69='69' 70='70' 71='71' 72='72'
  73='73' 74='74' 75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83'
  84-99999='84 TO 99999: 84+';
value vx56f 9='NINTH GRADE' 10='TENTH GRADE' 11='ELEVENTH GRADE' 12='TWELFTH GRADE'
  13='FIRST YEAR COLLEGE' 14='SECOND YEAR COLLEGE' 15='THIRD YEAR COLLEGE'
  16='FOURTH YEAR COLLEGE' 17='FIFTH YEAR COLLEGE' 18='SIXTH YEAR COLLEGE';
value vx57f 1='MARRIED, SPOUSE PRESENT' 2='MARRIED, SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx58f 1='MARRIED, SPOUSE PRESENT' 2='MARRIED, SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx59f -99999-69='-99999 TO 69: < 70' 70='70: = 70' 71='71' 72='72' 73='73' 74='74'
  75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85'
  86-99999='86 TO 99999: 86+';
value vx60f -99999-69='-99999 TO 69: < 70' 70='70: = 70' 71='71' 72='72' 73='73' 74='74'
  75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85'
  86-99999='86 TO 99999: 86+';
value vx61f -99999-69='-99999 TO 69: < 70' 70='70: = 70' 71='71' 72='72' 73='73' 74='74'
  75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85'
  86-99999='86 TO 99999: 86+';
value vx62f -99999-69='-99999 TO 69: < 70' 70='70: = 70' 71='71' 72='72' 73='73' 74='74'
  75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85'
  86-99999='86 TO 99999: 86+';
value vx63f -999999--6='-999999 TO -6: <0' 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999'
  2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999'
  5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999'
  8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999'
  15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999'
  50000-9999999999='50000 TO 9999999999: >50000';
value vx64f 1='MARRIED, SPOUSE PRESENT' 2='MARRIED, SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx65f -99999-72='-99999 TO 72: < 73' 73='73: = 73' 74='74' 75='75' 76='76' 77='77'
  78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86='86' 87='87' 88='88'
  89-99999='89 TO 99999: 89+';
value vx66f -99999-72='-99999 TO 72: < 73' 73='73: = 73' 74='74' 75='75' 76='76' 77='77'
  78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86='86' 87='87' 88='88'
  89-99999='89 TO 99999: 89+';
value vx67f -99999-72='-99999 TO 72: < 73' 73='73: = 73' 74='74' 75='75' 76='76' 77='77'
  78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86='86' 87='87' 88='88'
  89-99999='89 TO 99999: 89+';
value vx68f -99999-72='-99999 TO 72: < 73' 73='73: = 73' 74='74' 75='75' 76='76' 77='77'
  78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86='86' 87='87' 88='88'
  89-99999='89 TO 99999: 89+';
value vx69f -999999--6='-999999 TO -6: <0' 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999'
  2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999'
  5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999'
  8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999'
  15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999'
  50000-9999999999='50000 TO 9999999999: >50000';
value vx70f 9='NINTH GRADE' 10='TENTH GRADE' 11='ELEVENTH GRADE' 12='TWELFTH GRADE'
  13='FIRST YEAR COLLEGE' 14='SECOND YEAR COLLEGE' 15='THIRD YEAR COLLEGE'
  16='FOURTH YEAR COLLEGE' 17='FIFTH YEAR COLLEGE' 18='SIXTH+ YEAR COLLEGE';
value vx71f 1='LESS THAN 3,999' 2='4,000-5,999' 3='6,000-7,499' 4='7,500-9,999'
  5='10,000-14,999' 6='15,000-17,499' 7='17,500-19,999' 8='20,000-24,999'
  9='25,000-34,999' 10='35,000-49,999' 11='50,000 AND OVER' 0='NOTHING';
value vx72f -99999-73='-99999 TO 73: < 74' 74='74: = 74' 75='75' 76='76' 77='77' 78='78'
  79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86='86' 87='87' 88='88' 89='89'
  90-99999='90 TO 99999: 90+';
value vx73f -99999-73='-99999 TO 73: < 74' 74='74: = 74' 75='75' 76='76' 77='77' 78='78'
  79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86='86' 87='87' 88='88' 89='89'
  90-99999='90 TO 99999: 90+';
value vx74f 0='NONE' 1='FIRST GRADE' 2='SECOND GRADE' 3='THIRD GRADE' 4='FOURTH GRADE'
  5='FIFTH GRADE' 6='SIXTH GRADE' 7='SEVENTH GRADE' 8='EIGHTH GRADE' 9='NINTH GRADE'
  10='TENTH GRADE' 11='ELEVENTH GRADE' 12='TWELFTH GRADE' 13='FIRST YEAR COLLEGE'
  14='SECOND YEAR COLLEGE' 15='THIRD YEAR COLLEGE' 16='FOURTH YEAR COLLEGE'
  17='FIFTH YEAR COLLEGE' 18='SIXTH+ YEARS COLLEGE' 96='ELEMENTARY, YEAR UNSPECIFIED'
  97='HIGH SCHOOL, YEAR UNSPECIFIED' 98='COLLEGE, YEAR UNSPECIFIED' 93='KINDERGARTEN'
  94='PRESCHOOL' 95='NONACADEMIC DEGREE OR DIPLOMA EDUCATIONAL PROGRAM' 99='GED';
value vx75f 1='MARRIED, SPOUSE PRESENT' 2='MARRIED, SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx76f 1='$0-3,999' 2='4,000-5,999' 3='6,000-7,499' 4='7,500-9,999' 5='10,000-14,999'
  6='15,000-17,499' 7='17,500-19,999' 8='20,000-24,999' 9='25,000-34,999'
  10='35,000-49,999' 11='50,000-74,999' 12='75,000-99,999' 13='100,000 AND OVER'
  14='NOTHING';
value vx77f -99999-78='-99999 TO 78: < 79' 79='79: = 79' 80='80' 81='81' 82='82' 83='83'
  84='84' 85='85' 86='86' 87='87' 88='88' 89='89' 90='90' 91='91' 92='92' 93='93' 94='94'
  95-99999='95 TO 99999: 95+';
value vx78f -99999-78='-99999 TO 78: < 79' 79='79: = 79' 80='80' 81='81' 82='82' 83='83'
  84='84' 85='85' 86='86' 87='87' 88='88' 89='89' 90='90' 91='91' 92='92' 93='93' 94='94'
  95-99999='95 TO 99999: 95+';
value vx79f -99999-78='-99999 TO 78: < 79' 79='79: = 79' 80='80' 81='81' 82='82' 83='83'
  84='84' 85='85' 86='86' 87='87' 88='88' 89='89' 90='90' 91='91' 92='92' 93='93' 94='94'
  95-99999='95 TO 99999: 95+';
value vx80f 0='NONE' 1='FIRST GRADE' 2='SECOND GRADE' 3='THIRD GRADE' 4='FOURTH GRADE'
  5='FIFTH GRADE' 6='SIXTH GRADE' 7='SEVENTH GRADE' 8='EIGHTH GRADE' 9='NINTH GRADE'
  10='TENTH GRADE' 11='ELEVENTH GRADE' 12='TWELFTH GRADE' 13='FIRST YEAR COLLEGE'
  14='SECOND YEAR COLLEGE' 15='THIRD YEAR COLLEGE' 16='FOURTH YEAR COLLEGE'
  17='FIFTH YEAR COLLEGE' 18='SIXTH+ YEARS COLLEGE' 96='ELEMENTARY, YEAR UNSPECIFIED'
  97='HIGH SCHOOL, YEAR UNSPECIFIED' 98='COLLEGE, YEAR UNSPECIFIED' 93='KINDERGARTEN'
  94='PRESCHOOL' 95='NONACADEMIC DEGREE OR DIPLOMA EDUCATIONAL PROGRAM' 99='GED';
value vx81f 1='MARRIED, SPOUSE PRESENT' 2='MARRIED, SPOUSE ABSENT' 3='WIDOWED' 4='DIVORCED'
  5='SEPARATED' 6='NEVER MARRIED';
value vx82f -99999-78='-99999 TO 78: < 79' 79='79: = 79' 80='80' 81='81' 82='82' 83='83'
  84='84' 85='85' 86='86' 87='87' 88='88' 89='89' 90='90' 91='91' 92='92' 93='93' 94='94'
  95-99999='95 TO 99999: 95+';
value vx83f -99999-78='-99999 TO 78: < 79' 79='79: = 79' 80='80' 81='81' 82='82' 83='83'
  84='84' 85='85' 86='86' 87='87' 88='88' 89='89' 90='90' 91='91' 92='92' 93='93' 94='94'
  95-99999='95 TO 99999: 95+';
value vx84f -99999-78='-99999 TO 78: < 79' 79='79: = 79' 80='80' 81='81' 82='82' 83='83'
  84='84' 85='85' 86='86' 87='87' 88='88' 89='89' 90='90' 91='91' 92='92' 93='93' 94='94'
  95-99999='95 TO 99999: 95+';
value vx85f 1='0-3999' 2='4000-5999' 3='6000-7499' 4='7500-9999' 5='10000-14999'
  6='15000-17499' 7='17500-19999' 8='20000-24999' 9='25000-34999' 10='35000-49999'
  11='50000-74999' 12='75000-99999' 13='100000 AND OVER' 14='NOTHING';
value vx86f 0='NONE' 1='FIRST GRADE' 2='SECOND GRADE' 3='THIRD GRADE' 4='FOURTH GRADE'
  5='FIFTH GRADE' 6='SIXTH GRADE' 7='SEVENTH GRADE' 8='EIGHTH GRADE' 9='NINTH GRADE'
  10='TENTH GRADE' 11='ELEVENTH GRADE' 12='TWELFTH GRADE' 13='FIRST YEAR COLLEGE'
  14='SECOND YEAR COLLEGE' 15='THIRD YEAR COLLEGE' 16='FOURTH YEAR COLLEGE'
  17='FIFTH YEAR COLLEGE' 18='SIXTH+ YEARS COLLEGE' 96='ELEMENTARY, YEAR UNSPECIFIED'
  97='HIGH SCHOOL, YEAR UNSPECIFIED' 98='COLLEGE, YEAR UNSPECIFIED' 93='KINDERGARTEN'
  94='PRESCHOOL' 95='NONACADEMIC DEGREE OR DIPLOMA EDUCATIONAL PROGRAM' 99='GED';
value vx87f -99999--1='-99999 TO -1: < 0' 0='0: = 0' 1='1' 2='2' 3='3' 4='4' 5='5' 6='6'
  7='7' 8='8' 9='9' 10='10' 11='11' 12='12' 13='13' 14='14' 15='15'
  16-99999='16 TO 99999: 16+';
value vx88f 10001-15083='10001 TO 15083: Mature Women'
  20001-25159='20001 TO 25159: Young women';
value vx89f 1='MALE' 2='FEMALE';
value vx90f 1='MALE' 2='FEMALE';
value vx91f 1='MALE' 2='FEMALE';
value vx92f 1='MALE' 2='FEMALE';
value vx93f 1='MALE' 2='FEMALE';
value vx94f 1='MALE' 2='FEMALE';
value vx95f 1='MALE' 2='FEMALE';
value vx96f 1='MALE' 2='FEMALE';
value vx97f 1='MALE' 2='FEMALE';
value vx98f 1='MALE' 2='FEMALE';
value vx99f 1='MALE' 2='FEMALE';
value vx100f 1='MALE' 2='FEMALE';
value vx101f 1='MALE' 2='FEMALE';
value vx102f 1='MALE' 2='FEMALE';


File Name: Step 2-recode NLSYW.sas

* Recode NLSYW;
* Make one observation per child;

proc contents data = new_data; run;

data new_data;
  set new_data;
  CASEID = R0000100;
run;

/*
proc freq data = new_data;
  tables R0494262 R0494266 R0494270 R0494274 R0494278 R0494282 R0494286 R0494290
         R0680300 R0681100 R0681900 R0682700 R0683500 R0913600 R0914500 R0915400
         R0916300 R1043400 R1043700 R1044000 R1044300 R1089400 R1089700 R1090000
         R1090300 R1206400 R1206700 R1339300 R1339600 R1339900 R1438800 R1439200
         R1439600/missing;
run;

data small;
  set new_data;
  array b73 {8} R0494262 R0494266 R0494270 R0494274 R0494278 R0494282 R0494286 R0494290;
  do i = 1 to 8;
    if b73(i) > 0 then do;
      kidsyob = b73(i);
      output;
    end;
  end;
run;
*/

proc freq data = new_data;
  tables R5185600 R5190000 R5194200 R5198400 R5202000 R5205600 R5209200 R5212800
         R5216300 R5219300 R5222200 R5225100 R5228000 R5230900;
run;

data new;
  set new_data;
  * array the possible child birth years;
  array CByear{33}
    R0494262 R0494266 R0494270 R0494274 R0494278 R0494282 R0494286 R0494290
    R0680300 R0681100 R0681900 R0682700 R0683500 R0913600 R0914500 R0915400
    R0916300 R1043400 R1043700 R1044000 R1044300 R1089400 R1089700 R1090000
    R1090300 R1206400 R1206700 R1339300 R1339600 R1339900 R1438800 R1439200
    R1439600;
  do i = 1 to 33;
    if CByear(i) ge 0 then do;
      cyear=cbyear(i);
      output;
    end;
  end;
run;

data new_93;
  set new_data;
  * array the possible child birth years;
  array CByear{14} R5185600 R5190000 R5194200 R5198400 R5202000 R5205600 R5209200
                   R5212800 R5216300 R5219300 R5222200 R5225100 R5228000 R5230900;
  array real{14}   R5166000 R5166100 R5166200 R5166300 R5166400 R5166500 R5166600
                   R5166700 R5166800 R5166900 R5167000 R5167100 R5167200 R5167300;
  array male{14}   R5163700 R5163800 R5163900 R5164000 R5164100 R5164200 R5164300
                   R5164400 R5164500 R5164600 R5164700 R5164800 R5164900 R5165000;
  * get one record for each birth;
  do i = 1 to 14;  * if CByear(i) ge 1993 then do;
    cyear=cbyear(i);
    relation = real{i};
    sex1 = male{i};
    output;
  end;             * end;
run;

data new_93;
  set new_93;
  cyear = cyear - 1900;
  if cyear ge 93;
  drop sex1 relation;
run;

proc freq data = new;
  tables cyear;
run;

proc append base = new data=new_93; run;

proc print data= new (obs=8); run;

* counts do not match to norberg;
proc freq data = new;
  tables cyear;
  where cyear ge 1970 and cyear le 1997;
run;

data new;
  set new;
  childid=_n_;
run;

proc freq data = new;  * where caseid=10;
  tables cyear/list;
run;

data new2;
  set new;

  * RECODE INCOME TO SAME GROUPS;
  ARRAY TOTINC{7} R0037000 R0113300 R0193100 R0308700 R0392500 R0475100 R0676400;
  DO I = 1 TO 7;
    IF TOTINC{i} = 1  THEN TOTINC{i} = 500;
    IF TOTINC{i} = 2  THEN TOTINC{i} = 1500;
    IF TOTINC{i} = 3  THEN TOTINC{i} = 2500;
    IF TOTINC{i} = 4  THEN TOTINC{i} = 3500;
    IF TOTINC{i} = 5  THEN TOTINC{i} = 4500;
    IF TOTINC{i} = 6  THEN TOTINC{i} = 5500;
    IF TOTINC{i} = 7  THEN TOTINC{i} = 6500;
    IF TOTINC{i} = 8  THEN TOTINC{i} = 8500;
    IF TOTINC{i} = 9  THEN TOTINC{i} = 12000;
    IF TOTINC{i} = 10 THEN TOTINC{i} = 20000;
    IF TOTINC{i} = 11 THEN TOTINC{i} = 25000;
    IF TOTINC{i} = 12 THEN TOTINC{i} = .;
  END;

  array tot2{2} R0902000 R1196200;
  do i = 1 to 2;
    IF TOT2{i} = 1  THEN tot2{i} = 2500;
    IF TOT2{i} = 2  THEN tot2{i} = 5000;
    IF TOT2{i} = 3  THEN tot2{i} = 6750;
    IF TOT2{i} = 4  THEN tot2{i} = 8250;
    IF TOT2{i} = 5  THEN tot2{i} = 12500;
    IF TOT2{i} = 6  THEN tot2{i} = 16500;
    IF TOT2{i} = 7  THEN tot2{i} = 18000;
    IF TOT2{i} = 8  THEN tot2{i} = 22500;
    IF TOT2{i} = 9  THEN tot2{i} = 30000;
    IF TOT2{i} = 10 THEN tot2{i} = 42500;
    IF TOT2{i} = 11 THEN tot2{i} = 55000;
  end;

  array tot3{2} R1328300 R1506300;
  do i = 1 to 2;
    IF tot3{i} = 1  THEN tot3{i} = 2500;
    IF tot3{i} = 2  THEN tot3{i} = 5000;
    IF tot3{i} = 3  THEN tot3{i} = 6250;
    IF tot3{i} = 4  THEN tot3{i} = 8750;
    IF tot3{i} = 5  THEN tot3{i} = 12500;
    IF tot3{i} = 6  THEN tot3{i} = 16500;
    IF tot3{i} = 7  THEN tot3{i} = 18000;
    IF tot3{i} = 8  THEN tot3{i} = 22500;
    IF tot3{i} = 9  THEN tot3{i} = 30000;
    IF tot3{i} = 10 THEN tot3{i} = 42500;
    IF tot3{i} = 11 THEN tot3{i} = 65000;
    IF tot3{i} = 12 THEN tot3{i} = 85000;
    IF tot3{i} = 13 THEN tot3{i} = 110000;
    IF tot3{i} = 14 THEN tot3{i} = 0;
  end;
run;

data new3;
  set new2;
  caseid = R0000100;

  * get the marriage, EDUC, TOTINC data closest to the birth;
  * no marriage var for years 74,76,79,81,84,86,90,92;
  if cyear = 58 then do; marstat = .; educ = .; TOTINC = .; age_1 = R0003100-10; end;
  else if cyear = 59 then do; marstat = .; educ = .; TOTINC = .; age_1 = R0003100-9; end;
  else if cyear = 60 then do; marstat = .; educ = .; TOTINC = .; age_1 = R0003100-8; end;
  else if cyear = 61 then do; marstat = .; educ = .; TOTINC = .; age_1 = R0003100-7; end;
  else if cyear = 62 then do; marstat = .; educ = .; TOTINC = .; age_1 = R0003100-6; end;
  else if cyear = 63 then do; marstat = .; educ = .; TOTINC = .; age_1 = R0003100-5; end;
  else if cyear = 64 then do; marstat = .; educ = .; TOTINC = .; age_1 = R0003100-4; end;
  else if cyear = 65 then do; marstat = .; educ = .; TOTINC = .; age_1 = R0003100-3; end;
  else if cyear = 66 then do; marstat = .; educ = .; TOTINC = .; age_1 = R0003100-2; end;
  else if cyear = 67 then do; marstat = .; educ = .; TOTINC = .; age_1 = R0003100-1; end;
  else if cyear = 68 then do; marstat = R0003300; educ = R0072000; TOTINC = R0037000; age_1 = R0003100;     end;
  else if cyear = 69 then do; marstat = R0086200; educ = R0134600; TOTINC = R0113300; age_1 = R0003100 +1;  end;
  else if cyear = 70 then do; marstat = R0146100; educ = R0212600; TOTINC = R0193100; age_1 = R0003100 +2;  end;
  else if cyear = 71 then do; marstat = R0253300; educ = R0326200; TOTINC = R0308700; age_1 = R0003100 +3;  end;
  else if cyear = 72 then do; marstat = R0334400; educ = R0409900; TOTINC = R0392500; age_1 = R0003100 +4;  end;
  else if cyear = 73 then do; marstat = R0417200; educ = R0494300; TOTINC = R0475100; age_1 = R0003100 +5;  end;
  else if cyear = 74 then do; marstat = R0417200; educ = R0494300; TOTINC = R0475100; age_1 = R0003100 +6;  end;
  else if cyear = 75 then do; marstat = R0519600; educ = R0521100; TOTINC = R0533300; age_1 = R0003100 +7;  end;
  else if cyear = 76 then do; marstat = R0519600; educ = R0521100; TOTINC = R0533300; age_1 = R0003100 +8;  end;
  else if cyear = 77 then do; marstat = R0581000; educ = R0550700; TOTINC = R0583400; age_1 = R0003100 +9;  end;
  else if cyear = 78 then do; marstat = R0587500; educ = R0666390; TOTINC = R0676400; age_1 = R0003100 +10; end;
  else if cyear = 79 then do; marstat = R0587500; educ = R0666390; TOTINC = R0676400; age_1 = R0003100 +11; end;
  else if cyear = 80 then do; marstat = R0710000; educ = R0749900; TOTINC = R0749400; age_1 = R0003100 +12; end;
  else if cyear = 81 then do; marstat = R0710000; educ = R0749900; TOTINC = R0749400; age_1 = R0003100 +13; end;
  else if cyear = 82 then do; marstat = R0756500; educ = R0797100; TOTINC = R0796500; age_1 = R0003100 +14; end;
  else if cyear = 83 then do; marstat = R0940700; educ = R0929500; TOTINC = R0902000; age_1 = R0003100 +15; end;
  else if cyear = 84 then do; marstat = R0940700; educ = R0929500; TOTINC = R0902000; age_1 = R0003100 +16; end;
  else if cyear = 85 then do; marstat = R0947400; educ = R0929500; TOTINC = R1050000; age_1 = R0003100 +17; end;
  else if cyear = 86 then do; marstat = R0947400; educ = R0929500; TOTINC = R1050000; age_1 = R0003100 +18; end;
  else if cyear = 87 then do; marstat = R1063000; educ = R1097400; TOTINC = R1095800; age_1 = R0003100 +19; end;
  else if cyear = 88 then do; marstat = R1221000; educ = R1215100; TOTINC = R1196200; age_1 = R0003100 +20; end;
  else if cyear = 89 then do; marstat = R1221000; educ = R1215100; TOTINC = R1196200; age_1 = R0003100 +21; end;
  else if cyear = 90 then do; marstat = R1221000; educ = R1215100; TOTINC = R1196200; age_1 = R0003100 +22; end;
  else if cyear = 91 then do; marstat = R1352700; educ = R1346400; TOTINC = R1328300; age_1 = R0003100 +23; end;
  else if cyear = 92 then do; marstat = R1352700; educ = R1346400; TOTINC = R1328300; age_1 = R0003100 +24; end;
  else if cyear ge 93 then do; marstat = R1569900; educ = R1520400; TOTINC = R1506300; age_1 = R0003100 +25; end;

  IF EDUC ge 0 then EDUC = EDUC;
  ELSE IF EDUC GE 95 THEN EDUC = .;
  ELSE EDUC = .;
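  /* Editorial note, not part of the original program: the IF/ELSE
     ladder above attaches to each child record the mother's marital
     status, education and total family income from the interview round
     closest to the child's birth year (cyear); rounds with no
     marital-status question (74, 76, 79, 81, 84, 86, 90, 92) carry
     forward the adjacent round's values.  age_1 reconstructs the
     mother's age in the birth year from her age at the 1968 interview
     (R0003100): a 1972 birth, for example, gives
     age_1 = R0003100 + 4. */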

IF marstat = 1 then couple = 1; if post_wt = . then delete; model sex1 = couple /clparm ELSE IF marstat ge 2 then couple run; solution noint ; = 2; weight post_wt; proc freq data = new4; estimate 'together vs apart' IF TOTINC ge 0 then TOTINC1 = tables caseid*post_wt/missing couple 1 -1; run; TOTINC; list; run; ELSE TOTINC1 = .; *Adjusting for other things; proc freq data = new4; *Using method in Brumback et al RACE = R0003200; run; tables cyear; run; 2010; *binary outcome is sex of child; Data thesis.nlsyw1; *binary exposure of interest is proc sort data=new3; by CaseID set new4; run; marital satus; ChildID; run; *verified that age and parity are not effect modifiers ; data new3; *confounders are income, race, set new3; File Name: Step-4 RD-NLSYW.sas education, year of birth age and by CaseID; parity; retain Parity; * RD re-analysis - based on if first.CaseID then Parity=1; exposure modeling - crude risk, else Parity=Parity+1; run; weighted and unweighted; title 'RD-Conception Year GE 1970 le 1997 - weighted without data new3; proc freq data = new4; income'; set new3; tables post_wt race educ cyear IF Parity = 1 then SEX1 = parity couple age_1 totinc/list proc surveyreg data = new4; where R5163700 ; missing; run; cyear ge 70 and cyear le 97; IF Parity = 2 then SEX1 = class couple educ race parity; R5163800 ; cluster caseid; *account for IF Parity = 3 then SEX1 = title 'RD - Conception Year GE repeated measures; R5163900 ; 70 le 1997 - unweighted'; model sex1 = couple age_1 educ IF Parity = 4 then SEX1 = proc freq data = new4; where race cyear parity /clparm R5164000 ; cyear ge 70 and cyear le 97; solution noint ; IF Parity = 5 then SEX1 = tables couple*sex1/ riskdiff; weight post_wt; R5164100 ; run; estimate 'together vs apart' IF Parity = 6 then SEX1 = couple 1 -1; run; R5164200 ; title 'RD- All years - IF Parity = 7 then SEX1 = unweighted'; title 'RD-Conception Year all - R5164300 ; proc freq data = new4; weighted wihtout income'; IF Parity = 8 then SEX1 = tables couple*sex1/ riskdiff; R5164400 ; run; proc surveyreg data = new4; IF Parity = 9 then SEX1 = class couple educ race parity; R5164500 ; title 'RD-Conception Year GE cluster caseid; *account for IF Parity = 10 then SEX1 = 1970 le 97 - weighted'; repeated measures; R5164600 ; model sex1 = couple age_1 educ proc surveyreg data = new4; where race cyear parity /clparm if sex1 ge 1 then sex1 = sex1 ; cyear ge 70 and cyear le 97; solution noint ; else sex1 = .; run; class couple ; weight post_wt; cluster caseid; *account for estimate 'together vs apart' repeated measures; couple 1 -1; run; proc freq data = new3; model sex1 = couple /clparm tables solution noint ; couple*marstat educ totinc parity weight post_wt; title 'RD-Conception Year GE sex1 /list; run; estimate 'together vs apart' 1970 le 1997 - weighted with couple 1 -1; run; income'; data new4; merge new3(in=a) weights; title 'RD-Conception Year all - proc surveyreg data = new4; where by caseid; keep age_1 race weighted'; cyear ge 70 and cyear le 97; caseid childid cyear educ totinc class couple educ race parity; parity sex1 couple post_wt; proc surveyreg data = new4; cluster caseid; *account for if childid = . then delete; class couple ; cluster caseid; repeated measures; *account for repeated measures; 316

title 'RD-Conception Year GE 1970 le 1997 - weighted with income';
proc surveyreg data = new4; where cyear ge 70 and cyear le 97;
class couple educ race parity;
cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity totinc /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year all - weighted with income';
proc surveyreg data = new4;
class couple educ race parity;
cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity totinc /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

/*
proc surveylogistic data = new3; where cyear ge 1991;
class couple educat race totinc1 cyear parity/param=ref;
cluster panel;
model couple = age_1 educat race totinc1 cyear parity / ;
strata col_str;
weight post_wt;
output out=propout (keep=caseid propensity) p = propensity; run;

proc sort data = new3; by caseid; run;
proc sort data = propout; by caseid; run;

proc surveylogistic data = new3; where cyear ge 1991;
class couple educat race totinc1 cyear parity/param=ref;
cluster panel;
model couple = parity age_1;
strata col_str;
weight post_wt;
output out=pout (keep=caseid puncond) p = puncond; run;

proc sort data = pout; by caseid; run;

data stand;
merge new3 propout pout;
by caseid;
if sex1 = 1 then iptw = puncond/propensity;
if sex1 = 2 then iptw = (1-puncond)/(1-propensity);
combinedweight = iptw*post_wt;
run;

proc surveyreg data = stand;
class couple parity;
model sex1 = couple age_1 parity/solution clparm;
strata col_str;
weight combinedweight;
estimate 'together vs apart' couple 1 -1; run;
*/
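The commented-out block above is an inverse-probability weighting step in the spirit of Brumback et al. (2010): a stabilized weight of the form P(A=a)/P(A=a|X) multiplied onto the survey weight. A minimal free-standing sketch of that general construction, keyed on a binary exposure (the data set and variable names have, expose, x, and post_wt are illustrative assumptions, not thesis variables):

* Minimal stabilized-IPTW sketch (placeholder names, not the thesis data);
proc logistic data = have descending;
  model expose = x;                      /* propensity P(expose=1 | x)   */
  output out = ps p = propensity;
run;

proc means data = have noprint;
  var expose;
  output out = marg mean = puncond;      /* unconditional P(expose=1)    */
run;

data iptw;
  if _n_ = 1 then set marg(keep=puncond);   /* value retained across rows */
  set ps;
  if expose = 1 then iptw = puncond/propensity;           /* stabilized  */
  else               iptw = (1-puncond)/(1-propensity);
  combinedweight = iptw * post_wt;  /* post_wt assumed on the input file */
run;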

File Name: Weights read in.sas

*Read in custom weights;
data weights;
infile 'C:\Documents and Settings\foxkarl\Desktop\NLSYW1\customweight_WOMEN_7229163156220028201.dat';
INPUT CASEID POST_WT;
run;

proc freq data = weights; run;

data weights;
set weights;
post_wt = post_wt/100; run;

NLSY79

File Name: Step1a.sas

options nocenter validvarname=any;

*---Read in space-delimited ascii file;
data new_data;
infile 'C:\Documents and Settings\foxkarl\Desktop\nlsy79data\July\KarlaredoJuly2.dat' lrecl=478 missover DSD DLM=' ' print;
input
R0000100 R0010600 R0013700 R0014000 R0014300 R0014600 R0014900 R0017300
R0155400 R0214700 R0214800 R0218004 R0226000 R0226700 R0227000 R0229200
R0312300 R0388800 R0407604 R0413600 R0414300 R0414600 R0417400 R0482600
R0533500 R0647105 R0656100 R0664500 R0782100 R0813300 R0898856 R0901200
R0905900 R1024000 R1048500 R1146848 R1205800 R1410700 R1431800 R1522064
R1605100 R1778500 R1865200 R1892766 R1905600 R2141600 R2231300 R2259866
R2300300 R2301600 R2306500 R2350300 R2427100 R2448059 R2501300 R2502600
R2509000 R2722500 R2747200 R2881300 R2901300 R2902600 R2908100 R2971400
R2990000 R3076874 R3101700 R3103000 R3110200 R3279400 R3411240 R3501700
R3503000 R3510200 R3559000 R3659081 R3701700 R3703000 R3710200 R3897100
R4009488 R4121800 R4447100 R4506000 R4526500 R4982800 R5060600 R5090402
R5176202 R5205300 R5221800 R5626200 R5738000 R5802300 R5804900 R5821800
R6364600 R6481702 R6481800 R6482202 R6482300 R6482702 R6482800 R6483202
R6483300 R6483702 R6483800 R6484202 R6484300 R6484702 R6484800 R6485102
R6485200 R6485502 R6485600 R6485902 R6486000 R6489402;

array nvarlist _numeric_;

*---Recode missing values to SAS custom system missing. See SAS
documentation for use of MISSING option in procedures, e.g. PROC FREQ;
do over nvarlist;
if nvarlist = -1 then nvarlist = .R; /* Refused */
if nvarlist = -2 then nvarlist = .D; /* Don't know */
if nvarlist = -3 then nvarlist = .I; /* Invalid missing */
if nvarlist = -4 then nvarlist = .V; /* Valid missing */
if nvarlist = -5 then nvarlist = .N; /* Non-interview */
end;

label R0000100 = "ID# (1-12686) 79";        label R0010600 = "MARITAL STATUS 79";
label R0013700 = "YR OF BIRTH CHILD#1 79";  label R0014000 = "YR OF BIRTH CHILD#2 79";
label R0014300 = "YR OF BIRTH CHILD#3 79";  label R0014600 = "YR OF BIRTH CHILD#4 79";
label R0014900 = "YR OF BIRTH CHILD#5 79";  label R0017300 = "HGC 79";
label R0155400 = "TOT INC WAGES AND SALRY P-C YR 79";
label R0214700 = "RACL/ETHNIC COHORT /SCRNR 79";
label R0214800 = "SEX OF R 79";             label R0218004 = "REL CODE OF CURR SP/PTR, 1979 XRND";
label R0226000 = "ICHK MARITAL STATUS 80";
label R0226700 = "YR OF BIRTH CHILD#1 SIN LINT 80";
label R0227000 = "YR OF BIRTH CHILD#2 SIN LINT 80";
label R0229200 = "HGC 80";                  label R0312300 = "TOT INC WAGES AND SALRY P-C YR 80";
label R0388800 = "ISHEET 01 80- MARITAL STATUS";
label R0407604 = "REL CODE OF CURR SP/PTR, 1980 XRND";
label R0413600 = "ICHK MARITAL STATUS 81";
label R0414300 = "YR OF BIRTH CHILD#1 SIN LINT 81";
label R0414600 = "YR OF BIRTH CHILD#2 SIN LINT 81";
label R0417400 = "HGC 81";                  label R0482600 = "TOT INC WAGES AND SALRY P-C YR 81";
label R0533500 = "ISHEET 02 81- MARITAL STATUS";
label R0647105 = "REL CODE OF CURR SP/PTR, 1981 XRND";
label R0656100 = "ICHK MARSTAT LISTED ISHEET 82";
label R0664500 = "HGC 82";                  label R0782100 = "TOT INC WAGES AND SALRY P-C YR 82";
label R0813300 = "ISHEET 03 82- MARITAL STATUS";
label R0898856 = "REL CODE OF CURR SP/PTR, 1982 XRND";
label R0901200 = "ICHK MARSTAT CODE ON ISHEET 83";
label R0905900 = "HGC 83";                  label R1024000 = "TOT INC WAGES AND SALRY P-C YR 83";
label R1048500 = "ISHEET 01 83- MARITAL STATUS";
label R1146848 = "REL CODE OF CURR SP/PTR, 1983 XRND";
label R1205800 = "HGC 84";                  label R1410700 = "TOT INC WAGES AND SALRY P-C YR 84";
label R1431800 = "ISHEET 01 84- MARITAL STATUS";
label R1522064 = "REL CODE OF CURR SP/PTR, 1984 XRND";
label R1605100 = "HGC 85";                  label R1778500 = "TOT INC WAGES AND SALRY P-C YR 85";
label R1865200 = "ISHEET 01 85- MARITAL STATUS";
label R1892766 = "REL CODE OF CURR SP/PTR, 1985 XRND";
label R1905600 = "HGC 86";                  label R2141600 = "TOT INC WAGES AND SALRY P-C YR 86";
label R2231300 = "ISHEET 01 86- MARITAL STATUS";
label R2259866 = "REL CODE OF CURR SP/PTR, 1986 XRND";
label R2300300 = "CURRENT MARSTAT NO CHNG SIN LINT 87";
label R2301600 = "CURRENT MARSTAT CHNG SIN LINT 87";
label R2306500 = "HGC 87";                  label R2350300 = "TOT INC WAGES AND SALRY P-C YR 87";
label R2427100 = "ISHEET 01 87- MARITAL STATUS";
label R2448059 = "REL CODE OF CURR SP/PTR, 1987 XRND";
label R2501300 = "CURRENT MARSTAT NO CHNG SIN LINT 88";
label R2502600 = "CURRENT MARSTAT CHNG SIN LINT 88";
label R2509000 = "HGC 88";                  label R2722500 = "TOT INC WAGES AND SALRY P-C YR 88";
label R2747200 = "ISHEET 01 88- MARITAL STATUS";
label R2881300 = "REL CODE OF CURR SP/PTR, 1988 XRND";
label R2901300 = "CURRENT MARSTAT NO CHNG SIN LINT 89";
label R2902600 = "CURRENT MARSTAT CHNG SIN LINT 89";
label R2908100 = "HGC 89";                  label R2971400 = "TOT INC WAGES AND SALRY P-C YR 89";
label R2990000 = "ISHEET 01 89- MARITAL STATUS";
label R3076874 = "REL CODE OF CURR SP/PTR, 1989 XRND";
label R3101700 = "CURRENT MARSTAT NO CHNG SIN LINT 90";
label R3103000 = "CURRENT MARSTAT CHNG SIN LINT 90";
label R3110200 = "HGC 90";                  label R3279400 = "TOT INC WAGES AND SALRY P-C YR 90";
label R3411240 = "REL CODE OF CURR SP/PTR, 1990 XRND";
label R3501700 = "CURRENT MARSTAT NO CHNG SIN LINT 91";
label R3503000 = "CURRENT MARSTAT CHNG SIN LINT 91";
label R3510200 = "HGC 91";                  label R3559000 = "TOT INC WAGES AND SALRY P-C YR 91";
label R3659081 = "REL CODE OF CURR SP/PTR, 1991 XRND";
label R3701700 = "CURRENT MARSTAT NO CHNG SIN LINT 92";
label R3703000 = "CURRENT MARSTAT CHNG SIN LINT 92";
label R3710200 = "HGC 92";                  label R3897100 = "TOT INC WAGES AND SALRY P-C YR 92";
label R4009488 = "REL CODE OF CURR SP/PTR, 1992 XRND";
label R4121800 = "R'S CURRENT MARITAL STATUS. 93";
label R4447100 = "REL CODE OF CURR SP/PTR, 1993 XRND";
label R4506000 = "R'S CURRENT MARITAL STATUS. 94";
label R4526500 = "HGHST GRADE/YR COMPLTD & GOT CREDIT 94";
label R4982800 = "AMT OF R'S WAGES/SALARY/TIPS (PCY) 94";
label R5060600 = "MARITAL STATUS AT LAST INTERVIEW 94";
label R5090402 = "REL CODE OF CURR SP/PTR, 1994 XRND";
label R5176202 = "REL CODE OF CURR SP/PTR, 1996 XRND";
label R5205300 = "R'S CURRENT MARITAL STATUS. 96";
label R5221800 = "HGHST GRADE/YR COMPLTD & GOT CREDIT 96";
label R5626200 = "AMT OF R'S WAGES/SALARY/TIPS (PCY) 96";
label R5738000 = "MARITAL STATUS AT INT DATE 96";
label R5802300 = "CUR MAR STAT (NO CHG SINC LAST INT) 1998";
label R5804900 = "RS CURRENT MARITAL STATUS 1998";
label R5821800 = "HGHST GRADE/YR COMPLTD & GOT CREDIT 1998";
label R6364600 = "AMT OF RS WAGES/SALARY/TIPS (PCY) 1998";
label R6481702 = "DATE OF BIRTH OF 1ST CHILD 1998";   label R6481800 = "SEX OF 1ST CHILD 1998";
label R6482202 = "DATE OF BIRTH OF 2ND CHILD 1998";   label R6482300 = "SEX OF 2ND CHILD 1998";
label R6482702 = "DATE OF BIRTH OF 3RD CHILD 1998";   label R6482800 = "SEX OF 3RD CHILD 1998";
label R6483202 = "DATE OF BIRTH OF 4TH CHILD 1998";   label R6483300 = "SEX OF 4TH CHILD 1998";
label R6483702 = "DATE OF BIRTH OF 5TH CHILD 1998";   label R6483800 = "SEX OF 5TH CHILD 1998";
label R6484202 = "DATE OF BIRTH OF 6TH CHILD 1998";   label R6484300 = "SEX OF 6TH CHILD 1998";
label R6484702 = "DATE OF BIRTH OF 7TH CHILD 1998";   label R6484800 = "SEX OF 7TH CHILD 1998";
label R6485102 = "DATE OF BIRTH OF 8TH CHILD 1998";   label R6485200 = "SEX OF 8TH CHILD 1998";
label R6485502 = "DATE OF BIRTH OF 9TH CHILD 1998";   label R6485600 = "SEX OF 9TH CHILD 1998";
label R6485902 = "DATE OF BIRTH OF 10TH CHILD 1998";  label R6486000 = "SEX OF 10TH CHILD 1998";
label R6489402 = "REL CODE OF CURR SP/PTR, 1998 XRND";

/*---------------------------------------------------------------*
 * Crosswalk for Reference number & Question name                 *
 *---------------------------------------------------------------*
 * Uncomment and edit this RENAME statement to rename variables
 * for ease of use. You may need to use name literal strings
 * e.g. 'variable-name'n to create valid SAS variable names, or
 * alter variables similarly named across years.
 * This command does not guarantee uniqueness.
 * See SAS documentation for use of name literals and use of the
 * VALIDVARNAME=ANY option.
 *---------------------------------------------------------------*/

/* *start* */
proc format;
value vx1f 1='PRESENTLY MARRIED' 2='WIDOWED' 3='DIVORCED' 4='SEPARATED' 5='NEVER MARRIED-ANNUL';
value vx2f 0-69='0 TO 69: < 70' 70='70' 71='71' 72='72' 73='73' 74='74' 75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86-99999='86 TO 99999: 86+';
value vx3f 0-69='0 TO 69: < 70' 70='70' 71='71' 72='72' 73='73' 74='74' 75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86-99999='86 TO 99999: 86+';
value vx4f 0-69='0 TO 69: < 70' 70='70' 71='71' 72='72' 73='73' 74='74' 75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86-99999='86 TO 99999: 86+';
value vx5f 0-69='0 TO 69: < 70' 70='70' 71='71' 72='72' 73='73' 74='74' 75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86-99999='86 TO 99999: 86+';
value vx6f 0-69='0 TO 69: < 70' 70='70' 71='71' 72='72' 73='73' 74='74' 75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84='84' 85='85' 86-99999='86 TO 99999: 86+';
value vx7f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx8f 0='0' 1-99='1 TO 99' 100-199='100 TO 199' 200-299='200 TO 299' 300-399='300 TO 399' 400-499='400 TO 499' 500-599='500 TO 599' 600-699='600 TO 699' 700-799='700 TO 799' 800-899='800 TO 899' 900-999='900 TO 999' 1000-9999999='1000 TO 9999999: 1000+';
value vx9f 1='HISPANIC' 2='BLACK' 3='NON-BLACK, NON-HISPANIC';
value vx10f 1='MALE' 2='FEMALE';
value vx11f -999='-999: No reported spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx12f 1='MARRIED' 2='OTHER CODE LISTED';
value vx13f 79='1979' 80='1980' 81='1981' 82='1982';
value vx14f 79='1979' 80='1980' 81='1981' 82='1982';
value vx15f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx16f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-9999999='50000 TO 9999999: 50000+';
value vx17f 1='PRESENTLY MARRIED' 2='WIDOWED' 3='DIVORCED' 4='SEPARATED' 5='NEVER MARRIED-ANNUL';
value vx18f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx19f 1='MARRIED' 2='OTHER CODE LISTED';
value vx20f 0-67='0 TO 67: < 68' 68='68' 69='69' 70='70' 71='71' 72='72' 73='73' 74='74' 75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84-99999='84 TO 99999: 84+';
value vx21f 0-67='0 TO 67: < 68' 68='68' 69='69' 70='70' 71='71' 72='72' 73='73' 74='74' 75='75' 76='76' 77='77' 78='78' 79='79' 80='80' 81='81' 82='82' 83='83' 84-99999='84 TO 99999: 84+';
value vx22f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx23f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-9999999='50000 TO 9999999: 50000+';
value vx24f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx25f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx26f 1='MARRIED' 2='NEVER MARRIED' 3='OTHER CODE LISTED';
value vx27f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx28f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-9999999='50000 TO 9999999: 50000+';
value vx29f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx30f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx31f 1='MARRIED' 2='OTHER CODE LISTED';
value vx32f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx33f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-9999999='50000 TO 9999999: 50000+';
value vx34f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx35f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx36f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx37f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-9999999='50000 TO 9999999: 50000+';
value vx38f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 4='REUNITED' 6='WIDOWED';
value vx39f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx40f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx41f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-9999999='50000 TO 9999999: 50000+';
value vx42f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx43f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx44f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx45f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-9999999='50000 TO 9999999: 50000+';
value vx46f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx47f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx48f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx49f 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 4='REUNITED' 5='REMARRIED' 6='WIDOWED';
value vx50f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx51f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-9999999='50000 TO 9999999: 50000+';
value vx52f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx53f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx54f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx55f 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 4='REUNITED' 5='REMARRIED' 6='WIDOWED';
value vx56f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx57f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-9999999='50000 TO 9999999: 50000+';
value vx58f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx59f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx60f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx61f 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 4='REUNITED' 5='REMARRIED' 6='WIDOWED';
value vx62f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx63f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-9999999='50000 TO 9999999: 50000+';
value vx64f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx65f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx66f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx67f 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 4='REUNITED' 5='REMARRIED' 6='WIDOWED';
value vx68f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx69f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-9999999='50000 TO 9999999: 50000+';
value vx70f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx71f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx72f 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 4='REUNITED' 5='REMARRIED' 6='WIDOWED';
value vx73f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx74f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-9999999='50000 TO 9999999: 50000+';
value vx75f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx76f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 5='REMARRIED' 6='WIDOWED';
value vx77f 0='NEVER MARRIED' 1='MARRIED' 2='SEPARATED' 3='DIVORCED' 6='WIDOWED';
value vx78f 0='NONE' 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YR COL' 14='2ND YR COL' 15='3RD YR COL' 16='4TH YR COL' 17='5TH YR COL' 18='6TH YR COL' 19='7TH YR COL' 20='8TH YR COL OR MORE' 95='UNGRADED';
value vx79f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-9999999='50000 TO 9999999: 50000+';
value vx80f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx81f 1='YES' 0='NO';
value vx82f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx83f 1='YES' 0='NO';
value vx84f 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YEAR COLLEGE' 14='2ND YEAR COLLEGE' 15='3RD YEAR COLLEGE' 16='4TH YEAR COLLEGE' 17='5TH YEAR COLLEGE' 18='6TH YEAR COLLEGE' 19='7TH YEAR COLLEGE' 20='8TH YEAR COLLEGE OR MORE' 95='UNGRADED';
value vx85f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-99999999='50000 TO 99999999: 50000+';
value vx86f 0='0: 0 NEVER MARRIED' 1='1: 1 MARRIED' 2='2: 2 SEPARATED' 3='3: 3 DIVORCED' 6='6: 6 WIDOWED';
value vx87f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx88f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';
value vx89f 1='YES' 0='NO';
value vx90f 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YEAR COLLEGE' 14='2ND YEAR COLLEGE' 15='3RD YEAR COLLEGE' 16='4TH YEAR COLLEGE' 17='5TH YEAR COLLEGE' 18='6TH YEAR COLLEGE' 19='7TH YEAR COLLEGE' 20='8TH YEAR COLLEGE OR MORE' 95='UNGRADED';
value vx91f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-99999999='50000 TO 99999999: 50000+';
value vx92f 0='0: NEVER MARRIED' 1='1: MARRIED' 2='2: SEPARATED' 3='3: DIVORCED' 6='6: WIDOWED';
value vx93f 0='Never Married' 1='Married' 2='Separated' 3='Divorced' 6='Widowed';
value vx94f 1='YES' 0='NO';
value vx95f 1='1ST GRADE' 2='2ND GRADE' 3='3RD GRADE' 4='4TH GRADE' 5='5TH GRADE' 6='6TH GRADE' 7='7TH GRADE' 8='8TH GRADE' 9='9TH GRADE' 10='10TH GRADE' 11='11TH GRADE' 12='12TH GRADE' 13='1ST YEAR COLLEGE' 14='2ND YEAR COLLEGE' 15='3RD YEAR COLLEGE' 16='4TH YEAR COLLEGE' 17='5TH YEAR COLLEGE' 18='6TH YEAR COLLEGE' 19='7TH YEAR COLLEGE' 20='8TH YEAR COLLEGE OR MORE' 95='UNGRADED';
value vx96f 0='0' 1-999='1 TO 999' 1000-1999='1000 TO 1999' 2000-2999='2000 TO 2999' 3000-3999='3000 TO 3999' 4000-4999='4000 TO 4999' 5000-5999='5000 TO 5999' 6000-6999='6000 TO 6999' 7000-7999='7000 TO 7999' 8000-8999='8000 TO 8999' 9000-9999='9000 TO 9999' 10000-14999='10000 TO 14999' 15000-19999='15000 TO 19999' 20000-24999='20000 TO 24999' 25000-49999='25000 TO 49999' 50000-99999999='50000 TO 99999999: 50000+';
value vx98f 1='Male' 2='Female';
value vx100f 1='Male' 2='Female';
value vx102f 1='Male' 2='Female';
value vx104f 1='Male' 2='Female';
value vx106f 1='Male' 2='Female';
value vx108f 1='Male' 2='Female';
value vx110f 1='Male' 2='Female';
value vx112f 1='Male' 2='Female';
value vx114f 1='Male' 2='Female';
value vx116f 1='Male' 2='Female';
value vx117f -999='-999: Never reported spouse/partner' 0='0: No current spouse/partner' 1='1: Spouse' 33='33: Partner' 36='36: Other';

File Name: Step1b weights.sas

*Weights;
data weights;
infile 'C:\Documents and Settings\foxkarl\Desktop\nlsy79data\July\customweight_nlsy79_4746733922301957649.dat';
input CASEID post_wt; run;

File Name: Step 2 recode.sas

* recode NLSY79;
proc contents data = new_data; run;

data new_data;
set new_data;
CASEID = R0000100;
RACE = R0214700; run;

proc freq data = new_data;
tables R0013700*R6481702*R0414300/list missing; run;

data new;
set new_data;
* array the possible child birth years;
array CByear{10} R6481702 R6482202 R6482702 R6483202 R6483702 R6484202 R6484702 R6485102 R6485502 R6485902;
* get one record for each birth;
do i = 1 to 10;
if CByear(i) ge 0 then do;
cyear = cbyear(i);
output; end; end; run;

proc sort data = new; by CASEID; run;

data new;
set new;
by CaseID;
retain Parity;
if first.CaseID then Parity=1;
else Parity=Parity+1; run;

proc freq data = new;
tables cyear parity; run;

data new2;
merge new(in=a) weights;
by CASEID;
if a; run;

proc freq data = new2;
tables post_wt; run;

data new2;
set new2;
post_wt = post_wt/100;
if parity = 1 then sex1 = R6481800;
else if parity = 2 then sex1 = R6482300;
else if parity = 3 then sex1 = R6482800;
else if parity = 4 then sex1 = R6483300;
else if parity = 5 then sex1 = R6483800;
else if parity = 6 then sex1 = R6484300;
else if parity = 7 then sex1 = R6484800;
else if parity = 8 then sex1 = R6485200;
else if parity = 9 then sex1 = R6485600;
else if parity = 10 then sex1 = R6486000; run;

data new3;
merge new2(in=a) small;
by CASEID;
if a; run;

proc freq data = new3;
tables R0010600 R0226000 R0413600 R0656100 R0901200 R1431800 R1865200 R2231300
R2300300*R2301600 R2427100 R2501300*R2502600 R2747200 R2901300*R2902600 R2990000
R3101700*R3103000 R3501700*R3503000 R3701700*R3703000 R4121800*R5060600 R4506000
R5205300*R5738000 R5804900/list missing; run;

*missing has several codes;
* .R; /* Refused */
* .D; /* Don't know */
* .I; /* Invalid missing */
* .V; /* Valid missing */
* .N; /* Non-interview */

***add on year of birth of respondent from small data file;
data small;
set new_data2;
CASEID = R0000100;
keep CASEID R0000500; run;
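The array/OUTPUT loop in Step 2 above is the usual wide-to-long reshape: one record per respondent becomes one record per reported birth, with cyear carrying the birth year. A minimal self-contained sketch of the same pattern (toy data set and variable names, not the NLSY79 reference numbers):

* Minimal sketch of the wide-to-long reshape used in Step 2 (toy data);
data wide;                        /* hypothetical input: one row per woman */
  caseid = 1; byear1 = 79; byear2 = 83; byear3 = .; output;
run;

data long;
  set wide;
  array byear{3} byear1-byear3;   /* the reported birth years */
  do i = 1 to 3;
    if byear(i) ge 0 then do;     /* keep only actual births */
      cyear = byear(i);
      output;                     /* one record per birth */
    end;
  end;
  keep caseid cyear;
run;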

data new3;
set new3;
YOB = 1900+R0000500;

if R0010600 = 1 then couple79 = 1;
else if R0010600 ge 2 then couple79 = 2;

if R0226000 = 1 then couple80 = 1;
else if R0226000 = .v then couple80 = couple79;
else couple80 = R0226000;

if R0413600 = 1 then couple81 = 1;
else if R0413600 = .v then couple81 = couple80;
else if R0413600 ge 2 then couple81 = 2;
else couple81 = R0413600;

if R0656100 = 1 then couple82 = 1;
else if R0656100 = .v then couple82 = couple81;
else if R0656100 ge 2 then couple82 = 2;
else couple82 = R0656100;

if R0901200 = 1 then couple83 = 1;
else if R0901200 = .v then couple83 = couple82;
else if R0901200 ge 2 then couple83 = 2;
else couple83 = 2;

if R1431800 = 1 then couple84 = 1;
else if R1431800 = 0 then couple84 = 2;
else if R1431800 ge 2 then couple84 = 2;
else couple84 = .;

if R1865200 = 1 then couple85 = 1;
else if R1865200 = 0 then couple85 = 2;
else if R1865200 ge 2 then couple85 = 2;
else couple85 = .;

if R2231300 = 1 then couple86 = 1;
else if R2231300 = 0 then couple86 = 2;
else if R2231300 ge 2 then couple86 = 2;
else couple86 = .;

if R2427100 = 1 then couple87 = 1;
else if R2427100 = 0 then couple87 = 2;
else if R2427100 ge 2 then couple87 = 2;
else couple87 = .;

if R2747200 = 1 then couple88 = 1;
else if R2747200 = 0 then couple88 = 2;
else if R2747200 ge 2 then couple88 = 2;
else couple88 = .;

if R2990000 = 1 then couple89 = 1;
else if R2990000 = 0 then couple89 = 2;
else if R2990000 ge 2 then couple89 = 2;
else couple89 = .;

if R3101700 = 1 or R3103000 = 1 then couple90 = 1;
else if R3101700 = 0 or R3103000 = 0 then couple90 = 2;
else if R3101700 ge 2 or R3103000 ge 2 then couple90 = 2;
else couple90 = .;

if R3501700 = 1 or R3503000 = 1 then couple91 = 1;
else if R3501700 = 0 or R3503000 = 0 then couple91 = 2;
else if R3501700 ge 2 or R3503000 ge 2 then couple91 = 2;
else couple91 = .;

if R3701700 = 1 or R3703000 = 1 then couple92 = 1;
else if R3701700 = 0 or R3703000 = 0 then couple92 = 2;
else if R3701700 ge 2 or R3703000 ge 2 then couple92 = 2;
else couple92 = .;

if R5060600 = 1 then couple93 = 1;
else if R5060600 = 0 then couple93 = 2;
else if R5060600 ge 2 then couple93 = 2;
else couple93 = .;

if R4506000 = 1 then couple94 = 1;
else if R4506000 = 0 then couple94 = 2;
else if R4506000 = .v then couple94 = couple93;
else couple94 = .;

if R5738000 = 1 then couple96 = 1;
else if R5738000 = 0 then couple96 = 2;
else if R5738000 ge 2 then couple96 = 2;
else couple96 = .;

if R5804900 = 1 then couple98 = 1;
else if R5804900 = 0 then couple98 = 2;
else if R5804900 = .v then couple98 = couple96;
else couple98 = .; run;

proc freq data = new3;
tables couple79-couple94 couple96 couple98; run;

proc freq data = new3;
tables R0155400 R0312300 R0482600 R0782100 R1024000 R1410700 R1778500 R2141600
R2350300 R2722500 R2971400 R3279400 R3559000 R3897100 R4982800 R5626200 R6364600; run;

proc freq data = new3;
tables R0017300 R0229200 R0417400 R0664500 R0905900 R1205800 R1605100 R1905600
R2306500 R2509000 R2908100 R3110200 R3510200 R3710200 R4526500 R5221800 R5821800; run;

*all income is continuous! as is education! remove 95 - it is ungraded;
proc freq data = new3;
tables cyear; run;
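The .v branches above are a last-observation-carried-forward rule: in a valid-skip year the previous year's couple status is reused. The same pattern can be written once over arrays; a minimal sketch with invented variable names (not the NLSY79 reference numbers):

* Minimal carried-forward recode sketch (toy names, not the survey items);
data locf;
  set statuses;                          /* hypothetical input data set */
  array raw{5}  s79 s80 s81 s82 s83;     /* raw yearly status codes     */
  array cpl{5}  c79 c80 c81 c82 c83;     /* recoded 1=together 2=apart  */
  do i = 1 to 5;
    if raw(i) = 1 then cpl(i) = 1;
    else if raw(i) ge 2 then cpl(i) = 2;
    else if raw(i) = .v and i > 1 then cpl(i) = cpl(i-1); /* carry forward */
    else cpl(i) = .;
  end;
run;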

data new3;
set new3;
if cyear le 1978 then do; couple = .; totinc = .; age_1 = cyear-yob; educ = .; end;
else if cyear = 1979 then do; age_1 = cyear-yob; educ = R0017300; totinc = R0155400; couple = couple79; end;
else if cyear = 1980 then do; age_1 = cyear-yob; educ = R0229200; totinc = R0312300; couple = couple80; end;
else if cyear = 1981 then do; age_1 = cyear-yob; educ = R0417400; totinc = R0482600; couple = couple81; end;
else if cyear = 1982 then do; age_1 = cyear-yob; educ = R0664500; totinc = R0782100; couple = couple82; end;
else if cyear = 1983 then do; age_1 = cyear-yob; educ = R0905900; totinc = R1024000; couple = couple83; end;
else if cyear = 1984 then do; age_1 = cyear-yob; educ = R1205800; totinc = R1410700; couple = couple84; end;
else if cyear = 1985 then do; age_1 = cyear-yob; educ = R1605100; totinc = R1778500; couple = couple85; end;
else if cyear = 1986 then do; age_1 = cyear-yob; educ = R1905600; totinc = R2141600; couple = couple86; end;
else if cyear = 1987 then do; age_1 = cyear-yob; educ = R2306500; totinc = R2350300; couple = couple87; end;
else if cyear = 1988 then do; age_1 = cyear-yob; educ = R2509000; totinc = R2722500; couple = couple88; end;
else if cyear = 1989 then do; age_1 = cyear-yob; educ = R2908100; totinc = R2971400; couple = couple89; end;
else if cyear = 1990 then do; age_1 = cyear-yob; educ = R3110200; totinc = R3279400; couple = couple90; end;
else if cyear = 1991 then do; age_1 = cyear-yob; educ = R3510200; totinc = R3559000; couple = couple91; end;
else if cyear = 1992 then do; age_1 = cyear-yob; educ = R3710200; totinc = R3897100; couple = couple92; end;
else if cyear = 1993 then do; age_1 = cyear-yob; educ = R3710200; totinc = R3897100; couple = couple93; end;
else if cyear = 1994 then do; age_1 = cyear-yob; educ = R4526500; totinc = R4982800; couple = couple94; end;
else if cyear = 1995 then do; age_1 = cyear-yob; educ = R4526500; totinc = R4982800; couple = couple94; end;
else if cyear = 1996 then do; age_1 = cyear-yob; educ = R5221800; totinc = R5626200; couple = couple96; end;
else if cyear = 1997 then do; age_1 = cyear-yob; educ = R5221800; totinc = R5626200; couple = couple96; end;
else if cyear = 1998 then do; age_1 = cyear-yob; educ = R5821800; totinc = R6364600; couple = couple98; end;
run;

proc freq data = new3;
tables race age_1 totinc couple cyear educ parity sex1 couple; run;

File Name: Step-4 RD-NLSY79.sas

* RD re-analysis - based on exposure modeling - crude risk, weighted and unweighted;

proc freq data = new3;
tables post_wt race educ cyear parity couple age_1 totinc/list missing; run;

title 'RD - Conception Year GE 80 le 1998 - unweighted';
proc freq data = new3; where cyear ge 1980 and cyear le 1998;
tables couple*sex1/ riskdiff; run;

title 'RD- All years - unweighted';
proc freq data = new3;
tables couple*sex1/ riskdiff; run;

title 'RD-Conception Year GE 1980 le 98 - weighted';
proc surveyreg data = new3; where cyear ge 1980 and cyear le 1998;
class couple;
cluster caseid; *account for repeated measures;
model sex1 = couple /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year all - weighted';
proc surveyreg data = new3;
class couple;
cluster caseid; *account for repeated measures;
model sex1 = couple /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

*Adjusting for other things;
*Using method in Brumback et al 2010;
*binary outcome is sex of child;
*binary exposure of interest is marital status;
*verified that age and parity are not effect modifiers;
*confounders are income, race, education, year of birth, age and parity;

title 'RD-Conception Year GE 1980 le 1998 - weighted with income';
proc surveyreg data = new3; where cyear ge 1980 and cyear le 1998;
class couple educ race parity;
cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity totinc /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year all - weighted - with income';
proc surveyreg data = new3;
class couple educ race parity;
cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity totinc /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year GE 1980 le 1998 - weighted without income';
proc surveyreg data = new3; where cyear ge 1980 and cyear le 1998;
class couple educ race parity;
cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year all - weighted - without income';
proc surveyreg data = new3;
class couple educ race parity;
cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

/*
proc surveylogistic data = new3; where cyear ge 1991;
class couple educat race totinc1 cyear parity/param=ref;
cluster panel;
model couple = age_1 educat race totinc1 cyear parity / ;
strata col_str;
weight post_wt;
output out=propout (keep=caseid propensity) p = propensity; run;

proc sort data = new3; by caseid; run;
proc sort data = propout; by caseid; run;

proc surveylogistic data = new3; where cyear ge 1991;
class couple educat race totinc1 cyear parity/param=ref;
cluster panel;
model couple = parity age_1;
strata col_str;
weight post_wt;
output out=pout (keep=caseid puncond) p = puncond; run;

proc sort data = pout; by caseid; run;

data stand;
merge new3 propout pout;
by caseid;
if sex1 = 1 then iptw = puncond/propensity;
if sex1 = 2 then iptw = (1-puncond)/(1-propensity);
combinedweight = iptw*post_wt;
run;

proc surveyreg data = stand;
class couple parity;
model sex1 = couple age_1 parity/solution clparm;
strata col_str;
weight combinedweight;
estimate 'together vs apart' couple 1 -1; run;
*/

File Name: table counts.sas

*table counts;
title '80-98';
proc freq data = new3; where cyear ge 1980 and cyear le 1998;
table couple; run;

title 'all';
proc freq data = new3;
table couple; run;

title '80-98';
proc surveyfreq data = new3; where cyear ge 1980 and cyear le 1998;
cluster caseid; *account for repeated measures;
table sex1*couple/CHISQ CHISQ1 COL CV CL CVWT DEFF LRCHISQ LRCHISQ1;
weight post_wt; run;

title 'all';
proc surveyfreq data = new3;
cluster caseid; *account for repeated measures;
table sex1*couple/CHISQ CHISQ1 COL CV CL CVWT DEFF LRCHISQ LRCHISQ1;
weight post_wt; run;
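The CHISQ, CHISQ1, LRCHISQ, and LRCHISQ1 options above request Rao-Scott design-adjusted versions of the Pearson and likelihood-ratio chi-square tests, so the 2x2 test respects the weights and the clustering of repeated births within CASEID rather than treating rows as independent. A minimal free-standing sketch of such a call (the data set and variable names are placeholders, not the thesis file):

* Minimal design-adjusted 2x2 test sketch (placeholder names);
proc surveyfreq data = births;     /* hypothetical analysis file */
  cluster momid;                   /* repeated births per mother  */
  weight wt;                       /* survey weight               */
  table outcome*exposure / chisq chisq1 deff col cv;
run;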

*the formats and data entry are available on request (they are roughly 600 pages long);

File Name: Analysis.sas

proc logistic data = preg3;
model sex1 = fmarcon5; run;

*NSFG analysis;
data all;
merge data2(in=a) data3(in=b);
by CASEID;
if a; run;

data preg;
set all;
IF OUTCOME=1; /* keep only live births */
run;

data all;
set thesis.all;
FORMAT
PREGORDR pregordr. PREGXEN1 pregxenf. PREGXEN2 pregxe0f. PREGXEN3 pregxe1f.
NBRNLV nbrnlv. MULTBRTH multbrth. DATPRGEN datprgen. MON_PREG mon_preg.
WKS_PREG wks_preg. DK2GEST dk2gest. DK1GEST dk1gest. DK3GEST dk3gest.
BABDOB babdob. DIFDOBMB difdobmb. HLPGETPR hlpgetpr. HLPMISCR hlpmiscr.
PRIORSMK priorsmk. POSTSMKS postsmks. NPOSTSMK npostsmk. KNEWPREG knewpreg.
LTRIMEST ltrimest. TRIMESTR trimestr. GETPRENA getprena. BGNPRENA bgnprena.
LPNCTRIM lpnctrim. PNCTRIM pnctrim. PLCPRENA plcprena.
PAYPREN1 payprenf. PAYPREN2 paypre0f. PAYPREN3 paypre1f. PRBPRENA prbprena.
VGBLDFST vgbldfst. VGBLDLST vgbldlst. GESTDBTS gestdbts. ANEMIA anemia.
PREECLMP preeclmp. TOXEMIA toxemia. WEAKCRVX weakcrvx. SWLNANKL swlnankl.
OTHRPROB othrprob. NEWPR newpr. HSPBIRTH hspbirth. WHEREBRN wherebrn.
BIRTHEXP birthexp. PAYBIRT1 paybirtf. PAYBIRT2 paybir0f. PAYBIRT3 paybir1f.
WORKBORN workborn. DIDWORK didwork. MATWEEKS matweeks. WEEKSDK weeksdk.
MATSTART matstart. MATEND matend. MATLEAVE matleave.
VAGORCE1 vagorcef. KIDSSEX1 kidssexf. LBS_BOR1 lbs_borf. OZS_BOR1 ozs_borf.
LOBTHWT1 lobthwtf. LIVEHER1 liveherf. ALIVENW1 alivenwf. WHENDIE1 whendief.
WHENLEF1 whenleff. ANYNURS1 anynursf. RNONUR11 rnonur1f. RNONUR12 rnonur0f.
RNONUR13 rnonur2f. DIFNUR11 difnur1f. DIFNUR12 difnur0f. DIFNUR13 difnur2f.
FEDSOLI1 fedsolif. MON_EAT1 mon_eatf. WKS_EAT1 wks_eatf. DAY_EAT1 day_eatf.
QUITNUR1 quitnurf. MON_QNU1 mon_qnuf. WKS_QNU1 wks_qnuf. DAY_QNU1 day_qnuf.
YSTPNU11 ystpnu1f. YSTPNU12 ystpnu0f. YSTPNU13 ystpnu2f.
DIFSTN11 difstn1f. DIFSTN12 difstn0f. DIFSTN13 difstn2f.
VAGORCE2 vagorc0f. KIDSSEX2 kidsse0f. LBS_BOR2 lbs_bo0f. OZS_BOR2 ozs_bo0f.
LOBTHWT2 lobthw0f. LIVEHER2 livehe0f. ALIVENW2 aliven0f. WHENDIE2 whendi0f.
WHENLEF2 whenle0f. ANYNURS2 anynur0f. RNONUR21 rnonur3f. RNONUR22 rnonur4f.
RNONUR23 rnonur5f. DIFNUR21 difnur3f. DIFNUR22 difnur4f. DIFNUR23 difnur5f.
FEDSOLI2 fedsol0f. MON_EAT2 mon_ea0f. WKS_EAT2 wks_ea0f. DAY_EAT2 day_ea0f.
QUITNUR2 quitnu0f. MON_QNU2 mon_qn0f. WKS_QNU2 wks_qn0f. DAY_QNU2 day_qn0f.
YSTPNU21 ystpnu3f. YSTPNU22 ystpnu4f. YSTPNU23 ystpnu5f.
DIFSTN21 difstn3f. DIFSTN22 difstn4f. DIFSTN23 difstn5f.
PREGEND pregend. GEST_WKS gest_wks. CONCEPT concept.
EVUSEINX evuseinx. EVUSEINT evuseint. EVERSTOP everstop. DATESTOP datestop.
METHUSE1 methusef. METHUSE2 methus0f. METHUSE3 methus1f. METHUSE4 methus2f.
METHUSE5 methus3f. STOPDUSE stopduse. WHYSTOPD whystopd.
WHATMET1 whatmetf. WHATMET2 whatme0f. WHATMET3 whatme1f. WHATMET4 whatme2f.
WHATMET5 whatme3f. EVERMISS evermiss. USENSAFE usensafe. EVERSKIP everskip.
ARESNOUS aresnous. BRESNOUS bresnous. CRESNOUS cresnous. RESNOUSE resnouse.
WANTBOLD wantbold. PROBABE probabe. CNFRMNO cnfrmno. WANTBLD2 wantbldf.
TIMINGOK timingok. TOOSOONY toosoony. TOOSOONM toosoonm. FEELING feeling.
WORRIED worried. INTEFERE intefere. CONFIDNT confidnt. BABYFUN babyfun.
CARRYFUN carryfun. BRAG brag. BODYBAD bodybad. GETPOOR getpoor. HIDIT hidit.
BUYBABE buybabe. HPWNOLD hpwnold. TIMOKHP timokhp. HPAGE hpage.
OUTCOME outcome. PRGLNGTH prglngth. YRPREG yrpreg.
FMAROUT5 fmaroutf. FMAROUT2 fmarou0f. FMARCON5 fmarconf. FMARCON2 fmarco0f.
DELIVERY delivery. LEARNPRG learnprg. PNCAREWK pncarewk. PAYPNC paypnc.
PAYDELIV paydeliv. LBW1 lbw1ffff. LBW2 lbw2ffff. SEX1 sex1ffff. SEX2 sex2ffff.
BFEEDWKS bfeedwks. MATERNLV maternlv. OLDWANTR oldwantr. OLDWANTP oldwantp.
WANTRESP wantresp. WANTPART wantpart. RACE race. HISPANIC hispanic.
HISPRACE hisprace. EDUCAT educat. HIEDUC hieduc. POVERTY poverty.
LABORFOR laborfor. REGION region. METRO metro. RURAL rural. RELIGION religion.
OCCCD0 occcd0ff. MEDICAID medicaid. BRNOUT brnout. STRUS strus.
OUTCOMEI outcomei. PRGLNGTI prglngti. DATEND_I datend_i. YRPREG_I yrpreg_i.
AGEPREGI agepregi. FMAROU5I fmarou5i. FMAROU2I fmarou2i. DATECONI dateconi.
AGECONI ageconi. FMARCO5I fmarco5i. FMARCO2I fmarco2i. DELIVERI deliveri.
PNCAREWI pncarewi. PAYPNC_I paypnc_i. PAYDELII paydelii. LBW1_I lbw1_i.
LBW2_I lbw2_i. SEX1_I sex1_i. SEX2_I sex2_i. BFEEDWKI bfeedwki.
MATERNLI maternli. OLDWR_I oldwr_i. OLDWP_I oldwp_i. WANTRP_I wantrp_i.
WANTPT_I wantpt_i. AGEAPR_I ageapr_i. RACE_I race_i. HISPAN_I hispan_i.
HISPRA_I hispra_i. EDUCAT_I educat_i. HIEDUC_I hieduc_i. POVERT_I povert_i.
LABORF_I laborf_i. REGION_I region_i. RELIGI_I religi_i.
;
run;

*proc contents data = preg;
*run;

data preg;
set all; run;

proc freq data = preg;
table pregordr; run;

proc freq data = preg;
table kidssex1 sex1 kidssex2 outcome*outcomeI pregordr/list missing; run;

proc freq data = preg;
tables agepreg yrpreg caseid;
where yrpreg ge 91; run;

PROC SORT data=preg; BY CASEID PREGORDR; /* sort PREGORDR within CASEID */

DATA LASTPREG;
SET preg;
BY CASEID;
IF LAST.CASEID THEN OUTPUT; /* keep only the last birth for each R */
run;

proc freq data = preg;
tables caseid*cohout/list missing;
where yrpreg ge 91; run;

proc freq data = preg;
tables wnh1die*divdath1/list missing;
where yrpreg ge 91 and outcome = 1; run;

data preg3;
set preg;
where outcome = 1;

*recode missing;
if pregend ge 9997 then pregend = .;
if babdob ge 9997 then babdob = .;
if concept ge 9997 then concept = .;
if babdob > . then pregend = babdob;

mage0 = floor(agecon/100);
mhgc0 = higrade;
cyear = 1900 + yrpreg;
bthord = pregordr;
* if pregend > .;

*if outcome = 1 then liveb = 1; *else liveb = 0;
*if outcome = 2 then abort = 1; *else abort = 0;
*if outcome = 3 then stillb = 1; *else stillb = 0;
*if outcome = 4 then miscar = 1; *else miscar = 0;
*if outcome = 5 then ectopic = 1; *else ectopic = 0;

if outcome = 1 then livgest = prglngth;
gestage = prglngth;

/* RENAME SEX VARIABLE */
csex = sex1;
if race = 1 then black = 1; else black = 0;
if hispanic = 1 then hisp = 1; else hisp = 0;

* array the relationship start and stop dates;
array marstt{5} mardat01 mardat02 mardat03 mardat04 mardat05;
array mardis{5} mardis01 mardis02 mardis03 mardis04 mardis05;

array start{82}
strtogh1 stragnh0 stragnh1 stragnh2 stragnh3
strtogh2 stagh00 stagh01 stagh02 stagh03
strtogh3 stagh10 stagh11 stagh12 stagh13
strtogh4 stagh20 stagh21 stagh22 stagh23
strtogch stragnc0 stragnc1 stragnc2 stragnc3
wnstrtcp
strtoth0 stagb00 stagb01 stagb02 stagb03 stagb04 stagb05
strtoth1 stagb08 stagb09 stagb10 stagb11 stagb12 stagb13
strtoth2 stagb16 stagb17 stagb18 stagb19 stagb20 stagb21
strtoth3 stagb24 stagb25 stagb26 stagb27 stagb28 stagb29
strtoth4 stagb32 stagb33 stagb34 stagb35 stagb36 stagb37
strtoth5 stagb40 stagb41 stagb42 stagb43 stagb44 stagb45
strtoth6 stagb48 stagb49 stagb50 stagb51 stagb52 stagb53
strtoth7 stagb56 stagb57 stagb58 stagb59 stagb60 stagb61;

array stop{89}
wnstoph1 stah100 stah101 stah102 stah103
wnstoph2 stpah00 stpah01 stpah02 stpah03
wnstoph3 stpah10 stpah11 stpah12 stpah13
wnstoph4 stpah20 stpah21 stpah22 stpah23
wnstopch stpagn00 stpagn01 stpagn02 stpagn03
stopoth0 wnstpx00 wnstpx01 wnstpx02 wnstpx03 wnstpx04 wnstpx05 wnstpx06
stopoth1 wnstpx08 wnstpx09 wnstpx10 wnstpx11 wnstpx12 wnstpx13 wnstpx14
stopoth2 wnstpx16 wnstpx17 wnstpx18 wnstpx19 wnstpx20 wnstpx21 wnstpx22
stopoth3 wnstpx24 wnstpx25 wnstpx26 wnstpx27 wnstpx28 wnstpx29 wnstpx30
stopoth4 wnstpx32 wnstpx33 wnstpx34 wnstpx35 wnstpx36 wnstpx37 wnstpx38
stopoth5 wnstpx40 wnstpx41 wnstpx42 wnstpx43 wnstpx44 wnstpx45 wnstpx46
stopoth6 wnstpx48 wnstpx49 wnstpx50 wnstpx51 wnstpx52 wnstpx53 wnstpx54
stopoth7 wnstpx56 wnstpx57 wnstpx58 wnstpx59 wnstpx60 wnstpx61 wnstpx62;

array ostop{15}
wnh1die divdath1 wnstph1
wnhxdie0 divdath2 wnstphx0
wnhxdie1 divdath3 wnstphx1
wnhxdie2 divdath4 wnstphx2
wnchdie divdatch wnstpch;

/* convert missing data values */
do i = 1 to 5;  if marstt{i} ge 9997 then marstt{i} = .; end;
do i = 1 to 5;  if mardis{i} ge 9997 then mardis{i} = .; end;
do i = 1 to 82; if start{i} ge 9997 then start{i} = .; end;
do i = 1 to 15; if ostop{i} ge 9997 then ostop{i} = .; end;
do i = 1 to 89; if stop{i} ge 9997 then stop{i} = .; end;

/* drop dates of events which occurred after child's birth date */
/* (This strategy is for live births only) */
/* CHANGE THIS STEP TO KEEP DATES AFTER CHILDS BIRTH */
do i = 1 to 5; if pregend > . and pregend < marstt{i} then marstt{i} = .; end;
do i = 1 to 5; if pregend > . and pregend < mardis{i} then mardis{i} = .; end;
do i = 1 to 82; if pregend > . and pregend < start{i} then start{i} = .; end;
do i = 1 to 89; if pregend > . and pregend < stop{i} then stop{i} = .; end;
do i = 1 to 15; if pregend > . and pregend < ostop{i} then ostop{i} = .; end;

/* select most recent cohab before child's date of birth -------------\
| if there is a cohab without stop before birth: use cohab date       |
| if no cohab, look for a marriage before birth                       |
| use marriage dates to establish 3-level categories                  |
| for now: deaths, divorces don't count (ostop_)                      |
\--------------------------------------------------------------------*/

lastmar = max(of mardat01 mardat02 mardat03 mardat04 mardat05);
lastdis = max(of mardis01 mardis02 mardis03 mardis04 mardis05);

lastcoh = max(of strtogh1 stragnh0 stragnh1 stragnh2 stragnh3
strtogh2 stagh00 stagh01 stagh02 stagh03
strtogh3 stagh10 stagh11 stagh12 stagh13
strtogh4 stagh20 stagh21 stagh22 stagh23
strtogch stragnc0 stragnc1 stragnc2 stragnc3
wnstrtcp
strtoth0 stagb00 stagb01 stagb02 stagb03 stagb04 stagb05
strtoth1 stagb08 stagb09 stagb10 stagb11 stagb12 stagb13
strtoth2 stagb16 stagb17 stagb18 stagb19 stagb20 stagb21
strtoth3 stagb24 stagb25 stagb26 stagb27 stagb28 stagb29
strtoth4 stagb32 stagb33 stagb34 stagb35 stagb36 stagb37
strtoth5 stagb40 stagb41 stagb42 stagb43 stagb44 stagb45
strtoth6 stagb48 stagb49 stagb50 stagb51 stagb52 stagb53
strtoth7 stagb56 stagb57 stagb58 stagb59 stagb60 stagb61
lastmar);

laststop = max(of wnstoph1 stah100 stah101 stah102 stah103
wnstoph2 stpah00 stpah01 stpah02 stpah03
wnstoph3 stpah10 stpah11 stpah12 stpah13
wnstoph4 stpah20 stpah21 stpah22 stpah23
wnstopch stpagn00 stpagn01 stpagn02 stpagn03
stopoth0 wnstpx00 wnstpx01 wnstpx02 wnstpx03 wnstpx04 wnstpx05 wnstpx06
stopoth1 wnstpx08 wnstpx09 wnstpx10 wnstpx11 wnstpx12 wnstpx13 wnstpx14
stopoth2 wnstpx16 wnstpx17 wnstpx18 wnstpx19 wnstpx20 wnstpx21 wnstpx22
stopoth3 wnstpx24 wnstpx25 wnstpx26 wnstpx27 wnstpx28 wnstpx29 wnstpx30
stopoth4 wnstpx32 wnstpx33 wnstpx34 wnstpx35 wnstpx36 wnstpx37 wnstpx38
stopoth5 wnstpx40 wnstpx41 wnstpx42 wnstpx43 wnstpx44 wnstpx45 wnstpx46
stopoth6 wnstpx48 wnstpx49 wnstpx50 wnstpx51 wnstpx52 wnstpx53 wnstpx54
stopoth7 wnstpx56 wnstpx57 wnstpx58 wnstpx59 wnstpx60 wnstpx61 wnstpx62);

lastotstop = max(of wnh1die divdath1 wnstph1
wnhxdie0 divdath2 wnstphx0
wnhxdie1 divdath3 wnstphx1
wnhxdie2 divdath4 wnstphx2
wnchdie divdatch wnstpch);

/* status at conception and birth: ----\
| fstat0  = status at birth            |
| fstat00 = status at conception       |
| xstart  = start date cohab/mar       |
| marst0  = married at birth           |
| cohab0  = cohab. at birth            |
| marst00 = married at conc'n          |
| cohab00 = cohab. at conc'n           |
\--------------------------------------*/

/*if lastmar > lastdis then marst0 = 1; else marst0 = 0;
if lastcoh > laststop then cohab0 = 1; else cohab0 = 0;
if fmarout5 = 1 then fstat0 = 2;
else if marst0 = 1 then fstat0 = 2;
else if cohab0 = 1 then fstat0 = 1;
else fstat0 = 0;
*/

/*-----------------------------------------------------\
| does this method misattribute some intervals in      |
| relation to conception? check                        |
\------------------------------------------------------*/

if lastmar = lastcoh then flag1 = 1; else flag1 = 0;
if lastdis = laststop then flag2 = 1; else flag2 = 0;
if lastotstop = lastdis and lastdis = laststop and lastotstop = . then flag3 = 1;
else if lastotstop = lastdis and lastdis = laststop then flag3 = 2;
else flag3 = 0;

if flag1 = 1 then startdate = lastmar;
if flag3 = 1 then enddate = 999999;

data preg4;
set preg3;
where outcome = 1;

if flag1 = 1 then startdate = lastmar;
else startdate = max(of lastmar lastcoh);

enddate = max(of lastdis lastotstop laststop);

if enddate = . then enddate = 99999;
if startdate = . then startdate = 99999;

if flag1 = 1 and enddate > startdate and startdate > concept then status00 = 1;
else if flag1 = 1 and enddate > startdate and concept > enddate then status00 = 2;
else if flag1 = 1 and enddate > concept and concept > startdate then status00 = 3;
else if flag1 = 1 and startdate > enddate and concept > startdate then status00 = 4;
else if flag1 = 1 and concept > enddate and startdate > enddate then status00 = 5;
else if flag1 = 1 and enddate > concept and startdate > enddate then status00 = 6;
else if flag1 = 1 and enddate = concept then status00 = 7;
else if startdate = 99999 and enddate = 99999 then status00 = 8;
else if flag1 = 1 and startdate = concept and enddate > startdate then status00 = 9;
else if flag1 ne 1 and startdate = concept and enddate > startdate then status00 = 20;

*now for other cohabit;
else if flag1 = 0 and lastcoh = max(of lastmar lastcoh) and enddate > startdate and startdate > concept then status00 = 11;
else if flag1 = 0 and lastcoh = max(of lastmar lastcoh) and enddate > startdate and concept > enddate then status00 = 12;
else if flag1 = 0 and lastcoh = max(of lastmar lastcoh) and enddate > concept and concept > startdate then status00 = 13;
else if flag1 = 0 and lastcoh = max(of lastmar lastcoh) and startdate > enddate and concept > startdate then status00 = 14;
else if flag1 = 0 and lastcoh = max(of lastmar lastcoh) and concept > enddate and startdate > enddate then status00 = 15;
else if flag1 = 0 and lastcoh = max(of lastmar lastcoh) and enddate > concept and startdate > enddate then status00 = 16;
else if flag1 = 0 and lastcoh = max(of lastmar lastcoh) and enddate = concept then status00 = 17;

*add in the funny cases where start and end date are the same;
else if startdate = enddate and concept > enddate then status00 = 18;
else if startdate = enddate and enddate > concept then status00 = 19;

if lastcoh = max(of lastmar lastcoh) and flag1 = 0 then flag4 = 1;
run;

*now regroup all the cases into cohabit or not;
data preg5;
set preg4;
if status00 in (3,4,6,9) then marstat00 = 1;            *married;
else if status00 in (13,14,16,20) then marstat00 = 2;   *cohabitating;
else if status00 in (8) then marstat00 = 3;             *not married;
else marstat00 = 0;                                     *other;

if concept = . then marstat00 = .;

if lastmar > lastdis and concept > lastmar then marst00 = 1;
else if lastdis > concept and concept > lastmar then marst00 = 7;
else if concept = . or pregend = . then marst00 = .;
else marst00 = 0;

if lastcoh > laststop and concept > lastcoh then cohab00 = 1;
*next was wrong;
else if lastcoh > concept and concept > laststop then cohab00 = 10;
*should be;
*else if laststop > concept and concept > lastcoh then cohab00 = 7;
else if concept = . or pregend = . then cohab00 = .;
else cohab00 = 0;

**new part with the other endings such as death, divorce or separation;
if lastotstop > lastmar and concept > lastotstop then marst00 = 3;
if lastotstop > lastcoh and concept > lastotstop then cohab00 = 3;

**if they are the same day;
if laststop = concept then cohab00 = 4;
if lastdis = concept then marst00 = 4;
run;

proc freq data = preg5;
tables marstat00*status00/list missing;
run;
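To make the twenty-category classification concrete: a hypothetical record (the dataset name check0 and the century-month values below are invented for illustration) with a marriage that is the most recent union (flag1 = 1), no recorded dissolution (so enddate defaults to 99999), and conception after the start date falls into status00 = 3, which the regrouping above maps to marstat00 = 1 (married at conception).

data check0;
flag1 = 1; startdate = 199; enddate = 99999; concept = 210;
if flag1 = 1 and enddate > startdate and startdate > concept then status00 = 1;
else if flag1 = 1 and enddate > concept and concept > startdate then status00 = 3;
put status00=;   /* prints status00=3 */
run;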

*check the first bit of recode;
data next;
set preg;
if fmarcon5 = 1 then fstat00 = 2;
else if marst00 = 1 then fstat00 = 2;
else if cohab00 = 1 then fstat00 = 1;
else fstat00 = 0;

/* create start date for most recent start & stop before conception */

if fstat0 = 2 and lastcoh > laststop then xstart = min(of lastmar lastcoh);
else if fstat0 = 2 and laststop ge lastcoh then xstart = lastmar;
else if fstat0 = 1 then xstart = lastcoh;
else xstart = .;

cdob = pregend;

/* create categories of relationship status before birth,
   using the nsfg variable indicating formal marital status:
     fastat_: 2 = married
              1 = cohabiting
              0 = neither married nor cohabiting
     cdob    = date of birth
     concept = date of conception
   using the new variable indicating marriage or cohab status:
     cohint = interval from last start date to child's birth
     cohcon = interval from last start date to child's conception */

/* check the programming here: n for fastat0 is > n for fastat00 */

if fmarout5 = 1 then fastat0 = 2;
else if fstat0 > 0 then fastat0 = 1;
else fastat0 = 0;

if fmarcon5 = 1 then fastat00 = 2;
else if fstat00 > 0 then fastat00 = 1;
else fastat00 = 0;

/* create intervals from most recent start & stop before conception;
   dates of marriage & cohab, dob & doc are in century months */

/* define cohint, mbthint, cohcon: */

if fstat0 ge 1 then cohint = cdob - xstart;
else if cdob > . then cohint = -10;

if lastmar > lastdis then mbthint = cdob - lastmar;
else if lastdis > concept then mbthint = cdob - lastmar;

if fstat0 ge 1 then cohcon = concept - xstart;
else if concept > . then cohcon = -10;

if fstat0 = 0 then oc0 = 0; else if fstat0 > 0 then oc0 = 1;
if fstat00 = 0 then oc00 = 0; else if fstat00 > 0 then oc00 = 1;

if fastat0 = 0 then of0 = 0; else if fastat0 > 0 then of0 = 1;
if fastat00 = 0 then of00 = 0; else if fastat00 > 0 then of00 = 1;

if fstat0 = 2 then ms0 = 1; else ms0 = 0;
if fstat0 = 1 then co0 = 1; else co0 = 0;
if fstat00 = 2 then ms00 = 1; else ms00 = 0;
if fstat00 = 1 then co00 = 1; else co00 = 0;

*if wantr = 1 then want = 1;
*else if wantr > 1 then want = 0;

if agecon > 1800 then age2 = 2;
else if agecon > . then age2 = 1;

/* Create "am__" series (actual married):
   derived from reported marriage dates; parallels nlsy;
   there may be some problems with this series - check */

/* define am__ from marriage dates: */

if mbthint ge 1 then am01 = 1; else am01 = 0;
if mbthint ge 2 then am02 = 1; else am02 = 0;
if mbthint ge 3 then am03 = 1; else am03 = 0;
if mbthint ge 4 then am04 = 1; else am04 = 0;
if mbthint ge 5 then am05 = 1; else am05 = 0;
if mbthint ge 6 then am06 = 1; else am06 = 0;
if mbthint ge 7 then am07 = 1; else am07 = 0;
if mbthint ge 8 then am08 = 1; else am08 = 0;
if mbthint ge 9 then am09 = 1; else am09 = 0;
if mbthint ge 10 then am10 = 1; else am10 = 0;
if mbthint ge 11 then am11 = 1; else am11 = 0;
if mbthint ge 12 then am12 = 1; else am12 = 0;
if mbthint ge 18 then am18 = 1; else am18 = 0;
if mbthint ge 24 then am24 = 1; else am24 = 0;

/* Create "ac__" series (actual cohabitation):
   derived from cohab. dates; parallels psid;
   also - no variation in this? check programming */

if cohint ge 0 then ac0 = 1; else ac0 = 0;
if cohint ge 1 then ac01 = 1; else ac01 = 0;
if cohint ge 2 then ac02 = 1; else ac02 = 0;
if cohint ge 3 then ac03 = 1; else ac03 = 0;
if cohint ge 4 then ac04 = 1; else ac04 = 0;
if cohint ge 5 then ac05 = 1; else ac05 = 0;
if cohint ge 6 then ac06 = 1; else ac06 = 0;
if cohint ge 7 then ac07 = 1; else ac07 = 0;
if cohint ge 8 then ac08 = 1; else ac08 = 0;
if cohint ge 9 then ac09 = 1; else ac09 = 0;
if cohint ge 10 then ac10 = 1; else ac10 = 0;
if cohint ge 11 then ac11 = 1; else ac11 = 0;
if cohint ge 12 then ac12 = 1; else ac12 = 0;
if cohint ge 18 then ac18 = 1; else ac18 = 0;
if cohint ge 24 then ac24 = 1; else ac24 = 0;

/* Create an indicator based on household status from date of
   conception, using cohcon: */

if cohcon ge 15 then cc15 = 1; else cc15 = 0;
if cohcon ge 12 then cc12 = 1; else cc12 = 0;
if cohcon ge 11 then cc11 = 1; else cc11 = 0;
if cohcon ge 10 then cc10 = 1; else cc10 = 0;
if cohcon ge 9 then cc09 = 1; else cc09 = 0;
if cohcon ge 8 then cc08 = 1; else cc08 = 0;
if cohcon ge 7 then cc07 = 1; else cc07 = 0;
if cohcon ge 6 then cc06 = 1; else cc06 = 0;
if cohcon ge 5 then cc05 = 1; else cc05 = 0;
if cohcon ge 4 then cc04 = 1; else cc04 = 0;
if cohcon ge 3 then cc03 = 1; else cc03 = 0;
if cohcon ge 2 then cc02 = 1; else cc02 = 0;
if cohcon ge 1 then cc01 = 1; else cc01 = 0;
if cohcon ge 0 then cc00 = 1; else cc00 = 0;
if cohcon ge -1 then ccm1 = 1; else ccm1 = 0;
if cohcon ge -2 then ccm2 = 1; else ccm2 = 0;
if cohcon ge -3 then ccm3 = 1; else ccm3 = 0;
if cohcon ge -4 then ccm4 = 1; else ccm4 = 0;
if cohcon ge -5 then ccm5 = 1; else ccm5 = 0;
if cohcon ge -6 then ccm6 = 1; else ccm6 = 0;
if cohcon ge -7 then ccm7 = 1; else ccm7 = 0;
if cohcon ge -8 then ccm8 = 1; else ccm8 = 0;
if cohcon ge -9 then ccm9 = 1; else ccm9 = 0;
if cohcon ge -10 then ccm10 = 1; else ccm10 = 0;

/* use oc__, based on dates from birth */

oc01 = ac01; oc02 = ac02; oc03 = ac03; oc04 = ac04;
oc05 = ac05; oc06 = ac06; oc07 = ac07; oc08 = ac08;
oc09 = ac09; oc10 = ac10; oc11 = ac11; oc12 = ac12;
oc18 = ac18; oc24 = ac24;

pregmnth = cdob - concept;

/* create year dummies; year = 1984 is reference group */

if cyear = 1964 then y64 = 1; else y64 = 0;
if cyear = 1965 then y65 = 1; else y65 = 0;
if cyear = 1966 then y66 = 1; else y66 = 0;
if cyear = 1967 then y67 = 1; else y67 = 0;
if cyear = 1968 then y68 = 1; else y68 = 0;
if cyear = 1969 then y69 = 1; else y69 = 0;
if cyear = 1970 then y70 = 1; else y70 = 0;
if cyear = 1971 then y71 = 1; else y71 = 0;
if cyear = 1972 then y72 = 1; else y72 = 0;
if cyear = 1973 then y73 = 1; else y73 = 0;
if cyear = 1974 then y74 = 1; else y74 = 0;
if cyear = 1975 then y75 = 1; else y75 = 0;
if cyear = 1976 then y76 = 1; else y76 = 0;
if cyear = 1977 then y77 = 1; else y77 = 0;
if cyear = 1978 then y78 = 1; else y78 = 0;
if cyear = 1979 then y79 = 1; else y79 = 0;
if cyear = 1980 then y80 = 1; else y80 = 0;
if cyear = 1981 then y81 = 1; else y81 = 0;
if cyear = 1982 then y82 = 1; else y82 = 0;
if cyear = 1983 then y83 = 1; else y83 = 0;
if cyear = 1984 then y84 = 1; else y84 = 0;
if cyear = 1985 then y85 = 1; else y85 = 0;
if cyear = 1986 then y86 = 1; else y86 = 0;
if cyear = 1987 then y87 = 1; else y87 = 0;
if cyear = 1988 then y88 = 1; else y88 = 0;
if cyear = 1989 then y89 = 1; else y89 = 0;
if cyear = 1990 then y90 = 1; else y90 = 0;
if cyear = 1991 then y91 = 1; else y91 = 0;
if cyear = 1992 then y92 = 1; else y92 = 0;
if cyear = 1993 then y93 = 1; else y93 = 0;
if cyear = 1994 then y94 = 1; else y94 = 0;
if cyear = 1995 then y95 = 1; else y95 = 0;
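The am__, ac__ and cc__ threshold indicators above (and the year dummies) are long runs of parallel if/else statements; an equivalent, more compact construction uses an array of cut-points. The sketch below (the dataset name preg3b and the loop index i are invented for illustration) reproduces the am__ block exactly, including the convention that a missing mbthint yields 0, because a missing value compares low in SAS:

data preg3b;
set preg3;
array am{14} am01-am12 am18 am24;
array cut{14} _temporary_ (1 2 3 4 5 6 7 8 9 10 11 12 18 24);
do i = 1 to 14;
   am{i} = (mbthint ge cut{i});   /* 1/0 indicator; missing mbthint gives 0 */
end;
drop i;
run;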

/* create dummies for decade; 1990's are omitted category */
if cyear > . and cyear < 1970 then d60 = 1; else d60 = 0;
if cyear > 1969 and cyear < 1980 then d70 = 1; else d70 = 0;
if cyear > 1979 and cyear < 1990 then d80 = 1; else d80 = 0;

/* convert from months to trimesters of pregnancy */
if cohint ge 0 and cohint < 4 then q1 = 1; else q1 = 0;
if cohint ge 4 and cohint < 7 then q2 = 1; else q2 = 0;
if cohint ge 7 and cohint < 10 then q3 = 1; else q3 = 0;
if cohint ge 10 and cohint < 13 then q4 = 1; else q4 = 0;
if cohint ge 13 and cohint < 25 then q5 = 1; else q5 = 0;
if cohint ge 25 then q6 = 1; else q6 = 0;

if q1 = 1 then cohq = 1;
else if q2 = 1 then cohq = 2;
else if q3 = 1 then cohq = 3;
else if q4 = 1 then cohq = 4;
else if q5 = 1 then cohq = 5;
else if q6 = 1 then cohq = 6;
else if cohint = -1 then cohq = 0;

oc0 = ac0;

if cyear < 1970 then oc60 = oc00; else oc60 = 0;
if cyear ge 1970 and cyear < 1980 then oc70 = oc00; else oc70 = 0;
if cyear ge 1980 and cyear < 1990 then oc80 = oc00; else oc80 = 0;
if cyear ge 1990 then oc90 = oc00; else oc90 = 0;

povrat0 = poverty/100;
if povrat0 > 1 then pov0 = 0;
else if povrat0 > . then pov0 = 1;
logpov0 = log(povrat0);
cpover = poverty - 249.9;
povsq = cpover**2;

blkyear = black*cyear;
ocyear = oc00*cyear;
blkoc = black*oc00;

if mage0 > 15 then young = 0;
else if mage0 > . then young = 1;
*if multbrth = 0;
run;

proc sort data = preg1; by cyear; run;

proc freq;
table csex*marstat00 fmarcon5*marstat00/missing list;
where cyear ge 1991;
by cyear;
run;

data thesis.all;
set all;
run;


File Name: testing effect modification NSFG.sas

title 'RD-Conception Year GE 1991 - weighted';

proc sort data = thesis.NSFG_preg3;
by parity; run;

proc surveylogistic data = thesis.NSFG_preg3;
where cyear ge 1991;
by parity;
class couple educat race totinc1 cyear parity/param=ref;
cluster panel;
model sex1 = couple;
strata col_str;
weight post_wt;
*output out=propout (keep=caseid propensity) p = propensity;
run;

data thesis.nsfg_preg3;
set thesis.nsfg_preg3;
if age_1 le 20 then agecat = 1;
else if age_1 > 20 and age_1 le 25 then agecat = 2;
else if age_1 > 25 and age_1 le 30 then agecat = 3;
else if age_1 > 30 and age_1 le 35 then agecat = 4;
else if age_1 > 35 and age_1 le 40 then agecat = 5;
else if age_1 > 40 then agecat = 6;
run;

proc freq data = thesis.nsfg_preg3;
tables parity*couple*sex1/cmh riskdiff; run;

proc freq data = thesis.nsfg_preg3;
tables agecat*couple*sex1/cmh riskdiff; run;

proc surveyfreq data = thesis.nsfg_preg3;
weight post_wt;
tables parity*couple*sex1/rschisq; run;

proc surveyfreq data = thesis.nsfg_preg3;
weight post_wt;
tables agecat*couple*sex1/rschisq; run;


PSID

File Name: Step1a weights.sas

*weights;
data weights;
set ind2009er;
CASEID = er30001||er30002;
post_wt = ER33318;
strata = ER31996;
cluster = ER31997;
keep CASEID post_wt strata cluster; run;


File Name: Step 2 -Recode PSID.sas

data new;
set CAH85_09;
CASEID = CAH2||CAH3;
*only live births - so take out adoptions;
where CAH1 = 1;
PARITY = CAH8; if parity ge 98 then parity = .;
if CAH7 = 1 then couple = 1;
else if CAH7 ge 2 and CAH7 le 7 then couple = 2;
else couple = .;
SEXP = CAH4;
SEX1 = CAH11; if sex1 ge 8 then sex1 = .;
CYEAR = CAH13; if cyear ge 9998 then cyear = .;
PYEAR = CAH6; if pyear ge 9998 then pyear = .;
AGE_1 = cyear - pyear;
run;

proc freq data = new;
tables parity sexp sex1 cyear couple cah1 sexp*age_1/list; run;

*there are funny ages for parents as men are included - the ones with
 age le 10 do not have marital status so will not be in any tables;

data new1;
set j128399;
keep v181 er30001 er30002; run;
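A small point about the keys built in these steps: CASEID = er30001||er30002 (and CAH2||CAH3) concatenates numeric variables, so SAS converts each operand to character with the BEST12. format, padding with blanks and writing a conversion note to the log. The merges still align because every file constructs the key the same way; an explicit construction avoids the padding. The version below is a sketch only (the dataset name weights2 and the $12 length are assumptions, chosen to fit a 1968 interview number plus a person number):

data weights2;
set ind2009er;
length caseid $ 12;
/* left-aligned digits joined with an underscore: an unambiguous key */
caseid = catx('_', put(er30001, 5. -l), put(er30002, 3. -l));
run;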

data new1;
set new1;
CASEID = er30001||er30002;
RACE = v181; run;

data new2;
set ind2009er;
keep er30001 er30002
     er30012 er30033 er30057 er30081 er30106 er30130 er30152 er30173
     er30202 er30231 er30268 er30298 er30328 er30358 er30386 er30415
     er30445 er30480 er30515 er30551 er30586 er30622 er30659 ER30705
     ER30750 ER30821 ER30821
     er30010 er30052 er30076 er30100 er30126 er30147 er30169 er30197
     er30226 er30255 er30296 er30326 er30356 er30384 er30413 er30443
     er30478 er30513 er30549 er30584 er30620 er30657 er30703 er30748
     er30820 er33115 er33215 er33315 er33415;
*keep income and race and education;
run;

data new2;
set new2;
CASEID = er30001||er30002; run;

data test;
merge new(in=a) new2;
by caseid; if a; run;

proc freq data = test;
tables er30012 er30033 er30057 er30081 er30106 er30130 er30152 er30173
       er30202 er30231 er30268 er30298 er30328 er30358 er30386 er30415
       er30445 er30480 er30515 er30551 er30586 er30622 er30659 ER30705
       ER30750 ER30821 ER30821; run;

proc freq data = test;
tables er30010 er30052 er30076 er30100 er30126 er30147 er30169 er30197
       er30226 er30255 er30296 er30326 er30356 er30384 er30413 er30443
       er30478 er30513 er30549 er30584 er30620 er30657 er30703 er30748
       er30820 er33115 er33215 er33315 er33415; run;

data new3;
merge test(in=a) new1;
by caseid;
if a;

array educ2 {29} er30010 er30052 er30076 er30100 er30126 er30147 er30169 er30197
                 er30226 er30255 er30296 er30326 er30356 er30384 er30413 er30443
                 er30478 er30513 er30549 er30584 er30620 er30657 er30703 er30748
                 er30820 er33115 er33215 er33315 er33415;

do i = 1 to 29; if educ2{i} ge 98 then educ2{i} = .; end;

if cyear le 1967 then do; educ = .; totinc = .; end;
else if cyear = 1968 then do; educ = ER30010; totinc = ER30012; end;
else if cyear = 1969 then do; educ = ER30010; totinc = ER30033; end;
else if cyear = 1970 then do; educ = ER30052; totinc = ER30057; end;
else if cyear = 1971 then do; educ = ER30076; totinc = ER30081; end;
else if cyear = 1972 then do; educ = ER30100; totinc = ER30106; end;
else if cyear = 1973 then do; educ = ER30126; totinc = ER30130; end;
else if cyear = 1974 then do; educ = ER30147; totinc = ER30152; end;
else if cyear = 1975 then do; educ = ER30169; totinc = ER30173; end;
else if cyear = 1976 then do; educ = ER30197; totinc = ER30202; end;
else if cyear = 1977 then do; educ = ER30226; totinc = ER30231; end;
else if cyear = 1978 then do; educ = ER30255; totinc = ER30268; end;
else if cyear = 1979 then do; educ = ER30296; totinc = ER30298; end;
else if cyear = 1980 then do; educ = ER30326; totinc = ER30328; end;
else if cyear = 1981 then do; educ = ER30356; totinc = ER30358; end;
else if cyear = 1982 then do; educ = ER30384; totinc = ER30386; end;
else if cyear = 1983 then do; educ = ER30413; totinc = ER30415; end;
else if cyear = 1984 then do; educ = ER30443; totinc = ER30445; end;
else if cyear = 1985 then do; educ = ER30478; totinc = ER30480; end;
else if cyear = 1986 then do; educ = ER30513; totinc = ER30515; end;
else if cyear = 1987 then do; educ = ER30549; totinc = ER30551; end;
else if cyear = 1988 then do; educ = ER30584; totinc = ER30586; end;
else if cyear = 1989 then do; educ = ER30620; totinc = ER30622; end;
else if cyear = 1990 then do; educ = ER30657; totinc = ER30659; end;
else if cyear = 1991 then do; educ = ER30703; totinc = ER30705; end;
else if cyear = 1992 then do; educ = ER30748; totinc = ER30750; end;
else if cyear = 1993 then do; educ = ER30820; totinc = ER30821; end;
else if cyear = 1994 then do; educ = ER33115; totinc = .; end;
else if cyear = 1995 then do; educ = ER33215; totinc = .; end;
else if cyear = 1996 then do; educ = ER33315; totinc = .; end;
else if cyear = 1997 then do; educ = ER33415; totinc = .; end;

if race ge 9 then race = .; run;

proc freq data = new3;
tables parity educ totinc race sex1 sexp age_1 cyear couple; run;

data new3;
set new3;
where cyear le 1997; *subset to years where Norberg looked;
run;

data new3;
merge weights new3(in=a);
by caseid;
if a; run;


File Name: Step-4 RD-PSID.sas

*RD re-analysis - based on exposure modeling - crude risk, weighted and unweighted;
*data new3 is all;
*data new4 is only women;

data new4;
set new3;
where sexp = 2; run;

proc freq data = new4;
tables post_wt race educ cyear parity couple age_1 totinc/list missing; run;

title 'RD - Conception Year GE 69 le 1997 - unweighted mothers';
proc freq data = new4;
where cyear ge 1969 and cyear le 1997;
tables couple*sex1/ riskdiff; run;

title 'RD- All years - unweighted mothers';
proc freq data = new4;
tables couple*sex1/ riskdiff; run;

title 'RD-Conception Year GE 1969 le 97 - weighted mothers';
proc surveyreg data = new4;
where cyear ge 1969 and cyear le 1997;
class couple;
strata strata;
cluster cluster caseid; *account for repeated measures;
model sex1 = couple /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year all - weighted mothers';
proc surveyreg data = new4;
class couple;
strata strata;
cluster cluster caseid; *account for repeated measures;
model sex1 = couple /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;
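For the education branch, the year-by-year ladder above collapses to a single array lookup: the 29 variables of the educ2 array correspond, in order, to the 1969 through 1997 waves, with 1968 sharing er30010. The income branch changes format across waves, so the ladder keeps it. This is a sketch only (new3c is an invented dataset name used so as not to disturb the analysis files):

data new3c;
set new3;
array educ2{29} er30010 er30052 er30076 er30100 er30126 er30147 er30169
                er30197 er30226 er30255 er30296 er30326 er30356 er30384
                er30413 er30443 er30478 er30513 er30549 er30584 er30620
                er30657 er30703 er30748 er30820 er33115 er33215 er33315
                er33415;
if cyear = 1968 then educ = er30010;
else if 1969 le cyear le 1997 then educ = educ2{cyear - 1968};
else educ = .;
run;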

title 'RD-Conception Year GE 1970 le 1997 - weighted without income mothers';
proc surveyreg data = new4;
where cyear ge 1969 and cyear le 1997;
class couple educ race parity;
strata strata;
cluster cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year all - weighted without income mothers';
proc surveyreg data = new4;
class couple educ race parity;
strata strata;
cluster cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year GE 1970 le 1997 - weighted with income mothers';
proc surveyreg data = new4;
where cyear ge 1969 and cyear le 1997;
class couple educ race parity;
strata strata;
cluster cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity totinc /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year all - weighted with income mothers';
proc surveyreg data = new4;
class couple educ race parity;
strata strata;
cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity totinc/clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

*both mothers and fathers;

title 'RD - Conception Year GE 69 le 1997 - unweighted both mothers and fathers';
proc freq data = new3;
where cyear ge 1969 and cyear le 1997;
tables couple*sex1/ riskdiff; run;

title 'RD- All years - unweighted both mothers and fathers';
proc freq data = new3;
tables couple*sex1/ riskdiff; run;

title 'RD-Conception Year GE 1969 le 97 - weighted both mothers and fathers';
proc surveyreg data = new3;
where cyear ge 1969 and cyear le 1997;
class couple;
strata strata;
cluster cluster caseid; *account for repeated measures;
model sex1 = couple /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year all - weighted both mothers and fathers';
proc surveyreg data = new3;
class couple;
strata strata;
cluster cluster caseid; *account for repeated measures;
model sex1 = couple /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year GE 1970 le 1997 - weighted without income both mothers and fathers';
proc surveyreg data = new3;
where cyear ge 1969 and cyear le 1997;
class couple educ race parity;
strata strata;
cluster cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year all - weighted without income both mothers and fathers';
proc surveyreg data = new3;
class couple educ race parity;
strata strata;
cluster cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year GE 1970 le 1997 - weighted with income both mothers and fathers';
proc surveyreg data = new3;
where cyear ge 1969 and cyear le 1997;
class couple educ race parity;
strata strata;
cluster cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity totinc /clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

title 'RD-Conception Year all - weighted with income both mothers and fathers';
proc surveyreg data = new3;
class couple educ race parity;
strata strata;
cluster caseid; *account for repeated measures;
model sex1 = couple age_1 educ race cyear parity totinc/clparm solution noint;
weight post_wt;
estimate 'together vs apart' couple 1 -1; run;

*Adjusting for other things;
*Using method in Brumback et al 2010;
*binary outcome is sex of child;
*binary exposure of interest is marital status;
*verified that age and parity are not effect modifiers;
*confounders are income, race, education, year of birth, age and parity;

/*
proc surveylogistic data = new3;
where cyear ge 1991;
class couple educat race totinc1 cyear parity/param=ref;
cluster panel;
model couple = age_1 educat race totinc1 cyear parity;
strata col_str;
weight post_wt;
output out=propout (keep=caseid propensity) p = propensity; run;

proc sort data = new3; by caseid; run;
proc sort data = propout; by caseid; run;

proc surveylogistic data = new3;
where cyear ge 1991;
class couple educat race totinc1 cyear parity/param=ref;
cluster panel;
model couple = parity age_1;
strata col_str;
weight post_wt;
output out=pout (keep=caseid puncond) p = puncond; run;

proc sort data = pout; by caseid; run;

data stand;
merge new3 propout pout;
by caseid;
if sex1 = 1 then iptw = puncond/propensity;
if sex1 = 2 then iptw = (1-puncond)/(1-propensity);
combinedweight = iptw*post_wt;
run;

proc surveyreg data = stand;
class couple parity;
model sex1 = couple age_1 parity/solution clparm;
strata col_str;
weight combinedweight;
estimate 'together vs apart' couple 1 -1;
run;
*/
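The commented-out block implements the standardization of Brumback et al. (2010): one survey-weighted logistic model estimates the conditional probability of the exposure given the confounders (propensity), a second estimates a marginal version (puncond), and their ratio forms a stabilized inverse-probability weight that multiplies the survey weight. As transcribed, the weight is assigned according to the outcome sex1; under the usual IPTW convention it is assigned by the exposure, here couple (1 = together, 2 = apart). A minimal sketch of that conventional form, reusing the names from the block above (stand2 is an invented dataset name):

data stand2;
merge new3 propout pout;
by caseid;
/* stabilized weight: marginal over conditional exposure probability */
if couple = 1 then iptw = puncond/propensity;
else if couple = 2 then iptw = (1-puncond)/(1-propensity);
combinedweight = iptw*post_wt;   /* survey weight times stabilized IPTW */
run;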

File Name: educationincome.sas

/* PSID DATA CENTER *****************************************************
   JOBID             : 128411
   DATA_DOMAIN       : PSID
   USER_WHERE        : NULL
   FILE_TYPE         : All Individuals Data
   OUTPUT_DATA_TYPE  : ASCII
   STATEMENTS        : SAS Statements
   CODEBOOK_TYPE     : PDF
   N_OF_VARIABLES    : 169
   N_OF_OBSERVATIONS : 71285
   MAX_REC_LENGTH    : 494
*************************************************************************/

FILENAME J128411 "C:\Documents and Settings\foxkarl\Desktop\PSID\EducInc\J128411.txt";

DATA educinc;
ATTRIB
ER30001 FORMAT=F4. LABEL="1968 INTERVIEW NUMBER"
ER30002 FORMAT=F3. LABEL="PERSON NUMBER 68"
ER30003 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 68"
ER30010 FORMAT=F2. LABEL="YRS SCHL COMPL 68"
ER30012 FORMAT=F4. LABEL="MONEY INCOME IND 68"
ER30020 FORMAT=F4. LABEL="1969 INTERVIEW NUMBER"
ER30021 FORMAT=F2. LABEL="SEQUENCE NUMBER 69"
ER30022 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 69"
ER30033 FORMAT=F5. LABEL="MONEY INCOME 69"
ER30043 FORMAT=F4. LABEL="1970 INTERVIEW NUMBER"
ER30044 FORMAT=F2. LABEL="SEQUENCE NUMBER 70"
ER30045 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 70"
ER30052 FORMAT=F2. LABEL="GRADE FINISHED 70"
ER30057 FORMAT=F5. LABEL="MONEY INCOME 70"
ER30067 FORMAT=F4. LABEL="1971 INTERVIEW NUMBER"
ER30068 FORMAT=F2. LABEL="SEQUENCE NUMBER 71"
ER30069 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 71"
ER30076 FORMAT=F2. LABEL="GRADE FINISHED 71"
ER30081 FORMAT=F5. LABEL="MONEY INCOME 71"
ER30091 FORMAT=F4. LABEL="1972 INTERVIEW NUMBER"
ER30092 FORMAT=F2. LABEL="SEQUENCE NUMBER 72"
ER30093 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 72"
ER30100 FORMAT=F2. LABEL="GRADE FINISHED 72"
ER30106 FORMAT=F5. LABEL="MONEY INCOME 72"
ER30117 FORMAT=F4. LABEL="1973 INTERVIEW NUMBER"
ER30118 FORMAT=F2. LABEL="SEQUENCE NUMBER 73"
ER30119 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 73"
ER30126 FORMAT=F2. LABEL="GRADE FINISHED 73"
ER30130 FORMAT=F5. LABEL="MONEY INCOME 73"
ER30138 FORMAT=F4. LABEL="1974 INTERVIEW NUMBER"
ER30139 FORMAT=F2. LABEL="SEQUENCE NUMBER 74"
ER30140 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 74"
ER30147 FORMAT=F2. LABEL="HIGHEST GRADE 74"
ER30152 FORMAT=F5. LABEL="TOTAL MONEY INCOME 74"
ER30160 FORMAT=F4. LABEL="1975 INTERVIEW NUMBER"
ER30161 FORMAT=F2. LABEL="SEQUENCE NUMBER 75"
ER30162 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 75"
ER30169 FORMAT=F2. LABEL="HIGHEST GRAD FINISHED 75"
ER30173 FORMAT=F5. LABEL="TAXABLE INCOME 75"
ER30188 FORMAT=F4. LABEL="1976 INTERVIEW NUMBER"
ER30189 FORMAT=F2. LABEL="SEQUENCE NUMBER 76"
ER30190 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 76"
ER30197 FORMAT=F2. LABEL="HIGHEST GRADE FINISH 76"
ER30202 FORMAT=F5. LABEL="TOTAL TAXABLE Y 76"
ER30217 FORMAT=F4. LABEL="1977 INTERVIEW NUMBER"
ER30218 FORMAT=F2. LABEL="SEQUENCE NUMBER 77"
ER30219 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 77"
ER30226 FORMAT=F2. LABEL="HIGHEST GRADE FINISH 77"
ER30231 FORMAT=F5. LABEL="TOTAL TAXABLE Y 77"
ER30246 FORMAT=F4. LABEL="1978 INTERVIEW NUMBER"
ER30247 FORMAT=F2. LABEL="SEQUENCE NUMBER 78"
ER30248 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 78"
ER30255 FORMAT=F2. LABEL="HIGHEST GRADE FINISH 78"
ER30268 FORMAT=F5. LABEL="TOTAL TAXABLE INCOM 78"
ER30283 FORMAT=F4. LABEL="1979 INTERVIEW NUMBER"
ER30284 FORMAT=F2. LABEL="SEQUENCE NUMBER 79"
ER30285 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 79"
ER30296 FORMAT=F2. LABEL="K49 EDUCATION 79"
ER30298 FORMAT=F5. LABEL="TOT TAXABLE INCOME 79"
ER30313 FORMAT=F4. LABEL="1980 INTERVIEW NUMBER"
ER30314 FORMAT=F2. LABEL="SEQUENCE NUMBER 80"
ER30315 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 80"
ER30326 FORMAT=F2. LABEL="EDUCATION ATTAINED 80"
ER30328 FORMAT=F5. LABEL="1979 TOT TAXABLE Y 80"
ER30343 FORMAT=F4. LABEL="1981 INTERVIEW NUMBER"
ER30344 FORMAT=F2. LABEL="SEQUENCE NUMBER 81"
ER30345 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 81"
ER30356 FORMAT=F2. LABEL="COMPLETED EDUC 81"
ER30358 FORMAT=F5. LABEL="TOT TXBL INCOME 80 81"
ER30373 FORMAT=F4. LABEL="1982 INTERVIEW NUMBER"
ER30374 FORMAT=F2. LABEL="SEQUENCE NUMBER 82"
ER30375 FORMAT=F1. LABEL="RELATIONSHIP TO HEAD 82"
ER30384 FORMAT=F2. LABEL="COMPLETED EDUCATION 82"
ER30386 FORMAT=F5. LABEL="TOT TXBL INCOME 81 82"
ER30399 FORMAT=F4. LABEL="1983 INTERVIEW NUMBER"
ER30400 FORMAT=F2. LABEL="SEQUENCE NUMBER 83"
ER30401 FORMAT=F2. LABEL="RELATIONSHIP TO HEAD 83"
ER30413 FORMAT=F2. LABEL="COMPLETED EDUCATION 83"
ER30415 FORMAT=F6. LABEL="TOT TXBL INCOME 82 83"
ER30429 FORMAT=F4. LABEL="1984 INTERVIEW NUMBER"
ER30430 FORMAT=F2. LABEL="SEQUENCE NUMBER 84"
ER30431 FORMAT=F2. LABEL="RELATIONSHIP TO HEAD 84"
ER30443 FORMAT=F2. LABEL="COMPLETED EDUC 84"
ER30445 FORMAT=F6. LABEL="TOT TXBL INCOME 84"
ER30463 FORMAT=F4. LABEL="1985 INTERVIEW NUMBER"
ER30464 FORMAT=F2. LABEL="SEQUENCE NUMBER 85"
ER30465 FORMAT=F2. LABEL="RELATIONSHIP TO HEAD 85"
ER30478 FORMAT=F2. LABEL="COMPLETED EDUCATION 85"
ER30480 FORMAT=F6. LABEL="TOT TXBL INCOME 85"
ER30498 FORMAT=F4. LABEL="1986 INTERVIEW NUMBER"
ER30499 FORMAT=F2. LABEL="SEQUENCE NUMBER 86"
ER30500 FORMAT=F2. LABEL="RELATIONSHIP TO HEAD 86"
ER30513 FORMAT=F2. LABEL="COMPLETED EDUCATION 86"
ER30515 FORMAT=F6. LABEL="TOT TXBL INCOME 86"
ER30535 FORMAT=F4. LABEL="1987 INTERVIEW NUMBER"
ER30536 FORMAT=F2. LABEL="SEQUENCE NUMBER 87"
ER30537 FORMAT=F2. LABEL="RELATIONSHIP TO HEAD 87"
ER30549 FORMAT=F2. LABEL="COMPLETED EDUCATION 87"
ER30551 FORMAT=F6. LABEL="TOT TXBL INCOME 87"
ER30570 FORMAT=F4. LABEL="1988 INTERVIEW NUMBER"
ER30571 FORMAT=F2. LABEL="SEQUENCE NUMBER 88"
ER30572 FORMAT=F2. LABEL="RELATION TO HEAD 88"
ER30584 FORMAT=F2. LABEL="COMPLETED EDUC-IND 88"
ER30586 FORMAT=F6. LABEL="TOT TXBL INCOME-IND 88"
ER30606 FORMAT=F4. LABEL="1989 INTERVIEW NUMBER"
ER30607 FORMAT=F2. LABEL="SEQUENCE NUMBER 89"
ER30608 FORMAT=F2. LABEL="RELATION TO HEAD 89"
ER30620 FORMAT=F2. LABEL="COMPLETED EDUC-IND 89"
ER30622 FORMAT=F6. LABEL="TOT TXBL INCOME-IND 89"
ER30642 FORMAT=F5. LABEL="1990 INTERVIEW NUMBER"
ER30643 FORMAT=F2. LABEL="SEQUENCE NUMBER 90"
ER30644 FORMAT=F2. LABEL="RELATION TO HEAD 90"
ER30657 FORMAT=F2. LABEL="COMPLETED EDUC-IND 90"
ER30659 FORMAT=F6. LABEL="TOT TXBL INCOME-IND 90"
ER30689 FORMAT=F4. LABEL="1991 INTERVIEW NUMBER"
ER30690 FORMAT=F2. LABEL="SEQUENCE NUMBER 91"
ER30691 FORMAT=F2. LABEL="RELATION TO HEAD 91"
ER30703 FORMAT=F2. LABEL="COMPLETED EDUC-IND 91"
ER30705 FORMAT=F6. LABEL="TOT LABOR INCOME-IND 91"
ER30733 FORMAT=F4. LABEL="1992 INTERVIEW NUMBER"
ER30734 FORMAT=F2. LABEL="SEQUENCE NUMBER 92"
ER30735 FORMAT=F2. LABEL="RELATION TO HEAD 92"
ER30748 FORMAT=F2. LABEL="COMPLETED EDUCATION 92"
ER30750 FORMAT=F6. LABEL="TOT LABOR INCOME 92"
ER30806 FORMAT=F5. LABEL="1993 INTERVIEW NUMBER"
ER30807 FORMAT=F2. LABEL="SEQUENCE NUMBER 93"
ER30808 FORMAT=F2. LABEL="RELATION TO HEAD 93"
ER30820 FORMAT=F2. LABEL="YRS COMPLETED EDUCATION 93"
ER30821 FORMAT=F6. LABEL="TOTAL LABOR INCOME 93"
ER33101 FORMAT=F5. LABEL="1994 INTERVIEW NUMBER"
ER33102 FORMAT=F2. LABEL="SEQUENCE NUMBER 94"
ER33103 FORMAT=F2. LABEL="RELATION TO HEAD 94"
ER33115 FORMAT=F2. LABEL="YRS COMPLETED EDUC 94"
ER33201 FORMAT=F5. LABEL="1995 INTERVIEW NUMBER"
ER33202 FORMAT=F2. LABEL="SEQUENCE NUMBER 95"
ER33203 FORMAT=F2. LABEL="RELATION TO HEAD 95"
ER33215 FORMAT=F2. LABEL="YEARS COMPLETED EDUCATION 95"
ER33301 FORMAT=F4. LABEL="1996 INTERVIEW NUMBER"
ER33302 FORMAT=F2. LABEL="SEQUENCE NUMBER 96"
ER33303 FORMAT=F2. LABEL="RELATION TO HEAD 96"
ER33315 FORMAT=F2. LABEL="YEARS COMPLETED EDUCATION 96"
ER33401 FORMAT=F5. LABEL="1997 INTERVIEW NUMBER"
ER33402 FORMAT=F2. LABEL="SEQUENCE NUMBER 97"
ER33403 FORMAT=F2. LABEL="RELATION TO HEAD 97"
ER33415 FORMAT=F2. LABEL="YEARS COMPLETED EDUCATION 97"
ER33501 FORMAT=F5. LABEL="1999 INTERVIEW NUMBER"
ER33502 FORMAT=F2. LABEL="SEQUENCE NUMBER 99"
ER33503 FORMAT=F2. LABEL="RELATION TO HEAD 99"
ER33516 FORMAT=F2. LABEL="YEARS COMPLETED EDUCATION 99"
ER33601 FORMAT=F4. LABEL="2001 INTERVIEW NUMBER"
ER33602 FORMAT=F2. LABEL="SEQUENCE NUMBER 01"
ER33603 FORMAT=F2. LABEL="RELATION TO HEAD 01"
ER33616 FORMAT=F2. LABEL="YEARS COMPLETED EDUCATION 01"
ER33701 FORMAT=F5. LABEL="2003 INTERVIEW NUMBER"
ER33702 FORMAT=F2. LABEL="SEQUENCE NUMBER 03"
ER33703 FORMAT=F2. LABEL="RELATION TO HEAD 03"
ER33716 FORMAT=F2. LABEL="YEARS COMPLETED EDUCATION 03"
ER33801 FORMAT=F5. LABEL="2005 INTERVIEW NUMBER"
ER33802 FORMAT=F2. LABEL="SEQUENCE NUMBER 05"
ER33803 FORMAT=F2. LABEL="RELATION TO HEAD 05"
ER33817 FORMAT=F2. LABEL="YEARS COMPLETED EDUCATION 05"
ER33901 FORMAT=F5. LABEL="2007 INTERVIEW NUMBER"
ER33902 FORMAT=F2. LABEL="SEQUENCE NUMBER 07"
ER33903 FORMAT=F2. LABEL="RELATION TO HEAD 07"
ER33917 FORMAT=F2. LABEL="YEARS COMPLETED EDUCATION 07"
ER34001 FORMAT=F5. LABEL="2009 INTERVIEW NUMBER"
ER34002 FORMAT=F2. LABEL="SEQUENCE NUMBER 09"
ER34003 FORMAT=F2. LABEL="RELATION TO HEAD 09"
ER34020 FORMAT=F2. LABEL="YEARS COMPLETED EDUCATION 09"
;
INFILE J128411 LRECL = 494;
INPUT
ER30001 1 - 4      ER30002 5 - 7      ER30003 8 - 8      ER30010 9 - 10
ER30012 11 - 14    ER30020 15 - 18    ER30021 19 - 20    ER30022 21 - 21
ER30033 22 - 26    ER30043 27 - 30    ER30044 31 - 32    ER30045 33 - 33
ER30052 34 - 35    ER30057 36 - 40    ER30067 41 - 44    ER30068 45 - 46
ER30069 47 - 47    ER30076 48 - 49    ER30081 50 - 54    ER30091 55 - 58
ER30092 59 - 60    ER30093 61 - 61    ER30100 62 - 63    ER30106 64 - 68
ER30117 69 - 72    ER30118 73 - 74    ER30119 75 - 75    ER30126 76 - 77
ER30130 78 - 82    ER30138 83 - 86    ER30139 87 - 88    ER30140 89 - 89
ER30147 90 - 91    ER30152 92 - 96    ER30160 97 - 100   ER30161 101 - 102
ER30162 103 - 103  ER30169 104 - 105  ER30173 106 - 110  ER30188 111 - 114
ER30189 115 - 116  ER30190 117 - 117  ER30197 118 - 119  ER30202 120 - 124
ER30217 125 - 128  ER30218 129 - 130  ER30219 131 - 131  ER30226 132 - 133
ER30231 134 - 138  ER30246 139 - 142  ER30247 143 - 144  ER30248 145 - 145
ER30255 146 - 147  ER30268 148 - 152  ER30283 153 - 156  ER30284 157 - 158
ER30285 159 - 159  ER30296 160 - 161  ER30298 162 - 166  ER30313 167 - 170
ER30314 171 - 172  ER30315 173 - 173  ER30326 174 - 175  ER30328 176 - 180
ER30343 181 - 184  ER30344 185 - 186  ER30345 187 - 187  ER30356 188 - 189
ER30358 190 - 194  ER30373 195 - 198  ER30374 199 - 200  ER30375 201 - 201
ER30384 202 - 203  ER30386 204 - 208  ER30399 209 - 212  ER30400 213 - 214
ER30401 215 - 216  ER30413 217 - 218  ER30415 219 - 224  ER30429 225 - 228
ER30430 229 - 230  ER30431 231 - 232  ER30443 233 - 234  ER30445 235 - 240
ER30463 241 - 244  ER30464 245 - 246  ER30465 247 - 248  ER30478 249 - 250
ER30480 251 - 256  ER30498 257 - 260  ER30499 261 - 262  ER30500 263 - 264
ER30513 265 - 266  ER30515 267 - 272  ER30535 273 - 276  ER30536 277 - 278
ER30537 279 - 280  ER30549 281 - 282  ER30551 283 - 288  ER30570 289 - 292
ER30571 293 - 294  ER30572 295 - 296  ER30584 297 - 298  ER30586 299 - 304
ER30606 305 - 308  ER30607 309 - 310  ER30608 311 - 312  ER30620 313 - 314
ER30622 315 - 320  ER30642 321 - 325  ER30643 326 - 327  ER30644 328 - 329
ER30657 330 - 331  ER30659 332 - 337  ER30689 338 - 341  ER30690 342 - 343
ER30691 344 - 345  ER30703 346 - 347  ER30705 348 - 353  ER30733 354 - 357
ER30734 358 - 359  ER30735 360 - 361  ER30748 362 - 363  ER30750 364 - 369
ER30806 370 - 374  ER30807 375 - 376  ER30808 377 - 378  ER30820 379 - 380
ER30821 381 - 386  ER33101 387 - 391  ER33102 392 - 393  ER33103 394 - 395
ER33115 396 - 397  ER33201 398 - 402  ER33202 403 - 404  ER33203 405 - 406
ER33215 407 - 408  ER33301 409 - 412  ER33302 413 - 414  ER33303 415 - 416
ER33315 417 - 418  ER33401 419 - 423  ER33402 424 - 425  ER33403 426 - 427
ER33415 428 - 429  ER33501 430 - 434  ER33502 435 - 436  ER33503 437 - 438
ER33516 439 - 440  ER33601 441 - 444  ER33602 445 - 446  ER33603 447 - 448
ER33616 449 - 450  ER33701 451 - 455  ER33702 456 - 457  ER33703 458 - 459
ER33716 460 - 461  ER33801 462 - 466  ER33802 467 - 468  ER33803 469 - 470
ER33817 471 - 472  ER33901 473 - 477  ER33902 478 - 479  ER33903 480 - 481
ER33917 482 - 483  ER34001 484 - 488  ER34002 489 - 490  ER34003 491 - 492
ER34020 493 - 494
;
run;


File Name: ReadinChildbirthfile.sas

/*************************************************************************
   Label           : Childbirth and Adoption History 1985-2009
   Rows            : 120325
   Columns         : 31
   ASCII File Date : May 17, 2011
*************************************************************************/

DATA CAH85_09;
ATTRIB
CAH1  LABEL="RECORD TYPE"                              format=f1.
CAH2  LABEL="1968 INTERVIEW NUMBER OF PARENT"          format=f4.
CAH3  LABEL="PERSON NUMBER OF PARENT"                  format=f3.
CAH4  LABEL="SEX OF PARENT"                            format=f1.
CAH5  LABEL="MONTH PARENT BORN"                        format=f2.
CAH6  LABEL="YEAR PARENT BORN"                         format=f4.
CAH7  LABEL="MARITAL STATUS OF MOTHER WHEN IND BORN"   format=f1.
CAH8  LABEL="BIRTH ORDER"                              format=f2.
CAH9  LABEL="1968 INTERVIEW NUMBER OF CHILD"           format=f4.
CAH10 LABEL="PERSON NUMBER OF CHILD"                   format=f3.
CAH11 LABEL="SEX OF CHILD"                             format=f1.
CAH12 LABEL="MONTH CHILD BORN"                         format=f2.
CAH13 LABEL="YEAR CHILD BORN"                          format=f4.
CAH14 LABEL="BIRTH WEIGHT OF CHILD IN OUNCES"          format=f3.
CAH15 LABEL="STATE IN WHICH CHILD BORN"                format=f2.
CAH16 LABEL="WHERE CHILD WAS WHEN LAST REPORTED"       format=f1.
CAH17 LABEL="MONTH CHILD MOVED OUT OR DIED"            format=f2.
CAH18 LABEL="YEAR CHILD MOVED OUT OR DIED"             format=f4.
CAH19 LABEL="HISPANICITY"                              format=f1.
CAH20 LABEL="RACE OF CHILD, 1ST MENTION"               format=f1.
CAH21 LABEL="RACE OF CHILD, 2ND MENTION"               format=f1.
CAH22 LABEL="RACE OF CHILD, 3RD MENTION"               format=f1.
CAH23 LABEL="PRIMARY ETHNIC GROUP OF CHILD"            format=f1.
CAH24 LABEL="SECONDARY ETHNIC GROUP OF CHILD, 1ST MEN" format=f2.
CAH25 LABEL="SECONDARY ETHNIC GROUP OF CHILD, 2ND MEN" format=f2.
CAH26 LABEL="YR MOST RECENTLY REPORTED NUMBER OF KIDS" format=f4.
CAH27 LABEL="YEAR MOST RECENTLY REPORTED THIS CHILD"   format=f4.
CAH28 LABEL="NUMBER OF NATURAL OR ADOPTED CHILDREN"    format=f2.
CAH29 LABEL="RELATIONSHIP TO ADOPTIVE PARENT"          format=f2.
CAH30 LABEL="NUMBER OF BIRTH OR ADOPTION RECORDS"      format=f2.
CAH31 LABEL="RELEASE NUMBER"                           format=f1.
;
INFILE 'C:\Documents and Settings\foxkarl\Desktop\PSID\CHILDHOOD FILE PSID\CAH85_09.txt' LRECL = 68;
INPUT
CAH1  1 - 1    CAH2  2 - 5    CAH3  6 - 8    CAH4  9 - 9
CAH5  10 - 11  CAH6  12 - 15  CAH7  16 - 16  CAH8  17 - 18
CAH9  19 - 22  CAH10 23 - 25  CAH11 26 - 26  CAH12 27 - 28
CAH13 29 - 32  CAH14 33 - 35  CAH15 36 - 37  CAH16 38 - 38
CAH17 39 - 40  CAH18 41 - 44  CAH19 45 - 45  CAH20 46 - 46
CAH21 47 - 47  CAH22 48 - 48  CAH23 49 - 49  CAH24 50 - 51
CAH25 52 - 53  CAH26 54 - 57  CAH27 58 - 61  CAH28 62 - 63
CAH29 64 - 65  CAH30 66 - 67  CAH31 68 - 68
;
RUN;


File Name: Superpopextravar.sas

data kids;
set new3; run;

proc sort data = kids; by strata; run;

proc surveyfreq data = kids;
by strata;
strata strata;
cluster cluster;
weight post_wt;
table cah11/deff;
ods output oneway=mbystrata; run;

data mbystrata;
set mbystrata;
if CAH11 = 1; run;

proc surveyfreq data = kids;
strata strata;
cluster cluster;
weight post_wt;
tables cah11/deff;
ods output oneway = mall; run;

data mall;
set mall;
where cah11 = 1;
deff = designeffect;
y_hat = percent;
id = 1;
keep deff y_hat id; run;

data mbystrata;
set mbystrata;
where cah11 = 1;
id = 1; run;

*let kh = 2, assume K = 3024, Kh = 56 (if we have 56 strata -
 though there are 74 at time of sampling);
*N pop in 1968 = 200,706,056 - assume roughly equal in strata
 and psu so Nhi = 3,500,000;
*assume simple sample psu;

data var;
merge mall mbystrata; by id; run;

data var;
set var;
kh = 54; K = 3024;
skh = 2;
Nhiyhi = wgtfreq*percent; run;

data var;
set var;
*retain sumNHY 0 sumNh 0 sumtot 0;
sumNHY+NHiyhi;
sumNh+wgtfreq;
sumtot = sumNh*y_hat;
x = _n_; run;

data var2;
set var;
bitone = sumNHY-sumtot;
bittwo = (1/skh)*(1/skh)*bitone;
bit2 = (kh/k)*bittwo*bittwo;

den1 = kh/K*(1/skh);
den2 = sumnh*den1;
den3 = den2*den2;

*num = (1/skh*(sumNHY-sumtot))**2;
*den = (kh/K*(1/skh)*sumnh)**2;

delta = kh/K*bit2/den3;
deltak = delta/K; run;

data var3;
set var2; by id;
if last.id then output;
keep delta; run;

*lets see if we can assume pps where post_wt is actually close
 to pps selection weight;

proc sort data = kids; by strata cluster; run;

proc univariate data = kids;
var post_wt;
by strata cluster; run;

*nope - the weights within cluster seem to vary far too much to use;


File Name: table counts.sas

*table counts;

title '69-97';
proc freq data = new3;
where cyear ge 1969 and cyear le 1997;
table sex1*couple; run;

title 'all';
proc freq data = new3;
table sex1 couple; run;

title '69-97';
proc surveyfreq data = new3;
where cyear ge 1969 and cyear le 1997;
strata strata;
cluster cluster caseid; **account for repeated measures;
table sex1*couple/CHISQ CHISQ1 COL CV CL CVWT DEFF LRCHISQ LRCHISQ1;
weight post_wt; run;

title 'all';
proc surveyfreq data = new3;
strata strata;
cluster cluster caseid; **account for repeated measures;
table sex1*couple/CHISQ CHISQ1 COL CV CL CVWT DEFF LRCHISQ LRCHISQ1;
weight post_wt; run;

title '69-97';
proc surveyfreq data = new3;
where cyear ge 1969 and cyear le 1997;
strata strata;
cluster cluster caseid; **account for repeated measures;
table couple/CHISQ CHISQ1 COL CV CL CVWT DEFF LRCHISQ LRCHISQ1;
weight post_wt; run;

title 'all';
proc surveyfreq data = new3;
strata strata;
cluster cluster caseid; **account for repeated measures;
table couple/CHISQ CHISQ1 COL CV CL CVWT DEFF LRCHISQ LRCHISQ1;
weight post_wt; run;