ASSOCIATIONS OF UNCONVENTIONAL NATURAL GAS DEVELOPMENT WITH ASTHMA EXACERBATIONS AND DEPRESSIVE SYMPTOMS IN PENNSYLVANIA
by Sara Rasmussen
A dissertation submitted to Johns Hopkins University in conformity with the requirements for the degree of Doctor of Philosophy
Baltimore, Maryland April, 2017
Abstract
Background: Unconventional natural gas development (UNGD) has proceeded rapidly
in Pennsylvania, which now accounts for over 25% of the country’s unconventional
natural gas production. UNGD has been associated with air quality and community
impacts.
Objectives: 1) Evaluate associations of UNGD metrics with asthma exacerbations. 2)
Compare different approaches to UNGD activity assessment with one another and in
associations with mild asthma exacerbations. 3) Evaluate associations of UNGD metrics
with depression symptoms. 4) Evaluate whether and how other aspects of UNGD
(impoundments, compressor engines, flaring events) should be incorporated into UNGD
activity metrics.
Methods: The health studies were conducted using electronic data from the Geisinger
Clinic in Pennsylvania. We created UNGD metrics for four phases of well development.
We conducted a nested case-control study comparing asthma patients with
exacerbations to asthma patients without exacerbations from 2005-12. We then
re-evaluated the mild exacerbation associations after replacing our UNGD metrics with
those used in prior studies. We evaluated the association of UNGD metrics with
depression symptoms ascertained from questionnaire data. We identified UNGD-related impoundments, compressor engines, and flaring using crowdsourcing, abstraction of paper records, and satellite data, respectively, and conducted a principal component analysis (PCA) of UNGD metrics created for wells, impoundments and compressor engines.
Results: From the mid-2000s through 2015, 9,669 wells were drilled in Pennsylvania.
We found consistent associations of UNGD metrics with three types of asthma
exacerbations. Comparing UNGD metrics created in different studies, metrics had
different magnitudes of association with mild asthma exacerbations, though the highest category of each metric (vs. the lowest) was associated with the outcome. In the
depression study, the UNGD metric was associated with depression symptoms. We
identified 361 compressor stations, 1,218 impoundments, and 216 locations with flaring
events. The PCA identified that a single component captured most of the variation
between metrics and was approximately an equal mix of the metrics for compressors,
impoundments, and well development.
Conclusions: UNGD metrics were associated with asthma exacerbations and
depression symptoms and were robust to increasing covariate control and in sensitivity
analyses. Determining if these associations are causal requires further research,
including more detailed exposure assessment.
Thesis readers: Brian Schwartz, Karen Bandeen-Roche, Hugh Ellis, and Paul Strickland
Alternates: Jessie Buckley, Mary Fox, and Holly Wilcox
Acknowledgments
I could not have completed this thesis without the support of many, some of whom I would like to thank here. Brian Schwartz has been a tremendous mentor. He provided both freedom to pursue my own ideas and structure to complete this thesis.
Thank you for so generously giving expertise and time, and for teaching me to be an environmental epidemiologist. I am so grateful for the opportunities he has provided me.
I want to thank Hugh Ellis, Karen Bandeen-Roche, Holly Wilcox, Meredith
McCormack, Kirsten Koehler, Paul Strickland, and Anne-Marie Hirsch for their guidance and insight. I also would like to thank Chris Heaney and Ana Navas-Acien for leading journal club, which has been instrumental to my education. Thanks to my fellow Hopkins students for their friendship and support, and thanks to Jon Pollak and Joan Casey for all of their help. Thanks to Cindy Parker for encouraging me to pursue a Ph.D. as a
master’s student and providing mentorship throughout my time at Johns Hopkins. I
would like to thank the Johns Hopkins Bloomberg School of Public Health for providing
me with a great education. Thanks also to the School of Engineering and the IGERT
program, namely Grace Brush and Shahin Zand, for broadening my horizons. This
thesis would not be possible without the collaboration with Geisinger, and I would like to
thank Dione Mercer and Joe DeWalle. I would also like to thank the team at SkyTruth for
their collaboration.
Thank you to my parents, Joan and Per Rasmussen, for their encouragement
from preschool through 22nd grade; and thank you to my family: Lisbeth Rasmussen;
Robert Jacobs, and Andrea, Tony, and Kristin Rogers. Thank you to friends: Nicole Tatz
and Marta Schantz, who spent countless hours with me at every coffee shop in DC, and
Ruth Mandelbaum, Hillary Smith, and Becca Shareff, for a group text that kept me sane.
Finally, thank you to Tim Rogers for his love and support, and for helping me spell all the
hard words in this thesis. Thanks for sticking with me through three degrees.
Table of Contents
Abstract
Acknowledgments
Table of Contents
List of tables
List of figures
List of equations
Chapter 1: Introduction
1.0 Rationale
1.1 Unconventional natural gas terminology
1.2 Unconventional natural gas development in the United States
1.3 Development of a shale gas well
1.4 Environmental and social impacts
1.4.1 Soil impacts
1.4.2 Water impacts
1.4.3 Air impacts
1.4.4 Community and social impacts
1.5 UNGD and health studies
1.5.1 Occupational studies of UNGD
1.5.2 Epidemiology studies of UNGD
1.6 The use of electronic health records for epidemiology studies
1.6.1 Overview of the Geisinger Clinic and its EHR
1.6.2 Environmental epidemiology studies using the Geisinger EHR
1.7 Outcomes selected for study in this thesis
1.7.1 Epidemiology of asthma and asthma exacerbations
1.7.1.1 Overview and definition of asthma exacerbations
1.7.1.2 Asthma exacerbations and air pollution
1.7.1.3 Asthma exacerbations and stress
1.7.2 Epidemiology of depression
1.7.2.1 Association of community characteristics and depression symptoms
1.7.2.2 Association of environmental variables and depressive symptoms
1.7.3 Relationship between depression and asthma
1.8 Specific aims
1.9 References
Chapter 2: Detailed Methods
2.0 Chapter overview
2.1 Data sources
2.1.1 Geisinger Clinic
2.1.2 Pennsylvania Department of Environmental Protection
2.1.2.1 Natural gas wells
2.1.2.2 Compressors
2.1.2.3 Municipal water supply
2.1.3 Pennsylvania Department of Conservation and Natural Resources
2.1.4 U.S. Census data
2.1.5 Federal Highway Administration
2.1.6 U.S. Department of Agriculture National Agricultural Imagery Program
2.1.7 Satellite data
2.1.8 Environmental Protection Agency
2.1.8.1 Air quality monitoring network
2.1.8.2 National Emissions Inventory
2.2 Data acquisition
2.2.1 Geisinger EHR data
2.2.2 Crowdsourced data on impoundments and well pads from SkyTruth
2.2.3 Compressor data
2.3 Data processing
2.3.1 Creation of the unconventional natural gas well dataset
2.3.1.1 Well data sources
2.3.1.2 Inclusion criteria
2.3.1.3 Creation of well variables
2.3.1.3.1 Well latitude and longitude
2.3.1.3.2 Spud date
2.3.1.3.3 Total depth
2.3.1.3.4 Stimulation date
2.3.1.3.5 Production start date and production quantities
2.3.3.1.6 Well pad
2.3.2 Creation of the UNGD-related compressor engine dataset
2.3.2.1 Data abstraction
2.3.2.2 Data checking
2.3.2.3 Creation of a Compressor Station Database
2.4 Selection of study population and outcomes
2.4.1 Asthma study
2.4.1.1 Identification of asthma population
2.4.1.2 Identification of asthma exacerbations
2.4.1.3 Identification and matching of control index dates
2.4.2 Depression symptom study
2.4.2.1 Study population
2.4.2.2 Outcome and mediating variables created from the questionnaires
2.4.2.2.1 Fatigue
2.4.2.2.2 Migraine headache
2.4.2.2.3 Depression symptoms
2.4.2.3 Case and control dates for the disordered sleep outcome
2.5 Exposure study
2.5.1 Creation of the regular grid
2.5.2 Estimation of impoundment start and stop dates
2.6 Geocoding of study population
2.7 Creation of study variables
2.7.1 Covariates created from the electronic health record
2.7.1.1 Sex
2.7.1.2 Age
2.7.1.3 Season
2.7.1.4 Race/ethnicity
2.7.1.5 Smoking
2.7.1.6 Family history
2.7.1.7 Medical Assistance
2.7.1.8 Diabetes
2.7.1.9 Overweight/obesity
2.7.1.10 Alcohol use
2.7.1.11 Anti-depressant use
2.7.2 Covariates created using patients’ coordinates
2.7.2.1 Place type
2.7.2.2 Community socioeconomic deprivation
2.7.2.3 Maximum temperature on prior day
2.7.2.4 Distance to nearest major and minor road
2.7.2.5 Distance to hospital
2.7.2.6 Well water supply
2.7.2.7 Greenness
2.7.3 UNGD activity metrics
2.7.3.1 Durations of phases of well development
2.7.3.2 Assignment of unconventional natural gas activity metrics for wells
2.7.3.3 Assignment of unconventional natural gas activity metrics for impoundments and compressors
2.8 References
Chapter 3: Asthma Exacerbations and Unconventional Natural Gas Development in the Marcellus Shale
3.0 Cover page
3.1 Abstract
3.2 Introduction
3.3 Methods
3.3.1 Study population
3.3.2 Outcome Ascertainment
3.3.3 Controls and Matching
3.3.4 Covariates
3.3.5 Well Data
3.3.6 Activity Metric Assignment
3.3.7 Statistical Analysis
3.3.7.1 Model Building
3.3.7.2 Sensitivity Analyses
3.4 Results
3.4.1 Descriptions of Wells and Patients
3.4.2 Associations of UNGD Activity Metrics with Asthma Outcomes
3.4.3 Sensitivity Analyses
3.5 Discussion
3.6 References
Chapter 4: Associations of unconventional natural gas development with disordered sleep and depression symptoms in Pennsylvania
4.0 Cover Page
4.1 Abstract
4.2 Introduction
4.3 Methods
4.3.1 Survey design and study population
4.3.2 Outcome ascertainment
4.3.2.1 Depression symptoms
4.3.2.2 Disordered sleep diagnoses
4.3.3 Potential mediating variables: migraine and fatigue symptoms
4.3.4 Well data and activity metric assignment
4.3.5 Covariates
4.3.6 Statistical analysis
4.4 Results
4.4.1 Description of study population
4.4.2 Associations of UNGD with depression symptoms
4.4.3 Associations of UNGD with disordered sleep
4.5 Discussion
4.6 Conclusions
4.7 References
Chapter 5: Exposure assessment using secondary data sources in unconventional natural gas development and health studies
5.0 Cover page
5.1 Abstract
5.2 Introduction
5.3 Methods
5.3.1 UNGD-related compressor engines, impoundments, and flaring events in Pennsylvania
5.3.2 Incorporate impoundments and compressor engines into exposure assessment
5.3.3 Comparison of GIS-based metrics and their associations with mild asthma exacerbations
5.4 Results
5.4.1 UNGD-related compressor engines, impoundments, and flaring events in Pennsylvania
5.4.2 PCA applied to wells, compressor stations, and impoundments
5.4.3 Comparison of GIS-based UNGD metrics
5.5 Discussion
5.6 References
Chapter 6: Miscellaneous results
6.1 Additional Results for Chapter 3
6.1.1 Associations of covariates with event status
6.1.1 Race/ethnicity
6.1.2 Family history
6.1.3 Smoking status
6.1.4 Season
6.1.5 Type 2 diabetes
6.1.6 Community socioeconomic deprivation
6.1.7 Maximum temperature on prior day
6.1.7 Distance to nearest roadway
6.1.2 Additional sensitivity analyses
6.1.2.1 Stimulation extrapolation methods
6.1.2.2 Control Encounter Dates
6.1.2.3 Inverse distance squared vs. cubed metric
6.1.2.4 Distance to hospital
6.2 Additional Results for Chapter 4
6.2.1 Associations of covariates with event status
6.2.1.1 Race / ethnicity
6.2.1.2 Sex
6.2.1.3 Age
6.2.1.4 Smoking status
6.2.1.5 Alcohol status
6.1.2.6 Medical assistance
6.2.1.7 Body mass index
6.1.2.8 Community socioeconomic deprivation
6.2.1.9 Well water
6.3 Comparing asthma patients identified in the electronic health record and by self-report
6.3.2 Results ...... 218 6.3.3 Discussion ...... 219 6.4 Unconventional natural gas development and asthma symptom study ...... 222 6.4.1 Survey data ...... 222 6.4.2 Asthma symptom outcomes ...... 223 6.4.3 UNGD metrics ...... 223 6.4.4 Covariates ...... 224 6.4.5 Data analysis ...... 224 6.4.6 Results ...... 226 6.4.7 Discussion ...... 242 6.5 Greenness and asthma exacerbation study ...... 244 6.5.1 Study population and covariates ...... 248 6.5.2 Greenness measurement, and exposure assignment ...... 248 6.5.3 Statistical analysis ...... 248 6.5.4 Results ...... 249 6.5.4 Discussion ...... 250 6.6 Greenness and depression symptoms study ...... 250 6.6.1 Greenness measure ...... 250 6.6.2 Depression symptom data ...... 251 6.6.3 Covariates ...... 251 6.6.4 Data analysis ...... 252 6.6.5 Results ...... 254 6.6.6 Discussion ...... 255 6.7 References...... 256 Chapter 7: Discussion ...... 260 7.1 Summary of findings ...... 260 7.2 Health impacts of energy production and use ...... 262 7.3 Future research directions and policy implications ...... 264 7.3.1 Research opportunities ...... 264 7.3.1.1 Replication in other shale basins ...... 265 7.3.1.2 Improving UNGD exposure assessment in epidemiology studies ...... 265 7.3.1.3 Reducing potential sources of bias in epidemiology studies of UNGD ... 267 7.3.1.4 Employ causal inference methods in studies of UNGD and health ...... 267 7.3.2 Policy implications of studies on UNGD and health ...... 268 7.3.2.1 Improve data collection on UNGD ...... 268 7.3.2.2 Expand air quality monitoring in rural oil and natural gas producing areas269 7.3.2.3 Fund research on UNGD and health ...... 270 7.3.2.4 Incorporate externalities into energy prices ...... 271 7.3.3 Health implications of our research ...... 272 7.4 Final Remarks ...... 272 7.5 References...... 274 Appendix ...... 278 Institutional review board documents ...... 278 Curriculum Vita – Sara G. Rasmussen ...... 294
List of tables
Table 1.4. Description of the phases of Marcellus shale well development.
Table 1.5.2.1 Published epidemiology studies of unconventional natural gas development by other research groups.
Table 1.5.2.2 Published epidemiology studies of unconventional natural gas development by our research group.
Table 1.7.1.2. Epidemiology studies of air pollution and asthma.
Table 1.7.1.3 Studies of psychosocial stress and asthma exacerbations.
Table 1.7.2.2. Epidemiology studies of environmental variables and mental health outcomes.
Table 2.3.3.2.1. Median days from spud to stimulation by year and region, based on the 2013 well dataset
Table 2.3.3.2.2. Median days from stimulation to production start by year and region, based on the 2013 well dataset
Table 2.3.3.2.3. Spud date missingness percent (number) by data set iteration
Table 2.3.3.4. Stimulation date missingness percent (number) by data set iteration
Table 2.3.3.6. Percentage (number) of well pads by data source
Table 2.4.2.2.1. Symptoms included in Patient-Reported Outcomes Measurement Information System fatigue short form 8a.
Table 2.4.2.2.2. Symptoms included in the ID Migraine questionnaire.
Table 2.4.2.2.3. Symptoms included in the Personal Health Questionnaire Depression Scale (PHQ-8) questionnaire.
Table 2.4.2.3. ICD-9 codes used to identify disordered sleep.
Table 2.6. Geocoding level for the asthma and depression study populations.
Table 2.7.1. Variables created from the electronic health record used in health studies.
Table 2.7.1.5.1. Smoking status categories considered as evidence of current smoking
Table 2.7.1.5.2. Procedure codes considered as evidence of smoking
Table 2.7.1.5.3. ICD-9 codes considered as evidence of smoking
Table 2.7.2. Variables created from the electronic health record used in health studies
Table 2.7.2.2. Variables used to create the socioeconomic deprivation index.
Table 2.7.3.2. Spearman correlation coefficient of the drilling metric assigned for different durations and lags for 446 randomly chosen asthma hospitalizations.
Table 3.4.1. Descriptive statistics of cases and controls by exacerbation type for selected study variables by variable type (constant vs. time-varying)
Table 3.4.2. Associations of unconventional natural gas activity metrics and asthma outcomes
Table 4.3.2.2. ICD-9 codes used to identify disordered sleep.
Table 4.3.6. Calculation of sample weights.
Table 4.4.1. Descriptive statistics by depression symptoms.
Table 4.4.2.1. Association of UNGD and depression symptoms in survey multinomial logistic models
Table 4.4.2.2. Association of UNGD and depression symptoms in survey negative binomial models.
Table 4.4.2.3. Association of UNGD (assigned at baseline) and depression symptoms in survey negative binomial models that include migraine or fatigue.
Table 4.4.3. Association between UNGD and sleep deprivation in a survey-weighted generalized estimating equations model.
Table 5.4.2.1. Results of PCA with Percentage of Variation Explained by Component 1 and Component 1 Loadings
Table 6.1.1.1. Odds ratios from oral corticosteroid (mild exacerbation) models.
Table 6.1.1.2. Odds ratios from emergency encounter (moderate exacerbation) models
Table 6.1.2.1. Associations of UNGD stimulation metrics created with extrapolated and sensitivity stimulation dates and asthma hospitalizations
Table 6.1.2.2. Associations of UNGD spud metrics assigned on random encounter dates vs. random dates and asthma hospitalizations
Table 6.1.2.3. Associations of spud activity metrics assigned using distance squared vs. distance cubed
Table 6.1.2.4.1. Median distance to closer Geisinger Hospital by event and event status, km
Table 6.1.2.4.2. Associations of the UNGD spud metric and hospitalization outcome without and with distance to hospital in the model.
Table 6.2.1.1. Exponentiated coefficients from the truncated-weighted negative binomial model.
Table 6.2.1.2. Odds ratios from the truncated-weighted multinomial logistic model.
Table 6.2.1.3. Exponentiated coefficients from the fully-weighted negative binomial model.
Table 6.2.1.4. Odds ratios from the fully-weighted multinomial logistic model.
Table 6.2.1.5. Exponentiated coefficients from the unweighted negative binomial model
Table 6.2.1.6. Odds ratios from the unweighted multinomial logistic model.
Table 6.3. Studies comparing self-reported asthma and asthma in medical record.
Table 6.3.2.1. Classification by the electronic health record asthma algorithm by self-reported asthma.
Table 6.3.2.2. Characteristics of patients with and without EHR and self-reported asthma.
Table 6.4.5 Calculation of survey weights at baseline and follow-up (cells are counts unless otherwise specified).
Table 6.4.6.1. Asthma symptoms and missingness at baseline and follow-up.
Table 6.4.6.2. Association of UNGD metrics with number of asthma symptoms at baseline among patients with and without asthma from adjusted multinomial models
Table 6.4.6.3. Association of UNGD metrics with number of asthma symptoms at follow-up among patients with and without asthma from adjusted multinomial logistic models
Table 6.4.6.4. Association of UNGD metrics with number of asthma symptoms (multinomial) at baseline among patients with and without asthma from adjusted logistic models.
Table 6.5.1. Studies on greenness and asthma prevalence.
Table 6.5.2. Studies on greenness and asthma exacerbations.
Table 6.6.5.1. Association of peak normalized difference vegetation index with depression symptoms among study participants in boroughs in survey negative binomial regressions.
Table 6.6.5.2. Association of peak normalized difference vegetation index with depression symptoms among study participants in townships in survey negative binomial regressions.
Table 6.5.4. Association between UNGD, NDVI, and asthma exacerbations
List of figures
Figure 1.1.1. Formations containing unconventional natural gas
Figure 1.2. U.S. natural gas production, 2000-2015. Data from the U.S. Energy Information Administration.
Figure 1.4. Impacts of Unconventional Natural Gas Development
Figure 1.4.3.1. The spatial extent of the Haynesville Shale in Texas and Louisiana
Figure 1.4.3.2. Modeled ozone impacts from Haynesville shale development compared to baseline.
Figure 1.7. Conceptual diagram of outcomes considered in this thesis.
Figure 2.1.2.1. The well record form.
Figure 2.3.3.2.1. Counties considered eastern and northern for the purposes of well variable imputation and extrapolation.
Figure 2.7.2.3. Weather stations reporting daily maximum temperature between 2005-12 in New York, New Jersey, and Pennsylvania.
Figure 2.7.2.4. Locations of major and minor roads in New York and Pennsylvania.
Figure 2.7.2.6. Public water supply areas in Pennsylvania.
Figure 2.7.3.1. Timeline of well development with estimated durations of each phase.
Figure 3.3.2. Flow diagram for identification of new asthma oral corticosteroid (OCS) medication orders.
Figure 3.4.1.1. Number of developed pads (blue), and spudded (red), stimulated (green), and producing wells (yellow), 2005-12.
Figure 3.4.1.2 The location of spudded wells as of December 2012 and residential location of Geisinger asthma patients.
Figure 3.4.1.3 Locations of cases and controls by quartile of spud activity metric.
Figure 3.4.3. Counties Associated with Asthma Hospitalization Case Status.
Figure 4.2. Relationships among UNGD and moderating, mediating, and outcome variables.
Figure 4.3.2.2. Flow diagram for identification of disordered sleep diagnoses.
Figure 5.3.2. Location of UNG-related impoundments, compressor engines, and UNG wells.
Figure 5.4.1. Total number of drilled unconventional natural gas wells and operating unconventional natural gas related impoundments and compressor engines in Pennsylvania by year.
Figure 6.1.2.1. Locations of Wells with Extrapolated Stimulation Dates.
Figure 6.4.1. Chronic Rhinosinusitis Integrative Studies Program survey design.
Figure 6.5.1. Directed acyclic graph of UNGD, NDVI, and asthma exacerbations.
Figure 6.6.4.1. Peak normalized difference vegetation index in 2014 by place type among study participants.
Figure 6.6.4.1. Lowess smoother and scatter plot of population density (per km2) and peak normalized difference vegetation index in 2014 among study participants in townships.
Figure 6.6.5. Association of peak normalized difference vegetation index with depression symptoms among study participants in townships in adjusted survey negative binomial regressions.
List of equations
Equation 2.7.1.9. BMI formula for adults.
Equation 2.7.3.1 Activity metrics for unconventional natural gas wells
Equation 3.3.6.1. Pad preparation and spud metric.
Equation 3.3.6.2. Stimulation activity metric.
Equation 3.3.6.3. Production activity metric.
Equation 3.3.7. Statistical Model.
Equation 4.3.4. Activity metric.
Equation 5.3.2. Inverse distance squared (IDS) metric.
Equation 5.3.3. Inverse distance metric based on the drilling phase (IDD).
Chapter 1: Introduction
1.0 Rationale
Over the last decade, unconventional natural gas development (UNGD) has
rapidly become a major energy source in the United States. Although UNGD is a major
industrial undertaking with community and environmental impacts, research on the
health effects of UNGD is limited. To help fill in the gaps in the understanding of the
health impacts of UNGD, we completed four primary tasks: 1) evaluation of associations
of UNGD metrics with asthma exacerbations; 2) evaluation of associations of UNGD
metrics with depression symptoms; 3) comparison of different approaches to UNGD
activity assessment with one another and in their associations with mild asthma
exacerbations; and 4) evaluation of whether and how other exposure-relevant aspects of
UNGD, such as impoundments, compressor engines, and flaring events should be
incorporated into UNGD activity metrics.
1.1 Unconventional natural gas terminology
Unconventional natural gas refers to the resource (e.g., the natural gas in shale geologic layers), not the drilling technologies used. Unconventional natural gas
resources include natural gas from shale (natural gas trapped in the shale rock
formations), tight sand (natural gas trapped in sandstone formations), and coalbed
methane (methane stored within coal) (Figure 1.1.1).1 In the United States, most UNGD
extracts natural gas from shale. There are many shale gas plays in the United States
(Figure 1.1.2).
Figure 1.1.1. Formations containing unconventional natural gas.1
Figure 1.1.2. Shale gas plays in the United States.2
1.2 Unconventional natural gas development in the United States
The United States is the first country to produce shale gas on an industrial scale.2,3 Advances in drilling technologies (e.g., horizontal drilling) and the use of
hydraulic fracturing (“fracking”) have allowed for the rapid growth in shale gas
production, from 4% of total United States natural gas production in 2005 to 46% of
production in 2015.2 Since 2008, conventional natural gas production has declined, although the country’s total natural gas production has increased as shale gas production has grown (Figure 1.2). In the United States, the Barnett shale in Texas drove the growth in shale gas production in the early 2000s, but since 2012 the
Marcellus shale in Pennsylvania has been the most productive shale gas play in the
United States.2
Figure 1.2. U.S. natural gas production, 2000-2015. Data from the U.S. Energy Information Administration.2
1.3 Development of a shale gas well
The first step of UNGD is well pad preparation, during which the land (about seven acres) is cleared and materials are brought to the site.4 This requires 1,420-1,975
diesel truck trips per well.5 The well is then drilled vertically and horizontally. The day
drilling begins is known as the spud date. After drilling is completed, the horizontal
portion of the well is perforated. Hydraulic fracturing (“fracking”), also called
stimulation, is the next step. Fracking requires three to seven million gallons of
fluid, which is 90-95% water, 5-10% sand proppant, and 0.1-1% chemical additives
(including friction reducers, biocides, acids, and gelling agents). The 10-20% of the injected fracking fluid that returns to the surface in the first 20 to 30 days is called flowback water, and it is chemically similar to fracking fluids.6 During this time, the gases
produced by the well are either flared off, resulting in pollutants such as carbon
monoxide, nitrogen oxides (NOx), and PM2.5 (particulate matter less than or equal to
2.5 micrometers in aerodynamic diameter), or captured and separated by devices called
green completions into their different components, which can then be sold.7
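As a rough illustration of the volumes these proportions imply, the following Python sketch converts the quoted shares into gallon ranges per well. It is not part of the thesis analysis: the names `TOTAL_GALLONS`, `SHARES_PER_MILLE`, and `volume_range` are hypothetical, and pairing the low share with the low total (and the high share with the high total) is an assumption made only to bound the range.

```python
# Illustrative only: gallon ranges implied by the proportions quoted in the text.
# All names here are hypothetical and are not taken from the thesis.

TOTAL_GALLONS = (3_000_000, 7_000_000)  # fracking fluid per well, low and high

# Shares expressed in parts per thousand so the arithmetic stays in exact integers.
SHARES_PER_MILLE = {
    "water": (900, 950),                         # 90-95%
    "sand proppant": (50, 100),                  # 5-10%
    "chemical additives": (1, 10),               # 0.1-1%
    "flowback in first 20-30 days": (100, 200),  # 10-20% of injected fluid
}


def volume_range(share_low, share_high, total=TOTAL_GALLONS):
    """Return (lowest, highest) gallons: low share of low total, high share of high total."""
    return (total[0] * share_low // 1000, total[1] * share_high // 1000)


if __name__ == "__main__":
    for name, (low, high) in SHARES_PER_MILLE.items():
        low_gal, high_gal = volume_range(low, high)
        print(f"{name}: {low_gal:,} to {high_gal:,} gallons")
```

Under these assumptions, even the smallest component, the chemical additives, amounts to thousands of gallons per well.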
Gas production then begins: gas flows through the enhanced fracture network to the well, and after it arrives at the surface it is separated from other organics and water.6 Next, the gas is compressed using diesel- or natural gas-powered compressor
engines. The gas is then distributed or stored.6 During gas production, water continues
to flow up the well; this is called produced water.8 To date, data suggest that 60% of an average well’s lifetime production of natural gas is produced in the first
year, and 88% by the end of the second year.9
1.4 Environmental and social impacts
UNGD is rapidly changing rural communities into industrial areas. UNGD has
documented air, water, soil, and community impacts, and these impacts vary in scale
from local to global (Figure 1.4).10 Studies on environmental and community impacts are
summarized in the sections below.
The period from the start of pad preparation to the beginning of well production is about
3 months.11 The Pennsylvania Department of Environmental Protection, under
Pennsylvania Code § 78.122, requires companies to submit forms and permits at most
of the stages of well development.12 The required forms and permits, and the
information collected, vary across the steps of well completion (Table 1.4).
Figure 1.4. Impacts of Unconventional Natural Gas Development.10
Table 1.4. Description of the phases of Marcellus shale well development. Abbreviation: Mcf, thousand cubic feet of natural gas.
Phase | Description | Approximate duration | Report | Data available from report
Well permitting |  | -- | Permits Issued Detail Report | Location, operator, permit issue date, unconventional (yes/no)
Well permitting |  |  | Well Location Plat | Location of the proposed well
Pad preparation | Requires tree clearing and building a foundation for the pad | 1 month | -- | --
Well drilling | Drilling commence date is also known as the spud date. Rigs run non-stop during drilling. | 1 month | Directional Survey | Local coordinates that describe the path of the well bore in terms of the total measured depth and the north/south and east/west offsets from the surface hole location
Well drilling |  |  | Spud Data Report | Date, location, operator, well status
Well drilling |  |  | Well Record | The drilling process (drill method, drilling started, drilling complete, completion date) and the physical characteristics of the well (type, depth, cement, casing and tubing)
Well completion: perforation |  | 3 months | Perforation Record (part of Completion Report) | When and where the well was perforated
Well completion: stimulation |  | 1 week | Stimulation Information (part of Completion Report) | Date, average pump rate, average treatment pressure, maximum breakdown pressure, instantaneous shut in pressure, proppant type, and proppant mesh size
Well completion: stimulation |  |  | Stimulation Fluid Additives | A list of all the additives in the stimulation fluid including the Chemical Component % Mass used in the Total Base Fluid
Flowback | 10-20% of the injected hydraulic fracturing volume | 1 month | Statewide Data Download of Waste Data | Waste type, waste quantity, disposal method, name and location and disposal site
Well production | Gas is treated (separated from other organics and water), gas is compressed, and gas is sent to be stored/distributed | -- | Statewide Data Download of Gas Production Data | Quantity (Mcf) and gas production days by 6-month window (Jan-June or July-Dec)
1.4.1 Soil impacts
Soil impacts occur primarily during the drilling phase, which produces cuttings, the
debris from the various geologic layers that are drilled. In 2011, drilling of Marcellus wells
in Pennsylvania produced 789,632 tons of cuttings, which contain low levels of naturally
occurring radioactive materials. The cuttings were primarily disposed of in landfills.6
1.4.2 Water impacts
Much of the opposition to UNGD has cited its potential impacts on ground and surface water.13 Unconventional natural gas wells are drilled as deep as 10,000 feet,
passing through the drinking water aquifers, which are less than 300 feet from the
surface, creating a concern for water contamination.14 Additionally, UNGD can stress the
availability of water and the ability to treat waste water. Each well requires 3-7 million
gallons of fluid for hydraulic fracturing. In 2011, Marcellus wells in Pennsylvania
generated 7,878,587 barrels (330 million gallons) of flowback and 9,065,470 barrels
(380 million gallons) of produced water. Most of the flowback water (90%) was reused
for uses other than road spreading within Pennsylvania. Over half (56%) of the produced
water was reused for uses other than road spreading within Pennsylvania, a quarter
(26%) was disposed of in injection disposal wells in Ohio, and 12% was treated in
industrial waste water treatment facilities in Pennsylvania.6
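As a quick arithmetic check on the volumes reported above, the following Python sketch converts the barrel counts to gallons. It assumes the standard 42 US gallons per barrel, a factor that is not stated in the text, and the function name is illustrative only.

```python
# Assumption: 1 barrel = 42 US gallons (standard oil-field barrel).
GALLONS_PER_BARREL = 42


def barrels_to_million_gallons(barrels: float) -> float:
    """Convert a volume in barrels to millions of US gallons."""
    return barrels * GALLONS_PER_BARREL / 1e6


# Barrel counts reported for Marcellus wells in Pennsylvania in 2011.
flowback = barrels_to_million_gallons(7_878_587)
produced = barrels_to_million_gallons(9_065_470)

print(f"Flowback: {flowback:.1f} million gallons")
print(f"Produced water: {produced:.1f} million gallons")
```

Under this assumption, both converted values fall within one million gallons of the rounded figures given in parentheses in the text.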
Following reports of water contamination in 2008 in Pavillion, Wyoming and 2009 in Dimock, Pennsylvania,8 studies were conducted that documented impacts of UNGD
on ground and surface water quality. Three studies of private water wells in
northeastern Pennsylvania and upstate New York found no evidence that Marcellus
shale formation water or saline fracturing fluids were contaminating drinking water
aquifers, but did find evidence that some water wells within 1 kilometer of a natural gas
well were contaminated with stray methane, ethane, and propane.14-16 The methane
carbon isotopes for wells closer to active drilling were consistent with deeper
thermogenic methane sources (such as the Marcellus and Utica shales), but the carbon isotopes for wells farther away were more likely from biogenic or biogenic/thermogenic sources.15 Unlike methane, ethane and propane do not have biogenic sources and thus
must have arisen from the shale layer. Ethane concentrations were 23 times higher in
water wells within one kilometer of a natural gas well, and all of the water wells with
detectable propane concentrations were within 1 kilometer of a natural gas well.14 The
authors hypothesized that leaky gas well casings, inadequate cement sealings between
the casing and rock, or enhanced connectivity between the shale layer and the shallow
layers due to hydraulic fracturing may have been responsible for the water
contamination.14,15 The only conclusive evidence of groundwater impacts from UNGD
comes from Pavillion, Wyoming, where these same authors demonstrated impacts to groundwater drinking sources by comparing concentrations of major ions in groundwater and stimulation fluids and concluded that the impacts to groundwater in Pavillion were due to hydraulic fracturing.17
Surface water may also be impacted by UNGD, but through different pathways than
groundwater: surface water impacts from UNGD may result from waste
management and well pad preparation. Wastewater from UNGD is high in chloride (Cl-), which wastewater treatment plants may not be effective at removing. A study from
Pennsylvania found that the release of treated flowback and produced water from treatment facilities upstream was associated with an increase in the Cl- concentration in watersheds downstream. Additionally, the study suggested that clearing land for well pad development upstream may have contributed to runoff and increased the
concentration of total suspended solids downstream.4
1.4.3 Air impacts
Each stage of UNGD has air impacts.10 Both direct emissions, such as volatile organic compounds (VOCs), NOx, and PM, as well as ozone, a secondary pollutant
formed from VOCs and NOx in the presence of sunlight, are of concern. During the pad preparation and drilling phases, the trucks that bring materials to the site and the heavy machinery used at the site emit diesel exhaust, which includes PM, NOx, and SOx.
During stimulation, diesel trucks bring water to the site. The water injected into the well returns to the surface with VOCs. Fugitive emissions of VOCs, including benzene, toluene, ethylbenzene and xylenes (BTEX), occur during production. The infrastructure associated with UNGD other than wells may contribute more to emissions than the wells themselves. Two studies estimated that compressor engines contributed the most to emissions (including VOCs, PM, and NOx) from any source related to UNGD in
Pennsylvania.5,18 However, neither study incorporated impoundments because reliable emission factors were not available for them, so impoundments remain an uncharacterized source of emissions.
The spatial scale of the pollutants described above varies tremendously, from hundreds of meters to hundreds of kilometers. Hazardous air pollutants, such as BTEX, drop off quickly with distance from the well site, but likely represent an important exposure for people living very close to a well site.19 Other pollutants have a regional scale. Vinciguerra 2015 attributed an increase in ethane in the Baltimore/Washington area, hundreds of
kilometers from UNGD, to UNGD.20 Studies of UNGD’s impact on ozone, PM2.5, NOx, SO2, and/or
VOCs in the Marcellus shale,18 and in the Eagle Ford,21 Barnett,22 and
Haynesville23 shale plays in Texas, have also demonstrated air quality impacts several counties away from where the natural gas was actually produced. An emissions inventory estimated that, in
2020, Marcellus-related development would contribute 12% of VOCs and 12% of regional NOx in an area covering much of West Virginia and Pennsylvania, and the southern counties in New York.18 In a modeling study of the Haynesville shale in Texas and Louisiana, the authors note that ozone impacts of UNGD in the Haynesville shale
“may extend well outside the immediate vicinity of the Haynesville shale into other
regions of Texas and Louisiana.” When the extent of the Haynesville shale
(Figure 1.4.3.1, from Kemball-Cook 201023) is compared to the extent of ozone impacts (Figure
1.4.3.2, from Kemball-Cook 201023), the episode maximum difference in daily maximum
8-hour ozone for the low and high scenarios (panels C and D in figure 4 [Figure 1.4.3.2])
shows impacts of up to 6 ppb in areas several counties away from shale development.23
Figure 1.4.3.1. The spatial extent of the Haynesville Shale in Texas and Louisiana.23
Figure 1.4.3.2. Modeled ozone impacts from Haynesville shale development compared to baseline. Panels A and B show average daily differences and panels C and D show maximum daily differences; panels A and C are based on a low development scenario and panels B and D are based on a high development scenario.23
Emissions from UNGD also have climate impacts, which, although not the focus
of this research, have implications for public health.10 UNGD has fugitive methane
emissions, and methane is a powerful greenhouse gas, with a 100-year warming
potential over 20 times that of carbon dioxide. There have
been conflicting studies on the magnitude of fugitive methane emissions, some of which
suggest that, over time scales of 25 to 100 years, natural gas produced from UNGD is
worse for climate than coal because of the fugitive emissions.24-27
1.4.4 Community and social impacts
UNGD has many potential impacts on communities, including changes to employment opportunities, community composition, and neighborhood aesthetics.10,28 An
increase in heavy truck traffic is one of the most noticeable changes observed by
residents of communities undergoing UNGD.13 This increased traffic can lead to more
motor vehicle crashes. Counties with high levels of drilling in northeastern Pennsylvania
had 23% higher crash rates than counties with no drilling.29 Two non-peer-reviewed studies that used ecologic designs found increased rates of calls to police,30 arrests for
driving under the influence,30 traffic violations,30 violent crimes,31 cases of sexually
transmitted diseases,31 and traffic fatalities31 in counties with UNGD compared to those
without.
Two studies have evaluated the impact of UNGD on home prices in
Pennsylvania. One found that a house using groundwater and within 2 km of a spudded well lost, on average, $16,059 of value, whereas a house with public
water and within 2 km of a well gained $5,070 on average.32 The second found a 20%
decrease in value for houses with well water and within 0.75 miles of a well permitted
within the last 6 months (approximately $30,000 for the study’s average home price of
$148,401).33
1.5 UNGD and health studies
As discussed above, the construction and operation of shale gas wells present risks to the environment and the community, and these risks can have occupational and environmental health impacts. UNGD may be an example of chronic environmental contamination, a long-lasting environmental exposure that has contextual effects on health through psychosocial stress pathways or by influencing health-related behaviors.34 Psychosocial stress, sleep disruption, low socioeconomic status, exposure
to truck traffic, and exposure to air pollution are all biologically plausible pathways for
UNGD to affect health. There is a small but growing number of health studies on UNGD.
1.5.1 Occupational studies of UNGD
Although there are many potential occupational health risks from UNGD, including injury, exposure to noise, and chemical hazards, there is limited research on the subject to date.10 A NIOSH study evaluated exposure to the sand used for hydraulic fracturing. Inhalation of respirable crystalline silica, a component of sand, is a recognized
cause of silicosis and lung cancer. The study evaluated worker exposure to respirable crystalline silica during hydraulic fracturing at 11 well drilling sites in five shale gas plays, including one site in
the Marcellus shale, and found that in all shale gas plays studied, workers were
exposed to levels above the Occupational Safety and Health Administration’s
permissible exposure limit.35
1.5.2 Epidemiology studies of UNGD
Research on UNGD and health remains limited. While a review article claims to have identified 31 studies on UNGD and health,36 the majority of those studies were
exposure assessment studies with no health outcomes evaluated. There have been eight epidemiology studies published on UNGD and health by other research groups
(Table 1.5.2.1) that have categorized study participants by UNGD, identified a health outcome, and evaluated associations between UNGD and the health outcome.29,37-43
Most of these studies were conducted in Pennsylvania. Two of these studies evaluated birth outcomes, two evaluated cancers, two evaluated symptoms, one evaluated car crashes, and one evaluated inpatient hospitalization rates. Six of the studies used government databases or registries for outcome assessment and two used questionnaires. Sample sizes ranged from very small (72 responders) to very large
(124,842 births).
Several of these studies have important limitations that impact the interpretation of the results. Fryzek 201337 may not have adequately accounted for the latency of the
cancers studied. The four ecologic studies were subject to the ecologic fallacy.29,37,41,42
McKenzie et al. 201439 included primarily conventional wells in their well metric. Saberi
et al. 201443 ascertained information on both UNGD and health outcomes on the same
questionnaire, creating the potential for dependent measurement error.
Concurrently with the articles presented in this thesis, our research group
published two additional studies on UNGD and health outcomes, both of which I
co-authored but which are not presented in this thesis (Table 1.5.2.2). One study evaluated
associations of UNGD with birth outcomes (ascertained from an electronic health record
[EHR])44 and the other with chronic rhinosinusitis (CRS), migraine headache, and fatigue
symptoms, ascertained from a questionnaire.45 Both were conducted in Pennsylvania.
Environmental epidemiology studies need to rank study participants on a gradient of exposure, but in studies of UNGD this is challenging because exposure to
UNGD is not a single exposure, but multiple exposures (Section 1.4). Instead of separately assessing exposure to air quality impacts, noise, light, vibration, flaring events, odors, and stress, studies to date have used geographic information system (GIS)-based proxies that incorporate the distance between study participants’ home addresses and unconventional natural gas wells, and in some cases additional information about well characteristics, to capture multiple potential pathways at once. Of the studies that have assigned a UNGD metric at the individual level, one asked study participants to attribute their health outcome to UNGD, and the remaining five used GIS-based proxies, namely nearest neighbor distance or gravity models. These GIS-based proxies have the benefit of being inexpensive and easy to apply retrospectively. However, they also have limitations. To date, all GIS-based proxies used in UNGD and health studies have included only wells in their UNGD assessment, even though air pollution modeling studies have shown that components of UNGD other than wells, namely impoundments and compressor engines, may contribute significantly to emissions
(Section 1.4.3). Additionally, studies have used different GIS-based UNGD metrics,
which makes understanding what each metric is capturing and comparing results across
studies challenging.
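To make the two families of GIS-based proxies concrete, the Python sketch below computes a nearest neighbor distance and an inverse-distance-squared sum (a simple gravity-type form) for one home address. It is a minimal illustration under stated assumptions, not the metric used in any of the cited studies: the coordinates, the function names, and the 0.1 km distance floor are all hypothetical, and the sketch omits the well characteristics and development phases that some studies incorporated.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; adequate for a sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_well_km(home, wells):
    """Nearest neighbor proxy: distance from the home to the closest well."""
    return min(haversine_km(*home, *well) for well in wells)


def inverse_distance_squared(home, wells, min_km=0.1):
    """Gravity-type proxy: sum of 1/d^2 over all wells.

    min_km is a hypothetical floor that avoids division by zero when a
    well is geocoded to the same point as the home.
    """
    total = 0.0
    for well in wells:
        d = max(haversine_km(*home, *well), min_km)
        total += 1.0 / d ** 2
    return total


# Hypothetical (latitude, longitude) pairs, for illustration only.
home = (41.0, -76.0)
wells = [(41.01, -76.0), (41.5, -76.0)]

print(nearest_well_km(home, wells))
print(inverse_distance_squared(home, wells))
```

The nearest neighbor proxy depends only on the closest well, whereas the inverse-distance-squared sum grows with both the number and the proximity of wells, which is one reason the two kinds of metric can rank the same participants differently.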
While all studies found significant associations of UNGD with at least one health
outcome evaluated, associations were not consistent across studies, even for the same
outcome. For example, three studies evaluated associations of UNGD with
birth weight, and all found different results.39,40,44 Stacy found that UNGD was associated with lower birth weight; Casey also found that UNGD was associated with lower birth weight, but the association was null when year was added to the model; and
McKenzie found that UNGD was associated with increased birth weight. The different results may be due to different confounders included in the models (e.g., only
Casey included year in the model as a confounder), or to the different UNGD metrics used across studies.
Table 1.5.2.1 Published epidemiology studies of unconventional natural gas development by other research groups. Abbreviations: C, cohort; E, ecologic; CS, cross-sectional; PA, Pennsylvania; CO, Colorado; G, government database; Q, questionnaire; UNGD, unconventional natural gas development; UNG, unconventional natural gas; OR, odds ratio; CI, confidence interval; SIR, standardized incidence ratio; RR, rate ratio
First author, year | Study design | Sample size | Location | Well metric | Outcome(s) | Outcome data source | Model(s) | Significant findings
Finkel 201641 | E | 11,508 urinary bladder cancer cases, 6,222 thyroid cancer cases, and 5,061 leukemia cases | PA | County count of producing UNG wells (high vs. low) in six counties | Urinary bladder cancer, thyroid cancer, and leukemia cases | G | Standardized incidence ratios | Counties with higher well counts had higher rates of urinary bladder cancer compared to those without UNGD (e.g. in Washington county from 2008-12, for urinary cancer, SIR 130.7, 95% CI: 104.8, 156.6).
Fryzek 201337 | E | 10,708 childhood cancer cases | PA | Before vs. after drilling by county | Cancer (all cancers, central nervous system tumors, and leukemia) rates by county | G | Standardized incidence ratios | Counties after drilling had a higher rate of central nervous system tumors compared to before drilling (SIR = 1.13; 95% CI, 1.02-1.25).
Graham 201529 | E | 6,432 car crashes | PA | County level high/low drilling | Car crashes | G | Linear generalized estimating equation models | Counties with high drilling were associated with higher vehicle crash rates (e.g., in northern drilling counties vs. non-drilling counties, RR = 1.15, 95% CI: 1.04-1.27).
Jemielita 201542 | E | 92,805 hospitalizations | PA | Wells per zip code; wells per km2 | Inpatient prevalence rates by medical category per zip code | G | Poisson regression models | Cardiology inpatient rates were associated with number of UNG wells per zip code and UNG wells per km2. Neurology inpatient rates were associated with UNG wells per km2 (e.g., for wells per zip code and cardiology inpatient rates RR = 1.0007, p = 0.0007).
McKenzie 201439 | C | 124,842 births | CO | Inverse distance to spudded well, includes conventional and unconventional | Congenital heart defects, neural tube defects, oral clefts, preterm birth, and term low birth weight | G | Linear and logistic regression models | UNGD was associated with increased odds of congenital heart defect and neural tube defects (e.g., for UNGD and congenital heart defects, OR = 1.3, 95% CI: 1.2, 1.5). UNGD was negatively associated with preterm birth and positively associated with fetal growth.
Rabinowitz 201538 | CS | 180 responders reporting on the health status of 492 household members | PA | Distance to well in buffers | Symptoms | Q | Generalized linear mixed model | Proximity to UNG wells was associated with increased odds of dermal and respiratory symptoms (e.g., for skin conditions, OR = 4.1, 95% CI: 1.4, 12.3).
Saberi 201443 | CS | 72 responders | PA | Questionnaire asked whether responder attribute health symptoms to UNGD | Symptoms | Q | Descriptive statistics | Of the 72 responders, 13% attributed a symptom to UNGD.
Stacy 201540 | C | 15,451 births | PA | Inverse distance to spudded well | Birth weight, small for gestational age, and prematurity | G | Linear and logistic regression models | UNGD was associated with lower birth weight and higher odds of small for gestational age (e.g., for small for gestational age, OR = 1.34, 95% CI: 1.10–1.63).
Table 1.5.2.2 Published epidemiology studies of unconventional natural gas development by our research group. Abbreviations: C, cohort; CS, cross-sectional; PA, Pennsylvania; EHR, electronic health record; Q, questionnaire; UNGD, unconventional natural gas development; OR, odds ratio; CI, confidence interval.
First author, year | Study design | Sample size | Location | Well metric | Outcome(s) | Outcome data source | Model(s) | Significant findings
Casey 201644 | C | 9,384 mothers; 10,496 neonates | PA | Inverse distance squared UNGD activity metric incorporating four phases of well development | Term birth weight, preterm birth, low 5 minute Apgar score and small size for gestational age | EHR | Multilevel linear and logistic regression models | UNGD was associated with higher odds of preterm birth and high-risk pregnancy (e.g., high UNGD and preterm birth OR = 1.41, 95% CI: 1.04-1.92).
Tustin 201645 | CS | 7,785 responders | PA | Inverse distance squared UNGD activity metric incorporating four phases of well development | Chronic rhinosinusitis (CRS), migraine headache, and fatigue symptoms | Q | Survey-weighted logistic regression | UNGD was associated with increased odds of CRS and fatigue together, migraine and fatigue together, and all three outcomes together (high UNGD and all three symptoms OR = 1.84, 95% CI: 1.08, 3.14).
1.6 The use of electronic health records for epidemiology studies
EHRs have been rapidly adopted by health systems over the past decade. While they were originally designed for clinical and administrative (e.g., medical billing) purposes, they present an opportunity for epidemiological research. The use of EHRs in epidemiology studies is growing. EHRs can provide a longitudinal data source with large population sizes at much lower costs than cohort studies, although they have important differences compared to traditional primary data collection methods used in most epidemiology studies.46
1.6.1 Overview of the Geisinger Clinic and its EHR
Geisinger is one health system using EHRs for epidemiology research. The
Geisinger Clinic provides primary care services to over 400,000 patients in over 35 counties of central and northeastern Pennsylvania, and includes counties with and without Marcellus shale wells. From 2001 to present, the Geisinger Clinic has been expanding, with an increasing number of hospitals (now around 7 or 8) and outpatient community clinics (now approaching 50). Geisinger has used an EHR since 2001.
Electronic health records capture data at all clinical encounters (inpatient, outpatient, emergency, telephone). The data in the EHR includes information on diagnoses and
ICD-9 codes, vital signs, medications, procedures, and laboratory tests. The EHR also includes sociodemographic information on patients, including address, age, sex, race/ethnicity, Medical Assistance for health insurance (a surrogate for family socioeconomic status), and information on habits such as tobacco and alcohol use.
Importantly, patients who have a Geisinger Clinic primary care provider represent the general population of the region.47 The Geisinger Clinic’s EHR is a powerful source of data for epidemiology studies because it includes a large sample size, can be obtained relatively inexpensively, is longitudinal, has detailed health information, and is already in an electronic form.
1.6.2 Environmental epidemiology studies using the Geisinger EHR
Several environmental epidemiology studies have been conducted using the Geisinger EHR. These include studies evaluating associations of: abandoned coal mines and diabetes,47,48 the built environment and obesity,49-51 industrial food animal
production and methicillin-resistant Staphylococcus aureus,52,53 unconventional natural
gas development and several health outcomes (these studies were conducted concurrently
with those presented in this thesis, as discussed in Section 1.5.2),44,45 greenness and
pregnancy outcomes,54 and several environmental exposures and CRS.55 Many of these
studies have used similar methods. First, patient data were obtained from the EHR. The
study population of cases and controls was identified using the EHR, including by
diagnosis codes, laboratory results, vitals, and medication orders. Patients were
geocoded to their home address. Geographic information systems (GIS) were used to
create exposure metrics. Biostatistical analyses were used to evaluate the association of
exposures and outcomes while taking into account correlations within people, places,
and time. Finally, sensitivity analyses were conducted to evaluate the robustness of
associations.
The use of EHR data in epidemiology studies can present limitations, but many
of these limitations can be overcome. For example, the Geisinger EHR does not contain
information on socioeconomic status, but Medical Assistance, a means-tested program,
can be used as a surrogate. The EHR also does not retain old addresses if patients
move, with only the most recent address recorded. However, addresses can be
compared over different EHR data pulls, and an analysis of address changes revealed
little residential mobility among Geisinger patients.56 The EHR only captures health
conditions for which patients seek care, so conditions that patients can treat over the
counter or for which they do not seek care are not well captured. For these types of
outcomes, the EHR can be supplemented by a questionnaire (e.g., migraine headache,
fatigue, and nasal and sinus symptoms were ascertained from a questionnaire in an analysis of the association of UNGD and these symptoms45).
1.7 Outcomes selected for study in this thesis
This thesis research first focused on asthma exacerbations using EHR data. As associations were discovered in that analysis, data from another NIH-funded study (Tustin et al., discussed in Section 1.5.2) became available that assessed
depression symptoms by questionnaire. These data thus allowed the evaluation of
additional hypotheses regarding UNGD and health pathways. The rationales for
selecting asthma exacerbations as an outcome were: asthma exacerbations are common and severe; patients seek care for them, so they are well captured in an EHR; they can be affected by stress and by air pollution, which we hypothesized were the two primary pathways for UNGD to affect health; and they have a short latency from exposure to care seeking and thus EHR documentation. In contrast, while there is public and scientific concern over carcinogens in drinking water as a result of UNGD,19 the
latency from exposure to carcinogens to development of cancer could be decades. After
completing the study of UNGD and asthma exacerbations (Chapter 3), we selected
depression symptoms and disordered sleep as outcomes in our second epidemiology
study of UNGD. Similar to asthma exacerbations, depression symptoms can be affected
by stress and air pollution. Unlike asthma exacerbations, depression symptoms are not
well captured in an EHR because patients do not always seek care for symptoms, or
patients may take years to seek care. In a survey representative of the United States
population, only 13%, 19.5%, and 35.3% of patients with mild, moderate, and moderately
severe/severe depression symptoms, respectively, reported contacting a mental health professional in the prior 12 months.57 However, we were still able to ascertain depression symptoms in
a population of Geisinger patients because questions on depression symptoms were
available from a questionnaire used for a study of chronic rhinosinusitis epidemiology
that was sent to Geisinger patients in 2014. We wanted to evaluate mediation of the
UNGD-depressive symptoms association by sleep deprivation, but because the association of UNGD with disordered sleep diagnoses had not been previously studied, we also directly evaluated this association. We also wanted to evaluate depression symptoms as a mediator of the UNGD and asthma exacerbation association, but that was not possible because there was insufficient time between the return of the depressive symptom questionnaire and the latest events collected in the EHR data
(Figure 1.7).
Figure 1.7. Conceptual diagram of outcomes considered in this thesis. Blue arrows identify associations evaluated in a study concurrent to, but not included in, this thesis (Tustin 2016), red arrows identify associations evaluated in this thesis, and the yellow arrow identifies an association that could not be evaluated because an insufficient number of events of asthma exacerbations were available in the EHR data at the time of this analysis. Abbreviation: EHR, electronic health record
1.7.1 Epidemiology of asthma and asthma exacerbations
Asthma is a common, chronic disease—in 2010, 25.7 million people in
the United States had asthma, a prevalence of 8.4%.58 Asthma is characterized by
variable and recurring symptoms (including cough, wheezing, shortness of breath, and
chest tightness), reversible airflow obstruction, bronchial hyper-responsiveness, and
underlying inflammation.59,60 In 2009, there were 11.8 million outpatient visits, 2.1 million
emergency department (ED) visits, and 479,300 hospitalizations for asthma.58 There is
no cure for asthma, but asthma can be controlled in most patients.61 Asthma control is
the degree to which asthma symptoms have been reduced by treatment. This includes
the current state of symptoms and the future risk of symptoms. Asthma severity reflects
the level of treatment needed to achieve asthma control.62
1.7.1.1 Overview and definition of asthma exacerbations
The goal of asthma treatment is to control symptoms and prevent exacerbations.
Asthma exacerbations are acute episodes of shortness of breath, cough, wheezing,
and/or chest tightness. Some exacerbations can be managed at home by patients, but
others require emergency medical care or hospitalization.60 Asthma exacerbations are a
strain on healthcare costs and reduce quality of life for patients. The 20% of asthma
patients who have exacerbations requiring hospitalization contribute 80% of all asthma
healthcare costs.59 Asthma exacerbations can cause children to miss school and adults
to miss work.63 We identified asthma exacerbations using the National Institutes of
Health standardized asthma outcomes, defining severe, moderate, and mild exacerbations, respectively, as: (1) an asthma-specific hospitalization,
(2) an asthma-specific ED visit, or (3) a new oral corticosteroid (OCS) medication order.64
We defined an asthma hospitalization or ED visit using the International Classification of
Diseases, 9th Revision, Clinical Modification code for asthma (493.x), and a new OCS
medication order by distinguishing new OCS medication orders from standing orders, and
OCS orders for an asthma exacerbation from those for other diseases (Chapter 3,
Figure 3.3.2).
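The three-level definition above can be sketched in Python as a simple classification rule. This is an illustration only: the record layout, field names, and function names are hypothetical, ICD-9 codes are assumed to be stored as strings with a decimal point, and the two boolean flags stand in for the new-versus-standing and asthma-versus-other-disease determinations that are described in Chapter 3 and are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Encounter:
    """Hypothetical, simplified EHR record used only for this sketch."""
    kind: str                    # "hospitalization", "ed_visit", or "ocs_order"
    icd9: Optional[str] = None   # diagnosis code, e.g. "493.92"
    is_new_order: bool = False   # placeholder: new (not standing) OCS order
    for_asthma: bool = False     # placeholder: OCS ordered for asthma


def is_asthma_code(icd9: Optional[str]) -> bool:
    """True for ICD-9-CM asthma codes (493.x), assuming a decimal-point format."""
    return icd9 is not None and (icd9 == "493" or icd9.startswith("493."))


def classify_exacerbation(enc: Encounter) -> Optional[str]:
    """Return "severe", "moderate", "mild", or None if not an exacerbation."""
    if enc.kind == "hospitalization" and is_asthma_code(enc.icd9):
        return "severe"
    if enc.kind == "ed_visit" and is_asthma_code(enc.icd9):
        return "moderate"
    if enc.kind == "ocs_order" and enc.is_new_order and enc.for_asthma:
        return "mild"
    return None


print(classify_exacerbation(Encounter("ed_visit", icd9="493.02")))
```

In practice the OCS step is the hardest to implement, because medication orders in an EHR do not by themselves indicate whether an order is new or what condition it treats.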
1.7.1.2 Asthma exacerbations and air pollution
Outdoor air pollution is a recognized cause of asthma exacerbations, as pollutants can cause airway inflammation. There is a large body of literature linking asthma exacerbations to acute and chronic exposure to air pollutants, in particular ozone and PM, but also nitrogen dioxide and sulfur dioxide.60 Epidemiology studies have found
an association between low levels of air pollutants and asthma exacerbations (including
hospital and emergency visits and medication use) (Table 1.7.1.2).65-83 Many studies of
asthma and air pollution have evaluated the effect of an increase in ambient levels of air pollutants on rates of asthma hospitalizations or emergency room visits, and have found significant associations. These studies have been conducted in many countries and often use an ecologic time series design to evaluate the short-term effects of air pollution on
asthma exacerbations.
Table 1.7.1.2. Epidemiology studies of air pollution and asthma. Abbreviations: PM, particulate matter; NOx, nitrogen oxides; CO, carbon monoxide; SOx, sulfur oxides
The six pollutant columns (Ozone through SOx) report the association with a measure of asthma outcome (a).

| Author, Year | Ref. | Design | Sample Size | Outcomes (b) | Confounders (c) | Exposure to Air Pollution Assigned on an Area Level (A) or on Individual Level (I) | Duration and Lag Used for Exposure (d) | Country | Population (e) | Ozone | PM2.5 | PM10 | NOx | CO | SOx |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Abe 2007 | 83 | Time series | 6447 events | H | M | A | 0, 1 | Japan | A | -- | -- | -- | null | null | null |
| Barnett 2005 | 65 | Meta-analysis | Not reported | H | M, S | A | 0, 1 | Australia and New Zealand | C | null | null | null | yes, expected | null | null |
| Chew 2001 | 66 | Time series | 23,373 child ER visits; 6939 child hospital admissions; 22,277 adolescent ER visits; 5,478 adolescent hospital admissions | H | M | A | 0, 1, 2 | Singapore | C | null | -- | -- | yes, expected | -- | yes, expected |
| Clark 2010 | 67 | Nested case–control | 3,482 cases; 17,410 controls | H, P | Parity, breast-feeding, income quintile (neighborhood level), maternal education status (neighborhood level), birth weight, and gestational length | I | Sum over gestation; sum over first year of life | Canada | C | null | yes, expected | yes, expected | yes, expected | yes, expected | yes, expected |
| Delfino 2003 | 84 | Panel study | 1123 person days | S | D, I, M | A | 0, 1, 2, 3, 4 | USA | C | yes, expected | -- | yes, expected | yes, expected | null | yes, expected |
| Fauroux 2000 | 85 | Time series | 715 events | H | I, P, M | A | 0, 1, 2, 3 | France | C | yes, expected | -- | null | null | -- | null |
| Friedman 2001 | 80 | Time series | 73 days | H | D, M | A | Average of days 0 - 2 | USA | C | yes, expected | -- | -- | -- | -- | -- |
| Gent 2003 | 68 | Time series | 271 participants | M, S | M | A | 0, 1 | USA | C | yes, expected | null | -- | -- | -- | -- |
| Gouveia 2000 | 81 | Time series | 5,933 events | H | D, H, M, S | A | 0, 1, 2 | Brazil | C | null | -- | null | null | null | null |
| Halonen 2010 | 69 | Time series | 1972 events | H | H, I, M, P | A | Single day: 0, 1, 2, 3, 4, 5; average of 0-4 lag days | Finland | C | yes, expected | yes, expected | -- | -- | -- | -- |
| Jaffe 2003 | 82 | Cohort | 4416 events | H | M, S | A | 1, 2, 3 | USA | A | yes, expected | -- | null | null | -- | yes, expected |
| Jalaludin 2008 | 70 | Case crossover | Mean of 174 events per day for 1826 days | H | D, H, M | A | Single day: 0, 1, 2, 3; sum of days 0 and 1 | Australia | C | yes, expected | yes, expected | yes, expected | yes, expected | yes, expected | yes, expected |
| Ko 2007 | 71 | Time series | 69,716 events | H | M, S | A | Single day: 0, 1, 2, 3, 4, 5; sum of days 0 - 1, 0 - 2, and 0 - 5 | Hong Kong | A | yes, expected | yes, expected | yes, expected | yes, expected | -- | null |
| Magas 2007 | 72 | Time series | 1270 events | H | M | A | 1 | USA | C | null | null | -- | yes, expected | -- | -- |
| Meng 2009 | 79 | Case control | 1502 participants | H, S | A, sex, R, poverty level, and health insurance status | I | 1 year | USA | A | yes, expected | yes, expected | yes | null | null | -- |
| Ostro 2001 | 73 | Panel study | 10,022 person-days | M, S | M, R | A | 0, 1, 2, 3 | USA | C | yes, expected | yes, expected | yes, expected | yes, expected | -- | -- |
| Petroeschevsky 2001 | 74 | Time series | 13,246 events | H | M, S | A | Single day: 1, 2, 3; sum of days 0 - 2 and 0 - 4 | Australia | A | yes, expected | -- | -- | null | -- | null |
| Samoli 2011 | 77 | Time series | 3601 events | H | D, I, M, S | A | Single day: 0, 1, 2; sum of days 0 - 2 | Greece | C | null | -- | yes, expected | null | -- | yes, expected |
| Schildcrout 2006 | 75 | Meta-analysis | 54,450 person-days | M | M, S | A | Single day: 0, 1, 2; 3-day moving (sum of 0-, 1-, and 2-day lags) | USA and Canada | C | null | -- | null | yes, expected | yes, expected | yes, expected |
| Stieb 1996 | 86 | Time series | 1987 events | H | M, S | A | 0 | Canada | A | yes, expected | -- | -- | -- | -- | -- |
| Tenías 1998 | 78 | Time series | 734 events | H | D, I, M, S | A | 0, 1, 2, 3, 4, 5 | Spain | A | yes, expected | -- | null | yes, expected | -- | yes, expected |
| Thurston 1997 | 76 | Cohort | 830 person-days | M, S | M | A | 0, 1, 2, 3 | USA | C | yes, expected | -- | -- | -- | -- | -- |

(a) "--" means the pollutant was not studied.
(b) "H" is hospital visits or admissions; "P" is physician diagnosis; "M" is extra medication use; "S" is symptoms.
(c) "A" is age, "D" is day of the week, "H" is holidays, "I" is influenza or respiratory infections, "M" is meteorological factors, "P" is pollen, "R" is race / ethnicity, "S" is for seasonality or time trends.
(d) Single day used for duration unless otherwise noted. Numbers refer to lag in days unless otherwise noted. "0" is the same day for exposure as the event.
(e) "C" is children, "A" is children and adults.
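Footnote d of Table 1.7.1.2 describes exposure windows as single-day lags or as multi-day sums and averages. The following is a minimal Python sketch of how such lagged exposures could be derived from a daily pollutant series. The function names and the ozone values are hypothetical and are not taken from any of the studies in the table.

```python
from typing import List, Optional


def lagged_exposure(daily: List[float], event_day: int, lag: int) -> Optional[float]:
    """Return the pollutant level `lag` days before the event day.

    Lag 0 is the same day as the event. Returns None when the requested
    day falls outside the available series.
    """
    day = event_day - lag
    if day < 0 or event_day >= len(daily):
        return None
    return daily[day]


def mean_exposure(daily: List[float], event_day: int,
                  first_lag: int, last_lag: int) -> Optional[float]:
    """Average the pollutant level over lags first_lag..last_lag (inclusive)."""
    values = [lagged_exposure(daily, event_day, lag)
              for lag in range(first_lag, last_lag + 1)]
    if any(v is None for v in values):
        return None
    return sum(values) / len(values)


# Hypothetical daily ozone concentrations (ppb); the list index is the study day.
ozone = [30.0, 42.0, 55.0, 61.0, 48.0]

same_day = lagged_exposure(ozone, event_day=4, lag=0)       # lag 0
previous_day = lagged_exposure(ozone, event_day=4, lag=1)   # lag 1
three_day_mean = mean_exposure(ozone, event_day=4, first_lag=0, last_lag=2)
print(same_day, previous_day, three_day_mean)
```

A multi-day sum, as used in some of the tabulated studies, would replace the division in `mean_exposure` with the plain sum.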
1.7.1.3 Asthma exacerbations and stress
Psychosocial stress can modify the effects of environmental triggers87 and has
been associated with asthma exacerbations, worse asthma control, and worse
medication adherence.88-93 Both chronic and acute psychosocial stress can exacerbate
asthma. Compared to studies of air pollution and asthma exacerbation, studies on
psychosocial stress have found stronger associations. For example, in children with
asthma, the risk of an asthma exacerbation increased 4.7 times in the two days following a very stressful event,89 and adults exposed to violence in their community had 2.3 and
2.5 times the odds of an asthma ED visit or hospitalization, respectively, compared to
those not exposed to community violence.91 Epidemiology studies of stress and asthma
exacerbations are summarized in Table 1.7.1.3.
Table 1.7.1.3. Studies of psychosocial stress and asthma exacerbations. Abbreviations: OR, odds ratio; 95% CI, 95% confidence interval

| First author, year | Ref. | Study design | Sample size | Stress measure | Outcome(s) | Outcome data source | Model(s) | Primary findings |
|---|---|---|---|---|---|---|---|---|
| Weil 1999 | 93 | Longitudinal | 1528 children with asthma | A questionnaire on psychosocial status of the child and the caregiver | Asthma symptoms, health care utilization | Caregiver report | Logistic regression | Children with caretakers with mental health problems were more likely to have asthma hospitalizations (OR = 1.8, 95% CI [1.2-2.7]). |
| Sandberg 2000 | 90 | Longitudinal | 90 children with asthma aged 6–13 years | Three standardized interviews on the child's stressful life events | Asthma exacerbation, defined as a peak-flow recording of less than 30% of the recorded maximum | Self-report | Logistic regression | Exposure to a stressful event was associated with an increased risk of an asthma exacerbation in the subsequent 4 weeks (OR = 1.71, 95% CI [1.04–2.82]). |
| Sandberg 2004 | 89 | Longitudinal | 60 children with asthma aged 6–13 years (the same data as Sandberg 2000) | Three standardized interviews on the child's stressful life events | Asthma exacerbation, defined as a mean of the day's two readings below 70% of the normal value and an increase in reported symptoms | Self-report | Cox regression | The risk of an asthma exacerbation increased 4.7 times (p<0.01) in the 2 days following a very stressful event. |
| Apter 2010 | 91 | Longitudinal | 397 adults with prior asthma encounters | Exposure to community violence | Asthma-related ED visits, self-reported asthma-related hospitalizations, asthma-related quality of life, and forced expiratory volume in 1 second | Self-report | Logistic regression | Adults exposed to violence had higher odds of asthma emergency visits and hospitalizations (OR = 2.3, 95% CI [1.3-3.9] and 2.5, 95% CI [1.1-5.6], respectively) than those unexposed. |
| Kopel 2015 | 92 | Cross-sectional | 219 children with asthma | Caregiver assessment of neighborhood safety | Asthma symptoms, medication use, missed school due to asthma, health care utilization, lung function | Kopel, 2015 | Logistic regression | Children in neighborhoods perceived as less safe had increased risk of asthma exacerbations (e.g., for medication use, OR = 4.0, 95% CI [1.8–8.8]). |
1.7.2 Epidemiology of depression
Depression is a common but serious disease. Depression is a symptom-based condition, and symptoms of major depressive disorder include hopelessness, helplessness, sad or irritable mood, loss of interest in activities, and fatigue.94
Depression has significant public health and economic costs. Major depressive disorder
is one of the top five contributors to disability adjusted life years lost in the United
States.95 Depression can range in severity, from mild to severe.96 Using PHQ–9 data
from the 2009-12 National Health and Nutrition Examination Survey, the Centers for
Disease Control and Prevention estimated that the general population prevalence of
mild, moderate, and moderately severe/severe depression symptoms in the United
States was 15.3%, 4.7%, and 2.9%, respectively. There were differences in rates of depression
symptoms by race/ethnicity and poverty: rates of severe depression symptoms were
higher among non-Hispanic black people than non-Hispanic white people, and rates of
mild and moderate depression symptoms were higher among non-Hispanic black people
and Hispanic people than non-Hispanic white people. Rates of moderate, moderately
severe, and severe depression symptoms were higher among people living below the
federal poverty line than above it.57
In epidemiology studies, depression can be ascertained using a questionnaire
about depression symptoms (e.g., the Patient Health Questionnaire [PHQ–9], which asks about
depression symptoms in the two weeks prior to questionnaire response), or by using
clinical diagnoses and antidepressant medication use. However, since many patients
with depression do not seek care, using clinical diagnoses and antidepressant
medication use could have low sensitivity for identifying depression (Section 1.7).57,97
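As an illustration of questionnaire-based ascertainment, the sketch below scores a single PHQ–9 response in Python. It assumes the standard published PHQ–9 scoring: nine items each scored 0 to 3, a total of 0 to 27, and severity cut points at 5, 10, 15, and 20. These cut points and the function name are not stated in this section and should be checked against the instrument's documentation before use.

```python
from typing import Sequence, Tuple


def phq9_severity(item_scores: Sequence[int]) -> Tuple[int, str]:
    """Return the PHQ-9 total score and a severity label.

    Assumed cut points (standard PHQ-9 scoring, not taken from this section):
    0-4 none or minimal, 5-9 mild, 10-14 moderate,
    15-19 moderately severe, 20-27 severe.
    """
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 requires exactly nine item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each PHQ-9 item is scored 0, 1, 2, or 3")

    total = sum(item_scores)
    if total >= 20:
        label = "severe"
    elif total >= 15:
        label = "moderately severe"
    elif total >= 10:
        label = "moderate"
    elif total >= 5:
        label = "mild"
    else:
        label = "none or minimal"
    return total, label


# Hypothetical questionnaire response.
print(phq9_severity([1, 1, 2, 0, 1, 1, 0, 1, 0]))
```

A dichotomized outcome, such as the score of 10 or more used with the PHQ–8 in one of the studies reviewed below, can be derived from the total with a single comparison.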
1.7.2.1 Association of community characteristics and depression symptoms
UNGD may change the social and physical characteristics of a community
(Section 1.4.4), and these neighborhood changes could affect depression symptoms.
Several potential pathways from neighborhood characteristics to depression have been identified, including changing health behaviors and psychosocial stress (Figure
1.7.2.1).98 The neighborhood characteristics that could be affected by UNGD, which
include noise, traffic, changes in community composition, and changes in socioeconomic
status, are highlighted in yellow.
Two recent reviews evaluated studies on associations of community
characteristics with depression symptoms, one of cross-sectional and longitudinal
studies and one of only longitudinal studies.98,99 In studies of the association of
community characteristics with depression symptoms, characteristics were often
quantified using a single community-level variable or an index of variables, including
poverty, employment, home ownership, availability of retail outlets, and community
population composition (e.g., percent minority, percent single parents, or percent foreign
born). The review of cross-sectional and longitudinal studies concluded that community
characteristics were associated with depression symptoms,98 and the review of
longitudinal studies concluded the evidence was mixed for the association of community characteristics and depression symptoms.99
Figure 1.7.2.1. Relationship between community characteristics and depression symptoms.1 The community characteristics that could be affected by UNGD are highlighted in yellow.
1.7.2.2 Association of environmental variables and depressive symptoms
Several studies have evaluated the association of a wide range of environmental variables with depression symptoms.100-106 These studies have modeled depression
symptoms as binary and continuous outcomes and have largely been cross-sectional. In
particular, studies of coal mining and the Deepwater Horizon oil spill provide evidence of
an association between chronic environmental contamination (which has contextual
effects on health through psychosocial stress pathways or by influencing health-related
behaviors) and depressive symptoms.
Table 1.7.2.2. Epidemiology studies of environmental variables and mental health outcomes. Abbreviations: OR, odds ratio; 95% CI, 95% confidence interval; PHQ, Patient Health Questionnaire; NHANES, National Health and Nutrition Examination Survey

| Study | Ref. | Study design | Sample size | Location | Outcome | Exposure(s) | Model(s) | Primary findings |
|---|---|---|---|---|---|---|---|---|
| Speldewinde 2009 | 100 | Ecological | 2,669 people with a hospitalization for depression | Western Australia | Hospitalization rates for depression | Dryland salinity | Linear regression | Dryland salinity at the district level was associated with hospitalizations for depression. |
| Hendryx 2013 | 101 | Cross-sectional | 8,591 adults | Kentucky, Tennessee, Virginia, and West Virginia | Depression symptoms, measured with the PHQ-9 | Residence in a county with mountaintop removal, other non-mountaintop removal coal mining, or no coal mining | Logistic regression | Living in a county with mountaintop removal, compared to no coal mining, was associated with depression symptoms (OR = 1.40, 95% CI: 1.15, 1.71). |
| Bandiera 2011 | 102 | Cross-sectional | 2,901 children | United States | Depression symptoms, measured with the PHQ-9 | Secondhand smoke exposure | Survey weighted (NHANES data) zero-inflated Poisson regression | Exposure to secondhand smoke was associated with increased level of depression symptoms (β = 0.09, p = .03). |
| Fan 2014 | 103 | Cross-sectional | 38,361 adults | Alabama, Florida, Louisiana, and Mississippi | Depression symptoms using the dichotomized PHQ-8 (a score of greater than or equal to 10 was used to define current depression) | Four measures of exposure to the Deepwater Horizon Oil Spill: (1) location of home with respect to the oil spill (proximal vs. non-proximal); (2) did participants have direct contact with the oil? (yes vs. no); (3) were participants involved in cleanup activities? (yes/no); (4) how did the oil spill impact upon their job and/or income? (1 = no job loss or no income decrease; 2 = job loss; 3 = income decrease; 4 = both job loss and income decrease) | Loglinear regression models | Having direct contact with oil (OR = 1.58, 95% CI: 1.15, 2.16) and having had a direct impact on job/income from the oil spill (OR = 1.52, 95% CI: 1.08, 2.12) were associated with depression symptoms, but proximity of residence to the oil spill and being involved in cleanup were not. |
| Rung 2016 | 104 | Cross-sectional | 2,842 women | Louisiana | 20-item Center for Epidemiological Studies Depression Scale, dichotomized | A factor analysis of 9 questions on exposure to the Deepwater Horizon Oil Spill was used to create two factors: (1) economic exposure; (2) physical exposure | Poisson | Economic and physical exposure to the oil spill was associated with depression symptoms (RR for economic exposure = 1.2, 95% CI: 1.02 - 1.41; RR for physical exposure = 1.2, 95% CI: 1.01 - 1.43). |
| Cerdá 2013 | 105 | Cross-sectional | 1,323 adults | Haiti | Depression symptoms, measured with the PHQ-9 | Four questions on exposure to the Haiti earthquake: (1) major damage to house; (2) trapped under rubble; (3) physically injured; (4) involved in rescue/recovery | Logistic | Major damage to the house was associated with depression symptoms (OR = 1.34, 95% CI: 1.01, 1.78), but depression symptoms were not statistically significantly associated with the other three measures of exposure. |
| Golub 2010 | 106 | Cross-sectional | 4,159 adults | United States | Depression symptoms, measured with the PHQ-9 | Blood lead quartiles | Two survey weighted (NHANES) models: (1) Poisson, with depression symptoms as a dichotomous variable; (2) ordinal logistic regression | Blood lead levels were not statistically significantly associated with depression symptoms in either model. |
1.7.3 Relationship between depression and asthma
Asthma and depression symptoms are often co-morbid, but the temporal relationship between asthma and depression is not clear. A recent meta-analysis of depression and asthma identified six studies that evaluated the association between depression at baseline and the subsequent risk of adult-onset asthma,107-112 and two
studies that evaluated the association between asthma at baseline and the subsequent
risk of depression.112,113 The meta-analysis found that depression at baseline was associated with increased risk of developing asthma (relative risk [RR] = 1.43, 95% confidence interval [CI]: 1.28, 1.61), but that asthma at baseline was not associated with increased risk of developing depression (RR = 1.23, 95% CI: 0.72, 2.10).114 However,
the authors note that the analysis of asthma and incident depression may not have been
statistically significant because of the limited number of studies available. Research on
the association of depression symptoms and asthma exacerbations is limited, but in a
longitudinal study of asthma patients, having depression symptoms was associated with
subsequent asthma ED visits, but not with OCS medication orders.115
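The contrast between the two pooled estimates above can be checked with a few lines of Python. This is a minimal sketch, assuming the reported intervals are 95% confidence intervals constructed symmetrically on the log scale, which is the usual convention for relative risks but is not stated in the text. The function names are illustrative.

```python
import math


def ci_excludes_null(lower: float, upper: float, null_value: float = 1.0) -> bool:
    """True when the confidence interval does not contain the null value."""
    return not (lower <= null_value <= upper)


def log_scale_se(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate the standard error of log(RR) from a 95% CI.

    Assumes the interval was built as exp(log(RR) +/- z * SE).
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Pooled estimates quoted in the paragraph above.
estimates = {
    "depression -> incident asthma": (1.43, 1.28, 1.61),
    "asthma -> incident depression": (1.23, 0.72, 2.10),
}

for label, (rr, lower, upper) in estimates.items():
    se = log_scale_se(lower, upper)
    print(f"{label}: RR = {rr}, SE(log RR) = {se:.3f}, "
          f"CI excludes 1: {ci_excludes_null(lower, upper)}")
```

Under these assumptions, the much larger standard error for the second estimate is consistent with the authors' remark that only two studies were available for that comparison.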
1.8 Specific aims
As described above, there were few published studies on UNGD and health,
despite calls for more research, when this thesis research began.116,117 Notably, there
were no prior published studies on UNGD and asthma exacerbations or UNGD and a
mental health outcome. Furthermore, the published studies on UNGD and health
outcomes used different methods of exposure assessment (nearest neighbor distance
and gravity metrics), making comparison of results across studies difficult. To fill these
gaps, the aims of this dissertation research were to:
1. Using a nested case-control design, evaluate associations between UNGD activity
(using surrogate measures created using geographic information systems [GIS]) and
asthma exacerbations (hospitalizations, ED visits, and new OCS medication orders) in a
cohort of Geisinger patients with asthma using EHR data.
2. Evaluate associations between UNGD activity and depression symptoms using
questionnaire data.
3. Characterize impoundments, compressor engines, and flaring events related to
UNGD in Pennsylvania, evaluate whether and how to incorporate impoundments and
compressor engines into UNGD activity assessment, and compare the different
approaches to UNGD activity assessment with one another and in their associations with mild
asthma exacerbations.
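The two exposure-assessment approaches named before the aims, nearest neighbor distance and gravity metrics, can be contrasted with a short Python sketch. It assumes an inverse-distance-squared form for the gravity metric, which is one common choice; the exact formulas used in the prior studies and in this dissertation are not given in this section. Coordinates are hypothetical planar positions in kilometers.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # hypothetical planar coordinates, in km


def distance_km(a: Point, b: Point) -> float:
    """Straight-line distance between two planar points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def nearest_well_distance(home: Point, wells: List[Point]) -> float:
    """Nearest neighbor metric: distance from the home to the closest well."""
    return min(distance_km(home, well) for well in wells)


def inverse_distance_squared(home: Point, wells: List[Point]) -> float:
    """Gravity-style metric (assumed form): sum of 1 / d^2 over all wells."""
    total = 0.0
    for well in wells:
        d = distance_km(home, well)
        if d == 0:
            raise ValueError("well is located exactly at the residence")
        total += 1.0 / d ** 2
    return total


# Hypothetical residence and well locations.
home = (0.0, 0.0)
wells = [(3.0, 4.0), (0.0, 2.0), (6.0, 8.0)]

print(nearest_well_distance(home, wells))
print(inverse_distance_squared(home, wells))
```

The nearest neighbor metric depends on only one well, while the gravity-style metric accumulates a contribution from every well, which is one reason results based on the two approaches can be hard to compare directly.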
1.9 References
1. U.S. Energy Information Administration. The geology of natural gas resources. Today
in Energy Web site. http://www.eia.gov/todayinenergy/detail.cfm?id=110. Published
2/14/2011. Updated 2011. Accessed 5/19, 2016.
2. U.S. Energy Information Administration. Shale in the United States. Energy in Brief
Web site.
https://www.eia.gov/energy_in_brief/article/shale_in_the_united_states.cfm#shaledata.
Updated 2016. Accessed 5/19, 2016.
3. US Energy Information Administration. Technically recoverable shale oil and shale
gas resources. . 2013(8/19/2013).
4. Olmstead SM, Muehlenbachs LA, Shih J, Chu Z, Krupnick AJ. Shale gas development
impacts on surface water quality in Pennsylvania. Proceedings of the National Academy
of Sciences. 2013;110(13):4962-4967.
5. Litovitz A, Curtright A, Abramzon S, Burger N, Samaras C. Estimation of regional air- quality damages from Marcellus shale natural gas extraction in Pennsylvania.
Environmental Research Letters. 2013;8(1):014017.
6. Maloney KO, Yoxtheimer DA. Production and disposal of waste materials from gas and oil extraction from the Marcellus shale play in Pennsylvania. Env Prac.
2012;14(04):278-287.
7. Weinhold B. The future of fracking: New rules target air emissions for cleaner natural gas production. Environ Health Perspect. 2012;120(7):a272-9.
8. Vidic R, Brantley S, Vandenbossche J, Yoxtheimer D, Abad J. Impact of shale gas development on regional water quality. Science. 2013;340(6134).
9. Hughes JD. Drill baby drill. Nature. 2013(7437):7437-308. http://www.postcarbon.org/reports/DBD-report-FINAL.pdf.
10. Adgate JL, Goldstein BD, McKenzie LM. Potential public health hazards, exposures and health effects from unconventional natural gas development. Environ Sci Technol.
2014;48(15):8307-8320.
11. Hill EL. Unconventional Natural Gas Development and Infant Health: Evidence from
Pennsylvania. 2012.
12. Pennsylvania Code. Subchapter E. Well reporting § 78.121-§ 78.125. http://www.pacode.com/secure/data/025/chapter78/subchapEtoc.html.
13. Powers M, Saberi P, Pepino R, Strupp E, Bugos E, Cannuscio CC. Popular epidemiology and “fracking”: Citizens’ concerns regarding the economic, environmental,
health and social impacts of unconventional natural gas drilling operations. J Community
Health. 2015;40(3):534-541.
14. Jackson RB, Vengosh A, Darrah TH, et al. Increased stray gas abundance in a subset of drinking water wells near Marcellus shale gas extraction. Proceedings of the
National Academy of Sciences. 2013;110(28):11250-11255.
15. Osborn SG, Vengosh A, Warner NR, Jackson RB. Methane contamination of drinking water accompanying gas-well drilling and hydraulic fracturing. Proc Natl Acad
Sci U S A. 2011;108(20):8172-8176.
16. Warner NR, Jackson RB, Darrah TH, et al. Geochemical evidence for possible natural migration of Marcellus formation brine to shallow aquifers in Pennsylvania.
Proceedings of the National Academy of Sciences. 2012;109(30):11961-11966.
17. DiGiulio DC, Jackson RB. Impact to underground sources of drinking water and domestic wells from production well stimulation and completion practices in the Pavillion,
Wyoming. Environ Sci Technol. 2016;50(8):4524-4536.
18. Roy AA, Adams PJ, Robinson AL. Air pollutant emissions from the development, production, and processing of Marcellus shale natural gas. J Air Waste Manage Assoc.
2013;64(1):19-37.
19. McKenzie LM, Witter RZ, Newman LS, Adgate JL. Human health risk assessment of air emissions from development of unconventional natural gas resources. Sci Total
Environ. 2012;424:79-87.
20. Vinciguerra T, Yao S, Dadzie J, et al. Regional air quality impacts of hydraulic fracturing and shale natural gas activity: Evidence from ambient VOC observations.
Atmos Environ. 2015;110:144-150.
21. Pacsi AP, Kimura Y, McGaughey G, McDonald-Buller E, Allen DT. Regional ozone impacts of increased natural gas use in the Texas power sector and development in the
Eagle Ford shale. Environ Sci Technol. 2015;49(6):3966-3973.
22. Pacsi AP, Alhajeri NS, Zavala-Araiza D, Webster MD, Allen DT. Regional air quality impacts of increased natural gas production and use in Texas. Environ Sci Technol.
2013;47(7):3521-3527.
23. Kemball-Cook S, Bar-Ilan A, Grant J, et al. Ozone impacts of natural gas
development in the Haynesville shale. Environ Sci Technol. 2010;44(24):9357-9363.
24. Howarth RW, Santoro R, Ingraffea A. Methane and the greenhouse-gas footprint of
natural gas from shale formations. Clim Change. 2011;106(4):679-690.
25. Allen DT, Torres VM, Thomas J, et al. Measurements of methane emissions at
natural gas production sites in the United States. Proc Natl Acad Sci U S A.
2013;110(44):17768-17773.
26. Jiang M, Griffin WM, Hendrickson C, Jaramillo P, VanBriesen J, Venkatesh A. Life
cycle greenhouse gas emissions of Marcellus shale gas. Environmental Research
Letters. 2011;6(3):034014.
27. Karion A, Sweeney C, Pétron G, et al. Methane emissions estimate from airborne
measurements over a western United States natural gas field. Geophys Res Lett.
2013;40(16):4393-4397.
28. Sangaramoorthy T, Jamison AM, Boyle MD, et al. Place-based perceptions of the impacts of fracking along the Marcellus shale. Soc Sci Med. 2016.
29. Graham J, Irving J, Tang X, et al. Increased traffic accident rates associated with shale gas drilling in Pennsylvania. Accident Analysis & Prevention. 2015;74:203-209.
30. Brasier KJ, Rhubart D. Effects of Marcellus shale development on the criminal justice
system (The Marcellus Impacts Project Report #6). 2014.
31. Price M, Basurto L, Herzenberg S, Polson D, Ward S, and Wazeter E. The shale
tipping point: The relationship of drilling to crime, traffic fatalities, STDs, and rents in
Pennsylvania, West Virginia, and Ohio. (The Multi-State Shale Research Collaborative).
32. Muehlenbachs L, Spiller E, Timmins C. The housing market impacts of shale gas
development. Am Econ Rev. 2015;105(12):3633-59.
33. Gopalakrishnan S, Klaiber HA. Is the shale energy boom a bust for nearby residents? evidence from housing values in Pennsylvania. Am J Agric Econ.
2014;96(1):43-66.
34. Couch SR, Coles CJ. Community stress, psychosocial hazards, and EPA decision-
making in communities impacted by chronic technological disasters. Am J Public Health.
2011;101(S1).
35. Esswein EJ, Breitenstein M, Snawder J, Kiefer M, Sieber WK. Occupational
exposures to respirable crystalline silica during hydraulic fracturing. Journal of
occupational and environmental hygiene. 2013;10(7):347-356.
36. Hays J, Shonkoff SB. Toward an understanding of the environmental and public health impacts of unconventional natural gas development: A categorical assessment of the peer-reviewed scientific literature, 2009-2015. PloS one. 2016;11(4):e0154164.
37. Fryzek J, Pastula S, Jiang X, Garabrant DH. Childhood cancer incidence in
Pennsylvania counties in relation to living in counties with hydraulic fracturing sites.
Journal of Occupational and Environmental Medicine. 2013;55(7):796-801.
38. Rabinowitz PM, Slizovskiy IB, Lamers V, et al. Proximity to natural gas wells and reported health status: Results of a household survey in Washington County,
Pennsylvania. Environ Health Perspect. 2014.
39. McKenzie LM, Guo R, Witter RZ, Savitz DA, Newman LS, Adgate JL. Birth outcomes and maternal residential proximity to natural gas development in rural Colorado. Environ
Health Perspect. 2014.
40. Stacy SL, Brink LL, Larkin JC, et al. Perinatal outcomes and unconventional natural gas operations in southwest Pennsylvania. PLOS ONE. 2015;10(6):e0126425.
41. Finkel M. Shale gas development and cancer incidence in southwest Pennsylvania.
Public Health. 2016;141:198-206.
42. Jemielita T, Gerton GL, Neidell M, et al. Unconventional gas and oil drilling is
associated with increased hospital utilization rates. PLoS ONE. 2015;10(7):e0131093.
43. Saberi P, Propert KJ, Powers M, Emmett E, Green-McKenzie J. Field survey of
health perception and complaints of Pennsylvania residents in the Marcellus shale
region. Int J Environ Res Public Health. 2014;11(6):6517-6527.
44. Casey JA, Savitz DA, Rasmussen SG, et al. Unconventional natural gas development and birth outcomes in Pennsylvania, USA. Epidemiology. 2015.
45. Tustin AW, Hirsch AG, Rasmussen SG, Casey JA, Bandeen-Roche K, Schwartz BS.
Associations between unconventional natural gas development and nasal and sinus, migraine headache, and fatigue symptoms in Pennsylvania. Environ Health Perspect.
2016.
46. Casey JA, Schwartz BS, Stewart WF, Adler NE. Using electronic health records for population health research: A review of methods and applications. Annu Rev Public
Health. 2015(0).
47. Liu AY, Curriero FC, Glass TA, Stewart WF, Schwartz BS. The contextual influence of coal abandoned mine lands in communities and type 2 diabetes in Pennsylvania.
Health Place. 2013.
48. Liu AY, Curriero FC, Glass TA, Stewart WF, Schwartz BS. Associations of the burden of coal abandoned mine lands with three dimensions of community context in
Pennsylvania. ISRN Public Health. 2012;2012.
49. Nau C, Schwartz BS, Bandeen‐Roche K, et al. Community socioeconomic deprivation and obesity trajectories in children using electronic health records. Obesity.
2015;23(1):207-212.
50. Nau C, Ellis H, Huang H, et al. Exploring the forest instead of the trees: An innovative method for defining obesogenic and obesoprotective environments. Health
Place. 2015;35:136-146.
51. Schwartz BS, Stewart WF, Godby S, et al. Body mass index and the built and social environments in children and adolescents using electronic health records. Am J Prev
Med. 2011;41(4):e17-e28.
52. Casey JA, Curriero FC, Cosgrove SE, Nachman KE, Schwartz BS. High-density livestock operations, crop field application of manure, and risk of community-associated methicillin-resistant Staphylococcus aureus infection in Pennsylvania. JAMA Internal
Medicine. 2013;173(21):1980-1990.
53. Casey JA, Shopsin B, Cosgrove SE, et al. High-density livestock production and molecularly characterized MRSA infections in Pennsylvania. Environ Health Perspect.
2014;122(5):464-470.
54. Casey JA, James P, Rudolph KE, Wu CD, Schwartz BS. Greenness and birth outcomes in a range of Pennsylvania communities. Int J Environ Res Public Health.
2016;13(3):10.3390/ijerph13030311.
55. Hirsch AG, Stewart WF, Sundaresan AS, et al. Nasal and sinus symptoms and chronic rhinosinusitis in a population-based sample. Allergy. 2016.
56. Rasmussen SG, Ogburn EL, McCormack M, et al. Association between unconventional natural gas development in the Marcellus shale and asthma exacerbations. JAMA Intern Med. 2016;176(9):1334-1343.
57. Pratt LA, Brody DJ. Depression in the U.S. household population, 2009-2012. NCHS
Data Brief. 2014;(172):1-8.
58. Moorman JE, Akinbami LJ, Bailey CM, et al. National surveillance of asthma: United States, 2001-2010. National Center for Health Statistics, Vital Health Stat. 2012;3:35.
59. Dougherty R, Fahy JV. Acute exacerbations of asthma: Epidemiology, biology and the exacerbation‐prone phenotype. Clinical & Experimental Allergy. 2009;39(2):193-202.
60. National Heart, Lung, and Blood Institute. National Asthma Education Program.
Expert Panel on the Management of Asthma. Expert panel report 3: Guidelines for the diagnosis and management of asthma: Full report. US Department of Health and Human
Services, National Institutes of Health, National Heart, Lung, and Blood Institute; 2007.
61. Taylor D, Bateman E, Boulet L, et al. A new perspective on concepts of asthma
severity and control. European Respiratory Journal. 2008;32(3):545-554.
62. Reddel HK, Taylor DR, Bateman ED, et al. An official American Thoracic
Society/European Respiratory Society statement: Asthma control and exacerbations:
Standardizing endpoints for clinical asthma trials and clinical practice. American journal
of respiratory and critical care medicine. 2009;180(1):59-99.
63. Jackson DJ, Sykes A, Mallia P, Johnston SL. Asthma exacerbations: Origin, effect,
and prevention. J Allergy Clin Immunol. 2011;128(6):1165-1174.
64. Fuhlbrigge A, Peden D, Apter AJ, et al. Asthma outcomes: Exacerbations. J Allergy
Clin Immunol. 2012;129(3):S34-S48.
65. Barnett AG, Williams GM, Schwartz J, et al. Air pollution and child respiratory health:
A case-crossover study in Australia and New Zealand. American Journal of Respiratory
and Critical Care Medicine. 2005;171(11):1272-1278.
66. Chew F, Goh D, Ooi B, Saharom R, Hui J, Lee B. Association of ambient
air‐pollution levels with acute asthma exacerbation among children in Singapore. Allergy.
1999;54(4):320-329.
67. Clark NA, Demers PA, Karr CJ, et al. Effect of early life exposure to air pollution on development of childhood asthma. Environ Health Perspect. 2010;118(2):284.
68. Gent JF, Triche EW, Holford TR, et al. Association of low-level ozone and fine particles with respiratory symptoms in children with asthma. JAMA: the journal of the
American Medical Association. 2003;290(14):1859-1867.
69. Halonen JI, Lanki T, Yli-Tuomi T, Kulmala M, Tiittanen P, Pekkanen J. Urban air pollution, and asthma and COPD hospital emergency room visits. Thorax.
2008;63(7):635-641.
70. Jalaludin B, Khalaj B, Sheppeard V, Morgan G. Air pollution and ED visits for asthma in Australian children: A case-crossover analysis. Int Arch Occup Environ Health.
2008;81(8):967-974.
71. Ko F, Tam W, Wong T, et al. Effects of air pollution on asthma hospitalization rates in different age groups in Hong Kong. Clinical & Experimental Allergy. 2007;37(9):1312-
1319.
72. Magas OK, Gunter JT, Regens JL. Ambient air pollution and daily pediatric hospitalizations for asthma. Environ Sci Pollut Res Int. 2007;14(1):19-23.
73. Ostro B, Lipsett M, Mann J, Braxton-Owens H, White M. Air pollution and
exacerbation of asthma in African-American children in Los Angeles. Epidemiology.
2001;12(2):200-208.
74. Petroeschevsky A, Simpson RW, Thalib L, Rutherford S. Associations between outdoor air pollution and hospital admissions in Brisbane, Australia. Archives of
Environmental Health: An International Journal. 2001;56(1):37-52.
75. Schildcrout JS, Sheppard L, Lumley T, Slaughter JC, Koenig JQ, Shapiro GG.
Ambient air pollution and asthma exacerbations in children: An eight-city analysis. Am J
Epidemiol. 2006;164(6):505-517.
76. Thurston GD, Lippmann M, Scott MB, Fine JM. Summertime haze air pollution and
children with asthma. Am J Respir Crit Care Med. 1997;155(2):654-660.
77. Samoli E, Nastos P, Paliatsos A, Katsouyanni K, Priftis K. Acute effects of air
pollution on pediatric asthma exacerbation: Evidence of association and effect
modification. Environ Res. 2011;111(3):418-424.
78. Tenias JM, Ballester F, Rivera ML. Association between hospital emergency visits
for asthma and air pollution in Valencia, Spain. Occup Environ Med. 1998;55(8):541-
547.
79. Meng YY, Rull RP, Wilhelm M, Lombardi C, Balmes J, Ritz B. Outdoor air pollution
and uncontrolled asthma in the San Joaquin Valley, California. J Epidemiol Community
Health. 2010;64(2):142-147.
80. Friedman MS, Powell KE, Hutwagner L, Graham LM, Teague WG. Impact of changes in transportation and commuting behaviors during the 1996 summer Olympic
Games in Atlanta on air quality and childhood asthma. JAMA. 2001;285(7):897-905.
81. Gouveia N, Fletcher T. Respiratory diseases in children and outdoor air pollution in
Sao Paulo, Brazil: A time series analysis. Occup Environ Med. 2000;57(7):477-483.
82. Jaffe DH, Singer ME, Rimm AA. Air pollution and emergency department visits for asthma among Ohio Medicaid recipients, 1991–1996. Environ Res. 2003;91(1):21-28.
83. Abe T, Tokuda Y, Ohde S, Ishimatsu S, Nakamura T, Birrer RB. The relationship of short-term air pollution and weather to ED visits for asthma in Japan. Am J Emerg Med.
2009;27(2):153-159.
84. Delfino RJ, Gong H Jr, Linn WS, Pellizzari ED, Hu Y. Asthma symptoms in Hispanic children and daily ambient exposures to toxic and criteria air pollutants. Environ Health
Perspect. 2003;111(4):647-656.
85. Fauroux B, Sampil M, Quenel P, Lemoullec Y. Ozone: A trigger for hospital pediatric asthma emergency room visits. Pediatr Pulmonol. 2000;30(1):41-46.
86. Stieb DM, Burnett RT, Beveridge RC, Brook JR. Association between ozone and
asthma emergency department visits in Saint John, New Brunswick, Canada. Environ
Health Perspect. 1996;104(12):1354-1360.
87. Chen E, Miller GE. Stress and inflammation in exacerbations of asthma. Brain Behav
Immun. 2007;21(8):993-999.
88. Wisnivesky JP, Lorenzo J, Feldman JM, Leventhal H, Halm EA. The relationship between perceived stress and morbidity among adult inner-city asthmatics. Journal of
Asthma. 2010;47(1):100-104.
89. Sandberg S, Jarvenpaa S, Penttinen A, Paton JY, McCann DC. Asthma exacerbations in children immediately following stressful life events: A Cox's hierarchical regression. Thorax. 2004;59(12):1046-1051.
90. Sandberg S, Paton JY, Ahola S, et al. The role of acute and chronic stress in asthma attacks in children. The Lancet. 2000;356(9234):982-987.
91. Apter AJ, Garcia LA, Boyd RC, Wang X, Bogen DK, Ten Have T. Exposure to community violence is associated with asthma hospitalizations and emergency department visits. J Allergy Clin Immunol. 2010;126(3):552-557.
92. Kopel LS, Gaffin JM, Ozonoff A, et al. Perceived neighborhood safety and asthma
morbidity in the School Inner-City Asthma Study. Pediatr Pulmonol. 2015;50(1):17-24.
93. Weil CM, Wade SL, Bauman LJ, Lynn H, Mitchell H, Lavigne J. The relationship
between psychosocial factors and asthma morbidity in inner-city children with asthma.
Pediatrics. 1999;104(6):1274-1280.
94. American Psychiatric Association. Diagnostic and statistical manual of mental disorders (DSM-5®). American Psychiatric Pub; 2013.
95. Murray CJ, Abraham J, Ali MK, et al. The state of US health, 1990-2010: Burden of diseases, injuries, and risk factors. JAMA. 2013;310(6):591-606.
96. Kroenke K, Strine TW, Spitzer RL, Williams JB, Berry JT, Mokdad AH. The PHQ-8
as a measure of current depression in the general population. J Affect Disord.
2009;114(1):163-173.
97. González HM, Vega WA, Williams DR, Tarraf W, West BT, Neighbors HW.
Depression care in the United States: Too little for too few. Arch Gen Psychiatry.
2010;67(1):37-46.
98. Kim D. Blues from the neighborhood? Neighborhood characteristics and depression.
Epidemiol Rev. 2008;30:101-117.
99. Richardson R, Westley T, Gariépy G, Austin N, Nandi A. Neighborhood socioeconomic conditions and depression: A systematic review and meta-analysis. Soc
Psychiatry Psychiatr Epidemiol. 2015;50(11):1641-1656.
100. Speldewinde PC, Cook A, Davies P, Weinstein P. A relationship between environmental degradation and mental health in rural Western Australia. Health Place.
2009;15(3):880-887.
101. Hendryx M, Innes-Wimsatt KA. Increased risk of depression for people living in coal mining areas of central Appalachia. Ecopsychology. 2013;5(3):179-187.
102. Bandiera FC, Richardson AK, Lee DJ, He J, Merikangas KR. Secondhand smoke exposure and mental health among children and adolescents. Arch Pediatr Adolesc
Med. 2011;165(4):332-338.
103. Fan AZ, Prescott MR, Zhao G, Gotway CA, Galea S. Individual and community-level determinants of mental and physical health after the Deepwater Horizon oil spill: Findings from the Gulf States Population Survey. The journal of behavioral health services & research. 2015;42(1):23-41.
104. Rung AL, Gaston S, Oral E, et al. Depression, mental distress and domestic conflict among Louisiana women exposed to the Deepwater Horizon oil spill in the WaTCH study. Environ Health Perspect. 2016.
105. Cerdá M, Paczkowski M, Galea S, Nemethy K, Péan C, Desvarieux M.
Psychopathology in the aftermath of the Haiti earthquake: A population‐based study of posttraumatic stress disorder and major depression. Depress Anxiety. 2013;30(5):413-
424.
106. Golub NI, Winters PC, van Wijngaarden E. A population-based study of blood lead levels in relation to depression in the United States. Int Arch Occup Environ Health.
2010;83(7):771-777.
107. Jonas BS, Wagener DK, Lando JF, Feldman JJ. Symptoms of anxiety and depression as risk factors for development of asthma. Journal of Applied Biobehavioral
Research. 1999;4(2):91-110.
108. Patten SB, Williams JV, Lavorato DH, Modgill G, Jetté N, Eliasziw M. Major depression as a risk factor for chronic disease incidence: Longitudinal analyses in a general population cohort. Gen Hosp Psychiatry. 2008;30(5):407-413.
109. Loerbroks A, Apfelbacher CJ, Bosch JA, Sturmer T. Depressive symptoms, social support, and risk of adult asthma in a population-based cohort study. Psychosom Med.
2010;72(3):309-315.
110. Brumpton BM, Leivseth L, Romundstad PR, et al. The joint association of anxiety, depression and obesity with incident asthma in adults: The HUNT study. Int J Epidemiol.
2013;42(5):1455-1463.
111. Coogan PF, Yu J, O'Connor GT, Brown TA, Palmer JR, Rosenberg L. Depressive symptoms and the incidence of adult-onset asthma in African American women. Annals of Allergy, Asthma & Immunology. 2014;112(4):333-338. e1.
112. Brunner WM, Schreiner PJ, Sood A, Jacobs Jr DR. Depression and risk of incident asthma in adults: The CARDIA study. American journal of respiratory and critical care medicine. 2014;189(9):1044-1051.
113. Walters P, Schofield P, Howard L, Ashworth M, Tylee A. The relationship between asthma and depression in primary care patients: A historical cohort and nested case control study. PLoS One. 2011;6(6):e20750.
114. Gao Y, Zhao H, Zhang F, et al. The relationship between depression and asthma: A meta-analysis of prospective studies. PloS one. 2015;10(7):e0132424.
115. Ahmedani BK, Peterson EL, Wells KE, Williams LK. Examining the relationship between depression and asthma exacerbations in a prospective follow-up study.
Psychosom Med. 2013;75(3):305-310.
116. Mitka M. Rigorous evidence slim for determining health risks from natural gas
fracking. JAMA. 2012;307(20).
117. Kovats S, Depledge M, Haines A, et al. The health implications of fracking. Lancet.
2014;383(9919):757-758.
Chapter 2: Detailed Methods
2.0 Chapter overview
This chapter describes the data sources used in this research, and provides detailed methods and their rationales for the analyses in Chapters 3 - 6. Only methods that are not described in Chapters 3 - 6 are covered here.
2.1 Data sources
2.1.1 Geisinger Clinic
The Geisinger Clinic provides primary care services to over 450,000 patients in more than 35 counties of central and northeastern Pennsylvania, covering a range of place types (townships, boroughs, and cities). Geisinger has had an electronic health record (EHR) since 2001. The EHR captures data at all clinical encounters (inpatient,
outpatient, emergency, urgent care, telephone).1 This is a powerful source of data for epidemiology studies because it includes a large sample size, can be obtained relatively inexpensively, is longitudinal, has detailed health information, and is already in an electronic form.2
2.1.2 Pennsylvania Department of Environmental Protection
The Pennsylvania Department of Environmental Protection (DEP) is the state agency responsible for the enforcement of the state's environmental laws.
2.1.2.1 Natural gas wells
Oil and gas development in Pennsylvania is regulated under the Pennsylvania Oil and Gas Act (58 P.S. §§ 601.101-601.607), which was passed in 1984, overhauled in 2012 with the passage of Act 13, and further updated by the Unconventional Well Report Act passed in 2014.3-5 Under these laws, well operators must submit to the DEP a well record form within 30 days of drilling a well and a well completion report within 30 days of stimulating a well. The well record form collects information on the well’s location, permit number, depth, and the date drilling started (spud date) (Figure 2.1.2.1). The well completion report collects information on the stimulation phase, including the date of stimulation (Figure 2.1.2.2).
Reporting requirements for natural gas production have changed over time. Before July 2010, production was reported in yearlong increments. From July 2010 to December 2014, production was reported in 6-month increments (January – June and July – December). As of January 2015, production is reported monthly. The DEP compiles the spud and production data electronically, and the data are available for download on its website (https://www.paoilandgasreporting.state.pa.us/publicreports).
Figure 2.1.2.1. The well record form.
Figure 2.1.2.2. The well completion report.
2.1.2.2 Compressors
Compressors have been documented to be an important source of air pollution
from unconventional natural gas development (UNGD).6,7 As emitters of air pollution,
compressor stations in Pennsylvania are required to obtain a plan approval and an
operating permit from the DEP. Smaller compressor stations (non-major facilities) can
obtain a General Permit for Air Pollution Control in Natural Gas Compression and/or
Processing Facilities (referred to as a GP-5 permit). This is a general permit that serves
as both a plan approval and an operating permit; companies can apply for this as long
as the facility meets the qualifying criteria, which include limits (that vary by county) on the yearly emissions of several pollutants. Companies must submit a suite of standardized documents, including an application and a General Information Form (GIF), to obtain a GP-5. The GP-5 offers faster approval than non-general plan approvals, which are used by larger compressor stations. Non-general plan approvals do not have standardized forms.8
2.1.2.3 Municipal water supply
As the agency in charge of enforcing the state's water quality laws, the DEP
maintains a shapefile of areas in the state with a public water supply. We downloaded the shapefile from their website (http://www.depgis.state.pa.us/emappa/). We used this information to identify patients who were likely using groundwater for drinking water, defined as those whose residences geocoded to areas outside these public water supply boundaries.
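The classification above amounts to a point-in-polygon test of each geocoded residence against the public water supply boundaries. A minimal sketch in plain Python follows, using the standard ray-casting test; the function names and the representation of each boundary as a simple list of (longitude, latitude) vertices are assumptions for illustration, and the original work may have used GIS software for this step.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray cast from the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def likely_groundwater(lon, lat, supply_polygons):
    """True if the residence falls outside every public water supply area."""
    return not any(point_in_polygon(lon, lat, p) for p in supply_polygons)
```

This sketch ignores polygons with holes and points that lie exactly on a boundary.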
2.1.3 Pennsylvania Department of Conservation and Natural Resources
After the DEP receives a completion report (described in Section 2.1.2.1) from a well operator, the DEP sends this report to the Pennsylvania Department of
Conservation and Natural Resources (DCNR). The DCNR is charged with analyzing and interpreting the geologic information provided on the completion report. To do so, the
DCNR maintains the Internet Record Imaging System/Wells Information System
(PA*IRIS/WIS), renamed Exploration and Development Well Information Network
(EDWIN) in 2016. PA*IRIS/WIS is a subscription service that allows electronic access to scanned images of Location Plats, Completion Reports, Geophysical Logs, and Plugging
Certificates, as well as access to an electronic database of the variables contained on the completion report.9
2.1.4 U.S. Census data
We used Census data to create community-level variables. As required by the Constitution, the census is conducted every 10 years. Starting in 2010, the census contained only 10 questions, on name, sex, age, race, ethnicity, relationship, and home ownership; questions on economic, housing, and social characteristics were moved to the American Community Survey, which is a continuous monthly survey.10 In this thesis, we used data from the 2010 American Community Survey (5-year estimates). This information was used to create area-level social environmental variables.
2.1.5 Federal Highway Administration
The Federal Highway Administration is the division of the United States
Department of Transportation that is in charge of the highway system. We downloaded a shapefile of the highway system from the Federal Highway Administration’s Highway Performance Monitoring System website (http://www.fhwa.dot.gov/policyinformation/hpms/shapefiles.cfm). This information was used to create metrics for distance to major and minor roadways.
2.1.6 U.S. Department of Agriculture National Agricultural Imagery Program
The U.S. Department of Agriculture’s National Agricultural Imagery Program (NAIP) provides publicly available aerial imagery at a one-meter resolution of the continental U.S. during the agricultural growing season. The aerial imagery is available to download from their website (https://www.fsa.usda.gov/programs-and-services/aerial-photography/imagery-programs/naip-imagery/index). For Pennsylvania, aerial imagery is available for 2005, 2008, 2010, and 2013. This information was used to confirm the location of well pads.
2.1.7 Satellite data
To create area-level variables, we used data from three different satellites:
Landsat 7, Suomi NPP, and Terra. NASA’s Landsat program is the longest running program to collect satellite imagery (both visible and infrared) of the Earth. The current satellite in the Landsat program is Landsat 7.11 We used the Visible Infrared Imaging
Radiometer Suite (VIIRS) on the Suomi NPP satellite to identify natural gas flares.12 The
Terra satellite contains the Moderate-resolution Imaging Spectroradiometer (MODIS).
MODIS is used to create a measure of greenness (normalized difference vegetation
index [NDVI]) every 16 days with a 250 meter resolution.13 Satellite data is available
from NASA’s Earth Observing System Clearing House
(https://reverb.echo.nasa.gov/reverb/).
2.1.8 Environmental Protection Agency
The Environmental Protection Agency (EPA) is the federal agency in charge of enforcing environmental laws. The EPA was the source of several types of data for the completed research, as follows.
2.1.8.1 Air quality monitoring network
The National Ambient Air Quality Standards (NAAQS) are the federal standards for criteria air pollutants (carbon monoxide, lead, nitrogen dioxide, ozone, particulate matter, and sulfur dioxide), and the EPA requires that states maintain a monitoring network for these. Data from these monitors is available from the EPA’s AirData website
(https://aqs.epa.gov/api), as are monitor locations.
2.1.8.2 National Emissions Inventory
The National Emissions Inventory (NEI) is a database of sources that emit
criteria and hazardous air pollutants. It includes their type, their location, and their
emissions. It is used to create air pollution models. It is available on the EPA’s website
(https://www.epa.gov/air-emissions-inventories). We considered using these data as a
source of air pollution emissions in Chapter 5.
2.2 Data acquisition
2.2.1 Geisinger EHR data
The Geisinger EHR data was provided in 14 files: vitals, family history, demographics, contact information, procedures, social history, encounter diagnoses, outpatient encounters, hospital encounters, medication orders, medication order diagnoses, medication record, lab orders, and problem list. Each patient had a study identification number, which was used to link patients across datasets, and each encounter had an encounter identification number, which was used to link information about encounters across datasets.
2.2.2 Crowdsourced data on impoundments and well pads from SkyTruth
We collected the impoundment and well pad data in partnership with SkyTruth.
For impoundments, SkyTruth created a collaborative image analysis application on their website (skytruth.org) that displayed aerial imagery collected by the USDA National
Agricultural Imagery Program14 of the one square kilometer area around UNG wells from
the summers of 2005, 2008, 2010 and 2013 (Figure 2.2.2). Trained volunteers and staff
identified and outlined impoundments. Each image was reviewed by no fewer than three staff members or ten volunteers; agreement of 66.6% among staff or 70% among volunteers was required; and assignments were validated by a GIS analyst before inclusion in the final dataset. The methods were the same for well pads, except that volunteers identified only the points (not outlines) of well pads.
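The review thresholds above can be expressed as a simple consensus rule. The sketch below is illustrative only: the function name is hypothetical, and it assumes that "agreement" means the share of reviewers who marked a feature as present, which the text does not spell out. The later validation by a GIS analyst is not modeled.

```python
# Minimum number of reviews and required agreement, from the text above.
REVIEW_RULES = {
    "staff": (3, 0.666),
    "volunteer": (10, 0.70),
}


def passes_consensus(votes, reviewer_type):
    """votes: list of booleans, True if the reviewer marked the feature.

    Returns True when enough reviewers saw the image and the share of
    positive votes meets the threshold for that reviewer type.
    """
    min_reviews, threshold = REVIEW_RULES[reviewer_type]
    if len(votes) < min_reviews:
        return False
    return sum(votes) / len(votes) >= threshold
```

Under these assumptions, two positive votes out of three staff reviews pass, while six out of ten volunteer reviews do not.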
Figure 2.2.2. Screenshot of the SkyTruth application used to crowdsource impoundment locations.
2.2.3 Compressor data
We started with a list of compressor stations related to UNGD from the
Pennsylvania Department of Environmental Protection (DEP) to define our population of compressor stations (n=506). To characterize the compressor stations in our population, we needed to collect the following variables on each compressor engine: station name, location, compressor engine horsepower, compressor engine emissions, and start and stop dates of operation for each engine. These variables are contained on several different documents, including applications, GIFs, authorization letters, start letters, and cancelation letters in the station’s files. Applications, GIFs, and plan approvals are submitted before the construction of a new compressor station, alteration of an existing compressor station, or to renew a station’s permit. These contain information on the station location, number of compressor engines, compressor engine horsepower, compressor engine emissions, and expected start date of operation. Authorization letters are sent from the DEP to the company notifying them that their permit is authorized.
Start letters are sent from the company to the DEP notifying them that an engine has started operation. Cancelation letters are sent from the company to the DEP notifying them that an engine has stopped operation.
These documents are not available electronically. Paper copies are kept in files at four DEP locations: the Northeast, North-central, Northwest, and Southwest Regional offices. In addition to the documents listed above, we scanned other documents that, during file review, were found to contain information on station location, number of compressor engines, compressor engine horsepower, compressor engine emissions, start date of operation, and stop date of operation. Between October 28, 2013 and May 1, 2014, we made a total of 17 visits (each lasting between 2 days and one week) to the DEP regional offices and scanned a total of 6,007 documents on our population of compressor stations.
2.3 Data processing
2.3.1 Creation of the unconventional natural gas well dataset
To estimate patients’ UNGD activity metrics, we needed complete data on the location, dates of development, grouping of wells on well pads, total depth, and production quantities of all drilled unconventional natural gas wells in Pennsylvania.
However, these variables were not available from a single dataset, the different datasets with these variables contained different populations of wells, and within each dataset there were variables with missing values. To create a complete well dataset, we merged data from several sources and abstracted, extrapolated, and imputed several missing variables using the following methods. We initially created a complete well dataset through June 2013, which identified 6,915 spudded wells. We then updated the well data twice: first through December 2014, which identified 8,888 spudded wells, and then through December 2015, which identified 9,669 spudded wells.
2.3.1.1 Well data sources
In our analysis, we used data on well latitude and longitude; well pad
latitude and longitude; dates of spudding, stimulation, and production; total depth; and
volume of natural gas produced and the number of production days. All variables except
for total depth, date of stimulation, and well pad latitude and longitude were available
electronically from the PA DEP. Total depth and date of stimulation were available from
the Pennsylvania Internet Record Imaging System (PA*IRIS). Well pad data was from
SkyTruth, which used crowdsourcing of aerial photographs from the U.S. Department of
Agriculture from 2005, 2008, 2010, and 2013 to identify the location of well pads
(Section 2.2.2).15 We merged data across different data sources using API number, a
unique well identifier.
2.3.1.2 Inclusion criteria
Because the spud, stimulation, and production reports included different
populations of wells, we used the following inclusion criteria to identify unconventional
wells:
Wells with a spud date and marked unconventional in the spud report; OR
Wells with a total depth greater than 10,848.5 feet (the median depth), well type of gas, and a stimulation date in the stimulation report; OR
Wells with at least one non-zero production period and marked unconventional in the production report.
We merged data on wells that met at least one inclusion criterion from the spud, stimulation, and production reports using wells’ permit numbers.
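The three inclusion criteria can be read as a single logical OR over the spud, stimulation, and production reports. The following Python sketch illustrates that reading; the dictionary keys and the function name are hypothetical and do not correspond to actual field names in the DEP or PA*IRIS/WIS data.

```python
MEDIAN_DEPTH_FT = 10848.5  # median total depth cited in the criteria above


def is_unconventional(well):
    """well: dict of merged report fields for one permit number
    (keys are illustrative, not real report column names)."""
    # Criterion 1: spud date present and marked unconventional in spud report.
    in_spud = (well.get("spud_date") is not None
               and bool(well.get("spud_unconventional")))
    # Criterion 2: deep gas well with a stimulation date.
    in_stim = (
        (well.get("total_depth_ft") or 0) > MEDIAN_DEPTH_FT
        and well.get("well_type") == "GAS"
        and well.get("stimulation_date") is not None
    )
    # Criterion 3: at least one non-zero production period and marked
    # unconventional in the production report.
    in_prod = bool(well.get("production_unconventional")) and any(
        (q or 0) > 0 for q in well.get("production_mcf", [])
    )
    return in_spud or in_stim or in_prod
```

A well absent from one report simply lacks those keys, so it can still qualify through either of the other two criteria.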
2.3.1.3 Creation of well variables
The variables in each of the three iterations of the UNGD datasets that were most important to UNGD activity assessment included: well permit number, well location, well pad identification number, well pad location, spud date, total depth, stimulation date, production start date, and production quantities. We required the spud date to be before the stimulation date, which was required to be before the production start date. We also created indicator variables to identify the data source (present in the original report, abstracted, extrapolated, or imputed) for the spud date, total depth, and stimulation date variables.
2.3.1.3.1 Well latitude and longitude
The spud, production, and stimulation reports all contained latitudes and longitudes. We took the average of the latitudes and longitudes if the largest difference between any two of the decimal latitudes or longitudes was less than 0.001
(approximately 100 meters). For wells with differences in the latitudes or longitudes greater than or equal to 0.001 (n = 35), we used Google Earth to locate the well pad and then used the latitude and longitude from Google Earth. If none of the locations looked like a well pad (n = 3), then we used the latitude and longitude from the spud report.
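The coordinate rule above can be sketched as follows. This is an illustration only, with a hypothetical function name; the manual Google Earth lookup for discordant wells is represented simply by a flag.

```python
from statistics import mean

TOLERANCE_DEG = 0.001  # approximately 100 meters, per the text above


def reconcile_coordinates(coords, tol=TOLERANCE_DEG):
    """coords: list of (lat, lon) pairs from the reports listing the well.

    Returns ((lat, lon), needs_review). When the reports agree to within
    the tolerance, the averaged location is returned; otherwise the well
    is flagged for manual review and no location is assigned here.
    """
    lats = [c[0] for c in coords]
    lons = [c[1] for c in coords]
    if max(lats) - min(lats) < tol and max(lons) - min(lons) < tol:
        return (mean(lats), mean(lons)), False
    return None, True
```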
2.3.1.3.2 Spud date
All wells in the dataset were required to have a spud date. If a spud date was after the well’s stimulation date, or after the well’s start date of production, we deleted the spud date and treated it as missing. To impute spud dates, we first calculated the median number of days from spud date to stimulation date and to production start date by time period (2009 and before vs. 2010 and after) and region (north vs. east, Figure 2.3.3.2.1) for wells not missing those dates (Tables 2.3.3.2.1 and 2.3.3.2.2). For wells missing a spud date but not missing a stimulation date, we extrapolated the spud date by subtracting the median days from spud to stimulation by year and region from the well’s stimulation date. For wells missing a spud date and a stimulation date, we imputed the spud date by subtracting the median days from spud to start date of production by year and region from the well’s start date of production. To update the well dataset in 2014 and 2015, we looked for spud dates in the spud report for wells with a previously extrapolated spud date, and if we found a spud date in the spud report, we replaced the extrapolated spud date with the date in the spud report. The percentage of wells with a spud date present in the spud report (rather than extrapolated) remained constant at about 98% over the three iterations of the dataset (Table 2.3.3.2.3).
Figure 2.3.3.2.1. Counties considered eastern and northern for the purposes of well variable imputation and extrapolation.
Table 2.3.3.2.1. Median days from spud to stimulation by year and region, based on the 2013 well dataset
Region      Production start date in 2009 and earlier    Production start date in 2010 and later
Northern    179                                          192
Eastern     111                                          244
Table 2.3.3.2.2. Median days from stimulation to production start by year and region, based on the 2013 well dataset
Region      Production start date in 2009 and earlier    Production start date in 2010 and later
Northern    202                                          330
Eastern     143                                          334
Table 2.3.3.2.3. Spud date missingness percent (number) by data set iteration
                                        2013 well dataset    2014 well dataset    2015 well dataset
Spud date not missing in spud report    97.8 (6,766)         98.3 (8,733)         98.4 (9,512)
Spud date missing in spud report        2.2 (149)            1.7 (155)            1.6 (157)
Total wells in dataset                  100 (6,915)          100 (8,888)          100 (9,669)
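As an illustration of the spud date extrapolation for wells that have a stimulation date, the sketch below subtracts the region- and period-specific median from Table 2.3.3.2.1. The function and key names are hypothetical, and assigning the time period from the well's production start date follows the table's column headings, which is an assumption about how the period was determined.

```python
from datetime import date, timedelta

# Median days from spud to stimulation (Table 2.3.3.2.1).
SPUD_TO_STIM_DAYS = {
    ("northern", "2009_or_earlier"): 179,
    ("northern", "2010_or_later"): 192,
    ("eastern", "2009_or_earlier"): 111,
    ("eastern", "2010_or_later"): 244,
}


def time_period(production_start):
    """Classify a well by the year of its production start date."""
    return "2009_or_earlier" if production_start.year <= 2009 else "2010_or_later"


def extrapolate_spud_date(stimulation_date, region, production_start):
    """Estimate a missing spud date from the stimulation date."""
    days = SPUD_TO_STIM_DAYS[(region, time_period(production_start))]
    return stimulation_date - timedelta(days=days)
```

For a northern well stimulated on June 1, 2011 with production starting later in 2011, the estimated spud date falls 192 days before the stimulation date.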
2.3.1.3.3 Total depth
All wells in the dataset were required to have a total depth. The total depth variable is from PA*IRIS/WIS. We looked up and abstracted total depths for wells missing this variable in the scanned forms in PA*IRIS/WIS. We imputed the remaining missing total depths using the predictions from a regression of total depth on county (indicator variables) and spud year (indicator variables). Similar to spud dates, to update the well dataset in 2014 and 2015, we looked for total depths in the stimulation report for wells with a previously imputed total depth, and we replaced the imputed total depth with the total depth in the stimulation report if one was present. The percentage of wells with total depths present in the stimulation report declined from 62.4% to 53.1% from the 2013 dataset to the 2015 dataset, and the percentage of wells with an imputed total depth jumped from <1% in the 2013 and 2014 datasets to 7.6% in the 2015 well dataset (Table 2.3.3.3).
Table 2.3.3.3. Total depth missingness percent (number) by data set iteration
                                                        2013 well dataset    2014 well dataset    2015 well dataset
Total depth not missing in stimulation report           62.4 (4,312)         56.6 (5,030)         53.1 (5,135)
Total depth missing and abstracted from PA*IRIS/WIS     37.0 (2,558)         42.7 (3,795)         39.3 (3,803)
Total depth missing and imputed                         0.6 (45)             0.7 (63)             7.6 (731)
Total wells in dataset                                  100 (6,915)          100 (8,888)          100 (9,669)
2.3.1.3.4 Stimulation date
Because a well could be spudded but not yet stimulated or producing at the time the dataset was created, we only considered stimulation dates missing if the well had reported production quantities. If the stimulation date was after the well’s estimated start date of production, we deleted the stimulation date and treated it as missing. Missing stimulation dates were abstracted from PA*IRIS/WIS, and the remaining were extrapolated using a process similar to the one used for missing spud dates. We divided the median number of days from spud to stimulation (Table 2.3.3.2.1) by the number of days from spud to start date of production (Table 2.3.3.2.2), and calculated the median proportion by region and time period. We multiplied this proportion by the number of days from spud to the start date of production for a given well, and added the calculated number of days to the well’s spud date. From the 2013 to the 2015 well dataset, the percentage of wells not needing a stimulation date decreased, reflecting the increasing number of wells in production (Table 2.3.3.4).
Table 2.3.3.4. Stimulation date missingness percent (number) by data set iteration
                                                              2013 well dataset    2014 well dataset    2015 well dataset
Stimulation date not missing in stimulation report            42.2 (2,917)         43.9 (3,901)         45.5 (4,404)
Stimulation date missing and abstracted from PA*IRIS/WIS      1.8 (121)            1.4 (127)            2.1 (202)
Stimulation date missing and extrapolated                     24.9 (1,718)         29.5 (2,625)         31.8 (3,071)
Stimulation date not needed (well does not have production)   31.2 (2,159)         25.2 (2,235)         20.6 (1,992)
Total wells in dataset                                        100 (6,915)          100 (8,888)          100 (9,669)
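The stimulation date extrapolation places the date a fixed proportion of the way between the spud date and the production start date. A minimal sketch follows; the function name is hypothetical, the proportion is passed in rather than taken from the thesis (the median proportions themselves are not reported in this section), and rounding to whole days is an assumption.

```python
from datetime import date, timedelta


def extrapolate_stimulation_date(spud_date, production_start, median_proportion):
    """Estimate a missing stimulation date.

    median_proportion is the region- and period-specific median share of
    the spud-to-production interval that elapses before stimulation.
    """
    interval_days = (production_start - spud_date).days
    # Assumption: round to the nearest whole day.
    offset = round(median_proportion * interval_days)
    return spud_date + timedelta(days=offset)
```

Because the proportion lies between 0 and 1, the estimate always falls between the spud date and the production start date, which preserves the required ordering of the three dates.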
2.3.1.3.5 Production start date and production quantities
The production report includes well production days and production quantities
(MCF, thousand cubic feet of natural gas) by reporting period for each well. Reporting periods were year-long in 2009 and prior, and half-year-long in 2010 and after. We
estimated the start date of production by subtracting the gas production days from a
well’s first production period from the last day of the well’s first production period. For
wells missing a production quantity with reported quantities in the periods before and
after, we took the average of the quantities before and after.
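The two production rules above can be sketched in a few lines of Python. The function names are illustrative; missing quantities are represented as None.

```python
from datetime import date, timedelta


def estimate_production_start(first_period_end, production_days):
    """Subtract the reported production days from the last day of the
    well's first production period."""
    return first_period_end - timedelta(days=production_days)


def fill_missing_quantities(quantities):
    """quantities: per-period production (MCF) in time order, None if missing.

    A missing value is replaced by the average of its neighbors only when
    both the preceding and following periods have reported quantities.
    """
    filled = list(quantities)
    for i in range(1, len(quantities) - 1):
        before, after = quantities[i - 1], quantities[i + 1]
        if quantities[i] is None and before is not None and after is not None:
            filled[i] = (before + after) / 2
    return filled
```

For example, a well whose first period ends June 30, 2011 with 30 reported production days is assigned a production start date of May 31, 2011.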
2.3.1.3.6 Well pad
Using the well permit number, we merged our well dataset with the SkyTruth well
pad dataset. If the well was in the SkyTruth well pad dataset, we assigned it the
SkyTruth well pad ID. However, some wells were missing from the SkyTruth well pad dataset. For these wells, we grouped wells within 150 meters of one another using ArcGIS. If a well
not in the SkyTruth well pad dataset grouped with a well in the SkyTruth well pad
dataset, we assigned the well missing a well pad the SkyTruth well pad ID of the well in
its group. Wells that did not group with a well in the SkyTruth well pad dataset were
assigned a well pad ID with a designation that these were GIS-created. The SkyTruth well pad dataset was not updated between the creation of the 2013 and 2015 datasets, so over time the number of well pads from SkyTruth remained constant and the number of well pads created in GIS increased (Table 2.3.3.6).
Table 2.3.3.6. Percentage (number) of well pads by data source
                              2013 well dataset    2014 well dataset    2015 well dataset
Well pad from SkyTruth        40 (1,174)           36 (1,174)           35 (1,174)
Well pad created in GIS       60 (1,736)           64 (2,096)           65 (2,218)
Total well pads in dataset    100 (2,910)          100 (3,270)          100 (3,392)
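The grouping of wells within 150 meters was done in ArcGIS; the following Python sketch shows one way such a grouping could be reproduced. It is not the ArcGIS procedure. It assumes single-linkage grouping (wells chained together through neighbors within 150 meters share a group), great-circle distances, and illustrative function names.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def group_wells(wells, max_m=150.0):
    """wells: list of (well_id, lat, lon). Returns {well_id: group label}."""
    parent = list(range(len(wells)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(wells)):
        for j in range(i + 1, len(wells)):
            d = haversine_m(wells[i][1], wells[i][2], wells[j][1], wells[j][2])
            if d <= max_m:
                parent[find(i)] = find(j)  # merge the two groups
    return {w[0]: find(i) for i, w in enumerate(wells)}
```

A group containing a well with a SkyTruth well pad ID would then inherit that ID, and the remaining groups would receive GIS-created IDs, as described above. The pairwise loop is quadratic in the number of wells, which is acceptable for a sketch but not for large datasets.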
2.3.2 Creation of the UNGD-related compressor engine dataset
We used the following methods to create a dataset on UNGD-related compressor engines from the documents we scanned at the DEP.
2.3.2.1 Data abstraction
To systematically extract the variables we needed from the scanned documents on compressor stations, we created six data abstraction forms using Google Docs: applications, GIFs, authorization letters, start letters, cancelation letters, and other documents. One of three primary data abstractors read each scanned PDF and abstracted its data onto the corresponding abstraction form. Documents that did not contain information on the key variables identified above were not abstracted.
2.3.2.2 Data checking
We exported the spreadsheets from each of the six abstraction forms to Excel.
We took 10% random samples of the compressor stations. A data abstractor (different from the person who originally abstracted the data) re-abstracted the scanned source documents from each sampled station and compared the re-abstracted data to the data originally abstracted. The abstractor noted whether each error observed was an entry error or the result of differential decision making by a data abstractor, and then corrected the errors. We took four random samples. The first three contained errors, which did not appear to be the result of differential decision making, and we corrected these errors in the database. The fourth contained no errors, so we considered the data abstraction complete.
2.3.2.3 Creation of a Compressor Station Database
We merged the six compressor engine spreadsheets using station ID and station name to link stations across spreadsheets. The database was formatted with one row per type of engine at each station, and contained the following variables: station ID, station name, station latitude, station longitude, date the engine was authorized, date the engine started operating, date the engine was canceled (stopped operating), engine horsepower, engine emissions (NOx, VOC, and CO), and the number of engines at the station with all these same characteristics. We then used the “expand” command in Stata to reformat the data to one row per compressor engine.
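The reshaping step was done in Stata; for readers without Stata, the following Python sketch shows the same idea of turning one row per engine type into one row per engine. The function and field names are hypothetical, and unlike this sketch, the original data retained whatever variables the Stata workflow kept.

```python
def expand_engines(rows, count_key="n_engines"):
    """rows: list of dicts, one per engine type at a station, each with a
    count of identical engines. Returns one dict per individual engine."""
    expanded = []
    for row in rows:
        for _ in range(row[count_key]):
            # Copy every field except the count into a new record so that
            # later edits to one engine do not affect its duplicates.
            expanded.append({k: v for k, v in row.items() if k != count_key})
    return expanded
```

For example, a row describing two identical 1,380-horsepower engines becomes two separate engine records.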
2.4 Selection of study population and outcomes
Discussed below are the methods for selecting the study population and identifying health outcomes for the two epidemiology studies in this thesis (Chapter 3 and Chapter 4).
2.4.1 Asthma study
The UNGD and asthma exacerbation study compared asthma patients with asthma exacerbations to asthma patients without asthma exacerbations (up to that point in time). Because everyone in the study had asthma, we needed to first identify patients with asthma from the general Geisinger Clinic population, and then identify all asthma exacerbations among patients in this population.
2.4.1.1 Identification of asthma population
To identify patients with asthma from the general Geisinger Clinic population, we
first restricted the Geisinger Clinic population to patients with a Pennsylvania or New
York address. Next, based on a study that used electronic health records to identify patients
with asthma,16 we excluded patients with two or more encounters with ICD-9 codes for
cystic fibrosis (277.0x); chronic pulmonary heart disease (416.x); paralysis of vocal cords
or larynx (478.3x); bronchiectasis (494.xx); and pneumoconiosis (500.xx-508.xx). Next,
we required patients to have at least two ICD-9 encounter codes for asthma (493.x) on
different days, or at least one ICD-9 encounter code for asthma and at least one
medication order (with an ICD-9 code for asthma) on a different day. Finally, we dropped
patients who did not geocode to any level (Section 2.6) and patients missing information
on sex or date of birth in the EHR.
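The exclusion and inclusion rules above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the code used in this thesis: each record is a hypothetical `(day, icd9_code, source)` tuple with `source` equal to `"encounter"` or `"medication"`, and the exclusion count is taken across all listed conditions combined, which is one possible reading of the rule.

```python
from datetime import date


def is_exclusion_code(code):
    """True for the ICD-9 codes listed as exclusions in the text."""
    if code.startswith(("277.0", "478.3")):
        return True
    major = code.split(".")[0]
    if major in ("416", "494"):
        return True
    return major.isdigit() and 500 <= int(major) <= 508


def is_asthma_code(code):
    return code.split(".")[0] == "493"


def meets_asthma_definition(records):
    """records: list of (day, icd9_code, source) tuples for one patient."""
    n_excluded = sum(
        1 for _, code, source in records
        if source == "encounter" and is_exclusion_code(code)
    )
    if n_excluded >= 2:
        return False
    enc_days = {d for d, c, s in records if s == "encounter" and is_asthma_code(c)}
    med_days = {d for d, c, s in records if s == "medication" and is_asthma_code(c)}
    if len(enc_days) >= 2:  # two asthma encounter codes on different days
        return True
    # one asthma encounter code plus an asthma medication order on another day
    return any(e != m for e in enc_days for m in med_days)
```

The geocoding and demographic exclusions are not covered by this sketch.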
2.4.1.2 Identification of asthma exacerbations
We identified three types of asthma exacerbations among the study population of patients with asthma: mild (new oral corticosteroid [OCS] medication order), moderate
(asthma emergency department visit), and severe (asthma hospitalization). For asthma
emergency department visits and asthma hospitalizations, first we combined all
emergency or hospitalization encounters by patient that were overlapping or within 72
hours. We considered encounters that combined both emergency department visits and
hospitalizations to be hospitalizations. We excluded emergency department visits within
a week before or after a hospitalization. We identified moderate and severe asthma
exacerbations (2005 to 2012) by selecting those with an ICD-9 encounter code for
asthma (493.x). We used both primary and secondary diagnoses.
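The combining of overlapping or closely spaced encounters can be sketched as below. This is a minimal Python illustration, assuming each encounter is a hypothetical `(start, end, kind)` tuple with `kind` equal to `"emergency"` or `"hospitalization"`; it does not cover the subsequent exclusion of emergency department visits within a week of a hospitalization.

```python
from datetime import datetime, timedelta


def merge_encounters(encounters, gap=timedelta(hours=72)):
    """Merge one patient's encounters that overlap or start within `gap` of
    the end of the previous one; any merged group that includes a
    hospitalization is labeled a hospitalization."""
    merged = []
    for start, end, kind in sorted(encounters):
        if merged and start - merged[-1][1] <= gap:
            prev_start, prev_end, prev_kind = merged[-1]
            if "hospitalization" in (kind, prev_kind):
                new_kind = "hospitalization"
            else:
                new_kind = "emergency"
            merged[-1] = (prev_start, max(prev_end, end), new_kind)
        else:
            merged.append((start, end, kind))
    return merged


# Hypothetical example: an emergency visit followed the next day by an
# admission, then an unrelated emergency visit more than two weeks later.
example = [
    (datetime(2010, 1, 1, 10), datetime(2010, 1, 1, 14), "emergency"),
    (datetime(2010, 1, 2, 9), datetime(2010, 1, 5, 12), "hospitalization"),
    (datetime(2010, 1, 20, 8), datetime(2010, 1, 20, 11), "emergency"),
]
combined = merge_encounters(example)
```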
For OCS medication orders, we needed to distinguish new OCS medication
orders for an asthma exacerbation from standing orders or OCS ordered for other
diseases. To do so, we identified all OCS orders among asthma patients from 2008 to
2012. OCS orders from before 2008 were excluded because inpatient medication orders
were not consistently captured in the EHR then. We dropped OCS orders from seven
days before to seven days after a hospitalization or emergency department encounter.
To separate new orders from standing orders, we dropped OCS orders that were
submitted while the patient was already on OCS, as reported in the medication record
file, or already had another order for OCS. We dropped OCS orders that were within a
week of the previous order, and we required the outpatient visit reason or the medication
order diagnosis to be asthma-related. We dropped OCS orders associated with an
outpatient visit for the following reasons, since these are reasons that OCS are
prescribed that are not related to asthma: suppurative and unspecified otitis media (ICD-
9 code 382.x), nonsuppurative otitis media and Eustachian tube disorders (ICD-9 code
381.x), contact dermatitis and other eczema (ICD-9 code 692.x), and other and
unspecified disorders of the back (i.e. spine) (ICD-9 code 724.x).
For all three types of asthma exacerbations, we dropped exacerbations among patients less than 5 years of age on the day of the encounter, exacerbations after the patient’s date of death (likely because of erroneous recording of one of the two dates), and exacerbations before 2005 or after 2012. Finally, we randomly selected and retained one exacerbation per type per person per year.
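The final sampling step (one exacerbation per type per person per year) can be sketched as follows. This is a minimal Python illustration with hypothetical names; events are assumed to be `(patient_id, exacerbation_type, day)` tuples, and the fixed seed is only there to make the sketch reproducible.

```python
import random
from collections import defaultdict
from datetime import date


def sample_one_per_person_year(events, seed=0):
    """Randomly retain one event per (patient, type, calendar year)."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for patient_id, kind, day in events:
        groups[(patient_id, kind, day.year)].append((patient_id, kind, day))
    # Sort keys and group members so the result depends only on the seed.
    return [rng.choice(sorted(group)) for _, group in sorted(groups.items())]


events = [
    ("p1", "mild", date(2009, 2, 1)),
    ("p1", "mild", date(2009, 6, 15)),
    ("p1", "mild", date(2009, 11, 3)),
    ("p1", "mild", date(2010, 1, 20)),
]
retained = sample_one_per_person_year(events)
```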
2.4.1.3 Identification and matching of control index dates
Because our study design compared asthma patients with asthma exacerbations
(case events) to asthma patients without asthma exacerbations (control dates), we needed to identify contact dates for asthma patients who had not yet had an asthma exacerbation. To do so, we started with the population of asthma patients (Section
2.4.1.1), and then created a list of all contact dates with the health system (for lab orders, outpatient encounters, hospital encounters, medication orders, medication order diagnoses, procedures, and vitals) for these patients. For patients with each type of asthma exacerbation, we dropped potential control dates after they had a case event.
For hospitalization controls, we dropped potential control index dates in the year of, and at any time after, an asthma hospitalization; for emergency department controls, we dropped potential control dates in the year of, and at any time after, an asthma
hospitalization or an emergency department visit; and for OCS controls, we dropped
potential control dates in the year of, and at any time after, an OCS medication order,
emergency department visit, or hospitalization. Next, we randomly selected one control
date per person per year. We did this so patients with many contact dates with the
health system would not contribute much more information to the analysis than patients
with fewer contact dates. For patients with contact with Geisinger in a year and again
two years later, but not in the middle year, we considered the patient under observation
for the entire period and took the average of the two dates to create a date in the middle
year. We dropped potential control dates among patients less than 5 years of age on the
day of the encounter, potential control dates after the patient’s date of death, and
potential control dates before 2005 or after 2012. Finally, we frequency-matched control dates
to case events by age category (5 to < 13, 13 to < 19, 19 to < 45, 45 to < 62, 62 to < 75, ≥ 75
years), year, and sex to select which control dates to include in the
analysis.
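Two of the steps above lend themselves to a short illustration: the creation of a date in a missing middle year, and the strata used for frequency matching. The Python sketch below uses hypothetical function names and is not the code used in this thesis; the matching ratio is not stated in this section, so the sketch only builds the stratum key.

```python
from bisect import bisect_right
from datetime import date

AGE_BOUNDS = [13, 19, 45, 62, 75]
AGE_LABELS = ["5 to < 13", "13 to < 19", "19 to < 45",
              "45 to < 62", "62 to < 75", ">= 75"]


def age_category(age_years):
    """Assign one of the six matching age categories."""
    if age_years < 5:
        raise ValueError("patients under 5 years of age were excluded")
    return AGE_LABELS[bisect_right(AGE_BOUNDS, age_years)]


def midpoint_date(first, second):
    """Average of two contact dates, used as a date in the middle year."""
    return first + (second - first) // 2


def matching_stratum(age_years, year, sex):
    """Key on which control dates are frequency-matched to case events."""
    return (age_category(age_years), year, sex)
```

For example, contacts on 1 March 2008 and 1 March 2010 would yield a middle-year date of 1 March 2009.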
2.4.2 Depression symptom study
The depression symptom study was conducted using data from the
Chronic Rhinosinusitis Integrative Studies Program (CRISP), a study of chronic rhinosinusitis (CRS) conducted in the Geisinger Clinic.17
2.4.2.1 Study population
The CRISP study population consisted of adult primary care patients of the
Geisinger Clinic. EHR data from 2006 to 2013 were used to categorize all adult primary care patients into one of three groups based on a history of sinus-related diagnoses and/or evaluations in the EHR, and three groups based on race/ethnicity, for a total of nine groups. The nasal and sinus symptom groups were: patients with at least two ICD-9 codes for CRS (ICD-9 codes 473.x or 471.x); patients with at least one ICD-9 code for asthma or allergic rhinitis (ICD-9 codes 493.x or 477.x) or a single ICD-9 code for CRS;
and patients with no ICD-9 codes for CRS, asthma, or allergic rhinitis. The three race/ethnicity groups were white, non-Hispanic; black, non-Hispanic; and Hispanic. The survey design oversampled patients with a history of nasal and sinus symptoms and patients who were not white, and 23,700 patients were randomly selected and included in the CRISP study population.17
2.4.2.2 Outcome and mediating variables created from the questionnaires
The 23,700 patients selected for the CRISP study were sent the baseline questionnaire in April 2014, and all patients who responded to the baseline questionnaire were sent the follow-up questionnaire in October 2014. The baseline and follow-up questionnaires included validated questionnaires on symptoms that we used to create the outcome and mediating variables.
2.4.2.2.1 Fatigue
Fatigue symptoms were ascertained using the Patient-Reported Outcomes
Measurement Information System (PROMIS) fatigue short form 8a, which was
included in the baseline questionnaire.18 It asks eight questions
about fatigue symptoms over the past seven days (Table 2.4.2.2.1), with answer choices
of “not at all” (1 point), “a little bit” (2 points), “somewhat” (3 points), “quite a bit” (4
points), and “very much” (5 points). We added the points for each responder and
considered responders in the highest quartile of fatigue scores to have severe fatigue.
Table 2.4.2.2.1. Symptoms included in Patient-Reported Outcomes Measurement Information System fatigue short form 8a.
I feel fatigued.
I have trouble starting things because I am tired.
How run-down did you feel on average?
How fatigued were you on average?
How much were you bothered by your fatigue on average?
To what degree did your fatigue interfere with your physical functioning?
How often did you have to push yourself to get things done because of your fatigue?
How often did you have trouble finishing things because of your fatigue?
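The fatigue scoring can be sketched as follows. This is a minimal Python illustration with hypothetical function names. The text does not say how ties at the quartile boundary were handled, so the sketch assumes that scores at or above the 75th percentile (computed with the default method of `statistics.quantiles`) fall in the highest quartile.

```python
from statistics import quantiles

POINTS = {"not at all": 1, "a little bit": 2, "somewhat": 3,
          "quite a bit": 4, "very much": 5}


def fatigue_score(responses):
    """Sum the points for the eight items of the short form."""
    if len(responses) != 8:
        raise ValueError("short form 8a has eight items")
    return sum(POINTS[r] for r in responses)


def severe_fatigue_flags(scores):
    """Flag responders in the highest quartile of fatigue scores
    (assumption: at or above the 75th percentile)."""
    cutoff = quantiles(scores, n=4)[2]
    return [score >= cutoff for score in scores]
```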
2.4.2.2.2 Migraine headache
We ascertained migraine headache using the ID Migraine questionnaire, which was included on the baseline questionnaire.19 This questionnaire first asks responders
how often they have had headaches in the past 12 months, and the answer choices
were “never,” “once in a while,” “some of the time,” “most of the time,” or “all of the time”.
Those who responded to that question with at least “some of the time” were asked three
additional questions on headache-associated disability, nausea, and light sensitivity
(Table 2.4.2.2.2) with response choices of “never,” “rarely,” “less than half the time,” or
“half the time or more.” “Never” or “rarely” were scored as no and “less than half the
time” or “half the time or more” were scored as yes, and responders with “yes” on at
least two of the three questions were considered to have migraines.
Table 2.4.2.2.2. Symptoms included in the ID Migraine questionnaire.
How often do your headaches interfere with your ability to work, study, or enjoy life?
How often do you have nausea with your headaches?
How often have you been unusually sensitive to light during your headaches?
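The migraine classification rule described above can be written compactly. The following Python sketch uses a hypothetical function name and takes the response labels as they are quoted in the text.

```python
FREQUENT_HEADACHE = {"some of the time", "most of the time", "all of the time"}
POSITIVE_ANSWER = {"less than half the time", "half the time or more"}


def has_migraine(headache_frequency, followup_answers):
    """Classify a responder from the headache-frequency screening question
    and the three follow-up answers (disability, nausea, light sensitivity)."""
    if headache_frequency not in FREQUENT_HEADACHE:
        return False  # the follow-up questions were not asked
    n_yes = sum(answer in POSITIVE_ANSWER for answer in followup_answers)
    return n_yes >= 2
```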
2.4.2.2.3 Depression symptoms
We used the validated Personal Health Questionnaire Depression Scale (PHQ-8), which was included on the follow-up questionnaire, to ascertain depression symptoms. The PHQ-8 is a measure of current depression symptoms, used both in the clinical setting and in epidemiology studies, that asks responders how often they were bothered by the symptoms in Table 2.4.2.2.3 over the past two weeks: “not at all” (0 points), “several days” (1 point), “more than half the days” (2 points), or “nearly every day” (3 points). The points are added up, and a score of 0 to < 5 was considered no or minimal depression symptoms, 5 to < 10 was mild depression symptoms, 10 to < 15 was moderate depression symptoms, 15 to < 20 was moderately severe depression symptoms, and 20 or greater was severe depression symptoms.20
Table 2.4.2.2.3. Symptoms included in the Personal Health Questionnaire Depression Scale (PHQ-8) questionnaire.
Little interest or pleasure in doing things
Feeling down, depressed, or hopeless
Trouble falling or staying asleep, or sleeping too much
Feeling tired or having little energy
Poor appetite or overeating
Feeling bad about yourself, or that you are a failure, or have let yourself or your family down
Trouble concentrating on things, such as reading the newspaper or watching television
Moving or speaking so slowly that other people could have noticed. Or the opposite – being so fidgety or restless that you have been moving around a lot more than usual
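The PHQ-8 scoring can be sketched as follows. This is a minimal Python illustration with hypothetical names. It assumes the standard PHQ-8 severity bands (cut points at 5, 10, 15, and 20, with scores of 15 to 19 labeled moderately severe and scores of 20 to 24 labeled severe).

```python
from bisect import bisect_right

PHQ8_POINTS = {"not at all": 0, "several days": 1,
               "more than half the days": 2, "nearly every day": 3}
PHQ8_CUTS = [5, 10, 15, 20]
PHQ8_LABELS = ["none or minimal", "mild", "moderate",
               "moderately severe", "severe"]


def phq8_category(responses):
    """Return (total score, severity label) for the eight PHQ-8 items."""
    if len(responses) != 8:
        raise ValueError("the PHQ-8 has eight items")
    score = sum(PHQ8_POINTS[r] for r in responses)
    return score, PHQ8_LABELS[bisect_right(PHQ8_CUTS, score)]
```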
2.4.2.3 Case and control dates for the disordered sleep outcome
We identified disordered sleep diagnoses (case-events), consisting of encounters
and medication orders, in Geisinger’s EHR among the study population included in the
depression study. Encounters were identified in the EHR using the ICD-9 codes for
disordered sleep in Table 2.4.2.3.21 We identified orders for disordered sleep medications in the drug class “hypnotics” using drug subclass and name. We included all medications in the drug subclasses antihistamine hypnotics, selective melatonin receptor agonists, hypnotics – tricyclic agents, and orexin receptor antagonists. In the subclass non-barbiturate hypnotics, we included all medications except midazolam hydrochloride. Either an appropriate medication order or an encounter with the appropriate ICD-9 code was considered a disordered sleep outcome. We excluded disordered sleep outcomes from before 2009, only retained disordered sleep diagnoses from when the participant was 18 years of age or older, and randomly selected one disordered sleep diagnosis per participant per year so that study subjects with many encounters for sleep disorders would not unduly contribute. We identified control encounters and matched them to case events using the same methods as in the asthma exacerbation study (Section 2.4.1.3).
Table 2.4.2.3. ICD-9 codes used to identify disordered sleep.
ICD-9 code | Description
780.52 | Insomnia
780.50 | Sleep disturbance, unspecified
307.47 | Other dysfunctions of sleep stages or arousal from sleep
780.59 | Other sleep disturbances
307.42 | Persistent disorder of initiating or maintaining sleep
780.5 | Sleep disturbances
307.41 | Transient disorder of initiating or maintaining sleep
307.40 | Nonorganic sleep disorder, unspecified
307.48 | Repetitive intrusions of sleep
780.56 | Dysfunctions associated with sleep stages or arousal from sleep
780.55 | Disruptions of 24-hour sleep-wake cycle
Abbreviation: ICD-9 = International Classification of Diseases, 9th Revision, Clinical Modification
2.5 Exposure study
2.5.1 Creation of the regular grid
In Chapter 5, we wanted to explore the relationships among UNGD metrics
(Section 2.7.3) for wells, compressors, and impoundments. We did not want to assign
the UNGD metrics at the locations of Geisinger patients because then the analysis
would be influenced by population density and residential patterns. Instead, we created
a regular grid across the Geisinger region. We did this using the “Create Fishnet” tool in
ArcGIS, and we specified that points would be 5 km from each other. We then exported
the coordinates of the points to R to assign them the UNGD metrics (Section 2.7.3).
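The grid itself was built with ArcGIS, but the idea can be illustrated in a few lines. The Python sketch below is not the ArcGIS tool; it assumes a hypothetical rectangular bounding box given in a projected coordinate system with units of meters, and it does not clip points to the study region.

```python
def regular_grid(x_min, y_min, x_max, y_max, spacing=5000.0):
    """Return (x, y) points on a regular grid covering the bounding box,
    with neighboring points `spacing` meters apart."""
    n_x = int((x_max - x_min) // spacing) + 1
    n_y = int((y_max - y_min) // spacing) + 1
    return [
        (x_min + i * spacing, y_min + j * spacing)
        for j in range(n_y)
        for i in range(n_x)
    ]


# Hypothetical 10 km by 5 km box: three columns and two rows of points.
points = regular_grid(0.0, 0.0, 10000.0, 5000.0)
```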
2.5.2 Estimation of impoundment start and stop dates
Rutherford Platt at Gettysburg College estimated an installation and removal date for each impoundment. He used a trend analysis of Landsat data to identify sudden spectral changes in the grid cell that contained each impoundment. He compiled all available Landsat 5, 7, and 8 surface reflectance imagery with < 30% cloud cover for the years 2000-2015, a total of 754 images across four Landsat path/rows. For each impoundment location, he masked remaining clouds and then interpolated a monthly time series for the near infrared band and the normalized difference vegetation index (NDVI). He used the Breaks for
Additive Season and Trend (BFAST) package in R to identify discrete breaks in the time series
after the removal of seasonal effects.22 The dataset has a nominal temporal resolution of
1 month, but cloud cover and gaps can potentially delay the detection of the creation or
removal of impoundments. Based on the direction, magnitude, and timing of the time
series breaks, he identified approximate dates of creation and removal of
impoundments. He verified estimates for a sample of impoundments by comparing
Landsat-derived dates to dates derived using historical imagery on Google Earth.
2.6 Geocoding of study population
Joseph Dewalle at the Geisinger Center for Health Research geocoded the
Geisinger patients included in this thesis. To do so, first, all addresses were validated against a US Postal Service database, which standardizes address components, adds
ZIP + 4, and converts some rural-style addresses and P.O. boxes into city-style addresses. Then, he sequentially used the following base maps to geocode addresses: residential structure points, created as a part of Pennsylvania’s effort to convert rural style addresses to city style addresses; StreetMap Premium Tom Tom Edition, commercial products from ESRI, versions 3, 2 and 1 for years 2012, 2011, and 2010;
Census TIGER 2013 and Census TIGER 2010, basemaps created by the U.S. Census for the 2010 Census; TeleAtlas 2009, an ESRI product; and Census TIGER 2000, a basemap created by the U.S. Census for the 2000 Census. These basemaps were used in an order of decreasing quality or increasing age. Only matches with a high sensitivity
(spelling > 90) were accepted. Addresses that could not geocode to the street level were instead geocoded to the centroid of the address’s ZIP + 4 or ZIP code. In the asthma exacerbation study (Chapter 3), we included only patients who geocoded to the states of Pennsylvania or New York. In the depression symptom study (Chapter 4), we included only patients who geocoded to the state of Pennsylvania. The number and percentages of the study populations in these studies who geocoded to the street, ZIP +
4, or ZIP code are provided in Table 2.6.
Table 2.6. Geocoding level for the asthma and depression study populations.
Geocoding level | Asthma study population, n (%) | Depression study population, n (%)
Street address | 31,567 (88.9) | 4,396 (89.1)
ZIP + 4 centroid | 923 (2.6) | 155 (3.1)
ZIP centroid | 3,018 (8.5) | 381 (7.7)
2.7 Creation of study variables
2.7.1 Covariates created from the electronic health record
The covariates created from the electronic health record data are summarized in
Table 2.7.1 and described in detail along with their rationale in the following sections.
Table 2.7.1. Variables created from the electronic health record used in health studies.
Variable | Type | Units | Study
Sex | Binary | Male/female | Asthma, depression symptoms
Age | Categorical | Years | Asthma
Age | Continuous | Years | Depression symptoms
Season | Categorical | Spring, summer, fall, winter | Asthma
Race/ethnicity | Categorical | White, black, Hispanic, other | Asthma
Race/ethnicity | Categorical | White, black, Hispanic | Depression symptoms
Smoking status | Categorical | Current, former, never, missing | Asthma, depression symptoms
Family history of asthma | Binary | Yes/no | Asthma
Family history of mental disorders | Binary | Yes/no | Depression symptoms
Medical Assistance | Binary | Yes/no | Asthma, depressive symptoms
Overweight/obesity | Categorical | Not overweight/obese, overweight, obese, missing | Asthma, depression symptoms
Diabetes | Binary | Yes/no | Asthma
Alcohol use | Categorical | Yes, not heavy; yes, heavy; no | Depression symptoms
Anti-depressant use | Binary | Yes/no | Depression symptoms
2.7.1.1 Sex
Patient sex was determined from the sex variable in the demographics file, which classified patients as female, male, or unknown/missing. Patients with unknown/missing sex were excluded from the analysis. Sex is an important covariate in studies of asthma and of depression. In children, asthma prevalence is higher among males than females, but in adults asthma prevalence is higher among females than males.23 Male and female
children with asthma have similar risk for asthma exacerbations, but female adults have
higher risk for asthma exacerbation than male adults.24 Females of all ages 12 years and
older are more likely to have depression symptoms than males,25 and are more likely to
take anti-depressant medication than males at all levels of depression severity.26
2.7.1.2 Age
Age is also an important covariate in studies of asthma and depression. The rates of these diseases vary by age though both diseases can affect children and adults.
The rates of depressive symptoms increase from age groups 12-17 years to 18-39 years, and from 18-39 years to 40-59 years, but then decrease from 40-59 years to 60 years and over, among both males and females.25 Asthma prevalence is higher among
children than adults,23 and children are at higher risk for asthma exacerbations than
adults.24
Age was calculated as years between date of birth, from the demographics file,
and the index date (in the asthma study) or the date of survey return (in the depression
study). In the asthma study, patients were categorized into six age groups (5 to < 13, 13
to < 19, 19 to < 45, 45 to < 62, 62 to < 75, ≥ 75 years), the same categories used for
matching. In the depression analysis, age was a continuous variable because this
analysis had fewer study participants than the asthma analysis so we were more
concerned with having a parsimonious model.
2.7.1.3 Season
Asthma exacerbations tend to peak in the fall, especially for children but also for adults, which has been attributed to children returning to school and to respiratory viruses.27,28 In the asthma study, season was calculated using the month and day of the index date,
and categorized as spring (March 22-June 21), summer (June 22-September 21), fall
(September 22-December 21), and winter (December 22-March 21). Because the survey
that included depression questions was mailed on the same day to all recipients, we did
not use season as a covariate in the depression analysis.
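The season definition above maps directly to a small function. The following Python sketch uses a hypothetical function name and the date boundaries given in the text.

```python
from datetime import date


def season(index_date):
    """Assign the season of an index date using the cut points in the text."""
    month_day = (index_date.month, index_date.day)
    if (3, 22) <= month_day <= (6, 21):
        return "spring"
    if (6, 22) <= month_day <= (9, 21):
        return "summer"
    if (9, 22) <= month_day <= (12, 21):
        return "fall"
    return "winter"  # December 22 through March 21
```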
2.7.1.4 Race/ethnicity
In the United States, black race/ethnicity is associated with higher risk of asthma
hospitalization than white race/ethnicity,29 and black or Hispanic race/ethnicity is
associated with higher rates of depression but lower use of anti-depressant
medication.25,26 In the asthma study, patient race/ethnicity was determined from the race/ethnicity variable in the demographics file, which categorized patients in five mutually exclusive categories: white, black, Hispanic, other, and missing. We combined the other and missing categories to create four categories: white, black, Hispanic, or other/missing.
In the depression study, three race/ethnic categories (white, black, Hispanic) were used in the survey design, and these same categories were used in analysis.
2.7.1.5 Smoking
It is well established that smoking can aggravate asthma.30 Smoking is also
associated with depression, though the direction of the association is not clear.31 Data
from the social history, the procedure, and the encounter diagnosis files were combined
to create the smoking variable, which classified patients into current, former, never, or
missing smoking status. We started with the social history file, which included variables
on: the date the social history was taken, smoking status, packs per day, smoke years,
and smoke quit date. The categories of the smoking status variable and those that we
considered to be current smokers are in Table 2.7.1.5.1. We assumed passive smokers
were not smokers. Former smokers were reclassified as current smokers if their quit date was later than the date the social history was taken. We treated “unknown if ever smoked” and “never assessed” as missing and did not use these social history records.
Next, we looked in the procedure file (Table 2.7.1.5.2) and in the encounter diagnosis file (Table 2.7.1.5.3) for smoking-related codes. All smoking related procedure and encounter diagnosis codes were considered to be indicative of current smoking. We identified the most recent smoking status, smoking related procedure code, or smoking related encounter diagnosis code before the index date.
Table 2.7.1.5.1. Smoking status categories considered as evidence of current smoking
Smoking status category | Considered current smoker?
Current everyday smoker | Yes
Current some day smoker | Yes
Smoker, current status unknown | Yes
Former smoker; never assessed | No
Never smoker | No
Passive smoker | No
Unknown if ever smoked | No
Table 2.7.1.5.2. Procedure codes considered as evidence of smoking
Code | Description
99406 | “BEHAV CHNG SMOKING 3-10 MIN”
99407 | “BEHAV CHNG SMOKING < 10 MIN”
G0375 | “DEMO-SMOKING CESSATION COUN”
G0376 | “SMOKING & TOBAC CESSATION”
G9016 | “SMOKING & TOBAC CESSATION-INT”
W5963 | “SMOKING (TOBACCO) CESSATION CO”
Table 2.7.1.5.3. ICD-9 codes considered as evidence of smoking
Code | Description | Excluded subcategories
305.1 | “Tobacco use disorder” | “Chewing tobacco use,” “Chews tobacco,” and “Tobacco dipper”
V15.82 | “History of tobacco use” |
989.84 | “Tobacco” | “Toxic effect of secondhand tobacco smoke”
649.0 | “Tobacco use disorder complicating pregnancy, childbirth, or the puerperium” |
Next, we looked for any evidence of former smoking before the index date. This included smoking statuses, smoking related procedure codes, or smoking related encounter diagnosis codes, packs per day, smoke years, and smoke quit date. Patients were moved from the never smoker category to the former smoker category if their most recent smoking status was for never smoking, but they had evidence of smoking in their past. We assumed patients 15 and younger were non-smokers if they had no smoking information from the social history, procedures, or encounter diagnosis files. Patients over the age of 15 were categorized as missing if they had no smoking information from the social history, procedures, or encounter diagnosis files.
2.7.1.6 Family history
For both asthma and depression, having a family history of the disease is a risk factor for the disease in relatives.30,32 The family history file includes family history of
asthma and of mental disorders (a more specific family history of depression is not
available). Family history of asthma was created as a binary variable, which
distinguished patients recorded in the family history file as having one or more first
degree relatives (i.e., father, mother, brother, sister) with asthma from those without a
first degree relative with asthma. We also created a variable for family history of mental
disorders in the same way.
2.7.1.7 Medical Assistance
People below the poverty level have a higher risk for depression and for
asthma.25,29 Medical Assistance for health insurance is a means-tested program. It has been used as a surrogate for low family SES in prior studies, which found associations with various health outcomes in patterns consistent with prior knowledge about low SES, supporting its use for this purpose.33,34 Patients were considered to be on Medical
Assistance if, up to their index date, patients had at least 3 outpatient visits with any of
the following payors: ACCESS PLUS D15, BLUE CHIP S18, CHIP UHC COMM PL
KIDS H64, GHP CHILD HLTH INS PROG (CHIP), MA PENDING D99, MEDICAID SSU
D74, or PENNA M A PROGRAM D01. If a patient did not have an outpatient encounter before their index date, we looked at outpatient encounters up to a year after their index date. If patients only had 3 outpatient visits in total up to their index date, patients only needed 2 visits with any of the above payors to be categorized as having medical assistance; if patients only had 2 outpatient visits in total up to a year after their index date, patients only needed 1 visit with any of the above payors to be categorized as having medical assistance; and if patients only had 1 outpatient visit in total up to a year after their index date, patients only needed 1 visit with any of the above payors to be categorized as having medical assistance.
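The visit-count thresholds above can be sketched as follows. This Python illustration uses a hypothetical function name and assumes the caller has already selected the relevant outpatient visits (those up to the index date or, if there were none, those up to a year after it). It reads the rule as requiring three Medical Assistance visits when four or more visits are available, two when exactly three are available, and one when only one or two are available; returning False for patients with no visits is an assumption of the sketch.

```python
MA_PAYORS = {
    "ACCESS PLUS D15",
    "BLUE CHIP S18",
    "CHIP UHC COMM PL KIDS H64",
    "GHP CHILD HLTH INS PROG (CHIP)",
    "MA PENDING D99",
    "MEDICAID SSU D74",
    "PENNA M A PROGRAM D01",
}


def on_medical_assistance(visit_payors):
    """visit_payors: payor name for each outpatient visit in the window."""
    total = len(visit_payors)
    if total == 0:
        return False  # assumption: no visits means not classified as on MA
    n_ma = sum(payor in MA_PAYORS for payor in visit_payors)
    if total >= 4:
        required = 3
    elif total == 3:
        required = 2
    else:
        required = 1
    return n_ma >= required
```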
2.7.1.8 Diabetes
We used diagnoses from both the medication and encounter files to classify patients as having diabetes mellitus. Patients were considered to have type 1 diabetes if they had one ICD-9 code (from either a medication order or an encounter) of 250.X1 or 250.X3 before their index date. Patients were considered to have type 2 diabetes if they had two
ICD-9 codes (from either a medication order or an encounter) on different days of 250.X0 or
250.X2 before their index date.
We did not allow patients to have both type 1 and type 2 diabetes. However, some patients did have ICD-9 codes for both type 1 and type 2 diabetes. For patients with ICD-9 codes for both types of diabetes, we calculated the number of encounters on different days in which the patient had diagnoses for each type of diabetes. We assigned patients as type 1 diabetes if they had more type 1 diabetes than type 2 diagnosis days and had at least one prior medication order for insulin. We assigned patients as type 2 diabetes if they had more type 2 diabetes than type 1 diagnosis days, or if they had more type 1 diabetes than type 2 diagnosis days but had a BMI greater than 40.
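The diabetes classification rules can be sketched as follows. This Python illustration uses hypothetical names and makes two assumptions that the text does not settle: when a patient has more type 1 than type 2 diagnosis days, a BMI greater than 40 takes precedence over a prior insulin order, and patients with codes for both types who satisfy none of the stated rules (for example, equal numbers of diagnosis days) are left unclassified (`None`).

```python
from datetime import date


def diabetes_type_of_code(code):
    """Map an ICD-9 250.xx code to 1 or 2 from its fifth digit, else None."""
    if len(code) == 6 and code.startswith("250."):
        if code[-1] in "13":
            return 1
        if code[-1] in "02":
            return 2
    return None


def classify_diabetes(records, has_insulin_order=False, bmi=None):
    """records: list of (day, icd9_code) pairs before the index date."""
    t1_days = {d for d, c in records if diabetes_type_of_code(c) == 1}
    t2_days = {d for d, c in records if diabetes_type_of_code(c) == 2}
    if t1_days and t2_days:  # codes for both types: compare diagnosis days
        if len(t2_days) > len(t1_days):
            return "type 2"
        if len(t1_days) > len(t2_days):
            if bmi is not None and bmi > 40:
                return "type 2"
            if has_insulin_order:
                return "type 1"
        return None  # not covered by the stated rules
    if t1_days:
        return "type 1"  # one type 1 code is sufficient
    if len(t2_days) >= 2:
        return "type 2"  # two type 2 codes on different days
    return None
```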
2.7.1.9 Overweight/obesity
Obesity is associated with worse asthma severity, and obesity and depression are often comorbid.30,35 The vitals file contains information on height and weight. We
used these variables to create the overweight and obese variables, which were based
on body mass index (BMI). For adults in the asthma and depression studies, we
calculated BMI with the most recent weight and the most recent height using Equation
2.7.1.9. For adults, we assumed heights of less than 36 inches or greater than 90 inches
and weights of less than 50 pounds or greater than 600 pounds were not possible and
dropped those measurements. If no height measurements before the index date were
available, we used the average of height measurements after the index date, assuming
that adults do not have increases in height. We classified BMIs less than 25 as not
overweight or obese, BMIs greater than or equal to 25 and less than 30 as overweight,
and greater than or equal to 30 as obese.36
Equation 2.7.1.9. BMI formula for adults.
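The adult BMI calculation and categories can be sketched as follows. This Python illustration uses a hypothetical function name and assumes the standard imperial-unit BMI formula (weight in pounds divided by the square of height in inches, multiplied by 703), which matches the units of the plausibility limits above. For simplicity, the sketch returns "missing" when a measurement is implausible, whereas in the study implausible measurements were dropped and other available measurements could still be used.

```python
def adult_bmi_category(weight_lb, height_in):
    """Classify an adult from weight in pounds and height in inches."""
    if weight_lb is None or height_in is None:
        return "missing"
    plausible_height = 36 <= height_in <= 90
    plausible_weight = 50 <= weight_lb <= 600
    if not (plausible_height and plausible_weight):
        return "missing"  # simplification of dropping the measurement
    bmi = 703.0 * weight_lb / height_in ** 2
    if bmi < 25:
        return "not overweight or obese"
    if bmi < 30:
        return "overweight"
    return "obese"
```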
In the asthma study, for children, we calculated BMI z-scores using the CDC
SAS growth chart program.37 We used the most recent height and weight before the
index date and assumed a BMI percentile greater than or equal to the 85th percentile but
less than the 95th percentile was overweight, and greater than or equal to the 95th
percentile was obese.36 For both children and adults, we created a fourth category of missing BMI if the weight and height data to calculate BMI z-score or BMI were not available.
2.7.1.10 Alcohol use
Alcohol use is a standard confounder in studies of depression.38 We created a
categorical variable for alcohol use (yes, not heavy; yes, heavy; no; missing) using the
social history and encounter diagnosis files. Heavy alcohol use is defined by the CDC as
15 or more drinks a week for males and 8 or more drinks a week for females.39 We
looked at social histories taken within a year before the survey was returned. The social history file includes variables on alcohol status (yes, no, unknown, not asked) and drinks per week. We considered patients with any social history of heavy drinking in the past year as “yes, heavy,” and patients with any social history of drinking (less than heavy drinking) as “yes, not heavy.” Patients recorded as not drinking in the year prior to survey return were classified as “no,” and patients with no social histories in the year were classified as “missing.” We then looked in encounter diagnoses for ICD-9 codes
“305.0” (nondependent abuse of alcohol) and “303” (alcohol dependence syndrome).
Patients with either of these ICD-9 codes in the year prior to survey return were then reclassified as “yes, heavy.”
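The alcohol use classification can be sketched as follows. This Python illustration uses hypothetical names; each social history from the year before survey return is assumed to be a `(status, drinks_per_week)` pair, and treating "unknown" and "not asked" statuses as uninformative is an assumption of the sketch rather than a rule stated in the text.

```python
def is_heavy(drinks_per_week, sex):
    """CDC definition: 15 or more drinks a week for males, 8 or more for females."""
    if drinks_per_week is None:
        return False
    threshold = 15 if sex == "male" else 8
    return drinks_per_week >= threshold


def alcohol_category(social_histories, sex, has_alcohol_diagnosis):
    """Classify alcohol use from the past year's social histories and
    ICD-9 diagnoses (305.0 or 303) in the same period."""
    if has_alcohol_diagnosis:
        return "yes, heavy"
    informative = [(s, d) for s, d in social_histories if s in ("yes", "no")]
    if not informative:
        return "missing"
    if any(s == "yes" and is_heavy(d, sex) for s, d in informative):
        return "yes, heavy"
    if any(s == "yes" for s, _ in informative):
        return "yes, not heavy"
    return "no"
```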
2.7.1.11 Anti-depressant use
We hypothesized that patients on anti-depressants may not be susceptible to
the potential effects of UNGD on depression. We created an anti-depressant variable
using the medication record file. We identified anti-depressant medications (selective
serotonin reuptake inhibitors, serotonin and norepinephrine reuptake inhibitors, serotonin
antagonist and reuptake inhibitors, tricyclic antidepressants, tetracyclic antidepressants,
bupropion, serotonin modulator and stimulators, and monoamine oxidase inhibitors) from
the antidepressant and miscellaneous psychotherapeutic pharmacy classes.40
Antidepressant use was identified with medication orders as classified using the Medi-
Span Therapeutic Classification System and the 14-digit Generic Product Identifier (GPI)
that identified drug group (e.g., antidepressants), drug class (e.g., tricyclics), drug sub-
class, and drug name (e.g., amitriptyline). The vast majority of orders were identified by
drug group, but some required more extensive searching (e.g., fluoxetine was also found
in drug class “miscellaneous psychotherapeutic”). We looked at the start and end dates
of medication use and assigned patients as on anti-depressants if the patient was on
anti-depressants on at least one day of the 30 days before survey return. We did not
look at anti-depressant medication use as an outcome because patients with depression may take several years to seek care.
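The 30-day exposure window for anti-depressant use can be sketched as follows. This Python illustration uses a hypothetical function name, assumes each medication period is a `(start, end)` pair of dates with both ends inclusive, and takes the window to be the 30 days ending the day before survey return.

```python
from datetime import date, timedelta


def on_antidepressant(medication_periods, survey_return, window_days=30):
    """True if any medication period covers at least one day in the
    window_days days before the survey return date."""
    window_start = survey_return - timedelta(days=window_days)
    window_end = survey_return - timedelta(days=1)
    return any(
        start <= window_end and end >= window_start
        for start, end in medication_periods
    )
```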
2.7.2 Covariates created using patients’ coordinates
We also created covariates using patients’ geocoded coordinates as summarized in Table 2.7.2 and described along with their rationale below.
Table 2.7.2. Variables created using patients’ geocoded coordinates used in health studies
Variable | Type | Units | Study
Place type | Categorical | City, borough, township | Asthma, depression
Community socioeconomic deprivation | Continuous | Z-score | Asthma, depression
Maximum temperature on prior day | Continuous | Degrees Celsius | Asthma
Distance to nearest major and minor road | Continuous | Meters | Asthma
Distance to hospital | Continuous | Meters | Asthma
Well water supply | Binary | Public/well water | Depression
2.7.2.1 Place type
Because patients are clustered in communities, we needed to create a variable
to describe community type and an identifier to group people in the same community
together. We used a mixed definition of place, termed mixed because two different sets
of place boundaries are used – minor civil divisions (township and boroughs) and census
tracts (in cities).41 We did this because census tracts are too large in rural areas to adequately represent communities, whereas cities, taken as single minor civil divisions, are too large and heterogeneous. This definition was thought to be more culturally, behaviorally, and experientially relevant to the concept of neighborhood condition.41 We downloaded U.S.
Census shapefiles for minor civil divisions and census tracts in Pennsylvania.42 In
ArcGIS, we plotted the townships and boroughs from the minor civil divisions, and the
census tracts in the cities. We assigned each patient to their mixed definition of place to
create the place type variable (city, borough, or township). We also assigned each patient the geographic identifier unique for their specific city census tract, borough, or township.43
2.7.2.2 Community socioeconomic deprivation
Community economic factors can affect health, even after accounting for individual-level economic status.44 We assigned a measure of community socioeconomic deprivation (CSD) to each mixed definition of place geographic identifier. CSD is based on the commonly used deprivation indexes first derived from the Townsend Index45 and
shown to be associated with health outcomes in many prior studies in the social
epidemiology literature.46,47 The CSD index was calculated as the sum of six transformed
census variables (Table 2.7.2.2) using 2010 data.
Table 2.7.2.2. Variables used to create the socioeconomic deprivation index
Variable | Transformation
Proportion of the population (25 years and older) with less than high school education | Z-score
Proportion of the population (16 years and older) unemployed | Log transformed, z-score
Proportion of the population (16 years and older) not in labor force | Z-score
Proportion of the population in poverty in the last 12 months | Z-score
Proportion of the population receiving public assistance in the last 12 months | Log transformed, z-score
Proportion of households without a car | Log transformed, z-score
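As a rough illustration of how an index of this form could be computed, the sketch below sums six standardized variables per community, log-transforming three of them first. The variable keys, the use of the population standard deviation, and the assumption that log-transformed proportions are strictly positive are ours and are not stated in the text.

```python
import math
from statistics import mean, pstdev

# Hypothetical short names for the six census variables in Table 2.7.2.2.
VARIABLES = ["less_than_high_school", "unemployed", "not_in_labor_force",
             "poverty", "public_assistance", "no_car"]
LOG_TRANSFORMED = {"unemployed", "public_assistance", "no_car"}


def zscores(values):
    """Standardize values to mean 0 and (population) SD 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def csd_index(communities):
    """communities: dict of community id -> dict of variable -> proportion."""
    ids = list(communities)
    totals = {cid: 0.0 for cid in ids}
    for var in VARIABLES:
        raw = [communities[cid][var] for cid in ids]
        if var in LOG_TRANSFORMED:
            # Assumes proportions are > 0; zeros would need special handling.
            raw = [math.log(v) for v in raw]
        for cid, z in zip(ids, zscores(raw)):
            totals[cid] += z
    return totals


example = {
    "A": {v: 0.05 for v in VARIABLES},
    "B": {v: 0.10 for v in VARIABLES},
    "C": {v: 0.30 for v in VARIABLES},
}
scores = csd_index(example)
print(sorted(scores, key=scores.get))  # communities from least to most deprived
```

Because each component is a z-score, the index is relative to the set of communities supplied; higher values indicate more deprivation.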
2.7.2.3 Maximum temperature on prior day
Temperature may have a direct effect on asthma exacerbations, and also an indirect effect through asthma triggers (e.g., air pollution), and the effects of temperature may not be adequately captured by season alone.48 We downloaded daily maximum
temperature data for 2005-2012 for New York, New Jersey, and Pennsylvania from the
National Climatic Data Center (Figure 2.7.2.3).49 Using R, we calculated the distance between each patient and each weather station. We then assigned each patient to the
closest weather station that reported maximum temperature on the day before each of the patient’s index dates.
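The station assignment was done in R; the Python sketch below only illustrates the logic described above, namely restricting to stations that reported on the prior day and then taking the nearest one. The haversine great-circle distance, the station dictionary layout, and the function names are assumptions made for the example.

```python
import math
from datetime import date, timedelta


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters (assumed distance measure)."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def prior_day_tmax(lat, lon, stations, index_date):
    """Maximum temperature on the day before index_date at the closest
    station that reported a value for that day, or None if none did.

    stations: list of dicts with 'lat', 'lon', and 'tmax' (date -> deg C).
    """
    prior = index_date - timedelta(days=1)
    reporting = [s for s in stations if prior in s["tmax"]]
    if not reporting:
        return None
    nearest = min(reporting,
                  key=lambda s: haversine_m(lat, lon, s["lat"], s["lon"]))
    return nearest["tmax"][prior]


stations = [
    {"lat": 41.0, "lon": -76.6, "tmax": {date(2010, 7, 2): 30.0}},
    {"lat": 42.0, "lon": -75.0, "tmax": {date(2010, 7, 1): 28.5}},
]
# The nearer station has no value for July 1, so the farther one is used.
print(prior_day_tmax(40.96, -76.61, stations, date(2010, 7, 2)))
```

Filtering on data availability before taking the minimum is what lets a patient fall back to a more distant station on days when the closest one did not report.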
Figure 2.7.2.3. Weather stations reporting daily maximum temperature between 2005- 12 in New York, New Jersey, and Pennsylvania.
2.7.2.4 Distance to nearest major and minor road
Living very close to roads (e.g., less than 200 meters) is associated with
increased risk of prevalent asthma and of asthma symptoms.50 We downloaded the
Federal Highway Administration’s 2011 shapefile of highways in Pennsylvania and
New York (Figure 2.7.2.4).51 Using ArcGIS, we calculated the distance from each
patient’s geocoded address to the closest major road (defined as an interstate; principal
arterial, other freeways and expressways; and principal arterial, other) and minor road
(defined as minor arterials) in meters.
Figure 2.7.2.4. Locations of major and minor roads in New York and Pennsylvania.
2.7.2.5 Distance to hospital
We were concerned that asthma patients who lived closer to a Geisinger hospital might be more likely to seek care at the hospital than patients who lived farther away.
The majority of asthma emergency department encounters and hospitalizations occurred at two Geisinger hospitals, Geisinger Medical Center in Danville, Pennsylvania, and
Geisinger Wyoming Valley Medical Center in Wilkes-Barre, Pennsylvania (Figure
2.7.2.5). We calculated the distance from each patient’s geocoded address to both hospitals in meters in R, and then assigned each patient the smaller of the two
distances.
2.7.2.6 Well water supply
Studies on the effect of UNGD on home prices found larger decreases in home prices for homes on well water near unconventional wells compared to homes at the same distance but with a public water supply. The authors proposed the possible
explanation that, given public concern about the effect of UNGD on groundwater, having well water may increase the perception of risk from UNGD.52,53 We created a well water
variable using the shapefile of the public water supplier's service areas from the
Pennsylvania Department of Environmental Protection (Figure 2.7.2.6).54 We assumed
that patients with geocoded coordinates inside the shapefile had public water, and that
those outside the shapefile had well water.
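The inside/outside assignment above is a point-in-polygon test. The sketch below uses the standard ray-casting algorithm on simple polygons given as coordinate lists; it is an illustration only, since the study's own GIS workflow is not specified here, and the function names and the flat list-of-polygons input are assumptions.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count crossings of a ray running in the +x direction."""
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def water_supply(x, y, service_areas):
    """Classify a geocoded point as 'public' if it falls in any service
    area polygon, otherwise 'well' (the assumption made in the text)."""
    if any(point_in_polygon(x, y, poly) for poly in service_areas):
        return "public"
    return "well"


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(water_supply(2, 2, [square]))
print(water_supply(5, 2, [square]))
```

Real service-area shapefiles can contain holes and multipart polygons, which this simple version does not handle.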
Figure 2.7.2.6. Public water supply areas in Pennsylvania.
2.7.2.7 Greenness
We assigned a measure of residential greenness, NDVI, which is from the
MODIS satellite (Section 2.1.7), to each patient’s geocoded home address. The NDVI data were from NASA on a 250-meter by 250-meter grid. We resampled to a five-image by five-image grid using the Focal Statistics tool in ArcGIS. Using the grid from the resampling, the value assigned to each geocoded address was the mean NDVI of the surrounding 25 images (1,250 meters by 1,250 meters).
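The 5 by 5 focal mean can be illustrated with a small pure-Python function. This is not the ArcGIS Focal Statistics tool; in particular, the edge behavior (averaging only the cells that exist near a raster boundary) is an assumption made for the sketch.

```python
def focal_mean(grid, row, col, size=5):
    """Mean of the size-by-size window centered on (row, col).

    grid: list of equal-length rows of NDVI values (250 m cells), so a
    5x5 window spans 1,250 m by 1,250 m.
    """
    half = size // 2
    values = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            # Assumed edge handling: skip cells that fall outside the raster.
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                values.append(grid[r][c])
    return sum(values) / len(values)


# A 7x7 example raster whose values increase linearly across cells.
raster = [[r * 7 + c for c in range(7)] for r in range(7)]
print(focal_mean(raster, 3, 3))
```

For a raster that varies linearly, the focal mean at an interior cell equals the cell's own value, which is a convenient sanity check.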
2.7.3 UNGD activity metrics
The general purpose of exposure assessment is to rank study participants on intensity, duration, and/or frequency of an exposure for relevant time periods considering
disease latency and recent vs. cumulative exposure depending on consideration of how exposure is thought to cause acute vs. chronic disease.55 In our studies, we needed an
exposure assessment method that could be used retrospectively and would be sensitive
to the time-varying nature of UNGD. We wanted the metric to capture all potential
pathways for UNGD to affect health, and for the metric to rank patients higher who lived
closer to wells, among a greater density of wells, and/or near larger wells. We used a
gravity metric, which has been used in epidemiology studies previously, for example, in
a study of infections from methicillin-resistant Staphylococcus aureus in relation to industrial food animal production.33
2.7.3.1 Durations of phases of well development
We estimated the durations of well development phases to determine when wells would contribute to the UNGD metrics using descriptions of the process and our data
(Figure 2.7.3.1).56 We assumed that pad preparation lasted 30 days for the first well on each pad. We estimated the duration of drilling by assuming the largest well was drilled in 30 days. We divided the largest total depth of an unconventional well drilled as of
2013 (20,664 feet) by 30 and assumed that value, 688.8 feet, was the depth of a well that would be drilled in one day. We assumed that stimulation took seven days and that production was daily for each production period that a well reported production.
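The duration assumptions above translate into simple date arithmetic. In the sketch below, the constants come from the text, while rounding drilling time up to whole days, placing pad preparation in the 30 days immediately before the spud date, and the function names are assumptions for illustration.

```python
import math
from datetime import date, timedelta

MAX_DEPTH_FT = 20664     # deepest unconventional well drilled as of 2013
MAX_DRILLING_DAYS = 30   # assumed drilling time for that deepest well
PAD_PREP_DAYS = 30
STIMULATION_DAYS = 7


def drilling_days(total_depth_ft):
    """Days of drilling at 20,664 / 30 = 688.8 feet per day.

    Rounding up to whole days (minimum one day) is an assumption.
    """
    return max(1, math.ceil(total_depth_ft * MAX_DRILLING_DAYS / MAX_DEPTH_FT))


def phase_windows(spud_date, total_depth_ft, stimulation_start, first_well_on_pad):
    """Inclusive (start, end) dates for each pre-production phase."""
    windows = {}
    if first_well_on_pad:
        # Assumption: pad preparation ends the day before drilling starts.
        windows["pad"] = (spud_date - timedelta(days=PAD_PREP_DAYS),
                          spud_date - timedelta(days=1))
    windows["drilling"] = (
        spud_date,
        spud_date + timedelta(days=drilling_days(total_depth_ft) - 1))
    windows["stimulation"] = (
        stimulation_start,
        stimulation_start + timedelta(days=STIMULATION_DAYS - 1))
    return windows


print(drilling_days(6888))
print(phase_windows(date(2010, 3, 1), 6888, date(2010, 5, 1), True))
```

A well contributes to a given phase metric only on days that fall inside that phase's window, which is what makes the metrics time-varying.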
Figure 2.7.3.1. Timeline of well development with estimated durations of each phase.
2.7.3.2 Assignment of unconventional natural gas activity metrics for wells
For each of four phases of well development (pad preparation, drilling,
stimulation, and production), the metric was assigned using Equation 2.7.3.1.
Equation 2.7.3.1 Activity metrics for unconventional natural gas wells

$$\text{Metric for participant } j = \sum_{d} \sum_{i=1}^{n} \frac{s_i}{d_{ij}^{2}}$$

In Equation 2.7.3.1, $n$ was the number of wells in the given phase, $d_{ij}^{2}$ was the squared distance (meters) between well $i$ and participant $j$, and $s_i$ was 1 for the pad preparation and drilling phases, total well depth (meters) of well $i$ for the stimulation phase, and daily natural gas production volume (m3) of well $i$ for the production phase.
We summed the metric over the days $d$ in the relevant period of exposure for each outcome ($d$ is negative in the formula above because it represents days before the index date). For the asthma exacerbation outcomes, we assigned the metrics on the single day before the index date. We did this because studies of air pollution and asthma typically use a short window of exposure (Table 1.6.1). We compared several lags and durations for the drilling metric for a subset of 446 randomly selected asthma hospitalizations (Table 2.7.3.2). Because the resulting metrics were highly correlated, for computational simplicity, we used a lag of one day and duration of one day for each UNGD metric.
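A minimal sketch of the gravity-style metric described above: each well active in a phase contributes its scale factor divided by its squared distance to the participant, and contributions are summed over wells and over the days of the exposure period. Using projected coordinates in meters, the (x, y, s) tuple layout, and keying days as negative offsets from the index date are illustrative assumptions.

```python
def daily_metric(participant_xy, active_wells):
    """Sum of s_i / d_ij^2 over wells active in one phase on one day.

    active_wells: list of (x, y, s) with coordinates in meters in a
    projected system (assumed) and s the phase-specific scale factor.
    A well at exactly the participant's location would divide by zero;
    this sketch assumes that does not occur.
    """
    px, py = participant_xy
    total = 0.0
    for x, y, s in active_wells:
        squared_distance = (x - px) ** 2 + (y - py) ** 2
        total += s / squared_distance
    return total


def period_metric(participant_xy, wells_by_day, days):
    """Sum the daily metric over day offsets relative to the index date
    (e.g. [-1] for the prior day, range(-14, 0) for the prior 14 days)."""
    return sum(daily_metric(participant_xy, wells_by_day.get(d, []))
               for d in days)


wells_by_day = {
    -1: [(1000.0, 0.0, 1.0)],
    -2: [(1000.0, 0.0, 1.0), (0.0, 2000.0, 4.0)],
}
print(period_metric((0.0, 0.0), wells_by_day, [-1]))
print(period_metric((0.0, 0.0), wells_by_day, range(-14, 0)))
```

Because distance enters as an inverse square, a well at half the distance contributes four times as much, so the metric ranks participants higher when they live closer to wells, among more wells, or near larger wells.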
For the depression symptoms outcome, for each phase of development, the metric was summed over the 14 days prior to the date of the returned follow-up questionnaire, because the depression symptom questionnaire asked about the prior
14 days.20 In the analysis to evaluate mediation by fatigue or migraine of the UNGD-
depression symptom association, we assigned the UNGD metric for the three months
before baseline questionnaire return because the prior study that evaluated associations
of UNGD with symptoms from the baseline questionnaire also summed the UNGD
metric over three months.57
Table 2.7.3.2. Spearman correlation coefficient of the drilling metric assigned for different durations and lags for 446 randomly chosen asthma hospitalizations.
 | Lagged day 3 | Lagged day 1 | Lagged days 3-5 | Lagged days 1-5
Lagged day 3 | 1 | | |
Lagged day 1 | 0.97 | 1 | |
Lagged days 3-5 | 0.99 | 0.96 | 1 |
Lagged days 1-5 | 0.99 | 0.98 | 0.99 | 1
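For readers unfamiliar with the statistic in Table 2.7.3.2, the sketch below computes a Spearman correlation as the Pearson correlation of average ranks. It is a generic illustration with made-up toy inputs, not the analysis code or data behind the table.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation of two equal-length sequences."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values only: two hypothetical metric vectors for three index dates.
print(spearman([1.0, 2.0, 3.0], [1.0, 3.0, 2.0]))
```

A rank-based coefficient suits the UNGD metrics because they are highly right-skewed and are later analyzed in ranked groups rather than on their raw scale.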
2.7.3.3 Assignment of unconventional natural gas activity metrics for impoundments and
compressors
In the exposure assessment study (Chapter 5), the UNGD metrics for
compressor engines and impoundments were also assigned using Equation 2.7.3.1. For
compressor engines, $s_i$ was the compressor engine horsepower. Engines contributed to the metric from their start date to their removal date. For impoundments, $s_i$ was the area
(m2) of the impoundment; each impoundment contributed to the metric from its installation date to its removal date.
2.8 References
1. Paulus RA, Davis K, Steele GD. Continuous innovation in health care: Implications of
the Geisinger experience. Health Affairs. 2008;27(5):1235-1245.
2. Casey JA, Schwartz BS, Stewart WF, Adler NE. Using electronic health records for population health research: A review of methods and applications. Annu Rev Public
Health. 2015(0).
3. Pennsylvania Code. Unconventional well report act. http://www.legis.state.pa.us/cfdocs/Legis/LI/uconsCheck.cfm?txtType=HTM&yr=2014&sessInd=0&smthLwInd=0&act=173. Updated 2014. Accessed 1/11, 2017.
4. Pennsylvania Code. Act 13. http://www.legis.state.pa.us/CFDOCS/LEGIS/LI/uconsCheck.cfm?txtType=HTM&yr=2012&sessInd=0&smthLwInd=0&act=0013.&CFID=126352892&CFTOKEN=56814378. Updated 2012. Accessed 1/11, 2017.
5. Pennsylvania Code. Oil and gas act. http://www.legis.state.pa.us/WU01/LI/LI/CT/HTM/58/00.032..HTM. Updated 1984.
Accessed 1/11, 2017.
6. Roy AA, Adams PJ, Robinson AL. Air pollutant emissions from the development, production, and processing of Marcellus shale natural gas. J Air Waste Manage Assoc.
2013;64(1):19-37.
7. Litovitz A, Curtright A, Abramzon S, Burger N, Samaras C. Estimation of regional air-quality damages from Marcellus shale natural gas extraction in Pennsylvania.
Environmental Research Letters. 2013;8(1):014017.
8. Pennsylvania Department of Environmental Protection Bureau of Air Quality. GP-5 fact sheet.
http://www.dep.state.pa.us/dep/deputate/airwaste/aq/permits/gp/Fact_Sheet_GP5.pdf.
Accessed 1/12, 2017.
9. Pennsylvania Department of Conservation and Natural Resources. Exploration and development well information network.
http://dcnr.state.pa.us/topogeo/econresource/oilandgas/EDWIN_home/index.htm.
Updated 2016. Accessed 1/5, 2017.
10. U.S. Census Bureau. History of the census. 2010 Overview Web site.
https://www.census.gov/history/www/through_the_decades/overview/2010_overview_1.html. Updated 2016. Accessed 1/11, 2017.
11. National Aeronautics and Space Administration. Landsat overview. Updated 2015.
Accessed 1/12, 2017.
12. National Oceanic and Atmospheric Administration. The earth at night: Suomi NPP
satellite offers unprecedented views.
http://research.noaa.gov/News/TabId/496/ArtMID/1377/ArticleID/10133/The-Earth-at-night-Suomi-NPP-satellite-offers-unprecedented-views.aspx. Updated 2012. Accessed
1/12, 2017.
13. National Aeronautics and Space Administration. Moderate resolution imaging spectroradiometer. http://terra.nasa.gov/about/terra-instruments/modis Web site.
Accessed 1/12, 2017.
14. U.S. Department of Agriculture. National agriculture imagery program. https://www.fsa.usda.gov/programs-and-services/aerial-photography/imagery-programs/naip-imagery/index. Updated 2013. Accessed 1/10, 2017.
15. SkyTruth. TADPOLE Pennsylvania results. http://frack.skytruth.org/frackfinder/frackfinder-news/tadpolepennsylvaniaresults.
Published Feb 12, 2014. Updated 2014. Accessed June 30, 2014.
16. Pacheco JA, Avila PC, Thompson JA, et al. A highly specific algorithm for identifying asthma cases and controls for genome-wide association studies. AMIA Annu Symp
Proc. 2009;2009:497-501.
17. Hirsch AG, Stewart WF, Sundaresan AS, et al. Nasal and sinus symptoms and chronic rhinosinusitis in a population-based sample. Allergy. 2016.
18. Patient-Reported Outcomes Measurement Information System. PROMIS fatigue short form 8a. http://www.assessmentcenter.net. Updated 2015. Accessed October 10,
2015.
19. Lipton RB, Dodick D, Sadovsky R, et al. A self-administered screener for migraine in primary care: The ID migraine validation study. Neurology. 2003;61(3):375-382.
20. Kroenke K, Strine TW, Spitzer RL, Williams JB, Berry JT, Mokdad AH. The PHQ-8 as a measure of current depression in the general population. J Affect Disord.
2009;114(1):163-173.
21. Balkrishnan R, Rasu RS, Rajagopalan R. Physician and patient determinants of pharmacologic treatment of sleep difficulties in outpatient settings in the United States.
Sleep. 2005;28(6):715.
22. Verbesselt J, Hyndman R, Newnham G, Culvenor D. Detecting trend and seasonal changes in satellite image time series. Remote Sens Environ. 2010;114(1):106-115.
23. Moorman JE, Zahran H, Truman BI, Molla MT, Centers for Disease Control and
Prevention (CDC). Current asthma prevalence: United States, 2006-2008. MMWR
Surveill Summ. 2011;60(Suppl):84-86.
24. Moorman JE, Person CJ, Zahran HS. Asthma attacks among persons with current
asthma: United States, 2001–2010. MMWR Surveill Summ. 2013;62(suppl 3):93-98.
25. Pratt LA, Brody DJ. Depression in the U.S. household population, 2009-2012. NCHS
Data Brief. 2014;(172):1-8.
26. Pratt LA, Brody DJ, Gu Q, National Center for Health Statistics (US). Antidepressant
use in persons aged 12 and over: United States, 2005-2008. 2011.
27. Johnston NW, Sears MR. Asthma exacerbations. 1: Epidemiology. Thorax.
2006;61(8):722-728.
28. Johnston NW, Johnston SL, Norman GR, Dai J, Sears MR. The September epidemic
of asthma hospitalization: School children as disease vectors. J Allergy Clin Immunol.
2006;117(3):557-562.
29. Moorman JE, Akinbami LJ, Bailey CM, et al. National surveillance of asthma: United
States, 2001-2010. National Center for Health Statistics, Vital Health Stat. 2012;3:35.
30. National Heart, Lung, and Blood Institute. National Asthma Education Program.
Expert Panel on the Management of Asthma. Expert panel report 3: Guidelines for the
diagnosis and management of asthma: Full report. US Department of Health and Human
Services, National Institutes of Health, National Heart, Lung, and Blood Institute; 2007.
31. Munafo MR, Araya R. Cigarette smoking and depression: A question of causation. Br
J Psychiatry. 2010;196(6):425-426.
32. Schreier A, Höfler M, Wittchen H, Lieb R. Clinical characteristics of major depressive disorder run in families–a community study of 933 mothers and their children. J
Psychiatr Res. 2006;40(4):283-292.
33. Casey JA, Curriero FC, Cosgrove SE, Nachman KE, Schwartz BS. High-density livestock operations, crop field application of manure, and risk of community-associated methicillin-resistant Staphylococcus aureus infection in Pennsylvania. JAMA Internal
Medicine. 2013;173(21):1980-1990.
34. Schwartz BS, Bailey-Davis L, Bandeen-Roche K, et al. Attention deficit disorder, stimulant use, and childhood body mass index trajectory. Pediatrics. 2014;133(4):668-
676.
35. Preiss K, Brennan L, Clarke D. A systematic review of variables associated with the relationship between obesity and depression. Obesity Reviews. 2013;14(11):906-918.
36. Ogden CL, Carroll MD, Kit BK, Flegal KM. Prevalence of childhood and adult obesity in the United States, 2011-2012. JAMA. 2014;311(8):806.
37. Centers for Disease Control and Prevention. A SAS Program for the 2000 CDC
Growth Charts (ages 0 to< 20 years). 2014.
38. Mair C, Diez Roux AV, Galea S. Are neighbourhood characteristics associated with depressive symptoms? A review of evidence. J Epidemiol Community Health.
2008;62(11):940-6.
39. Stahre M. Contribution of excessive alcohol consumption to deaths and years of potential life lost in the United States. Preventing Chronic Disease. 2014;11.
40. Schwartz BS, Glass TA, Pollak J, et al. Depression, its comorbidities and treatment, and childhood body mass index trajectories. Obesity (Silver Spring). 2016.
41. Schwartz BS, Stewart WF, Godby S, et al. Body mass index and the built and social environments in children and adolescents using electronic health records. Am J Prev
Med. 2011;41(4):e17-e28.
42. U.S. Census Bureau. TIGER/line shapefiles. https://www.census.gov/geo/maps-data/data/tiger-line.html. Updated 2016. Accessed 5/24, 2016.
43. U.S. Census Bureau. Understanding geographic identifiers. https://www.census.gov/geo/reference/geoidentifiers.html. Updated 2015. Accessed
5/24, 2016.
44. Pickett KE, Pearl M. Multilevel analyses of neighbourhood socioeconomic context and health outcomes: A critical review. J Epidemiol Community Health. 2001;55(2):111-
122.
45. Townsend P, Phillimore P, Beattie A. Health and deprivation:
Inequality and the North. London; New York: Croom Helm; 1988.
46. Nau C, Schwartz BS, Bandeen‐Roche K, et al. Community socioeconomic deprivation and obesity trajectories in children using electronic health records. Obesity.
2015;23(1):207-212.
47. Liu AY, Curriero FC, Glass TA, Stewart WF, Schwartz BS. The contextual influence of coal abandoned mine lands in communities and type 2 diabetes in Pennsylvania.
Health Place. 2013.
48. Buckley JP, Richardson DB. Seasonal modification of the association between temperature and adult emergency department visits for asthma: A case-crossover study.
Environ Health. 2012;11(1):55.
49. National Climatic Data Center. Climate Data Online Web site. Accessed May 11,
2011.
50. McConnell R, Berhane K, Yao L, et al. Traffic, susceptibility, and childhood asthma.
Environ Health Perspect. 2006:766-772.
51. U.S. Department of Transportation Federal Highway Administration. Highway
Performance Monitoring System Web site. http://www.fhwa.dot.gov/policyinformation/hpms/shapefiles.cfm. Updated 2013.
Accessed March 27, 2015.
52. Gopalakrishnan S, Klaiber HA. Is the shale energy boom a bust for nearby residents? Evidence from housing values in Pennsylvania. Am J Agric Econ.
2014;96(1):43-66.
53. Muehlenbachs L, Spiller E, Timmins C. The housing market impacts of shale gas development. Am Econ Rev. 2015;105(12):3633-59.
54. Pennsylvania Department of Health. Public water systems. Environmental Health
Tracking Program Web site. http://www.health.pa.gov/My%20Health/Environmental%20Health/Environmental%20Public%20Health%20Tracking/Pages/Metadata-for-Drinking-Water-Quality.aspx#.V0Xr8JErKM8. Updated 2015. Accessed 5/25, 2016.
55. Porta M. A dictionary of epidemiology. Oxford University Press; 2008.
56. Gaines M. PennDOT’s posting and bonding program and impact of unconventional oil & gas. http://extension.psu.edu/natural-resources/natural-gas/webinars/shale-energy-developments-effect-on-the-posting-bonding-and-maintenance-of-roads-in-rural-pa/mark-gaines-may-16-2013-powerpoint. Published May 16, 2013.
57. Tustin AW, Hirsch AG, Rasmussen SG, Casey JA, Bandeen-Roche K, Schwartz BS.
Associations between unconventional natural gas development and nasal and sinus,
migraine headache, and fatigue symptoms in Pennsylvania. Environ Health Perspect.
2016.
Chapter 3: Asthma Exacerbations and Unconventional Natural Gas Development in the Marcellus Shale
3.0 Cover page
Sara G. Rasmussen, MHS1; Elizabeth L. Ogburn, PhD2; Meredith McCormack, MD3; Joan A. Casey, PhD4; Karen Bandeen-Roche, PhD2; Dione G. Mercer, BS5; and Brian S. Schwartz, MD, MS1,3,5
1Department of Environmental Health Sciences, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA; 2Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA; 3Department of Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland, USA; 4Robert Wood Johnson Foundation Health and Society Scholars Program, UC San Francisco and UC Berkeley, California, USA; 5Center for Health Research, Geisinger Health System, Danville, Pennsylvania, USA
Acknowledgements: We thank Joseph J. DeWalle, BS, Jennifer K. Irving, BA, and Joshua M. Crisp, BS (Geisinger Center for Health Research) for patient geocoding and assistance in assembling the UNGD dataset; SkyTruth in Shepherdstown, WV for the well pad data; Kirsten Koehler, PhD (JHSPH) for assistance with the temperature data; Kara Rudolph, PhD, MHS (Robert Wood Johnson Foundation Health and Society Scholars Program, UC San Francisco and UC Berkeley) for the code for the unmeasured confounder graphs; and Jonathan S. Pollak, MPP (JHSPH) for identifying the asthma patients from the general Geisinger population. All except KK and KR received compensation for their contributions. This study was funded by the National Institute of Environmental Health Sciences grant ES023675-01 (PI: B S Schwartz) and training grant ES07141 (S G Rasmussen). Additional support was provided by the Degenstein Foundation for compilation of well data, the Robert Wood Johnson Foundation Health & Society Scholars program (J A Casey), and the National Science Foundation Integrative Graduate Education and Research Traineeship (S G Rasmussen). No funders had input into the study design, conduct, data collection or analysis, or manuscript preparation.
Rasmussen SG, Ogburn EL, McCormack M, et al. Association between unconventional natural gas development in the Marcellus shale and asthma exacerbations. JAMA Intern Med. 2016;176(9):1334-1343.
3.1 Abstract
Importance: Asthma is common and can be exacerbated by air pollution and stress.
Unconventional natural gas development (UNGD) has community and environmental impacts. In Pennsylvania, development began in 2005 and by 2012, 6,253 wells were drilled. There are no prior studies of UNGD and objective respiratory outcomes.
Objective: To evaluate associations between UNGD and asthma exacerbations.
Design: A nested case-control study comparing asthma patients with exacerbations to asthma patients without exacerbations from 2005-12.
Setting: The Geisinger Clinic, which provides primary care services to over 400,000 patients in Pennsylvania.
Participants: Asthma patients aged 5-90 years (n = 35,508) were identified in electronic health records; those with exacerbations were frequency-matched on age, sex, and year of event to those without.
Exposure(s): On the day before each patient’s index date (cases: date of event or medication order; controls: contact date), we estimated UNGD activity metrics for four phases (pad preparation, drilling, stimulation [“fracking”], and production) using distance from the patient’s home to the well, well characteristics, and the dates and durations of phases.
Main Outcome(s) and Measure(s): We identified mild, moderate, and severe asthma exacerbations (new oral corticosteroid medication order, emergency department encounter, and hospitalization, respectively).
Results: We identified 20,749 mild, 1,870 moderate, and 4,782 severe asthma exacerbations, and frequency-matched these to 14,104, 9,350, and 18,693 control index dates, respectively. In three-level adjusted models, there was an association between the highest group of the activity metric for each UNGD phase compared to the lowest
group for 11 out of 12 UNGD-outcome pairs (odds ratios [95% CI] ranged from 1.5 [1.2-1.7] for the association of the pad metric with severe exacerbations to 4.4 [3.8-5.2] for the association of the production metric with mild exacerbations). Six of the 12 UNGD-outcome associations had increasing odds ratios across quartiles. Our findings were robust to increasing levels of covariate control and in sensitivity analyses that included evaluation of some possible sources of unmeasured confounding.
Conclusions and Relevance: Residential UNGD activity metrics were statistically associated with increased odds of mild, moderate, and severe asthma exacerbations.
Whether these associations are causal awaits further investigation, including more detailed exposure assessment.
3.2 Introduction
Asthma is a common, chronic disease; in 2010, 25.7 million people in the
United States had asthma, a prevalence of 8.4%.1 Asthma is characterized by variable and recurring symptoms (including cough, wheezing, shortness of breath, and chest tightness), reversible airflow obstruction, bronchial hyper-responsiveness, and underlying inflammation.2,3 In 2009, there were 11.8 million outpatient visits, 2.1 million emergency department visits, and 479,300 hospitalizations for asthma in the US.1
Outdoor air pollution is a recognized cause of asthma exacerbations. A large body of literature links asthma exacerbations to exposure to air pollutants, including ozone, particulate matter, nitrogen dioxide, and sulfur dioxide,2,4 and exposure to even low levels of these pollutants has been associated with asthma hospitalizations, emergency department visits, and rescue medication use, with latency between 0 and 5 days.5-11 Stress at the individual and community levels is also associated with asthma exacerbations.12 Psychosocial stress can modify the effects of environmental triggers13 and is associated with worse asthma control and medication adherence.14
Unconventional natural gas development (UNGD) has recently become a major
energy source domestically and worldwide. Pennsylvania has proceeded with UNGD
rapidly – between the mid-2000s and 2012, 6,253 wells were drilled. In contrast, New
York and Maryland, also in the Marcellus shale, have not developed.15,16 Despite calls
for research on the health effects of the industry, there are few published studies of
public health impacts of UNGD.17,18
The first step of UNGD is well pad preparation, lasting about 30 days,
during which 3-5 acres are cleared and materials are brought to the site.19 Drilling begins
on the spud date and typically lasts up to a month as a well is drilled vertically 2,000-
3,000 meters and horizontally 600-3,000 meters.19 After drilling is completed, the
horizontal portion is perforated. Stimulation, also called hydraulic fracturing or “fracking,”
follows, lasts around a week, and requires 11-19 million liters of water, sand, and
chemical additives (e.g., friction reducers, biocides, gelling agents).19,20 Development to
this point requires over 1,000 truck trips per well.19 After stimulation, gas production
begins. The Pennsylvania Department of Environmental Protection (PA DEP) requires
companies to submit documentation at most of these stages of well development.21
UNGD has been associated with air quality and community social impacts.22-29
Psychosocial stress,12 exposure to air pollution4,30 including truck traffic,31 sleep disruption,32,33 and reduced socioeconomic status34 are all biologically plausible
pathways for UNGD to affect asthma exacerbations. To date, there have been no
epidemiologic studies of UNGD and objective respiratory outcomes. Respiratory
outcomes are appropriate outcomes to assess potential health impacts of UNGD
because these have clear links to air pollution and stress; have short latency between
exposure and health effects; are common in the general population; and prompt patients
to seek care so are captured by health system data. Using electronic health record
(EHR) data from the Geisinger Clinic, located in over 35 counties in Pennsylvania,
including many with active UNGD, we conducted a nested case-control study of the association between four UNGD activity metrics and asthma exacerbations.
3.3 Methods
3.3.1 Study population
We identified asthma patients from the Geisinger Clinic population, which is representative of the general population in the region.35 We included Pennsylvania and
New York patients and excluded patients with cystic fibrosis (277.0x); chronic pulmonary heart disease (416.x); paralysis of vocal cords or larynx (478.3x); bronchiectasis
(494.xx); and pneumoconiosis (500.xx-508.xx) using International Classification of
Diseases, 9th Revision, Clinical Modification (ICD-9) codes. We required patients to have at least two encounters or medication orders with ICD-9 codes for asthma on different days.36 Patients were geocoded using previously published methods,37 88.9% to home address, 2.6% to ZIP+4, and 8.5% to ZIP code centroid. Inclusion criteria also included contact with Geisinger from 2005-2012 while between the ages 5-90 years and recorded information on sex (n=35,508). The study was approved by the Geisinger
Institutional Review Board (with an IRB authorization agreement with Johns Hopkins
Bloomberg School of Public Health). Patients did not receive a stipend and informed consent was obtained through a waiver of HIPAA authorization.
3.3.2 Outcome Ascertainment
We identified new oral corticosteroid (OCS) medication orders, asthma emergency department encounters, and asthma hospitalizations, termed mild, moderate, and severe exacerbations, respectively. For patients with more than one exacerbation of a given type within a calendar year, we randomly selected one event. For mild exacerbations, we distinguished new OCS medication orders from 2008-2012 for an asthma exacerbation from standing orders or OCS ordered for other diseases (Figure
3.3.2). The medication order date was considered the index date. OCS orders from
before 2008 were excluded because these were not consistently captured before then.
For moderate and severe exacerbations, we identified all emergency and hospitalization encounters from 2005-12. Primary or secondary diagnoses of asthma (493.x) were used to identify emergency or hospitalization encounters. Patients who had multiple emergency or hospitalization encounters within 72 hours were considered to have a single event. Emergency and hospitalization encounters within 72 hours of each other were identified as a single hospitalization. The first encounter or admission date of each group of combined encounters was the index date. For patients with more than one type of exacerbation within a week, we retained only the more severe category.
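The 72-hour grouping rule can be sketched as a single pass over time-sorted encounters. This is an illustration under assumptions: the (datetime, kind) tuple layout and function name are hypothetical, and the window is measured from the most recent encounter in a group, which is one possible reading of "within 72 hours".

```python
from datetime import datetime, timedelta


def merge_encounters(encounters, window=timedelta(hours=72)):
    """Combine one patient's encounters that fall within 72 hours.

    encounters: list of (datetime, kind), kind being "emergency" or
    "hospitalization". Returns one dict per combined event, with the
    first encounter's date as the index date.
    """
    events = []
    for when, kind in sorted(encounters):
        if events and when - events[-1]["last"] <= window:
            event = events[-1]
            event["last"] = when
            # A group containing any hospitalization counts as one.
            if kind == "hospitalization":
                event["kind"] = "hospitalization"
        else:
            events.append({"index_date": when, "last": when, "kind": kind})
    return events


visits = [
    (datetime(2010, 1, 1, 9), "emergency"),
    (datetime(2010, 1, 2, 15), "hospitalization"),
    (datetime(2010, 1, 10, 8), "emergency"),
]
for event in merge_encounters(visits):
    print(event["index_date"], event["kind"])
```

In the example, the emergency visit and the admission a day later collapse into one hospitalization indexed to the first visit, while the visit nine days later remains a separate emergency event.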
Figure 3.3.2. Flow diagram for identification of new asthma oral corticosteroid (OCS) medication orders.
3.3.3 Controls and Matching
We identified controls from asthma patients under observation by the health system, so that if the patient were to have an exacerbation, it would be captured by the
EHR. All patient contact dates were identified (e.g., encounter, order, test). Because many of the covariates and the UNGD metrics were time-varying, we needed a single
date on which to assign these variables. Therefore, for controls, we randomly selected
one contact date per year per patient. A case was always eligible to be a control for a
less severe event; or for an event of equal or greater severity until the year of the case’s
event. We frequency-matched cases to controls on age category (5-12, 13-18, 19-44,
45-61, 62-74, 75+ years), sex (male, female), and year of encounter.
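The control selection and frequency-matching strata described above can be illustrated as follows. This is a sketch under stated assumptions, not the study's implementation: all function names are hypothetical, ages are assumed to be whole years, and the random seed is arbitrary.

```python
import random
from collections import defaultdict
from datetime import date

# Age bands as listed in the text (inclusive bounds, whole years).
AGE_BANDS = [(5, 12), (13, 18), (19, 44), (45, 61), (62, 74)]


def age_category(age_years):
    """Map an age in whole years to the matching category."""
    if age_years >= 75:
        return "75+"
    for low, high in AGE_BANDS:
        if low <= age_years <= high:
            return f"{low}-{high}"
    raise ValueError("age outside the study range")


def matching_stratum(age_years, sex, year):
    """Frequency-matching stratum: age category, sex, year of encounter."""
    return (age_category(age_years), sex, year)


def one_contact_per_year(contact_dates, rng):
    """Randomly keep one contact date per calendar year for a patient."""
    by_year = defaultdict(list)
    for contact in contact_dates:
        by_year[contact.year].append(contact)
    return {year: rng.choice(sorted(dates)) for year, dates in by_year.items()}


rng = random.Random(0)  # arbitrary seed, for reproducibility only
contacts = [date(2009, 3, 1), date(2009, 9, 12), date(2010, 1, 5)]
selected = one_contact_per_year(contacts, rng)
```

Cases and controls sharing the same `matching_stratum` tuple would then be frequency-matched within that stratum.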
3.3.4 Covariates
We created time-varying covariates (age, season of event, smoking status,
overweight and obesity, Medical Assistance [as a measure of low family socioeconomic
status], type 2 diabetes) for each index date; and non-time-varying covariates (sex and
race/ethnicity) for each patient. Race/ethnicity was assessed by patient self-report, and
was included because it is a well-documented confounder in studies of asthma.2 We
estimated the patients’ distance to nearest major and minor road using a network from
the Federal Highway Administration,38 and used patients’ geographic coordinates to
assign them to a community using a mixed definition of place and calculated community
socioeconomic deprivation (CSD) for these places.37,39 In cities, communities were
defined by census tracts; elsewhere, communities were defined by minor civil divisions
(townships and boroughs). We estimated the peak temperature on the day before each
index date using data from the nearest weather station to each patient.40 We did not control for place type because of the concern of controlling for exposure.
3.3.5 Well Data
Well data were obtained from: the PA DEP, for well spud (start of drilling) and production; the Pennsylvania Department of Conservation and Natural Resources, for information on well stimulation (hydraulic fracturing) and depths; and SkyTruth
(Shepherdstown, WV), which used crowdsourcing of aerial photographs from the U.S.
Department of Agriculture to identify the location of wellpads.41 For each well, we had
information on well pad; latitude and longitude; dates of spudding, stimulation, and
production; total depth; and volume of natural gas produced and the number of
production days. We imputed missing total depths (0.4%) using conditional mean
imputation. We estimated missing production quantities (0.2%) by averaging production
quantities in the prior and following period. We extrapolated missing spud (2.0%) and
stimulation (34.6%) dates using the well’s available dates of development by requiring
that the stimulation date fall in between the spud and production date and using median
durations between phases from wells without any missing dates.
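The date extrapolation described above might look like the following sketch. It is an illustration only: the dictionary layout and function names are hypothetical, it handles just the case of a missing stimulation date, and it assumes the well's spud and production dates are both known.

```python
from datetime import date, timedelta
from statistics import median

PHASES = ("spud", "stimulation", "production")


def median_days_between(wells, start, end):
    """Median gap in days between two phases, using only complete wells."""
    complete = [w for w in wells if all(w.get(p) is not None for p in PHASES)]
    return median((w[end] - w[start]).days for w in complete)


def impute_stimulation(well, spud_to_stim_days):
    """Fill a missing stimulation date from the median spud-to-stimulation gap."""
    if well.get("stimulation") is not None:
        return well["stimulation"]
    estimate = well["spud"] + timedelta(days=spud_to_stim_days)
    # Require the stimulation date to fall between spud and production.
    return min(max(estimate, well["spud"]), well["production"])


# Made-up wells; the last one is missing its stimulation date.
wells = [
    {"spud": date(2010, 1, 1), "stimulation": date(2010, 1, 11), "production": date(2010, 3, 1)},
    {"spud": date(2010, 2, 1), "stimulation": date(2010, 2, 21), "production": date(2010, 4, 1)},
    {"spud": date(2010, 3, 1), "stimulation": date(2010, 3, 31), "production": date(2010, 6, 1)},
    {"spud": date(2011, 5, 1), "stimulation": None, "production": date(2011, 5, 15)},
]
gap = median_days_between(wells, "spud", "stimulation")
imputed = impute_stimulation(wells[3], gap)
```

In this made-up example the median gap would place stimulation after production, so the estimate is pulled back to the production date to respect the ordering constraint.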
3.3.6 Activity Metric Assignment
We estimated the UNGD activity metrics using an inverse distance-squared method for pad preparation, spud, stimulation, and production phases. We compared activity metrics on the day before, three days before, the sum of three to five days before, and the sum of one to five days before the index date, and because they were highly correlated (Spearman correlation coefficients ranged from 0.96-1.00), we used only the day before the index date.
For the pad preparation and spud metrics, we used Equation 3.3.6.1, where $n$ is the number of wells and $d_{ij}^{2}$ is the squared distance (meters) between well $i$ and patient $j$.
\[ \text{Metric for patient } j = \sum_{i=1}^{n} \frac{1}{d_{ij}^{2}} \]
Equation 3.3.6.1. Pad preparation and spud metric.
For the stimulation metric, we used Equation 3.3.6.2, where $n$ is the number of wells, $d_{ij}^{2}$ is the squared distance (meters) between well $i$ and patient $j$, and $t_i$ is the total well depth (meters) of well $i$.
\[ \text{Metric for patient } j = \sum_{i=1}^{n} \frac{t_i}{d_{ij}^{2}} \]
Equation 3.3.6.2. Stimulation activity metric.
Total depth was used as a surrogate for truck traffic because volume of water used during stimulation42 was highly correlated with total depth, and water is trucked to the well during stimulation. For the production metric, we used Equation 3.3.6.3, where $n$ is the number of wells, $d_{ij}^{2}$ is the squared distance (meters) between well $i$ and patient $j$, and $v_i$ is the daily natural gas production volume (m$^3$) of well $i$.
\[ \text{Metric for patient } j = \sum_{i=1}^{n} \frac{v_i}{d_{ij}^{2}} \]
Equation 3.3.6.3. Production activity metric.
Production volume was used as a surrogate for fugitive emissions and compressor
engine activity.22
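The inverse distance-squared metrics of Equations 3.3.6.1-3.3.6.3 can be sketched as a single function with an optional per-well weight. This is an illustration under assumptions: the text does not say how distances were computed, so the haversine great-circle distance used here is a stand-in, and the function names, field names, and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption of this sketch


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def activity_metric(patient, wells, weight_key=None):
    """Sum of weight / d^2 over the wells active in one phase.

    weight_key=None uses a weight of 1 (pad preparation and spud);
    otherwise the named well field is used, e.g. total depth for
    stimulation or daily production volume for production.
    """
    total = 0.0
    for well in wells:
        d = haversine_m(patient["lat"], patient["lon"], well["lat"], well["lon"])
        weight = 1.0 if weight_key is None else well[weight_key]
        total += weight / d ** 2
    return total


# Made-up patient and wells.
patient = {"lat": 41.0, "lon": -76.0}
wells = [
    {"lat": 41.1, "lon": -76.0, "depth_m": 3000.0},
    {"lat": 41.0, "lon": -76.3, "depth_m": 3400.0},
]
spud_metric = activity_metric(patient, wells)
stimulation_metric = activity_metric(patient, wells, weight_key="depth_m")
```

Only wells in the relevant phase on the day before the index date would be passed in as `wells`; a well at the patient's exact coordinates would need special handling to avoid division by zero.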
Based on descriptions of the process19 and our data, we estimated that pad
development lasted 30 days before the spud date for the first well on a pad, drilling
lasted between 1-30 days after the spud date based on total depth, and stimulation
lasted seven days. All wells in Pennsylvania in a given phase on the day prior to an
index date contributed to that phase’s activity metric (Equations 3.3.6.1-3.3.6.3). We divided the four continuous metrics into quartiles using all 69,548 index dates from all three outcomes so the cutpoints were the same for all outcomes (very low, low, medium, and high).
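Deriving one set of cutpoints from the pooled index dates and applying it to every outcome can be sketched as below. The values are made up, the function names are hypothetical, and the quantile interpolation rule (the default of Python's `statistics.quantiles`) is an assumption, since the text does not state which rule was used.

```python
from bisect import bisect_right
from statistics import quantiles

LABELS = ("very low", "low", "medium", "high")


def quartile_cutpoints(pooled_values):
    """Three cutpoints computed once from all pooled index dates."""
    return quantiles(pooled_values, n=4)


def assign_group(value, cutpoints):
    """Values equal to a cutpoint fall into the higher group."""
    return LABELS[bisect_right(cutpoints, value)]


# Made-up metric values standing in for the pooled index dates.
pooled = [float(v) for v in range(1, 101)]
cuts = quartile_cutpoints(pooled)
groups = [assign_group(v, cuts) for v in pooled]
```

Because the cutpoints come from the pooled data, the share of index dates in each group can differ by outcome even though the cutpoints are identical across outcomes.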
3.3.7 Statistical Analysis
To assess the association of the four UNGD activity metrics with the three types of asthma exacerbations, we used multilevel logistic regression with random intercept for patient and community to account for multiple events per patient and patient clustering within communities. The base model included one of the four UNGD activity metrics
(very low, low, medium, high), age category (5-12, 13-18, 19-44, 45-61, 62-74, 75+
years), sex (male, female), race/ethnicity (black, Hispanic, other, white), family history of
asthma (yes, no), smoking status (former, current, missing, never), season (summer,
fall, winter, spring), Medical Assistance (yes, no), and overweight/obesity (using BMI
percentile for children and BMI for adults43) as covariates. We then added, one at a time, type 2 diabetes (yes, no), CSD (quartiles), distances to nearest major and minor arterial
road (meters, z-transformed), and maximum temperature on the day prior to event (°C
per interquartile range) (Equation 3.3.7). We included the continuous covariates as linear and quadratic terms to allow for non-linearity. We used a 2-sided type 1 error rate of 0.05 for significance testing. We used Stata version 11.2 (StataCorp Inc.) and R version 3.1.2 (R Foundation for Statistical Computing).
Equation 3.3.7. Statistical Model
\[
\begin{aligned}
\operatorname{logit}(Y_{ijk}) ={} & \beta_0 + \beta_1(\text{UNGD}^{a}\ \text{Q}^{b}2)_{ijk} + \beta_2(\text{UNGD Q3})_{ijk} + \beta_3(\text{UNGD Q4})_{ijk} \\
& + \beta_4(\text{age category 13-18})_{ijk} + \beta_5(\text{age category 19-44})_{ijk} + \beta_6(\text{age category 45-61})_{ijk} \\
& + \beta_7(\text{age category 62-74})_{ijk} + \beta_8(\text{age category 75+})_{ijk} + \beta_9(\text{male sex})_{ij} \\
& + \beta_{10}(\text{race/ethnicity, black})_{ij} + \beta_{11}(\text{race/ethnicity, Hispanic})_{ij} + \beta_{12}(\text{race/ethnicity, other/missing})_{ij} \\
& + \beta_{13}(\text{family history of asthma})_{ij} \\
& + \beta_{14}(\text{smoking status, current})_{ijk} + \beta_{15}(\text{smoking status, former})_{ijk} + \beta_{16}(\text{smoking status, missing})_{ijk} \\
& + \beta_{17}(\text{season, summer})_{ijk} + \beta_{18}(\text{season, fall})_{ijk} + \beta_{19}(\text{season, winter})_{ijk} \\
& + \beta_{20}(\text{Medical Assistance})_{ijk} \\
& + \beta_{21}(\text{overweight/obesity, overweight})_{ijk} + \beta_{22}(\text{overweight/obesity, obese})_{ijk} + \beta_{23}(\text{overweight/obesity, BMI missing})_{ijk} \\
& + \beta_{24}(\text{type 2 diabetes})_{ijk} \\
& + \beta_{25}(\text{community socioeconomic deprivation Q2})_{i} + \beta_{26}(\text{community socioeconomic deprivation Q3})_{i} + \beta_{27}(\text{community socioeconomic deprivation Q4})_{i} \\
& + \beta_{28}(\text{distance to nearest major road})_{ij} + \beta_{29}(\text{distance to nearest major road squared})_{ij} \\
& + \beta_{30}(\text{distance to nearest minor road})_{ij} + \beta_{31}(\text{distance to nearest minor road squared})_{ij} \\
& + \beta_{32}(\text{maximum temperature on the day prior to event})_{ij} + \beta_{33}(\text{maximum temperature on the day prior to event squared})_{ij} \\
& + u_{i} + u_{ij}
\end{aligned}
\]
where $i$ = community, $j$ = person, $k$ = index date, and $(u_{i}, u_{ij})$ are independent, normally distributed random effects with mean 0 and separate variances.
a Unconventional natural gas development activity metric
b Quartile
3.3.7.1 Model Building
We calculated the intraclass correlation coefficient for the person and community levels. The proportions of total variance that were accounted for by between-community variation and between-person variation, respectively, were 14% and 63% for severe exacerbations, 41% and 89% for moderate exacerbations, and 1.2% and 59% for mild
exacerbations. We evaluated covariates for conditional significance as they were added
to the models.
3.3.7.2 Sensitivity Analyses
To evaluate how the four separate UNGD activity metrics compared to a
summary measure, we calculated z-scores using continuous metrics, summed the z-
scores, and re-ran the final models with this combined UNGD activity metric (quartiles).
To explore whether an unmeasured confounder was responsible for our associations, we evaluated associations with encounters for a negative control44 (intestinal infectious disease and noninfectious gastroenteritis, ICD-9 codes 001-009 and 558.9) among asthma patients, and we also replaced the UNGD activity metric with indicators for counties. We were concerned about the unbalanced numbers of cases and controls for certain age categories, sex, and years in the mild exacerbations analysis, so we reran the analysis dropping the unbalanced cells. In order to check the sensitivity to geocoding level, we reran the final model for the production UNGD metric and each outcome using only patients who were geocoded to their home address. We estimated how large an unmeasured confounder would need to be to account for the observed associations, in whole or in part.45
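The combined UNGD activity metric described at the start of this subsection (z-scores of the continuous metrics, summed) can be sketched as follows. The values are made up and the function names are hypothetical; the use of the sample standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize one continuous metric (sample standard deviation)."""
    centre, spread = mean(values), stdev(values)
    return [(v - centre) / spread for v in values]


def combined_metric(*metrics):
    """Sum the z-scores of several metrics, index date by index date."""
    standardized = [z_scores(m) for m in metrics]
    return [sum(column) for column in zip(*standardized)]


# Made-up values for two metrics on three index dates.
pad = [1.0, 2.0, 3.0]
spud = [10.0, 20.0, 30.0]
combined = combined_metric(pad, spud)
```

The summed score would then be divided into quartiles in the same way as the individual metrics before re-running the final models.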
3.4 Results
3.4.1 Descriptions of Wells and Patients
From 2005 to 2012, 6,253 unconventional natural gas wells were spudded on
2,710 pads, 4,728 were stimulated, and 3,706 were in production. The median number of wells per pad was 1 (IQR 1-3) and median total depth was 3,394m (IQR 2,934-3,839).
Most development occurred after 2007 (Figure 3.4.1.1). On their index date, patients in the highest group of the spud metric lived a median of 19km from the closest spudded well, compared to 63km for patients in the lowest group. We identified 5,600 severe,
2,291 moderate, and 25,647 mild exacerbations. After retaining one event per type per year per person, 4,782 severe, 1,870 moderate, and 20,749 mild exacerbations were included. There was substantial overlap of patients and wells in the northern counties
(Figure 3.4.1.2), and substantial overlap of patients by quartile of UNGD activity metric
(Figure 3.4.1.3).
Figure 3.4.1.1. Number of developed pads (blue), and spudded (red), stimulated (green), and producing wells (yellow), 2005-12.
Figure 3.4.1.2 The location of spudded wells as of December 2012 and residential location of Geisinger asthma patients.
Figure 3.4.1.3 Locations of cases and controls by quartile of spud activity metric.
Demographic and clinical variables differed by outcome (Table 3.4.1). Compared to patients with mild and moderate exacerbations, patients with severe exacerbations were more likely to be female, older, current smokers, and obese (all p<0.001). Patients with moderate exacerbations were more likely to be on Medical Assistance and of black race than patients with the other two outcomes, and patients with mild exacerbations were more likely to live in townships (all p<0.001) than patients with the other two outcomes.
3.4.2 Associations of UNGD Activity Metrics with Asthma Outcomes
For severe, moderate, and mild exacerbations, the average percent changes for all odds ratios, from simple models with random intercepts for person and place without
covariates to fully adjusted multilevel models, were -8.5%, -0.2%, and 6.0%,
respectively, suggesting little sensitivity of the associations to measured covariates. In
adjusted models, the high activity (vs. very low) of each UNGD metric was associated
with each asthma outcome (Table 3.4.2), except for the pad metric with moderate
exacerbations. Associations for the other 11 exposure-outcome pairs ranged from (odds ratio [95% confidence interval]) 1.5 (1.2-1.7) for pad metric with severe exacerbations to
4.4 (3.8-5.2) for production metric with mild exacerbations. Of the 12 activity metric-outcome pairs, six had increasing odds ratios across quartiles 2-4.
Table 3.4.1. Descriptive statistics of cases and controls by exacerbation type for selected study variables by variable type (constant vs. time-varying)

| Variable | Hospitalization: Control n (%)^a | Hospitalization: Case n (%) | Emergency Department Encounter: Control n (%) | Emergency Department Encounter: Case n (%) | Oral Corticosteroid Order: Control n (%) | Oral Corticosteroid Order: Case n (%) |
|---|---|---|---|---|---|---|
| Non-time-varying (constant) variables | | | | | | |
| Total number of patients | 14104 (100) | 3576 (100) | 9350 (100) | 1454 (100) | 18693 (100) | 13196 (100) |
| Female | 10093 (71.6) | 2520 (70.5) | 5660 (60.5) | 872 (60) | 11297 (60.4) | 8173 (61.9) |
| Family history of asthma | 1324 (9.4) | 404 (11.3) | 1147 (12.3) | 266 (18.3) | 2047 (11) | 1672 (12.7) |
| Race/ethnicity | | | | | | |
| White | 13309 (94.4) | 3316 (92.7) | 8705 (93.1) | 1223 (84.1) | 17160 (91.8) | 12177 (92.3) |
| Black | 345 (2.4) | 111 (3.1) | 286 (3.1) | 125 (8.6) | 676 (3.6) | 431 (3.3) |
| Hispanic | 344 (2.4) | 126 (3.5) | 273 (2.9) | 93 (6.4) | 674 (3.6) | 471 (3.6) |
| Other/missing | 106 (0.8) | 23 (0.6) | 86 (0.9) | 13 (0.9) | 183 (1.0) | 117 (0.9) |
| Place type | | | | | | |
| Township | 8583 (60.9) | 2017 (56.4) | 5590 (59.8) | 659 (45.3) | 11324 (60.6) | 7917 (60) |
| Borough | 4192 (29.7) | 1108 (31) | 2786 (29.8) | 490 (33.7) | 5445 (29.1) | 3891 (29.5) |
| City | 1329 (9.4) | 451 (12.6) | 974 (10.4) | 305 (21) | 1924 (10.3) | 1388 (10.5) |
| Community socioeconomic deprivation | | | | | | |
| Quartile 1 | 2967 (21) | 673 (18.8) | 1936 (20.7) | 226 (15.5) | 3897 (20.8) | 2751 (20.8) |
| Quartile 2 | 3677 (26.1) | 886 (24.8) | 2454 (26.2) | 307 (21.1) | 4839 (25.9) | 3259 (24.7) |
| Quartile 3 | 3561 (25.2) | 920 (25.7) | 2294 (24.5) | 378 (26.0) | 4659 (24.9) | 3427 (26.0) |
| Quartile 4 | 3899 (27.6) | 1097 (30.7) | 2666 (28.5) | 543 (37.3) | 5298 (28.3) | 3759 (28.5) |
| Total number of events^b | | | | | | |
| 0 | 14104 (100) | 0 (0) | 9350 (100) | 0 | 18693 (100) | 0 (0) |
| 1 | 0 (0) | 2732 (76.4) | 0 (0) | 1169 (80.4) | 0 (0) | 8205 (62.2) |
| 2 | 0 (0) | 605 (16.9) | 0 (0) | 208 (14.3) | 0 (0) | 3138 (23.8) |
| 3 | 0 (0) | 162 (4.5) | 0 (0) | 46 (3.2) | 0 (0) | 1273 (9.6) |
| 4 | 0 (0) | 48 (1.3) | 0 (0) | 20 (1.4) | 0 (0) | 451 (3.4) |
| 5 | 0 (0) | 20 (0.6) | 0 (0) | 5 (0.3) | 0 (0) | 129 (1) |
| 6 | 0 (0) | 3 (0.1) | 0 (0) | 3 (0.2) | 0 (0) | 0 (0) |
| 7 | 0 (0) | 4 (0.1) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| 8 | 0 (0) | 2 (0.1) | 0 (0) | 3 (0.2) | 0 (0) | 0 (0) |
| Time-varying variables | | | | | | |
| Encounters (controls) or events (cases)^c | 14104 (100) | 4782 (100) | 9350 (100) | 1870 (100) | 18693 (100) | 20749 (100) |
| Age (years) at event or matched encounter | | | | | | |
| 5 to < 13 | 1062 (7.5) | 354 (7.4) | 2265 (24.2) | 453 (24.2) | 4157 (22.2) | 4245 (20.5) |
| 13 to < 19 | 810 (5.7) | 269 (5.6) | 995 (10.6) | 199 (10.6) | 1926 (10.3) | 1926 (9.3) |
| 19 to < 45 | 5253 (37.2) | 1751 (36.6) | 4105 (43.9) | 821 (43.9) | 6013 (32.2) | 6323 (30.5) |
| 45 to < 62 | 4014 (28.5) | 1338 (28) | 1390 (14.9) | 278 (14.9) | 4313 (23.1) | 5353 (25.8) |
| 62 to < 75 | 1983 (14.1) | 661 (13.8) | 405 (4.3) | 81 (4.3) | 1613 (8.6) | 2113 (10.2) |
| ≥ 75 | 982 (7.0) | 409 (8.6) | 190 (2.0) | 38 (2.0) | 671 (3.6) | 789 (3.8) |
| Year of encounter | | | | | | |
| 2005 | 1593 (11.3) | 531 (11.1) | 845 (9) | 169 (9) | 0 (0) | 0 (0) |
| 2006 | 1767 (12.5) | 589 (12.3) | 905 (9.7) | 181 (9.7) | 0 (0) | 0 (0) |
| 2007 | 1659 (11.8) | 552 (11.5) | 1185 (12.7) | 237 (12.7) | 0 (0) | 0 (0) |
| 2008 | 1563 (11.1) | 526 (11) | 1220 (13) | 244 (13) | 3375 (18.1) | 3375 (16.3) |
| 2009 | 1819 (12.9) | 608 (12.7) | 1380 (14.8) | 276 (14.8) | 4038 (21.6) | 4038 (19.5) |
| 2010 | 1794 (12.7) | 603 (12.6) | 1205 (12.9) | 241 (12.9) | 4019 (21.5) | 4019 (19.4) |
| 2011 | 1886 (13.4) | 648 (13.6) | 1230 (13.2) | 246 (13.2) | 4286 (22.9) | 4624 (22.3) |
| 2012 | 2023 (14.3) | 725 (15.2) | 1380 (14.8) | 276 (14.8) | 2975 (15.9) | 4693 (22.6) |
| Season of encounter^d | | | | | | |
| Spring | 3447 (24.4) | 1219 (25.5) | 2218 (23.7) | 456 (24.4) | 4337 (23.2) | 4618 (22.3) |
| Summer | 3357 (23.8) | 1134 (23.7) | 2253 (24.1) | 380 (20.3) | 4536 (24.3) | 3207 (15.5) |
| Fall | 4171 (29.6) | 1183 (24.7) | 2724 (29.1) | 553 (29.6) | 5695 (30.5) | 6995 (33.7) |
| Winter | 3129 (22.2) | 1246 (26.1) | 2155 (23) | 481 (25.7) | 4125 (22.1) | 5929 (28.6) |
| Obesity^e | | | | | | |
| Not overweight/obese | 3728 (26.4) | 1046 (21.9) | 3366 (36) | 569 (30.4) | 6591 (35.3) | 5737 (27.6) |
| Overweight | 3605 (25.6) | 1077 (22.5) | 2173 (23.2) | 376 (20.1) | 4441 (23.8) | 4821 (23.2) |
| Obese | 6683 (47.4) | 2641 (55.2) | 3762 (40.2) | 895 (47.9) | 7577 (40.5) | 10137 (48.9) |
| Missing | 88 (0.6) | 18 (0.4) | 49 (0.5) | 30 (1.6) | 84 (0.4) | 54 (0.3) |
| Smoking status | | | | | | |
| Never | 7454 (52.9) | 2014 (42.1) | 5335 (57.1) | 826 (44.2) | 11375 (60.9) | 11556 (55.7) |
| Current | 2552 (18.1) | 1204 (25.2) | 1466 (15.7) | 387 (20.7) | 2589 (13.9) | 3672 (17.7) |
| Former | 3204 (22.7) | 1238 (25.9) | 1395 (14.9) | 304 (16.3) | 3231 (17.3) | 4251 (20.5) |
| Missing | 894 (6.3) | 326 (6.8) | 1154 (12.3) | 353 (18.9) | 1498 (8) | 1270 (6.1) |
| Medical Assistance^f | 2657 (18.8) | 1568 (32.8) | 2529 (27) | 741 (39.6) | 4956 (26.5) | 5850 (28.2) |
| Type 2 diabetes | 1504 (10.7) | 917 (19.2) | 517 (5.5) | 156 (8.3) | 1420 (7.6) | 1905 (9.2) |
| On inhaled corticosteroids | 4061 (28.8) | 1577 (33) | 2545 (27.2) | 713 (38.1) | 5319 (28.5) | 10458 (50.4) |
| Distance to nearest major road^g, median, meters | 1042 | 826 | 1077.5 | 651.5 | 1064 | 1032 |
| Distance to nearest minor road^h, median, meters | 708.5 | 535 | 682 | 411 | 687 | 691 |
| Temperature on the prior day, median, degrees Celsius | 16.1 | 16.7 | 16.7 | 15 | 16.1 | 13.3 |
| Pad activity metric, 10^10 /m^2 | | | | | | |
| Very low, less than 10.7 | 5988 (42.5) | 2004 (41.9) | 3671 (39.3) | 719 (38.4) | 2344 (12.5) | 2661 (12.8) |
| Low, 10.7 to 25.7 | 2811 (19.9) | 816 (17.1) | 2096 (22.4) | 350 (18.7) | 5281 (28.3) | 6033 (29.1) |
| Medium, 25.8 to 48.7 | 2675 (19) | 887 (18.5) | 1819 (19.5) | 363 (19.4) | 5489 (29.4) | 6154 (29.7) |
| High, greater than 48.7 | 2630 (18.6) | 1075 (22.5) | 1764 (18.9) | 438 (23.4) | 5579 (29.8) | 5901 (28.4) |
| Spud activity metric, 10^10 /m^2 | | | | | | |
| Very low, less than 5.1 | 6009 (42.6) | 2032 (42.5) | 3701 (39.6) | 742 (39.7) | 2352 (12.6) | 2551 (12.3) |
| Low, 5.1 to 32.3 | 2796 (19.8) | 819 (17.1) | 2030 (21.7) | 371 (19.8) | 5491 (29.4) | 5880 (28.3) |
| Medium, 32.4 to 66.8 | 2719 (19.3) | 821 (17.2) | 1832 (19.6) | 317 (17) | 5389 (28.8) | 6309 (30.4) |
| High, greater than 66.8 | 2580 (18.3) | 1110 (23.2) | 1787 (19.1) | 440 (23.5) | 5461 (29.2) | 6009 (29) |
| Stimulation activity metric, 10^13 x m/m^2 | | | | | | |
| Very low, less than 2.7 | 5829 (41.3) | 1986 (41.5) | 3598 (38.5) | 729 (39) | 2577 (13.8) | 2668 (12.9) |
| Low, 2.7 to 25.5 | 2876 (20.4) | 858 (17.9) | 2089 (22.3) | 391 (20.9) | 5573 (29.8) | 5600 (27) |
| Medium, 25.6 to 67.4 | 2736 (19.4) | 841 (17.6) | 1835 (19.6) | 310 (16.6) | 5415 (29) | 6250 (30.1) |
| High, greater than 67.4 | 2663 (18.9) | 1097 (22.9) | 1828 (19.6) | 440 (23.5) | 5128 (27.4) | 6231 (30) |
| Production activity metric, 10^15 x m^3/m^2 | | | | | | |
| Very low, less than 2.3 | 6079 (43.1) | 2087 (43.6) | 3776 (40.4) | 765 (40.9) | 2345 (12.5) | 2335 (11.3) |
| Low, 2.3 to 133.2 | 2629 (18.6) | 794 (16.6) | 1953 (20.9) | 363 (19.4) | 5713 (30.6) | 5935 (28.6) |
| Medium, 133.3 to 759.7 | 2636 (18.7) | 798 (16.7) | 1789 (19.1) | 271 (14.5) | 5787 (31) | 6106 (29.4) |
| High, greater than 759.7 | 2760 (19.6) | 1103 (23.1) | 1832 (19.6) | 471 (25.2) | 4848 (25.9) | 6373 (30.7) |

a Percentages may not add to 100 due to rounding.
b Cases contribute up to one hospitalization per year (hospitalizations are randomly chosen from patients with multiple hospitalizations in a year). Controls cannot have had a hospitalization up to the year of the hospitalization in the frequency-matched case, but can serve as a case later.
c For controls, the encounter is a randomly selected encounter during the year of the matched case’s hospitalization and before the year of any subsequent asthma hospitalization in the control. For cases, the event is an asthma hospitalization.
d Spring, March 22-June 21; summer, June 22-September 21; fall, September 22-December 21; winter, December 22-March 21.
e For children and adults, respectively: normal, body mass index [BMI] < 85th percentile or BMI < 25 kg/m2; overweight, BMI = 85th-<95th percentile or BMI = 25-<30 kg/m2; obese, BMI ≥ 95th percentile or BMI ≥ 30 kg/m2.
f A means-tested program that is a surrogate for family SES.
g Principal arterial or interstate.
h Minor arterial road.
Table 3.4.2. Associations of unconventional natural gas activity metrics and asthma outcomes

| Activity metric | Group | Asthma Hospitalizations^a, Odds Ratio (95% CI^b) | Asthma Emergency Department Visits^a, Odds Ratio (95% CI) | OCS Orders^a, Odds Ratio (95% CI) |
|---|---|---|---|---|
| Pad Activity Metric | Low^c | 1.26 (1.06-1.50) | 1.53 (1.06-2.23) | 1.54 (1.37-1.74) |
| | Medium | 1.37 (1.15-1.64) | 1.77 (1.2-2.6) | 1.66 (1.47-1.87) |
| | High | 1.45 (1.21-1.73) | 1.37 (0.94-1.99) | 1.59 (1.41-1.81) |
| Spud Activity Metric | Low | 1.16 (0.98-1.37) | 1.53 (1.06-2.21) | 1.45 (1.29-1.63) |
| | Medium | 1.26 (1.05-1.50) | 1.54 (1.04-2.27) | 1.98 (1.75-2.24) |
| | High | 1.64 (1.38-1.97) | 1.57 (1.08-2.29) | 1.99 (1.75-2.26) |
| Stimulation Activity Metric | Low | 1.13 (0.96-1.33) | 1.51 (1.05-2.19) | 1.23 (1.09-1.39) |
| | Medium | 1.31 (1.10-1.57) | 1.74 (1.17-2.61) | 2.22 (1.95-2.53) |
| | High | 1.66 (1.38-1.98) | 1.71 (1.16-2.52) | 3.00 (2.60-3.45) |
| Production Activity Metric | Low | 1.10 (0.92-1.30) | 1.47 (1.01-2.14) | 1.28 (1.13-1.46) |
| | Medium | 1.16 (0.97-1.38) | 1.10 (0.74-1.65) | 2.15 (1.87-2.47) |
| | High | 1.74 (1.45-2.09) | 2.19 (1.47-3.25) | 4.43 (3.75-5.22) |

a Multilevel models with a random intercept for patient and community, adjusted for age category (5-12, 13-18, 19-44, 45-61, 62-74, 75+ years), sex (male, female), race/ethnicity (white, black, Hispanic, other), family history of asthma (yes vs. no), smoking status (never, former, current, missing), season (spring, March 22-June 21; summer, June 22-September 21; fall, September 22-December 21; winter, December 22-March 21), Medical Assistance (yes vs. no), overweight/obesity (normal, body mass index [BMI] < 85th percentile or BMI < 25 kg/m2; overweight, BMI = 85th-<95th percentile or BMI = 25-<30 kg/m2; obese, BMI ≥ 95th percentile or BMI ≥ 30 kg/m2, for children and adults, respectively; BMI missing), type 2 diabetes (yes vs. no), community socioeconomic deprivation (quartiles), distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), squared distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), maximum temperature on the day prior to event (degrees Celsius), and squared maximum temperature on the day prior to event (degrees Celsius).
b Confidence interval.
c Very low is the reference group.
3.4.3 Sensitivity Analyses
The four UNGD activity metrics, calculated for all case and control index dates
(n=69,548), were correlated with one another (Spearman correlation coefficients of the continuous variables ranged from 0.73-0.91). In the analysis to evaluate associations of a combined UNGD activity metric of the four phases of development, the odds ratio point estimates were between those from regressions of each phase separately. In the negative disease control analysis, we found no association of the spud activity metric with gastrointestinal illness. In a model evaluating associations of counties with
outcomes (UNGD metrics removed), counties with high UNGD activity were not
associated with outcomes (Figure 3.4.3). In the analysis that removed cells with
unbalanced numbers of cases and controls in the mild exacerbation analysis,
associations were attenuated (odds ratios decreased by 5%, 17%, 37%, and 55% for the
high group odds ratio for the pad, spud, stimulation, and production metrics,
respectively, all odds ratios p<0.05). In the analysis to evaluate the impact of different quality of geocoding, associations were unchanged. In the analysis of the mild and severe exacerbations, we determined that even an unmeasured confounder strongly associated with both UNGD activity and outcome (e.g., both odds ratios = 3.0), and a prevalence of 0.3 in the exposed group, would not likely change our inference about associations, given our models. However, for moderate exacerbations, an unmeasured confounder with the same characteristics could account for two of the three statistically
significant associations.
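As a rough illustration of this kind of calculation, the sketch below applies a standard external-adjustment formula for a single binary unmeasured confounder. It is not necessarily the method of reference 45: it treats odds ratios as approximations of risk ratios, the function names are hypothetical, and the inputs are the example values quoted above (both odds ratios = 3.0, prevalence of 0.3 in the exposed group).

```python
def prevalence_in_unexposed(prev_exposed, or_confounder_exposure):
    """Confounder prevalence in the unexposed implied by the exposed
    prevalence and the confounder-exposure odds ratio."""
    odds_exposed = prev_exposed / (1 - prev_exposed)
    odds_unexposed = odds_exposed / or_confounder_exposure
    return odds_unexposed / (1 + odds_unexposed)


def bias_factor(prev_exposed, or_confounder_exposure, rr_confounder_outcome):
    """Ratio by which the confounder alone would inflate the estimate.

    Simplification of this sketch: the confounder-outcome odds ratio is
    used in place of a risk ratio.
    """
    p1 = prev_exposed
    p0 = prevalence_in_unexposed(prev_exposed, or_confounder_exposure)
    effect = rr_confounder_outcome - 1
    return (p1 * effect + 1) / (p0 * effect + 1)


def externally_adjusted(observed_or, factor):
    """Observed odds ratio divided by the bias factor."""
    return observed_or / factor


factor = bias_factor(0.3, 3.0, 3.0)
```

Dividing an observed odds ratio and its confidence limits by `factor` gives a sense of whether an association would remain after accounting for a confounder with these characteristics.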
Figure 3.4.3. Counties Associated with Asthma Hospitalization Case Status.
3.5 Discussion
We conducted a nested case-control study in a large number of asthma patients using EHR data in Pennsylvania from 2005-2012, a period of rapid development. In this first study of UNGD and objective respiratory outcomes, we found consistent associations of four UNGD activity metrics with three types of asthma exacerbations.
Whether these associations are causal awaits further investigation, including more detailed exposure assessment.
Asthma is a suitable outcome because UNGD has community and environmental impacts that could affect it; it is highly prevalent; it can be exacerbated by stress and small changes in air quality with short latency; and patients usually seek care for exacerbations so they are captured by an EHR. By leveraging longitudinal EHR data, we
were able to complete a number of sensitivity analyses that suggested the associations
were robust to increasing levels of adjustment, although in some cases they were
attenuated.
Studies of air pollution and asthma exacerbations have generally found
small but consistently increased risks. A study of pediatric emergency department visits
for asthma in Atlanta found that a standard deviation increase in pollution had
associated risk ratios of 1.020, 1.036, and 1.062 for particulate matter < 10μm, nitrogen
dioxide, and ozone, respectively.46 Studies on psychosocial stress have found that in children with asthma, the risk of an asthma exacerbation increased 4.7 times in the two days following a very stressful event.47 Adults exposed to violence in their community
have 2.3 and 2.5 times the risk of an asthma emergency department visit and
hospitalization, respectively, than those not exposed to community violence.48
Two sensitivity analyses were directed to the very important possibility that
unmeasured confounding could account for our results. First, UNGD metrics were not
associated with the negative disease control. Second, in the analysis replacing UNGD
metrics with indicators for counties, counties with UNGD were not associated with
severe exacerbations. These both provide evidence that unmeasured confounding is unlikely to account for our findings, but we acknowledge that the possibility still exists.
We note that an unmeasured confounder would need to be strongly associated with both
UNGD and asthma outcomes to account for our results. In sensitivity analysis to address unbalanced numbers of cases and controls, results were attenuated; the
majority of dropped patients comprised the most susceptible groups (younger and older)
in the most exposed years, so attenuation was not unexpected. Finally, geocoding
method and analysis with an overall activity metric did not change inferences.
This study had several strengths, including a large sample size from a population
that represents the general population in the region. Additionally, our exposure
assessment improved on that of prior studies,49,50 which used categorical distance-based metrics that did not account for UNGD phases. Our metric incorporated the temporality and duration of phases, gas production volume, and a surrogate for truck traffic. This study also improved on the outcome ascertainment used in the previous study of UNGD and respiratory outcomes,50 which relied on self-reported outcomes and grouped several
respiratory symptoms and conditions together (including asthma). We used documented
asthma exacerbations. Our findings were robust to increasing levels of covariate control
and in several sensitivity analyses.
This study also had limitations. The EHR does not collect information on
occupation and only keeps patients’ most recent address. However, comparing
addresses used in a prior study35 to addresses used in this study (39 months apart),
79.8% of patients were at the same address and an additional 7.4% and 7.6% were less
than 3.2km and 3.2-16km, respectively, from their prior address, indicating little residential mobility. The EHR only collects data on events that occur at Geisinger facilities, but ambulances go to the closest hospital, so we may have under-counted
events. We were unable to differentiate asthma exacerbations that led to
hospitalization from those that occurred during a hospitalization. We frequency-matched cases
and controls for year because UNGD activity metrics and year were highly correlated.
We did not include year in the final model because of this high correlation, so there
remains the possibility of unmeasured residual confounding by factors that strongly vary
by year. We kept all four UNGD metrics because of a priori evidence that exposures
differed by phase, but because metrics were highly correlated we were unable to
definitively distinguish among them. Furthermore, our UNGD metrics do not provide
insight into the mechanism of the associations we observed.
Asthma is a common disease with large individual and societal burdens, so the possibility that UNGD may increase risk for asthma exacerbations requires public health attention. As ours is the first study of UNGD and objective respiratory outcomes, and several other health outcomes have not been investigated to date, there is an urgent need for more health studies. These should include more detailed exposure assessment to better characterize pathways and identify the phases of development that present the most risk.
3.6 References
1. Akinbami LJ, Bailey CM, Johnson CA, et al. National surveillance of asthma: United
States, 2001-2010. National Center for Health Statistics, Vital Health Stat. 2012;3:35.
2. National Heart, Lung, and Blood Institute. National Asthma Education Program.
Expert Panel on the Management of Asthma. Expert panel report 3: Guidelines for the
diagnosis and management of asthma: Full report. US Department of Health and Human
Services, National Institutes of Health, National Heart, Lung, and Blood Institute; 2007.
3. Dougherty R, Fahy JV. Acute exacerbations of asthma: Epidemiology, biology and the exacerbation-prone phenotype. Clinical & Experimental Allergy. 2009;39(2):193-202.
4. Guarnieri M, Balmes JR. Outdoor air pollution and asthma. The Lancet.
2014;383(9928):1581-1592.
5. Gent JF, Triche EW, Holford TR, et al. Association of low-level ozone and fine particles with respiratory symptoms in children with asthma. JAMA. 2003;290(14):1859-
1867.
6. Peel JL, Tolbert PE, Klein M, et al. Ambient air pollution and respiratory emergency department visits. Epidemiology. 2005;16(2):164-174.
7. Ostro B, Lipsett M, Mann J, Braxton-Owens H, White M. Air pollution and exacerbation of asthma in African-American children in Los Angeles. Epidemiology.
2001;12(2):200-208.
8. Dominici F, Peng RD, Bell ML, et al. Fine particulate air pollution and hospital admission for cardiovascular and respiratory diseases. JAMA. 2006;295(10):1127-1134.
9. Ko F, Tam W, Wong T, et al. Effects of air pollution on asthma hospitalization rates in different age groups in Hong Kong. Clinical & Experimental Allergy. 2007;37(9):1312-
1319.
10. Schildcrout JS, Sheppard L, Lumley T, Slaughter JC, Koenig JQ, Shapiro GG.
Ambient air pollution and asthma exacerbations in children: An eight-city analysis. Am J
Epidemiol. 2006;164(6):505-517.
11. Halonen JI, Lanki T, Yli-Tuomi T, Kulmala M, Tiittanen P, Pekkanen J. Urban air
pollution, and asthma and COPD hospital emergency room visits. Thorax.
2008;63(7):635-641.
12. Yonas MA, Lange NE, Celedón JC. Psychosocial stress and asthma morbidity.
Current Opinion in Allergy and Clinical Immunology. 2012;12(2):202-210.
13. Chen E, Miller GE. Stress and inflammation in exacerbations of asthma. Brain Behav
Immun. 2007;21(8):993-999.
14. Wisnivesky JP, Lorenzo J, Feldman JM, Leventhal H, Halm EA. The relationship
between perceived stress and morbidity among adult inner-city asthmatics. Journal of
Asthma. 2010;47(1):100-104.
15. New York State Department of Health Completes Review of High-volume Hydraulic
Fracturing [press release]. Albany, NY: New York State Department of Environmental
Conservation, December 17, 2014.
16. Cox E. Assembly votes to ban fracking for two years. Baltimore Sun April 10, 2015.
17. Werner AK, Vink S, Watt K, Jagals P. Environmental health impacts of
unconventional natural gas development: A review of the current strength of evidence.
Sci Total Environ. 2015;505:1127-1141.
18. Mitka M. Rigorous evidence slim for determining health risks from natural gas
fracking. JAMA. 2012;307(20).
19. Gaines M. PennDOT’s posting and bonding program and impact of unconventional oil & gas [webinar]. http://extension.psu.edu/natural-resources/natural-
gas/webinars/shale-energy-developments-effect-on-the-posting-bonding-and-
maintenance-of-roads-in-rural-pa/mark-gaines-may-16-2013-powerpoint. Published May
16, 2013.
20. Maloney KO, Yoxtheimer DA. Production and disposal of waste materials from gas
and oil extraction from the Marcellus shale play in Pennsylvania. Env Prac.
2012;14(04):278-287.
21. Pennsylvania Code. Subchapter E. Well reporting § 78.121-§ 78.125.
http://www.pacode.com/secure/data/025/chapter78/subchapEtoc.html.
22. Roy AA, Adams PJ, Robinson AL. Air pollutant emissions from the development,
production, and processing of Marcellus shale natural gas. J Air Waste Manage Assoc.
2013;64(1):19-37.
23. Litovitz A, Curtright A, Abramzon S, Burger N, Samaras C. Estimation of regional air-
quality damages from Marcellus shale natural gas extraction in Pennsylvania.
Environmental Research Letters. 2013;8(1):014017.
24. McKenzie LM, Witter RZ, Newman LS, Adgate JL. Human health risk assessment of
air emissions from development of unconventional natural gas resources. Sci Total
Environ. 2012;424:79-87.
25. Sangaramoorthy T, Jamison AM, Boyle MD, et al. Place-based perceptions of the
impacts of fracking along the Marcellus shale. Soc Sci Med. 2016.
26. Adgate JL, Goldstein BD, McKenzie LM. Potential public health hazards, exposures
and health effects from unconventional natural gas development. Environ Sci Technol.
2014;48(15):8307-8320.
27. Vinciguerra T, Yao S, Dadzie J, et al. Regional air quality impacts of hydraulic
fracturing and shale natural gas activity: Evidence from ambient VOC observations.
Atmos Environ. 2015;110:144-150.
28. Gopalakrishnan S, Klaiber HA. Is the shale energy boom a bust for nearby residents? evidence from housing values in Pennsylvania. Am J Agric Econ.
2014;96(1):43-66.
29. Muehlenbachs L, Spiller E, Timmins C. The housing market impacts of shale gas
development. Am Econ Rev. 2015;105(12):3633-59.
30. Brunekreef B, Holgate ST. Air pollution and health. Lancet. 2002;360:1233-1242.
31. Salam MT, Islam T, Gilliland FD. Recent evidence for adverse effects of residential proximity to traffic sources on asthma. Curr Opin Pulm Med. 2008;14(1):3-8.
32. Hanson MD, Chen E. Brief report: The temporal relationships between sleep, cortisol, and lung functioning in youth with asthma. J Pediatr Psychol. 2007;33(3):312-316.
33. Daniel LC, Boergers J, Kopel SJ, Koinis-Mitchell D. Missed sleep and asthma morbidity in urban children. Annals of Allergy, Asthma & Immunology. 2012;109(1):41-46.
34. Griswold SK, Nordstrom CR, Clark S, Gaeta TJ, Price ML, Camargo CA. Asthma exacerbations in North American adults: Who are the “frequent fliers” in the emergency department? Chest Journal. 2005;127(5):1579-1586.
35. Casey JA, Cosgrove SE, Stewart WF, Pollak J, Schwartz BS. A population-based study of the epidemiology and clinical features of methicillin-resistant staphylococcus aureus infection in Pennsylvania, 2001-2010. Epidemiol Infect. 2013;141(06):1166-1179.
36. Pacheco JA, Avila PC, Thompson JA, et al. A highly specific algorithm for identifying asthma cases and controls for genome-wide association studies. AMIA Annu Symp
Proc. 2009;2009:497-501.
37. Schwartz BS, Stewart WF, Godby S, et al. Body mass index and the built and social environments in children and adolescents using electronic health records. Am J Prev
Med. 2011;41(4):e17-e28.
38. U.S. Department of Transportation Federal Highway Administration. Highway
Performance Monitoring System website. http://www.fhwa.dot.gov/policyinformation/hpms/shapefiles.cfm. Updated 2013.
Accessed March 27, 2015.
39. Liu AY, Curriero FC, Glass TA, Stewart WF, Schwartz BS. The contextual influence of coal abandoned mine lands in communities and type 2 diabetes in Pennsylvania.
Health Place. 2013.
40. National Climatic Data Center. Climate Data Online website.
http://www.ncdc.noaa.gov/cdo-web/. Accessed May 11, 2011.
41. SkyTruth. TADPOLE Pennsylvania results. http://frack.skytruth.org/frackfinder/frackfinder-news/tadpolepennsylvaniaresults.
Published Feb 12, 2014. Updated 2014. Accessed June 30, 2014.
42. SkyTruth. Fracking chemical database. http://frack.skytruth.org/fracking-chemical-database. Updated 2013. Accessed November 27, 2013.
43. Ogden CL, Carroll MD, Kit BK, Flegal KM. Prevalence of childhood and adult obesity in the United States, 2011-2012. JAMA. 2014;311(8):806.
44. Lipsitch M, Tchetgen Tchetgen E, Cohen T. Negative controls: A tool for detecting confounding and bias in observational studies. Epidemiology. 2010;21(3):383-388.
45. VanderWeele TJ, Arah OA. Bias formulas for sensitivity analysis of unmeasured confounding for general outcomes, treatments, and confounders. Epidemiology.
2011;22(1):42-52.
46. Strickland MJ, Darrow LA, Klein M, et al. Short-term associations between ambient air pollutants and pediatric asthma emergency department visits. American journal of respiratory and critical care medicine. 2010;182(3):307-316.
47. Sandberg S, Jarvenpaa S, Penttinen A, Paton JY, McCann DC. Asthma exacerbations in children immediately following stressful life events: A cox's hierarchical regression. Thorax. 2004;59(12):1046-1051.
48. Apter AJ, Garcia LA, Boyd RC, Wang X, Bogen DK, Ten Have T. Exposure to community violence is associated with asthma hospitalizations and emergency department visits. J Allergy Clin Immunol. 2010;126(3):552-557.
49. McKenzie LM, Guo R, Witter RZ, Savitz DA, Newman LS, Adgate JL. Birth outcomes and maternal residential proximity to natural gas development in rural Colorado. Environ
Health Perspect. 2014.
50. Rabinowitz PM, Slizovskiy IB, Lamers V, et al. Proximity to natural gas wells and reported health status: Results of a household survey in Washington counties,
Pennsylvania. Environ Health Perspect. 2014.
Chapter 4: Associations of unconventional natural gas development with disordered sleep and depression symptoms in Pennsylvania
4.0 Cover Page
Sara G. Rasmussen, MHS1; Holly C. Wilcox, PhD2; Annemarie G. Hirsch3; Jonathan Pollak1; Brian S. Schwartz, MD, MS1,3,4
1 Department of Environmental Health and Engineering, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
2 Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
3 Department of Epidemiology and Health Services Research, Geisinger Health System, Danville, Pennsylvania, USA
4 Department of Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
Acknowledgements: We thank Joseph J. DeWalle, BS (Geisinger Health System) for patient geocoding; Aaron Tustin (JHSPH) for assistance with survey weights; and Karen Bandeen-Roche, PhD (JHSPH) for assistance interpreting the mediation analysis results. This research was funded by National Institutes of Health U19 AI106683 (PI Robert Schleimer), R21 ES023675 (PI Brian Schwartz), and training grant ES07141 (Sara Rasmussen); the Degenstein Foundation; and the National Science Foundation Integrative Graduate Education and Research Traineeship (Sara Rasmussen). No funders had input into the study design, conduct, data collection or analysis, or manuscript preparation. The authors declare they have no actual or potential competing financial interests. Dr. Schwartz is a Fellow of the Post Carbon Institute (PCI), serving as an informal advisor on climate, energy, and health issues. He receives no payment for this role. His research is entirely independent of PCI, and is not motivated, reviewed, or funded by PCI.
4.1 Abstract
Background: Social and environmental factors are associated with depression.
Unconventional natural gas development (UNGD) has community and environmental
impacts.
Objectives: In this study, we evaluated the association of UNGD with depression symptoms and disordered sleep diagnoses. There are no prior studies of UNGD and mental health or sleep outcomes.
Methods: We identified depression symptoms among 4,762 adult primary care patients from Geisinger Clinic in Pennsylvania who responded to a questionnaire that included the PHQ-8. For these patients, we used electronic health records to identify 3,868 disordered sleep diagnoses and frequency-matched these to control dates on age, sex, and year. We assigned each person (depression analysis) or diagnosis date (sleep
analysis) a metric for residential UNGD (very low, low, medium, and high) that
incorporated dates and durations of well development, distance from patient homes to
wells, and well characteristics. We estimated associations of the residential UNGD
metric with depression symptoms using negative binomial and multinomial logistic
regression, and evaluated mediation by migraine and fatigue symptoms. We estimated
associations of the residential UNGD metric with disordered sleep using generalized
estimating equations. Models were weighted to account for sampling design and
participation.
Results: High UNGD activity (vs. very low) was associated with an increasing burden of
depression symptoms (exponentiated coefficient = 1.18, 95% confidence interval [CI]:
1.04 - 1.34). We observed weak evidence of mediation by fatigue. UNGD was not
associated with disordered sleep.
Conclusions: UNGD activity was associated with depression symptoms in adults in
Pennsylvania.
4.2 Introduction
Unconventional natural gas development (UNGD) is a long-lasting industrial
process with environmental and social impacts, including noise; light; vibration; truck
traffic; air, water, and soil pollution; social disruption; changes in home prices; and stress
and anxiety related to rapid industrial development.1-17 The process involves pad
preparation, drilling, stimulation (“fracking”), and production, involving development of
both wells and associated infrastructure (e.g., pipelines, compressor stations).1
Pennsylvania has proceeded with UNGD rapidly: development began in the state in the mid-2000s and by 2015, 9,669 wells had been drilled. UNGD has been associated with
several health outcomes for which there are environmental and social risk factors,18-22 but to date no epidemiologic study has evaluated a mental health outcome. Here, we evaluated the association of UNGD with depression symptoms, measured by the Patient
Health Questionnaire depression scale (PHQ-8), which is used in both epidemiology studies and clinical settings.23 Depression is a symptom-based condition defined by
hopelessness, helplessness, sad or irritable mood, loss of interest in activities, and
fatigue.24
This study was conducted in a sample of adults from the Geisinger Clinic,25
which has had an electronic health record (EHR) since the 2000s and is located in
central and northeastern Pennsylvania, a region with a range of UNGD. A study of the
patient sample reported herein showed associations between UNGD and nasal and
sinus, migraine, and fatigue symptoms,26 and a study of Geisinger patients with asthma
showed an association of UNGD and asthma exacerbations (using objectively
documented events from the EHR).27 We hypothesized that these associations could be
related to air pollution and/or stress pathways. Here, we studied depression because it
can also be affected by stress and air pollution, is common, and has significant public
health and economic costs.23,24,28-34 We explored effect modification of the UNGD-depression symptoms association by antidepressant use, and mediation of the association by self-reported fatigue and migraine symptoms and by disordered sleep diagnoses from the EHR (Figure 4.2), each of which can be comorbid with depression symptoms.35-39 Many aspects of UNGD could impact sleep. Disordered sleep can be one cause of fatigue, but there are several other causes of fatigue independent of disordered sleep.40 Because the association of UNGD with disordered sleep diagnoses
has not been previously studied, we also directly evaluated this association.
Figure 4.2. Relationships among UNGD and moderating, mediating, and outcome variables.
The baseline questionnaire was mailed in April 2014 and the follow-up questionnaire in October 2014. The dashed lines identify the associations evaluated in this study; the dotted line identifies an association that could not be evaluated because an insufficient number of events of asthma exacerbations were available in EHR data at the time of this analysis; and the solid lines identify associations evaluated in prior studies (1 = Rasmussen et al. 2016, 2 = Tustin et al. 2016). Abbreviations: UNGD = unconventional natural gas development; EHR = electronic health record.
4.3 Methods
4.3.1 Survey design and study population
Depression symptoms were ascertained in a questionnaire that was designed to study nasal and sinus symptoms (methods described previously25,26). The survey design oversampled for people more likely to have nasal and sinus symptoms and racial/ethnic minorities. Briefly, in April 2014, a baseline questionnaire that included questions on migraine headache and fatigue symptoms was sent to 23,700 adults 18 years of age and older, of whom 7,847 responded (response rate = 33.1%). Six months later, a
follow-up questionnaire, which included questions on depression symptoms, was sent to all responders of the baseline questionnaire. Of the 7,847 study subjects who received
the follow-up questionnaire, 4,966 returned the questionnaire (response rate = 63.3%).
After excluding respondents who lived outside Pennsylvania (n = 34), the analysis
consisted of 4,932 participants. Returned follow-up questionnaires were received
between November 4, 2014 and May 14, 2015 (median date of November 12, 2014).
The study was approved by the Geisinger Institutional Review Board (with an IRB
authorization agreement with Johns Hopkins Bloomberg School of Public Health).
Implied consent was considered to have been provided if the participant returned the
mailed questionnaire.
4.3.2 Outcome ascertainment
4.3.2.1 Depression symptoms
The follow-up questionnaire included the PHQ-8. Each question on the PHQ-8
has the response options “not at all,” “several days,” “more than half the days,” and “nearly
every day,” scored as 0 to 3, respectively. For participants who answered all eight
questions, the total score was the sum of the eight item scores.23 For participants
who answered fewer than eight questions, to include as many subjects in
the study as possible, we calculated the total score as a pro-rated sum using the
formula: (sum of answered questions x 8)/(number of questions answered). We used the
PHQ-8’s depression severity categories, but combined the two most severe groups
because few participants had a “severe” total score. Scores were categorized into 0 to
<5, no significant depression symptoms; 5 to <10, mild depression symptoms; 10 to <15,
moderate depression symptoms; and 15 to 24, moderately severe/severe depression
symptoms.23
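The scoring rules above can be illustrated with a short sketch. It is not the study's code: the function names and the use of None for an unanswered item are assumptions made for illustration, and returning None when no item was answered reflects only that no pro-rated score can be computed in that case.

```python
from typing import Optional, Sequence


def phq8_total(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Total PHQ-8 score; pro-rated when some of the eight items are missing.

    Each answered item is scored 0-3; None marks an unanswered item
    (a convention assumed here for illustration).
    """
    if len(responses) != 8:
        raise ValueError("the PHQ-8 has eight items")
    answered = [r for r in responses if r is not None]
    if not answered:
        return None  # no items answered, so no score can be computed
    if any(r not in (0, 1, 2, 3) for r in answered):
        raise ValueError("item scores must be 0, 1, 2, or 3")
    # (sum of answered questions x 8) / (number of questions answered)
    return sum(answered) * 8 / len(answered)


def phq8_category(score: float) -> str:
    """Severity groups used in the text; the two most severe are combined."""
    if score < 5:
        return "no significant depression symptoms"
    if score < 10:
        return "mild"
    if score < 15:
        return "moderate"
    return "moderately severe/severe"
```

For a participant who answered four items with "more than half the days" (2) and skipped the rest, the pro-rated total is (8 x 8)/4 = 16, which falls in the moderately severe/severe group.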
4.3.2.2 Disordered sleep diagnoses
Disordered sleep diagnoses (case-events) among the study population were identified in Geisinger’s EHR. We identified encounters (98% outpatient) in the EHR that were accompanied by ICD-9 codes for disordered sleep (Table 4.3.2.2).41 We also
identified orders for disordered sleep medications, using drug class hypnotics and using
drug subclass and name. We included all medications in the drug subclasses antihistamine
hypnotics, selective melatonin receptor agonists, hypnotics – tricyclic agents, and orexin
receptor antagonists. In the subclass non-barbiturate hypnotics, we included all
medications except midazolam hydrochloride. We considered either an appropriate
medication order or an encounter with the appropriate ICD-9 code as a disordered sleep
outcome. We excluded disordered sleep outcomes from before 2009 (when UNGD
activity was low), only retained disordered sleep diagnoses from when the participant
was 18 years of age or older, and randomly selected one disordered sleep diagnosis per
participant per year so that study subjects with many encounters for sleep disorders
would not unduly contribute (Figure 4.3.2.2).
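The three selection rules in this paragraph (events from 2009 onward, age 18 or older at the event, one randomly chosen event per participant per year) can be sketched as follows. This is an illustrative sketch, not the study's code: the function names, the tuple layout of the input, and the fixed random seed are assumptions.

```python
import random
from collections import defaultdict
from datetime import date


def age_on(birth: date, day: date) -> int:
    """Age in completed years on a given day."""
    return day.year - birth.year - ((day.month, day.day) < (birth.month, birth.day))


def select_case_events(events, birth_dates, first_year=2009, min_age=18, seed=0):
    """Keep one randomly chosen event per participant per calendar year.

    events: iterable of (participant_id, event_date) pairs (hypothetical layout).
    birth_dates: dict mapping participant_id to date of birth.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_person_year = defaultdict(list)
    for pid, day in events:
        if day.year >= first_year and age_on(birth_dates[pid], day) >= min_age:
            by_person_year[(pid, day.year)].append(day)
    return {key: rng.choice(sorted(days))
            for key, days in sorted(by_person_year.items())}
```

Sampling within participant-year keeps frequent users of the health system from dominating the case series while still allowing a participant to contribute events in several years.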
Table 4.3.2.2. ICD-9 codes used to identify disordered sleep.
Abbreviation: ICD-9 = International Classification of Diseases, 9th Revision, Clinical Modification

ICD-9 code   Description
780.52       Insomnia
780.50       Sleep disturbance, unspecified
307.47       Other dysfunctions of sleep stages or arousal from sleep
780.59       Other sleep disturbances
307.42       Persistent disorder of initiating or maintaining sleep
780.5        Sleep disturbances
307.41       Transient disorder of initiating or maintaining sleep
307.40       Nonorganic sleep disorder, unspecified
307.48       Repetitive intrusions of sleep
780.56       Dysfunctions associated with sleep stages or arousal from sleep
780.55       Disruptions of 24-hour sleep-wake cycle
Figure 4.3.2.2. Flow diagram for identification of disordered sleep diagnoses. Disordered sleep events were identified using medications and ICD-9 codes from encounters (98% outpatient). In this figure, the numbers refer to counts of medications for disordered sleep and encounters with ICD-9 codes for disordered sleep.
Control dates were frequency-matched to cases on age category (18-44, 45-61,
62-74, 75+ years), sex, and year. To select control dates, we identified all of a participant's dates of
contact with the health system (e.g., medications, inpatient and outpatient visits,
procedures), excluded contact dates within one year of a disordered sleep diagnosis for
a case, and randomly selected one encounter date per year per participant. Encounter
dates, not patients, had to be the selection frame for this analysis because UNGD
activity metrics and many covariates were time-varying.
4.3.3 Potential mediating variables: migraine and fatigue symptoms
From the baseline questionnaire, we derived migraine headache and fatigue symptom score groups as previously described.26 We used the validated scoring method
of the ID Migraine questionnaire, which covers the past twelve months, to distinguish those
with migraine headaches from those without.42 We used the Patient-Reported Outcomes
Measurement Information System fatigue short form 8a to create fatigue symptom score groups.43 For participants who answered all eight questions, we summed the
responses, which ranged from “not at all” (1) to “very much” (5). For participants who
answered between four and seven questions, we assigned an adjusted score: (sum of
answered questions x 8)/(number of questions answered).43 We considered participants in the highest quartile of fatigue scores to have severe fatigue and compared them to the bottom three quartiles.
4.3.4 Well data and activity metric assignment
Well data were compiled from the Pennsylvania Department of Environmental
Protection, the Pennsylvania Department of Conservation and Natural Resources, and
SkyTruth, as described previously.18,26,27,44 Data that were collected for all
unconventional natural gas wells in Pennsylvania from 2005-2015 included: latitude and
longitude; well pad; dates of drilling, stimulation, and production; total depth; and volume
of natural gas produced.
We assigned UNGD activity for the four phases of well development (pad
preparation, drilling, stimulation, and production) to each study subject (in the depression
symptom analysis) or index date (in the disordered sleep analysis) using metrics that
incorporated distances from participant residence to wells, and the density and size of
wells as in prior studies.18,26,27 For each phase of well development, the metric was
assigned using Equation 4.3.4.
Equation 4.3.4. Activity metric.
$$\text{Metric for participant } j = \sum_{d} \sum_{i=1}^{n} \frac{s_i}{d_{ij}^{2}}$$
In Equation 4.3.4, the outer sum runs over the days d in the assignment window, n is the number of wells in the given phase, $d_{ij}^{2}$ is the squared distance (meters) between well i and participant j, and $s_i$ is 1 for the pad preparation
and drilling phases, total well depth (meters) of well i for the stimulation phase, and daily natural gas production volume (m3) of well i for the production phase.
For the depression symptom analysis, for each phase of development, the metric
was summed for the 14 days prior to the date of the returned follow-up questionnaire (d;
negative in the formula above because it represents days before the survey was
returned). We chose 14 days prior to the survey return because the PHQ-8 asks about
depression symptoms over the past two weeks. In the analysis to evaluate mediation by
fatigue or migraine symptoms of the UNGD-depression symptom association, we
assigned the UNGD metric for the three months before baseline questionnaire return
because we had previously observed that UNGD activity summed over three months
was associated with migraine and fatigue symptoms at baseline.26 For the disordered
sleep analysis, UNGD activity was summed for the three months prior to the date of the
sleep disorder diagnosis. In all analyses, we z-transformed the activity metrics for each
of the four phases of development, summed the transformed values, and divided the
sums into quartiles to create a composite UNGD metric with very low, low, medium, and high
UNGD activity groups.
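The metric assignment and the composite grouping described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: well and home locations are assumed to be in a projected coordinate system measured in meters, the sample standard deviation is assumed for the z-transformation, and quartile cut points come from Python's `statistics.quantiles` with its default method, which may differ from the software used in the study. All function and variable names are hypothetical.

```python
from statistics import mean, quantiles, stdev

LABELS = ["very low", "low", "medium", "high"]


def phase_metric(wells_by_day, home_xy):
    """Sum s_i / d_ij^2 over every well in the phase on each day of the window.

    wells_by_day: one list per day; each well is (x_m, y_m, s_i), with
    coordinates in meters (an assumption made for this sketch).
    """
    hx, hy = home_xy
    total = 0.0
    for wells in wells_by_day:
        for x, y, s in wells:
            squared_distance = (x - hx) ** 2 + (y - hy) ** 2
            total += s / squared_distance
    return total


def z_transform(values):
    """Center and scale a list of metrics (sample standard deviation assumed)."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


def composite_groups(metrics_by_phase):
    """metrics_by_phase maps each phase to a list of per-participant metrics."""
    z_by_phase = [z_transform(v) for v in metrics_by_phase.values()]
    sums = [sum(zs) for zs in zip(*z_by_phase)]  # one composite per participant
    cuts = quantiles(sums, n=4)  # three quartile cut points
    return [LABELS[sum(s > c for c in cuts)] for s in sums]
```

Because each phase is z-transformed before summing, phases measured on very different scales (a count-like metric for drilling versus a production volume) contribute comparably to the composite.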
4.3.5 Covariates
Using the EHR, we created covariates for race/ethnicity; sex; Medical
Assistance, a means-tested program that was used as a surrogate for family
socioeconomic status; age; smoking and alcohol status; body mass index; and
antidepressant medication use in the month prior to survey return using drug group,
class, sub-class, and name.45 Time-varying covariates (all but race/ethnicity and sex)
were assigned before the date of survey return (for the depression symptom analysis) or
before the diagnosis or comparison date (for the disordered sleep analysis). Using
previously described methods,46 we geocoded study subjects to their residential address
in the EHR, 89.13% to street address, 3.14% to ZIP+4, and 7.73% to ZIP code centroid.
Using participants’ geocoded coordinates, we assigned people to a community using a mixed definition of place (township, borough, or census tract in cities), calculated community
socioeconomic deprivation for each community,46,47 and created a covariate for water
source (municipal water or well water) by using the locations of public water service
areas from the Pennsylvania Department of Environmental Protection.48
4.3.6 Statistical analysis
All models were weighted to account for the survey stratified sampling design, the response rate to the baseline questionnaire, and loss to follow-up from the baseline
to the follow-up questionnaires (Table 4.3.6). Because one weight was much larger than
the others, we truncated the largest weight to the next largest for our primary analyses.49
To build models, we first included the composite UNGD activity index (very low, low,
medium, high), and then added, one at a time, race/ethnicity (white non-Hispanic, black
non-Hispanic, Hispanic), sex (male, female), Medical Assistance (no, yes), age (years),
smoking status (never, former, current), alcohol status (yes, heavy [based on the
Centers for Disease Control definition of heavy drinking as 8 or more drinks per week for
females and 15 or more drinks per week for males50]; yes, not heavy; no), body mass index (BMI, kg/m2), community socioeconomic deprivation, and water source (municipal water, well water). We centered the continuous covariates (age, BMI, and community
socioeconomic deprivation) and included them as linear and quadratic terms to allow for
non-linearity. We used a 2-sided type 1 error rate of 0.05 for significance testing. We
used Stata version 11.2 (StataCorp Inc.) and R version 3.2.2 (R Foundation for
Statistical Computing) for analysis.
Table 4.3.6. Calculation of sample weights.

Race/ethnicity                    Higher likelihood   Intermediate        Lower likelihood
                                  of CRS              likelihood of CRS   of CRS
White non-Hispanic
  Identified using EHR            13,132              47,892              131,366
  Received baseline survey        12,209              4,224               2,775
  Responded to baseline survey    4,691               1,481               871
  Responded to follow-up survey   3,076               950                 551
  Sample weight                   4.27                50.41               238.41
Black non-Hispanic
  Identified using EHR            170                 991                 2,832
  Received baseline survey        159                 903                 1,109
  Responded to baseline survey    35                  144                 155
  Responded to follow-up survey   17                  67                  62
  Sample weight                   10.00               14.79               45.68
Hispanic
  Identified using EHR            192                 1,035               3,159
  Received baseline survey        181                 966                 1,174
  Responded to baseline survey    35                  206                 167
  Responded to follow-up survey   19                  92                  98
  Sample weight                   10.11               11.25               32.23

Abbreviations: EHR = electronic health record; CRS = chronic rhinosinusitis
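The sample weights in Table 4.3.6 are reproduced by dividing the number of people identified using the EHR in each stratum by the number who responded to the follow-up survey. That reading is inferred from the table values rather than stated in the text, so the sketch below should be taken as an illustration of the arithmetic and of the truncation step, with hypothetical variable and function names.

```python
# Counts copied from Table 4.3.6; keys are (race/ethnicity, CRS likelihood stratum).
identified = {
    ("White non-Hispanic", "higher"): 13132,
    ("White non-Hispanic", "intermediate"): 47892,
    ("White non-Hispanic", "lower"): 131366,
    ("Black non-Hispanic", "higher"): 170,
    ("Black non-Hispanic", "intermediate"): 991,
    ("Black non-Hispanic", "lower"): 2832,
    ("Hispanic", "higher"): 192,
    ("Hispanic", "intermediate"): 1035,
    ("Hispanic", "lower"): 3159,
}
responded_followup = {
    ("White non-Hispanic", "higher"): 3076,
    ("White non-Hispanic", "intermediate"): 950,
    ("White non-Hispanic", "lower"): 551,
    ("Black non-Hispanic", "higher"): 17,
    ("Black non-Hispanic", "intermediate"): 67,
    ("Black non-Hispanic", "lower"): 62,
    ("Hispanic", "higher"): 19,
    ("Hispanic", "intermediate"): 92,
    ("Hispanic", "lower"): 98,
}

# Assumed relationship: weight = identified in EHR / responded to follow-up.
full_weights = {k: identified[k] / responded_followup[k] for k in identified}


def truncate_largest(weights):
    """Replace the single largest weight with the next largest weight."""
    ordered = sorted(weights.values())
    largest, next_largest = ordered[-1], ordered[-2]
    return {k: (next_largest if w == largest else w) for k, w in weights.items()}


truncated_weights = truncate_largest(full_weights)
```

Under this reading, the largest weight (238.41, for white non-Hispanic participants with a lower likelihood of CRS) is truncated to 50.41 in the primary analysis, which limits the influence of a stratum in which each respondent would otherwise stand in for roughly 238 people.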
We evaluated the association of UNGD with depression symptoms in two ways, using multinomial logistic regression and negative binomial regression. We used
multinomial logistic regression to evaluate the association of UNGD with different levels
of depression symptoms separately because we hypothesized that UNGD would be
differentially associated with depression symptom severity. We fit multinomial logistic
models to estimate the association of the UNGD activity metric with each level of
depression symptoms (mild, moderate, moderately severe/severe) compared to no
depression symptoms (base outcome). We also evaluated the association of UNGD with
depression symptoms using negative binomial regression. Negative binomial
regression treats the PHQ-8 score as a continuous outcome, which allowed us to
evaluate associations between UNGD and the burden of depression symptoms, rather
than with the screening tool cutoffs.51
We hypothesized that participants not on antidepressants may be more
susceptible to potential stressors like UNGD. To test this hypothesis, we evaluated effect
modification by antidepressant use by adding cross-products of the UNGD indicator
variables and antidepressant medication use to the final multinomial logistic and
negative binomial models, and used a Wald test to evaluate the significance of the
cross-products. We evaluated whether migraine or fatigue symptoms, measured on the
baseline questionnaire, mediated the associations between UNGD and depression
symptoms (measured on the follow-up survey) by including these variables in the
negative binomial model and comparing the UNGD effect estimates from models with
and without the potential mediators.
Because weighted models tend to be less precise but less biased, and unweighted
models tend to be more precise but more biased,52 we wanted to evaluate the influence
of weighting. In a sensitivity analysis, we evaluated associations of UNGD with
depression symptoms among all subjects using the final multinomial logistic and
negative binomial models without weights and with full weights (i.e., not truncating the largest weight to the second largest weight, as was done in the primary analysis).
To assess the association of the UNGD activity metrics with disordered sleep
diagnoses, we fit a survey-weighted generalized estimating equations model, to account
for multiple events within participants. In this analysis, if UNGD was associated with
disordered sleep, we would next evaluate whether disordered sleep was a mediator of
the UNGD-depression symptom association.
4.4 Results
4.4.1 Description of study population
Of the 4,932 subjects in the study, 170 did not answer any PHQ-8 questions,
2,976 had no significant depression symptoms, 1,075 had mild depression symptoms,
454 had moderate depression symptoms, and 257 had moderately severe or severe
depression symptoms (Table 4.4.1). Participants with more severe depression
symptoms, compared to those with no or less severe symptoms, were more likely to be
female, to take antidepressants, and to have heavy alcohol use recorded in the EHR (all p <
0.01). We identified 8,578 disordered sleep diagnoses using EHR data among 1,699 of
the 4,932 study subjects. The remaining 3,233 study subjects did not have disordered
sleep diagnoses. After selecting one disordered sleep diagnosis per person per year,
3,868 disordered sleep diagnoses were included in the study (Figure 4.3.2.2).
Participants with at least one disordered sleep diagnosis, compared to those with no
disordered sleep diagnoses, were more likely to be female and to be older at the time of
survey return (both p < 0.05).
Table 4.4.1. Descriptive statistics by depression symptoms.

                                          Depression symptoms
Variable                                  No significant  Mild          Moderate      Moderately      Missing
                                          depression                                  severe/severe
                                          symptoms
Total number[a], n (%[b])                 2,976 (100)     1,075 (100)   454 (100)     257 (100)       170 (100)
UNGD[c] metric, n (%)
  Very low                                756 (25.4)      259 (24.1)    117 (25.8)    62 (24.1)       39 (22.9)
  Low                                     726 (24.4)      285 (26.5)    113 (24.9)    66 (25.7)       43 (25.3)
  Medium                                  776 (26.1)      253 (23.5)    101 (22.2)    58 (22.6)       45 (26.5)
  High                                    718 (24.1)      278 (25.9)    123 (27.1)    71 (27.6)       43 (25.3)
  p[d] = 0.65
Race, n (%)
  White                                   2766 (92.9)     1005 (93.5)   420 (92.5)    228 (88.7)      158 (92.9)
  Black                                   88 (3.0)        33 (3.1)      14 (3.1)      8 (3.1)         3 (1.8)
  Hispanic                                122 (4.1)       37 (3.4)      20 (4.4)      21 (8.2)        9 (5.3)
  p = 0.11
Female, n (%)                             1829 (61.5)     721 (67.1)    294 (64.8)    191 (74.3)      87 (51.2)
  p < 0.01
Medical Assistance, n (%)                 138 (4.6)       107 (10.0)    80 (17.6)     84 (32.7)       12 (7.1)
  p < 0.01
Smoking status, n (%)
  Never                                   1774 (59.6)     588 (54.7)    233 (51.3)    107 (41.6)      83 (48.8)
  Current                                 278 (9.3)       162 (15.1)    81 (17.8)     67 (26.1)       18 (10.6)
  Former                                  924 (31.0)      325 (30.2)    140 (30.8)    83 (32.3)       69 (40.6)
  p < 0.01
Community type, n (%)
  Borough                                 799 (26.8)      284 (26.4)    131 (28.9)    80 (31.1)       46 (27.1)
  City                                    202 (6.8)       99 (9.2)      45 (9.9)      34 (13.2)       9 (5.3)
  Township                                1975 (66.4)     692 (64.4)    278 (61.2)    143 (55.6)      115 (67.6)
  p < 0.01
Well water, n (%)                         1129 (37.9)     410 (38.1)    147 (32.4)    66 (25.7)       75 (44.1)
  p < 0.01
Alcohol status, n (%)
  No                                      1256 (42.2)     431 (40.1)    183 (40.3)    121 (47.1)      82 (48.2)
  Yes, not heavy                          1505 (50.6)     524 (48.7)    191 (42.1)    92 (35.8)       78 (45.9)
  Yes, heavy                              215 (7.2)       120 (11.2)    80 (17.6)     44 (17.1)       10 (5.9)
  p < 0.01
On depression medication, n (%)           601 (20.2)      396 (36.8)    213 (46.9)    138 (53.7)      43 (25.3)
  p < 0.01
Number of PHQ-8 questions missing, n (%)
  0                                       2796 (94)       977 (90.9)    411 (90.5)    235 (91.4)      0 (0)
  1-7                                     180 (6)         98 (9.1)      43 (9.5)      22 (8.6)        0 (0)
  All 8                                   0 (0)           0 (0)         0 (0)         0 (0)           170 (100)
  p < 0.01
BMI, mean                                 29.6            30.5          31.5          32.1            29.4

Abbreviation: UNGD = unconventional natural gas development
[a] The follow-up responders outside of Pennsylvania (n = 34) were excluded.
[b] Column percent
[c] The UNGD metric was a composite for four phases of well development (pad preparation, drilling, stimulation, and production) and was assigned for the two weeks prior to survey return.
[d] p-values from chi-squared tests of each covariate with the different levels of depression symptoms (no, mild, moderate, moderately severe / severe depression symptoms; missing).
4.4.2 Associations of UNGD with depression symptoms
The high and low groups of the UNGD activity index (vs. very low) were associated with mild depression symptoms (vs. none) in an adjusted multinomial logistic model (Table 4.4.2.1) and with the burden of depression symptoms in an adjusted
negative binomial model (Table 4.4.2.2). There was no association between the medium
UNGD activity group (vs. very low) and depression symptoms in either model. When we added a cross-product between UNGD and antidepressant medication use to the model, the p-values from the Wald test of the cross-product were 0.14 and 0.12 in the multinomial logistic and negative binomial models, respectively, indicating that there was no statistically significant effect modification by antidepressant use. When we added fatigue or migraine to the negative binomial models of UNGD and depression symptoms, the coefficient for the high UNGD activity group decreased by 9.4% and 3.4%, respectively,
providing weak evidence of mediation by fatigue symptoms on the associations of
UNGD with depression symptoms (Table 4.4.2.3). In the sensitivity analysis to evaluate the influence of weighting on associations, we observed stronger associations between
UNGD and depression symptoms using full weights, and no association between UNGD and depression symptoms using no weights (Tables 4.4.2.1 and 4.4.2.2).
Table 4.4.2.1. Association of UNGD and depression symptoms in survey multinomial logistic models (n=4,762[a]).

UNGD activity group[b]        Mild depression       Moderate depression   Moderately severe/severe
                              symptoms[c,d]         symptoms[c,d]         depression symptoms[c,d]
                              OR (95% CI)           OR (95% CI)           OR (95% CI)
Truncated survey weights[e]
  Low[f]                      1.63 (1.21 - 2.19)    1.22 (0.80 - 1.86)    1.13 (0.61 - 2.06)
  Medium                      1.25 (0.92 - 1.71)    1.04 (0.68 - 1.60)    0.89 (0.47 - 1.69)
  High                        1.51 (1.12 - 2.04)    1.26 (0.83 - 1.92)    1.39 (0.76 - 2.54)
Full survey weights
  Low                         1.72 (1.14 - 2.59)    1.20 (0.67 - 2.14)    0.93 (0.37 - 2.34)
  Medium                      1.29 (0.84 - 1.98)    1.23 (0.66 - 2.28)    0.68 (0.26 - 1.79)
  High                        1.95 (1.28 - 2.97)    1.77 (0.98 - 3.20)    1.47 (0.63 - 3.46)
No survey weights
  Low                         1.23 (1.003 - 1.50)   1.04 (0.78 - 1.39)    1.19 (0.82 - 1.74)
  Medium                      0.996 (0.81 - 1.22)   0.84 (0.63 - 1.13)    0.91 (0.61 - 1.34)
  High                        1.12 (0.92 - 1.37)    1.06 (0.80 - 1.40)    1.11 (0.76 - 1.61)

Abbreviations: UNGD = unconventional natural gas development, OR = odds ratio, CI = confidence interval
[a] The follow-up responders outside of Pennsylvania (n = 34) and those who answered no depression symptom questions (n = 170) were excluded.
[b] The UNGD metric was a composite for four phases of well development (pad preparation, drilling, stimulation, and production) and was assigned for the two weeks prior to follow-up survey return.
[c] Models adjusted for race/ethnicity (white non-Hispanic, black non-Hispanic, Hispanic), sex (male, female), Medical Assistance (no, yes), age (years, linear and quadratic terms), smoking status (never, former, current), alcohol status (no; yes, not heavy; yes, heavy), body mass index (BMI, kg/m2, linear and quadratic terms), community socioeconomic deprivation (linear and quadratic terms), and water source (municipal water, well water).
[d] No depression symptoms was the base outcome.
[e] Primary analysis
[f] Very low was the reference group.
Table 4.4.2.2. Association of UNGD and depression symptoms in survey negative binomial models (n = 4,762^a).

UNGD activity                 Depression symptoms^c
group^b                       Exponentiated coefficient^d (95% CI)
Truncated survey weights^e
  Low^f                       1.14 (1.01 - 1.29)
  Medium                      1.03 (0.91 - 1.17)
  High                        1.18 (1.04 - 1.34)
Full survey weights
  Low                         1.12 (0.94 - 1.34)
  Medium                      1.07 (0.88 - 1.29)
  High                        1.29 (1.08 - 1.56)
No survey weights
  Low                         1.05 (0.96 - 1.15)
  Medium                      0.96 (0.88 - 1.05)
  High                        1.03 (0.94 - 1.13)

Abbreviations: UNGD = unconventional natural gas development, CI = confidence interval
^a The follow-up responders outside of Pennsylvania (n = 34) and those who answered no depression symptom questions (n = 170) were excluded.
^b The UNGD metric was a composite for four phases of well development (pad preparation, drilling, stimulation, and production) and was assigned for the two weeks prior to follow-up survey return.
^c Models adjusted for race/ethnicity (white non-Hispanic, black non-Hispanic, Hispanic), sex (male, female), Medical Assistance (no, yes), age (years, linear and quadratic terms), smoking status (never, former, current), alcohol status (no; yes, not heavy; yes, heavy), body mass index (BMI, kg/m^2, linear and quadratic terms), community socioeconomic deprivation (linear and quadratic terms), and water source (municipal water, well water).
^d Ratio of mean symptom counts.
^e Primary analysis.
^f Very low was the reference group.
Table 4.4.2.3. Association of UNGD (assigned at baseline) and depression symptoms in survey negative binomial models that include migraine or fatigue (n = 4,762^a).

UNGD activity     UNGD 3 months            UNGD 3 months prior      UNGD 3 months prior
group^b           prior to baseline^c      to baseline with         to baseline with
                                           fatigue as a covariate   migraine as a covariate
                  Exponentiated            Exponentiated            Exponentiated
                  coefficient^d (95% CI)   coefficient^d (95% CI)   coefficient^d (95% CI)
Low^e             1.07 (0.95 - 1.21)       1.03 (0.91 - 1.16)       1.08 (0.96 - 1.22)
Medium            1.02 (0.90 - 1.15)       0.997 (0.88 - 1.13)      1.05 (0.92 - 1.19)
High              1.17 (1.03 - 1.33)       1.06 (0.94 - 1.21)       1.13 (0.999 - 1.29)
Fatigue                                    2.58 (2.39 - 2.80)
Migraine                                                            1.77 (1.60 - 1.96)

Abbreviations: UNGD = unconventional natural gas development, CI = confidence interval
^a The follow-up responders outside of Pennsylvania (n = 34) and those who answered no depression symptom questions (n = 170) were excluded.
^b The UNGD metric was a composite for four phases of well development (pad preparation, drilling, stimulation, and production) and was assigned for the three months prior to baseline survey return.
^c Models adjusted for race/ethnicity (white non-Hispanic, black non-Hispanic, Hispanic), sex (male, female), Medical Assistance (no, yes), age (years, linear and quadratic terms), smoking status (never, former, current), alcohol status (no; yes, not heavy; yes, heavy), body mass index (BMI, kg/m^2, linear and quadratic terms), community socioeconomic deprivation (linear and quadratic terms), and water source (municipal water, well water).
^d Ratio of mean symptom counts.
^e Very low was the reference group.
4.4.3 Associations of UNGD with disordered sleep
In the multilevel model for the longitudinal disordered sleep outcome, UNGD was not associated with disordered sleep diagnoses (Table 4.4.3). Because there was no association of UNGD and disordered sleep, we did not evaluate mediation of the UNGD and depression symptoms association by disordered sleep.
Table 4.4.3. Association between UNGD and sleep deprivation in a survey-weighted^a generalized estimating equations model.

UNGD activity     Depression symptoms^c
group^b           OR (95% CI)
Low^d             0.96 (0.73 - 1.25)
Medium            1.06 (0.80 - 1.40)
High              1.06 (0.79 - 1.42)

Abbreviations: UNGD = unconventional natural gas development, OR = odds ratio, CI = confidence interval
^a Truncated survey weights were used.
^b The UNGD metric was a composite for four phases of well development (pad preparation, drilling, stimulation, and production) and was assigned for the three months prior to each event.
^c Models adjusted for race/ethnicity (white non-Hispanic, black non-Hispanic, Hispanic), sex (male, female), Medical Assistance (no, yes), age (years, linear and quadratic terms), smoking status (never, former, current), alcohol status (no; yes, not heavy; yes, heavy), body mass index (BMI, kg/m^2, linear and quadratic terms), community socioeconomic deprivation (linear and quadratic terms), and water source (municipal water, well water).
^d Very low was the reference group.
4.5 Discussion
We observed an association between UNGD activity and depression symptoms, but it was present only in weighted models. Because unweighted models tend to be more biased, the weighted analysis was our primary analysis. Antidepressant use did not appear to be an effect modifier of this relationship. We observed suggestive evidence of mediation by fatigue of the association between UNGD and depression symptoms, but we cannot rule out the possibility that fatigue was instead a confounder of the association. Although we established temporality in the mediation analysis by making the UNGD metric precede fatigue, and fatigue precede depression symptoms, we cannot rule out the potential for confounding because migraine, fatigue, depression symptoms, and the UNGD activity metrics were all correlated within a person over time. Finally, we did not observe an association between UNGD and disordered sleep.
There are several biologically plausible pathways for UNGD to affect depression
symptoms, including air pollution, psychosocial stress, and changes to the built or social
environment (which may be mediated through stress30). Each stage of UNGD has air
impacts, including from truck traffic, diesel powered machinery, and fugitive
emissions.1,2,8-12,17 UNGD has contributed to psychosocial stress in the region; in an
analysis of letters to the editor about UNGD in a newspaper in Pennsylvania, the authors
identified stress as a major theme,16 and among a convenience sample of people living
near UNGD in Pennsylvania, the most commonly reported symptom was stress.53 In a
study of symptoms that found that dermal and upper respiratory symptoms were more
common in people living less than 1 km from a drilled well compared to those living more
than 2 km from a drilled well, the authors suggested stress as a potential pathway for
these associations.21
Short- and long-term exposure to air pollution has been associated with depression symptoms.29,54,55 For example, in a study in Korea that evaluated long-term exposure, a 10 µg/m^3 increase in PM2.5 over the prior year was associated with 1.47 times the risk of a diagnosis of major depressive disorder.55 Air pollution is hypothesized to affect depression through an inflammatory pathway.55
Exposure to technological disasters has also been associated with depression.
Technological disasters, such as oil spills, are long lasting and of man-made origin, and they can affect health through psychosocial stress pathways or by influencing health-related behaviors.56 For example, women who reported income loss from the
Deepwater Horizon oil spill or physical exposure to the oil spill were more likely to exhibit depression symptoms than those without income loss or physical exposure.57 Similarly,
UNGD is a long-lasting exposure of man-made origin that has effects on health through
psychosocial stress pathways.
Studies of neighborhood conditions, including socioeconomic measures, crime
rates, and the built environment, in relation to depression have generally, but not always,
reported that worse neighborhood conditions were associated with depression.30,31 The
associations between neighborhood conditions and depression may be mediated through stress. Residents of disadvantaged neighborhoods may experience greater exposure to stressors and poorer access to mental and physical health resources, both
of which could contribute to depression.30,31
This study had several strengths, in particular its large sample size and the fact that it was
the first study to evaluate the association of UNGD with a mental health outcome. It
assessed depression symptoms with a questionnaire, which is a strength because
depression and its symptoms may not be well captured in EHRs. Unlike a prior study of
UNGD and self-reported outcomes,22 the questionnaire did not mention UNGD, which is
a strength because it reduced the possibility of dependent measurement error.
Additionally, the UNGD metric captured the time-varying nature of well development,
though it was not able to determine the pathway through which UNGD was associated
with depression symptoms.
This study had several limitations. Responders tended to be sicker than the general population because the survey was designed to oversample patients with nasal and sinus symptoms.25,26 We used survey weights to account for the survey design and non-response, but there may still be differences between the weighted population and the general population. We used ICD-9 codes and medication orders in the EHR to identify disordered sleep diagnoses, but if many participants treated disordered sleep over the counter, the EHR would have low sensitivity for identifying disordered sleep.58
This could explain why we did not observe an association between UNGD and
disordered sleep. Future studies could consider identifying disordered sleep in the
clinical notes or by questionnaire. Additionally, we did not have information on the onset
of migraine, fatigue, or depression symptoms. In longitudinal studies, fatigue and
depression are risk factors for one another, but because this study asked about each
symptom at one time point only, we could not determine the timing of onset or changes
in symptom status.59 We did not have information on whether survey responders had signed a lease with a drilling company. Leaseholders are more supportive of UNGD than non-leaseholders,60 so lease-holding could be an effect modifier if people who have gained financially from UNGD experience this development differently than those who have not.
4.6 Conclusions
UNGD was associated with depression symptoms in a large population, and this association may be mediated by fatigue symptoms. UNGD was not associated with disordered sleep diagnoses using EHR data. This was the first study of UNGD and a mental health or disordered sleep outcome.
4.7 References
1. Adgate JL, Goldstein BD, McKenzie LM. Potential public health hazards, exposures
and health effects from unconventional natural gas development. Environ Sci Technol.
2014;48(15):8307-8320.
2. McKenzie LM, Witter RZ, Newman LS, Adgate JL. Human health risk assessment of
air emissions from development of unconventional natural gas resources. Sci Total
Environ. 2012;424:79-87.
3. Maloney KO, Yoxtheimer DA. Production and disposal of waste materials from gas and oil extraction from the Marcellus shale play in Pennsylvania. Env Prac.
2012;14(04):278-287.
4. Olmstead SM, Muehlenbachs LA, Shih J, Chu Z, Krupnick AJ. Shale gas development
impacts on surface water quality in Pennsylvania. Proceedings of the National Academy
of Sciences. 2013;110(13):4962-4967.
5. Jackson RB, Vengosh A, Darrah TH, et al. Increased stray gas abundance in a subset of drinking water wells near Marcellus shale gas extraction. Proceedings of the National
Academy of Sciences. 2013;110(28):11250-11255.
6. Warner NR, Jackson RB, Darrah TH, et al. Geochemical evidence for possible natural migration of Marcellus formation brine to shallow aquifers in Pennsylvania. Proceedings of the National Academy of Sciences. 2012;109(30):11961-11966.
7. Osborn SG, Vengosh A, Warner NR, Jackson RB. Methane contamination of drinking water accompanying gas-well drilling and hydraulic fracturing. Proc Natl Acad Sci U S A.
2011;108(20):8172-8176.
8. Roy AA, Adams PJ, Robinson AL. Air pollutant emissions from the development, production, and processing of Marcellus shale natural gas. J Air Waste Manage Assoc.
2013;64(1):19-37.
9. Pacsi AP, Alhajeri NS, Zavala-Araiza D, Webster MD, Allen DT. Regional air quality impacts of increased natural gas production and use in Texas. Environ Sci Technol.
2013;47(7):3521-3527.
10. Pacsi AP, Kimura Y, McGaughey G, McDonald-Buller E, Allen DT. Regional ozone impacts of increased natural gas use in the Texas power sector and development in the
Eagle Ford shale. Environ Sci Technol. 2015;49(6):3966-3973.
11. Litovitz A, Curtright A, Abramzon S, Burger N, Samaras C. Estimation of regional air-quality damages from Marcellus shale natural gas extraction in Pennsylvania.
Environmental Research Letters. 2013;8(1):014017.
12. Kemball-Cook S, Bar-Ilan A, Grant J, et al. Ozone impacts of natural gas
development in the Haynesville shale. Environ Sci Technol. 2010;44(24):9357-9363.
13. Sangaramoorthy T, Jamison AM, Boyle MD, et al. Place-based perceptions of the
impacts of fracking along the Marcellus shale. Soc Sci Med. 2016.
14. Muehlenbachs L, Spiller E, Timmins C. The housing market impacts of shale gas
development. Am Econ Rev. 2015;105(12):3633-59.
15. Gopalakrishnan S, Klaiber HA. Is the shale energy boom a bust for nearby residents? evidence from housing values in Pennsylvania. Am J Agric Econ.
2014;96(1):43-66.
16. Powers M, Saberi P, Pepino R, Strupp E, Bugos E, Cannuscio CC. Popular
epidemiology and “fracking”: Citizens’ concerns regarding the economic, environmental,
health and social impacts of unconventional natural gas drilling operations. J Community
Health. 2015;40(3):534-541.
17. Vinciguerra T, Yao S, Dadzie J, et al. Regional air quality impacts of hydraulic
fracturing and shale natural gas activity: Evidence from ambient VOC observations.
Atmos Environ. 2015;110:144-150.
18. Casey JA, Savitz DA, Rasmussen SG, et al. Unconventional natural gas
development and birth outcomes in Pennsylvania, USA. Epidemiology. 2015.
19. McKenzie LM, Guo R, Witter RZ, Savitz DA, Newman LS, Adgate JL. Birth outcomes and maternal residential proximity to natural gas development in rural Colorado. Environ
Health Perspect. 2014.
20. Stacy SL, Brink LL, Larkin JC, et al. Perinatal outcomes and unconventional natural gas operations in southwest Pennsylvania. PLOS ONE. 2015;10(6):e0126425.
21. Rabinowitz PM, Slizovskiy IB, Lamers V, et al. Proximity to natural gas wells and reported health status: Results of a household survey in Washington County, Pennsylvania. Environ Health Perspect. 2014.
22. Saberi P, Propert KJ, Powers M, Emmett E, Green-McKenzie J. Field survey of
health perception and complaints of Pennsylvania residents in the Marcellus shale
region. Int J Environ Res Public Health. 2014;11(6):6517-6527.
23. Kroenke K, Strine TW, Spitzer RL, Williams JB, Berry JT, Mokdad AH. The PHQ-8
as a measure of current depression in the general population. J Affect Disord.
2009;114(1):163-173.
24. American Psychiatric Association. Diagnostic and statistical manual of mental disorders (DSM-5®). American Psychiatric Pub; 2013.
25. Hirsch AG, Stewart WF, Sundaresan AS, et al. Nasal and sinus symptoms and chronic rhinosinusitis in a population-based sample. Allergy. 2016.
26. Tustin AW, Hirsch AG, Rasmussen SG, Casey JA, Bandeen-Roche K, Schwartz BS.
Associations between unconventional natural gas development and nasal and sinus, migraine headache, and fatigue symptoms in Pennsylvania. Environ Health Perspect.
2016.
27. Rasmussen SG, Ogburn EL, McCormack M, et al. Association between unconventional natural gas development in the Marcellus shale and asthma exacerbations. JAMA Intern Med. 2016;176(9):1334-1343.
28. Murray CJ, Abraham J, Ali MK, et al. The state of US health, 1990-2010: Burden of diseases, injuries, and risk factors. JAMA. 2013;310(6):591-606.
29. Lim YH, Kim H, Kim JH, Bae S, Park HY, Hong YC. Air pollution and symptoms of depression in elderly adults. Environ Health Perspect. 2012;120(7):1023-1028.
30. Kim D. Blues from the neighborhood? neighborhood characteristics and depression.
Epidemiol Rev. 2008;30:101-117.
31. Richardson R, Westley T, Gariépy G, Austin N, Nandi A. Neighborhood socioeconomic conditions and depression: A systematic review and meta-analysis. Soc
Psychiatry Psychiatr Epidemiol. 2015;50(11):1641-1656.
32. Kendler KS, Gardner CO, Prescott CA. Toward a comprehensive developmental model for major depression in men. Am J Psychiatry. 2006;163(1):115-124.
33. Kendler KS, Gardner CO, Prescott CA. Toward a comprehensive developmental model for major depression in women. Am J Psychiatry. 2002;159(7):1133-1145.
34. Hammen C. Stress and depression. Annu Rev Clin Psychol. 2005;1:293-319.
35. Riemann D. Insomnia and comorbid psychiatric disorders. Sleep Med. 2007;8 Suppl
4:S15-20.
36. Riemann D, Voderholzer U. Primary insomnia: A risk factor to develop depression? J
Affect Disord. 2003;76(1-3):255-259.
37. Bigal ME, Lipton RB. The epidemiology, burden, and comorbidities of migraine.
Neurol Clin. 2009;27(2):321-334.
38. Jette N, Patten S, Williams J, Becker W, Wiebe S. Comorbidity of migraine and psychiatric disorders--a national population-based study. Headache. 2008;48(4):501-
516.
39. Antonaci F, Nappi G, Galli F, Manzoni GC, Calabresi P, Costa A. Migraine and
psychiatric comorbidity: A review of clinical findings. J Headache Pain. 2011;12(2):115-
125.
40. O'Donnell JF. Insomnia in cancer patients. Clin Cornerstone. 2004;6(1):S6-S14. doi: http://dx.doi.org/10.1016/S1098-3597(05)80002-X.
41. Balkrishnan R, Rasu RS, Rajagopalan R. Physician and patient determinants of pharmacologic treatment of sleep difficulties in outpatient settings in the United States.
Sleep. 2005;28(6):715.
42. Lipton RB, Dodick D, Sadovsky R, et al. A self-administered screener for migraine in
primary care: The ID migraine validation study. Neurology. 2003;61(3):375-382.
43. Patient-Reported Outcomes Measurement Information System. PROMIS fatigue short form 8a. http://www.assessmentcenter.net. Updated 2015. Accessed October 10,
2015.
44. Casey JA, Ogburn EL, Rasmussen SG, et al. Predictors of indoor radon concentrations in Pennsylvania, 1989-2013. Environ Health Perspect.
2015;123(11):1130-1137.
45. Schwartz BS, Glass TA, Pollak J, et al. Depression, its comorbidities and treatment, and childhood body mass index trajectories. Obesity (Silver Spring). 2016.
46. Schwartz BS, Stewart WF, Godby S, et al. Body mass index and the built and social environments in children and adolescents using electronic health records. Am J Prev
Med. 2011;41(4):e17-e28.
47. Liu AY, Curriero FC, Glass TA, Stewart WF, Schwartz BS. Associations of the burden of coal abandoned mine lands with three dimensions of community context in
Pennsylvania. ISRN Public Health. 2012;2012.
48. Pennsylvania Department of Health. Public water systems. Environmental Health Tracking Program Web site. http://www.health.pa.gov/My%20Health/Environmental%20Health/Environmental%20Public%20Health%20Tracking/Pages/Metadata-for-Drinking-Water-Quality.aspx#.V0Xr8JErKM8. Updated 2015. Accessed May 25, 2016.
49. Potter F. Survey of procedures to control extreme sampling weights. . 1988:453-458.
50. Esser MB, Hedden SL, Kanny D, Brewer RD, Gfroerer JC, Naimi TS. Prevalence of alcohol dependence among US adult drinkers, 2009-2011. Prev Chronic Dis.
2014;11:E206.
51. Gries CJ, Engelberg RA, Kross EK, et al. Predictors of symptoms of posttraumatic stress and depression in family members after patient death in the ICU. Chest.
2010;137(2):280-287.
52. Pike GR. Using weighting adjustments to compensate for survey nonresponse.
Research in Higher Education. 2008;49(2):153-171.
53. Ferrar KJ, Kriesky J, Christen CL, et al. Assessment and longitudinal analysis of health impacts and stressors perceived to result from unconventional shale gas development in the Marcellus shale region. Int J Occup Environ Health. 2013;19(2):104-
112.
54. Szyszkowicz M, Rowe BH, Colman I. Air pollution and daily emergency department visits for depression. Int J Occup Med Environ Health. 2009;22(4):355-362.
55. Kim KN, Lim YH, Bae HJ, Kim M, Jung K, Hong YC. Long-term fine particulate matter exposure and major depressive disorder in a community-based urban cohort.
Environ Health Perspect. 2016.
56. Couch SR, Coles CJ. Community stress, psychosocial hazards, and EPA decision-making in communities impacted by chronic technological disasters. Am J Public Health.
2011;101(S1).
57. Rung AL, Gaston S, Oral E, et al. Depression, mental distress and domestic conflict among Louisiana women exposed to the Deepwater Horizon oil spill in the WaTCH study.
Environ Health Perspect. 2016.
58. Casey JA, Schwartz BS, Stewart WF, Adler NE. Using electronic health records for population health research: A review of methods and applications. Annu Rev Public
Health. 2015(0).
59. Skapinakis P, Lewis G, Mavreas V. Temporal relations between unexplained fatigue and depression: Longitudinal data from an international study in primary care.
Psychosom Med. 2004;66(3):330-335.
60. Kriesky J, Goldstein BD, Zell K, Beach S. Differing opinions about natural gas drilling in two adjacent counties with different levels of drilling activity. Energy Policy.
2013;58:228-236.
Chapter 5: Exposure assessment using secondary data sources in unconventional natural gas development and health studies
5.0 Cover page
Sara Rasmussen, MHS^1; Kirsten Koehler, PhD^1; J. Hugh Ellis, PhD^1; David Manthos, BA^2; Karen Bandeen-Roche, PhD^3; Rutherford Platt, PhD^4; and Brian S. Schwartz*, MD, MS^1,5,6
^1 Department of Environmental Health and Engineering, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
^2 SkyTruth, Shepherdstown, WV, USA
^3 Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
^4 Department of Environmental Studies, Gettysburg College, Gettysburg, Pennsylvania, USA
^5 Department of Epidemiology and Health Services Research, Geisinger Health System, Danville, Pennsylvania, USA
^6 Department of Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
Acknowledgements: This study was funded by the National Institute of Environmental Health Sciences grant ES023675-01 (PI: B S Schwartz), training grant ES07141 (S G Rasmussen), the Degenstein Foundation, and the National Science Foundation Integrative Graduate Education and Research Traineeship (S G Rasmussen). No funders had input into the study design, conduct, data collection or analysis, or manuscript preparation. Dr. Schwartz is a Fellow of the Post Carbon Institute (PCI), serving as an informal advisor on climate, energy, and health issues. He receives no payment for this role. His research is entirely independent of PCI, and is not motivated, reviewed, or funded by PCI. We thank Joseph J. DeWalle (Geisinger Health System) for creating the map; John Amos (SkyTruth) for guidance with the flaring and impoundment data; and Chloe Quinlan (Johns Hopkins Bloomberg School of Public Health), Jennifer Irving, and Joshua Crisp (Geisinger Health System) for compiling the compressor data.
Reproduced with permission from Environmental Science and Technology, submitted for publication. Unpublished work copyright 2017 American Chemical Society.
5.1 Abstract
Studies of unconventional natural gas development (UNGD) and health have ranked participants along an “exposure” gradient using geographic information system (GIS)-based proxies that incorporated the distance between participants’ home addresses and unconventional natural gas wells. However, studies have used different GIS-based proxies, making comparison of results across studies difficult. Furthermore, studies have incorporated only wells, neglecting other components of development, namely compressors, impoundments, and flaring events, which may be relevant to health. Here, we characterized UNGD-related impoundments, compressors, and flaring events in Pennsylvania and evaluated whether and how to incorporate these into exposure assessment using a principal component analysis. We compared three different approaches to GIS-based UNGD metrics used in health studies, both to each other and in their associations with a health outcome, mild asthma exacerbations. We identified 361 compressor stations, 1,218 impoundments, and 216 locations with flaring events. The principal component analysis identified a single component that was approximately an equal mix of the metrics for compressors, impoundments, and four phases of well development (pad preparation, drilling, stimulation, and production). The three GIS-based UNGD metrics had different magnitudes of association with mild asthma exacerbations, although the highest category of each metric (vs. the lowest) was associated with the outcome, regardless of metric.
5.2 Introduction
Unconventional natural gas (UNG) constitutes over 40% of the natural gas produced in the U.S., up from less than 10% in 2007. Pennsylvania’s Marcellus shale accounts for over a quarter of the country’s UNG production.1 Several epidemiology studies have evaluated associations of unconventional natural gas development (UNGD) with health outcomes, but these studies used different UNGD metrics to categorize participants, which makes results difficult to compare, and these metrics have incorporated only wells, though wells are just one component of UNGD-related infrastructure.
UNGD involves pad preparation, drilling, perforation, stimulation, and gas production. The fluid returning to the surface with the gas can be stored in surface impoundments, where volatile organic compounds (VOCs) evaporate. Gas is
compressed using diesel- or natural gas-powered compressor engines before distribution.2 Residents of regions undergoing UNGD face potential chemical exposures through water, soil, and air; physical exposures, including noise, light, and vibration; and community impacts.2-21 Exposure to UNGD is not a single exposure but multiple time-varying exposures, each with a different scale of impact.
Epidemiologic studies have evaluated the associations of UNGD with birth
outcomes,22-24 asthma exacerbations,25 symptoms,26-28 cancer,29, 30 hospitalization
rates,31 and car crashes,5 using several different metrics to rank study participants on a
gradient by UNGD. The epidemiologic studies that assigned UNGD metrics on an
individual level22-26, 28 assigned their metrics using a geographic information system
(GIS)-based proxy that incorporated the distance between study participants’ home
addresses and UNG wells, using a nearest neighbor distance or gravity model approach.
The primary advantages of GIS-based proxies are that they are inexpensive compared
to multiple pathway exposure assessment of physical, chemical, and social impacts, and that they can be used retrospectively.
However, the GIS-based proxies used in these studies had limitations. Thus far, these metrics have only incorporated wells, even though other components of UNGD, such as compressor stations and impoundments, may have air quality impacts.12, 21 An
air emissions study estimated that compressor stations were responsible for the majority
of UNGD-related emissions of VOC, nitrogen oxides, and PM10 and PM2.5 (particulate
matter less than or equal to 10 and 2.5 micrometers in aerodynamic diameter,
respectively) in Pennsylvania in 2011.21 Impoundments remain a largely uncharacterized
source of air emissions.12 Because no prior study has attempted to incorporate
impoundments and compressor engines into UNGD metrics, it is not clear what (if
anything) they add to metric creation. Additionally, no study has incorporated flaring
events, which are sources of combustion products.2, 32 Finally, because studies have
used different approaches to defining metrics, comparing results across studies is
problematic.
To address the limitations related to UNGD metrics used in epidemiology studies, the three primary aims of the analyses in this paper were to: 1) characterize UNGD-
related impoundments, compressor engines, and flaring events in Pennsylvania; 2)
evaluate whether and how to incorporate impoundments, compressor engines, and
flaring events into a UNGD metric; and 3) compare different GIS-based UNGD metrics used in existing studies to each other and in their associations with mild asthma exacerbations.
5.3 Methods
5.3.1 UNGD-related compressor engines, impoundments, and flaring events in Pennsylvania
Unlike data on wells,33, 34 data on compressor engines and impoundments are not available electronically. To identify compressor engines, we obtained a list of compressor stations thought to be UNGD-related from the Pennsylvania Department of
Environmental Protection (DEP) (n = 506). We visited four DEP locations (Northeast,
North-central, Northwest, and Southwest) and scanned relevant documents (including applications, General Information Forms, authorizations, start letters, and cancelations; n = 6,007) between October 2013 and May 2014. We abstracted the following data from these documents: station name, location, number of compressor engines, compressor engine horsepower, compressor engine emissions, expected start date of operation, authorization date, start date, and cancelation date. Of the scanned documents, 2,700 contained at least one of these variables (initially, we scanned documents that proved unnecessary; later, we refined the criteria for which documents to scan). We excluded compressor stations that had no available documents or that, upon document review, were not UNGD-related (n = 49). After data entry, we checked the data to confirm the accuracy of the entries. We used compressor station names and site identification numbers to link data across documents. If, for example, horsepower was missing in one document, we looked for it in other documents for that compressor engine.
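The linkage step described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the field names (`site_id`, `horsepower`, `start_date`) and the example records are hypothetical and do not reproduce the abstraction form or data used in this study.

```python
from collections import defaultdict

def coalesce_by_site(records, fields):
    """Merge abstracted documents per site, keeping the first non-missing value of each field."""
    merged = defaultdict(lambda: {f: None for f in fields})
    for rec in records:
        site = merged[rec["site_id"]]
        for f in fields:
            # Fill a field only if it is still missing for this site.
            if site[f] is None and rec.get(f) is not None:
                site[f] = rec[f]
    return dict(merged)

# Hypothetical abstracted documents; None marks a variable missing from a document.
docs = [
    {"site_id": "A-1", "horsepower": None, "start_date": "2011-03-01"},
    {"site_id": "A-1", "horsepower": 1380, "start_date": None},
    {"site_id": "B-7", "horsepower": 690, "start_date": None},
]
sites = coalesce_by_site(docs, ["horsepower", "start_date"])
```

In this sketch, a value that is missing in one document is taken from another document with the same identifier, and a value that appears in no document stays missing.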
Information on impoundment locations and sizes was obtained in partnership with SkyTruth, which created a collaborative image analysis application on its website (skytruth.org) that displayed aerial imagery collected by the USDA National Agriculture Imagery Program35 of the one square kilometer area around UNG wells from the summers of 2005, 2008, 2010 and 2013 (Figure 2.2.2). Trained volunteers and staff identified and outlined impoundments. Each image was reviewed by at least three staff members or ten volunteers, 66.6% agreement was required, and assignments were validated by a GIS analyst before inclusion in the final dataset.
To estimate an installation and removal date for each impoundment, we used a trend analysis of Landsat data to identify sudden spectral changes in the grid cell that contained each impoundment. To do so, we compiled all available Landsat 5, 7, and 8 surface reflectance imagery with < 30% cloud cover for the years 2000-2015, a total of
754 images across four Landsat path/rows. For each impoundment location, we masked
remaining clouds and then interpolated a monthly time series for the near infrared band
and the normalized difference vegetation index (NDVI). We used the Breaks for Additive
Season and Trend package in R to identify discrete breaks in the time series after the
removal of seasonal effects.36 The dataset has a nominal temporal resolution of 1
month, but cloud cover and gaps can potentially delay the detection of the creation or
removal of impoundments. Based on the direction, magnitude, and timing of the time series breaks, we identified approximate dates of creation and removal of impoundments. We verified estimates for a sample of impoundments by comparing
Landsat-derived dates to photointerpretation-derived dates using historical imagery on
Google Earth.
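The general shape of this time-series step can be sketched as follows. The sketch is not the Breaks for Additive Season and Trend algorithm: it substitutes a much cruder rule (subtract monthly means, then find the single split with the largest difference in means) purely for illustration. The synthetic series, the 0.4 drop in NDVI, and all function names are assumptions.

```python
import math
from statistics import mean

def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red surface reflectance."""
    return (nir - red) / (nir + red)

def fill_gaps(series):
    """Linearly interpolate None entries between observed months (ends are held flat)."""
    known = [i for i, v in enumerate(series) if v is not None]
    out = list(series)
    for i, v in enumerate(series):
        if v is None:
            left = max((k for k in known if k < i), default=None)
            right = min((k for k in known if k > i), default=None)
            if left is None:
                out[i] = series[right]
            elif right is None:
                out[i] = series[left]
            else:
                w = (i - left) / (right - left)
                out[i] = series[left] + w * (series[right] - series[left])
    return out

def deseasonalize(series, period=12):
    """Remove seasonal effects by subtracting each calendar month's mean."""
    month_means = [mean(series[m::period]) for m in range(period)]
    return [v - month_means[i % period] for i, v in enumerate(series)]

def largest_mean_shift(series, min_seg=6):
    """Return (index, shift) of the split with the largest before/after mean difference.

    A crude stand-in for a structural break test, NOT the BFAST procedure.
    """
    best_i, best_shift = None, 0.0
    for i in range(min_seg, len(series) - min_seg + 1):
        shift = mean(series[i:]) - mean(series[:i])
        if abs(shift) > abs(best_shift):
            best_i, best_shift = i, shift
    return best_i, best_shift

# Synthetic 48-month NDVI series with a seasonal cycle and a drop at month 30.
raw = [0.6 + 0.2 * math.sin(2 * math.pi * (t % 12) / 12) + (-0.4 if t >= 30 else 0.0)
       for t in range(48)]
monthly = fill_gaps(raw)
break_index, shift = largest_mean_shift(deseasonalize(monthly))
```

In this synthetic example, the month index of the break and the sign of the shift stand in for the timing and direction that were used to date impoundment creation and removal; the magnitude would be read from `shift`.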
To identify flaring events, we used detections recorded at night by the Visible
Infrared Imaging Radiometer Suite on the Suomi NPP satellite operated by the National
Oceanic and Atmospheric Administration (NOAA). We identified detections in
Pennsylvania with a temperature >1773 K and <5273.15 K (excluding temperatures of 1810 K, which NOAA used to identify detections where it is not possible to estimate the temperature) from September 9, 2012 to August 3, 2015. Because there were often
several detections close together, we grouped detections that were within 150 m on the
same day.
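The filtering and grouping rules above can be sketched as follows. The study does not state the grouping algorithm, so this sketch assumes single-linkage grouping (detections are chained together if each is within 150 m of another in the group) and great-circle distances; the tuple layout `(date, lat, lon, temp_k)` and the function names are hypothetical.

```python
import math
from collections import defaultdict


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    radius = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def keep_detection(temp_k):
    """Temperature screen from the text: >1773 K, <5273.15 K, and not the 1810 K flag value."""
    return 1773 < temp_k < 5273.15 and temp_k != 1810


def group_detections(detections, max_dist_m=150.0):
    """Group same-day detections within max_dist_m of one another (single linkage, assumed)."""
    by_day = defaultdict(list)
    for det in detections:
        if keep_detection(det[3]):
            by_day[det[0]].append(det)

    groups = []
    for day in sorted(by_day):
        dets = by_day[day]
        parent = list(range(len(dets)))

        def find(i):
            # Union-find root lookup with path halving.
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        for i in range(len(dets)):
            for j in range(i + 1, len(dets)):
                d = haversine_m(dets[i][1], dets[i][2], dets[j][1], dets[j][2])
                if d <= max_dist_m:
                    parent[find(i)] = find(j)

        clusters = defaultdict(list)
        for i, det in enumerate(dets):
            clusters[find(i)].append(det)
        groups.extend(clusters.values())
    return groups


# Illustrative (made-up) detections: (date, latitude, longitude, temperature in K)
example = [
    ("2013-05-01", 41.0000, -77.0, 1900.0),
    ("2013-05-01", 41.0009, -77.0, 2000.0),  # about 100 m from the first detection
    ("2013-05-01", 41.1000, -77.0, 1850.0),  # about 11 km away
    ("2013-05-01", 41.0000, -77.0, 1810.0),  # flag value, excluded
    ("2013-05-02", 41.0000, -77.0, 1900.0),  # same site, different day
]
flare_groups = group_detections(example)
```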
5.3.2 Incorporate impoundments and compressor engines into exposure assessment
We used principal components analysis (PCA) to assess the relationship between metrics created for four phases of well development (pad preparation, drilling,
stimulation, and production), compressor engines, and impoundments. We created a regular grid (5 by 5 km) across 38 counties in central and northeastern Pennsylvania
(Figure 5.3.2, in green) (number of grid points = 2,627). On January 1 and July 1 for 2005-
2013, we assigned inverse distance-squared (IDS) metrics to each grid point for four phases of well development (pad preparation, drilling, stimulation, and production, as used in our prior health studies22, 25, 26), impoundments, and compressor engines using
Equation 5.3.2.
Equation 5.3.2. Inverse distance squared (IDS) metric.
$$\text{IDS metric for point } j = \sum_{i=1}^{m} \frac{s_i}{d_{ij}^{2}}$$
For each IDS metric, $m$ was either the number of wells in the given phase, started
compressor engines, or installed impoundments; and $d_{ij}^{2}$ was the squared distance
(meters) between well, compressor engine, or impoundment $i$ and point $j$. For the four
phases of well development, $s_i$ was 1 for the pad preparation and drilling phases, total
well depth (meters) of well $i$ for the stimulation phase, and daily natural gas production
volume (m³) of well $i$ for the production phase. For compressor engines, $s_i$ was the compressor engine horsepower. Engines contributed to the metric from their start date to
their removal date. For impoundments, $s_i$ was the area (m²) of the impoundment, which
contributed to the metric from their installation to their removal date. For years with aerial
imagery (2005, 2008, 2010, and 2013), we assigned six metrics (impoundments,
compressor engines, and four phases of well development). For the remaining dates, we
assigned all metrics except the one for impoundments. In some cases there
were no wells in a given phase on a date, so that metric was not included in the PCA for
that date. We did not incorporate flaring events into this analysis because we did not
have information on flaring events before 2013 and only four locations had flaring events
identified in 2013.
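As an illustration of how an IDS metric of the kind described for Equation 5.3.2 could be computed at a single grid point, the sketch below sums a scale term divided by squared distance over sources active on the assignment date. It assumes projected coordinates in meters and a dictionary layout (`x`, `y`, `s`, `start`, `end`) that is hypothetical; it also assumes no source sits exactly on the grid point, which would produce a division by zero.

```python
from datetime import date


def is_active(source, on_date):
    """A source contributes from its start date to its removal date (None = not removed)."""
    start, end = source["start"], source["end"]
    return start <= on_date and (end is None or on_date <= end)


def ids_metric(point_xy, sources, on_date):
    """Sum of s_i / d_ij^2 over sources active on on_date; coordinates in meters."""
    px, py = point_xy
    total = 0.0
    for src in sources:
        if not is_active(src, on_date):
            continue
        d_squared = (src["x"] - px) ** 2 + (src["y"] - py) ** 2
        total += src["s"] / d_squared
    return total


# Illustrative (made-up) sources around a grid point at the origin.
example_sources = [
    {"x": 1000.0, "y": 0.0, "s": 1.0, "start": date(2008, 1, 1), "end": None},
    {"x": 0.0, "y": 2000.0, "s": 4.0, "start": date(2009, 1, 1), "end": date(2011, 1, 1)},
    {"x": 500.0, "y": 0.0, "s": 1.0, "start": date(2012, 1, 1), "end": None},
]
value_july_2010 = ids_metric((0.0, 0.0), example_sources, date(2010, 7, 1))
```

Here the first two sources each contribute 1e-6 on July 1, 2010 (1/1000² and 4/2000²), which shows how a larger scale term can offset a greater distance.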
On each date evaluated, we truncated the UNGD metrics at their 98th percentile,
log- and z-transformed the truncated values to normalize distributions and put the
metrics on the same scale, and conducted a PCA using the Pearson correlation matrix in
Stata. We compared loadings and scree plots across the evaluated dates. We also
compared the first component from the PCA to a summed z-score of all UNGD metrics
available on that date.
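The preprocessing and PCA steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the Stata procedure used in the study: the 98th percentile uses a nearest-rank definition, the log transform assumes strictly positive values (the handling of zeros is not described here), z-scores use the sample standard deviation, and the first component is obtained by power iteration on the Pearson correlation matrix. All function names are hypothetical.

```python
import math
import statistics


def truncate_at_percentile(values, pct=98):
    """Cap values at the pct-th percentile (nearest-rank definition, an assumption)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    cap = ordered[rank - 1]
    return [min(v, cap) for v in values]


def log_z(values):
    """Natural-log transform, then z-score with the sample standard deviation."""
    logged = [math.log(v) for v in values]  # assumes strictly positive values
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)
    return [(v - mean) / sd for v in logged]


def correlation_matrix(z_columns):
    """Pearson correlations of columns that are already z-scored."""
    n = len(z_columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in z_columns]
            for ci in z_columns]


def first_component(corr, iterations=500):
    """Leading eigenvector (loadings) and share of variance, by power iteration."""
    p = len(corr)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(vec[i] * corr[i][j] * vec[j] for i in range(p) for j in range(p))
    return vec, eigenvalue / p  # the trace of a correlation matrix equals p


def pca_first_component(metric_columns):
    """Return loadings, variance share, component scores, and the summed z-score."""
    z_cols = [log_z(truncate_at_percentile(col)) for col in metric_columns]
    loadings, share = first_component(correlation_matrix(z_cols))
    rows = list(zip(*z_cols))
    scores = [sum(l * z for l, z in zip(loadings, row)) for row in rows]
    summed_z = [sum(row) for row in rows]
    return loadings, share, scores, summed_z
```

When the loadings are approximately equal, the component score is approximately a constant multiple of the summed z-score, which is the comparison made in the last step above.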
Figure 5.3.2. Location of UNG-related impoundments, compressor engines, and UNG wells. Impoundments included those identified in 2005, 2008, 2010, and 2013 (n = 1,218); compressor engines included those started by 2013 (n = 861), and wells included those drilled by 2015 (n = 9,669). Counties in green are those that were included in the fishnet grid. Ozone monitors (n = 55) are those that were active in 2012. Abbreviations: UNGD, unconventional natural gas development; UNG, unconventional natural gas
5.3.3 Comparison of GIS-based metrics and their associations with mild asthma exacerbations
Studies of UNGD and health have used different approaches to GIS-based
UNGD metrics. Here, we compared how different UNGD metrics categorized patients
and evaluated the sensitivity of associations of UNGD and a health outcome to different
metrics. In our prior study, we evaluated the associations of four phases of UNG well
development with mild, moderate, and severe asthma exacerbations, among 35,508
primary care patients with asthma of the Geisinger Clinic in Pennsylvania from 2005-
12.25 We identified case encounters (mild, moderate, and severe asthma exacerbations: asthma oral corticosteroid [OCS] medication orders, asthma emergency department visits, and asthma hospitalizations, respectively) and control encounters (patient contact dates with the health system) from the Geisinger electronic health record. To compare how different metrics categorized patients on UNGD, here, we assigned three different
UNGD metrics (described below) to the case and control encounters identified in our previous study (n = 69,548) and we compared how each metric ranked case and control
encounter dates using Spearman correlations for continuous metrics and tables for
categorical metrics. We then evaluated associations of each of these metrics with mild
asthma exacerbations (the largest outcome, with 39,442 case and control dates) using
our previously reported adjusted multilevel model.25
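For readers unfamiliar with the rank-based comparison, a Spearman correlation can be computed as the Pearson correlation of average ranks. The sketch below is a generic illustration with made-up values; it is not the software used in this study, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Two hypothetical continuous metrics assigned to the same four encounter dates.
rho = spearman([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 25.0])
```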
We evaluated three different approaches to UNGD metrics: 1) categorical
distance to the nearest drilled well (DNDW), 2) inverse distance metric based on the
drilling phase (IDD), and 3) IDS metric incorporating four phases of well development
and compressor engines (IDS4PC). The DNDW approach, used by Rabinowitz,28 was based on distance from a patient’s home to the closest drilled well of any age, and categorized into less than 1 km, 1-2km, and greater than 2km.28 The IDD metric, used by
McKenzie and Stacy,23, 24 was assigned using Equation 5.3.3.
Equation 5.3.3. Inverse distance metric based on the drilling phase (IDD).
$$\text{IDD metric for patient } j = \sum_{i=1}^{n} \frac{1}{d_{ij}}$$
In Equation 5.3.3, $n$ was the number of drilled wells within 10 miles of a patient’s
home and $d_{ij}$ was the distance between the patient’s home and a well. We tertiled the
IDD metric using case and control encounters with at least one well within 10 miles, and
created a reference group of case and control encounters with no wells within 10 miles.24
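The DNDW and IDD assignments described above can be sketched as follows. Several details are assumptions made for illustration: distances are taken in meters (the distance unit used for the IDD metric is not stated here), a distance of exactly 1 km or 2 km falls in the 1-2 km category, and tertiles are assigned by rank with ties broken by sort order. The function names are hypothetical.

```python
TEN_MILES_M = 16093.44  # 10 miles in meters


def dndw_category(distances_m):
    """Categorize an encounter by distance to the nearest drilled well."""
    nearest = min(distances_m)
    if nearest < 1000:
        return "<1 km"
    if nearest <= 2000:
        return "1-2 km"
    return ">2 km"


def idd_metric(distances_m, max_dist_m=TEN_MILES_M):
    """Sum of 1/d over drilled wells within 10 miles; None if no wells are in range."""
    near = [d for d in distances_m if d <= max_dist_m]
    if not near:
        return None
    return sum(1.0 / d for d in near)


def assign_tertiles(values):
    """0 = reference group (no wells in range); 1-3 = tertile among the remaining values."""
    present = sorted((v, i) for i, v in enumerate(values) if v is not None)
    categories = [0] * len(values)
    for rank, (_, idx) in enumerate(present):
        categories[idx] = rank * 3 // len(present) + 1
    return categories


# One hypothetical encounter with wells at 1 km, 2 km, and 20 km from the home.
example_idd = idd_metric([1000.0, 2000.0, 20000.0])   # the 20 km well is out of range
example_dndw = dndw_category([1000.0, 2000.0, 20000.0])
```

The example shows why the two metrics can disagree: the DNDW category depends only on the single nearest well, whereas the IDD metric also reflects the second well at 2 km.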
The IDS4PC metric included four phases of well development and UNG-related
compressor engines. As described above (Equation 5.3.2), we assigned each
encounter date a value for four phases of well development and compressor stations,
created z-scores for each of the five values, summed the z-scores, and quartiled the
sum using all patient events (exacerbations or control dates). The results of the PCA
(Section 5.4) informed the creation of the IDS4PC metric. We did not include
impoundments because data were not available for all years.
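The construction of the IDS4PC categories can be sketched as a sum of z-scores followed by rank-based quartiles. This is a minimal illustration: it assumes the sample standard deviation for the z-scores and rank-based quartile cut points with ties broken by sort order, neither of which is specified above, and the function names are hypothetical.

```python
import statistics


def z_scores(values):
    """Standardize a list of values using the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def ids4pc_quartiles(metric_columns):
    """Sum z-scores across metrics for each encounter, then assign quartiles 1-4 by rank."""
    z_cols = [z_scores(col) for col in metric_columns]
    summed = [sum(vals) for vals in zip(*z_cols)]
    order = sorted(range(len(summed)), key=lambda i: summed[i])
    quartile = [0] * len(summed)
    for rank, idx in enumerate(order):
        quartile[idx] = rank * 4 // len(summed) + 1
    return summed, quartile


# Eight hypothetical encounters with five metrics each (four well phases
# plus compressor engines); the values are made up for illustration.
base = [5.0, 1.0, 8.0, 3.0, 2.0, 7.0, 4.0, 6.0]
columns = [[k * v for v in base] for k in range(1, 6)]
summed_z, quartiles = ids4pc_quartiles(columns)
```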
To evaluate sensitivity of associations of different approaches to UNGD metric
creation with a health outcome, we re-ran the model for mild asthma exacerbations from
our prior study25 with the DNDW, IDD, and IDS4PC metrics. We then compared the odds ratios from each of these models. The study was approved by the Geisinger Institutional
Review Board (with an IRB authorization agreement with Johns Hopkins Bloomberg
School of Public Health).
5.4 Results
5.4.1 UNGD-related compressor engines, impoundments, and flaring events in Pennsylvania
We identified 1,218 impoundments and 457 compressor stations in Pennsylvania
(Figures 5.3.2 and 5.4.1). The median area (m²) of impoundments in 2005, 2008, 2010, and 2013 was 344.0, 558.8, 1990.2, and 6209.7, respectively. The average estimated duration of an impoundment from installation to removal was 1.9 years. At the 457
compressor stations, we identified 1,419 compressor engines, though only 861 engines at 361 stations had start letters stating they were operational.
Between September 2012 and August 2015, we identified 1,174 flaring observations on 380 days. After grouping flares within 150m, we identified flares at 216 locations (Figure 5.3.2). At 114 locations (53%), the flaring event was observed on one
day, and at the remaining 102 sites, there was a median of 115 days from the first to last
flaring event.
Figure 5.4.1. Total number of drilled unconventional natural gas wells and operating unconventional natural gas related impoundments and compressor engines in Pennsylvania by year.
5.4.2 PCA applied to wells, compressor stations, and impoundments
In each PCA, the first component explained between 58 and 94% (median 79%) of the total variation (Table 5.4.2.1). For 15 of the 18 dates, only the first component had an eigenvalue above one. The first components’ loadings were consistently made up of an approximately equal mix of the UNGD metrics. The first component was highly correlated with a summed z-score of the metrics on each date (Spearman correlations >
0.99). In contrast, the second component, which explained between 4 and 29% of the variation, did not have consistent loadings, although the compressor engine metric tended to have the largest loading (Table 5.4.2.2).
Table 5.4.2.1. Results of PCA with Percentage of Variation Explained by Component 1 and Component 1 Loadings

          Proportion of Component 1 loadings                                                                Correlation of
          variance      Compressor    Well metrics                                            Impoundment   component 1
Date      explained by  engine        Pad           Drilling      Stimulation   Production    metric        with z score
          component 1   metric
1/1/2005  0.77          0.50          a             a             a             0.62          0.59          0.99
7/1/2005  0.76          0.47          0.46          0.33          a             0.47          0.49          0.99
1/1/2006  0.91          0.56          0.59          a             a             0.58          b             0.99
7/1/2006  0.94          0.42          0.46          0.46          0.46          0.44          b             0.99
1/1/2007  0.85          0.46          0.37          0.45          0.47          0.47          b             0.99
7/1/2007  0.72          0.50          0.21          0.50          0.45          0.51          b             0.99
1/1/2008  0.72          0.46          0.36          0.43          0.43          0.46          0.30          0.99
7/1/2008  0.58          0.46          0.35          0.34          0.43          0.48          0.36          0.99
1/1/2009  0.58          0.47          0.53          0.41          0.47          0.32          b             0.99
7/1/2009  0.69          0.24          0.50          0.50          0.48          0.46          b             0.99
1/1/2010  0.67          0.33          0.36          0.45          0.39          0.45          0.45          0.99
7/1/2010  0.81          0.34          0.43          0.43          0.42          0.40          0.42          0.99
1/1/2011  0.80          0.38          0.47          0.46          0.46          0.46          b             0.99
7/1/2011  0.83          0.41          0.46          0.46          0.45          0.46          b             0.99
1/1/2012  0.84          0.41          0.44          0.46          0.46          0.46          b             0.99
7/1/2012  0.79          0.40          0.43          0.47          0.45          0.48          b             0.99
1/1/2013  0.83          0.38          0.41          0.42          0.42          0.43          0.39          0.99
7/1/2013  0.78          0.41          0.37          0.41          0.41          0.44          0.40          0.99

a All grid points had a value of zero for this variable on this date, and variables with zero variance were dropped from PCA.
b Impoundment data were only available in 2005, 2008, 2010, and 2013.
Table 5.4.2.2. Results of PCA with Percentage of Variation Explained by Component 2 and Component 2 Loadings

          Proportion of Component 2 loadings
          variance      Compressor    Well metrics                                            Impoundment
Date      explained by  engine        Pad           Drilling      Stimulation   Production    metric
          component 2   metric
1/1/2005  0.20          0.83          a             a             a             -0.19         0.59
7/1/2005  0.15          -0.15         -0.46         0.33          a             -0.08         0.07
1/1/2006  0.07          0.81          -0.23         a             a             0.58          b
7/1/2006  0.04          0.86          -0.24         0.06          0.46          0.44          b
1/1/2007  0.11          0.08          0.83          -0.45         -0.32         0.01          b
7/1/2007  0.18          -0.13         0.97          -0.18         0.05          -0.13         b
1/1/2008  0.12          -0.09         0.21          -0.11         -0.34         -0.23         0.88
7/1/2008  0.27          -0.31         0.49          0.53          -0.43         -0.30         0.33
1/1/2009  0.29          -0.35         -0.12         0.54          -0.41         0.63          b
7/1/2009  0.19          0.90          0.02          -0.14         0.05          -0.40         b
1/1/2010  0.22          0.56          0.49          -0.32         -0.46         -0.29         0.20
7/1/2010  0.11          0.78          -0.17         -0.05         -0.24         -0.46         0.29
1/1/2011  0.10          0.87          -0.05         -0.33         0.01          -0.36         b
7/1/2011  0.08          0.84          -0.36         -0.23         0.13          -0.29         b
1/1/2012  0.07          0.86          -0.33         -0.24         0.08          -0.29         b
7/1/2012  0.11          0.76          -0.63         0.05          0.02          -0.15         b
1/1/2013  0.10          0.58          -0.39         -0.34         -0.24         -0.10         0.57
7/1/2013  0.09          -0.48         0.37          0.08          0.52          0.12          -0.59

a All grid points had a value of zero for this variable on this date, and variables with zero variance were dropped from PCA.
b Impoundment data were only available in 2005, 2008, 2010, and 2013.
5.4.3 Comparison of GIS-based UNGD metrics
We sought to compare how the DNDW, IDD, and IDS4PC metrics categorized the index dates. Comparing the DNDW and IDS4PC metrics (Table 5.4.3.1), 96.4% of the index dates in the IDS4PC metric’s highest quartile were also in the highest category of the DNDW metric (greater than 2 km from the closest well), but 98.6% of index dates in the IDS4PC metric’s highest category were greater than 2 km from the closest well. For the IDD and IDS4PC metrics, we compared both the continuous and categorical metrics. The Spearman correlation for continuous IDD and IDS4PC metrics was 0.36.
While 80.3% of assignments for the IDD metric’s highest tertile were also in the highest quartile of IDS4PC, 18.5% of assignments for IDD’s lowest category (no wells within 10 miles) were in IDS4PC’s highest quartile (Table 5.4.3.2).
Table 5.4.3.1. Categorization of case and control encounter dates (counts) by distance to nearest drilled well (DNDW) and by an inverse distance squared metric incorporating four phases of well development and compressor engines (IDS4PC)

                      DNDW categoriesᵃ
IDS4PC categoriesᵇ    <1 km     1-2 km    >2 km     Total
Qᶜ1                   2         4         17,381    17,387
Q2                    4         30        17,353    17,387
Q3                    4         46        17,337    17,387
Q4                    238       385       16,764    17,387
Total                 248       465       68,835    69,548

ᵃ Distance to the nearest drilled well, based on Rabinowitz
ᵇ An inverse distance metric incorporating four phases of well development (pad preparation, drilling, stimulation, and production) and UNG-related compressor stations, based on Casey, Tustin, and Rasmussen.
ᶜ Quartile
Table 5.4.3.2. Categorization of case and control encounter dates (counts) by an inverse distance metric that was based only on the drilling phase (IDD) and an inverse distance squared metric incorporating four phases of well development and compressor engines (IDS4PC)

                      IDD tertilesᵃ
IDS4PC quartilesᶜ     0 wells in 10 miles   Tᵇ1       T2        T3        Total
Qᵈ1                   16,999                159       146       83        17,387
Q2                    15,158                954       965       310       17,387
Q3                    14,866                1,050     1,086     385       17,387
Q4                    10,649                1,796     1,762     3,180     17,387
Total                 57,672                3,959     3,959     3,959     69,548

ᵃ An inverse distance metric incorporating drilled wells, based on McKenzie and Stacy
ᵇ Tertile
ᶜ An inverse distance metric incorporating four phases of well development (pad preparation, drilling, stimulation, and production) and UNG-related compressor stations, based on Casey, Tustin, and Rasmussen.
ᵈ Quartile
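The percentages quoted for Table 5.4.3.2 follow directly from the cell counts and the column totals as printed in the table. A short arithmetic check, with a hypothetical helper name:

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts taken from Table 5.4.3.2 (IDS4PC quartile 4 row and column totals).
idd_t3_in_ids4pc_q4 = percent(3180, 3959)       # IDD tertile 3 dates in IDS4PC quartile 4
no_wells_in_ids4pc_q4 = percent(10649, 57672)   # "0 wells in 10 miles" dates in quartile 4
```

These evaluate to 80.3 and 18.5, matching the percentages reported in the text.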
We compared associations of the DNDW, IDD, and IDS4PC metrics with a health outcome, mild asthma exacerbations. In the models that evaluated associations of the different metrics with mild asthma exacerbations, the highest group of each metric (vs. the lowest) was associated with increased odds of mild exacerbation, though the magnitudes of association differed (IDD < DNDW < IDS4PC, Table 5.4.3.3). The DNDW and IDS4PC metrics had increasing odds ratios across UNGD categories, whereas the second tertile for the IDD metric had a slightly stronger association with the outcome than that for the third tertile. Associations of the IDS4PC metric with mild asthma exacerbations were intermediate between those from the four regressions of each phase of well development separately in our prior study.25
Table 5.4.3.3. Associations of unconventional natural gas development (UNGD) metrics with mild asthma exacerbationsᵃ.

UNGD metric included in modelᵇ   Category                         Odds Ratio (95% CIᶜ)
DNDWᵈ                            > 2 km (REF)                     1.0
                                 1 - 2 km                         1.13 (0.76 - 1.69)
                                 < 1 km                           1.83 (1.03 - 3.25)
IDDᵉ                             No wells within 10 miles (REF)   1.0
                                 Tertile 1                        0.96 (0.83 - 1.13)
                                 Tertile 2                        1.21 (1.03 - 1.42)
                                 Tertile 3                        1.19 (1.01 - 1.41)
IDS4PCᶠ                          Quartile 1 (REF)                 1.0
                                 Quartile 2                       1.31 (1.16 - 1.48)
                                 Quartile 3                       2.20 (1.93 - 2.52)
                                 Quartile 4                       3.69 (3.16 - 4.30)

ᵃ New oral corticosteroid medication orders.
ᵇ Multilevel models with a random intercept for patient and community, adjusted for age category (5-12, 13-18, 19-44, 45-61, 62-74, 75+ years), sex (male, female), race/ethnicity (white, black, Hispanic, other), family history of asthma (yes vs. no), smoking status (never, former, current, missing), season (spring, March 22-June 21; summer, June 22-September 21; fall, September 22-December 21; winter, December 22-March 21), Medical Assistance (yes vs. no), overweight/obesity (normal, body mass index [BMI] < 85th percentile or BMI < 25 kg/m²; overweight, BMI = 85th-<95th percentile or BMI = 25-<30 kg/m²; obese, BMI ≥ 95th percentile or BMI ≥ 30 kg/m², for children and adults, respectively; BMI missing), type 2 diabetes (yes vs. no), community socioeconomic deprivation (quartiles), distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), squared distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), maximum temperature on the day prior to event (degrees Celsius), and squared maximum temperature on the day prior to event (degrees Celsius)
ᶜ Confidence interval
ᵈ Distance to the nearest drilled well, based on Rabinowitz
ᵉ An inverse distance metric that was based only on the drilling phase, based on McKenzie and Stacy
ᶠ An inverse distance metric incorporating four phases of well development (pad preparation, drilling, stimulation, and production) and UNG-related compressor engines, based on Casey, Tustin, and Rasmussen.
5.5 Discussion
Compressor engines, impoundments, and flaring events, which are potential sources of emissions, have not previously been described or incorporated in epidemiology studies, in part because data are not readily available. Additionally, approaches to incorporating wells have differed across epidemiology studies. In this study, we described UNGD-related compressor engines, impoundments and flaring events in Pennsylvania, evaluated the impact of including compressor engines and
impoundments in a UNGD metric, and compared associations of different metrics with a health outcome.
We identified 361 compressor stations with engines that had start letters, 1,218 impoundments, and 216 sites with flares. The dates of development for compressor engines and impoundments were similar to those for wells. Although the number of impoundments decreased from 2010-
2013, the total area of impoundments increased from 1.96 to 3.96 km².
The PCAs suggested that on a majority of days evaluated, a single component captured most, but not all, of the variation of the compressor engine, impoundment, and well IDS metrics. That single component ranked points similarly to a z-score of the
metrics. It was not unexpected that the PCA loaded on a single component since the
wells, impoundments, and compressor engines have similar spatial and temporal
distributions. Based on these results, we incorporated compressor engines and the four
phases of well development into a UNGD metric by summing the z-score of each
component (the IDS4PC metric). We then compared how that metric classified case and
control dates to classifications by two other metrics (DNDW and IDD). There were
substantial differences in how the DNDW, IDD, and IDS4PC metrics ranked dates.
Differences between the three metrics were expected because the metrics were
designed for studies conducted in different regions and time periods. The DNDW metric
was designed for a study in southwestern Pennsylvania in 201228 and the IDD metric for
a study in Colorado from 1996 to 2009.24 The participants included in those studies
lived, on average, closer to wells than participants in the studies that used the IDS metric, which
were conducted from 2005-2015 in northeastern Pennsylvania,22, 25, 26 because the
study in southwestern Pennsylvania did not include the earlier years of UNGD, when
wells were less dense, and the Colorado study included both conventional and UNG
wells.
Additionally, each of the three metrics incorporated different information.
Because the DNDW metric categorized based on distance to the single closest drilled well, it did not take into account the density of wells. The DNDW and IDD metrics only incorporated the drilling phase of development, whereas the IDS4PC distinguished between four phases of development. Both the DNDW and IDD metrics assumed that all exposures from wells were continuous after a well was drilled and that exposures were equal from all drilled wells, regardless of phase of development, depth of the well, or volume of natural gas produced at the well. However, phase of development is important to incorporate into metric formulation because exposures such as air emissions differ by phase of development, and because not all drilled wells are later stimulated or produce natural gas. Of the 9,669 unconventional natural gas wells drilled in Pennsylvania by
2015, 1,992 did not have stimulation dates, and of wells with stimulation dates, 377 did not report production (although it is not possible to distinguish between missing and zero values in the data). The DNDW metric assumed that wells farther than 2 km did not contribute to exposure, and the IDD metric assumed that wells farther than 10 miles from a patient’s home did not contribute to exposure; these assumptions may not hold for exposures such as regional air pollutants (e.g., ozone and particulate matter).
We designed our IDS4PC metric to capture all potential exposure pathways associated with UNGD. The IDS4PC metric assumed that wells only contributed to exposure during the four phases of development (pad preparation, drilling, stimulation, or production), and in between these phases wells did not contribute to the metric. It also assumed that wells contributed differently to exposure during the stimulation phase
(proportional to total depth) and production phase (proportional to gas quantity produced), and that compressor engines contributed differently (proportional to total horsepower). We hypothesize that these are reasonable assumptions because well depth is correlated with the amount of fluid used in stimulation,37, 38 and thus also likely
correlated with truck trips needed to bring fluids to the site, and we hypothesize that fugitive
emissions are correlated with the volume of natural gas produced. However, we
acknowledge that without environmental measurements, we cannot definitively say how
well our metric captures each potential exposure pathway.
We compared the associations of the different UNGD metrics with mild asthma
exacerbations. Although inference was the same across the three metrics (the highest
group of each was associated with mild asthma exacerbations), the magnitude of the
odds ratios differed. The IDS4PC metric was most strongly associated with OCS orders,
and the IDD metric was the least strongly associated. Had we used the IDD or DNDW
metric in our original study, we would have come to different conclusions on the strength
of the association of UNGD and asthma exacerbations. Because the associations of the
IDS4PC metric with mild asthma exacerbations were between those from each
phase of well development separately (as in our prior study),25 we concluded that
incorporating information on compressor engines, despite the time, effort, and expense required to capture it, did not
substantively change interpretation of the association between the inverse distance
squared UNGD metric and mild asthma exacerbations. It is possible that the DNDW and
IDD metrics had more misclassification than the IDS4PC metric did, but without
environmental measurements, we cannot quantify how well each metric captures
potential exposures from UNGD, so we cannot definitively interpret the decrease in
magnitude of the association with the DNDW and IDD metrics compared to the IDS4PC
metric. We also acknowledge that we cannot rule out the potential for unmeasured
confounding in each model, just as we could not do so in our original study.25
One important pathway through which UNGD could influence health is air quality
impacts. We wanted to evaluate the adequacy of a GIS-based metric for air quality
impacts by comparing it to air quality estimates. To do this, we needed air pollution
measurements that were on a fine spatial resolution with a daily time step that included
emissions from UNGD and covered the years of UNGD (2005-2015) in Pennsylvania.
Because EPA monitors are too sparse in counties with UNGD (Figure 5.3.2), we
considered using the Community Multi-scale Air Quality (CMAQ) model output on a
12km grid for PM2.5 and ozone in 2007 and 2011. However, the National Emissions
Inventory (NEI), which CMAQ uses, “likely underestimates oil and gas emissions.”39 It
included only 2,675 unconventional natural gas wells in Pennsylvania in 2011, whereas
our analysis identified 4,951 spudded wells by the end of 2011. The Environmental
Protection Agency is working to improve UNGD emissions for future versions of the NEI,
so it may be possible to compare UNGD metrics to CMAQ in the future.
This study had several strengths. No prior study has described the size and
temporal and spatial distribution of UNGD-related compressor stations, impoundments,
and flaring in Pennsylvania, and evaluated what information they added to GIS-based
metrics, or compared associations of different UNGD metrics with a health outcome.
This study also had several limitations. No UNGD metric took into consideration the full
variability of potential exposures: for example, safety practices (and potential accidental
exposures) differ between well operators,40 and impacts also vary over time, as
regulations (such as Act 13 in 2012) are enacted and industry practices change. We also
recognize that we likely captured different potential exposures with differing amounts of
measurement error. There are rigorous approaches to characterize each of the potential
chemical, physical, and social exposures from UNGD, but such approaches may not
work well retrospectively and are much more time-consuming and costly than GIS-based approaches. The IDS approach should capture exposures that are consistent during a given phase of well development, are absent between well phases of development, are the same across wells of the same depth, and decay based on one over distance squared, but we did not take environmental measurements to identify which exposures these assumptions hold for.
We likely underestimated counts of impoundments because we only had aerial imagery for four years between 2005 and 2013. Because the average estimated duration of an impoundment from installation to removal was 1.9 years, there were likely
impoundments that were installed and removed in between the years with images, and
thus would not have been captured in our dataset. Additionally, we did not
look for impoundments that were more than 1 km from a well. We likely underestimated
the number of compressor engines because we could not distinguish between
compressor engines missing a start letter and those never started. Additionally, we are
not able to evaluate if the original list of UNGD-related compressor stations from the
DEP was missing any stations. We also likely underestimated the number of flaring
events because we could not identify flaring events on cloudy nights. We did not have
information on whether compressor engines were diesel or natural gas powered. For the
PCA, we assigned metrics to points in a regular grid, instead of to the residential
locations of Geisinger patients, so that the locations of the points included in the PCA
would not be affected by residential patterns or population density. Although there is still
a spatial correlation structure between the grid points included in the PCA, we aimed
primarily to build an index rather than to study correlation structure, and thus we do not
consider this a major limitation. Finally, the PCA was restricted to the Geisinger region
and therefore may not be generalizable to areas where wells, compressors, and
impoundments are not co-located.
GIS proxies for UNGD were defensible metrics to capture multiple pathways
retrospectively for low cost in the initial studies of UNGD and health. However, without
environmental measurements, it is not possible to determine what pathways are
captured by the GIS proxies. This study highlights the need for future UNGD and health
studies to improve exposure assessment by collecting environmental measurements or
biomarkers. Only when we understand how UNGD is affecting health can we effectively
design interventions to reduce exposure.
5.6 References
1. Shale in the United States; https://www.eia.gov/energy_in_brief/article/shale_in_the_united_states.cfm#shaledata.
2. Adgate, J.L.; Goldstein, B.D.; McKenzie, L.M. Potential Public Health Hazards,
Exposures and Health Effects from Unconventional Natural Gas Development. Environ.
Sci. Technol. 2014, 48 (15), 8307-8320; DOI: 10.1021/es404621d.
3. Sangaramoorthy, T.; Jamison, A.M.; Boyle, M.D.; Payne-Sturges, D.; Sapkota, A.;
Milton, D.K.; Wilson, S.M. Place-Based Perceptions of the Impacts of Fracking along the
Marcellus Shale. Soc. Sci. Med. 2016, 151; DOI: 10.1016/j.socscimed.2016.01.002.
4. Powers, M.; Saberi, P.; Pepino, R.; Strupp, E.; Bugos, E.; Cannuscio, C.C. Popular
epidemiology and “fracking”: citizens’ concerns regarding the economic, environmental,
health and social impacts of unconventional natural gas drilling operations. J.
Community Health 2015, 40 (3), 534-541; DOI: 10.1007/s10900-014-9968-x.
5. Graham, J.; Irving, J.; Tang, X.; Sellers, S.; Crisp, J.; Horwitz, D.; Muehlenbachs, L.;
Krupnick, A.; Carey, D. Increased traffic accident rates associated with shale gas drilling in Pennsylvania. Accident Analysis & Prevention 2015, 74, 203-209; DOI:
10.1016/j.aap.2014.11.003.
6. Gopalakrishnan, S.; Klaiber, H.A. Is the shale energy boom a bust for nearby residents? Evidence from housing values in Pennsylvania. Am. J. Agric. Econ. 2014, 96
(1), 43-66; DOI: 10.1093/ajae/aat065.
7. Muehlenbachs, L.; Spiller, E.; Timmins, C. The Housing Market Impacts of Shale Gas
Development. Am. Econ. Rev. 2015, 105 (12), 3633-59; DOI: 10.1257/aer.20140079.
8. The Shale Tipping Point: The Relationship of Drilling to Crime, Traffic Fatalities,
STDs, and Rents in Pennsylvania, West Virginia, and Ohio; http://www.multistateshale.org/shale-tipping-point.
9. Brasier, K.J.; Rhubart, D. Effects of Marcellus shale development on the criminal justice system (The Marcellus impacts Project Report # 6); http://www.rural.palegislature.us/documents/reports/Marcellus-Report-6-Crime%20.pdf;
2014.
10. Pacsi, A.P.; Alhajeri, N.S.; Zavala-Araiza, D.; Webster, M.D.; Allen, D.T. Regional air quality impacts of increased natural gas production and use in Texas. Environ. Sci.
Technol. 2013, 47 (7), 3521-3527; DOI: 10.1021/es3044714.
11. Pacsi, A.P.; Kimura, Y.; McGaughey, G.; McDonald-Buller, E.; Allen, D.T. Regional
Ozone Impacts of Increased Natural Gas Use in the Texas Power Sector and
Development in the Eagle Ford Shale. Environ. Sci. Technol. 2015, 49 (6), 3966-3973;
DOI: 10.1021/es5055012.
12. Roy, A.A.; Adams, P.J.; Robinson, A.L. Air pollutant emissions from the development, production, and processing of Marcellus Shale natural gas. J. Air Waste
Manage. Assoc. 2013, 64 (1), 19-37; DOI: 10.1080/10962247.2013.826151.
13. Vinciguerra, T.; Yao, S.; Dadzie, J.; Chittams, A.; Deskins, T.; Ehrman, S.;
Dickerson, R.R. Regional air quality impacts of hydraulic fracturing and shale natural gas activity: Evidence from ambient VOC observations. Atmos. Environ. 2015, 110, 144-150;
DOI: 10.1016/j.atmosenv.2015.03.056.
14. McKenzie, L.M.; Witter, R.Z.; Newman, L.S.; Adgate, J.L. Human health risk assessment of air emissions from development of unconventional natural gas resources.
Sci. Total Environ. 2012, 424, 79-87; DOI: 10.1016/j.scitotenv.2012.02.018.
15. Osborn, S.G.; Vengosh, A.; Warner, N.R.; Jackson, R.B. Methane contamination of
drinking water accompanying gas-well drilling and hydraulic fracturing. Proc. Natl. Acad.
Sci. U. S. A. 2011, 108 (20), 8172-8176; DOI: 10.1073/pnas.1100682108.
16. Jackson, R.B.; Vengosh, A.; Darrah, T.H.; Warner, N.R.; Down, A.; Poreda, R.J.;
Osborn, S.G.; Zhao, K.; Karr, J.D. Increased stray gas abundance in a subset of drinking
water wells near Marcellus shale gas extraction. Proc. Natl. Acad. Sci. U. S. A. 2013, 110 (28), 11250-11255; DOI: 10.1073/pnas.1221635110.
17. Warner, N.R.; Jackson, R.B.; Darrah, T.H.; Osborn, S.G.; Down, A.; Zhao, K.; White,
A.; Vengosh, A. Geochemical evidence for possible natural migration of Marcellus
Formation brine to shallow aquifers in Pennsylvania. Proc. Natl. Acad. Sci. U. S. A.
2012, 109 (30), 11961-11966; DOI: 10.1073/pnas.1121181109.
18. Olmstead, S.M.; Muehlenbachs, L.A.; Shih, J.; Chu, Z.; Krupnick, A.J. Shale gas
development impacts on surface water quality in Pennsylvania. Proc. Natl. Acad.
Sci. U. S. A. 2013, 110 (13), 4962-4967; DOI: 10.1073/pnas.1213871110.
19. Maloney, K.O.; Yoxtheimer, D.A. Production and Disposal of Waste Materials from
Gas and Oil Extraction from the Marcellus Shale Play in Pennsylvania. Env Prac 2012,
14 (04), 278-287; DOI: 10.1017/s146604661200035x.
20. Vengosh, A.; Jackson, R.B.; Warner, N.; Darrah, T.H.; Kondash, A. A critical review
of the risks to water resources from unconventional shale gas development and
hydraulic fracturing in the United States. Environ. Sci. Technol. 2014, 48 (15), 8334-8348; DOI: 10.1021/es405118y.
21. Litovitz, A.; Curtright, A.; Abramzon, S.; Burger, N.; Samaras, C. Estimation of
regional air-quality damages from Marcellus Shale natural gas extraction in
Pennsylvania. Environmental Research Letters 2013, 8 (1), 014017; DOI: 10.1088/1748-9326/8/1/014017.
22. Casey, J.A.; Savitz, D.A.; Rasmussen, S.G.; Ogburn, E.L.; Pollak, J.; Mercer, D.G.;
Schwartz, B.S. Unconventional Natural Gas Development and Birth Outcomes in
Pennsylvania, USA. Epidemiology 2015; DOI: 10.1097/EDE.0000000000000387.
23. Stacy, S.L.; Brink, L.L.; Larkin, J.C.; Sadovsky, Y.; Goldstein, B.D.; Pitt, B.R.; Talbott,
E.O. Perinatal Outcomes and Unconventional Natural Gas Operations in Southwest
Pennsylvania. PLOS ONE 2015, 10 (6), e0126425; DOI: 10.1371/journal.pone.0126425.
24. McKenzie, L.M.; Guo, R.; Witter, R.Z.; Savitz, D.A.; Newman, L.S.; Adgate, J.L. Birth
Outcomes and Maternal Residential Proximity to Natural Gas Development in Rural
Colorado. Environ. Health Perspect. 2014; DOI: 10.1289/ehp.1306722.
25. Rasmussen, S.G.; Ogburn, E.L.; McCormack, M.; Casey, J.A.; Bandeen-Roche, K.;
Mercer, D.G.; Schwartz, B.S. Association Between Unconventional Natural Gas
Development in the Marcellus Shale and Asthma Exacerbations. JAMA Intern. Med.
2016, 176 (9), 1334-1343; DOI: 10.1001/jamainternmed.2016.2436.
26. Tustin, A.W.; Hirsch, A.G.; Rasmussen, S.G.; Casey, J.A.; Bandeen-Roche, K.;
Schwartz, B.S. Associations between Unconventional Natural Gas Development and
Nasal and Sinus, Migraine Headache, and Fatigue Symptoms in Pennsylvania. Environ.
Health Perspect. 2016; DOI: 10.1289/EHP281.
27. Saberi, P.; Propert, K.J.; Powers, M.; Emmett, E.; Green-McKenzie, J. Field survey of health perception and complaints of Pennsylvania residents in the Marcellus Shale region. Int. J. Environ. Res. Public. Health. 2014, 11 (6), 6517-6527; DOI:
10.3390/ijerph110606517.
28. Rabinowitz, P.M.; Slizovskiy, I.B.; Lamers, V.; Trufan, S.J.; Holford, T.R.; Dziura,
J.D.; Peduzzi, P.N.; Kane, M.J.; Reif, J.S.; Weiss, T.R.; Stowe, M.H. Proximity to Natural
Gas Wells and Reported Health Status: Results of a Household Survey in Washington
County, Pennsylvania. Environ. Health Perspect. 2014; DOI: 10.1289/ehp.1307732.
29. Finkel, M. Shale gas development and cancer incidence in southwest Pennsylvania.
Public Health 2016, 141, 198-206; DOI: 10.1016/j.puhe.2016.09.008.
30. Fryzek, J.; Pastula, S.; Jiang, X.; Garabrant, D.H. Childhood Cancer Incidence in
Pennsylvania Counties in Relation to Living in Counties With Hydraulic Fracturing Sites.
Journal of Occupational and Environmental Medicine 2013, 55 (7), 796-801; DOI:
10.1097/jom.0b013e318289ee02.
31. Jemielita, T.; Gerton, G.L.; Neidell, M.; Chillrud, S.; Yan, B.; Stute, M.; Howarth, M.;
Saberi, P.; Fausti, N.; Penning, T.M.; Roy, J.; Propert, K.J.; Panettieri, R.A., Jr.
Unconventional Gas and Oil Drilling Is Associated with Increased Hospital Utilization
Rates. PLoS ONE 2015, 10 (7), e0131093; DOI: 10.1371/journal.pone.0131093.
32. Olaguer, E.P. The potential near-source ozone impacts of upstream oil and gas
industry emissions. J. Air Waste Manag. Assoc. 2012, 62 (8), 966-977; DOI:
10.1080/10962247.2012.688923.
33. Pennsylvania Internet Record Imaging System Well Information System website,
http://www.dcnr.state.pa.us/topogeo/econresource/oilandgas/resrefs/wis_home/.
34. Oil & Gas Reporting Website,
https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Welcome/Welcome.aspx.
35. National Agriculture Imagery Program website, https://www.fsa.usda.gov/programs-and-services/aerial-photography/imagery-programs/naip-imagery/index.
36. Verbesselt, J.; Hyndman, R.; Newnham, G.; Culvenor, D. Detecting trend and
seasonal changes in satellite image time series. Remote Sens. Environ. 2010, 114 (1),
106-115; DOI: 10.1016/j.rse.2009.08.014.
37. Fracking Chemical Database website, http://frack.skytruth.org/fracking-chemical-database.
38. Schmid, K.W. The Marcellus Shale Gas Play. Pennsylvania Geology 2012,
http://www.dcnr.state.pa.us/cs/groups/public/documents/document/dcnr_20027757.pdf.
39. EPA Needs to Improve Air Emissions Data for the Oil and Natural Gas Production
Sector; United States Environmental Protection Agency: Washington, DC, 2013;
https://www.epa.gov/sites/production/files/2015-09/documents/20130220-13-p-0161.pdf.
40. Abualfaraj, N.; Olson, M.S.; Gurian, P.L.; De Roos, A.; Gross-Davis, C.A. Statistical
analysis of compliance violations for natural gas wells in Pennsylvania. Energy Policy
2016, 97, 421-428; DOI: 10.1016/j.enpol.2016.07.051.
Chapter 6: Miscellaneous results
In this chapter we present additional results from Chapters 3 and 4, additional
sensitivity analyses not reported in those chapters, and additional hypotheses explored.
6.1 Additional Results for Chapter 3
6.1.1 Associations of covariates with event status
Presented below are the odds ratios for the covariates with event status in the 12
UNGD phase - asthma outcome models (Tables 6.1.1.1-6.1.1.3).
6.1.1.1 Race/ethnicity
For the hospitalization outcome, race/ethnicity was not statistically significantly associated with event status. For the emergency department outcome, patients with non-Hispanic black race/ethnicity and with Hispanic race/ethnicity had higher odds of the outcome compared to patients with non-Hispanic white race/ethnicity. For example, in the pad and emergency department model, patients with non-Hispanic black race/ethnicity had 4.87 times the odds of having an event (95% confidence interval [CI]:
2.45 - 9.66), and patients with Hispanic race/ethnicity had 3.44 times the odds of having an event (95% CI: 1.7 - 6.94). For the oral corticosteroid (OCS) outcome, patients with non-Hispanic black race/ethnicity had lower odds of event status (e.g., in the pad model,
odds ratio [OR] = 0.72, 95% CI: 0.58 - 0.90), and Hispanic race/ethnicity was not
associated with event status.
6.1.1.2 Family history
Family history of asthma was associated with event status in every model. For example, in the pad and hospitalization model, patients with a family history of asthma had 1.48 times the odds (95% CI: 1.20 - 1.83) of having the outcome compared to patients
without a family history. Similarly, in the pad and emergency department model, the
odds ratio for family history was 3.25 (95% CI: 2.14 - 4.95); and in the pad and OCS
model, the odds ratio for family history was 1.45 (95% CI: 1.29 - 1.64).
6.1.1.3 Smoking status
For each of the outcomes, current and former smoking status (compared to never smoking) were both associated with event status, current smoking was more strongly associated than former smoking, and odds ratios were similar across models. In the pad and OCS model, for example, current smokers had 1.81 times the odds of having an event (95% CI: 1.61 - 2.04) compared to never smokers, and former smokers had 1.55 times the odds (95% CI: 1.39 - 1.73).
6.1.1.4 Season
For all three outcomes, index dates in the winter were more likely to be case events compared to those in spring, though the odds ratios were only consistently statistically significant across the four phases of well development with the OCS outcome (e.g., in the pad and OCS model, OR = 1.52, 95% CI: 1.34 - 1.73). Index dates
in the fall were less likely to be case events compared to those in spring, though the
odds ratios were only consistently statistically significant across the four phases of well
development with the hospitalization outcome (e.g., in the pad and hospitalization model,
OR = 0.67, 95% CI: 0.56 - 0.80). Finally, index dates in the summer were also less likely
to be case events compared to those in spring, though the odds ratios were only
consistently statistically significant across the four phases of well development with the
OCS outcome (e.g., in the pad and OCS model, OR = 0.65, 95% CI: 0.58 - 0.73).
6.1.1.5 Type 2 diabetes
Patients with type 2 diabetes, compared to those who did not have type 2 diabetes, were more likely to be cases, though the odds ratios were only statistically significant for the hospitalization outcome (e.g., in the pad and hospitalization model, OR
= 2.48, 95% CI: 2.04 - 3.01).
6.1.1.6 Community socioeconomic deprivation
Community socioeconomic deprivation was not statistically significantly associated with event status in models for any of the three outcomes. Odds ratios were elevated, but not statistically significant, for the emergency department outcome (e.g., in the pad model, OR = 1.23, 95% CI: 0.44 - 3.45).
6.1.1.7 Maximum temperature on prior day
Temperature was not associated with event status for the emergency department or hospitalization outcomes. For the OCS outcome, higher temperatures were statistically significantly associated with lower odds of event status compared to lower temperatures (e.g., for the pad and OCS model, OR for linear temperature = 0.98, 95%
CI: 0.98 - 0.99; OR for quadratic temperature = 0.998, 95% CI: 0.998 - 0.999).
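Because temperature enters the models as a centered linear term plus its square, the two odds ratios have to be combined to read off the odds ratio at a particular temperature. The following is a minimal sketch of that arithmetic, not code from the original analysis. It assumes the reported odds ratios are per 1 degree Celsius of the centered variable and per 1 squared degree of its square; the function name is illustrative, and the inputs are the rounded pad and OCS estimates quoted above.

```python
import math


def combined_odds_ratio(or_linear, or_quadratic, delta):
    """Odds ratio at `delta` units from the centering value, relative to the
    centering value itself, for a model with linear and quadratic terms.

    Assumes (for illustration) that both ORs are expressed per unit of the
    centered variable and per unit of its square.
    """
    log_or = delta * math.log(or_linear) + delta ** 2 * math.log(or_quadratic)
    return math.exp(log_or)


# Rounded estimates from the pad and OCS model (linear 0.98, quadratic 0.998).
or_at_plus_10 = combined_odds_ratio(0.98, 0.998, 10.0)
or_at_minus_10 = combined_odds_ratio(0.98, 0.998, -10.0)
print(round(or_at_plus_10, 3), round(or_at_minus_10, 3))
```

With both coefficients below 1, the combined odds ratio falls as temperature rises above the centering value, which is the direction described in the text.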
6.1.1.8 Distance to nearest roadway
Distance to nearest major road was not associated with event status for any of
the three outcomes. Distance to nearest minor road was not associated with event
status in the hospitalization and OCS models. However, for the emergency department
outcome, patients living closer to minor roads had greater odds of event status than
patients living farther away (e.g., for the pad and emergency department model, OR for
linear distance = 0.90, 95% CI: 0.57 - 1.40; OR for quadratic distance = 0.68, 95% CI:
0.53 - 0.89).
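The road-distance covariates are described in the tables as truncated at the 98th percentile and z-transformed, with a squared term. The sketch below shows one way such a transformation could be computed. It is an illustration under stated assumptions rather than the code used in the study: the percentile definition (the "inclusive" method of Python's statistics module), the use of the sample standard deviation, and the function and variable names are all assumptions.

```python
import statistics


def truncate_and_standardize(distances_m, percentile=98):
    """Cap values at the given percentile, then z-transform and square.

    Returns (z, z_squared). The percentile method and the sample standard
    deviation are assumptions made for this illustration.
    """
    cut = statistics.quantiles(distances_m, n=100, method="inclusive")[percentile - 1]
    truncated = [min(d, cut) for d in distances_m]
    mean = statistics.fmean(truncated)
    sd = statistics.stdev(truncated)
    z = [(d - mean) / sd for d in truncated]
    z_squared = [v * v for v in z]
    return z, z_squared


# Hypothetical distances in meters, including one extreme value.
distances_m = [float(d) for d in range(1, 101)] + [10000.0]
z, z_squared = truncate_and_standardize(distances_m)
```

Truncation keeps a few very remote residences from dominating the linear and quadratic terms; in the hypothetical data the extreme value is capped to the same value as its neighbor before standardization.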
Table 6.1.1.1. Odds ratios from oral corticosteroid (mild exacerbation) models.

Odds ratio^a (95% CI), by UNGD phase
Variable                                                    Pad                       Spud                      Stimulation               Production
UNGD activity metric (ref: very low)
  Low                                                       1.54 (1.37 - 1.74)        1.45 (1.29 - 1.63)        1.23 (1.09 - 1.39)        1.28 (1.13 - 1.46)
  Medium                                                    1.66 (1.47 - 1.87)        1.98 (1.75 - 2.24)        2.22 (1.95 - 2.53)        2.15 (1.87 - 2.47)
  High                                                      1.59 (1.41 - 1.81)        1.99 (1.75 - 2.26)        3.00 (2.60 - 3.45)        4.43 (3.75 - 5.22)
Race/ethnicity (ref: white)
  Black                                                     0.72 (0.58 - 0.9)         0.71 (0.57 - 0.9)         0.69 (0.54 - 0.88)        0.66 (0.51 - 0.85)
  Hispanic                                                  0.86 (0.70 - 1.07)        0.85 (0.68 - 1.06)        0.84 (0.66 - 1.06)        0.79 (0.62 - 1.02)
  Other/missing                                             0.85 (0.56 - 1.27)        0.83 (0.55 - 1.26)        0.82 (0.53 - 1.29)        0.82 (0.51 - 1.32)
Family history (ref: no)                                    1.45 (1.29 - 1.64)        1.46 (1.29 - 1.66)        1.48 (1.29 - 1.69)        1.49 (1.29 - 1.72)
Smoking status (ref: never)
  Current                                                   1.81 (1.61 - 2.04)        1.85 (1.64 - 2.09)        1.91 (1.68 - 2.18)        2.00 (1.74 - 2.29)
  Former                                                    1.55 (1.39 - 1.73)        1.57 (1.40 - 1.76)        1.60 (1.41 - 1.80)        1.64 (1.44 - 1.86)
  Missing                                                   0.65 (0.56 - 0.76)        0.67 (0.57 - 0.78)        0.71 (0.60 - 0.84)        0.76 (0.64 - 0.91)
Season^b (ref: spring)
  Summer                                                    0.65 (0.58 - 0.73)        0.65 (0.58 - 0.73)        0.59 (0.52 - 0.67)        0.58 (0.51 - 0.66)
  Fall                                                      0.99 (0.90 - 1.09)        0.98 (0.88 - 1.08)        0.91 (0.82 - 1.02)        0.85 (0.76 - 0.95)
  Winter                                                    1.52 (1.34 - 1.73)        1.49 (1.31 - 1.70)        1.52 (1.33 - 1.74)        1.51 (1.31 - 1.74)
Medical Assistance (ref: no)                                1.19 (1.09 - 1.32)        1.19 (1.07 - 1.31)        1.16 (1.04 - 1.29)        1.14 (1.02 - 1.28)
Overweight/obesity^c (ref: normal)
  Overweight                                                1.42 (1.28 - 1.58)        1.44 (1.29 - 1.60)        1.47 (1.31 - 1.65)        1.51 (1.34 - 1.70)
  Obese                                                     1.90 (1.72 - 2.09)        1.93 (1.75 - 2.14)        2.02 (1.82 - 2.25)        2.10 (1.88 - 2.35)
  Missing BMI                                               0.60 (0.32 - 1.14)        0.59 (0.31 - 1.15)        0.57 (0.28 - 1.14)        0.57 (0.27 - 1.21)
Type 2 diabetes (ref: no)                                   1.04 (0.9 - 1.21)         1.05 (0.90 - 1.22)        1.05 (0.89 - 1.23)        1.07 (0.9 - 1.27)
CSD quartile (ref: quartile 1)
  Quartile 2                                                0.89 (0.76 - 1.04)        0.89 (0.76 - 1.04)        0.88 (0.74 - 1.05)        0.86 (0.71 - 1.04)
  Quartile 3                                                0.95 (0.82 - 1.11)        0.95 (0.81 - 1.11)        0.95 (0.79 - 1.13)        0.92 (0.76 - 1.11)
  Quartile 4                                                0.90 (0.77 - 1.05)        0.90 (0.76 - 1.06)        0.90 (0.75 - 1.08)        0.89 (0.73 - 1.08)
Maximum temperature on prior day, degrees Celsius
  Centered                                                  0.98 (0.98 - 0.99)        0.98 (0.98 - 0.99)        0.98 (0.98 - 0.99)        0.98 (0.98 - 0.99)
  Centered and squared                                      0.998 (0.998 - 0.999)     0.998 (0.998 - 0.999)     0.998 (0.998 - 0.999)     0.998 (0.998 - 0.999)
Distance to nearest major road, meters
  Truncated at the 98th percentile, z-transformed           0.99 (0.9 - 1.09)         0.99 (0.90 - 1.09)        1.001 (0.90 - 1.11)       0.999 (0.89 - 1.12)
  Truncated at the 98th percentile, z-transformed, squared  1.01 (0.97 - 1.05)        1.01 (0.96 - 1.05)        0.999 (0.95 - 1.05)       0.997 (0.95 - 1.05)
Distance to nearest minor road, meters
  Truncated at the 98th percentile, z-transformed           0.96 (0.87 - 1.05)        0.96 (0.87 - 1.06)        0.96 (0.86 - 1.06)        0.97 (0.87 - 1.09)
  Truncated at the 98th percentile, z-transformed, squared  1.02 (0.97 - 1.07)        1.02 (0.97 - 1.07)        1.02 (0.96 - 1.07)        1.01 (0.96 - 1.07)

Abbreviations: UNGD, unconventional natural gas development; CI, confidence interval; ref, reference group
a Multilevel models with a random intercept for patient and community.
b Spring, March 22-June 21; summer, June 22-September 21; fall, September 22-December 21; winter, December 22-March 21
c Normal, body mass index [BMI] < 85th percentile or BMI < 25 kg/m2; overweight, BMI = 85th-<95th percentile or BMI = 25-<30 kg/m2; obese, BMI ≥ 95th percentile or BMI ≥ 30 kg/m2, for children and adults
Table 6.1.1.2. Odds ratios from emergency encounter (moderate exacerbation) models.

Odds ratio^a (95% CI), by UNGD phase
Variable                                                    Pad                       Spud                      Stimulation               Production
UNGD activity metric (ref: very low)
  Low                                                       1.53 (1.06 - 2.23)        1.53 (1.06 - 2.21)        1.51 (1.05 - 2.19)        1.47 (1.01 - 2.14)
  Medium                                                    1.77 (1.20 - 2.6)         1.54 (1.04 - 2.27)        1.74 (1.17 - 2.61)        1.10 (0.74 - 1.65)
  High                                                      1.37 (0.94 - 1.99)        1.57 (1.08 - 2.29)        1.71 (1.16 - 2.52)        2.19 (1.47 - 3.25)
Race/ethnicity (ref: white)
  Black                                                     4.87 (2.45 - 9.66)        4.81 (2.42 - 9.58)        4.82 (2.41 - 9.65)        5.03 (2.49 - 10.15)
  Hispanic                                                  3.44 (1.70 - 6.94)        3.45 (1.70 - 6.98)        3.41 (1.68 - 6.95)        3.40 (1.66 - 6.98)
  Other/missing                                             0.83 (0.18 - 3.81)        0.83 (0.18 - 3.84)        0.82 (0.17 - 3.86)        0.80 (0.17 - 3.86)
Family history (ref: no)                                    3.25 (2.14 - 4.95)        3.28 (2.15 - 5.00)        3.31 (2.16 - 5.06)        3.33 (2.17 - 5.13)
Smoking status (ref: never)
  Current                                                   1.91 (1.26 - 2.9)         1.90 (1.25 - 2.89)        1.91 (1.25 - 2.92)        1.95 (1.27 - 3.00)
  Former                                                    1.54 (0.997 - 2.37)       1.55 (1.002 - 2.39)       1.56 (1.01 - 2.42)        1.58 (1.01 - 2.46)
  Missing                                                   3.24 (2.10 - 5.00)        3.24 (2.09 - 5.01)        3.35 (2.15 - 5.22)        3.50 (2.23 - 5.5)
Season^b (ref: spring)
  Summer                                                    0.76 (0.50 - 1.17)        0.77 (0.50 - 1.19)        0.75 (0.48 - 1.15)        0.75 (0.48 - 1.16)
  Fall                                                      0.88 (0.60 - 1.28)        0.87 (0.60 - 1.27)        0.83 (0.57 - 1.22)        0.82 (0.56 - 1.21)
  Winter                                                    1.57 (0.97 - 2.55)        1.49 (0.92 - 2.42)        1.53 (0.94 - 2.50)        1.48 (0.90 - 2.42)
Medical Assistance (ref: no)                                2.25 (1.6 - 3.17)         2.26 (1.6 - 3.18)         2.22 (1.57 - 3.13)        2.23 (1.58 - 3.16)
Overweight/obesity^c (ref: normal)
  Overweight                                                1.11 (0.75 - 1.63)        1.11 (0.75 - 1.64)        1.12 (0.76 - 1.66)        1.12 (0.75 - 1.66)
  Obese                                                     1.95 (1.36 - 2.78)        1.95 (1.36 - 2.78)        1.96 (1.37 - 2.81)        1.98 (1.38 - 2.85)
  Missing BMI                                               14.01 (2.9 - 67.65)       14.27 (2.92 - 69.69)      14.58 (2.94 - 72.21)      14.62 (2.89 - 74.00)
Type 2 diabetes (ref: no)                                   1.73 (0.95 - 3.14)        1.72 (0.94 - 3.14)        1.73 (0.94 - 3.17)        1.75 (0.95 - 3.22)
CSD quartile (ref: quartile 1)
  Quartile 2                                                0.89 (0.33 - 2.41)        0.89 (0.33 - 2.40)        0.89 (0.33 - 2.45)        0.90 (0.33 - 2.47)
  Quartile 3                                                1.20 (0.44 - 3.32)        1.19 (0.43 - 3.29)        1.21 (0.43 - 3.37)        1.21 (0.43 - 3.38)
  Quartile 4                                                1.23 (0.44 - 3.45)        1.24 (0.44 - 3.48)        1.25 (0.44 - 3.54)        1.24 (0.44 - 3.53)
Maximum temperature on prior day, degrees Celsius
  Centered                                                  0.998 (0.98 - 1.02)       0.996 (0.98 - 1.02)       0.998 (0.98 - 1.02)       0.998 (0.98 - 1.02)
  Centered and squared                                      0.999 (0.997 - 1.00001)   0.999 (0.997 - 1.00004)   0.999 (0.997 - 0.99998)   0.999 (0.997 - 1.0001)
Distance to nearest major road, meters
  Truncated at the 98th percentile, z-transformed           0.92 (0.56 - 1.52)        0.92 (0.55 - 1.53)        0.92 (0.55 - 1.53)        0.91 (0.54 - 1.52)
  Truncated at the 98th percentile, z-transformed, squared  1.01 (0.81 - 1.26)        1.01 (0.81 - 1.26)        1.01 (0.81 - 1.26)        1.01 (0.81 - 1.27)
Distance to nearest minor road, meters
  Truncated at the 98th percentile, z-transformed           0.90 (0.57 - 1.40)        0.90 (0.58 - 1.41)        0.90 (0.57 - 1.41)        0.91 (0.58 - 1.43)
  Truncated at the 98th percentile, z-transformed, squared  0.68 (0.53 - 0.89)        0.68 (0.52 - 0.89)        0.68 (0.52 - 0.89)        0.67 (0.51 - 0.88)

Abbreviations: UNGD, unconventional natural gas development; CI, confidence interval; ref, reference group
a Multilevel models with a random intercept for patient and community.
b Spring, March 22-June 21; summer, June 22-September 21; fall, September 22-December 21; winter, December 22-March 21
c Normal, body mass index [BMI] < 85th percentile or BMI < 25 kg/m2; overweight, BMI = 85th-<95th percentile or BMI = 25-<30 kg/m2; obese, BMI ≥ 95th percentile or BMI ≥ 30 kg/m2, for children and adults
Table 6.1.1.3. Odds ratios from hospitalization (severe exacerbation) models.

Odds ratio^a (95% CI), by UNGD phase
Variable                                                    Pad                       Spud                      Stimulation               Production
UNGD activity metric (ref: very low)
  Low                                                       1.26 (1.06 - 1.5)         1.16 (0.98 - 1.37)        1.13 (0.96 - 1.33)        1.10 (0.92 - 1.30)
  Medium                                                    1.37 (1.15 - 1.64)        1.26 (1.05 - 1.50)        1.31 (1.10 - 1.57)        1.16 (0.97 - 1.38)
  High                                                      1.45 (1.21 - 1.73)        1.64 (1.38 - 1.97)        1.66 (1.38 - 1.98)        1.74 (1.45 - 2.09)
Race/ethnicity (ref: white)
  Black                                                     1.15 (0.77 - 1.73)        1.15 (0.77 - 1.73)        1.14 (0.76 - 1.72)        1.13 (0.75 - 1.70)
  Hispanic                                                  1.41 (0.96 - 2.08)        1.43 (0.97 - 2.11)        1.41 (0.96 - 2.09)        1.41 (0.95 - 2.08)
  Other/missing                                             0.87 (0.39 - 1.91)        0.86 (0.39 - 1.91)        0.86 (0.39 - 1.91)        0.85 (0.38 - 1.9)
Family history (ref: no)                                    1.48 (1.20 - 1.83)        1.47 (1.19 - 1.82)        1.48 (1.19 - 1.83)        1.48 (1.19 - 1.83)
Smoking status (ref: never)
  Current                                                   1.92 (1.61 - 2.28)        1.94 (1.62 - 2.31)        1.94 (1.63 - 2.32)        1.96 (1.64 - 2.34)
  Former                                                    1.51 (1.28 - 1.78)        1.52 (1.29 - 1.79)        1.51 (1.28 - 1.79)        1.51 (1.28 - 1.78)
  Missing                                                   1.36 (1.04 - 1.77)        1.36 (1.04 - 1.78)        1.37 (1.05 - 1.80)        1.38 (1.05 - 1.81)
Season^b (ref: spring)
  Summer                                                    0.87 (0.72 - 1.05)        0.88 (0.73 - 1.07)        0.85 (0.71 - 1.03)        0.86 (0.71 - 1.04)
  Fall                                                      0.67 (0.56 - 0.80)        0.67 (0.56 - 0.80)        0.66 (0.55 - 0.78)        0.65 (0.54 - 0.77)
  Winter                                                    1.26 (1.02 - 1.57)        1.24 (0.996 - 1.54)       1.25 (1.002 - 1.55)       1.24 (0.998 - 1.54)
Medical Assistance (ref: no)                                3.32 (2.79 - 3.95)        3.31 (2.78 - 3.93)        3.34 (2.8 - 3.97)         3.35 (2.81 - 3.99)
Overweight/obesity^c (ref: normal)
  Overweight                                                1.14 (0.95 - 1.37)        1.14 (0.95 - 1.37)        1.14 (0.95 - 1.37)        1.15 (0.95 - 1.38)
  Obese                                                     1.52 (1.29 - 1.79)        1.53 (1.29 - 1.80)        1.53 (1.29 - 1.81)        1.53 (1.30 - 1.81)
  Missing BMI                                               0.65 (0.26 - 1.64)        0.65 (0.26 - 1.64)        0.65 (0.26 - 1.64)        0.64 (0.25 - 1.63)
Type 2 diabetes (ref: no)                                   2.48 (2.04 - 3.01)        2.48 (2.04 - 3.02)        2.49 (2.04 - 3.03)        2.5 (2.05 - 3.05)
CSD quartile (ref: quartile 1)
  Quartile 2                                                0.85 (0.59 - 1.21)        0.84 (0.59 - 1.21)        0.85 (0.59 - 1.21)        0.84 (0.58 - 1.20)
  Quartile 3                                                0.95 (0.66 - 1.36)        0.95 (0.66 - 1.36)        0.94 (0.65 - 1.36)        0.94 (0.65 - 1.35)
  Quartile 4                                                0.75 (0.52 - 1.09)        0.75 (0.52 - 1.10)        0.75 (0.51 - 1.09)        0.74 (0.51 - 1.08)
Maximum temperature on prior day, degrees Celsius
  Centered                                                  1.01 (0.997 - 1.01)       1.01 (0.996 - 1.01)       1.01 (0.997 - 1.02)       1.01 (0.997 - 1.02)
  Centered and squared                                      1.0002 (0.9996 - 1.001)   1.0002 (0.9996 - 1.001)   1.0002 (0.996 - 1.001)    1.0002 (0.9996 - 1.001)
Distance to nearest major road, meters
  Truncated at the 98th percentile, z-transformed           0.88 (0.72 - 1.07)        0.89 (0.73 - 1.08)        0.88 (0.73 - 1.08)        0.89 (0.73 - 1.08)
  Truncated at the 98th percentile, z-transformed, squared  1.04 (0.96 - 1.14)        1.04 (0.95 - 1.13)        1.04 (0.96 - 1.14)        1.04 (0.95 - 1.14)
Distance to nearest minor road, meters
  Truncated at the 98th percentile, z-transformed           0.96 (0.81 - 1.15)        0.96 (0.80 - 1.15)        0.96 (0.80 - 1.15)        0.97 (0.81 - 1.16)
  Truncated at the 98th percentile, z-transformed, squared  0.99 (0.90 - 1.08)        0.99 (0.90 - 1.08)        0.99 (0.90 - 1.08)        0.98 (0.90 - 1.08)

Abbreviations: UNGD, unconventional natural gas development; CI, confidence interval; ref, reference group
a Multilevel models with a random intercept for patient and community.
b Spring, March 22-June 21; summer, June 22-September 21; fall, September 22-December 21; winter, December 22-March 21
c Normal, body mass index [BMI] < 85th percentile or BMI < 25 kg/m2; overweight, BMI = 85th-<95th percentile or BMI = 25-<30 kg/m2; obese, BMI ≥ 95th percentile or BMI ≥ 30 kg/m2, for children and adults
6.1.2 Additional sensitivity analyses
Presented here are additional sensitivity analyses for the study on UNGD and asthma exacerbations.
6.1.2.1 Stimulation extrapolation methods
More than a third (34.6%) of wells were missing stimulation dates, and we were concerned that these missing data could cause bias. However, because stimulation dates are bounded by the spud and production start dates, we hypothesized that the potential for exposure misclassification caused by extrapolated dates was minimal. As long as wells with extrapolated stimulation dates were randomly distributed in space relative to wells without them, as appears to be the case in Figure 6.1.2.1, the extrapolation of stimulation dates should not account for our results.
Figure 6.1.2.1. Locations of Wells with Extrapolated Stimulation Dates.
To evaluate the sensitivity of our results to stimulation date extrapolation, we conducted an analysis that replaced all extrapolated stimulation dates with a date 30 days after the spud date. We used these new dates to calculate a new stimulation
metric, and then re-ran the final model for the hospitalization outcome. The results of this sensitivity analysis (Table 6.1.2.1) show comparable odds ratios and overlapping confidence intervals for the extrapolated stimulation dates (as presented in Chapter 3) and sensitivity stimulation dates (replacing the extrapolated stimulation dates with a date
30 days after the spud date).
Table 6.1.2.1. Associations of UNGD stimulation metrics created with extrapolated and sensitivity stimulation dates and asthma hospitalizations

UNGD stimulation metric    Extrapolated^a stimulation dates^c    Sensitivity^b stimulation dates^c
                           Odds ratio (95% CI^d)                 Odds ratio (95% CI)
Low^e                      1.14 (0.97 - 1.34)                    1.22 (1.04 - 1.44)
Medium                     1.32 (1.10 - 1.57)                    1.50 (1.26 - 1.78)
High                       1.66 (1.39 - 1.98)                    1.47 (1.23 - 1.76)

a As presented in Chapter 3.
b Extrapolated stimulation dates replaced with a date 30 days after the spud date.
c Multilevel models with a random intercept for patient and community, adjusted for race/ethnicity (white, black, Hispanic, other), family history of asthma (yes vs. no), smoking status (never, former, current, missing), season (spring, March 22-June 21; summer, June 22-September 21; fall, September 22-December 21; winter, December 22-March 21), Medical Assistance (yes vs. no), overweight/obesity (normal, body mass index [BMI] < 85th percentile or BMI < 25 kg/m2; overweight, BMI = 85th-<95th percentile or BMI = 25-<30 kg/m2; obese, BMI ≥ 95th percentile or BMI ≥ 30 kg/m2, for children and adults, respectively; BMI missing), type 2 diabetes (yes vs. no), community socioeconomic deprivation (quartiles), distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), squared distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), maximum temperature on the day prior to event (degrees Celsius), and squared maximum temperature on the day prior to event (degrees Celsius)
d Confidence interval
e Very low is the reference group
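The date substitution used in this sensitivity analysis can be sketched as follows. This is an illustrative reconstruction, not the study's code: the record layout, field names, and example wells are hypothetical, and only the rule itself (replace extrapolated stimulation dates with a date 30 days after the spud date) comes from the text.

```python
from datetime import date, timedelta


def sensitivity_stimulation_date(spud_date, stimulation_date, extrapolated, offset_days=30):
    """Return the stimulation date to use in the sensitivity analysis.

    Reported stimulation dates are kept; extrapolated ones are replaced with
    a date `offset_days` after the spud date.
    """
    if extrapolated:
        return spud_date + timedelta(days=offset_days)
    return stimulation_date


# Hypothetical wells: the second has an extrapolated stimulation date.
wells = [
    {"spud": date(2010, 3, 1), "stim": date(2010, 6, 15), "extrapolated": False},
    {"spud": date(2011, 1, 10), "stim": date(2011, 5, 2), "extrapolated": True},
]
sensitivity_dates = [
    sensitivity_stimulation_date(w["spud"], w["stim"], w["extrapolated"]) for w in wells
]
```

The stimulation metric would then be recomputed from these dates before re-running the hospitalization model.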
6.1.2.2 Control Encounter Dates
In Section 3.3.3, for controls, we selected a random encounter date with the Geisinger Health System as the date on which to assign time-varying covariates. There are several advantages to using a date with actual patient contact. First, it ensured that the patient was under observation and that, had they had an exacerbation, they would have been evaluated at a Geisinger facility. Second, using a specific contact date gave us confidence that the patient was in the area and could therefore have been exposed on the day before. Finally, using a specific contact date increased the likelihood that, if there had been a major change in one of the patient's time-varying covariates (e.g., BMI or smoking status), the EHR would have recorded this change.
However, we also considered using a random date from all dates within the year instead of a contact date. We completed a sensitivity analysis to evaluate the sensitivity of our results to this choice. For controls, we randomly selected a date from all dates within the year and assigned the spud activity metric on that date instead of on the randomly selected contact date. The results of this sensitivity analysis (Table 6.1.2.2) show comparable odds ratios and overlapping confidence intervals whether a random encounter date or a random date from the entire year was used for controls.
Table 6.1.2.2. Associations of UNGD spud metrics assigned on random encounter dates vs. random dates and asthma hospitalizations

UNGD spud metric    Control spud activity metric assigned on    Control spud activity metric assigned on
                    randomly selected index date^a,b            randomly selected date^b,c
                    Odds ratio (95% CI^d)                       Odds ratio (95% CI)
Low^e               1.17 (0.98 - 1.39)                          1.30 (1.11 - 1.54)
Medium              1.26 (1.06 - 1.50)                          1.28 (1.08 - 1.52)
High                1.65 (1.38 - 1.97)                          1.45 (1.22 - 1.72)

a As presented in Chapter 3.
b Multilevel models with a random intercept for patient and community, adjusted for race/ethnicity (white, black, Hispanic, other), family history of asthma (yes vs. no), smoking status (never, former, current, missing), season (spring, March 22-June 21; summer, June 22-September 21; fall, September 22-December 21; winter, December 22-March 21), Medical Assistance (yes vs. no), overweight/obesity (normal, body mass index [BMI] < 85th percentile or BMI < 25 kg/m2; overweight, BMI = 85th-<95th percentile or BMI = 25-<30 kg/m2; obese, BMI ≥ 95th percentile or BMI ≥ 30 kg/m2, for children and adults, respectively; BMI missing), type 2 diabetes (yes vs. no), community socioeconomic deprivation (quartiles), distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), squared distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), maximum temperature on the day prior to event (degrees Celsius), and squared maximum temperature on the day prior to event (degrees Celsius)
c Random date in year selected.
d Confidence interval
e Very low is the reference group
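Drawing a control date uniformly from all dates within a year, as in this sensitivity analysis, might look like the sketch below. The function name and the seeded generator are assumptions for illustration; the original analysis may have drawn dates differently.

```python
import random
from datetime import date, timedelta


def random_date_in_year(year, rng):
    """Draw one date uniformly from all calendar dates in `year`."""
    start = date(year, 1, 1)
    n_days = (date(year + 1, 1, 1) - start).days  # 365, or 366 in leap years
    return start + timedelta(days=rng.randrange(n_days))


# Hypothetical use: a seeded generator keeps the draws reproducible.
rng = random.Random(0)
sampled = [random_date_in_year(2008, rng) for _ in range(1000)]
```

Unlike a randomly selected encounter date, a date drawn this way does not guarantee that the patient had contact with the health system on that day, which is the trade-off this sensitivity analysis examines.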
6.1.2.3 Inverse distance squared vs. cubed metric
We used inverse-distance-squared UNGD activity metrics, but we were concerned about our assumption of a squared decay function. To evaluate the sensitivity of our results to the functional form of the distance weighting, we conducted an analysis that assigned the spud activity metric using distance cubed instead of distance squared in the denominator. We then re-ran the final model for the hospitalization outcome. The results of this sensitivity analysis (Table 6.1.2.3) show comparable odds ratios and overlapping confidence intervals for the spud activity metric created using distance squared in the denominator compared to using distance cubed.
Table 6.1.2.3. Associations of spud activity metrics assigned using distance squared vs. distance cubed

UNGD spud metric    Spud activity metric assigned using    Spud activity metric assigned using
                    distance squared^a,b                   distance cubed^b
                    Odds ratio (95% CI^c)                  Odds ratio (95% CI)
Low^d               1.17 (0.98 - 1.39)                     1.20 (1.02 - 1.41)
Medium              1.26 (1.06 - 1.50)                     1.44 (1.21 - 1.71)
High                1.65 (1.38 - 1.97)                     1.44 (1.21 - 1.71)

a As presented in Chapter 3.
b Multilevel models with a random intercept for patient and community, adjusted for race/ethnicity (white, black, Hispanic, other), family history of asthma (yes vs. no), smoking status (never, former, current, missing), season (spring, March 22-June 21; summer, June 22-September 21; fall, September 22-December 21; winter, December 22-March 21), Medical Assistance (yes vs. no), overweight/obesity (normal, body mass index [BMI] < 85th percentile or BMI < 25 kg/m2; overweight, BMI = 85th-<95th percentile or BMI = 25-<30 kg/m2; obese, BMI ≥ 95th percentile or BMI ≥ 30 kg/m2, for children and adults, respectively; BMI missing), type 2 diabetes (yes vs. no), community socioeconomic deprivation (quartiles), distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), squared distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), maximum temperature on the day prior to event (degrees Celsius), and squared maximum temperature on the day prior to event (degrees Celsius)
c Confidence interval
d Very low is the reference group
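To make the difference between the two decay functions concrete, the sketch below sums inverse-distance weights over wells with a configurable exponent. It is a simplified illustration only: the full metric definitions are given in Chapter 3 and may include terms not shown here, and the function name, distance units, and example distances are assumptions.

```python
def inverse_distance_metric(distances, power=2):
    """Sum of 1 / d**power over the wells contributing to a patient's metric.

    `distances` holds patient-to-well distances (units are an assumption);
    non-positive distances are skipped to avoid division by zero.
    """
    return sum(1.0 / d ** power for d in distances if d > 0)


# Hypothetical patient with one well at 1 unit and another at 2 units.
distances = [1.0, 2.0]
squared = inverse_distance_metric(distances, power=2)  # 1 + 1/4
cubed = inverse_distance_metric(distances, power=3)    # 1 + 1/8

# Share of each metric contributed by the nearer well.
near_share_squared = 1.0 / squared
near_share_cubed = 1.0 / cubed
```

Raising the exponent shifts weight toward the nearest wells, so the cubed version tests whether the results depend on how quickly the assumed influence of a well decays with distance.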
6.1.2.4 Distance to hospital
We were concerned that distance to hospital might be a confounder in the hospitalization and emergency department analyses. Patients who lived closer to the hospital might be more likely to go to the hospital for an exacerbation, while patients who lived farther away might seek care over the phone or in an outpatient center. For the hospitalization and emergency department outcomes, the median distance to hospital was much shorter among patients with events than among those without (Table 6.1.2.4.1).
To evaluate distance to hospital as a confounder, we added distance to hospital as a covariate in the spud and hospitalization model. We z-transformed the distance to hospital variable, and we added the standardized and the squared standardized variables to the model. The results of this sensitivity analysis (Table 6.1.2.4.2) show comparable odds ratios and overlapping confidence intervals whether or not distance to hospital was in the model.
Table 6.1.2.4.1. Median distance to closest Geisinger hospital by event type and event status, km

Event status    Hospitalization    Emergency    OCS
Control         37.1               37.7         36.8
Case            21.0               12.4         38.9
Table 6.1.2.4.2. Associations of the UNGD spud metric and hospitalization outcome without and with distance to hospital in the model.

Spud activity metric^e    Final model^a,b          Adding distance to hospital to the model^c
                          Odds ratio (95% CI^d)    Odds ratio (95% CI^d)
Low                       1.17 (0.98 - 1.39)       1.17 (0.99 - 1.38)
Medium                    1.26 (1.06 - 1.50)       1.23 (1.03 - 1.46)
High                      1.65 (1.38 - 1.97)       1.49 (1.26 - 1.77)

a As presented in Chapter 3.
b Multilevel models with a random intercept for patient and community, adjusted for race/ethnicity (white, black, Hispanic, other), family history of asthma (yes vs. no), smoking status (never, former, current, missing), season (spring, March 22-June 21; summer, June 22-September 21; fall, September 22-December 21; winter, December 22-March 21), Medical Assistance (yes vs. no), overweight/obesity (normal, body mass index [BMI] < 85th percentile or BMI < 25 kg/m2; overweight, BMI = 85th-<95th percentile or BMI = 25-<30 kg/m2; obese, BMI ≥ 95th percentile or BMI ≥ 30 kg/m2, for children and adults, respectively; BMI missing), type 2 diabetes (yes vs. no), community socioeconomic deprivation (quartiles), distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), squared distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), maximum temperature on the day prior to event (degrees Celsius), and squared maximum temperature on the day prior to event (degrees Celsius)
c Distance to hospital added as a z-score and a squared z-score
d Confidence interval
e Very low is the reference group
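The sensitivity analyses in this section are summarized by whether confidence intervals overlap. A minimal sketch of that comparison is shown below, using the high spud activity category from Table 6.1.2.4.2 as input. The function name is illustrative. Interval overlap is an informal, descriptive comparison rather than a formal test of whether two estimates differ.

```python
def intervals_overlap(ci_a, ci_b):
    """Return True when two (lower, upper) intervals share at least one point."""
    lower_a, upper_a = ci_a
    lower_b, upper_b = ci_b
    return lower_a <= upper_b and lower_b <= upper_a


# 95% CIs for the high spud activity category in Table 6.1.2.4.2.
final_model_high = (1.38, 1.97)
with_distance_high = (1.26, 1.77)
overlap_high = intervals_overlap(final_model_high, with_distance_high)
```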
6.2 Additional Results for Chapter 4
6.2.1 Associations of covariates with event status
Presented below are the associations of the covariates in the models of UNGD with the level of depression symptoms (multinomial logistic regression) and with the burden of depression symptoms (negative binomial regression) (Tables 6.2.1.1 and 6.2.1.2). Only results from truncated survey-weighted models (Section 4.2.6), which were the primary analyses, are discussed below, but the associations of the covariates in fully-weighted and unweighted models, which were sensitivity analyses, are presented in Tables 6.2.1.3 - 6.2.1.6. Associations generally had similar interpretations in fully-weighted and unweighted models as in the truncated weighted models, though they tended to be stronger in fully-weighted models and weaker in unweighted models.
Below, all continuous variables were first centered and then both linear and quadratic terms were included in models to evaluate non-linearity.
6.2.1.1 Race / ethnicity
Race/ethnicity was not associated with the level of depression symptoms or with the burden of depression symptoms.
6.2.1.2 Sex
Female sex was associated with a higher burden of depression symptoms and with mild (but not moderate or severe) depression symptoms, compared to no / minimal symptoms. For example, females had 1.46 times the odds (95% CI: 1.15 - 1.84) of having mild depression symptoms compared to males.
6.2.1.3 Age
Older age was associated with both lower odds of mild, moderate, and moderately severe/severe depression symptoms (compared to no / minimal symptoms), and with a lower burden of depression symptoms (exponentiated coefficient, modeled as a continuous variable, = 0.99, 95% CI: 0.99 - 0.99).
6.2.1.4 Smoking status
Current and former smoking were associated with a higher burden of depression symptoms and with severe depression symptoms (compared to no / minimal symptoms), but not with mild or moderate depression symptoms. Compared to never smokers, current smokers had 2.88 times the odds of having severe depression symptoms (95% CI: 1.45 - 5.73), and former smokers had 2.15 times the odds (95% CI: 1.26 - 3.65).
6.2.1.5 Alcohol status
Alcohol status was not statistically significantly associated with either the burden
of depression symptoms or the level of depression symptoms. Odds ratios were
elevated, but not statistically significant, for the association of heavy alcohol status with
mild and moderate depression symptoms (e.g., for moderate depression symptoms, OR
= 1.84, 95% CI: 0.95 - 3.58).
6.2.1.6 Medical Assistance
Patients with Medical Assistance, compared to those without, had both a higher
burden of depression symptoms and higher odds of mild, moderate, and moderately
severe/severe depression symptoms. For example, having Medical Assistance was
associated with 1.62 times the odds of having mild depression symptoms (95% CI: 1.07
- 2.44) compared to not having Medical Assistance.
6.2.1.7 Body mass index
Higher body mass index, modeled as a continuous variable, was associated with
both a higher burden of depression symptoms and higher odds of mild and moderate
(but not moderately severe/severe) depression symptoms. For example, each additional
point in body mass index was associated with 1.04 times the odds (95% CI: 1.02 - 1.06)
of mild depression.
6.2.1.8 Community socioeconomic deprivation
Community socioeconomic deprivation was not statistically significantly associated with either the burden of depression symptoms or the level of depression symptoms.
6.2.1.9 Well water
Residential well water, compared to municipal water, was not associated with the
burden of depression symptoms or having mild or moderate depression symptoms.
Patients with well water were less likely to have moderately severe / severe depression
symptoms than patients with municipal water (odds ratio = 0.53, 95% CI: 0.32 - 0.87).
Table 6.2.1.1. Exponentiated coefficients from the truncated-weighted negative binomial model.
                                            Exponentiated coefficient [a] (95% CI)
UNGD activity metric [b] (ref: very low)
  Low                                       1.14 (1.01 - 1.29)
  Medium                                    1.03 (0.91 - 1.17)
  High                                      1.18 (1.04 - 1.34)
Race/ethnicity (ref: white)
  Black                                     0.83 (0.65 - 1.06)
  Hispanic                                  0.85 (0.71 - 1.02)
Age
  Centered                                  0.99 (0.99 - 0.99)
  Centered and squared                      0.9999 (0.9997 - 1.000048)
Sex (ref: male)                             1.14 (1.04 - 1.26)
Smoking status (ref: never)
  Current                                   1.30 (1.09 - 1.54)
  Former                                    1.18 (1.06 - 1.31)
Alcohol status (ref: no)
  Yes, not heavy                            1.01 (0.92 - 1.11)
  Yes, heavy [c]                            1.14 (0.95 - 1.37)
Medical Assistance (ref: no)                1.54 (1.32 - 1.81)
Body mass index
  Centered                                  1.02 (1.01 - 1.02)
  Centered and squared                      0.9999 (0.9992 - 1.001)
CSD
  Continuous                                1.01 (0.996 - 1.02)
  Centered and squared                      1.0006 (0.998 - 1.003)
Well water (ref: municipal water)           0.94 (0.86 - 1.04)
Abbreviations: UNGD, unconventional natural gas development; CI, confidence interval; ref, reference group
[a] Truncated-weighted negative binomial model.
[b] The UNGD metric was a composite for four phases of well development (pad preparation, drilling, stimulation, and production) and was assigned for the two weeks prior to follow-up survey return.
[c] Heavy was defined based on the Centers for Disease Control definition of 8 or more drinks per week for females and 15 or more drinks per week for males.
Table 6.2.1.2. Odds ratios from the truncated-weighted multinomial logistic model.
Odds ratio [a] (95% CI) by level of depression symptoms [b]
                                            Mild                        Moderate                    Severe
UNGD activity metric [c] (ref: very low)
  Low                                       1.63 (1.21 - 2.19)          1.22 (0.80 - 1.86)          1.13 (0.61 - 2.06)
  Medium                                    1.25 (0.92 - 1.71)          1.04 (0.68 - 1.60)          0.89 (0.47 - 1.69)
  High                                      1.51 (1.12 - 2.04)          1.26 (0.83 - 1.92)          1.39 (0.76 - 2.54)
Race/ethnicity (ref: white)
  Black                                     0.78 (0.47 - 1.28)          0.56 (0.27 - 1.15)          0.79 (0.3 - 2.09)
  Hispanic                                  0.68 (0.43 - 1.08)          0.66 (0.36 - 1.19)          1.13 (0.58 - 2.2)
Age
  Centered                                  0.99 (0.98 - 0.998)         0.98 (0.97 - 0.99)          0.97 (0.95 - 0.99)
  Centered and squared                      1.0002 (0.9998 - 1.0005)    0.9996 (0.9991 - 1.0002)    0.9995 (0.999 - 1.0005)
Sex (ref: male)                             1.46 (1.15 - 1.84)          0.99 (0.71 - 1.37)          1.41 (0.85 - 2.34)
Smoking status (ref: never)
  Current                                   1.48 (0.97 - 2.24)          1.12 (0.59 - 2.13)          2.88 (1.45 - 5.73)
  Former                                    1.25 (0.97 - 1.60)          1.24 (0.87 - 1.79)          2.15 (1.26 - 3.65)
Alcohol status (ref: no)
  Yes, not heavy                            1.15 (0.91 - 1.44)          1.06 (0.75 - 1.48)          0.69 (0.42 - 1.12)
  Yes, heavy [d]                            1.43 (0.90 - 2.29)          1.84 (0.95 - 3.58)          1.03 (0.47 - 2.26)
Medical Assistance (ref: no)                1.62 (1.07 - 2.44)          2.01 (1.19 - 3.41)          4.14 (2.24 - 7.64)
Body mass index
  Centered                                  1.04 (1.02 - 1.06)          1.05 (1.02 - 1.08)          1.02 (0.99 - 1.06)
  Centered and squared                      0.996 (0.99 - 0.998)        1.001 (0.998 - 1.003)       0.9998 (0.997 - 1.003)
CSD
  Continuous                                1.03 (0.996 - 1.07)         1.01 (0.97 - 1.06)          1.03 (0.967 - 1.09)
  Centered and squared                      1.0004 (0.99 - 1.01)        1.0013 (0.99 - 1.01)        0.9998 (0.99 - 1.01)
Well water (ref: municipal water)           1.21 (0.95 - 1.53)          0.90 (0.64 - 1.27)          0.53 (0.32 - 0.87)
Abbreviations: UNGD, unconventional natural gas development; CI, confidence interval; ref, reference group
[a] Truncated-weighted multinomial logistic model.
[b] No depression symptoms was the base outcome.
[c] The UNGD metric was a composite for four phases of well development (pad preparation, drilling, stimulation, and production) and was assigned for the two weeks prior to follow-up survey return.
[d] Heavy was defined based on the Centers for Disease Control definition of 8 or more drinks per week for females and 15 or more drinks per week for males.
Table 6.2.1.3. Exponentiated coefficients from the fully-weighted negative binomial model.
                                            Exponentiated coefficient [a] (95% CI)
UNGD activity metric [b] (ref: very low)
  Low                                       1.12 (0.94 - 1.34)
  Medium                                    1.07 (0.88 - 1.29)
  High                                      1.29 (1.08 - 1.56)
Race/ethnicity (ref: white)
  Black                                     0.89 (0.69 - 1.15)
  Hispanic                                  0.93 (0.76 - 1.13)
Age
  Centered                                  0.99 (0.99 - 0.99)
  Centered and squared                      0.9999 (0.9997 - 1.0001)
Sex (ref: male)                             1.22 (1.06 - 1.4)
Smoking status (ref: never)
  Current                                   1.21 (0.97 - 1.53)
  Former                                    1.19 (1.02 - 1.39)
Alcohol status (ref: no)
  Yes, not heavy                            1.04 (0.91 - 1.20)
  Yes, heavy [c]                            1.19 (0.93 - 1.51)
Medical Assistance (ref: no)                1.44 (1.15 - 1.81)
Body mass index
  Centered                                  1.01 (1.001 - 1.02)
  Centered and squared                      0.99993 (0.999 - 1.0009)
CSD
  Continuous                                1.02 (1.001 - 1.04)
  Centered and squared                      0.9995 (0.996 - 1.003)
Well water (ref: municipal water)           0.89 (0.78 - 1.03)
Abbreviations: UNGD, unconventional natural gas development; CI, confidence interval; ref, reference group
[a] Fully-weighted negative binomial model.
[b] The UNGD metric was a composite for four phases of well development (pad preparation, drilling, stimulation, and production) and was assigned for the two weeks prior to follow-up survey return.
[c] Heavy was defined based on the Centers for Disease Control definition of 8 or more drinks per week for females and 15 or more drinks per week for males.
Table 6.2.1.4. Odds ratios from the fully-weighted multinomial logistic model.
Odds ratio [a] (95% CI) by level of depression symptoms [b]
                                            Mild                        Moderate                    Severe
UNGD activity metric [c] (ref: very low)
  Low                                       1.72 (1.14 - 2.59)          1.20 (0.67 - 2.14)          0.93 (0.37 - 2.34)
  Medium                                    1.29 (0.84 - 1.98)          1.23 (0.66 - 2.28)          0.68 (0.26 - 1.79)
  High                                      1.95 (1.28 - 2.97)          1.77 (0.98 - 3.20)          1.47 (0.63 - 3.46)
Race/ethnicity (ref: white)
  Black                                     0.75 (0.43 - 1.30)          0.67 (0.32 - 1.42)          0.85 (0.29 - 2.47)
  Hispanic                                  0.71 (0.43 - 1.16)          0.77 (0.42 - 1.44)          1.16 (0.53 - 2.53)
Age
  Centered                                  0.99 (0.98 - 0.999)         0.97 (0.96 - 0.99)          0.96 (0.94 - 0.988)
  Centered and squared                      1.0001 (0.9996 - 1.0006)    0.9996 (0.999 - 1.0004)     0.9997 (0.998 - 1.0012)
Sex (ref: male)                             1.52 (1.09 - 2.11)          1.2 (0.77 - 1.86)           1.65 (0.78 - 3.49)
Smoking status (ref: never)
  Current                                   1.61 (0.89 - 2.92)          0.68 (0.25 - 1.83)          2.01 (0.62 - 6.58)
  Former                                    1.10 (0.78 - 1.56)          1.26 (0.77 - 2.08)          1.92 (0.89 - 4.14)
Alcohol status (ref: no)
  Yes, not heavy                            1.19 (0.85 - 1.65)          0.96 (0.58 - 1.58)          0.79 (0.39 - 1.62)
  Yes, heavy [d]                            1.58 (0.80 - 3.12)          2.09 (0.77 - 5.62)          1.24 (0.37 - 4.23)
Medical Assistance (ref: no)                1.80 (0.998 - 3.25)         1.24 (0.58 - 2.66)          4.00 (1.53 - 10.44)
Body mass index
  Centered                                  1.04 (1.01 - 1.06)          1.05 (1.004 - 1.10)         1.002 (0.95 - 1.06)
  Centered and squared                      0.996 (0.99 - 0.999)        1.00004 (0.997 - 1.003)     0.998 (0.993 - 1.003)
CSD
  Continuous                                1.06 (1.011 - 1.11)         1.04 (0.971 - 1.11)         1.07 (0.981 - 1.16)
  Centered and squared                      0.9983 (0.99 - 1.01)        0.9992 (0.99 - 1.01)        0.99 (0.97 - 1.004)
Well water (ref: municipal water)           1.23 (0.89 - 1.71)          0.65 (0.40 - 1.05)          0.44 (0.20 - 0.99)
Abbreviations: UNGD, unconventional natural gas development; CI, confidence interval; ref, reference group
[a] Fully-weighted multinomial logistic model.
[b] No depression symptoms was the base outcome.
[c] The UNGD metric was a composite for four phases of well development (pad preparation, drilling, stimulation, and production) and was assigned for the two weeks prior to follow-up survey return.
[d] Heavy was defined based on the Centers for Disease Control definition of 8 or more drinks per week for females and 15 or more drinks per week for males.
Table 6.2.1.5. Exponentiated coefficients from the unweighted negative binomial model.
                                            Exponentiated coefficient [a] (95% CI)
UNGD activity metric [b] (ref: very low)
  Low                                       1.05 (0.96 - 1.15)
  Medium                                    0.96 (0.88 - 1.05)
  High                                      1.03 (0.94 - 1.13)
Race/ethnicity (ref: white)
  Black                                     0.81 (0.67 - 0.97)
  Hispanic                                  0.89 (0.76 - 1.05)
Age
  Centered                                  0.99 (0.99 - 0.99)
  Centered and squared                      0.9998 (0.9997 - 0.9999)
Sex (ref: male)                             1.12 (1.05 - 1.20)
Smoking status (ref: never)
  Current                                   1.25 (1.10 - 1.41)
  Former                                    1.14 (1.06 - 1.22)
Alcohol status (ref: no)
  Yes, not heavy                            0.93 (0.87 - 0.999)
  Yes, heavy [c]                            1.09 (0.95 - 1.25)
Medical Assistance (ref: no)                1.65 (1.47 - 1.86)
Body mass index
  Centered                                  1.02 (1.01 - 1.02)
  Centered and squared                      1.00005 (0.9996 - 1.0005)
CSD
  Continuous                                1.01 (0.998 - 1.02)
  Centered and squared                      1.0003 (0.999 - 1.002)
Well water (ref: municipal water)           0.93 (0.86 - 0.99)
Abbreviations: UNGD, unconventional natural gas development; CI, confidence interval; ref, reference group
[a] Unweighted negative binomial model.
[b] The UNGD metric was a composite for four phases of well development (pad preparation, drilling, stimulation, and production) and was assigned for the two weeks prior to follow-up survey return.
[c] Heavy was defined based on the Centers for Disease Control definition of 8 or more drinks per week for females and 15 or more drinks per week for males.
Table 6.2.1.6. Odds ratios from the unweighted multinomial logistic model.
Odds ratio [a] (95% CI) by level of depression symptoms [b]
                                            Mild                         Moderate                    Severe
UNGD activity metric [c] (ref: very low)
  Low                                       1.23 (1.003 - 1.50)          1.04 (0.78 - 1.39)          1.19 (0.82 - 1.74)
  Medium                                    0.996 (0.81 - 1.22)          0.84 (0.63 - 1.13)          0.91 (0.61 - 1.34)
  High                                      1.12 (0.92 - 1.37)           1.06 (0.80 - 1.40)          1.11 (0.76 - 1.61)
Race/ethnicity (ref: white)
  Black                                     0.75 (0.50 - 1.15)           0.64 (0.35 - 1.17)          0.49 (0.23 - 1.08)
  Hispanic                                  0.67 (0.46 - 0.99)           0.76 (0.46 - 1.26)          1.14 (0.67 - 1.94)
Age
  Centered                                  0.99 (0.98 - 0.994)          0.98 (0.98 - 0.991)         0.97 (0.96 - 0.985)
  Centered and squared                      0.99996 (0.9997 - 1.0002)    0.9997 (0.9993 - 1.0001)    0.999 (0.998 - 0.9996)
Sex (ref: male)                             1.27 (1.09 - 1.49)           1.03 (0.83 - 1.28)          1.41 (1.04 - 1.92)
Smoking status (ref: never)
  Current                                   1.51 (1.15 - 1.99)           1.14 (0.78 - 1.68)          2.68 (1.73 - 4.14)
  Former                                    1.14 (0.96 - 1.34)           1.17 (0.92 - 1.49)          1.71 (1.25 - 2.35)
Alcohol status (ref: no)
  Yes, not heavy                            1.01 (0.86 - 1.18)           0.87 (0.69 - 1.08)          0.63 (0.47 - 0.85)
  Yes, heavy [d]                            1.16 (0.85 - 1.59)           1.93 (1.30 - 2.86)          0.78 (0.47 - 1.27)
Medical Assistance (ref: no)                1.71 (1.28 - 2.27)           2.98 (2.14 - 4.15)          5.51 (3.85 - 7.89)
Body mass index
  Centered                                  1.03 (1.02 - 1.05)           1.03 (1.015 - 1.05)         1.041 (1.02 - 1.07)
  Centered and squared                      0.998 (0.997 - 0.9995)       1.00054 (0.999 - 1.002)     1.0001 (0.998 - 1.002)
CSD
  Continuous                                1.04 (1.021 - 1.07)          1.01 (0.978 - 1.04)         1.01 (0.971 - 1.05)
  Centered and squared                      0.99 (0.99 - 1.002)          1.0023 (1 - 1.01)           1.002 (0.99 - 1.009)
Well water (ref: municipal water)           1.14 (0.97 - 1.33)           0.85 (0.68 - 1.07)          0.65 (0.47 - 0.90)
Abbreviations: UNGD, unconventional natural gas development; CI, confidence interval; ref, reference group
[a] Unweighted multinomial logistic model.
[b] No depression symptoms was the base outcome.
[c] The UNGD metric was a composite for four phases of well development (pad preparation, drilling, stimulation, and production) and was assigned for the two weeks prior to follow-up survey return.
[d] Heavy was defined based on the Centers for Disease Control definition of 8 or more drinks per week for females and 15 or more drinks per week for males.
6.3 Comparing asthma patients identified in the electronic health record and by self-report
In the unconventional natural gas development (UNGD) and asthma exacerbation study, we identified asthma patients using visits and medications for asthma documented in the electronic health record (EHR) (Section 3.3.1), using an algorithm developed in a prior study.1 In the UNGD and asthma symptom study (Section 6.4), we used the chronic rhinosinusitis baseline questionnaire, which included a question on doctor-diagnosed asthma. Here, we compared characteristics of patients identified as having asthma using the self-reported doctor-diagnosis question with those identified using the EHR algorithm. Prior studies comparing asthma diagnoses
from medical records and from self-report have found a range of agreements (kappa
coefficients ranged from 0.4 - 0.7) (Table 6.3) and several variables associated with
discordance between asthma diagnosis in medical record and self-report.2-4 However, in
contrast to these prior studies, which used abstraction or audit of medical records, our
EHR asthma classification is based on an algorithm.
Table 6.3. Studies comparing self-reported asthma and asthma in medical record.

Tisnado, Medical Care, 2006
  Patient source: Adult outpatient clinics in California, Washington, and Oregon
  Number of patients: 1270
  Outcome: History of asthma
  Medical record data source: Medical records abstracted by study fieldworkers
  Self-report source: Mailed, self-administered survey
  Agreement: Kappa=0.7, agreement=91%
  Variables associated with discordance: None evaluated

Skinner, J Ambulatory Care Manage, 2006
  Patient source: Elderly male veterans using Veterans Affairs ambulatory care
  Number of patients: 402
  Outcome: Chronic lung disease (defined as chronic bronchitis, emphysema, or asthma)
  Medical record data source: Medical records abstracted by authors
  Self-report source: Self-reported screening questionnaires administered at the time of a clinic visit
  Agreement: Kappa=0.55
  Variables associated with discordance: Age, marital status, and education were not associated with discordance.

Corser, BMC Health Services Research, 2008
  Patient source: Hospitalized acute coronary syndrome patients
  Number of patients: 525
  Outcome: Chronic pulmonary disease/asthma and bronchitis
  Medical record data source: Medical record audits by study fieldworkers of paper or electronic medical records (approximately 50% in each form)
  Self-report source: Telephone interview
  Agreement: Kappa=0.43
  Variables associated with discordance: Asthma in self-report but not in medical record associated with depression. Education and age were not associated.
6.3.1 Methods
The baseline questionnaire, which 7,785 adults residing in Pennsylvania responded to, asked if responders had ever been told by a doctor that they had asthma.
We used this answer to create a variable for self-reported asthma (yes, no, missing).
Using the EHR-based algorithm we used to identify asthma patients in the UNGD and asthma study, we assigned all baseline questionnaire responders a variable for EHR asthma on the date of baseline survey return (yes, no). We compared patients’ classification by self-reported asthma and EHR asthma. We created covariates for Medical Assistance, smoking status, sex, community socioeconomic deprivation (CSD), community type, Charlson index, race/ethnicity, years since first Geisinger encounter, and body mass index (BMI), and we compared these characteristics among patients in the four groups (yes/no self-reported asthma, and yes/no EHR asthma).
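As a rough illustration of the group comparison described above, the following sketch computes a Pearson chi-square statistic for Medical Assistance across the four asthma classification groups, using the counts reported in Table 6.3.2.2. It is a sketch only and not the code used in the analysis: the function name `pearson_chi2` is hypothetical, and instead of an exact p-value the statistic is compared against the standard chi-square critical value for 3 degrees of freedom at the 0.001 level (about 16.27).

```python
def pearson_chi2(table):
    """Return the Pearson chi-square statistic and degrees of freedom
    for an r x c table of counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Medical Assistance counts from Table 6.3.2.2. Columns: EHR and self-reported,
# self-reported but not EHR, EHR but not self-reported, neither.
medical_assistance = [
    [1146, 525, 212, 4828],  # No Medical Assistance
    [302, 86, 31, 460],      # Medical Assistance
]

stat, dof = pearson_chi2(medical_assistance)
CRITICAL_P001_DF3 = 16.27  # chi-square critical value, df = 3, alpha = 0.001
print(f"chi2 = {stat:.1f}, df = {dof}, p < 0.001: {stat > CRITICAL_P001_DF3}")
```

A statistic above the critical value is consistent with the p<0.001 reported for Medical Assistance among all responders.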
6.3.2 Results
Among the 7,785 baseline survey responders, 7,780 had electronic health record data available, and these responders made up the study population. Among the study population, 2,059 (26%) had self-reported asthma and 1,715 (22%) were classified as having asthma using the EHR algorithm (Table 6.3.2.1). Among patients who answered the question on doctor-diagnosed asthma (n=7,590), the kappa statistic between the two asthma classifications was 0.70.
Table 6.3.2.1. Classification by the electronic health record asthma algorithm by self-reported asthma.
                      Self-reported asthma
                      No        Yes       Missing    Total
EHR asthma   No       5,288     611       166        6,065
             Yes      243       1,448     24         1,715
Total                 5,531     2,059     190        7,780
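The kappa statistic can be reproduced from the four non-missing cells of Table 6.3.2.1. The sketch below is a minimal illustration of Cohen's kappa for a square agreement table; the function name `cohen_kappa` is hypothetical and this is not the code used in the original analysis.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of patients on the diagonal.
    observed = sum(table[i][i] for i in range(len(table))) / n
    # Agreement expected by chance from the row and column margins.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Rows: EHR asthma (no, yes); columns: self-reported asthma (no, yes).
# Patients missing the self-report question are excluded (n = 7,590).
agreement = [
    [5288, 611],
    [243, 1448],
]

kappa = cohen_kappa(agreement)
print(f"kappa = {kappa:.2f}")
```

With these counts the result rounds to the reported value of 0.70.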
All covariates evaluated were statistically significantly different among patients with EHR and self-reported asthma, patients with self-reported but not EHR asthma,
patients with EHR but not self-reported asthma, and patients with neither EHR nor self-reported asthma (Table 6.3.2.2). Excluding patients with neither self-reported nor EHR asthma, community type was no longer statistically significantly different among patients with self-reported and/or EHR asthma. Additionally, community socio-economic deprivation and BMI were no longer statistically significantly different among patients with self-reported and/or EHR asthma, though the p-values were marginal (p=0.08 and p=0.06, respectively).
Patients with EHR and self-reported asthma were the most likely of the four groups to be female (p<0.001). Patients with EHR but not self-reported asthma tended to have more years of Geisinger EHR data compared to patients with self-reported but not EHR asthma (p<0.001).
6.3.3 Discussion
The kappa agreement between self-reported asthma and asthma in the EHR was
substantial. However, several characteristics differed across patients with self-reported
asthma, EHR asthma, or both.
Table 6.3.2.2. Characteristics of patients with and without EHR and self-reported asthma.
Columns: Both, EHR and self-reported asthma (n=1,448); Self only, self-reported but not EHR asthma (n=611); EHR only, EHR but not self-reported asthma (n=243); Neither (n=5,288); Total (n=7,590); p (all), p-value from Chi2 test among all responders; p (asthma), p-value from Chi2 test among patients with self-reported and/or EHR asthma. Cells are n (%).
                          Both           Self only     EHR only      Neither        Total          p (all)    p (asthma)
Medical Assistance                                                                                 p<0.001    p<0.001
  No                      1146 (79.1)    525 (85.9)    212 (87.2)    4828 (91.3)    6711 (88.4)
  Yes                     302 (20.9)     86 (14.1)     31 (12.8)     460 (8.7)      879 (11.6)
Smoking status                                                                                     p=0.03     p=0.01
  Never                   833 (57.5)     302 (49.4)    133 (54.7)    2900 (54.8)    4168 (54.9)
  Current                 195 (13.5)     111 (18.2)    37 (15.2)     753 (14.2)     1096 (14.4)
  Former                  420 (29)       198 (32.4)    73 (30)       1635 (30.9)    2326 (30.6)
Female                    1049 (72.4)    389 (63.7)    156 (64.2)    3152 (59.6)    4746 (62.5)    p<0.001    p<0.001
CSD                                                                                                p<0.001    p=0.08
  Q1                      318 (22)       165 (27)      69 (28.4)     1397 (26.4)    1949 (25.7)
  Q2                      345 (23.8)     126 (20.6)    59 (24.3)     1334 (25.2)    1864 (24.6)
  Q3                      372 (25.7)     156 (25.5)    58 (23.9)     1326 (25.1)    1912 (25.2)
  Q4                      413 (28.5)     164 (26.8)    57 (23.5)     1231 (23.3)    1865 (24.6)
Community type                                                                                     p=0.045    p=0.41
  Borough                 413 (28.5)     172 (28.2)    65 (26.7)     1427 (27)      2077 (27.4)
  City                    144 (9.9)      72 (11.8)     19 (7.8)      448 (8.5)      683 (9)
  Township                891 (61.5)     367 (60.1)    159 (65.4)    3413 (64.5)    4830 (63.6)
Charlson index                                                                                     p<0.001    p<0.001
  0                       81 (5.6)       146 (23.9)    20 (8.2)      1466 (27.7)    1713 (22.6)
  1                       124 (8.6)      166 (27.2)    45 (18.5)     1734 (32.8)    2069 (27.3)
  2                       256 (17.7)     133 (21.8)    42 (17.3)     1150 (21.7)    1581 (20.8)
  3+                      987 (68.2)     166 (27.2)    136 (56)      938 (17.7)     2227 (29.3)
Race/ethnicity                                                                                     p<0.001    p=0.01
  White                   1271 (87.8)    538 (88.1)    225 (92.6)    4834 (91.4)    6868 (90.5)
  Black                   71 (4.9)       43 (7)        9 (3.7)       201 (3.8)      324 (4.3)
  Hispanic                106 (7.3)      30 (4.9)      9 (3.7)       253 (4.8)      398 (5.2)
Years in Geisinger                                                                                 p<0.001    p<0.001
  0-4                     61 (4.2)       54 (8.8)      5 (2.1)       280 (5.3)      400 (5.3)
  5-9                     204 (14.1)     126 (20.6)    23 (9.5)      856 (16.2)     1209 (15.9)
  10+                     1183 (81.7)    431 (70.5)    215 (88.5)    4152 (78.5)    5981 (78.8)
BMI                                                                                                p<0.001    p=0.06
  Not overweight/obese    300 (20.7)     139 (22.7)    43 (17.7)     1341 (25.4)    1823 (24.0)
  Overweight              410 (28.3)     200 (32.7)    74 (30.5)     1758 (33.2)    2442 (32.2)
  Obese                   738 (51.0)     272 (44.5)    126 (51.9)    2189 (41.4)    3325 (43.8)
Total                     1448 (100)     611 (100)     243 (100)     5288 (100)     7590 (100)
6.4 Unconventional natural gas development and asthma symptom study
After completing the study of UNGD and clinically-documented asthma exacerbations, we evaluated the association of UNGD with asthma symptoms. The rationale for this study was to evaluate if the association of UNGD with asthma symptoms was similar to the associations that we observed between UNGD and clinically-documented asthma exacerbations. We hypothesized that UNGD would be associated with increased odds of asthma symptoms.
6.4.1 Survey data
Asthma symptom data came from a baseline and follow-up questionnaire sent to adult patients of the Geisinger Clinic in April and October 2014 by the Chronic
Rhinosinusitis Integrative Studies Program.5 The survey design oversampled for patients
with nasal and sinus symptoms and race/ethnic minorities. The baseline survey, mailed
to 23,700 patients, included questions on asthma symptoms (cough, wheeze, chest
tightness, and shortness of breath) in the past three months and a question on doctor
diagnosis of asthma. The follow-up questionnaire was mailed to all responders of the
baseline questionnaire and included questions on asthma symptoms in the past six
months. A total of 7,847 patients responded to the baseline questionnaire (response rate
33.1%) (Figure 6.4.1). All baseline survey responders received the follow-up survey,
and 4,966 patients responded to the follow-up questionnaire. We excluded patients who
lived outside of Pennsylvania, leaving 7,785 patients at baseline and 4,932 at follow-up.
Figure 6.4.1. Chronic Rhinosinusitis Integrative Studies Program survey design. Abbreviation: LTFU, loss to follow-up
6.4.2 Asthma symptom outcomes
The responses for the questions on asthma symptoms were on a five-point scale (“never,” “once in a while,” “some of the time,” “most of the time,” “all of the time”). We created two types of outcomes. First, we evaluated each symptom (cough, wheeze, chest tightness, and shortness of breath) separately. For each symptom, we excluded patients who did not respond to the corresponding question. Second, we created an outcome for the number of asthma symptoms with a “most of the time” or “all of the time” response, which ranged from zero to four. For this outcome, we assumed patients who did not respond to a question did not have that symptom.
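The second outcome can be sketched as a simple count. The example below is an illustration only; the dictionary keys and the function name `count_frequent_symptoms` are hypothetical and do not come from the survey data files. A missing response (here `None` or an absent key) is treated as not having the symptom, matching the assumption stated above.

```python
# Responses that count toward the number-of-symptoms outcome.
FREQUENT = {"most of the time", "all of the time"}

# Hypothetical field names for the four symptom questions.
SYMPTOMS = ("cough", "wheeze", "chest_tightness", "shortness_of_breath")


def count_frequent_symptoms(responses):
    """Count symptoms reported "most of the time" or "all of the time".

    Missing responses (None or absent keys) are treated as not having
    the symptom, so the result always ranges from 0 to 4.
    """
    return sum(1 for symptom in SYMPTOMS if responses.get(symptom) in FREQUENT)


example = {
    "cough": "all of the time",
    "wheeze": "some of the time",
    "chest_tightness": None,  # question left blank
    "shortness_of_breath": "most of the time",
}
n_symptoms = count_frequent_symptoms(example)
print(n_symptoms)
```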
6.4.3 UNGD metrics
We assigned the same UNGD metrics as used in the UNGD and objectively-documented asthma exacerbations study (Section 3.3.6). For the analysis of symptoms at baseline, which asked about symptoms over the past three months, we assigned the metrics from the day before the questionnaire was received to 90 days prior to that. For the analysis of symptoms at follow-up, which asked about symptoms over the past six
months, we assigned the metrics from the day before the questionnaire was received to
180 days prior to that. In a sensitivity analysis, we re-assigned the UNGD metrics at
follow-up with two different exposure windows: from the day before the questionnaire
was received to 14 days prior to that, and from the day before the questionnaire was
received to 365 days prior to that. We summed z-scores of the four phases of
development (pad preparation, spud, stimulation, and production) and quartiled the sum
to create summary metrics.
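The construction of the summary metric (z-score each phase, sum across phases, then quartile the sum) can be sketched as follows. This is a minimal illustration on simulated values, not the study code: the phase values are randomly generated, the function names are hypothetical, and the use of the sample standard deviation and of the default cut points from `statistics.quantiles` are assumptions, since the text does not specify them.

```python
import random
import statistics
from bisect import bisect_right

PHASES = ("pad_preparation", "spud", "stimulation", "production")


def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1 (sample SD assumed)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def summary_quartiles(metrics):
    """metrics maps each phase to a list of values, one per patient.

    Returns the quartile (1 to 4) of the summed z-scores for each patient.
    """
    z_by_phase = [z_scores(metrics[phase]) for phase in PHASES]
    totals = [sum(zs) for zs in zip(*z_by_phase)]
    cuts = statistics.quantiles(totals, n=4)  # three cut points
    return [bisect_right(cuts, total) + 1 for total in totals]


# Simulated, right-skewed phase metrics for 1,000 hypothetical patients.
random.seed(0)
simulated = {
    phase: [random.lognormvariate(0, 1) for _ in range(1000)] for phase in PHASES
}
quartiles = summary_quartiles(simulated)
counts = [quartiles.count(q) for q in (1, 2, 3, 4)]
print(counts)
```

Summing z-scores puts the four phases on a common scale before combining them, so no single phase dominates the summary metric simply because of its units.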
6.4.4 Covariates
From the electronic health record, we created covariates on race/ethnicity (white, black, Hispanic); sex (male, female); use of Medical Assistance, a measure of low socio-economic status (no, yes); age at survey return (years); smoking status at survey return (current, former, never); and family history of asthma (no, yes). From the baseline survey, we created a variable on self-reported doctor diagnosis of asthma (no/missing, yes).
6.4.5 Data analysis
We used multinomial logistic regression to evaluate the association of the UNGD summary metric with the asthma symptom outcomes. Race/ethnicity, sex, use of Medical Assistance, age, smoking status, family history of asthma, and doctor diagnosis of asthma were included as covariates. Age was included as centered and centered-and-squared terms to allow for non-linearity. We evaluated effect modification by asthma diagnosis by including an interaction between the summary UNGD metric and doctor diagnosis of asthma. In a sensitivity analysis, we re-ran the model for the number of asthma symptoms outcome at follow-up with two-week and one-year summary UNGD metrics (described in Section 6.4.3).
We calculated baseline and follow-up weights as shown in Table 6.4.5. We ran each model three times: using unweighted logistic regression, using logistic regression
with the truncated weights (which replaced the largest weight in Table 6.4.5 with the second largest weight), and using logistic regression with the full weighting (as in Table 6.4.5). We ran the three models to examine the trade-off between bias and precision: the full weighting model should be the least biased, but the unweighted model should be the most precise.
Table 6.4.5. Calculation of survey weights at baseline and follow-up (cells are counts unless otherwise specified).
White
  1. Identified using EHR                       13,132     47,892     131,366
  2. Received survey                            12,209     4,224      2,775
     Survey weight (#1 ÷ #2)                    1.08       11.34      47.34
  3. Responded to BL survey                     4,730      1,489      876
     Response weight (#2 ÷ #3)                  2.58       2.84       3.17
     Sample weight, baseline (#1 ÷ #3)          2.78       32.16      149.96
  4. Responded to FU6mo                         3,095      955        554
     Response weight fu6mo (#3 ÷ #4)            1.53       1.56       1.58
     Sample weight, follow-up (#1 ÷ #4)         4.24       50.15      237.12
Black
  1. Identified using EHR                       170        991        2,832
  2. Received survey                            159        903        1,109
     Survey weight (#1 ÷ #2)                    1.07       1.10       2.55
  3. Responded to BL survey                     35         147        160
     Response weight (#2 ÷ #3)                  4.54       6.14       6.93
     Sample weight, baseline (#1 ÷ #3)          4.86       6.74       17.70
  4. Responded to FU6mo                         17         69         65
     Response weight fu6mo (#3 ÷ #4)            2.06       2.13       2.46
     Sample weight, follow-up (#1 ÷ #4)         10.00      14.36      43.57
Hispanic
  1. Identified using EHR                       192        1,035      3,159
  2. Received survey                            181        966        1,174
     Survey weight (#1 ÷ #2)                    1.06       1.07       2.69
  3. Responded to BL survey                     35         207        168
     Response weight (#2 ÷ #3)                  5.17       4.67       6.99
     Sample weight, baseline (#1 ÷ #3)          5.49       5.00       18.80
  4. Responded to FU6mo                         19         93         99
     Response weight fu6mo (#3 ÷ #4)            1.84       2.23       1.70
     Sample weight, follow-up (#1 ÷ #4)         10.11      11.13      31.91
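The baseline sample weights in Table 6.4.5 and the truncation step described in Section 6.4.5 can be sketched as below. This is an illustration, not the study code: the function name `truncate_largest` is hypothetical, the three columns of each block are referred to only by position because their labels are not given in the table, and applying the truncation across the nine baseline sample weights is an assumption about how the rule was applied.

```python
# (identified using EHR, responded to baseline survey) for each column of
# Table 6.4.5, keyed by race/ethnicity and column position (1 to 3).
strata = {
    ("White", 1): (13132, 4730),
    ("White", 2): (47892, 1489),
    ("White", 3): (131366, 876),
    ("Black", 1): (170, 35),
    ("Black", 2): (991, 147),
    ("Black", 3): (2832, 160),
    ("Hispanic", 1): (192, 35),
    ("Hispanic", 2): (1035, 207),
    ("Hispanic", 3): (3159, 168),
}

# Sample weight at baseline = identified / responded (#1 / #3 in the table).
full_weights = {key: ident / resp for key, (ident, resp) in strata.items()}


def truncate_largest(weights):
    """Replace the largest weight with the second-largest weight."""
    ordered = sorted(weights.values())
    largest, second_largest = ordered[-1], ordered[-2]
    return {k: (second_largest if w == largest else w) for k, w in weights.items()}


truncated_weights = truncate_largest(full_weights)

for key in strata:
    print(key, round(full_weights[key], 2), round(truncated_weights[key], 2))
```

Truncation caps the influence of the few responders who carry the largest weight, which trades a small amount of bias for narrower confidence intervals.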
6.4.6 Results
Each outcome had missing data, ranging from 1.5% for cough to 2.2% for shortness of breath at baseline, and from 2.3% for cough to 3.2% for shortness of breath at follow-up (Table 6.4.6.1). The percentage of survey responders classified with the
symptom “most of the time” or “all of the time” ranged from 5% for chest tightness to 14% for cough at baseline, and from 4% for chest tightness to 13% for cough at follow-up.
Table 6.4.6.1. Asthma symptoms and missingness at baseline and follow-up.
                      Baseline survey responders (n=7,785)                    Follow-up survey responders (n=4,932)
                      Cough    Wheeze   Chest       Shortness                 Cough    Wheeze   Chest       Shortness
                                        tightness   of breath                                   tightness   of breath
Never                 1,309    3,860    4,143       3,367                     908      2,594    2,813       2,331
Once in a while       3,144    1,926    2,050       2,321                     2,051    1,229    1,243       1,436
Some of the time      2,153    1,282    1,067       1,281                     1,216    668      563         684
Most of the time      791      423      291         474                       479      214      135         237
All of the time       268      161      83          170                       167      96       38          86
Missing               120      133      151         172                       111      131      140         158
In the analysis of baseline symptoms, among patients with asthma, there were
no associations of UNGD with wheeze or shortness of breath symptoms. The fourth
quartile of the summary UNGD metric (compared to the first quartile) was associated
with increased odds of cough symptoms “once in a while” and “all of the time” in
unweighted models only. The second and third quartiles of the summary UNGD metric
(compared to the first quartile) were associated with decreased odds of chest tightness
symptoms “once in a while” in truncated weighted models only.
Among patients without asthma, the fourth quartile of the summary UNGD metric
(compared to the first quartile) was associated with increased odds of each severity of
shortness of breath symptoms in unweighted models, but associations were not
consistent in truncated or full weighted models at baseline (Table 6.4.6.2). The fourth quartile of the summary UNGD metric (compared to the first quartile) was associated with increased odds of chest tightness symptoms “once in a while” in an unweighted
model only, and was associated with increased odds of cough symptoms “all of the time” in a full weighted model only. The fourth quartile of the summary UNGD metric
(compared to the first quartile) was associated with increased odds of wheeze symptoms
“some of the time” and “most of the time” in unweighted models, and wheeze symptoms
“all of the time” in a full weighted model.
Table 6.4.6.2. Association of UNGD metrics with number of asthma symptoms at baseline among patients with and without asthma from adjusted multinomial models. Having the symptom “never” was the reference outcome. In each cell, the odds ratio and 95% confidence interval were reported from unweighted logistic regression, survey logistic regression with truncated weights, and survey logistic regression with full weights. Associations statistically significant at p=0.05 are bolded. Abbreviations: OR, odds ratio; CI, confidence interval

Comparison | Model | Cough | Wheeze | Chest tightness | Shortness of breath

Association among asthma patients with the symptom “once in a while” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 1.14 (0.76 - 1.71) | 0.83 (0.60 - 1.16) | 0.89 (0.66 - 1.21) | 0.80 (0.57 - 1.11)
Q2 vs. Q1 | Truncated weights | 1.11 (0.60 - 2.04) | 0.62 (0.38 - 1.03) | 0.59 (0.38 - 0.94) | 0.73 (0.44 - 1.20)
Q2 vs. Q1 | Full weights | 1.20 (0.51 - 2.82) | 0.72 (0.35 - 1.45) | 0.66 (0.35 - 1.23) | 0.997 (0.49 - 2.04)
Q3 vs. Q1 | Unweighted | 1.28 (0.85 - 1.95) | 0.84 (0.60 - 1.16) | 0.80 (0.59 - 1.08) | 0.87 (0.63 - 1.21)
Q3 vs. Q1 | Truncated weights | 1.32 (0.72 - 2.42) | 0.59 (0.36 - 0.98) | 0.56 (0.35 - 0.89) | 0.71 (0.43 - 1.17)
Q3 vs. Q1 | Full weights | 1.42 (0.60 - 3.32) | 0.71 (0.35 - 1.44) | 0.62 (0.32 - 1.18) | 1.04 (0.50 - 2.14)
Q4 vs. Q1 | Unweighted | 1.63 (1.03 - 2.57) | 1.26 (0.87 - 1.82) | 0.96 (0.69 - 1.33) | 1.02 (0.72 - 1.46)
Q4 vs. Q1 | Truncated weights | 1.51 (0.76 - 3.02) | 1.14 (0.65 - 2.01) | 0.77 (0.47 - 1.27) | 1.13 (0.65 - 1.97)
Q4 vs. Q1 | Full weights | 2.24 (0.83 - 6.10) | 1.31 (0.58 - 2.95) | 0.78 (0.39 - 1.58) | 1.28 (0.58 - 2.80)

Association among asthma patients with the symptom “some of the time” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 1.22 (0.81 - 1.84) | 1.05 (0.75 - 1.46) | 0.96 (0.69 - 1.32) | 0.97 (0.69 - 1.35)
Q2 vs. Q1 | Truncated weights | 1.59 (0.86 - 2.94) | 0.87 (0.52 - 1.46) | 0.86 (0.82 - 1.42) | 0.83 (0.50 - 1.37)
Q2 vs. Q1 | Full weights | 2.06 (0.86 - 4.96) | 1.11 (0.54 - 2.28) | 0.89 (0.45 - 1.74) | 0.94 (0.49 - 1.82)
Q3 vs. Q1 | Unweighted | 1.29 (0.85 - 1.96) | 1.001 (0.72 - 1.40) | 0.86 (0.62 - 1.19) | 0.94 (0.67 - 1.32)
Q3 vs. Q1 | Truncated weights | 1.11 (0.59 - 2.06) | 1.001 (0.59 - 1.69) | 0.86 (0.52 - 1.42) | 0.77 (0.46 - 1.30)
Q3 vs. Q1 | Full weights | 1.37 (0.56 - 3.34) | 1.30 (0.62 - 2.73) | 0.83 (0.42 - 1.63) | 0.96 (0.48 - 1.91)
Q4 vs. Q1 | Unweighted | 1.37 (0.86 - 2.18) | 1.70 (1.17 - 2.46) | 1.19 (0.85 - 1.67) | 1.22 (0.84 - 1.76)
Q4 vs. Q1 | Truncated weights | 1.57 (0.78 - 3.18) | 1.33 (0.74 - 2.38) | 0.95 (0.55 - 1.63) | 1.10 (0.63 - 1.94)
Q4 vs. Q1 | Full weights | 2.34 (0.85 - 6.43) | 1.63 (0.70 - 3.77) | 1.24 (0.56 - 2.73) | 1.12 (0.52 - 2.41)

Association among asthma patients with the symptom “most of the time” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 1.08 (0.68 - 1.74) | 0.88 (0.56 - 1.36) | 0.80 (0.48 - 1.34) | 0.83 (0.53 - 1.30)
Q2 vs. Q1 | Truncated weights | 0.85 (0.41 - 1.76) | 0.61 (0.31 - 1.19) | 0.58 (0.26 - 1.29) | 0.63 (0.31 - 1.26)
Q2 vs. Q1 | Full weights | 0.97 (0.35 - 2.73) | 0.53 (0.22 - 1.28) | 0.49 (0.18 - 1.38) | 0.53 (0.22 - 1.30)
Q3 vs. Q1 | Unweighted | 1.19 (0.74 - 1.93) | 0.73 (0.46 - 1.14) | 0.83 (0.50 - 1.37) | 0.82 (0.52 - 1.28)
Q3 vs. Q1 | Truncated weights | 1.06 (0.52 - 2.18) | 0.55 (0.27 - 1.10) | 0.80 (0.35 - 1.77) | 0.94 (0.46 - 1.90)
Q3 vs. Q1 | Full weights | 0.86 (0.34 - 2.22) | 0.47 (0.19 - 1.14) | 0.63 (0.23 - 1.74) | 0.98 (0.36 - 2.68)
Q4 vs. Q1 | Unweighted | 1.63 (0.98 - 2.72) | 1.49 (0.94 - 2.38) | 1.23 (0.74 - 2.04) | 1.06 (0.66 - 1.71)
Q4 vs. Q1 | Truncated weights | 1.39 (0.63 - 3.07) | 1.10 (0.54 - 2.27) | 1.31 (0.58 - 2.96) | 0.74 (0.35 - 1.56)
Q4 vs. Q1 | Full weights | 1.31 (0.46 - 3.74) | 1.08 (0.38 - 3.08) | 1.06 (0.37 - 3.00) | 0.51 (0.20 - 1.31)

Association among asthma patients with the symptom “all of the time” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 1.24 (0.66 - 2.35) | 0.85 (0.47 - 1.54) | 1.26 (0.54 - 2.92) | 1.11 (0.59 - 2.08)
Q2 vs. Q1 | Truncated weights | 0.86 (0.31 - 2.36) | 0.58 (0.22 - 1.49) | 0.80 (0.22 - 2.86) | 0.60 (0.22 - 1.62)
Q2 vs. Q1 | Full weights | 0.74 (0.20 - 2.67) | 0.71 (0.24 - 2.11) | 0.89 (0.22 - 3.61) | 0.74 (0.24 - 2.28)
Q3 vs. Q1 | Unweighted | 1.69 (0.91 - 3.16) | 0.64 (0.34 - 1.19) | 0.64 (0.24 - 1.67) | 0.64 (0.32 - 1.29)
Q3 vs. Q1 | Truncated weights | 1.10 (0.42 - 2.89) | 0.39 (0.15 - 1.02) | 0.45 (0.11 - 1.79) | 0.54 (0.19 - 1.53)
Q3 vs. Q1 | Full weights | 0.82 (0.24 - 2.82) | 0.47 (0.16 - 1.44) | 0.55 (0.12 - 2.48) | 0.72 (0.22 - 2.32)
Q4 vs. Q1 | Unweighted | 2.33 (1.22 - 4.47) | 1.35 (0.73 - 2.51) | 1.28 (0.53 - 3.09) | 1.56 (0.82 - 2.98)
Q4 vs. Q1 | Truncated weights | 2.00 (0.74 - 5.43) | 1.19 (0.59 - 3.72) | 0.80 (0.23 - 2.78) | 1.45 (0.54 - 3.89)
Q4 vs. Q1 | Full weights | 1.74 (0.47 - 6.54) | 2.85 (0.88 - 9.27) | 1.58 (0.38 - 6.59) | 3.41 (0.98 - 11.87)

Association among patients without asthma with the symptom “once in a while” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 0.89 (0.72 - 1.09) | 0.98 (0.82 - 1.18) | 1.10 (0.92 - 1.31) | 1.07 (0.90 - 1.28)
Q2 vs. Q1 | Truncated weights | 0.91 (0.70 - 1.20) | 1.13 (0.87 - 1.47) | 1.13 (0.88 - 1.46) | 0.998 (0.78 - 1.27)
Q2 vs. Q1 | Full weights | 0.86 (0.60 - 1.23) | 1.14 (0.80 - 1.63) | 0.95 (0.67 - 1.34) | 0.999 (0.72 - 1.39)
Q3 vs. Q1 | Unweighted | 0.89 (0.73 - 1.10) | 1.10 (0.92 - 1.32) | 1.14 (0.95 - 1.36) | 1.16 (0.97 - 1.37)
Q3 vs. Q1 | Truncated weights | 0.79 (0.60 - 1.04) | 1.22 (0.94 - 1.59) | 1.22 (0.94 - 1.57) | 1.11 (0.87 - 1.41)
Q3 vs. Q1 | Full weights | 0.82 (0.58 - 1.17) | 1.18 (0.83 - 1.68) | 1.07 (0.76 - 1.50) | 1.04 (0.75 - 1.44)
Q4 vs. Q1 | Unweighted | 0.91 (0.74 - 1.11) | 1.12 (0.94 - 1.35) | 1.21 (1.02 - 1.44) | 1.22 (1.03 - 1.44)
Q4 vs. Q1 | Truncated weights | 0.87 (0.66 - 1.14) | 1.10 (0.85 - 1.44) | 1.18 (0.92 - 1.52) | 1.20 (0.94 - 1.53)
Q4 vs. Q1 | Full weights | 0.77 (0.54 - 1.10) | 0.89 (0.62 - 1.28) | 0.84 (0.59 - 1.18) | 1.15 (0.83 - 1.60)

Association among patients without asthma with the symptom “some of the time” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 1.02 (0.81 - 1.27) | 1.13 (0.88 - 1.44) | 1.02 (0.78 - 1.33) | 1.25 (0.99 - 1.59)
Q2 vs. Q1 | Truncated weights | 1.11 (0.81 - 1.51) | 1.06 (0.74 - 1.52) | 1.19 (0.80 - 1.77) | 1.47 (1.02 - 2.10)
Q2 vs. Q1 | Full weights | 1.27 (0.84 - 1.93) | 1.06 (0.64 - 1.73) | 1.75 (1.01 - 3.03) | 1.77 (1.07 - 2.90)
Q3 vs. Q1 | Unweighted | 0.95 (0.76 - 1.18) | 1.20 (0.94 - 1.54) | 1.05 (0.81 - 1.38) | 1.10 (0.86 - 1.41)
Q3 vs. Q1 | Truncated weights | 1.03 (0.76 - 1.41) | 1.13 (0.79 - 1.61) | 0.35 (0.91 - 1.99) | 1.43 (0.997 - 2.04)
Q3 vs. Q1 | Full weights | 1.23 (0.81 - 1.85) | 1.11 (0.68 - 1.80) | 1.53 (0.88 - 2.64) | 1.71 (1.01 - 2.80)
Q4 vs. Q1 | Unweighted | 1.03 (0.83 - 1.29) | 1.51 (1.19 - 1.90) | 1.35 (1.05 - 1.74) | 1.36 (1.07 - 1.72)
Q4 vs. Q1 | Truncated weights | 1.16 (0.85 - 1.58) | 1.36 (0.96 - 1.91) | 1.50 (1.03 - 2.19) | 1.66 (1.16 - 2.36)
Q4 vs. Q1 | Full weights | 1.31 (0.86 - 1.98) | 1.17 (0.73 - 1.88) | 2.08 (1.23 - 3.53) | 1.90 (1.18 - 3.18)

Association among patients without asthma with the symptom “most of the time” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 1.05 (0.76 - 1.43) | 1.29 (0.83 - 2.00) | 1.32 (0.80 - 2.19) | 1.14 (0.76 - 1.69)
Q2 vs. Q1 | Truncated weights | 1.11 (0.81 - 1.51) | 0.85 (0.43 - 1.68) | 0.82 (0.38 - 1.74) | 0.997 (0.56 - 1.79)
Q2 vs. Q1 | Full weights | 1.25 (0.63 - 2.47) | 1.43 (0.55 - 3.73) | 0.64 (0.22 - 1.86) | 1.40 (0.63 - 3.11)
Q3 vs. Q1 | Unweighted | 1.01 (0.74 - 1.39) | 1.19 (0.76 - 1.87) | 1.23 (0.74 - 2.04) | 1.27 (0.86 - 1.87)
Q3 vs. Q1 | Truncated weights | 1.03 (0.76 - 1.41) | 0.99 (0.51 - 1.95) | 0.80 (0.38 - 1.72) | 0.95 (0.54 - 1.69)
Q3 vs. Q1 | Full weights | 1.56 (0.83 - 2.95) | 1.50 (0.60 - 3.78) | 0.95 (0.32 - 2.78) | 1.17 (0.52 - 2.62)
Q4 vs. Q1 | Unweighted | 1.26 (0.93 - 1.70) | 1.54 (1.004 - 2.35) | 1.20 (0.72 - 1.99) | 1.47 (1.01 - 2.14)
Q4 vs. Q1 | Truncated weights | 1.16 (0.85 - 1.58) | 1.39 (0.74 - 2.61) | 0.87 (0.41 - 1.84) | 1.42 (0.83 - 2.45)
Q4 vs. Q1 | Full weights | 1.76 (0.94 - 3.30) | 2.41 (1.01 - 5.79) | 0.80 (0.27 - 2.41) | 2.31 (1.09 - 4.91)

Association among patients without asthma with the symptom “all of the time” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 1.25 (0.74 - 2.13) | 0.96 (0.45 - 2.05) | 0.68 (0.24 - 1.94) | 1.14 (0.55 - 2.39)
Q2 vs. Q1 | Truncated weights | 1.39 (0.64 - 3.02) | 0.91 (0.27 - 3.10) | 0.23 (0.60 - 0.85) | 1.11 (0.37 - 3.31)
Q2 vs. Q1 | Full weights | 1.69 (0.58 - 4.95) | 1.89 (0.41 - 8.75) | 0.23 (0.06 - 0.89) | 0.66 (0.15 - 2.95)
Q3 vs. Q1 | Unweighted | 1.08 (0.63 - 1.85) | 0.63 (0.27 - 1.48) | 0.87 (0.33 - 2.28) | 1.58 (0.80 - 3.12)
Q3 vs. Q1 | Truncated weights | 1.15 (0.53 - 2.53) | 0.43 (0.11 - 1.75) | 0.95 (0.22 - 4.16) | 1.31 (0.46 - 3.76)
Q3 vs. Q1 | Full weights | 1.47 (0.50 - 4.38) | 1.32 (0.20 - 8.58) | 0.99 (0.22 - 4.51) | 0.95 (0.23 - 3.99)
Q4 vs. Q1 | Unweighted | 1.59 (0.96 - 2.64) | 1.55 (0.79 - 3.04) | 1.54 (0.67 - 3.54) | 1.95 (1.02 - 3.76)
Q4 vs. Q1 | Truncated weights | 1.65 (0.77 - 3.50) | 2.21 (0.80 - 6.08) | 1.49 (0.43 - 5.19) | 2.75 (1.07 - 7.08)
Q4 vs. Q1 | Full weights | 3.58 (1.30 - 9.85) | 3.68 (1.12 - 12.07) | 2.71 (0.52 - 14.04) | 1.77 (0.53 - 5.90)
In the analysis of symptoms at follow-up, among patients with asthma, UNGD was not associated with chest tightness symptoms (Table 6.4.6.3). UNGD quartile four
(vs. quartile one) was associated with increased odds of cough symptoms only “most of the time” and only in an unweighted model among patients with asthma. Among patients with asthma, UNGD quartile two (vs. quartile one) was associated with increased odds of wheeze symptoms only “all of the time” and only in an unweighted model, and was associated with increased odds of shortness of breath symptoms only “most of the time” and only in an unweighted model.
Among patients without asthma, UNGD was not associated with cough, wheeze or chest tightness symptoms at follow-up. UNGD quartile three (vs. quartile one) was
associated with shortness of breath “once in a while” in truncated and full weighted
models, and quartile four (vs. quartile one) was associated with increased odds of
shortness of breath symptoms “once in a while” in an unweighted model.
Table 6.4.6.3. Association of UNGD metrics with number of asthma symptoms at follow-up among patients with and without asthma from adjusted multinomial logistic models. Having the symptom “never” was the reference outcome. In each cell, the odds ratio and 95% confidence interval were reported from unweighted logistic regression, survey logistic regression with truncated weights, and survey logistic regression with full weights. Associations statistically significant at p=0.05 are bolded. Abbreviations: OR, odds ratio; CI, confidence interval

Comparison | Model | Cough | Wheeze | Chest tightness | Shortness of breath

Association among asthma patients with the symptom “once in a while” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 1.17 (0.71 - 1.93) | 1.28 (0.86 - 1.91) | 1.03 (0.71 - 1.49) | 1.28 (0.86 - 1.92)
Q2 vs. Q1 | Truncated weights | 1.39 (0.66 - 2.91) | 0.95 (0.52 - 1.75) | 1.05 (0.60 - 1.85) | 1.36 (0.74 - 2.51)
Q2 vs. Q1 | Full weights | 0.96 (0.35 - 2.65) | 0.95 (0.40 - 2.24) | 0.98 (0.46 - 2.11) | 0.96 (0.41 - 2.23)
Q3 vs. Q1 | Unweighted | 1.22 (0.74 - 2.00) | 0.90 (0.61 - 1.34) | 0.99 (0.68 - 1.44) | 1.23 (0.82 - 1.83)
Q3 vs. Q1 | Truncated weights | 0.88 (0.44 - 1.79) | 0.75 (0.41 - 1.38) | 0.74 (0.42 - 1.30) | 0.99 (0.54 - 1.83)
Q3 vs. Q1 | Full weights | 0.76 (0.29 - 2.01) | 0.97 (0.40 - 2.33) | 0.70 (0.32 - 1.56) | 0.58 (0.25 - 1.36)
Q4 vs. Q1 | Unweighted | 1.003 (0.59 - 1.72) | 1.24 (0.81 - 1.91) | 1.05 (0.71 - 1.55) | 1.25 (0.81 - 1.91)
Q4 vs. Q1 | Truncated weights | 1.004 (0.45 - 2.25) | 0.67 (0.34 - 1.32) | 0.86 (0.48 - 1.56) | 1.09 (0.57 - 2.09)
Q4 vs. Q1 | Full weights | 1.002 (0.34 - 2.99) | 0.46 (0.19 - 1.14) | 0.71 (0.32 - 1.53) | 1.09 (0.45 - 2.68)

Association among asthma patients with the symptom “some of the time” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 1.56 (0.93 - 2.61) | 1.15 (0.74 - 1.80) | 1.05 (0.68 - 1.61) | 1.14 (0.73 - 1.76)
Q2 vs. Q1 | Truncated weights | 1.53 (0.70 - 3.34) | 0.88 (0.44 - 1.76) | 0.97 (0.49 - 1.92) | 1.05 (0.52 - 2.10)
Q2 vs. Q1 | Full weights | 1.31 (0.45 - 3.87) | 0.72 (0.28 - 1.84) | 0.99 (0.44 - 2.25) | 0.79 (0.33 - 1.88)
Q3 vs. Q1 | Unweighted | 1.26 (0.75 - 2.13) | 0.86 (0.55 - 1.33) | 0.98 (0.64 - 1.52) | 1.08 (0.70 - 1.67)
Q3 vs. Q1 | Truncated weights | 0.92 (0.43 - 1.99) | 0.78 (0.39 - 1.58) | 0.81 (0.41 - 1.61) | 0.99 (0.50 - 1.97)
Q3 vs. Q1 | Full weights | 0.99 (0.35 - 2.82) | 0.59 (0.23 - 1.49) | 0.95 (0.39 - 2.32) | 0.88 (0.34 - 2.29)
Q4 vs. Q1 | Unweighted | 1.63 (0.94 - 2.83) | 1.47 (0.93 - 2.33) | 1.09 (0.69 - 1.71) | 1.13 (0.71 - 1.80)
Q4 vs. Q1 | Truncated weights | 1.75 (0.76 - 4.01) | 1.06 (0.52 - 2.15) | 0.83 (0.41 - 1.7) | 1.21 (0.59 - 2.49)
Q4 vs. Q1 | Full weights | 2.78 (0.86 - 9.00) | 0.85 (0.31 - 2.35) | 1.08 (0.42 - 2.8) | 1.34 (0.51 - 3.48)

Association among asthma patients with the symptom “most of the time” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 1.47 (0.79 - 2.74) | 1.50 (0.82 - 2.74) | 1.33 (0.65 - 2.76) | 2.29 (1.26 - 4.14)
Q2 vs. Q1 | Truncated weights | 1.62 (0.63 - 4.21) | 0.84 (0.32 - 2.17) | 1.78 (0.56 - 5.60) | 1.65 (0.66 - 4.11)
Q2 vs. Q1 | Full weights | 1.05 (0.30 - 3.65) | 0.43 (0.13 - 1.48) | 0.93 (0.2 - 4.41) | 0.85 (0.24 - 2.97)
Q3 vs. Q1 | Unweighted | 1.72 (0.94 - 3.16) | 1.17 (0.64 - 2.12) | 1.65 (0.82 - 3.32) | 1.09 (0.56 - 2.1)
Q3 vs. Q1 | Truncated weights | 1.15 (0.46 - 2.87) | 0.43 (0.16 - 1.12) | 0.73 (0.25 - 2.19) | 0.55 (0.2 - 1.49)
Q3 vs. Q1 | Full weights | 0.78 (0.24 - 2.55) | 0.22 (0.06 - 0.76) | 0.38 (0.09 - 1.63) | 0.25 (0.07 - 0.90)
Q4 vs. Q1 | Unweighted | 2.24 (1.19 - 4.19) | 1.60 (0.85 - 3.01) | 1.07 (0.49 - 2.32) | 1.66 (0.87 - 3.16)
Q4 vs. Q1 | Truncated weights | 2.10 (0.80 - 5.50) | 0.94 (0.35 - 2.55) | 1.11 (0.33 - 3.78) | 0.95 (0.35 - 2.59)
Q4 vs. Q1 | Full weights | 1.92 (0.54 - 6.78) | 0.36 (0.09 - 1.43) | 0.58 (0.12 - 2.74) | 0.62 (0.17 - 2.27)

Association among asthma patients with the symptom “all of the time” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 2.02 (0.89 - 4.56) | 2.57 (1.15 - 5.76) | 2.37 (0.76 - 7.40) | 1.84 (0.75 - 4.52)
Q2 vs. Q1 | Truncated weights | 1.03 (0.30 - 3.52) | 1.9 (0.53 - 6.82) | 1.03 (0.17 - 6.24) | 1.65 (0.40 - 6.91)
Q2 vs. Q1 | Full weights | 0.43 (0.09 - 2.09) | 1.81 (0.45 - 7.25) | 1.10 (0.18 - 6.67) | 1.35 (0.30 - 6.00)
Q3 vs. Q1 | Unweighted | 1.62 (0.70 - 3.72) | 1.36 (0.58 - 3.17) | 0.89 (0.23 - 3.51) | 1.27 (0.50 - 3.21)
Q3 vs. Q1 | Truncated weights | 0.61 (0.18 - 2.12) | 0.85 (0.22 - 3.25) | 0.83 (0.14 - 4.76) | 0.77 (0.18 - 3.22)
Q3 vs. Q1 | Full weights | 0.53 (0.08 - 3.39) | 0.89 (0.21 - 3.71) | 0.78 (0.13 - 4.75) | 0.51 (0.11 - 2.34)
Q4 vs. Q1 | Unweighted | 1.62 (0.66 - 3.95) | 1.51 (0.59 - 3.85) | 0.96 (0.24 - 3.82) | 1.86 (0.74 - 4.67)
Q4 vs. Q1 | Truncated weights | 0.39 (0.12 - 1.3) | 1.04 (0.26 - 4.25) | 0.16 (0.03 - 0.93) | 0.62 (0.17 - 2.34)
Q4 vs. Q1 | Full weights | 0.23 (0.05 - 1.07) | 0.89 (0.20 - 3.92) | 0.2 (0.03 - 1.18) | 0.68 (0.17 - 2.73)

Association among patients without asthma with the symptom “once in a while” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 0.84 (0.66 - 1.07) | 0.97 (0.77 - 1.22) | 1.22 (0.97 - 1.54) | 1.03 (0.83 - 1.28)
Q2 vs. Q1 | Truncated weights | 0.95 (0.68 - 1.31) | 0.93 (0.66 - 1.29) | 1.38 (0.99 - 1.93) | 1.18 (0.86 - 1.61)
Q2 vs. Q1 | Full weights | 0.82 (0.53 - 1.28) | 0.89 (0.55 - 1.42) | 1.27 (0.80 - 2.02) | 1.31 (0.85 - 2.02)
Q3 vs. Q1 | Unweighted | 0.9 (0.70 - 1.15) | 1.08 (0.85 - 1.35) | 1.14 (0.90 - 1.43) | 1.12 (0.90 - 1.39)
Q3 vs. Q1 | Truncated weights | 1.01 (0.73 - 1.41) | 1.08 (0.77 - 1.50) | 1.39 (0.99 - 1.95) | 1.37 (1.01 - 1.87)
Q3 vs. Q1 | Full weights | 1.07 (0.70 - 1.66) | 1.15 (0.73 - 1.82) | 1.38 (0.87 - 2.18) | 1.69 (1.11 - 2.57)
Q4 vs. Q1 | Unweighted | 1.15 (0.90 - 1.48) | 1.03 (0.82 - 1.30) | 1.23 (0.98 - 1.54) | 1.24 (1.003 - 1.54)
Q4 vs. Q1 | Truncated weights | 1.20 (0.86 - 1.68) | 1.01 (0.73 - 1.42) | 1.43 (1.03 - 2.00) | 1.21 (0.89 - 1.65)
Q4 vs. Q1 | Full weights | 1.23 (0.79 - 1.92) | 1.06 (0.67 - 1.66) | 1.47 (0.93 - 2.33) | 1.44 (0.94 - 2.21)

Association among patients without asthma with the symptom “some of the time” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 0.96 (0.73 - 1.27) | 1.18 (0.86 - 1.63) | 1.09 (0.77 - 1.56) | 1.07 (0.77 - 1.48)
Q2 vs. Q1 | Truncated weights | 1.07 (0.73 - 1.57) | 1.07 (0.67 - 1.70) | 0.91 (0.54 - 1.54) | 1.11 (0.69 - 1.79)
Q2 vs. Q1 | Full weights | 0.95 (0.57 - 1.60) | 1.22 (0.63 - 2.35) | 0.83 (0.40 - 1.71) | 1.43 (0.75 - 2.76)
Q3 vs. Q1 | Unweighted | 1.05 (0.80 - 1.40) | 1.05 (0.76 - 1.47) | 1.13 (0.80 - 1.62) | 1.18 (0.86 - 1.63)
Q3 vs. Q1 | Truncated weights | 1.05 (0.71 - 1.56) | 1.07 (0.66 - 1.71) | 1.16 (0.70 - 1.93) | 1.08 (0.67 - 1.73)
Q3 vs. Q1 | Full weights | 1.01 (0.60 - 1.70) | 1.17 (0.60 - 2.27) | 1.50 (0.75 - 2.98) | 1.30 (0.67 - 2.54)
Q4 vs. Q1 | Unweighted | 1.26 (0.95 - 1.68) | 1.28 (0.94 - 1.76) | 1.33 (0.94 - 1.86) | 1.30 (0.95 - 1.78)
Q4 vs. Q1 | Truncated weights | 1.10 (0.73 - 1.64) | 1.01 (0.64 - 1.61) | 1.10 (0.66 - 1.82) | 1.16 (0.73 - 1.85)
Q4 vs. Q1 | Full weights | 1.05 (0.61 - 1.81) | 1.10 (0.57 - 2.12) | 0.84 (0.41 - 1.69) | 1.17 (0.60 - 2.26)

Association among patients without asthma with the symptom “most of the time” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 0.95 (0.65 - 1.39) | 1.23 (0.65 - 2.30) | 1.10 (0.48 - 2.53) | 1.18 (0.65 - 2.14)
Q2 vs. Q1 | Truncated weights | 1.001 (0.57 - 1.76) | 0.86 (0.33 - 2.26) | 0.96 (0.26 - 3.48) | 1.13 (0.48 - 2.66)
Q2 vs. Q1 | Full weights | 1.22 (0.56 - 2.65) | 0.63 (0.21 - 1.84) | 0.45 (0.09 - 2.23) | 0.78 (0.25 - 2.39)
Q3 vs. Q1 | Unweighted | 0.83 (0.56 - 1.23) | 1.17 (0.62 - 2.22) | 1.56 (0.72 - 3.35) | 1.72 (0.99 - 2.99)
Q3 vs. Q1 | Truncated weights | 0.85 (0.47 - 1.52) | 1.43 (0.57 - 3.58) | 1.81 (0.53 - 6.18) | 1.62 (0.71 - 3.70)
Q3 vs. Q1 | Full weights | 0.91 (0.41 - 2.02) | 1.32 (0.39 - 4.45) | 0.83 (0.17 - 4.1) | 1.56 (0.50 - 4.86)
Q4 vs. Q1 | Unweighted | 1.18 (0.81 - 1.73) | 1.50 (0.83 - 2.73) | 1.47 (0.68 - 3.16) | 1.43 (0.81 - 2.53)
Q4 vs. Q1 | Truncated weights | 1.24 (0.71 - 2.17) | 1.45 (0.60 - 3.50) | 1.47 (0.43 - 4.97) | 1.38 (0.60 - 3.17)
Q4 vs. Q1 | Full weights | 2.05 (0.95 - 4.40) | 2.16 (0.68 - 6.90) | 0.66 (0.14 - 3.04) | 1.05 (0.35 - 3.14)

Association among patients without asthma with the symptom “all of the time” (OR, 95% CI)
Q2 vs. Q1 | Unweighted | 0.62 (0.32 - 1.18) | 0.62 (0.22 - 1.76) | 1.86 (0.34 - 10.29) | 0.76 (0.31 - 1.87)
Q2 vs. Q1 | Truncated weights | 0.76 (0.30 - 1.90) | 0.75 (0.17 - 3.31) | 2.46 (0.26 - 23.04) | 0.51 (0.14 - 1.83)
Q2 vs. Q1 | Full weights | 0.81 (0.23 - 2.86) | 1.85 (0.3 - 11.30) | 2.25 (0.24 - 21.38) | 0.48 (0.13 - 1.81)
Q3 vs. Q1 | Unweighted | 0.86 (0.47 - 1.58) | 1.04 (0.42 - 2.59) | 1.18 (0.19 - 7.21) | 0.33 (0.10 - 1.05)
Q3 vs. Q1 | Truncated weights | 0.64 (0.25 - 1.62) | 1.19 (0.31 - 4.64) | 1.18 (0.09 - 15.18) | 0.42 (0.09 - 1.88)
Q3 vs. Q1 | Full weights | 0.99 (0.26 - 3.68) | 2.28 (0.43 - 11.99) | 5.27 (0.35 - 80.09) | 1.92 (0.38 - 9.63)
Q4 vs. Q1 | Unweighted | 1.50 (0.86 - 2.63) | 1.30 (0.55 - 3.04) | 2.33 (0.46 - 11.74) | 1.25 (0.57 - 2.74)
Q4 vs. Q1 | Truncated weights | 1.75 (0.80 - 3.84) | 0.62 (0.16 - 2.41) | 1.27 (0.14 - 11.39) | 0.72 (0.24 - 2.19)
Q4 vs. Q1 | Full weights | 2.64 (0.90 - 7.69) | 0.59 (0.15 - 2.33) | 1.14 (0.12 - 10.65) | 1.24 (0.33 - 4.65)
Among patients with asthma, the fourth quartile (vs. the first quartile) of the summary UNGD metric was associated with decreased odds of having four asthma symptoms at baseline (vs. no symptoms) in the full weighted model (Table 6.4.6.4). The fourth quartile (vs. the first quartile) of the summary UNGD metric was associated with increased odds of having two asthma symptoms at baseline (vs. no symptoms) in unweighted and full weighted models among patients without asthma, but not with one, three, or four symptoms. At follow-up, the summary UNGD metric was associated with one and two symptoms (vs. zero) among patients with asthma in the unweighted model only, but was not associated with number of asthma symptoms among patients without asthma (Table 6.4.6.5). In the sensitivity analysis where we re-ran the number of asthma symptoms at follow-up model with a UNGD metric assigned over the two weeks before survey return, there was no association between UNGD and number of asthma symptoms. When we re-ran the number of asthma symptoms at follow-up model with a
UNGD metric assigned over the year before survey return, the fourth quartile of the
UNGD metric was associated with having one asthma symptom in patients without asthma. There were no associations among patients with asthma, or patients without asthma with two, three, or four symptoms (Table 6.4.6.6).
Table 6.4.6.4. Association of UNGD metrics with number of asthma symptoms (multinomial) at baseline among patients with and without asthma from adjusted logistic models. Zero symptoms was the reference outcome. In each cell, the odds ratio and 95% confidence interval were reported from unweighted logistic regression, survey logistic regression with truncated weights, and survey logistic regression with full weights. Associations statistically significant at p=0.05 are bolded. Abbreviations: OR, odds ratio; CI, confidence interval

Outcome: Number of asthma symptoms
Comparison | Unweighted | Truncated weights | Full weights

Association among asthma patients with 1 symptom (OR, 95% CI)
Q2 vs. Q1 | 0.95 (0.69 - 1.31) | 0.95 (0.57 - 1.60) | 0.89 (0.44 - 1.79)
Q3 vs. Q1 | 0.94 (0.68 - 1.30) | 0.88 (0.52 - 1.49) | 0.79 (0.39 - 1.62)
Q4 vs. Q1 | 1.14 (0.82 - 1.60) | 0.81 (0.47 - 1.40) | 0.54 (0.28 - 1.05)

Association among asthma patients with 2 symptoms (OR, 95% CI)
Q2 vs. Q1 | 1.06 (0.67 - 1.67) | 0.82 (0.39 - 1.72) | 0.66 (0.28 - 1.55)
Q3 vs. Q1 | 0.63 (0.38 - 1.05) | 0.62 (0.26 - 1.44) | 0.45 (0.17 - 1.16)
Q4 vs. Q1 | 1.14 (0.71 - 1.83) | 1.15 (0.54 - 2.44) | 1.44 (0.51 - 4.04)

Association among asthma patients with 3 symptoms (OR, 95% CI)
Q2 vs. Q1 | 0.98 (0.54 - 1.78) | 0.76 (0.30 - 1.88) | 0.77 (0.29 - 2.03)
Q3 vs. Q1 | 0.86 (0.47 - 1.57) | 1.0004 (0.38 - 2.66) | 0.99 (0.35 - 2.76)
Q4 vs. Q1 | 1.59 (0.90 - 2.80) | 2.20 (0.94 - 5.15) | 2.37 (0.93 - 6.06)

Association among asthma patients with 4 symptoms (OR, 95% CI)
Q2 vs. Q1 | 0.79 (0.41 - 1.54) | 0.44 (0.15 - 1.26) | 0.33 (0.10 - 1.12)
Q3 vs. Q1 | 1.004 (0.54 - 1.88) | 0.78 (0.31 - 1.95) | 0.55 (0.19 - 1.62)
Q4 vs. Q1 | 0.91 (0.46 - 1.80) | 0.42 (0.14 - 1.23) | 0.27 (0.08 - 0.92)

Association among patients without asthma with 1 symptom (OR, 95% CI)
Q2 vs. Q1 | 1.03 (0.81 - 1.30) | 1.006 (0.70 - 1.45) | 1.16 (0.69 - 1.96)
Q3 vs. Q1 | 0.92 (0.72 - 1.17) | 1.12 (0.78 - 1.61) | 1.19 (0.72 - 1.98)
Q4 vs. Q1 | 1.17 (0.93 - 1.47) | 1.33 (0.93 - 1.89) | 1.40 (0.85 - 2.31)

Association among patients without asthma with 2 symptoms (OR, 95% CI)
Q2 vs. Q1 | 1.45 (0.87 - 2.44) | 1.30 (0.64 - 2.62) | 1.31 (0.50 - 3.47)
Q3 vs. Q1 | 1.84 (1.12 - 3.01) | 0.97 (0.47 - 1.99) | 1.44 (0.52 - 3.98)
Q4 vs. Q1 | 1.98 (1.12 - 3.21) | 1.88 (0.98 - 3.61) | 2.98 (1.20 - 7.43)

Association among patients without asthma with 3 symptoms (OR, 95% CI)
Q2 vs. Q1 | 1.34 (0.64 - 2.77) | 0.56 (0.18 - 1.70) | 0.61 (0.20 - 1.89)
Q3 vs. Q1 | 1.33 (0.65 - 2.73) | 1.37 (0.49 - 3.80) | 1.44 (0.52 - 4.06)
Q4 vs. Q1 | 1.19 (0.58 - 2.48) | 1.45 (0.52 - 4.08) | 4.61 (1.39 - 15.25)

Association among patients without asthma with 4 symptoms (OR, 95% CI)
Q2 vs. Q1 | 1.01 (0.42 - 2.44) | 0.59 (0.14 - 2.48) | 1.77 (0.26 - 12.21)
Q3 vs. Q1 | 0.85 (0.34 - 2.11) | 0.80 (0.19 - 3.35) | 1.98 (0.34 - 11.52)
Q4 vs. Q1 | 1.76 (0.82 - 3.78) | 1.42 (0.42 - 4.78) | 1.55 (0.46 - 5.19)
Table 6.4.6.5. Association of UNGD metrics with number of asthma symptoms (multinomial) at follow-up among patients with and without asthma from adjusted logistic models. Zero symptoms was the reference outcome. In each cell, the odds ratio and 95% confidence interval were reported from unweighted logistic regression, survey logistic regression with truncated weights, and survey logistic regression with full weights. Associations statistically significant at p=0.05 are bolded. Abbreviations: OR, odds ratio; CI, confidence interval

Outcome: Number of asthma symptoms
Comparison | Unweighted | Truncated weights | Full weights

Association among asthma patients with 1 symptom (OR, 95% CI)
Q2 vs. Q1 | 1.39 (0.90 - 2.14) | 0.93 (0.46 - 1.89) | 0.80 (0.35 - 1.86)
Q3 vs. Q1 | 1.40 (0.91 - 2.15) | 1.18 (0.59 - 2.36) | 1.09 (0.42 - 2.82)
Q4 vs. Q1 | 1.92 (1.24 - 2.95) | 1.53 (0.77 - 3.07) | 1.24 (0.54 - 2.85)

Association among asthma patients with 2 symptoms (OR, 95% CI)
Q2 vs. Q1 | 3.40 (1.60 - 7.23) | 2.57 (0.77 - 8.57) | 1.18 (0.25 - 5.67)
Q3 vs. Q1 | 2.25 (1.02 - 4.97) | 0.66 (0.21 - 2.01) | 0.27 (0.06 - 1.24)
Q4 vs. Q1 | 2.56 (1.14 - 5.77) | 2.05 (0.57 - 7.42) | 0.85 (0.15 - 4.74)

Association among asthma patients with 3 symptoms (OR, 95% CI)
Q2 vs. Q1 | 1.58 (0.72 - 3.50) | 2.05 (0.56 - 7.48) | 1.97 (0.50 - 7.72)
Q3 vs. Q1 | 1.27 (0.56 - 2.87) | 0.71 (0.19 - 2.72) | 0.68 (0.17 - 2.75)
Q4 vs. Q1 | 1.46 (0.63 - 3.37) | 1.16 (0.29 - 4.67) | 1.14 (0.27 - 4.90)

Association among asthma patients with 4 symptoms (OR, 95% CI)
Q2 vs. Q1 | 1.38 (0.58 - 3.30) | 0.79 (0.22 - 2.86) | 0.38 (0.08 - 1.84)
Q3 vs. Q1 | 1.17 (0.48 - 2.84) | 0.64 (0.18 - 2.30) | 0.30 (0.61 - 1.46)
Q4 vs. Q1 | 1.03 (0.39 - 2.71) | 0.43 (0.11 - 1.74) | 0.22 (0.05 - 1.09)

Association among patients without asthma with 1 symptom (OR, 95% CI)
Q2 vs. Q1 | 1.14 (0.83 - 1.56) | 1.17 (0.73 - 1.89) | 1.32 (0.67 - 2.58)
Q3 vs. Q1 | 1.001 (0.72 - 1.38) | 1.19 (0.74 - 1.93) | 1.41 (0.72 - 2.76)
Q4 vs. Q1 | 1.35 (0.99 - 1.83) | 1.49 (0.94 - 2.35) | 1.80 (0.94 - 2.46)

Association among patients without asthma with 2 symptoms (OR, 95% CI)
Q2 vs. Q1 | 0.71 (0.38 - 1.35) | 0.60 (0.24 - 1.48) | 0.66 (0.19 - 2.28)
Q3 vs. Q1 | 1.01 (0.57 - 1.80) | 0.50 (0.21 - 1.24) | 0.52 (0.13 - 2.08)
Q4 vs. Q1 | 0.95 (0.54 - 1.70) | 0.94 (0.43 - 2.09) | 1.33 (0.46 - 3.82)

Association among patients without asthma with 3 symptoms (OR, 95% CI)
Q2 vs. Q1 | 0.93 (0.39 - 2.67) | 0.83 (0.22 - 3.22) | 0.84 (0.21 - 3.32)
Q3 vs. Q1 | 1.07 (0.46 - 2.51) | 1.49 (0.41 - 5.42) | 2.55 (0.54 - 12.01)
Q4 vs. Q1 | 0.91 (0.38 - 2.16) | 0.99 (0.25 - 3.84) | 2.04 (0.37 - 11.18)

Association among patients without asthma with 4 symptoms (OR, 95% CI)
Q2 vs. Q1 | 1.50 (0.36 - 6.36) | 1.25 (0.13 - 12.30) | 1.12 (0.11 - 10.92)
Q3 vs. Q1 | 1.09 (0.24 - 4.93) | 0.61 (0.08 - 4.82) | 0.57 (0.07 - 4.41)
Q4 vs. Q1 | 2.58 (0.70 - 9.52) | 1.52 (0.20 - 11.36) | 1.31 (0.17 - 9.97)
Table 6.4.6.6. Association of UNGD metrics assigned over prior six months, two weeks, or one year and number of asthma symptoms (multinomial) at follow-up among patients with and without asthma from adjusted models. Zero symptoms is the reference outcome. In each cell, the odds ratio and 95% confidence interval were reported from survey logistic regression with truncated weights. Associations statistically significant at p=0.05 are bolded. Abbreviations: OR, odds ratio; CI, confidence interval

Comparison | UNGD assigned over past six months | UNGD assigned over past two weeks | UNGD assigned over past one year

Association among asthma patients with 1 symptom (OR, 95% CI)
Q2 vs. Q1 | 0.93 (0.46 - 1.89) | 0.70 (0.34 - 1.42) | 1.40 (0.69 - 2.85)
Q3 vs. Q1 | 1.18 (0.59 - 2.36) | 0.86 (0.43 - 1.72) | 1.37 (0.67 - 2.80)
Q4 vs. Q1 | 1.53 (0.77 - 3.07) | 1.28 (0.65 - 2.51) | 1.80 (0.88 - 3.68)

Association among asthma patients with 2 symptoms (OR, 95% CI)
Q2 vs. Q1 | 2.57 (0.77 - 8.57) | 0.87 (0.27 - 2.79) | 2.60 (0.75 - 9.03)
Q3 vs. Q1 | 0.66 (0.21 - 2.01) | 1.35 (0.45 - 4.11) | 0.66 (0.21 - 2.11)
Q4 vs. Q1 | 2.05 (0.57 - 7.42) | 1.09 (0.33 - 3.67) | 2.18 (0.61 - 7.84)

Association among asthma patients with 3 symptoms (OR, 95% CI)
Q2 vs. Q1 | 2.05 (0.56 - 7.48) | 0.96 (0.26 - 3.45) | 2.07 (0.57 - 7.50)
Q3 vs. Q1 | 0.71 (0.19 - 2.72) | 0.7 (0.21 - 2.3) | 0.69 (0.17 - 2.84)
Q4 vs. Q1 | 1.16 (0.29 - 4.67) | 0.78 (0.2 - 2.97) | 1.23 (0.32 - 4.77)

Association among asthma patients with 4 symptoms (OR, 95% CI)
Q2 vs. Q1 | 0.79 (0.22 - 2.86) | 0.63 (0.18 - 2.18) | 1.49 (0.41 - 5.46)
Q3 vs. Q1 | 0.64 (0.18 - 2.30) | 0.63 (0.17 - 2.32) | 0.54 (0.12 - 2.39)
Q4 vs. Q1 | 0.43 (0.11 - 1.74) | 0.38 (0.09 - 1.53) | 1.01 (0.24 - 4.29)

Association among patients without asthma with 1 symptom (OR, 95% CI)
Q2 vs. Q1 | 1.17 (0.73 - 1.89) | 0.91 (0.55 - 1.49) | 1.21 (0.75 - 1.96)
Q3 vs. Q1 | 1.19 (0.74 - 1.93) | 1.34 (0.84 - 2.13) | 1.28 (0.79 - 2.08)
Q4 vs. Q1 | 1.49 (0.94 - 2.35) | 1.44 (0.91 - 2.28) | 1.64 (1.03 - 2.60)

Association among patients without asthma with 2 symptoms (OR, 95% CI)
Q2 vs. Q1 | 0.60 (0.24 - 1.48) | 1.19 (0.52 - 2.71) | 0.75 (0.31 - 1.8)
Q3 vs. Q1 | 0.50 (0.21 - 1.24) | 0.31 (0.11 - 0.87) | 0.49 (0.19 - 1.26)
Q4 vs. Q1 | 0.94 (0.43 - 2.09) | 1.14 (0.5 - 2.6) | 1.05 (0.47 - 2.33)

Association among patients without asthma with 3 symptoms (OR, 95% CI)
Q2 vs. Q1 | 0.83 (0.22 - 3.22) | 1.20 (0.32 - 4.52) | 0.84 (0.23 - 3.11)
Q3 vs. Q1 | 1.49 (0.41 - 5.42) | 1.41 (0.36 - 5.46) | 1.48 (0.41 - 5.28)
Q4 vs. Q1 | 0.99 (0.25 - 3.84) | 1.46 (0.39 - 5.48) | 0.96 (0.25 - 3.76)

Association among patients without asthma with 4 symptoms (OR, 95% CI)
Q2 vs. Q1 | 1.25 (0.13 - 12.30) | 1.10 (0.10 - 11.65) | 1.56 (0.15 - 16.31)
Q3 vs. Q1 | 0.61 (0.08 - 4.82) | 0.75 (0.10 - 5.40) | 0.68 (0.08 - 5.86)
Q4 vs. Q1 | 1.52 (0.20 - 11.36) | 1.40 (0.18 - 10.88) | 1.71 (0.21 - 14.23)
6.4.7 Discussion
We conducted a study of the association of UNGD and self-reported asthma symptoms using survey data. In contrast to our study of UNGD and objective asthma
exacerbations, where we observed consistent associations between UNGD (assigned on the day prior to the exacerbation) and asthma exacerbations among patients with asthma, here we found inconsistent, but generally null, associations between UNGD and asthma symptoms at baseline and follow-up over three- and six-month windows. At
baseline, the associations that were statistically significant were primarily among
patients without asthma. These results were somewhat unexpected, but there are
several possible explanations, as discussed below.
The differences in the associations at baseline and follow-up may be due to
seasonality. The baseline surveys were returned between March and October 2014, with
a median date of April 22, 2014. The follow-up surveys were returned between
November 2014 and May 2015, with a median date of November 12, 2014. Asthma
exacerbations tend to peak in the fall, especially for children, but also for adults, which
has been attributed to children returning to school and to respiratory viruses.6,7 Seasonal
trends in asthma exacerbations could have obscured an association between UNGD
and asthma symptoms. An additional potential problem is that the questions on asthma
symptoms were not validated for asthma exacerbations, and thus the two analyses
(exacerbations vs. symptoms) examined two very different outcomes.8 The most likely explanation is that the questionnaire asked about symptoms over too long a time frame
(three or six months). In the exacerbation study, UNGD activity on the day prior to the exacerbation event was strongly associated with exacerbations, but in the symptom study, UNGD activity and symptoms were assessed over much longer time periods. Finally, it is possible that symptoms were too common and non-specific to discriminate asthma exacerbations. These are potential reasons we may not have observed associations between UNGD and asthma symptoms in these data even though we previously observed associations between UNGD and clinically documented asthma exacerbations.
6.5 Greenness and asthma exacerbation study
There is increasing interest in the association between greenness and health outcomes. Potential pathways for greenness to affect health include physical activity, air quality, stress reduction, and increased social contact.9
Greenness can be measured in a number of ways, including normalized difference vegetation index (NDVI), which is a measure of greenness from satellite data,9 and Light
Detection and Ranging (LiDAR), which is a laser measurement.10 Several studies have
evaluated greenness and asthma prevalence or exacerbations (defined as
hospitalizations) (Table 6.5.1 and 6.5.2). Studies of NDVI and asthma prevalence have
found conflicting results: one study did not find an association of greenness (measured
using NDVI) and asthma prevalence,11 two studies found increased odds of asthma
prevalence with increasing greenness,12,13 and one study found decreased odds of
asthma prevalence with increasing greenness.14 Both studies that evaluated
associations between NDVI and asthma hospitalizations used an ecological design. Of
these, one found a significant negative correlation between increasing NDVI and asthma
hospitalization rates in the spring only,15 and the other found no significant association.16
In this analysis, we evaluated the association of NDVI with asthma exacerbations, and whether NDVI was a confounder of the association between UNGD and asthma exacerbations (Figure 6.5.1). NDVI could be a confounder because NDVI is associated with UNGD, and NDVI may also be associated with asthma exacerbations. While our primary hypothesis was that NDVI could be a confounder of the UNGD-asthma exacerbation relationship, NDVI could also mediate the association of UNGD and asthma exacerbations: for example, UNGD could alter the landscape and result in less greenness.
Figure 6.5.1. Directed acyclic graph of UNGD, NDVI, and asthma exacerbations.
a: NDVI as a mediator of the association of UNGD and asthma exacerbations
b: NDVI as a confounder of the association of UNGD and asthma exacerbations
c: Ayres-Sampaio 2014 observed a negative association between increasing NDVI and asthma hospitalization rates.
d: Rasmussen 2016 observed a positive association between increasing UNGD and asthma exacerbations.
Table 6.5.1. Studies on greenness and asthma prevalence.

Author | Location | Population | Design | Outcome | Exposure | Results
Dadvand 2014 | Sabadell, Spain | 3,178 9-12 year olds | Cross-sectional | Asthma diagnosis | NDVI buffers: 100m; 250m; 500m; 1,000m | OR (95% CI): 100m: 1.00 (0.82, 1.21); 250m: 1.00 (0.78, 1.27); 500m: 1.03 (0.79, 1.34); 1,000m: 1.06 (0.85, 1.32)
Lovasi 2013 | New York City | 492 5 year olds; 427 7 year olds | Longitudinal (but analyzed cross sectionally) | Asthma diagnosis | LiDAR and multi-spectral imagery at a 0.25 km radius | OR (95% CI): Age 5: 1.11 (0.85, 1.45); Age 7: 1.17 (1.02, 1.33)
Sbihi 2015 | British Columbia, Canada | 65,000 children | Longitudinal (but analyzed cross sectionally) | Incident asthma during preschool-age (0–5 years old) | NDVI | OR per IQR of NDVI (95% CI): 0.96 (0.93, 0.99)
Andrusaityte 2016 | Kaunas, Lithuania | 112 children with asthma and 1377 children without | Cross-sectional case-control | Asthma diagnosis | NDVI in buffers of 100, 300 and 500 m | OR per IQR of NDVI (95% CI): 100m buffer: 1.43 (1.10, 1.85); 300m buffer: 1.23 (0.94, 1.61); 500m buffer: 1.18 (0.88, 1.57)

Abbreviations: OR, odds ratio; CI, confidence interval
Table 6.5.2. Studies on greenness and asthma exacerbations. Abbreviations: OR, odds ratio; CI, confidence interval; RR, relative risk

Author | Location | Population | Design | Outcome | Exposure | Results
Ayres-Sampaio 2014 | Portugal | All people | Ecologic | Asthma hospitalization rates by municipality | Average NDVI in municipality | Pearson correlations: Low urban cover municipalities: no statistically significant correlation; Moderate urban cover municipalities: statistically significant negative correlation (-0.257) in spring only; High urban cover municipality: statistically significant negative correlation in all seasons (ranged from -0.50 to -0.38)
Erdman 2015 | Northeastern United States | People aged 65 years and older | Ecologic | Asthma hospitalization rates by ZIP code | Most frequent NDVI value in zip code | RR, 95% CI: 0.96 (0.86 - 1.06)
6.5.1 Study population and covariates
We included all events from the UNGD and asthma exacerbation study (Chapter
3). Briefly, we identified 35,508 asthma patients from the Geisinger Clinic population.
Among these asthma patients, we identified 20,749 mild, 1,870 moderate, and 4,782 severe asthma exacerbations, and frequency-matched these on age, sex, and year of event to 14,104, 9,350, and 18,693 control index dates, respectively. For each event,
we created covariates for age, sex, race/ethnicity, season of event, smoking status,
overweight and obesity, Medical Assistance, type 2 diabetes, community socioeconomic
deprivation, distance to nearest major and minor arterial road, and maximum
temperature on the day prior to event (Section 3.3.4).
6.5.2 Greenness measurement and exposure assignment
We used NDVI to measure greenness. NDVI data was from the National
Aeronautics and Space Administration’s Moderate Resolution Imaging
Spectroradiometer (MODIS). MODIS has global coverage with a 250 meter resolution and provides an NDVI measurement every 16 days. For this study, NDVI values were generalized to a 5 image by 5 image grid in ArcGIS, resulting in a grid of 1250 meters by
1250 meters. For each year of data (2006-12), we assigned all asthma patients’ events the peak NDVI value for the year of the event for the grid cell that contained the geocoded home address. NDVI data were only available for 2006-12, so for events in 2005 we used the NDVI values from 2006. Continuous NDVI values were highly correlated within each patient over the years 2006-2012 (correlations ranged from 0.91 to 0.97).
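The exposure assignment described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ArcGIS workflow used in the study: the function names (`ndvi`, `coarsen`, `annual_peak`) are hypothetical, the 5 by 5 generalization is assumed to be a block mean (the text does not state the aggregation rule), coarsening is assumed to precede taking the annual peak, and the input is synthetic. Only the NDVI formula itself, (NIR - Red) / (NIR + Red), is the standard definition.

```python
import numpy as np

def ndvi(nir, red):
    """Standard NDVI definition from near-infrared and red reflectance."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)

def coarsen(grid, factor=5):
    """Aggregate a 2-D grid into factor x factor blocks.

    Using the block mean is an assumption; the source only says the
    250 m values were "generalized" to a 1250 m grid.
    """
    rows, cols = grid.shape
    r, c = rows // factor, cols // factor
    blocks = grid[: r * factor, : c * factor].reshape(r, factor, c, factor)
    return blocks.mean(axis=(1, 3))

def annual_peak(stack):
    """Peak NDVI per grid cell across one year of composites (axis 0 = time)."""
    return np.max(stack, axis=0)

# Synthetic example: 23 sixteen-day composites on a 10 x 10 grid of 250 m cells.
rng = np.random.default_rng(0)
composites = rng.uniform(0.1, 0.9, size=(23, 10, 10))
coarse = np.stack([coarsen(c) for c in composites])  # 2 x 2 grid of 1250 m cells
peak = annual_peak(coarse)                           # one peak value per coarse cell
```

Each event would then take the `peak` value of the coarse cell containing the geocoded home address for the event year. If the study instead took the peak at 250 m resolution and then generalized, the two steps would simply be swapped.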
6.5.3 Statistical analysis
For each asthma exacerbation outcome, we ran a multilevel logistic model with a random intercept for patient (to account for multiple events per patient) and community
(using a mixed definition of place: townships, boroughs, and census tracts in cities). The
models included NDVI (quartiled), the spud UNGD metric, sex, race/ethnicity, season of
event, smoking status, overweight and obesity, Medical Assistance, type 2 diabetes, community socioeconomic deprivation, distance to nearest major and minor arterial road, and maximum temperature on the day prior to event as covariates. Continuous covariates (community socioeconomic deprivation, distance to nearest major and minor arterial road, and maximum temperature on the day prior to event) were included with linear and quadratic terms.
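The covariate coding described above (a quartiled exposure with the first quartile as reference, and continuous covariates entered with linear and quadratic terms) can be sketched as follows. This is an illustrative sketch only: the helper names and the synthetic data are hypothetical, the quadratic term is assumed to be the square of the z-transformed value, and fitting the multilevel logistic model itself is not shown.

```python
import numpy as np

def quartile_indicators(x):
    """Return an (n, 3) matrix of 0/1 indicators for Q2, Q3, Q4 (Q1 is the reference)."""
    x = np.asarray(x, dtype=float)
    cuts = np.quantile(x, [0.25, 0.50, 0.75])
    quartile = np.searchsorted(cuts, x, side="left")  # 0 = Q1, ..., 3 = Q4
    return np.column_stack([(quartile == k).astype(float) for k in (1, 2, 3)])

def linear_and_quadratic(x, truncate_pct=None):
    """Optionally truncate at an upper percentile, z-transform, and add a squared term."""
    x = np.asarray(x, dtype=float)
    if truncate_pct is not None:
        x = np.minimum(x, np.percentile(x, truncate_pct))
    z = (x - x.mean()) / x.std()
    return np.column_stack([z, z ** 2])

# Synthetic data standing in for peak NDVI and distance to the nearest road.
rng = np.random.default_rng(1)
ndvi_peak = rng.uniform(0.3, 0.9, size=200)
road_distance_m = rng.exponential(500.0, size=200)

X = np.column_stack([
    np.ones(200),                                            # intercept
    quartile_indicators(ndvi_peak),                          # NDVI Q2, Q3, Q4 vs. Q1
    linear_and_quadratic(road_distance_m, truncate_pct=98),  # linear and squared distance
])
```

Coding the exposure as quartile indicators avoids assuming a linear dose-response, while the squared terms allow curvature in the continuous covariates.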
6.5.4 Results
There was no association between NDVI and asthma exacerbations in the severe or mild asthma exacerbation analysis. In the moderate asthma exacerbation analysis, the third quartile of NDVI was associated with lower odds of case status (odds ratio = 0.48, 95% confidence interval [0.27-0.86]), and the fourth quartile of NDVI had similar odds (odds ratio = 0.53, 95% confidence interval [0.27-1.03]) but did not reach statistical significance (p = 0.06). For all three outcomes, the odds ratios for UNGD remained largely unchanged compared to models that did not include NDVI (Table 3.4.2).
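As a worked check on how an odds ratio, its confidence interval, and a p-value relate, the sketch below recovers an approximate standard error and two-sided p-value from the moderate-exacerbation NDVI estimates in Table 6.5.4, 0.48 (0.27 - 0.86) and 0.53 (0.27 - 1.03). It assumes a Wald-type interval that is symmetric on the log-odds scale; the function names are hypothetical, and the inputs are the rounded table values, so the results are approximate.

```python
import math

def wald_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Recover the log-odds SE, z statistic, and two-sided p-value from an OR and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail probability
    return se, z, p

def ci_midpoint(lower, upper):
    """Geometric mean of the CI bounds; for a Wald interval this is close to the OR."""
    return math.sqrt(lower * upper)

se_q3, z_q3, p_q3 = wald_from_ci(0.48, 0.27, 0.86)  # third vs. first NDVI quartile
se_q4, z_q4, p_q4 = wald_from_ci(0.53, 0.27, 1.03)  # fourth vs. first NDVI quartile
```

With these rounded inputs the fourth-quartile p-value comes out at roughly 0.06, and the geometric mean of each pair of bounds lands within rounding error of the tabulated odds ratio. The same midpoint check is a quick way to spot transcription errors in a confidence interval.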
Table 6.5.4. Association between UNGD, NDVI, and asthma exacerbations(a)

                             Severe               Moderate             Mild
UNGD odds ratios (95% CI)
  Q2 vs. Q1                  1.16 (0.98 - 1.37)   1.57 (1.09 - 2.27)   1.44 (1.28 - 1.62)
  Q3 vs. Q1                  1.26 (1.05 - 1.50)   1.55 (1.05 - 2.29)   1.98 (1.75 - 2.24)
  Q4 vs. Q1                  1.64 (1.37 - 1.96)   1.56 (1.07 - 2.27)   2.00 (1.75 - 2.27)
NDVI odds ratios (95% CI)
  Q2 vs. Q1                  0.86 (0.70 - 1.06)   0.80 (0.50 - 1.27)   1.08 (0.96 - 1.22)
  Q3 vs. Q1                  1.02 (0.80 - 1.30)   0.48 (0.27 - 0.86)   1.06 (0.93 - 1.22)
  Q4 vs. Q1                  0.92 (0.70 - 1.21)   0.53 (0.27 - 1.03)   1.12 (0.96 - 1.31)

Abbreviations: UNGD, unconventional natural gas development; NDVI, Normalized Difference Vegetation Index; CI, confidence interval
(a) Multilevel models with a random intercept for patient and community, adjusted for age category (5-12, 13-18, 19-44, 45-61, 62-74, 75+ years), sex (male, female), race/ethnicity (white, black, Hispanic, other), family history of asthma (yes vs. no), smoking status (never, former, current, missing), season (spring, March 22-June 21; summer, June 22-September 21; fall, September 22-December 21; winter, December 22-March 21), Medical Assistance (yes vs. no), overweight/obesity (normal, body mass index [BMI] < 85th percentile or BMI < 25 kg/m2; overweight, BMI = 85th-<95th percentile or BMI = 25-<30 kg/m2; obese, BMI ≥ 95th percentile or BMI ≥ 30 kg/m2, for children and adults, respectively; BMI missing), type 2 diabetes (yes vs. no), community socioeconomic deprivation (quartiles), distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), squared distance to nearest major and minor arterial road (truncated at the 98th percentile, meters, z-transformed), maximum temperature on the day prior to event (degrees Celsius), and squared maximum temperature on the day prior to event (degrees Celsius)
6.5.5 Discussion
We did not observe an association between NDVI and mild or severe asthma exacerbations. We observed lower odds of moderate exacerbations in the third quartile of NDVI, compared to the first, and similarly lower odds, though not statistically significant, for the fourth quartile compared to the first. In contrast to the ecological study that found a negative correlation between NDVI and asthma hospitalization rates at the municipality level, our study used individual level data, so it is relevant to inferences about causes of asthma exacerbations in individual patients rather than causes in populations (for example, by zip code or municipality). It is also possible that the relationship between greenness and asthma exacerbations differs between the United States and Europe, which would explain why the association observed in the study in Portugal differs from the one we observed in the United States. Finally, the association between UNGD and asthma exacerbations
remained unchanged comparing models with and without NDVI, and hence was not
confounded by NDVI.
6.6 Greenness and depression symptoms study
There is a growing literature on greenness and health outcomes, including mental health.9 However, most studies evaluating the association between greenness
and mental health used a general measure of mental health, the General Health
Questionnaire (GHQ-12), as the outcome.17-24 Only one study evaluated the association
of greenness and a questionnaire specific for depression.25 Here, we evaluated the
association of greenness with depression symptoms. We hypothesized that increased
greenness would be associated with fewer depression symptoms, and that this
association might differ by place type.
6.6.1 Greenness measure
As described in the study of greenness and asthma exacerbations (Section 6.5.2), we used the normalized difference vegetation index (NDVI), a measure of greenness from satellite data.9 As in the greenness and asthma exacerbation study (Section 6.5), we used peak NDVI and assigned each study participant the NDVI value for the grid cell that contained their geocoded home address. Within each place type, we truncated the
NDVI value at the 2nd and 98th percentiles and created a z-score of the truncated NDVI
values.
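A minimal sketch of this transformation is shown below. It assumes that "truncated" means values beyond the 2nd and 98th percentiles are set to those percentile values (rather than dropped) and that the z-score uses the sample standard deviation; the function names are hypothetical.

```python
import numpy as np

def truncated_zscore(values, lower_pct=2, upper_pct=98):
    """Clip values to the given percentiles, then standardize the clipped values."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.percentile(values, [lower_pct, upper_pct])
    clipped = np.clip(values, lo, hi)
    return (clipped - clipped.mean()) / clipped.std(ddof=1)

def zscore_by_place_type(ndvi, place_type):
    """Apply the truncation and z-score separately within each place type."""
    ndvi = np.asarray(ndvi, dtype=float)
    place_type = np.asarray(place_type)
    z = np.empty_like(ndvi)
    for kind in np.unique(place_type):
        mask = place_type == kind
        z[mask] = truncated_zscore(ndvi[mask])
    return z
```

Because the standardization is done within place type, a z-score of 0 corresponds to a different absolute NDVI value in boroughs than in townships.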
6.6.2 Depression symptom data
The depression symptom data was described in the unconventional natural gas
and depression symptoms study (Chapter 4). Briefly, data came from a questionnaire
sent to adult patients of the Geisinger Clinic in October 2014 by the Chronic
Rhinosinusitis Integrative Studies Program.5,26 The survey design oversampled for
patients with diagnostic codes for chronic rhinosinusitis, allergic rhinitis, and asthma and
for racial/ethnic minorities. A baseline questionnaire was mailed to 23,700 patients, and a
follow-up questionnaire was mailed to all responders of the baseline questionnaire
(Section 4.3.1). The follow-up questionnaire included a validated eight item
questionnaire on depression symptoms (PHQ-8). We excluded patients who lived
outside of Pennsylvania and patients who answered no PHQ-8 questions, leaving 4,762
patients in this analysis.
6.6.3 Covariates
As described in Chapter 4, from the electronic health record, we created covariates on race/ethnicity (white, black, Hispanic); sex (male, female); use of Medical
Assistance for health insurance, a measure of low family socioeconomic status (no, yes); age at survey return (years); smoking status at survey return (current, former, never); alcohol use at survey return (no; current, not heavy; current, heavy); and body mass index (BMI, kg/m2); and we created covariates for well water and community
socioeconomic deprivation (CSD) using patients’ geocoded coordinates. We also created a covariate for population density per km2 for each study participant’s mixed
definition of place using data from the U.S. Census 2014 American Community Survey.27
6.6.4 Data analysis
As in the UNGD and depression symptoms study (Chapter 4), we used negative binomial regression to evaluate the association of NDVI with depression symptoms. All models were weighted using truncated survey weights (Section 4.3.6).
We included centered NDVI as a linear and quadratic variable. Race/ethnicity, sex, use of Medical Assistance, age, smoking status, alcohol use, BMI, well water, and CSD were included as covariates. Age, BMI, and CSD were each included as a centered linear term and a centered squared term to allow for non-linearity. We stratified by place type (borough and township) because there was little overlap of NDVI by place type (Figure 6.6.4.1), as has been observed in another study of greenness and health outcomes in the Geisinger region.28
We did not run a model in cities because few responders lived in cities (n = 380). In the
township model, we hypothesized that social isolation could be a confounder of the
association between NDVI and depression symptoms. We used population density (per
km2) as a measure of social isolation. Social isolation is a risk factor for depression, and in our data, population density appeared to be associated with NDVI (Figure 6.6.4.2). To
test this hypothesis, we added population density (centered, linear and quadratic) as a
covariate to the model of NDVI and depression symptoms in townships.
Figure 6.6.4.1. Peak normalized difference vegetation index in 2014 by place type among study participants. Abbreviation: NDVI, normalized difference vegetation index
Figure 6.6.4.2. Lowess smoother and scatter plot of population density (per km2) and peak normalized difference vegetation index in 2014 among study participants in townships. Abbreviation: NDVI, normalized difference vegetation index
6.6.5 Results
We identified 3,088 study participants in townships and 1,294 in boroughs.
Among study participants in boroughs, there was no association of NDVI with depression symptoms in unadjusted or adjusted models (Table 6.6.5.1). Among study participants in townships, there was no association with the linear term, but there was a quadratic association of NDVI (global p value for the linear and quadratic term = 0.01) with depression symptoms, and associations were largely unchanged when population density was added to the model (Table 6.6.5.2 and Figure 6.6.5).
Table 6.6.5.1. Association of peak normalized difference vegetation index with depression symptoms among study participants in boroughs in survey(a) negative binomial regressions.

Model                               Unadjusted            Adjusted
Included in model                   1,294                 1,294
Covariates                          None                  All(b)
Exponential coefficient (95% CI)
  NDVI(c)                           0.98 (0.90 - 1.07)    1.03 (0.94 - 1.13)
  NDVI squared                      1.02 (0.94 - 1.11)    1.02 (0.93 - 1.11)

Abbreviations: NDVI, normalized difference vegetation index; CI, confidence interval
(a) Using truncated survey weights.
(b) Covariates included: race/ethnicity, sex, Medical Assistance, age (centered, centered & squared), smoking status, BMI (centered, centered & squared), well water, alcohol use, CSD (centered, centered & squared)
(c) z-transformed NDVI

Table 6.6.5.2. Association of peak normalized difference vegetation index with depression symptoms among study participants in townships in survey(a) negative binomial regressions.

Model                               Unadjusted            Adjusted              Adjusted additionally with population density
Included in model                   3,088                 3,088                 3,088
Covariates                          None                  All(b)                All plus population density (linear and quadratic)
Exponential coefficient (95% CI)
  NDVI(c)                           1.001 (0.94 - 1.07)   1.01 (0.93 - 1.08)    1.02 (0.94 - 1.10)
  NDVI squared                      1.08 (1.02 - 1.12)    1.06 (1.01 - 1.11)    1.06 (1.01 - 1.11)

Abbreviations: NDVI, normalized difference vegetation index; CI, confidence interval
(a) Using truncated survey weights.
(b) Covariates included: race/ethnicity, sex, Medical Assistance, age (centered, centered & squared), smoking status, BMI (centered, centered & squared), well water, alcohol use, CSD (centered, centered & squared)
(c) z-transformed NDVI
Figure 6.6.5. Association of peak normalized difference vegetation index with depression symptoms among study participants in townships in adjusted survey negative binomial regressions. Abbreviation: NDVI, normalized difference vegetation index
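To illustrate how a linear coefficient near 1 combined with a squared coefficient above 1 produces a U-shaped curve, the sketch below plugs the adjusted township point estimates from Table 6.6.5.2 (1.01 and 1.06) into the assumed model form, in which the log of the expected symptom score is linear in the NDVI z-score and its square. It ignores the confidence intervals and all other covariates, and the names used are illustrative only.

```python
import math

EXP_LINEAR = 1.01     # adjusted township estimate for NDVI (Table 6.6.5.2)
EXP_QUADRATIC = 1.06  # adjusted township estimate for NDVI squared

b1 = math.log(EXP_LINEAR)
b2 = math.log(EXP_QUADRATIC)

def relative_count(z):
    """Expected symptom count at NDVI z-score z, relative to z = 0."""
    return math.exp(b1 * z + b2 * z * z)

# The curve bottoms out where the derivative b1 + 2 * b2 * z equals zero.
turning_point = -b1 / (2 * b2)

for z in (-2, -1, 0, 1, 2):
    print(f"z = {z:+d}: relative count = {relative_count(z):.2f}")
print(f"turning point at z = {turning_point:.2f}")
```

Under these assumptions the minimum of the curve sits very close to the mean NDVI value, with higher expected symptom counts on either side of it.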
6.6.6 Discussion
We conducted a study of the association of NDVI, a measure of residential greenness, with depression symptoms. We stratified the analysis by place type because the distributions of NDVI differed substantially, with little overlap, by place type. We observed no association between NDVI and depression symptoms among study participants in boroughs. In townships, we observed a nonlinear association between
NDVI and depression symptoms. For NDVI values below the mean, as they increased towards the mean value, NDVI was associated with fewer depression symptoms.
However, increasing NDVI values above the mean were associated with more depression symptoms. These results are in contrast to the prior study of NDVI and depression symptoms, which used data from a state-wide survey of Wisconsin residents.
That study found that higher NDVI was associated with fewer depression symptoms. It controlled for urbanicity by including Rural and Urban Commuting Area codes, a categorical census tract level variable created by the U.S. Department of Agriculture to measure urbanicity based on commuting flow estimates, and population density at the census tract in the models. However, that study did not stratify by place type or urbanicity, so if the levels of greenness in Wisconsin are as different across urban and rural areas as they are in our study, they may have extrapolated beyond their data.
Additionally, that study did not evaluate nonlinear relationships between NDVI and depression symptoms, which could also account for the different results between our study and theirs.
We hypothesized that social isolation (in this study, measured by population density) could explain the positive association for study participants with NDVI above the mean, but when we added population density to the model, associations remained largely unchanged. It is possible that population density is a poor measure of social isolation and that we would see different results with a different measure of social isolation.
6.7 References
1. Pacheco JA, Avila PC, Thompson JA, et al. A highly specific algorithm for identifying asthma cases and controls for genome-wide association studies. AMIA Annu Symp
Proc. 2009;2009:497-501.
2. Tisnado DM, Adams JL, Liu H, et al. What is the concordance between the medical record and patient self-report as data sources for ambulatory care? Med Care.
2006;44(2):132-140.
3. Skinner KM, Miller DR, Lincoln E, Lee A, Kazis LE. Concordance between respondent self-reports and medical records for chronic conditions: Experience from the veterans health study. J Ambul Care Manage. 2005;28(2):102-110.
4. Corser W, Sikorskii A, Olomu A, Stommel M, Proden C, Holmes-Rovner M.
Concordance between comorbidity data from patient self-report interviews and medical record documentation. BMC Health Serv Res. 2008;8:85-6963-8-85.
5. Tustin AW, Hirsch AG, Rasmussen SG, Casey JA, Bandeen-Roche K, Schwartz BS.
Associations between unconventional natural gas development and nasal and sinus, migraine headache, and fatigue symptoms in pennsylvania. Environ Health Perspect.
2016.
6. Johnston NW, Sears MR. Asthma exacerbations. 1: Epidemiology. Thorax.
2006;61(8):722-728.
7. Johnston NW, Johnston SL, Norman GR, Dai J, Sears MR. The september epidemic of asthma hospitalization: School children as disease vectors. J Allergy Clin Immunol.
2006;117(3):557-562.
8. Juniper EF. Validated questionnaires should not be modified. European Respiratory
Journal. 2009;34(5):1015-1017. doi: 10.1183/09031936.00110209.
9. James P, Banay RF, Hart JE, Laden F. A review of the health benefits of greenness.
Curr Epidemiol Rep. 2015;2(2):131-142.
10. National Ocean Service. What is LIDAR?. National Oceanic and Atmospheric
Administration Web site. http://oceanservice.noaa.gov/facts/lidar.html. Published May
29, 2015. Updated 2015. Accessed May 9, 2016.
11. Dadvand P, Villanueva CM, Font-Ribera L, et al. Risks and benefits of green spaces
for children: A cross-sectional study of associations with sedentary behavior, obesity,
asthma, and allergy. Environ Health Perspect. 2014;122(12):1329-1335.
12. Andrusaityte S, Grazuleviciene R, Kudzyte J, Bernotiene A, Dedele A,
Nieuwenhuijsen MJ. Associations between neighbourhood greenness and asthma in preschool children in kaunas, lithuania: A case-control study. BMJ Open.
2016;6(4):e010341-2015-010341.
13. Lovasi GS, O'Neil-Dunne JP, Lu JW, et al. Urban tree canopy and asthma, wheeze,
rhinitis, and allergic sensitization to tree pollen in a new york city birth cohort. Environ
Health Perspect. 2013;121(4):494-500.
14. Sbihi H, Tamburic L, Koehoorn M, Brauer M. Greenness and incident childhood
asthma: A 10-year follow-up in a population-based birth cohort. Am J Respir Crit Care
Med. 2015;192(9):1131-1133.
15. Ayres-Sampaio D, Teodoro AC, Sillero N, Santos C, Fonseca J, Freitas A. An
investigation of the environmental determinants of asthma hospitalizations: An applied
spatial approach. Appl Geogr. 2014;47:10-19.
16. Erdman E, Liss A, Gute D, Rioux C, Koch M, Naumova E. Does the presence of
vegetation affect asthma hospitalizations among the elderly? A comparison between
rural, suburban, and urban areas. International Journal of Environment and
Sustainability (IJES). 2015;4(1).
17. Sugiyama T, Leslie E, Giles-Corti B, Owen N. Associations of neighbourhood
greenness with physical and mental health: Do walking, social coherence and local
social interaction explain the relationships? J Epidemiol Community Health.
2008;62(5):e9.
18. Maas J, van Dillen SM, Verheij RA, Groenewegen PP. Social contacts as a possible
mechanism behind the relation between green space and health. Health Place.
2009;15(2):586-595.
19. Triguero-Mas M, Dadvand P, Cirach M, et al. Natural outdoor environments and
mental and physical health: Relationships and mechanisms. Environ Int. 2015;77:35-41.
20. Sarkar C, Gallacher J, Webster C. Urban built environment configuration and psychological distress in older men: Results from the caerphilly study. BMC Public
Health. 2013;13:695-2458-13-695.
21. Mitchell R. Is physical activity in natural environments better for mental health than physical activity in other environments? Soc Sci Med. 2013;91:130-134.
22. White MP, Alcock I, Wheeler BW, Depledge MH. Would you be happier living in a
greener urban area? A fixed-effects analysis of panel data. Psychol Sci. 2013;24(6):920-
928.
23. Annerstedt M, Ostergren PO, Bjork J, Grahn P, Skarback E, Wahrborg P. Green qualities in the neighbourhood and mental health - results from a longitudinal cohort
study in southern sweden. BMC Public Health. 2012;12:337-2458-12-337.
24. Astell-Burt T, Mitchell R, Hartig T. The association between green space and mental
health varies across the lifecourse. A longitudinal study. J Epidemiol Community Health.
2014;68(6):578-583.
25. Beyer KM, Kaltenbach A, Szabo A, Bogar S, Nieto FJ, Malecki KM. Exposure to
neighborhood green space and mental health: Evidence from the survey of the health of
wisconsin. Int J Environ Res Public Health. 2014;11(3):3453-3472.
26. Hirsch AG, Stewart WF, Sundaresan AS, et al. Nasal and sinus symptoms and
chronic rhinosinusitis in a population-based sample. Allergy. 2016.
27. U.S. Census Bureau. American community survey, 2014. Table B01003 Web site.
http://factfinder2.census.gov. Accessed 8/22, 2016.
28. Casey JA, James P, Rudolph KE, Wu CD, Schwartz BS. Greenness and birth
outcomes in a range of pennsylvania communities. Int J Environ Res Public Health.
2016;13(3):10.3390/ijerph13030311.
Chapter 7: Discussion
7.1 Summary of findings
The aims of this thesis were to: 1) evaluate associations of UNGD activity metrics, based on wells only, with asthma exacerbations; 2) evaluate associations of these well-based UNGD activity metrics with depressive symptoms and with disordered sleep; 3) compare the different approaches to UNGD activity assessment used in published studies with one another and in their relations with mild asthma exacerbations;
and 4) determine whether and how other exposure-relevant aspects of UNGD, such as
impoundments, compressor engines, and flaring events should be incorporated into
UNGD activity metrics. We began data analysis in fall 2013 and finished in late 2016.
This chapter will summarize the findings for these four primary aims (presented in three
manuscripts) and discuss policy implications and future research directions.
The first two aims were addressed with epidemiologic studies and are presented
in separate chapters in this thesis. In Chapter 3, we evaluated the associations of four
well-based UNGD metrics (pad preparation, drilling, stimulation, and production) with
mild, moderate, and severe asthma exacerbation outcomes (new oral corticosteroid
medication orders for asthma, asthma emergency department visits, and asthma
hospitalizations, respectively). We found an association between 11 out of 12 of the
UNGD activity metric-asthma exacerbation pairs. Odds ratios (OR) for the high UNGD
group, compared to very low, ranged from 1.5 (95% confidence interval [CI], 1.2 - 1.7)
for the association of the pad metric with severe exacerbations to 4.4 (95% CI, 3.8 - 5.2)
for the association of the production metric with mild exacerbations. These associations
were robust to increasing levels of covariate control and in several sensitivity analyses.
We hypothesized that these associations, if determined to be causal, were biologically
plausible and could operate through air pollution and/or stress pathways.
For the second aim (Chapter 4), we evaluated the association of UNGD activity
with depression symptoms in a population of adults surveyed about their health
symptoms. We chose depression symptoms as the outcome because, similar to asthma
exacerbations, depression and its symptoms have been associated with exposure to air
pollution and stress in prior studies. In this study, we used a summary UNGD metric of
the four phases of well development, instead of the four phases separately, as in the
prior study. We evaluated if migraine headache or fatigue were mediators of this
association because a prior study of this surveyed population found associations of
UNGD activity with symptoms of migraine and fatigue.1 We also evaluated the association of UNGD activity with disordered sleep diagnoses in the Geisinger electronic health record (EHR) because we hypothesized that disordered sleep could be a mediator of a UNGD activity – depression association. We observed an association
between UNGD activity and depression (e.g., for high UNGD activity, compared to very
low, and mild depression, OR = 1.5 [95% CI, 1.1 - 2.0]), but not UNGD activity and
disordered sleep. Fatigue, but not migraine, appeared to partially mediate the
association between UNGD activity and depression. These associations were robust to
increasing levels of covariate control, but only present in survey-weighted models.
The third and fourth aims are presented in a single thesis chapter (Chapter 5).
To address these aims, we completed several analyses related to the UNGD metrics
used in epidemiology studies of UNGD to date. First, we identified and described UNGD-
related compressor stations, impoundments, and flaring events in Pennsylvania.
Second, we used principal component analysis to understand the relationship among
GIS-based metrics for compressors, impoundments, and four phases of well
development (pad preparation, drilling, stimulation, and production). Finally, we
compared how three different metrics used in UNGD and health studies to date
categorized case and control dates identified in the asthma exacerbation study (Chapter
3), and in their associations with mild asthma exacerbations. These three metrics were a categorical distance to the nearest drilled well (DNDW) metric, based on Rabinowitz2; an inverse-distance metric based on the drilling phase (IDD), based on McKenzie3 and
Stacy4; and an inverse-distance-squared metric incorporating four phases of well
development and compressor engines (IDS4PC), based on our research. We identified
361 UNGD-related compressor stations, 1,218 impoundments, and 216 locations with
flaring events in Pennsylvania. Using principal component analysis, we found that a
single component captured most of the variation between the metrics for compressors,
impoundments, and the four phases of well development. The loading weights were
approximately the same for each metric. Finally, when we compared three GIS-based
UNGD metrics used in UNGD and health epidemiology studies to date, to each other
and in their association with mild asthma exacerbations, we found that the three metrics
ranked persons differently across a gradient of UNGD and had different magnitudes of
association with mild asthma exacerbations. Although the highest category of each
metric (vs. the lowest) was associated with the outcome, the IDS4PC metric was most
strongly associated with mild asthma exacerbations and evidenced the clearest pattern
of increasing odds across categories of increasing UNGD, followed by the DNDW metric
and then the IDD metric.
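The contrast among the three metric families can be illustrated with a simplified sketch. It is not the implementation used in any of the cited studies: it assumes planar coordinates in kilometers, uses hypothetical category cutoffs of 1 and 2 km for the nearest-well metric, and omits well phases, durations, well characteristics, and compressor engines. All function names are invented for the example, and a home located exactly at a well (distance zero) is not handled.

```python
import math

def distances_km(home, wells):
    """Straight-line distances from a home to each well (planar coordinates, km)."""
    return [math.hypot(home[0] - x, home[1] - y) for x, y in wells]

def nearest_well_category(home, wells, cutoffs_km=(1.0, 2.0)):
    """Categorical distance to the nearest drilled well (cutoffs are hypothetical)."""
    nearest = min(distances_km(home, wells))
    return sum(nearest >= c for c in cutoffs_km)  # 0 = closest category

def inverse_distance(home, wells):
    """Sum of 1/d over wells."""
    return sum(1.0 / d for d in distances_km(home, wells))

def inverse_distance_squared(home, wells):
    """Sum of 1/d^2 over wells, which weights nearby wells more heavily."""
    return sum(1.0 / d ** 2 for d in distances_km(home, wells))

# Two hypothetical homes: one near a single well, one far from ten wells.
home = (0.0, 0.0)
one_close = [(0.5, 0.0)]
ten_far = [(5.0, 0.0)] * 10
```

In this toy example the two scenarios receive essentially the same inverse-distance value, while the inverse-distance-squared and nearest-well metrics both rank the single nearby well as the higher exposure, which is one way metrics can order the same persons differently.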
7.2 Health impacts of energy production and use
Before discussing the health implications of UNGD, it is important to note that energy production and use, regardless of the source, have both health benefits and health impacts. In low income countries, a lack of access to affordable energy sources is a barrier to health and economic potential,5 and increasing energy use improves health.
For example, on a national level, initially, life expectancy increases and infant mortality
decreases with increasing energy consumption, though the gains quickly level off
(Figure 7.2.1). In higher income countries, the overuse or inefficient use of energy has
health impacts, if the energy source has air impacts. Producing and burning fossil fuels results in occupational accidents and the emission of pollutants including particulate matter, nitrogen oxides, and sulfur dioxide. In the United Kingdom, for example, in 2001, the use of energy for electricity generation was estimated to be responsible for 3,778 deaths, 85% of which were attributed to coal.6 In the United States, energy use per person is much higher than it is in other countries with similar levels of economic development. But the increased energy use in the United States does not translate to improved health or wellbeing outcomes, suggesting that the United States energy use could be lowered.
Figure 7.2.1. Scatter plot of infant mortality and life expectancy vs. energy consumption per person.6 Size of the bubble is proportionate to the country’s population.
While not the focus of this thesis, the production and use of energy also have
climate implications, which in turn affect public health. When burned for electricity,
natural gas emits only half as much carbon dioxide per unit of energy as coal does.7
However, UNGD has fugitive methane emissions, and there have been conflicting
studies on the magnitude of fugitive methane emissions, some of which suggest that
natural gas produced from UNGD is worse for climate than coal because of the fugitive
emissions.8-12
7.3 Future research directions and policy implications
When the research in this thesis began, there was one unpublished epidemiology study of UNGD and pregnancy outcomes.13 Since then, several studies have been
conducted and published,1-4,14-19 including those in this thesis, but the state of knowledge
on UNGD and health is not conclusive. Below is a discussion of future research
directions for studies on UNGD and health, as well as policy implications given the
current state of knowledge on UNGD and health.
7.3.1 Research opportunities
Several frameworks exist to evaluate if these associations are causal, from Bradford Hill’s criteria in 1965 (Table 7.3.1)20 to modern causal inference methods.21 In studies of UNGD and health, we are limited to observational studies because we cannot expose populations to something that may be harmful, so the experiment criterion is not relevant. Although experimental studies of UNGD and health outcomes in people are not possible, there are many other potential studies that would help inform if the relationship between UNGD and health outcomes is causal.
Table 7.3.1. Bradford Hill’s causal criteria.
Strength
Consistency
Specificity
Temporality
Biological gradient
Coherence
Experiment
Analogy
7.3.1.1 Replication in other shale basins
Evaluation of a causal relationship is not possible from a single study. This thesis
contains the first studies to evaluate associations of UNGD with asthma exacerbations,
depression, and disordered sleep. These studies have not been replicated, but need to
be, particularly in other shale basins, where UNGD practices, and thus potential health
impacts, may be different than in the Marcellus shale in Pennsylvania. The asthma
exacerbation and disordered sleep studies would be difficult to replicate without EHR
data. Researchers could conduct these replication studies in health systems that have
EHRs in other regions with UNGD, for example, Texas and Colorado. Both states have
health care systems that are members of the Health Care Systems Research Network
(www.hcsrn.org), which could provide a source of EHR data. A replication of the
depression study, which relied on a questionnaire, could be conducted without EHR
data, or it could be conducted within a health system with EHR data, as we did. After the
studies in this thesis were replicated, it would be possible to evaluate if the associations
were consistent across studies and if the strength of association was similar across
studies, which are pieces of evidence that would inform if the associations observed are
causal.
7.3.1.2 Improving UNGD exposure assessment in epidemiology studies
A limitation of the epidemiology studies in this thesis is that they did not
incorporate environmental measurements (e.g., air pollution measurements) or
biomarkers (e.g., cortisol). Instead, we designed the GIS-based proxies for UNGD to
capture all potential pathways, though without environmental measurements, we cannot
definitively say what components of UNGD our metric is capturing. We hypothesize that,
if the relationships between UNGD and the health outcomes we evaluated are causal, stress and air pollution are the two primary pathways, but without exposure assessment
methods specific to stress or air pollution we cannot test these hypotheses. While the GIS-based
proxies we used for UNGD were defensible as a method for low-cost exposure
assessment in the initial studies of UNGD and health, future studies should aim to
improve exposure assessment methods so that they can evaluate specific pathways,
including air pollution, stress, and noise.
One study that would be useful for both policy and informing future research
would be to evaluate exposure levels to noise, criteria air pollutants, and hazardous air
pollutants at different distances from UNG wells, at different phases of well development,
and at wells of varying depths and volumes of natural gas production. Such a study
would be useful to policymakers in determining minimum distances (setbacks) from UNG
wells to homes. Setback distances vary across jurisdictions and have largely been
decided as a result of political negotiations and not based on scientific studies (in Texas,
the minimum setback distance is 200 feet, but in Pennsylvania and Colorado, it is 500
feet22). This study would also be useful for epidemiologists to gain insight into what
pathways may be operating at varying distances from wells.
Future epidemiology studies could evaluate the association of UNGD with health
outcomes using environmental measurements and biomarkers instead of using GIS-
based proxies. For example, a cohort study could periodically collect cortisol biomarkers
from a cohort of patients living at varying distances from UNGD, and in the analysis the
study could compare cortisol levels within the same person at different phases of well
development (and in between phases of development), and across people living at
different densities of UNG wells. A similar study could be conducted for noise and air
pollutants using personal monitors. Because drilling has slowed in recent years, studies
improving exposure assessment methods may need to wait to be conducted until drilling
picks up again so that the studies can include a large enough exposed population.
Epidemiology studies incorporating cortisol biomarkers or personal measurements of air pollution or noise would strengthen the evidence on UNGD and health by informing the mechanism between UNGD and health outcomes.
7.3.1.3 Reducing potential sources of bias in epidemiology studies of UNGD
Future studies of UNGD and health outcomes should address potential sources of bias that may affect the studies in this thesis. For example, we used ICD-9 codes and medication orders to identify disordered sleep outcomes, but this method could have resulted in bias if many patients with disordered sleep did not seek treatment or used
over-the-counter treatments. A future study on UNGD and disordered sleep would want
to consider using a different method to ascertain the outcome, for example, by using
questionnaires. Additionally, the study participants in our study of UNGD and depression
symptoms tended to be sicker than the general population because the survey
framework oversampled for patients with nasal and sinus symptoms. We used survey
weights to make our study population more similar to the general population in the
region, but there may still be differences between the weighted population and the
general population. A future study on UNGD and depression symptoms could consider
using a study population that more closely matched the general population.
7.3.1.4 Employ causal inference methods in studies of UNGD and health
There are several opportunities to employ causal inference methods in studies
of UNGD with health outcomes, and such studies could help determine if the relationship
between UNGD and health outcomes is causal. For example, the epidemiology
studies in this thesis could be repeated using propensity scores to make patients in the
different UNGD activity groups (very low, low, medium, and high) more similar on
measured confounders. The studies could also be repeated using a difference-in-
differences approach, as used in an unpublished study of UNGD and pregnancy
outcomes,13 by comparing the health outcomes of patients living near permitted wells
that are later drilled with those of patients living near permitted wells that are not drilled. We chose
not to use a difference-in-differences approach in this thesis because such an approach
has limitations, namely, that the exposure metric can only be dichotomous. However,
that could be an advantage in studies of setbacks (Section 7.3.1.2), where the exposure
of interest is inherently dichotomous.
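The difference-in-differences comparison described above can be illustrated with a minimal sketch. It assumes a dichotomous exposure (living near a permitted well that was later drilled versus one that was never drilled) and an outcome rate measured before and after drilling began. The records, group labels, and function names are hypothetical and are not data or code from this thesis.

```python
from statistics import mean

# Hypothetical records: (group, period, outcome rate per 100 patients).
# "drilled"     = residence near a permitted well that was later drilled
# "not_drilled" = residence near a permitted well that was never drilled
records = [
    ("drilled", "pre", 10.0), ("drilled", "pre", 12.0),
    ("drilled", "post", 15.0), ("drilled", "post", 17.0),
    ("not_drilled", "pre", 9.0), ("not_drilled", "pre", 11.0),
    ("not_drilled", "post", 10.0), ("not_drilled", "post", 12.0),
]


def group_mean(rows, group, period):
    """Mean outcome for one exposure group in one time period."""
    return mean(y for g, p, y in rows if g == group and p == period)


def difference_in_differences(rows):
    """Change in the exposed group minus change in the comparison group."""
    exposed_change = (group_mean(rows, "drilled", "post")
                      - group_mean(rows, "drilled", "pre"))
    comparison_change = (group_mean(rows, "not_drilled", "post")
                         - group_mean(rows, "not_drilled", "pre"))
    return exposed_change - comparison_change


did = difference_in_differences(records)
print(f"Difference-in-differences estimate: {did:.1f} per 100 patients")
```

The estimate is interpretable as an effect of drilling only if the two groups would have followed parallel outcome trends in the absence of drilling, an assumption that a real analysis would need to examine.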
7.3.2 Policy implications of studies on UNGD and health
Policy makers at the local, state, and international levels have been interested in
the results of studies on UNGD and health from our research group and from others, and
in developing policies to reduce the public health impact of UNGD. I have presented
results from our studies to policy makers from the Maryland House of Delegates and the
European Union Directorate-General for the Environment. Even though the current body
of research on UNGD and health is not conclusive, there are several policy
recommendations that make sense given the current state of knowledge.
7.3.2.1 Improve data collection on UNGD
Studies of UNGD are affected by the quality of recordkeeping on UNG wells. In
Pennsylvania, reporting on UNG wells needs to improve. In particular, stimulation dates
are frequently missing, though other dates of development (spud dates, production start
dates) and natural gas production quantities also have missingness (Section 2.3.1).
Pennsylvania requires well operators to report these data,23 so the state needs to ensure
that the required information is actually collected in full and made public.
Additionally, the state should publish other information on wells that would be useful for
health studies, but that is not currently collected and made available electronically,
including: natural gas production on a daily basis, the duration of stimulation, the
duration of pad preparation, dates of flaring, vertical and horizontal depth (reported
individually, not just as total depth, as is currently done), and volume of fluid used during stimulation at each well.
Pennsylvania also needs to improve collection of data on infrastructure related to
wells, including compressor engines, impoundments and pipelines. Currently, data on
these are not available electronically, which makes incorporating these into health
studies challenging. We identified impoundments using crowdsourcing and compressor engines
by abstracting data from paper documents, but we likely underestimated the counts of both of
these, because we only looked for impoundments within a kilometer of the nearest well,
and we could not distinguish between compressor engines missing a start letter and
those never started (Chapter 5). The state already collects data on compressor engines,
impoundments, and pipelines, because proposals for new compressor engines,
impoundments, and pipelines are published in the Pennsylvania Bulletin
(pabulletin.com). However, the details included in the Pennsylvania Bulletin for proposed
compressor engines, impoundments, and pipelines are not consistent (for example,
some entries include latitude and longitude and others do not), which makes the
Pennsylvania Bulletin a poor source of information for exposure assessment in
epidemiology studies. The state should compile data on locations, dates of development,
and sizes of this infrastructure systematically into an electronic format available online.
Although our evaluation of compressor engines and impoundments did not suggest that
incorporating these into an inverse-distance-squared metric made a difference in the
interpretation of the association of that metric with an adverse health outcome, having
information on the locations, sizes, and dates of development of compressor engines
and impoundments could be important for exposure assessment studies of specific
pathways. For example, it would be important to know the locations and sizes of
compressor engines to design a study to evaluate the noise impacts of these.
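To make the inverse-distance-squared comparison concrete, the sketch below computes such a metric for one residence, first from wells alone and then with compressor engines and impoundments added as further sources. It assumes the simplest unweighted form, the sum of 1/d² over sources with d in kilometers. The coordinates, the haversine distance helper, the minimum-distance floor, and the function names are illustrative assumptions rather than the metric definitions used in the earlier chapters.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used for great-circle distance


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def inverse_distance_squared(residence, sources, min_km=0.1):
    """Sum of 1/d^2 over all sources for one residence.

    min_km is an assumed floor that avoids division by zero when a
    source sits at (or extremely near) the residence.
    """
    total = 0.0
    for lat, lon in sources:
        d = max(haversine_km(residence[0], residence[1], lat, lon), min_km)
        total += 1.0 / d ** 2
    return total


# Hypothetical coordinates, for illustration only.
residence = (41.00, -76.50)
wells = [(41.05, -76.50), (41.00, -76.40)]
compressor_engines = [(41.10, -76.60)]
impoundments = [(41.02, -76.52)]

wells_only = inverse_distance_squared(residence, wells)
with_infrastructure = inverse_distance_squared(
    residence, wells + compressor_engines + impoundments
)
print(f"Wells only:          {wells_only:.4f}")
print(f"With infrastructure: {with_infrastructure:.4f}")
```

Because every source contributes a positive term, adding infrastructure can only raise the metric. Whether that changes the ranking of residences, and therefore the exposure categories used in a health analysis, depends on how closely the infrastructure is clustered around the wells.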
7.3.2.2 Expand air quality monitoring in rural oil and natural gas producing areas
The National Ambient Air Quality Standards (NAAQS) are the federal standard
for criteria air pollutants (carbon monoxide, lead, nitrogen dioxide, ozone, particulate
matter, and sulfur dioxide), and the EPA requires that states maintain a monitoring
network for these. However, the requirements for monitors outside of urban areas are
minimal.24 As noted in Chapter 5, we were not able to compare our UNGD metrics
against EPA monitor data because the EPA monitor network is not dense enough in the
areas of Pennsylvania with UNGD. The Environmental Defense Fund and other
environmental organizations have called on the EPA to increase air pollution monitoring
in rural areas with oil and gas development and laid out a legal framework for the EPA to
do so.25 The EPA should move forward with increasing air pollution monitoring, particularly for ozone and particulate matter, in rural oil- and gas-producing areas so that future studies can evaluate the impact of UNGD on air quality.
For hazardous air pollutants (HAPs) and precursors to ozone, the monitoring strategy should be different. Studies of emission events of HAPs and ozone precursors from UNGD show that these events involve high emissions over a short period of time, with large spatial variability over a small area. The existing EPA network cannot detect them because of their short duration and high spatial variability.
One potential strategy for monitoring these emission events is to drive mobile monitoring stations, equipped with instruments designed to measure emissions from point sources like these, around well pads during phases of development when emissions are likely (e.g., stimulation).26
7.3.2.3 Fund research on UNGD and health
While Pennsylvania proceeded rapidly with UNGD, other states in the Marcellus shale, namely Maryland and New York, enacted moratoriums due to possible uncharacterized environmental and health impacts. At the time these were enacted, few health studies had been published. Now, several studies have found associations
between UNGD and health outcomes, but the research on UNGD and health is still far from conclusive. In order for states to determine if the risks from UNGD outweigh the benefits, they need more epidemiology and exposure studies on UNGD, so this research needs to be funded. This funding could be modeled after the Deepwater Horizon
Research Consortia, which is a National Institute of Environmental Health Sciences
(NIEHS) program to study the health effects of oil spills, funded in part by BP. While BP
provided some of the funding, it is not otherwise involved in the program or the
research.27 Similarly, companies involved with UNGD could fund an NIEHS program to
study UNGD and health, and the NIEHS could ensure that these companies are not
involved with the program or its research.
7.3.2.4 Incorporate externalities into energy prices
Several state and local governments have considered enacting or have enacted moratoriums or bans on UNGD. But UNGD has decreased the cost of natural gas, and as a result there has been a decline in the use of coal to produce electricity in favor of natural gas.28 An unintended consequence of UNGD moratoriums or bans could be to
increase the use of coal for electricity production by decreasing the supply of natural gas
from UNG wells. If the methane leakage from UNG wells can be reduced through
regulations and best practices, UNGD moratoriums or bans could actually speed up
climate change by shifting power plants back to coal.
Instead of UNGD moratoriums and bans, energy needs to be priced to incorporate the externalities of energy production and use. Currently, the producers of energy sources that have health and climate impacts do not pay the costs of the impacts they create. The leading proposal to incorporate these externalities into energy prices is a carbon tax, which is a tax based on the carbon dioxide content (and also potentially the content of other greenhouse gases, like methane) of fuels.29 The price of coal and
natural gas (from conventional and unconventional sources) would rise. By increasing
the cost of fossil fuels, a carbon tax would shift energy use away from those fuels and toward renewables and energy efficiency. A carbon tax would have positive health effects over the short term because fuels like coal, which have high greenhouse gas emissions, also emit particulate matter and other air pollutants that affect health. If the carbon tax also covered other greenhouse gases, such as methane, it would disincentivize UNGD. This could potentially reduce any negative health effects of
UNGD, if the associations observed in epidemiology studies to date are causal. A
carbon tax would also have positive health effects over the longer term by mitigating
climate change and its associated health impacts.
7.3.3 Health implications of our research
If there is a causal association of UNGD with adverse health outcomes, as drilling continues in Pennsylvania, populations will continue to be exposed. While a policy intervention to reduce exposure would be ideal, until such a measure is passed, residents of regions with UNGD should be aware of potential health impacts and take steps to protect their health. The Southwest Pennsylvania Environmental Health Project
(environmentalhealthproject.org), a nonprofit organization working on health impacts of
UNGD in Pennsylvania, provides recommendations for nearby residents, including
frequently vacuuming with a HEPA filter vacuum, taking notes of health symptoms over
time, and not drinking water from the tap if it causes rash or pain for someone in the household. However, the effectiveness of these recommendations has not been evaluated. Instead, we would recommend that residents be aware of UNGD occurring near their homes and workplaces, inform their doctors of it, and advocate for policies to reduce potential exposures.
7.4 Final Remarks
This thesis contributes to the body of research on UNGD and health, and more
broadly, energy and health. Research on the health effects of UNGD needs to continue
and address the limitations of prior studies, most importantly by incorporating biomarkers and environmental measurements, so that epidemiologists can determine if the associations observed in this thesis are causal and so that research can better inform
policy decisions. As greenhouse gas emissions rise to critical levels and supplies of
conventional fossil fuels diminish, policy makers have important decisions to make about
what sources of energy will power the future. Historically, economic considerations were
key in making decisions about energy, but this thesis and other epidemiology studies of
UNGD show that health and environmental concerns must be considered too.
7.5 References
1. Tustin AW, Hirsch AG, Rasmussen SG, Casey JA, Bandeen-Roche K, Schwartz BS. Associations between unconventional natural gas development and nasal and sinus, migraine headache, and fatigue symptoms in Pennsylvania. Environ Health Perspect. 2016.
2. Rabinowitz PM, Slizovskiy IB, Lamers V, et al. Proximity to natural gas wells and reported health status: Results of a household survey in Washington County, Pennsylvania. Environ Health Perspect. 2014.
3. McKenzie LM, Guo R, Witter RZ, Savitz DA, Newman LS, Adgate JL. Birth outcomes and maternal residential proximity to natural gas development in rural Colorado. Environ Health Perspect. 2014.
4. Stacy SL, Brink LL, Larkin JC, et al. Perinatal outcomes and unconventional natural gas operations in southwest Pennsylvania. PLoS ONE. 2015;10(6):e0126425.
5. Markandya A, Wilkinson P. Electricity generation and health. The Lancet. 2007;370(9591):979-990.
6. Wilkinson P, Smith KR, Joffe M, Haines A. A global perspective on energy: Health effects and injustices. The Lancet. 2007;370(9591):965-978.
7. U.S. Energy Information Administration. How much carbon dioxide is produced when different fuels are burned? http://www.eia.gov/tools/faqs/faq.cfm?id=73&t=11.
8. Howarth RW, Santoro R, Ingraffea A. Methane and the greenhouse-gas footprint of natural gas from shale formations. Clim Change. 2011;106(4):679-690.
9. Allen DT, Torres VM, Thomas J, et al. Measurements of methane emissions at natural gas production sites in the United States. Proc Natl Acad Sci U S A. 2013;110(44):17768-17773.
10. Jiang M, Griffin WM, Hendrickson C, Jaramillo P, VanBriesen J, Venkatesh A. Life cycle greenhouse gas emissions of Marcellus shale gas. Environmental Research Letters. 2011;6(3):034014.
11. Karion A, Sweeney C, Pétron G, et al. Methane emissions estimate from airborne measurements over a western United States natural gas field. Geophys Res Lett. 2013;40(16):4393-4397.
12. Adgate JL, Goldstein BD, McKenzie LM. Potential public health hazards, exposures and health effects from unconventional natural gas development. Environ Sci Technol. 2014;48(15):8307-8320.
13. Hill EL. Unconventional Natural Gas Development and Infant Health: Evidence from Pennsylvania. 2012.
14. Fryzek J, Pastula S, Jiang X, Garabrant DH. Childhood cancer incidence in Pennsylvania counties in relation to living in counties with hydraulic fracturing sites. Journal of Occupational and Environmental Medicine. 2013;55(7):796-801.
15. Finkel M. Shale gas development and cancer incidence in southwest Pennsylvania. Public Health. 2016;141:198-206.
16. Graham J, Irving J, Tang X, et al. Increased traffic accident rates associated with shale gas drilling in Pennsylvania. Accident Analysis & Prevention. 2015;74:203-209.
17. Jemielita T, Gerton GL, Neidell M, et al. Unconventional gas and oil drilling is associated with increased hospital utilization rates. PLoS ONE. 2015;10(7):e0131093.
18. Saberi P, Propert KJ, Powers M, Emmett E, Green-McKenzie J. Field survey of health perception and complaints of Pennsylvania residents in the Marcellus shale region. Int J Environ Res Public Health. 2014;11(6):6517-6527.
19. Casey JA, Savitz DA, Rasmussen SG, et al. Unconventional natural gas development and birth outcomes in Pennsylvania, USA. Epidemiology. 2015.
20. Hill AB. The environment and disease: Association or causation? Proc R Soc Med. 1965;58(5):295-300.
21. Glass TA, Goodman SN, Hernan MA, Samet JM. Causal inference in public health. Annu Rev Public Health. 2013;34:61-75.
22. Haley M, McCawley M, Epstein AC, Arrington B, Bjerke EF. Adequacy of current state setbacks for directional high-volume hydraulic fracturing in the Marcellus, Barnett, and Niobrara shale plays. Environ Health Perspect. 2016;124(9):1323-1333.
23. Pennsylvania Code. Subchapter E. Well reporting § 78.121-§ 78.125. http://www.pacode.com/secure/data/025/chapter78/subchapEtoc.html.
24. Environmental Protection Agency. Network design criteria for ambient air quality monitoring.
25. Environmental Defense Fund. Petition for the U.S. Environmental Protection Agency to promptly require oil and gas owners and operators to monitor for ozone and to issue control techniques guidelines for oil and natural gas operations in nonattainment areas. 2012.
26. Olaguer EP, Erickson M, Wijesinghe A, Neish B, Williams J, Colvin J. Updated methods for assessing the impacts of nearby gas drilling and production on neighborhood air quality and human health. J Air Waste Manag Assoc. 2016;66(2):173-183.
27. National Institute of Environmental Health Sciences. Deepwater Horizon research consortia. https://www.niehs.nih.gov/research/supported/centers/gulfconsortium/. Updated 2016. Accessed January 6, 2017.
28. Culver WJ, Hong M. Coal’s decline: Driven by policy or technology? The Electricity Journal. 2016;29(7):50-61.
29. Schnoor JL. Responding to climate change with a carbon tax. Environ Sci Technol. 2014;48(21):12475-12476.
Appendix
Institutional review board documents
Curriculum Vita – Sara G. Rasmussen
WORK ADDRESS
Johns Hopkins Bloomberg School of Public Health
615 N. Wolfe St., W7508
Baltimore, MD 21205
EDUCATION
2012–2017 (expected) Doctor of Philosophy in Environmental Health Sciences Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland Advisor: Brian Schwartz, MD MS Dissertation Title: “Associations of unconventional natural gas development with asthma exacerbations and depressive symptoms in Pennsylvania”
2010–2011 Master of Health Science in Environmental Health Sciences Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
2006–2010 Bachelor of Arts in Anthropology, cum laude Washington University in St. Louis, St. Louis, Missouri
PROFESSIONAL TRAINING
Fall 2013 Risk Sciences and Public Policy Certificate, Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
PROFESSIONAL EXPERIENCE
May 2011–August 2012 Staff Researcher, Earth Policy Institute, Washington, DC Research, data collection, and fact-checking for Lester Brown's book, Full Planet, Empty Plates: The New Geopolitics of Food
TEACHING EXPERIENCE
Fall 2016 Teaching assistant, Global Sustainability and Health Seminar (188.688.01, 1 credit), Johns Hopkins Bloomberg School of Public Health Led weekly discussions on readings
Fall 2012 - Fall 2015 Lead teaching assistant, The Global Environment and Public Health (180.611.01, 4 credits), Johns Hopkins Bloomberg School of Public Health Formulated assignments, graded exams, addressed student questions
Spring 2013 Teaching assistant, Environmental and Occupational Health Law and Policy (180.629.01, 4 credits), Johns Hopkins Bloomberg School of Public Health Graded exams and homework assignments
HONORS AND AWARDS
March 2016 Morgan James Endowment Award
March 2015 Morgan James Endowment Award
February 2015 Johns Hopkins Bloomberg School of Public Health Delta Omega Poster Competition, Second Place
May 2011 Delta Omega Honor Society
PUBLICATIONS
1. Casey JA, Ogburn EL, Rasmussen SG, Irving JK, Pollak J, Locke PA, Schwartz BS. Predictors of indoor radon concentrations in Pennsylvania, 1989-2013. Environ Health Perspect. 2015 Nov;123(11):1130-1137.
2. Casey JA, Savitz DA, Rasmussen SG, Ogburn EL, Pollak J, Mercer DG, Schwartz BS. Unconventional natural gas development and birth outcomes in Pennsylvania, USA. Epidemiology. 2016 Mar;27(2):163-72.
3. Rasmussen SG, Ogburn EL, McCormack M, Casey JA, Bandeen-Roche K, Mercer DG, Schwartz BS. Association between unconventional natural gas development in the Marcellus shale and asthma exacerbations. JAMA Intern Med. 2016;176(9):1334-1343.
4. Tustin AW, Hirsch AG, Rasmussen SG, Casey JA, Schwartz BS. Associations between unconventional natural gas development and nasal and sinus, migraine headache, and fatigue symptoms in Pennsylvania. Environ Health Perspect. 2017 Feb;125(2):189-197.
PAPERS IN PROGRESS
1. Rasmussen SG, Wilcox H, Hirsch AG, Pollak J, Schwartz BS. Associations of unconventional natural gas development with disordered sleep and depression symptoms in Pennsylvania.
2. Rasmussen SG, Koehler K, Ellis H, Manthos D, Bandeen-Roche K, Platt R, Schwartz BS. Exposure assessment using secondary data sources in unconventional natural gas development and health studies.
NON-PEER REVIEWED ARTICLES
1. Rasmussen SG, Casey JA, Schwartz BS. Fracking and health: What we know from Pennsylvania’s natural gas boom. The Conversation. 25 August 2016.
2. Rasmussen SG, Schwartz BS. Unconventional Natural Gas Development: Epidemiology Studies and Public Health Implications. Society of General Internal Medicine Forum. 2016.
SCIENTIFIC CONFERENCE PRESENTATIONS
1. Rasmussen SG, McCormack M, Casey JA, Ogburn EL, Schwartz BS. Marcellus shale development, air pollution, and asthma exacerbations. Poster session presented at: 27th Conference of the International Society for Environmental Epidemiology; 2015 Aug 30-Sep 3; São Paulo, Brazil.
2. Rasmussen SG, Casey JA, Bandeen-Roche K, Schwartz BS. Proximity to industrial food animal production and asthma exacerbations in Pennsylvania, 2005-2012. Poster session at: 49th Annual Meeting of the Society for Epidemiologic Research; 2016 June 21-24; Miami, Florida.
3. Rasmussen SG, Hirsch AG, McCormack M, Schwartz BS. Associations between unconventional natural gas development, respiratory symptoms, and mental health in Pennsylvania. Poster session at: 28th Conference of the International Society for Environmental Epidemiology; 2016 September 1-4; Rome, Italy.
4. Rasmussen SG, Ellis H, Koehler KA, Schwartz BS. Exposure assessment in unconventional natural gas and health studies. Oral presentation at: Annual Conference of International Society of Exposure Science; 2016 October 9-13; Utrecht, The Netherlands.
INVITED PRESENTATIONS
1. Rasmussen SG. Marcellus shale development, air pollution, and asthma exacerbations. Lecture in Occupational and Environmental Hygiene Seminar (182.860.81) at Johns Hopkins University, 2015 March, Johns Hopkins Bloomberg School of Public Health.
2. Rasmussen SG. Public Health Practice Seminar on the Future of the Maryland Fracking Moratorium at Johns Hopkins University, 2016 October, Johns Hopkins Bloomberg School of Public Health.
3. Rasmussen SG. Unconventional Natural Gas Development & Health Studies. Presentation at the European Union Technical Workshop on Public Health Impacts and Risks Resulting from Hydrocarbons Exploration and Production, Brussels, Belgium, 2016 November.
4. Schwartz BS and Rasmussen SG. New Research on Asthma and Other Public Health Considerations of Shale Gas. Presentation at the League of Women Voters of Pennsylvania Shale & Public Health Conference, Pittsburgh, Pennsylvania, 2016 November.
RESEARCH GRANT PARTICIPATION
2014–2016 National Science Foundation Water, Climate, and Health Integrative Graduate Education and Research Traineeship
2012–2014 National Institute of Environmental Health Sciences Training Grant, ES07141
EDITORIAL ACTIVITIES
2016 - Present Ad Hoc Reviewer, Environmental Research
PROFESSIONAL MEMBERSHIPS
2015-Present International Society for Environmental Epidemiology