ENVIRONMENTAL AND HEALTH IMPACTS OF EXTREME HEAT EVENTS

A Dissertation Presented to The Academic Faculty

by

Ambarish Vaidyanathan

In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the School of Civil and Environmental Engineering

Georgia Institute of Technology August, 2014

COPYRIGHT © Ambarish Vaidyanathan 2014

ENVIRONMENTAL AND HEALTH IMPACTS OF EXTREME HEAT EVENTS

Approved by:

Dr. James A. Mulholland, Advisor
School of Civil and Environmental Engineering
Georgia Institute of Technology

Dr. Armistead G. Russell
School of Civil and Environmental Engineering
Georgia Institute of Technology

Dr. Michael H. Bergin
School of Civil and Environmental Engineering
Georgia Institute of Technology

Dr. Yongtao Hu
School of Civil and Environmental Engineering
Georgia Institute of Technology

Dr. Seymour E. Goodman
School of International Affairs/School of Computer Science
Georgia Institute of Technology

Dr. Judith R. Qualters
National Center for Environmental Health
Centers for Disease Control and Prevention

Dr. Scott R. Kegler
National Center for Injury Prevention and Control
Centers for Disease Control and Prevention

Date Approved: June 23, 2014

To all my teachers

ACKNOWLEDGEMENTS

This personally enriching experience would not have been possible without the support of numerous individuals along the way. First and foremost, I would like to express my sincere gratitude to my adviser, Prof. James Mulholland. He gave me ample freedom in pursuing the topics of my interest while guiding me with his expert comments and invaluable insights. In short, without his guidance and help, I would not be writing this section. I would also like to thank Prof. Armistead Russell for his help and support throughout the course of my PhD; I appreciate his help in tackling complex issues related to my dissertation and consider myself fortunate to have him on my committee. I would also like to thank Prof. Goodman for introducing me to the policy aspects of scientific research, and I am grateful to have completed the Sam Nunn Security Program Fellowship under his guidance. I would also like to thank my other Georgia Tech committee members, Dr. Yongtao Hu and Prof. Michael Bergin, for their guidance and support.

I would like to express my utmost appreciation to Dr. Scott Kegler for his statistical support and guidance. He has invested endless hours of his time in my development and growth, and has taught me how to be critical, creative, and patient. His dedication will always be my motivation and inspiration to strive for excellence in my professional life. I would be remiss if I did not acknowledge the guidance provided by Dr. Judith Qualters. Dr. Qualters stood by me all the time, pushing me to set high standards and excel every time. I am deeply indebted to her for her words of encouragement, and for the support and resources that she provided me throughout my PhD. I would also like to thank Dr. Dana Flanders, Emory University, for his assistance in reviewing certain chapters of my dissertation.


I have had the good fortune of having great friends both at Georgia Tech and CDC, and I would like to thank them for their unflinching support during the course of my PhD.

Finally, I am forever indebted to my parents, for the never-ending love and support they have provided throughout the years. Without them, I would not be here today.


TABLE OF CONTENTS

Page

ACKNOWLEDGEMENTS ...... iv
LIST OF TABLES ...... ix
LIST OF FIGURES ...... xi
SUMMARY ...... xiv

Chapter 1 Introduction ...... 1
1.1 References ...... 8

Chapter 2 Region-specific Evaluation of Extreme Heat Event Definitions Using Heat Mortality Data: A Comprehensive National Assessment with a Public Health Focus ...... 16
2.1 Introduction ...... 16
2.2 Methods ...... 18
2.2.1 Meteorology data ...... 18
2.2.2 EHE definitions and core variables ...... 20
2.2.3 Mortality and population data ...... 22
2.2.4 Evaluating EHE definitions using heat mortality data ...... 23
2.2.5 Rate regression modeling ...... 26
2.3 Results ...... 28
2.4 Summary and Conclusions ...... 35
2.5 References ...... 40

Chapter 3 Exploring the Utility of Modeled Meteorology Data for Extreme Heat-Related Health Research and Surveillance ...... 45
3.1 Introduction ...... 45
3.2 Methods ...... 47
3.2.1 Meteorology data ...... 47
3.2.2 Mortality and population data ...... 51
3.2.3 Station-level comparison of model and station data ...... 52
3.2.4 County-level evaluation of model and station-based exposure estimates using heat-related mortality data ...... 53
3.3 Results and Discussion ...... 55
3.3.1 Descriptive statistics ...... 55
3.3.2 Station-level comparison of model and station data ...... 57
3.3.3 County-level comparison of mean EHE effect ...... 62
3.4 Summary and Conclusions ...... 65
3.5 References ...... 67

Chapter 4 Characterizing the Effect of Meteorology on Ozone Levels during Extreme Heat Events ...... 72
4.1 Introduction ...... 72
4.2 Methods ...... 74
4.2.1 Meteorology and ozone data ...... 74
4.2.2 EHE Definitions ...... 76
4.2.3 Modeling approach ...... 77
4.2.4 Sensitivity of meteorology-ozone relationships to EHE definitions ...... 79
4.3 Results ...... 80
4.3.1 Descriptive summary ...... 80
4.3.2 Impact of meteorology on ozone and effect modification during EHEs ...... 86
4.4 Summary and Conclusions ...... 91
4.5 References ...... 94

Chapter 5 Assessment of Modeled PM2.5: A Public Health Perspective ...... 101
5.1 Introduction ...... 101
5.2 Methods ...... 102
5.2.1 Study domain and time period ...... 102
5.2.2 Station based PM2.5 measurements ...... 103
5.2.3 Model-based PM2.5 predictions ...... 104
5.2.4 Evaluation methods: daily grid cell level evaluation ...... 106
5.2.5 Comparison of linked metrics of air quality and health ...... 106
5.2 Results ...... 108
5.3.1 Data completeness and descriptive statistics ...... 108
5.3.2 Daily grid level evaluation ...... 110
5.3.3 Comparison of estimated annual county-level change in mortality rate ...... 117
5.3 Discussion ...... 120
5.4 Conclusion ...... 124
5.5 References ...... 125

Chapter 6 Monetizing Health Burden Associated with Extreme Heat Events: Exploring the Role of Air Pollution and the Sensitivity Associated with Definitions in the Excess Death Estimation Process ...... 130
6.1 Introduction ...... 130
6.2 Methods ...... 132
6.2.1 Meteorology data ...... 132
6.2.2 Air pollution data ...... 132
6.2.3 Mortality data ...... 133
6.2.4 Population and other ancillary data ...... 134
6.2.5 EHE definitions ...... 135
6.2.6 Estimation of exposure-response (E-R) relationship ...... 135
6.2.7 Excess death estimation ...... 137
6.2.8 Region-level summary of E-R relationship and excess deaths ...... 138
6.2.9 Monetizing Excess deaths ...... 139
6.3 Results and Discussion ...... 141
6.3.1 Descriptive statistics ...... 141
6.3.2 Modeling results ...... 145
6.3.3 E-R relationships ...... 145
6.3.4 Excess deaths and economic costs associated with EHEs ...... 149
6.3.5 Limitations ...... 152
6.4 Conclusions ...... 153
6.5 References ...... 154

Chapter 7 Summary of Conclusions and Future Research ...... 159
7.1 Summary of Conclusions ...... 159
7.1.1 Region-Specific Evaluation of Extreme Heat Event Definitions Using Heat Mortality Data ...... 159
7.1.2 Exploring the Utility of Modeled Meteorology Data for Extreme Heat-Related Health Research and Surveillance ...... 160
7.1.3 Characterizing the Effect of Meteorology on Ozone levels during Extreme Heat Events ...... 161
7.1.4 Assessment of Modeled data sources of PM2.5: A public health perspective ...... 162
7.1.5 Monetizing Health Burden Associated with Extreme Heat Events ...... 163
7.2 Future Directions ...... 165

Appendix A ...... 167
Appendix B ...... 181
Appendix C ...... 199
Appendix D ...... 219
Appendix E ...... 228


LIST OF TABLES

Page

Table 2-1: Core variables used in extreme heat event definitions ...... 21

Table 2-2: Two-way frequency table of daily agreements/disagreements for two operationalized EHE definitions/variants ...... 24

Table 2-3: Heat-related deaths and counties with meteorological data, by climate region (1999-2009) ...... 29

Table 2-4: Representative EHE definition from each cluster ...... 32

Table 2-5: Mean (95% CI) EHE effect by U.S. climate regions ...... 34

Table 2-6: Meta-analyzed baseline rate and EHE effect by U.S. climate region ...... 35

Table 3-1: Availability of complete ASOS stations by climate region and urbanicity .... 56

Table 3-2: Station-level comparison at SEARCH locations ...... 61

Table 3-3: Absolute and relative change in mean EHE effect by U.S. climate region ..... 64

Table 4-1: List of meteorological variables ...... 75

Table 4-2: EHE definitions used in this analysis ...... 84

Table 4-3: The effect modification (slope factors on a logarithmic scale) of the relationship between meteorological variables on ozone during EHEs ...... 90

Table 5-1: Completeness of data sources...... 109

Table 5-2: Annual and seasonal comparison of AQS-based measurements and modeled estimates of PM2.5 ...... 111

Table 5-3: Annual and seasonal comparison of SEARCH-based measurements and modeled estimates of PM2.5 ...... 113

Table 6-1: Descriptive summary statistics ...... 143

Table 6-2: Levels of air pollutants and standardized AP scores on EHE and non-EHE days ...... 144

Table 6-3: Excess deaths per EHE day using RR generated from a model with and without air pollution terms ...... 150


Table 6-4: Excess costs per EHE day using RRs generated from a model with and without air pollution terms ...... 151

Table A-1: List of EHE definitions used in the analysis ...... 167

Table A-2: Percent of days classified as EHE days and percent of X30 deaths on EHE days, by EHE definition and exposure offset combinations for U.S. Climate Regions ...... 178

Table A-3: Mean (5th, and 95th percentile) levels of demographic and social variables, by U.S. Climate Regions ...... 180

Table B-1: Grid extent of the NLDAS grid ...... 181

Table B-2: Performance evaluation metrics used in this analysis ...... 184

Table B-3: Short-listed EHE definitions used to evaluate ASOS- and NLDAS-based exposure estimates ...... 186

Table B-4: Annual station-level RMSD for daily heat metrics ...... 190

Table C-1: City-specific Pearson correlation coefficient between meteorological variables ...... 207

Table C-2: Model performance for 27 cities...... 210

Table C-3: The effect (slope factors on a logarithmic scale) of meteorological variables on ozone and effect modification by EHEs ...... 211

Table C-4: The effect modification (slope factors on a logarithmic scale) of the relationship between meteorological variables on ozone during EHEs by region and definition ...... 212

Table E-1: List of covariates and data sources ...... 229

Table E-2: Short-listed EHE definitions ...... 230

Table E-3: Rankings for the combination of EHE definitions and exposure offset indicators by regions...... 232

Table E-4: Mean (5th, and 95th percentile) levels of demographic and social variables, by U.S. Climate Regions ...... 239


LIST OF FIGURES

Page

Figure 2-1: Spatial coverage of ASOS weather stations ...... 19

Figure 2-2: U.S. climate regions ...... 23

Figure 2-3: Schematic showing exposure offset indicators ...... 25

Figure 2-4: Dendrogram of hierarchical clusters ...... 31

Figure 3-1: Spatial coverage of: (A) ASOS, (B) SEARCH stations, and (C) NLDAS grid extent with U.S. climate regions ...... 50

Figure 3-2: Station-level Pearson correlation coefficient between ASOS observations and NLDAS-based estimates for daily heat metrics ...... 58

Figure 3-3: Station-level median difference between ASOS observations and NLDAS-based estimates for daily heat metrics ...... 59

Figure 4-1: Cities selected for this analysis ...... 76

Figure 4-2: Ozone concentrations by month in 27 cities ...... 81

Figure 4-3: Effect modification of the meteorology-ozone relationship during EHE and non-EHE days for: (A) daily maximum temperature, and (B) daily mean inverse wind speed ...... 89

Figure 5-1: Spatial domain...... 103

Figure 5-2: Comparison of model- and SEARCH-based PM2.5 concentrations ...... 115

Figure 5-3: County-level annual comparison of model- and AQS-based metrics: (A) Bland Altman Plot, and (B) Comparison of change in mortality rate ...... 119

Figure 6-1: Rate ratios by climate regions generated from the random effects summary- level analysis, controlling for social and demographic factors, and with and without adjustment for air pollution (standardized AP score)...... 148

Figure B-1: Inverse squared distance weighted interpolation to obtain station-level modeled daily heat metric estimates...... 182

Figure B-2: Generating population-weighted county-level modeled daily heat metric estimates ...... 183

Figure B-3: Schematic showing exposure offset indicators ...... 187


Figure B-4: Urban/Rural (Urbanicity) classification of counties in the U.S...... 188

Figure B-5: Station-level correlation as a function of proximity to the U.S. coastline .. 189

Figure B-6: Bland Altman plots for Tmax at SEARCH locations ...... 191

Figure B-7: Bland Altman plots for HImax at SEARCH locations ...... 192

Figure B-8: Bland Altman plots for Tavg at SEARCH locations ...... 193

Figure B-9: Comparison of mean (95% CI) EHE effect from ASOS- and NLDAS-based exposure estimates for Definition 1 ...... 194

Figure B-10: Comparison of mean (95% CI) EHE effect from ASOS- and NLDAS-based exposure estimates for Definition 2 ...... 195

Figure B-11: Comparison of mean (95% CI) EHE effect from ASOS- and NLDAS-based exposure estimates for Definition 3 ...... 196

Figure B-12: Comparison of mean (95% CI) EHE effect from ASOS- and NLDAS-based exposure estimates for Definition 4 ...... 197

Figure B-13: Comparison of mean (95% CI) EHE effect from ASOS- and NLDAS-based exposure estimates for Definition 5 ...... 198

Figure C-1: Daily mean inverse wind speed by month in 27 cities ...... 199

Figure C-2: Daily mean relative humidity by month in 27 cities ...... 200

Figure C-3: Daily maximum temperature by month in 27 cities ...... 201

Figure C-4: Daily cloud-adjusted net solar insolation by month in 27 cities ...... 202

Figure C-5: Daily maximum temperature distribution by EHE definitions in 27 cities on (A) non-EHE day, and (B) EHE day ...... 203

Figure C-6: Daily mean inverse wind speed distribution by EHE definitions in 27 cities on (A) non-EHE day, and (B) EHE day ...... 204

Figure C-7: Daily maximum relative humidity by EHE definitions in 27 cities on (A) non-EHE day, and (B) EHE day...... 205

Figure C-8: Scatter plot showing relationship between meteorological variables used in this analysis ...... 206

Figure C-9: Effect modification of the RH-ozone relationship during EHEs ...... 218

Figure D-1: Annual averages of PM2.5: (a) Monitor (AQS/SEARCH); (b) AOD; (c) CMAQ; (d) DS ...... 223


Figure D-2: Description of the grid cell neighborhood around SEARCH monitors ...... 224

Figure D-3: Comparison of model- and SEARCH-based PM2.5 concentrations, when measurements from nearby AQS monitor are present ...... 225

Figure D-4: Comparison of model- and SEARCH-based PM2.5 concentrations, when measurements from nearby AQS monitor are absent ...... 226

Figure D-5: Comparison of SEARCH- and nearby AQS-based PM2.5 measurements ... 227

Figure E-1: U.S. climate regions ...... 228

Figure E-2: Schematic showing exposure offset indicators ...... 231

Figure E-3: Unit lifetime work lost costs in millions ($) by age and gender...... 233

Figure E-4: Air pollution levels on EHE and non-EHE days for definition 1 (Daily maximum heat index greater than 90 ℉ for at least 3 consecutive days) ...... 234

Figure E-5: Air pollution levels on EHE and non-EHE days for definition 2 (Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days)...... 235

Figure E-6: Air pollution levels on EHE and non-EHE days for definition 3 (Daily maximum temperature greater than 95th percentile for at least 2 consecutive days) ...... 236

Figure E-7: Air pollution levels on EHE and non-EHE days for definition 4 (Huth definition) ...... 237

Figure E-8: Air pollution levels on EHE and non-EHE days for definition 5 (Daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days) ...... 238

Figure E-9: E-R relationship between mortality and EHEs for Southeast based on different top 10 EHE definitions ...... 240

Figure E-10: E-R relationship between mortality and EHEs for West based on different top 10 EHE definitions ...... 241


SUMMARY

An extreme heat event (EHE), or heat wave, is a sustained period of substantially hotter and/or more humid weather. EHEs cause a wide range of health problems such as rashes, cramps, heat exhaustion, heat stroke, and in some instances, death. While the negative consequences of EHEs on health are understood, there is limited information on the extent of region-specific adverse health and economic impacts resulting from EHEs. Further, estimating excess deaths or economic costs associated with EHEs is impeded by several constraints. The major constraints include a lack of scientific consensus on the ideal EHE definition, inadequate understanding of the role of other environmental exposures during EHEs (such as air pollutants) in modifying the health risk attributable to EHEs, and limited access to high-quality, fine-resolution environmental and health datasets for conducting a robust region-specific analysis. The overarching goal of this study is to improve the understanding of the adverse environmental and health impacts of EHEs in the United States (U.S.), develop metrics to quantify the burden associated with EHEs, and lay the groundwork for the development of effective strategies to address multiple environmental stressors during periods of extreme heat.

There is no shortage of EHE definitions in the epidemiological and meteorological literature. Exploring the predictive power of EHE definitions for health research is challenging not only because of the sheer number of definitions, but also because of the difficulty in determining an appropriate health end point for evaluation. This study employs a hierarchical clustering technique to group EHE definitions into homogeneous sets and uses deaths that result from exposure to excessive natural heat as the health end point to identify EHE definitions suitable for extreme heat surveillance and research.


High temperatures prevailing during EHEs are conducive to the formation of certain air pollutants, but very little is known about the relationship between other meteorological variables and air pollutants during EHEs. Hence, it is worthwhile to examine the prevailing levels of meteorological variables on EHE and non-EHE days, and to evaluate whether EHEs encapsulate variations in multiple meteorological variables that are associated with higher air pollutant concentrations. The relationship between ozone and meteorology on EHE and non-EHE days can be successfully characterized using a multivariate autoregressive model and a logarithmic response for ozone. The effect modification of the relationship between meteorological variables and ozone on EHE days varies with the meteorological parameter under consideration, the climate region, and the EHE definition.

When conducting an assessment involving multiple environmental variables, the limiting factor is often the availability of highly resolved exposure data that align with the resolution of health data. Air quality measurements and station-based meteorological variables are deemed accurate but are limited in geographic scope. Alternatively, data from mechanistic and deterministic models, which are available over continuous spatial and temporal scales, can be used to assign exposure to populations. However, the utility of such data should be weighed against any potential bias and variability, and a rigorous evaluation of modeled exposure data is warranted before full-scale adoption. Comparing modeled estimates against an independent set of measurements, and using health-based metrics to evaluate meaningful differences, sheds light on the pros and cons of exposure estimates generated from modeled data.


The association between air pollution and health, especially mortality, is well understood. However, the role of air pollutants in modifying the relationship between EHEs and mortality is not well characterized in the U.S., yet it is critical to generating accurate estimates of health burden. Further, in this work, the sensitivities associated with selecting an EHE definition are taken into consideration when providing region-specific estimates of the health and economic burden associated with EHEs. Finally, the framework presented in this work for generating excess deaths and costs could be useful for studying and quantifying the adverse health impact of EHEs in either a prospective or a retrospective setting.


CHAPTER 1 Introduction

In the United States (U.S.), extreme temperatures account for far more deaths than hurricanes, lightning, tornadoes, floods, and earthquakes combined (Thacker et al. 2008). According to the Centers for Disease Control and Prevention (CDC), a total of 7,233¹ heat-related deaths were reported between 1999 and 2009 (Fowler et al. 2013).

An extreme heat event (EHE) is defined as a sustained period of abnormally and uncomfortably hot, and usually humid, weather (Meehl and Tebaldi 2004). EHEs can negatively impact vital aspects of society, including agriculture, power production and consumption, and human health (National Research Council [U.S.] et al. 2010; Parry et al. 2007). The Third National Climate Assessment report states that there will be statistically significant increases in simulated annual mean temperatures across the contiguous United States for both A1 and B2 climate scenarios² (Arnell et al. 2004; Meehl et al. 2000). Adverse health outcomes associated with extreme heat are preventable, and it is imperative to understand the local characteristics of EHEs in order to identify such events in advance. Early identification or prediction of these events would allow for an adequate response, avoiding a number of public health risks.

¹ This figure represents the total count of deaths for which exposure to excessive natural heat (International Classification of Diseases (ICD) code X30) is listed as either an underlying or a contributing cause. Additionally, this figure includes X30 deaths among both the U.S. resident and non-U.S. resident populations.

² The different climate scenarios are based on the emission scenarios defined by the Intergovernmental Panel on Climate Change (IPCC). These emission scenarios are organized into groups based on different assumptions describing human activity in the future. A1 represents rapid economic growth in a homogeneous world; B2 represents an ecologically friendly, heterogeneous world with less rapid technological change.

In order to understand the adverse health impacts of EHEs, it is necessary to define what constitutes a heat episode. Many EHE definitions are available from the literature (Anderson and Bell 2011; Arguez et al. 2012; Burrows 1900; CDC 2013; Easterling et al. 2000; Hajat et al. 2006; Hajat et al. 2010; Huth et al. 2000; et al. 2014; Kovats and Hajat 2008; Meehl and Tebaldi 2004; Pascal et al. 2006; Pascal et al. 2013; Peng et al. 2011; Robinson 2001; Zaitchik et al. 2014). As exemplified in these studies, EHEs are defined based on deviations of meteorological variables (e.g., temperature) from the norm. A majority of studies apply one definition to all climate regions and hence neglect climate adaptation by resident populations. Studies that have extensively evaluated EHEs are limited to a few geographic areas (Gasparrini and Armstrong 2011; Hajat et al. 2010; Ishigami et al. 2008), and extending definitions from such studies to non-study areas could result in misidentification of EHEs. As a whole, there is a lack of consensus in the environmental health literature on definitions and procedures to accurately identify periods of extreme heat with adverse health impacts.
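To make the notion of a threshold-and-duration EHE definition concrete, the following minimal sketch flags EHE days as runs of consecutive days on which daily maximum temperature exceeds a location-specific percentile. The function names, the inclusive-quantile convention, and the sample values are illustrative assumptions, not the procedure used in this dissertation.

```python
from statistics import quantiles


def percentile_threshold(values, pct):
    """Return the pct-th percentile (1 to 99) of a sample (hypothetical helper)."""
    cuts = quantiles(values, n=100, method="inclusive")  # 99 cut points
    return cuts[pct - 1]


def flag_ehe_days(tmax, threshold, min_run=2):
    """Flag days that belong to a run of at least min_run consecutive days
    with daily maximum temperature strictly above threshold."""
    flags = [False] * len(tmax)
    run_start = None
    # A sentinel value below any threshold closes a run that reaches the last day.
    for i, t in enumerate(list(tmax) + [float("-inf")]):
        if t > threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = None
    return flags


# Illustrative daily maximum temperatures (degrees F), not real data.
tmax = [85, 96, 97, 90, 98, 88, 99, 99, 99]
ehe = flag_ehe_days(tmax, threshold=95, min_run=2)
```

Because the threshold is computed per location, the same rule yields different absolute temperatures in different climate regions, which is one way a definition can reflect local adaptation.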

Several studies have confirmed a relationship between air pollution and human health (Samet 2005). Several epidemiologic and human clinical studies have examined the cardiovascular and respiratory health effects of both acute and long-term exposures to air pollution (Rom and Samet 2006). The Medicare Air Pollution Study (MCAPS) reported a short-term increase in hospital admission rates associated with elevated ambient PM2.5 concentrations for health outcomes such as ischemic heart disease, heart failure, chronic obstructive pulmonary disease, and respiratory infection (Dockery et al. 1993; Dockery and Pope 1994; Pope III et al. 2002). Tropospheric ozone, a criteria pollutant regulated under the Clean Air Act of the U.S., is also known to adversely impact health. Numerous studies have identified a positive relationship between ambient ozone exposure and hospitalization/emergency department visits for respiratory diseases, such as asthma and chronic obstructive pulmonary disease (Burnett et al. 1997; Strickland et al. 2010). Studies have also shown an association between short- and long-term air pollutant exposure and mortality (Bell et al. 2004; Bell et al. 2005; Jerrett et al. 2009; Pope et al. 2001).

Meteorology plays a dominant role in the formation of air pollutants. In particular, extremely high temperatures are conducive to the formation of certain air pollutants. The European summer of 2003 was exceptionally warm, with an unprecedented 15-day long heat wave. In alone, there were 14,800 excess deaths during this 2-week heat wave period (Vautard et al. 2005). During this heat wave, many western and central European countries recorded the highest ozone concentrations on record since the late 1980s (Solberg et al. 2008; Vautard et al. 2005). A study conducted by the U.S. Environmental Protection Agency (EPA) (Cox and Chu 1996) explored the relationship between meteorology and ozone for the years 1983-1993 and concluded that daily maximum 8-hr ozone levels were considerably higher for the hottest summer (1988) and that the lowest number of ozone exceedances was observed during the coolest summer (1992).

Observed correlations of PM2.5 total mass with meteorological variables are weaker than correlations between such variables and ozone; however, for the sulfate fraction of PM2.5, correlation increases with temperature (Jacob and Winner 2009). Although PM2.5 total mass is not strongly correlated with extreme temperatures, persistently high temperatures observed during EHEs could lead to an increase in certain types of emissions. For example, biogenic emissions (isoprene and monoterpene) can increase during periods of high temperatures, as plants tend to release these volatile organic compounds (VOCs) as a defense mechanism to combat heat stress (Benjamin et al. 1996; Geron et al. 2006; Sharkey et al. 2008). Additionally, the escalation of air conditioning use on extreme temperature days leads to a higher electricity demand from electricity generating units (EGUs), which in turn leads to emissions of oxides of nitrogen (NOx) (He et al. 2013).

While it is believed that most of the excess mortality and morbidity during EHEs is associated with extreme temperatures, a recent study conducted in (Analitis et al. 2014) concluded that heat wave-related mortality was 54% higher on high ozone days compared with low ozone days among people aged 75-84. Hence, it is worth investigating the role of air pollutants in causing adverse health effects during these periods.

Health studies that require weather data from stations and air pollution data from monitors are limited by data availability and completeness. Data from weather stations are available from the National Climatic Data Center (NCDC); however, these stations are limited in geographic scope. Similarly, ambient air monitoring data are available from the Environmental Protection Agency's (EPA) Air Quality System (AQS). However, these AQS-based monitors have limited spatial coverage, and many monitors do not sample for PM2.5 on a daily basis (Vaidyanathan et al. 2013). Further, assigning population-level exposures using station- and monitor-based data is constrained by the fact that some stations and monitors are located in non-residential areas or in remote places (Gallo et al. 1996).

Alternatively, modeled exposure data, which are available over continuous spatial and temporal scales, can be used to assign exposure to populations. Meteorological data from models have found use in air pollution modeling, weather forecasting, and various other climatological predictions (Aiyyer et al. 2007; Glahn and Lowry 1972; Michalakes et al. 2001; Ritter and Geleyn 1992). Similarly, statistical fusion models, which combine air quality measurements from monitors with predictions from numerical deterministic simulation models, have been used to fill temporal and spatial gaps in ambient air monitoring data (Berrocal et al. 2010, 2012; Fuentes et al. 2006; McMillan et al. 2009). However, the utility of modeled exposure estimates for extreme heat surveillance and research should be weighed against any potential bias and variability present in these estimates, and an evaluation is needed before full-scale adoption.

Health impacts are usually estimated to be the largest adverse consequence of EHEs when measured in economic terms using standard valuation approaches, dominating other losses such as damage to crops and ecosystems (Yang et al. 2005). The most severe adverse health outcome associated with EHEs is death, for which losses to society and the economy extend from the point of premature death until the time the person would otherwise have died of other causes. For example, one study (Kovats and Hajat 2008) estimated 22,080 excess deaths in England, Wales, France, Italy, and Portugal during and immediately after the heat waves of the summer of 2003. Similarly, the , which lasted only for five days, resulted in 750 deaths (Semenza et al. 1996). Many of the excess deaths during these heat episodes were related to cardiovascular, cerebrovascular, and respiratory causes, which are mortality endpoints that are also associated with air pollution.

The overall goal of this dissertation is to assess the environmental and health impacts associated with EHEs in the United States (U.S.). The objectives of the individual chapters are described below.


Chapter 2: Region-specific Evaluation of Extreme Heat Event Definitions Using Heat Mortality Data: A Comprehensive National Assessment with a Public Health Focus

In this chapter, we describe a region-specific evaluation of EHE definitions using heat mortality data. We use station-based meteorology data from the National Climatic Data Center and heat mortality data from the National Center for Health Statistics for the years 1999-2009 to conduct this evaluation. We employ a combination of hierarchical cluster analysis and negative binomial rate regression methods to identify EHE definitions that are closely associated with heat-related mortality. This chapter provides insights into the spatial and temporal distribution of EHEs nationally and sheds light on regional variations in the susceptibility of populations to extreme heat.
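As a rough illustration of how EHE definitions might be grouped into similar sets, the sketch below clusters binary EHE-day series by how often they disagree. The distance measure (fraction of days in disagreement), the average-linkage rule, and the definition names are assumptions chosen for illustration; the chapter itself specifies the actual clustering inputs and settings.

```python
from itertools import combinations


def disagreement(a, b):
    """Fraction of days on which two binary EHE-day series disagree."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_definitions(series, n_clusters):
    """Average-linkage agglomerative clustering of named binary day series.

    series maps a definition name to a list of 0/1 EHE-day indicators.
    Clusters are merged pairwise until n_clusters remain.
    """
    clusters = [[name] for name in series]

    def linkage(c1, c2):
        # Mean pairwise disagreement between members of the two clusters.
        pairs = [(p, q) for p in c1 for q in c2]
        return sum(disagreement(series[p], series[q]) for p, q in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(c) for c in clusters]


# Hypothetical EHE-day indicators for four definitions over eight days.
series = {
    "def_A": [1, 1, 0, 0, 0, 1, 1, 0],
    "def_B": [1, 1, 0, 0, 0, 1, 0, 0],
    "def_C": [0, 0, 1, 1, 0, 0, 0, 1],
    "def_D": [0, 0, 1, 1, 1, 0, 0, 1],
}
groups = cluster_definitions(series, n_clusters=2)
```

One representative definition per group could then be carried forward into the rate regression step, which keeps the number of fitted models manageable.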

Chapter 3: Exploring the Utility of Modeled Meteorology Data for Extreme Heat-related Health Research and Surveillance

In this chapter, we assess the utility of modeled meteorology data from the North American Land Data Assimilation System (NLDAS) model for use in extreme heat-related health research and surveillance in areas without meteorological measurements. We evaluate the performance of model-based predictions using measurements from stations in the Southeastern Aerosol Research and Characterization (SEARCH) network and conduct a county-level health analysis using heat-related mortality data. The results generated from station- and model-based exposure estimates are compared.

Chapter 4: Characterizing the Relationship between Ozone and Meteorology during Extreme Heat Events


In this chapter, we explore the effect of meteorological variables on ozone levels, conditioned on EHE and non-EHE days, and quantify the degree of effect modification by EHEs at city and regional scales. We use station-based meteorology data from the National Climatic Data Center and ozone measurements from EPA for 27 cities in the U.S., representing different climate regions. We fit a city-specific multivariate autoregressive model to control for the autocorrelation of residuals, and use a logarithmic response for ozone to model the relationship between meteorological parameters and ozone. We conduct a summary-level pooled analysis, considering the heterogeneity in effect sizes arising from EHE definitions, to generalize the effect of meteorology on ozone for each city and climate region.

Chapter 5: Assessment of Modeled PM2.5: A Public Health Perspective

In this chapter, we conduct an assessment in the Southeastern U.S. to evaluate the accuracy and utility of model-based PM2.5 predictions against measurements, and to compare linked metrics of air quality and health created from model- and monitor-based estimates of PM2.5. We consider predictions from the Community Multiscale Air Quality (CMAQ), Bayesian space-time Downscaler (DS), and Aerosol Optical Depth (AOD) based models. We quantify the variability, bias, and the change in mortality rate associated with a 25% reduction in annual PM2.5 levels based on the modeled predictions and measurements from the SEARCH and AQS-based monitors.

Chapter 6: Monetizing Health Burden Associated with Extreme Heat Events: Exploring the Role of Air Pollution and the Sensitivity Associated with Heat Wave Definitions in the Excess Death Estimation Process


In this chapter, we explore region-specific interactions between air pollution and EHEs, and their collective impact on mortality. We model the region-specific mortality risks (rate ratios) associated with EHEs using a negative binomial rate regression model. We implement factor analysis to create a composite air pollution score and use it as a predictor along with county-level adult smoking prevalence, air conditioning prevalence, and proportion of Hispanic population. We compute region-specific excess deaths and monetize them using standard economic metrics.

Chapter 7: Summary of Conclusions and Future Research

A summary of the key conclusions of this dissertation is presented and potential future research directions are discussed.

1.1 References

1. Aiyyer A, Cohan D, Russell A, Stockwell W, Tanrikulu S, Vizuete W, et al. 2007. Final report: Third peer review of the CMAQ model. Submitted to the Community Modeling and Analysis System Center, Carolina Environmental Program, The University of North Carolina at Chapel Hill, 23pp.

2. Analitis A, Michelozzi P, D'Ippoliti D, De'Donato F, Menne B, Matthies F, et al. 2014. Effects of heat waves on mortality: Effect modification and confounding by air pollutants. Epidemiology 25:15-22.

3. Anderson GB, Bell ML. 2011. Heat waves in the United States: Mortality risk during heat waves and effect modification by heat wave characteristics in 43 U.S. communities. Environmental health perspectives 119:210-218.

4. Arguez A, Durre I, Applequist S, Vose RS, Squires MF, Yin XG, et al. 2012. NOAA's 1981-2010 U.S. climate normals: An overview. B Am Meteorol Soc 93:1687-1697.


5. Arnell N, Livermore MJ, Kovats S, Levy PE, Nicholls R, Parry ML, et al. 2004. Climate and socio-economic scenarios for global-scale climate change impacts assessments: Characterising the SRES storylines. Global Environmental Change 14:3-20.

6. Bell ML, McDermott A, Zeger SL, Samet JM, Dominici F. 2004. Ozone and short-term mortality in 95 US urban communities, 1987-2000. JAMA : the Journal of the American Medical Association 292:2372-2378.

7. Bell ML, Dominici F, Samet JM. 2005. A meta-analysis of time-series studies of ozone and mortality with comparison to the National Morbidity, Mortality, and Air Pollution Study. Epidemiology 16:436-445.

8. Benjamin MT, Sudol M, Bloch L, Winer AM. 1996. Low-emitting urban forests: A taxonomic methodology for assigning isoprene and monoterpene emission rates. Atmospheric Environment 30:1437-1452.

9. Berrocal VJ, Gelfand AE, Holland DM. 2010. A spatio-temporal downscaler for output from numerical models. Journal of agricultural, biological, and environmental statistics 15:176-197.

10. Berrocal VJ, Gelfand AE, Holland DM. 2012. Space-time data fusion under error in computer model output: An application to modeling air quality. Biometrics 68:837-848.

11. Burnett RT, Brook JR, Yung WT, Dales RE, Krewski D. 1997. Association between ozone and hospitalization for respiratory diseases in 16 Canadian cities. Environmental research 72:24-31.


12. Burrows AT. 1900. Hot waves: Conditions which produce them, and their effect on agriculture. Yearbook of the U.S. Department of Agriculture 22:325-336.

13. CDC. 2013. Heat illness and deaths--New York City, 2000-2011. MMWR Morbidity and mortality weekly report 62:617-621.

14. Cox WM, Chu SH. 1996. Assessment of interannual ozone variation in urban areas from a climatological perspective. Atmospheric Environment 30:2615-2625.

15. Dockery DW, Pope CA, Xu X, Spengler JD, Ware JH, Fay ME, et al. 1993. An association between air pollution and mortality in six U.S. cities. New Engl J Med 329:1753-1759.

16. Dockery DW, Pope CA. 1994. Acute respiratory effects of particulate air pollution. Annual review of public health 15:107-132.

17. Easterling DR, Meehl GA, Parmesan C, Changnon SA, Karl TR, Mearns LO. 2000. Climate extremes: Observations, modeling, and impacts. Science 289:2068-2074.

18. Fowler DR, Mitchell CS, Brown A, Pollock T, Bratka LA, Paulson J, et al. 2013. Heat-related deaths after an extreme heat event - four states, 2012, and United States, 1999-2009. Mmwr-Morbid Mortal W 62:433-436.

19. Fuentes M, Song HR, Ghosh SK, Holland DM, Davis JM. 2006. Spatial association between speciated fine particles and mortality. Biometrics 62:855-863.

20. Gallo KP, Easterling DR, Peterson TC. 1996. The influence of land use/land cover on climatological values of the diurnal temperature range. Papers in Natural Resources:191.

21. Gasparrini A, Armstrong B. 2011. The impact of heat waves on mortality. Epidemiology 22:68-73.


22. Geron C, Guenther A, Greenberg J, Karl T, Rasmussen R. 2006. Biogenic volatile organic compound emissions from desert vegetation of the Southwestern U.S. Atmospheric Environment 40:1645-1660.

23. Glahn HR, Lowry DA. 1972. The use of model output statistics (MOS) in objective weather forecasting. J Appl Meteorol 11:1203-1211.

24. Hajat S, Armstrong B, Baccini M, Biggeri A, Bisanti L, Russo A, et al. 2006. Impact of high temperatures on mortality: Is there an added heat wave effect? Epidemiology 17:632-638.

25. Hajat S, Sheridan SC, Allen MJ, Pascal M, Laaidi K, Yagouti A, et al. 2010. Heat-health warning systems: A comparison of the predictive capacity of different approaches to identifying dangerously hot days. American journal of public health 100:1137-1144.

26. He H, Hembeck L, Hosley KM, Canty TP, Salawitch RJ, Dickerson RR. 2013. High ozone concentrations on hot days: The role of electric power demand and NOx emissions. Geophysical Research Letters 40:5291-5294.

27. Huth R, Kysely J, Pokorna L. 2000. A GCM simulation of heat waves, dry spells, and their relationships to circulation. Climatic Change 46:29-60.

28. Ishigami A, Hajat S, Kovats RS, Bisanti L, Rognoni M, Russo A, et al. 2008. An ecological time-series study of heat-related mortality in three European cities. Environmental health : a global access science source 7:5.

29. Jacob DJ, Winner DA. 2009. Effect of climate change on air quality. Atmospheric Environment 43:51-63.


30. Jerrett M, Burnett RT, Pope III CA, Ito K, Thurston G, Krewski D, et al. 2009. Long-term ozone exposure and mortality. New Engl J Med 360:1085-1095.

31. Kent ST, McClure LA, Zaitchik BF, Smith TT, Gohlke JM. 2014. Heat waves and health outcomes in Alabama (USA): The importance of heat wave definition. Environmental health perspectives 122:151-158.

32. Kovats RS, Hajat S. 2008. Heat stress and public health: A critical review. Annual review of public health 29:41-55.

33. McMillan NJ, Holland DM, Morara M, Feng J. 2009. Combining numerical model output and particulate data using Bayesian space-time modeling. Environmetrics:n/a-n/a.

34. Meehl GA, Karl T, Easterling DR, Changnon S, Pielke Jr R, Changnon D, et al. 2000. An introduction to trends in extreme weather and climate events: Observations, socioeconomic impacts, terrestrial ecological impacts, and model projections. B Am Meteorol Soc 81:413-416.

35. Meehl GA, Tebaldi C. 2004. More intense, more frequent, and longer lasting heat waves in the 21st century. Science 305:994-997.

36. Michalakes J, Chen S, Dudhia J, Hart L, Klemp J, Middlecoff J, et al. Development of a next generation regional weather research and forecast model. In: Proceedings of the Developments in Teracomputing: Proceedings of the Ninth ECMWF Workshop on the use of high performance computing in meteorology, 2001, Vol. 1. World Scientific, 269-276.

37. National Research Council (U.S.). Committee on America's Climate Choices., National Research Council (U.S.). Panel on Advancing the Science of Climate Change., National Research Council (U.S.). Board on Atmospheric Sciences and Climate. 2010. Advancing the science of climate change. Washington, D.C.:National Academies Press.

38. Parry ML, Intergovernmental Panel on Climate Change. Working Group II., World Meteorological Organization., United Nations Environment Programme. 2007. Climate change 2007 : Impacts, adaptation and vulnerability : Summary for policymakers, a report of Working Group II of the Intergovernmental Panel on Climate Change and technical summary, a report accepted by Working Group II of the IPCC but not yet approved in detail : Part of the Working Group II contribution to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change. S.l.:WMO : UNEP.

39. Pascal M, Laaidi K, Ledrans M, Baffert E, Caserio-Schonemann C, Le Tertre A, et al. 2006. France's heat health watch warning system. International journal of biometeorology 50:144-153.

40. Pascal M, Wagner V, Le Tertre A, Laaidi K, Honore C, Benichou F, et al. 2013. Definition of temperature thresholds: The example of the French heat wave warning system. International journal of biometeorology 57:21-29.

41. Peng RD, Bobb JF, Tebaldi C, McDaniel L, Bell ML, Dominici F. 2011. Toward a quantitative estimate of future heat wave mortality under global climate change. Environmental health perspectives 119:701-706.

42. Pope CA, Burnett R, Ito K, Krewski D, Thurston GD. 2001. Mortality effects of long-term exposure to air pollution: Analysis of extended follow-up of ACS cohort. Epidemiology 12:S44-S44.


43. Pope III CA, Burnett RT, Thun MJ, Calle EE, Krewski D, Ito K, et al. 2002. Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. JAMA : the journal of the American Medical Association 287:1132-1141.

44. Ritter B, Geleyn J-F. 1992. A comprehensive radiation scheme for numerical weather prediction models with potential applications in climate simulations. Monthly Weather Review 120:303-325.

45. Robinson PJ. 2001. On the definition of a heat wave. J Appl Meteorol 40:762-775.

46. Rom WN, Samet JM. 2006. Small particles with big effects. Am J Resp Crit Care 173:365-366.

47. Samet JM. 2005. The perspective of the National Research Council's Committee on Research Priorities for Airborne Particulate Matter. Journal of Toxicology and Environmental Health, Part A 68:1063-1067.

48. Semenza JC, Rubin CH, Falter KH, Selanikio JD, Flanders WD, Howe HL, et al. 1996. Heat-related deaths during the July 1995 heat wave in Chicago. The New England journal of medicine 335:84-90.

49. Sharkey TD, Wiberley AE, Donohue AR. 2008. Isoprene emission from plants: Why and how. Annals of botany 101:5-18.

50. Solberg S, Hov Ø, Søvde A, Isaksen I, Coddeville P, De Backer H, et al. 2008. European surface ozone in the extreme summer 2003. Journal of Geophysical Research: Atmospheres (1984–2012) 113.

51. Strickland MJ, Darrow LA, Klein M, Flanders WD, Sarnat JA, Waller LA, et al. 2010. Short-term associations between ambient air pollutants and pediatric asthma emergency department visits. Am J Resp Crit Care 182:307.


52. Thacker MT, Lee R, Sabogal RI, Henderson A. 2008. Overview of deaths associated with natural events, United States, 1979-2004. Disasters 32:303-315.

53. Vaidyanathan A, Dimmick WF, Kegler SR, Qualters JR. 2013. Statistical air quality predictions for public health surveillance: Evaluation and generation of county level metrics of PM2.5 for the environmental public health tracking network. International journal of health geographics 12.

54. Vautard R, Honore C, Beekmann M, Rouil L. 2005. Simulation of ozone during the August 2003 heat wave and emission control scenarios. Atmospheric Environment 39:2957-2967.

55. Yang T, Matus K, Paltsev S, Reilly J. 2005. Economic benefits of air pollution regulation in the USA: An integrated approach. MIT Joint Program on the Science and Policy of Global Change.

56. Zaitchik D, Iqbal Y, Carey S. 2014. The effect of executive function on biological reasoning in young children: An individual differences study. Child development 85:160-175.


CHAPTER 2 Region-specific Evaluation of Extreme Heat Event Definitions Using Heat Mortality Data: A Comprehensive National Assessment with a Public Health Focus³

2.1 Introduction

In the United States (U.S.), extreme temperatures account for far more deaths in most years than hurricanes, lightning, tornadoes, floods, and earthquakes combined (Thacker et al. 2008). According to the Centers for Disease Control and Prevention (CDC), a total of 7,233 heat-related deaths were reported between 1999 and 2009 (Fowler et al. 2013). An extreme heat event (EHE) is defined as a sustained period of abnormally and uncomfortably hot, and usually humid, weather (Meehl and Tebaldi 2004). EHEs can negatively impact vital aspects of society, including agriculture, power production and consumption, and human health (National Research Council [U.S.] et al. 2010; Parry et al. 2007). The Third National Climate Assessment report states that there will be a statistically significant increase in simulated annual mean temperatures across the contiguous United States for both A1 and B2 climate scenarios⁴ (Meehl et al. 2000). Adverse health outcomes associated with extreme heat are often preventable, and it is imperative to understand the local characteristics of EHEs so that such events can be identified in advance and an adequate response mounted to reduce the public health risk.

³ This work was presented at the American Meteorological Society meeting in February 2014.

⁴ The different climate scenarios are based on the emission scenarios defined by the Intergovernmental Panel on Climate Change (IPCC). These emission scenarios are organized into groups based on different assumptions describing human activity in the future. A1 represents rapid economic growth in a homogeneous world; B2 represents an ecologically friendly, heterogeneous world with less rapid technological change.

In order to understand the adverse health impacts of EHEs, it is necessary to define what constitutes a heat episode. Typical EHE definitions can be decomposed into the following core variables:

1. Daily heat metric: Heat metrics such as daily maximum temperature, daily apparent temperature (heat index), and diurnal temperature difference are used in studies exploring EHE definitions.

2. Duration: Number of consecutive days of extreme heat needed to constitute an EHE. The minimum duration in existing definitions varies from two to four days.

3. Threshold type: Absolute (based on an observed absolute value of the daily heat metric that does not change) or relative (based on an exceedance above a set percentile, which varies depending on the underlying daily heat metric distribution for a given location).

4. Intensity: Indicates the extremity of deviation that is required to constitute an EHE. Most definitions refer to exceedances above absolute thresholds, such as 90, 95, 100, or 105 degrees Fahrenheit (℉), or above the 95th, 97th, 98th, or 99th percentiles.

Many EHE definitions are available from the literature (Anderson and Bell 2011; Arguez et al. 2012; Burrows 1900; CDC 2013; Easterling et al. 2000; Hajat et al. 2006; Hajat et al. 2010; Huth et al. 2000; Kent et al. 2014; Kovats and Hajat 2008; Meehl and Tebaldi 2004; Pascal et al. 2006; Pascal et al. 2013; Peng et al. 2011; Robinson 2001; Zaitchik et al. 2014). As exemplified above, EHEs are defined based on deviations of meteorological variables (e.g., temperature) from the norm. A majority of studies apply one definition to all climate regions and hence neglect potential climate adaptation by resident populations. Studies that have extensively evaluated EHEs are limited to a few geographic areas (Gasparrini and Armstrong 2011; Hajat et al. 2010; Ishigami et al. 2008), and extending definitions from such studies to non-study areas could result in misidentification of EHEs in terms of human health effects. Some published studies have evaluated EHE definitions using health data (Anderson and Bell 2009; Hajat et al. 2010; Kent et al. 2014; Pascal et al. 2013), but almost all of the studies conducted nationally failed to evaluate EHE definitions using daily heat-related mortality data. As a whole, there is a lack of consensus in the environmental health literature on definitions and procedures to accurately identify periods of extreme heat having the potential for adverse health impacts. Hence, it is important to evaluate EHE definitions using health outcomes with clear causal links, such as heat-related mortality, to identify those definitions most strongly associated with adverse health effects. In this study, we hypothesize that people in different climatic regions might have varying susceptibility to extreme heat, which motivates a region-specific investigation of extreme heat and associated heat-related mortality. Additionally, we anticipate that the most appropriate definitions of EHEs may vary with climate region.

2.2 Methods

2.2.1 Meteorology data

We used station-based meteorology data for years 1999-2009, and any county in the conterminous U.S. (lower 48 states) that had an automated surface observing system (ASOS) unit was included in this evaluation. Spatial coverage of ASOS stations is shown in Figure 2-1.


Figure 2-1: Spatial coverage of ASOS weather stations

Further, we checked on the completeness of hourly and daily meteorology data used in this analysis. For each station we set a daily completeness threshold of 75% for hourly observations in a given day (at least 18 of 24 hourly measurements available) for computing daily summaries of the heat metric. For each county we calculated an average of all available daily station-based summaries to create county-level estimates of daily weather variables. We then applied a 95% completeness threshold for the daily county-level estimates of the heat metric across the summer months (May 1 through September 30). Finally, we only included counties for which sufficiently complete data (based on the above-mentioned criteria) were available for all 11 years (1999-2009) of the analysis period.
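The completeness rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, missing observations are represented as `None`, and the daily summary shown is a daily maximum (other heat metrics would substitute a different summary).

```python
HOURLY_MIN = 18             # at least 18 of 24 hourly observations (75%)
SEASON_DAYS = 153           # May 1 through September 30
SEASON_MIN_FRACTION = 0.95  # county-level completeness threshold


def daily_station_max(hourly_values):
    """Daily maximum for one station-day, or None if fewer than 18 hourly values exist."""
    valid = [v for v in hourly_values if v is not None]
    if len(valid) < HOURLY_MIN:
        return None
    return max(valid)


def county_daily_mean(station_summaries):
    """Average of all available station summaries for one county-day."""
    valid = [v for v in station_summaries if v is not None]
    return sum(valid) / len(valid) if valid else None


def season_is_complete(county_daily_values):
    """True if at least 95% of the 153 summer days have a county-level estimate."""
    available = sum(v is not None for v in county_daily_values)
    return available / SEASON_DAYS >= SEASON_MIN_FRACTION
```

Under these rules a county-summer needs at least 146 of 153 days with data, and a county would be retained only if `season_is_complete` holds for every year from 1999 through 2009.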


2.2.2 EHE definitions and core variables

For this study, we considered a number of EHE definitions that have been used in public health research and/or widely cited in the literature. Table 2-1 summarizes the different combinations of core variables used to define an EHE in this analysis. We used daily maximum temperature (Tmax), daily maximum heat index (HImax), daily average temperature (Tavg), and a combination of Tmax and daily minimum temperature (Tmin) as daily heat metrics; all heat metrics were represented in ℉ and we used the formula cited in Robinson (2001) to compute HImax. We considered EHE definitions with both absolute and relative thresholds. Absolute thresholds were set at various intensity values, including 90, 95, 100, and 105 ℉. Relative thresholds were calculated using two different approaches. First, we calculated percentile-based relative thresholds representing different intensities, including the 80th, 85th, 90th, 95th, 98th, and 99th percentile values and, for one definition (Huth⁵), the 81st and 97.5th percentile values. We computed these percentiles using heat metric data for the summer months for years 1999-2009. Second, we obtained station-level climate normal information from the National Climatic Data Center (NCDC), i.e., the mean and standard deviation (SD) of daily heat metrics computed based on data from 1981-2010, and set thresholds at the mean plus one or two SDs (Table 2-1); climate normals were unavailable for the heat index. We implemented EHE definitions with minimum durations, i.e., the number of consecutive days needed to constitute an EHE, ranging from two to four days. Varying minimum durations coupled with the various EHE base definitions resulted in a total of 92 variants (Table 2-1). Appendix Table A-1 provides precise details for each of these variants.

⁵ Per Huth’s definition, a heat wave is defined as the longest period of consecutive days satisfying the following three conditions: (1) the daily maximum temperature is above T1 (97.5th percentile) for at least 3 consecutive days; (2) the daily maximum temperature is above T2 (81st percentile) during the entire period; (3) the average of daily maximum temperature over the entire period is greater than T1.
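As a rough illustration of the three threshold types described in this section, the sketch below computes an absolute, a percentile-based, and a climate-normal-based threshold. The function names are hypothetical, and the percentile convention (linear interpolation between order statistics) is an assumption; the text does not state which convention was used.

```python
import math


def percentile(values, p):
    """p-th percentile using linear interpolation (an assumed convention)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def absolute_threshold(degrees_f):
    """Fixed threshold, e.g. 90, 95, 100, or 105 degrees Fahrenheit."""
    return float(degrees_f)


def relative_threshold(summer_values, p):
    """Percentile of a location's summer heat metric values (e.g. p = 95)."""
    return percentile(summer_values, p)


def climate_normal_threshold(normal_mean, normal_sd, k):
    """Climate-normal mean plus k standard deviations (k = 1 or 2)."""
    return normal_mean + k * normal_sd
```

A relative threshold adapts to the local distribution of the heat metric, so the same percentile corresponds to different temperatures in different counties, whereas an absolute threshold is identical everywhere.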

Table 2-1: Core variables used in extreme heat event definitions

Intensity columns:
  Absolute threshold (℉): 90, 95, 100, 105
  Relative threshold, percentile based: P80, P85, P90, P95, P98, P99, P81/P97.5
  Relative threshold, climate normal based: Mean + 1SD, Mean + 2SD

Daily heat metric                                        Duration    Intensity values marked (X)
Daily maximum temperature (Tmax)                         > 2 days    10
                                                         > 3 days    11
                                                         > 4 days    11
Daily average temperature (Tavg)                         > 2 days    10
                                                         > 3 days    11
                                                         > 4 days    11
Daily maximum heat index (HImax)                         > 2 days    8
                                                         > 3 days    9
                                                         > 4 days    9
Daily maximum and minimum temperature (Tmax & Tmin)      > 3 days    1
Huth*                                                    > 3 days    1

Legend: X marks distinguish definitions used in this analysis from definitions published in the literature.

* Per Huth’s definition, a heat wave is defined as the longest period of consecutive days satisfying the following three conditions: 1. The daily maximum temperature is above T1 (97.5th percentile) for at least 3 consecutive days; 2. The daily maximum temperature is above T2 (81st percentile) during the entire period; 3. The average of daily maximum temperature over the entire period is greater than T1.


We operationalized each EHE definition/variant⁶ as a binary (Yes (1) / No (0)) variable and thereby separately classified each day in each county during the summer months as either an “EHE day” or a “non-EHE day.”⁷ Days for which daily county-level data were not available could in some instances have interrupted a data sequence that might otherwise have qualified the surrounding days as EHE days. However, because of the high data completeness threshold employed, we believe any such effects to be minimal.
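A minimal sketch of this operationalization follows, assuming a single heat metric, a single threshold, and a strict exceedance test (`>`); the function name is hypothetical. Missing days are represented as `None` and break a run, which mirrors the interruption effect noted above.

```python
def flag_ehe_days(daily_metric, threshold, min_duration):
    """Return a 0/1 list marking days that fall in a run of at least
    min_duration consecutive days exceeding the threshold."""
    values = list(daily_metric)
    flags = [0] * len(values)
    run_start = None
    # A trailing None sentinel closes any run that reaches the last day.
    for i, value in enumerate(values + [None]):
        exceeds = value is not None and value > threshold
        if exceeds and run_start is None:
            run_start = i
        elif not exceeds and run_start is not None:
            if i - run_start >= min_duration:
                for j in range(run_start, i):
                    flags[j] = 1
            run_start = None
    return flags


tmax = [94, 96, 97, 93, 96, 96, 96, None, 96, 96]
two_day = flag_ehe_days(tmax, threshold=95, min_duration=2)
three_day = flag_ehe_days(tmax, threshold=95, min_duration=3)
```

With a 95 ℉ threshold, the two-day variant flags three separate runs in `tmax`, while the three-day variant flags only the middle run, which shows how the duration variable alone changes which days count as EHE days.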

2.2.3 Mortality and population data

We obtained mortality data from the National Center for Health Statistics (NCHS) National Vital Statistics System and extracted death records for years 1999-2009 based on International Classification of Diseases, 10th revision (ICD-10) external cause codes (Minino et al. 2011). Specifically, we selected death records for which exposure to excessive natural heat (ICD-10 code: X30) was listed as the underlying cause of death; the underlying cause of death is defined as the disease or injury that initiated the chain of events leading to death (Hanzlick et al. 2006). We summarized the extracted death records for the summer months to get counts of heat-related deaths by county and day.

We then assigned the data for each county to one of the nine U.S. climate regions, which are aggregations of states based on homogeneous long-term climatology (Figure 2-2); a description of these regions is available from the NCDC (http://www.ncdc.noaa.gov/monitoring-references/maps/us-climate-regions.php). Additionally, due to small death counts in the West North Central and Northwest regions, we combined these two regions into “North West Central.” We excluded counties that did not have meteorology data (or that did not meet the data completeness threshold) and made adjustments to account for county boundary changes that occurred between 1999 and 2009.

⁶ In subsequent chapters, the term EHE definition/variant is referred to simply as EHE definition.

⁷ We added a buffer of 3 days to the start and end of the summer months to account for any potential EHE that either started prior to May 1 and ended on or shortly after May 1, or started on or shortly before September 30 and ended in the early part of October. The buffer days were not included in the analysis.

Figure 2-2: U.S. climate regions

2.2.4 Evaluating EHE definitions using heat mortality data

Separately evaluating 92 different EHE definitions/variants becomes onerous and, hence, we used cluster analysis as a preliminary data reduction technique to group EHE definitions/variants into homogeneous sets. We differentiated any two EHE definitions/variants based on county-day disagreements between the binary variables representing the operationalized definitions. For a given county and year, the total count of daily disagreements between two definitions is provided by the sum of the off-diagonal frequencies as shown in Table 2-2. (This sum represents the squared Euclidean distance between two vectors of binary variables.) Because the main research focus is on human health effects, these counts were then weighted by the yearly county population estimates in order to ensure proportional representation. The population-weighted disagreement counts were then summed across counties (nationwide) and years to obtain an overall measure of disagreement (or distance) between the two EHE definitions/variants. A distance matrix containing the overall disagreement measures for all pairs of EHE definitions/variants (4,186 pairs) was used as input to the clustering procedure.
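The distance measure described above can be sketched as follows. The function names are hypothetical, and multiplying each county-year disagreement count by that county-year's population is an assumed reading of "weighted by the yearly county population estimates."

```python
from math import comb


def disagreement_count(flags_1, flags_2):
    """Off-diagonal sum (B + C in Table 2-2) for one county-year.
    For 0/1 vectors this equals the squared Euclidean distance."""
    return sum(a != b for a, b in zip(flags_1, flags_2))


def overall_distance(records):
    """Population-weighted disagreement summed over county-years.
    records: iterable of (population, flags_1, flags_2) tuples."""
    return sum(pop * disagreement_count(f1, f2) for pop, f1, f2 in records)


# Number of distinct pairs among 92 definitions/variants.
n_pairs = comb(92, 2)
```

The pair count follows from 92 × 91 / 2 = 4,186, which matches the size of the distance matrix reported in the text.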

Table 2-2: Two-way frequency table of daily agreements/disagreements for two operationalized EHE definitions/variants

                               EHE DEFINITION 1
                               YES      NO
EHE DEFINITION 2      YES      A        B
                      NO       C        D

We applied a hierarchical clustering technique, and employed an average distance metric to determine distances between clusters that might be merged in each step of the clustering process (Zhang et al. 1996). Average distance is calculated using the following formula:

$$D(C_a, C_b) = \frac{1}{n_a n_b} \sum_{i \in C_a} \sum_{j \in C_b} d(i, j) \qquad (1)$$

Ca and Cb are two disjoint clusters; na and nb are the number of members within clusters Ca and Cb, respectively; d is the Euclidean distance between two members of the two disjoint clusters.
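The average distance in equation (1) and a single merge step of the hierarchical procedure can be sketched as below. This is an illustrative pure-Python version with hypothetical function names, not the statistical software procedure used in the analysis; `dist` is assumed to be a symmetric matrix of pairwise distances indexed by definition number.

```python
def average_distance(cluster_a, cluster_b, dist):
    """Equation (1): mean pairwise distance between members of two disjoint clusters."""
    total = sum(dist[i][j] for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def merge_closest(clusters, dist):
    """One agglomerative step: merge the two clusters with the smallest
    average distance and return (new_clusters, merge_distance)."""
    best = None
    for x in range(len(clusters)):
        for y in range(x + 1, len(clusters)):
            d = average_distance(clusters[x], clusters[y], dist)
            if best is None or d < best[0]:
                best = (d, x, y)
    d, x, y = best
    merged = clusters[x] + clusters[y]
    remaining = [c for k, c in enumerate(clusters) if k not in (x, y)]
    return remaining + [merged], d


# Toy distance matrix for four definitions.
dist = [
    [0, 1, 6, 8],
    [1, 0, 5, 7],
    [6, 5, 0, 2],
    [8, 7, 2, 0],
]
step1, d1 = merge_closest([[0], [1], [2], [3]], dist)
step2, d2 = merge_closest(step1, dist)
```

Repeating `merge_closest` until one cluster remains yields the full hierarchy, and the sequence of merge distances is what cluster diagnostics are computed from.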


We divided the final hierarchical cluster (one large cluster encompassing all definitions) into smaller clusters, taking into consideration various diagnostics including the overall R-squared, pseudo F, and pseudo T-squared indices. The pseudo F index describes the ratio of the between-cluster variance to the within-cluster variance and, in general, larger values of this index denote greater separation between clusters. The pseudo T-squared index quantifies the difference between two clusters that are about to be merged at any given step of the clustering process (Edens et al. 1999). Based on these diagnostics, we identified relatively distinct high-level clusters. One representative EHE definition was then selected from each high-level cluster. Candidate definitions were identified according to the following criteria: (1) EHE definitions/variants that are well-recognized in the literature; (2) application in studies conducted in the U.S.; and (3) application in nationally representative studies, i.e., those studies that covered the various climate regions of the U.S. Among the candidates meeting these criteria to the extent possible, we made our final selection of EHE definitions to reflect differentiated combinations of the core variables that are used to operationalize the definitions. For each representative EHE definition we considered different exposure offsets: no lag (i.e., no offset), 1-day lag, and 1-, 2-, and 3-day extended (post-heat wave) effects (Figure 2-3).

Figure 2-3: Schematic showing exposure offset indicators


2.2.5 Rate regression modeling

We applied rate regression models to evaluate the relationship between operationalized EHE definitions and heat-related deaths. The following model was used to estimate the death rate per person-day on a logarithmic scale for each EHE definition/variant and exposure offset combination:

$$\log\left(\frac{E[D]}{P}\right) = \alpha + \beta_{\text{region}} + \beta_{\text{EHE}} \cdot \text{EHE} + \beta_{\text{EHE.Region}} \cdot \text{EHE} \cdot \text{Region} \qquad (2)$$

with model terms defined as follows:

D: count of deaths for each combination of region, year, and EHE status⁸;

E[D]: expected count of deaths;

P: person-days of exposure for which D is measured;

α: intercept;

βregion: intercept offset for the climate region;

βEHE: parameter estimate for the binary variable referring to the EHE definition and exposure offset combination;

EHE: binary indicator variable for the operationalized EHE definition/variant and exposure offset combination;

βEHE.Region: parameter estimate for the interaction between region and EHE;

Region: climate region;

βk: parameter estimate for covariate k;

⁸ To facilitate reliable modeling diagnostics as well as convergence, data were collapsed according to a three-way stratification: climate region × year × EHE status (for the EHE definition/variant and exposure offset combination under consideration).

To compensate for overdispersion, we specified a negative binomial distribution for the death counts, with the log link shown in model (2). Using this modeling approach, we estimated a baseline rate of heat-related deaths (deaths in the absence of EHE), and an EHE rate of heat-related deaths (deaths in the presence of EHE). We termed the estimated increase (on a log scale) in the rate due to EHE the “EHE effect.”
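To make the link between model (2) and the reported quantities concrete, the sketch below converts a set of coefficients into a baseline rate, an EHE rate, and their ratio for one region. The function name and the coefficient values are hypothetical and chosen only for illustration; they are not estimates from this analysis.

```python
import math


def death_rates(alpha, beta_region, beta_ehe, beta_ehe_region):
    """Rates per person-day implied by model (2) for a single region.

    Baseline (EHE = 0): exp(alpha + beta_region)
    EHE (EHE = 1):      exp(alpha + beta_region + beta_ehe + beta_ehe_region)
    """
    baseline = math.exp(alpha + beta_region)
    ehe = math.exp(alpha + beta_region + beta_ehe + beta_ehe_region)
    return baseline, ehe, ehe / baseline


# Hypothetical coefficients, for illustration only.
baseline_rate, ehe_rate, rate_ratio = death_rates(
    alpha=math.log(1e-8), beta_region=0.2, beta_ehe=1.5, beta_ehe_region=0.3
)
ehe_effect = math.log(rate_ratio)  # region-specific EHE effect on the log scale
```

Because the model is linear on the log scale, the region-specific EHE effect is the sum of the main EHE coefficient and the interaction coefficient for that region, and the rate ratio is its exponential.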

We used the estimated EHE effects to identify the “best” EHE definition/variant and exposure offset combinations with respect to heat-related deaths. One might hypothesize that there is some “gold standard” EHE definition that best explains heat-related mortality; the various EHE definitions considered in this evaluation represent approximations to this hypothetical gold standard. The extent to which each operationalized EHE definition deviates from the hypothetical gold standard can be expected to materialize in the form of attenuation bias, i.e., weaker estimated EHE effects than might ideally be attained. By this reasoning, the strongest estimates, presumably those with the least attenuation bias, are assumed to best represent the gold standard. We tested this reasoning by simulating an “ideal” dataset, with health outcomes following a probability distribution conforming to an arbitrary gold standard EHE definition. Our simulation involved three basic steps: (1) we selected an arbitrary operationalized EHE definition/variant and exposure offset combination to represent the hypothetical gold standard and used it to estimate a corresponding model; (2) using the estimated model parameters, we simulated a new time series of heat-related deaths; (3) we re-estimated the EHE effect and the corresponding 95% confidence interval (CI) after introducing various random distortions to the originally selected EHE definition/variant and exposure offset combination (in the form of false positives, false negatives, or both). Under all forms of distortion, the EHE effect estimate and the confidence limits were routinely biased toward the null.
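The three simulation steps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the simulation actually run for this study: the daily death counts are drawn from a Poisson distribution rather than from the fitted negative binomial model, the effect is summarized by a crude rate ratio, and the parameter values (seed, number of days, EHE frequency, baseline rate, true ratio, and flip probability) are all hypothetical.

```python
import math
import random

def poisson(rng, lam):
    """Draw one Poisson variate (Knuth's method; adequate for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def crude_rate_ratio(flags, deaths):
    """Deaths per flagged day divided by deaths per unflagged day."""
    d1 = sum(d for f, d in zip(flags, deaths) if f)
    d0 = sum(d for f, d in zip(flags, deaths) if not f)
    n1 = sum(flags)
    n0 = len(flags) - n1
    return (d1 / n1) / (d0 / n0)

def simulate(seed=1, n_days=20000, p_ehe=0.05, base_rate=0.05,
             true_ratio=10.0, flip=0.2):
    rng = random.Random(seed)
    # Step 1: a "gold standard" EHE indicator.
    truth = [1 if rng.random() < p_ehe else 0 for _ in range(n_days)]
    # Step 2: simulate deaths whose rate depends on the gold standard.
    deaths = [poisson(rng, base_rate * (true_ratio if f else 1.0)) for f in truth]
    # Step 3: distort the indicator with random false positives and negatives.
    distorted = [1 - f if rng.random() < flip else f for f in truth]
    return crude_rate_ratio(truth, deaths), crude_rate_ratio(distorted, deaths)

clean_ratio, distorted_ratio = simulate()
print(f"gold standard: {clean_ratio:.1f}, distorted: {distorted_ratio:.1f}")
```

With these settings the distorted indicator yields a ratio well below the one obtained with the gold standard indicator, which is the attenuation toward the null described above.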


After the simulation exercise indicated that the attenuation bias concept is applicable to our analysis, we employed model (2) to identify the EHE definition/variant and exposure offset combinations having the strongest effect estimates. We evaluated each of the EHE definitions/variants selected as high-level cluster representatives crossed with the five exposure offsets, and ranked the results in descending order based on the lower confidence limit associated with each EHE effect estimate, by climate region. Further, to assess the region-specific differences in population-level susceptibility to extreme heat, we conducted a random-effects meta-analysis, by region, based on the 10 “best” region-specific EHE definition/variant and exposure offset combinations, to estimate the mean baseline rate, the mean EHE effect, and associated CIs for each region. We carried out our data analyses using the Statistical Analysis System (SAS® Version 9.3), Environmental Systems Research Institute’s GIS software (ESRI, ArcGIS® Version 9.3), and comprehensive meta-analysis software (CMA® Version 2.0).
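The ranking rule, ordering combinations by the lower confidence limit rather than by the point estimate, can be illustrated with a short sketch. The five rows are the Lag1 estimates for the Central region as reported in Table 2-5; the abbreviated definition labels and the variable names are ours.

```python
# (definition, exposure offset, EHE effect, lower 95% limit, upper 95% limit)
central_lag1 = [
    ("HImax > 90F, 3+ days", "Lag1", 14.8, 9.5, 23.0),
    ("Tmax and Tmin > 80th pct, 3+ days", "Lag1", 17.0, 10.7, 27.0),
    ("Tmax > 95th pct, 2+ days", "Lag1", 18.8, 11.6, 30.6),
    ("Huth definition", "Lag1", 5.9, 3.2, 10.8),
    ("Tavg > normal mean + 1 SD, 3+ days", "Lag1", 8.7, 5.4, 14.0),
]

# Rank in descending order of the lower confidence limit (index 3).
ranked = sorted(central_lag1, key=lambda row: row[3], reverse=True)

for position, (name, offset, effect, lower, upper) in enumerate(ranked, start=1):
    print(f"{position}. {name} [{offset}]: {effect} ({lower}, {upper})")
```

Ranking on the lower limit favors combinations whose effects are both large and precisely estimated, so a large but unstable estimate does not rise to the top on its point value alone.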

2.3 Results

Table 2-3 summarizes the number of heat-related deaths and counties with meteorological data. Heat-related deaths, based on the underlying cause codes, were summed across all counties in each climate region and, separately, across only those counties with meteorological data. The total number of heat-related deaths in the U.S. for 1999-2009 was 3,829; among these, 2,218 (58%) occurred in counties with meteorological data. The South region had the highest number of heat-related deaths and also the highest number of counties with meteorological data (n=91). The North West Central region, which we formed by combining the Northwest and West North Central regions, had the lowest number of heat-related deaths (n=118). The West region had the lowest number of counties (n=38) with meteorological data. The warmer regions (South, Southeast, Southwest, and West) accounted for 64% (n=2,447) of all heat-related deaths in the U.S. during the study period. The percentage of the U.S. population living in counties with meteorological data varied by climate region. The West and West North Central regions had the highest (92%) and the lowest (42%) percentage of population living in counties with meteorological data, respectively. Overall, 57% of the U.S. population lived in counties with meteorological data.

Table 2-3: Heat-related deaths and counties with meteorological data, by climate region (1999-2009)

| U.S. Climate Region | Number of heat-related deaths | Number of counties with meteorological data | Number of heat-related deaths in counties with meteorological data | Percent of the U.S. population living in counties with meteorological data (%) |
|---|---|---|---|---|
| Central | 640 | 78 | 314 | 49 |
| East North Central | 150 | 54 | 93 | 49 |
| Northeast | 474 | 70 | 212 | 47 |
| Northwest | 70 | 40 | 51 | 73 |
| South | 890 | 91 | 481 | 60 |
| Southeast | 541 | 71 | 224 | 49 |
| Southwest | 508 | 43 | 367 | 64 |
| West | 508 | 38 | 455 | 92 |
| West North Central | 48 | 48 | 21 | 42 |
| Total | 3,829 | 533 | 2,218 | 57 |
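The totals and percentages quoted in the text can be checked directly against the rows of Table 2-3. The short sketch below does that arithmetic; the counts are copied from the table, while the variable names are ours.

```python
# region: (heat-related deaths, counties with meteorological data,
#          heat-related deaths in counties with meteorological data)
table_2_3 = {
    "Central": (640, 78, 314),
    "East North Central": (150, 54, 93),
    "Northeast": (474, 70, 212),
    "Northwest": (70, 40, 51),
    "South": (890, 91, 481),
    "Southeast": (541, 71, 224),
    "Southwest": (508, 43, 367),
    "West": (508, 38, 455),
    "West North Central": (48, 48, 21),
}

total_deaths = sum(row[0] for row in table_2_3.values())
total_counties = sum(row[1] for row in table_2_3.values())
met_deaths = sum(row[2] for row in table_2_3.values())
met_share = round(100 * met_deaths / total_deaths)

# Warmer regions as grouped in the text.
warm_regions = ("South", "Southeast", "Southwest", "West")
warm_deaths = sum(table_2_3[region][0] for region in warm_regions)
warm_share = round(100 * warm_deaths / total_deaths)

# The combined North West Central region.
nwc_deaths = table_2_3["Northwest"][0] + table_2_3["West North Central"][0]

print(total_deaths, total_counties, met_deaths, met_share)
print(warm_deaths, warm_share, nwc_deaths)
```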

Figure 2-4 shows a dendrogram (or cluster tree), which depicts the sequential clustering of the EHE definitions/variants in a hierarchical manner. We delineated the final high-level clusters taking into consideration the pseudo F and pseudo T-squared indices (data not shown). The break points were also influenced by subjective assessments of the homogeneity of members within clusters and the heterogeneity across clusters. We ultimately settled on five high-level clusters. We labeled each high-level cluster to reflect the underlying feature(s) common to the definitions/variants comprising it. “Cluster 1” was the first cluster delineated, and it contains only definitions/variants that are based on absolute thresholds for several of the daily heat metrics considered in the study. “Cluster 2” contains definitions/variants based on thresholds that are predominantly moderate in severity. “Cluster 3” contains definitions/variants based on thresholds that are slightly more severe than those for Cluster 2. “Cluster 4” contains definitions/variants based on thresholds that are predominantly extreme in nature. “Cluster 5” consists of definitions/variants that rely on relative thresholds constructed from long-term climate normal data, with thresholds predominantly low.


Figure 2-4: Dendrogram of hierarchical clusters
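To make the idea of sequential (agglomerative) clustering of EHE definitions concrete, the sketch below groups four toy definitions by how similarly they flag days. It is illustrative only: the distance measure (Jaccard distance between daily 0/1 flags), the average-linkage rule, the function names, and the toy flag series are all assumptions, and they are not a restatement of the clustering procedure used for Figure 2-4.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """1 minus the share of flagged days that two definitions have in common."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either

def average_linkage(c1, c2, flags):
    """Mean pairwise distance between the members of two clusters."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(jaccard_distance(flags[i], flags[j]) for i, j in pairs) / len(pairs)

def agglomerate(flags, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in flags]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], flags),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters

# Toy daily EHE flags (1 = EHE day) for four hypothetical definitions.
toy_flags = {
    "def_A": [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    "def_B": [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "def_C": [0, 0, 0, 0, 0, 0, 1, 1, 1, 0],
    "def_D": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
}
clusters = agglomerate(toy_flags, n_clusters=2)
print(clusters)
```

Recording the distance at each merge would give the heights needed to draw a dendrogram; the stopping point plays the role of the break points discussed above.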

Table 2-4 lists the EHE definition/variant that was selected as the representative from each high-level cluster. The five representative EHE definitions/variants crossed with the five exposure offsets resulted in 25 different combinations to be evaluated against heat-related deaths using the rate regression modeling framework.

Table 2-4: Representative EHE definition from each cluster

| Cluster | Cluster common name | EHE definition name | Daily heat metric | Threshold type | Threshold value | Duration |
|---|---|---|---|---|---|---|
| 1 | Absolute based thresholds | Daily maximum heat index temperature greater than 90°F for at least 3 consecutive days | HImax | Absolute | >90°F | 3+ consecutive days |
| 2 | "Predominantly moderate" thresholds | Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days | Tmax and Tmin | Relative | >80th percentile | 3+ consecutive days |
| 3 | "Predominantly high" thresholds | Daily maximum temperature greater than 95th percentile for at least 2 consecutive days | Tmax | Relative | >95th percentile | 2+ consecutive days |
| 4 | "Predominantly extreme" thresholds | Huth definition | Tmax | Relative | T1: >97.5th percentile; T2: >81st percentile | Every day >T2, and 3+ consecutive days >T1, and average Tmax >T1 for the whole time period |
| 5 | Climate normal based thresholds | Daily mean temperature greater than mean + 1 SD of climate normal for at least 3 consecutive days | Tavg | Relative | >mean + 1 SD of climate normal | 3+ consecutive days |
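A threshold-and-duration definition such as the Cluster 3 representative (daily maximum temperature above a threshold for at least 2 consecutive days) can be operationalized as a run-length check on a daily series. The sketch below is a minimal illustration: the function names and the temperature values are hypothetical, the threshold is passed in directly rather than derived from a percentile, and the reading of the offsets (a lag shifts the indicator forward in time, an extension adds days after the end of each event) is our assumption.

```python
def flag_ehe_days(values, threshold, min_run):
    """Flag days belonging to runs of at least min_run days above threshold."""
    flags = [0] * len(values)
    run_start = None
    # A sentinel below any threshold closes a run that reaches the last day.
    for i, value in enumerate(list(values) + [float("-inf")]):
        if value > threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                for k in range(run_start, i):
                    flags[k] = 1
            run_start = None
    return flags

def extend_event(flags, extra_days):
    """Also flag up to extra_days days following each flagged day."""
    out = list(flags)
    for i, flagged in enumerate(flags):
        if flagged:
            for k in range(1, extra_days + 1):
                if i + k < len(out):
                    out[i + k] = 1
    return out

def lag(flags, days):
    """Shift the indicator forward by the given number of days."""
    return [0] * days + list(flags[:len(flags) - days])

# Hypothetical daily maximum temperatures (degrees F) and threshold.
tmax = [88, 96, 97, 89, 98, 85, 96, 97, 99, 90]
ehe = flag_ehe_days(tmax, threshold=95, min_run=2)
print(ehe)
print(extend_event(ehe, 1))
print(lag(ehe, 1))
```

The single hot day in the middle of the series is not flagged because it does not meet the 2-day duration requirement.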

Table 2-5 ranks the EHE definition/variant and exposure offset combinations by climate region. The representative definition/variant from cluster 3, daily maximum temperature greater than the 95th percentile for at least two consecutive days, is most closely associated with heat-related mortality for six of the eight climate regions. The combinations of this definition/variant with exposure offsets representing a 1-day lag (Lag1) and no lag (Lag0) show the highest estimated EHE effects for all regions except the Southwest and South. The representative definition/variant from cluster 1, daily maximum heat index greater than 90°F for at least three consecutive days, combined with each of the different exposure offsets, shows the highest estimated EHE effects for the Southwest. The representative definition/variant from cluster 4, the Huth definition, was the best definition for the South but generally shows the weakest estimated EHE effects for other regions. The representative definition/variant from cluster 2, daily maximum and minimum temperature greater than the 80th percentile for at least three consecutive days, ranked fairly high (depending on the exposure offset) for the Central, Northeast, and Southeast regions; Lag1 and Lag0 represent the best exposure offsets. The representative definition/variant from cluster 5, daily mean temperature greater than the mean plus one standard deviation of the long-term climate normal for at least three consecutive days, shows the weakest estimated EHE effects overall. For most regions, no one definition/variant is distinctly superior to all others. We also provide a table in the appendix (Appendix Table A-2) that describes other metrics, such as the percent of days classified as EHE days and the percent of heat-related deaths covered by EHE days, for each representative EHE definition/variant and exposure offset combination.


Table 2-5: Mean (95% CI) EHE effect by U.S. climate regions

Ranking and EHE effect by climate region. Each cell gives the rank, followed by the EHE effect (95% CI).

| EHE definition | Exposure offset type | Central | East North Central | North West Central | Northeast | South | Southeast | Southwest | West |
|---|---|---|---|---|---|---|---|---|---|
| Daily maximum heat index greater than 90°F for at least 3 consecutive days | ExE1 | 10: 13.1 (8.4, 20.4) | 7: 19.0 (10.7, 33.8) | 6: 21.7 (13.6, 34.7) | 10: 10.9 (5.9, 20.3) | 14: 5.3 (3.3, 8.6) | 14: 4.4 (2.6, 7.3) | 4: 10.4 (6.4, 16.9) | 7: 7.1 (4.7, 10.7) |
| | ExE2 | 13: 12.6 (8.1, 19.7) | 6: 19.5 (10.9, 34.7) | 5: 22.0 (13.7, 35.3) | 11: 9.9 (5.3, 18.3) | 11: 5.7 (3.5, 9.5) | 12: 4.9 (2.8, 8.3) | 2: 11.6 (7.1, 19.0) | 8: 7.0 (4.6, 10.6) |
| | ExE3 | 14: 12.1 (7.7, 19.1) | 9: 17.4 (9.7, 31.2) | 8: 20.1 (12.4, 32.5) | 14: 8.7 (4.7, 16.1) | 12: 5.8 (3.5, 9.8) | 4: 6.4 (3.5, 11.5) | 1: 11.7 (7.1, 19.4) | 9: 6.6 (4.4, 10.1) |
| | Lag0 | 15: 11.3 (7.3, 17.4) | 10: 15.6 (8.8, 27.7) | 10: 17.3 (10.9, 27.5) | 5: 12.6 (6.8, 23.5) | 20: 4.6 (2.9, 7.2) | 16: 4.3 (2.6, 7.0) | 5: 9.5 (5.9, 15.1) | 4: 7.4 (4.9, 11.1) |
| | Lag1 | 5: 14.8 (9.5, 23.0) | 3: 22.6 (12.7, 40.1) | 2: 22.3 (14.0, 35.5) | 6: 11.8 (6.3, 22.1) | 15: 5.2 (3.3, 8.3) | 13: 4.4 (2.7, 7.3) | 3: 10.9 (6.8, 17.6) | 5: 7.4 (4.9, 11.1) |
| Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days | ExE1 | 6: 14.8 (9.3, 23.5) | 12: 13.9 (7.7, 24.8) | 11: 16.8 (10.3, 27.2) | 9: 11.0 (5.9, 20.6) | 19: 4.7 (3.0, 7.4) | 9: 5.6 (3.3, 9.5) | 7: 7.0 (4.5, 11.0) | 13: 5.4 (3.5, 8.3) |
| | ExE2 | 7: 14.5 (9.2, 23.0) | 13: 13.7 (7.7, 24.4) | 12: 16.7 (10.3, 27.0) | 13: 9.3 (5.0, 17.2) | 16: 4.8 (3.1, 7.4) | 7: 5.6 (3.4, 9.3) | 8: 6.8 (4.4, 10.6) | 14: 5.0 (3.3, 7.8) |
| | ExE3 | 9: 13.8 (8.7, 21.8) | 15: 12.2 (6.9, 21.8) | 15: 14.6 (9.0, 23.7) | 15: 8.0 (4.3, 14.9) | 18: 4.7 (3.0, 7.2) | 6: 5.6 (3.4, 9.2) | 12: 6.3 (4.0, 9.7) | 15: 4.9 (3.2, 7.5) |
| | Lag0 | 4: 15.2 (9.6, 24.0) | 14: 12.5 (7.0, 22.4) | 14: 15.1 (9.4, 24.4) | 3: 13.0 (7.0, 24.2) | 13: 5.1 (3.3, 8.0) | 5: 6.0 (3.5, 10.1) | 9: 6.6 (4.2, 10.3) | 12: 5.6 (3.6, 8.6) |
| | Lag1 | 2: 17.0 (10.7, 27.0) | 11: 15.3 (8.5, 27.4) | 9: 19.4 (12.0, 31.6) | 8: 11.5 (6.1, 21.7) | 17: 4.8 (3.1, 7.6) | 10: 5.5 (3.2, 9.4) | 6: 7.9 (5.0, 12.5) | 11: 5.8 (3.7, 9.0) |
| Daily maximum temperature greater than 95th percentile for at least 2 consecutive days | ExE1 | 3: 16.1 (10.1, 25.7) | 2: 23.5 (13.0, 42.4) | 4: 22.8 (13.8, 37.7) | 2: 14.0 (7.4, 26.6) | 4: 7.1 (4.5, 11.3) | 2: 6.2 (3.7, 10.4) | 11: 6.6 (4.1, 10.7) | 3: 8.0 (5.1, 12.4) |
| | ExE2 | 8: 14.4 (9.1, 22.9) | 4: 21.7 (12.1, 39.1) | 3: 22.8 (13.9, 37.6) | 7: 11.6 (6.2, 21.8) | 6: 6.4 (4.1, 10.2) | 8: 5.6 (3.4, 9.4) | 15: 6.0 (3.7, 9.6) | 6: 7.4 (4.8, 11.5) |
| | ExE3 | 12: 13.1 (8.3, 20.7) | 5: 20.2 (11.2, 36.3) | 7: 20.7 (12.6, 34.1) | 12: 9.5 (5.1, 17.8) | 7: 6.3 (4.0, 9.9) | 11: 4.8 (2.9, 8.0) | 13: 6.0 (3.8, 9.6) | 10: 6.7 (4.3, 10.3) |
| | Lag0 | 11: 13.7 (8.4, 22.5) | 8: 18.2 (9.8, 33.6) | 13: 15.9 (9.4, 27.0) | 1: 14.9 (7.6, 29.0) | 3: 7.5 (4.6, 12.2) | 1: 7.2 (4.2, 12.5) | 14: 6.3 (3.8, 10.5) | 2: 8.3 (5.2, 13.2) |
| | Lag1 | 1: 18.8 (11.6, 30.6) | 1: 31.0 (16.9, 56.7) | 1: 24.8 (14.8, 41.7) | 4: 13.7 (7.0, 26.9) | 2: 7.6 (4.6, 12.4) | 3: 6.2 (3.5, 10.8) | 10: 6.8 (4.1, 11.3) | 1: 9.5 (5.9, 15.1) |
| Huth definition | ExE1 | 22: 5.5 (3.0, 10.1) | 20: 11.0 (5.6, 21.6) | 22: 7.7 (4.0, 14.7) | 23: 4.6 (1.5, 14.0) | 5: 7.3 (4.2, 12.7) | 25: 3.6 (1.7, 7.7) | 22: 3.0 (1.5, 6.1) | 23: 5.1 (2.4, 10.9) |
| | ExE2 | 24: 5.3 (2.9, 9.6) | 21: 10.6 (5.5, 20.5) | 23: 7.4 (3.9, 13.9) | 24: 3.8 (1.2, 11.8) | 8: 6.8 (3.9, 11.8) | 23: 3.7 (1.8, 7.7) | 23: 2.9 (1.5, 5.8) | 19: 5.4 (2.6, 11.2) |
| | ExE3 | 25: 4.7 (2.6, 8.5) | 25: 9.3 (4.8, 18.0) | 24: 6.6 (3.5, 12.4) | 25: 3.3 (1.1, 10.2) | 10: 6.4 (3.7, 11.2) | 24: 3.6 (1.8, 7.3) | 24: 2.7 (1.4, 5.3) | 22: 4.9 (2.4, 10.1) |
| | Lag0 | 23: 5.5 (3.0, 10.3) | 23: 10.5 (5.2, 20.9) | 25: 5.9 (2.9, 11.9) | 21: 5.6 (1.8, 17.2) | 9: 6.6 (3.8, 11.7) | 22: 3.9 (1.8, 8.5) | 25: 2.3 (1.1, 5.2) | 24: 4.9 (2.2, 10.9) |
| | Lag1 | 21: 5.9 (3.2, 10.8) | 16: 13.2 (6.7, 25.8) | 21: 8.6 (4.5, 16.7) | 22: 5.6 (1.8, 17.3) | 1: 8.3 (4.8, 14.4) | 21: 4.3 (2.0, 9.1) | 21: 3.6 (1.8, 7.4) | 25: 4.9 (2.2, 10.9) |
| Daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days | ExE1 | 17: 7.7 (4.8, 12.4) | 19: 10.4 (5.7, 18.9) | 17: 10.8 (6.5, 17.9) | 17: 5.6 (3.0, 10.5) | 24: 3.2 (2.0, 5.1) | 18: 4.2 (2.5, 6.9) | 17: 3.7 (2.3, 5.9) | 17: 4.2 (2.7, 6.7) |
| | ExE2 | 19: 7.4 (4.6, 11.9) | 18: 10.6 (5.8, 19.3) | 18: 10.2 (6.2, 16.9) | 19: 4.6 (2.5, 8.6) | 22: 3.2 (2.0, 5.1) | 19: 3.8 (2.3, 6.3) | 19: 3.7 (2.3, 5.8) | 20: 4.1 (2.6, 6.4) |
| | ExE3 | 20: 6.9 (4.3, 11.0) | 22: 9.8 (5.4, 18.0) | 20: 9.0 (5.4, 14.9) | 20: 3.9 (2.1, 7.4) | 25: 3.0 (1.9, 4.7) | 20: 3.7 (2.2, 6.0) | 18: 3.7 (2.3, 5.8) | 21: 4.0 (2.5, 6.2) |
| | Lag0 | 18: 7.6 (4.7, 12.3) | 24: 9.0 (5.0, 16.4) | 19: 10.1 (6.1, 16.6) | 16: 6.0 (3.2, 11.4) | 23: 3.2 (2.0, 5.1) | 17: 4.2 (2.5, 7.0) | 20: 3.5 (2.2, 5.5) | 18: 4.2 (2.6, 6.6) |
| | Lag1 | 16: 8.7 (5.4, 14.0) | 17: 12.0 (6.6, 21.6) | 16: 11.2 (6.8, 18.5) | 18: 5.0 (2.6, 9.5) | 21: 3.4 (2.1, 5.4) | 15: 4.3 (2.6, 7.1) | 16: 3.8 (2.4, 6.0) | 16: 4.5 (2.9, 7.2) |


Table 2-6 provides the results of the random-effects meta-analyses of the estimated baseline rates and EHE effects, based on the 10 best definition/variant and exposure offset combinations for each climate region. The North West Central region shows the lowest mean (95% CI) baseline rate, 1.8 (1.5 – 2.2) deaths per one billion person-days of risk, and the highest mean (95% CI) EHE effect, 22.0 (17.7 – 27.3). The South region shows the highest mean (95% CI) baseline rate, 10.0 (8.8 – 12.0) deaths per one billion person-days of risk. The lowest mean EHE effect was observed in the Southeast. In general, colder regions of the U.S. show a relatively low baseline rate and a relatively high EHE effect, while the warmer regions of the U.S. show a relatively high baseline rate and a relatively low EHE effect.

Table 2-6: Meta-analyzed baseline rate and EHE effect by U.S. climate region

| U.S. Climate Region | Mean (95% CI) baseline heat mortality rate (deaths/person-day) × 10^-9 | Mean (95% CI) EHE effect |
|---|---|---|
| Central | 4.1 (3.5 - 4.8) | 15.0 (12.2 - 18.4) |
| East North Central | 2.3 (1.9 - 2.8) | 20.7 (15.9 - 26.9) |
| North West Central | 1.8 (1.5 - 2.2) | 22.0 (17.7 - 27.3) |
| Northeast | 2.9 (2.4 - 3.5) | 13.1 (9.9 - 17.4) |
| South | 10.0 (8.8 - 12.0) | 7.1 (5.7 - 8.8) |
| Southeast | 3.8 (3.2 - 4.5) | 6.2 (4.9 - 7.9) |
| Southwest | 5.0 (4.1 - 6.0) | 10.1 (8.1 - 12.5) |
| West | 4.7 (4.0 - 5.5) | 7.6 (6.2 - 9.2) |
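For readers unfamiliar with random-effects pooling, the sketch below applies the DerSimonian-Laird estimator, one common choice, to the 10 top-ranked Central region estimates from Table 2-5. This is an assumption-laden illustration: the estimator used by the CMA software is not stated in this chapter, standard errors are back-calculated from the reported confidence limits on the log scale, and the result is not expected to reproduce the Central row of Table 2-6 exactly.

```python
import math

def dersimonian_laird(estimates, z=1.96):
    """Pool (ratio, lower, upper) estimates on the log scale."""
    y = [math.log(ratio) for ratio, _, _ in estimates]
    se = [(math.log(hi) - math.log(lo)) / (2 * z) for _, lo, hi in estimates]
    w = [1.0 / s ** 2 for s in se]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q and the method-of-moments between-estimate variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    w_star = [1.0 / (s ** 2 + tau2) for s in se]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return (math.exp(pooled),
            math.exp(pooled - z * se_pooled),
            math.exp(pooled + z * se_pooled),
            tau2)

# Ranks 1 through 10 for the Central region (Table 2-5).
central_top10 = [
    (18.8, 11.6, 30.6), (17.0, 10.7, 27.0), (16.1, 10.1, 25.7),
    (15.2, 9.6, 24.0), (14.8, 9.5, 23.0), (14.8, 9.3, 23.5),
    (14.5, 9.2, 23.0), (14.4, 9.1, 22.9), (13.8, 8.7, 21.8),
    (13.1, 8.4, 20.4),
]
pooled, lower, upper, tau2 = dersimonian_laird(central_top10)
print(f"pooled EHE effect = {pooled:.1f} ({lower:.1f}, {upper:.1f}), tau^2 = {tau2:.3f}")
```

Because the 10 estimates come from the same mortality data, they are not independent, so an interval from this simple pooling should be read with caution.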

2.4 Summary and Conclusions

Several local and state health departments are currently interested in issuing heat advisories, as well as conducting retrospective health studies to understand the effects of extreme heat on mortality and morbidity; health departments are collaborating with local and national weather offices to do so. Identification of appropriate region-specific EHE definition(s) can contribute to such efforts.


EHE definitions used by most heat warning systems to issue alerts are calibrated to the extreme end of the daily heat metric spectrum. As noted by Hajat et al. (2010), using a definition that identifies only extreme temperature days may introduce false negatives and therefore underestimate the public health burden attributable to extreme heat, whereas using a less stringent or mild threshold for EHE definitions may introduce false positives and therefore overestimate the public health burden. Additionally, prior research efforts evaluating definitions using mortality data have considered deaths due to all causes. While this is better than not using health data, the relationship between all-cause mortality and extreme heat is confounded by several other risk factors. Research studies have shown that certain social and demographic variables, which act as surrogates for social capital, could influence heat-related health outcomes (Reid et al. 2009; Semenza et al. 1996).

To the best of our knowledge, this effort is the first nationally comprehensive, and at the same time region-specific, evaluation of EHE definitions using heat-related mortality data. We comprehensively abstracted and operationalized commonly used EHE definitions from the literature, expanded the definitions to cover various combinations of the core variables, and considered various exposure offsets. Our evaluation framework, which employed cluster analysis to identify homogeneous groupings of EHE definitions, followed by rate regressions to estimate EHE effects for representatives from these groupings, provides a robust way to identify definitions that are most closely associated with heat-related mortality. Our approach not only identifies a set of definitions that are most closely associated with heat-related mortality but may also shed light on some of the more weakly associated EHE definitions that appear in the literature. Finally, the random-effects meta-analysis summarizes the overall region-specific summertime baseline mortality rates and the estimated EHE effects, which support informed speculation on the ability of populations to adapt to extreme heat.

Our findings suggest that definitions with thresholds that are either too extreme or too moderate tend to be among those most weakly associated with heat-related mortality for most climate regions. Of the exposure offsets considered, EHE definitions/variants combined with a 1-day lag resulted in the highest estimated EHE effect in the Central, East North Central, North West Central, South, and West regions. For the Northeast and Southeast regions, definitions/variants involving no lag resulted in the highest estimated EHE effects. For the Southwest region, a definition/variant combined with an exposure offset extending 3 days past the end of the heat event resulted in the highest estimated EHE effect. Our evaluation suggests that the warmer regions of the U.S., such as the South, have a relatively low EHE effect and a relatively high baseline heat mortality rate. Colder areas of the U.S., such as the North West Central and East North Central regions, have a relatively high EHE effect and a low baseline heat mortality rate. This result is consistent with the relationship between temperature and mortality for colder versus warmer cities of the U.S. noted in prior research (Curriero et al. 2002) and may indicate that populations in warmer regions are better adapted to extreme heat than those in colder regions of the U.S. We speculate that, because extreme heat persists through much of the summer in warmer regions, people there have adapted well to it. In colder regions, EHEs are rare and hence people are less adapted to extreme heat. In other words, populations living in colder regions face a greater risk associated with EHEs than populations living in warmer regions.


Additionally, there could be several social and demographic characteristics for some regions which might confound the relationship between extreme heat and mortality. Table A-3 in the appendix provides information on the levels of various social and demographic variables by U.S. climate region. AC prevalence is a significant risk factor for extreme heat-related mortality (Reid et al. 2009), and relatively high AC prevalence is observed in the South and the Southeast regions of the U.S. In addition to the ability of the populations to adapt, higher prevalence of AC in these regions could explain a lower EHE effect. Studies have shown different degrees of susceptibility to extreme heat among ethnic groups, and some of the regional variations we observe could be an artifact of the underlying demographic distribution (Klinenberg 2003a; Klinenberg 2003b).

There are a few limitations in this study. Given the sparseness of the region-wide numbers of heat-related deaths, we could not explore the estimation of EHE effects while controlling for various socio-demographic risk factors, such as gender, age, poverty status, and ethnicity. These risk factors could confound the relationship between extreme heat and heat-related mortality. However, our ultimate goal was not to estimate the EHE effects associated with heat-related mortality but to use the estimated effects as a metric to rank the different definitions. Also, we used station-based meteorology data as the source of ambient heat data, and characterizing population-level exposures from weather stations may misrepresent actual individual-level exposures. Additionally, weather stations have limited spatial coverage (especially the ASOS stations) and are located near airports or in remote places to measure baseline meteorology for climatological purposes, where the majority of the population may not reside. This may limit the generalizability of the relationship between extreme heat and mortality; however, station-based meteorology data have been used widely in health studies. Further, EHEs extend beyond counties and are considered meso-scale events. Therefore, we believe that exposure misclassification, even if present, is minimal and does not materially affect our results. Lastly, the mortality data used for this analysis are based on what is reported on death certificates, which in some instances could lead to misclassification of heat-related deaths (Combs et al. 1999).

Region-specific evaluation of EHE definitions offers several potential benefits. A recent study conducted in Europe by Analitis et al. (2014) examined confounding and effect modification by air pollutants. A similar study design could be implemented in the U.S., given that appropriate definitions have been identified for each climate region. Further, the rate regression modeling approach could be extended to quantify excess deaths associated with EHEs for all causes or for broad cause groupings such as cardiovascular and respiratory deaths. Excess death estimation can be conducted either in a historical or in a prospective manner. Knowledge of the historical burden attributable to extreme heat may help local and state emergency planners with the development of community preparedness initiatives related to future heat waves. Additionally, anthropogenic climate change is projected to increase the likelihood and/or magnitude of several types of weather extremes, including extreme heat events (Morss et al. 2011). Under a climate change scenario, estimates of the public health burden associated with EHEs could help identify vulnerable populations and support adaptation efforts.


2.5 References

1. Analitis A, Michelozzi P, D'Ippoliti D, De'Donato F, Menne B, Matthies F, et al. 2014. Effects of heat waves on mortality: Effect modification and confounding by air pollutants. Epidemiology 25:15-22.

2. Anderson BG, Bell ML. 2009. Weather-related mortality: How heat, cold, and heat waves affect mortality in the United States. Epidemiology 20:205-213.

3. Anderson GB, Bell ML. 2011. Heat waves in the United States: Mortality risk during heat waves and effect modification by heat wave characteristics in 43 U.S. communities. Environmental health perspectives 119:210-218.

4. Arguez A, Durre I, Applequist S, Vose RS, Squires MF, Yin XG, et al. 2012. NOAA's 1981-2010 U.S. climate normals: An overview. B Am Meteorol Soc 93:1687-1697.

5. Burrows AT. 1900. Hot waves: Conditions which produce them, and their effect on agriculture. Yearbook of the U S Department of Agriculture 22:325-336.

6. CDC. 2006. Heat-related deaths--United States, 1999-2003. MMWR Morbidity and mortality weekly report 55:796-798.

7. CDC. 2013. Heat illness and deaths--New York City, 2000-2011. MMWR Morbidity and mortality weekly report 62:617-621.

8. Combs DL, Quenemoen LE, Parrish RG, Davis JH. 1999. Assessing disaster-attributed mortality: Development and application of a definition and classification matrix. International journal of epidemiology 28:1124-1129.


9. Curriero FC, Heiner KS, Samet JM, Zeger SL, Strug L, Patz JA. 2002. Temperature and mortality in 11 cities of the eastern United States. American journal of epidemiology 155:80-87.

10. Easterling DR, Meehl GA, Parmesan C, Changnon SA, Karl TR, Mearns LO. 2000. Climate extremes: Observations, modeling, and impacts. Science 289:2068-2074.

11. Edens JF, Cavell TA, Hughes JN. 1999. The self-systems of aggressive children: A cluster-analytic investigation. Journal of Child Psychology and Psychiatry 40:441-453.

12. Fowler DR, Mitchell CS, Brown A, Pollock T, Bratka LA, Paulson J, et al. 2013. Heat-related deaths after an extreme heat event - four states, 2012, and United States, 1999-2009. MMWR Morbidity and mortality weekly report 62:433-436.

13. Gasparrini A, Armstrong B. 2011. The impact of heat waves on mortality. Epidemiology 22:68-73.

14. Hajat S, Armstrong B, Baccini M, Biggeri A, Bisanti L, Russo A, et al. 2006. Impact of high temperatures on mortality: Is there an added heat wave effect? Epidemiology 17:632-638.

15. Hajat S, Sheridan SC, Allen MJ, Pascal M, Laaidi K, Yagouti A, et al. 2010. Heat-health warning systems: A comparison of the predictive capacity of different approaches to identifying dangerously hot days. American journal of public health 100:1137-1144.

16. Hanzlick R, College of American Pathologists., National Association of Medical Examiners (U.S.). 2006. Cause of death and the death certificate : Important information for physicians, coroners, medical examiners, and the public. Northfield, Ill.:College of American Pathologists.

17. Huth R, Kysely J, Pokorna L. 2000. A GCM simulation of heat waves, dry spells, and their relationships to circulation. Climatic Change 46:29-60.

18. Ishigami A, Hajat S, Kovats RS, Bisanti L, Rognoni M, Russo A, et al. 2008. An ecological time-series study of heat-related mortality in three European cities. Environmental health : a global access science source 7:5.

19. Kent ST, McClure LA, Zaitchik BF, Smith TT, Gohlke JM. 2014. Heat waves and health outcomes in Alabama, USA: The importance of heat wave definition. Environmental health perspectives 122:151-158.

20. Klinenberg E. 2003a. Review of heat wave: Social autopsy of disaster in Chicago. The New England journal of medicine 348:666-667.

21. Klinenberg E. 2003b. Heat wave: A social autopsy of disaster in Chicago (book). Choice: Current Reviews for Academic Libraries 40:1268.

22. Kovats RS, Hajat S. 2008. Heat stress and public health: A critical review. Annual review of public health 29:41-55.

23. Meehl GA, Tebaldi C. 2004. More intense, more frequent, and longer lasting heat waves in the 21st century. Science 305:994-997.

24. Minino AM, Murphy SL, Xu J, Kochanek KD. 2011. Deaths: Final data for 2009. National vital statistics reports : from the Centers for Disease Control and Prevention, National Center for Health Statistics, National Vital Statistics System 59:1-126.


25. Morss RE, Wilhelmi OV, Meehl GA, Dilling L. 2011. Improving societal outcomes of extreme weather in a changing climate: An integrated perspective. In: Annual review of environment and resources, vol 36 (Gadgil A, Liverman DM, eds). Palo Alto:Annual Reviews, 1-25.

26. National Research Council (U.S.). Committee on America's Climate Choices., National Research Council (U.S.). Panel on Advancing the Science of Climate Change., National Research Council (U.S.). Board on Atmospheric Sciences and Climate. 2010. Advancing the science of climate change. Washington, D.C.:National Academies Press.

27. Parry ML, Intergovernmental Panel on Climate Change. Working Group II., World Meteorological Organization., United Nations Environment Programme. 2007. Climate change 2007 : Impacts, adaptation and vulnerability : Summary for policymakers, a report of Working Group II of the Intergovernmental Panel on Climate Change and technical summary, a report accepted by Working Group II of the IPCC but not yet approved in detail : Part of the Working Group II contribution to the fourth assessment report of the Intergovernmental Panel on Climate Change. S.l.:WMO : UNEP.

28. Pascal M, Laaidi K, Ledrans M, Baffert E, Caserio-Schonemann C, Le Tertre A, et al. 2006. France's heat health watch warning system. International journal of biometeorology 50:144-153.

29. Pascal M, Wagner V, Le Tertre A, Laaidi K, Honore C, Benichou F, et al. 2013. Definition of temperature thresholds: The example of the French heat wave warning system. International journal of biometeorology 57:21-29.


30. Peng RD, Bobb JF, Tebaldi C, McDaniel L, Bell ML, Dominici F. 2011. Toward a quantitative estimate of future heat wave mortality under global climate change. Environmental health perspectives 119:701-706.

31. Reid CE, O'Neill MS, Gronlund CJ, Brines SJ, Brown DG, Diez-Roux AV, et al. 2009. Mapping community determinants of heat vulnerability. Environmental health perspectives 117:1730-1736.

32. Robinson PJ. 2001. On the definition of a heat wave. J Appl Meteorol 40:762-775.

33. Thacker MT, Lee R, Sabogal RI, Henderson A. 2008. Overview of deaths associated with natural events, United States, 1979-2004. Disasters 32:303-315.

34. Semenza JC, Rubin CH, Falter KH, Selanikio JD, Flanders WD, Howe HL, et al. 1996. Heat-related deaths during the July 1995 heat wave in Chicago. The New England journal of medicine 335:84-90.

35. Zaitchik D, Iqbal Y, Carey S. 2014. The effect of executive function on biological reasoning in young children: An individual differences study. Child development 85:160-175.

36. Zhang T, Ramakrishnan R, Livny M. BIRCH: An efficient data clustering method for very large databases. In: Proceedings of the ACM SIGMOD Record, 1996, Vol. 25 ACM, 103-114.


CHAPTER 3 Exploring the Utility of Modeled Meteorology Data for Extreme Heat-Related Health Research and Surveillance9

3.1 Introduction

Meteorological data play a vital role in the vast and growing realm of environmental health research. Meteorological variables such as temperature, wind speed, relative humidity, and pressure are often incorporated in studies examining the detrimental impacts of environmental exposures on human health. Studies that have explored such relationships (B Anderson et al. 2013; Anderson and Bell 2009; Anderson and Bell 2011; Basu 2002; Basu et al. 2005; Basu et al. 2010; O'Neill et al. 2002; O'Neill et al. 2003; O'Neill et al. 2005; Ostro et al. 2010; Zanobetti and Schwartz 2008; Zanobetti et al. 2012) have mostly relied on station-based meteorology data. Data from weather stations are available from the National Climatic Data Center (NCDC); however, these stations are limited in geographic scope. Further, assigning population-level exposures using station-based meteorology data is constrained by the fact that some of these stations are located in non-residential areas, such as airports, or in remote places (Gallo et al. 1996).

Meteorological data from models are available over continuous spatial and temporal scales, and have found use in air pollution modeling, weather forecasting, and various other climatological predictions (Aiyyer et al. 2007; Glahn and Lowry 1972; Michalakes et al. 2001; Ritter and Geleyn 1992). These numerical weather prediction models output meteorology fields by grid cell and form the basis for some of the most commonly used reanalysis10 data products, such as the North American Regional Reanalysis (NARR) model predictions (Mesinger et al. 2006). NARR predictions are generated jointly by the National Centers for Environmental Prediction (NCEP) and the National Center for Atmospheric Research (NCAR). Meteorology fields from the NARR model are provided at a spatial resolution of approximately 0.3 degrees (32 km) and a 3-hourly temporal frequency. The primary motivation behind generating these reanalysis data products is to provide consistent long-term climate data on a regional scale for the North American domain (Mesinger et al. 2006). The NARR-based meteorological fields are spatially interpolated to a finer resolution, approximately 0.125 degrees (12 km), and then temporally disaggregated to an hourly frequency. Meteorological data from this finer-scale reanalysis model, also known as the North American Land Data Assimilation System Phase 2 (NLDAS) (Luo et al. 2003; Mitchell et al. 2004; Rodell et al. 2004), are available to users of the Centers for Disease Control and Prevention’s (CDC) Environmental Public Health Tracking Network (Tracking Network) (http://ephtracking.cdc.gov).

9 Manuscript is undergoing internal CDC review.

10 Reanalysis data products combine predictions from numerically deterministic simulation models with ground-based measurements.

In this study, we assess the accuracy and utility of modeled predictions generated from the NLDAS model for extreme heat research and surveillance. Our objectives for this assessment are to: (1) evaluate the performance of the model-based predictions against measurements from stations; and (2) conduct a county-level health analysis using heat mortality data and compare the results generated from station- and model-based exposure estimates.


3.2 Methods

3.2.1 Meteorology data

We used station-based meteorology data for the years 1999-2009 and selected meteorology fields from Automated Surface Observing System (ASOS) stations in the conterminous United States (U.S.), that is, the lower 48 states. Spatial coverage of ASOS stations is shown in Figure 3-1A. We checked the completeness of the hourly and daily meteorology data used in this analysis. For each station, we set a daily completeness threshold of 75% for hourly observations in a given day (at least 18 of 24 hourly measurements available) for computing daily summaries of the heat metric. For each county, we averaged all available daily station-based summaries to create county-level estimates of daily weather variables. We then applied a 95% completeness threshold to the daily county-level estimates of the heat metric across the summer months (May 1 through September 30). Finally, we included only counties for which sufficiently complete data were available for all 11 years (1999-2009) of the analysis period.
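The completeness screening described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the use of `None` for missing values, and the choice to evaluate the 95% threshold separately for each summer are assumptions made for the example.

```python
HOURS_REQUIRED = 18           # 75% of 24 hourly observations
SEASON_SHARE_REQUIRED = 0.95  # share of summer days with a county estimate
SUMMER_DAYS = 153             # May 1 through September 30


def daily_summary(hourly_values, summarize=max):
    """Summarize one station-day, or return None if fewer than 18 hours exist."""
    valid = [v for v in hourly_values if v is not None]
    if len(valid) < HOURS_REQUIRED:
        return None
    return summarize(valid)


def county_daily_estimate(station_summaries):
    """Average all available (non-missing) station summaries for a county-day."""
    valid = [v for v in station_summaries if v is not None]
    return sum(valid) / len(valid) if valid else None


def county_is_complete(daily_estimates_by_year):
    """True if every year has estimates for at least 95% of summer days.

    daily_estimates_by_year maps a year to its list of daily county
    estimates (None marks a missing day). Applying the threshold per
    year is an assumption of this sketch.
    """
    return all(
        sum(v is not None for v in days) / SUMMER_DAYS >= SEASON_SHARE_REQUIRED
        for days in daily_estimates_by_year.values()
    )
```

With 153 summer days, a county-year needs at least 146 daily estimates to pass the 95% threshold in this sketch.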

We also selected stations from the Southeastern Aerosol Research and Characterization (SEARCH) network (SEARCH 1999) to conduct an independent evaluation of the modeled predictions against measurements. The SEARCH network was developed as part of a public-private collaboration with the Electric Power Research Institute (EPRI), Southern Company, and other utilities, and was formed primarily to assess air quality in the Southeast (Kleindienst et al. 2010; Zheng et al. 2002). We applied the completeness criteria developed for ASOS stations, and all eight SEARCH stations had sufficient records (Figure 3-1B). We selected data from the years 2001-2008, as these eight years had complete data for all eight stations during the summer months. Of these eight stations, four are in rural or non-urban areas (Centreville, Alabama (CTR); Oak Grove, Mississippi (OAK); Outlying Landing Field, Pensacola, Florida (OLF); and Yorkville, Georgia (YRK)) and four are in urban areas (Jefferson St, Atlanta, Georgia (JST); Gulfport, Mississippi (GFP); North Birmingham, Alabama (BHM); and Pensacola, Florida (PNS)).

We extracted temperature and relative humidity only from weather stations in the ASOS and SEARCH networks with complete records, since our focus was to evaluate the utility of modeled data for extreme heat research and surveillance. We used the hourly data to create daily minimum (Tmin), maximum (Tmax), and mean (Tavg) temperature, and computed the daily maximum heat index (HImax) by combining temperature and humidity; all daily heat metrics were expressed in degrees Fahrenheit (℉). We calculated the heat index using Steadman’s formula as modified through multiple regression analysis by Rothfusz (Rothfusz 1990) (footnote 11). We then averaged all available daily station-based data from ASOS stations to create county-level estimates of daily heat metric variables. We made adjustments to account for county boundary changes that occurred between 1999 and 2009 in the conterminous U.S.
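As an illustration of how the daily heat metrics above could be derived from hourly data, the sketch below applies the widely published Rothfusz regression coefficients. It is a minimal example under stated assumptions: the function names are hypothetical, the regression is applied to every hour without the adjustments that the National Weather Service applies outside the hot, humid range for which the regression was fitted, and the daily maximum heat index is taken as the maximum of the hourly heat index values.

```python
def rothfusz_heat_index(temp_f, rh_percent):
    """Rothfusz regression for heat index (deg F) from temperature (deg F)
    and relative humidity (%). Intended for hot, humid conditions; no
    low-temperature or humidity adjustments are applied in this sketch."""
    t, r = temp_f, rh_percent
    return (-42.379
            + 2.04901523 * t
            + 10.14333127 * r
            - 0.22475541 * t * r
            - 6.83783e-3 * t * t
            - 5.481717e-2 * r * r
            + 1.22874e-3 * t * t * r
            + 8.5282e-4 * t * r * r
            - 1.99e-6 * t * t * r * r)


def daily_heat_metrics(hourly_temp_f, hourly_rh):
    """Return Tmin, Tmax, Tavg, and HImax for one day of paired hourly data."""
    hourly_hi = [rothfusz_heat_index(t, r)
                 for t, r in zip(hourly_temp_f, hourly_rh)]
    return {
        "Tmin": min(hourly_temp_f),
        "Tmax": max(hourly_temp_f),
        "Tavg": sum(hourly_temp_f) / len(hourly_temp_f),
        "HImax": max(hourly_hi),
    }
```

For example, a temperature of 90℉ at 50% relative humidity yields a heat index of roughly 94.6℉ with these coefficients.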

The Centers for Disease Control and Prevention (CDC) has been collaborating with the National Aeronautics and Space Administration (NASA) on the development of long-term weather metrics for the CDC’s Environmental Public Health Tracking Network (Tracking Network; footnote 12) using data from the NLDAS model. All daily variables that we extracted from weather stations were also available from the NLDAS model. Daily NLDAS predictions (raw data) at 0.125 degrees were made available to CDC as part of an interagency agreement between CDC and NASA. Containing 464 columns and 224 rows, the NLDAS grid covers the conterminous U.S., along with parts of northern Mexico and southern Canada (Figure 3-1C). The geographic coordinates (latitude/longitude) at the four corners of the NLDAS grid are given in Table B-1.

11 We used Rothfusz’s formula instead of the formula cited in Robinson (2001) since the modeled data we received had used Rothfusz’s formula.

12 Tracking Network provides nationally consistent data and metrics (indicators and measures) to monitor relationships among hazards, exposures, and health effects (CDC, 2013).


Figure 3-1: Spatial coverage of: (A) ASOS, (B) SEARCH stations, and (C) NLDAS grid extent with U.S. climate regions


We generated NLDAS-based estimates at two geographic resolutions: (1) station-level estimates, for which we interpolated NLDAS predictions to ASOS and SEARCH locations from the four nearest NLDAS grid centroids (geometric centers) using an inverse squared distance weighting approach (Figure B-1); and (2) county-level estimates, for which we used a multi-stage geo-imputation approach to convert grid-level meteorological data to county-level estimates. First, we calculated the population within each NLDAS grid cell using population estimates from U.S. Census blocks. We then converted the NLDAS grid polygons, with their population information, to centroids and related all grid cell centroids to counties in the conterminous U.S. based on a containment relationship. For counties that did not have a grid cell centroid within their boundaries, we assigned the grid cell centroid closest to the county boundary. Finally, we created a population-weighted average from all the grid cell centroids assigned to a county to obtain county-level estimates of daily heat metrics (Figure B-2).
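The two aggregation steps above can be sketched as follows. This is an illustrative example only: the function names and data layouts are hypothetical, distances are computed as planar distances between coordinate pairs (a production workflow would use projected or geodesic distances), and the unweighted fallback for counties with zero recorded population is an assumption of the sketch.

```python
import math


def idw_estimate(station_xy, centroids, k=4, power=2):
    """Inverse squared distance weighted estimate at a station location.

    centroids is a list of ((x, y), value) pairs; only the k nearest
    grid centroids contribute to the estimate.
    """
    nearest = sorted(centroids, key=lambda c: math.dist(station_xy, c[0]))[:k]
    weighted_sum = 0.0
    weight_total = 0.0
    for xy, value in nearest:
        d = math.dist(station_xy, xy)
        if d == 0:
            return value  # station coincides with a grid centroid
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


def population_weighted_mean(cells):
    """County-level estimate from (value, population) pairs of assigned cells."""
    total_pop = sum(pop for _, pop in cells)
    if total_pop == 0:
        # Assumed fallback: simple mean when no population is recorded.
        return sum(value for value, _ in cells) / len(cells)
    return sum(value * pop for value, pop in cells) / total_pop
```

In this sketch, a station with two centroids at distance 1 (values 10 and 20) and two at distance 2 (values 30 and 40) receives weights of 1, 1, 0.25, and 0.25, giving an interpolated value of 19.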

3.2.2 Mortality and population data

We obtained mortality data from the National Center for Health Statistics (NCHS) National Vital Statistics System and extracted death records for the years 1999-2009 based on International Classification of Diseases, 10th revision (ICD-10) external cause codes (Minino et al. 2011). Specifically, we selected death records for which exposure to excessive natural heat (ICD-10 code: X30) was listed as the underlying cause of death; the underlying cause of death is defined as the disease or injury that initiated the chain of events leading to death (Hanzlick et al. 2006). We summarized the extracted death records for the summer months to get counts of heat-related deaths by county and day. We then assigned the data for each county to one of the nine U.S. climate regions, which are aggregations of states based on homogeneous long-term climatology (Figure 3-1C); a description of these regions is available from the NCDC (http://www.ncdc.noaa.gov/monitoring-references/maps/us-climate-regions.php). Additionally, due to small death counts in the West North Central and Northwest regions, we combined these two regions into “North West Central.” We excluded counties that did not have meteorology data (or that did not meet the data completeness threshold) and made adjustments to account for county boundary changes that occurred between 1999 and 2009. For incidence rate denominators, we used county-level bridged-race population estimates developed by NCHS and the U.S. Census Bureau.

3.2.3 Station-level comparison of model and station data

We compared station-level estimates of Tmax, Tavg, and HImax with measurements from ASOS and SEARCH to assess the performance of the NLDAS model. NLDAS estimates use ASOS-based measurements in the model fitting process, so model performance was expected to be better in grid cells that contain ASOS stations. Nevertheless, we conducted an in-sample evaluation against ASOS to assess regional variability in the model fit. We also used daily heat metrics from stations in the SEARCH network to independently evaluate modeled estimates, since SEARCH data were not used to create NLDAS-based predictions. We assessed the consistency of the relationship between model predictions and measurements using the following performance metrics: (1) Pearson correlation coefficient (r), (2) Kendall Tau-B correlation coefficient (t), (3) difference (D), and (4) root mean squared deviation (RMSD). We provide the formulae used to calculate these metrics in Table B-2. Additionally, we computed the distance between each ASOS station and the U.S. coastline, and examined correlations between ASOS and NLDAS-based estimates as a function of that distance. We obtained U.S. coastline information from the National Oceanic and Atmospheric Administration’s (NOAA) National Geophysical Data Center. Lastly, we compared station-level measurements from SEARCH with model-based estimates using Bland-Altman plots; these plots are primarily used for identifying the presence of fractional bias (Vaidyanathan et al. 2013). We describe the approach used to create the Bland-Altman plots in Table B-2.
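The four performance metrics named above can be sketched with textbook formulas. The exact formulae used in the study are those in Table B-2, which is not reproduced here; in this example D is assumed to be the model estimate minus the station measurement (so that negative values indicate underprediction), and all function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between paired series x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kendall_tau_b(x, y):
    """Kendall Tau-B, which corrects for ties in either series (O(n^2))."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both series: excluded from both terms
            if dx == 0:
                ties_x_only += 1
            elif dy == 0:
                ties_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / math.sqrt(
        (untied + ties_x_only) * (untied + ties_y_only))


def differences(observed, modeled):
    """Pairwise D, assumed here to be model estimate minus measurement."""
    return [m - o for o, m in zip(observed, modeled)]


def rmsd(observed, modeled):
    """Root mean squared deviation between model estimates and measurements."""
    d = differences(observed, modeled)
    return math.sqrt(sum(v * v for v in d) / len(d))
```

A station-level summary would apply these functions to the paired daily series for each station and heat metric.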

3.2.4 County-level evaluation of model and station-based exposure estimates using heat-related mortality data

We used rate regression models to evaluate the relationship between heat-related deaths and exposure estimates derived from ASOS and NLDAS data at the county level. We included only counties that had both station- and model-based data. The following model form was used to estimate the death rate per person-day on a logarithmic scale for each EHE definition/variant and exposure offset combination:

\log(E[D] / P) = \alpha + \beta_{region} + \beta_{EHE} \cdot EHE + \beta_{EHE.Region} \cdot EHE \cdot Region    (1)

with model terms defined as follows:

D: count of deaths for each combination of region, year, and EHE status (footnote 13);

E[D]: expected count of deaths;

P: person-days of exposure for which D is measured;

\alpha: intercept;

\beta_{region}: intercept offset for the climate region;

\beta_{EHE}: parameter estimate for the binary variable referring to the EHE definition and exposure offset combination;

EHE: binary indicator variable for the operationalized EHE definition and exposure offset combination;

\beta_{EHE.Region}: parameter estimate for the interaction between region and EHE;

Region: climate region.

13 To facilitate reliable modeling diagnostics as well as convergence, data were collapsed according to a three-way stratification: climate region × year × EHE status (for the EHE definition/variant and exposure offset combination under consideration).

To compensate for overdispersion, we specified a negative binomial distribution for the death counts, with the log link shown in eq (1). We selected all the shortlisted definitions (Table B-3). For each EHE definition, we considered different exposure offsets: no lag (i.e., no offset), 1-day lag, and 1-, 2-, and 3-day extended (post-heat wave) effects (Figure B-3). We operationalized each EHE definition and exposure offset combination using station- and model-based county-level estimates.
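To make the operationalization step concrete, the sketch below flags EHE days for a threshold-and-duration definition and then applies exposure offsets. It is a hedged illustration: the function names are hypothetical, and the interpretation of the offsets (a lag shifts the indicator forward in time; an extended effect adds days after each event) is an assumption, since the precise scheme is defined in Figure B-3.

```python
def flag_ehe_days(values, threshold, min_run):
    """Flag days belonging to runs of at least min_run consecutive days
    on which the daily heat metric exceeds the threshold."""
    flags = [False] * len(values)
    run_start = None
    # A trailing None acts as a sentinel that closes a run ending on the last day.
    for i, v in enumerate(list(values) + [None]):
        hot = v is not None and v > threshold
        if hot and run_start is None:
            run_start = i
        elif not hot and run_start is not None:
            if i - run_start >= min_run:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = None
    return flags


def apply_lag(flags, lag_days):
    """Shift the EHE indicator forward by lag_days (assumed lag scheme)."""
    return [False] * lag_days + flags[:len(flags) - lag_days]


def apply_extended_effect(flags, extra_days):
    """Also flag the extra_days following each EHE day (assumed scheme)."""
    out = list(flags)
    for i, flagged in enumerate(flags):
        if flagged:
            for j in range(i + 1, min(i + 1 + extra_days, len(flags))):
                out[j] = True
    return out
```

For the definition "daily maximum heat index greater than 90℉ for at least 3 consecutive days," the call would be `flag_ehe_days(himax_series, 90, 3)`, where `himax_series` is a hypothetical list of daily county-level HImax values.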

Using the modeling approach in eq (1), we estimated a baseline rate of heat-related deaths (deaths in the absence of EHE) and an EHE rate of heat-related deaths (deaths in the presence of EHE). We termed the estimated increase (on a log scale) in the rate due to EHE the “EHE effect.” We estimated the absolute and relative difference in mean EHE effect between ASOS- and NLDAS-based county-level estimates. We also plotted the mean EHE effect and 95% confidence interval (CI) based on ASOS and NLDAS estimates. We carried out our data analyses using the Statistical Analysis System (SAS® Version 9.3) and Environmental Systems Research Institute’s GIS software (ESRI, ArcGIS® Version 9.3).
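One way to read the baseline rate, the EHE rate, and the EHE effect in terms of eq (1) is worked out below. This is an interpretive sketch rather than a statement of the exact computation used: it assumes that, for a given climate region, the Region term acts as an indicator so that the interaction coefficient for that region is simply added when EHE = 1.

```latex
\begin{align*}
\text{baseline (EHE}=0\text{):}\quad
  & \log\frac{E[D]}{P} = \alpha + \beta_{region} \\
\text{EHE (EHE}=1\text{):}\quad
  & \log\frac{E[D]}{P} = \alpha + \beta_{region} + \beta_{EHE} + \beta_{EHE.Region} \\
\text{EHE effect (log scale):}\quad
  & \beta_{EHE} + \beta_{EHE.Region}
    = \log\frac{\text{rate}_{EHE}}{\text{rate}_{baseline}} \\
\text{implied rate ratio:}\quad
  & \exp\left(\beta_{EHE} + \beta_{EHE.Region}\right)
\end{align*}
```

Under this reading, the intercept terms cancel in the difference, so the EHE effect for a region depends only on the EHE coefficient and that region's interaction coefficient.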


3.3 Results and Discussion

3.3.1 Descriptive statistics

We provide a summary of the number of ASOS stations that passed the completeness criteria by climate region and urbanicity (footnote 14) in Table 3-1. There were 617 ASOS stations that were considered complete, distributed among 533 counties for 1999-2009. Of these, 405 stations (66%) were located in urban counties, and only 139 stations (23%) were located in rural counties. The South had the highest number of ASOS stations (n=106), while the Southwest had the fewest (n=44). In most climate regions, a majority of the ASOS stations in our research dataset were concentrated in urban counties; however, in the West North Central region, 69% of the ASOS stations were in rural counties. The West North Central region also had a high proportion of rural counties compared to other climate regions. Figure B-4 shows a county map with urbanicity classification.

14 The urban/rural nature (urbanicity) of each county was assigned to each decedent using urban/rural continuum codes (version 2003) provided by the U.S. Department of Agriculture (reference 18); counties with codes 1, 2, and 3 were classified as “urban,” counties with codes 4, 6, and 8 were classified as “suburban,” and counties with codes 5, 7, and 9 were classified as “rural.”

Table 3-1: Availability of complete ASOS stations by climate region and urbanicity

U.S. Climate Region    Number of ASOS stations by urbanicity
                       Rural   Suburban   Urban   Total
Central                   12         11      63      86
East North Central        17          6      35      58
Northeast                 11         11      64      86
Northwest                 11         14      26      51
South                     28         10      68     106
Southeast                  4          6      72      82
Southwest                 17          8      19      44
West                       5          2      48      55
West North Central        34          5      10      49
Total                    139         73     405     617


3.3.2 Station-level comparison of model and station data

We compared daily station-level ASOS and NLDAS-based estimates by calculating the performance metrics r, D, and RMSD. Figures 3-2A-C present station-level r between ASOS and NLDAS estimates for the daily heat metrics Tmax, HImax, and Tavg, respectively. The correlation between ASOS and NLDAS estimates was generally high, with most locations having a correlation coefficient of 0.8 or greater. Relatively weaker correlations were observed in coastal areas of the Southeast, South, and West for all daily heat metrics, while Central and Midwestern areas of the U.S. had relatively high correlations. We evaluated station-level t as a function of distance (log scale) from the U.S. coastline (Figure B-4) and found that the correlation increased with distance from the coastline.
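The distance-to-coastline variable used in this comparison could be approximated as sketched below. This is not the GIS procedure used in the study: it assumes the coastline is available as a list of (latitude, longitude) vertices, approximates the distance to the coastline by the distance to the nearest vertex, and uses the haversine great-circle formula with a mean Earth radius. All names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_coast_km(station, coast_points):
    """Distance from a (lat, lon) station to the nearest coastline vertex."""
    return min(haversine_km(station[0], station[1], lat, lon)
               for lat, lon in coast_points)
```

The resulting distances can then be log-transformed and paired with each station's correlation coefficient to examine the relationship described above.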


Figure 3-2: Station-level Pearson correlation coefficient between ASOS observations and NLDAS-based estimates for daily heat metrics

We present station-level D between ASOS and NLDAS estimates for all the daily heat metrics in Figures 3-3A-C. The data ranges for the map display were set at (<-2℉), (-1.9–0℉), (0.1–1.9℉), and (>=2℉); the range offset of 2℉ is approximately the maximum error associated with ground-level temperature measurements provided by ASOS stations (NOAA 1998). Most station-level NLDAS estimates for Tmax in the Eastern U.S. underpredicted ASOS measurements, and most stations in the Northeast consistently underpredicted by more than 2℉ (D below -2℉). Tavg and HImax did not show this pattern of consistent bias in the Eastern U.S., but there was more variability for these heat metrics. NLDAS estimates in the Central and parts of the East North Central regions showed overprediction, but its magnitude was mostly less than 2℉.

Figure 3-3: Station-level median difference between ASOS observations and NLDAS-based estimates for daily heat metrics


We provide RMSD by climate region and year in Table B-4. Year-to-year variability in RMSD was less evident than regional variability. Of all the daily heat metrics considered, HImax showed the highest variability, whereas Tavg showed the least. Among all climate regions, the West showed the highest degree of variability for Tmax and HImax.

We provide the results of our comparison of estimates between SEARCH and the nearest ASOS station, and between SEARCH and NLDAS, in Table 3-2. We tabulate r, t, and median D for all daily heat metrics by urbanicity of SEARCH stations. Rural SEARCH stations did not have an ASOS station nearby, whereas most urban sites had an ASOS station close by. The relationship (based on correlation coefficient and median D) between ASOS and urban SEARCH monitors was relatively better than the relationship between ASOS and rural SEARCH monitors. Median D computed for the daily heat metric range (0–80℉) (footnote 15) was positive for most SEARCH sites, indicating overprediction, whereas underprediction was more common for median D computed for the daily heat metric range (>80℉). NLDAS-based estimates for daily heat metrics showed a higher degree of variability than ASOS-based estimates at all SEARCH locations. The correlation between NLDAS and SEARCH was slightly lower than the correlation observed between ASOS and SEARCH stations. Similar to what was observed between ASOS and SEARCH-based estimates, NLDAS estimates were more likely to underpredict SEARCH measurements at higher temperatures. We provide Bland-Altman plots in Figures B-5 through B-7 for all daily heat metrics. The plots indicate a high degree of variability and underprediction at higher temperatures for most SEARCH locations.

15 Differences were dichotomized at 80℉ to account for the starting range of the temperature alerts issued by the National Weather Service.
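The dichotomized summaries of D reported in Table 3-2 could be produced along the lines of the sketch below. It assumes that each day is assigned to a temperature range based on the SEARCH (reference) measurement and that D is the estimate minus the SEARCH measurement; both choices, like the function names, are assumptions of this example rather than details confirmed by the text.

```python
import statistics


def _summary(diffs):
    """Return (median, 5th percentile, 95th percentile) of the differences."""
    if len(diffs) < 2:
        return None  # percentiles are undefined for fewer than two values
    cuts = statistics.quantiles(diffs, n=20, method="inclusive")
    return (statistics.median(diffs), cuts[0], cuts[-1])


def summarize_by_range(reference, estimate, cutoff=80.0):
    """Summarize D = estimate - reference separately at or below and above
    the cutoff, where the range is assigned from the reference value."""
    low, high = [], []
    for ref, est in zip(reference, estimate):
        (high if ref > cutoff else low).append(est - ref)
    return {"0-80": _summary(low), ">80": _summary(high)}
```

A positive median in the lower range together with a negative median in the upper range would correspond to the pattern of overprediction at lower temperatures and underprediction at higher temperatures described above.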

Table 3-2: Station-level comparison at SEARCH locations

Distance is the distance between the SEARCH station and the nearest ASOS station. Correlation coefficients are given as (ASOS vs. SEARCH, NLDAS vs. SEARCH). Differences are given as median (5th, 95th percentile) by temperature range.

Metric Urbanicity Stn   Dist  Pearson       Kendall       ASOS vs. SEARCH       ASOS vs. SEARCH       NLDAS vs. SEARCH      NLDAS vs. SEARCH
                        (km)                Tau-B         (0-80)℉               (>80)℉                (0-80)℉               (>80)℉
HImax  Rural      CTR  52.60  (0.82, 0.74)  (0.66, 0.57)  0.24 (-3.47, 8.25)    -2.48 (-17.12, 2.8)   3.51 (-5.17, 17.23)   0.20 (-14.87, 6.65)
HImax  Rural      OAK  68.00  (0.79, 0.80)  (0.65, 0.64)  3.75 (-2.56, 14.95)   0.49 (-5.28, 6.84)    2.00 (-2.86, 14.17)   1.17 (-5.42, 6.83)
HImax  Rural      OLF  20.10  (0.75, 0.68)  (0.53, 0.47)  3.84 (-1.02, 15.37)   0.31 (-7.37, 11.46)   1.70 (-4.05, 14.43)   -1.53 (-13.28, 7.49)
HImax  Rural      YRK  49.40  (0.82, 0.77)  (0.70, 0.63)  3.47 (-1.08, 11.75)   1.88 (-12.76, 6.42)   2.19 (-4.04, 12.49)   1.34 (-14.1, 6.77)
HImax  Urban      BHM   5.90  (0.84, 0.79)  (0.7, 0.63)   -0.08 (-3.15, 8.68)   -1.36 (-12.42, 5.76)  1.28 (-5.50, 14.45)   -0.14 (-12.32, 7.04)
HImax  Urban      GFP   2.70  (0.65, 0.60)  (0.54, 0.46)  1.37 (-2.30, 21.66)   -0.79 (-25.20, 4.91)  -0.59 (-5.55, 16.94)  -2.57 (-27.96, 4.91)
HImax  Urban      JST   9.10  (0.92, 0.85)  (0.79, 0.69)  0.26 (-3.21, 6.13)    -1.56 (-9.69, 2.24)   -0.44 (-5.79, 9.91)   -1.37 (-11.03, 3.98)
HImax  Urban      PNS   8.20  (0.86, 0.74)  (0.70, 0.55)  0.40 (-1.99, 6.96)    -1.22 (-7.68, 4.22)   -3.14 (-7.32, 10.99)  -5.15 (-14.96, 2.94)
Tmax   Rural      CTR  52.60  (0.91, 0.82)  (0.72, 0.60)  0.64 (-2.85, 5.96)    -0.58 (-4.64, 3.36)   2.83 (-5.15, 10.12)   -1.01 (-6.84, 4.20)
Tmax   Rural      OAK  68.00  (0.78, 0.85)  (0.59, 0.64)  3.91 (-1.16, 8.24)    -0.2 (-5.37, 4.29)    1.88 (-2.86, 8.05)    -0.39 (-5.21, 3.19)
Tmax   Rural      OLF  20.10  (0.79, 0.69)  (0.57, 0.47)  3.84 (-0.19, 8.25)    1.22 (-2.87, 7.15)    0.74 (-4.05, 7.31)    -2.94 (-8.54, 2.18)
Tmax   Rural      YRK  49.40  (0.88, 0.81)  (0.73, 0.61)  3.94 (-0.56, 8.82)    3.04 (-5.22, 6.54)    1.72 (-4.04, 9.43)    0.26 (-8.84, 5.39)
Tmax   Urban      BHM   5.90  (0.91, 0.85)  (0.76, 0.64)  0.81 (-2.29, 9.15)    -0.26 (-3.94, 4.35)   1.21 (-5.5, 10.17)    -1.82 (-6.94, 3.59)
Tmax   Urban      GFP   2.70  (0.76, 0.72)  (0.58, 0.49)  1.99 (-1.30, 15.19)   0.7 (-6.10, 4.29)     -0.45 (-5.17, 8.76)   -2.42 (-8.95, 1.77)
Tmax   Urban      JST   9.10  (0.94, 0.88)  (0.79, 0.69)  1.02 (-2.83, 5.40)    -0.14 (-3.87, 3.57)   -0.08 (-5.41, 7.77)   -1.91 (-6.88, 2.41)
Tmax   Urban      PNS   8.20  (0.91, 0.63)  (0.76, 0.41)  1.15 (-1.21, 3.41)    0.87 (-1.99, 3.68)    -3.14 (-7.32, 5.08)   -5.13 (-11.19, 0.49)
Tavg   Rural      CTR  52.60  (0.85, 0.81)  (0.63, 0.57)  1.07 (-3.83, 5.12)    -1.41 (-6.19, 2.29)   1.88 (-4.57, 7.18)    -0.65 (-5.61, 3.85)
Tavg   Rural      OAK  68.00  (0.82, 0.82)  (0.59, 0.54)  2.49 (-2.45, 6.14)    0.61 (-3.13, 4.10)    2.02 (-3.14, 6.40)    0.73 (-4.02, 4.14)
Tavg   Rural      OLF  20.10  (0.79, 0.71)  (0.54, 0.42)  3.11 (-1.46, 7.59)    0.81 (-2.45, 4.30)    4.06 (-1.45, 8.05)    0.66 (-4.21, 3.86)
Tavg   Rural      YRK  49.40  (0.86, 0.87)  (0.64, 0.64)  2.77 (-1.99, 7.35)    -0.99 (-6.50, 4.58)   2.18 (-2.9, 6.93)     -1.52 (-5.77, 5.53)
Tavg   Urban      BHM   5.90  (0.88, 0.84)  (0.69, 0.63)  0.48 (-3.58, 7.68)    -0.88 (-5.01, 2.78)   0.63 (-3.76, 8.97)    -0.67 (-5.59, 3.41)
Tavg   Urban      GFP   2.70  (0.78, 0.74)  (0.55, 0.45)  0.12 (-4.84, 4.35)    -1.35 (-7.53, 2.31)   1.56 (-3.66, 6.65)    -0.89 (-6.73, 3.11)
Tavg   Urban      JST   9.10  (0.88, 0.88)  (0.67, 0.67)  0.16 (-4.62, 4.66)    -1.76 (-6.53, 1.97)   0.12 (-4.86, 5.12)    -0.93 (-5.84, 3.02)
Tavg   Urban      PNS   8.20  (0.91, 0.76)  (0.71, 0.45)  0.93 (-1.49, 3.94)    0.29 (-2.69, 2.74)    2.64 (-2.45, 6.53)    0.12 (-4.87, 3.40)


3.3.3 County-level comparison of mean EHE effect

We provide information on the absolute and relative change between mean EHE effect estimates computed using ASOS- and NLDAS-based exposure estimates in Table 3-3. The degree of agreement between the mean EHE effect estimates varied by climate region and EHE definition. The absolute difference between ASOS- and NLDAS-based mean EHE effect estimates was negative in most cases, which indicated underprediction by NLDAS. The Huth definition (see Table B-3) showed the greatest variability and consistent underprediction across all climate regions and exposure offset combinations. Notably, the Huth definition uses an extreme relative threshold and was also one of the EHE definitions most poorly associated with heat-related mortality. The definition based on daily mean temperature greater than the mean plus one standard deviation of the long-term climate normal for at least three consecutive days showed little variability and better agreement between ASOS- and NLDAS-based mean EHE effects. It is important to reiterate that this definition was also one of the EHE definitions most poorly correlated with heat-related mortality, and the magnitude of the estimated mean EHE effect was not very high compared to other definitions. Other EHE definitions used in this comparison did not follow a consistent pattern of underprediction, and the difference in mean EHE effect observed for these definitions varied with climate region. Of all climate regions considered, the magnitude of the difference in mean EHE effect for all definitions (excluding the Huth definition) was comparatively small for the West. In contrast, East North Central had a relatively higher magnitude of difference for all definitions but one (daily maximum and minimum temperature greater than the 80th percentile for at least three consecutive days). Differences in mean EHE effect for other regions varied with EHE definitions and exposure offset combinations.


Table 3-3: Absolute and relative change in mean EHE effect by U.S. climate region

Absolute and relative change in mean EHE effect, by U.S. climate region and exposure offset type. Each cell gives the absolute change followed by the relative change (%) in parentheses.

Offset  Central       East North    Northeast     North West    South         Southeast     Southwest     West
                      Central                     Central

Daily maximum heat index greater than 90℉ for at least 3 consecutive days
ExE1    -1.30 (-10)   -6.84 (-36)   -4.63 (-21)   0.86 (8)      0.86 (16)     0.23 (5)      -0.92 (-9)    -0.71 (-10)
ExE2    -1.07 (-8)    -7.78 (-40)   -4.22 (-19)   0.89 (9)      0.26 (4)      -0.21 (-4)    -1.54 (-13)   -0.90 (-13)
ExE3    -0.86 (-7)    -5.81 (-33)   -4.11 (-21)   1.49 (17)     0.23 (4)      -1.22 (-19)   -1.77 (-15)   -0.39 (-6)
lag0    0.24 (2)      -5.01 (-32)   -2.89 (-17)   0.79 (6)      1.62 (35)     0.19 (4)      -0.61 (-6)    -0.82 (-11)
lag1    -2.50 (-17)   -7.42 (-33)   -3.50 (-16)   0.90 (8)      0.65 (12)     0.58 (13)     -1.19 (-11)   -0.82 (-11)

Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days
ExE1    -2.90 (-20)   -0.49 (-4)    -6.27 (-37)   -1.50 (-14)   0.22 (5)      -1.03 (-18)   -1.55 (-22)   -0.35 (-6)
ExE2    -3.19 (-22)   1.25 (9)      -6.50 (-39)   -0.83 (-9)    0.15 (3)      -1.40 (-25)   -1.51 (-22)   -0.08 (-2)
ExE3    -2.37 (-17)   0.68 (6)      -5.46 (-37)   -0.72 (-9)    -0.15 (-3)    -1.56 (-28)   -1.23 (-20)   -0.28 (-6)
lag0    -3.33 (-22)   -2.03 (-16)   -6.30 (-42)   -1.83 (-14)   0.05 (1)      -1.68 (-28)   -0.86 (-13)   -0.06 (-1)
lag1    -3.80 (-22)   -0.35 (-2)    -7.68 (-40)   -3.17 (-28)   0.27 (6)      -0.52 (-9)    -2.29 (-29)   -0.44 (-8)

Daily maximum temperature greater than 95th percentile for at least 2 consecutive days
ExE1    -4.91 (-31)   -6.56 (-28)   1.86 (8)      -3.11 (-22)   -0.61 (-9)    -0.72 (-12)   0.56 (9)      -0.18 (-2)
ExE2    -4.32 (-30)   -5.68 (-26)   0.62 (3)      -2.49 (-21)   -0.25 (-4)    -0.51 (-9)    0.73 (12)     -0.39 (-5)
ExE3    -3.68 (-28)   -5.48 (-27)   0.34 (2)      -1.99 (-21)   0.03 (0)      -0.37 (-8)    0.30 (5)      -0.16 (-2)
lag0    -3.69 (-27)   -4.84 (-27)   -0.62 (-4)    -1.78 (-12)   -0.49 (-7)    -0.78 (-11)   1.48 (24)     0.10 (1)
lag1    -6.11 (-33)   -11.00 (-36)  2.74 (11)     -3.80 (-28)   -1.56 (-21)   0.54 (9)      0.68 (10)     -1.05 (-11)

Huth definition
ExE1    -2.01 (-37)   -4.37 (-40)   -1.61 (-21)   -4.57 (N/A)   -3.77 (-52)   -3.06 (-85)   -1.87 (-62)   -3.70 (-73)
ExE2    -1.98 (-37)   -4.20 (-40)   -1.63 (-22)   -2.52 (-66)   -3.18 (-47)   -3.23 (-87)   -1.91 (-65)   -4.22 (-78)
ExE3    -1.67 (-36)   -3.57 (-38)   -1.41 (-21)   -0.94 (-28)   -2.87 (-45)   -3.16 (-88)   -1.79 (-66)   -3.66 (-74)
lag0    -1.72 (-31)   -4.07 (-39)   -0.45 (-8)    -5.61 (N/A)   -3.02 (-46)   -3.28 (-84)   -1.01 (-43)   -3.77 (-77)
lag1    -1.86 (-32)   -5.49 (-42)   -1.35 (-16)   -5.62 (N/A)   -4.38 (-53)   -3.63 (-85)   -2.31 (-63)   -3.28 (-67)

Daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days
ExE1    -0.77 (-10)   -2.16 (-21)   0.83 (8)      -1.48 (-27)   0.36 (11)     -0.18 (-4)    -0.63 (-17)   0.37 (9)
ExE2    -1.19 (-16)   -2.16 (-20)   0.61 (6)      -0.84 (-18)   0.20 (6)      -0.17 (-5)    -0.66 (-18)   0.39 (10)
ExE3    -0.97 (-14)   -1.60 (-16)   0.96 (11)     -0.54 (-14)   0.18 (6)      -0.23 (-6)    -0.63 (-17)   0.26 (6)
lag0    -0.46 (-6)    -1.66 (-18)   -0.81 (-8)    -1.39 (-23)   0.37 (11)     0.03 (1)      -0.38 (-11)   0.49 (12)
lag1    -1.37 (-16)   -2.73 (-23)   1.70 (15)     -1.19 (-24)   0.12 (3)      0.43 (10)     -0.47 (-12)   0.40 (9)

16 N/A: Rate regression model for NLDAS-based exposure estimates did not converge for certain exposure offset combinations when using the Huth definition.

We also compared the mean EHE effect and the associated 95% confidence limits based on ASOS and NLDAS estimates (Figures B-8 through B-12). For most EHE definitions and climate regions (except North West Central), the mean EHE effect estimate from NLDAS was lower than that from ASOS, indicating a downward bias. However, the width of the confidence interval was very similar, or tighter in some cases, which indicated comparable variability in the mean EHE effect estimates.

3.4 Summary and Conclusions

Daily heat metric data obtained from weather stations have limited geographic coverage and possible gaps on a temporal scale. Such limitations can negatively impact our ability to conduct extreme heat-related research and surveillance on a routine basis. Thus, modeled meteorology predictions may provide a suitable alternative for use in research studies and surveillance efforts examining the environmental and health impacts of extreme heat. The utility of NLDAS data for extreme heat surveillance and research should be weighed against any potential bias and variability present in these predictions, and an evaluation is needed to characterize the benefits and limitations of modeled weather data. In this evaluation, we assess the utility of model-based estimates of daily heat metrics using a framework well suited to identify the pros and cons of the modeled meteorology data for health-related surveillance and research. This assessment sheds light on aspects that are critical to a large-scale adoption of modeled daily heat metric estimates in environmental health.

Model-based estimates and station-based estimates from ASOS comport well with each other. At most station locations, the correlation is high and the difference between station- and model-based estimates is within the maximum measurement error associated with ASOS stations. There are, however, certain areas in the U.S. where estimates from NLDAS do not correspond well with station-based measurements. The modeled estimates show variability, as indicated by relatively lower correlations, near the coastal areas of the South, Southeast, and West. Similarly, the Northeast shows a consistent negative difference with a magnitude greater than the maximum measurement error of weather stations.

While some of these differences are expected given the location of weather stations, certain region-specific discrepancies could be due to assumptions made in the modeling process. Some of these regional differences also arise from the lack of station-based observations available to calibrate the model. This is evident from our independent evaluation of model-based estimates against SEARCH measurements: performance of model-based estimates drops at SEARCH locations that do not have an ASOS station nearby. Users of model-based meteorology data from NLDAS should also take note of the variability in performance at high and low temperature ranges. At high temperatures (greater than 80℉), which are of particular interest in extreme heat-related research and surveillance, NLDAS-based estimates underestimate SEARCH measurements.

County-level analysis provided useful insights into the benefits and limitations of using NLDAS-based exposure estimates and highlighted certain region-specific and EHE definition-specific differences. For certain combinations of EHE definitions and regions, the difference in mean EHE effect is relatively high. In general, the degree of agreement between the ASOS- and NLDAS-based exposure estimates can be improved by omitting certain EHE definitions for certain regions. Underestimation of the mean EHE effect generated using NLDAS-based estimates, which is observed more frequently than overestimation, is a factor to consider for health studies. The variability associated with the mean EHE effect, based on the 95% CI, is comparable to the variability we see with ASOS-based exposure estimates. These insights are helpful to researchers and public health professionals interested in conducting health linkage studies, deriving exposure-response relationships, and estimating excess deaths related to extreme temperatures.

Generating county-level heat metric estimates for every county in the conterminous U.S. has wide-ranging potential for use in public health surveillance and research. Researchers and public health professionals will benefit greatly from using exposure estimates that align with the spatio-temporal resolution of health data. Further, NLDAS data are available for the years 1979-2011, which makes them an invaluable resource for linkage studies exploring health impacts associated with long-term extreme heat exposures. Several attributes support the use of model-based daily heat metric estimates for extreme heat-related health research and surveillance. Superior spatio-temporal coverage is a particularly appealing reason to adopt NLDAS-based modeled estimates, especially given the limitations of station-based measurements.

3.5 References

1. NOAA. 1998. Automated surface observing system (asos) user's guide.

2. Aiyyer A, Cohan D, Russell A, Stockwell W, Tanrikulu S, Vizuete W, et al. 2007.

Final report: Third peer review of the cmaq model.Submitted to the Community

Modeling and Analysis System Center, Carolina Environmental Program, The

University of North Carolina at Chapel Hill, 23pp.

67

3. Anderson B, Dominici F, Wang Y, Bell M, McCormack M, Peng R. 2013. Heat-related emergency hospitalizations for respiratory diseases in the Medicare population. American journal of epidemiology 177:S64-S64.

4. Anderson BG, Bell ML. 2009. Weather-related mortality: How heat, cold, and heat waves affect mortality in the United States. Epidemiology 20:205-213.

5. Anderson GB, Bell ML. 2011. Heat waves in the United States: Mortality risk during heat waves and effect modification by heat wave characteristics in 43 U.S. Communities. Environmental health perspectives 119:210-218.

6. Basu R. 2002. Relation between elevated ambient temperature and mortality: A review of the epidemiologic evidence. Epidemiologic Reviews 24:190-202.

7. Basu R, Dominici F, Samet JM. 2005. Temperature and mortality among the elderly in the United States. Epidemiology 16:58-66.

8. Basu R, Malig B, Ostro B. 2010. High ambient temperature and the risk of preterm delivery. American journal of epidemiology 172:1108-1117.

9. Gallo KP, Easterling DR, Peterson TC. 1996. The influence of land use/land cover on climatological values of the diurnal temperature range. Papers in Natural Resources:191.

10. Glahn HR, Lowry DA. 1972. The use of model output statistics (MOS) in objective weather forecasting. J Appl Meteorol 11:1203-1211.

11. Kleindienst TE, Lewandowski M, Offenberg JH, Edney EO, Jaoui M, Zheng M, et al. 2010. Contribution of primary and secondary sources to organic aerosol and PM2.5 at SEARCH network sites. Journal of the Air & Waste Management Association 60:1388-1399.

12. Luo L, Robock A, Mitchell KE, Houser PR, Wood EF, Schaake JC, et al. 2003. Validation of the North American Land Data Assimilation System (NLDAS) retrospective forcing over the southern Great Plains. Journal of Geophysical Research: Atmospheres (1984–2012) 108.

13. Mesinger F, DiMego G, Kalnay E, Mitchell K, Shafran PC, Ebisuzaki W, et al. 2006. North American regional reanalysis. B Am Meteorol Soc 87.

14. Michalakes J, Chen S, Dudhia J, Hart L, Klemp J, Middlecoff J, et al. Development of a next generation regional weather research and forecast model. In: Proceedings of the Developments in Teracomputing: Proceedings of the Ninth ECMWF Workshop on the use of high performance computing in meteorology, 2001, Vol. 1. World Scientific, 269-276.

15. Mitchell KE, Lohmann D, Houser PR, Wood EF, Schaake JC, Robock A, et al. 2004. The multi-institution North American Land Data Assimilation System (NLDAS): Utilizing multiple GCIP products and partners in a continental distributed hydrological modeling system. Journal of Geophysical Research: Atmospheres (1984–2012) 109.

16. O'Neill M, Zanobetti A, Schwartz J. 2002. Modifiers of the temperature and mortality association in four US cities. Epidemiology 13:S81-S81.

17. O'Neill MS, Zanobetti A, Schwartz J. 2003. Modifiers of the temperature and mortality association in seven US cities. American journal of epidemiology 157:1074-1082.

18. O'Neill MS, Hajat S, Zanobetti A, Ramirez-Aguilar M, Schwartz J. 2005. Impact of control for air pollution and respiratory epidemics on the estimated associations of temperature and daily mortality. International journal of biometeorology 50:121-129.

19. Ostro B, Rauch S, Green R, Malig B, Basu R. 2010. The effects of temperature and use of air conditioning on hospitalizations. American journal of epidemiology 172:1053-1061.

20. Ritter B, Geleyn J-F. 1992. A comprehensive radiation scheme for numerical weather prediction models with potential applications in climate simulations. Monthly Weather Review 120:303-325.

21. Rodell M, Houser P, Jambor U, Gottschalck J, Mitchell K, Meng C-J, et al. 2004. The Global Land Data Assimilation System. B Am Meteorol Soc 85.

22. Rothfusz LP. 1990. The heat index equation (or, more than you ever wanted to know about heat index). Fort Worth, Texas: National Oceanic and Atmospheric Administration, National Weather Service, Office of Meteorology:90-23.

23. Vaidyanathan A, Dimmick WF, Kegler SR, Qualters JR. 2013. Statistical air quality predictions for public health surveillance: Evaluation and generation of county level metrics of PM2.5 for the environmental public health tracking network. International journal of health geographics 12.

24. Zanobetti A, Schwartz J. 2008. Temperature and mortality in nine US cities. Epidemiology 19:563-570.

25. Zanobetti A, O'Neill MS, Gronlund CJ, Schwartz JD. 2012. Summer temperature variability and long-term survival among elderly people with chronic disease. Proceedings of the National Academy of Sciences of the United States of America 109:6608-6613.

26. Zheng M, Cass GR, Schauer JJ, Edgerton ES. 2002. Source apportionment of PM2.5 in the southeastern United States using solvent-extractable organic compounds as tracers. Environmental Science & Technology 36:2361-2371.

CHAPTER 4 Characterizing the Effect of Meteorology on Ozone Levels during Extreme Heat Events

4.1 Introduction

An extreme heat event (EHE) is defined as a sustained period of abnormally and uncomfortably hot, and usually humid, weather (Meehl and Tebaldi 2004). The adverse effects of EHEs on mortality and morbidity have been documented in city-specific and regional studies (Analitis et al. 2014; Anderson and Bell 2011; Bouchama 2004; Jones et al. 1982; Knowlton et al. 2009; Meehl and Tebaldi 2004; Semenza et al. 1996; Semenza et al. 1999). Existing literature supports an association between EHEs and health outcomes, including cardiovascular diseases, respiratory diseases, renal failure, and mental health issues (GB Anderson et al. 2013; Braga et al. 2002; D'Ippoliti et al. 2010; Hansen et al. 2008; Kovats et al. 2004; Semenza et al. 1999). Tropospheric ozone, a criteria pollutant regulated under the Clean Air Act of the United States (U.S.), is also known to adversely impact health. Several studies have found a consistent positive relationship between ambient ozone exposure and hospitalizations or emergency department visits for respiratory diseases, such as asthma and chronic obstructive pulmonary disease (Burnett et al. 1997; Strickland et al. 2010). Studies have also shown an association between short- and long-term ozone exposure and mortality (Bell et al. 2004; Bell et al. 2005; Jerrett et al. 2009).

Meteorology plays a dominant role in the formation of air pollutants, in particular tropospheric ozone. Several meteorology adjustment analyses conducted in the U.S. have explored the relationship between ozone and meteorological variables (Baur et al. 2004; Bloomfield et al. 1996; Camalier et al. 2007; Cox and Chu 1996; Lehman et al. 2004; Lu and Chang 2005; Wise and Comrie 2005). Statistical methods used to study the relationship between ozone and meteorology are well documented (Thompson et al. 2001). The primary motivation behind these studies has been to explore the interannual variations in ozone concentrations or to account for the variations in meteorology when studying the impact of emission-reduction efforts and human activities on ozone levels.

During the European heat wave of 2003, many western and central European countries experienced the highest ozone concentrations on record since the 1980s (García-Herrera et al. 2010; Solberg et al. 2008; Vautard et al. 2005). A recent study conducted in Europe (Analitis et al. 2014) concluded that heat wave-related mortality was 54% higher on high ozone days compared with low ozone days among people aged 75-84. To an extent, the impact of EHEs on ozone concentrations can be surmised from the relationship between high temperatures and ozone; however, it is worthwhile to quantify with greater certainty the impacts of consecutive days of heat stress on ozone levels in the U.S. Additionally, our understanding of how EHEs modify the relationship between meteorological variables and ozone is limited. It is worthwhile to examine the prevailing levels of meteorological variables on EHE days and non-EHE days, and to evaluate whether EHEs encapsulate variations in multiple meteorological variables that are associated with higher ozone concentrations. In this study, we have the following objectives: (1) explore the effect of meteorological variables on ozone levels, conditioned on EHE and non-EHE days; and (2) if the effect of meteorological variables on ozone is significant, characterize the degree of effect modification by EHEs at city and regional scales.


4.2 Methods

4.2.1 Meteorology and ozone data

We used station-based meteorology data for years 1999-2009 and selected stations in the conterminous U.S. (lower 48 states) that were automated surface observing system (ASOS) units for this analysis. For each station, we set a daily completeness threshold of 75% for hourly observations (at least 18 of 24 hourly measurements available) for computing daily summaries of the weather variables. For each county, we calculated an average of all available daily station-based summaries to create county-level estimates of daily weather variables. We then applied a 95% completeness threshold for the daily county-level estimates of the weather variables across the summer months (May 1 through September 30). Finally, we only included counties for which sufficiently complete data were available for all 11 years (1999-2009) of the analysis period.

We selected the following meteorological variables: precipitation, pressure, relative humidity, temperature, wind direction, and wind speed. In addition to these variables, we selected variables for sky characteristics, which were converted to cloud cover fraction and ultimately used to compute cloud-adjusted solar radiation. We also computed hours of daylight (time difference between sunrise and sunset). All these weather variables were summarized daily but with different averaging periods in a given day: (1) the 24-hour period, (2) daylight hours (from sunrise to sunset), and (3) the period from noon to 16:00 hours. The list of meteorological variables and the daily metrics associated with each variable is provided in Table 4-1.
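The completeness rules described above can be illustrated with a short Python sketch. It is not the processing code used for this analysis: the function names, the use of `None` to mark a missing value, and the choice of the mean as the daily summary are assumptions made for the example.

```python
from statistics import mean

HOURS_PER_DAY = 24
SUMMER_DAYS = 153  # May 1 through September 30


def daily_mean(hourly, min_fraction=0.75):
    """Daily mean of one station-day, or None if fewer than 75% of hours are present.

    `hourly` holds 24 readings; a missing hour is represented by None (assumed convention).
    """
    valid = [v for v in hourly if v is not None]
    if len(valid) < min_fraction * HOURS_PER_DAY:
        return None
    return mean(valid)


def county_daily_estimate(station_summaries):
    """Average all available station-level daily summaries for one county-day."""
    valid = [v for v in station_summaries if v is not None]
    return mean(valid) if valid else None


def summer_is_complete(daily_estimates, min_fraction=0.95):
    """Apply the 95% completeness rule to one summer of county-day estimates."""
    n_valid = sum(v is not None for v in daily_estimates)
    return n_valid >= min_fraction * SUMMER_DAYS
```

Under these assumptions, a station-day with 18 valid hours yields a daily summary while one with 17 does not, and a county-summer needs at least 146 of 153 valid days to pass the 95% rule.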

Table 4-1: List of meteorological variables

| Meteorological variable | Daily summary metrics |
| --- | --- |
| Temperature (T) (℉) | Maximum, minimum, diurnal temperature change, apparent temperature (includes relative humidity), deviation from 11-year summertime daily max and mean temperature, deviation from 30-year climate normal daily temperature |
| Wind Speed (WS) (m/s) | Average and max daily wind speed; wind speed observed at the time of maximum and minimum temperature. Also, calculated the following parameters based on inverse of wind speed |
| Relative Humidity (RH) (%) | Average and max relative humidity; relative humidity observed at the time of maximum and minimum temperature |
| Pressure (P) (mb) | Average and maximum pressure observed at the station |
| Cloud Cover (CC) (OKTA) | Average and maximum cloud cover fraction |
| Solar Insolation (SI) (W/sq.m) | Total daily net solar insolation, average daily solar insolation, cloud adjusted solar insolation |
| Precipitation (Precip) (mm) | Average and maximum hourly precipitation, total daily net precipitation |
| Other | Weekend indicator; holiday indicator; and year |

We obtained daily 8-hour maximum ozone concentrations in parts per billion (ppb), along with supplemental data fields such as latitude, longitude, and elevation, for all the monitoring sites across the U.S. from the Environmental Protection Agency (EPA). The data were obtained only from monitors that are designated as Federal Reference Methods or equivalent. We retained observations associated with exceptional events. We then selected ozone monitors that were at least 90% complete during the summer months. We also restricted our selection of ozone monitors to those that have an ASOS station in close proximity, i.e., the distance between the ASOS station and the ozone monitor does not exceed 10 kilometers (km). From these complete stations, we selected 27 cities representing different climate regions in the U.S. (http://www.ncdc.noaa.gov/monitoring-references/maps/us-climate-regions.php). Cities selected for this analysis are shown in Figure 4-1. Additionally, due to insufficient monitor-station pairs in the West North Central and Northwest regions, we combined these two regions into a single region called “North West Central.”
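The 10 km proximity rule for pairing ozone monitors with ASOS stations can be sketched as follows. This is a minimal illustration only; the source does not state how distances were computed, so the great-circle (haversine) formula, the function names, and the identifiers in the usage note are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation, assumed)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pair_monitors(monitors, stations, max_km=10.0):
    """Map each ozone monitor to its nearest ASOS station within max_km.

    Both arguments are dicts of id -> (lat, lon). Monitors with no station
    inside the radius are left out of the result.
    """
    pairs = {}
    for mon_id, (mlat, mlon) in monitors.items():
        nearest_id, nearest_km = None, float("inf")
        for st_id, (slat, slon) in stations.items():
            d = haversine_km(mlat, mlon, slat, slon)
            if d < nearest_km:
                nearest_id, nearest_km = st_id, d
        if nearest_km <= max_km:
            pairs[mon_id] = nearest_id
    return pairs
```

For example, with hypothetical inputs `{"mon_A": (33.75, -84.39), "mon_B": (40.00, -100.00)}` and `{"asos_1": (33.78, -84.40)}`, only `mon_A` is paired, because `mon_B` has no station within 10 km.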

Figure 4-1: Cities selected for this analysis

4.2.2 EHE Definitions

There is a lack of scientific consensus in the available literature on definitions and procedures to accurately identify periods of extreme heat. For this study, we selected 92 different EHE definitions that have been used in scientific research and/or widely cited in the literature (refer to Table A-1). We operationalized each EHE definition as a binary (“Yes (1)/No (0)”) variable that categorized each day in the summer months as either an “EHE day” or a “non-EHE day.”
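As an illustration of how a threshold-and-duration EHE definition might be turned into a binary daily variable, consider the sketch below. The function name is hypothetical, and flagging every day of a qualifying run (rather than, say, only the days from the minimum duration onward) is an assumption; individual definitions in the literature differ on this point.

```python
def flag_ehe_days(daily_values, threshold, min_run):
    """Return a 0/1 list marking days that belong to a run of at least
    `min_run` consecutive days with values strictly above `threshold`."""
    flags = [0] * len(daily_values)
    run_start = None
    # A trailing None acts as a sentinel so that a run ending on the last day is closed.
    for i, value in enumerate(list(daily_values) + [None]):
        hot = value is not None and value > threshold
        if hot and run_start is None:
            run_start = i
        elif not hot and run_start is not None:
            if i - run_start >= min_run:
                for j in range(run_start, i):
                    flags[j] = 1
            run_start = None
    return flags
```

With a threshold of 90 and a minimum run of 3 days, the series `[88, 91, 92, 93, 89, 95, 96, 85]` flags only days 2 through 4; the two-day run near the end does not qualify.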

Separately evaluating 92 different EHE definitions/variants becomes onerous and, hence, we used cluster analysis as a preliminary data reduction technique to group EHE definitions/variants into homogeneous sets. We applied a hierarchical clustering technique, and employed an average distance metric to determine distances between clusters that might be merged in each step of the clustering process (Zhang et al. 1996). Average distance is calculated using the following formula:

$$D(C_a, C_b) = \frac{1}{n_a n_b} \sum_{i \in C_a} \sum_{j \in C_b} d(x_i, x_j) \qquad (2)$$

$C_a$ and $C_b$ are two disjoint clusters; $n_a$ and $n_b$ are the number of members within clusters $C_a$ and $C_b$, respectively; $d(x_i, x_j)$ is the Euclidean distance between member $x_i$ of $C_a$ and member $x_j$ of $C_b$.
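The average distance between two clusters can be computed directly from its definition, as in the following sketch. Representing each cluster member as a numeric vector (for instance, a vector of daily 0/1 EHE flags for one definition) is an assumption made for illustration, as is the function name.

```python
import math


def average_linkage(cluster_a, cluster_b):
    """Mean Euclidean distance over all pairs with one member from each cluster.

    Each cluster is a non-empty list of equal-length numeric vectors.
    """
    total = sum(math.dist(p, q) for p in cluster_a for q in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))
```

In an agglomerative procedure, the pair of clusters with the smallest such distance would be merged at each step.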

We divided the final hierarchical cluster, one “big” cluster consisting of all definitions, into smaller clusters. We delineated clusters by taking certain metrics into consideration, such as the overall R-squared, pseudo-F, and pseudo-t-squared indices (Edens et al. 1999). One representative EHE definition was then selected from each high-level cluster. Candidate definitions were identified according to the following criteria: (1) EHE definitions/variants that are well-recognized in the literature; (2) application in studies conducted in the U.S.; and (3) application in nationally representative studies, i.e., those studies that covered the various climate regions of the U.S. Among the candidates meeting these criteria to the extent possible, we made our final selection of EHE definitions to reflect differentiated combinations of the core variables that are used to operationalize the EHE definitions.

4.2.3 Modeling approach

We executed a multivariate regression model with all meteorological variables listed in Table 4-1. This saturated model was reduced in a step-wise approach (“backward one variable deletion”) by eliminating variables that were highly correlated with one another and variables that offered little explanatory power (Camalier et al. 2007). We also examined model diagnostics, and specifically evaluated the residuals for autocorrelation. Since autocorrelation was present and was not due to missing predictors (footnote 17), we employed an autoregressive (AR) modeling approach. Without controlling for autocorrelation of residuals, the regression coefficients no longer have the minimum variance property, and the variance of the error terms is underestimated. Consequently, the standard errors of the parameter estimates are biased and the confidence limits are inaccurate.

The order of autocorrelation varied by city; hence, we used a step-wise autoregressive approach: we initially fit a higher-order (n=20) autoregressive model, then sequentially eliminated non-significant autoregressive parameters. We used an AR model with the Yule-Walker estimation method (Kegler et al. 2001) and used the Durbin-Watson (DW) statistic to test for autocorrelation (Durbin and Watson 1971). Our final daily time series model with a correlational structure for the residuals is specified as follows:

$$\log(\mathrm{O_3})_t = \alpha + \sum_{i} \beta_i X_{i,t} + \sum_{j} \gamma_j \, EHE_{j,t} + \sum_{i,j} \delta_{ij} \, (X_{i,t} * EHE_{j,t}) + \nu_t \qquad (2a)$$

$$\nu_t = \varepsilon_t - \sum_{k=1}^{s} \varphi_k \, \nu_{t-k} \qquad (2b)$$

where,

$\mathrm{O_3}$: ozone concentration (ppb);

$\alpha$: intercept;

$X_i$: meteorological variable;

$\beta_i$: parameter estimate for the meteorological variable;

$EHE_j$: binary indicator variable for operationalized EHE;

$\gamma_j$: parameter estimate for the binary variable referring to the EHE definition and exposure offset combination;

$\delta_{ij}$: parameter estimate for the term denoting the interaction between meteorological variable and EHE;

$\nu_t$: error term;

$\varphi_k$: autocorrelation parameter;

$\varepsilon_t$: disturbances (independent, normal random variables);

$s$: period (number of days considered) or order of autocorrelation.

17 We verified that autocorrelation was not due to missing predictors by purposely excluding a significant predictor from the model and examining the residuals for autocorrelation. We then included the omitted predictor and executed the model again to check whether the residual autocorrelation was still present. If autocorrelation was present after the inclusion of the missing predictor, we concluded that autocorrelation was not due to missing predictors.
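The two residual diagnostics named above, the Durbin-Watson statistic and Yule-Walker estimation of autoregressive parameters, can be sketched in plain Python as follows. This is an illustrative sketch rather than the SAS procedure used in this analysis: it fits a fixed autoregressive order through the Levinson-Durbin recursion and does not perform the step-wise elimination of non-significant lags. The function names are hypothetical, and the coefficients are returned in the convention e_t = a_1 e_(t-1) + ... + a_p e_(t-p) + noise, which some software reports with the opposite sign.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: near 2 suggests no first-order autocorrelation,
    values toward 0 suggest positive and toward 4 negative autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


def autocovariances(x, max_lag):
    """Biased sample autocovariances of x at lags 0..max_lag."""
    n = len(x)
    m = sum(x) / n
    return [sum((x[t] - m) * (x[t - k] - m) for t in range(k, n)) / n
            for k in range(max_lag + 1)]


def yule_walker(x, order):
    """Solve the Yule-Walker equations with the Levinson-Durbin recursion.

    Returns [a_1, ..., a_p] for the model x_t = sum_k a_k * x_(t-k) + noise.
    """
    r = autocovariances(x, order)
    coeffs = []
    err = r[0]  # prediction error variance of the order-0 model
    for k in range(1, order + 1):
        acc = r[k] - sum(coeffs[j] * r[k - 1 - j] for j in range(k - 1))
        refl = acc / err  # reflection (partial autocorrelation) coefficient
        coeffs = [coeffs[j] - refl * coeffs[k - 2 - j] for j in range(k - 1)] + [refl]
        err *= 1.0 - refl ** 2
    return coeffs
```

In practice these functions would be applied to the residuals of the regression on the meteorological predictors, with the fitted order chosen separately for each city.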

We evaluated model diagnostics including (but not limited to) regress R-squared, total R-squared, mean absolute error (MAE), and mean relative accuracy (MRA) (Hu et al. 2013) to assess the goodness of fit at each city. The regress R-squared diagnostic measures the explanatory value associated with the candidate predictors, whereas total R-squared measures the explanatory value associated with the candidate predictors and the autoregressive lag component (SAS 2011). We conducted an out-of-sample model validation using 2010 data, which were not used in the modeling process.
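A minimal sketch of two of these error metrics is given below. The MAE follows its standard definition. The relative accuracy formula, however, is an assumption made for illustration (one minus the mean absolute error relative to the observed value, expressed as a percentage); the exact definition used in the cited work should be taken from that source.

```python
def mean_absolute_error(observed, predicted):
    """Mean absolute difference between observed and predicted values (same units, e.g. ppb)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mean_relative_accuracy(observed, predicted):
    """ASSUMED definition: 100 * (1 - mean(|observed - predicted| / observed)).

    Observed values must be non-zero.
    """
    rel_errors = [abs(o - p) / o for o, p in zip(observed, predicted)]
    return 100.0 * (1.0 - sum(rel_errors) / len(rel_errors))
```

For an out-of-sample check, the model would be fit on one set of years and both metrics computed on predictions for the held-out year, after transforming predictions back from the log scale.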

4.2.4 Sensitivity of meteorology-ozone relationships to EHE definitions

Meteorology varies by climate region (and many of these regions have micro-climates resulting in local variations); hence, the effect of meteorological variables on ozone could change across the cities we selected for this analysis. Additionally, the relationship between meteorology and ozone during EHEs could depend on the EHE definition, which could, in turn, influence the effect sizes. We conducted a summary-level pooled analysis of mean (95% CI) parameter estimates of meteorological variables to account for the heterogeneity in effect sizes arising from EHE definitions and to generalize the effect of meteorology on ozone for each city and climate region. This summary-level pooled analysis was analogous to a meta-analysis of effect sizes from studies with different subjects or study participants (Borenstein and Higgins 2013; Mortimer et al. 2012; Shah et al. 2005). In our analysis, each model run executed with a different EHE definition was akin to a separate study. Specifically, we used a random effects model to conduct the summary-level pooled analyses to account for differences in effect sizes arising from random variability as well as the error introduced by selecting a particular EHE definition. We used diagnostics such as I-squared (Higgins et al. 2003) to check for the presence and magnitude of heterogeneity by city. Further, we extended this summary-level pooled analysis to generate mean and 95% confidence interval (CI) effect sizes by climate region.
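A random effects pooling of definition-specific slope estimates, together with the I-squared heterogeneity statistic, can be sketched as below. The source states only that a random effects model was used; the DerSimonian-Laird estimator of between-run variance shown here is a common choice and is assumed for illustration, as are the function name and the normal-approximation 95% limits.

```python
import math


def random_effects_pool(estimates, std_errors):
    """Pool effect estimates with a DerSimonian-Laird random effects model (assumed method).

    Returns the pooled mean, 95% confidence limits, the between-estimate
    variance tau^2, and I-squared (percent of variability due to heterogeneity).
    """
    k = len(estimates)
    variances = [se ** 2 for se in std_errors]
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "mean": pooled,
        "lower": pooled - 1.96 * se,
        "upper": pooled + 1.96 * se,
        "tau2": tau2,
        "i_squared": i_squared,
    }
```

For a single city, the inputs would be the slope estimates and standard errors for one predictor from the model runs under each EHE definition; pooling by climate region would extend the same inputs across the cities in that region.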

We carried out our data analyses using the Statistical Analysis System (SAS® Version 9.3), Environmental Systems Research Institute’s GIS software (ESRI, ArcGIS® Version 9.3), and Comprehensive Meta-Analysis software (CMA® Version 2.0).

4.3 Results

4.3.1 Descriptive summary

Ozone is monitored during the designated ozone season (http://epa.gov/ttn/naaqs/ozone/ozonetech/40cfr58d.htm), which varies by state. Ozone concentrations are highest during the warmer months of the year, May–September. For most cities considered in this analysis, we observed higher ozone concentrations in July and August. We also observed that for certain cities in the warmer climate regions, such as Fresno, CA, Las Vegas, NV, and Phoenix, AZ, a higher number of days with ozone concentrations greater than 75 ppb (footnote 18) were seen as early as May. Figure 4-2 shows the monthly distribution of ozone concentrations in the 27 cities considered in the analysis for 1999-2009.

Figure 4-2: Ozone concentrations by month in 27 cities

The candidate predictors for this analysis were chosen based on the results from an ordinary least squares (OLS) regression model. In order to maintain consistency when comparing the effect of meteorological variables on ozone across all cities, we selected a common subset of candidate predictors for all cities. Further, we selected a set of predictors that were not correlated with one another and offered the best explanatory power. We settled on the following predictors in the autoregressive modeling phase: (1) inverse of daily (24-hour) mean wind speed (InvWS), (2) daily mean relative humidity (RH), (3) daily maximum temperature (Tmax), (4) daily cloud cover adjusted net solar insolation (SI), (5) EHE definition indicator, (6) interaction term between EHE indicator and inverse of daily mean wind speed (EHE*InvWS), (7) interaction term between EHE indicator and daily mean relative humidity (EHE*RH), and (8) interaction term between EHE indicator and daily maximum temperature (EHE*Tmax).

18 The national ambient air quality standard (NAAQS) for ozone is set at 75 ppb.

We examined the monthly distributions of the predictor variables to understand the city-specific variations in meteorology (Figures C1-4). Monthly distributions for Tmax were similar across cities, with peak Tmax occurring in July and August. Most cities in the warmer climate regions had higher extreme temperatures, with peak values occurring as early as May or as late as September. InvWS, which can be understood as a metric of stagnation, showed little variation during the summer months; however, certain cities had noticeably higher stagnation values. Baton Rouge, LA, Birmingham, AL, and Portland, OR, are a few cities with a higher degree of stagnation. We noticed that patterns in RH did not vary across the summer months but varied across cities. Cities in the arid regions of the Southwest and West had relatively lower levels of RH, while other cities had similar ranges in RH levels. Day-to-day variations were noticeable in SI, but such variations were common to all cities considered in this analysis.

We used cluster analysis to select EHE definitions for this analysis. We settled on five definitions from five different clusters. “Definition 1”, daily maximum heat index greater than 90 ℉ for at least 3 consecutive days (Burrows 1900), was selected from a cluster with definitions based on absolute thresholds. “Definition 2”, daily maximum and minimum temperature greater than the 80th percentile for at least 3 consecutive days (Easterling et al. 2000), was selected from a cluster that consisted of definitions with relatively moderate thresholds. “Definition 3”, daily maximum temperature greater than the 95th percentile for at least 2 consecutive days (Anderson and Bell 2011), was selected from a cluster with definitions that used a relatively high threshold. “Definition 4”, the Huth definition (footnote 19), was selected from a cluster consisting of definitions with extreme thresholds. “Definition 5”, daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days (Arguez et al. 2012; Pascal et al. 2006), was selected from a cluster of definitions with climate normal-based thresholds that were either relatively moderate or low. The list of EHE definitions considered in this analysis is provided in Table 4-2.

19 Per Huth’s definition, a heat wave is defined as the longest period of consecutive days satisfying the following three conditions: 1. The daily maximum temperature is above T1 (97.5th percentile) for at least 3 consecutive days; 2. The daily maximum temperature is above T2 (81st percentile) during the entire period; 3. The average of daily maximum temperature over the entire period is greater than T1.

Table 4-2: EHE definitions used in this analysis

| Cluster id | Definition name | EHE definition | Daily heat metric | Threshold type | Threshold value | Duration |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Definition 1 | Daily maximum heat index greater than 90 ℉ for at least 3 consecutive days (Burrows 1900). | HImax | Absolute | >90 ℉ | 3+ consecutive days |
| 2 | Definition 2 | Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days (Easterling et al. 2000). | Tmax and Tmin | Relative | >80th percentile | 3+ consecutive days |
| 3 | Definition 3 | Daily maximum temperature greater than 95th percentile for at least 2 consecutive days (Anderson and Bell 2011). | Tmax | Relative | >95th percentile | 2+ consecutive days |
| 4 | Definition 4 | Huth definition (Bobb et al. 2011; Huth et al. 2000) | Tmax | Relative | T1: >97.5th percentile; T2: >81st percentile | Every day >T2, 3+ consecutive days >T1, and average Tmax >T1 for the whole time period |
| 5 | Definition 5 | Daily mean temperature greater than mean + 1 SD of climate normal for at least 3 consecutive days (Arguez et al. 2012; Pascal et al. 2006) | Tavg | Relative | >mean + 1 SD of climate normal | 3+ consecutive days |

The EHE definitions considered in this analysis varied in severity and, as a result, identified different sets of days as EHE and non-EHE days. We examined the levels of the meteorological predictors that had an interaction term with the EHE variable (InvWS, Tmax, and RH) on EHE and non-EHE days. Figures C5-7 show city- and definition-specific distributions of meteorological variables on EHE and non-EHE days. As expected, the range of values for Tmax was higher during EHEs. We observed that stagnation, as measured by InvWS, was slightly higher on EHE days than on non-EHE days for most cities and EHE definitions. During EHEs, most cities had lower RH and, of note, the more severe the definition, the lower the RH values.

The autoregressive models were executed separately for each city with the same set of candidate predictors and different EHE definitions. We noticed that for a few cities, some of these predictors were not significant and/or were correlated with another predictor. We provide scatter plots between candidate predictors and city-specific Pearson correlation coefficients (r) between meteorological predictors in Figure C-8 and Table C-1, respectively. For example, SI was positively correlated (r > 0.50) with RH for 5 of the 27 cities, whereas Tmax was negatively correlated (r < -0.50) with RH for 7 of the 27 cities. In such instances, we decided not to exclude those predictors from the modeling process in order to maintain a consistent set of predictors across all cities. Table C-2 provides the goodness-of-fit measures used to assess model performance for all cities and EHE definitions. Model performance, evaluated based on the regress R-squared and total R-squared statistics, varied by city but not by EHE definition. Atlanta, GA had the highest total R-squared and Albuquerque, NM had the lowest total R-squared among all cities considered in this analysis. For the majority of cities, most of the explanatory value came from the candidate predictors; in certain cities, however, the autoregressive lag component accounted for a substantial proportion of the explanatory value. In Los Angeles, CA, for example, less than 5% of the explanatory value came from the meteorological predictors or the EHE definition variable. Results from our out-of-sample validation using 2010 data corroborated much of the goodness-of-fit metrics obtained from our in-sample evaluation. MAE was less than 8 ppb, and MRA was greater than 75%, for the majority of the cities.

4.3.2 Impact of meteorology on ozone and effect modification during EHEs

We used parameter estimates (slope factors) obtained from the autoregressive model to describe the effect of meteorology on ozone. The main effect associated with each meteorological predictor provides a “baseline” effect or relationship between meteorology and ozone. This effect is applicable to both EHE and non-EHE days. The parameter estimates associated with interaction terms describe the effect modification during EHEs, or the “EHE day” effect. Table C-3 provides the slope factors for each city and for all the predictors used in the model. The slope factors were generated based on the summary-level pooled analysis. The baseline effect, which reflects the overall summertime relationship between meteorological variables and ozone, is consistent with the published literature. We noticed that an increase in Tmax resulted in higher ozone levels, and the magnitude of the effect, measured in terms of slope, varied by city. In general, cities in the Central, East North Central, and Northeast climate regions showed a much stronger relationship with Tmax, while the Southwest and West climate regions showed the weakest relationship between Tmax and ozone among all of the climate regions. The relationship between Tmax and ozone was positive in the South, Southeast, and North West Central climate regions, but the magnitude of that effect fell between those of the regions with the highest and lowest effects.
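To make the roles of the main effect and the interaction term concrete, the following short worked form considers a single meteorological variable $X$ with all other predictors held fixed. The symbols $\beta$ (main-effect slope), $\gamma$ (coefficient on the EHE indicator), and $\delta$ (interaction slope) are labels chosen here for illustration, and the error term is omitted.

$$
\log(\mathrm{O_3}) =
\begin{cases}
\alpha + \beta X, & \text{non-EHE day } (EHE = 0) \\
(\alpha + \gamma) + (\beta + \delta) X, & \text{EHE day } (EHE = 1)
\end{cases}
$$

Under this reading, $\beta$ is the baseline slope and $\delta$ is the additional slope on EHE days, so a positive $\delta$ indicates that the same increase in $X$ is associated with a larger increase in log ozone during an EHE.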

We observed that an increase in InvWS (or a decrease in wind speed) was associated with higher ozone concentrations; however, the effect of InvWS on ozone was mostly felt in cities located in the South, Southeast, and West. Also, the magnitude of the effect of InvWS on ozone was less pronounced than that of Tmax. The effect of RH on ozone was present but generally small compared to the effect associated with Tmax or InvWS. An increase in RH levels was associated with a decrease in ozone levels. The effect of SI on ozone was negligible for most cities.

The extent of effect modification during EHEs varied by city. For certain cities, effect modification was sensitive to the EHE definition selected. We examined this heterogeneity in effect modification using the I-squared statistic (data not shown) that was generated during the summary-level pooled analysis. Table C-4 provides the mean (95% CI) slope factors for the baseline effect and the EHE day effect for those meteorological variables that have interaction terms in the autoregressive model. In general, EHE definitions that use low to moderate thresholds were associated with a higher magnitude of effect modification. The summary-level pooled analysis provides a generalized mean estimate with 95% confidence limits that represents the overall magnitude and uncertainty associated with effect modification during EHEs. Figure 4-3A-B shows profile plots describing the effect of InvWS and Tmax on ozone during EHE and non-EHE days; the plots also have two vertical reference lines to indicate the range of values observed for the meteorological variables during EHEs. Table 4-3 summarizes the extent of effect modification by region based on the summary-level pooled analysis conducted across cities (and definitions) within each climate region. Effect modification of the relationship between Tmax and ozone was most prominent in Boston, MA; however, other cities in the Northeast showed little or no effect modification during EHEs. Similarly, Portland, OR showed significant effect modification during EHEs, but the confidence interval associated with the mean estimate was wider than that for Boston. Apart from Boston, MA and Portland, OR, Las Vegas, NV and Los Angeles, CA showed effect modification during EHEs; however, the magnitude of the effect modification estimate was relatively small.

Effect modification of the relationship between InvWS and ozone during EHEs was observed in 14 of the 27 cities; however, the extent of effect modification was lower compared to Tmax for the city-specific ranges of InvWS values observed during EHEs. All climate regions except East North Central, Southwest, and West showed some degree of effect modification. The effect modification across these cities ranged from a 1 to 3 ppb increase in ozone for a 0.5 s/m increase in InvWS (or a 2 m/s decrease in wind speed). Effect modification of the relationship between RH and ozone during EHEs was observed in 14 of the 27 cities, and all regions except Central and Northeast showed consistent modification. Figure C-9 describes the effect of RH on ozone during EHE and non-EHE days. Although effect modification was present in most climate regions, the extent of effect modification did not result in an appreciable increase in ozone.

Figure 4-3: Effect modification (footnote 20) of the meteorology-ozone relationship during EHE and non-EHE days for: (A) daily maximum temperature, and (B) daily mean inverse wind speed

20 EHE day effect for cities is only shown if the summary-level pooled analysis of the effect modification (interaction term) in the model was significant.

Table 4-3: Effect modification (slope factors21 on a logarithmic scale) of the relationship between meteorological variables and ozone during EHEs. Values are mean (lower limit, upper limit).

| Region | Daily maximum temperature: baseline effect | Daily maximum temperature: effect modification | Daily mean relative humidity: baseline effect | Daily mean relative humidity: effect modification | Inverse daily mean wind speed: baseline effect | Inverse daily mean wind speed: effect modification |
|---|---|---|---|---|---|---|
| Central | 0.23 (0.22, 0.25) | -0.03 (-0.06, 0.01) | -0.11 (-0.12, -0.11) | 0.00 (-0.02, 0.01) | 0.04 (0.03, 0.05) | 0.06 (0.04, 0.08) |
| East North Central | 0.28 (0.26, 0.30) | -0.03 (-0.06, 0.01) | -0.08 (-0.09, -0.07) | 0.03 (0.01, 0.05) | 0.02 (0.01, 0.03) | 0.00 (-0.02, 0.03) |
| North West Central | 0.14 (0.13, 0.16) | 0.08 (0.03, 0.14) | -0.09 (-0.10, -0.08) | 0.07 (0.04, 0.09) | 0.01 (0.00, 0.01) | 0.03 (0.01, 0.04) |
| Northeast | 0.28 (0.26, 0.30) | 0.03 (-0.01, 0.07) | -0.07 (-0.08, -0.05) | 0.02 (0.01, 0.04) | -0.01 (-0.03, 0.01) | 0.03 (0.01, 0.05) |
| South | 0.10 (0.07, 0.13) | -0.01 (-0.07, 0.05) | -0.14 (-0.17, -0.12) | 0.07 (0.04, 0.10) | 0.12 (0.10, 0.14) | 0.04 (0.02, 0.07) |
| Southeast | 0.16 (0.14, 0.19) | 0.03 (-0.02, 0.07) | -0.18 (-0.19, -0.18) | 0.07 (0.05, 0.09) | 0.04 (0.02, 0.05) | 0.06 (0.03, 0.08) |
| Southwest | 0.07 (0.05, 0.08) | 0.05 (-0.02, 0.11) | -0.03 (-0.04, -0.02) | 0.06 (0.04, 0.09) | 0.05 (0.03, 0.07) | 0.00 (-0.02, 0.02) |
| West | 0.10 (0.07, 0.12) | 0.04 (0.02, 0.06) | -0.01 (-0.02, 0.00) | 0.04 (0.02, 0.05) | 0.04 (0.02, 0.06) | 0.01 (-0.01, 0.03) |

21 The slope factors are presented on a log scale, and the slope factors for certain predictors were scaled for display purposes. Specifically, for the baseline effect and effect modification associated with daily mean inverse wind speed, the slope factor represents the change in ozone for a 0.1 (s/m) increase in inverse wind speed or a 10 m/s decrease in wind speed; for daily mean relative humidity, the slope factor represents the change in ozone for a 10% increase in RH; and for daily maximum temperature, the slope factor represents the change in ozone for a 10℉ increase in daily maximum temperature.
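As a reading aid for Table 4-3, the sketch below converts a slope factor on the logarithmic scale into an approximate percent change in ozone. It rests on two assumptions that the table does not state: that the response is the natural logarithm of ozone, and that the EHE-day slope is the sum of the baseline effect and the effect modification. The function names are illustrative only.

```python
import math

def percent_change(slope):
    """Percent change in ozone implied by a slope on the natural-log scale (assumed)."""
    return (math.exp(slope) - 1.0) * 100.0

def ehe_day_slope(baseline, modification):
    """EHE-day slope, assuming the interaction term adds to the baseline slope."""
    return baseline + modification

# West region, daily maximum temperature (mean values from Table 4-3)
baseline, modification = 0.10, 0.04
non_ehe_pct = percent_change(baseline)
ehe_pct = percent_change(ehe_day_slope(baseline, modification))
print(f"non-EHE days: {non_ehe_pct:.1f}% per 10 F; EHE days: {ehe_pct:.1f}% per 10 F")
```

Under these assumptions, a positive effect modification raises the implied percent change per 10℉ on EHE days relative to non-EHE days.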

4.4 Summary and Conclusions

Previous studies conducted in the U.S. examining the effect of meteorology on ozone have mostly aimed at measuring the influence of meteorological variables to assess ozone trends. The U.S. Environmental Protection Agency (EPA) routinely conducts meteorologically adjusted trend analyses and provides ozone trends that are adjusted for weather (http://www.epa.gov/airtrends/weather.html). Other studies that have explored relationships between ozone and meteorology have made weather-based adjustments to ozone levels to accurately characterize the impact of emission-reduction efforts and human activities on prevailing levels. The primary driver behind the majority of these studies has been to facilitate environmental policy-making within a regulatory context.

We employed a multivariate autoregressive model to control for the autocorrelation of residuals, and used a logarithmic response for ozone, to model the relationship between meteorology and ozone. The goodness-of-fit measures and the results of out-of-sample validation using 2010 data indicated that the model adequately captured the day-to-day variations in ozone levels for the majority of the cities. The model fit, based on MAE and total R-squared (Table C-2), was poor in certain cities, but the performance did not fluctuate between EHE and non-EHE days. Also, in some cities, the explanatory value accounted for by the predictors was very low; the autocorrelation parameters accounted for the majority of the explanatory value.
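The general form of such a model can be sketched as follows. This is an illustrative specification under assumed notation, not necessarily the exact model fitted in this chapter: $X_{j,t}$ denotes the $j$-th meteorological predictor on day $t$, $\mathrm{EHE}_t$ is an indicator for EHE days, $p$ is the autoregressive order, and any additional terms in the fitted model are omitted.

```latex
\ln\left(O_{3,t}\right) = \beta_0
  + \sum_{j} \beta_j X_{j,t}
  + \sum_{j} \gamma_j \left(X_{j,t} \times \mathrm{EHE}_t\right)
  + \nu_t,
\qquad
\nu_t = \sum_{k=1}^{p} \phi_k \nu_{t-k} + \varepsilon_t
```

In this notation, $\beta_j$ plays the role of the baseline effect of predictor $j$, $\gamma_j$ plays the role of the effect modification during EHEs (the interaction term), and the autoregressive error $\nu_t$ absorbs the serial correlation in the residuals.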

Our model-based analysis yielded two sets of results: (1) the baseline effect of meteorology, applicable to both EHE and non-EHE days, and (2) the EHE day effect of meteorology on ozone. The baseline effect we observed is consistent with results previously published in the literature. Higher temperatures are associated with higher ozone, and a monotonically increasing trend is observed in ozone levels for temperatures above ~70℉. Lower wind speeds result in higher ozone, as a stagnant air mass facilitates higher local production of ozone. Higher humidity levels, which correspond with greater cloud cover, are indicators of atmospheric instability, and such conditions interrupt the photochemical process, leading to the depletion of ozone (Camalier et al. 2007).

We have shown that the extent of effect modification varies across cities. The changes in ozone concentrations during EHEs could be due to different meteorological variables in different parts of the country. This heterogeneity could be explained by the definitions selected for this analysis, but there could be other factors, such as fluctuations in emissions of ozone precursors. On days with higher temperatures, certain emissions could increase, leading to higher ozone levels not explained by the variations in the meteorological variables. For example, biogenic emissions (isoprene and monoterpene) can increase due to high temperatures, as plants tend to release these volatile organic compounds (VOCs) as a defense mechanism to combat heat stress (Benjamin et al. 1996; Geron et al. 2006; Sharkey et al. 2008). The escalation of air conditioning use during extreme temperature days generates a higher demand on electricity generating units (EGUs), which in turn leads to higher emissions of oxides of nitrogen (NOx) (He et al. 2013). Further, evaporative VOC emissions are temperature dependent, and persistent high temperatures, such as those prevailing during heat waves, could result in higher VOC emissions during EHEs.

Our analysis has limitations. We could only examine the effect of meteorology on ozone in 27 cities due to the limited number of ozone monitors in close proximity to ASOS stations. In the future, this analysis could be reproduced using modeled weather and ozone data. We compensated for the limited number of study locations and were able to generalize the effect by conducting a summary-level pooled analysis. This technique is frequently used in health studies but, to our knowledge, is not common in environmental data analysis. In some climate regions, North West Central for example, the summary-level pooled analysis may result in unreliable estimates given the very few study locations. Lastly, we were unable to examine the effect of meteorology on fine particulate matter (PM2.5). Although the literature does not suggest a strong correspondence between PM2.5 and meteorology, it is worth examining the relationship on EHE and non-EHE days, as PM2.5 has stronger associations with mortality and morbidity outcomes.

Although epidemiologic studies focusing on EHEs control for ozone, very few studies have investigated effect modification of the relationship between ozone and meteorological variables on EHE days and how that impacts results. In general, the interactive effects of air pollution and extreme heat observed during EHEs have not been well characterized in studies conducted in the U.S. In this study, we were able to quantify the relationship between ozone and meteorology, and ascertain the extent of effect modification during EHEs. Further, the city-specific analysis using short-listed definitions from cluster analysis gave us insights into the sensitivity of results to EHE definitions. Also worth noting is the benefit of the summary-level pooled analysis in providing a generalized mean effect and associated uncertainty by climate region.

Climate change is predicted to increase the number of extreme heat events in the future. With the projected increase in occurrences of EHEs, demand for electric power generation will increase (Brown Jr et al. 2013). This may contribute to further degradation of air quality despite efforts to control EGU emissions. Also, the future climate is expected to be more stagnant due to a weaker global circulation and a decreasing frequency of mid-latitude cyclones (Jacob and Winner 2009). We have shown in our analysis that in certain cities, especially in the South and Southeast, the effect of inverse wind speed on ozone and effect modification during EHEs are more common.

Given these considerations, environmental degradation, measured in terms of poor air quality, may exacerbate adverse health impacts already posed by EHEs. Information gleaned from this analysis will drive our health effects modeling phase where we intend to explore the interactive effects of extreme heat and air pollutants on EHE days and their impact on morbidity and mortality.

4.5 References

1. Analitis A, Michelozzi P, D'Ippoliti D, De'Donato F, Menne B, Matthies F, et al. 2014. Effects of heat waves on mortality: Effect modification and confounding by air pollutants. Epidemiology 25:15-22.

2. Anderson GB, Bell ML. 2011. Heat waves in the United States: Mortality risk during heat waves and effect modification by heat wave characteristics in 43 U.S. communities. Environmental Health Perspectives 119:210-218.

3. Anderson GB, Dominici F, Wang Y, McCormack MC, Bell ML, Peng RD. 2013. Heat-related emergency hospitalizations for respiratory diseases in the Medicare population. Am J Resp Crit Care 187:1098-1103.

4. Arguez A, Durre I, Applequist S, Vose RS, Squires MF, Yin XG, et al. 2012. NOAA's 1981-2010 U.S. climate normals: An overview. B Am Meteorol Soc 93:1687-1697.

5. Baur D, Saisana M, Schulze N. 2004. Modelling the effects of meteorological variables on ozone concentration: A quantile regression approach. Atmospheric Environment 38:4689-4699.

6. Bell ML, McDermott A, Zeger SL, Samet JM, Dominici F. 2004. Ozone and short-term mortality in 95 US urban communities, 1987-2000. JAMA: The Journal of the American Medical Association 292:2372-2378.

7. Bell ML, Dominici F, Samet JM. 2005. A meta-analysis of time-series studies of ozone and mortality with comparison to the National Morbidity, Mortality, and Air Pollution Study. Epidemiology 16:436-445.

8. Benjamin MT, Sudol M, Bloch L, Winer AM. 1996. Low-emitting urban forests: A taxonomic methodology for assigning isoprene and monoterpene emission rates. Atmospheric Environment 30:1437-1452.

9. Bloomfield P, Royle JA, Steinberg LJ, Yang Q. 1996. Accounting for meteorological effects in measuring urban ozone levels and trends. Atmospheric Environment 30:3067-3077.

10. Bobb JF, Dominici F, Peng RD. 2011. A Bayesian model averaging approach for estimating the relative risk of mortality associated with heat waves in 105 U.S. cities. Biometrics 67:1605-1616.

11. Borenstein M, Higgins JPT. 2013. Meta-analysis and subgroups. Prevention Science 14:134-143.

12. Bouchama A. 2004. The 2003 European heat wave. Intens Care Med 30:1-3.

13. Braga AL, Zanobetti A, Schwartz J. 2002. The effect of weather on respiratory and cardiovascular deaths in 12 US cities. Environmental Health Perspectives 110:859.

14. Brown Jr EG, Rodriquez M, Chapman R. 2013. Preparing California for extreme heat.

15. Burnett RT, Brook JR, Yung WT, Dales RE, Krewski D. 1997. Association between ozone and hospitalization for respiratory diseases in 16 Canadian cities. Environmental Research 72:24-31.

16. Burrows AT. 1900. Hot waves: Conditions which produce them, and their effect on agriculture. Yearbook of the U.S. Department of Agriculture 22:325-336.

17. Camalier L, Cox W, Dolwick P. 2007. The effects of meteorology on ozone in urban areas and their use in assessing ozone trends. Atmospheric Environment 41:7127-7137.

18. Cox WM, Chu SH. 1996. Assessment of interannual ozone variation in urban areas from a climatological perspective. Atmospheric Environment 30:2615-2625.

19. D'Ippoliti D, Michelozzi P, Marino C, de'Donato F, Menne B, Katsouyanni K, et al. 2010. The impact of heat waves on mortality in 9 European cities: Results from the EuroHEAT project.

20. Durbin J, Watson G. 1971. Testing for serial correlation in least squares regression. III. Biometrika 58:1-19.

21. Easterling DR, Meehl GA, Parmesan C, Changnon SA, Karl TR, Mearns LO. 2000. Climate extremes: Observations, modeling, and impacts. Science 289:2068-2074.

22. Edens JF, Cavell TA, Hughes JN. 1999. The self‐systems of aggressive children: A cluster‐analytic investigation. Journal of Child Psychology and Psychiatry 40:441-453.

23. García-Herrera R, Díaz J, Trigo R, Luterbacher J, Fischer E. 2010. A review of the European summer heat wave of 2003. Critical Reviews in Environmental Science and Technology 40:267-306.

24. Geron C, Guenther A, Greenberg J, Karl T, Rasmussen R. 2006. Biogenic volatile organic compound emissions from desert vegetation of the Southwestern U.S. Atmospheric Environment 40:1645-1660.

25. Hansen A, Bi P, Nitschke M, Ryan P, Pisaniello D, Tucker G. 2008. The effect of heat waves on mental health in a temperate Australian city. Environmental Health Perspectives 116:1369.

26. He H, Hembeck L, Hosley KM, Canty TP, Salawitch RJ, Dickerson RR. 2013. High ozone concentrations on hot days: The role of electric power demand and NOx emissions. Geophysical Research Letters 40:5291-5294.

27. Higgins JP, Thompson SG, Deeks JJ, Altman DG. 2003. Measuring inconsistency in meta-analyses. BMJ: British Medical Journal 327:557.

28. Hu X, Waller LA, Al-Hamdan MZ, Crosson WL, Estes Jr MG, Estes SM, et al. 2013. Estimating ground-level PM2.5 concentrations in the Southeastern U.S. using geographically weighted regression. Environmental Research 121:1-10.

29. Huth R, Kysely J, Pokorna L. 2000. A GCM simulation of heat waves, dry spells, and their relationships to circulation. Climatic Change 46:29-60.

30. Jacob DJ, Winner DA. 2009. Effect of climate change on air quality. Atmospheric Environment 43:51-63.

31. Jerrett M, Burnett RT, Pope III CA, Ito K, Thurston G, Krewski D, et al. 2009. Long-term ozone exposure and mortality. New Engl J Med 360:1085-1095.

32. Jones TS, Liang AP, Kilbourne EM, Griffin MR, Patriarca PA, Wassilak SGF, et al. 1982. Morbidity and mortality associated with the July 1980 heat wave in St. Louis and Kansas City, Mo. JAMA: The Journal of the American Medical Association 247:3327-3331.

33. Kegler SR, Wilson WE, Marcus AH. 2001. PM1, intermodal (PM2.5-1) mass, and the soil component of PM2.5 in Phoenix, AZ, 1995-1996. Aerosol Science and Technology 35:914-920.

34. Knowlton K, Rotkin-Ellman M, King G, Margolis HG, Smith D, Solomon G, et al. 2009. The 2006 California heat wave: Impacts on hospitalizations and emergency department visits. Environmental Health Perspectives 117:61-67.

35. Kovats RS, Hajat S, Wilkinson P. 2004. Contrasting patterns of mortality and hospital admissions during hot weather and heat waves in Greater London, UK. Occupational and Environmental Medicine 61:893-898.

36. Lehman J, Swinton K, Bortnick S, Hamilton C, Baldridge E, Eder B, et al. 2004. Spatio-temporal characterization of tropospheric ozone across the Eastern United States. Atmospheric Environment 38:4357-4369.

37. Lu H-C, Chang T-S. 2005. Meteorologically adjusted trends of daily maximum ozone concentrations in Taipei, Taiwan. Atmospheric Environment 39:6491-6501.

38. Meehl GA, Tebaldi C. 2004. More intense, more frequent, and longer lasting heat waves in the 21st century. Science 305:994-997.

39. Mortimer J, Borenstein A, Nelson L. 2012. Associations of welding and manganese exposure with Parkinson's disease: A systematic review and meta-analysis. Neurology 78.

40. Pascal M, Laaidi K, Ledrans M, Baffert E, Caserio-Schonemann C, Le Tertre A, et al. 2006. France's heat health watch warning system. International Journal of Biometeorology 50:144-153.

41. SAS. 2011. SAS/STAT 9.3 user's guide: PROC AUTOREG. SAS Institute.

42. Semenza JC, Rubin CH, Falter KH, Selanikio JD, Flanders WD, Howe HL, et al. 1996. Heat-related deaths during the July 1995 heat wave in Chicago. The New England Journal of Medicine 335:84-90.

43. Semenza JC, McCullough JE, Flanders WD, McGeehin MA, Lumpkin JR. 1999. Excess hospital admissions during the July 1995 heat wave in Chicago. American Journal of Preventive Medicine 16:269-277.

44. Shah NR, Borenstein J, Dubois RW. 2005. Postmenopausal hormone therapy and breast cancer: A systematic review and meta-analysis. Menopause 12:668-678.

45. Sharkey TD, Wiberley AE, Donohue AR. 2008. Isoprene emission from plants: Why and how. Annals of Botany 101:5-18.

46. Solberg S, Hov Ø, Søvde A, Isaksen I, Coddeville P, De Backer H, et al. 2008. European surface ozone in the extreme summer 2003. Journal of Geophysical Research: Atmospheres (1984–2012) 113.

47. Strickland MJ, Darrow LA, Klein M, Flanders WD, Sarnat JA, Waller LA, et al. 2010. Short-term associations between ambient air pollutants and pediatric asthma emergency department visits. Am J Resp Crit Care 182:307.

48. Thompson M, Reynolds J, Cox LH, Guttorp P, Sampson PD. 2001. A review of statistical methods for the meteorological adjustment of tropospheric ozone. Atmospheric Environment 35:617-630.

49. Vautard R, Honore C, Beekmann M, Rouil L. 2005. Simulation of ozone during the August 2003 heat wave and emission control scenarios. Atmospheric Environment 39:2957-2967.

50. Wise EK, Comrie AC. 2005. Meteorologically adjusted urban air quality trends in the southwestern United States. Atmospheric Environment 39:2969-2980.

51. Zhang T, Ramakrishnan R, Livny M. BIRCH: An efficient data clustering method for very large databases. In: Proceedings of the ACM SIGMOD Record, 1996, Vol. 25. ACM, 103-114.

CHAPTER 5 Assessment of Modeled PM2.5: A Public Health Perspective22

5.1 Introduction

Many epidemiologic and clinical studies have found an association between both acute and long-term exposure to particles with an aerodynamic diameter of 2.5 microns or less (PM2.5) and adverse cardiovascular and respiratory health effects (Dominici et al. 2006; Peters et al. 2001; Pope et al. 2000). PM2.5 ambient air concentration data, which are used to characterize population-level exposure, are available from the United States (U.S.) Environmental Protection Agency’s (EPA) Air Quality System (AQS). However, these AQS-based PM2.5 air monitors only cover approximately one fifth of all U.S. counties, and many monitors do not sample for PM2.5 on a daily basis (Vaidyanathan et al. 2013).

Exposure estimates of PM2.5 derived from numerical deterministic simulation models, such as the Community Multiscale Air Quality (CMAQ) model, have been used in epidemiologic studies (Hamilton et al. 2009; Marmur et al. 2006). Further, statistical models, which combine PM2.5 measurements from monitors with CMAQ predictions or use measurements to calibrate model predictions, have been used to fill temporal and spatial gaps in ambient air monitoring data (Berrocal et al. 2011; Fuentes et al. 2005; McMillan et al. 2010). In general, these statistical models use monitoring data where they are available and incorporate results from the CMAQ model to generate PM2.5 predictions; the Bayesian space-time Downscaler (DS) model (Berrocal et al. 2010a, 2010b) is one such model developed by the EPA. Techniques for atmospheric remote sensing have advanced rapidly over the years, producing several sensors capable of monitoring aerosols, such as the Moderate Resolution Imaging Spectroradiometer (MODIS).

22 Manuscript in internal CDC clearance

These sensors are available on various platforms and measure Aerosol Optical Depth (AOD), which represents columnar loading of aerosols and can be used to estimate ground-level PM2.5 concentrations. Several studies have examined the feasibility of deriving ambient PM2.5 concentrations from AOD in the U.S. (Al-Hamdan et al. 2009; Beckerman et al. 2013; Liu et al. 2009; Paciorek et al. 2012).

PM2.5 predictions from the CMAQ model, AOD-based models, and the DS model are all currently used to characterize exposure for health studies. Further, PM2.5 predictions based on these models are available to users of the Centers for Disease Control and Prevention’s (CDC) Environmental Public Health Tracking Network (Tracking Network) (http://ephtracking.cdc.gov). In this study, we assess the accuracy and utility of PM2.5 predictions generated from these three model types with the following objectives: (1) evaluate the performance of the model-based predictions against station-based measurements; and (2) compare linked metrics of air quality and health (the change in mortality rate associated with lowering PM2.5 concentration levels) created from model- and monitor-based estimates of PM2.5.

5.2 Methods

5.2.1 Study domain and time period

The spatial domain of the study was the Southeastern U.S., as seen in Figure 5-1. The study area covers approximately 500,000 sq. km and accounts for approximately 24 million people. The geographic scale of analysis was 12 km x 12 km grid cells; the grid resolution and extent align with the CMAQ grid definition used in Clausen et al. (2009). We chose the year 2006 to conduct this analysis, since 2006 was the latest year for which we had both PM2.5 measurements and modeled predictions available from all data sources.

Figure 5-1: Spatial domain

5.2.2 Station-based PM2.5 measurements

We selected monitors from AQS that used Federal Reference Methods (FRM) to measure PM2.5 concentrations and obtained daily 24-hour average PM2.5 concentrations for all the monitors contained in the study area. We restricted our selection to monitors that sample year-round and excluded those that did not have at least 11 measurements in each calendar quarter (CDC, 2013). We also selected monitors from the Southeastern Aerosol Research and Characterization (SEARCH) network (SEARCH, 1999). SEARCH network monitors provided daily PM2.5 measurements using FRM. There are five SEARCH monitors in the study domain. Of these five monitors, three are in rural areas (Centreville, Alabama (CTR); Outlying Landing Field, Pensacola, Florida (OLF); and Yorkville, Georgia (YRK)), and two are in urban areas (Jefferson St, Atlanta, Georgia (JST) and North Birmingham, Alabama (BHM)). We used SEARCH data to evaluate model-based PM2.5 predictions. We computed county-level annual averages from daily monitor-based PM2.5 concentrations by first computing monitor-level PM2.5 averages per the standard EPA protocol (EPA, 2012), and then assigning them to the counties where the monitors were located. We calculated a mean of all monitor-level averages when more than one monitor was available in a given county.
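The monitor selection and county-level averaging described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the data layout and function names are hypothetical, and the monitor-level annual average is assumed here to be the mean of the four quarterly means, which may differ in detail from the EPA protocol cited in the text.

```python
from collections import defaultdict
from statistics import mean

MIN_OBS_PER_QUARTER = 11  # completeness rule stated in the text

def quarter(month):
    """Calendar quarter (1-4) for a month number (1-12)."""
    return (month - 1) // 3 + 1

def monitor_annual_average(daily):
    """daily: list of (month, concentration) pairs for one monitor and year.
    Returns the mean of the four quarterly means (assumed protocol), or None
    if any quarter has fewer than MIN_OBS_PER_QUARTER measurements."""
    by_quarter = defaultdict(list)
    for month, value in daily:
        by_quarter[quarter(month)].append(value)
    if any(len(by_quarter[q]) < MIN_OBS_PER_QUARTER for q in (1, 2, 3, 4)):
        return None
    return mean(mean(by_quarter[q]) for q in (1, 2, 3, 4))

def county_annual_averages(monitors):
    """monitors: dict monitor_id -> (county_id, daily records).
    Averages the monitor-level annual averages within each county."""
    per_county = defaultdict(list)
    for county, daily in monitors.values():
        avg = monitor_annual_average(daily)
        if avg is not None:
            per_county[county].append(avg)
    return {county: mean(values) for county, values in per_county.items()}

# Hypothetical data: two complete monitors in one county, one incomplete monitor elsewhere
complete_a = [(m, 10.0) for m in range(1, 13) for _ in range(4)]  # 12 obs per quarter
complete_b = [(m, 14.0) for m in range(1, 13) for _ in range(4)]
incomplete = [(1, 12.0)] * 5                                      # only 5 obs, quarter 1
monitors = {
    "mon1": ("county_A", complete_a),
    "mon2": ("county_A", complete_b),
    "mon3": ("county_B", incomplete),
}
county_averages = county_annual_averages(monitors)
print(county_averages)
```

In this sketch a county with no complete monitor simply does not appear in the result, which mirrors the exclusion rule described in the text.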

5.2.3 Model-based PM2.5 predictions

The CMAQ model, a multi-pollutant, multiscale chemical transport model, generates air quality predictions at user-defined spatio-temporal scales, taking into account land use, chemical transport, chemistry, weather, and emission processes (EPA 2006; Clausen et al. 2009). Daily CMAQ predictions of PM2.5 for 2006 were available on a 12 km grid from the Models-3/CMAQ modeling system (version 4.7 (CB05)). Modeled predictions of PM2.5 from the DS model were also available to CDC for 2006 (Berrocal et al. 2012). The DS combines the FRM-based AQS measurements (where available) and CMAQ predictions to predict PM2.5 through space and time (Heaton et al. 2012). DS uses a Bayesian approach, and the model structure and additional information are provided in Appendix D-1. DS predictions of PM2.5 were available at the centroid (geometric center) of Census tract (CT) locations. We then up-converted these CT-level predictions to the 12 km CMAQ grid by relating the CT centroids to the grid cell in which they fall, and computed average grid cell-level predictions. For grid cells that did not contain a CT centroid, we used the nearest one. Both CMAQ and DS predictions were generated as part of an interagency agreement between CDC and EPA to provide modeled air quality data for public health surveillance.

AOD-based PM2.5 predictions were generated using a geographically weighted regression (GWR) approach (Hu et al. 2013). The GWR model was developed to determine the relationship between concentrations of PM2.5 from AQS monitors, AOD values, meteorology, and land use information. AOD observations were retrieved from MODIS, aboard both the Terra and Aqua satellites. Meteorological data were obtained from the North American Land Data Assimilation System (NLDAS). A 2006 Landsat-derived land cover map of the study area was downloaded from the National Land Cover Database (NLCD) to provide land use information. Model predictions were generated for the pixel centroids of the 12 km CMAQ grid. A separate GWR model was established for each day. We provide the model structure and additional information in Appendix D-2.

County-level annual averages were created from CMAQ-, DS-, and AOD-based predictions of PM2.5. Our geo-imputation method was based on centroid containment and relates all 12 km grid cell geometric centroids to the county into which they fall. We established a relationship between each given county boundary polygon and all the grid cell geometric centroids it contains, and then transferred modeled predictions to that county. Since many counties contain more than one grid cell centroid, we created a mean predicted concentration value for each day from all the grid cells with centroids in a given county to create daily county-level estimates of PM2.5 (Vaidyanathan et al. 2013). We then computed annual averages using these daily county-level estimates of PM2.5.
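The centroid-containment aggregation described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the mapping from grid cells to counties is assumed to have been computed beforehand in a GIS, and the identifiers and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def county_daily_means(grid_to_county, daily_predictions):
    """grid_to_county: dict grid_cell_id -> county_id, from centroid containment
    (assumed precomputed in a GIS).
    daily_predictions: dict day -> dict grid_cell_id -> PM2.5 prediction.
    Returns dict county_id -> dict day -> mean prediction over contained cells."""
    result = defaultdict(dict)
    for day, cells in daily_predictions.items():
        by_county = defaultdict(list)
        for cell, value in cells.items():
            county = grid_to_county.get(cell)
            if county is not None:  # skip cells whose centroid is in no county
                by_county[county].append(value)
        for county, values in by_county.items():
            result[county][day] = mean(values)
    return dict(result)

def county_annual_means(daily_means):
    """Average the daily county-level estimates over all available days."""
    return {county: mean(days.values()) for county, days in daily_means.items()}

# Hypothetical example: two grid cells in county_A, one in county_B
grid_to_county = {"g1": "county_A", "g2": "county_A", "g3": "county_B"}
daily_predictions = {
    "2006-01-01": {"g1": 10.0, "g2": 12.0, "g3": 8.0},
    "2006-01-02": {"g1": 14.0, "g2": 16.0, "g3": 9.0},
}
daily = county_daily_means(grid_to_county, daily_predictions)
annual = county_annual_means(daily)
print(annual)
```

Averaging within each day before averaging over the year keeps days with missing cells (as can happen with AOD-based predictions) from being weighted by the number of cells available.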

5.2.4 Evaluation methods: daily grid cell level evaluation

We compared the grid-level PM2.5 predictions to measurements from AQS to assess the performance of each model in the study domain. Both DS- and AOD-based predictions use AQS-based measurements in the model fitting process, and model performance was therefore expected to be better in grid cells that have both model- and AQS-based PM2.5 concentrations. Nevertheless, we conducted this in-sample evaluation as a consistency check of the model fitting process, and to evaluate the performance of CMAQ predictions against AQS-based PM2.5 concentrations in the study domain. We also used PM2.5 measurements from monitors in the SEARCH network to independently evaluate modeled predictions, since SEARCH data were not used to create these modeled predictions. We assessed the consistency of the relationship between model predictions and measurements using the following performance metrics: (1) Pearson correlation coefficient (r); (2) Kendall Tau-B correlation coefficient (t); (3) Difference (D); (4) Root mean squared deviation (RMSD); and (5) Relative accuracy (RA) (Hu et al. 2013; Vaidyanathan et al. 2013). We provide formulae used to calculate these metrics in Appendix D3-4.
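The performance metrics listed above can be sketched in a few lines. The formulae used in this study are given in the appendix and are not reproduced here, so the definitions below are assumptions: D is taken as prediction minus measurement, and RA is taken as one minus the ratio of RMSD to the mean measurement, expressed as a percentage. The function names and sample values are illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def kendall_tau_b(x, y):
    """Kendall Tau-B, which adjusts the denominator for tied pairs."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: contributes to neither term
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt((concordant + discordant + ties_x) *
                      (concordant + discordant + ties_y))
    return (concordant - discordant) / denom

def mean_difference(pred, obs):
    """Mean of (prediction - measurement); negative values suggest underprediction."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)

def rmsd(pred, obs):
    """Root mean squared deviation between predictions and measurements."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def relative_accuracy(pred, obs):
    """Assumed definition: (1 - RMSD / mean measurement) * 100."""
    return (1.0 - rmsd(pred, obs) / (sum(obs) / len(obs))) * 100.0

# Hypothetical paired daily values (ug/m3) for one grid cell
obs = [10.0, 12.0, 14.0, 16.0]
pred = [8.0, 11.0, 13.0, 14.0]
print(pearson_r(pred, obs), kendall_tau_b(pred, obs),
      mean_difference(pred, obs), rmsd(pred, obs), relative_accuracy(pred, obs))
```

Reporting a rank-based coefficient alongside Pearson's r is useful because PM2.5 distributions are typically skewed, and a few high-concentration days can dominate a linear correlation.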

5.2.5 Comparison of linked metrics of air quality and health

In order to compare county-level model- and monitor-based linked metrics of air quality and health, we computed the change in mortality rate associated with a 25% reduction in PM2.5 levels for counties with AQS monitors. Before calculating the change in mortality rate, we compared county-level annual averages derived using AQS-based measurements and model-based predictions using Bland-Altman plots; these plots are primarily used for identifying the presence of fractional bias (Vaidyanathan et al. 2013). We provide the approach used to create the Bland-Altman plots in Appendix D-4. We used methods similar to EPA’s Benefits Mapping and Analysis Program (BenMAP), a Geographic Information System-based program that allows users to calculate health impacts associated with changes in pollutant levels (Fann et al. 2012). In addition to model- and monitor-based estimates, we used the following inputs: (1) the concentration-response (C-R) relationship between a change in PM2.5 levels and mortality, derived from the literature; the mean (95% confidence interval (CI)) effect estimate (β) for a unit change in PM2.5 concentration was 0.0057 (0.0036 – 0.0079) (Krewski et al.); (2) county-level population data (bridged-race estimates) provided by the National Center for Health Statistics and the U.S. Census Bureau; (3) mortality data from the National Center for Health Statistics, used to compute the mean three-year (2004 – 2006) baseline mortality rate (M0) for all causes of death, excluding non-U.S. residents and decedents under 25 years of age; and (4) the change in PM2.5 annual average (∆x), computed as a 25% reduction in the annual average for each county. After preparing all the necessary inputs, we computed the change in mortality rate (∆M) (Anenberg et al. 2010) as

$$\Delta M = M_0 \left(1 - e^{-\beta \Delta x}\right) \qquad (1)$$

where

∆M = Change in mortality rate;

M0 = Baseline mortality rate expressed as deaths per 100,000 person-years;

β = Effect estimate coefficient obtained from C-R function;

∆x = Change in PM2.5 annual average concentration.
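As a worked illustration of the calculation defined above, the sketch below assumes the log-linear health impact function commonly used in BenMAP-style analyses, in which the change in mortality rate equals M0 multiplied by (1 − exp(−β∆x)). The effect estimate and its confidence limits are taken from the text; the county annual average and baseline mortality rate are hypothetical values chosen only for illustration.

```python
import math

BETA_MEAN = 0.0057              # effect estimate per unit PM2.5, from the text
BETA_CI = (0.0036, 0.0079)      # 95% confidence interval, from the text

def change_in_mortality_rate(m0, delta_x, beta=BETA_MEAN):
    """Assumed log-linear form: m0 * (1 - exp(-beta * delta_x)).
    m0 is the baseline mortality rate in deaths per 100,000 person-years."""
    return m0 * (1.0 - math.exp(-beta * delta_x))

annual_pm25 = 14.0              # hypothetical county annual average (ug/m3)
m0 = 1200.0                     # hypothetical baseline mortality rate
delta_x = 0.25 * annual_pm25    # 25% reduction in the annual average

dm_mean = change_in_mortality_rate(m0, delta_x)
dm_low = change_in_mortality_rate(m0, delta_x, BETA_CI[0])
dm_high = change_in_mortality_rate(m0, delta_x, BETA_CI[1])
print(f"{dm_mean:.1f} ({dm_low:.1f} - {dm_high:.1f}) deaths per 100,000 person-years")
```

Evaluating the function at the lower and upper confidence limits of β gives a simple range that reflects uncertainty in the C-R relationship only, not in the PM2.5 estimates or the baseline rate.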

We carried out our data analyses using the Statistical Analysis System (SAS® Version 9.2) and Environmental Systems Research Institute’s GIS software (ESRI, ArcGIS® Version 9.3 and 10.1).

5.3 Results

5.3.1 Data completeness and descriptive statistics

The study area contained 3,702 12-km grid cells, 96 complete AQS monitors, and 5 SEARCH monitors. Data completeness for the various PM2.5 data sources is presented in Table 5-1. Most of the AQS monitors in the study domain sampled PM2.5 concentrations every third day, with a few monitors sampling every sixth day or on a daily basis. Sampling frequency did not change with calendar quarter, and the median daily completeness across monitors was 32%. Mean (range) of annual average PM2.5 concentrations among all monitors was 14.2 (11.0 – 18.5) µg/m3. AOD-based PM2.5 predictions were available for all grid cells in the study domain; however, daily completeness varied across grid cells, with a median daily completeness of 56%. Mean (range) of annual averages from AOD-based PM2.5 predictions among all grid cells was 13.8 (9.0 – 18.2) µg/m3. Both CMAQ- and DS-based predictions were available daily and for all grid cells in the study domain. Mean (range) of annual averages from CMAQ- and DS-based PM2.5 predictions among all grid cells were 9.6 (4.6 – 45.6) and 12.5 (9.3 – 17.0) µg/m3, respectively. Monitors in the SEARCH network were highly complete, with most monitors having an annual daily completeness of 90% or higher. Of all the PM2.5 annual averages from SEARCH monitors, BHM had the highest concentration (17.3 µg/m3) and OLF had the lowest (11.5 µg/m3). We present maps of annual averages from model- and monitor-based PM2.5 concentrations in Figure D-1.

Table 5-1: Completeness of data sources. Values for each period are median (range) daily completeness (%).

| Data source | Spatial unit | No. of spatial units in the domain | Jan - Mar | Apr - Jun | Jul - Sep | Oct - Dec | Annual |
|---|---|---|---|---|---|---|---|
| AQS | Point (latitude/longitude) | 96 | 32 (13-99) | 32 (15-100) | 34 (13-100) | 33 (16-100) | 32 (15-100) |
| Remote Sensing (AOD) | Grid cell | 3,702 | 53 (31-63) | 56 (35-67) | 54 (34-63) | 61 (45-71) | 56 (40-63) |
| CMAQ | Grid cell | 3,702 | 100 (100-100) | 100 (100-100) | 100 (100-100) | 100 (100-100) | 100 (100-100) |
| Downscaler | Census Tracts | 6,171 | 100 (100-100) | 100 (100-100) | 100 (100-100) | 100 (100-100) | 100 (100-100) |
| SEARCH | Point (latitude/longitude) | 5 | 92 (88-96) | 96 (86-99) | 96 (85-99) | 92 (83-97) | 93 (90-95) |

5.3.2 Daily grid level evaluation

We compared daily grid level model- and AQS-based PM2.5 concentrations; performance metrics r, t, D, RMSD, and RA are presented in Table 5-2. CMAQ-based PM2.5 predictions were weakly correlated with AQS-based PM2.5 measurements (r=0.58, t=0.45) with a mean RA of approximately 50%, and a mean RMSD equal to 6.5 g/m3.

CMAQ model performance varies with time of the year. In the warmer months (April through September) the mean D between CMAQ-based PM2.5 predictions and PM2.5 measurements was consistently negative, indicating under prediction, and the magnitude of difference was highest for these months. In fall and winter (October through March)

CMAQ-based predictions are biased slightly high versus the AQS-based PM2.5 measurements.


Table 5-2: Annual and seasonal comparison of AQS-based measurements and modeled estimates of PM2.5

| Time Scale | Time Period | Modeled vs. Monitor Comparison | Correlation Coefficient (Pearson (r), Kendall Tau-B (t)) | Mean Difference (25th, 75th %tile) (µg/m3) (D) | Mean Root Mean Squared Deviation (25th, 75th %tile) (µg/m3) (RMSD) | Mean Rel. Accuracy (25th, 75th %tile) (%) (RA) |
|---|---|---|---|---|---|---|
| Annual | JAN-DEC | AQS vs. AOD | (0.89, 0.75) | -0.6 (-2.8, 1.4) | 3.1 (2.3, 3.8) | 77.8 (73.9, 83.5) |
| Annual | JAN-DEC | AQS vs. CMAQ | (0.58, 0.45) | -2.1 (-6.7, 1.9) | 6.5 (6.0, 7.1) | 53.8 (51.4, 58.6) |
| Annual | JAN-DEC | AQS vs. DS | (0.97, 0.86) | -0.2 (-1.3, 0.9) | 1.8 (1.3, 2.1) | 87.6 (85.4, 90.1) |
| Seasonal | JAN-MAR | AQS vs. AOD | (0.81, 0.70) | -0.8 (-3.0, 1.1) | 2.8 (1.7, 3.3) | 76.0 (72.5, 84.3) |
| Seasonal | JAN-MAR | AQS vs. CMAQ | (0.62, 0.48) | 0.6 (-3.3, 4.4) | 5.4 (4.3, 6.1) | 53.4 (47.8, 62.5) |
| Seasonal | JAN-MAR | AQS vs. DS | (0.95, 0.84) | -0.1 (-1.1, 1.1) | 1.6 (1.1, 2.0) | 86.3 (83.5, 90.2) |
| Seasonal | APR-JUN | AQS vs. AOD | (0.86, 0.74) | -0.2 (-2.4, 1.7) | 2.8 (2.0, 3.2) | 81.4 (79.4, 87.7) |
| Seasonal | APR-JUN | AQS vs. CMAQ | (0.57, 0.42) | -4.7 (-9.0, -0.6) | 6.9 (6.0, 7.7) | 55.1 (50.0, 60.4) |
| Seasonal | APR-JUN | AQS vs. DS | (0.96, 0.85) | -0.3 (-1.4, 0.8) | 1.6 (1.2, 1.9) | 89.5 (87.4, 92.4) |
| Seasonal | JUL-SEP | AQS vs. AOD | (0.90, 0.76) | -0.4 (-2.6, 1.8) | 3.2 (2.3, 3.6) | 82.1 (78.9, 87.5) |
| Seasonal | JUL-SEP | AQS vs. CMAQ | (0.66, 0.49) | -5.1 (-9.6, -0.8) | 7.6 (6.9, 8.3) | 57.2 (53.5, 61.0) |
| Seasonal | JUL-SEP | AQS vs. DS | (0.97, 0.88) | -0.3 (-1.4, 0.8) | 1.7 (1.2, 1.9) | 90.7 (89.7, 93.1) |
| Seasonal | OCT-DEC | AQS vs. AOD | (0.86, 0.71) | -0.9 (-3.8, 1.4) | 3.2 (1.9, 4.1) | 72.9 (68.1, 82.6) |
| Seasonal | OCT-DEC | AQS vs. CMAQ | (0.74, 0.59) | 0.7 (-2.9, 4.3) | 5.3 (4.1, 5.9) | 54.3 (47.9, 65.7) |
| Seasonal | OCT-DEC | AQS vs. DS | (0.95, 0.84) | -0.1 (-1.3, 1.1) | 1.9 (1.3, 2.2) | 84.1 (82.1, 88.8) |


AOD- and DS-based PM2.5 predictions were strongly correlated with AQS-based measurements, as expected, since AQS measurements were used in the model fitting process. Unlike CMAQ-based predictions, AOD-based PM2.5 predictions showed less variation in the warmer months, and mean RA was highest during this period. The magnitude of mean D for all calendar quarters was between 0 and 1 µg/m3, and mean RMSD was between 2.8 and 3.1 µg/m3. DS-based PM2.5 predictions were highly correlated with AQS-based measurements (r=0.97, t=0.86) and had the highest mean RA (87.6%) among all model-based PM2.5 predictions for all calendar quarters. Performance of DS-based predictions did not vary appreciably with the time of year.

Table 5-3 provides the results of validation between measurements from SEARCH monitors and model-based predictions. For reference, we provide information on the grid cell neighborhood around SEARCH monitors in Figure D-2; we also indicate whether AQS monitors are present nearby and, if present, the distance to the closest AQS monitor.


Table 5-3: Annual and seasonal comparison of SEARCH-based measurements and modeled estimates of PM2.5. Correlation coefficients are given as (Pearson (r), Kendall Tau-B (t)); RMSD is the root mean squared deviation (µg/m3); RA is the relative accuracy (%).

| Time Scale | Time Period | SEARCH Site | SEARCH vs. AOD: (r, t) | AOD RMSD | AOD RA | SEARCH vs. CMAQ: (r, t) | CMAQ RMSD | CMAQ RA | SEARCH vs. DS: (r, t) | DS RMSD | DS RA |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Annual | JAN-DEC | BHM | (0.93, 0.76) | 3.6 | 79.4 | (0.62, 0.43) | 7.1 | 59.3 | (0.95, 0.81) | 2.8 | 83.6 |
| Annual | JAN-DEC | CTR | (0.51, 0.42) | 9.8 | 17.1 | (0.49, 0.40) | 6.5 | 45.7 | (0.86, 0.70) | 3.2 | 72.9 |
| Annual | JAN-DEC | JST | (0.90, 0.78) | 3.5 | 78.8 | (0.68, 0.52) | 5.6 | 65.7 | (0.95, 0.83) | 2.6 | 84.2 |
| Annual | JAN-DEC | OLF | (0.43, 0.41) | 8 | 30.6 | (0.41, 0.28) | 6.7 | 41.8 | (0.85, 0.66) | 3.1 | 72.8 |
| Annual | JAN-DEC | YRK | (0.74, 0.61) | 6.4 | 54.3 | (0.64, 0.45) | 6.1 | 56.3 | (0.93, 0.76) | 2.7 | 80.3 |
| Seasonal | JAN-MAR | BHM | (0.83, 0.66) | 3.6 | 71.6 | (0.52, 0.38) | 7 | 44.1 | (0.88, 0.73) | 2.7 | 78.6 |
| Seasonal | JAN-MAR | CTR | (0.18, 0.22) | 7.2 | 16.8 | (0.53, 0.41) | 6.1 | 28.9 | (0.79, 0.63) | 3.2 | 62.5 |
| Seasonal | JAN-MAR | JST | (0.89, 0.71) | 2.8 | 77.4 | (0.78, 0.62) | 4.9 | 61.1 | (0.94, 0.81) | 2 | 84.4 |
| Seasonal | JAN-MAR | OLF | (0.67, 0.59) | 6.4 | 32.3 | (0.69, 0.43) | 5.2 | 45.3 | (0.88, 0.66) | 3 | 68.9 |
| Seasonal | JAN-MAR | YRK | (0.68, 0.48) | 3.9 | 59.7 | (0.72, 0.50) | 3.8 | 60.5 | (0.93, 0.76) | 2.4 | 75.2 |
| Seasonal | APR-JUN | BHM | (0.91, 0.77) | 3.4 | 81.7 | (0.58, 0.44) | 7 | 62.6 | (0.93, 0.79) | 2.8 | 84.9 |
| Seasonal | APR-JUN | CTR | (0.39, 0.38) | 13.5 | 8.8 | (0.46, 0.34) | 8.3 | 44.1 | (0.79, 0.60) | 3.8 | 74.6 |
| Seasonal | APR-JUN | JST | (0.89, 0.78) | 3.5 | 79.6 | (0.74, 0.54) | 5.7 | 67.1 | (0.96, 0.84) | 2.1 | 87.6 |
| Seasonal | APR-JUN | OLF | (0.36, 0.34) | 9.4 | 32.6 | (0.53, 0.43) | 8.1 | 41.8 | (0.84, 0.66) | 3.8 | 72.6 |
| Seasonal | APR-JUN | YRK | (0.66, 0.58) | 7.6 | 52.2 | (0.66, 0.45) | 7.2 | 55.1 | (0.93, 0.76) | 2.4 | 84.7 |
| Seasonal | JUL-SEP | BHM | (0.95, 0.80) | 3 | 86.1 | (0.62, 0.45) | 7.6 | 64.3 | (0.97, 0.85) | 2.5 | 88.2 |
| Seasonal | JUL-SEP | CTR | (0.71, 0.55) | 7.7 | 47 | (0.70, 0.48) | 6.4 | 56.1 | (0.89, 0.72) | 3 | 79.8 |
| Seasonal | JUL-SEP | JST | (0.89, 0.74) | 3.2 | 84.7 | (0.62, 0.46) | 6.4 | 69.3 | (0.96, 0.83) | 2.2 | 89.6 |
| Seasonal | JUL-SEP | OLF | (-0.14, 0.11) | 9.8 | 21.9 | (0.53, 0.34) | 6.6 | 47.7 | (0.86, 0.67) | 2.9 | 77.1 |
| Seasonal | JUL-SEP | YRK | (0.76, 0.69) | 6.3 | 67.1 | (0.72, 0.53) | 7.5 | 60.9 | (0.92, 0.77) | 3.1 | 84.1 |
| Seasonal | OCT-DEC | BHM | (0.92, 0.78) | 4.2 | 74.2 | (0.77, 0.57) | 6.5 | 60.1 | (0.96, 0.82) | 3.3 | 79.8 |
| Seasonal | OCT-DEC | CTR | (0.39, 0.20) | 9.5 | 3.5 | (0.70, 0.52) | 4.5 | 54.3 | (0.87, 0.72) | 2.9 | 71 |
| Seasonal | OCT-DEC | JST | (0.91, 0.79) | 4.2 | 71.1 | (0.83, 0.69) | 5.4 | 63.1 | (0.94, 0.81) | 3.8 | 73.8 |
| Seasonal | OCT-DEC | OLF | (0.46, 0.39) | 6.2 | 37.7 | (0.60, 0.47) | 6.7 | 33.1 | (0.84, 0.65) | 2.8 | 72 |
| Seasonal | OCT-DEC | YRK | (0.58, 0.42) | 6.7 | 36.5 | (0.70, 0.51) | 5 | 53 | (0.85, 0.70) | 3 | 71.5 |


Performance of CMAQ-based predictions against SEARCH measurements was similar to what was observed against AQS. RA was higher for the two urban SEARCH monitors, BHM and JST, than for CTR, OLF, and YRK, which are rural sites. The difference in mean RA of CMAQ-based predictions between urban and rural sites in the SEARCH network was 14.6%. CMAQ-based PM2.5 predictions underpredicted SEARCH measurements in the warmer months and overpredicted them in the fall and winter months. Performance of AOD-based PM2.5 predictions varied across SEARCH monitors (Figure 5-2).


Figure 5-2: Comparison of model- and SEARCH-based PM2.5 concentrations


AOD-based predictions were strongly correlated with SEARCH-based measurements at BHM (r=0.93, t=0.76) and JST (r=0.90, t=0.78), the two urban sites in the network; however, the AOD-based predictions had relatively weak correlations at CTR (r=0.51, t=0.42), OLF (r=0.43, t=0.41), and YRK (r=0.74, t=0.61), which are sited in rural locations. RMSD values for CTR (9.8 µg/m3), OLF (8.0 µg/m3), and YRK (6.4 µg/m3) were higher than those for the urban SEARCH monitors. The difference in mean RA of AOD-based predictions between urban and rural sites in the SEARCH network was 45.1%.

DS-based predictions had the best relationship with SEARCH-based measurements among all model-based PM2.5 predictions, with high correlations, low RMSD, and high RA. DS-based predictions were strongly correlated with measurements at BHM (r=0.95, t=0.81), JST (r=0.95, t=0.83), and YRK (r=0.93, t=0.76), and the correlations were slightly weaker at CTR (r=0.86, t=0.70) and OLF (r=0.85, t=0.66). RA of DS-based predictions against all SEARCH monitors was greater than 72%, and overall there was less variability between measurements and predictions (Figure 5-2). Performance metrics for the DS model did not fluctuate appreciably across calendar quarters; however, slightly better performance was observed in the warmer months than in the fall and winter months. The difference in mean RA of DS-based predictions between urban and rural sites in the SEARCH network was 8.6%. In general, the performance of AOD- and DS-based predictions against SEARCH measurements depends on the number of AQS observations available to calibrate the model; this dependency is more pronounced for AOD-based predictions. We provide scatter plots comparing SEARCH-based measurements and model-based estimates under two scenarios, days when AQS data are available to calibrate the model and days when they are unavailable, in Figures D-3 and D-4, respectively.

5.3.3 Comparison of estimated annual county-level change in mortality rate

There were 71 counties in the study domain with at least one AQS monitor. The median (range) baseline mortality rate (M0) in these counties was 1,566 (677 – 2,123) deaths per 100,000 person-years. The population sizes of these counties varied from 5,915 to 477,701 people, with a median county-level population of 66,820 people. A Bland-Altman plot (Figure 5-3A) shows that annual averages computed from AOD- and DS-based PM2.5 estimates strongly agree with AQS-based annual averages. For most counties, the difference between annual averages from predictions and measurements was between -1.5 and +1.5 µg/m3, and fractional bias was negligible. CMAQ-based annual averages show weak associations with AQS-based annual averages and consistently underpredict them.
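The quantities plotted in a Bland-Altman comparison can be sketched as follows. This is a minimal illustration, not the code used for Figure 5-3: it assumes the difference is taken as prediction minus measurement and uses the conventional bias ± 1.96 standard deviation limits of agreement, which are not stated in the text. The function name and sample values are hypothetical.

```python
import statistics


def bland_altman(measured, predicted):
    """Return the per-county points and summary lines of a Bland-Altman plot.

    x-axis: mean of the two annual averages; y-axis: their difference
    (prediction - measurement, an assumed sign convention).
    """
    means = [(m + p) / 2.0 for m, p in zip(measured, predicted)]
    diffs = [p - m for m, p in zip(measured, predicted)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        # Conventional 95% limits of agreement (assumption).
        "lower_limit": bias - 1.96 * sd,
        "upper_limit": bias + 1.96 * sd,
    }


# Hypothetical county-level annual averages (ug/m3).
aqs_annual = [12.1, 13.4, 11.8, 14.0]
model_annual = [11.9, 13.9, 11.5, 13.6]
summary = bland_altman(aqs_annual, model_annual)
print(summary["bias"], summary["lower_limit"], summary["upper_limit"])
```

A model that agrees well with the monitors would show differences scattered tightly around zero, while a consistently negative bias would correspond to the underprediction noted for CMAQ.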

Except for air quality estimates, all other inputs used to calculate the change in mortality rate were held constant. The mean (range) ∆x for AQS-based PM2.5 annual averages was 3.5 (2.7 – 4.2) µg/m3. The mean (95% CI) of the AQS-based estimates of the change in mortality rate was 30 (19 – 41) deaths per 100,000 person-years. The mean (range) ∆x for AOD-, CMAQ-, and DS-based PM2.5 annual averages was 3.5 (2.8 – 4.0), 2.7 (1.4 – 3.8), and 3.3 (2.5 – 4.1) µg/m3, respectively. The mean (95% CI) of the AOD-, CMAQ-, and DS-based estimates of the change in mortality rate was 29 (18 – 40), 22 (14 – 31), and 30 (19 – 42) deaths per 100,000 person-years, respectively. CMAQ-based estimates were relatively less correlated (r=0.78) with, and consistently underpredicted, the AQS-based estimates (Figure 5-3B). AOD- and DS-based estimates were in very good agreement with those calculated using AQS-based annual averages, while those from CMAQ were slightly lower. The Pearson correlation coefficients for AOD- and DS-based estimates were 0.90 and 0.92, respectively.
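To see why a smaller ∆x translates into a smaller estimated change in mortality rate, consider the sketch below. It assumes a log-linear concentration-response function, a form commonly used in health impact assessments; the functional form and the coefficient value are assumptions for illustration and may differ from the function and inputs used in this chapter.

```python
import math


def change_in_mortality_rate(baseline_rate, delta_x, beta):
    """Change in mortality rate for a PM2.5 reduction of delta_x (ug/m3).

    Assumed log-linear form: baseline_rate * (1 - exp(-beta * delta_x)).
    baseline_rate is in deaths per 100,000 person-years.
    """
    return baseline_rate * (1.0 - math.exp(-beta * delta_x))


# Hypothetical coefficient per ug/m3; NOT a value taken from this study.
beta = 0.006
baseline = 1566.0  # median baseline rate reported in the text

for label, delta_x in [("AQS", 3.5), ("CMAQ", 2.7)]:
    rate_change = change_in_mortality_rate(baseline, delta_x, beta)
    print(f"{label}: {rate_change:.1f} deaths per 100,000 person-years")
```

Because every other input is held constant, differences between data sources in this calculation arise only from differences in ∆x, so a source that underpredicts annual averages yields a lower estimate.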


Figure 5-3: County-level annual comparison of model- and AQS-based metrics: (A) Bland-Altman plot, and (B) comparison of change in mortality rate

5.3 Discussion

In our study domain, approximately 15% of the counties had PM2.5 measurements via AQS monitors. Most of the AQS monitors sample every third day, leaving approximately 66% of the days in a year without data. These limitations could hinder our ability to accurately ascertain population-level ambient exposure and could introduce uncertainty when used to quantify health risks. Our analyses focused on the three modeling approaches most commonly used by researchers and public health practitioners in the context of public health.

Advances in remote sensing technologies, coupled with rigorous statistical methodologies, help generate AOD-based predictions of PM2.5. The GWR model used to generate AOD-based PM2.5 predictions produces locally adjusted parameter estimates for AOD and other meteorological variables. AOD data from satellite sensors, such as MODIS, are provided at a 10 km x 10 km spatial resolution and allow for the creation of PM2.5 predictions at geographic scales finer than the county, such as ZIP codes and census tracts. Temporal data completeness for AOD-based PM2.5 predictions is approximately 50%, which is greater than the completeness offered by AQS-based monitor data. Further, AOD data can be retrieved within a time lag of a few months, and processing AOD data to generate PM2.5 predictions is computationally less intensive than executing numerical deterministic simulation models.

There are a few limitations that should be taken into consideration before using AOD-based PM2.5 predictions. Daily PM2.5 predictions generated using AOD data for each grid cell are based on at most two data points (dictated by the number of passes the satellite makes over the earth). In certain seasons, data availability is limited by cloud cover and other atmospheric factors. Additionally, the calibration process depends on the availability of AQS monitor data and, to some extent, the accuracy of the modeled predictions depends on the number of AQS monitors available to calibrate the model. In an area with sufficient monitoring data, AOD-based predictions compare well with observed PM2.5 concentrations. On the other hand, in areas where there are no monitors, or when the number of daily observations needed to calibrate AOD is limited due to a lack of monitor-based measurements or missing AOD data, AOD-based PM2.5 predictions are less accurate. This is evident from comparisons performed against monitors in the SEARCH network. SEARCH monitors that are located in urban areas (BHM and JST) have many AQS monitors nearby, and hence SEARCH- and AQS-based PM2.5 measurements are highly correlated. We provide a comparison between SEARCH- and AQS-based measurements in Figure D-5. As a result, AOD-based predictions, which are calibrated using AQS-based PM2.5 concentrations, are correlated with SEARCH-based measurements. In contrast, rural SEARCH sites, such as CTR and OLF, do not have a dense enough AQS monitoring network to represent the PM2.5 spatial variability in the region and, as expected, AOD-based PM2.5 predictions are weakly associated with PM2.5 measurements at these locations. The performance of AOD-based predictions at YRK is in between what is observed at rural and urban SEARCH sites, likely because the YRK site has an AQS site nearby that operates once every third day. On days when PM2.5 measurements are available to calibrate the model, AOD-based predictions are in agreement with SEARCH-based measurements.

The CMAQ model offers PM2.5 predictions at continuous space and time scales. Given the wealth of emission, meteorology, land use, and other pertinent information supplied to the model, the CMAQ modeling framework does capture the dynamics of the air pollution processes to an extent (Jun and Stein 2004). However, CMAQ-based PM2.5 predictions show the weakest association with observed PM2.5 concentrations among the models evaluated in this chapter. This is not surprising, as AOD- and DS-based methods are directly linked to observed air quality via their calibration approach, while CMAQ is not. It is clear that CMAQ does not currently have the ability to fully capture day-to-day variability (Marmur et al. 2006). On the other hand, CMAQ is well suited for ascertaining background concentrations and computing long-term averages. CMAQ-based predictions can be used to augment monitor data and are currently being used as input to data fusion models. CMAQ can also provide information on speciated PM2.5, whereas AOD-based models and Bayesian space-time models, such as the Downscaler, are typically used to estimate total PM2.5.

Data fusion models, such as DS, use a Bayesian approach and generate robust predictions using AQS monitor data where available and CMAQ predictions in places without monitor data. The DS model has some additional advantages over earlier Bayesian space-time models; the current DS has the ability to borrow useful information from neighboring grid cells and provide a smoothed prediction, which tends to improve performance against observed concentrations. In our study domain, DS-based modeled results show the strongest association with observed PM2.5 concentrations. Evaluation of the predictions from the DS model using measurements from SEARCH sites suggests that DS-based predictions and SEARCH-based measurements are highly correlated. One of the few drawbacks of using the DS model is the long wait time associated with executing CMAQ, which is a needed input. However, CMAQ data may be available from prior studies, facilitating this approach.

Summary measures, such as annual averages created from either AOD- or DS-based methods, comport well with AQS-based annual averages; the bias and variability observed on an annual scale are minimal. Linked metrics of air quality and health, such as the change in mortality rate associated with lowering PM2.5 levels, computed using annual averages from AOD- and DS-based models match closely with those derived using AQS-based annual averages. CMAQ-based annual averages consistently underpredict AQS-based annual averages, and hence the changes in mortality rate computed from CMAQ-based annual averages are negatively biased when compared to those derived from AQS-based annual averages. In general, computing linked metrics of air quality and health at aggregate geographic or longer time scales could reduce uncertainty and circumvent some of the limitations related to missing data and variability observed in places without monitors.

Even considering the limitations of the models, model-based predictions are a viable option for public health. Misrepresentation of prevailing air quality levels can be minimized if end users identify a suitable use for a model after considering the trade-offs, such as the enhanced spatial and temporal coverage offered by a model against its associated uncertainty. For example, a potential use of AOD-based models could be generating annual averages at finer geographic scales in places where adequate monitor-based measurements are available for calibration. Although missing data could affect the reliability of these estimates and diminish the representativeness of the annual averages, AOD-based estimates may be utilized when the quarterly completeness is higher than 33% (most PM2.5 monitors sample once every third day) and when data are present for all calendar quarters. In areas without monitors, the DS model performs better than both the AOD-based and CMAQ models, and the performance of CMAQ is slightly better than that of the AOD-based method. Given these findings, relying on statistical models that incorporate both monitoring data and CMAQ in a Bayesian approach, such as DS, is prudent when monitoring data are locally unavailable and/or when calculating annual metrics for public health.
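The completeness rule suggested above for AOD-based estimates can be expressed as a simple check. The sketch below is one possible reading, in which the 33% threshold is applied to each calendar quarter separately (which also guarantees data in all four quarters); the function names are hypothetical.

```python
from datetime import date, timedelta


def quarterly_completeness(dates_with_data, year):
    """Percent of days with a prediction in each calendar quarter of `year`."""
    observed = {d for d in dates_with_data if d.year == year}
    result = {}
    for quarter in range(1, 5):
        start = date(year, 3 * quarter - 2, 1)
        if quarter == 4:
            end = date(year + 1, 1, 1)
        else:
            end = date(year, 3 * quarter + 1, 1)
        n_days = (end - start).days
        n_observed = sum(1 for d in observed if start <= d < end)
        result[quarter] = 100.0 * n_observed / n_days
    return result


def usable_for_annual_average(dates_with_data, year, threshold=33.0):
    """True when every quarter's completeness exceeds the threshold (%)."""
    completeness = quarterly_completeness(dates_with_data, year)
    return all(pct > threshold for pct in completeness.values())


# Hypothetical grid cell with a prediction on every other day of 2006.
every_other_day = [date(2006, 1, 1) + timedelta(days=i) for i in range(0, 365, 2)]
print(quarterly_completeness(every_other_day, 2006))
print(usable_for_annual_average(every_other_day, 2006))
```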

5.4 Conclusion

One of the primary goals of the CDC’s Tracking Network is to advance the state of science in surveillance by creating metrics of exposure at a finer geographic scale, and identifying vulnerable places and populations. Understanding the benefits and limitations of the modeled data sources of PM2.5 is a necessary first step to utilizing these alternative data sources in facilitating linkages with health and population data. We have conducted an evaluation of different approaches to estimate PM2.5 concentrations using an independent set of monitors, identified deficiencies, and suggested an appropriate use of each modeled data source. Although this study is conducted in the Southeast, our assessment sheds light on model performance that goes beyond the study domain and could help researchers and public health practitioners choose the appropriate modeled data source(s) of PM2.5 for health studies and public health surveillance.


5.5 References

1. Anenberg SC, Horowitz LW, Tong DQ, West JJ. 2010. An estimate of the global burden of anthropogenic ozone and fine particulate matter on premature human mortality using atmospheric modeling. Environ Health Perspect. 118(9), 1189–1195.

2. Al-Hamdan MZ, Crosson WL, Limaye AS, Rickman DL, Quattrochi DA, Estes MG, et al. 2009. Methods for characterizing fine particulate matter using ground observations and remotely sensed data: potential use for environmental public health surveillance. J. Air Waste Manage. Assoc. 59, 865–881.

3. Beckerman BS, Jerrett M, Serre M, Martin RV, Lee SJ, van Donkelaar P, et al. 2013. A hybrid approach to estimating national scale spatiotemporal variability of PM2.5 in the contiguous United States. Environ Sci Technol. 47(13), 7233-7241. doi: 10.1021/es400039u

4. Berrocal VJ, Gelfand AE, Holland DM. 2010a. A spatio-temporal downscaler for output from numerical models. J Agric Biol Environ Stat. 15(2), 176-197. doi: 10.1007/s13253-009-0004-z

5. Berrocal VJ, Gelfand AE, Holland DM. 2010b. A bivariate space-time downscaler under space and time misalignment. Ann. Appl. Stat. 4, 1942-1975.

6. Berrocal VJ, Gelfand AE, Holland DM, Burke J, Miranda ML. 2011. On the use of a PM2.5 exposure simulator to explain birthweight. Environmetrics. 22, 553–571.


7. Berrocal VJ, Gelfand AE, Holland DM. 2012. Space-time data fusion under error in computer model output: an application to modeling air quality. Biometrics. 68(3), 837-848. doi: 10.1111/j.1541-0420.2011.01725.x

8. Clausen R, Hall ES. 2009. Final report: overview of EPA’s hierarchical Bayesian model for predicting air quality patterns in the United States over space and time, for use with public health tracking data. Work Assignment 54, Task 1.

9. Dominici F, Peng RD, Bell ML, Pham L, McDermott A, Zeger SL, et al. 2006. Fine particulate air pollution and hospital admission for cardiovascular and respiratory diseases. JAMA. 295, 1125–1134.

10. Fann N, Lamson AD, Anenberg SC, Wesson K, Risley D, Hubbell BJ. 2012. Estimating the national public health burden associated with exposure to ambient PM2.5 and ozone. Risk Analysis. 32, 81–95.

11. Fuentes M, Raftery AE. 2005. Model evaluation and spatial interpolation by Bayesian combination of observations with outputs from numerical models. Biometrics. 61, 36-45.

12. Hamilton WJ, Byun DW, Chan W, Ching JK, Han Y, Lopez RA, et al. 2009. A pilot study using EPA’s CMAQ model and hospital admission data to identify multipollutant “hot spots” of concern in Harris County, Texas. NUATRC, 15.

13. Heaton M, Holland DM, Leininger T. 2012. User's manual for downscaler fusion software. U.S. Environmental Protection Agency.


14. Hu X, Waller L, Al-Hamdan MZ, Crosson W, Estes MG, Estes SM, Sarnat J, Liu Y. 2013. Estimating ground-level PM2.5 concentrations in the Southeastern U.S. using geographically weighted regression. Environmental Research. 121, 1-10.

15. Jun M, Stein ML. 2004. Statistical comparison of observed and CMAQ modeled daily sulfate levels. Atmospheric Environment. 38(27), 4427-4436. doi: 10.1016/j.atmosenv.2004.05.019

16. Krewski D, Jerrett M, Burnett RT, Hughes E, Shi Y, et al. 2009. Extended follow-up and spatial analysis of the American Cancer Society study linking particulate air pollution and mortality. Boston: Health Effects Institute.

17. Liu Y, Paciorek CJ, Koutrakis P. 2009. Estimating regional spatial and temporal variability of PM2.5 concentrations using satellite data, meteorology, and land use information. Environ. Health Perspect. 117, 886–892.

18. Marmur A, Park SK, Mulholland JA, Tolbert PE, Russell AG. 2006. Source apportionment of PM2.5 in the southeastern United States using receptor and emissions-based models: Conceptual differences and implications for time-series health studies. Atmospheric Environment. 40(14), 2533-2551. doi: 10.1016/j.atmosenv.2005.12.019

19. McMillan NJ, Holland DM, Morara M, Feng J. 2009. Combining numerical model output and particulate data using Bayesian space-time modeling. Environmetrics. 10:1002


20. Paciorek CJ, Liu Y. 2012. Assessment of the relationship between satellite-based estimates and measurements of PM2.5 in the Eastern United States. Health Effects Institute, Synopsis of Research Report 167.

21. Peters A, Dockery DW, Muller JE, Mittleman MA. 2001. Increased air pollution and triggering of myocardial infarction. Circulation. 103, 2810–2815.

22. Pope CA. 2000. Epidemiology of fine particulate air pollution and human health: biologic mechanisms and who’s at risk? Env. Health Persp. 104, 713–723.

23. SEARCH. 1999. http://atmospheric-research.com/studies/SEARCH/ (Accessed August 2013).

24. U.S. Centers for Disease Control and Prevention (CDC). 2013. Environmental Public Health Tracking Network (EPHT). http://ephtracking.cdc.gov/showIndicatorPages.action (Accessed August 2013).

25. U.S. Environmental Protection Agency (EPA). 2006. CMAQ v4.6 Operational guidance document. http://www.cmaq-model.org/op_guidance_4.6/manual.pdf (Accessed August 2013).

26. U.S. Environmental Protection Agency (EPA). 2012. Federal Advisory Committee Act – Ozone, PM, Regional Haze Implementation. http://www.epa.gov/ttn/faca (Accessed January 2012).

27. Vaidyanathan A, Dimmick WF, Kegler SR, Qualters JR. 2013. Statistical air quality predictions for public health surveillance: evaluation and generation of county-level metrics of PM2.5 for the environmental public health tracking network. International Journal of Health Geographics. 12:12. doi: 10.1186/1476-072X-12-12


CHAPTER 6 Monetizing Health Burden Associated with Extreme Heat Events: Exploring the Role of Air Pollution and the Sensitivity Associated with Heat Wave Definitions in the Excess Death Estimation Process

6.1 Introduction

Severe weather, especially extreme temperatures, adversely impacts human health (Anderson and Bell 2011; Basu 2002; Basu et al. 2005; Knowlton et al. 2009). The National Oceanic and Atmospheric Administration’s (NOAA) National Climatic Data Center (NCDC) assesses the burden of severe weather and climate events in the United States (U.S.) from a historical perspective, and has been compiling a database that provides information on extreme weather events and natural disasters (http://www.ncdc.noaa.gov/billions/). Specifically, the database describes the nature of damage and costs associated with severe weather and natural disasters (such as flooding, freeze, severe storms, tropical cyclones, and winter storms) in the U.S. since 1980 (Smith and Katz 2013). While this database is comprehensive, it provides limited information on the economic impact of mortality associated with severe weather events and natural disasters. In addition, some extreme temperature events are not captured in the NOAA database, such as the 2006 North American heat wave, which lasted for more than a month. In California alone, this extreme heat event (EHE) resulted in 140 direct deaths, as well as many more excess deaths over a 17-day period in July (Margolis et al. 2008). Further, this database does not quantify the excess deaths that are associated with EHEs.


A European study estimated 22,080 excess deaths in England, Wales, France, Italy, and Portugal during and immediately after the heat wave of the summer of 2003 (Kovats and Hajat 2008). Similarly, a 1995 Chicago heat wave, which lasted for only five days, resulted in 750 deaths (Semenza et al. 1996). Several studies have quantified the health burden attributable to air pollution. The U.S. Environmental Protection Agency (EPA) estimates that 130,000 PM2.5-related deaths and 4,700 ozone-related deaths resulted from 2005 air quality levels (Fann et al. 2012). While it is believed that most of the deaths and illnesses during EHEs are associated with extreme temperatures, a recent study conducted in Europe concluded that heat wave-related mortality was 54% higher on high ozone days compared with low ozone days among people aged 75-84 (Analitis et al. 2014). Hence, it is worth investigating the role of air pollutants in causing adverse health effects during EHEs.

Additionally, when measured in economic terms using standard valuation approaches, health impacts are usually estimated to be the largest adverse consequence of EHEs, dominating other losses such as damage to crops and ecosystems (Yang et al. 2005). The most severe of the adverse health outcomes associated with EHEs is death, where losses to society and the economy extend from the point of premature death forward until that person would have died of other causes had they not succumbed to the effects of extreme heat. To truly understand the full impact of EHE-related fatalities, we should not only enumerate the monetary losses from infrastructure damage but also account for the economic loss these deaths impose on the decedent’s household, community, and society in general. In this effort, we have the following objectives: (1) model the region-specific exposure-response (E-R) relationship between EHEs and mortality, adjusting for air pollution levels observed during EHEs; and (2) estimate excess deaths associated with EHEs, and monetize those excess deaths using standard economic metrics.

6.2 Methods

6.2.1 Meteorology data

We used station-based meteorology data for years 2001-2009, and included in this evaluation any county in the conterminous U.S. (lower 48 states) that had an automated surface observing system (ASOS) unit. We also checked the completeness of the hourly and daily meteorology data used in this analysis. For each station, we set a daily completeness threshold of 75% for hourly observations in a given day (at least 18 of 24 hourly measurements available) for computing daily summaries of the heat metric. For each county, we calculated an average of all available daily station-based summaries to create county-level estimates of daily weather variables. We then applied a 95% completeness threshold for the daily county-level estimates of the heat metric across the summer months (May 1 through September 30). Finally, we only included counties for which sufficiently complete data were available for all 11 years (1999-2009) of the analysis period.
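The completeness screening described above can be sketched in three steps: a station-day summary that requires at least 18 of 24 hourly values, a county average across stations, and a 95% summer completeness check. This is a minimal illustration only; the function names are hypothetical, and the daily maximum is used as the daily summary purely as an assumption, since the summary statistic depends on the heat metric.

```python
def daily_station_summary(hourly_values, min_hours=18):
    """Daily summary for one station, or None if fewer than 18 of 24 hours exist.

    The daily maximum is an assumed summary statistic for illustration.
    """
    valid = [v for v in hourly_values if v is not None]
    if len(valid) < min_hours:
        return None
    return max(valid)


def county_daily_estimate(station_summaries):
    """Average of all available station summaries in a county for one day."""
    valid = [s for s in station_summaries if s is not None]
    return sum(valid) / len(valid) if valid else None


def meets_summer_threshold(county_daily_values, threshold=0.95):
    """True if at least 95% of summer days (May 1 - Sep 30) have an estimate."""
    n_days = len(county_daily_values)
    n_available = sum(1 for v in county_daily_values if v is not None)
    return n_days > 0 and n_available / n_days >= threshold


# Hypothetical station-day with 20 valid hourly readings and 4 missing.
hourly = [30.0 + 0.1 * h for h in range(20)] + [None] * 4
print(daily_station_summary(hourly))
print(county_daily_estimate([31.9, None, 33.1]))
```

With 153 days between May 1 and September 30, the 95% threshold corresponds to at least 146 days with a county-level estimate in a given summer.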

6.2.2 Air pollution data

We used air pollution data for the years 2001-2009 from the Downscaler (DS) model, a space-time hierarchical Bayesian model (Berrocal et al. 2010, 2012). DS-based estimates of daily 8-hr maximum ozone concentrations (parts per billion or ppb) and daily 24-hour average PM2.5 concentrations (micrograms per cubic meter or µg/m3) are available by census tracts. Daily county-level modeled estimates are obtained using a population-weighted approach, where tract populations are used to weight daily tract-level ozone and PM2.5 predictions. The population-weighted approach is described below:

$$C_k = \frac{\sum_{j=1}^{n_k} C_{jk} \, P_{jk}}{\sum_{j=1}^{n_k} P_{jk}} \qquad (1)$$

Where,

$C_k$ = daily DS estimate at the county level for county k;

$C_{jk}$ = daily DS estimate at the census tract level for a tract j located within county k;

$P_{jk}$ = total population for a census tract j located within county k;

$n_k$ = number of census tracts in county k.
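As a worked illustration of the population-weighted approach in Equation 1, the short Python sketch below computes a county-level value from tract-level values. The function name and the sample numbers are hypothetical.

```python
def population_weighted_mean(tract_estimates, tract_populations):
    """County-level estimate as the population-weighted mean of tract estimates."""
    total_population = sum(tract_populations)
    if total_population == 0:
        raise ValueError("County has no population to weight by")
    weighted_sum = sum(c * p for c, p in zip(tract_estimates, tract_populations))
    return weighted_sum / total_population


# Hypothetical county with two tracts: daily PM2.5 of 10 and 20 ug/m3,
# and populations of 1,000 and 3,000.
county_value = population_weighted_mean([10.0, 20.0], [1000, 3000])
print(county_value)  # (10*1000 + 20*3000) / 4000 = 17.5
```

Weighting by population pulls the county value toward the tracts where most residents live, which is why the result is closer to 20 than to the unweighted mean of 15.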

In health studies, exploring the relationship between individual air pollutants, such as ozone and PM2.5, and health outcomes is complicated by multicollinearity (Marcus and Kegler 2001). To remedy this issue, we created a composite air pollution (AP) score by combining ozone and PM2.5 using factor analysis. Factor analysis or principal components analysis has been used in health studies involving air pollutants and health outcomes (Jerrett et al. 2005; Nikolov et al. 2011). We examined the eigenvalues associated with the factors and took note of the proportion of the variance accounted for by each factor.
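The idea of collapsing two correlated pollutants into one score can be illustrated with principal components analysis on standardized values, used here as a simple stand-in for the factor analysis described above (the two methods are related but not identical). For two standardized variables with correlation r, the correlation matrix has eigenvalues 1 + |r| and 1 - |r|, so the first component can be written in closed form. The function names and sample values are hypothetical.

```python
import math
import statistics


def standardize(values):
    """Convert values to z-scores using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def composite_ap_score(ozone, pm25):
    """First principal component of standardized ozone and PM2.5.

    Returns (scores, eigenvalues, proportion of variance explained).
    """
    z_o3, z_pm = standardize(ozone), standardize(pm25)
    n = len(ozone)
    r = sum(a * b for a, b in zip(z_o3, z_pm)) / (n - 1)  # sample correlation
    sign = 1.0 if r >= 0 else -1.0
    eigenvalues = (1.0 + abs(r), 1.0 - abs(r))
    # Leading eigenvector of [[1, r], [r, 1]] is (1, sign) / sqrt(2).
    scores = [(a + sign * b) / math.sqrt(2.0) for a, b in zip(z_o3, z_pm)]
    explained = eigenvalues[0] / 2.0
    return scores, eigenvalues, explained


# Hypothetical daily ozone (ppb) and PM2.5 (ug/m3) values.
ozone = [40.0, 50.0, 60.0, 70.0]
pm25 = [8.0, 12.0, 10.0, 14.0]
scores, eigenvalues, explained = composite_ap_score(ozone, pm25)
print(eigenvalues, explained)
```

The more strongly the two pollutants are correlated, the larger the share of total variance captured by the single composite score.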

6.2.3 Mortality data

We obtained mortality data from the National Center for Health Statistics (NCHS) National Vital Statistics System and extracted death records for years 1999-2009 based on International Classification of Diseases, 10th revision (ICD-10) external cause codes (Minino et al. 2011). Specifically, we selected death records from all causes except those that had injury conditions listed as the underlying cause of death; the underlying cause of death is defined as the disease or injury that initiated the chain of events leading to death (Hanzlick et al. 2006). Additionally, we extracted individual-level covariates (such as age, gender, race, and ethnicity) from death records, and attached county-level covariates based on county of residence. We summarized the extracted death records for the summer months to get counts of deaths by county and day. We then assigned each county to one of the nine U.S. climate regions, which are aggregations of states based on homogeneous long-term climatology (Figure E-1); a description of these regions is available from the National Climatic Data Center (http://www.ncdc.noaa.gov/monitoring-references/maps/us-climate-regions.php). Additionally, due to small death counts in the West North Central and Northwest regions, we combined these two regions into “North West Central.” We excluded counties that did not have meteorology data (or that did not meet the data completeness threshold) and made adjustments to account for county boundary changes that occurred between 1999 and 2009.

6.2.4 Population and other ancillary data

For incidence rate denominators we used county-level bridged-race population estimates developed by NCHS and the U.S. Census Bureau. We restricted our analysis to counties with a resident population of greater than 100,000. For use as model covariates we obtained a number of county-level health and behavioral measures23 from several different sources. Percentages of residents of all ages living in poverty and percentages of residents aged 0-64 years without health insurance were obtained from the U.S. Census

23 County-level health and behavioral covariates were categorized, within each region, by tertiles of the distribution of measure values: (1) the lowest tertile was called “Low”, (2) the middle tertile was called “Medium”, and (3) the highest tertile was called “High”.

Bureau. Prevalence estimates of current adult smokers were obtained from CDC’s Behavioral Risk Factor Surveillance System. We obtained data on diabetes prevalence, adults who reported no leisure-time physical activity, and obesity prevalence (body mass index > 30) from the National Center for Chronic Disease Prevention and Health Promotion, Division of Diabetes Translation at CDC. We obtained county-level air conditioning (AC) prevalence data from a private vendor, Efficiency 2.0.
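The tertile categorization applied to the county-level covariates can be illustrated with a minimal Python sketch. The function name, the sample values, and the handling of ties at the cut points are assumptions made for illustration only; the analysis itself was carried out in SAS, and the exact quantile method used there is not stated.

```python
from statistics import quantiles

def tertile_labels(values):
    """Label each value Low/Medium/High from the tertiles of `values`.

    `values` holds one county-level measure for the counties of a single
    climate region. Tie handling at the cut points is an assumption.
    """
    lower_cut, upper_cut = quantiles(values, n=3)  # two tertile cut points
    labels = []
    for value in values:
        if value <= lower_cut:
            labels.append("Low")
        elif value <= upper_cut:
            labels.append("Medium")
        else:
            labels.append("High")
    return labels

# Hypothetical smoking prevalence (%) for six counties in one region.
smoking_prevalence = [10, 12, 15, 18, 20, 25]
labels = tertile_labels(smoking_prevalence)
```

Applying the function separately to each region's counties mirrors the region-specific tertiles described in the footnote.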

6.2.5 EHE definitions

We considered all the shortlisted definitions (Table E-2) as well as the exposure offsets shown in Figure E-2. However, we restricted our selection to the top 10²⁴ (Table E-3) EHE definition and exposure offset combinations for each climate region. We operationalized each EHE definition and exposure offset combination as a binary (Yes (1) / No (0)) variable and thereby separately classified each day in each county during the summer months as either an “EHE day” or a “non-EHE day.”²⁵

6.2.6 Estimation of exposure-response (E-R) relationship

We used a negative binomial rate regression modeling approach to estimate the E-R relationships. We selected several candidate predictors including, but not limited to: the binary variable for each EHE definition and exposure offset indicator combination, combined AP score, air conditioning (AC) prevalence, adult smoking prevalence, and month. We used a summary-level model to explore the relationship between EHEs and mortality, and generated the rate ratios (RR) at various levels of AP scores. Additionally,

24 The top 15 EHE definitions are based on the evaluation from previous work, which examines the predictive power of each EHE definition against heat mortality data.

25 We added a buffer of 3 days to the start and end of the summer months to account for any potential EHE that either started prior to May 1 and ended on or shortly after May 1, or started on or shortly before September 30 and ended in the early part of October. The buffer days were not included in the analysis.

in certain regions and for most EHE definitions, there is little overlap between levels of standardized AP scores observed on EHE and non-EHE days. In order to achieve a model fit that is reliable for both EHE and non-EHE days, we filtered out extreme values of standardized AP scores and used values that were common to both sets of days. We fitted a region-specific model for each of the top 10 EHE definitions and exposure offset variables. The following region-specific model was used to derive E-R relationships:

$$\ln\left(\frac{E[D]}{P}\right) = \alpha + \beta_1\,EHE + \beta_2\,AP + \beta_3\,(EHE \times AP) + \sum_{k} \gamma_k C_k + \delta\,T \qquad (2)$$

with model terms defined as follows:

$D$: count of deaths (stratification level: county, year, month and EHE status)²⁶;

$E[D]$: expected count of deaths;

$P$: size of the population over which $D$ is measured;

$\alpha$: intercept;

$EHE$: binary indicator variable for operationalized EHE definition and exposure offset combination;

$AP$: factor score;

$C_k$: represents the kth covariate used as a control;

$T$: month;

$\beta_1$: parameter estimate for the binary variable referring to the EHE definition and lag type combination;

$\beta_2$: parameter estimate for factor score;

$\beta_3$: parameter estimate for the term denoting the interaction between EHE and factor score;

$\gamma_k$: parameter estimate for kth covariate;

$\delta$: parameter estimate for month (Refer to Appendix Table E-1 for the list of candidate covariates).

26 To facilitate reliable modeling diagnostics as well as convergence, data were collapsed according to a four-way stratification: county × year × month × EHE status (for the EHE definition/variant and exposure offset combination under consideration).

We estimated RRs²⁷ by region for each definition and exposure offset combination. We estimated region-specific RRs without and with air pollution terms in the model.
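As an illustration of how a rate ratio follows from a log-linear rate model of this kind, the sketch below exponentiates the EHE coefficient, optionally adding the EHE-by-AP interaction evaluated at a chosen AP score. The function names and the coefficient values are hypothetical and are not taken from the fitted models; the confidence interval helper assumes a model without the interaction term and a Wald-type interval.

```python
import math

def rate_ratio(beta_ehe, beta_interaction=0.0, ap_score=0.0):
    """RR for EHE vs. non-EHE days at a given standardized AP score.

    With an EHE x AP interaction in a log-link model, the log rate ratio
    is beta_ehe + beta_interaction * ap_score.
    """
    return math.exp(beta_ehe + beta_interaction * ap_score)

def rate_ratio_ci(beta_ehe, std_err, z=1.96):
    """Wald-type 95% CI for the RR (no-interaction model assumed)."""
    return (math.exp(beta_ehe - z * std_err), math.exp(beta_ehe + z * std_err))

# Hypothetical coefficients, for illustration only.
rr_main_effect = rate_ratio(0.03)                      # no interaction term
rr_at_mean_ap = rate_ratio(0.02, 0.01, ap_score=1.0)   # interaction at AP = 1.0
ci_low, ci_high = rate_ratio_ci(0.03, 0.01)
```

Evaluating the interaction at a fixed AP score is the same idea as reporting RRs at the mean AP score in regions where the interaction term is retained.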

6.2.7 Excess death estimation

We estimated the excess deaths associated with each EHE definition and exposure offset combination, and county, using the modeled output generated by the negative binomial rate regression model. We first estimated the total deaths from all EHE days combined for each definition, $D_{EHE}$, i.e., we summed up all deaths (obtained from the model) when the EHE indicator was one ($EHE = 1$). Then, we computed the total model-estimated deaths on non-EHE days, or baseline deaths, $D_{nonEHE}$, by summing up deaths when the EHE indicator was set to zero ($EHE = 0$). We calculated daily county-level baseline deaths ($B_{daily}$) as follows:

$$B_{daily} = \frac{\sum D_{nonEHE}}{N_{nonEHE}} \qquad (3a)$$

Subsequently, we summed the county-level baseline deaths on all EHE days ($B_{EHE}$)²⁸, where

$$B_{EHE} = B_{daily} \times N_{EHE} \qquad (3b)$$

and $N_{nonEHE}$ and $N_{EHE}$ denote the numbers of non-EHE and EHE days, respectively. We finally calculated county-level excess deaths ($ExD_{county}$) on EHE days for each county as

$$ExD_{county} = D_{EHE} - B_{EHE} \qquad (3c)$$

We summed all the $ExD_{county}$ values in our study dataset (counties with complete meteorological information), across months and years by climate region, to obtain a regional estimate of excess deaths, $ExD_{region}$. We then scaled this estimate of regional excess deaths to account for all the population contained within each climate region using a population-adjusted scaling factor, $S$.

$$ExD_{region}^{scaled} = ExD_{region} \times S \qquad (3d)$$

Where

$$S = \frac{\sum_{c \,\in\, \text{all counties}} \sum_{y} Pop_{c,y}}{\sum_{c \,\in\, \text{study counties}} \sum_{y} Pop_{c,y}} \qquad (3e)$$

27 We used the ESTIMATE statement in the GENMOD procedure (PROC GENMOD) available from Statistical Analysis System (SAS® Version 9.3) to compute RRs without and with air pollution terms in the model.
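The bookkeeping behind equations (3a) through (3e) can be sketched in Python as follows. This is one reading of the procedure, not the original SAS implementation: it assumes that the daily baseline is the model-estimated non-EHE deaths divided by the number of non-EHE days, and that the scaling factor is the ratio of the full regional population to the population of the study counties. All function names and numbers are hypothetical.

```python
def county_excess_deaths(pred_ehe_deaths, pred_non_ehe_deaths,
                         n_ehe_days, n_non_ehe_days):
    """Excess deaths on EHE days for one county and period."""
    if n_non_ehe_days == 0:
        # The text handles this case with dummy records scored by the
        # fitted model; that workaround is not reproduced here.
        raise ValueError("no non-EHE days available to form a baseline")
    daily_baseline = pred_non_ehe_deaths / n_non_ehe_days   # daily baseline
    baseline_on_ehe_days = daily_baseline * n_ehe_days      # baseline on EHE days
    return pred_ehe_deaths - baseline_on_ehe_days           # excess deaths

def scale_to_region(study_excess, region_population, study_population):
    """Scale excess deaths from study counties to the whole region."""
    return study_excess * (region_population / study_population)

# Hypothetical county-month: 4 EHE days and 27 non-EHE days.
excess = county_excess_deaths(66.0, 270.0, n_ehe_days=4, n_non_ehe_days=27)
regional = scale_to_region(excess, region_population=2_000_000,
                           study_population=1_000_000)
```

In this example the daily baseline is 10 deaths, so 40 baseline deaths are expected over the 4 EHE days and the remaining 26 are counted as excess before scaling.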

We also generated the interquartile range (IQR) for the top 10 definitions based on the bootstrap-generated empirical distributions of excess death estimates.

6.2.8 Region-level summary of E-R relationship and excess deaths

We conducted a summary-level pooled analysis, using the top 10 EHE definitions and exposure offset variables, to estimate the mean and 95% confidence interval (CI) RR by region (with and without air pollution terms). This summary-level pooled analysis was analogous to a meta-analysis of effect sizes from studies with different subjects or study participants (Borenstein and Higgins 2013; Mortimer et al. 2012; Shah et al. 2005). In our analysis, each study was akin to a model run executed with a different EHE definition. Specifically, we used a random

28 The only time this process would break down would be in instances where every day of the month is classified as an EHE day. In such instances, we inserted dummy records in the negative binomial rate regression modeling procedure with the complete list of predictors and a missing dependent variable, deaths. The modeling procedure does not include such a record in the model fitting but generates a model-predicted death count for it.

effects model to conduct the summary-level pooled analyses to account for differences in effect sizes arising from random variability as well as the error introduced by selecting a particular EHE definition. We used diagnostics such as I-squared (Higgins et al. 2003) to check for the presence and magnitude of heterogeneity by region.
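A random effects pooled estimate and the I-squared statistic can be sketched as below. The sketch uses the DerSimonian-Laird moment estimator of between-run variance, which is an assumption: the pooled analysis in this work was run in the Comprehensive Meta-Analysis software, and its exact estimator settings are not stated here. The inputs are hypothetical log rate ratios and standard errors, one per EHE definition.

```python
import math

def pool_random_effects(log_rrs, std_errs, z=1.96):
    """DerSimonian-Laird random effects pooling of log rate ratios."""
    k = len(log_rrs)
    weights = [1.0 / se ** 2 for se in std_errs]          # inverse-variance weights
    sum_w = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, log_rrs)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, log_rrs))
    df = k - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0       # between-run variance
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errs]
    pooled = sum(w * y for w, y in zip(re_weights, log_rrs)) / sum(re_weights)
    pooled_se = math.sqrt(1.0 / sum(re_weights))
    return {
        "rr": math.exp(pooled),
        "ci": (math.exp(pooled - z * pooled_se), math.exp(pooled + z * pooled_se)),
        "tau2": tau2,
        "i_squared": i_squared,
    }

# Two hypothetical model runs with clearly different effect sizes.
result = pool_random_effects([0.0, 0.2], [0.05, 0.05])
```

An I-squared near zero suggests the definition-specific RRs differ by little more than sampling error, while large values point to real differences between definitions.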

Further, we estimated the excess deaths per EHE day from all causes, except injury, using the mean and 95% confidence interval (CI) RR generated from the summary-level pooled analysis. We deduced the following formula from equation (2) to calculate excess deaths (ExD) per EHE day for each region:

$$ExD = (RR - 1) \times BR \times Population \qquad (4)$$

Where,

$ExD$ = region-level excess deaths per EHE day for each region based on all EHE definitions and exposure offset indicators;

$RR$ = rate ratio generated by the summary-level pooled analysis using the random effects model;

$BR$²⁹ = daily regional baseline rate (deaths/population) on EHE day;

Population = total regional population.
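A short numeric sketch of the excess deaths per EHE day calculation follows. It assumes the usual attributable form, in which excess deaths equal the baseline deaths multiplied by (RR - 1); the function name and all input values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def excess_deaths_per_ehe_day(rate_ratio, baseline_rate_per_100k, population):
    """Excess deaths on one EHE day, assuming excess = (RR - 1) x baseline."""
    baseline_deaths = baseline_rate_per_100k / 100_000 * population
    return (rate_ratio - 1.0) * baseline_deaths

# Hypothetical region: RR = 1.03, 2 deaths per 100,000 per day, 50 million people.
mean_excess = excess_deaths_per_ehe_day(1.03, 2.0, 50_000_000)

# The same function applied to the CI limits of the RR gives the limits
# of the excess death estimate (hypothetical limits shown).
lower_excess = excess_deaths_per_ehe_day(1.01, 2.0, 50_000_000)
upper_excess = excess_deaths_per_ehe_day(1.05, 2.0, 50_000_000)
```

With about 1,000 baseline deaths per day in this hypothetical region, an RR of 1.03 corresponds to roughly 30 excess deaths per EHE day, and an RR below 1 yields a negative estimate.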

6.2.9 Monetizing Excess deaths

We assessed the monetized mortality burden in terms of lifetime work loss (LWL) costs resulting from premature death. LWL costs include lost wages, lost benefits, and self-reported lost household benefits; lifetime work loss costs are determined by the age and

29 We first calculated daily county-level baseline deaths based on the model-generated output using formula 3a. We generated a county-level daily baseline rate by dividing these baseline deaths by the county population. We then used a random effects model to generate a mean daily baseline rate for each climate region.

gender of the decedent (Lawrence et al. 2009). LWL cost coefficients were obtained by single year of age and gender from CDC’s Web-based Injury Statistics Query and

Reporting System (WISQARS) (www.cdc.gov/ncipc/wisqars).

LWL cost coefficients were indexed to 2010 dollar-value and assigned according to the gender and age of the decedent. Figure E-3 shows the unit LWL costs in millions ($) by age and gender. We created a baseline LWL cost per death ($LWL_{base}$) by region for deaths due to all causes that are not injury related. We then monetized the excess deaths per EHE day as

$$Cost_{LWL} = ExD \times LWL_{base} \qquad (5)$$

Where,

$Cost_{LWL}$ = regional LWL cost associated with excess deaths;

$ExD$ = region-level excess deaths per EHE day for each region based on all EHE definitions and exposure offset indicators;

$LWL_{base}$ = region-level baseline LWL cost per decedent.
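The monetization step is a product of excess deaths and a unit cost, as the sketch below shows. The inputs are the rounded Northeast values reported later for the model without air pollution terms (37 excess deaths per EHE day, 95% CI 20 to 54), a unit LWL cost of about $0.3 million, and the adjusted VSL of $8.1 million. Because these inputs are rounded, the results only approximate the entries in Table 6-4; the function name is hypothetical.

```python
def excess_cost_per_ehe_day(excess_deaths, unit_cost_musd):
    """Cost in $ million: excess deaths per EHE day times cost per death."""
    return excess_deaths * unit_cost_musd

LWL_PER_DEATH = 0.3   # approximate regional LWL cost per death ($ million)
VSL_PER_DEATH = 8.1   # VSL in 2010 prices ($ million)

# Mean, lower limit, and upper limit of excess deaths per EHE day.
northeast_excess = (37, 20, 54)

lwl_costs = [excess_cost_per_ehe_day(d, LWL_PER_DEATH) for d in northeast_excess]
vsl_costs = [excess_cost_per_ehe_day(d, VSL_PER_DEATH) for d in northeast_excess]
```

The LWL-based values come out near $11 million ($6 million to $16 million) per EHE day and the VSL-based values near $300 million, close to the corresponding Northeast entries in Table 6-4.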

As a sensitivity analysis for our economic estimation, we also used the “value of statistical life (VSL)” metric to quantify the mortality burden in economic terms (EPA

2010; Kochi et al. 2006). We used a baseline VSL estimate of 7.6 million (M) dollars from 2006 and adjusted it to $8.1M per 2010 prices, using a cumulative inflation rate of

8.2%. (http://www.usinflationcalculator.com/)

We carried out our data analyses using the Statistical Analysis System (SAS® Version

9.3), Environmental Systems Research Institute’s GIS software (ESRI, ArcGIS® Version

9.3), and comprehensive meta-analysis software (CMA® Version 2.0).


6.3 Results and Discussion

6.3.1 Descriptive statistics

Table 6-1 summarizes the number of all-cause, non-injury deaths by climate region for the years 2001-2009. The total number of deaths from all causes, except injury, in the conterminous U.S. for 2001-2009 was 20.2 M and, of those deaths, 10.9 M (54%) occurred in counties with meteorological data. In counties with meteorological data, the Northeast region had the highest number of deaths (n = 2.1 M) and the Southwest region had the lowest (n = 0.5 M). The North West Central region, which we created by combining the Northwest and West North Central regions, had the second lowest number of deaths (n = 0.6 M). The South region had the highest number of counties with meteorological data (n = 91), and the West region had the lowest (n = 38).

Table 6-2 provides the levels of air pollutants and the standardized AP score during EHE and non-EHE days by climate region. Air pollution levels on EHE days are higher than on non-EHE days for most climate regions. On EHE days, the highest county-level average monthly ozone levels were observed in the Northeast region (mean = 61.7 ppb; IQR = 16.4 ppb) and the lowest in the North West Central region (mean = 50.4 ppb; IQR = 8.0 ppb). The highest county-level average monthly PM2.5 levels were observed in the Northeast region (mean = 21.9 µg/m³; IQR = 8.6 µg/m³) and the lowest in the Southwest region (mean = 8.0 µg/m³; IQR = 2.2 µg/m³).

Standardized air pollution scores derived using factor analysis provide control for confounding while avoiding multicollinearity between ozone and PM2.5. Before adopting


these standardized air pollution scores based on factor analysis, we verified the proportion of variance accounted for by the factors. We ended up using the factor that consistently accounted for a high proportion of variance across all climate regions.

Figures E-4 through E-8 show three-dimensional (3-D) plots of factor scores on EHE and non-EHE days as a function of ozone and PM2.5 by climate region and EHE definition (plots shown only for lag0 definitions). Similar to the pattern observed for PM2.5 and ozone, county-level average monthly factor scores are relatively higher on EHE days than on non-EHE days for most EHE definitions.


Table 6-1: Descriptive summary statistics

| U.S. Climate Region | Number of deaths | Number of counties with meteorological data | Number of deaths in counties with meteorological data | Number of counties with population greater than 100,000 and meteorological data | Number of deaths in counties with population greater than 100,000 and meteorological data | Percent of people living in counties with population greater than 100,000 and meteorological data (%) |
|---|---|---|---|---|---|---|
| Central | 3,690,858 | 78 | 1,728,583 | 47 | 1,588,783 | 45 |
| East North Central | 1,665,304 | 54 | 797,120 | 27 | 706,240 | 44 |
| North West Central | 1,066,930 | 88 | 647,112 | 19 | 471,177 | 45 |
| Northeast | 4,495,488 | 70 | 2,135,716 | 49 | 2,046,717 | 49 |
| South | 2,613,039 | 91 | 1,358,598 | 46 | 1,214,403 | 55 |
| Southeast | 3,773,684 | 71 | 1,782,648 | 53 | 1,686,756 | 47 |
| Southwest | 826,136 | 43 | 545,396 | 14 | 498,143 | 59 |
| West | 2,111,975 | 38 | 1,912,455 | 30 | 1,894,269 | 91 |
| All regions | 20,243,414 | 533 | 10,907,628 | 285 | 10,106,488 | 54 |


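Table 6-2: Levels of air pollutants and standardized AP scores on EHE and non-EHE days (county-level average monthly levels of air pollutants and standardized air pollution score)

| Region | Non-EHE day: Ozone mean (ppb) | Non-EHE day: Ozone IQR (ppb) | Non-EHE day: PM2.5 mean (µg/m³) | Non-EHE day: PM2.5 IQR (µg/m³) | Non-EHE day: Standardized AP score mean | Non-EHE day: Standardized AP score IQR | EHE day: Ozone mean (ppb) | EHE day: Ozone IQR (ppb) | EHE day: PM2.5 mean (µg/m³) | EHE day: PM2.5 IQR (µg/m³) | EHE day: Standardized AP score mean | EHE day: Standardized AP score IQR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Central | 49.8 | 8.6 | 14.6 | 4.8 | -0.2 | 1.1 | 57.3 | 11.7 | 19.7 | 8.6 | 0.9 | 1.6 |
| East North Central | 44.0 | 8.1 | 10.2 | 3.8 | -0.3 | 1.0 | 55.0 | 12.1 | 15.6 | 6.9 | 1.1 | 1.6 |
| North West Central | 42.5 | 8.4 | 7.2 | 2.3 | -0.3 | 0.9 | 50.4 | 8.0 | 10.2 | 3.3 | 1.0 | 1.0 |
| Northeast | 45.6 | 10.2 | 11.7 | 5.2 | -0.3 | 0.9 | 61.7 | 16.4 | 21.9 | 8.6 | 1.3 | 1.4 |
| South | 46.9 | 8.2 | 10.7 | 3.9 | -0.1 | 1.2 | 53.7 | 10.6 | 11.9 | 5.1 | 0.7 | 1.7 |
| Southeast | 46.8 | 13.0 | 13.2 | 5.7 | -0.1 | 1.3 | 52.7 | 16.3 | 16.7 | 8.3 | 0.6 | 1.7 |
| Southwest | 51.7 | 7.4 | 6.4 | 1.7 | -0.2 | 1.1 | 55.9 | 7.9 | 8.0 | 2.2 | 0.8 | 1.3 |
| West | 49.3 | 15.7 | 9.4 | 4.0 | -0.3 | 1.2 | 60.8 | 17.7 | 11.8 | 4.3 | 0.6 | 1.3 |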


6.3.2 Modeling results

We considered several candidate predictors. In addition to predictors such as standardized AP score, EHE definition and exposure offset indicator, we decided to include county-level percentage of Hispanic population, AC prevalence, and adult smoking prevalence in the final model. We settled on these social and behavioral risk factors because they are commonly cited in the literature as indicators of heat vulnerability and/or risk factors for mortality (Klinenberg 2003a, b; Reid et al. 2009; Semenza et al. 1996). Once we settled on these predictors, we executed the summary-level model given in equation (2). We examined patterns of residuals and goodness-of-fit parameters to identify instances where the data were stratified too finely. Some of the regions had counties in our study dataset that did not have a sufficient number of deaths (even at the monthly level), which is the primary reason we restricted our analysis to counties with a resident population of greater than 100,000. After applying this filter, we had 285 counties spread across different climate regions. Of note, these counties accounted for 10.1 million (50%) deaths and 54% of the total conterminous U.S. population. A tally of deaths and percent of people living in counties with a population of over 100,000 is provided in Table 6-1.

6.3.3 E-R relationships

We executed a region-specific model for each of the top 10 definitions and exposure offset indicators. We examined the model diagnostics to assess the goodness-of-fit and intercept offsets for social and behavioral predictors. We observe that higher levels of AC prevalence correspond with lower mortality rate. AC prevalence is a significant risk


factor for extreme heat related mortality (Reid et al. 2009) and relatively high AC prevalence is observed in the South and the Southeast regions of the U.S. We also note that in areas with higher levels of Hispanic population, the mortality rate is relatively lower. It has been speculated that higher levels of Hispanic population could be a proxy for the lack of certain risk factors, such as, people living alone (without a family) or lack of social support during times of health discomfort (Klinenberg 2003a, b). Finally, higher levels of smoking correspond with a higher mortality rate. This relationship between smoking and mortality is corroborated in previous studies (Jerrett et al. 2009; Pope III et al. 2002). Table E-4 in the appendix provides information on the levels of various social and demographic variables by U.S. climate region.

In order to assess the relationship between EHE and mortality, considering air pollution terms, we first checked the significance level (p-value) of the parameter estimate for the term denoting interaction between EHE indicator and AP score. We noticed that the interaction term was significant only for the Southeast and West regions based on most of the EHE definitions and exposure offsets considered. Hence, for other climate regions, we removed the interaction term and decided to retain only the main terms for AP score and EHE indicator. We executed this model without an interaction term (but with all the other covariates) and examined the p-value of the parameter estimate for the EHE indicator. We noticed that for certain climate regions the parameter estimate for the EHE term was not statistically significant (p-value > 0.05) for the majority of the EHE definitions. In some cases, the Southwest region for example, the EHE term was not significant regardless of the definition selected.


Figure 6-1 shows the E-R relationship between mortality and EHEs (with and without air pollution terms)³⁰. We estimated the mean (95% CI) RR by region, based on the random effects summary-level analysis. The RRs are provided by region and they illustrate that there is confounding by air pollutants in all regions. Additionally, the level of confounding varies with climate region. The effect sizes without air pollution terms differ from those obtained when air pollution terms are considered. The differences in the effect sizes are prominent in regions where the interaction term between EHE and AP scores is significant (Southeast and West). The effect sizes for the Northeast region also differ significantly with and without air pollution terms: the RR without air pollution terms indicates a statistically significant increase in mortality, whereas the RR with air pollution terms in the model indicates a statistically significant decrease. We are unable to explain why this pattern is observed in the Northeast, and more research is needed.

30 In the Southeast and West, the regions with a significant interaction term between the EHE indicator and standardized AP scores, we estimated RR at the mean levels of AP scores.

Figure 6-1: Rate ratios by climate regions generated from the random effects summary-level analysis, controlling for social and demographic factors, and with and without adjustment for air pollution (standardized AP score).


6.3.4 Excess deaths and economic costs associated with EHEs

We observed significantly different estimates of excess deaths depending on the EHE definition and exposure offset indicator used in the rate regression model. These estimates are based on first estimating county-level excess deaths and then summing them up to obtain regional estimates. The variability in excess death estimate, across different EHE definitions, could potentially be due to the combination of unstable county-level baseline rate and slight variation in definition-specific RRs estimated from the model.

We tried to reduce the variability introduced by different EHE definitions and unstable baseline rates by estimating excess deaths at the regional level using RRs from the random effects summary-level pooled analysis. We generated an average daily regional baseline rate, considering all top 10 EHE definitions, using a random effects summary-level pooled analysis. The magnitude of the average regional daily baseline rate was comparable across regions and was approximately 2 deaths per 100,000 population. We provide excess deaths per EHE day estimates for climate regions based on RRs generated from models with and without air pollution terms. The patterns observed in excess deaths per EHE day vary with climate region. Daily excess death estimates change drastically when air pollution terms are included in the model, especially for the Northeast, Southeast and West regions. In the East North Central and North West Central regions, the mean estimates of excess deaths are approximately equal but the 95% CIs are considerably wider when air pollution-based RRs are used. In the South region, there is little variation in excess deaths per EHE day estimates regardless of whether the model used air pollution terms. Region-specific excess death estimates per EHE day are provided in Table 6-3.

Table 6-3: Excess deaths per EHE day using RR generated from a model with and without air pollution terms (mean and 95% CI excess deaths per EHE day)

| U.S. Climate Region | With air pollution: Mean | With air pollution: Lower limit | With air pollution: Upper limit | Without air pollution: Mean | Without air pollution: Lower limit | Without air pollution: Upper limit |
|---|---|---|---|---|---|---|
| Central | -14 | -26 | 0 | 3 | -4 | 10 |
| East North Central | 4 | -1 | 9 | 7 | 3 | 12 |
| North West Central | 5 | -3 | 14 | 5 | 3 | 7 |
| Northeast | -54 | -64 | -45 | 37 | 20 | 54 |
| South | -11 | -21 | -2 | -13 | -21 | -4 |
| Southeast | 49 | 28 | 71 | -28 | -63 | 7 |
| Southwest | -1 | -5 | 4 | 4 | -1 | 11 |
| West | 14 | 11 | 18 | -10 | -20 | 0 |

In most regions, the mortality burden is higher among older populations and this is reflected in the lower average daily cost per death estimates. The average cost per death from all causes except injury is comparable among regions, at around $0.3 M per death. Hence, the observed patterns in excess economic costs per EHE day are similar to those of excess deaths per EHE day. Region-specific monetized health burden estimates per EHE day are provided in Table 6-4.


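Table 6-4: Excess costs per EHE day using RRs generated from a model with and without air pollution terms (mean and 95% CI excess costs, $ million, per EHE day)

| U.S. Climate Region | With air pollution, LWL-based: Mean | With air pollution, LWL-based: Lower limit | With air pollution, LWL-based: Upper limit | With air pollution, VSL-based: Mean | With air pollution, VSL-based: Lower limit | With air pollution, VSL-based: Upper limit | Without air pollution, LWL-based: Mean | Without air pollution, LWL-based: Lower limit | Without air pollution, LWL-based: Upper limit | Without air pollution, VSL-based: Mean | Without air pollution, VSL-based: Lower limit | Without air pollution, VSL-based: Upper limit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Central | -4.2 | -8.2 | -0.1 | -110.0 | -214.0 | -3.6 | 0.9 | -1.2 | 3.1 | 23.7 | -32.6 | 80.4 |
| East North Central | 1.1 | -0.1 | 2.4 | 32.2 | -4.2 | 68.9 | 2.1 | 0.7 | 3.4 | 58.6 | 21.0 | 96.6 |
| North West Central | 1.4 | -0.9 | 3.8 | 40.5 | -27.2 | 110.2 | 1.5 | 0.9 | 2.0 | 42.0 | 27.1 | 57.1 |
| Northeast | -15.7 | -18.5 | -12.9 | -440.0 | -517.0 | -362.0 | 10.6 | 5.7 | 15.6 | 296.5 | 158.5 | 436.2 |
| South | -3.8 | -7.1 | -0.5 | -90.9 | -168.0 | -12.3 | -4.3 | -7.0 | -1.4 | -101.0 | -168.0 | -34.2 |
| Southeast | 15.9 | 9.0 | 23.0 | 398.6 | 226.2 | 574.3 | -9.1 | -20.3 | 2.4 | -229.0 | -508.0 | 59.9 |
| Southwest | -0.3 | -1.6 | 1.2 | -6.5 | -41.7 | 29.4 | 1.4 | -0.5 | 3.4 | 36.3 | -11.9 | 85.6 |
| West | 4.6 | 3.4 | 5.9 | 117.2 | 85.7 | 149.0 | -3.2 | -6.5 | 0.1 | -81.2 | -164.0 | 3.3 |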


The estimates of economic burden derived using VSL and LWL costs reflect different methodological approaches with different interpretations. According to EPA’s guidelines for preparing economic analyses (EPA 2010), the VSL-based estimates encapsulate the amount that people would be willing to pay to avoid certain environmental risks or natural hazards in order to reduce the statistical probability of death from these causes. Total economic burden estimates using VSL as the unit cost are intended to reflect the economic value society places on avoiding these premature deaths. On the other hand, LWL costs represent direct costs (such as lost wages, benefits, and self-provided household services) associated with a premature death.

6.3.5 Limitations

Our study had a few limitations. We were unable to model the relationship between EHEs and mortality with certainty in some climate regions, which could be due to region-specific differences, such as: (1) the mortality response to extreme heat could be confounded by factors not considered in this assessment, (2) deaths from all causes excluding injury may not be an ideal mortality endpoint to consider in these regions, and (3) we lacked information on the effectiveness of heat alerts and advisories issued during EHEs and/or on sub-regional differences in behavioral modification as a result of heeding alerts. We had insufficient death counts in some of the less populous counties (population less than or equal to 100,000) and had to exclude them from the analysis. These counties, which are mostly in rural areas, could have a different E-R relationship between EHEs and mortality. Although rural areas typically have lower air pollution levels, other risk factors that determine mortality response to extreme heat may be more prevalent in less populated areas. For example, access to care and/or access to cooling shelters may be limited. Also, certain sub-populations, such as older populations, are more vulnerable to extreme heat and air pollution. Our analysis does not account for differences in age or in levels of other individual-level covariates when estimating RRs. This was mainly to prevent overly fine stratification of data in certain climate regions. Further, we used modeled estimates of air pollution to create standardized factor scores. While the DS model has been thoroughly evaluated, E-R relationships, or for that matter, excess death estimates, could be different if this analysis were to be reproduced using air quality measurements. As future work, we would like to employ small area Bayesian approaches to explore E-R relationships and estimate excess deaths to get around some of the limitations associated with small death counts.

6.4 Conclusions

The economic burden associated with mortality is important to capture because in extreme temperature events, especially heat events, infrastructure damage is often minimal compared to that from hurricanes or tornadoes. More often than not, the burden is underestimated if the costs resulting from mortality are overlooked. Further, the public health community strives to minimize mortality risks associated with environmental hazards. Efforts to minimize the adverse health impacts from extreme heat, as well as to reduce exposure to air pollution, are carried out on an ongoing basis but are often treated as two separate efforts. In this study, we examined the nexus between air pollution and heat waves by climate region. We explored the role of air pollutants in modifying or confounding the relationship between EHEs and deaths from all causes that are non-injury related. We observed that air pollution confounds the relationship between EHE and non-injury mortality and that the extent of confounding varies with climate region. Further, the E-R relationship and excess death estimates are sensitive to EHE definitions. Hence we present a mean estimate of excess deaths and associated costs per day, using random effects meta-analysis and considering different EHE definitions that are closely related to heat mortality. We feel that there will always be some subjectivity in selecting the best EHE definition, and these “per day” estimates are useful since they could be used as “excess death/cost multipliers” to estimate the total mortality burden prospectively or for years not considered in this assessment.

6.5 References

1. Analitis A, Michelozzi P, D'Ippoliti D, De'Donato F, Menne B, Matthies F, et al.

2014. Effects of heat waves on mortality: Effect modification and confounding by

air pollutants. Epidemiology 25:15-22.

2. Anderson GB, Bell ML. 2011. Heat waves in the United States: Mortality risk during

heat waves and effect modification by heat wave characteristics in 43 U.S.

communities. Environmental health perspectives 119:210-218.

3. Basu R. 2002. Relation between elevated ambient temperature and mortality: A

review of the epidemiologic evidence. Epidemiologic Reviews 24:190-202.

4. Basu R, Dominici F, Samet JM. 2005. Temperature and mortality among the elderly

in the United States: A comparison of epidemiologic methods. Epidemiology 16:58-

66.

5. Berrocal VJ, Gelfand AE, Holland DM. 2010. A spatio-temporal downscaler for

output from numerical models. Journal of agricultural, biological, and environmental

statistics 15:176-197.


6. Berrocal VJ, Gelfand AE, Holland DM. 2012. Space-time data fusion under error in

computer model output: An application to modeling air quality. Biometrics 68:837-

848.

7. Borenstein M, Higgins JPT. 2013. Meta-analysis and subgroups. Prevention Science

14:134-143.

8. EPA. 2010. Guidelines for preparing economic analyses (http://yosemite.epa.gov/ee/epa/eerm.nsf/vwAN/EE-0568-50.pdf/$file/EE-0568-50.pdf).

9. Fann N, Lamson AD, Anenberg SC, Wesson K, Risley D, Hubbell BJ. 2012.

Estimating the national public health burden associated with exposure to ambient

PM2.5 and ozone. Risk Analysis 32:81-95.

10. Hanzlick R, College of American Pathologists., National Association of Medical

Examiners (U.S.). 2006. Cause of death and the death certificate : Important

information for physicians, coroners, medical examiners, and the public. Northfield,

Ill.:College of American Pathologists.

11. Higgins JP, Thompson SG, Deeks JJ, Altman DG. 2003. Measuring inconsistency in

meta-analyses. BMJ: British Medical Journal 327:557.

12. Jerrett M, Burnett RT, Ma R, Pope III CA, Krewski D, Newbold KB, et al. 2005.

Spatial analysis of air pollution and mortality in Los Angeles. Epidemiology 16:727-736.

13. Jerrett M, Burnett RT, Pope III CA, Ito K, Thurston G, Krewski D, et al. 2009.

Long-term ozone exposure and mortality. New Engl J Med 360:1085-1095.


14. Klinenberg E. 2003a. Review of heat wave: Social autopsy of disaster in Chicago.

The New England journal of medicine 348:666-667.

15. Klinenberg E. 2003b. Heat wave: A social autopsy of disaster in Chicago (book).

Choice: Current Reviews for Academic Libraries 40:1268.

16. Knowlton K, Rotkin-Ellman M, King G, Margolis HG, Smith D, Solomon G, et al.

2009. The 2006 California heat wave: Impacts on hospitalizations and emergency

department visits. Environmental health perspectives 117:61-67.

17. Kochi I, Hubbell B, Kramer R. 2006. An empirical Bayes approach to combining and

comparing estimates of the value of a statistical life for environmental policy

analysis. Environmental & Resource Economics 34:385-406.

18. Kovats RS, Hajat S. 2008. Heat stress and public health: A critical review. Annual

review of public health 29:41-55.

19. Lawrence B, Bhattacharya S, Zaloshnja E, Jones P, Miller T, Corso P, et al. 2009.

Medical and work loss cost estimation methods for the WISQARS cost of injury

module. Calverton, MD: Pacific Institute for Research and Evaluation (PIRE).

20. Marcus AH, Kegler SR. 2001. Confounding in air pollution epidemiology: When

does two-stage regression identify the problem? Environmental health perspectives

109:1193.

21. Margolis H, Gershunov A, Kim T, English P, Trent R. 2008. 2006 California heat

wave high death toll: Insights gained from coroner’s reports and meteorological

characteristics of event. Epidemiology 19:S363-S364.

22. Minino AM, Murphy SL, Xu J, Kochanek KD. 2011. Deaths: Final data for 2008.

National vital statistics reports : from the Centers for Disease Control and


Prevention, National Center for Health Statistics, National Vital Statistics System

59:1-126.

23. Mortimer JA, Borenstein AR, Nelson LM. 2012. Associations of welding and

manganese exposure with Parkinson disease: Review and meta-analysis. Neurology

79:1174-1180.

24. Nikolov MC, Coull BA, Catalano PJ, Godleski JJ. 2011. Multiplicative factor

analysis with a latent mixed model structure for air pollution exposure assessment.

Environmetrics 22:165-178.

25. Pope III CA, Burnett RT, Thun MJ, Calle EE, Krewski D, Ito K, et al. 2002. Lung

cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air

pollution. JAMA : the journal of the American Medical Association 287:1132-1141.

26. Reid CE, O'Neill MS, Gronlund CJ, Brines SJ, Brown DG, Diez-Roux AV, et al.

2009. Mapping community determinants of heat vulnerability. Environmental health

perspectives 117:1730-1736.

27. Semenza JC, Rubin CH, Falter KH, Selanikio JD, Flanders WD, Howe HL, et al.

1996. Heat-related deaths during the july 1995 heat wave in chicago. The New

England journal of medicine 335:84-90.

28. Shah NR, Borenstein J, Dubois RW. 2005. Postmenopausal hormone therapy and

breast cancer: A systematic review and meta-analysis. Menopause 12:668-678.

29. Smith AB, Katz RW. 2013. Us billion-dollar weather and climate disasters: Data

sources, trends, accuracy and biases. Nat Hazards 67:387-410.

30. Yang T, Matus K, Paltsev S, Reilly J. 2005. Economic benefits of air pollution

regulation in the USA: An integrated approach. MIT Joint Program on the Science

and Policy of Global Change.

CHAPTER 7 Summary of Conclusions and Future Research

7.1 Summary of Conclusions

7.1.1 Region-Specific Evaluation of Extreme Heat Event Definitions Using Heat Mortality Data

Several local and state health departments are currently interested in issuing heat advisories, as well as conducting retrospective health studies to understand the effects of extreme heat on mortality and morbidity; health departments are collaborating with local and national weather offices to do so. EHE definitions used by most heat warning systems to issue alerts are calibrated to the extreme end of the daily heat metric spectrum.

This effort is the first nationally comprehensive, region-specific evaluation of EHE definitions using heat mortality data. The evaluation framework employed cluster analyses to group EHE definitions and then estimated the EHE effect of a representative definition from each cluster using rate regression modeling; it provides a robust way to identify definitions that are closely associated with heat mortality data. This approach not only identifies a set of “ideal” definitions that are closely associated with heat-related mortality but also sheds light on some of the poorly associated EHE definitions that are used in the literature.

Research findings from this study suggest that definitions with thresholds that are either too extreme or too moderate are poorly associated with heat-related mortality for most climate regions. Of the exposure offset indicators considered, definition combinations involving a 1-day lag seem to produce a higher EHE effect in most regions except the Northeast, Southeast, and Southwest; in the Northeast and Southwest, definitions involving the no-lag exposure offset indicator produce a higher EHE effect. Our evaluation indicates that the warmer regions of the U.S., such as the South and Southeast, have a relatively lower EHE effect and a relatively higher baseline heat mortality rate. Meanwhile, colder areas of the U.S., such as the North West Central and East North Central, have a relatively higher EHE effect and a lower baseline heat mortality rate.
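To make the notion of an EHE definition concrete, the sketch below flags EHE days from a daily heat metric using a threshold and a minimum run length, the two ingredients that vary across the definitions evaluated. The function names, the sample temperatures, and the way a 1-day lag is applied (shifting the flags forward by one day) are illustrative assumptions, not the code used in this study.

```python
from typing import List, Sequence


def flag_ehe_days(daily_metric: Sequence[float], threshold: float, min_days: int) -> List[bool]:
    """Flag every day that belongs to a run of at least `min_days`
    consecutive days with the heat metric strictly above `threshold`."""
    flags = [False] * len(daily_metric)
    run_start = None
    # A sentinel below any threshold closes a run that reaches the last day.
    for i, value in enumerate(list(daily_metric) + [float("-inf")]):
        if value > threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_days:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = None
    return flags


def apply_lag(flags: Sequence[bool], lag_days: int) -> List[bool]:
    """Shift flags forward so that day t is 'exposed' if day t - lag_days was an EHE day.
    This is one assumed reading of a lagged exposure offset."""
    if lag_days <= 0:
        return list(flags)
    return [False] * min(lag_days, len(flags)) + list(flags[: max(len(flags) - lag_days, 0)])


# Hypothetical daily maximum temperatures (degrees F) for eight days.
temps = [88, 91, 92, 89, 93, 94, 95, 90]
same_day = flag_ehe_days(temps, threshold=90, min_days=2)
lagged = apply_lag(same_day, lag_days=1)
```

With a threshold of 90 and a two-day minimum, days 2-3 and 5-7 of this hypothetical series are flagged; raising the minimum duration to three days leaves only days 5-7, which illustrates how stricter definitions shrink the set of EHE days.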

7.1.2 Exploring the Utility of Modeled Meteorology Data for Extreme Heat-Related Health Research and Surveillance

The benefits of utilizing model-based estimates should be considered in light of the added uncertainty which they introduce, as a thorough evaluation then becomes a prerequisite.

Estimates from the North American Land Data Assimilation System (NLDAS) and station-based estimates from the Automated Surface Observing System (ASOS) comport well with each other. At most station locations, the correlation is high and the difference between station- and model-based estimates is within the maximum measurement error associated with ASOS stations. There are certain areas in the U.S. where estimates from NLDAS do not correspond well with station-based measurements. The modeled estimates show variability, as indicated by relatively lower correlations, near the coastal areas of the South, Southeast, and the West. Similarly, the Northeast shows a consistent negative difference with a magnitude greater than the maximum measurement error of weather stations.
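The two agreement measures discussed above, correlation and the station-minus-model difference judged against a maximum measurement error, can be sketched as follows. The function name, the sample values, and the error tolerance of 2.0 are hypothetical and chosen only for illustration; they are not values from this evaluation.

```python
import math
from typing import Dict, Sequence


def agreement_summary(station: Sequence[float], model: Sequence[float],
                      max_measurement_error: float) -> Dict[str, object]:
    """Summarize agreement between paired station and model estimates."""
    n = len(station)
    if n != len(model) or n < 2:
        raise ValueError("need two equal-length series with at least two pairs")
    mean_s = sum(station) / n
    mean_m = sum(model) / n
    cov = sum((s - mean_s) * (m - mean_m) for s, m in zip(station, model))
    var_s = sum((s - mean_s) ** 2 for s in station)
    var_m = sum((m - mean_m) ** 2 for m in model)
    correlation = cov / math.sqrt(var_s * var_m)  # Pearson correlation
    mean_difference = mean_m - mean_s             # negative: model reads lower
    return {
        "correlation": correlation,
        "mean_difference": mean_difference,
        "within_error": abs(mean_difference) <= max_measurement_error,
    }


# Hypothetical paired daily values at one station location.
summary = agreement_summary([80, 85, 90, 95], [79, 84, 89, 94], max_measurement_error=2.0)
```

A location with high correlation but a mean difference larger than the tolerance would correspond to the "consistent negative difference" pattern described for the Northeast.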

Performance of model-based estimates drops at station locations that are part of the Southeastern Aerosol Research and Characterization (SEARCH) network, which do not have an ASOS station nearby. Also, users of model-based meteorology data from NLDAS should take note of the variability in performance at high and low temperature ranges. At high temperatures (greater than 80 °F), which are of particular interest in extreme heat-related research and surveillance, NLDAS-based estimates underpredict SEARCH measurements. County-level health effects analysis provided useful insights into the benefits and limitations of using NLDAS-based exposure estimates and highlighted certain region-specific and EHE definition-specific differences. In general, the degree of agreement between the ASOS- and NLDAS-based exposure estimates can be improved by omitting certain EHE definitions for particular regions. Underprediction of the mean EHE effect when using NLDAS-based estimates, which is the more frequently observed discrepancy, is a factor to consider for health studies. The variability associated with the mean EHE effect, observed based on the 95% CI, is comparable to the variability we see with ASOS-based exposure estimates. These insights are helpful to researchers and public health professionals interested in conducting health linkage studies, deriving exposure-response relationships, and estimating excess deaths related to extreme temperatures.

7.1.3 Characterizing the Effect of Meteorology on Ozone Levels during Extreme Heat Events

Studies that have explored relationships between ozone and meteorology have made weather-based adjustments to ozone levels to accurately characterize the impact of emission-reduction efforts and human activities on prevailing levels. The primary driver behind a majority of these studies has always been to facilitate environmental policy-making within a regulatory context. The relationship between ozone and meteorology on EHE and non-EHE days can be successfully characterized using a multivariate autoregressive model and a logarithmic response for ozone.
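One way to write such a model is sketched below. This is an illustrative form only: the first-order autoregressive term, the EHE indicator $E_t$, and all symbols are assumptions introduced here for exposition and are not taken from the fitted model.

```latex
\log\left(O_{3,t}\right) = \beta_0
  + \sum_{k} \beta_k X_{k,t}
  + \sum_{k} \gamma_k \left(X_{k,t} \times E_t\right)
  + \phi \log\left(O_{3,t-1}\right)
  + \varepsilon_t
```

Here $X_{k,t}$ denotes the $k$-th meteorological covariate (for example temperature, wind speed, or humidity) on day $t$, $\beta_k$ represents the baseline effect shared by EHE and non-EHE days, $\gamma_k$ represents effect modification on EHE days ($E_t = 1$), and $\phi$ captures day-to-day persistence in ozone.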

The baseline effect, the relationship between meteorology and ozone applicable to both EHE and non-EHE days, is consistent with results previously published in the literature. Higher temperatures result in higher ozone, and a monotonically increasing trend is observed in ozone levels for temperatures above ~70°F. Lower wind speeds result in higher ozone, as a stagnant air mass facilitates higher production of ozone. Higher humidity levels, which correspond with greater cloud cover, are indicators of atmospheric instability; such conditions interrupt the photochemical process, leading to the depletion of ozone. The extent of effect modification during EHE days varies across cities and could be due to different meteorological variables in different parts of the country. This heterogeneity could be explained by the definitions selected for this analysis, but there could be other factors, such as fluctuations in emissions of ozone precursors.

7.1.4 Assessment of Modeled Data Sources of PM2.5: A Public Health Perspective

Most of the Air Quality System (AQS) monitors sample every third day, leaving approximately 66% of the days in a year without data. This limitation could hinder our ability to accurately ascertain population-level ambient exposure and could introduce uncertainty when monitor data are used to quantify health risks. This analysis focused on the three modeling approaches most commonly used by researchers and public health practitioners: Community Multiscale Air Quality (CMAQ), Bayesian space-time Downscaler (DS), and Aerosol Optical Depth (AOD) based models.

Advances in remote sensing technologies, coupled with rigorous statistical methodologies, help generate AOD-based predictions of PM2.5 at a finer geographic scale. However, there are a few limitations that should be taken into consideration before using AOD-based PM2.5 predictions. In certain seasons, data availability is limited by cloud cover and other atmospheric factors. Additionally, the calibration process depends on the availability of AQS monitor data and, to some extent, the accuracy of the modeled predictions depends on the number of AQS monitors available to calibrate the model.

The CMAQ model offers PM2.5 predictions at continuous space and time scales. Given the wealth of emission, meteorology, land use and other pertinent information supplied to the model, the CMAQ modeling framework does capture the dynamics of the air pollution processes to an extent. However, CMAQ-based PM2.5 predictions show the weakest association with observed PM2.5 concentrations among the models evaluated in this study. CMAQ is well suited for ascertaining the background concentrations and computing long-term averages. CMAQ-based predictions can be used to augment monitor data and are currently being used as input to data fusion models.

Data fusion models, such as DS, use a Bayesian approach and generate robust predictions using AQS monitor data where available, and CMAQ predictions in places without monitor data. The DS model has the ability to borrow useful information from neighboring grid cells and provide a smoothed prediction, which tends to improve performance against observed concentrations. Overall, the DS model offers the best performance among the models considered in this assessment. This evaluation further identifies the pros and cons of each model and suggests a potential use after considering the trade-offs, for example, the enhanced spatial and temporal coverage offered by the model with the associated uncertainty.

7.1.5 Monetizing Health Burden Associated with Extreme Heat Events

For the years 2001 – 2009, we estimate a total of 5,454 excess all-cause non-injury deaths associated with EHEs in the U.S.; the model used to estimate excess deaths utilized the best region-specific EHE definition, accounted for air pollution, and controlled for adult smoking prevalence, AC prevalence, Hispanic status and month as covariates. In comparison, during this time period, there were 2,979 direct heat-related deaths reported in the U.S.31 Relying on death certificate information alone to determine the total deaths associated with extreme heat could underestimate the mortality burden by a factor of approximately 2. Moreover, monetizing excess deaths, assuming a baseline lifetime work loss cost of $0.3 million per death, suggests that the economic costs associated with the excess mortality burden are underestimated by about $1.6 billion.
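The arithmetic behind these two figures can be checked with a short sketch. It assumes that the $1.6 billion figure is obtained by applying the $0.3 million baseline cost to all 5,454 excess deaths; that reading is inferred from the numbers and is not stated explicitly in the text. The function name is hypothetical.

```python
def mortality_burden_summary(excess_deaths: int, direct_deaths: int,
                             cost_per_death_usd: float):
    """Return (ratio of excess to direct deaths, monetized cost of excess deaths)."""
    ratio = excess_deaths / direct_deaths
    # Assumption: the baseline work loss cost is applied to every excess death.
    total_cost_usd = excess_deaths * cost_per_death_usd
    return ratio, total_cost_usd


ratio, total_cost_usd = mortality_burden_summary(
    excess_deaths=5454, direct_deaths=2979, cost_per_death_usd=0.3e6
)
print(f"excess/direct ratio: {ratio:.2f}")
print(f"monetized excess deaths: ${total_cost_usd / 1e9:.2f} billion")
```

The ratio of roughly 1.8 is what the text rounds to "a factor of approximately 2", and the product of about $1.64 billion matches the reported $1.6 billion.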

31 There were 5,201 direct heat-related deaths among U.S. and non-U.S. residents between the years 1999 – 2009. We excluded non-U.S. residents and restricted the year range to 2001 – 2009, and this resulted in 2,979 direct heat-related deaths.

The economic burden associated with mortality is important to capture because in extreme temperature events, especially heat, infrastructure damage is often minimal compared to hurricanes or tornadoes. More often than not, the burden is underestimated if the costs resulting from mortality are overlooked. Further, the public health community strives to minimize mortality risks associated with environmental hazards. Efforts to minimize the adverse health impacts from extreme heat, as well as to reduce exposure to air pollution, are carried out on an ongoing basis but are often treated as two separate efforts. This study examines the nexus between air pollution and heat waves by exploring the interactive effects of air pollution and extreme heat on all non-injury causes of mortality. The strength of the exposure-response (E-R) relationship between EHEs and mortality varies with climate region. Further, the E-R relationship and excess death estimates are sensitive to EHE definitions. Mean estimates of excess deaths and associated costs are provided on a per EHE day basis, using random effects meta-analysis. There will always be some subjectivity in selecting the best EHE definition, and these “per day” estimates are useful since they could be used as “excess death/cost multipliers” to estimate the total mortality burden prospectively or for years not considered in this assessment.
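A minimal sketch of random effects pooling and of using a per-day estimate as a multiplier is given below. It uses the DerSimonian-Laird estimator of between-study variance, which is a common choice but is assumed here; the text does not state which estimator was used. The function names and input values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def random_effects_pooled(estimates: Sequence[float],
                          variances: Sequence[float]) -> Tuple[float, float, float]:
    """DerSimonian-Laird random effects pooling.
    Returns (pooled estimate, standard error, between-estimate variance tau^2)."""
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))  # Cochran's Q
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


def projected_burden(per_day_estimate: float, n_ehe_days: int) -> float:
    """Use a pooled per-EHE-day estimate as a multiplier for a given number of EHE days."""
    return per_day_estimate * n_ehe_days


# Hypothetical per-EHE-day excess death estimates from two definitions.
pooled, se, tau2 = random_effects_pooled([1.0, 3.0], [1.0, 1.0])
total = projected_burden(pooled, n_ehe_days=10)
```

When the individual estimates disagree more than their variances would suggest, tau^2 becomes positive and the pooled standard error widens, which is how the pooled "per day" estimate reflects sensitivity to the choice of EHE definition.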

7.2 Future Directions

Health effects modeling

In this work, the relationship between extreme heat events and health outcomes was largely assessed using a summary-level rate regression model. While this approach adequately captures the relationship, it has limitations when dealing with small death counts. Further, even though our inputs were at the county level, we were unable to make assertions on the nature of the E-R relationship at the county or sub-county levels. This can be remedied by using a Bayesian approach, which is widely used in modeling E-R relationships when counts are sparse.

Sensitivity associated with defining an EHE

Constructing a national database with as many as 92 different EHE definitions is an onerous undertaking. Of the core parameters that make up an EHE definition, the sensitivity associated with the selection of each parameter is still an open question. Meta-regression can directly utilize the results from summary-level pooled analysis and can be used to measure sensitivities associated with each core parameter. Additionally, the daily heat metric threshold used in different EHE definitions is set based on deviations from the historical norm or on an absolute value. A single threshold and exposure offset are used for all sensitive sub-populations. However, thresholds and exposure offsets can vary with age and other individual factors. Modeling sub-population-specific exposure offsets can be achieved using distributed lag non-linear models (DLNM). As defined by Dr. Antonio Gasparrini, DLNMs represent a modeling framework to flexibly describe associations showing potentially non-linear and delayed effects in time series data.

Cause-specific mortality end points

While mortality from all causes has been used as a dependent variable in previous studies exploring the relationship between EHEs and mortality, it is confounded by several factors that are non-environmental and unrelated to extreme heat. Selecting end points such as cardiopulmonary, cardiovascular, and respiratory mortality could lead to better characterization of the effects of EHEs on mortality. There is always a risk of stratifying the data too finely, but a hierarchical Bayesian framework could circumvent this issue.

APPENDIX A

SUPPLEMENTAL MATERIAL FOR CHAPTER 2

Table A-1: List of EHE definitions used in the analysis Daily Cluster Cluster common Definition Threshold Threshold Description heat Duration number name number type value metric Absolute Daily Maximum Temperature Maximum 2+ Consecutive 1 temperature based Definition 1 > 90 ℉ for at least 2 Absolute > 90 ℉ Temperature days thresholds consecutive days Absolute Daily Maximum Temperature Maximum 3+ Consecutive 1 temperature based Definition 2 > 90 ℉ for at least 3 Absolute > 90 ℉ Temperature days thresholds consecutive days Absolute Daily Maximum Temperature Maximum 4+ Consecutive 1 temperature based Definition 3 > 90 ℉ for at least 4 Absolute > 90 ℉ Temperature days thresholds consecutive days Absolute Daily Maximum Heat Index > Maximum 2+ Consecutive 1 temperature based Definition 4 90 ℉ for at least 2 consecutive Absolute > 90 ℉ Heat Index days thresholds days Absolute Daily Maximum Heat Index > Maximum 2+ Consecutive 1 temperature based Definition 5 95 ℉ for at least 2 consecutive Absolute > 95 ℉ Heat Index days thresholds days Absolute Daily Maximum Heat Index > Maximum 3+ Consecutive 1 temperature based Definition 6 90 ℉ for at least 3 consecutive Absolute > 90 ℉ Heat Index days thresholds days Absolute Daily Maximum Heat Index > Maximum 3+ Consecutive 1 temperature based Definition 7 95 ℉ for at least 3 consecutive Absolute > 95 ℉ Heat Index days thresholds days Absolute Daily Maximum Heat Index > Maximum 4+ Consecutive 1 temperature based Definition 8 90 ℉ for at least 4 consecutive Absolute > 90 ℉ Heat Index days thresholds days

167

Daily Cluster Cluster common Definition Threshold Threshold Description heat Duration number name number type value metric Absolute Daily Maximum Heat Index > Maximum 4+ Consecutive 1 temperature based Definition 9 95 ℉ for at least 4 consecutive Absolute > 95 ℉ Heat Index days thresholds days Daily Maximum Temperature "Predominantly Maximum > 90th 2+ Consecutive 2 Definition 10 > 90th Percentile for at least 2 Relative moderate" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum > 85th 3+ Consecutive 2 Definition 11 > 85th Percentile for at least 3 Relative moderate" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum > 90th 3+ Consecutive 2 Definition 12 > 90th Percentile for at least 3 Relative moderate" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum > 85th 4+ Consecutive 2 Definition 13 > 85th Percentile for at least 4 Relative moderate" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum > 90th 4+ Consecutive 2 Definition 14 > 90th Percentile for at least 4 Relative moderate" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum 2+ Consecutive 2 Definition 15 > 95 ℉ for at least 2 Absolute > 95 ℉ moderate" thresholds Temperature days consecutive days Daily Maximum Temperature "Predominantly Maximum 3+ Consecutive 2 Definition 16 > 95 ℉ for at least 3 Absolute > 95 ℉ moderate" thresholds Temperature days consecutive days Daily Maximum Temperature "Predominantly Maximum 4+ Consecutive 2 Definition 17 > 95 ℉ for at least 4 Absolute > 95 ℉ moderate" thresholds Temperature days consecutive days Daily Mean Temperature > "Predominantly Mean > 90th 2+ Consecutive 2 Definition 18 90th Percentile for at least 2 Relative moderate" thresholds Temperature Percentile days consecutive days

168

Daily Cluster Cluster common Definition Threshold Threshold Description heat Duration number name number type value metric Daily Mean Temperature > "Predominantly Mean > 85th 3+ Consecutive 2 Definition 19 85th Percentile for at least 3 Relative moderate" thresholds Temperature Percentile days consecutive days Daily Mean Temperature > "Predominantly Mean > 90th 3+ Consecutive 2 Definition 20 90th Percentile for at least 3 Relative moderate" thresholds Temperature Percentile days consecutive days Daily Mean Temperature > "Predominantly Mean > 85th 4+ Consecutive 2 Definition 21 85th Percentile for at least 4 Relative moderate" thresholds Temperature Percentile days consecutive days Daily Mean Temperature > "Predominantly Mean > 90th 4+ Consecutive 2 Definition 22 90th Percentile for at least 4 Relative moderate" thresholds Temperature Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 90th 2+ Consecutive 2 Definition 23 90th Percentile for at least 2 Relative moderate" thresholds Heat Index Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 85th 3+ Consecutive 2 Definition 24 85th Percentile for at least 3 Relative moderate" thresholds Heat Index Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 90th 3+ Consecutive 2 Definition 25 90th Percentile for at least 3 Relative moderate" thresholds Heat Index Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 85th 4+ Consecutive 2 Definition 26 85th Percentile for at least 4 Relative moderate" thresholds Heat Index Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 90th 4+ Consecutive 2 Definition 27 90th Percentile for at least 4 Relative moderate" thresholds Heat Index Percentile days consecutive days

169

Daily Cluster Cluster common Definition Threshold Threshold Description heat Duration number name number type value metric Daily Maximum & Minimum. Maximum "Predominantly Temperature > 80th and > 80th 3+ Consecutive 2 Definition 28 Relative moderate" thresholds Percentile for at least 3 MinimumTe Percentile days consecutive days mperature Daily Maximum Temperature "Predominantly Maximum > 95th 2+ Consecutive 3 Definition 29 > 95th Percentile for at least 2 Relative high" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum > 95th 3+ Consecutive 3 Definition 30 > 95th Percentile for at least 3 Relative high" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum > 95th 4+ Consecutive 3 Definition 31 > 95th Percentile for at least 4 Relative high" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum 2+ Consecutive 3 Definition 32 > 100 ℉ for at least 2 Absolute > 100 ℉ high" thresholds Temperature days consecutive days Daily Maximum Temperature "Predominantly Maximum 3+ Consecutive 3 Definition 33 > 100 ℉ for at least 3 Absolute > 100 ℉ high" thresholds Temperature days consecutive days Daily Maximum Temperature "Predominantly Maximum 4+ Consecutive 3 Definition 34 > 100 ℉ for at least 4 Absolute > 100 ℉ high" thresholds Temperature days consecutive days Daily Mean Temperature > "Predominantly Mean > 95th 2+ Consecutive 3 Definition 35 95th Percentile for at least 2 Relative high" thresholds Temperature Percentile days consecutive days Daily Mean Temperature > "Predominantly Mean > 95th 3+ Consecutive 3 Definition 36 95th Percentile for at least 3 Relative high" thresholds Temperature Percentile days consecutive days

170

Daily Cluster Cluster common Definition Threshold Threshold Description heat Duration number name number type value metric Daily Mean Temperature > "Predominantly Mean > 95th 4+ Consecutive 3 Definition 37 95th Percentile for at least 4 Relative high" thresholds Temperature Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 95th 2+ Consecutive 3 Definition 38 95th Percentile for at least 2 Relative high" thresholds Heat Index Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 95th 3+ Consecutive 3 Definition 39 95th Percentile for at least 3 Relative high" thresholds Heat Index Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 95th 4+ Consecutive 3 Definition 40 95th Percentile for at least 4 Relative high" thresholds Heat Index Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum 2+ Consecutive 3 Definition 41 100 ℉ for at least 2 Absolute > 100 ℉ high" thresholds Heat Index days consecutive days Daily Maximum Heat Index > "Predominantly Maximum 3+ Consecutive 3 Definition 42 100 ℉ for at least 3 Absolute > 100 ℉ high" thresholds Heat Index days consecutive days Daily Maximum Heat Index > "Predominantly Maximum 4+ Consecutive 3 Definition 43 100 ℉ for at least 4 Absolute > 100 ℉ high" thresholds Heat Index days consecutive days Daily Maximum Temperature "Predominantly > Mean + 2SD of Climate Maximum > Mean + 2+ Consecutive 4 Definition 44 Relative extreme" thresholds Normal for at least 2 Temperature 2SD days consecutive days Daily Maximum Temperature "Predominantly > Mean + 2SD of Climate Maximum > Mean + 3+ Consecutive 4 Definition 45 Relative extreme" thresholds Normal for at least 3 Temperature 2SD days consecutive days

171

Daily Cluster Cluster common Definition Threshold Threshold Description heat Duration number name number type value metric Daily Maximum Temperature "Predominantly > Mean + 2SD of Climate Maximum > Mean + 4+ Consecutive 4 Definition 46 Relative extreme" thresholds Normal for at least 4 Temperature 2SD days consecutive days Daily Maximum Temperature "Predominantly Maximum > 98th 2+ Consecutive 4 Definition 47 > 98th Percentile for at least 2 Relative extreme" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum > 99th 2+ Consecutive 4 Definition 48 > 99th Percentile for at least 2 Relative extreme" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum > 98th 3+ Consecutive 4 Definition 49 > 98th Percentile for at least 3 Relative extreme" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum > 99th 3+ Consecutive 4 Definition 50 > 99th Percentile for at least 3 Relative extreme" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum > 98th 4+ Consecutive 4 Definition 51 > 98th Percentile for at least 4 Relative extreme" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum > 99th 4+ Consecutive 4 Definition 52 > 99th Percentile for at least 4 Relative extreme" thresholds Temperature Percentile days consecutive days Daily Maximum Temperature "Predominantly Maximum 2+ Consecutive 4 Definition 53 > 105 ℉ for at least 2 Absolute > 105 ℉ extreme" thresholds Temperature days consecutive days Daily Maximum Temperature "Predominantly Maximum 3+ Consecutive 4 Definition 54 > 105 ℉ for at least 3 Absolute > 105 ℉ extreme" thresholds Temperature days consecutive days

172

Daily Cluster Cluster common Definition Threshold Threshold Description heat Duration number name number type value metric Daily Maximum Temperature "Predominantly Maximum 4+ Consecutive 4 Definition 55 > 105 ℉ for at least 4 Absolute > 105 ℉ extreme" thresholds Temperature days consecutive days Daily Mean Temperature > "Predominantly Mean > 98th 2+ Consecutive 4 Definition 56 98th Percentile for at least 2 Relative extreme" thresholds Temperature Percentile days consecutive days Daily Mean Temperature > "Predominantly Mean > 99th 2+ Consecutive 4 Definition 57 99th Percentile for at least 2 Relative extreme" thresholds Temperature Percentile days consecutive days Daily Mean Temperature > "Predominantly Mean > 98th 3+ Consecutive 4 Definition 58 98th Percentile for at least 3 Relative extreme" thresholds Temperature Percentile days consecutive days Daily Mean Temperature > "Predominantly Mean > 99th 3+ Consecutive 4 Definition 59 99th Percentile for at least 3 Relative extreme" thresholds Temperature Percentile days consecutive days Daily Mean Temperature > "Predominantly Mean > 98th 4+ Consecutive 4 Definition 60 98th Percentile for at least 4 Relative extreme" thresholds Temperature Percentile days consecutive days Daily Mean Temperature > "Predominantly Mean > 99th 4+ Consecutive 4 Definition 61 99th Percentile for at least 4 Relative extreme" thresholds Temperature Percentile days consecutive days Daily Mean Temperature > "Predominantly Mean + 2SD of Climate Mean > Mean + 2+ Consecutive 4 Definition 62 Relative extreme" thresholds Normal for at least 2 Temperature 2SD days consecutive days Daily Mean Temperature > "Predominantly Mean + 2SD of Climate Mean > Mean + 3+ Consecutive 4 Definition 63 Relative extreme" thresholds Normal for at least 3 Temperature 2SD days consecutive days

173

Daily Cluster Cluster common Definition Threshold Threshold Description heat Duration number name number type value metric Daily Mean Temperature > "Predominantly Mean + 2SD of Climate Mean > Mean + 4+ Consecutive 4 Definition 64 Relative extreme" thresholds Normal for at least 4 Temperature 2SD days consecutive days Daily Mean Temperature > "Predominantly Mean 2+ Consecutive 4 Definition 65 90 ℉ for at least 2 consecutive Absolute > 90 ℉ extreme" thresholds Temperature days days Daily Mean Temperature > "Predominantly Mean 2+ Consecutive 4 Definition 66 95 ℉ for at least 2 consecutive Absolute > 95 ℉ extreme" thresholds Temperature days days Daily Mean Temperature > "Predominantly Mean 2+ Consecutive 4 Definition 67 100 ℉ for at least 2 Absolute > 100 ℉ extreme" thresholds Temperature days consecutive days Daily Mean Temperature > "Predominantly Mean 2+ Consecutive 4 Definition 68 105 ℉ for at least 2 Absolute > 105 ℉ extreme" thresholds Temperature days consecutive days Daily Mean Temperature > "Predominantly Mean 3+ Consecutive 4 Definition 69 90 ℉ for at least 3 consecutive Absolute > 90 ℉ extreme" thresholds Temperature days days Daily Mean Temperature > "Predominantly Mean 3+ Consecutive 4 Definition 70 95 ℉ for at least 3 consecutive Absolute > 95 ℉ extreme" thresholds Temperature days days Daily Mean Temperature > "Predominantly Mean 3+ Consecutive 4 Definition 71 100 ℉ for at least 3 Absolute > 100 ℉ extreme" thresholds Temperature days consecutive days Daily Mean Temperature > "Predominantly Mean 3+ Consecutive 4 Definition 72 105 ℉ for at least 3 Absolute > 105 ℉ extreme" thresholds Temperature days consecutive days

174

Daily Cluster Cluster common Definition Threshold Threshold Description heat Duration number name number type value metric Daily Mean Temperature > "Predominantly Mean 4+ Consecutive 4 Definition 73 90 ℉ for at least 4 consecutive Absolute > 90 ℉ extreme" thresholds Temperature days days Daily Mean Temperature > "Predominantly Mean 4+ Consecutive 4 Definition 74 95 ℉ for at least 4 consecutive Absolute > 95 ℉ extreme" thresholds Temperature days days Daily Mean Temperature > "Predominantly Mean 4+ Consecutive 4 Definition 75 100 ℉ for at least 4 Absolute > 100 ℉ extreme" thresholds Temperature days consecutive days Daily Mean Temperature > "Predominantly Mean 4+ Consecutive 4 Definition 76 105 ℉ for at least 4 Absolute > 105 ℉ extreme" thresholds Temperature days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 98th 2+ Consecutive 4 Definition 77 98th Percentile for at least 2 Relative extreme" thresholds Heat Index Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 99th 2+ Consecutive 4 Definition 78 99th Percentile for at least 2 Relative extreme" thresholds Heat Index Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 98th 3+ Consecutive 4 Definition 79 98th Percentile for at least 3 Relative extreme" thresholds Heat Index Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 99th 3+ Consecutive 4 Definition 80 99th Percentile for at least 3 Relative extreme" thresholds Heat Index Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 98th 4+ Consecutive 4 Definition 81 98th Percentile for at least 4 Relative extreme" thresholds Heat Index Percentile days consecutive days Daily Maximum Heat Index > "Predominantly Maximum > 99th 4+ Consecutive 4 Definition 82 99th Percentile for at least 4 Relative extreme" thresholds Heat Index Percentile days consecutive days

175

Daily Cluster Cluster common Definition Threshold Threshold Description heat Duration number name number type value metric Daily Maximum Heat Index > "Predominantly Maximum 2+ Consecutive 4 Definition 83 105 ℉ for at least 2 Absolute > 105 ℉ extreme" thresholds Heat Index days consecutive days Daily Maximum Heat Index > "Predominantly Maximum 3+ Consecutive 4 Definition 84 105 ℉ for at least 3 Absolute > 105 ℉ extreme" thresholds Heat Index days consecutive days Daily Maximum Heat Index > "Predominantly Maximum 4+ Consecutive 4 Definition 85 105 ℉ for at least 4 Absolute > 105 ℉ extreme" thresholds Heat Index days consecutive days "Predominantly Daily Maximum Temperature Maximum > 81th/97.5th 3+ Consecutive 4 Definition 86 Relative extreme" thresholds > Huth Thresholds Temperature Percentile days Daily Maximum Temperature Climate normal > Mean + 1SD of Climate Maximum > Mean + 2+ Consecutive 5 Definition 87 Relative based thresholds Normal for at least 2 Temperature 1SD days consecutive days Daily Maximum Temperature Climate normal > Mean + 1SD of Climate Maximum > Mean + 3+ Consecutive 5 Definition 88 Relative based thresholds Normal for at least 3 Temperature 1SD days consecutive days Daily Maximum Temperature Climate normal > Mean + 1SD of Climate Maximum > Mean + 4+ Consecutive 5 Definition 89 Relative based thresholds Normal for at least 4 Temperature 1SD days consecutive days Daily Mean Temperature > Climate normal Mean + 1SD of Climate Mean > Mean + 2+ Consecutive 5 Definition 90 Relative based thresholds Normal for at least 2 Temperature 1SD days consecutive days


5 | Climate normal based thresholds | Definition 91 | Daily Mean Temperature > Mean + 1SD of Climate Normal for at least 3 consecutive days | Mean Temperature | Relative | > Mean + 1SD | 3+ Consecutive days
5 | Climate normal based thresholds | Definition 92 | Daily Mean Temperature > Mean + 1SD of Climate Normal for at least 4 consecutive days | Mean Temperature | Relative | > Mean + 1SD | 4+ Consecutive days


Table A-2: Percent of days classified as EHE days and percent of X30 deaths on EHE days, by EHE definition and exposure offset combinations for U.S. Climate Regions

Each cell lists the percent of days classified as EHE days, followed by the percent of X30 deaths on EHE days.

Exposure offset type | Central | East North Central | North West Central | Northeast | South | Southeast | Southwest | West

Daily maximum heat index greater than 90 ℉ for at least 3 consecutive days
ExE1 | 16, 71 | 5, 54 | 7, 32 | 5, 64 | 55, 91 | 46, 85 | 12, 88 | 22, 63
ExE2 | 19, 74 | 6, 59 | 8, 33 | 6, 68 | 59, 93 | 50, 88 | 14, 89 | 24, 65
ExE3 | 21, 75 | 7, 60 | 9, 33 | 7, 70 | 61, 93 | 53, 91 | 15, 90 | 26, 65
lag0 | 14, 64 | 4, 43 | 5, 31 | 4, 53 | 51, 88 | 42, 82 | 11, 86 | 20, 61
lag1 | 14, 70 | 4, 53 | 5, 29 | 4, 59 | 51, 89 | 42, 83 | 11, 87 | 20, 61

Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days
ExE1 | 6, 57 | 6, 53 | 6, 39 | 6, 61 | 6, 27 | 5, 21 | 5, 35 | 7, 38
ExE2 | 7, 61 | 7, 57 | 7, 39 | 8, 65 | 7, 30 | 6, 25 | 6, 38 | 8, 40
ExE3 | 8, 62 | 9, 58 | 8, 39 | 9, 65 | 8, 32 | 6, 27 | 7, 39 | 9, 42
lag0 | 5, 52 | 5, 44 | 5, 38 | 5, 53 | 5, 25 | 4, 19 | 4, 29 | 6, 34
lag1 | 5, 55 | 5, 49 | 5, 35 | 5, 59 | 5, 24 | 4, 18 | 4, 33 | 6, 35

Daily maximum temperature greater than 95th percentile for at least 2 consecutive days
ExE1 | 4, 44 | 4, 51 | 4, 35 | 4, 52 | 4, 24 | 4, 21 | 4, 22 | 5, 31
ExE2 | 5, 47 | 5, 55 | 5, 36 | 5, 58 | 5, 26 | 5, 24 | 5, 24 | 6, 34
ExE3 | 6, 49 | 6, 58 | 6, 36 | 6, 60 | 6, 28 | 6, 24 | 5, 27 | 7, 36
lag0 | 3, 32 | 3, 37 | 3, 29 | 3, 36 | 3, 20 | 3, 19 | 3, 16 | 3, 26
lag1 | 3, 40 | 3, 49 | 3, 28 | 3, 47 | 3, 20 | 3, 17 | 3, 17 | 3, 28

Huth definition
ExE1 | 2, 10 | 2, 25 | 2, 6 | 2, 12 | 2, 11 | 2, 5 | 1, 4 | 1, 3


Exposure offset type | Central | East North Central | North West Central | Northeast | South | Southeast | Southwest | West

Huth definition (continued)
ExE2 | 2, 11 | 3, 27 | 2, 6 | 2, 13 | 2, 11 | 2, 6 | 1, 5 | 1, 4
ExE3 | 3, 11 | 3, 27 | 2, 6 | 2, 14 | 3, 11 | 2, 7 | 1, 5 | 1, 4
lag0 | 2, 9 | 2, 20 | 1, 6 | 1, 8 | 2, 9 | 1, 5 | 1, 3 | 1, 2
lag1 | 2, 9 | 2, 25 | 1, 6 | 1, 11 | 2, 11 | 1, 5 | 1, 4 | 1, 2

Daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days
ExE1 | 13, 58 | 11, 59 | 13, 47 | 14, 63 | 15, 37 | 13, 39 | 22, 58 | 18, 54
ExE2 | 16, 61 | 14, 65 | 15, 47 | 16, 67 | 17, 41 | 15, 41 | 25, 61 | 21, 56
ExE3 | 18, 63 | 16, 67 | 18, 47 | 18, 67 | 19, 42 | 17, 43 | 27, 65 | 23, 59
lag0 | 11, 52 | 9, 49 | 11, 43 | 11, 56 | 12, 33 | 11, 35 | 18, 51 | 15, 49
lag1 | 11, 55 | 9, 57 | 11, 39 | 11, 58 | 12, 34 | 11, 35 | 18, 53 | 15, 51


Table A-3: Mean (5th and 95th percentile) levels of demographic and social variables, by U.S. Climate Region

Each cell lists the mean, followed by the 5th and 95th percentiles in parentheses.

Demographic and social variables | Central | East North Central | North West Central | Northeast | South | Southeast | Southwest | West
Air conditioning prevalence (%) | 57 (33, 80) | 40 (17, 63) | 22 (5, 60) | 13 (1, 48) | 72 (53, 85) | 83 (60, 94) | 23 (8, 70) | 36 (6, 72)
Diabetes prevalence (%) | 10 (7, 13) | 9 (6, 12) | 9 (6, 12) | 9 (7, 11) | 10 (7, 13) | 11 (8, 15) | 7 (5, 10) | 8 (7, 10)
Obesity prevalence (body mass index >= 30) (%) | 31 (27, 37) | 30 (24, 36) | 28 (23, 34) | 26 (19, 32) | 31 (27, 36) | 29 (24, 36) | 23 (16, 32) | 25 (20, 31)
Percent of Hispanic population (%) | 4 (1, 13) | 5 (1, 13) | 8 (1, 30) | 7 (1, 21) | 20 (2, 67) | 9 (2, 25) | 23 (8, 48) | 31 (10, 55)
Percent of adult smokers (%) | 22 (15, 30) | 20 (12, 27) | 18 (12, 26) | 19 (13, 26) | 20 (13, 28) | 19 (13, 26) | 18 (10, 24) | 15 (10, 24)
Percent of adults that report no leisure time physical activity (%) | 28 (23, 34) | 23 (18, 30) | 24 (16, 32) | 24 (18, 30) | 28 (22, 35) | 26 (19, 35) | 21 (15, 27) | 19 (14, 26)
Percent of population in poverty (%) | 17 (10, 24) | 15 (10, 22) | 15 (10, 24) | 13 (7, 19) | 18 (11, 27) | 18 (8, 27) | 17 (10, 24) | 16 (9, 23)
Percent of population over 65 (%) | 14 (10, 18) | 15 (10, 21) | 15 (10, 23) | 15 (11, 19) | 13 (8, 23) | 14 (9, 24) | 14 (9, 24) | 13 (9, 20)
Percent of population under 65 uninsured (%) | 16 (11, 20) | 12 (9, 17) | 18 (11, 26) | 11 (5, 17) | 23 (15, 31) | 20 (13, 27) | 21 (16, 26) | 20 (13, 26)
Population density (population per sq. mile) | 576 (44, 2,193) | 368 (12, 1,886) | 101 (1, 482) | 2,341 (30, 10,417) | 323 (4, 1,286) | 653 (80, 1,784) | 191 (2, 718) | 521 (1, 2,388)


APPENDIX B

SUPPLEMENTAL MATERIAL FOR CHAPTER 3

Table B-1: Grid extent of the NLDAS grid

Grid Position | Grid Column | Grid Row | Longitude | Latitude
Lower Left | 1 | 1 | -124.9375 | 25.0625
Lower Right | 464 | 1 | -67.0625 | 25.0625
Upper Right | 464 | 224 | -67.0625 | 52.9375
Upper Left | 1 | 224 | -124.9375 | 52.9375
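The corner coordinates in Table B-1 imply a regular grid, so a grid column and row can be converted to a longitude and latitude (and back) with simple arithmetic. The sketch below is illustrative only: it assumes the tabulated corners are grid cell centroids, derives the 0.125-degree spacing from the corner values rather than from any stated specification, and uses hypothetical function names.

```python
# Grid constants copied from Table B-1. The spacing is derived from the
# corner coordinates (an assumption, not a value stated in the table).
N_COLS, N_ROWS = 464, 224
LON_MIN, LAT_MIN = -124.9375, 25.0625
LON_MAX, LAT_MAX = -67.0625, 52.9375
DX = (LON_MAX - LON_MIN) / (N_COLS - 1)  # degrees of longitude per column
DY = (LAT_MAX - LAT_MIN) / (N_ROWS - 1)  # degrees of latitude per row


def centroid(col, row):
    """Return (lon, lat) of the centroid for a 1-based grid column and row."""
    if not (1 <= col <= N_COLS and 1 <= row <= N_ROWS):
        raise ValueError("grid index outside the tabulated extent")
    return LON_MIN + (col - 1) * DX, LAT_MIN + (row - 1) * DY


def nearest_cell(lon, lat):
    """Return the 1-based (col, row) whose centroid is nearest to (lon, lat)."""
    col = round((lon - LON_MIN) / DX) + 1
    row = round((lat - LAT_MIN) / DY) + 1
    if not (1 <= col <= N_COLS and 1 <= row <= N_ROWS):
        raise ValueError("location outside the tabulated extent")
    return col, row
```

For example, `centroid(464, 224)` reproduces the Upper Right entry of the table, and `nearest_cell` inverts `centroid` for any valid index pair.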


H̃: Interpolated value
H1, H2, H3, H4: Grid cell centroids
H: Station location
ε: Epsilon, a small value that accounts for the scenario where a grid cell centroid lies directly on top of a monitor location.

Figure B-1: Inverse squared distance weighted interpolation to obtain station-level modeled daily heat metric estimates
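The interpolation in Figure B-1 can be sketched in a few lines. This is a minimal illustration, not the code used in the analysis: it assumes planar coordinates, assumes epsilon is added to each squared distance before inversion, and uses a hypothetical function name and default epsilon.

```python
def inverse_squared_distance_interpolate(station, cells, eps=1e-12):
    """Interpolate a station-level value from surrounding grid cell centroids.

    station: (x, y) location of the station.
    cells:   list of ((x, y), value) pairs for the grid cell centroids.
    eps:     small constant (assumed placement) that keeps the weight finite
             when a centroid coincides with the station.
    """
    numerator = 0.0
    denominator = 0.0
    for (x, y), value in cells:
        squared_distance = (x - station[0]) ** 2 + (y - station[1]) ** 2
        weight = 1.0 / (squared_distance + eps)
        numerator += weight * value
        denominator += weight
    return numerator / denominator
```

With four equidistant centroids the result is the simple average of the four values, and when the station sits on a centroid the result is effectively that centroid's value.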


1. Relate block level population data to grid cells
2. Assign population to grid cells
3. Convert grid cell polygons to point (centroid)
4. Relate grid cell centroids to counties
5. Assign population-weighted daily heat metrics to counties

Figure B-2: Generating population-weighted county-level modeled daily heat metric estimates
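The final step of Figure B-2 amounts to a population-weighted average over the grid cell centroids that fall in each county. The sketch below assumes the earlier steps have already produced, for each centroid, a county identifier, a population count, and a daily heat metric value; the function name and input layout are hypothetical.

```python
from collections import defaultdict


def county_population_weighted_metric(cells):
    """Population-weighted daily heat metric per county.

    cells: iterable of (county_id, population, heat_metric) tuples, one per
           grid cell centroid (hypothetical input layout).
    Counties whose cells hold no population are omitted from the result.
    """
    weighted_sum = defaultdict(float)
    total_population = defaultdict(float)
    for county_id, population, heat_metric in cells:
        weighted_sum[county_id] += population * heat_metric
        total_population[county_id] += population
    return {
        county_id: weighted_sum[county_id] / total_population[county_id]
        for county_id in total_population
        if total_population[county_id] > 0
    }
```

A county with two cells holding 1,000 and 3,000 people and heat metric values of 90 and 94 would receive a weighted value of 93, closer to the value in the more populated cell.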

Table B-2: Performance evaluation metrics used in this analysis

Performance evaluation metric: Formula/Description

Pearson correlation coefficient (r): For a set of daily data points $(MON_1, MOD_1), (MON_2, MOD_2), \ldots, (MON_n, MOD_n)$, r is defined as

\[ r = \frac{\sum_{i=1}^{n} (MON_i - \overline{MON})(MOD_i - \overline{MOD})}{\sqrt{\sum_{i=1}^{n} (MON_i - \overline{MON})^2}\,\sqrt{\sum_{i=1}^{n} (MOD_i - \overline{MOD})^2}} \]

where n: number of observations; $MON_i$: station-based daily heat metric measurements; $MOD_i$: model-based daily heat metric estimates; $\overline{MON}$: station-based daily heat metric average; $\overline{MOD}$: model-based daily heat metric average.

Kendall Tau-B correlation coefficient ($\tau$): To calculate $\tau$, the n(n-1)/2 pairs of data points are classified as concordant or discordant. A concordant pair is any pair for which the ranks of the ASOS and NLDAS estimates agree, i.e., for any pair of observations $(MON_i, MOD_i)$ and $(MON_j, MOD_j)$, both $MON_i > MON_j$ and $MOD_i > MOD_j$, or both $MON_i < MON_j$ and $MOD_i < MOD_j$. A discordant pair is any pair of observations for which the ranks for MON and MOD disagree, i.e., either $MON_i > MON_j$ and $MOD_i < MOD_j$, or $MON_i < MON_j$ and $MOD_i > MOD_j$. With C and D respectively denoting the number of concordant and discordant pairs (assuming no ties), the value of $\tau$ is then defined as

\[ \tau = \frac{C - D}{n(n-1)/2} \]

The denominator is adjusted accordingly in the event of ties. Difference (D) D for a grid cell k and day i is defined as

: Daily heat metric from ASOS for day i and at station k

: Daily heat metric from NLDAS for day i and at station k

Relative Difference (RD) RD for a grid cell k and day i is defined as ) /

: Daily heat metric from ASOS for day i and at station k

: Daily heat metric from NLDAS for day i and at station k

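Root mean squared deviation (RMSD): RMSD at a station k is defined as

\[ RMSD_k = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (MON_{k,i} - MOD_{k,i})^2} \]

where n is the number of daily observations at station k, and

$MON_{k,i}$: Daily heat metric from ASOS for day i and at station k

$MOD_{k,i}$: Daily heat metric from NLDAS for day i and at station k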

Bland-Altman plot: A Bland-Altman plot is a scatter plot with $(x_k, y_k)$ points, where $x_k$ is the mean of, and $y_k$ is the difference between, $\overline{MON}_k$ and $\overline{MOD}_k$, where:

$\overline{MON}_k$: Station-based daily heat metric for station k; $\overline{MOD}_k$: Model-based daily heat metric for station k.
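As a worked illustration of the correlation and deviation metrics in Table B-2, the following sketch computes the Pearson correlation coefficient, the Kendall coefficient for data without ties, and the RMSD for paired station-based (MON) and model-based (MOD) series. It is a plain-Python illustration under stated assumptions, not the software used in the analysis: the function names are hypothetical, ties are simply skipped rather than handled with the Tau-B denominator adjustment, and the RMSD is averaged over the number of paired days.

```python
import math


def pearson_r(mon, mod):
    """Pearson correlation between station-based and model-based series."""
    n = len(mon)
    mon_bar = sum(mon) / n
    mod_bar = sum(mod) / n
    cross = sum((a - mon_bar) * (b - mod_bar) for a, b in zip(mon, mod))
    ss_mon = sum((a - mon_bar) ** 2 for a in mon)
    ss_mod = sum((b - mod_bar) ** 2 for b in mod)
    return cross / math.sqrt(ss_mon * ss_mod)


def kendall_tau_no_ties(mon, mod):
    """(C - D) / (n(n-1)/2); valid as written only when there are no ties."""
    n = len(mon)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (mon[i] - mon[j]) * (mod[i] - mod[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def rmsd(mon, mod):
    """Root mean squared deviation between the paired series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mon, mod)) / len(mon))
```

Because the RMSD squares each daily difference, it does not depend on which series is subtracted from which.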


Table B-3: Short-listed EHE definitions used to evaluate ASOS- and NLDAS-based exposure estimates

Cluster | Cluster common name | EHE definition name | Daily heat metric | Threshold type | Threshold value | Duration
1 | Absolute temperature based thresholds | Daily maximum heat index greater than 90 ℉ for at least 3 consecutive days | HImax | Absolute | >90 ℉ | 3+ consecutive days
2 | "Predominantly moderate" thresholds | Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days | Tmax and Tmin | Relative | >80th percentile | 3+ consecutive days
3 | "Predominantly high" thresholds | Daily maximum temperature greater than 95th percentile for at least 2 consecutive days | Tmax | Relative | >95th percentile | 2+ consecutive days
4 | "Predominantly extreme" thresholds | Huth definition | Tmax | Relative | T1: >97.5th percentile; T2: >81st percentile | Every day >T2, and 3+ consecutive days >T1, and average Tmax >T1 for the whole time period
5 | Climate normal based thresholds | Daily mean temperature greater than mean + 1 SD of climate normal for at least 3 consecutive days | Tavg | Relative | >mean + 1 SD of climate normal | 3+ consecutive days
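Definitions 1, 3, and 5 in Table B-3 share one pattern: a daily heat metric must exceed a threshold for a minimum number of consecutive days. The sketch below shows one way to flag such days. It is illustrative only: the function name is hypothetical, the threshold (absolute value, percentile, or climate-normal based) is assumed to be computed beforehand, and it assumes every day within a qualifying run counts as an EHE day.

```python
def flag_ehe_days(values, threshold, min_duration):
    """Flag days belonging to a run of at least `min_duration` consecutive
    days on which the daily heat metric exceeds `threshold`."""
    values = list(values)
    flags = [False] * len(values)
    run_start = None
    # A trailing None acts as a sentinel that closes a run ending on the last day.
    for i, value in enumerate(values + [None]):
        exceeds = value is not None and value > threshold
        if exceeds and run_start is None:
            run_start = i
        elif not exceeds and run_start is not None:
            if i - run_start >= min_duration:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = None
    return flags
```

For Definition 1, for instance, `values` would hold the daily maximum heat index in ℉, with `threshold=90` and `min_duration=3`. Definition 2 could be approximated by applying the same function to Tmax and Tmin indicators jointly, and the Huth definition would need additional logic not shown here.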


Figure B-3: Schematic showing exposure offset indicators


Figure B-4: Urban/Rural (Urbanicity) classification of counties in the U.S.


Figure B-5: Station-level correlation as a function of proximity to the U.S. coastline


Table B-4: Annual station-level RMSD for daily heat metrics

Daily Median (5th, 95th Percentile) Station level root mean squared deviation (oF) Summarized by Climate Region Heat Year East North West North Central Northeast Northwest South Southeast Southwest West Metric Central Central 1999 3.63 (3.04, 4.54) 3.78 (3.28, 5.20) 4.16 (3.37, 7.93) 3.93 (2.84, 7.36) 4.06 (3.20, 5.88) 3.75 (3.20, 5.83) 3.36 (2.52, 5.92) 5.38 (3.03, 10.69) 4.20 (3.44, 5.17) 2000 4.06 (3.34, 5.00) 4.53 (3.20, 6.22) 3.92 (3.00, 8.00) 3.32 (2.36, 7.01) 3.93 (2.92, 5.28) 3.57 (2.85, 5.17) 3.09 (2.11, 5.51) 5.61 (2.90, 11.83) 4.29 (3.10, 6.07) 2001 3.54 (2.96, 4.55) 4.04 (3.01, 5.77) 3.90 (3.11, 8.10) 3.55 (2.60, 7.07) 3.62 (2.70, 4.76) 3.69 (2.73, 6.08) 2.95 (2.13, 6.00) 5.86 (3.26, 10.74) 3.90 (2.83, 5.89) 2002 3.84 (3.27, 4.59) 4.27 (3.66, 6.05) 3.98 (3.26, 7.70) 3.48 (2.81, 7.97) 3.93 (3.12, 4.87) 3.75 (3.30, 5.25) 3.30 (2.10, 7.31) 5.93 (3.46, 10.26) 4.21 (3.41, 5.53) 2003 3.77 (3.13, 4.58) 3.80 (2.98, 5.35) 3.81 (2.91, 8.73) 3.81 (2.64, 7.89) 4.14 (3.21, 5.59) 3.76 (3.10, 5.64) 3.00 (2.22, 5.70) 5.96 (3.86, 10.78) 3.65 (2.97, 5.69) HImax 2004 3.52 (2.95, 5.22) 3.91 (3.08, 6.05) 3.73 (3.02, 7.34) 3.75 (2.57, 7.55) 3.97 (3.11, 5.54) 3.24 (2.67, 5.73) 3.14 (1.85, 5.61) 6.04 (3.18, 11.07) 3.90 (3.20, 5.12) 2005 3.84 (2.98, 4.50) 4.24 (3.19, 5.30) 3.74 (2.97, 6.99) 3.19 (2.57, 7.17) 3.84 (2.89, 5.62) 3.57 (2.90, 4.76) 3.07 (2.03, 5.67) 5.52 (2.74, 9.56) 4.03 (3.25, 5.43) 2006 4.34 (3.24, 5.55) 4.81 (3.81, 6.23) 3.72 (3.17, 6.85) 3.41 (2.61, 7.98) 4.30 (2.67, 5.68) 3.27 (2.68, 4.28) 2.98 (2.25, 5.60) 5.78 (3.24, 10.28) 3.72 (2.79, 5.27) 2007 3.86 (2.77, 5.07) 4.96 (3.36, 6.05) 4.03 (3.29, 8.19) 3.46 (2.49, 8.28) 4.01 (3.00, 5.42) 3.39 (2.57, 4.36) 2.63 (1.82, 6.36) 5.61 (2.83, 11.34) 4.20 (3.05, 6.14) 2008 3.73 (2.71, 5.29) 4.17 (3.12, 5.65) 3.44 (2.68, 7.17) 4.05 (2.56, 7.49) 4.25 (2.85, 5.38) 3.35 (2.60, 4.39) 3.05 (2.05, 6.23) 5.85 (2.99, 12.36) 3.81 (2.68, 5.02) 2009 4.25 (3.43, 5.40) 4.46 (3.04, 6.26) 3.56 (3.00, 7.58) 3.40 (2.46, 7.95) 5.28 (3.36, 
8.77) 4.62 (2.91, 6.43) 3.62 (2.25, 6.06) 5.47 (2.48, 10.66) 4.39 (2.75, 5.87) 1999 3.38 (2.88, 4.82) 3.40 (2.76, 5.64) 4.37 (3.30, 8.53) 4.44 (3.21, 8.19) 3.56 (2.42, 4.47) 3.20 (2.54, 5.31) 3.63 (2.52, 8.37) 5.85 (3.41, 11.64) 3.95 (3.16, 5.85) 2000 3.05 (2.61, 5.09) 3.88 (2.71, 6.18) 4.03 (2.81, 7.99) 3.72 (2.65, 8.48) 3.16 (2.33, 4.42) 3.12 (2.4, 4.61) 3.48 (2.29, 7.70) 6.43 (3.19, 13.34) 4.43 (3.36, 6.48) 2001 3.00 (2.47, 3.86) 3.59 (2.81, 5.73) 4.31 (2.97, 8.77) 4.23 (2.90, 8.20) 3.06 (2.43, 3.95) 3.18 (2.39, 4.89) 3.43 (2.19, 8.28) 6.77 (3.30, 11.88) 3.97 (2.91, 6.40) 2002 3.76 (2.97, 4.68) 3.96 (3.17, 5.28) 4.15 (2.99, 8.19) 4.24 (3.03, 8.87) 3.36 (2.58, 4.44) 3.13 (2.51, 5.19) 3.67 (2.56, 8.74) 6.40 (3.24, 12.42) 4.40 (3.62, 5.81) 2003 3.12 (2.69, 4.92) 3.85 (2.89, 5.61) 3.88 (2.64, 9.19) 4.63 (3.05, 10.46) 3.44 (2.49, 4.66) 3.20 (2.36, 5.02) 3.39 (2.45, 8.14) 6.46 (3.37, 12.52) 4.05 (3.02, 6.85) Tmax 2004 2.90 (2.48, 4.43) 3.52 (2.71, 5.53) 3.86 (2.67, 8.07) 4.42 (2.73, 9.39) 3.38 (2.27, 4.92) 2.98 (2.36, 5.43) 3.69 (2.07, 7.84) 6.85 (3.67, 12.31) 3.92 (3.24, 6.10) 2005 3.58 (2.72, 4.85) 3.73 (3.08, 5.50) 4.40 (3.01, 7.70) 3.97 (2.90, 9.31) 3.20 (2.49, 4.49) 3.46 (2.34, 5.17) 3.66 (2.21, 7.75) 5.88 (3.32, 11.75) 4.01 (3.34, 5.64) 2006 3.37 (2.82, 4.60) 4.36 (3.51, 6.98) 4.01 (2.95, 7.23) 3.96 (2.91, 9.60) 3.51 (2.38, 4.92) 3.08 (2.31, 5.1) 3.41 (2.29, 8.40) 6.46 (3.28, 11.90) 4.02 (3.01, 6.45) 2007 3.62 (2.74, 4.58) 4.38 (3.23, 6.41) 4.46 (3.24, 9.04) 4.04 (2.83, 9.87) 2.83 (2.31, 3.60) 3.00 (2.24, 4.53) 3.06 (2.19, 8.64) 6.55 (2.74, 12.33) 4.34 (3.16, 7.16) 2008 3.09 (2.62, 4.43) 3.41 (2.79, 5.56) 4.00 (2.72, 7.49) 4.33 (3.02, 9.81) 3.12 (2.58, 4.30) 2.97 (2.24, 4.77) 3.53 (2.14, 8.46) 6.92 (3.62, 13.62) 3.65 (2.95, 6.79) 2009 3.45 (3.05, 4.89) 4.50 (2.99, 6.66) 3.79 (2.78, 7.79) 4.09 (2.95, 9.35) 5.50 (3.13, 9.47) 4.76 (2.40, 6.27) 4.28 (2.52, 8.59) 6.41 (3.12, 12.47) 4.76 (3.02, 6.40) 1999 2.52 (2.12, 3.15) 2.66 (2.35, 3.42) 2.70 (2.23, 4.63) 3.25 
(2.36, 7.18) 2.66 (1.84, 4.06) 2.32 (1.63, 3.42) 4.29 (2.38, 6.16) 3.52 (2.34, 7.09) 4.38 (2.71, 6.74) 2000 2.46 (2.05, 3.62) 2.79 (2.08, 4.50) 2.54 (2.13, 4.37) 3.44 (2.16, 7.73) 2.76 (1.76, 4.37) 2.35 (1.70, 3.74) 4.46 (2.37, 8.58) 4.17 (2.26, 8.03) 5.03 (2.83, 7.00) 2001 2.45 (1.97, 6.69) 2.75 (2.17, 10.11) 2.65 (2.13, 6.85) 3.64 (2.08, 7.88) 2.53 (1.84, 7.29) 2.18 (1.64, 7.08) 4.83 (2.50, 9.65) 4.03 (2.20- 9.90) 4.80 (2.66, 8.96) 2002 2.45 (1.98, 9.45) 2.77 (2.18, 6.59) 2.77 (1.99, 8.76) 4.35 (2.30, 8.40) 2.77 (1.85, 7.84) 2.21 (1.65, 6.76) 4.61 (2.4, 10.47) 4.25 (2.46, 8.06) 4.59 (3.11, 10.02) 2003 2.37 (1.97, 7.03) 3.19 (2.12, 6.69) 2.50 (1.95, 9.77) 3.42 (2.17, 9.84) 3.08 (1.88, 7.77) 2.17 (1.58, 8.10) 4.74 (2.38, 10.79) 4.49 (2.64, 8.89) 5.17 (3.41, 9.49) Tavg 2004 2.48 (1.91, 6.33) 2.99 (2.15, 4.19) 2.61 (1.91, 6.10) 3.23 (2.47, 6.48) 2.83 (1.96, 6.39) 2.04 (1.62, 10.35) 4.27 (2.22, 7.56) 4.27 (2.45, 7.77) 4.46 (2.73, 7.96) 2005 2.37 (1.94, 3.45) 2.82 (2.06, 4.40) 2.58 (1.91, 4.46) 3.10 (2.13, 7.15) 2.74 (1.79, 4.13) 2.12 (1.63, 2.63) 4.09 (2.26, 7.9) 3.74 (2.22, 7.12) 4.41 (2.86, 6.09) 2006 2.46 (2.05, 3.73) 2.92 (2.28, 4.18) 2.46 (2.01, 4.63) 3.43 (2.24, 7.36) 2.97 (1.80, 4.27) 2.13 (1.69, 2.97) 4.05 (2.34, 7.35) 3.90 (2.32, 7.66) 4.62 (2.70, 5.95) 2007 2.46 (1.93, 3.81) 3.17 (2.48, 4.49) 2.74 (2.12, 4.68) 3.06 (1.98, 6.59) 2.22 (1.70, 4.05) 2.20 (1.56, 3.52) 3.60 (2.17, 7.44) 3.68 (2.55, 7.71) 4.33 (2.69, 5.80) 2008 2.29 (1.97, 3.12) 2.45 (1.93, 4.53) 2.41 (1.84, 3.95) 3.41 (2.22, 6.41) 2.43 (1.90, 3.88) 2.02 (1.59, 2.79) 3.36 (2.12, 6.57) 4.36 (2.32, 8.51) 4.57 (2.55, 5.85) 2009 2.41 (2.05, 3.20) 3.11 (2.17, 4.53) 2.40 (1.95, 3.81) 3.27 (2.04, 6.86) 4.33 (2.33, 6.82) 2.94 (1.81, 4.97) 3.77 (2.35, 6.80) 4.29 (2.14, 8.02) 4.62 (2.73, 6.08)


Figure B-6: Bland Altman plots for Tmax at SEARCH locations


Figure B-7: Bland Altman plots for HImax at SEARCH locations


Figure B-8: Bland Altman plots for Tavg at SEARCH locations


Figure B-9: Comparison of mean (95% CI) EHE effect from ASOS- and NLDAS-based exposure estimates for Definition 1


Figure B-10: Comparison of mean (95% CI) EHE effect from ASOS- and NLDAS-based exposure estimates for Definition 2


Figure B-11: Comparison of mean (95% CI) EHE effect from ASOS- and NLDAS-based exposure estimates for Definition 3


Figure B-12: Comparison of mean (95% CI) EHE effect from ASOS- and NLDAS-based exposure estimates for Definition 4


Figure B-13: Comparison of mean (95% CI) EHE effect from ASOS- and NLDAS-based exposure estimates for Definition 5


APPENDIX C

SUPPLEMENTAL MATERIAL FOR CHAPTER 4

Figure C-1: Daily mean inverse wind speed by month in 27 cities


Figure C-2: Daily mean relative humidity by month in 27 cities


Figure C-3: Daily maximum temperature by month in 27 cities


Figure C-4: Daily cloud-adjusted net solar insolation by month in 27 cities


A: non-EHE day

B: EHE day

Figure C-5: Daily maximum temperature distribution by EHE definitions32 in 27 cities on (A) non-EHE day, and (B) EHE day

32 Definition 1: Daily maximum heat index greater than 90 ℉ for at least 3 consecutive days
Definition 2: Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days
Definition 3: Daily maximum temperature greater than 95th percentile for at least 2 consecutive days
Definition 4: Huth definition
Definition 5: Daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days

A: non-EHE day

B: EHE day

Figure C-6: Daily mean inverse wind speed distribution by EHE definitions33 in 27 cities on (A) non-EHE day, and (B) EHE day

33 Definition 1: Daily maximum heat index greater than 90 ℉ for at least 3 consecutive days
Definition 2: Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days
Definition 3: Daily maximum temperature greater than 95th percentile for at least 2 consecutive days
Definition 4: Huth definition
Definition 5: Daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days

A: non-EHE day

B: EHE day

Figure C-7: Daily maximum relative humidity by EHE definitions34 in 27 cities on (A) non-EHE day, and (B) EHE day

34 Definition 1: Daily maximum heat index greater than 90 ℉ for at least 3 consecutive days
Definition 2: Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days
Definition 3: Daily maximum temperature greater than 95th percentile for at least 2 consecutive days
Definition 4: Huth definition
Definition 5: Daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days

Figure C-8: Scatter plot showing relationship between meteorological variables used in this analysis


Table C-1: City-specific Pearson correlation coefficient between meteorological variables Meteorological variable Meteorological City Inv. Mean WS Mean RH SI Tmax variable (s/m) (%) (KW/sq.m) (℉) Inv. Mean WS (s/m) 1 0.06 -0.29 0.13 Mean RH (%) 0.06 1 0.22 -0.23 SI (KW/sq.m) -0.29 0.22 1 -0.02 Albuquerque Tmax(deg F) 0.13 -0.23 -0.02 1 Inv. Mean WS (s/m) 1 -0.27 -0.14 0.32 Mean RH (%) -0.27 1 0.53 -0.26 SI (KW/sq.m) -0.14 0.53 1 -0.16 Atlanta Tmax(deg F) 0.32 -0.26 -0.16 1 Inv. Mean WS (s/m) 1 -0.08 -0.16 0.21 Mean RH (%) -0.08 1 0.28 -0.28 SI (KW/sq.m) -0.16 0.28 1 -0.07 Baton Rouge Tmax(deg F) 0.21 -0.28 -0.07 1 Inv. Mean WS (s/m) 1 -0.19 -0.15 0.19 Mean RH (%) -0.19 1 0.58 -0.37 SI (KW/sq.m) -0.15 0.58 1 -0.18 Birmingham Tmax(deg F) 0.19 -0.37 -0.18 1 Inv. Mean WS (s/m) 1 0.26 0 -0.13 Mean RH (%) 0.26 1 0.32 -0.23 SI (KW/sq.m) 0 0.32 1 0.05 Boston Tmax(deg F) -0.13 -0.23 0.05 1 Inv. Mean WS (s/m) 1 -0.08 -0.11 0.08 Mean RH (%) -0.08 1 0.51 -0.17 SI (KW/sq.m) -0.11 0.51 1 0.07 Buffalo Tmax(deg F) 0.08 -0.17 0.07 1 Inv. Mean WS (s/m) 1 0.18 0.01 0.06 Mean RH (%) 0.18 1 0.4 -0.07 SI (KW/sq.m) 0.01 0.4 1 0.2 Chicago Tmax(deg F) 0.06 -0.07 0.2 1 Inv. Mean WS (s/m) 1 0.04 0.02 0.1 Mean RH (%) 0.04 1 0.47 -0.18 SI (KW/sq.m) 0.02 0.47 1 0.08 Columbus Tmax(deg F) 0.1 -0.18 0.08 1 Inv. Mean WS (s/m) 1 -0.21 -0.16 0 Mean RH (%) -0.21 1 0.54 -0.52 SI (KW/sq.m) -0.16 0.54 1 -0.15 Dallas Tmax(deg F) 0 -0.52 -0.15 1 Inv. Mean WS (s/m) 1 0.1 -0.12 -0.07 Mean RH (%) 0.1 1 0.56 -0.57 SI (KW/sq.m) -0.12 0.56 1 -0.13 Denver Tmax(deg F) -0.07 -0.57 -0.13 1 Inv. Mean WS (s/m) 1 0.16 0.01 0.18 Mean RH (%) 0.16 1 0.36 -0.13 SI (KW/sq.m) 0.01 0.36 1 0.23 Detroit Tmax(deg F) 0.18 -0.13 0.23 1 Inv. Mean WS (s/m) 1 0.13 -0.04 0.07 Fargo Mean RH (%) 0.13 1 0.26 -0.39


Meteorological variable Meteorological City Inv. Mean WS Mean RH SI Tmax variable (s/m) (%) (KW/sq.m) (℉) SI (KW/sq.m) -0.04 0.26 1 0.01 Tmax(deg F) 0.07 -0.39 0.01 1 Inv. Mean WS (s/m) 1 -0.06 -0.47 0.28 Mean RH (%) -0.06 1 -0.26 -0.68 SI (KW/sq.m) -0.47 -0.26 1 0.13 Fresno Tmax(deg F) 0.28 -0.68 0.13 1 Inv. Mean WS (s/m) 1 0.1 0 0.15 Mean RH (%) 0.1 1 0.3 -0.05 SI (KW/sq.m) 0 0.3 1 0.27 Grand Rapids Tmax(deg F) 0.15 -0.05 0.27 1 Inv. Mean WS (s/m) 1 -0.23 -0.15 0.13 Mean RH (%) -0.23 1 0.65 -0.06 SI (KW/sq.m) -0.15 0.65 1 -0.07 Indianapolis Tmax(deg F) 0.13 -0.06 -0.07 1 Inv. Mean WS (s/m) 1 -0.06 -0.24 -0.02 Mean RH (%) -0.06 1 0.06 -0.31 SI (KW/sq.m) -0.24 0.06 1 0.06 Las Vegas Tmax(deg F) -0.02 -0.31 0.06 1 Inv. Mean WS (s/m) 1 0.15 -0.05 0.28 Mean RH (%) 0.15 1 0.09 -0.65 SI (KW/sq.m) -0.05 0.09 1 -0.11 Los Angeles Tmax(deg F) 0.28 -0.65 -0.11 1 Inv. Mean WS (s/m) 1 -0.28 -0.17 0.06 Mean RH (%) -0.28 1 0.44 -0.49 SI (KW/sq.m) -0.17 0.44 1 -0.43 McAllen Tmax(deg F) 0.06 -0.49 -0.43 1 Inv. Mean WS (s/m) 1 -0.04 -0.01 -0.03 Mean RH (%) -0.04 1 0.36 -0.24 SI (KW/sq.m) -0.01 0.36 1 0.02 Minneapolis Tmax(deg F) -0.03 -0.24 0.02 1 Inv. Mean WS (s/m) 1 -0.22 -0.13 0.06 Mean RH (%) -0.22 1 -0.25 -0.02 SI (KW/sq.m) -0.13 -0.25 1 0.03 Phoenix Tmax(deg F) 0.06 -0.02 0.03 1 Inv. Mean WS (s/m) 1 -0.23 -0.13 0.06 Mean RH (%) -0.23 1 0.41 -0.24 SI (KW/sq.m) -0.13 0.41 1 0.02 Pittsburgh Tmax(deg F) 0.06 -0.24 0.02 1 Inv. Mean WS (s/m) 1 0.11 -0.17 0.16 Mean RH (%) 0.11 1 0.24 -0.7 SI (KW/sq.m) -0.17 0.24 1 -0.17 Portland Tmax(deg F) 0.16 -0.7 -0.17 1 Inv. Mean WS (s/m) 1 -0.18 -0.4 0.29 Mean RH (%) -0.18 1 0 -0.67 SI (KW/sq.m) -0.4 0 1 0.05 Reno Tmax(deg F) 0.29 -0.67 0.05 1 Inv. Mean WS (s/m) 1 -0.11 -0.17 -0.01 Mean RH (%) -0.11 1 0.37 -0.24 SI (KW/sq.m) -0.17 0.37 1 0.12 St. Louis Tmax(deg F) -0.01 -0.24 0.12 1


Meteorological variable Meteorological City Inv. Mean WS Mean RH SI Tmax variable (s/m) (%) (KW/sq.m) (℉) Inv. Mean WS (s/m) 1 0.05 0.08 -0.03 Mean RH (%) 0.05 1 0.45 -0.49 SI (KW/sq.m) 0.08 0.45 1 -0.32 Tampa Tmax(deg F) -0.03 -0.49 -0.32 1 Inv. Mean WS (s/m) 1 0.27 0.02 -0.14 Mean RH (%) 0.27 1 0.38 -0.48 SI (KW/sq.m) 0.02 0.38 1 -0.04 Tulsa Tmax(deg F) -0.14 -0.48 -0.04 1 Inv. Mean WS (s/m) 1 -0.05 -0.13 0.16 Mean RH (%) -0.05 1 0.36 -0.2 Washington SI (KW/sq.m) -0.13 0.36 1 0.05 D.C. Tmax(deg F) 0.16 -0.2 0.05 1


Table C-2: Model performance for 27 cities

For each EHE definition, each cell lists Regress R squared, Total R squared, mean absolute error (ppb), and relative accuracy (%), in that order.

Region | City | Definition 1 | Definition 2 | Definition 3 | Definition 4 | Definition 5
Central | Chicago | 0.44, 0.59, 7, 80 | 0.44, 0.59, 7, 80 | 0.43, 0.59, 7, 80 | 0.43, 0.59, 7, 80 | 0.44, 0.59, 7, 80
Central | Columbus | 0.55, 0.70, 8, 80 | 0.54, 0.70, 8, 80 | 0.54, 0.70, 7, 80 | 0.54, 0.70, 7, 81 | 0.55, 0.70, 7, 80
Central | Indianapolis | 0.54, 0.68, 14, 62 | 0.54, 0.68, 14, 63 | 0.54, 0.68, 14, 62 | 0.54, 0.68, 14, 62 | 0.54, 0.68, 14, 62
Central | St. Louis | 0.49, 0.64, 9, 76 | 0.49, 0.64, 9, 75 | 0.49, 0.64, 10, 74 | 0.49, 0.64, 9, 75 | 0.49, 0.64, 10, 75
East North Central | Detroit | 0.53, 0.66, 9, 75 | 0.53, 0.66, 9, 78 | 0.53, 0.66, 9, 78 | 0.53, 0.66, 9, 76 | 0.53, 0.66, 9, 78
East North Central | Grand Rapids | 0.48, 0.64, 8, 78 | 0.49, 0.64, 7, 79 | 0.48, 0.64, 8, 77 | 0.48, 0.64, 8, 79 | 0.48, 0.64, 8, 79
East North Central | Minneapolis | 0.46, 0.64, 8, 74 | 0.46, 0.65, 8, 73 | 0.45, 0.64, 8, 75 | 0.45, 0.64, 8, 75 | 0.46, 0.65, 8, 76
North West Central | Fargo | 0.48, 0.64, 7, 76 | 0.48, 0.65, 7, 76 | 0.48, 0.64, 7, 76 | 0.48, 0.64, 7, 77 | 0.48, 0.65, 7, 77
North West Central | Portland | 0.42, 0.58, 8, 72 | 0.42, 0.58, 7, 73 | 0.41, 0.58, 7, 73 | 0.42, 0.58, 8, 72 | 0.43, 0.59, 7, 73
Northeast | Boston | 0.39, 0.55, 9, 76 | 0.39, 0.55, 9, 73 | 0.39, 0.55, 9, 76 | 0.39, 0.55, 9, 75 | 0.40, 0.55, 9, 73
Northeast | Buffalo | 0.49, 0.64, 11, 73 | 0.49, 0.64, 11, 73 | 0.49, 0.64, 10, 73 | 0.49, 0.64, 10, 74 | 0.49, 0.64, 11, 73
Northeast | Pittsburgh | 0.58, 0.70, 9, 76 | 0.58, 0.70, 9, 76 | 0.58, 0.70, 9, 77 | 0.58, 0.70, 9, 76 | 0.58, 0.70, 9, 77
Northeast | Washington D.C. | 0.41, 0.60, 11, 75 | 0.41, 0.60, 10, 77 | 0.41, 0.60, 10, 77 | 0.41, 0.60, 11, 76 | 0.41, 0.60, 10, 78
South | Baton Rouge | 0.41, 0.62, 9, 71 | 0.39, 0.62, 9, 71 | 0.40, 0.62, 9, 71 | 0.40, 0.62, 9, 71 | 0.40, 0.62, 9, 72
South | Dallas | 0.46, 0.68, 14, 62 | 0.44, 0.67, 14, 62 | 0.42, 0.66, 15, 59 | 0.43, 0.66, 14, 61 | 0.43, 0.66, 14, 61
South | McAllen | 0.27, 0.70, 6, 72 | 0.26, 0.70, 7, 71 | 0.26, 0.70, 7, 71 | 0.26, 0.70, 7, 71 | 0.26, 0.70, 7, 71
South | Tulsa | 0.35, 0.59, 11, 72 | 0.32, 0.58, 11, 72 | 0.30, 0.57, 11, 71 | 0.30, 0.57, 11, 71 | 0.32, 0.58, 10, 73
Southeast | Atlanta | 0.55, 0.72, 14, 68 | 0.54, 0.72, 15, 64 | 0.54, 0.72, 14, 66 | 0.53, 0.72, 14, 66 | 0.54, 0.72, 13, 69
Southeast | Birmingham | 0.52, 0.66, 10, 74 | 0.51, 0.66, 9, 75 | 0.52, 0.67, 10, 74 | 0.51, 0.66, 10, 74 | 0.52, 0.67, 9, 72
Southeast | Tampa | 0.20, 0.50, 8, 69 | 0.19, 0.50, 8, 70 | 0.19, 0.50, 9, 69 | 0.19, 0.50, 9, 69 | 0.19, 0.50, 8, 69
Southwest | Albuquerque | 0.11, 0.43, 6, 86 | 0.11, 0.42, 6, 86 | 0.11, 0.43, 6, 86 | 0.11, 0.43, 6, 85 | 0.11, 0.43, 6, 86
Southwest | Denver | 0.31, 0.59, 7, 84 | 0.31, 0.59, 7, 84 | 0.31, 0.59, 7, 84 | 0.31, 0.59, 7, 84 | 0.32, 0.60, 7, 84
Southwest | Phoenix | 0.22, 0.42, 6, 86 | 0.15, 0.37, 7, 85 | 0.15, 0.37, 7, 85 | 0.15, 0.37, 7, 85 | 0.15, 0.37, 7, 85
West | Fresno | 0.52, 0.71, 8, 84 | 0.51, 0.71, 8, 84 | 0.50, 0.70, 8, 84 | 0.50, 0.70, 8, 84 | 0.51, 0.71, 8, 84
West | Las Vegas | 0.13, 0.41, 5, 88 | 0.12, 0.40, 5, 88 | 0.11, 0.40, 6, 87 | 0.11, 0.40, 6, 87 | 0.11, 0.40, 6, 87
West | Los Angeles | 0.02, 0.48, 6, 83 | 0.03, 0.48, 6, 83 | 0.04, 0.49, 6, 83 | 0.03, 0.48, 6, 83 | 0.03, 0.48, 6, 83
West | Reno | 0.16, 0.45, 6, 85 | 0.16, 0.45, 6, 85 | 0.16, 0.45, 6, 85 | 0.15, 0.44, 6, 85 | 0.16, 0.45, 6, 85


Table C-3: The effect (slope factors on a logarithmic scale35) of meteorological variables on ozone and effect modification by EHEs Slope factors (Effect)on associated with meteorological varibles (Log scale) EHE InvWS EHE*InvWS RH EHE*RH Tmax EHE*Tmax SI City Mean (95% CI) Mean (95% CI) Mean (95% CI) Mean (95% CI) Mean (95% CI) Mean (95% CI) Mean (95% CI) Mean (95% CI) Albuquerque -0.34 (-0.76, 0.083) 0.028 (0.022, 0.035) -0.01 (-0.04, 0.016) -0.02 (-0.03, -0.02) 0.043 (0.023, 0.063) 0.049 (0.042, 0.056) 0.03 (-0.01, 0.073) -0 (-0, 0.001) Atlanta -1.03 (-1.85, -0.21) 0.012 (0.008, 0.015) 0.1 (0.071, 0.129) -0.18 (-0.19, -0.18) 0.096 (0.069, 0.123) 0.22 (0.207, 0.233) 0.028 (-0.05, 0.104) -0.02 (-0.02, -0.01) Baton Rouge -0.54 (-1.52, 0.446) 0.084 (0.077, 0.09) 0.008 (-0.03, 0.045) -0.24 (-0.25, -0.23) 0.043 (-0.04, 0.127) 0.149 (0.128, 0.169) 0.057 (-0.03, 0.149) -0.01 (-0.01, -0) Birmingham -0.42 (-1.33, 0.485) 0.069 (0.065, 0.074) 0.028 (0.014, 0.043) -0.18 (-0.19, -0.17) 0.05 (0.024, 0.075) 0.141 (0.128, 0.153) -0.01 (-0.1, 0.086) -0.01 (-0.01, -0) Boston -1.25 (-1.89, -0.61) -0.06 (-0.08, -0.04) 0.096 (-0.01, 0.201) -0.01 (-0.02, -0.01) 0.016 (-0.03, 0.057) 0.225 (0.217, 0.232) 0.135 (0.074, 0.196) 69E-5 (-0, 0.003) Buffalo -0.13 (-0.84, 0.569) -0.07 (-0.08, -0.06) -0.01 (-0.04, 0.026) -0.05 (-0.05, -0.04) 0.031 (-0.01, 0.069) 0.281 (0.273, 0.288) 0.004 (-0.05, 0.061) -0 (-0, -0) Chicago -0.22 (-0.91, 0.475) 0.013 (0.003, 0.023) 0.142 (0.094, 0.19) -0.1 (-0.1, -0.09) -0 (-0.03, 0.028) 0.231 (0.223, 0.239) 0.008 (-0.06, 0.077) -0.01 (-0.01, -0) Columbus 0.517 (-0.67, 1.702) 0.036 (0.033, 0.04) 0.021 (0.005, 0.037) -0.13 (-0.13, -0.12) 0.003 (-0.02, 0.027) 0.234 (0.226, 0.242) -0.06 (-0.18, 0.055) -0 (-0, -0) Dallas -0.06 (-1.56, 1.44) 0.139 (0.118, 0.16) 0.105 (0.042, 0.169) -0.13 (-0.13, -0.12) 0.076 (0.003, 0.149) 0.153 (0.132, 0.175) -0.05 (-0.2, 0.088) -0.01 (-0.01, -0) Denver -0.32 (-1.07, 0.437) 0.019 (0.006, 0.032) 0.072 (0.007, 0.137) -0.04 (-0.05, 
-0.04) 0.078 (0.04, 0.117) 0.089 (0.08, 0.098) -0.01 (-0.08, 0.065) -0.01 (-0.01, -0) Detroit 0.053 (-0.86, 0.964) 0.041 (0.036, 0.045) 0.058 (0.025, 0.092) -0.11 (-0.12, -0.1) 0.018 (-0.02, 0.051) 0.299 (0.291, 0.307) -0.02 (-0.1, 0.062) -0 (-0, 0.001) Fargo -0.53 (-1.18, 0.123) 0.008 (0.002, 0.014) -0.01 (-0.05, 0.015) -0.07 (-0.08, -0.07) 0.047 (0.024, 0.07) 0.168 (0.162, 0.174) 0.034 (-0.03, 0.096) 0.012 (0.011, 0.013) Fresno -0.46 (-1.13, 0.21) 0.074 (0.068, 0.08) -0.02 (-0.04, 0.006) -0 (-0.01, 0.007) 0.071 (0.053, 0.089) 0.203 (0.189, 0.217) 0.025 (-0.04, 0.088) 0.006 (0.004, 0.009) Grand Rapids 0.499 (-0.17, 1.17) 0.014 (0.007, 0.021) -0.03 (-0.06, 0.009) -0.07 (-0.08, -0.06) -0 (-0.03, 0.031) 0.318 (0.309, 0.327) -0.05 (-0.12, 0.015) -0 (-0, -0) Indianapolis 0.718 (-0.28, 1.716) 0.044 (0.038, 0.05) 0.052 (0.008, 0.096) -0.13 (-0.14, -0.13) -0.01 (-0.04, 0.012) 0.276 (0.268, 0.284) -0.08 (-0.17, 0.009) -0 (-0, -0) Las Vegas -0.42 (-0.72, -0.12) 0.028 (0.019, 0.037) 0.038 (0.014, 0.062) -0.03 (-0.04, -0.03) 0.035 (0.021, 0.05) 0.058 (0.051, 0.065) 0.028 (0.003, 0.054) 0.003 (74E-5, 0.005) Los Angeles -0.47 (-1.24, 0.309) 0.067 (0.05, 0.085) -0.07 (-0.15, 0.019) 0.03 (0.021, 0.04) -0.01 (-0.04, 0.017) 0.054 (0.038, 0.07) 0.103 (0.024, 0.181) 0.003 (82E-5, 0.005) McAllen -1.55 (-4.67, 1.573) 0.245 (0.215, 0.274) 0.153 (0.093, 0.213) -0.13 (-0.15, -0.12) 0.069 (-0.04, 0.18) -0.07 (-0.1, -0.04) 0.089 (-0.18, 0.355) -0.01 (-0.01, -0) Minneapolis -0.22 (-0.93, 0.489) 0.007 (0.004, 0.009) -0.02 (-0.04, 23E-6) -0.05 (-0.06, -0.05) 0.05 (0.024, 0.076) 0.225 (0.218, 0.232) -0.01 (-0.06, 0.047) 0.001 (-0, 0.003) Phoenix -0.81 (-2.2, 0.568) 0.094 (0.087, 0.102) -0.01 (-0.03, 0.018) -0.03 (-0.05, -0.01) 0.065 (0.017, 0.113) 0.061 (0.032, 0.09) 0.067 (-0.06, 0.194) 0.014 (0.01, 0.018) Pittsburgh -0.44 (-1.15, 0.272) 0.006 (0.002, 0.01) 0.05 (0.023, 0.077) -0.11 (-0.11, -0.1) 0.033 (0.007, 0.059) 0.298 (0.29, 0.306) 0.014 (-0.06, 0.087) 77E-5 (-0, 0.002) Portland -1.82 
(-2.42, -1.22) 0.006 (0.005, 0.007) 0.035 (0.025, 0.046) -0.11 (-0.12, -0.1) 0.107 (0.073, 0.141) 0.115 (0.106, 0.124) 0.136 (0.083, 0.189) 0.008 (0.006, 0.01) Reno -0.29 (-0.63, 0.058) -0 (-0.01, 0.004) 0.01 (-0.02, 0.042) -0.04 (-0.05, -0.04) 0.034 (-0.01, 0.079) 0.066 (0.06, 0.072) 0.033 (-0, 0.066) 0.007 (0.005, 0.01) St. Louis 0.091 (-0.43, 0.608) 0.06 (0.054, 0.066) 0.077 (0.044, 0.11) -0.1 (-0.1, -0.09) -0 (-0.02, 0.021) 0.196 (0.189, 0.204) -0.02 (-0.07, 0.025) -0 (-0.01, -0) Tampa -1.4 (-2.66, -0.14) 0.035 (0.021, 0.049) 0.042 (-0.04, 0.126) -0.21 (-0.22, -0.19) 0.052 (0.008, 0.096) 0.114 (0.086, 0.142) 0.096 (-0.03, 0.22) -0 (-0, 22E-5) Tulsa -0.41 (-0.88, 0.067) 0.033 (0.029, 0.036) 0.017 (-0, 0.037) -0.08 (-0.09, -0.08) 0.092 (0.052, 0.131) 0.135 (0.121, 0.149) -0.02 (-0.08, 0.026) 0.003 (0.002, 0.004) Washington D.C. 0.264 (-0.52, 1.052) 0.075 (0.062, 0.088) 0.057 (-0.01, 0.119) -0.09 (-0.1, -0.09) 0.012 (-0.02, 0.044) 0.319 (0.308, 0.33) -0.05 (-0.12, 0.026) -0 (-0, 9E-4)

35 The slope factors are presented on a log scale; certain predictors were scaled for display purposes. Specifically, InvWS and EHE*InvWS: slope factor represents the change in ozone for a 0.1 s/m increase in InvWS or a 10 m/s decrease in wind speed; RH and EHE*RH: slope factor represents the change in ozone for a 10% increase in RH; Tmax and EHE*Tmax: slope factor represents the change in ozone for a 10 ℉ increase in Tmax; SI: slope factor represents the change in ozone for a 1 KW/sq.m increase in SI.
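To help read slope factors reported on a log scale, the sketch below converts a slope into an approximate percent change in ozone and combines a baseline slope with its EHE interaction term. Both steps rest on assumptions that the table does not state explicitly: that the response was the natural logarithm of ozone, and that the slope on an EHE day is the sum of the baseline and interaction slopes. The function names and the numeric values in the usage note are illustrative.

```python
import math


def percent_change_in_ozone(slope):
    """Percent change in ozone implied by a slope on the natural-log scale.

    Assumes the modeled response was ln(ozone); this is an assumption,
    since the source only states that slopes are on a log scale.
    """
    return 100.0 * (math.exp(slope) - 1.0)


def slope_on_ehe_day(baseline_slope, interaction_slope):
    """Slope on an EHE day under a standard interaction model (assumed):
    baseline slope plus the EHE interaction slope."""
    return baseline_slope + interaction_slope
```

Under these assumptions, a Tmax slope of 0.22 per 10 ℉ would correspond to roughly a 25% increase in ozone, and an interaction slope of 0.028 would raise the slope on EHE days to 0.248.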

Table C-4: The effect modification (slope factors, see footnote 36, on a logarithmic scale) of the relationship between meteorological variables and ozone during EHEs, by region and definition

Meteorological variable: Daily Maximum Temperature

| Region | Definition | Baseline effect: Mean | Baseline effect: Lower limit | Baseline effect: Upper limit | Effect modification: Mean | Effect modification: Lower limit | Effect modification: Upper limit |
|---|---|---|---|---|---|---|---|
| Central | Definition 1 | 0.23 | 0.20 | 0.26 | -0.07 | -0.17 | 0.03 |
| Central | Definition 2 | 0.23 | 0.20 | 0.26 | -0.04 | -0.13 | 0.04 |
| Central | Definition 3 | 0.24 | 0.20 | 0.27 | 0.00 | -0.25 | 0.25 |
| Central | Definition 4 | 0.23 | 0.20 | 0.27 | -0.08 | -0.22 | 0.06 |
| Central | Definition 5 | 0.24 | 0.20 | 0.27 | 0.00 | -0.05 | 0.05 |
| Central | All definitions | 0.23 | 0.22 | 0.25 | -0.03 | -0.06 | 0.01 |
| East North Central | Definition 1 | 0.28 | 0.22 | 0.34 | -0.02 | -0.14 | 0.11 |
| East North Central | Definition 2 | 0.28 | 0.22 | 0.34 | -0.02 | -0.08 | 0.05 |
| East North Central | Definition 3 | 0.28 | 0.23 | 0.34 | -0.02 | -0.23 | 0.20 |
| East North Central | Definition 4 | 0.28 | 0.23 | 0.34 | -0.03 | -0.16 | 0.10 |
| East North Central | Definition 5 | 0.28 | 0.22 | 0.34 | -0.04 | -0.09 | 0.01 |
| East North Central | All definitions | 0.28 | 0.26 | 0.30 | -0.03 | -0.06 | 0.01 |
| North West Central | Definition 1 | 0.14 | 0.09 | 0.20 | 0.09 | -0.02 | 0.20 |
| North West Central | Definition 2 | 0.14 | 0.08 | 0.19 | 0.13 | 0.06 | 0.20 |
| North West Central | Definition 3 | 0.15 | 0.09 | 0.20 | 0.10 | -0.04 | 0.25 |
| North West Central | Definition 4 | 0.15 | 0.11 | 0.19 | 0.03 | -0.19 | 0.25 |
| North West Central | Definition 5 | 0.14 | 0.08 | 0.19 | 0.05 | -0.09 | 0.19 |
| North West Central | All definitions | 0.14 | 0.13 | 0.16 | 0.08 | 0.03 | 0.14 |
| Northeast | Definition 1 | 0.28 | 0.24 | 0.32 | -0.05 | -0.16 | 0.05 |
| Northeast | Definition 2 | 0.28 | 0.24 | 0.32 | 0.06 | -0.01 | 0.13 |
| Northeast | Definition 3 | 0.28 | 0.25 | 0.32 | 0.13 | -0.22 | 0.48 |
| Northeast | Definition 4 | 0.28 | 0.24 | 0.32 | 0.06 | -0.08 | 0.20 |
| Northeast | Definition 5 | 0.28 | 0.24 | 0.32 | 0.03 | -0.05 | 0.12 |
| Northeast | All definitions | 0.28 | 0.26 | 0.30 | 0.03 | -0.01 | 0.07 |
| South | Definition 1 | 0.17 | 0.13 | 0.21 | -0.08 | -0.14 | -0.01 |
| South | Definition 2 | 0.09 | 0.00 | 0.18 | 0.05 | -0.12 | 0.22 |
| South | Definition 3 | 0.08 | -0.01 | 0.17 | -0.14 | -0.40 | 0.12 |
| South | Definition 4 | 0.08 | 0.00 | 0.17 | -0.04 | -0.22 | 0.14 |
| South | Definition 5 | 0.09 | 0.00 | 0.18 | 0.11 | 0.01 | 0.21 |
| South | All definitions | 0.10 | 0.07 | 0.13 | -0.01 | -0.07 | 0.05 |
| Southeast | Definition 1 | 0.17 | 0.10 | 0.23 | 0.07 | 0.01 | 0.13 |
| Southeast | Definition 2 | 0.16 | 0.10 | 0.22 | -0.06 | -0.20 | 0.07 |
| Southeast | Definition 3 | 0.16 | 0.10 | 0.23 | -0.07 | -0.51 | 0.37 |
| Southeast | Definition 4 | 0.16 | 0.09 | 0.22 | 0.11 | -0.32 | 0.54 |
| Southeast | Definition 5 | 0.17 | 0.10 | 0.23 | -0.01 | -0.10 | 0.08 |
| Southeast | All definitions | 0.16 | 0.14 | 0.19 | 0.03 | -0.02 | 0.07 |
| Southwest | Definition 1 | 0.03 | -0.02 | 0.09 | 0.15 | -0.05 | 0.35 |
| Southwest | Definition 2 | 0.07 | 0.05 | 0.09 | 0.05 | -0.03 | 0.13 |
| Southwest | Definition 3 | 0.07 | 0.05 | 0.10 | -0.05 | -0.54 | 0.44 |
| Southwest | Definition 4 | 0.07 | 0.05 | 0.10 | -0.02 | -0.23 | 0.20 |
| Southwest | Definition 5 | 0.07 | 0.05 | 0.10 | 0.01 | -0.02 | 0.04 |
| Southwest | All definitions | 0.07 | 0.05 | 0.08 | 0.05 | -0.02 | 0.11 |
| West | Definition 1 | 0.09 | 0.03 | 0.15 | 0.05 | 0.01 | 0.10 |
| West | Definition 2 | 0.09 | 0.02 | 0.17 | 0.06 | 0.01 | 0.11 |
| West | Definition 3 | 0.10 | 0.02 | 0.18 | -0.02 | -0.17 | 0.13 |
| West | Definition 4 | 0.10 | 0.02 | 0.18 | -0.13 | -0.42 | 0.16 |
| West | Definition 5 | 0.09 | 0.02 | 0.17 | 0.03 | 0.00 | 0.06 |
| West | All definitions | 0.10 | 0.07 | 0.12 | 0.04 | 0.02 | 0.06 |
| All Regions | Definition 1 | 0.18 | 0.14 | 0.21 | 0.00 | -0.04 | 0.05 |
| All Regions | Definition 2 | 0.17 | 0.13 | 0.20 | 0.03 | 0.00 | 0.06 |
| All Regions | Definition 3 | 0.17 | 0.14 | 0.21 | 0.01 | -0.06 | 0.08 |
| All Regions | Definition 4 | 0.17 | 0.14 | 0.21 | -0.02 | -0.08 | 0.04 |
| All Regions | Definition 5 | 0.17 | 0.14 | 0.21 | 0.02 | 0.00 | 0.04 |

Meteorological variable: Daily Mean Relative Humidity

| Region | Definition | Baseline effect: Mean | Baseline effect: Lower limit | Baseline effect: Upper limit | Effect modification: Mean | Effect modification: Lower limit | Effect modification: Upper limit |
|---|---|---|---|---|---|---|---|
| Central | Definition 1 | -0.11 | -0.13 | -0.09 | -0.02 | -0.04 | 0.01 |
| Central | Definition 2 | -0.11 | -0.13 | -0.09 | 0.00 | -0.02 | 0.03 |
| Central | Definition 3 | -0.11 | -0.13 | -0.09 | 0.02 | -0.04 | 0.08 |
| Central | Definition 4 | -0.11 | -0.13 | -0.09 | 0.03 | -0.02 | 0.08 |
| Central | Definition 5 | -0.11 | -0.13 | -0.09 | 0.00 | -0.02 | 0.02 |
| Central | All definitions | -0.11 | -0.12 | -0.11 | 0.00 | -0.02 | 0.01 |
| East North Central | Definition 1 | -0.08 | -0.11 | -0.04 | 0.02 | -0.05 | 0.09 |
| East North Central | Definition 2 | -0.08 | -0.11 | -0.05 | 0.04 | 0.02 | 0.07 |
| East North Central | Definition 3 | -0.08 | -0.11 | -0.04 | 0.04 | -0.01 | 0.10 |
| East North Central | Definition 4 | -0.08 | -0.11 | -0.04 | 0.04 | -0.03 | 0.12 |
| East North Central | Definition 5 | -0.08 | -0.11 | -0.04 | 0.01 | -0.02 | 0.03 |
| East North Central | All definitions | -0.08 | -0.09 | -0.07 | 0.03 | 0.01 | 0.05 |
| North West Central | Definition 1 | -0.09 | -0.12 | -0.05 | 0.05 | -0.03 | 0.12 |
| North West Central | Definition 2 | -0.09 | -0.13 | -0.06 | 0.08 | 0.05 | 0.12 |
| North West Central | Definition 3 | -0.09 | -0.13 | -0.05 | 0.07 | -0.03 | 0.17 |
| North West Central | Definition 4 | -0.09 | -0.12 | -0.05 | 0.00 | -0.15 | 0.15 |
| North West Central | Definition 5 | -0.09 | -0.14 | -0.05 | 0.07 | 0.01 | 0.13 |
| North West Central | All definitions | -0.09 | -0.10 | -0.08 | 0.07 | 0.04 | 0.09 |
| Northeast | Definition 1 | -0.07 | -0.11 | -0.02 | 0.00 | -0.03 | 0.04 |
| Northeast | Definition 2 | -0.07 | -0.11 | -0.02 | 0.03 | 0.00 | 0.06 |
| Northeast | Definition 3 | -0.06 | -0.11 | -0.02 | 0.07 | 0.01 | 0.13 |
| Northeast | Definition 4 | -0.06 | -0.11 | -0.02 | 0.10 | 0.04 | 0.16 |
| Northeast | Definition 5 | -0.06 | -0.11 | -0.02 | 0.00 | -0.02 | 0.03 |
| Northeast | All definitions | -0.07 | -0.08 | -0.05 | 0.02 | 0.01 | 0.04 |
| South | Definition 1 | -0.14 | -0.18 | -0.09 | 0.00 | -0.04 | 0.05 |
| South | Definition 2 | -0.15 | -0.21 | -0.09 | 0.12 | 0.08 | 0.16 |
| South | Definition 3 | -0.15 | -0.21 | -0.08 | 0.08 | 0.00 | 0.16 |
| South | Definition 4 | -0.15 | -0.21 | -0.08 | 0.14 | 0.05 | 0.24 |
| South | Definition 5 | -0.15 | -0.21 | -0.09 | 0.09 | 0.05 | 0.13 |
| South | All definitions | -0.14 | -0.17 | -0.12 | 0.07 | 0.04 | 0.10 |
| Southeast | Definition 1 | -0.19 | -0.20 | -0.18 | 0.07 | 0.03 | 0.10 |
| Southeast | Definition 2 | -0.19 | -0.20 | -0.17 | 0.08 | 0.02 | 0.14 |
| Southeast | Definition 3 | -0.18 | -0.20 | -0.17 | 0.12 | 0.04 | 0.21 |
| Southeast | Definition 4 | -0.18 | -0.20 | -0.17 | 0.24 | 0.00 | 0.47 |
| Southeast | Definition 5 | -0.19 | -0.20 | -0.17 | 0.04 | 0.00 | 0.09 |
| Southeast | All definitions | -0.18 | -0.19 | -0.18 | 0.07 | 0.05 | 0.09 |
| Southwest | Definition 1 | -0.05 | -0.09 | -0.02 | 0.10 | 0.03 | 0.16 |
| Southwest | Definition 2 | -0.03 | -0.04 | -0.01 | 0.06 | 0.03 | 0.08 |
| Southwest | Definition 3 | -0.03 | -0.04 | -0.01 | 0.07 | -0.03 | 0.18 |
| Southwest | Definition 4 | -0.03 | -0.04 | -0.01 | 0.10 | 0.02 | 0.18 |
| Southwest | Definition 5 | -0.03 | -0.05 | -0.01 | 0.04 | 0.02 | 0.06 |
| Southwest | All definitions | -0.03 | -0.04 | -0.02 | 0.06 | 0.04 | 0.09 |
| West | Definition 1 | -0.02 | -0.05 | 0.01 | 0.05 | 0.02 | 0.08 |
| West | Definition 2 | -0.01 | -0.04 | 0.01 | 0.04 | 0.00 | 0.08 |
| West | Definition 3 | -0.01 | -0.04 | 0.02 | 0.03 | -0.01 | 0.08 |
| West | Definition 4 | -0.01 | -0.04 | 0.02 | 0.09 | -0.23 | 0.40 |
| West | Definition 5 | -0.01 | -0.04 | 0.02 | 0.01 | -0.02 | 0.05 |
| West | All definitions | -0.01 | -0.02 | 0.00 | 0.04 | 0.02 | 0.05 |
| All Regions | Definition 1 | -0.09 | -0.11 | -0.07 | 0.03 | 0.01 | 0.05 |
| All Regions | Definition 2 | -0.09 | -0.11 | -0.07 | 0.05 | 0.03 | 0.06 |
| All Regions | Definition 3 | -0.09 | -0.11 | -0.07 | 0.05 | 0.03 | 0.07 |
| All Regions | Definition 4 | -0.09 | -0.11 | -0.07 | 0.07 | 0.04 | 0.10 |
| All Regions | Definition 5 | -0.09 | -0.11 | -0.07 | 0.02 | 0.01 | 0.04 |

Meteorological variable: Inverse Daily Mean Wind Speed

| Region | Definition | Baseline effect: Mean | Baseline effect: Lower limit | Baseline effect: Upper limit | Effect modification: Mean | Effect modification: Lower limit | Effect modification: Upper limit |
|---|---|---|---|---|---|---|---|
| Central | Definition 1 | 0.04 | 0.03 | 0.05 | 0.06 | 0.02 | 0.11 |
| Central | Definition 2 | 0.04 | 0.02 | 0.05 | 0.06 | 0.02 | 0.11 |
| Central | Definition 3 | 0.04 | 0.03 | 0.06 | 0.10 | -0.03 | 0.23 |
| Central | Definition 4 | 0.04 | 0.03 | 0.06 | 0.03 | -0.06 | 0.12 |
| Central | Definition 5 | 0.04 | 0.02 | 0.05 | 0.07 | 0.03 | 0.10 |
| Central | All definitions | 0.04 | 0.03 | 0.05 | 0.06 | 0.04 | 0.08 |
| East North Central | Definition 1 | 0.02 | 0.00 | 0.04 | 0.02 | -0.04 | 0.09 |
| East North Central | Definition 2 | 0.02 | 0.00 | 0.04 | 0.01 | -0.04 | 0.06 |
| East North Central | Definition 3 | 0.02 | 0.00 | 0.04 | -0.04 | -0.16 | 0.09 |
| East North Central | Definition 4 | 0.02 | 0.00 | 0.04 | 0.01 | -0.09 | 0.10 |
| East North Central | Definition 5 | 0.02 | 0.00 | 0.04 | 0.00 | -0.03 | 0.02 |
| East North Central | All definitions | 0.02 | 0.01 | 0.03 | 0.00 | -0.02 | 0.03 |
| North West Central | Definition 1 | 0.01 | 0.00 | 0.01 | 0.02 | -0.02 | 0.06 |
| North West Central | Definition 2 | 0.01 | 0.00 | 0.01 | 0.01 | -0.04 | 0.07 |
| North West Central | Definition 3 | 0.01 | 0.00 | 0.01 | 0.01 | -0.02 | 0.05 |
| North West Central | Definition 4 | 0.01 | 0.00 | 0.01 | 0.05 | -0.02 | 0.13 |
| North West Central | Definition 5 | 0.01 | 0.00 | 0.01 | 0.02 | -0.02 | 0.06 |
| North West Central | All definitions | 0.01 | 0.00 | 0.01 | 0.03 | 0.01 | 0.04 |
| Northeast | Definition 1 | -0.01 | -0.07 | 0.04 | 0.05 | -0.01 | 0.11 |
| Northeast | Definition 2 | -0.01 | -0.07 | 0.04 | 0.04 | -0.01 | 0.09 |
| Northeast | Definition 3 | -0.01 | -0.07 | 0.05 | 0.03 | -0.07 | 0.12 |
| Northeast | Definition 4 | -0.01 | -0.07 | 0.05 | 0.00 | -0.09 | 0.09 |
| Northeast | Definition 5 | -0.01 | -0.07 | 0.04 | 0.03 | -0.01 | 0.07 |
| Northeast | All definitions | -0.01 | -0.03 | 0.01 | 0.03 | 0.01 | 0.05 |
| South | Definition 1 | 0.07 | 0.03 | 0.11 | 0.08 | 0.03 | 0.14 |
| South | Definition 2 | 0.13 | 0.06 | 0.20 | 0.06 | -0.04 | 0.15 |
| South | Definition 3 | 0.13 | 0.06 | 0.21 | 0.00 | -0.07 | 0.07 |
| South | Definition 4 | 0.13 | 0.06 | 0.21 | 0.04 | -0.08 | 0.17 |
| South | Definition 5 | 0.13 | 0.06 | 0.21 | 0.03 | 0.01 | 0.05 |
| South | All definitions | 0.12 | 0.10 | 0.14 | 0.04 | 0.02 | 0.07 |
| Southeast | Definition 1 | 0.03 | -0.02 | 0.07 | 0.06 | 0.00 | 0.11 |
| Southeast | Definition 2 | 0.04 | -0.01 | 0.08 | 0.05 | -0.03 | 0.13 |
| Southeast | Definition 3 | 0.04 | 0.00 | 0.08 | 0.10 | -0.02 | 0.21 |
| Southeast | Definition 4 | 0.04 | 0.00 | 0.09 | 0.11 | -0.08 | 0.30 |
| Southeast | Definition 5 | 0.04 | 0.00 | 0.08 | 0.05 | 0.01 | 0.10 |
| Southeast | All definitions | 0.04 | 0.02 | 0.05 | 0.06 | 0.03 | 0.08 |
| Southwest | Definition 1 | 0.05 | 0.00 | 0.10 | 0.01 | -0.02 | 0.03 |
| Southwest | Definition 2 | 0.05 | 0.00 | 0.10 | 0.01 | -0.04 | 0.05 |
| Southwest | Definition 3 | 0.05 | 0.00 | 0.10 | 0.02 | -0.12 | 0.15 |
| Southwest | Definition 4 | 0.05 | 0.00 | 0.10 | 0.05 | -0.06 | 0.16 |
| Southwest | Definition 5 | 0.05 | -0.01 | 0.10 | 0.01 | -0.06 | 0.07 |
| Southwest | All definitions | 0.05 | 0.03 | 0.07 | 0.00 | -0.02 | 0.02 |
| West | Definition 1 | 0.03 | -0.01 | 0.08 | 0.03 | -0.01 | 0.07 |
| West | Definition 2 | 0.04 | 0.00 | 0.08 | 0.02 | -0.01 | 0.05 |
| West | Definition 3 | 0.04 | 0.00 | 0.08 | 0.02 | -0.05 | 0.09 |
| West | Definition 4 | 0.04 | 0.00 | 0.08 | -0.10 | -0.43 | 0.22 |
| West | Definition 5 | 0.05 | 0.00 | 0.09 | -0.02 | -0.04 | 0.01 |
| West | All definitions | 0.04 | 0.02 | 0.06 | 0.01 | -0.01 | 0.03 |
| All Regions | Definition 1 | 0.03 | 0.02 | 0.04 | 0.05 | 0.03 | 0.06 |
| All Regions | Definition 2 | 0.04 | 0.03 | 0.05 | 0.03 | 0.01 | 0.05 |
| All Regions | Definition 3 | 0.04 | 0.03 | 0.06 | 0.02 | 0.00 | 0.05 |
| All Regions | Definition 4 | 0.04 | 0.03 | 0.06 | 0.02 | -0.02 | 0.05 |
| All Regions | Definition 5 | 0.04 | 0.03 | 0.05 | 0.02 | 0.01 | 0.04 |

36 The slope factors are presented on a log scale and, for certain predictors, were scaled for display purposes. Specifically, for the baseline effect and effect modification associated with daily mean inverse wind speed, the slope factor represents the change in ozone for a 0.1 s/m increase in inverse wind speed or a 10 m/s decrease in wind speed; for daily mean relative humidity, the slope factor represents the change in ozone for a 10% increase in RH; for daily maximum temperature, the slope factor represents the change in ozone for a 10 °F increase in daily maximum temperature.

Figure C-9: Effect modification37 of the RH-ozone relationship during EHEs

37 The EHE day effect for a city is shown only if the summary-level pooled analysis of the effect modification (interaction term) in the model was significant.

APPENDIX D

SUPPLEMENTAL MATERIAL FOR CHAPTER 5

D1: Downscaler model structure

In its most general form, the DS model can be expressed in an equation similar to that of linear regression (Berrocal et al. 2010a, 2010b, 2012):

$$Y(s,t) = \tilde{\beta}_0(s,t) + \tilde{\beta}_1(t)\,x(s,t) + \varepsilon(s,t) \qquad (1)$$

Where,

$Y(s,t)$: observed concentration at point s and time t;

$x(s,t)$: CMAQ concentration at time t. This value is a weighted average of both the grid cell containing the monitor and neighboring grid cells;

$\tilde{\beta}_0(s,t)$: intercept, composed of both a global and a local component;

$\tilde{\beta}_1(t)$: global slope; local components of the slope are contained in the $x(s,t)$ term;

$\varepsilon(s,t)$: model error.

This Bayesian approach involves iteratively drawing random samples of model parameters from built-in "prior" distributions and assessing their fit to the data. The resulting collection of intercept and slope values at each space-time point is used to predict concentrations and associated uncertainties at new spatial locations.


D2: Geographically weighted regression (GWR) model structure

In its most general form, the GWR model can be expressed as (Hu et al. 2013):

$$\mathrm{PM}_{2.5,st} = \beta_{0,s} + \beta_{1,s}\,\mathrm{HPBL}_{st} + \beta_{2,s}\,\mathrm{RH}_{st} + \beta_{3,s}\,\mathrm{TEMP}_{st} + \beta_{4,s}\,\mathrm{WIND\_SPEED}_{st} + \beta_{5,s}\,\mathrm{FOREST\_COVER}_{st} + \beta_{6,s}\,\mathrm{MODIS\_AOD}_{st} + \varepsilon_{st} \qquad (2)$$

Where,

$\mathrm{PM}_{2.5,st}$: daily ground-level PM2.5 concentration (µg/m³) at site s on day t;

$\mathrm{HPBL}_{st}$: boundary layer height (m) at site s on day t;

$\mathrm{RH}_{st}$: relative humidity (%) at site s on day t;

$\mathrm{TEMP}_{st}$: air temperature (K) at site s on day t;

$\mathrm{WIND\_SPEED}_{st}$: surface wind speed (m/sec) at site s on day t;

$\mathrm{FOREST\_COVER}_{st}$: percentage of the forest cover (unitless) at site s on day t;

$\mathrm{MODIS\_AOD}_{st}$: MODIS AOD value (unitless) at site s on day t;

$\beta_{0,s}$ and $\beta_{i,s}$: location-specific intercept and slopes, respectively.

β is calculated by incorporating the geographical weighting of each observation (e.g., a PM2.5 monitoring site) relative to the location of the regression point (e.g., a PM2.5 monitoring site or the centroid of a grid cell). The weighting is calculated by a Gaussian distance-decay weighting function, so the weight of each observation at the regression point decreases along a Gaussian curve as the distance between them increases. In addition, a bandwidth needs to be determined for the weighting function. Due to the uneven distribution of PM2.5 monitoring sites, we obtained the adaptive bandwidth by minimizing the corrected Akaike Information Criterion (AICc) value.
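The Gaussian distance-decay weighting described above can be illustrated with a minimal Python sketch. It is an illustration only, not the implementation used in this work: it assumes a single predictor (for example, AOD) instead of the six in Equation (2), uses a fixed bandwidth rather than the AICc-selected adaptive bandwidth, and the function names `gaussian_weight` and `local_fit` are hypothetical.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian distance-decay weight: 1 at zero distance, smaller farther away."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def local_fit(sites, point, bandwidth):
    """Weighted least-squares intercept and slope at one regression point.

    sites: list of (x_coord, y_coord, predictor, response) tuples (assumed layout).
    point: (x_coord, y_coord) of the regression point.
    bandwidth: fixed kernel bandwidth in the same units as the coordinates.
    """
    weights = [
        gaussian_weight(math.hypot(sx - point[0], sy - point[1]), bandwidth)
        for sx, sy, _, _ in sites
    ]
    total = sum(weights)
    # Weighted means of the predictor and the response.
    x_bar = sum(w * s[2] for w, s in zip(weights, sites)) / total
    y_bar = sum(w * s[3] for w, s in zip(weights, sites)) / total
    # Weighted cross-product and sum of squares give the local slope.
    sxy = sum(w * (s[2] - x_bar) * (s[3] - y_bar) for w, s in zip(weights, sites))
    sxx = sum(w * (s[2] - x_bar) ** 2 for w, s in zip(weights, sites))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope
```

Calling `local_fit` once per regression point yields the location-specific intercept and slope; nearby sites dominate each fit because their weights are closest to 1.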


D3-4: Calculations of performance metrics (Hu et al. 2013; Vaidyanathan et al. 2013)

1. Pearson Correlation Coefficient (r): For a set of daily data points (MON1, MOD1), (MON2, MOD2), …, (MONn, MODn), r is defined as

$$r = \frac{\sum_{i=1}^{n} (MON_i - \overline{MON})(MOD_i - \overline{MOD})}{\sqrt{\sum_{i=1}^{n} (MON_i - \overline{MON})^2}\,\sqrt{\sum_{i=1}^{n} (MOD_i - \overline{MOD})^2}} \qquad (3)$$

Where, n: number of observations;

$MON_i$: Monitor-based daily PM2.5 measurements;

$MOD_i$: Model-based daily PM2.5 predictions;

$\overline{MON}$: Monitor-based PM2.5 average;

$\overline{MOD}$: Model-based PM2.5 average.

2. Kendall Tau-B Correlation Coefficient (τ): To calculate τ, the n(n-1)/2 pairs of data points are classified as concordant or discordant. A concordant pair is any pair for which the ranks of MON and MOD agree, i.e., for any pair of observations (MONi, MODi) and (MONj, MODj), both MONi > MONj and MODi > MODj or both MONi < MONj and MODi < MODj. A discordant pair is any pair of observations for which the ranks of MON and MOD disagree, i.e., either MONi > MONj and MODi < MODj or MONi < MONj and MODi > MODj. With C and D respectively denoting the number of concordant and discordant pairs (assuming no ties), the value of τ is then defined as

$$\tau = \frac{C - D}{n(n-1)/2} \qquad (4)$$

The denominator is adjusted accordingly in the event of ties.
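As an illustration of the two correlation measures, the following Python sketch computes r from its definition and τ by counting concordant and discordant pairs. It is a simplified sketch: the tie adjustment of the Tau-B denominator is not implemented, so the second function matches Tau-B only when there are no ties. The function names `pearson_r` and `kendall_tau_no_ties` are hypothetical.

```python
import math


def pearson_r(mon, mod):
    """Pearson correlation between monitor (mon) and model (mod) values."""
    n = len(mon)
    mon_bar = sum(mon) / n
    mod_bar = sum(mod) / n
    cov = sum((a - mon_bar) * (b - mod_bar) for a, b in zip(mon, mod))
    ss_mon = sum((a - mon_bar) ** 2 for a in mon)
    ss_mod = sum((b - mod_bar) ** 2 for b in mod)
    return cov / math.sqrt(ss_mon * ss_mod)


def kendall_tau_no_ties(mon, mod):
    """(C - D) / (n(n-1)/2); equals Tau-B only when the data contain no ties."""
    n = len(mon)
    concordant = 0
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Positive product: ranks agree; negative product: ranks disagree.
            sign = (mon[i] - mon[j]) * (mod[i] - mod[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

For example, with monitor values 1, 2, 3, 4 and model values 1, 3, 2, 4, five of the six pairs are concordant and one is discordant, giving τ = 4/6.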

3. Difference (D) for a grid cell k and day i is defined as

$$D_{ik} = MOD_{ik} - MON_{ik} \qquad (5)$$

4. Root mean squared deviation (RMSD) for a grid cell k is defined as

$$RMSD_k = \sqrt{\frac{1}{n}\sum_{i=1}^{n} D_{ik}^2} \qquad (6)$$

5. Relative accuracy (RA) for a grid cell k is defined as

√ ∑

(7)

$MOD_{ik}$ = model-based PM2.5 prediction for day i and county k;

$MON_{ik}$ = monitor-based PM2.5 measurement for day i and county k.
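The daily difference and the RMSD for one location can be sketched in Python as follows. The sketch assumes that the difference is taken as model minus monitor; the sign convention does not affect the RMSD. The function names `daily_differences` and `rmsd` are hypothetical.

```python
import math


def daily_differences(mod, mon):
    """Daily differences, assumed here to be model minus monitor."""
    return [m - o for m, o in zip(mod, mon)]


def rmsd(mod, mon):
    """Root mean squared deviation over all paired days at one location."""
    diffs = daily_differences(mod, mon)
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```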

6. A Bland-Altman plot is a scatter plot with $(x_k, y_k)$ points defined as

$$x_k = \frac{\overline{MON}_k + \overline{MOD}_k}{2} \qquad (8a)$$

$$y_k = \overline{MOD}_k - \overline{MON}_k \qquad (8b)$$

where:

$\overline{MON}_k$ = AQS-based annual average for county k;

$\overline{MOD}_k$ = annual average for county k from model-based estimates.

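The coordinates of a Bland-Altman plot can be sketched in Python as follows. The sketch follows the usual construction, with the mean of the two annual averages on the horizontal axis and their difference on the vertical axis; taking the difference as model minus monitor is an assumption, and the function name `bland_altman_points` is hypothetical.

```python
def bland_altman_points(mon_means, mod_means):
    """Return (mean, difference) pairs, one per county.

    mon_means: monitor-based annual averages, one per county.
    mod_means: model-based annual averages for the same counties.
    The difference is assumed to be model minus monitor.
    """
    return [
        ((mon + mod) / 2.0, mod - mon)
        for mon, mod in zip(mon_means, mod_means)
    ]
```

Points scattered around a difference of zero, with no trend along the horizontal axis, indicate agreement between the two sources across the range of concentrations.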

Figure D-1: Annual averages of PM2.5: (a) Monitor (AQS/SEARCH); (b) AOD; (c) CMAQ; (d) DS


Figure D-2: Description of the grid cell neighborhood around SEARCH monitors


Figure D-3: Comparison of model- and SEARCH-based PM2.5 concentrations, when measurements from nearby AQS monitor are present


Figure D-4: Comparison of model- and SEARCH-based PM2.5 concentrations, when measurements from nearby AQS monitor are absent


Figure D-5: Comparison of SEARCH- and nearby AQS-based PM2.5 measurements


APPENDIX E

SUPPLEMENTAL MATERIAL FOR CHAPTER 6

Figure E-1: U.S. climate regions


Table E-1: List of covariates and data sources

| Candidate predictor name | Description and source |
|---|---|
| Air conditioning (AC) prevalence | AC prevalence. Data from a private vendor, Efficiency 2.0. |
| Poverty status | Levels of poverty. Data from U.S. Census Bureau. |
| Smoking | Adult smoking prevalence. Data from CDC's Behavioral Risk Factor Surveillance System. |
| Uninsured population (under 65) | Percent of population (under 65) uninsured. Data from U.S. Census Bureau. |
| Population over 65 | Percent of population over 65. Data from U.S. Census Bureau. |
| Obesity | Adult obesity prevalence (percent of adults that report a BMI >= 30). Data from National Center for Chronic Disease Prevention and Health Promotion, Division of Diabetes Translation. |
| Physical inactivity | Physical inactivity (percent of adults that report no leisure time physical activity). Data from National Center for Chronic Disease Prevention and Health Promotion, Division of Diabetes Translation. |
| Diabetes | Diabetes prevalence. Data from National Center for Chronic Disease Prevention and Health Promotion, Division of Diabetes Translation. |
| Population density | Population density, calculated using population estimates from U.S. Census Bureau. |


Table E-2: Short-listed EHE definitions

| Cluster | Cluster common name | EHE definition name | Daily heat metric | Threshold type | Threshold value | Duration |
|---|---|---|---|---|---|---|
| 1 | Absolute temperature based thresholds | Daily maximum heat index greater than 90 °F for at least 3 consecutive days | HImax | Absolute | >90 °F | 3+ consecutive days |
| 2 | "Predominantly moderate" thresholds | Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days | Tmax and Tmin | Relative | >80th percentile | 3+ consecutive days |
| 3 | "Predominantly high" thresholds | Daily maximum temperature greater than 95th percentile for at least 2 consecutive days | Tmax | Relative | >95th percentile | 2+ consecutive days |
| 4 | "Predominantly extreme" thresholds | Huth definition | Tmax | Relative | T1: >97.5th percentile; T2: >81st percentile | Every day >T2, and 3+ consecutive days >T1, and average Tmax >T1 for the whole time period |
| 5 | Climate normal based thresholds | Daily mean temperature greater than mean + 1 SD of climate normal for at least 3 consecutive days | Tavg | Relative | >mean + 1 SD of climate normal | 3+ consecutive days |
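Several of the definitions in Table E-2 combine a threshold on a single daily heat metric with a minimum number of consecutive days. A minimal Python sketch of that run-length logic is given below. It assumes the threshold has already been resolved to a single number (an absolute value such as 90 °F, or a precomputed percentile), it does not cover the two-metric definition or the Huth definition, and the function name `flag_ehe_days` is hypothetical.

```python
def flag_ehe_days(daily_values, threshold, min_duration):
    """Flag days that belong to a qualifying run of exceedance days.

    daily_values: daily heat metric in chronological order (e.g., maximum heat index).
    threshold: value that must be exceeded (strictly greater than).
    min_duration: minimum number of consecutive exceedance days for an event.
    Returns a list of booleans, True for every day inside a qualifying run.
    """
    flags = [False] * len(daily_values)
    run_start = None
    # Iterate one step past the end so that a run ending on the last day is closed.
    for i in range(len(daily_values) + 1):
        exceeds = i < len(daily_values) and daily_values[i] > threshold
        if exceeds:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_duration:
                for k in range(run_start, i):
                    flags[k] = True
            run_start = None
    return flags
```

With a threshold of 90 and a minimum duration of 3, the series 88, 91, 92, 93, 89, 95, 96, 85 flags only the second through fourth days; the two-day run at the end is too short to qualify.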


Figure E-2: Schematic showing exposure offset indicators


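Table E-3: Rankings for the combination of EHE definitions and exposure offset indicators, by climate region

| EHE definition | Exposure offset type | Central | East North Central | North West Central | Northeast | South | Southeast | Southwest | West |
|---|---|---|---|---|---|---|---|---|---|
| Daily maximum heat index greater than 90 °F for at least 3 consecutive days | ExE1 | 10 | 7 | 6 | 10 | 14 | 14 | 4 | 7 |
| Daily maximum heat index greater than 90 °F for at least 3 consecutive days | ExE2 | 13 | 6 | 5 | 11 | 11 | 12 | 2 | 8 |
| Daily maximum heat index greater than 90 °F for at least 3 consecutive days | ExE3 | 14 | 9 | 8 | 14 | 12 | 4 | 1 | 9 |
| Daily maximum heat index greater than 90 °F for at least 3 consecutive days | Lag0 | 15 | 10 | 10 | 5 | 20 | 16 | 5 | 4 |
| Daily maximum heat index greater than 90 °F for at least 3 consecutive days | Lag1 | 5 | 3 | 2 | 6 | 15 | 13 | 3 | 5 |
| Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days | ExE1 | 6 | 12 | 11 | 9 | 19 | 9 | 7 | 13 |
| Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days | ExE2 | 7 | 13 | 12 | 13 | 16 | 7 | 8 | 14 |
| Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days | ExE3 | 9 | 15 | 15 | 15 | 18 | 6 | 12 | 15 |
| Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days | Lag0 | 4 | 14 | 14 | 3 | 13 | 5 | 9 | 12 |
| Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days | Lag1 | 2 | 11 | 9 | 8 | 17 | 10 | 6 | 11 |
| Daily maximum temperature greater than 95th percentile for at least 2 consecutive days | ExE1 | 3 | 2 | 4 | 2 | 4 | 2 | 11 | 3 |
| Daily maximum temperature greater than 95th percentile for at least 2 consecutive days | ExE2 | 8 | 4 | 3 | 7 | 6 | 8 | 15 | 6 |
| Daily maximum temperature greater than 95th percentile for at least 2 consecutive days | ExE3 | 12 | 5 | 7 | 12 | 7 | 11 | 13 | 10 |
| Daily maximum temperature greater than 95th percentile for at least 2 consecutive days | Lag0 | 11 | 8 | 13 | 1 | 3 | 1 | 14 | 2 |
| Daily maximum temperature greater than 95th percentile for at least 2 consecutive days | Lag1 | 1 | 1 | 1 | 4 | 2 | 3 | 10 | 1 |
| Huth definition | ExE1 | 22 | 20 | 22 | 23 | 5 | 25 | 22 | 23 |
| Huth definition | ExE2 | 24 | 21 | 23 | 24 | 8 | 23 | 23 | 19 |
| Huth definition | ExE3 | 25 | 25 | 24 | 25 | 10 | 24 | 24 | 22 |
| Huth definition | Lag0 | 23 | 23 | 25 | 21 | 9 | 22 | 25 | 24 |
| Huth definition | Lag1 | 21 | 16 | 21 | 22 | 1 | 21 | 21 | 25 |
| Daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days | ExE1 | 17 | 19 | 17 | 17 | 24 | 18 | 17 | 17 |
| Daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days | ExE2 | 19 | 18 | 18 | 19 | 22 | 19 | 19 | 20 |
| Daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days | ExE3 | 20 | 22 | 20 | 20 | 25 | 20 | 18 | 21 |
| Daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days | Lag0 | 18 | 24 | 19 | 16 | 23 | 17 | 20 | 18 |
| Daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days | Lag1 | 16 | 17 | 16 | 18 | 21 | 15 | 16 | 16 |

Legend: Rank 1, Rank 2, Rank 3, Rank 4, Rank 5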


Figure E-3: Unit lifetime work lost costs in millions ($) by age and gender


Figure E-4: Air pollution levels on EHE and non-EHE days for definition 1 (Daily maximum heat index greater than 90 ℉ for at least 3 consecutive days)


Figure E-5: Air pollution levels on EHE and non-EHE days for definition 2 (Daily maximum and minimum temperature greater than 80th percentile for at least 3 consecutive days)


Figure E-6: Air pollution levels on EHE and non-EHE days for definition 3 (Daily maximum temperature greater than 95th percentile for at least 2 consecutive days)


Figure E-7: Air pollution levels on EHE and non-EHE days for definition 4 (Huth definition)


Figure E-8: Air pollution levels on EHE and non-EHE days for definition 5 (Daily mean temperature greater than mean + 1 standard deviation (SD) of climate normal for at least 3 consecutive days)


Table E-4: Mean (5th and 95th percentile) levels of demographic and social variables, by U.S. climate region

Each cell shows the mean (5th, 95th percentiles).

| Demographic and social variable | Central | East North Central | North West Central | Northeast | South | Southeast | Southwest | West |
|---|---|---|---|---|---|---|---|---|
| Air conditioning prevalence (%) | 57 (33, 80) | 40 (17, 63) | 22 (5, 60) | 13 (1, 48) | 72 (53, 85) | 83 (60, 94) | 23 (8, 70) | 36 (6, 72) |
| Diabetes prevalence (%) | 10 (7, 13) | 9 (6, 12) | 9 (6, 12) | 9 (7, 11) | 10 (7, 13) | 11 (8, 15) | 7 (5, 10) | 8 (7, 10) |
| Obesity prevalence (body mass index >= 30) (%) | 31 (27, 37) | 30 (24, 36) | 28 (23, 34) | 26 (19, 32) | 31 (27, 36) | 29 (24, 36) | 23 (16, 32) | 25 (20, 31) |
| Percent of Hispanic population (%) | 4 (1, 13) | 5 (1, 13) | 8 (1, 30) | 7 (1, 21) | 20 (2, 67) | 9 (2, 25) | 23 (8, 48) | 31 (10, 55) |
| Percent of adult smokers (%) | 22 (15, 30) | 20 (12, 27) | 18 (12, 26) | 19 (13, 26) | 20 (13, 28) | 19 (13, 26) | 18 (10, 24) | 15 (10, 24) |
| Percent of adults that report no leisure time physical activity (%) | 28 (23, 34) | 23 (18, 30) | 24 (16, 32) | 24 (18, 30) | 28 (22, 35) | 26 (19, 35) | 21 (15, 27) | 19 (14, 26) |
| Percent of population in poverty (%) | 17 (10, 24) | 15 (10, 22) | 15 (10, 24) | 13 (7, 19) | 18 (11, 27) | 18 (8, 27) | 17 (10, 24) | 16 (9, 23) |
| Percent of population over 65 (%) | 14 (10, 18) | 15 (10, 21) | 15 (10, 23) | 15 (11, 19) | 13 (8, 23) | 14 (9, 24) | 14 (9, 24) | 13 (9, 20) |
| Percent of population under 65 uninsured (%) | 16 (11, 20) | 12 (9, 17) | 18 (11, 26) | 11 (5, 17) | 23 (15, 31) | 20 (13, 27) | 21 (16, 26) | 20 (13, 26) |
| Population density (population per sq. mile) | 576 (44, 2,193) | 368 (12, 1,886) | 101 (1, 482) | 2,341 (30, 10,417) | 323 (4, 1,286) | 653 (80, 1,784) | 191 (2, 718) | 521 (1, 2,388) |


Figure E-9: E-R relationship between mortality and EHEs for Southeast based on different top 10 EHE definitions


Figure E-10: E-R relationship between mortality and EHEs for West based on different top 10 EHE definitions
