Statistics and Public Policy

ISSN: (Print) 2330-443X (Online) Journal homepage: http://www.tandfonline.com/loi/uspp20

Superfund Locations and Potential Associations with Cancer Incidence in Florida

Alexander Kirpich & Emily Leary

To cite this article: Alexander Kirpich & Emily Leary (2017) Superfund Locations and Potential Associations with Cancer Incidence in Florida, Statistics and Public Policy, 4:1, 1-9, DOI: 10.1080/2330443X.2016.1267599

To link to this article: http://dx.doi.org/10.1080/2330443X.2016.1267599

© 2017 The Author(s). Published with License by American Statistical Association. © Alexander Kirpich and Emily Leary


Accepted author version posted online: 05 Dec 2016. Published online: 05 Dec 2016.




Alexander Kirpich (a) and Emily Leary (b)

(a) Department of Molecular Genetics & Microbiology, College of Medicine, University of Florida, Gainesville, FL; (b) School of Medicine, University of Missouri, Columbia, MO

ABSTRACT

Uncontrolled hazardous waste sites have the potential to adversely impact human health and damage or disrupt ecological systems and the greater environment. Four decades have passed since the Superfund law was enacted, allowing increased exposure time to these potential health hazards while also allowing advancement of analysis techniques. Florida has the sixth highest number of Superfund sites in the US and, in 2016, Florida was projected to have the second largest number of new cancer cases in the US. We explore statewide cancer incidence in Florida from 1986 to 2010 to determine if differences or associations exist in counties containing Superfund sites compared to counties that do not. To investigate potential environmental associations with cancer incidence, results using spatial and nonspatial mixed models were compared. Using a Poisson–Gamma mixture model, our results provide some evidence of an association between cancer incidence rates and Superfund site hazard levels, as well as proxy measures of water contamination around Superfund sites. In addition, results build upon previously observed gender differences in cancer incidence rates and further indicate spatial differences for cancer incidence. Heterogeneity among cancer incidence rates was observed across Florida with some mild association with Superfund exposure proxies.

ARTICLE HISTORY

Received January; Accepted November

KEYWORDS

Cancer; Cancer incidence; Geographic information system (GIS) models; Mapping; Superfund

CONTACT Alexander Kirpich, akirpich@ufl.edu, Department of Molecular Genetics & Microbiology, College of Medicine, University of Florida, C Cancer & Genetics Research Complex, Mowry Road, Gainesville, FL. The authors contributed equally to this manuscript.

© Alexander Kirpich and Emily Leary. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/./), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The moral rights of the named author(s) have been asserted.

Introduction

In the past 20 years, spatial analysis techniques have greatly advanced. The software capabilities for these techniques have greatly expanded as well. These new analytic tools allow more precise estimation methods and analyses that produce higher quality modeling results. In particular, these advanced techniques can be used to improve upon the existing body of literature that investigates associations between cancer incidence and environmental exposures. Most of the previous research investigating these relationships was conducted in the 1980s and early 1990s. Only a few recent studies have further explored these relationships, none of which have investigated nonpediatric cancer incidence. In addition, a longer time period of recorded cancer incidence and exposure data allows better characterization of exposures for the known long latency period prior to cancer incidence (Nadler 2014). Additional historical data and current advanced analysis techniques available to researchers provide an opportunity to restudy these relationships and consider trends over a longer time period.

When considering adverse environmental exposures, “Superfund” is probably one of the most recognized terms to describe the scale and extent of these exposure types. On December 11, 1980, U.S. President Jimmy Carter signed into law the Comprehensive Environmental Response, Compensation and Liability Act (CERCLA), known informally as Superfund. Once enacted, the Superfund law provided the U.S. Environmental Protection Agency (EPA) the legal authority to conduct clean-up efforts for uncontrolled sites and spills (EPA 2014). According to the EPA (2014), these sites include those “abandoned, accidentally spilled, and illegally dumped hazardous wastes that are determined to pose current or future threats to humans or the environment.” Long-term remediation efforts are also authorized by the Superfund law and conducted in areas where hazardous waste release occurred “through years of inadequate or illegal waste management” (EPA 2014).

Multiple environmental factors contribute to cancer risk, including indoor and outdoor air pollution, soil contamination, and drinking water contamination (Boffetta and Nyberg 2003). In contrast to the large amount of work on air pollution and health risk, particularly lung cancer (Raaschou-Nielsen et al. 2013; Hamra et al. 2014), our focus for this study is on soil contamination and drinking water contamination exposures, common avenues of exposure for hazardous wastes, using Superfund sites as the environmental factor of interest. A comprehensive review of the literature on implications of hazardous waste was published (Johnson and DeRosa 1997) and concluded that in spite of inherent study limitations, some studies have detected associations between cancer and exposure to materials similar to those that are found at hazardous waste sites. Older studies from the 1980s found increased frequency of gastrointestinal, esophageal, stomach, colon, lung, bladder, large intestine, and rectal cancers in counties containing hazardous waste sites (Najem et al. 1983; Budnick et al. 1984; Griffith et al. 1989). More current studies have continued to explore cancer incidence or other health impacts of hazardous waste exposure on the national or local level (Johnson 1995; Davies 2006; Florida Department of Health 2012a). Statewide cancer impacts and hazardous waste site exposure in Florida have previously been explored in one study, which focused on childhood cancers; results indicated some evidence of spatial clustering focused within 5–10 miles (8–16 km) of Superfund sites and not in areas in closer proximity to the Superfund site, although health data were aggregated to the geographic centroid of each census tract and may not reflect the real cluster location (Kearney 2008). A series of studies have been published that explore the pediatric cancer incidence in Florida and potential spatial clustering (Heaton 2014; Amin et al. 2014; Wang and Rodríguez 2014; Lawson and Rotejanaprasert 2014; Zhang, Lim, and Maiti 2014). For this series of studies, data were obtained from the Florida Association of Pediatric Tumor Program (FAPTR) and explored only pediatric onset cancers. Results indicate evidence of spatial clustering of pediatric cancer incidence; however, different areas were identified as potential clusters across studies. Further, in Heaton (2014), spatial differences in pediatric cancer incidence were observed by race, but this was not observed in the other studies in this series.

Attributing pediatric cancers to environmental causes is difficult because pediatric cancer is rare but also because the causal linkage between environmental exposure and pediatric cancer is unclear (National Cancer Institute 2016). However, environmental causes can be considered for adult cancers because the older age supports the longer latency needed. Nadler and Zurbenko (2014) provided a nice description of the approximate latency period from cancer initiation to diagnosis for 34 cancer types. Cancer latency periods range from 2.2 years for chronic lymphocytic leukemia to 56.8 years for ascending colon cancer. Unfortunately, because location history is problematic to obtain, inference is still difficult.

Uncontrolled hazardous waste sites have the potential to adversely impact human health and damage or disrupt ecological systems and the greater environment. Specifically, improperly stored or damaged containers of hazardous wastes can contribute to overall toxic exposures through the food chain or through intakes as part of a water supply for human or agricultural consumption (Hendryx et al. 2012). The reasons for identifying potential Superfund sites are those designated by the Superfund law as (i) either causing or contributing to an increase in mortality or irreversible or incapacitating illness, (ii) posing a substantial present or potential hazard to human health, or (iii) when materials hazardous to humans or to the environment were improperly treated, stored, transported, disposed of, or otherwise managed (TOXMAP 2012a). More than 800 materials are designated as hazardous and nearly 100 additional materials are identified as potentially hazardous (TOXMAP 2012b, 2012c).

Four decades have passed since the Superfund law was enacted, which has allowed for more historical data to be collected regarding these potential health hazards. During this same time frame, analysis techniques, particularly geographic information system (GIS) techniques, have advanced and allow a more detailed investigation of these relationships. In this study, the association between all-cause cancer incidence in Florida and locations of Superfund sites is explored, using proxy measures of exposure to hazardous chemicals and various demographic variables as predictors.

Methods

Data

A publicly available information database for cancer incidence in Florida, called the STAT CD, was requested from the Florida Cancer Data System (FCDS); the available demographic covariates were age, sex, and county of residence (FCDS (Florida Cancer Data System) 2012). Because the cancer incidence records in these data were overwhelmingly white and non-Hispanic (>90% for both), these analyses did not separate out race and ethnicity demographics. Pediatric cancers were excluded from analysis because of the likely nonenvironmental cause; it is widely believed that children would not have the needed prolonged exposure to environmental toxins or chemicals to cause environmentally induced cancers (National Cancer Institute 2016). Age at the time of cancer incidence was categorized into groups, while, due to identifiability concerns, diagnosis year was grouped into 5-year intervals for the years between 1986 and 2010, inclusive. Due to privacy concerns and limited data availability, only county-level information was considered in this manuscript. University of Florida Health Science Center and University of Missouri Institutional Review Board approval was obtained and nearly 2.4 million records were used for analysis (Tables 1 and 2).

Table 1. Raw total, all-cause cancer incidence counts for Florida from 1986 to 2010, in 5-year intervals, and all-cause cancer incidence rates using population totals averaged across the 5-year intervals.
[Rows: Year, Frequency, Rate, with one column per 5-year interval; the numeric entries were not preserved.]

Table 2. Raw total, all-cause cancer incidence counts by age grouping and gender, 1986–2010. Total Rate refers to the rate per [...].
[Columns: nine age groups in years, from the youngest ("<") to the oldest ("+"), and Total. Rows: Male, Female, Total, Total Rate; the numeric entries were not preserved.]

We adjust the raw, total, all-cause cancer incidence counts to account for the total Florida population. There are two main methods of adjustment: the direct method and the indirect method (Naing 2000). The direct method of adjustment applies the stratum-specific rates (here, by 5-year interval or age group) observed in the population to a reference or standard population, here the 2000 US Census numbers for comparison to the Florida Cancer Data System rates (Florida Cancer Data System 2016). The direct method for adjustment results in the number of expected cancers in the reference population. The indirect method for adjustment is similar but observed rates are compared to the populations of interest, to result in the observed number of deaths in each population of interest. The direct method was used here to compare rates easily to results from the Florida Cancer Data System (Table 3).

Table 3. Direct adjusted cancer incidence rates per [...] people, adjusted by age and gender, for each Florida county.
[Counties listed; the rates were not preserved: Alachua, Baker, Bay, Bradford, Brevard, Broward, Calhoun, Charlotte, Citrus, Clay, Collier, Columbia, Desoto, Dixie, Duval, Escambia, Flagler, Franklin, Gadsden, Gilchrist, Glades, Gulf, Hamilton, Hardee, Hendry, Hernando, Highlands, Hillsborough, Holmes, Indian River, Jackson, Jefferson, Lafayette, Lake, Lee, Leon, Levy, Liberty, Madison, Manatee, Marion, Martin, Miami-Dade, Monroe, Nassau, Okaloosa, Okeechobee, Orange, Osceola, Palm Beach, Pasco, Pinellas, Polk, Putnam, Santa Rosa, Sarasota, Seminole, St. Johns, St. Lucie, Sumter, Suwannee, Taylor, Union, Volusia, Wakulla, Walton, Washington.]

We aggregated data for Florida Superfund sites and hazards (TOXMAP 2012a), Florida shape files for GIS techniques (U.S. Census Bureau 2008), socioeconomic status data (Florida CHARTS 2015), census data (U.S. Census Bureau 2008; Florida Department of Economic and Demographic Research 2016), and data from water reservoirs (ESRI 2012).

Before any hazardous waste site can be classified as a Superfund site, it must first be placed on the National Priorities List (NPL) in the U.S. Federal Register. To be placed on the NPL, sites contaminated by hazardous wastes must be proposed by the EPA, the state in which the site is located, or concerned citizens. Once placed on the NPL, site status is designated as “Proposed,” “Deleted,” or “Final.” The EPA identifies proposed sites as candidates for cleanup activities because they pose a risk to human health and/or the environment. Deleted sites are those that have been deleted from the NPL by the EPA, with state concurrence. Sites are given deleted status when cleanup goals have been met and no further hazardous waste response is necessary. Final sites are those determined to pose a real or potential threat to human health and the environment. Final status is received after completion of the hazardous ranking score screening (a method that triages hazardous waste sites) and public solicitation of comments about the proposed site. For these analyses, the number of proposed and final Superfund sites in each county was compiled and included as one exposure variable to indicate a higher hazard for those that have not been remediated. In addition, the number of deleted Superfund sites in each county was also included, as a method to determine if there may be an association with cancer incidence and either previous exposure or continued exposure after remediation. We assume that possible confounding influences that contribute to cancer incidence are relatively homogenous across Florida.

Seventy-seven Superfund sites with a final, proposed, or deleted status are located in the state of Florida in 22 different counties (Figure 1a). The EPA defines a hazardous ranking score (HRS) using a numerical value from 0 to 100, with larger values indicating a larger potential health hazard to humans or the environment. The HRS is assigned by the EPA to each Superfund site, regardless of site status, and characterizes the combination of the potential hazard of the location, the possibility that a site has released or has the potential to release hazardous substances into the environment, characteristics of the waste (e.g., toxicity and waste quantity), and people or sensitive environments that could be affected by a release. Exposures to hazardous wastes for each Florida county were assigned using ordinary kriging of the HRS. Ordinary kriging (OK) is a method that is widely used for prediction of geostatistical data and assumes a constant mean in the neighborhood of each prediction. Values are predicted based on the known values at collected points. The reader is referred to supplementary materials for more details on OK. Ordinary kriging produces spatial predictions of the HRS and these predictions were located at each county’s centroid. For the spatial models, these spatial predictions were used as the HRS value for each county (Figure 1a). For nonspatial models, total county-level HRS was computed by summing each HRS score for all Superfund sites located in a county. Both of these measures can be considered a proxy for the level of hazard associated with Superfund site(s) within and around each county.

Figure 1. (a) Locations of Superfund sites in Florida with the spatial prediction for county-level hazardous ranking score (HRS). (b) Locations of surface water reservoirs and National Weather Service (NWS) regions in Florida.

An important factor when considering hazardous waste and toxic materials is their ability to travel from the original waste site; one such method of travel is through water sources (EPA 1992). Because of the higher potential for hazardous wastes in Florida to travel through water (due to Florida’s swampy, wet environment), the proportion of surface water contained within a 1 km area around each Superfund site was calculated and rescaled. As shown in Figure 1b, most Superfund sites (black dots) in Florida have some type of surface water reservoir located nearby (blue areas). For Superfund sites near county lines, this proportion of surface water was considered part of the county in which the Superfund site was located. The aggregated rescaled proportion for all Superfund sites located within a county’s borders was used as the proxy for water exposure of hazardous wastes for the entire county.

Nonspatial Models

Cancer incidence counts were modeled using a Poisson–Gamma mixture model. The Poisson–Gamma mixture model is equivalent to the negative binomial generalized linear model and allows a more flexible model fit because it accounts for overdispersion. The negative binomial model is formulated as a mixture of the Poisson and gamma distributions and has the following form:

$$Y \mid Z \sim \text{Poisson}(\mu Z), \qquad Z \sim \text{Gamma}(\alpha, 1/\alpha) \text{ with density } f(z \mid \alpha, 1/\alpha) = \frac{\alpha^{\alpha}}{\Gamma(\alpha)} z^{\alpha-1} e^{-\alpha z},$$

where Y is the response of interest that can either be count data or rate data. Models with count data must include an offset term with total population counts; this ensures that the counts are scaled to the total counts. To ensure the required flexibility for comparison to the standard Poisson model, Z is assumed to have a gamma distribution as previously defined.

As defined, the marginal distribution of Y has the probability mass function:

$$p(y \mid \mu, \alpha) = \frac{\Gamma(y+\alpha)}{y!\,\Gamma(\alpha)} \left(\frac{\mu}{\mu+\alpha}\right)^{y} \left(\frac{\alpha}{\mu+\alpha}\right)^{\alpha}, \qquad y = 0, 1, 2, 3, \ldots \quad (1)$$

For integer values α = m, the mass function (1) simplifies to the negative binomial mass function:

$$p(y \mid \mu, m) = \binom{m+y-1}{m-1} (1-p)^{y} p^{m}, \qquad y = 0, 1, 2, 3, \ldots,$$

where p = m/(μ + m).

For fixed m (or α), the model follows the standard generalized linear equation framework with the link function, g(E[Y]). The linear predictor has the form η = g(E[Y]) = $X^T\beta$, where $X^T$ is the transposed vector of predictors and β is the vector of model coefficients. For arbitrary α, the mean and variance have the form:

$$E[Y] = \mu \quad \text{and} \quad \text{var}[Y] = \mu\left(1 + \frac{\mu}{\alpha}\right).$$

This model was fit (see the supplementary materials) using a population offset from the 2000 U.S. Census (U.S. Census Bureau), and thus all interpretations will be in terms of cancer incidence rates. Using the 2000 U.S. Census numbers, rather than the 2010 U.S. Census numbers, was intended to provide uniform comparison to the Florida Cancer Data System rates, which use the 2000 U.S. Census numbers (Florida Cancer Data System 2016). For the nonspatial analysis, the 67 Florida counties could not be used as a spatial proxy: including county as an indicator variable produced model instability because of the large number of parameters. However, the National Weather Service (NWS) defines seven different weather regions within Florida, which are defined using county boundaries within Florida (Figure 1b).
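The equivalence between the Poisson–Gamma mixture and the closed-form mass function (1) can be checked numerically. The sketch below is only an illustration under stated assumptions: the values of `mu` and `alpha` are hypothetical (they are not estimates from this study), and the helper names `nb_pmf` and `mixture_pmf` are introduced here for the example. It integrates the Poisson mass function against the gamma density with a simple midpoint rule and compares the result with Equation (1), then recovers the mean and variance from the mass function.

```python
import math

def nb_pmf(y, mu, alpha):
    """Marginal pmf from Equation (1), computed on the log scale for stability."""
    log_p = (math.lgamma(y + alpha) - math.lgamma(alpha) - math.lgamma(y + 1)
             + y * math.log(mu / (mu + alpha))
             + alpha * math.log(alpha / (mu + alpha)))
    return math.exp(log_p)

def mixture_pmf(y, mu, alpha, upper=30.0, steps=40000):
    """Midpoint-rule integral of Poisson(y | mu*z) times the Gamma(alpha, 1/alpha) density.

    `upper` must be large enough to cover essentially all of the gamma mass.
    """
    h = upper / steps
    total = 0.0
    for k in range(steps):
        z = (k + 0.5) * h
        log_poisson = y * math.log(mu * z) - mu * z - math.lgamma(y + 1)
        log_gamma = (alpha * math.log(alpha) - math.lgamma(alpha)
                     + (alpha - 1.0) * math.log(z) - alpha * z)
        total += math.exp(log_poisson + log_gamma)
    return total * h

mu, alpha = 4.0, 2.0  # hypothetical values, not estimates from the paper

# The closed form and the explicit mixture should agree for each count y.
for y in range(4):
    print(y, nb_pmf(y, mu, alpha), mixture_pmf(y, mu, alpha))

# Moments implied by the pmf (the sum is truncated where the tail is negligible).
support = range(400)
pmf = [nb_pmf(y, mu, alpha) for y in support]
mean = sum(y * p for y, p in zip(support, pmf))
var = sum((y - mean) ** 2 * p for y, p in zip(support, pmf))
print(mean, var, mu * (1.0 + mu / alpha))
```

The last line prints the mean and variance computed from the mass function next to μ(1 + μ/α), which makes the overdispersion relative to a Poisson model (where the variance would equal μ) easy to see.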

These NWS regions were used as spatial proxies in all nonspatial analyses and are composed of the Jacksonville region (JAX), Keys region (KEY), Miami region (MFL), Melbourne region (MLB), Mobile region (MOB), Tallahassee region (TAE), and the Tampa Bay region (TBW). The summary information for Superfund site locations by NWS region is provided in Table 4. Categorical covariates included in the nonspatial model were age group, gender, diagnosis period, presence of Superfund indicator, and NWS region. Continuous predictors and covariates were median income, the number of proposed and final Superfund sites, number of deleted Superfund sites, total county-level HRS, and county-level water exposure. Median income was used to control the potential confounding from socioeconomic status. Demographic covariates, such as gender and age group, were included to account for the known differences in cancer incidence by gender and age (e.g., cancer incidence increases for older age groups). Diagnosis period was also included in the analysis as it can be considered a measure of latency or proxy for medical care during this large time frame.

Spatial Models

A spatial simultaneous autoregressive error model (SAR Error Model) was used for spatial modeling. These model equations had the form:

$$Y = X\beta + u \quad \text{where} \quad u = \lambda W u + \epsilon.$$

Or equivalently,

$$(I - \lambda W)u = \epsilon \quad \text{where} \quad E[\epsilon] = 0 \text{ and } \text{var}[\epsilon] = \sigma^2 I.$$

Here, W was the matrix of the boundary weights, Y was the response vector, X is the matrix of predictors, I is the identity matrix, and ε is the vector of normally distributed errors. The parameters λ and β were estimated (see supplementary materials). The county-level spatial structure for Florida was described using a boundary matrix W of dimension 67 by 67, representing the number of Florida counties, using two different types of spatial weights to represent the spatial relationships between counties: (1) binary 0-1 queen weights and (2) standardized queen weights. Binary 0-1 queen weights for a given county are stored in a single row of the weights matrix. The weight value is equal to 1 if counties have at least one shared boundary point and 0 otherwise. Standardized queen weights are obtained from binary 0-1 queen weights by dividing each element in the row by the sum of all elements in that row. The individual elements of W are denoted by $w_{ij}$ for i = 1, 2, ..., N and j = 1, 2, ..., N, where N = 67 corresponds to the number of counties in Florida. Since spatial models are constructed so that only a single rate can be used for each spatial element (here, county), age and diagnosis year were included in these spatial models by adjusting the cancer incidence rates for each county by both age and diagnosis year, using US Census Data for Florida from 2000 as previously discussed. County-level exposure variables included in the models were the ordinary kriged HRS, county-level water exposure, the number of proposed and final Superfund sites for the given county, and the number of deleted Superfund sites.

To investigate spatial heterogeneity, the Global Moran’s I statistic was calculated for the spatial models to determine the presence of any spatial autocorrelation (spatial relationship) between the cancer incidence rate and county. Global Moran’s I is sensitive to global spatial autocorrelation and has the form:

$$I = \frac{N \sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}\,(X_i - \bar{X})(X_j - \bar{X})}{\left(\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}\right) \sum_{i=1}^{N} (X_i - \bar{X})^2}.$$

Here, $X_i$ are the values for the spatial objects and $w_{ij}$ are the weights from the 67 × 67 weight matrix, W, for i = 1, 2, ..., 67 and j = 1, 2, ..., 67. The value $\bar{X}$ corresponds to the average value of $X_i$ across all spatial elements, that is, $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$. The reader is referred to supplementary materials for more details on the Global Moran’s I statistic.

Although similar to the Global Moran’s I, Geary’s C statistic was also used to focus on relationships at the local level. The statistic has the form

$$C = \frac{(N-1) \sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}\,(X_i - X_j)^2}{2 \left(\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}\right) \sum_{i=1}^{N} (X_i - \bar{X})^2}.$$

The reader is referred to supplementary materials for more details on Geary’s C statistic.

To further investigate the observed spatial relationships, local Getis-Ord general G and local Moran’s I tests were conducted. Local Getis-Ord $G_i$ statistics are used to identify clusters with high and low values. Local Getis-Ord $G_i$ statistics for each spatial object i have the form:

$$G_i = \frac{\sum_{j=1}^{N} w_{ij} X_j - \bar{X} \sum_{j=1}^{N} w_{ij}}{\sqrt{\frac{\sum_{j=1}^{N} X_j^2}{N} - \bar{X}^2}\;\sqrt{\frac{N \sum_{j=1}^{N} w_{ij}^2 - \left(\sum_{j=1}^{N} w_{ij}\right)^2}{N-1}}}.$$

Anselin Local Moran’s $I_i$ statistics were introduced to detect local spatial autocorrelation. The formula for each spatial object i has the form:

$$I_i = \frac{X_i - \bar{X}}{\frac{1}{N-1} \sum_{j=1,\, j \neq i}^{N} (X_j - \bar{X})^2} \sum_{j=1,\, j \neq i}^{N} w_{ij}\,(X_j - \bar{X}).$$

The local Getis-Ord general G tests for local clustering of cancer incidence rates will indicate any difference from expected cancer incidence rates. In addition, the Local Moran’s I statistic helps to identify if a county has similarly high or low cancer incidence rates with the surrounding counties and can be helpful to indicate potential clusters of either high or low cancer incidence rates. Used together with Getis-Ord general G, results can identify potential clusters and cancer incidence rates that could be considered “outliers” compared to rates in surrounding counties. The reader is referred to the supplementary materials for more details on Local Getis-Ord $G_i$ and Anselin Local Moran’s $I_i$ statistics.

Results

Results from both the spatial and nonspatial models indicated that spatial differences are observed in adult cancer incidence in Florida from 1986 to 2010. Further, nonspatial models indicated that increased county-level HRS, county-level water exposure for hazardous wastes, and being male were associated with increased rates for adult cancer incidence.
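The two global statistics defined in the Methods section on spatial models, together with the two kinds of queen weights, can be sketched in a few lines. This is an illustrative sketch only: a small rectangular grid of cells stands in for the 67 Florida counties, the values in `x` are made up, and the function names are introduced here for the example. It does not reproduce the software or settings used in the study.

```python
def queen_weights(n_rows, n_cols):
    """Binary 0-1 queen contiguity for the cells of a grid (a stand-in for counties)."""
    n = n_rows * n_cols
    w = [[0.0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    # Neighbors share an edge or a corner; a cell is not its own neighbor.
                    if (dr or dc) and 0 <= rr < n_rows and 0 <= cc < n_cols:
                        w[i][rr * n_cols + cc] = 1.0
    return w

def row_standardize(w):
    """Standardized queen weights: divide each row by its row sum."""
    return [[v / sum(row) for v in row] for row in w]

def morans_i(x, w):
    """Global Moran's I as defined in the text."""
    n = len(x)
    xbar = sum(x) / n
    dev = [xi - xbar for xi in x]
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    s0 = sum(sum(row) for row in w)
    return n * num / (s0 * sum(d * d for d in dev))

def gearys_c(x, w):
    """Geary's C as defined in the text."""
    n = len(x)
    xbar = sum(x) / n
    num = sum(w[i][j] * (x[i] - x[j]) ** 2 for i in range(n) for j in range(n))
    s0 = sum(sum(row) for row in w)
    return (n - 1) * num / (2.0 * s0 * sum((xi - xbar) ** 2 for xi in x))

# Made-up values with a north-south gradient on a 4 x 4 grid.
x = [float(r) for r in range(4) for _ in range(4)]
w_binary = queen_weights(4, 4)
w_standardized = row_standardize(w_binary)

for label, w in (("binary", w_binary), ("standardized", w_standardized)):
    print(label, morans_i(x, w), gearys_c(x, w))
```

For a smooth gradient like this one, similar values sit next to each other, so Moran's I is positive and Geary's C falls below 1 under both weighting schemes; values of I near 0 and C near 1 would instead suggest no spatial autocorrelation.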

Table . Superfund status by NWS region and aggregated HRS.

Frequency of Superfund status by NWS regions Site status JAX KEY MFL MLB MOB TAE TBW

Proposed Final        Deleted   All        Combined HRS by Superfund status and NWS regions

JAX KEY MFL MLB MOB TAE TBW

Proposed .      . Final .  . . . . . Deleted .  .  . . . All .  . . . . .

Table . The results of the Poisson-Gamma mixture model fit.

Coefficient name Estimate Std. error % confidence intervals p-value Sig.

Intercept − . . − . − . < . ∗∗∗ Hazardous Ranking Score . . . . . ∗∗ Water km Buffer . .  . . ∗ Proposed and Final − . . − . − . . ∗∗ Deleted − . . − . . . Gender (Female) − . . − . . . . – years old . . . . < . ∗∗∗ – years old . . . . < . ∗∗∗ – years old . . . . < . ∗∗∗ – years old . . . . < . ∗∗∗ – years old . . . . < . ∗∗∗ – years old . . . . < . ∗∗∗ + years old . . . . < . ∗∗∗ Indicator of Superfund Site . . . . < . ∗∗∗ < ∗∗∗

Diagnosis – . . . . . Diagnosis – . . . . . ∗ Diagnosis – . . − . . . Diagnosis – − . . − . . . JAX region . . . . < . ∗∗∗ KEY region . . . . < . ∗∗∗ MFL region . . − . . . . MLB region . . . . < . ∗∗∗ MOB region . . . . . ∗ TBW region . . . . < . ∗∗∗ Median Income . . − . . . .

Nonspatial Models (2008), most Superfund sites near a cancer cluster were located within 5–10 miles rather than within 0–1 mile; perhaps that The nonspatial model fit results are provided in Table 5.Inthe study, as well as this study, are observing effects of the spatial nonspatial Poisson–Gamma mixture model using NWS regions misalignment problem–data are not measured at the same spa- as spatial proxies, each age group category was observed to be tial level and the misalignment propagates error throughout the significantly associated with cancer incidencep ( < 0.010). This analysis. significance reflects the higher cancer incidence observed as In addition, evidence of extreme spatial heterogeneity among a population ages. Being female was inversely associated with cancer incidence rates was observed between the seven NWS cancer incidence, with p value equal to 0.0508, indicating a geographic regions. Because of the larger size and few Superfund potential protective effect against cancer incidence for females, sites, the Tallahassee region (TAE) was used as the reference or reduced exposure. Of the exposure variables, county-level region for all nonspatial analyses. Of the remaining six regions, water exposure for hazardous wastes and total county-level all other regions demonstrated strong positive association with HRS were positively associated with cancer incidence (p = increased cancer incidence rates in comparison to Tallahassee 0.028 and p = 0.009, respectively). The number of proposed region (TAE) region (p < 0.05), with the exception of the Miami and final Superfund sites were also significantly associated region (MFL). The strongest associations were observed in the with cancer incidence (p = 0.0025), indicating that counties Jacksonville region (JAX), the Melbourne region (MLB), and the with a greater number of proposed and final sites were related Tampa Bay region (TBW). These relationships likely reflect that to increasing cancer incidence. 
However, the direct opposite these regions contain the most Superfund sites. The covariate for relationship that counties with fewer Superfund sites—of all median income was on the boundary of significance (p = 0.060). types—also related to increased cancer incidence. In Kearney STATISTICS AND PUBLIC POLICY 7
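Because the model is fit with a population offset, coefficients such as those summarized in Table 5 are usually read as rate ratios. The sketch below assumes a log link, which is the common choice for a negative binomial model with an offset but is not stated explicitly in the text. The coefficient and standard error are hypothetical placeholders rather than values from Table 5, and `rate_ratio` is a helper name introduced here for the example.

```python
import math

def rate_ratio(beta, se, z=1.96):
    """Return (rate ratio, lower, upper) for a log-link coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error, chosen only for illustration.
rr, lower, upper = rate_ratio(beta=0.05, se=0.02)
print(f"rate ratio {rr:.3f}, approximate 95% interval ({lower:.3f}, {upper:.3f})")
```

Under that assumption, a positive coefficient corresponds to a rate ratio above 1 (a higher incidence rate per unit increase in the covariate, or relative to the reference category), and an interval that excludes 1 corresponds to a coefficient interval that excludes 0.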

Spatial Models

Spatial models were constructed separately for each gender and similar results were obtained using each type of spatial weight. However, no evidence of a significant association between environmental exposures and cancer incidence rates was observed.

Because of the observed regional differences using the nonspatial models, the Global Moran’s I and Geary’s C statistics were calculated for the spatial data, to determine the presence of any spatial autocorrelation between the cancer incidence rate and county. Both Moran’s I and Geary’s C presented evidence of spatial autocorrelation with p values ranging from p < 0.0001 (Moran’s I, males, standardized queen weights) to p = 0.07 (Geary’s C, females, binary queen weights). The results depended on the test, gender, and weights used. The tests for global spatial autocorrelation were significant for both males and females; smaller p values were observed for males, the largest being p = 0.003.

To investigate further, the local Getis-Ord general G and local Moran’s I tests were conducted. Both tests produced similar patterns for males and females, with the exception of a few counties (Figure 2). Similar to the conclusions from Global Moran’s I and Geary’s C, these results indicate that cancer incidence rates for males have stronger spatial heterogeneities. That provides evidence that cancer rates are indeed different between counties and identifies the patterns in the data. Considering the Local Moran’s I test together with the Getis-Ord general G, overall higher geographic variability was observed for male rates, indicating spatial differences in cancer incidence rates by gender (Figure 2).

Discussion

Using nonspatial analysis techniques, our results suggest a potential association between increased HRS score and increased proxy water exposure of hazardous wastes around Superfund sites with increased adult cancer incidence rates. Further, presence of a Superfund site in a county was associated with an increase in cancer incidence rates. Gender and age covariates, specifically being older and male, are associated with increased cancer incidence. The associations of cancer incidence rates with age and the male sex found in this research are not unexpected based on the typical longer latency period needed for cancer onset (Nadler and Zurbenko 2014) and established gender differences in cancer incidences (Cook et al. 2009; Edgren et al. 2012). These results are important as almost 37% of the total population in Florida is 50 years of age or older and the resident median age for Florida is 41.2 years old, higher than the national median age of 37.4 years old (U.S. Census Bureau 2015a). Unexpectedly, higher median income was associated with a higher cancer incidence rate, but this is perhaps an indication that diagnostic procedures might be more frequent or of a higher quality for people or families with higher income. This result may reflect that the NWS regions that contain the largest numbers of Superfund sites are also those regions with both wealthy and urban populations.

In addition, heterogeneity of cancer incidence was identified in all NWS regions without exception, and NWS regions with the largest numbers of Superfund sites (JAX, MLB, TBW) had the strongest associations. Confirming results from the nonspatial model, Global Moran’s I provided evidence of spatial autocorrelation for both types of spatial weights and both sexes. Similar results were observed using Geary’s C statistic, indicating that similar cancer incidence rates cluster spatially. The Getis-Ord G and Local Moran’s I results indicated that there is stronger spatial heterogeneity among males, although one high-high cluster was observed in one county for both males and females.

The spatial models did not find evidence of a significant association. This result could be due to the structure of the spatial model, as only one response was allowed for each spatial unit (e.g., county). The direct rate adjustment could have diluted a signal in the data, the signal that was observed in the nonspatial model. One suggestion is that future studies consider techniques to better ameliorate spatial misalignment in the data, such as using block kriging for county-level estimates.

Although previous research has indicated an association between gender and increased cancer incidence (Cook et al. 2009), results from this analysis indicate that, in addition, there are spatial differences between genders for cancer incidence. These results build on the spatial differences observed in Goli et al. (2013) by identifying these spatial differences for a Western population while also investigating their associations with environmental exposures related to Superfund site locations. Spatial heterogeneity is important for future epidemiological work into identifying environmental causes of cancer clusters as well as to inform environmental regulations and policy.

A perplexing result from the nonspatial analysis is the observed relationship between decreasing numbers of proposed and deleted Superfund sites and increased cancer incidence, while a protective relationship is observed when considering all types of Superfund sites (proposed, deleted, and final) and cancer incidence. In addition, counties with small numbers of Superfund sites were found to have high cancer incidence rates. This could be due to location or material characteristics of the final Superfund sites. For example, Miami-Dade county has experienced massive immigration from inside and outside the US in the past few decades. This has caused the population in this area to dramatically increase. Because of this population inflow, environmental exposures to toxic chemicals in this area are less likely for the new residents. In addition, the influx of new residents increased total population counts, possibly confounding cancer rates for this area. During the same time frame, immigration to the other parts of Florida (especially North Florida) was more moderate.

As in most research studies, limitations of the data and analysis are problematic. The demographics in the cancer incidence database for Florida contain an overwhelming number of white, non-Hispanic persons (>90%). These values are different from the overall population of Florida, which is more diverse with 55% white, non-Hispanic and nearly 25% white, Hispanic (US Census Bureau 2015b). Another limitation of the analysis is the privacy concern motivating aggregation to the county level. Although beyond the scope of this analysis, which investigated overall all-cause cancer incidence, further investigation into different categories of cancer incidence may reveal more about possible explanations for these observed spatial differences.
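The two global autocorrelation statistics reported under Spatial Models can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' ArcGIS workflow: the four-unit chain of neighbors, the binary weights matrix `w`, and the rate vectors are hypothetical toy inputs, and no significance test (the source of the reported p values) is included.

```python
def morans_i(x, w):
    """Global Moran's I for values x and a spatial weights matrix w."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    s0 = sum(sum(row) for row in w)  # sum of all weights
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def gearys_c(x, w):
    """Global Geary's C for values x and a spatial weights matrix w."""
    n = len(x)
    mean = sum(x) / n
    s0 = sum(sum(row) for row in w)
    sq_diff = sum(w[i][j] * (x[i] - x[j]) ** 2 for i in range(n) for j in range(n))
    return ((n - 1) / (2 * s0)) * sq_diff / sum((v - mean) ** 2 for v in x)


# Hypothetical layout: four units in a row, each adjacent to the next one.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = [1.0, 2.0, 3.0, 4.0]       # neighbors have similar rates
alternating = [1.0, 4.0, 1.0, 4.0]  # neighbors have dissimilar rates

print(morans_i(smooth, w), gearys_c(smooth, w))
print(morans_i(alternating, w), gearys_c(alternating, w))
```

Positive Moran's I together with Geary's C below 1 indicates that similar values cluster in space, which is the pattern described for county-level incidence rates; the alternating example shows the opposite signs. Row-standardized weights, one of the two weight types mentioned in the text, would be obtained by dividing each row of `w` by its row sum.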

Figure . (a) Getis-Ord hot and cold spots for cancer incidence. (b) Local Moran’s I cancer incidence clusters and outliers. example, differences in cancer incidence could be attributed FAPTP The Florida Association of Pediatric Tumor Pro- to differences in occupation, residential location, risk behav- gram iors such as tobacco use, or lifestyle differences such as staying FCDS Florida Cancer Data System indoors. Although further analysis and more detailed data, par- FDOH Florida Department of Health ticularly exposure histories, are needed to better attribute what GIS Geographic Information System may be driving these spatial differences, the range of possible GLM Generalized Linear Model factors makes this a daunting task. The recent advances in spatial HRS Hazardous Ranking Score analysis techniques and software will become useful to fur- NPL National Priorities List ther investigate cancer incidence and environmental hazardous NRC National Research Council waste exposure, when exposure history data are recorded and NWS National Weather Service made available for study. SAR Model Spatial Autoregressive Model

Conclusions Acknowledgments Results indicated potential association with environmental exposures related to Superfund sites and cancer incidence rates The Florida cancer incidence data used in this report were collected by the for Florida. In addition, evidence of spatial differences by gender FloridaCancerDatasystemundercontractwiththeFloridaDepartmentof Health. The views expressed herein are solely those of the author(s), and do for county-level cancer incidence rates in Florida were observed. not necessarily reflect those of the contractor of the Florida Department of More research is recommended on this topic to further identify Health. and define these differences. Data availability creates the biggest The authors thank Dr. Xiaohui Xu and Dr. Jonathan D. Sugimoto for the challenge to further the science. Within the United States, can- project guidance and critical review of the analysis performed. In addition, cer registries are defined and run by each state and they do not the authors thank Dr. Wendell P.Cropper, Jr. for general guidance and early, critical review of the manuscript. The authors contributed equally to this routinely include geographical or exposure histories that indi- article. cate length of residence at the time of cancer diagnosis, a thor- ough residential history, or an extremely detailed smoking and drinking exposure history. However, these exposure histories are Funding essential to thoroughly investigate the effect of environmental exposures on cancer. This research project was not funded by grant or any other source.

Conflict of Interest Statement Abbreviations Theauthorsstatethattheyhavenocompetingfinancialinterestsorconflict ACS American Cancer Society of interests in conducting this research. ArcGIS Geographic information system software pro- duced by Environmental Systems Research Insti- tute Supplementary Materials CERCLA Comprehensive Environmental Response, Com- The supplementary materials contain detailed background information on pensation and Liability Act the statistics and models used as well as a reference map of the 67 counties EPA U.S. Environmental Protection Agency in Florida. STATISTICS AND PUBLIC POLICY 9

References

Amin, R., Hendryx, M., Shull, M., and Bohnert, A. (2014), “A Cluster Analysis of Pediatric Cancer Incidence Rates in Florida: 2000–2010,” Statistics and Public Policy, 1, 69–77. [2]
Boffetta, P., and Nyberg, F. (2003), “Contribution of Environmental Factors to Cancer Risk,” British Medical Bulletin, 68, 71–94. [1]
Budnick, L., Sokal, D., Falk, H., Logue, J., and Fox, J. (1984), “Cancer and Birth Defects Near the Drake Superfund Site, Pennsylvania,” Archives of Environmental & Occupational Health, 39, 409–413. [2]
Cook, M. B., Dawsey, S. M., Freedman, N. D., Inskip, P. D., Wichner, S. M., Quraishi, S. M., Devesa, S. S., and McGlynn, K. A. (2009), “Sex Disparities in Cancer Incidence by Period and Age,” Cancer Epidemiology, Biomarkers & Prevention, 18, 1174–1182. [7]
Davies, K. (2006), “Economic Costs of Childhood Diseases and Disabilities Attributable to Environmental Contaminants in Washington State, USA,” EcoHealth, 3, 86–94. [2]
Edgren, G., Liang, L., Adami, H. O., and Chang, E. T. (2012), “Enigmatic Sex Disparities in Cancer Incidence,” European Journal of Epidemiology, 27, 187–196. [7]
EPA (Environmental Protection Agency) (1992), “Hazard Ranking System Guidance Manual: Chapter 3, the Hazardous Ranking System Scoring Process,” available at http://nepis.epa.gov/Exe/ZyPDF.cgi/2000IS27.PDF?Dockey=2000IS27.PDF [4]
EPA (Environmental Protection Agency) (2014), “Superfund,” available at http://www.epa.gov/superfund [1]
ESRI (2012), Environmental Systems Research Institute, available at http://esri.com [3]
FCDS (Florida Cancer Data System) (2012), “FCDS Variables (available upon request and approval),” available at https://fcds.med.miami.edu/inc/datarequest.shtml [2]
Florida Cancer Data System (2016), FloridaCHARTS (2015) Median Household Income, “Statistics,” available at https://fcds.med.miami.edu/inc/statistics.shtml [2,3,4]
Florida CHARTS (2015), available at http://www.floridacharts.com/charts/OtherIndicators/NonVitalIndRateOnlyDataViewer.aspx?cid=0293 [3]
FDOH (Florida Department of Health) (2012a), “Alachua County Report, Review of Cancer Rates for Census Tract 3 (Containing Stephen Foster Neighborhood),” available at http://www.alachuacounty.us/Depts/EPD/Pollution/Documents/1658_FDOH%20March%202012%20Cancer%20Study%20Stephen%20Foster%20%20Koppers%20Neighborhood%20%206-6-2012.pdf [2]
Goli, A., Oroei, M., Jalalpour, M., Faramarzi, H., and Askarian, M. (2013), “The Spatial Distribution of Cancer Incidence in Fars Province: A GIS-Based Analysis of Cancer Registry Data,” International Journal of Preventive Medicine, 4, 1122–1130. [7]
Griffith, J., Duncan, R., Riggan, W., and Pellom, A. (1989), “Cancer Mortality in U.S. Counties with Hazardous Waste Sites and Ground Water Pollution,” Archives of Environmental Health, 44, 69–74. [2]
Hamra, G. B., Guha, N., Cohen, A., Laden, F., Raaschou-Nielsen, O., Samet, J. M., Vineis, P., Forastiere, F., Saldiva, P., Yorifugi, T., and Loomis, D. (2014), “Outdoor Particulate Matter Exposure and Lung Cancer: A Systematic Review and Meta-Analysis,” Environmental Health Perspectives, 122, 906–912. [1]
Heaton, M. J. (2014), “Wombling Analysis of Childhood Tumor Rates in Florida,” Statistics and Public Policy, 1, 60–67. [2]
Hendryx, M., Conley, J., Fedorko, E., Luo, J., and Armistead, M. (2012), “Permitted Water Pollution Discharges and Population Cancer and Non-Cancer Mortality: Toxicity Weights and Upstream Discharge Effects in US Rural-Urban Areas,” International Journal of Health Geographics, 11, 9–23. [2]
Johnson, B. (1995), “Nature, Extent, and Impact of Superfund Hazardous Waste Sites,” Chemosphere, 31, 2415–2428. [2]
Johnson, B. L., and DeRosa, C. T. (1997), “The Toxicologic Hazard of Superfund Hazardous-Waste Sites,” Reviews on , 12, 235–251. [2]
Kearney, G. (2008), “A Procedure for Detecting Childhood Cancer Clusters Near Hazardous Wastes Sites in Florida,” Journal of Environmental Health, 70, 29–34. [2,6]
Lawson, A. B., and Rotejanaprasert, C. (2014), “Childhood Brain Cancer in Florida: A Bayesian Clustering Approach,” Statistics and Public Policy, 1, 99–107. [2]
Nadler, D. L., and Zurbenko, I. G. (2014), “Estimating Cancer Latency Times using a Weibull Model,” Advanced Epidemiology. doi: 10.1155/2014/746769. Available at https://www.hindawi.com/archive/2014/746769/abs/. [1,2,7]
Naing, N. N. (2000), “Easy Way to Learn Standardization: Direct and Indirect Methods,” Malaysian Journal of Medical Sciences, 7, 10–15. [2]
Najem, G., Thind, I., Lavenhar, M., and Louria, D. (1983), “Gastrointestinal Cancer Mortality in New Jersey Counties, and the Relationship with Environmental Variables,” International Journal of Epidemiology, 12, 276–289. [2]
National Cancer Institute (2016), “Childhood Cancers,” available at https://www.cancer.gov/types/childhood-cancers [2]
Raaschou-Nielsen, O., Andersen, Z. J., Beelen, R., Samoli, E., Stafoggia, M., Weinmayr, G., Hoffmann, B., Fischer, P., Nieuwenhuijsen, M. J., Brunekreef, B., Xun, W. W., Katsouyanni, K., Dimakopoulou, K., Sommar, J., Forsberg, B., Modig, L., Oudin, A., Oftedal, B., Schwarze, P. E., Nafstad, P., De Faire, U., Pedersen, N. L., Östenson, C., Fratiglioni, L., Penell, J., Korek, M., Pershagen, G., Eriksen, K. T., Sørensen, M., Ane Tjønneland, A., Ellermann, T., Eeftens, M., Peeters, P. H., Meliefste, K., Wang, M., Bueno-de-Mesquita, B., Key, T. J., de Hoogh, K., Concin, H., Nagel, G., Vilier, A., Grioni, S., Krogh, V., Tsai, M., Ricceri, F., Sacerdote, C., Galassi, C., Migliore, E., Ranzi, A., Cesaroni, G., Badaloni, C., Forastiere, F., Tamayo, I., Amiano, P., Dorronsoro, M., Trichopoulou, A., Bamia, C., Vineis, P., and Hoek, G. (2013), “Air Pollution and Lung Cancer Incidence in 17 European Cohorts: Prospective Analyses from the European Study of Cohorts for Air Pollution Effects (ESCAPE),” The Lancet Oncology, 14, 813–822. [1]
TOXMAP: Environmental Health e-Maps (2012a), “U.S. National Library of Medicine,” available at http://toxmap.nlm.nih.gov/toxmap/main/index.jsp [2,3]
TOXMAP: Environmental Health e-Maps (2012b), “What Kinds of Chemicals Are Found at Superfund Sites?” available at http://toxmap.nlm.nih.gov/toxmap/faq/2009/08/what-kinds-of-chemicals-are-found-at-superfund-sites.html [2]
TOXMAP: Environmental Health e-Maps (2012c), “Superfund Chemicals in TOXMAP,” available at http://toxmap.nlm.nih.gov/toxmap/main/sfChemicals.jsp [2]
U.S. Census Bureau (2008), “2008 TIGER/Line® Shapefiles,” available at http://www.census.gov/cgi-bin/geo/shapefiles2008/national-files [3]
U.S. Census Bureau (2015a), “Community Facts Florida,” available at https://factfinder.census.gov/faces/nav/jsf/pages/community_facts.xhtml [7]
U.S. Census Bureau (2015b), “Quick Facts: Florida, 2015,” available at http://www.census.gov/quickfacts/table/PST045215/12 [7]
Wang, H., and Rodríguez, A. (2014), “Identifying Pediatric Cancer Clusters in Florida Using Log-Linear Models and Generalized Lasso Penalties,” Statistics and Public Policy, 1, 86–96. [2]
Zhang, Z., Lim, C. Y., and Maiti, T. (2014), “Analyzing 2000–2010 Childhood Age-Adjusted Cancer Rates in Florida: A Spatial Clustering Approach,” Statistics and Public Policy, 1, 120–128. [2]