Research Article Variants in Inflammation Are Implicated in Risk of Lung Cancer in Never Smokers Exposed to Second-hand Smoke

Margaret R. Spitz1, Ivan P. Gorlov2, Christopher I. Amos3, Qiong Dong3, Wei Chen3, Carol J. Etzel3, Olga Y. Gorlova3, David W. Chang3, Xia Pu3, Di Zhang3, Liang Wang4, Julie M. Cunningham4, Ping Yang4, and Xifeng Wu3

Abstract Lung cancer in lifetime never smokers is distinct from that in smokers, but the role of separate or overlapping carcinogenic pathways has not been explored. We therefore evaluated a comprehensive panel of 11,737 single-nucleotide polymorphisms (SNP) in inflammatory-pathway genes in a discovery phase (451 lung cancer cases, 508 controls from Texas). SNPs that were significant were evaluated in a second external population (303 cases, 311 controls from the Mayo Clinic). An intronic SNP in ACVR1B, rs12809597, was replicated with significance and restricted to those reporting adult exposure to environmental tobacco smoke. Another promising candidate was a SNP in NR4A1, although the replication OR did not achieve statistical significance. ACVR1B belongs to the TGF-β superfamily, contributing to resolution of inflammation and initiation of airway remodeling. An inflammatory microenvironment (second-hand smoking, asthma, or hay fever) is necessary for risk from these gene variants to be expressed. These findings require further replication, followed by targeted resequencing, and functional validation.

Significance: Beyond passive smoking and family history of lung cancer, little is known about the etiology of lung cancer in lifetime never smokers, which accounts for about 15% of all lung cancers in the United States. Our two-stage candidate pathway approach examined a targeted panel of inflammation genes and has identified novel structural variants that appear to contribute to risk in patients who report prior exposure to sidestream smoking. Cancer Discovery; 1(5): 420–9. ©2011 AACR.

Authors′ Affiliations: 1The Dan L. Duncan Cancer Center, Baylor College of Medicine; Departments of 2Genitourinary Medical Oncology and 3Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, Texas; and 4The Mayo Clinic, Rochester, Minnesota

Corresponding Author: Margaret R. Spitz, The Dan L. Duncan Cancer Center, Baylor College of Medicine. One Baylor Plaza, Cullen Building, Suite 450-A. Mail Stop: BCM-305, Houston, TX 77030-3411. Phone: 713-798-2113; Fax: 713-798-2716; E-mail: [email protected]

doi: 10.1158/2159-8290.CD-11-0080

©2011 American Association for Cancer Research.

Introduction

From etiologic, molecular genetic, and biologic viewpoints, it is now fairly well accepted that lung cancer occurring in lifetime never smokers is distinct from smoking-associated lung cancer (1). It is noteworthy that the top hit from all published ever-smoking lung cancer genome-wide association studies (GWAS), the 15q25 locus encoding nicotinic acetylcholine receptor (NAChR) subunits and a proteasome subunit, has not been implicated in lung cancer risk in never smokers (2). Nevertheless, it is likely that the 2 disease entities do share some molecular features suggesting separate but overlapping pathways to lung carcinogenesis (1). Increasing evidence suggests that pathway-based approaches to identify the genetic contribution to cancer susceptibility may provide complementary information to conventional single-marker analyses.

Of intense interest in lung carcinogenesis is the inflammation pathway, because an abnormally prolonged or intense inflammatory response could create a microenvironment that promotes lung cancer development. Although tobacco-induced lung cancer is characterized by increased tissue oxidative stress and an abundant and deregulated inflammatory microenvironment (3), a similar role for inflammation in lung cancer in never smokers has not been studied in depth. We therefore evaluated a comprehensive panel of germline genetic variants in inflammatory pathway genes in risk of lung cancer in lifetime never smokers in a discovery phase of cases and controls selected from an ongoing multiracial/ethnic lung cancer case–control study that has recruited study participants from The University of Texas MD Anderson Cancer Center from 1995 onward (4). We performed a replication analysis in an independent sample of never-smoking lung cancer cases and controls from the Mayo Clinic (5). Lung cancer in never smokers accounts for 15% of all lung cancers in the United States, yet beyond passive smoking and family history of lung cancer, few other well-established genetic or nongenetic clues to its etiology are known.

Results

For the discovery phase, we recruited 451 non–small cell lung cancer cases and 508 controls, all lifetime never smokers (Table 1). Of these subjects, about two thirds of the cases and controls (650) were included in our previously published risk model for never smokers (6). Adenocarcinoma was diagnosed in 76% of the cases. On average, the controls were 5 years younger than the cases. More than 60% of both the cases and controls were women. The percentages of self-reported environmental tobacco smoke (ETS) exposure were 83% and 75% for the discovery cases and controls, respectively. The associations between asthma and dust exposure were not statistically significant. However, a history of hay fever (OR = 0.70; P = 0.02), passive smoking exposure (OR = 1.59; P = 0.01), family history of 2 or more first-degree relatives with any cancer (OR = 2.24, P < 0.001) (data not shown), or 2 or more first-degree relatives with lung cancer (OR = 3.47, P = 0.04) all achieved statistical significance.

The replication set (Table 1) included 303 cases and 311 controls, all lifetime never smokers and well matched on age and gender. ETS exposure was reported by 67% of the cases and 56% of the controls. Passive smoking (OR = 1.56; P = 0.008) and family history of lung cancer (OR = 2.06; P = 0.0002) were significantly associated with risk. Asthma was not a risk factor in this population. Ten cases, but no controls reported a history of emphysema (OR = 10.6; P = 0.02).

In total, 11,737 single-nucleotide polymorphisms (SNP) were available for analysis in the discovery phase. Table 2 summarizes the subpathways, genes, and SNPs included in the customized Illumina inflammation chip, and as outlined in Loza and colleagues (7). In univariate analysis, assuming an additive model, 21 SNPs were statistically significant with P values ≤0.001 and Bayesian false discovery probability (BFDP) levels ≤ 0.8 (Table 3).

In the replication analysis of these 21 SNPs from the discovery phase, only one, rs12809597 in the ACVR1B gene, was concordant for direction with the discovery phase [discovery OR = 0.72 (0.59, 0.88); P = 0.0012] (Table 3) but was of borderline overall significance [replication OR = 0.80 (0.62, 1.02); P = 0.069]. For women specifically, the OR in the replication population was 0.67, P = 0.0097, but was not statistically significant in men, although the numbers were small. In the combined data sets, the overall OR for rs12809597 was 0.72, P = 0.0002. For women only, the overall OR was 0.72, P = 0.0013; for men, the combined OR was 0.74, P = 0.05. A second SNP in this region, rs2701129 in the 5′ UTR of NR4A1, was strongly significant in our data (OR = 0.63; P = 0.0009) but did not achieve statistical significance in the Mayo Clinic data, although the OR was in the same protective direction (OR = 0.85; P = 0.36).

We also conducted stratified analysis by select variables including ETS exposure, family history of lung cancer, hay fever, and asthma (data not shown). Notably, the significant association between lung cancer risk and rs12809597 was evident in only those who reported ETS exposure, OR = 0.67; P = 7.8 × 10⁻⁵, compared with an OR of 0.78, P = 0.39 in those who denied ETS exposure. In the discovery data, this ACVR1B SNP was significantly protective in both men [OR = 0.47 (0.30–0.73); P = 0.0010], and women with ETS exposure [OR = 0.74 (0.54–1.01); P = 0.0543]. In the replication, this pattern was evident only in women with ETS exposure [OR = 0.60 (0.41–0.88); P = 0.009]. It is noteworthy that only 83 male cases were present in the replication set, and power is therefore limited for these subset analyses. Likewise, rs2701129 in NR4A1 was statistically significant only in ETS-exposed subjects in the discovery set [OR = 0.61 (0.43–0.87); P = 0.0068]. We also noted

Table 1. Distribution of selected variables in discovery and replication populations

| Variable | Discovery cases (n = 451) | Discovery controls (n = 508) | Discovery OR (95% CI)a | Discovery P value | Replication cases (n = 303) | Replication controls (n = 311) | Replication OR (95% CI) | Replication P value |
|---|---|---|---|---|---|---|---|---|
| Age, mean (SD) | 61.6 (13.0) | 56.6 (13.1) | 1.03 (1.02–1.04) | <0.0001 | 62.0 (12.9) | 62.2 (13.1) | 1.03 (1.02–1.04) | <0.0001 |
| Sex: male, n (%) | 147 (32.6) | 190 (37.4) | | | 83 (27.4) | 86 (27.7) | | |
| Sex: female, n (%) | 304 (67.4) | 318 (62.6) | 0.81 (0.62–1.06) | 0.1198 | 220 (72.6) | 225 (72.4) | 0.99 (0.7–1.4) | 0.9425 |
| Asthma: no | 369 (87.0) | 442 (87.3) | | | 263 (86.8) | 278 (89.4) | | |
| Asthma: yes | 55 (13.0) | 64 (12.4) | 1.03 (0.70–1.51) | 0.8830 | 40 (13.2) | 33 (10.6) | 1.28 (0.8–2.1) | 0.3223 |
| Hay fever: no | 332 (78.3) | 363 (71.7) | | | N/A | N/A | | |
| Hay fever: yes | 92 (21.7) | 143 (28.3) | 0.70 (0.52–0.95) | 0.022 | N/A | N/A | | |
| Dust: no | 345 (81.4) | 429 (84.8) | | | N/A | N/A | | |
| Dust: yes | 79 (18.6) | 77 (15.2) | 1.28 (0.90–1.80) | 0.1657 | N/A | N/A | | |
| ETS: no | 57 (17.0) | 115 (24.6) | | | 99 (33.3) | 135 (43.8) | | |
| ETS: yes | 278 (83.0) | 353 (75.4) | 1.59 (1.12–2.26) | 0.0104 | 198 (66.7) | 173 (56.2) | 1.56 (1.12–2.17) | 0.0082 |
| Family history of lung cancer: 0 | 348 (84.5) | 439 (87.1) | | | 213 (70.3) | 258 (83.0) | | |
| Family history of lung cancer: 1 | 53 (12.9) | 61 (12.1) | 1.10 (0.74–1.63) | 0.6482 | 90 (29.7) | 53 (17.0) | 2.06 (1.40–3.02) | 0.0002 |
| Family history of lung cancer: 2 | 11 (2.7) | 4 (0.8) | 3.47 (1.09–10.98) | 0.0346 | | | | |
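The odds ratios in Table 1 can be checked approximately from the cell counts. The sketch below is a minimal illustration, assuming a crude (unadjusted) odds ratio with a Woolf log-scale confidence interval; the function name `odds_ratio_ci` is hypothetical, and the published estimates may come from adjusted models, so small differences in the confidence limits are expected.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with a Woolf (log-scale Wald) confidence interval.

    Assumption: this is an unadjusted 2x2 calculation, not the authors' model.
    """
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                          + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Discovery-set counts taken from Table 1.
ets = odds_ratio_ci(278, 57, 353, 115)        # ETS exposure, OR close to 1.59
hay_fever = odds_ratio_ci(92, 332, 143, 363)  # hay fever, OR close to 0.70

print("ETS: OR=%.2f (%.2f-%.2f)" % ets)
print("Hay fever: OR=%.2f (%.2f-%.2f)" % hay_fever)
```

The crude point estimates agree with the tabulated values to two decimals, which suggests that adjustment changes these particular estimates very little.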

422 | CANCER DISCOVERY OCTOBER 2011 www.aacrjournals.org Inflammation Gene Variants in Lifetime Never Smokers with Lung Cancer research article

Table 2. Summary of inflammation subpathways, genes, and SNPs on Illumina chip

| Pathwaya | Genes (n) | SNPs (n) |
|---|---|---|
| Adhesion-extravasation-migration | 12 | 108 |
| Apoptosis signaling | 67 | 834 |
| Complement cascade | 3 | 8 |
| Cytokine signaling | 266 | 3,139 |
| Glucocorticoid/PPAR signaling | 24 | 258 |
| Innate pathogen detection | 53 | 542 |
| Leukocyte signaling | 132 | 2,023 |
| MAPK signaling | 156 | 2,854 |
| Natural killer | 31 | 296 |
| Phagocytosis-Ag presentation | 41 | 488 |
| PI3K/AKT signaling | 45 | 580 |
| ROS/glutathione/cytotoxic granules | 25 | 231 |
| TNF superfamily signaling | 49 | 569 |
| Total | 904 | 11,930 |

Abbreviations: Ag, antigen; AKT, protein kinase B; MAPK, mitogen-activated protein kinase; PPAR, peroxisome proliferator-activated receptor; ROS, reactive oxygen species; SNP, single-nucleotide polymorphism. aSee ref. 7

a greater significant effect for NR4A1 (OR = 0.31; P = 0.0081) in those with asthma (the risk group for lung cancer in never smokers), compared with those who denied having asthma (OR = 0.69; P = 0.0165). However, although we did not note a similar pattern in the discovery data for ACVR1B, we saw an identical pattern for the ACVR1B SNP in the Mayo Clinic data for those with and without asthma (OR = 0.39; P = 0.02 vs. OR = 0.86; P = 0.27, respectively).

Also of interest is that in the discovery set, in those who denied having had hay fever (i.e., the risk group), the ORs were significantly protective for both ACVR1B (OR = 0.70; P = 0.0026) and NR4A1 (OR = 0.54; P = 0.0003). We did not have comparable data for analysis in the replication set. We previously reported (8) that, paradoxically, those with both conditions (asthma and hay fever) had a significantly elevated lung cancer risk (OR = 2.43; 95% confidence interval = 1.11–5.35). It is in this subgroup (asthma and hay fever) that we detected the greatest protective effect with NR4A1 (OR = 0.28; P = 0.04).

We hypothesized that polymorphisms in genes directly associated with ACVR1B might contribute to the risk noted for the ACVR1B SNP. Therefore, we used an in silico approach, Pathway Studio (9), to identify upstream regulators and downstream targets of ACVR1B. Direct interactions between genes (i.e., direct regulation, protein/protein binding, or binding to the promoter region) were used to construct the network. Based on these criteria, we identified 25 upstream regulators and 39 downstream targets of the ACVR1B gene. In this study, we had genotype data for 11 upstream regulators and 16 downstream targets. Of these, none was nominally significant at P < 0.05 in additive models.

Imputation was performed to increase coverage of SNPs in the region surrounding rs12809597 in the ACVR1B gene for their association with lung cancer risk (Fig. 1). Before imputation, 30 genotyped SNPs were found, 23 of which were between 50.58 Mb and 50.74 Mb. After imputation, 156 SNPs exhibited r² > 0.8 and MAF > 0.01 and were adequately reliably imputed between 50.58 Mb and 50.74 Mb. Best-guess genotypes were used in the analysis. The most likely candidate SNP, rs1882119 (P = 1.76 × 10⁻⁴), an imputed SNP (r² = 0.9849) in this region, is in an intron of NR4A1, not ACVR1B. Conversely, rs2701129 (P = 1.96 × 10⁻⁴) was directly genotyped. Because the r² for rs12809597 (ACVR1B) and rs2701129 (NR4A1) was only 0.013, we further investigated relevant SNPs in NR4A1.

In parallel with the ACVR1B analysis described earlier, we identified 170 upstream/downstream genes related to NR4A1, of which 65 genes and 568 SNPs had been included in our inflammation panel. Of these, 17 SNPs had P values < 0.01 in univariate analysis, assuming an additive model (Table 4). Five of these SNPs (NR4A2, NR4A1, TP53, BCL2, and MAP2K2), based on P values < 0.05, remained statistically significant in models using logistic regression forward or stepwise selection procedures, and with controlling for age, sex, second-hand smoking exposure, and family history of lung cancer (Table 5).

Our original risk model was constructed based on 709 never smokers (330 lung cancer cases and 379 controls) (6). Of the total of 959 never smokers in this new analysis, 650 (68%) overlapped in both analyses. The published AUC for never smokers in that model was 0.57. The point estimate of the AUC for those not included in our original study (N = 309) was 0.56. The AUC statistic for the baseline model in the entire discovery dataset, incorporating the same clinical and epidemiologic variables (age, gender, family history of lung cancer, and ETS exposure) was 0.62 (data not shown). With the addition of the replicated SNP, rs12809597, the AUC increased to 0.64, P = 0.098. The comparable model for the Mayo Clinic data with addition of rs12809597 yielded an AUC of 0.60. The same analysis for the discovery data, adding in the NR4A1 SNPs and upstream and downstream regulators, yielded an AUC of 0.68 (P = 0.0005), data not shown.

We also summed the number of adverse alleles (ACVR1B, NR4A1, and upstream and downstream regulators) and evaluated the distribution of cases and controls across different strata to determine the cumulative risk in the discovery set (Table 6). Compared with the lowest-risk stratum (0 to 6 risk alleles), the risks increased to an OR of 2.21, P = 0.0272 for 7 risk alleles; OR = 3.26; P = 5.0 × 10⁻⁴ for 8 risk alleles, and OR = 5.28 for 9 or more risk alleles (P = 3.9 × 10⁻⁷) (Table 6). A 46% increase in risk was found for each adverse allele, and the P value for trend was 1.11 × 10⁻⁹ (Table 6). Six percent of cases and 13% of controls were in the lowest-risk stratum compared with 50% and 35% in the highest-risk stratum, respectively.

Discussion

In this two-stage candidate pathway analysis of inflammation gene variants, we were able to replicate one variant (rs12809597) in the activin receptor type-1B (ACVR1B)/Activin receptor-like kinase 4 (ALK4) gene that was significantly associated with lung cancer risk in lifetime never-smoking cases. This risk was most prominent in women and in those risk


Table 3. Significant SNPs in discovery set (additive model)

| CHR | SNP | BP | Minor allele | OR | L95 | U95 | P value | Location | Gene |
|---|---|---|---|---|---|---|---|---|---|
| 1 | rs10127728 | 171417779 | A | 1.679 | 1.266 | 2.226 | 0.0003 | Flanking_3UTR | TNFSF4 |
| 1 | rs549471 | 158972243 | A | 0.6511 | 0.5058 | 0.8381 | 0.0009 | Flanking_5UTR | SLAMF7 |
| 1 | rs12131065 | 67541594 | A | 1.416 | 1.149 | 1.746 | 0.0011 | Flanking_5UTR | IL12RB2 |
| 1 | rs2300095 | 11188304 | A | 1.377 | 1.131 | 1.676 | 0.0015 | Intron | MTOR |
| 2 | rs17488897 | 97731596 | G | 0.7201 | 0.5947 | 0.8718 | 0.0008 | Flanking_3UTR | ZAP70 |
| 2 | rs1464572 | 45807414 | A | 0.7392 | 0.6157 | 0.8874 | 0.0012 | Intron | PRKCE |
| 2 | rs13432276 | 46165464 | C | 1.358 | 1.125 | 1.639 | 0.0015 | Intron | PRKCE |
| 5 | rs4585495 | 159717272 | A | 2.147 | 1.391 | 3.316 | 0.0006 | Intron | C1QTNF2 |
| 5 | rs745749 | 179648409 | G | 1.359 | 1.131 | 1.634 | 0.0011 | Intron | MAPK9 |
| 5 | rs17651965 | 149435808 | C | 0.6051 | 0.4454 | 0.8221 | 0.0013 | Intron | CSF1R |
| 6 | rs350294 | 158440305 | G | 1.399 | 1.152 | 1.699 | 0.0007 | Flanking_3UTR | SYNJ2 |
| 10 | rs1887327 | 6648739 | G | 0.6144 | 0.4691 | 0.8047 | 0.0004 | Flanking_5UTR | PRKCQ |
| 11 | rs11819995 | 127894601 | A | 1.479 | 1.186 | 1.843 | 0.0005 | Intron | ETS1 |
| 12 | rs12809597ᵃ | 50642590 | C | 0.7192 | 0.5892 | 0.8778 | 0.0012 | Intron | ACVR1B |
| 12 | rs2701129ᵇ | 50715744 | C | 0.6261 | 0.4751 | 0.825 | 0.0009 | Flanking_5UTR | NR4A1 |
| 13 | rs9518587 | 101309356 | G | 1.504 | 1.191 | 1.9 | 0.0006 | Flanking_3UTR | FGF14 |
| 14 | rs11621263 | 24164990 | A | 1.581 | 1.218 | 2.054 | 0.0006 | Flanking_3UTR | GZMB |
| 14 | rs11629129 | 24164696 | A | 1.743 | 1.253 | 2.426 | 0.001 | Flanking_3UTR | GZMB |
| 14 | rs11158813 | 24154521 | A | 1.721 | 1.236 | 2.397 | 0.0013 | Flanking_5UTR | GZMH |
| 17 | rs11653414 | 5355762 | C | 0.5093 | 0.3409 | 0.7609 | 0.001 | Intron | NLRP1 |
| 21 | rs962859 | 33569993 | C | 0.7385 | 0.6148 | 0.8871 | 0.0012 | Intron | IL10RB |

Abbreviations: BP, base pair; CHR, chromosome. ᵃReplicated in the Mayo data set [OR = 0.80 (0.62–1.02); P = 0.069]. ᵇMayo data set [OR = 0.85 (0.60–1.21); P = 0.36].

subgroups that reported adult exposure to ETS, prior asthma, or no prior hay fever. Further analysis of SNPs 1 Mb from this polymorphism suggested that another promising target was in the 5′ UTR of the Nuclear receptor subfamily 4 group A member 1 (NR4A1) gene, although the OR in the replication Mayo Clinic data did not achieve statistical significance, and the association we detected could be attributed to chance.

Inflammation is a complex host defense against biological, chemical, physical, and endogenous irritants. Innate immunity is mediated by a variety of secreted proinflammatory cytokines. The inflammation is resolved by anti-inflammatory cytokines. Chronic inflammation results from a dysfunction of these negative regulatory mechanisms (10). Although smoking (and perhaps, to a lesser extent, passive exposure) is the obvious cause of a chronic inflammatory milieu in the lung parenchyma and bronchial epithelium, other likely precipitating factors include infection, inhaled particulate exposures, and pulmonary scarring (11) that can lead to oxidative stress and an inflammatory response, even in non–tobacco-exposed subjects in whom lung cancer develops. It remains plausible, therefore, that inflammation gene polymorphisms could be important in lung cancer risk in lifetime never smokers as well.

Elevated prediagnostic C-reactive protein (CRP) levels, a systemic, but nonspecific, marker of chronic inflammation, have been associated with subsequent lung cancer risk (12) with evidence of a dose–response relation. Conversely, use of nonsteroidal anti-inflammatory drugs (NSAIDs) has been associated with decreased lung cancer risk in some (13–16), but not all studies (17–19). Few of these studies have specifically evaluated the risk in lifetime never smokers, although in one cohort analysis (13), the strongest effect for total NSAID use was for long-term former smokers.

Activin receptor type-1B is a protein encoded by the ACVR1B gene with alternate splicing, resulting in multiple transcript variants. Our SNP of interest, rs12809597, is intronic, and no function has been reported for this SNP, although it is possible that this tagSNP may be linked to other causal SNP(s) in the gene that affect expression or function. ACVR1B, also known as ALK4, acts as a transducer of activin or activin-like ligands that are growth and differentiation factors belonging to the transforming growth factor-β (TGF-β)


[Figure 1 plot: ACVR1B / NR4A1 region. x-axis: Chromosome 12 position (kb), 50600 to 50800. y-axes: Observed (–logP) and Recombination rate (cM/Mb). Labeled SNPs: rs1882119 (P = 1.76E–04), rs2701129 (P = 1.96E–04), chr12:50735912 (P = 1.76E–04), rs12809597 (P = 3.67E–04), rs2701124 (P = 0.00196). Genes shown: ANKRD33, ACVRL1, ACVR1B, GRASP, NR4A1, C12orf44, KRT80.]

Figure 1. Association of imputed and genotyped SNPs in the region around ACVR1B and NR4A1 with lung cancer risk. Chromosomal position is on the x-axis and negative logarithm to the base 10 of the P values from logistic regression analysis is on the y-axis. Genotyped SNPs are plotted as solid diamonds, and imputed SNPs, as open circles. The two most significant SNPs in the region rs1882119 and chr12:50735912 are plotted in red. The overall structure of the linkage disequilibrium (LD) with SNPs in this region is reflected by estimated recombination rates from the genetic map of Hapmap in build 36 coordinates. The strength of the pairwise correlation between the surrounding markers and the most significant SNP (rs1882119) is reflected by the size of the symbols: the larger the size, the stronger the LD. LD was calculated from actual genotyped or imputed data by using PLINK. Genes in the region are annotated with location, range, and orientation by using gene annotations from the UCSC genome browser (downloaded from Broad Institute website). The original downloaded files were in build 35 positions and converted to build 36 positions (46).
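The pairwise LD that scales the symbols in Figure 1 is an r² between two markers. As a rough illustration only, the sketch below computes r² as the squared Pearson correlation of minor-allele counts (0, 1, or 2) across subjects, a common approximation for unphased genotypes. This is an assumption for illustration and not a description of PLINK's internals; the function name `genotype_r2` and the toy genotype vectors are hypothetical.

```python
def genotype_r2(counts_a, counts_b):
    """Squared Pearson correlation between two SNPs' minor-allele counts.

    Assumption: genotypes are coded 0/1/2 and there are no missing calls.
    """
    if len(counts_a) != len(counts_b) or len(counts_a) < 2:
        raise ValueError("need two equally long vectors with at least 2 subjects")
    n = len(counts_a)
    mean_a = sum(counts_a) / n
    mean_b = sum(counts_b) / n
    ss_a = sum((a - mean_a) ** 2 for a in counts_a)
    ss_b = sum((b - mean_b) ** 2 for b in counts_b)
    if ss_a == 0 or ss_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    cross = sum((a - mean_a) * (b - mean_b)
                for a, b in zip(counts_a, counts_b))
    return cross * cross / (ss_a * ss_b)


# Toy data (hypothetical): perfectly correlated and uncorrelated marker pairs.
perfect_ld = genotype_r2([0, 1, 2, 1], [2, 1, 0, 1])
no_ld = genotype_r2([0, 0, 2, 2], [0, 2, 0, 2])
print(perfect_ld, no_ld)
```

On this scale, the reported r² of 0.013 between rs12809597 and rs2701129 corresponds to two markers that are nearly uncorrelated, which is why the two signals are treated separately.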

superfamily of signaling proteins, essential regulators of proliferation and apoptosis, and key regulators of inflammation and angiogenesis. Activins signal through a heteromeric complex of receptor serine kinases, which include at least two type I (I and IB) and two type II (II and IIB) receptors (20). Activin complexes with ACVR1B and recruits SMAD2 or SMAD3, members of the SMAD family of transcriptional coregulators. ACVR1B has been shown to be mutated in pancreatic tumors (21), and activin signaling mediates growth inhibition and cell cycle arrest in breast cancer cells (22). Moreover, differential expression of this gene has been found in the epithelial cells of a subset of smokers with lung cancer (23) and in bone marrow micrometastases from lung cancer patients (24), although the relevance of the gene deregulation in lung cancer is not entirely clear. Whole-genome microarray analysis of ACVR1B expression in large airway epithelial cells indicated some reduction in expression among normal smokers compared with nonsmokers (25), suggesting the possible impact of cigarette-smoke exposure on activin signaling. It is therefore of interest that risk from the variant was most apparent in ETS-exposed subjects. Activins have also been implicated in the etiology of fibrotic diseases (26) and are upregulated during the fibrotic response in vivo (27).

Both the TGF-β and activin signaling pathways are activated on allergen provocation in asthma and may contribute to the resolution of inflammation and initiation of airway remodeling after allergen challenge (28). Activin may also act as an inhibitor of cytokine-induced proinflammatory chemokine release from the airway epithelium. Activin-A is rapidly induced in TH2 cells on T-cell activation and may also function as a TH2 immunomodulatory cytokine (29). An enhanced TH2 immune response contributes to the induction of allergy and asthma.

We previously showed that self-reported, physician-diagnosed asthma is significantly associated with risk of lung cancer in lifetime never smokers who were a subset of this larger analysis (OR = 1.82) with evidence of a dose–response pattern for duration (P = 0.007 for trend) (8), although this pattern was not evident in our discovery data. In their meta-analysis, Santillan and colleagues (30) also found asthma to be a significant risk factor for lung cancer in never smokers. Our data also demonstrated a protective effect of prior hay fever on lung cancer risk in never smokers (6). Cockcroft and colleagues (31) suggested that patients with respiratory atopy appeared to have some degree of protection against developing malignancies of endodermal origin, attributable to enhanced immune surveillance in a stimulated immune system.

It was of special interest that the most significant odds ratio for the ACVR1B SNP was obtained in the subset of cases and controls that reported adult exposure to ETS, although these subset analyses are based on small sample sizes. No such association was evident in those who denied such


exposure. It could be argued that an inflammatory microenvironment is more likely to exist in those exposed passively to tobacco smoke, and that exposure is necessary for the impact of the gene variant to be apparent.

The association with NR4A1 (also known as Nur77) is intriguing, but must be viewed with considerable caution. It is an orphan receptor within the nuclear hormone receptor superfamily and a potent inhibitor of NF-κB activation (32). NR4A1 is overexpressed in patients with atopic dermatitis compared with healthy volunteers (33). Protective effects of the NR4A1 SNP were also largest in putative risk subgroups (asthma, no prior hay fever).

A 1% increase in the AUC (0.64) was found in an expanded clinical and epidemiologic risk model incorporating the ACVR1B SNP, and an additional 5% (0.68), when we also added the upstream and downstream targets of NR4A1. These improvements in risk prediction incorporating these genes were statistically significant. The final AUC of 0.68 is similar, but the incremental improvement in AUC is larger, than that obtained from a risk-prediction model of lung cancer in ever smokers in which we incorporated top lung cancer GWAS hits, the chromosome 15q nicotinic receptor gene cluster (tag SNP rs1051730 G>A), and two SNPs from the 5p15.33 region (rs2736100 and rs401681) (34). However, higher AUC values are desirable for the model to have clinical utility and for any public health impact or recommendation, especially because the incidence of lung cancer in never smokers is substantially lower compared with that in ever smokers.

In a parallel analysis, rs2701129 was associated with an OR of 0.78, P = 0.014, in 1,096 cases and 727 controls, all ever smokers, whom we have genotyped by using the same Illumina platform, although rs12809597 was not a risk predictor. The rs12809597 was not directly genotyped in the GWAS, and the r² value was not sufficiently robust for imputation. The rs2701129 was genotyped in GWAS, but was not statistically significant.

Table 4. SNPs from NR4A1 targets and upstream and downstream regulators (additive model)

| CHR | SNP | BP | OR (95% CI)a | P value | MAF | Location | Gene |
|---|---|---|---|---|---|---|---|
| 1 | rs566421 | 26782253 | 1.23 (1.02–1.48) | 0.029 | 0.38 | Flanking_3UTR | RPS6KA1 |
| 1 | rs10159180 | 154746948 | 0.83 (0.69–1.00) | 0.050 | 0.48 | Flanking_5UTR | MEF2D |
| 2 | rs13428968 | 156900859 | 1.30 (1.02–1.66) | 0.034 | 0.16 | Flanking_5UTR | NR4A2 |
| 4 | rs7656411 | 154847105 | 1.23 (1.00–1.50) | 0.048 | 0.26 | Flanking_3UTR | TLR2 |
| 5 | rs10482642 | 142708224 | 1.28 (1.00–1.63) | 0.047 | 0.17 | Flanking_3UTR | NR3C1 |
| 12 | rs2701129 | 50715744 | 0.63 (0.48–0.83) | 0.001 | 0.13 | Flanking_5UTR | NR4A1 |
| 12 | rs2701124 | 50734424 | 0.58 (0.41–0.82) | 0.002 | 0.08 | Coding | NR4A1 |
| 15 | rs325383 | 98072569 | 1.28 (1.05–1.57) | 0.016 | 0.29 | Flanking_3UTR | MEF2A |
| 17 | rs2078486 | 7523808 | 0.71 (0.51–1.00) | 0.050 | 0.08 | Intron | TP53 |
| 18 | rs4987856 | 58944474 | 1.44 (1.07–1.94) | 0.016 | 0.10 | 3UTR | BCL2 |
| 18 | rs1977971 | 59119174 | 1.28 (1.06–1.53) | 0.009 | 0.46 | Intron | BCL2 |
| 18 | rs11152377 | 59123426 | 0.83 (0.69–0.99) | 0.036 | 0.42 | Intron | BCL2 |
| 18 | rs1462129 | 59131851 | 1.26 (1.05–1.51) | 0.013 | 0.48 | Intron | BCL2 |
| 18 | rs1801018 | 59136859 | 0.82 (0.69–0.99) | 0.037 | 0.43 | Coding | BCL2 |
| 19 | rs8101696 | 4066452 | 0.60 (0.41–0.89) | 0.011 | 0.06 | Intron | MAP2K2 |
| 19 | rs4808100 | 17792497 | 1.24 (1.03–1.49) | 0.020 | 0.37 | Intron | INSL3 |
| 20 | rs6063022 | 35426741 | 1.33 (1.03–1.71) | 0.027 | 0.15 | Intron | SRC |

Abbreviations: CHR, chromosome; CI, confidence interval; MAF, minor allele frequency. aUnivariate analysis.

Table 5. Stepwise logistic model including upstream and downstream regulators of NR4A1 (additive model)

| SNP | OR (95% CI)a | P value |
|---|---|---|
| rs13428968 | 1.46 (1.09–1.95) | 0.0103 |
| rs2701124 | 0.52 (0.35–0.80) | 0.0024 |
| rs2078486 | 0.61 (0.41–0.93) | 0.0212 |
| rs1977971 | 1.36 (1.10–1.68) | 0.0049 |
| rs8101696 | 0.56 (0.35–0.88) | 0.0122 |

aAdjusted for age, sex, second-hand smoking, and family history of lung cancer.

Although the chemical constituents of sidestream and mainstream smoke are qualitatively the same, differences in pH, combustion temperature, and degree of dilution with air contribute to quantitative differences in


Table 6. Genetic risk score in discovery set for ACVR1B, NR4A1, and upstream and downstream SNPs

| Adverse alleles (n) | Cases, n (%) | Controls, n (%) | OR (95% CI)a | P value |
|---|---|---|---|---|
| 0–6 | 26 (5.9) | 64 (12.8) | Ref. | |
| 7 | 64 (14.4) | 112 (22.4) | 2.21 (1.09–4.45) | 0.0272 |
| 8 | 133 (29.9) | 150 (29.9) | 3.26 (1.69–6.32) | 5.00 × 10⁻⁴ |
| 9+ | 221 (49.8) | 175 (34.9) | 5.28 (2.78–10.05) | 3.90 × 10⁻⁷ |
| P for trend | | | 1.46 (1.30–1.65) | 1.11 × 10⁻⁹ |

aAdjusted for age, sex, environmental tobacco smoke, and lung cancer family history.
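The adverse-allele count behind Table 6 can be sketched as follows. The text does not state exactly which SNPs entered the score or how adverse alleles were coded, so this is an illustration under assumptions: the six-SNP set is hypothetical, a minor allele with OR below 1 in Tables 3 and 5 is treated as protective (so the major allele is counted as adverse), and the helper names `adverse_allele_count` and `risk_stratum` are invented for the example.

```python
# Assumed direction of effect for the minor allele, read from the ORs in
# Tables 3 and 5 (OR < 1 means the minor allele is protective).
MINOR_IS_PROTECTIVE = {
    "rs12809597": True,   # ACVR1B, OR 0.72
    "rs2701124": True,    # NR4A1, OR 0.52
    "rs2078486": True,    # TP53, OR 0.61
    "rs8101696": True,    # MAP2K2, OR 0.56
    "rs13428968": False,  # NR4A2, OR 1.46
    "rs1977971": False,   # BCL2, OR 1.36
}


def adverse_allele_count(minor_allele_counts):
    """Sum adverse alleles over SNPs, flipping the count for protective minor alleles."""
    total = 0
    for snp, count in minor_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError("genotype must be coded 0, 1, or 2 minor alleles")
        total += (2 - count) if MINOR_IS_PROTECTIVE[snp] else count
    return total


def risk_stratum(count):
    """Map an adverse-allele count to the strata used in Table 6."""
    if count <= 6:
        return "0-6"
    if count in (7, 8):
        return str(count)
    return "9+"


# Hypothetical subject: no protective minor alleles, one copy of each risk allele.
subject = {"rs12809597": 0, "rs2701124": 0, "rs2078486": 0,
           "rs8101696": 0, "rs13428968": 1, "rs1977971": 1}
score = adverse_allele_count(subject)
print(score, risk_stratum(score))
```

Because the protective minor alleles are uncommon (MAF 0.06 to 0.13 in Table 4), most subjects carry two adverse copies at those loci, which is consistent with the large share of subjects in the 9+ stratum.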

their chemical composition and their emission rates. For example, nitrosamines and other carcinogens are present in greater concentrations in sidestream than in mainstream smoke (35). The ever-smoker cases for GWAS also differ from the never-smoker cases in this analysis. For example, more than 25% of our ever-smoker cases report preexisting chronic obstructive pulmonary disease (COPD) that is almost nonexistent in never smokers. One could therefore hypothesize that the pathogenic processes for smokers and never smokers are not equivalent, although certain etiologic pathways could be shared, such as the involvement of inflammation.

We acknowledge the limitations of this study and the challenge in drawing causal inferences from association analyses. Relatively small sample sizes were used for both the discovery and replication sets, and this problem is exaggerated in subset analyses. We also relied on self-reported questionnaire data for assessment of ETS exposure, raising the potential for both misclassification and recall bias. Nevertheless, for residential exposure to ETS, most studies in the past have confirmed that self-reports were generally reliable (36), and practical approaches to alternative measurement of ETS exposure decades before the onset of lung cancer have not been established. In national survey data, the accuracy of self-reported second-hand smoke exposure at work, home, or home and work ranged from 87% to 92%, although workers reporting no second-hand smoke exposure were only 28% accurate (37). Thus underreporting of ETS exposure could occur, but overreporting is less likely.

In summary, this analysis used a candidate pathway approach to evaluate SNPs comprehensively in inflammation genes as predisposing to lung cancer risk in lifetime never smokers. We replicated an SNP in the TGF-β family in ETS-exposed patients or those with inflammatory/allergic conditions, and by using in silico analyses, we were able to identify upstream and downstream SNPs of our target SNPs that further contributed to risk. Recent progress in identification of novel SNPs, especially those generated from the 1000 Genomes Project, have identified several polymorphisms in the ACVR1B gene that could be candidates for causal variants. Those SNPs include 6 polymorphisms located in the coding region of the ACVR1B: rs34488074, rs114081852, rs117020497, rs114735080, rs77643569, and rs34050429. We plan to include these SNPs in the next phase of our targeted sequencing studies.

Methods

Subject Accrual

This analysis focuses on lung cancer cases and controls who reported themselves to be lifetime never smokers (i.e., smoked <100 cigarettes over a lifetime). Cases for the discovery phase were consecutive Caucasian patients with newly diagnosed, histopathologically confirmed, and previously untreated non–small cell lung cancer with no age, gender, ethnicity, tumor histology, or disease-stage restrictions. Medical history, family history of cancer, adult environmental tobacco-exposure history, and occupational history were obtained through an interviewer-administered risk-factor questionnaire. We did not validate self-reports of passive smoking exposure. Case-exclusion criteria for the study included prior chemotherapy or radiotherapy or recent blood transfusion.

We recruited our control population from the Kelsey-Seybold Clinic, Houston’s largest multidisciplinary physician practice. Potential controls were first surveyed with a short questionnaire for their willingness to participate in research studies and to provide preliminary data for matching demographic characteristics with those of cases (4). Controls were frequency matched to the cases on the basis of age (±5 years), sex, smoking status, and ethnicity. Exclusion criteria were similar and also included no prior cancer. To date, the response rate among both the cases and controls has been approximately 75%. On receiving informed consent, we drew a 40-mL blood sample into coded, heparinized tubes from study participants. Genomic DNA was extracted from peripheral blood lymphocytes and stored at −80°C.

The replication phase was conducted among never-smoking cases and controls recruited between January 1997 and September 2008 and who were included in a published GWAS (5). These lifetime never-smoking lung cancer cases were recruited from the Mayo Clinic, and community residents who were never smokers were selected as controls and matched to the patients according to age, sex, and ethnic background. Personal interviews with structured questionnaires were used to elicit demographic, epidemiologic, and exposure data. Institutional review board approval was obtained from the MD Anderson Cancer Center, Kelsey-Seybold Foundation (Houston, Texas), and Mayo Clinic (Rochester, Minnesota).

Gene and SNP Selection

Candidate genes for the discovery phase were selected based on the following criteria. We searched the database (38) and the National Center for Biotechnology Information (NCBI) PubMed (39) to identify a list of inflammation pathway–related genes. For each gene, we selected haplotype tagging SNPs (htSNP) located within 10 kb upstream of the transcriptional start site or 10 kb downstream of the transcriptional stop site, based on data from the International HapMap Project (40) release 24/Phase II. By using the LD select program (41) and the UCSC Golden Path Gene Sorter program (42), we further divided

identified SNPs into bins based on an r² threshold of 0.8 and minor allele frequency (MAF) greater than 0.05 in Caucasians to select tagging SNPs. We also included SNPs in the coding (synonymous SNPs, nonsynonymous SNPs) and regulatory regions (promoter, splicing site, 5′ UTR, and 3′ UTR). Functional SNPs and SNPs previously reported to be associated with cancer were also included. We also extensively used the inflammation pathway gene list and functionally defined subpathways, as outlined in Loza and colleagues (7), who suggested that variants in multiple genes in inflammation pathways may likely cooperate in additive or synergistic ways to affect disease risk. The complete set of selected SNPs was submitted to Illumina technical support for Infinium chemistry designability, beadtype analyses, and iSelect Infinium Beadchip synthesis.

Of the total number of selected SNPs, 2.9% could not be designed because of designability score failure. An additional 12% could not be incorporated into the beadchip owing to manufacturing issues (within the norm stated by Illumina). Overall, slightly fewer than 15% of all SNPs were not designed. We did not seek surrogates for failed SNPs, because of the relatively low failure rate for designability (<3%) and constraint on the total number of beadtypes for the custom chip design.

Genotyping

In total, 19,949 SNPs were genotyped in the discovery samples by using Illumina's Infinium iSelect HD Custom Genotyping BeadChip according to the standard 3-day protocol (San Diego, CA). Of these, 11,930 SNPs were in inflammation pathways, and the remaining SNPs were identified from ongoing GWAS for further query in separate analyses. Genotypes were autocalled by using the BeadStudio software. Any SNP with a call rate lower than 95% was excluded from further analysis (n = 203). A further 27 SNPs were removed because of a difference in genotype between the original and the duplicate sample (error rate). We also deleted 93 SNPs that were at the same chromosomal position and 89 SNPs with MAF = 0. The final data set included 19,537 SNPs, of which 11,737 SNPs were in the inflammation pathway.

For the Mayo Clinic samples, whole genome amplification (WGA) was performed before SNP genotyping. The WGA was set up in four separate reactions, each of which included 25 ng of genomic DNA and standard amplification procedures with a total reaction volume of 25 μL (REPLI-g Midi Kit, Qiagen). After WGA, the four reactions were pooled, mixed, and quantified by the picogreen method. Genotyping was performed in Dynamic Arrays (Fluidigm; South San Francisco, CA) containing integrated fluidic circuits (IFCs). Then 75 ng of the WGA-DNA was pre-amplified using 0.2X primer multiplex of the source primers. 2.3 μL of pre-amplified DNA was then loaded onto the array; 3 μL of each Applied Biosystems TaqMan genotyping assay in a 5-μL assay reaction volume was loaded onto the array. The assay was run for 40 PCR cycles under vacuum pressure. The end-point read was performed on an EP1 machine by using a CCD camera to detect VIC and FAM dyes. SNP Genotyping Analysis Software was used to autocall SNP genotype clusters with a confidence index of 95%. The specific SNPs identified from this pathway-based analysis were not included in the Mayo GWAS chip (5) and were directly genotyped for this analysis. The never-smoker GWAS with the Mayo Clinic samples had a rather limited sample size, and an additional GWAS in never smokers is under way, including our discovery set of never-smoking cases and controls.

Statistical Analyses

Pearson’s χ² test was used to assess the differences in categoric variables, and t tests were used for continuous variables in both discovery and replication data sets. All tests were 2-sided. For each SNP, Hardy-Weinberg equilibrium was assessed among controls by using a χ² test. To assess case–control associations of SNP genotypes with lung cancer risk, we used unconditional logistic regression, implemented by using SAS/Genetics version 9.2. Single-SNP association tests were carried out by using PLINK 1.07 (43).

We applied the Bayesian false discovery probability test (BFDP) (44) to evaluate the chance of obtaining a false-positive association. This approach calculates the probability of declaring no association, given the data and a specified prior on the presence of an association, and has a noteworthy threshold that is defined in terms of the costs of false discovery and nondiscovery. Four levels of prior probability of 0.01, 0.03, 0.05, and 0.07 and odds ratios from 1.3 through 2.0 were tested; selected levels of noteworthiness for BFDP were set at 0.8 (i.e., false nondiscovery rate is 4 times as costly as false discovery). We used the most conservative prior of 0.01 to determine that the association was unlikely to represent a false-positive result.

In stratified analyses, we used logistic regression to examine associations of selected SNPs with lung cancer case–control status for subgroups of subjects defined by sidestream tobacco exposure, history of hay fever, asthma, or family history of lung cancer, comparing each subgroup of cases against controls within that subgroup.

We also performed a stepwise forward logistic regression analysis in which we allowed significant univariate SNPs to enter a model according to the strength of association, provided they showed association with disease (P < 0.05). SNPs were retained for analysis if they continued to show association (P < 0.05), given other SNPs in the model. Linkage disequilibrium (LD) between SNPs was calculated for cases and controls by using PLINK before all the SNPs were entered into the model. If two SNPs were in high LD (r² ≥ 0.8), only one SNP was entered into the model. Linkage disequilibrium was visualized by using Haploview v. 4.1 (45) to summarize r² statistics.

Genotyped SNPs in the region 1 Mb from each side of the ACVR1B gene range were retrieved (46). Before imputation, we identified three A/T or C/G SNPs that were in opposite strand orientation to the strand of the 1000 Genomes Project reference data, based on comparisons of minor allele frequencies. The strands for these three SNPs were flipped before imputation. MACH version 1.016 was used for imputation and options, with the 1000 Genomes Project March 2010 release CEU data as the reference panel (47). For the replication analysis, we included all SNPs that were statistically significant at P values < 0.001 and BFDP levels ≤ 0.8 with prior probability of 0.01. For risk-model construction, we retained all epidemiologic variables that were components of our published risk-prediction model for never smokers (6). However, because the Mayo Clinic study did not have data available on prior hay fever, we elected to omit this variable from the model. For each risk model, we calculated specificity and sensitivity of the resulting logistic regression model by constructing receiver operator characteristic (ROC) curves and calculating the area under the curve (AUC) statistic to estimate the ability of the models to discriminate between patients and controls for the two populations separately and combined. Approximate 95% confidence intervals for the AUC were calculated, assuming a binegative exponential distribution by using SAS statistical software. An AUC of 0.5 indicates chance prediction (equivalent to a coin toss), whereas a statistic of 0.7 or higher indicates good discrimination. We also constructed expanded models that included any replicated SNPs. We performed pairwise comparisons of AUCs of the baseline multiple logistic model and the expanded model including genetic data by using a contrast matrix to evaluate differences of the areas under the empirical ROC curves (48).

Disclosure of Potential Conflicts of Interest

The authors declare that they have no competing financial interests. None of the sponsors played a role in the study design, collection, analysis, and interpretation of the data, in the writing of this article, or in the decision to submit the manuscript for publication.

Grant Support

This work was supported by grants from the National Cancer Institute [CA55769 and CA127219 (M.R. Spitz); CA80127 and CA84354 (P. Yang); U19CA148127 and CA121197 (C.I. Amos); CA123235 and CA131327 (C.J. Etzel); and CA149462 (O.Y. Gorlova)]; Kelsey Seybold Research Foundation; and Mayo Foundation Fund.

Received April 8, 2011; revised August 19, 2011; accepted August 22, 2011; published OnlineFirst August 25, 2011.

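The Hardy-Weinberg check among controls mentioned under Statistical Analyses can be sketched as a one-degree-of-freedom χ² goodness-of-fit test on genotype counts. This is an illustrative sketch only: the function name and the genotype counts are hypothetical, and the study itself used PLINK and SAS rather than this code.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium for one biallelic SNP.

    n_aa, n_ab, n_bb : observed genotype counts among controls
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

if __name__ == "__main__":
    # Hypothetical control genotype counts for a single SNP
    chi2, p_value = hwe_chi_square(260, 200, 48)
    print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A small P value would flag a SNP whose control genotypes depart from Hardy-Weinberg proportions, which is commonly treated as a sign of genotyping error.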

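The AUC statistic used to judge the risk models can be illustrated with the empirical (rank-based) estimate: the proportion of case–control pairs in which the case receives the higher predicted risk, with ties counted as one half. This sketch covers only the point estimate. It does not reproduce the confidence intervals or the contrast-matrix comparison of correlated ROC curves (48) described in the Methods, and the function name and predicted risks are hypothetical.

```python
def empirical_auc(case_scores, control_scores):
    """Empirical area under the ROC curve from predicted risks.

    Equivalent to the Mann-Whitney statistic scaled to [0, 1]:
    0.5 indicates chance prediction and 1.0 perfect discrimination.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0      # case ranked above control
            elif case == control:
                wins += 0.5      # ties count as half
    return wins / (len(case_scores) * len(control_scores))

if __name__ == "__main__":
    # Hypothetical predicted risks from a fitted logistic model
    cases = [0.81, 0.64, 0.58, 0.40]
    controls = [0.55, 0.35, 0.30, 0.22]
    print(f"AUC = {empirical_auc(cases, controls):.3f}")
```

Computing this statistic for a baseline model and for an expanded model that adds a replicated SNP gives the two quantities whose difference the contrast-matrix test evaluates.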
References

1. Subramanian J, Govindan R. Lung cancer in ‘Never-smokers’: a unique entity. Oncology 2010;24:29–35.
2. Spitz MR, Amos CI, Dong Q, Lin J, Wu X. The CHRNA5-A3 region on chromosome 15q24-25.1 is a risk factor both for nicotine dependence and for lung cancer. J Natl Cancer Inst 2008;100:1552–6.
3. Walser T, Cui X, Yanagawa J, Lee JM, Heinrich E, Lee G, et al. Smoking and lung cancer: the role of inflammation. Proc Am Thorac Soc 2008;5:811–5.
4. Hudmon KS, Honn SE, Jiang H, Chamberlain RM, Xiang W, Ferry G, et al. Identifying and recruiting healthy control subjects from a managed care organization: a methodology for molecular epidemiological case-control studies of cancer. Cancer Epidemiol Biomarkers Prev 1997;6:565–71.
5. Li Y, Sheu CC, Ye Y, de Andrade M, Wang L, Chang SC, et al. Genetic variants and risk of lung cancer in never smokers: a genome-wide association study. Lancet Oncol 2010;11:321–30.
6. Spitz MR, Hong WK, Amos CI, Wu X, Schabath MD, Dong Q, et al. A risk model for prediction of lung cancer. J Natl Cancer Inst 2007;99:715–26.
7. Loza MJ, McCall CE, Li L, Isaacs WB, Xu J, Chang BL. Assembly of inflammation-related genes for pathway-focused genetic analysis. PLoS One 2007;2:e1035.
8. Gorlova OY, Zhang Y, Schabath MB, Lei L, Zhang Q, Amos CI, et al. Never smokers and lung cancer risk: a case-control study of epidemiological factors. Int J Cancer 2006;118:1798–804.
9. Nikitin A, Egorov S, Daraselia N, Mazo I. Pathway studio: the analysis and navigation of molecular networks. Bioinformatics 2003;19:2155–7.
10. Hanada T, Yoshimura A. Regulation of cytokine signaling and inflammation. Cytokine Growth Factor Rev 2002;13:413–21.
11. Engels EA. Inflammation in the development of lung cancer: epidemiological evidence. Expert Rev Anticancer Ther 2008;8:605–15.
12. Chaturvedi AK, Caporaso NE, Katki HA, Wong HL, Chatterjee N, Pine SR, et al. C-reactive protein and risk of lung cancer. J Clin Oncol 2010;28:2719–26.
13. Slatore CG, Au DH, Littman AJ, Satia JA, White E. Association of nonsteroidal anti-inflammatory drugs with lung cancer: results from a large cohort study. Cancer Epidemiol Biomarkers Prev 2009;4:1203–7.
14. Van Dyke AL, Cote ML, Prysak G, Claeys GB, Wenzlaff AS, Schwartz AG. Regular adult aspirin use decreases the risk of non-small cell lung cancer among women. Cancer Epidemiol Biomarkers Prev 2008;17:148–57.
15. Khuder SA, Herial NA, Mutgi AB, Federman DJ. Nonsteroidal antiinflammatory drug use and lung cancer: a meta-analysis. Chest 2005;127:748–54.
16. Olsen JH, Friis S, Poulsen AH, Fryzek J, Harving H, Tjønneland A, et al. Use of NSAIDs, smoking and lung cancer risk. Int J Cancer 2008;98:232–7.
17. Kelly JP, Coogan P, Strom BL, Rosenberg L. Lung cancer and regular use of aspirin and nonaspirin nonsteroidal anti-inflammatory drugs. Pharmacoepidemiol Drug Saf 2008;4:322–7.
18. Feskanich D, Bain C, Chan AT, Pandeya N, Speizer FE, Colditz GA. Aspirin and lung cancer risk in a cohort study of women: dosage, duration and latency. Br J Cancer 2007;97:1295–9.
19. Wall RJ, Shyr Y, Smalley W. Nonsteroidal anti-inflammatory drugs and lung cancer risk: a population-based case control study. J Thorac Oncol 2007;2:109–14.
20. ten Dijke P, Ichijo H, Franzen P, Schulz P, Saras J, Toyoshima H, et al. Activin receptor-like kinases: a novel subclass of cell-surface receptors with predicted serine/threonine kinase activity. Oncogene 1993;8:2879–87.
21. Su GH, Bansal R, Murphy KM, Montgomery E, Yeo CJ, Hruban RH, et al. ACVR1B (ALK4, activin receptor type 1B) gene mutations in pancreatic carcinoma. Proc Natl Acad Sci U S A 2001;98:3254–7.
22. Burette JE, Jeruss JS, Kurley SJ, Lee EJ, Woodruff TK. Activin A mediates growth inhibition and cell cycle arrest through SMADs in human breast cancer cells. Cancer Res 2005;65:7968–75.
23. Spira A, Beane JE, Shah V, Steiling K, Liu G, Schembri F, et al. Airway epithelial gene expression in the diagnostic evaluation of smokers with suspect lung cancer. Nat Med 2007;13:361–6.
24. Wrage M, Ruosaari S, Eijk PP, Kaifi JT, Hollmén J, Yekebas EF, et al. Genomic profiles associated with early micrometastasis in lung cancer: relevance of 4q deletion. Clin Cancer Res 2009;15:1566–74.
25. Carolan BJ, Heguy A, Harvey BG, Leopold PL, Ferris B, Crystal RG. Up-regulation of expression of the ubiquitin carboxyl-terminal L1 gene in human airway epithelium of cigarette smokers. Cancer Res 2006;66:10729–40.
26. Border WA, Noble NA. Transforming growth factor beta in tissue fibrosis. N Engl J Med 1994;331:1286–92.
27. Ohga E, Matsuse T, Teramoto S, Katayama H, Nagase T, Fukuchi Y, et al. Effects of activin A on proliferation and differentiation of human lung fibroblasts. Biochem Biophys Res Commun 1996;228:391–6.
28. Kariyawasam HH, Pegorier S, Barkans J, Xanthou G, Aizen M, Ying S, et al. Activin and transforming growth factor-beta signaling pathways are activated after allergen challenge in mild asthma. J Allergy Clin Immunol 2009;124:454–62.
29. Ogawa K, Funaba M, Chen Y, Tsujimoto M. Activin A functions as a Th2 cytokine in the promotion of the alternative activation of macrophages. J Immunol 2006;177:6787–94.
30. Santillan AA, Camargo CA Jr, Colditz GA. A meta-analysis of asthma and risk of lung cancer. Cancer Causes Control 2003;14:327–34.
31. Cockcroft DW, Klein GJ, Donevan RE, Copland GM. Is there a negative correlation between malignancy and respiratory atopy? Ann Allergy 1979;43:345–7.
32. Diatchenko L, Romanov S, Malinina I, Clarke J, Tchivilev I, Li X, et al. Identification of novel mediators of NF-kappaB through genome-wide survey of monocyte adherence-induced genes. J Leukoc Biol 2005;78:1366–77.
33. Kagaya S, Hashida R, Ohkura N, Tsukada T, Sugita Y, Terakawa M, et al. NR4A orphan nuclear receptor family in peripheral blood eosinophils from patients with atopic dermatitis and apoptotic eosinophils in vitro. Int Arch Allergy Immunol 2005;137(Suppl 1):35–44.
34. Spitz MR, Amos CI, D’Amelio A Jr, Dong Q, Etzel C. Re: Discriminatory accuracy from single-nucleotide polymorphisms in models to predict breast cancer risk. J Natl Cancer Inst 2009;101:1731–2.
35. Husgafvel-Pursiainen K. Genotoxicity of environmental tobacco smoke: a review. Mutat Res 2004;567:427–45.
36. Wu AH. Exposure misclassification bias in studies of environmental tobacco smoke and lung cancer. Environ Health Perspect 1999;107(Suppl 6):873–7.
37. Arheart KL, Lee DJ, Fleming LE, LeBlanc WG, Dietz NA, McCollister KE, et al. Accuracy of self-reported smoking and secondhand smoke exposure in the US workforce: the National Health and Nutrition Examination Surveys. J Occup Environment Med 2008;50:1414–20.
38. The Gene Ontology Consortium. Gene ontology: tool for the unification of biology. (cited 2010 August 25). Available from: http://www.geneontology.org.
39. The National Center for Biotechnology Information (NCBI). (cited 2010 August 25). Available from: http://www.ncbi.nlm.nih.gov/pubmed
40. International HapMap Consortium. The international HapMap project. (cited 2011 August 25). Available from: http://hapmap.ncbi.nlm.nih.gov/cgi-perl/gbrowse/hapmap24_B36/.
41. Documentation for ldSelect Version 1.0. Deborah A. Nickerson, Mark Rieder, Chris Carlson, Qian Yi, University of Washington. (cited 2010 August 25). Available from: http://droog.gs.washington.edu/ldSelect.html
42. UCSC Genome Bioinformatics Genome Browser. (cited 2010 August 25). Available from: http://genome.ucsc.edu
43. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreria MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559–75.
44. Wakefield J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am J Hum Genet 2007;81:208–27.
45. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 2005;21:263–5.
46. Diabetes Genetics Initiative of Broad Institute of Harvard and MIT, Lund University, and Novartis Institutes of BioMedical Research. (cited 2009 Jan 15). Available from: http://www.broadinstitute.org/science/projects/diabetes-genetics-initiative/plotting-genome-wide-association-results.
47. Markov Chain Haplotyping (MACH) software tool for haplotype estimation and genotype imputation. Developed by Goncalo Abecasis and Yun Li. (cited 2008 April). Available from: http://www.sph.umich.edu/csg/abecasis/MACH/index.html release MACH1.016
48. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837–45.
