© 2017 Nature America, Inc., part of Springer Nature. All rights reserved.

Nature Genetics ADVANCE ONLINE PUBLICATION

letters

Shared genetic origin of asthma, hay fever and eczema elucidates allergic disease biology

Manuel A Ferreira, Judith M Vonk, Hansjörg Baurecht, Ingo Marenholz, Chao Tian, Joshua D Hoffman, Quinta Helmer, Annika Tillander, Vilhelmina Ullemar, Jenny van Dongen, Yi Lu, Franz Rüschendorf, Jorge Esparza-Gordillo, Chris W Medway, Edward Mountjoy, Kimberley Burrows, Oliver Hummel, Sarah Grosche, Ben M Brumpton, John S Witte, Jouke-Jan Hottenga, Gonneke Willemsen, Jie Zheng, Elke Rodríguez, Melanie Hotze, Andre Franke, Joana A Revez, Jonathan Beesley, Melanie C Matheson, Shyamali C Dharmage, Lisa M Bain, Lars G Fritsche, Maiken E Gabrielsen, Brunilda Balliu, 23andMe Research Team, AAGC collaborators, BIOS consortium, LifeLines Cohort Study, Jonas B Nielsen, Wei Zhou, Kristian Hveem, Arnulf Langhammer, Oddgeir L Holmen, Mari Løset, Gonçalo R Abecasis, Cristen J Willer, Andreas Arnold, Georg Homuth, Carsten O Schmidt, Philip J Thompson, Nicholas G Martin, David L Duffy, Natalija Novak, Holger Schulz, Stefan Karrasch, Christian Gieger, Konstantin Strauch, Ronald B Melles, David A Hinds, Norbert Hübner, Stephan Weidinger, Patrik K E Magnusson, Rick Jansen, Eric Jorgenson, Young-Ae Lee, Dorret I Boomsma, Catarina Almqvist, Robert Karlsson, Gerard H Koppelman & Lavinia Paternoster

A full list of affiliations appears at the end of the paper.
Received 13 June; accepted 6 October; published online 30 October 2017; doi:10.1038/ng.398

Asthma, hay fever (or allergic rhinitis) and eczema (or atopic dermatitis) often coexist in the same individuals, partly because of a shared genetic origin. To identify shared risk variants, we performed a genome-wide association study (GWAS; n = 360,838) of a broad allergic disease phenotype that considers the presence of any one of these three diseases. We identified 136 independent risk variants (P < 3 × 10⁻⁸), including 73 not previously reported, which implicate 132 nearby genes in allergic disease pathophysiology. Disease-specific effects were detected for only six variants, confirming that most represent shared risk factors. Tissue-specific heritability and biological process enrichment analyses suggest that shared risk variants influence lymphocyte-mediated immunity. Six target genes provide an opportunity for drug repositioning, while for 36 genes CpG methylation was found to influence transcription independently of genetic effects. Asthma, hay fever and eczema partly coexist because they share many genetic risk variants that dysregulate the expression of immune-related genes.

The analytical approach used in our study is summarized in Supplementary Figure 1. We tested for association with allergic disease 8,307,659 genetic variants that passed quality control filters (Supplementary Table 1), comparing 180,129 cases who reported having suffered from asthma and/or hay fever and/or eczema with 180,709 controls who reported not suffering from any of these diseases (Supplementary Table 2), all of European ancestry. Meta-analysis of results from the 13 contributing studies (Supplementary Fig. 2) identified 99 genomic regions (loci) located >1 Mb apart containing at least one genetic variant associated with allergic disease at a genome-wide significance threshold of P < 3 × 10⁻⁸ (Fig. 1 and Supplementary Table 3). On the basis of approximate conditional analysis, 136 genetic variants in these 99 loci had a statistically independent association with disease risk (Supplementary Table 4). Henceforth, we refer to these as 'sentinel risk variants', which either are or represent variants in linkage disequilibrium (LD) with a causal functional variant. These included 86 (in 50 loci) located <1 Mb from risk variants reported in previous GWAS of allergic disease (Supplementary Table 5). Of note, 23 of these 86 sentinel variants were in low LD (r² < 0.05) with the previously reported risk variants, indicating that they represent new associations in these loci. The remaining 50 sentinel variants (in 49 loci) were located >1 Mb from previously reported associations (Supplementary Table 6), of which 17 were in low LD with nearby variants reported for other diseases or traits (Supplementary Table 7). Altogether, we identified 73 (50 + 23) genetic associations with allergic disease that are new, a substantial increment over the 89 previously reported associations (Supplementary Table 8). Eighteen loci had multiple independent association signals (Supplementary Fig. 3).
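The locus definition used in the meta-analysis paragraph (associated variants lying more than 1 Mb apart fall into separate loci) can be illustrated with a short sketch. The text does not spell out the exact grouping procedure, so the function below is an assumption: it chains variants on the same chromosome and starts a new locus whenever the gap to the previous variant exceeds 1 Mb. The name `group_into_loci` and the toy coordinates are hypothetical; only the reported counts at the end come from the text.

```python
from collections import defaultdict

MB = 1_000_000


def group_into_loci(variants, min_gap=MB):
    """Group (chromosome, position) pairs into loci.

    Assumption: a new locus starts whenever the distance to the previous
    variant on the same chromosome is greater than min_gap.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in variants:
        by_chrom[chrom].append(pos)

    loci = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        current = [positions[0]]
        for pos in positions[1:]:
            if pos - current[-1] > min_gap:
                loci.append((chrom, current))
                current = [pos]
            else:
                current.append(pos)
        loci.append((chrom, current))
    return loci


# Toy coordinates (hypothetical, not taken from the study).
toy_variants = [
    ("1", 152_000_000),
    ("1", 152_400_000),
    ("1", 154_500_000),
    ("17", 38_000_000),
]
toy_loci = group_into_loci(toy_variants)

# Consistency of the counts reported in the text.
known_variants, new_variants = 86, 50
known_loci, new_loci = 50, 49
new_in_known_loci = 23
total_variants = known_variants + new_variants
total_loci = known_loci + new_loci
new_associations = new_variants + new_in_known_loci
```

With the toy input, the two variants 0.4 Mb apart on chromosome 1 share a locus, while the third variant and the chromosome 17 variant each form their own. The reported counts add up to 136 sentinel variants, 99 loci and 73 new associations.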


As expected from a study design that maximized power to identify shared risk variants, we found that 130 of the 136 sentinel variants had similar allele frequencies in case-only association analyses that compared three non-overlapping groups of adults: those who reported suffering from asthma only (n = 12,268), hay fever only (n = 33,305) or eczema only (n = 6,276) (Supplementary Table 9). There was thus no evidence that these 130 variants have differential effects on the three individual diseases. The six variants with evidence for stronger effects in one allergic disease when compared to the other two were located in five known allergy risk loci (for example, FLG and GSDMB; Fig. 2). On the other hand, many sentinel variants (26, or 19%) were also associated with the age at which symptoms of any allergic disease first developed (n = 35,972; Supplementary Table 10), with the allele associated with a higher disease risk always being associated with earlier age of onset (Supplementary Fig. 4). For 18 of these 26 variants, the effect on age of onset was not significantly different between individual diseases (Supplementary Table 10), suggesting that these variants influence the age at which first symptoms develop for all three diseases.

[Figure 1: Manhattan plot with chromosomes 1 to 22 and X on one axis and −log10(P) on the other. Two panels, "50 known loci" and "49 new loci", each with columns "Target genes", "n target genes" and "n sentinel variants".]

Figure 1 Loci containing genetic risk variants associated independently with the risk of allergic disease at P < 3 × 10⁻⁸. The 136 sentinel risk variants were located in 50 previously reported (86 variants) and 49 new (50 variants) risk loci. The numbers of plausible target genes of sentinel risk variants identified for each locus are shown, with target gene names listed in blue font. For loci with many target genes, only a selection is listed. When no target gene was identified (black font), brackets are used to indicate the location of the sentinel risk variant relative to the nearest gene(s). Specifically, when the risk variant was intergenic (indicated by "gene1-[]-gene2"), the two closest genes (upstream and downstream) are shown; the distance to each gene is proportional to the number of dashes shown. Otherwise, when the risk variant was located within a gene, the respective gene name is shown in brackets ("[gene]"). The red vertical line in the Manhattan plot shows the genome-wide significance threshold used (P = 3 × 10⁻⁸).

We then used LD score regression analysis⁷ (Online Methods) to quantify the liability-scale heritability of the three individual diseases that was collectively explained by up to 136 top associations in the Nord-Trøndelag Health Study (HUNT; n = 20,350), which was not part of the discovery meta-analysis. This heritability was found to be 3.2% for asthma, 3.8% for hay fever and 1.2% for eczema, representing, respectively, about one-fifth, one-sixth and one-tenth of the overall heritability for each disease that is explained by common SNPs (Supplementary Table 11). Therefore, the inheritance of risk alleles at these loci partly explains why these three conditions coexist.

To understand the biological consequences of allergy risk variants, we then identified plausible target genes of the 136 sentinel variants. There were 5,739 transcripts annotated near (±1 Mb with respect to) sentinel variants, including 2,569 protein-coding genes. For 132 of these transcripts, the nearby sentinel variant was in high LD (r² ≥ 0.8) with either a nonsynonymous SNP (22 genes; Supplementary Table 12) or a sentinel expression quantitative trait locus (eQTL) identified in relevant tissues or cell types (an additional 110 genes; Supplementary Tables 13 and 14). We refer to these 132 transcripts as plausible target genes, which were located in 54 of the 99 risk loci (Fig. 1 and Supplementary Table 15). Studies that confirm the target gene predictions and identify the underlying functional variants are warranted; genes that could be prioritized for functional follow-up include 78 identified using a more conservative LD threshold (r² ≥ 0.95; Supplementary Table 14) or 61 predicted to be the likely targets on the basis of independent functional evidence from publicly available data (Supplementary Table 15; see the Online Methods for details). Of note, 79 (60%) of the 132 plausible target genes have not previously been co-cited with allergy-related terms (Supplementary Tables 16 and 17) and so potentially represent new key contributors to disease pathophysiology (examples in Table 1).

[Figure 2: one triangle plot per variant, with vertices labelled Eczema, Asthma and Hay fever and an odds ratio printed on each edge. First two rows: rs61816761[A] (chr.1:152Mb, [FLG]); rs12123821[T] (chr.1:152Mb, RPTN-[]-HRNR); rs921650[A] (chr.17:38Mb, [GSDMB]); rs6594499[C] (chr.5:110Mb, WDR36-[]-CAMK4); rs1839660[T] (chr.10:6Mb, [IL2RA]); rs12470864[A] (chr.2:103Mb, IL1RL2-[]-IL18R1). Comparison variant: rs2228145[C] (chr.1:154Mb, [IL6R]).]

Figure 2 Sentinel variants with significant allele frequency differences in pairwise case-only association analyses contrasting individuals suffering from a single allergic disease. For each sentinel variant, we performed three case-only association analyses, comparing asthma-only cases (n = 12,268) against hay fever–only cases (n = 33,305); asthma-only cases against eczema-only cases (n = 6,276); and hay fever–only cases against eczema-only cases. After accounting for multiple testing, significant associations for at least one of these analyses were only observed for 6 of the 136 sentinel variants, which are shown in the first two rows of the figure. For a given variant, the vertices of the inner triangle point to the position along the edges of the outer triangle that corresponds to the allele frequency difference observed between pairs of single-disease cases. For example, the rs61816761[A] allele, which is located in the FLG gene (filaggrin), was 1.32-fold more common in individuals suffering only from eczema when compared to individuals suffering only from hay fever (P = 7.2 × 10⁻⁸), consistent with this SNP being a stronger risk factor for eczema than for hay fever. A similar result (odds ratio (OR) = 1.26, P = 0.0004) was observed for this variant when contrasting eczema-only cases against asthma-only cases. For comparison, a variant with no allele frequency differences in all three pairwise single-disease association analyses is also shown (rs2228145, in the IL6R gene). In this case, the three estimated odds ratios were approximately equal to 1. The color of the odds ratio text reflects the significance of the association: red for P < 1.2 × 10⁻⁴ (correction for multiple testing), blue for P < 0.05 and black for P > 0.05.

Next, on the basis of data from the GTEx Consortium⁸, we identified broad tissue types in which the plausible target genes were disproportionally expressed, using the Tissue-Specific Expression Analysis (TSEA) approach described previously⁹. We excluded genes located in the major histocompatibility complex (MHC) region or not present in the TSEA GTEx database, leaving 112 plausible target genes for analysis. When compared to the remaining 17,671 non-MHC genes in the genome, we found that the list of plausible targets was enriched for genes specifically expressed in whole blood and lung (Fig. 3a and Supplementary Table 18). Both associations remained significant (Supplementary Fig. 5) after restricting the background gene list to the subset of 12,804 non-MHC genes with eQTLs reported in the same studies used to identify the plausible target genes. These results indicate that the plausible targets are enriched for genes preferentially expressed in whole blood and lung, and that this is unlikely to arise because the plausible targets were also enriched for genes with eQTLs in those tissues.

The enrichment in whole blood and lung expression could be a general feature of arbitrary genes located near the sentinel risk variants. To address this possibility, we determined how often the enrichment observed with the plausible target genes was exceeded when analyzing 1,000 lists of random genes. When genes were randomly selected from the same 98 non-MHC allergy risk loci identified in the meta-analysis, matching on the number of plausible target genes identified per locus (range 0 to 11) and in total (112), the enrichment observed in whole blood was not exceeded in any of the 1,000 random lists (Fig. 3a and results when considering all 25 tested tissues in Supplementary Table 18). Similar results were observed for lung. For comparison, arbitrary genes were also selected from 2-Mb loci drawn at random from the genome or simply from all genes in the genome, and results were very similar (Supplementary Table 18). Randomly selecting genes from the subset with eQTLs also had no impact on the results (Supplementary Fig. 5). Therefore, we conclude that the enrichment in expression observed in whole blood and lung was specific to the genes identified as plausible targets of sentinel risk variants.
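The comparison against 1,000 lists of random genes amounts to an empirical null distribution for the enrichment score. The sketch below illustrates that logic under stated assumptions: the scores are taken to be −log10(P) enrichment values, the function names `fraction_exceeding` and `summary_score` are hypothetical, and the toy null scores are invented placeholders rather than study results.

```python
def fraction_exceeding(observed, random_scores):
    """Share of random gene lists whose enrichment score exceeds the observed one."""
    exceed = sum(1 for score in random_scores if score > observed)
    return exceed / len(random_scores)


def summary_score(random_scores, rank=50):
    """Score of the rank-th most significant random list.

    With 1,000 lists, rank=50 corresponds to the 5th percentile.
    """
    return sorted(random_scores, reverse=True)[rank - 1]


# Toy null distribution: 1,000 invented -log10(P) values from 0.000 to 0.999.
random_scores = [i / 1000 for i in range(1000)]

# Hypothetical observed enrichment for the plausible target genes.
observed = 4.2

share = fraction_exceeding(observed, random_scores)
threshold = summary_score(random_scores)
```

In this toy case no random list exceeds the observed score, which mirrors the statement that the whole-blood enrichment was not exceeded in any of the 1,000 random lists. `summary_score` returns the value that 5% of random lists reach or exceed.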


To identify specific cell types that were likely to contribute to the enrichment in whole blood, we used an orthogonal approach¹⁰ that quantifies tissue-specific enrichments in SNP heritability rather than in gene expression. Specifically, this approach quantifies the trait heritability that is explained by SNPs that overlap cell-type-specific regulatory annotations measured by the Encyclopedia of DNA Elements (ENCODE) project in 100 different cell types. In this analysis, the strongest enrichment in SNP heritability was observed for regulatory annotations measured in T helper cells (including TH17, TH0, TH1 and TH2), regulatory T cells, CD4+ and CD8+ T cells, CD56+ natural killer (NK) cells and CD19+ B cells (Fig. 3b and Supplementary Table 19). These results are consistent with previous findings¹¹ and the widely documented contribution of these T cell subsets to allergic responses. Similar results were obtained after removing the 136 top associations from our GWAS results (Supplementary Fig. 6), indicating that the observed enrichments extend beyond genome-wide significant SNPs. These results demonstrate that genetic risk variants shared between asthma, hay fever and eczema, including but not limited to the ones that reached genome-wide significance, operate to a large extent by modulating gene expression in cells of the immune system.

Table 1 Selected examples of plausible target genes not previously implicated in the pathophysiology of allergic disease

Gene | Function | Possible role(s) in allergic disease
SENP7 | Sentrin/small ubiquitin-like modifier (SUMO)-specific protease | Susceptibility to viral infection
ARHGAP15 | Rho GTPase–activating protein that downregulates RAC1 | Rac1-dependent inflammatory response
THEM4 | Mitochondrial thioesterase that is a negative regulator of protein kinase B | Vitamin D–dependent macrophage-mediated inflammation
DYNAP | Dynactin-associated protein that activates protein kinase B | Cytokine signaling, T cell function
SMARCE1 | Subunit of the BAF chromatin-remodeling complex | Repressor of CD4 differentiation
RTF1 | Component of the PAF complex, which is involved in transcriptional regulation | Antiviral response, regulation of TNF expression
SIK2 | Salt-inducible kinase | Regulation of macrophage inflammatory phenotype, metabolic homeostasis
RASA2 | GTPase-activating protein of Ras that regulates receptor signal transduction | Unknown; RASA3: hematopoiesis; RASA4: macrophage phagocytosis
PPP2R3C | Subunit of protein phosphatase 2A (PP2A) that regulates immune cell function | Positive regulation of B cell differentiation, eosinophil survival and migration
RERE | Nuclear receptor co-regulator that positively regulates retinoic acid signaling | TH2 differentiation, Treg function, response to viral infection

References that support the possible role(s) listed are provided in the Supplementary Note.

[Figure 3: three panels, each with enrichment −log10(P) on the y axis.
Panel a, "Enrichment in tissue-specific gene expression": x axis, GTEx tissue; labelled tissues: Blood, Lung, Skin.
Panel b, "Enrichment in tissue-specific SNP heritability": x axis, ENCODE tissue-specific regulatory annotation; inset "Tissues with strongest enrichment in SNP heritability" (legible entries: 2. CD4 memory primary: H3K4me1; 7. CD8 memory primary: H3K4me1; 9. CD4 naive primary: H3K4me1).
Panel c, "Enrichment in biological processes": x axis, Gene Ontology Biological Process category; top ten: 1. GO:0008624 induction of apoptosis by extracellular signals; 2. GO:0012502 induction of programmed cell death; 3. GO:0006917 induction of apoptosis; 4. GO:0045086 positive regulation of interleukin-2 biosynthetic process; 5. GO:0031294 lymphocyte costimulation; 6. GO:0031295 T cell costimulation; 7. GO:0030888 regulation of B cell proliferation; 8. GO:0002695 negative regulation of leukocyte activation; 9. GO:0042100 B cell proliferation; 10. GO:0016311 dephosphorylation.
Tissue color key: Cardiovascular/musculoskeletal; Digestive/endocrine; Immune; Nervous; Reproductive/urinary; Respiratory/skin. Random genes key: ascertained from allergy loci; ascertained from random loci; ascertained from all genes. Summary line: P = 0.05.]

Figure 3 Tissues and biological processes influenced by allergy risk variants. (a) Enrichment of tissue-specific gene expression in 25 broad tissues studied by the GTEx Consortium. We used the TSEA approach⁹ to test whether genes specifically expressed in a given tissue were enriched among the list of plausible target genes when compared to other genes in the genome. Enrichment (y axis) is shown as the −log10 Fisher's exact test P value. For comparison, we analyzed 1,000 lists of random genes instead of the plausible target genes. We selected genes at random using three strategies (see the Online Methods for details). First, genes were randomly drawn from the 98 non-MHC allergy risk loci identified in our GWAS, matching on the number selected per locus and in total. The enrichment P value for each of the 1,000 lists of random genes is shown by a gray circle. The black solid line shows the P value for the 50th most significant random list (corresponding to the 5th percentile): under the null hypothesis of no enrichment, this P value should be close to 0.05 (horizontal gray line). Second, genes were selected at random from 2-Mb loci drawn at random from the genome, matching on the number of genes selected (and available for selection) per locus and in total. Third, genes were drawn at random from all 18,300 genes available for analysis. For the latter two strategies, the P value for the 50th most significant random gene list is shown by the blue and yellow lines, respectively; individual results for each random data set are not shown. (b) Enrichment of SNP-based heritability in 220 individual cell-type-specific regulatory annotations. We used stratified LD score regression analysis to quantify the contribution of SNPs that overlap cell-type-specific regulatory annotations to the disease SNP-based heritability. Annotations with an enrichment in SNP heritability (−log10 of the P value of the regression coefficient; y axis) that was significant after correcting for multiple testing (P < 0.0002) are shown in black circles (top ten listed in blue font; all results in Supplementary Table 19). (c) Biological processes enriched amongst the list of plausible target genes. We used GeneNetwork¹² to test whether the plausible target genes as a group were more likely to be part of a specific biological process category when compared to the rest of the genes in the genome. Enrichment (y axis) is shown as −log10 of the Wilcoxon rank-sum test P value (see Online Methods for details). The top ten pathways are listed in blue font. For comparison, we analyzed 1,000 lists of random genes generated using the same three strategies described above. For each of these strategies, the P value for the 50th most significant random gene list is shown by the black (random genes from allergy loci), blue (random genes from random loci) and yellow (random genes selected from all available genes) lines. Genes and SNPs in the MHC region were excluded from these analyses.
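The tissue enrichment in Figure 3a is reported as a −log10 Fisher's exact test P value. As a minimal sketch of that kind of test, the function below computes a one-sided (upper-tail) Fisher's exact P value from the hypergeometric distribution using only the standard library. This is a generic illustration and not the TSEA implementation. The gene-universe size reuses the 112 target genes and 17,671 remaining non-MHC genes from the text, while the tissue-specific gene count and the overlap are hypothetical placeholders.

```python
from math import comb, log10


def fisher_upper_tail(overlap, targets, tissue_genes, total_genes):
    """One-sided Fisher's exact P value: P(X >= overlap).

    X is the number of tissue-specific genes among `targets` genes drawn
    without replacement from `total_genes`, of which `tissue_genes` are
    tissue-specific (hypergeometric distribution).
    """
    upper = min(targets, tissue_genes)
    favourable = sum(
        comb(tissue_genes, k) * comb(total_genes - tissue_genes, targets - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(total_genes, targets)


# Small case that can be checked by hand: 1 / C(10, 5) = 1/252.
p_small = fisher_upper_tail(overlap=5, targets=5, tissue_genes=5, total_genes=10)

# Larger illustration: 112 + 17,671 genes as in the text; the number of
# tissue-specific genes (1,500) and the overlap (25) are hypothetical.
p_toy = fisher_upper_tail(overlap=25, targets=112, tissue_genes=1_500,
                          total_genes=112 + 17_671)
enrichment = -log10(p_toy)
```

Python integers have arbitrary precision, so the binomial coefficients are exact and only the final division is performed in floating point.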

To help understand how the sentinel variants might influence immune cell function, we then identified biological processes over-represented among the plausible target genes when compared to the rest of the genes in the genome (MHC excluded), using GeneNetwork¹². As for the analysis of tissue-specific enrichment in gene expression, for each specific biological process, we compared the enrichment observed with the list of plausible target genes with that observed with 1,000 lists of genes randomly drawn from the same allergy risk loci. After correcting for the 3,770 biological processes tested, we found 35 pathways for which the enrichment observed with the plausible target genes was exceeded in <5% of the random gene lists (Fig. 3c and Supplementary Table 20). These included biological processes related to T and B cell activation, B cell proliferation and isotype switching, and IL-2 and IL-4 production, confirming a key role for the sentinel variants and the likely target genes on lymphocyte-mediated immunity. Other noteworthy enrichments were observed for pathways related to induction of cell death, lipid phosphorylation and NK cell differentiation.

Consistent with a widespread effect of allergy risk variants on immune cell function, many sentinel risk variants have been reported to associate with other immune-related traits, notably blood cell counts (Supplementary Table 21) and autoimmune diseases (Supplementary Table 22). The genetic overlap with autoimmune diseases was not restricted to sentinel variants, as evidenced by significant positive genetic correlations with celiac disease, Crohn's disease and inflammatory bowel disease obtained after excluding the 136 top associations from our GWAS results (Supplementary Table 23). Other significant genetic correlations were observed for obesity- and depression-related traits, both previously suggested by twin studies¹³. The former provides support for a role of allergy risk variants in the regulation of metabolic homeostasis.

We then investigated whether any of the plausible target genes identified could potentially represent a new opportunity for drug repositioning, as shown by others¹⁴. We found that 29 genes have been or are being considered as drug targets, including 9 for the treatment of allergic diseases (Supplementary Table 24), 4 for autoimmune diseases (Supplementary Table 25) and 16 for other diseases (Supplementary Table 26), mostly cancer. Therefore, for 20 genes, drugs currently in development for other indications might influence the biological mechanisms underlying allergic disease. For six of these genes, the effect on gene expression of the allergy-protective allele and the existing drug matched (Table 2), suggesting that the latter might attenuate (and not exacerbate) allergy symptoms and so could be prioritized for preclinical testing.

Finally, on the basis of data from the BIOS consortium, we found that a substantial fraction of target genes (36, or 27%) had a nearby CpG site for which methylation levels were significantly correlated with mRNA levels in blood, independently of SNP effects (Supplementary Table 27). This observation raises the possibility that environmental effects on the methylation state of these CpGs might influence target gene expression and, by extension, allergic disease risk. Well-powered studies that address this possibility are warranted. In exploratory analyses, we tested the association between five established risk factors for allergic disease (Online Methods) and the methylation state of expression-associated CpGs for those 36 genes (largest n = 1,211). We observed only one significant association, between smoking and the methylation state of PITPNM2 (Supplementary Table 28), which was reported in a previous study¹⁸. The results (Supplementary Table 29) indicate that smoking might influence the risk of allergic disease partly by modulating the methylation state of expression-associated CpGs for PITPNM2, a PYK2-binding protein¹⁹ involved in neutrophil function.

In conclusion, we substantially increased the number of known risk variants for allergic disease through a large GWAS of a multiple-disease phenotype defined on the basis of information from three genetically correlated diseases (asthma, hay fever and eczema). With few exceptions, the variants identified had similar effects on the individual disease entities. The risk variants, and their target genes, are predicted to overwhelmingly influence the function of immune cells. Novel drugs for allergy are proposed on the basis of genomics-guided drug repositioning. Finally, our results raise the possibility that environmental factors such as smoking might influence allergic disease risk through modulation of target gene methylation.

Methods
Methods, including statements of data availability and any associated accession codes and references, are available in the online version of the paper.

Table 2 Plausible target genes with drugs in development for indications other than allergic diseases, for which the effect on gene expression of the allergy-protective allele and the existing drug matched

Plausible target gene | Effect of allergy-protective allele on gene expression | Drug action | Drug status | Drug name | Originator company
TARS2 | Decreased | Antagonist | Discovery | Borrelidin inhibitor | Scripps Research Institute
RGS14 | Decreased | Antagonist | NA | Regulator of G protein signaling 14 | University of Malaga
PHF5A | Decreased | Antagonist | Discovery | PHF5A inhibitors | Fred Hutchinson Cancer Research Center
F11R | Decreased | Antagonist | Discovery | F-50073 | Pierre Fabre SA
F11R | Decreased | Antagonist | Discovery | F11R inhibitors | Provid Pharmaceuticals, Inc.
CCR7 | Decreased | Antagonist | NA | Chemokine receptor inhibitors | Sosei Group Corp
CCR7 | Decreased | Antagonist | NA | Chemokine antagonists | Neurocrine Biosciences, Inc.
CCR7 | Decreased | Antagonist | Discovery | CCR7-targeting antibody | Abilita Bio, Inc.
CCR7 | Decreased | Antagonist | Discovery | Anti-CCR7 monoclonal antibody | Pepscan Systems BV
CCR7 | Decreased | Antagonist | Discovery | Anti-CCR7 chimeric IgG1 antibodies | North Coast Biologics, LLC
CD86 | Increased | Agonist | Discovery | BR-02001 | Boryung Pharm Co Ltd

NA, not applicable.

PITPNM2 Infectious disease Memory loss Glioblastoma Cancer Cardiovascular disease NA NA Metastatic breastcancer Cancer Unidentified indication Autoimmune disease 1 7 potentially involved involved potentially Active indications ( o s r e t t e l 1 Supplementary Supplementary nline version of version nline 6 . These results results These . 1 5 ( n = 2,101), Table 2 ), ),  - - - -
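The comparison against random gene lists used for the pathway analysis can be illustrated with a short sketch. It is not the authors' code: the function name and the toy statistics below are hypothetical, and the only idea taken from the text is that an enrichment is retained when it is exceeded in fewer than 5% of the 1,000 random gene lists.

```python
from typing import Sequence


def exceedance_fraction(observed: float, random_stats: Sequence[float]) -> float:
    """Fraction of random gene lists whose enrichment statistic is at least
    as large as the statistic observed for the plausible target genes."""
    if not random_stats:
        raise ValueError("need at least one random gene list")
    hits = sum(1 for stat in random_stats if stat >= observed)
    return hits / len(random_stats)


# Hypothetical enrichment statistics for 1,000 random gene lists
# (toy values 0..999, not data from the study).
random_stats = list(range(1000))

# Hypothetical observed statistic for the plausible target genes.
fraction = exceedance_fraction(980, random_stats)

# Under the criterion described in the text, the pathway would be
# retained if the enrichment is exceeded in <5% of the random lists.
retained = fraction < 0.05
```

Counting ties as exceedances (`>=`) is a conservative choice made for this sketch; the source does not state how ties were handled.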

Note: Any Supplementary Information and Source Data files are available in the online version of the paper.

ACKNOWLEDGMENTS
This research was conducted using the UK Biobank resource under application number 10074. Detailed acknowledgments and funding details are provided for each contributing study in the Supplementary Note.

AUTHOR CONTRIBUTIONS
Data collection and analysis in the contributing studies: M.A.F., M.C.M., S.C.D., L.M.B., P.J.T., N.G.M., D.L.D. (AAGC study); J.M.V., G.H.K. (LifeLines study); H.B., M.H., E.R., A.F., N.N., H.S., S.K., C.G., K.S., S.W. (GENEVA study); I.M., F.R., J.E.-G., S.G., A.A., G.H., C.O.S., N.H., Y.-A.L. (GENUFAD studies); C.T., D.A.H. (23andMe study); J.D.H., J.S.W., R.B.M., E.J. (GERA study); Q.H., J.-J.H., G.W., D.I.B. (NTR study); A.T., V.U., Y.L., P.K.E.M., C.A., R.K. (CATSS, TWINGENE and SALTY studies); L.P. (ALSPAC study); B.M.B., L.G.F., M.E.G., J.B.N., W.Z., K.H., A.L., O.L.H., M.L., C.J.W., G.R.A. (HUNT study); L.P., M.A.F. (UK Biobank study). Methylation analysis: J.v.D., D.I.B., R.J. Biological and drug annotation: M.A.F., C.W.M., E.M., K.B., O.H., J.Z., J.A.R., J.B., B.B. Quality control, meta-analysis, tables and figures: M.A.F. Writing group: M.A.F., J.M.V., I.M., C.T., J.D.H., Q.H., B.M.B., J.B., S.C.D., S.W., P.K.E.M., R.J., E.J., Y.-A.L., D.I.B., C.A., G.H.K., R.K., L.P. Study design and management: M.A.F., D.A.H., A.T., V.U., J.v.D., Y.L., J.E.-G., B.M.B., S.W., P.K.E.M., R.J., E.J., Y.-A.L., D.I.B., C.A., G.H.K., R.K., L.P.

COMPETING FINANCIAL INTERESTS
The authors declare no competing financial interests.

Reprints and permissions information is available online at http://www.nature.com/reprints/index.html. Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1. Pinart, M. et al. Comorbidity of eczema, rhinitis, and asthma in IgE-sensitised and non-IgE-sensitised children in MeDALL: a population-based cohort study. Lancet Respir. Med. 2, 131–140 (2014).
2. Thomsen, S.F. et al. Findings on the atopic triad from a Danish twin registry. Int. J. Tuberc. Lung Dis. 10, 1268–1272 (2006).
3. van Beijsterveldt, C.E. & Boomsma, D.I. Genetics of parentally reported asthma, eczema and rhinitis in 5-yr-old twins. Eur. Respir. J. 29, 516–521 (2007).
4. Loh, P.R. et al. Contrasting genetic architectures of schizophrenia and other complex diseases using fast variance-components analysis. Nat. Genet. 47, 1385–1392 (2015).
5. Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369–375, S1–S3 (2012).
6. Ferreira, M.A. Improving the power to detect risk variants for allergic disease by defining case–control status based on both asthma and hay fever. Twin Res. Hum. Genet. 17, 505–511 (2014).
7. Bulik-Sullivan, B.K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
8. GTEx Consortium. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science 348, 648–660 (2015).
9. Wells, A. et al. The anatomical distribution of genetic associations. Nucleic Acids Res. 43, 10804–10820 (2015).
10. Finucane, H.K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
11. Farh, K.K. et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature 518, 337–343 (2015).
12. Fehrmann, R.S. et al. Gene expression analysis identifies global gene dosage sensitivity in cancer. Nat. Genet. 47, 115–125 (2015).
13. Thomsen, S.F., Kyvik, K.O. & Backer, V. Etiological relationships in atopy: a review of twin studies. Twin Res. Hum. Genet. 11, 112–120 (2008).
14. Sanseau, P. et al. Use of genome-wide association studies for drug repositioning. Nat. Biotechnol. 30, 317–320 (2012).
15. Bonder, M.J. et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat. Genet. 49, 131–138 (2017).
16. Joehanes, R. et al. Epigenetic signatures of cigarette smoking. Circ Cardiovasc Genet 9, 436–447 (2016).
17. Lev, S. et al. Identification of a novel family of targets of PYK2 related to Drosophila retinal degeneration B (rdgB) protein. Mol. Cell. Biol. 19, 2278–2288 (1999).
18. Yan, S.R. & Novak, M.J. β2 integrin–dependent phosphorylation of protein-tyrosine kinase Pyk2 stimulated by tumor necrosis factor α and fMLP in human neutrophils adherent to fibrinogen. FEBS Lett. 451, 33–38 (1999).
19. Kamen, L.A., Schlessinger, J. & Lowell, C.A. Pyk2 is required for neutrophil degranulation and host defense responses to bacterial infection. J. Immunol. 186, 1656–1665 (2011).

1 Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia.
2 Epidemiology, University of Groningen, University Medical Center Groningen, Groningen Research Institute for Asthma and COPD, Groningen, the Netherlands.
3 Department of Dermatology, Allergology and Venereology, University Hospital Schleswig-Holstein, Campus Kiel, Kiel, Germany.
4 Max Delbrück Center (MDC) for Molecular Medicine, Berlin, Germany.
5 Clinic for Pediatric Allergy, Experimental and Clinical Research Center of Charité Universitätsmedizin Berlin and Max Delbrück Center, Berlin, Germany.
6 Research, 23andMe, Mountain View, California, USA.
7 Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, California, USA.
8 Department of Biological Psychology, Netherlands Twin Register, Vrije University, Amsterdam, the Netherlands.
9 Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.
10 MRC Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Bristol, UK.
11 K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, Trondheim, Norway.
12 Department of Thoracic Medicine, St. Olavs Hospital, Trondheim University Hospital, Trondheim, Norway.
13 Melbourne School of Population and Global Health, University of Melbourne, Melbourne, Victoria, Australia.
14 Institute of Clinical Molecular Biology, Christian Albrechts University of Kiel, Kiel, Germany.
15 Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, California, USA.
16 Department of Pathology, Stanford University School of Medicine, Stanford, California, USA.
17 Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, USA.
18 Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, USA.
19 A full list of members and affiliations appears in the Supplementary Note.
20 HUNT Research Centre, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, Trondheim, Norway.
21 Department of Dermatology, St. Olavs Hospital, Trondheim University Hospital, Trondheim, Norway.
22 Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, USA.
23 Clinic and Polyclinic of Dermatology, University Medicine Greifswald, Greifswald, Germany.
24 Department of Functional Genomics, Interfaculty Institute for Genetics and Functional Genomics, University Medicine and Ernst-Moritz-Arndt University Greifswald, Greifswald, Germany.
25 Institute for Community Medicine, Study of Health in Pomerania/KEF, University Medicine Greifswald, Greifswald, Germany.
26 Institute for Respiratory Health, Harry Perkins Institute of Medical Research, University of Western Australia, Nedlands, Western Australia, Australia.
27 Department of Dermatology and Allergology, University Hospital Bonn, Bonn, Germany.
28 Institute of Epidemiology I, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany.
29 Comprehensive Pneumology Center Munich (CPC-M), Member of the German Center for Lung Research, Munich, Germany.
30 Institute and Outpatient Clinic for Occupational, Social and Environmental Medicine, Ludwig-Maximilians-Universität, Munich, Germany.
31 Research Unit of Molecular Epidemiology and Institute of Epidemiology II, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany.
32 Institute of Genetic Epidemiology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany.
33 Division of Research, Kaiser Permanente Northern California, Oakland, California, USA.
34 Department of Psychiatry, VU University Medical Center, Amsterdam, the Netherlands.
35 Pediatric Allergy and Pulmonology Unit at Astrid Lindgren Children's Hospital, Karolinska University Hospital, Stockholm, Sweden.
36 University of Groningen, University Medical Center Groningen, Beatrix Children's Hospital, Pediatric Pulmonology and Pediatric Allergology, and University of Groningen, University Medical Center Groningen, Groningen Research Institute for Asthma and COPD, Groningen, the Netherlands.
37 Present address: GlaxoSmithKline, Stevenage, UK.
38 These authors contributed equally to this work.
39 These authors jointly supervised this work.
Correspondence should be addressed to M.A.F. ([email protected]).

ONLINE METHODS
doi:10.1038/ng.3985

Meta-analysis of GWAS results conducted in 13 studies (n = 360,838). In each of 13 participating studies (Supplementary Table 1), a GWAS was performed using an additive genetic model in individuals of European descent who reported suffering from asthma and/or hay fever and/or eczema (case group; total n = 180,129) against those who never reported suffering from any of these three conditions (control group; total n = 180,709). A detailed description of the procedures used to identify cases and controls, as well as for SNP genotyping, imputation and association testing, is provided for each study in the Supplementary Note.

Prior to the meta-analysis, standard quality control filters were applied to results from individual studies (Supplementary Note). After quality control, and restricting the analysis to SNPs present in at least the two largest studies (UK Biobank and 23andMe, Inc.; combined n = 256,623), results were available for 8,307,659 variants, most of which (89%) were available in >95% of the overall sample. Intercept estimates from LD score regression analysis (ref. 7), which reflect inflation of test statistics likely due to technical biases, ranged between 1.00 and 1.16 (Supplementary Table 1). Results from individual studies were adjusted for the observed inflation by multiplying the square of the standard error of each genetic effect estimate by the respective LD score regression intercept. We then used METAL (ref. 20) to combine association results across studies using an inverse-variance-weighted, fixed-effects meta-analysis. P values from the meta-analysis were further adjusted for the meta-analysis LD score regression intercept of 1.04. The genome-wide significance threshold was set at 3 × 10^−8, as suggested previously for GWAS analyzing variants with minor allele frequency (MAF) ≥ 1% (ref. 21).

Identification of independent associations through approximate conditional analyses. For each chromosome, we identified all SNPs with P ≤ 3 × 10^−8, sorted these on the basis of base-pair position and then grouped variants into the same locus if the distance between consecutive variants was <1 Mb. Variants located >1 Mb from the previous genome-wide significant variant were assigned to a new locus. Next, for each of these loci, we identified statistically independent associations using approximate conditional analyses, as implemented in GCTA (ref. 5). We refer to these as sentinel risk variants. In these analyses, LD calculations were based on a subset of 5,000 individuals from the UK Biobank study. Briefly, for each locus, we (i) identified the most significantly associated SNP (i); (ii) adjusted the summary statistics of all SNPs in that locus by the effect of the top SNP; (iii) identified the most significantly associated SNP (j) that remained genome-wide significant in the locus; and (iv) adjusted the summary statistics of all SNPs in the locus by the effects of SNPs i and j. We repeated this process until there were no SNPs associated with allergic disease at P ≤ 3 × 10^−8 after adjusting for the effect of other, more strongly associated variants in the locus. Lastly, we estimated the LD between sentinel variants located in different risk loci (that is, >1 Mb apart) and confirmed that r² was always close to 0 (no pairs of sentinel variants had r² > 0.02).

Determining the novelty status of independent SNP associations with allergic disease. Previous GWAS identified 185 SNPs associated with the risk of various allergic conditions, which we grouped into 89 independent associations on the basis of the LD between variants (Supplementary Tables). We used this information to classify each of our independent SNP associations into one of two major groups: known (<1 Mb from any of the 185 previously reported associations; 'KnownLocus') and new (>1 Mb from the previously reported variants; 'NewLocus') allergy risk loci. For the first group, we then estimated the LD between each sentinel variant identified in our study and all variant(s) reported in previous GWAS. If all reported variants had r² < 0.05 with our sentinel variant, then our association was considered to represent a new risk variant in a known risk locus ('KnownLocus–NewVariant'). Alternatively, when at least one reported variant had r² ≥ 0.05, our association was considered to be a known risk variant in a known locus ('KnownLocus–KnownVariant'). The second major group was composed of variants located in new allergy risk loci. Within this group, we used the same approach to determine whether our associations were new when considering any disease or trait with genome-wide significant associations reported in the NHGRI-EBI GWAS catalog.
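Two of the steps described in these Methods sections lend themselves to a small worked sketch: the inflation-adjusted, inverse-variance-weighted fixed-effects meta-analysis and the grouping of significant variants into loci by base-pair distance. The code below is an illustration under stated assumptions, not the pipeline used in the study (which relied on METAL and GCTA); all function names and the example numbers are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def adjust_se(se: float, ldsc_intercept: float) -> float:
    """Inflate a standard error by multiplying its square by the study's
    LD score regression intercept, as described in the text."""
    return math.sqrt(se * se * ldsc_intercept)


def ivw_fixed_effects(betas: Sequence[float],
                      ses: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance-weighted fixed-effects estimate and its SE."""
    weights = [1.0 / (s * s) for s in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def group_into_loci(positions: Sequence[int],
                    max_gap: int = 1_000_000) -> List[List[int]]:
    """Group significant variants on one chromosome into loci: a variant
    joins the current locus if it lies <max_gap bp from the previous one.
    Treating a gap of exactly 1 Mb as a new locus is an assumption."""
    loci: List[List[int]] = []
    for pos in sorted(positions):
        if loci and pos - loci[-1][-1] < max_gap:
            loci[-1].append(pos)
        else:
            loci.append([pos])
    return loci


# Hypothetical per-study effect estimates for a single SNP.
beta, se = ivw_fixed_effects([0.1, 0.3], [0.1, 0.1])

# Hypothetical SE adjusted with an intercept of 1.16 (the upper end of
# the range reported in the text).
se_adjusted = adjust_se(0.1, 1.16)

# Hypothetical base-pair positions of genome-wide significant variants.
loci = group_into_loci([5_000_000, 100, 1_600_000, 500_000])
```

In this toy input the four positions form three loci, because only the first two variants lie within 1 Mb of each other.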

Comparison of risk allele frequencies between individuals suffering from a single allergic disease. By combining information from asthma, hay fever and eczema in the case–control definition used in our GWAS, we expected our study design to improve power to identify risk variants shared between the three diseases but not specific to any of the diseases (ref. 6). To understand whether the associations discovered in our GWAS were indeed likely to represent risk factors shared across allergic diseases, we took advantage of the observation that not all affected individuals reported allergic comorbidities and compared allele frequencies between three groups of adults: asthma-only cases (n = 12,268), hay fever–only cases (n = 33,305) and eczema-only cases (n = 6,276). The studies that contributed to this analysis are indicated in Supplementary Table 1 and described in detail in the Supplementary Note. We performed three sets of association analyses contrasting three non-overlapping groups of individuals: asthma only (g1) versus hay fever only (g2); asthma only (g1) versus eczema only (g3); and hay fever only (g2) versus eczema only (g3). These analyses are statistically independent from the case–control analysis carried out as part of the GWAS, which facilitates the interpretation of the results. For a given sentinel SNP, results from these analyses indicate whether the risk allele is more (OR > 1) or less (OR < 1) common in, for example, group 1 (g1) when compared to group 2 (g2). For example, if a SNP contributed similarly to the risks of asthma and hay fever but not to that of eczema, then one would expect an OR of ~1 in the asthma versus hay fever comparison but an OR of >1 in the asthma versus eczema and hay fever versus eczema analyses. The significance threshold for these analyses was set at 1.2 × 10^−4, which corresponds to a Bonferroni correction for 136 SNPs and the three sets of analyses performed (P < 0.05/(136 × 3)).

Association between sentinel risk variants and variation in allergy age of onset. There is considerable variation in the age at which allergic diseases are first reported, which has been shown to be influenced by genetic risk factors (refs. 22,23). We therefore studied the association between the sentinel variants identified in our GWAS and age of onset observed in the UK Biobank study (n = 35,972). For each individual, we first considered the earliest age of any allergic disease (asthma or hay fever/eczema; the latter two were covered by the same question and so could not be differentiated) being reported. SNPs were tested for association with this phenotype, with sex and a SNP array variable included as covariates. The significance threshold used for this analysis was 3.6 × 10^−4 (P < 0.05/136). Because significant SNP associations with this broad age-of-onset phenotype could be driven by different risk allele frequencies among cases suffering from different individual conditions (for example, an FLG variant might be associated with earliest age of onset because it is more prevalent in cases with eczema, which tends to precede the development of asthma and hay fever), we repeated the analysis by considering individuals who had reported suffering only from a single disease: asthma-only (n = 7,445), hay fever–only (n = 4,232) and eczema-only (n = 1,225) cases. For a given SNP, differences in effect size (β) between groups were quantified using the formula $z = \sigma / SE_{\sigma}$, where $\sigma = \beta_{\text{group A}} - \beta_{\text{group B}}$ and $SE_{\sigma} = \sqrt{SE^{2}_{\beta\,\text{group A}} + SE^{2}_{\beta\,\text{group B}}}$, which follows a normal distribution.

Estimating the contribution of the sentinel variants to the heritability of asthma, hay fever and eczema. Five steps were involved. First, we performed a GWAS of the individual diseases in the HUNT study, which was not included in the discovery meta-analysis. The HUNT study is described in greater detail in the Supplementary Note. Briefly, on the basis of self-reported information from a questionnaire, we identified 1,875 cases and 16,463 controls for the asthma GWAS; 6,939 cases and 12,844 controls for the hay fever GWAS; and 2,630 cases and 16,131 controls for the eczema GWAS. After quality control filters, we analyzed 7.6 million common variants (genotyped and imputed) for association with each individual phenotype. The genomic inflation factor (λ) for these analyses was 1.049 for asthma, 1.078 for hay fever and 1.041 for eczema. Second, for each of the three diseases, we quantified the overall SNP-based heritabilities with LD score regression (ref. 7) using a subset of 1.2 million HapMap SNPs. To obtain a heritability estimate on the liability scale, we set the population prevalence to be the same as the sample prevalence, given that this was a population-based study. Third, we removed the 136 sentinel variants (and all variants correlated at r² > 0.05) from the individual disease GWAS results. Fourth, we re-estimated SNP-based heritabilities as described

for step two, but now using the GWAS results without the 136 top associations. In the fifth and final step, the contribution of the 136 sentinel variants toward the heritability of each disease was calculated as the difference between the SNP-based heritability estimated in steps two (all SNPs) and four (without the 136 top associations).

Identification of plausible target genes of sentinel risk variants. Two independent strategies were used to identify plausible target genes underlying the observed associations. By 'target gene', we mean a gene for which the protein sequence and/or variation in transcription is associated with a sentinel risk variant or one of its proxies (r² > 0.8).

First, we used wANNOVAR to identify genes containing nonsynonymous SNPs among all variants in LD (r² > 0.8) with any sentinel risk variant. SNPs in LD with sentinel risk variants were identified using genotype data from individuals of European descent from the 1000 Genomes Project (n = 294; release 20130502_v5a).

Second, to identify genes with transcription levels associated with a sentinel risk variant or one of its proxies (r² > 0.8), we queried publicly available results from 39 published eQTL studies conducted in 19 tissues or cell types relevant to allergic disease (Supplementary Table 13). We used a conservative significance threshold to identify significant SNP–gene expression associations, specifically P < 2.3 × 10^−9 for cis effects (<1 Mb). We selected this threshold on the basis of a Bonferroni correction considering the total number of protein-coding genes (G) and the number of SNPs likely to have been tested per gene (M): P < 0.05/(G × M). G was set at 21,742, on the basis of the GeneCards database, queried on 19 October 2016. We approximated M to be 1,000, as indicated by others (refs. 29–31), and so the threshold became P = 0.05/(21,472 genes × 1,000 SNPs per gene) = 2.3 × 10^−9. We did not use information from trans-eQTLs to identify plausible target genes of sentinel risk variants because these are thought to often involve indirect effects (ref. 32) (for example, where the sentinel SNP influences the expression of a transcript in cis, which in turn affects the expression of many other genes in trans).

For each eQTL study, and within each study for each tissue, we created a list of SNPs associated with gene expression in cis at P < 2.3 × 10^−9. Then, for each gene in that study–tissue data set, we used the --clump procedure in PLINK to reduce the list of expression-associated SNPs (which often included many correlated SNPs) to a set of 'sentinel eQTLs', defined as the SNPs with the strongest association with gene expression and in low LD (r² < 0.05, LD window of 2 Mb) with each other. This procedure was repeated for each of the 94 study–tissue data sets listed in Supplementary Table 13. Finally, we identified a gene as a likely target of a sentinel allergy risk variant if any sentinel eQTL in any of the 94 study–tissue data sets had r² > 0.8 with the sentinel risk variant. That is, we only considered genes for which there was strong LD between a sentinel variant and a sentinel eQTL, which reduces the chance of spurious colocalization. We did not use statistical approaches developed to distinguish colocalization from shared genetic effects because these have very limited resolution at high LD levels (r² > 0.8).

To help prioritize plausible target genes for functional validation in subsequent studies, we identified genes for which publicly available functional data supported not just the presence of chromatin interactions between an enhancer and a gene promoter (based on 5C, ChIA–PET, promoter capture Hi-C or in situ Hi-C data), but also an association between variation in enhancer epigenetic marks and variation in gene transcription levels (based on PreSTIGE, IM–PET or FANTOM5 analyses, or on H3K27ac enhancer and super-enhancer annotations). We considered data from immune cell types, lung and skin (Supplementary Table 16) and putative enhancers that overlapped a sentinel risk variant (or one of its proxies in high LD, r² > 0.95).

Genes that were unlikely to have been previously implicated in the pathophysiology of allergic disease were identified using the procedure described in the Supplementary Note.

Enrichment in tissue-specific gene expression. We used the TSEA approach (ref. 9) to identify tissues that were likely to be functionally affected by the biological effects of the sentinel risk variants. We implemented this approach locally using custom scripts.

random from the available genes in that locus. This procedure plausible was repeated for of number actual the ( genes target than greater or same the was selection for available genes of number the which for loci risk allergy non-MHC of subset locus risk allergy non-MHC each for genes, random of list To each generate variant). sentinel a ( to respect loci with risk allergy from drawn randomly were genes First, strategies. an with pSI and the value 17,783 available not in the MHC three using region, We at from genes genes. random target of plausible the selected instead genes random 112 containing each lists, gene arbitrary 1,000 generated we sibility, To not pos targets. this as address plausible and identified of genes list to the variants specific risk sentinel near genes of feature general a be could sion identify to used variants. sentinel of were genes target plausible that studies eQTL same the in eQTLs have to found gene that to list were genes of a 12,804 subset the background restricting after analysis the we repeated eQTLs, for with genes was enriched targets plausible of list the because arise could enrichment that a significant out possibility the Torule (one-sided). test exact Fisher’s used we tissue, given a in expression To test whether the plausible target genes were enriched for genes with specific for genes available 112 target analysis. genes and plausible background 17,671 were there MHC and region, in a the pSI value without genes excluding After genome. the in genes the of rest the to compared when genes target plausible h from downloaded TableS3_NAR_Dougherty_Tissue_gene_pSI_v3-1.txt, file threshold index specificity a on (based expression H3K27ac) measured by the ENCODE project are used to define regulatory regulatory define to used are project ENCODE the by and measured H3K9ac H3K27ac) H3K4me3, (H3K4me1, marks histone four to up Specifically, risk. 
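We also tested whether a significant enrichment in tissue-specific expression could be a general feature of genes near sentinel risk variants and not specific to the list of plausible targets identified. To address this possibility, we generated 1,000 arbitrary gene lists, each containing 112 random genes instead of the plausible target genes. We selected random genes from the 17,783 genes not in the MHC region and with an available pSI value, using three strategies. First, genes were randomly drawn from allergy risk loci (±1 Mb with respect to a sentinel variant). To generate each list of random genes, for each non-MHC allergy risk locus L, we randomly selected a locus R from the subset of non-MHC allergy risk loci for which the number of genes available for selection was the same or greater than the actual number of plausible target genes (T) selected for locus L. Then, for locus R, we selected T genes at random from the available genes in that locus. This procedure was repeated for all non-MHC allergy risk loci, ensuring that the same locus was not selected twice in a given random data set.

In the second strategy, genes were randomly drawn from 2-Mb loci selected at random from the genome. In this case, to generate each list of random genes, we first partitioned the autosomes (excluding the MHC region) into 1,430 consecutive 2-Mb loci and counted how many genes with an available pSI value were present in each of these loci. Then, for each non-MHC allergy risk locus L, we randomly selected a locus R from the subset of 2-Mb loci for which the number of genes available for selection satisfied the following criteria: (i) the number was the same or greater than the actual number of plausible target genes (T) selected for locus L and (ii) the number matched (within 10%) the number of genes available for selection for locus L. This was important to ensure that the randomly selected locus R was comparable to the allergy risk locus L in terms of the number of genes available for selection. Then, for locus R, we selected T genes at random from the available genes in that locus.

In the third and final strategy, we simply selected genes at random from all 17,783 non-MHC genes with an available pSI value, ignoring where the genes were located in the genome. As a result, for a given random list, the selected genes could only be in close proximity to other genes in that list by chance alone.

The same approach used to test enrichment in tissue-specific expression for the plausible target genes was then used to analyze each of the 1,000 lists of random genes. For each of these lists, the smallest P value observed across all 25 tissues tested was retained (Pmin). The proportion of random gene lists (out of 1,000) with a Pmin value that was the same or lower than the enrichment P value observed with the plausible target genes (Pobs) was then calculated. This corresponds to the probability of exceeding that enrichment when analyzing the random gene lists, after correcting for the 25 tissues tested. As we did for the analysis of the plausible target genes, we repeated the generation and analysis of random gene lists after restricting the genes available for selection (and the background gene list) to the subset of genes with a known eQTL.
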
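The empirical P value built from Pmin and Pobs can be sketched as follows. This is a simplified illustration under stated assumptions: gene identifiers are placeholders, only the third sampling strategy (location ignored) is shown, and the function names are hypothetical rather than taken from the study.

```python
import random


def sample_random_list(available_genes, n_genes, rng):
    """Third strategy: draw n_genes without replacement, ignoring location."""
    return rng.sample(available_genes, n_genes)


def smallest_p(p_values_by_tissue):
    """Pmin: the smallest enrichment P value across the tissues tested."""
    return min(p_values_by_tissue)


def empirical_p(p_obs, p_min_values):
    """Proportion of random lists whose Pmin is the same as or lower than Pobs."""
    hits = sum(1 for p_min in p_min_values if p_min <= p_obs)
    return hits / len(p_min_values)


rng = random.Random(1)  # fixed seed so the sketch is reproducible
genes = [f"gene{i}" for i in range(17783)]  # placeholder identifiers
one_list = sample_random_list(genes, 112, rng)

# Hypothetical Pmin values from four random lists (the study used 1,000).
example = empirical_p(0.01, [0.5, 0.01, 0.001, 0.2])
```

Because each random list contributes only its smallest P value across tissues, the resulting proportion already accounts for testing 25 tissues, which is the point made in the text.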
Enrichment in tissue-specific SNP heritability. Finucane et al.10 developed an approach to identify tissues likely affected by the functional effects of disease risk variants, called stratified LD score regression. This approach quantifies the contribution of SNPs located in tissue-specific regulatory annotations to the overall disease heritability. As such, it does not require the identification of likely target genes of allergy risk variants and considers all SNPs in the genome, not just those with a genome-wide significant association with disease risk. Specifically, up to four histone marks (H3K4me1, H3K4me3, H3K9ac and H3K27ac) measured by the ENCODE project are used to define regulatory annotations (for example, enhancers) in 100 different cell types. SNPs that overlap these regulatory annotations are then identified and their contribution as a group to the disease heritability is quantified. As recommended by Finucane et al.10, we ranked cell types on the basis of the P value of the regression coefficient, rather than the P value of total enrichment. To ensure that significant SNP heritability enrichments were not explained by the effects of sentinel variants, we removed the top SNPs (and any variants with r2 > 0.05 with these) from the GWAS meta-analysis results and repeated the LD score regression analysis.

Enrichment of biological processes. To identify biological processes enriched among the non-MHC target genes, we used GeneNetwork12. With this approach, gene sets originally included in a given GO biological process (BP) were expanded to include other genes on the basis of a 'guilt-by-association' procedure. After excluding BPs with <10 or >500 genes, 3,770 BPs were available for analysis. For each BP, we tested its enrichment among the list of plausible target genes as follows. First, we downloaded a gene set file containing z scores for each of 19,976 unique genes in the genome from http://129.125.135.180:8080/GeneNetwork/resources/ontology?ontology=GO_BP&term=[pathway], where "pathway" was replaced with the actual name of the BP being tested (for example, GO:0000002). The z score for gene X in that file reflects the probability that gene X is part of that BP. Second, we compared the distribution of z scores between the list of plausible target genes (107 non-MHC genes were in the GeneNetwork gene set files and so were available for analysis) and a background gene list of 18,193 genes (obtained after excluding MHC genes, the 107 plausible target genes and genes not listed in GENCODE release 19), using a one-sided Wilcoxon rank-sum test. The P value from this test represents the probability that genes in that BP are enriched among the list of plausible target genes, when compared to the background gene list.

As for the enrichment analysis of tissue-specific expression, we estimated how often a BP enrichment observed with the list of plausible target genes would be expected had we sampled genes at random from the allergy risk loci or from random loci. This analysis addresses the possibility that an observed enrichment might not be a specific feature of the plausible target genes identified but instead a general feature of genes located near sentinel allergy risk variants or simply in close proximity to each other. We used the same three strategies described above to generate 1,000 lists of random genes, sampling from the 18,300 non-MHC genes with an available z score and in GENCODE release 19. To determine whether using eQTL information to identify plausible target genes could have biased the enrichment analysis, we generated and analyzed random gene lists after restricting the genes available for selection to the subset with known eQTLs (12,913) but found very similar results (data not shown).

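The comparison of z-score distributions with a one-sided Wilcoxon rank-sum test can be sketched as below. The text does not say which software or which variant of the test was used, so this sketch assumes a normal approximation with average ranks for ties and no tie or continuity correction; the function name and data are hypothetical.

```python
from math import erfc, sqrt


def rank_sum_greater(x, y):
    """One-sided rank-sum test that values in x tend to exceed values in y.

    Returns (U statistic for x, approximate one-sided P value).
    Assumption: normal approximation without tie or continuity correction.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(combined)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j for tied values
        rank_sum_x += average_rank * sum(
            1 for k in range(i, j) if combined[k][1] == 0
        )
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p = 0.5 * erfc(z / sqrt(2))  # upper-tail probability of a standard normal
    return u, p


# Hypothetical z scores for target genes and background genes in one BP.
u_stat, p_value = rank_sum_greater([3.0, 4.0, 5.0], [0.0, 1.0, 2.0])
```

With gene lists of the size used here (107 versus 18,193 genes) the normal approximation is a reasonable choice, although an exact or tie-corrected implementation from a statistics package would be preferable for a real analysis.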
Common traits and diseases associated with allergic disease risk variants. We first identified all variants in LD (r2 > 0.8) with a sentinel risk variant (using data from Europeans in the 1000 Genomes Project; n = 294; release 20130502_v5a) and extracted any associations with these reported in the NHGRI-EBI GWAS catalog database42 (queried on 13 December 2016) or by Astle et al.43, a large GWAS of blood cell counts (n = 173,480). To complement this analysis, we estimated the SNP-based genetic correlation between our GWAS and results reported for 229 common traits or diseases, using LD Hub44. In these analyses, results from our meta-analysis were not corrected for the LD score intercept, either at the study level or after the meta-analysis.

Identification of target genes considered as drug targets for human diseases. To identify genes encoding transcripts that are considered targets of drugs for clinical development, we queried the Thomson Reuters Cortellis Drug database between 7 and 15 November 2016, which included 63,417 drugs. The search was carried out individually for each gene. First, a search query was built based on the following format: HGNC approved gene name OR alias_1 OR … OR alias_N. Gene name aliases were obtained from the Bioconductor annotation package org.Hs.eg.db. For example, to find drugs that target IL6R, the search query used was: "CD126" OR "IL-6R-1" OR "IL6Q" OR "IL-6RA" OR "IL6RQ" OR "IL6R" OR "gp80" OR "IL6RA" OR "interleukin 6 receptor". Second, after running the search query, results were filtered on the basis of the ascribed 'target-based actions', keeping only entries that corresponded to the gene name or an alias. For example, of the 65 results obtained with the IL6R query above, only 20 mentioned a target-based action for IL6R or an alias. Third, drug results were downloaded and the gene and respective drug were allocated to one of three groups: (i) gene with at least one drug considered for the treatment of allergic diseases; (ii) gene considered for the treatment of immune-related conditions but not allergic diseases specifically; and (iii) gene considered for the treatment of other conditions.
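The query format and the filtering step for drug targets can be sketched as below. This is an assumption-laden illustration: the record layout (a `target_based_actions` field) is hypothetical and does not reflect the real Cortellis export format, and the substring match is a simplification of whatever manual or automated filtering the authors applied.

```python
def build_query(approved_name, aliases):
    """Join the approved gene name and its aliases as quoted terms with OR."""
    terms = []
    for term in [approved_name, *aliases]:
        if term not in terms:  # drop duplicates while preserving order
            terms.append(term)
    return " OR ".join(f'"{term}"' for term in terms)


def keep_target_based(results, approved_name, aliases):
    """Keep entries whose target-based actions mention the gene or an alias.

    `results` is a list of dicts with a hypothetical 'target_based_actions'
    key holding a list of free-text action descriptions.
    """
    names = {name.lower() for name in [approved_name, *aliases]}
    kept = []
    for entry in results:
        actions = entry.get("target_based_actions", [])
        if any(name in action.lower() for action in actions for name in names):
            kept.append(entry)
    return kept


query = build_query("IL6R", ["CD126", "gp80", "IL6R"])

# Hypothetical search results for illustration.
hits = keep_target_based(
    [
        {"drug": "drug_a", "target_based_actions": ["IL6R antagonist"]},
        {"drug": "drug_b", "target_based_actions": ["TNF inhibitor"]},
    ],
    "IL6R",
    ["CD126", "gp80"],
)
```

Filtering on the ascribed action rather than on the raw search hit is what reduces the 65 IL6R results to the 20 that actually target the gene.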

Directional effect of the allergy-protective allele on target gene expression. In an attempt to predict whether existing drugs would be expected to attenuate or exacerbate allergic symptoms, we compared the directional effect on gene expression between the allergy-protective allele and the existing drug. We acknowledge that this is a simplistic comparison because it assumes that the effect of the protective allele on transcription levels is not dependent on tissue or context, which is true for most but not all expression-associated SNPs45–47, and extends to protein levels.

To determine whether the allergy-protective allele of a sentinel variant was associated with higher or lower target gene expression, we focused on the subset of target genes identified via an eQTL (see above). It was straightforward to assess when the sentinel SNP and the expression-associated SNP were the same variant: for example, if the allergy-protective allele had a negative effect (z score) on gene expression in the published eQTL study, then that allele was associated with lower gene expression. On the other hand, when the two SNPs did not correspond to the same variant but were in high LD (r2 > 0.8) with each other, we first determined which allele of the expression-associated SNP was on the same haplotype as the allergy risk allele. Then we used that allele to infer the direction of effect of the allergy risk allele on gene expression.

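The two cases for inferring direction of effect can be sketched as follows. The sketch assumes biallelic SNPs, so that switching the reference allele flips the sign of the eQTL z score; all function and parameter names are hypothetical.

```python
def expression_direction(z_score):
    """Translate the sign of an eQTL z score into a direction label."""
    if z_score < 0:
        return "lower"
    if z_score > 0:
        return "higher"
    return "none"


def protective_direction_same_variant(protective_allele_z):
    """Case 1: the sentinel SNP is itself the expression-associated SNP."""
    return expression_direction(protective_allele_z)


def protective_direction_via_proxy(allele_on_risk_haplotype, eqtl_effect_allele, eqtl_z):
    """Case 2: sentinel SNP and eQTL SNP differ but are in high LD.

    The eQTL allele that sits on the same haplotype as the allergy risk
    allele carries the direction of the risk allele; the protective allele
    has the opposite direction (assumes a biallelic eQTL SNP).
    """
    if eqtl_effect_allele == allele_on_risk_haplotype:
        risk_z = eqtl_z
    else:
        risk_z = -eqtl_z
    return expression_direction(-risk_z)


same = protective_direction_same_variant(-3.2)
proxy = protective_direction_via_proxy("A", "G", 2.5)
```

In the proxy example the reported effect allele (G) is not the one on the risk haplotype (A), so the risk allele is taken to decrease expression and the protective allele to increase it.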
Modulation of target gene methylation by environmental risk factors. We first tested whether variation in DNA CpG methylation was associated with variation in target gene expression, independently of SNP effects, using data from the Biobank-based Integrative Omics Study (BIOS) consortium that are described in detail elsewhere15,48. Methylation and expression levels in whole blood samples (n = 2,101) were quantified, respectively, with Illumina Infinium HumanMethylation450 BeadChip Kit arrays and RNA-seq (2 × 50-bp paired-end reads; HiSeq 2000; >15 million read pairs per sample). For each target gene, we identified CpG sites in cis (<250 kb from the gene) for which methylation levels were significantly associated with gene expression levels (false discovery rate < 5%), after adjusting the methylation levels for methylation-associated SNPs and expression levels for expression-associated SNPs. Such CpG sites, called cis-eQTMs, were identified in a previous study15 and downloaded from http://genenetwork.nl/biosqtlbrowser. For most genes, there were multiple cis-eQTMs, and so we selected the CpG site most strongly associated with variation in gene expression for downstream analyses.

Next, we tested the association between methylation levels at these sentinel CpGs with five established risk factors for allergic disease using data from unrelated individuals of the Netherlands Twin Register (NTR) study, which was included in the BIOS consortium15,48. The risk factors tested were current smoking (n = 1,221), maternal smoking (n = 637), body mass index (BMI; n = 1,214), birth weight (n = 1,015) and number of older siblings (n = 775). Information on BMI and current smoking was collected as part of the NTR biobank project49 at blood draw. Birth weight was obtained in multiple NTR surveys as described previously50. Maternal smoking during pregnancy was measured in NTR Survey 10 (data collection in 2013) with the following question: "Did your mother ever smoke during pregnancy?" with answer categories "no", "yes" and "I don't know". Information on number of older siblings was obtained through self-report in NTR surveys 2, 3 and 6. For twin pairs, answers were checked for consistency and missing data for one twin were supplemented with data from the co-twin where possible.

Linear or logistic regression was used to test the association between methylation (β value) and individual risk factors, with the following variables included as covariates: sex, age at blood sampling, methylation bisulfite plate and array row, and white blood cell percentages (% neutrophils, % monocytes and % eosinophils). The association with maternal smoking was tested while also adjusting for smoking status.

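A covariate-adjusted linear regression of the kind described above can be sketched with ordinary least squares solved through the normal equations. This is a bare-bones illustration that returns coefficients only (no standard errors or P values); the variable names and the toy data are hypothetical, and a statistics package would be used in a real analysis.

```python
def ols_coefficients(design, outcome):
    """Solve (X'X) b = X'y by Gauss-Jordan elimination with partial pivoting.

    `design` is a list of rows; each row starts with 1.0 for the intercept,
    followed by the risk factor and any covariates.
    """
    n, p = len(design), len(design[0])
    # Augmented matrix [X'X | X'y]
    aug = [
        [sum(design[r][i] * design[r][j] for r in range(n)) for j in range(p)]
        + [sum(design[r][i] * outcome[r] for r in range(n))]
        for i in range(p)
    ]
    for col in range(p):
        pivot_row = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [value / pivot for value in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][p] for i in range(p)]


# Toy data: outcome = 1 + 2 * risk_factor + 3 * covariate (no noise).
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
design = [[1.0, risk_factor, covariate] for risk_factor, covariate in rows]
outcome = [1.0 + 2.0 * rf + 3.0 * cov for rf, cov in rows]
intercept, risk_effect, covariate_effect = ols_coefficients(design, outcome)
```

In this layout the coefficient on the risk factor is its association with the outcome after adjusting for the covariates in the remaining columns, which mirrors the role of sex, age, plate, array row and cell percentages in the text.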
Data availability. Summary statistics of the meta-analysis without the 23andMe study are available for download at https://genepi.qimr.edu.au/staff/manuelf/gwas_results/main.html. The full GWAS summary statistics for the 23andMe discovery data set will be made available through 23andMe to qualified researchers under an agreement with 23andMe that protects the privacy of the 23andMe participants. Please contact David Hinds (dhinds@23andme.com) for more information and to apply to access the 23andMe data. A Life Sciences Reporting Summary is available for this paper.

20. Willer, C.J., Li, Y. & Abecasis, G.R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
21. Fadista, J., Manning, A.K., Florez, J.C. & Groop, L. The (in)famous GWAS P-value threshold revisited and updated for low-frequency variants. Eur. J. Hum. Genet. 24, 1202–1205 (2016).
22. Gough, H. et al. Allergic multimorbidity of asthma, rhinitis and eczema over 20 years in the German birth cohort MAS. Pediatr. Allergy Immunol. 26, 431–437 (2015).
23. Mortz, C.G., Andersen, K.E., Dellgren, C., Barington, T. & Bindslev-Jensen, C. Atopic dermatitis from adolescence to adulthood in the TOACS cohort: prevalence, persistence and comorbidities. Allergy 70, 836–845 (2015).
24. Sarnowski, C. et al. Identification of a new locus at 16q12 associated with time to asthma onset. J. Allergy Clin. Immunol. 138, 1071–1080 (2016).
25. Dharmage, S.C. et al. Atopic dermatitis and the atopic march revisited. Allergy 69, 17–27 (2014).
26. Chang, X. & Wang, K. wANNOVAR: annotating genetic variants for personal genomes via the web. J. Med. Genet. 49, 433–436 (2012).
27. 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).
28. Rebhan, M., Chalifa-Caspi, V., Prilusky, J. & Lancet, D. GeneCards: integrating information about genes, proteins and diseases. Trends Genet. 13, 163 (1997).
29. Davis, J.R. et al. An efficient multiple-testing adjustment for eQTL studies that accounts for linkage disequilibrium between variants. Am. J. Hum. Genet. 98, 216–224 (2016).
30. Montgomery, S.B. et al. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature 464, 773–777 (2010).
31. Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013).
32. Westra, H.J. et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet. 45, 1238–1243 (2013).
33. Chun, S. et al. Limited statistical evidence for shared genetic effects of eQTLs and autoimmune-disease-associated loci in three major immune-cell types. Nat. Genet. 49, 600–605 (2017).
34. Sanyal, A., Lajoie, B.R., Jain, G. & Dekker, J. The long-range interaction landscape of gene promoters. Nature 489, 109–113 (2012).
35. Mifsud, B. et al. Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C. Nat. Genet. 47, 598–606 (2015).
36. Li, G. et al. Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell 148, 84–98 (2012).
37. Rao, S.S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
38. Corradin, O. et al. Combinatorial effects of multiple enhancer variants in linkage disequilibrium dictate levels of gene expression to confer susceptibility to common traits. Genome Res. 24, 1–13 (2014).
39. Hnisz, D. et al. Super-enhancers in the control of cell identity and disease. Cell 155, 934–947 (2013).
40. He, B., Chen, C., Teng, L. & Tan, K. Global view of enhancer–promoter interactome in human cells. Proc. Natl. Acad. Sci. USA 111, E2191–E2199 (2014).
41. Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).
42. Welter, D. et al. The NHGRI GWAS Catalog, a curated resource of SNP–trait associations. Nucleic Acids Res. 42, D1001–D1006 (2014).
43. Astle, W.J. et al. The allelic landscape of human blood cell trait variation and links to common complex disease. Cell 167, 1415–1429.e19 (2016).
44. Zheng, J. et al. LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis. Bioinformatics 33, 272–279 (2017).
45. Fairfax, B.P. et al. Genetics of gene expression in primary immune cells identifies cell type–specific master regulators and roles of HLA alleles. Nat. Genet. 44, 502–510 (2012).
46. Fairfax, B.P. et al. Innate immune activity conditions the effect of regulatory variants upon monocyte gene expression. Science 343, 1246949 (2014).
47. Fu, J. et al. Unraveling the regulatory mechanisms underlying tissue-dependent genetic variation of gene expression. PLoS Genet. 8, e1002431 (2012).
48. Zhernakova, D.V. et al. Identification of context-dependent expression quantitative trait loci in whole blood. Nat. Genet. 49, 139–145 (2017).
49. Willemsen, G. et al. The Netherlands Twin Register biobank: a resource for genetic epidemiological studies. Twin Res. Hum. Genet. 13, 231–245 (2010).
50. Tsai, P.C. et al. DNA methylation changes in the IGF1R gene in birth weight discordant adult monozygotic twins. Twin Res. Hum. Genet. 18, 635–646 (2015).

nature research | life sciences reporting summary

Life Sciences Reporting Summary

Corresponding author(s): Manuel A R Ferreira

Nature Research wishes to improve the reproducibility of the work that we publish. This form is intended for publication with all accepted life science papers and provides structure for consistency and transparency in reporting. Every life science submission will use this form; some list items might not apply to an individual manuscript, but all fields must be completed for clarity. For further information on the points included in this form, see Reporting Life Sciences Research. For further information on Nature Research policies, including our data availability policy, see Authors & Referees and the Editorial Policy Checklist.

Experimental design

1. Sample size
Describe how sample size was determined.
Studies were invited to participate in the meta-analysis if: (1) genome-wide genotype data were available for >2,000 individuals (prior to final phenotype exclusions); and (2) information was available on asthma, hay fever and eczema status.

2. Data exclusions
Describe any data exclusions.
Pre-defined exclusion criteria were applied at the study level as part of standard GWAS quality control procedures. These included, for example, excluding samples and SNPs with high levels of missing data.

3. Replication
Describe whether the experimental findings were reliably reproduced.
No replication phase was included in this study.

4. Randomization
Describe how samples/organisms/participants were allocated into experimental groups.
Participants were allocated to the case or control groups based on available information for asthma, hay fever and eczema. Specifically, participants suffering from one or more allergic condition were considered as cases. Participants who had never suffered from any allergic condition were considered as controls.

5. Blinding
Describe whether the investigators were blinded to group allocation during data collection and/or analysis.
For most studies, investigators involved in blood collection and DNA genotyping were blinded to case-control status.

Note: all studies involving animals and/or human research participants must disclose whether blinding and randomization were used.

June 2017

6. Statistical parameters
For all figures and tables that use statistical methods, confirm that the following items are present in relevant figure legends (or in the Methods section if additional space is needed).

n/a Confirmed

- The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement (animals, litters, cultures, etc.)
- A description of how samples were collected, noting whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
- A statement indicating how many times each experiment was replicated
- The statistical test(s) used and whether they are one- or two-sided (note: only common tests should be described solely by name; more complex techniques should be described in the Methods section)
- A description of any assumptions or corrections, such as an adjustment for multiple comparisons
- The test results (e.g. P values) given as exact values whenever possible and with confidence intervals noted
- A clear description of statistics including central tendency (e.g. median, mean) and variation (e.g. standard deviation, interquartile range)
- Clearly defined error bars

See the web collection on statistics for biologists for further resources and guidance.

Software

Policy information about availability of computer code

7. Software
Describe the software used to analyze the data in this study.
R, PLINK, METAL, SNPTEST, BOLT-LMM, RAREMETALWORKER, GCTA, MACH2DAT, EPACTS, EIGENSTRAT

For manuscripts utilizing custom algorithms or software that are central to the paper but not yet described in the published literature, software must be made available to editors and reviewers upon request. We strongly encourage code deposition in a community repository (e.g. GitHub). Nature Methods guidance for providing algorithms and software for publication provides further information on this topic.

Materials and reagents

Policy information about availability of materials

8. Materials availability
Indicate whether there are restrictions on availability of unique materials or if these materials are only available for distribution by a for-profit company.
Summary statistics of the meta-analysis without the 23andMe study will be made publicly available at the time of publication. The full GWAS summary statistics for the 23andMe discovery data set will be made available through 23andMe to qualified researchers under an agreement with 23andMe that protects the privacy of the 23andMe participants. Please contact David Hinds (dhinds@23andme.com) for more information and to apply to access the 23andMe data.

9. Antibodies
Describe the antibodies used and how they were validated for use in the system under study (i.e. assay and species).
No antibodies were used

10. Eukaryotic cell lines
a. State the source of each eukaryotic cell line used.
No cell lines were used

b. Describe the method of cell line authentication used.
No cell lines were used

c. Report whether the cell lines were tested for mycoplasma contamination.
No cell lines were used

d. If any of the cell lines used are listed in the database of commonly misidentified cell lines maintained by ICLAC, provide a scientific rationale for their use.
No cell lines were used

Animals and human research participants

Policy information about studies involving animals; when reporting animal research, follow the ARRIVE guidelines

11. Description of research animals
Provide details on animals and/or animal-derived materials used in the study.
No animals were used

Policy information about studies involving human research participants

12. Description of human research participants
Describe the covariate-relevant population characteristics of the human research participants.
We studied 180,129 participants who reported having suffered from asthma and/or hay fever and/or eczema, and 180,709 participants who reported not suffering from any of these diseases. Mean age within each contributing study varied between 4 and 62, with females representing 39% to 68% of the sample size.
