Identifying functionally significant causal variants

Annotating sequence variation

[Diagram: evidence used in interpretation: segregation data, functional data, frequency in controls, animal models]

Shamil Sunyaev

Department of Biomedical Informatics Harvard Medical School Bioinformatics

Broad Institute of M.I.T. and Harvard

Map variants on genomic annotation

Watch for multiple transcripts!

Watch for conflicting annotations!

Nonsense variants

One of the most significant types of variants, usually leading to complete loss of function.

Nonsense variants are enriched in sequencing artifacts.

Important considerations: i) location along the gene, ii) does the variant cause NMD? iii) is the variant in a commonly skipped exon?
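The NMD consideration can be sketched with the widely used 50-nucleotide heuristic: a premature stop codon more than roughly 50 nt upstream of the last exon-exon junction is expected to trigger NMD, while one in the last exon or close to the last junction is expected to escape. The function name, the coordinate convention and the exact window are assumptions made for illustration; this is not LOFTEE's implementation.

```python
def predicts_nmd(exon_lengths, stop_pos, window=50):
    """Heuristic NMD call for a premature stop codon.

    exon_lengths: exon lengths (nt) of one transcript, in 5'->3' order.
    stop_pos: 0-based transcript coordinate of the premature stop codon.
    window: distance (nt) upstream of the last exon-exon junction within
            which a stop codon is assumed to escape NMD (assumed value).
    """
    if len(exon_lengths) < 2:
        # Single-exon transcript: no exon-exon junction, so no NMD call.
        return False
    last_junction = sum(exon_lengths[:-1])  # transcript position where the last exon starts
    return stop_pos < last_junction - window


exons = [100, 200, 150]           # hypothetical transcript; last junction at position 300
early = predicts_nmd(exons, 120)  # far upstream of the last junction
near = predicts_nmd(exons, 280)   # within 50 nt of the last junction
last = predicts_nmd(exons, 350)   # inside the last exon
print(early, near, last)
```

Because the answer depends on the exon structure, the same genomic variant can be called differently on different transcripts, which is one reason to watch for multiple transcripts.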

Tool: LOFTEE

Variants involved in splicing

Variants in canonical splice sites

Variants in exonic or intronic splicing enhancers

Gain-of-splicing variants

SpliceAI

Missense variants: computational predictions

Does the mutation fit the pattern of past evolution?

human VVSTADLCAPSSTKLDERA

dog FVSTSELCAGSTTRLEERA

fish FLSTSELCVPSTLKVNEKV

Statistical issues:
- sequences are related by phylogeny
- generally, we have too few sequences

Does the mutation fit the pattern of past evolution? Continuous time Markov model

[Figure: example amino acid states at one site: GLY, VAL, ALA, GLY, ALA]

• We assume a constant fitness landscape: what is good for fish is good for human!

• We can estimate whether the mutation fits the pattern of amino acid changes.

• We can also estimate the rate of evolution at the amino acid site.
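As a rough illustration of how a continuous time Markov model scores an amino acid change, the sketch below computes transition probabilities P(t) = exp(Qt) for a toy three-state model. The rate matrix, the three states and the helper names are hypothetical; real methods use 20-state matrices and a phylogeny.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(q, t, terms=20, squarings=10):
    """Approximate P(t) = exp(Q t) by a Taylor series with scaling and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


STATES = ["GLY", "ALA", "VAL"]
# Hypothetical rate matrix: off-diagonal entries are substitution rates,
# each row sums to zero.
Q = [[-0.3, 0.2, 0.1],
     [0.2, -0.5, 0.3],
     [0.1, 0.3, -0.4]]

p_short = transition_matrix(Q, 0.5)    # short evolutionary time
p_long = transition_matrix(Q, 100.0)   # long time: rows approach the stationary distribution
gly, val = STATES.index("GLY"), STATES.index("VAL")
print("P(GLY -> VAL | t=0.5) =", round(p_short[gly][val], 4))
```

A substitution with a low probability under the fitted model at the relevant evolutionary distance is one that does not fit the pattern of past evolution.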

Continuous time Markov model

P(t) – matrix of transition probabilities

$P(t) = e^{Qt}$

π – stationary distribution

$Q\pi = 0$

Structure view

• Most pathogenic mutations are important for stability (good news?).

• ΔΔG is difficult to estimate.

• Unfolded protein response pathway has to be taken into account.

• Heuristic structural parameters help, but less than comparative genomics.

PolyPhen2

www.genetics.bwh.harvard.edu/pph2 Adzhubei, et al. Nature Methods 2010

Weakly deleterious mutations

• Multiple independent lines of evidence suggest abundance of weakly deleterious alleles in humans

• Weakly deleterious variants may occur in highly conserved positions

• Weakly deleterious alleles probably contribute to complex phenotypes but not to simple Mendelian phenotypes

Constant evolutionary landscape

Every new mutation will eventually be either fixed or lost.

Conservation can be due to very weak selection

s – selection coefficient

Ne – effective population size. For humans estimated to be ~10,000.
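To see why very weak selection is enough to produce conservation, the sketch below evaluates the substitution rate relative to neutrality, K/K0, as a function of the selection coefficient. It assumes Kimura's diffusion approximation for the fixation probability with genic selection, which gives K/K0 = S / (e^S - 1) with S = 4·Ne·s for a deleterious mutation of size s; this formula and its parameterization are assumptions added here, not taken from the slide.

```python
import math


def relative_substitution_rate(s, ne=10_000):
    """K/K0 for a deleterious mutation with selection coefficient of size s >= 0.

    Assumes Kimura's diffusion approximation with genic selection,
    so that K/K0 = S / (exp(S) - 1) with S = 4 * Ne * s.
    """
    big_s = 4.0 * ne * s
    if big_s == 0.0:
        return 1.0  # neutral limit
    return big_s / math.expm1(big_s)


for s in (1e-6, 1e-5, 1e-4, 1e-3):
    print(f"s = {s:.0e}  K/K0 = {relative_substitution_rate(s):.3g}")
```

Under these assumptions and Ne of about 10,000, s = 10^-6 behaves almost neutrally, while s = 10^-3 already gives essentially complete conservation.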

[Figure: relative substitution rate K/K0 (0.0 to 1.0) versus selection coefficient s (10^-6 to 10^-3), ranging from neutral behavior to complete conservation]

Epistatic interactions

Compensatory mutations

[Diagram: pairs of interacting alleles]

Ridges on the fitness landscape

The phenomenon of compensatory mutations in different fields

• Biochemistry – protein stability, allosteric effects

• Genetics – incomplete penetrance

• Evolutionary biology – speciation, epistatic models of evolution

Dobzhansky-Muller incompatibility

Looking at vertebrate species

Many human disease mutations are found in vertebrates

[Venn diagram of three variant sets]
HumVar "Disease" (22,207 variants)
ClinVar "Pathogenic" (10,596 variants)
Found in MultIZ 100-Way alignment (24,307,128 variants)
Region counts: HumVar only 16,544; ClinVar only 6,503; HumVar and ClinVar 3,250; all three sets 313; HumVar and MultIZ 2,100; ClinVar and MultIZ 530; MultIZ only 24,304,185

5.5-6.5% of presumably pathogenic human mutations are detected in mammals

How complex can genetic suppression be?

LETTER doi:10.1038/nature12678
Genetic incompatibilities are widespread within species
Russell B. Corbett-Detig, Jun Zhou, Andrew G. Clark, Daniel L. Hartl & Julien F. Ayroles

Numerous Dobzhansky-Muller incompatibilities in fly population

How complex can genetic suppression be?

Model: accumulation of neutral variants

[Diagram: neutral changes (x) accumulating over time around the variant of interest: 1 change, 2 changes, 3 changes, 4 changes, ..., n changes; label: Stability]

Disease variants do not fit the Poisson expectation

[Figure: exponential fit]

Model: accumulation of disease variants

[Diagram: a disease variant and suppressors within a target of size m; changes (x) accumulate over time: 1 change, 2 changes, 3 changes, 4 changes, ..., n changes]

k out of m required for suppression

Model: Erlang distribution

$L(k, t, \lambda) = \dfrac{\lambda^{k} t^{k-1} e^{-\lambda t}}{(k-1)!}$
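The Erlang density used in this model, L(k, t, λ) = λ^k t^(k-1) e^(-λt) / (k-1)!, is the distribution of the waiting time until the k-th event of a Poisson process with rate λ. A minimal sketch follows; interpreting λ as the rate at which suppressor changes arise is an assumption made for illustration, and the numbers are hypothetical.

```python
import math


def erlang_pdf(k, t, lam):
    """Erlang density: waiting time t until the k-th event at rate lam."""
    return lam ** k * t ** (k - 1) * math.exp(-lam * t) / math.factorial(k - 1)


def erlang_cdf(k, t, lam):
    """Probability that at least k events have occurred by time t."""
    poisson_below_k = sum(
        math.exp(-lam * t) * (lam * t) ** n / math.factorial(n) for n in range(k)
    )
    return 1.0 - poisson_below_k


# With k = 1 the model reduces to the exponential (single-change) case.
one_change = erlang_cdf(1, 3.0, 1.0)
# Requiring k = 2 changes delays suppression relative to k = 1.
two_changes = erlang_cdf(2, 3.0, 1.0)
print(round(one_change, 4), round(two_changes, 4))
```

Comparing fits for different k is one way to ask how many compensatory changes are required before a disease variant can be tolerated.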

Model: fit for target size and number of compensatory changes

1 suppressor required
2 suppressors available

Evolutionary model of cis-complementation

[Diagram, panels C and D: phylogenies of human, macaque, mouse and rabbit. Labels: compensation is lost, "pathogenic" mutation is lost; compensation is fixed, "pathogenic" mutation is fixed.]

Model Summary

• ~5-6% of human disease mutations have potential suppressors (i.e. are present in another mammalian species)
• In most cases, one large-effect suppressor is sufficient, out of only 1-2 available
• These values allow a simple experiment to identify suppressors

Predicting and testing suppressors

Zebrafish model

• Model of Bardet-Biedl Syndrome (obesity, renal failure, vision loss)
• Caused by defects in primary cilium
• Embryonic convergence / extension phenotype in zebrafish
• Easily scorable phenotype

Experiment interpretation

[Diagram: injection conditions (no injection, knockdown, rescue with human gene, human gene with disease mutant) compared with the normal phenotype. Class I: double mutant (no suppression). Class II: double mutant (full suppression).]

Images: Phoebe Liu

[Figure: % of embryos scored Normal, Affected or Severely Affected for Bardet-Biedl syndrome double mutants, with WT, + ctrl, - ctrl and Mut controls; labels include RPGRIP1L R937L, Phe193Leu and BBS4 N165H; * p < 0.05, ** p < 0.001. Figure: Stephan Frangakis]

Out of 32 candidates, 1 complete rescue

Out of 9 candidates (7 shown here), 1 complete rescue

[Figure: % of embryos scored Normal, Affected or Severely Affected for Bardet-Biedl syndrome double mutants; label: BBS4 N165H; *** p < 0.0001. Figure: Stephan Frangakis]

A newly identified gene

Clinical features
• Global developmental delay
• microcephaly
• feeding issues
• failure to thrive
• abnormal muscle tone
• low immunoglobulins
• frequent respiratory infections

Clinical testing
• microarray: normal female
• metabolic testing
• extensive genetic testing: negative

De novo: NOS2, BTG2
Compound het: LAMA1, TTN

Erica Davis

BTG2 is the disease driver

The mutation is a reversal to the mammalian ancestral state

BTG2 alignment at positions R80, L128, Q140, V141, L142 (• = same as human)
H. sapiens RLQVL
P. troglodytes •••••
G. gorilla •••••
M. musculus KV•MM
R. norvegicus KV•MM
H. glaber •V•MM
S. domesticus KV•MM
B. primigenius KV•MM
E. ferus caballus KV•MM
F. catus KV•MM
C. lupus familiaris KV•MM
D. novemcinctus KV•MM
G. gallus KP•MM

Microcephaly

↓ Neuronal expr.

↓ cell proliferation

BTG2 has two compensatory mutations

[Figure: number of cells/embryo for injection alone and rescue conditions; *P<0.01 vs V141M rescue. Figure: Stephan Frangakis]

Methods

PolyPhen2

SIFT

LRT

MutationTaster

FatHMM

SNPs3D

DeepSequence

Umbrella methods

Condel

REVEL

CADD

M-CAP

Incorporating regional constraint

CCR

M-CAP

PrimateAI

Regulatory variants

• Regulation: variants in promoters, enhancers, silencers, insulators

Non-coding variants

Epigenomics

Chromatin accessibility

Chromatin modification

Why do we think that non-coding variation is of importance?

Regions and individual nucleotides conserved along phylogeny show signals of purifying selection in humans

Epigenetic studies report many well-localized regulatory marks

GWAS signals are predominantly located in non-coding regions

Non-disease alleles of large effect

Eye color

Lactose tolerance

Ultraconserved elements

PLoS BIOLOGY Deletion of Ultraconserved Elements Yields Viable Mice

Nadav Ahituv, Yiwen Zhu, Axel Visel, Amy Holt, Veena Afzal, Len A. Pennacchio, Edward M. Rubin. 1 Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America, 2 United States Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America

Ultraconserved elements have been suggested to retain extended perfect sequence identity between the human, mouse, and rat genomes due to essential functional properties. To investigate the necessities of these elements in vivo, we removed four noncoding ultraconserved elements (ranging in length from 222 to 731 base pairs) from the mouse genome. To maximize the likelihood of observing a phenotype, we chose to delete elements that function as enhancers in a mouse transgenic assay and that are near genes that exhibit marked phenotypes both when completely inactivated in the mouse and when their expression is altered due to other genomic modifications. Remarkably, all four resulting lines of mice lacking these ultraconserved elements were viable and fertile, and failed to reveal any critical abnormalities when assayed for a variety of phenotypes including growth, longevity, pathology, and metabolism. In addition, more targeted screens, informed by the abnormalities observed in mice in which genes in proximity to the investigated elements had been altered, also failed to reveal notable abnormalities. These results, while not inclusive of all the possible phenotypic impact of the deleted sequences, indicate that extreme sequence constraint does not necessarily reflect crucial functions required for viability.

Enrichment of GWAS signals in putative regulatory elements

Partitioning heritability

GWAS SNPs co-localize with eQTLs

Trait-Associated SNPs Are More Likely to Be eQTLs: Annotation to Enhance Discovery from GWAS
Dan L. Nicolae, Eric Gamazon, Wei Zhang, Shiwei Duan, M. Eileen Dolan, Nancy J. Cox

The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans
The GTEx Consortium

[Figure: Q-Q plot of observed versus expected -log10(p); labels: whole blood, Crohn's disease]

Translating GWAS findings into mechanistic models

Steps: GWAS peak; Endophenotype; Controlled model system; Biochemistry

Human Genetics All the Way

Steps: GWAS peak; Endophenotype; Gene expression (eQTL); Molecular phenotype (molecular_QTL)

Causality

Mediation: G → E → P
Independent effects: G → E, G → P
Reverse causation: G → P → E

Co-localization

Same causal variant: G → E, G → P
Distinct variants: G1 → E, G2 → P, with G1 and G2 in LD

Methods

Coloc

eCAVIAR

JLIM

PAINTOR's fine mapping model

• If a SNP is causal, then r² should predict association of other SNPs in the area.

• Correlations between test statistics Z are approximated by MVN given local pairwise LD structure.

[Figure: test Z score versus LD with causal SNP (r)]

Parameters
λ: standardized effect size
Z: association statistic
C: indicator of causality
m: SNP considered

$Z \mid \lambda, C \sim \mathcal{N}\left(\Sigma(\lambda \circ C),\ \Sigma\right)$, where $\Sigma$ is the local pairwise LD matrix

$P(Z \mid \lambda, C) \propto \exp\left\{-\tfrac{1}{2}\,\left(Z - \Sigma(\lambda \circ C)\right)^{\top} \Sigma^{-1} \left(Z - \Sigma(\lambda \circ C)\right)\right\}$

Kichaev et al. PLoS Genet. 2014; Chen et al. Genetics. 2015
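The fine mapping idea above can be sketched for the simplest case of one causal SNP: if SNP i is causal with standardized effect λ, the expected Z score of every other SNP is λ times its LD correlation r with SNP i, and the fit is judged with a multivariate normal likelihood whose covariance is the LD matrix. The LD matrix, the Z scores and the function names below are hypothetical, and this toy version is not the PAINTOR implementation.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def log_likelihood_causal(z, ld, i):
    """MVN log-likelihood (up to a constant) that SNP i is the single causal SNP."""
    lam = z[i]  # effect size maximizing the likelihood for candidate i
    resid = [z[j] - lam * ld[j][i] for j in range(len(z))]
    weighted = solve(ld, resid)  # Sigma^{-1} (Z - lam * Sigma[:, i])
    return -0.5 * sum(r * w for r, w in zip(resid, weighted))


# Hypothetical LD correlation matrix (r, not r^2) and observed Z scores.
LD = [[1.0, 0.8, 0.2],
      [0.8, 1.0, 0.3],
      [0.2, 0.3, 1.0]]
Z = [5.2, 3.9, 1.3]

log_liks = [log_likelihood_causal(Z, LD, i) for i in range(len(Z))]
best = max(range(len(Z)), key=lambda i: log_liks[i])
print("most likely causal SNP index:", best)
```

In this single-causal-SNP setting the likelihood ranking coincides with the ranking by |Z|; the LD structure becomes informative when comparing two traits or allowing several causal SNPs.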

Joint likelihood: causal same variant

[Figure: Trait #1 (GWAS) and Trait #2 (eQTL) panels showing -log L(causal SNP := i) against genomic position (Mbp) for markers M1 to M5; higher log likelihood marks the likely causal SNP. Legend: causal SNPs, LD neighbor of lead SNP for trait #1, all other SNPs.]

Joint likelihood: distinct variants

[Figure: same layout, with different causal SNPs for Trait #1 (GWAS) and Trait #2 (eQTL)]

Joint likelihood test

[Equation: joint likelihood test statistic; figure labels: Same, Distinct]

P-values can be estimated by permutation.

Parameters
z, w: association statistics (disease association statistic; eQTL association statistic)
m*: lead disease-associated SNP
θ: r² resolution limit (θ = 0.8)

Real data: autoimmune/inflammatory diseases

• Highly successful GWAS
• ImmunoChip: custom fine-mapping array
• Free availability of summary statistics on ImmunoBase
• Accessibility of disease relevant cell types and eQTL data

Number of loci

Disease  Densely        eQTL present (b)         Driven by same effect (c)
         genotyped (a)  CD4+  CD14+  LCL  Total  CD4+  CD14+  LCL  Total
MS       59             54    55     55   56     8     3      6    12
IBD      69             69    69     68   69     6     9      1    12
Crohn    19             18    18     18   18     2     1      0    3
UC       10             10    9      10   10     2     1      3    4
T1D      47             39    40     36   40     2     0      0    2
RA       34             34    34     34   34     2     0      1    3
CEL      34             34    34     34   34     3     2      0    5
Overall  272            258   259    255  261    25    16     11   41

* Excluding conditional hits
** Defined by ImmunoChip's densely genotyped fine-mapping intervals. Excluding MHC
*** Loosely identified by cis-eQTL signal within +/- 100kb from index SNPs. cis-eQTL is defined by the association p-value < 0.05.
**** CD4/CD14+ (n=211/213) from Raj et al. Science 2014; LCL (n=278) from Lappalainen et al. Nature 2013
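The proportions implied by the table can be checked with a few lines of arithmetic. The sketch below uses only the "Total" columns from the table; the dictionary layout and variable names are illustrative.

```python
# disease: (densely genotyped loci, loci with eQTL present, loci driven by same effect)
loci = {
    "MS": (59, 56, 12),
    "IBD": (69, 69, 12),
    "Crohn": (19, 18, 3),
    "UC": (10, 10, 4),
    "T1D": (47, 40, 2),
    "RA": (34, 34, 3),
    "CEL": (34, 34, 5),
}

genotyped = sum(row[0] for row in loci.values())
with_eqtl = sum(row[1] for row in loci.values())
same_effect = sum(row[2] for row in loci.values())

pct_of_genotyped = 100.0 * same_effect / genotyped
pct_of_with_eqtl = 100.0 * same_effect / with_eqtl
print(f"{same_effect}/{genotyped} = {pct_of_genotyped:.1f}% of densely genotyped loci")
print(f"{same_effect}/{with_eqtl} = {pct_of_with_eqtl:.1f}% of loci with an eQTL present")
```

The per-disease rows add up to the "Overall" row, and roughly 15% of the densely genotyped loci are driven by the same effect as an eQTL in at least one cell type.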

[Diagram: lead disease SNP, nominal eQTL SNP (assoc. P < 0.05) and eQTL gene, with distance labels <100 kb and <1 Mb]

METTL21B eQTLs in CD4+ and CD14+ are consistent with association to MS

[Figure: (a) genes around chromosome 12 position 58.1 to 58.3 Mb, including CYP27B1, CDK4, METTL21B, METTL1, AGAP2, AVIL, OS9, TSPAN31, TSFM, MARCH9 and CTDSP2, with association -log10(P) for MS, CD4+ and CD14+ colored by r²; (b) probability per gene and joint likelihood -log(P) for CD4+ METTL21B, CD14+ METTL21B and LCL; (c, d) CD4+ and CD14+ eQTL -log10(P) and Z plotted against MS -log10(P) and LD from MS lead SNP (r)]

A GWAS peak shared across RA, Crohn, and MS is consistent with ANKRD55 eQTL in T cells

ANKRD55 eQTL in CD4+ is consistent with association with MS, Crohn, and RA

[Figure: (a) genes around chromosome 5 position 55.41 to 55.45 Mb (ANKRD55, RNU6-299P, RNA5SP184), with association -log10(P) for RA, Crohn, MS and CD4+ colored by r²; (b) probability per gene (IL6ST, ANKRD55) for RA, Crohn and MS, joint likelihood -log(P) in CD4+ cells; (c, d) CD4+ eQTL -log10(P) and Z plotted against RA, Crohn and MS -log10(P) and LD from each disease lead SNP (r)]

Associations to multiple sclerosis, Crohn disease and rheumatoid arthritis (RA) on chromosome 5 are consistent with an eQTL for ANKRD55 in CD4+ T cells

15% of disease loci have eQTL driven by the same variant (FDR 5%)

~75% of tested disease-eQTL pairs are driven by distinct variants

[Figure: stacked bars of tested disease-eQTL pairs (%) by strength of eQTL (<10^-5, 10^-5 to 10^-3, 10^-3 to 0.01, 0.01 to 0.05) for CD4+ (n=3,256), CD14+ (n=3,360) and LCL (n=2,653); categories: Same, Distinct, No eQTL/ambiguous]

* 75% of hits pass Bonferroni threshold as well.

Summary on eQTLs

• ~15% of GWAS loci were mapped to eQTL genes.

• ~25% of GWAS loci are driven by the eQTLs of same effect.

• JLIM software

Methods

GWAVA (supervised)

CADD (predicts loss of genetic variation)

INSIGHT / LINSIGHT (population genetics)

Eigen (eigenvector in the annotation space)

PINES (phenotype-specific)

Prediction Methods

PINES

• Take all annotations and create an uncorrelated space

• Upweight axes corresponding to relevant cell/tissue types

• The score is based on the angle with the direction of the maximal possible annotation
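The three steps above can be illustrated with a toy score: once annotations are expressed on uncorrelated axes, each axis gets a weight, and the score is the cosine of the angle between the variant's annotation vector and the vector of maximal annotation under that weighting. The sketch assumes the axes are already decorrelated and uses made-up annotation values and weights; it is not the PINES implementation.

```python
import math


def weighted_cosine(x, reference, weights):
    """Cosine of the angle between x and reference under per-axis weights."""
    dot = sum(w * a * b for w, a, b in zip(weights, x, reference))
    norm_x = math.sqrt(sum(w * a * a for w, a in zip(weights, x)))
    norm_ref = math.sqrt(sum(w * b * b for w, b in zip(weights, reference)))
    if norm_x == 0.0 or norm_ref == 0.0:
        return 0.0  # an unannotated variant gets the lowest score
    return dot / (norm_x * norm_ref)


# Hypothetical decorrelated annotation axes; axis 0 stands for the relevant cell type.
max_annotation = [1.0, 1.0, 1.0]   # direction of the maximal possible annotation
our_snp = [1.0, 0.0, 0.0]          # annotated only in the relevant cell type

unweighted = weighted_cosine(our_snp, max_annotation, [1.0, 1.0, 1.0])
upweighted = weighted_cosine(our_snp, max_annotation, [4.0, 1.0, 1.0])
print(round(unweighted, 3), round(upweighted, 3))
```

Upweighting the relevant axis moves the maximal-annotation direction toward it, so a SNP annotated in the relevant cell type scores higher than under equal weights.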

[Figure: vectors in annotation space: the vector of all annotations, the relevant cell type axis, and our SNP]

PINES: http://genetics.bwh.harvard.edu/pines/