Genetic regulation of gene expression variation
Barbara E. Stranger, Section of Genetic Medicine, Institute for Genomics and Systems Biology, Center for Data Intensive Science
ENCODE Users group meeting, Stanford University 6/8/2016
Outline
• The Immunological Variation (ImmVar) Project
  – Baseline eQTLs in adaptive and innate immunity
    • Context specificity
    • Role in disease
  – Activation eQTLs
The Immunological Variation Project (ImmVar)
Goal: Understand how genetic variability translates into gene expression variability in adaptive and innate immunity and contributes to higher order phenotypes
ImmVar Study Design
Boston-based healthy donors (n=415), Phenogenetic Project (Phillip De Jager): European American (EU, n=215), African American (AA, n=115), East Asian (EA, n=85)
Cells and conditions: naïve CD4+ T-cells; CD14+16- monocytes; T-cell activation; dendritic cell activation
Transcriptome profiling (Affy Exon 1.0 ST) + genome-wide genotyping & imputation: cis- and trans-eQTL mapping
cis-eQTL analysis (baseline)
[Figure: cis window of SNPs, 1 Mb on each side of the gene]
Model: residual gene expression profile (control for age, sex, PCgt, PCge) + Genotyped and imputed SNPs
Significance assessed by 10,000 permutations per gene: nominal p below the 0.01 tail of the distribution of minimal permutation p-values
1) Analysis performed separately for each cell type and population
2) Multi-ethnic meta-analysis for each cell type
T. Raj et al. 2014 Science
Baseline eQTLs in CD4+ T and CD14+16-
[Figure: detected eGenes]
30% of tested genes have cis-eQTL associations
Up to 17% of genes with a cis-eQTL have >= 2 independent signals
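The per-gene permutation scheme from the cis-eQTL analysis slide can be sketched roughly as follows. This is a minimal illustration, not the pipeline used in the study: it assumes covariate-adjusted (residual) expression is already available, it uses the absolute Pearson correlation as the test statistic (for a single-SNP linear model with a fixed sample size this ranks SNPs in the same order as the nominal p-value), and all function and variable names are hypothetical.

```python
import random

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def permutation_threshold(residual_expr, cis_dosages, n_perm=10_000, tail=0.01, seed=0):
    """Gene-level |r| threshold from the null distribution of the best cis SNP.

    Each permutation shuffles expression across donors, records the strongest
    association over all cis SNPs, and the threshold is the value exceeded by
    only a fraction `tail` of permutations.
    """
    rng = random.Random(seed)
    shuffled = list(residual_expr)
    best_per_perm = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the genotype-expression link
        best_per_perm.append(max(abs(pearson_r(d, shuffled)) for d in cis_dosages))
    best_per_perm.sort()
    cut = min(int((1 - tail) * n_perm), n_perm - 1)
    return best_per_perm[cut]

def significant_cis_snps(residual_expr, cis_dosages, **kwargs):
    """Indices of cis SNPs whose observed |r| exceeds the permutation threshold."""
    threshold = permutation_threshold(residual_expr, cis_dosages, **kwargs)
    return [i for i, d in enumerate(cis_dosages)
            if abs(pearson_r(d, residual_expr)) > threshold]
```

Here `cis_dosages` is a list with one allele-dosage vector (0/1/2 per donor) per SNP in the cis window; lowering `n_perm` makes the sketch quick to try on toy data.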
[Figure: types of context specificity between condition 1 and condition 2: eQTL presence/absence; effect size variability; allelic direction flip; same gene, different SNPs; single SNP, different genes. Axes: log2 expr, -log10 p]
Population specificity of cis-eQTLs?
4-6% of cis-eQTLs are population-specific (monocytes, T-cells)
For shared-population eQTLs: fold change across populations highly correlated; Pearson's r: 0.85-0.95
Little population-specificity of presence/absence or fold-change modulation
BUT those differences might be highly relevant for population-diverged phenotypes
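The comparison of effect sizes for shared eQTLs amounts to correlating per-eQTL fold changes between two populations. A small sketch, assuming the fold changes are available as paired lists; the numbers below are invented for illustration and are not ImmVar data:

```python
def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def same_direction_fraction(x, y):
    """Fraction of shared eQTLs whose effect has the same sign in both populations."""
    return sum(1 for a, b in zip(x, y) if a * b > 0) / len(x)

# Hypothetical log2 fold changes for six shared eQTLs in two populations.
effects_pop1 = [0.42, -0.31, 0.77, 0.15, -0.58, 0.93]
effects_pop2 = [0.39, -0.25, 0.70, 0.22, -0.49, 0.88]

r = pearson_r(effects_pop1, effects_pop2)
concordant = same_direction_fraction(effects_pop1, effects_pop2)
```

A high `r` together with a high concordant fraction corresponds to the "shared, with similar fold change" pattern described on the slide; the same calculation applies to the cell-type comparison.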
Bayesian model averaging and a hierarchical model (BFBMA; Flutre et al. 2013)
Population-specific cis-regulatory effects
TARSL2: threonyl-tRNA synthetase-like 2
Cell-type specific cis-eQTLs?
~40% of cis-eQTLs are cell-type specific
For eQTLs shared across cell types: Fold change across cell types less highly correlated; r: 0.49-0.64
Evidence for cell-type specificity in presence/absence AND fold change
Cell-type specificity of cis-eQTLs in a rheumatoid arthritis risk
FADS genes: 11q12-q13 fatty acid desaturase locus involved in atopic disease
CD52 cis-eQTL shows directional regulatory effects across cell types
CD52: lymphocyte cell-surface glycoprotein, function in anti-adhesion, role in lymphoma. It is the protein targeted by alemtuzumab, a monoclonal antibody used for the treatment of chronic lymphocytic leukemia
Sex-differentiated eQTLs
• Autosomal genetic variation contributes to sexual dimorphism (Ober et al. 2008; Heid et al. 2010)
• Sex-specific eQTLs have been detected in mice (Yang et al. 2006) and humans (Dimas et al. 2012, Kent et al. 2012, Yao et al. 2014, Kukurba et al. 2015)
X% of cis-eQTLs exhibit sex-bias
CLL risk locus, monocytes. Chronic lymphocytic leukemia: male risk is 2X female risk (Slager et al. 2012)
[Figure: expression of BAK1 by rs210142 genotype (CC, CT, TT); Male and Female panels; p = 1.3e-6, p = 0.025]
BAK1: BCL2-Antagonist/Killer 1; role in apoptosis, interacts with the tumor suppressor P53 after exposure to cell stress
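The slides do not state how sex bias was tested. One common way to compare an eQTL between sexes, shown here purely as an assumed illustration, is to fit the per-allele slope separately in males and females and test the difference between the two slopes with a z-statistic. Function names and the toy values are hypothetical.

```python
from math import erfc, sqrt

def slope_and_se(dosage, expr):
    """Least-squares slope of expression on allele dosage, with its standard error."""
    n = len(dosage)
    mean_x, mean_y = sum(dosage) / n, sum(expr) / n
    sxx = sum((x - mean_x) ** 2 for x in dosage)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, expr))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    rss = sum((y - (intercept + beta * x)) ** 2 for x, y in zip(dosage, expr))
    return beta, sqrt(rss / (n - 2) / sxx)

def sex_difference_test(male, female):
    """male / female are (dosage, expr) pairs.

    Returns (beta_male, beta_female, z, two-sided normal p) for the
    difference between the two per-allele slopes.
    """
    beta_m, se_m = slope_and_se(*male)
    beta_f, se_f = slope_and_se(*female)
    z = (beta_m - beta_f) / sqrt(se_m ** 2 + se_f ** 2)
    return beta_m, beta_f, z, erfc(abs(z) / sqrt(2))

# Toy data: an allelic effect in males, none in females.
dosage = [0, 0, 1, 1, 2, 2]
male_expr = [8.0, 8.2, 8.5, 8.7, 9.0, 9.2]
female_expr = [8.5, 8.7, 8.5, 8.7, 8.5, 8.7]
beta_m, beta_f, z, p = sex_difference_test((dosage, male_expr), (dosage, female_expr))
```

A genotype-by-sex interaction term in a single joint model is an equivalent alternative; the two-slope form is used here only because it is easy to read.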
GWAS-associated SNPs are more likely to be eQTLs
Nicolae et al. 2010 PLoS Genetics
GWAS: Where is the causal disease variant and what does it do?
[Figure: Gene A, Gene B, Gene C]
Nica and Dermitzakis Hum Mol Genet 2008
Common disease variants are cis-eQTLs in ImmVar data
Traits                       # SNPs  Monocytes  T-cells  Shared
GWAS Curated (LD-pruned)      1,068         94       53      29
Cancers                         121          7        3       1
Neurodegenerative diseases       55         19        0       1
Neuropsychiatric                 27          3        4       3
Metabolic diseases              161         13        2       7
Height                          180         12        5       1
Autoimmune disease associated SNPs
Regulatory Trait Concordance, RTC score > 0.9 (Nica et al. 2010): T-cells 106 genes, monocytes 123 genes
Polarization in the regulatory effects of neurodegenerative and inflammatory disease variants
[Figure: proportion of cell specificity (monocytes, shared, T cells; random vs observed) for all cis-eQTLs and for GWAS cis-eQTLs per trait, x-axis 0.00 to 1.00, with p < 0.05 annotations at both ends of the trait ordering. Traits in plotted order: Alzheimer's disease, Parkinson's disease, HDL Cholesterol, Type 2 diabetes, Systemic Lupus Erythematosus, Body Mass Index, Breast cancer, Ankylosing Spondylitis, Psoriasis, Height, Ulcerative Colitis, Celiac disease, Primary Biliary Cirrhosis, Ovarian cancer, Lung cancer, Colorectal cancer, Restless Leg Syndrome, Crohn's disease, Systemic sclerosis, Bipolar, Schizophrenia, Testicular germ cell cancer, Prostate cancer, Type 1 diabetes, Multiple Sclerosis, Rheumatoid Arthritis]
Monocyte-specific cis-eQTL for CD33 associated with Alzheimer's Disease
[Figure: regional association plots around rs3826656 on chr19 (51.4-52.2 Mb) for monocytes and T-cells; axes: -log10(p-value), recombination rate (cM/Mb), r2 with rs3826656; CD33 expression by rs3826656 genotype (AA, AG, GG) in each cell type]
Previous studies report over-expression of CD33 on cell surface of microglia in postmortem brains of late-onset AD patients. CD33 expression level correlated with beta-amyloid protein and plaque accumulation.
Exploring eQTLs in the relevant cell type is important for disease association studies
[Figure: expression in the relevant cell type for disease vs. a cell type not relevant for disease]
Importance of cataloguing regulatory variation in multiple cell types
modified from Nica and Dermitzakis Hum Mol Genet 2008
Cell stimulation motivation
• Immune cells respond to different stimuli through different circuits (Amit et al. 2009, Science; Gat-Viks et al. 2013 Nat Biotech)
• Do healthy individuals vary in their immune response?
• Is there a genetic basis to this response?
• Does the variation in immune response relate to clinical disease?
• Can we leverage naturally occurring variation to reconstruct regulatory relationships?
Immune cell activation
Innate Immunity: Dendritic cells (DCs)
• Lipopolysaccharide (LPS) – bacteria
• Influenza – Virus
• Interferon-beta (IFN-β) – Virus
Adaptive Immunity: T-cells
• α-CD3, α-CD28
• α-CD3, α-CD28, IFN-β
• α-CD3, α-CD28, TGF-β – Th 17
Study pipeline
Step I: PBMC collection (n=560 individuals) & high-throughput assay development
Step II: Microarray study & codeset selection (DCs: 415 genes; CD4+ T: 255 genes)
Step III: Nanostring study (~1300 samples; 1598 samples)
  DCs: Baseline; LPS 5 hour; FLU 10 hour; IFNβ 6.5 hour
  CD4+ T: Baseline; α-CD3, α-CD28 4h; α-CD3, α-CD28, IFNβ 4h; α-CD3, α-CD28 48h; α-CD3, α-CD28, Th17-P 48h
Step IV: eQTL association study
Step V: Functional fine-mapping
Immune pathways activated by stimuli
Receptor engagement activates signal transduction cascades that regulate expression of inflammatory genes, IFNs and IFN-stimulated genes
Different categories of cis-eQTLs inform mechanism
1- Common to ALL conditions (n≈67)
2- Specific to FLU stimulation (n≈11)
3- Specific to LPS/FLU stimulation (n≈21)
4- Specific to LPS/FLU/IFNβ stimulation (n≈40)
cis-response QTLs (cis-reQTLs): 121 genes
7: FLU only
15: LPS + FLU
57: ALL
cis-reQTLs that alter sequence of TF binding sequences
Enrichment of known binding sites for TFs from the STAT family (ENCODE ChIP-Seq)
STAT2: 116-fold, p < 2.55 x 10^-21
STAT1: 126-fold, p < 2.98 x 10^-13
Validation
CRISPR alteration from heterozygous to homozygous
+ IFN-β
Fold induction of SLFN5 changed from 3.64 to 1.03
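Fold induction here is the ratio of stimulated to baseline expression, so a value near 1 (such as the 1.03 reported after editing) means essentially no response to IFN-β. A minimal sketch of how a response QTL effect can be expressed as the per-allele change in log2 fold induction, assuming linear-scale expression values per donor; the function names and toy numbers are hypothetical and not taken from the study:

```python
from math import log2

def fold_induction(baseline, stimulated):
    """Linear-scale fold induction: stimulated expression divided by baseline."""
    return stimulated / baseline

def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx

def response_qtl_effect(dosage, baseline, stimulated):
    """Per-allele change in log2 fold induction (the reQTL effect in this sketch)."""
    log_fold = [log2(fold_induction(b, s)) for b, s in zip(baseline, stimulated)]
    return slope(dosage, log_fold)

# Toy donors: each extra allele doubles the induction of the gene.
dosage = [0, 0, 1, 1, 2, 2]
baseline = [100.0] * 6
stimulated = [100.0, 100.0, 200.0, 200.0, 400.0, 400.0]
effect = response_qtl_effect(dosage, baseline, stimulated)
```

In this toy case `effect` is one log2 unit per allele; a variant with no influence on the response would give a slope near zero even if it affects baseline expression.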
Luciferase assay in HEK 293
Stimulate with IFN-β
Autoimmune and Infectious disease SNPs from GWAS
orange: cis-reQTLs, yellow: stimulus-specific cis-eQTLs
Summary
• Reference of genetic basis of transcriptome variation in innate and adaptive immune cells of a healthy multi-ethnic cohort
• Characterization of context specificity of eQTLs (population, cell-type, sex, activation state) with real implications for medical phenotypes, foremost in elucidating disease mechanisms.
• Population ‘specific’ signals are largely explained by allele frequency differences across populations, with little difference in effect size.
• Approximately 60% of cis-eQTLs are shared across adaptive and innate immune cell types, though effect sizes vary.
• Inflammatory disease alleles are over-represented in T-cell regulatory effects, whereas neurodegenerative disease alleles are enriched in monocyte effects.
• Genetic effects on response to immune cell activation
Acknowledgements
Christophe Benoist, Nir Hacohen and Aviv Regev, Katherine Rothamel, Chun (Jimmie) Ye, Ting Feng, Mark Lee, Michael Wilson, Chloe Villani, Natasha Asinovski, Weibo Li, Scott Davis, Daphne Koller, Diane Mathis, Sara Mostafavi
Phillip De Jager, David Haffler, Michelle Lee, Soumya Raychaudhuri, Cristin McCabe, Selina Imboywa, Barbara Stranger, Avery Davis, Towfique Raj*, Manik Kuchroo, Joseph Replogle, Alina Von Korff, Eric Gamazon, Nikos Patsopoulos, Patrick Evans, Irene Frohlich, Meritxell Oliva, Charles Czysz, Daisy Castillo, Katya Khramtsova, Nancy Cox, Dan Nicolae
Stranger lab is recruiting postdocs!!
Condition specific cis-eQTLs in DCs
[Figure panels: Baseline results; LPS results; Flu results; IFNb results]
264/415 genes have a cis-eQTL in at least one condition
Lee et al. Science 2014