STARR-Seq Reporter Assays
Identifying regulatory mechanisms of human disease
Tim Reddy
Genomics and Precision Medicine Forum
April 5, 2018

My long term goal
To understand how changes in gene regulation contribute to human health and diseases

Two stories:
• Identifying genetic mechanisms of disease
• Quantifying the gene regulatory effects of drugs

There are now thousands of known associations between genotype and phenotype

Genetic associations with human traits and diseases are largely non-coding
[Pie chart: protein-coding exonic (7%); untranslated (2%); non-coding (91%)]

Why this is important
• Improved diagnostics
• Improved preventative measures
• Potential to identify therapeutically actionable mechanisms
• Regulatory elements may be targetable therapeutically
For all of these reasons, understanding the regulatory mechanisms of human disease has immense value for improving health.

Case study: hyperglycemia during pregnancy

GDM and fetal health contribute to a transgenerational cycle of diabetes and obesity
[Cycle diagram. Labels: Maternal Obesity/Diabetes; Fetal Overnutrition; Macrosomia; Postnatal Overnutrition; Adolescent Obesity; Early-Onset Type 2 Diabetes; Adult Obesity; Type 2 Diabetes; Metabolic Syndrome]
Slide from Bill Lowe. Adapted from: Dabelea and Crume, Diabetes 60:1849, 2011

Hyperglycemia and Adverse Pregnancy Outcomes (HAPO)
• The HAPO study was designed to address the hypothesis that hyperglycemia is associated with adverse neonatal outcomes.
• GWA between maternal genotype and measures of glucose metabolism identified variants in several genomic regions that are known to be associated with type 2 diabetes.
• HAPO also found novel genetic associations with hyperglycemia specifically during pregnancy.

Genetic variation on chr10 associated with hyperglycemia during pregnancy
Hayes et al, Diabetes, 2013

Lead imputed variant in 1st intron of HKDC1
[Locus plot: rs4746822 (lead SNP, imputation); rs1983127 (genotyped SNP)]
Hayes MG et al, Diabetes, 2013

Lots of epigenetic signals of regulation
[Locus plot: rs4746822 (lead SNP, imputation); rs1983127 (genotyped SNP)]
Hayes MG et al, Diabetes, 2013

Candidate regulatory elements in the locus
[Locus plot: rs4746822 (lead SNP); rs1983127 (genotyped SNP); candidate elements I to XI]
Guo et al, Nat Comm, 2015

Allele-specific reporter assays
[Diagram: luciferase reporter constructs carrying the C allele or the G allele]

Many regulatory variants near HKDC1
[Locus plot: rs4746822 (lead SNP); rs1983127 (genotyped SNP)]
Guo et al, Nat Comm, 2015

Regulatory effects are coordinated with respect to risk allele
[Locus plot: direction of effect with respect to GWAS risk allele]
Guo et al, Nat Comm, 2015

HKDC1 Complete Literature Review (ca. 2013)
• Inferring therapeutic targets from heterogeneous data: HKDC1 is a novel potential therapeutic target for cancer. Li GH, Huang JF. Bioinformatics, 2013.
• Identification and characterization of genes that control fat deposition in chickens. Claire D'Andre H, Paul W, Shen X, Jia X, Zhang R, Sun L, Zhang X. J Anim Sci Biotechnol, 2013.
• Identification of HKDC1 and BACE2 as genes influencing glycemic traits during pregnancy through genome-wide association studies. HAPO Study Cooperative Research Group. Diabetes, 2013.
• Case-control genome-wide association study of attention-deficit/hyperactivity disorder. IMAGE II Consortium Group. J Am Acad Child Adolesc Psychiatry, 2010.
• Molecular evolution of the vertebrate hexokinase gene family: Identification of a conserved fifth vertebrate hexokinase gene. Irwin DM, Tan H. Comp Biochem Physiol Part D Genomics Proteomics, 2008.

Hexokinase catalyzes the first step in glycolysis

Review of hexokinases
Wikipedia: There are four important mammalian hexokinase isozymes. (emphasis added)
HK1:
• Km < 1 mM glucose
HK2:
• Can metabolize various hexose sugars
HK3:
• Activity saturated at physiological glucose concentrations.
HK4 (Glucokinase):
• Km ~ 8 mM
• Activity is dynamic over physiological [glucose]

Genetic variation near HK1 and HK4 has been associated with diabetes

Purified HKDC1 has hexokinase activity
[Figure, panels a to e. Recoverable labels: HKDC1 mRNA Expression (relative expression for scrambled siRNA and HKDC1 siRNA 1, 2, and 1+2); Cellular HK Activity (scrambled vs. HKDC1 siRNA 1+2, with HK1, HK2, GCK, and HKDC1 on the axis); adenovirus GFP vs. HKDC1 with anti-HKDC1 and anti-β-actin blots; Control vs. HKDC1; specific activity of GFP, HK1, and HKDC1; activity vs. log10 [glucose (mM)] for HK1 and HKDC1]
Conclusion: HKDC1 is a 5th human hexokinase
Guo et al, Nat Comm, 2015

Mouse Model of HKDC1: non-pregnant adults are normoglycemic
Ludvik et al, Endocrinology, 2016

Mouse Model of HKDC1: impaired glucose tolerance in pregnancy
Ludvik et al, Endocrinology, 2016

Summary
• Much of the genetics of complex disease maps to non-coding regions of the genome
• Mapping the causes underlying those associations suggests that multiple genetic variants may underlie a single association signal
• Doing so can reveal unexpected candidate genes that themselves could have therapeutic potential

That was hard. (And that was an easy case.)

The 3q25 locus associated with fetal adiposity
Epigenetically predicted candidate regulatory elements
Candidate target genes:
• Long-noncoding RNAs in adiposity (Sun et al, 2013)
• Cyclin-L1: could be involved in cell cycling
• Makes fruit flies fat (Melted gene, Teleman et al, 2005)
Vockley, Guo, Majoros, et al, Genome Research, 2015

Making allele-specific reporter assays high-throughput
[Diagram: luciferase reporter constructs carrying the C allele or the T allele]

STARR-seq reporter assays
[Diagram: GFP reporter construct]
• Regulatory elements located in the 3' UTR of the reporter gene.
• From that position, the elements regulate their own expression.
[Diagram build: reporter transcripts with poly(A) tails (AAAAAAA.....); Read 1 and Read 2 labels]
STARR-seq: Arnold et al, Science, 2013
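
The STARR-seq slides describe each candidate element being transcribed as part of its own reporter mRNA, so sequencing read counts can serve as a readout of its activity. A minimal sketch of that counting step follows. It assumes, beyond what these slides state, that RNA read counts are normalized to each element's share of the input DNA library; the function name `starr_activity` and the element names are hypothetical, and this is not presented as the pipeline used in the talk.

```python
def starr_activity(rna_counts, dna_counts):
    """Estimate per-element activity as (RNA fraction) / (DNA fraction).

    rna_counts, dna_counts: dicts mapping element name -> read count.
    Elements with no reads in the input DNA library are skipped,
    because their ratio is undefined.
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    if rna_total == 0 or dna_total == 0:
        raise ValueError("both libraries need at least one read")

    activity = {}
    for element, dna in dna_counts.items():
        if dna == 0:
            continue  # no input coverage, so no estimate for this element
        rna_fraction = rna_counts.get(element, 0) / rna_total
        dna_fraction = dna / dna_total
        activity[element] = rna_fraction / dna_fraction
    return activity


# Hypothetical counts: both elements are equally abundant in the input
# library, but element_A yields four times as many reporter transcripts.
rna = {"element_A": 400, "element_B": 100}
dna = {"element_A": 100, "element_B": 100}
activity = starr_activity(rna, dna)
```

With these made-up counts, `element_A` comes out four times as active as `element_B`. A real analysis would also need replicates and a statistical model for count noise, which this sketch leaves out.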

STARR-seq reporter assays
[Diagram: GFP reporter; regulatory element activity, with example values 4 and 1]
STARR-seq: Arnold et al, Science, 2013

A platform for diverse studies
[Diagram build: GFP reporter; Bacterial Artificial Chromosomes]
STARR-seq: Arnold et al, Science, 2013

Comprehensively assaying genomic responses to steroid hormones
[Diagram: GFP reporter]
Vockley et al, Cell, 2016

Probe-based capture of GWAS regions
[Diagram: Patient DNA (input); Custom RNA Baits; Pulldown of Selected Regions; GFP reporter]

Output from POP-STARR assays in HepG2 cells
[Figure tracks: Coverage of Target Regions; Coverage of reporter assays in the region]

Differences in Regulatory Activity Predict Hyperglycemia
[Bar charts: Normalized Expression, Ref vs. Alt, for rs6517656, rs1541103, rs2776343, and rs13049843]
Quantities labeled on the rs6517656 chart:
\[ \frac{\text{Alternate Allele RNA}}{\text{Alternate Allele DNA}}, \qquad \frac{\text{Ref Alleles RNA}}{\text{Ref Alleles DNA}} \]

Long Distance Noncoding Variants
[Association plot: -log10 p value vs. position on chr21 (Mb), 41 to 42 Mb]

CRISPR/Cas9 epigenome editing
[Diagram: Genome Editing; Epigenome Editing]
Nature (2016)

Enhancer Activation: dCas9-P300 at HS2 activates globin expression 46 kb away
Hilton et al, Nature Biotechnology, 2015

Enhancer Repression: dCas9-KRAB causes H3K9me3..