Identifying, characterizing, and modulating regulatory elements in their natural context

Gregory E. Crawford

Center for Genomic and Computational Biology
Department of Pediatrics

What does the other 98% do?

45% repetitive DNA
53% unique and segmental duplicated DNA
2% (25,000)

Lots of genomic contexts to explore…

• Evolution
• Population
• Diseases
• Different tissues
• Environmental exposures
• Development

Overview of talk

• Regulatory elements in 200 diverse human cell types
• Resource for understanding disease genes

• Identifying non-coding variants that impact chromatin structure and expression

• Validating the function of regulatory elements using CRISPR/Cas9 epigenome editing strategies
• High-throughput screens

DNase hypersensitive (HS) sites identify active gene regulatory elements

DNase I HS sites

Regions hypersensitive to DNase

HS sites identify “open” regions of chromatin:
• Promoters
• Enhancers
• Silencers
• Insulators
• Locus control regions
• Meiotic recombination hotspots

High-throughput identification of regulatory elements

DNase-seq

Single base resolution

~100 million sequences per cell line or tissue

n=150

Sequencing using Illumina (DNase-seq)

Boyle et al., Cell 2008

A single DNase experiment matches most ChIP-seq data from 50 factors

Thurman et al., Nature, 2012

Generating a chromatin atlas from >200 cell types

Complex Disease
• Type 2 Diabetes
• Cancer
• Preterm birth
• Schizophrenia

Cross species
• Human
• Chimpanzee
• Orangutan
• Macaque
• Mouse

Population differences (lymphoblastoids from different individuals)
• 3 Europeans
• 3 African
• 70 humans
• Male vs. Female

Pushing the envelope
• Difficult cell types
• Endogenous nuclease
• Frozen tissues
• Small numbers of cells

Stem Cells
• Embryonic Stem Cells
• iPS (induced pluripotent cells)

Diverse
• Brain
• Blood
• Skin
• Heart
• Liver
• Kidney
• Muscle
• Fat

Environmental Exposure
• Cytokines
• HDAC inhibitors
• Chemotherapy
• Hormones
• Microbiota

Different blood cell types
• B cells
• T cells
• Activated B/T cells
• Neutrophils

Differentiation
• Myoblasts -> Myotubes
• Muscle differentiation
• Mouse brain development

200 cell types (> 1 million DNase sites)

What is this DHS doing?

What TFs bind to this Element?

What gene(s) does this Element regulate?

Can this help us understand genes that cause rare or common diseases?

Functional Validation

Regulatory Elements Surrounding CFTR locus

Ann Harris, Northwestern

Yang et al., NAR 2016

Chromatin varies across individuals

Identification of individual-specific open chromatin using lymphoblastoid cells from 6 individuals

Approximately 5% of open chromatin regions display individual/population differences

McDaniell et al., Science 2010

DNase sensitive quantitative trait loci (dsQTL) (Jonathan Pritchard, U. of Chicago)

DNase-seq performed on lymphoblastoid cells from 70 individuals

[Figure: genotype at a DNase site for individuals 1 to 70. Individuals carrying G show strong DNase signal (++++); individuals carrying T show weak signal (+).]

• ~9000 dsQTL identified
• 55% of eQTL = dsQTL
• Validated by ChIP-seq

Degner et al., Nature 2012
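The dsQTL idea on this slide, a variant at a DNase site whose genotype tracks how open the site is across individuals, can be illustrated with a small regression sketch. This is not the analysis from Degner et al.; it is a minimal example assuming a simple least-squares fit of DNase signal on allele dosage. The function name `genotype_association` and all data values are hypothetical.

```python
from statistics import mean


def genotype_association(dosages, signals):
    """Return (slope, Pearson r) for signal regressed on allele dosage.

    Hypothetical helper for illustration only: a plain least-squares fit,
    with none of the covariate or multiple-testing corrections a real
    QTL analysis would need.
    """
    mx, my = mean(dosages), mean(signals)
    sxx = sum((x - mx) ** 2 for x in dosages)
    syy = sum((y - my) ** 2 for y in signals)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosages, signals))
    if sxx == 0 or syy == 0:
        raise ValueError("need variation in both genotype and signal")
    slope = sxy / sxx                # change in signal per extra T allele
    r = sxy / (sxx * syy) ** 0.5     # Pearson correlation
    return slope, r


# Made-up toy data: number of T alleles per individual (0, 1, or 2)
# and an invented normalized DNase-seq read count at the site.
t_dosage = [0, 0, 0, 0, 1, 1, 2, 2]
dnase_signal = [4.1, 3.9, 4.0, 4.2, 2.4, 2.6, 1.0, 1.1]

slope, r = genotype_association(t_dosage, dnase_signal)
print(f"slope per T allele: {slope:.2f}, Pearson r: {r:.2f}")
```

In this toy data the T allele goes with lower signal, mirroring the G (++++) versus T (+) pattern in the figure; a negative slope with a strong correlation is the kind of pattern that would flag the site as a candidate dsQTL.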

Chromatin QTL analyses

• ~9000 chromatin QTLs identified in lymphoblastoid cell lines.

• Direct mechanism for how non-coding variants lead to altered gene expression

• Recently we have identified another 9000 chromatin QTLs in bra