8/3/2015

What do we know about the genetics of autism?

Santhosh Girirajan Pennsylvania State University, University Park

National Autism Conference August 3, 2015

Outline of the talk

• Principles and applications of genetics
• Detecting genetic variants: implications for genetic testing
• Genetics of autism
• Chromosomal disorders and copy-number variants
• changes (exome sequencing)
• Sex differences in autism


Objectives of the talk

• Obtain a basic understanding of types of genetic variants and how to detect them
• Understand the implications of genetic variants
• Understand how CNVs and SNVs disrupt genes
• Understand the complexity associated with autism – heterogeneity of autism

What will not be discussed in this talk?

• How autism was determined to have a genetic basis
• Syndromic forms of autism – Fragile X, Tuberous sclerosis etc
• Details of each of the CNVs or genes identified to be associated with autism
• Mouse models, stem cell models, whole genome sequencing, genome wide association studies

However, you can contact me (anytime) requesting
• Details on any specific aspect of the talk
• Details on any aspect of genetics or any of the above
• Information about my research interests

Contact: Santhosh Girirajan, Email: [email protected]


Human genetics and genomics

Why do we have our father’s nose and mother’s eyes?

[Images: human traits; developmental problems]

Images courtesy: pinterest.com, sikids.com; Cassidy et al, EJHG; wiki.ggc.edu

Research in genomics and autism

Number of research papers in PubMed with the term “genomics” or “autism” from years 2000 to 2015

[Figure: two bar charts of number of publications by year, 2000 to 2014. Genomics research: y-axis 0 to 20,000. Autism research: y-axis 0 to 5,000.]

Major reasons for these increases:
• Advent and utility of high throughput technologies for understanding the nature of the genome in natural variation and disease
• Increased awareness and development of more sensitive tests for detecting autism
• High heterogeneity of autism – multiple genes implicated


Human biology and genetics

Nerve cells Heart cells

Bone cells Fat cells

Cells, chromosomes, and DNA

Human spectral karyotyping

• Genetics is the science of heredity and variation
• Genomics is the study of genomes



Chromosomes, DNA and genes

Human chromosomes An excerpt from the human genome CCATCCAGCTTTGTTCCATTGCTCGCAAGGAGCTGCAATCCTTTGGAGGAGAAGCGGCGCTCTGGTTTTT TGAATTTTCAGCTTGTCTGCTCTGGTTTCCCCCCATATATGTGGTTTTATCTACCTTTGGTCTTTAATGA TGGTGACCTACAGATGTGGTTTTGGCAGGGATGTCCTTTTTGTTGATGCTGTTCCTTTCTGTTTGCTAGC TTTCCTTCTAACAGTCAGGACCCTCAGCTGCAGGTCTGTTGGAGTTTGTTGGAGGTCCACTCCAGACCCT GTTTGCCTGAGTGTCACCAGTGGAGGCTGCAGAACAGCAAATATTGCTGCCTGATCCTTCTTCTGGAAGC TTTCCTTCTAACAATCAGGACCCTCAGCTGCAGGTCTGTTGGAGTTTGTTGGAGGTCCACTCCAGACCCT GTTTGCCTGAGTGTCACCAGTGGAGGCTGCAGAACAGCAAATATTGCTGCCTGATCCTTCTTCTGGAAGC TTCATCTCAGAGGGACACCTGGCTGTATGAGGTGTCAGTAAATCCCTACGGGCAGCTCTGTCTATTCTCA GAGTTCAAACTCCATGCTGGAGAATGACTGCTCTCTTCAGAGCTGTCAGACAGGGATGTTTAAGTCTGCA GAAGTTTCTGCTGCCTTTTATTCAGCTATACCCTGCCCCTAGAGGTGGAGTCTACAGAGGCTTCCAGGGC TCCTTGAGCTGCAGTGAGCTCCACCCAGTTCAGGCTTCCCAGCTGCTTTGTTTAACTATTCAAGCCTCAG CAATGGTGGACGCCCCTCCCCCAGCCCAGGCTGCCACCTTGCAGTTCGATCTCGGACTGCTGCACTAGCA GTAAGCAAGGCTGTGTGGGCATGGGACCCGCCAAGCCATGCAAGGGATATAATCTCCTGGTGTGCCGCTT GCTAAGACCATTGGAAAAGCACAGTATTAGGGTGGGAATGTCTGGATTTTCCAGGTGCCGTCTGTCACGG CTTCCCTTGGCTAGGAAAGGGAAATCCCCCGACCACTTGTGCTGCTTCCCAGATGAGGTGACACCCTGCC CTGCTTCGGCTCACCCTCTGTGGGCTGCACCCACTGTCCGACCCGTCTCAGTGTGATGAACTAAGTACCT CAGATGGAAATACAGAAATCACCTGTCTTCTACGTCAATTATGCTGAGAGCTGCAGACAGGAGCTGTTCC TATTCGGCCATCTTGGAAAAATCCTCTCTTTTCATTTATTTAAGAAATATTTGAAAAGCAAAGATTTCAT


Gene structure: regulatory sequence, exon, intron, regulatory sequence (ON/OFF/modulate)

Gene → messenger RNA → protein

DNA codons: ATG TGG TTT TGG CAG
Amino acids: Met Trp Phe Trp Gln

• Classically, DNA sequences coding for proteins are called genes
• There are approximately 22,000 annotated genes in the human genome
• Disruption (or altered expression) of genes can lead to human variation or disease

www.pathology.washington.edu; www.expasy.org
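The step from DNA codons to amino acids can be sketched in a few lines of Python. This is a minimal illustration using the standard genetic code and the five codons shown on the slide; the function name `translate` is an assumption made for this example, not something from the talk. Note that CAG encodes glutamine (Gln, Q) in the standard genetic code.

```python
# Standard genetic code, laid out in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(coding_dna: str) -> str:
    """Translate coding-strand DNA into one-letter amino acid codes.

    Translation stops at the first stop codon; a trailing partial codon
    is ignored. (Hypothetical helper written for this illustration.)
    """
    protein = []
    for start in range(0, len(coding_dna) - 2, 3):
        residue = CODON_TABLE[coding_dna[start:start + 3].upper()]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


# The codons from the slide: ATG TGG TTT TGG CAG
print(translate("ATGTGGTTTTGGCAG"))  # MWFWQ (Met Trp Phe Trp Gln)
```

The same function shows why a premature stop codon shortens the protein: `translate("ATGTGGTTTTGA")` returns only `MWF`.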


Genetic variants and gene disruptions

Gene structure: regulatory sequence, exon, intron, regulatory sequence

Gene → messenger RNA → protein

DNA codons: ATG TGG TTT TGG TGT
Amino acids: Met Trp Phe Trp Cys

Copy-number variants (CNVs)

[Figure: chromosome ideogram (p and q arms) with Gene A, Gene B, Gene C, and Gene D; in the deletion example only Gene A and Gene D remain; a duplication example is also shown.]

• Heterozygous deletions lead to one copy of the gene
• Homozygous deletions result in zero copies of the gene
• Duplications can lead to 3 or more copies of the gene

Base pair or single nucleotide variants (SNVs)

• Missense mutations lead to altered amino acid/protein
• Stop or nonsense mutations lead to truncated or altered protein
• Repeat expansions lead to altered (defective) protein

Reference: Met ……… Trp Phe Trp Gln Gln Gln… (ATG…………TGG TTT TGG CAG CAG CAG)
Missense mutation: Met ……… Trp Leu Trp (ATG…………TGG TTA TGG)
Nonsense mutation: Met ……… Trp Phe STOP (ATG…………TGG TTT TGA)
Repeat expansion: Met ……… Trp Phe Trp Gln Gln Gln Gln Gln Gln (ATG…………TGG TTT TGG CAG CAG CAG CAG CAG CAG)

www.expasy.org
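The copy-number rules for deletions and duplications can be made concrete with a small sketch. Everything here is an assumption for illustration: the gene coordinates are hypothetical, the names `Interval`, `GENES`, and `gene_copies` are invented for this example, and any overlap between a CNV and a gene is treated as affecting the whole gene, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    start: int
    end: int  # half-open interval: [start, end)

    def overlaps(self, other: "Interval") -> bool:
        return self.start < other.end and other.start < self.end


# Hypothetical gene coordinates on one chromosome (illustration only).
GENES = {
    "Gene A": Interval(100, 200),
    "Gene B": Interval(300, 400),
    "Gene C": Interval(500, 600),
    "Gene D": Interval(700, 800),
}


def gene_copies(cnvs):
    """Return the copy number of each gene in a diploid genome.

    `cnvs` is a list of (interval, copy_change) pairs: -1 for a heterozygous
    deletion, -2 for a homozygous deletion, +1 for a duplication.
    """
    copies = {name: 2 for name in GENES}  # diploid baseline
    for region, change in cnvs:
        for name, gene in GENES.items():
            if gene.overlaps(region):
                copies[name] = max(0, copies[name] + change)
    return copies


# A heterozygous deletion spanning Gene B and Gene C leaves one copy of each.
print(gene_copies([(Interval(250, 650), -1)]))
# A duplication of the same region gives three copies of each.
print(gene_copies([(Interval(250, 650), +1)]))
```

Passing `-2` for the same region models a homozygous deletion and returns zero copies for Gene B and Gene C.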

Genetic variants and assays

[Figure: two panels. Left, "Types of genetic variants": frequency against size of variant (1 bp, 1 kb, 1 Mb, 1 chr), covering single nucleotide changes (SNVs), copy number variants (CNVs), and trisomy/monosomy. Right, "How do we assay them?": throughput against size of variant (1 bp, 1 kb, 1 Mb, 1 chr), covering SNP genotyping, sequencing, array-CGH, and karyotyping.]


Karyotyping: for detecting cytogenetic (chromosomal) changes

• Chromosomal banding in metaphase (cell division)
  – Q, Giemsa, C, R, silver nitrate
  – Banding patterns reflect regional differences in base pair composition and other functional characteristics
• Chromosomal banding was adopted as a clinical tool for karyotyping from the 1960s
  – amniocentesis for chromosomal abnormalities
  – Down syndrome (Lejeune, 1958)
  – Turner syndrome and Klinefelter syndrome (1959)

Down syndrome Turner syndrome

Fluorescent in situ hybridization (FISH)

• Labeled probes hybridize to complementary DNA sequences, so specific DNA segments can be queried
  – For cytogenetic localization of DNA sequences
• FISH helped identify chromosomal deletions and duplications
  – Prader-Willi/Angelman syndrome (15q11.2q13.1 deletion)
  – 15q11.2q13.1 duplication
  – Smith-Magenis syndrome (17p11.2 deletion)

Girirajan et al, Nature Genetics, 2010


Array comparative genomic hybridization

[Figure: array CGH workflow. A normal human DNA sample (Cy3 channel) and a disease individual's DNA sample (Cy5 channel) are hybridized to an array of BACs/oligos, and the two channels are merged.]

Targeted array: BACs/oligos; TEL; dist: >50 kb, <5 Mb; 95% identity, 10 kb
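One common way to read array CGH data is to take the log2 ratio of the test (Cy5) to reference (Cy3) intensity at each probe and look for runs of consecutive probes that deviate from zero. The sketch below illustrates that idea only; the thresholds, the minimum run length, and the function names are assumptions chosen for this example, not values from the talk. For orientation, a heterozygous deletion (1 copy versus 2) ideally gives log2(1/2) = -1, and a duplication (3 copies versus 2) gives about 0.58.

```python
import math


def log2_ratio(test_intensity: float, reference_intensity: float) -> float:
    """log2(Cy5 / Cy3) for one probe; 0 means equal copy number."""
    return math.log2(test_intensity / reference_intensity)


def call_probe(ratio: float, loss_threshold: float = -0.5,
               gain_threshold: float = 0.4) -> str:
    """Classify one probe. Thresholds are illustrative assumptions."""
    if ratio <= loss_threshold:
        return "loss"
    if ratio >= gain_threshold:
        return "gain"
    return "normal"


def call_segments(ratios, min_probes: int = 3):
    """Return (first_probe, last_probe, state) for each run of at least
    `min_probes` consecutive probes that share a loss or gain call."""
    calls = [call_probe(r) for r in ratios] + [None]  # sentinel closes the last run
    segments = []
    run_start, run_state = 0, None
    for index, state in enumerate(calls):
        if state != run_state:
            if run_state in ("loss", "gain") and index - run_start >= min_probes:
                segments.append((run_start, index - 1, run_state))
            run_start, run_state = index, state
    return segments


# Hypothetical log2 ratios for ten consecutive probes along a chromosome.
probe_ratios = [0.02, -0.05, -0.95, -1.10, -0.90, 0.00, 0.60, 0.55, 0.62, 0.01]
print(call_segments(probe_ratios))  # [(2, 4, 'loss'), (6, 8, 'gain')]
```

Requiring several consecutive probes before calling a segment is a simple guard against single noisy probes; production pipelines use dedicated segmentation algorithms instead.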

Chromosomal address of the genetic lesion

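[Figure: chromosome ideogram running from telomere (pter) through the p (petit) arm, the centromere (p10-q10), and the q arm to telomere (qter). Each address is read as region, band, and sub-band; for example, p22.3 is region 2, band 2, sub-band 3. Bands shown: p22.3, p22.2, p22.1, p21, p11.4, p11.3, p11.2, p10, q10, q12, q13, q21, q22, q23, q24, q25, q26, q27, q28.]

Examples: Dup(15q11.2-q13.1), 5.6 Mbp; Dup(14q32.13-q32.2), 848 kbp; Del(5q15-q23.3), 35 Mbp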

Girirajan, Nature Genet, 2010
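The chromosome, arm, region, band, and sub-band reading of a cytogenetic address can be sketched as a small parser. This is an illustrative helper only: the names `Band`, `parse_band`, and `parse_range` are assumptions, and the pattern handles just the simple address forms shown on the slide, not full cytogenetic nomenclature.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Band(NamedTuple):
    chromosome: str
    arm: str            # "p" (short arm) or "q" (long arm)
    region: int
    band: int
    sub_band: Optional[str]


# Simple addresses only, e.g. "15q11.2", "5q15", "14q32.13".
BAND_PATTERN = re.compile(
    r"(?P<chrom>[1-9][0-9]?|X|Y)(?P<arm>[pq])"
    r"(?P<region>[0-9])(?P<band>[0-9])(?:\.(?P<sub>[0-9]+))?"
)


def parse_band(address: str) -> Band:
    """Split an address such as '15q11.2' into its components."""
    match = BAND_PATTERN.fullmatch(address)
    if match is None:
        raise ValueError(f"unrecognized band address: {address!r}")
    return Band(
        chromosome=match["chrom"],
        arm=match["arm"],
        region=int(match["region"]),
        band=int(match["band"]),
        sub_band=match["sub"],
    )


def parse_range(text: str) -> Tuple[Band, Band]:
    """Parse a range such as '15q11.2-q13.1' into its two end bands."""
    start, _, end = text.partition("-")
    first = parse_band(start)
    last = parse_band(first.chromosome + end) if end else first
    return first, last


print(parse_range("15q11.2-q13.1"))
print(parse_band("5q15"))
```

For `15q11.2` this yields chromosome 15, long arm, region 1, band 1, sub-band 2, which matches the way the ideogram labels are read.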

Exome sequencing for identifying base pair variants

[Figure: page excerpt from Bamshad et al, NRG, 2011 (Box 1, workflow for exome sequencing). Genomic DNA is randomly sheared and used to construct a shotgun library; the library is enriched for exons by hybridization to biotinylated DNA or RNA baits; hybridized fragments are recovered by biotin–streptavidin pulldown, amplified, and sequenced by massively parallel sequencing; reads are then mapped and candidate causal variants are called.]

• A technique to sequence all protein-coding regions (exons) of the human genome to identify mutations
• Has been replacing or is complementary to standard karyotyping and array CGH as a diagnostic tool in clinical laboratories
• High throughput but will generate large sizes of data files for analysis (more so for whole genome sequencing)

Bamshad et al, NRG, 2011
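Identifying mutations from exome data usually means narrowing the many variants called in each sample down to rare, protein-altering ones, and then asking whether they are inherited or de novo. The sketch below shows that filtering logic under stated assumptions: the `Variant` record, the consequence labels, the gene names, and the 0.1% frequency cutoff are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Consequence labels treated as protein-altering (an assumption for this sketch).
PROTEIN_ALTERING = {"missense", "nonsense", "frameshift", "splice_site"}


@dataclass
class Variant:
    gene: str
    consequence: str
    population_frequency: Optional[float]  # None: not seen in reference populations
    in_mother: bool
    in_father: bool


def is_candidate(variant: Variant, max_frequency: float = 0.001) -> bool:
    """Keep rare variants that are predicted to alter the protein."""
    rare = (variant.population_frequency is None
            or variant.population_frequency <= max_frequency)
    return rare and variant.consequence in PROTEIN_ALTERING


def is_de_novo(variant: Variant) -> bool:
    """A variant absent from both parents is treated as de novo."""
    return not variant.in_mother and not variant.in_father


# Hypothetical calls from one child (gene names are placeholders).
variants = [
    Variant("GENE1", "missense", 0.12, True, False),     # common polymorphism
    Variant("GENE2", "nonsense", None, False, False),    # novel and de novo
    Variant("GENE3", "synonymous", None, False, False),  # does not alter the protein
    Variant("GENE4", "missense", 0.0004, False, True),   # rare but inherited
]

candidates = [v for v in variants if is_candidate(v)]
print([v.gene for v in candidates])                    # ['GENE2', 'GENE4']
print([v.gene for v in candidates if is_de_novo(v)])   # ['GENE2']
```

In practice the parental comparison also has to account for sequencing errors and coverage gaps, which this sketch ignores.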

Summary: Principles and applications of genetics

• Basic genetics: chromosome, DNA, gene
• Types of mutations: CNVs, SNVs, repeat expansions
• Genetic techniques to identify variants
  – Karyotyping
  – FISH
  – Array CGH
  – Exome sequencing

Further reading

• Trask, Trends in Genetics, 1991
• Emmanuel and Saitta, Nature Reviews Genetics, 2004
• Trask, Nature Reviews Genetics, 2002
• Alkan et al, Nature Reviews Genetics, 2012
• Shendure and Ji, Nature Biotechnology, 2008


Advances in the genetic studies of autism

[Figure: timeline from 1980 to 2015 marking array CGH and exome sequencing, with a bar chart of number of publications (0 to 1,000) by year, 2000 to 2014.]

• Number of research papers in PubMed with the term “genomics” AND “autism” from years 2000 to 2015
• With reducing costs of genomic technologies, discovery of genes and genomic regions contributing to autism is increasing

Abrahams and Geschwind, JAMA Neurology, 2010

Genetic models to identify causes of autism

A) Common Disease Common Variant Model
• A new mutation occurred 1000s of generations ago; the block carrying the mutation became common in the population because of identity by descent

B) Common Disease Rare Variant Model
• New mutation frequency is high and mutations occur in the immediate generation(s); mutations are individually rare but collectively contribute significantly because multiple genes lead to disease

[Figure: diagrams marking the new mutation in each model.]

Slide courtesy: Evan Eichler


Linkage and association studies of autism

• Linkage studies (looking at shared blocks of DNA common in all affected individuals within a large family) have implicated chromosome 7q35 (containing CNTNAP2) and 20p13
• Genome wide association studies (GWAS) for common variants identified a handful of loci including genes on 5p14.1 (CDH9 and CDH10), 5p15.31 (SEMA5A and TAS2R1), and 7q35 (within CNTNAP2)
• The contribution of common variants of small effect (odds ratio <3) to autism risk is still being explored with larger sample sizes

Please refer to Chen et al, Ann Review of Pathology, 2015; Geschwind and State, Lancet, 2015, Geschwind, Ann Rev Medicine, 2009, Gaugler et al, Nat Genetics, 2014.

Genetic syndromes associated with autism

• Although not specific to autism, these disorders include Fragile X syndrome (1-2% of individuals with autism), Tuberous sclerosis (~1%), Timothy syndrome (<1%), and Rett syndrome (~0.5%)

• When interrogated using standardized measures (ADI-R, ADOS, SCQ, SRS), a noticeable frequency of autism was identified in classical genetic syndromes
• Initial studies identified several syndromic disorders that were later found to be significantly associated with autism features

[Figure: frequency of autism features (x-axis, 0.0 to 1.0) for each of the following syndromes: Sotos syndrome, Turner syndrome, Neurofibromatosis type 1, Wolf-Hirschhorn syndrome, Phenylketonuria, Beckwith-Wiedemann syndrome, Hypomelanosis of Ito, Goldenhar syndrome, Cowden syndrome, Down syndrome, Duchenne/Becker muscular dystrophies, Klinefelter syndrome, Jacobsen syndrome, 2q deletion syndrome, Cornelia de Lange syndrome, Prader-Willi syndrome, CHARGE syndrome, Moebius syndrome, Fragile X syndrome, Phelan-McDermid syndrome, Velocardiofacial syndrome, Smith-Lemli-Opitz syndrome, Inverted 8p deletion syndrome, 2q37 deletion syndrome, Potocki-Lupski syndrome, Lujan-Fryns syndrome, Cohen syndrome, Timothy syndrome, Smith-Magenis syndrome, Angelman syndrome, Williams-Beuren syndrome.]

Polyak et al, AJMG, 2015


Rare copy number variants (CNVs)

• Frequency of <0.1% to <1% in the general population
• Contribute to a range of human disorders including autism, intellectual disability, and schizophrenia
• CNVs contain functionally relevant genes
• Haploinsufficiency or triplosensitivity

Deletion Duplication

Genomic disorders with typical syndromic features

Smith-Magenis syndrome: 17p11.2 deletion or mutations in RAI1
• Intellectual disability, craniofacial features, sleep disturbance, obesity, self-injurious behaviors

Potocki-Lupski syndrome: 17p11.2 duplication
• Intellectual disability, growth retardation, ADHD, autism

[Figure: map of the 17p11.2 region from centromere (Cen) to telomere (Tel), showing the distal, middle, and proximal SMS-REPs and the genes PRPSAP2, MYO15A, ALDH3A, RAI1, PMP22, PEMT, COPS3, TNFRSF13B, ADORA2B, ULK2, EPN2, FLCN, SHMT1, and MFAP4.]

[Figure: mice at 20 weeks compared with wild type (WT), with Rai1 copies of 2, 1, 2, 4, and 6. Reduced Rai1: obesity, hypoactivity, craniofacial defects. Overexpressing Rai1: growth retardation, hyperactivity, abnormal social behavior.]

Girirajan, PhD Thesis, 2008


Rare copy-number variants in autism

• Duplications on chromosome 15q11.2q13.1 were among the first loci to be implicated in autism
  – Children who inherited the duplication from the mother showed features of autism (Cook et al, 1997, AJHG), while paternally inherited duplications did not show autism features
  – This duplication accounts for ~1.5% of children with autism
• In 2007, using DNA microarrays, Sebat and colleagues performed a genome-wide search for CNVs in autism (Science, 316 (5823), 445-449)
• They found that 10% of individuals with autism carried large (>500,000 base pairs) deletions or duplications that occurred de novo (i.e. not inherited from a parent)
• This study led to a watershed in autism genetics, with multiple CNVs being discovered for autism risk

A list of rare CNVs in autism

• 15q13.1 duplication – Cook, AJHG, 1997; Thomas, AJHG, 1998
• 22q13 deletion – Phelan, 1997; Moessner, AJHG, 2007
• 16p11.2 deletion – Sebat, Science, 2007; Weiss, NEJM, 2008; Kumar, HMG, 2008
• 16p13.11 deletion – Ullman, Hum Mut, 2007; Hannes, JMG, 2009
• 17q12 deletion – Mefford, AJHG, 2007; Moreno-De-Luca et al, AJHG, 2010
• 15q13.3 deletion – Sharp, Nat Genet, 2008
• 1q21.1 deletion – Mefford, NEJM, 2008
• 7q11.23 duplication – Sanders, Neuron, 2011


16p11.2 deletion as an example

• Weiss et al and Kumar et al identified 16p11.2 deletions and duplications in individuals with autism
• The 16p11.2 deletion specifically accounts for 1% of cases with autism – associated with macrocephaly
• It is known (Courchesne et al, JAMA, 2003) that autism is associated with macrocephaly
• McCarthy et al detected an enrichment for 16p11.2 duplications in individuals with schizophrenia
• Reciprocal deletion/duplication CNVs associated with opposing phenotypes

[Figure labels: Autism; Schizophrenia]

16p11.2 deletion mice show social behavior defects

• Ultrasonic vocalizations (USVs) were tested in young adult male 16p11.2 deletion mice during a novel three-phase male–female social interaction test.
• This test detects vocalizations emitted by a male in the presence of an estrous female, how the male changes its calling when the female is suddenly absent, and the extent to which calls resume when the female returns.
• Fewer vocalizations were detected in 16p11.2 deletion males.

Yang et al, Autism Research, 2015


Identifying phenotype-specific genes within CNV regions – 16p11.2

• Golzio et al modeled duplication and deletion of individual genes in zebrafish
• They performed systematic overexpression and downregulation of pairs of genes within the CNV
• Observed microcephaly and macrocephaly phenotypes with KCTD13

Golzio et al, Nature, 2011

A gene for neuroanatomical features in 16p11.2 CNVs

• Increased proliferation (increased phospho-histone 3) or apoptosis (TUNEL) observed with deletion or duplication of KCTD13

Golzio et al, Nature, 2011


Using Drosophila to assess function of genes within rare CNV regions

[Figure: map of the 16p11.2 region between centromere (CEN) and telomere (TEL), with genes MAZ, MVP, ASPHD1, TAOK2, HIRIP3, TBX6, CDIPT, SPN, QPRT, YPEL3, MAPK3, ALDOA, INO80E, PRRT2, DOC2A, SEZ6L2, KCTD13, GDPD3, TMEM219, FAM57B, PPP4C, C16orf92, C16orf54, CORO1A, and C16orf53.]

Gal4 drivers:
• Daughterless-Gal4: Ubiquitous
• GMR-Gal4: Eye specific
• Elav-Gal4: Nervous system specific
• C5-Gal4 and MS1096-Gal4: Wing

Shoko Nishihara, (2012). Heritable and inducible RNAi knockdown system in Drosophila. Retrieved 20,8,2013 , from http://jcggdb.jp/GlycoPOD/protocolShow.action?nodeId=t32. http://commons.wikimedia.org/wiki/File:Drosophila-drawing.svg

Ubiquitous knockdown of 16p11.2 genes

Da-Gal4 at 25°C

[Table: columns are Gene Name, Larval Lethality, Pupal Lethality, Wing Defects, Eye Defects, and No gross morphology defects, grouped under Early development and Adult development. Rows are PPP4C, ALDOA, CORO1A, MAPK3, TBX6, FAM57B, C16ORF53, CDIPT, KCTD13, DOC2A, and YPEL3. Cells are shaded by the share of flies showing the phenotype: > 80%, < 20%, < 10%, or 0%.]

Role in early development: ALDOA, PPP4C, CORO1A, TBX6 and MAPK3.


Head phenotypes of 16p11.2 genes

[Figure: panels A and B showing macrocephaly and microcephaly head phenotypes; scale bar 100 µm.]

Negative control: RTN2
Macrocephaly: PTEN – Phosphatase and tensin ortholog
Microcephaly: CTNNB1 – Beta catenin 1; UBE3A – Ubiquitin protein ligase 3A

Head phenotypes of 16p11.2 genes

Macrocephaly C16ORF53 CDIPT KCTD13 MAPK3 Lethal genes DOC2A Microcephaly PPP4C CORO1A ALDOA Dosage dependent severity C16ORF53 MAPK3


Motor defects of 16p11.2 genes

Potential contributors to motor function : MAPK3, FAM57B, CORO1A and KCTD13.

Challenges in understanding the molecular mechanisms of autism

• Genetic heterogeneity: Hundreds of genes and genomic regions have been identified as conferring risks for autism
• Phenotypic heterogeneity: variable in severity and frequency of comorbid features
  – Associated with comorbidity of nosologically distinct phenotypes
    • Intellectual disability (68%)
    • Epilepsy (40%)
    • Behavioral and ‘emotional’ disturbance (30%)

Polyak et al, AJMG, 2015


Common molecular etiology for “distinct” neurodevelopmental phenotypes

15q13.3 deletion
• Mental retardation (Sharp, Nat Genet, 2008)
• Epilepsy (Helbig, Nat Genet, 2009)
• Schizophrenia (Stefansson, Nature, 2008)
• Aggressive behavior (Shachar, JMG, 2009)
• Autism (Pagnamenta, EJHG, 2008)

1q21.1 deletion
• Autism (Szatmari, Nature Genet, 2008)
• Glioblastomas (Veeriah, Nature, 2009)
• Schizophrenia (Stefansson, Nature, 2008)
• Mental retardation (Mefford, NEJM, 2008)
• Heart defect (Greenway, Nat Genet, 2009)
• Cataracts (Mefford, NEJM, 2008)

16p11.2 deletion
• Autism (Weiss, NEJM, 2008)
• Obesity (Walters, Nature, 2010)
• Mental retardation (Rosenfeld, JNDD, 2010)

These CNVs are also seen in unaffected parents, which is a challenge in diagnosis and management.

Comorbid features confound diagnosis

• Comorbid phenotypes confer severity and variability • Depends on age and genetic background • This can confound diagnosis and prevalence estimates (see Polyak et al, AJMG, 2015).

http://www.proprofs.com


The 16p12.1 deletion as a paradigm for complex genetics

• Pathogenic: enriched in cases (42/21,127) compared to controls (8/14,839) (P = 0.00012; OR = 3.7)
• Variable clinical features
  – Developmental delay (100%)
  – Craniofacial features (96%)
  – Seizures (36%)
• Mostly inherited (95%)
• Carrier parents show neuropsychiatric features
• Variable features within family
• Enrichment (25%) for large CNV (>500 kb) second hits in probands
• Clinical features more distinct or severe than that associated with second hit

1y 3m 2y 6m 2y

Girirajan et al. Nat Genet, 2010
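The odds ratio quoted for the 16p12.1 deletion can be reproduced from the case and control counts on the slide. The sketch below treats 42/21,127 and 8/14,839 as carriers over total individuals, which is an assumption about how the counts are reported. The confidence interval uses the Woolf (log) method and is an addition for illustration; it is not reported in the talk, and the function names are invented for this example.

```python
import math


def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Odds of carrying the variant in cases divided by the odds in controls."""
    a = case_carriers
    b = case_total - case_carriers        # cases without the variant
    c = control_carriers
    d = control_total - control_carriers  # controls without the variant
    return (a * d) / (b * c)


def woolf_interval(case_carriers, case_total, control_carriers, control_total,
                   z=1.96):
    """Approximate 95% confidence interval on the log odds ratio scale."""
    a, b = case_carriers, case_total - case_carriers
    c, d = control_carriers, control_total - control_carriers
    log_or = math.log((a * d) / (b * c))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or - z * standard_error),
            math.exp(log_or + z * standard_error))


# Counts from the slide: 42 of 21,127 cases and 8 of 14,839 controls.
print(round(odds_ratio(42, 21127, 8, 14839), 1))  # 3.7
print(woolf_interval(42, 21127, 8, 14839))
```

Because the variant is rare, the odds ratio is nearly identical to the ratio of carrier frequencies (42/21,127 divided by 8/14,839).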

A second-site (two-hit) model for 16p12.1 deletion

First hit: 16p12.1 microdeletion
• Subclinical manifestations: learning disabilities, psychiatric issues

Second-site hit: large CNV, single gene mutation, epigenetic factors, environmental factors
• Overt phenotypes: severe cognitive delay, craniofacial dysmorphology, cardiac defects

[Figure labels: High penetrance; Variable expressivity]


Second-site hits frequent in CNVs with variable features

[Figure: scatter plot of the percentage of cases with second hits (y-axis, 0 to 30) against the percentage of cases with inherited first hits (x-axis, 0 to 100); correlation r=0.63, P<0.0001. Points are grouped as rare CNVs with phenotypic variability or classical CNV syndromes. Labeled points: 16p12.1 del, 16p11.2 dup, 15q11.2 del, 22q11.2 distal dup, 6q25.3 del, 15q13.3 small dup, PWS/AS, 17q12 del, 22q11.2 dup, 16p11.2 distal del, 17q12 dup, 16p13.11 del, 8p23 del, 15q13.3 del, 16p11.2 del, 10q23 del, 16p13.11 dup, 1p36 del, PWS dup, 1q21.1 del, 22q11.2 del, 1q21.1 dup, WBS dup, 9q34 del, 2q37 del, Williams, Potocki-Lupski, Sotos, Smith-Magenis, 22q13 del, Rubinstein-Taybi, 17q21.31 del, 22q11.2 distal del, 19p13.12 del.]

Girirajan et al, New Engl J Med, 2012

Summary: Rare CNVs in human neurodevelopmental disorders

[Figure: penetrance (y-axis) against frequency of variants (x-axis, rare to common).]

Necessary and sufficient: typical syndromic features
• 17p11.2 deletion (Smith-Magenis) (frequency 1/15,000)
• Haploinsufficiency (reduced Rai1) and triplosensitivity (overexpressing Rai1)

Necessary but not sufficient: variable phenotypes
• 16p12.1 deletion (1/~2000)
• Contributes towards a continuum of phenotypes in neurodevelopmental disorders
• Other factors play a role in phenotypic heterogeneity and comorbidity


Global assessment of rare and common CNV load

[Figure: percentage of population (y) against maximum size of CNV in kb (x), comparing affected children with one or more of intellectual disability, autism, and epilepsy (n=15,767) with neurologically unaffected adult controls (n=8,329). Odds ratios marked on the plot: 3.2, 5.2, 8.3.]

Cooper*, Coe*, Girirajan*, et al, Nat Genet, 2011
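A curve of this kind can be built by asking, for each size threshold, what fraction of a group carries a largest CNV at least that big. The sketch below shows one way to do that. The function name and all sizes in it are invented for illustration and are not data from the study.

```python
def fraction_with_cnv_at_least(largest_cnv_kb, threshold_kb):
    """Fraction of individuals whose largest CNV is at least threshold_kb."""
    hits = sum(1 for size in largest_cnv_kb if size >= threshold_kb)
    return hits / len(largest_cnv_kb)


# Made-up values: one largest-CNV size (kb) per individual. Not study data.
toy_cases = [50, 120, 400, 800, 1600, 3000]
toy_controls = [20, 40, 90, 150, 300, 600]

for threshold in (250, 500, 1000):
    cases = fraction_with_cnv_at_least(toy_cases, threshold)
    controls = fraction_with_cnv_at_least(toy_controls, threshold)
    print(threshold, round(cases, 2), round(controls, 2))
```

Plotting these fractions against the threshold for each group gives one curve per group, and the widening gap between the curves at larger sizes is what an increasing odds ratio summarizes.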

CNV load correlates with severity of features

[Figure: percentage of population (y) against maximum size of CNV in kb (x), with curves for craniofacial, cardiovascular, all cases, epilepsy, and autism.]

Cooper*, Coe*, Girirajan*, et al, Nat Genet, 2011


CNV load correlates with a range of neurodevelopmental disorders

[Figure: percentage of population (y, ticks from 0.1% to 100%) against largest CNV size in Mbp (x, 0 to 3), with curves for NIMH controls, dyslexia, autism (without ID), autism (with ID), idiopathic ID, and ID with multiple congenital anomalies (MCA).]

Girirajan et al, PLoS Genetics, 2011

Deletion size associates with intellectual disability severity in autism cohort

2,588 samples analyzed from Simons Simplex Collection

[Figure: nonverbal IQ (y, 0 to 160) against CNV size in bp (x, 0 to 10,000,000), one panel each for duplications and deletions. Pearson correlation: duplications R(317) = -0.06, p=0.257; deletions R(222) = -0.24, p<0.0001.]

Girirajan*, Dennis* et al, Am J Hum Genet, 2013


Duplication size correlates with autism severity

[Figure: calibrated autism severity score* (y, 3 to 10) against CNV size in bp (x, 0 to 10,000,000), one panel each for duplications and deletions. Spearman correlation: duplications R(302) = 0.13, p=0.01; deletions R(211) = 0.02, p=0.86.]

*Age-normalized ADOS score: measures deficits in social communication and repetitive/restricted interests/behaviors

Girirajan*, Dennis* et al, Am J Hum Genet, 2013

Exome sequencing in autism

Exome sequencing has revolutionized disease gene discovery. Several themes have emerged from these studies:
1. Increased paternal age contributes to a higher risk for autism and related disorders
2. Decreased non-verbal IQ is significantly associated with an increasing number of mutational events

O’Roak et al, 2010, 2012; Iossifov et al, 2014; Sanders et al, 2012


Exome sequencing in autism

3. Gene disruptive mutations are more likely to affect Fragile X pathways, chromatin modifiers, embryonic developmental and postsynaptic density proteins
4. Females and males with low IQ require a higher mutational burden than males for manifestation of the phenotype
5. The rate of mutations is significantly higher in probands compared to their siblings

O’Roak et al, 2010, 2012; Iossifov et al, 2014; Sanders et al, 2012

Identification of an autism subtype: CHD8 mutations

Bernier et al, Cell, 2014


Summary: Genetics of autism

• Syndromic autism, genome-wide association studies, linkage
• Rare CNVs
• Syndromic CNVs (Smith-Magenis syndrome)
• CNVs associated with autism
• Example of 16p11.2 deletion (~1% of autism)
• 16p11.2 mouse models
• 16p11.2 zebrafish models
• 16p11.2 fly models
• Heterogeneity of autism
• 16p12.1 deletion as a paradigm
– Multiple genetic hits and comorbid features
• Exome sequencing
– Identifying subtypes of autism

Sex differences in autism and intellectual disability
• There is a documented sex bias in autism, with diagnosis skewed towards males (male to female ratio of 4:1 for autism and 2:1 for intellectual disability)
• Recently, a female protective genetic model has been proposed in which females require a higher genetic load than males to manifest autism
• We tested various factors affecting the genetic and phenotypic heterogeneity associated with autism
– Frequency of comorbid features
– Male to female ratio for specific rare CNVs
– Burden of copy-number variants
– Frequency of family history of neurodevelopmental features


Sex differences in autism and intellectual disability
• Frequency of comorbid features
• Male to female ratio for specific comorbid features and specific CNVs

[Figures: frequency of comorbid features within autism and within ID/DD, shown for all, males, and females (scale 0 to 100); male to female ratio in autism; male to female ratio of autism within CNVs (percentage of cases, female and male). Legend categories: behavioral disorder, multiple features, psychiatric disorder, hearing/visual impairment, other health impairment, speech or language impairment, specific learning disability, comorbidities, no comorbidities.]

CNV load is higher in females than males with autism

• Higher CNV load in females compared to males for autism, intellectual disability and ASD co-occurring with ID or epilepsy
• Epilepsy in autism is significantly associated with intellectual disability
– Individuals with autism manifesting with epilepsy are more likely than those without epilepsy to have intellectual disability

Polyak et al, unpublished


Females require a higher genetic load in the background than males

A genetic liability model for autism

Werling et al, 2014, Reich et al, 1975
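A liability threshold model of this kind can be sketched numerically. The example below assumes liability follows a standard normal distribution in both sexes and that females must cross a higher threshold than males to be affected. The two threshold values are hypothetical and chosen only for illustration; they are not taken from the cited papers.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(0.0, 1.0)


def affected_summary(threshold):
    """Prevalence and mean liability of individuals above a liability threshold."""
    prevalence = 1.0 - STANDARD_NORMAL.cdf(threshold)
    # Mean of a standard normal distribution truncated below at the threshold.
    mean_liability = STANDARD_NORMAL.pdf(threshold) / prevalence
    return prevalence, mean_liability


# Hypothetical thresholds (in standard deviations), not estimates from data.
male_prevalence, male_mean = affected_summary(2.0)
female_prevalence, female_mean = affected_summary(2.5)

print(f"male: prevalence {male_prevalence:.4f}, mean liability {male_mean:.2f}")
print(f"female: prevalence {female_prevalence:.4f}, mean liability {female_mean:.2f}")
```

Under these assumptions fewer females than males cross their threshold, and those who do carry a higher average liability, which is the pattern the female protective model predicts.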


Conclusions
• Autism is a heterogeneous disorder ranging in severity and variability of associated features
• Recent technological advances have greatly improved our understanding of the molecular etiology of autism
• Genetic heterogeneity is extensive and will generate a large set of candidate genes as diagnostic markers and for therapies
• Detailed genetic and iterative phenotypic evaluations will help identify subtypes of autism
• Functional evaluation of the identified genes in model systems is important to map the molecular basis of autism and to devise targeted therapies
• There is an increasing need for a multidisciplinary collaborative team of teachers, clinicians, parents, and researchers to address the challenges associated with autism

Acknowledgments

Girirajan Lab
o Janani Iyer
o Andrew Polyak
o Lucilla Pizzo
o Qingyu Wang
o Komal Vadodaria
o Payal Patel
o Alexis Kubina
o Sneha Yennawar
o Abigail Talbert
o Yixuan Wang
o Rhea Sullivan
o Ayush Thomas
o Brian Lifschutz
o Ben Ruddle
o Carla Bertulfo

o Sarah Elsea (Baylor)
o Evan Eichler (U Washington)
o Lisa Shaffer (Signature Genomics)
o Mary-Claire King (U Washington)

Funding: March of Dimes and NARSAD foundations
