Next Generation Update

Karl V. Voelkerding, MD
Professor of Pathology, University of Utah
Medical Director for Genomics and Bioinformatics, ARUP Laboratories

AACC-AMP 2012 Molecular Pathology Course

Disclosures

• Grant/Research Support: NIH
• Salary/Consultant Fees: None
• Committees: College of American Pathologists
• Stocks/Bonds: None
• Honorarium/Expenses: None
• Intellectual Property/Royalty Income: None

Learning Objectives

• Explain Principles of NGS
• Describe Current and Future NGS Platform Options
• Discuss Spectrum of NGS Clinical Applications

First Next Generation Sequencing Publication

454 Life Sciences, Nature 437 (7057): 376-380, 2005

Paradigm Shift

Sanger Sequencing: Electrophoretic Separation of Chain Termination Products

Next Generation Sequencing: Sequence Clonally Amplified DNA Templates in a Flow Cell, Massively Parallel Configuration

Process

Genomic DNA or Enriched Genes

Fragmentation (150 – 500 bp)

End Repair and Adapter Ligation

“Fragment Library”

Adapter Fragment A Adapter

Adapter Fragment B Adapter

Adapter Fragment C Adapter

Process

“Fragment Library” (Fragments A, B, C)

Clonal Amplification of Each Fragment

Emulsion Bead PCR / Surface Clusters

Process

Sequencing of Clonal Amplicons in a Flow Cell

Pyrosequencing (454)
Sequencing by Ligation (SOLiD)
Reversible Dye Terminators (Illumina)

Generation of Luminescent or Fluorescent Images

Conversion to Sequence

454/Roche: Bead Emulsion PCR, Pyrosequencing, 200 – 400 base reads
Solexa/Illumina: Surface Bridge PCR, Reversible dye terminators, 36 – 75 base reads

Solexa/Illumina Sequencing

Qualitative and Quantitative Information

[Figure: Illumina reads (A, T, C, G) aligned to the reference sequence (Ref Seq), showing a G>A variant and coverage]

Next Generation Sequencing

• Sequence up to billions of fragments simultaneously
• Iterative/cyclic sequencing

Luminescence (Roche)
Fluorescence (Illumina, SOLiD)
pH Detection (Ion Torrent)

Signal to Noise Processing

Cyclic Base Calls C G A T G C - - - 

Base Quality Scores

C30 G28 A33 T30 G28 C30 - - -

Next Generation Sequencing Data (FastQ File Format)

@HW-ST573_75:1:1:1353:4122/11
CAATCGAATGGAATTATCGAATGCAATCGAATAGAATCATCGAATGGACTCGAATGGAATCATCGAA
+
ggfggggggggggggfgggggggfgeggggfdfeefeggggggggegbgegegggdeYedgggggeg
@HW-ST573_75:1:1:1347:4151/11
ATCTGTTCTTGTCTTTAACTCTCAAGGCACCACCTTCCATGGTCAATAATGAACAACGCCAGCATGC
+
effffggggggggggggfgggggggggggggdggggfgggfgdggaffffgfggffgdggggggdfg
@HW-ST573_75:1:1:1485:4153/11
GAGGAGAGATATTTTGACTTCCTCTCTTCATATTTGGATGCTTTTTACTTATCTCTCTTGACTAATT
+
dZdddbXc`_ccccbeeedbeaedeeeee^aeeedcaZca_`^c[eeeeed]eeecd[dd^eeba[d

Primary Sequence Alignment: BWA
Refined Sequence Alignment: GATK/Picard
Variant Calling: SAMTools/GATK
Variant Annotation: Annovar

Variant g.34142190T>C in TPM1
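Each FastQ record above has four parts: an @-prefixed read identifier, the base calls, a + separator, and one quality character per base. A Phred score Q corresponds to an error probability of 10^(-Q/10), so a Q30 base call has a 1 in 1000 chance of being wrong. The following is a minimal Python sketch of reading such a record and decoding its qualities. The function names are illustrative, the example record is a shortened excerpt of the first read on the slide, and the ASCII offset of 64 is an assumption inferred from the quality characters shown (older Illumina pipelines used 64, current ones use 33).

```python
def parse_fastq(text):
    """Yield (read_id, bases, quality_string) from 4-line FastQ records."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    for i in range(0, len(lines), 4):
        header, bases, plus, quals = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(bases) != len(quals):
            raise ValueError(f"base/quality length mismatch in {header}")
        yield header[1:], bases, quals


def phred_scores(quals, offset=64):
    """Convert quality characters to Phred scores (offset is an assumption)."""
    return [ord(ch) - offset for ch in quals]


def error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


# Shortened excerpt (first 10 bases) of the first read shown on the slide.
EXAMPLE = """\
@HW-ST573_75:1:1:1353:4122/11
CAATCGAATG
+
ggfggggggg
"""

if __name__ == "__main__":
    for read_id, bases, quals in parse_fastq(EXAMPLE):
        for base, q in zip(bases, phred_scores(quals)):
            print(f"{base}{q}  P(error) = {error_probability(q):.5f}")
```

If the data were produced with the offset-33 encoding instead, passing `offset=33` to `phred_scores` is the only change needed.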

Next Generation Sequencers

First Wave
454/Roche (2004/5): GS FLX, GS Junior
Solexa/Illumina (2006/7): Genome Analyzer, GAIIx, GAIIe, HiScanSQ, HiSeq, MiSeq (2011)
ABI/Life Tech (2007/8): SOLiD, SOLiD 5500, SOLiD 5500xl

Second Wave - SMS
Helicos: HeliScope
Pacific Biosciences: SMRT

Third Wave
Ion Torrent (Life Technologies): PGM (2011)

Clinical Dissemination

Illumina HiSeq 2000

2 Independent Flow Cells
8 Lanes per Flow Cell

• 2 X 100 base pairs
• 540-600 Gb Output
• 8-11 Day Sequencing Run

• Multiple Gene Panel Samples per Lane
• 2-3 Exome(s) per Lane
• 2 per Flow Cell

Illumina MiSeq

Reversible Dye Terminators

• 2 X 150 bp, 2 X 250 bp
• 2.0 – 7.0 Gb Output
• ~27 Hrs Sequencing Run

• Multi-Gene Panels: Genetics, Oncology, Microbiology
• Viral and Bacterial Genomes
• Transcriptomes

Illumina MiSeq

Transcriptome Sequencing: GAPDH Sequence Reads

Ion Torrent

Hydrogen Ion

Pyrophosphate

Monitors H+ Release

Ion Torrent

Monitors H+ Release

• 100 – 200 base pairs
• 10 Mb – 1.0 Gb Output
• ~2 Hrs Sequencing Run

• Multi-Gene Panels: Genetics, Oncology, Microbiology
• Viral and Bacterial Genomes
• Transcriptomes

Ion Torrent

BRAF, c.1799T>A, p.V600E
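The mutant allele percentage reported with this BRAF call (26.5% on the slide) is a ratio of read counts: reads carrying the variant base divided by all reads covering the position. A minimal sketch of that arithmetic follows. The function name and the read counts are hypothetical, chosen only so the result matches the slide's figure; the actual depth at this position is not given.

```python
def mutant_allele_fraction(alt_reads, total_reads):
    """Fraction of reads at a position that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must be between 0 and total_reads")
    return alt_reads / total_reads


# Hypothetical counts: 53 of 200 reads show A instead of T at c.1799.
fraction = mutant_allele_fraction(alt_reads=53, total_reads=200)
print(f"{fraction:.1%} mutant alleles")  # prints "26.5% mutant alleles"
```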

26.5% mutant alleles

Technology Advances for 2012/13

Illumina HiSeq 2000 Late 2012

Upgrade Module

120 Gb, 27+ Hours

• 2 X 100 base pairs
• 540-600 Gb Output
• 11 Day Sequencing Run

• Single in 27+ Hours
• Multiple Exomes in 27+ Hours

Late 2012 Ion Torrent - Proton

Exomes/Genome “Several Hours”

Oxford Nanopore Technologies

Processive

Protein Nanopore in Polymer Membrane

MinION – Late 2012

Current Disruption Based Electronic Signal

The Meeting Place

Biotechnology: Sequence Generation
Bioinformatics: Sequence Analysis
Interpretation

Biomedical Question

What is the Genetic Landscape of a Tumor

What Pathogen is Responsible for an Outbreak

What Genetic Contributors Account for a Phenotype

Clinical Applications

Whole Genome

Whole Exome

Multi-Gene Diagnostics

Increasing Complexity

Multi-Gene Diagnostics

Clinical Phenotype

Multiple Genes: Locus Heterogeneity
Mutational Spectrum: Allelic Heterogeneity

Multi-Gene Diagnostics

“New First Tier” Genetic Testing

Scaling Increases Interpretive Complexity

Can Yield Non-Definitive Results

Gateway to Exome/Genome

Multi-Gene Diagnostics

Genomic DNA

Enrichment

Target Genes

NGS Library Preparation

Next Generation Sequencing

Bioinformatics

Interpretation

Gene Enrichment Approaches

Genomic DNA

Amplification Based: PCR or LR-PCR; RainDance ePCR or Fluidigm; HaloGenomics
Advantage: Enrichment Specificity
Drawbacks: Not as Scalable; Instrument and Chip Costs

Array Capture Based: Solid Surface; In Solution
Advantage: Scalable to Exome
Drawbacks: Homologous Sequence Capture; Manually Complex

Enriched Genes

NGS

Clinical Applications

Whole Genome

Whole Exome

Multi-Gene Diagnostics

Increasing Complexity

Human Exome

“Journey to the Center of the Genome”

~ 30+ Megabases (~ 1.5% of the genome)

~ 180,000 exons (~ 20,500 genes)

Harbors “Majority” of Mendelian Mutations

Exome Sequencing History

“Genetic Diagnosis by Whole Exome Capture and Massively Parallel DNA Sequencing”

Choi et al PNAS 2009 – Congenital Chloride Diarrhea

~45 Gene Discovery Publications May 2012

Recessive, Dominant, De Novo

Genomic DNA

Library Preparation

Next Generation Sequencing Library

Hybridize to Exome Capture Probes

Exome Enriched Library

Next Generation Sequencing

Bioinformatics Analysis

Comparison of Exome DNA Sequencing Technologies

Clark et al Nature Biotech Vol 29(10) Oct 2011

Exome Sequencing - Coverage of Coding Regions is Variable

[Figure: coverage, aligned reads, reference, and capture probes for MAZ Exon 1 and HLA-DOB Exon 1]

Nimblegen Exome Capture and Illumina HiSeq

Exome Sequencing – Performance Characteristics

Define Proportion of Exome “Adequately Covered”

Conversely

Define Proportion of Exome “Not Adequately Covered”
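One way to make "adequately covered" concrete is to pick a minimum read depth and report the fraction of targeted bases that reach it. The sketch below assumes a per-base depth list is already available; the function name, the threshold of 20 reads, and the depth values are hypothetical and not taken from the slides.

```python
def fraction_covered(depths, min_depth=20):
    """Fraction of target positions with read depth >= min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


# Hypothetical per-base read depths across a tiny 10-base target.
example_depths = [0, 5, 18, 22, 35, 40, 41, 60, 12, 25]

adequately = fraction_covered(example_depths, min_depth=20)
not_adequately = 1 - adequately
print(f"Adequately covered:     {adequately:.0%}")
print(f"Not adequately covered: {not_adequately:.0%}")
```

The appropriate threshold is a laboratory validation decision, so it is left as a parameter here.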

Dependent On

Capture Technology – Probe Design and Capture Efficiency

Sequencing Depth

Exome Sequencing – Performance Characteristics

Define Proportion of Exome “Accurately Sequenced”

Co-Capture Component: Pseudogenes; Paralogs and Homologs
Difficult to Sequence Regions: Repetitive Elements

Mendelian Disorders – Working Hypothesis

Seeking “Rare” Variants in a Single Gene(s)

Needle(s) in the Haystack(s)

Bioinformatics

Annotated Variants

Prioritization by Heuristic Filtering

• Filter Out Common Variants (dbSNP/1000 genomes)
• Pedigree Information (Linkage/SGS/IBD, Intersects)
• Variant Binning (Missense; Nonsense/Frameshift/Splice Site/Indels)
• Pathogenicity Prediction Filtering (SIFT/PolyPhen, GERP)
• Cross Reference Databases (HGMD/OMIM/Locus Specific)

Prioritization by Likelihood Prediction

• VAAST Algorithm (Variant frequency)
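The heuristic filtering path can be sketched as a chain of simple tests on annotated variants. This is an illustration under stated assumptions, not the presenter's pipeline: the `Variant` fields, the 1% frequency cutoff, the consequence labels, and the gene names are all hypothetical, and the `predicted_damaging` flag stands in for whatever a missense predictor reports.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    consequence: str          # e.g. "missense", "nonsense", "synonymous"
    population_freq: float    # allele frequency in a reference population
    predicted_damaging: bool  # stand-in for a missense predictor's call


# Consequence classes kept without needing a pathogenicity prediction.
DAMAGING_CLASSES = {"nonsense", "frameshift", "splice_site", "indel"}


def prioritize(variants, max_freq=0.01):
    """Drop common variants, then keep disruptive or predicted-damaging ones."""
    kept = []
    for v in variants:
        if v.population_freq > max_freq:
            continue  # filter out common variants
        if v.consequence in DAMAGING_CLASSES:
            kept.append(v)  # binned as likely disruptive
        elif v.consequence == "missense" and v.predicted_damaging:
            kept.append(v)  # missense retained only if predicted damaging
    return kept


# Hypothetical annotated variants.
annotated = [
    Variant("GENE_A", "missense", 0.20, True),     # common: removed
    Variant("GENE_B", "nonsense", 0.0, False),     # rare, disruptive: kept
    Variant("GENE_C", "missense", 0.001, True),    # rare, damaging: kept
    Variant("GENE_D", "missense", 0.0, False),     # predicted benign: removed
    Variant("GENE_E", "synonymous", 0.0, False),   # not in any bin: removed
]

for v in prioritize(annotated):
    print(v.gene, v.consequence)
```

Pedigree information and database cross-referencing would be further steps applied to the surviving list; they are omitted to keep the sketch short.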

Candidate Genes/Potential Causative Variants

Genome Sequencing

Genomic DNA

Library Preparation

Next Generation Sequencing Library

Hybridize to Exome Capture Probes

Exome Enriched Library

Next Generation Sequencing

Bioinformatics Analysis

Genomic DNA

Library Preparation

Next Generation Sequencing Library

Next Generation Sequencing

Bioinformatics Analysis

Exome Sequencing vs Genome Sequencing

Cost – Coverage – Complexity

Chr 10: g.43,615,633C>G in RET

Horizon

• Continued Evolution of Sequencing and Bioinformatics
• College of American Pathologists Checklist Requirements for Next Generation Sequencing
• Professional Societies Guidelines for Clinical Next Generation Sequencing

Self Assessment Questions

• Describe Process Steps for NGS
• List NGS Platform Options and Capabilities
• Relate Spectrum of Clinical NGS Applications