Next Generation Sequencing Update
Karl V. Voelkerding, MD
Professor of Pathology, University of Utah
Medical Director for Genomics and Bioinformatics, ARUP Laboratories
AACC-AMP 2012 Molecular Pathology Course
[email protected]
Disclosures
• Grant/Research Support: NIH
• Salary/Consultant Fees: None
• Committees: College of American Pathologists
• Stocks/Bonds: None
• Honorarium/Expenses: None
• Intellectual Property/Royalty Income: None
Learning Objectives
• Explain Principles of NGS
• Describe Current and Future NGS Platform Options
• Discuss Spectrum of NGS Clinical Applications
First Next Generation Sequencing Publication
Nature 437 (7057) 376-380, 454 Life Sciences, 2005
Paradigm Shift
Sanger Sequencing: Electrophoretic Separation of Chain Termination Products
Next Generation Sequencing: Sequence Clonally Amplified DNA Templates in a Flow Cell, Massively Parallel Configuration
Process
Genomic DNA or Enriched Genes
Fragmentation (150 – 500 bp)
End Repair and Adapter Ligation
“Fragment Library”
Adapter - Fragment A - Adapter
Adapter - Fragment B - Adapter
Adapter - Fragment C - Adapter
Process
“Fragment Library”: Fragments A, B, C
Clonal Amplification of Each Fragment
Emulsion Bead PCR or Surface Clusters (clonal copies of Fragments A, B, C)
Process
Sequencing of Clonal Amplicons in a Flow Cell
Pyrosequencing (454)
Sequencing by Ligation (SOLiD)
Reversible Dye Terminators (Illumina)
Generation of Luminescent or Fluorescent Images
Conversion to Sequence
454/Roche: Bead Emulsion PCR; Pyrosequencing; 200 – 400 base reads
Solexa/Illumina: Surface Bridge PCR; Reversible dye terminators; 36 – 75 base reads
Solexa/Illumina Sequencing
[Figure: Illumina reads aligned to the reference sequence (Ref Seq), showing a G>A variant and read coverage]
Qualitative and Quantitative Information
Next Generation Sequencing
• Sequence up to billions of fragments simultaneously
• Iterative/cyclic sequencing
Luminescence (Roche), Fluorescence (Illumina, SOLiD), pH Detection (Ion Torrent)
Signal to Noise Processing
Cyclic Base Calls: C G A T G C - - -
Base Quality Scores: C30 G28 A33 T30 G28 C30 - - -
Next Generation Sequencing Data
FastQ File Format
@HW-ST573_75:1:1:1353:4122/11
CAATCGAATGGAATTATCGAATGCAATCGA
ATAGAATCATCGAATGGACTCGAATGGAAT
CATCGAA
+
ggfggggggggggggfgggggggfgegggg
fdfeefeggggggggegbgegegggdeYed
gggggeg
@HW-ST573_75:1:1:1347:4151/11
ATCTGTTCTTGTCTTTAACTCTCAAGGCAC
CACCTTCCATGGTCAATAATGAACAACGCC
AGCATGC
+
effffggggggggggggfgggggggggggg
gdggggfgggfgdggaffffgfggffgdgg
ggggdfg
@HW-ST573_75:1:1:1485:4153/11
GAGGAGAGATATTTTGACTTCCTCTCTTCA
TATTTGGATGCTTTTTACTTATCTCTCTTG
ACTAATT
+
dZdddbXc`_ccccbeeedbeaedeeeee^
aeeedcaZca_`^c[eeeeed]eeecd[dd
^eeba[d
Primary Sequence Alignment: BWA
Refined Sequence Alignment: GATK/Picard
Variant Calling: SAMTools/GATK
Variant Annotation: Annovar
Variant g.34142190T>C in TPM1
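The FastQ records above pair every base call with one encoded quality character. The following is a minimal Python sketch of reading such records and converting the quality characters to Phred scores and error probabilities. It rests on two assumptions: each record occupies exactly four lines (sequence and quality not wrapped), and the file uses a Phred+64 ASCII offset, which is consistent with the characters shown (for example `g` and `e`) but is not stated on the slides; most current files use an offset of 33. The function names and the short example record are illustrative only.

```python
import io
from typing import Iterator, List, TextIO, Tuple

# Assumption: Phred+64 (older Illumina) encoding; most current files use 33.
PHRED_OFFSET = 64


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record stream."""
    while True:
        header = handle.readline().strip()
        if not header:
            return  # end of input
        sequence = handle.readline().strip()
        separator = handle.readline().strip()
        quality = handle.readline().strip()
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], sequence, quality


def phred_scores(quality: str, offset: int = PHRED_OFFSET) -> List[int]:
    """Convert quality characters to integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q: int) -> float:
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


# Hypothetical short record, for illustration only.
example = io.StringIO("@example_read/1\nCAATCG\n+\nggfgeY\n")
for read_id, seq, qual in read_fastq(example):
    for base, q in zip(seq, phred_scores(qual)):
        print(read_id, base, q, f"{error_probability(q):.5f}")
```

Under the Phred definition, a Q30 base call such as the one in the Base Quality Scores example corresponds to an expected error rate of 1 in 1,000, and Q28 to roughly 1 in 630.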
Next Generation Sequencers
First Wave
454/Roche (2004/5): GS FLX, GS Junior
Solexa/Illumina (2006/7): Genome Analyzer, GAIIx, GAIIe, HiScanSQ, HiSeq, MiSeq (2011)
ABI/Life Tech (2007/8): SOLiD, SOLiD 5500, SOLiD 5500xl
Second Wave - SMS
Helicos: HeliScope
Pacific Biosciences: SMRT
Third Wave
Ion Torrent/Life Technologies: PGM (2011)
Clinical Dissemination
Illumina HiSeq 2000
2 Independent Flow Cells
8 Lanes per Flow Cell
2 X 100 base pairs
540-600 Gb Output
8-11 Day Sequencing Run
• Multiple Gene Panel Samples per Lane
• 2-3 Exome(s) per Lane
• 2 Genomes per Flow Cell
Illumina MiSeq
Reversible Dye Terminators
2 X 150 bp, 2 X 250 bp
2.0 – 7.0 Gb Output
~27 Hrs Sequencing Run
• Multi-Gene Panels: Genetics, Oncology, Microbiology
• Viral and Bacterial Genomes
• Transcriptomes
Illumina MiSeq
Transcriptome Sequencing: GAPDH Sequence Reads
Ion Torrent
[Figure labels: Hydrogen Ion, Pyrophosphate]
Monitors H+ Release
Ion Torrent
Monitors H+ Release
100 – 200 base pairs
10 Mb – 1.0 Gb Output
~2 Hrs Sequencing Run
• Multi-Gene Panels: Genetics, Oncology, Microbiology
• Viral and Bacterial Genomes
• Transcriptomes
Ion Torrent
BRAF, c.1799T>A, p.V600E: 26.5% mutant alleles
Technology Advances for 2012/13
Illumina HiSeq 2000 (Late 2012)
2 X 100 base pairs
540-600 Gb Output
11 Day Sequencing Run
Upgrade Module
120 Gb, 27+ Hours
• Single Genome in 27+ Hours
• Multiple Exomes in 27+ Hours
Ion Torrent - Proton (Late 2012)
Exomes/Genome in “Several Hours”
Oxford Nanopore Technologies
Processive Enzyme
Protein Nanopore in Polymer Membrane
Current Disruption Based Electronic Signal
MinION – Late 2012
The Meeting Place
Biotechnology and Bioinformatics
Sequence Generation, Sequence Analysis, Interpretation
Biomedical Question
What is the Genetic Landscape of a Tumor?
What Pathogen is Responsible for an Outbreak?
What Genetic Contributors Account for a Phenotype?
Clinical Applications
Whole Genome
Whole Exome
Multi-Gene Diagnostics
Increasing Complexity
Multi-Gene Diagnostics
Clinical Phenotype
Multiple Genes: Locus Heterogeneity
Mutational Spectrum: Allelic Heterogeneity
Multi-Gene Diagnostics
“New First Tier” Genetic Testing
Scaling Increases Interpretive Complexity
Can Yield Non-Definitive Results
Gateway to Exome/Genome
Multi-Gene Diagnostics
Genomic DNA
Enrichment
Target Genes
NGS Library Preparation
Next Generation Sequencing
Bioinformatics
Interpretation
Gene Enrichment Approaches
Genomic DNA
Amplification Based: PCR or LR-PCR; RainDance ePCR or Fluidigm; HaloGenomics
Array Capture Based: Solid Surface; In Solution
Enriched Genes
NGS
Gene Enrichment Approaches
Amplification Based - Advantage: Enrichment Specificity
Amplification Based - Drawbacks: Not as Scalable; Instrument and Chip Costs
Array Capture Based - Advantage: Scalable to Exome
Array Capture Based - Drawbacks: Homologous Sequence Capture; Manually Complex
Clinical Applications
Whole Genome
Whole Exome
Multi-Gene Diagnostics
Increasing Complexity
Human Exome
“Journey to the Center of the Genome”
~ 30+ Megabases (~ 1.5% of the genome)
~ 180,000 exons (~ 20,500 genes)
Harbors “Majority” of Mendelian Mutations
Exome Sequencing History
“Genetic Diagnosis by Whole Exome Capture and Massively Parallel DNA Sequencing”
Choi et al PNAS 2009 – Congenital Chloride Diarrhea
~45 Gene Discovery Publications (May 2012)
Recessive, Dominant, De Novo
Genomic DNA
Library Preparation
Next Generation Sequencing Library
Hybridize to Exome Capture Probes
Exome Enriched Library
Next Generation Sequencing
Bioinformatics Analysis
Comparison of Exome DNA Sequencing Technologies
Clark et al Nature Biotech Vol 29(10) Oct 2011
Exome Sequencing - Coverage of Coding Regions is Variable
[Figure: coverage, aligned reads, reference, and capture probes for MAZ Exon 1 and HLA-DOB Exon 1; Nimblegen Exome Capture and Illumina HiSeq]
Exome Sequencing – Performance Characteristics
Define Proportion of Exome “Adequately Covered”
Conversely
Define Proportion of Exome “Not Adequately Covered”
Dependent On
Capture Technology – Probe Design and Capture Efficiency
Sequencing Depth
Exome Sequencing – Performance Characteristics
Define Proportion of Exome “Accurately Sequenced”
Co-Capture Component: Pseudogenes, Paralogs and Homologs
Difficult to Sequence Regions: Repetitive Elements
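One way to report the proportion of the exome that is "adequately covered", and conversely the proportion that is not, is to count the targeted bases whose read depth meets a chosen minimum. The sketch below assumes per-base depths over the capture targets are already available as a list of integers, and it uses a hypothetical minimum depth of 20 reads. The function name, the threshold, and the example depths are illustrative and do not come from the slides.

```python
from typing import Sequence, Tuple


def coverage_summary(depths: Sequence[int], min_depth: int = 20) -> Tuple[float, float]:
    """Return (fraction adequately covered, fraction not adequately covered)."""
    if not depths:
        raise ValueError("no target positions supplied")
    # A position counts as adequately covered when its depth reaches the minimum.
    adequate = sum(1 for depth in depths if depth >= min_depth)
    fraction = adequate / len(depths)
    return fraction, 1.0 - fraction


# Hypothetical per-base depths across ten targeted positions.
example_depths = [0, 5, 18, 20, 35, 60, 75, 22, 19, 41]
covered, not_covered = coverage_summary(example_depths)
print(f"adequately covered: {covered:.0%}, not adequately covered: {not_covered:.0%}")
```

Raising `min_depth` or sequencing less deeply shrinks the adequately covered fraction, which is why the reported proportion depends on both the capture technology and the sequencing depth.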
Mendelian Disorders – Working Hypothesis
Seeking “Rare” Variants in a Single Gene(s)
Needle(s) in the Haystack(s)
Bioinformatics
Annotated Variants
Prioritization by Heuristic Filtering
• Filter Out Common Variants: dbSNP/1000 genomes, Variant frequency
• Variant Binning: Missense; Nonsense/Frameshift/Splice Site/Indels
• Pathogenicity Prediction Filtering: SIFT/PolyPhen, GERP
• Cross Reference Databases: HGMD/OMIM/Locus Specific
Prioritization by Likelihood Prediction
• VAAST Algorithm
Pedigree Information: Linkage/SGS/IBD Intersects
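The heuristic filtering route above can be sketched as a short chain of tests applied to each annotated variant. The example assumes every variant is a dictionary with hypothetical keys (`gene`, `consequence`, `population_frequency`, `predicted_damaging`). The 1% frequency cutoff, the field names, and the toy records are assumptions made for illustration; they are not taken from the slides or from the output format of any particular annotation tool.

```python
from typing import Dict, List

# Consequence classes kept without further evidence (the truncating bin).
TRUNCATING = {"nonsense", "frameshift", "splice_site", "indel"}


def prioritize(variants: List[Dict], max_frequency: float = 0.01) -> List[Dict]:
    """Apply frequency, consequence, and prediction filters in sequence."""
    candidates = []
    for variant in variants:
        # Step 1: filter out common variants (frequency from a population database).
        if variant["population_frequency"] > max_frequency:
            continue
        # Step 2: bin by consequence; keep truncating classes directly.
        if variant["consequence"] in TRUNCATING:
            candidates.append(variant)
        # Step 3: keep missense changes only when a predictor flags them.
        elif variant["consequence"] == "missense" and variant["predicted_damaging"]:
            candidates.append(variant)
    return candidates


# Toy annotated variants; gene names and values are invented for illustration.
annotated = [
    {"gene": "GENE_A", "consequence": "missense", "population_frequency": 0.20, "predicted_damaging": True},
    {"gene": "GENE_B", "consequence": "nonsense", "population_frequency": 0.0, "predicted_damaging": False},
    {"gene": "GENE_C", "consequence": "missense", "population_frequency": 0.001, "predicted_damaging": False},
    {"gene": "GENE_D", "consequence": "missense", "population_frequency": 0.0005, "predicted_damaging": True},
    {"gene": "GENE_E", "consequence": "synonymous", "population_frequency": 0.0, "predicted_damaging": False},
]

print([v["gene"] for v in prioritize(annotated)])
```

In practice the surviving variants would then be cross-referenced against disease databases; that step is left out of the sketch because it depends on the database used.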
Candidate Genes/Potential Causative Variants
Genomic DNA
Library Preparation
Next Generation Sequencing Library
Hybridize to Exome Capture Probes
Exome Enriched Library
Next Generation Sequencing
Bioinformatics Analysis
Genome Sequencing
Genomic DNA
Library Preparation
Next Generation Sequencing Library
Next Generation Sequencing
Bioinformatics Analysis
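The two workflows differ mainly in the size of the region that must be covered. A rough estimate of the raw sequence needed is the target size multiplied by the desired mean depth, divided by the fraction of reads that land on target. In the sketch below, the ~30 Mb exome size comes from the Human Exome slide; the ~3,100 Mb genome size is approximate, and the depth targets (30x genome, 100x exome) and the 70% on-target fraction are assumptions chosen only to illustrate the arithmetic.

```python
def required_gigabases(target_megabases: float, mean_depth: float,
                       on_target_fraction: float = 1.0) -> float:
    """Estimate raw sequence (in Gb) needed to reach a mean depth over a target."""
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    # Megabases of sequence needed, inflated for off-target reads, converted to Gb.
    return target_megabases * mean_depth / on_target_fraction / 1000.0


# Assumed values, for illustration only.
genome_gb = required_gigabases(target_megabases=3100, mean_depth=30)
exome_gb = required_gigabases(target_megabases=30, mean_depth=100, on_target_fraction=0.7)

print(f"whole genome: ~{genome_gb:.0f} Gb of raw sequence")
print(f"whole exome:  ~{exome_gb:.1f} Gb of raw sequence")
```

With these assumed inputs the exome needs far less raw sequence than the genome even at a higher mean depth, which is the trade-off between cost and coverage that the comparison turns on.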
Exome Sequencing vs Genome Sequencing
Cost – Coverage – Complexity
Whole Genome Sequencing
Chr 10: g.43,615,633C>G in RET
Horizon
Continued Evolution of Sequencing and Bioinformatics
College of American Pathologists Checklist Requirements for Next Generation Sequencing
Professional Societies Guidelines for Clinical Next Generation Sequencing
Self Assessment Questions
• Describe Process Steps for NGS
• List NGS Platform Options and Capabilities
• Relate Spectrum of Clinical NGS Applications