Sequencing Technology and Applications

Next Generation Sequencing technology and applications
Leonardo A. Meza-Zepeda, Ph.D.
Genomics Core Facility
Helse Sør-Øst/University of Oslo
oslo.genomics.no

Topics
  • Introduction
  • Technologies
  • Applications

Human Genome Project
  • 3 billion bases
  • Cost approx. 3 billion dollars
  • Public HGP and Celera Genomics
  • 1990-2003

Development of Sequencing Technologies
[Figure: timeline from the Human Genome Project to massively parallel sequencing. Stratton MR et al, Nature 2009]

Development of Sequencing Technologies last 10 years
[Figure: ER Mardis. Nature 470, 198-203 (2011)]

Cost per Megabase
[Figure]

Costs for a Human Genome
  • Capillary sequencing: Applied Biosystems 3730xl (2004), US$ 15,000,000
  • Next generation sequencing: HiSeq 2500 (today), US$ 6,000
The Norwegian Radium Hospital

Sequencing Costs per Genome
[Figure: cost per human genome (axis from $10k to $100M) and number of genomes sequenced (axis from 100 to 1M) against time, 2007-2012. Labelled points: Venter; Watson; African, Asian, cancer pair; 169 in GenBank; 1,000 individual genomes]

Sequencing Technologies
  • Solexa (Illumina): sequencing by synthesis
  • 454 (Roche): pyrosequencing
  • SOLiD (Life Technologies): sequencing by ligation
  • Non-optical: Ion Torrent/Ion Proton
  • Single molecule sequencing: Helicos, Pacific Biosciences, Oxford Nanopore

Common attributes of commercial sequencers
  • Random fragmentation of DNA
  • Ligation of adapters, creation of a library (genome/transcriptome)
  • Library amplification on a solid surface
      • Direct sequencing for single molecule platforms
  • Direct step-by-step detection of nucleotide incorporation
  • Shorter read length than traditional sequencers
  • Digital read type, enables direct quantitative comparisons
  • Possible to read both ends of a DNA fragment
      • Single read or paired-end reads

Single and Paired Ends Libraries
  • Single read library (up to 100 bp): Read 1 only. E.g. ChIP-Seq, miRNA-Seq
  • Short insert paired-end library (200 bp to 500 bp): Read 1 and Read 2. E.g. RNA-Seq, Genomic-Seq
  • Long insert mate pair library (2 kb to 10 kb): Read 1 and Read 2. E.g. Genomic-Seq
Paired-end reads offer advantages for sequencing larger and complex genomes: they allow more accurate positioning (mapping) of the reads than single reads.

Paired-end Sequencing and Alignment
[Figure]

Mate-pair Sequencing and Alignment
Combination of short-read and mate-pair sequence reads for de novo sequencing

Barcoding Libraries
Library preparation → Sequence pool → Separate sequences
Sequencing of insert and index

Sequencing Technologies

DNA Fragmentation
Adaptive Focused Acoustics, COVARIS
  • Acoustic energy wave that converges and focuses on a small, localized area
  • Shearing of DNA, RNA, chromatin, +++
  • Random fragmentation
  • Formats: plates or single sample

Library Construction
[Figure]

Nextera Library Construction, Illumina
[Figure]

Targeted Amplification
HaloPlex (Agilent)

Targeted Amplification
TruSeq Custom Amplicon (Illumina)

Targeted Amplification
AmpliSeq (Ion Torrent), 12 to 3072-plex

Targeted Amplification
Single Molecule Molecular Inversion Probes
O'Roak BJ, et al Science 2012
Hiatt JB, et al Genome Res 2013

Library Amplification, Emulsion PCR
Roche/454 and Ion Torrent/Proton
Metzker, Nature Reviews Genetics 2010

Roche 454, Pyrosequencing
Problems with homopolymers
Metzker, Nature Reviews Genetics 2010

Roche/454 Technology

Instrument              Run time (hr)   Read length (bp)   Yield (Mb/run)   Error type   Error rate (%)
GS FLX Titanium XL+     23              1000 (mean 700)    700              Indel        1
GS FLX Titanium XLR70   10              600 (mean 450)     450              Indel        1
GS Junior               10              400                35               Indel        1

  • Mate pair paired-end reads of 3 kb, 8 kb and 20 kb
  • Cost per run makes sequencing an entire human genome prohibitive
  • Great platform for targeted validation
  • Good for de novo sequencing in combination with short reads

SOLiD, Life Technologies
Beads are placed on the surface of the flow cell

SOLiD, Life Technologies
Tucker, Am J Hum Gen 2009
Metzker, Nature Reviews Genetics 2010

SOLiD Technology

Instrument   Run time (days)   Read length (bp)   Yield (Gb/run)   Error type   Error rate (%)
5500 W       2-8               1 x 50             80               A-T bias     0.01
                               1 x 75             120
                               2 x 50             160
5500xl W     2-8               1 x 50             160              A-T bias     0.01
                               1 x 75             240
                               2 x 50             320

  • 6-lane flow chip with independent lanes
  • Very high accuracy data due to two-base encoding
  • Conversion of color space to base space
  • Paired-end chemistry enabled
  • Wildfire chemistry being implemented (replaces ePCR)

Ion Torrent/Proton, Life Technologies
  • Library construction and emulsion PCR
  • Semiconductor chip
  • Problems with homopolymers

Different Chip Sizes
[Figure]

Instruments
Ion Proton to sequence one human genome per day for US$ 1000

Illumina Sample Preparation
1 Library preparation
  • Fragment DNA
  • Repair ends / add A overhang
  • Ligate adapters
  • Select ligated DNA
2 Automated cluster generation (1-8 samples)
  • Hybridize to flow cell
  • Extend hybridized oligos
  • Perform bridge amplification
3 Sequencing (1-16 samples)
  • Perform sequencing on forward strand
  • Re-generate reverse strand
  • Perform sequencing on reverse strand

Illumina Library Preparation
[Figure]

Nextera Library Construction, Illumina
  • Low DNA input: 50 ng
  • Fast library prep: 90 minutes
  • Automation friendly
  • Larger insert size
  • GC bias (insertion site)

Illumina Cluster Generation
[Figure: sequencing library (8 pmol) on a single molecule array, 3 billion clusters, scale bar 100 μm. Panels: Library Preparation; Cluster Growth (Amplification); Cluster Density]

Sequencing-by-Synthesis
  • Add 4 Fl-NTPs + polymerase
  • Incorporated Fl-NTP is imaged
  • Terminator and fluorescent dye are cleaved from the Fl-NTP
  • Repeat for 36 to 150 cycles
3 billion clusters x 2x100 bp = 600 gigabases per run
  • 100 exomes at 30x
  • 5 human genomes at 30x coverage

Illumina Instruments
2010: HiSeq 2000
  • Two flow cells per run, 100 Gbp/FC or one human genome
  • New scanning mechanics: scans both surfaces of FC lanes
2011: HiSeq 2000
  • Improved chemistry: increased yield and accuracy
  • Approx. 600 Gb, 5-6 human genomes
2011: MiSeq Personal Sequencer
  • One flow cell per run
  • 2x150 bp, approx. 4-5 Gb
  • Fast sequencing, 4-27 hrs per run
2012: MiSeq v.2 chemistry
  • Scans both surfaces of FC, double the capacity
  • 2x250 bp, approx. 8-10 Gb
2013: HiSeq 2500
  • One flow cell per run
  • RAPID mode, 27 hrs sequencing run
  • One human genome per flow cell
2013: MiSeq v.3 chemistry
  • Scans both surfaces of FC, double the capacity
  • 2x300 bp, approx. 15 Gb

Illumina

Instrument                     Run time   Read length (bp)   Yield (Gb/run)   Error type   Error rate (%)
HiSeq 2500, high output mode   2 days     1 x 36             108              Sub          0.1
                               5 days     2 x 50             300
                               11 days    2 x 100            600
HiSeq 2500, rapid mode         7 hr       1 x 36             22               Sub          0.1
                               27 hr      2 x 100            120
                               40 hr      2 x 150            360
MiSeq                          4 hr       1 x 50             1.3              Sub          0.1
                               24 hr      2 x 150            7.5
                               65 hr      2 x 300            15

New development: ordered array of clusters

Moleculo, Long-Read Sequencing
Voskoboynik A, et al eLife 2013

Single Molecule Sequencing
  • Helicos
  • Pacific Biosciences
  • Oxford Nanopore

Helicos
[Figure: example read-out, Top: CTAGTC, Bottom: CAGCTA. Metzker, Nature Reviews Genetics 2010]

Pacific Biosciences
  • Single Molecule Real Time (SMRT) sequencing technology
  • No amplification
  • Single pass read accuracy 85%
  • 150,000 zero-mode waveguides per SMRT cell
  • Approx. 3,000-5,000 bp sequence length
Metzker, Nature Reviews Genetics 2010

Nanopore Technology
A nanopore is, essentially, a nano-scale hole.
  • Biological: pore-forming protein in a membrane (lipid bilayer)
  • Solid-state: formed by synthetic materials (e.g. silicon nitride)
  • Hybrid: pore-forming protein set in a synthetic material
To come

Data analysis
Signal (fluorescent signal, pH change or conductivity) → Number → Base call → Alignment or de novo → Biological interpretation
Large IT infrastructure

NGS Applications

NGS applications
  • Genomes: re-sequencing or de novo
  • Point mutation/indel/structural variation discovery
  • Protein:DNA binding
      • Chromatin IP/histone binding
      • Nucleosome/transcription factor binding, etc.
  • ncRNA discovery/sequencing/variants
  • Transcriptome sequencing (RNA-seq)
  • Genome-wide methylation of DNA (Methyl-seq)
  • Clinical sequencing for therapeutic decisions

de novo Genome Sequencing
[Figure]

Resequencing
Meyerson et al, Nature Reviews Genetics 2010

ICGC
International network of cancer genome projects.
Description of genomic, transcriptomic and epigenomic changes
  • Data available to the entire research community
50 different tumour types and/or subtypes
  • Clinical and societal importance across the globe
  • Patient-matched control samples (500 of each)
  • ~US$ 25 million per project
Osteosarcomas (Myklebost/Meza-Zepeda)
  • Wellcome Trust Sanger Institute (Michael Stratton)
Similar US initiative: The Cancer Genome Atlas (TCGA)
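
The yield arithmetic on the Sequencing-by-Synthesis slide (3 billion clusters x 2x100 bp = 600 gigabases per run) can be checked with a few lines of Python. This is only a rough sketch: the function names are made up for illustration, the genome size is the 3 billion bases quoted on the Human Genome Project slide, and every sequenced base is assumed to be usable, which real runs do not achieve.

```python
# Back-of-the-envelope yield and coverage estimate for a short-read run.
# Function names are hypothetical; the numbers come from the slides.

HUMAN_GENOME_BASES = 3_000_000_000  # "3 billion bases" (Human Genome Project slide)


def run_yield_bases(clusters: int, read_length: int, reads_per_cluster: int = 2) -> int:
    """Total bases per run: clusters x read length x reads per cluster.

    reads_per_cluster is 2 for paired-end runs (2x100 bp) and 1 for single reads.
    """
    return clusters * read_length * reads_per_cluster


def genomes_per_run(yield_bases: int, genome_size: int, coverage: int) -> float:
    """Idealized number of genomes that fit in one run at the given coverage.

    Assumes every base is usable; filtering, duplicates and uneven
    coverage are ignored, so this is an upper bound.
    """
    return yield_bases / (genome_size * coverage)


if __name__ == "__main__":
    total = run_yield_bases(clusters=3_000_000_000, read_length=100, reads_per_cluster=2)
    print(f"Yield: {total / 1e9:.0f} Gb per run")
    print(f"Genomes at 30x (upper bound): {genomes_per_run(total, HUMAN_GENOME_BASES, 30):.1f}")
```

The idealized result is somewhat above the 5 human genomes at 30x quoted on the slide, which is expected if part of the raw yield is lost to quality filtering and unmapped or duplicate reads.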
