Sequencing Library Preparation
Total Page:16
File Type:pdf, Size:1020Kb
Sequencing Library Preparation Sarah Boswell http://scholar.harvard.edu/saboswell Genomics, Epigenomics & Transcriptomics Hasin et al. Multi-omics approaches to disease, Genome Biology 18:83 (2017). Genomics, Epigenomics & Transcriptomics DNA DNA & Histone RNA Modifications C. Petit. Evodevomics, BioSciences Master Reviews (2013). Basic Library Preparation Small piece of DNA or cDNA INSERT INSERT + Sequencing Adapter Library Seq Methods Sequencing Library Preparation Starting Material: DNA & RNA RNA-Seq library prep Genomic library prep Mate-pair sequencing (circularization) Low input / single cell library prep High Quality Starting Material DNA Extraction Treat with RNase RNA Extraction Treat with DNase Practice your extraction before the real experiment Genomic DNA Extraction Any of the available kits or protocols Bacteria or yeast often need additional bead-beating step. RNase treat all DNA samples. Quantitation Absorbance: Nano-drop (50-500 ng/ul) Theoretically should can read to 3000 ng/ul. Empirically find it is only accurate within range above. Dye based SYBR Green, Qubit, Quant-IT https://genomics.ed.ac.uk/resources/sample-requirements Genomic DNA QC Agarose Gel TapeStation Genomic DNA Assay https://genomics.ed.ac.uk/resources/sample-requirements http://www.agilent.com/cs/library/applications/5991-5258EN.pdf Chromatin Immunoprecipitation (ChIP) Crosslinking Cell Lysis Chromatin Preparation / Sheering Immunoprecipitation Crosslinking Reversal & DNA cleanup https://www.thermofisher.com/us/en/home/life-science/antibodies/antibodies-learning-center/antibodies-resource-library/antibody-application-notes/step-by-step-guide-successful-chip-assays.html High Quality Starting Material DNA Extraction Treat with RNase RNA Extraction Treat with DNase RNA: What is Transcriptomics? Wikipedia “The transcriptome is the set of all messenger RNA molecules in one cell or a population of cells.” National Cancer Institute Definition of Terms “The study of all RNA molecules in a cell. Transcriptomics is used to learn more about how genes are turned on in different types of cells and how this may help cause certain diseases, such as cancer.” RNA enrichment PolyA tailed messenger RNA: mRNA-Seq Total RNA (rRNA removed): “total” RNA-Seq Front Genet. 2015 Jan 26;6:2 mRNA (polyA) Purification mRNA enrichment mRNA binds beads coated with oligo dT primer Non-polyadenylated transcripts are washed away AAAA TTTTTT AAAA AAAA TTTTTT Transcripts Lost in polyA Purification Ribosomal/Transfer RNA Histone mRNA Long-noncoding RNA Nascent intron containing transcripts Micro RNA Degraded RNA Many viral transcripts Prokaryote/Bacterial transcripts polyA is the degradation signal rRNA Depletion Bead Rnase H mRNA/long noncoding RNA/nascent RNA Illumina: TruSeq rRNA rRNA Probes hybridize rRNA on magnetic beads RNA of interest remains in supernatant KAPA: RiboErase Probes hybridize rRNA in solution Hybrids are digested with RNase H Probes digested with DNAse I Purified RNA Modified from: Scientific Reports 6, article 37876 (2016) RNA Extraction Cultured cells – Easy! Use your favorite column kit – Qiagen, Invitrogen, Zymo For high throughput suggest bead based – MagMax 96 well format column plates are also available Tissue samples Dissect in cold room if possible Use RNAlater solution to store tissue sample Upon extraction proceed to homogenization in cold room RNA Extraction: Column Purification On-column DNase digestion Dry to remove excess ethanol Elute with warm water to increase yields On-Column DNase http://www.sabiosciences.com/pathwaymagazine/pcrhandbook/10.php RNA Extraction: Column Purification Standard columns/protocols have 100-200bp cut off. Will result in loss of small RNA species Specialized columns and kits are available microRNA FFPE Blood HACK - Cut off size can be adjusted by changing the percent ethanol used for sample binding RNA Extraction: Bead Purification Binding Wash DNase Re-bind Wash Elute Magnetic based purification good for high-throughput applications Can use oligo dT beads or total RNA beads DNase step: DNase I digestion step is sometimes skipped (polyA libraries only) Best practice is to keep this step. http://www.beckman.com/nucleic-acid-sample-prep/rna-isolation/isolation-from-tissue RNA Extraction: Tissues / Trizol Keep tissues as cold as possible Work in cold room After homogenization suggest column based cleanup Particularly important if used Trizol for lysis DNase 10ug then column cleanup https://www.thermofisher.com/order/catalog/product/AM7020 RNA Quantitation & Quality Quantitation Absorbance: Nano-drop (50-500 ng/ul) Theoretically should can read to 3000 ng/ul. Empirically find it is only accurate within range above. Dye based RiboGreen Qubit / Quant-IT Quality Visualize on gel Agilent Bioanalyzer (RIN) RNA quality High quality RNA needed for mRNA libraries Degraded samples should only be used to make a “total” RNA-seq library – rRNA removal FFPE & Archival Samples mRNA Purification of Degraded Samples AAAA AAAA TTTTTT AAAA AAAA TTTTTT Transcript 1 Transcript 2 PolyA tail no longer attached to transcript. Results in differential loss of transcripts between samples. Sequencing Library Preparation Starting Material: DNA & RNA RNA-Seq library prep Genomic library prep Mate-pair sequencing (circularization) Low input / single cell library prep RNASeq Stranded Library Prep (dUTP method) or mRNA purification or strand specific amplification Index http://www.rna-seqblog.com/wp-content/uploads/2012/12/library-preparation.jpg Library Strandedness ACCATGAACCGTA TGGTACTTGGCAT ACCAUGAACCGUA Read alignment depends on direction of transcription “sense” strand of transcript can be on either the sense or antisense strand of the DNA http://seqanswers.com/forums/showthread.php?t=44220 Library Strandedness https://galaxyproject.org/tutorials/rb_rnaseq/ Key steps in library preparation Starting Material Library amplification bias Multiplexing qPCR quantitation Sequencing read order & terminology Library Amplification Bias Final step of library prep is amplification transcript count 2, 13, 4 Introduces library bias Some products preferentially amplified Fewer cycles = less bias transcript count 12, 20, 8 Modified from: Nature Methods 9, 72-74 (2012) Limited Cycle Library Amplification Number of cycles needed is proportional to amount of input RNA. Library prep kits will recommend a certain number of cycles. This is usually optimized for the lower input. Test how many cycles will give you enough product. Fewer cycles = less bias Limited Cycle Library Amplification Perform micro qPCR reaction on small amount of pre- amplification library (Kapa) Amplify only the number cycles needed to get enough product for sequencing (20ul of 4nM product) https://www.kapabiosystems.com/document/kapa-library-quantification-illumina-tds/?dl=1 Limited Cycle Library Amplification 9 cycles | 10 cycles 11 cycles | 12 cycles PCR bubble Amplify 9 cycles Library QC Quantitation Peak around 150 = primer dimer Dye based SYBR Green Qubit / Quant-IT Size & Quality Agilent Bioanalyzer Size determination Do not use for quantitation Size selection with SPRI beads Solid Phase Reverse Immobilization beads Carboxyl groups on surface bind DNA in the presence of crowding agents (PEG & NaCl) http://core-genomics.blogspot.com/2012/04/how-do-spri-beads-work.html Key steps in library preparation Starting Material Library amplification bias Multiplexing qPCR quantitation Sequencing read order & terminology RNASeq Stranded Library Prep (dUTP method) or mRNA purification or strand specific amplification Index http://www.rna-seqblog.com/wp-content/uploads/2012/12/library-preparation.jpg Multiplexing (barcodes and indices) Multiplexing allows optimal use of reads you will get Charges for sequencing are usually per lane of the flow cell HiSeq generates ~150 million reads per lane NextSeq generates ~ 450 million reads (one lane instrument) For RNA-Seq number of reads you need will depend on your experiment 10 million standard for transcriptome 20 million standard for total RNA (rRNA depleted) Make sure multiplexing libraries of similar size 3’ 5’ DNA (0.1-5.0 μg) A T C G C T G T A C G C G A A T C G A A C T C G C G C A T Single molecule array C G A A 5’ Library Preparation Cluster Growth Sequencing 1 2 3 4 5 6 7 8 9 T G CT A C G A T … Image Acquisition Base Calling Consider Cluster Size in Multiplexing www.support.illumina.com Multiplexing Pool samples based on dye based quantitation Submit pool to core facility for sequencing. Make all sequencing libraries in one batch qPCR quantitate before sequencing Key steps in library preparation Starting Material Library amplification bias Multiplexing qPCR quantitation Sequencing read order & terminology Sequencing Read Order Rd1 Seq Primer Index 1 primer Barcode and/or UMI INDEX Index 2 Index 2 primer(A) primer(B) Rd2 Seq Primer 1. Read 1 HiSeq/MiSeq (4 color) 2. Index Read 1 (i7) • A&C read on one camera 3. Index Read 2 (i5) • G&T read on other 4. Read 2 NextSeq (2 color) Sequencing Library Preparation Starting Material: DNA & RNA RNA-Seq library prep Genomic library prep Mate-pair sequencing (circularization) Low input / single cell library prep Genomic Library Prep Once you have sheared your DNA this is a quick process Shear Genomic DNA or begin with cDNA Protocol same as for RNA-Seq once End Repair (blunt ends) you have sheared dsDNA Add 3’ A Tail Acoustic shearing – Covaris Ligate Adapters