TRAINER’S MANUAL
Introduction to Next Generation Sequencing Hands-on Workshop
Bioplatforms Australia (BPA)
The Commonwealth Scientific and Industrial Research Organisation (CSIRO)
Licensing
This work is licensed under a Creative Commons Attribution 3.0 Unported License. The text below is a summary of the main terms of the full Legal Code (the full licence), available at http://creativecommons.org/licenses/by/3.0/legalcode.
You are free:
• to copy, distribute, display, and perform the work
• to make derivative works
• to make commercial use of the work

Under the following conditions:
Attribution - You must give the original author credit.

With the understanding that:
Waiver - Any of the above conditions can be waived if you get permission from the copyright holder.
Public Domain - Where the work or any of its elements is in the public domain under applicable law, that status is in no way affected by the license.
Other Rights - In no way are any of the following rights affected by the license:
• Your fair dealing or fair use rights, or other applicable copyright exceptions and limitations;
• The author’s moral rights;
• Rights other persons may have either in the work itself or in how the work is used, such as publicity or privacy rights.
Notice - For any reuse or distribution, you must make clear to others the licence terms of this work.

Contents
Licensing

Contents

Workshop Information
    The Trainers
    Providing Feedback
    Document Structure
    Resources Used

Data Quality
    Key Learning Outcomes
    Resources You’ll be Using
    Useful Links
    Introduction
    Prepare the Environment
    Quality Visualisation
    Read Trimming

Read Alignment
    Key Learning Outcomes
    Resources You’ll be Using
    Useful Links
    Introduction
    Prepare the Environment
    Alignment
    Manipulate SAM output
    Visualize alignments in IGV
    Practice Makes Perfect!

ChIP-Seq
    Key Learning Outcomes
    Resources You’ll be Using
    Introduction
    Prepare the Environment
    Finding enriched areas using MACS
    Viewing results with the Ensembl genome browser
    Annotation: From peaks to biological interpretation
    Motif analysis
    Reference

RNA-Seq
    Key Learning Outcomes
    Resources You’ll be Using
    Introduction
    Prepare the Environment
    Alignment
    Isoform Expression and Transcriptome Assembly
    Differential Expression
    Visualising the CuffDiff expression analysis
    Functional Annotation of Differentially Expressed Genes
    Differential Gene Expression Analysis using edgeR
    References

de novo Genome Assembly
    Key Learning Outcomes
    Resources You’ll be Using
    Introduction
    Prepare the Environment
    Downloading and Compiling Velvet
    Assembling Single-end Reads
    Assembling Paired-end Reads
    Hybrid Assembly

Post-Workshop Information
    Access to Computational Resources
    Access to Workshop Documents
    Access to Workshop Data

Space for Personal Notes or Feedback
Workshop Information
The Trainers
Dr. Zhiliang Chen
Postdoctoral Research Associate
The University of New South Wales (UNSW), NSW

Dr. Susan Corley
Postdoctoral Research Associate
The University of New South Wales (UNSW), NSW

Dr. Nandan Deshpande
Postdoctoral Research Associate
The University of New South Wales (UNSW), NSW

Dr. Konsta Duesing
Research Team Leader