limma: Linear Models for Microarray and RNA-Seq Data User's Guide

Gordon K. Smyth, Matthew Ritchie, Natalie Thorne, James Wettenhall, Wei Shi and Yifang Hu
Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia

First edition 2 December 2002
Last revised 14 July 2021

This free open-source software implements academic research by the authors and co-workers. If you use it, please support the project by citing the appropriate journal articles listed in Section 2.1.

Contents

1 Introduction
2 Preliminaries
    2.1 Citing limma
    2.2 Installation
    2.3 How to get help
3 Quick Start
    3.1 A brief introduction to R
    3.2 Sample limma Session
    3.3 Data Objects
4 Reading Microarray Data
    4.1 Scope of this Chapter
    4.2 Recommended Files
    4.3 The Targets Frame
    4.4 Reading Two-Color Intensity Data
    4.5 Reading Single-Channel Agilent Intensity Data
    4.6 Reading Illumina BeadChip Data
    4.7 Image-derived Spot Quality Weights
    4.8 Reading Probe Annotation
    4.9 Printer Layout
    4.10 The Spot Types File
5 Quality Assessment
6 Pre-Processing Two-Color Data
    6.1 Background Correction
    6.2 Within-Array Normalization
    6.3 Between-Array Normalization
    6.4 Using Objects from the marray Package
7 Filtering unexpressed probes
8 Linear Models Overview
    8.1 Introduction
    8.2 Single-Channel Designs
    8.3 Common Reference Designs
    8.4 Direct Two-Color Designs
9 Single-Channel Experimental Designs
    9.1 Introduction
    9.2 Two Groups
    9.3 Several Groups
    9.4 Additive Models and Blocking
        9.4.1 Paired Samples
        9.4.2 Blocking
    9.5 Interaction Models: 2 × 2 Factorial Designs
        9.5.1 Questions of Interest
        9.5.2 Analysing as for a Single Factor
        9.5.3 A Nested Interaction Formula
        9.5.4 Classic Interaction Models
    9.6 Time Course Experiments
        9.6.1 Replicated time points
        9.6.2 Many time points
    9.7 Multi-level Experiments
10 Two-Color Experiments with a Common Reference
    10.1 Introduction
    10.2 Two Groups
    10.3 Several Groups
11 Direct Two-Color Experimental Designs
    11.1 Introduction
    11.2 Simple Comparisons
        11.2.1 Replicate Arrays
        11.2.2 Dye Swaps
    11.3 A Correlation Approach to Technical Replication
12 Separate Channel Analysis of Two-Color Data
13 Statistics for Differential Expression
    13.1 Summary Top-Tables
    13.2 Fitted Model Objects
    13.3 Multiple Testing Across Contrasts
14 Array Quality Weights
    14.1 Introduction
    14.2 Example 1
    14.3 Example 2
    14.4 When to Use Array Weights
15 RNA-Seq Data
    15.1 Introduction
    15.2 Making a count matrix
    15.3 Normalization and filtering
    15.4 Differential expression: limma-trend
    15.5 Differential expression: voom
    15.6 Voom with sample quality weights
    15.7 Differential splicing
16 Two-Color Case Studies
    16.1 Swirl Zebrafish: A Single-Group Experiment
    16.2 Apoa1 Knockout Mice: A Two-Group Common-Reference Experiment
    16.3 Weaver Mutant Mice: A Composite 2x2 Factorial Experiment
        16.3.1 Background
        16.3.2 Sample Preparation and Hybridizations
        16.3.3 Data input
        16.3.4 Annotation
        16.3.5 Quality Assessment and Normalization
        16.3.6 Setting Up the Linear Model
        16.3.7 Probe Filtering and Array Quality Weights
        16.3.8 Differential expression
    16.4 Bob1 Mutant Mice: Arrays With Duplicate Spots
17 Single-Channel Case Studies
    17.1 Lrp Mutant E. Coli Strain with Affymetrix Arrays
        17.1.1 Background
        17.1.2 Downloading the data
        17.1.3 Background correction and normalization
        17.1.4 Gene annotation
        17.1.5 Differential expression
    17.2 Effect of Estrogen on Breast Cancer Tumor Cells: A 2x2 Factorial Experiment with Affymetrix Arrays
    17.3 Comparing Mammary Progenitor Cell Populations with Illumina BeadChips
        17.3.1 Introduction
        17.3.2 The target RNA samples
        17.3.3 The expression profiles
        17.3.4 How many probes are truly expressed?
        17.3.5 Normalization and filtering
        17.3.6 Within-patient correlations
        17.3.7 Differential expression between cell types
        17.3.8 Signature genes for luminal progenitor cells
    17.4 Time Course Effects of Corn Oil on Rat Thymus with Agilent 4x44K Arrays
        17.4.1 Introduction
        17.4.2 Data availability
        17.4.3 Reading the.