Methods in Molecular Biology 1979

Valentina Proserpio, Editor

Single Cell Methods: Sequencing and Proteomics

METHODS IN MOLECULAR BIOLOGY

Series Editor John M. Walker School of Life and Medical Sciences University of Hertfordshire Hatfield, Hertfordshire, AL10 9AB, UK

For further volumes: http://www.springer.com/series/7651

Single Cell Methods

Sequencing and Proteomics

Edited by Valentina Proserpio

Department of Life Sciences and System Biology, University of Turin, Italian Institute for Genomic Medicine, IIGM Turin, Torino, Italy

ISSN 1064-3745
ISSN 1940-6029 (electronic)
Methods in Molecular Biology
ISBN 978-1-4939-9239-3
ISBN 978-1-4939-9240-9 (eBook)
https://doi.org/10.1007/978-1-4939-9240-9

© Springer Science+Business Media, LLC, part of Springer Nature 2019 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Humana Press imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer Nature. The registered company address is: 233 Spring Street, New York, NY 10013, U.S.A.

Preface

In the less than 10 years since the first mRNA-Seq whole-transcriptome analysis in 2009, many new technologies and strategies have been developed to analyze the genome, transcriptome, and proteome of individual cells, scaling from a few to hundreds of thousands of cells analyzed at a time. Many new biological questions have opened up, and many laboratories across the world have adopted single-cell omics for their research, with a parallel massive increase in the number of publications on single cells. Keeping up with such a rapidly evolving technology is not an easy task, and to someone who enters the “single-cell field” for the first time, it might look like a maze, a jungle of choices and possibilities. The aim of this Methods in Molecular Biology (MIMB) book is to give readers a comprehensive overview of the available options for investigating biological questions at the level of individual cells and to help them decide which approach best suits each question. Written by outstanding scientists in the field, the book is organized into eight parts that span from organizing a single-cell lab to performing single-cell DNA-Seq, RNA-Seq, and proteomic experiments. The book also covers single-cell epigenetics, single-cell multi-omics analysis, screening, and live imaging of individual cells. Each chapter lists all the materials required for the experiment and describes every protocol in a detailed, step-by-step manner, with all the precautions that should be taken when working with individual cells. The authors wrote every procedure for experts as well as for readers with no prior knowledge, making each experiment simple to perform in every lab equipped with the listed instrumentation.
With very rich and detailed “Notes” sections, in which scientists included all the small tips and hints to best perform every protocol and to avoid common practical mistakes, I am confident that this book will represent a very powerful resource for any lab that will approach any experiment at the level of individual cells. I would like to thank Dr. Sarah Teichmann for introducing me to the “single-cell world,” Prof. John Walker for the opportunity to edit this book and for his constant guidance, and all the authors for their amazing job, their time, and their effort to make this book as perfect and as comprehensive as possible.

Torino, Italy Valentina Proserpio

Acknowledgment

Valentina Proserpio is supported by the Fondazione Umberto Veronesi.

Contents

Preface
Contributors

PART I LAB SETUP AND TISSUE PREPARATION

1 Setting Up a Single-Cell Genomic Laboratory
  Lira Mamanova
2 Tissue Handling and Dissociation for Single-Cell RNA-Seq
  Felipe A. Vieira Braga and Ricardo J. Miragaia

PART II SINGLE CELL TRANSCRIPTOMIC ANALYSIS

3 Full-Length Single-Cell RNA Sequencing with Smart-seq2
  Simone Picelli
4 CEL-Seq2—Single-Cell RNA Sequencing by Multiplexed Linear Amplification
  Itai Yanai and Tamar Hashimshony
5 Single-Cell RNA-Seq by Multiple Annealing and Tailing-Based Quantitative Single-Cell RNA-Seq (MATQ-Seq)
  Kuanwei Sheng and Chenghang Zong
6 Single-Cell RNA Sequencing with Drop-Seq
  Josephine Bageritz and Gianmarco Raddi
7 Chromium 10× Single-Cell 3′ mRNA Sequencing of Tumor-Infiltrating Lymphocytes
  Marco De Simone, Grazisa Rossetti, and Massimiliano Pagani
8 Seq-Well: A Sample-Efficient, Portable Picowell Platform for Massively Parallel Single-Cell RNA Sequencing
  Toby P. Aicher, Shaina Carroll, Gianmarco Raddi, Todd Gierahn, Marc H. Wadsworth II, Travis K. Hughes, Chris Love, and Alex K. Shalek
9 Single-Cell Tagged Reverse Transcription (STRT-Seq)
  Kedar Nath Natarajan
10 Single-Cell RNA-Sequencing of Peripheral Blood Mononuclear Cells with ddSEQ
  Shaheen Khan and Kelly A. Kaihara
11 High-Throughput Single-Cell Real-Time Quantitative PCR Analysis
  Liora Haim-Vilmovsky
12 Single-Cell Dosing and mRNA Sequencing of Suspension and Adherent Cells Using the Polaris™ System
  Chad D. Sanada and Aik T. Ooi
13 Targeted TCR Amplification from Single-Cell cDNA Libraries
  Shuqiang Li and Kenneth J. Livak

PART III SINGLE CELL GENOMIC AND EPIGENOMIC ANALYSIS

14 Sequencing the Genomes of Single Cells
  Veronica Gonzalez-Pena and Charles Gawad
15 Studying DNA Methylation in Single-Cell Format with scBS-seq
  Natalia Kunowska
16 Single-Cell 5fC Sequencing
  Chenxu Zhu, Yun Gao, Jinying Peng, Fuchou Tang, and Chengqi Yi
17 ChIPmentation for Low-Input Profiling of In Vivo Protein–DNA Interactions
  Natalia Kunowska and Xi Chen

PART IV SINGLE CELL PROTEOMIC ANALYSIS

18 Immunophenotyping of Human Peripheral Blood Mononuclear Cells by Mass Cytometry
  Susanne Heck, Cynthia Jane Bishop, and Richard Jonathan Ellis
19 Classification of the Immune Composition in the Tumor Infiltrate
  Davide Brusa and Jean-Luc Balligand

PART V SINGLE CELL MULTI OMIC ANALYSIS

20 Combined Genome and Transcriptome (G&T) Sequencing of Single Cells
  Iraad F. Bronner and Stephan Lorenz
21 Simultaneous Profiling of mRNA Transcriptome and DNA Methylome from a Single Cell
  Youjin Hu, Qin An, Ying Guo, Jiawei Zhong, Shuxin Fan, Pinhong Rao, Xialin Liu, Yizhi Liu, and Guoping Fan
22 Simultaneous Targeted Detection of Proteins and RNAs in Single Cells
  Aik T. Ooi and David W. Ruff

PART VI SINGLE CELL SCREENING

23 CRISPR Screening in Single Cells
  Johan Henriksson

PART VII SINGLE CELL LIVE IMAGING

24 Single-Cell Live Imaging
  Toru Hiratsuka and Naoki Komatsu

PART VIII SINGLE CELL DATA ANALYSIS

25 Differential Expression Analysis in Single-Cell Transcriptomics
  Luca Alessandrì, Maddalena Arigoni, and Raffaele Calogero
26 A Bioinformatic Toolkit for Single-Cell mRNA Analysis
  Kevin Baßler, Patrick Günther, Jonas Schulte-Schrepping, Matthias Becker, and Paweł Biernat

Index

Contributors

TOBY P. AICHER • Ragon Institute of MGH, Harvard, and MIT, Cambridge, MA, USA; Department of Chemistry, Institute for Medical Engineering and Sciences (IMES), MIT, Cambridge, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
LUCA ALESSANDRÌ • Department of Molecular Biotechnology and Health Sciences, University of Torino, Torino, Italy
QIN AN • Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
MADDALENA ARIGONI • Department of Molecular Biotechnology and Health Sciences, University of Torino, Torino, Italy
KEVIN BAßLER • Department for Genomics and Immunoregulation, Life and Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
JOSEPHINE BAGERITZ • Division Signaling and Functional Genomics, German Cancer Research Center (DKFZ), Heidelberg, Germany
JEAN-LUC BALLIGAND • Pole of Pharmacology and Therapeutics, Institute of Experimental and Clinical Research (IREC), Medical School, Université Catholique de Louvain (UCL), Brussels, Belgium
MATTHIAS BECKER • Department for Genomics and Immunoregulation, Life and Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany; Platform for Single Cell Genomics and Epigenomics, German Center for Neurodegenerative Diseases (DZNE), University of Bonn, Bonn, Germany
PAWEŁ BIERNAT • Department for Genomics and Immunoregulation, Life and Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
CYNTHIA JANE BISHOP • NIHR Biomedical Research Centre at Guy’s and St Thomas’ Hospital and King’s College London, London, UK
IRAAD F. BRONNER • Single Cell Genomics Core Facility, Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
DAVIDE BRUSA • Flow Cytometry Platform, Institute of Experimental and Clinical Research (IREC), Université Catholique de Louvain (UCL), Brussels, Belgium
RAFFAELE CALOGERO • Department of Molecular Biotechnology and Health Sciences, University of Torino, Torino, Italy
SHAINA CARROLL • Ragon Institute of MGH, Harvard, and MIT, Cambridge, MA, USA; Department of Chemistry, Institute for Medical Engineering and Sciences (IMES), MIT, Cambridge, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
XI CHEN • Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
MARCO DE SIMONE • Istituto Nazionale Genetica Molecolare INGM ‘Romeo ed Enrica Invernizzi’, Milan, Italy
RICHARD JONATHAN ELLIS • NIHR Biomedical Research Centre at Guy’s and St Thomas’ Hospital and King’s College London, London, UK
GUOPING FAN • Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
SHUXIN FAN • Zhongshan Ophthalmic Center, State Key Laboratory of Ophthalmology, Sun Yat-Sen University, Guangzhou, China


YUN GAO • Biodynamic Optical Imaging Center, Beijing Advanced Innovation Center for Genomics, School of Life Sciences, Peking University, Beijing, People’s Republic of China
CHARLES GAWAD • Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN, USA; Department of Computational Biology, St. Jude Children’s Research Hospital, Memphis, TN, USA
TODD GIERAHN • Koch Institute for Integrative Cancer Research, MIT, Cambridge, MA, USA
VERONICA GONZALEZ-PENA • Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN, USA; Department of Computational Biology, St. Jude Children’s Research Hospital, Memphis, TN, USA
PATRICK GÜNTHER • Department for Genomics and Immunoregulation, Life and Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
YING GUO • The Second Affiliated Hospital, Xiangya School of Medicine, Central South University, Changsha, China
LIORA HAIM-VILMOVSKY • EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK; Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
TAMAR HASHIMSHONY • Lokey Interdisciplinary Center for Life Sciences and Engineering, Technion–Israel Institute of Technology, Haifa, Israel
SUSANNE HECK • NIHR Biomedical Research Centre at Guy’s and St Thomas’ Hospital and King’s College London, London, UK
JOHAN HENRIKSSON • Molecular Infection Medicine Sweden, Umeå University, Umeå, Sweden
TORU HIRATSUKA • Centre for Stem Cells and Regenerative Medicine, King’s College London, Guy’s Hospital, London, UK
YOUJIN HU • Zhongshan Ophthalmic Center, State Key Laboratory of Ophthalmology, Sun Yat-Sen University, Guangzhou, China
TRAVIS K. HUGHES • Ragon Institute of MGH, Harvard, and MIT, Cambridge, MA, USA; Department of Chemistry, Institute for Medical Engineering and Sciences (IMES), MIT, Cambridge, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
KELLY A. KAIHARA • Digital Biology Center, Bio-Rad Laboratories, Pleasanton, CA, USA
SHAHEEN KHAN • Department of Immunology, University of Texas Southwestern Medical Center, Dallas, TX, USA
NAOKI KOMATSU • Laboratory for Cell Function Dynamics, RIKEN Center for Brain Science, Wako, Saitama, Japan
NATALIA KUNOWSKA • Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
SHUQIANG LI • Broad Institute of MIT and Harvard, Cambridge, MA, USA; Translational Immunogenomics Lab, Dana-Farber Cancer Institute, Boston, MA, USA
XIALIN LIU • Zhongshan Ophthalmic Center, State Key Laboratory of Ophthalmology, Sun Yat-Sen University, Guangzhou, China
YIZHI LIU • Zhongshan Ophthalmic Center, State Key Laboratory of Ophthalmology, Sun Yat-Sen University, Guangzhou, China
KENNETH J. LIVAK • Translational Immunogenomics Lab, Dana-Farber Cancer Institute, Boston, MA, USA
STEPHAN LORENZ • Single Cell Genomics Core Facility, Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK; Clinical Genomics Laboratory, Sidra Medicine, Doha, Qatar

CHRIS LOVE • Ragon Institute of MGH, Harvard, and MIT, Cambridge, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA; Koch Institute for Integrative Cancer Research, MIT, Cambridge, MA, USA; Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
LIRA MAMANOVA • Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
RICARDO J. MIRAGAIA • Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
KEDAR NATH NATARAJAN • Functional Genomics and Metabolism Unit, Danish Institute of Advanced Study (D-IAS), University of Southern Denmark, Odense, Denmark
AIK T. OOI • Fluidigm Corporation, South San Francisco, CA, USA
MASSIMILIANO PAGANI • Istituto Nazionale Genetica Molecolare INGM ‘Romeo ed Enrica Invernizzi’, Milan, Italy; Department of Medical Biotechnology and Translational Medicine, Università Degli Studi di Milano, Milan, Italy
JINYING PENG • State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, People’s Republic of China
SIMONE PICELLI • German Centre for Neurodegenerative Diseases (DZNE), Bonn, Germany
GIANMARCO RADDI • Wellcome Sanger Institute, University of Cambridge, Hinxton, UK; NIAID at National Institutes of Health, Bethesda, MD, USA; David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
PINHONG RAO • Zhongshan Ophthalmic Center, State Key Laboratory of Ophthalmology, Sun Yat-Sen University, Guangzhou, China
GRAZISA ROSSETTI • Istituto Nazionale Genetica Molecolare INGM ‘Romeo ed Enrica Invernizzi’, Milan, Italy
DAVID W. RUFF • Mission Bio, Inc., South San Francisco, CA, USA
CHAD D. SANADA • Fluidigm Corporation, South San Francisco, CA, USA
JONAS SCHULTE-SCHREPPING • Department for Genomics and Immunoregulation, Life and Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
ALEX K. SHALEK • Ragon Institute of MGH, Harvard, and MIT, Cambridge, MA, USA; Department of Chemistry, Institute for Medical Engineering and Sciences (IMES), MIT, Cambridge, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA; Koch Institute for Integrative Cancer Research, MIT, Cambridge, MA, USA
KUANWEI SHENG • Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
FUCHOU TANG • Biodynamic Optical Imaging Center, Beijing Advanced Innovation Center for Genomics, School of Life Sciences, Peking University, Beijing, People’s Republic of China; Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, People’s Republic of China; Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Peking University, Beijing, People’s Republic of China
FELIPE A. VIEIRA BRAGA • Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
MARC H. WADSWORTH II • Ragon Institute of MGH, Harvard, and MIT, Cambridge, MA, USA; Department of Chemistry, Institute for Medical Engineering and Sciences (IMES), MIT, Cambridge, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
ITAI YANAI • Institute for Computational Medicine, NYU School of Medicine, New York, NY, USA

CHENGQI YI • State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, People’s Republic of China; Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, People’s Republic of China; Department of Chemical Biology and Synthetic and Functional Biomolecules Center, College of Chemistry and Molecular Engineering, Peking University, Beijing, People’s Republic of China
JIAWEI ZHONG • Zhongshan Ophthalmic Center, State Key Laboratory of Ophthalmology, Sun Yat-Sen University, Guangzhou, China
CHENXU ZHU • State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, People’s Republic of China
CHENGHANG ZONG • Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA

Part I

Lab Setup and Tissue Preparation

Chapter 1

Setting Up a Single-Cell Genomic Laboratory

Lira Mamanova

Abstract

Transcriptomics has been revolutionized by massively high-throughput RNA-seq. The ongoing decrease in sequencing cost and the recent explosion of single-cell protocols have boosted demand for single-cell RNA sequencing projects. Although the single-cell RNA-seq (scRNA-seq) approach is close to conventional “bulk” RNA-seq, several features that are unique to scRNA-seq should be taken into consideration in order to obtain high-quality libraries and unbiased sequencing data. In this chapter I give recommendations for setting up a laboratory environment suitable for single-cell work.

Key words Single-cell RNA-seq (scRNA-seq), RNase-free, Contamination, Aliquots, Automation, Liquid handling, Musculoskeletal disorders (MSD)

1 Introduction

Despite the growing interest in single-cell RNA sequencing in the scientific community, only a few centers worldwide have the specialized skills and equipment to accommodate the demand for it. Single-cell RNA sequencing (scRNA-Seq) procedures are not standard and are quite challenging for individual laboratories to perform, and not many laboratories have open access to specialized facilities [1]. Because of technical limitations, most recent research has employed “population-level” techniques, which must be modified to meet the requirements of a single-cell laboratory in areas such as sample and reagent handling, experimental procedure, and QC parameters. One factor to take into consideration is that handling an individual cell is much more challenging than handling a pool of cells [2]. The minute amount of starting RNA from a single cell is prone to degradation and sample loss, and leads to elevated background noise in sequencing data. To avoid RNA degradation, samples should always be kept in an RNase-free environment, on ice or in cooler racks, during preparation and reaction setup.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_1, © Springer Science+Business Media, LLC, part of Springer Nature 2019

Fig. 1 PCR workstation/PCR hood

Since a very low amount of RNA is used, extra care should be taken to avoid sample and reagent contamination. Contamination can come from many sources, including untreated surfaces, pipettes, gloves, equipment, and reagents [3]. An efficient and reliable decontamination procedure should be applied before and after the experiment. RNA-related work should be carried out in a specially designated clean room or in a PCR workstation/PCR hood equipped with UV lights for sterilization (Fig. 1). In addition, RNase-free reagents, barrier tips, and ultrapure water should be used for scRNA-seq experiment setup [4].

Today’s single-cell studies are typically conducted with thousands of cells per experiment. Most standard laboratories employ manual liquid handling, which gives quite low throughput and can be error prone, technically variable, and time-consuming. Automation of scRNA-seq protocols on robotic platforms allows for parallel processing of individual cells at an unprecedented scale and facilitates high-throughput single-cell profiling [5]. Over the past few years, advances in benchtop automated workstations and dispensing instruments have increased the throughput, accuracy, and reproducibility of scRNA-seq methods (Figs. 2 and 3). Automation also significantly reduces sample reaction volume, reagent dead volume, tip consumption, and hands-on time, which together make large-scale single-cell studies cost-effective.

Fig. 2 Benchtop automated microplate handling workstation

Fig. 3 Benchtop liquid handling dispenser

Another important reason to adopt automation is the growing concern about musculoskeletal disorders (MSD), which are usually caused by continuous physical stress on specific body parts during repetitive tasks, including excessive pipetting, working at microscopes, and using cell counters [6]. If left untreated, these conditions can become chronic, resulting in significant pain and discomfort, which in some instances can be career-threatening. Liquid handling robots can prevent the development of such disorders by taking over highly repetitive protocol steps, and so serve as a practical ergonomic solution. Therefore, good ergonomic design principles should be considered and employed when planning the single-cell laboratory setup, to eliminate where practicable, or significantly reduce, the risk factors for MSDs.

Here we describe a single-cell RNA laboratory setup for both small laboratories and large institutions.

2 Materials

2.1 Experimental Procedure and QC

1. It is important to record batch or lot numbers of all reagents and plasticware used in an experiment. This information can be used during troubleshooting to exclude these components as the potential source of contamination if such a situation occurs.
2. Reagents supplied in a large volume, including RNase-free water, should be aliquoted using a standard procedure. This reduces the number of freeze–thaw cycles, significantly reduces contamination, and provides protocol continuity and consistency.
3. Another essential part of the experimental design is including template and reagent controls as a QC measure of the experiment. Nontemplate and nonreagent controls allow contamination to be monitored, and commercial RNA and DNA templates verify the viability of the biological material and the efficiency of the protocol.
4. Because of the low reaction volumes used in scRNA-seq protocols, single-cell samples are prone to evaporation, which can lead to technical failure of the experiment. To avoid this issue, suitable plasticware and plate seals should be chosen for each protocol.
5. Standard scRNA-seq protocols, and commercial kits in particular, are quite costly, especially for experiments involving hundreds of thousands of samples. The use of low-bind plates and vials and of low-retention tips can diminish sample loss and decrease waste of expensive reagents by keeping the master mix dead volume to a minimum.
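The lot-recording practice in item 1 can be illustrated with a minimal sketch. It assumes a simple in-memory record and uses hypothetical names (`ReagentLot`, `Experiment`, `suspect_lots`) and invented lot numbers; it is not part of the protocol, only one way such records might be queried during troubleshooting to narrow down a contamination source.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReagentLot:
    """One reagent or plasticware item and the lot number on its label."""
    name: str
    lot: str


@dataclass
class Experiment:
    """An experiment, the lots it used, and its nontemplate control result."""
    label: str
    lots: set = field(default_factory=set)
    ntc_contaminated: bool = False  # True if the nontemplate control showed product


def suspect_lots(experiments):
    """Return lots used in every contaminated run and in no clean run."""
    dirty = [e.lots for e in experiments if e.ntc_contaminated]
    clean = [e.lots for e in experiments if not e.ntc_contaminated]
    if not dirty:
        return set()
    shared = set.intersection(*dirty)  # lots common to all contaminated runs
    for lots in clean:
        shared -= lots  # a lot that also gave a clean run is less likely the source
    return shared


# Hypothetical records: lot numbers are invented for illustration.
water_a = ReagentLot("RNase-free water", "W-001")
water_b = ReagentLot("RNase-free water", "W-002")
plate = ReagentLot("low-bind 96-well plate", "P-117")

runs = [
    Experiment("run1", {water_a, plate}, ntc_contaminated=True),
    Experiment("run2", {water_b, plate}, ntc_contaminated=False),
    Experiment("run3", {water_a, plate}, ntc_contaminated=True),
]
print(suspect_lots(runs))
```

In this invented example only the first water lot remains a suspect, because the plate lot was also used in a clean run. A real laboratory would more likely keep such records in a LIMS or spreadsheet; the point is only that recorded lot numbers make this kind of exclusion possible.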

2.2 Workplace, Equipment, Reagents and Samples Handling

1. It is crucial to keep pre-PCR and post-PCR areas clearly divided, with the pre-PCR area mainly for RNA work. If laboratory space is limited, PCR workstations/PCR hoods can be used to prevent contamination with exogenous oligos, DNA, or RNA.
2. Due to the low volumes used in scRNA protocols, it is important to recognize the risk associated with out-of-calibration pipettes, the role of routine pipette checks, and good pipetting practice.
3. Put the necessary equipment, including a vortexer, mini centrifuge, pipetting devices, plasticware, and cooler racks, inside the hood.
4. Wipe the surface, equipment, plasticware, pipetting devices, and cooler racks with fresh 80% EtOH.
5. Wipe the surface, equipment, plasticware, pipetting devices, and cooler racks with widely available RNase-removing solutions or 5% bleach.
6. Irradiate the PCR hood with UV for 20–30 min.
7. Keep reagents and samples in the fridge/freezer during decontamination procedures.
8. Wipe reagent vials with an RNase-removing solution and place them in a rack in the hood to defrost. Keep enzymes/enzyme mixes in cooling racks.
9. Spin down thawed reagents, mix by pipetting or vortexing, and spin down again. Keep reagents in cooling racks from now on.
10. Keep biological material in the fridge or freezer until the PCR hood and reagent master mix are ready to use.
11. If you touch objects outside the sterile hood, immediately change your gloves or wipe them with 80% EtOH and RNase-removing solutions.
12. Owing to the very low reaction volumes, it is important to consider a thermal cycler with a “smart lid” in order to minimize sample evaporation. Samples should also be spun down and visually inspected after every manipulation.
13. In addition, to increase throughput it is beneficial to choose multiblock PCR instruments over single-block thermocyclers (Fig. 4).
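Before the reagent master mix mentioned in step 10 is prepared, its total volume has to cover all sample wells, the control wells, and the pipetting dead volume. The following is a minimal sketch of that arithmetic. The per-well volumes, the component names, and the 10% overage are illustrative assumptions, not values from this chapter, and the function name `master_mix_volumes` is hypothetical.

```python
def master_mix_volumes(per_well_ul, n_samples, n_controls, overage=0.10):
    """Scale per-well component volumes (in µL) to a whole master mix.

    per_well_ul: mapping of component name to volume per reaction in µL
    n_samples:   number of single-cell wells
    n_controls:  number of control wells (for example nontemplate controls)
    overage:     fractional excess to cover dead volume (assumed, not prescribed)
    """
    if n_samples < 0 or n_controls < 0 or overage < 0:
        raise ValueError("counts and overage must be non-negative")
    n_reactions = n_samples + n_controls
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in per_well_ul.items()}


# Hypothetical 96-well plate: 94 cells plus 2 control wells.
mix = master_mix_volumes(
    {"buffer": 2.0, "enzyme": 0.5},  # assumed per-well volumes in µL
    n_samples=94,
    n_controls=2,
    overage=0.10,
)
for component, volume in mix.items():
    print(f"{component}: {volume} µL")
```

Lowering the overage saves reagent but only works if low-retention tips and low-bind plasticware keep the dead volume correspondingly small, which is the trade-off described in Subheading 2.1.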

Fig. 4 Multiblock PCR instrument

References

1. Haque A, Engel J, Teichmann SA, Lönnberg T (2017) A practical guide to single-cell RNA-sequencing for biomedical research and clinical applications. Genome Med 9:75
2. Perkel JM (2017) Single-cell sequencing made simple. Nature 547:125–126
3. Champlot S, Berthelot C, Pruvost M, Bennett EA, Grange T, Geigl EM (2010) An efficient multistrategy DNA decontamination procedure of PCR reagents for hypersensitive PCR applications. PLoS One 5(9):1–15
4. Kroneis T (2015) Whole genome amplification. Methods Mol Biol 1347:43–55
5. Yuan J, Sims PA (2016) Automated microwell platform for large-scale single cell RNA-Seq. Nature 6:1–10
6. Haile EL, Taye B, Hussen F (2012) Ergonomic workstations and work-related musculoskeletal disorders in the clinical laboratory. Lab Medicine 43:11–12

Chapter 2

Tissue Handling and Dissociation for Single-Cell RNA-Seq

Felipe A. Vieira Braga and Ricardo J. Miragaia

Abstract

The starting material for all single-cell protocols is a cell suspension. The particular functions and spatial distribution of immune cells generally make them easy to isolate from the tissues where they dwell. Here we describe tissue dissociation protocols that have been used to obtain human immune cells from lymphoid and nonlymphoid tissues for use as input to single-cell methods. We highlight the main factors that can influence the final quality of single-cell data, namely the stress signatures that can bias its interpretation.

Key words Single-cell RNA sequencing, Tissue processing, Digestion, Single-cell suspension, Immune cells

1 Introduction

All single-cell protocols start with a suspension of cells. For most tissues, this means that the extracellular matrix that holds cells together first has to be processed to loosen this mesh and to induce the release of cells into suspension. Depending on the tissue in question and the cells of interest, different approaches and a variety of conditions can be used to get a cell suspension.

The functions of immune cells usually require them to move within and between tissues. Not surprisingly, they are quite resilient to being removed from a 3D tissue and to remaining in suspension, in contrast with structural cell types, such as epithelial and endothelial cells, which heavily depend on physical interactions with neighboring cells. Additionally, as immune cells do not constitute structural blocks of the tissues they reside in, they are generally easier to release. These two main features largely influence the dissociation protocols for immune cell isolation.

Dissociation of tissues for single-cell protocols can be achieved by mechanical means, such as simple mashing, dicing, or slicing. For example, for lymphoid organs such as the spleen and lymph nodes, most immune cell types can easily be isolated by mashing the tissue through a cell strainer. When the cells of interest are embedded in more densely packed tissues, such as the dermis or the colonic lamina propria, enzymes that target collagen and other structural molecules must be used. The choice of enzyme for each tissue will depend on the composition of the extracellular matrices in terms of molecules such as fibronectin, different types of collagen, and accutin. In many cases, dissociation protocols combine both approaches to decrease the overall processing time: a mechanical step breaks up the tissue, increasing the surface area, and is followed by an enzymatic treatment. The addition of cell selection steps that enrich or deplete certain cell types (e.g., dead cell removal, CD45 enrichment, flow cytometry) is common to most dissociation protocols.

Despite their overall resilience, it is important to be aware that immune cell types can be affected by the harsh conditions of tissue dissociation. In single-cell RNA sequencing data of other cell types, it has been shown that enzymatic digestion can induce expression changes in a set of immediate-early genes, which then create artificial subpopulations in biologically homogeneous cell populations [1–3]. We have seen that similar artifacts can be detected to some extent in immune cell types such as T cells. Decreasing the time and temperature of digestion as much as possible is recommended in order to minimize these unwanted effects. Minimizing all other potential sources of stress, such as temperature changes, mechanical stress, and overall processing time, is also recommended (see Note 1).

In this chapter we describe several human lymphoid and nonlymphoid tissue dissociation protocols that have been used to obtain single-cell suspensions that were then successfully used as input for single-cell protocols.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_2, © Springer Science+Business Media, LLC, part of Springer Nature 2019

2 Materials

All protocols require basic lab material such as a pipette aid, serological pipettes of different sizes, and pipettes. Prior to processing, tissues are kept in storage buffer, which usually consists of an organ preservation solution (e.g., University of Wisconsin (UW) solution, Hypothermosol).

2.1 Human Blood

1. 50 mL Falcon tubes.
2. Sterile phosphate buffered saline (PBS).
3. Ficoll-Paque.
4. Fetal bovine serum (FBS).
5. Centrifuge (that allows for brake regulation).
6. Washing medium: PBS containing 2% FBS.
7. All reagents for cell counting (Subheading 2.10), red blood cell (RBC) lysis (Subheading 2.6), and dead cell removal (Subheading 2.8).

2.2 Human Spleen

1. 100 μm cell strainers.
2. 10 cm petri dish.
3. 2 mL syringe.
4. 50 mL Falcon tubes.
5. Disposable scalpels.
6. Fetal calf serum (FCS).
7. Forceps.
8. Sterile phosphate buffered saline (PBS).
9. Red blood cell lysis solution.
10. Washing medium: PBS containing 2% FCS.
11. All reagents for cell counting (Subheading 2.9), RBC lysis (Subheading 2.6), and dead cell removal (Subheading 2.7).

2.3 Human Lymph Node

1. Collagenase D.
2. DNase I.
3. 50 mL Falcon tubes.
4. Sterile phosphate buffered saline (PBS).
5. 100 μm cell strainer.
6. FBS.
7. 2 mL syringe.
8. DMEM.
9. Scalpel.
10. Dissection forceps.
11. Incubator or water bath at 37 °C, ideally with shaker/rocker.
12. Washing medium: PBS containing 2% fetal bovine serum (FBS).
13. Digestion medium: DMEM + 1 mg/mL collagenase D + 0.1 mg/mL DNase I. Prepare 10 mL per lymph node.
14. Complete medium: DMEM + 10% FCS.
15. All reagents for cell counting (Subheading 2.9), RBC lysis (Subheading 2.6), and dead cell removal (Subheading 2.7).

2.4 Human Lung

1. 5 mL Eppendorf tube.
2. Collagenase D.
3. DNase I.
4. 50 mL Falcon tubes.
5. Sterile phosphate buffered saline (PBS).
6. DMEM.
7. 70 μm cell strainer.
8. FBS.
9. 2 mL syringe.
10. 5 mL Eppendorf tubes.
11. Scalpel.
12. Dissection forceps.
13. Metzenbaum scissors.
14. 6 cm petri dish.
15. 10 cm petri dish.
16. Incubator or water bath at 37 °C, ideally with shaker/rocker.
17. Complete medium: DMEM with 10% FBS.
18. Digestion medium: DMEM + 1 mg/mL collagenase D + 0.1 mg/mL DNase I. Prepare 20 mL per gram of tissue.
19. All reagents for cell counting (Subheading 2.9), RBC lysis (Subheading 2.6), dead cell removal (Subheading 2.7), and immune cell enrichment (Subheading 2.8).
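The lung digestion medium scales with tissue mass (20 mL per gram, at 1 mg/mL collagenase D and 0.1 mg/mL DNase I). As a minimal sketch, the amounts needed for a given sample could be worked out as follows; the function name and dictionary keys are illustrative choices and not part of the protocol.

```python
def digestion_medium(tissue_g, ml_per_g=20.0,
                     collagenase_mg_per_ml=1.0, dnase_mg_per_ml=0.1):
    """Return the medium volume (mL) and enzyme masses (mg) for a tissue mass.

    Defaults follow the lung recipe above: 20 mL per gram of tissue,
    1 mg/mL collagenase D and 0.1 mg/mL DNase I in DMEM.
    """
    volume_ml = tissue_g * ml_per_g
    return {
        "volume_ml": volume_ml,
        "collagenase_d_mg": volume_ml * collagenase_mg_per_ml,
        "dnase_i_mg": volume_ml * dnase_mg_per_ml,
    }


# Example: a 1.5 g lung sample (hypothetical mass).
recipe = digestion_medium(1.5)
for name, amount in recipe.items():
    print(f"{name}: {amount:.1f}")
```

For the lymph node recipe, which is prepared per node rather than per gram, the same function can be called with `tissue_g` set to the number of nodes and `ml_per_g=10.0`.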

2.5 Human Skin

1. 100 μm cell strainer.
2. 50 mL Falcon tubes.
3. Collagenase Type IV (Worthington Biochem): Reconstituted in sterile PBS at 160 mg/mL.
4. Dermatome with Pilling-Wecprep Blade.
5. Disposable scalpels.
6. DNase I (Roche): Reconstituted in sterile water at 10 mg/mL.
7. Human skin sample.
8. One pair of large forceps.
9. Cork, wooden, or Styrofoam square.
10. Petri dishes, 10 cm.
11. RPMI medium, containing 10% heat-inactivated FCS, stored at 4 °C.
12. Size 8 Goulian Guard.
13. Sterile phosphate buffered saline (PBS).
14. Two pins or needles.
15. Scalpels.
16. Incubator or water bath at 37 °C, ideally with shaker/rocker.
17. Digestion medium: RPMI medium, 10% heat-inactivated FCS, 1.6 mg/mL collagenase Type IV, 0.1 mg/mL DNase I.
18. Complete medium: RPMI, 10% heat-inactivated FCS. Store at 4 °C.
19. All reagents for cell counting (Subheading 2.9), RBC lysis (Subheading 2.6), dead cell removal (Subheading 2.7), and immune cell enrichment (Subheading 2.8).
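The skin digestion medium is made from concentrated enzyme stocks (collagenase Type IV at 160 mg/mL, DNase I at 10 mg/mL), so the stock volumes follow from the usual dilution relation C1·V1 = C2·V2. The sketch below assumes a hypothetical 10 mL batch; the helper name is illustrative.

```python
def stock_volume_ul(final_mg_per_ml, stock_mg_per_ml, final_volume_ml):
    """Volume of stock (in microliters) needed to reach the final concentration.

    Rearranged from C1 * V1 = C2 * V2, with V2 given in mL.
    """
    return final_mg_per_ml / stock_mg_per_ml * final_volume_ml * 1000.0


batch_ml = 10.0  # hypothetical batch size, not specified in the protocol
collagenase_ul = stock_volume_ul(1.6, 160.0, batch_ml)  # 1.6 mg/mL from 160 mg/mL
dnase_ul = stock_volume_ul(0.1, 10.0, batch_ml)         # 0.1 mg/mL from 10 mg/mL
print(f"Collagenase IV stock: {collagenase_ul:.0f} uL, DNase I stock: {dnase_ul:.0f} uL")
```

Both stocks happen to be 100-fold concentrated relative to the working medium, so each is added at 1/100 of the final volume.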

2.6 Red Blood Cell Lysis

1. Red blood cell lysis solution.
2. Sterile phosphate buffered saline (PBS).
3. Washing medium: PBS containing 2% fetal bovine serum (FBS).

2.7 Dead Cell Removal

1. 15 mL Falcon tubes.
2. CaCl2 (1 mM).
3. EasySep Dead Cell Removal Kit.
4. EasySep magnet. Depending on the absolute number of cells, different magnets should be used and volumes adapted accordingly. Here, we use “The Big Easy” magnet, which can be used to label up to 1 × 10^9 cells.
5. Fetal bovine serum (FBS).
6. Sterile phosphate buffered saline (PBS).
7. Resuspension medium: PBS containing 2% FBS and 1 mM CaCl2.

2.8 Immune Cell Enrichment

1. 15 mL Falcon tubes.
2. CD45 MicroBeads.
3. 30 μm cell filters.
4. MACS buffer: PBS pH 7.2, +0.5% BSA, +2 mM EDTA.
5. MACS columns. Depending on the absolute number of cells, MS (max of 2 × 10^8) or LS (max of 2 × 10^9) columns should be used and volumes adapted accordingly.
6. MACS magnet. MS and LS columns require different magnets.
7. MACS stand.

2.9 Cell Counting

1. 0.5 mL Eppendorf tubes.
2. Cell counting chamber or alternative method.
3. Trypan blue (see Note 2).

3 Methods

3.1 Blood

3.1.1 Tissue Collection and Preparation

1. To minimize changes in the transcriptome and in blood cell proportions, we recommend working with fresh blood. If that is not possible, blood stored at 4 °C shows fewer transcriptome changes than blood stored at room temperature.
2. If blood has been stored at 4 °C, allow it to reach room temperature for 15–30 min before starting the protocol.
3. Prepare a number of 50 mL Falcon tubes with 15 mL of Ficoll in each. You will need one Falcon tube per 15 mL of diluted blood.
4. If the final amount of blood after diluting with PBS (see below) is less than 15 mL, adjust the Ficoll amount to keep at least a 1:1 ratio of Ficoll to diluted blood.
5. If using buffy coats, prepare six tubes with 15 mL of Ficoll each.
6. Prepare approximately 500 mL of Washing medium per buffy coat.
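The tube planning above (one tube with 15 mL of Ficoll per 15 mL of diluted blood, and at least a 1:1 Ficoll-to-blood ratio for smaller volumes) can be sketched as a small helper. The function name is illustrative, the dilution factor is left as an input because it depends on the dilution actually applied, and the 1:1 volume returned for small samples is the minimum the protocol allows rather than a recommended value.

```python
import math


def plan_ficoll_tubes(blood_ml, dilution_factor, max_per_tube_ml=15.0):
    """Return (number of tubes, diluted blood per tube, Ficoll per tube) in mL."""
    diluted_ml = blood_ml * dilution_factor
    tubes = max(1, math.ceil(diluted_ml / max_per_tube_ml))
    blood_per_tube_ml = diluted_ml / tubes
    if diluted_ml >= max_per_tube_ml:
        ficoll_per_tube_ml = max_per_tube_ml
    else:
        # Small sample: keep at least a 1:1 ratio of Ficoll to diluted blood.
        ficoll_per_tube_ml = diluted_ml
    return tubes, blood_per_tube_ml, ficoll_per_tube_ml


# Example: 20 mL of blood diluted twofold (hypothetical values).
tubes, blood_each, ficoll_each = plan_ficoll_tubes(20.0, 2.0)
print(f"{tubes} tubes, {blood_each:.1f} mL diluted blood and {ficoll_each:.1f} mL Ficoll each")
```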

3.1.2 Neutrophils and Mononuclear Fraction Separation

1. Dilute blood 1 to 2.2× using fresh PBS.
2. If buffy coats are being used, dilute each buffy coat five times.
3. Carefully and very slowly pipet the blood onto the wall of the tube so that it forms a layer on top of the Ficoll without mixing the two fractions.
4. Centrifuge for 30 min at 700 × g at room temperature, with acceleration four and deceleration zero (the use of room temperature and no brake is essential for adequate gradient separation). This can take between 20 and 45 min total, depending on the centrifuge.
5. After centrifugation, you should have four layers. The uppermost layer will be formed mostly of serum and PBS. The small yellow ring formed between the upper layer and the clear Ficoll underneath contains your mononuclear cells. The bottommost layer contains a mixture of red blood cells and neutrophils.
6. If interested in the mononuclear cell fraction, collect the yellow ring between the uppermost layer and the clear Ficoll and transfer it to a new 50 mL tube. If interested in the neutrophils, discard the upper three fractions and use the red cell pellet for the next steps.
7. Add cold PBS to 50 mL.
8. Spin down at 360 × g, 10 min, 4 °C.
9. Discard supernatant very carefully, as the pellet is a bit loose at this step.
10. Resuspend pellet in 50 mL of Washing medium.
11. Spin down at 360 × g, 5 min, 4 °C.
12. Discard supernatant carefully.
13. Resuspend pellet in 50 mL of Washing medium.
14. Spin down at 360 × g, 5 min, 4 °C.
15. Repeat steps 12–14 two more times.
16. Discard supernatant and proceed to red blood cell lysis.

3.1.3 RBC Lysis

1. Add 5 mL of 1× red blood cell lysis solution to your mononuclear cell pellet. If working with neutrophils, add 10 mL of 1× red blood cell lysis solution.
2. Incubate for 5 min at room temperature.
3. Add fresh cold PBS up to 50 mL.
4. Spin down at 360 × g, 5 min, 4 °C.
5. Discard supernatant.
6. Resuspend pellet in 50 mL of Washing medium.
7. Spin down at 360 × g, 5 min, 4 °C.
8. Discard supernatant.
9. Resuspend in a small volume of Washing medium and count live and total cells.
10. Proceed to dead cell removal.

3.1.4 Dead Cell Removal

If the total number of cells is below 2.5 × 10^7, resuspend in the minimum volume and adapt all volumes to it. See Notes 2–5.

1. Centrifuge samples at 500 × g for 5 min.
2. Remove supernatant and resuspend in the appropriate volume of Resuspension medium (0.25–8 mL) to obtain a suspension with 1 × 10^8 cells/mL.
3. Transfer cell suspension to a 15 mL Falcon tube.
4. Add Dead Cell Removal (Annexin V) Cocktail to sample (50 μL per mL of sample).
5. Add Biotin Selection Cocktail to sample (50 μL per mL of sample).
6. Mix (up and down with pipette) and incubate for 3 min at room temperature.
7. Vortex RapidSpheres™ for 30 s. Particles should appear evenly dispersed.
8. Add RapidSpheres™ to sample (100 μL per mL of sample) and mix. Proceed immediately to the next step.
9. Add Resuspension medium to top up the sample to the indicated volume. Top up to 5 mL for samples ≤2 mL, and to 10 mL for samples >2 mL.
10. Mix by gently pipetting up and down 2–3 times.
11. Place the tube (without lid) into the magnet and incubate for 3 min at room temperature.
12. Pick up the magnet, and in one continuous motion invert the magnet and tube, pouring the enriched cell suspension into a new tube. Leave the magnet and tube inverted for 2–3 s, then return upright. Do not shake or blot off any drops that may remain hanging from the mouth of the tube.
13. Count cells and calculate viability.
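All reagent volumes in the dead cell removal steps are derived from the sample volume, which in turn follows from the cell count (1 × 10^8 cells/mL, within 0.25–8 mL). A minimal sketch of that arithmetic is given below; the function name, the dictionary keys, and the choice to raise an error above 8 mL are assumptions made for illustration.

```python
def dead_cell_removal_volumes(total_cells):
    """Derive sample and reagent volumes from the total cell count."""
    # Target 1e8 cells/mL, but never go below the 0.25 mL minimum volume.
    sample_ml = max(total_cells / 1e8, 0.25)
    if sample_ml > 8.0:
        raise ValueError("sample volume exceeds the 0.25-8 mL range given in the protocol")
    return {
        "sample_ml": sample_ml,
        "annexin_v_cocktail_ul": 50.0 * sample_ml,         # 50 uL per mL of sample
        "biotin_selection_cocktail_ul": 50.0 * sample_ml,  # 50 uL per mL of sample
        "rapidspheres_ul": 100.0 * sample_ml,              # 100 uL per mL of sample
        "top_up_to_ml": 5.0 if sample_ml <= 2.0 else 10.0,
    }


# Example: 2e8 cells in total (hypothetical count).
for name, value in dead_cell_removal_volumes(2e8).items():
    print(f"{name}: {value}")
```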

3.2 Human Spleen

3.2.1 Tissue Collection and Preparation

Upon collection, spleen samples should be kept in storage buffer, at 4 °C, up until processing.

3.2.2 Mechanical Dissociation

1. Remove spleen sample from storage buffer and place onto a 10 cm petri dish.
2. Add a little cold Washing medium to prevent the tissue from drying.
3. Slice the spleen sample into small pieces (approximately 1 mm^3) using forceps and scalpels.
4. Place a 100 μm cell strainer above a 50 mL Falcon tube and transfer the spleen pieces onto the cell strainer.
5. Wash the plate with 10 mL of Washing medium and pass through the cell strainer into the 50 mL Falcon tube.
6. Mash the spleen through the cell strainer using a 2 mL syringe plunger.
7. Wash the cell strainer with 10 mL of Washing medium.
8. Top up to 50 mL with Washing medium.
   (a) Spin down at 500 × g, 5 min, 4 °C.
   (b) Discard supernatant carefully.
   (c) Resuspend pellet in 50 mL of Washing medium.
   (d) Spin down at 500 × g, 5 min, 4 °C.
   (e) Discard supernatant carefully.
   (f) Proceed to red blood cell lysis.

3.2.3 RBC Lysis See Subheading 3.1.3.

3.2.4 Dead Cell Removal See Subheading 3.1.4.

3.3 Human Lymph Node

3.3.1 Tissue Collection and Preparation

Upon collection, lymph nodes should be kept in storage buffer, at 4 °C, up until processing.

3.3.2 Mechanical Dissociation and Enzymatic Digestion

1. Remove lymph nodes from storage buffer and place onto a 10 cm petri dish.
2. Add 10 mL of cold Washing medium.
3. Dissect out the connective tissue around the lymph node.
4. Place a 100 μm cell strainer above a 50 mL Falcon tube and transfer one lymph node onto the cell strainer.
5. Mash the lymph node through the 100 μm cell strainer above the 50 mL Falcon tube using a 2 mL syringe plunger, washing through with 10 mL of Digestion medium (see Note 1).
6. Incubate for 15 min at 37 °C, rotating.
7. Spin down at 360 × g, 10 min, 4 °C.
8. Discard supernatant carefully.
9. Resuspend pellet in 10 mL of Washing medium.
10. Spin down at 360 × g, 5 min, 4 °C.
11. Discard supernatant carefully.
12. Resuspend in 2 mL of Complete medium.

3.3.3 Dead Cell Removal See Subheading 3.1.4.

3.4 Human Lung

3.4.1 Tissue Collection and Preparation

Upon collection, lung samples should be kept in storage buffer, at 4 °C, up until processing.

3.4.2 Mechanical Dissociation and Enzymatic Digestion

1. Transfer the piece of tissue to a 10 cm petri dish and add enough Complete medium to cover it.
2. Using forceps and scalpel, cut the tissue into smaller parts of approximately 0.2 g each.
3. Transfer each 0.2 g piece to a 5 mL Eppendorf tube with 1 mL of Digestion medium.
4. Using Metzenbaum scissors, chop the piece inside the tube as finely as possible, until it looks almost like sand.
5. Transfer the minced tissue to a 6 cm petri dish.
6. Wash the Eppendorf tube with 1 mL of Digestion medium, transferring it to the 6 cm petri dish with the tissue.
7. Add approximately 2 mL of Digestion medium, or enough to completely cover the tissue and the whole surface of the petri dish.
8. Transfer it to an incubator at 37 °C for 1 h under slow shaking conditions. If automatic shaking is not available, mix the solution every 10 min.
9. Collect the sample and filter the cells through a 70 μm cell strainer into a 50 mL Falcon tube. Using the plunger of a syringe, repeatedly mash the tissue on the filter and rinse with cold Complete medium up to 25 mL.
10. Spin down at 360 × g, 10 min, 4 °C. Acceleration 4, brake 2.
11. Very carefully discard the supernatant.
12. Resuspend the cell pellet in 25 mL of Complete medium.
13. Spin down at 360 × g, 10 min, 4 °C. Acceleration 4, brake 2.
14. Very carefully discard the supernatant.
15. Repeat steps 12–14 one more time.
16. Proceed to red blood cell lysis.

3.4.3 RBC Lysis See Subheading 3.1.3.

3.4.4 Dead Cell Removal See Subheading 3.1.4.

3.4.5 Immune Cell Enrichment

To enrich the lung cell suspensions in immune cell types, we usually use MACS as described below. See Note 6. Volumes are recommended for a total of 10^7 cells or less, unless specifically mentioned. If above 10^7 cells, increase volumes proportionally.

1. To remove cell clumps and avoid clogging of the column, pass the cell suspension through a 30 μm strainer.
2. Centrifuge the cell suspension at 500 × g for 5 min and discard supernatant.
3. Resuspend the cell pellet in 80 μL of buffer.
4. Add 20 μL of CD45 MicroBeads.
5. Mix well and incubate for 15 min in the refrigerator (2–8 °C). Do not place on ice.
6. Wash cells by adding 1–2 mL of buffer. Centrifuge at 500 × g for 5 min and discard supernatant.
7. Resuspend up to 10^8 cells in 500 μL of buffer. (For higher cell numbers, scale up buffer volume accordingly.)
8. Place an MS or LS column on the MACS magnet and position a 15 mL Falcon tube below it.
9. Prepare the column by rinsing with buffer: MS: 500 μL; LS: 3 mL.
10. Load the 1–2 mL cell suspension onto the column and wait for the column reservoir to be empty.
11. Wash the column by adding the appropriate volume of buffer and repeat for a total of three times. Wait for the reservoir to be empty between washes. MS: 3 × 500 μL; LS: 3 × 3 mL.
12. Discard the flow-through, which contains unlabeled cells (mostly CD45−).
13. Remove the column from the MACS magnet and place it on a suitable collection tube.
14. Pipet the appropriate amount of buffer onto the column. MS: 1 mL; LS: 5 mL.
15. Flush out the magnetically labeled cells by firmly pushing the column plunger into the column.
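The labeling volumes above are stated for up to 10^7 cells and scale proportionally beyond that, while the column type depends on the total cell number (MS up to 2 × 10^8, LS up to 2 × 10^9, see Subheading 2.8). The following sketch collects that bookkeeping in one place; the function and key names are illustrative assumptions.

```python
# Per-column buffer volumes taken from steps 9, 11, and 14 above.
COLUMN_VOLUMES = {
    "MS": {"rinse_ul": 500, "wash_ul": 500, "washes": 3, "elute_ul": 1000},
    "LS": {"rinse_ul": 3000, "wash_ul": 3000, "washes": 3, "elute_ul": 5000},
}


def macs_cd45_plan(total_cells):
    """Scale CD45 labeling volumes and pick a column for a given cell count."""
    # Volumes are fixed up to 1e7 cells and scale proportionally above that.
    scale = max(1.0, total_cells / 1e7)
    if total_cells <= 2e8:
        column = "MS"
    elif total_cells <= 2e9:
        column = "LS"
    else:
        raise ValueError("cell number exceeds the capacity of a single LS column")
    plan = {
        "labeling_buffer_ul": 80.0 * scale,
        "cd45_microbeads_ul": 20.0 * scale,
        "column": column,
    }
    plan.update(COLUMN_VOLUMES[column])
    return plan


# Example: 3e7 cells (hypothetical count).
print(macs_cd45_plan(3e7))
```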

3.5 Human Skin Skin can be a challenging tissue to obtain good quality single-cell suspensions from due to the harsh digestion conditions required to dissociate it. This protocol, previously described by Gunawan et al. [4], can be used for this purpose.

3.5.1 Tissue Collection and Preparation

Upon collection, skin samples should be kept in storage buffer, at 4 °C, up until processing.

1. Place the skin on a petri dish with a little PBS to prevent it from drying.
2. Holding the skin with forceps, use a disposable scalpel to scrape off the subcutaneous fat. After removing it, move the skin to a clean petri dish, as fat can interfere with digestion later on.
3. If samples are large enough, cut the skin into 1.5 cm × 4 cm pieces using a disposable scalpel.
4. Pin one end of the skin strip onto a wooden, cork, or Styrofoam block, with the epidermis facing up.
5. Flatten and pull the skin tight with large forceps.
6. Cut the skin using a dermatome by gentle horizontal movement of the hand with slight downward traction.
7. Perform steps 4–6 for all the skin pieces available, collecting all the dermatome-cut skin strips (~300 μm thick) in a large petri dish filled with PBS with the epidermis side up.

3.5.2 Mechanical Dissociation

To decrease the time of the downstream enzymatic treatment to 4 h, scalpels can be used to cut the skin strips into smaller pieces prior to digestion.

3.5.3 Enzymatic Digestion

1. Float whole skin pieces in a petri dish, epidermis side facing upward, in Digestion medium for 12–16 h (or 4 h if mechanical dissociation has been undertaken) at 37 °C.
2. Pass the digest repeatedly through a 10 mL pipette until no visible material remains.
3. Place a 100 μm filter on top of a 50 mL Falcon tube and transfer the digest onto it.
4. Wash the petri dish with 25 mL of fresh Complete medium and pass it through the strainer into the same 50 mL Falcon tube.
5. Spin cells down at 500 × g for 5 min.

3.5.4 RBC Lysis See Subheading 3.1.3.

3.5.5 Dead Cell Removal See Subheading 3.1.4.

3.5.6 Immune Cell Enrichment See Subheading 3.4.5.

4 Notes

1. Recently published studies [1–3] have shown that enzymatic digestion of tissues can induce gene expression artifacts that affect scRNA-seq data interpretation. It has also been shown that these effects depend on the temperature and duration of the digestion. From our observations, different cell types have different susceptibilities to this stress. Therefore, it is good practice to decrease digestion time and temperature (e.g., using psychrophilic proteases) whenever possible. When different dissociation protocols are used, it is advisable to make them as similar as possible to achieve artifact-free comparisons. For instance, enzymatic digestion of tissues such as spleen or lymph nodes could be considered. The use of transcriptional inhibitors during the enzymatic treatment should also be considered for particularly sensitive tissues [2]. Nonetheless, awareness of such potential artifacts can be used at the data analysis stage to reduce their impact.
2. Accurate estimations of cell numbers are very important when loading cell suspensions into microfluidic devices for single-cell capture. Overloading such devices can induce clogging and an increased rate of captured cell doublets, whilst loading lower numbers of cells will lead to increased costs per cell. From our experience, manual counting is more accurate than commercially available automated systems. However, person-to-person variation is quite significant.
3. Cell suspensions with viability below ~70–80% should be depleted of dead cells. Higher percentages of dead cells can interfere with other steps in the process (e.g., dead cells may bind nonspecifically to MACS MicroBeads), will increase costs per cell, and might contribute to background expression noise in certain scRNA-seq methods. Such enrichment can be achieved using dead cell removal kits (MACS, EasySep) or FACS sorting (see also Notes 4 and 5).
4. If the absolute number of cells is low, consider the further loss of cells that dead cell removal and cell enrichment kits might cause.
5. The tissue dissociation process inevitably leads to cell death and to the release of nucleic acids into the cell suspension. Washing cells multiple times, especially just before loading onto microfluidic devices, will decrease the background noise that can otherwise interfere with data interpretation.
6. Dissociation of nonlymphoid tissue releases a high number and variety of nonimmune cell types (e.g., epithelial and endothelial cells). When focusing on immune cells, using an immune cell enrichment strategy is advisable, and several strategies can be adopted: mainly MACS and FACS, which rely on the pan-immune marker CD45, and density gradient separation, which relies on cell density. Here we use MACS purification, which yields a purer immune population than density gradients, while still circumventing the need for a cell sorter.
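The viability criterion in Note 3 can be made explicit with a short calculation from trypan blue counts, where unstained cells are scored as live and stained cells as dead. This is only a sketch: the function names are illustrative, and the default threshold of 0.8 is an assumed choice at the upper end of the ~70–80% range quoted above.

```python
def viability(live_count, dead_count):
    """Fraction of live cells among all counted cells."""
    total = live_count + dead_count
    if total == 0:
        raise ValueError("no cells counted")
    return live_count / total


def needs_dead_cell_removal(live_count, dead_count, threshold=0.8):
    """True when viability falls below the chosen threshold."""
    return viability(live_count, dead_count) < threshold


# Example counts from a counting chamber (hypothetical numbers).
live, dead = 130, 70
print(f"Viability: {viability(live, dead):.0%}")
print("Dead cell removal advised:", needs_dead_cell_removal(live, dead))
```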

References

1. van den Brink SC, Sage F, Vértesy Á et al (2017) Single-cell sequencing reveals dissociation-induced gene expression in tissue subpopulations. Nat Methods 14:935–936
2. Wu YE, Pan L, Zuo Y et al (2017) Detecting activated cell populations using single-cell RNA-seq. Neuron 96:313–329.e6
3. Adam M, Potter AS, Potter SS (2017) Psychrophilic proteases dramatically reduce single-cell RNA-seq artifacts: a molecular atlas of kidney development. Development 144:3625–3632
4. Gunawan M, Jardine L, Haniffa M (2016) Isolation of human skin dendritic cell subsets. Methods Mol Biol 1423:119–128

Part II

Single Cell Transcriptomic Analysis

Chapter 3

Full-Length Single-Cell RNA Sequencing with Smart-seq2

Simone Picelli

Abstract

In the last few years single-cell RNA sequencing (scRNA-seq) has enabled the investigation of cellular heterogeneity at the transcriptional level, the characterization of rare cell types as well as the detailed analysis of the stochastic nature of gene expression. A large number of methods have been developed, varying in their throughput, sensitivity, and scalability. A major distinction is whether they profile only the 5′- or 3′-terminal part of the transcripts or allow for the characterization of the entire length of the transcripts. Among the latter, Smart-seq2 is still considered the “gold standard” due to its sensitivity, precision, lower cost, scalability and for being easy to set up on automated platforms. In this chapter I describe how to efficiently generate sequencing-ready libraries, highlight common issues and pitfalls, and offer solutions for generating high-quality data.

Key words Smart-seq2, RNA-seq, Single cell, Full-length, In-house Tn5 transposase, Tagmentation, Nextera® XT kit, Automation, High-throughput

1 Introduction

The rapid technological development of the single-cell sequencing field in the past 10 years has enabled researchers to answer questions that could not be addressed by classic RNA-seq or microarray technologies. It is now well established that seemingly homogeneous cell populations in vivo and cell cultures in vitro can display considerable differences in gene expression, due both to stochastic processes at the transcriptional level (e.g., transcriptional bursts) and to extrinsic factors such as experimental conditions [1–3]. Among all the different applications developed over the years, single-cell RNA sequencing (scRNA-seq) is the one that has seen major improvements, both in terms of sensitivity and scalability. Some of the technologies, including the latest emulsion droplet methods like Drop-seq, inDrop, and the 10× Genomics technology [4–6], characterize only the 3′-end of the RNA transcripts, which is generally sufficient for the investigation of cellular heterogeneity and the identification of population substructures. Other methods such as Smart-seq2, SUPeR-seq, and MATQ-seq [7–9] have the ability to capture full-length transcripts and are therefore useful for the characterization of single nucleotide variants (SNVs), splice isoforms, and transcriptional start sites (TSSs), or for the detection of monoallelic and imprinted genes.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_3, © Springer Science+Business Media, LLC, part of Springer Nature 2019

Smart-seq2 relies on the SMART technology (Switching Mechanism at the 5′-end of the RNA Transcript) and exploits two intrinsic properties of the Moloney murine leukaemia virus reverse transcriptase (MMLV-RT): reverse transcription (RT) and template switching (TS) [10]. Template switching represents the ability of the MMLV-RT to introduce a few untemplated nucleotides, most commonly 2–5 cytosines, upon reaching the 5′-end of the RNA template during the RT reaction (Fig. 1). These extra nucleotides work as a docking site for a helper oligonucleotide (“template switching oligonucleotide,” TSO) carrying two riboguanosines and one locked nucleic acid (LNA) guanosine at its 3′-end. This special base configuration is crucial for a stable annealing between the TSO and the cytosine tail and is required for the MMLV-RT to “switch template” and synthesize a cDNA strand using the helper oligonucleotide as template. Thus, TS makes possible the introduction of a predefined sequence at the 3′-end of the cDNA transcript (the 5′-end of the mRNA template) which, notably, is the same as the one used at the 5′-end of the oligo-dT primer. This allows the amplification of the entire transcriptome in a single PCR reaction (“preamplification,” below and Fig. 1).

Fig. 1 Flowchart of the Smart-seq2 library preparation. Single cells are collected manually or by FACS and deposited in single tubes or 96-/384-well plates containing a mild hypotonic lysis buffer. The cells are lysed and the RNA is released. The RT reaction begins with the annealing of an oligo-dT primer (SMART dT30VN) carrying a known sequence at its 5′-end (orange bar). The mRNA is converted to cDNA by a reverse transcriptase capable of performing the template switching (TS) reaction upon reaching the opposite end of the template. TS enables the incorporation of 2–5 untemplated nucleotides, most commonly three cytosines. These nucleotides function as a docking site for an LNA-modified template switching oligonucleotide (LNA-TSO) carrying the same known sequence at the 5′-end as the oligo-dT primer, allowing the reverse transcriptase to “switch template” and make a complementary copy of the LNA-TSO. The result is a full-length cDNA-mRNA hybrid from each polyadenylated transcript originally present in the cells that can then be preamplified via a suppression PCR reaction using a single primer (ISPCR). The library preparation is carried out in a near-to-random Tagmentation reaction by a Tn5 Transposase, followed by a second (enrichment) PCR reaction that adds the P5 and P7 sequences required for binding to the Illumina flowcell as well as S5xx and N7xx indices for multiplexing purposes. The dual-indexed libraries can then be pooled and sequenced either in the single end (SE) or paired end (PE) mode

Smart-seq2 relies on the Illumina sequencing technology and, therefore, the full-length fragments generated after PCR need to be fragmented before being loaded on the flowcell. For this purpose, several methods can be used but the one that has gained popularity in the field is the tagmentation reaction, mainly due to its simplicity and ease of use [11].

“Tagmentation” is a neologism introduced by Illumina to describe the tagging and fragmentation of double-stranded DNA carried out in a single reaction by a prokaryotic Tn5 Transposase. Although the commercial kits available on the market (Nextera®, Nextera® XT, and Nextera® Flex, all from Illumina) generate high-quality data, they are expensive and therefore not suitable for research labs on a tight budget or for large-scale projects. A few years ago we developed a protocol for the production of a cheaper alternative to the commercial Tn5 Transposase, and optimized specific reaction conditions that make possible the tagmentation of very low DNA inputs [12].

Thus, the combination of such a cheaper, “in-house,” Tn5 Transposase with Smart-seq2 can be considered a complete protocol for the generation of high-quality scRNA-seq libraries from single cells [12, 13]. All the steps rely entirely on off-the-shelf reagents, bringing down the cost per cell to just a fraction of what it would be with any commercial kit.

There are, however, limitations that the next generations of the protocol should address. Smart-seq2 is an oligo dT-based method and therefore enables the analysis only of polyadenylated RNAs, neglecting important species such as micro-RNAs (miRNAs), piwi-interacting RNAs (piRNAs), and nonpolyadenylated long noncoding RNAs (lncRNAs), among others. Furthermore, Smart-seq2 does not retain the information about strand-specificity, thus making it impossible to uniquely assign reads mapping to overlapping genes transcribed from opposite strands. Lastly, sample pooling is performed only posttagmentation (see below), making Smart-seq2 more labor-intensive than tag- or emulsion droplet-based methods (both 3′- and 5′-based). However, this issue has been mitigated by the use of liquid handling robots and nanodispensers, as illustrated below.
Here I describe a medium- to high-throughput version of the Smart-seq2 protocol that combines the use of the NS-2 nanodispenser (BioNex, http://gcbiotech.com/product/nanodrop-ii/) for reagent dispensing and the Freedom Evo 200 (Tecan, https://lifesciences.tecan.com/products/liquid_handling_and_automation/freedom_evo_series) for cleanup, index adaptor distribution, and pooling steps. Other setups can be envisioned, such as the combination of Mosquito nanopipettors (TTP Labtech) and MicroLab STAR liquid handling robots (Hamilton Robotics). The use of any automated solution is strongly recommended in order to improve reproducibility, increase processing speed, minimize reagent waste and, ultimately, reduce personnel time and reagent costs.

However, sometimes researchers do not have access to or do not have experience with automated systems. For them, the following protocols can be implemented simply by increasing (e.g., doubling) the reaction volumes of most reactions and using regular pipettes or multichannel pipettes.

Since many laboratories rely on Nextera®, Nextera® XT, or Nextera® Flex kits, I will describe how to carry out the library preparation both with the Nextera® XT kit and with in-house Tn5 Transposase. The Nextera® and Nextera® Flex kits are suitable for larger inputs, are less relevant for single-cell applications, and will not be covered here. For details about the production of Tn5 Transposase the reader is referred to [12].

2 Materials

Prepare all solutions using RNase- and DNase-free water and analytical grade reagents. Prepare and store all reagents at the recommended temperatures (unless indicated otherwise). It is very important to work in sterile conditions and clean thoroughly when experiments are carried out on a normal lab bench and not in a dedicated clean room (which would be the ideal solution but is not always practical). Wipe all the working surfaces with 0.5% NaClO (sodium hypochlorite) followed by DEPC-treated water. Use separate pre- and post-PCR working areas in order to avoid contamination. Be especially careful when handling the index primers used in the final enrichment PCR step (see below). The use of a laminar flow hood equipped with UV light for sterilization is highly recommended for eliminating traces of nucleic acids from previous experiments.

All reagents except enzymes can be thawed at room temperature. All master mixes can be briefly vortexed after preparation. However, do not vortex the enzyme stock solutions, but rather mix them by gently inverting the tube.

2.1 Cell Lysis Mix

1. 0.4% Triton X-100 solution. Store at +4 °C (see Notes 1 and 2).
2. Premixed dNTP solution (25 mM each). Store at −20 °C.
3. Recombinant RNase Inhibitor (40 U/μL). Store at −20 °C.
4. SMART dT30VN Oligonucleotide (5′-Bio-AAGCAGTGGTATCAACGCAGAGTACT30VN-3′, 100 μM). “Bio” = Biotin. Store at −20 °C (see Note 3).
5. Optional: ERCC spike-ins, 1:40,000 dilution. Store at −80 °C (see Note 4).
6. Nuclease-free water.

2.2 Reverse Transcription (RT) Mix

1. Superscript™ II kit: First Strand Buffer (5×), Dithiothreitol (DTT, 100 mM), Superscript™ II Reverse Transcriptase (200 U/μL). Store at −20 °C (see Note 5).
2. Betaine (5 M solution). Store at +4 °C.
3. Magnesium chloride (MgCl2, 1 M solution). Store at +4 °C.
4. Recombinant RNase Inhibitor (40 U/μL). Store at −20 °C.
5. Nuclease-free water. Store at room temperature.
6. LNA-modified template-switching oligonucleotide, LNA-TSO (5′-Bio-AAGCAGTGGTATCAACGCAGAGTACrGrG+G-3′, 100 μM). “Bio” = Biotin. Store at −20 °C (days) or −80 °C (long-term, months).

2.3 Preamplification Mix

1. KAPA HiFi HotStart ReadyMix (2×). Store at −20 °C.
2. ISPCR Primer (5′-Bio-AAGCAGTGGTATCAACGCAGAGT-3′, 10 μM). “Bio” = Biotin. The primer can be left out without any adverse effect (see Note 6). Store at −20 °C.
3. Nuclease-free water. Store at room temperature.
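The single-primer preamplification works because the oligo-dT primer and the LNA-TSO carry the same sequence at their 5′-ends, which is also the sequence of the ISPCR primer. A quick check of the three oligonucleotides listed in Subheadings 2.1–2.3 illustrates this. The variable names are illustrative, the biotin is omitted, and only the unmodified DNA portion of the TSO is compared because the rGrG+G tail is not plain DNA.

```python
# Sequences as listed in Subheadings 2.1-2.3 (5' to 3', biotin omitted).
ISPCR = "AAGCAGTGGTATCAACGCAGAGT"
OLIGO_DT30VN = "AAGCAGTGGTATCAACGCAGAGTAC" + "T" * 30 + "VN"  # V and N are degenerate bases
TSO_DNA_PART = "AAGCAGTGGTATCAACGCAGAGTAC"  # followed by rGrG+G in the real oligo

# Both RT oligonucleotides start with the full ISPCR primer sequence, so
# first-strand cDNA ends up flanked by one common sequence and its complement.
shared_prefix = all(
    oligo.startswith(ISPCR) for oligo in (OLIGO_DT30VN, TSO_DNA_PART)
)

print("ISPCR length:", len(ISPCR))
print("Oligo-dT length:", len(OLIGO_DT30VN))
print("ISPCR sequence shared by both oligos:", shared_prefix)
```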

2.4 Magnetic Beads Cleanup

1. Sera-Mag SpeedBeads™ solution containing 19% w/v Polyethylene Glycol (PEG). To prepare 50 mL of Bead Solution: withdraw 1 mL of Sera-Mag SpeedBeads™ suspension (carboxyl magnetic beads, hydrophilic, 5% suspension) and transfer it into a 1.5 mL tube. Pellet the beads by placing the tube on a magnetic stand, wait until the solution is clear, and discard the supernatant. Add 1 mL of 10 mM Tris–HCl pH 8.0, 1 mM EDTA (TE buffer) and resuspend the beads off the magnet. Pellet the beads again, wait until the solution is clear, discard the supernatant, and repeat one more time. Pellet the beads once more, wait until the solution is clear, discard the supernatant, and resuspend off the magnet with 0.9 mL TE buffer. In a beaker mix 2.92 g NaCl, 500 μL Tris–HCl pH 8.0 (1 M), 100 μL EDTA (500 mM), and 9.5 g PEG (MW = 8000). Bring everything into solution by stirring and heating to 37 °C. Once the solution is clear, add 50 μL Tween-20 (10% solution; see Note 7) and 250 μL sodium azide (NaN3, 10% solution; see Note 8). Add the cleaned-up beads resuspended in 0.9 mL TE buffer and adjust the volume to 50 mL with nuclease-free water. Store at +4 °C. Do not freeze (see Note 9).
2. 80% v/v Ethanol. Store at room temperature (see Note 10).
3. Elution solution of your choice. Store at room temperature (see Note 11).
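As a sanity check on the bead solution recipe, the final concentrations in the 50 mL batch can be computed from the listed amounts. The sketch below assumes a molar mass of 58.44 g/mol for NaCl and ignores the small Tris/EDTA contribution of the 0.9 mL of TE buffer carried over with the beads; variable names are illustrative.

```python
FINAL_VOLUME_ML = 50.0
NACL_MOLAR_MASS = 58.44  # g/mol (assumed standard value, not given in the recipe)


def diluted(stock_conc, added_ul):
    """Final concentration after adding `added_ul` of stock to the 50 mL batch."""
    return stock_conc * (added_ul / 1000.0) / FINAL_VOLUME_ML


nacl_molar = 2.92 / NACL_MOLAR_MASS / (FINAL_VOLUME_ML / 1000.0)  # mol/L
peg_percent_wv = 9.5 / FINAL_VOLUME_ML * 100.0                    # g per 100 mL
tris_mm = diluted(1000.0, 500.0)       # 1 M stock expressed in mM
edta_mm = diluted(500.0, 100.0)        # 500 mM stock
tween_percent = diluted(10.0, 50.0)    # 10% stock
azide_percent = diluted(10.0, 250.0)   # 10% stock

print(f"NaCl {nacl_molar:.2f} M, PEG {peg_percent_wv:.0f}% w/v")
print(f"Tris {tris_mm:.0f} mM, EDTA {edta_mm:.0f} mM")
print(f"Tween-20 {tween_percent:.2f}%, NaN3 {azide_percent:.2f}%")
```

The PEG value reproduces the 19% w/v stated in the recipe, and the NaCl works out to approximately 1 M.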

2.5 Library Preparation with Commercial Illumina Reagents

1. Nextera® XT DNA Sample Preparation Kit. Store at −20 °C.
2. Nextera® XT Index Kit v2. Store at −20 °C (see Note 12).

2.6 Library Preparation with Home-made Tn5 Transposase

In-house Tn5 Transposase (12.5 μM), preloaded with the following oligonucleotides: Tn5MErev: 5′-[phos]CTGTCTCTTATACACATCT-3′. “Phos” = phosphate; this oligonucleotide should be annealed with either
Tn5ME-A: 5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-3′ or
Tn5ME-B: 5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3′.
Store at −20 °C (see Note 13).

1. 10× TAPS–MgCl2 Buffer: 100 mM TAPS–NaOH pH 8.5 at 25 °C, 50 mM MgCl2. Adjust the pH as indicated above. Filter with 0.22 μm filters and store at +4 °C.
2. 40% w/v Polyethylene Glycol MW = 8000. Store at +4 °C (see Note 14).
3. 0.2% SDS solution. Store at +4 °C (see Note 15).
4. KAPA HiFi non-HotStart kit, which includes KAPA HiFi DNA Polymerase (1 U/μL), KAPA HiFi High-Fidelity Buffer (5×), and dNTP mix (10 mM each). Store at −20 °C (see Note 16).
5. Nuclease-free water. Store at room temperature.
6. Nextera® XT Index Kit v2. Store at −20 °C (see Notes 12 and 17).

2.7 Pooling and Sequencing

1. Qubit™ dsDNA High Sensitivity Assay Kit and Qubit™ assay tubes. Store at room temperature.
2. Agilent High Sensitivity D1000 ScreenTape Assay (alternatively: Agilent High Sensitivity DNA Assay). Store at +4 °C.
3. Sera-Mag SpeedBeads™ working solution (for details about preparation: see above). Store at +4 °C.
4. 80% v/v ethanol. Store at room temperature.
5. Nuclease-free water. Store at room temperature.
6. NextSeq™ 550 High Output v2 kit (75 cycles), MiSeq™ Reagent Kits v3 (150 cycles), or HiSeq™ 2500 (High Output Run Mode or Rapid Run Mode), depending on sample throughput and desired read output. Other kits are also an option. Follow the manufacturer’s instructions regarding storage.

2.8 Instruments and Consumables

Depending on the throughput, the protocol can be carried out in single tubes or in 96- or 384-well plates. As an example, the instruments and consumables required for processing 96- and 384-well plates are listed below.

1. Fully skirted 96- or 384-well plates (see Note 18).
2. Aluminum foil and adhesive plastic foil (see Note 19).
3. Plate centrifuge (this setup only).
4. Thermocycler with 96- or 384-well block.
5. NS-2 nanodispenser (BioNex) (this setup only).
6. Freedom Evo 200 (Tecan) (this setup only).
7. Low Elution (LE) Magnet Plate (when using 96-well plates) or 384 Post Magnet Plate (when using 384-well plates) (Alpaqua).
8. BD FACSAria™ III or BD Influx™. Other Fluorescence-Activated Cell Sorters (FACS) may be used.
9. Agilent 4200 TapeStation system or Agilent 2100 Bioanalyzer.
10. NextSeq™ 550, MiSeq™, or HiSeq™ Sequencing Systems, depending on sample throughput and desired read output.

3 Methods

For simplicity, hereafter I will describe only the high-throughput protocol, which uses 96- or 384-well plates.

3.1 Cell Lysis

1. For each sample prepare 2.3 μL of the following Cell Lysis Mix: 1.15 μL Triton X-100, 0.40 μL dNTP Mix, 0.05 μL SMART dT30VN Oligonucleotide, 0.05 μL Recombinant RNase Inhibitor, and 0.65 μL nuclease-free water. If using ERCC, add 0.025 μL/reaction for large cells or 0.0025 μL/reaction for small cells (see Note 20). The final concentration of Triton X-100 is 0.2%.
2. Mix well and gently spin down.
3. Dispense the Cell Lysis Mix into the plate of your choice.
4. If used immediately, keep the plate on ice until needed. The Cell Lysis Mix can be prepared and dispensed on plates several weeks or months in advance. The plates are then stored at −20 °C until needed (see Note 21). Avoid multiple freeze–thaw cycles.
5. Collect the cells either by manual picking or, more commonly, by FACS. Manual picking is time-consuming and some effort is required to master it (see Note 22). Cell sorting by FACS yields the best results when a BD FACSAria™ III or a BD Influx™ is used. To make sure the stream is perfectly centered and the single cell will hit the center of the well, it is advised to check the instrument settings by using BD FACS™ Accudrop beads (see Note 23).
6. Immediately after sorting, seal the plate with aluminum foil and snap-freeze it by transferring it to −80 °C or by placing it on dry ice, especially if not proceeding immediately with RT (see Note 24). Lysed cells can be stored in these conditions for several months without appreciable decrease in RNA quality. The plate should never undergo freeze–thaw cycles for any reason.
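When preparing plates, the per-sample volumes in step 1 have to be scaled to the number of wells. The sketch below shows one way to do this. The 10% overage is an assumption for pipetting dead volume, not a value from the protocol, and the function name is illustrative.

```python
# Sketch: scale the per-sample Cell Lysis Mix (step 1) to a whole plate.
CELL_LYSIS_MIX_UL = {
    "Triton X-100": 1.15,
    "dNTP Mix": 0.40,
    "SMART dT30VN Oligonucleotide": 0.05,
    "Recombinant RNase Inhibitor": 0.05,
    "Nuclease-free water": 0.65,
}


def scale_master_mix(per_reaction_ul, n_reactions, overage=0.10):
    """Return total volumes (in uL) for n_reactions plus a fractional overage.

    overage=0.10 (10%) is an assumed allowance for dead volume.
    """
    if n_reactions <= 0 or overage < 0:
        raise ValueError("n_reactions must be positive and overage non-negative")
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 3) for name, vol in per_reaction_ul.items()}


if __name__ == "__main__":
    per_sample_total = sum(CELL_LYSIS_MIX_UL.values())
    print(f"Per-sample volume: {per_sample_total:.2f} uL")
    for name, vol in scale_master_mix(CELL_LYSIS_MIX_UL, 384).items():
        print(f"{name}: {vol} uL")
```

The same function can be reused for the RT and preamplification mixes by passing a different dictionary of per-sample volumes.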

3.2 Reverse Transcription (RT) Mix

1. For each sample prepare 2.7 μL of the following RT Mix: 1 μL First Strand Buffer, 0.25 μL DTT, 1 μL betaine, 0.06 μL MgCl2, 0.125 μL RNase Inhibitor, 0.125 μL Superscript™ II Reverse Transcriptase, 0.05 μL LNA-TSO, and 0.09 μL nuclease-free water.
2. Briefly vortex and gently spin down the mix. Keep it on ice until needed.
3. If cell sorting was carried out earlier, take the plate out of the −80 °C freezer and leave it for a couple of minutes at room temperature to thaw.
4. Briefly spin down to collect any drops that might have condensed on the lid.
5. Optional: Place the plate on a thermocycler block and perform cell lysis and mRNA denaturation by incubating the plate at 72 °C or 95 °C for 3 min (see Note 25).
6. If denaturation is performed: remove the plate from the thermocycler and keep it on wet ice for a couple of minutes to cool. Briefly spin down to collect the lysate at the bottom of each well.
7. Dispense 2.7 μL of RT Mix in each well. The final volume is now 5 μL.
8. Seal the plate with adhesive transparent film, vortex, spin down, place it in a thermocycler, and carry out the RT reaction: 42 °C for 90 min, 70 °C for 15 min, 4 °C hold. Once the reaction is completed proceed to the next step. Alternatively, run the RT overnight but continue with the preamplification reaction the following morning (see Note 26).
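The final concentrations contributed by the RT Mix in the 5 μL reaction follow from the stock concentrations listed in Subheading 2.2. The sketch below is an illustrative calculation only. It accounts for what the RT Mix itself adds and ignores anything already present in the Cell Lysis Mix or inside the First Strand Buffer.

```python
# Sketch: final concentrations contributed by the RT Mix in the 5 uL reaction.
RT_REACTION_UL = 5.0

# name -> (stock concentration, volume added per reaction in uL)
RT_COMPONENTS = {
    "First Strand Buffer (x)": (5.0, 1.0),
    "DTT (mM)": (100.0, 0.25),
    "Betaine (M)": (5.0, 1.0),
    "MgCl2 (mM)": (1000.0, 0.06),
    "LNA-TSO (uM)": (100.0, 0.05),
}

# name -> (units per uL, volume added per reaction in uL)
RT_ENZYMES = {
    "Superscript II (U)": (200.0, 0.125),
    "RNase Inhibitor (U)": (40.0, 0.125),
}


def final_concentrations(components, reaction_ul=RT_REACTION_UL):
    """Apply C1*V1 = C2*V2 to each component."""
    return {name: stock * vol / reaction_ul for name, (stock, vol) in components.items()}


def units_per_reaction(enzymes):
    """Total enzyme units added to each well."""
    return {name: per_ul * vol for name, (per_ul, vol) in enzymes.items()}


if __name__ == "__main__":
    for name, value in final_concentrations(RT_COMPONENTS).items():
        print(f"{name}: {value:g}")
    for name, value in units_per_reaction(RT_ENZYMES).items():
        print(f"{name}: {value:g}")
```

With the listed stocks this gives 1× buffer, 5 mM DTT, 1 M betaine, 12 mM added MgCl2, 1 μM LNA-TSO, 25 U of reverse transcriptase, and 5 U of RNase inhibitor per well.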

3.3 Preamplification Reaction

1. Remove the KAPA HiFi HotStart ReadyMix and ISPCR primer from the −20 °C storage and keep them at room temperature.
2. For each sample prepare 7.5 μL of preamplification mix: 6.25 μL KAPA HiFi HotStart ReadyMix, 0.15 μL ISPCR primer (optional, see Note 6), 1.1 μL nuclease-free water.
3. Briefly vortex and gently spin down. The preamplification mix is stable for hours at room temperature and there is no need to keep it on ice (see Note 27).
4. Once the RT is completed, spin down the plate and keep it at room temperature if the preamplification mix is not ready yet.
5. Dispense 7.5 μL of preamplification mix in each well of the plate. The final volume is now 12.5 μL.
6. Seal the plate and place it in a thermocycler, starting the following PCR program: 98 °C for 3 min, then N cycles of (98 °C for 20 s, 67 °C for 20 s, 72 °C for 6 min), 4 °C hold. The number of cycles “N” should be adjusted according to the RNA content of the specific cells that you are working with. Use 1–2 cycles more if unsure (see Note 28). It is safe to stop here and store the PCR product in a −20 °C freezer until needed.
7. Spin down the tubes/plate once the reaction is completed.
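Because each cycle includes a 6 min extension, the preamplification run time depends strongly on N. The sketch below estimates the programmed time from the step durations in step 6. It ignores ramping between temperatures and the final hold, so real runs take somewhat longer.

```python
# Sketch: programmed duration of the preamplification PCR (ramp times ignored).
def preamp_minutes(n_cycles, initial_s=180, denature_s=20, anneal_s=20, extend_s=360):
    """Return the programmed run time in minutes for n_cycles.

    Defaults follow step 6: 98 C for 3 min, then cycles of 20 s, 20 s, and 6 min.
    """
    if n_cycles < 0:
        raise ValueError("n_cycles must be non-negative")
    total_s = initial_s + n_cycles * (denature_s + anneal_s + extend_s)
    return total_s / 60.0


if __name__ == "__main__":
    for n in (18, 20, 22, 24):
        print(f"N = {n}: about {preamp_minutes(n):.0f} min plus ramping")
```

For example, 18 cycles correspond to about 123 min of programmed time, which is useful when planning whether the bead cleanup can be done the same day.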

3.4 Magnetic Beads Cleanup After Preamplification

1. Remove the Sera-Mag SpeedBeads™ aliquot from the +4 °C storage and equilibrate it at room temperature for 15 min.
2. Add 10 μL of Sera-Mag SpeedBeads™ solution (0.8×) and mix well by pipetting up and down at least 20 times or by vortexing.
3. Incubate off the magnetic stand for 5 min at room temperature.
4. Place the plate on the magnetic stand and leave it there for 5 min or until the solution appears clear.
5. Carefully remove the supernatant without disturbing the beads.
6. Optional: add 200 μL of 80% v/v ethanol, pipetting the liquid from the opposite side of the beads, and incubate for 30 s without removing the tube/plate from the magnetic stand. Performing a second ethanol wash is not necessary (see Note 10).
7. If an ethanol wash is performed: remove any trace of ethanol and let the bead pellet dry for 3–4 min or until small cracks appear. Do not seal the plate or remove it from the magnetic stand during this time.
8. Remove the plate from the magnetic stand, add 15 μL of nuclease-free water, and mix well by pipetting or vortexing to resuspend the beads.
9. Incubate for 2 min off the magnetic stand.
10. Place the plate back on the magnetic stand and incubate for 2 min or until the solution appears clear.
11. Carefully remove 14 μL of the supernatant, trying to minimize bead carryover, and transfer it to a new plate. It is safe to stop here and store the cDNA in a −20 °C freezer until needed (see Note 29).
12. Check the cDNA quality on the Agilent Bioanalyzer or TapeStation instruments. Follow the instructions in the High Sensitivity DNA chip or High Sensitivity D5000 ScreenTape assay user manuals. A good library is characterized by a low proportion of fragments <400 bp, absence of residual primers (ca. 100 bp), and an average cDNA size of 1.5–2.0 kb.
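The 0.8× figure in step 2 is the ratio of bead solution to sample volume (10 μL of beads for the 12.5 μL preamplification reaction). If the reaction volume has been scaled, the bead volume should be scaled with it. A minimal sketch, with illustrative function names:

```python
# Sketch: bead volume for a given sample volume and bead-to-sample ratio.
def bead_volume_ul(sample_ul, ratio=0.8):
    """Volume of bead solution (uL) that gives the requested bead:sample ratio."""
    if sample_ul <= 0 or ratio <= 0:
        raise ValueError("sample_ul and ratio must be positive")
    return sample_ul * ratio


def bead_ratio(bead_ul, sample_ul):
    """Bead:sample ratio for the volumes actually used."""
    if sample_ul <= 0:
        raise ValueError("sample_ul must be positive")
    return bead_ul / sample_ul


if __name__ == "__main__":
    print(f"{bead_volume_ul(12.5):.1f} uL beads for a 12.5 uL reaction at 0.8x")
    print(f"ratio used in step 2: {bead_ratio(10.0, 12.5):.2f}x")
```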

3.5 Preparation of the Preamplified cDNA for the Tagmentation Reaction

The preamplified cDNA is often too concentrated and needs to be diluted before carrying out the tagmentation. In the protocol presented here I use 0.5 μL of cleaned-up cDNA from the preamplification reaction, containing about 50–150 pg. A higher cDNA input leads to incomplete tagmentation (especially when using the Nextera® XT kit), and thus to libraries with a longer average size that will cluster suboptimally on the flowcell. For low-throughput projects, cDNA from all cells can be measured on the TapeStation or Bioanalyzer instruments. For high-throughput projects this might not be feasible. In any case, I recommend always checking the cDNA size distribution on the Agilent Bioanalyzer or Agilent TapeStation and not relying exclusively on fluorometric assays, which measure only the concentration of a sample and cannot assess its quality.
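Since 0.5 μL of cDNA should carry about 50–150 pg, the cDNA should be at roughly 0.1–0.3 ng/μL before tagmentation. The sketch below computes a dilution from a measured concentration. The 100 pg midpoint target and the 2 μL aliquot are assumptions chosen for illustration.

```python
# Sketch: dilute cleaned-up cDNA so that 0.5 uL contains the target input.
TAGMENTATION_INPUT_UL = 0.5


def target_concentration_ng_per_ul(target_pg=100.0):
    """Concentration at which 0.5 uL contains target_pg of cDNA."""
    return target_pg / 1000.0 / TAGMENTATION_INPUT_UL


def dilution_plan(measured_ng_per_ul, cdna_ul=2.0, target_pg=100.0):
    """Return (dilution factor, uL of water to add to cdna_ul of cDNA).

    Samples already at or below the target concentration are not diluted.
    """
    if measured_ng_per_ul < 0 or cdna_ul <= 0:
        raise ValueError("invalid concentration or volume")
    target = target_concentration_ng_per_ul(target_pg)
    factor = max(1.0, measured_ng_per_ul / target)
    water_ul = cdna_ul * (factor - 1.0)
    return factor, water_ul


if __name__ == "__main__":
    factor, water = dilution_plan(1.0)
    print(f"Dilute {factor:.1f}-fold: add {water:.1f} uL water to 2 uL cDNA")
```

A sample measured at 1 ng/μL would therefore be diluted fivefold to reach 0.2 ng/μL, that is, 100 pg in 0.5 μL.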

3.6 Library Preparation

3.6.1 Library Preparation with the Nextera® XT Kit

1. Remove the Amplicon Tagment Mix (ATM) and the Nextera® PCR Mix (NPM) from the −20 °C storage and place them on ice. Remove the Tagment DNA Buffer (TD) and the diluted and premixed Primer Index Plate from the −20 °C storage and keep them at room temperature. Remove the Neutralization Buffer (NT) from the +4 °C storage and keep it at room temperature.
2. For each sample prepare 1.5 μL of Tagmentation Mix: 0.5 μL of ATM and 1 μL of TD.
3. Briefly vortex and gently spin down the mix. Keep it on ice until needed.
4. Using the liquid handling robot or a multichannel pipette, transfer 0.5 μL of diluted cDNA into a new empty plate.
5. Add 1.5 μL of Tagmentation Mix to each well.
6. Seal the plate with adhesive transparent film, vortex, spin down, place it in a thermocycler, and carry out the tagmentation reaction: 55 °C for 8 min, 4 °C hold. Once the reaction is completed proceed immediately to the next step.
7. Add 0.5 μL of NT Buffer to each well.
8. Seal the plate with adhesive transparent film, vortex, spin down, and proceed to the next step (see Note 30). Do not put the plate back on ice.
9. Add 1 μL of prediluted N7xx + S5xx Index Adaptors (see Note 12).
10. Add 1.5 μL of NPM solution to each well.
11. Seal the plate with adhesive transparent film, vortex, spin down, place it in a thermocycler, and carry out the Enrichment PCR Reaction: 72 °C for 3 min, 95 °C for 30 s, then N cycles of (95 °C for 10 s, 55 °C for 30 s, 72 °C for 30 s), 72 °C for 5 min, 4 °C hold. The number of cycles “N” should be adjusted according to the amount of cDNA used for the tagmentation reaction. When starting from 50–150 pg input cDNA, 14–16 cycles are sufficient. It is safe to stop here and store the final library in a −20 °C freezer until needed.
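Step 9 relies on a Primer Index Plate in which every well holds a unique N7xx + S5xx pair (see Note 12). One simple layout for a 384-well plate is to assign one of 16 S5xx adaptors per row and one of 24 N7xx adaptors per column. The sketch below generates such a map. The adaptor names (`N7_01`, `S5_01`, and so on) are placeholders, not the actual Illumina index names.

```python
# Sketch: unique N7xx + S5xx combination per well of a 384-well index plate.
import string


def index_plate_layout(n_rows=16, n_cols=24):
    """Map well name (e.g. 'A1') to a (N7, S5) pair.

    Rows carry S5 adaptors and columns carry N7 adaptors; names are placeholders.
    """
    if not (1 <= n_rows <= 26) or n_cols < 1:
        raise ValueError("unsupported plate geometry")
    layout = {}
    for r, row in enumerate(string.ascii_uppercase[:n_rows], start=1):
        for col in range(1, n_cols + 1):
            layout[f"{row}{col}"] = (f"N7_{col:02d}", f"S5_{r:02d}")
    return layout


if __name__ == "__main__":
    layout = index_plate_layout()
    print(len(layout), "wells,", len(set(layout.values())), "unique index pairs")
    print("A1 ->", layout["A1"], " P24 ->", layout["P24"])
```

Calling `index_plate_layout(8, 12)` gives the corresponding map for a 96-well plate.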

3.6.2 Library Preparation with In-House Tn5 Transposase

1. Remove the in-house Tn5 Transposase and the KAPA HiFi Polymerase from the −20 °C storage and place them on ice. Remove the dNTP Mix, the KAPA HiFi High-Fidelity Buffer, and the diluted and premixed Primer Index Plate from the −20 °C storage and keep them at room temperature. Remove the 0.2% SDS solution, TAPS–MgCl2 Buffer, and 40% PEG solution from the +4 °C storage and keep them at room temperature.
2. For each sample prepare 1.5 μL of Tagmentation Mix: 0.2 μL of TAPS–MgCl2 Buffer, 0.5 μL of PEG solution, 0.01–0.1 μL of in-house Tn5 Transposase, and nuclease-free water to volume. The amount of in-house Tn5 transposase varies according to its activity, which is often batch-related.
3. Briefly vortex and gently spin down the Tagmentation Mix. Keep it on ice until needed.
4. Using a liquid handling robot, transfer 0.5 μL of diluted cDNA into a new empty plate.
5. Add 1.5 μL of Tagmentation Mix to each well.
6. Seal the plate with adhesive transparent film, vortex, spin down, place it in a thermocycler, and carry out the tagmentation reaction: 55 °C for 8 min, 4 °C hold. Once the reaction is completed proceed immediately to the next step.
7. Add 0.5 μL of 0.2% SDS solution to each well.
8. Seal the plate with adhesive transparent film, vortex, spin down, and proceed to the next step. Do not put the plate back on ice.
9. Add 1 μL of diluted and premixed S5xx + N7xx primers from the Primer Index Plate.
10. Add 1.5 μL of the following Enrichment PCR Mix: 0.1 μL KAPA HiFi DNA polymerase, 0.15 μL dNTP mix, 1 μL High-Fidelity Buffer, and nuclease-free water to volume.
11. Seal the plate with adhesive transparent film, vortex, spin down, place it in a thermocycler, and carry out the Enrichment PCR reaction: 72 °C for 3 min, 95 °C for 30 s, then N cycles of (95 °C for 10 s, 55 °C for 30 s, 72 °C for 30 s), 72 °C for 5 min, 4 °C hold. The number of cycles “N” should be adjusted according to the amount of cDNA used for the tagmentation reaction. When starting from 50–150 pg input cDNA, 14–16 cycles are sufficient. It is safe to stop here and store the final library in a −20 °C freezer until needed.
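Because the Tn5 volume in step 2 is batch-dependent, the water volume has to be recalculated for each batch. The sketch below computes it and also checks the final buffer and PEG concentrations in the 2 μL tagmentation reaction (0.5 μL cDNA plus 1.5 μL mix). Function names are illustrative.

```python
# Sketch: water "to volume" and final concentrations for the in-house Tn5 mix.
TAGMENTATION_MIX_UL = 1.5
TAGMENTATION_REACTION_UL = 2.0   # 0.5 uL cDNA + 1.5 uL Tagmentation Mix


def tagmentation_mix(tn5_ul):
    """Per-sample volumes (uL) for the mix in step 2, given the Tn5 volume."""
    if not 0.01 <= tn5_ul <= 0.1:
        raise ValueError("the protocol uses 0.01-0.1 uL of in-house Tn5 per sample")
    mix = {
        "TAPS-MgCl2 buffer (10x)": 0.2,
        "PEG 8000 (40% w/v)": 0.5,
        "in-house Tn5": tn5_ul,
    }
    mix["nuclease-free water"] = round(TAGMENTATION_MIX_UL - sum(mix.values()), 3)
    return mix


def final_concentration(stock, added_ul, total_ul=TAGMENTATION_REACTION_UL):
    """C1*V1 = C2*V2 for a single component."""
    return stock * added_ul / total_ul


if __name__ == "__main__":
    print(tagmentation_mix(0.05))
    print("buffer:", final_concentration(10.0, 0.2), "x")
    print("PEG:", final_concentration(40.0, 0.5), "% w/v")
```

The resulting 1× buffer and 10% PEG agree with the 8–10% final PEG concentration recommended in Note 14.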

3.7 Pooling and Bead Cleanup of the Final Library

1. Remove the Sera-Mag SpeedBeads™ aliquot from the +4 °C storage and equilibrate it at room temperature for 15 min.
2. Pool the entire content of each well in a 1.5 mL LoBind tube (see Note 31). Vortex to mix the sample and briefly spin down.
3. Use an aliquot for the final bead cleanup. Add the same volume of Sera-Mag SpeedBeads™, vortex thoroughly, and incubate for 5 min off the magnetic stand.
4. Place the tube on the magnetic stand and leave it there for 5 min or until the solution appears clear.
5. Carefully remove the liquid, paying attention not to disturb the beads.
6. Add 1 mL of 80% v/v ethanol, pipetting the liquid from the opposite side of the beads, and incubate for 1 min without removing the tube from the magnet.
7. Remove any trace of ethanol and let the bead pellet dry for at least 5 min (tube always on the magnetic stand and with the lid open!) or until small cracks start to become visible. It might take up to 10 min.
8. Remove the tube from the magnetic stand, add 200 μL of nuclease-free water, and mix well to resuspend the beads.
9. Incubate for 2 min off the magnetic stand.
10. Place the tube back on the magnetic stand and leave it there for 2 min or until the solution appears clear.
11. Carefully remove the supernatant, trying to minimize bead carryover, and place it in a new 1.5 mL LoBind tube. This is the final pool that will be used for sequencing.

3.8 Preparation of the Final Library for Sequencing

Use 1 μL to assess the concentration of the final library on a Qubit instrument. Use 1 μL (when using the High Sensitivity DNA chip and the Agilent Bioanalyzer) or 2 μL (when using the High Sensitivity D1000 ScreenTape and the Agilent TapeStation) to assess the average size of the final library.
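The two measurements above (mass concentration and average fragment size) are usually combined into a molar concentration before loading the sequencer. The chapter does not give the formula. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA, which is an assumption introduced here.

```python
# Sketch: convert Qubit concentration and average size into library molarity.
def library_nanomolar(conc_ng_per_ul, mean_size_bp, g_per_mol_per_bp=660.0):
    """Approximate molarity (nM) of a dsDNA library.

    Assumes an average mass of 660 g/mol per base pair (common approximation).
    """
    if conc_ng_per_ul < 0 or mean_size_bp <= 0:
        raise ValueError("invalid concentration or fragment size")
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_size_bp)


if __name__ == "__main__":
    print(f"{library_nanomolar(2.0, 500):.2f} nM")
```

For instance, a pool measured at 2 ng/μL with an average size of 500 bp corresponds to about 6.1 nM under this approximation.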

4 Notes

1. The concentration of Triton X-100 can be increased without a negative impact on the enzymes used in the RT reaction. We did not observe any appreciable difference in performance when using other nonionic surfactants such as Tween 20, Igepal® CA-630, NP-40, or similar. On the other hand, the use of anionic detergents such as SDS leads to complete inactivation of reverse transcriptases, even at concentrations much lower than 0.2%.
2. A mild hypotonic buffer containing Triton X-100 is sufficient for cell membrane disruption and release of cytoplasmic RNA but is generally not strong enough to lyse the nuclear membrane. Therefore, if the recovery of nuclear RNA is also a goal of the experiment, other buffers should be used. We found that chaotropic agents such as guanidine hydrochloride (GuHCl) and guanidine thiocyanate (GuHSCN) fit this purpose very well. Acting as protein denaturants, they unfold the tertiary structure of all proteins, including ribonucleases, thus making the addition of RNase inhibitors to the lysis buffer superfluous. In our settings, we find that a final concentration of 40–60 mM of GuHCl or GuHSCN in 2.3 μL of Cell Lysis Mix does not interfere with the downstream RT reaction (see also US 2011/0136180 A1).
3. All primers should be biotinylated and, preferably, HPLC-purified. In the RT reaction the biotinylation prevents secondary strand-switch events and the associated creation of artifacts such as concatamers [14, 15].
4. A recent paper showed that the use of External RNA Controls Consortium (ERCC) spike-ins has several drawbacks and is not, in general, the best solution for data normalization [16]. Unfortunately, there is no viable alternative at the moment. The use of Spike-In RNA Variants (SIRV, Lexogen) is actually an even poorer choice, since the mix spans only 4 abundance levels and is therefore not suitable for sensitivity analysis [16]. If using the ERCC, the amount added to each well needs to be determined empirically. As a rule of thumb, for large cells (cell lines or large primary cells such as cardiomyocytes or some neurons, for example) we use a concentration in the final RT volume of 1 in 4 million; for smaller cells (most primary cells) we lower this concentration to 1 in 40 million. The ideal target is to have 1–5% of the total number of reads coming from the ERCC, in order to be able to normalize across several logs without “wasting” too much sequencing power and, therefore, money.
5. Other reverse transcriptases have been successfully tested by us and others. We did not observe any significant difference between Superscript™ II, used in the standard Smart-seq2 protocol, and Superscript™ IV or Maxima RT H-. Superscript™ IV has a cost comparable to Superscript™ II but shortens the RT reaction to as little as 15 min. The reader could swap Superscript™ II with either of the two enzymes mentioned above without any impact on data quality. Conversely, Superscript™ III has a negligible TS activity and should not be used in this context. Therefore, always perform some initial tests to evaluate the TS activity of any new reverse transcriptase.
6. The ISPCR primer is generally not needed for the preamplification reaction. Omitting it has the additional benefit of decreasing the risk of having leftover primers after the bead cleanup. Primers and primer dimers have a size very close to the magnetic bead cut-off (100 bp), which makes them difficult to remove efficiently, especially if present in excess. Although it might seem counterintuitive, the preamplification reaction does not require the ISPCR primer. In fact, after RT, there is still a large excess of unused SMART dT30VN and LNA-TSO oligonucleotides. The 5′-end of both oligonucleotides carries exactly the same sequence as the ISPCR primer, and they can therefore work as PCR primers themselves. The result is a PCR reaction done without dedicated PCR primers!
7. It is recommended to add the Tween-20 at the end, in order to avoid the formation of a lot of foam.
8. Due to its toxicity, it is recommended to add the sodium azide as the very last ingredient.
9. The protocol described here is based on [17], with modifications. Over the years, a large number of protocols have been published in scientific journals or on the Internet; the one presented here is, therefore, just an example. We found that using a final concentration of 19% w/v of PEG MW = 8000 works well in our setup. However, every new batch of Sera-Mag SpeedBeads™ should be titrated side-by-side with commercial Agencourt AMPure XP magnetic beads. We generally test different bead–DNA ratios for each type of bead (0.8:1, 1:1, and 1.2:1, for example) by using a 1 kb DNA Ladder and performing a standard magnetic bead purification. We then load a High Sensitivity D1000 ScreenTape or a High Sensitivity DNA Bioanalyzer chip and check which bands of the DNA Ladder are retained and which are not. If the goal is to use the Sera-Mag SpeedBeads™ for all other protocols in the lab, we recommend adjusting the amount of PEG to achieve the same performance between the two types of beads.
10. Washing the bead pellet with ethanol before the final cDNA elution is time-consuming, causes losses of precious material, and can lead to reaction failure if the ethanol is not removed completely. Ethanol washes are never performed in this protocol until the final library cleanup (and would not be necessary even then).
11. There is no difference in elution efficiency between nuclease-free water and Elution Buffer (Qiagen, 10 mM Tris–HCl pH 8.5), and both can be used in the final step. Here we use nuclease-free water.
12. For tagmentation of picogram amounts of DNA, the indices in the Nextera® Index Kit can be diluted 1:5 with nuclease-free water or low-EDTA TE buffer (10 mM Tris–HCl, 0.1 mM EDTA, pH 8.0). For ease of use, generate an Index Stock Plate by mixing the Index Adaptors in order to have a unique combination of N7xx and S5xx in each well. Each Index Adaptor will then have been diluted 10 times from its original concentration.
13. The Tn5MErev primer needs to be preannealed in separate vials with Tn5ME-A and Tn5ME-B before loading the two partly double-stranded oligonucleotides on the Tn5 transposase. See [12] for a detailed protocol.
14. The concentration and the molecular weight (MW) of polyethylene glycol are critical for a successful library, and we obtained the best results when using a final concentration of 8–10% of PEG MW = 4000–8000. Other PEG polymers can also be used, but it is worth remembering that there is an inverse relationship between the MW of the PEG polymer and the average size of the library (i.e., a reaction with 8% PEG MW = 35,000 gives shorter libraries compared to the same reaction when 8% PEG MW = 4000 is used). Avoid using polymers with low MW (i.e., MW = 400 or similar), since they are not as effective a macromolecular crowding agent as the longer ones [18].
15. The concentration of SDS is extremely important for the inactivation of the in-house Tn5 Transposase and for an efficient Enrichment PCR reaction. Best results are obtained when the concentration of SDS is in the range of 0.1–0.2%. Do not increase the concentration further, as 0.3% already leads to a complete failure of the following PCR.
16. Do not replace the enzyme with the corresponding HotStart version. The initial step of the Enrichment PCR is carried out at 72 °C and is required in order to fill the 9-bp gap created by the Tn5 in the tagmentation reaction. Without the gap filling, the following amplification cannot take place. The HotStart version needs to be activated at 98 °C for several minutes and would not work here. However, it is possible to replace the KAPA HiFi non-HotStart Polymerase with Phusion non-HotStart Polymerase without any difference in performance.
17. If using the Smart-seq2 protocol with in-house Tn5 Transposase, the sequencing will probably represent the largest fraction of the entire cost, even when using the Nextera® Index Kit with the highest multiplexing capability (24 i-7xx and 16 i-5xx indices = 384 combinations). As an example, if 384 single-cell libraries are sequenced on a NextSeq™ 550 cartridge, approximately one million reads per cell will be generated, far more than the 250,000 recommended by some authors [16]. While it is possible to use other instruments with lower throughput but higher price per base (such as the MiSeq™, for example), a more efficient solution is to increase the number of Index Adaptors used. One option is to use the recently released TruGrade™ Oligonucleotides (IDT), manufactured by proprietary methods that are proven to reduce index cross talk and increase the success of multiplex NGS experiments. Oligonucleotides should be delivered in deep-well plates, with the layout chosen according to the liquid handling robot available in the laboratory. All Index Adaptors should carry a 5′-biotin to minimize artifacts, as well as a phosphorothioate bond between the last and second-to-last nucleotide at the 3′-end to make them more resistant to exonucleases.
18. Always choose fully skirted plates when carrying out the protocol with liquid handling robots. These plates also have the additional advantage of offering increased rigidity, thus reducing warping during thermal cycling. twin.tec® (Eppendorf) and Hard-Shell® (Bio-Rad) are two possible choices.
19. There is a limited choice when it comes to finding a seal that can withstand storage at −80 °C. I recommend AlumaSeal® 384 film (VWR International) for optimal results. For storage of preamplified cDNA or final libraries at −20 °C, use the adhesive plastic foil of your choice.
20. The final volume of 2.3 μL of Cell Lysis Mix is not a crucial value and can be modified. If the FACS instrument is perfectly calibrated, as little as 0.5 μL of Cell Lysis Mix is required when using 384-well plates. When using 96-well plates, volumes need to be larger due to the larger well size. Generally 4–5 μL yields the best results. Reducing the volume of Cell Lysis Mix has the advantage of reducing the amount of reagents needed for the RT and preamplification reactions, thus decreasing the final cost.
21. A quick centrifugation after cell isolation by FACS is generally not necessary but must be performed if, for some reason, the Cell Lysis Mix is no longer located at the bottom of the wells. Otherwise, spinning will not affect the results, because the cells that did not reach the Cell Lysis Mix (those ending up on the wall, for example) are already dead and their RNA degraded by the time the centrifugation is performed.
22. Semi-automated low- to medium-throughput solutions, such as the CellSorter instrument [19], are an appealing alternative for rare or very fragile cell types that cannot be sorted with a FACS instrument. The drawback is that this requires a relatively large initial investment.
23. It is very important to clean the instrument at regular intervals to ensure that residual RNA or DNA from previous experiments is not present. To guarantee high viability and therefore success in the library preparation, apply proper sample and sheath pressure. Lower sheath pressure causes less damage to the cells. On the downside, the stream may not be very stable when the pressure is too low. Especially when the volume of Cell Lysis Mix is low or when using 384-well plates, it is also crucial to align the deflected drop to the center of the sorted wells. For good well alignment, a small angle for the drop deflection is preferred. The nearly vertical deflection stream ensures that the sorted drop hits the center of the well of the collection plate.
24. I recommend always using both a positive and a negative control in every experiment. The negative control is a reaction containing only 2.3 μL of Cell Lysis Mix and no cell. A positive control can be either a reaction where a “mini-bulk” of 10–20 cells has been sorted into the same well or some high-quality total RNA (10–100 pg).
25. The initial denaturation step before RT is generally not necessary and can be skipped if RNA degradation is a concern. Historically, this was done because it was believed that it would help resolve secondary structures present in the RNA, thus making it accessible to oligo-dT primers or randomers before the cDNA synthesis. However, the denaturation temperature is so high that no oligonucleotide can anneal to the RNA. This becomes possible only once the RT begins (which is generally carried out at temperatures between 42 and 60 °C, depending on the enzyme used). The RT can take up to 90 min, enough time for the RNA to refold, in a process that is probably faster than the speed with which the enzyme synthesizes the first cDNA strand. This fact has also been observed by others and is now implemented in some of the most popular scRNA-seq protocols [20].
26. The final step of RT, the enzyme inactivation, can also be skipped to make the protocol even faster. The enzyme will be inactivated anyway during the initial activation step of the preamplification reaction. A word of warning: if the RT inactivation is not performed, it is advised to keep the plate on ice, or to store it at −20 °C if not proceeding with the preamplification reaction the same day. It is known that Superscript™ II has an exonuclease activity, and this might cause the degradation of the newly synthesized cDNA.
27. Any leftover Preamplification Reaction Mix can be frozen and safely used again without a decrease in performance.
28. When working with bulk RNA, choose 14 cycles if starting from 1 ng total RNA, 16 cycles if starting from 100 pg total RNA, and so on. When working with single cells, adjust the number of cycles according to the RNA content, if known. As a guideline, one intact cell contains approximately 10 pg total RNA. For cell lines use 18 cycles, for tumor cells 18–20 cycles, and for immune cells 22–24 cycles. Keep the number of preamplification cycles as low as possible, in order to avoid introducing a large bias in your experiment. Ziegenhain and collaborators have recently estimated that with every PCR cycle the power to detect a log2-fold change of 0.5 appears to drop by 2.4% [21].
29. Leaving a couple of μL behind minimizes the risk of bead carryover that might interfere with the TapeStation or Bioanalyzer run.
30. The Nextera® XT Kit User Manual states that the samples should be incubated for 5 min at room temperature before proceeding with the enrichment PCR. However, this is not necessary. The NT buffer (i.e., SDS) is a strong denaturant with immediate effects on the Tn5 Transposase. Furthermore, prolonged incubation of the samples in NT Buffer (up to several hours) does not cause a decrease in the quality of the final sequencing library.
31. For long-term storage of final libraries and, ideally, purified preamplification products, LoBind tubes and/or LoBind twin.tec® (Eppendorf) plates should be preferred. This ensures minimal adsorption of the DNA to the plastic and guarantees that little DNA is lost upon storage. We have noticed, however, that even when using LoBind tubes a slight decrease in library concentration over time is to be expected.
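The ERCC volumes in Subheading 3.1 (0.025 or 0.0025 μL per reaction) and the final dilutions quoted in Note 4 (1 in 4 million or 1 in 40 million in the 5 μL RT reaction) can be cross-checked against each other. The sketch below derives the working dilution of the ERCC stock that these numbers imply. This value is inferred here by arithmetic and is not stated in this section, so treat it as an assumption to verify against your own stock.

```python
# Sketch: working ERCC dilution implied by the per-reaction volume and Note 4.
RT_VOLUME_UL = 5.0   # final RT volume from Subheading 3.2


def implied_working_dilution(ercc_ul_per_reaction, final_dilution,
                             reaction_ul=RT_VOLUME_UL):
    """Fold-dilution of the ERCC stock that must be pipetted to reach final_dilution.

    final_dilution is expressed as a fold value, e.g. 4e6 for "1 in 4 million".
    """
    if ercc_ul_per_reaction <= 0 or final_dilution <= 0:
        raise ValueError("volumes and dilutions must be positive")
    dilution_in_reaction = reaction_ul / ercc_ul_per_reaction
    return final_dilution / dilution_in_reaction


if __name__ == "__main__":
    large = implied_working_dilution(0.025, 4e6)
    small = implied_working_dilution(0.0025, 40e6)
    print(f"large cells: 1:{large:,.0f} working dilution")
    print(f"small cells: 1:{small:,.0f} working dilution")
```

Both cases resolve to the same working dilution of about 1:20,000, which suggests that a single prediluted ERCC stock serves both cell sizes and only the pipetted volume changes.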

References

1. Raj A, van Oudenaarden A (2008) Nature, nurture, or chance: stochastic gene expression and its consequences. Cell 135:216–226. https://doi.org/10.1016/j.cell.2008.09.050
2. Wilkinson DJ (2009) Stochastic modelling for quantitative description of heterogeneous biological systems. Nat Rev Genet 10:122–133
3. Marinov GK, Williams BA, McCue K, Schroth GP, Gertz J, Myers RM, Wold BJ (2014) From single-cell to cell-pool transcriptomes: stochasticity in gene expression and RNA splicing. Genome Res 24:496–510
4. Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas AR, Kamitaki N, Martersteck EM, Trombetta JJ, Weitz DA, Sanes JR, Shalek AK, Regev A, McCarroll SA (2015) Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161:1202–1214
5. Klein AM, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V, Peshkin L, Weitz DA, Kirschner MW (2015) Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161:1187–1201
6. Zheng GX, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, Ziraldo SB, Wheeler TD, McDermott GP, Zhu J, Gregory MT, Shuga J, Montesclaros L, Underwood JG, Masquelier DA, Nishimura SY, Schnall-Levin M, Wyatt PW, Hindson CM, Bharadwaj R, Wong A, Ness KD, Beppu LW, Deeg HJ, McFarland C, Loeb KR, Valente WJ, Ericson NG, Stevens EA, Radich JP, Mikkelsen TS, Hindson BJ, Bielas JH (2017) Massively parallel digital transcriptional profiling of single cells. Nat Commun 8:14049
7. Picelli S, Björklund ÅK, Faridani OR, Sagasser S, Winberg G, Sandberg R (2013) Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat Methods 10:1096–1098
8. Sheng K, Cao W, Niu Y, Deng Q, Zong C (2017) Effective detection of variation in single-cell transcriptomes using MATQ-seq. Nat Methods 14:267–270
9. Fan X, Zhang X, Wu X, Guo H, Hu Y, Tang F, Huang Y (2015) Single-cell RNA-seq transcriptome analysis of linear and circular RNAs in mouse preimplantation embryos. Genome Biol 16:148
10. Zhu YY, Machleder EM, Chenchik A, Li R, Siebert PD (2001) Reverse transcriptase template switching: a SMART approach for full-length cDNA library construction. BioTechniques 30:892–897
11. Adey A, Morrison HG, Asan XX, Kitzman JO, Turner EH, Stackhouse B, MacKenzie AP, Caruccio NC, Zhang X, Shendure J (2010) Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol 11:R119
12. Picelli S, Björklund ÅK, Reinius B, Sagasser S, Winberg G, Sandberg R (2014) Tn5 transposase and tagmentation procedures for massively scaled sequencing projects. Genome Res 24:2033–2040
13. Picelli S, Faridani OR, Björklund ÅK, Winberg G, Sagasser S, Sandberg R (2014) Full-length RNA-seq from single cells using Smart-seq2. Nat Protoc 9:171–181
14. Kapteyn J, He R, McDowell ET, Gang DR (2010) Incorporation of non-natural nucleotides into template-switching oligonucleotides reduces background and improves cDNA synthesis from very small RNA samples. BMC Genomics 11:413
15. Turchinovich A, Surowy H, Serva A, Zapatka M, Lichter P, Burwinkel B (2014) Capture and Amplification by Tailing and Switching (CATS). An ultrasensitive ligation-independent method for generation of DNA libraries for deep sequencing from picogram amounts of DNA and RNA. RNA Biol 11:817–828
16. Svensson V, Natarajan KN, Ly LH, Miragaia RJ, Labalette C, Macaulay IC, Cvejic A, Teichmann SA (2017) Power analysis of single-cell RNA-sequencing experiments. Nat Methods 14:381–387
17. Rohland N, Reich D (2012) Cost-effective high-throughput DNA sequencing libraries for multiplexed target capture. Genome Res 22:939–946
18. Zimmerman SB, Minton AP (1993) Macromolecular crowding: biochemical, biophysical, and physiological consequences. Annu Rev Biophys Biomol Struct 22:27–65
19. Környei Z, Beke S, Mihálffy T, Jelitai M, Kovács KJ, Szabó Z, Szabó B (2013) Cell sorting in a Petri dish controlled by computer vision. Sci Rep 3:1575
20. Macaulay IC, Haerty W, Kumar P, Li YI, Hu TX, Teng MJ, Goolam M, Saurat N, Coupland P, Shirley LM, Smith M, Van der Aa N, Banerjee R, Ellis PD, Quail MA, Swerdlow HP, Zernicka-Goetz M, Livesey FJ, Ponting CP, Voet T (2015) G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat Methods 12:519–522
21. Ziegenhain C, Vieth B, Parekh S, Reinius B, Guillaumet-Adkins A, Smets M, Leonhardt H, Heyn H, Hellmann I, Enard W (2017) Comparative analysis of single-cell RNA sequencing methods. Mol Cell 65:631–643

Chapter 4

CEL-Seq2—Single-Cell RNA Sequencing by Multiplexed Linear Amplification

Itai Yanai and Tamar Hashimshony

Abstract

Single-cell RNA sequencing has revolutionized the way we look at cell populations. Of the methods available, CEL-Seq was the first to use linear RNA amplification. With early barcoding and 3′ sequencing, it is sensitive, cost-effective, and easy to perform. Here we describe a protocol for performing CEL-Seq2 on sorted cells, which can be performed without any special equipment.

Key words Single-cell RNA amplification, Single-cell transcriptomics, In vitro transcription, 3′ sequencing, Gene expression

1 Introduction

Ten years have passed since the publication of the first method for single-cell RNA-seq [1] and many more since the first general method to study gene expression in individual cells [2]. The ability to determine all the genes that are expressed in individual cells has made it possible to define cellular identity [3], follow developmental processes [4], and understand disease [5], one cell at a time. The Tang et al. protocol was followed by many more [6, 7], which improved the sensitivity and throughput. New methods for capturing cells also greatly increased throughput [8–10].

A central issue for all single-cell RNA-Seq protocols is the need to increase the amount of RNA present in a single cell to a level suitable for sequencing. The first protocol to be published, as well as several that followed, relied on PCR amplification, which is very efficient but susceptible to biases. Our group sought to improve the sensitivity by using in vitro transcription (IVT) for linear RNA amplification, first introduced by Eberwine, which was the basis of much of the microarray work. We thus developed CEL-Seq [11], which gave sensitive and reproducible results. The use of IVT also has the advantage that no template switching is required, thereby increasing the sensitivity. CEL-Seq uses a polyT primer that introduces a cell-specific barcode next to the transcripts' polyA and selects the 3′ ends of the transcripts for sequencing. This reduces sequencing costs, as fewer reads are needed compared to methods that cover the entire length of the transcript. The early barcoding also enables early pooling of the samples, so that a single library is made, reducing hands-on time. With CEL-Seq2 [12], we implemented the unique molecular identifiers (UMI) approach [13, 14], which further reduced amplification biases. CEL-Seq works well on manually collected single cells and cells sorted by FACS but has also been combined with automation [15] and fluidics [10, 12] to increase throughput.

Here we describe how to use CEL-Seq2 to analyze a plate of 96 FACS-sorted cells. This does not require expensive special equipment and can be performed in any lab with good RNA practices. The lower throughput is balanced by the better quality of data obtained.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_4, © Springer Science+Business Media, LLC, part of Springer Nature 2019

2 Materials

Use ultrapure, RNase-free water throughout the protocol.
1. Order desalted primers at the lowest possible scale. Dissolve in water and keep at −20 °C. Sequences, stock concentrations, and working dilutions are summarized in Tables 1, 2, and 3.
2. Working dilution plate (CEL-Seq barcoded primers): Dilute primers to a concentration of 25 ng/μl in a 96-well plate.
3. Primer dilution mix: 100 μl 10 mM dNTP mix, 12 μl 10% NP40, appropriate amount of ERCC spike-ins (see Note 1), complete with ultrapure water to 1000 μl.
4. RT mix: amounts to mix for a full plate (amounts for each library RT in parentheses): 40 μl (2 μl) 5× First Strand buffer, 20 μl (1 μl) 0.1 M DTT, 10 μl (0.5 μl) RNase Inhibitor, 10 μl (0.5 μl) SuperScript II.
5. Second strand mix: 700 μl ultrapure water, 231 μl 5× Second strand buffer, 23 μl 10 mM dNTPs, 8 μl E. coli DNA ligase (10 U/μl), 30 μl E. coli DNA Pol (10 U/μl), 8 μl E. coli RNase H (2 U/μl).
6. Bead binding buffer: 20% PEG8000, 2.5 M NaCl.
7. IVT mix: 1.6 μl 10× T7 Buffer, 1.6 μl 10× T7 enzyme mix, 1.6 μl each of 75 mM ATP, UTP, CTP, and GTP.
8. 5× fragmentation buffer: 200 mM Tris–acetate, pH 8.1, 500 mM KOAc, 150 mM MgOAc.
9. Stop buffer: 0.5 M EDTA, pH 8.
10. PCR mix: any 2× PCR enzyme mix suitable for preparing sequencing libraries.

Table 1

Primers used for the protocol

| Primer name | Sequence | Stock conc. | Working dilution | Remarks |
|---|---|---|---|---|
| CEL-Seq primer | GCCGGTAATACGACTCACTATAGGGAGTTCTACAGTCCGACGATCNNNNNNXXXXXXTTTTTTTTTTTTTTTTTTTTTTTTV | 1 μg/μl | 25 ng/μl | Index sequences are in Table 2 |
| randomhexRT | GCCTTGGCACCCGAGAATTCCANNNNNN | 100 μM | 100 μM | |
| RNA PCR Primer (RP1) | AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGA | 100 μM | 10 μM | Oligonucleotide sequences © 2007–2013 Illumina, Inc. All rights reserved |
| RNA indexed PCR Primer (RPIX) | CAAGCAGAAGACGGCATACGAGATXXXXXXGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA | 100 μM | 10 μM | Oligonucleotide sequences © 2007–2013 Illumina, Inc. All rights reserved; indexes are in Table 3 |

Table 2 CEL-Seq index sequences

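| No. | Index | No. | Index | No. | Index | No. | Index | No. | Index | No. | Index | No. | Index | No. | Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | AGACTC | 13 | AGTGTC | 25 | GTTGCA | 37 | CTAGGA | 49 | TGGTAC | 61 | CAACCA | 73 | GTTGAG | 85 | TGAGTG |
| 2 | AGCTAG | 14 | TCCTAG | 26 | GTGACA | 38 | CTCATG | 50 | GACATG | 62 | CAACTC | 74 | GTTGTC | 86 | TGAGGA |
| 3 | AGCTCA | 15 | TCTGAG | 27 | GTGATC | 39 | CTCAGA | 51 | GATCAC | 63 | CACTCA | 75 | GTGAAG | 87 | TGTCTG |
| 4 | AGCTTC | 16 | TCTGCA | 28 | ACAGTG | 40 | CTTCGA | 52 | GATCTG | 64 | CACTTC | 76 | ACAGAC | 88 | TGGTTG |
| 5 | CATGAG | 17 | TCGAAG | 29 | ACCATG | 41 | CTGTAC | 53 | GATCGA | 65 | CAGAAG | 77 | ACAGGA | 89 | TGGTGA |
| 6 | CATGCA | 18 | TCGACA | 30 | ACTCTG | 42 | CTGTGA | 54 | GAGTAC | 66 | CAGACA | 78 | ACCAAC | 90 | GAAGAC |
| 7 | CATGTC | 19 | TCGATC | 31 | ACTCGA | 43 | TGAGAC | 55 | AGACAG | 67 | TCACCA | 79 | ACCAGA | 91 | GAAGTG |
| 8 | CACTAG | 20 | GTACAG | 32 | ACGTAC | 44 | TGCAAC | 56 | AGACCA | 68 | TCACTC | 80 | ACTCAC | 92 | GAAGGA |
| 9 | CAGATC | 21 | GTACCA | 33 | ACGTTG | 45 | TGCATG | 57 | AGTGAG | 69 | TCCTCA | 81 | CTCAAC | 93 | GACAAC |
| 10 | TCACAG | 22 | GTACTC | 34 | ACGTGA | 46 | TGCAGA | 58 | AGGAAG | 70 | TCCTTC | 82 | CTTCAC | 94 | GACAGA |
| 11 | AGGATC | 23 | GTCTAG | 35 | CTAGAC | 47 | TGTCAC | 59 | AGGACA | 71 | TCTGTC | 83 | CTTCTG | 95 | GAGTTG |
| 12 | AGTGCA | 24 | GTCTCA | 36 | CTAGTG | 48 | TGTCGA | 60 | CAACAG | 72 | GTCTTC | 84 | CTGTTG | 96 | GAGTGA |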
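To illustrate how Tables 1 and 2 fit together, the following minimal sketch substitutes a Table 2 index for the XXXXXX placeholder of the CEL-Seq primer and reports how many positions differ between indices. The function names are hypothetical and only the first four indices are copied in; this is not part of the published protocol.

```python
from itertools import combinations

# Sequence copied from Table 1; the segment comments follow the protocol text
# (Note 16: six UMI bases followed by six cell-barcode bases).
CEL_SEQ_PRIMER_TEMPLATE = (
    "GCCGGTAATACGACTCACTATAGGGAGTTCTACAGTCCGACGATC"  # 5' portion as listed in Table 1
    "NNNNNN"                                          # UMI, six random bases
    "XXXXXX"                                          # placeholder for the cell barcode
    "TTTTTTTTTTTTTTTTTTTTTTTT"                        # poly(T) stretch
    "V"                                               # 3' anchor base (A, C, or G)
)

# First four indices from Table 2 (hypothetical subset, for illustration only).
BARCODES = {1: "AGACTC", 2: "AGCTAG", 3: "AGCTCA", 4: "AGCTTC"}


def barcoded_primer(index):
    """Return the CEL-Seq primer with the placeholder replaced by a Table 2 index."""
    return CEL_SEQ_PRIMER_TEMPLATE.replace("XXXXXX", BARCODES[index])


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


min_distance = min(hamming(a, b) for a, b in combinations(BARCODES.values(), 2))
print(barcoded_primer(1))
print("minimum pairwise distance in this subset:", min_distance)
```

The pairwise distance gives a rough idea of how many sequencing errors are needed to turn one barcode into another when read 1 is demultiplexed.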

Table 3 Illumina index sequences

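| No. | Index | No. | Index | No. | Index | No. | Index |
|---|---|---|---|---|---|---|---|
| 1 | CGTGAT | 13 | TTGACT | 25 | ATCAGT | 37 | ATTCCG |
| 2 | ACATCG | 14 | GGAACT | 26 | GCTCAT | 38 | AGCTAG |
| 3 | GCCTAA | 15 | TGACAT | 27 | AGGAAT | 39 | GTATAG |
| 4 | TGGTCA | 16 | GGACGG | 28 | CTTTTG | 40 | TCTGAG |
| 5 | CACTGT | 17 | CTCTAC | 29 | TAGTTG | 41 | GTCGTC |
| 6 | ATTGGC | 18 | GCGGAC | 30 | CCGGTG | 42 | CGATTA |
| 7 | GATCTG | 19 | TTTCAC | 31 | ATCGTG | 43 | GCTGTA |
| 8 | TCAAGT | 20 | GGCCAC | 32 | TGAGTG | 44 | ATTATA |
| 9 | CTGATC | 21 | CGAAAC | 33 | CGCCTG | 45 | GAATGA |
| 10 | AAGCTA | 22 | CGTACG | 34 | GCCATG | 46 | TCGGGA |
| 11 | GTAGCC | 23 | CCACTC | 35 | AAAATG | 47 | CTTCGA |
| 12 | TACAAG | 24 | GCTACC | 36 | TGTTGG | 48 | TGCCGA |

Oligonucleotide sequences © 2007–2013 Illumina, Inc. All rights reserved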

3 Method

Plates are kept on ice at all times, unless indicated otherwise. Incubations are performed in a thermocycler, with the lid set 10 °C warmer than the incubation temperature, or in an incubator at the appropriate temperature. It is recommended to use low-binding plasticware throughout the protocol.

3.1 Preparing Barcoded Plates

1. Aliquot 10 μl of primer dilution mix into each well of a 96-well plate.
2. With a multichannel pipette, transfer 2 μl of each primer from the working dilution plate to the corresponding well of the primer mix plate. Mix well.
3. With a multichannel pipette, transfer 1.2 μl of each primer mix from the primer mix plate to a 384-well plate (see Note 2), seal well (see Note 3), and keep at −80 °C until ready to use (see Note 4).
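The transfer from the 96-well primer mix plate into alternating wells of the 384-well plate can be sketched as a simple coordinate mapping. The exact layout is an assumption here: the sketch takes barcodes to be numbered row by row in the 96-well plate (A1 = 1, A12 = 12, B1 = 13) and places them in every second row and column of the 384-well plate starting at A1, which is what an 8- or 12-channel pipette reaches naturally. Adjust both assumptions to match your own plates.

```python
import string

ROWS_384 = string.ascii_uppercase[:16]  # rows A-P of a 384-well plate


def well_384_for_barcode(barcode_number):
    """Map barcode 1-96 to a 384-well position (assumed alternating layout)."""
    if not 1 <= barcode_number <= 96:
        raise ValueError("barcode number must be between 1 and 96")
    row_96, col_96 = divmod(barcode_number - 1, 12)  # assumed row-major numbering
    row_384 = ROWS_384[2 * row_96]                   # rows A, C, E, ... O
    col_384 = 2 * col_96 + 1                         # columns 1, 3, 5, ... 23
    return f"{row_384}{col_384}"


layout = {n: well_384_for_barcode(n) for n in range(1, 97)}
print(layout[1], layout[2], layout[13], layout[96])
```

Keeping such a table next to the sorter settings makes it easier to confirm that each cell lands in a well that actually contains primer.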

3.2 Sorting Cells into Plates

1. Prepare cells for sorting; when ready, remove the 384-well plate from −80 °C.
2. Spin down at 450 × g for 1 min.
3. Remove seal just prior to sorting, sort into the appropriate wells (see Note 5).
4. Seal plate and quick-freeze on dry ice. Return to −80 °C (see Note 6).

3.3 Converting RNA to cDNA

1. Remove 384-well plate from −80 °C.
2. Spin down at 450 × g for 1 min.
3. Incubate at 65 °C for 5 min.
4. Spin down at 450 × g for 1 min.
5. Add 0.8 μl RT mix, mix by pipetting up and down 3–5 times, and seal plate.
6. Spin down at 450 × g for 1 min.
7. Incubate at 42 °C for 1 h.
8. Incubate at 70 °C for 10 min.
9. Spin down at 450 × g for 1 min.
10. Add 10 μl second strand mix (see Note 7), seal plate, and mix by flicking or gentle vortexing.
11. Spin down at 450 × g for 1 min.
12. Incubate at 16 °C for 2 h.
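As a quick sanity check, the full-plate mixes listed in Materials can be compared with the volumes dispensed in the steps above (0.8 μl RT mix and 10 μl second strand mix per well). The sketch below only does the arithmetic; the dictionary keys and the helper name are illustrative.

```python
# Volumes in μl, copied from Materials items 4 and 5.
RT_MIX_FULL_PLATE = {
    "5x First Strand buffer": 40,
    "0.1 M DTT": 20,
    "RNase Inhibitor": 10,
    "SuperScript II": 10,
}
SECOND_STRAND_MIX = {
    "ultrapure water": 700,
    "5x Second strand buffer": 231,
    "10 mM dNTPs": 23,
    "E. coli DNA ligase": 8,
    "E. coli DNA Pol": 30,
    "E. coli RNase H": 8,
}


def mix_budget(mix, per_well_ul, wells=96):
    """Return (prepared, needed, spare) volumes in μl for one plate."""
    prepared = sum(mix.values())
    needed = per_well_ul * wells
    return prepared, needed, prepared - needed


rt = mix_budget(RT_MIX_FULL_PLATE, per_well_ul=0.8)
second_strand = mix_budget(SECOND_STRAND_MIX, per_well_ul=10)
print("RT mix (prepared, needed, spare):", rt)
print("Second strand mix (prepared, needed, spare):", second_strand)
```

Both mixes come out with roughly 4% spare volume for 96 wells, which leaves little room for pipetting losses.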

3.4 cDNA Cleanup

1. Prewarm AMPure XP beads and bead binding buffer to room temperature.
2. Pool all samples into one tube. Split into six 0.5 ml tubes, 192 μl per tube (see Note 8).
3. Vortex AMPure XP beads until well dispersed.
4. To each of the six tubes of pooled samples add 39 μl beads and 192 μl bead binding buffer, and mix well by pipetting.
5. Incubate at room temperature for 15 min.
6. Place on magnetic stand for at least 5 min, until liquid appears clear.
7. Remove and discard 400 μl of the supernatant.
8. Add 200 μl freshly prepared 80% EtOH (increase the volume if it does not cover the beads).
9. Incubate at least 30 s, then remove and discard supernatant without disturbing beads.
10. Add 200 μl freshly prepared 80% EtOH.
11. Incubate at least 30 s, then remove and discard supernatant without disturbing beads.
12. Quick-spin the tubes (2 s) to collect beads at the bottom and remove extra EtOH.
13. Air-dry beads for 15 min or until completely dry.
14. Resuspend one tube with 6.4 μl water. Pipet entire volume up and down ten times to mix thoroughly.
15. Transfer resuspended beads to next tube. Repeat until all beads are in a single tube.
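The 192 μl per tube in step 2 follows from the per-well volumes added earlier (1.2 μl primer mix, 0.8 μl RT mix, 10 μl second strand mix). The sketch below reproduces that arithmetic and the three-tube variant of Note 8; it ignores the sorted cell volume, evaporation, and dead volume, and the function name is hypothetical.

```python
PRIMER_MIX_UL = 1.2
RT_MIX_UL = 0.8
SECOND_STRAND_UL = 10.0
WELLS = 96


def cleanup_volumes(n_tubes=6):
    """Volumes (μl) per tube for the bead cleanup, scaled from the six-tube recipe."""
    per_well = PRIMER_MIX_UL + RT_MIX_UL + SECOND_STRAND_UL
    pooled = per_well * WELLS
    scale = 6 / n_tubes  # Note 8: three tubes means doubling all volumes
    sample = pooled / n_tubes
    beads = 39 * scale
    binding_buffer = 192 * scale
    return {
        "pooled": pooled,
        "sample": sample,
        "beads": beads,
        "binding_buffer": binding_buffer,
        "total": sample + beads + binding_buffer,
    }


print(cleanup_volumes(6))
print(cleanup_volumes(3))
```

With six tubes the total per tube is 423 μl, so removing 400 μl of supernatant in step 7 leaves a small margin above the bead pellet.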

3.5 Library Amplification

1. To 6.4 μl resuspended beads add 9.6 μl IVT mix.
2. Incubate for 13 h at 37 °C (see Note 9).
3. Add 6 μl EXOSap-IT (see Note 10).
4. Incubate for 15 min at 37 °C.
5. Add 5.5 μl fragmentation buffer (0.25 reaction volume).
6. Incubate for 3 min at 94 °C.
7. Add 2.75 μl stop buffer (0.5 volume of fragmentation buffer added).
8. Place on magnetic stand for 5 min, until liquid appears clear.
9. Transfer supernatant to new tube.
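Note 10 asks for the fragmentation and stop buffer volumes to be corrected when a different exonuclease (and therefore a different volume) is used. A minimal sketch of that correction, using only the ratios given in steps 5 and 7, could look as follows; the function name is hypothetical.

```python
def fragmentation_volumes(exonuclease_ul=6.0, beads_ul=6.4, ivt_mix_ul=9.6):
    """Return (reaction, fragmentation buffer, stop buffer) volumes in μl.

    Fragmentation buffer is 0.25 of the reaction volume (step 5) and
    stop buffer is 0.5 of the fragmentation buffer volume (step 7).
    """
    reaction = beads_ul + ivt_mix_ul + exonuclease_ul
    fragmentation = 0.25 * reaction
    stop = 0.5 * fragmentation
    return reaction, fragmentation, stop


print(fragmentation_volumes())        # protocol defaults
print(fragmentation_volumes(2.0))     # hypothetical smaller exonuclease volume
```

With the default 6 μl of exonuclease the function returns the 5.5 μl and 2.75 μl used in the protocol.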

3.6 RNA Cleanup

1. Prewarm RNAClean XP beads to room temperature.
2. Vortex RNAClean XP beads until well dispersed and add 55 μl beads (1.8 volumes) to the sample. Mix well by pipetting.
3. Incubate at room temperature for 10 min.
4. Place on magnetic stand for at least 5 min, until liquid appears clear.
5. Remove and discard 80 μl of the supernatant.
6. Add 200 μl freshly prepared 70% EtOH.
7. Incubate for at least 30 s, then remove and discard supernatant without disturbing beads.
8. Repeat wash two more times.
9. Air-dry beads for 15 min, or until completely dry.
10. Resuspend with 5 μl water (see Note 11). Mix well by pipetting.
11. Incubate at room temperature for 2 min.
12. Place on magnetic stand for 5 min, until liquid appears clear.
13. Transfer 4.5 μl of the supernatant to a new tube.

Stopping point: Samples can be kept at −80 °C.

3.7 Preparing Library

1. To 4.5 μl amplified RNA add 1 μl randomhexRT primer and 0.5 μl dNTP mix.
2. Incubate at 65 °C for 5 min.
3. Add 4 μl RT mix and mix by pipetting up and down 3–5 times.
4. Incubate at 25 °C for 10 min.
5. Incubate at 42 °C for 1 h.
6. Transfer 5 μl to a new tube (keep the rest of the RT reaction at −20 °C). Add 5.5 μl ultrapure water, 1 μl RNA PCR Primer (RP1), 1 μl indexed RNA PCR Primer (RPIX, see Note 12), and 12.5 μl PCR mix.
7. Use the following program on your thermocycler: 30 s at 98 °C; 11 cycles (see Note 13) of 10 s at 98 °C, 30 s at 60 °C, 30 s at 72 °C; 10 min at 72 °C; hold at 4 °C.

Stopping point: Samples can be kept at −20 °C.
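The PCR program in step 7, together with the cycle limits given in Note 13 (no fewer than 8 and no more than 15 cycles), can be written down as data. The sketch below is only a bookkeeping aid with hypothetical names; it sums programmed hold times and does not account for ramping between temperatures.

```python
INITIAL_DENATURATION_S = 30           # 30 s at 98 °C
CYCLE_STEPS_S = [10, 30, 30]          # 98 °C, 60 °C, 72 °C
FINAL_EXTENSION_S = 10 * 60           # 10 min at 72 °C
MIN_CYCLES, MAX_CYCLES = 8, 15        # limits from Note 13


def pcr_hold_time_s(cycles=11):
    """Total programmed hold time in seconds for a given cycle number."""
    if not MIN_CYCLES <= cycles <= MAX_CYCLES:
        raise ValueError(f"cycle number should be between {MIN_CYCLES} and {MAX_CYCLES}")
    return INITIAL_DENATURATION_S + cycles * sum(CYCLE_STEPS_S) + FINAL_EXTENSION_S


print(pcr_hold_time_s(11) / 60, "min of programmed hold time")
```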

3.8 Library Cleanup

1. Prewarm beads to room temperature.
2. Vortex AMPure XP beads until well dispersed.
3. Add 25 μl to the PCR reaction. Mix well by pipetting.
4. Incubate at room temperature for 15 min.
5. Place on magnetic stand for at least 5 min, until liquid appears clear.
6. Remove and discard 45 μl of the supernatant.
7. Add 200 μl freshly prepared 80% EtOH.
8. Incubate for at least 30 s, then remove and discard supernatant without disturbing beads.
9. Add 200 μl freshly prepared 80% EtOH.
10. Incubate for at least 30 s, then remove and discard supernatant without disturbing beads.
11. Air-dry beads for 15 min or until completely dry.
12. Resuspend with 26 μl water. Mix well by pipetting.
13. Incubate at room temperature for 2 min.
14. Place on magnetic stand for 5 min, until liquid appears clear.
15. Transfer 25 μl of supernatant to new tube.
16. Repeat steps 2–11.
17. Resuspend with 10.5 μl water. Mix well by pipetting.
18. Incubate at room temperature for 2 min.
19. Place on magnetic stand for 5 min, until liquid appears clear.
20. Transfer 10 μl of supernatant to new tube and store prepared library at −20 °C.

3.9 Assessing Library Quality and Sequencing

1. Check library concentration by Qubit, and size distribution by BioAnalyzer or TapeStation.
2. Calculate molar library concentration (see Note 14):

$$\frac{\text{conc. (ng/}\mu\text{l)}}{\text{size (bp)}} \times \frac{1}{649} \times 1{,}000{,}000 = \text{conc. (nM)}$$

3. Prepare library for sequencing (see Note 15). Perform paired-end sequencing reading 15 bases for read 1 and 36 bases for read 2 (see Notes 16–18 for sequencing and analysis-related issues).
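The molarity conversion in step 2 is easy to get wrong by a factor of 1000, so a small helper can be useful. The sketch below implements the formula as given, with 649 g/mol per base pair, and flags libraries below the 2 nM minimum mentioned in Note 14. The function names and the example values (2 ng/μl, 400 bp) are illustrative only.

```python
AVERAGE_BP_MASS = 649          # g/mol per base pair, as used in the protocol formula
MIN_SEQUENCING_NM = 2.0        # minimal library concentration from Note 14


def library_molarity_nm(conc_ng_per_ul, mean_size_bp):
    """Convert a mass concentration (ng/μl) and mean fragment size (bp) to nM."""
    return conc_ng_per_ul / mean_size_bp / AVERAGE_BP_MASS * 1_000_000


def is_sequenceable(conc_ng_per_ul, mean_size_bp):
    """True if the library reaches the minimal concentration for sequencing."""
    return library_molarity_nm(conc_ng_per_ul, mean_size_bp) >= MIN_SEQUENCING_NM


example_nm = library_molarity_nm(2.0, 400)
print(f"{example_nm:.2f} nM, sequenceable: {is_sequenceable(2.0, 400)}")
```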

4 Notes

1. A known amount of ERCC spike-ins is added to each well. This can be used to calculate the efficiency of the amplification for each well and to estimate the number of transcripts in each cell. It is very useful for troubleshooting: it indicates whether a failed well is due to sorting (in which case the spike-ins will amplify well) or to problems in amplification (in which case reads from the spike-ins will be missing). Spike-ins should be diluted to a level where they get about 5% of the reads. For larger cells I find that diluting the spike-in 1,000,000-fold works well, and for smaller cells spike-ins can be diluted even further.
2. 384-well plates are used because of the narrow wells; this is more compatible with the small volumes used throughout the protocol. I use alternating well positions appropriate for an 8- or 12-channel pipette, and for keeping the same barcode order as in the primer plate (Fig. 1).
3. Seal plates with film suitable for storage at −80 °C. I find aluminum foil most suitable both for keeping the plates and later for incubations.
4. Several plates can be prepared at a time (keep all plates on ice during preparation). Plates can be kept for at least several months at −80 °C.
5. If you have plenty of cells it is sometimes easier to sort into all 384 wells even though you will be using only 96.

Fig. 1 Layout of 384-well plate (columns 1–24, rows A–P). Wells used are labeled in black

6. Plates with sorted cells keep well for at least several months. It is possible to sort cells into several plates, analyze one plate, and based on the results decide if data from more cells is needed. It should be noted that there is some run-to-run bias, so if it is known in advance that data from more than 96 cells is needed, prepare and run the libraries together.
7. After the RT step the RNA is already barcoded, so second strand mix can be added with a single tip (or row of tips if using a multichannel pipette).
8. If you have a magnetic stand for 1.5 ml tubes you can split into three tubes, increasing all volumes twofold in this and the following steps.
9. It is most convenient to run this incubation overnight in a thermocycler that can cool down to 4 °C at the end of the incubation. The sample is stable for several hours at 4 °C.
10. Use of exonuclease removes leftover primers and reduces barcode crossover. If using a different exonuclease than the one indicated, correct the volumes of fragmentation and stop buffer.
11. You can keep some of the amplified RNA to run on a BioAnalyzer for quality control, in which case increase the volume of water used for resuspension to 7 μl. RNA should have a distribution of 200–1000 bases, with a peak close to 500 bases. Concentration can range from a few ng/μl to more than 10 ng/μl.
12. Each library should get a unique barcode if libraries are to be run together on the same lane. Choose balanced primers according to Illumina's pooling guide.
13. The number of cycles can be changed depending on your cells and the efficiency of the amplification. Eleven cycles work in most cases, so it is a good place to start. Do not go above 15 cycles, as this probably means that amplification did not work well enough, or under eight cycles, as the PCR reaction is not only for amplification but is also required for introducing the full length of the adaptors needed for sequencing. The second half of the RT reaction kept at −20 °C can be used if the initial number of cycles chosen does not give a good library concentration.
14. Minimal library concentration for sequencing is 2 nM, with typical libraries above 10 nM. If the concentration is much higher (above 100 nM), consider reducing the number of PCR cycles.
15. Denaturing and diluting the sample should be performed according to Illumina protocols. Concentration for loading should be tested, but at our facility 12 pM for HiSeq rapid mode or MiSeq, or 8 pM for HiSeq high-throughput mode, work well. CEL-Seq works best on HiSeq rapid mode, MiSeq, and HiSeq high-throughput mode using v.3 reagents. When using HiSeq high-throughput mode with v.4 reagents, increase the amount of PhiX added to the sample to 20% for better results.
16. Read 1 is used to identify the UMI (first six bases) and the cell-specific barcode (six additional bases). The second read identifies the specific transcript. Thirty-five bases are enough to identify the transcript, although it is possible to increase the length of the second read if more stringent mapping criteria are wanted. It should be noted that with the relatively tight size distribution usually obtained for CEL-Seq libraries, increasing the length of the second read increases the chance of sequencing into the polyA stretch.
17. A pipeline for CEL-Seq2 analysis can be found in the original publication [12]. It includes the following steps: (1) demultiplexing samples to libraries using the Illumina barcode; (2) demultiplexing libraries to single cells using the CEL-Seq barcode in read 1; (3) for each cell, mapping read 2 to identify the transcript; (4) determining the UMI sequence in read 1; (5) for each transcript, removing reads with identical UMIs (UMI collapse); and (6) counting the remaining transcripts.
18. Recommended sequencing depth is 0.5–1 million reads per cell after demultiplexing. At this sequencing depth most expressed genes should be identified, including lowly expressed genes. For some experiments a lower sequencing depth can be enough, if interested in medium to highly expressed genes. Sequencing more than a million reads does not add much data and can actually increase the percentage of crossover reads (reads with a wrong barcode). Such reads can arise at the second RT step (of the pooled amplified RNA) from leftover CEL-Seq primers. Since they are formed after the IVT step, they are lower in copy number and can be distinguished from real reads by the ratio of reads before and after UMI collapse (for crossover reads this ratio will often be 1).
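Steps (2), (4), and (5) of the analysis outline in Note 17, and the before/after ratio mentioned in Note 18, can be sketched in a few lines. This is not the published pipeline: it is a minimal illustration that assumes read 2 has already been mapped to a gene by some external tool, uses exact UMI matching without error correction, and takes hypothetical reads as input.

```python
from collections import defaultdict

UMI_LEN = 6      # read 1, first six bases (Note 16)
BARCODE_LEN = 6  # read 1, next six bases (Note 16)


def split_read1(read1):
    """Return (umi, cell_barcode) extracted from a read 1 sequence."""
    return read1[:UMI_LEN], read1[UMI_LEN:UMI_LEN + BARCODE_LEN]


def collapse_umis(records, known_barcodes):
    """Count reads and distinct UMIs per (cell barcode, gene).

    records: iterable of (read1_sequence, gene) pairs, where gene is the
    result of mapping read 2 (None if unmapped).
    Returns {(barcode, gene): (reads_before_collapse, umis_after_collapse)}.
    """
    reads = defaultdict(int)
    umis = defaultdict(set)
    for read1, gene in records:
        umi, barcode = split_read1(read1)
        if barcode not in known_barcodes or gene is None:
            continue  # unknown barcode or unmapped read 2
        reads[(barcode, gene)] += 1
        umis[(barcode, gene)].add(umi)
    return {key: (reads[key], len(umis[key])) for key in reads}


# Hypothetical reads: UMI (6 bases) followed by a Table 2 barcode (6 bases).
records = [
    ("ACGTACAGACTC", "geneA"),
    ("ACGTACAGACTC", "geneA"),
    ("ACGTACAGACTC", "geneA"),
    ("TTGCAAAGACTC", "geneA"),
    ("GGATCCAGCTAG", "geneB"),
    ("GGATCCNNNNNN", "geneB"),  # barcode not in the list, discarded
]
counts = collapse_umis(records, known_barcodes={"AGACTC", "AGCTAG"})
for (barcode, gene), (n_reads, n_umis) in sorted(counts.items()):
    print(barcode, gene, n_reads, n_umis, n_reads / n_umis)
```

A ratio of reads to UMIs close to 1 for a given cell and gene, as for the second entry here, is the pattern that Note 18 associates with crossover reads, although a lowly expressed genuine transcript can look the same at shallow depth.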

Acknowledgments

We thank the Technion Genome center for technical assistance.

References

1. Tang F, Barbacioru C, Wang Y, Nordman E, Lee C, Xu N, Wang X, Bodeau J, Tuch BB, Siddiqui A, Lao K, Surani MA (2009) mRNA-Seq whole-transcriptome analysis of a single cell. Nat Methods 6(5):377–382. https://doi.org/10.1038/nmeth.1315
2. Eberwine J, Yeh H, Miyashiro K, Cao Y, Nair S, Finnell R, Zettel M, Coleman P (1992) Analysis of gene expression in single live neurons. Proc Natl Acad Sci U S A 89(7):3010–3014
3. Baron M, Veres A, Wolock SL, Faust AL, Gaujoux R, Vetere A, Ryu JH, Wagner BK, Shen-Orr SS, Klein AM, Melton DA, Yanai I (2016) A single-cell transcriptomic map of the human and mouse pancreas reveals inter- and intra-cell population structure. Cell Syst 3(4):346–360 e344. https://doi.org/10.1016/j.cels.2016.08.011
4. Molinaro AM, Pearson BJ (2016) In silico lineage tracing through single cell transcriptomics identifies a neural stem cell population in planarians. Genome Biol 17:87. https://doi.org/10.1186/s13059-016-0937-9
5. Gladka MM, Molenaar B, de Ruiter H, van der Elst S, Tsui H, Versteeg D, Lacraz GPA, Huibers MMH, van Oudenaarden A, van Rooij E (2018) Single-cell sequencing of the healthy and diseased heart reveals Ckap4 as a new modulator of fibroblasts activation. Circulation 138(2):166–180. https://doi.org/10.1161/CIRCULATIONAHA.117.030742
6. Islam S, Kjallquist U, Moliner A, Zajac P, Fan JB, Lonnerberg P, Linnarsson S (2011) Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Res 21(7):1160–1167. https://doi.org/10.1101/gr.110882.110
7. Ramskold D, Luo S, Wang YC, Li R, Deng Q, Faridani OR, Daniels GA, Khrebtukova I, Loring JF, Laurent LC, Schroth GP, Sandberg R (2012) Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat Biotechnol 30(8):777–782. https://doi.org/10.1038/nbt.2282
8. Shalek AK, Satija R, Shuga J, Trombetta JJ, Gennert D, Lu D, Chen P, Gertner RS, Gaublomme JT, Yosef N, Schwartz S, Fowler B, Weaver S, Wang J, Wang X, Ding R, Raychowdhury R, Friedman N, Hacohen N, Park H, May AP, Regev A (2014) Single-cell RNA-seq reveals dynamic paracrine control of cellular variation. Nature 510(7505):363–369. https://doi.org/10.1038/nature13437
9. Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas AR, Kamitaki N, Martersteck EM, Trombetta JJ, Weitz DA, Sanes JR, Shalek AK, Regev A, McCarroll SA (2015) Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161(5):1202–1214. https://doi.org/10.1016/j.cell.2015.05.002
10. Klein AM, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V, Peshkin L, Weitz DA, Kirschner MW (2015) Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161(5):1187–1201. https://doi.org/10.1016/j.cell.2015.04.044
11. Hashimshony T, Wagner F, Sher N, Yanai I (2012) CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification. Cell Rep 2(3):666–673. https://doi.org/10.1016/j.celrep.2012.08.003
12. Hashimshony T, Senderovich N, Avital G, Klochendler A, de Leeuw Y, Anavy L, Gennert D, Li S, Livak KJ, Rozenblatt-Rosen O, Dor Y, Regev A, Yanai I (2016) CEL-Seq2: sensitive highly-multiplexed single-cell RNA-Seq. Genome Biol 17:77. https://doi.org/10.1186/s13059-016-0938-8
13. Hug H, Schuler R (2003) Measurement of the number of molecules of a single mRNA species in a complex mRNA preparation. J Theor Biol 221(4):615–624
14. Kivioja T, Vaharautio A, Karlsson K, Bonke M, Enge M, Linnarsson S, Taipale J (2011) Counting absolute numbers of molecules using unique molecular identifiers. Nat Methods 9(1):72–74. https://doi.org/10.1038/nmeth.1778
15. Muraro MJ, Dharmadhikari G, Grun D, Groen N, Dielen T, Jansen E, van Gurp L, Engelse MA, Carlotti F, de Koning EJ, van Oudenaarden A (2016) A single-cell transcriptome atlas of the human pancreas. Cell Syst 3(4):385–394 e383. https://doi.org/10.1016/j.cels.2016.09.002

Chapter 5

Single-Cell RNA-Seq by Multiple Annealing and Tailing-Based Quantitative Single-Cell RNA-Seq (MATQ-Seq)

Kuanwei Sheng and Chenghang Zong

Abstract

Single-cell technologies have emerged as advanced tools to study biological processes that demand single-cell resolution. To detect subtle heterogeneity in the transcriptome, high accuracy and sensitivity are still desired for single-cell RNA-seq. We describe here multiple annealing and dC-tailing-based quantitative single-cell RNA-seq (MATQ-seq) with ~90% capture efficiency. In addition, MATQ-seq is a total RNA assay allowing for detection of nonpolyadenylated transcripts.

Key words Single cell, RNA-seq, Total RNA, Transcriptomics, Library preparation, Biotechnology

1 Introduction

The development of new single-cell chemistry and the rapid drop in sequencing cost have greatly propelled the wide application of single-cell RNA-seq. Single-cell transcriptome profiling has been applied to study embryo development [1] and tissue developmental hierarchy [2, 3], identify new cell types and markers [4, 5], characterize the heterogeneity in tissue and tumor [6], and determine the response of single cells to perturbation [7, 8]. These studies revealed amazing complexity in a variety of biological processes and have revolutionized our view of life science.

Multiple methods to profile the single-cell transcriptome have been developed in the field [9–17]. Most of these methods utilize an oligo dT primer that binds to the polyA tail at the 3′ end of mRNA transcripts for first-strand cDNA synthesis. Three approaches are mostly used for second-strand synthesis. CEL-Seq [15, 16] utilizes RNase H to fragment the RNA and uses the fragmented RNA as the primer for second-strand synthesis. SMART-seq [9, 10] takes advantage of the terminal transferase activity of MMLV reverse transcriptase, which predominantly adds a few C bases to the 3′ end of the first-strand cDNA. Template-switching oligos bind to the C's and allow template switching to generate a full amplicon for PCR amplification. Tailing-based methods [11] apply polyA tailing to the 3′ end of the cDNA, catalyzed by terminal transferase (TdT). Another oligo dT primer is then used to bind to the cDNA polyA tail to initiate second-strand synthesis.

However, despite the significant progress made in recent years, previous methods are still limited by several factors. Firstly, the capture efficiency of these methods is only 40–50%, meaning that at least 50% of transcripts cannot be detected in a single cell. Secondly, these methods show strong 3′ bias, as secondary structures in the transcripts can easily form and block the elongation of reverse transcription. Thirdly, due to the limitation of the first-strand synthesis, only polyA transcripts can be characterized. Therefore, these assays are not suitable for studies that demand high sensitivity and accuracy. They are also limited in investigating nonpolyadenylated long noncoding RNA or nascent transcripts. Taking all this into consideration, a more quantitative and sensitive single-cell transcriptome assay is needed to quantify subtle transcription variations among single cells.

We developed a new single-cell RNA-seq assay, multiple annealing and dC-tailing-based quantitative single-cell RNA-seq (MATQ-seq) [18] (Fig. 1). Briefly, after cell lysis, we utilize not only oligo dT but also MALBAC [19] random primers to perform first-strand synthesis. Multiple rounds of thermocycling allow primers to anneal to transcripts multiple times. After first-strand synthesis, primers are digested. A subsequent RNA digestion step releases the first-strand cDNA as single-stranded DNA. Terminal transferase is then used for polyC tailing of the 3′ end of the single-stranded cDNA. For efficient second-strand synthesis, we use MALBAC 6N3G primers, which bind to the polyC tail of the single-stranded DNA. Subsequently, cDNA is amplified by PCR to generate enough DNA to be sequenced. With MALBAC adapters used in both first-strand and second-strand synthesis, primer dimers form stem loops, so that primer-derived by-products are effectively quenched.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_5, © Springer Science+Business Media, LLC, part of Springer Nature 2019

2 Materials

All reagents and tips for the cDNA synthesis step need to be DNase and RNase free. Ultrapure water, low-retention tips, and low-retention PCR tubes are recommended for all steps. Preparing reagents and reactions in an RNase-free hood is recommended. All buffers should be aliquoted and stored at −20 °C. Before experiments, spray the workspace with 70% ethanol and RNaseZap to minimize contamination.

Fig. 1 Reaction scheme of MATQ-seq. After cell lysis, we apply multiple annealing of dT and random primer for total RNA first-strand synthesis. Residual primers and RNA are then digested. Terminal transferase is used to add polyC tailing on the first-strand cDNA. We then use G-rich second-strand primers to perform second-strand synthesis. Samples are then amplified by PCR and are ready for library preparation and sequencing

2.1 cDNA Synthesis

1. UltraPure DEPC-treated water.
2. 10 mM each dNTP.
3. 0.1 M DTT.
4. 10% Triton X-100.
5. 40 U/μL RNaseOUT™ Recombinant Ribonuclease Inhibitor.
6. 200 U/μL Superscript III reverse transcriptase.
7. 5× First-strand reverse transcription buffer.
8. 3 U/μL T4 DNA polymerase.
9. 5 U/μL RNase H.
10. 50 U/μL RNase If.
11. 20 U/μL terminal transferase.
12. 10× TdT terminal transferase buffer.
13. 100 mM dCTP.
14. 2 U/μL Deep Vent® (exo−) DNA polymerase.
15. 10× ThermoPol® Reaction Buffer.
16. iTaq™ universal SYBR® Green supermix.
17. 96-well real-time PCR plate.
18. PCR plate sealing film.
19. Column-based PCR cleanup kit.
20. PAGE-purified primers, diluted into 100 μM stocks:
GAT27dT: GTG AGT GAT GGT TGA GGA TGT GTG GAG NNNNN TTTTTTTTTTTT.
GAT27 5N3G: GTG AGT GAT GGT TGA GGA TGT GTG GAG NNNNN GGG.
GAT27 5N3T: GTG AGT GAT GGT TGA GGA TGT GTG GAG NNNNN TTT.
GAT21 6N3G: GAT GGT TGA GGA TGT GTG GAG NNNNNN GGG.
GAT27 PCR: GTG AGT GAT GGT TGA GGA TGT GTG GAG.
3NGAT24: NNN AGT GAT GGT TGA GGA TGT GTG GAG.

2.2 Library Preparation

1. NEBNext End Repair Enzyme Mix.
2. NEBNext End Repair Reaction Buffer.
3. Klenow Fragment (3′→5′ exo−).
4. 100 mM dATP.
5. 10× NEB Buffer #2.
6. 2× Quick Ligase Buffer.
7. Quick Ligation™ Kit.
8. KAPA HiFi HotStart ReadyMix.
9. Qubit™ dsDNA HS Assay Kit.
10. Agencourt AMPure XP beads.
11. 80% ethanol.
12. Magnetic stand for bead purification.
13. Duplex-specific nuclease.
14. 10× duplex-specific nuclease buffer.
15. TruSeq adapters and Illumina PCR primers, diluted into 100 μM stocks (see Note 1).
16. Sonicator or Fragmentase.
17. KAPA library quantification kit.
18. NanoDrop spectrophotometer.
19. Qubit™ fluorometer.

3 Methods

3.1 cDNA Synthesis

3.1.1 Cell Lysis

1. Prepare the lysis mix containing, for each sample: 1 μL 0.2% Triton X-100 in UltraPure DEPC-treated H2O, 0.05 μL 0.1 M DTT, 0.05 μL RNaseOUT, 0.4 μL primer mix (PAGE-purified GAT27dT 1.5 μM, GAT27 5N3G 5 μM, GAT27 5N3T 5 μM), and 0.1 μL 10 mM dNTP.
2. Flow-sort or mouth-pipet single cells into PCR tubes containing lysis mix, briefly centrifuge the tubes, then put the samples on a PCR thermocycler. Heat the samples at 72 °C for 3 min, then quickly put the tubes on ice for 1 min.
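Because the per-cell volumes in the lysis mix are well below 1 μL, the mix is in practice prepared as a master mix for many cells. The following is a minimal sketch of that scaling, not part of the published protocol: the function name `scale_master_mix`, the 10% overage, and the example of 20 cells are illustrative assumptions, while the per-cell volumes are those listed in step 1.

```python
# Per-cell lysis mix volumes in microliters, taken from step 1 of Subheading 3.1.1.
LYSIS_MIX_UL = {
    "0.2% Triton X-100": 1.0,
    "0.1 M DTT": 0.05,
    "RNaseOUT": 0.05,
    "primer mix": 0.4,
    "10 mM dNTP": 0.1,
}


def scale_master_mix(per_sample_ul, n_samples, overage=0.10):
    """Scale per-sample volumes to a master mix.

    The overage (assumed 10% here) compensates for pipetting loss;
    it is an illustrative choice, not a value from the protocol.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    factor = n_samples * (1.0 + overage)
    return {name: round(vol * factor, 3) for name, vol in per_sample_ul.items()}


# Volume dispensed into each PCR tube before a cell is added.
per_cell_total = round(sum(LYSIS_MIX_UL.values()), 3)

# Hypothetical run of 20 cells.
master = scale_master_mix(LYSIS_MIX_UL, n_samples=20)
for name, vol in master.items():
    print(f"{name}: {vol} uL")
```

The same helper can be reused for the RT, TdT, and second-strand mixes by passing the corresponding per-sample volumes.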

3.1.2 Reverse Transcription (See Note 2)

1. Add the RT mix containing: 0.8 μL 5× first-strand synthesis buffer, 0.2 μL 0.1 M DTT, 0.1 μL RNaseOUT, 0.15 μL Superscript III, and 1.15 μL UltraPure DEPC-treated H2O.
2. Mix well by flicking the tubes 4–5 times, then briefly centrifuge the tubes.
3. Place the tubes on a PCR thermocycler. Run the following program:
   10 cycles of
     8 °C 12 s.
     15 °C 45 s.
     20 °C 45 s.
     30 °C 30 s.
     42 °C 2 min.
     50 °C 3 min.
   End cycle.
   50 °C 15 min.
   4 °C Forever.
   The samples can be stored at 4 °C overnight.

3.1.3 Digestion of Primers (See Note 3)

1. Put the tubes on a PCR thermocycler. Incubate the tubes at 50 °C for 1 min.
2. Add 0.2 μL T4 DNA polymerase, mix well by flicking the tubes 4–5 times, then briefly centrifuge the tubes.
3. Place the samples on a PCR thermocycler. Continue with the following program:
   37 °C 40 min.
   75 °C 20 min.
   4 °C Forever.

3.1.4 RNA Digest (See Note 4)

1. Mix 0.1 μL RNase H with 0.1 μL RNase If, add the enzyme mix into the samples, mix well by flicking the tubes 4–5 times, then briefly centrifuge the tubes.
2. Put the samples on a PCR thermocycler. Run the following program:
   37 °C 15 min.
   72 °C 15 min.
   4 °C Forever.

3.1.5 Tailing

1. Prepare the following TdT mix for each sample: 0.4 μL 10× TdT buffer, 0.4 μL 100 mM dCTP, 0.1 μL TdT terminal transferase, and 3.1 μL H2O.
2. Heat the samples to 72 °C for 1 min, then pause the thermocycler at 37 °C. Add the TdT mix into the samples at 37 °C, mix well by flicking the tubes 4–5 times, and briefly centrifuge the tubes.
3. Place the samples on a PCR thermocycler. Continue with the following program:
   37 °C 15 min.
   72 °C 15 min.
   4 °C Forever.

3.1.6 Second-Strand Synthesis (See Note 5)

1. Prepare the following second-strand mix for each sample: 1.5 μL 10× ThermoPol® buffer, 1.25 μL 10 mM each dNTP, 0.125 μL 100 μM GAT21 6N3G, and 13.1 μL H2O.
2. Add the mix into each sample, then put the tubes on a PCR thermocycler. Run the following program:
   80 °C 30 s.
   48 °C hold for at least 20 s, add 0.4 μL Deep Vent exo− DNA polymerase, mix well by flicking the tubes 4–5 times, briefly centrifuge the tubes, then put them back on the PCR thermocycler.
3. Continue with the following program:
   10 cycles of
     48 °C 20 s.
     72 °C 1 min.
   End cycles.
   72 °C 2 min.
   4 °C Forever.
   The samples can be stored at 4 °C overnight.

Fig. 2 (a) The amplification curves of 20 HEK293T single cells (panel "Amplification": RFU versus cycles). The mean Cq value of the cells is 18.43 ± 0.50. (b) The melting curve of 20 HEK293T single cells (panel "Melt Peak": −d(RFU)/dT versus temperature in °C). The melting peak temperature is 84.5–85 °C

3.1.7 (Optional) Real-Time PCR for Preamplification Quality Control (See Note 6)

1. Prepare the following real-time PCR mix in a real-time PCR plate: 5 μL iTaq™ universal SYBR® Green supermix, 0.05 μL 100 μM GAT27 PCR primer, 0.5 μL second-strand product, and 4.5 μL H2O.
2. Seal the plate with a sealing film. Make sure that the film is appropriately attached. Centrifuge the plate at 500 × g for 30 s.
3. Place the plate on a real-time PCR thermocycler. Run the following program:
   95 °C 2 min.
   30 cycles of
     95 °C 15 s.
     63 °C 20 s.
     72 °C 2 min.
     Read plate.
   End cycle.
   72 °C 3 min.
   Melting curve.
   End.
4. Analyze the data based on the quantitation cycle (Cq) value and the melting curve (Fig. 2). Use baseline correction and a regression model for real-time PCR analysis. Proceed with high-quality samples.
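The quality-control logic described in Note 6 can be sketched as a simple classifier. This is an illustration under stated assumptions rather than the authors' analysis: the function name `classify_cell`, the status labels, and the `high_cq_margin` of 2 cycles are hypothetical (the protocol only says "too high"), whereas the 1-cycle doublet margin and the 84.5–85 °C human melting peak come from Note 6 and Fig. 2.

```python
from dataclasses import dataclass


@dataclass
class QcCall:
    sample: str
    status: str


def classify_cell(sample, cq, melt_peak_c, expected_cq,
                  melt_min_c=84.5, low_cq_margin=1.0, high_cq_margin=2.0):
    """Flag a single-cell sample from its Cq and melting peak.

    low_cq_margin follows Note 6 (Cq 1-2 cycles below expectation suggests
    a doublet). high_cq_margin is an assumed cutoff for "too high" and
    should be tuned per experiment. melt_min_c is the lower bound reported
    for fresh human samples; use about 83 for mouse.
    """
    if cq <= expected_cq - low_cq_margin:
        return QcCall(sample, "possible doublet")
    if cq >= expected_cq + high_cq_margin:
        return QcCall(sample, "possible degradation (high Cq)")
    if melt_peak_c < melt_min_c:
        return QcCall(sample, "possible degradation (low melt peak)")
    return QcCall(sample, "pass")


# Hypothetical readings; 18.43 is the mean Cq reported in Fig. 2.
calls = [
    classify_cell("cell_01", 18.4, 84.8, expected_cq=18.43),
    classify_cell("cell_02", 17.0, 84.8, expected_cq=18.43),
    classify_cell("cell_03", 21.0, 84.6, expected_cq=18.43),
    classify_cell("cell_04", 18.5, 83.0, expected_cq=18.43),
]
for call in calls:
    print(call.sample, call.status)
```

In practice the expected Cq would be estimated from the batch itself, for example as the median Cq of all cells on the plate.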

3.1.8 Amplification (See Note 7)

1. Prepare the following PCR mix for each sample: 13 μL 10× ThermoPol® Buffer, 0.8 μL 100 μM GAT27 PCR primer, 3 μL 10 mM each dNTP, 108 μL H2O, and 3 μL Deep Vent exo− DNA polymerase.
2. Mix the PCR mix and second-strand product in a 1.7 mL Eppendorf tube and split the samples into four PCR tubes.
3. Place the tubes on a PCR thermocycler. Run the following program:
   95 °C 30 s.
   24 cycles of
     95 °C 15 s.
     63 °C 20 s.
     72 °C 2 min.
   End cycle.
   72 °C 3 min.
   4 °C Forever.
4. Purify the amplified product with a column-based PCR purification kit (perform the experiments according to the kit protocol) into 40 μL H2O.
5. Measure the concentration of the samples using NanoDrop. The samples can be stored at −20 °C for at least 1 year. Figure 3 shows the size distribution of the cDNA for a typical fresh single cell.

Fig. 3 Size distribution of amplified cDNA. A TapeStation image of single-cell cDNA after amplification (sample intensity [FU] versus size [bp]). The spikes at ~500 and 400 bp come from random primer specific binding to ribosomal RNA. The size ranges from 150 bp to 2 kb

3.2 Library Preparation

3.2.1 3NGAT24 Double-Strand Conversion (See Note 8)

1. For each sample, prepare the following reaction mix: 100 ng of amplified cDNA, 5 μL 10× ThermoPol® buffer, 1.25 μL of 10 mM each dNTP, 0.25 μL of 100 μM 3NGAT24 primer, and 1 μL Deep Vent exo− DNA polymerase. Add H2O to adjust the volume of the reaction to 50 μL.
2. Place the samples on a PCR thermocycler, then run the following program:
   95 °C 30 s.
   20 cycles of
     62 °C 20 s.
     72 °C 1 min.
   End cycles.
   72 °C 3 min.
   95 °C 30 s.
   20 cycles of
     62 °C 20 s.
     72 °C 1 min.
   End cycles.
   72 °C 3 min.
   4 °C Forever.
3. To purify the samples, add 60 μL (1.2×) of AMPure XP beads into each sample and incubate at room temperature for 5 min. Place the samples on a magnetic stand for 1 min. Discard the supernatant.
4. Wash the beads twice with 100 μL 80% ethanol. (Do not disturb the beads.) Discard the supernatant. Incubate the beads at room temperature for 15 min to dry the beads.
5. Add 30 μL water to elute. Vortex and briefly centrifuge the samples. Incubate the samples at room temperature for 2 min. Place the samples back on the magnetic stand for 1 min. Transfer the purified product to a new Eppendorf tube.
6. Measure the concentration of samples using NanoDrop. The samples can be stored at −20 °C for up to 6 months.

3.2.2 Shearing (See Note 9)

Shear 200–400 ng of each sample to a peak length of 200 bp. Here we use a Covaris S220 sonicator as an example.

1. Fill the water tank with deionized water.
2. Degas the water tank for at least 30 min. Cool the water tank to 4–6 °C.
3. Dilute 200–400 ng of samples to 100 μL, and add the samples to Covaris microtubes.
4. Sonicate the samples to a peak length of 200 bp (peak incident power: 175 W, duty factor: 10%, cycles per burst: 200, treatment time: 200 s). Put the samples on ice.

3.2.3 End Repair

1. For each reaction, mix the following components in Eppendorf tubes: 43 μL sheared DNA, 5 μL 10× NEBNext End Repair Reaction Buffer, and 2 μL NEBNext End Repair Enzyme Mix.
2. Incubate the samples at room temperature for 30 min.

3.2.4 Size Selection (for 200 bp, See Note 10)

1. Add 0.75× (37.5 μL) AMPure XP beads into each tube. Incubate the samples at room temperature for 5 min. Place the samples on a magnetic stand. Wait until the supernatant becomes clear. Discard the beads and transfer the supernatant to new Eppendorf tubes.
2. Add 0.45× (22.5 μL) AMPure XP beads into the supernatant tubes. Incubate at room temperature for 5 min. Discard the supernatant. Wash the beads twice with 80 μL 80% ethanol. Wash gently. Do not disturb the beads. Place the samples on a magnetic stand. Dry the beads to evaporate ethanol.
3. Add 18 μL water, vortex, and incubate at room temperature for 2 min. Transfer the end-repaired samples to new tubes. The samples can be stored at 4 °C overnight.
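The bead volumes in this double-sided size selection are both expressed relative to the 50 μL end-repair reaction. The sketch below makes that arithmetic explicit; the function name `double_sided_bead_volumes` is hypothetical, and the helper only reproduces the ratios given in steps 1 and 2 so that they can be rescaled for a different input volume.

```python
def double_sided_bead_volumes(sample_ul, first_ratio=0.75, second_ratio=0.45):
    """Return (first_cut_ul, second_cut_ul, cumulative_ratio).

    Both ratios are taken relative to the original sample volume, as in
    the protocol (0.75x = 37.5 uL and 0.45x = 22.5 uL for a 50 uL sample).
    The first cut removes large fragments with the beads; the second cut
    binds the target fragments from the supernatant.
    """
    if sample_ul <= 0:
        raise ValueError("sample_ul must be positive")
    first = sample_ul * first_ratio
    second = sample_ul * second_ratio
    return first, second, first_ratio + second_ratio


first_ul, second_ul, cumulative = double_sided_bead_volumes(50.0)
print(f"first cut: {first_ul} uL, second cut: {second_ul} uL, "
      f"cumulative ratio: {cumulative:.2f}x")
```

For the 50 μL reaction this gives 37.5 μL and 22.5 μL, a cumulative bead-to-sample ratio of 1.2×.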

3.2.5 A-Tailing

1. For each sample, mix the following components in Eppendorf tubes: 17.5 μL end-repaired product, 2 μL 10× NEB Buffer #2, 0.1 μL 100 mM dATP, and 0.5 μL 5 U/μL Klenow (3′→5′ exo−) DNA polymerase.
2. Incubate the samples at room temperature for 30 min.
3. Use 1.5× AMPure beads (30 μL) to purify the DNA into 13 μL water.

3.2.6 Ligation

1. Anneal the universal adapters (or I5 index adapters) with the I7 index adapters by adding 2 μL of each 100 μM stock to PCR tubes containing 16 μL of water. Heat the tubes to 90 °C for 30 s, then ramp down to 25 °C at a rate of −0.1 °C/s. Place the tubes on ice. The annealed adapters can be stored at −20 °C for up to a year.
2. Mix the following components in Eppendorf tubes: 12 μL A-tailed DNA, 12.5 μL 2× Quick Ligase Buffer, 0.5 μL 10 μM annealed TruSeq adapters, and 0.5 μL Quick Ligase (2000 U/μL).
3. Incubate the samples at room temperature for 20 min. Transfer the samples to ice. Add 5 μL 0.5 M EDTA, pH 8.0, to each sample.
4. Purify the samples with 1× (30 μL) AMPure XP beads. Elute ligated samples in 25 μL water. The samples can be stored at −20 °C for up to 6 months.
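The 10 μM concentration of the annealed adapters in step 2 follows from the dilution in step 1 (2 μL of each 100 μM stock in a total of 20 μL). A minimal sketch of this dilution arithmetic is given below; the helper name `final_concentration` is hypothetical, and the calculation assumes that each annealed duplex consumes one molecule of each adapter strand.

```python
def final_concentration(stock_conc, stock_vol, total_vol):
    """C1 * V1 = C2 * V2, solved for C2 (units follow the inputs)."""
    if total_vol <= 0:
        raise ValueError("total_vol must be positive")
    return stock_conc * stock_vol / total_vol


# Step 1: 2 uL of each 100 uM adapter plus 16 uL water.
annealing_total_ul = 2.0 + 2.0 + 16.0
annealed_um = final_concentration(100.0, 2.0, annealing_total_ul)

# Step 2: 0.5 uL annealed adapter in the ligation reaction.
ligation_total_ul = 12.0 + 12.5 + 0.5 + 0.5
adapter_in_ligation_um = final_concentration(annealed_um, 0.5, ligation_total_ul)

print(f"annealed adapter: {annealed_um:.1f} uM")
print(f"adapter in ligation: {adapter_in_ligation_um:.3f} uM")
```

The adapter concentration in the 25.5 μL ligation reaction is therefore about 0.2 μM.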

3.2.7 KAPA Amplification (See Note 11)

1. For each sample, mix the following components in PCR tubes: 8 μL ligated product, 10 μL KAPA HiFi ready mix, 1 μL 10 μM each Illumina PCR primers, and 1 μL water.
2. Place the samples on a PCR thermocycler. Run the following program:
   95 °C 2 min.
   5 cycles of
     98 °C 15 s.
     60 °C 15 s.
     72 °C 35 s.
   End cycles.
   72 °C 1 min.
   4 °C Forever.
3. Use 1× (20 μL) AMPure XP beads to purify the samples into 20 μL water.
4. Measure the concentration of libraries using the Qubit dsDNA HS Assay Kit. Check the size of the library by 2% agarose gel, Bioanalyzer, or TapeStation. A typical library size is shown in Fig. 4. The samples can be stored at −20 °C for up to 6 months.

Fig. 4 Size distribution of a MATQ-seq library. A TapeStation image of single-cell cDNA after library preparation (sample intensity [FU] versus size [bp]). The peak is at ~350 bp with most of the product ranging from 200 to 600 bp

3.2.8 (Optional) Duplex-Specific Nuclease (DSN) Treatment (See Note 12)

1. Pool all the amplified libraries to a total of 100 ng in a PCR tube. Dilute the pooled libraries with water to a total volume of 18 μL. Add 2 μL 10× DSN buffer to the samples.
2. Place the samples on a PCR thermocycler. Run the following program:
   95 °C 30 s.
   80 °C 3 h.
3. Add 1 μL of preheated DSN to each pooled sample at 80 °C on the thermocycler. Incubate the sample at 80 °C for 15 min.
4. Add 4 μL of preheated 50 mM EDTA to each pooled sample at 80 °C. Transfer the tubes to ice.
5. Purify the libraries with 1× AMPure beads (25 μL) and elute each sample into 20 μL water. The samples can be stored at −20 °C for up to 6 months.
6. Prepare the following reaction in PCR tubes: 9 μL of DSN product, 1 μL of 10 μM each Illumina PCR primer, and 10 μL of 2× KAPA HiFi ready mix.
7. Place the samples on a PCR thermocycler. Run the following program:
   95 °C 2 min.
   6 cycles of
     98 °C 15 s.
     60 °C 15 s.
     72 °C 35 s.
   End cycles.
   72 °C 1 min.
   4 °C Forever.
8. Purify the samples with 1× beads (20 μL) and elute the products into 20 μL water. Measure the concentration of libraries using the Qubit dsDNA HS Assay Kit. The samples can be stored at −20 °C for up to 6 months.

3.2.9 Library Quantification and Sequencing (See Note 13)

1. Quantify the library using the KAPA library quantification kit. Calculate the library concentration according to the kit protocol.
2. Dilute and load the library on the sequencer according to the Illumina sequencing protocol. For sequencing depth, we recommend having more than five million reads per cell.

3.2.10 Data Analysis (See Note 14)

To analyze the sequencing data, first use the Skewer package to trim the adapters and unique molecular identifiers (UMIs) from raw reads. Map the trimmed reads to the reference genome using either TopHat or STAR. The HTSeq package can then be used to count the number of reads mapped to genes. Parse the UMIs to the reads and count the number of UMIs of each gene. We recommend using amplicons per million amplicons (APM) to normalize the gene expression level: gene expression level = (the number of unique amplicon indexes of a gene / the total number of indexes of all genes) × 1,000,000. Downstream data analysis can be performed in MATLAB or R.
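The APM normalization defined above can be sketched in a few lines. This is an illustration rather than the authors' script (see Note 14 for those): the function name `amplicons_per_million`, the gene names, and the counts are hypothetical, and the input is assumed to be a mapping from gene to its number of unique amplicon indexes (UMIs) in one cell.

```python
def amplicons_per_million(umi_counts):
    """Normalize per-gene unique-UMI counts of one cell to APM.

    APM = (unique amplicon indexes of a gene /
           total indexes of all genes) * 1,000,000
    """
    total = sum(umi_counts.values())
    if total == 0:
        raise ValueError("cell has no UMI counts")
    return {gene: count * 1_000_000 / total for gene, count in umi_counts.items()}


# Hypothetical unique-UMI counts for a single cell.
counts = {"GAPDH": 50, "ACTB": 30, "MALAT1": 20}
apm = amplicons_per_million(counts)
for gene, value in apm.items():
    print(f"{gene}: {value:.0f} APM")
```

By construction the APM values of a cell sum to one million, which makes expression levels comparable between cells with different numbers of detected amplicons.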

4 Notes

1. TruSeq adapter sequences can be found on the Illumina website. Depending on the number of samples and total sequencing reads, either TruSeq Single Index adapters or TruSeq HT adapters with combinatorial dual indexes can be used. Anneal the TruSeq universal adapter or I5 adapters with I7 adapters before use. It is highly recommended to use a phosphorothioate bond for the last base, between C and T, of the TruSeq universal adapter (or I5 primers for dual index). The phosphorothioate bond minimizes degradation of the protruding T after annealing, preventing the formation of adapter dimers.
2. The ten cycles of ramping from 8 to 50 °C maximize random primer binding to, and extension on, the transcripts. It is important to use Superscript III reverse transcriptase as it has high thermostability and low terminal transferase activity.
3. Exonuclease I can also be used for primer digestion. It requires 80 °C for 20 min for inactivation, a condition that is not ideal for the semiamplicons. T4 DNA polymerase also has high exonuclease activity and is more thermolabile; it can be inactivated at 75 °C.
4. The combination of RNase H and RNase If promotes the efficient release of single-stranded cDNA from the RNA–DNA hybrids. Terminal transferase has higher efficiency on single-stranded DNA. For efficient second-strand synthesis, RNA digestion is crucial.
5. We recommend using Deep Vent exo− DNA polymerase for second-strand synthesis. Deep Vent DNA polymerase has a low Km value, leading to highly efficient primer annealing. An exo− DNA polymerase is required, as we noticed that exo+ DNA polymerase leads to severe formation of junk product.
6. We use real-time PCR to filter out cells of low quality or doublets. If the Cq value of a sample is too high, it could be a degraded sample. If the Cq is 1–2 cycle(s) lower than the value expected, the sample could be a doublet or triplet. The melting peak is another key parameter. For fresh human samples, based on our experience, the melting peak is 84.5–85 °C. For fresh mouse samples, the melting peak is likely to be 83–83.5 °C. A lower melting peak could indicate sample degradation. The difference in melting peaks is likely due to differences between human and mouse ribosomal RNA.
7. We designed the first strand and second strand to have the same sequence at the 5′ end. This leads to loop-based amplification. The smaller the loop is, the harder it is for the amplicon to be amplified. Therefore, the residual small primer dimers will have low amplification efficiency. The typical yield of a normal HEK293T cell after 24 cycles of amplification is 800 ng to 2 μg.
8. We noticed that 50% of our sequencing reads start with the GAT adapter sequence. The Illumina sequencers use the first five bases of read 1 to generate clusters. If the sequence has low diversity, the cluster generation could be problematic. 3NGAT24 double-strand conversion can increase the sequence diversity of the first few bases of read 1, leading to better cluster generation during sequencing. The typical yield after double-strand conversion is 300–400 ng.
9. Bioruptor Pico is an alternative for sonication. Enzymatic shearing with dsDNA Fragmentase can also be used.
10. For different sequencing read lengths, the size selection and shearing can be adjusted accordingly.
11. Measuring the concentration of libraries with the Qubit HS kit is recommended, as NanoDrop is not as accurate. The typical yield of the library is 100–300 ng.
12. Since MATQ-seq is a total RNA approach, ribosomal reads make up more than 60% of the reads. Duplex-specific nuclease (DSN) treatment [20] removes 80–90% of the ribosomal reads. DSN treatment yields a higher mapping rate to genes while having only a minor effect on gene expression levels. Preheating the enzyme on the PCR block and adding it to the samples on the PCR block are important, as a lower temperature leads to lower yield. The typical yield of the library after KAPA amplification is 20–200 ng. In cases where ribosomal sequencing is needed, this step can be skipped.
13. For sequencing on Illumina platforms, size correction is a very important step in library quantification. Overestimation of size causes overloading, leading to poor data quality, while underestimation of size causes underloading, leading to low sequencing depth. For MATQ-seq, with size selection of 200 bp before adapter ligation, we recommend using 380 bp as the library size to adjust the concentration.
14. Please refer to the Methods section of the MATQ-seq paper [18] for details and scripts.
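The effect of the assumed library size described in Note 13 can be illustrated with the generic mass-to-molarity conversion for double-stranded DNA. This sketch is not the calculation from the KAPA kit protocol: it assumes the common approximation of 660 g/mol per base pair, and the function name `library_molarity_nm` and the 10 ng/μL example concentration are hypothetical.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # common approximation for one dsDNA base pair


def library_molarity_nm(conc_ng_per_ul, mean_size_bp):
    """Convert a dsDNA mass concentration (ng/uL) to nanomolar."""
    if mean_size_bp <= 0:
        raise ValueError("mean_size_bp must be positive")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * mean_size_bp)


# Hypothetical 10 ng/uL library, using the 380 bp recommended in Note 13.
at_380 = library_molarity_nm(10.0, 380)

# The same library if the size were overestimated as 450 bp.
at_450 = library_molarity_nm(10.0, 450)

print(f"380 bp: {at_380:.2f} nM, 450 bp: {at_450:.2f} nM")
```

Overestimating the size gives a lower apparent molarity, so more library is loaded than intended, which is the overloading that Note 13 warns about.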

Acknowledgments

This work is supported by the McNair Scholarship and the McNair Single Cell Initiative. We are grateful to the McNair family and Dr. Neblett for their kind support. We would like to thank Wenjian Cao, Yichi Niu, Dr. Zhiying Hu, and Dr. Yanhua Zhao for their contribution to the development of MATQ-seq.

References

1. Scialdone A, Tanaka Y, Jawaid W et al (2016) Resolving early mesoderm diversification through single-cell expression profiling. Nature 535(7611):289–293
2. Wagner DE, Weinreb C, Collins ZM et al (2018) Single-cell mapping of gene expression landscapes and lineage in the zebrafish embryo. Science 360(6392):981–987
3. Treutlein B, Brownfield DG, Wu AR et al (2014) Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq. Nature 509(7500):371–375
4. Habib N, Li Y, Heidenreich M et al (2016) Div-Seq: single-nucleus RNA-Seq reveals dynamics of rare adult newborn neurons. Science 353(6302):925–928
5. Lake BB, Ai R, Kaeser GE et al (2016) Neuronal subtypes and diversity revealed by single-nucleus RNA sequencing of the human brain. Science 352(6293):1586–1590
6. Venteicher AS, Tirosh I, Hebert C et al (2017) Decoupling genetics, lineages, and microenvironment in IDH-mutant gliomas by single-cell RNA-seq. Science 355(6332):pii:eaai8478
7. Jaitin DA, Weiner A, Yofe I et al (2016) Dissecting immune circuits by linking CRISPR-pooled screens with single-cell RNA-seq. Cell 167(7):1883–1896
8. Dixit A, Parnas O, Li B et al (2016) Perturb-seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167(7):1853–1866
9. Picelli S, Bjorklund AK, Faridani OR et al (2013) Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat Methods 10(11):1096–1098
10. Ramskold D, Luo S, Wang YC et al (2012) Full-length mRNA-seq from single-cell levels of RNA and individual circulating tumor cells. Nat Biotechnol 30(8):777–782
11. Tang F, Barbacioru C, Wang Y et al (2009) mRNA-seq whole-transcriptome analysis of a single cell. Nat Methods 6(5):377–382
12. Fan HC, Fu GK, Fodor SP (2015) Expression profiling. Combinatorial labeling of single cells for gene expression cytometry. Science 347(6222):1258367
13. Klein AM, Mazutis L, Akartuna I et al (2015) Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161(5):1187–1201
14. Macosko EZ, Basu A, Satija R et al (2015) Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161(5):1202–1214
15. Hashimshony T, Wagner F, Sher N et al (2012) CEL-seq: single-cell RNA-Seq by multiplexed linear amplification. Cell Rep 2(3):666–673
16. Hashimshony T, Senderovich N, Avital G et al (2016) CEL-seq2: sensitive highly-multiplexed single-cell RNA-seq. Genome Biol 17:77
17. Islam S, Zeisel A, Joost S et al (2014) Quantitative single-cell RNA-seq with unique molecular identifiers. Nat Methods 11(2):163–166
18. Sheng K, Cao W, Niu Y et al (2017) Effective detection of variation in single-cell transcriptomes using MATQ-seq. Nat Methods 14(3):267–270
19. Zong C, Lu S, Chapman AR et al (2012) Genome-wide detection of single-nucleotide and copy-number variations of a single human cell. Science 338(6114):1622–1626
20. Vandernoot VA, Langevin SA, Solberg OD et al (2012) cDNA normalization by hydroxyapatite chromatography to enrich transcriptome diversity in RNA-seq applications. BioTechniques 53(6):373–380

Chapter 6

Single-Cell RNA Sequencing with Drop-Seq

Josephine Bageritz and Gianmarco Raddi

Abstract

Drop-Seq is a low-cost, high-throughput platform to profile thousands of cells by encapsulating them into individual droplets. Uniquely barcoded mRNA capture microparticles and cells are coconfined through a microfluidic device within the droplets, where they undergo cell lysis and RNA hybridization. After breaking the droplets and pooling the hybridized particles, reverse transcription, PCR, and sequencing in single reactions allow the generation of data from thousands of single-cell transcriptomes while maintaining information on the cellular origin of each transcript.

Key words Drop-seq, Single-cell RNA-sequencing, Systems biology, Transcriptomics, Droplet technology, Cell barcoding, Microfluidics

1 Introduction

Single-cell RNA-seq allows biological investigators to tap the transcriptome of individual cells to analyze which cell types they belong to, what their lineage is, and what transcriptional modules they possess [1, 2]. In a short amount of time, single-cell RNA-seq (scRNA-seq) has become a popular tool used in all branches of biology [3]. Protocol improvements and technological advancements have led to an exponential growth in the number of cells being analyzed, allowing in turn for the discovery of even rarer subtypes [3, 4]. Manual selection [5], FACS sorting [6], and microfluidic circuits led the way [7], but it was only with droplet-based techniques such as Drop-seq that thousands of individual cells could be routinely profiled in each experiment [8–10].

Drop-seq involves coencapsulation of single cells and barcoded microparticles in nanoliter-sized droplets [9]. Cells and beads are strongly diluted such that only a few droplets will contain a bead or a cell, or both, enabling a very low doublet rate but also resulting in a lower cell capture efficiency (number of captured cells/number of cells used). Once cells are encapsulated within the droplet, they are immediately lysed, which causes the release of polyadenylated mRNA transcripts. Breaking the droplets and reverse transcription of their mRNA forms covalent and stable STAMPs (single-cell transcriptomes attached to microparticles). Exonuclease treatment is then applied to remove bead primers that have not captured an mRNA molecule, after which cDNA amplification, library construction, and sequencing are performed [9].

This droplet-based microfluidics technology can be set up in a cost-effective manner in each laboratory. Droplet technologies in general cut the cost of library preparation per cell tenfold compared to FACS approaches such as Smart-seq2. However, only the 3′-end of the transcript can be sequenced, thus losing information that alternative full-coverage techniques can provide [1]. All in all, Drop-seq is a flexible and cost-effective method that allows many laboratories to assess the transcriptome of single cells on a large scale. Furthermore, it can easily be modified to suit individual needs (DroNcSeq [11], CITE-Seq [12]).

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_6, © Springer Science+Business Media, LLC, part of Springer Nature 2019

2 Materials

2.1 Cell Encapsulation and STAMP Generation

It is recommended to prepare all buffers/reagents using ultrapure water and molecular biology grade reagents. Store them at room temperature, unless indicated otherwise.

1. Three infuse-only syringe pumps (e.g., KD Scientific Legato 100).
2. Inverted microscope.
3. Multi Stirrus magnetic tumble stirrer (V&P Scientific).
4. Accessory kit for positioning the Multi Stirrus (V&P Scientific).
5. Mixing disc (e.g., V&P Scientific).
6. Aquapel-coated PDMS microfluidics device (e.g., FlowJEM).
7. Luer-lock syringe needles, 26 G.
8. 3 and 10 ml luer-lock syringes.
9. Low-density polyethylene (LDPE) tubing (0.38 mm ID, 1.09 mm OD, e.g., VWR).
10. Barcoded mRNA capture beads (MACOSKO-2011-10, ChemGenes).
11. Ethanol, absolute.
12. Fuchs-Rosenthal hemocytometer.
13. Cell strainer with 100 μm pores.
14. Cell lysis buffer: 6% Ficoll PM-400, 0.2% Sarkosyl, 20 mM EDTA, 200 mM Tris pH 7.5 in water. Filter the buffer using a 0.22 μm pore size filter to prevent clogging of the microfluidics device.
15. Cell lysis buffer complete: Cell lysis buffer (prepared in the previous step), 50 mM DTT (store at −20 °C, avoid repeated freezing and thawing cycles). Prepare fresh before use.
16. Droplet generation oil (e.g., Bio-Rad). Filter the oil with a 0.22 μm pore size filter to prevent clogging of the microfluidics device.
17. SSC (saline sodium citrate) buffer (6×).
18. Perfluorooctanol (PFO; Sigma, see Note 1).
19. TE (Tris-EDTA)-SDS buffer: 10 mM Tris pH 8.0, 1 mM EDTA, 0.5% SDS.
20. TE (Tris-EDTA)-Tween buffer: 10 mM Tris pH 8.0, 1 mM EDTA, 0.01% Tween-20. For preparing the barcoded beads, filter the buffer using a 0.22 μm pore size filter to prevent clogging of the microfluidics device.
21. 10 mM Tris–HCl buffer pH 8.0.
22. Template Switch Oligo (TSO): 5′-AAGCAGTGGTATCAACGCAGAGTGAATrGrGrG-3′.
23. Reverse transcription (RT) mix: 1× RT buffer, 4% Ficoll PM-400, 1 mM dNTPs, 200 U NxGen® RNase Inhibitor (Lucigen), 2.5 μM Template Switch Oligo, and Maxima H Minus RTase, in water. Make 200 μl for up to 90,000 beads. For 200 μl RT mix, add 75 μl of water, 40 μl Maxima 5× RT buffer, 40 μl 20% Ficoll PM-400, 20 μl 10 mM dNTPs, 5 μl RNase Inhibitor, and 10 μl 50 μM Template Switch Oligo. Prepare fresh and keep on ice until use. Add Maxima H Minus RTase just prior to starting with the RT (for the 200 μl RT mix add 10 μl enzyme).
24. Exonuclease mix: 1× Exo I buffer and 200 U Exonuclease I, in water. Make 200 μl for up to 90,000 beads. For 200 μl Exonuclease mix, add 170 μl water, 20 μl 10× Exo I buffer, and 10 μl Exonuclease I. Prepare fresh on ice.
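The final concentrations stated for the RT mix in item 23 follow directly from the listed stock concentrations and volumes. The short check below is a sketch for illustration only; the variable names are hypothetical and the table simply restates the recipe, including the 10 μl of enzyme added just before use.

```python
# (component, stock concentration, unit of the stock, volume in ul)
# Components without a stated stock concentration use None.
RT_MIX = [
    ("water", None, None, 75.0),
    ("Maxima RT buffer", 5.0, "x", 40.0),
    ("Ficoll PM-400", 20.0, "%", 40.0),
    ("dNTPs", 10.0, "mM", 20.0),
    ("RNase inhibitor", None, None, 5.0),
    ("Template Switch Oligo", 50.0, "uM", 10.0),
    ("Maxima H Minus RTase", None, None, 10.0),
]

total_ul = sum(volume for _, _, _, volume in RT_MIX)

# Final concentration = stock concentration * volume / total volume.
final = {
    name: stock * volume / total_ul
    for name, stock, _, volume in RT_MIX
    if stock is not None
}
units = {name: unit for name, stock, unit, _ in RT_MIX if stock is not None}

print(f"total volume: {total_ul} ul")
for name, value in final.items():
    print(f"{name}: {value:g} {units[name]}")
```

The volumes add up to 200 μl and reproduce the stated 1× buffer, 4% Ficoll, 1 mM dNTPs, and 2.5 μM Template Switch Oligo.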

2.2 Library Preparation

1. SMART PCR primer: 5′-AAGCAGTGGTATCAACGCAGAGT-3′. PCR mix: 0.8 μM SMART PCR primer and 1× KAPA HiFi HotStart ReadyMix, in water. Make 50 μl for an aliquot of 2000 beads. For 50 μl PCR mix, add 21 μl water, 4 μl 10 μM SMART PCR primer, and 25 μl 2× KAPA HiFi HotStart ReadyMix. Prepare fresh on ice.
2. Agencourt AMPure XP beads.
3. 80% ethanol. Dilute absolute ethanol with water. Make up fresh.
4. Agilent High Sensitivity DNA Kit (Agilent).
5. Qubit dsDNA HS (High Sensitivity) Assay Kit.
6. Nextera XT DNA Library Preparation Kit.
7. 10 μM New-P5-SMART PCR hybrid oligo: 5′-AATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGT*A*C-3′.

2.3 Sequencing

1. Custom Read 1 primer: 5′-GCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC-3′.

3 Methods

In order to avoid PCR contaminations, it is recommended to follow the guidelines of Loh and Chan [13]. Carry out all experimental procedures at room temperature, unless indicated otherwise.

3.1 Preparation of Barcoded Beads

1. The DNA-barcoded beads are delivered in ethanol. Upon arrival, wash the beads twice with 30 ml ethanol, twice with 30 ml TE-Tween buffer, and resuspend in 20 ml TE-Tween buffer (see Note 2).
2. Pass the beads through a 100 μm strainer and count them using a Fuchs-Rosenthal hemocytometer. Store counted beads at 4 °C. Macosko and colleagues store beads in this way for longer than 6 months without any impaired performance [14].

3.2 Prerun Setup

1. Arrange the three syringe pumps close to the microscope, with the cell and oil syringe pumps in a horizontal position and the bead/lysis buffer syringe pump in an upright position with the syringe pointing down. Use the support stand to position the magnetic stirrer close to the bead/lysis buffer syringe pump (Fig. 1).
2. Cut three pieces of tubing of the same length for the oil, bead/lysis buffer, and cell inlets, respectively, and one shorter piece for the outlet channel (see Note 3).
3. Place the microfluidics chip on the microscope stage and select a clean device, which is free of any fabrication defects (visual channel imperfection).
4. Load filtered oil into a 10 ml syringe (20 ml syringe for larger experiments), attach a 26 G needle, and affix the tubing (see Note 4). In order to remove air bubbles, bring the needle into an upright position and slowly push the plunger of the syringe until oil comes out of the free-end tubing.
5. Place the oil syringe into the pump, set the flow rate to 15,000 μl/h (see Note 5), and insert the free-end tubing in the oil inlet of the device (Fig. 2).
6. Take the shorter tubing and insert it into the outlet channel of the same device. Place the free-end tubing into a 50 ml waste collection tube.
7. For a microfluidics device that produces droplets of 125 μm, transfer 120,000 beads into a 1.5 ml tube (see Note 6), spin down in a tabletop centrifuge, remove the TE-Tween buffer, and add 950 μl of lysis buffer without DTT (see Notes 7 and 8).
8. For a microfluidics device that produces droplets of 125 μm (see Note 9), load a single-cell suspension with a concentration of 100 cells/μl into a 3 ml syringe (see Note 10). Attach a 26 G needle and affix the tubing (see Note 4).
9. Place the cell syringe into the pump, set the flow rate to 30,000 μl/h, and briefly start the pump until liquid drips out of the free end of the tubing and all the air bubbles are pushed out. Then insert the free-end tubing in the cell inlet of the device (Fig. 2) and set the flow rate to 4,000 μl/h (see Note 11).
10. Place a cleaned magnetic mixing disc into a 3 ml syringe.
11. Add 50 μl DTT to the lysis buffer/bead aliquot and draw up the suspension (see Note 10). In order to avoid clogging of the needle with settled beads, keep the syringe in a vertical position with the needle facing upward. Attach a 26 G needle and affix the tubing (see Note 4).
12. Turn on the magnetic stirrer and begin mixing the beads (see Note 12).
13. Place the bead/lysis buffer syringe into the pump with the needle facing downward.
14. Position the magnet of the stirrer at an angle to the syringe that allows the mixing disc to travel vertically through the full volume of the syringe (see Note 13).
15. Set the flow rate to 30,000 μl/h and briefly start the pump until bead/lysis buffer drips out of the free end of the tubing and the air bubbles are pushed out.
16. Make sure that the dripping has stopped before inserting the free-end tubing in the bead/lysis buffer inlet of the device (Fig. 2). Set the flow rate to 4,000 μl/h.

Fig. 1 Arrangement of the Drop-seq equipment. (1) Microscope stage with PDMS device. (2) Cell syringe pump. (3) Oil syringe pump. (4) Bead/lysis buffer syringe pump. (5) Magnetic stirrer

Fig. 2 Illustration of the PDMS microfluidics device with attached tubings (labels: inserted tubings, outlet, oil inlet, cell inlet, bead/lysis buffer inlet)
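The loading concentrations in steps 7, 8, and 11 can be related to droplet occupancy with a simple Poisson estimate. The sketch below is an illustration under explicit assumptions, not a calculation from the protocol: it assumes spherical 125 μm droplets, that the cell and bead streams (both at 4,000 μl/h) each contribute half of the droplet volume, and that cells and beads are loaded independently. The bead concentration of 120 beads/μl follows from 120,000 beads in 950 μl lysis buffer plus 50 μl DTT.

```python
import math


def droplet_volume_ul(diameter_um):
    """Volume of a spherical droplet in microliters (1 mm^3 = 1 ul)."""
    d_mm = diameter_um / 1000.0
    return math.pi / 6.0 * d_mm ** 3


def poisson_pmf(k, lam):
    """Probability of exactly k objects in a droplet at mean occupancy lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


vol_ul = droplet_volume_ul(125)   # about 1 nl

# Each aqueous stream is assumed to fill half of the droplet.
lam_cell = 100 * vol_ul * 0.5     # 100 cells/ul
lam_bead = 120 * vol_ul * 0.5     # 120,000 beads in 1,000 ul

p_cell = 1 - poisson_pmf(0, lam_cell)
p_bead = 1 - poisson_pmf(0, lam_bead)
p_multi_given_cell = (p_cell - poisson_pmf(1, lam_cell)) / p_cell
p_cell_and_bead = p_cell * p_bead

print(f"droplet volume: {vol_ul * 1000:.2f} nl")
print(f"droplets with a cell: {p_cell:.1%}, with a bead: {p_bead:.1%}")
print(f"cell-containing droplets holding >1 cell: {p_multi_given_cell:.1%}")
print(f"droplets with both a cell and a bead: {p_cell_and_bead:.2%}")
```

Under these assumptions only a few percent of droplets contain a cell or a bead, which is consistent with the low doublet rate and low cell capture efficiency described in the Introduction.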

3.3 Cell Encapsulation and STAMP Generation
1. Begin the Drop-seq run by first starting the cell, then the bead/lysis buffer, and finally the oil syringe pump. This starting order is important to avoid cell lysis outside the droplets. Allow the flow to stabilize for about 10–40 s. Once stable and uniform droplets are made, a blurry, elongated triangle will form at the droplet generation junction (see Note 14). At this point, transfer the outgoing tubing into a fresh 50 ml falcon tube and start to collect droplets. A standard collection time is 15 min, which ensures that the cells are still evenly distributed in suspension.
2. [Optional: To reload more cells, first stop the bead/lysis buffer, then the cell, and finally the oil syringe pump. Wait for a few seconds and then pull the three ingoing tubings out of the device, leaving the outgoing tubing in place. Wait for the flow to stop, then pull out the outgoing tubing and connect the oil, cell, and bead/lysis buffer tubing as described in the steps above.]
3. Evaluate the droplet quality by carefully letting the droplets, which form a layer on top of the oil, run down the wall of the collection tube. A uniform appearance of the droplets indicates a high-quality emulsion.
4. Droplet size and bead occupancy can be assessed under the microscope. For this purpose, transfer 17 μl oil and then carefully 3 μl of the emulsion (taken from the top of the collection tube) into a Fuchs-Rosenthal hemocytometer (see Note 15).
5. Continue with the droplet breakage procedure (see Note 16) by removing most of the oil from the bottom of the collection tube using a P1000 (see Note 17).
6. Add 30 ml of 6× SSC.
7. Break the droplets by adding 1 ml perfluorooctanol (PFO) and shaking firmly three times (see Notes 1 and 18).
8. Centrifuge at 1000 × g for 1 min, with the cap loosened and the brake speed set to about 25%.
9. Carefully transfer the tube to ice with preformed holes and evaluate the success of the breakage. A relatively thin, white line indicates that most of the droplets have been broken (see Notes 19 and 20).
10. Remove the upper aqueous phase by first using a 20 ml pipette and then a P1000. Leave a few milliliters above the interface.
11. Kick up the beads lying on top of the interface by adding 30 ml 6× SSC. Wait for 2–3 s to let the oil sink and then transfer the supernatant to a fresh 50 ml collection tube. Be careful not to transfer any oil or interface material.
12. Pellet the beads by centrifugation (1000 × g, 1 min, 50% brake speed). Remove most of the supernatant, leaving about 1–1.5 ml liquid.
13. Resuspend the beads and transfer the bead/SSC solution into a fresh 1.5 ml tube (see Note 6).
14. Wash the beads twice with 1 ml 6× SSC (see Notes 21 and 22) and once with about 300 μl 5× RT buffer.
15. Continue with reverse transcription by adding 200 μl RT mix to a maximum of 90,000 recovered beads (see Note 23).
16. After an incubation time of 30 min at room temperature with gentle rotation, transfer the tube to 42 °C and incubate for another 90 min with rotation (see Note 24).
17. Stop the reaction by adding 1 ml TE-SDS and wash the beads twice with 1 ml TE-Tween (see Note 25) and once with 1 ml 10 mM Tris pH 8.0.
18. Continue with the exonuclease I digest by adding 200 μl of the exonuclease mix to the pelleted beads.
19. Incubate at 37 °C for 45 min with rotation (see Note 26).
20. Stop the reaction by adding 1 ml TE-SDS and wash the beads twice with 1 ml TE-Tween (see Note 25) and once with 1 ml water.
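The volumes and bead numbers in this section follow from simple arithmetic on the flow rate, the collection time, and the bead recovery rate given in Notes 18 and 23. The following is a minimal Python sketch of that arithmetic; the function names are hypothetical and chosen for illustration, and the default values are the standard settings quoted in the text, not fixed properties of the method.

```python
import math

def collected_volume_ul(flow_rate_ul_per_h, minutes):
    """Volume delivered by one syringe pump during the collection time."""
    return flow_rate_ul_per_h * minutes / 60.0

def expected_recovered_beads(bead_conc_per_ul, flow_rate_ul_per_h, minutes,
                             recovery_range=(0.20, 0.40)):
    """Expected (low, high) number of beads recovered after droplet breakage.

    recovery_range is the 20-40% recovery rate quoted in Note 23.
    """
    beads_in = bead_conc_per_ul * collected_volume_ul(flow_rate_ul_per_h, minutes)
    low, high = recovery_range
    return beads_in * low, beads_in * high

def rt_reactions_needed(recovered_beads, max_beads_per_reaction=90_000):
    """Number of 200 ul RT reactions, given the 90,000-bead limit of step 15."""
    return math.ceil(recovered_beads / max_beads_per_reaction)

if __name__ == "__main__":
    # Standard settings: 4,000 ul/h aqueous flow, 15 min, 120 beads/ul
    print(collected_volume_ul(4000, 15))
    print(expected_recovered_beads(120, 4000, 15))
    print(rt_reactions_needed(48_000))
```

With the standard settings this reproduces the 1 ml of bead/lysis buffer per 15 min run and the 24,000–48,000 recovered beads stated in the notes, which fit into a single RT reaction.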

3.4 Library Preparation
1. Spin the bead pellet in a tabletop centrifuge and resuspend in 1 ml water (see Note 27).
2. Count the beads by first resuspending the bead solution using a P1000 and then quickly transferring 20 μl into a Fuchs-Rosenthal hemocytometer using a P200. For accurate counting, the beads need to be evenly distributed on the chip (see Note 28).
3. Make aliquots of 2,000 beads for each PCR reaction (see Note 29). This will yield about 100 STAMPs for the cell input concentration of 100 cells/μl.
4. Spin the PCR tube(s) in a tabletop centrifuge and add 50 μl PCR mix per tube. Mix well, transfer the tube(s) to a thermocycler, and start the PCR program (see Note 30):
3 min at 95 °C.
4 cycles of 20 s at 98 °C, 45 s at 65 °C, and 3 min at 72 °C.
9 cycles of 20 s at 98 °C, 20 s at 67 °C, and 3 min at 72 °C.
5 min at 72 °C.
Infinite hold at 8 °C.
5. Purify the amplified cDNA using AMPure XP beads. Before starting the protocol, transfer the PCR product into a new tube without carrying over any ChemGenes beads.
6. Purify the amplified cDNA with a bead-to-DNA ratio of 0.6 following the manufacturer's instructions. Elute one aliquot of 2,000 beads in 10 μl water (see Note 31).
7. Assess cDNA quality by running 1 μl of the sample on a Bioanalyzer High Sensitivity Chip following the manufacturer's instructions. High-quality cDNA should show a species-specific profile, with average sizes from 1,300 to 2,000 bp for human and mouse, and 1,300–1,600 bp for Drosophila samples.
8. Determine the concentration of the purified cDNA using the Qubit dsDNA HS assay kit according to the manufacturer's instructions. Use 1 μl as input amount. The yield depends on the experimental conditions and samples and should be 400–1,000 pg/μl for 2,000 beads of human HEK293T cells.

9. Perform the cDNA tagmentation and amplification using the Nextera XT DNA Library Preparation Kit following the manufacturer's instructions with the following modifications:
• Use 600 pg of purified cDNA as input and combine it with water to a total volume of 5 μl in a PCR tube.
• Perform all centrifugation steps using a tabletop centrifuge.
• For the amplification step, add to each tube in the following order: 15 μl Nextera PCR mix, 1 μl 10 μM New-P5-SMART PCR hybrid Oligo, 1 μl 10 μM Nextera N70X oligo, and 8 μl water.
10. Purify the tagmented and amplified library three times with a bead-to-DNA ratio of 0.6, 0.6, and 1.0, respectively, following the manufacturer's instructions. Elute in 10 μl water.
11. Assess the quality of the final library by running 1 μl of the sample on a Bioanalyzer High Sensitivity Chip following the manufacturer's instructions. High-quality libraries should show a smooth profile, with an average size from 500 to 680 bp.
12. Determine the concentration of the final library using the Qubit dsDNA HS assay kit according to the manufacturer's instructions. Use 1 μl as input amount. The yield should be in the range of 10–30 nM for human HEK293T cells.
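Two small calculations recur in this section: the volume of purified cDNA that contains 600 pg for tagmentation (step 9), and the conversion of a Qubit mass concentration into the molarity quoted in step 12. The sketch below illustrates both in Python. The function names are hypothetical, and the molarity conversion assumes the commonly used approximation of 660 g/mol per base pair of double-stranded DNA, which is not stated in the protocol itself.

```python
def tagmentation_input_volumes(cdna_pg_per_ul, input_pg=600.0, total_ul=5.0):
    """Return (cDNA volume, water volume) in ul for the 600 pg input of step 9."""
    cdna_ul = input_pg / cdna_pg_per_ul
    if cdna_ul > total_ul:
        raise ValueError("cDNA is too dilute to fit 600 pg into 5 ul")
    return cdna_ul, total_ul - cdna_ul

def dsdna_molarity_nM(conc_ng_per_ul, mean_fragment_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA mass concentration to nM.

    Assumes an average mass of 660 g/mol per base pair (an approximation).
    """
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_fragment_bp)

if __name__ == "__main__":
    # cDNA at the lower end of the expected yield (400 pg/ul)
    print(tagmentation_input_volumes(400))
    # Hypothetical final library: 4 ng/ul with a 600 bp average size
    print(round(dsdna_molarity_nM(4.0, 600), 1))
```

The average fragment size used for the conversion would come from the Bioanalyzer trace of step 11.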

3.5 Sequencing
1. If using MiSeq, NextSeq 500, HiSeq 4000, or HiSeq 2500, follow Illumina's protocol specific to each sequencing system for denaturing and diluting the final library. We load the following amount of DNA onto a flow cell: MiSeq (Reagent Kit v3) = 8 pM, NextSeq 500 (High Output kit) = 1.4 pM, HiSeq 4000 (SBS (Sequencing by Synthesis) Kit) = 200 pM, HiSeq 2500 (HiSeq Rapid SBS Kits v2) = 9 pM. PhiX is not spiked in.
2. Specify the sequencing length with Read 1 = 20 bp, Read 2 = 50 bp, Index = 8 bp (for multiplexed samples only).

4 Notes

In addition to the material and methods section in the Drop-seq paper [9], Macosko and colleagues provided a detailed protocol on the group home page of Steve McCarroll (Drop-seq-Protocol-v3.1-Dec-2015) [14]. This protocol, together with our own experience, served as the basis for the notes found in this chapter.
1. Use PFO only in a fume hood. Wear appropriate laboratory clothing when handling it.
2. Wash the beads at 1,000 × g for about 1 min. Make sure to visually inspect successful pelleting. Increase centrifugation time and/or centrifuge without brake in the case of floating beads. Keep pipetting steps as gentle as possible to prevent damage to the beads.
3. To connect the tubing more easily to the needle and the microfluidics device, cut it at a sharp angle.
4. Make sure not to scratch the inner wall of the tubing. If this does happen, cut off this piece of the tubing and insert the needle again. Using tweezers to push the tubing to the base of the PDMS facilitates a straight insertion.
5. The oil flow rate determines the size of the droplets. We found that the same device can produce uniform droplets with about 40 μm size differences by adjusting the oil flow rate.
6. The barcoded beads tend to stick to the wall of microcentrifuge tubes. We found VWR® slick disposable microcentrifuge tubes particularly suitable for pelleting barcoded beads.
7. The bead concentration can be adjusted in case the device produces smaller droplets (see Note 9 for details), but should not be higher than 260 beads/μl, as bead mixing in the syringe becomes insufficient. For the same reason, the maximum volume of loaded beads in lysis buffer in the syringe should not be higher than 1.3 ml.
8. This is a good time to prepare your single-cell suspension. The RT mix without Maxima H Minus RTase can also be prepared and kept on ice until use. We have kept this RT mix up to 2 h on ice and did not observe any apparent effect on quantity or quality of the cDNA samples.
9. The standard droplets are 125 μm in size and the recommended input concentration is 100 cells/μl and 120 beads/μl. Using this setup, Macosko and colleagues showed a cell and bead doublet rate of about 4–5% [9]. In case the microfluidics device produces droplets with a different size, the cell and bead concentration would need to be adjusted (Table 1):
• Volume of the droplets [μm³] = (4/3) × π × (radius of the droplets [μm])³.
• Volume of the droplets [nl] = Volume of the droplets [μm³]/10⁶.
• Adjusted input cell/bead concentration = (1 nl/Volume of the droplets [nl]) × 100 cells/μl.

Table 1
Example cell/bead concentrations

Droplet diameter [μm]    Calculated cell/bead concentration per μl
125                      100
110                      143.5
100                      191

10. The formation of air bubbles is a serious obstacle in microfluidics systems. To minimize the introduction of air bubbles into the syringe, firmly press a 1 ml pipette tip into the head of the luer-lock syringe and slowly draw in the solution. If air bubbles do arise, one can try to expel them by holding the syringe with the tip end facing upward, gently tapping the syringe, and slowly pushing the plunger.
11. The aqueous and oil flow rates used to make monodisperse droplets of about 125 μm can vary from PDMS chip to PDMS chip (e.g., depending on the silicon master used). Hence, the running speed can differ from the standard settings and should be determined for each chip.
12. Set the magnetic tumble stirrer to 25–30 with a 110/120 V plug and to 16–20 with a 220/240 V plug. Try to avoid higher settings, as this can significantly damage the beads.
13. In case there are difficulties finding the perfect position of the magnet, support the mixing by manually moving the supporting stand regularly. Monitor successful mixing by checking the bead doublet rate at the beginning and at the end of the droplet collection.
14. At a magnification higher than 10×, a clear oil stream is visible in the middle of the triangle; when droplets are being formed, the surrounding area should appear blurry. In case small air bubbles and/or small pieces of dust in the cell/bead channel cause tension and unstable droplet formation at the junction, we found it helpful to increase the cell and the bead/lysis buffer flow rate to 10,000 μl/h for some seconds. In case uniform droplets are not being formed, you can also stop the bead and cell flow for a few seconds while leaving the oil flow switched on. Restarting the cell and bead/lysis buffer syringe pumps then often leads to the formation of stable, uniform droplets. Make sure to first increase the cell flow rate and then the bead/lysis buffer flow rate (use the reverse order when setting the flow rates back to 4,000 μl/h). If those procedures do not lead to a good-quality emulsion, switch to a new device.
15. The droplet quality and size can also be checked on a regular glass slide. Be aware that the half-life of the droplets on a glass slide is shorter than in the hemocytometer.
16. The emulsion can be kept on ice or in the fridge for at least an hour without compromising the library quantity or quality. Another Drop-seq run can then be done and the droplet break performed for both samples together.
17. Avoid losing any droplets at this step by pushing the pipette to the first stop before inserting the tip into the oil phase and then pushing down to the second stop to release any trapped droplets from the pipette tip.
18. The amount of PFO to add depends on the amount of sample collected. For a 15 min droplet collection with an aqueous flow rate of 4,000 μl/h, 1 ml of cell suspension and 1 ml of bead/lysis buffer will flow into the collection tube. In this case, 1 ml of PFO is used for the droplet break.
19. Extracellular proteins (e.g., BSA/FCS at >0.2%) strongly inhibit droplet breakage. The cell suspension should hence be washed rigorously prior to loading the sample into the syringe.
20. Repeating the droplet breakage by adding another 500 μl PFO, shaking firmly, and centrifuging again can increase the bead recovery rate. Do not add more PFO than 1.5× the sample volume, as this can lead to lower library quality.
21. If sample preparation and pre-PCR rooms are separated, then this is a good time to add Maxima H Minus RTase to the RT mix.
22. The use of a second microcentrifuge tube during the washing helps to get rid of residual oil that might have been carried over.
23. The bead recovery rate is usually around 20–40%. The standard settings should hence yield 24,000–48,000 recovered beads.
24. We use a microarray oven for the 42 °C incubation step.
25. This is a stopping point. Beads can be stored in TE-Tween at 4 °C. We have stored beads in this way for more than 3 months and did not observe any apparent effect on quantity or quality of the cDNA sample.
26. We use a microarray oven for the 37 °C incubation step.
27. Resuspend beads in less than 1 ml when having a smaller bead pellet.
28. Since the beads settle fast by gravity, it is important to quickly load the hemocytometer after resuspension in a uniform motion. For this purpose, preload the P200 with a pipette tip. We found that a bead concentration of about 90–100 beads/μl prediluted in 6× loading dye (Thermo Scientific) and TE-Tween (10 μl beads + 10 μl 6× loading dye + 10 μl TE-Tween) makes it easier to evenly distribute the beads on the counter.
29. For large-scale experiments we apportion 4,000 beads per tube, because we found good correlation between data derived from 2,000 and 4,000 bead aliquots.
30. The cycle number of 13 depends on the experimental conditions (e.g., cell types, cell loading concentration). For new samples, we first determine the optimal number of cycles by an initial PCR of only 2,000 beads, aiming for a concentration of 400–1,000 pg/μl.
31. Multiple PCR tubes can be pooled before the cleanup, which allows elution in a smaller volume and thereby reduces the number of PCR cycles needed for cDNA amplification.
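The adjustment of the input concentration to the droplet size described in Note 9 can be sketched in a few lines of Python. This is only an illustration of the three formulas given there; the function names are hypothetical, and the reference value of 100 per μl for a 1 nl droplet is taken directly from the note.

```python
import math

def droplet_volume_nl(diameter_um):
    """Volume of a spherical droplet in nanoliters (1 nl = 1e6 cubic microns)."""
    radius_um = diameter_um / 2.0
    volume_um3 = (4.0 / 3.0) * math.pi * radius_um ** 3
    return volume_um3 / 1e6

def adjusted_concentration_per_ul(diameter_um, reference_per_ul=100.0):
    """Input cell/bead concentration scaled to the droplet volume (Note 9)."""
    return reference_per_ul * (1.0 / droplet_volume_nl(diameter_um))

if __name__ == "__main__":
    for diameter in (125, 110, 100):
        print(diameter, round(adjusted_concentration_per_ul(diameter), 1))
```

For droplets of 110 μm and 100 μm this gives values that round to the 143.5 and 191 per μl listed in Table 1; for 125 μm droplets the result is slightly below 100 because the droplet volume is slightly above 1 nl.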

Acknowledgments

J.B. was supported by a research stipend from the Fritz Thyssen Foundation. G.R. was supported by the Intramural Research Program of the Division of Intramural Research Z01AI000947, NIAID, NIH; the UCLA-Caltech MSTP; and the NIGMS T32 GM008042.

References

1. Kolodziejczyk AA, Kim JK, Svensson V et al (2015) The technology and biology of single-cell RNA sequencing. Mol Cell 58:610–620
2. Kolodziejczyk AA, Lönnberg T (2018) Global and targeted approaches to single-cell transcriptome characterization. Brief Funct Genomics 17:209–219
3. Svensson V, Vento-Tormo R, Teichmann SA (2018) Exponential scaling of single-cell RNA-seq in the past decade. Nat Protoc 13:599–604
4. Ziegenhain C, Vieth B, Parekh S et al (2017) Comparative analysis of single-cell RNA sequencing methods. Mol Cell 65:631–643.e4
5. Tang F, Barbacioru C, Wang Y et al (2009) mRNA-seq whole-transcriptome analysis of a single cell. Nat Methods 6:377–382
6. Macaulay IC, Svensson V, Labalette C et al (2016) Single-cell RNA-sequencing reveals a continuous spectrum of differentiation in hematopoietic cells. Cell Rep 14:966–977
7. Shalek AK, Satija R, Adiconis X et al (2013) Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498:236–240
8. Zheng GXY, Terry JM, Belgrader P et al (2017) Massively parallel digital transcriptional profiling of single cells. Nat Commun 8:14049
9. Macosko EZ, Basu A, Satija R et al (2015) Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161:1202–1214
10. Klein AM, Mazutis L, Akartuna I et al (2015) Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161:1187–1201
11. Habib N, Avraham-Davidi I, Basu A et al (2017) Massively parallel single-nucleus RNA-seq with DroNc-seq. Nat Methods 14:955–958
12. Stoeckius M, Hafemeister C, Stephenson W et al (2017) Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 14:865–868
13. Lo YMD, Chan KCA (2006) Setting up a polymerase chain reaction laboratory. In: Lo YMD, Chiu RWK, Chan KCA (eds) Clinical applications of PCR. Humana, Totowa, NJ, pp 11–18
14. McCarroll S (2018) Drop-seq-protocol-v3.1-Dec-2015. McCarroll Lab, Boston, MA. http://mccarrolllab.com/dropseq/

Chapter 7

Chromium 10× Single-Cell 3′ mRNA Sequencing of Tumor-Infiltrating Lymphocytes

Marco De Simone, Grazisa Rossetti, and Massimiliano Pagani

Abstract

Chromium 10× 3′ V2 is a 3′ end counting single-cell mRNA sequencing protocol that allows RNA from thousands of cells to be processed and sequenced in parallel. Chromium 10× by 10× Genomics is an emulsion-based device that compartmentalizes single cells, along with sets of uniquely barcoded primers and reverse transcription reagents, into nanoscale droplets that serve as reaction chambers to generate barcoded full-length cDNA from single cells. After the RT reaction, single-stranded barcoded cDNAs are pooled and processed to generate sequencing libraries compatible with the standard Illumina platforms. Here we show in detail the main steps of the protocol applied to the analysis of tumor-infiltrating T lymphocytes (TILs). The main steps are cell preparation, cDNA synthesis, library construction, and sequencing. This protocol refers specifically to the CG00052_SingleCell3_ReagentKitv2UserGuide_RevD downloadable from the 10× Genomics website (https://www.10xgenomics.com) and does not substitute it. Always refer to this guide, paying attention to updates and revisions.

Key words Chromium 10×, Single-cell RNA sequencing, 3′ End counting, Emulsion device, Tumor-infiltrating lymphocytes

1 Introduction

The Chromium 10× system is a “reverse emulsion device” that creates water-in-oil emulsions, producing droplets that encapsulate single cells [1] along with hydrogel beads and reagents (gel beads in emulsion, or GEMs). These droplets are used to reverse-transcribe mRNA and amplify the derived cDNA. Gel beads work as a delivery system for millions of oligo primers, each containing (a) a partial Read 1 sequence used for sequencing on the Illumina flow cell; (b) a 16-bp cell barcode, called the 10× barcode, that serves as the unique molecular address for that particular bead and is shared by all the oligos of a bead (potentially, 750,000 different bead types can be used in the single-cell 3′ assay, and each individual bead contains millions of identical oligos with the same, unique, 16-bp 10× barcode); (c) a 10-bp UMI (unique molecular identifier), a 10-bp randomer that differs for each oligo of a bead and is used for digital mRNA counting (each one of the millions of primers on a gel bead has its own unique 10-bp UMI sequence); and (d) a 30-bp poly dT tail, which allows the primers to hybridize to mature mRNAs for first-strand cDNA synthesis.
Once partitioned with the cells and RT reagents, the gel bead dissolves and its oligo primers are released into the aqueous environment of the GEM. The cell captured in the GEM is also lysed, and the content of the GEM (oligos, lysed cell components, and master mix) is incubated in an RT reaction to generate full-length, barcoded cDNA from the poly A-tailed mRNA transcripts. The reverse-transcription reaction is primed by the barcoded gel bead oligo, and the reverse transcriptase incorporates a template switch oligo via a template switching reaction at the 5′ end of the transcript [2]. The GEMs are then “broken” and single-stranded barcoded cDNAs from thousands of cells are pooled. A bulk cDNA PCR amplification follows to generate enough material for library generation. During library preparation, after an enzymatic fragmentation, Read 2 is added by adapter ligation, and finally Illumina P5 and P7 sequences and sample index sequences are added during the Sample Index PCR.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_7, © Springer Science+Business Media, LLC, part of Springer Nature 2019
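The role of the 10× barcode and the UMI in digital counting can be illustrated with a short Python sketch. It assumes that Read 1 begins with the 16-bp barcode followed directly by the 10-bp UMI, in the order the oligo elements are listed above; the function names, the example gene name, and the input format (pairs of Read 1 sequence and assigned gene) are hypothetical and serve only as an illustration, not as the analysis pipeline used for this protocol.

```python
from collections import defaultdict

BARCODE_LEN = 16  # length of the 10x barcode given in the text
UMI_LEN = 10      # length of the UMI given in the text

def split_read1(read1):
    """Return (cell_barcode, umi), assuming the barcode precedes the UMI."""
    if len(read1) < BARCODE_LEN + UMI_LEN:
        raise ValueError("Read 1 is shorter than barcode plus UMI")
    barcode = read1[:BARCODE_LEN]
    umi = read1[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
    return barcode, umi

def count_molecules(records):
    """Count distinct UMIs per (cell barcode, gene).

    records: iterable of (read1_sequence, gene_name) pairs.
    Reads sharing barcode, gene, and UMI are treated as PCR copies
    of one original mRNA molecule and counted once.
    """
    umis = defaultdict(set)
    for read1, gene in records:
        barcode, umi = split_read1(read1)
        umis[(barcode, gene)].add(umi)
    return {key: len(values) for key, values in umis.items()}

if __name__ == "__main__":
    cell = "A" * BARCODE_LEN  # synthetic barcode for illustration
    reads = [
        (cell + "C" * UMI_LEN, "CD3E"),
        (cell + "C" * UMI_LEN, "CD3E"),  # same UMI: PCR duplicate
        (cell + "G" * UMI_LEN, "CD3E"),  # different UMI: second molecule
    ]
    print(count_molecules(reads))
```

In this toy input, three reads collapse to two counted molecules, which is the sense in which the UMI makes the readout a count of molecules, not of amplified reads.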

2 Materials

2.1 Chromium Reagents
1. Chromium™ Single Cell 3′ Library & Gel Bead Kit v2, 16 rxns PN120237 (10× Genomics), composed of:
(a) Chromium™ Single Cell 3′ Library Kit v2, 16 rxns (store at −20 °C) PN120234.
(b) Chromium™ Single Cell 3′ Gel Bead Kit v2, 16 rxns (store at −80 °C) PN120235 (see Note 1).
2. Chromium™ Single Cell 3′ Library & Gel Bead Kit v2, 4 rxns PN-120267 (10× Genomics), composed of:
(a) Chromium™ Single Cell 3′ Library Kit v2, 4 rxns (store at −20 °C) PN120264.
(b) Chromium™ Single Cell 3′ Gel Bead Kit v2, 4 rxns (store at −80 °C) PN120265 (see Note 1).
3. Chromium™ Single Cell A Chip Kit, 48 rxns (10× Genomics) (store at ambient temperature) PN120236 (see Note 2).
4. Chromium™ Single Cell A Chip Kit, 16 rxns (10× Genomics) (store at ambient temperature) PN1000009 (see Note 2).
5. Chromium™ i7 Multiplex Kit, 96 rxns (10× Genomics) (store at −20 °C) 120262 (see Note 3).
6. 10×™ Vortex Adapter PN 330002 (10× Genomics).
7. 10×™ Chip Holder PN 330019 (10× Genomics).
8. 10×™ Magnetic Separator PN 230003 (10× Genomics) (see Note 4).

2.2 Plastic Ware (See Note 5)
1. PCR Tubes 0.2 ml 8-tube strips PN 0030124286 (Eppendorf).
2. DNA LoBind Tubes, 1.5 ml PN 022431021 (Eppendorf).
3. DNA LoBind Tubes, 2.0 ml PN 022431048 (Eppendorf).
4. TempAssure PCR 8-tube strip PN 1402-4700 (alternate to Eppendorf or Thermo Fisher Scientific product).
5. MicroAmp® 8-Tube Strip, 0.2 ml (Thermo Fisher Scientific, alternate to Eppendorf or USA Scientific product) PN N8010580.
6. MicroAmp® 8-Cap Strip, clear PN N8010535.
7. twin.tec® 96-Well PCR Plate Semi-skirted PN 0030129326 (Eppendorf), if using PCR plates and depending on the thermal cycler.
8. twin.tec® 96-Well PCR Plate Divisible, Unskirted PN 2231000209 (Eppendorf), if using PCR plates and depending on the thermal cycler.
9. twin.tec® 96-Well PCR Plate Unskirted PN 0030133390 (Eppendorf), if using PCR plates and depending on the thermal cycler.

2.3 Kits and Reagents
1. DynaBeads® MyOne™ Silane Beads PN 37002D (Thermo Fisher Scientific).
2. Nuclease-free water.
3. PBS, phosphate-buffered saline (1×), pH 7.4.
4. Ultrapure nonacetylated bovine serum albumin (BSA).
5. Low TE Buffer (10 mM Tris–HCl pH 8.0, 0.1 mM EDTA).
6. Ethanol, pure (200 Proof, anhydrous).
7. SPRIselect Reagent Kit B23318 (Beckman Coulter).
8. 10% Tween 20.
9. Glycerin (glycerol), 50% (v/v) aqueous solution.

2.4 Equipment
1. Pipet-Lite LTS Pipette L-2XLS+ PN 17014393 (Rainin).
2. Pipet-Lite LTS Pipette L-10XLS+ PN 17014388 (Rainin).
3. Pipet-Lite LTS Pipette L-20XLS+ PN 17014392 (Rainin).
4. Pipet-Lite LTS Pipette L-100XLS+ PN 17014384 (Rainin).
5. Pipet-Lite LTS Pipette L-200XLS+ PN 17014391 (Rainin).
6. Pipet-Lite LTS Pipette L-1000XLS+ PN 17014382 (Rainin).
7. Pipet-Lite Multi Pipette L8-10XLS+ PN 17013802 (Rainin).
8. Pipet-Lite Multi Pipette L8-20XLS+ PN 17013803 (Rainin).
9. Pipet-Lite Multi Pipette L8-50XLS+ PN 17013804 (Rainin).
10. Pipet-Lite Multi Pipette L8-200XLS+ PN 17013805 (Rainin).
11. Tips LTS 20UL Filter RT-L10FLR PN 17007957 (Rainin).
12. Tips LTS 200UL Filter RT-L200FLR PN 17007961 (Rainin).
13. Tips LTS 1ML Filter RT-L1000FLR PN 17007954 (Rainin).
14. Vortex mixer PN 10153-838 (VWR).
15. Divided polystyrene reservoirs 41428.
16. Heat sealing foil, PCR clean PN 0030127854 (Eppendorf), if using PCR plates.
17. PX1™ PCR Plate Sealer PN 1814000 (Bio-Rad), if using PCR plates.
18. Automated cell counter.

2.5 Quantification and Quality Control
1. Agilent 2100 Bioanalyzer Laptop Bundle G2943CA (Agilent).
2. High Sensitivity DNA Kit PN 5067-4626 (Agilent).
3. 4200 TapeStation G2291aa (Agilent).
4. High Sensitivity D1000 ScreenTape 5067-5584.
5. High Sensitivity D1000 Reagents 5067-5585.
6. High Sensitivity D5000 ScreenTape 5067-5592.
7. High Sensitivity D5000 Reagents 5067-5593.
8. Qubit® 3.0 Fluorometer Q33216 (Thermo Fisher Scientific).
9. Qubit® dsDNA HS Assay Kit PN Q32854 (Thermo Fisher Scientific).
10. Illumina® Library Quantification Kit PN KK4824 (KAPA Biosystems).

2.6 Recommended Thermal Cyclers
1. Bio-Rad C1000 Touch™ Thermal Cycler with 96-Deep Well Reaction Module (PN-1851197).
2. Eppendorf MasterCycler® Pro (PN North America 950030010, International 6321 000.019).
3. Thermo Fisher Veriti® 96-Well Thermal Cycler (PN-4375786).

3 Methods

3.1 Cell Preparation
Cell preparation is the most challenging step of the protocol: to isolate tumor-infiltrating T lymphocytes, tumor tissue has to be dissociated and lymphocytes enriched from the suspension of dissociated cells using gradient centrifugation and FACS sorting. Dissociation can be both mechanical and enzymatic, and dissociation protocols are strictly dependent on the tumor type [3–5]. The success of the procedure depends on the viability of the target cells, and cell stress can induce consistent transcriptional changes that mask the real behavior of the target cells inside the tissue. An experimental setup that minimizes these effects should therefore be established before approaching this technology. After dissociation and lymphocyte purification, proceed as described below:
1. Visually inspect the cells under a microscope or in an automated cell counter to confirm the quality of your suspension: a good cell suspension should be free of debris and cell aggregates and should contain cells with high viability (>80%) (see Note 6).
2. If aggregates are present, filter the cell suspension using a cell strainer and wash the cells using PBS 0.04% BSA or RPMI medium plus 10% FBS. Centrifuge cells at 400 rcf for 10 min at 4 °C.
3. Count cells manually or using an automated cell counter.
4. Dilute or concentrate your cell suspension according to the optimal performance range of the automated platform and count your cells again.
5. If cell stock concentration is between 700 and 1200 cells/μl and viability is >80%, you can proceed further. If viability is poor, try to eliminate dead cells with additional washes in PBS 0.04% BSA or RPMI medium.
6. Pipet cells very gently to minimize cell lysis after quantification and before chip loading.
7. Cell suspensions should be loaded as soon as possible after preparation, ideally within 30 min, to avoid formation of cellular aggregates.
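The acceptance criteria of step 5 and the dilution of step 4 can be expressed as a short Python sketch. The function names are hypothetical; the thresholds are the 700–1200 cells/μl range and the >80% viability stated above, and the dilution uses the usual C1 × V1 = C2 × V2 relationship.

```python
def suspension_ready(conc_cells_per_ul, viability_fraction,
                     conc_range=(700.0, 1200.0), min_viability=0.80):
    """True if the stock meets the concentration and viability criteria of step 5."""
    low, high = conc_range
    return low <= conc_cells_per_ul <= high and viability_fraction > min_viability

def diluent_to_add_ul(stock_conc, stock_volume_ul, target_conc):
    """Volume of buffer to add so that the stock reaches target_conc (C1*V1 = C2*V2)."""
    if target_conc > stock_conc:
        raise ValueError("stock is too dilute; concentrate the cells instead")
    return stock_volume_ul * (stock_conc / target_conc - 1.0)

if __name__ == "__main__":
    # Hypothetical stock: 100 ul at 2000 cells/ul, diluted to 1000 cells/ul
    print(diluent_to_add_ul(2000, 100, 1000))
    print(suspension_ready(1000, 0.92))
```

After any dilution the cells should be counted again, as step 4 requires, since the calculated concentration is only an estimate.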

3.2 GEM Generation
1. Prepare master mix on ice (see Note 7). Add reagents in the order shown below, mix by pipetting, and centrifuge briefly. Master mix:
(a) RT reagent mix 50 μl;
(b) RT primer 3.8 μl;
(c) Additive A 2.4 μl;
(d) RT enzyme mix 10 μl.
Total volume 66.2 μl.
RT primer is provided in lyophilized form; once resuspended, store at −80 °C. Do not add Single Cell Suspension at this point. If processing more than one sample per experiment, prepare a master mix including a 10% excess (1–8 samples can be run in parallel).
2. Place a Single Cell A Chip in a 10×™ Chip Holder (Fig. 1). The order in which wells are loaded is critical to avoid failures: load the rows in the labeled order, 1 followed by 2, then 3. Each chip can hold from 1 to 8 reactions. If processing fewer than eight samples, add the following volumes of 50% glycerol solution to each unused well: 90 μl in the row labeled 1, 40 μl in the row labeled 2, and 270 μl in the row labeled 3.

Fig. 1 Introducing the Single Cell A Chip into the chip holder. Place the Single Cell A Chip (a) in the chip holder (b) before loading the reagents. Handle chips by the edges to avoid frictional charges on the bottom of the chip that can impact the partitioning performance. Align the chip at the upper left (beveled corners) and insert it under the guide at the left edge. Press the chip down on the right side until the spring-loaded clip is engaged. Close the holder and lay the chip flat on the benchtop (c)

Table 1
Cell Suspension Volume Calculator Table

Each entry gives the volume of cell suspension stock (µl)/volume of nuclease-free water (µl). Rows: cell stock concentration (cells/µl). Columns: target cell recovery.

Stock   500        1000       2000       3000       4000       5000       6000       7000       8000       9000       10000
100     8.7/25.1   17.4/16.4  N/A        N/A        N/A        N/A        N/A        N/A        N/A        N/A        N/A
200     4.4/29.5   8.7/25.1   17.4/16.4  26.1/7.7   N/A        N/A        N/A        N/A        N/A        N/A        N/A
300     2.9/30.9   5.8/28.0   11.6/22.2  17.4/16.4  23.2/10.6  29.0/4.8   N/A        N/A        N/A        N/A        N/A
400     2.2/31.6   4.4/29.5   8.7/25.1   13.1/20.8  17.4/16.4  21.8/12.1  26.1/7.7   30.5/3.4   N/A        N/A        N/A
500     1.7/32.1   3.5/30.3   7.0/26.8   10.4/23.4  13.9/19.9  17.4/16.4  20.9/12.9  24.4/9.4   27.8/6.0   31.3/2.5   N/A
600     1.5/32.4   2.9/30.9   5.8/28.0   8.7/25.1   11.6/22.2  14.5/19.3  17.4/16.4  20.3/13.5  23.2/10.6  26.1/7.7   29.0/4.8
700     1.2/32.6   2.5/31.3   5.0/28.8   7.5/26.3   9.9/23.9   12.4/21.4  14.9/18.9  17.4/16.4  19.9/13.9  22.4/11.4  24.9/8.9
800     1.1/32.7   2.2/31.6   4.4/29.5   6.5/27.3   8.7/25.1   10.9/22.9  13.1/20.8  15.2/18.6  17.4/16.4  19.6/14.2  21.8/12.1
900     1.0/32.8   1.9/31.9   3.9/29.9   5.8/28.0   7.7/26.1   9.7/24.1   11.6/22.2  13.5/20.3  15.5/18.3  17.4/16.4  19.3/14.5
1000    0.9/32.9   1.7/32.1   3.5/30.3   5.2/28.6   7.0/26.8   8.7/25.1   10.4/23.4  12.2/21.6  13.9/19.9  15.7/18.1  17.4/16.4
1100    0.8/33.0   1.6/32.2   3.2/30.6   4.7/29.1   6.3/27.5   7.9/25.9   9.5/24.3   11.1/22.7  12.7/21.1  14.2/19.6  15.8/18.0
1200    0.7/33.1   1.5/32.4   2.9/30.9   4.4/29.5   5.8/28.0   7.3/26.6   8.7/25.1   10.2/23.7  11.6/22.2  13.1/20.8  14.5/19.3
1300    0.7/33.1   1.3/32.5   2.7/31.1   4.0/29.8   5.4/28.4   6.7/27.1   8.0/25.8   9.4/24.4   10.7/23.1  12.0/21.8  13.4/20.4
1400    0.6/33.2   1.2/32.6   2.5/31.3   3.7/30.1   5.0/28.8   6.2/27.6   7.5/26.3   8.7/25.1   9.9/23.9   11.2/22.6  12.4/21.4
1500    0.6/33.2   1.2/32.6   2.3/31.5   3.5/30.3   4.6/29.2   5.8/28.0   7.0/26.8   8.1/25.7   9.3/24.5   10.4/23.4  11.6/22.2
1600    0.5/33.3   1.1/32.7   2.2/31.6   3.3/30.5   4.4/29.5   5.4/28.4   6.5/27.3   7.6/26.2   8.7/25.1   9.8/24.0   10.9/22.9
1700    0.5/33.3   1.0/32.8   2.0/31.8   3.1/30.7   4.1/29.7   5.1/28.7   6.1/27.7   7.2/26.6   8.2/25.6   9.2/24.6   10.2/23.7
1800    0.5/33.3   1.0/32.8   1.9/31.9   2.9/30.9   3.9/29.9   4.8/29.0   5.8/28.0   6.8/27.0   7.7/26.1   8.7/25.1   9.7/24.1
1900    0.5/33.3   0.9/32.9   1.8/32.0   2.7/31.1   3.7/30.1   4.6/29.2   5.5/28.3   6.4/27.4   7.3/26.6   8.2/25.6   9.2/24.6
2000    0.4/33.4   0.9/32.9   1.7/32.1   2.6/31.2   3.5/30.3   4.4/29.5   5.2/28.6   6.1/27.7   7.0/26.8   7.8/26.0   8.7/25.1

3. Dispense 66.2 μl master mix into each well of an 8-tube strip on a chilled metal block resting on ice and then add the appropriate volume of nuclease-free water (see Cell Suspension Volume Calculator Table 1) into each well containing master mix.
4. Gently pipet-mix the tube containing the washed and diluted cells and add the appropriate volume (μl) of single-cell suspension (determined from the Cell Suspension Volume Calculator Table 1 and Note 8) to each well of the tube strip containing the master mix and nuclease-free water.
5. With a pipette set to 90 μl, gently pipet-mix the combined cells, master mix, and nuclease-free water five times while keeping the tube strip on ice.
6. Without discarding the pipette tips, transfer 90 μl master mix containing cells to the wells in the row labeled 1, taking care not to introduce bubbles. To do this, place the tips into the bottom center of the wells and raise the tips slightly above the bottom before slowly dispensing the master mix containing cells (Fig. 2a).
7. Snap the Single Cell 3′ Gel Bead Strip (equilibrated at room temperature for at least 30 min, see Note 9) into a 10×™ Vortex Adapter and vortex for 30 s. A 30-s wait while vortexing the Single Cell 3′ Gel Bead Strip is required to ensure proper priming of the master mix containing cells in the Single Cell A Chip. Remove the Single Cell 3′ Gel Bead Strip from the vortex, flick in a sharp, downward motion, and confirm that there are no bubbles at the bottom of the tube and that liquid levels are uniform.
8. Pipet Single Cell 3′ Gel Beads slowly as they have a high viscosity. Carefully puncture the foil seal and slowly aspirate 40 μl Single Cell 3′ Gel Beads, taking care not to introduce air bubbles, and dispense them into the bottom of the wells in the row labeled 2 (Fig. 2b).
9. Pipet 270 μl of Partitioning Oil into the wells in the row labeled 3 (Fig. 2c).
10. Attach the 10×™ Gasket. The notched cut should be at the top left corner. Ensure the 10× Gasket holes are aligned with the wells and keep the assembly horizontal to avoid wetting the 10× Gasket with Partitioning Oil (Fig. 2d).
11. Run the Chromium™ Controller. Press the button on the touchscreen of the Chromium Controller to eject the tray; place the assembled Chip, 10× Chip Holder, and 10× Gasket on the tray and press the button on the touchscreen again to retract the tray. Confirm that the Chromium Single Cell A program is displayed on screen and press the play button to begin the run. At the completion of the run (~6.5 min) proceed immediately to the next step.

Fig. 2 Loading the Single Cell A Chip. The loading order of reagents is critical for optimal performance. (a) After adding cells and water to the Master Mix, gently resuspend and transfer 90 μl into the row labeled 1. Wait for 30 s to allow the correct priming of the circuit and in the meantime vortex the gel bead strip for 30 s. (b) Puncture the foil seal on the bead strip for a number of wells equal to the number of samples to process. Aspirate 40 μl of gel beads very slowly, because the solution has a very high density, and load them into the row labeled 2. (c) Pipet a total volume of 270 μl of Partitioning Oil into the wells in the row labeled 3. (d) Attach the 10× Gasket (the notched cut should be at the top left corner)

3.3 GEM Transferring and RT Reaction
1. Press the eject button to eject the tray and remove the Single Cell A Chip. Remove and discard the 10×™ Gasket. Press the button to retract the empty tray.
2. Open the 10× Chip Holder and fold the lid back until it clicks to expose the wells at a 45° angle.
3. Slowly aspirate 100 μl GEMs from the lowest points of the Recovery Wells (row labeled ◄) as shown in Fig. 3a, without creating a seal between the tips and the bottom of the wells and avoiding the introduction of air bubbles. Pipet GEMs very slowly as they have high viscosity. Visually inspect the tips to confirm correct emulsion formation. A good emulsion should be opaque and uniform in all the tips used to recover the emulsion (Fig. 3b). A lack of uniformity (Fig. 3c) in the tips indicates that a failure, either a clog (red arrow) or a wetting failure (blue arrow), has occurred during GEM generation (see also Notes 10 and 11).
4. Over the course of ~20 s, dispense the GEMs into a 96-well PCR plate or 8-well PCR strips on a chilled metal block resting on ice with the pipette tips against the sidewalls of the wells. Keep the tips above the liquid level to minimize GEMs lost on the outside of the tips.

Fig. 3 GEM visual inspection. (a) Open the chip holder and fold the lid back to expose the recovery wells at a 45° angle and slowly aspirate 100 μl from the bottom of the Recovery Wells, avoiding the introduction of air bubbles. (b) Visually inspect the emulsion: it should appear opaque and uniform across all channels. (c) A nonuniform emulsion indicates that a failure has occurred: a clog (red arrow) or a wetting failure (blue arrow)

5. Discard the used Single Cell A Chip and close the 10×™ Chip Holder. If using 96-well PCR plates, seal the plate with pierceable foil heat seal. Load the sealed PCR plate or the 8-well PCR strip into a thermal cycler and run the following protocol. Thermal Protocol: Step 1: 45 min at 53 °C; Step 2: 5 min at 85 °C; Step 3: Hold 4 °C. Lid temperature: 53 °C. Reaction volume: 125 μl. For incubation use one of the suggested thermal cyclers (see Subheading 2). Store in the PCR plate or the 8-well strip at 4 °C for up to 72 h or at −20 °C for up to a week, or proceed directly to Post-GEM-RT Cleanup.

3.4 Post-GEM-RT Cleanup
1. Add 125 μl Recovery Agent to each well containing post-incubation GEMs without mixing. A biphasic mixture should form (Fig. 4a). Wait for 60 s and then, if using a PCR plate, transfer the entire volume to an 8-tube strip. The recovered biphasic mixture contains distinct Recovery Agent/Partitioning Oil (pink) and aqueous (clear) phases, with no persisting emulsion. An abnormal volume ratio of Recovery Agent/Partitioning Oil (pink) to aqueous phase (clear) can indicate or confirm that a failure has occurred (Fig. 4b and Notes 10 and 11).
2. Slowly remove 125 μl Recovery Agent/Partitioning Oil (pink) from the bottom of the tubes and discard it. Be careful not to aspirate any of the clear aqueous sample because the aqueous phase contains barcoded cDNA. A small volume of Recovery Agent/Partitioning Oil will remain (Fig. 4c). If an abnormal volume ratio of Recovery Agent/Partitioning Oil (pink) to aqueous phase was observed in the previous step, these discrepancies should be more evident at this stage (Fig. 4d and Notes 10 and 11).

Fig. 4 Post-GEM-RT Cleanup. To purify the barcoded cDNAs using magnetic beads, GEMs are first broken by adding Recovery Agent (pink reagent) to the emulsion. (a) After adding 125 μl of Recovery Agent, a biphasic mixture should form, containing a distinct Recovery Agent/Partitioning Oil phase (pink) and an aqueous phase (clear). (b) An abnormal volume ratio of Recovery Agent/Partitioning Oil (pink) to aqueous phase (clear) indicates a failure: a clog (red arrow) or a wetting failure (blue arrow). (c, d) Abnormalities are even more evident after Recovery Agent/Partitioning Oil removal in comparison to successful samples. (c) shows the same samples as (a) after Recovery Agent/Partitioning Oil removal, while (d) shows the same samples as (b) after Recovery Agent/Partitioning Oil removal

3. cDNA purification is performed using magnetic beads. Vortex the Dynabeads MyOne SILANE beads until fully resuspended. Prepare the Dynabeads Cleanup Mix and Elution Solution I by adding reagents in the order shown below and vortex mix thoroughly. Cleanup Mix: (1) nuclease-free water: 9 μl; (2) Buffer Sample Cleanup: 182 μl; (3) Dynabeads MyOne SILANE: 4 μl; (4) Additive A: 5 μl. If processing more than one sample per experiment prepare a master mix including a 10% excess. Elution Solution I: (1) Buffer EB: 98 μl; (2) 10% Tween 20: 1 μl; (3) Additive A: 1 μl.
4. Add 200 μl Dynabeads Cleanup Mix to each sample, pipet-mix to obtain a uniform suspension, and incubate at room temperature for 10 min.
5. After the 10-min incubation step is complete, place the tube strip on a magnetic separator until the supernatant is clear. Carefully remove and discard the supernatant.
6. Add 300 μl freshly prepared 80% ethanol to the pellet while on the magnet and let stand for 30 s. Carefully remove and discard the ethanol, and perform a second wash with 200 μl.
7. Remove and discard the second ethanol wash and allow the samples to air-dry for 1 min. Remove the tube strip from the magnet and add 35.5 μl of Elution Solution I.
8. Pipet-mix thoroughly until beads are fully resuspended (pipette set to 30 μl to avoid introducing air bubbles) and incubate at room temperature for 1 min. Place the tube strip in a magnetic separator until the solution is clear. Transfer 35 μl of purified GEM-RT product to a new tube strip.
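The recurring instruction to prepare a master mix with a 10% excess amounts to multiplying each per-sample volume by the number of samples and by 1.1. A minimal sketch, using the Cleanup Mix volumes from step 3 as input; the function and variable names are hypothetical and chosen only for illustration.

```python
# Per-sample volumes (ul) of the Dynabeads Cleanup Mix, as listed in step 3.
CLEANUP_MIX_UL = {
    "nuclease-free water": 9.0,
    "Buffer Sample Cleanup": 182.0,
    "Dynabeads MyOne SILANE": 4.0,
    "Additive A": 5.0,
}


def scale_master_mix(per_sample_ul, n_samples, excess=0.10):
    """Scale each reagent volume to n_samples plus a fractional excess."""
    factor = n_samples * (1.0 + excess)
    return {reagent: round(vol * factor, 1) for reagent, vol in per_sample_ul.items()}


# Example: four samples processed in parallel.
mix = scale_master_mix(CLEANUP_MIX_UL, n_samples=4)
for reagent, volume in mix.items():
    print(f"{reagent}: {volume} ul")
```

The same function can be applied to the other reaction mixes in this protocol by passing their per-sample volumes.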

3.5 cDNA Amplification
1. Prepare the cDNA Amplification Reaction Mix on ice. Add reagents in the order described below. cDNA Amplification Mix: (1) nuclease-free water: 8 μl; (2) amplification master mix: 50 μl; (3) cDNA additive: 5 μl; (4) cDNA primer mix: 2 μl. If processing more than one sample per experiment prepare a master mix including a 10% excess.
2. Vortex mix, centrifuge briefly, and add 65 μl cDNA Amplification Reaction Mix to each tube containing 35 μl of purified GEM-RT product.
3. Cap and load the tube strip into a thermal cycler that can accommodate at least 100 μl reaction volume and proceed with the following incubation protocol. Thermal Protocol: Step 1: 3 min at 98 °C; Step 2: 15 s at 98 °C; Step 3: 20 s at 67 °C; Step 4: 1 min at 72 °C; Step 5: go to Step 2 for N cycles; Step 6: 1 min at 72 °C; Step 7: Hold 4 °C. Lid temperature: 105 °C. Reaction volume: 100 μl. Use a compatible thermal cycler. The number of PCR cycles is determined according to the target cell recovery as shown in Table 2 (see Note 12). Store the samples at 4 °C in a tube strip for up to 72 h or proceed directly to SPRIselect Cleanup.
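The choice of the cycle number N from Table 2 can be written as a simple lookup. In this sketch the bands of Table 2 share their boundary values (for example, 6000 appears in two rows), so assigning a boundary to the band with more cycles is an assumption made here, not a rule given by the protocol. The optional extra_cycles argument is a hypothetical parameter reflecting the suggestion in Note 12 to add a cycle for cells with low RNA content.

```python
def cdna_amplification_cycles(target_cell_recovery, extra_cycles=0):
    """Return the total cDNA amplification cycles suggested by Table 2.

    Boundary values (2000 excepted, which Table 2 places in the second band)
    are assigned to the band with more cycles; this is an assumption.
    """
    if target_cell_recovery < 2000:
        cycles = 14
    elif target_cell_recovery <= 6000:
        cycles = 12
    elif target_cell_recovery <= 10000:
        cycles = 10
    else:
        cycles = 8
    return cycles + extra_cycles


# Example from Note 12: 5000 T cells, one cycle added for low RNA content.
print(cdna_amplification_cycles(5000, extra_cycles=1))
```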

3.6 Post-cDNA Amplification Reaction Cleanup and Quality Control
1. Vortex the SPRIselect Reagent (see Notes 13 and 14) until fully resuspended, add 60 μl SPRIselect Reagent (0.6×) to each sample in the tube strip, and pipet-mix.
2. Incubate the tube strip at room temperature for 5 min. Place the tube strip in a magnetic separator until the solution is clear and carefully remove and discard the supernatant.
3. Wash the pellet twice with 200 μl 80% ethanol: for each wash, let stand for 30 s and then carefully remove and discard the ethanol.

Table 2 cDNA amplification. Optimal number of cycles

Targeted cell recovery    Total cDNA amplification cycles
<2000                     14
2000–6000                 12
6000–10,000               10
>10,000                   8

4. Remove and discard any remaining ethanol and allow the samples to air-dry for 2 min. Do not exceed 2 min as this will lead to decreased elution efficiency. Remove the tube strip from the magnetic separator and add 40.5 μl Buffer EB.
5. Pipet-mix 15 times and incubate at room temperature for 2 min. Place the tube strip in a magnetic separator until the solution is clear, transfer 40 μl of sample to a new tube strip, and cap the sample wells. Samples can be stored at 4 °C in a tube strip for up to 72 h or at −20 °C for up to a week; otherwise proceed directly to Post-cDNA Amplification QC and Quantification.
6. For qualitative analysis run 1 μl of sample at a 1:5 dilution in nuclease-free water on the Agilent Bioanalyzer High Sensitivity chip or on the Agilent TapeStation High Sensitivity D1000 ScreenTape. Traces should resemble the overall shape of the sample electropherogram shown in Fig. 5a. Determine the cDNA yield per sample (see Note 15). The concentration will be used to determine the appropriate number of Sample Index PCR cycles to generate a sufficient concentration of the final library.

3.7 Library Construction: Fragmentation, End Repair, and A-Tailing
1. Prepare a thermal cycler with the following incubation protocol and initiate the 4 °C precool block step prior to assembling the fragmentation mix. Thermal Protocol: Step 1: Precool block and Hold 4 °C; Step 2 (Fragmentation): 5 min at 32 °C; Step 3 (End Repair and A-Tailing): 30 min at 65 °C; Step 4: Hold 4 °C. Lid temperature: 65 °C. Reaction volume: 50 μl. Use a compatible thermal cycler. Vortex the fragmentation buffer and verify that there is no precipitate before proceeding. Prepare the fragmentation mix on ice, adding the reagents in the order shown below. Fragmentation mix: (a) fragmentation enzyme blend: 10 μl; (b) fragmentation buffer: 5 μl. If processing more than one sample per experiment prepare a master mix including a 10% excess.
2. Mix thoroughly, centrifuge briefly, and dispense 15 μl fragmentation mix into each well of an 8-tube strip on a chilled metal block resting on ice.
3. Add 35 μl purified cDNA to each well of the tube strip containing the fragmentation mix. Pipet-mix, transfer the chilled tube strip into the precooled thermal cycler (4 °C), and press “SKIP” to initiate the Fragmentation protocol.
4. Double Sided Size Selection (see Note 14). Resuspend the SPRIselect Reagent by vortexing and add 30 μl (0.6×) to each sample in the tube strip, pipet-mix, and incubate the tube strip at room temperature for 5 min.
5. Place the tube strip in a magnetic separator until the supernatant is clear and transfer 75 μl supernatant to a new tube strip.

Fig. 5 cDNA and library quality control. (a) Electrophoretic profile of purified cDNA after RT reaction loaded on an Agilent Bioanalyzer High Sensitivity Chip. A good cDNA profile should resemble the one shown in the figure. The average size (red box) and the cDNA concentration (blue box) are automatically determined by the instrument. (b) Electrophoretic profile of the final sequencing library loaded on an Agilent Bioanalyzer High Sensitivity Chip. A good library should resemble the one shown in the figure. The average size (red box) and the cDNA concentration (blue box) are automatically determined by the instrument

6. Discard the previous tube strip (containing the beads) and add 10 μl SPRIselect Reagent (0.8×) to each sample in the tube strip. Pipet-mix and incubate the tube strip at room temperature for 5 min. Place the tube strip in a magnetic separator until the solution is clear and carefully remove and discard 80 μl supernatant.
7. With the tube strip still in the separator, wash the pellet twice with 125 μl 80% ethanol, letting it stand for 30 s each time.
8. Carefully remove and discard the remaining ethanol wash and resuspend directly with Buffer EB, without waiting for the beads to dry, to ensure maximum elution efficiency. Remove the tube strip from the magnetic separator and add 50.5 μl Buffer EB. Pipet-mix.
9. Incubate the tube strip at room temperature for 2 min. Place the tube strip in a magnetic separator until the solution is clear and transfer 50 μl of sample to a new tube strip.
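The SPRIselect volumes used in the double-sided size selection follow from the two bead-to-sample ratios: the first addition brings the ratio to 0.6×, and the second tops it up to a cumulative 0.8× relative to the original sample volume. The sketch below assumes exactly this reading of the ratios (it ignores the small volume left behind when the supernatant is transferred); the function name is hypothetical.

```python
def double_sided_spri_volumes(sample_ul, right_ratio=0.6, left_ratio=0.8):
    """Return (first_addition_ul, second_addition_ul) of SPRIselect Reagent.

    right_ratio: ratio after the first addition (beads with large fragments
                 are discarded).
    left_ratio:  cumulative ratio after the second addition (beads are kept).
    """
    if left_ratio <= right_ratio:
        raise ValueError("left_ratio must be larger than right_ratio")
    first_ul = right_ratio * sample_ul
    second_ul = (left_ratio - right_ratio) * sample_ul
    return round(first_ul, 1), round(second_ul, 1)


# 50 ul fragmentation reaction (this section) and 100 ul Sample Index PCR reaction.
print(double_sided_spri_volumes(50))
print(double_sided_spri_volumes(100))
```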

3.8 Library Construction: Adaptor Ligation
1. Prepare the Adaptor Ligation Mix by adding the reagents in the order shown below. Mix thoroughly and centrifuge briefly. Adaptor Ligation Mix: (1) nuclease-free water: 17.5 μl; (2) ligation buffer: 20 μl; (3) DNA ligase: 10 μl; (4) adaptor mix: 2.5 μl. If processing more than one sample per experiment prepare a master mix including a 10% excess. Add 50 μl Adaptor Ligation Mix to each tube containing 50 μl sample from the Post-Fragmentation, End Repair and A-tailing Size Selection. Pipet-mix (pipette set to 50 μl) and incubate in a thermal cycler with the following protocol. Thermal Protocol: Step 1: 15 min at 20 °C. Lid temperature: 30 °C. Reaction volume: 100 μl. Use a compatible thermal cycler.
2. Post-ligation cleanup. Vortex the SPRIselect Reagent until fully resuspended and add 80 μl SPRIselect Reagent (0.8×) to each sample in the tube strip. Pipet-mix 15 times and incubate the tube strip at room temperature for 5 min (see Note 13).
3. Place the tube strip in a magnetic separator until the solution is clear and carefully remove and discard the supernatant. Add 200 μl 80% ethanol to the pellet and let stand for 30 s. Remove and discard the ethanol wash and repeat this step for a total of two washes. Allow the samples to air-dry for 2 min. (Do not exceed 2 min as this will lead to decreased elution efficiency.)
4. Remove the tube strip from the magnetic separator and add 30.5 μl Buffer EB. Pipet-mix and incubate the tube strip at room temperature for 2 min.
5. Place the tube strip in a magnetic separator until the solution is clear and transfer 30 μl of sample to a new tube strip.

3.9 Library Construction: Sample Index PCR
1. Prepare the Sample Index PCR Mix by adding the reagents in the order shown below. Sample Index PCR Mix: (a) nuclease-free water: 8 μl; (b) amplification master mix: 50 μl; (c) sample index (SI) PCR primer: 2 μl. If processing more than one sample per experiment prepare a master mix including a 10% excess. Mix and add 60 μl Sample Index PCR Mix to each tube containing 30 μl purified Post-Ligation sample. Add 10 μl of an individual Chromium i7 Sample Index to each well and record the assignment. This index is a “molecular tag” that allows sequencing reads to be assigned to specific samples after sequencing. Mix, centrifuge briefly, and index the DNA library in a thermal cycler with the following protocol. Thermal Protocol: Step 1: 45 s at 98 °C; Step 2: 20 s at 98 °C; Step 3: 30 s at 54 °C; Step 4: 20 s at 72 °C; Step 5: go to Step 2 for N cycles; Step 6: 1 min at 72 °C; Step 7: Hold 4 °C. Lid temperature: 105 °C. Reaction volume: 100 μl. Use a compatible thermal cycler. Choose the appropriate sample index sets to ensure that no sample indices overlap in a multiplexed sequencing run. Record the 10× Sample Index name (PN-220103 Chromium™ i7 Sample Index Plate well ID) used, especially if running more than one sample. The optimal number of cycles for the Sample Index PCR reaction is a compromise between obtaining enough material for sequencing and minimizing PCR amplification biases. Table 3 is a starting point for this optimization; the number of amplification cycles should be determined according to the starting amount of cDNA. Store the tube strip at 4 °C for up to 72 h or proceed directly to Post-Sample Index PCR Double Sided Size Selection.

Table 3 Sample index PCR. Optimal number of cycles

Input into library construction, ng    Total sample index cycles
1–25                                   14–16
25–150                                 12–14
150–500                                10–12
500–1000                               8–10
1000–1500                              6–8
>1500                                  5

2. Post-Sample Index PCR Double Sided Size Selection (see Note 14). Vortex the SPRIselect Reagent and add 60 μl (0.6×) to each sample in the tube strip, pipet-mix, and incubate the tube strip at room temperature for 5 min.
3. Place the tube strip in a magnetic separator until the solution is clear, transfer 150 μl supernatant to a new tube strip, and discard the previous tube strip (containing the beads).
4. Add 20 μl SPRIselect Reagent (0.8×) to each sample in the tube strip, pipet-mix 15 times, and incubate the tube strip at room temperature for 5 min.
5. Place the tube strip in a magnetic separator until the solution is clear, carefully remove and discard 165 μl supernatant and, with the tube strip still in the magnetic separator, add 200 μl 80% ethanol to the pellet and let stand for 30 s.
6. Carefully remove and discard the ethanol wash and repeat the ethanol wash of step 5 for a total of two washes.
7. Remove the tube strip from the magnetic separator and resuspend directly with 35.5 μl of Buffer EB, without waiting for the beads to dry, to ensure maximum elution efficiency. Incubate the tube strip at room temperature for 2 min.
8. Place the tube strip in a magnetic separator until the solution is clear, transfer 35 μl of sample to a new tube strip, and cap the sample wells. Store the tube strip at 4 °C for up to 72 h or at −20 °C for long-term storage.
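The number of Sample Index PCR cycles in step 1 depends on the cDNA mass carried into library construction, which can be estimated from the measured cDNA concentration and the 35 μl used for fragmentation (Subheading 3.7). The sketch below encodes Table 3 as a lookup. Since adjacent bands of Table 3 share their boundary values, assigning a boundary to the band with more cycles is an assumption made here, as are the function and variable names.

```python
# (upper bound of input in ng, (min_cycles, max_cycles)), transcribed from Table 3.
TABLE_3 = [
    (25, (14, 16)),
    (150, (12, 14)),
    (500, (10, 12)),
    (1000, (8, 10)),
    (1500, (6, 8)),
]


def sample_index_cycles(cdna_ng_per_ul, volume_ul=35.0):
    """Return (input_ng, (min_cycles, max_cycles)) for the Sample Index PCR."""
    input_ng = cdna_ng_per_ul * volume_ul
    if input_ng < 1:
        raise ValueError("input below 1 ng is not covered by Table 3")
    for upper_ng, cycles in TABLE_3:
        if input_ng <= upper_ng:
            return input_ng, cycles
    return input_ng, (5, 5)  # more than 1500 ng


# Example: 2 ng/ul of cDNA, 35 ul carried into fragmentation.
print(sample_index_cycles(2.0))
```

The concentration passed in should already be corrected for the dilution used on the Bioanalyzer (see Note 15).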

3.10 Post-library Construction QC and Quantification
1. For qualitative analysis load 1 μl of sample at a 1:10 dilution on the Agilent Bioanalyzer High Sensitivity chip or on the Agilent TapeStation High Sensitivity D1000 ScreenTape. A typical electropherogram is shown in Fig. 5b. Determine the average fragment size from the Bioanalyzer/TapeStation trace. Determine the library yield per sample using the Kapa DNA Quantification Kit for Illumina platforms (see Note 16). Total yield can also be calculated using fluorimetric DNA quantification methods such as the Qubit dsDNA HS Assay Kit (Thermo Fisher) or the QuantiFluor dsDNA System (Promega).
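When the library is quantified fluorimetrically, the mass concentration can be combined with the average fragment size from the electropherogram to estimate a molar concentration for pooling. The conversion below is not part of this protocol; it is the commonly used approximation that assumes an average mass of 660 g/mol per base pair of double-stranded DNA, and the function name is hypothetical.

```python
AVG_MASS_PER_BP = 660.0  # g/mol per base pair of dsDNA (common approximation)


def library_molarity_nm(conc_ng_per_ul, avg_fragment_bp):
    """Convert a dsDNA concentration in ng/ul to nM for a given average size."""
    if avg_fragment_bp <= 0:
        raise ValueError("average fragment size must be positive")
    return conc_ng_per_ul * 1e6 / (AVG_MASS_PER_BP * avg_fragment_bp)


# Example with illustrative numbers: 10 ng/ul library, 450 bp average size.
print(round(library_molarity_nm(10.0, 450), 2))
```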

3.11 Sequencing
1. Sequencing Library Description and Depth Recommendations. A Single Cell 3′ Library comprises standard Illumina paired-end constructs which begin and end with P5 and P7. Read 1 and Read 2 are standard Illumina sequencing primer sites used in paired-end sequencing. Read 1 is used to sequence the 16 bp 10× Barcode and 10 bp UMI, while Read 2 is used to sequence the cDNA fragment. Each sample index provided in the Chromium™ i7 Sample Index Kit combines 4 different sequences in order to balance across all four nucleotides.

Table 4 Sequencing run parameters

Sequencing read    Number of cycles
Read1              26
i7 Index           8
i5 Index           0
Read2              98
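The 26 cycles of Read 1 in Table 4 correspond to the 16 bp 10× Barcode followed by the 10 bp UMI. As an illustration of this layout, the sketch below splits a Read 1 sequence into its two parts; the function name and the example sequence are made up for this purpose, and real data are normally processed with a dedicated pipeline rather than by hand.

```python
BARCODE_LEN = 16  # 10x Barcode length in bp
UMI_LEN = 10      # UMI length in bp


def split_read1(read1_seq):
    """Split a Read 1 sequence into (cell_barcode, umi)."""
    if len(read1_seq) < BARCODE_LEN + UMI_LEN:
        raise ValueError("Read 1 is shorter than barcode + UMI (26 bp)")
    barcode = read1_seq[:BARCODE_LEN]
    umi = read1_seq[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
    return barcode, umi


# Made-up 26-base read used only to show the layout.
example_read = "ACGTACGTACGTACGT" + "TTGCATTGCA"
print(split_read1(example_read))
```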

The technical performance of Single Cell 3′ libraries is driven by sequencing coverage per cell. Fifty thousand raw reads per cell is recommended. This sequencing depth is optimal for libraries generated from T lymphocytes (see Note 17).
2. Sequencing Run Parameters. Single Cell 3′ libraries can be sequenced using MiSeq, NextSeq 500/550, HiSeq 2500, or HiSeq 3000/4000 platforms. Libraries should be run using paired-end sequencing with single indexing. The supported number of cycles for each read is shown in Table 4.
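The recommendation of 50,000 raw reads per cell translates directly into the number of reads to request for a run and into the share of the pool that each library should occupy. A minimal sketch, assuming that the number of recovered cells per sample is known or estimated; the function name and the example cell numbers are hypothetical.

```python
READS_PER_CELL = 50_000  # raw reads per cell recommended in this protocol


def plan_sequencing(cells_per_sample, reads_per_cell=READS_PER_CELL):
    """Return (reads per sample, total reads, pooling fraction per sample)."""
    reads = {name: n * reads_per_cell for name, n in cells_per_sample.items()}
    total = sum(reads.values())
    fractions = {name: r / total for name, r in reads.items()}
    return reads, total, fractions


# Example: two libraries with different numbers of recovered cells.
reads, total, fractions = plan_sequencing({"sample_A": 5000, "sample_B": 3000})
print(reads, total, fractions)
```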

4 Notes

1. Single Cell 3′ library and gel bead kit v2. This kit is the most expensive reagent of the protocol. We suggest ordering reagents only after planning the experiment very carefully. Reagents are sold in 16- or 4-reaction formats and, to avoid technical variability, we suggest performing the experiment with reagents from the same lot number.
2. Chromium™ Single Cell A Chip Kit. Microfluidic chips can be purchased separately from reagents in a six-chip (48 reactions) or two-chip (16 reactions) format. Each chip accommodates eight independent reactions and, since each channel can be loaded with 500 to 10,000 cells, a maximum of 80,000 cells can be analyzed per run. Chips are disposable: if a chip is run without filling all the channels, the empty channels cannot be reused.
3. Chromium™ i7 multiplex kit, 96 rxns. This kit contains 96 independent mixes of “sample index primers.” The sample index is added by PCR (see Subheading 3.9) because the primers partially overlap the adapter added by ligation during library preparation. This mix is used to assign a specific index to each sample: each cell is indexed during the RT reaction, but no index has yet been assigned to the samples to distinguish them if they are pooled during sequencing. Remember to record the well number once you index your sample to allow correct demultiplexing after sequencing.
4. The 10×™ magnetic separator is an 8-well strip magnetic separator used in all the purification steps of the protocol that employ magnetic beads. This magnet can be replaced with other commercially available magnets, but we strongly recommend the 10× magnet because it is very efficient and allows very precise work with very small volumes (in some purification steps just 10 μl of beads is used).
5. Plastics and special equipment. Some plastics can interact with and destabilize GEMs. It is therefore critical to use validated emulsion-safe plastic consumables when handling GEMs. 10× Genomics has validated Eppendorf twin.tec PCR plates and Rainin LTS low-retention pipette tips as GEM-compatible plastics. USA Scientific, Eppendorf, and Thermo Fisher PCR 8-tube strips have also been validated. Substituting these materials can adversely affect performance. Considering the high viscosity of the reagents to be handled (especially the beads), the use of low-retention tips is recommended to avoid sample and solution loss and to obtain high reproducibility. A list of suggested thermal cyclers is provided in Subheading 2. These cyclers have been tested for uniform heating of the emulsion, which is essential for emulsion stability and the efficiency of the protocol.
6. Cell preparation. The quality of the starting single-cell suspension is one of the major factors to take into consideration for the generation of good-quality single-cell mRNA-seq results. This suspension should have a high viability (>80%), and the cell concentration should be precisely calculated to guarantee good reproducibility. Cell viability is one of the most important aspects affecting cell recovery: dying cells have a lower amount of RNA that cannot be efficiently captured during the RT reaction, and this decreases the number of cells that we expect to recover. Moreover, dead cells lyse very easily, releasing RNA called “ambient RNA” that can be incorporated in droplets along with intact cells, increasing the background. Dead cells can be eliminated with PBS washes or using specific protocols for dead cell removal (MACS Dead Cell Removal Kit, Miltenyi Biotec). To calculate concentration, we suggest using both manual counting and an automated cell counter after staining cells with Trypan blue or a viability dye. Like viability, the quantification of cell concentration is critical because overestimation of cell concentration decreases the expected recovery.

7. Handling reagents and master mixes. Before starting, ensure that reagents are fully thawed and thoroughly mixed. All enzyme components and master mixes should be kept on ice during setup and promptly moved back to the recommended storage temperature when possible.
8. Cell loading. The 10× Chromium has a processing efficiency of up to 65%, meaning that the number of recovered cells will be up to 65% of the cells loaded in each channel of the chip. In the User Guide, 10× Genomics provides a “Cell Suspension Volume Calculator Table” (Table 1). This table is a useful tool to calculate the volume of quantified cell suspension to load on the chip in order to achieve a desired targeted cell recovery. The left column lists the cell concentration in cells/μl, while the row at the top of the table lists the number of cells targeted for recovery. Each entry gives the volume of cell suspension (red) and the volume of water (blue) to add to the master mix in order to achieve the targeted number of recovered cells for a particular sample. For example, for a cell stock concentration of 1000 cells/μl we need to add 25.1 μl of water to the master mix first, and then add 8.7 μl of the cell suspension, in order to capture 5000 cells. Another important point concerning cell loading is cell resuspension before loading. After quantification, cells are usually kept on ice, and if they sit in the tube for a long time they begin to settle. If cells are not properly resuspended before loading, the number of cells loaded can differ considerably depending on whether one pipets from the top or the bottom of the solution in the tube, and the cell recovery can be widely inconsistent with the one calculated after cell quantification.
9. Gel beads storage. The Single Cell 3′ Gel Bead Strip should be stored at −80 °C and equilibrated to room temperature before use. Unused Single Cell 3′ gel beads should be stored at −80 °C, avoiding more than ten freeze–thaw cycles. Gel beads should be pipetted very slowly as they have a viscosity similar to high-concentration glycerol.
10. Chip loading. Wetting failures. Once reagents are added to the Chromium™ Chip wells, they immediately flow into and prime the microfluidic channels on the chip. A wetting failure is the result of an incorrect priming of the chip in which, instead of a uniform emulsion, millimeter-scale droplets are formed. To favor correct priming and to minimize the occurrence of wetting failures, it is critical to add reagents in the correct order and to wait at least 30 s, but no more than 120 s, between the addition of master mix and the addition of Gel Beads. The chip should be run within 120 s of loading, as delays can potentially cause failures. The occurrence of a wetting failure can be easily recognized by the absence of a uniform emulsion in the outlet well or pipette tip (Fig. 3c) and can be further confirmed during the Post-GEM-RT Cleanup step just after the addition of Recovery Agent: if a wetting failure occurs in a specific channel, the volume ratio between the Partitioning Oil/Recovery Agent phase and the aqueous phase is abnormal compared to normal samples (Fig. 4b, d).
11. Chip loading. Clogs. The generation of GEMs occurs in channels that are narrower than 100 μm. The presence of particles or fibers on the working station, or the presence of cell aggregates in the single-cell suspension due to improper preparation, can affect emulsion generation by creating clogs in the microfluidic circuits. A clog can be easily recognized in the pipette tip during the recovery phase: the volume of emulsion is reduced compared to the successful samples and an excess of oil (clear) or air is observed (Fig. 3c). To avoid clogs, it is also important to minimize exposure of reagents, chips, and gaskets to sources of particles and fibers and to check for the absence of aggregates in the single-cell suspension to be loaded on the chip. Clogs can be further confirmed during the Post-GEM-RT Cleanup step just after the addition of Recovery Agent: if a clog occurs in a specific channel, the volume of aqueous phase during recovery will be decreased compared to successful reactions (Fig. 4b, d).
12. cDNA amplification cycles. Table 2 lists the suggested number of cycles for the cDNA amplification reaction. Suggested cycles are determined according to the starting number of cells, in order to “normalize” samples derived from different numbers of cells and to obtain enough material for library preparation. As different cells contain different amounts of RNA, a range of cycles is suggested.
T cells contain very low amounts of RNA, and for this reason it can be useful to increase the suggested cycles. As an example, when starting from 5000 T cells we suggest using 13 cycles instead of the 12 suggested by the protocol.
13. Solid-phase reversible immobilization beads (SPRI beads) are paramagnetic, meaning that they are magnetic only when exposed to a magnetic field (the magnetic stand in our case). These particles are coated with carboxyl groups that can bind DNA nonspecifically and reversibly. The beads are resuspended in a solution containing polyethylene glycol (PEG), whose concentration directly affects their capacity to bind DNA fragments according to their size; this property is exploited in DNA purification procedures to select specific DNA fragments from samples highly heterogeneous in size. In general, the higher the concentration of PEG and salt in the solution, the shorter the cutoff size and, therefore, the lower the starting molecular weight of the purified products. Part of the reason for this effect is that DNA fragment size affects the total charge per molecule, with larger DNAs having larger charges; this promotes their electrostatic interaction with the beads and displaces smaller DNA fragments.
14. Double size selection. This protocol can be used to generate DNA fragment libraries with a certain size range or to narrow the fragment-size distribution; both smaller and larger fragments can, in fact, be removed. A first step enables binding of all DNA fragments longer than the desired upper limit of the interval. The beads with the unwanted larger DNA fragments are discarded (0.6×, right side selection). The supernatant, which contains DNA fragments shorter than the upper length cutoff, is transferred to a new tube to perform the second size selection step (0.8×, left side selection).
15. cDNA quality control and quantification. cDNA quality control is performed by running 1 μl of sample at a dilution of 1 part sample to 5 parts nuclease-free water on the Agilent Bioanalyzer High Sensitivity chip. Considering that PCR cycles in the cDNA amplification step are calculated according to the number of cells recovered, it may be necessary to load undiluted PCR product when working with very small cells with a very low amount of RNA. cDNA concentration can then be extrapolated from the “Electropherogram” view by choosing the “Region Table” tab in the Agilent 2100 Expert Software and manually selecting the region encompassing 200–9000 bp. The concentration is then corrected according to the initial dilution. This concentration will be used in Subheading 3.9 to determine the appropriate number of Sample Index PCR cycles to generate a sufficient amount of final library.
Total yield can also be calculated using fluorimetric DNA quantification methods such as the Qubit dsDNA HS Assay Kit (Thermo Fisher) or the QuantiFluor dsDNA System (Promega).

16. Library quality control and quantification. Post-library construction quality control is performed by running 1 μl of sample at 1:10 dilution on the Agilent Bioanalyzer High Sensitivity chip. Traces should resemble the overall shape of the sample electropherogram shown below. Determine the average fragment size from the Bioanalyzer trace and use it as the reference insert size for accurate library quantification by qPCR. Post-library construction quantification is performed using the Kapa DNA Quantification Kit for Illumina platforms. This is a qPCR-based method that quantifies DNA against a universal DNA standard of known concentration. A dilution series of 1:40,000, 1:200,000, 1:1,000,000, and 1:5,000,000 of the completed Single Cell 3′ library is required to fall within the dynamic range of the assay. Total yield can also be calculated using fluorimetric DNA quantification methods such as the Qubit dsDNA HS Assay Kit (Thermo Fisher) or the QuantiFluor dsDNA System (Promega).

17. Sequencing depth. The Chromium System from 10x Genomics is a cost-effective method for shallow sequencing of thousands to tens of thousands of single cells in one run. (Each chip contains 8 loading channels, and each channel can be loaded with 500 to 10,000 cells.) The sensitivity of emulsion-based systems, defined as the fraction of each cell’s transcriptome represented in the final sequencing library, is lower than that of other systems (3–10% compared with 10–20% for other methods). This limitation can be partially offset by increasing the depth at which libraries are sequenced. Sequencing depth is mainly limited by cost, and lower sequencing depth limits the complexity of the expression profile attained per cell. At a low sequencing depth, only the most highly expressed genes will be observed; consequently, the number of cells to load should be decided according to the scientific question of interest. Studies aiming to identify cell clusters that can be defined by many genes, with an emphasis on finding rare cell populations, should prioritize a breadth-based approach (shallow sequencing of tens of thousands of cells), whereas studies aiming to distinguish stochastic variation in individual genes should prioritize a high depth of sequencing (deeper sequencing of fewer single cells).
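The breadth-versus-depth trade-off in Note 17 comes down to dividing a fixed read budget among the loaded cells. The following is a minimal sketch of that arithmetic; the function names and the 400 million read budget are illustrative assumptions, not values from the protocol.

```python
def reads_per_cell(total_reads: float, n_cells: int) -> float:
    """Mean sequencing depth per cell for a fixed read budget."""
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    return total_reads / n_cells


def max_cells_for_depth(total_reads: float, target_reads_per_cell: float) -> int:
    """Largest number of cells that still receives the target mean depth."""
    if target_reads_per_cell <= 0:
        raise ValueError("target_reads_per_cell must be positive")
    return int(total_reads // target_reads_per_cell)


if __name__ == "__main__":
    run_reads = 400e6  # hypothetical read budget for one run (assumption)
    for cells in (2_000, 10_000, 40_000):
        depth = round(reads_per_cell(run_reads, cells))
        print(f"{cells} cells -> {depth} reads per cell")
```

Loading more cells at the same budget lowers the mean depth proportionally, which is why rare-population studies and single-gene variability studies call for different loading decisions.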

Chapter 8

Seq-Well: A Sample-Efficient, Portable Picowell Platform for Massively Parallel Single-Cell RNA Sequencing

Toby P. Aicher, Shaina Carroll, Gianmarco Raddi, Todd Gierahn, Marc H. Wadsworth II, Travis K. Hughes, Chris Love, and Alex K. Shalek

Abstract

Seq-Well is a low-cost picowell platform that can be used to simultaneously profile the transcriptomes of thousands of cells from diverse, low input clinical samples. In Seq-Well, uniquely barcoded mRNA capture beads and cells are co-confined in picowells that are sealed using a semipermeable membrane, enabling efficient cell lysis and mRNA capture. The beads are subsequently removed and processed in parallel for sequencing, with each transcript’s cell of origin determined via the unique barcodes. Due to its simplicity and portability, Seq-Well can be performed almost anywhere.

Key words Seq-Well, Single-cell RNA sequencing, Single-cell genomics, Systems biology, Transcriptomics, RNA-Seq, Picowells

1 Introduction

Single-cell RNA sequencing (scRNA-seq) is an emerging method that enables genome-wide expression profiling at cellular resolution. Population-level transcriptomic techniques, such as microarrays and bulk RNA-seq, average over a large number of cells and assume transcriptional homogeneity; yet even related cells of the same subtype can present dramatic heterogeneity in their transcriptional activities and states [1]. ScRNA-seq allows direct measurement of this variability, as well as analyses of expression covariation across cells. This information can be used to discover gene-expression patterns that define distinct cell types and states, as well as their molecular circuits and biomarkers, affording an unprecedented view of cellular phenotype. Over the years, technological progress and protocol improvements have resulted in a substantial increase

Junior authors, Toby P. Aicher, Shaina Carroll, and Gianmarco Raddi, and senior authors, Chris Love and Alex K. Shalek, contributed equally to this work.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_8, © Springer Science+Business Media, LLC, part of Springer Nature 2019

in the number of cells that can be processed in parallel [2–4], enhancing statistical power and providing opportunities to look at increasingly complex systems. Current methods used to prepare single-cell libraries include manual selection [5], FACS sorting [6], microfluidic circuits [7], droplet-based techniques [8–10], and picowells [11, 12]. Seq-Well, an example of the latter, is an easy-to-use, low-cost, sample-efficient, and portable platform for massively parallel scRNA-seq [11]. Seq-Well utilizes PDMS arrays containing ~88,000 subnanoliter wells in which single cells and uniquely barcoded poly(dT) mRNA capture beads are co-confined with a semipermeable membrane. Crucially, well size ensures that only one barcoded mRNA capture bead can fit into each well, improving cell capture efficiency. Cells, meanwhile, are loaded at a low density to minimize cell doublets, ensuring single-cell resolution. Selective chemical functionalization allows reversible attachment of a semipermeable polycarbonate membrane with 10 nm pores, permitting buffer exchange for cell lysis while trapping larger macromolecules, such as nucleic acids, to minimize cross-contamination. The co-confined mRNA capture beads are covered in oligonucleotides that consist of a universal primer, a cell barcode (unique to each bead), a unique molecular identifier (UMI; unique to each primer), and a poly-T sequence that can capture cellular mRNA upon lysis and during hybridization [13]. Following these steps, the semipermeable membrane can be peeled off for bead removal. Finally, the barcoded beads can be pooled for reverse transcription, PCR amplification, library preparation, and sequencing, with a transcript’s cell of origin and uniqueness determined via its cell barcode and UMI, respectively.
Importantly, implementing Seq-Well only requires a PDMS array, a polycarbonate membrane, a pipette, a clamp, an oven/heat source, and a tube rotator to produce stable cDNA product, making it functional in nearly every clinic and laboratory context.
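The statement that cells are loaded at low density to minimize doublets can be illustrated with a simple occupancy estimate. The sketch below assumes that cells settle into wells independently and at random (a Poisson model, which is our assumption and not part of the protocol), and uses the ~88,000 wells quoted above together with the 15,000 cells loaded in Subheading 3.3. It ignores whether a well also contains a bead.

```python
import math


def well_occupancy(n_cells: int, n_wells: int) -> dict:
    """Poisson estimate of how many wells hold 0, 1, or more than 1 cell."""
    if n_cells < 0 or n_wells <= 0:
        raise ValueError("n_cells must be >= 0 and n_wells must be > 0")
    lam = n_cells / n_wells            # mean number of cells per well
    p_empty = math.exp(-lam)           # P(0 cells)
    p_single = lam * p_empty           # P(exactly 1 cell)
    p_multi = 1.0 - p_empty - p_single  # P(2 or more cells)
    occupied = 1.0 - p_empty
    return {
        "mean_cells_per_well": lam,
        "empty": p_empty,
        "single": p_single,
        "multiple": p_multi,
        # fraction of cell-containing wells that hold more than one cell
        "multiplet_rate": p_multi / occupied if occupied > 0 else 0.0,
    }


if __name__ == "__main__":
    estimate = well_occupancy(n_cells=15_000, n_wells=88_000)
    for key, value in estimate.items():
        print(f"{key}: {value:.4f}")
```

Under this model, lowering the number of loaded cells reduces the multiplet rate at the cost of fewer profiled cells per array.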

2 Materials

All buffers and solutions are to be prepared with ultrapure water and stored at room temperature, unless otherwise indicated.

2.1 Array Processing Prior to Reverse Transcription

1. Bead loading buffer (BLB): 10% BSA, 100 mM sodium carbonate, pH 10. Add 2.5 mL BSA (100 mg/mL) to a 50 mL Falcon tube. Add water to ~15 mL followed by 1.25 mL 2 M sodium carbonate. Add additional water to achieve a final volume of 25 mL. Titrate with glacial acetic acid to reach pH 10 (see Note 1).
2. Prelysis buffer: 5 M guanidine thiocyanate, 1 mM EDTA (see Notes 2 and 3).
3. Complete lysis buffer: 5 M guanidine thiocyanate, 1 mM EDTA, 0.5% sarkosyl, 1% β-mercaptoethanol. Combine 5 mL prelysis buffer with 25 μL 10% sarkosyl and 50 μL β-mercaptoethanol (see Note 4).
4. Hybridization buffer: 2 M NaCl, 4% PEG 8000 in PBS. Combine 10 mL 5 M NaCl with 13 mL of PBS and 2 mL 50% (w/v) PEG 8000 (see Note 5).
5. Wash buffer: 2 M NaCl, 3 mM MgCl2, 20 mM Tris–HCl (pH 8.0), 4% PEG 8000. Combine 20 mL 5 M NaCl, 150 μL 1 M MgCl2, 1 mL 1 M Tris–HCl (pH 8.0), and 4 mL 50% (w/v) PEG 8000. Add water to bring volume to 50 mL (see Note 5).
6. Polycarbonate membranes: 0.01 μm pores, 62 mm × 22 mm (see Note 6).
7. mRNA capture beads (see Note 7).
8. Seq-Well arrays (see Notes 8 and 9).
9. RPMI.
10. RP-10: RPMI with 10% FBS.
11. PBS for washing.
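When scaling a buffer to a different volume, it helps to confirm that the stock volumes reproduce the stated final concentrations (C1·V1 = C2·V2). The sketch below does this for the wash buffer and hybridization buffer recipes given above; the dictionary layout and function names are illustrative assumptions.

```python
def final_concentration(stock_conc: float, stock_volume_ml: float,
                        total_volume_ml: float) -> float:
    """Concentration after diluting a stock into the total volume (C1*V1/V2)."""
    if total_volume_ml <= 0:
        raise ValueError("total_volume_ml must be positive")
    return stock_conc * stock_volume_ml / total_volume_ml


# component: (stock concentration, unit, stock volume in mL), taken from the recipes above
WASH_BUFFER = {
    "NaCl": (5.0, "M", 20.0),
    "MgCl2": (1000.0, "mM", 0.150),
    "Tris-HCl pH 8.0": (1000.0, "mM", 1.0),
    "PEG 8000": (50.0, "% w/v", 4.0),
}
HYBRIDIZATION_BUFFER = {
    "NaCl": (5.0, "M", 10.0),
    "PEG 8000": (50.0, "% w/v", 2.0),
}


def recipe_concentrations(recipe: dict, total_volume_ml: float) -> dict:
    """Final concentration of every component, in the unit of its stock."""
    return {
        name: final_concentration(conc, volume, total_volume_ml)
        for name, (conc, _unit, volume) in recipe.items()
    }


if __name__ == "__main__":
    print(recipe_concentrations(WASH_BUFFER, total_volume_ml=50.0))
    # 10 mL NaCl + 13 mL PBS + 2 mL PEG = 25 mL total
    print(recipe_concentrations(HYBRIDIZATION_BUFFER, total_volume_ml=25.0))
```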

2.2 Array Storage

1. Array quenching buffer: 100 mM sodium carbonate, 10 mM Tris–HCl (pH 8.0). Combine 2.5 mL 2 M sodium carbonate with 500 μL 1 M Tris–HCl. Add water to bring total volume to 50 mL. Arrays can be stored in array quenching buffer for up to 1 month at 4 °C (see Note 10).
2. Aspartic acid solution: 20 μg/mL of L-aspartic acid, 2 M NaCl, and 100 mM sodium carbonate solution (pH 10.0). Arrays can be stored in the aspartic acid solution for up to 6 months at 4 °C (see Note 10).

2.3 Reverse Transcription

1. Maxima H-RT with Maxima 5× RT buffer.
2. 30% PEG 8000.
3. dNTP mix (10 mM each).
4. RNase Inhibitor.
5. Template Switch Oligo (see Subheading 2.5).
6. TE-TW: 10 mM Tris–HCl pH 8.0, 1 mM EDTA, 0.01% Tween-20. Combine 49.5 mL water, 0.5 mL 1.0 M Tris pH 8.0, 100 μL 0.5 M EDTA, and 50 μL Tween-20.
7. TE-SDS: 10 mM Tris pH 8.0, 1 mM EDTA, 0.05% SDS. Combine 49.5 mL water, 0.5 mL 1.0 M Tris pH 8.0, 100 μL 0.5 M EDTA, and 250 μL 10% SDS.

2.4 PCR and Library Preparation

1. Exonuclease I (E. coli) with buffer (NEB Cat. No. M0293S).
2. 10 mM Tris–HCl (pH 8.0).
3. Thermocycler.
4. Microseal B adhesive seal.
5. Microseal F foil.
6. Qubit assay tubes.
7. Qubit 2.0 fluorometer.
8. 96-well PCR plates, skirted.
9. SMART PCR Primer (see below).
10. KAPA HiFi Hotstart Readymix PCR Kit.
11. Ampure DNA SPRI beads.
12. 80% ethanol.
13. Agilent High Sensitivity DNA Kit.
14. Nextera XT kit.
15. Custom P5-SMART PCR hybrid oligo (see Subheading 2.5).

2.5 Primers

1. Template Switch Oligo: AAGCAGTGGTATCAACGCAGAGTGAATrGrGrG.
2. SMART PCR Primer: AAGCAGTGGTATCAACGCAGAGT.
3. Custom P5-SMART PCR hybrid oligo: AATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGT*A*C.
4. Custom Read 1 Primer: GCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC.
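Because long oligo sequences are easily corrupted when copied into an order form, a quick consistency check can be useful. The sketch below verifies how the four sequences listed above relate to one another. It assumes the common vendor notation in which "r" marks a ribonucleotide and "*" a phosphorothioate bond; the source does not define these symbols, so treat that reading as an assumption.

```python
TEMPLATE_SWITCH_OLIGO = "AAGCAGTGGTATCAACGCAGAGTGAATrGrGrG"
SMART_PCR_PRIMER = "AAGCAGTGGTATCAACGCAGAGT"
P5_SMART_HYBRID = (
    "AATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGG"
    "AAGCAGTGGTATCAACGCAGAGT*A*C"
)
CUSTOM_READ1_PRIMER = "GCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC"


def plain_bases(oligo: str) -> str:
    """Strip modification symbols ('r', '*') and whitespace, keeping only bases."""
    cleaned = oligo.replace("*", "").replace("r", "")
    cleaned = "".join(cleaned.split()).upper()
    unexpected = set(cleaned) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return cleaned


def check_primer_set() -> dict:
    """Report whether the shared SMART sequence appears where expected."""
    smart = plain_bases(SMART_PCR_PRIMER)
    return {
        "TSO starts with SMART primer":
            plain_bases(TEMPLATE_SWITCH_OLIGO).startswith(smart),
        "Read 1 primer contains SMART primer":
            smart in plain_bases(CUSTOM_READ1_PRIMER),
        "P5 hybrid ends with Read 1 primer":
            plain_bases(P5_SMART_HYBRID).endswith(plain_bases(CUSTOM_READ1_PRIMER)),
    }


if __name__ == "__main__":
    for description, ok in check_primer_set().items():
        print(f"{description}: {ok}")
```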

3 Methods

3.1 Membrane Functionalization

1. Place a precut (22 × 66 mm) polycarbonate membrane onto a glass slide, using a gloved finger and tweezers to carefully separate the membrane and paper. Make sure the shiny side of the polycarbonate membrane is facing up. Discard any membranes that have creases or other large-scale imperfections (see Note 11).
2. Place membranes onto a shelf in the plasma cleaner (see Note 12).
3. Close the plasma cleaner door and make sure the three-way valve lever is in the closed position. Then turn on the main power and pump switch to form a vacuum (see Note 13).
4. Allow a vacuum to form for 2 min. Once the vacuum has formed, simultaneously turn the valve clockwise to 12:00 while turning the power to the high setting.

Fig. 1 Functionalized membranes can be stored in 1× PBS for 24 h

5. Treat membranes with plasma for 7 min.
6. After treatment, in the following order, turn the RF level valve from HIGH to OFF, then turn off the power, and then turn off the vacuum. Then slowly open the valve until you can just barely hear air entering the chamber. Allow the chamber to slowly fill with air until the door opens. This will take about 5 min (see Note 14).
7. Pipet 1 mL of 1× PBS into each well of a four-well plate. Transfer slides with treated membranes from the plasma cleaner to the four-well plate. Quickly pipet 4 mL of 1× PBS over the membrane, preventing the membrane from folding on itself (see Note 15 and Fig. 1).
8. Remove any air bubbles underneath the membrane by gently pressing on the membrane using wafer forceps. Membranes are now functionalized and ready for use. Membranes solvated with 1× PBS should be used within 24 h.

3.2 Bead Loading

1. Aspirate storage solution and solvate each array with 5 mL of BLB (see Note 16).
2. Aliquot ~110,000 beads per array from bead stock into a 1.5 mL tube and spin on a tabletop centrifuge for 15 s to form a pellet (see Note 17).
3. Aspirate storage buffer and replace it with 500 μL of BLB. Invert the tube several times to wash the beads. Pellet the beads and then repeat the wash step with an additional 500 μL of BLB.

Fig. 2 Apply beads to the array in a dropwise fashion

4. Pellet beads, aspirate BLB, and resuspend in 200 μL of BLB per 110,000 beads.
5. Before loading beads, thoroughly aspirate BLB from the dish containing the array(s), being careful not to aspirate or dry the PDMS surface of the array(s). Center the array(s) so that there is no contact between the array(s) and the sides of the four-well dish.
6. Use a 200 μL pipette to apply 200 μL containing 110,000 beads, in a dropwise fashion, to the surface of each array. Your goal is to cover the surface of the entire array with beads (see Fig. 2).
7. Rock the four-well dish in the x and y directions for 10 min (see Notes 18–20).
8. Thoroughly wash array(s). Position each array so that it sits in the center of the four-well dish. Dispense 500 μL of BLB in the upper right corner of each array and 500 μL in the bottom right corner of the PDMS surface of each array. Be careful not to pipet directly onto the microwells, as this can dislodge beads. Using wafer forceps, push each array against the left side of the four-well dish to create a capillary flow; this will help remove excess beads from the surface. Aspirate the liquid from the bottom of the dish, reposition each array in the center of the four-well dish, and repeat, this time pipetting BLB onto the opposite corners (see Notes 21 and 22 and Fig. 3).

[Fig. 3 panels: (1) Pipette 1 mL of BLB onto array surface (500 μL at corners). (2) Reposition array adjacent to the edge of the dish to create a capillary flow across the array. (3) Aspirate excess beads from the left side of the array surface and bottom of the dish. (4) Rotate array and repeat washing until all excess beads are removed.]

Fig. 3 Create capillary flow to draw excess beads from the center of the array

9. Repeat step 8 as necessary. Periodically examine the array(s) under a microscope to verify that very few loose beads are present on the surface, as this will interfere with membrane attachment.
10. Once excess beads have been removed from the surface, solvate each array. If continuing to cell loading immediately (i.e., within 1–5 h), loaded arrays should be stored in 5 mL of BLB. Alternatively, loaded arrays can be stored for up to 2 weeks in Array Quenching Buffer.

3.3 Cell Loading

1. Obtain a cell or tissue sample and prepare a single-cell suspension using your preferred protocol.
2. Aspirate BLB from each array and soak in 5 mL of RPMI + 10% FBS (RP-10).
3. After obtaining a single-cell suspension, count cells using a hemocytometer and make a new solution of 15,000 cells in 200 μL of RP-10 (see Note 23).
4. Aspirate the RP-10 from the four-well dish, center each array in its well, and then load the cell loading solution in a dropwise fashion onto the surface of each array.
5. Rock the array in the x and y directions for a total of 10 min, alternating between rocking for 20 s and letting the arrays sit for 30 s so that cells can fall into the wells.

6. Wash array(s) 4× with PBS to remove FBS in media (see Note 24). To wash, add 5 mL of PBS to the corner of the four-well dish and then aspirate.
7. Aspirate final PBS wash and replace with 5 mL of RPMI media (no FBS).
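Step 3 above asks for 15,000 cells in 200 μL of RP-10. A minimal sketch of the dilution arithmetic follows; it assumes the hemocytometer count has already been converted to cells per mL, and the function name and example concentration are hypothetical.

```python
def cell_loading_volumes(cells_per_ml: float,
                         target_cells: int = 15_000,
                         final_volume_ul: float = 200.0) -> tuple:
    """Return (uL of cell suspension, uL of RP-10) for the loading solution."""
    if cells_per_ml <= 0:
        raise ValueError("cells_per_ml must be positive")
    suspension_ul = target_cells / cells_per_ml * 1000.0
    if suspension_ul > final_volume_ul:
        # The suspension is too dilute; it would need concentrating first.
        raise ValueError(
            f"need {suspension_ul:.1f} uL of suspension, more than {final_volume_ul} uL"
        )
    return suspension_ul, final_volume_ul - suspension_ul


if __name__ == "__main__":
    # Hypothetical count of 1.0e6 cells/mL (assumption for illustration).
    cells_ul, medium_ul = cell_loading_volumes(cells_per_ml=1.0e6)
    print(f"{cells_ul:.1f} uL cells + {medium_ul:.1f} uL RP-10")
```

The function raises an error rather than returning a negative medium volume when the suspension is too dilute to deliver 15,000 cells in 200 μL.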

3.4 Membrane Sealing

1. Gather the following materials before sealing the array(s): wafer forceps, paper towels, Agilent clamps, pretreated membranes, and clean microscope slides.
2. Using the wafer forceps, transfer the array from media to the lid of a four-well dish, being careful to keep the array as close to horizontal as possible (see Note 25).
3. Use the wafer forceps to remove a pretreated membrane from the four-well dish. Gently dab away excess moisture from the glass slide on the paper towel until the membrane does not spontaneously change position on the glass slide. Avoid touching the surface of the membrane that will be sealed to the PDMS array, as this may affect membrane sealing.
4. Carefully position the membrane in the center of the microscope slide, leaving a small (2–3 mm) membrane overhang beyond the edge of the slide (see Note 26 and Fig. 4).
5. Holding the membrane in your left hand, invert the microscope slide so that the treated surface is facing down.
6. Place the overhang of the membrane in contact with the PDMS surface of the array just above the boundary of the microwells (see Fig. 5).
7. Using a clean glass slide held in your right hand, firmly press down the overhang of the membrane against the PDMS surface of the array.

Fig. 4 Use tweezers to position the membrane on the glass slide so that there is a small overhang and touch it to the array just above the boundary of the wells

Fig. 5 Hold the membrane firmly against the array with a clean glass slide

Fig. 6 Slide the hand holding the membrane across the array to apply the membrane

8. While maintaining pressure with your right hand to hold the membrane in place, gently apply the membrane by shifting your left hand across the array (see Notes 27 and 28, and Fig. 6).
9. After applying the membrane, carefully pry the array and membrane from the surface of the lid and transfer to an Agilent clamp (see Fig. 7).
10. Once the array is in the clamp, place a glass slide on top of the array and then assemble the clamp, tightening it just past the point of resistance. Be careful not to tighten too far so as not to break either of the glass slides.
11. Place the assembled clamp in a 37 °C incubator for 30 min (see Note 29).

Fig. 7 Place the array in a clamp and heat it at 37 °C for 30 min to seal the membrane

3.5 Cell Lysis and Hybridization

1. Remove the clamp from the incubator and then remove the array(s) from the Agilent clamp(s) (see Note 30).
2. Submerge each array, with top slide still attached, in 5 mL of complete lysis buffer in a new four-well dish (see Note 31).
3. Gently rock the array(s) in lysis buffer until the top glass slide lifts off. Do not pry the top slide off, as this can reverse membrane sealing. The time necessary for detachment of the top slide varies (10 s to 10 min). Just be patient.
4. Once the top slide has detached, let the array(s) rotate for 20 min at 50–60 rpm.
5. After 20 min, remove the lysis buffer and wash each array with 5 mL of hybridization buffer. Use a container without bleach to collect lysis buffer waste because guanidine thiocyanate can react with bleach to create toxic gas.
6. Remove hybridization buffer and add another 5 mL of hybridization buffer to each array. Rotate arrays for 40 min at 50–60 rpm. While arrays are rocking, prepare RT master mix (see Note 32).

3.6 Bead Removal

To remove beads from the array, either wash the arrays with a pipette or spin them down in a centrifuge with angled inserts.

3.6.1 Bead Removal by Pipette Washes

1. Aspirate hybridization buffer and replace with 5 mL of wash buffer.
2. Rock for 3 min. Fill 50 mL conical tube(s) with 48 mL of wash buffer (see Note 33).
3. Remove membranes with fine-tipped tweezers (see Fig. 8).
4. Carefully position the array over the 50 mL conical tube. Repeatedly wash (~15 times) beads from the surface of the array over the 50 mL Falcon tube using 1 mL of wash buffer.

Fig. 8 After placing the array in wash buffer, remove the membrane with tweezers

Fig. 9 Carefully position the array over a conical of wash buffer and pipet on the array to dislodge beads

Flip the array and repeatedly wash (~15 times) the other end (see Note 34 and Fig. 9).
5. Hold the array above the 50 mL conical and gently scrape the array ten times with a glass slide, dipping the glass slide into the wash buffer after every scrape. Flip the array and repeat.

6. Wash again using 1 mL of wash buffer (~10 times) and inspect the array under a microscope to check if there are any beads remaining. If so, take a glass slide and scrape more forcefully and continue to wash until all beads have been dislodged (see Note 35).
7. Spin the 50 mL Falcon tube at 2000 × g for 5 min to pellet beads (see Note 36).
8. Aspirate all wash buffer except for ~1 mL. Be careful not to disturb the pellet of beads.
9. Transfer beads to a centrifuge tube and proceed to reverse transcription.

3.6.2 Alternative Method: Bead Removal with Inserts

1. Alternatively, you can remove beads using 3D-printed inserts.
2. Aspirate hybridization buffer and replace with 5 mL of wash buffer.
3. Fill a Falcon tube with 45 mL of wash buffer and label with sample name.
4. Remove membrane and place array into the Falcon tube with wash buffer.
5. Ensure that the array is angled within the tube as shown below.
6. Place the insert so that the array is secured at an angle as shown in the image below.
7. Secure the lid and seal with parafilm, if necessary (see Note 37).
8. Put the sealed conical in a centrifuge, making certain the PDMS surface of the array is facing away from the rotor arm (see Fig. 10).
9. Centrifuge at 2000 × g for 5 min to remove the beads.

Fig. 10 Make sure the array faces outward so that the beads will fall out of wells during centrifugation

10. At this point you should see a small, but visible, pellet of beads at the bottom of the tube.
11. Aspirate 5–10 mL of wash buffer to enable easier removal of the array.
12. Remove the array and carefully position it over the top of the 50 mL tube.
13. Repeatedly wash any remaining beads from the surface of the array over the 50 mL Falcon tube using 1 mL of the wash buffer remaining in the tube.
14. Spin again at 2000 × g for 5 min to pellet beads.
15. Aspirate all wash buffer except for ~1 mL. Be careful not to disturb the pellet of beads.
16. Transfer beads to a 1.5 mL centrifuge tube and proceed to reverse transcription.

3.7 Reverse Transcription

1. Prepare the following Maxima RT Mastermix during the hybridization step (volumes provided are good for one array):
   40 μL H2O.
   40 μL Maxima 5× RT buffer.
   80 μL 30% PEG 8000.
   20 μL 10 mM dNTPs.
   5 μL RNase inhibitor.
   5 μL 100 μM Template Switch Oligo.
   10 μL Maxima H-RT.
2. Centrifuge the 1.5 mL centrifuge tubes with beads for 1 min at 1000 × g.
3. Remove supernatant and resuspend in 250 μL of 1× Maxima RT buffer (see Note 38).
4. Centrifuge beads for 1 min at 1000 × g.
5. Aspirate 1× Maxima RT buffer and resuspend beads in 200 μL of the Maxima RT Mastermix.
6. Incubate at room temperature for 30 min with end-over-end rotation.
7. After 30 min, incubate at 52 °C for 90 min with end-over-end rotation (see Note 39).
8. Following the RT reaction, wash beads once with 0.5 mL TE-TW, once with 0.5 mL TE-SDS, and twice with 0.5 mL of TE-TW (see Notes 40 and 41).
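The Maxima RT Mastermix volumes in step 1 are given per array and sum to the 200 μL used in step 5. When several arrays are processed together, the mix can be scaled as sketched below; the 10% pipetting overage is an illustrative assumption, not a recommendation from the protocol.

```python
# Per-array volumes in microliters, copied from step 1 of Subheading 3.7.
RT_MIX_UL = {
    "H2O": 40.0,
    "Maxima 5x RT buffer": 40.0,
    "30% PEG 8000": 80.0,
    "10 mM dNTPs": 20.0,
    "RNase inhibitor": 5.0,
    "100 uM Template Switch Oligo": 5.0,
    "Maxima H-RT": 10.0,
}


def scale_mix(per_array_ul: dict, n_arrays: int, overage: float = 0.10) -> dict:
    """Scale a per-array master mix to n_arrays plus a fractional overage."""
    if n_arrays < 1:
        raise ValueError("n_arrays must be at least 1")
    if overage < 0:
        raise ValueError("overage must not be negative")
    factor = n_arrays * (1.0 + overage)
    return {name: round(volume * factor, 2) for name, volume in per_array_ul.items()}


if __name__ == "__main__":
    print("per-array total (uL):", sum(RT_MIX_UL.values()))
    for name, volume in scale_mix(RT_MIX_UL, n_arrays=4).items():
        print(f"{name}: {volume} uL")
```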

3.8 Exonuclease I Treatment

1. Prepare the following Exonuclease I Mix:
   20 μL 10× ExoI buffer.
   170 μL H2O.
   10 μL ExoI enzyme.
2. Centrifuge beads for 1 min at 1000 × g and aspirate the TE-TW solution.
3. Resuspend in 0.5 mL of 10 mM Tris–HCl pH 8.0.
4. Centrifuge beads again, remove supernatant, and resuspend beads in 200 μL of Exonuclease I mix.
5. Place in a 37 °C incubator for 50 min with end-over-end rotation.
6. Wash the beads once with 0.5 mL of TE-SDS, then twice with 0.5 mL TE-TW (see Note 42).

3.9 Whole-Transcriptome Amplification (WTA)

1. Wash beads once with 500 μL of water, pellet beads, remove supernatant, and resuspend in 500 μL of water.
2. Mix well (do not vortex) to evenly resuspend beads and transfer 20 μL of beads to a separate 1.5 mL tube to count the beads (see Note 43).
3. Pellet the small aliquot of beads, aspirate the supernatant, and resuspend in 20 μL of bead counting solution (10% PEG, 2.5 M NaCl) (see Note 44).
4. Count the beads using a hemocytometer (see Note 45).
5. Prepare the following PCR Mastermix (volumes provided are good for 2000 beads) (see Note 46):
   25 μL 2× KAPA HiFi Hotstart Readymix.
   24.6 μL H2O.
   0.4 μL 100 μM SMART PCR Primer.
6. Pellet beads, remove supernatant, and resuspend in 50 μL of PCR Mastermix for every 2000 beads (see Note 47).
7. Pipet 50 μL of PCR Mastermix with beads into each well of a 96-well plate, making sure to amplify the beads from the entire array (see Note 48).
8. Use the following cycling conditions to perform whole-transcriptome amplification (see Note 49).

   Initial denaturation: 95 °C, 3 min
   4 cycles:
      98 °C, 20 s
      65 °C, 45 s
      72 °C, 3 min
   9–12 cycles:
      98 °C, 20 s
      67 °C, 20 s
      72 °C, 3 min
   Final extension: 72 °C, 5 min
   Hold: 4 °C, infinite
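Steps 5–7 above fix the PCR Mastermix at 50 μL per 2000 beads, so the number of reactions follows from the bead count. A minimal sketch of that calculation is given below; rounding up to whole reactions and the example bead count are assumptions made for illustration.

```python
import math

BEADS_PER_REACTION = 2000
# Per-reaction volumes in microliters, copied from step 5 of Subheading 3.9.
PCR_MIX_UL = {
    "2x KAPA HiFi Hotstart Readymix": 25.0,
    "H2O": 24.6,
    "100 uM SMART PCR Primer": 0.4,
}


def wta_plan(total_beads: int) -> tuple:
    """Return (number of 50 uL reactions, total volume of each component in uL)."""
    if total_beads <= 0:
        raise ValueError("total_beads must be positive")
    # Assumption: a partial batch of beads still gets a full 50 uL reaction.
    n_reactions = math.ceil(total_beads / BEADS_PER_REACTION)
    totals = {name: round(volume * n_reactions, 1) for name, volume in PCR_MIX_UL.items()}
    return n_reactions, totals


if __name__ == "__main__":
    reactions, totals = wta_plan(total_beads=25_000)  # hypothetical bead count
    print("reactions:", reactions)
    for name, volume in totals.items():
        print(f"{name}: {volume} uL")
```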

3.10 Purification of PCR Products

1. Pool PCR products in 1.5 mL microcentrifuge tubes so that you have 7–8 PCR reactions per tube.
2. Purify the product using Ampure SPRI beads at a 0.6× volumetric ratio (beads:PCR products), mixing thoroughly (see Note 50).
3. Let the tubes sit in the rack off the magnet for 5 min, then place the rack on the magnet for 5 min.
4. Perform two washes with 80% ethanol.
5. After the second wash, allow the beads to dry for 10 min on the magnet, then remove the rack from the magnet, elute the beads in 100 μL, place the rack back on the magnet, and transfer the 100 μL to a new 1.5 mL microcentrifuge tube.
6. SPRI the 100 μL at a 0.8× volumetric ratio, repeating steps 3 and 4.
7. After the second wash, allow the beads to dry for 5–10 min on the magnet, remove the rack from the magnet, elute the beads in 15 μL, then place the rack back on the magnet and transfer the 15 μL to a new 1.5 mL microcentrifuge tube.
8. Run a High Sensitivity DNA D5000 ScreenTape on an Agilent 4200 Tapestation to determine the length distribution of your cDNA. The distribution should be fairly smooth with an average size of 900–1500 bp (see Fig. 11).
9. Proceed to library preparation or store the WTA product at 4 °C.
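The volumetric ratios in steps 2 and 6 translate into bead volumes as follows. This sketch assumes that each pooled tube holds eight 50 μL reactions (the upper end of the 7–8 reactions in step 1); the function name is illustrative.

```python
def spri_bead_volume(sample_volume_ul: float, ratio: float) -> float:
    """Volume of SPRI bead suspension for a given beads:sample volumetric ratio."""
    if sample_volume_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return ratio * sample_volume_ul


if __name__ == "__main__":
    pooled_ul = 8 * 50.0  # assumption: eight 50 uL PCR reactions in one tube
    print("first cleanup, 0.6x:", spri_bead_volume(pooled_ul, 0.6), "uL beads")
    print("second cleanup, 0.8x:", spri_bead_volume(100.0, 0.8), "uL beads")
```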

3.11 Nextera Library Preparation

1. Make certain that your thermocyclers are set up for Tagmentation (step 5) and PCR (step 9).
2. For each sample, combine 800 pg of purified cDNA with water in a total volume of 5 μL. It is ideal to dilute your PCR product in a separate tube/plate so that you can add 5 μL of that for tagmentation.

Fig. 11 An ideal WTA product distribution has a peak at 900–1100 bp and has a long tail reaching 5000 bp [electropherogram: sample intensity (normalized FU) versus size (bp); labeled peak at 945 bp between the lower and upper markers]

3. To each tube, add 10 μL of Nextera TD buffer, then 5 μL of ATM buffer (the total volume of the reaction is now 20 μL).
4. Mix by pipetting ~5 times. Spin down.
5. Incubate at 55 °C for 5 min.
6. Let the thermocycler cool to 4 °C after incubation, and then immediately add 5 μL of Neutralization Buffer. Mix by pipetting ~5 times. Spin down for 1 min at 1000 × g. Bubbles are normal.
7. Incubate at room temperature for 5 min.
8. Add to each PCR in the following order:
   15 μL Nextera PCR mix.
   8 μL H2O.
   1 μL 10 μM New-P5-SMART PCR hybrid oligo.
   1 μL 10 μM Nextera N700X oligo.
9. After sealing the reaction tubes and spinning them down (1 min at 1000 × g), run the following PCR program:

   Initial denaturation: 95 °C, 30 s
   12 cycles:
      95 °C, 10 s
      55 °C, 30 s
      72 °C, 30 s
   Final extension: 72 °C, 5 min
   Hold: 4 °C, infinite

10. Proceed to SPRI purification or store the PCR product at 4 °C.
11. SPRI at 0.6× volumetric ratio.
12. Let the tubes sit in the rack off the magnet for 5 min, then place the rack on the magnet for 5 min.
13. Perform two washes with 80% ethanol.
14. After the second wash, allow the beads to dry for 5–10 min on the magnet, remove the rack from the magnet, elute the beads in 100 μL, then place the rack back on the magnet and transfer the 100 μL to a new 1.5 mL microcentrifuge tube.
15. SPRI the 100 μL at a 0.8× volumetric ratio and repeat steps 12 and 13.
16. After the second wash, allow the beads to dry for 5–10 min on the magnet, remove the rack from the magnet, elute the beads in 15 μL, then place the rack back on the magnet and transfer the 15 μL to a new 1.5 mL microcentrifuge tube.
17. Run a High Sensitivity DNA D1000 ScreenTape on an Agilent 4200 Tapestation. Your tagmented library should be fairly smooth, with an average size of 600–750 bp (see Note 51 and Fig. 12).

Fig. 12 An ideal NTA product distribution is a smooth bell curve with a peak between 600 and 750 bp [electropherogram: sample intensity (normalized FU) versus size (bp); labeled peak at 647 bp between the lower and upper markers]
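Step 2 of Subheading 3.11 calls for 800 pg of cDNA in 5 μL, i.e., a working dilution of 160 pg/μL. The sketch below computes how to prepare such a dilution from a measured WTA concentration; the 20 μL preparation volume and the example concentration are hypothetical.

```python
def tagmentation_dilution(stock_pg_per_ul: float,
                          prep_volume_ul: float = 20.0,
                          input_pg: float = 800.0,
                          input_volume_ul: float = 5.0) -> tuple:
    """Return (uL of cDNA, uL of water) for a dilution at input_pg / input_volume_ul."""
    target_pg_per_ul = input_pg / input_volume_ul  # 160 pg/uL for 800 pg in 5 uL
    if stock_pg_per_ul < target_pg_per_ul:
        raise ValueError("cDNA is more dilute than the target working concentration")
    cdna_ul = prep_volume_ul * target_pg_per_ul / stock_pg_per_ul
    return cdna_ul, prep_volume_ul - cdna_ul


if __name__ == "__main__":
    # Hypothetical measured concentration of 2 ng/uL = 2000 pg/uL.
    cdna_ul, water_ul = tagmentation_dilution(stock_pg_per_ul=2000.0)
    print(f"{cdna_ul:.2f} uL cDNA + {water_ul:.2f} uL water; use 5 uL for tagmentation")
```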

3.12 NextSeq500 Sequencing

1. Make a 5 μL library pool at 4 nM as input for denaturation.
2. To this 5 μL library, add 5 μL of 0.2 N NaOH (make this solution fresh from a 2 M NaOH stock).
3. Flick to mix, then spin down and let the tube sit for 5 min at room temperature.
4. After 5 min, add 5 μL of 0.2 M Tris–HCl pH 7.5.
5. Add 985 μL of HT1 Buffer to make a 1 mL, 20 pM library (solution 1).
6. In a new tube (solution 2), add 165 μL of solution 1 and dilute to 1.5 mL with HT1 buffer to make a 2.2 pM solution, which is the recommended loading concentration.
7. Add 6 μL of Custom Read 1 primer to 1.994 mL of HT1 buffer to make 2 mL of 0.3 μM Custom Read 1 primer.
8. Follow Illumina’s guide for loading a NextSeq500 kit. Seq-Well requires paired-end sequencing with a read structure of 20 bp read one, 50 bp read two, and 8 bp index one.
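The dilution chain in steps 1–6 can be checked numerically, and the 4 nM pool in step 1 first requires converting a mass concentration to molarity. The sketch below uses the widely used approximation of 660 g/mol per base pair of double-stranded DNA; that constant, the function names, and the example library values are assumptions and do not come from the protocol.

```python
AVG_MASS_PER_BP = 660.0  # g/mol per base pair of dsDNA (common approximation)


def library_nanomolar(conc_ng_per_ul: float, mean_size_bp: float) -> float:
    """Convert a dsDNA library concentration from ng/uL to nM."""
    if conc_ng_per_ul < 0 or mean_size_bp <= 0:
        raise ValueError("concentration must be >= 0 and size must be > 0")
    return conc_ng_per_ul * 1e6 / (AVG_MASS_PER_BP * mean_size_bp)


def dilute(conc: float, aliquot_ul: float, final_volume_ul: float) -> float:
    """Concentration after bringing an aliquot up to the final volume."""
    if final_volume_ul <= 0:
        raise ValueError("final_volume_ul must be positive")
    return conc * aliquot_ul / final_volume_ul


if __name__ == "__main__":
    # Hypothetical library: 2.5 ng/uL with a 650 bp mean size.
    print(f"library: {library_nanomolar(2.5, 650):.2f} nM")
    # Steps 1-5: 5 uL at 4 nM (4000 pM) in 5 + 5 + 5 + 985 = 1000 uL.
    solution_1_pm = dilute(4000.0, 5.0, 5.0 + 5.0 + 5.0 + 985.0)
    # Step 6: 165 uL of solution 1 brought to 1.5 mL.
    solution_2_pm = dilute(solution_1_pm, 165.0, 1500.0)
    print(f"solution 1: {solution_1_pm:.1f} pM, solution 2: {solution_2_pm:.2f} pM")
```

Running the chain reproduces the 20 pM and 2.2 pM values stated in steps 5 and 6.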

4 Notes

1. You will want ~25 mL of bead loading buffer for each array. It is important that you do not add the sodium carbonate directly to the BSA to avoid denaturing the BSA. This solution should be prepared fresh just before loading beads. The stock of BSA should be filtered prior to use with a 0.22 μm filter and should be kept at 4 C. 2. It will take some time for the guanidine thiocyanate to dissolve. Be sure that this is prepared in advance. 128 Toby P. Aicher et al.

3. Prelysis buffer is photosensitive, so wrap the buffer’s container with aluminum foil. Wrapped prelysis buffer can be stored at room temperature and has a shelf life of approximately 6 months.
4. Complete lysis buffer should be prepared immediately prior to use.
5. You will want 10 mL of hybridization buffer and 50 mL of wash buffer per array. Both solutions can be made in advance and stored at room temperature for 3 months.
6. We purchase membranes from Sterlitech Corporation.
7. We purchase the mRNA capture beads from Chemgenes (Cat. No. MACOSKO-2011-10). Currently, this is the only supplier manufacturing these beads.
8. You can transport arrays by placing them in 50 mL conical tubes filled with array quenching buffer or the aspartic acid solution. Two arrays will fit per conical if they are arranged back to back with their glass slides touching.
9. An alternative transportation method is to dry the arrays and transport them in a glass slide box. To dry the arrays, remove them from the storage buffer, use a paper towel to wick off excess liquid from the glass slide (while being careful not to touch the surface of the array), and then let them sit until air dried. Rehydrate the arrays by placing them in a four-well dish with either array quenching buffer or the aspartic acid solution. Place them under vacuum until there are no air bubbles remaining in the wells of the array. Alternatively, if a vacuum chamber is not available, you can let the arrays soak overnight; they will be hydrated and ready to use the following day.
10. Use array quenching storage buffer for short-term storage of up to 1 month. If storing longer than 1 month, it is advisable to store in aspartic acid solution.
11. Prepare one extra membrane in case of a mistake during membrane application.
12. If your plasma oven has multiple shelves, place membranes on the bottom shelf to reduce the risk of them flying when vacuum is released and atmospheric pressure is restored.
13. The plasma should be a bright pink color. If not, adjust the air valve to increase or decrease the amount of oxygen you are letting into the chamber. Also check to see if vacuum has formed by gently pulling on the door of the plasma oven.
14. If membranes have slightly folded over, slowly flip the membrane back using sharp tweezers. If membranes have blown off the slide entirely, repeat the membrane preparation procedure to ensure you know which side was exposed to plasma.

15. If transporting solvated membranes (e.g., between buildings), remove all but ~1 mL of PBS to prevent membranes from flipping within the dish. Alternatively, membranes can be solvated in 1× PBS, dried out, and stored for one week at room temperature. When ready to use membranes, they can be rehydrated with 1× PBS. This is helpful when traveling with membranes or when you want to run Seq-Well in a laboratory without access to a plasma cleaner.
16. Before bead loading, use a microscope to inspect wells for air bubbles. If air bubbles are present, place array(s) under vacuum with rotation (50 RPM) for 10 min to remove air bubbles in wells. The house vacuum in most laboratories should be sufficient to remove any air bubbles from the wells.
17. Never vortex beads, as this can fragment them and interfere with bead loading and transcript capture.
18. Place a black background behind the four-well dish to better visualize bead coverage of the array.
19. Be careful not to let the surface of the array dry. Sudden movements or tilting at too steep an angle can lead to spillage. If BLB falls off the surface of the array into the four-well dish, gently pipet the BLB back onto the corners of the PDMS surface of the array, being careful not to pipet directly onto wells.
20. It can be helpful to repeatedly (every ~30 s) rest the four-well dish level to allow beads to fall into wells. After tilting forward and backward for 10 min, tilt the four-well dish in whichever direction is needed to cover poorly loaded areas with beads.
21. You can save the excess beads by pipetting the liquid into a 50 mL conical instead of aspirating it. After collecting excess beads, wash them twice by spinning down at 1000 rcf and resuspending the pellet in TE-TW. Store the washed beads in TE-TW for future loading.
22. An alternative bead removal method is to add 3 mL of BLB and rock at ~30° angles six times to get beads to roll off the surface. Repeat this procedure three times.
23. You can also load cells in DMEM with 5% FBS or PBS with 0.05% BSA.
24. Washing with PBS is critical to ensure successful membrane attachment, as FBS can interfere with membrane sealing.
25. Make sure the lid of the four-well dish is dry. Position the array in the corner of the lid so the array does not slide as you apply the membrane.
26. Occasionally, the membrane will fold back on the other side of the glass slide. Readjust the membrane until there is an overhang. An alternative method to prevent this is to first invert the glass slide and then pull the membrane past the edge of the glass slide.
27. For optimal results, use little to no pressure while applying the membrane with the left hand. See Instructional Video (www.shaleklab.com/seq-well) for additional details. Attempts to manually seal the microwell device using pressure result in a “squeegee” effect, effectively removing moisture from the membrane while fixing membrane creases in place.
28. It is very important to avoid a rumpled membrane. If the membrane is creased, dip an edge of a glass slide in liquid and smooth over areas. However, only a limited amount of remediation is possible before the efficacy of sealing decreases, so ideally create the smoothest membrane possible on the first pass.
29. This time is flexible and depends on the incubator. If you want to decrease this incubation time, please optimize on cell lines before proceeding with precious samples.
30. Sometimes the array will be stuck to the top piece of the clamp; this is fine, just carefully slide it off.
31. To make complete lysis buffer, combine 5 mL of prelysis buffer with 25 μL of 10% sarkosyl and 50 μL of β-mercaptoethanol.
32. The hybridization buffer may contain trace amounts of guanidine thiocyanate and therefore should be collected in the lysis buffer waste container.
33. Label the 50 mL conical tubes with sample names to avoid mixing up samples after removing beads from the array(s).
34. Make sure to pipet on the entire surface of the array, including the edges and corners.
35. Sometimes, after scraping, empty wells will fill up with bubbles. Be careful not to mistake bubbles for beads when inspecting under a microscope.
36. You should see a small, but visible, pellet of beads at the bottom of the tube.
37. The array might move around at this point. This is not a problem.
38. Prepare 1× Maxima buffer by diluting 5× Maxima buffer in RNase-free H2O.
39. You can also let RT continue overnight at 52 °C and wash the beads the next day.
40. Salts in the RT buffer can cause SDS to precipitate, making it difficult to remove in subsequent washes, so it is best to begin with a single wash in TE-TW.

41. This is a stopping point; after the final TE-TW wash, beads can be resuspended in TE-TW and stored for up to 2 weeks at 4 °C.
42. This is another stopping point; after the final TE-TW wash, beads can be resuspended in TE-TW and stored for up to 2 weeks at 4 °C.
43. Do not vortex beads, as this can result in bead fragmentation.
44. The bead counting solution aids in even dispersion of beads across a hemocytometer.
45. Sometimes the beads will not evenly disperse, making it difficult to count them. If this is the case, assume there are 60,000 beads for the following steps.
46. For instance, if you have 60,000 beads from an array, prepare a PCR mastermix with 750 μL of 2× KAPA HiFi Hotstart Readymix, 738 μL of H2O, and 12 μL of 100 μM SMART PCR Primer.
47. For instance, if you have 60,000 beads, resuspend in 1500 μL of PCR mastermix.
48. Periodically resuspend the beads to make sure they are evenly dispersed in the solution.
49. The total number of PCR cycles necessary for amplification depends on the cell type used. Approximately 16 cycles are optimal for primary cells (e.g., PBMCs) and approximately 13 cycles are optimal for cell lines or larger cells (e.g., macrophages). For experiments on dissociated human tissue, start with 16 cycles and optimize from there.
50. For instance, if you have 400 μL of product, add 240 μL of SPRI beads and mix for a 0.6× volumetric SPRI.
51. We have successfully sequenced NTA product with average sizes from 350 to 800 bp.
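Notes 46, 47, and 50 give worked volumes for one bead count and one product volume. The sketch below shows how those examples could be scaled to other inputs. It assumes that the mastermix scales linearly with bead count from the 60,000-bead example, which the notes imply but do not state outright; the function names are hypothetical.

```python
def pcr_mastermix(n_beads):
    """Scale the Note 46 recipe (60,000 beads -> 1500 uL) linearly with bead count.

    Linear scaling is an ASSUMPTION drawn from the single worked example.
    """
    total_ul = n_beads * 1500 / 60_000
    kapa_ul = total_ul / 2            # 2x KAPA HiFi Hotstart Readymix at 1x final
    primer_ul = total_ul * 12 / 1500  # keeps the 12 uL per 1500 uL primer ratio
    water_ul = total_ul - kapa_ul - primer_ul
    return {"total_ul": total_ul, "kapa_2x_ul": kapa_ul,
            "water_ul": water_ul, "primer_100um_ul": primer_ul}

def spri_bead_volume(sample_ul, ratio):
    """Volume of SPRI beads for a given volumetric ratio (e.g., 0.6 or 0.8)."""
    return sample_ul * ratio

mix = pcr_mastermix(60_000)
print(mix)
print(spri_bead_volume(400, 0.6))
```

For a recovered bead count that differs from 60,000, calling `pcr_mastermix` with the counted value gives proportionally scaled volumes.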

Acknowledgments

R.G. was supported by the Intramural Research Program of the Division of Intramural Research Z01AI000947, NIAID, NIH; the UCLA-Caltech MSTP, and the NIGMS T32 GM008042. A.K.S. was supported by the Searle Scholars Program, the Beckman Young Investigator Program, the Pew-Stewart Scholars, a Sloan Fellowship in Chemistry, NIH grants 1DP2OD020839, 2U19AI089992, 1U54CA217377, P01AI039671, 5U24AI118672, 2RM1HG006193, 1R33CA202820, 2R01HL095791, 1R01AI138546, 1R01HL126554, 1R01DA046277, 2R01HL095791, and Bill and Melinda Gates Foundation grants OPP1139972, OPP1137006, and OPP1116944. J.C.L. was supported by NIH grants DP3DK09768101, P01AI045757, R21AI106025, and R56AI104274, the W.M. Keck Foundation, Camille Dreyfus Teacher-Scholar program, and the US Army Research Office through the Institute for Soldier Nanotechnologies, under contract number W911NF-13-D-0001. This work was also supported in part by the Koch Institute Support (core) NIH Grant P30-CA14051 from the National Cancer Institute.

References

1. Kolodziejczyk AA, Lönnberg T (2018) Global and targeted approaches to single-cell transcriptome characterization. Brief Funct Genomics 17:209–219. https://doi.org/10.1093/bfgp/elx025
2. Svensson V, Vento-Tormo R, Teichmann SA (2018) Exponential scaling of single-cell RNA-seq in the past decade. Nat Protoc 13:599–604. https://doi.org/10.1038/nprot.2017.149
3. Kolodziejczyk AA, Kim JK, Svensson V et al (2015) The technology and biology of single-cell RNA sequencing. Mol Cell 58:610–620. https://doi.org/10.1016/j.molcel.2015.04.005
4. Ziegenhain C, Vieth B, Parekh S et al (2017) Comparative analysis of single-cell RNA sequencing methods. Mol Cell 65:631–643.e4. https://doi.org/10.1016/j.molcel.2017.01.023
5. Tang F, Barbacioru C, Wang Y et al (2009) mRNA-Seq whole-transcriptome analysis of a single cell. Nat Methods 6:377–382. https://doi.org/10.1038/nmeth.1315
6. Macaulay IC, Svensson V, Labalette C et al (2016) Single-cell RNA-sequencing reveals a continuous spectrum of differentiation in hematopoietic cells. Cell Rep 14:966–977. https://doi.org/10.1016/j.celrep.2015.12.082
7. Shalek AK, Satija R, Adiconis X et al (2013) Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498:236–240. https://doi.org/10.1038/nature12172
8. Klein AM, Mazutis L, Akartuna I et al (2015) Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161:1187–1201. https://doi.org/10.1016/j.cell.2015.04.044
9. Macosko EZ, Basu A, Satija R et al (2015) Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161:1202–1214. https://doi.org/10.1016/j.cell.2015.05.002
10. Zheng GXY, Terry JM, Belgrader P et al (2017) Massively parallel digital transcriptional profiling of single cells. Nat Commun 8:14049. https://doi.org/10.1038/ncomms14049
11. Gierahn TM, Ii MHW, Hughes TK et al (2017) Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput. Nat Methods 14:395–398. https://doi.org/10.1038/nmeth.4179
12. Bose S, Wan Z, Carr A et al (2015) Scalable microfluidics for single-cell RNA printing and sequencing. Genome Biol 16:120. https://doi.org/10.1186/s13059-015-0684-3
13. Kivioja T, Vähärautio A, Karlsson K et al (2012) Counting absolute numbers of molecules using unique molecular identifiers. Nat Methods 9:72–74. https://doi.org/10.1038/nmeth.1778

Chapter 9

Single-Cell Tagged Reverse Transcription (STRT-Seq)

Kedar Nath Natarajan

Abstract

Single-cell RNA sequencing (scRNA-seq) has become an established approach to profile entire transcriptomes of individual cells from different cell types, tissues, species, and organisms. Single-cell tagged reverse transcription sequencing (STRT-seq) is one of the early single-cell methods that utilizes 5′ tag counting of transcripts. STRT-seq performed on the microfluidics Fluidigm C1 platform (STRT-C1) is a flexible scRNA-seq approach that allows for accurate, sensitive, and, importantly, molecular counting of transcripts at the single-cell level. Herein, I describe the STRT-C1 method and the steps involved in capturing 96 cells across a C1 microfluidics chip, cDNA synthesis, and preparing single-cell libraries for Illumina short-read sequencing.

Key words STRT-C1, scRNA-seq, 5′ Tag counting, UMIs, Single-cell tagged reverse transcription, Fluidigm C1, Microfluidics

1 Introduction

Single-cell RNA sequencing (scRNA-seq) is a powerful and unbiased approach for quantifying the transcriptome of individual cells and has transformed our understanding of development and disease mechanisms [1]. scRNA-seq approaches have been applied to identify and distinguish subpopulation structures across different cell types, including pluripotent stem cells (PSCs) and immune cells (reviewed in [1, 2]), and for all cell types in the human body [3, 4]. Single-cell tagged reverse transcription sequencing (STRT or STRT-seq) is one of the early multiplexed approaches for single-cell RNA sequencing and has been developed to be performed either in 96/384-well plates or using the Fluidigm C1 microfluidics system [5, 6]. STRT-seq has been further improved, including a recently published dual-index method compatible with nuclear RNA-seq on a microwell platform [7, 8]. The initial step in STRT-seq is the isolation/capture and lysis of cells in tubes. Subsequently, the first-strand is synthesized using

Electronic supplementary material: The online version of this chapter (https://doi.org/10.1007/978-1-4939-9240-9_9) contains supplementary material, which is available to authorized users.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_9, © Springer Science+Business Media, LLC, part of Springer Nature 2019

biotinylated Oligo-dT primer and the reverse transcriptase adds 3–6 cytosines to the 3′ end. Subsequently, using a helper template switching oligonucleotide (TSO), the reverse transcriptase switches the template, continues synthesis along the helper oligonucleotide, and introduces a barcode into the cDNA. This barcoded cDNA is purified, PCR-amplified, and immobilized on biotin beads, followed by fragmentation, adapter ligation, library amplification, and sequencing using two sets of primers. The STRT-seq protocol has been improved to incorporate individual molecular counting using unique molecular identifiers (UMIs) and adapted to be performed on the Fluidigm C1 platform [5, 7–9].

Herein, I describe the 5′ tag counting STRT-seq protocol on 96 single mouse PSCs performed on the microfluidics Fluidigm C1 platform (STRT-C1) [6]. STRT-C1 has several advantages over the older STRT-seq protocol and other tag counting scRNA-seq methods. Firstly, STRT-C1 is performed on a microfluidics chip that enables accurate, sensitive, and efficient reactions in nanoliter volumes. Secondly, the introduction of UMIs allows for precise counting of unique transcripts and their separation from PCR duplicates. Thirdly, the protocol allows the flexibility to use in-house Tn5 transposase for tagmentation and different single-cell barcoding indices. Lastly, since the TSO and Oligo-dT have biotin tags, the pooled library can be purified with streptavidin beads and further enriched for 5′ UMI transcripts by depleting 3′ fragments containing the PvuI restriction site. Among the disadvantages of STRT-C1 are the costs associated with the microfluidics instrument and chips, the need for access to in-house Tn5 transposase, the identification and quantification of 5′ tags alone, and hands-on time. The limiting step is access to the microfluidics device; however, after generating cDNA, multiple libraries can be prepared in parallel over 2–3 days.

2 Materials

Across any scRNA-seq method, four key aspects should be kept in mind. Firstly, individual cells have very limited starting material (RNA) and can vary in size. Secondly, individual single-cell samples should be handled diligently to avoid RNA degradation and/or contamination. Thirdly, reference controls should be added to distinguish true biological differences from technical variation. Lastly, scRNA-seq reagents and equipment are expensive, and experiments should be carefully planned and managed. The C1 IFC accommodates cell sizes spanning from 5 to 25 μm (see Note 1), while the synthetic RNA spike-in controls provide a means to distinguish technical variation and noise (see Note 2). It is essential that dedicated pre-PCR areas or UV workstations are used for making mastermixes and priming the microfluidics C1 chip. Post-PCR areas or UV workstations should be used for subsequent steps involving either lysed cells or cDNA (see Note 3). Prepare and keep all reagents and mastermixes on ice throughout the protocol, unless specifically highlighted. All solutions are prepared using ultrapure nuclease-free water.

2.1 C1 Auto Prep System and Reagents

1. C1 Reagent kit for mRNA-seq consisting of two modules: Module 1 (4 °C) containing cell suspension reagent, blocking reagent, and cell wash buffer, and Module 2 (−20 °C) containing harvest buffer, DNA dilution reagent, and preloading and loading reagent.
2. C1 microfluidics chips (IFC: integrated fluidic circuits) chosen based on average cell size: small (5–10 μm), medium (10–17 μm), or large (17–25 μm) (see Note 1).
3. C1 Autoprep system.

2.2 RNA Spike-In Controls

1. RNA spike-in controls: These are several control RNA transcripts of known sequence, quantity, and concentration. The spike-ins can be of synthetic origin, such as ArrayControl, ERCCs (External RNA Controls Consortium), SIRVs (Spike-in RNA variant control mixes), or sequencing spike-ins (Sequins), or RNA from a different organism than the one tested (plant RNA, microbial RNA, etc.) (see Note 2).

2.3 Custom Oligonucleotides

1. Oligo-dT sequence: Custom Oligo-dT sequence with a spacer sequence and 5′ biotin tag. The Oligo-dT should be diluted in sterile ultrapure nuclease-free water.
   C1-P1-T31: 5′-Biotin-AATGATACGGCGACCACCGATCGTT₃₀-3′.
2. Template switching oligonucleotide (TSO): Custom TSO of RNA nucleotides having a 5′ biotin tag. The TSO should be diluted in sterile ultrapure nuclease-free water.
   C1-P1-TSO: 5′-Biotin-AAUGAUACGGCGACCACCGUNNNNNNGGG-3′.
3. Custom PCR handle sequence: Custom PCR DNA oligo primer with biotin tag. The PCR handle should also be diluted in sterile ultrapure nuclease-free water.
   C1-P1-PCR: 5′-Biotin-GAATGATACGGCGACCACCGAT-3′.
4. Custom oligonucleotides for single-cell barcoding and library preparation.
   STRT-Tn5-U: 5′-Phosphate-CTGTCTCTTATACACATCTGACGC-3′.
   Ninety-six different single-cell barcodes (see Supplementary Table 1).

2.4 Cell Culture and Laboratory Equipment

1. Pre-PCR and post-PCR hoods (see Note 3).
2. Cell counter.
3. Resuspension and washing buffer: Sterile 1× Dulbecco’s phosphate-buffered saline (PBS) without calcium or magnesium is used for cell culture, as washing buffer, and for final resuspension of the single-cell solution.
4. Single-channel (P2, P10, P20, P200, and P1000) and multichannel pipettes (P20 and P200).
5. Sterile ultrapure nuclease-free water.
6. Sterile PCR grade water.
7. RNase inhibitor (40 U/μL).
8. Magnesium chloride (1000 mM).
9. Betaine (5 M).
10. Triton X-100 (0.2%).
11. dNTPs (20 mM).
12. DTT (20 mM).
13. 5× first strand buffer (5×).
14. Reverse transcriptase (200 U/μL).
15. 10× Advantage 2 PCR buffer (10×).
16. 50× dNTP mix (20 mM).
17. 50× Advantage 2 Polymerase mix (50×).
18. 96-well plates.
19. 96-well plate seals.
20. Fragment analyzer or Bioanalyzer station.
21. High sensitivity DNA kit reagents.
22. Bright-field microscope with 4× and 20× objectives.
23. RNAZap.
24. Ethanol.

2.5 Library Preparation

1. 2× TAPS buffer (pH 8.5): 10 mM TAPS, 50 mM MgCl2 in total 100 mL water. Set pH to 8.5.
2. 2× BWT buffer: 10 mM Tris–HCl (pH 7.5), 1 mM EDTA, 2 M NaCl, 0.02% Tween 20 in total 100 mL water.
3. TNT buffer: 20 mM Tris–HCl (pH 7.5), 50 mM NaCl, 0.02% Tween 20 in total 100 mL water.
4. Custom Tn5 transposase.
5. 96-well plate containing oligonucleotide barcodes (list in Supplementary Table 1).
6. Magnetic rack.
7. Plate/tube shaker with temperature control.
8. Streptavidin Dynabeads.
9. PCR purification kit.
10. Restriction enzyme (PvuI-HF) and CutSmart buffer.
11. Agencourt AMPure XP beads.
12. KAPA Illumina library quantification kit.

3 Methods

Prepare and keep all reagents and mastermixes on ice throughout the protocol, unless specifically highlighted. Nuclease-free fresh filter tips and tubes should be used for all the steps. All solutions are prepared using ultrapure nuclease-free water. Extra care should be given for all steps before and during the “mRNA Seq: RT & Amp” step including addition of RNase inhibitors, since RNA is more susceptible to degradation. Dedicated pre-PCR areas should be used for making mastermixes/reagents and post-PCR areas for processing single-cell cDNA samples, sequencing indices, libraries, etc. to avoid contamination. UV sterilization step should be carried out for PCR workstations, followed by complete cleaning using RNAZap and 80% ethanol.

3.1 Preparing Reagents, Mastermixes and Setting Up the System

1. Allow the frozen (−20 °C) reagents (C1 preloading reagent, harvest reagent, and loading reagent) to thaw on ice for 15–20 min. Also keep the blocking reagent (4 °C) on ice, and keep the −80 °C RNA spikes on dry ice.
2. RNA-spike mixes and aliquots: The RNA spikes should be thawed quickly and vortexed, and several dilutions (~1:10 and 1:100 dilutions) should be made using C1 loading reagent (see Note 3).
3. Serial dilutions of 1:10 and 1:100 RNA spikes should be further made, vortexed, and spun down, and 5 μL aliquots should be stored at −80 °C for single experimental use (see Note 2).
4. Thaw the frozen (−20 °C) reagents (RNase inhibitor, 3′ SMART CDS Primer IIA, SMARTer dilution buffer, 5× first strand buffer, DTT, dNTP mix, PCR grade water, 10× Advantage 2 PCR buffer, 50× dNTP mix, IS PCR primer, and 50× Advantage 2 PCR polymerase mix) on ice for 15–20 min.
5. Make Lysis pre-mastermix (see Note 4).

Reagent                Stock concentration   Volume, μL   Final concentration
Triton X-100           1%                    30           0.15%
dNTP mix               20 mM                 35           3.5 mM
DTT                    100 mM                35           17.5 mM
C1-P1-T31              20 μM                 40           4 μM
C1 loading reagent     100%                  10           5%
Nuclease-free water    –                     30           –
Total volume                                 180

6. Make Lysis mastermix (see Note 5).

Reagent                Stock concentration   Volume, μL   Final concentration
RNase inhibitor        40 U/μL               1            2 U/μL
Lysis pre-mastermix    ~1.6×                 18           1.5×
RNA spike-ins          –                     1            –
Total volume                                 20
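The final concentrations listed for the lysis pre-mastermix appear to refer to the complete lysis mastermix, that is, after 18 μL of pre-mastermix is brought to 20 μL. This reading is an inference from the tabulated numbers, not a statement from the protocol. The sketch below, with hypothetical names, recomputes the values under that assumption so they can be compared with the tables.

```python
# (stock concentration, unit, volume in uL) copied from the lysis pre-mastermix table
LYSIS_PRE_MASTERMIX = {
    "Triton X-100": (1.0, "%", 30),
    "dNTP mix": (20.0, "mM", 35),
    "DTT": (100.0, "mM", 35),
    "C1-P1-T31": (20.0, "uM", 40),
    "C1 loading reagent": (100.0, "%", 10),
    "Nuclease-free water": (None, "", 30),
}
PRE_MIX_UL_IN_LYSIS_MIX = 18  # from the lysis mastermix table
LYSIS_MIX_TOTAL_UL = 20

pre_total_ul = sum(vol for _, _, vol in LYSIS_PRE_MASTERMIX.values())

def final_concentration(stock, volume_ul):
    """Concentration in the 20 uL lysis mastermix (ASSUMED reference volume)."""
    in_pre_mix = stock * volume_ul / pre_total_ul
    return in_pre_mix * PRE_MIX_UL_IN_LYSIS_MIX / LYSIS_MIX_TOTAL_UL

final = {
    name: final_concentration(stock, vol)
    for name, (stock, _unit, vol) in LYSIS_PRE_MASTERMIX.items()
    if stock is not None
}
rnase_inhibitor_u_per_ul = 40 * 1 / LYSIS_MIX_TOTAL_UL

for name, value in final.items():
    print(f"{name}: {value:.2f} {LYSIS_PRE_MASTERMIX[name][1]}")
print(f"RNase inhibitor: {rnase_inhibitor_u_per_ul:.1f} U/uL")
```

Under this assumption the recomputed values match the tabulated final concentrations for these reagents, which supports the interpretation but does not replace checking against the manufacturer's documentation.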

7. Make reverse transcription (RT) pre-mastermix (see Note 6).

Reagent                  Stock concentration   Volume, μL   Final concentration
5× First-strand buffer   5×                    116          1.75×
MgCl2                    1000 mM               3.5          10.5 mM
Betaine                  5 M                   96           1.45 M
C1 loading reagent       20×                   12           3.6×
Total volume                                   227.5

8. Make RT mastermix (see Note 7).

Reagent                  Stock concentration   Volume, μL   Final concentration
RNase inhibitor          40 U/μL               1.3          1 U/μL
Reverse Transcriptase    200 U/μL              3            11.42 U/μL
RT pre-mastermix         2.5×                  22.5         1.1×
C1-P1-TSO                40 μM                 3            2.3 μM
Total volume                                   30

9. Make PCR pre-mastermix (see Note 8).

Reagent                        Stock concentration   Volume, μL   Final concentration
C1-P1-PCR                      12 μM                 30           480 nM
10× Advantage 2 PCR buffer     10×                   75           1×
50× dNTP mix                   20 mM                 15           400 μM
PCR-grade water                –                     490          –
C1 loading reagent             20×                   35           1.1×
20× EvaGreen dye (optional)    20×                   3.2          0.1×
Total volume                                         645

10. Make PCR mastermix (see Notes 9 and 10).

Reagent                                Stock concentration   Volume, μL   Final concentration
50× Advantage 2 PCR polymerase mix     50×                   3            2.2×
PCR pre-mastermix                      1.173×                64.5         1.12×
Total volume                                                 67.5

11. Store all mastermixes and reagents on ice or at 4 °C (see Note 11).

3.2 Priming C1 IFC

1. Open a new vacuum-packed C1 IFC of appropriate size (see Note 1) and make sure that the seals and strips on the IFC have not been tampered with.
2. Add 200 μL C1 harvest reagent into each of the 40 chambers, 20 μL C1 harvest reagent into four chambers, 20 μL C1 preloading reagent into a single chamber, 20 μL cell wash buffer into two chambers, and 15 μL C1 blocking reagent into two chambers (as marked in Fig. 1) (see Note 12).
3. Switch on the C1 system (see Note 13).
4. Peel the tape from the bottom of the C1 IFC, place it into the C1 system, and run the mRNA seq: Prime script. This step takes ~10 min (see Note 14).
5. During priming, prepare the single-cell suspension from tissues, primary cells, or cells from culture. The dissociated single-cell suspension should be suspended in relevant sterile media, sterile PBS, or buffer and passed through appropriate size cell

[Fig. 1 legend: A1 orientation mark; 200 μL Harvest reagent; 20 μL Harvest reagent; 20 μL Preloading reagent; 15 μL Blocking reagent; 20 μL Cell Wash buffer]

Fig. 1 The final C1 IFC configuration with the highlighted wells that should be filled with specific reagents and volumes prior to running the priming step

strainers (30, 40, 70, or 100 μm) to get rid of debris, clumps, and large particles.
6. Preparing cell mix: Cells should optimally be loaded at a concentration between 166,000 and 250,000 cells/mL, to enable ~600 cells to be loaded on the C1 IFC (see Notes 15 and 16). Starting with the appropriate cell concentration is critical, and it is advised to use multiple methods to count cells (see Note 17). Prepare the final cell suspension mix by combining cells (166–250 cells/μL) with C1 suspension reagent (see Note 18).

Reagent                  Concentration       Final volume, μL
Single-cell suspension   166–250 cells/μL    60
Suspension reagent       –                   40
Total volume                                 100
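Bringing a counted suspension into the 166–250 cells/μL window is a simple dilution, sketched below. The counted concentration (1000 cells/μL) and the 200 cells/μL target are illustrative assumptions, not values from the protocol, and the function names are hypothetical.

```python
def dilute_to_target(counted_cells_per_ul, target_cells_per_ul, final_vol_ul):
    """Return (cell suspension uL, buffer uL) needed to reach the target concentration."""
    if counted_cells_per_ul < target_cells_per_ul:
        raise ValueError("suspension is below the target; concentrate the cells instead")
    cells_ul = target_cells_per_ul * final_vol_ul / counted_cells_per_ul
    return cells_ul, final_vol_ul - cells_ul

def cell_mix_concentration(cells_per_ul, cells_ul=60, suspension_reagent_ul=40):
    """Concentration in the cell mix after adding suspension reagent (60:40 per the table)."""
    return cells_per_ul * cells_ul / (cells_ul + suspension_reagent_ul)

# Illustrative numbers only: 1000 cells/uL counted, 200 cells/uL target, 100 uL final
cells_ul, buffer_ul = dilute_to_target(1000, 200, 100)
mix_conc = cell_mix_concentration(200)

print(f"{cells_ul:.0f} uL cells + {buffer_ul:.0f} uL buffer")
print(f"cell mix: {mix_conc:.0f} cells/uL")
```

Multiplying `mix_conc` by the volume actually pipetted onto the IFC gives an estimate of the number of cells loaded.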

7. Remove the primed C1 IFC from the C1 Autoprep system into the pre-PCR area and remove the flow-through C1 blocking reagent from the inlet and outlet chambers (~15 μL; as marked in Fig. 2). Cells can also be optionally stained during the cell load step (see Notes 19 and 20).
8. Typically, live–dead staining is performed using ethidium homodimer, which stains membranes of dead cells, and Calcein for live cells. (Note: Other dyes can be used, provided they do not interfere with DNA or RNA activity.)

3.3 Loading Cells in C1 IFC

1. Mix the cell mix well and add up to 20 μL of cell mix to the C1 IFC (as marked in Fig. 2) (see Note 21).
2. Place the C1 IFC with cells into the C1 system and run the mRNA seq: Cell Load script. This step takes ~10–30 min (see Notes 20 and 22).

[Fig. 2 legend: A1 orientation mark; 20 μL LIVE/DEAD stain; 5 μL Cell mix (up to 20 μL); Remove Blocking buffer]

Fig. 2 After priming, the blocking reagent should be removed from the C1 IFC before adding the cell mix and optional staining solutions, and running the cell load step

[Fig. 3 legend: A1 orientation mark; 180 μL Harvest reagent; 9 μL Lysis mastermix; 9 μL RT mastermix; 24 μL PCR mastermix]

Fig. 3 The final C1 IFC configuration with the highlighted wells filled with specified lysis, RT, and PCR reagents before running the mRNA seq: RT & Amp script

3. During the cell load, the remaining reagents can be optionally tested as tube controls using purified RNA or whole cells (see Note 16).
4. Single-cell capture, imaging C1 IFC, and annotating cell chambers: After cell load, each of the 96 single-cell capture sites should be critically assessed under a bright-field microscope (as marked in Fig. 2). Each single-cell capture site should be checked for a single cell, empty site, debris, doublet, or multiple cells, and should be carefully annotated (see Notes 15, 23, and 24).

3.4 Running Cell Lysis, RT, and PCR Within C1 System

1. After single-cell capture site annotation, add 180 μL of harvest reagent into the four large inlets, 9 μL of lysis mastermix, 9 μL of RT mastermix, and 24 μL each of PCR mastermix into two inlets (as marked in Fig. 3).

2. Place the C1 IFC with cells into the C1 system and run the mRNA seq: RT & Amp script. This step takes ~8.5 h but can be programmed and extended up to ~25 h for the researcher’s convenience (see Note 25).
3. Lysis program.

Temperature, °C   Duration, min
72                3
4                 10
25                1

4. Reverse transcription program.

Temperature, °C   Duration, min
42                90
70                10

5. PCR.

Step                   Temperature, °C   Duration   Number of cycles
Initial denaturation   95                1 min      1
Denaturation           95                20 s       5
Annealing              58                4 min
Extension              68                6 min
Denaturation           95                20 s       9
Annealing              64                30 s
Extension              68                6 min
Denaturation           95                30 s       7
Annealing              64                30 s
Extension              68                7 min
Final extension        72                10 min     1
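The PCR program can be encoded as data to tally the number of amplification cycles and the programmed hold time. This is a minimal sketch with a hypothetical data layout; it counts hold times only and ignores temperature ramping, so the actual run time on the instrument will be longer.

```python
# Each block: (number of cycles, [(step name, temperature in C, hold time in seconds)])
PCR_PROGRAM = [
    (1, [("Initial denaturation", 95, 60)]),
    (5, [("Denaturation", 95, 20), ("Annealing", 58, 240), ("Extension", 68, 360)]),
    (9, [("Denaturation", 95, 20), ("Annealing", 64, 30), ("Extension", 68, 360)]),
    (7, [("Denaturation", 95, 30), ("Annealing", 64, 30), ("Extension", 68, 420)]),
    (1, [("Final extension", 72, 600)]),
]

# Amplification cycles are the blocks made of denaturation/annealing/extension steps
amplification_cycles = sum(cycles for cycles, steps in PCR_PROGRAM if len(steps) == 3)

# Programmed hold time only; ramping between temperatures is NOT included
hold_seconds = sum(cycles * sum(hold for _, _, hold in steps) for cycles, steps in PCR_PROGRAM)

print(f"amplification cycles: {amplification_cycles}")
print(f"programmed hold time: {hold_seconds / 60:.1f} min")
```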

6. Harvesting amplified products: Thaw the C1 dilution reagent for ~20–30 min at room temperature, transfer a new 96-well plate to a clean post-PCR area, and mark it as “Diluted harvest cDNA.”
7. Add 10 μL of C1 dilution reagent to each well of the “diluted harvest cDNA” plate (see Note 26).
8. Transfer the C1 IFC to the post-PCR area once the mRNA seq: RT & Amp script is complete. Carefully remove the white strip tapes from the C1 IFC to reveal the harvesting outlets (see Notes 27–29).

[Fig. 4 legend: A1 orientation mark; Harvest well]

Fig. 4 The C1 IFC harvests single-cell cDNA from different cell chambers into the collection wells (grey wells). Each column contains 16 harvest wells, spaced alternately to fit an eight-channel multipipette

     1    2    3    4    5    6    7    8    9    10   11   12
A    3    2    1    49   50   51   6    5    4    52   53   54
B    9    8    7    55   56   57   12   11   10   58   59   60
C    15   14   13   61   62   63   18   17   16   64   65   66
D    21   20   19   67   68   69   24   23   22   70   71   72
E    25   26   27   75   74   73   28   29   30   78   77   76
F    31   32   33   81   80   79   34   35   36   84   83   82
G    37   38   39   87   86   85   40   41   42   90   89   88
H    43   44   45   93   92   91   46   47   48   96   95   94

Fig. 5 The final configuration of single-cell cDNA from different cell chambers is collected across 96-well plate. The cell numbers on 96-well plate correspond to cell capture sites (same as imaging) on C1 IFC
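When annotating capture sites from imaging, it can help to look up the 96-well position of each capture site programmatically. The sketch below transcribes the Fig. 5 layout into a lookup table. The transcription and the names `PLATE_LAYOUT` and `SITE_TO_WELL` are my own and should be verified against the figure before use.

```python
# Rows A-H, columns 1-12; each entry is the C1 capture site harvested into that well
# (transcribed from Fig. 5; verify against the figure before relying on it)
PLATE_LAYOUT = {
    "A": [3, 2, 1, 49, 50, 51, 6, 5, 4, 52, 53, 54],
    "B": [9, 8, 7, 55, 56, 57, 12, 11, 10, 58, 59, 60],
    "C": [15, 14, 13, 61, 62, 63, 18, 17, 16, 64, 65, 66],
    "D": [21, 20, 19, 67, 68, 69, 24, 23, 22, 70, 71, 72],
    "E": [25, 26, 27, 75, 74, 73, 28, 29, 30, 78, 77, 76],
    "F": [31, 32, 33, 81, 80, 79, 34, 35, 36, 84, 83, 82],
    "G": [37, 38, 39, 87, 86, 85, 40, 41, 42, 90, 89, 88],
    "H": [43, 44, 45, 93, 92, 91, 46, 47, 48, 96, 95, 94],
}

# Invert the layout: capture site number -> well name such as "A3"
SITE_TO_WELL = {
    site: f"{row}{col}"
    for row, sites in PLATE_LAYOUT.items()
    for col, site in enumerate(sites, start=1)
}

# Sanity check: every capture site from 1 to 96 appears exactly once
assert sorted(SITE_TO_WELL) == list(range(1, 97))

print(SITE_TO_WELL[1], SITE_TO_WELL[96])
```

A capture site annotated as a doublet or as empty can then be excluded by its well name in downstream sample sheets.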

9. Using an eight-channel pipette, transfer the single-cell cDNA from the C1 IFC to the “diluted harvest cDNA” plate (as marked in Fig. 4) (see Notes 30–32). The “diluted harvest cDNA” plate can be stored at −20 °C. The final layout of single cells is marked in Fig. 5.
10. The cDNA concentration from either a representative single cell or all single cells should be measured using either a Bioanalyzer/Fragment analyzer or Picogreen. Typically, we observe between 1 and 20 ng/μL depending on cell type.
11. Dilute the single-cell cDNA to a final concentration of 1 ng/μL.

3.5 Library Preparation

1. Follow steps 1–3 to make the 10× transposome stock mix. Using a multichannel pipette, transfer 5 μL of each oligonucleotide barcode from the 96-well plate containing oligonucleotide barcodes (Supplementary Table 1) into a new 96-well plate (“Tn5 barcode plate”).
2. Add 5 μL of STRT-Tn5-U oligonucleotide to each well of the “Tn5 barcode plate,” briefly vortex, and spin. Anneal the oligonucleotides by placing the 96-well Tn5 barcode plate in the PCR machine and running the program (95 °C for 2 min, ramp down to 25 °C over 25 min). This plate can be stored at −20 °C for long-term storage or used immediately for the next step.
3. In a new 96-well plate, transfer 1 μL each from the “Tn5 barcode plate” using a multichannel pipette. Add 3.5 μL of Tn5 transposase to each well and incubate at room temperature (or 37 °C) for 60 min (see Note 33). This plate is the 10× transposome stock.
4. Follow steps 5–9 to prepare the “Tagmentation mastermix.”

Reagent               Volume (per well), μL   Final concentration   Volume (100 wells), μL
Nuclease-free water   7.5                     –                     750
100% DMF              2                       –                     200
2× TAPS buffer        2                       –                     200
Total volume          11.5                                          1150

5. Add 11.5 μL of tagmentation mastermix to each well of a new 96-well plate (tagmentation plate).
6. Transfer 2.5 μL of 10× transposome stock mix to each well of the tagmentation plate.
7. Add 6 μL of harvested single-cell cDNA to the tagmentation plate.
8. Each well of the tagmentation plate should have a final volume of 20 μL, containing 6 μL of unique harvested single-cell cDNA, 2.5 μL of unique 10× transposome barcode, and 11.5 μL of tagmentation mastermix. Mix and spin the plate. Place the harvested single-cell cDNA plate back at −20 °C.
9. Perform the tagmentation reaction in a PCR block at 55 °C for 5 min, followed by cooling to 4 °C (see Note 34).
10. To bind streptavidin beads to tagmented cDNA, prepare streptavidin Dynabeads: take 120 μL of streptavidin Dynabeads and wash twice in 2× BWT buffer. Finally, resuspend in 3 mL of 2× BWT buffer.
11. Add 20 μL of streptavidin Dynabeads to each well of the cooled tagmentation plate. Incubate at room temperature for 10 min. The plate can also be left on a plate shaker at 37 °C with shaking at ~500–800 rpm.
12. Pool the samples from all wells of the 96-well plate into a single tube (see Note 35). Bind the beads to the magnet and remove the supernatant. This step pools all single-cell cDNA libraries into a single tube.
13. Wash the beads in 100 μL of TNT buffer, followed by 100 μL of Qiaquick PB buffer and three further washes in 100 μL of TNT buffer. During washing, do not remove the pooled single-cell library tube from the magnet (see Notes 36 and 37). This step immobilizes the cDNA fragments on the beads.
14. To remove the bead-bound 3′ fragments, after the last wash resuspend the beads in 100 μL of the restriction mix below. Incubate at 37 °C for 60 min on a shaker with interval mixing (30 s at ~106 × g; 2 min pause) to avoid bead clumping.

Reagent                        Stock concentration   Volume, μL   Final concentration
CutSmart buffer                10×                   10           1×
PvuI-HF restriction enzyme     20 U/μL               2            0.4 U/μL
Nuclease-free water                                  88
Total volume                                         100

15. Wash the beads thrice with 100 μL TNT buffer after restriction digestion (see Note 38). Let the beads air-dry for 3–5 min and resuspend in 30 μL nuclease-free water. Incubate at 70 °C for 10 min with constant mixing (850 rpm) to release the 5′ fragments from the beads.
16. Bind the beads to the magnet and carefully collect the eluate (supernatant) containing the pooled single-cell libraries.
17. Follow steps 18–24 to perform cleanup of the cDNA libraries and final library elution.
18. To the 30 μL of eluted library, add 54 μL (1.8× volume) of AMPure XP beads.
19. Mix well and incubate at room temperature for 10 min.
20. Bind the beads to the magnet for 1 min and discard the supernatant containing unbound fragments.

21. Wash the beads with fresh 70% ethanol for 30 s, wait for 1 min, and remove the supernatant (see Notes 36 and 37).
22. Dry the beads at room temperature for 2–5 min (see Note 39).
23. Resuspend the beads in 30 μL nuclease-free water and incubate for 10 min at room temperature.
24. Bind the beads to the magnet for 1 min and collect the supernatant containing the purified single-cell cDNA library. The final library should be quantified and stored at −20 °C.
25. Follow steps 26–29 to quantify the cDNA library and estimate fragment length.
26. Dilute the final single-cell cDNA library to 1:100 and 1:1000. For the 1:100 dilution, add 198 μL nuclease-free water to 2 μL cDNA library. For the 1:1000 dilution, add 18 μL nuclease-free water to 2 μL of the 1:100 dilution.
27. Quantify the cDNA concentration using KAPA library quantification with standard controls. Prepare the mastermix according to the table below. Add 18 μL of mastermix to 2 μL of the 1:100 and to 2 μL of the 1:1000 diluted cDNA library, respectively.

Reagent                      Stock concentration   Volume, μL   Final concentration
KAPA SYBR FAST mastermix     10×                   12           1×
STRT-Tn5-U primer            100 μM                2            10 μM
C1-P1-PCR primer             100 μM                2            10 μM
Nuclease-free water                                2
Total volume                                       18

28. Run the qPCR program below.

Step                    Temperature   Time     Cycles
Initial denaturation    95 °C         5 min
Denaturation            95 °C         30 s     30 cycles
Annealing/extension     60 °C         45 s     30 cycles
Hold                    10 °C

29. In parallel, the same PCR reaction should also be run for 11 cycles using 2 μL of undiluted pooled library. This amplified product can be run on a Fragment Analyzer/Bioanalyzer to assess quantity, fragment length, and size distribution. The expected final library concentration is ~300–1200 pM.
30. The final libraries can be sequenced on an Illumina platform using C1-P1-PCR as the Read 1 primer and STRT-Tn5-U as the index read primer (Fig. 6).
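The mastermix tables in this section list per-reaction volumes next to a scaled-up batch. As a minimal sketch of that arithmetic, the code below scales the per-well volumes of the tagmentation mastermix (step 4); the dictionary and function names are illustrative assumptions, and the reagent volumes are copied from the table.

```python
# Per-well volumes (μL) copied from the tagmentation mastermix table in step 4.
TAGMENTATION_MIX_UL = {
    "Nuclease-free water": 7.5,
    "100% DMF": 2.0,
    "2x TAPS buffer": 2.0,
}


def scale_mastermix(per_well_ul, n_wells):
    """Return the total volume (μL) of each reagent needed for n_wells reactions."""
    if n_wells <= 0:
        raise ValueError("n_wells must be positive")
    scaled = {reagent: volume * n_wells for reagent, volume in per_well_ul.items()}
    # The total is computed from the reagent volumes before the key is added.
    scaled["Total volume"] = sum(scaled.values())
    return scaled


if __name__ == "__main__":
    for reagent, volume in scale_mastermix(TAGMENTATION_MIX_UL, 100).items():
        print(f"{reagent}: {volume} μL")
```

Scaling to 100 wells reproduces the last column of the table (750, 200, 200, and 1150 μL); preparing 100 reactions for a 96-well plate leaves a small pipetting reserve.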

[Fig. 6 schematic, panels as labeled in the figure: Step 1. Annealing biotinylated oligo-dT (C1-P1-T31) to polyA mRNA. Step 3. Template switching and second strand synthesis. Step 4. cDNA amplification by C1-P1-PCR. Step 5. Tn5 dimer with mosaic sequences and tagmentation, giving 5′, middle, and 3′ tagmentation products. Step 6. Streptavidin pulldown of 5′ product, PvuI digestion of 3′ product (the product is cleaved and not amplified), and selective amplification of 5′ product. Step 7. Final library and sequencing with the index read primer and the Read 1 sequencing primer. Legend: NNNNNN, unique molecular identifier (UMI); XXXXXXXX, cell-specific barcode.]

Fig. 6 Complete overview and chemistry of the different steps during STRT-C1 experiment

4 Notes

1. The C1 microfluidics IFCs can capture small (5–10 μm), medium (10–17 μm), or large (17–25 μm) cells. For cells between 8 and 11 μm, it is preferable to use the small IFC, as single cells are better trapped in the capture site.
2. While a wide variety of spike-in types can be used (ERCCs, SIRVs, Sequins, plant RNAs, etc.), the abundance and concentration of each RNA spike-in should be known for classifying technical variation. To avoid RNA spike-in degradation, spike-ins should be carefully handled in RNase-free areas and resuspended in solutions containing RNase inhibitors. Once spike-ins are thawed, they should be diluted, aliquoted, and frozen for single use. Spike-ins must not be subjected to repeated freeze–thaw cycles, as this leads to significant RNA degradation [10].
3. Dedicated pre-PCR and post-PCR UV workstation areas should be set up for scRNA-seq experiments to avoid nucleic acid and nuclease contamination. The workstations should be sterilized with short-wavelength UV for 10–30 min before and after use. In addition, the surfaces should be wiped with low-percentage bleach and 70% ethanol to remove traces of RNases and DNases. Apart from being good laboratory practice, the use of workstations also helps reduce human sources of contaminants such as saliva, skin, and hair.
4. The Lysis pre-mastermix is enough for ten Lysis mastermixes and should be transferred to single-use aliquots and stored at −20 °C. The Lysis pre-mastermix aliquots are stable for several months. The premixes should be thawed on ice and carefully handled in areas that have been cleaned of RNases.
5. The Lysis mastermix is enough for two C1 IFCs (9 μL per IFC). It is not advisable to scale down the reagents, as the volumes of both spike-ins and RNase inhibitor drop below 1 μL and cannot be accurately pipetted. RNase inhibitor should be added to the Lysis pre-mastermix first, followed by diluted spike-ins of the required concentration. The Lysis mastermix should be kept at 4 °C at all times.
6. The RT pre-mastermix is also enough for ten RT mastermixes and should be transferred to single-use aliquots and stored at −20 °C. The RT pre-mastermix aliquots are stable for several months and should be thawed on ice. Care should be taken that both MgCl2 and C1 loading reagents are fully thawed.
7. The RT mastermix is also enough for two C1 IFCs (9 μL per IFC). The reverse transcriptase enzyme should be added last. It is important that the C1-P1-TSO primer is fully thawed. The RT mastermix should be kept at 4 °C at all times.

8. The PCR pre-mastermix is also enough for ten PCR mastermixes. All the reagents and the mastermix should be thawed and kept on ice throughout.
9. The PCR mastermix is only enough for a single C1 run.
10. The 50× Advantage 2 PCR reagent and reverse transcriptase (−80 °C) should be kept on dry ice and thawed just before adding to the mastermix.
11. It is essential to keep the Lysis, RT, and PCR mastermixes on ice throughout.
12. Before the priming step, the harvest reagent is generally added in excess across multiple wells of the C1 IFC. During the priming step, the harvest reagent flows through the chip to remove clogs, debris, etc.
13. It is important to switch on the C1 Autoprep system ~20–30 min before starting the reagent preparation. The C1 Autoprep system performs a set of calibrations and checks, including temperature, pressure, and vacuum control.
14. Independent of the C1 IFC size, the priming step takes ~10 min. The chips can stay within the Autoprep system for up to 1 h after priming.
15. The range of cells to be loaded on the C1 IFCs is between 166 and 250 cells/μL. Loading fewer cells will lead to several empty capture sites, while overloading typically leads to multicell chambers (doublet, 3-cell, 4-cell), clumps, and debris.
16. The remaining pool of cells (bulk), or purified RNA from bulk cells, left after the cell load step on the C1 IFC can be used for optional tube controls and bulk RNA-seq:
(a) Dilute RNA from bulk cells (20–50 ng/μL) or prepare a cell mix (100–200 cells/μL).
(b) Prepare positive control and no-template negative control tubes for lysis on new PCR strips, and place on a thermal cycler for the lysis step (same thermal cycler steps as for C1).
(c) Add RT mastermix to the lysis products in the PCR strips and run the RT program on the thermal cycler (same thermal cycler steps as for C1).
(d) Add PCR mastermix to each tube within the PCR strip and run the PCR amplification step (same thermal cycler steps as for C1).
(e) After PCR, move the strips to the post-PCR area and dilute the amplified product, that is, 1 μL cDNA + 45 μL C1 DNA dilution reagent.
(f) Follow the quantification (step 33) and library preparation steps as with single cells (step 35 onward).

17. It is essential that multiple cell counting methods are used to determine the number of cells, owing to the inherent biases of counting methods. Different labs may use either commercial systems or a traditional hemocytometer, but cross-validation should be performed with either a live–dead stain or nuclear staining.
18. The range of 166–250 cells/μL typically works for most spherical cell types. However, for cells with unusual morphology, size, or volume, it is better to optimize the dilution and capture on the C1 IFC.
19. The cell load step of the mRNA-seq program differs between the small C1 IFC and the medium and large C1 IFCs. It takes ~15 min to load cells onto a small C1 IFC, and ~30 min for medium and large C1 IFCs. The cell load step can usually be combined with staining for a live–dead marker or imaging of fluorescent protein expression within cells.
20. It is advisable to survey the C1 IFC cell capture sites after priming and before the cell load step for any clogs, debris, and/or blockage.
21. The maximum volume of diluted cell mix (cells + suspension reagent) that can be loaded onto the C1 IFC is 20 μL, while the minimum volume is 5 μL. The actual volume that is loaded into the C1 IFC is ~5 μL.
22. Care should be taken that cells are properly mixed prior to adding them to the C1 IFC. The primed C1 IFC should not be left for more than 1 h on the C1 Autoprep system.
23. The individual cell positions are marked on the top left of each capture chamber within the C1 IFC. These should be used to annotate well numbers and whether a single cell, a multiplet, debris, or nothing (empty) has been captured. If live–dead staining is performed during the cell loading step in the C1 IFC, or if cells express fluorescent markers, a semiautomated or automated fluorescence microscope is advised. The specific grids and positions of wells for automated microscopy can be obtained online from the C1 Autoprep system website.
24. After the cell load step, the C1 IFC should be imaged quickly, as cells are still alive within the cell capture sites. The staining/imaging time should be minimized to further reduce stress on live cells.
25. The key advantage of the C1 Autoprep system is that the harvesting time for cDNA from C1 IFCs can be programmed to suit the needs of the researcher. Typically, the C1 run is performed on day 1 (afternoon/evening) and the harvest is scheduled for the next day (day 2, morning), to allow cDNA quantification and library preparation on the same day.

26. When performing repeat pipetting steps across 96-well plates with critical, expensive reagents, we prefer electronic repeater pipettes that dispense liquid sensitively and accurately. With repeater pipettes, care should be taken that tips do not touch the plate or individual wells. Dispensing the reagents onto the side or middle of the well at low speed is advised to avoid liquid spilling and cross-contamination.
27. Once the C1 IFC is removed from the Autoprep system, care should be taken that the white strips covering the harvest wells are still attached.
28. Typically, if the white strips covering the harvest wells are undone or loose, some cDNA evaporates from individual wells. It is advised to carefully measure the volume of harvested cDNA from the different wells and annotate any volume differences.
29. The white strips covering the harvest wells can be removed easily using the strip clipper provided with the C1 IFC. If a strip clipper is not available, an unused nuclease-free pipette tip can be used to wedge off the harvest strips.
30. The layout of the collection wells (harvested cDNA) is distinctly different from that of the capture sites. The C1 IFC contains 16 rows and 8 columns (128 wells), where the first and the last collection columns are empty and should be ignored.
31. The cDNA is harvested using an eight-channel pipette that fits alternate collection wells within the C1 IFC. Care should be taken that pipette tips do not touch other collection wells within the C1 IFC.
32. The individual cell layout in the collected 96-well plate should be updated to correspond to cells within the C1 IFC.
33. For each batch of Tn5 transposase generated, the binding of mosaic sequences and tagmentation should be titrated. Typically, a serial dilution of oligonucleotides is tested against the same cDNA concentration with different Tn5 transposase concentrations.
34. The Tn5 concentration and tagmentation efficiency are critical for each Tn5 batch. An imbalance between Tn5, cDNA, and tagmentation time can lead either to over-tagmentation with extremely small fragments or to under-tagmentation with large fragments.
35. The pooled cDNA libraries should be collected in 1.5/2 mL low-bind tubes to help with cleanup and elution from magnetic beads.
36. During bead immobilization and cleanup steps, care should be taken not to remove the tube containing beads from the magnetic rack. The beads on the magnetic rack usually form a tight blob/spot, and the remaining liquid should be taken off from the other side of the tube without removing it from the magnetic rack.
37. Care should be taken that no residual liquid is present on the lid of the 1.5/2 mL tubes. It is advisable to do a short spin before binding the beads to the magnetic rack.
38. Any residual liquid after washing steps should be carefully removed using both P200 and P20 pipettes. Beads can be quickly spun and then placed on the magnetic rack for 1–2 min before removing excess liquid.
39. Before final elution, it is important to make sure the beads are dried to remove traces of 70% ethanol. A good visual check is the appearance of small cracks in the spot/blob of beads, which typically happens between 3 and 10 min. Care should be taken not to overdry the beads, since this affects their resuspension in elution buffer.
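Notes 23 and 30–32 call for annotating each well of the harvest plate with the capture site it came from. A minimal sketch of such a lookup is given below. It hard-codes only rows E–H of the Fig. 5 layout as they appear in this text; rows A–D must be filled in from the full figure. The function names and the imaging-call labels are illustrative assumptions.

```python
# Rows E-H of the Fig. 5 layout: capture-site numbers in plate columns 1-12.
# Rows A-D are not reproduced here and are deliberately left out.
FIG5_ROWS = {
    "E": [25, 26, 27, 75, 74, 73, 28, 29, 30, 78, 77, 76],
    "F": [31, 32, 33, 81, 80, 79, 34, 35, 36, 84, 83, 82],
    "G": [37, 38, 39, 87, 86, 85, 40, 41, 42, 90, 89, 88],
    "H": [43, 44, 45, 93, 92, 91, 46, 47, 48, 96, 95, 94],
}


def site_to_well(rows):
    """Invert the plate layout into {capture_site: well_id}, e.g. 25 -> 'E1'."""
    mapping = {}
    for row_letter, sites in rows.items():
        for column, site in enumerate(sites, start=1):
            if site in mapping:
                raise ValueError(f"capture site {site} appears twice")
            mapping[site] = f"{row_letter}{column}"
    return mapping


def annotate(imaging_calls, rows=FIG5_ROWS):
    """Map imaging calls keyed by capture site onto plate wells.

    Sites that are absent from the supplied rows are skipped.
    """
    wells = site_to_well(rows)
    return {wells[site]: call for site, call in imaging_calls.items() if site in wells}


if __name__ == "__main__":
    # Hypothetical imaging calls for three capture sites
    print(annotate({25: "single", 75: "multiplet", 96: "empty"}))
```

Keeping the layout as data rather than as a formula makes it easy to check the table against the printed figure before it is used to filter multiplets and empty wells.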

References

1. Natarajan KN, Teichmann SA, Kolodziejczyk AA (2017) Single cell transcriptomics of pluripotent stem cells: reprogramming and differentiation. Curr Opin Genet Dev 46:66–76. https://doi.org/10.1016/j.gde.2017.06.003
2. Papalexi E, Satija R (2018) Single-cell RNA sequencing to explore immune cell heterogeneity. Nat Rev Immunol 18(1):35–45. https://doi.org/10.1038/nri.2017.76
3. Regev A, Teichmann SA, Lander ES, Amit I, Benoist C, Birney E, Bodenmiller B, Campbell P, Carninci P, Clatworthy M, Clevers H, Deplancke B, Dunham I, Eberwine J, Eils R, Enard W, Farmer A, Fugger L, Gottgens B, Hacohen N, Haniffa M, Hemberg M, Kim S, Klenerman P, Kriegstein A, Lein E, Linnarsson S, Lundberg E, Lundeberg J, Majumder P, Marioni JC, Merad M, Mhlanga M, Nawijn M, Netea M, Nolan G, Pe’er D, Phillipakis A, Ponting CP, Quake S, Reik W, Rozenblatt-Rosen O, Sanes J, Satija R, Schumacher TN, Shalek A, Shapiro E, Sharma P, Shin JW, Stegle O, Stratton M, Stubbington MJT, Theis FJ, Uhlen M, van Oudenaarden A, Wagner A, Watt F, Weissman J, Wold B, Xavier R, Yosef N, Human Cell Atlas Meeting P (2017) The Human Cell Atlas. Elife 6. https://doi.org/10.7554/eLife.27041
4. Rozenblatt-Rosen O, Stubbington MJT, Regev A, Teichmann SA (2017) The Human Cell Atlas: from vision to reality. Nature 550(7677):451–453. https://doi.org/10.1038/550451a
5. Islam S, Kjallquist U, Moliner A, Zajac P, Fan JB, Lonnerberg P, Linnarsson S (2012) Highly multiplexed and strand-specific single-cell RNA 5′ end sequencing. Nat Protoc 7(5):813–828. https://doi.org/10.1038/nprot.2012.022
6. Pollen AA, Nowakowski TJ, Shuga J, Wang X, Leyrat AA, Lui JH, Li N, Szpankowski L, Fowler B, Chen P, Ramalingam N, Sun G, Thu M, Norris M, Lebofsky R, Toppani D, Kemp DW 2nd, Wong M, Clerkson B, Jones BN, Wu S, Knutsson L, Alvarado B, Wang J, Weaver LS, May AP, Jones RC, Unger MA, Kriegstein AR, West JA (2014) Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex. Nat Biotechnol 32(10):1053–1058. https://doi.org/10.1038/nbt.2967
7. Hochgerner H, Lonnerberg P, Hodge R, Mikes J, Heskol A, Hubschle H, Lin P, Picelli S, La Manno G, Ratz M, Dunne J, Husain S, Lein E, Srinivasan M, Zeisel A, Linnarsson S (2017) STRT-seq-2i: dual-index 5′ single cell and nucleus RNA-seq on an addressable microwell array. Sci Rep 7(1):16327. https://doi.org/10.1038/s41598-017-16546-4
8. Islam S, Kjallquist U, Moliner A, Zajac P, Fan JB, Lonnerberg P, Linnarsson S (2011) Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Res 21(7):1160–1167. https://doi.org/10.1101/gr.110882.110
9. Zeisel A, Munoz-Manchado AB, Codeluppi S, Lonnerberg P, La Manno G, Jureus A, Marques S, Munguba H, He L, Betsholtz C, Rolny C, Castelo-Branco G, Hjerling-Leffler J, Linnarsson S (2015) Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347(6226):1138–1142. https://doi.org/10.1126/science.aaa1934
10. Svensson V, Natarajan KN, Ly LH, Miragaia RJ, Labalette C, Macaulay IC, Cvejic A, Teichmann SA (2017) Power analysis of single-cell RNA-sequencing experiments. Nat Methods 14(4):381–387. https://doi.org/10.1038/nmeth.4220

Chapter 10

Single-Cell RNA-Sequencing of Peripheral Blood Mononuclear Cells with ddSEQ

Shaheen Khan and Kelly A. Kaihara

Abstract

Peripheral blood mononuclear cells (PBMCs) are blood cells that are a critical part of the immune system used to fight off infection. However, due to the complexity of PBMCs, which contain multiple different cell types, studying the function of the individual cell types can be difficult, and often studies rely on bulk measurements. Here, we describe the analysis of PBMCs using single-cell RNA-sequencing in droplets. Data from these studies allow for the identification and quantification of the subpopulation of cells that make up the PBMC sample. In addition, differential gene expression between cell types and samples can be assessed.

Key words Single-cell, RNA-sequencing, ddSEQ, PBMC, CD14 monocytes

1 Introduction

The immune system is complex, and its function involves the interplay of a variety of cells, tissues, and secreted molecules such as cytokines and chemokines. It plays a crucial role in the recognition and removal of foreign or “nonself” material, pathogens, cancer cells, and graft transplantations [1–4]. Failure of the immune system to recognize tissue or cells as “self” can lead to autoimmune diseases, whereas defects in its ability to fight foreign invaders result in immune deficiency and susceptibility to infections. Failure to regulate immune system responses results in a myriad of inflammatory diseases. Harnessing the power of the immune system to fight cancer, as in immune checkpoint therapy, has been successful against a variety of cancer types. To delineate the mechanisms that drive immune function in both health and disease states, researchers often perform gene expression studies on immune cells isolated from peripheral blood, including PBMCs, which include lymphocytes (B- and T-cells), dendritic cells, macrophages, and NK-cells [5].

RNA-Sequencing (RNA-Seq) provides an accurate method with which to measure gene expression of the whole transcriptome without prior knowledge of the genes expressed. However, traditional RNA-Seq is performed on cells processed in bulk, which averages gene expression and conceals the underlying heterogeneity [6]. Single-cell RNA-Seq of PBMCs can provide in-depth assessment of gene expression of individual immune cell types that can reveal novel immune cell populations, identify pathogenic subsets of immune cells and transcriptional modules driving the pathogenesis, and identify biomarkers of efficacy and response to therapy [7]. Furthermore, single-cell RNA-Seq of PBMCs would be a key assay for analyzing longitudinal samples in a large cohort of human patients, since PBMCs are more readily available than tumor or tissue biopsies. PBMCs can be further enriched for rare subsets of immune cells, such as dendritic cells and monocytes, by selection protocols or cell sorting to gain detailed analysis of the transcriptome.

The Illumina® Bio-Rad Single-Cell Sequencing Solution combines Bio-Rad’s innovative Droplet Digital PCR (ddPCR) technology with Illumina next generation sequencing (NGS) library preparation, sequencing, and analysis (Fig. 1). This solution provides a comprehensive, user-friendly workflow for single-cell RNA-Seq that enables controlled experiments with multiple samples, treatment conditions, and time points. Built and supported in collaboration between technology leaders, the Illumina Bio-Rad Single-Cell Sequencing Solution enables transcriptome analysis of hundreds to thousands of single cells across a wide range of cell sizes in a single experiment. The simple push-button analysis for alignment, cell decoding, and library quality control (QC) in the SureCell™ RNA Single-Cell App in the BaseSpace™ Sequence Hub is combined with data reduction and population identification tools using the Seurat package for RStudio Software. Here, we demonstrate the high-quality single-cell RNA-Seq data achieved with the Illumina Bio-Rad Single-Cell Sequencing Solution with fresh and frozen PBMCs from healthy human donors. We also performed single-cell RNA-Seq on positively selected monocytes isolated from the same human donor PBMCs. This technology can be utilized in basic, translational, and clinical studies to gain deeper insight into the role and function of the immune system across normal and diseased states.

Fig. 1 Illumina Bio-Rad single-cell sequencing workflow. PBMCs are isolated and prepared into a single-cell suspension. Cell and barcode mixes are loaded onto the ddSEQ single-cell isolator and droplets are generated. First strand synthesis occurs in the droplets. Droplets are disrupted, and second strand cDNA synthesis is carried out in bulk. Illumina library preparation is carried out with the Nextera transposase followed by PCR and sequencing on the Illumina NextSeq Sequencer using the 500/550 high output 150 cycle kit. FASTQ files are then processed using the Illumina BaseSpace SureCell Single-Cell App. The data are then processed using RStudio Software and the “Seurat” package

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_10, © Springer Science+Business Media, LLC, part of Springer Nature 2019

2 Materials

2.1 Single-Cell Preparation of PBMCs (Fresh/Frozen)

1. BD Vacutainer CPT cell preparation tube with sodium citrate.
2. RPMI-1640 Media.
3. Heat-inactivated fetal bovine serum (FBS)/fetal calf serum (FCS).
4. Freezing media (90% FBS/FCS + 10% DMSO).
5. Standard cell freezing container.
6. 2-Propanol.
7. Cryotube vials, 1 ml.
8. Sterile pipettes (10 and 5 ml).
9. Heated water bath (37 °C).
10. Pipettes and tips.
11. 50 ml polypropylene centrifuge tubes.
12. 15 ml polypropylene centrifuge tubes.
13. PBS (phosphate buffered saline, pH 7.4) 1× without calcium and magnesium.
14. Bovine serum albumin (BSA): 20 mg/ml.
15. 30–40 μm cell strainer.
16. Benchtop centrifuge for 15 and 50 ml conical tubes and microcentrifuge.
17. Centrifuge capable of up to 1800 RCF with swinging bucket.
18. RBC lysis buffer.
19. PBS + 0.1% BSA.
20. Cell counter.
21. Cell counting slides.
22. 0.4% trypan blue dye solution.
23. Vortexer.
24. DNase- and RNase-free 1.5 ml tubes.

2.2 CD14 Positive Monocyte Isolation

1. EasySep Human CD14 Positive Selection Kit (Stemcell Technologies).
2. The Big Easy EasySep Magnet (Stemcell Technologies).
3. PBS + 2% FBS + 1 mM EDTA (e.g., Stemcell Technologies EasySep Buffer).
4. 5 ml (12 × 75 mm) polystyrene round-bottom tube.
5. 30–40 μm cell filters.

2.3 SureCell WTA 3′ Library Prep

1. ddSEQ single-cell isolator (Bio-Rad).
2. SureCell WTA 3′ Library Prep Kit for the ddSEQ System (Illumina).
3. Deep well thermal cycler.
4. Magnetic peg stand (Thermo Fisher).
5. DynaMag 96 side magnet (Thermo Fisher) or the DynaMag 96 side skirted magnet (Thermo Fisher).
6. Microplate centrifuge.
7. Pipettes and tips (Rainin) (see Note 1).
8. 96-well cooling block.
9. Bio-Rad ddPCR plate (Bio-Rad) (see Note 7).
10. PCR tubes 8-tube strip, clear.
11. Optical flat 8-cap strips.
12. 1.5 ml DNase/RNase-free tubes.
13. Multichannel pipette reservoir.
14. Nuclease-free water.
15. 2100 Bioanalyzer (Agilent Technology).
16. 2100 Bioanalyzer high sensitivity DNA kit (Agilent Technology).

2.4 Sequencing on an Illumina Sequencer

1. NextSeq™ 500/550 High Output Kit 150 cycles (Illumina).
2. PhiX Control v3 (Optional) (Illumina).

2.5 Bioinformatic Analysis

1. RStudio software.

3 Methods

This section describes the process for isolation of PBMCs from blood collected using standard venipuncture into blue-top BD Vacutainer CPT Tubes. In addition, it describes the PBMC cell preparation method used for freshly isolated PBMCs and previously frozen PBMCs. The method has been developed and tested with the Illumina Bio-Rad SureCell WTA 3′ Library Prep for PBMC Demonstrated Protocol and the ddSEQ single-cell isolator (see Note 2). Samples were sequenced on an Illumina NextSeq Sequencer and processed through Illumina’s BaseSpace SureCell RNA Single-Cell App. Tertiary analysis was conducted using RStudio Software and the R package “Seurat” developed by the Satija lab [8]. In the first experiment, we performed single-cell RNA-Seq using fresh PBMCs isolated from a healthy donor. CD14 monocytes were isolated from this same donor using a positive selection protocol and compared to the total PBMC population. Principal component analysis (PCA) was used for data analysis. In experiment 2, fresh and frozen PBMCs from the same healthy donor were compared, and data were analyzed using canonical correlation analysis (CCA) [7].

3.1 Single-Cell Preparation of PBMCs (Fresh/Frozen)

3.1.1 PBMC Cell Preparation from BD Vacutainer CPT Tube for Single-Cell RNA-Sequencing

1. Remix the blood sample immediately prior to centrifugation by gently inverting the tube 8–10 times.
2. Centrifuge CPT tube(s) at 1800 RCF (approximately 2800 rpm on a Sorvall RT6000 centrifuge) for 20 min at 20 °C without brake.
3. After centrifugation, bring the CPT tube(s) to a biological safety hood and carefully open the tops. Mononuclear cells and platelets will be in a whitish layer just under the plasma layer. Aspirate approximately half of the plasma without disturbing the cell layer. Collect the cell layer with a Pasteur pipette and transfer to a 15 ml conical centrifuge tube with a cap. Collecting the cells immediately after centrifugation yields the best results.
4. Add PBS to bring the volume to 15 ml. Cap the tube and mix the cells by inverting the tube five times. Centrifuge for 15 min at 120 × g. Aspirate as much supernatant as possible without disturbing the pellet.
5. Resuspend the cell pellet by gently vortexing or tapping the tube with your index finger. Add PBS to bring the volume to 10 ml. Cap the tube and mix the cells by inverting the tube five times. Centrifuge for 10 min at 450 × g. Aspirate as much supernatant as possible without disturbing the pellet. Resuspend the cells in cold 1× PBS with 0.1% BSA if they are to be used immediately for single-cell capture.
6. Filter the cells through the 30 μm strainer. Keep the cells on ice and proceed to counting.
7. To freeze PBMCs for later use, resuspend the pellet in the freezing medium and gently mix the cells. Immediately dispense aliquots into cryovials, place the cryovials into a freezing container (the site's standard cell freezing container or a Nalgene Mr. Frosty), and place the container into a −80 °C freezer. The following day, the cryovials can be moved from the freezing container to standard −80 °C storage (or transferred to liquid nitrogen if available) until further use.
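Step 2 quotes a centrifuge speed both as RCF and as rpm for one particular instrument. On a different rotor, the two are related by the standard formula RCF = 1.118 × 10⁻⁵ × r × rpm², with r the rotor radius in cm. The sketch below applies it; the function names are illustrative, and the 20.5 cm radius in the example is a hypothetical value back-calculated from the quoted 1800 RCF/2800 rpm pair, not a specification of any rotor. Always use the radius given in the rotor manual.

```python
import math

# Standard relation: RCF = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5


def rpm_for_rcf(rcf, rotor_radius_cm):
    """Rotor speed (rpm) needed to reach a given RCF (x g) at the given radius."""
    if rcf <= 0 or rotor_radius_cm <= 0:
        raise ValueError("rcf and rotor_radius_cm must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * rotor_radius_cm))


def rcf_for_rpm(rpm, rotor_radius_cm):
    """RCF (x g) produced by a given rotor speed (rpm) at the given radius."""
    if rpm <= 0 or rotor_radius_cm <= 0:
        raise ValueError("rpm and rotor_radius_cm must be positive")
    return RCF_CONSTANT * rotor_radius_cm * rpm ** 2


if __name__ == "__main__":
    # Hypothetical 20.5 cm rotor radius; gives a speed close to 2800 rpm
    print(round(rpm_for_rcf(1800, 20.5)))
```

The same helper covers the 120 × g and 450 × g washes in steps 4 and 5.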

3.1.2 PBMC Cell Preparation Using Ficoll for Single-Cell RNA-Seq

1. Invert the blood sample 8–10 times and add twice the volume of 1× PBS + 2% FBS.
2. Add Ficoll to the SepMate tube by carefully pipetting it through the central hole of the SepMate insert.
3. Mix the diluted blood sample from step 1 gently. Keeping the SepMate tube vertical, add the diluted sample by pipetting it down the side of the tube.
4. Centrifuge at 1200 × g for 20 min at room temperature, with the brake on.
5. Immediately pour off the top layer that contains the PBMCs into a new 50 ml conical tube.
6. Wash two times with PBS + 2% FBS. If the pellet appears red or pinkish, perform the RBC lysis step.
7. RBC lysis step: Add 5 ml of RBC lysis buffer to the pellet and gently pipet three times with a 5 ml pipette. Let it sit for 5 min. Add 10% RPMI media and spin down for 5 min.
8. Resuspend the pellet in appropriate media based on the downstream application.

3.1.3 Preparation of Frozen PBMCs for Single-Cell RNA-Seq

1. Make sure the water bath is at 37 °C before starting the protocol.
2. Prepare 1× PBS with 0.1% BSA (1 mg/ml) for the final resuspension.
3. Remove a single cryovial of frozen mixed cells and place it in a 37 °C water bath to thaw (should not take more than 1–3 min; do not leave it in the water bath for longer than 3 min). Remove the tube from the water bath as soon as it has thawed.
4. Pipet-mix the cells and transfer the entire volume to a 1.5 ml Eppendorf tube. Add 500 μl of 10% RPMI and centrifuge the cells at 450 × g for 5 min.
5. Carefully remove the supernatant without disturbing the pellet. Add 1 ml of cold 1× PBS with 0.1% BSA to the tube and gently pipet-mix five times to slowly dislodge and resuspend the pellet.
6. Centrifuge the cells at 200 RCF for 3 min. Carefully remove the supernatant without disturbing the pellet. Perform two washes, and add 1 ml of cold 1× PBS with 0.1% BSA and gently pipet-mix 15–18 times until the cells are completely resuspended.
7. Carefully filter the cells by passing them through the 30 μm strainer to get rid of any clumps and keep them on ice.

3.2 CD14 Positive Monocyte Isolation

In this protocol, monocytes were isolated using the CD14 positive selection kit from Stemcell Technologies.

3.2.1 Monocyte Cell Isolation from PBMCs for Single-Cell RNA-Sequencing

1. Perform an RBC lysis step on the PBMCs before proceeding with monocyte isolation.
2. Resuspend the PBMCs in PBS + 2% FBS + 1 mM EDTA. Gently mix and pass the mixture through a 30 μm filter.
3. Follow CD14 positive monocyte isolation as per the manufacturer’s instructions in their entirety.
4. Resuspend positively selected monocytes in ice cold PBS with 0.1% BSA.

3.2.2 Cell Counting of Fresh PBMCs, Frozen PBMCs, and Monocytes Isolated from PBMCs for Single-Cell RNA-Seq

1. Mix the cells very gently three times using a wide-bore pipette tip. Pulse vortex the cells for 1 s for a total of three times (3 s) to mix them and then determine the cell count using the Bio-Rad TC20 Cell Counter by mixing 10 μl of the cells with 10 μl of 0.4% trypan blue. Make note of the total count, viable cell count, and viability (see Note 3).
2. The viability should be >80%. Repeat step 1 for a total of four counts. Take an average of the four viable cell counts and use that to dilute the cells to 3000 cells/μl using cold 1× PBS with 0.1% BSA.
3. Determine the final cell count and viability and make note of it. Take two readings and use the average. The final count should be within ±10% of the target concentration (3000 cells/μl); if not, adjust accordingly. The final viability should be >95%.
4. Keep the cell mixture on ice until use; preferably use within 1–2 h of preparation.
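The dilution in step 2 and the ±10% check in step 3 are simple C1V1 = C2V2 arithmetic. The sketch below is one way to set it out; it assumes the viable counts have already been converted to cells/μl, and the example counts and starting volume are hypothetical.

```python
from statistics import mean

TARGET_CELLS_PER_UL = 3000  # target concentration from step 2
TOLERANCE = 0.10            # final count must be within +/-10% (step 3)


def dilution_plan(viable_counts_per_ul, suspension_volume_ul,
                  target=TARGET_CELLS_PER_UL):
    """Return (mean viable conc., final volume, buffer to add), volumes in ul."""
    stock = mean(viable_counts_per_ul)
    if stock < target:
        raise ValueError("suspension is below target; concentrate the cells instead")
    final_volume = stock * suspension_volume_ul / target
    return stock, final_volume, final_volume - suspension_volume_ul


def within_tolerance(measured_per_ul, target=TARGET_CELLS_PER_UL, tol=TOLERANCE):
    """True if the re-counted concentration is within tol of the target."""
    return abs(measured_per_ul - target) <= tol * target


# Hypothetical example: four viable counts (cells/ul) and 200 ul of suspension.
stock, final_ul, buffer_ul = dilution_plan([5800, 6100, 6000, 6100], 200)
print(f"mean viable count: {stock} cells/ul")
print(f"add {buffer_ul:.1f} ul cold 1x PBS + 0.1% BSA (final volume {final_ul:.1f} ul)")
print("re-count of 3200 cells/ul acceptable:", within_tolerance(3200))
```

If the re-count falls outside the tolerance, the same function can be rerun with the new concentration to compute the adjustment.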

3.3 SureCell WTA 3′ Library Prep

Prepare reagents and follow the protocol outlined in the Illumina Bio-Rad SureCell WTA 3′ library prep for PBMC demonstrated protocol (Illumina 1000000044179 v00) (see Note 2).

3.3.1 Prepare the Cell and Barcode Suspension Mixes

1. Prepare the SureCell Enzyme Mix according to Table 1 and store on ice.
   • Use Rainin pipettes and tips (see Note 1).
   • Thaw and mix the reagents according to the SureCell protocol.
   • Prepare one master mix for all cartridges, leaving out the cells, which are added separately.
   • Two samples or four wells are combined in the PBMC protocol to increase the cDNA yield for downstream Nextera™ NGS library prep.

Table 1 Preparation of SureCell Enzyme Mix

Cell enzyme mix component    Volume for one cartridge (two samples), μl
Cell suspend buffer          60
DTT                          8
RNA stabilizer               6
RT enzyme                    13.2
Enhancer enzyme              12
Total                        99.2

Table 2 Preparation of SureCell Cell Suspension Mix

Cell suspension mix component    Volume for one sample (two chambers), μl    Volume for one cartridge (two samples), μl
Cell enzyme mix                  43                                          86
Cells (>2500 cells/μl)           9                                           18

Table 3 Preparation of Barcode Suspension Mix

Barcode suspension mix component    Volume for one cartridge (two samples), μl
Barcode buffer                      60
3′ Barcode mix                      60

2. Prepare the SureCell Cell Suspension Mix according to Table 2 and store on ice.
   • Prepare separate tubes for the number of different samples being used.
   • To load the same cell sample across multiple wells or multiple cartridges, make a master mix.
   • For accurate loading of cell number, vortex the cells for 1 s, and repeat three times before adding to the Cell Enzyme Mix.
3. Prepare the SureCell Barcode Suspension Mix according to Table 3 and store on ice.
   • Before adding the 3′ Barcode Mix, vortex for 1 s, repeat three times, and immediately add to the Barcode Buffer.
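When several cartridges are run, the per-cartridge volumes in Tables 1 and 3 have to be multiplied out for the master mix. The following is a minimal sketch of that scaling, using the table values as printed. It adds no extra overage, which is an assumption to revisit against the SureCell protocol, and the function and variable names are illustrative only.

```python
# Per-cartridge volumes (ul) taken from Table 1 and Table 3.
ENZYME_MIX_UL = {
    "Cell suspend buffer": 60.0,
    "DTT": 8.0,
    "RNA stabilizer": 6.0,
    "RT enzyme": 13.2,
    "Enhancer enzyme": 12.0,
}
BARCODE_MIX_UL = {
    "Barcode buffer": 60.0,
    "3' Barcode Mix": 60.0,
}


def scale_mix(per_cartridge_ul, n_cartridges):
    """Scale a per-cartridge recipe to n_cartridges (no extra overage added)."""
    if n_cartridges < 1:
        raise ValueError("n_cartridges must be at least 1")
    return {name: round(vol * n_cartridges, 2)
            for name, vol in per_cartridge_ul.items()}


n = 3  # hypothetical number of cartridges
for title, recipe in (("Enzyme Mix", ENZYME_MIX_UL), ("Barcode Suspension Mix", BARCODE_MIX_UL)):
    scaled = scale_mix(recipe, n)
    print(f"{title} for {n} cartridges:")
    for name, vol in scaled.items():
        print(f"  {name}: {vol} ul")
    print(f"  Total: {round(sum(scaled.values()), 2)} ul")
```

For one cartridge the enzyme mix total reproduces the 99.2 μl given in Table 1, which is a quick check that the recipe was entered correctly.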

3.3.2 Isolate Single Cells with Barcodes in the ddSEQ Single-Cell Isolator

1. Place the cartridge in the cartridge holder.
2. Prime the cartridge with Priming Solution for 1 min and no longer than 3 min.
3. Vortex the Barcode Suspension Mix for 1 s, and repeat three times (see Note 4).
4. After vortexing the Barcode Suspension Mix, load 20 μl into the blue ports of the cartridge.
5. Vortex the Cell Suspension Mix for 1 s, and repeat three times (see Note 4). Add 20 μl of Cell Suspension Mix to the red ports in the cartridge, numbered 1–4.
6. Follow the manufacturer’s guidance on proper loading of the cartridge: add the sample first and avoid introducing air bubbles into the chamber well (see Note 5).
7. Add 80 μl Encapsulation Oil into the eight companion oil wells.
8. Place the cartridge holder into the ddSEQ single-cell isolator and press the silver button to begin single-cell isolation. The process will take approximately 5 min. Single-cell isolation is complete when all three indicator lights are solid green.
9. Transfer the droplets from the cartridge output wells to eight wells of a 96-well ddPCR plate using an 8-channel P50 pipettor. Aspirate slowly at a ~70° angle, and dispense slowly into the ddPCR plate, along the sides of the wells (see Notes 6 and 7).
10. Repeat as needed for all cartridges.
11. Seal the wells with an 8-tube strip cap and keep the samples on the 96-well cooling block on ice.

3.3.3 Reverse-Transcribe Samples

1. Load the droplet plate onto a thermal cycler and run the Reverse Transcription program in Table 4.

Table 4 Reverse transcription program

Step    Temperature, °C    Time      # of cycles
1       37                 30 min    1
2       50                 60 min    1
3       85                 5 min     1
4       4                  Hold      1

Choose the preheat lid option and set to 105 °C. Set the reaction volume to 50 μl.

3.3.4 Break Emulsion, Clean Up First Strand Synthesis, and Synthesize Second Strand cDNA

1. After completion of the reverse transcription protocol, break open the droplets with the droplet disruptor reagent.
2. Clean up the first strand using the sample purification beads (SPB), magnetic peg stand, and DynaMag 96 side magnet (see Note 8).
   • Use freshly prepared 80% ethanol.
   • Mix the disrupted droplets with the SPB, ensuring that, after mixing, the top layer is an entirely homogeneous brown aqueous layer (see Note 9).
   • After cleanup, combine the four wells for each sample into a single well (see Note 2).
3. Prepare the Second Strand Synthesis Mix according to Table 5 and add the appropriate volumes to the samples.
4. Load the plate onto a thermal cycler and run the Second Strand Synthesis (SSS) program in Table 6. This is a safe stopping point.

3.3.5 Clean Up cDNA, Tagment cDNA, and Amplify Tagmented cDNA

1. Clean up double-stranded cDNA using the SPB cleanup protocol.
2. Check cDNA on the Agilent Technology 2100 Bioanalyzer using a High Sensitivity DNA chip.
   • DNA yields for PBMC samples are on average 1.85 ng.
3. Prepare the Tagmentation Mix according to Table 7 and add the appropriate volumes to the samples.

Table 5 Preparation of second strand synthesis mix

Second strand synthesis component    Volume for one cartridge (two samples), μl
Second strand buffer (SSB)           18
Second strand enzyme (SSE)           9

Table 6 Second strand synthesis program

Step    Temperature, °C    Time       # of cycles
1       16                 120 min    1
2       4                  Hold       1

Turn off the heated lid function. Set the reaction volume to 80 μl.

Table 7 Preparation of tagmentation mix

Tagmentation mix component    Volume for one cartridge (two samples), μl
Tagment buffer (TCB)          44
Tagment enzyme (TCE)          22

Table 8 Tagmentation program

Step    Temperature, °C    Time     # of cycles
1       55                 5 min    1
2       4                  Hold     1

Choose the preheat lid option and set to 105 °C. Set the reaction volume to 40 μl.

Table 9 Library amplification program

Step    Temperature, °C    Time     # of cycles
1       95                 30 s     1
2       95                 10 s     15
3       60                 45 s     15
4       72                 60 s     15
5       72                 5 min    1
6       4                  Hold     1

Choose the preheat lid option and set to 105 °C. Set the reaction volume to 100 μl.

4. Load the plate onto a thermal cycler that is preheated to 55 °C and run the Tagmentation Program in Table 8.
5. Remove the plate from the thermal cycler as soon as the temperature reaches 4 °C and stop the reaction by adding Tagment Stop Buffer.
6. Incubate at room temperature for 5 min.
7. Amplify the tagmented DNA by adding the Tagmentation PCR Mix, Tagment PCR Adapter, and DNA adapters to tag each sample with a unique i7 index.
8. Load the plate onto a thermal cycler and run the Library Amplification program in Table 9.
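For planning the day, it can help to estimate how long a cycler program will hold the instrument. The sketch below sums the programmed hold times of the Library Amplification program, reading the Table 9 rows as step, temperature, time, and number of cycles (an interpretation of the printed table). Ramp times and the final 4 °C hold are not included, so the real run is longer.

```python
# Library Amplification program as read from Table 9 (4 degC hold omitted).
# Steps 2-4 are repeated together for 15 cycles.
LIBRARY_AMPLIFICATION = [
    {"temp_c": 95, "seconds": 30, "cycles": 1},
    {"temp_c": 95, "seconds": 10, "cycles": 15},
    {"temp_c": 60, "seconds": 45, "cycles": 15},
    {"temp_c": 72, "seconds": 60, "cycles": 15},
    {"temp_c": 72, "seconds": 300, "cycles": 1},
]


def hold_time_minutes(program):
    """Sum of programmed hold times in minutes (ramping not included)."""
    return sum(step["seconds"] * step["cycles"] for step in program) / 60


print(f"Programmed hold time: {hold_time_minutes(LIBRARY_AMPLIFICATION):.2f} min")
```

The same list-of-steps layout can be reused for the reverse transcription, second strand synthesis, and tagmentation programs by entering their table values.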

3.3.6 Clean Up Libraries and Assess Libraries

1. Clean up the libraries using the SPB protocol according to the manufacturer’s protocol.
2. Perform two back-to-back cleanups.
3. Check libraries on the Agilent Technology 2100 Bioanalyzer using a High Sensitivity DNA chip.
   • Library yields are on average 1.5–2.5 nM, but yields as low as 0.5 nM have been observed and successfully sequenced.

3.4 Sequencing on a NextSeq Illumina Sequencer

1. Before starting, thaw the reagent cartridge for 1 h in a water bath. Alternatively, thaw overnight at 4 °C.
2. Leave the flow cell at room temperature for 30 min before starting a run.
3. Prepare libraries for sequencing.
   (a) Make fresh 0.2 N NaOH (tenfold dilution) using stock 2 N NaOH + water. 2 N NaOH is labeled HP3 Illumina reagent (see Note 10).
   (b) Calculate the amount of each library to add to create a total pooled library of 2 nM (or the concentration of the lowest library) in a 10 μl volume. Use the results of the Bioanalyzer to balance the libraries and load equal amounts across samples. Add the required amount of Resuspension Buffer for a final volume of 10 μl of a 2 nM pooled library. Vortex and spin down.
   (c) Add 10 μl of 0.2 N NaOH to the library pool. Vortex and centrifuge at 280 × g for 1 min.
   (d) Incubate for 5 min at room temperature.
   (e) Add 10 μl of 200 mM TRIS, pH 7. Vortex briefly and centrifuge at 280 × g for 1 min.
   (f) Add the calculated amount of HT1 buffer to stop the reaction and dilute the library pool to 20 pM. Vortex and centrifuge at 280 × g for 1 min.
   (g) Make 2.1 ml of Sequencing Primer at 300 nM by diluting with HT1 buffer as shown in Table 10.
   (h) Combine the components listed in Table 11 to obtain a 3.0 pM library with or without a 5% PhiX spike-in. Vortex and spin down.

Table 10 Preparation of sequencing primer

Component            Volume, μl
HT1                  2087.4
Sequencing primer    12.6

Table 11 Preparation of 3.0 pM library with or without PhiX spike-in

Component                  With PhiX (5%), μl    Without PhiX, μl
20 pM denatured library    195                   195
20 pM denatured PhiX       3.25                  N/A
HT1                        1101.75               1105.00

   • Target 3 pM loading. The cluster density target is 165 k/mm². If overclustered, reduce to a 2.3–2.7 pM load.
4. Prepare the reagent cartridge for sequencing.
   (a) Remove the reagent cartridge from the water bath and dry the base using a Kimwipe.
   (b) Gently invert the reagent cartridge five times to mix.
   (c) Wipe clean the foil seal covering reservoir #10 and pierce the seal with a 1 ml pipette tip. Dispense 1300 μl (total volume) of pooled library.
   (d) Load 2 ml of the 300 nM Sequencing Primer into reservoir #7.
5. Set up the run in Illumina’s BaseSpace Sequence Hub.
   (a) Click the Prep tab.
   (b) Select Biological Samples under Manual Prep.
   (c) Create a new sample. Give the sample a name, select the project folder, select the species, and select RNA.
   (d) Select Save & Continue later. Repeat for all samples in the pool.
   (e) Select all samples and click prep library (select SureCell WTA 3′). Give the pooled library a name.
   (f) Drag each sample to the correct N70X index.
   (g) Click pool libraries.
   (h) Select Plan Run.
   (i) Select NextSeq.
   (j) Below are the parameters that will automatically be selected for sequencing of the SureCell libraries.
       • Use Custom Primer R1.
       • Paired End.
       • Read 1 Cycles 68.
       • Read 2 Cycles 75.
       • Single Index.
       • Index 1 Cycles 8.
       • Index 2 Cycles 0.
6. Load the NextSeq sequencer.
   (a) Enter your BaseSpace user name and password.
   (b) Select Next.
   (c) Remove the used flow cell.
   (d) Open a new flow cell and wipe gently with a Kimwipe. Align the flow cell over the alignment pins. Select load.
   (e) Remove the used buffer cartridge and load a new one.
   (f) Empty the liquid waste from the removable reservoir and place it back into the machine.
   (g) Remove the used reagent cartridge and load a new one (with samples and primers loaded).
   (h) Select Next. Select the appropriate BaseSpace run. Wait for the automated checks and then start the run.
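The pooling and dilution arithmetic in step 3 of this subheading (equimolar pooling to 2 nM in 10 μl, dilution of the denatured pool to 20 pM with HT1, and the final loading dilution) can be sketched as follows. The library names and concentrations in the example are hypothetical, and the function names are illustrative; the 10 μl NaOH and 10 μl TRIS additions are taken from steps 3(c) and 3(e).

```python
def equimolar_pool(library_nm, pool_nm=2.0, pool_volume_ul=10.0):
    """Volumes (ul) of each library and of Resuspension Buffer for an equimolar pool."""
    share_nm = pool_nm / len(library_nm)  # final concentration contributed per library
    volumes = {name: share_nm * pool_volume_ul / conc
               for name, conc in library_nm.items()}
    buffer_ul = pool_volume_ul - sum(volumes.values())
    if buffer_ul < 0:
        raise ValueError("libraries too dilute; pool at the lowest library concentration")
    return volumes, buffer_ul


def ht1_for_denatured_pool(target_pm=20.0, pool_nm=2.0, pool_volume_ul=10.0,
                           naoh_ul=10.0, tris_ul=10.0):
    """HT1 volume (ul) that brings the denatured pool to target_pm."""
    total_ul = pool_nm * 1000.0 * pool_volume_ul / target_pm
    return total_ul - (pool_volume_ul + naoh_ul + tris_ul)


def loading_volumes(load_pm, stock_pm=20.0, total_ul=1300.0):
    """Return (denatured library ul, HT1 ul) for a given loading concentration."""
    library_ul = load_pm * total_ul / stock_pm
    return library_ul, total_ul - library_ul


# Hypothetical Bioanalyzer concentrations in nM.
vols, buffer_ul = equimolar_pool({"sample_A": 4.0, "sample_B": 8.0})
print(vols, f"Resuspension Buffer: {buffer_ul} ul")
print(f"HT1 to reach 20 pM: {ht1_for_denatured_pool()} ul")
print("3.0 pM load (library ul, HT1 ul):", loading_volumes(3.0))
print("2.5 pM load (library ul, HT1 ul):", loading_volumes(2.5))
```

At 3.0 pM the last function returns the 195 μl library and 1105 μl HT1 listed in Table 11 for loading without PhiX; lowering the load concentration, as suggested for overclustered runs, only changes the first argument.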

3.5 Bioinformatic Analysis

Once the run is complete and the FASTQ files have been generated, proceed to analysis.

3.5.1 Secondary Analysis Using Illumina’s BaseSpace SureCell Single-Cell App

1. On BaseSpace, select Apps.
2. Select the SureCell RNA Single Cell app.
3. Select the sample(s) for analysis.
4. Select the project (use the project folder created for the run).
5. Press the blue continue button.
6. Rename the analysis (or BaseSpace will name it automatically).

An example of data from a technical replicate of the PBMC experiment is shown in Fig. 2. Sequencing metrics give an indication of the quality of the experiment, including the quality of the starting cellular material and the quality of the molecular biology technique employed throughout the SureCell protocol (see Note 11).

3.5.2 Export Data from BaseSpace for Downstream Analysis in RStudio Software

1. Select Files.
2. For each sample, open the sample folder and download the following three files:
   • S1.cell.summary.csv
   • S1.counts.abundantReadCounts.csv
   • S1.counts.umiCounts.passingKneeFilter.table.zip

3.5.3 Tertiary Analysis Using RStudio Software and the Seurat Tutorial by Bio-Rad

1. Create a “Basespace_data” folder and, within this folder, create folders for each sample. Each sample folder should contain the three files exported from BaseSpace.

Fig. 2 Secondary analysis metrics from Illumina’s BaseSpace SureCell RNA Single-Cell App. (a) An example of a technical replicate from the PBMC experiment. The Illumina BaseSpace SureCell Single-Cell App takes the primary data of the FASTQ files through secondary analysis. This includes deconvoluting the cellular barcodes, determining the barcodes that are associated with cells, and aligning the reads to a reference genome of choice. Sample Information provides general information about the sequencing run for that sample. Cell Information portrays the number of cells passing knee filter for that sample, as well as critical metrics such as the number of UMIs per cell passing filter and the number of genes per cell passing filter. The Alignment Quality Information provides statistics for the alignment of the reads to the reference genome. The Abundant Sequence Information refers to reads that align to mitochondrial, ribosomal, or small nuclear or cytoplasmic RNA

Fig. 2 (continued) (b) Plot of UMIs per cell for each bead barcode in descending order by genic UMI count. The steep drop in genic UMI count determines what is considered a cell passing filter (to the left of the red line) and what is considered cellular debris (to the right of the red line). (c) Knee plot of the cumulated fraction of genic transcripts assigned to cell barcodes. The inflection point (otherwise known as the knee) is used to determine the number of barcoded cells detected and indicates that a high fraction of transcripts are assigned to single cells

2. Rename the files to have the following structure: condition-condition-condition_S1... For example, a condition could be an experimental treatment, an experimental replicate, a different subject, or different days. The conditions that are separated by dashes will be used in the downstream Seurat tutorial by Bio-Rad to visualize the data and calculate differential expression based on these conditions.
3. Launch RStudio Software and create a New Project using the main directory that contains the “Basespace_data” folder.
4. The Seurat tutorial by Bio-Rad is in r-markdown (.rmd) format. This allows the output of the final analysis to be in .html format for easy sharing. There are two tutorials, one for PCA analysis and the other for CCA analysis (see Note 12).
5. Within the r-markdown files, the text can be changed to whatever the user desires. The code chunks should be run sequentially, and one at a time at first, in order to customize the code for the dataset being analyzed. Parameters that need customization include:
   • Define experimental conditions.
   • Select filtering parameters for cells (nGene, nUMI, pct.rRNA, pct.mito, pct.sncRNA, novelty, etc.) (see Note 13).
   • Select variables to regress out (nUMI, pct.mito, cell-cycle phase, etc.).
   • Select the number of PCs or CCs to use.
   • Determine the perplexity of the t-SNE analysis.
   • Select the final resolution and thus the number of subpopulations in the final analysis (see Note 14).
   • Define the experimental conditions for downstream comparison by statistical analysis.
   • Indicate select markers to visualize for specific cell types.
6. Once the parameters have been defined, all R code chunks have been run, and the text has been customized, knit the object to create an .html output (Figs. 3 and 4).
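The file-naming convention in step 2 encodes the experimental conditions in the file name. As an illustration only (the Bio-Rad tutorial does its own parsing in R, which is not reproduced here), the sketch below shows how such names could be checked before the analysis is started. It assumes that conditions are separated by hyphens and that the BaseSpace suffix begins with "_S"; the example file name is hypothetical.

```python
from pathlib import PurePath


def parse_conditions(filename):
    """Split 'cond-cond-cond_S1...' into its list of conditions.

    Assumes hyphen-separated conditions followed by a '_S<number>' suffix.
    """
    name = PurePath(filename).name
    prefix, sep, _suffix = name.rpartition("_S")
    if not sep or not prefix:
        raise ValueError(f"{name!r} does not match condition-condition_S... naming")
    return prefix.split("-")


# Hypothetical renamed export file.
example = "Basespace_data/fresh-donor1-day0/fresh-donor1-day0_S1.cell.summary.csv"
print(parse_conditions(example))
```

Running a check like this over every file in the "Basespace_data" folder is a quick way to catch a misnamed file before it surfaces as a confusing grouping in the downstream plots.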

Fig. 3 Tertiary analysis using the Seurat tutorial by Bio-Rad for PCA analysis. (a) PBMCs from a healthy patient were isolated and run through the Illumina Bio-Rad SureCell single-cell workflow. Data were exported from the BaseSpace App and imported into RStudio Software. The Seurat tutorial by Bio-Rad for PCA was used to cluster the cells, which are displayed on a t-SNE plot. Percentages indicate the percentage of the subpopulation to the total number of cells. (b) Heat map showing the top ten genes for each cluster ranked by log-fold change. Genes were statistically determined by comparing the cells in each cluster to all other cells. The gene markers were used to determine the identity of the subpopulations

Fig. 4 Analysis of PBMCs and isolated CD14 positive monocytes obtained from the same subject. (a) CD14 monocytes were positively selected and run through the Illumina Bio-Rad SureCell single-cell workflow. PBMCs and CD14 isolated monocytes from the same patient were analyzed by the Seurat tutorial by Bio-Rad for PCA. The t-SNE plot shows subpopulations of the PBMCs. The first percentage is the percentage of the cells from that subpopulation to the total number of cells in the total PBMC experiment while the second percentage is for the isolated monocyte experiment. Of note, the monocyte isolation led to 90% purity of monocytes. (b) A t-SNE plot showing cells from the total PBMC sample and the isolated CD14 monocyte sample. (c) Differential gene expression between the CD14 monocyte sample and the total PBMC sample. Only upregulated genes, expressed in at least 25% of cells and with a log-fold change of >0.25, are shown

Fig. 5 Tertiary analysis using the Seurat tutorial by Bio-Rad for CCA analysis of fresh versus frozen PBMCs. (a) Analysis of PBMCs comparing a sample from freshly isolated PBMCs to previously frozen PBMCs from the same subject. Data were analyzed by the Seurat tutorial by Bio-Rad for CCA, which aligns the scRNA-seq datasets from two conditions and allows for better clustering and comparison versus PCA. The first percentage is from the sample of fresh PBMCs and the second percentage is from the sample of previously frozen PBMCs. There is no statistical difference in gene expression between the samples or within each subpopulation between the fresh and frozen samples (data not shown). (b) The t-SNE plot is labeled by sample type, fresh or frozen. Each cluster has a similar number of cells in the fresh and the frozen sample

4 Notes

1. Avoid low-quality pipette tips that may shed plastic particulates into the SureCell solutions and/or the ddSEQ Cartridge. This can result in shredded droplets.
2. In addition to the standard SureCell WTA 3′ Library Prep Reference Guide (Illumina 1000000021452 v01), the SureCell WTA 3′ PBMC Demonstrated Protocol (Illumina 1000000044179 v00), which is used here, and the SureCell WTA 3′ Nuclei Demonstrated Protocol (Illumina 1000000044178 v00) are also available. Due to the low RNA content of PBMCs, two samples are combined in the PBMC Demonstrated Protocol, and the volumes for SPB cleanups and elutions are appropriately adjusted.
3. To load cells at the proper concentration of 3000 cells/μl, accurately count the cells by vortexing the cells for 1 s, three times, and then load the cell counter. Use the Live Cell Count as opposed to the Total Cell Count to make the final 3000 cells/μl dilution.
4. Just before loading the ddSEQ Cartridge, vortex the Cell and Barcode Suspension Mixes. This ensures even distribution of the cells and barcodes into the droplets.
5. To avoid getting air bubbles in the microfluidics of the cartridge, follow these tips. Use P-20 pipette tips. Gently slide the pipette tip down the side of the well until it reaches the bottom. The tip should be ~15° from the vertical. Holding the tip in this position, gently begin dispensing the sample. After dispensing half of the sample, slowly begin drawing the pipette tip up the side of the well. Expel the last of the sample. Do not push the pipette plunger past the first stop. This technique will ensure the sample wets the bottom of the well and wicks into the microchannel for optimal droplet generation.
6. To transfer the droplets to a 96-well ddPCR plate, follow these tips. Use a P-50 Pipetman with a P-200 tip and gently press the pipette tip at a ~20° angle from the vertical into the junction where the side wall meets the bottom of the well. Slowly draw 40 μl into the pipette tip (should take ~5 s and air is expected at the bottom of the tip). To dispense the droplets into the 96-well ddPCR plate, position the pipette tip along the side of the well, near, but not at, the bottom of the well, and slowly dispense the droplets (should take ~5 s).
7. Use 96-well ddPCR plates that have been specially formulated to support stable droplets. Do not use 96-well plates by other manufacturers for any steps involving droplets in wells.
8. Critical steps in the SureCell protocol include the proper execution of the SPB cleanups, which bind DNA and determine the DNA yield for downstream steps. Helpful tips include slowly aspirating the SPB into the pipette tip and completely expelling the SPB into the sample by dispensing the SPB from the pipette tip, waiting, and then expelling again. This ensures that the correct quantity of SPB is used for binding of the correctly sized DNA. Mix vigorously to ensure that the SPB binds the DNA in the sample. When air drying the pellet, ensure that all the ethanol has evaporated from the wells. When eluting into Resuspension Buffer, ensure that the pellet is not overdried. Adhere to all of the incubation times for maximal binding of DNA to the SPB. For additional tips, refer to: https://support.illumina.com/sequencing/sequencing_kits/nextera_dna_kit/best_practices.html
9. The first SPB cleanup occurs after first-strand synthesis and is carried out on the aqueous phase above the oil phase. Care should be taken not to go into the bottom oil layer when mixing the SPB with the top aqueous phase. Ensure that the aqueous phase is well mixed until the entire layer is homogeneously brown. This step is critical in capturing the single-stranded cDNA produced in the cells.
10. Prepare a fresh 0.2 N NaOH solution before preparing the libraries for sequencing.
11. Additional metrics that can be useful to compute from the data output from the BaseSpace SureCell Single-Cell App include “Number of Reads per Cell” and “Read Utilization.” To compute the “Number of Reads per Cell,” divide the “Total Reads” by the “Cells Passing Knee Filter.” To compute the “Read Utilization,” divide the “Reads Aligned to Unique Genes” by the “Total Reads.”
12. There are two main analyses described here, PCA and CCA. We use PCA to compare total PBMCs to isolated monocytes from the same subject. When comparing multiple different subjects or conditions, for instance, fresh PBMCs versus frozen PBMCs, we perform CCA. CCA identifies shared correlation structures and aligns these dimensions to allow for clustering based on conserved markers across subjects or conditions, rather than clustering due to differences between subjects or conditions (Fig. 5).
13. In the Seurat tutorial by Bio-Rad, there are parameters that the user must determine and input. When deciding what factors to regress out, select those that make sense for the experiment. For instance, it may not make sense to regress out the cell cycle stage if this parameter determines downstream gene expression that is of interest. Clustering is typically better when fewer variables are regressed out.
14. Selecting the resolution and ultimately the number of subpopulations is an important step in the analysis. It is helpful to assess the heat maps to determine if the subpopulations have distinct expression markers that distinguish them from other subpopulations. Alternatively, subpopulations can be marked by expression of known cellular markers.
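The two derived metrics described in Note 11 are simple ratios of values reported by the BaseSpace SureCell Single-Cell App. A minimal sketch follows; the input numbers are hypothetical and only illustrate the calculation.

```python
def reads_per_cell(total_reads, cells_passing_knee_filter):
    """'Number of Reads per Cell' = Total Reads / Cells Passing Knee Filter."""
    if cells_passing_knee_filter <= 0:
        raise ValueError("cells_passing_knee_filter must be positive")
    return total_reads / cells_passing_knee_filter


def read_utilization(reads_aligned_to_unique_genes, total_reads):
    """'Read Utilization' = Reads Aligned to Unique Genes / Total Reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads_aligned_to_unique_genes / total_reads


# Hypothetical values copied by hand from an app report.
total = 100_000_000
cells = 1_000
aligned_unique = 40_000_000

print(f"Reads per cell: {reads_per_cell(total, cells):,.0f}")
print(f"Read utilization: {read_utilization(aligned_unique, total):.1%}")
```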

Acknowledgments

This work was supported by the University of Texas Southwestern Medical Center Genomics Core. We would like to thank Gunjan Choudhary for her careful reading and editing of this book chapter. Bio-Rad, Droplet Digital PCR, and ddPCR are trademarks of Bio-Rad Laboratories, Inc. in certain jurisdictions. All trademarks used herein are the property of their respective owners.

References

1. Abbas AK, Leichtman AH (2009) Basic immunology: functions and disorders of the immune system, 3rd edn. Saunders/Elsevier, Philadelphia, PA
2. Neller MA, Burrows JM, Rist MJ, Miles JJ, Burrows SR (2013) High frequency of herpesvirus-specific clonotypes in the human T-cell repertoire can remain stable over decades with minimal turnover. J Virol 87(1):697–700
3. Haen SP, Rammensee HG (2013) The repertoire of human tumor-associated epitopes: identification and selection of antigens and their application in clinical trials. Curr Opin Immunol 25(2):277–283
4. Meyer EH, Hsu AR, Liliental J et al (2013) A distinct evolution of the T-cell repertoire categorizes treatment refractory gastrointestinal acute graft-versus host. Blood 121(24):4955–4962
5. American Autoimmune Related Diseases Association (2011) The cost burden of autoimmune disease: the latest front in the war on healthcare spending. AARDA, Eastpointe, MI. www.diabetesed.net/page/_files/autoimmune-diseases.pdf. Accessed 22 Sept 2017
6. Kolodziejczyk AA, Kim JK, Svensson V, Marioni JC, Teichmann SA (2015) The technology and biology of single-cell RNA sequencing. Mol Cell 58(4):610–620
7. Orit-Rozenblatt-Rosen MJT, Regev A, Teichmann SA (2017) Single-cell transcriptomics to explore the immune system in health and disease. Science 358(6359):58–63
8. Butler A, Hoffman P, Smibert P, Papalexi E, Satija R (2019) Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat Biotechnol 36:411–420

Chapter 11

High-Throughput Single-Cell Real-Time Quantitative PCR Analysis

Liora Haim-Vilmovsky

Abstract

Examining transcriptomics of populations at the single-cell level allows for higher resolution when studying functionality in development, differentiation, and physiology. Real-time quantitative PCR (qPCR) enables a sensitive detection of specific gene expression; however, processing a large number of samples for single-cell research involves a time-consuming process and high reagent costs. Here we describe a protocol for single-cell qPCR using nanofluidic chips. This method reduces the number of handling steps and volumes per reaction, allowing for more samples and genes to be measured.

Key words Quantitative real-time PCR, Single cells, Targeted assays, Gene expression, Transcripts

1 Introduction

Differences in gene expression determine the development, differentiation, and physiology of an organism. The most common methods to explore gene expression average the expression profiles over large numbers of cells. This obscures population heterogeneity and whether there are one or more distinct cell types within a population [1–3]. Quantitative PCR (qPCR), also known as real-time PCR, is one method to determine transcript or DNA region levels in a sample. Gene-specific assays are used to amplify the corresponding gene by PCR, with an additional fluorescent probe for monitoring the amplification. The major difficulty when applying qPCR to single-cell studies is the enormous number of reactions required, and the associated reagent cost, because of the larger number of samples. The nanofluidic system can automatically combine up to 96 assays (measured genes) with 96 samples and detect the amplification of all the combined reactions, which allows a higher throughput of cells. The system uses nanofluidic chips, which results in much lower reagent costs.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_11, © Springer Science+Business Media, LLC, part of Springer Nature 2019

Further advantages of this method include reproducibility, sensitivity, and specificity [4–6]; however, it is still limited to fewer than a hundred assays/samples per chip and shows lower accuracy when measuring lowly expressed transcripts [7]. Other methods are now available to profile the general transcriptome of a large number of cells at lower cost. However, this technique can be used to validate such findings, and it is useful when the results and their analysis are needed urgently.

2 Materials

Work in a laminar flow hood that is free of RNase, DNase, DNA, and PCR products.

2.1 FACS

1. 96-Well PCR plates (see Note 1).
2. RNase inhibitor. Store at −20 °C.
3. CellDirect 2× reaction Mix (Invitrogen). Store at −20 °C.
4. Lysis buffer: CellDirect 2× reaction Mix, 2% RNase inhibitor. Store at −20 °C.
5. FACS instrument.
6. Adhesive plate seals.
7. Dry ice.

2.2 Reverse Transcription and Specific Target Amplification

1. 20× TaqMan Gene Expression Assays (see Notes 2 and 3). Store at −20 °C.
2. DNA suspension Buffer: 10 mM Tris, pH 8.0, 0.1 mM EDTA. Store at 4 °C.
3. 0.2× Assay mix: each assay at a final concentration of 0.2× in DNA suspension Buffer (see Note 4). Store at −20 °C.
4. SuperScript III RT/Platinum Taq Mix. Store at −20 °C.
5. PCR certified water. Store at room temperature.
6. 96-Well thermal cycler.

2.3 Real-Time PCR

1. 2× Assay Loading Reagent (Fluidigm) (see Note 5). Store at −20 °C.
2. Quanta PerfeCTa qPCR Fast Mix, low ROX. Store at −20 °C.
3. 20× GE Sample Loading Reagent (Fluidigm) (see Note 5). Store at −20 °C.
4. 96.96 dynamic array (Chip).
5. Integrated Fluidic Circuit (IFC) Controller HX.
6. Biomark HD System.

3 Methods

This protocol is for a 96.96 chip (containing 96 single cells and 96 assays per plate). Vortex and spin down all reagents and reaction mixes before use.

3.1 Samples and Assays Preparation

1. Add 5.1 μl of the 2× Lysis buffer to each well of a 96-well PCR plate and seal the plate with film (see Notes 6 and 7).
2. Sort single cells directly into each well, seal the plate, vortex it for 10 s, and spin down at 450,180 × g for 1 min (see Notes 8–12).
3. Immediately place the plate on dry ice and store at −80 °C.
4. Prepare the reverse transcription and specific target amplification reaction mix by combining 300 μl 0.2× Assay mix, 24 μl SuperScript III RT/Platinum Taq Mix, and 144 μl PCR certified water in a 1.5 ml sterile tube. Vortex for 10 s and spin down at 450,180 × g for 1 min.
5. Add 3.9 μl of the mix to each well. Seal the plate. Vortex for 10 s and spin down at 450,180 × g for 1 min (see Notes 13–15).
6. Place the plate onto a thermal cycler and use the following program: 50 °C, 15 min; 95 °C, 2 min; 22 × (95 °C, 15 s; 60 °C, 4 min) (see Note 16).
7. Dilute the cDNA 1:5 with DNA Suspension Buffer (see Notes 17 and 18).
8. Prepare the 10× Assay dilution by mixing 3 μl of each 20× TaqMan Gene Expression Assay with an equal volume of 2× Assay Loading Reagent to make the "assay running" plate (see Note 19).
9. Prepare a Sample Mix by combining 360 μl Master Mix and 20× GE Sample Loading Reagent.
10. Aliquot 3.3 μl of the Sample Mix to each well of a new 96-well plate (see Notes 13 and 14).
11. Transfer 2.7 μl from each well of the diluted cDNA to the plate with the aliquoted Sample Mix to make the "sample running" plate (see Note 20).
12. Seal the plate, vortex for 10 s, and spin down at 450,450 × g for 1 min.
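The batch volumes in step 4 add up to 468 μl, which is 120 aliquots of 3.9 μl, so the recipe covers one 96-well plate with some excess. The Python sketch below only restates that arithmetic and rescales the recipe for a different number of wells. The function names and the 25% overage default are illustrative assumptions, not part of the published protocol.

```python
# Batch volumes from step 4 of Subheading 3.1, in microliters.
BATCH_UL = {
    "0.2x Assay mix": 300.0,
    "SuperScript III RT/Platinum Taq Mix": 24.0,
    "PCR certified water": 144.0,
}
PER_WELL_UL = 3.9  # volume dispensed per well in step 5


def reactions_in_batch(batch=BATCH_UL, per_well=PER_WELL_UL):
    """Return how many per-well aliquots the batch recipe contains."""
    return sum(batch.values()) / per_well


def scale_mix(n_wells, overage=0.25, batch=BATCH_UL, per_well=PER_WELL_UL):
    """Scale the batch recipe to n_wells plus a fractional overage.

    The overage value is an assumption made for this sketch.
    """
    factor = n_wells * (1.0 + overage) / reactions_in_batch(batch, per_well)
    return {name: round(vol * factor, 2) for name, vol in batch.items()}


if __name__ == "__main__":
    print("aliquots per batch:", round(reactions_in_batch(), 1))
    for name, vol in scale_mix(48).items():
        print(f"{name}: {vol} ul")
```

With 5.1 μl of lysis buffer already in each well, adding 3.9 μl of this mix gives a 9.0 μl reaction per well.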

3.2 On-Chip Real Time PCR

1. Open the chip packaging and inject control line fluid into each accumulator on the chip (see Notes 21 and 22).
2. Place the chip into the IFC controller and run the Prime (136×) script.
3. Vortex both the assay and the sample running plates for 10 s and spin down at 450,180 × g for 1 min.
4. Once the Prime script has finished, transfer 5 μl of each assay from the assay running plate to the assay inlets on the chip and 5 μl of each sample from the sample running plate to the sample inlets on the chip (see Notes 23 and 24).
5. Put the chip back into the IFC controller and run the Load mix (136×) script (see Note 25).
6. Once the Load script is done, remove the chip from the IFC controller.
7. Clean the IFC surface (see Note 26).
8. Load the chip into the Biomark and choose: Gene expression as the application type, ROX as the passive reference, Single probe as the assay, and FAM-MGB as the probe type. Run the protocol "GE 96 × 96 Standard v1.pcl". Confirm Auto Exposure (see Note 27).

3.3 Analysis

1. Within the analysis software, open the chip run file (.bml). Choose the appropriate container type and container format for both the samples and the assays, and add names.
2. For reactions whose quality value is Fail, inspect the amplification curve. If the curve looks exponential, change the value to Pass.
3. Export the data to a .csv file.
4. Choose your reference genes by testing the correlation between the housekeeping genes.
5. Calculate the delta value for each gene in each sample according to the chosen references.
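Step 5 can be illustrated with a short Python sketch. It assumes that the "delta value" is the usual ΔCt, that is, the Ct of a gene minus the mean Ct of the chosen reference genes in the same cell. The gene names and Ct values are hypothetical, and the sketch does not parse the exported .csv file, whose layout is not described here.

```python
from statistics import mean


def delta_ct(ct_by_gene, reference_genes):
    """Return Ct(gene) - mean Ct(reference genes) for one sample.

    ct_by_gene maps gene name to Ct; a value of None marks a reaction
    with no usable Ct and is skipped.
    """
    ref_values = [
        ct_by_gene[g] for g in reference_genes if ct_by_gene.get(g) is not None
    ]
    if not ref_values:
        raise ValueError("no reference gene was detected in this sample")
    ref = mean(ref_values)
    return {
        gene: ct - ref
        for gene, ct in ct_by_gene.items()
        if ct is not None and gene not in reference_genes
    }


if __name__ == "__main__":
    # Hypothetical Ct values for a single cell.
    cell = {"RefA": 20.0, "RefB": 22.0, "GeneX": 25.0, "GeneY": None}
    print(delta_ct(cell, ["RefA", "RefB"]))
```

Averaging several reference genes, rather than relying on one, reflects the advice in Note 2 to include at least five housekeeping genes; other normalization schemes are equally possible.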

4 Notes

1. Make sure the plates are compatible with the FACS instrument and the thermal cycler.
2. Choose 96 assays corresponding to the 96 genes of interest. Choose at least five housekeeping genes for normalization (not highly expressed, if possible), two genes that should not be expressed as negative controls, and a few genes as positive controls (the cell type markers). It is best to choose assays that have been used before by others.
3. Validate assay functionality by running the assay the first time with the melting curve option in the software. Replace assays if needed.
4. Pipet 4 μl of each 20× assay and add 16 μl DNA Suspension Buffer to get a volume of 400 μl (96 × 4 μl + 16 μl). 300 μl are needed for the reaction. Be very careful not to contaminate the assays; switch tips between wells. It is possible to store the mix at −20 °C.
5. Leave the reagent at room temperature for 20 min before use.
6. Steps 1–5, 7, 9, and 10 need to be done in a laminar flow hood that is free of RNase, DNase, DNA, and PCR products.
7. Keep the plate on ice if used immediately, or at −20 °C if used within a few days.
8. Calibrate the FACS instrument before sorting.
9. Sort fluorescent beads to a flat 96-well plate and check with a fluorescence microscope that the beads are located in the middle of the well. Dry spots can also be detected by eye against a source of light.
10. If possible, sort different samples in the same plate to reduce batch effects. If more cells are needed, sort into additional plates.
11. Leave three wells empty, spread randomly over the plate. These will be negative controls to verify the absence of contamination.
12. Keep the plate on ice or on a cool plate holder while sorting, if possible.
13. To reduce technical variation, divide the reaction mix equally among the tubes of an 8- or 12-tube PCR strip, and transfer the specific volume using a multichannel pipette. Make sure the liquid level in each tip is the same.
14. Add the drop to the side of the well.
15. Seal the plate properly using a plate sealer or a roller to avoid evaporation, especially at the edges.
16. The number of cycles can vary between 18 and 22, depending on the cell size. We find the best results when using 22 cycles, for many different cell size ranges.
17. This step needs to be done outside the DNA-free hood.
18. cDNA can be stored at −20 °C.
19. For one plate, 6 μl of each diluted assay is needed. A stock plate can be prepared (50–100 μl) and kept at −20 °C. Using a multichannel pipette, aliquots can be easily transferred to a new plate when required.
20. Use a multichannel pipette, put the drop on the side of the well (opposite side of the Sample Mix), and change tips between samples.
21. Press the accumulator once with the syringe and then inject the fluid when the syringe is tilted.

Fig. 1 Scheme for transferring of samples/assays from the 96-well running plate into the 96.96 chip inlet

22. Use the chip within 24 h of opening the package. Load the chip as soon as possible after priming, not later than an hour.
23. While adding the volume to the chip, hold the pipette at an angle and do not pass the first stop on the pipette. This is done to prevent bubbles. If you do find bubbles, use a thin needle to pop them. If the bubble is on the surface and not on the bottom of the well, it can be left in the well.
24. Use a multichannel pipette with eight tips. Move column 1 of the running plate to the first column of the chip, starting at the top (so that A1 from the running plate is on number 1 in the chip inlet), then column 2 of the running plate to the second column of the chip, starting at the top, and so on until column 6. Then add columns 7–12 of the running plate to columns 1–6 of the chip, but start at the second well of each column (A7 from the running plate goes to number 7 in the chip inlet) (Fig. 1). When analyzing the plate in the software, remember to choose SBS96 as the container format.
25. Start the loading within 1 h of adding the samples and assays onto the chip.
26. Use adhesive tape (Sellotape) to remove particles or dust.
27. For the first run with a specific set of assays, run the script with an additional melting curve phase. This allows the quality of each assay to be assessed within the pooled assay environment. Any assay that does not have a normal melting curve should be replaced.
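The transfer pattern in Note 24 can be written as a small mapping function. The Python sketch below is an illustration under two stated assumptions: the 96 inlets on each side of the chip form 6 columns of 16 positions, and they are numbered row by row, so that plate well A1 lands on inlet 1 and the first well of plate column 7 lands on inlet 7. Check the result against Fig. 1 and the chip documentation before relying on it.

```python
ROWS = "ABCDEFGH"


def chip_inlet(well):
    """Map a 96-well plate position such as 'A7' to a chip inlet.

    Returns (chip_column, chip_row, inlet_number), all 1-based.
    The inlet numbering (row by row across six columns) is an assumption.
    """
    row = ROWS.index(well[0].upper())      # 0..7, raises ValueError if invalid
    col = int(well[1:])                    # 1..12
    if not 1 <= col <= 12:
        raise ValueError(f"invalid well: {well}")
    chip_col = (col - 1) % 6 + 1           # plate columns 7-12 reuse chip columns 1-6
    offset = 1 if col > 6 else 0           # second pass starts one inlet lower
    chip_row = 2 * row + offset + 1        # 8-channel tips land on every other inlet
    inlet = (chip_row - 1) * 6 + chip_col  # assumed row-by-row numbering
    return chip_col, chip_row, inlet


if __name__ == "__main__":
    for well in ("A1", "B1", "A7", "H12"):
        print(well, chip_inlet(well))
```

Under these assumptions every plate well maps to a distinct inlet, which is a quick way to confirm that the two pipetting passes interleave without collisions.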

Acknowledgments

This work was supported by EMBO (award number ALTF 698-2012), Directorate-General for Research and Innovation (FP7-PEOPLE-2010-IEF, ThPLAST 274192) and an EMBL Interdisciplinary Postdoctoral fellowship, supported by H2020 Marie Skłodowska-Curie Actions.

References

1. Shen-Orr SS, Tibshirani R, Khatri P et al (2010) Cell type-specific gene expression differences in complex tissues. Nat Methods 7:287–289
2. Hebenstreit D, Fang M, Gu M et al (2011) RNA sequencing reveals two major classes of gene expression levels in metazoan cells. Mol Syst Biol 7:497
3. Hebenstreit D, Teichmann SA (2011) Analysis and simulation of gene expression profiles in pure and mixed cell populations. Phys Biol 8:035013
4. Spurgeon SL, Jones RC, Ramakrishnan R (2008) High throughput gene expression measurement with real time PCR in a microfluidic dynamic array. PLoS One 3:e1662
5. Jang JS, Simon VA, Feddersen RM et al (2011) Quantitative miRNA expression analysis using Fluidigm microfluidics dynamic arrays. BMC Genomics 12:144
6. Devonshire AS, Elaswarapu R, Foy CA (2011) Applicability of RNA standards for evaluating RT-qPCR assays and platforms. BMC Genomics 12:118
7. Bengtsson M, Hemberg M, Rorsman P, Ståhlberg A (2008) Quantification of mRNA in single cells and modelling of RT-qPCR induced noise. BMC Mol Biol 9:63

Chapter 12

Single-Cell Dosing and mRNA Sequencing of Suspension and Adherent Cells Using the Polaris™ System

Chad D. Sanada and Aik T. Ooi

Abstract

Single-cell functional analysis provides a natural next step in the now widely adopted single-cell mRNA sequencing studies. Functional studies can be designed to study cellular context by using single-cell culture, perturbation, manipulation, or treatment. Here we present a method for a functional study of 48 single cells by single-cell isolation, dosing, and mRNA sequencing with an integrated fluidic circuit (IFC) on the Fluidigm® Polaris™ system. The major procedures required to execute this protocol are (1) cell preparation and staining; (2) priming, single-cell selection, cell dosing, cell staining, and cDNA generation on the Polaris IFC; and (3) preparation and sequencing of single-cell mRNA-seq libraries. The cell preparation and staining steps employ the use of a universal tracking dye to trace all cells that enter the IFC, while additional fluorescence dyes chosen by the user can be used to differentiate cell types in the overall mix. The steps on the Polaris IFC follow standard protocols, which are also described in the Fluidigm user documentation. The library preparation step adds Illumina® Nextera® XT indexes to the cDNA generated on the Polaris IFC. The resulting sequencing libraries can be sequenced on any Illumina sequencing platform.

Key words Single-cell perturbation, Differentiation, Adherent culture, Extracellular matrix (ECM), Time-lapse imaging, Single-cell functional study, Phenotype–genotype correlation, Dose response, Time course, Microfluidics

1 Introduction

Understanding cell function using gene expression information alone is often not sufficient, due to the loss of cellular context. To make better links between gene expression and cellular context, researchers often subject the cells to different treatment groups, then harvest the cells for gene expression analysis. When those studies are done in bulk, there is a lack of resolution in terms of how many cells are actually driving the changes seen in gene expression. Indeed, this is an area of study where single-cell techniques can be powerful. However, even if the cells from bulk treatments are subsequently singulated for downstream analysis, the effects on gene expression due to cell–cell interactions and changes in microenvironment caused by other cells cannot be deconvoluted from the resulting data [1]. Thus, it is critical to have methods where treatments or stimuli can be applied on isolated single cells followed by single-cell cDNA generation. A handful of existing approaches achieve this to some extent [2–6] but fail to integrate the process into a single device. In addition, most of those approaches suffer from being labor-intensive or are not easily reproducible due to the requirement for in-house fabrication of custom devices and systems. The Fluidigm Polaris system overcomes these challenges by providing an off-the-shelf solution to automate and reproduce these multistep single-cell experiments [7].

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_12, © Springer Science+Business Media, LLC, part of Springer Nature 2019

The Polaris system enables users to perform experiments on an IFC that integrates single-cell isolation (one or two populations), dosing with stimuli, time-lapse imaging to track cell response to stimuli, and subsequent single-cell mRNA sequencing. The three major workflows offered on the Polaris system include No Treatment (for controls), Dose and Feed (simultaneous dosing and feeding of cells), and Time Course (staggered dosing for different treatment durations). Cells can be cultured under these conditions for up to 24 h, during which fluorescence imaging occurs automatically at least once per hour for all cell culture chambers. After dosing is complete, users can perform an optional cell staining step to track cell status, such as activation or viability. The IFC can then be imaged off instrument using a fluorescence microscope before proceeding to single-cell cDNA generation on Polaris.

The following protocol describes a typical Dose and Feed experiment on the system, using the Polaris Single-Cell Dosing IFC. The users will define which dosing and feeding reagents they wish to use for the experiment. Typical examples of dosing reagents could include but are not limited to: antibodies for stimulating immune cells, water-soluble drugs, growth factors and cytokines, and viruses or vectors aimed at altering gene expression [8]. At the end of the Dose and Feed experiment, the users will generate single-cell cDNA on the IFC and harvest all samples from individual outlets, which can then be indexed as desired for downstream sequencing.

2 Materials

2.1 Polaris Workflow

1. Fluidigm Polaris system (Fluidigm).
2. Polaris Single-Cell Dosing mRNA Seq IFC (Fluidigm).
3. Polaris Single-Cell mRNA Seq Reagent Kit (Fluidigm).
4. Polaris Sponge Pack, 5 Pack (Fluidigm).
5. (Optional) Single-Cell IFC Barrier Tape (Fluidigm).
6. SMARTer® Ultra™ Low RNA Kit for the Fluidigm C1™ System, 10 IFCs; includes Boxes 1 of 2 and 2 of 2 and Advantage® 2 PCR Kit (Takara Bio USA).
7. Universal cell labeling fluorescence dye for cell tracing and imaging purposes. For example: CellTracker™ Orange CMRA Dye (Thermo Fisher Scientific), CellTracker Green CMFDA Dye (Thermo Fisher Scientific), or CellTracker Deep Red Dye (Thermo Fisher Scientific).
8. (Optional) Zombie Yellow™ Fixable Viability Kit (BioLegend).
9. INCYTO C-Chip™ Disposable Hemocytometer (Neubauer Improved) (INCYTO).

2.2 Library Preparation

1. Nextera XT DNA Sample Preparation Kit (Illumina).
2. Nextera XT Index Kit (Illumina; select an appropriate kit for the desired indices).
3. Quant-iT™ PicoGreen® dsDNA Assay Kit (Thermo Fisher Scientific).
4. Agencourt® AMPure® XP (Beckman Coulter).
5. Magnetic stand for PCR tubes.
6. 2100 Bioanalyzer® (Agilent Technologies).
7. High Sensitivity DNA Kit (Agilent Technologies).

3 Methods

3.1 Configure the Experimental Profile on Polaris (see Note 1)

1. Select one of the three available experiments: Dose and Feed, Time Course, and No Treatment (see Note 2).
2. For a dose response experiment, select Dose and Feed. Select this option if you desire to dose all 48 sites individually, or if you want to feed all 48 sites from a common reservoir, or some combination of the two.
3. Name and add a description of the Dose and Feed experiment, if desired.
4. Set up the Prime step for either suspension or adherent cells by tapping the default Suspension Cell Priming box. Toggle to choose between Suspension Cell Priming or Adherent Cell Priming at the pop-up screen.
5. Set up the Cell Selection step by selecting the Sample box. Editable parameters in the pop-up screen are Selection Pressure (Standard or Low), Cell Input (Standard Cell Input or Low Cell Input), Number of Cell Populations (1 Population or 2 Populations), sample Name, and Input Volume (1–5 inlets of 25 μL/inlet).
6. Select the cell population 1 box to edit the selection settings based on fluorescence staining. Choose the appropriate fluorescence Channels, set Lower Bound and Upper Bound of fluorescence intensity, and select Exposure time. Edit the channel name under Alias if desired. The parameters for Lower Bound, Upper Bound, and Exposure can be adjusted later.
7. Repeat for Cell Population 2 if two populations of cells are to be selected in the experiment.
8. Tap the Dose and Feed area for a pop-up screen to configure the dosing experimental setup. Available duration parameters are Exchange Interval, Pre Dose Duration, Dose Duration, and Post Dose Duration. Use the (+) or (−) sign to adjust the duration for these parameters.
9. Rename the eight dosing groups for the experiment by tapping Condition A through H.
10. Configure the Post Stain/Wash settings by tapping the Stain & Wash box for the pop-up screen. Toggle between Stain & Wash and Wash Only under the Stain/Wash option. If Stain & Wash is selected, additional parameters are available to choose the number of stains, staining duration, number and choice of fluorescence channels, and exposure time for each selected channel.

3.2 Prime the Polaris IFC

1. Pipet the Actuation Fluid, Valve Priming Reagent, Polaris Blocking Reagent, Cell Wash Buffer, Polaris Imaging Reagent, and Initialization Reagent into the appropriate inlets on the IFC. All the reagents used in this step are from the Polaris Single-Cell mRNA Seq Reagent Kit (Fluidigm) (see Notes 2–4).
2. If the experiment is set up for adherent cells, pipet 25 μg/mL fibronectin, or an appropriate extracellular matrix (ECM) of choice, into the IFC to coat the cell chamber (see Note 5).
3. Place the IFC into the Polaris instrument with the Environmental Control Interface Plate (EC IP), then run the first Prime step.
4. After the first Prime step is finished, prepare the Cell Capture Bead Mix according to the user guide and pipet the mix into the IFC. Run the second Prime step on the Polaris.

3.3 Prepare Cell Mix

1. Bring the Cell Suspension Reagent (Fluidigm) to room temperature before use. Wash and stain cells with a universal cell labeling fluorescence dye, following the recommended staining condition for the chosen dye (see Notes 6–8).
2. After cell staining, wash and resuspend cells in phenol red-free complete culture medium or appropriate buffer (HBSS or PBS, with 3% FBS).
3. Count cells and adjust the cell concentration in culture medium in the range of 150–550 cells/μL, depending on the purity of the target cell population. Refer to the protocol guide for the reference table on recommended input volume and cell mix concentration (see Note 2). The absolute minimum cell concentration needed for an experiment is 20 cells/μL, which requires the use of the Low Cell Input option at the Cell Selection step.
4. Vortex the Cell Suspension Reagent for 15 s (do not centrifuge), then mix the cells with Cell Suspension Reagent at a ratio ranging from 3:2 to 3:1, depending on cell type. For example, 120 μL of cells with 80 μL of Cell Suspension Reagent for a 3:2 ratio. Refer to the protocol guide for notes on optimizing the Cell Suspension Reagent ratio (see Notes 2 and 9). This is the final cell mix to be loaded onto the Polaris IFC. Save an aliquot of cells without the addition of Cell Suspension Reagent for imaging with a hemocytometer in the subsequent step.
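The volume arithmetic in steps 3 and 4 can be sketched in Python as follows. The first function applies the standard C1·V1 = C2·V2 dilution relation to bring a counted stock into the 150–550 cells/μL range, and the second splits a final cell mix volume at a chosen cells-to-reagent ratio. Function names and the example stock concentration are illustrative assumptions; the optimal ratio for a given cell type comes from the protocol guide, not from this code.

```python
def dilution_volumes(stock_cells_per_ul, target_cells_per_ul, final_volume_ul):
    """Return (stock_ul, medium_ul) that give the target concentration."""
    if target_cells_per_ul > stock_cells_per_ul:
        raise ValueError("stock is more dilute than the target")
    stock_ul = target_cells_per_ul * final_volume_ul / stock_cells_per_ul
    return stock_ul, final_volume_ul - stock_ul


def cell_mix_volumes(total_ul, cells_part=3, reagent_part=2):
    """Split the final cell mix into (cells_ul, cell_suspension_reagent_ul)."""
    cells_ul = total_ul * cells_part / (cells_part + reagent_part)
    return cells_ul, total_ul - cells_ul


if __name__ == "__main__":
    # Hypothetical stock counted at 1000 cells/ul, diluted to 250 cells/ul.
    print(dilution_volumes(1000, 250, 200))
    # 3:2 ratio for a 200 ul final mix, matching the example in step 4.
    print(cell_mix_volumes(200))
    # 3:1 ratio, the other end of the stated range.
    print(cell_mix_volumes(200, 3, 1))
```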

3.4 Image Cells on Polaris with a Hemocytometer

1. Pipet 10 μL of fluorescence-stained cells (prepared in Subheading 3.3) without the addition of Cell Suspension Reagent into an empty chamber of a hemocytometer.
2. Place the hemocytometer onto the hemocytometer holder, then load the holder into the Polaris.
3. Adjust the parameters for Channels to select the number of channels between 1 and 3 to be included in the imaging process.
4. Tap a Channel box to select a specific fluorescence channel. The first channel will be used as the focusing reference for all images.
5. Adjust the imaging Exposure time for each selected channel.
6. Tap Image to begin imaging of all selected channels.
7. After imaging is complete, images and histograms are displayed on the Hemocytometer images acquired screen. If multiple channels are used, select the image and histogram of an individual channel by tapping the channel name on the upper-left area of the screen (see Note 10).
8. To optimize display of the hemocytometer image, adjust the parameters for Low Contrast Limit, High Contrast Limit, and Exposure time. Repeat for each channel as needed.
9. Tap Image again to capture new images based on the updated parameters.
10. Tap Done to continue to the next step.

3.5 Cell Selection on Polaris

1. Remove leftover reagents and waste from the IFC after the priming, following the protocol guide (see Notes 2 and 4).
2. Prepare the IFC for cell selection by pipetting PCR-grade water, cell selection medium (see Notes 11 and 12), and final cell mix into the appropriate inlets in the IFC. One to five inlets are available for the final cell mix according to the workflow specified during experimental setup. The number of cell inlets used for selection will depend on cell concentration and target cell purity.
3. Load the IFC and EC IP into Polaris and tap Initialize.
4. After initialization, tap Configure for the Configure cell selection screen.
5. Adjust upper and lower thresholds of fluorescence intensity for cell selection.
6. The low- and high-contrast limits can be adjusted to optimize the display of the imaged cells.
7. If needed, the imaging exposure time can be adjusted. Image cells again to apply the new exposure time.
8. Repeat the configuration of upper and lower thresholds for each channel used in cell selection.
9. If no candidate cells are visible, tap Load Cells to load another set of cells for configuration of cell selection parameters.
10. Once all the cell selection settings are adjusted to satisfaction, select Save, followed by Start Selection to begin cell selection.
11. During selection, cell selection settings can be changed by tapping Configure to allow adjustments on the exposure time and the upper and lower thresholds of fluorescence intensity.

3.6 Dose and Feed on the Polaris

1. When cell selection is complete, retrieve the IFC from Polaris and remove remaining reagents and waste from the IFC according to the user guide (see Notes 2 and 4).
2. Prepare the IFC by pipetting cell culture medium (see Notes 11 and 12) and dosing reagents of choice into the designated IFC inlets.
3. Eight groups of dosing conditions are available, where unique or replicated dosing reagents can be used for each group.
4. Pipet 2.0 mL of PCR-grade water onto the hydration sponge that is inserted into the sponge holder, then clip the sponge holder onto the EC IP. Remove excess water with a lint-free wiper.
5. Place the IFC and EC IP into the Polaris to start the Dose and Feed step.

3.7 Post Stain/Wash Cells on the Polaris

1. Remove the IFC from the Polaris instrument when the Dose and Feed step is complete.
2. Pipet off any excess water accumulated on the IFC, especially in the furrow around the center of the IFC.
3. Remove remaining reagents and waste from the IFC inlets according to the protocol guide (see Notes 2 and 4).
4. Prepare the IFC for the post-stain step by pipetting viability stain (for example, the Zombie Yellow from BioLegend), an optional second stain, Polaris Imaging Reagent, Cell Wash Buffer, and Harvest Reagent onto the specified IFC inlets (see Notes 13 and 14).
5. Alternatively, if a Wash Only step is used, prepare the IFC by pipetting Cell Wash Buffer and Harvest Reagent onto the IFC (see Note 15).
6. Load the IFC along with the EC IP into Polaris and start the Post Stain or Wash step.

3.8 Run the mRNA Seq Chemistry on the Polaris

1. Prepare the Lysis Mix, RT Mix, and PCR Mix (see Notes 2 and 16).
2. When the Post Stain/Wash step is complete, remove the IFC from the Polaris instrument.
3. Carefully remove the black tape underneath the IFC without inverting the IFC.
4. With the tape removed, it is possible to image the cells on the IFC with a microscope or any suitable imaging system. Imaging is recommended to record higher-resolution cell images (see Notes 17 and 18).
5. Remove remaining reagents and waste from the IFC inlets according to the user guide (see Notes 2 and 4).
6. Prepare the IFC for the chemistry step by pipetting Harvest Reagent, Cell Wash Buffer, Preloading Reagent, Lysis Mix, RT Mix, and PCR Mix into the designated IFC inlets.
7. Load the IFC along with the EC IP into the Polaris.
8. Move the slider to select the desired time for the protocol to finish.

3.9 Harvest, Quantify, and Dilute Amplified cDNA

1. Remove the IFC from the Polaris instrument once the mRNA Seq step is finished.
2. Aliquot 5 μL of DNA Dilution Reagent into 48 wells of a new 96-well plate and label it "Diluted Harvest Plate" (see Note 19).
3. Using a multichannel pipette, transfer the 48 amplified products from the IFC harvest outlets to the prepared Diluted Harvest Plate (see Note 2).
4. Quantify the diluted harvest products by using the Quant-iT PicoGreen dsDNA Assay (Thermo Fisher Scientific) on each sample following the plate-based assay protocol. Alternatively, other DNA quantitation methods can be used, such as the Qubit® dsDNA HS Assay Kit (Thermo Fisher Scientific) or a NanoDrop™ instrument. It is recommended that cDNA also be qualified using an Agilent Bioanalyzer with the High Sensitivity DNA chip, or a similar fragment analyzer system.
5. Based on the quantification results, dilute and normalize each sample to the optimal concentration range of 0.10–0.30 ng/μL.
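The normalization in step 5 is a plain dilution calculation, sketched below in Python. The default target of 0.2 ng/μL is an assumption chosen as the midpoint of the stated 0.10–0.30 ng/μL range, and the function name and example concentrations are hypothetical.

```python
def diluent_volume(conc_ng_per_ul, sample_ul, target_ng_per_ul=0.2):
    """Return the diluent volume (ul) to add to sample_ul to reach the target.

    Samples already at or below the target are left undiluted (returns 0.0).
    """
    if conc_ng_per_ul <= target_ng_per_ul:
        return 0.0
    return sample_ul * (conc_ng_per_ul / target_ng_per_ul - 1.0)


if __name__ == "__main__":
    # Hypothetical quantification results in ng/ul, keyed by harvest well.
    measured = {"well_01": 1.0, "well_02": 0.15, "well_03": 0.6}
    for well, conc in measured.items():
        added = diluent_volume(conc, sample_ul=2.0)
        print(f"{well}: add {added:.2f} ul diluent to 2.0 ul sample")
```

Samples that come out below 0.10 ng/μL cannot be rescued by dilution; Note 19 describes using less DNA Dilution Reagent at harvest when low yield is expected.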

3.10 Prepare Nextera XT Library

1. Label a new 96-well plate "Library Prep."
2. Prepare Tagmentation PreMix: 2.5 μL Tagment DNA Buffer (Illumina) and 1.25 μL Amplicon Tagment Mix (Illumina) for each sample.
3. Add 3.75 μL Tagmentation PreMix to 48 wells on the Library Prep plate.
4. Pipet 1.25 μL of diluted sample to each of the 48 wells. Seal the plate with an adhesive film, vortex, and centrifuge the plate.
5. Place the Library Prep plate into a thermal cycler for 10 min at 55 °C, then hold at 10 °C.
6. Pipet 1.25 μL of NT Buffer (Illumina) into each well to neutralize the tagmented samples. Seal the plate with an adhesive film, vortex, and centrifuge for 5 min at 300 × g.
7. Pipet 3.75 μL of Nextera PCR Master Mix (Illumina) into each sample well.
8. Select appropriate Index 1 (N7xx) and Index 2 (S5xx) primers to form 48 unique pairs of indexing primers.
9. Pipet 1.25 μL of Index 1 (N7xx) primer and 1.25 μL of Index 2 (S5xx) primer into each sample well, based on the selected indexing scheme. Seal the plate with an adhesive film, vortex, and centrifuge for 2 min at 300 × g.
10. Place the Library Prep plate into a thermal cycler and run the following PCR program:

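Temperature (°C)   Time     Cycles
72                 3 min    1
95                 30 s     1
95                 10 s
55                 30 s     12
72                 60 s
72                 5 min    1
10                 Hold

The 95 °C (10 s), 55 °C (30 s), and 72 °C (60 s) steps together form one cycle that is repeated 12 times.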
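The per-sample volumes in this subheading can be tallied with a few lines of Python, both to scale the Tagmentation PreMix for 48 samples and to check the final reaction volume per well. This is only a bookkeeping sketch: the 25% overage is an assumption, and the names are illustrative.

```python
# Per-sample PreMix components from step 2, in microliters.
PREMIX_PER_SAMPLE_UL = {
    "Tagment DNA Buffer": 2.5,
    "Amplicon Tagment Mix": 1.25,
}

# Everything added to one well across steps 3-9, in microliters.
PER_WELL_ADDITIONS_UL = {
    "Tagmentation PreMix": 3.75,
    "Diluted sample": 1.25,
    "NT Buffer": 1.25,
    "Nextera PCR Master Mix": 3.75,
    "Index 1 (N7xx) primer": 1.25,
    "Index 2 (S5xx) primer": 1.25,
}


def premix(n_samples, overage=0.25):
    """Scale the PreMix to n_samples plus a fractional overage (assumed)."""
    factor = n_samples * (1 + overage)
    return {name: vol * factor for name, vol in PREMIX_PER_SAMPLE_UL.items()}


def final_well_volume():
    """Return the total volume in one well after all additions."""
    return sum(PER_WELL_ADDITIONS_UL.values())


if __name__ == "__main__":
    print(premix(48))
    print("final volume per well (ul):", final_well_volume())
```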

3.11 Pool and Clean Up the Library

1. When the PCR step is finished, prepare a dual-indexed library pool by pipetting 1 μL of PCR product from each of the 48 sample wells.
2. Perform two rounds of cleanup using 90% of total pool volume of AMPure XP beads (Beckman Coulter) (e.g., add 44 μL of AMPure XP beads to 48 μL of library pool) following the standard AMPure XP beads protocol.
3. The final library eluted from the AMPure XP beads is ready to be prepared for sequencing on any Illumina sequencing platform.

4 Notes

1. For detailed instructions on instrument and software opera- tion, refer to the Polaris User Guide (Fluidigm 100-9580). 2. For detailed instructions on the experimental protocol, refer to the Generate cDNA Libraries with the Polaris Single-Cell Dos- ing mRNA Seq IFC Protocol (Fluidigm 101-0082). 3. To prevent bubbles from forming, push only to the first stop on the pipette when pipetting into the IFC inlets. If a bubble is created, use a pipette tip to either burst the bubble or move it to the top surface of the solution. 4. For a quick reference guide on IFC preparation in each step, refer to the Generate cDNA Libraries with the Polaris Single- Cell Dosing mRNA Seq IFC Quick Reference (Fluidigm 101-0075). 5. Be sure that any ECM used is clean and free of particulates that may clog the microfluidic channels. Some ECMs, such as col- ® lagen and Matrigel , are highly likely to cause clogs and are not currently recommended for use on the Polaris. Other possible ECM choices may include laminin and vitronectin, at a con- centration 50 μg/mL. ECM can be diluted in Cell Wash Buffer, PBS, or another simple buffer of choice, given that it is filtered to remove particulates prior to use. 6. Cells can be stained with additional dyes chosen by the user as long as they are verified to work with the excitation and emis- sion filters available on Polaris. Recommended universal dyes include CellTracker Orange CMRA, CellTracker Green CMFDA, and CellTracker Deep Red (all from Thermo Fisher Scientific). Cell preparation and staining are very critical to the success of the Polaris workflow, so it is recommended to opti- mize these steps in advance of a Polaris experiment. 7. If the cells of interest have green fluorescent protein (GFP), then CellTracker Orange CMRA is not recommended, as it will quench the GFP signal. Using a universal dye such as Cell- Tracker CM-DiI Dye (Thermo Fisher Scientific) is recom- mended as a replacement for the CMRA dye. 194 Chad D. Sanada and Aik T. Ooi

8. Use the Hemocytometer function on Polaris to optimize your cell stains before beginning the experiment on the IFCs. The Hemocytometer function can be accessed from the start menu on Polaris at any time that an active experiment is not running. 9. To prevent bubbles from forming, push only to the first stop on the pipette when pipetting into the IFC inlets. If a bubble is created, use a pipette tip to either burst the bubble or move it to the top surface of the solution. 10. In general, cells with fluorescence intensities below 5000 will be difficult to identify during the Cell Selection step of an IFC run. It is recommended that cell fluorescence intensities on a hemocytometer be 8000 for exposure of 0.5–1.0 s. For exposure times longer than 1.0 s, the cell intensities should be 10,000. 11. Cell selection medium must be free of particulates or cell clumps larger than 40 μm diameter, as they might clog the microfluidic channels of the IFC. Cell media containing serum or BSA should be sterile-filtered. Any cell suspension with cell clumps should be strained through a 40 μm mesh filter before use. 12. The cell selection media should be phenol-red free and free of any autofluorescence within the fluorescence spectra used in the experiment to prevent interference on cell selection efficiency. 13. For Zombie Yellow stain, it is recommended to dilute 1:100–1:500 in Cell Wash Buffer. User can optimize further as needed. 14. Although DAPI cannot be imaged directly on Polaris, the IFC can be imaged for DAPI staining on a microscope upon com- pletion of the post stain step. If using DAPI as the staining reagent for Post Stain, a dilution of 1:1000–1:3000 DAPI in Cell Wash Buffer is recommended. 15. If running Wash Only and no Post Stain, it is possible to remove the black tape underneath the IFC and image the IFC on a microscope right after the Dose and Feed step. 16. 
For a quick reference guide on making these reagent mixes, refer to the Prepare the Reagent Mixes for mRNA Seq Chem- istry on Polaris Quick Reference (Fluidigm 101-2819). 17. If cells were stained with DAPI in the Post Stain step, this is the step to collect images for cell viability analysis. 18. An inverted microscope with phase-contrast imaging capability is recommended. Use 10Â or 20Â objectives for adequate cell imaging. Criteria for selection of a compatible imaging system are outlined in Minimum Specifications for Imaging Cells in Fluidigm Integrated Fluidic Circuits (Fluidigm 100-5004). Single-Cell Dosing and mRNA-Seq 195

19. If low cDNA output is expected, or if a higher concentration of sample is desired, aliquot a smaller amount of DNA Dilution Reagent into the plate. Alternatively, collect the amplified cDNA from the IFC without any dilution.
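The intensity guidance in Note 10 can be written down as a small check. This is only a sketch: the function names are hypothetical, and it assumes the recommended values (8000 and 10,000) are meant as minimum intensities, which the note does not state explicitly.

```python
def recommended_min_intensity(exposure_s: float) -> int:
    """Recommended hemocytometer intensity for a given exposure time (Note 10)."""
    if 0.5 <= exposure_s <= 1.0:
        return 8000
    if exposure_s > 1.0:
        return 10000
    # Note 10 gives no value for exposures shorter than 0.5 s.
    raise ValueError("no recommendation for exposures shorter than 0.5 s")


def assess_stain(intensity: float, exposure_s: float) -> str:
    """Classify a measured cell intensity against the Note 10 guidance."""
    if intensity < 5000:
        # Below this level cells are difficult to identify during Cell Selection.
        return "hard to identify"
    if intensity >= recommended_min_intensity(exposure_s):
        return "meets recommendation"
    return "below recommendation"
```

For example, `assess_stain(9000, 2.0)` falls between the 5000 floor and the 10,000 recommendation for long exposures, so the stain dilution would be a candidate for further optimization.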

Chapter 13

Targeted TCR Amplification from Single-Cell cDNA Libraries

Shuqiang Li and Kenneth J. Livak

Abstract

Single-cell sequencing of TCR alleles enables determination of T cell specificity. Here we describe a sensitive protocol for targeted amplification of TCR CDR3 regions from single-cell full-length cDNA libraries. By exploiting the specificity of RNase H-dependent PCR (rhPCR), the protocol achieves amplification of TCR alleles and addition of cell barcodes in a single PCR step.

Key words T cell receptor repertoire, Paired TCRαβ single-cell sequencing, RNase H-dependent PCR

1 Introduction

The diversity of individual T cell receptor (TCR) chains has been used as a measure of clonal diversity in the analysis of the immune response to pathogens and vaccines [1]. The recent successes of cancer immunotherapy have intensified interest in TCR analysis because T cells are the primary effector cells of the immune response to tumors [2]. Interaction of a TCR with a peptide antigen bound to a major histocompatibility complex (MHC) molecule occurs mainly through the paired alpha- and beta-CDR3 regions. Thus, determining the sequence of these CDR3 segments is a necessary component in characterizing the antigen specificity of a T cell. Identifying which TCR interacts with a particular antigen requires single-cell sequencing so that the exact pairing of TCR-alpha and TCR-beta chains is known. Programs such as TraCeR [3] enable extraction of TCR sequences from single-cell RNA-seq data as long as the sequencing uses whole-transcript libraries rather than end-counting libraries. Due to the read-depth limitations of single-cell RNA-seq, this type of analysis does not always detect both TCR alpha and beta sequences in the same cell, and it sometimes recovers only partial CDR3 sequences. Complete CDR3 sequence information is critical for the cloning and expression of TCRs, which is required for determining the specificity of discovered TCRs. Thus, there is a need to improve the sensitivity of determining TCR sequences in single-cell cDNA libraries.

Here we present a protocol that starts with full-length single-cell cDNA libraries suitable for preparing whole-transcript sequencing libraries and achieves specific, targeted amplification of TCR alleles plus addition of cell barcodes in a single PCR step, as opposed to a nested PCR strategy such as the one described by Han et al. [4].

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_13, © Springer Science+Business Media, LLC, part of Springer Nature 2019

2 Materials

1. 96-Well PCR plates.
2. 20 mg/mL Proteinase K (800 units/mL). Store at −20 °C.
3. Prionex: Sigma-Aldrich G0411.
4. NEBNext Single Cell/Low Input RNA Library Prep Kit: New England BioLabs, E6420L.
5. 25 mM AAPV protease inhibitor: Dissolve 5 mg Elastase Inhibitor III (Sigma-Aldrich 324745-5MG) in 400 μL DMSO. Dispense into 20 μL aliquots and store at −80 °C. This is a peptide inhibitor with the amino acid sequence AAPV.
6. Thermal cycler.
7. ProNex beads: Promega NG2001 (see Note 1). Product includes Promega Wash Buffer concentrate and Promega Elution Buffer.
8. Ethanol: Used to dilute Promega Wash Buffer concentrate and to prepare 80% ethanol.
9. Magnetic stand for 96-well plate.
10. DNA Suspension Buffer: 10 mM Tris–HCl, pH 8.0 and 0.1 mM EDTA.
11. Set of 96 barcode primer mixes: Each mix consists of 6 μM rhPCR primer P5.IDT***.Rd1x.x1 and 6 μM rhPCR primer P7.IDT***.Rd2x.x1, where IDT*** refers to the different barcodes. The sequences of the 192 primers are in Table 1. These are mixed in pairwise combinations, one P5 primer with one P7 primer. Store at −20 °C.
12. RNase H2 Enzyme Kit: Integrated DNA Technologies 11-02-12-01. The kit contains 2 U/μL RNase H2 Enzyme and RNase H2 Dilution Buffer.
13. 1 M Tris–HCl, pH 8.4.
14. 1 M KCl.
15. 1 M MgCl2.

Table 1 Primers used in targeted amplification and barcode addition for templates derived from TCR transcripts as described in Subheading 3.2

Name Sequence Notes

P5.IDT289. AATGATACGGCGACCACCGAGATCTACACCATTGCCTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT290. AATGATACGGCGACCACCGAGATCTACACCTAGGTGAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT291. AATGATACGGCGACCACCGAGATCTACACTCCGTATGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT292. AATGATACGGCGACCACCGAGATCTACACACGATGACACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT293. AATGATACGGCGACCACCGAGATCTACACGTCGGTAAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT294. AATGATACGGCGACCACCGAGATCTACACTCGAAGGTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT295. AATGATACGGCGACCACCGAGATCTACACAGAAGCGTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT296. AATGATACGGCGACCACCGAGATCTACACCTCTACTCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT297. AATGATACGGCGACCACCGAGATCTACACCTAGGCATACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT298. AATGATACGGCGACCACCGAGATCTACACTGGAGTTGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT299. AATGATACGGCGACCACCGAGATCTACACGAGGACTTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT300. AATGATACGGCGACCACCGAGATCTACACCAATCGACACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT301. AATGATACGGCGACCACCGAGATCTACACTCTAACGCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT302. AATGATACGGCGACCACCGAGATCTACACTCTCGCAAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT303. AATGATACGGCGACCACCGAGATCTACACATCGGTGTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT304. AATGATACGGCGACCACCGAGATCTACACGAGATACGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT305. AATGATACGGCGACCACCGAGATCTACACGTCTCCTTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT306. AATGATACGGCGACCACCGAGATCTACACAGTCGACAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT307. AATGATACGGCGACCACCGAGATCTACACCGGATTGAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT308. AATGATACGGCGACCACCGAGATCTACACCACAAGTCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT309. AATGATACGGCGACCACCGAGATCTACACTACATCGGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT310. AATGATACGGCGACCACCGAGATCTACACAGCTCCTAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT311. AATGATACGGCGACCACCGAGATCTACACACTCGTTGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT312. AATGATACGGCGACCACCGAGATCTACACCTGACACAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT313. AATGATACGGCGACCACCGAGATCTACACCAACCTAGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT314. AATGATACGGCGACCACCGAGATCTACACAAGGACACACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT315. AATGATACGGCGACCACCGAGATCTACACTGCAGGTAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT316. AATGATACGGCGACCACCGAGATCTACACACCTAAGGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT317. AATGATACGGCGACCACCGAGATCTACACAGTCTGTGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT318. AATGATACGGCGACCACCGAGATCTACACAGGTTCGAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT319. AATGATACGGCGACCACCGAGATCTACACGACTATGCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT320. AATGATACGGCGACCACCGAGATCTACACTTCAGGAGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT321. AATGATACGGCGACCACCGAGATCTACACTGTGCGTTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT322. AATGATACGGCGACCACCGAGATCTACACCGAGACTAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT323. AATGATACGGCGACCACCGAGATCTACACCTCAGAGTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT324. AATGATACGGCGACCACCGAGATCTACACGCCATAACACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT325. AATGATACGGCGACCACCGAGATCTACACTTACCGAGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT326. AATGATACGGCGACCACCGAGATCTACACGCTCTGTAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT327. AATGATACGGCGACCACCGAGATCTACACCGTTATGCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT328. AATGATACGGCGACCACCGAGATCTACACGTCTGATCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT329. AATGATACGGCGACCACCGAGATCTACACTAGTTGCGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT330. AATGATACGGCGACCACCGAGATCTACACTGATCGGAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT331. AATGATACGGCGACCACCGAGATCTACACCCAAGTTGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT332. AATGATACGGCGACCACCGAGATCTACACCCTACTGAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT333. AATGATACGGCGACCACCGAGATCTACACCTTGCTGTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT334. AATGATACGGCGACCACCGAGATCTACACTGCCATTCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT335. AATGATACGGCGACCACCGAGATCTACACTTGATCCGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT336. AATGATACGGCGACCACCGAGATCTACACAGTGCAGTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT337. AATGATACGGCGACCACCGAGATCTACACGACTTAGGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT338. AATGATACGGCGACCACCGAGATCTACACCGTACGAAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT339. AATGATACGGCGACCACCGAGATCTACACTACCAGGAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT340. AATGATACGGCGACCACCGAGATCTACACCGTCAATGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT341. AATGATACGGCGACCACCGAGATCTACACGAAGAGGTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT342. AATGATACGGCGACCACCGAGATCTACACGACGAATGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT343. AATGATACGGCGACCACCGAGATCTACACAGGAGGAAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT344. AATGATACGGCGACCACCGAGATCTACACCTTACAGCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT345. AATGATACGGCGACCACCGAGATCTACACGAGATGTCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT346. AATGATACGGCGACCACCGAGATCTACACTACGGTTGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT347. AATGATACGGCGACCACCGAGATCTACACCTATCGCAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT348. AATGATACGGCGACCACCGAGATCTACACTCGAACCAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT349. AATGATACGGCGACCACCGAGATCTACACGAACGCTTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT350. AATGATACGGCGACCACCGAGATCTACACCAGAATCGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT351. AATGATACGGCGACCACCGAGATCTACACATGGTTGCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT352. AATGATACGGCGACCACCGAGATCTACACGCTGGATTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT353. AATGATACGGCGACCACCGAGATCTACACGATGCACTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT354. AATGATACGGCGACCACCGAGATCTACACACCAATGCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT355. AATGATACGGCGACCACCGAGATCTACACGTCCTAAGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT356. AATGATACGGCGACCACCGAGATCTACACCCGACTATACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT357. AATGATACGGCGACCACCGAGATCTACACTTGGTCTCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT358. AATGATACGGCGACCACCGAGATCTACACGCCTTGTTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT359. AATGATACGGCGACCACCGAGATCTACACGATACTGGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT360. AATGATACGGCGACCACCGAGATCTACACATTCGAGGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT361. AATGATACGGCGACCACCGAGATCTACACGTCAGTTGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT362. AATGATACGGCGACCACCGAGATCTACACGTAGAGCAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT363. AATGATACGGCGACCACCGAGATCTACACACGTGATGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT364. AATGATACGGCGACCACCGAGATCTACACTAAGTGGCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT365. AATGATACGGCGACCACCGAGATCTACACTGTGAAGCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT366. AATGATACGGCGACCACCGAGATCTACACCATTCGGTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT367. AATGATACGGCGACCACCGAGATCTACACTTGGTGAGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT368. AATGATACGGCGACCACCGAGATCTACACCAGTTCTGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT369. AATGATACGGCGACCACCGAGATCTACACAGGCTTCTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT370. AATGATACGGCGACCACCGAGATCTACACGAATCGTGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT371. AATGATACGGCGACCACCGAGATCTACACACCAGCTTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT372. AATGATACGGCGACCACCGAGATCTACACCTCATTGCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT373. AATGATACGGCGACCACCGAGATCTACACCGATAGAGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT374. AATGATACGGCGACCACCGAGATCTACACTGGAGAGTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT375. AATGATACGGCGACCACCGAGATCTACACGTATGCTGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT376. AATGATACGGCGACCACCGAGATCTACACCTGGAGTAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT377. AATGATACGGCGACCACCGAGATCTACACAATGCCTCACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT378. AATGATACGGCGACCACCGAGATCTACACTGAGGTGTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT379. AATGATACGGCGACCACCGAGATCTACACACATTGCGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT380. AATGATACGGCGACCACCGAGATCTACACTCTCTAGGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT381. AATGATACGGCGACCACCGAGATCTACACCGCTAGTAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT382. AATGATACGGCGACCACCGAGATCTACACAATGGACGACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT383. AATGATACGGCGACCACCGAGATCTACACGATAGCGAACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P5.IDT384. AATGATACGGCGACCACCGAGATCTACACCGACCATTACACTCTTTCCCTACrACGACa/ Rd1x.x1 3SpC3/

P7.IDT001. CAAGCAGAAGACGGCATACGAGATCTGATCGTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT002. CAAGCAGAAGACGGCATACGAGATACTCTCGAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT003. CAAGCAGAAGACGGCATACGAGATTGAGCTAGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT004. CAAGCAGAAGACGGCATACGAGATGAGACGATGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT005. CAAGCAGAAGACGGCATACGAGATCTTGTCGAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT006. CAAGCAGAAGACGGCATACGAGATTTCCAAGGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT007. CAAGCAGAAGACGGCATACGAGATCGCATGATGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT008. CAAGCAGAAGACGGCATACGAGATACGGAACAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT009. CAAGCAGAAGACGGCATACGAGATCGGCTAATGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT010. CAAGCAGAAGACGGCATACGAGATATCGATCGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT011. CAAGCAGAAGACGGCATACGAGATGCAAGATCGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT012. CAAGCAGAAGACGGCATACGAGATGCTATCCTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT013. CAAGCAGAAGACGGCATACGAGATTACGCTACGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT014. CAAGCAGAAGACGGCATACGAGATTGGACTCTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT015. CAAGCAGAAGACGGCATACGAGATAGAGTAGCGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT016. CAAGCAGAAGACGGCATACGAGATATCCAGAGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT017. CAAGCAGAAGACGGCATACGAGATGACGATCTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT018. CAAGCAGAAGACGGCATACGAGATAACTGAGCGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT019. CAAGCAGAAGACGGCATACGAGATCTTAGGACGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT020. CAAGCAGAAGACGGCATACGAGATGTGCCATAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT021. CAAGCAGAAGACGGCATACGAGATGAATCCGAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT022. CAAGCAGAAGACGGCATACGAGATTCGCTGTTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT023. CAAGCAGAAGACGGCATACGAGATTTCGTTGGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT024. CAAGCAGAAGACGGCATACGAGATAAGCACTGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT025. CAAGCAGAAGACGGCATACGAGATCCTTGATCGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT026. CAAGCAGAAGACGGCATACGAGATGTCGAAGAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT027. CAAGCAGAAGACGGCATACGAGATACCACGATGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT028. CAAGCAGAAGACGGCATACGAGATGATTACCGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT029. CAAGCAGAAGACGGCATACGAGATGCACAACTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT030. CAAGCAGAAGACGGCATACGAGATGCGTCATTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT031. CAAGCAGAAGACGGCATACGAGATATCCGGTAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT032. CAAGCAGAAGACGGCATACGAGATCGTTGCAAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT033. CAAGCAGAAGACGGCATACGAGATGTGAAGTGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT034. CAAGCAGAAGACGGCATACGAGATCATGGCTAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT035. CAAGCAGAAGACGGCATACGAGATATGCCTGTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT036. CAAGCAGAAGACGGCATACGAGATCAACACCTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT037. CAAGCAGAAGACGGCATACGAGATTGTGACTGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT038. CAAGCAGAAGACGGCATACGAGATGTCATCGAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT039. CAAGCAGAAGACGGCATACGAGATAGCACTTCGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT040. CAAGCAGAAGACGGCATACGAGATGAAGGAAGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT041. CAAGCAGAAGACGGCATACGAGATGTTGTTCGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT042. CAAGCAGAAGACGGCATACGAGATCGGTTGTTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT043. CAAGCAGAAGACGGCATACGAGATACTGAGGTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT044. CAAGCAGAAGACGGCATACGAGATTGAAGACGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT045. CAAGCAGAAGACGGCATACGAGATGTTACGCAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT046. CAAGCAGAAGACGGCATACGAGATAGCGTGTTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT047. CAAGCAGAAGACGGCATACGAGATGATCGAGTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT048. CAAGCAGAAGACGGCATACGAGATACAGCTCAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT049. CAAGCAGAAGACGGCATACGAGATGAGCAGTAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT050. CAAGCAGAAGACGGCATACGAGATAGTTCGTCGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT051. CAAGCAGAAGACGGCATACGAGATTTGCGAAGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT052. CAAGCAGAAGACGGCATACGAGATATCGCCATGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT053. CAAGCAGAAGACGGCATACGAGATTGGCATGTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT054. CAAGCAGAAGACGGCATACGAGATCTGTTGACGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT055. CAAGCAGAAGACGGCATACGAGATCATACCACGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT056. CAAGCAGAAGACGGCATACGAGATGAAGTTGGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT057. CAAGCAGAAGACGGCATACGAGATATGACGTCGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT058. CAAGCAGAAGACGGCATACGAGATTTGGACGTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT059. CAAGCAGAAGACGGCATACGAGATAGTGGATCGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT060. CAAGCAGAAGACGGCATACGAGATGATAGGCTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT061. CAAGCAGAAGACGGCATACGAGATTGGTAGCTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT062. CAAGCAGAAGACGGCATACGAGATCGCAATCTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT063. CAAGCAGAAGACGGCATACGAGATGATGTGTGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT064. CAAGCAGAAGACGGCATACGAGATGATTGCTCGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT065. CAAGCAGAAGACGGCATACGAGATCGCTCTATGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT066. CAAGCAGAAGACGGCATACGAGATTATCGGTCGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT067. CAAGCAGAAGACGGCATACGAGATAACGTCTGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT068. CAAGCAGAAGACGGCATACGAGATACGTTCAGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT069. CAAGCAGAAGACGGCATACGAGATCAGTCCAAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT070. CAAGCAGAAGACGGCATACGAGATTTGCAGACGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT071. CAAGCAGAAGACGGCATACGAGATCAATGTGGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT072. CAAGCAGAAGACGGCATACGAGATACTCCATCGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT073. CAAGCAGAAGACGGCATACGAGATGTTGACCTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT074. CAAGCAGAAGACGGCATACGAGATCGTGTGTAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT075. CAAGCAGAAGACGGCATACGAGATACGACTTGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT076. CAAGCAGAAGACGGCATACGAGATCACTAGCTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT077. CAAGCAGAAGACGGCATACGAGATACTAGGAGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT078. CAAGCAGAAGACGGCATACGAGATGTAGGAGTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT079. CAAGCAGAAGACGGCATACGAGATCCTGATTGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT080. CAAGCAGAAGACGGCATACGAGATATGCACGAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT081. CAAGCAGAAGACGGCATACGAGATCGACGTTAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT082. CAAGCAGAAGACGGCATACGAGATTACGCCTTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT083. CAAGCAGAAGACGGCATACGAGATCCGTAAGAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT084. CAAGCAGAAGACGGCATACGAGATATCACACGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT085. CAAGCAGAAGACGGCATACGAGATCACCTGTTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT086. CAAGCAGAAGACGGCATACGAGATCTTCGACTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT087. CAAGCAGAAGACGGCATACGAGATTGCTTCCAGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT088. CAAGCAGAAGACGGCATACGAGATAGAACGAGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT089. CAAGCAGAAGACGGCATACGAGATGTTCTCGTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT090. CAAGCAGAAGACGGCATACGAGATTCAGGCTTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT091. CAAGCAGAAGACGGCATACGAGATCCTTGTAGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT092. CAAGCAGAAGACGGCATACGAGATGAACATCGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT093. CAAGCAGAAGACGGCATACGAGATTAACCGGTGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT094. CAAGCAGAAGACGGCATACGAGATAACCGTTCGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT095. CAAGCAGAAGACGGCATACGAGATTGGTACAGGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

P7.IDT096. CAAGCAGAAGACGGCATACGAGATATATGCGCGTGACTGGAGTTCAGArCGTGTa/3SpC3/ Rd2x.x1

Rd1.AV01.x1 ctctttccctacacgacgctcttccgatctAACTGCACGTACCAGACATCTrGGGTTa/ Amplifies TRAV1-1, 3SpC3/ TRAV1-2

Rd1.AV02.x1 ctctttccctacacgacgctcttccgatctTCATCGCTGCTCATCCTCCrAGGTGa/3SpC3/ Amplifies TRAV2

Rd1.AV03.x1 ctctttccctacacgacgctcttccgatctCCTGGTTAAAGGCAGCTATGGrCTTTGc/ Amplifies TRAV3 3SpC3/

Rd1.AV04.x1 ctctttccctacacgacgctcttccgatctGCCGACAGAAAGTCCAGCrACTCTa/3SpC3/ Amplifies TRAV4

Rd1.AV05.x1 ctctttccctacacgacgctcttccgatctTCTGCGCATTGCAGACACrCCAGAa/3SpC3/ Amplifies TRAV5

Rd1.AV06.x1 ctctttccctacacgacgctcttccgatctTGAAGGTCACCTTTGATACCACCrCTTAAc/ Amplifies TRAV6 3SpC3/

Rd1.AV07.x1 ctctttccctacacgacgctcttccgatctCCGTGCAGCCTGAAGATTCrAGCCAa/3SpC3/ Amplifies TRAV7

Rd1.AV08-1.x1 ctctttccctacacgacgctcttccgatctTGGTCAACACCTTCAGCTTCTrCCTCAc/ Amplifies TRAV8-1 3SpC3/

Rd1.AV08-2/4/ ctctttccctacacgacgctcttccgatctAAGGACTCCAGCTTCTCCTGrAAGTAg/3SpC3/ Amplifies TRAV8-2, 6.x1 TRAV8-4,TRAV8-6

Rd1.AV08-3.x1 ctctttccctacacgacgctcttccgatctGGAAACCCTCTGTGCATTGGrAGTGAc/3SpC3/ Amplifies TRAV8-3

Rd1.AV9.x1 ctctttccctacacgacgctcttccgatctGAAACCACTTCTTTCCACTTGGArGAAAGc/ Amplifies TRAV9-1, 3SpC3/ TRAV9-2

Rd1.AV10.x1 ctctttccctacacgacgctcttccgatctCACAAAGCAAAGCTCTCTGCArCATCAa/ Amplifies TRAV10 3SpC3/

Rd1.AV12.x1 ctctttccctacacgacgctcttccgatctCAGTGATTCAGCCACCTACCTrCTGTGa/ Amplifies TRAV12-1, 3SpC3/ TRAV12-2, TRAV12-3

Rd1.AV13-1.x1 ctctttccctacacgacgctcttccgatctACAAGACAGCCAAACATTTCTCCrCTGCAa/ Amplifies TRAV13-1 3SpC3/

Rd1.AV13-2.x1 ctctttccctacacgacgctcttccgatctTGCAGCTACTCAACCTGGArGACTCc/3SpC3/ Amplifies TRAV13-2

Rd1.AV14.x1 ctctttccctacacgacgctcttccgatctACCTTGTCATCTCCGCTTCArCAACTa/3SpC3/ Amplifies TRAV14/DV4

Rd1.AV16.x1 ctctttccctacacgacgctcttccgatctGGCGAGACATCTTTCCACCTrGAAGAc/3SpC3/ Amplifies TRAV16

Rd1.AV17.x1 ctctttccctacacgacgctcttccgatctAGTCACGCTTGACACTTCCArAGAAAc/3SpC3/ Amplifies TRAV17

Rd1.AV18.x2 ctctttccctacacgacgctcttccgatctCAGTTCCTTCCACCTGGAGArAGCCCa/3SpC3/ Amplifies TRAV18

Rd1.AV19.x1 ctctttccctacacgacgctcttccgatctCACAGCCTCACAAGTCGTGrGACTCc/3SpC3/ Amplifies TRAV19

Rd1.AV20.x1 ctctttccctacacgacgctcttccgatctCTGCACATCACAGCCCCTArAACCTa/3SpC3/ Amplifies TRAV20

Rd1.AV21.x1 ctctttccctacacgacgctcttccgatctACATTGCAGCTTCTCAGCCTrGGTGAa/3SpC3/ Amplifies TRAV21

Rd1.AV22.x1 ctctttccctacacgacgctcttccgatctTCCTCTTCCCAGACCACAGArCTCAGa/3SpC3/ Amplifies TRAV22

Rd1.AV23.x1 ctctttccctacacgacgctcttccgatctGATTCCCAGCCTGGAGACTCrAGCCAa/3SpC3/ Amplifies TRAV23/ DV6

Rd1.AV24.x1 ctctttccctacacgacgctcttccgatctGTACATCAAAGGATCCCAGCCTrGAAGAa/ Amplifies TRAV24 3SpC3/

Rd1.AV25.x1 ctctttccctacacgacgctcttccgatctGCCACCCAGACTACAGATGTrAGGAAa/3SpC3/ Amplifies TRAV25

Rd1.AV26-1.x1 ctctttccctacacgacgctcttccgatctCGCTACGCTGAGAGACACTrGCTGTa/3SpC3/ Amplifies TRAV26-1

Rd1.AV26-2.x1 ctctttccctacacgacgctcttccgatctTGGCAATCGCTGAAGACAGArAAGTCa/3SpC3/ Amplifies TRAV26-2

Rd1.AV27.x1 ctctttccctacacgacgctcttccgatctTGCAAGAAAGGACAGTTCTCTCCrACATCc/ Amplifies TRAV27 3SpC3/

Rd1.AV29.x1 ctctttccctacacgacgctcttccgatctTGGAGACTCTGCAGTGTACTTCTrGTGCAa/ Amplifies TRAV29/ 3SpC3/ DV5

Rd1.AV30.x1 ctctttccctacacgacgctcttccgatctGCAAAGCTCCCTGTACCTTACGrGCCTCa/ Amplifies TRAV30 3SpC3/

Rd1.AV34.x1 ctctttccctacacgacgctcttccgatctCCAGCCATGCAGGCATCTArCCTCTa/3SpC3/ Amplifies TRAV34

Rd1.AV35.x1 ctctttccctacacgacgctcttccgatctGCATCCATACCTAGTGATGTAGGCrATCTAa/ Amplifies TRAV35 3SpC3/

Rd1.AV36.x1 ctctttccctacacgacgctcttccgatctAGCATCCTGAACATCACAGCCrACCCAa/ Amplifies TRAV36/ 3SpC3/ DV7

Rd1.AV38.x1 ctctttccctacacgacgctcttccgatctGCAGCCAAATCCTTCAGTCTCArAGATCc/ Amplifies TRAV38-1, 3SpC3/ TRAV38-2

Rd1.AV39.x1 ctctttccctacacgacgctcttccgatctTGCATGACCTCTCTGCCACrCTACTc/3SpC3/ Amplifies TRAV39

Rd1.AV40.x2 ctctttccctacacgacgctcttccgatctCCCCCATTGTGAAATATTCAGTCCrAGGTAc/ Amplifies TRAV40 3SpC3/

Rd1.AV41.x1 ctctttccctacacgacgctcttccgatctCCCATCCCAGAGACTCTGCrCGTCTc/3SpC3/ Amplifies TRAV41

Rd1.BV02.x1 ctctttccctacacgacgctcttccgatctTCTGAAGATCCGGTCCACAAAGrCTGGAa/ Amplifies TRBV2 3SpC3/

Rd1.BV03.x1 ctctttccctacacgacgctcttccgatctCAATTCCCTGGAGCTTGGTGArCTCTGa/ Amplifies TRBV3-1, 3SpC3/ TRBV3-2

Rd1.BV04.x1 ctctttccctacacgacgctcttccgatctTCTCACCTGAATGCCCCAACrAGCTCc/3SpC3/ Amplifies TRBV4-1, TRBV4-2,TRBV4-3

Rd1.BV05-1.x1 ctctttccctacacgacgctcttccgatctCGCTCTGAGATGAATGTGAGCArCCTTGa/ Amplifies TRBV5-1 3SpC3/

Rd1.BV05-4/5/ ctctttccctacacgacgctcttccgatctCTCTGAGCTGAATGTGAACGCrCTTGGc/ Amplifies TRBV5-4, 6/8.x1 3SpC3/ TRBV5-5,TRBV5-6, TRBV5-7,TRBV5-8

Rd1.BV06- ctctttccctacacgacgctcttccgatctACATCTGTGTACTTCTGTGCCAGrCAGTGc/ Amplifies TRBV6-1, 1to6.x1 3SpC3/ TRBV6-2,TRBV6-3, TRBV6-4,TRBV6-5, TRBV6-6

Rd1.BV06-8/9. ctctttccctacacgacgctcttccgatctCCTGGTATCGACAAGACCCAGrGCATGa/ Amplifies TRBV6-8, x1 3SpC3/ TRBV6-9

Rd1.BV07-2/6. ctctttccctacacgacgctcttccgatctCTCCACTCTGACGATCCAGCrGCACAa/3SpC3/ Amplifies TRBV7-2, x1 TRBV7-6

Rd1.BV07-3.x1 ctctttccctacacgacgctcttccgatctCTACTCTGAAGATCCAGCGCArCAGAGa/ Amplifies TRBV7-3 3SpC3/

Rd1.BV07-4/6/ ctctttccctacacgacgctcttccgatctCGGTTCTCTGCAGAGAGGCrCTGAGt/3SpC3/ Amplifies TRBV7-4, 7.x1 TRBV7-6,TRBV7-7

Rd1.BV07-8.x1 ctctttccctacacgacgctcttccgatctGGATCCGTCTCCACTCTGAAGrATCCAa/ Amplifies TRBV7-8 3SpC3/

Rd1.BV07-9.x1 ctctttccctacacgacgctcttccgatctTCCACCTTGGAGATCCAGCrGCACAa/3SpC3/ Amplifies TRBV7-9

Rd1.BV09.x1 ctctttccctacacgacgctcttccgatctACGATTCTCCGCACAACAGTTrCCCTGc/ Amplifies TRBV9 3SpC3/

Rd1.BV10-1.x1 ctctttccctacacgacgctcttccgatctCCTCACTCTGGAGTCTGCTGrCCTCCa/3SpC3/ Amplifies TRBV10-1

Rd1.BV10-2.x1 ctctttccctacacgacgctcttccgatctCCCCTCACTCTGGAGTCAGrCTACCa/3SpC3/ Amplifies TRBV10-2

Rd1.BV10-3.x1 ctctttccctacacgacgctcttccgatctGCTACCAGCTCCCAGACATrCTGTGc/3SpC3/ Amplifies TRBV10-3

Rd1.BV11.x1 ctctttccctacacgacgctcttccgatctAGGCTCAAAGGAGTAGACTCCArCTCTCc/ Amplifies TRBV11-1, 3SpC3/ TRBV11-2,TRBV11-3

Rd1.BV12.x1 ctctttccctacacgacgctcttccgatctATCCAGCCCTCAGAACCCrAGGGAa/3SpC3/ Amplifies TRBV12-3, TRBV12-4,TRBV12-5

Rd1.BV13.x1 ctctttccctacacgacgctcttccgatctGAACTGAACATGAGCTCCTTGGArGCTGGa/ Amplifies TRBV13 3SpC3/

Rd1.BV14.x1 ctctttccctacacgacgctcttccgatctCTACTCTGAAGGTGCAGCCTrGCAGAc/3SpC3/ Amplifies TRBV14

Rd1.BV15.x1 ctctttccctacacgacgctcttccgatctCAGGAGGCCGAACACTTCTTTrCTGCTc/ Amplifies TRBV15 3SpC3/

Rd1.BV16.x1 ctctttccctacacgacgctcttccgatctACGAAGCTTGAGGATTCAGCArGTGTAc/ Amplifies TRBV16 3SpC3/

Rd1.BV18.x1 ctctttccctacacgacgctcttccgatctGCATCCTGAGGATCCAGCArGGTAGc/3SpC3/ Amplifies TRBV18

Rd1.BV19.x1 ctctttccctacacgacgctcttccgatctCCAAAAGAACCCGACAGCTTTCTrATCTCc/ Amplifies TRBV19 3SpC3/

Rd1.BV20.x1 ctctttccctacacgacgctcttccgatctCAGTGCCCATCCTGAAGACArGCAGCc/3SpC3/ Amplifies TRBV20-1

Rd1.BV24.x1 ctctttccctacacgacgctcttccgatctTCTCCCTGTCCCTAGAGTCTGrCCATCa/ Amplifies TRBV24-1 3SpC3/

Rd1.BV25.x1 ctctttccctacacgacgctcttccgatctACAGTCTCCAGAATAAGGACGGArGCATTc/ Amplifies TRBV25-1 3SpC3/

Rd1.BV27.x1 ctctttccctacacgacgctcttccgatctCCCCAACCAGACCTCTCTGTArCTTCTa/ Amplifies TRBV27 3SpC3/

Rd1.BV28.x1 ctctttccctacacgacgctcttccgatctCCAGCACCAACCAGACATCTrATGTAa/3SpC3/ Amplifies TRBV28

Rd1.BV29.x1 ctctttccctacacgacgctcttccgatctGAGCAACATGAGCCCTGAAGArCAGCAa/ Amplifies TRBV29-1 3SpC3/

Rd1.BV30.x1 ctctttccctacacgacgctcttccgatctCTCTCAGCCTCCAGACCCrCAGGAa/3SpC3/ Amplifies TRBV30

Rd2.AC.x1 gtgactggagttcagacgtgtgctcttccgatctTCAGCTGGTACACGGCArGGGTCt/ 3SpC3/

Rd2.BC.x1 gtgactggagttcagacgtgtgctcttccgatctTCTCTGCTTCTGATGGCTCArAACACc/ 3SpC3/

P5 AATGATACGGCGACCACCGAGATCTACAC

P7 CAAGCAGAAGACGGCATACGAGAT

16. dNTPs: Mix of 25 mM dATP, 25 mM dCTP, 25 mM dGTP, and 25 mM dTTP. Store at −20 °C.
17. Rd1.AV.x1–Rd1.BV.x1 primer mix: Mix of 69 rhPCR primers at a concentration of 5 μM each. These primers are specific for the V segments of the human alpha and beta TCR genes and are designed to amplify all productive alpha and beta alleles. The sequences are in Table 1. Store at −20 °C.
18. Rd2.AC.x1–Rd2.BC.x1 primer mix: Mix of two rhPCR primers at a concentration of 5 μM each. These primers are specific for the C segments of the human alpha and beta TCR genes. The sequences are in Table 1. Store at −20 °C.
19. Hot start Taq DNA polymerase: 5 units/μL. Store at −20 °C.
20. AMPure XP beads: Beckman Coulter A63880.
21. Beads Buffer: 2.5 M NaCl, 20% polyethylene glycol (PEG).
22. Magnetic stand for 1.5-mL Eppendorf tube.
23. 10 μM P5 primer: sequence in Table 1.
24. 10 μM P7 primer: sequence in Table 1.
25. 2× Hot start high fidelity PCR master mix. Store at −20 °C.
26. Agilent HS DNA BioAnalyzer.
27. Illumina sequencer.
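When ordering or checking the primers in Table 1, it can help to split each sequence into its parts programmatically. The sketch below is illustrative only: the function names are hypothetical, it reads the `r` prefix as marking a single RNA base and `/3SpC3/` as the 3′ blocking group, and it assumes that the barcode is the 8 nucleotides immediately following the P5 or P7 sequence, which is inferred from the Table 1 sequences rather than stated in the text.

```python
import re

# Common 5' sequences of the barcode primers (P5 and P7 rows of Table 1).
P5 = "AATGATACGGCGACCACCGAGATCTACAC"
P7 = "CAAGCAGAAGACGGCATACGAGAT"

# DNA bases, then one RNA base written as rN, then DNA bases, then the blocker tag.
PRIMER_RE = re.compile(r"^([ACGTacgt]+)r([ACGU])([ACGTacgt]+)/3SpC3/$")


def parse_rhpcr_primer(sequence: str) -> dict:
    """Split a Table 1 primer string into 5' DNA, RNA base, and 3' DNA."""
    match = PRIMER_RE.match(sequence.replace(" ", ""))
    if match is None:
        raise ValueError(f"not in the expected rhPCR notation: {sequence!r}")
    five_prime, rna_base, three_prime = match.groups()
    return {"five_prime": five_prime, "rna_base": rna_base, "three_prime": three_prime}


def barcode_index(sequence: str, adapter: str, index_len: int = 8) -> str:
    """Return the bases directly after the adapter (assumed 8-nt barcode)."""
    five_prime = parse_rhpcr_primer(sequence)["five_prime"]
    if not five_prime.startswith(adapter):
        raise ValueError("primer does not start with the given adapter")
    return five_prime[len(adapter):len(adapter) + index_len]


if __name__ == "__main__":
    p5_idt289 = "AATGATACGGCGACCACCGAGATCTACACCATTGCCTACACTCTTTCCCTACrACGACa/3SpC3/"
    p7_idt001 = "CAAGCAGAAGACGGCATACGAGATCTGATCGTGTGACTGGAGTTCAGArCGTGTa/3SpC3/"
    print(barcode_index(p5_idt289, P5))  # CATTGCCT
    print(barcode_index(p7_idt001, P7))  # CTGATCGT
```

Running the same extraction over all 192 barcode primers is a quick way to confirm that no two P5 (or P7) primers share an index after transcription from the table.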

3 Methods

3.1 Preparation of Single-Cell cDNA Libraries (See Note 2)

1. Dry sort single cells into a 96-well PCR plate. Immediately centrifuge the plate at 800 × g for 1 min at 4 °C and place on dry ice. Store the plate at −80 °C.
2. On ice, dilute 800 units/mL (20 mg/mL) proteinase K 1:10 by mixing 3 μL with 27 μL Prionex.
3. On ice, combine 12 μL 10× NEBNext Cell Lysis Buffer, 6 μL Murine RNase Inhibitor, 24 μL diluted 80 units/mL proteinase K in Prionex, 24 μL NEBNext Single Cell RT Primer Mix, and 174 μL H2O. Dispense 2 μL to each of the wells in the 96-well plate. Vortex vigorously and collect samples by centrifugation.
4. Transfer plate to a thermal cycler. Run protocol: 50 °C for 30 min, 72 °C for 5 min, hold at 4 °C.
5. On ice, combine 120 μL NEBNext Single Cell RT Buffer, 9.6 μL 25 mM AAPV protease inhibitor, 24 μL NEBNext Template Switching Oligo, 48 μL NEBNext Single Cell RT Enzyme Mix, and 38.4 μL H2O. Add 2 μL to each well. Gently vortex and collect by centrifugation.
6. Transfer plate to a thermal cycler. Run protocol: 42 °C for 90 min, 70 °C for 10 min, and hold at 4 °C.
7. On ice, combine 1200 μL NEBNext Single Cell cDNA PCR Master Mix, 48 μL NEBNext Single Cell cDNA PCR Primer, and 672 μL H2O. Add 16 μL to each well. Gently vortex and collect by centrifugation.
8. Transfer plate to a thermal cycler. Run protocol: 98 °C for 45 s, 22 cycles of (98 °C for 10 s, 62 °C for 15 s, 72 °C for 3 min), 72 °C for 5 min, hold at 4 °C.
9. At room temperature, add 21 μL ProNex beads to each well and incubate at room temperature for 10 min. Place the plate in a 96-well magnetic stand, wait for 2 min for beads to collect, then discard supernatants. With the plate remaining in the magnetic stand, wash the beads by adding 100 μL Promega Wash Buffer per well, waiting for 30 s, and discarding the supernatants. Repeat this wash step one time. After air-drying the beads for 10 min, elute with 17 μL Promega Elution Buffer. Transfer 15 μL from each well to a fresh plate. Store at −20 °C.
10. The fragment size distribution of one or more of the cDNA libraries can be measured using the Agilent HS DNA BioAnalyzer or similar instrumentation. Figure 1 shows a typical profile analyzing 1 μL of a 1:10 diluted cDNA library.
11. Determine the DNA concentration of each cDNA library (see Note 3). Normalize the libraries to approximately 0.15 ng/μL (range 0.1–0.2 ng/μL) by combining an aliquot of each library with the appropriate volume of DNA Suspension Buffer in a fresh 96-well plate. Store at −20 °C.
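The normalization in step 11 is a simple dilution calculation that can be scripted for a whole plate. The following is a minimal sketch, not part of the published protocol. The final volume of 20 μL per well is an assumption chosen for illustration, and the function name is hypothetical.

```python
def normalization_volumes(conc_ng_per_ul, target=0.15, final_volume_ul=20.0):
    """Return (library_ul, buffer_ul) that dilute a library to `target` ng/uL.

    final_volume_ul is an assumed working volume, not a value from the protocol.
    """
    if conc_ng_per_ul < target:
        raise ValueError("library is already below the target concentration")
    library_ul = target * final_volume_ul / conc_ng_per_ul
    buffer_ul = final_volume_ul - library_ul
    return round(library_ul, 2), round(buffer_ul, 2)


# Example: a library measured at 1.5 ng/uL
library_ul, buffer_ul = normalization_volumes(1.5)
print(library_ul, buffer_ul)
```

Libraries that fall below the target concentration raise an error so that they can be handled separately, for example by using them undiluted if they are still within the 0.1–0.2 ng/μL range.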

Fig. 1 Bioanalyzer profile of single-cell cDNA library prepared as described in Subheading 3.1. The x-axis is the size of DNA fragments in bp; the y-axis is arbitrary fluorescence units

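[Figure 2 diagram: starting from a single-cell, full-length cDNA library, 69 rhPCR primers target the TRAV and TRBV segments and 2 rhPCR primers target the TRAC and TRBC segments of the V(D)J-C template; samples are then pooled and amplified by P5/P7 PCR.]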

Fig. 2 Diagram of targeted amplification and barcode addition on a template derived from a TCR transcript

3.2 Targeted Amplification of TCR Segments

As depicted in Fig. 2, this protocol achieves targeted amplification of TCR alleles and single-cell barcoding in a single PCR step by exploiting the specificity of rhPCR technology [5]. In this method, each PCR primer includes a single ribo residue and the amplification reaction contains thermostable RNase H2, so a functional primer is generated only when the oligonucleotide is hybridized to its intended target. To avoid the problem of index switching, this protocol uses dual indexing [6, 7] to encode the identity of each cell. For processing a 96-well plate of single cells, there are 96 barcodes for index 1 and 96 barcodes for index 2. The sequences of all primers (Integrated DNA Technologies, Coralville, IA) used in the protocol are provided in Table 1.

1. Per well of a 96-well plate on ice, combine 2 μL (approximately 300 pg) of cDNA library material (from step 11 in Subheading 3.1) with 2 μL of a barcode primer mix containing rhPCR primer P5.IDT***.Rd1x.x1 and rhPCR primer P7.IDT***.Rd2x.x1. For example, well A1 receives P5.IDT289.Rd1x.x1 and P7.IDT001.Rd2x.x1. Vortex and collect by centrifugation.
2. Dilute RNase H2 to 20 mU/μL by mixing 1 μL 2 U/μL RNase H2 Enzyme with 99 μL RNase H2 Dilution Buffer. Keep on ice.
3. On ice, combine 10.8 μL 1 M Tris–HCl, pH 8.4, 18 μL 1 M KCl, 2.9 μL 1 M MgCl2, 11.5 μL 25 mM each dNTP, 7.2 μL Rd1.AV.x1–Rd1.BV.x1 primer mix, 7.2 μL Rd2.AC.x1/Rd2.BC.x1 primer mix, 18 μL diluted 20 mU/μL RNase H2, 28.8 μL 5 units/μL hot start Taq DNA polymerase (see Note 4), and 135.6 μL H2O. Add 2 μL to each of the 96 wells. Gently vortex and collect by centrifugation.
4. Transfer plate to a thermal cycler. Run protocol: 95 °C for 5 min, 18 cycles of (96 °C for 20 s, 60 °C for 6 min), and hold at 4 °C.

5. Pool 2 μL of each sample (192 μL final volume). Store remainder of samples at −20 °C. At room temperature, add 19.2 μL AMPure XP beads and 96 μL Beads Buffer to the pooled sample. Incubate at room temperature for 5 min, then place in magnetic stand. After 2 min to allow beads to collect, transfer supernatant to a fresh tube. Add 6.4 μL AMPure XP beads plus 32 μL Beads Buffer and incubate at room temperature for 5 min. Place tube in magnetic stand, wait for 2 min for beads to collect, then discard supernatant. With tube remaining in the magnetic stand, wash the beads by adding 500 μL freshly prepared 80% ethanol, waiting for 30 s, and discarding the supernatant. Repeat this wash step one time. After air-drying the beads for 5 min, elute by adding 21 μL DNA Suspension Buffer and transferring 20 μL eluate to a fresh tube.
6. Clean up again by adding 16 μL AMPure XP beads and incubating for 5 min at room temperature. Place tube in magnetic stand, wait for 2 min for beads to collect, then discard supernatant. With tube remaining in the magnetic stand, wash the beads by adding 100 μL freshly prepared 80% ethanol, waiting for 30 s, and discarding the supernatant. Repeat this wash step one time. After air-drying the beads for 5 min, elute with 20 μL DNA Suspension Buffer. Store at −20 °C.
7. On ice, combine 5 μL PCR product from step 6, 2.5 μL 10 μM P5 primer, 2.5 μL 10 μM P7 primer, 25 μL hot start high fidelity PCR master mix (see Note 5), and 15 μL H2O in a PCR tube. Gently vortex and collect by centrifugation.
8. Transfer sample to a thermal cycler. Run protocol: 98 °C for 30 s, 12 cycles of (98 °C for 10 s, 62 °C for 2 min), 75 °C for 2 min, and hold at 4 °C.
9. At room temperature, add 40 μL AMPure XP beads and incubate at room temperature for 5 min. Place tube in magnetic stand, wait for 2 min for beads to collect, then discard supernatant. With tube remaining in the magnetic stand, wash the beads by adding 100 μL freshly prepared 80% ethanol, waiting for 30 s, and discarding the supernatant. Repeat this wash step one time. After air-drying the beads for 5 min, elute with 20 μL DNA Suspension Buffer. Store at −20 °C.
10. Use 1 μL of the purified PCR product to measure the fragment size distribution and estimate concentration using the Agilent HS DNA BioAnalyzer or similar instrumentation. Figure 3 shows a typical profile.
11. Sequence the library on an Illumina sequencer using paired-end reads (see Note 6). Read 1 needs to be 250 nt or longer in order to capture CDR3 information.
12. The details of data processing are beyond the scope of this protocol. Identification of CDR3 sequences and of alpha and beta V and J alleles is done using the program MiXCR [8]. Results from the analysis of 96 human CD3+ T cells are shown in Table 2.

Fig. 3 Bioanalyzer profile of targeted TCR library from 96 human CD3+ T cells prepared as described in Subheading 3.2. The x-axis is the size of DNA fragments in bp; the y-axis is arbitrary fluorescence units

Table 2 Results for detecting complete CDR3 sequences for both TCR alpha and TCR beta in single cells from the analysis of 96 human CD3+ T cells

Both alpha and beta    Alpha or beta    Multiple cells or contamination    Failed
70                     7                11                                 8
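The volumes given for the rhPCR master mix in step 3 of Subheading 3.2 can be checked with a few lines of arithmetic before setting up a plate. The sketch below only restates the volumes listed in that step; the variable names are illustrative and the derived per-reaction amounts are calculations, not values quoted from the protocol.

```python
# Volumes (uL) listed in step 3 of Subheading 3.2
master_mix = {
    "1 M Tris-HCl, pH 8.4": 10.8,
    "1 M KCl": 18.0,
    "1 M MgCl2": 2.9,
    "25 mM each dNTP": 11.5,
    "V-segment rhPCR primer mix": 7.2,
    "C-segment rhPCR primer mix": 7.2,
    "RNase H2, 20 mU/uL": 18.0,
    "hot start Taq, 5 units/uL": 28.8,
    "H2O": 135.6,
}

wells = 96
mix_per_well_ul = 2.0

total_ul = round(sum(master_mix.values()), 1)
reactions_covered = total_ul / mix_per_well_ul   # how many 2-uL aliquots the mix holds
overage = reactions_covered / wells - 1          # spare fraction beyond 96 wells

# Each well holds 2 uL cDNA + 2 uL barcode primer mix + 2 uL master mix
reaction_volume_ul = 2.0 + 2.0 + mix_per_well_ul

# Enzyme delivered to each reaction
rnase_h2_mu = round(master_mix["RNase H2, 20 mU/uL"] * 20 / reactions_covered, 3)
taq_units = round(master_mix["hot start Taq, 5 units/uL"] * 5 / reactions_covered, 3)

# Stock dilution in step 2: 1 uL of 2 U/uL (2000 mU/uL) in 100 uL total
rnase_h2_diluted_mu_per_ul = 2000 * 1 / (1 + 99)

print(total_ul, reactions_covered, overage, reaction_volume_ul)
print(rnase_h2_mu, taq_units, rnase_h2_diluted_mu_per_ul)
```

The mix totals 240 μL, enough for 120 aliquots of 2 μL, so the recipe carries 25% overage for a 96-well plate.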

4 Notes

1. The protocol can be adapted to use AMPure XP beads.
2. This protocol is written using the reagents from the NEBNext Single Cell/Low Input RNA Library Prep Kit because this provides the simplest workflow. Smart-seq2 cDNA libraries prepared following the original protocol of Picelli et al. [9], the modified protocol of Trombetta et al. [10], or the SMART-Seq v4 Ultra Low Input RNA Kit for Sequencing from Takara can also be used for the targeted TCR amplification protocol in Subheading 3.2.
3. We have used the Quant-iT High Sensitivity dsDNA Assay Kit (Thermo Fisher Q33120). Any DNA quantification method should be suitable as long as its detection limit is 0.1 ng/μL or lower.
4. We have used Hot Start Taq DNA Polymerase from New England BioLabs (M0495L). Other nonproofreading hot start Taq DNA polymerases should work as well.
5. We have used Q5 Hot Start HiFi PCR Master Mix from New England BioLabs (M0543L). Other hot start high fidelity PCR master mixes should work as well, but may require adjusting the PCR hot start, denaturation, and annealing temperatures and times.
6. We sequence on the MiSeq using the 300-cycle kit. Read lengths are: 248 nt for read 1, 8 nt for index 1, 8 nt for index 2, and 48 nt for read 2. By changing the Illumina-specific segments in the primers to the appropriate sequences, it may be possible to adapt this protocol to sequencers from other vendors.
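Because each cell is encoded by a pair of 8-nt indices, reads can be assigned to wells by requiring that both indices match an expected combination; a read whose two indices belong to different wells is then discarded rather than misassigned. The following is a minimal sketch of that idea. The index sequences and the function name are hypothetical and are not the IDT barcodes used in this protocol, and exact matching without error tolerance is assumed.

```python
def assign_well(index1, index2, expected_pairs):
    """Return the well for a read's index pair, or None if the pair is unexpected."""
    return expected_pairs.get((index1, index2))


# Hypothetical 8-nt index sequences, for illustration only
expected_pairs = {
    ("ACGTACGT", "TTGCAAGC"): "A1",
    ("GGATCCAA", "CATGTCAG"): "A2",
}

print(assign_well("ACGTACGT", "TTGCAAGC", expected_pairs))  # matched pair
print(assign_well("ACGTACGT", "CATGTCAG", expected_pairs))  # index 1 of A1 with index 2 of A2
```

In practice, demultiplexing is normally done with the sequencer vendor's software; the sketch only shows why a unique pair per well lets switched indices be recognized.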

Acknowledgments

We thank Jing Sun and Rosa Allesøe for their help in analyzing the sequencing data. This work was supported by Dana-Farber Cancer Institute.

References

1. Becattini S, Latorre D, Mele F, Foglierini M, De Gregorio C, Cassotta A, Fernandez B, Kelderman S, Schumacher TN, Corti D, Lanzavecchia A, Sallusto F (2015) T cell immunity. Functional heterogeneity of human memory CD4+ T cell clones primed by pathogens or vaccines. Science 347:400–406
2. Gajewski TF, Schreiber H, Fu YX (2013) Innate and adaptive immune cells in the tumor microenvironment. Nat Immunol 14:1014–1022
3. Stubbington MJT, Lönnberg T, Proserpio V, Clare S, Speak AO, Dougan G, Teichmann SA (2016) T cell fate and clonality inference from single-cell transcriptomes. Nat Methods 13:329–332
4. Han A, Glanville J, Hansmann L, Davis MM (2014) Linking T-cell receptor sequence to functional phenotype at the single-cell level. Nat Biotechnol 32:684–692
5. Dobosy JR, Rose SD, Beltz KR, Rupp SM, Powers KM, Behlke MA, Walder JA (2011) RNase H-dependent PCR (rhPCR): improved specificity and single nucleotide polymorphism detection using blocked cleavable primers. BMC Biotechnol 11:80
6. Kircher M, Sawyer S, Meyer M (2012) Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res 40:e3
7. MacConaill LE, Burns RT, Nag A, Coleman HA, Slevin MK, Giorda K, Light M, Lai K, Jarosz M, McNeill MS, Ducar MD, Meyerson M, Thorner AR (2018) Unique, dual-indexed sequencing adapters with UMIs effectively eliminate index cross-talk and significantly improve sensitivity of massively parallel sequencing. BMC Genomics 19:30
8. Bolotin DA, Poslavsky S, Mitrophanov I, Shugay M, Mamedov IZ, Putintseva EV, Chudakov DM (2015) MiXCR: software for comprehensive adaptive immunity profiling. Nat Methods 12:380–381
9. Picelli S, Faridani OR, Björklund ÅK, Winberg G, Sagasser S, Sandberg R (2014) Full-length RNA-seq from single cells using Smart-seq2. Nat Protoc 9:171–181
10. Trombetta JJ, Gennert D, Lu D, Satija R, Shalek AK, Regev A (2014) Preparation of single-cell RNA-Seq libraries for next generation sequencing. Curr Protoc Mol Biol 107:4.22.1–4.22.17

Part III

Single Cell Genomic and Epigenomic Analysis

Chapter 14

Sequencing the Genomes of Single Cells

Veronica Gonzalez-Pena and Charles Gawad

Abstract

Single-cell genome sequencing can detect low-frequency genetic alterations present in complex tissues. However, the experimental procedures are technically challenging. They include dissociation of the tissue, isolation of single cells, whole-genome amplification, sequencing library preparation, and an optional target enrichment. Here we describe how to perform each of these processes to obtain high-quality single-cell genome sequencing data.

Key words Single-cell isolation, Genomics, Sequencing library preparation, WGA, Whole-genome amplification (WGA), Target enrichment, MDA, DOP-PCR

1 Introduction

Sequencing the genomes of small numbers of immune cells requires whole-genome amplification prior to sequencing library preparation. This may be necessary when working with sorted populations of rare immune cell subsets [1], immune cells infiltrating tissues such as tumors [2, 3], or when sequencing the genomes of single normal or malignant immune cells [4].

This process generally requires three steps that are driven by the hypothesis being tested. The first step is isolating the cells of interest. Depending on the number of cells available, as well as the number that will be analyzed, this can be accomplished with manual manipulation, fluorescence-activated cell sorting (FACS), or isolation in a microfluidic device.

The second step is amplifying the genomes of the isolated cells. There are now a number of whole-genome amplification methods available, so the method chosen for a given experiment will depend on whether the question requires the detection of large regions of copy number variation (CNV) or smaller regions of nucleotide variation such as single nucleotide variants (SNV) or indels.

The final step is to select how the amplification products will be interrogated. This, again, should be tailored to the questions that the study is trying to answer. The amplification products generally undergo low-pass whole-genome sequencing for CNV detection. For SNV detection, the samples can undergo whole-genome sequencing, but investigators generally focus on coding regions using target enrichment.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_14, © Springer Science+Business Media, LLC, part of Springer Nature 2019

2 Materials

2.1 Cell Isolation

Most methods of cell isolation from solid tissues combine mechanical and enzymatic dissociation. Specific enzyme combinations and dissociation times should be optimized for viable cell yield and for representation of the expected cell populations or cell type of interest.

1. For cryopreserved suspension cells: ThawStar automated cell-thawing system.
2. For solid tissue cells: Tissue dissociation enzymes or Miltenyi tissue dissociation kit and heated shaker or gentleMACS Dissociator System.
3. PluriStrainer cell strainer (PluriSelect).
4. Washing buffer (PBS with 1% BSA).
5. Fluorescence-based cell counter (Luna FL cell counter or similar).
6. Single cell isolation device (Micromanipulator, FACS sorter, microfluidic device).

2.2 Whole-Genome Amplification (WGA)

The method chosen to amplify the genomes of single cells depends on whether the experiment is designed to detect CNV or SNV. There are a number of different technologies on the market, but we have included the kits that we recommend for each of these applications.

1. For evaluating CNV, we recommend a PCR-based whole-genome amplification method such as DOP-PCR from Sigma or PicoPlex from Takara.
2. For detecting SNV or indels, we recommend using multiple displacement amplification (MDA) with either Qiagen REPLI-g Single-Cell or GE GenomiPhi V2 kit.
3. Agencourt AMPure XP beads for purifying amplified DNA.
4. EB Buffer (Qiagen).
5. Fluorometer for DNA quantification (Qubit or similar).
6. 1% agarose gel or BioAnalyzer for DNA size determination.
7. Appropriate magnetic stand.

2.3 Sequencing Library Preparation

There are a variety of commercially available kits that provide the reagents necessary to perform all steps of library construction (DNA fragmentation, end-preparation, adapter ligation, and library amplification). We recommend using the KAPA HyperPlus Library Preparation kit for its simplified workflow and compatibility with low-input samples.

1. KAPA HyperPlus Library Preparation kit (Roche).
2. We recommend ordering sequencing adapters from Illumina (illumina.com) or Integrated DNA Technologies (www.idtdna.com).
3. Agencourt AMPure XP beads for purifying libraries.
4. Qubit or similar fluorometer for library quantification.
5. 1% agarose gel or BioAnalyzer for DNA size determination.

2.4 Library Enrichment (Optional)

Targeted library enrichment encompasses a variety of methodologies that have been developed to selectively isolate genomic regions of interest, ranging from whole exomes to small panels of single nucleotide polymorphisms (SNPs). This step, while optional, is a cost-effective way to increase depth of coverage and uniformity across the regions of interest.

1. For exome or targeted panel enrichment, we recommend xGen Capture Oligonucleotides and Enrichment Reagents from Integrated DNA Technologies (www.idtdna.com).
2. KAPA HiFi HotStart ReadyMix (Roche).
3. Agencourt AMPure XP beads for purifying libraries.
4. Qubit or similar fluorometer for DNA quantification.
5. 1% agarose gel or BioAnalyzer for DNA size determination.

3 Methods

3.1 Cell Isolation

1. The first step in performing single-cell sequencing is to obtain a single-cell suspension from freshly collected or cryopreserved suspension cells. Please note that special precautions should be taken when thawing cryopreserved cells, as inadequate handling will lead to significant cell death or loss (see Note 1). Alternatively, single cell suspensions can be obtained from solid tissues by performing enzymatic digestion. Tissue should be dissociated per the manufacturer's protocol for that specific tissue (see Notes 2 and 3).
2. Cells then need to be washed and filtered prior to isolation:
(a) Wash cells once in 5 mL of washing buffer and pellet at 300 × g.
(b) Discard supernatant.
(c) Wash three more times in 1 mL of washing buffer, removing 900 μL of the supernatant each time the cells are pelleted.
(d) Resuspend cells after the last wash and filter the solution to remove large cell clumps and debris (see Notes 4 and 5).
(e) Pellet the cells one final time at 150 × g. Remove 900 μL of PBS and resuspend the cells in the remaining 100 μL.
3. Count cells and confirm that viability is greater than 90% (see Note 3).
4. The most commonly used methods for isolating cells are manual manipulation (such as micropipetting), FACS, limiting dilution, microwell dilution, and valve- or droplet-based microfluidic isolation. Detailed descriptions of these methods are beyond the scope of this chapter, as each of them requires specialized equipment and training. For more details, please see these recent review articles [5–7] (see Note 6).
5. Perform single cell isolation using the method that best suits the question you are trying to answer and the number of input cells available (see Note 6).

3.2 Whole-Genome Amplification

Extreme care should be taken when working with reagents that will be included in the whole-genome amplification, as minute quantities of contaminating DNA will be amplified in the reaction. Ultrapure reagents tested for contaminating DNA should be used. In addition, reagents that do not contain oligonucleotides or proteins, such as buffers and plasticware, can be treated with UV light [8].

1. The volume of solution each cell is isolated in should be taken into account, and adjusted as necessary based on the manufacturer's instructions for the WGA method being used.
2. Perform the whole-genome amplification following manufacturer's instructions.
3. Clean up the DNA using AMPure XP beads with a 2:1 beads-to-sample ratio to isolate most fragments that are larger than 125 bp.
4. Add 100 μL of thoroughly resuspended, room-temperature AMPure XP beads to 50 μL of sample and mix well.
5. Incubate at room temperature for 10 min and then place on magnetic stand for 2 min or until solution clears.
6. Discard supernatant and wash beads twice with 200 μL of freshly prepared 80% ethanol.
7. Discard ethanol and allow the beads to air-dry for 5 min.
8. Elute DNA from beads with 52 μL of EB Buffer. Place tube on magnetic stand and transfer 50 μL of the eluted DNA to a new tube while avoiding bead carryover (see Note 7).
9. Quantify DNA using fluorometer and run amplification products on a gel or BioAnalyzer to determine the size (see Note 8).

3.3 Sequencing Library Preparation

1. If the product is of the expected size distribution, dilute the DNA to 3 ng/μL and add 35 μL to the fragmentation reaction. Follow manufacturer's instructions and fragment for 30 min (see Note 9).
2. Follow manufacturer's instructions for library preparation. We recommend using 15 μM adapter concentration and 8–10 cycles of PCR (see Notes 10 and 11).
3. After PCR amplification, clean up the reaction using an equal volume of AMPure XP beads to sample per manufacturer's instructions and elute in 30 μL.
4. Quantify DNA using fluorometer and run amplification products on a gel or BioAnalyzer to determine the size. The ideal library has a Gaussian distribution with most fragments between 200 and 700 bp (see Fig. 1).
5. The libraries are now ready to be sequenced if whole-genome sequencing is being performed. If the library will be enriched for specific genomic regions, continue to Subheading 3.4.

Fig. 1 BioAnalyzer trace of a library with an ideal size distribution obtained using a high sensitivity chip

Fig. 2 Example of a typical BioAnalyzer trace of a postcapture library using a DNA 1000 chip

3.4 Target Enrichment

1. Pool 250 ng of each library (up to 12 libraries, and up to 3 μg total per pool) into a single tube.
2. Follow the xGen Lockdown protocol for hybridization and washing (see Notes 12 and 13).
3. Perform PCR using KAPA HiFi HotStart ReadyMix with the library amplification primers provided in the KAPA HyperPlus Kit (Roche) (see Note 14).
4. After PCR amplification, clean up using an equal volume of AMPure XP beads per manufacturer's instructions and elute the enriched library in 30 μL of EB buffer.
5. Quantify DNA using fluorometer and run amplification products on a gel or BioAnalyzer to determine the size. The ideal post-capture library has a Gaussian distribution that is narrower than before capture, with most fragments now between 200 and 500 bp (see Fig. 2).
6. The target-enriched libraries are now ready to be sequenced (see Note 15).
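The pooling in step 1 of Subheading 3.4 can be planned from the fluorometer readings. The sketch below is one way to do this under the limits stated in that step (250 ng per library, at most 12 libraries, at most 3 μg per pool). The function name and the sample names are hypothetical.

```python
def pooling_volumes(concentrations_ng_per_ul, mass_per_library_ng=250.0,
                    max_libraries=12, max_total_ng=3000.0):
    """Return the volume (uL) of each library that contributes the requested mass."""
    n_libraries = len(concentrations_ng_per_ul)
    if n_libraries > max_libraries:
        raise ValueError(f"{n_libraries} libraries exceed the limit of {max_libraries}")
    if n_libraries * mass_per_library_ng > max_total_ng:
        raise ValueError("total DNA mass exceeds the per-pool limit")
    return {
        name: round(mass_per_library_ng / conc, 2)
        for name, conc in concentrations_ng_per_ul.items()
    }


# Example with two hypothetical libraries (concentrations in ng/uL)
volumes = pooling_volumes({"cell_01": 25.0, "cell_02": 50.0})
print(volumes)
```

With the default limits, 12 libraries at 250 ng each reach exactly the 3 μg ceiling, so the two checks agree for a full pool.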

4 Notes

1. We recommend the ThawSTAR system from Astero Bio, as we have found the automated thawing procedure results in higher cell viability than conventional thawing methods.
2. There are many tissue dissociation protocols available. We recommend consulting protocols from Worthington Biochemical (www.worthington-biochem.com) and Miltenyi Biotec (www.miltenyibiotec.com).
3. We recommend counting and measuring the viability of the cells using a system with a fluorescence-based viability reporter, such as the LUNA-FL counter from Logos Biosystems (http://logosbio.com) or similar.
4. We recommend the 10, 20, or 30 micron PluriStrainer filters (PluriSelect) depending on the diameter of the cells being used.
5. Cells should be examined to ensure that cell clumps and cell debris are removed from the cell suspension, as these can result in carryover contamination and clogging of microfluidic devices.
6. Some cell isolation strategies are not compatible with small numbers of input cells. Manual manipulation, FACS, and dilutional methods allow for small numbers of input cells, while microfluidic-based methods generally require thousands or tens of thousands of input cells.
7. The beads-to-sample volume ratio can be adjusted if the aim is to examine smaller or larger fragments. Elution volume will need to be adjusted based on the yield of the method being used. We recommend eluting in a volume greater than 20 μL in an elution buffer that does not contain EDTA, as EDTA inhibits DNA fragmentation.
8. Correct size distribution will depend on the WGA method utilized. Refer to the manufacturer's instructions in the kit for the correct size distribution.
9. Fragmentation time may need to be adjusted depending on the amplicon size generated during amplification.
10. When using a sequencer with a patterned flow cell (e.g., Illumina HiSeq 4000 or NovaSeq), we recommend ordering adapters with unique dual indices to prevent index hopping; these are available from both of the listed vendors.
11. It is important to use adapters with different indices for each sample if they are going to be sequenced on the same lane or pooled for enrichment.
12. Exome, gene-specific, disease-specific, as well as custom probes and panels can be obtained from IDT DNA.
13. Be sure to use appropriate blocking oligos for adapters used in library preparation.
14. Number of PCR cycles will depend on panel size and number of libraries pooled per capture. If there are fewer than 1000 targets, a second enrichment may be required to get on-target rates >90%. Contact Integrated DNA Technologies for a detailed protocol.
15. Make sure that each library loaded on the same sequencing lane has a unique index so that they can be accurately demultiplexed.
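Notes 11 and 15 both depend on every library in a pool or lane carrying a distinct index, which is easy to verify from a sample sheet before pooling. The following is a minimal sketch; the function name, sample names, and index sequences are hypothetical. For dual-indexed libraries the same check works if each value is a tuple of the two indices.

```python
from collections import Counter


def duplicated_indices(sample_to_index):
    """Return the sorted list of indices assigned to more than one sample."""
    counts = Counter(sample_to_index.values())
    return sorted(index for index, n in counts.items() if n > 1)


# Hypothetical sample sheet: cell_03 reuses the index of cell_01
sample_sheet = {
    "cell_01": "ACGTACGT",
    "cell_02": "GGATCCAA",
    "cell_03": "ACGTACGT",
}
print(duplicated_indices(sample_sheet))
```

An empty list means that all libraries can be demultiplexed unambiguously by index.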

References

1. Genovese P, Schiroli G, Escobar G, Tomaso TD, Firrito C, Calabria A, Moi D, Mazzieri R, Bonini C, Holmes MC, Gregory PD, van der Burg M, Gentner B, Montini E, Lombardo A, Naldini L (2014) Targeted genome editing in human repopulating haematopoietic stem cells. Nature 510(7504):235–240. https://doi.org/10.1038/nature13420
2. De Simone M, Arrigoni A, Rossetti G, Gruarin P, Ranzani V, Politano C, Bonnal RJP, Provasi E, Sarnicola ML, Panzeri I, Moro M, Crosti M, Mazzara S, Vaira V, Bosari S, Palleschi A, Santambrogio L, Bovo G, Zucchini N, Totis M, Gianotti L, Cesana G, Perego RA, Maroni N, Pisani Ceretti A, Opocher E, De Francesco R, Geginat J, Stunnenberg HG, Abrignani S, Pagani M (2016) Transcriptional landscape of human tissue lymphocytes unveils uniqueness of tumor-infiltrating T regulatory cells. Immunity 45(5):1135–1147. https://doi.org/10.1016/j.immuni.2016.10.021
3. Kleppe M, Comen E, Wen HY, Bastian L, Blum B, Rapaport FT, Keller M, Granot Z, Socci N, Viale A, You D, Benezra R, Weigelt B, Brogi E, Berger MF, Reis-Filho JS, Levine RL, Norton L (2015) Somatic mutations in leukocytes infiltrating primary breast cancers. NPJ Breast Cancer 1:15005. https://doi.org/10.1038/npjbcancer.2015.5
4. Gawad C, Koh W, Quake SR (2014) Dissecting the clonal origins of childhood acute lymphoblastic leukemia by single-cell genomics. Proc Natl Acad Sci U S A 111(50):17947–17952. https://doi.org/10.1073/pnas.1420822111
5. Gawad C, Koh W, Quake SR (2016) Single-cell genome sequencing: current state of the science. Nat Rev Genet 17(3):175–188. https://doi.org/10.1038/nrg.2015.16
6. Gross A, Schoendube J, Zimmermann S, Steeb M, Zengerle R, Koltay P (2015) Technologies for single-cell isolation. Int J Mol Sci 16(8):16897–16919. https://doi.org/10.3390/ijms160816897
7. Hu P, Zhang W, Xin H, Deng G (2016) Single cell isolation and analysis. Front Cell Dev Biol 4:116. https://doi.org/10.3389/fcell.2016.00116
8. Tamariz J, Voynarovska K, Prinz M, Caragine T (2006) The application of ultraviolet irradiation to exogenous sources of DNA in plasticware and water for the amplification of low copy number DNA. J Forensic Sci 51(4):790–794. https://doi.org/10.1111/j.1556-4029.2006.00172.x

Chapter 15

Studying DNA Methylation in Single-Cell Format with scBS-seq

Natalia Kunowska

Abstract

DNA methylation at cytosine is a major epigenetic mark, heavily implicated in controlling key cellular processes such as development and differentiation, cellular memory, or carcinogenesis. Bisulfite treatment in conjunction with next generation sequencing has been a powerful tool for studying this modification quantitatively, across the whole genome and at single-nucleotide resolution. This chapter describes a protocol for bisulfite sequencing adapted to a single-cell format that allows for capturing the methylation signal from up to 50% of CpG dinucleotides in each cell.

Key words DNA methylation, Bisulfite sequencing, Epigenetics, Heterogeneity, Single cell, Genome-wide

1 Introduction

In mammals, 70–80% of CpG dinucleotide sequences are methy- lated on the C5 position of cytosine [1]. 5-Methylcytosine (5mC) in the DNA constitutes a true epigenetic mark, which can be inherited through cell divisions and which is pivotal in cell-type definition and stemness and in maintaining cellular memory [2, 3]. It plays a vital role in controlling key biological processes such as gene expression, development and differentiation, genomic imprinting, or silencing the repetitive elements [3, 4]. Conse- quently, DNA methylation (DNAme) patterns were found to be altered during aging and in pathological processes such as carcino- genesis or neurological and autoimmune disorders [2]. Given its importance, multiple methods to study this modifica- tion have been developed, ranging from mass spectroscopy to enzymatic and affinity assays [5]. Among them, bisulfite sequenc- ing (BS-seq) is considered to be the “gold standard.” In BS-seq protocol, the treatment of denatured DNA with bisulfite (BS) salts in highly acidic environment leads to the deamination of cytosine base into uracil, which will then be read as thymine during

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_15, © Springer Science+Business Media, LLC, part of Springer Nature 2019 235 236 Natalia Kunowska

subsequent analysis by sequencing. However, 5mC residues are resistant to deamination. Therefore, by comparing the sequencing results from a sample following bisulfite treatment to untreated control sequences, the modified cytosines can be identified and quantified with a single-nucleotide resolution [6]. Combining BS with next-generation sequencing (NGS) allowed for analyzing DNA methylation on a genome-wide scale [7]. Recent advances in the field have enabled studying DNA meth- ylation in the single-cell format [8–11], adding an important layer of information in chartering the cellular heterogeneity. In the somatic cells the DNAme is a particularly stable epigenetic mark, closely linked to the cell identity [2, 12]. Therefore, single-cell studies of DNAme patterns have a potential to significantly con- tribute to the single-cell community efforts to distinguish between bona-fide cell types and cell states. The protocol described here has been developed by Smallwood et al. [8, 13] who adapted PBAT (post-bisulfite adaptor tagging) approach [14] to study DNAme on the level of the single cells. In PBAT, the BS-treatment is used to both convert the unmethylated cytosines and to fragment the DNA before tagging it with sequenc- ing adapters. This allows to circumvent BS-induced degradation of the tagged libraries, significantly reducing the input requirements. To study DNA methylation in individual cells, the PBATstrategy has been combined with five rounds of preamplification of the mod- ified template with random primers harboring sequencing adapters. The sample is then treated with exonuclease I to remove the excess of the primers and purified. Subsequently, the second adapter oligo is introduced also by random priming. The resulting double-tagged libraries are amplified by PCR. In this step, sequencing indexes can be introduced to allow for multiplexing of the samples (Fig. 1). 
This scBS-seq protocol captures the DNA methylation signal at up to 50% of individual CpGs in the genome [8]. In addition, it can be easily adapted for multiomics approaches based on physical separation of mRNA and gDNA, such as M&T-seq (DNAme and transcriptome) [15] or NMT-seq (nucleosome positioning, DNAme, and transcriptome) [16]. Such combined approaches have great potential to shed light on the relationships between chromatin state and transcriptional output at single-cell resolution.

The major challenge in performing the BS-seq protocol in the single-cell format is the very limited starting material. A mammalian cell normally contains only two copies of each DNA molecule. Therefore, it is critical to limit the loss of material, especially at the preamplification stages of the protocol, as each molecule carries unique information. Moreover, the very low amount of starting material, which makes multiple rounds of amplification necessary, renders the method particularly vulnerable to contamination, making it paramount to work in clean, “pre-PCR” conditions.
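The readout principle described above (unmethylated C is read as T after BS-treatment and amplification, while 5mC stays C) can be illustrated with a minimal in-silico sketch. The 20-base sequence and the three protected positions are taken from the Fig. 1 schematic; the function names are hypothetical and this is not the analysis pipeline used for real scBS-seq data.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sequence as it is read after BS-treatment and amplification.

    Unmethylated C is deaminated to U and read as T; 5mC is protected.
    Positions are 0-based indexes into seq.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, converted_read):
    """Compare a converted read with the untreated reference at every C."""
    calls = {}
    for i, (ref, obs) in enumerate(zip(reference, converted_read)):
        if ref != "C":
            continue
        if obs == "C":
            calls[i] = "methylated"
        elif obs == "T":
            calls[i] = "unmethylated"
    return calls


# Strand shown in Fig. 1; the cytosines at positions 4, 10, and 15 are protected.
reference = "TAGACGAGGCCGTCACGATC"
read = bisulfite_convert(reference, methylated_positions={4, 10, 15})
calls = call_methylation(reference, read)
print(read)
print(calls)
```

In real data the comparison is made against a reference genome by a bisulfite-aware aligner, and both strands have to be considered; the sketch only covers a single strand.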

[Fig. 1 schematic: cell capture and lysis; BS-treatment; preamp oligo tagging (×5); Exo I treatment and purification; oligo 2 tagging; indexing PCR amplification (×10–14); QC, sequencing, and data analysis]

Fig. 1 The single cells are captured and lysed and the isolated DNA is subjected to BS-treatment, which results in both conversion of unmodified cytosines to uracil and fragmentation. The methylated cytosines are protected against conversion. After desulfonation, the DNA is preamplified by random priming with oligos containing the first sequencing adapter. The tagged sample is then treated with exonuclease I and purified, and the second sequencing adapter is introduced by one round of random priming. The resulting libraries are PCR amplified. At this stage, the multiplexing indexes can be introduced. The ready libraries should be QC-ed, including checking the size distribution profiles. The successful samples are then sequenced and analyzed

1.1 Abbreviations

5mC     5-Methylcytosine
BS      Bisulfite
DNAme   DNA methylation
mtDNA   Mitochondrial DNA
PBAT    Post-bisulfite adapter tagging
QC      Quality control
RT      Room temperature
U       Unit

2 Materials

1. DNA and DNase decontamination reagent.
2. Low-bind tubes, plates, and filter tips.
3. RLT Plus Buffer (QIAGEN).
4. Ultrapure H2O, nuclease-free.
5. Unmethylated Lambda DNA or other controls for BS conversion efficiency, as needed.
6. EZ DNA Methylation-Direct™ Kit (Catalog Nos.: D5020, D5021, D5044, and D5045; depending on the size).
7. Optional: PureLink PCR micro kit.
8. High concentration (50,000 U/ml) Klenow 3′→5′ exo− (for example, from Enzymatics).
9. Blue buffer for Klenow reaction, or a corresponding equivalent.
10. Exonuclease I (E. coli) at 20,000 U/ml.
11. AMPure XP SPRI beads.
12. 80% ethanol (vol/vol).
13. KAPA HiFi HotStart Polymerase 1000 U/ml (or an equivalent hot start, high-fidelity polymerase).
14. 5× KAPA HiFi Fidelity buffer, or a corresponding equivalent.
15. EB buffer (QIAGEN).
16. Agilent BioAnalyzer and high-sensitivity DNA Kit or an equivalent setup.
17. Optional: KAPA Library Quantification Kit or other PCR-based library quantification method.
18. Optional: MiSeq Kit.
19. HiSeq or other compatible instrument.
20. Heating block.
21. Magnetic racks for tubes and plates, as appropriate.
22. PCR thermocycler.
23. “Pre-PCR” laminar flow cabinet with UV light.
24. Oligos [17]:

Oligo                   Sequence
Preamp oligo            CTACACGACGCTCTTCCGATCTNNNNNN
Oligo 2                 TGCTGAACCGCTCTTCCGATCTNNNNNN
PE1.0                   AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATC*T
iPCRTag                 CAAGCAGAAGACGGCATACGAGATXXXXXXXXGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATC*T
iTag sequencing primer  AAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTC

N: random nucleotide
*: phosphorothioate
X: 8-base sequencing index

iTag index sequences:

Primer      Tag sequence
iPCRtag 1   AACGTGAT
iPCRtag 2   AAACATCG
iPCRtag 3   ATGCCTAA
iPCRtag 4   AGTGGTCA
iPCRtag 5   ACCACTGT
iPCRtag 6   ACATTGGC
iPCRtag 7   CAGATCTG
iPCRtag 8   CATCAAGT
iPCRtag 9   CGCTGATC
iPCRtag 10  ACAAGCTA
iPCRtag 11  CTGTAGCC
iPCRtag 12  AGTACAAG
iPCRtag 13  AACAACCA
iPCRtag 14  AACCGAGA
iPCRtag 15  AACGCTTA
iPCRtag 16  AAGACGGA
iPCRtag 17  AAGGTACA
iPCRtag 18  ACACAGAA
iPCRtag 19  ACAGCAGA
iPCRtag 20  ACCTCCAA
iPCRtag 21  ACGCTCGA
iPCRtag 22  ACGTATCA
iPCRtag 23  ACTATGCA
iPCRtag 24  AGAGTCAA
iPCRtag 25  AGATCGCA
iPCRtag 26  AGCAGGAA
iPCRtag 27  AGTCACTA
iPCRtag 28  ATCCTGTA
iPCRtag 29  ATTGAGGA
iPCRtag 30  CAACCACA
iPCRtag 31  CAAGACTA
iPCRtag 32  CAATGGAA
iPCRtag 33  CACTTCGA
iPCRtag 34  CAGCGTTA
iPCRtag 35  CATACCAA
iPCRtag 36  CCAGTTCA
iPCRtag 37  CCGAAGTA
iPCRtag 38  CCGTGAGA
iPCRtag 39  CCTCCTGA
iPCRtag 40  CGAACTTA
iPCRtag 41  CGACTGGA
iPCRtag 42  CGCATACA
iPCRtag 43  CTCAATGA
iPCRtag 44  CTGAGCCA
iPCRtag 45  CTGGCATA
iPCRtag 46  GAATCTGA
iPCRtag 47  GACTAGTA
iPCRtag 48  GAGCTGAA

All oligos should be HPLC-purified and resuspended in EB buffer at 100 μM.
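When ordering the indexed primers, the eight X positions of the iPCRTag oligo have to be replaced by one of the tag sequences listed above. The sketch below shows one way to generate the full primer sequences and to compare two indexes. It assumes that the tag is written into the primer exactly as listed (not as its reverse complement), which should be verified against the original protocol [13, 17]; the function names are hypothetical.

```python
# iPCRTag oligo from the Materials table; "*" marks the phosphorothioate bond.
IPCRTAG_TEMPLATE = (
    "CAAGCAGAAGACGGCATACGAGAT"
    "XXXXXXXX"
    "GAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATC*T"
)


def build_ipcrtag(index):
    """Insert an 8-base index into the iPCRTag template (assumed orientation)."""
    if len(index) != 8 or set(index) - set("ACGT"):
        raise ValueError(f"expected an 8-base A/C/G/T index, got {index!r}")
    return IPCRTAG_TEMPLATE.replace("XXXXXXXX", index)


def hamming(a, b):
    """Number of mismatching positions between two equal-length indexes."""
    if len(a) != len(b):
        raise ValueError("indexes must have the same length")
    return sum(x != y for x, y in zip(a, b))


# First two entries of the index table.
tags = {"iPCRtag 1": "AACGTGAT", "iPCRtag 2": "AAACATCG"}
primers = {name: build_ipcrtag(tag) for name, tag in tags.items()}
print(primers["iPCRtag 1"])
print(hamming(tags["iPCRtag 1"], tags["iPCRtag 2"]))
```

Running the Hamming comparison over all index pairs that are pooled on one lane is a quick way to confirm that the chosen indexes remain distinguishable after a sequencing error.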

3 Methods

3.1 General Overview of the Protocol

Before starting, check that you have all the necessary materials and that the experimental setup is correct (see General Notes).

Protocol step                                               Time needed                                          Safe stopping point after step completion
1. Cell lysis                                               Depending on the cell number and collection method   Yes
2. Bisulfite conversion                                     3.5–4 h                                              Not recommended, proceed immediately to the next step
3. Desulfonation and purification of BS-converted samples   1.5–2 h                                              Not recommended, proceed immediately to the next step
4. Preamp oligo tagging                                     1.5 h                                                Yes
5. Preamplification                                         4 h                                                  Yes, after each full round of first-strand synthesis
6. Exonuclease I treatment                                  1 h 15 min                                           Yes
7. Purification of exonuclease I-treated samples            45 min to 1 h                                        Yes
8. Oligo 2 tagging                                          2 h                                                  Yes
9. Purification of double-tagged libraries                  45 min to 1 h                                        Yes
10. Library amplification                                   1 h                                                  Yes
11. Amplified library purification                          45 min to 1 h                                        Yes

3.2 Cell Lysis

FACS-sort single cells of the population of interest into plates containing 2.5 μl of RLT Plus Buffer. At this stage, cells can be stored at −80 °C for up to 1 month (see Notes 1 and 2).

3.3 Bisulfite Conversion

1. Prepare the CT Conversion Reagent following the manufacturer’s instructions. The exact volumes depend on the size of the EZ DNA Methylation-Direct™ Kit used; please refer to the manual provided with the kit. The volumes listed here correspond to the smallest kit size (Catalog No. D5020), which is sufficient to BS-convert 20 samples (using half of the volumes recommended in the kit). Add 790 μl of M-Solubilization Buffer and 300 μl of M-Dilution Buffer and vortex until all visible particles are dissolved. Add 1.6 ml of M-Reaction Buffer and vortex for an additional 4 min (see Note 3).
2. Add 7.5 μl of H2O to each sample. It is recommended to add a control for BS conversion efficiency, such as 60 fg of unmethylated Lambda DNA (see Note 4).
3. Add 65 μl of the prepared CT Conversion Reagent solution. Incubate in the thermocycler with the following program:

Temperature  Time
98 °C        8 min
65 °C        3 h
8 °C         HOLD

4. Proceed immediately to the next step: desulfonation and purification of BS-converted samples (see Note 5).

3.4 Desulfonation and Purification of BS-Converted Samples

1. Prepare a deep-well plate with 5 μl of MagBinding beads and 200 μl of M-binding buffer in each well (see Note 6). Preheat an appropriate heating block to 55 °C.
2. Transfer the whole volume (75 μl) of the BS-treated samples from Subheading 3.3, step 3 to the deep-well plate containing MagBinding beads and M-binding buffer.
3. To improve sample recovery, rinse the wells of the BS-conversion plate with 100 μl of M-binding buffer, and combine this with the MagBinding beads and M-binding buffer mixture. Mix by vortexing.
4. Incubate at RT for 5 min to allow the BS-treated DNA to bind to the magnetic beads.
5. Place on the magnet and wait until the solution becomes clear (3–6 min). Remove and discard the supernatant.
6. Remove the plate from the magnet and resuspend the beads in 100 μl of M-Desulfonation buffer.
7. Incubate the plate at RT for 15 min (see Note 7).
8. Place the plate on the magnet and allow the solution to clear (3–6 min). Discard the supernatant.
9. Again, remove the plate from the magnet and add 200 μl of M-wash buffer to the beads. Resuspend well.
10. Place the plate on the magnet and allow the solution to clear (3–6 min). Discard the supernatant.
11. Repeat the wash with the M-wash buffer (Subheading 3.4, steps 9 and 10).
12. Dry the beads by incubating the plate for 15 min on the heating block preheated to 55 °C. Proceed immediately to the next step: preamp oligo tagging.

3.5 Preamp Oligo Tagging

1. To elute the DNA, resuspend the beads in 31 μl of 1× blue buffer. Incubate at 55 °C for 4 min. Spin down briefly to collect the liquid at the bottom of the wells.
2. Place the plate on the magnet and allow the solution to clear (3–6 min).
3. Transfer 30 μl of the supernatant to a new PCR plate containing 9 μl of the following mix in each well:

Reagent       Stock concentration  Volume  Final concentration
Blue buffer   10×                  1 μl    1×
dNTP mix      10 mM                1.6 μl  0.4 mM
Preamp oligo  10 μM                1.6 μl  0.4 μM
H2O           –                    4.8 μl  –

4. Incubate at 65 °C for 3 min.
5. Cool immediately to 4 °C by placing the PCR plate on an aluminum rack on ice.
6. Add 1 μl (50 units) of high concentration Klenow 3′→5′ exo− and mix well. Spin down briefly (see Note 8).
7. Incubate in the thermocycler with the following program:

Temperature  Time        Ramp speed
4 °C         5 min       –
4 → 38 °C    8 min 30 s  4 °C/1 min
37 °C        30 min      –
8 °C         HOLD        –

The first-strand synthesis product can be safely stored overnight at 4 °C or for up to 1 month at −20 °C.
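The final concentrations in the tagging mix table follow from the usual dilution relation (stock concentration × added volume = final concentration × total volume). The sketch below reproduces them under the assumption that the total reaction volume is 40 μl (30 μl eluate, 9 μl mix, and 1 μl Klenow), and also checks that the listed ramp time matches the ramp speed. The helper names are illustrative only.

```python
def final_concentration(stock, added_ul, total_ul):
    """Dilution relation C1 * V1 = C2 * V2, solved for C2 (same unit as stock)."""
    return stock * added_ul / total_ul


def ramp_minutes(start_c, end_c, rate_c_per_min):
    """Time needed to ramp between two temperatures at a constant rate."""
    return (end_c - start_c) / rate_c_per_min


# Assumed total volume: 30 ul eluate + 9 ul tagging mix + 1 ul Klenow.
TOTAL_UL = 30 + 9 + 1

dntp_mM = final_concentration(10, 1.6, TOTAL_UL)    # 10 mM stock, 1.6 ul
oligo_uM = final_concentration(10, 1.6, TOTAL_UL)   # 10 uM stock, 1.6 ul
klenow_units = 50_000 / 1000 * 1                    # 50,000 U/ml = 50 U/ul, 1 ul added
ramp = ramp_minutes(4, 38, 4)                       # 4 -> 38 C at 4 C/min

print(f"dNTP {dntp_mM:.2f} mM, preamp oligo {oligo_uM:.2f} uM, "
      f"Klenow {klenow_units:.0f} U, ramp {ramp} min")
```

The same helper can be reused for the other reaction mixes in this protocol by changing the total volume.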

3.6 Preamplification

1. Prepare the preamplification mix (2.5 μl per sample). For each sample add (see Note 9):

Reagent              Stock concentration  Volume   Final concentration
Blue buffer          10×                  0.25 μl  1×
dNTP mix             10 mM                0.1 μl   0.4 mM
Preamp oligo         10 μM                1 μl     4 μM
Klenow 3′→5′ exo−    50,000 U/ml          0.5 μl   10 U/μl
H2O                  –                    0.65 μl  –

2. Denature the preamp oligo tagging reaction mixture from Subheading 3.5, step 7 at 95 °C for 45 s.
3. Cool immediately to 4 °C by placing the PCR plate on an aluminum rack on ice.
4. Add 2.5 μl of the preamplification mix from step 1 to each sample and mix well. Spin down briefly.
5. Incubate in the thermocycler with the following program:

Temperature  Time        Ramp speed
4 °C         5 min       –
4 → 38 °C    8 min 30 s  4 °C/1 min
37 °C        30 min      –
8 °C         HOLD        –

6. Repeat Subheading 3.6, steps 1–5 three more times to complete the five rounds of first-strand synthesis. The final volume should be 50 μl. The preamplification reaction can be safely stopped after each full cycle and stored overnight at 4 °C or for up to 1 month at −20 °C.
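For plate-scale experiments the per-sample volumes have to be scaled into a master mix with some excess (Note 9). The sketch below does this for the preamplification mix and also confirms the expected 50 μl final volume. The 10% excess is an assumed value, not one given in the protocol, and the sketch assumes that the tagging reaction counts as the first of the five rounds, so that 2.5 μl of mix is added four times.

```python
# Per-sample volumes (ul) of the preamplification mix, as listed in the table.
PREAMP_MIX_UL = {
    "Blue buffer (10x)": 0.25,
    "dNTP mix (10 mM)": 0.1,
    "Preamp oligo (10 uM)": 1.0,
    "Klenow exo- (50,000 U/ml)": 0.5,
    "H2O": 0.65,
}


def master_mix(per_sample, n_samples, excess=0.10):
    """Scale per-sample volumes to n samples plus a fractional excess (assumed 10%)."""
    factor = n_samples * (1 + excess)
    return {name: round(vol * factor, 2) for name, vol in per_sample.items()}


def volume_after_rounds(tagging_volume_ul=40.0, mix_ul=2.5, total_rounds=5):
    """Sample volume after all rounds; round 1 is the tagging reaction itself."""
    return tagging_volume_ul + mix_ul * (total_rounds - 1)


mix_96 = master_mix(PREAMP_MIX_UL, n_samples=96)
for reagent, volume in mix_96.items():
    print(f"{reagent:28s}{volume:8.2f} ul")
print("volume after five rounds:", volume_after_rounds(), "ul")
```

Because the mix contains the enzyme, it is usually safer to prepare it fresh for each round rather than scaling once for all four additions.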

3.7 Exonuclease I Treatment

1. To remove the excess preamp oligos, add 2 μl (corresponding to 40 U) of E. coli exonuclease I and 48 μl of H2O (final volume 100 μl). Mix well.
2. Incubate at 37 °C for 1 h in a thermocycler with the lid heated to 50 °C.

3.8 Purification of Exonuclease I-Treated Samples

1. To each sample, add 80 μl of SPRI beads, equilibrated to RT and well resuspended. Mix thoroughly (see Note 10).
2. Incubate at RT for 10 min to allow the DNA to bind to the beads.
3. Place the samples on the magnet and incubate until the solution clears (2–5 min). Remove and discard the supernatant.
4. While keeping the samples on the magnet, wash the beads with 200 μl of 80% ethanol. Wait until the beads settle completely (1–2 min) and remove the supernatant.
5. Repeat the ethanol wash (Subheading 3.8, step 4), trying to remove as much residual ethanol as possible without disturbing the beads on the magnet.
6. Dry the SPRI beads at RT for 5–10 min. Make sure not to overdry the beads, as this will impair sample recovery. As soon as the beads are dry, proceed immediately to the elution (see Note 11).
7. Add 41 μl of 1× blue buffer to the dried beads and resuspend well.
8. Incubate for 10 min at RT.
9. Place on the magnet and wait for the liquid to clear (1–3 min). Collect 40 μl of the supernatant containing the DNA. The samples can be safely stored for up to 1 month at −20 °C.
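Since every purification in this protocol adds 80 μl of beads to a 100 μl sample, the bead-to-sample ratio is 0.8×. The short sketch below keeps track of the reaction volumes stated in the steps above and shows how the ratio shifts if part of the sample has evaporated; the 90 μl figure is a hypothetical example, not a measured value.

```python
def spri_ratio(bead_ul, sample_ul):
    """Bead-to-sample volume ratio of an SPRI clean-up."""
    return bead_ul / sample_ul


def beads_for(sample_ul, target_ratio=0.8):
    """Bead volume needed to reach the target ratio for a given sample volume."""
    return sample_ul * target_ratio


# Volumes taken from the protocol steps.
bs_reaction_ul = 2.5 + 7.5 + 65    # lysate + H2O + CT Conversion Reagent
exo_reaction_ul = 50 + 2 + 48      # preamplified sample + exonuclease I + H2O

nominal = spri_ratio(80, exo_reaction_ul)
# Hypothetical case: 10 ul lost to evaporation before the beads are added.
evaporated = spri_ratio(80, 90)

print(bs_reaction_ul, exo_reaction_ul)
print(f"nominal ratio {nominal:.2f}x, after evaporation {evaporated:.2f}x")
print(f"beads needed for 90 ul at 0.8x: {beads_for(90):.0f} ul")
```

A ratio that drifts above 0.8× retains shorter fragments, which is one of the causes of the shifted size profiles discussed in the troubleshooting notes.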

3.9 Oligo 2 Tagging

1. Prepare the oligo 2 tagging mix (9 μl per sample). For each sample add (see Note 12):

Reagent      Stock concentration  Volume  Final concentration
Blue buffer  10×                  1 μl    1×
dNTP mix     10 mM                2 μl    0.4 mM
Oligo 2      10 μM                2 μl    0.4 μM
H2O          –                    4 μl    –

2. Add 9 μl of the oligo 2 tagging mix (from step 1) to each SPRI-purified sample from Subheading 3.8, step 9. Mix well.
3. Denature the resulting reaction mixture at 95 °C for 45 s.
4. Cool immediately to 4 °C by placing the PCR plate on an aluminum rack on ice.
5. Add 1 μl (50 units) of high concentration Klenow 3′→5′ exo− and mix well. Spin down briefly.
6. Incubate in the thermocycler with the following program:

Temperature  Time        Ramp speed
4 °C         5 min       –
4 → 38 °C    8 min 30 s  4 °C/1 min
37 °C        90 min      –
8 °C         HOLD        –

The samples after oligo 2 tagging can be safely stored for up to 1 month at −20 °C.

3.10 Purification of Double-Tagged Libraries

1. Dilute the libraries by adding 50 μl of ultrapure H2O. Add 80 μl of well-resuspended SPRI beads that have been equilibrated to RT.
2. Perform SPRI bead purification in a manner analogous to Subheading 3.8, steps 2–6.
3. As soon as the beads are dry, immediately elute the DNA by adding 41 μl of 1× KAPA HiFi Fidelity buffer.
4. Incubate for 10 min at RT.
5. Place on the magnet and wait for the liquid to clear (1–3 min). Collect 40 μl of the supernatant containing the DNA. The samples after purification can be safely stored for up to 1 month at −20 °C.

3.11 Library Amplification

1. Prepare the PCR mix (10 μl per sample). For each sample add (see Note 13):

Reagent                       Stock concentration  Volume  Final concentration
KAPA HiFi Fidelity buffer     5×                   2 μl    1×
dNTP mix                      10 mM                1 μl    0.2 mM
PE1.0                         10 μM                2 μl    0.4 μM
iPCRTag                       10 μM                2 μl    0.4 μM
KAPA HiFi HotStart polymerase 1000 U/ml            1 μl    20 U/ml
H2O                           –                    2 μl    –

2. Mix the samples well and spin down.
3. Amplify in a thermocycler with the following program:

Temperature  Time        Number of cycles
95 °C        2 min       1
94 °C        1 min 20 s  10–14 (to be determined for the cell type studied)
65 °C        30 s
72 °C        30 s
72 °C        3 min       1
8 °C         HOLD        1
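When the cycle number is being optimized for a new cell type, it can help to know how long each variant of the program holds at temperature. The sketch below sums the hold times of the program above; it deliberately ignores ramping between temperatures and the final 8 °C hold, so actual run times on a given thermocycler will be longer.

```python
def program_seconds(cycles):
    """Sum of hold times (s) for the library amplification program."""
    initial_denaturation = 2 * 60     # 95 C, 2 min
    per_cycle = 80 + 30 + 30          # 94 C 1 min 20 s, 65 C 30 s, 72 C 30 s
    final_extension = 3 * 60          # 72 C, 3 min
    return initial_denaturation + cycles * per_cycle + final_extension


for n_cycles in (10, 12, 14):
    total = program_seconds(n_cycles)
    print(f"{n_cycles} cycles: {total} s (about {total / 60:.0f} min of hold time)")
```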

3.12 Amplified Library Purification

1. Dilute the PCR-amplified libraries by adding 50 μl of ultrapure H2O. Add 80 μl of well-resuspended SPRI beads that have been equilibrated to RT.
2. Perform SPRI bead purification in a manner analogous to Subheading 3.8, steps 2–6.
3. As soon as the beads are dry, immediately elute the DNA by adding 20 μl of EB buffer.
4. Incubate for 10 min at RT.
5. Place on the magnet and wait for the liquid to clear (1–3 min). Transfer 18 μl of the supernatant containing the DNA to a new, clean plate. The ready, purified libraries are very stable and can be stored at −20 °C for […] year.

3.13 Quality Control and Sequencing

The quality of the ready libraries should be assessed by checking the size distribution profiles. Run 1 μl of each sample on an Agilent BioAnalyzer using the high-sensitivity DNA chip or an equivalent setup. A smooth profile with a single wide peak between 200 and 700 bp is expected. The concentration should be at least 2 nM. Optionally, the final library concentration can also be examined using PCR-based library quantification kits, such as the KAPA Library Quantification Kit. Again, the concentration should be between 2 and 10 nM.

To assess the BS-conversion rate, the presence of contamination, and the mapping efficiency of the samples before performing full-scale sequencing, an additional QC step is highly recommended. The pooled libraries can be sequenced in a low-coverage 50 bp MiSeq run using the custom iTag primer. For the final sequencing on HiSeq, a paired-end 2 × 100 bp run is recommended, aiming for a depth of at least 20 million mapped reads (optimally 50–60 million) (see Note 14). For basic troubleshooting, see Notes 15–20.
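If only a mass concentration is available (for example from a fluorometric assay), it can be converted to molarity for comparison with the 2–10 nM range given above. The sketch uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the example concentration and mean fragment length are hypothetical values.

```python
def library_nM(conc_ng_per_ul, mean_fragment_bp, da_per_bp=660.0):
    """Convert a dsDNA mass concentration to nM (approx. 660 g/mol per bp)."""
    return conc_ng_per_ul * 1e6 / (da_per_bp * mean_fragment_bp)


# Hypothetical library: 0.66 ng/ul with a mean fragment size of 500 bp.
molarity = library_nM(0.66, 500)
print(f"{molarity:.2f} nM, within 2-10 nM: {2 <= round(molarity, 6) <= 10}")
```

The result depends strongly on the mean fragment length, so the value read from the size distribution profile should be used rather than a nominal insert size.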

4 Notes

General notes:

Due to its high sensitivity, scBS-seq is very susceptible to contamination. Therefore, negative controls should always be included for cell collection and lysis (empty wells with lysis buffer only). Optimally, negative controls for each stage of the protocol (ultrapure H2O or EB buffer) should be used. This is especially important when significant contamination has been detected. The negative controls for each step then make it possible to identify the source of the contamination. Additionally, a positive control (10–100 pg of purified DNA) is recommended to assess the efficiency of cell collection and lysis.

To limit contamination, the protocol should be performed in a designated pre-PCR laminar flow hood, which has been UV-irradiated and treated with DNA and DNase decontaminating solution prior to use. Use ultrapure reagents, UV-treated when possible, in small aliquots.

To obtain successful libraries and to ensure a good yield, using low DNA binding tubes, plates, and tips is essential.

For small sample numbers, PCR tubes and Eppendorf tubes can be used instead of PCR and deep-well plates.

All magnetic bead purification steps can be automated using liquid handlers such as the PerkinElmer Zephyr G3 NGS workstation or the Agilent Bravo.

1. Other methods, such as mouth pipetting, can be used at this step, depending on the cell type and cell numbers.
2. Similarly, other cell lysis buffers compatible with BS-conversion can be used, as appropriate for the cell type studied. At this stage, poly-A mRNAs can be isolated for single-cell multiomics approaches such as M&T-seq or NMT-seq.
3. If the CT Conversion Reagent particles persist, heat the solution to 50 °C. The CT Conversion Reagent is light sensitive; therefore, exposure to light should be minimized. Prepared CT Conversion Reagent solution can be stored for up to 1 month at −20 °C. Stored CT Conversion Reagent solution must be warmed to 37 °C and vortexed thoroughly prior to use.
4. At this step, 60 fg of unmethylated phage Lambda DNA is spiked into the buffer as the control for the BS conversion efficiency. Alternatively, other controls can be used, such as synthetic methylated and unmethylated oligonucleotides.
5. After the BS conversion, the DNA is single stranded and therefore unstable. It is recommended to keep the incubation at 8 °C to a minimum and proceed to the next steps.
6. For small numbers of samples, tubes and column-based purification kits (such as the PureLink PCR micro kit) can be used.
7. Important: samples must not remain in the M-Desulfonation buffer for more than 20–25 min.
8. To obtain a good yield, it is essential to use highly concentrated Klenow 3′→5′ exo−.
9. Remember to prepare an excess of the preamplification mix to account for pipetting errors.
10. Due to the large volume of the combined sample, be extra careful when mixing the exonuclease I digestion product with the SPRI beads. Make sure that the beads are well resuspended and have been equilibrated to RT before adding them to the sample.
11. It is important to observe the SPRI beads while drying to make sure that they do not overdry. The beads will turn from dark and glistening to rusty brown and matte. As soon as the first cracks start to appear on the surface, add the elution mix immediately. Keep in mind that, due to manual processing times, some samples may dry sooner than others.
12. Remember to prepare an excess of the oligo 2 tagging mix to account for pipetting errors.
13. When preparing a master mix for multiple samples, remember to prepare an excess of the PCR mix to account for pipetting errors.
14. During the sequencing, a PhiX spike-in control needs to be used, as the BS-treated samples have an unbalanced base composition, with most of the cytosines having been converted to thymines.
15. The rate of conversion of the unmethylated Lambda DNA (if added) can be used to assess the efficiency of BS conversion. Alternatively, for mammalian cells, mtDNA can be used for this purpose, as it is practically unmethylated in most cell types.
16. The expected BS-conversion efficiency is >95%. Low conversion efficiencies suggest problems with the bisulfite reagent. Prepare a fresh aliquot and repeat the experiment.
17. Though it is not unusual for the negative controls to produce libraries that can be detected in the BioAnalyzer QC step, the mapping efficiency of the “empty” samples should be much lower (about 8–10 times) than that of the cell-containing wells and should be below 5%. If contamination is detected, a series of negative controls should be used to pinpoint its source. The contaminated reagents should be discarded, and the experimental conditions and the “cleanness” of the setup should be reassessed.
18. A high ratio of PCR duplicates suggests that too many PCR cycles were used during the final library amplification step. Repeat the experiment using fewer cycles.
19. Conversely, a low yield might suggest an insufficient number of PCR cycles. The libraries might be “rescued” by performing an additional 2–5 PCR cycles, and a higher number of cycles should be used for subsequent experiments.
20. A library with a size distribution profile shifted towards lower molecular sizes, with visible sharp peaks and low mapping efficiency, can result from samples with very low DNA content, such as empty wells or wells in which cell lysis was inefficient. Such experiments need to be repeated, and a positive control (purified DNA) should be used in parallel to check the efficiency of cell collection and lysis. However, libraries with such profiles might also result from too high a SPRI beads-to-sample ratio during the final library purification. This might be due to sample evaporation or imprecise pipetting of the viscous SPRI bead solution. In such cases, an additional round of 0.8× SPRI bead purification is recommended.
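The numerical criteria in Notes 16 and 17 can be turned into a simple per-sample check after the low-coverage QC run. The sketch below is only an illustration of those two thresholds (conversion above 95%, negative-control mapping below 5% and at least about eightfold lower than in cell-containing wells). The function name and the input values are hypothetical, and the rates are assumed to be fractions between 0 and 1.

```python
def qc_flags(conversion_rate, cell_mapping_rate, negative_mapping_rate):
    """Return warnings based on the thresholds given in Notes 16 and 17."""
    flags = []
    if conversion_rate <= 0.95:
        flags.append("low BS conversion: prepare fresh CT Conversion Reagent (Note 16)")
    # Negative controls should map below 5% and about 8-10 times lower than cells.
    if negative_mapping_rate >= 0.05 or cell_mapping_rate < 8 * negative_mapping_rate:
        flags.append("possible contamination: run negative controls per step (Note 17)")
    return flags


# Hypothetical QC values for a good and a problematic batch.
good_batch = qc_flags(conversion_rate=0.99, cell_mapping_rate=0.40, negative_mapping_rate=0.03)
bad_batch = qc_flags(conversion_rate=0.90, cell_mapping_rate=0.40, negative_mapping_rate=0.10)
print(good_batch)
print(bad_batch)
```

Duplicate rate and yield (Notes 18 and 19) are not included because the notes give no numerical cut-offs for them.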

References

1. Jabbari K, Bernardi G (2004) Cytosine methylation and CpG, TpG (CpA) and TpA frequencies. Gene 333:143–149
2. Fernandez AF, Assenov Y, Martin-Subero JI et al (2012) A DNA methylation fingerprint of 1628 human samples. Genome Res 22:407–419
3. Bird A (2002) DNA methylation patterns and epigenetic memory. Genes Dev 16:6–21
4. Law JA, Jacobsen SE (2010) Establishing, maintaining and modifying DNA methylation patterns in plants and animals. Nat Rev Genet 11:204–220
5. Kurdyukov S, Bullock M (2016) DNA methylation analysis: choosing the right method. Biology 5. https://doi.org/10.3390/biology5010003
6. Frommer M, McDonald LE, Millar DS et al (1992) A genomic sequencing protocol that yields a positive display of 5-methylcytosine residues in individual DNA strands. Proc Natl Acad Sci U S A 89:1827–1831
7. Lister R, Pelizzola M, Dowen RH et al (2009) Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462:315–322
8. Smallwood SA, Lee HJ, Angermueller C et al (2014) Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nat Methods 11:817–820
9. Farlik M, Sheffield NC, Nuzzo A et al (2015) Single-cell DNA methylome sequencing and bioinformatic inference of epigenomic cell-state dynamics. Cell Rep 10:1386–1397
10. Gravina S, Dong X, Yu B, Vijg J (2016) Single-cell genome-wide bisulfite sequencing uncovers extensive heterogeneity in the mouse liver methylome. Genome Biol 17:150
11. Guo H, Zhu P, Wu X et al (2013) Single-cell methylome landscapes of mouse embryonic stem cells and early embryos analyzed using reduced representation bisulfite sequencing. Genome Res 23:2126–2135
12. Bogdanović O, Lister R (2017) DNA methylation and the preservation of cell identity. Curr Opin Genet Dev 46:9–14
13. Clark SJ, Smallwood SA, Lee HJ et al (2017) Genome-wide base-resolution mapping of DNA methylation in single cells using single-cell bisulfite sequencing (scBS-seq). Nat Protoc 12:534–547
14. Miura F, Enomoto Y, Dairiki R, Ito T (2012) Amplification-free whole-genome bisulfite sequencing by post-bisulfite adaptor tagging. Nucleic Acids Res 40:e136
15. Angermueller C, Clark SJ, Lee HJ et al (2016) Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat Methods 13:229–232
16. Clark SJ, Argelaguet R, Kapourani C-A et al (2018) scNMT-seq enables joint profiling of chromatin accessibility DNA methylation and transcription in single cells. Nat Commun 9:781
17. Quail MA, Otto TD, Gu Y et al (2011) Optimal enzymes for amplifying sequencing libraries. Nat Methods 9:10–11

Chapter 16

Single-Cell 5fC Sequencing

Chenxu Zhu, Yun Gao, Jinying Peng, Fuchou Tang, and Chengqi Yi

Abstract

Active DNA demethylation plays important roles in the epigenetic reprogramming of developmental processes. 5-formylcytosine (5fC) is produced during active demethylation of 5-methylcytosine (5mC). Here, we describe a technique called CLEVER-seq (Chemical-labeling-enabled C-to-T conversion sequencing), which detects the whole genome 5fC distribution at single-base and single-cell resolution. CLEVER-seq is suitable for the analysis of precious samples such as early embryos and laser microdissection captured samples.

Key words CLEVER-seq, 5-Formylcytosine, Bisulfite-free sequencing, Chemical labeling, Single cell

1 Introduction

1.1 Methods to Analyze 5-Formylcytosine

5-Formylcytosine (5fC) is produced during active demethylation of 5-methylcytosine (5mC): 5mC is sequentially oxidized by ten-eleven translocation (TET) family proteins to give 5-hydroxymethylcytosine (5hmC), 5fC, and 5-carboxylcytosine (5caC); the latter two can be reverted to cytosine by thymine DNA glycosylase (TDG)-mediated DNA base-excision repair [1–6]. Active DNA demethylation has been shown to play crucial roles in multiple biological processes, including embryo development, neurogenesis, carcinogenesis, and stem cell pluripotency and differentiation [7–9].

Unlike 5mC and 5hmC, 5fC cannot be distinguished from 5caC and unmodified cytosine by bisulfite treatment [10]. Various modified bisulfite-dependent methods have been developed to profile 5fC at single-base resolution [11]. fCAB-seq (chemical-assisted bisulfite sequencing for 5fC) uses EtONH2 to react specifically with 5fC and make it resistant to bisulfite treatment. After bisulfite conversion, 5fC can be identified by subtracting the traditional BS-seq signal [12, 13]. redBS-seq (reduced bisulfite sequencing) converts 5fC to 5hmC by NaBH4 treatment; the converted 5fC thus stays intact during the bisulfite treatment, so the 5fC signal can be resolved by subtracting the traditional BS-seq signal [14]. In MAB-seq (M.SssI methylase-assisted bisulfite sequencing), M.SssI is used to convert unmodified cytosine (C) into 5mC in a CpG context in vitro, and the methylated DNA is then subjected to bisulfite treatment. Hence, 5fC and 5caC are identified together, without discrimination [15–18]. Meanwhile, bisulfite-free 5fC profiling methods have also been developed: 5fC-targeted antibody-based [19] or chemical labeling-based [12, 20, 21] enrichment of target DNA for sequencing, and conversion of 5fC to 5hmC followed by detection with a 5hmC-sensitive restriction enzyme [22]. We have also developed a bisulfite-free, single-base resolution 5fC sequencing method, fC-CET [23]. However, these methods are all limited to the analysis of bulk samples. scMAB-seq and liMAB-seq can take a single cell or a limited number of cells as starting material [24] but identify 5fC/5caC indiscriminately. In this chapter, we provide a detailed description of “CLEVER-seq,” a single-base and single-cell resolution 5fC mapping technology that we recently developed [25].

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_16, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1.2 Principle and Application of CLEVER-seq

CLEVER-seq was developed based on the reactivity of malononitrile to 5fC (Fig. 1a, b) [25]. Malononitrile is a stable, commercially available, and highly water-soluble small molecule. Single cells are lysed in individual tubes and labeled with malononitrile. This labeling reaction is performed under mild conditions and causes no DNA degradation. After chemical treatment, the 5fC adduct (“5fC-M”) is read as a dT during DNA amplification by various DNA polymerases, which enables single-base resolution detection. The labeled single-cell genomic DNA can be amplified by whole-genome amplification (WGA) technology such as MALBAC (multiple annealing- and looping-based amplification cycles) [26]. Cell lysis, DNA labeling, and single-cell whole-genome amplification are performed in a “one-tube” fashion without any purification step, because malononitrile is highly biocompatible and does not inhibit the activity of commonly used DNA polymerases. After a sufficient amount of DNA has been produced by WGA, universal library construction procedures are performed on the amplified DNA. Library quality is controlled by monitoring the spike-in DNA conversion ratio, amplification efficiency, and fragment distribution. Libraries are sequenced on a next-generation sequencing platform such as the Illumina HiSeq 2500 or HiSeq 4000.

CLEVER-seq enables single-cell, single-base resolution 5fC detection in rare cell types, for example, mammalian early embryos or laser microdissection captured samples. CLEVER-seq depends on chemical labeling by a small molecule and thus has no bias in base composition. At a reasonable 3× sequencing depth of a single mouse embryonic stem cell, CLEVER-seq could stably cover 27% of CpG sites in the entire genome [25]. CLEVER-seq could also be combined with other library preparation methods, such as reduced-representation sequencing and target-enriched sequencing, to obtain the desired formylome at lower sequencing cost.

[Fig. 1 schematic: (a) 5fC in DNA reacts with malononitrile to give the “5fC-M” adduct; (b) one-tube reaction: single cell, single cell lysis, malononitrile labeling (M) of 5fC among 5mC, 5hmC, 5fC, and 5caC, DNA amplification (5fC read as T), high-throughput sequencing]

Fig. 1 Principle of CLEVER-seq. (a) Labeling reaction of 5fC. (b) Steps of CLEVER-seq. Single cells were picked into individual tubes, lysed, and labeled with malononitrile. The single-cell genome is then subjected to whole-genome amplification directly in the labeling mixture. After a sufficient amount of DNA has been obtained, library preparation and sequencing are performed
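Because the malononitrile adduct is read as T while the other cytosine forms are still read as C, candidate 5fC sites appear as C-to-T differences relative to the reference. The sketch below illustrates this readout on a hypothetical 10-base sequence and computes a per-site conversion ratio from read counts. It is a toy example under these assumptions, not the published CLEVER-seq analysis pipeline, and it ignores sequencing errors and genuine C/T polymorphisms.

```python
def candidate_5fc_sites(reference, labeled_read):
    """0-based positions where the reference has C and the labeled read has T."""
    return [
        i
        for i, (ref, obs) in enumerate(zip(reference, labeled_read))
        if ref == "C" and obs == "T"
    ]


def conversion_ratio(t_count, c_count):
    """Fraction of reads showing T at a reference C position."""
    total = t_count + c_count
    if total == 0:
        raise ValueError("no coverage at this position")
    return t_count / total


# Hypothetical reference and malononitrile-labeled read (5fC at position 4).
reference = "ATATCGCGTA"
labeled_read = "ATATTGCGTA"

sites = candidate_5fc_sites(reference, labeled_read)
print(sites)
print(conversion_ratio(t_count=3, c_count=1))
```

The same ratio, computed on the 5fC position of the spike-in DNA, gives the spike-in conversion ratio that is monitored during library quality control.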

2 Materials (See Note 1)

2.1 Cell Lysis

1. Lysis Buffer (Tris–EDTA, 20 mg/mL Protease, Triton X-100, KCl; for details see Table 1).

Table 1 Lysis buffer composition

                                                             Lysis buffer (common)             Lysis buffer for sperm
Component                                                    Final concentration  Volume (μL)  Final concentration  Volume (μL)
0.1 M Tris–EDTA, pH 8.0 (1 M Tris–HCl + 0.1 M EDTA)          10 mM                0.5          10 mM                0.5
Protease (20 mg/mL)                                          1 mg/mL              0.25         1 mg/mL              0.25
10% Triton X-100                                             0.3%                 0.15         0.3%                 0.15
1 M KCl                                                      20 mM                0.1          20 mM                0.1
0.1 M DTT                                                    /                    /            15 mM                0.75
Nuclease-free water                                          /                    4            /                    3.25
Total                                                        /                    5            /                    5

Table 2 Primer sequences

Name         Sequence (5′-3′)
Spike-in     CCTCACCATCTCAACCAATATTATATTACGCGTATAT5fCGCGTATTTCGCGTTATAATATTGAGGGAGAAGTGGTGAATACTGAATAAGAATGTAGTCCAGGTAGGATGGGTGGTTGATGGTAGTGATAATGTCGGAG
Spike-in-RC  CTCCGACATTATCACTACCATCAACCACCCATCCTACCTGGACTACATTCTTATTCAGTATTCACCACTTCTCCCTCAATATTATAACGCGAAATACGCGATATACGCGTAATATAATATTGGTTGAGATGGTGAGG
SP-Test-F    CTCCGACATTATCACTACCA
SP-Test-R    CCTCACCATCTCAACCAATATTATATT
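Before ordering the oligos in Table 2, it is worth confirming that the two spike-in strands are exact reverse complements and that the test primers match their ends. The sketch below does this with the 5fC position written as an ordinary C (an assumption made only so that the strand can be handled as plain text); the sequences are typed in the same fragments as they appear in the table.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Spike-in strand; the single 5fC is written as "C" for this check.
SPIKE_IN = (
    "CCTCACCATCTCAACCAATATTATATTACGCGTATAT"
    "C"  # 5fC in the synthesized oligo
    "GCGTATTTCGCGTTA"
    "TAATATTGAGGGAGAAGTGGTGAATACTGAATAAGAATGTAGTCCAGGTAGG"
    "ATGGGTGGTTGATGGTAGTGATAATGTCGGAG"
)
SPIKE_IN_RC = (
    "CTCCGACATTATCACTACCATCAACCACCCATCCTACCTGGACTACA"
    "TTCTTA"
    "TTCAGTATTCACCACTTCTCCCTCAATATTATA"
    "ACGCGAAATACGCGATA"
    "TACGCGTAATATAATATTGGTTGAGATGGTGAGG"
)
SP_TEST_F = "CTCCGACATTATCACTACCA"
SP_TEST_R = "CCTCACCATCTCAACCAATATTATATT"

FC_INDEX = 37  # 0-based position of the 5fC in SPIKE_IN

strands_match = revcomp(SPIKE_IN) == SPIKE_IN_RC
primers_match = SPIKE_IN_RC.startswith(SP_TEST_F) and SPIKE_IN.startswith(SP_TEST_R)
print(len(SPIKE_IN), strands_match, primers_match)
```

After labeling, the 5fC position is expected to be read as T on the spike-in strand, so the base opposite it in the amplified product changes from G to A; this is the position to inspect when estimating the spike-in conversion ratio.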

2. 200 μL thin-wall PCR tubes.
3. Thermocycler.

2.2 Chemical Labeling

1. Unmethylated phage λ DNA.
2. Spike-in DNA containing 5fC (see Table 2).
3. Malononitrile.
4. 200 μL thin-wall PCR tubes.
5. Mineral oil.
6. 100 mM Tris–HCl, pH 8.0.
7. Eppendorf ThermoMixer.

2.3 Quasilinear Preamplification

1. MALBAC Single Cell WGA Kit (Yikon Genomics).
2. 200 μL thin-wall PCR tubes.
3. Thermocycler.

2.4 Exponential Amplification

1. MALBAC Single Cell WGA Kit (Yikon Genomics).
2. 200 μL thin-wall PCR tubes.
3. Thermocycler.

2.5 Postamplification DNA Purification and Fragmentation

1. Zymo DNA Clean & Concentrator-5 (Zymo).
2. 1.5 mL microcentrifuge tubes.
3. Covaris Focused-ultrasonicator.

2.6 Library Preparation

1. NEBNext Ultra II DNA Library Prep Kit for Illumina.
2. Agencourt AMPure XP beads.
3. 200 μL reaction tubes.
4. Magnetic separation rack.
5. Thermocycler.

2.7 Library Amplification

1. NEBNext Ultra II Q5 Master Mix.
2. NEBNext Multiplex Oligos for Illumina.
3. 200 μL thin-wall PCR tubes.
4. Thermocycler.

2.8 Library Purification and Quality Control

1. Agencourt AMPure XP beads.
2. 1.5 mL and 200 μL microcentrifuge tubes.
3. NEBNext Ultra II Q5 Master Mix.
4. SP-Test-F and SP-Test-R primers (see Table 2).
5. Thermocycler.
6. Fragment Analyzer (Advanced Analytical) or Bioanalyzer (Agilent).
7. Qubit Fluorometer and Quant-iT dsDNA HS Assay Kit.

2.9 Next-Generation Sequencing of CLEVER-seq Libraries and Bioinformatics Analysis

1. Illumina TruSeq Index Sequencing Primer Box.
2. Illumina HiSeq 2500 Sequencer.

3 Methods

3.1 Cell Lysis

1. Prepare the cell Lysis Buffer (see Table 1) freshly each time before performing the experiments. For analysis of mammalian sperm, prepare the Lysis Buffer according to Lysis Buffer for Sperm; for analysis of other single-cell samples, prepare the Lysis Buffer according to Lysis Buffer (Common).
2. Use 5 μL Lysis Buffer for each cell. Capture a single cell (with <0.5 μL PBS–0.1% BSA) into a 200 μL thin-wall PCR tube containing 5 μL Lysis Buffer and put on ice immediately.
3. Centrifuge for 1 min at 4 °C, 4500 × g and put on ice immediately.
4. Perform cell lysis in a thermocycler with the following program, with the heated lid set at 80 °C (see Note 2):

Step                  Temp (°C)   Time
Cell lysis            50          3 h
Enzyme deactivation   70          30 min
Hold                  4           Infinite

5. Proceed to the next step directly or freeze at −80 °C. Samples can be safely stored at −80 °C for less than 1 week.

3.2 Chemical Labeling

1. Prepare the Spike-in DNA Annealing Mixture as follows:

Component                 Final concentration   Volume (μL)
100 μM Spike-in           45 μM                 22.5
100 μM Spike-in-RC        45 μM                 22.5
100 mM Tris–HCl pH 8.0    10 mM                 5
Total                     /                     50

2. Perform the Spike-in DNA Annealing in a thermocycler with the following program:

Step         Temp                          Time
Denaturing   95 °C                         5 min
Annealing    95 °C to 4 °C, −0.1 °C/s
Hold         4 °C                          Infinite

3. Put the annealed DNA on ice and add 4 μL 1 ng unmethylated phage λ DNA into the tube. Pipet up and down ten times to mix thoroughly. Measure the concentration of the spike-in DNA mix and dilute to 2 ng/μL with precooled 10 mM Tris–HCl pH 8.0.
4. Thaw one new tube of malononitrile in the hand or at 37 °C in a ThermoMixer (see Note 3). Prepare the Labeling Mix as follows:

Component                   Final concentration   Volume (μL)
2 ng/μL Spike-in DNA mix    0.3 ng/cell           0.15
Malononitrile               0.91%                 0.05
Nuclease-free water         /                     0.3
Total                       /                     0.5

5. Carefully add 0.5 μL of Labeling Mix to each tube containing a single cell. Add the mixture to the surface of the cell lysis buffer (see Note 4).
6. Add 15 μL mineral oil on top of the liquid surface. Seal each tube with Parafilm sealing film and wrap the tubes in tinfoil.
7. Set the Eppendorf ThermoMixer to 37 °C, 850 rpm and incubate the tubes for 20 h. Proceed to the next step immediately.
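The final concentrations listed for the Labeling Mix refer to the complete labeling reaction (5 μL lysate plus 0.5 μL Labeling Mix), not to the 0.5 μL mix alone. The following sketch reproduces the numbers under two stated assumptions: the malononitrile aliquot is treated as 100% (v/v), and the small carry-over volume from cell picking (<0.5 μL) is ignored. Names are illustrative.

```python
LYSATE_UL = 5.0                      # lysis buffer volume per cell (Subheading 3.1)
SPIKE_IN_STOCK_NG_PER_UL = 2.0       # diluted spike-in DNA mix
LABELING_MIX_UL = {                  # per-cell Labeling Mix
    "spike_in_mix": 0.15,
    "malononitrile": 0.05,
    "water": 0.30,
}


def labeling_summary(lysate_ul=LYSATE_UL, mix=LABELING_MIX_UL):
    """Summarize volumes and final concentrations of the labeling reaction."""
    mix_total = sum(mix.values())
    reaction_total = lysate_ul + mix_total
    return {
        "mix_total_ul": mix_total,
        "reaction_total_ul": reaction_total,
        # percent (v/v) of malononitrile in the complete reaction
        "malononitrile_percent": 100.0 * mix["malononitrile"] / reaction_total,
        # mass of spike-in DNA mix delivered to each cell
        "spike_in_ng_per_cell": SPIKE_IN_STOCK_NG_PER_UL * mix["spike_in_mix"],
    }


if __name__ == "__main__":
    for key, value in labeling_summary().items():
        print(f"{key}: {value:.3f}")
```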

3.3 Quasilinear Preamplification

1. Prepare the Amplification Mix I as follows:

Component                          Volume (μL)
Pre-Amp Buffer (Green Cap)         30
Pre-Amp Enzyme Mix (Green Cap)     1
Total                              31

2. Carefully add 31 μL Amplification Mix I into each tube containing a lysed single cell by adding the mixture to the surface of the liquid (see Note 4).
3. Centrifuge for 1 min at 4 °C, 4500 × g and put on ice immediately.
4. Perform the quasilinear preamplification in a thermocycler with the following program (see Note 5):

Step   Temp (°C)   Time (s)    Cycles
1      94          180         1
2      20          40
3      30          40
4      40          30
5      50          30
6      60          30
7      70          240
8      95          20
9      58          10          Go to step 2, for a total of 11 cycles
10     4           Infinite    1

5. Centrifuge for 1 min at 4 °C, 4500 × g and put on ice immediately. Proceed to the next step immediately.

3.4 Exponential Amplification

1. Prepare the Amplification Mix II as follows:

Component                         Volume (μL)
Amplification buffer (Red Cap)    30
Amp enzyme mix (Red Cap)          0.8
Total                             30.8

2. Add 30.8 μL Amplification Mix II into each tube and mix by gently pipetting up and down. Put on ice immediately.
3. Centrifuge for 1 min at 4 °C, 4500 × g and put on ice immediately.
4. Perform the exponential amplification in a thermocycler with the following program:

Step   Temp (°C)   Time (s)    Cycles
1      94          30          1
2      94          20
3      58          30
4      72          180         Go to step 2, for a total of 17 cycles
5      72          300         1
6      4           Infinite    1

After this step, the amplicons from Subheading 3.3 have been further amplified to 100 ng–1 μg of DNA. Perform the postamplification DNA purification step immediately.

3.5 Postamplification DNA Purification and Fragmentation

1. Transfer the amplification products into a 1.5 mL microcentrifuge tube and add 340 μL of DNA Binding Buffer (from the Zymo DNA Clean & Concentrator-5 kit) to each DNA sample. Mix briefly by pipetting up and down.
2. Transfer the mixture to a Zymo-Spin Column in a Collection Tube. Centrifuge for 30 s at 13,800 × g. Discard the flow-through.
3. Add 200 μL DNA Wash Buffer (from the Zymo DNA Clean & Concentrator-5 kit; add ethanol before use) to the column. Centrifuge for 30 s at 13,800 × g. Repeat this step once.
4. Transfer the column to a new 1.5 mL microcentrifuge tube. Add 140 μL DNA Elution Buffer (from the Zymo DNA Clean & Concentrator-5 kit) directly to the column matrix and incubate at room temperature for 2 min. Centrifuge for 2 min to elute the purified DNA. Purified DNA can be stored at −20 °C for less than 1 week.
5. Use the Covaris Focused-ultrasonicator to fragment 135 μL of the purified DNA, with the fragmentation program set to 400 bp mode.
6. Transfer the fragmented DNA into a new 1.5 mL microcentrifuge tube. Add 675 μL DNA Binding Buffer to each DNA sample. Mix briefly by pipetting up and down.
7. Transfer 450 μL of the mixture to a Zymo-Spin Column in a Collection Tube. Centrifuge for 30 s at 13,800 × g. Discard the flow-through.
8. Transfer the rest of the mixture to the same Zymo-Spin Column in a Collection Tube for each single-cell sample. Centrifuge for 30 s at 13,800 × g. Discard the flow-through.
9. Add 200 μL DNA Wash Buffer to the column. Centrifuge for 30 s at 13,800 × g. Repeat this step once.
10. Transfer the column to a new 1.5 mL microcentrifuge tube. Add 26 μL DNA Elution Buffer directly to the column matrix and incubate at room temperature for 2 min. Centrifuge for 2 min to elute the purified DNA. Purified DNA can be stored at −20 °C for less than 1 week.
11. For quality control of the amplified DNA, dilute 1 μL of the purified DNA with 4 μL H2O and use the Bioanalyzer or Fragment Analyzer to determine the fragment size distribution (Fig. 2a) and Qubit fluorometry to determine the DNA concentration.

Fig. 2 Quality control of amplification product and CLEVER-seq library. (a) A representative Fragment Analyzer trace of post-MALBAC DNA. An average DNA fragment size of 1.5 kb indicates successful amplification of the malononitrile-labeled single-cell genome. (b) A representative Fragment Analyzer trace of a CLEVER-seq library. For paired-end 300 bp sequencing on the Illumina HiSeq platform, an average DNA fragment size between 300 and 500 bp is recommended

3.6 Library Preparation (See Note 6)

1. Assemble the DNA End Preparation Mix in a sterile nuclease-free 200 μL tube:

Component                                                  Volume (μL)
Fragmented, double-stranded DNA                            25
NEBNext Ultra II End Prep Reaction Buffer (Green Cap)      3.5
NEBNext Ultra II End Prep Enzyme Mix (Green Cap)           1.5
Total                                                      30

2. Pipet up and down at least ten times to mix thoroughly. Spin briefly to collect all liquid from the sides of the tubes.
3. Perform DNA End Preparation in a thermocycler with the following program, with the heated lid set at 75 °C:

Step                        Temp (°C)   Time (min)
End Repair and A-Tailing    20          30
                            60          30
Hold                        4           Infinite

4. Put the reaction mixture on ice and proceed immediately to the next step.
5. Add the following components directly to the DNA End Preparation Mix (see Note 7):

Component                                          Volume (μL)
NEBNext Ultra II Ligation Master Mix (Red Cap)     15
NEBNext Ligation Enhancer (Red Cap)                0.5
NEBNext Adaptor for Illumina (Red Cap)             1.25
Total                                              16.75

6. Pipet up and down at least ten times to mix thoroughly. Spin briefly to collect all liquid from the sides of the tubes.
7. Incubate the reaction mixture in a thermocycler set at 20 °C with the heated lid off for 30 min.
8. Put the ligation mixture on ice. Add 1.5 μL of 1000 units/mL USER Enzyme to the ligation mixture. Pipet up and down three times to mix.
9. Incubate the reaction mixture in a thermocycler set at 37 °C with the heated lid set at 45 °C for 30 min.
10. Vortex prewarmed AMPure beads for 30 s to resuspend thoroughly. Add 43.5 μL AMPure beads (0.9×) to the ligation mixture. Pipet up and down ten times to mix thoroughly.
11. Incubate the mixture at RT for 15 min.
12. Transfer the tube to a magnetic separation rack and wait 5 min or until the solution becomes completely clear. Carefully remove and discard the supernatant without disturbing the bead pellet.

13. Keep the tube in the magnetic separation rack, add 100 μL of 80% ethanol, and incubate for 30 s. Carefully remove and discard the supernatant without disturbing the bead pellet.
14. Repeat step 13 once.
15. Keep the tube in the magnetic separation rack with the lid open for 2–5 min at RT until the beads are dry. Do not overdry the beads.
16. Remove the tube from the magnetic separation rack and resuspend the bead pellet in 24 μL sterile deionized water. Transfer the tube back to the magnetic separation rack and wait for 5 min or until the solution becomes completely clear.
17. Transfer the clear supernatant to a new sterile nuclease-free tube.
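The AMPure bead volume in step 10 follows from the 0.9× bead-to-sample ratio applied to the accumulated reaction volume (End Preparation Mix, ligation components, and USER Enzyme). A minimal sketch of this arithmetic is shown below; rounding to the nearest 0.5 μL is an assumption about pipetting convenience, and the function name is illustrative.

```python
def bead_volume(sample_ul, ratio=0.9, step_ul=0.5):
    """Bead volume for a given sample volume, rounded to the nearest step_ul."""
    exact = sample_ul * ratio
    return round(exact / step_ul) * step_ul


# Volumes from Subheading 3.6: End Preparation Mix + ligation components + USER Enzyme
ligation_reaction_ul = 30.0 + 16.75 + 1.5

if __name__ == "__main__":
    print("ligation reaction volume (uL):", ligation_reaction_ul)
    print("beads at 0.9x (uL):", bead_volume(ligation_reaction_ul))
```

The same ratio applied to a 50 μL PCR gives the 45 μL of beads used for library purification.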

3.7 Library Amplification

1. Add the following components to assemble the library amplification reaction mixture:

Component                                      Volume (μL)
Purified adaptor-ligated DNA                   23
NEBNext Ultra II Q5 Master Mix (Blue Cap)      25
Universal PCR Primer (Blue Cap)                1
Index PCR Primer (Blue Cap)                    1
Total                                          50

2. Pipet up and down at least five times to mix thoroughly. Spin briefly to collect all liquid from the sides of the tubes.
3. Perform the library amplification in a thermocycler with the following program:

Step                   Temp (°C)   Time       Cycles
Initial denaturation   95          30 s       1
Denaturation           95          10 s       6–8
Annealing/extension    65          75 s
Final extension        65          5 min      1
Hold                   4           Infinite   1

4. Store the amplification product at 4 °C or −20 °C for up to 72 h, or proceed directly to Library Purification and Quality Control.

3.8 Library Purification and Quality Control

1. Vortex prewarmed AMPure beads for 30 s to resuspend thoroughly. Add 45 μL AMPure beads (0.9×) to the amplification product. Pipet up and down ten times to mix thoroughly.
2. Repeat steps 11–16 from Subheading 3.6, but resuspend the beads in 30 μL sterile deionized water or EB elution buffer.
3. For quality control of the purified library, dilute 1 μL of the purified DNA with 4 μL H2O and use the Bioanalyzer or Fragment Analyzer to determine the fragment size distribution (Fig. 2b) and Qubit fluorometry to determine the DNA concentration (see Note 8).
4. To test the labeling efficiency of 5fC (see Note 8), assemble the spike-in DNA amplification reaction mixture from the following components:

Component                                      Volume (μL)
Purified DNA library                           0.5
NEBNext Ultra II Q5 Master Mix (Blue Cap)      25
10 μM SP-Test-F                                2
10 μM SP-Test-R                                2
H2O                                            20.5
Total                                          50

5. Pipet up and down at least five times to mix thoroughly. Spin briefly to collect all liquid from the sides of the tubes.
6. Perform the spike-in DNA amplification in a thermocycler with the following program:

Step                   Temp (°C)   Time       Cycles
Initial denaturation   95          30 s       1
Denaturation           95          10 s       28
Annealing              58          20 s
Extension              65          30 s
Final extension        65          5 min      1
Hold                   4           Infinite   1

7. Perform electrophoresis on a 2% agarose gel. There should be a clear band around 150 bp. Perform Sanger sequencing using SP-Test-F as the sequencing primer to check the labeling efficiency of 5fC (Fig. 3a, b).

Fig. 3 Sanger sequencing of spike-in DNA. (a) A representative example of low 5fC labeling efficiency. The T peak (red) is much lower than the C peak (blue). (b) A representative example of successfully labeled 5fC. The T peak (red) is similar to the C peak (blue). The T peak is from labeled 5fC, and the C peak is from the base G complementary to the original 5fC

3.9 Next-Generation Sequencing of CLEVER-seq Libraries and Bioinformatics Analysis

1. Sequence the CLEVER-seq library on an Illumina HiSeq 2500 using the TruSeq Sequencing Primer.
2. Demultiplex the fastq files and remove low-quality bases and artificial sequences, including MALBAC primer and Illumina adaptor sequences, using the fastx toolkit (https://github.com/agordon/fastx_toolkit) and Trim Galore (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). Align the remaining cleaned reads to the reference genome using Bismark (https://www.bioinformatics.babraham.ac.uk/projects/bismark/). To determine the formylation state, summarize only bases with a Phred-scaled quality score of 30 or higher. Consider a cytosine read as cytosine to be a C site and a cytosine converted to thymine to be a 5fC site.
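The site-level summary described in step 2 can be illustrated with a small, self-contained sketch. It is not the Bismark workflow itself: the input format (tuples of site, read base, and FASTQ quality character for bases aligned to reference cytosines, already normalized to the cytosine strand) and all function names are assumptions made for illustration. Quality characters are decoded with the standard Phred+33 offset.

```python
from collections import defaultdict

MIN_PHRED = 30  # quality threshold stated in the protocol


def phred_from_char(ch, offset=33):
    """Decode a FASTQ quality character (Phred+33 encoding)."""
    return ord(ch) - offset


def summarize_sites(calls, min_phred=MIN_PHRED):
    """Count C and T calls per reference cytosine, keeping only high-quality bases.

    calls: iterable of (site, read_base, quality_char); hypothetical input format.
    """
    counts = defaultdict(lambda: {"C": 0, "T": 0})
    for site, base, qual in calls:
        if base not in ("C", "T") or phred_from_char(qual) < min_phred:
            continue
        counts[site][base] += 1
    return dict(counts)


def conversion_ratio(site_counts):
    """Fraction of C-to-T conversions at a site; None if the site is uncovered."""
    total = site_counts["C"] + site_counts["T"]
    return site_counts["T"] / total if total else None


if __name__ == "__main__":
    example = [
        ("chr1:100", "T", "I"),  # Phred 40, kept
        ("chr1:100", "C", "I"),  # Phred 40, kept
        ("chr1:100", "T", "5"),  # Phred 20, discarded
        ("chr1:200", "C", "?"),  # Phred 30, kept
    ]
    for site, c in summarize_sites(example).items():
        print(site, c, conversion_ratio(c))
```

Applied to the 5fC position of the spike-in DNA, the same ratio gives the spike-in conversion ratio that is monitored for quality control.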

4 Notes

1. To avoid possible DNA contamination, use only filter pipetting tips. Run a DNA-free H2O control in parallel with the CLEVER-seq library preparation. Split all reagents into aliquots and use a new tube for each batch. It is recommended to treat all tubes and H2O with UV light before use.
2. To avoid denaturing the double-stranded genomic DNA, the lid temperature should NOT be higher than 80 °C.
3. Use a new tube of malononitrile each time. On initial delivery of malononitrile, incubate the package at 37 °C for 30 min to thaw it completely. Split the malononitrile into 15 μL aliquots and wrap them in tinfoil. Store at 4 °C and protect from light. Use a new tube for each batch.
4. When adding reagents to the single-cell lysate before amplification, DO NOT insert the pipetting tip into the lysis buffer and DO NOT pipet up and down.
5. After labeling and before the Quasilinear Preamplification step, the mixture should be light yellow or white (if DTT is added); a deep yellow color indicates that the malononitrile used may not have been stored well. After the Quasilinear Preamplification step, the mixture should turn dark yellow or orange.
6. We recommend using a commercial kit, for example, the NEBNext Ultra II DNA Library Prep Kit for Illumina. Other commercial or homemade kits for Illumina TruSeq libraries might also be suitable.
7. Do not prepare the adaptor-containing mixture long in advance. Add the adaptor ligation mixture to the end-repaired DNA immediately after preparing it. If the amount of amplified DNA is less than 100 ng, dilute the adaptor 1:10 with 10 mM Tris–HCl pH 8.0.
8. Quality control of the fragment distribution and the spike-in DNA conversion ratio is recommended. Insufficient amplification might result in an abnormal fragment distribution in the final library. Insufficient 5fC labeling might result in failure of 5fC detection.

Acknowledgments

The authors thank Dr. Hongshan Guo, Bo Xia, Jinghui Song, and Hu Zeng for their help in developing the original protocol. Part of the analysis was performed on the Computing Platform of the Center for Life Sciences. This work was supported by the National Basic Research Program of China and the National Natural Science Foundation of China (91519325, MOST2016YFC0900300, 21522201, and 2014CB964900). Competing financial interests: C.Z. and C.Y. are coinventors on filed patents (201710111600.9) for the labeling strategy and sequencing method reported herein.

References

1. Kriaucionis S, Heintz N (2009) The nuclear DNA base 5-hydroxymethylcytosine is present in Purkinje neurons and the brain. Science 324(5929):929–930. https://doi.org/10.1126/science.1169786
2. Tahiliani M, Koh KP, Shen Y, Pastor WA, Bandukwala H, Brudno Y, Agarwal S, Iyer LM, Liu DR, Aravind L, Rao A (2009) Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science 324(5929):930–935. https://doi.org/10.1126/science.1170116
3. He YF, Li BZ, Li Z, Liu P, Wang Y, Tang Q, Ding J, Jia Y, Chen Z, Li L, Sun Y, Li X, Dai Q, Song CX, Zhang K, He C, Xu GL (2011) Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science 333(6047):1303–1307. https://doi.org/10.1126/science.1210944
4. Ito S, Shen L, Dai Q, Wu SC, Collins LB, Swenberg JA, He C, Zhang Y (2011) Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science 333(6047):1300–1303. https://doi.org/10.1126/science.1210597
5. Maiti A, Drohat AC (2011) Thymine DNA glycosylase can rapidly excise 5-formylcytosine and 5-carboxylcytosine: potential implications for active demethylation of CpG sites. J Biol Chem 286(41):35334–35338. https://doi.org/10.1074/jbc.C111.284620
6. Pfaffeneder T, Hackner B, Truss M, Munzel M, Muller M, Deiml CA, Hagemeier C, Carell T (2011) The discovery of 5-formylcytosine in embryonic stem cell DNA. Angew Chem Int Ed Engl 50(31):7008–7012. https://doi.org/10.1002/anie.201103899
7. Kohli RM, Zhang Y (2013) TET enzymes, TDG and the dynamics of DNA demethylation. Nature 502(7472):472–479. https://doi.org/10.1038/nature12750
8. Pastor WA, Aravind L, Rao A (2013) TETonic shift: biological roles of TET proteins in DNA demethylation and transcription. Nat Rev Mol Cell Biol 14(6):341–356. https://doi.org/10.1038/nrm3589
9. Wu X, Zhang Y (2017) TET-mediated active DNA demethylation: mechanism, function and beyond. Nat Rev Genet 18(9):517–534. https://doi.org/10.1038/nrg.2017.33
10. Song CX, Yi C, He C (2012) Mapping recently identified nucleotide variants in the genome and transcriptome. Nat Biotechnol 30(11):1107–1116. https://doi.org/10.1038/nbt.2398
11. Wu H, Zhang Y (2015) Charting oxidized methylcytosines at base resolution. Nat Struct Mol Biol 22(9):656–661. https://doi.org/10.1038/nsmb.3071
12. Song CX, Szulwach KE, Dai Q, Fu Y, Mao SQ, Lin L, Street C, Li Y, Poidevin M, Wu H, Gao J, Liu P, Li L, Xu GL, Jin P, He C (2013) Genome-wide profiling of 5-formylcytosine reveals its roles in epigenetic priming. Cell 153(3):678–691. https://doi.org/10.1016/j.cell.2013.04.001
13. Lu X, Han D, Zhao BS, Song CX, Zhang LS, Dore LC, He C (2015) Base-resolution maps of 5-formylcytosine and 5-carboxylcytosine reveal genome-wide DNA demethylation dynamics. Cell Res 25(3):386–389. https://doi.org/10.1038/cr.2015.5
14. Booth MJ, Marsico G, Bachman M, Beraldi D, Balasubramanian S (2014) Quantitative sequencing of 5-formylcytosine in DNA at single-base resolution. Nat Chem 6(5):435–440. https://doi.org/10.1038/nchem.1893
15. Guo F, Li X, Liang D, Li T, Zhu P, Guo H, Wu X, Wen L, Gu TP, Hu B, Walsh CP, Li J, Tang F, Xu GL (2014) Active and passive demethylation of male and female pronuclear DNA in the mammalian zygote. Cell Stem Cell 15(4):447–458. https://doi.org/10.1016/j.stem.2014.08.003
16. Wu H, Wu X, Shen L, Zhang Y (2014) Single-base resolution analysis of active DNA demethylation using methylase-assisted bisulfite sequencing. Nat Biotechnol 32(12):1231–1240. https://doi.org/10.1038/nbt.3073
17. Neri F, Incarnato D, Krepelova A, Rapelli S, Anselmi F, Parlato C, Medana C, Dal Bello F, Oliviero S (2015) Single-base resolution analysis of 5-formyl and 5-carboxyl cytosine reveals promoter DNA methylation dynamics. Cell Rep. https://doi.org/10.1016/j.celrep.2015.01.008
18. Hu X, Zhang L, Mao SQ, Li Z, Chen J, Zhang RR, Wu HP, Gao J, Guo F, Liu W, Xu GF, Dai HQ, Shi YG, Li X, Hu B, Tang F, Pei D, Xu GL (2014) Tet and TDG mediate DNA demethylation essential for mesenchymal-to-epithelial transition in somatic cell reprogramming. Cell Stem Cell 14(4):512–522. https://doi.org/10.1016/j.stem.2014.01.001
19. Shen L, Wu H, Diep D, Yamaguchi S, D'Alessio AC, Fung HL, Zhang K, Zhang Y (2013) Genome-wide analysis reveals TET- and TDG-dependent 5-methylcytosine oxidation dynamics. Cell 153(3):692–706. https://doi.org/10.1016/j.cell.2013.04.002
20. Raiber EA, Beraldi D, Ficz G, Burgess HE, Branco MR, Murat P, Oxley D, Booth MJ, Reik W, Balasubramanian S (2012) Genome-wide distribution of 5-formylcytosine in embryonic stem cells is associated with transcription and depends on thymine DNA glycosylase. Genome Biol 13(8):R69. https://doi.org/10.1186/gb-2012-13-8-r69
21. Iurlaro M, McInroy GR, Burgess HE, Dean W, Raiber EA, Bachman M, Beraldi D, Balasubramanian S, Reik W (2016) In vivo genome-wide profiling reveals a tissue-specific role for 5-formylcytosine. Genome Biol 17(1):141. https://doi.org/10.1186/s13059-016-1001-5
22. Sun Z, Dai N, Borgaro JG, Quimby A, Sun D, Correa IR Jr, Zheng Y, Zhu Z, Guan S (2015) A sensitive approach to map genome-wide 5-hydroxymethylcytosine and 5-formylcytosine at single-base resolution. Mol Cell 57(4):750–761. https://doi.org/10.1016/j.molcel.2014.12.035
23. Xia B, Han D, Lu X, Sun Z, Zhou A, Yin Q, Zeng H, Liu M, Jiang X, Xie W, He C, Yi C (2015) Bisulfite-free, base-resolution analysis of 5-formylcytosine at the genome scale. Nat Methods 12(11):1047–1050. https://doi.org/10.1038/nmeth.3569
24. Wu X, Inoue A, Suzuki T, Zhang Y (2017) Simultaneous mapping of active DNA demethylation and sister chromatid exchange in single cells. Genes Dev 31(5):511–523. https://doi.org/10.1101/gad.294843.116
25. Zhu C, Gao Y, Guo H, Xia B, Song J, Wu X, Zeng H, Kee K, Tang F, Yi C (2017) Single-cell 5-formylcytosine landscapes of mammalian early embryos and ESCs at single-base resolution. Cell Stem Cell 20(5):720–731.e725. https://doi.org/10.1016/j.stem.2017.02.013
26. Zong C, Lu S, Chapman AR, Xie XS (2012) Genome-wide detection of single-nucleotide and copy-number variations of a single human cell. Science 338(6114):1622–1626. https://doi.org/10.1126/science.1229164

Chapter 17

ChIPmentation for Low-Input Profiling of In Vivo Protein–DNA Interactions

Natalia Kunowska and Xi Chen

Abstract

Many key cellular processes, including establishing the cell's identity, are governed by chromatin proteins. Mapping their binding at the level of a single cell would give important insights into a new dimension of cellular heterogeneity. However, ChIP-seq, the main method for studying protein–DNA interactions in the chromatin context, has proven very challenging to scale down. ChIPmentation is a modification of ChIP-seq in which the Tn5 transposase is used to introduce sequencing adapters in one step. This significantly reduces the required input material. ChIPmentation is a robust and versatile approach, and even though it has not yet achieved single-cell resolution, we believe that it is a very promising starting point for further downscaling.

Key words Epigenetics, Chromatin immunoprecipitation (ChIP), Low-input ChIP, ChIPmentation, Next generation sequencing

1 Introduction

Chromatin-associated proteins, such as transcription factors and cofactors, histones, or histone-modifying enzymes, play pivotal roles in many cellular processes, from gene regulation to DNA replication and repair [1–3]. Therefore, mapping the location of DNA-bound proteins along the genome is critical for understanding these processes. Chromatin immunoprecipitation followed by next generation sequencing (ChIP-seq) is a powerful tool for genome-wide identification of DNA sequences occupied by those chromatin-associated proteins. Since the first publication in 2007 [4–6], ChIP-seq has revolutionized the fields of chromatin biology, transcriptional regulation, and epigenetics. In a typical ChIP-seq experiment, the proteins are cross-linked to the DNA using formaldehyde [7]. Chromatin is then fragmented, usually by sonication, and a protein of interest or a histone

Natalia Kunowska and Xi Chen contributed equally to this work.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_17, © Springer Science+Business Media, LLC, part of Springer Nature 2019

[Figure 1. Flowchart. Shared steps: cross-link and fragment chromatin, immunoprecipitation. ChIP-seq branch: reverse cross-link and purify DNA, end repair, A-tailing, adapter ligation, PCR amplification. ChIPmentation branch: tagmentation with Tn5 transposase (Nextera), reverse cross-link and purify DNA, PCR amplification. Both branches end with sequencing.]

Fig. 1 Schematic comparison of classical ChIP-seq and ChIPmentation protocols. Left: typical ChIP-seq library preparation workflow, which contains a series of steps including end repair, A tailing and adapter ligation and PCR amplification. Right: the workflow of ChIPmentation library preparation, which contains a simple tagmentation step and PCR amplification

harboring a particular modification is precipitated with antibodies, leading to enrichment of specifically bound DNA fragments. The captured DNA is then converted into an Illumina sequencing library by ligating the sequencing adapters, followed by PCR amplification (Fig. 1, left). However, the method requires a large number of cells, often in the range of 1–10 million, to obtain a clear enrichment signal [8, 9]. If only a limited number of cells is available, for example when studying rare cell populations, the losses of material at multiple stages of the protocol can lead to a poor signal-to-noise ratio and reduced complexity of the final library. This makes classical ChIP-seq unsuitable for low-input material and is a major obstacle to scaling the protocol down to the single-cell level. So far, there have been many attempts to reduce the number of cells per ChIP-seq experiment. However, they require either significant modifications of the protocol or a large amount of initial material before the immunoprecipitation stage [10–16].

To date, single-cell ChIP-seq has been made possible via a microfluidic droplet device (Drop-ChIP) [17]. In Drop-ChIP, single-cell resolution is achieved by barcoding enzymatically fragmented DNA inside the droplets, then pooling the barcoded samples and performing the immunoprecipitation in bulk. Drop-ChIP has provided a first enticing glimpse into the heterogeneity of histone modifications across cells. However, it is a very demanding approach that requires a highly customized microfluidics setup, and until now it has not been applied by any other group or in a different system.

More recently, a previously published CUT&RUN protocol [18] has been adapted to ultralow cell numbers, including single cells [19]. In this approach, the chromatin is not fragmented and solubilized as in classical ChIP. Instead, a fusion of protein A and micrococcal nuclease (MNase) is recruited to chromatin regions by specific antibodies, where, upon addition of Ca2+, the nuclease releases the DNA fragments bound by the protein of interest from the insoluble chromatin [18, 20]. Ultralow input CUT&RUN (uliCUT&RUN), used to map the binding of CTCF, SOX2, and NANOG in single mouse ES cells, captured enrichment of these transcription factors at their known binding sites.

These two studies serve as a proof of concept that mapping histone modifications and transcription factor binding in single-cell format is indeed possible. However, both approaches are technically challenging and require highly customized setups. Therefore, further testing would be necessary to establish how versatile these methods are in studying different biological systems.

One of the major sources of material loss during the traditional ChIP-seq procedure is library generation from the immunoprecipitated DNA. It generally requires a series of steps including end polishing, A-tailing, adapter ligation, and PCR amplification. The efficiency of each of these steps is limited. Further material loss occurs during multiple intermediate purification steps.

Recently, a modification of the standard ChIP-seq protocol, ChIPmentation, has been proposed, which addresses the aforementioned issues [21]. In ChIPmentation, the series of adapter ligation and purification steps during the library preparation stage is replaced by the more straightforward Tn5 transposase-mediated insertion of the sequencing adapters directly into the DNA ("tagmentation") during the immunoprecipitation reaction (Fig. 1, right) [21]. ChIPmentation is a very robust approach that does not require any equipment beyond that necessary for standard ChIP-seq. Moreover, on-bead tagmentation is very easy to combine with different ChIP protocols, both custom protocols and off-the-shelf kits. Therefore, since its publication, it has been widely adopted to study a range of biological systems, from heritable silencing in C. elegans to the function of TAD boundaries in murine limb development [22–25]. In our hands, ChIPmentation performed reliably in mapping transcription factors as well as broad and sharp histone marks in different cell types, from cell lines to primary cells.

Apart from increased sensitivity and reduced input requirements, additional advantages of ChIPmentation over classical ChIP-seq are reduced library preparation costs and hands-on time and improved resolution, especially for cell types such as lymphocytes, where consistent chromatin fragmentation is challenging [21].

More recently, a further development of the ChIPmentation principle has been proposed in tagmentation-assisted fragmentation of chromatin (TAF-ChIP) [26]. Here, the ability of Tn5 to fragment DNA not protected by proteins is leveraged to limit the need for sonication to a short pulse that solubilizes the chromatin. TAF-ChIP has been used to map repressive and active histone marks in as few as 100 human K562 cells. Like ChIPmentation, this approach does not involve any elaborate equipment or reagents. Therefore, it should be easily applicable to any type of cell.

To date, all the methods that have the potential to interrogate protein–DNA interactions at the single-cell level are free of sonication. Instead, they utilize enzymes (such as MNase and Tn5) to fragment the genome. Since Tn5-mediated transposition achieves fragmentation and adapter addition in one simple and efficient step, we think the Tn5 transposase will play a key role in developing ChIP-based methods at the single-cell level.

2 Material

1. Oligonucleotides (see Table 1).
2. ChIP Block Solution (store at 4 °C).

Table 1 Oligonucleotides

Name   Sequence
N701   CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTCTCGTGGGCTCGG
N702   CAAGCAGAAGACGGCATACGAGATCTAGTACGGTCTCGTGGGCTCGG
N703   CAAGCAGAAGACGGCATACGAGATTTCTGCCTGTCTCGTGGGCTCGG
N704   CAAGCAGAAGACGGCATACGAGATGCTCAGGAGTCTCGTGGGCTCGG
N705   CAAGCAGAAGACGGCATACGAGATAGGAGTCCGTCTCGTGGGCTCGG
N706   CAAGCAGAAGACGGCATACGAGATCATGCCTAGTCTCGTGGGCTCGG
N707   CAAGCAGAAGACGGCATACGAGATGTAGAGAGGTCTCGTGGGCTCGG
N710   CAAGCAGAAGACGGCATACGAGATCAGCCTCGGTCTCGTGGGCTCGG
N711   CAAGCAGAAGACGGCATACGAGATTGCCTCTTGTCTCGTGGGCTCGG
N712   CAAGCAGAAGACGGCATACGAGATTCCTCTACGTCTCGTGGGCTCGG
N714   CAAGCAGAAGACGGCATACGAGATTCATGAGCGTCTCGTGGGCTCGG
N715   CAAGCAGAAGACGGCATACGAGATCCTGAGATGTCTCGTGGGCTCGG
N716   CAAGCAGAAGACGGCATACGAGATTAGCGAGTGTCTCGTGGGCTCGG
N718   CAAGCAGAAGACGGCATACGAGATGTAGCTCCGTCTCGTGGGCTCGG
N719   CAAGCAGAAGACGGCATACGAGATTACTACGCGTCTCGTGGGCTCGG
N720   CAAGCAGAAGACGGCATACGAGATAGGCTCCGGTCTCGTGGGCTCGG
N721   CAAGCAGAAGACGGCATACGAGATGCAGCGTAGTCTCGTGGGCTCGG
N722   CAAGCAGAAGACGGCATACGAGATCTGCGCATGTCTCGTGGGCTCGG
N723   CAAGCAGAAGACGGCATACGAGATGAGCGCTAGTCTCGTGGGCTCGG
N724   CAAGCAGAAGACGGCATACGAGATCGCTCAGTGTCTCGTGGGCTCGG
N726   CAAGCAGAAGACGGCATACGAGATGTCTTAGGGTCTCGTGGGCTCGG
N727   CAAGCAGAAGACGGCATACGAGATACTGATCGGTCTCGTGGGCTCGG
N728   CAAGCAGAAGACGGCATACGAGATTAGCTGCAGTCTCGTGGGCTCGG
N729   CAAGCAGAAGACGGCATACGAGATGACGTCGAGTCTCGTGGGCTCGG
S502   AATGATACGGCGACCACCGAGATCTACACCTCTCTATTCGTCGGCAGCGTC
S503   AATGATACGGCGACCACCGAGATCTACACTATCCTCTTCGTCGGCAGCGTC
S505   AATGATACGGCGACCACCGAGATCTACACGTAAGGAGTCGTCGGCAGCGTC
S506   AATGATACGGCGACCACCGAGATCTACACACTGCATATCGTCGGCAGCGTC
S507   AATGATACGGCGACCACCGAGATCTACACAAGGAGTATCGTCGGCAGCGTC
S508   AATGATACGGCGACCACCGAGATCTACACCTAAGCCTTCGTCGGCAGCGTC
S510   AATGATACGGCGACCACCGAGATCTACACCGTCTAATTCGTCGGCAGCGTC
S511   AATGATACGGCGACCACCGAGATCTACACTCTCTCCGTCGTCGGCAGCGTC
S513   AATGATACGGCGACCACCGAGATCTACACTCGACTAGTCGTCGGCAGCGTC
S515   AATGATACGGCGACCACCGAGATCTACACTTCTAGCTTCGTCGGCAGCGTC
S516   AATGATACGGCGACCACCGAGATCTACACCCTAGAGTTCGTCGGCAGCGTC
S517   AATGATACGGCGACCACCGAGATCTACACGCGTAAGATCGTCGGCAGCGTC
S518   AATGATACGGCGACCACCGAGATCTACACCTATTAAGTCGTCGGCAGCGTC
S520   AATGATACGGCGACCACCGAGATCTACACAAGGCTATTCGTCGGCAGCGTC
S521   AATGATACGGCGACCACCGAGATCTACACGAGCCTTATCGTCGGCAGCGTC
S522   AATGATACGGCGACCACCGAGATCTACACTTATGCGATCGTCGGCAGCGTC

Recipe for ChIP Block Solution:

Stock      For 100 mL   Final concentration
10× PBS    10 mL        1×
BSA        500 mg       0.5% (w/v)
ddH2O      90 mL
Total      100 mL

3. ChIP Lysis Buffer 1 (store at 4 °C).

Stock                    For 100 mL   Final concentration
1 M Hepes–KOH, pH 7.5    5.0 mL       50 mM
5 M NaCl                 2.8 mL       140 mM
0.5 M EDTA               0.2 mL       1 mM
50% glycerol             20.0 mL      10%
10% Igepal               5.0 mL       0.5%
10% Triton X-100         2.5 mL       0.25%
ddH2O                    64.5 mL

4. ChIP Lysis Buffer 2 (store at 4 °C).

Stock                    For 100 mL   Final concentration (mM)
1 M Tris–HCl, pH 8.0     1.0 mL       10
5 M NaCl                 4.0 mL       200
0.5 M EDTA, pH 8.0       0.2 mL       1
0.5 M EGTA, pH 8.0       0.1 mL       0.5
ddH2O                    94.7 mL

5. ChIP Lysis Buffer 3 (store at 4 °C).

Stock                     For 100 mL   Final concentration
1 M Tris–HCl, pH 8.0      1.0 mL       10 mM
5 M NaCl                  2.0 mL       100 mM
0.5 M EDTA, pH 8.0        0.2 mL       1 mM
0.5 M EGTA, pH 8.0        0.1 mL       0.5 mM
10% Na-deoxycholate       1.0 mL       0.1%
20% N-lauroylsarcosine    2.5 mL       0.5%
ddH2O                     93.2 mL

6. Wash Buffer (RIPA) (store at 4 °C).

Stock                    For 250 mL   Final concentration
1 M Hepes–KOH, pH 7.5    12.5 mL      50 mM
5 M LiCl                 25.0 mL      500 mM
0.5 M EDTA, pH 8.0       0.5 mL       1 mM
10% Igepal               25.0 mL      1%
10% Na-deoxycholate      17.5 mL      0.7%
ddH2O                    169.5 mL

7. 2× Tagmentation Buffer (aliquot and store at −20 °C).

Stock                      For 1 mL   Final concentration
1 M Tris–HCl, pH 8.0       20 μL      20 mM
1 M MgCl2                  10 μL      10 mM
Dimethylformamide (DMF)    200 μL     20% (v/v)
ddH2O                      770 μL

8. TE–NaCl Buffer (store at 4 °C).

Stock                    For 100 mL   Final concentration
1 M Tris–HCl, pH 8.0     1.0 mL       10 mM
0.5 M EDTA, pH 8.0       0.2 mL       1 mM
5 M NaCl                 1.0 mL       50 mM
ddH2O                    97.8 mL

9. ChIP Elution Buffer (store at room temperature).

Stock                    For 100 mL   Final concentration
1 M Tris–HCl             5.0 mL       50 mM
0.5 M EDTA, pH 8.0       2.0 mL       10 mM
10% SDS                  10.0 mL      1%
ddH2O                    83.0 mL
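The stock volumes in the recipes above all follow from C(stock) × V(stock) = C(final) × V(final). The following is a minimal sketch, not part of the original protocol, that recomputes the ChIP Lysis Buffer 3 recipe and scales it to other batch sizes. The function names and the dictionary layout are hypothetical and chosen only for illustration; each component's stock and final concentration must be given in the same unit.

```python
def stock_volume(final_conc, stock_conc, final_volume_ml):
    """Volume of stock (mL) so that C_stock * V_stock = C_final * V_final."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be more concentrated than the final buffer")
    return final_conc * final_volume_ml / stock_conc


# ChIP Lysis Buffer 3 as listed in the Materials section.
# Each entry: (final concentration, stock concentration), same unit within a pair.
LYSIS_BUFFER_3 = {
    "1 M Tris-HCl, pH 8.0": (10, 1000),    # mM
    "5 M NaCl": (100, 5000),               # mM
    "0.5 M EDTA, pH 8.0": (1, 500),        # mM
    "0.5 M EGTA, pH 8.0": (0.5, 500),      # mM
    "10% Na-deoxycholate": (0.1, 10),      # percent
    "20% N-lauroylsarcosine": (0.5, 20),   # percent
}


def recipe(components, final_volume_ml):
    """Return stock volumes (mL) per component, plus water to reach the final volume."""
    volumes = {
        name: round(stock_volume(final, stock, final_volume_ml), 3)
        for name, (final, stock) in components.items()
    }
    water = final_volume_ml - sum(volumes.values())
    volumes["ddH2O"] = round(water, 3)
    return volumes


if __name__ == "__main__":
    for name, volume in recipe(LYSIS_BUFFER_3, 100).items():
        print(f"{name:<26}{volume:>8.3f} mL")
```

Calling `recipe(LYSIS_BUFFER_3, 50)` halves every volume, which is convenient when only a few samples are processed.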

3 Procedures

Day 0: Bind Antibodies to Beads
1. Prepare each immunoprecipitation (IP) individually in Eppendorf tubes. For each IP, wash 15 μL Protein A/G Dynabeads three times with 0.5 mL Block Solution.
2. Resuspend the washed beads in 250 μL Block Solution, add 1 μg antibody to the beads, and rotate at 4 °C overnight (at least 6 h).

Day 1: Wash Ab–Bead Complex, Harvest the Cells, and Perform IP
1. Just before IP, wash the antibody–bead complex three times with 0.5 mL Block Solution. The complex is then ready for IP.
2. Cross-link cells with 1% HCHO in PBS (1 volume of 37% HCHO + 36 volumes of 1× PBS) for 10 min at room temperature: for adherent cells, remove the media and add 1% HCHO to the plate/well; for suspension cells, pellet the cells at 500 × g and resuspend the cell pellet in 1% HCHO. Add 1/20 volume of 2.5 M glycine to quench the formaldehyde. Leave on the bench for 5 min.
3. Pellet cells: for adherent cells, scrape the cells into 1.2 mL ice-cold PBS containing Protease Inhibitors. This can be done as follows: add 0.6 mL ice-cold PBS containing Protease Inhibitors to the plate/well, scrape the cells, and transfer the 0.6 mL of cells to a 1.5 mL Eppendorf tube placed on ice; add another 0.6 mL ice-cold PBS containing Protease Inhibitors to the same plate/well, scrape again to collect the remaining cells, and transfer them to the same Eppendorf tube. For suspension cells, centrifuge directly. In both cases, add 1/10 volume of FBS to facilitate cell pelleting. Sometimes, especially when cell numbers are low, it is very difficult to pellet the cells after cross-linking in PBS. Adding 10% FBS helps reduce cell loss.
4. Pellet the cells at 2000 × g, 4 °C, 5 min. Wash the cell pellet with ice-cold 1× PBS and centrifuge again.
5. Prepare the cell nuclei: resuspend the cells in 300 μL ChIP Lysis Buffer 1 containing Protease Inhibitors and leave on ice for 5 min. Pellet the cells at 2000 × g, 4 °C, 5 min; resuspend the cells in 300 μL ChIP Lysis Buffer 2 containing Protease Inhibitors and leave on ice for 5 min. Pellet the cells at 2000 × g, 4 °C, 5 min.
6. Sonication: resuspend the cell nuclei in 300 μL ChIP Lysis Buffer 3 and sonicate for 2–5 cycles (30 s on/off) using a Bioruptor Pico, aiming for 100–500 bp sonicated DNA for histone modifications and 200–1000 bp for transcription factors (see Note 1).
7. Add 30 μL 10% Triton X-100 to the 300 μL sonicated lysate and spin down in a tabletop centrifuge at top speed (~16,000 × g), 4 °C, 10 min.
8. Take out 2 μL lysate as the Input sample and store it at −20 °C for future use.
9. Immunoprecipitation: transfer ~328 μL lysate to the tube containing the antibody–bead complex and incubate overnight.
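The fixative and quench volumes in step 2 of Day 1 are simple proportions. As a minimal sketch (the function names are hypothetical, and the 10 mL batch is only an example), the code below computes how much 37% HCHO and PBS to combine for a given volume of 1% fixative, and the approximate glycine concentration reached after adding 1/20 volume of the 2.5 M stock.

```python
def fixation_mix(total_volume_ml, stock_percent=37.0, parts_pbs=36):
    """Split a fixative batch into 1 part HCHO stock plus `parts_pbs` parts PBS."""
    hcho_ml = total_volume_ml / (1 + parts_pbs)
    pbs_ml = total_volume_ml - hcho_ml
    final_percent = stock_percent * hcho_ml / total_volume_ml
    return hcho_ml, pbs_ml, final_percent


def glycine_quench(fix_volume_ml, glycine_stock_molar=2.5, fraction=1 / 20):
    """Glycine volume to add and the resulting molar concentration after mixing."""
    glycine_ml = fix_volume_ml * fraction
    final_molar = glycine_stock_molar * glycine_ml / (fix_volume_ml + glycine_ml)
    return glycine_ml, final_molar


if __name__ == "__main__":
    hcho, pbs, percent = fixation_mix(10.0)
    glycine, molar = glycine_quench(10.0)
    print(f"37% HCHO: {hcho:.3f} mL, PBS: {pbs:.3f} mL, final HCHO: {percent:.2f}%")
    print(f"2.5 M glycine: {glycine:.2f} mL, final glycine: {molar * 1000:.0f} mM")
```

For 10 mL of fixative this gives about 0.27 mL of 37% HCHO in 9.73 mL PBS, and 0.5 mL glycine for a final concentration of roughly 119 mM.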

Day 2: Wash IP, Tagmentation, Elution, and Reverse Cross-Link
1. Wash each IP three times with 0.5 mL Wash Buffer (RIPA). An aspirator can be used at this stage to remove the wash buffer. The washes can be done by removing the magnet from the rack while the tubes are still on the rack. Invert the rack about 10–15 times until the beads are homogenized. Then put the magnet back on, slowly invert two or three times to collect the magnetic beads attached to the caps of the tubes (no need to centrifuge), and repeat the wash.
2. Wash each IP twice with 0.5 mL 10 mM Tris–HCl, pH 8.0. This can be done in the same way as before, but remove the wash buffer using a P1000 pipette. DO NOT use an aspirator at this stage, since the beads will not be attached to the magnet tightly.
3. Pulse spin the tubes to collect the beads and the residual Tris. Put the tubes on the magnet and remove all traces of Tris.
4. Prepare the tagmentation master mix: each IP needs a 30 μL tagmentation reaction, which contains 15 μL 2× Tagmentation Buffer + 14 μL ddH2O + 1 μL TDE1 from the Nextera kit. Prepare enough master mix for all samples.
5. Resuspend the beads in 30 μL tagmentation master mix. For Input samples, take the 2 μL Input out of the −20 °C freezer, thaw it, and add 30 μL tagmentation master mix.
6. Put both IP and Input on a Thermomixer at 37 °C, 800 rpm, for 5 min for tagmentation.
7. After tagmentation, wash the beads twice with cold Wash Buffer (RIPA). For Input samples, add 70 μL ChIP Elution Buffer directly to the 30 μL tagmentation reaction to stop the Tn5 (100 μL final volume).
8. Wash the beads once with TE–NaCl Buffer (1× TE, 50 mM NaCl). Again, use a P1000 pipette to remove the wash buffer at this stage.
9. Pulse spin to collect the beads, put the tubes onto the magnetic rack, and remove all traces of TE.
10. Add 100 μL ChIP Elution Buffer to each IP, put both IP and Input onto the Thermomixer, 65 °C, ≥18,000 × g (top speed), and shake for 5 h or overnight. This step denatures the Tn5 and reverses the cross-links.
11. Spin down the beads in a tabletop microcentrifuge at top speed for 30 s. Use the magnetic rack to remove the beads, and purify the DNA with the Qiagen MinElute PCR Purification Kit. Elute with 12.5 μL Elution Buffer from the kit (this yields ~10 μL).
12. The purified DNA can be stored at −80 °C for a long time. (A period of 6 months has been tested without any problems.)
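Scaling the tagmentation master mix to the number of IP and Input samples is a small bookkeeping step. The sketch below uses the per-reaction volumes from step 4 of Day 2; the 10% pipetting overage and the function name are assumptions made for illustration and are not specified in the protocol.

```python
# Per-reaction volumes (uL) taken from the protocol text.
PER_REACTION_UL = {
    "2x Tagmentation Buffer": 15.0,
    "ddH2O": 14.0,
    "TDE1 (Tn5)": 1.0,
}


def master_mix(n_reactions, overage=0.10):
    """Scale each component for n reactions plus a fractional overage (assumed 10%)."""
    if n_reactions < 1:
        raise ValueError("need at least one reaction")
    factor = n_reactions * (1 + overage)
    return {name: round(volume * factor, 2) for name, volume in PER_REACTION_UL.items()}


if __name__ == "__main__":
    # Example: 4 IP samples plus their 4 Input samples.
    mix = master_mix(8)
    for name, volume in mix.items():
        print(f"{name:<24}{volume:>7.1f} uL")
    print(f"{'Total':<24}{sum(mix.values()):>7.1f} uL")
```

With eight reactions and 10% overage this gives 132 μL buffer, 123.2 μL water, and 8.8 μL TDE1, enough for eight 30 μL aliquots.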

Day 4: Library Construction
1. Set up the PCR reaction as follows (50 μL reaction):
   10 μL DNA template (eluted from the MinElute kit)
   10 μL H2O
   2.5 μL N5xx (10 μM, desalted oligos)
   2.5 μL N7xx (10 μM, desalted oligos)
   25 μL NEBNext High-Fidelity 2× PCR Master Mix
2. PCR cycle conditions:
   72 °C, 5 min
   98 °C, 30 s
   [98 °C 10 s, 63 °C 30 s, 72 °C 20 s] × 4
   4 °C, hold
3. Take 9 μL from the original reaction, add 1 μL 10× EvaGreen, and run a real-time PCR with 98 °C 30 s, [98 °C 10 s, 63 °C 30 s, 72 °C 20 s] × 20 to see the amplification curve. Qualitatively decide the cycle number N at which the curve reaches about halfway to saturation (Fig. 2, dotted line). Run N more cycles on the remaining 41 μL of the reaction.

[Plot: relative intensity (y-axis) versus cycles (x-axis)]

Fig. 2 A typical qPCR amplification plot of the ChIPmentation libraries

4. AmpureXP bead purification: add 20 μL AmpureXP beads to the 41 μL amplified library. Pipette up and down to mix well. Leave at room temperature for 5 min. Put on a magnet and wait until the solution is clear. Transfer the supernatant to a new Eppendorf tube and discard the beads. Add 30 μL AmpureXP beads to the supernatant. Pipette up and down to mix well. Leave at room temperature for 5 min. Put on a magnet and wait until the solution is clear. Remove the supernatant and add 100 μL 80% ethanol without taking the tube off the magnet. Wait about 15 s and remove the ethanol. Repeat the ethanol wash two more times. Air-dry the beads, and elute the library by resuspending the beads in 30 μL 10 mM Tris–HCl, pH 8.0. Leave at room temperature for 5 min, and transfer the eluate to a new tube.
5. Run 1 μL of the purified library on a BioAnalyzer or TapeStation, and send for sequencing. 50 bp reads, either single-end or paired-end, are generally used for ChIP-seq and ChIPmentation (see Note 2). Some examples of library profiles on the Agilent Bioanalyzer are shown in Fig. 3 (see Note 3).
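The protocol chooses the number of additional cycles N by eye from the qPCR curve. For readers who prefer a reproducible rule, the sketch below picks the first cycle at which the signal reaches half of the range between baseline and plateau. This is an assumed formalization of "halfway to saturation", not the authors' method; the function name is hypothetical, the example readings are invented for illustration, and the rule only makes sense if the 20-cycle run actually reached a plateau.

```python
def additional_cycles(fluorescence, fraction=0.5):
    """Return the first cycle (1-based) whose reading reaches
    baseline + fraction * (plateau - baseline).

    fluorescence[i] is the reading after cycle i + 1.
    """
    if len(fluorescence) < 2:
        raise ValueError("need at least two readings")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    baseline = min(fluorescence)
    plateau = max(fluorescence)
    if plateau <= baseline:
        raise ValueError("curve is flat; no amplification detected")
    threshold = baseline + fraction * (plateau - baseline)
    for cycle, value in enumerate(fluorescence, start=1):
        if value >= threshold:
            return cycle


# Invented example readings for a 20-cycle run (relative intensity).
EXAMPLE_CURVE = [2, 2, 3, 4, 7, 12, 22, 38, 58, 76,
                 88, 95, 98, 99, 100, 100, 100, 100, 100, 100]

if __name__ == "__main__":
    n = additional_cycles(EXAMPLE_CURVE)
    print(f"Run {n} more cycles on the remaining 41 uL reaction")
```

The value returned is only a starting point; the shape of the curve should still be inspected, as the protocol describes.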

4 Notes

1. Sonication is the most efficient way of breaking down DNA. However, it can also destroy the protein epitope recognized by the antibody. Therefore, it is critical not to oversonicate. Histones are generally more tolerant of sonication than transcription factors. In general, 100–500 bp and 200–1000 bp are good starting points for histone and transcription factor ChIP, respectively.

Fig. 3 Four examples of Bioanalyzer profiles from ChIPmentation experiments in the Jurkat cell line using 10⁵ cells. The active histone marks (H3K4me3 and H3K27ac) have shorter fragments compared with the repressive mark (H3K27me3). For some difficult-to-sonicate cells, multiple peaks representing nucleosomes are observed

2. Using qPCR to check enrichment at known loci is useful for telling whether a ChIP experiment has worked. However, this is not always possible, and enrichment in qPCR does not guarantee the success of a ChIP-seq or ChIPmentation experiment. Therefore, sequencing data need to be examined. For most factors, one million reads are enough to tell whether a ChIP-seq or ChIPmentation experiment has worked. Shallow sequencing of a library is a good and economical way of assessing the quality of the experiment. We have found that the most efficient way of telling whether a ChIP experiment has worked is visual inspection. First, reads should be mapped to the reference genome using an aligner, such as Hisat2 [27]. Then the aligned reads can be used to perform peak calling using programs such as MACS [28]. Many peak callers can also generate coverage files that can be viewed in the UCSC genome browser (Fig. 4). Visual inspection of known target genes and of the shape of the peaks is informative about the quality of the experiment. A successful experiment should exhibit many bell-curve-shaped peaks that can be clearly distinguished from the background.

Fig. 4 UCSC genome browser screenshot across a gene-dense region. ChIP: traditional ChIP-seq. ChM: ChIPmentation. The H3K4me3 and H3K27ac marks exhibit clear and punctate peaks, while the H3K27me3 mark exhibits broad domains of enrichment, which appear in different places compared with H3K4me3 and H3K27ac

3. In most cases, the majority of the fragments should be within 200–1000 bp. Occasionally, some large fragments will be observed. This could be a combination of biology (such as heterochromatin-associated proteins) and artifacts of the Agilent Bioanalyzer, which artificially concentrates large fragments (note that its axis is not a proper log scale like that of a traditional agarose gel). However, we found that the large fragments do not affect sequencing, and we still get good sequencing results from libraries with large fragments.
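The quality check in Note 2 (align, call peaks, inspect coverage) can be scripted. The sketch below only builds the command lines and does not run them. It assumes that HISAT2, samtools, and MACS2 are installed; samtools and the use of MACS2 (the successor to the MACS version cited in Note 2) are assumptions, as are the file names, the index prefix, and the function name. The `-B` option asks MACS2 to write bedGraph coverage files, which can be converted for viewing in a genome browser.

```python
import shlex


def chip_qc_commands(sample, fastq, index_prefix, input_bam=None, genome_size="hs"):
    """Build (but do not run) alignment, sorting, and peak-calling commands."""
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    # Single-end alignment; spliced alignment is disabled because reads come from DNA.
    align = ["hisat2", "--no-spliced-alignment",
             "-x", index_prefix, "-U", fastq, "-S", sam]
    # Coordinate-sort the alignments into a BAM file.
    sort = ["samtools", "sort", "-o", bam, sam]
    # Peak calling; -B additionally writes bedGraph coverage tracks.
    peaks = ["macs2", "callpeak", "-t", bam, "-f", "BAM",
             "-g", genome_size, "-n", sample, "-B"]
    if input_bam is not None:
        peaks += ["-c", input_bam]  # Input sample as control
    return [align, sort, peaks]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    commands = chip_qc_commands("H3K4me3_ChM", "H3K4me3_ChM.fastq.gz",
                                "grch38/genome", input_bam="Input.sorted.bam")
    for command in commands:
        print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` once the paths have been adapted to the local setup.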

References

1. Kouzarides T (2007) Chromatin modifications and their function. Cell 128:693–705
2. Helin K, Minucci S (2017) The role of chromatin-associated proteins in cancer. Annu Rev Cancer Biol 1:355–377
3. Lambert SA, Jolma A, Campitelli LF, Das PK, Yin Y, Albu M, Chen X, Taipale J, Hughes TR, Weirauch MT (2018) The human transcription factors. Cell 172:650–665
4. Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K (2007) High-resolution profiling of histone methylations in the human genome. Cell 129:823–837
5. Johnson DS, Mortazavi A, Myers RM, Wold B (2007) Genome-wide mapping of in vivo protein-DNA interactions. Science 316:1497–1502
6. Mikkelsen TS, Ku M, Jaffe DB et al (2007) Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448:553
7. Solomon MJ, Larsen PL, Varshavsky A (1988) Mapping protein-DNA interactions in vivo with formaldehyde: evidence that histone H4 is retained on a highly transcribed gene. Cell 53:937–947
8. Schmidt D, Wilson MD, Spyrou C, Brown GD, Hadfield J, Odom DT (2009) ChIP-seq: using high-throughput sequencing to discover protein-DNA interactions. Methods 48:240–248
9. Farnham PJ (2009) Insights from genomic profiling of transcription factors. Nat Rev Genet 10:605–616
10. Brind’Amour J, Liu S, Hudson M, Chen C, Karimi MM, Lorincz MC (2015) An ultra-low-input native ChIP-seq protocol for genome-wide profiling of rare cell populations. Nat Commun 6:6033. https://doi.org/10.1038/ncomms7033
11. Dahl JA, Jung I, Aanes H et al (2016) Broad histone H3K4me3 domains in mouse oocytes modulate maternal-to-zygotic transition. Nature 537:548–552
12. Lara-Astiaso D, Weiner A, Lorenzo-Vivas E et al (2014) Immunogenetics. Chromatin state dynamics during blood formation. Science 345:943–949
13. Adli M, Zhu J, Bernstein BE (2010) Genome-wide chromatin maps derived from limited numbers of hematopoietic progenitors. Nat Methods 7:615–618
14. Shankaranarayanan P, Mendoza-Parra M-A, Walia M, Wang L, Li N, Trindade LM, Gronemeyer H (2011) Single-tube linear DNA amplification (LinDA) for robust ChIP-seq. Nat Methods 8:565–567
15. O’Neill LP, VerMilyea MD, Turner BM (2006) Epigenetic characterization of the early embryo with a chromatin immunoprecipitation protocol applicable to small cell populations. Nat Genet 38:835–841
16. Zwart W, Koornstra R, Wesseling J, Rutgers E, Linn S, Carroll JS (2013) A carrier-assisted ChIP-seq method for estrogen receptor-chromatin interactions from breast cancer core needle biopsy samples. BMC Genomics 14:232
17. Rotem A, Ram O, Shoresh N, Sperling RA, Goren A, Weitz DA, Bernstein BE (2015) Single-cell ChIP-seq reveals cell subpopulations defined by chromatin state. Nat Biotechnol 33:1165–1172
18. Skene PJ, Henikoff S (2017) An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. eLife 6:e21856
19. Hainer SJ, Boskovic A, Rando OJ, Fazzio TG (2018) Profiling of pluripotency factors in individual stem cells and early embryos. bioRxiv 286351
20. Schmid M, Durussel T, Laemmli UK (2004) ChIC and ChEC; genomic mapping of chromatin proteins. Mol Cell 16:147–157
21. Schmidl C, Rendeiro AF, Sheffield NC, Bock C (2015) ChIPmentation: fast, robust, low-input ChIP-seq for histones and transcription factors. Nat Methods 12:963–965
22. Stadhouders R, Vidal E, Serra F et al (2018) Transcription factors orchestrate dynamic interplay between genome topology and gene regulation during cell reprogramming. Nat Genet 50:238–249
23. Rodríguez-Carballo E, Lopez-Delisle L, Zhan Y, Fabre PJ, Beccari L, El-Idrissi I, THN H, Ozadam H, Dekker J, Duboule D (2017) The HoxD cluster is a dynamic and resilient TAD boundary controlling the segregation of antagonistic regulatory landscapes. Genes Dev 31:2264–2281
24. Akay A, Di Domenico T, Suen KM et al (2017) The helicase Aquarius/EMB-4 is required to overcome intronic barriers to allow nuclear RNAi pathways to heritably silence transcription. Dev Cell 42:241–255.e6
25. Bolte C, Flood HM, Ren X, Jagannathan S, Barski A, Kalin TV, Kalinichenko VV (2017) FOXF1 transcription factor promotes lung regeneration after partial pneumonectomy. Sci Rep 7:10690
26. Akhtar J, More P, Kulkarni A, Marini F, Kaiser W (2018) TAF-ChIP: An ultra-low input approach for genome wide chromatin immunoprecipitation assay. bioRxiv
27. Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12:357–360
28. Zhang Y, Liu T, Meyer CA et al (2008) Model-based analysis of ChIP-Seq (MACS). Genome Biol 9:R137

Part IV

Single Cell Proteomic Analysis

Chapter 18

Immunophenotyping of Human Peripheral Blood Mononuclear Cells by Mass Cytometry

Susanne Heck, Cynthia Jane Bishop, and Richard Jonathan Ellis

Abstract

Mass cytometry is a variation of conventional flow cytometry that uses metal-tagged antibodies instead of fluorochromes for cell staining, with detection in a mass cytometer, a modified mass spectrometer that separates the discrete masses of these metal tags by time of flight (TOF). Currently, up to 50 different metal tags are available for cell analysis. The lack of any significant mass spectral overlap and autofluorescence background makes mass cytometry uniquely suited for complex high-dimensional phenotypic and functional analysis at the single-cell level, thus accelerating biomarker discovery and drug screening. Here we describe a workflow for phenotyping of human peripheral blood mononuclear cells (PBMCs) covering cell staining, instrument setup of a Fluidigm Helios™ mass cytometer, and sample acquisition, and summarize a basic workflow of data analysis.

Key words Mass cytometry, MaxPar™, Lanthanide, Metal conjugated antibody, Rhodium intercalator (103Rh), Iridium cell ID marker (191/193Ir), CyTOF, Helios™, EQ bead, Normalization

1 Introduction

For mass cytometry, traditional inductively coupled plasma mass spectrometry (ICP-MS) has been modified to allow for the introduction and measurement of metal isotope-labeled cell material. Since the introduction of the mass cytometry platform (CyTOF™ and MaxPar™ reagents) in 2009, it has had a tremendous impact on the fields of cytometry, immune monitoring, and biomarker discovery [1–6]. Currently, about 50 purified stable isotopes, mostly of the lanthanide (rare-earth metal) group of elements, have been selected to tag probes, mostly using the proprietary Fluidigm MaxPar™ reagent system and covering a mass range of 89–209 Da (see Note 1). Lanthanides are typically absent from biological matter, are nonradioactive, and have masses distinct from those of the elements composing biological matter, making them ideally suited as probes [2]. While we do not observe background equivalent to spectral

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_18, © Springer Science+Business Media, LLC, part of Springer Nature 2019

Fig. 1 Schematic overview of the mass cytometry workflow. A liquid sample containing fixed cells labeled with metal-tagged antibodies/probes and EQ beads for later data normalization (a) is introduced into the mass cytometer front end via a nebulizer at a rate of ~500 cells/s creating an aerosol (b). Cells are then transported toward the torch where they enter a hot argon plasma of ~7500 K (c) to be atomized and ionized. Approximately 50% of injected cells are lost in the system before entering the mass cytometer apparatus. The resulting ion cloud is introduced into the mass cytometer via a series of metal cones bringing the ions from atmospheric pressure to an internal vacuum. After filtering out noncharged particles and photons and selective removal of low mass ions (such as elements composing biological matter, argon, and argon dimers) in the quadrupole (d), the ion cloud containing the isotopic metal tags used for labeling the cells enters the TOF chamber where probes are separated by time of flight based on their mass-to-charge ratio as they accelerate toward the detector (e). The time-resolved detector measures a mass spectrum that corresponds to the identity and quantity of each isotopic probe on a per-cell basis (f), delivering a “mass fingerprint” of each cell. Data is saved in .fcs format (g) and can be analyzed using third party software (h). Figure modified from [4]

overlap known from fluorescent flow cytometry, it is important for panel design to consider the impact of isotopic impurities, oxidation, and abundance sensitivity on population resolution [4–7] (see Note 2). Many preconjugated antibodies are now commercially available; however, it is often necessary to produce custom reagents for a given project (see Note 1). In addition to antibodies, we also use metal isotopes as cell identifiers: a 103-Rhodium (103Rh)-based intercalator as well as cisplatin (194/195Pt platinum isotopes) are used to identify dead cells, and Iridium (191/193Ir), which intercalates into DNA/RNA, is used to identify nucleated cell events. Figure 1 summarizes the workflow of a complete mass cytometry experiment. At the time of measurement, fixed labeled cells are resuspended in water and introduced as single cells into a

Fig. 2 Rain plot of mass cytometry raw data (example) as seen during data acquisition. A human PBMC sample was labeled using the panel described in Table 1. The rain plot shows two independent nucleated events (boxes A and B). Above the rain plot, all available mass channels are displayed, with markers added according to panel composition. Channel 140Ce is assigned to the EQ beads, as this metal is reserved for the normalization beads and is never used to tag antibodies. (C) List of channels and panel components. Cell events are identified by the presence of CD45 (D) as leukocytes and Iridium (191/193Ir) (E) as nucleated events. Arrows indicate positive signals observed in individual mass channels corresponding to event A or B, resulting in these cells’ typical mass fingerprint

hot argon plasma where they are atomized and ionized. After removal of noninformative low-mass ions, the ions resulting from the lanthanide tags enter the detection chamber of the mass cytometer. Here they are separated by time of flight and counted on the detector, where they arrive at defined intervals with increasing mass-to-charge ratio (lightest ions arrive first, heaviest ions arrive last). A typical rain plot showing raw mass cytometry data as seen during acquisition on the mass cytometer is shown in Fig. 2. The resulting high-dimensional data are saved in .fcs file format. Data are normalized using a “spiked-in” 4-element bead calibrator solution (EQ beads) to correct for sensitivity changes of the apparatus during acquisition [8]. While analytical software packages for traditional flow cytometry can be valuable for a first quality control of results, data files are typically analyzed by unsupervised methods using algorithms such as SPADE, viSNE, Citrus, or PhenoGraph, to name the most popular ones at this stage [9–12]. Detailed advance planning of a mass cytometry experiment is essential for success. After developing a clear scientific question/hypothesis, researchers must procure sufficient sample material and controls, carefully design the marker panel (see Note 2), validate and optimize the reagents as well as the full panel, and finally plan for and conduct sample acquisition and data analysis. Staining protocols for mass cytometry are largely comparable to those used for traditional fluorescent flow cytometry. We will use

the example of a 20-marker surface phenotyping panel, combining eighteen metal-tagged antibodies with a 103Rhodium intercalator as viability marker and 191/193-Iridium to identify nucleated cell events, applied to cryopreserved human peripheral blood mononuclear cells (PBMCs), to describe a typical mass cytometry experiment. Cryopreserved human PBMCs are the specimen of choice preserved during clinical and translational research projects and are thus frequently used for immune monitoring, giving the protocol wide applicability.
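The statement that lighter ions reach the detector first follows from the general time-of-flight relation: ions accelerated through the same potential have flight times proportional to the square root of their mass-to-charge ratio. The sketch below illustrates only this scaling for a few of the isotopes named in this chapter, assuming singly charged ions. It is a simplified illustration, not a model of the Helios instrument; the function name is hypothetical and no instrument parameters (voltage, flight path) are assumed.

```python
import math


def relative_flight_time(mass, reference_mass, charge=1, reference_charge=1):
    """Flight time relative to a reference ion, using t proportional to sqrt(m/z)."""
    return math.sqrt((mass / charge) / (reference_mass / reference_charge))


# Nominal masses of isotopes mentioned in the text, plus both ends of the 89-209 range.
CHANNELS = {"89": 89, "103Rh": 103, "140Ce": 140, "191Ir": 191, "193Ir": 193, "209": 209}

if __name__ == "__main__":
    for label, mass in CHANNELS.items():
        ratio = relative_flight_time(mass, reference_mass=89)
        print(f"{label:>6}: {ratio:.3f} x the flight time of mass 89")
```

Under this scaling the heaviest channel arrives roughly 1.5 times later than the lightest one, while neighboring masses such as 191 and 193 differ by only about 0.5%, which is why a time-resolved detector is required.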

2 Materials

All solutions and consumables used for mass cytometry must be free of contaminating background metals detectable in the mass range 89–209 Da, as these can interfere with signals of interest and affect the sensitivity and longevity of the detector. When first establishing mass cytometry technologies, and when using novel reagents or buffers, aliquots of all nonisotopic reagents should be analyzed for metal contaminants on the mass cytometer before handling biological specimens. Frequently observed contaminants are barium (138Ba, contained in many commercial detergents), iodine (127I, observed in some plastic consumables and in samples collected after skin sterilization with iodine-containing solutions), and lead (204/206/207/208Pb), found in regions with lead-containing municipal water pipes. The use of disposable plasticware, or of glassware reserved solely for mass cytometry and proven free of contaminants, is the best way to avoid background. Ultrapure Milli-Q® water (resistivity 18 MΩ·cm at 25 °C) is used for dilutions/cell suspensions where required.
1. Complete RPMI: RPMI medium, 10% FBS, 1× penicillin–streptomycin solution, 1× glutamine.
2. Cell thawing medium: Complete RPMI, 25 U/ml benzonase (Sigma).
3. 1× PBS and 10× PBS (Ca2+/Mg2+ free).
4. PBMC samples, cryopreserved and stored in liquid nitrogen.
5. Sterile CSM buffer: 1× PBS, 0.5% BSA, 0.02% sodium azide.
6. Sterile CSM-S buffer: 1× PBS, 0.5% BSA, 0.02% sodium azide, 0.3% saponin.
7. Saponin stock solution: Dissolve 1 g saponin in 10 ml 1× PBS at 37 °C, sterile filter using a 0.2 μm syringe filter, and store at 4 °C for up to 2 months.
8. Milli-Q®-grade water.
9. 0.4% trypan blue solution, filter sterilized.
10. Commercial Fc-receptor blocking solution for human cells.

11. Phenotyping antibodies: MaxPar™ conjugated metal-tagged antibodies (Fluidigm, or tagged in house) (see Note 1). Ensure that sufficient quantities of each antibody reagent are present before starting an experiment.
12. Viability stain: 103Rhodium intercalator (103Rh, 500× stock solution, Fluidigm). Upon delivery, aliquot 10 μl of the 103Rh stock solution into 0.2 ml microcentrifuge tubes and store at −20 °C. Once thawed, store at 4 °C and use within 1 week.
13. 16% paraformaldehyde (PFA) stock solution, methanol-free, electron microscopy grade.
14. 191/193Iridium DNA intercalator (500 μM stock solution, Fluidigm). Upon delivery, aliquot 5 μl of the 191/193Ir stock solution into 0.2 ml microcentrifuge tubes and store at −20 °C. Once thawed, store at 4 °C and use within 1 week.
15. EQ four-element calibration beads (Fluidigm) (see Note 3).
16. Ice bucket with dry ice to transport frozen cells.
17. Ice bucket with ice.
18. 37 °C water bath.
19. Biosafety cabinet for tissue culture work (BSL2 grade).
20. Fume hood.
21. Household bleach.
22. Spray bottle with 70% isopropanol.
23. Tissue culture incubator, 37 °C, 5% CO2 (for resting cells).
24. Tissue culture centrifuge.
25. High-speed microcentrifuge, min 10,000 × g.
26. Vacuum aspirator with clean sterile tips (optional).
27. 1.5 ml microcentrifuge tubes, low protein binding.
28. Sterile plastic pastettes, 5 ml.
29. Cell strainer with 40 μm filter mesh.
30. Falcon 5 ml test tube with cell strainer snap cap.
31. 15 ml conical tubes.
32. 50 ml conical tubes.
33. Sterile serological pipettes (2 ml, 5 ml, and 10 ml).
34. Pipette aid for serological pipettes.
35. Calibrated adjustable volume pipettes (0.2–1000 μl).
36. Sterile filter tips, various sizes (0.2–1000 μl) (see Note 4).
37. Racks for 0.5 ml, 1.5 ml, 15 ml, and 50 ml conical tubes.
38. Vortex mixer.

39. Lint-free paper wipes.
40. Gloves (powder free), laboratory coat, and goggles.
41. Waterproof marker pens.
42. Hemocytometer (Neubauer cell counting chamber), handheld cell counter, and microscope (see Note 5).
43. Mass cytometer with argon supply (Fluidigm Helios™).

3 Methods

3.1 Cell Preparation from Frozen PBMC

3.1.1 Preparation Prior to Thawing Cells
1. Locate cells for thawing in the liquid nitrogen log.
2. Prewarm the water bath to 37 °C.
3. Set the centrifuge (for 15 ml Falcon tubes) to room temperature (R/T).
4. For each PBMC sample, prepare two pastettes and label one 15 ml tube with a unique identifier.
5. Prepare complete RPMI and cell thawing medium and prewarm to 37 °C: each sample will require one 15 ml conical tube with 10 ml of cell thawing medium and 15 ml of complete RPMI medium. Calculate the amount needed to thaw all samples used on the day.
6. Have the tissue culture hood prepared and running. (Wipe the surface with 70% isopropanol, wipe all items to be taken into the tissue culture hood with 70% isopropanol, and follow sterile technique throughout.)
7. Prepare a waste container (one-tenth filled with bleach) for disposal of media supernatant.
8. Set the tissue culture centrifuge to room temperature (R/T).

3.1.2 Thawing PBMC Samples
1. Remove samples from liquid nitrogen and transport to the lab on dry ice.
2. Thaw frozen vials in a 37 °C water bath, being careful not to submerge the tube or lid.
3. When the cells are nearly completely thawed, with the remaining frozen core the size of a grain of rice, take out the cryovial, dry the outside with a paper towel sprayed with 70% isopropanol, and carry it into the tissue culture hood.
4. Remove the 15 ml conicals with 10 ml prewarmed cell thawing medium from the water bath, wipe the outside with a paper towel sprayed with 70% isopropanol, and carry them into the tissue culture hood.

5. Remove lids of 15 ml conical and cryovial containing PBMC sample. 6. Working quickly using a pastette add 1 ml of warm cell thawing medium slowly to the cells, then transfer the cells to their designated 15 ml conical tube. Rinse vial with more of the cell thawing medium to retrieve all cells. When working with multiple samples make sure to have a clear system to avoid mixing them up. Close 15 ml conical tubes. 7. Continue with any remaining samples without delay repeating steps 5 and 6. 8. Spin at 400 Â g for 10 min, R/T. 9. Wipe tubes with a paper towel sprayed with 70% isopropanol and carry into the tissue culture hood and remove lid. 10. Remove supernatant by decanting or aspirating medium and loosen the pellet by tapping the tube. 11. Gently resuspend pellet in 1 ml warm complete RPMI medium. Add 9 ml warm complete RPMI medium resulting in 10 ml total volume. Filter cells using a 40 μm cell strainer if needed (i.e., if you observe any clumps). 12. Spin at 400 Â g for 10 min, R/T. 13. Remove supernatant from the cells and resuspend the pellet by tapping the tube. 14. Resuspend cells in 5 ml complete RPMI medium. 15. Using a Neubauer chamber (hemocytometer): Mix cells up right before counting, take 20 μl cells and mix with 80 μl 0.4% trypan blue solution. Count trypan-blue negative live cells following standard procedure and calculate the live cell number per ml: ðÞÂAverage number of cells=squares 5ðÞÂ Dilution factor 104 ¼ Cell number=ml: Total cell number ¼ 5 Â cell number per ml as cells were resuspended in 5 ml of medium in step 14 (see Note 6). Expect 10–20% of dead cells in a typical PBMC sample. Samples with significantly more dead cells may result in poor material for mass cytometry acquistion as they are more likely to disintegrate when transferred into water before injection. Keep a close record on cell viabilty to develop your own sample inclusion/ exclusion criteria and track potential sources of poor sample quality. 16. 
For optimal detection of cryo-sensitive markers and better cell robustness cells are transferred into a 50 ml conical tube and 292 Susanne Heck et al.

allowed to rest for 2 h in complete RPMI in a tissue culture incubator. 17. While letting the cells rest calculate the volume needed to obtain two million viable cells for surface staining. 18. After rest period wipe outside of 50 ml conical with a paper towel sprayed with 70% isopropanol and carry into the tissue culture hood. 19. Open lid, mix up cells using a 5 ml serological pipette, and transfer volume corresponding to two million viable cells into a fresh, clearly labeled 15 ml conical tube. 20. Spin at 400 Â g for 10 min, R/T, and discard supernatant. 21. Gently tap and loosen pellet in 1 ml of warm PBS. Add 9 ml warm PBS resulting in 10 ml total volume. 22. Spin at 400 Â g for 10 min, R/T, and discard supernatant. 23. Gently tap and loosen pellet and resuspend in warm PBS to a concentration of 2 Â 106 cells/ml. 24. Transfer a 1 ml aliquot of evenly resuspended cells at 2 Â 106 cells per ml into a low protein-binding 1.5 ml microcentrifuge tube and proceed to Subheading 3.2 (see Note 7).
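The counting arithmetic in steps 15 to 19 can be sketched in a few lines of Python. This is only an illustration: the function names and the example counts are hypothetical, while the dilution factor of 5 (20 μl cells + 80 μl trypan blue), the chamber factor of 10⁴, the 5 ml resuspension volume, and the target of two million viable cells are taken from the steps above.

```python
def live_cells_per_ml(counts_per_square, dilution_factor=5.0, chamber_factor=1e4):
    """Live cells per ml from trypan blue-negative counts per Neubauer square."""
    if not counts_per_square:
        raise ValueError("need at least one counted square")
    mean_count = sum(counts_per_square) / len(counts_per_square)
    return mean_count * dilution_factor * chamber_factor


def volume_for_cells_ml(target_cells, cells_per_ml):
    """Volume (ml) of suspension containing target_cells live cells."""
    if cells_per_ml <= 0:
        raise ValueError("cell concentration must be positive")
    return target_cells / cells_per_ml


# Hypothetical live-cell counts from four large squares (not from the chapter).
counts = [22, 25, 19, 26]
conc = live_cells_per_ml(counts)           # cells per ml
total = conc * 5.0                         # cells were resuspended in 5 ml (step 14)
stain_volume = volume_for_cells_ml(2e6, conc)

print(f"{conc:.3g} live cells/ml, {total:.3g} cells in total")
print(f"transfer {stain_volume:.2f} ml to obtain 2 x 10^6 viable cells")
```

If the calculated volume exceeds the 5 ml available, the vial did not yield two million viable cells and the sample has to be judged against your own inclusion criteria.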

3.2 Viability Stain Using 103Rhodium and FcR Blocking

Viability staining is highly recommended to identify dead cells in conventional flow cytometry; in mass cytometry experiments, however, it is absolutely crucial to avoid artifacts in staining and in high-dimensional data interpretation. Do not omit viability staining! The incubation periods in 103Rhodium viability stain [5] and Fc block reagents can be used to prepare the master mix of the antibody panel (see Notes 8 and 9).

1. Remove an aliquot of 500× 103Rhodium (103Rh) stock solution from −20 °C, allow it to thaw, and prepare fresh working solution by adding 1 μl of 500× 103Rh stock solution to 499 μl of sterile 1× PBS.
2. Spin the cells (400 × g, 5 min, R/T) and aspirate the supernatant.
3. Disperse the pellet and resuspend in 500 μl of 1× 103Rh-intercalator in 1× PBS.
4. Incubate for 15 min at R/T.
5. Spin the cells (400 × g, 5 min, R/T) and aspirate the supernatant.
6. Wash by adding 1 ml CSM buffer and inverting the closed tube 3 times.
7. Spin the cells (400 × g, 5 min, R/T) and aspirate the supernatant, being careful not to touch the cell pellet.
8. Disperse the pellet in CSM buffer and add FcR block reagent for 2 × 10⁶ cells (volume of FcR block according to the manufacturer's instructions) to reach a final volume of 50 μl of cell suspension (see Note 10).
9. Incubate for 10 min at R/T, then move on to Subheading 3.3 for surface antibody staining.
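When several samples are stained in parallel, the 1:500 dilution in step 1 has to be scaled. The sketch below shows the arithmetic; `dilute_from_stock`, the sample number, and the 10% pipetting overage are hypothetical choices for illustration, whereas the 500× stock and the 500 μl of 1× working solution per sample come from the steps above.

```python
def dilute_from_stock(final_volume_ul, stock_fold, final_fold=1.0):
    """Return (stock_ul, diluent_ul) needed to dilute a concentrated stock."""
    if stock_fold <= final_fold:
        raise ValueError("stock must be more concentrated than the working solution")
    stock_ul = final_volume_ul * final_fold / stock_fold
    return stock_ul, final_volume_ul - stock_ul


# One sample: 500 ul of 1x working solution from the 500x stock.
stock_ul, pbs_ul = dilute_from_stock(500.0, 500.0)
print(f"1 sample: {stock_ul:.2f} ul stock + {pbs_ul:.2f} ul PBS")

# Assumed example: 6 samples with 10% overage (both numbers are hypothetical).
n_samples, overage = 6, 1.10
stock_ul, pbs_ul = dilute_from_stock(500.0 * n_samples * overage, 500.0)
print(f"{n_samples} samples: {stock_ul:.2f} ul stock + {pbs_ul:.2f} ul PBS")
```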

3.3 Surface Antibody Staining

3.3.1 Preparation Prior to Surface Staining with Metal-Conjugated Antibodies

1. Prepare a printout of all panel antibodies and titration values to have next to you when working in the lab. An example is shown in Table 1. The total stain volume (cells + staining buffer + Fc block + antibodies) must always be constant for all samples in an experimental series, as staining depends on cell and antibody concentration. Typical total stain volumes for samples of ~1–5 × 10⁶ cells are 100–150 μl.
2. Let your mass cytometry facility know in advance which metal channels you wish to record, the channel names (e.g., antibody name/viability stain/barcode), and the panel names, so that they can enter them into the mass cytometer in advance. This step is time-consuming for each new panel; however, good file annotation at this stage is essential to have meaningful metadata for downstream analysis.

3.3.2 Cell Surface Staining with Metal Tagged Antibodies

1. Remove the antibodies from the refrigerator and keep them on ice while working.
2. Centrifuge all antibodies before use (5 min, 10,000 × g, 4 °C) to pellet potential antibody precipitates, then take solution only from the top.
3. Prepare a master mix of antibodies (in CSM buffer as a diluent) in a 1.5 ml low protein-binding microcentrifuge tube labeled with the panel name: as in the example in Table 1, add 39.5 μl of CSM buffer to the tube and pipet in 0.5 μl or 1 μl of each antibody as listed. Scale up the master mix for multiple samples. Make sure you have a clear system so that you know which antibodies you have already added (see Note 11).
4. Vortex the master mix and spin for 5 min at 10,000 × g, 4 °C to remove potential precipitates. Pipet only from the top, leaving any precipitate behind in this tube.
5. Add the antibody cocktail to 50 μl of cells (Subheading 3.2, step 8) and mix well immediately by gentle vortexing or by gently pipetting the sample up and down three times.
6. Incubate the sample on ice for 60 min.
7. Wash by adding 1 ml of CSM buffer.
8. Spin at 400 × g, 5 min, R/T, and aspirate the supernatant.
9. Disperse the pellet and repeat the wash with 1 ml of CSM buffer; aspirate the supernatant.
10. Using the washed cell pellet, proceed to the cell fixation procedure in Subheading 3.4 (see Note 12).

Table 1 Example of stain preparation worksheet for mass cytometry staining

Stain        Target  Isotype  Reagent catalogue number  Reagent lot number  Clone    Mass     Metal  Volume per stain (μl)

Rhodium      L/dead  n/a      201103B                   P11K2108A           NA       103      Rh     0
Iridium      DNA     n/a      201192B                   P11K3104B           NA       191/193  Ir     0
CD45         Hu      mIgG1    3089003B                  3061505             HI30     89       Y      0.5
CCR6/CD196   Hu      mIgG2b   3141003A                  3061502             G034E3   141      Pr     0.5
CD4          Hu      mIgG1    In house                  N/A                 RPA-T4   145      Nd     0.5
CD8a         Hu      mIgG1    3146001B                  0261618             RPA-T8   146      Nd     0.5
CD20         Hu      mIgG1    In house                  N/A                 2H7      147      Sm     0.5
CD16         Hu      mIgG1    3148004B                  2191530             3G8      148      Nd     0.5
CCR4/CD194   Hu      mIgG2b   3149003A                  1691504             205410   149      Sm     0.5
CD3          Hu      mIgG1    In house                  N/A                 UCHT1    154      Sm     0.5
CD45RA       Hu      mIgG2b   3155011B                  1191694             HI100    155      Gd     0.5
CCR7         Hu      mIgG2a   3159003A                  1891504             G043H7   159      Tb     0.5
CD14         Hu      mIgG2a   3160001B                  0261612             M5E2     160      Gd     0.5
CXCR3/CD183  Hu      mIgG1    3163004B                  2951501             G025H7   163      Dy     0.5
CD45RO       Hu      mIgG2a   3165011B                  2191526             UCHL1    165      Ho     0.5
CD27         Hu      mIgG1    3167002B                  0141518             O323     167      Er     0.5
CD25         Hu      mIgG1    3169003B                  1031505             2A3      169      Tm     1.0
CD25         Hu      mIgG1    In house                  N/A                 M-A251   169      Tm     1.0
CD57         Hu      mIgM     3172009B                  0621514             HCD57    172      Yb     0.5
HLA-DR       Hu      mIgG2a   3174001B                  0261619             L243     174      Yb     0.5
CD127        Hu      mIgG1    3176004B                  0261616             A019D5   176      Yb     0.5

Total Ab vol (μl)    10.5
100 − Ab vol (μl)    89.5

For example only: volumes will vary for each panel.
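The master mix bookkeeping for a worksheet like Table 1 can be checked with a short script. The sketch below is illustrative only: the function `master_mix` and its `n_samples` and `overage` parameters are hypothetical, the per-stain volumes are copied from Table 1, and the 50 μl master mix volume per sample follows from Subheading 3.3.2 (39.5 μl CSM buffer plus 10.5 μl antibodies, added to 50 μl of cells).

```python
# Per-stain antibody volumes (ul) copied from Table 1; both CD25 rows are included,
# as in the table total. Rh and Ir are listed with 0 ul and therefore omitted.
TABLE1_UL = {
    "CD45": 0.5, "CCR6/CD196": 0.5, "CD4": 0.5, "CD8a": 0.5, "CD20": 0.5,
    "CD16": 0.5, "CCR4/CD194": 0.5, "CD3": 0.5, "CD45RA": 0.5, "CCR7": 0.5,
    "CD14": 0.5, "CXCR3/CD183": 0.5, "CD45RO": 0.5, "CD27": 0.5,
    "CD25 (2A3)": 1.0, "CD25 (M-A251)": 1.0,
    "CD57": 0.5, "HLA-DR": 0.5, "CD127": 0.5,
}


def master_mix(volumes_ul, mix_volume_ul=50.0, n_samples=1, overage=1.0):
    """Scale an antibody master mix; returns antibody and CSM buffer volumes in ul."""
    total_ab = sum(volumes_ul.values())
    diluent = mix_volume_ul - total_ab
    if diluent < 0:
        raise ValueError("antibody volume exceeds the master mix volume")
    scale = n_samples * overage
    return {
        "antibodies_ul": {name: vol * scale for name, vol in volumes_ul.items()},
        "total_antibody_ul": total_ab * scale,
        "csm_buffer_ul": diluent * scale,
    }


single = master_mix(TABLE1_UL)
print(single["total_antibody_ul"], single["csm_buffer_ul"])

# Assumed example: 8 samples with 10% overage (hypothetical numbers).
batch = master_mix(TABLE1_UL, n_samples=8, overage=1.10)
print(f"{batch['csm_buffer_ul']:.1f} ul CSM buffer for 8 samples")
```

Keeping the master mix volume fixed while the antibody volumes change between panels is what keeps the total stain volume constant across an experimental series.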

3.4 Cell Fixation and Iridium Staining

All work is done in a fume cabinet to avoid exposure to formaldehyde fumes.

1. Remove an aliquot of 500 μM 191/193Iridium stock solution from the −20 °C store and allow it to thaw in the fume hood (see Note 13).
2. Prepare fresh Fix-Perm buffer with Iridium in a 15 ml conical tube; 500 μl of this solution is needed per sample. Add 0.8 ml 10× PBS to 5.9 ml Milli-Q® grade water and mix well by pipetting up and down. Add 1 ml 16% PFA stock solution and 0.3 ml 10% saponin stock solution and mix well by carefully pipetting up and down with a serological pipette, avoiding foam. Add 2 μl 191/193Ir-intercalator (500 μM stock solution) and mix well by carefully pipetting up and down with a serological pipette, avoiding foam. Close the lid tightly, wrap the tube in aluminum foil, and store it in the dark. Use within 48 h.
3. Disperse the cell pellet from Subheading 3.3.2, step 10 and resuspend it in 500 μl of Fix-Perm buffer with Iridium (191/193Ir).
4. Leave the cells at room temperature for 1 h before placing them at 4 °C overnight (see Notes 14–16).
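The final concentrations that result from the Fix-Perm recipe in step 2 are not stated explicitly, but they follow from the listed volumes and stock concentrations. The sketch below works them out; the function name and dictionary layout are hypothetical, and the volumes and stock concentrations are those given in step 2.

```python
def final_concentrations(components):
    """components: name -> (volume_ml, stock_concentration or None for plain diluent)."""
    total_ml = sum(volume for volume, _ in components.values())
    finals = {
        name: stock * volume / total_ml
        for name, (volume, stock) in components.items()
        if stock is not None
    }
    return total_ml, finals


# Volumes and stock concentrations as listed in Subheading 3.4, step 2.
FIX_PERM = {
    "PBS (x)": (0.8, 10.0),
    "Milli-Q water": (5.9, None),
    "PFA (%)": (1.0, 16.0),
    "saponin (%)": (0.3, 10.0),
    "Ir intercalator (uM)": (0.002, 500.0),
}

total_ml, finals = final_concentrations(FIX_PERM)
print(f"total volume: {total_ml:.3f} ml, enough for {int(total_ml // 0.5)} samples")
for name, value in finals.items():
    print(f"{name}: {value:.3f}")
```

Under these numbers the buffer is roughly 1× PBS with about 2% PFA, 0.375% saponin, and 0.125 μM (125 nM) iridium intercalator.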

3.5 Sample Preparation and Acquisition

The following steps are done on the day of sample acquisition on the mass cytometer. Because of the complexity of the apparatus, sample acquisition is usually done by manufacturer-trained and experienced operators, typically in a central facility. The operator will set up and tune the mass cytometer according to the manufacturer's SOP, as detailed in the manufacturer's user manual, and samples will only be acquired after all QC parameters have been passed. It is best to communicate the panel composition to the operator before arriving for sample acquisition, as entering these data into the Helios™ acquisition software takes time that is better spent on running samples. Work to prepare the samples for injection is done in a fume cabinet to avoid exposure to toxic formaldehyde fumes.

1. Remove the samples from 4 °C storage and spin at 600 × g, 7 min, R/T in a microcentrifuge (see Note 17). Aspirate the supernatant, being careful not to dislodge the cell pellet.
2. Resuspend the pellet in 500 μl of 1× PBS.
3. Spin down at 600 × g, 7 min, R/T in a microcentrifuge.
4. Aspirate the supernatant, being careful not to dislodge the cell pellet.
5. Wash the cells in 500 μl 1× PBS (600 × g, 7 min, R/T). OPTIONAL: Count the cells before centrifugation using a hemocytometer or an automated cell counter, following the manufacturer's instructions, to determine the cell number before the water washes (see Note 18).
6. Aspirate the supernatant, leaving the pellet covered with a small volume (~10 μl) of 1× PBS. Place the tubes on ice.
7. When the Helios™ mass cytometer is ready for sample acquisition, proceed with the first sample while any further samples remain on ice (see Note 19).
8. Prepare an aliquot of Fluidigm EQ beads in a 15 ml conical tube: vortex and sonicate (30 s) the EQ bead stock solution, then prepare a 1/10 working dilution by adding 500 μl of EQ beads to 4.5 ml of Milli-Q® water.
9. Wash the cells with 1 ml Milli-Q® water and aspirate the supernatant with a pipette tip. (Vacuum is more efficient at removing the saline solution without disturbing the cell pellet.)
10. Repeat the wash step.
11. Resuspend the cells in 500 μl of 1/10 EQ beads in Milli-Q® water. Filter the sample into a 5 ml round bottom test tube with cell strainer snap cap. Clearly write the sample identifier, concentration, and volume on the side of the tube.
12. Determine the cell number with a hemocytometer or an automated cell counter. Observe the quality of the single-cell suspension: cells should be >95% single cells. This count is not optional; samples must be at a consistent concentration for injection into the mass cytometer.
13. Adjust the cell concentration to 0.5 × 10⁶ cells per ml using 0.1× EQ beads in Milli-Q® water (see Note 20).
14. The sample is introduced into the mass cytometer via the automated sample loader station of the Helios™ mass cytometer. Either a total number of events or a run time can be chosen as the stopping parameter. The Helios™ sample introduction rate is fixed at 0.03 ml/min. The operator will observe the acquisition and ensure that the sample does not run dry, to prevent the introduction of air into the fine capillaries transporting the cells into the nebulizer (see Note 21).
15. To prevent sample carryover, the instrument is cleaned between samples by running a 5 ml round bottom test tube with 2 ml Milli-Q® water via the sample injection port. Water is run until the rain plot on the acquisition pane is free of events (see Fig. 2). Typically, this step takes between 10 and 20 min (see Note 22). Prepare the next sample following steps 9–14.
16. Normalize the raw .fcs files using the normalization algorithm provided by Fluidigm's CyTOF Software version 6.7. This essential step can be performed by the operator or the user and will generate a data file ready for analysis by third-party software (see Note 23).
17. Export the normalized data in .fcs file format for downstream analysis.
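The fixed introduction rate and the adjusted cell concentration together determine how long a sample takes to acquire. The following sketch estimates this; the function names are hypothetical, the 0.03 ml/min rate and 0.5 × 10⁶ cells/ml come from steps 13 and 14, the ~50% cell loss is the figure quoted in Note 7, and 250,000 saved events is the example used in Note 19.

```python
FLOW_ML_PER_MIN = 0.03  # Helios sample introduction rate stated in step 14


def cells_per_second(conc_per_ml, flow_ml_per_min=FLOW_ML_PER_MIN):
    """Cells introduced per second at a given concentration."""
    return conc_per_ml * flow_ml_per_min / 60.0


def minutes_to_record(events_wanted, conc_per_ml, recovery=0.5,
                      flow_ml_per_min=FLOW_ML_PER_MIN):
    """Run time (min) to record events_wanted, assuming a fractional recovery."""
    if not 0 < recovery <= 1:
        raise ValueError("recovery must be in (0, 1]")
    cells_needed = events_wanted / recovery
    volume_ml = cells_needed / conc_per_ml
    return volume_ml / flow_ml_per_min


rate = cells_per_second(0.5e6)
run_min = minutes_to_record(250_000, 0.5e6)
print(f"about {rate:.0f} cells/s introduced")
print(f"about {run_min:.0f} min to record 250,000 events at 50% recovery")
```

Adding the 10 to 20 min water wash between samples to this estimate gives a rough per-sample time for planning an acquisition day.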

Fig. 3 Initial data cleanup of mass cytometry data in preparation for high-dimensional phenotyping and data quality control. Plots shown were generated in CytoBank™ using data from a PBMC sample and the panel listed in Table 1. Gates are outlined in blue and were applied in hierarchical order from plot a to plot e. The steps include (a) gating out EQ bead events (time vs. 140Ce, a metal isotope only occurring in EQ beads in the mass cytometry data files), (b) gating out cell doublets (191Ir vs. cell length) (see Note 24), (c) gating on Iridium (191Ir vs. 193Ir) to remove noncell events, (d) gating out dead cells (time vs. 103Rh), and (e) a purity check on 191Ir vs. CD45 (or another universal cell marker) to remove any remaining non-CD45-positive cell events

3.6 Initial Data Cleanup for High-Dimensional Data Analysis

Strategies for unsupervised analysis of high-dimensional mass cytometry data usually apply a set of algorithms [9–12] to obtain readouts on biomarkers and, owing to the complexity of the steps involved, are not covered in this chapter. However, there are common steps for initial data cleanup in preparation for unsupervised analysis which have to be performed for all files. These steps include (1) gating out EQ bead events (time vs. 140Ce, a metal isotope only occurring in EQ beads in the mass cytometry data files), (2) gating out cell doublets (191Ir vs. cell length) (see Note 24), (3) gating on Iridium (191Ir vs. 193Ir) to remove noncell events, (4) gating out dead cells (time vs. 103Rh), (5) a check on 191Ir vs. CD45 (or another universal cell marker) to remove any remaining non-CD45-positive cell events, and (6) OPTIONAL, IF POSSIBLE IN THE CHOSEN PANEL: using a combination of biologically nonoverlapping markers to exclude further aggregates, such as CD3 (only on T cells) vs. CD20 (only on B cells), CD3 (only on T cells) vs. CD14 (only on monocytes), or CD20 (only on B cells) vs. CD14 (only on monocytes). Events falling in the final cleanup gate after step 5 (or step 6 if the panel allowed for the optional step) are exported as a new .fcs file and used for further analysis steps. Figure 3 illustrates a typical cleanup of mass cytometry data.
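The hierarchical order of the cleanup gates can be illustrated with a toy example. The sketch below is not the CytoBank™ workflow shown in Figure 3: in practice the gates are drawn by hand on bivariate plots, whereas here each gate is reduced to a simple threshold. All threshold values, channel keys, and event values are hypothetical and chosen only to make the example run; CD45 is read from the 89Y channel as in Table 1.

```python
def apply_gates(events, gates):
    """Apply gates in hierarchical order; return surviving events and counts per step."""
    kept = list(events)
    counts = [("all events", len(kept))]
    for name, keep in gates:
        kept = [event for event in kept if keep(event)]
        counts.append((name, len(kept)))
    return kept, counts


# Hypothetical thresholds standing in for manually drawn gates (steps 1-5).
GATES = [
    ("not EQ beads (140Ce low)", lambda e: e["Ce140"] < 50),
    ("singlets (cell length)", lambda e: e["length"] <= 40),
    ("DNA positive (191Ir/193Ir)", lambda e: e["Ir191"] >= 100 and e["Ir193"] >= 100),
    ("live (103Rh low)", lambda e: e["Rh103"] < 50),
    ("CD45 positive (89Y)", lambda e: e["Y89"] >= 10),
]

# Synthetic events, one per artifact type plus two clean cells.
EVENTS = [
    {"Ce140": 900.0, "Ir191": 5.0, "Ir193": 8.0, "Rh103": 2.0, "Y89": 1.0, "length": 20},       # EQ bead
    {"Ce140": 1.0, "Ir191": 900.0, "Ir193": 1600.0, "Rh103": 3.0, "Y89": 150.0, "length": 55},  # doublet
    {"Ce140": 0.5, "Ir191": 12.0, "Ir193": 20.0, "Rh103": 1.0, "Y89": 2.0, "length": 14},       # debris
    {"Ce140": 0.0, "Ir191": 380.0, "Ir193": 700.0, "Rh103": 240.0, "Y89": 90.0, "length": 22},  # dead cell
    {"Ce140": 2.0, "Ir191": 420.0, "Ir193": 760.0, "Rh103": 4.0, "Y89": 0.5, "length": 21},     # CD45 negative
    {"Ce140": 1.0, "Ir191": 450.0, "Ir193": 800.0, "Rh103": 3.0, "Y89": 120.0, "length": 23},   # clean cell
    {"Ce140": 0.0, "Ir191": 400.0, "Ir193": 720.0, "Rh103": 5.0, "Y89": 200.0, "length": 19},   # clean cell
]

clean, counts = apply_gates(EVENTS, GATES)
for name, n in counts:
    print(f"{name}: {n}")
```

Recording the event count after every gate, as this sketch does, makes it easy to spot files in which one cleanup step removes an unusually large fraction of events.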

4 Notes

1. While many antibodies are available in metal tagged format, most panels do require in-house tagging, following the same procedure used by the vendor. For in-house conjugation, 100 μg of antibody in a protein-free formulation (no BSA, no gelatin) and a MaxPar™ metal labeling kit (Fluidigm) are required. Published protocols work best with IgG subclass antibodies, although other subclasses can be tested. Clones need to be validated after conjugation for metal content and functionality, as the tagging process can occasionally affect the binding capacity of the reagent. MaxPar™ labeling kits are not available for yttrium (89Y) and bismuth (209Bi), although a limited set of antibodies to highly expressed antigens is commercially available with these tags. In addition to lanthanides, palladium and platinum isotopes, silver nanoparticles, and Q-dots (cadmium core) have been used to produce metal-conjugated clones, albeit some with lower sensitivity than the lanthanide reagents [15, 16].
2. Panel design for mass cytometry requires careful consideration of the isotopic impurities inherent to enriched lanthanides. Most tags used for mass cytometry occur naturally as mixtures of stable isotopes with various masses. Most isotopes used for mass cytometry are sold at >98% purity; however, some isotopes contain up to 4% contamination with another isotope of the same element (for details see tracesciences.com). It is important not to place antibodies to highly expressed antigens on tags with high impurities, to avoid spill into neighboring channels where low-expression antigens are to be measured. Equally, careful titration of antibodies tagged with more impure tags will reduce spillover into neighboring channels. Rules for panel design are taken into account in the MaxPar™ panel designer (Fluidigm), a free online tool supporting complex panel design for mass cytometry.
3. ICP-MS instruments like the mass cytometer show slight signal drift over the course of the day, between days, and between different instruments, resulting in small differences in sensitivity between data sets. To normalize mass cytometry data, EQ beads are spiked into each sample. Four-element EQ beads (Fluidigm) are a mixture of naturally occurring cerium (140/142Ce), europium (151/153Eu), holmium (165Ho), and lutetium (175/176Lu). Data are normalized to a global reference standard determined for each lot of EQ beads by the manufacturer.
4. Use only filter tips for mass cytometry experiments to avoid metal carryover between samples.
5. If you run mass cytometry experiments frequently, consider using an automated cell counter (e.g., Life Technologies Countess® II used with Countess slides) for easier workflows.
6. Typical recovery from healthy adult human donors is 4–8 × 10⁶ cells from a vial frozen at 10 × 10⁶ cells/vial; cell recovery from patient material and children can vary.
7. The total number of cells required depends on the specific research question and has to be determined during assay optimization. Generally, the population with the lowest expected frequency will determine the total number of events required to achieve statistically significant results with an acceptable CV during data analysis. Contrary to fluorescent cytometry, at the time of writing this chapter it is not possible to put a stopping gate on a specific subpopulation, but only to collect total counts. Furthermore, a loss of ~50% of the introduced cell number has to be taken into account when calculating the total number of cells required.
8. Cisplatin is often preferred for its more robust signal and quicker workflow [13]. 103Rhodium DNA intercalator is, however, less toxic and is the preferred solution when using cell suspensions obtained by enzymatic tissue digest, where cisplatin can lead to unspecific background. Cisplatin must not be used on cell samples derived from patients who have undergone treatment with a cisplatin-containing regimen (cancer therapy).
9. Below is the alternative workflow for using cisplatin as the viability stain. As this protocol is significantly faster, it is advisable to prepare the antibody master mixes before starting the cisplatin stain. Additional material not listed under Subheading 2: cisplatin stock solution (Sigma-Aldrich, catalogue number 479306, linear formula Pt(NH3)2Cl2, MW 300.05) and DMSO. All work is done in a fume hood, carefully avoiding skin and eye contact with cisplatin (toxic).
   1. Prepare a cisplatin master stock at 100 mM in DMSO, dispense in 0.35 ml aliquots in Eppendorf tubes, and freeze at −80 °C.
   2. From the master stock prepare a working stock solution: dilute 1 in 10 for a 10 mM working stock in DMSO (10–20 μl aliquots) and freeze at −80 °C.
   3. On the day of the experiment thaw an aliquot of the working stock solution. (The master stock is thawed only when you need to make a new set of working stock aliquots.)
   4. Add 2.5 μl of the working stock to 997.5 μl of 1× PBS per sample (final cisplatin stain concentration 25 μM).
   5. Make sure you have all of the following items ready before you start: vortex, timer, CSM buffer to quench, pipettes, and centrifuge.
   6. Dispense a 1 ml aliquot of evenly resuspended cells at 2 × 10⁶ cells per ml into a 5 ml round bottom test tube. Use capped tubes.
   7. Spin at 400 × g, 5 min, R/T, and aspirate the supernatant.
   8. Disperse the pellet in PBS (no BSA, i.e., no extra protein), spin down at 400 × g, 5 min, R/T, and aspirate the supernatant.
   9. Resuspend the pellet in 1000 μl of PBS, make sure there are no clumps, and filter through a 40 μm filter mesh if required.
   10. Add 2.5 μl of cisplatin working stock by pipette and vortex immediately. Start the timer, set to 1 min.
   11. Incubate at R/T and quench each sample after exactly 1 min by adding 3 ml CSM.
   12. Spin at 400 × g, 5 min, R/T, and aspirate the supernatant.
   13. Wash by adding 1 ml CSM and inverting the tube 3×.
   14. Resuspend the cells in 1 ml CSM buffer and transfer to a low protein-binding 1.5 ml microcentrifuge tube. Continue the protocol as described from Subheading 3.2, step 7.
10. FcR blocking is essential when working with cells of hematopoietic origin, such as PBMCs, whole blood, bone marrow, or spleen suspensions, to block unspecific antibody binding via their cell surface Fcγ receptors, while it can be omitted for other tissue and cell types.
11. Volumes of antibodies must be determined in titration experiments using your specific biological sample type before running final samples. Separate master mixes must be made if you perform a surface and an intracellular staining procedure. Master mixes are not stable when stored in CSM buffer and must be used on the day of preparation.
12. If the workflow also includes intracellular staining, you will now move to the fixation and permeabilization step, followed by staining with a cocktail of intracellular antibodies and a number of washes. It is best to finish surface and intracellular staining on the same day, then fix the stained sample overnight. Some transcription factors/cytokines do not stain well if cells are fixed overnight prior to the intracellular stain. Once completed, the cells will follow the workflow in Subheading 3.4. For phosphoflow, samples fixed in methanol can be stored for a prolonged time at −80 °C.
13. Ensure that you have sufficient good-quality paraformaldehyde (PFA) fixative: a 16% stock solution of electron-microscopy grade material gives good results. The stock solution is usually shipped in 10 ml glass ampoules, which can be opened when required and transferred to a 15 ml tube wrapped in foil (protected from light) and stored in the chemicals cupboard. The 16% PFA stock solution can be kept for 1 month before being discarded. Working dilutions (2%, in PBS) are made fresh and can be stored for 48 h (again, protected from light) before being discarded.
14. A minimum fixation time of 1 h at R/T is required, but will result in significant cell loss. A fixation time of 4 h gives better cell preservation, but overnight fixation is recommended, as it gives the best fixation. Longer incubations of up to 48 h have also been successful.
15. If it is not possible to acquire the sample on a mass cytometer within 48 h of staining, it is advisable to freeze the samples to avoid variation in data quality and to continue with Subheading 3.5 after recovering the stored material according to the protocol [14].
16. For cells which are already permeabilized after intracellular staining with a detergent-based perm solution, prepare the fixation solution as follows: Prepare fresh Fix-Perm buffer with Iridium in a 15 ml conical tube; 500 μl of this solution is needed per sample. Add 0.8 ml 10× PBS to 6.2 ml Milli-Q grade water and mix well by pipetting up and down. Add 1 ml 16% PFA stock solution and mix well by carefully pipetting up and down. Add 2 μl 191/193Ir-intercalator (500 μM stock solution) and mix well by carefully pipetting up and down with a serological pipette. Close the lid tightly, wrap the tube in aluminum foil, and store it in the dark. Use within 48 h.
17. After incubation in Fix-Perm buffer with Iridium it is important to spin the cells at a higher speed to avoid cell loss. For PBMC samples 600 × g, 7 min, R/T is sufficient; however, other sample types may require different conditions, which must be determined by the researcher. Fixed cells are smaller than their fresh equivalents and thus do not pellet as easily. Also, the pellet is more transparent and more easily dislodged from the tube than a pellet of unfixed cells. Always place the tubes into the centrifuge rotor in the same orientation, with the hinge of the lid facing outward, so that you can expect the pellet in the same location every time. Any mixing steps for fixed cells should be less vigorous: short pulse vortexing or gentle pipetting is recommended from this point.
18. Counting cells prior to Subheading 3.5, step 9 can be informative: extensive cell loss during the water washes is indicative of poor fixation and/or fragile cells and will help guide your further experiments. If the cell count shows that there are more cells than required to achieve your target event rate, consider taking an aliquot of the sample and reserving the remaining material in case you encounter any problem during the injection of the first portion. Once cells have proceeded to Subheading 3.5, step 10 they can no longer be salvaged.
19. Usually several samples will be acquired per experiment on a single day. Assuming 250,000 events will be saved per sample, it is possible to read ~10 samples during normal working hours. It is NOT advisable to rebuffer all samples at once into water/EQ beads, as cells will disintegrate the longer they stay exposed to water. Therefore single samples are prepared as described while the remaining cell pellets stay on ice. While acquisition of sample 1 is progressing, the second sample is prepared for injection.
20. For barcoded samples slightly higher cell concentrations can be used (0.75–1.0 × 10⁶ cells/ml); however, the concentration should be constant for any given type of experiment. Using concentrations above 0.75–1 × 10⁶ cells/ml leads to a loss in resolution, because the ion clouds of single cells overlap each other during acquisition on the mass cytometer, and hence to loss of data.
21. The number of required events will depend on the experimental question and has to be determined by the user during the experimental design phase. If any cells remaining at the bottom of the tube after the first run have to be acquired as well, they can be resuspended in ~0.25 ml of Milli-Q® water and injected.
22. If more stringent washing is required, the operator will run Helios cleaning solution (Fluidigm) followed by Milli-Q® water.
23. The normalization software contained in the Helios 6.7 software can be downloaded from Fluidigm. (Create an account at http://www.dvssciences.com/create-account.php, then navigate to Data Processing Software & Documents > Helios > 6.7 Software Package for Stand-alone Workstations and download the Helios 6.7 software for stand-alone workstations.) Another option delivering equally good results has been published [8] and can be run in MATLAB (MATLAB Compiler Runtime (MCR) version 8.1 (R2013a)). For this normalization software see https://github.com/nolanlab/bead-normalization/releases.
24. In addition to cell length it is possible to add other Gaussian parameters characterizing the signal pulses on the mass cytometer: Center, Offset, Width, and Residual are saved as metadata alongside the cell event data in each .fcs file acquired on a Helios™ mass cytometer.
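The reasoning in Note 7 can be turned into a rough planning calculation. The sketch below assumes simple Poisson counting statistics, in which the CV of a count of n events is about 100/√n percent; this model is an assumption added here for illustration and is not prescribed by the protocol. The function names, the 5% target CV, and the 0.1% population frequency are hypothetical, while the ~50% loss of introduced cells is the figure given in Note 7.

```python
import math


def events_for_cv(target_cv_percent):
    """Events needed in the rarest population for a target CV (Poisson assumption)."""
    if target_cv_percent <= 0:
        raise ValueError("target CV must be positive")
    return math.ceil((100.0 / target_cv_percent) ** 2)


def cells_to_introduce(rare_events, frequency, recovery=0.5):
    """Cells to introduce so that rare_events of the rarest population are recorded."""
    if not 0 < frequency <= 1 or not 0 < recovery <= 1:
        raise ValueError("frequency and recovery must be in (0, 1]")
    return math.ceil(rare_events / frequency / recovery)


rare = events_for_cv(5.0)                      # hypothetical target: 5% CV
cells = cells_to_introduce(rare, 0.001)        # hypothetical population at 0.1% of cells
print(f"{rare} events needed in the rare population")
print(f"about {cells:,} cells to introduce at 50% recovery")
```

Estimates of this kind only set a lower bound; the numbers actually required have to be confirmed during assay optimization, as stated in Note 7.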

References

1. Bandura DR, Baranov VI, Ornatsky OI, Antonov A, Kinach R, Lou X, Pavlov S, Vorobiev S, Dick JE, Tanner SD (2009) Mass cytometry: technique for real time single cell multitarget immunoassay based on inductively coupled plasma time-of-flight mass spectrometry. Anal Chem 81(16):6813–6822
2. Ornatsky OI, Kinach R, Bandura DR, Lou X, Tanner SD, Baranov VI, Nitz M, Winnik MA (2008) Development of analytical methods for multiplex bio-assay with inductively coupled plasma mass spectrometry. J Anal At Spectrom 23(4):463–469
3. Bendall SC, Simonds EF, Qiu P, Amir ED, Krutzik PO, Finck R, Bruggner RV, Melamed R, Trejo A, Ornatsky OI, Balderas RS, Plevritis SK, Sachs K, Pe'er D, Tanner SD, Nolan GP (2011) Single cell mass cytometry of differential immune and drug responses across a human hematopoietic continuum. Science 332:687–696
4. Virani F, Tanner SD (2015) Mass cytometry: an evolution in ICP-MS enabling novel insights in single-cell biology. Spectroscopy 30(5). http://www.spectroscopyonline.com
5. Ornatsky O, Bandura D, Baranov V, Nitz M, Winnik MA, Tanner S (2010) Highly multiparametric analysis by mass cytometry. J Immunol Methods 36:1–20
6. Tanner SD, Baranov VI, Ornatsky OI, Bandura DR, George TC (2013) An introduction to mass cytometry: fundamentals and applications. Cancer Immunol Immunother 62(5):955–965
7. Takahashi C, Au-Yeung A, Fuh F, Ramirez-Montagut T, Bolen C, Mathews W, O'Gorman WE (2017) Mass cytometry panel optimization through the designed distribution of signal interference. Cytometry A 91A:39–47
8. Finck R, Simonds EF, Jager A, Krishnaswamy S, Sachs K, Fantl W, Pe'er D, Nolan GP, Bendall SC (2013) Normalization of mass cytometry data with bead standards. Cytometry A 83(5):483–494
9. Qiu P, Simonds EF, Bendall SC, Gibbs KD Jr, Bruggner RV, Linderman MD, Sachs K, Nolan GP, Plevritis SK (2011) Extracting a cellular hierarchy from high-dimensional cytometry data with SPADE. Nat Biotechnol 29:886–891
10. Amir ED, Davis KL, Tadmor MD, Simonds EF, Levine JH, Bendall SC, Shenfeld DK, Krishnaswamy S, Nolan GP, Pe'er D (2013) viSNE enables visualization of high dimensional single-cell data and reveals phenotypic heterogeneity of leukemia. Nat Biotechnol 31:545–552
11. Bruggner RV, Bodenmiller B, Dill DL, Tibshirani RJ, Nolan GP (2014) Automated identification of stratifying signatures in cellular subpopulations. Proc Natl Acad Sci U S A 111(26):E2770–E2777
12. Levine JH, Simonds EF, Bendall SC, Davis KL, Amir e-AD, Tadmor MD, Litvin O, Fienberg HG, Jager A, Zunder ER, Finck R, Gedman AL, Radtke I, Downing JR, Pe'er D, Nolan GP (2015) Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis. Cell 162(1):184–197
13. Fienberg HG, Simonds EF, Fantl WJ, Nolan GP, Bodenmiller B (2012) A platinum-based covalent viability reagent for single-cell mass cytometry. Cytometry A 81(6):467–475
14. Sumatoh HR, Teng KW, Cheng Y, Newell EW (2017) Optimization of mass cytometry sample cryopreservation after staining. Cytometry A 91(1):48–61
15. Mei HE, Leipold MD, Maecker HT (2016) Platinum-conjugated antibodies for application in mass cytometry. Cytometry A 89(3):292–300. https://doi.org/10.1002/cyto.a.22778
16. Hartmann FJ, Simonds EF, Bendall SC (2018) A Universal Live Cell Barcoding-Platform for Multiplexed Human Single Cell Analysis. Sci Rep 8:10770. https://doi.org/10.1038/s41598-018-28791-2

Chapter 19

Classification of the Immune Composition in the Tumor Infiltrate

Davide Brusa and Jean-Luc Balligand

Abstract

Flow cytometry is one of the most suitable techniques for analyzing and classifying different cell suspensions derived from blood or other compartments. The different cellular subtypes are characterized with antibodies that detect surface or intracytoplasmic antigens. Here we describe a technique to thoroughly characterize immune cells from tumor infiltrates and to isolate them by single-cell sorting.

Key words Flow cytometry, FACS analysis, Immune system, Immune infiltrate, Tumor immunology, Single-cell sorting

1 Introduction

Fluorescence-activated cell sorting (FACS) is increasingly used worldwide in several areas of clinical and translational research [1, 2], for phenotypic characterization and for functional analysis by measuring cytokines, intracellular signaling, and cell proliferation. It is a rapid, sensitive, quantitative, single-cell analysis technique that may still give impetus to new developments, such as the identification of circulating tumor cells (CTCs) [3] or microvesicles [4, 5]. A flow cytometer consists of three main systems: fluidics, optics, and electronics. The fluidics system transports the particles, in a single-cell suspension, by hydrodynamic focusing to the laser beam at the so-called interrogation point. Here the cells are illuminated by the lasers and the resulting light is sent to the optical system, where the fluorescence is divided by band-pass filters that select each wavelength detected by the photomultiplier tubes (PMTs). In the electronic system, the PMTs convert light signals into electronic signals for further processing by the computer. Flow cytometry characterizes cells through scatter analysis and fluorescence.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_19, © Springer Science+Business Media, LLC, part of Springer Nature 2019

Scatter analysis describes the size (forward-scattered light or FSC) and internal complexity (side-scattered light or SSC) of cells according to the presence of granules and polylobate nuclei. Fluorescence identifies the different types of white blood cells in lymphocyte, monocyte, macrophage, and granulocyte compartments. Inside the lymphocyte compartment there are many subpopulations, such as T, B, NK, and T-reg cell subsets, that we cannot identify with scatter analysis. Consequently, we need surface markers to identify the specific cell subsets. These markers are called Cluster of Differentiation (CD) antigens and are identified by monoclonal antibodies that, in flow cytometry, are conjugated to fluorochromes [6]. Fluorochromes are molecules that, once irradiated by a laser source, emit fluorescence at a higher wavelength than the excitation one. The emitted light is recorded by the flow cytometer PMTs and the data are shown as dots in a Cartesian plane called a dot plot. The two axes represent the fluorescence emissions, and cells are represented as dots according to the amount of fluorescence expressed. Since flow cytometry has always been used to detect different cell subpopulations in different blood diseases, we apply this method to detect immune subpopulations in tumor infiltrates. For this purpose, we process the surgical biopsies of different tumors in order to obtain a single-cell suspension. Subsequently we stain these samples with a large number of antibodies in order to obtain as much information as possible on the cellular subpopulations. Identified cells can be easily isolated with a single-cell sorter: this instrument is suitable for recovering all the needed cells from the total suspension, making them available for further protein, RNA, and DNA analysis by next generation sequencing (NGS).

2 Materials

All antibodies and buffers should be stored at 4 °C and used within the expiration date. Renew and filter the buffers each week.

2.1 Samples Preparation Buffers

1. Washing buffer: phosphate buffered saline (PBS) without Ca2+/Mg2+ with 0.5% BSA (see Note 1).
2. Digestion buffers: trypsin, Accutase, or collagenase type 1 (10 μg/mL) + DNase (10 μg/mL). Prepare and use according to the type of tissue (see Note 2).
3. Gradient buffer: Ficoll (see Note 3).
4. Blocking buffer: PBS 1–2% BSA or commercially available solutions blocking the fragment crystallizable receptors (FcR) (see Note 4).
5. Viability dyes: PI, 7-AAD, DRAQ5, DRAQ7, or DAPI.

2.2 FACS Buffers

1. Staining buffer: PBS w/o Ca2+/Mg2+, 1% BSA, 1–2 mM EDTA (see Note 5).
2. Fixation buffer: 2–4% paraformaldehyde (PFA) solution (see Note 6).
3. Permeabilization buffers: 0.1% saponin in PBS (see Note 7).

2.3 Antibodies and Beads

1. Purified monoclonal antibodies or fluorochrome-conjugated antibodies (Abs) according to the analysis needed (see Note 8); these should be titrated before use by each laboratory.
2. Compensation beads to set the right compensation between fluorochromes in a multiparametric staining.
3. Calibration beads to check the daily performance of the instrumentation.

2.4 Tubes and Filters

1. 5 mL 12 × 75 mm FACS tubes.
2. 70–100 μm tube filters.

2.5 Sorting Buffers

1. Resuspension buffer: PBS, 1–2% BSA, 1–2 mM EDTA.
2. Recovering buffers: complete medium (5–10% FBS, 25 mM HEPES in DMEM/RPMI).

3 Methods

All procedures are performed at room temperature unless otherwise indicated.

3.1 Samples Preparation

1. Partially digest the tumors or biopsies using the digestion buffer: make 3–5 injections into the tissue with a syringe, according to the size of the tumor tissue, so that the tissue can be smashed easily afterward. Incubate for 30 min at 37 °C.
2. Place the sample on a 100 μm nylon strainer and use the syringe plunger to smash the tissue and pass the single-cell suspension through the strainer. Before proceeding with further steps, block the digestion with ice-cold complete medium and wash out the buffer (see Note 9).
3. Tissue homogenate can be stratified on Ficoll (density of 1.077 g/mL) to separate the white blood cells (WBC) from parenchymal and dead cells or debris. Add 5 mL of Ficoll to a 15 mL Falcon tube and stratify 10 mL of tissue homogenate on it (see Note 10). Centrifuge at 400 × g for 30 min at room temperature (RT) with no brakes (see Note 11).
4. Leukocytes are separated from red blood cells (RBC), tissue cells, and debris. Leukocytes form an opaque layer between the Ficoll and the medium; all other cells are in the pellet at the bottom of the tube. Remove and discard the top part above the layer to better recover the opaque interphase layer containing leukocytes. Take the interphase with a pipette and add it to a fresh 50 mL Falcon tube containing PBS w/o Ca2+/Mg2+. Centrifuge and perform three washes with PBS to remove any trace of Ficoll (see Note 12).
5. Resuspend the cell pellet in staining buffer and always count the cells before proceeding with staining. Divide the cells among the 12 × 75 mm tubes, with a minimum of 10^5 cells and a maximum of 1.5 × 10^6 cells per tube (see Note 13).

3.2 Staining Procedure

1. Take the amount of cells needed and, if your cells express high levels of Fc receptors, proceed with FcR blockade at 4 °C for 10–15 min.
2. Proceed directly with staining without washing. Add the Abs in 100 μL of staining buffer, used to resuspend the pellet, at the concentration indicated in the manufacturer's datasheet or determined by laboratory titration [7].
3. Incubate for 15–20 min at room temperature (RT), protected from light (see Note 14).
4. Resuspend the cells in washing buffer and centrifuge to discard the supernatant. Add 100–200 μL of resuspension buffer for the analysis (see Note 15).
5. If the Abs are unconjugated, indirect staining is needed. Primary Abs against the cell antigen are added under the same conditions as for direct staining. After two washes in washing buffer, the cells are resuspended in staining buffer for binding of the conjugated secondary Abs directed against the primary Abs. Incubate for 10 min at RT in the dark and wash the cells. Resuspend as in step 4.
6. If the sample cannot be read on the day of preparation, fix it with the fixation buffer.
7. Add PI, 7-AAD, or DAPI (if a UV or violet laser is available) for dead cell discrimination just before analysis (see Note 16), and only if the cells are not fixed (see Note 17).

3.3 FACS Analysis

1. The tumor infiltrate analysis consists of detecting many subpopulations in the white blood cell compartment. A multiparametric staining is necessary to identify all the different subpopulations. In this case, a wide range of fluorochromes can be used for a variety of Abs. The most widely adopted compatible fluorochromes are FITC, PE, PE-Cy5, PerCP-Cy5.5, PE-Cy7, APC, APC-Cy7, BV421, and BV510 [8].
2. Single color controls are required to set up the compensation matrix. It is good practice to use the BD CompBeads or polystyrene microparticles coupled to an antibody specific for the kappa light chain of mouse, rat, or rat/hamster Ig to perform the single staining controls according to the manufacturer's protocol [7] (see Note 18).
3. Panel design strategy: many Abs against different antigens are available for tumor infiltrate analysis, and the choice depends on the analysis the researchers are interested in. The most studied are CD45, CD3, CD8, CD4, CD11b, CD11c, CD14, CD19, CD20, CD62L, CD27, CD28, CD68, CD86, CD138, CD206, MHCII, etc., which allow detection of the different subpopulations of T cells, B cells, macrophages, monocytes, etc. (see Note 19).
4. Before starting the analysis, always run the daily cytometer setup and tracking (CS&T) performance check (see Note 20).
5. Design the correct gates, discriminating between positive and negative cells, and record the samples accordingly. The gating strategy is the fundamental part of flow cytometry analysis because it determines which subpopulation is studied and sorted. The best way to determine the fluorescence gating strategy is to use fluorescence minus one (FMO) controls [9] (see Note 21).
6. Representative results: all the different leukocyte subpopulations can be analyzed. Since the analysis is performed on a tumor infiltrate, it is better to select all the leukocytes with CD45 first (Fig. 1a). Then, a morphological gate is necessary to identify the lymphocytes (Fig. 1c, d), detected as T cells (CD3+) and divided into the two major subsets, T helper (CD4+) and cytotoxic T lymphocytes (CD8+) (Fig. 1f). Both CD4+ and CD8+ cells can be further divided on the basis of CD45RA and CCR7 (or, alternatively, CD62L) staining into maturational subsets, defined as naïve (CCR7+CD45RA+), central memory (CCR7+CD45RA−), effector memory (CCR7−CD45RA−), and terminally differentiated memory cells (CCR7−CD45RA+) (Fig. 1g, h). T-reg cells, gated on CD4+ as CD127−CD25hiFoxP3+ [10], are very important in the tumor infiltrate analysis because T-regs represent one of the immunosuppressive subpopulations that drive the tumor escape mechanisms. B cells (Fig. 1e) can be identified as CD19+CD20+, with the subpopulation of B-regs [11] identified as IgD−IgM−CD1dhiCD24hiCD38hi. Naïve (CD27−) and memory (CD27+) B cells can also be analyzed. Plasma cells (PC) can be identified as CD19+CD20−CD138+. NK cells are identified as CD16+CD56+ gated on CD3−CD19− cells, and NKT cells as CD3+CD16+CD56+ (Fig. 1d). Monocytes can be identified as CD14+CD11b+ (Fig. 1b) and, where needed, also as CD4dim cells gated on a higher SSC than lymphocytes. Macrophages are selected as CD11b+CD68+ and divided into proinflammatory M1 (MHCII+) and tumor-associated M2 macrophages (CD206+) [12]. For myeloid derived suppressor cells (MDSC) there is still no complete consensus, but they can be identified as CD14+/−CD33+MHC-II− [13]. Dendritic cells are selected as SSChiCD14−CD11b+CD11c+MHC-II+CD86+CD83+ [10]. Gating on granulocyte scatter (SSChi), it is possible to identify eosinophils as CD66b+CD16−, neutrophils as CD66b+CD16+, and basophils as CD66b+CD16+CD294+ [9] (Fig. 1).

Fig. 1 Representative flow-cytometric panel of human blood cell sample analysis. (a) CD45+ gate on leukocytes. (b) Monocytes isolation identified as CD11b+CD14+ cells. (c) Morphological gate on the lymphocytes discriminating the cellular debris according to FSC and SSC. (d) Identification of T cells (CD3+), NK (CD16+CD56+CD3−), and NKT (CD3+CD16dim) subsets. (e) Gated on CD3−CD16−CD56− it is possible to identify B cells (CD19+). (f) CD4+ (Th) and CD8+ (CTL) expression gated on CD3+ cells. Identification of naïve (CCR7+CD45RA+), central memory (CM, CCR7+CD45RA−), effector memory (EM, CCR7−CD45RA−), and terminally differentiated effector RA+ (CCR7−CD45RA+) subpopulations gated on CD4+ (g) and CD8+ (h) cells. (i) Morphological gate on granulocytes identified as higher SSC in comparison to lymphocytes and monocytes. (j) Subsequent gate on myeloid cells (CD11b+) and identification of neutrophils (CD16high) and eosinophils (CD16dim) (k)
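The marker combinations listed in step 6 of Subheading 3.3 can be written down as a small lookup, which may help when annotating exported events. The following Python sketch is illustrative only: the function names and the simplification of each marker to a positive/negative call are assumptions made here, not part of the protocol, and real gates are drawn on continuous fluorescence values with FMO-defined boundaries.

```python
def t_cell_maturation(ccr7_pos: bool, cd45ra_pos: bool) -> str:
    """Map CCR7/CD45RA status to the maturational subsets named in step 6."""
    if ccr7_pos and cd45ra_pos:
        return "naive"
    if ccr7_pos:
        return "central memory"
    if not cd45ra_pos:
        return "effector memory"
    return "terminally differentiated"


def coarse_lineage(positive: set) -> str:
    """Assign a coarse lineage from the set of markers called positive.

    Hypothetical helper: only a subset of the phenotypes in the text is
    covered, and dim/high expression levels are ignored.
    """
    if "CD45" not in positive:          # leukocyte gate comes first
        return "non-leukocyte"
    if "CD3" in positive:
        if {"CD16", "CD56"} <= positive:
            return "NKT cell"
        if "CD4" in positive:
            return "T helper"
        if "CD8" in positive:
            return "cytotoxic T"
        return "T cell"
    if "CD19" in positive:
        if "CD138" in positive and "CD20" not in positive:
            return "plasma cell"
        return "B cell" if "CD20" in positive else "unclassified"
    if {"CD16", "CD56"} <= positive:    # CD3- and CD19- at this point
        return "NK cell"
    if {"CD11b", "CD68"} <= positive:
        if "CD206" in positive:
            return "M2 macrophage"
        return "M1 macrophage" if "MHCII" in positive else "macrophage"
    if {"CD11b", "CD14"} <= positive:
        return "monocyte"
    return "unclassified"


if __name__ == "__main__":
    print(coarse_lineage({"CD45", "CD3", "CD8"}))
    print(t_cell_maturation(ccr7_pos=False, cd45ra_pos=False))
```

The order of the checks mirrors the gating hierarchy in the text: CD45 first, then CD3, then the CD3-negative lineages.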

3.4 Single-Cell Sorting

FACS is a highly sophisticated technique for purifying cell populations at the highest degree of purity, reaching 95–100% purity of the sorted population [7]. This technique therefore shows its best results in experiments where high purity is an essential requirement (e.g., microarray analysis) [14]. In recent years, the development of ultrahigh-speed sorters has further extended the possible applications of flow sorting in clinical settings. Potential clinical applications of FACS now include purification of blood stem cells from human blood for therapeutic purposes [15], applications in cancer therapy [1], amniocentesis replacement [7], and sorting of human sperm [16].

Single-cell sorting has a great impact on the analysis of cellular development [17]. Ig or TCR gene rearrangements arising during lymphocyte development can be amplified by PCR and the genetic basis of the immune response characterized [18]. With the development of the clone sorting system, it has become possible to identify the phenotype of the single cell that self-renewed or gave rise to differentiated progeny.

1. Once the populations have been determined, select them with a designated gate and recover only the specific cells into collection tubes. Up to four gated populations can be sorted at once. For plate sorting, only one population at a time can be sorted.
2. Select the appropriate nozzle depending on the cell type to be sorted. For a sterile sort, sterilize the instrument; cells from a sterile sort can subsequently be cultured.
3. Run the experimental sample tube at 4 °C, turn on the deflection plates, and sort the sample.
4. Set the selection and sorting mask to single-cell sorting, in order to discard all doublets and deposit just one cell per well of a 96- or 384-well plate containing a lysis buffer that lyses the cells while preserving the RNA/DNA content.
5. Store the plate at −80 °C for further gene expression analysis, RNA-seq, or single-cell NGS.

4 Notes

1. The use of buffers containing Ca2+/Mg2+ may increase the formation of cellular aggregates or doublets, which are discarded in the analysis, thereby reducing the number of analyzed cells.
2. Many organs from mouse or human tissue biopsies need to be digested before staining to obtain a single-cell suspension. Each tissue needs to be tested to find the enzymatic digestion that does not interfere with epitope expression. Moreover, the enzymatic treatment can affect cell viability. For this reason, it is best to add a viability marker (PI, 7-AAD, DAPI, etc.) during the analysis.
3. As a gradient separator, Ficoll is appropriate for isolating viable cells and removing dead cells, fragments, and debris.
4. Staining without FcR blockade may result in nonspecific binding, especially if the staining involves B cells, monocytes, and macrophages. In this case, FcR blocking is always required. Alternatively, many new Abs carry mutations in the Fc sequence and do not bind nonspecifically; with these Abs, the blocking step can be omitted.
5. EDTA helps to maintain the single-cell suspension by preventing the formation of clusters and doublets. Always check which EDTA concentration is best for the cells; normally 1–2 mM EDTA is adequate.
6. The choice of fixative depends on the type of intracellular staining needed. For staining of intracytoplasmic antigenic proteins, cytokines, and chemokines, 2–4% PFA buffer is the most widely used. This buffer is also used at the end of staining to fix the cells and keep them in the fridge before analysis. Other buffers may contain 70% ethanol for staining of intranuclear transcription factors. Alternatively, commercially available Fix&Perm buffers from different suppliers can be used.
7. Fix&Perm buffer is suitable for intracytoplasmic staining. For longer or particularly difficult stainings, cells should be kept at 4 °C. Fixed cells cannot be recovered from cell sorting for cell culture.
8. Monoclonal Abs are to be used in preference to polyclonal Abs, because polyclonal Abs show high nonspecific binding.

9. Homogenization of tissues should always be performed on ice, working on a petri dish to cut and smash the tissue.
10. Take care not to mix the sample with the Ficoll; otherwise the white blood cell band will not form.
11. Braking during the start and stop of centrifugation may interfere with the formation of the WBC layer; deactivate this option on the centrifuge. Set the centrifuge to RT and not to 4 °C, because low temperatures interfere with the WBC layer formation.
12. Ficoll is a polysucrose-based solution and may interfere with cell viability if not removed correctly after the stratification. Always proceed with at least three washes.
13. If the amount of cells is low, it is also possible to seed the cells into a 96-well plate and proceed with staining directly there. Add the mixture of Abs, wash the cells by adding 150–200 μL of washing buffer, and centrifuge the plate with the plate adaptor. Discard the supernatant by quickly flipping the plate into the sink.
14. Always perform staining in the dark to prevent fluorochromes from being degraded by direct light, above all if tandem dyes are used.
15. If the cells are prone to clumping, make sure to have a single-cell suspension without aggregates before analysis: check under the microscope and, if aggregates are present, proceed with a further filtration step with a 70 μm strainer.
16. PI and 7-AAD are detected in the PerCP channel but may interfere with PE. Make sure to set all the compensations with the other channels correctly.
17. If fixation is performed, viability dyes cannot be used: these dyes discriminate dead cells because they pass the membrane of injured cells and bind the DNA, and after fixation the membrane may no longer be intact in any cell. Omit the viability staining step if cells are fixed.
18. Compensation is necessary to correct for the spectral overlap of one fluorochrome's emission into a second detector. CompBeads (BD) allow automatic compensation by calculating an algebraic matrix for all the fluorophores used in the experiment. In a multiparametric experiment this is good practice, and it is very helpful for the researcher because it reduces the setup time of the experiment.
19. For best results, the panel design should consider fluorochrome brightness. Bright fluorochromes should be conjugated to antibodies detecting markers expressed at low levels. Conversely, markers that are well expressed and provide a good separation between negative and positive cell populations should be paired with dimmer fluorochromes.

20. CS&T is a daily check made by the operator to verify that the laser power performs consistently over time. The CS&T beads (BD) are required to run the software in an automatic way. This performance check is always required so that measurements are made under the same conditions when an experiment runs over several days.
21. In an FMO (fluorescence minus one) control tube, all reagents used in the experiment are included except one. Prepare as many FMO tubes as the number of fluorochromes used. For example, if the staining consists of three Abs conjugated with FITC, PE, and APC, the FMO tubes will be FMO FITC (only PE and APC fluorescence), FMO PE (only FITC and APC), and FMO APC (only FITC and PE). The FMO helps to discriminate between dimly stained and broad negative populations. The FMO tubes are useful to determine where to place the positivity markers in a plot.
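The FMO scheme in Note 21 generalizes to any panel: each control tube contains every fluorochrome except the one whose gate it is meant to define. A minimal Python sketch of that enumeration follows; the function name `fmo_tubes` is a hypothetical helper introduced here for illustration.

```python
def fmo_tubes(panel):
    """Return {tube name: fluorochromes included} for each FMO control.

    Each tube omits exactly one fluorochrome of the panel, so a panel of
    n fluorochromes yields n FMO tubes.
    """
    return {
        f"FMO {left_out}": [f for f in panel if f != left_out]
        for left_out in panel
    }


if __name__ == "__main__":
    # The three-color example from Note 21.
    for name, included in fmo_tubes(["FITC", "PE", "APC"]).items():
        print(f"{name}: {', '.join(included)}")
```

For the three-color example this lists FMO FITC (PE, APC), FMO PE (FITC, APC), and FMO APC (FITC, PE), matching the note.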

References

1. Jaye DL, Bray RA, Gebel HM, Harris WAC, Waller EK (2012) Translational applications of flow cytometry in clinical practice. J Immunol 188(10):4715–4719
2. Ma W, Gilligan BM, Yuan J, Li T (2016) Current status and perspectives in translational biomarker research for PD-1/PD-L1 immune checkpoint blockade therapy. J Hematol Oncol 9:47
3. Koonce NA, Juratli MA, Cai C, Sarimollaoglu M, Menyaev YA, Dent J, Quick CM, Dings RPM, Nedosekin D, Zharov V, Griffin RJ (2017) Real-time monitoring of circulating tumor cell (CTC) release after nanodrug or tumor radiotherapy using in vivo flow cytometry. Biochem Biophys Res Commun 492(3):507–512
4. Menck K, Bleckmann A, Wachter A, Hennies B, Ries L, Schulz M, Balkenhol M, Pukrop T, Schatlo B, Rost U, Wenzel D, Klemm F, Binder C (2017) Characterisation of tumour-derived microvesicles in cancer patients' blood and correlation with clinical outcome. J Extracell Vesicles 6(1):1340745
5. Burger D, Oleynik P (2017) Isolation and characterization of circulating microparticles by flow cytometry. Methods Mol Biol 1527:271–281
6. McLaughlin BE, Baumgarth N, Bigos M, Roederer M, De Rosa SC, Altman JD, Nixon DF, Ottinger J, Oxford C, Evans TG, Asmuth DM (2008) Nine-color flow cytometry for accurate measurement of T cell subsets and cytokine responses. Part I: Panel design by an empiric approach. Cytometry A 73(5):400–410
7. Basu S, Campbell HM, Dittel BN, Ray A (2010) Purification of specific cell population by fluorescence activated cell sorting (FACS). J Vis Exp (41):e1546
8. Herzenberg LA, De Rosa SC, Herzenberg LA (2000) Monoclonal antibodies and the FACS: complementary tools for immunobiology and medicine. Immunol Today 21:383–390
9. Tung JW, Heydari K, Tirouvanziam R, Sahaf B, Parks DR, Herzenberg LA, Herzenberg LA (2007) Modern flow cytometry: a practical approach. Clin Lab Med 27(3):453–468
10. Brusa D, Carletto S, Cucchiarale G, Gontero P, Greco A, Simone M, Ferrando U, Tizzani A, Matera L (2011) Prostatectomy restores the maturation competence of blood dendritic cell precursors and reverses the abnormal expansion of regulatory T lymphocytes. Prostate 71:344–352
11. Gorosito Serrán M, Fiocca Vernengo F, Beccaria CG, Acosta Rodriguez EV, Montes CL, Gruppi A (2015) The regulatory role of B cells in autoimmunity, infections and cancer: perspectives beyond IL10 production. FEBS Lett 589(22):3362–3369
12. Allavena P, Chieppa M, Bianchi G, Solinas G, Fabbri M, Laskarin G, Mantovani A (2010) Engagement of the mannose receptor by tumoral mucins activates an immune suppressive phenotype in human tumor-associated macrophages. Clin Dev Immunol 2010:547179. https://doi.org/10.1155/2010/547179
13. Brusa D, Simone M, Gontero P, Spadi R, Racca P, Micari J, Degiuli M, Carletto S, Tizzani A, Matera L (2013) Circulating immunosuppressive cells of prostate cancer patients before and after radical prostatectomy: Profile comparison. Int J Urol 20:971–978
14. Hu P, Zhang Z, Xin H, Deng G (2016) Single cell isolation and analysis. Front Cell Dev Biol 4:116
15. Reitsma MJ, Lee BR, Uchida N (2002) Method for purification of human hematopoietic stem cells by flow cytometry. Methods Mol Med 63:59–77
16. Karabinus DS, Marazzo DP, Stern HJ, Potter DA, Opanga CI, Cole ML, Johnson LA, Schulman JD (2014) The effectiveness of flow cytometric sorting of human sperm (MicroSort®) for influencing a child's sex. Reprod Biol Endocrinol 12:106
17. Battye FL, Light A, Tarlinton DM (2000) Single cell sorting and cloning. J Immunol Methods 243:25–32
18. Six A, Mariotti-Ferrandiz ME, Chaara W, Magadan S, Pham HP, Lefranc MP, Mora T, Thomas-Vaslin V, Walczak AM, Boudinot P (2013) The past, present, and future of immune repertoire biology – the rise of next-generation repertoire analysis. Front Immunol 4:413

Part V

Single Cell Multi Omic Analysis

Chapter 20

Combined Genome and Transcriptome (G&T) Sequencing of Single Cells

Iraad F. Bronner and Stephan Lorenz

Abstract

The simultaneous examination of a single cell’s genome and transcriptome presents scientists with a powerful tool to study genetic variability and its effect on gene expression. In this chapter, we describe the library generation method for combined genome and transcriptome sequencing (G&T-seq) originally described by Macaulay et al. (Nat Protoc 11(11):2081–2103, 2016; Nat Methods 12(6):519–522, 2015). This includes some alterations we made to improve robustness of this process for both the novice user and laboratories that want to deploy this method at scale. Using this method, genomic DNA and full-length mRNA from single cells are separated, amplified, and converted into Illumina sequencer-compatible sequencing libraries.

Key words Single cell, Transcriptome, Genome, MDA, WGA, SNV detection, Copy number variation, Gene expression, NGS

1 Introduction

Recently, a sequencing library-generation method for combined genome and transcriptome sequencing (G&T-seq) on single cells was described by Macaulay et al. [1, 2]. Until then, single-cell sequencing methods had allowed either the determination of individual cellular expression profiles or the examination of genomic variation (i.e., copy-number variations or single nucleotide variants), but not both. By combining transcriptome and genome amplification methods in one protocol, a tool was created that could correlate genetic variability with its effect on gene expression [1]. We have seen a great need for the implementation of the G&T-seq protocol as part of our automated high-throughput single-cell pipeline at the Wellcome Sanger Institute. The original Nature Protocols article by Macaulay et al. [1] gives a very accurate description of the process and can be implemented by skilled scientists. However, it is a very complex protocol that is hard to scale to high sample numbers and is also sensitive to experimental errors. During the implementation into our high-throughput pipeline, we identified potential pitfalls, which might not be obvious to less-experienced users, and improved the robustness of this protocol to allow for its routine use in high-throughput genomics laboratories.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_20, © Springer Science+Business Media, LLC, part of Springer Nature 2019

Before embarking on the G&T library preparation method, one should understand that the scientific question will have a great impact on the cost per sample and the complexity of the experiment. The G&T method itself provides a robust separation of cellular nucleic acids into distinct single-cell RNA and DNA samples, which can then be subjected to a range of library preparation methods. For the RNA, we recommend preparing full-length cDNA samples to maximize insights; however, 3′ or 5′ enrichment methods will also yield valuable gene expression data. For the DNA, the operator must make a decision that fundamentally affects the analyses available later on.

For single-cell single-nucleotide variant (SNV) analysis of genomic DNA, a phi29-based multiple displacement amplification (MDA) method is the best choice, since phi29 has been shown to have very high DNA amplification accuracy [3]. To get accurate SNV data from amplified DNA products, we suggest using a subsequent no-PCR library preparation method and sequencing the individual cells to sufficient coverage (e.g., 10–30×) to accurately determine SNVs in lower coverage regions. At the time of writing, sequencing individual cells to 10× coverage is still expensive. We therefore suggest performing an up-front genotyping assay on the individually amplified DNA samples to assess the coverage of the amplified single-cell product and enable cherry-picking of a high-quality subset of samples for library preparation and sequencing.
We have observed that genotyping assay results correlate with the amount of locus and allelic dropout observed after PCR-free library preparation and sequencing, meaning that samples with high-quality genotyping results will generally yield high-quality sequencing data.

PCR-based methods like the commercial MALBAC or PicoPLEX methods are more quantitative than phi29-based amplification methods and give a better ability to discriminate copy number variations and genomic microinsertions and microdeletions. Library preparation can be done with more general methods (e.g., Nextera XT) that are easier to perform and more cost-effective. The genomic coverage needed for this type of analysis is substantially lower. We have routinely sequenced 96–384 samples on an individual HiSeq 4000 lane yielding 300 × 10^6 paired-end reads.

At the heart of the G&T method is the capture and separation of the genomic (DNA) and transcriptomic (RNA) material of individual cells. After the lysis of individual cells, the polyadenylated mRNA is hybridized with oligonucleotides containing a universal amplification site, a stretch of 30 thymine residues, and a VN anchor site (V = A, G, or C; N = A, G, C, or T). These primers are synthesized to contain a 5′ biotin, which is conjugated to magnetic streptavidin-coated beads to enable magnetic retention and separation of materials. During this process, captured mRNA molecules are thoroughly washed in order to rinse the beads of all unbound material, including the cellular DNA. The supernatant from these washes is then deposited into a separate receptacle where it is further processed. The bound poly-A mRNA is converted to cDNA and amplified using a process similar to Smart-seq2 [4, 5]. The DNA contained in the wash buffer supernatant is precipitated using solid-phase reversible immobilization (SPRI) beads and amplified using MDA or PicoPLEX.

Efficient lysis of the individual cells subjected to this protocol is essential. To make sure that the cellular content containing DNases, RNases, and DNA-binding proteins is denatured and that cellular membranes, including the nuclear envelope, are successfully disrupted, cells are sorted into a commercially available lysis buffer (i.e., RLT plus, QIAGEN Ltd, UK) containing the denaturing agent guanidinium chloride and sodium dodecyl sulfate (SDS), a strong anionic detergent.

To get a better understanding of the sensitivity and amplification bias of the RNA transcripts present during sequencing, we recommend the addition of external RNA controls to the lysis buffer. We routinely use those developed by the External RNA Controls Consortium (ERCC); however, other commercial spike-in controls are available.

The ploidy level of the cells (e.g., monoploid, diploid, or tetraploid) will affect the DNA content of the cell and with it the potential amount of captured and amplified DNA. S-phase cells are ideally omitted, since replication events look like copy number variation (CNV) events in the data. Specifically selecting for cells in G2 increases DNA coverage, since these cells have four copies of each chromosome instead of two. To determine each cell's DNA content and cell state, a nuclear DNA marker (e.g., Hoechst) should be used during cell sorting.
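The sequencing depths quoted in this introduction (96–384 samples on a lane yielding 300 × 10^6 paired-end reads for copy number analysis, and 10–30× coverage for SNV calling) can be related with simple arithmetic. The Python sketch below is a rough planning aid under stated assumptions, none of which come from the chapter: the lane yield is interpreted as read pairs, the read length is set to a hypothetical 150 bases, the genome size is an approximate human value of 3.1 × 10^9 bases, and reads are assumed to be spread evenly over samples and over the genome, which single-cell amplification does not achieve in practice.

```python
def pairs_per_sample(lane_pairs: float, n_samples: int) -> float:
    """Read pairs available per cell if the lane is shared evenly."""
    return lane_pairs / n_samples


def mean_coverage(read_pairs: float, read_length: int, genome_size: float) -> float:
    """Expected mean depth: sequenced bases divided by genome size."""
    return read_pairs * 2 * read_length / genome_size


def pairs_for_coverage(depth: float, read_length: int, genome_size: float) -> float:
    """Read pairs needed to reach a target mean depth."""
    return depth * genome_size / (2 * read_length)


if __name__ == "__main__":
    LANE_PAIRS = 300e6       # lane yield from the text, read as pairs (assumption)
    READ_LENGTH = 150        # hypothetical read length
    GENOME_SIZE = 3.1e9      # approximate human genome size (assumption)

    for n in (96, 384):
        pairs = pairs_per_sample(LANE_PAIRS, n)
        depth = mean_coverage(pairs, READ_LENGTH, GENOME_SIZE)
        print(f"{n} samples: {pairs:,.0f} pairs per cell, about {depth:.2f}x")

    need = pairs_for_coverage(10, READ_LENGTH, GENOME_SIZE)
    print(f"10x needs about {need:,.0f} pairs per cell")
```

Under these assumptions, a lane shared by 96 to 384 cells gives each cell well below 1× mean depth, which illustrates why copy number analysis can be multiplexed heavily while SNV analysis at 10× or more requires far more sequencing per cell.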

2 Materials

The G&T-seq protocol is very modular and can be matched with various other protocols to suit the experimenter's needs. We have therefore structured this chapter into the different modules and options. For the successful completion of the protocol, all reagents and instruments indicated in the relevant sections are required. It is critical that reagents are free of contaminating DNA and RNA and are guaranteed nuclease-free (see Notes 1 and 2). Since RNase and DNA contamination are the most frequent causes of failure of this method, it is vital that the experimenter wears a clean lab coat and gloves at all times and that liquid handling for all reactions is done in a clean UV PCR workstation (see Note 3).

2.1 Lysis Buffer Plate Generation

1. ERCC RNA Spike-In Mix.
2. RLT Plus Lysis buffer (QIAGEN Ltd, UK, 79216).
3. 96-well plates for sorting.
4. UV cross-linker to irradiate plastic labware.

2.2 Preparation of Biotin-dT30 Bound Streptavidin Beads in Resuspension Buffer

1. UV Shaker with 1.5 mL tube adapter.
2. 1.5 mL tube magnet.
3. Dynabeads MyOne Streptavidin C1 (Thermo Fisher #65002).
4. Superscript II 5× first strand buffer (Thermo Fisher, part of the Superscript II kit # Y02321).
5. Nuclease-free water.
6. 10 M sodium hydroxide solution.
7. 5 M NaCl solution.
8. 1 M Tris–HCl solution pH 7.5.
9. 0.5 M EDTA pH 8.0.
10. Solution A: 0.1 M NaOH, 0.05 M NaCl solution. Make up in a 50 mL Falcon tube and store at 4 °C. Make fresh every 2 months.
11. Solution B: 0.1 M NaCl solution. Make up in a 50 mL Falcon tube and store at 4 °C. Make fresh every 2 months.
12. 2× B&W: 0.5 mM Tris–HCl pH 7.5, 0.1 mM EDTA pH 8.0, 2 M NaCl. Make up in a 50 mL Falcon tube and store at 4 °C. Make fresh every 2 months.
13. Resuspension Buffer (without RNase inhibitor): 1.2 mL Superscript II 5× first strand buffer, 4.5 mL nuclease-free water. This is enough for processing four plates. Store in the fridge until directed.
14. Biotinylated G&T dT30 primers: order the biotinylated primer RNase-free purified. When ordering from IDT, one can copy and paste the following sequence: /5BiotinTEG/ AAG CAG TGG TAT CAA CGC AGA GTA CTT TTT TTT TTT TTT TTT TTT TTT TTT TTT TVN. Dilute the biotinylated dT30 primer to 100 μM. Make aliquots of 300 μL. Label the tubes "G&T DT30" and store at −20 °C.
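The anatomy of the biotinylated dT30 primer in item 14 (universal amplification site, 30 thymines, VN anchor) can be checked programmatically before ordering. The Python sketch below is illustrative: the split of the listed sequence into a 25-base amplification site followed by the T stretch is an interpretation made here, and the function names are hypothetical. The 5′ biotin modification code is omitted because it is not part of the base sequence.

```python
ADAPTER = "AAGCAGTGGTATCAACGCAGAGTAC"  # bases preceding the T stretch in item 14
V_BASES = "ACG"    # V = A, G, or C (anything but T)
N_BASES = "ACGT"   # N = any base


def oligo_dt_vn(n_t: int = 30) -> str:
    """Assemble amplification site + oligo-dT stretch + degenerate VN anchor."""
    return ADAPTER + "T" * n_t + "VN"


def expand_vn(primer: str) -> list:
    """List the concrete sequences encoded by the degenerate 3' VN anchor."""
    if not primer.endswith("VN"):
        raise ValueError("primer must end with the VN anchor")
    stem = primer[:-2]
    return [stem + v + n for v in V_BASES for n in N_BASES]


if __name__ == "__main__":
    listed = ("AAG CAG TGG TAT CAA CGC AGA GTA CTT TTT TTT TTT TTT "
              "TTT TTT TTT TTT TTT TVN").replace(" ", "")
    primer = oligo_dt_vn()
    print("matches listed sequence:", primer == listed)
    print("length:", len(primer))
    print("concrete 3' variants:", len(expand_vn(primer)))
```

The V position excludes T so that the anchor pins the primer to the junction between the transcript body and the start of the poly-A tail rather than letting it prime anywhere within the tail.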

2.3 DNA, RNA Separation and RNA Amplification

1. Alpaqua 96 low elution plate magnet (A000350).
2. Three 96-well plates.
3. Filter pipette tips and a fresh box.
4. UV cross-linker to irradiate plastic labware.
5. Plate seals or plate sealer with PCR friendly seals.
6. Wipeable cool box and cooling block.
7. Repeater pipette (e.g., Eppendorf Stream).
8. UV PCR workstation.
9. Heated shaker with heated lid (e.g., Eppendorf Thermomixer C with heated lid).
10. Shaker at room temperature with 96-well plate insert (e.g., Eppendorf Thermomixer C).
11. Biotin-dT30 bound streptavidin beads in resuspension buffer (without RNase inhibitor; see previous section for the preparation).
12. Nuclease-free water.
13. MMLV RNase H− Reverse Transcriptase (e.g., Superscript II).
14. Reverse Transcriptase reaction buffer, supplied with reverse transcriptase.
15. RNase inhibitor (e.g., NEB Murine, M0314L).
16. 100 mM DTT.
17. 5 M betaine.
18. 1 M MgCl2.
19. 100 μM template switching oligonucleotide (TSO; 5′-AAG CAG TGG TAT CAA CGC AGA GTA CAT rGrG+G-3′; the last three bases consist of two riboguanosines (rG) and one locked nucleic acid (LNA)-modified guanosine (+G)).
20. 25 mM dNTP mix (each): combine 250 μL 100 mM dATP with 250 μL 100 mM dTTP, 250 μL 100 mM dCTP, and 250 μL 100 mM dGTP and create aliquots of 200 μL. Each of these aliquots will have enough dNTPs to amplify four plates. Label the tubes "dNTP mix" and store at −20 °C. The final concentration of each of the dNTPs is 25 mM.
21. 1 M Tris–HCl pH 8.3 solution: 6.77 g recalibrated Tris Buffer salt pH 8.3 (Sigma-Aldrich, T8943). Make up to 50 mL with nuclease-free water in a 50 mL Falcon tube and store at 4 °C. These crystals have been shown to be consistently nuclease free. Make fresh every 2 months.
22. G&T wash buffer: 50 mM Tris–HCl pH 8.3, 75 mM KCl, 3 mM MgCl2, 0.5% Tween 20 solution (for accurate pipetting use a positive displacement pipette), and 10 mM DTT. Use nuclease-free water for dilutions. Make batches of 100 mL, divide these into 7 × 14 mL aliquots, and store at −20 °C. Each 14 mL aliquot contains enough reagent for four plates.
23. PCR polymerase (e.g., Kapa Hifi PCR polymerase).
24. 100 μM IS PCR primers (5′-AAG CAG TGG TAT CAA CGC AGA GT-3′).
25. OPTIONAL: Commercial total RNA to be used as positive control.
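The concentrations in this list follow from the usual dilution relation C1 × V1 = C2 × V2. As a hedged illustration, the Python sketch below checks the dNTP mix, assuming equal 250 μL volumes of each 100 mM stock (the volumes implied by the stated 25 mM final concentration), and computes the volumes of the 1 M Tris–HCl and 1 M MgCl2 stocks needed for a 100 mL batch of G&T wash buffer. The function names are hypothetical; components whose stock concentration is not given in the text (KCl, Tween 20) are deliberately left out.

```python
def mix_concentrations(components: dict) -> dict:
    """Final concentration of each component after pooling.

    components maps name -> (stock concentration, volume added); the
    result uses the same concentration unit as the stocks.
    """
    total_volume = sum(volume for _, volume in components.values())
    return {
        name: stock * volume / total_volume
        for name, (stock, volume) in components.items()
    }


def stock_volume(final_conc: float, final_volume: float, stock_conc: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2 (consistent units)."""
    return final_conc * final_volume / stock_conc


if __name__ == "__main__":
    # dNTP mix: four 100 mM stocks, 250 uL each (assumed equal volumes).
    dntps = {name: (100.0, 250.0) for name in ("dATP", "dTTP", "dCTP", "dGTP")}
    print(mix_concentrations(dntps))            # mM of each dNTP in the pool
    print(sum(v for _, v in dntps.values()) / 200.0, "aliquots of 200 uL")

    # G&T wash buffer, 100 mL batch, from the 1 M (1000 mM) stocks in the list.
    print("Tris-HCl pH 8.3:", stock_volume(50, 100, 1000), "mL")
    print("MgCl2:", stock_volume(3, 100, 1000), "mL")
```

The same `stock_volume` call can be reused for the other buffers in Subheading 2.2 once the stock and target concentrations are expressed in the same unit.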

2.4 DNA Amplification for Copy Number Variation (PicoPLEX)

1. Plate heat sealer or adhesive plate seals.
2. Repeater pipette (e.g., Eppendorf Stream).
3. UV PCR workstation.
4. 80% DNA-free and nuclease-free ethanol solution; combine 40 mL of pure 100% ethanol with 10 mL nuclease-free water in a 50 mL Falcon tube (see Note 4).
5. Heated shakers with heated lid set to 42 °C (e.g., Eppendorf Thermomixer C with heated lid).
6. Shakers at room temperature with 96-well plate insert (e.g., Eppendorf Thermomixer C).
7. Thermocycler.
8. PicoPLEX WGA Kit (50 reactions; Rubicon Genomics) containing:
   (a) Cell extraction buffer.
   (b) Extraction enzyme dilution buffer.
   (c) Cell extraction enzyme.
   (d) Pre-Amp buffer.
   (e) Pre-Amp enzyme.
   (f) Amplification buffer.
   (g) Amplification enzyme.
   (h) Nuclease-free water.
9. 1.5 mL Eppendorf tube.
10. 15 mL Falcon tube.
11. Cooling block for Eppendorf tubes.

2.5 Multiple Displacement Amplification (QIAGEN REPLI-g Single Cell Kit)

1. Plate heat sealer or adhesive plate seals.
2. Repeater pipette (e.g., Eppendorf Stream).
3. UV PCR workstation.
4. 80% DNA-free and nuclease-free ethanol solution; combine 40 mL of pure 100% ethanol with 10 mL nuclease-free water in a 50 mL Falcon tube (see Note 4).
5. Shaker with 96-well plate insert (e.g., Eppendorf Thermomixer C).
6. Thermocycler.
7. QIAGEN REPLI-g Single Cell Kit, containing:
   (a) 4× REPLI-g sc DNA Polymerase.
   (b) 4× REPLI-g sc Reaction Buffer.
   (c) 2× Buffer DLB.
   (d) Stop Solution.
   (e) 4× PBS sc 1×.
   (f) DTT, 1 M.
   (g) 4× H2O sc.
8. 1.5 mL Eppendorf tube.
9. 15 mL Falcon tube.
10. Cooling block for Eppendorf tubes.
11. Cooling block for 96-well plate.

2.6 Nextera XT (For Transcriptome and PicoPLEX DNA) and SPRI Cleanup

1. Nextera XT DNA Library Preparation Kit, containing:
   (a) TDB (Tagment DNA Buffer).
   (b) ATM (Amplicon Tagment Mix).
   (c) NT (Neutralize Tagment Buffer).
   (d) NPM (Nextera PCR Mix).
2. Nextera XT Index Kit v2, containing:
   (a) i7 index primers (12 orange-capped vials).
   (b) i5 index primers (eight white-capped vials).
3. TruSeq Index Plate Fixture.
4. Replacement caps (orange and white).
5. 96-well PCR plate.
6. Cooling block.
7. Multichannel pipette.
8. Plate vortex or heated shaker with 96-well plate adapter.
9. UV PCR workstation.
10. PCR clean Eppendorf tube.
11. Plate seals.
12. Thermocycler.
13. Tube magnet.

3 Methods

Since most human cells contain only marginal amounts of RNA (0.5–50 pg) and approximately 3 pg of DNA per monoploid chromosome set, even minor contamination with exogenous nucleic acids or nucleases will have major effects on the efficacy of the G&T-seq protocol. To avoid contamination with genetic material or nucleases during processing, it is important that all work is done in a containment hood, such as a PCR hood or a laminar flow cabinet, that is clean and has been subjected to UV light before operation, and that the experimenter wears gloves and a clean lab coat. Prior to any experiment, we recommend wiping all surfaces and equipment with bleach diluted to 5000 ppm. To remove bleach residues, subsequently rinse surfaces with Milli-Q filtered and sterilized water.

3.1 Lysis Buffer Plate Generation

To isolate both DNA and RNA from cells or nuclei, these need to be sorted into an effective lysis buffer. As described by Macaulay et al. [1], we recommend using 96-well PCR plates filled with 2 μL RLT Plus lysis buffer (QIAGEN, UK). We also recommend adding ERCC controls to the lysis buffer to enable transcript normalization and assessment of expression noise parameters [6]. We recommend leaving two wells empty so they can be used for positive and negative controls after cell deposition. While cells can be picked using an automated cell picker or manual picking with a Stripette, we generally recommend using flow cytometry with index sorting, which can be helpful during troubleshooting. We use a commercially available RNA as a positive control. This commercial RNA (Human brain total RNA, Promega) contains trace amounts of DNA, meaning that it is an effective positive control for both RNA and DNA amplification (see Subheading 3.3).

When using ERCC spike-in controls, defrost the original ERCC spike-in control tube and dilute it (e.g., 1:50) in nuclease-free water, divide into 1 μL aliquots to avoid freeze–thaw cycles, and store the residual aliquots at −80 °C. Dilute one of your individual aliquots to a final dilution of 1:5 × 10⁶ in RLT lysis buffer. This dilution has been shown to be generally a good starting point (see Note 5).

1. OPTIONAL: Put the 96-well PCR plate in a UV cross-linker and set it to 600 mJ/cm². This will eliminate potentially contaminating RNA and DNA on the plates.
2. Dispense 2 μL of RLT lysis buffer with ERCC spike-ins into 94 wells of the plate. Keep two wells empty for the positive control and negative control.
3. Freeze the plates and store at −20 °C until sorting.
4. Make sure that the lysis buffer is defrosted before sorting the cells into the lysis buffer (see Note 6).
5. After the plates have been sorted, store the plates at −80 °C for up to 6 months.
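The ERCC dilution described above is a two-stage serial dilution, and the fold dilution still required after the first stage follows by division. The sketch below assumes the example 1:50 first dilution and the 1:5 × 10⁶ final dilution; the function name and the optional intermediate step are hypothetical.

```python
from math import prod


def remaining_factor(target_factor, factors_done):
    """Return the fold dilution still needed to reach target_factor."""
    done = prod(factors_done)
    if target_factor % done != 0:
        raise ValueError("target is not a whole multiple of the dilutions already made")
    return target_factor // done


TARGET = 5_000_000   # final ERCC dilution of 1:5 x 10^6 in lysis buffer
FIRST_STEP = 50      # example first dilution (1:50) in nuclease-free water

left = remaining_factor(TARGET, [FIRST_STEP])
print(f"Still needed after 1:{FIRST_STEP}: 1:{left}")
```

A remaining factor of this size is usually split over one or more intermediate dilutions rather than made in a single step; passing those factors in `factors_done` shows what is left for the final dilution into lysis buffer.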

3.2 Preparation of Biotin-dT30 Bound Streptavidin Beads

This section assumes all reagents described in Subheading 2.2 have been purchased or made up already. Before starting, make sure that all reactions are done in a UV PCR workstation and the experimenter wears gloves and a clean lab coat to reduce the risk of contamination.

1. Retrieve an aliquot of G&T dT30 and leave to thaw on ice until directed.
2. Take the Dynabeads MyOne Streptavidin C1 bottle out of the fridge and resuspend by vortexing the bottle.
3. Label a 1.5 mL Eppendorf tube “Dynabeads”, indicate the number of plates this aliquot is intended to supply, and add your name and the date.
4. Take 75 μL Dynabeads per plate (up to 300 μL for four plates) and transfer to the 1.5 mL Eppendorf tube.
5. Place the tube on the tube magnet and leave the beads to settle. When the beads are settled (approx. 2 min), remove and discard the supernatant.
6. Remove the tube from the magnet and resuspend the beads in 300 μL (per plate) Solution A. Do not vortex (see Note 7).
7. Place on the magnet and leave to settle. When the beads are settled (approx. 2 min), remove and discard the supernatant.
8. Repeat steps 6 and 7.
9. Remove the tube from the magnet and resuspend the beads in 300 μL (per plate) Solution B. Do not vortex.
10. Place on the magnet and leave the beads to settle. When the beads are settled (approx. 2 min), remove and discard the supernatant.
11. Resuspend the beads in 75 μL (per plate) of 2× B&W. Do not vortex.
12. Add 75 μL (per plate) of 100 μM G&T dT30 and incubate the tube on a thermomixer with a 1.5 mL tube adapter for 20 min at 1200 rpm (2.4 × g) at 20 °C.
13. Per plate, make 1.5 mL of 1× B&W by diluting 750 μL of 2× B&W in 750 μL nuclease-free water. (For four plates, use 3 mL of 2× B&W in 3 mL nuclease-free water in a 15 mL Falcon tube.) Vortex to mix.
14. Once the thermomixer has finished, place the beads on the magnet and leave to settle. When the beads are settled (approx. 2 min), remove and discard the supernatant.
15. Remove the tube from the magnet and resuspend the beads in 300 μL (per plate) 1× B&W buffer. Do not vortex.
16. Place on the magnet and leave to settle. When the beads are settled (approx. 2 min), remove and discard the supernatant.
17. Repeat steps 15 and 16 three times. NB: All four washes are needed to make sure that all unbound biotin primers are washed away!
18. Resuspend the beads in 1.425 mL of bead resuspension buffer per plate. If preparing more than 1.425 mL of beads, resuspend and transfer the beads to a 15 mL tube containing the remaining volume and mix by vortexing.
19. Label the new tube with the same information as the parent tube and store at 4 °C until directed. These beads can be used for up to 2 months and possibly longer.
20. To make sure that the beads are functional and have not been contaminated, follow Subheading 3.13.
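Because every volume in the bead preparation is given per plate, the amounts for a batch can be tabulated by simple multiplication. The following is a minimal sketch using the per-plate volumes from the steps above; the dictionary keys and the function name are illustrative, and the two Solution A washes are combined into one entry.

```python
# Per-plate volumes (uL) taken from the bead preparation steps.
PER_PLATE_UL = {
    "Dynabeads MyOne Streptavidin C1": 75,
    "Solution A (two washes of 300 uL)": 2 * 300,
    "Solution B (one wash)": 300,
    "2x B&W for bead resuspension": 75,
    "100 uM G&T dT30": 75,
    "2x B&W diluted to make 1x B&W": 750,
    "Nuclease-free water to make 1x B&W": 750,
    "Bead resuspension buffer": 1425,
}


def bead_prep_volumes(n_plates):
    """Scale the per-plate volumes to a batch covering n_plates plates."""
    if n_plates < 1:
        raise ValueError("n_plates must be at least 1")
    return {reagent: volume * n_plates for reagent, volume in PER_PLATE_UL.items()}


batch = bead_prep_volumes(4)
for reagent, volume in batch.items():
    print(f"{reagent}: {volume} uL")
```

For four plates this reproduces the 300 μL of Dynabeads and the 3 mL of 2× B&W plus 3 mL of water quoted in the steps.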

3.3 DNA/RNA Separation and RNA Amplification

Before starting this part of the method, make sure that all reagents described in Subheading 2.3 have been prepared. Because individual batches can be made beforehand, their preparation is not covered in this section. This protocol assumes that you have prepared and defrosted G&T wash buffer and biotin-dT30 bound streptavidin beads in resuspension buffer (without RNase inhibitor) as described in the previous section. It also assumes you have already sorted the cells into RLT Plus lysis buffer as outlined in Subheading 3.1.

1. OPTIONAL: Put three 96-well plates in a cross-linker and set it to 600 mJ/cm² (see Note 8).
2. Label one plate as “wash buffer” plate, one as “beads” plate, and one as “DNA” plate.
3. Get a 3.75 mL G&T wash buffer aliquot from the freezer, thaw, add 18.75 μL RNase inhibitor (750 U), and mix well.
4. Pipet 30 μL of the wash buffer into each well of the wash buffer plate, seal it, and spin it down to collect all liquid in the bottom of the wells.
5. Retrieve biotin-dT30 bound streptavidin beads in resuspension buffer (without RNase inhibitor) from the fridge, and vortex thoroughly. Ensure that the beads are fully resuspended.
6. From the original tube, transfer 1.425 mL beads to another tube and add 75 μL RNase inhibitor (3000 U). Mix well with a pipette and place into the cooling block. Label the original bead aliquot “opened” with your name and the date and place back into the fridge. These beads will be stable for at least 2 months (see Note 9).
7. It is good practice to run a positive control. Dilute the control RNA (e.g., total human brain RNA) in RLT Plus to 12.5 pg/μL. Pipet 2 μL into the positive control well.
8. Run a negative control to control for contamination after the cell sort. Dispense 2 μL of RLT Plus into the negative control well.
9. The next step can be automated using liquid dispensers. In case you do not have access to a liquid dispenser, dispense 10 μL of the beads to your cells and controls using a multichannel pipette or dispensing pipette (e.g., Eppendorf Stream).
10. Mix the beads (using a multichannel pipette) by slowly pipetting up and down 10 times or until fully resuspended. Avoid pipetting air as this will lead to excessive foam formation. Put the tips back into a tip box; they are to be reused for all subsequent steps. Spin the plate for 10 s at 100 × g to collect the beads in the bottom of the well (see Note 10).
11. Incubate the beads for 20 min on the Eppendorf Thermomixer C with heated lid at room temperature while shaking at 1200 rpm (2.4 × g).
12. After 20 min take the plate off and put it on the low elution plate magnet. Wait for the beads to settle.
13. Aspirate the supernatant, making sure not to disturb the beads, and transfer the supernatant to the DNA collection plate. Make sure that you reuse the same tips for the same wells in order to avoid cross-contamination between samples. The plate with the beads now contains all your polyadenylated RNA bound to the beads and will subsequently be called the RNA plate.
14. Take the RNA plate off the magnet. With the same tips, dispense 10 μL of wash buffer from your wash buffer plate into the same wells and resuspend the beads by slowly mixing 10 times. Try to resuspend the beads as thoroughly as possible, but do it gently in order not to damage the beads.
15. Incubate the RNA plate with the beads for 20 min on the Eppendorf Thermomixer C with heated lid at room temperature while shaking at 1200 rpm (2.4 × g). While the program is running, proceed to the next step to prepare the RT master mix.
16. Prepare the RT master mix as described below. The preparation time is approximately 20 min.

Component                                         Volume per plate (μL)   Volume per reaction (μL)   Final concentration
Nuclease-free water                               258.9                   2.1575
Superscript II 5× first strand buffer             120                     1                          1×
5 M betaine                                       120                     1                          1 M
100 mM DTT                                        30                      0.25                       5 mM
25 mM each dNTP mix                               24                      0.2                        1 mM each
1 M MgCl2                                         3.6                     0.03                       6 mM
100 μM TSO                                        6                       0.05                       1 μM
RNase inhibitor murine (40 U/μL)                  7.5                     0.0625                     0.5 U/μL
Superscript II reverse transcriptase (200 U/μL)   30                      0.25                       10 U/μL
Total volume                                      600                     5

17. Vortex the RT master mix briefly or mix thoroughly with a 1 mL pipette and store in your cooling block until directed.
18. Turn another thermomixer on and preheat it to 42 °C.
19. After 20 min, take the RNA plate off the Thermomixer and put it on the low elution plate magnet. Wait for the beads to settle.
20. Aspirate the supernatant out of the RNA plate, making sure not to disturb the beads, and transfer the supernatant to the DNA collection plate. Make sure that you reuse the same tips.
21. After this second wash, rinse your tips with another 5 μL of wash buffer. Dispense this wash buffer into your DNA collection plate. Some of the genomic material could still be bound to the tips’ surface, and this step makes sure that you harvest as much of the genomic material as possible into your DNA plate.
22. Take the DNA plate, seal it, spin the plate at 1000 × g for 1 min, and then store at −20 °C until ready to proceed to the DNA amplification protocol.
23. Take the RNA plate to the UV PCR workstation and proceed to the next step immediately.
24. Dispense 5 μL of the RT master mix to each well of the RNA plate.
25. Seal the plate and spin it at 100 × g for 15 s only (DO NOT spin faster or longer, since this will settle the beads).
26. Place the RNA plate on the thermomixer and incubate for 1 min at RT at 2000 rpm (6.7 × g) to resuspend the beads. Check whether the beads are resuspended. If so, continue to the next step; if not, repeat this step.
27. Place the RNA plate on the thermomixer (with a PCR 96 adapter) and tape the plate down to keep it in place.
28. Incubate the plate running the following steps on the preheated (42 °C) thermomixer. It will take 1:42 h for the program to complete.

Step   Temperature (°C)   Time (min)   Speed (rpm)
1      42                 2            2000
2      42                 60           1500
3      50                 30           1500
4      60                 10           1500

29. Prepare the following PCR master mix in a 1.5 mL Eppendorf tube a few minutes before the end of the program.

Component               Volume per reaction (μL)   Volume per plate (μL)
Kapa Hifi               6.25                       687.5
100 μM IS PCR primers   0.0125                     1.4
Nuclease-free water     1.2375                     136.1
Total volume            7.5                        825

30. Centrifuge the RNA plate, which now contains cDNA, at 100 × g for 15 s to make sure that all liquid is collected in the bottom of the plate.
31. Dispense 7.5 μL PCR master mix to each well.
32. Seal the plate and spin it at 100 × g for 15 s only (see Note 11).
33. Place the RNA plate on the thermomixer and incubate for 1 min at RT at 2000 rpm (6.7 × g) to resuspend the beads. Check whether the beads are resuspended. If so, continue to the next step; if not, repeat this step.
34. Incubate the plate on a thermocycler using the following conditions:

Step   Temperature (°C)   Time (m:s)
1      98                 3:00
2      98                 0:10       (cycle step 2 for an additional 19–24 cycles)
       67                 0:15
       72                 6:00
3      72                 5:00
4      10                 Hold

35. This program will take approximately 3 h to complete and can be left overnight. Afterwards, either continue with Subheading 3.4 to clean the amplified cDNA or continue with the DNA amplification (Subheading 3.6).
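The RT master mix volumes and the thermomixer run time quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below assumes the per-reaction volumes from the RT master mix table, a batch of 120 reactions (the per-plate total of 600 μL divided by 5 μL per reaction), and the four thermomixer steps of step 28; all names are illustrative.

```python
# Per-reaction volumes (uL) of the RT master mix, as listed in the table.
RT_MIX_UL = {
    "Nuclease-free water": 2.1575,
    "5x first strand buffer": 1.0,
    "5 M betaine": 1.0,
    "100 mM DTT": 0.25,
    "25 mM each dNTP mix": 0.2,
    "1 M MgCl2": 0.03,
    "100 uM TSO": 0.05,
    "RNase inhibitor": 0.0625,
    "Superscript II": 0.25,
}


def scale_mix(per_reaction, n_reactions):
    """Scale per-reaction volumes to a batch, rounded to 0.01 uL."""
    return {name: round(vol * n_reactions, 2) for name, vol in per_reaction.items()}


def final_conc(stock_conc, volume, total_volume):
    """Final concentration after dilution, in the units of stock_conc."""
    return stock_conc * volume / total_volume


plate_mix = scale_mix(RT_MIX_UL, 120)
reaction_volume = sum(RT_MIX_UL.values())

# MgCl2 stock is 1 M, i.e. 1000 mM.
mgcl2_mM = final_conc(1000.0, RT_MIX_UL["1 M MgCl2"], reaction_volume)

# Thermomixer program: hold times in minutes for the four steps.
hours, minutes = divmod(sum([2, 60, 30, 10]), 60)

print(f"Batch volume: {sum(plate_mix.values()):.1f} uL")
print(f"MgCl2 in the RT mix: {mgcl2_mM:.1f} mM")
print(f"Thermomixer program: {hours} h {minutes} min")
```

With these inputs the contribution of the 1 M MgCl2 stock works out to 6 mM in the 5 μL mix, and the thermomixer hold times add up to the 1:42 h stated in step 28.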

3.4 SPRI Cleanup

To clean amplified cDNA from primer dimers and reagents, we recommend a SPRI bead cleanup of the material. Because successfully amplified material from healthy cells will be of high molecular weight, a 1:1 SPRI bead to cDNA ratio (i.e., 13 μL of beads to 13 μL of reaction mix) is usually sufficient. If the amplified cDNA is of low yield, or to make sure that contaminating low-molecular weight products (<100–200 bp) are effectively removed within one bead cleaning step, this can be changed to a 0.6:1 SPRI to reaction volume ratio (i.e., 7.8 μL of beads to 13 μL reaction mix).

1. Before starting, make sure that all reactions are done in a POST-PCR lab, preferably in a UV PCR workstation, and the experimenter wears gloves and a clean POST-PCR lab coat to reduce the risk of contaminating the PRE-PCR material with amplified material.
2. Bring the AMPure XP beads to room temperature, and vortex thoroughly to resuspend.
3. After sanitizing the UV PCR workstation with UV, clean it with bleach and rinse it with water.
4. Locate all the consumable plastics and tips and place them inside the UV PCR workstation.
5. Collect your RNA plate with the amplified cDNA from the fridge or freezer, and centrifuge.
6. Add SPRI beads (13 μL of beads to 13 μL of reaction mix; or 7.8 μL of beads to the 13 μL reaction mix for a more stringent cleaning regime) (see Note 12).
7. Mix well by pipetting up and down 10 times and incubate for 5 min to make sure that the majority of the DNA will bind to the SPRI beads.
8. Place the RNA plate on the plate magnet, wait 3 min, and remove the supernatant. Make sure not to disturb the beads since this will decrease the yield.
9. Add 80% ethanol and incubate for 30 s.
10. Remove the ethanol, making sure not to disturb the beads since this will decrease yield.
11. Repeat steps 9 and 10.
12. Add a nuclease-free buffer that is compatible with the subsequent library preparation step (e.g., QIAGEN EB buffer or 10:1 Tris–EDTA (TE) buffer).
13. Store the cleaned RNA plate in the fridge or freezer or continue with Subheading 3.5.
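The bead volumes quoted for the two cleanup regimes follow directly from the bead-to-sample ratio. A minimal sketch, assuming a 13 μL reaction volume as in the text (the function name is illustrative):

```python
def spri_bead_volume(sample_volume_ul, ratio):
    """Bead volume (uL) for a given bead-to-sample volume ratio."""
    if sample_volume_ul <= 0 or ratio <= 0:
        raise ValueError("volume and ratio must be positive")
    return round(sample_volume_ul * ratio, 2)


standard = spri_bead_volume(13, 1.0)   # 1:1 cleanup
stringent = spri_bead_volume(13, 0.6)  # 0.6:1 cleanup, removes more short fragments

print(f"1:1 cleanup: {standard} uL beads")
print(f"0.6:1 cleanup: {stringent} uL beads")
```

If the actual reaction volume in a well differs from 13 μL, for example after evaporation, the same function gives the bead volume that keeps the intended ratio.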

3.5 Quality Control of Amplified RNA

To determine whether the RNA amplification was successful, we recommend running the amplified material on an automated electrophoresis instrument (e.g., an Agilent Bioanalyzer using their High Sensitivity DNA kit (5067-4626; Agilent, UK) or AATI Fragment Analyzer using their High Sensitivity Large Fragment Analysis kit (DNF-493; AATI Ltd, UK)) (see Note 13).

Below we have included example Bioanalyzer traces of successfully amplified material. All peaks observed at 35 and 10,380 bp are the lower and higher marker peak controls. None of the samples was SPRI cleaned before running on the Bioanalyzer chips. Lower molecular weight peaks, as observed between 35 and 250 bp, would have been eliminated after a SPRI cleanup (Figs. 1 and 2).

Fig. 1 Successfully amplified mRNA. Some smaller fragments can still be observed, but the majority of the trace shows high-molecular weight material

Fig. 2 Unsuccessfully amplified mRNA. Since the mRNA in the G&T protocol is captured at the poly-A tail, degradation primarily affects yield, whereas the size of the amplified product is impacted only secondarily. There will be a shift in the ratio between high- and low-molecular weight products, but it will not be as obvious as observed during Smart-seq2 stand-alone reactions, where a large part of the trace will show up as low-molecular weight product

3.6 DNA Amplification

Depending on the scope of the experiment, one can choose between different DNA amplification methods. For accurate copy number variant detection and detection of microinsertions and microdeletions, we recommend using a PCR-based amplification method (see Subheading 3.6.1). For single cell single nucleotide variation analysis, we recommend using a phi29-based amplification kit, since this type of kit has a higher fidelity than PCR-based methods [7] (see Subheading 3.6.3).

Both of the described methods are commercially available as kits. In this part of the chapter we will describe two kits that have shown the best results in our hands. For accurate copy number variations and estimations of microdeletions, we and others have observed consistent results using the Rubicon Genomics (now Takara Clontech) PicoPLEX kit [7] (see Subheading 3.6.1). An alternative kit would be the commercial MALBAC kit (Yikon Genomics, CN). For single cell SNV analysis, we and others have observed consistent results using the REPLI-g Single Cell Kit (QIAGEN Ltd, UK) [7] (see Subheading 3.6.3). However, one could also use the TruePrime™ Single Cell WGA Kit (Sygnis AG, Germany) or the illustra Single Cell GenomiPhi DNA Amplification Kit (GE Healthcare, UK) (see Note 14).

3.6.1 Amplifying DNA for Copy Number Variation (PicoPLEX)

Please note, this section describes one of the DNA amplification methods and should only be followed if the criteria of the research match those discussed at the introduction of Subheading 3.6. A protocol for an alternative DNA amplification method more suitable for mutation analysis is described in Subheading 3.6.3.

1. Before starting, make sure that all reactions are done in a UV PCR workstation and the experimenter wears gloves and a clean lab coat to reduce the risk of contamination.
2. Bring the AMPure XP beads to room temperature and vortex thoroughly to resuspend the beads.
3. After sterilizing the UV PCR workstation with UV, clean the UV PCR workstation with bleach and then with sterile nuclease-free water.
4. Locate all the consumable plastics and tips and place all items inside the UV PCR workstation.
5. Collect the DNA plate from the fridge or freezer to defrost and bring to room temperature.
6. Get a PicoPLEX kit and defrost the following reagents: Extraction enzyme dilution buffer (purple); Cell extraction buffer (green); Pre-Amp buffer (red).
7. Return the rest of the kit to the freezer until directed.
8. Once defrosted, place these buffers inside a cooling block.
9. Once the DNA source plate has defrosted, centrifuge at 1000 × g for 1 min.
10. Add 25 μL of beads to the DNA plate containing 25 μL DNA in wash buffer.
11. Mix well by pipetting up and down, and incubate for 5 min to make sure that the majority of the DNA will bind to the SPRI beads.
12. While the DNA is binding, proceed to the next step to prepare the cell extraction mix.
13. Prepare the cell extraction mix as described below; vortex to mix, store on ice until directed.

Component                           Volume for 1 plate (μL)   Volume per reaction (μL)   Color
Cell extraction buffer              275                       2.5                        Green
Extraction enzyme dilution buffer   264                       2.4                        Purple
Cell extraction enzyme              11                        0.1                        Yellow
Total volume                        550                       5

14. Place the DNA plate on a low elution plate magnet and remove the supernatant. Make sure not to disturb the beads since this will decrease the yield.
15. Add 80% DNA- and nuclease-free ethanol and incubate for 30 s.
16. Remove the ethanol, making sure not to disturb the beads since this will decrease yield.
17. Repeat steps 15 and 16.
18. After aspirating the last of the ethanol from the DNA plate, proceed immediately to the next step to avoid overdrying.
19. Dispense 5 μL of the cell extraction mix to each well of the DNA plate (e.g., using an Eppendorf repeater stream pipette set on 96 steps using a 0.5 mL Combitip).
20. Seal the DNA plate and centrifuge the plate at 100 × g for 30 s to collect the liquid in the bottom of the well.
21. Vortex the DNA plate on a thermomixer at 2000 rpm (6.7 × g) for 1 min. Check whether the beads are resuspended. If so, continue; if not, repeat this step.
22. Transfer the plate to a thermocycler and run the following DNA extraction program. It will take 14 min to complete.

Step   Temperature (°C)   Time (m:s)
1      75                 10:00
2      95                 4:00
3      RT                 Hold

23. While the thermocycler is running, prepare the Pre-Amp master mix as described below; mix and store on ice until directed.

Component        Volume (μL)   Volume per reaction (μL)   Color
Pre-Amp buffer   264           2.4                        Red
Pre-Amp enzyme   11            0.1                        White
Total volume     275           2.5

24. Centrifuge the DNA plate at 100 × g for 15 s.
25. Dispense 2.5 μL Pre-Amp master mix into each well of the DNA plate.
26. Seal the DNA plate and centrifuge it at 1000 × g for 1 min.
27. Vortex the DNA plate on a thermomixer at 2000 rpm (6.7 × g) for 1 min. Check whether the beads are resuspended. If so, continue; if not, repeat this step.
28. Transfer the plate to the thermocycler and run the following preamplification program. This program will take 1 h 20 min to complete.

Step   Temperature (°C)   Time (m:s)
1      95                 2:00
2      95                 0:15       (cycle step 2 for 12 cycles)
       15                 0:50
       25                 0:40
       35                 0:30
       65                 0:40
       75                 0:40
3      10                 Hold

29. Defrost the amplification buffer (orange) and nuclease-free water (clear) on ice so they are defrosted before the preamplification process has finished.
30. Prepare the amplification master mix as follows, a few minutes before the end of the preamplification step.

Component              Volume (μL)   Volume per sample (μL)   Color
Amplification buffer   1312.5        12.5                     Orange
Nuclease-free water    1795.5        17.1                     Clear
Amplification enzyme   42            0.4                      Blue
Total volume           3150          30

31. After the preamplification program has finished, centrifuge the plate at 1000 × g for 60 s.
32. Dispense 30 μL Amplification master mix to each well of the DNA plate and seal the plate.
33. Centrifuge the DNA plate at 100 × g for 15 s, making sure not to settle the beads.
34. Place the DNA plate on the Thermomixer C and vigorously vortex the plate for 1 min at 2000 rpm (6.7 × g). Check whether the beads are resuspended. If so, continue; if not, repeat this step.
35. Transfer the plate to the thermocycler. The following program will take 52 min to complete and can be left overnight.

Step   Temperature (°C)   Time (m:s)
1      95                 2:00
2      95                 0:15       (cycle step 2 for 14 cycles)
       65                 1:00
       75                 1:00
3      10                 Hold

36. Rubicon recommends 14 cycles, but with human or mouse cells we have seen high yields, and the number of cycles can be reduced by one or two. For smaller genomes, it might be necessary to increase the number of cycles.
37. After the amplification reaction has finished, centrifuge the DNA plate at 1000 × g for 1 min.
38. To continue processing the samples, take the DNA plate and continue with Subheading 3.6.2. Alternatively, for short-term storage (less than 24 h), seal the DNA plate and store in the fridge. When continuing at a later stage, store the samples in the freezer.
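When planning the PicoPLEX workflow, it can help to add up the programmed hold times of the three thermocycler programs. The sketch below encodes each program as a list of (cycles, hold times in seconds) segments taken from the tables in this section; it deliberately ignores ramping between temperatures and the final hold, so it gives a lower bound only, and the names are illustrative.

```python
def hold_seconds(program):
    """Sum programmed hold times; program is a list of (cycles, [seconds, ...])."""
    return sum(cycles * sum(holds) for cycles, holds in program)


# Segments transcribed from the extraction, preamplification, and
# amplification tables (final hold steps omitted).
EXTRACTION = [(1, [600, 240])]
PREAMP = [(1, [120]), (12, [15, 50, 40, 30, 40, 40])]
AMPLIFICATION = [(1, [120]), (14, [15, 60, 60])]

for name, program in (
    ("DNA extraction", EXTRACTION),
    ("Preamplification", PREAMP),
    ("Amplification", AMPLIFICATION),
):
    minutes, seconds = divmod(hold_seconds(program), 60)
    print(f"{name}: {minutes} min {seconds} s of programmed holds")
```

The extraction total matches the 14 min given in step 22. The totals for the two cycling programs come out shorter than the run times stated in steps 28 and 35; the difference is presumably ramping time, which depends on the instrument and is not modeled here.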

3.6.2 Quality Control of PicoPLEX Amplified DNA

To determine whether the PicoPLEX amplification was successful, we recommend running the amplified material on an automated electrophoresis instrument (e.g., an Agilent Bioanalyzer using their High Sensitivity DNA kit (5067-4626; Agilent, UK) or AATI Fragment Analyzer using their High Sensitivity Large Fragment Analysis kit (DNF-493; AATI Ltd, UK)) (see Note 13). Then continue with Subheading 3.7.

Below we have included example Bioanalyzer traces of successfully amplified PicoPLEX material. All peaks observed at 35 and 10,380 bp are the lower and higher marker peak controls. None of the samples was SPRI cleaned before running on the Bioanalyzer chips. Lower molecular weight peaks, as observed between 35 and 250 bp, would have been eliminated after a SPRI cleanup (Figs. 3, 4, and 5).

Fig. 3 Successfully amplified PicoPLEX DNA

3.6.3 Amplifying DNA for Mutation Analysis (MDA Using REPLI-g)

Please note, this section describes one of the DNA amplification methods. It should not follow Subheading 3.6.2; it is an independent step that follows Subheading 3.5, as discussed at the introduction of Subheading 3.6.

1. Before starting, make sure that all reactions are done in a UV PCR workstation and the experimenter wears gloves and a clean lab coat to reduce the risk of contamination.
2. Bring the AMPure XP beads up to room temperature, and vortex thoroughly to bring all the beads into suspension.
3. After sterilizing the UV PCR workstation with UV, clean the UV PCR workstation with bleach and then with water.
4. Locate all the consumable plastics and tips, label all of them correctly, and place inside the UV PCR workstation.
5. Before starting, make sure that all reactions are done in a UV PCR workstation and the experimenter wears gloves and a clean lab coat to reduce the risk of contamination.
6. Collect your DNA plate from the fridge or freezer to defrost.

Fig. 4 Unsuccessfully amplified PicoPLEX DNA (negative trace). No amplified material is observed. Only a low-molecular weight band consisting of primer dimers can be observed interfering with the lower marker peak

7. Take out your REPLI-g kit and defrost the following reagents: DTT 1 M (purple); PBS sc (1× PBS solution supplied with the kit (clear, labeled PBS sc)); H2O sc (nuclease-free water supplied with the kit (clear, labeled H2O)); Stop solution (red); 4× Reaction buffer (yellow).
8. Make sure that the reagents are stored on a cooling block after they have been defrosted.
9. Return the rest of the kit to the freezer until directed.
10. Once the DNA source plate has defrosted, centrifuge it at 1000 × g for 1 min.
11. Add 25 μL of beads to the DNA plate containing 25 μL DNA in wash buffer.
12. Mix well by pipetting up and down, and incubate for 5 min to make sure that the majority of the DNA will bind to the SPRI beads.
13. While the DNA is binding, proceed to the next step to prepare the cell extraction mix.

Fig. 5 Low quality amplified PicoPLEX DNA (failed cell). Partial cells or failed amplification generally look like this. More low-molecular weight product, while the overall yield will be low

14. First take out one aliquot of DLB and add 500 μL of the nuclease-free water. Resuspend well to make sure that all the lyophilized reagents are in solution.
15. For a full 96-well plate, mix the following reagents together in an Eppendorf tube labeled “D2” to make buffer D2.

Component           Volume (μL)   Volume per sample (μL)   Color
Reconstituted DLB   330           2.7                      Clear
1 M DTT             30            0.3                      Purple
Total volume        360           3

16. Use the D2 prepared above to prepare the cell extraction mix as described below; vortex to mix, store on the cooling block until directed.

Component      Volume (μL)   Volume per sample (μL)   Color
PBS sc         440           4                        Clear
D2             330           3                        Eppendorf labeled D2
Total volume   770           7

17. Place the DNA plate on a low elution plate magnet and remove the supernatant. Make sure not to disturb the beads since this will decrease the yield.
18. Add 80% DNA- and nuclease-free ethanol and incubate for 30 s.
19. Remove the ethanol, making sure not to disturb the beads since this will decrease yield.
20. Repeat steps 18 and 19.
21. Now, aspirate the ethanol from the DNA plate, making sure not to disturb the beads. Remove all the ethanol, but do not overdry the beads.
22. Using a repeater pipette, dispense 7 μL of the cell extraction mix onto the dry beads and seal the plate.
23. Centrifuge the DNA plate at 100 × g for 15 s, making sure not to settle the beads.
24. Place the DNA plate on the thermomixer and vigorously vortex the plate for 1 min at 2000 rpm (6.7 × g). Check whether the beads are resuspended. If so, continue; if not, repeat this step.
25. Transfer the DNA plate to the thermocycler and incubate the plate for 10 min at 65 °C.
26. Using a repeater pipette, dispense 3 μL of stop solution (red cap) into each of the wells.
27. Centrifuge the DNA plate at 100 × g for 15 s to collect the liquid in the bottom of the wells, making sure not to settle the beads.
28. Place the DNA plate on the thermomixer and vigorously vortex the plate for 1 min at 2000 rpm (6.7 × g) to mix the stop reagent with the sample, then store the sample on a cooled 96-well block.
29. Prepare the amplification master mix in a 15 mL Falcon tube as described below. Briefly vortex the mix and store on the cooling block until directed. NB: The kit contains just enough reaction buffer for a 96-well plate plus a small excess. The volumes below are guidelines for 100 samples, and kits have been shown to contain adequate reagent volumes for 100 samples.

Component         Volume (μL)   Volume per sample (μL)   Color
H2O sc            900           9                        Clear
Reaction buffer   2900          29                       Yellow
DNA polymerase    200           2                        Blue
Total volume      4000          40

30. Using a repeater pipette, dispense 40 μL of the amplification master mix onto the 10 μL of extracted samples.
31. Centrifuge the DNA plate at 100 × g for 15 s, making sure not to settle the beads.
32. Place the DNA plate on the thermomixer and vortex the plate for 1 min at 2000 rpm (6.7 × g) to mix the liquids and the beads. Check whether the beads are resuspended. If so, continue; if not, repeat this step.
33. Transfer the plate to the thermocycler and incubate the plate as follows. QIAGEN suggests incubating the samples for 8 h. With human and mouse cells, this yields too much amplified DNA for subsequent quantification and library preparation. Instead, we routinely incubate for 3 h only.

Step   Temperature (°C)   Time (h:m:s)
1      30                 3:00:00
2      65                 0:03:00
3      10                 Hold
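The amplification master mix in step 29 is a per-sample recipe multiplied by the number of reactions. As a minimal sketch, the volumes for any number of reactions can be tabulated as below. The helper name `scale_master_mix` is ours and purely illustrative; the per-sample volumes are those given in the master mix table above.

```python
# Per-sample volumes (uL) of the amplification master mix, taken from the
# table in step 29. The dictionary and function names are hypothetical.
PER_SAMPLE_UL = {
    "H2O sc": 9.0,
    "Reaction buffer": 29.0,
    "DNA polymerase": 2.0,
}


def scale_master_mix(per_sample_ul, n_reactions):
    """Return component volumes (uL) for n_reactions, plus the total.

    n_reactions should already include any excess you want to prepare.
    """
    if n_reactions <= 0:
        raise ValueError("n_reactions must be positive")
    mix = {name: round(vol * n_reactions, 3) for name, vol in per_sample_ul.items()}
    mix["Total volume"] = round(sum(mix.values()), 3)
    return mix


if __name__ == "__main__":
    for name, vol in scale_master_mix(PER_SAMPLE_UL, 100).items():
        print(f"{name:<16} {vol:>8.1f} uL")
```

Called with 100 reactions, the sketch reproduces the 100-sample column of the table; smaller batches only need a different `n_reactions`.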

3.6.4 Quality Control of the Amplified DNA

To determine whether the DNA amplification was successful, we recommend running the amplified material on an automated electrophoresis instrument (e.g., an Agilent DNA 7500 or 12000 Bioanalyzer chip or AATI Fragment Analyzer using their long fragment kit) (see Note 13). Below we have included example Bioanalyzer 7500 traces of successfully amplified MDA material. The peaks observed at 35 and 10,380 bp are the lower and upper marker controls. In some cases, the DNA might need further dilution so that it does not interfere with the electrophoresis. Also, high-molecular weight MDA-amplified material will run close to, and interfere with, the upper marker peak (Figs. 6, 7, and 8).

Fig. 6 Successfully amplified MDA DNA. Please note that due to the amount of DNA generated using this assay a Bioanalyzer 7500 chip has been used

3.7 Quantification of Amplified Products

After amplification of the samples (both transcriptome and genome), we suggest quantifying the samples to get an accurate input amount for subsequent library preparation. This could be done on just a representative subset of samples if the library preparation method used is robust enough to tolerate some input variation. In any case, we suggest using a fluorimetric approach, since these protocols yield robust, cost-effective, and quick results. We have been using a Quantus fluorometer (Promega) or Qubit fluorometer (Thermo Fisher) to measure small sets of samples, or a FLUOstar Omega plate reader (BMG LABTECH Ltd., UK) using the AccuClear Ultra High Sensitivity dsDNA Quantitation Kit (Biotium Inc., USA) (see Note 15).
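Plate-reader quantification relies on a standard curve, and Note 15 recommends checking its linearity (r² > 0.99). The sketch below shows one way to do this with an ordinary least-squares line in plain Python. The standard concentrations and fluorescence values are invented for illustration and do not come from any kit; the function names are ours.

```python
def fit_line(x, y):
    """Least-squares fit y = slope * x + intercept; returns (slope, intercept, r2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def concentration_from_signal(signal, slope, intercept):
    """Invert the standard curve to get a concentration from a reading."""
    return (signal - intercept) / slope


# Made-up example: seven standards (ng/uL) and their fluorescence readings.
standards_ng_per_ul = [0.0, 0.5, 1.0, 2.5, 5.0, 10.0, 25.0]
standards_rfu = [50.0, 150.0, 250.0, 550.0, 1050.0, 2050.0, 5050.0]

slope, intercept, r2 = fit_line(standards_ng_per_ul, standards_rfu)

if __name__ == "__main__":
    print(f"slope={slope:.1f} intercept={intercept:.1f} r2={r2:.4f}")
    print(f"sample at 450 RFU: {concentration_from_signal(450.0, slope, intercept):.2f} ng/uL")
```

Samples whose signal falls outside the range covered by the standards should be diluted and measured again rather than extrapolated.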

Fig. 7 Unsuccessfully amplified MDA DNA (negative trace). This is the negative control from the GenomiPhi Kit

3.8 Library Preparation Selection Criteria

Nowadays, one can choose from a large selection of library generation methods to generate Illumina sequencing libraries. The G&T method has been shown to work with, but is not limited to, Illumina sequencers. Amplified genomic and transcriptome material can also be converted into Pacific Biosciences (PacBio) RSII or Sequel libraries, which have been shown to sequence successfully on this platform [2], although PCR cycles and amplification times might need to be increased due to the high input material requirement of library preparation for PacBio sequencers.

For the transcriptome part of the G&T assay, we routinely use volume-reduced protocols of the Nextera XT or Nextera library preparation methods; however, other standard library preparation methods such as NEBNext Ultra II (FS) would also work. Please see Subheading 3.9 for a robust, cheaper, volume-reduced Nextera XT library preparation protocol.

For the genomic material yielded by the G&T assay, the ideal library preparation method depends on the scientific scope of the study. If copy number variation is the main analytical outcome and the DNA was amplified using the Rubicon/Takara Clontech PicoPLEX kit, the Illumina Nextera XT library preparation is the method of choice due to its speed, compatibility with very low input amounts, and general robustness. Please see Subheading 3.9 for our volume-reduced Nextera XT library preparation protocol.

If single cell SNV analysis is the main goal and the DNA was amplified using phi29 DNA polymerase, a PCR-free library preparation method (e.g., NEBNext® Ultra™ II or Illumina's TruSeq DNA PCR-free library preparation methods) is recommended in order not to compromise SNV detection by introducing PCR artifacts. These methods are relatively laborious but yield high-quality data. Please refer to the manufacturer's instructions for successful library preparation (see Note 16).
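The selection criteria above amount to a small decision table. The following sketch encodes only the cases named in the text. The function name and the string labels are hypothetical, and any combination the text does not discuss is reported as such rather than guessed.

```python
def suggest_library_prep(material, goal=None, amplification=None):
    """Return the library preparation suggested in Subheading 3.8.

    material:      "transcriptome" or "genome"
    goal:          "CNV" or "SNV" (genome only)
    amplification: "PicoPLEX" or "phi29" (genome only)
    Hypothetical helper; labels are ours, not from any kit or library.
    """
    if material == "transcriptome":
        return "volume-reduced Nextera XT or Nextera"
    if material == "genome":
        if goal == "CNV" and amplification == "PicoPLEX":
            return "Nextera XT"
        if goal == "SNV" and amplification == "phi29":
            return "PCR-free (e.g., NEBNext Ultra II or TruSeq DNA PCR-free)"
        return "not covered by the criteria in the text"
    raise ValueError(f"unknown material: {material!r}")


if __name__ == "__main__":
    print(suggest_library_prep("genome", goal="SNV", amplification="phi29"))
```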

Fig. 8 Unsuccessfully amplified MDA DNA (negative trace), REPLI-g. Some amplification is to be expected. The amount should be lower than in the positive control, and the molecular weight (size) of the amplified material might generally be lower. REPLI-g has, however, been optimized for yield, and even with a 3 h incubation negative traces will show a substantial amount of DNA. The only way to distinguish no-cell controls from a true positive result is to genotype this material, as suggested in the introduction and Subheading 3.8

3.9 Nextera Library Preparation Using a Quarter of the Volume Suggested by Illumina

Before starting, make sure that you have quantified your input material. Input material should be normalized to 0.2 ng/μL on average. See Subheading 3.7 for recommended DNA quantification methods. All reactions should be done in a clean UV PCR workstation, and the experimenter should wear gloves and a clean lab coat to reduce the risk of contamination.
1. Take out a pack of index primers and defrost on ice. Once defrosted, vortex, spin, and put them in the correct order on the TruSeq index plate fixture. When using a subset of the index primers, check Illumina's website for the recommended index primer combinations.
2. Defrost the normalized DNA source plate.
3. Locate and thaw TD and NPM on the cooling block or wet ice.

4. Take buffer NT from the fridge and keep it in the UV PCR workstation on a tube rack. NB: Do not put NT on ice and do not vortex. It contains SDS, which will precipitate when put on ice and will foam when vortexed. If precipitates have formed, warm the tube by hand and make sure that the precipitates have disappeared before using NT.
5. Label a 96-well plate as sample plate and dispense 1.25 μL of your input DNA at 0.2 ng/μL using a multichannel pipette. The indicated DNA concentration can be varied twofold (0.1–0.4 ng/μL) without affecting library size too much, which provides an opportunity to work with plate column averages rather than normalizing each sample individually. If data quality and consistency of yield are a priority, each sample should be normalized individually.
6. Briefly spin the plate to gather all liquid at the bottom of the plate.
7. Mix TD (Tagment DNA Buffer) with ATM (Amplicon Tagment Mix) and use a dispensing pipette to dispense 3.75 μL of the master mix onto the DNA samples.

Reagent                       Volume per sample (μL)   Volume per plate (μL)
TD (Tagment DNA Buffer)       2.5                      275
ATM (Amplicon Tagment Mix)    1.25                     137.5

8. Briefly spin down the plate and mix the reagents with a multichannel pipette or on a plate vortex to make sure that the reagents are mixed well.
9. Then incubate the plate on a thermocycler with a heated lid as follows:

Step   Temperature (°C)   Time (m:s)
1      55                 5:00
2      10                 Hold

10. When the plate reaches 10 °C, take off the plate and add 1.25 μL NT buffer to stop the tagmentation enzyme. Quickly spin down the plate and mix the reagents with a multichannel pipette or vortex the plate on a plate vortex to make sure that the reagents are mixed well.
11. Incubate the plate at room temperature for at least 5 min to make sure that the transposase is inactivated.
12. In the meantime, check whether the i7 and i5 index primers have thawed. Then vortex them to mix and spin down to collect the liquid in the bottom of the tubes.

13. Open the index primer tubes that you need and discard the lids.
14. Using a multichannel pipette, dispense 1.25 μL of both the i5 and i7 index primers into each individual well. Make sure to mix. These primers contain the dual index tag sequences and are necessary to discriminate the individual cell data after sequencing.
15. Using a repeater pipette, dispense 3.75 μL of NPM into each well. This contains the PCR enzyme. Mix on a plate mixer or with a multichannel pipette.
16. Seal the plate, place it on the thermocycler, and run the following protocol:

Step   Temperature (°C)   Time (m:s)
1      72                 2:00
2      95                 0:30
Cycle for 12 cycles:
       95                 0:10
       55                 0:30
       72                 0:30
3      72                 5:00
4      10                 Hold

17. After the PCR has finished, store the plate in the fridge.
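Because every addition in this subheading is small, it can help to track the running well volume. The sketch below sums the additions listed in steps 5 to 15 and reports the DNA input per well. The final 12.5 μL is consistent with a quarter of a 50 μL reaction; the 50 μL full-volume figure is our assumption and should be checked against the current Illumina reference guide. All names in the code are illustrative.

```python
# Volumes (uL) added per well in Subheading 3.9, in order of addition.
ADDITIONS_UL = [
    ("input DNA at 0.2 ng/uL", 1.25),
    ("TD + ATM master mix", 3.75),
    ("NT buffer", 1.25),
    ("i5 index primer", 1.25),
    ("i7 index primer", 1.25),
    ("NPM", 3.75),
]


def running_volumes(additions):
    """Return (name, added volume, cumulative volume) for each addition."""
    total = 0.0
    rows = []
    for name, volume in additions:
        total += volume
        rows.append((name, volume, total))
    return rows


final_volume_ul = running_volumes(ADDITIONS_UL)[-1][2]
input_dna_ng = 1.25 * 0.2          # uL x ng/uL
reactions_per_plate_mix = 275 / 2.5  # per-plate TD volume / per-sample TD volume

if __name__ == "__main__":
    for name, volume, total in running_volumes(ADDITIONS_UL):
        print(f"{name:<24} +{volume:>5.2f} uL -> {total:>5.2f} uL")
    print(f"DNA input per well: {input_dna_ng:.2f} ng")
    print(f"Plate master mix covers {reactions_per_plate_mix:.0f} reactions")
```

The last line shows that the per-plate master mix volumes correspond to 110 reactions, that is, a 96-well plate plus excess.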

3.10 SPRI Cleanup of the Nextera Library Products

All reactions should be done in a clean UV PCR workstation, and the experimenter should wear gloves and a clean lab coat to reduce the risk of contamination. Locate all the consumable plastics and tips, label all of them correctly, and place them inside the UV PCR workstation.
1. Bring the AMPure XP beads up to room temperature, and vortex thoroughly to bring all the beads into solution.
2. Locate the Nextera library plate, spin down, and combine 5 μL of each sample into an equivolume pool containing 480 μL of pooled library.
3. Add the SPRI beads at a ratio of 0.6× (i.e., 288 μL), mix well, and incubate for 5 min to make sure that the majority of the DNA binds to the SPRI beads.
4. Place the tube on a tube magnet and remove the supernatant. Make sure not to disturb the beads, since this will decrease the yield.
5. Add 80% ethanol and incubate for 30 s.
6. Remove the ethanol, making sure not to disturb the beads, since this will decrease the yield.
7. Repeat steps 5 and 6 at least once.

8. Add 25 μL of a nuclease-free buffer that is compatible with the subsequent step (e.g., QIAGEN EB buffer or 10:1 Tris–EDTA (TE) buffer).
9. Either store the sample in the fridge or freezer or continue with the next step.
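The pool and bead volumes in steps 2 and 3 follow from two multiplications, which is useful to redo when fewer than 96 libraries are pooled. A minimal sketch, with hypothetical function names:

```python
def pool_volume_ul(n_samples, volume_per_sample_ul):
    """Total volume of an equivolume pool."""
    if n_samples <= 0 or volume_per_sample_ul <= 0:
        raise ValueError("sample count and volume must be positive")
    return n_samples * volume_per_sample_ul


def spri_bead_volume_ul(sample_volume_ul, ratio):
    """Bead volume for a given bead-to-sample ratio (e.g., 0.6 for 0.6x)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return sample_volume_ul * ratio


if __name__ == "__main__":
    for n in (96, 48):
        pool = pool_volume_ul(n, 5.0)
        beads = spri_bead_volume_ul(pool, 0.6)
        print(f"{n} samples: pool {pool:.0f} uL, beads {beads:.0f} uL")
```

For a full plate this gives the 480 μL pool and 288 μL of beads stated in the protocol; for half a plate the same ratio gives 240 μL and 144 μL.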

3.11 Gel Electrophoresis of the Nextera Library Products

To determine whether the library generation and amplification was successful, we recommend running the cleaned material on an automated electrophoresis instrument, for example, a high sensitivity Agilent Bioanalyzer chip (see Note 13).
Below we have included example Bioanalyzer traces of successfully amplified Nextera libraries. The peaks observed at 35 and 10,380 bp are the lower and upper marker controls. The library was cleaned with a 0.6× SPRI ratio before running on the Bioanalyzer chips to achieve a larger average library size. Nextera libraries tend to show insert sizes ~200–250 bp smaller in sequencing than the size observed on Bioanalyzer chips. Using a 0.6× SPRI ratio ensures that small library fragments are eliminated and that libraries sequence with an adequate insert size (Figs. 9, 10, and 11).
1. After running the cleaned libraries on the Bioanalyzer, use the molarity reported by this platform to dilute your samples to approximately 5 nM (nanomolar).

Fig. 9 Successfully amplified Nextera library. Well-cleaned smear with average size of ~600 bp

Fig. 10 Suboptimally amplified Nextera library. Either failing Nextera reagents or primers cause traces to look like this

Fig. 11 Under-tagmented Nextera library. Usually too much DNA in the Nextera reaction causes very high-molecular weight traces like this. Insert size will be larger

3.12 Quantification of the Nextera Library Products for Sequencing

Because the Bioanalyzer tends to under-quantify strong samples, we recommend diluting the pooled library to approximately 5 nM based upon the Bioanalyzer results before running a quantification assay; otherwise, samples might fall outside of the qPCR standard curve. To get accurate library quantification results, we recommend running a qPCR assay to determine molarities. We have used the KAPA Library Quantification Kit successfully and recommend using the version that is compatible with your qPCR machine.
NB: Please follow the manufacturer's instructions on running the qPCR setup on the qPCR instrument of your choice. Alternatively, one can use the Bioanalyzer quantification results to help load the library pool. For strong samples, we again recommend diluting the samples to 5 nM and rerunning them for accurate quantification; not doing so might lead to overclustering of the Illumina sequencer flowcell.
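When only a mass concentration and an average fragment size are available, the molarity needed for the 5 nM dilution can be estimated with the common approximation of 660 g/mol per base pair of double-stranded DNA. The sketch below applies that approximation and then computes the volumes for a dilution to a target molarity. The function names and the example numbers are ours and purely illustrative; instrument-reported molarities or qPCR results take precedence.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # common approximation for one dsDNA base pair


def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp):
    """Approximate molarity (nM) from ng/uL and mean fragment length (bp)."""
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("invalid concentration or fragment length")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * mean_fragment_bp)


def dilution_volumes_ul(stock_nm, target_nm, final_volume_ul):
    """Return (library volume, diluent volume) to reach target_nm."""
    if stock_nm < target_nm:
        raise ValueError("stock is already below the target molarity")
    library = final_volume_ul * target_nm / stock_nm
    return library, final_volume_ul - library


if __name__ == "__main__":
    molarity = library_molarity_nm(3.96, 600)   # hypothetical pool
    library, diluent = dilution_volumes_ul(molarity, 5.0, 20.0)
    print(f"{molarity:.1f} nM -> {library:.1f} uL library + {diluent:.1f} uL buffer")
```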

3.13 Reagent Quality Control

3.13.1 G&T Bead Quality Control

Ensure that you wear a lab coat and gloves at all times and that all preparations are done in a UV PCR workstation to avoid contamination. This protocol assumes that you have prepared G&T wash buffer and biotin-dT30 bound streptavidin beads in resuspension buffer (without RNase inhibitor) as described in Fig. 1.
1. Clean the UV PCR workstation with bleach and then with water.
2. Locate all the consumable plastics and tips, label all of them correctly, and place them inside the UV PCR workstation.
3. Get a 15 mL G&T wash buffer aliquot from the freezer, thaw, and mix.
4. Use the UV cross-linker to UV-irradiate three clear 96-well Armadillo plates and label them "QC plate," "Beads," and "Wash." Initial and date each one and leave them in the hood.
5. Prepare a 1:8000 total RNA dilution in RLT buffer. The final concentration is 6.25 pg/μL. This is used to simulate a customer's sample during the QC of the Dynabeads. We suggest:
(a) A first dilution step of 1:200.
(b) A second dilution step of 1:40.
6. Transfer 600 μL of the premade and defrosted G&T wash buffer to a 1.5 mL Eppendorf tube.
7. Place the Falcon tube containing the rest of the G&T wash buffer back into the freezer and mark it as defrosted. Alternatively, when using the G&T wash buffer the next day, store it overnight in the fridge.
8. Dispense 3 μL of RNase inhibitor into the Eppendorf tube containing 600 μL of G&T wash buffer and mix well.

9. Dispense 30 μL of the G&T wash buffer with RNase inhibitor into at least one row (e.g., D1–D12) of the plate labeled "Wash," using an Eppendorf repeater pipette.
10. Seal the plate and spin at 1000 × g for 1 min to collect all the liquid in the bottom of the plate.
11. Remove the Falcon tube containing the G&T bead solution made as described in Fig. 1 from the fridge and, if you have one, an old validated bead solution as a positive control.
12. Vortex both tubes thoroughly to ensure the beads are back in solution.
13. Label two 1.5 mL Eppendorf tubes, one as "New" and the other as "Old," and dispense 150 μL from each solution into its respective Eppendorf tube.
14. Add 8 μL of RNase inhibitor to both Eppendorf tubes with the beads and mix well. Place the Falcon tube(s) containing the remaining bead solutions back into the fridge.
15. Dispense 13 μL of the old beads into six wells (e.g., D1–D6) of the plate labeled "Beads."
16. Dispense 13 μL of the new beads into six wells (e.g., D7–D12) of the plate labeled "Beads."
17. Seal the plate and spin at 100 × g for 15 s only. NB: Do not allow the beads to settle, as it will be hard to resuspend them.
18. Dispense 2 μL of the diluted RNA (created in step 5) into 12 wells (e.g., D1–D12) of the plate labeled "QC plate."
19. Using a multichannel pipette, dispense 10 μL of the beads from the bead plate onto the 12 wells containing the diluted RNA.
20. Mix the beads (using a multichannel pipette). In this case tips do not have to be reused because the poly A-tailed RNA should be bound to the beads, whereas the wash buffer should only contain waste.
21. Spin down the plate for 10 s at 100 × g to collect the beads in the bottom of the well. Do not spin longer or faster. The beads are sticky and will be difficult to resuspend.
22. Incubate the beads for 20 min on the Eppendorf Thermomixer C with heated lid at room temperature while shaking at 1200 rpm (2.4 × g).
23. After 20 min, take the plate off and put it on the low elution plate magnet. Wait for the beads to settle.
24. Aspirate the supernatant, making sure not to disturb the beads, and transfer the supernatant to the waste. Tips do not have to be reused since the beads will bind the polyA-tailed mRNA and the waste will not be used for amplification.

25. Take the plate off the magnet. Dispense 10 μL of wash buffer from your wash buffer plate into the same wells and resuspend the beads. Try to resuspend the beads as well as possible, but do not overdo it since this will destroy the beads.
26. Incubate the beads again for 20 min on the Eppendorf Thermomixer C with heated lid at room temperature while shaking at 1200 rpm (2.4 × g). While the program is running, proceed to the next step to prepare the RT master mix.
27. Prepare the RT master mix as described below. The preparation time is approximately 20 min. We have increased the excess to make logical pipetting volumes.

Component                                          Volume per plate (μL)   Volume per reaction (μL)   Final concentration
Nuclease-free water                                43.15                   2.1575
Superscript II 5× first strand buffer              20                      1                          1×
Betaine (5 M)                                      20                      1                          1 M
100 mM DTT                                         5                       0.25                       5 mM
25 mM each dNTP mix                                4                       0.2                        1 mM each
1 M MgCl2                                          0.6                     0.03                       6 mM
100 μM TSO                                         1                       0.05                       1 μM
RNase inhibitor, murine (20 U/μL)                  1.25                    0.0625                     0.5 U/μL
Superscript II reverse transcriptase (200 U/μL)    5                       0.25                       10 U/μL
Total volume                                       100                     5

28. Vortex the RT master mix briefly or mix thoroughly with a 200 μL pipette and store in your cooling block until directed.
29. Turn on another thermomixer and preheat it to 42 °C.
30. Now return to the original plate that has been shaking. After 20 min, take the plate off and put it on the low elution plate magnet. Wait for the beads to settle.
31. Aspirate the supernatant, making sure not to disturb the beads, and transfer the supernatant to the waste.
32. After this, take the QC plate to the UV PCR workstation and proceed to the next step immediately.
33. Dispense 5 μL of the RT master mix into each well of the QC plate.

34. Seal the plate and spin it at 100 × g for 15 s only (DO NOT spin faster or longer, since this will settle the beads).
35. Place the QC plate on the thermomixer and incubate for 1 min at RT at 2000 rpm (6.7 × g) to resuspend the beads. Check whether the beads are resuspended. If so, continue to the next step; if not, repeat this step.
36. Place the QC plate on the thermomixer (with a PCR 96 adapter) and tape the plate down to keep it in place.
37. Incubate the plate by running the following steps on the preheated (42 °C) thermomixer. It will take 1 h 42 min for the program to complete.

Step   Temperature (°C)   Time (min)   Speed (rpm)
1      42                 2            2000
2      42                 60           1500
3      50                 30           1500
4      60                 10           1500

38. Prepare the following PCR master mix in a 1.5 mL Eppendorf tube a few minutes before the end of the program.

Component                Volume per reaction (μL)   Volume for 16 samples (μL)
Kapa HiFi                6.25                       100
100 μM IS PCR primers    0.0125                     0.2 (or dilute the 100 μM stock to 10 μM, then take 2 μL)
Nuclease-free water      1.2375                     19.8
Total volume             7.5                        120

39. Centrifuge the QC plate at 100 × g for 1 min to make sure that all liquid is collected in the bottom of the plate.
40. Dispense 7.5 μL of PCR master mix into each well.
41. Seal the plate and spin it at 100 × g for 15 s only (DO NOT spin faster or longer, since this will settle the beads).
42. Place the QC plate on the thermomixer and incubate for 1 min at RT at 2000 rpm (6.7 × g) to resuspend the beads. Check whether the beads are resuspended. If so, continue to the next step; if not, repeat this step.
43. Incubate the plate on a thermocycler using the following conditions:

Step   Temperature (°C)   Time (m:s)
1      98                 3:00
2      Cycle for an additional 24 cycles:
       98                 0:10
       67                 0:15
       72                 6:00
3      72                 5:00
4      10                 Hold

44. This program will take approximately 3 h to complete and can be left overnight. This would be a good stopping point. Alternatively, one can continue with the next step to clean the amplified cDNA.
45. Once complete, clean up the samples using a 1:1 SPRI to sample ratio as specified in Subheading 3.4 and run all the samples on an automated electrophoresis instrument as specified in Subheading 3.5.
46. Clean functional reagents should yield a high-molecular weight peak, with relatively low amounts of low-molecular weight material. Bioanalyzer traces of successfully amplified material can be observed in Subheading 3.5.
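The 1:8000 RNA dilution in step 5 is made in two steps (1:200, then 1:40). The sketch below checks that arithmetic and derives pipetting volumes for each step. The stock concentration of 50 ng/μL is not stated in the protocol; it is what a final 6.25 pg/μL at 1:8000 implies. The 200 μL step volume and the function names are assumptions made for illustration.

```python
def serial_dilution(stock, factors):
    """Return the concentration after each successive 1:factor dilution."""
    concentrations = []
    current = stock
    for factor in factors:
        if factor <= 1:
            raise ValueError("dilution factors must be greater than 1")
        current /= factor
        concentrations.append(current)
    return concentrations


def pipetting_volumes_ul(factor, final_volume_ul):
    """Return (sample volume, diluent volume) for a 1:factor dilution."""
    sample = final_volume_ul / factor
    return sample, final_volume_ul - sample


# Stock implied by the text: 6.25 pg/uL after a 1:8000 dilution.
STOCK_PG_PER_UL = 6.25 * 8000
steps_pg_per_ul = serial_dilution(STOCK_PG_PER_UL, [200, 40])
rna_per_well_pg = steps_pg_per_ul[-1] * 2  # 2 uL dispensed per well (step 18)

if __name__ == "__main__":
    print("after each step (pg/uL):", steps_pg_per_ul)
    print("1:200 in 200 uL:", pipetting_volumes_ul(200, 200.0))
    print("1:40 in 200 uL:", pipetting_volumes_ul(40, 200.0))
    print("RNA per well (pg):", rna_per_well_pg)
```

Under these assumptions, each QC well receives 12.5 pg of total RNA.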

3.13.2 G&T Buffer Testing for RNase Contamination

Before starting, make sure that all reactions are done in a PRE-PCR lab, preferably in a UV PCR workstation, and that the experimenter wears gloves and a clean lab coat to reduce the risk of contamination.
1. Switch on a thermomixer with a 96-well plate heating block. Set the temperature to 37 °C without shaking.
2. UV-irradiate enough Eppendorf tubes and a 96-well plate for your reagents in a cross-linker set to 600 mJ/cm².
3. Locate all the reagents to be tested. For example:
• Buffer A.
• Buffer B.
• 2× B&W.
• Tris–HCl pH 8.3.
4. Locate quality-controlled RNase-free water.
5. For each tested reagent, label the top of an Eppendorf tube with the name of the reagent to be tested. Also take one extra tube and label this "Water." (This is your nuclease-free water control.)
6. Dilute each tube containing 50 ng of total RNA 1:100 with the reagent that is to be tested.
7. Seal the tubes and incubate the reagents with the RNA for 1 h at 37 °C without shaking.

8. Label a clean 96-well clear PCR plate with your initials and the date, and write down the order of the reagents that you are checking.
9. Take a quality-controlled G&T-dT30 aliquot (100 μM) and an aliquot of quality-controlled G&T dNTPs (25 mM) and make a dNTP–dT30 mix as follows:

Component              Volume per reaction (μL)   Volume for ten samples (μL)
25 mM each dNTP        0.4                        4
100 μM dT30            0.1                        1
Nuclease-free water    1.5                        15
Total                  2                          20

10. Store in a cooling block until use.
11. Make up the following RT master mix approximately 30 min after the start of the incubation.

Component                     Volume per reaction (μL)   Volume for 10 samples (μL)   Final concentration
First strand buffer (5×)      2                          20                           1×
5 M betaine                   2                          20                           1 M
Nuclease-free water           0.59                       5.9
0.1 M DTT                     0.5                        5                            1 mM
1 M MgCl2                     0.06                       0.6                          7.5 mM
100 μM TSO                    0.1                        1                            1.25 μM
RNase inhibitor (40 U/μL)     0.25                       2.5                          0.5 U/μL
Superscript II (200 U/μL)     0.5                        5                            10 U/μL
Total                         6                          60

12. Store in the cooling block until use.
13. Just before the 37 °C incubation of your reagents is finished, take a fresh RNA sample from the freezer and dilute 50 ng 1:100 with nuclease-free water. This is your no-incubation control, and it should always yield a good amount of cDNA unless your RT reagents are contaminated or faulty.

14. Then dilute all reagents, including the no-incubation control tube from step 13, 1:40 in nuclease-free water.
15. From these aliquots, dispense 2 μL per reagent into a well of the Armadillo 96-well plate.
16. Add 2 μL of the dNTP–dT30 mix from step 9 to the same wells, mix, and seal the plate.
17. Incubate the plate on a thermocycler with heated lid at 72 °C for 2 min and quench the plate on ice.
18. Dispense 6 μL of the RT master mix per well. Seal and mix on the thermomixer.
19. Run the following program on a thermocycler with heated lid.

Step   Temperature (°C)   Time (h:m:s)
1      42                 1:30:00
2      Cycle for an additional 9 cycles:
       50                 0:02:00
       42                 0:02:00
3      70                 0:15:00
4      10                 Hold

20. Just before the RT step is finished make the following PCR master mix:

Reagent                     Volume per reaction (μL)   Volume for 10 samples (μL)
Kapa HiFi (25 mM)           12.5                       125
IS PCR primers (100 μM)     0.03                       0.3
Nuclease-free water         2.47                       24.7
Total                       15                         150

21. Store the master mix on a cooling block until use.
22. Add 15 μL of the PCR master mix to the reagent wells of the plate. Then mix on the plate vortex, seal the plate, and incubate the plate on the thermocycler using the following program. This will take 2 h 30 min and will probably take you to the next day.

Step   Temperature (°C)   Time (m:s)
1      98                 3:00
2      Cycle for an additional 24 cycles:
       98                 0:10
       67                 0:15
       72                 6:00
3      72                 5:00
4      10                 Hold

23. Once complete, clean up the samples using a 1:1 SPRI to sample ratio as specified in Subheading 3.4 and run all the samples on an automated electrophoresis instrument as specified in Subheading 3.5.
24. Clean functional reagents should yield a high-molecular weight peak, with relatively low amounts of low-molecular weight material. Bioanalyzer traces of successfully amplified material can be observed in Subheading 3.5.

3.13.3 G&T Buffer Testing for DNA Contamination

Before starting, make sure that all reactions are done in a PRE-PCR lab, preferably in a UV PCR workstation, and that the experimenter wears gloves and a clean lab coat to reduce the risk of contamination. For the DNA QC we have been using part of a PicoPLEX kit (see Note 17).
1. Label a 96-well plate as "QC plate" and treat it with UV in a cross-linker set to 600 mJ/cm².
2. Take out a Rubicon PicoPLEX kit and defrost the following reagents:
(a) Extraction enzyme dilution buffer (purple).
(b) Cell extraction buffer (green).
(c) Pre-Amp buffer (red).
(d) Return the rest of the kit to the freezer until directed.
3. Once defrosted, place the purple, green, and red buffers on ice or inside a cooling block until directed.
4. In a separate location, dilute some control DNA to 5 pg/μL. This will be your positive control (see Note 18).
5. Locate all the reagents to be tested and transfer them to your UV workstation. These are as follows:
(a) Buffer A.
(b) Buffer B.
(c) 2× B&W.
(d) Tris–HCl pH 8.3.
(e) AMPure XP beads.
(f) Nuclease-free 80% EtOH.
(g) Biotin-dT30 bound streptavidin beads.
(h) Quality-controlled DNA-free water.
6. Add 25 μL of the following reagents to individual wells.
(a) AMPure XP.
(b) Biotin-dT30 bound streptavidin beads.
(c) Nuclease-free 80% EtOH.

7. Place the plate on a plate magnet and make sure that the beads settle.
8. Aspirate the supernatants and discard.
9. Dispense 1.25 μL of DNA-free nuclease-free water into these wells.
10. Add 1.25 μL of all the other reagents to independent wells; make sure to write down which well contains what.
11. Dispense 1.25 μL of cell extraction buffer into a subsequent well. This will be your negative control.
12. Take your control DNA and dispense 1.25 μL into the last well. This will be your positive control.
13. Prepare the cell extraction mix as described below; vortex to mix and store on ice until directed.

Component                             Volume per reaction (μL)   Volume for ten samples (μL)   Color
Cell extraction buffer                1.25                       12.5                          Green
Extraction enzyme dilution buffer     2.4                        24                            Purple
Cell extraction enzyme                0.1                        1                             Yellow
Total volume                          3.75                       37.5

14. Dispense 3.75 μL of the cell extraction mix into each well of the DNA plate (e.g., using a repeater pipette).
15. Seal the QC plate and centrifuge it at 100 × g for 30 s to collect the liquid in the bottom of the well.
16. Vortex the DNA plate on a thermomixer at 2000 rpm (6.7 × g) for 1 min. Check whether the beads are resuspended. If so, continue; if not, repeat this step.
17. Transfer the plate to a thermocycler and run the following DNA extraction program. This program will take 14 min to complete.

Step   Temperature (°C)   Time (m:s)
1      75                 10:00
2      95                 4:00
3      RT                 Hold

18. While the thermocycler is running, prepare the Pre-Amp master mix as described below; mix and store on ice until directed.

Component         Volume per reaction (μL)   Volume per ten samples (μL)   Color
Pre-Amp buffer    2.4                        24                            Red
Pre-Amp enzyme    0.1                        1                             White
Total volume      2.5                        25

19. When the DNA extraction program is finished, centrifuge the DNA plate at 100 × g for 15 s.
20. Dispense 2.5 μL of Pre-Amp master mix into each well of the DNA plate.
21. Seal the QC plate and centrifuge it at 1000 × g for 1 min.
22. Vortex the DNA plate on a thermomixer at 2000 rpm (6.7 × g) for 1 min. Check whether the beads are resuspended. If so, continue; if not, repeat this step.
23. Transfer the plate to the thermocycler and run the following preamplification program. This program will take 1 h 20 min to complete.

Step   Temperature (°C)   Time (m:s)
1      95                 2:00
2      Cycle for 12 cycles:
       95                 0:15
       15                 0:50
       25                 0:40
       35                 0:30
       65                 0:40
       75                 0:40
3      10                 Hold

24. Defrost the amplification buffer (orange) and nuclease-free water (clear) in a cool box so that they are defrosted before the preamplification process has finished.
25. Prepare the amplification master mix as follows a few minutes before the end of the preamplification step.

Component               Volume per sample (μL)   Volume for ten samples (μL)   Color
Amplification buffer    12.5                     125                           Orange
Nuclease-free water     17.1                     171                           Clear
Amplification enzyme    0.4                      4                             Blue
Total volume            30                       300

26. After the preamplification program has finished, centrifuge the plate at 1000 × g for 60 s.
27. Dispense 30 μL of amplification master mix into each well of your QC plate and seal the plate.
28. Centrifuge the DNA plate at 100 × g for 15 s, making sure not to settle the beads.
29. Place the DNA plate on the Thermomixer C and vigorously vortex the plate for 1 min at 2000 rpm (6.7 × g).
30. Transfer the plate to the thermocycler and run the following program. This will take 52 min to complete and can be left overnight.

Step   Temperature (°C)   Time (m:s)
1      95                 2:00
2      Cycle for 14 cycles:
       95                 0:15
       65                 1:00
       75                 1:00
3      10                 Hold

31. Once complete, clean up the samples using a 1:1 SPRI to sample ratio as specified in Subheading 3.4 and run all the samples on an automated electrophoresis instrument as specified in Subheading 3.5.
32. Clean functional reagents should yield a spiky smear for the positive control, while the negative control should have (nearly) no high-molecular weight material. Bioanalyzer traces of successfully amplified material can be found in Subheading 3.6.2.
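The run times quoted for the three PicoPLEX programs can be sanity-checked by summing the programmed hold times. The sketch below does this for the extraction, preamplification, and amplification programs listed above. It counts hold times only; ramping between temperatures is instrument dependent and is not modeled, which presumably explains why the cycling programs are quoted as taking longer than the sums computed here. The data layout and function names are ours.

```python
def to_seconds(mmss):
    """Convert an "m:s" string such as "10:00" to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def program_hold_seconds(program):
    """Sum the hold times of a thermocycler program.

    Each entry is either ("hold", "m:s") for a single timed step or
    ("cycle", n_cycles, ["m:s", ...]) for a block repeated n_cycles times.
    Final untimed "Hold" steps are simply left out of the program list.
    """
    total = 0
    for entry in program:
        if entry[0] == "hold":
            total += to_seconds(entry[1])
        elif entry[0] == "cycle":
            _, n_cycles, holds = entry
            total += n_cycles * sum(to_seconds(h) for h in holds)
        else:
            raise ValueError(f"unknown entry type: {entry[0]!r}")
    return total


EXTRACTION = [("hold", "10:00"), ("hold", "4:00")]
PREAMP = [("hold", "2:00"),
          ("cycle", 12, ["0:15", "0:50", "0:40", "0:30", "0:40", "0:40"])]
AMPLIFICATION = [("hold", "2:00"), ("cycle", 14, ["0:15", "1:00", "1:00"])]

if __name__ == "__main__":
    for name, program in [("extraction", EXTRACTION),
                          ("preamplification", PREAMP),
                          ("amplification", AMPLIFICATION)]:
        print(f"{name}: {program_hold_seconds(program) / 60:.1f} min of hold time")
```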

4 Notes

1. Even though the reagents will be supplied as such, we still recommend testing these reagents independently to ensure functionality and, more importantly, to assure the operator of the complete absence of contaminants such as background amounts of DNA or nucleases.
2. Even though we feel most reagents could be purchased from any brand fulfilling these criteria, we suggest using the reagents stipulated by Macaulay et al. [3]. In addition, we have provided manufacturer and order codes for some reagents we feel are indispensable for the successful implementation of this protocol.
3. Even though most plasticware is guaranteed nuclease-free, we still recommend UV irradiating all plasticware before use in a cross-linker set to 600 mJ/cm².

4. Make fresh each time and discard remnants after use.
5. Depending on the RNA content of your cells, optimization will be needed. Observed dilutions range from 1:5 × 10⁵ to 1:1.28 × 10⁸. We recommend aiming for 5% of the final sequencing reads to map to ERCCs.
6. If the lysis buffer is not defrosted, the cells may not be lysed, which will influence the quality of the RNA.
7. Streptavidin beads are sticky, and it is expected that some beads will adhere to the tube and tips.
8. This will inactivate RNA and DNA on the plates. If in doubt, one can also include the other labware necessary for this protocol, that is, 15 mL Falcon tubes and Eppendorf tubes.
9. We have not tested their stability for longer.
10. Do not spin longer or faster; this step is only to collect any droplets in the main liquid phase. The beads are sticky and will be difficult to resuspend.
11. DO NOT spin faster or longer, since this will settle the beads.
12. When starting this protocol, it is best to start with 13 μL of SPRI beads to capture the majority of cDNA fragments in order to estimate RNA degradation. For experienced users who do not expect large amounts of degradation products, it is better to use 7.8 μL, since this will remove most low-molecular weight products that can interfere with accurate cDNA quantification for subsequent Nextera XT library preparation.
13. Please follow the manufacturer's instructions on running the automated electrophoresis instrument of your choice.
14. Reagent volumes in the GenomiPhi kit are quite small, and we have seen an inhibiting effect of the beads on the DNA yield (results not shown).
15. We advocate using multiple standards that cover the entirety of the dynamic range of the assay in order to assess linearity and the true limit of detection. Biotium's kits contain seven standards, which in our hands have been shown to give very accurate standard curves (r² > 0.99), thus enabling accurate sample quantification.
16. Please note that, as a cost-saving measure, we generally suggest performing a genotyping assay (e.g., Fluidigm access array, SNPlex, or PCR-based genotyping assay) to select samples for PCR-free library preparation. These genotyping assays correlate with the amount of locus and allelic dropout that is observed after PCR-free library preparation and sequencing and can help select the samples with high coverage and low locus dropout (results not shown).

17. We use this kit because it gives a much better positive to negative control ratio than most MDA kits. 18. Do not do this in the UV hood you will be subsequently using since you might risk contaminating the hood with DNA, inva- lidating this experiment.

Acknowledgments

We would like to thank the Wellcome Trust for funding. We would like to acknowledge our colleagues from the Single Cell Genomics Core Facility and in particular: Howerd Fordham and Emily Hinkley; our collaborators: Dr. Iain Macaulay, Dr. Mabel Teng, Scott Thurston, Dr. Jose Garcia-Bernardo, and Dr. Daniel Brown; and for cell sorting our colleagues from the Cytometry Core Facility: Bee Ling Ng, Christopher Hall, Jennie Graham, and Sam Thompson.

Chapter 21

Simultaneous Profiling of mRNA Transcriptome and DNA Methylome from a Single Cell

Youjin Hu, Qin An, Ying Guo, Jiawei Zhong, Shuxin Fan, Pinhong Rao, Xialin Liu, Yizhi Liu, and Guoping Fan

Abstract

Single-cell transcriptome and single-cell methylome analyses have successfully revealed the heterogeneity in transcriptome and DNA methylome between single cells, and have become powerful tools for understanding the dynamics of the transcriptome and DNA methylome during complex biological processes such as differentiation and carcinogenesis. Inspired by the success of these single-cell -omics methods in explaining the regulation of a particular “-ome,” interest has grown in elucidating the regulatory relationships among multiple -omics layers at single-cell resolution. Simultaneous profiling of multiple -omics from the same single cell would provide the ultimate power to understand the relationships among different “-omes,” but this idea remained unrealized for decades because of the difficulty of assaying the extremely small amounts of DNA and RNA in a single cell. To address this technical challenge, we recently developed a method named scMT-seq that simultaneously profiles both the DNA methylome and the RNA transcriptome from the same cell. This method enabled us to measure, from a single cell, the DNA methylation status of the most informative 0.5–1 million CpG sites and the mRNA level of 10,000 genes, of which 3200 genes can be further analyzed for both promoter DNA methylation and RNA transcription. Using scMT-seq data, we have shown the regulatory relationship between DNA methylation and transcriptional level in a single dorsal root ganglion neuron (Hu et al., Genome Biol 17:88, 2016). We believe that scMT-seq will be a powerful technique for uncovering the regulatory mechanisms linking transcription and DNA methylation, and will be of wide interest beyond the epigenetics community.

Key words Single-cell sequencing, Single-cell DNA methylome, Single-cell transcriptome, Multi-omics profiling

1 Introduction

DNA methylation is among the best-studied epigenetic modifications and has been shown to be related to many critical biological processes [1–3]. Integrative analysis of bisulfite sequencing data with RNA-seq data has revealed how DNA methylation correlates with gene expression levels, and measuring the relationship between DNA methylation and RNA transcription at higher resolution will inspire more faithful hypotheses about how DNA methylation and other epigenetic modifications interact with gene expression. Recently, advances in technology have made it possible to profile the transcriptome or DNA methylome at the single-cell level [4–6], for example by single-cell RNA-seq [7–9], single-cell bisulfite sequencing (scBS-seq) [4], and single-cell reduced representation bisulfite sequencing (scRRBS) [5]. Based on these techniques, growing evidence supports the existence of transcriptional and epigenetic heterogeneity in cell populations previously thought to be “pure,” and this heterogeneity in transcriptome and DNA methylome poses a huge challenge to understanding the true relationship between them. Recently, we and other groups have developed methods to simultaneously profile the DNA methylome and transcriptome from one single cell (scMT-seq) [10–14]. Integrative analysis of these multi-omics data from the same single cell can provide a detailed picture of the subgroups within a complex cell population and an accurate temporal roadmap of cellular differentiation.

scMT-seq includes four steps: (1) the cell membrane is selectively lysed and the nucleus is physically separated from the cytoplasm; (2) the cytoplasm, which contains the majority of the mRNA, is used for single-cell RNA sequencing by Smart-seq2; (3) the nucleus, which contains all the genomic DNA, is used for DNA methylome profiling by single-cell RRBS (scRRBS); (4) next-generation sequencing and data analysis. The experimental part of the protocol takes 3 weeks.

Youjin Hu, Qin An, and Ying Guo contributed equally to this work.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_21, © Springer Science+Business Media, LLC, part of Springer Nature 2019

2 Materials

2.1 Equipment and Consumables

1. High Sensitivity DNA Kit Chips and Reagents (Agilent Technologies).
2. Qubit dsDNA HS Assay Kit (Thermo Fisher).
3. MethyCode kit (Thermo Scientific). Other bisulfite conversion kits can also be used.

2.2 Solutions (The Solutions Were Prepared for Ten Reactions)

1. Cell membrane-selective lysis buffer: 2% Triton X-100 19 μl, 40 U/μl RNase inhibitor 1 μl, 10 μM oligo-dT30VN primer 10 μl, and 10 mM dNTP 10 μl. Pipet 4 μl of lysis buffer into one PCR tube per sample.
2. RRBS lysis buffer: Protease 11.25 μl and 1.2 pg/μl lambda DNA (dam–, dcm–; Thermo Scientific, cat. no. SD0021) 0.55 μl.
3. RT mix: SuperScript III first-strand buffer 20 μl, SuperScript III reverse transcriptase (ThermoFisher, 18080044) 5 μl, 40 U/μl RNase inhibitor 2.5 μl, 100 mM DTT 5 μl, 5 M Betaine 20 μl, 1 M MgCl2 0.6 μl, 100 μM TSO primer (Invitrogen) 2 μl, and 1:1000 diluted ERCC (ThermoFisher, 4456740) 5 μl. The total volume of the mix is 60.1 μl, that is, about 6 μl per sample. The amount of ERCC added to the reaction mix should be optimized for each cell type.
4. cDNA PCR amplification mix: 2× KAPA HiFi HotStart ReadyMix (Kapa Biosystems, KK2601) 125 μl, 10 μM IS PCR primers 2.5 μl, and nuclease-free water 22.5 μl.
5. Tagmentation reaction mix: Tagmentation DNA buffer 40 μl, Tagmentation enzyme mix 16.7 μl, Resuspension buffer 13.3 μl, DNA (input 0.1 ng) 10 μl. Dilute the amplified cDNA to 0.1 ng/μl before preparing the tagmentation mix.
6. NT buffer: 0.2% SDS solution. Also commercially available in the Nextera XT DNA sample preparation kit (Illumina, FC-131-1096).
7. Tagmented cDNA PCR reaction mix: KAPA 2× polymerase 150 μl.
8. MspI reaction mix: 10× Tango buffer (Thermo Scientific, cat. no. BY5) 20 μl, 10 U/μl MspI (Thermo Scientific, cat. no. ER0541) 9 μl, and nuclease-free water 101 μl.
9. Ligation reaction mix: 10× Tango buffer 5 μl, 30 U/μl HC T4 ligase 10 μl, 10 mM ATP 12.5 μl, and nuclease-free water 12.5 μl.
10. First-round PCR reaction mix: 10× Reaction buffer 50 μl, 10 mM dNTP Mix 10 μl, 10 μM PCR Primer 15 μl, 5 U/μl Pfu Turbo Cx (Agilent Technologies, cat. no. 600412) 4 μl, and nuclease-free water 121 μl.
11. Second-round PCR reaction mix: 5× Phusion HF buffer 100 μl (New England BioLabs, cat. no. M0531S), 10 mM dNTP Mix 10 μl, 10 μM PCR Primer 15 μl, Phusion HF 5 μl, and nuclease-free water 50 μl.
12. Ampure XP beads. Commercially available from Beckman Coulter.
13. EB solution: 10 mM Tris–HCl, adjust pH to 8.5. Also commercially available.
14. Klenow fragment exo–: 5 U/μl. Available from Thermo Scientific.
15. TruSeq methylated adaptors: From Illumina, cat. no. FC-121-2001.
16. GlycoBlue from Thermo Fisher, cat. no. AM9516.
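Because the recipes above are written for ten reactions, it can help to check the batch total and rescale it when a different number of cells is processed. The following is a minimal sketch using the RT mix volumes listed above; the function name `scale_mix` and the `overage` parameter (a pipetting excess) are illustrative assumptions, not part of the protocol. The listed components sum to 60.1 μl for the ten-reaction batch, which is consistent with the 6 μl added per sample in Subheading 3.2.1.

```python
# Volumes (in microliters) of the RT mix as listed above for ten reactions.
RT_MIX_UL = {
    "SuperScript III first-strand buffer": 20.0,
    "SuperScript III reverse transcriptase": 5.0,
    "RNase inhibitor (40 U/ul)": 2.5,
    "DTT (100 mM)": 5.0,
    "Betaine (5 M)": 20.0,
    "MgCl2 (1 M)": 0.6,
    "TSO primer (100 uM)": 2.0,
    "ERCC (1:1000 dilution)": 5.0,
}


def scale_mix(recipe_ul, recipe_reactions, wanted_reactions, overage=0.0):
    """Scale every component linearly to a new number of reactions.

    `overage` is a hypothetical pipetting excess (0.1 means 10% extra).
    """
    factor = wanted_reactions * (1.0 + overage) / recipe_reactions
    return {name: round(vol * factor, 3) for name, vol in recipe_ul.items()}


# Batch total and the resulting volume per reaction.
total_for_ten = round(sum(RT_MIX_UL.values()), 2)
per_reaction = round(total_for_ten / 10, 2)
print(total_for_ten, per_reaction)

# Example: rescale the same recipe for 24 cells without overage.
for name, vol in scale_mix(RT_MIX_UL, 10, 24).items():
    print(f"{name}: {vol} ul")
```

The same helper can be applied to any of the other mixes by replacing the dictionary with the corresponding component list.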

2.3 Oligos

1. TSO primer: Sequence: 5′-AAGCAGTGGTATCAACGCAGAGTACATrGrG+G-3′. Please note that there are two riboguanosines (rG) and one LNA-modified guanosine (+G) at the 3′ end of the primer. These modified nucleotides are critical for the TSO primer to function.
2. Oligo-dT30VN: Sequence: 5′-AAGCAGTGGTATCAACGCAGAGTACT30VN-3′. At the 3′ end of this oligo there are 30 tandem T (T30), followed by a V (A, C, or G) and an N (A, T, C, or G).
3. ISPCR oligo: Sequence: 5′-AAGCAGTGGTATCAACGCAGAGT-3′.
4. PCR primers for the first and second round of PCR in scRRBS:
QP1: 5′-AATGATACGGCGACCACCGA-3′
QP2: 5′-CAAGCAGAAGACGGCATACGA-3′
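The T30VN shorthand describes a pool of oligos rather than a single sequence. As a minimal sketch, the pool can be enumerated by expanding the two degenerate positions with the standard IUPAC codes (V = A, C, or G; N = A, C, G, or T). The variable and function names below are illustrative only.

```python
from itertools import product

# 5' portion of oligo-dT30VN as listed above, upstream of the T30 stretch.
ADAPTER = "AAGCAGTGGTATCAACGCAGAGTAC"
# Standard IUPAC degenerate base codes used by the oligo.
IUPAC = {"V": "ACG", "N": "ACGT"}


def expand_oligo_dt30vn():
    """Return every concrete sequence encoded by ADAPTER + T30 + V + N."""
    return [
        ADAPTER + "T" * 30 + v + n
        for v, n in product(IUPAC["V"], IUPAC["N"])
    ]


variants = expand_oligo_dt30vn()
print(len(variants), "distinct sequences in the pool")
print(variants[0])
```

The V position is what anchors the primer at the junction between the poly(A) tail and the transcript body, since it excludes a further T.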

2.4 Software and Databases

1. Trim Galore! (Version 0.4.5). Freely available from https://github.com/FelixKrueger/TrimGalore.
2. Bowtie (Version 1.2.2). Freely available from http://bowtie-bio.sourceforge.net/index.shtml.
3. Bismark [15] (Version 0.19). Freely available from https://github.com/FelixKrueger/Bismark.
4. Samtools. Download from http://samtools.sourceforge.net/.
5. STAR [16] (Version 2.5.4b). Download from https://github.com/alexdobin/STAR.
6. featureCounts. Available from http://bioinf.wehi.edu.au/featureCounts/.
7. SNPsplit (Version 0.3.2). Available from https://www.bioinformatics.babraham.ac.uk/projects/SNPsplit/.
8. Mouse strain-specific SNP dataset. Obtained from Mouse Genome Project database (http://www.sanger.ac.uk/science/data/mouse-genomes-project).
9. Mouse reference genome and gene annotation. Obtained from GENCODE database (https://www.gencodegenes.org/).

3 Methods

3.1 Isolation of Nucleus and Cytoplasm from a Single Cell

3.1.1 Preparing Single-Cell Samples

1. Clean the hood with RNaseZap and DNA-OFF solutions before setting up the working plates. Spray pipettes with RNaseZap.
2. Add 4 μl of cell membrane-selective lysis buffer on the wall of a PCR tube. Add 4 μl of RRBS lysis buffer at the bottom of another PCR tube.
3. Pick a single cell by mouth pipetting with a glass microcapillary pipette under the microscope, and transfer the cell into the cell membrane-selective lysis buffer droplet. Make sure the cell is transferred with as little liquid in the pipette as possible (ideally <0.2 μl), to avoid diluting the cell membrane-selective lysis buffer droplet (see Note 1).
4. Incubate the cell for 5 min at room temperature to lyse the cell membrane thoroughly and expose the nucleus. Pick the nucleus with a microcapillary pipette in 0.2 μl of buffer and transfer it into the other PCR tube containing 4 μl of RRBS lysis buffer. After picking up the nucleus, add 1 μl of 10 μM oligo-dT primer and 1 μl of 10 mM dNTP to the tube containing the cytosolic RNA. Put the tubes containing the nucleus or cytosol on dry ice immediately, and store them at −80 °C until the scRRBS and SMART-seq2 steps.

3.2 SMART-seq2 for mRNA Sequencing

3.2.1 Reverse Transcription

1. Incubate the tube containing cytoplasm on ice at 4 °C to thaw. Quickly vortex, then spin down the tube (700 × g for 10 s at room temperature) to collect the solution at the bottom of the tube. Place the tube on ice immediately (see Note 2).
2. Incubate the samples at 72 °C for 3 min, then take out the tube immediately and put it back on ice.
3. Spin down the samples (700 × g for 10 s at room temperature) to collect the liquid at the bottom of the tubes.
4. Add 6 μl of the RT mix to the cytoplasm samples. Mix the reaction vigorously by vortexing for 10 s (see Note 3).
5. Spin down the samples (700 × g for 10 s at room temperature) to collect the liquid at the bottom of the tubes, then incubate the reaction in a thermal cycler with a heated lid using the following program for reverse transcription:

Cycle    Temperature (°C)    Time
1        42                  90 min
2–11     50                  2 min
         42                  2 min
12       70                  1 min
13       4                   Hold

6. Prepare the cDNA PCR amplification mix and add 15 μl of it to each tube from step 5. Vortex the tubes vigorously to mix, then spin them down (700 × g for 10 s at room temperature) to collect the liquid at the bottom of the tubes.
7. Perform the PCR in a thermal cycler by using the following program, to synthesize the second strand and amplify the cDNA:

Cycle    Temperature (°C)    Time
1        98                  3 min
2–19     98                  20 s
         67                  15 s
         72                  6 min
20       72                  5 min
21       4                   Hold
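When planning the bench day, it is useful to know roughly how long the two cycler programs above occupy the instrument. The sketch below simply sums the programmed hold times; it ignores ramping between temperatures and the final 4 °C hold, so real run times will be somewhat longer. The data layout and function name are illustrative assumptions.

```python
def program_minutes(steps):
    """Sum programmed hold times.

    Each step is (repeats, [segment durations in seconds]).
    Ramp times and the final 4 C hold are not included.
    """
    total_s = sum(repeats * sum(segments) for repeats, segments in steps)
    return total_s / 60.0


# Reverse transcription program from step 5.
rt_program = [
    (1, [90 * 60]),          # cycle 1: 42 C for 90 min
    (10, [2 * 60, 2 * 60]),  # cycles 2-11: 50 C for 2 min, 42 C for 2 min
    (1, [1 * 60]),           # cycle 12: 70 C for 1 min
]

# cDNA amplification program from step 7.
pcr_program = [
    (1, [3 * 60]),           # cycle 1: 98 C for 3 min
    (18, [20, 15, 6 * 60]),  # cycles 2-19: 98 C 20 s, 67 C 15 s, 72 C 6 min
    (1, [5 * 60]),           # cycle 20: 72 C for 5 min
]

print("RT:", program_minutes(rt_program), "min")
print("cDNA PCR:", program_minutes(pcr_program), "min")
```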

3.2.2 PCR Product Purification

1. Equilibrate Ampure XP beads at room temperature for at least 15 min, then vortex thoroughly for 30 s.
2. Add 25 μl of Ampure XP beads (so that the DNA sample and beads are 1:1 vol/vol) to each sample from step 7 of Subheading 3.2.1 and mix by vortexing until the solution appears homogeneous. Transfer the solutions to a 96-well plate or 8-strip PCR tubes with a compatible magnet stand.
3. Incubate the mixture for 8 min at room temperature to let the DNA bind to the beads.
4. Incubate the tube on the magnetic stand for 5 min or until the solution is clear and the beads have been collected at one corner of the well.
5. While the samples are on the magnet, carefully remove the liquid with a 200 μl pipette tip without disturbing the beads.
6. Add 200 μl of 80% (vol/vol) ethanol solution without disturbing the beads. Incubate the samples for 30 s, then remove the ethanol.
7. Repeat step 6 once more.
8. Remove any trace of ethanol and open the lid to dry the beads. Leave the plate at room temperature for 5 min or until a small crack appears on the surface of the beads. Avoid overdrying the beads, otherwise the DNA yield will decrease.
9. Add 15 μl of EB solution or nuclease-free water. Mix ten times by pipetting to resuspend the beads thoroughly. Incubate the plate off the magnet for 2 min to elute the DNA from the beads completely.
10. Place the tubes on the magnetic stand again for 2 min or until the solution appears clear and the beads have accumulated in a corner of the well.
11. Set the volume of the pipette to 13 μl, collect the supernatant without disturbing the beads, and transfer it to a fresh 0.2-ml thin-walled PCR tube. Avoid bead contamination in the final 13 μl solution, otherwise the following PCR reaction will be inhibited.
12. Check the size distribution on an Agilent high-sensitivity DNA chip or by PAGE gel electrophoresis. A good cDNA product should be free of short (<500 bp) fragments and should show a smear between 0.5 kb and 3 kb with a peak at around 1.5 kb.
13. Quantify the concentration of the cDNA product using Qubit. Dilute highly concentrated cDNA samples before measurement if necessary.

3.2.3 Tagmentation Reaction

1. Prepare the tagmentation mix with reagents and PCR-amplified cDNA.
2. Incubate the 8 μl reaction at 55 °C for 10 min. Add 2 μl of NT buffer (alternatively 0.2% SDS solution) immediately after incubation. Let the solution incubate at room temperature for 5 min to stop the tagmentation and dissociate Tn5 from the DNA.

3.2.4 PCR Amplification and Indexing

1. Add 20 μl of Tagmented cDNA PCR reaction mix to 10 μl of tagmented DNA product from the last step.
2. Perform the PCR in a thermal cycler by using the following program:

Step 1: 72 °C, 3 min
Step 2: 98 °C, 30 s
Step 3: 98 °C, 10 s; 55 °C, 30 s; 72 °C, 60 s; go to step 3, repeat (10–15 cycles)^a
Step 4: 72 °C, 5 min
Step 5: 4 °C, Hold

^a The number of cycles depends on the amount of DNA used for tagmentation. If we are starting from 100 pg of amplified cDNA, we usually perform 15 PCR cycles. It may be helpful to run a range of cycles to determine the best conditions.

3. Repeat the purification described in Subheading 3.2.2, but use only 30 μl of AMPure XP beads, at a ratio of 0.6:1. This will minimize the carryover of primer dimers.
4. Dilute the library to 2 nM. Use the concentration obtained with the Qubit and the average size obtained on the BioAnalyzer to calculate the molarity of the final library. TapeStation can be used as an alternative to BioAnalyzer to determine the molarity and library size.
5. Measure the diluted library by Qubit to confirm that the concentration is 10 nM. The library is ready for sequencing.
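The molarity calculation in step 4 can be sketched as follows, assuming the commonly used average mass of about 660 g/mol per base pair of double-stranded DNA. The Qubit and BioAnalyzer readings in the example are hypothetical values chosen only for illustration, and the target concentration is left as a parameter.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA


def library_molarity_nM(conc_ng_per_ul, mean_fragment_bp):
    """Convert a mass concentration and a mean fragment size into nanomolar."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def dilution_volumes(stock_nM, target_nM, final_ul):
    """Return (library_ul, diluent_ul) needed to reach target_nM in final_ul."""
    if stock_nM < target_nM:
        raise ValueError("library is already more dilute than the target")
    library_ul = final_ul * target_nM / stock_nM
    return library_ul, final_ul - library_ul


# Hypothetical readings: 1.5 ng/ul by Qubit, 450 bp mean size by BioAnalyzer.
stock = library_molarity_nM(1.5, 450)
lib_ul, water_ul = dilution_volumes(stock, target_nM=2.0, final_ul=20.0)
print(f"{stock:.2f} nM; mix {lib_ul:.2f} ul library with {water_ul:.2f} ul diluent")
```

Because the result scales inversely with fragment size, an inaccurate size estimate shifts the computed molarity by the same proportion.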

3.3 scRRBS for Nuclear DNA

3.3.1 Prepare Nucleus Lysis Buffer

1. Add 1.05 μl of protease–lambda DNA mix to 8-tube strip PCR tubes containing the single nucleus (see Note 4).
2. Spin down the tubes at 400 × g for 2 min to collect the solution at the bottom of the tubes, then mix by gentle pulse-vortexing ten times. Spin down the tubes again at 400 × g for 3 min to collect the solution at the bottom of the tubes.
3. Incubate at 50 °C for 3 h to dissociate the DNA from protein completely. Then heat the reaction at 75 °C for 30 min to inactivate the protease (see Note 5).
4. Spin down the tubes at 400 × g for 1 min to collect all liquid at the bottom of the tubes.

3.3.2 DNA Fragmentation with MspI

1. Add 13 μl of MspI reaction mix to each tube containing DNA from one single nucleus. Spin down at 400 × g for 2 min, pulse-vortex ten times, and spin down at 400 × g for 2 min again to mix the reaction thoroughly. Incubate the reaction at 37 °C for 3 h, then heat the reaction mix to 65 °C for 20 min to stop MspI digestion.
2. Spin down the plate at 400 × g for 1 min to collect all liquid at the bottom of the tubes before the next step.

3.3.3 Gap-Filling/dA-Tailing

1. Add 1 μl of dATP–dCTP–dGTP solution mix (20 mM:2 mM:2 mM) and 1 μl of Klenow fragment exo– to each tube. Spin down the plate at 400 × g for 2 min, pulse-vortex ten times, and spin down again at 400 × g for 3 min to collect the liquid at the bottom of the tubes.
2. Incubate at 30 °C for 20 min for gap filling, then at 37 °C for 20 min for extra dA-tailing.
3. Heat the reaction solution at 75 °C for 10 min to stop the reaction.
4. Spin down the plate at 400 × g for 1 min before the next step.

3.3.4 Methylated Adaptor Ligation

1. Dilute the TruSeq methylated adaptors 50-fold by mixing 1 μl of adaptor with 49 μl of nuclease-free water, and mix thoroughly.
2. Add 1 μl of diluted methylated adaptor to each gap-filling/dA-tailing product, then add 4 μl of ligation reaction mix.
3. Incubate at 16 °C for 30 min and at 4 °C for 16 h to ligate the adaptor to the DNA.

3.3.5 Bisulfite Conversion

The bisulfite conversion is performed using the MethyCode kit. Alternative kits can also be used for bisulfite conversion.

1. Prepare CT Conversion Reagent mix for three reactions by adding 850 μl of nuclease-free water, 50 μl of Resuspension Buffer, and 300 μl of Dilution Buffer to the CT Conversion Reagent (if inputting 25 μl of DNA sample, reduce H2O from 900 μl to 850 μl) (see Note 6).
2. Add 125 μl of CT Conversion Reagent mix to the adaptor-ligated DNA. Mix by pipetting ten times, then spin down the tubes at 400 × g for 1 min to collect the liquid at the bottom of the tubes.
3. Incubate using the following program:

Temperature (°C)    Time
98                  10 min
64                  2.5 h
4                   Hold

4. Mix Binding Buffer and 10 ng/μl tRNA at a 600:1 ratio. For 32 reactions, mix 19.5 ml of Binding Buffer with 32.5 μl of 10 ng/μl tRNA.
5. Add 601 μl of Binding Buffer–tRNA mix to the column. Transfer the bisulfite-converted DNA to the column and mix the DNA with the binding buffer by pipetting up and down five times. Rinse the well with a small amount of Binding Buffer to transfer as much DNA as possible to the column.
6. Spin down at 16,000 × g for 30 s. Discard the supernatant.
7. Wash with 100 μl of Wash Buffer.
8. Spin down at 16,000 × g for 30 s.
9. Incubate with 200 μl of Desulfonation Buffer for 15 min.
10. Spin down at 16,000 × g for 30 s.
11. Wash the column with 200 μl of Wash Buffer.
12. Spin down at 16,000 × g for 30 s. Discard the supernatant.
13. Wash the column with 200 μl of Wash Buffer.
14. Spin down at 16,000 × g for 2 min to remove the Wash Buffer completely.
15. Elute the converted DNA with 31 μl of warm (60 °C) Elution Buffer. Incubate the column with Elution Buffer at room temperature for 2 min to dissolve the DNA from the column.
16. Spin down at 16,000 × g for 1 min. This should leave ~30 μl of DNA for PCR.
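The Binding Buffer and tRNA volumes in step 4 follow from the 600:1 ratio and the 600 μl of buffer used per column. A minimal sketch of the scaling is given below; the half-reaction excess is inferred from the quoted numbers (19.5 ml divided by 600 μl gives 32.5 reactions) and is an assumption rather than a stated rule, as are the function and parameter names.

```python
def binding_buffer_mix(n_reactions, buffer_ul_per_rxn=600.0,
                       ratio=600.0, extra_reactions=0.5):
    """Return (Binding Buffer in ml, 10 ng/ul tRNA in ul) for n reactions.

    `extra_reactions` is an assumed pipetting excess, inferred from the
    32-reaction example in step 4.
    """
    buffer_ul = buffer_ul_per_rxn * (n_reactions + extra_reactions)
    trna_ul = buffer_ul / ratio
    return buffer_ul / 1000.0, trna_ul


buffer_ml, trna_ul = binding_buffer_mix(32)
print(f"{buffer_ml} ml Binding Buffer + {trna_ul} ul tRNA")
```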

3.3.6 First Round of PCR

1. Add first-round PCR reaction mix to 30 μl of bisulfite-converted DNA.
2. Perform the first round of PCR using the following program:

Cycle    Temperature (°C)    Time
1        95                  2 min
2–21     95                  20 s
         60                  30 s
         72                  1 min
22       72                  2 min
23       4                   Hold

3.3.7 First-Round PCR Product Purification

1. Prepare 26 ml of 75% EtOH by mixing 19.5 ml of 100% EtOH with 6.5 ml of nuclease-free water.
2. Add 50 μl of AMPure beads to the first-round PCR product. Mix by pipetting ten times. Let stand at room temperature for 8 min to let the DNA bind to the beads.
3. Place on the magnet for 5 min or until the mixture appears clear and all beads are collected at the corner of the PCR tubes.
4. Wash twice with 160 μl of freshly prepared 80% EtOH without disturbing the beads.
5. Dry the beads for 3–5 min.
6. Resuspend with 50 μl of nuclease-free water.
7. Add 50 μl of fresh AMPure beads. Mix by pipetting ten times.
8. Let stand for 5 min, then place on the magnet for 5 min to collect the beads.
9. Wash twice with 160 μl of freshly prepared 80% EtOH.
10. Dry the beads for 3–5 min. (Make sure that the beads are completely dry to avoid EtOH inhibiting the PCR.)
11. Resuspend the beads with 40 μl of nuclease-free water.
12. Transfer 32 μl of purified first-round amplicons to the second-round PCR.
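The ethanol dilution in step 1 is a simple C1V1 = C2V2 calculation, and the same arithmetic applies to whichever wash concentration is prepared. The helper below is an illustrative sketch that treats volumes as additive; ethanol and water contract slightly on mixing, so the result is approximate.

```python
def ethanol_dilution(final_ml, target_percent, stock_percent=100.0):
    """Return (stock ethanol in ml, water in ml) for a vol/vol dilution.

    Volumes are treated as additive, which is an approximation.
    """
    if not 0 < target_percent <= stock_percent:
        raise ValueError("target must be positive and no stronger than the stock")
    stock_ml = final_ml * target_percent / stock_percent
    return stock_ml, final_ml - stock_ml


# Reproduces the volumes given in step 1 (26 ml at 75%).
stock_ml, water_ml = ethanol_dilution(26.0, 75.0)
print(stock_ml, "ml EtOH +", water_ml, "ml water")
```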

3.3.8 Second-Round PCR

1. Add second-round PCR reaction mix to the product of the first round of PCR.
2. Perform the PCR using the following program:

Cycle    Temperature (°C)    Time
1        98                  2 min
2–17     98                  10 s
         60                  30 s
         72                  1 min
18       72                  2 min
19       4                   Hold

3.3.9 Size Selection

After PCR enrichment, DNA fragments between 200 and 500 bp are size-selected and recovered by excision from a PAGE gel.

1. Mix the purified PCR product with loading dye and load all of the PCR product in one well of a 6% PAGE gel.
2. Run the gel for 50 min at 120 V or until the blue dye reaches the bottom of the gel.
3. Stain and visualize the gel. Cut out the region between 200 and 500 bp with a clean scalpel.
4. Place the gel slice in a 1.5 ml tube, crush it with an RNase-free disposable pellet pestle, and soak it in 250 μl of TE buffer.
5. Rotate end-to-end for at least 45 min at 37 °C.
6. Transfer the eluate and the gel debris to the top of the purification column.
7. Centrifuge the filter for 1 min at 16,000 × g.
8. Recover the eluate and add 0.05 μl GlycoBlue, 25 μl 3 M sodium acetate, pH 5.5, and 750 μl of 100% ethanol.
9. Precipitate the DNA at −80 °C for at least 30 min. A longer incubation will likely increase the DNA recovery rate.
10. Spin in a centrifuge at >13,000 × g for 30 min at 4 °C. A longer centrifugation will likely increase the DNA recovery.
11. Remove the supernatant carefully without disturbing the pellet.
12. Wash the pellet with 80% ethanol by vortexing vigorously.
13. Spin in a centrifuge at >13,000 × g for 30 min at 4 °C. A longer centrifugation will likely increase the DNA recovery.
14. Open the lid and air-dry the pellet for up to 10 min at room temperature to remove residual ethanol.
15. Resuspend the pellet in 12 μl of TE Buffer or nuclease-free water.
16. Quantify the concentration of the libraries with the Qubit assay and check the size of the libraries with the BioAnalyzer.

3.4 Sequencing

1. The scRRBS libraries are sequenced on an Illumina HiSeq sequencer with five to ten million reads per cell. Paired-end sequencing is recommended rather than single-end sequencing. The scRNA-seq libraries can be sequenced in single-end mode (see Note 7).
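The per-cell depth above determines how many scRRBS libraries can be pooled on one lane. The sketch below shows the arithmetic; the lane yield of 300 million reads is a hypothetical placeholder and should be replaced with the figure for the instrument and run mode actually used.

```python
def cells_per_lane(lane_reads, reads_per_cell):
    """Number of libraries that fit on one lane at the requested depth."""
    return int(lane_reads // reads_per_cell)


LANE_READS = 300_000_000  # hypothetical lane yield, not taken from the protocol

# Depth range from the protocol: five to ten million reads per cell.
fewest = cells_per_lane(LANE_READS, 10_000_000)
most = cells_per_lane(LANE_READS, 5_000_000)
print(f"Pool between {fewest} and {most} cells per lane")
```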

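3.5 Data Processing

In the commands below, items in angle brackets are placeholders for user-supplied files and directories.

1. Trim RNA-seq reads using Trim Galore!

trim_galore -q 20 --phred33 --gzip --length 30 <reads.fastq.gz>

2. Generate STAR index.

STAR --runMode genomeGenerate --genomeDir <star_index_dir> --genomeFastaFiles <genome.fa> --sjdbGTFfile <annotation.gtf>

3. Align sequencing data using STAR.

STAR --runMode alignReads --readFilesIn <trimmed_reads.fq.gz> --readFilesCommand zcat --outSAMtype BAM Unsorted --genomeDir <star_index_dir> --outFileNamePrefix <output_prefix>

4. Count reads using featureCounts.

featureCounts -a <annotation.gtf> -o readscount.txt <aligned.bam>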

5. Build bismark genome index.

bismark_genome_preparation --verbose <genome_folder>

6. Trim RRBS reads.

trim_galore --quality 20 --phred33 --stringency 3 --gzip --length 36 --rrbs --paired --trim1 --output_dir <output_dir> <reads_1.fq.gz> <reads_2.fq.gz>

7. Align RRBS reads using bismark in the directional mode.

bismark <genome_folder> -1 <trimmed_reads_1.fq.gz> -2 <trimmed_reads_2.fq.gz>

8. Deduplicate alignment results and call methylation level on CpG sites.

deduplicate_bismark --bam <aligned_pe.bam>

9. Extract methylation level.

bismark_methylation_extractor --comprehensive --merge_non_CpG --gzip -o Bismark_methextract_out --bedGraph --CX_context --cytosine_report --split_by_chromosome --genome_folder <genome_folder> <deduplicated.bam>

10. scMT-seq data can be analyzed according to the user’s needs. Example analysis results can be found in our original scMT-seq paper [10], and in the papers by Angermueller et al. [11] and Guo et al. [13]. We recommend using RnBeads [17] for scRRBS quality control and preliminary analysis. For the Smart-seq2 part of the data, many pipelines have been established, and comprehensive reviews can be found elsewhere [18–20].
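One quick check on the RNA-seq side is the share of counted reads that derive from the ERCC spike-in (see Note 3). The sketch below parses a single-sample featureCounts table, whose first line is a comment and whose data columns are Geneid, Chr, Start, End, Strand, Length, and the count. It assumes that the ERCC sequences were included in the reference and annotation and that their identifiers start with "ERCC-"; it also reports the fraction of assigned reads, not of all sequenced reads. The function name and demo table are illustrative.

```python
import csv
import io


def ercc_fraction(featurecounts_text, prefix="ERCC-"):
    """Fraction of assigned reads whose feature ID starts with `prefix`."""
    lines = (
        line for line in io.StringIO(featurecounts_text)
        if not line.startswith("#")  # skip the leading comment line
    )
    rows = csv.reader(lines, delimiter="\t")
    next(rows)  # header: Geneid, Chr, Start, End, Strand, Length, <sample>
    ercc = total = 0
    for row in rows:
        count = int(row[6])
        total += count
        if row[0].startswith(prefix):
            ercc += count
    return ercc / total if total else 0.0


# Tiny made-up table in featureCounts layout, for illustration only.
demo = (
    "# Program:featureCounts\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tcell1.bam\n"
    "GeneA\tchr1\t1\t100\t+\t100\t95\n"
    "ERCC-00002\tERCC-00002\t1\t1061\t+\t1061\t5\n"
)
print(f"ERCC fraction: {ercc_fraction(demo):.2%}")
```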

4 Notes

1. Besides mouth pipetting, different strategies can be used to isolate DNA and RNA from the same cell. For example, after selective lysis of the cell membrane, the nucleus can be separated from the cytoplasm by centrifugation or by antibody pull-down. All of these methods have been successfully used to isolate high-quality nuclei for genome sequencing or bisulfite sequencing. Users can choose among them according to the cost, the number of samples, and the availability of reagents and equipment.
2. To obtain scRNA-seq libraries of the best quality, minimize the exposure of oligos to freeze–thaw cycles by aliquoting the stock solutions. The TSO primer contains modified nucleotides and can be stored at −80 °C for up to 6 months. Also, avoid storing cytoplasm containing mRNA for more than 1 week before reverse transcription.
3. Add a spike-in to control the quality; ERCC can be added to the cellular mRNA and used as a spike-in control. Make sure the ERCC is diluted to the optimal concentration so that ERCC-derived reads make up 1–5% of the total RNA-seq reads. One can run the amplified cDNA on a PAGE gel to tell whether the ERCC is excessive. If added at the right concentration, ERCC bands are barely visible on the gel.
4. Lambda DNA can be added to the nucleus before lysis to serve as a control for bisulfite conversion. The ideal amount of lambda DNA relative to genomic DNA is 5% (w/w), so that about 5% of reads derive from lambda DNA.
5. To optimize the quality of scRRBS libraries, make sure that the protease has good activity, that the nucleus is lysed completely, and that all DNA is dissociated from histone proteins. Insufficient lysis will result in uneven coverage among different CpG islands and in the absence of CpG islands with high nucleosome density.
6. When performing the bisulfite conversion, make sure to use freshly prepared bisulfite reagent, or reagent that has been stored properly according to the manufacturer’s instructions. Expired bisulfite reagents can impair the conversion.
7. We recommend sequencing the scRRBS libraries in paired-end mode so that more CpG sites are covered for the same number of library fragments sequenced.
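The lambda DNA control in Note 4 can be turned into a bisulfite conversion estimate. Because the spiked-in lambda DNA is unmethylated, every cytosine call on it that still reads as methylated represents a conversion failure. The sketch below is an illustration under two assumptions: the lambda genome was included in the alignment reference under the hypothetical name "lambda", and the per-position methylated and unmethylated counts have already been read from the methylation calls.

```python
def conversion_rate(calls, control_chrom="lambda"):
    """Estimate bisulfite conversion from calls on an unmethylated control.

    `calls` yields (chromosome, methylated_count, unmethylated_count).
    `control_chrom` is a hypothetical reference name for the lambda genome.
    """
    meth = unmeth = 0
    for chrom, n_meth, n_unmeth in calls:
        if chrom == control_chrom:
            meth += n_meth
            unmeth += n_unmeth
    total = meth + unmeth
    if total == 0:
        raise ValueError("no cytosine calls found on the control sequence")
    return unmeth / total


# Made-up counts for illustration: 1 unconverted call out of 100 on lambda.
example_calls = [("lambda", 1, 99), ("chr1", 50, 50)]
print(f"Estimated conversion rate: {conversion_rate(example_calls):.1%}")
```

Calls on the sample chromosomes are ignored on purpose, since genuine methylation there would otherwise be mistaken for failed conversion.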

Acknowledgments

The work was supported by the National Key R&D Program of China (2017YFA0104100, 2017YFC1001300) and the National Natural Science Foundation of China (31700900).

References

1. Smith ZD, Meissner A (2013) DNA methylation: roles in mammalian development. Nat Rev Genet 14(3):204
2. Robertson KD (2005) DNA methylation and human disease. Nat Rev Genet 6(8):597
3. Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, Nery JR, Lee L, Ye Z, Ngo Q-M (2009) Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462(7271):315
4. Smallwood SA, Lee HJ, Angermueller C, Krueger F, Saadeh H, Peat J, Andrews SR, Stegle O, Reik W, Kelsey G (2014) Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nat Methods 11(8):817–820
5. Guo H, Zhu P, Wu X, Li X, Wen L, Tang F (2013) Single-cell methylome landscapes of mouse embryonic stem cells and early embryos analyzed using reduced representation bisulfite sequencing. Genome Res 23(12):2126–2135
6. Ramsköld D, Luo S, Wang Y-C, Li R, Deng Q, Faridani OR, Daniels GA, Khrebtukova I, Loring JF, Laurent LC (2012) Full-length mRNA-seq from single-cell levels of RNA and individual circulating tumor cells. Nat Biotechnol 30(8):777–782
7. Patel AP, Tirosh I, Trombetta JJ, Shalek AK, Gillespie SM, Wakimoto H, Cahill DP, Nahed BV, Curry WT, Martuza RL, Louis DN, Rozenblatt-Rosen O, Suva ML, Regev A, Bernstein BE (2014) Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science 344(6190):1396–1401. https://doi.org/10.1126/science.1254257
8. Picelli S, Bjorklund AK, Faridani OR, Sagasser S, Winberg G, Sandberg R (2013) Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat Methods 10(11):1096–1098. https://doi.org/10.1038/nmeth.2639
9. Usoskin D, Furlan A, Islam S, Abdo H, Lonnerberg P, Lou D, Hjerling-Leffler J, Haeggstrom J, Kharchenko O, Kharchenko PV, Linnarsson S, Ernfors P (2015) Unbiased classification of sensory neuron types by large-scale single-cell RNA sequencing. Nat Neurosci 18(1):145–153. https://doi.org/10.1038/nn.3881
10. Hu Y, Huang K, An Q, Du G, Hu G, Xue J, Zhu X, Wang CY, Xue Z, Fan G (2016) Simultaneous profiling of transcriptome and DNA methylome from a single cell. Genome Biol 17:88. https://doi.org/10.1186/s13059-016-0950-z
11. Angermueller C, Clark SJ, Lee HJ, Macaulay IC, Teng MJ, Hu TX, Krueger F, Smallwood SA, Ponting CP, Voet T, Kelsey G, Stegle O, Reik W (2016) Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat Methods 13(3):229–232. https://doi.org/10.1038/nmeth.3728
12. Cheow LF, Courtois ET, Tan Y, Viswanathan R, Xing Q, Tan RZ, Tan DS, Robson P, Loh Y-H, Quake SR (2016) Single-cell multimodal profiling reveals cellular epigenetic heterogeneity. Nat Methods 13(10):833
13. Guo F, Li L, Li J, Wu X, Hu B, Zhu P, Wen L, Tang F (2017) Single-cell multi-omics sequencing of mouse early embryos and embryonic stem cells. Cell Res 27(8):967
14. Clark SJ, Argelaguet R, Kapourani C-A, Stubbs TM, Lee HJ, Alda-Catalinas C, Krueger F, Sanguinetti G, Kelsey G, Marioni JC (2018) scNMT-seq enables joint profiling of chromatin accessibility DNA methylation and transcription in single cells. Nat Commun 9(1):781
15. Krueger F, Andrews SR (2011) Bismark: a flexible aligner and methylation caller for Bisulfite-seq applications. Bioinformatics 27(11):1571–1572
16. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR (2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29(1):15–21
17. Assenov Y, Müller F, Lutsik P, Walter J, Lengauer T, Bock C (2014) Comprehensive analysis of DNA methylation data with RnBeads. Nat Methods 11(11):1138
18. Bacher R, Kendziorski C (2016) Design and computational analysis of single-cell RNA-sequencing experiments. Genome Biol 17(1):63
19. Dal Molin A, Baruzzo G, Di Camillo B (2017) Single-cell RNA-sequencing: assessment of differential expression analysis methods. Front Genet 8:62
20. Lun AT, McCarthy DJ, Marioni JC (2016) A step-by-step workflow for low-level analysis of single-cell RNA-seq data with bioconductor. F1000Res 5:2122

Chapter 22

Simultaneous Targeted Detection of Proteins and RNAs in Single Cells

Aik T. Ooi and David W. Ruff

Abstract

Simultaneous detection of both RNA and protein in individual single cells offers a powerful tool for genotype-to-phenotype investigations. Proximity extension assay (PEA) is a quantitative, sensitive, and multiplex protein detection system that has superb utility in single-cell omic analysis. We implemented PEA using the flexible microfluidic workflow of the Fluidigm C1™ system followed by real-time quantitative polymerase chain reaction (RT-qPCR) on the Fluidigm Biomark™ HD system. With this workflow, targeted quantification of RNAs and proteins within individual cells is readily conducted.

Key words Single-cell analysis, Proximity extension assay (PEA), Protein quantification, Multi-omics, Protein–RNA correlation, qPCR, RT-qPCR, Antibody detection, Single-cell protein detection, Microfluidics

1 Introduction

The recent advent of high-throughput single-cell analysis platforms has greatly accelerated the pace of understanding cell types and characterizing cellular states and heterogeneity. Much of this progress has come from single-cell transcriptomic analysis using targeted RT-qPCR assays and RNA sequencing regimens [1, 2]. Ideally, RNA levels should indicate protein levels. However, previous studies indicate that levels of individual mRNAs and their respective translated proteins often display discordance [3]. Consequently, direct profiling of cellular protein levels offers a more accurate picture of protein-related cellular activities, while the ability to link RNA levels to protein expression would generate the most complete view of genotype-to-phenotype relationships in dynamic biological pathways.

Quantification of proteins at the single-cell level relies on antibody-based detection approaches. While methods based on fluorescence-activated cell sorting (FACS) are widely used, this technology is often limited to cell surface markers. Another single-cell protein detection method based on proximity ligation assay has sensitivity limitations and cannot be multiplexed [4]. To overcome these limitations, an improved detection format for protein analysis has been developed using proximity extension assay (PEA) [5]. For a protein target, PEA requires two independent antibodies, each conjugated with an oligonucleotide that carries a short sequence that is complementary to the other. When the pair of antibody probes bind to their antigen in close proximity, the complementary ends anneal, followed by strand extension to generate a full-length RT-qPCR template. The two-antibody system reduces background and thus enables simultaneous detection of multiple targets. With RT-qPCR as the end-point analysis, PEA overcomes the limitation of overlapping fluorescence spectra faced by the FACS technology and in turn allows for a higher number of possible targets in a single experiment.

A single-cell version of the method has been developed for the Fluidigm C1 system with RT-qPCR readout on the Biomark HD system [6, 7], where the use of custom PEA antibody probes for multiplex targets has been reported. Constructing custom panels requires time-consuming antibody–oligonucleotide conjugation procedures and verification [8]. The above-mentioned single-cell C1 microfluidic protocols can be readily adapted to use commercially available and validated PEA antibody panels from the Proseek Multiplex96x96 Kit (Olink Proteomics). We incorporate at the front end a mild cell lysis condition that allows the binding of PEA antibody probes to their antigen targets while preserving cellular RNA. Furthermore, by utilizing the polymerase activity of a reverse transcriptase on both RNA and DNA, cDNA generation and PEA oligonucleotide extension can be coordinately accomplished in the isolated C1 reaction chambers (Fig. 1). This enables targeted detection of both RNA and protein from individual cells processed on C1 and Biomark HD via RT-qPCR readouts. Here we describe the homogenous and simultaneous single-cell protein and RNA co-detection of up to 96 targets each, using readily available reagents.

Fig. 1 C1 microfluidic chamber reaction overview. First, single-cell capture occurs in the 4.5 nL chamber. Lysis and antibody probe binding occur next at 37 °C for 120 min and 10 °C for 1 min. Reverse transcription and oligonucleotide extension are carried out next at 42 °C for 60 min, followed by heat inactivation at 85 °C for 5 min and 10 °C for 1 min. The final preamplification step engages all the chambers. The thermal cycling parameters are 95 °C for 5 min, 20 cycles of 96 °C for 20 s and 60 °C for 6 min, and 10 °C for 1 min

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_22, © Springer Science+Business Media, LLC, part of Springer Nature 2019

2 Materials

2.1 Simultaneous Protein and RNA Detection Workflow on C1

1. Fluidigm C1 system (Fluidigm) with the appropriate scripts installed (see Notes 1 and 2).
2. C1 Single-Cell Open App™ IFC (integrated fluidic circuit; Fluidigm). Select the IFC needed based on cell size: C1 Single-Cell Open App IFC, 5–10 μm; C1 Single-Cell Open App IFC, 10–17 μm; or C1 Single-Cell Open App IFC, 17–25 μm.
3. C1 Single-Cell Reagent Kit (Fluidigm).
4. Proseek Multiplex96x96 Kit (Olink Proteomics). Choose from the available 12 human target panels or 1 mouse target panel (see Note 3).
5. (Recommended) LIVE/DEAD® Viability/Cytotoxicity Kit for Mammalian Cells (Thermo Fisher Scientific). Prepare 1× LIVE/DEAD stains: 1250 μL C1 Cell Wash Buffer (Fluidigm), 2.5 μL ethidium homodimer (Thermo Fisher Scientific), 0.625 μL Calcein AM (Thermo Fisher Scientific). Prepare on the day of experiment and keep on ice in the dark until use.
6. 5× lysis buffer: 2.5% NP-40 (dilute in water from 10%), 250 mM Tris–HCl, pH 8.4, 5 mM EDTA. Store at room temperature.
7. Lysis and Probe Mix: 21 μL 5× Lysis Buffer, 8.4 μL Incubation Solution (Olink Proteomics), 8 μL Incubation Stabilizer (Olink Proteomics), 1 μL A-probes (Olink Proteomics), 1 μL B-probes (Olink Proteomics), 5.2 μL C1 Loading Reagent (Fluidigm), 25.4 μL nuclease-free water. Prepare on the day of experiment and keep on ice until use.
8. RT Mix: 8 μL 5× Reverse Transcription Master Mix (Fluidigm), 1.8 μL C1 Loading Reagent, 27 μL nuclease-free water. Prepare on the day of experiment and keep on ice until use.
9. PCR Mix: 20 μL 5× Preamp Master Mix (Fluidigm), 10 μL pooled 500 nM preamplification primers (see Note 4), 1 μL PEA Solution (Olink Proteomics), 2.2 μL C1 Loading Reagent, 11.6 μL nuclease-free water. Prepare on the day of experiment and keep on ice until use.
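As a bookkeeping aid for the three C1 reaction mixes listed above, the following minimal Python sketch sums the component volumes and scales a recipe by a chosen factor. The volumes are copied from items 7–9; the dictionary layout and the helper names (`total_volume`, `scale_recipe`) are illustrative assumptions and not part of any Fluidigm or Olink software.

```python
# Component volumes in µL, copied from items 7-9 of Subheading 2.1.
# The data layout and function names are hypothetical, for illustration only.
MIXES = {
    "Lysis and Probe Mix": {
        "5x Lysis Buffer": 21.0,
        "Incubation Solution": 8.4,
        "Incubation Stabilizer": 8.0,
        "A-probes": 1.0,
        "B-probes": 1.0,
        "C1 Loading Reagent": 5.2,
        "Nuclease-free water": 25.4,
    },
    "RT Mix": {
        "5x Reverse Transcription Master Mix": 8.0,
        "C1 Loading Reagent": 1.8,
        "Nuclease-free water": 27.0,
    },
    "PCR Mix": {
        "5x Preamp Master Mix": 20.0,
        "Pooled 500 nM preamplification primers": 10.0,
        "PEA Solution": 1.0,
        "C1 Loading Reagent": 2.2,
        "Nuclease-free water": 11.6,
    },
}


def total_volume(recipe):
    """Return the total volume of a recipe in µL, rounded to 0.001 µL."""
    return round(sum(recipe.values()), 3)


def scale_recipe(recipe, factor):
    """Return a copy of the recipe with every component multiplied by factor."""
    return {name: round(volume * factor, 3) for name, volume in recipe.items()}


if __name__ == "__main__":
    for mix_name, recipe in MIXES.items():
        print(f"{mix_name}: {total_volume(recipe)} µL total")
```

Scaling is only a convenience for preparing several IFCs at once; the per-IFC volumes above remain the reference.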

2.2 Protein Target Detection by Real-Time Quantitative PCR on Biomark HD

1. Fluidigm Biomark HD (see Note 5). Biomark HD Data Collection software v3.0.2 or higher is required.
2. 96.96 Dynamic Array™ IFC for gene expression (Fluidigm) (see Note 6).
3. IFC Controller HX (see Note 7) or the Juno™ system (see Note 8).
4. Proseek Multiplex96x96 detection kit (Olink Proteomics).
5. Detection Mix: 550 μL Detection Solution (Olink Proteomics), 230 μL nuclease-free water, 7.8 μL Detection Enzyme (Olink Proteomics), 3.1 μL PCR Polymerase (Olink Proteomics). Prepare on the day of experiment and keep on ice until use.
6. 2× Assay Loading Reagent (Fluidigm). This is only needed if fewer than 96 protein targets are being investigated in the real-time PCR.

2.3 RNA Target Detection by Real-Time Quantitative PCR on Biomark HD

1. Fluidigm Biomark HD (see Note 5). Biomark HD Data Collection software v3.0.2 or higher is required.
2. 96.96 Dynamic Array IFC for gene expression (Fluidigm) (see Note 6).
3. IFC Controller HX (see Note 7) or the Juno system (see Note 8).
4. Sample premix: 360 μL 2× SsoFast™ EvaGreen® Supermix with Low ROX (Bio-Rad), 36 μL 20× DNA Binding Dye (Fluidigm). Prepare on the day of experiment, vortex and centrifuge briefly, then keep on ice until use.
5. 2× Assay Loading Reagent (Fluidigm).
6. 1× DNA Suspension Buffer (TEKnova).
7. 100 μM combined forward and reverse primers. These are the primers used in the C1 step for the preamplification of cDNA targets (see Note 4).

3 Methods

An overview of the protocol, with the estimated time for each step, is given in Fig. 2.

3.1 Prime the C1 IFC

1. Pipet 200 μL of C1 Harvest Reagent (Fluidigm) into each of the two accumulators on the IFC (see Note 9).
2. Pipet 20 μL of C1 Harvest Reagent into each of the 36 control line inlets (4 groups of 9 inlets on both sides of the accumulators) and 4 hydration inlets (2 middle inlets on the outside columns on each side of the IFC).
3. Pipet 20 μL of C1 Preloading Reagent (Fluidigm) into inlet 2.

C1 workflow
1. Prime
   Prepare the IFC: 5 min
   Run the Prime script: small-cell IFC, 11 min; medium- or large-cell IFC, 12 min
2. Prepare cells
   Wash cells: 15 min
   Count, dilute, and mix cells with C1 Suspension Reagent: 10 min
3. Load cells
   Prepare the IFC: 5 min
   Run the Cell Load or Cell Load & Stain script: small-cell IFC with staining, 30 min; small-cell IFC without staining, 20 min; medium- or large-cell IFC with staining, 60 min; medium- or large-cell IFC without staining, 30 min
4. Image (optional)
   Image cells with a microscope: 15–30 min
5. Generate protein and RNA target amplicons
   Prepare reagents: 15 min
   Prepare the IFC: 5 min
   Run the Sample Prep script (irrespective of cell size): 8 hr 7 min (end time can be adjusted to a later time for convenience)
6. Harvest
   Harvest amplified products from the IFC: 10 min

Biomark HD workflow
1. Prime
   Inject control line fluid into the IFC: 5 min
   Run the Prime script: 20 min
2. Load samples and assays
   Prepare sample and assay mixes: 10–30 min
   Pipet sample and assay mixes into the IFC: 10 min
   Run the Load Mix script on the IFC controller: 1 hr 30 min
3. Run real-time PCR
   Run the real-time PCR program on Biomark HD: protein target detection thermal protocol, 2 hr 3 min; RNA target detection thermal protocol, 1 hr 14 min

Fig. 2 An overview of the protocol for both the C1 and Biomark workflows including the estimated time required for each step. The Biomark workflow will be performed twice, once for protein detection and once for RNA detection

4. Pipet 15 μL of C1 Blocking Reagent (Fluidigm) into each cell inlet and outlet.
5. Pipet 20 μL of C1 Cell Wash Buffer into inlet 5.
6. Remove and discard the protective film from the bottom of the IFC.
7. Place the IFC into the C1 system, tap LOAD, and run the Single-Cell Targeted Protein & RNA: Prime script. The prime step takes 11–12 min, depending on the IFC. Tap EJECT to remove the IFC when the Prime script is finished.
8. Proceed to Subheading 3.3 within an hour after Prime is finished.

3.2 Prepare Cells

1. Prepare a cell suspension in native medium.
2. Wash cells twice with 1 mL of C1 Cell Wash Buffer. Centrifuge at 300 × g for 5 min between washes.
3. Resuspend washed cells in C1 Cell Wash Buffer to a concentration of 166–250/μL (see Note 10).

3.3 Cell Load (and Stain) on the C1 IFC

1. Prepare final cell mix for loading into the IFC. Vortex C1 Suspension Reagent (Fluidigm) for 5 s, then combine cells (properly prepared and diluted; see Subheading 3.2) with C1 Suspension Reagent at a ratio of 3:2 (see Note 11). For example: 30 μL of cells (at a concentration of 166–250/μL) with 20 μL of C1 Suspension Reagent. Mix by pipetting the cell mix five to ten times. Do not vortex the final cell mix.
2. Pipet and remove the remaining C1 Blocking Reagent from the cell inlet and outlet.
3. Pipet 5 μL of the final cell mix prepared in step 1 to the cell inlet on the IFC (see Note 12).
4. (Optional) Pipet 20 μL of staining solution into inlet 1. Use either the 1× LIVE/DEAD stains outlined in Subheading 2, or a suitable staining solution of choice.
5. Place the IFC into the C1 system, tap LOAD, and run the Single-Cell Targeted Protein & RNA: Cell Load or Single-Cell Targeted Protein & RNA: Cell Load & Stain script. The cell loading step takes 20–60 min, depending on IFC type and whether a staining step is used. Tap EJECT to remove the IFC when the script is finished.
6. (Optional) Prior to proceeding to the next step, image the captured cells on a compatible microscope or imaging system (see Note 13).
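The dilution and loading arithmetic in Subheadings 3.2 and 3.3 can be sketched as follows. This is a minimal illustration that assumes simple volumetric dilution and uses the 3:2 cell-to-Suspension-Reagent ratio and the 5 μL load volume stated above; the function names are hypothetical.

```python
def buffer_to_add(stock_cells_per_ul, stock_volume_ul, target_cells_per_ul):
    """Volume of C1 Cell Wash Buffer (µL) to add to reach the target concentration."""
    if target_cells_per_ul <= 0:
        raise ValueError("target concentration must be positive")
    if stock_cells_per_ul < target_cells_per_ul:
        raise ValueError("stock is more dilute than the target; concentrate the cells first")
    final_volume_ul = stock_cells_per_ul * stock_volume_ul / target_cells_per_ul
    return final_volume_ul - stock_volume_ul


def final_mix(cells_per_ul, cell_volume_ul=30.0, suspension_reagent_ul=20.0, loaded_ul=5.0):
    """Return (cells per µL in the final cell mix, cells in the volume loaded on the IFC).

    Defaults follow the example in Subheading 3.3: 30 µL of cells plus
    20 µL of C1 Suspension Reagent (3:2), with 5 µL pipetted into the cell inlet.
    """
    total_ul = cell_volume_ul + suspension_reagent_ul
    mix_cells_per_ul = cells_per_ul * cell_volume_ul / total_ul
    return mix_cells_per_ul, mix_cells_per_ul * loaded_ul


if __name__ == "__main__":
    for start in (166, 250):
        conc, cells = final_mix(start)
        print(f"{start}/µL stock -> {conc:.1f}/µL in mix, about {cells:.0f} cells in 5 µL")
```

Under these assumptions, the recommended 166–250 cells/μL range corresponds to roughly 500–750 cells in the 5 μL that enters the IFC.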

3.4 Prepare C1 IFC for Protein and RNA Detection

1. Pipet 180 μL of C1 Harvest Reagent into each of the four reservoirs on the four corners of the IFC.
2. Prepare Lysis and Probe Mix, RT Mix, and PCR Mix as outlined in Subheading 2.
3. Pipet 8 μL of Lysis and Probe Mix into inlet 3.
4. Pipet 27 μL of RT Mix into inlet 7.
5. Pipet 25 μL of PCR Mix into inlet 8.
6. Place the IFC into the C1 system, tap LOAD, and select the Single-Cell Targeted Protein & RNA: Sample Prep script. The run time for this script is 8 h, 7 min.
7. Select a convenient time for the program to finish by sliding the orange bar to the desired end time. Tap START.
8. The program end time can be rescheduled until 2 h before the selected time. Tap Reschedule to change the end time on the sliding bar.

3.5 Harvest the Amplified Products

1. When the script is finished, tap EJECT to remove the IFC (see Note 14).
2. Aliquot 25 μL of DNA Dilution Reagent (Fluidigm) into each well of a clean 96-well plate (see Note 15).
3. Carefully pull back the barrier tape covering the harvesting inlets of the IFC using the provided plastic removal tool.
4. Using an 8-channel pipette set at 6 μL, pipet the entire volume of amplified products from the harvest outlets into the 96-well plate with DNA Dilution Reagent. Figure 3 provides detailed stepwise instructions on pipetting the amplified products.
5. Seal the plate with adhesive film. Briefly vortex and centrifuge the plate. Label the plate “Diluted Harvest Plate.” After harvesting, materials from the capture sites are arranged in the plate as depicted in Fig. 4.
6. Proceed immediately to real-time PCR on the Biomark HD system, or store the plate at −20 °C.

3.6 Protein Target Detection by Real-Time PCR on Biomark HD

1. Prepare a 96.96 Dynamic Array IFC for the priming step by injecting control line fluid into each of the two accumulators on the IFC, using the prefilled syringes provided (see Note 16). Figure 5 shows the location of the accumulators.
2. Remove and discard the blue protective film from the bottom of the IFC.
3. Place the IFC into the IFC controller and run the prime script: Prime (136×) if using the IFC Controller HX; Prime 96.96 GE if using the Juno system. The run time for the prime script is 20 min.
4. Thaw, then briefly vortex and centrifuge the Primer Plate (Olink Proteomics). Keep on ice until use.
5. Prepare the Detection Mix as outlined in Subheading 2. Vortex and centrifuge the Detection Mix briefly. Aliquot 95 μL of the Detection Mix into each well of an 8-well strip tube.
6. Label a new 96-well plate “Sample Plate” and pipet 7.2 μL of the Detection Mix into each well of the Sample Plate using a multichannel pipette.
7. Thaw the Diluted Harvest Plate, vortex, and centrifuge briefly.
8. Remove the adhesive film and transfer 2.8 μL of sample from each well of the Diluted Harvest Plate into the Sample Plate using a multichannel pipette.

Fig. 3 The 12 pipetting steps to harvest the amplified products onto a 96-well plate containing DNA Dilution Reagent

Fig. 4 Arrangement of samples by capture site numbers on the C1 IFC

9. Seal the Sample Plate with a new adhesive film, vortex, and centrifuge briefly.
10. When the prime script is finished, remove the IFC from the Juno system or the IFC Controller HX.
11. Remove the adhesive film on the Primer Plate. Be extra careful when handling the Primer Plate to avoid introducing contamination among wells.
12. Using a multichannel pipette, transfer 5 μL of primer and probe mix from each well of the Primer Plate to the corresponding assay inlets of the primed 96.96 Dynamic Array IFC. Assay inlets are located on the left side of the IFC (Fig. 5).
13. Using a multichannel pipette, transfer 5 μL from each well of the Sample Plate into the sample inlets of the IFC. Sample inlets are located on the right side of the IFC (Fig. 5).
14. Fill unused inlets. Do not leave any inlets empty. For unused assay inlets, use 3.0 μL 2× Assay Loading Reagent and 3.0 μL DNA-free water per inlet. For unused sample inlets, use 3.3 μL of Detection Mix and 2.7 μL of DNA-free water per inlet.
15. Place the IFC into the IFC controller and run the load script: Load Mix (136×) if using the IFC Controller HX; Load Mix 96.96 GE if using the Juno system. The Load Mix script takes 1 h, 30 min to complete.
16. Remove the IFC from the controller once the load script is finished. Proceed to the real-time PCR workflow within an hour of loading samples.
17. On Biomark HD, launch the Data Collection software (see Note 5).

Fig. 5 Location of the accumulators, assay inlets, and sample inlets on the 96.96 Dynamic Array IFC

18. Create, name, and save the following thermal protocol.

Program stage    Temperature    Time      Cycles
Thermal mix      50 °C          120 s     1
                 70 °C          1800 s
                 25 °C          600 s
Hot start        95 °C          300 s     1
Denature         95 °C          15 s      40
Extend           60 °C          60 s

19. Click Start a New Run, place the IFC into the instrument, and click Load.
20. Select the following settings: Application type: Gene Expression; Passive reference: ROX; Assay: Single probe; Probe type: FAM-MGB.
21. Select the thermal protocol created in step 18, and confirm that Auto Exposure is selected.
22. Verify the run information and click Start Run. The real-time PCR program takes 2 h, 3 min to complete.
23. When the real-time PCR is complete, view and analyze data on the Fluidigm Real-Time PCR Analysis software (see Note 17). You can export data to a spreadsheet for further analysis on protein target detection (see Note 18).
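As a quick sanity check on the thermal protocol entered in step 18, the programmed hold times can be summed. The sketch below is illustrative only: the stage values are copied from the protocol table, the tuple layout and function name are assumptions, and the result covers hold times alone. The gap between this sum and the stated 2 h, 3 min run time is presumably taken up by temperature ramping and image acquisition, which is an assumption on our part and is not modeled here.

```python
# (stage, temperature in °C, hold time in s, number of cycles)
# Values copied from the protein detection thermal protocol (step 18).
PROTEIN_PROTOCOL = [
    ("Thermal mix", 50, 120, 1),
    ("Thermal mix", 70, 1800, 1),
    ("Thermal mix", 25, 600, 1),
    ("Hot start", 95, 300, 1),
    ("Denature", 95, 15, 40),
    ("Extend", 60, 60, 40),
]


def total_hold_seconds(protocol):
    """Sum hold time x cycles over all steps (ramping and imaging excluded)."""
    return sum(hold_s * cycles for _stage, _temp_c, hold_s, cycles in protocol)


if __name__ == "__main__":
    seconds = total_hold_seconds(PROTEIN_PROTOCOL)
    print(f"Programmed hold time: {seconds} s ({seconds / 60:.0f} min)")
```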

3.7 RNA Target Detection by Real-Time PCR on Biomark HD

1. Prepare a 96.96 Dynamic Array IFC for the priming step by injecting control line fluid into each of the two accumulators on the IFC, using the prefilled syringes provided (see Note 16). Figure 5 shows the location of the accumulators.
2. Remove and discard the blue protective film from the bottom of the IFC.
3. Place the IFC into the IFC controller and run the prime script: Prime (136×) if using the IFC Controller HX; Prime 96.96 GE if using the Juno system. The run time for the prime script is 20 min.
4. Prepare sample premix as outlined in Subheading 2. Vortex and centrifuge briefly before use.
5. Pipet 3.3 μL of sample premix into each well of a 96-well plate.
6. Prepare final sample mix by adding 2.7 μL of sample from each well of the Diluted Harvest Plate to individual wells with the sample premix.
7. Seal the plate with an adhesive film, vortex for 20 s, then centrifuge briefly to collect the samples.
8. Prepare final assay mix by combining 3.0 μL 2× Assay Loading Reagent (Fluidigm) with 2.7 μL 1× DNA Suspension Buffer (TEKnova) and 0.3 μL 100 μM combined forward and reverse primers for each target. Transfer the assay mixes into individual wells of a 96-well plate to assist pipetting into the IFC inlets.
9. Seal the plate with an adhesive film, vortex for 20 s, then centrifuge briefly to collect the samples.
10. Remove the IFC from the controller once the prime script is finished.
11. Using a multichannel pipette, transfer 5 μL of each final assay mix and 5 μL of each final sample mix into their respective inlets on the IFC (Fig. 5).
12. Fill unused inlets. Do not leave any inlets empty. For unused assay inlets, use 3.0 μL 2× Assay Loading Reagent and 3.0 μL DNA-free water per inlet. For unused sample inlets, use 3.3 μL of sample premix and 2.7 μL of DNA-free water per inlet.
13. Place the IFC into the IFC controller and run the load script: Load Mix (136×) if using the IFC Controller HX; Load Mix 96.96 GE if using the Juno system. The Load Mix script takes 1 h, 30 min to complete.
14. Remove the IFC from the controller once the load script is finished. Proceed to the real-time PCR workflow within an hour of loading samples.
15. On Biomark HD, launch the Data Collection software (see Note 5).
16. Click Start a New Run, place the IFC into the instrument, and click Load.
17. Select the following settings: Application type: Gene Expression; Passive reference: ROX; Assay: Single probe; Probe type: EvaGreen.
18. Select the thermal protocol GE Fast 96x96 PCR+Melt v2.pcl and confirm that Auto Exposure is selected.
19. Verify the run information and click Start Run. The real-time PCR program takes 1 h, 14 min to complete.
20. When the real-time PCR is complete, view and analyze data on the Fluidigm Real-Time PCR Analysis software (see Note 17). You can export data to a spreadsheet for further analysis on RNA target detection.
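The volumes used for the RNA detection mixes above imply a primer concentration in each final assay mix and a reagent overage in the sample premix. The sketch below works these out from the numbers given in Subheadings 2.3 and 3.7, assuming simple volumetric mixing; the constant and function names are hypothetical.

```python
# Volumes in µL and concentrations in µM, copied from Subheadings 2.3 and 3.7.
ASSAY_LOADING_REAGENT_UL = 3.0
DNA_SUSPENSION_BUFFER_UL = 2.7
PRIMER_STOCK_UL = 0.3
PRIMER_STOCK_UM = 100.0

SAMPLE_PREMIX_PREPARED_UL = 360.0 + 36.0  # 2x supermix + 20x DNA Binding Dye
SAMPLE_PREMIX_PER_WELL_UL = 3.3
SAMPLE_INLETS = 96


def diluted_concentration(stock_conc, stock_volume, total_volume):
    """Concentration after diluting stock_volume of stock into total_volume."""
    return stock_conc * stock_volume / total_volume


def premix_overage(prepared_ul, per_well_ul, wells):
    """Return (volume needed, volume left over) for the sample premix."""
    needed = per_well_ul * wells
    return needed, prepared_ul - needed


if __name__ == "__main__":
    assay_total = ASSAY_LOADING_REAGENT_UL + DNA_SUSPENSION_BUFFER_UL + PRIMER_STOCK_UL
    primer_um = diluted_concentration(PRIMER_STOCK_UM, PRIMER_STOCK_UL, assay_total)
    needed, spare = premix_overage(
        SAMPLE_PREMIX_PREPARED_UL, SAMPLE_PREMIX_PER_WELL_UL, SAMPLE_INLETS
    )
    print(f"Final assay mix: {assay_total:.1f} µL, primers at {primer_um:.1f} µM")
    print(f"Sample premix: {needed:.1f} µL needed, {spare:.1f} µL spare")
```

With these inputs, each 6 μL assay mix carries the combined primers at 5 μM, and the 396 μL of sample premix covers all 96 sample inlets with some volume to spare.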

4 Notes

1. For detailed instructions on instrument and software operation, refer to the C1 System User Guide (Fluidigm 100-4977).
2. Scripts for Protein & RNA can be downloaded from Fluidigm Script Hub™ at www.fluidigm.com/c1openapp/scripthub.
3. Select the appropriate panel of protein targets based on your research interest. Available panels are listed at www.olink.com/products.
4. This is a pool of forward and reverse primer pairs to amplify target cDNA for downstream quantitative real-time PCR detection. Primers can be the Fluidigm Delta Gene™ assays or obtained from other major laboratory suppliers. It is possible to multiplex up to 96 primer pairs for the detection of 96 RNA targets.
5. For detailed instructions on instrument and software operation, refer to the Biomark HD Data Collection User Guide (Fluidigm 100-2451).
6. For a different arrangement of sample and target numbers, the 192.24 Dynamic Array IFC for gene expression (Fluidigm) or the 48.48 Dynamic Array IFC for gene expression (Fluidigm) may be used.
7. For detailed instructions on instrument and software operation, refer to the IFC Controller MX and IFC Controller HX User Guide (Fluidigm 68000112).
8. For detailed instructions on instrument and software operation, refer to the Juno System User Guide (Fluidigm 100-7070).
9. To prevent bubbles from forming, push only to the first stop on the pipette when pipetting into the IFC inlets. If a bubble is created, use a pipette tip to either burst the bubble or move it to the top surface of the solution.
10. Depending on the cell type, a concentration outside of this range might be needed to ensure better single-cell capture efficiency.
11. Depending on the cell type, the ratio of cells to C1 Suspension Reagent may be optimized to improve single-cell capture efficiency. See the Fluidigm Single-Cell Preparation Guide (Fluidigm 100-7697).
12. Up to 20 μL can be pipetted into the cell inlet, but only 5 μL will enter the IFC.
13. Criteria for selection of a compatible imaging system are outlined in Minimum Specifications for Imaging Cells in Fluidigm Integrated Fluidic Circuits (Fluidigm 100-5004).
14. Continue the remaining steps in a post-PCR lab environment.
15. If a higher concentration of sample is desired, aliquot a smaller amount of DNA Dilution Reagent into the plate. A minimum volume of 5.5 μL per sample is required for downstream detection steps.
16. Different types of Dynamic Array IFC require different volumes of control line fluid in the accumulators. Please use the 96.96 syringes filled with 150 μL of control line fluid for the 96.96 Dynamic Array IFC.
17. For detailed instructions on software operation, refer to the Real Time PCR Data Analysis User Guide (Fluidigm 68000088).
18. Refer to the Olink Proteomics Proseek Multiplex96x96 User Manual and website (www.olink.com) for instructions and options on data analysis.

References

1. Livak KJ, Wills QF, Tipping AJ, Datta K, Mittal R, Goldson AJ, Sexton DW, Holmes CC (2013) Methods for qPCR gene expression profiling applied to 1440 lymphoblastoid single cells. Methods 59(1):71–79
2. Shalek AK, Satija R, Adiconis X, Gertner RS, Gaublomme JT, Raychowdhury R, Schwartz S, Yosef N, Malboeuf C, Lu D, Trombetta JJ, Gennert D, Gnirke A, Goren A, Hacohen N, Levin JZ, Park H, Regev A (2013) Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498(7453):236–240
3. Taniguchi Y, Choi PJ, Li GW, Chen H, Babu M, Hearn J, Emili A, Xie XS (2010) Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science 329(5991):533–538
4. Ståhlberg A, Thomsen C, Ruff D, Åman P (2012) Quantitative PCR analysis of DNA, RNAs, and proteins in the same single cell. Clin Chem 58(12):1682–1691
5. Assarsson E, Lundberg M, Holmquist G, Björkesten J, Thorsen SB, Ekman D, Eriksson A, Rennel Dickens E, Ohlsson S, Edfeldt G, Andersson AC, Lindstedt P, Stenvang J, Gullberg M, Fredriksson S (2014) Homogenous 96-plex PEA immunoassay exhibiting high sensitivity, specificity, and excellent scalability. PLoS One 9(4):e95192
6. Genshaft AS, Li S, Gallant CJ, Darmanis S, Prakadan SM, Ziegler CG, Lundberg M, Fredriksson S, Hong J, Regev A, Livak KJ, Landegren U, Shalek AK (2016) Multiplexed, targeted profiling of single-cell proteomes and transcriptomes in a single reaction. Genome Biol 17(1):188
7. Gong H, Wang X, Liu B, Boutet S, Holcomb I, Dakshinamoorthy G, Ooi A, Sanada C, Sun G, Ramakrishnan R (2017) Single-cell protein-mRNA correlation analysis enabled by multiplexed dual-analyte co-detection. Sci Rep 7(1):2776
8. Gong H, Holcomb I, Ooi A, Wang X, Majonis D, Unger MA, Ramakrishnan R (2016) Simple method to prepare oligonucleotide-conjugated antibodies and its application in multiplex protein detection in single cells. Bioconjug Chem 27(1):217–225

Part VI

Single Cell Screening

Chapter 23

CRISPR Screening in Single Cells

Johan Henriksson

Abstract

The combination of single-cell RNA-seq and CRISPR allows efficient interrogation of, in principle, any number of genes, limited only by sequencing capacity. Here we describe the current protocols for CRISPR screening in single cells, from cloning and virus production to generating sequencing data.

Key words Single-cell, CRISPR, CRISPRi, Multiplexing, Lentivirus, Pooling, Droplets, 10×, Cloning, Virus production, Transduction, Transfection, RNA-seq

1 Introduction

One of the key advances in next generation sequencing is the ability to multiplex. For example, sequencing has been made easy to scale by pooling multiple barcoded libraries. The single-cell transcrip- tomics field has pushed library pooling technology to the limit, in the quest to minimize cost. If each cell can be made an individual experiment, this can be harnessed to perform large-scale knock-out or perturbation phenotyping experiments. To date, there have been six studies published for CRISPR- screening at the single cell level [1–6]. In one of these papers, the multiwell plate method MARS-seq is used as the basis [2]. This method is however not commonly used and because it relies on multiwell plates, it is unable to scale to large number of cells. SC-CRISPR will most likely be applied to millions of cells in the future. This method will thus not be discussed further. The other studies utilize droplet-based system for the screen. Examples are Drop-seq [7] and 10Â Chromium droplet RNA-seq [8]. However, essentially any droplet system can be used here. An overview of a single-cell CRISPR screen is shown in Fig. 1a. The cells are each expressing Cas9 and one sgRNA. The challenge is to identify which sgRNA is expressed in each cell, allowing them to be separated. Unfortunately the sgRNA is not a polyadenylated RNA molecule and will not be captured by standard single-cell

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_23, © Springer Science+Business Media, LLC, part of Springer Nature 2019 395 396 Johan Henriksson

Fig. 1 Single-cell CRISPR screening. (a) Overview of the procedure. Targeted sgRNA viruses are cloned, individually or pooled. These are transfected to packaging cells that produce virus. Cells of interested, already expressing Cas9, are transduced by the virus. Single-cell libraries are produced and sequenced. Optionally, and ideally, sgRNA reads are amplified and sequenced separately. (b) A typical barcoded sgRNA-plasmid. The barcode is here attached the BFP molecule whose mRNA is polyadenylated and captured using standard scRNA-seq chemistry. The sgRNA is transcribed by Pol III using the U6 promoter. The barcode and the sgRNA are far apart and cloning is more demanding. (c) The CROP-seq system. Only the sgRNA need be cloned in. Upon viral integration the 30LTR is copied to the 50. One of these copies is expressed as part of the virus and polyadenylated, while other copy is transcribed by Pol III as required for CRISPR to function. (d) The distribution of number of viruses per cell (k) as a function of infection rate (λ), or MOI

chemistries (e.g., derivatives of CEL-Seq [9] or Smart-seq2 [10]). Two solutions have so far been used. In Perturb-seq and Mosaic- seq [1, 6], a separate barcode is coded into the end of the BFP selection marker (Fig. 1b). It is thus at the 30 end which is typically captured using droplet RNA-seq methods. The disadvantage is that the barcode has to be matched with the sgRNA, and the location on the plasmid of the barcode versus the sgRNA target sequence makes cloning tedious. A much more elegant solution was pre- sented as CROP-seq [3]. This method uses the fact that the 30 LTR of lentiviruses are copied to the 50 during integration (Fig. 1c). The sgRNA sequence and promoter is thus located in the 30, which is expressed by Pol II along with the rest of the virus content, and the transferred 50 copy is expressed as usual for CRISPR, using Pol III. Since no separate barcode is required, cloning is much easier for CROP-seq than that for Perturb-seq. Care needs, however, to be taken in producing similar constructs—within our lab, we have struggled with poor infection efficiency, likely due to reduced CRISPR Screening in Single Cells 397

function of the LTR’s. Nevertheless, few cells (<0.1–1 M) are needed for SC-CRISPR, so efficiency is generally not a concern. Another reason in favor of CROP-seq is that a recent study has shown that barcode-based systems may recombine during virus production, swapping the barcode and sgRNAs [5]. Arrayed virus production can be performed the circumvent this problem, but at the cost of more work and lower scalability. Thus, to conclude, CROP-seq is currently the recommended method. When designing your experiment, there are three crucial para- meters dictating your budget (Fig. 1d): the number of genes tar- geted, the number of cells sequenced, and the depth of sequencing. Being too ambitious for the budget is likely to generate data with poor ability to find differentially expressed genes. A guess at these parameters is best made from existing data. A good starting point is 100 cells/sgRNA and 1000 reads/cell [1]. From a trial experiment, the parameters can be optimized by downsampling the number of cells/read to an acceptable level. A key requirement for the multiplexing is that each cell only receives one virus/sgRNA. Combinatorial effects may be studied by delivering two viruses. This is described by the measure of infection (MOI), with MOI ¼ 30% meaning that 30% of the cells are infected. The number of viruses is commonly assumed to follow a Poisson distribution. This distribution only has one parameter, λ. For low values of MOI (<0.2), λ  MOI. Otherwise it can be calculated with the equation in Fig. 1d. This figure also shows the distribution for MOI ¼ 0.3, which is commonly used for regular CRISPR screens. It should be noted that already this produces some cells with more than one virus. Values as high as MOI ¼ 1.4 have been used. The virus production protocol given here is sufficient for the MOIs normally used. Should you need higher titers you will need to concentrate the virus. 
It may then also be worth improving your production by the addition of caffeine [11], which is removed during the concentration. Virus can be produced at lower cost using the PEI or calcium-phosphate protocols [12], but with lower titers and more optimization required. These protocols are not recommended unless the lab intends to do virus production frequently and at large scale.
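The Poisson assumption above can be explored numerically. The following is a minimal sketch, not the equation of Fig. 1d itself (which is not reproduced here). It assumes the convention used in this text, namely that MOI denotes the fraction of infected cells, so that the probability of a cell receiving zero viruses is exp(-λ) = 1 - MOI. The function names are hypothetical.

```python
import math


def poisson_rate(infected_fraction):
    """Poisson parameter lambda, assuming P(0 viruses) = 1 - infected_fraction."""
    if not 0.0 <= infected_fraction < 1.0:
        raise ValueError("infected_fraction must be in [0, 1)")
    return -math.log(1.0 - infected_fraction)


def virus_count_probabilities(infected_fraction, max_count=3):
    """Probability of a cell carrying 0, 1, ..., max_count viruses."""
    lam = poisson_rate(infected_fraction)
    return [math.exp(-lam) * lam ** k / math.factorial(k)
            for k in range(max_count + 1)]


def multiplet_fraction(infected_fraction):
    """Fraction of the infected cells that carry more than one virus."""
    if infected_fraction == 0.0:
        return 0.0
    p0, p1 = virus_count_probabilities(infected_fraction, max_count=1)
    return (1.0 - p0 - p1) / (1.0 - p0)


if __name__ == "__main__":
    for moi in (0.1, 0.2, 0.3):
        print(moi, round(poisson_rate(moi), 3), round(multiplet_fraction(moi), 3))
```

Under these assumptions, 30% infected cells corresponds to λ of roughly 0.36, and about one in six infected cells carries more than one virus, which illustrates why lower values are preferred when single perturbations are required.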

2 Materials

The required materials depend strongly on your choice of cells, and we strongly recommend first getting experience with simpler routine virus work. Note that if you use a VSV-G coated virus, which can infect any mammalian cell, you need to perform the virus work in biosafety level 2. Pseudotyped viruses which do not infect humans can sometimes be used in biosafety level 1 but still require a GMO risk assessment.

[Figure 2, panels a–e; labels: sgRNA, Barcode. Panel e, annealed insert design: top strand 5' CACC G XXtargetXXXXX 3'; bottom strand 3' C YYYYYYYYYYYYY CAAA 5']

Fig. 2 Approaches to cloning. (a) Cloning sgRNAs by ligation. Two oligos are ordered and annealed together into a construct with sticky ends (see panel e). Regular T4 ligase is then used to insert the dsDNA into the backbone. (b) Cloning the sgRNA by Gibson assembly of single-stranded oligos. This is the fastest method but more expensive than ligation. (c) Cloning sgRNAs using Gibson assembly or In-fusion. This is the procedure for oligo pool libraries in which the input material is scarce. A PCR step is introduced to amplify the ssDNA, resulting in dsDNA. Further overhang can be added. It is otherwise the same as the previous method. (d) Two-step cloning, retaining barcode and sgRNA matching in pooled clonings. The insert, having both barcode and sgRNA, also contains a restriction site. A large stretch containing other promoters and genes is inserted at this site. (e) The design of annealed inserts, ready for ligation. Note the G in the target, which is required for efficient U6 transcription.

2.1 Cloning

1. An sgRNA virus plasmid of choice, such as CROPseq-Guide-Puro (Addgene #86708), Perturb-seq pPS (Addgene #85801), or CROP-seq-opti (Addgene #106280). This protocol focuses on CROP-seq but generally applies to most sgRNA plasmids.
2. Oligos (forward and reverse) for each sgRNA to be expressed:
(a) For ligation, order two oligos prenormalized to 100 μM, with the forward and reverse oligo in the same well, in 96-well format. See Fig. 2e for the design, which is the same for CROP-seq-puro as well as for sgOPTI-CROP-seq-Guide-1. Note the G in the target part, which needs to be added only if the target sequence does not itself start with a G. The G is needed to ensure efficient transcription from U6; otherwise transcription is likely to start only from a later G in the target sequence (unpublished data from our lab).
(b) For Gibson assembly or In-fusion, order oligos of the format [ATCTTGTGGAAAGGACGAAACACC][target DNA sequence][G][GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC] for CROPseq-Guide-Puro, and [ATCTTGTGGAAAGGACGAAACACC][target DNA sequence][G][GTTTAAGAGCTATGCTGGAAACAGCATAGCAAGTT] for sgOPTI-CROP-seq-Guide-1.
3. Agarose and TE buffer for running gels.
4. Thermal cycler.
5. Compatible restriction enzymes. (The commonly used ones are BsmBI, BsaI, and BbsI.)
6. An enzyme to assemble the new plasmids: either T4 ligase and associated buffer, or Gibson assembly or In-fusion reagents.

7. Any chemically competent cells such as DH5α for individual cloning. Preferably electro-competent cells if the cloning is pooled.
8. LB-agar plates with antibiotics.
9. Miniprep and midi/maxiprep kits.
10. If you perform individual cloning, then verify the insert by Sanger sequencing. A suitable primer annealing to U6 is CGATACAAGGCTGTTAGAG. If you do this on a pooled library you will get a random signal over the insert region but obviously not the sequence distribution. Given the cost of the screen, it may be worth sequencing the library on a MiSeq. To do so, you can PCR the region as is commonly done in "bulk" pooled CRISPR screens where only the sgRNA distribution is read out. That is, design PCR primers with Nextera S5/S7 tails. You can then use Nextera index primers to tag the reads and pool with any other samples for sequencing. One million reads are more than sufficient to find any skew in a pool of 50 sgRNA plasmids.
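The ligation oligo design of Fig. 2e can be generated programmatically when many sgRNAs are ordered. The sketch below is an illustration only: it assumes the overhangs shown in Fig. 2e (CACC on the top strand, AAAC on the bottom strand when written 5' to 3') and assumes that a G is prepended only when the target does not already begin with one. The function names and the example target are hypothetical, and the overhangs should be checked against the backbone and restriction enzyme actually used.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper- or lower-case DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def ligation_oligos(target):
    """Return (top, bottom) oligos, both written 5' to 3'.

    Assumed design, following Fig. 2e:
      top    5' CACC [G]target 3'
      bottom 5' AAAC revcomp([G]target) 3'
    """
    target = target.upper()
    if not target or set(target) - set("ACGT"):
        raise ValueError("target must be a non-empty A/C/G/T sequence")
    # Prepend a G for U6 transcription unless the target already starts with one.
    core = target if target.startswith("G") else "G" + target
    return "CACC" + core, "AAAC" + reverse_complement(core)


if __name__ == "__main__":
    # Hypothetical 20-nt target sequence, for illustration only.
    top, bottom = ligation_oligos("ATCGGATCCTAGGCATTCAA")
    print("top:   ", top)
    print("bottom:", bottom)
```

Writing both oligos 5' to 3' matches the way oligos are normally entered in vendor order forms; the bottom strand then appears reversed relative to the drawing in Fig. 2e.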

2.2 Virus Production

1. Suitable packaging plasmids, such as psPAX2 (Addgene #12260) and pMD2.G (Addgene #12259).
2. DMEM or Advanced DMEM.
3. Fetal bovine serum.
4. GlutaMAX.
5. Pen-Strep.
6. Lipofectamine LTX.
7. 293FT cell line.
8. 0.1% (w/v) gelatin.

2.3 Virus Transduction and Titration

1. Cells of your choice.
2. Polybrene.
3. FACS analyzer.

2.4 Single-Cell RNA-seq and sgRNA Identification

1. Cells of your choice, including suitable media.
2. A FACS sorter along with suitable tubes and filters.
3. Chromium 10×, or equivalent equipment, and related reagents.
4. Illumina HiSeq and MiSeq, or equivalent short-read sequencer.

2.5 Analysis

1. Computer able to run Python and/or R. A Linux installation and 16 GB RAM or more is recommended.
2. Access to a Linux-driven computer cluster may speed up analysis significantly.

3 Methods

Carry out all procedures at room temperature, unless otherwise specified. The protocol is written to be broadly applicable, as the field is developing rapidly and details are likely to change, but it primarily targets CROP-seq-like plasmids, which require only one cloning step. Different cloning schemes are shown in Fig. 2. The most common methods are ligation of an annealed sticky overhang insert, and Gibson assembly of an ssDNA oligo. Barcode-based plasmids require two inserts. These can either be cloned in two rounds, or a library of backbones with random barcodes (a stretch of N's) can be used to clone the sgRNA insert. Sanger sequencing can then connect the barcode and sgRNA sequences. It is also possible to do pooled cloning in which the barcode and sgRNA sequence are associated in the same oligo (Fig. 2d). These protocols are more complicated and not further described here. However, more can be found in, for example, [13].

3.1 Cloning

1. Pick sgRNA sequences for the genes of interest. There are many online tools for this, such as http://www.e-crisp.org/E-CRISP/, http://chopchop.cbu.uib.no/ and http://crispr.mit.edu/. I suggest 2–4 sgRNAs per gene, depending on your choice of Cas9, number of cells, and budget.
2. Design oligos according to Fig. 2, depending on the chosen cloning method.
3. Prepare inserts:
(a) If you ordered both the forward and reverse strand: Anneal the oligos by diluting them in T4 ligase buffer. That is, mix 1 μl top strand oligo, 1 μl bottom strand oligo, 1 μl 10× T4 ligase buffer, and 7 μl nuclease-free water. Vortex and spin down. Heat up to 98 °C, then cool down to room temperature over one hour. If your PCR machine is incapable of ramping down, then either do it on a heat block (heat it up, then turn off the power) or leave the oligo tube in a bucket of hot water to cool. This final dsDNA solution is 10 μM.
(b) If you ordered a single strand (e.g., for CROP-seq), perform PCR to amplify and complement, then purify. Due to the small size, a kit such as the Zymo DNA Clean and Concentrator kit is suggested. The Qiagen MinElute PCR purification kit may work with a very fine margin (fragments of 70 bp and longer are retained).
4. If you want to perform multiplexed cloning to obtain a pool, mix forward/reverse oligo pairs only after annealing; single-stranded oligos can be mixed either before or after PCR.

5. Digest the viral backbone + stuffer (e.g., using BsmBI for the CROP-seq vector). Digesting five micrograms of plasmid for 2–5 h at 37 °C is more than sufficient. If the stuffer contains Ccdb1 then no further purification is needed, as undigested plasmids should not form colonies. Otherwise perform a gel purification and extract the backbone. In either case, verify the digestion on a gel.
6. Assemble the backbone and the insert.
(a) Using ligation with 1:5 ratio (if you ordered the forward and reverse strand oligos, see Fig. 2a): Mix 4 fmoles of your backbone (e.g., 22 ng CROPseq-Guide-Puro backbone, 8333 bp), 2 μl dsDNA oligos (diluted 1000× to 10 nM, giving a final 20 fmoles), 1 μl T4 ligase, 2 μl 10× T4 ligase buffer, and nuclease-free water to a total of 20 μl (for a negative control, replace the dsDNA insert with water) (see Note 1). Incubate at RT for 1 h or at 16 °C overnight. Be sure to prepare the reaction on ice to prevent premature ligation (see Note 2).
(b) Before Gibson assembly (or In-fusion), you can do an optional PCR step. This is not needed unless you work with an oligo pool as made by, for example, CustomArray or Twist Bioscience, and have a very small amount of input material. This is done in [5], where the PCR primers are simply the homology regions of the insert oligos. See Fig. 2c. Gibson assembly (or In-fusion, see Fig. 2b and see Note 3): Mix 10 fmoles of your backbone (e.g., 54 ng CROPseq-Guide-Puro backbone, 8333 bp), 2 μl dsDNA/ssDNA oligos (100 nM, giving 200 fmoles), 10 μl 2× Gibson assembly master mix, and nuclease-free water to a total of 20 μl (for a negative control, replace the Gibson assembly MM with water). Incubate at 50 °C for 1 h.
7. Purify the assembled product and elute in the smallest possible volume. The MinElute PCR purification kit is practical as it allows a final volume of 10 μl. Purification is not necessary but will improve transformation. Furthermore, sparks can be generated during electroporation if the salts from the assembly buffer are not removed.
8. Transform competent cells with the DNA according to the manufacturer's instruction. For pooled libraries:
(a) Use electroporation. After 1 h incubation, extract a small amount (e.g., 50 μl out of 1 ml) and perform a serial dilution, up to 10,000×. Streak these dilutions onto agar plates. Grow the remaining concentrate in 100 ml LB media.
(b) The day after, count colonies on the plates. Back-calculate the number of colonies in the 100 ml media. If the reaction worked it is highly unlikely that the complexity is too low for sc-CRISPR libraries (see Note 4).
9. Perform midi/maxipreps on the cultures. A kit that includes endotoxin removal is recommended.
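The molar amounts in step 6 and the back-calculation in step 8 are simple arithmetic that can be sketched as follows. This is an illustrative sketch only: it assumes an average mass of 650 g/mol per base pair of dsDNA (a common approximation), and the colony count, dilution, and plated volume in the example are hypothetical numbers, not values from this protocol.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (assumed approximation)


def fmol_to_ng(fmol, length_bp):
    """Mass in ng of `fmol` femtomoles of dsDNA of the given length."""
    return fmol * 1e-15 * length_bp * AVG_BP_MASS * 1e9


def ng_to_fmol(ng, length_bp):
    """Femtomoles of dsDNA of the given length contained in `ng` nanograms."""
    return ng * 1e-9 / (length_bp * AVG_BP_MASS) * 1e15


def oligo_fmol(volume_ul, conc_nm):
    """Femtomoles in a volume of oligo solution (1 nM equals 1 fmol per ul)."""
    return volume_ul * conc_nm


def total_transformants(colonies, dilution_factor, plated_volume_ul,
                        recovery_volume_ul=1000.0):
    """Back-calculate transformants in the recovery culture from one plate."""
    return colonies * dilution_factor * recovery_volume_ul / plated_volume_ul


if __name__ == "__main__":
    print(round(fmol_to_ng(4, 8333)))   # ligation backbone, 8333 bp
    print(round(fmol_to_ng(10, 8333)))  # Gibson backbone, 8333 bp
    print(oligo_fmol(2, 10))            # 2 ul of 10 nM annealed oligos
    # Hypothetical plate: 30 colonies from 100 ul of a 10,000x dilution.
    print(total_transformants(30, 10_000, 100))
```

Dividing the estimated number of transformants by the number of sgRNAs in the pool gives a rough per-sgRNA coverage, which is one way to judge whether library complexity is adequate.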

3.2 Virus Production

1. Prepare media: DMEM, or Advanced DMEM, with 10% FBS, P/S, and GlutaMAX.
2. Thaw 293FT cells quickly in a heat bath and resuspend in 10 ml media (T75/10 cm plate). Grow the cells at 37 °C, 5% CO2 for a few days to verify their integrity (see Note 5).
3. Prepare cells aiming at 60–80% confluence. To do this, seed 293FT packaging cells the day before at roughly 4 × 10^6 cells per plate in cell culture media, in 10 cm tissue culture plates or T75 flasks. Too low confluence is in my experience better than too high confluence. Sometimes 293FT cells have the habit of floating off from the plate after transfection. While it is uncertain whether this is a real problem, it can be prevented by first coating the plates with gelatin.
4. Day 0: Mix the following: 1 ml Opti-MEM, 5 μg pMD2.G, 7.5 μg psPAX2, 10 μg transfer plasmid, and 22 μl PLUS reagent.
5. Vortex and incubate for 5 min.
6. Add 47 μl Lipofectamine LTX (see Note 6). Vortex for 1 s or pipette to mix.
7. Incubate for 30 min (up to 2 h).
8. Replace media on the cells with 5 ml Opti-MEM (required for Lipofectamine 2000, recommended for LTX).
9. Add the LTX mix dropwise to the 293FT cells. Several protocols suggest dropwise addition; however, we have not noted any major difference to just adding all the Lipofectamine in one go.
10. Five hours later, replace with DMEM media.
11. Incubate overnight at 37 °C, 5% CO2.
12. Day 1: Replace the media with 5 ml fresh DMEM media.
13. In the evening: verify transfection efficiency by fluorescence microscopy. Compared to a control (not transfected, or transfected with only packaging plasmid), you should see a distinct difference in fluorescence.
14. Day 2: Harvest supernatant late evening. Filter using a syringe and a 0.20 μm or 0.45 μm filter (see Note 7).
15. Snap-freeze the virus in 1 ml aliquots (see Note 8).

3.3 Virus Infection

1. This protocol assumes that your cells express Cas9. Either perform targeted engineering, or transfect a Cas9-PiggyBac vector, or use a combined Cas9 + sgRNA virus in the next step.
2. Depending on your cell type, a typical protocol is 80% of the media being filtered virus media, and 20% fresh media (or 100% media, with concentrated virus being a minor volume). Perform a dilution series. Polybrene is added at 8 μg/ml, a concentration that works for a wide range of cell types. Do not refreeze virus aliquots (see Note 8).
3. For difficult cells, spin for 30 min to 2 h at 1100 × g, 32 °C. Keep the plated cells in a ziplock bag to prevent drying up and spreading the viral aerosol.
4. Incubate for 4–24 h at 37 °C.
5. Replace media. Some protocols recommend washing with PBS but we never do this.
6. Incubate and expand cells for a suitable time. Optionally, perform puromycin selection by adding 1–5 μg/ml the day after virus transduction. For primary cells, 3 days of expansion is sufficient for Cas9 to induce a double-strand break, but it depends on the cell type. Cancer cells with higher copy numbers of certain chromosomes may need a longer time.

3.4 Verifying CRISPR Function

1. Perform virus infection as above. For CRISPR KO, use a self-cutting virus and the corresponding control (e.g., pKLV2-U6gRNA5(Empty)-PGKBFP2AGFP-W (Addgene #67979) and pKLV2-U6gRNA5(gGFP)-PGKBFP2AGFP-W (Addgene #67980)).
2. Measure the self-cutting efficiency using FACS (see Notes 9 and 10).

3.5 Virus Titration

1. Perform virus infection as above.
2. Perform FACS analysis to measure the MOI of the different concentrations of your virus.
3. Calculate the optimal virus concentration by assuming an approximately linear relationship between the amount of virus and the number of infected cells.
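The linear assumption in step 3 can be applied with a least-squares fit through the origin. The sketch below is one possible way to do this, not a prescribed procedure: the function names are hypothetical, the example measurements are invented for illustration, and the linear approximation is only reasonable at low infected fractions (at higher fractions the curve flattens as cells receive several viruses).

```python
def fit_slope_through_origin(volumes_ul, infected_fractions):
    """Least-squares slope of infected fraction versus virus volume, no intercept."""
    if len(volumes_ul) != len(infected_fractions) or not volumes_ul:
        raise ValueError("need equally long, non-empty measurement lists")
    numerator = sum(v * f for v, f in zip(volumes_ul, infected_fractions))
    denominator = sum(v * v for v in volumes_ul)
    if denominator == 0:
        raise ValueError("at least one virus volume must be non-zero")
    return numerator / denominator


def volume_for_target(volumes_ul, infected_fractions, target_fraction):
    """Virus volume expected to give the target infected fraction."""
    slope = fit_slope_through_origin(volumes_ul, infected_fractions)
    if slope <= 0:
        raise ValueError("no positive relationship between virus and infection")
    return target_fraction / slope


if __name__ == "__main__":
    # Hypothetical dilution series: virus volume per well and FACS-positive fraction.
    volumes = [10, 20, 40]
    fractions = [0.05, 0.10, 0.20]
    print(volume_for_target(volumes, fractions, target_fraction=0.15))
```

Points from wells with a high infected fraction are best left out of the fit, since they violate the linear assumption.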

3.6 Single-Cell RNA-seq

1. Produce transduced cells as before. Purify the transduced population by FACS.
2. Produce and amplify cDNA according to the protocol of your machine.
3. Produce a library enriched for sgRNA reads, by nested PCR. These protocols are still being improved, so refer to the resource page of your particular plasmid for suitable primer sequences.

3.7 Analysis

There is as yet no established standard for analyzing sc-CRISPR data. Thus this is only an outline of the procedure:

1. Produce a read count table for the cells using the standard method for your chosen chemistry (e.g., CellRanger for 10× Chromium).
2. From the raw reads, extract the barcode of the virus. Since the number of reads may be low, additional care might be taken to handle read errors. One such method is aligning to all expected barcodes using, for example, the widely available Smith–Waterman algorithm, and assigning the closest match.
3. It has been noted that many infected cells do not display a phenotype. This may be due to the Cas9-induced mutation not disabling the target gene. This has previously been solved by training a classifier and removing WT-like cells [1].

A good starting point for the analysis is to look at the code available from previous studies:
https://github.com/asncd/MIMOSCA
http://www.medical-epigenomics.org/papers/datlinger2017/
https://github.com/shendurelab/single-cell-ko-screens
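Step 2 of the outline can be illustrated with a small pure-Python Smith–Waterman scorer. This is a minimal sketch and not the pipeline of any of the studies cited above: the scoring values, the acceptance threshold, the function names, and the example sequences are all assumptions chosen for illustration, and a production analysis would more likely rely on an optimized alignment library.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b (linear gap penalty)."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


def assign_barcode(read, expected, match=2, min_fraction=0.8):
    """Name of the closest expected barcode, or None if poor or ambiguous.

    expected: dict mapping barcode/sgRNA name to its sequence.
    min_fraction: hypothetical threshold, as a fraction of a perfect score.
    """
    scored = sorted(
        ((smith_waterman_score(read, seq, match=match), name)
         for name, seq in expected.items()),
        reverse=True,
    )
    best_score, best_name = scored[0]
    if best_score < min_fraction * match * len(expected[best_name]):
        return None  # too dissimilar to any expected barcode
    if len(scored) > 1 and scored[1][0] == best_score:
        return None  # tie between two barcodes
    return best_name


if __name__ == "__main__":
    # Hypothetical sgRNA sequences and a read carrying one sequencing error.
    guides = {"geneA_1": "ATCGGATCCTAGGCATTCAA", "geneB_1": "TTGACCGTAAGCTTGGCACT"}
    read = "CACCG" + "ATCGGATCCTCGGCATTCAA" + "GTTTT"
    print(assign_barcode(read, guides))
```

Local alignment is convenient here because the read usually contains flanking plasmid sequence in addition to the barcode, and these flanks do not need to be trimmed beforehand.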

4 Notes

1. Different ligases can be used, but T4 ligase should work under normal circumstances. T3 ligase is a simple drop-in replacement that enables you to handle salt contaminations (common carryover from gel purifications). T7 ligases only perform sticky end ligations and may remove some background.
2. Preparing the ligation on ice is particularly important if you make a master mix with enzyme + backbone. NEB claims their ligation can happen in 5 min, which suggests that a mix without insert left at RT will circularize anything that can (and cannot) be ligated.
3. Before Gibson assembly (or In-fusion), you can do an optional PCR step. This is not needed unless you work with an oligo pool as made by, for example, CustomArray or Twist Bioscience, and have a very small amount of input material. This is done in [5], where the PCR primers are simply the homology regions of the insert oligos. See Fig. 2c.
4. If you have few colonies, or plenty of background, try the following:
(a) Digest your backbone for longer (up to 5 h at 37 °C, or overnight at 16 °C). Consider doing a gel purification even if you have a Ccdb1 insert. If it does not have such an insert, make one before further subcloning. If you do gel purification, watch out for salt carryover; a subsequent PCR purification can remove residual salt. Salt and other contamination can be spotted on a NanoDrop spectrophotometer by the shape of the curve.
5. Never let the 293FT cells reach 100% confluency. There is a good chance that they will adapt and divide slower, and this appears to be a permanent effect once in place. Thus if this happens, you should consider throwing away the cells and thawing new ones. If the cells are not doubling every day, again thaw new ones. Try to keep the number of passages to the minimum by freezing all the cells you have left over.
6. Lipofectamine 2000 can be used instead of LTX, but you cannot mix it with serum-containing media. An alternative protocol is to culture the cells in Opti-MEM, together with the Lipofectamine 2000 mix, over 4 h. The media is then replaced with serum-containing media at the end of the day. Harvest the supernatant 2 days later.
7. Instead of syringe filtration, you can consider spinning the virus at 4 °C for 1 h to pellet the packaging cells really hard. If you flash-freeze the virus then these cells are for certain dead and will not contaminate your infected cells. There have been claims that 0.20 μm syringe filters are too harsh and reduce titer; 0.45 μm filters are a compromise. I have not tested filtration a sufficient number of times to confirm the loss in titer.
8. Virus can also be stored in a fridge at 4 °C for a week, but the point of freezing is to retain the efficiency between repeated infections. Refreezing virus is a bad idea as it may lower the titer by up to tenfold.
9. There are many possible reasons why you detect low titers. First, verify your production by looking at the 293FT cells in the microscope before harvesting the virus. Second, investigate whether the promoter for your selection marker (commonly BFP) is active in your chosen cells. Finally, if truly unlucky, your cells might commit suicide if they detect an anomaly such as double-strand breaks or simply the presence of virus. For some cell types, such as mouse T cells, special procedures are required. We have developed a retroviral CRISPR system as lentiviruses do not work on these [14].
10. It is not a given that CRISPR will work on your cells. If it does not work then first ensure that your Cas9 is expressed by performing RT-qPCR and/or western blot. It is possible that the chosen promoter is unsuitable for your cell type. There are collections of viruses available specifically for testing different promoters in your chosen cells.

Acknowledgments

J.H. is funded by the Swedish Research Council.

References

1. Dixit A, Parnas O, Li B et al (2016) Perturb-Seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167:1853–1866.e17
2. Jaitin DA, Weiner A, Yofe I et al (2016) Dissecting immune circuits by linking CRISPR-pooled screens with single-cell RNA-seq. Cell 167:1883–1896.e15
3. Datlinger P, Rendeiro AF, Schmidl C et al (2017) Pooled CRISPR screening with single-cell transcriptome readout. Nat Methods 14(3):297–301. https://doi.org/10.1038/nmeth.4177
4. Adamson B, Norman TM, Jost M et al (2016) A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell 167:1867–1882.e21
5. Hill AJ, McFaline-Figueroa JL, Starita LM et al (2018) On the design of CRISPR-based single-cell molecular screens. Nat Methods 15:271–274
6. Xie S, Duan J, Li B et al (2017) Multiplexed engineering and analysis of combinatorial enhancer activity in single cells. Mol Cell 66:285–299.e5
7. Macosko EZ, Basu A, Satija R et al (2015) Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161:1202–1214
8. Zheng GXY, Terry JM, Belgrader P et al (2017) Massively parallel digital transcriptional profiling of single cells. Nat Commun 8:14049
9. Hashimshony T, Wagner F, Sher N, Yanai I (2012) CEL-Seq: single-cell RNA-seq by multiplexed linear amplification. Cell Rep 2:666–673
10. Picelli S, Faridani OR, Björklund ÅK et al (2014) Full-length RNA-seq from single cells using Smart-seq2. Nat Protoc 9:171–181
11. Ellis BL, Potts PR, Porteus MH (2010) Creating higher titer lentivirus with caffeine. Hum Gene Ther 22:93–100
12. Tang Y, Garson K, Li L, Vanderhyden BC (2015) Optimization of lentiviral vector production using polyethylenimine-mediated transfection. Oncol Lett 9:55–62
13. Vidigal JA, Ventura A (2015) Rapid and efficient one-step generation of paired gRNA CRISPR-Cas9 libraries. Nat Commun 6:8083
14. Henriksson J, Chen X, Gomes T et al (2019) Genome-wide CRISPR screens in T helper cells reveal pervasive cross-talk between activation and differentiation. BioRxiv 176(4):882–896

Part VII

Single Cell Live Imaging

Chapter 24

Single-Cell Live Imaging

Toru Hiratsuka and Naoki Komatsu

Abstract

Recent fluorescence microscopy allows for high-throughput acquisition of 5D (X, Y, Z, T, and Color) images in various targets such as cultured cells, 3D spheroids/organoids, and even living tissue with single-cell resolution. The technology is considered promising for gaining insight into heterogeneous features of both physiological and pathological cell phenotypes, for instance, distinct responses of cancer cells to anticancer drug treatment. Here we give an overview of microscopic applications to capture live cell events for different types of targets, together with a couple of proofs of concept. The 2D live imaging is exemplified by FRET-based time-lapse imaging of cultured cells, and the 3D tissue imaging protocol is complemented with a method for mouse skin live imaging.

Key words Single cell imaging, Microscopy, Fluorescent probes, 2D/3D-cultured cell observation, Living mouse observation

1 Introduction

Since the development of fluorescence microscopy, the technology has continuously improved its fundamental features, such as resolution, speed, and time-lapse image acquisition. Simultaneously, development of various fluorescent protein mutants and probes significantly accelerated its applications. Fluorescence microscopy has now become an essential tool for a wide range of biological and medical studies. Single cell live imaging allows for time-series understanding of individual cells while maintaining their culture or living conditions. This represents a significant advantage over other population-based methods such as Western blotting and immunohistochemistry of fixed samples. Live imaging of cultured cells with epifluorescence microscopy now allows for automated observation of multiple wells such as 96 wells, which especially benefits drug screening and multitarget siRNA screening studies. On the other hand, new microscopy techniques such as multiphoton excitation microscopy have been utilized to capture in vivo dynamics of cells in living organisms from Drosophila melanogaster, Caenorhabditis elegans, zebrafish, mouse, and even living human skin [1, 2]. Here we describe general protocols for single cell imaging of 2D-cultured cells with epifluorescence microscopy, 3D-cultured spheroids/organoids with confocal microscopy, and living mouse tissue with multiphoton microscopy. For more details, we exemplify the technology with single cell time-lapse FRET imaging of cultured cells (2D) and single cell mouse skin tissue imaging (3D).

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_24, © Springer Science+Business Media, LLC, part of Springer Nature 2019

2 Materials

2.1 General Equipment

1. Microscope systems (including objective lens, detectors, and PC installed with microscope control software) (see Note 1).
2. CO2 stage incubator.
3. Heating chamber.
4. Sample dish/plate holder for microscope stage.
5. Phenol-red free imaging media.
6. (If nuclear labeling is needed) Hoechst 33342.
7. Image analysis software (see Note 2).
8. PC for image analysis.

2.2 Cell Culture Reagents

1. Collagen type I-C solution.
2. Expression plasmid for fluorescent probe.
3. Transfection reagents (e.g., Lipofection reagents).

2.3 Mouse Imaging

1. Depilation cream.
2. Pulse oximeter.
3. Animal temperature controller.
4. Anesthesia machine.
5. Heat pad for mouse.

3 Methods

3.1 Sample Preparation

3.1.1 2D-Cultured Cells

Plate cells onto a glass-based dish/well (0.15–0.18 mm of thickness). Coating the glass with extracellular matrices such as collagen and poly-L-lysine promotes cell attachment (see Note 3). After complete attachment (1 day), transfect the cells with fluorescence expression plasmids using lipofection reagents (see Note 4). Incubate cells in a humidified CO2 incubator at 37 °C for 1–2 days (see Note 5) to allow cells to express a sufficient amount of fluorescent probe. If you are using stable cell lines, you can skip this step (see Note 6).

If nuclear labeling is needed, Hoechst 33342 (final concentra- tion at 0.1–1 μg/ml) can be added 30 min before imaging and the signal will last at least for 24 h.

3.1.2 3D-Cultured Cells

Plate cells stably expressing the fluorescent probe of your interest into appropriate culture matrices to allow cells to form organoids/spheroids [3, 4]. U-bottom plates can be useful to locate the organoid/spheroid at the center of the well. Ensure that the target sample is located close enough to the bottom of the plate/dish to allow the objective to focus within its working distance. If nuclear labeling is needed, Hoechst 33342 (final concentration at 0.1–1 μg/ml) can be added. Typically 30 min of incubation is sufficient but the time depends on the infiltration rate of the dye into your sample.

3.1.3 Mouse

The mouse needs to be genetically modified to express fluorescent probe(s) unless the imaging only needs autofluorescence or dye labeling. Stable and high expression of fluorescent protein is highly important since deep-tissue observation is susceptible to light scattering and absorbance in the light path (see Note 7). Depending on the target tissue, the procedures necessary to expose the tissue vary greatly (see Notes 8 and 9). For skin observation, hairs must be removed by depilation cream 3–24 h before imaging to make sure transient skin damage has recovered (see Note 10). Wipe the cream gently with water-soaked papers and rinse with water several times to completely remove the cream. If nuclear labeling is needed, intravenous injection of Hoechst 33342 (200 μl of 2 mg/ml solution in PBS) 1 h prior to imaging can readily label nuclei in the body.

3.2 Microscope Set-Up

Launch all the microscopic devices including fluorescence lamps/lasers, and PC. Warming up the lamp or laser will take a while (about 30 min). Turn on the temperature and CO2 controller and incubate for 30–60 min to stabilize the culture condition before mounting your sample. A typical fluorescence microscope system for cultured cell imaging is shown in Fig. 1.

3.3 Set Sample

3.3.1 2D/3D-Cultured Cells

Change culture media of the transfected cells to a phenol red-free imaging medium (see Note 11) and incubate cells typically at 37 °C and 5% CO2 atmosphere for at least 1 h to equilibrate the cellular condition. Transfer the dish from the incubator (see Note 12) and mount it onto the microscope stage using a proper dish/plate holder (see Note 13). Cover the sample with the CO2 device lid that has a tube connected to the CO2 supply.

Fig. 1 Microscope setup for cultured cell imaging. Microscopic devices need to be placed on an antivibration table. Dehumidification is necessary if the humidity is high to prevent possible damage to the devices

3.3.2 Mouse

Anesthetize the mouse by injection or inhalation of an anesthetic agent. Injectable agents include ketamine/xylazine, phenobarbital, and Avertin (intraperitoneal injection, follow the protocol for each agent concentration). A typically used inhalable agent is vaporized isoflurane (1–1.5%) (see Note 14).

Expose and stabilize the skin area of your interest gently without damaging the skin. There are several approaches, but one way is to sandwich the skin between a cover slip and a sticky soft silicon gum sheet (Fig. 2). Mouse health status should be monitored continuously with pulse oximeter, heat pad, etc. Note that all animal experiments need to be performed under the animal experiment regulations of your country and research institute.

3.4 Observation/Image Acquisition

3.4.1 2D/3D-Cultured Cells

Select a proper objective lens for your sample depending on the cell size (magnification), required viewfield, resolution (NA), working distance, etc. (see Note 15). Start live view mode from the control software of the PC and search for cells expressing the fluorescent probe using the appropriate illumination channel (see Note 16). Optimize exposure time, camera binning (see Note 17), and illumination power to obtain the brightest image while avoiding saturation of the detector at the maximum image intensity (see Note 18). Set the interval and total time of time-lapse acquisition. If a motorized XY stage is available, set stage points. If you are using confocal microscopy, adjust laser power for the focus (see Note 19). Confocal microscopy can adjust Z resolution by changing the size of the pinhole. The smaller the pinhole size, the higher the Z resolution but the lower the signal gained. Once all the settings are confirmed (see Note 20), start image acquisition (see Notes 21 and 22).
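Whether an exposure setting saturates the detector can be checked on a test image before starting a long time-lapse. The following is a minimal sketch assuming the image is available as a NumPy array and that the camera fills the full range of the stated bit depth; many cameras record 12 or 14 bits inside a 16-bit file, so `bit_depth` is an assumption that must be set to match the detector. The function name and the tolerance in the example are hypothetical.

```python
import numpy as np


def saturated_fraction(image, bit_depth=16):
    """Fraction of pixels at or above the maximum value for the given bit depth."""
    max_value = 2 ** bit_depth - 1
    return float(np.mean(np.asarray(image) >= max_value))


if __name__ == "__main__":
    # Hypothetical 2 x 2 test image with two saturated 16-bit pixels.
    test_image = np.array([[0, 100], [65535, 65535]], dtype=np.uint16)
    fraction = saturated_fraction(test_image)
    print(fraction)
    if fraction > 0.001:  # hypothetical tolerance
        print("Reduce exposure time or illumination power")
```

Saturated pixels cannot be quantified, so exposure or illumination power is usually reduced until the brightest cells of interest stay below the detector maximum.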

Fig. 2 Experimental setup for mouse skin imaging. (a) The mouse ear skin is gently stabilized on the microscopic stage. The mouse is anesthetized with isoflurane on a heat pad. The depilated ear skin is placed on a sticky silicon gum sheet and covered with a cover glass. (b) The mouse back skin is stabilized in the same way as ear skin. Since back skin is more extensive than ear skin, make sure to fully flatten the skin part of observation. The skin part for observation is recommended to be as far as possible from pulsing parts of mouse body such as heart and lung since those movements severely affect image stability

3.4.2 Mouse

Most protocols for observation are the same as for 2D/3D-cultured sample observation (Subheading 3.4.1), but pay attention to (1) autofluorescence (see Note 23), (2) health monitoring of the mouse (see Note 24), and (3) tissue movement (see Note 25). The maximum depth for observation is around 1 mm for transparent tissue such as brain, and 200–400 μm for skin. Time-lapse images are ready to be acquired once all the conditions are set, but make sure the mouse is properly and steadily anesthetized before starting an automatic image acquisition. During imaging, keep an eye on acquired images as well as the health status of the mouse. The tissue could be damaged gradually if the laser power is too high. Successful observation of mouse skin will appear as in Fig. 3.

Fig. 3 Representative multiphoton images of mouse skin. (a) Schematic of mammalian skin structure. (b–d) Representative images of suprabasal layer of epidermis (b), basal layer of epidermis (c), and dermis (d). The mouse expresses EKAREV FRET probe in nucleus [9]

3.5 Image Analysis

3.5.1 Transfer and Open Images

Transfer raw images from the microscope PC to the analysis PC. Open raw images with image analysis software such as Fiji/ImageJ (see Note 2) and check if the images were acquired properly. Adjust brightness and contrast if necessary. Gamma value must be 1 when intensity quantification is required (see Note 26).

3.5.2 Image Processing Subtract background. Quantify intensities of fluorescence signals. If necessary, analyze the intensity values further with other software dedicated to numerical processing, such as Microsoft Excel and MATLAB. An example of image analysis is shown in Fig. 4.
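The two processing steps above (background subtraction, then intensity quantification) can be illustrated with a minimal Python sketch for a ratiometric readout such as the FRET/CFP ratio. This is not the authors' pipeline: the function names, the constant-background model, and the pixel values are all hypothetical, and real images would normally be handled in Fiji/ImageJ or a numerical package.

```python
from statistics import mean

def subtract_background(pixels, background):
    """Subtract a constant background and clip negative values to zero."""
    return [max(p - background, 0.0) for p in pixels]

def fret_cfp_ratio(fret_pixels, cfp_pixels, fret_bg, cfp_bg):
    """Mean background-subtracted FRET intensity divided by mean CFP intensity."""
    fret = mean(subtract_background(fret_pixels, fret_bg))
    cfp = mean(subtract_background(cfp_pixels, cfp_bg))
    if cfp == 0:
        raise ValueError("CFP signal is zero after background subtraction")
    return fret / cfp

# Hypothetical pixel intensities from one cell region in the two channels
fret_roi = [520.0, 540.0, 500.0, 560.0]
cfp_roi = [320.0, 310.0, 300.0, 330.0]
ratio = fret_cfp_ratio(fret_roi, cfp_roi, fret_bg=100.0, cfp_bg=100.0)
print(ratio)
```

Subtracting the background before taking the ratio matters because a constant offset in either channel would otherwise bias the ratio, most strongly in dim cells.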

Fig. 4 Example of image analysis: Quantification of FRET-based ERK activity. (a) Structure of ERK activity FRET biosensor, EKAREV [5]. Depending on ERK activity, it changes its 3D structure and emits different signals at 480 nm (ERK inactive) and 533 nm (ERK active). (b) Flowchart of image analysis. For further details, see [6, 7]. (c) Time course of average ERK activity (FRET/CFP ratio) from a population of cells

Single-Cell Live Imaging 415

3.6 (Optional: Cultured Cell Imaging) Automated Microscopy Protocols described above can be extended to automated image acquisition, which is useful for observation of rare events in a cell population and for collection of a large number of single-cell data for statistical analysis. Differences from non-automated epifluorescence microscopy are as follows: (1) program custom-built macros for microscope control (automated stage moving, programmed illumination setting, etc.) (see Note 27), (2) plate cells in a multiwell format (e.g., 96-well plate), and (3) analyze raw images with custom-built pipelines for automated image analysis and numerical analysis [5].
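As an illustration of what a stage-control macro for a multiwell plate has to compute, the following Python sketch lists the 96 well positions in a serpentine visiting order. It is only a sketch under stated assumptions: the 9 mm well-to-well spacing is the usual value for standard 96-well plates, the coordinates are offsets relative to well A1, and no real microscope API is called (the function name is hypothetical).

```python
import string

WELL_PITCH_MM = 9.0  # assumed center-to-center spacing of a standard 96-well plate

def plate_positions(rows=8, cols=12, pitch_mm=WELL_PITCH_MM):
    """Return (well_name, x_mm, y_mm) tuples in serpentine order, relative to well A1."""
    positions = []
    for r in range(rows):
        # Reverse the column order on every second row to shorten stage travel
        col_order = range(cols) if r % 2 == 0 else reversed(range(cols))
        for c in col_order:
            name = f"{string.ascii_uppercase[r]}{c + 1}"
            positions.append((name, c * pitch_mm, r * pitch_mm))
    return positions

wells = plate_positions()
print(len(wells), wells[0], wells[12], wells[-1])
```

The serpentine order is a design choice that avoids a long return move at the end of each row; a real macro would additionally add per-well focus offsets and illumination settings.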

4 Notes

1. Selection of microscope systems depends on the object you want to observe. For cultured cells, wide-field epifluorescence microscopy and confocal microscopy are widely used. A wide-field epifluorescence microscope illuminates samples uniformly and collects fluorescence from samples with a two-dimensional detector such as a CCD or CMOS camera, allowing for quick acquisition of bright and wide images. This is suitable for multipoint and multicolor quantitative imaging. On the other hand, confocal microscopy is powerful for the observation of subcellular localization of organelles/proteins and 3D structures. It obtains sectional images by laser-based point illumination and by selection of signal from the section through its pinhole. This means, however, that fluorescent probes are continuously excited along the light path during multiple Z slice acquisitions, regardless of which Z section is currently of interest. Due to this, phototoxicity and photobleaching take place outside of the Z focus as well.
Spinning disk microscopy (SDM), light sheet microscopy (LSM), and multiphoton microscopy (MPM) are good choices if confocal microscopy does not fit your experimental purposes, for instance, when you require wide area observation, fine time-lapse imaging, and/or deep area observation. SDM can efficiently obtain signals from a wide field of view through a number of pinholes on its Nipkow disc. Although the sectioning capacity and resolution tend to be slightly compromised, its speed is significantly higher than that of confocal microscopy; thus, its phototoxicity tends to be low. LSM also enables very high-speed image acquisition by applying sheet-structured illumination. This overcomes the problem of out-of-focus phototoxicity, significantly reducing the damage to the sample. MPM's speed for image acquisition is comparable to confocal microscopy, but its great advantage is deep-tissue observation. By using long-wavelength lasers for multiphoton excitation (typically 750–1300 nm), it allows for specific excitation of deep areas to achieve Z resolution. Long-wavelength light causes less phototoxicity than the short-wavelength light used in confocal microscopy.
2. Various software tools for bioimage analysis are now available. Among them, Fiji/ImageJ is license-free and provides many plugins useful for cell tracking, display of 5D images, etc. On the other hand, licensed software tends to include easy-to-use functions and user-friendly interfaces. Selection of the software tools is also highly dependent on what you want to observe/analyze from your images.
3. Collagen and poly-L-lysine are commonly used to coat the glass surface of the dish/plate.
For collagen coating, (1) dilute the stock solution of collagen type I-C with 1 mM HCl to a final concentration of 0.3 mg/ml. (2) Add a sufficient volume of the collagen coating solution to cover the entire surface of the glass bottom. (3) After an hour, aspirate the coating solution from the dish/plate, rinse three times with sterile water, and then allow it to dry for 1–2 h. The time for coating varies from a few seconds to overnight incubation, depending on the cell type and the lots/suppliers of the glass-bottom dish/plate.
For poly-L-lysine coating, (1) add a sufficient volume of 0.1 mg/ml poly-L-lysine aqueous solution to the glass bottom to cover its entire surface. (2) After 5 min, remove the solution and rinse the glass surface three times with sterile water. (3) Allow it to dry for at least 2 h. Poly-L-lysine coating is commonly used to stabilize and image floating cells. Note, however, that this forced attachment may cause unwanted effects on cells such as altered cellular signaling and gene expression profile.
4. Transfection conditions (i.e., types/amounts of lipofection reagents and incubation time) should be optimized for each cell type in order to achieve both high expression efficiency and healthy cell condition. Follow the manual of each transfection reagent for a detailed optimization protocol.
5. Note that some fluorescent proteins such as RFPs need a relatively long time for maturation compared to GFP variants such as CFP, GFP, and YFP.
6. Lentiviral gene transfer is commonly used to establish stable cell lines expressing fluorescent probes. Using GFP variants with similar cDNA sequences (e.g., CFP and YFP) may cause gene recombination. One solution to this is to use codon-diversified GFPs that encode the same protein with different DNA sequences [8].
7. Transposon-based gene integration significantly reduces the risk of gene recombination in the transgene compared to lentiviral gene integration [9].
8. While skin observation only needs depilation of hair, other organs such as intestine, mammary glands, lung, and heart require surgical procedures to expose the tissue [10–12]. For skin, multiple methods have been reported for acquiring long and stable images, and relatively long-term imaging (>12 h) has been achieved [13–15].
9. Organs with active pulsing (heart and lung) require further measures to stabilize the tissue or compensate for the movement, for instance, vacuum stabilization and computational image reconstruction [12, 16].
10. Depilation cream for humans is readily available and convenient for removing mouse hair. Depilation cream for sensitive skin is recommended to minimize the risk of damaging skin tissue, especially ear skin.
Ear skin will typically be depilated within 1 min, while back skin will take 2–3 min. Note that hairs in the growth phase are too deeply rooted and cannot be removed completely.
11. The media can be supplemented with 10% serum for with-serum observation or 0.1% BSA for serum-starved observation, which keeps the cells healthy during live imaging.
12. Minimize the time for sample loading, as opening the stage incubator will change the temperature. This may influence your cell condition and may also destabilize the focus during time-lapse image acquisition.
13. There are several types of dish/plate attachments for mounting a dish/plate on the microscope stage. In addition, there are two types of stage, with a rectangular or a round hole. We recommend a rectangular attachment with a rectangular stage because it accepts a wide range of samples. By using adapters, it covers 35 mm round dishes, 60 mm round dishes, 8-well cover glass chambers, and 96-well plates.
14. The initial induction of anesthesia needs a relatively high concentration (1.5–2.0%). After induction, it is recommended to reduce the concentration for maintenance (0.8–1.5%). The ideal concentration will vary depending on the mouse body size, age, health status, etc.
15. For cellular level observation, a 10×, 20×, or 40× lens will provide sufficient resolution, while subcellular structures (membrane dynamics, cytoskeleton, mitochondria, etc.) will require higher magnification (60× or 100×).
16. The following issues should be carefully examined and solved to obtain reliable fluorescence images: (1) crosstalk among color channels (i.e., bleed-through of fluorescence signals and cross-excitation of dyes/probes) and (2) unevenness of illumination. The former can be compensated by linear unmixing. The latter is, in many cases, compensated by adjusting illumination settings (e.g., aligning and focusing the halogen lamp) or changing filters/dichroic mirrors (filters/dichroic mirrors with uneven coating can cause uneven illumination).
17. A CCD (or CMOS) image sensor consists of an array of thousands of pixels, each of which receives photons and converts them to photoelectrons. By combining the photoelectrons of adjacent pixels on the CCD into electronic superpixels, we can improve the brightness and S/N ratio of images for the same exposure time. This process is called binning. For example, in 2 × 2 binning, 4 pixels of the CCD are electronically combined and function as 1 large pixel.
Note that binning reduces spatial resolution.
18. Saturated fluorescence signals do not reflect the exact signal of the probe and impair the quantitative accuracy of images. In addition, illuminating cells for a long time with strong excitation light often causes photobleaching and phototoxicity.
19. Deep area observation will require higher laser power/light exposure, and this may impair quantitative accuracy along the Z axis. Coexpression of a reference fluorescent protein that is stably and invariably expressed in cells may help to normalize the signal.

20. You might need to compromise on one or more parameters for live imaging, such as laser power and the number of Z slices and/or time points. Time-lapse imaging requires a signal that is stable over time, which strongly depends on multiple factors such as promoter strength and stability of the fluorescent protein.
21. For time-series observation with multiple stage points, keeping the focus over time is crucial for successful tracking of the same cells. Autofocus systems built into the microscope, such as "PFS (perfect focus system)" for Nikon microscopes or "ZDC (Z-drift compensation)" for Olympus microscopes, greatly help to keep the focus automatically.
22. If some reagents (growth factor, inhibitor, etc.) need to be added during time-lapse imaging, the image acquisition can be suspended. The reagent must be dissolved in a volume of media around 1/10 of the final volume for efficient diffusion. Minimize changes in temperature, CO2, etc., and never forget to resume image acquisition after the reagent addition.
23. Mouse health status (temperature, breathing, heart rate, oxygen saturation level) can be monitored by a pulse oximeter. Among the listed indicators, temperature plays the most important role, as the highest risk of long-term anesthesia is hypothermia. A heat pad greatly helps to maintain the health of the mouse, but ideally the temperature should be regulated automatically by an animal temperature controller that combines a temperature sensor with a heat pad.
24. Some biological structures such as collagen, myosin, and tubulin can be visualized via SHG (second harmonic generation) by multiphoton microscopy without any antibody-based staining or dye use (Fig. 3d). SHG is, in principle, a generation of autofluorescence in which nonlinear interaction with materials lacking inversion symmetry emits light at double the frequency (half the wavelength) of the excitation laser light.
For example, if a 910 nm laser is used for GFP in two-photon microscopy, collagen fibers will emit autofluorescence light of around 455 nm wavelength, half the excitation wavelength [17, 18].
25. The most common source of tissue movement is pulsing of the heart and lungs. To minimize the movement, the skin area under observation should be placed as far as possible from those pulsing organs. For this reason, and because it is thinner, ear skin is easier for time-lapse observation than back skin.
26. When gamma is 1, signals displayed in an image correlate linearly with the signals originally recorded. Otherwise, the output signals on display correlate nonlinearly with the input signals. If you modify the gamma value for presentation, it is desirable to state the gamma value used.
27. Alternatively, high-content image-based screening platforms such as Opera (PerkinElmer) are convenient and user-friendly for high-throughput analysis. Nevertheless, coding-based microscope operation provides more flexibility, for instance, in illumination settings for multicolor assays, which require careful optimization of each channel setting to obtain reliable fluorescence signals.
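Two of the notes above (binning in Note 17 and gamma in Note 26) can be made concrete with a small Python sketch. It is an illustration only: images are plain nested lists, binning is done by summing pixel values, and the gamma mapping assumes the convention out = max × (in / max)^gamma, which is one common convention and may differ from the one used by a given software package.

```python
def bin_2x2(image):
    """Sum each 2 x 2 block of pixels into one superpixel (dimensions must be even)."""
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even for 2 x 2 binning")
    return [
        [image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]
         for c in range(0, cols, 2)]
        for r in range(0, rows, 2)
    ]

def apply_gamma(value, gamma, max_value=255.0):
    """Map a recorded intensity to a displayed one (assumed convention, see lead-in)."""
    return max_value * (value / max_value) ** gamma

# Hypothetical 4 x 4 image: binning gives a brighter 2 x 2 image with lower resolution
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
binned = bin_2x2(image)
print(binned)

# With gamma = 1 the displayed value equals the recorded value (linear display)
print(apply_gamma(64.0, 1.0), apply_gamma(64.0, 0.5))
```

With gamma different from 1, equal steps in recorded intensity no longer map to equal steps on display, which is why quantification should be done on data with gamma set to 1.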

Acknowledgments

T.H. has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No 704587. N.K. was supported by JSPS KAKENHI grant numbers JP16K21109 and JP18K19313.

References

1. Cahalan MD, Parker I, Wei SH et al (2002) Two-photon tissue imaging: seeing the immune system in a fresh light. Nat Rev Immunol 2:872–880
2. Helmchen F, Denk W (2005) Deep tissue two-photon microscopy. Nat Methods 2:932–940
3. Debnath J, Muthuswamy SK, Brugge JS (2003) Morphogenesis and oncogenesis of MCF-10A mammary epithelial acini grown in three-dimensional basement membrane cultures. Methods 30:256–268
4. Mahe MM, Aihara E, Schumacher MA et al (2013) Establishment of gastrointestinal epithelial organoids. Curr Protoc Mouse Biol 3:217–240
5. Komatsu N, Aoki K, Yamada M et al (2011) Development of an optimized backbone of FRET biosensors for kinases and GTPases. Mol Biol Cell 22:4647–4656
6. Aoki K, Matsuda M (2009) Visualization of small GTPase activity with fluorescence resonance energy transfer-based biosensors. Nat Protoc 4:1623–1631
7. Broussard JA, Rappaz B, Webb DJ et al (2013) Fluorescence resonance energy transfer microscopy as demonstrated by measuring the activation of the serine/threonine kinase Akt. Nat Protoc 8:265–281
8. Komatsubara AT, Matsuda M, Aoki K (2015) Quantitative analysis of recombination between YFP and CFP genes of FRET biosensors introduced by lentiviral or retroviral gene transfer. Sci Rep 5:13283
9. Kamioka Y, Sumiyama K, Mizuno R et al (2012) Live imaging of protein kinase activities in transgenic mice expressing FRET biosensors. Cell Struct Funct 37:65–73
10. Kumagai Y, Naoki H, Nakasyo E et al (2015) Heterogeneity in ERK activity as visualized by in vivo FRET imaging of mammary tumor cells developed in MMTV-Neu mice. Oncogene 34:1051–1057
11. Mizuno R, Kamioka Y, Kabashima K et al (2014) In vivo imaging reveals PKA regulation of ERK activity during neutrophil recruitment to inflamed intestines. J Exp Med 211:1123–1136
12. Vinegoni C, Aguirre AD, Lee S et al (2015) Imaging the beating heart in the mouse using intravital microscopy techniques. Nat Protoc 10:1802–1819
13. Hiratsuka T, Fujita Y, Naoki H et al (2015) Intercellular propagation of extracellular signal-regulated kinase activation revealed by in vivo imaging of mouse skin. elife 4:e05178
14. Li JL, Goh CC, Keeble JL et al (2012) Intravital multiphoton imaging of immune responses in the mouse ear skin. Nat Protoc 7:221–234
15. Pineda CM, Park S, Mesa KR et al (2015) Intravital imaging of hair follicle regeneration in the mouse. Nat Protoc 10:1116–1130
16. Rodriguez-Tirado C, Kitamura T, Kato Y et al (2016) Long-term high-resolution intravital microscopy in the lung with a vacuum stabilized imaging window. J Vis Exp 116:54603
17. Campagnola PJ, Dong CY (2011) Second harmonic generation microscopy: principles and applications to disease diagnosis. Laser Photonics Rev 5:13–26
18. Zoumi A, Yeh A, Tromberg BJ (2002) Imaging cells and extracellular matrix in vivo by using second-harmonic generation and two-photon excited fluorescence. Proc Natl Acad Sci U S A 99:11014–11019

Part VIII

Single Cell Data Analysis

Chapter 25

Differential Expression Analysis in Single-Cell Transcriptomics

Luca Alessandrì, Maddalena Arigoni, and Raffaele Calogero

Abstract

Differential expression analysis is an important aspect of bulk RNA sequencing (RNAseq). Many tools are available, and among them DESeq2 and edgeR are widely used. Since single-cell RNA sequencing (scRNAseq) expression data are zero inflated, single-cell data are quite different from those generated by conventional bulk RNA sequencing. A comparative analysis of tools used to detect differentially expressed genes between two groups of single cells showed that edgeR with the quasi-likelihood F-test (QLF) outperforms other methods. In bulk RNAseq, differential expression is mainly used to compare a limited number of replicates of two or more biological conditions. However, scRNAseq differential expression analysis might also be instrumental in identifying the main players of cell subpopulation organization, thus requiring the use of multiple-comparison tools. Currently, edgeR is one of the few tools able to handle both zero-inflated matrices and multiple comparisons. Here, we provide a guide to the use of edgeR as a tool to detect differential expression in single-cell data.

Key words Differential expression, Single-cell RNA sequencing, scRNAseq, edgeR

1 Introduction

Single-cell sequencing is a powerful technology to study cell heterogeneity and represents a new frontier for the bioinformatics community. Cell heterogeneity analysis requires the use of clustering methods (t-SNE [1], kernel similarity learning [2], etc.). However, after cluster detection there is a need to identify the genes playing a pivotal role in defining the organization of the cell clusters. Differential expression analysis could be the key to detecting such genes. Many methods have been developed to identify differential gene expression from single-cell RNA (scRNA)-seq data, and, recently, Soneson evaluated the overall characteristics of 36 of them [3], testing their efficacy in the differential expression of two groups. Soneson has shown that bulk RNA-seq analysis methods do not perform worse than those developed specifically for scRNA-seq [3]. However, the two-group comparison does not represent the optimal approach for intercluster feature selection, that is, for identifying the main players of cell subpopulation organization, and a multigroup comparison would be more appropriate. Within the top ten best tools for differential expression analysis tested by Soneson, only Limma [4] and edgeR/QLF [5] are able to handle multigroup comparisons, and since among tools for two-group comparison edgeR/QLF appears to outperform other tools [3], in this chapter we will focus on edgeR/QLF as a tool for multigroup differential expression analysis.

Another point that is important to address in bioinformatics analyses is reproducibility. Reproducibility of research is a key element of modern science and represents the ability to replicate an experiment independently of the location and the operator. Therefore, a study can be considered reproducible only if all the data used are available and the computational analysis workflow is clearly described. In genomics and transcriptomics data analysis, the availability of raw data and a list of the tools used might not be enough to guarantee the reproducibility of the results obtained. Indeed, different releases of the same tools might result in subtle reproducibility issues [6]. The Reproducible Bioinformatics Project (RBP) [7] is an open-source project, based on docker images and R packages, providing reproducible results in the genomics and transcriptomics framework. An implementation of edgeR/QLF is available in RBP, and here we will describe its use as a differential expression tool for single cells.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_25, © Springer Science+Business Media, LLC, part of Springer Nature 2019
426 Luca Alessandrì et al.

2 Materials and Methods

The analysis of transcription data generally requires the use of a Unix operating system. Specifically, the RBP applications require the installation, in a UNIX-based environment, of a docker daemon (https://www.docker.com/) and of R (https://cran.r-project.org/). The tools supported by RBP for single-cell analysis are implemented in the CASC github package (https://github.com/kendomaniac/CASC).

2.1 Preparing the System
1. Download from https://www.docker.com/community-edition the docker daemon version compatible with your Linux platform (debian, redhat, ubuntu, etc.) and follow the installation instructions (https://docs.docker.com/install/).
2. To install R, follow the installation instructions provided at https://cran.r-project.org/bin/linux/ for your Linux platform.
3. Open an R session and install the CASC github library:
(a) install.packages("devtools"). This package is required to install github R packages locally.
(b) devtools::install_github("kendomaniac/rCASC", ref="master"). This command installs the rCASC package.
(c) library(rCASC). This command loads the CASC library in R.
(d) downloadContainers(). This command downloads all the docker images required by CASC to execute single-cell data analysis.
ANOVA-Like Differential Expression 427

2.2 Preparing the Sample Dataset
We use, as sample dataset, the single-cell data generated by Buettner [8], in which 288 mouse embryonic stem cells (mESCs), sorted for G1 (96 cells), G2/M (96 cells), and S (96 cells), respectively, were sequenced using the C1 Fluidigm instrument.
1. system("wget http://130.192.119.59/public/buettner_counts_noSymb.txt.zip"). This command downloads the Buettner data set [8].
2. unzip("buettner_counts_noSymb.txt.zip"). The file buettner_counts_noSymb.txt is a tab-delimited file containing the counts table for 288 cells labeled on the basis of their cell cycle state and 38294 ENSEMBL gene IDs followed by 93 ERCC spike-in RNAs and three summary rows (see Notes 1 and 2).
3. system("wget ftp://ftp.ensembl.org/pub/release-92/gtf/mus_musculus/Mus_musculus.GRCm38.92.gtf.gz"). This command downloads the ENSEMBL GTF file required to annotate the counts table, if ENSEMBL IDs are provided as gene identifiers in the counts table (see Note 3).
4. library(rCASC).
5. lorenzFilter(group="docker", scratch.folder="/data/scratch/", data.folder=getwd(), matrixName="buettner_counts_noSymb", p_value=0.05, format="txt", separator='\t'). The above function is an implementation of the Lorenz statistics [9], which allows for the removal of low-quality cells, since Diaz showed that Lorenz statistics results are correlated with live–dead staining results [9]. group is a character string with two options, sudo or docker, depending on which group the user belongs to. scratch.folder is a character string indicating the path of the scratch folder, that is, the folder where temporary calculations are executed. data.folder is a character string indicating the folder where input data are located and where output will be written. matrixName is the name of the counts data file located in data.folder. The file must contain raw counts/UMI, without any modification such as log transformation or normalization. umiXgene is an integer defining how many UMI/counts are required to call a gene as present. The default is set to 3. p_value is the maximum Lorenz statistics value required for a cell to pass the filter. format is the counts file extension (e.g., txt or csv). separator is the field-delimiting character used to separate cells in the counts file (e.g., tab or comma). The output of the function is named by adding the lorenz_ prefix to the input counts file name, and it is a tab-delimited file containing the cells passing the Lorenz cell quality test.
6. scannobyGtf(group="docker", data.folder=getwd(), counts.table="lorenz_buettner_counts_noSymb.txt", gtf.name="Mus_musculus.GRCm38.92.gtf", biotype="protein_coding", mt=FALSE, ribo.proteins=FALSE, file.type="txt", umiXgene=3). The majority of single-cell primary analysis tools use ENSEMBL gene IDs as gene identifiers. The function scannobyGtf adds gene symbols to ENSEMBL gene IDs. The function also allows the removal of mitochondrial and ribosomal protein genes that might act as confounding factors. counts.table is a character string indicating the counts table file. gtf.name is a character string indicating the ENSEMBL gtf file. biotype is a character string indicating the ENSEMBL biotype(s) of interest; for single-cell data the biotype of interest is protein_coding. mt is a Boolean value defining if mitochondrial genes have to be removed; FALSE means that mt genes are removed. ribo.proteins is a Boolean value defining if ribosomal proteins have to be removed; FALSE means that ribosomal proteins are removed. file.type indicates if counts table columns are delimited by comma (csv) or tab (txt).
The output has the prefix annotated_. The file annotated_lorenz_buettner_counts_noSymb.txt is then renamed annotated_lorenz_buettner_counts_woMT_woRIBO.txt to indicate the removal of mitochondrial and ribosomal protein genes. The cells retained by the Lorenz statistics are shown in red in Fig. 1a. The effect of removing mitochondrial and ribosomal proteins from the cells passing the Lorenz filter is shown in Fig. 1b.
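Before running the steps above, it can be useful to check which rows of the counts table are genes and which are spike-ins or summary rows (see Note 1). The following Python sketch shows one way to keep only rows whose identifier looks like a mouse ENSEMBL gene ID. It is not part of rCASC: the function name is hypothetical, and the row labels used for the spike-in (ERCC-...) and summary (__...) rows, as well as the column names, are assumptions made for the example and may not match the actual file.

```python
import csv
import io

def keep_gene_rows(table_text, gene_prefix="ENSMUSG"):
    """Return (header, rows) keeping only rows whose first field starts with gene_prefix."""
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    kept = [row for row in reader if row and row[0].startswith(gene_prefix)]
    return header, kept

# Hypothetical miniature counts table: 2 genes, 1 spike-in row, 1 summary row
sample = "\n".join([
    "\t".join(["", "cell1_cov1", "cell2_cov2"]),
    "\t".join(["ENSMUSG00000000001", "10", "0"]),
    "\t".join(["ENSMUSG00000000028", "3", "7"]),
    "\t".join(["ERCC-00002", "500", "480"]),
    "\t".join(["__no_feature", "1000", "900"]),
]) + "\n"

header, gene_rows = keep_gene_rows(sample)
print(header)
print([row[0] for row in gene_rows])
```

Filtering on a positive pattern (the gene ID prefix) rather than listing the rows to exclude is a deliberate choice here, since it does not depend on knowing every non-gene label in advance.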

Fig. 1 Data preprocessing. (a) Removing low quality cells using Lorenz statistics. (b) Annotating of ENSEMBL IDs for protein_coding biotype. In this specific case, mitochondrial and ribosomal protein genes were also removed, resulting in an overall reduction of the total detected genes and counts

2.3 Detecting Differential Expression with edgeR
1. deDetection(group="docker", data.folder=getwd(), counts.table="annotated_lorenz_buettner_counts_woMT_woRIBO.txt", file.type="txt", logFC.threshold=1, FDR.threshold=0.05, logCPM.threshold=4).
(a) The function detects differential expression starting from a raw counts table. logFC.threshold is an integer indicating the minimum threshold requested as absolute log2 fold change to have the gene selected as significantly differentially expressed. logCPM.threshold is an integer indicating the minimum average expression (log2 CPM) requested to have the gene selected as significantly differentially expressed (see Note 4). FDR.threshold is a number indicating the maximum value accepted as BH corrected p-value to consider a gene significantly differentially expressed. The analysis done by edgeR is ANOVA-like; thus, all covariates are compared with respect to the first one (see Note 5).
(b) The groups to be considered for differential expression analysis need to be indicated in the column names of the raw counts table, using underscore as separator (Fig. 2).
(c) The ANOVA-like analysis done with edgeR returns two files named after the input file, with the prefixes DE_ and filtered_DE_, respectively. The information provided in the output files (Fig. 3) is: logFCXX columns, indicating the log2 fold change with respect to the reference covariate, in this example G1 cov1; LogCPM, the log2-average abundance estimated over the samples; PValue, the raw p-value; and FDR, the BH corrected p-value. The filtered_DE_ file retains only the genes passing, in at least one comparison, the thresholds set for the parameters logFC.threshold, FDR.threshold, and logCPM.threshold.
(d) The deDetection function also provides as output a PDF file, called filteredDE.pdf (Fig. 4), representing the results of the differential expression analysis.

Fig. 2 Structure of the input raw counts file required for differential expression analysis. In bold are indicated the three experimental covariates used in this example, that is, cov1 for G1, cov2 for G2/M, and cov3 for S phase

Fig. 3 Structure of the output generated by ANOVA-like differential expression analysis

Fig. 4 Differential expression results. In the plot log2FC is the maximum value detected within the tested comparisons. The genes passing the selected logFC.threshold, FDR.threshold, and logCPM.threshold are shown in red
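The thresholding described in Subheading 2.3 can be sketched in a few lines of Python. This is a simplified illustration and not the code used by deDetection or edgeR: the Benjamini-Hochberg (BH) adjustment is implemented from its standard definition, the gene records are hypothetical, and the assumption that thresholds are inclusive (>= and <=) is ours.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def passes_filters(log_fcs, log_cpm, fdr,
                   logfc_threshold=1.0, fdr_threshold=0.05, logcpm_threshold=4.0):
    """True if the largest absolute log2FC over all comparisons and the other values pass."""
    strongest = max(abs(v) for v in log_fcs)
    return (strongest >= logfc_threshold
            and fdr <= fdr_threshold
            and log_cpm >= logcpm_threshold)

# Hypothetical results: log2FC of cov2 and cov3 versus the reference cov1
genes = {
    "geneA": {"logFC": [0.2, 1.8], "logCPM": 6.1, "PValue": 0.001},
    "geneB": {"logFC": [2.5, 3.0], "logCPM": 1.2, "PValue": 0.002},
    "geneC": {"logFC": [0.1, -0.3], "logCPM": 7.0, "PValue": 0.6},
}
fdrs = benjamini_hochberg([g["PValue"] for g in genes.values()])
selected = [name for (name, g), fdr in zip(genes.items(), fdrs)
            if passes_filters(g["logFC"], g["logCPM"], fdr)]
print(selected)
```

In this toy example geneB has large fold changes but is excluded by the expression threshold, which is the situation the logCPM filter is meant to handle.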

3 Notes

1. It is very important to check the overall structure of the raw data set, to avoid keeping in the analysis rows that are not related to the transcriptome of the cells (e.g., ERCC spike-ins and summary information).
2. It is very important to use only raw counts, without any normalization, for differential expression analysis [10].
3. It is extremely important that the GTF files (http://www.ensembl.org/info/data/ftp/index.html) used for the annotation of ENSEMBL IDs are correctly associated with the genome assembly version used for the transcriptome mapping (e.g., the Genome Reference Consortium mouse assembly GRCm38, used by ENSEMBL, corresponds to the UCSC mm10 genome assembly); the only difference between the two assemblies is that chromosomes are indicated by numbers in ENSEMBL (e.g., for mouse 1-19, X, MT) and with the prefix chr in UCSC (e.g., chr1-chr19, chrX, and chrM). Any GTF starting from ENSEMBL release 68 can be associated with the GRCm38 assembly (ftp://ftp.ensembl.org/pub/release-68/).
4. The use of a threshold for log2CPM (logCPM.threshold) is strongly suggested to avoid the selection of spurious differential expression events. The suggested default is 4, which indicates that the average expression is at least 16 CPM.
5. In an ANOVA-like analysis, the first covariate works as a reference, and all the other covariates are compared with respect to the first one.
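The arithmetic behind Note 4 can be checked with a short Python sketch. It uses the plain definition of counts per million (CPM) and is only an approximation of what edgeR reports, since edgeR's logCPM is computed with additional adjustments; the function names and the example numbers are hypothetical.

```python
import math

def cpm(count, library_size):
    """Counts per million: reads for one gene scaled to a library of one million reads."""
    return count * 1_000_000 / library_size

def log2_cpm_to_cpm(log2_cpm):
    """Convert a log2 CPM value back to the CPM scale."""
    return 2.0 ** log2_cpm

# A log2 CPM threshold of 4 corresponds to 2^4 = 16 CPM
print(log2_cpm_to_cpm(4))

# Hypothetical gene with 32 reads in a cell with 2 million reads in total
value = cpm(32, 2_000_000)
print(value, math.log2(value))
```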

References

1. Acuff NV, Linden J (2017) Using visualization of t-distributed stochastic neighbor embedding to identify immune cell subsets in mouse tumors. J Immunol 198(11):4539–4546. https://doi.org/10.4049/jimmunol.1602077
2. Wang B, Zhu J, Pierson E, Ramazzotti D, Batzoglou S (2017) Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning. Nat Methods 14(4):414–416. https://doi.org/10.1038/nmeth.4207
3. Soneson C, Robinson MD (2018) Bias, robustness and scalability in single-cell differential expression analysis. Nat Methods 15(4):255–261. https://doi.org/10.1038/nmeth.4612
4. Law CW, Chen Y, Shi W, Smyth GK (2014) voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol 15(2):R29. https://doi.org/10.1186/gb-2014-15-2-r29
5. Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26(1):139–140. https://doi.org/10.1093/bioinformatics/btp616
6. Beccuti M, Cordero F, Arigoni M, Panero R, Amparore EG, Donatelli S, Calogero RA (2017) SeqBox: RNAseq/ChIPseq reproducible analysis on a consumer game computer. Bioinformatics. https://doi.org/10.1093/bioinformatics/btx674
7. Kulkarni N, Alessandrì L, Panero R, Arigoni M, Olivero M, Ferrero G, Cordero F, Beccuti M, Calogero RA (2018) Reproducible bioinformatics project: a community for reproducible bioinformatics analysis pipelines. BMC Bioinformatics 19(Suppl 10):349
8. Buettner F, Natarajan KN, Casale FP, Proserpio V, Scialdone A, Theis FJ, Teichmann SA, Marioni JC, Stegle O (2015) Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells. Nat Biotechnol 33(2):155–160. https://doi.org/10.1038/nbt.3102
9. Diaz A, Liu SJ, Sandoval C, Pollen A, Nowakowski TJ, Lim DA, Kriegstein A (2016) SCell: integrated analysis of single-cell RNA-seq data. Bioinformatics 32(14):2219–2220. https://doi.org/10.1093/bioinformatics/btw201
10. Love MI, Anders S, Kim V, Huber W (2015) RNA-Seq workflow: gene-level exploratory analysis and differential expression. F1000Res 4:1070. https://doi.org/10.12688/f1000research.7035.1

Chapter 26

A Bioinformatic Toolkit for Single-Cell mRNA Analysis

Kevin Baßler, Patrick Günther, Jonas Schulte-Schrepping, Matthias Becker, and Paweł Biernat

Abstract

Recent technological developments in the field of single-cell RNA-Seq enable us to assay the transcriptome of up to a million single cells in parallel. However, the analysis of such big datasets presents a major challenge. During the last decade, a wide variety of strategies have been proposed covering different steps of the analysis. Here, we introduce a selection of computational tools to provide an overview of a generic analysis pipeline. The first step of every scRNA-Seq experiment is proper study design, which does not require sophisticated experimental or informatics skills but is nonetheless presumably the most important step. The quality of the resulting data strictly depends on the proper planning of the experiment, including the selection of the most suitable technology for the biological question of interest as well as an elaborate study design to minimize the influence of confounding factors. Once the experiment has been conducted, the raw sequencing data needs to be processed to extract the gene expression information for each cell. This task comprises quality assessment of the sequenced reads, alignment against a reference genome, demultiplexing of the cell barcodes, and quantification of the reads/transcripts per gene. Like any other transcriptomics technology, single-cell mRNA-Seq requires data normalization to ensure sample-to-sample (here, cell-to-cell) comparability, as well as the consideration of confounding factors. Once gene expression values have been extracted from the reads and normalized, the researcher has the agony of choosing between a plethora of analysis approaches to investigate diverse aspects of the single-cell transcriptomes, such as dimensionality reduction and clustering to explore cellular heterogeneity or trajectory analysis to model differentiation processes. In this chapter, we summarize the abovementioned steps required to conduct single-cell RNA-Seq analyses and present a selection of existing tools.

Key words Single-cell, mRNA-Seq, Data analysis, Guidelines

1 Introduction

Recent technologies have made it possible to assay the transcriptome of up to a million single cells in parallel. However, the analysis of such big datasets becomes a challenge. Moreover, the vast number of different analytical tools that emerge almost every week is overwhelming for a researcher, especially when his or her focus is on wet-lab experiments and not on bioinformatics. Here we introduce the main steps of a typical bioinformatics pipeline for the analysis of single-cell mRNA-Seq data, starting from quality assessment of reads and alignment and leading to cell-subtype discovery in low-dimensional space. Figure 1 presents a comprehensive overview of the main analytical steps covered in this chapter. Since each of these steps has some downsides, we not only introduce algorithms, methods, and tools but also critically review their applicability and limitations. Furthermore, we aim this section at biologists and bench scientists to help them understand the meaning of each analysis step and to get an overview of existing methods in the field.

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9_26, © Springer Science+Business Media, LLC, part of Springer Nature 2019

2 Experimental Planning

One of the most important steps toward a successful application of single-cell mRNA-Seq to a biological question is detailed planning of the experiment. In the two subsections that follow, we focus on aspects that need to be taken into account when planning a single-cell mRNA-Seq experiment.

2.1 Choosing a Single-Cell mRNA-Seq Technology

Following the first description of single-cell mRNA sequencing in 2009 [1], a wide variety of single-cell mRNA-Seq methods has been proposed. Each method has particular advantages, so the experimenter must choose the technique that is best suited to the biological question in mind. Regarding the gene body coverage of single-cell mRNA-Seq data, two major protocol types have emerged. Full-length methods (e.g., SMART-Seq2 [2] and Strt-Seq [3]) provide read coverage across the complete transcript, allowing the investigation of, for example, alternative RNA processing. However, most available single-cell protocols (e.g., Drop-Seq [4], Seq-Well [5], or sci-RNA-Seq [6]) sacrifice full-length coverage for the sake of early multiplexing, which minimizes the cost.

Another important consideration during planning of a single-cell mRNA-Seq experiment is the procedure for isolating single cells from a cell mixture. Early isolation protocols relied on manual cell isolation techniques, such as micropipetting or laser capture microdissection. While these techniques retain spatial information about the selected cells, their throughput is very low [7]. Fluorescence-activated cell sorting (FACS) is a widely established technique that can be used for isolation of single cells. In addition, FACS records the protein expression of a cell, which makes it possible to combine the proteome and transcriptome data derived from the same cell. It has been shown that this additional layer of information can be very valuable for characterization and investigation of cells of interest [8]. However, the sorting procedure exerts stress on the cells in the form of high pressure and shear forces, which can change the transcriptome or even force the cells into apoptosis. Recently developed droplet-based isolation techniques, such as Drop-Seq [4] and inDrop [9], have substantially decreased the cost while increasing the throughput. The same holds for cell isolation using microwell plates that allow for easy and fast separation of single cells into wells [5, 10, 11]. Remarkably, very recent single-cell technologies do not rely on physical isolation/separation of single cells but rather perform each enzymatic step of single-cell mRNA-Seq library preparation inside the cell using a split-pool barcoding approach [6]. These technologies rely on fixation of the cells, which might not suit all cell types under the existing protocols, so that the respective technology needs to be adapted for some cells of interest.

[Figure 1: pipeline stages from FASTQ reads through demultiplexing, alignment to a reference genome, quantification (reads, UMIs, imputed data), normalization, and dimensionality reduction/clustering to cell type identification.]

Fig. 1 Schematic overview of a generic single-cell mRNA-Seq analysis pipeline

Furthermore, an experimenter should consider whether the biological investigation would benefit from co-estimation of transcriptomic data and genome data (TCR/BCR receptors, genotypes, etc.) [12] and/or epigenetic data (methylome) [13, 14].

Another important question is the number of cells that should be covered. Again, this depends on multiple factors, including cost, cell types, technology, and biological question. The number of required cells depends greatly on the assumed heterogeneity in the cell mixture. Since this is unknown for most experiments, it helps to use other available resources (flow cytometry data, etc.) to estimate the expected heterogeneity. This approximation can be used to determine the required number of cells using the Satija lab’s resource http://satijalab.org/howmanycells. As a rule of thumb, the less complex a heterogeneous cell mixture is, the higher the information depth required to detect heterogeneity, which can be increased either by analyzing more cells or by using a more sensitive approach (more genes detected). More cells analyzed means higher statistical power and lower impact of dropouts (see also Subheading 3.7).
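The dependence of the required cell number on the expected heterogeneity can be made concrete with a small calculation. The sketch below is not the method behind the Satija lab resource; it simply assumes that cells are sampled independently, so that the number of cells obtained from a subpopulation of frequency f among N sequenced cells follows a binomial distribution. The function names, the minimum of 10 cells per subpopulation, the 95% confidence level, and the step size are illustrative assumptions.

```python
from math import comb

def prob_at_least(n_cells, freq, min_cells):
    """P(X >= min_cells) for X ~ Binomial(n_cells, freq)."""
    p_fewer = sum(
        comb(n_cells, k) * freq**k * (1.0 - freq) ** (n_cells - k)
        for k in range(min_cells)
    )
    return 1.0 - p_fewer

def cells_needed(freq, min_cells, confidence=0.95, step=50, limit=1_000_000):
    """Smallest multiple of `step` cells for which at least `min_cells` cells
    of a subpopulation with frequency `freq` are captured with the given
    confidence (assumption: independent sampling of cells)."""
    n = step
    while n <= limit:
        if prob_at_least(n, freq, min_cells) >= confidence:
            return n
        n += step
    raise ValueError("no solution below limit")

if __name__ == "__main__":
    # A rarer subpopulation requires more sequenced cells.
    for freq in (0.10, 0.01):
        print(freq, cells_needed(freq, min_cells=10))
```

Under these assumptions, lowering the frequency of the rarest subpopulation of interest increases the number of cells that have to be sequenced, which is the trade-off described above.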

2.2 Reducing Batch Effects

Since the starting material of single cells is very low, the generated gene expression measurements might be confounded because of differences in RNA extraction, enzyme activities, degradation/fragment length, amplification, and sequencing depth. These effects become apparent as batch effects. Batch effects are confounding factors that occur because of, for example, different enzyme lots and differences in personnel or preparation dates. However, batch effects can also occur within one experiment. Some of the available experimental protocols for single-cell genomics necessitate splitting of cells into different pools/batches during various steps of downstream processing. Although these pools are processed simultaneously, technical variation introduced by processing in different batches is hard to avoid. A stringent study design may reduce their influence.

The easiest and most efficient way to account for batch effects is a proper experimental setup to begin with. To this end, it is important to balance biological conditions among batches to avoid a confounded study design. Ideally, all conditions (e.g., patient and control) should be represented and evenly distributed among all batches. For example, a comparison between patient and healthy control samples will be much harder to analyze, because batch effects add to the naturally occurring between-donor variability, if the single-cell libraries of the two donors are generated separately, for example at different time points. However, one can circumvent the introduction of such a batch effect by generating both single-cell mRNA libraries in one experiment. To this end, the experimenter can take advantage of technologies that rely on DNA-barcoded antibodies (such as CITE-Seq [15]), which tag surface markers of different donors with different sequences. In this way, the cells of, say, donor 1 and donor 2 can be used in one single-cell experiment and can later be demultiplexed in silico on the basis of the introduced donor-specific sequences. As an alternative to CITE-Seq [15], polymorphism information in the genome can also be used to multiplex cells from different donors in one experiment. For example, the tool demuxlet [16] takes advantage of a few single-nucleotide polymorphisms (SNPs) to assign a cell index of a 10x Genomics dataset to a donor. However, the use of polymorphism data necessitates the availability of genotype information for each individual, which is not always the case. For single-cell protocols utilizing FACS-based cell separation, the multiplexing can be performed on the same plates, and since the sorting layout and the information about the antibody intensities are stored, the single-cell data can easily be assigned to the respective donors afterward.
Importantly, we recommend retaining any kind of information about the conducted experiments to be able to trace back the sources of confounding factors such as batch effects.

If processing of cells in batches cannot be avoided, it is important to include standards to estimate and ideally correct for the batch effects. For this purpose, ERCC spike-in RNAs can be used; spike-ins are discussed further in Subheading 3.5.

In cases where biological conditions coincide with batches (e.g., FACS-sorted plate 1 contains only cells from donor 1 and plate 2 exclusively cells from donor 2), it is hard to distinguish biological variance from technical differences between batches. Here, the addition of some cells of a very homogeneous and stable cell population (e.g., cell lines) increases comparability between the two batches. It is important that these cells can be traced back during data analysis (e.g., by using cells from a distinct species). Single-cell data generated from such spike-in cells should reflect only technical differences across batches and hence might be used to assess and correct for batch effects.

In Subheading 3.6, we introduce some strategies to remove batch effects in silico.

Once the optimal technology and strategy have been found to answer the biological question of interest, and the respective dataset has been generated with the influence of potential batch effects kept as low as possible, the actual analysis starts.

3 Computational Aspects and Challenges

3.1 Preprocessing of Single-Cell Data

The full pipelines for processing single-cell data are diverse and depend on the particular experimental setup. However, the typical process can be divided into two major parts. The first part concerns turning raw sequences into a count table, which contains the information on how many times each gene (or transcript) was expressed in each cell. The second part is the actual analysis, where we extract biologically relevant information from the count table. There are deviations from this general scheme, but for simplicity we will use it as a basis for the rest of this section.

3.2 FASTQ-Files

The data from the sequencing machine is typically stored in a BCL file, which is then converted into the FASTQ format and compressed for storage efficiency. The FASTQ format is a simple text format designed to store not only the bases called for each sequence but also metadata associated with these sequences. The information for each sequence is stored in four lines of text: the first line is the sequence identifier, which contains the information on the physical location of the sequence on the flow cell; the second line is the actual sequence in the form of A, C, T, G, and N (nonidentified base); the third line always starts with a + sign and optionally repeats the sequence identifier of the first line; and the fourth line contains the information on the quality of the base calling as Phred scores, encoded as ASCII characters. Below you can find the first two reads of a sample file.

@NB502097:6:HLWV5BGX3:1:11101:18363:1037 2:N:0:TAAGGCGA
TTGAATGGGCCTTTNTCNGGNGGCGNCGNNCNNGGNNAGNANGCCNNNNNTNNCTNNNGNNCNCTNCNNNNTNNCN
+
6AAA/EEA

AA////
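The four-line record layout and the ASCII encoding of the Phred scores described above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for established FASTQ parsers: it assumes uncompressed input with exactly four lines per record and the Phred+33 offset used by current Illumina instruments. The function names and the toy record are hypothetical.

```python
import io

PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding

def decode_phred(quality):
    """Convert a quality string into a list of integer Phred scores."""
    return [ord(char) - PHRED_OFFSET for char in quality]

def error_probability(phred):
    """Probability that a base call with this Phred score is wrong."""
    return 10 ** (-phred / 10)

def read_fastq(handle):
    """Yield (identifier, sequence, qualities) for each four-line record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record: " + header)
        if len(sequence) != len(quality):
            raise ValueError("sequence/quality length mismatch: " + header)
        yield header[1:], sequence, decode_phred(quality)

if __name__ == "__main__":
    # Toy record (hypothetical), one quality character per base.
    toy = io.StringIO("@read1 2:N:0:TAAGGCGA\nTTGAATGN\n+\n6AAA/EEA\n")
    for name, seq, quals in read_fastq(toy):
        print(name, seq, quals)
```

A Phred score Q corresponds to an error probability of 10^(-Q/10), so a score of 20 means that one base call in a hundred is expected to be wrong.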

pools can be used to process data in a reasonable time frame. General-purpose graphics processing units (GPGPUs) have been targeted to accelerate omics applications (e.g., through NVIDIA’s nvBio toolkit and the GPGPU-adapted Bowtie version nvBowtie (https://nvlabs.github.io/nvbio/nvbowtie_page.html)). Other vendors provide dedicated hardware such as the Edico Genome DRAGEN Bio-IT Processor [22], an accelerator card with support for various pipelines. Recently, Intel has integrated such accelerators with some of its server CPUs to run genomics software in collaboration with the Broad Institute (https://newsroom.intel.com/editorials/intel-extends-advanced-analytics-to-grand-challenge-of-genomics/). Dedicated hardware also exists for long-read assembly (e.g., the genomics coprocessor Darwin [23]). Algorithms that process count data, such as single-cell variational inference (scVI) [24], can also be accelerated using GPGPUs.

After alignment and quantification (see also Subheading 3.3), the count table is represented as a matrix and typically stored in a human-readable, text-based file format such as CSV. However, this gives rise to issues similar to the ones affecting the FASTQ format, and further optimizations can be made to store and process the counts more efficiently. Single-cell data is rather sparse: only 1–10% of counts are nonzero, because most cells express only a small number of the genes and because of dropout events. Sparse representations exploit this property and store only the nonzero counts and their positions. Since sparse data is commonly encountered in other applications, data formats for sparse matrices are already established. Specialized single-cell formats have been developed on top of these existing technologies, such as anndata from scanpy [25] and loom (http://loompy.org/). Anndata internally uses a sparse data representation based on HDF5 (http://www.hdfgroup.org/HDF5/), whereas loom uses a chunked matrix representation that allows for a compression rate similar to sparse formats. Both formats bundle additional information about the dataset and processing results. These developments are an effort toward establishing common storage formats, which would greatly improve collaboration between groups and ease of use of existing pipelines.
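To illustrate why storing only the nonzero counts and their positions pays off, the sketch below converts a small dense count table into a coordinate-style representation and back. It is a pure-Python toy under stated assumptions: the function names and the example matrix are hypothetical, and real pipelines would rely on established sparse-matrix libraries and on-disk formats rather than on code like this.

```python
def to_sparse(dense):
    """Keep only nonzero entries as {(gene_index, cell_index): count}."""
    n_genes = len(dense)
    n_cells = len(dense[0]) if dense else 0
    entries = {
        (g, c): value
        for g, row in enumerate(dense)
        for c, value in enumerate(row)
        if value != 0
    }
    return (n_genes, n_cells), entries

def to_dense(shape, entries):
    """Rebuild the full table; every position that is not stored is zero."""
    n_genes, n_cells = shape
    dense = [[0] * n_cells for _ in range(n_genes)]
    for (g, c), value in entries.items():
        dense[g][c] = value
    return dense

def density(shape, entries):
    """Fraction of nonzero entries in the table."""
    n_genes, n_cells = shape
    return len(entries) / (n_genes * n_cells)

if __name__ == "__main__":
    counts = [  # rows: genes, columns: cells (toy values)
        [4, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
        [0, 3, 0, 0],
    ]
    shape, entries = to_sparse(counts)
    print(entries, density(shape, entries))
```

With 1–10% nonzero counts, such a representation stores far fewer values than the dense table, at the price of also storing the position of each value.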

3.3 Demultiplexing, Alignment, and Quantification

After RNA sequencing and conversion of read information into FASTQ format, we recommend examining the quality of sequencing reads with specialized tools like FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/), Trimmomatic [26], or fastx_toolkit (http://hannonlab.cshl.edu/fastx_toolkit/). This helps to get an initial overview of the quality of the generated single-cell mRNA-Seq data and to remove reads that do not pass the quality control (QC) filters.

Independent of which single-cell method was used, each read (or read pair) will contain a cell-specific tag, also referred to as cell barcode, which was integrated during library preparation and can be used to uniquely assign each read to a specific cell. The process of separating all QC-passing reads based on the cell barcode is called demultiplexing. Several tools have been developed for efficient demultiplexing of the sequences stored in FASTQ files, such as fastx_toolkit or Picard (available at http://broadinstitute.github.io/picard). Importantly, the length and position of this cell barcode may vary between different single-cell mRNA-Seq technologies and hence have to be known beforehand. In addition to the cell barcode, some technologies (e.g., MARS-Seq [27] or Drop-Seq [4]) also include UMIs (unique molecular identifiers) in the reads (see Subheading 3.5). To facilitate downstream processing, this UMI information must be stored and the UMI sequence must be removed from the remaining read for proper alignment. Tools like UMI-tools [28] and zUMIs [29] are able to handle read data containing UMI sequences.

Once reads have been separated, the actual sequence of the mRNA can be assigned to its related gene locus in a process called alignment. To this end, the mRNA sequence of a read is mapped to its corresponding gene on an annotated reference genome using splice-aware alignment tools like STAR [21], HISAT2 [30], or TopHat2 [31]. Today, reference genomes are available for many different organisms, including human, mouse, and other popular model organisms. If aligning against the whole genome is prohibitive in terms of memory or computation (as is the case for smaller desktop systems or very large datasets), one can turn to transcriptome aligners such as kallisto [20] or Salmon [32]. For every single cell, uniquely mapped reads overlapping the exonic regions of a gene are counted using featureCounts [33] or HTSeq-count [34], and as a result, a read count matrix is generated. Finally, in case UMIs have been introduced during library preparation, the read count for a respective gene is collapsed to the number of distinct UMIs associated with these reads, generating a transcript count matrix.

It is important to keep in mind that read data from different single-cell technologies can differ drastically, and hence there is no one-size-fits-all software solution. Therefore, technology-specific pipelines are often needed, even for basic functions such as the production of a count table from the raw data. As a consequence, developers of most scRNA-Seq techniques provide technique-specific, prewritten computational pipelines to readily process the raw data for downstream analysis. Drop-seq tools (available at http://mccarrolllab.org/dropseq/) and Cell Ranger (available at https://www.10xgenomics.com/) are popular examples of these pipelines. Although these pipelines require some programming skills, their extensive documentation also allows nonprogrammers to comprehend and execute the necessary preprocessing steps.
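The relation between demultiplexing, read counting, and UMI collapsing can be sketched as follows. The example assumes a hypothetical read layout in which the first 12 bases of the barcode read are the cell barcode and the next 8 bases are the UMI, and it assumes that each read has already been aligned and assigned to a gene. These lengths, the function names, and the toy reads are illustrative only; real protocols differ, and dedicated tools additionally correct sequencing errors in barcodes and UMIs.

```python
from collections import defaultdict

CELL_BARCODE_LENGTH = 12  # assumption: protocol-specific
UMI_LENGTH = 8            # assumption: protocol-specific

def split_tag(barcode_read):
    """Split a barcode read into (cell barcode, UMI)."""
    cell = barcode_read[:CELL_BARCODE_LENGTH]
    umi = barcode_read[CELL_BARCODE_LENGTH:CELL_BARCODE_LENGTH + UMI_LENGTH]
    return cell, umi

def count_reads_and_umis(tagged_reads):
    """tagged_reads: iterable of (barcode_read, gene) pairs.

    Returns two nested dicts indexed as table[cell][gene]:
    the read counts and the counts of distinct UMIs."""
    read_counts = defaultdict(lambda: defaultdict(int))
    umis_seen = defaultdict(lambda: defaultdict(set))
    for barcode_read, gene in tagged_reads:
        cell, umi = split_tag(barcode_read)
        read_counts[cell][gene] += 1      # every read counts
        umis_seen[cell][gene].add(umi)    # duplicates of a UMI collapse
    umi_counts = {
        cell: {gene: len(umis) for gene, umis in genes.items()}
        for cell, genes in umis_seen.items()
    }
    read_counts = {cell: dict(genes) for cell, genes in read_counts.items()}
    return read_counts, umi_counts

if __name__ == "__main__":
    reads = [
        ("AAAAAAAAAAAA" + "ACGTACGT", "gene1"),
        ("AAAAAAAAAAAA" + "ACGTACGT", "gene1"),  # amplification duplicate
        ("AAAAAAAAAAAA" + "TTTTCCCC", "gene1"),
        ("CCCCCCCCCCCC" + "ACGTACGT", "gene2"),
    ]
    print(count_reads_and_umis(reads))
```

In the toy input, three reads of the first cell map to the same gene but carry only two distinct UMIs, so the read count and the transcript count of that gene differ.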

3.4 Quality Control

Gene expression data from single cells vary in the depth and quality of transcriptome information. It is important to account for such differences, and it is crucial to remove low-quality cells from the analysis. Low-quality data may be caused by a failure to capture a cell, capture of multiple cells, apoptotic cells, degraded RNA, low library complexity, or low sequencing depth.

Dying cells have been shown to be associated with an increased ratio of reads mapping to mitochondrial genes relative to the remaining endogenous genes. It is assumed that the membrane of apoptotic cells is leaky, so that cytoplasmic RNA is lost while the mitochondrial RNA is retained, causing an overrepresentation of mitochondrial transcripts [35]. Since the ratio of mitochondrial genes to endogenous genes within a cell is highly dependent on the overall quality and experimental setup, the threshold should be determined in consideration of the distribution of this ratio among all cells within a dataset. However, the ratio usually lies in a range of 5–20%. All cells exhibiting an exceptionally high ratio should be considered apoptotic and removed from further analysis.

Moreover, cells with only a very low number of identified genes should also be removed from the analysis. However, it is important to keep in mind that different cells will vary in their number of identified genes, and great care must be taken not to bias the analysis by removing certain cell types with lower intrinsic complexity.

Gene expression profiles generated from single cells contain a clear majority of zero measurements, representing either a failure of mRNA detection or a true absence of transcription of a gene. It was shown that variance is highly correlated with the mean expression [36]. It is recommended to remove lowly expressed genes to limit their effect on the variance within a dataset. The identification of these genes can be performed based on the number of cells that express a certain gene. If a gene is expressed in fewer than 1% of the cells, it is unlikely that this gene contributes to the overall variability.

Since this data quality check is of very high importance, several pipelines and tools like Seurat [37], Cellity [35], or SCell [38] have been developed. These tools suggest filtering low-quality cells by analyzing multiple QC parameters and can detect outliers in the data.
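The filtering criteria described in this section can be sketched on a small count table. The code below is an illustration under stated assumptions rather than the procedure of any of the tools named above: the mitochondrial fraction is computed relative to all counts of a cell, mitochondrial genes are recognized by a hypothetical "mt-" name prefix, and the thresholds are placeholders that should be chosen from the distributions observed in the actual dataset.

```python
def cell_metrics(counts, genes, mito_prefix="mt-"):
    """counts[g][c] is the count of gene g in cell c.

    Returns per-cell (mitochondrial fraction, number of detected genes)."""
    n_cells = len(counts[0])
    metrics = []
    for c in range(n_cells):
        total = sum(row[c] for row in counts)
        mito = sum(
            row[c] for gene, row in zip(genes, counts)
            if gene.lower().startswith(mito_prefix)
        )
        detected = sum(1 for row in counts if row[c] > 0)
        fraction = mito / total if total > 0 else 0.0
        metrics.append((fraction, detected))
    return metrics

def filter_cells(counts, genes, max_mito=0.20, min_genes=200):
    """Return indices of cells that pass both cell-level filters."""
    return [
        c for c, (fraction, detected) in enumerate(cell_metrics(counts, genes))
        if fraction <= max_mito and detected >= min_genes
    ]

def filter_genes(counts, genes, min_cell_fraction=0.01):
    """Return names of genes detected in at least the given fraction of cells."""
    n_cells = len(counts[0])
    return [
        gene for gene, row in zip(genes, counts)
        if sum(1 for value in row if value > 0) / n_cells >= min_cell_fraction
    ]

if __name__ == "__main__":
    genes = ["Actb", "Gapdh", "mt-Co1", "Cd3e"]
    counts = [  # rows: genes, columns: three cells (toy values)
        [10, 1, 8],
        [5, 0, 6],
        [1, 9, 2],
        [0, 0, 0],
    ]
    print(cell_metrics(counts, genes))
    # min_genes is lowered here only because the toy table has four genes.
    print(filter_cells(counts, genes, max_mito=0.20, min_genes=2))
    print(filter_genes(counts, genes, min_cell_fraction=0.01))
```

In the toy table, the second cell is dominated by mitochondrial counts and is removed, and the gene that is detected in no cell is dropped.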

3.5 Normalization

An indispensable step for proper scRNA-Seq analysis is the normalization of the data to make the transcriptomes of the cells comparable to each other. Some of the normalization tools were initially developed for bulk data but have been successfully applied to single-cell data as well. Generally, one has to distinguish between within-sample normalization, which corrects for gene-specific biases, and between-sample normalization, which adjusts for distributional differences across cells (e.g., read/transcript number). The latter type of normalization is the focus of this section.

A popular normalization method that is commonly applied to single-cell technologies with full-transcript coverage (such as SMART-Seq2 [2]) is the TPM (transcripts per million) method. This method is used to normalize for differences in sequencing depth across samples (between-sample normalization) and is related to the RPM (reads per million) method, whose principle is commonly applied to single-cell data. However, TPM also considers the gene length (within-sample normalization) and thus is very similar to RPKM (reads per kilobase per million mapped reads) and FPKM (fragments per kilobase per million mapped fragments). For the sake of cell-to-cell comparison, however, TPM is more powerful, because TPM values sum to the same total in every cell. The main disadvantage of these estimates is that they can be dominated by a handful of highly expressed genes, which can bias the downstream analysis [39]. A further comparison of RPM, TPM, RPKM, and FPKM can be found at, for example, https://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/.

As already outlined above (see Subheading 2.2), batch effects are a serious issue in single-cell mRNA-Seq analysis and can have deleterious effects on analysis and biological interpretation of single-cell mRNA data. An easy way of accounting for such confounding effects in silico is to use genes that show stable expression across the cells in the dataset (housekeeping genes). However, especially for heterogeneous cell populations, it is often difficult to find stable housekeeping genes.

To account for technical variation even in a heterogeneous dataset, external synthetic spike-in RNA, for example from the External RNA Control Consortium (ERCC), can be used. ERCC spike-in RNA is added in equal amounts to each cell lysate. Therefore, differences in the final libraries can be exploited to assess technical sources of variation and to normalize gene expression data across cells. Linear regression between read counts and the known concentrations of spike-in RNA even enables the conversion of read counts into transcript counts [40]. However, the use of ERCC spike-in RNA is currently restricted to well-based technologies such as SMART-Seq2 [2] or MARS-Seq [27].

For technologies that capture the 3′ end (e.g., Drop-Seq [4], Seq-Well [5], or sci-RNA-Seq [6]), another type of normalization strategy can be introduced, namely the use of UMIs. These sequences barcode individual mRNA molecules and hence can be used to account for amplification bias, which is a major source of technical variation.

Because of technical limitations, UMIs are not usable for all single-cell technologies. Moreover, ERCC spike-in RNAs differ from mammalian genes in terms of length, lack of 5′ capping, and lack of binding sites for RNA-binding proteins and thus might not efficiently mimic endogenous RNA [41]. A recently developed algorithm called Census [40] enables the conversion of TPM/FPKM values (see above) into quasi-molecule counts without the need for either UMIs or ERCC spike-in RNA. However, this algorithm cannot be used to efficiently control for amplification bias. Nevertheless, benchmarking showed that the results of differential analysis resemble the results obtained by using real spike-ins.
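The difference between RPM, RPKM, and TPM is easiest to see in code. The sketch below uses the commonly cited definitions of these measures; the function names, the toy counts, and the gene lengths are hypothetical, and refinements such as effective transcript length are ignored.

```python
def rpm(counts):
    """Reads per million: scale one cell's counts by its sequencing depth."""
    total = sum(counts)
    return [count / total * 1e6 for count in counts]

def rpkm(counts, lengths_bp):
    """Reads per kilobase per million: depth first, then gene length."""
    return [
        value / (length / 1e3)
        for value, length in zip(rpm(counts), lengths_bp)
    ]

def tpm(counts, lengths_bp):
    """Transcripts per million: gene length first, then depth."""
    rates = [count / (length / 1e3) for count, length in zip(counts, lengths_bp)]
    total = sum(rates)
    return [rate / total * 1e6 for rate in rates]

if __name__ == "__main__":
    lengths = [1000, 2000, 500]   # gene lengths in base pairs (toy values)
    cell_a = [10, 20, 5]          # read counts of one cell
    cell_b = [100, 200, 50]       # same composition, tenfold depth
    print(tpm(cell_a, lengths), tpm(cell_b, lengths))
    print(sum(rpkm(cell_a, lengths)), sum(tpm(cell_a, lengths)))
```

In this toy example both cells have the same composition, so TPM returns the same values for both despite the tenfold difference in depth. The TPM values of each cell sum to one million, which is not guaranteed for RPKM.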

3.6 Accounting for Other Unwanted Sources of Variation

Besides the aforementioned technical sources of variation, there are additional factors that might contribute substantially to the variability of gene expression. For some datasets, cell-to-cell variation can also reflect the cell cycle stage at which a cell was captured. In more detail, a proliferating cell upregulates its gene expression and hence will contribute more to the read pool in the single-cell library than a resting cell. Although normalization (RPM, RPKM, FPKM, and TPM) will account for some of this variability, it will not be able to remove all of the cell cycle-related variability. One possibility to account for cell cycle-related effects is the use of scLVM [42], which builds on a latent-variable model based on Gaussian processes. Although scLVM was designed to account for cell cycle-induced variation, it can also be used to correct for other sources of variation that can be modeled by latent variables. Inspired by scLVM, Satija and colleagues implemented the modeling of latent variables in their single-cell analysis pipeline Seurat [37] (see below) and used them to account for differences in the alignment rate, expression differences, and mitochondrial gene expression. In principle, latent variable models can also be used to account for batch effects in datasets (see also Subheading 2.2).

Batch effects are also a major problem in meta-analyses, in which different single-cell datasets are combined in one analysis. Butler et al. recently developed an elegant method [37] that circumvents the limited comparability across datasets caused by batch effects. This method identifies a shared structure (common sources of variation) between the different datasets based on canonical correlation analysis (CCA), followed by an alignment based on this structure. This CCA-based procedure enables the elimination of batch effects across datasets and was successfully benchmarked on datasets confounded by different treatment conditions, technologies, and even species.
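The general idea of removing an unwanted source of variation can be illustrated with a far simpler stand-in than the latent variable models or the CCA-based alignment described above: ordinary least-squares regression of one gene's expression on a single known covariate (for example a hypothetical cell cycle score), keeping only the residuals. The function name, the covariate, and the toy values are assumptions made for illustration; the sketch does not reproduce scLVM or the CCA-based method.

```python
def regress_out(expression, covariate):
    """Remove the linear effect of one covariate from one gene.

    expression, covariate: equally long lists with one value per cell.
    Returns the residuals of an ordinary least-squares fit with intercept."""
    n = len(expression)
    mean_x = sum(covariate) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    if sxx == 0:
        # Constant covariate: only the mean can be removed.
        return [y - mean_y for y in expression]
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, expression)]

if __name__ == "__main__":
    score = [0.0, 1.0, 2.0, 3.0]        # hypothetical per-cell covariate
    expression = [2.0, 1.0, 4.0, 3.0]   # toy expression values of one gene
    print(regress_out(expression, score))
```

The residuals are by construction uncorrelated with the covariate, so any variation that remains cannot be explained by a linear effect of that covariate.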

3.7 Imputation of Missing Data

A prominent feature of scRNA-Seq is the sparsity of the data, that is, a vast proportion of zero values. This so-called zero inflation results in a bimodal distribution of gene expression even within one cell group [43]. Shalek and colleagues recently confirmed via scRNA-Seq and RNA fluorescence in situ hybridization that the expression of genes can occur in a bursting manner, meaning that some genes are expressed in an on/off fashion [44]. Another reason for a bimodal distribution might be the so-called dropout events. A dropout event occurs when a transcript is not detected at all for technical reasons. That is, some zeros in the count table do not mean that the respective gene is not expressed, but rather that its transcript was lost, either during library production or because of sequencing issues. A clear indication that dropout events are one of the major issues in the single-cell mRNA-Seq universe is the remarkably large number of methods developed in recent years to impute missing values.

In statistics, the term imputation refers to the process of handling missing data by replacing them with the values that most likely represent the true values. The power of imputed data has been benchmarked in many cases. For example, CIDR [45] (Clustering through Imputation and Dimensionality Reduction) was the first dimensionality reduction and clustering tool that used imputed single-cell mRNA data and relied on a simple PCA-related method. Interestingly, CIDR outperforms state-of-the-art methods like ZIFA [46] or RaceID [47], which were both explicitly developed to handle the complex nature of single-cell transcriptome data. Since the introduction of CIDR, a vast number of other tools have been developed to denoise and impute single-cell data with continuously rising precision (e.g., MAGIC [48], SAVER [49], and scImpute [50]).

4 Exploring Cellular Heterogeneity

4.1 Dimensionality Reduction

A count table may contain counts for tens of thousands of different genes, more if we count different isoforms separately. Because every cell is characterized by a large number of values, we say that single-cell data is high-dimensional. This high-dimensional data often contains redundant information and can be summarized in a lower-dimensional space by applying a dimensionality reduction algorithm. Dimensionality reduction serves two purposes. First, it can be used to summarize the data by plotting it in a lower-dimensional (two- or three-dimensional) space. Second, it can be used as a preprocessing step before applying other algorithms (like clustering) to improve their efficiency, both computationally and by removing noise from the data.

In the latter case we reduce the number of dimensions to something between 10 and 100 with principal component analysis (PCA) [51]. PCA is a fast and scalable algorithm that finds the directions in the original space along which the data varies the most. These directions (principal components, or PCs) are sorted from the most to the least varying. Based on how much variation is captured by each PC, we then specify how many PCs we wish to keep and discard the rest. Our data is then projected onto the remaining PCs, in effect reducing the number of dimensions. Assuming that the biologically relevant information is responsible for most of the variation in our data, by removing low-variance components we discard only technical noise. The output of PCA is therefore used as an initial denoising step, and the resulting medium-dimensional data can be further analyzed. Another benefit of applying PCA is that some downstream algorithms are computationally inefficient when applied to the raw high-dimensional data. The last thing to note is that the decision of which PCs to keep is largely subjective. Keeping more PCs means that less information is lost, but also that more noise is retained. The decision is made by looking not only at the variation associated with each PC but also at the results of the downstream analysis. It is a thin line between extracting biologically relevant information and biasing the data toward the expected results.

The other goal of dimensionality reduction, visualization, is typically accomplished by much more sophisticated algorithms. PCA is normally insufficient for this purpose, unless the data is extremely simple, because PCA is a linear transformation: it only shifts, rotates, and scales the original space. If the data is folded on itself or forms more complex structures, PCA alone cannot simplify these structures enough to present them in a two- or three-dimensional plot. Instead, nonlinear dimensionality reduction methods have to be applied, of which the most popular is t-SNE [52]. t-SNE computes the local relationships between points in the original high-dimensional space and places the points in a lower-dimensional space (normally two- or three-dimensional) in such a way as to preserve these local relations. Relying only on the local structure of the data allows it to simplify complex structures and lay them out in a 2D space in a clear fashion. As with other algorithms, information is lost during dimensionality reduction, and t-SNE is no exception.
It sacrifices the global structure of the data to preserve the local relationships. As a result, t-SNE results can be difficult to interpret [53]. For example, if after performing t-SNE the cells are depicted as several separate clusters, there is no way to tell how these clusters relate to each other in the original space. The output also relies heavily on the parameters selected by the user, most significantly the perplexity. These issues are common to other nonlinear dimensionality reduction algorithms, which raises the question of to what degree these methods should be used in the analysis (other than for visualization). The danger here is twofold: we can lose biologically relevant information after applying dimensionality reduction or, perhaps even more dangerously, we can overinterpret the results and see structures that are not really there. One way to counter these issues is to use several different algorithms for visualization; this is typically done with PCA and t-SNE, but other existing algorithms can also be used. At the very least, one should generate several t-SNE plots by varying its parameters and present them side by side.

It is worth noting that t-SNE is somewhat limited in the scale of the data it can process. For example, although it takes just a few minutes to process a few thousand cells, it takes a few days to process a million. Although the latter is not yet the size of a typical single-cell data set, it will become increasingly difficult to analyze single-cell data efficiently with t-SNE. There are, however, variations of t-SNE based on deep learning that are computationally more efficient [54].

Aside from t-SNE, other dimensionality-reduction algorithms are being applied to single-cell data, often offering better performance or interpretability. There has been a recent surge of autoencoder-based algorithms (autoencoders build upon deep-learning techniques) such as scVI [55], DCA [56], and VASC [57]. These algorithms offer much better scalability, as they can be trained on small chunks of the whole data set (minibatches). Autoencoders generalize well to unseen data points, and they can be used for batch-effect removal or imputation: given the position of a cell in a low-dimensional space, they can be used to fill in the dropout counts. Another important class of algorithms is based on diffusion maps [58], which cope well with visualizing cell lineages [59]. Last but not least, the recently introduced UMAP [60, 61] can be used as a more scalable alternative to t-SNE.
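The PCA-based preprocessing described in this section can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the data are simulated, the helper name `pca_reduce` and the 90% cumulative-variance threshold are choices made for this sketch, and real pipelines usually normalize and log-transform the counts before this step.

```python
import numpy as np


def pca_reduce(data, variance_to_keep=0.90):
    """Project cells onto the leading PCs explaining the requested variance.

    data: 2D array, rows are cells, columns are genes.
    Returns the projected data and the explained-variance ratio of every PC.
    """
    centered = data - data.mean(axis=0)  # PCA works on centered features
    # Singular value decomposition: the rows of vt are the principal
    # directions, already sorted from the most to the least varying.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    # Smallest number of PCs whose cumulative variance reaches the threshold.
    n_keep = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    n_keep = min(n_keep, len(explained))
    projected = centered @ vt[:n_keep].T
    return projected, explained


# Simulated data: 200 cells, 50 genes, driven by 3 hidden factors plus noise.
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 3))
loadings = rng.normal(size=(3, 50))
data = factors @ loadings * 5.0 + rng.normal(size=(200, 50))

projected, explained = pca_reduce(data, variance_to_keep=0.90)
print("kept PCs:", projected.shape[1])
print("variance explained by the first 5 PCs:", np.round(explained[:5], 3))
```

Choosing the threshold is exactly the subjective decision discussed above: raising `variance_to_keep` retains more PCs and therefore more information, but also more noise.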

4.2 Developmental Trajectories

For an analyst of single-cell data, it is important to keep in mind that not every algorithm is suitable for every biological problem of interest. For example, if one is interested in the transitional states of cells, it is not advisable to use t-SNE [52] as the dimensionality reduction method. Although t-SNE is powerful in revealing the clustered structure of a dataset, the position of the clusters in the final t-SNE map has no real meaning, so close clusters do not indicate close biological relationships. To enable the inference of dynamic biological processes (e.g., cell cycle, cell activation, or differentiation), a plethora of different approaches have been developed to model such trajectories.

The assumption of these methods is that the recorded single cells are at different stages of the dynamic process and hence the trajectory can be computationally modeled by taking the information of all single cells into account. To this end, the cells are ordered along a pseudotime in a trajectory, which can have a simple linear shape but can also form complex bifurcated structures such as developmental trees. By building a trajectory, the analyst can answer different biological questions. In the context of differentiation trajectories, for example, these include the identification of rare precursor cells or of the stage at which a bifurcation occurs, that is, the point in pseudotime where cells undergo a fate decision and branch into distinct differentiation directions.

For many of the current trajectory methods, prior knowledge (e.g., marker genes or starting cells from which the trajectory will originate) is required. Although prior knowledge can be helpful in guiding the construction of the biologically most relevant trajectory, it can also be disadvantageous when it provides noisy or erroneous information. Therefore, some tools do not rely on any prior knowledge but build the trajectory in a purely data-driven way (e.g., Monocle DDRTree [62], Sincell [63], and TSCAN [64]).

A recent paper by Saelens et al. comprehensively assessed the performance and robustness of different trajectory inference tools [65]. Most of the evaluated methods worked best for datasets containing the topology type they were designed to handle. For example, methods designed for linear trajectories commonly performed best for datasets representing this type of structure. Consequently, an analyst needs to know the underlying topological structure of the dataset a priori, which is often difficult. Nevertheless, Saelens et al. provide guidance in the form of a decision tree to help users decide which trajectory inference method is most suitable for the dataset of interest. For example, if the trajectory is expected to have a linear topology, SCORPIUS [66] is the method of choice. Moreover, the authors recommend using reCAT (https://github.com/tinglab/reCAT) for cycle topologies, Slingshot [67] for bifurcated trajectories, and Monocle DDRTree for complex tree trajectories. In the future, methods will be needed that are not designed for a specific topology and hence can efficiently model biologically relevant trajectories without the need for a priori knowledge.
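The ordering of cells along a pseudotime can be illustrated for the simplest case, a linear trajectory. The sketch below simulates cells captured at hidden time points and recovers their order from the first principal component. It is a toy stand-in for illustration only and not the algorithm of SCORPIUS, Monocle, or any other tool named above; the simulated data, the function name `linear_pseudotime`, and the use of the first PC are assumptions of this example.

```python
import numpy as np


def linear_pseudotime(expression):
    """Order cells along the first principal component, scaled to [0, 1]."""
    centered = expression - expression.mean(axis=0)
    # The first right singular vector is the direction of largest variation.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    return (scores - scores.min()) / (scores.max() - scores.min())


# Simulate 100 cells captured at random stages of a gradual process.
rng = np.random.default_rng(1)
true_time = rng.uniform(0.0, 1.0, size=100)
# 20 genes: half increase and half decrease along the process.
slopes = np.concatenate([np.full(10, 4.0), np.full(10, -4.0)])
expression = np.outer(true_time, slopes) + rng.normal(scale=0.5, size=(100, 20))

pseudotime = linear_pseudotime(expression)
# The sign of a principal component is arbitrary, so the inferred pseudotime
# may run backwards; we therefore compare the absolute correlation.
correlation = abs(np.corrcoef(pseudotime, true_time)[0, 1])
print("absolute correlation with the hidden time:", round(correlation, 3))
```

The arbitrary direction of the recovered axis is one reason why many trajectory methods ask for a starting cell or marker genes as prior knowledge.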

4.3 Clustering

Clustering algorithms serve to label similar cells in preparation for further analysis (counting, comparing differentially expressed genes, etc.). Clustering is normally performed on a 2D representation of the data, which means that the results rely heavily on the dimensionality-reduction algorithm. Clustering based on t-SNE [52] results is especially dangerous here, as it may produce artificial clusters that do not reflect the underlying biology. Therefore, there is a recent trend in single-cell analysis to perform clustering on the high- or intermediate-dimensional data (say, the first 50 components of PCA [51]), leaving the nonlinear dimensionality reduction purely as a visualization technique and completely removing its influence. Graph-based clustering algorithms like Louvain clustering [68] (implemented in Seurat [37] and scanpy [25]) and SNN-Cliq [69] belong to this group.

Of the clustering algorithms used on a 2D representation of the data, the most popular are k-means and (H)DBSCAN. The k-means algorithm requires the expected number of clusters as an input, so it requires prior knowledge of the outcome, which is rarely available. On top of that, there are known problems with applying it to data of nonuniform density or where clusters have dissimilar abundances. The second method, DBSCAN [70], is a density-based algorithm that finds regions of high density and expands from there to all reachable points, assigning them to the same cluster and classifying points that cannot be reached from any high-density region as outliers. The downside of this method is that the density parameter must be specified a priori and fine-tuned for the given data set. HDBSCAN addresses this issue by using local density estimates, building a hierarchy of clusters on top of them, and using a more sophisticated cluster selection approach. All three algorithms mentioned above are computationally efficient but depend largely on the performance of the upstream dimensionality reduction algorithm. Effectively, the latter groups the points into clusters, and the clustering algorithm only labels the groups that were found.

The above list is by no means complete and only mentions the most popular algorithms. As with the previously discussed parts of the pipeline, like dimensionality reduction or trajectory reconstruction, the choice of the clustering algorithm is dictated by the specifics of the data. There are also similar dangers caused by the abundance of algorithms and their parameters: users can apply a range of clustering methods and pick the results that coincide with their intuition, thereby biasing the results. This can be alleviated to a degree by comparing different clusterings using objective internal and external quality measures such as silhouette scores or adjusted mutual information.
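As a concrete example of an internal quality measure, the sketch below computes the mean silhouette width for two alternative labelings of the same toy data. The points, the two labelings, and the function name `mean_silhouette` are made up for illustration, and established libraries ship their own tested implementations. The score only compares how compact and well separated the clusters are; it says nothing about their biological meaning.

```python
from math import dist
from statistics import mean


def mean_silhouette(points, labels):
    """Mean silhouette width; ranges from -1 (poor) to 1 (well separated)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    widths = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            widths.append(0.0)  # convention for single-member clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance of the point to itself contributes zero to the sum).
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            mean(dist(point, other) for other in members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(widths)


# Toy 2D embedding with two obvious groups of cells.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.9, 5.3)]
two_clusters = [0, 0, 0, 1, 1, 1]
three_clusters = [0, 0, 2, 1, 1, 1]  # splits the first group artificially

print("two clusters:  ", round(mean_silhouette(points, two_clusters), 3))
print("three clusters:", round(mean_silhouette(points, three_clusters), 3))
```

The artificial split of the first group lowers the score, which is the kind of objective signal that can help to avoid picking a clustering merely because it matches one's expectations.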

4.4 Identifying Subpopulations

After identifying groups of cells that exhibit a high similarity in their gene expression profiles, it can often be helpful to link the identity of these clusters to the established knowledge of cellular biology. Visualizing the expression of known marker genes in the respective clusters of cells, for example in a violin plot or by color-coding the cells in their low-dimensional representation (PCA or t-SNE plot), is a quick and easy way to link a priori knowledge to the single-cell mRNA-Seq data. However, it is important to keep in mind that lowly expressed genes in particular might be affected by dropout events or other technical noise (see also Subheading 3.7). Therefore, rather than evaluating single marker genes, we recommend using sets of genes, often referred to as cellular gene signatures, to assess the biological identity of the clusters.

The current knowledge of cell types and their states is limited; thus, knowledge-driven classification must necessarily fail for undescribed cell types. However, due to its unbiased and encompassing nature, single-cell mRNA-Seq presents unprecedented capabilities to readily identify novel cell types and thus expand the knowledge base [71].

One strategy to characterize groups of cells of unknown identity is based on the unbiased identification of marker genes specifically expressed in these clusters (often determined by differential gene expression analysis). Given such a list of genes, we can refer to the literature to further determine the characteristics of cells from this cluster. Moreover, the identified marker genes can be used for Gene Ontology enrichment analysis (GOEA) or gene set enrichment analysis (GSEA) to test whether they significantly overlap with gene ontology terms or other gene sets (e.g., previously identified lists of markers from population-based RNA-Seq). This way, we can characterize the new cell type by its function and its similarities to other cells. The AUCell package (https://github.com/aertslab/AUCell) is designed to analyze the state of gene sets in single-cell mRNA-Seq data.
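The overlap test behind such an enrichment analysis can be illustrated with a one-sided hypergeometric test. The sketch below is a minimal illustration and not the procedure of any particular GOEA or GSEA tool (GSEA in particular is rank-based and works differently); the gene names, the gene set, and the function name `enrichment_p_value` are placeholders invented for the example, and real analyses additionally correct for testing many gene sets.

```python
from math import comb


def enrichment_p_value(markers, gene_set, universe):
    """Probability of seeing at least the observed overlap by chance.

    markers:  marker genes identified for one cluster
    gene_set: annotated genes (e.g., members of one ontology term)
    universe: all genes that could have been detected
    """
    universe = set(universe)
    markers = set(markers) & universe
    gene_set = set(gene_set) & universe
    overlap = len(markers & gene_set)

    total, annotated, drawn = len(universe), len(gene_set), len(markers)
    # Sum the hypergeometric probabilities for overlaps >= the observed one.
    p_value = sum(
        comb(annotated, k) * comb(total - annotated, drawn - k)
        for k in range(overlap, min(annotated, drawn) + 1)
    ) / comb(total, drawn)
    return overlap, p_value


# Placeholder data: 20 detectable genes, a 5-gene set, and 5 cluster markers.
universe = [f"gene{i}" for i in range(20)]
gene_set = ["gene0", "gene1", "gene2", "gene3", "gene4"]
markers = ["gene0", "gene1", "gene2", "gene3", "gene10"]

overlap, p_value = enrichment_p_value(markers, gene_set, universe)
print(f"overlap = {overlap}, p = {p_value:.5f}")
```

A small p-value indicates that the markers overlap with the gene set more than expected by chance, given the chosen universe; the choice of that universe (here all 20 placeholder genes) strongly influences the result.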

5 Analytical Platforms for Single-Cell RNA-Seq Data

As discussed in the previous chapters, scRNA-Seq allows for comprehensive transcriptome profiling of thousands to millions of individual cells. Such an analytical capacity entails the production of large amounts of data and the challenge of deriving biologically meaningful interpretations from it. Bench scientists eager to deploy these new and powerful technologies often lack the expertise to handle complex datasets, as computational methods mostly require a significant amount of programming skill. To tackle this issue, numerous research groups have made efforts to provide easy-to-use, integrated software solutions that make data readily accessible to a wider range of interested researchers with important biological questions.

Since the field of scRNA-Seq is still in its infancy, the development and refinement of experimental techniques and analytical approaches is progressing rapidly. Although new tools are published daily, most of them emphasize novel algorithms, data storage solutions, or new programming tools. This leaves ease of use lower on the priority list and, in effect, most of the new tools are difficult to use for people with no prior programming experience. The unintended effect is that most of the recently developed powerful tools are accessible only to bioinformaticians, while bench scientists are left with old and often outdated solutions. That said, there are notable exceptions: packages that, although designed for programmers, come with a large number of tutorials and clear and extensive documentation (such as Seurat and scanpy, mentioned below). Once the field has established common ground for the production and analysis of scRNA-Seq data and the speed of development has somewhat decelerated, user-friendliness will move more into focus.
Once the scRNA-Seq raw data has been processed into interpretable gene counts, biological questions are tackled with specialized analytical tools, including dimensionality reduction, clustering, and trajectory inference. One of the most widely used R-based toolkits for single-cell genomics is Seurat [37]. This software package, developed by researchers from New York University, was designed for quality control, analysis, and basic data exploration. Scanpy [25] is a similar toolkit covering preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing, and simulation of gene regulatory networks. Its highly scalable Python-based implementation can efficiently deal with very large datasets of more than one million cells. Both of these packages provide a range of alternative methods that can be used to programmatically compose a pipeline and to experiment with, for example, different dimensionality reduction methods.

Besides these well-documented but programmer-focused packages, other more interactive web-based analysis tools have been developed, which combine state-of-the-art analytical approaches with a graphical user interface. Analysis platforms such as FASTGenomics [72], Granatum [73], or ASAP [74] enable in-depth analysis of scRNA-Seq data without any programming skills. Additionally, these analysis platforms often provide computational resources to process and analyze your own data. Besides these few prominent examples, there is a long list of similar analysis packages, all of which have their particular strengths. Since new tools and platforms are constantly being developed, a printed list is necessarily incomplete. Therefore, we would like to direct the reader to online libraries, such as https://github.com/seandavi/awesome-single-cell, https://omictools.com/, or https://www.scrna-tools.org/, for a comprehensive and constantly updated overview of existing tools and algorithms.

As scRNA-Seq becomes more and more popular among biologists, the amount of published data is constantly growing.
To make published data sets easily accessible to the research community and to allow exploration and comparative (re)analysis, numerous online data repositories have been established. These not only enable access to processed data but also ensure standardized processing for direct comparability of different data sets and, in part, also provide means for visualization of the data. Single Cell Portal (https://portals.broadinstitute.org/single_cell), JingleBells (http://jinglebells.bgu.ac.il/), and SCPortalen (http://single-cell.clst.riken.jp/) are prominent examples.

6 Closing Remarks

As repeatedly mentioned in the sections above, there is no one-size-fits-all solution for analyzing single-cell data. An analyst must carefully choose which methods and algorithms to use at the various steps of the analysis. It is very likely that an inexperienced user will quickly be overwhelmed. Although some tools have recently emerged that offer a guided analysis of data, users will reach the limits of the software’s analytical capacities relatively fast. Therefore, we encourage any emerging analyst to learn the basics of a programming language, such as R, to open the door to a broader understanding of the analysis and hence to exhaust the possibilities of single-cell mRNA-Seq.

Acknowledgments

The authors would like to acknowledge Prof. Dr. med. Joachim L. Schultze for support and advice during the writing process. Moreover, the authors Paweł Biernat and Matthias Becker are supported by a grant from the Federal Ministry for Economic Affairs and Energy (BMWi Project FASTGenomics). The work of Jonas Schulte-Schrepping receives funding from the European Union’s Horizon 2020 research and innovation program under grant agreement no. 733100 (SYSCID). The DFG graduate program 2168/1 (Bonn and Melbourne International Research and Training Group, Bo&MeRanG) supports Patrick Günther.

References

1. Tang F, Barbacioru C, Wang Y et al (2009) mRNA-Seq whole-transcriptome analysis of a single cell. Nat Methods 6:377–382. https://doi.org/10.1038/nmeth.1315
2. Picelli S, Björklund ÅK, Faridani OR et al (2013) Smart-Seq2 for sensitive full-length transcriptome profiling in single cells. Nat Methods 10:1096–1098. https://doi.org/10.1038/nmeth.2639
3. Islam S, Kjällquist U, Moliner A et al (2011) Characterization of the single-cell transcriptional landscape by highly multiplex RNA-Seq. Genome Res 21:1160–1167. https://doi.org/10.1101/gr.110882.110
4. Macosko EZ, Basu A, Satija R et al (2015) Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161:1202–1214. https://doi.org/10.1016/j.cell.2015.05.002
5. Gierahn TM, Wadsworth MH, Hughes TK et al (2017) Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput. Nat Methods 14:395–398. https://doi.org/10.1038/nmeth.4179
6. Cao J, Packer JS, Ramani V et al (2017) Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357:661–667. https://doi.org/10.1126/science.aam8940
7. Cadwell CR, Palasantza A, Jiang X et al (2016) Electrophysiological, transcriptomic and morphologic profiling of single neurons using Patch-Seq. Nat Biotechnol 34:199–203. https://doi.org/10.1038/nbt.3445
8. Paul F, Arkin Y, Giladi A et al (2015) Transcriptional heterogeneity and lineage commitment in myeloid progenitors. Cell 163:1663–1677. https://doi.org/10.1016/j.cell.2015.11.013
9. Klein AM, Mazutis L, Akartuna I et al (2015) Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161:1187–1201. https://doi.org/10.1016/J.CELL.2015.04.044
10. Fan HC, Fu GK, Fodor SPA (2015) Combinatorial labeling of single cells for gene expression cytometry. Science 347:1258367. https://doi.org/10.1126/science.1258367
11. Goldstein LD, Chen Y-JJ, Dunne J et al (2017) Massively parallel nanowell-based single-cell gene expression profiling. BMC Genomics 18:519. https://doi.org/10.1186/s12864-017-3893-1
12. Dey SS, Kester L, Spanjaard B et al (2015) Integrated genome and transcriptome sequencing of the same cell. Nat Biotechnol 33:285. https://doi.org/10.1038/nbt.3129
13. Angermueller C, Clark SJ, Lee HJ et al (2016) Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat Methods 13:229. https://doi.org/10.1038/nmeth.3728

14. Hou Y, Guo H, Cao C et al (2016) Single-cell triple omics sequencing reveals genetic, epigenetic, and transcriptomic heterogeneity in hepatocellular carcinomas. Cell Res 26:304. https://doi.org/10.1038/cr.2016.23
15. Stoeckius M, Hafemeister C, Stephenson W et al (2017) Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 14:865. https://doi.org/10.1038/nmeth.4380
16. Kang HM, Subramaniam M, Targ S et al (2017) Multiplexed droplet single-cell RNA-sequencing using natural genetic variation. Nat Biotechnol 36:89–94. https://doi.org/10.1038/nbt.4042
17. Langmead B, Nellore A (2018) Cloud computing for genomic data analysis and collaboration. Nat Rev Genet 19:208–219. https://doi.org/10.1038/nrg.2017.113
18. Regev A, Teichmann SA, Lander ES et al (2017) The human cell atlas. Elife 6:e27041. https://doi.org/10.7554/eLife.27041
19. Beaulieu-Jones BK, Greene CS (2017) Reproducibility of computational workflows is automated using continuous analysis. Nat Biotechnol 35:342–346. https://doi.org/10.1038/nbt.3780
20. Bray NL, Pimentel H, Melsted P, Pachter L (2016) Near-optimal probabilistic RNA-Seq quantification. Nat Biotechnol 34:525–527. https://doi.org/10.1038/nbt.3519
21. Dobin A, Davis CA, Schlesinger F et al (2013) STAR: ultrafast universal RNA-Seq aligner. Bioinformatics 29:15–21. https://doi.org/10.1093/bioinformatics/bts635
22. Dutton G (2016) From DNA to diagnosis without delay. Genet Eng Biotechnol News 36:8–9. https://doi.org/10.1089/gen.36.05.03
23. Turakhia Y, Bejerano G, Dally WJ (2018) Darwin. In: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS ’18. ACM Press, New York, NY, pp 199–213
24. Lopez R, Regier J, Cole M et al (2017) A deep generative model for gene expression profiles from single-cell RNA sequencing
25. Wolf FA, Angerer P, Theis FJ (2018) SCANPY: large-scale single-cell gene expression data analysis. Genome Biol 19:15. https://doi.org/10.1186/s13059-017-1382-0
26. Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. https://doi.org/10.1093/bioinformatics/btu170
27. Jaitin DA, Kenigsberg E, Keren-Shaul H et al (2014) Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types. Science 343:776–779. https://doi.org/10.1126/science.1247651
28. Smith T, Heger A, Sudbery I (2017) UMI-tools: modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy. Genome Res 27:491–499. https://doi.org/10.1101/gr.209601.116
29. Parekh S, Ziegenhain C, Vieth B et al (2018) zUMIs: a fast and flexible pipeline to process RNA sequencing data with UMIs. bioRxiv:153940. https://doi.org/10.1101/153940
30. Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12:357–360. https://doi.org/10.1038/nmeth.3317
31. Kim D, Pertea G, Trapnell C et al (2013) TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol 14:R36. https://doi.org/10.1186/gb-2013-14-4-r36
32. Patro R, Duggal G, Love MI et al (2017) Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods 14:417–419. https://doi.org/10.1038/nmeth.4197
33. Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30:923–930. https://doi.org/10.1093/bioinformatics/btt656
34. Anders S, Pyl PT, Huber W (2015) HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics 31:166–169. https://doi.org/10.1093/bioinformatics/btu638
35. Ilicic T, Kim JK, Kolodziejczyk AA et al (2016) Classification of low quality cells from single-cell RNA-seq data. Genome Biol 17:29. https://doi.org/10.1186/s13059-016-0888-1
36. Grün D, Kester L, van Oudenaarden A (2014) Validation of noise models for single-cell transcriptomics. Nat Methods 11:637–640. https://doi.org/10.1038/nmeth.2930
37. Butler A, Hoffman P, Smibert P et al (2018) Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat Biotechnol 36:411. https://doi.org/10.1038/nbt.4096
38. Diaz A, Liu SJ, Sandoval C et al (2016) SCell: integrated analysis of single-cell RNA-seq data. Bioinformatics 32:2219–2220. https://doi.org/10.1093/bioinformatics/btw201

39. Vallejos CA, Risso D, Scialdone A et al (2017) Normalizing single-cell RNA sequencing data: challenges and opportunities. Nat Methods 14:565–571. https://doi.org/10.1038/nmeth.4292
40. Qiu X, Hill A, Packer J et al (2017) Single-cell mRNA quantification and differential analysis with Census. Nat Methods 14:309. https://doi.org/10.1038/nmeth.4150
41. Grün D, Van Oudenaarden A (2015) Design and analysis of single-cell sequencing experiments. Cell 163:799. https://doi.org/10.1016/j.cell.2015.10.039
42. Buettner F, Natarajan KN, Casale FP et al (2015) Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells. Nat Biotechnol 33:155–160. https://doi.org/10.1038/nbt.3102
43. Yu P, Lin W (2016) Single-cell transcriptome study as big data. Genomics Proteomics Bioinformatics 14:21
44. Shalek AK, Satija R, Adiconis X et al (2013) Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498:236–240. https://doi.org/10.1038/nature12172
45. Lin P, Troup M, Ho JWK (2016) CIDR: ultrafast and accurate clustering through imputation for single cell RNA-Seq data. bioRxiv. https://doi.org/10.1101/068775
46. Pierson E, Yau C (2015) ZIFA: dimensionality reduction for zero-inflated single-cell gene expression analysis. Genome Biol 16:241. https://doi.org/10.1186/s13059-015-0805-z
47. Grün D, Lyubimova A, Kester L et al (2015) Single-cell messenger RNA sequencing reveals rare intestinal cell types. Nature 525:251–255. https://doi.org/10.1038/nature14966
48. van Dijk D, Nainys J, Sharma R et al (2017) MAGIC: a diffusion-based imputation method reveals gene-gene interactions in single-cell RNA-sequencing data. bioRxiv:111591. https://doi.org/10.1101/111591
49. Huang M, Wang J, Torre E et al (2017) Gene expression recovery for single cell RNA sequencing. bioRxiv:138677. https://doi.org/10.1101/138677
50. Li WV, Li JJ (2018) An accurate and robust imputation method scImpute for single-cell RNA-seq data. Nat Commun 9:997. https://doi.org/10.1038/s41467-018-03405-7
51. Pearson K (1901) LIII. On lines and planes of closest fit to systems of points in space. London, Edinburgh, Dublin Philos Mag J Sci 2:559–572. https://doi.org/10.1080/14786440109462720
52. van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9:2579–2605. https://doi.org/10.1007/s10479-011-0841-3
53. Wattenberg M, Viégas F, Johnson I (2016) How to use t-SNE effectively. Distill 1:e2. https://doi.org/10.23915/distill.00002
54. Gisbrecht A, Schulz A, Hammer B (2015) Parametric nonlinear dimensionality reduction using kernel t-SNE. Neurocomputing 147:71–82. https://doi.org/10.1016/j.neucom.2013.11.045
55. Lopez R, Regier J, Cole MB et al (2018) Bayesian inference for a generative model of transcriptome profiles from single-cell RNA sequencing. bioRxiv:292037. https://doi.org/10.1101/292037
56. Eraslan G, Simon LM, Mircea M et al (2018) Single cell RNA-seq denoising using a deep count autoencoder. bioRxiv:300681. https://doi.org/10.1101/300681
57. Wang D, Gu J (2017) VASC: dimension reduction and visualization of single cell RNA sequencing data by deep variational autoencoder. bioRxiv:199315. https://doi.org/10.1101/199315
58. Haghverdi L, Buettner F, Theis FJ (2014) Diffusion maps for high-dimensional single-cell analysis of differentiation data. Bioinformatics 31:2989. https://doi.org/10.1093/bioinformatics/btv325
59. Haghverdi L, Büttner M, Wolf FA et al (2016) Diffusion pseudotime robustly reconstructs lineage branching. Nat Methods 13:845. https://doi.org/10.1038/nmeth.3971
60. McInnes L, Healy J (2018) UMAP: Uniform Manifold Approximation and Projection for dimension reduction
61. Becht E, Dutertre C-A, Kwok IWH et al (2018) Evaluation of UMAP as an alternative to t-SNE for single-cell data. bioRxiv:298430. https://doi.org/10.1101/298430
62. Trapnell C, Cacchiarelli D, Grimsby J et al (2014) The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat Biotechnol 32:381–386. https://doi.org/10.1038/nbt.2859
63. Juliá M, Telenti A, Rausell A (2015) Sincell: an R/Bioconductor package for statistical assessment of cell-state hierarchies from single-cell RNA-seq. Bioinformatics 31:3380–3382. https://doi.org/10.1093/bioinformatics/btv368

64. Ji Z, Ji H (2016) TSCAN: pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis. Nucleic Acids Res 44:e117–e117. https://doi.org/10.1093/nar/gkw430
65. Saelens W, Cannoodt R, Todorov H, Saeys Y (2018) A comparison of single-cell trajectory inference methods: towards more accurate and robust tools. bioRxiv:276907. https://doi.org/10.1101/276907
66. Cannoodt R, Saelens W, Sichien D et al (2016) SCORPIUS improves trajectory inference and identifies novel modules in dendritic cell development. bioRxiv:79509. https://doi.org/10.1101/079509
67. Street K, Risso D, Fletcher RB et al (2017) Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. bioRxiv:128843. https://doi.org/10.1101/128843
68. Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. https://doi.org/10.1088/1742-5468/2008/10/P10008
69. Xu C, Su Z (2015) Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics 31:1974–1980. https://doi.org/10.1093/bioinformatics/btv088
70. Ester M, Kriegel H-P, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining. AAAI Press, Palo Alto, CA, pp 226–231
71. Mass E, Ballesteros I, Farlik M et al (2016) Specification of tissue-resident macrophages during organogenesis. Science 353:aaf4238. https://doi.org/10.1126/science.aaf4238
72. Scholz CJ, Biernat P, Becker M et al (2018) FASTGenomics: an analytical ecosystem for single-cell RNA sequencing data. bioRxiv:272476. https://doi.org/10.1101/272476
73. Zhu X, Wolfgruber TK, Tasato A et al (2017) Granatum: a graphical single-cell RNA-Seq analysis pipeline for genomics scientists. Genome Med 9:108. https://doi.org/10.1186/s13073-017-0492-3
74. Gardeux V, David FPA, Shajkofci A et al (2017) ASAP: a web-based platform for the analysis and interactive visualization of single-cell RNA-seq data. Bioinformatics 33:3123–3125. https://doi.org/10.1093/bioinformatics/btx337

INDEX

A CRISPR ...... 395–405 CRISPRi ...... 395 Adherent culture ...... 185 CyTOF...... 285, 296 Alignment...... 42, 156, Cytosine deamination ...... 235 168, 169, 374, 434, 439–441, 443 Aliquots ...... 34, 37, 49, 75, 78, D 79, 85, 115, 124, 137, 148, 159, 179, 181, 189, 191, 195, 198, 220, 247, 249, 275, 288, 289, Data analysis ...... 20, 68, 159, 291, 292, 294, 296, 299, 301, 322, 323, 326, 287, 296, 299, 364, 391, 426, 427, 437 340, 350, 355, 356, 385, 391, 401, 403 Data normalization ...... 38, 286 Antibody detection ...... 379 Data preprocessing...... 429 Automation ...... 4, 28, 46 ddSEQ ...... 155–175 Dead cell removal...... 10–13, 15–18, 20, 21, 105 B Demultiplexing...... 55, 439–441 Differential expression ...... 170, 425–431, 451 BAM...... 374 Differentiation...... 177, 235, 251, 364, 446 Batch effects ...... 181, 435–438, 443, 447 Digestions...... 10–12, 17–20, Beads purification...... 60, 245–247, 279 58, 61, 69, 145, 229, 248, 306, 307, 312, 370, Biotechnology ...... 57 401 Bisulfite-free sequencing...... 252 Dimensionality reduction ...... 445–451 Bisulfite sequencing (BS-seq)...... 235, 251, 363 DNA methylation ...... 235–238, 240–249, 363 Blood ...... 10, 11, 13–16, DOP-PCR ...... 228 18, 155–175, 285–302, 306, 308–310, 313 Dose response ...... 187 C Droplets ...... 25, 73–75, 77, 79, 82–84, 87, 105, 106, 156, 162, 164, 170, CD14 monocytes ...... 157, 159, 161, 173, 296, 309 173, 230, 271, 361, 367, 395, 396 Cell barcoding ...... 135, 220 Droplet technologies ...... 25, 74, 156 CEL-Seq ...... 45–48, 54, 55, 57, 396 Drop-seq...... 25, 73–85, Chemical labelling...... 252, 254, 256, 257 395, 434, 436, 441, 443 Chemical-labelling-enabled C-to-T conversion sequencing (CLEVER-seq) ...... 252, E 253, 255, 260, 264 EdgeR ...... 426, 428–430 ChIPmentation ...... 269–281 4-Element bead calibrator solution (EQ-beads)...... 286, Chromatin ...... 236, 269, 271, 272 287, 296–298, 302 Chromatin immunoprecipitation (ChIP) ...... 269, Emulsion device ...... 
87 271, 272, 274–278, 280, 281 Epigenetic mark ...... 235, 236 Chromium10X ...... 87 Epigenetics ...... 269, 363, 436 Cloning...... 197, 396, 398–402 Extracellular matrix (ECM)...... 9, 188, 193 Clustering ...... 174, 175, 425, 445, 446, 448–450 F Contamination ...... 4, 6, 58, 134, 137, 148, 151, 181, 223, 233, 236, 247, FASTQ...... 156, 167, 169, 264, 437, 439–441 249, 298, 321, 325, 326, 328, 334, 338, 344, 5-formylcytosine (5fC) ...... 251–265 347, 350, 354–360, 369, 387, 405 5-methylcytosine (5mC)...... 235, 251, 252 Copy number variation (CNV)...... 227, 228, Flow-activated cell sorting analysis (FACS analysis) .... 73, 320, 321, 324, 332, 334–337, 344 227, 399, 403

Valentina Proserpio (ed.), Single Cell Methods: Sequencing and Proteomics, Methods in Molecular Biology, vol. 1979, https://doi.org/10.1007/978-1-4939-9240-9, © Springer Science+Business Media, LLC, part of Springer Nature 2019 457 SINGLE CELL METHODS:SEQUENCING AND PROTEOMICS 458 Index Flow cytometry ...... 10, 286, 287, 233, 252, 253, 255, 260–262, 264, 270, 272, 291, 305, 306, 309, 310, 326, 436 320, 325, 332, 342–347, 361, 436, 439, 441 Fluidigm C1 ...... 133, 134, 186, 380–383, 427 Liquid handling...... 4, 5, 28, 35, 36, 41, 321 Fluorescent probes...... 410–412, 416, 417 Live cells ...... 140, 150, 170, 291 Full-length ...... 25–43, 54, 88, 198, 320, 380, 434 Live imaging...... 409–420 Living mouse observation ...... 410 G Low-input ChIP...... 269–281 Gene expression ...... 10, 20, 25, 45, Lung...... 11, 12, 17–19, 417, 419 Lymph nodes...... 9, 11, 17, 20 68, 70, 111, 155, 173–175, 177–180, 185, 186, 235, 312, 319, 320, 364, 382, 390, 417, 425, M 435, 442, 443, 449 Genome and transcriptome sequencing Mass spectrometry (MS)...... 13, 18, 19 (G&T-seq) ...... 319, 321, 325 MaxPar™...... 285, 289, 296, 298 Genomes...... 68, 169, 227–233, Metal conjugated antibodies ...... 293 252, 253, 260, 264, 269, 272, 280, 281, Microfluidics...... 20, 21, 73–77, 319–362, 366, 374, 375, 431, 436, 437, 440, 441 82, 104, 106, 107, 112, 133–135, 148, 170, 193, Genome-wide ...... 111, 236, 269 194, 227, 228, 233, 271, 380 Genomics...... 3–8, 25, 88, 89, Microscopy ...... 289, 290, 366, 401, 405, 105, 106, 229, 231, 235, 236, 249, 252, 255, 409–412, 414–416, 418, 419 264, 319, 320, 324, 330, 334, 343, 344, 364, mRNA sequencing (mRNA-Seq)...... 87–92, 375, 426, 435, 437, 439, 440, 451 94–100, 102–109, 135, 137, 139–142, 150, 185–195, 367–370, 434–436, 441, 443, 445, H 449, 450, 452 Multi-omics profiling...... 364 HDF5 ...... 440 Helios™...... 290, 295, 296, 302 Multiple displacement amplification (MDA) .... 228, 320, Heterogeneity...... 
25, 57, 111, 321, 324, 325, 338–345, 362 Multiplexing ...... 27, 40, 236, 237, 397, 434, 437 156, 177, 236, 271, 364, 379, 425, 436, 445–450 High-throughput ...... 28, 32, 35, 320, 379, 420 Musculoskeletal disorders (MSD)...... 4 Histone marks ...... 272, 280 N Histone modifications...... 271, 277 ® Nextera XT kit ...... 28, 35, 36, 43 I Next generation sequencing (NGS) ...... 41, 156, 161, 236, 247, 252, 255, 264, 306, 312, 364, Illumina library...... 68, 90, 137, 146, 156, 170, 265, 270, 343–347 395, 439 Immune cells ...... 9, 10, 12, 13, Normalization ...... 38, 286, 287, 296, 302, 427, 431, 437, 441–444 18–21, 43, 133, 155, 156, 186, 227 Immune infiltrate ...... 305–314 P Immune system ...... 155, 157 Imprinting ...... 27, 235 Paired TCRαβ single-cell sequencing ...... 197, 219, 223 Imputation...... 436, 443–445, 447, 449 PCR amplification ...... 45, 58, In house...... 28, 30, 36, 40, 88, 102, 112, 149, 231, 232, 270, 271, 365, 367, 134, 186, 289, 294, 296 369, 370 In-vitro transcription (IVT) ...... 45, 46, 51, 55 Peripheral blood mononuclear cells Iridium cell ID marker (191/193 Ir)...... 286, 287, (PBMCs)...... 155–161, 164, 289, 294, 295, 301 167, 169, 170, 172–175, 287, 288, 290–292, 297, 300, 301 L Phenotype-genotype correlation...... 185 PicoPLEX ...... 228, 320, 321, Lanthanides ...... 285, 287, 298 Lentiviruses...... 396, 405 324, 325, 334–340, 344, 357 Library preparation ...... 26, 28, 30, Picowells ...... 111–119, 121–131 31, 35, 36, 42, 59, 60, 64–68, 74–76, 79, 81, 88, Pooling ...... 28, 31, 37, 46, 54, 271, 395 Post-Bisulfite Adaptor Tagging (PBAT)...... 236 104, 107, 112, 114, 125, 126, 135, 136, 144–146, 149, 150, 156, 187, 227, 229, 231, Protein-DNA interaction...... 269–281 SINGLE CELL METHODS:SEQUENCING AND PROTEOMICS Index 459 Protein quantification ...... 379 Single-cell tagged reverse transcription Protein-RNA correlation ...... 379 C1 (STRT-C1) ...... 134, 147 Proximity extension assay (PEA)...... 
380, 381 Single-nucleotide resolution...... 236 Pseudotime...... 446, 451 Single nucleotide variants (SNVs) detection ...... 27, 227, 228, 344 Q Smart-seq2...... 25–43, 222, 321, 333, Quality control (QC)...... 3, 6, 99, 364, 367–370, 396, 434, 443 103, 156, 247, 249, 295, 350, 351, 353, SNV detection...... 228 Spleen...... 9, 11, 16, 20, 300 357–360, 439, 442 Quantitative real-time PCR (qPCR)...... 108, 146, Study design ...... 435, 437 177, 178, 279, 280, 350 Subtype discovery ...... 73, 434 System biology ...... 73, 111 R T Real-time quantitative polymerase chain reaction (RT-qPCR) ...... 379, 380, 405 5’ Tag counting...... 134 Rhodium intercalator (103Rh) ...... 286, Tagmentation ...... 27, 28, 35, 36, 288, 289, 291, 296, 297 39, 40, 81, 125, 134, 144, 151, 164, 165, 192, RNAse-free ...... 3, 4, 6, 46, 58, 130, 270, 272, 275, 277, 278, 346, 365, 369, 370 148, 157, 158, 322, 354, 373 Targeted assays ...... 379 RNase H-dependent PCR (rhPCR) ...... 198, 219, 220 Target enrichment...... 228, 232 T cell receptor repertoire ...... 197 RNA-sequencing (RNA-Seq)...... 9–21, 25–43, 45, 57–70, 73, 111, 133, 149, 155, 156, 10x ...... 144 159–161, 197, 364, 374, 375, 395, 399, 403, Time course...... 186, 187, 415 Time-lapse imaging...... 186, 409, 425, 450–451 410, 416, 418, 419 S Tissue processing...... 10, 21 Tn5 transposase...... 27, 28, 30, 31, 3’ Sequencing...... 46, 87–92, 94–100, 102–109 36, 40, 43, 134, 144, 151, 272 Sequencing library preparation ...... 59, 156, Total RNA ...... 42, 59, 70, 227, 229, 231, 253, 320, 342 323, 326, 350, 354, 375 Seq-Well...... 111–119, 121–131, 434, 443 Transcription factors ...... 269, 271, Silencing ...... 235, 272 272, 277, 280, 300, 312 Single-cell Transcriptomics ...... 111, 178, 320, 379, 395, 436 analysis ...... 305, 379, 426, 448 Transcripts ...... 25, 46, DNA methylation ...... 235–238, 240–249 53, 55, 57, 58, 69, 74, 88, 134, 135, 170, 178, functional study...... 185 321, 442, 443 genomics...... 
3, 252, 435, 451 Transduction ...... 399, 403 imaging ...... 409–420 Transfection...... 401, 410, 417 isolation ...... 162, 186, 228–230, 366, 367, 434 Tumor immunology ...... 305 multi-omics...... 364 Tumor infiltrating lymphocytes (TILs) ...... 87–92, perturbation ...... 57, 395 94–100, 102–109 protein detection...... 380 2D/3D-cultured cell observation...... 411–413 RNA amplification ...... 45–55 sorting...... 310 U suspensions ...... 9, 10, 19, 78, 82, 93, 105, 107, 116, 139, 140, 156, 185–195, Unique molecular identifiers (UMIs) ...... 46, 55, 68, 88, 112, 134, 169, 427, 441, 443, 444 296, 305, 306, 312, 313 TCR sequencing...... 197 V transcriptomics ...... 111, 379, 395, 425–431 Single-cell RNA-seq (scRNA-seq) ...... 3, 4, 9–21, Virus production ...... 397, 399, 401 25–43, 45, 57–70, 73–85, 111–119, 121–131, 133, 134, 148, 155–175, 374, 375, 396, 399, W 403, 425, 441–443, 450, 451 Whole genome amplification (WGA) ...... 227, 228, Single-cell tagged reverse transcription 230, 231, 233, 252, 253, 255, 324, 334 (STRT-seq) ...... 133–141, 143–145, 147–152, 434