A Surrogate mRNA-Based Single Cell Assay for Rapidly Enumerating Rare Long-Term Hematopoietic Stem Cells

by

Gary Tran

A thesis submitted in conformity with the requirements for the degree of Master of Science

Department of Medical Biophysics University of Toronto

© Copyright by Gary Tran 2014

A Surrogate mRNA-Based Single Cell Assay for Rapidly Enumerating Rare Long-Term Hematopoietic Stem Cells

Gary Tran

Master of Science

Medical Biophysics University of Toronto

2014

Abstract

Long-term hematopoietic stem cells (LT-HSCs) exist at a frequency of 1 in 20,000 nucleated murine bone marrow cells. LT-HSCs sustain hematopoiesis lifelong. Their numbers in the donor samples determine the long-term efficacy of hematopoietic cell transplants. Additionally, there are ongoing efforts to identify conditions that expand LT-HSC numbers ex vivo, or to derive LT-HSCs from pluripotent sources. Accordingly, the ability to enumerate LT-HSCs rapidly would be valuable. Here, I developed a surrogate mRNA marker-based single cell assay to enumerate LT-HSCs rapidly.

The combined presence or absence of EPCR, Gata3, and IL-7Rα transcripts was found to mark LT-HSCs independent of their cell cycle state. A microfluidic system was developed to encapsulate single cells with qRT-PCR reagents into nanolitre microdroplets. qRT-PCR parameters were adapted to quantify the three targets simultaneously across a broad dynamic range, and were subsequently used in the microdroplets for single mRNA molecule and single cell analyses.


Acknowledgments

First and foremost, I would like to thank Dr. Norman Iscove, my supervisor, for his mentorship and support. His passionate and rigorous approach to science has brought about many insightful discussions that have shaped, and will continue to shape, my foundations in science and beyond. It has been a pleasure to work in his laboratory, and I am grateful for his time and commitment. I would also like to thank my committee members, Dr. Peter Zandstra and Dr. Mark Minden, for their guidance throughout my project.

Jennifer Ma, a friend and collaborator, is someone I must thank for pushing me through the long nights of experiments, troubleshooting, preparing reports/presentations, and yes, more troubleshooting. She has been a true collaborator in every sense of the word: an amazing individual with whom I had the pleasure of sharing the many lessons of failure as well as the many joys of success at each step forward in this project.

I would also like to thank all of the members of the Iscove lab: Catherine Frelin, Harvey Lim, Robert Herrington, Mary Barbara, and Salima Janmohamed. I am especially grateful to Catherine, who was there to discuss and develop my ideas. Thank you all for your advice, support, and companionship.


Table of Contents

Acknowledgments
Table of Contents
List of Abbreviations
List of Tables
List of Figures
List of Appendices
Chapter 1: Introduction
1.1 Hematopoietic stem cells
1.2 Assays to identify and characterize hematopoietic stem and progenitor cells
1.2.1 Colony-forming unit assays
1.2.2 Long-term culture-initiating cell assay
1.2.3 Phenotypic purification of LT-HSCs
1.2.4 Long-term transplantation assays
1.3 Methods of single cell analysis for rare cell detection
1.3.1 Flow cytometry and fluorescence activated cell sorting
1.3.2 Fluorescence in situ hybridization
1.3.3 Quantitative reverse transcription polymerase chain reaction and microfluidics
1.4 Experimental Rationale and Objectives
1.5 Thesis Outline
Chapter 2: Identification of stable LT-HSC markers
2.1 Identification of stable LT-HSC mRNA markers
2.1.1 Global transcript amplification
2.1.2 Primer design for global transcript amplification products
2.1.3 Semi-quantitative PCR
2.1.4 Immunophenotypic purification of multipotent populations
2.1.5 Cell culture of IT-HSCs and LT-HSC-enriched fractions
2.1.6 Isolation of progenitors and terminally differentiated cells
2.1.7 Utility of Gata3 as a surrogate marker
2.1.8 Complementary markers to exclude non-LT-HSCs that are Gata3+
2.2 Immunophenotypic analysis and prospective isolation of LT-HSCs using Gata3, EPCR and IL-7Rα as markers
2.3 Discussion
Chapter 3: Development of a multiplex one-step qRT-PCR for analyzing single cell gene expression in the microdroplets
3.1 One-step qRT-PCR to directly analyze single cell gene expression in microdroplets
3.1.1 Simulation of the microdroplet environment
3.1.2 Identification of singleplex qRT-PCR parameters to quantify mRNA levels directly from high cell lysate concentrations
3.2 Development of triplex parameters to analyze LT-HSC mRNA markers in the microdroplets
3.2.1 An experimental strategy to simulate single cell mRNA relative abundances
3.2.2 Generation of the in vitro transcripts
3.2.3 Characterization of the in vitro transcript combinations
3.2.4 Identification of triplex qRT-PCR parameters to quantify Gata3, EPCR and IL-7Rα mRNA in the microdroplets
3.3 Discussion
Chapter 4: Conclusions
References
Appendices


List of Abbreviations

5-FU        5-fluorouracil
APC         Activated protein C
BFU-E       Erythrocyte blast forming unit
BM          Bone marrow
BP          Binding protein
CD          Cluster of differentiation
cDNA        Complementary deoxyribonucleic acid
CFC         Colony forming cell
CFU-E       Erythrocyte colony forming unit
CFU-S       Spleen colony forming unit
CT          Threshold cycle
dNTP        Deoxynucleoside triphosphate
EPCR        Endothelial protein C receptor
EGI         EPCR+ Gata3+ IL-7Rα-
EST         Expressed sequence tag
FACS        Fluorescence activated cell sorting
FISH        Fluorescence in situ hybridization
Flt3L       Flt3 ligand
G-CSF       Granulocyte colony stimulating factor
GFP         Green fluorescent protein
Gpi1        Glucose phosphate isomerase 1
HSC         Hematopoietic stem cell
IL          Interleukin
IT          Intermediate-term
IVT         In vitro transcript
LDA         Limiting dilution analysis
LMPP        Lymphoid-primed multipotent progenitor
LSKR        Lineage- c-Kit+ Sca1+ Rhodamine123lo
LT          Long-term
LTC-IC      Long-term culture-initiating cell
Mac         Macrophage
Mast        Mast cell
Meg         Megakaryocyte
MPP         Mixed multipotent progenitor
mRNA        Messenger ribonucleic acid
Neut        Neutrophil
NK          Natural killer
PCR         Polymerase chain reaction
pE/Meg      Erythroid/megakaryocyte progenitor
pMac        Macrophage progenitor
pMeg        Megakaryocyte progenitor
pNeut       Neutrophil progenitor
pNeut/Mac   Neutrophil/macrophage progenitor
Poly(A)     Polyadenylation
RT          Reverse transcription
SCF         Stem cell factor
SNP         Single nucleotide polymorphism
ST          Short-term
TLR         Toll-like receptor
UCB         Umbilical cord blood


List of Tables

Table 2-1. Primer sequences for PCR on global amplification products.
Table 2-2. Cell surface markers and dye retention used to purify multipotent populations by FACS.
Table 2-3. Cell surface markers used to purify B- and T- lymphoid populations by FACS.
Table 3-1. Culture conditions, and cell diameters of cycling LT-HSCs and the hematopoietic cell lines used for the microdroplet cell lysate simulations
Table 3-2. Comparison of Gata3 and EPCR amplification efficiencies calculated by qPCR curve kinetics and the standard curve method
Table 3-3. Summary of singleplex qRT-PCR parameters
Table 3-4. In vitro transcript information and primer sequences used to generate cDNA template
Table 3-5. Primer, probe and target amplicon information


List of Figures

Figure 1-1. The hierarchical structure of the hematopoietic system
Figure 2-1. Expression of Gata3 throughout the hematopoietic hierarchy
Figure 2-2. Gata3 is expressed in the NK lineage
Figure 2-3. Complementary markers to exclude Gata3+ committed cell types
Figure 2-4. Quantitative gene expression analysis of hematopoietic stem and progenitor cells
Figure 2-5. Immunophenotypic analysis of EPCR+ Gata3+ IL-7Rα- cells in Gata3eGFP/+ mice
Figure 2-6. EPCR+ Gata3+ IL-7Rα- phenotype is highly enriched in LT-HSCs
Figure 2-7. Validation of Gata3, EPCR and IL-7Rα mRNA markers for enumerating LT-HSCs under conditions where LT-HSC numbers are expected to vary
Figure 3-1. Workflow for cell encapsulation, thermal cycling and analysis of RT-PCR products
Figure 3-2. Simulation of microdroplet cell lysate concentrations in conventional 10 μL qRT-PCRs
Figure 3-3. Quantitative PCR amplification plot
Figure 3-4. Mechanism of real-time PCR detection by TaqMan hydrolysis probes
Figure 3-5. Images of cycling LT-HSCs and hematopoietic cell lines at 400X magnification
Figure 3-6. Mechanism and utility of binding protein
Figure 3-7. Addition of binding protein rescued the CellsDirect One-Step qRT-PCR Kit from inhibition by high cell lysate concentration
Figure 3-8. Addition of NP-40 to the qRT-PCR for cell lysis
Figure 3-9. Product yield and sensitivity are improved by increasing and primer concentrations
Figure 3-10. Titration of TaqMan probes from 50 nM to 550 nM
Figure 3-11. Testing the qRT-PCR parameters across a range of cell lysate
Figure 3-12. Quantification of Gata3 and EPCR RNA in the presence of 10^3 B62c cells/μL
Figure 3-13. RNA and cell RT-PCR in the microdroplets
Figure 3-14. Experimental strategy to simulate the range of mRNA concentrations of single cells encapsulated in 1 nL microdroplets
Figure 3-15. Triplex qRT-PCR in four combinations of high and low IVT abundances using singleplex qRT-PCR parameters
Figure 3-16. Example of a reaction plate configuration and corresponding qPCR curves
Figure 3-17. Increase in binding protein is required for the amplification of multiple RNA targets
Figure 3-18. Influence of KCl and MgCl2 salts on triplex qRT-PCR
Figure 3-19. Testing primers for specificity and quantitative performance
Figure 3-20. Influence of Taq polymerase on triplex qRT-PCR
Figure 3-21. Interplay between dNTP and MgCl2 concentrations in a triplex qRT-PCR
Figure 3-22. Quantification of IVT combinations 1 – 4 using parameters that have been adapted for a triplex qRT-PCR
Figure 3-23. Quantification of IVT combination 3 in the presence and absence of 1,000 B62c cells/μL
Figure 3-24. Simulation of a LT-HSC mRNA profile using IVTs
Figure 3-25. Multiplex RT-PCR in the microdroplets


List of Appendices

Appendix 1. Additional improvements to the singleplex qRT-PCR
Appendix 2. A duplex qRT-PCR was used as a starting point to identify parameters for quantitating IL-7Rα, EPCR and Gata3
Appendix 3. Five of the most stable cross dimers formed in the initial and final triplex primer sets
Appendix 4. SYBR Green validation of new primer pairs
Appendix 5. Performance of qRT-PCR with and without cell lysate
Appendix 6. Performance of RT-PCR in microdroplets prior to the identification of multiplex qRT-PCR parameters
Appendix 7. Effect of NP40 on microdroplet stability


Chapter 1 Introduction

1.1 Hematopoietic stem cells

The establishment and lifelong maintenance of the hematopoietic system depends on the enduring regenerative capacity of hematopoietic stem cells (HSCs). The ability of HSCs to sustain self-renewal through divisions yielding progeny that can retain the function of a stem cell or differentiate into any blood cell ensures the continual replenishment of all the mature cell types. In the adult human hematopoietic system, this equates to upwards of hundreds of billions of blood cells daily1. Incredibly, a single HSC is capable of reconstituting the entire hematopoietic system2,3. As HSCs progressively differentiate into progenitors and ultimately mature cells of the erythroid, myeloid and lymphoid lineages, their self-renewal ability and differentiative potential diminish. This creates a hierarchical structure with HSCs at its apex (Figure 1-1).

Figure 1-1. The hierarchical structure of the hematopoietic system. LT-HSCs reside at the apex of this hierarchy as the only cells capable of enduring engraftment. This is achieved through divisions where at least one daughter retains the stem cell identity. The daughter cells poised to differentiate progress with each cell division through a transient multipotent stage, followed by lineage commitment, and eventually reach a terminally differentiated and short-lived cell type at the bottom of the hierarchy.

The unique biological role of HSCs has been exploited for the treatment of hematological malignancies and other diseases for decades4. Now, over 45,000 hematopoietic cell transplants are performed annually4. With the implementation of immunophenotypic markers and dye efflux properties, the HSC compartment has been further refined into multipotent cells distinguished by the duration for which they are capable of executing self-renewal divisions5,6. The specific HSC fraction capable of sustaining self-renewal lifelong is known as the long-term (LT) HSC, which exists at a frequency of 1 in 20,000 nucleated murine bone marrow cells5. As the only cell capable of permanently reconstituting the hematopoietic system, LT-HSCs are the therapeutic entities responsible for the long-term effectiveness of hematopoietic cell transplants. Their clinical utility and rarity have motivated research groups to study methods to expand their numbers ex vivo. The current standard used to enumerate LT-HSCs after such cultures requires at least 5 months for a conclusive count. A rapid alternative to enumerate LT-HSCs should accelerate the discovery of novel culture conditions to expand their numbers, and support other research strategies such as identifying conditions to differentiate embryonic and induced pluripotent stem cells to LT-HSCs. In the following sections, I will introduce the in vivo and in vitro assays for detecting stem and progenitor cells, and propose another approach to identify LT-HSCs.

1.2 Assays to identify and characterize hematopoietic stem and progenitor cells

1.2.1 Colony-forming unit assays

The seminal work by Till and McCulloch introduced a quantitative in vivo approach to examine the fundamental characteristics of stem cells. The intravenous injection of marrow into lethally irradiated recipients yielded macroscopic nodules within 7 to 12 days in the recipient spleen which contained cells of the erythrocytic, megakaryocytic, monocytic and granulocytic lineages7. It was subsequently demonstrated by tracking unique radiation-induced cytological abnormalities that these nodules were of clonal origin8. The cells that gave rise to these nodules were termed spleen colony forming units (CFU-S). Furthermore, within some of the CFU-S were cells that could generate nodules upon transplantation into secondary recipients9. These were landmark studies that were the first to characterize the proliferation, multipotentiality, and self-renewal of stem cells. In later studies, a distinction was made between primitive cell populations yielding enduring multilineage engraftment and CFU-S, which represent a collection of erythromyeloid progenitors10,11.


Lineage-restricted and multipotent progenitor cells can be assayed in vitro using semi-solid media supplemented with the appropriate cytokines for 7 to 14 days12. The cellular composition of the resultant colonies can be inspected by light microscopy based on morphological features. The detected lineages of a single colony are used to gauge the differentiative potential of the initiating colony forming cell (CFC). Although these assays are not capable of measuring LT-HSCs, they are valuable for many purposes, including assessing the hematotoxicity of drugs13, diagnostic assays for myeloproliferative disorders and leukemia, and analyzing the functional integrity of hematopoietic progenitors after ex vivo manipulations12,14.

1.2.2 Long-term culture-initiating cell assay

The long-term culture-initiating cell (LTC-IC) assay is a co-culture system where various doses of test cells are seeded onto a pre-established stromal layer that is capable of sustaining myelopoiesis over several weeks15. Less primitive hematopoietic cells in the test cell population will be exhausted in the first 3 to 5 weeks of culture due to their limited proliferative capacity. Beyond this time point, continued myeloid production is a consequence of a primitive population differentiating into cells that can be detected by subsequent CFC assays on methylcellulose-based media. The cells capable of sustaining myelopoiesis in these extended cultures are termed LTC-ICs. These LTC-ICs represent primitive hematopoietic progenitors. The frequencies of LTC-ICs can be determined using a limiting dilution analysis (LDA), and the numbers obtained may vary depending on the feeder layer and culture conditions used15. Although LTC-ICs appear to be a more primitive cell population than the CFC, it is unlikely that LTC-IC assays detect LT-HSCs16.

1.2.3 Phenotypic purification of LT-HSCs

The difficulty of isolating LT-HSCs lies in their rarity and the lack of any morphological distinction when compared to other hematopoietic cells. Many combinations of cell surface markers and supravital stains have been used to prospectively isolate the LT-HSC population. This has allowed for the study of LT-HSCs on a clonal basis and a refined assessment of stem and progenitor cells with varying repopulating potentials. Additionally, the purity at which LT-HSCs can be isolated has allowed for the high-throughput analysis by microarray of potential mRNA transcripts that could distinctively mark LT-HSCs. My host laboratory also applied microarray analysis to explore the quiescent and cycling LT-HSC transcriptome, which I used to examine mRNA transcripts that were expressed independent of cell cycle status. This was critical for my purpose, which was to enumerate LT-HSCs based on gene expression in conditions beyond the steady state marrow. Some of the common phenotypic markers that are used to isolate LT-HSCs are described here.

1.2.3.1 Expression of c-Kit and Sca1, and the absence of lineage markers (LSK) in primitive hematopoietic cells

C-Kit is a receptor tyrosine kinase that is broadly expressed in the hematopoietic hierarchy including on the surface of LT-HSCs17,18. The c-Kit receptor and its ligand, stem cell factor (SCF), are critical for hematopoiesis in vivo as determined by the characterization of mice deficient in either one of these components. C-Kit is encoded at the white-spotting locus, W, while SCF is encoded at the Steel factor locus, Sl19-23. Absence of either c-Kit or SCF (W mutant and Sl mutant, respectively) results in perinatal death due to severe macrocytic anemia, indicating that c-Kit signaling plays an essential developmental role in utero24. Hypomorphic mutants, however, are viable and exhibit varying levels of anemia, mast cell and pigmentation deficiencies, heightened radiosensitivity, and sterility. W mutant marrow was not capable of reconstituting irradiated wild-type recipient mice when transplanted25, whereas Sl mutant bone marrow-derived adherent cells were not capable of supporting hematopoiesis in vitro26. Furthermore, injection of BM with a hypomorphic allele of Sl into sublethally irradiated W mutant mice cured these recipients of anemia27. The evidence indicates that the W mutation leads to defective HSC activity while the Sl defect results in a microenvironment incapable of supporting hematopoiesis.

Stem cell antigen-1 (Sca1) is expressed on a spectrum of hematopoietic stem and progenitor cells, and like other markers, it cannot be used as a sole marker to isolate LT-HSCs10,28. Sca1 does not appear to be necessary for homeostatic maintenance of hematopoiesis, and Sca1-/- BM was capable of competing against wild-type marrow over the long-term. However, a deficiency was observed in secondary transplantations where a significant portion of recipients were not rescued from the lethal doses of radiation, and in those that survived, donor contribution was lower relative to secondary transplants from wild-type reconstituted primary mice29,30.

The exclusion of cell surface proteins expressed on various differentiated hematopoietic cell types has been used to purify primitive cell populations such as LT-HSCs. These include but are not exclusive to myeloid (e.g., Gr-1, CD11b, 7-4), lymphoid (e.g., CD5, CD4, CD8, CD19, B220), and erythroid (e.g., Ter-119) markers. Commonly, the depletion of lineage markers and the expression of c-Kit and Sca1 (LSK) are used in conjunction with other markers (e.g., CD150, Thy1.1, CD34, Flt3, CD49b, Rhodamine123) to purify LT-HSCs2,5,31-33.

1.2.3.2 Rhodamine123

Rhodamine123 (Rho123) is a cationic dye that binds specifically to mitochondria. The level of rhodamine retention is proportional to mitochondrial mass and to the electric potential across the inner mitochondrial membrane34,35. High retention of Rho123 is consequently an indicator of high metabolic activity of the cell. However, interpretation is complicated because Rho123 is also a substrate for the P-glycoprotein efflux pump, which is encoded by the multidrug resistance gene (Mdr1) and is highly expressed in HSCs36,37. Therefore, the fluorescence intensity of Rho123 staining is dependent on the mitochondrial number, mass, state of activation, and P-glycoprotein efflux activity.

The separation of populations enriched for HSC activity based on Rho123 retention has revealed that most cells capable of long-term multilineage reconstitution reside in the Rho123lo fraction while the Rho123hi fraction contains predominantly cells with short-term repopulation kinetics10,33,38. Furthermore, it was demonstrated that low Rho123 retention corresponds to a state of quiescence in the Thy1.1lo Lin- Sca1+ c-Kit+ fraction that is enriched for HSC activity39,40.


1.2.3.3 Marker stability

The perturbation of LT-HSCs from their steady state by subjecting them to 5-fluorouracil (5-FU) treatment41,42, mobilization agents such as granulocyte colony-stimulating factor43-45, or culture conditions results in alterations of their cell surface marker expression that may not always be associated with changes in their repopulation capabilities46. When mice were injected with 5-FU, it was found that cells capable of long-term reconstitution of recipient mice expressed c-Kit at 10-fold lower levels41, upregulated the myeloid marker CD11b41, and were found in both the CD34+ and CD34- fractions47. Furthermore, there was a significant increase in phenotypic HSC numbers as defined by either LSK CD34- CD41- Flt3- or LSK CD150+ CD48- CD41- in the bone marrow after G-CSF administration, which conflicted with the expected decrease in long-term repopulating cells in the marrow45. This phenotypic increase was unanticipated because hematopoietic stem and progenitor cells should have mobilized from the marrow into the periphery48. Schuettpelz and colleagues attribute this disparity in the number of phenotypic and functional stem cells to the sensitivity of the markers to inflammatory signals, such as the upregulation of CD150 in response to TLR signaling45. Others have shown that while the expression of c-Kit and Sca1 was maintained in culture, the long-term repopulating activity was found in the Tie-2- fraction, and indiscriminately in the Mpl+/- and Endoglin+/- fractions46. Moreover, it was demonstrated that long-term multilineage engrafting cells were also present in the lineage marker positive fraction49.

The assessment of culture conditions for the expansion of transplantable human umbilical cord blood stem cells has also been hindered by the disassociation between phenotypic human HSCs (CD34+ CD38-) and their actual functional output50. Although a 166-fold increase in the number of CD34+ CD38- cells post-culture was observed, the number of cells capable of reconstituting severe combined immunodeficiency (SCID) mice was unchanged. Consequently, marker stability cannot be assumed in any circumstance other than the one it was identified in.


1.2.4 Long-term transplantation assays

Despite advances in phenotypic characterizations, the gold standard for the enumeration of LT-HSCs is still in vivo transplantation assays that extend to at least 20 weeks post-transplant. The duration of the assay ensures that the hematopoietic graft detected in the recipient is generated exclusively from LT-HSCs, and not from transient stem and progenitor populations that are also capable of giving rise to mature cells for an extended period5. The competitive reconstitution assay is the most common transplantation assay used. It involves the co-injection of a test population (donor cells) with a known number of HSCs, typically in the form of unseparated bone marrow (competitor cells), into lethally irradiated recipient mice51. Congenic mice differing in their Ly5 (CD45) isoforms are used to differentiate donor myeloid and lymphoid cells from those derived from competitor cells. The ‘a’ and ‘b’ isoforms of glucose phosphate isomerase 1 (Gpi1) are used for circulating erythrocytes since these cells do not express Ly5. The ratio of mature cell types generated from the donor and competitor cell populations can be used to calculate the approximate HSC content provided that the HSCs in the donor population are equivalent to the HSCs in the competitor population in terms of their self-renewal and proliferative abilities10,52-54.
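The donor-to-competitor ratio calculation described above can be illustrated with a minimal sketch. It assumes the commonly used repopulating unit (RU) convention, in which each 10^5 competitor marrow cells is defined as one unit; that convention, the function name and the parameter names are illustrative assumptions and are not taken from this thesis.

```python
def donor_repopulating_units(percent_donor, competitor_cells,
                             cells_per_competitor_unit=1e5):
    """Estimate donor repopulating units (RU) from donor chimerism.

    Assumption (not from the thesis): one competitor unit is defined as
    1e5 unseparated marrow cells, and donor and competitor HSCs are
    functionally equivalent, as the text requires.
    """
    if not 0 <= percent_donor < 100:
        raise ValueError("percent_donor must be in the range [0, 100)")
    competitor_units = competitor_cells / cells_per_competitor_unit
    # Donor share / competitor share = donor RU / competitor RU
    return percent_donor * competitor_units / (100.0 - percent_donor)


# Hypothetical example: 50% donor-derived cells against 2e5 competitor cells
print(donor_repopulating_units(50.0, 2e5))
```

Under these assumptions, equal donor and competitor contributions imply that the test population contained as many repopulating units as the competitor dose.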

The limiting dilution analysis (LDA) is a non-competitive transplantation assay that can be used to determine the frequency of LT-HSCs. C57BL/6-KitW-41J/W-41J recipients express a hypomorphic allele of the c-Kit receptor (described in Section 1.2.3.1), which provides a highly sensitive environment that can detect a single injected LT-HSC55,56. These mice are sublethally irradiated (4 Gy), and injected with limiting numbers of marrow cells or single cells from fractions highly enriched in LT-HSC activity. The proportion of recipient mice without long-term multilineage engraftment is then used to calculate the HSC frequency in the test population by maximum likelihood Poisson statistics. Since single HSCs are capable of engrafting in this assay with near absolute efficiency, the failure to produce a detectable graft is likely a consequence of an intrinsic inability of the injected cell to sufficiently self-renew and proliferate3.
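As a minimal sketch of the maximum likelihood Poisson calculation, assume the standard single-hit model, in which a recipient injected with d cells remains unengrafted with probability exp(-f·d), where f is the HSC frequency. The function name, the input layout and the bisection bounds below are illustrative assumptions rather than the analysis code used in the cited studies.

```python
import math


def lda_frequency(groups, lo=1e-9, hi=1.0, iterations=200):
    """Maximum likelihood HSC frequency under a single-hit Poisson model.

    groups: list of (cells_per_mouse, n_mice, n_negative) tuples, where
    n_negative is the number of recipients without engraftment.
    """
    def score(f):
        # Derivative of the log-likelihood with respect to f
        total = 0.0
        for dose, n_mice, n_negative in groups:
            n_positive = n_mice - n_negative
            p_negative = math.exp(-f * dose)
            p_positive = -math.expm1(-f * dose)  # 1 - exp(-f * dose)
            total -= n_negative * dose
            if n_positive:
                total += n_positive * dose * p_negative / p_positive
        return total

    if score(lo) <= 0 or score(hi) >= 0:
        raise ValueError("no interior maximum: all recipients share one outcome, "
                         "or the frequency lies outside the search bounds")
    # The log-likelihood is concave in f, so the score has a single root;
    # bisect on a logarithmic scale between the bounds.
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Hypothetical data: 20 mice each given 100 cells, 10 of them unengrafted
f = lda_frequency([(100, 20, 10)])
print(f"estimated frequency: 1 in {1 / f:.0f} cells")
```

With a single dose group the estimate reduces to the closed form f = -ln(fraction negative)/dose; the iterative search is only needed when several dose groups are pooled.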

The serial transplantation assay is a rigorous test for the self-renewal potential of HSCs. Marrow of primary recipients can be collected, and injected into secondary, tertiary and quaternary irradiated recipients57,58. With each recipient, the number of HSCs from wild-type marrow increased 10- to 20-fold, and over the course of the successive transplants, an accumulated 8400-fold increase was previously observed with no indication of exhaustion58. This demonstrates the regenerative capacity of HSCs in response to the appropriate extrinsic cues in their environment.

1.3 Methods of single cell analysis for rare cell detection

Methods for the single cell analyses of rare populations have been applied to dissect the biological heterogeneity of many complex physiological and malignant systems. These range from the application of flow cytometry for analyzing protein expression to genetic studies using fluorescence in situ hybridization. Examining the cellular heterogeneity in a disease such as cancer has been important for understanding metastasis, treatment response, and relapse. For example, a rare subset of the tumor bulk may persist through treatment, and cause the re-initiation of tumor formation59. Additionally, methods for rare cell detection can be applied to detect rare viral infections, genetic abnormalities in rare cells, fetal cells in maternal blood for non-invasive prenatal diagnoses, and others60.

For the enumeration of rare LT-HSCs in unenriched blood samples, the selected method must have the sensitivity and throughput to detect cells that exist at frequencies of 0.05% - 0.01% (LT-HSCs in murine BM/human UCB)5,61. It is also important for the method to be rapid, cost-effective, simple and applicable for many purposes if it is to be widely adopted. The methods that can be applied for rare cell detection will be assessed in the following sections based on these criteria.
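The throughput requirement can be made concrete with a short sketch. It assumes that target cells are counted independently, so that the count follows Poisson statistics, and it ignores false positive events; the function names are illustrative and the frequency used is the 1 in 20,000 figure quoted earlier for murine marrow.

```python
import math


def expected_targets(cells_screened, frequency):
    """Mean number of target cells among the cells screened."""
    return cells_screened * frequency


def counting_cv(expected_count):
    """Coefficient of variation of a Poisson count with this mean."""
    return 1.0 / math.sqrt(expected_count)


def detection_probability(cells_screened, frequency):
    """Probability of observing at least one target cell."""
    return -math.expm1(-cells_screened * frequency)  # 1 - exp(-mean)


frequency = 1 / 20_000
for cells in (20_000, 200_000, 2_000_000):
    mean = expected_targets(cells, frequency)
    print(cells, round(mean, 1), round(counting_cv(mean), 3),
          round(detection_probability(cells, frequency), 3))
```

Under these assumptions, the counting error falls only with the square root of the number of target cells observed, which is why enumeration, as opposed to mere detection, demands screening on the order of hundreds of thousands to millions of cells.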

1.3.1 Flow cytometry and fluorescence activated cell sorting

Flow cytometry is used regularly for multiparametric analyses of single cells. Cells of interest are stained with supravital dyes or monoclonal antibodies that are conjugated to fluorophores. Suspended in sheath fluid for circulation through the machine, the cells are individually interrogated by a light beam. Data on size, granularity, protein expression and dye retention properties of each cell are acquired at rates of a few thousand events per second. Some cytometers are capable of detecting up to 18 parameters. The high-throughput and multiparametric capabilities of flow cytometry are compatible with the needs for rare cell detection. Additionally, fluorescence activated cell sorting (FACS) can be used to isolate viable single cells for further analysis that may allow one to assign functional attributes to a particular phenotype. FACS has played an instrumental role in dissecting the hematopoietic system and understanding the biology of HSCs. However, the sensitivity of flow cytometry is often limited by the presence of false positive events that are a collective consequence of non-specific antibody binding, cellular autofluorescence, contaminating residual cells and particles between experiments, and event bursts during acquisition. Autofluorescence due to the presence of native fluorescent molecules such as flavins, which emit across a range of wavelengths upon excitation by blue-green light, is particularly problematic for cultured cells60,62. Although many of these problems can be resolved63, ultimately, the reliance on the availability and quality of antibodies restricts the markers that can be used for such a sensitive application. Flow cytometers and cell sorters range from three hundred thousand to a million dollars.

1.3.2 Fluorescence in situ hybridization

Fluorescence in situ hybridization (FISH) is a fluorescently labelled oligonucleotide-based approach to detect DNA and RNA molecules. It can be adapted to detect a spectrum of target types in individual cells, including various chromosomal abnormalities, mRNA and non-coding RNA. Commercial kits such as the QuantiGene ViewRNA ISH Cell Assay (Affymetrix) are available that have the demonstrated sensitivity to detect as low as a single mRNA molecule and up to 4 targets simultaneously. This is achieved through a branched amplification system where target-specific probes also have complementary sequences to allow for additional probe hybridization. By having multiple oligonucleotide probes attach to a single target nucleic acid, the protocol is capable of amplifying the signal up to 8,000-fold. With highly sensitive imaging instrumentation, these events can be analyzed in the context of whole cells, allowing for intracellular localization studies.


Although the protocols for FISH are laborious, requiring extended incubations and approximately 8 hours of hands-on time, there are fully automated systems that perform this front-end processing. FISH also requires the use of highly sensitive and automated image acquisition systems to quantify single molecules, which can be challenging when a large number of cells must be screened to provide a statistically relevant enumeration of rare cells64. The difficulty in automating such protocols translates to instruments with a price tag upwards of a million dollars. Protocols have been developed to combine FISH with flow cytometry (flow-FISH). However, the increase in the throughput of event acquisition comes at the cost of sensitivity: the cells must be permeabilized for the entry of flow-FISH reagents, which may result in the leakage of RNA molecules, in addition to the other sources of noise in flow cytometry. The Affymetrix flow-FISH assay, which applies the same probe technology described above for FISH, is validated only for the detection of cells present at frequencies greater than 1%.

1.3.3 Quantitative reverse transcription polymerase chain reaction and microfluidics

Similar to FISH, quantitative reverse transcription polymerase chain reaction (qRT-PCR) can quantify a spectrum of nucleic acid types. However, qRT-PCR is also capable of detecting single nucleotide polymorphisms (SNPs). The repetitive enzymatic amplification of the target nucleic acids provides a level of sensitivity and signal-to-noise ratio (SNR) that allows for the detection of single molecules. Additionally, with a broad dynamic range, multiplex capabilities, and minimal pre- and post-reaction processing, this technique has been readily adapted for various clinical and research applications. With regard to applying PCR for rare cell detection, however, conventional assays in 96-well or 384-well plate formats cannot provide the throughput needed.
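
The throughput limitation of plate formats can be illustrated with a short calculation. The sketch below is not part of the original analysis: it takes the LT-HSC frequency of 1 in 20,000 nucleated bone marrow cells cited in this thesis, and assumes that the number of LT-HSCs observed in a screened sample follows Poisson counting statistics and that each well holds a single cell. The function names and the choice of 100 expected LT-HSCs are hypothetical.

```python
import math

# Frequency cited in this thesis: 1 LT-HSC per 20,000 nucleated bone marrow cells.
BM_CELLS_PER_LT_HSC = 20_000


def poisson_cv(expected_positives):
    """Coefficient of variation of a Poisson count with the given mean."""
    return 1.0 / math.sqrt(expected_positives)


def cells_to_screen(expected_positives, cells_per_target=BM_CELLS_PER_LT_HSC):
    """Cells that must be screened to expect this many target cells."""
    return expected_positives * cells_per_target


def plates_required(n_cells, wells_per_plate=384):
    """Plates needed if every well holds exactly one cell (assumption)."""
    return -(-n_cells // wells_per_plate)  # integer ceiling division


n_cells = cells_to_screen(100)        # aim for ~100 LT-HSCs (hypothetical target)
print(poisson_cv(100))                # 0.1, i.e. ~10% counting noise
print(n_cells)                        # 2000000 cells
print(plates_required(n_cells))       # 5209 plates of 384 wells
```

Under these assumptions, an enumeration with roughly 10% counting noise would require on the order of thousands of 384-well plates, which is the scale of the problem that motivates a microfluidic approach.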

With the advent of microfluidic technologies, the number of simultaneous qPCRs performed can be easily scaled up to the extent necessary to detect rare events. The ability of microfluidics to manipulate small volumes of liquid very precisely has been exploited to miniaturize a number of biological assays which consequently permits massive parallelization, and streamlining of protocols through integration. There are chamber-based microfluidic chips that utilize cell traps and intricate microvalve networks to isolate single cells traveling through their micron-diameter channels. Once these cells are captured, these chamber-based microfluidic devices are designed to allow reagents to be fed onto the chip for the analysis of single cell growth kinetics65, signaling responses66,67, gene expression levels68,69, and response to drugs70. Furthermore, the cells interrogated on some of these chips can be recovered intact for further in vivo analyses65. Although they have numerous applications, the difficulty in scaling up these devices beyond a few thousand chambers presents a problem for rare cell detection.

Alternatively, microfluidic systems can be designed to generate water-in-oil emulsions, where single cells can be encapsulated with the necessary assay reagents into microdroplets. These microdroplets essentially act as stable reaction compartments that are surrounded by a continuous oil phase. In addition to the immiscibility of these two phases, a fluorinated surfactant is added to stabilize the microdroplets at the water-oil interface and to promote surface passivation, which maximizes enzymatic efficiency in the assays71. The ability to scale up the number of parallel microdroplet generators allows for the production of at least thousands of microdroplets per second. The use of microfluidics also overcomes the large size and polydispersity of droplets that were generated by previous bulk emulsion methods72. Moreover, miniaturization of the qRT-PCR increases the sensitivity of the reaction68,73-75. Thus far, it has been demonstrated that emulsion-based microfluidics can generate millions of monodispersed microdroplets containing single cells for semi-quantitative multiplex PCR76,77 and singleplex RT-PCR78. The approximate price for these devices is between fifty thousand and five hundred thousand dollars, which is an indicator of the simplicity of these devices.
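
As a rough illustration of what droplet throughput means for single cell encapsulation, the sketch below assumes that cells are distributed randomly among droplets, so that droplet occupancy follows a Poisson distribution. This loading model, the mean occupancy of 0.1 cells per droplet, and the function names are assumptions made for illustration; they are not specifications of the system described in this thesis.

```python
import math


def occupancy_fractions(mean_cells_per_droplet):
    """Fractions of droplets holding 0, exactly 1, and 2+ cells (Poisson loading)."""
    lam = mean_cells_per_droplet
    empty = math.exp(-lam)
    single = lam * math.exp(-lam)
    multiple = 1.0 - empty - single
    return empty, single, multiple


def generation_time_s(n_cells, mean_cells_per_droplet, droplets_per_second):
    """Seconds of droplet generation needed to pass n_cells through the device."""
    droplets_needed = n_cells / mean_cells_per_droplet
    return droplets_needed / droplets_per_second


# Hypothetical operating point: 0.1 cells per droplet, 1,000 droplets per second.
empty, single, multiple = occupancy_fractions(0.1)
print(f"empty={empty:.3f} single={single:.3f} multiple={multiple:.4f}")

hours = generation_time_s(1_000_000, 0.1, 1_000) / 3600
print(f"{hours:.1f} h to encapsulate one million cells")
```

Lowering the mean occupancy reduces the fraction of droplets containing more than one cell but increases the number of droplets, and therefore the generation time, needed for a given number of cells.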

All of these described high-throughput platforms are capable of rare cell detection. Each may also have unique advantages that serve other purposes, such as the ability to sort for viable cells based on phenotype or to observe the subcellular localization of the target and cell morphology. The assessments of these technologies as presented here are focused on the specifications that are required for the rapid enumeration of rare cells, as well as those characteristics that will allow for the selected technology to be readily adapted for other research and clinical applications.

When FISH and emulsion qRT-PCR are compared to flow cytometry, it is evident that the type of markers that can be used to detect rare cells by flow cytometry is relatively limited. While FISH and qRT-PCR can target a vast number of nucleic acid markers, highly specific flow cytometry assays for detecting rare cells are dependent on the availability and quality of the antibodies. Nonetheless, flow cytometry has the highest multiplex capacity. The rapid development of nucleic acid assays after the initial identification of reaction conditions, especially for qPCR, is advantageous. Furthermore, both FISH and qRT-PCR can detect numerous chromosomal abnormalities, such as translocations and copy number variations, which may be used as biomarkers. Thus, the versatility of these oligonucleotide-based methods provides a wide range of marker choices that can be used to distinguish rare cell populations. QRT-PCR is unique in that it can also be applied to analyze SNPs and point mutations that may be of value for the analysis of tumor cells.

There are factors that may limit the applicability of FISH for rare cell or event detection in clinical and research settings compared to microdroplet qRT-PCR. Both can be automated, but the complexity and cost difference between the two technologies must be considered. Processing of samples for FISH is laborious and dependent on the cell type, so the automation of such protocols is difficult, requiring large and costly instruments that are up to 10-fold more expensive than the current microdroplet PCR systems. Additionally, data acquisition and analysis are simpler for the microdroplets than for FISH, which can also translate into cost differences. The fluorescence signal generated in PCR can be monitored in real time for quantitative measurements of the target at a magnification of 5X, while quantification of FISH products requires 20-100X magnification. Thus, emulsion-based qRT-PCR was selected as the platform for developing a high-throughput single cell assay for the enumeration of rare cell populations, such as the LT-HSCs in the hematopoietic system.

1.4 Experimental Rationale and Objectives

The objective of this project is to develop an emulsion-based qRT-PCR technology that can be used to enumerate rare HSCs rapidly. Single cells will be encapsulated with the reagents needed for a gene expression analysis into nanolitre microdroplets that act as individual reaction compartments. If a stable LT-HSC mRNA signature can be identified, then the expression analysis of the single cells in a sample will be able to predict the number of LT-HSCs present. The ability to perform such enumerations will not only accelerate research strategies examining conditions to expand LT-HSC numbers or derive them from pluripotent sources, but it will also allow for measurements in the clinic for diagnostics and treatment planning.

The surrogate mRNA marker-based assay will first be used to validate enumerations made in the murine hematopoietic system because the microdroplet-based enumerations can be compared directly to enumerations performed using the current gold standard in vivo transplantation assays. Furthermore, the murine hematopoietic system has been a valuable model for understanding stem cell biology, its cellular composition is well-characterized, and tissue is readily available. Numerous conditions in which the frequency of LT-HSCs is expected to vary will be analyzed using in vivo transplantation assays and the surrogate mRNA marker assay for a correlative validation.

A protocol developed by my host laboratory was used to generate cDNA samples from a small number of cells. The cDNAs are globally amplified products representing the transcriptome of the cell type. A collection of cDNA samples was generated from stem, progenitor and mature cell types that constitute much of the cellular diversity in the hematopoietic system. Additionally, cDNA was generated from long-term and transient HSCs induced to cycle in culture conditions that preserve their functional integrity. In Chapter 2, the cDNA samples were used to identify a specific and stable LT-HSC mRNA marker combination.

Currently, microfluidic systems lack either the throughput, multiplex or quantitative capabilities required for rare cell detection68,77-79. Separation of lysis, reverse transcription, and polymerase chain reaction into sequential chambers of larger volumes allowed White and colleagues to perform qRT-PCR on approximately 270 cells simultaneously68. However, the efficiency of mechanical cell traps decreases with heterogeneous cell populations, and the size of the device components makes scaling up for rare cell detection unfeasible. Others have performed endpoint PCR and RT-PCR in scalable microdroplets, but no quantitative measurements have been made73-75. The lack of quantitative experiments may be due to the inhibitory nature of high cell lysate concentrations in qPCR68,80. Consequently, in Chapter 3, qRT-PCR reagents were analyzed systematically to overcome the inhibition due to the high cell lysate concentrations.

For the quantification of the mRNA markers identified in Chapter 2, parameters for multiplex qRT-PCR were also examined.

1.5 Thesis Outline

Chapter 2 presents the experimental work performed to characterize the expression of the LT-HSC mRNA markers throughout the hematopoietic hierarchy, and demonstrates how in combination they are capable of marking LT-HSCs independent of their cell cycle status. Using a reporter mouse, we analyzed the engraftment potential of the cells that are defined by these LT-HSC markers. The chapter also discusses the gene expression data with respect to their application for single cell analysis, and the final validation strategy for the mRNA marker combination.

In Chapter 3, the barriers to performing single cell multiplex qRT-PCR inside the microdroplets were explored. Experimental strategies were developed to simulate the microdroplet conditions in conventional qRT-PCRs. These simulations allowed a systematic analysis of the qRT-PCR reagents to be performed in a time- and cost-effective manner. Following this chapter is a conclusion that includes an integrated summary of Chapters 2 and 3, bringing together two critical components of the assay: the LT-HSC mRNA markers and the reaction parameters needed to detect them. The conclusion chapter also explores the future direction of this work.

Chapter 2 Identification of stable LT-HSC markers

2.1 Identification of stable LT-HSC mRNA markers

The analysis of the quiescent transcriptome of LT-HSCs in the steady state bone marrow has resulted in the discovery of mRNA transcripts preferentially expressed in this primitive population relative to other downstream hematopoietic cell types81-83. However robust the expression of these mRNA species is in quiescent LT-HSCs, the question remains as to whether their expression persists or changes when these cells transit from a quiescent to a cycling state. Although it is unlikely that LT-HSCs contribute significantly to steady-state homeostatic maintenance of daily blood cell production, these dormant cells can be activated by injury signals in vivo or by in vitro culture conditions44,84-87. Perturbation of their native state in the marrow is associated with changes in their transcriptome41,42,44. Thus, the stability of a marker is as important as its specificity for the purposes of obtaining a marker set that can enumerate LT-HSCs from steady state marrow as well as from a spectrum of conditions in which LT-HSCs may be actively cycling. In the pursuit of mRNA markers to identify LT-HSCs encapsulated in the microdroplets, it will be imperative to ensure that the expression pattern of the identified markers is robust independent of the cell cycle status of the LT-HSCs. In this chapter, I place an emphasis on the ability of the identified mRNA transcripts to discriminate LT-HSCs from the preponderance of downstream hematopoietic cell types irrespective of whether the LT-HSCs are quiescent or cycling. The discovery of stable and specific mRNA markers was achieved by performing gene expression analyses on a collection of cDNA samples generated from stem, progenitor, and mature cell types representing much of the hematopoietic cell diversity. Additionally, stable markers were identified by comparing cDNA generated from stem cells that were induced to cycle in culture conditions that preserved their stem cell character. The cell types isolated are described in Sections 2.1.4 – 2.1.6, and the global transcript amplification protocol used to generate the cDNA is described in the following section.

2.1.1 Global transcript amplification

Transcriptional profiles of hematopoietic cells were developed in my host laboratory by Mary Barbara using a global transcript amplification protocol that enabled one to examine entire mRNA profiles from a limited number of cells. In the hematopoietic system, this is particularly advantageous for analyzing rare LT-HSCs that are available only in the hundreds when FACS purified using stringent immunophenotypic marker sets. Global gene expression analysis of limited cell numbers requires reverse transcription of the mRNA species and subsequent amplification of the cDNA prior to transcript-specific PCR. To prevent distortion in transcript abundance, my host laboratory developed an amplification method targeting polyadenylated mRNA for first strand synthesis that limited the length of the cDNA products to 300 – 500 bp upstream of the 3’ end of the transcript88,89. The conservation of transcript abundance was achieved by limiting the duration of the reverse transcription step, and using low dNTP and primer concentrations88. After PCR, the cDNA products were diluted 1:250 to approximately 4 ng for subsequent semi-quantitative and quantitative PCR. Isolation of the specific cell populations used for the following PCR analyses will be described in Sections 2.1.4 – 2.1.6.

2.1.2 Primer design for global transcript amplification products

The PCR primers used to analyze the global transcript amplification products were manually designed to target regions within 300 – 500 bp upstream of the 3’ end of the transcripts. Using the National Center for Biotechnology Information (NCBI) Map Viewer, true 3’ ends of the transcripts were identified by the clusters of EST ends occurring within approximately 50 bp of the polyadenylation signal sequences (AATAAA or ATTAAA). This manual verification of the 3’ end was necessary to confirm that the mRNA sequences in the public databases did indeed end at utilized poly(A) sites90. Primers were designed to have a length of approximately 20 bp and 50% GC content. Using the Primer-BLAST (NCBI) primer design software, primer pairs were selected to minimize predicted off-target amplification, particularly of products with a similar amplicon length.
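
The design criteria above lend themselves to simple automated checks. The following is a minimal sketch, not the procedure used in this work (the primers were designed manually): it computes primer length and GC fraction, looks for a polyadenylation signal hexamer near a candidate 3’ end, and checks that an amplicon lies within 500 bases of the 3’ end. The function names and the simplified window logic are assumptions; the example sequence is the Gata3 forward primer listed in Table 2-1.

```python
POLYA_SIGNALS = ("AATAAA", "ATTAAA")


def gc_fraction(primer):
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def has_polya_signal_near_end(transcript, window=50):
    """True if a poly(A) signal hexamer lies within the last `window` bases.

    Simplification: the candidate 3' end is taken to be the end of the string.
    """
    tail = transcript.upper()[-window:]
    return any(signal in tail for signal in POLYA_SIGNALS)


def amplicon_near_3prime_end(amplicon_start, transcript_length, max_distance=500):
    """True if the amplicon begins within max_distance bases of the 3' end."""
    return transcript_length - amplicon_start <= max_distance


GATA3_FORWARD = "GTCACTTTTCTTGCAGCCTA"  # from Table 2-1
print(len(GATA3_FORWARD), gc_fraction(GATA3_FORWARD))  # 20 0.45
```

Checks of this kind only screen for the length, GC content and 3’ proximity criteria; specificity against off-target products still requires a tool such as Primer-BLAST.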

2.1.3 Semi-quantitative PCR

PCR of globally amplified products was performed in a 100 μL reaction consisting of 10 mM Tris-HCl pH 8.3, 50 mM KCl, 1.5 mM MgCl2, 0.1 mg/mL BSA, 0.05% Triton X-100, 200 μM dNTPs, and a 250 nM pair of forward and reverse primers. The assembled reactions were heated to 94°C before 5 U of Taq polymerase were added. The addition of Taq polymerase at 94°C prevented the amplification of non-specific products at lower temperatures. All PCRs were performed on an Applied Biosystems 9700 instrument using the following thermal cycling conditions: 5 minutes at 94°C, followed by at least 25 cycles of 94°C for 15 seconds, 60°C annealing for 30 seconds and 72°C for 1 minute. Afterward, reactions were held at 72°C for 5 minutes, and then 4°C until they were loaded onto 2.0% agarose gels containing GelRed (Biotium).
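
The cycling conditions above can be written down as a small data structure, which makes it easy to estimate the minimum run time for a given cycle number. This is an illustrative sketch only: the variable and function names are hypothetical, and the estimate sums programmed hold times without accounting for instrument ramp times or the final 4°C hold.

```python
# Programmed hold times taken from the cycling conditions described above.
INITIAL_DENATURATION_S = 5 * 60
CYCLE_STEPS_S = {"denature_94C": 15, "anneal_60C": 30, "extend_72C": 60}
FINAL_EXTENSION_S = 5 * 60


def minimum_run_time_s(cycles=25):
    """Sum of programmed hold times in seconds (ramp times are ignored)."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + cycles * per_cycle + FINAL_EXTENSION_S


print(minimum_run_time_s(25) / 60)  # 53.75 minutes for 25 cycles
print(minimum_run_time_s(40) / 60)  # 80.0 minutes for 40 cycles
```

The 30- and 40-cycle reactions used for the Gapdh and Gata3 analyses in this chapter therefore differ by about 17.5 minutes of programmed hold time.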

Table 2-1. Primer sequences for PCR on global amplification products.

Gene      Forward primer (5’ to 3’)    Reverse primer (5’ to 3’)    Size (bp)
Gata3     GTCACTTTTCTTGCAGCCTA         CAGACTGTTTAAAGGCAGTG         161
EPCR      AAGTAACGGGTTTGCAAAGG         GAATGCCTACAGTAAGTCTGCC       152
IL-7Rα    CTGCCAATTTTCCTCTTGGT         CCAGAAAATAGCGCATGCTT         249
Gapdh     GCTGGCATTGCTCTCAATGA         AGGCCCCTCCTGTTATTATG         240

2.1.4 Immunophenotypic purification of multipotent populations

The bone marrow in the femurs and tibias of either C57BL/6J-Ly5.2-Gpi1b/b or C57BL/6J-Ly5.1-Gpi1a/a mice was used for the purification of these populations by FACS. The cell surface markers and dye retention properties used to purify various multipotent populations are shown in Table 2-2, and some are introduced in Section 1.2.3. The biological potential of these cells is discussed in depth in the references listed.

Table 2-2. Cell surface markers and dye retention used to purify multipotent populations by FACS.

Phenotype                                 Cell Population                                       Ref.
Lin- Sca1+ cKit+ Rho123lo (LSKR) α2lo     Long-term hematopoietic stem cell (LT-HSC)            [5]
LSKR α2hi                                 Intermediate-term hematopoietic stem cell (IT-HSC)    [5]
LSK CD34+ Flt3-                           Short-term hematopoietic stem cell (ST-HSC)           [6]
Lin- Sca1- cKit+                          Mixed multipotent progenitor (MPP)                    [6]
LSK CD34+ Flt3+                           Lymphoid-primed multipotent progenitor (LMPP)         [31]

2.1.5 Cell culture of IT-HSCs and LT-HSC-enriched fractions

LT-HSC and IT-HSC fractions are quiescent at the time of their purification from the bone marrow of mice as seen by their low Rhodamine retention, the long latency period to their first cell division in vitro, and long inter-mitotic intervals in vivo5,44. The cells were cultured in Iscove’s Modified Dulbecco’s Medium containing 4% fetal bovine serum, 0.1% bovine serum albumin, 5 g/mL transferrin, 5 g/mL insulin, 50 ng/mL murine c-Kit ligand, 50 ng/mL human Flt3 ligand and 10 ng/mL interleukin-113. These culture conditions are capable of preserving the input stem cell numbers over the first 3 days of culture.

IT-HSCs were cultured to 20 and 36 hours, and the LT-HSCs to 28 and 44 hours. At these specified time points, the cells were collected to generate cDNA by global transcript amplification. The later time point for each fraction corresponds to the time at which 50% of the cells had divided in culture. Since the murine cell cycle is approximately 12 hours, the earlier time points represent cells presumably in early G1-phase. The observed delay to their first cell division in culture, but not to subsequent divisions, confirms their quiescent state following purification5. The later time points, at which 50% of cells had divided, were excluded from my analysis because any division other than a symmetrical self-renewal division could contribute cDNA that is not derived from LT-HSCs.

2.1.6 Isolation of progenitors and terminally differentiated cells

While the T- and B- lymphoid lineages and natural killer (NK) cells were isolated based on immunophenotypic markers, cells of the myeloid lineages were isolated and characterized by sibling cell analyses as described below. T- and B- lymphocyte populations were purified by FACS using the marker sets listed in Table 2-3, and their RNA was generated by the laboratories of Drs. Chris Paige and Juan Carlos Zúñiga-Pflücker. NK cells were isolated as described in Section 2.1.8. All cDNA samples were prepared using the global amplification protocol in my host laboratory by Mary Barbara.

Table 2-3. Cell surface markers used to purify B- and T- lymphoid populations by FACS.

B-lymphocyte fraction          Immunophenotypic profile
A                              B220+ CD43+ HSA- BP.1-
B                              B220+ CD43+ HSA+ BP.1-
C                              B220+ CD43+ HSA+ BP.1+
D                              B220+ CD43- IgM- IgD-
E                              B220+ CD43- IgM+ IgD-
F                              B220+ CD43- IgM+ IgD+

T-lymphocyte fraction          Immunophenotypic profile
Double negative 1 (DN1)        CD3- CD4- CD8- CD44+ CD25-
Double negative 2 (DN2)        CD3- CD4- CD8- CD44+ CD25+
Double negative 3 (DN3)        CD3- CD4- CD8- CD44- CD25+
Double negative 4 (DN4)        CD3- CD4- CD8- CD44- CD25-
Double positive (DP)           CD4+ CD8+
CD4 single positive (CD4+)     CD4+ CD8-
CD8 single positive (CD8+)     CD4- CD8+

A collection of myeloid lineage cDNA samples was generated from single cells, and their developmental potential was assessed by sibling cell analysis. Pentapotent and tetrapotent cells were generated from single Rholo Sca1hi cells allowed to form 4- to 8-cell colony starts in methylcellulose containing IL-1, c-Kit ligand, IL-3, IL-11, 5637 bladder carcinoma cell-conditioned medium, and erythropoietin as previously described91,92. One of the cells was removed for global transcript amplification while the remaining cells were re-plated in secondary cultures to assess their outcomes.

The bipotent and unipotent myeloid cells were also grown in methylcellulose, and their developmental potential similarly assessed by sibling cell analysis93. These methods exploit the cell cycle and differentiative synchrony of the daughter cells. When all of the daughter cells of a common clone exhibit the same differentiative potential, it allows one to use those daughters as reporters for the developmental potential of the cell removed for cDNA generation.

2.1.7 Utility of Gata3 as a surrogate marker

Fractionation of the LSKR phenotype into α2-integrin high and low expressing populations separated two similar cell types that differed principally in their ability to sustain self-renewal. While the LSKRα2lo (LT-HSC) fraction contained cells capable of enduring multilineage engraftment, the α2hi (IT-HSC) fraction yielded grafts that ultimately faded by 16 weeks post-transplant5. Genes found by comparison of their transcriptomes to be preferentially expressed in the former fraction may have putative roles in sustaining self-renewal. Gata3 was found to be expressed at 47-fold higher levels in quiescent LT-HSCs than in IT-HSCs94. Of equal importance, the expression of Gata3 mRNA is maintained in actively cycling LT-HSCs (Figure 2-1A). Frelin and colleagues have also demonstrated that Gata3 restricts self-renewal in stressed LT-HSCs downstream of p38 signaling and that its genetic ablation enhances the expansion of LT-HSC numbers in vivo94. The stability of Gata3 mRNA expression, its greater abundance in LT-HSCs relative to IT-HSCs, and its functional role in LT-HSCs as a negative regulator of long-term repopulating activity support its use as a surrogate LT-HSC marker.

Figure 2-1. Expression of Gata3 throughout the hematopoietic hierarchy. (A) Globally amplified cDNA products of various hematopoietic fractions were analyzed for Gata3 expression on a 2% agarose gel after 40 cycles of PCR. (B) Gapdh expression after 30 cycles of PCR in the hematopoietic hierarchy as an endogenous control.

However, the sharp downregulation of Gata3 expression in IT-HSCs is followed by its upregulation in lymphoid progenitors and NK cells. The exclusion of these downstream cell types incapable of enduring multilineage engraftment within the Gata3+ fraction using additional mRNA transcripts may allow one to specifically mark the LT-HSC population. Such an mRNA marker combination can be used on the microfluidic system to determine which microdroplet contains an LT-HSC.

2.1.8 Complementary markers to exclude non-LT-HSCs that are Gata3+

In addition to the role of Gata3 in the development and function of T-lymphocytes, Gata3 is required for the development of NK cells capable of homing to the liver and IFN-γ production, and possessing receptors indicative of a mature NK phenotype, such as CD11b and DX595. Thus, it is critical to consider NK cells when identifying markers to complement Gata3 for a more complete exclusion of differentiated cell types. Immature (CD122+DX5-CD3-Lin-) and mature (NK1.1+CD11bhiDX5hi CD3-) NK cells were isolated from the bone marrow (BM) of C57BL/6J mice by FACS (Figure 2-2). These purified cells were reverse transcribed into cDNA and globally amplified as described in Section 2.1.1. The rationale for using CD122 as an immature NK cell marker will be described briefly, but it has been characterized in-depth elsewhere96. Briefly, it forms the IL-15R with the common γ-chain which is critical for NK development as either IL-15 or IL-15R deficient mice are also NK-deficient97. It is expressed prior to other NK markers, including NK1.1 and DX5, and marks a committed NK population lacking detectable myeloid, B- and T- lymphocyte potential in vitro96,98. The selected mature NK immunophenotype corresponds to a functional mature NK in vivo98. As expected, both immature and mature NK cells express Gata3 mRNA (Figure 2-2).

Figure 2-2. Gata3 is expressed in the NK lineage. Immature and mature NK cells were sorted from BM, and cDNA was generated by the global amplification protocol. Gata3 and Gapdh expression were then analyzed with 40 and 30 cycles of PCR, respectively.

The expression of endothelial protein C receptor (EPCR) and Gata3, and the absence of IL-7Rα expression were found to discriminate LT-HSCs from committed hematopoietic cell types. IL-7Rα was selected because of its well-characterized role in the early developmental stages of T- and B-lymphocytes, its role in the enhancement of mature T-cell survival in circulation, and its absence in the Thy1lo LSK HSC-enriched fraction99-102. As expected, the expression of IL-7Rα resided mainly in these two lymphoid lineages, and was absent in our quiescent and cycling LT-HSC populations (Figure 2-3). The combination of Gata3 expression and the lack of IL-7Rα expression excluded the undesired T- and B-lymphoid lineages. With the addition of EPCR as the third marker, the remaining Gata3+ IL-7Rα- mature NK (mNK) cells were excluded.
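
The exclusion logic of the three-marker combination can be expressed as a simple presence/absence rule. The sketch below is an illustration rather than the analysis pipeline used in this work: the class and function names are hypothetical, and the marker calls in the examples are simplified from the expression patterns described above (the EPCR call for the lymphoid example is set to absent for illustration and does not affect the outcome, because IL-7Rα alone excludes it).

```python
from typing import NamedTuple


class MarkerCall(NamedTuple):
    """Presence (True) or absence (False) of each transcript in one cell."""
    gata3: bool
    epcr: bool
    il7ra: bool


def is_candidate_lt_hsc(call):
    """Gata3+ EPCR+ IL-7Ralpha- cells are retained; all others are excluded."""
    return call.gata3 and call.epcr and not call.il7ra


# Simplified example calls based on the expression patterns described in the text.
EXAMPLE_CALLS = {
    "LT-HSC": MarkerCall(gata3=True, epcr=True, il7ra=False),
    "T/B lymphoid": MarkerCall(gata3=True, epcr=False, il7ra=True),
    "mature NK": MarkerCall(gata3=True, epcr=False, il7ra=False),
}

for name, call in EXAMPLE_CALLS.items():
    print(name, is_candidate_lt_hsc(call))
```

A presence/absence rule of this kind excludes the committed lineages, but it cannot by itself separate LT-HSCs from the immediate downstream stem and progenitor populations, for which the expression levels of Gata3 and EPCR must be quantified.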

Figure 2-3. Complementary markers to exclude Gata3+ committed cell types. (A) IL-7Rα expression was analyzed in the hematopoietic hierarchy by PCR. It is expressed throughout the B- and T-lymphoid lineages where Gata3 was detected. (B) EPCR expression was similarly analyzed, and was capable of excluding the mature NK population that is Gata3+ IL-7Rα-.

EPCR has been used to isolate HSCs prospectively where all detectable multilineage reconstituting capabilities resided in EPCR+ marrow103-105, and its expression was found to be stable on HSCs after a 5 day culture period104. Furthermore, the cell surface expression of EPCR was found to be absent, among other cell types, in mature T-lymphocytes (CD3/CD4/CD8+), B- lymphocytes (B220/CD19+), and NK cells (NK1.1+)106. The gene expression analysis of EPCR in our hematopoietic hierarchy was consistent with these previous immunophenotypic data.

Despite the common use of EPCR as an HSC marker, its role in stem cell function remains elusive. Present also on endothelial cells, EPCR binds to protein C and augments its proteolytic activation by thrombin that is in a complex with thrombomodulin107. Activated protein C (APC) can then exert its serine protease activity through two separate pathways: (1) inactivation of fVa and fVIIIa, which results in anti-coagulant activity; and (2) activation of protease-activated receptor-1 (Par-1). Par-1 is involved in an anti-inflammatory response through the downregulation of vascular adhesion molecules (e.g., CD54) on endothelial cells to reduce leukocyte adhesion and tissue infiltration108. Also, Par-1 activation in monocytes reduces the release of pro-inflammatory cytokines109. Its activation in monocytes by the EPCR-APC complex is consistent with the detection of EPCR in the macrophage fraction of our hematopoietic hierarchy.

The EPCR-APC-Par-1 pathway also has anti-apoptotic effects mediated through the downregulation of pro-apoptotic gene products such as p53 and Bax, and the upregulation of anti-apoptotic Bcl-2109. Interestingly, Par-1 mRNA was detected in the fetal liver LSK EPCR+ fraction enriched for HSC activity110. Iwasaki and colleagues found that the addition of APC in cultures of LSK EPCR+ cells reduced the number of dead cells by approximately 3-fold while leaving proliferation unaffected, and that this effect was not observed when LSK EPCR- cells were cultured. Although this suggests that the EPCR-APC-Par-1 pathway may be involved in HSC maintenance, direct in vivo experimentation would be required to elucidate a possible function.

The combination of the presence of Gata3 and EPCR, and the absence of IL-7Rα was capable of excluding the committed fractions that constitute the majority of the cells present in the hematopoietic system. However, the downregulation of Gata3 and EPCR as LT-HSCs differentiate into the immediate downstream transient stem and progenitor cells was not absolute.

Without the quantification of Gata3 and EPCR mRNA, LT-HSCs may appear indistinguishable from IT-HSCs, ST-HSCs and LMPPs in the microdroplets after endpoint PCR (Figure 2-4A). The inability to differentiate transient stem and progenitor populations from LT-HSCs will contribute to false positives, and consequently, an overestimation of LT-HSCs. In order to assess whether the markers are capable of differentiating LT-HSCs from these remaining populations, qPCR was performed. It is important to note that 30% of the cells in the LSKRα2lo (“LT-HSC”) fraction are capable of long-term multilineage engraftment, and 60% of the cells in the fraction yielded intermediate-term grafts5. Thus, the cDNA of the LT-HSC enriched fraction contains IT-HSC cDNA that could lead us to underestimate the expression levels of genes preferentially expressed in LT-HSCs.

Figure 2-4. Quantitative gene expression analysis of hematopoietic stem and progenitor cells. (A) Summary of Gata3, EPCR and IL-7Rα detection by PCR on all hematopoietic fractions. (B-C) Gata3 and EPCR qPCR was performed on LT-HSCs and progenitors that expressed Gata3 and EPCR, and not IL-7Rα. Gray bars represent the cell populations from which the respective markers were able to differentiate LT-HSCs. All qPCR data were normalized to Gapdh.

EPCR and Gata3 mRNA were detected at lower abundances in cycling LT-HSCs relative to quiescent LT-HSCs (Figure 2-4B,C). If EPCR and Gata3 expression can be used to differentiate cycling LT-HSCs from IT-HSCs, ST-HSCs and LMPPs, then the same should hold true for quiescent LT-HSCs that express both EPCR and Gata3 at higher levels. EPCR expression was 7- and 14-fold higher in cycling LT-HSCs than in ST-HSCs and LMPPs, respectively. However, the level of EPCR expression cannot differentiate cycling LT-HSCs from IT-HSCs. The discrimination of LT-HSCs from IT-HSCs can be achieved by high Gata3 expression. Gata3 expression in cycling LT-HSCs was 6- and 10-fold higher than in quiescent and cycling IT-HSCs, respectively. Furthermore, using EPCR, Gata3 and IL-7Rα as markers in the following FACS experiments, cells capable of long-term multilineage engraftment were prospectively isolated from unseparated BM.
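
Fold differences of this kind are commonly derived from qPCR threshold cycle (Ct) values normalized to a reference gene. The sketch below uses the widely applied 2^-ΔΔCt calculation, which assumes that the product doubles in every cycle; the calculation method is an assumption made for illustration and is not stated in this section, and the Ct values are hypothetical numbers chosen only to show the arithmetic, not measurements from this work.

```python
def relative_expression(ct_target, ct_reference):
    """Target abundance relative to the reference gene, assuming doubling per cycle."""
    return 2.0 ** -(ct_target - ct_reference)


def fold_difference(sample_a, sample_b):
    """Fold difference of sample A over sample B.

    Each sample is a (ct_target, ct_gapdh) pair.
    """
    return relative_expression(*sample_a) / relative_expression(*sample_b)


# Hypothetical Ct values (target, Gapdh); not data from this thesis.
population_a = (24.0, 18.0)
population_b = (27.0, 18.0)
print(fold_difference(population_a, population_b))  # 8.0
```

With equal Gapdh Ct values, a target that crosses threshold three cycles earlier in one population corresponds to an 8-fold higher normalized expression in that population.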

2.2 Immunophenotypic analysis and prospective isolation of LT-HSCs using Gata3, EPCR and IL-7Rα as markers

Prospective isolation of viable LT-HSCs is limited to the use of cell surface antigens and supravital stains due to the fixation and permeabilization steps necessary to target intracellular antigens. Although EPCR and IL-7Rα are cell surface receptors, Gata3 is a transcription factor that shuttles between the cytosol and nucleus94. Immunophenotypic analyses and isolation of the fraction defined by the three markers were made possible by using a Gata3eGFP/+ mouse strain in which exon 4 of Gata3 was replaced with a cassette encoding eGFP under the native Gata3 promoter111.

Figure 2-5. Immunophenotypic analysis of EPCR+ Gata3+ IL-7Rα- cells in Gata3eGFP/+ mice. Gata3-enriched bone marrow was gated for Gata3+ EPCR+ and IL-7Rα- cells. Using this gating strategy, the expression of both mature cell markers (CD3, B220 and CD122), and markers for stem and early progenitor populations (c-Kit and Sca1) were analyzed by flow cytometry.

Using this Gata3eGFP/+ mouse strain, the utility of EPCR and IL-7Rα for excluding differentiated hematopoietic cell types that contaminated the Gata3+ fraction was confirmed. The expression of CD3, B220 or CD122 was used to mark T-lymphocytes, B-lymphocytes, and NK cells, respectively. CD122 was identified as a NK marker expressed in an early NK progenitor population prior to the expression of NK1.1 or DX5, and continues to be expressed in mature NK cells96,98. The EPCR+ Gata3+ IL-7Rα- (EGI) phenotype excludes the undesired T, B and NK lineages as expected from the gene expression analysis performed on the hematopoietic hierarchy. Cells in the EGI fraction also homogeneously express two HSC cell surface markers, Sca1 and c-Kit18,112 (Figure 2-5).

To directly link the EGI marker combination to cells capable of enduring multilineage engraftment, cells of this phenotype were sorted and injected into lethally irradiated (9 Gy) recipient mice in competitive reconstitution assays (Figure 2-6A). Twenty-five donor EGI cells purified from Ly5.2-Gpi1b/b mice were co-injected with 5 × 10^5 Ly5.1-Gpi1a/a competitor unseparated BM cells into six Ly5.1-Gpi1a/a recipient mice in two independent experiments. Based on previous quantifications of LT-HSC numbers in unseparated bone marrow (1:20,000)5, this experimental set-up is a 1:1 competitive reconstitution assay where theoretically, one would expect equal donor and competitor chimerism in the recipients if all of the donor EGI cells were LT-HSCs.
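The 1:1 design follows from simple arithmetic on the numbers given above. The following is a minimal sketch, assuming that every donor EGI cell is an LT-HSC and that each LT-HSC contributes equally to the graft; the function and variable names are illustrative only.

```python
# Expected competitor LT-HSC dose and donor chimerism in the competitive
# reconstitution assay. Assumptions (illustrative): all EGI cells are
# LT-HSCs, and every LT-HSC contributes equally to the graft.

LT_HSC_FREQUENCY = 1 / 20_000   # LT-HSCs per unseparated BM cell (from the text)
COMPETITOR_BM_CELLS = 5e5       # unseparated competitor BM cells co-injected
DONOR_EGI_CELLS = 25            # sorted donor EGI cells injected


def expected_donor_chimerism(donor_lt_hscs: float, competitor_lt_hscs: float) -> float:
    """Fraction of the graft expected to derive from donor LT-HSCs."""
    return donor_lt_hscs / (donor_lt_hscs + competitor_lt_hscs)


competitor_lt_hscs = COMPETITOR_BM_CELLS * LT_HSC_FREQUENCY
chimerism = expected_donor_chimerism(DONOR_EGI_CELLS, competitor_lt_hscs)

print(f"Competitor LT-HSCs: {competitor_lt_hscs:.0f}")
print(f"Expected donor chimerism: {chimerism:.0%}")
```

Under these assumptions the competitor inoculum carries about 25 LT-HSCs, matching the 25 donor cells. Donor chimerism well below one half would then suggest that only a fraction of the EGI cells are LT-HSCs.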

The recipient mice were monitored for donor myeloid, lymphoid and erythroid reconstitution for 24 weeks (Figure 2-6B). At 8 week intervals, peripheral blood was obtained from the saphenous vein of the recipient mice. The isoforms of Ly5 surface antigen were used to distinguish donor from competitor myeloid cells and lymphocytes by flow cytometry. Separately, glucose phosphate isomerase 1 (Gpi1) isoforms were analyzed by electrophoresis to distinguish the erythrocytes.


Figure 2-6. EPCR+ Gata3+ IL-7Rα- phenotype is highly enriched in LT-HSCs. (A) EPCR+ Gata3+ IL-7Rα- 7AAD- cells were sorted and cultured overnight in the conditions described in Section 2.1.5. Cells were counted the next day, and 25 EGI cells were co-injected with 5 × 10^5 whole bone marrow cells into lethally irradiated recipient mice. (B) Donor myeloid (CD11b+ Gr1+ CD3- B220-), T-lymphocyte (CD3+ B220- CD11b- Gr1-), B-lymphocyte (B220+ CD3- CD11b- Gr1-), and erythrocyte reconstitution were examined at 8-week intervals to 24 weeks. Mean donor engraftment levels and S.E.M. data were obtained from six recipient mice in two independent experiments.

The high donor engraftment levels of the myeloid, lymphoid and erythroid lineages link the EGI phenotype to cells that are capable of long-term multilineage engraftment. Previous studies have found that HSC activity resided only in EPCR+103,110 and Gata3+94 fractions. Consequently, it can be inferred from our results and these studies that most, if not all, HSC activity is defined by our marker combination. The gene expression data and the corroborating functional assays with Gata3, EPCR and IL-7Rα demonstrate that these markers would be capable of identifying LT-HSCs with high specificity in the microdroplets.


2.3 Discussion

The fractionation of the RKSL phenotype into cells expressing high levels of α2-integrin isolated a cell type akin to LT-HSCs in that they could produce multilineage grafts in lethally irradiated recipient mice5. These cells, termed IT-HSCs, contaminated phenotypes used by numerous groups to isolate LT-HSCs due to similarities in their cell surface antigens (e.g., CD34-, Flt3-, LSK). However, IT-HSCs differed principally from LT-HSCs in their inability to sustain multilineage engraftment beyond 16 weeks. From transcriptome analyses of these populations, which differed distinctly in their self-renewal properties, Gata3 was found to be expressed predominantly in LT-HSCs, and it was subsequently found that Gata3 restrained self-renewal upon its relocation from the cytosol to the nucleus94. In addition to its preferential expression and functional role as a negative regulator of LT-HSC self-renewal, its expression persisted as the LT-HSCs cycled. These qualities support Gata3 as a potential surrogate LT-HSC mRNA marker.

The upregulation of Gata3 mRNA downstream of the IT-HSCs necessitated the use of additional markers that complemented Gata3 to exclude cells incapable of enduring multilineage engraftment. My host laboratory developed a high resolution archive of cDNA samples representing the diverse lineages of the hematopoietic hierarchy. Adding the NK lineage to it, I used this archive to identify EPCR and IL-7Rα as mRNA transcripts that in combination with Gata3 could provide a distinct LT-HSC mRNA signature.

Corroborating the gene expression data were immunophenotypic and functional analyses that exploited an eGFP reporter mouse strain to monitor the promoter activity of Gata3. Using this Gata3eGFP/+ mouse strain, it was found that the EPCR+ IL-7Rα- phenotype excluded Gata3+ T-, B- and NK cells (Figure 2-5). Cells with an EPCR+ Gata3+ IL-7Rα- (EGI) phenotype were also found to yield enduring multilineage engraftment in competitive reconstitution assays (Figure 2-6). The long-term competitive reconstitution assays demonstrate that the EGI phenotype is highly enriched in LT-HSCs.

The functional assays, however, were performed with cells purified based on their protein expression. Although Gata3 expression levels represented the transcriptional activity of its native promoter, and thus reflected endogenous mRNA levels, EPCR and IL-7Rα expression were based on their presence or absence as cell surface antigens. Since mRNA levels will be used to identify LT-HSCs, the question then arises: Do the mRNA and protein levels of EPCR and IL-7Rα correlate? Previous studies indicate that mRNA levels are typically correlated with the levels of their protein products within a certain expression range. For highly expressed genes, the correlation diminishes, possibly due to ribosomal saturation113,114. These highly expressed genes are predominantly involved in metabolism and protein synthesis114. The fact that EPCR and IL-7Rα are not involved in such functions suggests that the mRNA levels of EPCR and IL-7Rα could be correlated with their protein products. However, this is complicated by post-transcriptional regulatory processes. Nonetheless, correlations between mRNA and protein levels of EPCR and IL-7Rα were also supported by the lack of T, B and NK lineages in the EGI phenotype as determined by both the PCR analysis on the hematopoietic hierarchy and the flow cytometry analysis on BM. It is also important to mention a difference: LT-HSCs were characterized by EPCRhi and Gata3hi mRNA levels, while no quantitative consideration was used for Gata3 and EPCR during the FACS purification of the EGI phenotype for functional analysis.

Experiments will be conducted with the microfluidic system in the collaborating laboratory of Dr. Peter Zandstra to validate the use of Gata3, EPCR and IL-7Rα as mRNA markers for enumerating murine LT-HSCs under conditions where LT-HSC numbers are expected to vary significantly. The conditions are described below (Figure 2-7). If the enumerations of LT-HSCs using the microfluidic system consistently match the current gold standard in vivo assay enumerations under the various conditions, then this will be a proof of principle for using the microfluidic system to enumerate rare LT-HSCs in a high-throughput manner.


Figure 2-7. Validation of Gata3, EPCR and IL-7Rα mRNA markers for enumerating LT-HSCs under conditions where LT-HSC numbers are expected to vary. Samples from different origins will be divided so that a portion is enumerated directly on the microfluidic system while the remaining portion will be injected into recipient mice for a biological readout. The diverse set of conditions will assess the ability of the mRNA markers to predict the outcome of the biological assays.

For cells extracted from steady state bone marrow and subjected to different purification strategies, the approximate purity level of LT-HSCs has already been determined5,32,104. These samples can be enumerated rapidly on the microfluidic system and compared to the values obtained in previous studies. Under conditions that deplete LT-HSC numbers in culture, or after selection for lineage marker positive cells (e.g., CD11b, Gr1, CD3, B220), the number of positive microdroplets should be minimal. Most interestingly, this platform can be used as a high-throughput screen for identifying culture conditions that could expand the number of LT-HSCs. Conditions that promote the greatest increase in the number of microdroplets positive for an LT-HSC mRNA signature can then be analyzed in vivo. The ease and short timeframe needed to obtain results should allow for a more comprehensive analysis of the different levels of each culture component than would otherwise be permitted by in vivo studies.

The qPCR analyses conducted in this chapter are on bulk populations of cells, and thus, the detected gene expression levels are an average of the constituent cells in each population. Gene expression analyses of bulk populations conceal the cell-to-cell variability observed when single cells are interrogated115. This variability will be encountered when analyzing single cells in the microdroplets. The variability in gene expression can be partially explained by a two-state model where there are transitions between an “off” and “on” state of transcriptional activity116. Changes in the nucleosome occupancy upstream of the transcription start site could drive the transitions from the repressive to the permissive transcriptional state. Nucleosomes could compete with transcription factor binding, and influence the threshold level of stimulus needed to initiate gene expression117. When the promoter is open, multiple rounds of transcription can be initiated. It is also probable that the transcriptional bursts result from upstream pulsatile signals or cell cycle effects118. Another mechanism could be the stalling of RNA polymerase II downstream of the promoter, leading to the build-up of polymerase along the DNA strand and resulting in bursts of mRNA synthesis upon release of the initial stalled polymerase119,120. Molecular traffic caused by variability in elongation rates can also cause transcriptional bursts121. Additionally, in single cells where the numbers of molecules involved in transcription are limited, fluctuations in their numbers are expected to have a more pronounced effect on the gene expression levels122. Since qPCR is a “snapshot” of gene expression levels at a particular point in time, it is likely that the detected levels of expression of the mRNA markers will vary from cell to cell.
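The consequence of a two-state model for a single-cell "snapshot" can be illustrated with a small stochastic simulation. The sketch below is illustrative only: it uses a standard Gillespie-type simulation of a promoter that switches between "off" and "on", and the rate constants are arbitrary values chosen to produce bursts, not measurements for Gata3, EPCR or IL-7Rα.

```python
import random
from statistics import mean, pvariance


def simulate_two_state(k_on, k_off, k_tx, k_deg, t_end, rng):
    """Gillespie simulation of one cell; returns its mRNA count at time t_end."""
    t, promoter_on, mrna = 0.0, False, 0
    while True:
        switch_rate = k_off if promoter_on else k_on   # promoter state change
        tx_rate = k_tx if promoter_on else 0.0         # transcription only when "on"
        deg_rate = k_deg * mrna                        # first-order mRNA decay
        total = switch_rate + tx_rate + deg_rate
        t += rng.expovariate(total)                    # waiting time to next event
        if t > t_end:
            return mrna                                # the "snapshot" at t_end
        r = rng.random() * total                       # choose which event occurred
        if r < switch_rate:
            promoter_on = not promoter_on
        elif r < switch_rate + tx_rate:
            mrna += 1
        elif mrna > 0:
            mrna -= 1


rng = random.Random(0)
# Hypothetical rates (per arbitrary time unit), chosen only for illustration.
counts = [simulate_two_state(k_on=0.1, k_off=1.0, k_tx=20.0, k_deg=1.0,
                             t_end=50.0, rng=rng) for _ in range(500)]

fano = pvariance(counts) / mean(counts)   # about 1 for Poisson noise, >1 for bursts
silent = sum(c == 0 for c in counts) / len(counts)
print(f"mean = {mean(counts):.2f}, Fano factor = {fano:.1f}, "
      f"cells with 0 mRNA = {silent:.0%}")
```

With rates of this kind, many simulated cells carry no transcript at the sampling time while a few carry many. A bulk average would conceal exactly this spread.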

The discrimination of LT-HSCs from the downstream transient stem and progenitor cells required the quantification of Gata3 and EPCR, rather than the all-or-nothing gene expression used to exclude the committed hematopoietic populations. A caveat of this study is that cell-to-cell variability stemming from the mechanisms described above was averaged out, and its consequences on the quantitative analysis of Gata3 and EPCR are unknown. Attempts to study these genes in single cells using a commercial single cell qRT-PCR kit (Invitrogen CellsDirect One Step qRT-PCR) were problematic. There were issues with the detection of Gata3 mRNA. Attempts to improve the conditions by using four different primer pairs, applying techniques to relax RNA secondary structure, and changing salt and primer conditions did not yield any notable improvements. A technical note by Fluidigm describing this protocol indicated that up to a third of the 96 genes analyzed failed due to technical limitations123. This technical limitation may have been exacerbated by the low RNA content of HSCs, and particularly the low Gata3 expression in this population as determined by the qPCR studies (Figure 2-4). Nevertheless, the current tools present in my host laboratory allowed me to obtain population-based gene expression, immunophenotypic and functional data that have demonstrated that the EGI marker combination can distinguish LT-HSCs from downstream hematopoietic cells. These experiments support the implementation of this marker combination onto the microfluidic system for further validation.


Chapter 3 Development of a multiplex one-step qRT-PCR for analyzing single cell gene expression in the microdroplets

Experiments analyzing RNA or cells encapsulated in microdroplets were performed in collaboration with Jennifer Ma in Dr. Peter Zandstra’s laboratory. Figures 3-1, 3-13 and 3-25 were created by Jennifer Ma. Prior to the experiments described in Section 3.1.2.8, Jennifer Ma had developed a microfluidic platform to encapsulate RNA or cells, together with RT-PCR reagents, into microdroplets (Figure 3-1).

3.1 One-step qRT-PCR to directly analyze single cell gene expression in microdroplets

A microfluidic platform was used to encapsulate single cells with RT-PCR reagents into microdroplets for gene expression analysis. Each 1 nL microdroplet is an individual RT-PCR. The process of encapsulation, thermal cycling, and analysis of the RT-PCR products is outlined in Figure 3-1 and described in greater detail in Section 3.1.2.8.

Figure 3-1. Workflow for cell encapsulation, thermal cycling and analysis of RT-PCR products. The microfluidic system has two inlets. The presence of two inlets allows a cell suspension and RT-PCR reagents containing a cell lysis buffer to enter separately. The two inlets merge into an orifice where microdroplets are formed by opposing streams of fluorocarbon oil. Microdroplets are collected into 0.2 mL PCR tubes, thermal cycled, and deposited as a monolayer for imaging using the Thermo Scientific Cellomics ArrayScan VTI HCS Reader. The fluorescence pattern in each microdroplet represents the gene expression profile of the encapsulated cell.


Once the microdroplets are formed, it is not feasible to modify the contents of individual microdroplets without sacrificing the throughput of the reaction. High throughput is required to detect rare cells such as LT-HSCs. Therefore, the reagents for cell lysis, RT-PCR, and detection of the RT-PCR products must be packaged into the microdroplet with the cell. The simultaneous encapsulation of all the reagents requires that there be no interference between the components of each reaction step, such as the possible inhibition of Taq polymerase activity by reverse transcriptase124-126. Additionally, it has been observed previously that a high concentration of cell lysate inhibits PCR68,80,127. The amount of lysate produced from the encapsulation of a single cell in a 1 nL microdroplet exceeds the inhibitory cell lysate concentration observed by these cited studies by up to 5-fold. The reaction must be able to tolerate high lysate concentrations to quantify the target genes from the encapsulated cells efficiently. Accordingly, the experiments in this chapter were designed to develop a one-step qRT-PCR that could tolerate the high cell lysate concentration in the microdroplets.

3.1.1 Simulation of the microdroplet environment

The volume of a 1 nL microdroplet is approximately 4 orders of magnitude smaller than that of a conventional qRT-PCR. Performing qRT-PCR in a nanolitre volume increases the sensitivity of the reaction68,73-75 and reduces the amount of reagents used per data point. However, despite the encapsulation of only a single cell into each microdroplet, the cell lysate concentration is equivalent to approximately 1,000 cells/μL due to the nanolitre volume of the microdroplet (Figure 3-2A). Cell concentrations of 200 cells/μL were found to inhibit the reaction in conventional qRT-PCR volumes and during attempts at downscaling the reaction volume on a microfluidic device68,80,127. I reproduced the inhibition observed in these cited studies (Figure 3-2B). On qPCR amplification plots, the inhibition by high cell lysate concentrations was observed as irregular fluorescence generation that did not correspond to the generation of cDNA product (Figure 3-2B; red curves). In other reactions, inhibition was observed on the amplification plots as minimal or no fluorescence generation (Figure 3-6). In contrast, these patterns of fluorescence generation were not seen when qRT-PCRs were performed using a concentration of 10 cells/µL or an amount of total RNA equivalent to 1,000 cells/µL as templates.


Furthermore, the detrimental effect of high cell lysate concentrations on RT-PCR product yield was detected by agarose gel electrophoresis (Figure 3-2C). By comparing band intensities, inhibition was evident because the product yield from an RT-PCR using 1,000 cells/µL as the template concentration was less than the yield of a reaction where the input number of cells was 100-fold less. Although this inhibition can be avoided in conventional qRT-PCRs by diluting the cell lysate or purifying the RNA prior to the reaction, the throughput needed for rare cell detection makes it unfeasible to take these extra steps to overcome the inhibition by high cell lysate concentrations.


Figure 3-2. Simulation of microdroplet cell lysate concentrations in conventional 10 μL qRT-PCRs. (A) The lysate concentration of a cell in a 1 nL microdroplet is equivalent to 10,000 cells in a 10 μL reaction. (B) QRT-PCR was performed using pre-modified parameters on 100 cells, 10,000 cells, and 100 ng RNA in 10 μL reactions on the same reaction plate. (C) Products of the qRT-PCR in (B) were visualized on a 2% agarose gel. The black arrow indicates the specific Gata3 band.

If qRT-PCR parameters could be adapted to function efficiently in high cell lysate concentrations, then it should be possible to analyze the gene expression of single cells in the microdroplets directly. To simulate the cell lysate concentration in the microdroplets, 10,000 cells were added to each 10 μL qRT-PCR. This simulation in 10 μL qRT-PCRs provided an economical and time-efficient means of analyzing various concentrations of each reaction component to adapt our qRT-PCR parameters to the microdroplet environment.
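The equivalence between one cell in a 1 nL microdroplet and 10,000 cells in a 10 μL reaction is a unit conversion. A minimal sketch of the calculation is given below; the variable names are illustrative, and the 200 cells/μL value is the inhibitory concentration cited earlier in this section.

```python
# Lysate concentration of one cell in a 1 nL microdroplet, and the number of
# cells needed to reproduce it in a conventional 10 uL reaction.

NL_PER_UL = 1_000

droplet_volume_nl = 1          # microdroplet volume (nL)
bulk_volume_ul = 10            # conventional reaction volume (uL)
inhibitory_cells_per_ul = 200  # inhibitory concentration reported in cited studies

# One cell per droplet, expressed in cells/uL.
droplet_cells_per_ul = 1 * NL_PER_UL / droplet_volume_nl

# Cells to add to the bulk reaction to match the droplet concentration.
cells_for_simulation = droplet_cells_per_ul * bulk_volume_ul

# How far the droplet concentration exceeds the inhibitory concentration.
fold_over_inhibitory = droplet_cells_per_ul / inhibitory_cells_per_ul

# Ratio of bulk to droplet volume (the "4 orders of magnitude").
volume_ratio = bulk_volume_ul * NL_PER_UL / droplet_volume_nl

print(droplet_cells_per_ul, cells_for_simulation, fold_over_inhibitory, volume_ratio)
```

The same conversion can be reused for other droplet sizes by changing `droplet_volume_nl`.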


3.1.2 Identification of singleplex qRT-PCR parameters to quantify mRNA levels directly from high cell lysate concentrations

3.1.2.1 Quantitative reverse transcription polymerase chain reaction

After reverse transcription of the mRNA target into cDNA, fluorescence measurements of cDNA production are taken at each thermal cycle. Whereas endpoint PCR typically measures near the plateau phase, qPCR allows one to use the exponential phase in which the reaction is at its highest efficiency to yield accurate quantification. During the exponential phase, the product ideally doubles with each thermal cycle. As reagents are consumed, degraded, and inhibitory PCR products accumulate, the exponential phase progresses into the subsequent linear and plateau stages.

Figure 3-3. Quantitative PCR amplification plot. Real-time fluorescence measurements are plotted in terms of their intensity per each thermal cycle.

TaqMan hydrolysis probes (Life Technologies) were used to monitor the amount of product generated in each thermal cycle. The TaqMan probe is an oligonucleotide that has a fluorophore at its 5’ end, and a fluorescence quencher at the 3’ end. It binds specifically to an internal region of the amplicon. When Taq polymerase amplifies the target cDNA, it cleaves the hybridized TaqMan probe, and separates the fluorophore from the quencher (Figure 3-4). Thus, the generation of fluorescence corresponds to the amount of amplification product.

Figure 3-4. Mechanism of real-time PCR quantification by TaqMan hydrolysis probes. TaqMan probes anneal prior to primers, and their fluorescence is quenched through fluorescence resonance energy transfer. When Taq polymerase encounters the bound TaqMan probe, it displaces and cleaves the oligonucleotide with its 5’ – 3’ exonuclease activity, releasing the fluorophore from the quencher. Consequently, the development of fluorescence corresponds to the generation of specific product.


A fluorescence threshold is set in the exponential phase where the intensity of the fluorescence signal is significantly greater than the background signal. The number of thermal cycles required to cross the fluorescence threshold is known as the threshold cycle (CT). The CT value is inversely related to the target abundance: the more abundant the target, the fewer cycles are needed to reach the threshold, and each doubling of the starting template lowers the CT by about one cycle at ideal efficiency. The RN is the fluorescence emission intensity of the reporter normalized to ROX. ROX is an inert fluorescent dye added to normalize for well-to-well variations due to instrument limitations and pipetting errors. ΔRN is the fluorescence intensity of the sample after the baseline fluorescence level is subtracted. The baseline fluorescence level can be obtained in the early cycles of PCR prior to detectable fluorescence or in the no template control. All qRT-PCRs in this chapter were performed in duplicate or triplicate.
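The quantities defined above (RN, ΔRN and CT) can be computed from raw fluorescence readings as sketched below. This is an illustrative implementation on synthetic data, not the algorithm of the instrument software. The baseline window, the threshold value, the linear interpolation between cycles and the idealized curve shape are all assumptions.

```python
from statistics import mean


def normalize_to_rox(reporter, rox):
    """Rn: reporter fluorescence divided by the ROX reference, per cycle."""
    return [r / x for r, x in zip(reporter, rox)]


def baseline_subtract(rn, baseline_cycles):
    """Delta-Rn: Rn minus the mean Rn over the (1-indexed) baseline cycles."""
    baseline = mean(rn[c - 1] for c in baseline_cycles)
    return [value - baseline for value in rn]


def threshold_cycle(delta_rn, threshold):
    """Fractional cycle at which delta-Rn first crosses the threshold."""
    for i in range(1, len(delta_rn)):
        lo, hi = delta_rn[i - 1], delta_rn[i]      # cycles i and i + 1
        if lo < threshold <= hi:
            return i + (threshold - lo) / (hi - lo)
    return None                                    # threshold never reached


def synthetic_reporter(initial_signal, cycles=40, background=1.0, plateau=5.0):
    """Idealized curve: perfect doubling each cycle, capped at a plateau."""
    return [background + min(plateau, initial_signal * 2 ** c)
            for c in range(1, cycles + 1)]


rox = [1.0] * 40                 # constant passive reference (assumed)
baseline_cycles = range(3, 11)   # assumed window before detectable signal
threshold = 0.1                  # assumed threshold within the exponential phase

ct = {}
for label, initial_signal in {"high": 1e-7, "low": 1e-7 / 8}.items():
    rn = normalize_to_rox(synthetic_reporter(initial_signal), rox)
    ct[label] = threshold_cycle(baseline_subtract(rn, baseline_cycles), threshold)

print(f"CT difference for an 8-fold difference in template: {ct['low'] - ct['high']:.2f}")
```

In this idealized example an 8-fold lower template input delays the CT by close to three cycles, which is the relationship between CT and target abundance described above.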

3.1.2.2 Primer and TaqMan hydrolysis probe design

Primers (SickKids TCAG) were designed to avoid amplifying genomic DNA by designing one of the primers to span an exon-exon junction. TaqMan hydrolysis probes were used for real-time quantification of product generation (Figure 3-4). PREMIER Biosoft Beacon Designer 8.0 (Trial Version) was used as the first step to design the TaqMan assays with the following criteria:

Primers: 18 – 25 bp / Tm = 56±4°C / exon junction spanning

Probes: 18 – 30 bp / Tm of primers + ~10°C

Amplicon length: 80 – 200 bp

TaqMan probes were designed with a higher Tm to ensure that they hybridize to their targets before the primers. The early hybridization of the TaqMan probes is essential because the 5’ – 3’ exonuclease activity of Taq polymerase during elongation separates the fluorophore from its quencher on the TaqMan probes (Figure 3-4). Thus, TaqMan probes must be annealed prior to polymerization for the fluorescence generation to correspond to amplicon production. Beacon Designer and NetPrimer were used to identify regions of the primers prone to form primer-dimers, and particular attention was paid to avoid GC-rich complementary regions. Assays were examined for unintended targets using NCBI Primer-BLAST (http://www.ncbi.nlm.nih.gov/tools/primer-blast/) and known SNPs were avoided using NCBI dbSNP (www.ncbi.nlm.nih.gov/SNP/). The sequences of the primers and probes used in Section 3.1 are listed in Table 3-5.
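The design criteria listed above can be expressed as a simple screening function. The sketch below is hypothetical. It takes lengths and Tm values as already computed by the design software, since the Tm algorithm is not specified here. It interprets "~10°C" as 10 ± 3°C above the mean primer Tm, which is an assumption, and the example assay values are invented for illustration.

```python
from dataclasses import dataclass, replace


@dataclass
class TaqManAssay:
    forward_len: int
    reverse_len: int
    probe_len: int
    forward_tm: float        # degrees C, as reported by the design software
    reverse_tm: float
    probe_tm: float
    amplicon_len: int
    spans_exon_junction: bool


def check_assay(assay, probe_offset=10.0, offset_tolerance=3.0):
    """Return the list of design criteria the assay violates (empty = passes)."""
    problems = []
    for name, length, tm in (("forward", assay.forward_len, assay.forward_tm),
                             ("reverse", assay.reverse_len, assay.reverse_tm)):
        if not 18 <= length <= 25:
            problems.append(f"{name} primer length {length} outside 18-25")
        if abs(tm - 56.0) > 4.0:
            problems.append(f"{name} primer Tm {tm} outside 56 +/- 4 C")
    if not 18 <= assay.probe_len <= 30:
        problems.append(f"probe length {assay.probe_len} outside 18-30")
    primer_tm = (assay.forward_tm + assay.reverse_tm) / 2
    # Assumed reading of "~10 C": within offset_tolerance of primer Tm + 10 C.
    if abs(assay.probe_tm - primer_tm - probe_offset) > offset_tolerance:
        problems.append(f"probe Tm {assay.probe_tm} not ~10 C above primers")
    if not 80 <= assay.amplicon_len <= 200:
        problems.append(f"amplicon length {assay.amplicon_len} outside 80-200")
    if not assay.spans_exon_junction:
        problems.append("no primer spans an exon-exon junction")
    return problems


# Invented example values, for illustration only.
good = TaqManAssay(20, 22, 25, 57.0, 58.5, 68.0, 120, True)
bad = replace(good, probe_tm=60.0, amplicon_len=250)

print(check_assay(good))
print(check_assay(bad))
```

A screen of this kind only encodes the numerical criteria; primer-dimer, off-target and SNP checks still rely on the tools named above.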


3.1.2.3 Cell lines

Addition of 10,000 cells to a 10 μL qRT-PCR simulated the lysate concentration of a microdroplet containing a single cell. The cell lines were selected based on their expression profile of Gata3, EPCR and IL-7Rα (Table 3-1). Additionally, the cell lines are non-adherent and of murine or human hematopoietic origin. The size of the cells in the cell lines was compared to murine LT-HSCs that were cycling in culture by direct microscopy (Table 3-1; Figure 3-5).

Figure 3-5. Images of cycling LT-HSCs and hematopoietic cell lines at 400X magnification. The images were processed using the Leica Application Suite. Scale bar is 50 μm.

The mean cell diameters of the cell lines were significantly larger than those of cycling LT-HSCs. Their larger cell size suggests that the qRT-PCR conditions developed to tolerate the lysate of these cell lines should be able to function in LT-HSC lysate. However, the reaction will also have to tolerate other kinds of cells having differing sizes and contents.

Table 3-1. Culture conditions and cell diameters of cycling LT-HSCs and the hematopoietic cell lines used for the microdroplet cell lysate simulations

Cell type | Cell diameter (μm) | Culture conditions
Cycling LT-HSCs | 7.8 ± 1.0 | Refer to Section 2.1.5
B62c: Murine pre-B-lymphocytes | 10.6 ± 2.1* | IMDM + 5% FBS + 630 μL/L α-thioglycerol + 1:200 IL-7 (from Dr. Chris Paige’s lab)
EL4: Murine T-cell lymphoma | 17.3 ± 3.1* | IMDM + 5% FBS + 630 μL/L α-thioglycerol
Jurkat: Human T-cell lymphoma | 16.1 ± 2.3* | IMDM + 5% FBS + 630 μL/L α-thioglycerol

Mean diameter ± SD. *One-tailed t-test, n = 15, p < 0.0001

3.1.2.4 Single-stranded DNA binding proteins

The addition of single-stranded DNA binding proteins to our qRT-PCR was first considered because of their ability to alleviate the inhibition of Taq polymerase by reverse transcriptase128 and other inhibitory substances129. A T7 bacteriophage single-stranded DNA binding protein, gene 2.5 protein, was shown to sequester primers and probes and prevent them from annealing to complementary sequences. When the temperature is raised to approximately 55°C (Figure 3-6A)130,131, gene 2.5 protein permanently denatures, releases the bound oligonucleotides, and allows annealing to occur. The binding protein will be referred to here as BP. Because 55°C was also the temperature of the reverse transcription step, the sequestration of primers by BP acted as a hot-start mechanism that prevented non-specific reverse transcription prior to initiation of the RT-PCR reaction sequence.

The expression of Gata3 mRNA was analyzed in 10,000 EL4 cells (Gata3+) in a 10 μL qRT-PCR with varying amounts of BP added to each reaction (Figure 3-6B). The following thermal cycle profile was used for the qRT-PCR experiments. After a 30-minute hold at 55°C for reverse transcription, the reverse transcriptase was deactivated by a 2-minute hold at 95°C. Subsequently, 40 cycles of 15 seconds at 95°C, 30 seconds at 55°C and 30 seconds at 72°C were performed. RT-PCR products were analyzed by agarose gel electrophoresis for the formation of specific products. As visualized by the qPCR amplification plots and their corresponding gel electrophoresis images immediately below, adding 2.0 μg of BP per reaction (0.24 μg per pmol of primers and probes) was necessary for the development of an exponential phase.
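The thermal cycle profile described above can be written down as a small data structure, which also gives the programmed run time. This is a sketch for illustration; the step labels are descriptive names chosen here, and ramping time between temperatures is not included.

```python
# Thermal cycle profile from the text: (step, temperature in C, seconds).
HOLDS = [
    ("reverse transcription", 55, 30 * 60),
    ("reverse transcriptase inactivation", 95, 2 * 60),
]
CYCLE = [
    ("denaturation", 95, 15),
    ("annealing", 55, 30),
    ("extension", 72, 30),
]
N_CYCLES = 40


def programmed_seconds(holds, cycle, n_cycles):
    """Total programmed hold time, excluding temperature ramping."""
    hold_time = sum(seconds for _, _, seconds in holds)
    cycle_time = sum(seconds for _, _, seconds in cycle)
    return hold_time + n_cycles * cycle_time


total_s = programmed_seconds(HOLDS, CYCLE, N_CYCLES)
print(f"{total_s / 60:.0f} min of programmed hold time (ramping excluded)")
```

The two initial holds account for 32 minutes and the 40 cycles for 50 minutes, so the cycling stage dominates the run time.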

BP acted as a hot-start mechanism in my qRT-PCR, and it was also able to rescue a high cell lysate qRT-PCR performed using a kit that already contained a hot-start Taq polymerase. The CellsDirect One-Step qRT-PCR (Invitrogen) is marketed to quantify mRNA directly from cell lysate, albeit at a lower concentration and only after processing the lysate. 2,000 EL4 cells/μL inhibited qRT-PCRs using the CellsDirect kit, which recommends a maximum of 300 cells/μL in the reaction (Figure 3-7A). When BP was added to the CellsDirect kit at the concentration used in our reaction, the amplification of Gata3 mRNA was rescued as indicated by the development of a fluorescence curve on the qPCR amplification plot, and a band of correct size on an agarose gel (Figure 3-7A).

Despite the ability of the CellsDirect kit to detect Gata3 mRNA in 2,000 cells/μL when BP was added to the reaction, the qRT-PCR I developed (conditions described in Table 3-3; referred to here as “in-house”) outperformed the kit on the basis of reaction efficiency and product yield in the context of high cell lysate (Figure 3-7). On the qPCR amplification plot, the exponential phase of the in-house (Table 3-3) qRT-PCR had a steeper slope relative to the CellsDirect kit rescued by the addition of BP (Figure 3-7A). The steeper slope of the exponential phase in the in-house qRT-PCR indicated that the reaction more closely doubled the amount of product with each thermal cycle relative to the CellsDirect kit. Furthermore, the in-house qRT-PCR yielded more Gata3 cDNA product than the CellsDirect kit rescued by BP addition, as determined by the higher plateau phase on the qPCR plot and the brighter band intensity on the agarose gel (Figure 3-7A,B).


Figure 3-7. Addition of binding protein rescued the CellsDirect One-Step qRT-PCR Kit from inhibition by high cell lysate concentration. (A) Gata3 mRNA levels in 20 and 20,000 EL4 cells were analyzed using the in-house reaction, and CellsDirect kit with or without the addition of BP (10 μL reaction). (B) A 2% agarose gel visualizes the end products of the qRT-PCR with 20,000 EL4 cells.

There were negligible differences between the in-house qRT-PCR and the CellsDirect kit with and without the addition of BP at 2 EL4 cells/μL (Figure 3-7A). Thus, it appeared that BP is advantageous only in the context of high cell lysate concentrations.

3.1.2.5 Use of Nonidet P-40 for cell lysis

Nonidet P-40 (NP-40) is a non-ionic detergent used commonly to lyse cells for gene expression analysis by RT-PCR88,132. NP-40 was needed at a concentration of 0.01% for the lysis of high concentrations of cells (Figure 3-8A). Without 0.01% NP-40, there was a CT delay of 6.37 ± 0.33 (mean ± SD) cycles when Gata3 mRNA was analyzed in 10,000 cells in a 10 μL reaction. No delay was observed in a reaction containing 10 cells. Furthermore, the difference in CT between 1,000 cells/μL and 1 cell/μL in the presence of 0.01% NP-40 was, as expected, 10.42 ± 0.88 cycles (theoretical ΔCT = 9.97). Together with the complete trypan blue staining of the cells subjected to 0.01% NP-40 in our qRT-PCR, this suggests that 0.01% NP-40 in our specific qRT-PCR environment can disrupt cell membrane integrity sufficiently for target mRNA quantification.
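The theoretical value of 9.97 cycles quoted above corresponds to a 1,000-fold difference in template at ideal efficiency, where product doubles each cycle. A short check is sketched below; the function name is illustrative, and the comparison with the observed value is only a rough consistency check against its reported SD.

```python
import math


def expected_delta_ct(fold_difference, efficiency=1.0):
    """Cycles separating two inputs that differ by fold_difference.

    With efficiency E, product grows by (1 + E) per cycle, so the expected
    separation is log(fold_difference) / log(1 + E).
    """
    return math.log(fold_difference) / math.log(1.0 + efficiency)


theoretical = expected_delta_ct(1_000)     # 1,000 cells/uL versus 1 cell/uL
observed, observed_sd = 10.42, 0.88        # values reported in the text

print(f"theoretical delta-CT = {theoretical:.2f}")
print(f"observed within one SD: {abs(observed - theoretical) <= observed_sd}")
```

An efficiency below 1 would increase the expected separation, so a measured difference slightly above 9.97 cycles is in the direction one would expect from a reaction that is not perfectly efficient.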

Figure 3-8. Addition of NP-40 to the qRT-PCR for cell lysis. (A) Analysis of Gata3 mRNA in 10,000 and 10 EL4 cells in a 10 μL qRT-PCR with 0.01% and no NP-40. (B) Agarose gel electrophoresis visualizes Gata3 amplicon (black arrow) after 40 cycles with 0.01% NP-40 and no NP-40. (C) Gata3 mRNA detection in high cell lysate concentrations at increasing NP-40 concentrations.

Further increase in the NP-40 concentration to 0.5% did not improve the qRT-PCR under high cell lysate concentrations. In fact, there was a 1.54-fold decrease in the resultant fluorescence plateau (Figure 3-8C).

3.1.2.6 Systematic titration of qRT-PCR components

Using 10,000 EL4 cells as the template in a 10 μL reaction, various qRT-PCR components were tested systematically (summarized in Section 3.1.2.9). Since a constant amount of template was used in each reaction, changes observed in the CT value, fluorescence plateau or amplification efficiency should be the result of altering the qRT-PCR component.

Increasing Taq polymerase or primer concentrations by 3-fold increased the fluorescence plateau by 1.37-fold and 1.21-fold, respectively. The higher fluorescence plateau observed on the qPCR amplification plot corresponded to a band of greater intensity on an agarose gel (Figure 3-9). Increasing both Taq polymerase and primer concentrations 3-fold simultaneously augmented the product yield, as indicated by a 2.2-fold increase in the fluorescence plateau and higher band intensity (Figure 3-9). The increase in both components also improved the reaction sensitivity, as indicated by a decrease in the CT value (1.9 ± 0.45 cycles).


Figure 3-9. Product yield and sensitivity were improved by increasing Taq polymerase and primer concentrations. Gata3 mRNA was analyzed in 10,000 EL4 cells in a 10 μL qRT-PCR. Taq polymerase and primer concentrations were increased 3-fold individually and simultaneously. The CT values and fluorescence plateau were examined in the qPCR amplification plots (left). The products were analyzed after 40 cycles on a 2% agarose gel (right).

The amount of PCR product in a qRT-PCR with an ideal efficiency (E) should double with each thermal cycle during the exponential phase. The exponential phase is followed by cycles yielding diminishing product generation, finally reaching a plateau where the efficiency approaches 0. Nonetheless, the reaction efficiency remains fairly constant during the early cycles in the exponential phase. The number of PCR products at cycle n ($X_n$) can be expressed as $X_n = X_0(1+E)^n$, where $X_0$ is the number of initial template molecules and E is the efficiency of the reaction133. From this equation, it can be seen that the ideal efficiency is 1, where product would double at each cycle. Since the fluorescence intensity of the TaqMan probe (R) is proportional to X, the equation can be rewritten as $R_n = R_0(1+E)^n$. Using this equation, the amplification efficiency can be expressed as

$$E = \left(\frac{R_{n,A}}{R_{n,B}}\right)^{\frac{1}{C_{T,A}-C_{T,B}}} - 1 \qquad \text{(Equation 1)}$$

where $R_{n,A}$ and $R_{n,B}$ are two fluorescence thresholds along the exponential phase of a qPCR curve, and $C_{T,A}$ and $C_{T,B}$ are the CT values corresponding to those fluorescence thresholds133. Implementing this equation, the 3-fold increase in the concentrations of Taq polymerase and primers increased the amplification efficiency from an average of 0.64 to 1.10 in the exponential phase of the qRT-PCR. The efficiency of the qRT-PCR was later confirmed using the conventional standard curve method (Figure 3-11,12).
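Equation 1 states that E equals the ratio of two fluorescence thresholds, raised to the reciprocal of the difference between their CT values, minus 1. It can be applied directly to two points read from the exponential phase of a single amplification curve. The sketch below uses invented threshold and CT values purely to illustrate the calculation; they are not data from this study.

```python
def amplification_efficiency(r_a, ct_a, r_b, ct_b):
    """Equation 1: E = (R_A / R_B) ** (1 / (CT_A - CT_B)) - 1."""
    return (r_a / r_b) ** (1.0 / (ct_a - ct_b)) - 1.0


# Ideal case: fluorescence doubles over exactly one cycle, so E = 1.
ideal = amplification_efficiency(r_a=0.2, ct_a=21.0, r_b=0.1, ct_b=20.0)

# Less efficient case (hypothetical): a 4-fold rise takes 2.8 cycles.
slower = amplification_efficiency(r_a=0.4, ct_a=22.8, r_b=0.1, ct_b=20.0)

# Round trip: generate R_n = R_0 * (1 + E) ** n with a known E and recover it.
true_e, r0 = 0.9, 1e-6
recovered = amplification_efficiency(r0 * (1 + true_e) ** 18, 18,
                                     r0 * (1 + true_e) ** 15, 15)

print(f"ideal E = {ideal:.2f}, slower E = {slower:.2f}, recovered E = {recovered:.2f}")
```

Because both thresholds come from the same curve, the estimate does not require a dilution series, which is why it is a convenient first check before the standard curve method.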

Subsequently, TaqMan probes were titrated from 50 nM to 550 nM in a 10 μL qRT-PCR analyzing Gata3 mRNA expression in 10,000 EL4 cells. Increasing the TaqMan probe concentration from 50 nM to 350 nM decreased the CT value and increased the fluorescence plateau (Figure 3-10A,B). The minimal improvement observed when the TaqMan probe concentration was increased from 350 nM to 550 nM suggests that a probe concentration of 350 nM was sufficient to detect product accumulation reliably (Figure 3-10B). Although commercial RT-PCR kits typically recommend a TaqMan probe concentration of 100 to 250 nM, concentrations up to 350 nM have been used134,135.


Figure 3-10. Titration of TaqMan probes from 50 nM to 550 nM. (A) TaqMan probes were titrated from a concentration of 50 nM to 350 nM in the presence of 10,000 EL4 cells in a 10 μL qRT-PCR analyzing Gata3 mRNA expression. (B) Threshold cycle (CT) values (left axis; black squares) and the fluorescence plateau intensity (right axis; red squares) were plotted for each TaqMan probe concentration.

Aside from using BP and increasing the Taq polymerase, primer and TaqMan probe concentrations, changes in MgCl2, SuperScript III, dNTPs, RNase inhibitor, and BSA from the initially chosen concentrations did not yield significant further improvement in sensitivity or efficiency. Incorporating the foregoing changes, summarized in Table 3-3, the ability of the reaction to quantify mRNA directly from cell lysate was evaluated in the next experiments.

Table 3-3 summarizes the qRT-PCR parameters. Beside each of the qRT-PCR components are experimental details regarding the concentrations tested, and other observations.


Table 3-3. Summary of singleplex qRT-PCR parameters

| qRT-PCR Parameters | Experimental Details |
|---|---|
| Adaptations that improved the qRT-PCR | |
| 1. 30 mM TrisHCl, pH 8.0 | Increased from 10 mM. Increased fluorescence intensity in microdroplets. Improved the fluorescence plateau in 10 μL reactions when amplifying highly abundant targets and had negligible effects on low abundant targets (Appendix 1A,B) |
| 2. 0.3 U/μL Taq polymerase | Increased from 0.1 U/μL (also tested 0.03 U/μL) |
| 3. 900 nM primers | Increased from 300 nM (also tested 100 nM) |
| 4. 350 nM TaqMan probes | Increased from 250 nM (also tested 50 nM to 550 nM) |
| 5. 450 ng/μL bovine serum albumin | Increased from 150 ng/μL. Increased fluorescence intensity in microdroplets, but no difference was observed in 10 μL reactions (Appendix 1B) |
| 6. 58°C annealing temperature | Increased from 55°C (Appendix 1C; also tested to 60°C) |
| Conditions that were not tested, or tested and did not improve the qRT-PCR | |
| 1. 2 mM MgCl2 | Tested up to 6 mM |
| 2. 400 μM dNTPs | Tested 0.3X and 3X |
| 3. 1 U/μL SuperScript III reverse transcriptase | Tested 0.3X and 3X |
| 4. 1 U/μL SUPERase•In RNase Inhibitor | Tested 2X |
| 5. 0.01% NP-40 | Tested 0.1% and 0.5% |
| 6. 0.2 μg BP/pmol primers+probes | Tested 0.1 μg/pmol and 0.3 μg/pmol |
| 7. 55 mM KCl | Did not test |
| 8. 1 mM DTT | Did not test |
| 9. 50 nM ROX passive reference dye | Did not test |
| 10. 2.2% glycine | Did not test |

3.1.2.7 Characterization of the optimized singleplex qRT-PCR

Gata3 mRNA transcripts were quantified directly across a range of 1 to 10,000 EL4 cell(s) in 10 μL qRT-PCRs (Figure 3-11). Across the range of cell numbers tested, there was robust quantification of Gata3 mRNA levels. The CT values were plotted against the cell concentration analyzed in the 10 μL qRT-PCRs (Figure 3-11B). On the semi-logarithmic graph, the standard curve is a linear function as expected (R² = 0.9963). Using the slope of the standard curve (a), the reaction efficiency across the range tested was calculated to be 1.08 (acceptable range = 0.90 – 1.10) with the following formula: $E = 10^{-1/a} - 1$ (Equation 2)136.
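The standard curve calculation can be sketched in Python as follows. The least-squares fit and the function names are illustrative assumptions, and the CT values in the example are synthetic (generated for an ideal reaction), not the measurements plotted in Figure 3-11.

```python
import math


def fit_standard_curve(log10_input, ct_values):
    """Ordinary least-squares fit of CT against log10(template amount).

    Returns (slope, intercept, r_squared).
    """
    n = len(log10_input)
    mean_x = sum(log10_input) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_input)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_input, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(log10_input, ct_values))
    ss_tot = sum((y - mean_y) ** 2 for y in ct_values)
    return slope, intercept, 1.0 - ss_res / ss_tot


def efficiency_from_slope(slope):
    """Equation 2: E = 10 ** (-1 / a) - 1, where a is the standard curve slope."""
    return 10.0 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Synthetic five-point, 10-fold dilution series for an ideal reaction:
    # CT falls by 1 / log10(2) (about 3.32) cycles per 10-fold more template.
    cells = [1, 10, 100, 1000, 10000]
    xs = [math.log10(c) for c in cells]
    ys = [35.0 - x / math.log10(2) for x in xs]
    a, b, r2 = fit_standard_curve(xs, ys)
    print(a, r2, efficiency_from_slope(a))
```

A slope shallower than about -3.32 cycles per decade gives an efficiency above 1, and a steeper slope gives an efficiency below 1.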


Figure 3-11. Testing the qRT-PCR parameters across a range of cell lysate. Gata3 mRNA was quantified across a range of approximately 1 – 10,000 EL4 cell(s) per 10 μL. (A) QPCR amplification plot of the five 10-fold dilutions of EL4 cells. (B) CT vs cell concentration plot with a standard curve was used to calculate R² and E.

QRT-PCR was then used to quantify a broad range of Gata3 and EPCR transcript abundances under high cell lysate concentrations (Figure 3-12). The addition of different quantities of Gata3 or EPCR transcripts while keeping the cell concentration in the reaction at 1,000 cells/μL simulated qRT-PCR in the microdroplet closely.


Figure 3-12. Quantification of Gata3 and EPCR RNA in the presence of 10³ B62c cells/μL. (A) Amplification plot of the five 10-fold dilutions of Gata3 and EPCR RNA. (B) CT vs in vitro transcript concentration plot of Gata3 and EPCR with their standard curves to calculate R² and E.

In the presence of 10,000 B62c cells (B62c = Gata3lo EPCR-), five 10-fold dilutions of Gata3 and EPCR transcripts were added to 10 μL reactions. The range of transcript concentrations in these experiments was based on the range of transcript concentrations expected in a microdroplet containing a single cell113,114. QRT-PCRs performed on the dilution series of the Gata3 and EPCR transcripts yielded R² values greater than 0.99, and efficiencies of 0.94 and 1.06, respectively. The amplification efficiencies calculated using the conventional standard curve method were consistent with those based on qPCR curve kinetics (Table 3-2)133.


Table 3-2. Comparison of Gata3 and EPCR amplification efficiencies calculated by qPCR curve kinetics and the standard curve method

| Gata3 transcript copies/nL | RN,A | RN,B | Average CT,A | Average CT,B | E |
|---|---|---|---|---|---|
| 1 | 0.20 | 0.40 | 29.40 | 30.51 | 0.90 |
| 10 | 0.20 | 0.40 | 26.28 | 27.31 | 1.00 |
| 10² | 0.20 | 0.40 | 22.54 | 23.68 | 0.87 |
| 10³ | 0.20 | 0.40 | 19.17 | 20.26 | 0.93 |
| 10⁴ | 0.20 | 0.40 | 15.37 | 16.49 | 0.89 |
| Average efficiency across all dilutions (Equation 1) | | | | | 0.92 |
| Efficiency based on slope of standard curve (Equation 2) | | | | | 0.94 |

| EPCR transcript copies/nL | RN,A | RN,B | Average CT,A | Average CT,B | E |
|---|---|---|---|---|---|
| 1 | 0.20 | 0.40 | 28.48 | 29.61 | 0.86 |
| 10 | 0.20 | 0.40 | 25.33 | 26.36 | 1.00 |
| 10² | 0.20 | 0.40 | 22.47 | 23.41 | 1.11 |
| 10³ | 0.20 | 0.40 | 19.01 | 20.02 | 1.00 |
| 10⁴ | 0.20 | 0.40 | 15.73 | 16.70 | 1.08 |
| Average efficiency across all dilutions (Equation 1) | | | | | 1.01 |
| Efficiency based on slope of standard curve (Equation 2) | | | | | 1.06 |

3.1.2.8 Implementation of the singleplex qRT-PCR in microdroplets containing single cell equivalents

The titration of qRT-PCR components in 10 μL reactions containing 10,000 cells identified conditions that should support quantification of gene expression in cell lysate concentrations found in microdroplets containing a single cell. In this section, I describe experiments testing the reaction in the microdroplets to demonstrate that the reaction conditions identified in my 10 μL qRT-PCR simulations could be implemented onto the microfluidic system.

Microdroplets were generated on a microfluidic system configured as shown in Figure 3-1. The microdroplets were monodispersed, and remained discrete through thermal cycling typical of PCR. 83% of the microdroplets subjected to 40 thermal cycles had a diameter within 2 standard deviations of the microdroplets that had not been thermal cycled (Figure 3-13A). The thermal stability of the microdroplets is the result of adding a non-ionic surfactant composed of a perfluorinated polyether-polyethyleneglycol (PFPE-PEG) copolymer into the oil71. The PFPE component is soluble in the fluorocarbon oil and provides steric stabilization to the microdroplet. The PEG headgroup is soluble in the aqueous phase of the microdroplet and prevents adsorption of biological components to the microdroplet-oil interface71.

A single cell equivalent (10 pg) of μ23 (Gata3lo) or EL4 (Gata3hi) total RNA was encapsulated in the microdroplets and subjected to the same thermal cycle profile used for the 10 μL qRT-PCRs (Figure 3-13B). We observed a difference in the expression levels of Gata3 between the two cell lines that corresponded appropriately to their expression profiles of Gata3. Interestingly, Gata3 mRNA was not detected in all of the microdroplets encapsulated with 10 pg of μ23 total RNA. We hypothesized that there were insufficient Gata3 mRNA transcripts in the μ23 sample to fill all of the microdroplets with at least one transcript. At such a limiting dilution, where the majority of the microdroplets were negative for Gata3 mRNA, it is likely that the signal from each positive microdroplet was generated from approximately 1 copy. The next experiment determined directly whether the RT-PCR had the sensitivity to reliably detect 1 mRNA transcript encapsulated in a microdroplet.

Using limiting dilutions of IL-7Rα transcript numbers in B62c total RNA, one should be able to determine the copy numbers of IL-7Rα mRNA present in an amount of B62c RNA using Poisson statistics if the reaction is capable of single copy sensitivity (Figure 3-13C). The number of IL-7Rα transcripts per picogram of B62c RNA was previously determined by Jennifer Ma using the Bio-Rad QX100 Droplet Digital PCR System. The proportions of IL-7Rα negative microdroplets encapsulated with B62c total RNA were determined for 4 total RNA concentrations. The empirical results obtained using the microfluidic system correlated well with the Poisson predictions (R² = 0.994). Thus, the microdroplet RT-PCR has the sensitivity to detect single mRNA molecules.
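The Poisson prediction referred to above can be sketched as follows. If transcripts are distributed randomly among microdroplets with a mean of λ copies per microdroplet, the expected fraction of negative microdroplets is exp(-λ). The function names and the example values of λ are hypothetical and are not the concentrations used in Figure 3-13C.

```python
import math


def expected_negative_fraction(mean_copies_per_droplet):
    """Poisson probability that a microdroplet contains zero copies."""
    if mean_copies_per_droplet < 0:
        raise ValueError("mean copies per droplet cannot be negative")
    return math.exp(-mean_copies_per_droplet)


def mean_copies_from_negative_fraction(negative_fraction):
    """Invert the Poisson zero term: lambda = -ln(fraction negative)."""
    if not 0.0 < negative_fraction <= 1.0:
        raise ValueError("negative fraction must be in (0, 1]")
    return -math.log(negative_fraction)


if __name__ == "__main__":
    # Hypothetical mean loadings (copies per microdroplet).
    for lam in (0.1, 0.5, 1.0, 2.0):
        print(lam, expected_negative_fraction(lam))
```

Agreement between the observed negative fractions and these predictions is what supports single-copy sensitivity: a reaction that missed single copies would show more negative microdroplets than predicted.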


Figure 3-13. RNA and cell RT-PCR in the microdroplets. (A) Distribution of microdroplet diameters (μm) before PCR thermal cycling and after 40 thermal cycles. (B) Gata3 mRNA was analyzed in 10 pg of μ23 or EL4 total RNA in the microdroplets. Fluorescence images corresponding to the RNA samples or a no template control (NTC) are presented with their fluorescence distributions immediately below. (C) IL-7Rα mRNA in B62c total RNA was analyzed at limiting mRNA transcript numbers. The frequencies of negative microdroplets correlate closely with Poisson predictions. (D) IL-7Rα mRNA was analyzed in microdroplets containing either B62c cells or 10 pg B62c total RNA. Microdroplets were analyzed after 50 thermal cycles.

The RT-PCR was then used to analyze gene expression directly from cells encapsulated in the microdroplets (Figure 3-13D). Either 10 pg of B62c total RNA or B62c cells were encapsulated in microdroplets that were approximately 130 μm in diameter (volume = 1.15 nL). A single cell in such a microdroplet closely matches the high cell lysate concentration simulated in the 10 μL qRT-PCRs containing 10,000 cells. In microdroplets encapsulated with either B62c total RNA or cells, IL-7Rα mRNA was analyzed. The comparable fluorescence intensities of the RNA and cell samples after thermal cycling demonstrate that the conditions for qRT-PCR constructed in this study can quantify RNA accurately despite the high lysate concentrations implicit in microdroplets containing single cells.
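The correspondence between a single cell in a microdroplet and 10,000 cells in a 10 μL reaction follows from simple geometry, sketched below under the assumption that the microdroplet is a sphere. The function names are hypothetical.

```python
import math


def droplet_volume_nl(diameter_um):
    """Volume of a spherical microdroplet in nanolitres (1 nL = 1e6 cubic μm)."""
    radius_um = diameter_um / 2.0
    volume_um3 = (4.0 / 3.0) * math.pi * radius_um ** 3
    return volume_um3 / 1e6


def cells_per_ul(cells, volume_nl):
    """Cell concentration in cells/μL for a given volume in nL."""
    return cells / (volume_nl / 1000.0)


if __name__ == "__main__":
    v = droplet_volume_nl(130.0)             # about 1.15 nL
    print(v, cells_per_ul(1, v))             # one cell per microdroplet
    print(cells_per_ul(10000, 10.0 * 1000))  # 10,000 cells in 10 μL
```

One cell in a 130 μm microdroplet corresponds to roughly 870 cells/μL, which is close to the 1,000 cells/μL used in the 10 μL simulations.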

The qRT-PCR parameters identified using the 10 μL qRT-PCR simulations of high cell lysate concentration were capable of single mRNA molecule quantification, and single cell gene expression analysis in the microdroplets. Subsequent efforts were focused on identifying multiplex qRT-PCR parameters capable of quantifying Gata3, EPCR and IL-7Rα expression simultaneously. The experimental approach to identifying multiplex parameters is discussed in Section 3.2.

3.2 Development of triplex parameters to analyze LT-HSC mRNA markers in the microdroplets

Multiplex qRT-PCR is defined as the reverse transcription of two or more mRNA targets into cDNA, followed by the amplification of the cDNA targets in the same reaction. The amount of cDNA generated in each thermal cycle is monitored by TaqMan probes with a unique fluorophore for each target cDNA. Multiplex qRT-PCRs have been developed to reduce reagent consumption, and increase the amount of genetic information that can be obtained from scarce quantities of template. The ability to perform multiplex qRT-PCR is critical for the microdroplet assay because the expression of EPCR, Gata3 and IL-7Rα mRNA must be analyzed simultaneously in a single cell to determine if a microdroplet contains an LT-HSC. The expression levels of EPCR, Gata3, and IL-7Rα mRNA in the different cell types of the hematopoietic system vary significantly (Figure 2-4). In single eukaryotic cells, although over 80% of detectable mRNA transcripts were expressed under 10 copies, other transcripts were expressed up to 10⁴ copies114. Consequently, the detectable mRNA transcript abundance in a single cell can range from 1 – 10⁴ copies. Quantification of multiple genes varying significantly in their abundance levels can be difficult because the amplification of low transcript numbers is hindered by highly abundant targets competing for many of the same reagents137. Failure to amplify any of the mRNA markers could result in inaccurate calls on the identity of the cell in the microdroplet.

3.2.1 An experimental strategy to simulate single cell mRNA relative abundances

An experimental strategy was developed to identify multiplex qRT-PCR parameters capable of quantifying EPCR, Gata3 and IL-7Rα mRNA simultaneously even when their expression levels were significantly different. To simulate the expression levels of EPCR, Gata3 and IL-7Rα mRNA, in vitro transcription products of each gene were added into the qRT-PCRs as template while systematically testing the reaction components (Figure 3-14). The in vitro transcription products, referred to here as in vitro transcripts (IVTs), were synthesized from cDNA template as described in Figure 3-14, and contained an exon-exon junction that could be targeted by primers and TaqMan probes for subsequent qRT-PCR. By using IVTs, the number of RNA transcripts added to each reaction was known. EPCR, Gata3 and IL-7Rα IVTs were added into 10 μL qRT-PCRs in specific amounts to simulate the range of mRNA concentrations expected in a 1 nL microdroplet containing a single cell. The overview of this approach is presented in Figure 3-14.

Figure 3-14. Experimental strategy used to simulate the range of mRNA concentrations of single cells encapsulated in a 1 nL microdroplet. A minimal T7 promoter and poly(T) tail were incorporated into the cDNA of the target genes by PCR. Subsequently, T7 RNA polymerase was used to transcribe the cDNA into IVTs. Synthesized IVTs were then combined in specific amounts to simulate the upper and lower limits of mRNA numbers in single cells. 10,000 cells were also spiked into the 10 μL reactions.

3.2.2 Generation of the in vitro transcripts

Primers used to generate the cDNA template for in vitro transcription were designed to flank the region targeted by the primers used in the multiplex reaction. IVTs are 700-800 bp segments of the mRNA transcript around the exon-exon boundaries targeted for subsequent qRT-PCR. The forward primers used to generate the cDNA template included a minimal T7 promoter sequence to allow the cDNA to be transcribed by T7 RNA polymerase, and the reverse primers contained a poly(T) sequence for the generation of transcripts with a poly(A) tail (Figure 3-14; Table 3-4).

The generation of the cDNA by PCR was carried out using the protocol described in Section 2.1.3. The PCR products were analyzed on a 2% agarose gel, and purified using the Qiagen QIAquick PCR Purification Kit (column-based). The purified cDNA was quantified using a NanoDrop ND-1000, and stored at -20°C until it was needed for in vitro transcription.

Table 3-4. In vitro transcript information and primer sequences used to generate cDNA template

In vitro transcripts

Gata3
  Forward: 5’ TAATACGACTCACTATAGGCAATGCCTGCGGACTCTACC 3’
  Reverse: 5’ TTTTTTTTTTTTTTTTTTTTGAATGGCTTATTCACAAATGGGA 3’
  Length: 733 nt
  Molecular weight: 235,639.5 g/mol

EPCR
  Forward: 5’ TAATACGACTCACTATAGGTCTCCTATTTCCAAGACAACCATC 3’
  Reverse: 5’ TTTTTTTTTTTTTTTTTTTTGCTACGACCTTCCCTTACCAG 3’
  Length: 729 nt
  Molecular weight: 235,309.1 g/mol

IL-7Rα
  Forward: 5’ TAATACGACTCACTATAGGCAATGCCTGCGGACTCTACC 3’
  Reverse: 5’ TTTTTTTTTTTTTTTTTTTTGAATGGCTTATTCACAAATGGGA 3’
  Length: 819 nt
  Molecular weight: 263,952.9 g/mol

Minimal T7 promoter: located at the 5’ end of each forward primer.

In vitro transcription was performed using the MEGAscript T7 Transcription Kit (Invitrogen). 200 ng of purified cDNA was added to each reaction. The reactions were incubated at 37°C for 4 hours. 1 μL of DNase provided in the kit was added to digest the cDNA template, and the synthesized RNA was purified using TRIzol Reagent.

The purified transcripts were quantified using a NanoDrop ND-1000. IVT copy numbers were approximated based on molecular weight calculations performed using OligoCalc (http://www.basic.northwestern.edu/biotools/oligocalc.html)138. IVTs were diluted to 10¹⁰ copies, snap-frozen on dry ice, and stored at -80°C as 2 – 5 μL aliquots in PCR tubes for single use.
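The conversion from a measured RNA mass to an approximate copy number can be sketched as below, using Avogadro's constant and the molecular weights listed in Table 3-4. The function name and the 1 ng example mass are illustrative assumptions; the source does not state the masses that were measured.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_from_mass(mass_ng, molecular_weight_g_per_mol):
    """Approximate number of RNA molecules in mass_ng nanograms of transcript."""
    moles = (mass_ng * 1e-9) / molecular_weight_g_per_mol
    return moles * AVOGADRO


if __name__ == "__main__":
    # Molecular weights from Table 3-4 (g/mol); 1 ng is a hypothetical mass.
    ivt_mw = {"Gata3": 235639.5, "EPCR": 235309.1, "IL-7Ra": 263952.9}
    for name, mw in ivt_mw.items():
        print(name, copies_from_mass(1.0, mw))
```

For transcripts of this size, 1 ng corresponds to a few billion copies, so a stock of 10¹⁰ copies requires only nanogram quantities of IVT.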

3.2.3 Characterization of the in vitro transcript combinations

Four combinations of the EPCR, Gata3, and IL-7Rα IVTs were assembled to simulate the range of mRNA transcript numbers in 1 nL microdroplets encapsulated with a single cell (Figure 3-14). Transcript numbers can range from 1 – 10⁴ copies in a single cell114. Consequently, encapsulation of a single cell in a 1 nL microdroplet results in mRNA concentrations ranging from 1 – 10⁴ copies/nL, or 10³ – 10⁷ copies/μL when expressed on a scale relevant to 10 μL qRT-PCRs.
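The unit conversion in the preceding paragraph, and the total number of IVT copies implied for a 10 μL reaction, can be written out as a short sketch. The function names are hypothetical; the 10 μL default simply mirrors the reaction volume used throughout this section.

```python
NL_PER_UL = 1000  # nanolitres per microlitre


def copies_per_ul(copies_per_nl):
    """Convert a concentration in copies/nL to copies/μL."""
    return copies_per_nl * NL_PER_UL


def copies_per_reaction(copies_per_nl, reaction_volume_ul=10.0):
    """Total copies needed so a bulk reaction matches a droplet concentration."""
    return copies_per_ul(copies_per_nl) * reaction_volume_ul


if __name__ == "__main__":
    for c in (1, 10**4):
        print(c, copies_per_ul(c), copies_per_reaction(c))
```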

Using an Applied Biosystems 7900HT instrument unless otherwise stated, each reaction was performed in duplicate or triplicate with the following thermal cycling profile: 55°C for 30 minutes, 95°C for 2 minutes, and 40 cycles of 95°C for 15 seconds, 55°C or 58°C for 30 seconds and 72°C for 30 seconds. PCR products were analyzed on 3% agarose gels to clearly separate bands of similar length.

Singleplex qRT-PCR parameters were used to analyze 4 combinations of EPCR, Gata3 and IL-7Rα IVTs in a triplex reaction. The results of each triplex reaction were compared directly to three individual singleplex reactions performed in the same qPCR experiment (Figure 3-15). When all three transcripts were present in high abundance, the exponential phases of the triplex qRT-PCR amplification curves superimposed onto their respective singleplex controls (IVT combination 1). However, the amplification of low abundant targets was significantly hindered by the presence of highly abundant targets (IVT combinations 2 and 3).

Figure 3-15. Triplex qRT-PCR in four combinations of high and low IVT abundances using singleplex qRT-PCR parameters. Combinations of IL-7Rα, Gata3 and EPCR IVTs were spiked into 10 μL qRT-PCRs to determine if the singleplex parameters could quantify vastly different target abundances. Genes analyzed in triplex reactions (black curves) were compared directly to their three respective singleplex controls (red curves) that were performed in the same experiment.

In IVT combination 3, quantification of EPCR and Gata3 IVTs was hindered by the presence of highly abundant IL-7Rα IVT (Figure 3-15). To begin experimenting with the components of the qRT-PCR to improve the triplex reaction, IVT combination 3 was used as the template because it was the only IVT combination containing two targets that the qRT-PCR could not quantify. Reaction parameters capable of quantifying IVT combination 3 would be tested subsequently on the other combinations. 10,000 B62c cells were spiked into each 10 μL reaction to simulate the amount of lysate in the microdroplets. The B62c cell line was used because its mRNA levels of EPCR, Gata3 and IL-7Rα were compatible with the IVT levels in IVT combination 3 (IL-7Rα CT = 22.0; Gata3 CT = 36.0; EPCR CT = not detected).

3.2.4 Identification of triplex qRT-PCR parameters to quantify Gata3, EPCR and IL-7Rα mRNA in the microdroplets

Developing a multiplex qRT-PCR can be challenging because there is an increased propensity of spurious amplification arising from the use of an additional primer pair per gene target139. Amplification of unintended targets affects the reaction sensitivity because it competes with target-specific amplification for the same pool of reagents. Accordingly, primers must be designed to be target-specific, and to minimize primer-dimer formation across all primer pairs. Furthermore, the concentrations of the singleplex qRT-PCR components will need to be reassessed to meet the demands of amplifying three targets by PCR137,139.


Figure 3-16. Example of a reaction plate configuration and corresponding qPCR curves. (A) Triplex qRT-PCR was performed with varying MgCl2 concentrations (black = initial, red = concentration that improved the reaction, blue = concentration that hindered or did not improve the reaction). Singleplex reactions were also performed on the same plate. (B) QPCR curves were plotted for each gene separately for a visual comparison of the initial reagent concentration, concentration that led to improvement, and the singleplex controls. Other tested concentrations were omitted for clarity. All reactions were performed in duplicates or triplicates.

To find the concentration of qRT-PCR reagents needed to analyze Gata3, EPCR and IL-7Rα simultaneously, the experiments in this section were configured to allow a controlled change in a qRT-PCR component to be compared to the initial condition and to singleplex controls determined to be efficient in Section 3.1 (Figure 3-16A,B). A multiplex reaction is considered successful when the exponential phases of its amplification curves superimpose onto those of their singleplex counterparts, and the amplification is specific as determined by agarose gel electrophoresis.

As described in Figure 3-15, IVT combination 3 (Gene 1hi Gene 2lo Gene 3lo) was the only template where the qRT-PCR could not quantify two of the three target genes, specifically the low abundant targets. Thus, IVT combination 3 was used as the initial template to investigate conditions for an efficient multiplex qRT-PCR. The parameters identified using IVT combination 3 as the template would later be assessed for their ability to quantify the other IVT combinations. A duplex qRT-PCR that I developed was used as the starting point (Appendix 2).

In multiplex qRT-PCR, the number of primer pairs and TaqMan probes increases in proportion to the number of mRNA targets. Since primer sequestration by BP occurs in a stoichiometric manner, more BP is required to sequester the additional primer pairs and TaqMan probes. As expected, when BP was increased to 0.2 μg per pmol of primers and probes, which is the approximate ratio of BP to primers and probes in a singleplex reaction, 1.56- and 1.52-fold increases in the fluorescence plateaus of the Gata3 and EPCR qPCR curves, respectively, were observed (Figure 3-17). Furthermore, without an increase in BP concentration, there was no observable exponential phase in the qPCR amplification curve of EPCR (Figures 3-15 and 3-17).
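The stoichiometric scaling of BP with the number of targets can be illustrated with a rough calculation. This is only a sketch: it assumes that the 900 nM primer concentration in Table 3-3 applies to each primer individually, that each target uses two primers and one probe at the singleplex concentrations, and that the reaction volume is 10 μL. None of these assumptions is confirmed for the triplex reaction, and the function names are hypothetical.

```python
def total_oligo_pmol(primer_nm, probe_nm, n_targets, reaction_ul):
    """Total pmol of primers plus probes in the reaction.

    nM multiplied by μL gives fmol, so divide by 1000 to obtain pmol.
    Assumes two primers and one probe per target (an assumption).
    """
    per_target_nm = 2 * primer_nm + probe_nm
    return per_target_nm * n_targets * reaction_ul / 1000.0


def bp_mass_ug(primer_nm, probe_nm, n_targets, reaction_ul, ug_per_pmol=0.2):
    """BP mass needed to hold a fixed ratio of BP to primers and probes."""
    return ug_per_pmol * total_oligo_pmol(primer_nm, probe_nm,
                                          n_targets, reaction_ul)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, total_oligo_pmol(900, 350, n, 10), bp_mass_ug(900, 350, n, 10))
```

Under these assumptions the BP requirement triples from singleplex to triplex, which is the proportionality the paragraph above relies on.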

Figure 3-17. Increase in binding protein is required for the amplification of multiple RNA targets. IVT combination 3 was analyzed for IL-7Rα, Gata3 and EPCR expression with increasing amounts of BP. Triplex qRT-PCRs (black and red curves) were compared directly to singleplex reactions (gray curve).

Salts are another critical component for an efficient qRT-PCR. KCl and MgCl2 dissociate into monovalent and divalent cations, respectively, that bind to the negatively charged DNA phosphate backbone. The neutralization of the repulsive negative charges on DNA molecules facilitates primer-template pairing, and annealing of DNA molecules. An increase in KCl was found to increase the product yield of shorter amplicons139. The authors of that study hypothesized that an increase in KCl hindered the denaturation of longer templates, and as a result, the shorter templates were preferentially amplified139. Changing the KCl concentration from 55 mM to 90 mM in the triplex reaction increased the total EPCR product formation, as indicated by a 2.21-fold increase in the fluorescence plateau, while having no effect on the amplification of Gata3 and IL-7Rα. Interestingly, EPCR was the shortest amplicon in the reaction.

Figure 3-18. Influence of KCl and MgCl2 salts on triplex qRT-PCR. The effects of increasing KCl and MgCl2 on the qRT-PCR of IVT combination 3 were examined. Specifically, the resultant fluorescence plateau after 40 cycles (indicating product yield), and CT values (indicating sensitivity of the reaction) were used as the criteria to assess the reaction outcomes.

Mg²⁺ is also a Taq polymerase cofactor. 2.5 mM MgCl2 was needed for the triplex reaction, and either a decrease or increase of MgCl2 at an increment of 0.5 mM was detrimental to the reaction. An increase of MgCl2 from 2.0 mM to 2.5 mM decreased the CT by 1.61 ± 0.76 (mean ± SD), which translates to an approximate 2.6-fold increase in sensitivity (Figure 3-18).
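A CT shift can be converted to a fold change using the amplification model in which product grows by a factor of (1 + E) per cycle, so that a shift of ΔCT cycles corresponds to (1 + E)^ΔCT. The sketch below treats the efficiency as a free parameter because the value used for the conversion above is not stated in the text; the function name is hypothetical.

```python
def fold_change_from_delta_ct(delta_ct, efficiency=1.0):
    """Fold change implied by a CT shift, assuming constant efficiency E.

    Each cycle multiplies product by (1 + E), so a shift of delta_ct
    cycles corresponds to a (1 + E) ** delta_ct fold difference.
    """
    if efficiency <= -1.0:
        raise ValueError("efficiency must be greater than -1")
    return (1.0 + efficiency) ** delta_ct


if __name__ == "__main__":
    # Same CT shift evaluated at several assumed efficiencies.
    for e in (0.8, 0.9, 1.0):
        print(e, fold_change_from_delta_ct(1.61, e))
```

The result is sensitive to the assumed efficiency, which is one reason such conversions are usually quoted as approximate.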

Non-specific amplification was observed while investigating the above qRT-PCR components (Figure 3-19A). Although non-specific products will not yield a false signal due to the sequence-specific fluorescence of the TaqMan probes, the use of primer pairs with minimal off-target amplification is important because when non-specific targets are amplified, they compete with the specific targets for the same pool of PCR reagents. Primer design was facilitated by programs that align regions of complementarity (Beacon Designer; Appendix 3), and predict potential unintended targets (Primer-BLAST). However, primers designed to satisfy recommended thermodynamic constraints set forth in silico still yielded non-specific products (Figure 3-19). Thus, an empirical approach with several primer pairs designed computationally as the starting point was used to find a compatible set of multiplex primers (Table 3-5; Appendix 4).


Table 3-5. Primer, probe and target amplicon information

Gene and assay information

Gata3
  F = 5’ CAAGCTTCATAAT][ATTAACAGACC (54°C)
  R = 5’ GCTGAAGGGAGAGATGTG (54°C)
  Probe = 5’ AAGGCATCCAGACCCGAAAC (60°C)
  Fluorophore = 6-FAM; Ex = 495 nm; Em = 520 nm
  Amplicon length = 196 bp

IL-7Rα, Pair 1
  F = 5’ ACCAAAAGCAATGTATGAAATC (52°C)
  R = 5’ CCATCCTC][CTTGATTCTTG (53°C)
  Probe = 5’ AAGGTAGAACTTGGACTCCACTCGCTC (66°C)
  Fluorophore = NED; Ex = 546 nm; Em = 575 nm
  Amplicon length = 127 bp

IL-7Rα, Pair 2
  F = 5’ CAAAGTCCGATCCATTCC (53°C)
  R = 5’ CCATCCTC][CTTGATTCTTG (53°C)
  Amplicon length = 107 bp

IL-7Rα, Pair 3
  F = 5’ CATAACGATTACTTCAAAGGCT (54°C)
  R = 5’ CCATCCTC][CTTGATTCTTG (53°C)
  Amplicon length = 87 bp

EPCR, Pair 1
  F = 5’ GAACGTGTTCT][TTCCTCTC (53°C)
  R = 5’ CATCGAAGAAGACATGGG (52°C)
  Probe = 5’ AGCCCTCCTCCTCCTCCTCC (64°C)
  Fluorophore = VIC; Ex = 538 nm; Em = 554 nm
  Amplicon length = 95 bp

EPCR, Pair 2
  F = 5’ GTGTTCT][TTCCTCTCACTGTTAG (57°C)
  R = 5’ GAAGTTGGCTGCTTTGGA (55°C)
  Amplicon length = 183 bp

EPCR, Pair 3
  F = 5’ GAACGTGTTCT][TTCCTCTCAC (57°C)
  R = 5’ GCCACATCGAAGAAGACATG (57°C)
  Amplicon length = 99 bp

][ indicates an exon-exon boundary.

Using IVT combination 3 as the template, IL-7Rα primer pairs 1 – 3, EPCR primer pairs 1 – 2 and the Gata3 primer pair were tested (Figure 3-19A). The compatibility of the primers was measured by the reaction specificity and their quantitative performance. In the first iteration, non-specific amplification was caused by EPCR primer pair 1, as shown by the formation of spurious products when different IL-7Rα primers were used with EPCR primer pair 1, and by the disappearance of all off-target products when EPCR primer pair 2 was used regardless of which IL-7Rα primer pair it was combined with (Figure 3-19A). Although the amplification curve of IL-7Rα primer pair 3 had a higher fluorescence plateau compared to the other two pairs, EPCR primer pair 2 compared to primer pair 1 had a significant CT delay of 1.354 ± 0.65 across all of the IL-7Rα primer pairs tested (one-tailed t-test, n=6, p=0.0002). As a result, another EPCR primer pair was designed and tested for its ability to quantify IVT combination 3 with IL-7Rα primer pair 3 and the Gata3 primers. Using EPCR primer pair 3, the triplex reaction was specific and its sensitivity was comparable to EPCR primer pair 1 (Figure 3-19B).


Figure 3-19. Testing primers for specificity and quantitative performance. (A) EPCR primer pairs 1 and 2 were tested with IL-7Rα primer pairs 1 – 3 and the Gata3 primers for their ability to analyze IVT combination 3. Products after 40 cycles were visualized on an agarose gel (left) while the qPCR curves assessed the IL-7Rα and EPCR primers quantitatively (right). (B) IL-7Rα primer pair 3 was tested with EPCR primer pair 3, which yielded specific amplification and comparable quantitative performance relative to primer pair 1. Red arrows on the agarose gels indicate non-specific product formation.

The quantity of Taq polymerase is important for an efficient multiplex PCR137,140. After finding a compatible set of primer pairs, different Taq polymerase concentrations were tested (Figure 3-20). An increase in the concentration of Taq polymerase from 0.3 U/μL to 0.4 U/μL improved the duplex reaction (Appendix 2), and a subsequent increase to 0.6 U/μL facilitated the amplification of three targets (Figure 3-20A).

[Figure 3-20, panels A and B. Panel B plots fluorescence intensity against Taq polymerase concentration (0.4 – 0.8 U/μL) for IL-7Rα, Gata3 and EPCR.]

Figure 3-20. Influence of Taq polymerase on triplex qRT-PCR. (A) QPCR amplification plots illustrating the effects of increasing Taq polymerase to 0.6 U/μL on the triplex reaction. (B) The fluorescence intensity of the plateau phase after 40 thermal cycles is plotted for each RNA target at all of the Taq polymerase concentrations tested (0.4 – 0.8 U/μL).

We anticipated that more dNTPs would be required to amplify 3 targets. The concentration of dNTPs was raised in increments of 100 μM while analyzing IVT combination 3 (Figure 3-21). However, multiplex reactions were inhibited progressively with increasing amounts of dNTPs, and a simultaneous addition of MgCl2 alleviated this inhibition139. A balance between dNTPs and MgCl2 for PCR was thought to be needed because dNTPs can bind to Mg²⁺ electrostatically, and thus reduce the amount of Mg²⁺ that can act as a Taq polymerase cofactor. Accordingly, 3 concentrations of dNTPs were tested in a factorial experimental design with 3 concentrations of MgCl2 (Figure 3-21B). A 100 μM increase in the dNTP concentration raised the fluorescence plateau of all three genes (Figure 3-21A). Contrary to what was expected, a corresponding increase in MgCl2 at 500 μM dNTPs was not necessary. But when the concentration of dNTPs was increased to 600 μM, a MgCl2 concentration of 3.0 mM did result in a higher fluorescence plateau, albeit lower than what was observed when 500 μM dNTPs and 2.5 mM MgCl2 were used (Figure 3-21B). Lower MgCl2 concentrations were not tested in this analysis because previous experiments found that the triplex reaction functioned better in 2.5 mM than 2.0 mM MgCl2 (Figure 3-18), and, as reported139, an adequately high increase in dNTPs necessitates an increase in MgCl2.

Similar to the results from the Taq polymerase titrations, improvements made by adjusting dNTPs were incremental relative to the previous modifications. Nevertheless, the better concentration for the triplex reaction could be identified definitively as the one that enhanced the sensitivity or product yield of the RT-PCR, and from which either an increase or a decrease reduced the extent of the improvement.


Figure 3-21. Interplay between dNTP and MgCl2 concentrations in a triplex qRT-PCR. (A) A qRT-PCR containing 500 μM dNTPs (red curves) was compared directly against the singleplex dNTP concentration (black curves) and the singleplex controls (gray curves). (B) Summary of the dNTP and MgCl2 concentrations tested in a factorial set-up for each RNA target in a triplex reaction. The fluorescence intensity of the plateau (ΔRN,F) at each concentration of dNTPs and MgCl2 was measured after 40 thermal cycles.

The above modifications to the qRT-PCR parameters were necessary for quantifying two low level RNA targets in the presence of a target expressed approximately 10⁴-fold higher. Using the identified qRT-PCR parameters, all 4 IVT combinations were analyzed (Figure 3-22). The qRT-PCR parameters were able to quantify EPCR, Gata3 and IL-7Rα levels despite differences of 10⁴-fold. The exponential phases of the triplex reactions were superimposed onto their respective singleplex controls. All the IVT combinations were analyzed in the presence of 10,000 B62c cells except for IVT combination 4, because its template composition conflicts with the gene expression profile of the cell line (IL-7Rαhi Gata3lo EPCR-).


Figure 3-22. Quantification of IVT combinations 1 – 4 using parameters that have been adapted for a triplex qRT-PCR. Four different combinations of IVT copies of the target genes were assembled, representing the extreme transcript abundance differences expected in single cells (left). These combinations were quantified using the triplex qRT-PCR parameters, and compared directly to the singleplex reactions (right). Thus, each triplex reaction was compared to three individual singleplex reactions.

If the reaction is not inhibited by high lysate concentrations, then the multiplex reaction should perform equally in the presence or absence of high cell lysate, and the analysis of IVT combination 4 in the above experiment would be valid. To test the effect of high cell lysate directly, IVT combination 3 was analyzed with 10,000 B62c cells or 2.2% glycine added to the reaction (Figure 3-23). The superimposition of the qPCR curves from the two conditions suggests that the reaction was not inhibited by high lysate concentrations (Figure 3-23; Appendix 5).

Figure 3-23. Quantification of IVT combination 3 in the presence and absence of 1,000 B62c cells/μL. Triplex qRT-PCR was performed to compare the effects of high lysate concentration on the quantitative ability of the qRT-PCR.


Next, the qRT-PCR parameters were used to quantify an mRNA profile similar to that of LT-HSCs (Figure 3-24). IVTs were spiked in so that EPCR was present at a concentration of 10⁴ copies/nL, and Gata3 was diluted from 10⁴ copies/nL to 1 copy/nL. No IL-7Rα transcript was added because LT-HSCs do not express it, but the primers and probes used to detect the transcript were still added to simulate the triplex conditions that will be used on the microfluidic system.


Figure 3-24. Simulation of a LT-HSC mRNA profile using IVTs. (A) Triplex qRT-PCR was performed to quantify Gata3 RNA in five 10-fold dilutions down to 1 copy/nL while EPCR was kept at a constant concentration of 10⁴ copies/nL. The qPCR amplification plots are shown to the left, and the CT values are plotted against their copy numbers to the right. (B) Triplex qRT-PCR in (A) was repeated across the same range, but 2-fold dilutions were performed from 4 copies/nL to 1 copy/nL as well.

In the presence of 10⁴ copies of EPCR IVT per nL, the quantification of Gata3 IVT was possible across a range likely exceeding the levels at which a transcription factor is expressed in a single cell113,114 (Figure 3-24). The average efficiency of Gata3 IVT quantification from 1 copy/nL to 10⁴ copies/nL was calculated to be approximately 0.94 in two independent experiments. Furthermore, a dilution series was performed at 1, 2 and 4 copies per nL (Figure 3-24B). This yielded average CT values of 26.2, 24.9 and 23.6, which is consistent with the approximate 1 CT change for every 2-fold difference. The quantification of Gata3 mRNA is critical for distinguishing LT-HSCs from IT-HSCs. These results demonstrate that the reaction is capable of performing such necessary expression level discriminations.
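The relationship between dilution factor, amplification efficiency and CT shift referred to above can be sketched as follows. This is a minimal illustration only: it assumes the conventional definition in which an efficiency E means that product is multiplied by (1 + E) in each cycle, and the function names are hypothetical. The formula used to obtain the value of 0.94 is not stated here, so the sketch should not be read as the calculation that was actually performed.

```python
import math

def expected_ct_shift(fold_change, efficiency):
    """Cycles needed to make up a fold_change difference in template,
    assuming product multiplies by (1 + efficiency) every cycle."""
    return math.log(fold_change) / math.log(1.0 + efficiency)

def efficiency_from_slope(slope):
    """Efficiency from the slope of CT plotted against log10(copies)."""
    return 10.0 ** (-1.0 / slope) - 1.0

# Shift expected per 2-fold dilution at the reported average efficiency (0.94)
print(round(expected_ct_shift(2, 0.94), 2))

# Average CT values reported in the text for 1, 2 and 4 copies/nL
reported_ct = {1: 26.2, 2: 24.9, 4: 23.6}
steps = [
    round(reported_ct[1] - reported_ct[2], 1),
    round(reported_ct[2] - reported_ct[4], 1),
]
print(steps)
```

Under this assumption, an efficiency of 0.94 predicts a shift of about 1.05 cycles per 2-fold dilution, while the reported average CT values step by 1.3 cycles between 1, 2 and 4 copies/nL.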


With the current multiplex qRT-PCR parameters, the expression of EPCR and IL-7Rα was analyzed simultaneously in microdroplets encapsulated with a single cell equivalent of B62c or Wehi3 total RNA (10 pg; Figure 3-25). Singleplex RT-PCR was also performed in parallel for a direct comparison. The fluorescence intensities of the microdroplets, which corresponded to the product yield of EPCR and IL-7Rα, were analyzed after 50 thermal cycles. B62c cells expressed IL-7Rα at significantly higher levels than Wehi3 cells, as expected from the dependency of B62c cells (a pre-B cell line) on IL-7. This quantitative difference was observed when B62c and Wehi3 total RNA were analyzed in the microdroplets. Whereas all of the B62c RNA-encapsulated microdroplets were IL-7Rα+, only 13.7% of microdroplets encapsulated with Wehi3 RNA were positive for IL-7Rα mRNA. The singleplex and duplex reactions were comparable in their percent of positive microdroplets and resultant fluorescence intensities. The correlations between the singleplex and duplex reactions in the microdroplets were not observed prior to the identification of the multiplex parameters (Appendix 6).

Figure 3-25. Multiplex RT-PCR in the microdroplets. Either a single cell equivalent of B62c (top) or Wehi3 (bottom) total RNA was encapsulated in the microdroplets for RT-PCR. RNA samples were analyzed for EPCR and IL-7Rα mRNA expression in either duplex or singleplex reactions after 50 thermal cycles. The negative controls without template were used to set the threshold as indicated by the vertical lines.
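The scoring described in the caption of Figure 3-25, in which no-template controls set a threshold and microdroplets above it are counted as positive, can be sketched as below. The rule used to derive the threshold from the negative controls is not specified, so taking the brightest negative droplet as the cut-off is an assumption, and all fluorescence values are invented for illustration.

```python
def percent_positive(intensities, threshold):
    """Percentage of microdroplets whose fluorescence exceeds the threshold."""
    if not intensities:
        raise ValueError("no microdroplets to score")
    positives = sum(1 for value in intensities if value > threshold)
    return 100.0 * positives / len(intensities)

def threshold_from_negatives(negative_controls):
    """Hypothetical rule: use the brightest no-template droplet as the cut-off."""
    return max(negative_controls)

# Invented fluorescence values (arbitrary units), for illustration only
no_template = [95, 102, 110, 99]
sample = [105, 480, 101, 520, 98, 97, 103, 100]

cutoff = threshold_from_negatives(no_template)
print(cutoff, percent_positive(sample, cutoff))
```

A stricter rule, such as the mean of the negatives plus a multiple of their standard deviation, could be substituted in `threshold_from_negatives` without changing the rest of the sketch.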

With modifications to the singleplex qRT-PCR, the reaction can now simultaneously quantify three targets across a broad dynamic range, likely exceeding the range necessary for single cell gene expression analysis113,114. In the microdroplets, the multiplex qRT-PCR parameters were also used to analyze EPCR and IL-7Rα mRNA levels from total RNA samples amounting to the approximate quantity found in single cells.

3.3 Discussion

The ability to use a particular gene expression profile as a surrogate for cellular identity relies on the exclusivity of the mRNA marker combination to the cell of interest, and also the ability to analyze the selected mRNA markers reliably. Accordingly, qRT-PCR parameters that could quantify EPCR, Gata3 and IL-7Rα mRNA levels simultaneously were required to determine if a microdroplet contained a LT-HSC. In order to quantify the 3 target genes directly from single cells encapsulated in 1 nL microdroplets, I had to identify conditions that could tolerate high cell lysate concentrations, and had a broad dynamic range sufficient for single cell analysis.

Although decreasing the reaction volume improves the RT reaction and PCR68,74,75,141, inhibition of the reaction by the amount of cell lysate produced from a single cell in a nanolitre volume was observed (Figure 3-2,7). A single cell encapsulated in a 1 nL microdroplet is equivalent to approximately 1,000 cells/μL. Although the systematic testing of the reaction components did not identify the inhibitory component of the lysate, it did identify which qRT-PCR parameters alleviated the inhibition.
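The equivalence between a single cell in a microdroplet and the lysate concentration of a bulk reaction is a unit conversion, sketched below. The function name is hypothetical; the two cases are the 1 nL microdroplet and the 10 μL reactions containing 10,000 cells used in this chapter.

```python
NL_PER_UL = 1_000  # 1 microlitre = 1,000 nanolitres

def cells_per_microlitre(cell_count, volume_nl):
    """Cell-equivalent lysate concentration in cells/uL for a volume given in nL."""
    return cell_count * NL_PER_UL / volume_nl

droplet = cells_per_microlitre(1, 1)                 # one cell in a 1 nL microdroplet
bulk = cells_per_microlitre(10_000, 10 * NL_PER_UL)  # 10,000 cells in a 10 uL reaction
print(droplet, bulk)
```

Both cases give the same cell-equivalent concentration, which is why a 10 μL reaction containing 10,000 cells can stand in for the lysate load of a single encapsulated cell.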

BP was important to the qRT-PCR as a hot-start mechanism (Figure 3-6). Interestingly, addition of BP rescued a commercial kit containing a hot-start Taq polymerase from inhibition by high cell lysate concentrations while having no effects at low lysate concentrations, suggesting that BP may have another role in the reaction aside from providing a hot-start (Figure 3-7). The mechanism by which BP exerts its effects specifically in the context of high cell lysate is unclear. While commonly used hot-start mechanisms such as chemically modified Taq polymerases permit RT during the reaction set-up, BP provides a RT-based hot-start to minimize non-specific cDNA generation by the reverse transcriptase142. Although a RT-based hot-start increases reaction specificity, it is unlikely that it accounts for the rescue of the commercial kit from complete reaction inhibition. In addition to sequestering primers, BP stimulates T7 DNA polymerase activity directly143. Any function BP may have is likely exerted during the reaction set-up, the ramp to reverse transcription and the early stages of reverse transcription, because exposure to temperatures of 55°C or more denatures BP.

Parameters were identified that could quantify mRNA levels across a broad dynamic range in a 10 μL qRT-PCR containing 10,000 cells (Figure 3-11,12; Table 3-2). The qRT-PCR parameters identified in the 10 μL reaction volumes were used in the microdroplets to perform single mRNA molecule and cell gene expression analyses (Figure 3-13). Comparable fluorescence intensities of the mRNA and single cell RT-PCR in the microdroplets suggest that the reaction was not inhibited by high lysate concentrations.

To determine if a microdroplet contains a LT-HSC, qRT-PCR parameters that could quantify EPCR, Gata3 and IL-7Rα simultaneously were identified. Initially, the co-amplification of low level targets in the presence of highly abundant ones failed possibly because of the competition between the targets for a pool of limited reagents that were also subject to thermal degradation144, and the accumulation of PCR products that inhibited the reaction at later cycles145. Additionally, non-specific amplification can further consume the reagents. The problem of co-amplifying multiple transcripts with significant abundance differences was observed most evidently in IVT combination 3 (Figure 3-15). Thus, IVT combination 3 was used as the initial template for identifying multiplex conditions. Using reaction specificity and the quantitative performance of the singleplex reactions determined previously to be efficient as the benchmarks for a successful multiplex reaction, many of the qRT-PCR components were reassessed.

An increase in BP proportional to the increase in the amount of primers and TaqMan probes was important for the development of an exponential phase in the EPCR amplification curves (Figure 3-17). Conversely, an exponential phase was present in the Gata3 amplification curves prior to the increase in BP. Since the same amounts of Gata3 and EPCR template were added to the reaction, it could be argued that the Gata3 assay was more efficient than the EPCR assay. This is supported by the steeper exponential phase of Gata3. Furthermore, this particular pair of EPCR primers (pair 1) was prone to cause non-specific amplification in the triplex reaction (Figure 3-18). It is possible that the increase in BP was necessary for the exponential phase of the EPCR but not the Gata3 amplification curves because the EPCR primers depended more heavily on the sequestration mechanism of BP to minimize their participation in non-specific amplification.


Kainz showed that the accumulation of dsDNA products in the later cycles exerted an inhibitory effect on Taq polymerase activity145. He provided evidence to suggest that the inhibition of Taq polymerase by dsDNA results in the plateau phase of PCR. Indeed, the presence of short dsDNA fragments was found to diminish Taq polymerase activity without sequence-specificity146,147. However, this would imply that the co-amplification of low abundance targets in the presence of high abundance targets is not possible, especially when detectable products from the low abundance target are generated only after amplification of the highly abundant target has reached the plateau phase. This failure was not observed in IVT combinations 2 and 3. The discrepancy may be because the 1 kb dsDNA that was added to simulate the later cycles of PCR in the study did not completely denature due to its length and its presence in such a high concentration, and consequently inhibited the Taq polymerase145. Inhibition of Taq polymerase by dsDNA specifically and not ssDNA is supported by another study where shorter dsDNA molecules exhibited a thermally reversible inhibition that coincided with the melting temperature of the dsDNA molecules147. The lengths of all the gene targets in my experiments were kept under 200 bp to maximize qPCR efficiency. Perhaps this length restriction also prevented the inhibition of Taq polymerase activity in the later cycles.

What causes the plateau phase, if not the mechanism Kainz proposed? And how can low abundance targets still be amplified exponentially while the amplification of highly abundant targets has reached the plateau phase? The annealing kinetics of dsDNA molecules are affected by their concentrations. In the early cycles, when the cDNA targets are relatively sparse, the re-annealing of separated strands after the denaturation hold at 95°C occurs slowly. In the later cycles, however, the high concentration of cDNA amplicon, especially for highly abundant targets, leads to dsDNA formation144,148. Thus, the amplification of highly abundant targets enters the plateau phase possibly because of primer displacement by template re-annealing or poor strand separation. Entry of the highly abundant targets into the plateau phase could actually be favourable for the amplification of low abundance targets because it leads to a more concerted expenditure of the reagents on amplifying the low abundance targets.
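The concentration dependence of strand re-annealing invoked above can be illustrated with idealized second-order kinetics. This is only a sketch under stated assumptions: complementary strands are present at equal concentration, the reaction follows a simple bimolecular rate law so that single strands decay as c0 / (1 + k·c0·t), and the rate constant, time and concentrations are arbitrary illustrative values, not measurements from this work.

```python
def fraction_reannealed(c0, k, t):
    """Fraction of strands back in duplex form after time t, for ideal
    second-order kinetics: single strands decay as c0 / (1 + k * c0 * t)."""
    kct = k * c0 * t
    return kct / (1.0 + kct)

K = 1.0  # arbitrary rate constant (illustrative units)
T = 1.0  # arbitrary time allowed for annealing

early = fraction_reannealed(c0=1e-4, k=K, t=T)  # sparse amplicon, early cycles
late = fraction_reannealed(c0=1e+1, k=K, t=T)   # abundant amplicon, late cycles
print(early, late)
```

With everything else held fixed, the sparse template barely re-anneals in the allotted time whereas the abundant template is mostly duplex again, which is the qualitative behaviour the argument relies on.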

In single eukaryotic cells, over 80% of detectable genes were expressed under 10 copies/cell, with predominantly ribosomal RNA and genes involved in metabolism and protein translation expressed up to 10⁴ copies113,114. Transcripts that were expressed as low as 1 copy per cell can lead to reliable protein production. Interestingly in yeast, approximately half of the genes expressed at approximately 1 copy per cell were essential for their growth and had detectable levels of protein product114. Thus, a broad dynamic range and the ability to detect low transcript numbers would be necessary for the attribution of a particular mRNA profile to cellular identity. The identified reaction parameters can quantify mRNA levels across a broad dynamic range, and were used in the microdroplets to detect single mRNA molecules reliably (Figure 3-11,12,13). Furthermore, using the IVT combinations to simulate gene expression profiles with 10⁴-fold expression level differences between the RNA targets, qRT-PCR parameters were adapted to quantify the expression of the LT-HSC mRNA markers regardless of any abundance differences in the transcripts that may be encountered when analyzing cells of the hematopoietic system (Figure 3-22).

Gata3 and EPCR IVTs were spiked in to simulate the mRNA profile of LT-HSCs (Figure 2-4; Gata3lo EPCRhi IL-7Rα-). Although no IL-7Rα IVT was present, the IL-7Rα primers and probes were added to mimic the triplex reaction parameters used in the microdroplets. The qRT-PCR parameters quantified a range of Gata3 IVTs that encompassed levels expressed in LT-HSCs while also quantifying a high concentration of EPCR IVTs (10⁴ copies/nL). No indication of interference from the unincorporated IL-7Rα primers and probes was observed (Figure 3-24). The rigorous testing of the reaction parameters provided technical confidence that the simultaneous quantification of EPCR, Gata3 and IL-7Rα mRNA in single cells will yield expression profiles that can predict whether a microdroplet contains a LT-HSC or otherwise.
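How a three-marker readout might be turned into a call for each microdroplet can be sketched as below. The target profile (EPCR high, Gata3 low, IL-7Rα negative) is taken from the text; the cut-off values, the copy-number inputs and the function names are all hypothetical, and real cut-offs would have to come from calibration experiments.

```python
# Marker bins expected for an LT-HSC, as stated in the text.
# "IL-7Ra" stands for IL-7R-alpha.
LT_HSC_PROFILE = {"EPCR": "hi", "Gata3": "lo", "IL-7Ra": "neg"}

# Invented cut-offs (detection limit, high threshold) in copies per droplet.
CUTOFFS = {"EPCR": (1, 100), "Gata3": (1, 100), "IL-7Ra": (1, 100)}

def categorize(copies, detect_cutoff, high_cutoff):
    """Bin a copy number as 'neg', 'lo' or 'hi'."""
    if copies < detect_cutoff:
        return "neg"
    return "hi" if copies >= high_cutoff else "lo"

def matches_lt_hsc_profile(droplet_copies):
    """True when every marker falls in the bin named by LT_HSC_PROFILE."""
    return all(
        categorize(droplet_copies[gene], *CUTOFFS[gene]) == expected
        for gene, expected in LT_HSC_PROFILE.items()
    )

# Invented example droplets: one LT-HSC-like, one resembling the B62c profile
print(matches_lt_hsc_profile({"EPCR": 5000, "Gata3": 12, "IL-7Ra": 0}))
print(matches_lt_hsc_profile({"EPCR": 0, "Gata3": 8, "IL-7Ra": 900}))
```

Keeping the profile and the cut-offs in separate tables means either can be revised independently once measured single cell data are available.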

Although the simulations using 10 μL reactions were successful in overcoming key challenges, there were differences between the environment of the microdroplet and 10 μL conventional qRT-PCR that must be considered. These differences were first observed when NP-40 destabilized the microdroplets upon thermal cycling (Appendix 7). Subsequently, a 3-fold increase in BSA improved the fluorescence intensity of RT-PCR performed in the microdroplet (Appendix 1B), but had negligible effects in conventional RT-PCRs. Preliminary multiplex studies with cells and RNA have shown that the same conditions that allow for the amplification of RNA in stable microdroplets result in microdroplet coalescence when cells were encapsulated. It is possible that these issues arise from the interactions between RT-PCR reagents and the microdroplet-oil interface. The amount of reagent adsorption to this interface may be augmented by the high temperatures that the microdroplets are subjected to, and by the high surface area to volume ratio due to the low reaction volume149. NP-40, an amphipathic molecule, could possibly displace the stabilizing fluorinated surfactants and destabilize the microdroplets. Although the inner PEG brush of the fluorinated surfactant should promote surface passivation71,150, the improvement of the reaction only in the microdroplets when BSA was increased, and microdroplet destabilization when NP-40 was added, suggest that reagent adsorption at the interface does occur. Additionally, coalescence in the multiplex reactions where microdroplets were encapsulated with cells, but not with RNA, could be caused by cellular protein and lipid membrane adsorption at the interface. Understanding the microdroplet environment is a challenge that is currently being addressed as the parameters that I have identified in this chapter are being used in the microdroplets by the collaborating laboratory.

Nonetheless, the simulations in the 10 μL qRT-PCRs were instrumental for understanding the parameters necessary for the quantification of multiple genes directly in single cells encapsulated in the microdroplets. The presented lines of evidence demonstrated that the reaction parameters derived from the simulations in the 10 μL qRT-PCRs were capable of overcoming the inhibitory effects of high lysate concentration in the microdroplets. Moreover, the multiplex parameters with a broad dynamic range identified in the 10 μL qRT-PCRs were translated into the microdroplets to quantify two genes simultaneously in single cell quantities of total RNA. This sets the stage for further experimentation to examine the use of these parameters to detect multiple genes in single cells, and ultimately, to use the mRNA markers to enumerate rare LT-HSCs rapidly in unenriched samples.


Chapter 4 Conclusions

Enumeration of LT-HSCs requires in vivo assays that take 5 months to complete. The length of these assays makes routine clinical measurements and high-throughput research strategies in the area of hematopoietic cell transplantation and stem cell biology unfeasible. This bottleneck was the motivation to develop an assay that can circumvent the need for such time-consuming and laborious in vivo assays. In this project, a prototype of a surrogate mRNA marker-based single cell qRT-PCR assay was developed to enumerate LT-HSCs rapidly.

LT-HSC mRNA markers were identified with the help of a high resolution gene expression “map” of the hematopoietic hierarchy that consisted of global transcript profiles generated from stem, progenitor and mature cells. Global transcript profiles were also generated from stem cells cultured in conditions that preserve their functional capabilities. By analyzing these samples, the combined presence or absence of EPCR, Gata3, and IL-7Rα transcripts was found to uniquely mark LT-HSCs independent of their cell cycle state. The specificity of this marker combination for LT-HSCs was validated by cell sorting and long-term competitive reconstitution assays.

I used an emulsion-based microfluidic system to package cells into microdroplets with qRT-PCR reagents for single cell analysis. I identified the parameters necessary for a robust multiplex qRT-PCR in the cell lysate concentrations present in microdroplets. This foundation sets the stage for the final experimental validation of the system for enumerating LT-HSCs by correlating enumerations in microdroplets with biological readouts of in vivo assays in samples containing widely differing numbers of LT-HSCs.

The ability of this system to perform high-throughput, multiplex genetic determinations on single cells can be adapted to detect non-coding RNA, copy number variations, point mutations, SNPs, and chromosomal translocations. The flexibility in the target type in the context of high-throughput single cell analyses can have far-reaching applications in other biological systems, including malignancies whose progress, metastasis and relapse may depend on the numbers of specific genetic lesions in rare cells.


References

1 Gordon, M. Y., Lewis, J. L. & Marley, S. B. Of mice and men … and elephants. Blood 100, 4679 (2002).
2 Osawa, M., Hanada, K.-i., Hamada, H. & Nakauchi, H. Long-term lymphohematopoietic reconstitution by a single CD34-low/negative hematopoietic stem cell. Science 273, 242-245 (1996).
3 Benveniste, P., Cantin, C., Hyam, D. & Iscove, N. N. Hematopoietic stem cells engraft in mice with absolute efficiency. Nat Immunol 4, 708-713 (2003).
4 Copelan, E. A. Hematopoietic stem-cell transplantation. New Engl J Med 354, 1813-1826 (2006).
5 Benveniste, P. et al. Intermediate-term hematopoietic stem cells with extended but time-limited reconstitution potential. Cell Stem Cell 6, 48-58 (2010).
6 Yang, L. et al. Identification of Lin(-)Sca1(+)kit(+)CD34(+)Flt3- short-term hematopoietic stem cells capable of rapidly reconstituting and rescuing myeloablated transplant recipients. Blood 105, 2717-2723 (2005).
7 Till, J. E. & McCulloch, E. A. A direct measurement of the radiation sensitivity of normal mouse bone marrow cells. Radiat Res 14, 213-222 (1961).
8 Becker, A. J., McCulloch, E. A. & Till, J. E. Cytological demonstration of the clonal nature of spleen colonies derived from transplanted mouse marrow cells. Nature 197, 452-454 (1963).
9 Siminovitch, L., McCulloch, E. A. & Till, J. E. The distribution of colony-forming cells among spleen colonies. J Cell Physio 62, 327-336 (1963).
10 Trevisan, M. & Iscove, N. N. Phenotypic analysis of murine long-term hemopoietic reconstituting cells quantitated competitively in vivo and comparison with more advanced colony-forming progeny. J Exp Med 181, 93-103 (1995).
11 Jones, R. J., Wagner, J. E., Celano, P., Zicha, M. S. & Sharkis, S. J. Separation of pluripotent haematopoietic stem cells from spleen colony-forming cells. Nature 347, 188-189 (1990).
12 Pereira, C., Clarke, E. & Damen, J. Hematopoietic colony-forming cell assays. Methods Mol Biol 407, 177-208 (2007).


13 Pessina, A., Malerba, I. & Gribaldo, L. Hematotoxicity testing by cell clonogenic assay in drug development and preclinical trials. Curr Pharm Des 11, 1055-1065 (2005).
14 Gordon, M. Y. Human haemopoietic stem cell assays. Blood Rev 7, 190-197 (1993).
15 Woehrer, S., Miller, C. & Eaves, C. in Basic Cell Culture Protocols Vol. 946 Methods in (eds Cheryl D. Helgason & Cindy L. Miller) Ch. 16, 257-266 (Humana Press, 2013).
16 Dick, J. E., Bhatia, M., Gan, O., Kapp, U. & Wang, J. C. Assay of human stem cells by repopulation of NOD/SCID mice. Stem Cells 15 Suppl 1, 199-203; Discussion 204-197 (1997).
17 Broudy, V. C. Stem cell factor and hematopoiesis. Blood 90, 1345-1364 (1997).
18 Okada, S. et al. Enrichment and characterization of murine hematopoietic stem cells that express c-kit molecule. Blood 78, 1706-1712 (1991).
19 Chabot, B., Stephenson, D. A., Chapman, V. M., Besmer, P. & Bernstein, A. The proto-oncogene c-kit encoding a transmembrane tyrosine kinase receptor maps to the mouse W locus. Nature 335, 88-89 (1988).
20 Geissler, E. N., Ryan, M. A. & Housman, D. E. The dominant-white spotting (W) locus of the mouse encodes the c-kit proto-oncogene. Cell 55, 185-192 (1988).
21 Anderson, D. M. et al. Molecular cloning of mast cell growth factor, a hematopoietin that is active in both membrane bound and soluble forms. Cell 63, 235-243 (1990).
22 Copeland, N. G. et al. Mast cell growth factor maps near the steel locus on mouse chromosome 10 and is deleted in a number of steel alleles. Cell 63, 175-183 (1990).
23 Zsebo, K. M. et al. Stem cell factor is encoded at the Sl locus of the mouse and is the ligand for the c-kit tyrosine kinase receptor. Cell 63, 213-224 (1990).
24 Blouin, R. & Berstein, A. in Clinical Disorders and Experimental Models of Erythropoietic Failure (ed Freedman M. H.) 157 (CRC, 1993).
25 Russell, E. S., Bernstein, S. E., Lawson, F. A. & Smith, L. J. Long-continued function of normal blood-forming tissue transplanted into genetically anemic hosts. J Natl Cancer Inst 23, 557-566 (1959).
26 Dexter, T. M. & Moore, M. A. In vitro duplication and "cure" of haemopoietic defects in genetically anaemic mice. Nature 269, 412-414 (1977).


27 McCulloch, E. A., Siminovitch, L., Till, J. E., Russell, E. S. & Bernstein, S. E. The cellular basis of the genetically determined hemopoietic defect in anemic mice of genotype Sl/Sld. Blood 26, 399-410 (1965).
28 Trevisan, M., Yan, X. & Iscove, N. Cycle initiation and colony formation in culture by murine marrow cells with long-term reconstituting potential in vivo. Blood 88, 4149-4158 (1996).
29 Ito, C. Y., Li, C. Y., Bernstein, A., Dick, J. E. & Stanford, W. L. Hematopoietic stem cell and progenitor defects in Sca-1/Ly-6A-null mice. Blood 101, 517-523 (2003).
30 Bradfute, S. B., Graubert, T. A. & Goodell, M. A. Roles of Sca-1 in hematopoietic stem/progenitor cell function. Exp Hematol 33, 836-843 (2005).
31 Adolfsson, J. et al. Identification of Flt3+ lympho-myeloid stem cells lacking erythro-megakaryocytic potential a revised road map for adult blood lineage commitment. Cell 121, 295-306 (2005).
32 Kiel, M. J. et al. SLAM family receptors distinguish hematopoietic stem and progenitor cells and reveal endothelial niches for stem cells. Cell 121, 1109-1121 (2005).
33 Li, C. L. & Johnson, G. R. Rhodamine123 reveals heterogeneity within murine Lin-, Sca-1+ hemopoietic stem cells. J Exp Med 175, 1443-1447 (1992).
34 Johnson, L. V., Walsh, M. L. & Chen, L. B. Localization of mitochondria in living cells with rhodamine 123. Proc Natl Acad Sci USA 77, 990-994 (1980).
35 Johnson, L. V., Walsh, M. L., Bockus, B. J. & Chen, L. B. Monitoring of relative mitochondrial membrane potential in living cells by fluorescence microscopy. J Cell Biol 88, 526-535 (1981).
36 Schinkel, A. H. et al. Disruption of the mouse mdr1a P-glycoprotein gene leads to a deficiency in the blood-brain barrier and to increased sensitivity to drugs. Cell 77, 491-502 (1994).
37 Schinkel, A. H. et al. Normal viability and altered pharmacokinetics in mice lacking mdr1-type (drug-transporting) P-glycoproteins. Proc Natl Acad Sci USA 94, 4028-4033 (1997).
38 Li, C. L. & Johnson, G. R. Murine hematopoietic stem and progenitor cells: I. Enrichment and biologic characterization. Blood 85, 1472-1479 (1995).


39 Spangrude, G. J. & Johnson, G. R. Resting and activated subsets of mouse multipotent hematopoietic stem cells. Proc Natl Acad Sci USA 87, 7433-7437 (1990).
40 Akashi, K. et al. Transcriptional accessibility for genes of multiple tissues and hematopoietic lineages is hierarchically controlled during early hematopoiesis. Blood 101, 383-389 (2003).
41 Randall, T. D. & Weissman, I. L. Phenotypic and functional changes induced at the clonal level in hematopoietic stem cells after 5-fluorouracil treatment. Blood 89, 3596-3606 (1997).
42 Venezia, T. A. et al. Molecular signatures of proliferation and quiescence in hematopoietic stem cells. PLoS Biol 2, e301 (2004).
43 Morrison, S. J., Wright, D. E. & Weissman, I. L. Cyclophosphamide/granulocyte colony-stimulating factor induces hematopoietic stem cells to proliferate prior to mobilization. Proc Natl Acad Sci USA 94, 1908-1913 (1997).
44 Wilson, A. et al. Hematopoietic stem cells reversibly switch from dormancy to self-renewal during homeostasis and repair. Cell 135, 1118-1129 (2008).
45 Schuettpelz, L. G. et al. G-CSF regulates hematopoietic stem cell activity, in part, through activation of Toll-like receptor signaling. Leukemia (2014).
46 Zhang, C. C. & Lodish, H. F. Murine hematopoietic stem cells change their surface phenotype during ex vivo expansion. Blood 105, 4314-4320 (2005).
47 Sato, T., Laver, J. H. & Ogawa, M. Reversible expression of CD34 by murine hematopoietic stem cells. Blood 94, 2548-2554 (1999).
48 Bodine, D. M., Seidel, N. E. & Orlic, D. Bone marrow collected 14 days after in vivo administration of granulocyte colony-stimulating factor and stem cell factor to mice has 10-fold more repopulating ability than untreated bone marrow. Blood 88, 89-97 (1996).
49 Goldberg, L. R. et al. The murine long-term multi-lineage renewal marrow stem cell is a cycling cell. Leukemia (2013).
50 Dorrell, C., Gan, O. I., Pereira, D. S., Hawley, R. G. & Dick, J. E. Expansion of human cord blood CD34(+)CD38(-) cells in ex vivo culture during retroviral transduction without a corresponding increase in SCID repopulating cell (SRC) frequency: dissociation of SRC phenotype and function. Blood 95, 102-110 (2000).


51 Harrison, D. E. Competitive repopulation: a new assay for long-term stem cell functional capacity. Blood 55, 77-81 (1980).
52 Harrison, D. E., Jordan, C. T., Zhong, R. K. & Astle, C. M. Primitive hemopoietic stem cells: direct assay of most productive populations by competitive repopulation with simple binomial, correlation and covariance calculations. Exp Hematol 21, 206-219 (1993).
53 Iscove, N. N. & Yan, X. Q. Precursors (pre-CFCmulti) of multilineage hemopoietic colony-forming cells quantitated in vitro. Uniqueness of IL-1 requirement, partial separation from pluripotential colony-forming cells, and correlation with long term reconstituting cells in vivo. J Immunol 145, 190-195 (1990).
54 Yuan, R., Astle, C. M., Chen, J. & Harrison, D. E. Genetic regulation of hematopoietic stem cell exhaustion during development and growth. Exp Hematol 33, 243-250 (2005).
55 Harrison, D. E. & Astle, C. M. Lymphoid and erythroid repopulation in B6 W-anemic mice: a new unirradiated recipient. Exp Hematol 19, 374-377 (1991).
56 Nocka, K. et al. Molecular bases of dominant negative and loss of function mutations at the murine c-kit/white spotting locus: W37, Wv, W41 and W. EMBO J 9, 1805-1813 (1990).
57 Dykstra, B. et al. Long-term propagation of distinct hematopoietic differentiation programs in vivo. Cell Stem Cell 1, 218-229 (2007).
58 Iscove, N. N. & Nawa, K. Hematopoietic stem cells expand during serial transplantation in vivo without apparent exhaustion. Curr Biol 7, 805-808 (1997).
59 Bao, S. et al. Glioma stem cells promote radioresistance by preferential activation of the DNA damage response. Nature 444, 756-760 (2006).
60 Donnenberg, A. D. & Donnenberg, V. S. Rare-event analysis in flow cytometry. Clin Lab Med 27, 627-652, viii (2007).
61 Wang, J. C., Doedens, M. & Dick, J. E. Primitive human hematopoietic cells are enriched in cord blood compared with adult bone marrow or mobilized peripheral blood as measured by the quantitative in vivo SCID-repopulating cell assay. Blood 89, 3919-3924 (1997).
62 Aubin, J. E. Autofluorescence of viable cultured mammalian cells. J Histochem Cytochem 27, 36-43 (1979).


63 Gross, H. J., Verwer, B., Houck, D., Hoffman, R. A. & Recktenwald, D. Model study detecting breast cancer cells in peripheral blood mononuclear cells at frequencies as low as 10(-7). Proc Natl Acad Sci USA 92, 537-541 (1995).
64 Ntouroupi, T. G. et al. Detection of circulating tumour cells in peripheral blood with an automated scanning fluorescence microscope. Br J Cancer 99, 789-795 (2008).
65 Lecault, V. et al. High-throughput analysis of single hematopoietic stem cell proliferation in microfluidic cell culture arrays. Nat Meth 8, 581-586 (2011).
66 Cheong, R., Wang, C. J. & Levchenko, A. High content cell screening in a microfluidic device. Mol Cell Proteomics 8, 433-442 (2009).
67 Paliwal, S. et al. MAPK-mediated bimodal gene expression and adaptive gradient sensing in yeast. Nature 446, 46-51 (2007).
68 White, A. K. et al. High-throughput microfluidic single-cell RT-qPCR. Proc Natl Acad Sci USA 108, 13999-14004 (2011).
69 Leung, K. et al. A programmable droplet-based microfluidic device applied to multiparameter analysis of single microbes and microbial communities. Proc Natl Acad Sci USA 109, 7665-7670 (2012).
70 Faley, S. L. et al. Microfluidic single cell arrays to interrogate signalling dynamics of individual, patient-derived hematopoietic stem cells. Lab Chip 9, 2659-2664 (2009).
71 Holtze, C. et al. Biocompatible surfactants for water-in-fluorocarbon emulsions. Lab Chip 8, 1632-1639 (2008).
72 Williams, R. et al. Amplification of complex gene libraries by emulsion PCR. Nat Meth 3, 545-550 (2006).
73 Bontoux, N. et al. Integrating whole transcriptome assays on a lab-on-a-chip for single cell gene profiling. Lab Chip 8, 443-450 (2008).
74 Curry, J., McHale, C. & Smith, M. T. Low efficiency of the Moloney murine leukemia virus reverse transcriptase during reverse transcription of rare t(8;21) fusion gene transcripts. BioTechniques 32, 768, 770, 772, 754-765 (2002).
75 Lareu, R. R., Harve, K. S. & Raghunath, M. Emulating a crowded intracellular environment in vitro dramatically improves RT-PCR performance. Biochem Biophys Res Commun 363, 171-177 (2007).

76 Novak, R. et al. Single-cell multiplex gene detection and sequencing with microfluidically generated agarose emulsions. Angew Chem Int Ed 50, 390-395 (2011).
77 Zeng, Y., Novak, R., Shuga, J., Smith, M. T. & Mathies, R. A. High-performance single cell genetic analysis using microfluidic emulsion generator arrays. Anal Chem 82, 3183-3190 (2010).
78 Zhang, H., Jenkins, G., Zou, Y., Zhu, Z. & Yang, C. J. Massively parallel single-molecule and single-cell emulsion reverse transcription polymerase chain reaction using agarose droplet microfluidics. Anal Chem 84, 3599-3606 (2012).
79 Zhu, Z. et al. Highly sensitive and quantitative detection of rare pathogens through agarose droplet microfluidic emulsion PCR at the single-cell level. Lab Chip 12, 3907-3913 (2012).
80 Eastburn, D. J., Sciambi, A. & Abate, A. R. Ultrahigh-throughput mammalian single-cell reverse-transcriptase polymerase chain reaction in microfluidic drops. Anal Chem 85, 8016-8021 (2013).
81 Ivanova, N. B. et al. A stem cell molecular signature. Science 298, 601-604 (2002).
82 Ramalho-Santos, M., Yoon, S., Matsuzaki, Y., Mulligan, R. C. & Melton, D. A. "Stemness": transcriptional profiling of embryonic and adult stem cells. Science 298, 597-600 (2002).
83 Chambers, S. M. et al. Hematopoietic fingerprints: an expression database of stem cells and their progeny. Cell Stem Cell 1, 578-591 (2007).
84 Baldridge, M. T., King, K. Y., Boles, N. C., Weksberg, D. C. & Goodell, M. A. Quiescent haematopoietic stem cells are activated by IFN-gamma in response to chronic infection. Nature 465, 793-797 (2010).
85 Cheshier, S. H., Prohaska, S. S. & Weissman, I. L. The effect of bleeding on hematopoietic stem cell cycling and self-renewal. Stem Cells Dev 16, 707-717 (2007).
86 Harrison, D. E. & Lerner, C. P. Most primitive hematopoietic stem cells are stimulated to cycle rapidly after treatment with 5-fluorouracil. Blood 78, 1237-1240 (1991).
87 Quesniaux, V. F. et al. Use of 5-fluorouracil to analyze the effect of macrophage inflammatory protein-1 alpha on long-term reconstituting stem cells in vivo. Blood 81, 1497-1504 (1993).

88 Brady, G., Barbara, M. & Iscove, N. N. Representative in vitro cDNA amplification from individual hemopoietic cells and colonies. Method Mol Cell Biol 2, 17-25 (1990).
89 Iscove, N. N. et al. Representation is faithfully preserved in global cDNA amplified exponentially from sub-picogram quantities of mRNA. Nat Biotechnol 20, 940-943 (2002).
90 Muro, E. M. et al. Identification of gene 3′ ends by automated EST cluster analysis. Proc Natl Acad Sci USA 105, 20286-20290 (2008).
91 Quentmeier, H., Zaborski, M. & Drexler, H. G. The human bladder carcinoma cell line 5637 constitutively secretes functional cytokines. Leuk Res 21, 343-350 (1997).
92 Billia, F., Barbara, M., McEwen, J., Trevisan, M. & Iscove, N. N. Resolution of pluripotential intermediates in murine hematopoietic differentiation by global complementary DNA amplification from single cells: confirmation of assignments by expression profiling of cytokine receptor transcripts. Blood 97, 2257-2268 (2001).
93 Brady, G. et al. Analysis of gene expression in a complex differentiation hierarchy by global amplification of cDNA from single cells. Curr Biol 5, 909-922 (1995).
94 Frelin, C. et al. GATA-3 regulates the self-renewal of long-term hematopoietic stem cells. Nat Immunol 14, 1037-1044 (2013).
95 Samson, S. I. et al. GATA-3 promotes maturation, IFN-gamma production, and liver-specific homing of NK cells. Immunity 19, 701-711 (2003).
96 Rosmaraki, E. E. et al. Identification of committed NK cell progenitors in adult murine bone marrow. Eur J Immunol 31, 1900-1909 (2001).
97 Kovanen, P. E. & Leonard, W. J. Cytokines and immunodeficiency diseases: critical roles of the gamma(c)-dependent cytokines interleukins 2, 4, 7, 9, 15, and 21, and their signaling pathways. Immunol Rev 202, 67-83 (2004).
98 Kim, S. et al. In vivo developmental stages in murine natural killer cell maturation. Nat Immunol 3, 523-528 (2002).
99 Peschon, J. J. et al. Early lymphocyte expansion is severely impaired in interleukin 7 receptor-deficient mice. J Exp Med 180, 1955-1960 (1994).
100 Akashi, K., Kondo, M. & Weissman, I. L. Role of interleukin-7 in T-cell development from hematopoietic stem cells. Immunol Rev 165, 13-28 (1998).

101 Schluns, K. S., Kieper, W. C., Jameson, S. C. & Lefrancois, L. Interleukin-7 mediates the homeostasis of naive and memory CD8 T cells in vivo. Nat Immunol 1, 426-432 (2000).
102 Rathmell, J. C., Farkash, E. A., Gao, W. & Thompson, C. B. IL-7 Enhances the survival and maintains the size of naive T cells. J Immunol 167, 6869-6876 (2001).
103 Balazs, A. B., Fabian, A. J., Esmon, C. T. & Mulligan, R. C. Endothelial protein C receptor (CD201) explicitly identifies hematopoietic stem cells in murine bone marrow. Blood 107, 2317-2321 (2006).
104 Kent, D. G. et al. Prospective isolation and molecular characterization of hematopoietic stem cells with durable self-renewal potential. Blood 113, 6342-6350 (2009).
105 Kent, D. G., Dykstra, B. J., Cheyne, J., Ma, E. & Eaves, C. J. Steel factor coordinately regulates the molecular signature and biologic function of hematopoietic stem cells. Blood 112, 560-567 (2008).
106 Kerschen, E. et al. Activated protein C targets CD8+ dendritic cells to reduce the mortality of endotoxemia in mice. J Clin Invest 120, 3167-3178 (2010).
107 Stearns-Kurosawa, D. J., Kurosawa, S., Mollica, J. S., Ferrell, G. L. & Esmon, C. T. The endothelial cell protein C receptor augments protein C activation by the thrombin-thrombomodulin complex. Proc Natl Acad Sci USA 93, 10212-10216 (1996).
108 Franscini, N. et al. Gene expression profiling of inflamed human endothelial cells and influence of activated protein C. Circulation 110, 2903-2909 (2004).
109 Mosnier, L. O., Zlokovic, B. V. & Griffin, J. H. The cytoprotective protein C pathway. Blood 109, 3161-3172 (2007).
110 Iwasaki, H., Arai, F., Kubota, Y., Dahl, M. & Suda, T. Endothelial protein C receptor-expressing hematopoietic stem cells reside in the perisinusoidal niche in fetal liver. Blood 116, 544-553 (2010).
111 Grote, D., Souabni, A., Busslinger, M. & Bouchard, M. Pax 2/8-regulated Gata 3 expression is necessary for morphogenesis and guidance of the nephric duct in the developing kidney. Development 133, 53-61 (2006).
112 Okada, S. et al. In vivo and in vitro stem cell function of c-kit- and Sca-1-positive murine hematopoietic cells. Blood 80, 3044-3050 (1992).

113 Schwanhausser, B. et al. Global quantification of mammalian gene expression control. Nature 473, 337-342 (2011).
114 Marguerat, S. et al. Quantitative analysis of fission yeast transcriptomes and proteomes in proliferating and quiescent cells. Cell 151, 671-683 (2012).
115 Raj, A., Peskin, C. S., Tranchina, D., Vargas, D. Y. & Tyagi, S. Stochastic mRNA synthesis in mammalian cells. PLoS Biol 4, e309 (2006).
116 Lionnet, T. & Singer, R. H. Transcription goes digital. EMBO Rep 13, 313-321 (2012).
117 Lam, F. H., Steger, D. J. & O'Shea, E. K. Chromatin decouples promoter threshold from dynamic range. Nature 453, 246-250 (2008).
118 Chubb, J. R. & Liverpool, T. B. Bursts and pulses: insights from single cell studies into transcriptional mechanisms. Curr Opin Genet Dev 20, 478-484 (2010).
119 Core, L. J., Waterfall, J. J. & Lis, J. T. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science 322, 1845-1848 (2008).
120 Muse, G. W. et al. RNA polymerase is poised for activation across the genome. Nat Genet 39, 1507-1511 (2007).
121 Dobrzyński, M. & Bruggeman, F. J. Elongation dynamics shape bursty transcription and translation. Proc Natl Acad Sci USA 106, 2583-2588 (2009).
122 Kaern, M., Elston, T. C., Blake, W. J. & Collins, J. J. Stochasticity in gene expression: from theories to phenotypes. Nat Rev Genet 6, 451-464 (2005).
123 Fluidigm. Technical note: single-cell gene expression approaches, http://www.fluidigm.com/home/fluidigm/docs/AppNote_2v1Step_pf9.pdf (2012).
124 Sellner, L. N., Coelen, R. J. & Mackenzie, J. S. Reverse transcriptase inhibits Taq polymerase activity. Nucleic Acids Res 20, 1487-1490 (1992).
125 Suslov, O. & Steindler, D. A. PCR inhibition by reverse transcriptase leads to an overestimation of amplification efficiency. Nucleic Acids Res 33, e181 (2005).
126 Chumakov, K. M. Reverse transcriptase can inhibit PCR and stimulate primer-dimer formation. Genome Res 4, 62-64 (1994).
127 Ratnamohan, V. M., Cunningham, A. L. & Rawlinson, W. D. Removal of inhibitors of CSF-PCR to improve diagnosis of herpesviral encephalitis. J Virol Methods 72, 59-65 (1998).

128 Chandler, D. P., Wagnon, C. A. & Bolton, H., Jr. Reverse transcriptase (RT) inhibition of PCR at low concentrations of template and its implications for quantitative RT-PCR. Appl Environ Microbiol 64, 669-677 (1998).
129 Kreader, C. A. Relief of amplification inhibition in PCR with bovine serum albumin or T4 gene 32 protein. Appl Environ Microbiol 62, 1102-1106 (1996).
130 Kubu, C. J. HotStart-IT: A novel hot start PCR method based on primer sequestration. BioTechniques 44, 275-277 (2008).
131 Kubu, C. J., Muller-Greven, J. C. & Moffett, R. B. Hot start nucleic acid amplification. United States patent (2011).
132 Ho, Y. K., Xu, W. T. & Too, H. P. Direct quantification of mRNA and miRNA from cell lysates using reverse transcription real time PCR: a multidimensional analysis of the performance of reagents and workflows. PLoS ONE 8, e72463 (2013).
133 Liu, W. & Saint, D. A. A new quantitative method of real time reverse transcription polymerase chain reaction assay based on simulation of polymerase chain reaction kinetics. Anal Biochem 302, 52-59 (2002).
134 Rock, C., Alum, A. & Abbaszadegan, M. PCR inhibitor levels in concentrates of biosolid samples predicted by a new method based on excitation-emission matrix spectroscopy. Appl Environ Microbiol 76, 8102-8109 (2010).
135 Wong, K., Lyddon, R. & Dracheva, S. TaqMan-based, real-time quantitative polymerase chain reaction method for RNA editing analysis. Anal Biochem 390, 173-180 (2009).
136 Pfaffl, M. in A-Z of quantitative PCR (ed S. A. Bustin) Ch. 3, 87-112 (International University Line, 2004).
137 Persson, K., Hamby, K. & Ugozzoli, L. A. Four-color multiplex reverse transcription polymerase chain reaction--overcoming its limitations. Anal Biochem 344, 33-42 (2005).
138 Kibbe, W. A. OligoCalc: an online oligonucleotide properties calculator. Nucleic Acids Res 35, W43-W46 (2007).
139 Henegariu, O., Heerema, N. A., Dlouhy, S. R., Vance, G. H. & Vogt, P. H. Multiplex PCR: critical parameters and step-by-step protocol. BioTechniques 23, 504-511 (1997).
140 Shuber, A. P., Grondin, V. J. & Klinger, K. W. A simplified procedure for developing multiplex PCRs. Genome Res 5, 488-493 (1995).

141 Mary, P. et al. Analysis of gene expression at the single-cell level using microdroplet-based microfluidic technology. Biomicrofluidics 5, 24109 (2011).
142 Peters, I. R., Helps, C. R., Hall, E. J. & Day, M. J. Real-time RT-PCR: considerations for efficient and sensitive assay design. J Immunol Methods 286, 203-217 (2004).
143 Kim, Y. T. & Richardson, C. C. Bacteriophage T7 gene 2.5 protein: an essential protein for DNA replication. Proc Natl Acad Sci USA 90, 10173-10177 (1993).
144 Innis, M. A. & Gelfand, D. H. in PCR Protocols: A Guide to Methods and Applications (eds M. A. Innis, D. H. Gelfand, J. J. Sninsky, & T. K. White) (Academic Press, 1990).
145 Kainz, P. The PCR plateau phase - towards an understanding of its limitations. Biochim Biophys Acta 1494, 23-27 (2000).
146 Dang, C. & Jayasena, S. D. Oligonucleotide inhibitors of Taq DNA polymerase facilitate detection of low copy number targets by PCR. J Mol Biol 264, 268-278 (1996).
147 Kainz, P., Schmiedlechner, A. & Strack, H. B. Specificity-enhanced hot-start PCR: addition of double-stranded DNA fragments adapted to the annealing temperature. BioTechniques 28, 278-282 (2000).
148 Mathieu-Daudé, F., Welsh, J., Vogt, T. & McClelland, M. DNA rehybridization during PCR: The ‘Cot effect’ and its consequences. Nucleic Acids Res 24, 2080-2086 (1996).
149 Baret, J.-C. Surfactants in droplet-based microfluidics. Lab Chip 12, 422-433 (2012).
150 Roach, L. S., Song, H. & Ismagilov, R. F. Controlling nonspecific protein adsorption in a plug-based microfluidic system by controlling interfacial chemistry using fluorous-phase surfactants. Anal Chem 77, 785-796 (2004).

Appendices

Appendix 1. Additional improvements to the singleplex qRT-PCR. (A) The effects of a 3-fold increase in Tris-HCl on the 10 μL qRT-PCR were analyzed. Three singleplex assays (left qPCR graph) and two duplex assays (middle and right) were tested in the presence of high cell lysate concentrations. CT differences were negligible, but the fluorescence plateau increased for the highly abundant targets of each assay. (B) 10 pg EL4 total RNA was encapsulated into microdroplets with RT-PCR reagents. Microdroplets were analyzed after 50 thermal cycles for the resultant fluorescence intensities. 3-fold greater BSA or Tris-HCl concentrations were tested to determine whether fluorescence intensity could be improved. (C) 10^4 copies/nL, 10^2 copies/nL and 1 copy/nL of Gata3 and IL-7Rα IVT were analyzed at 55°C, 58°C and 60°C (left). The fluorescence plateau intensity at 58°C was significantly higher than at 55°C (one-tailed one-sample t-test, n = 6, p = 0.002 [H0 = 1, Ha > 1]). Thermal gradient qRT-PCR was performed on the Bio-Rad CFX96 Touch Real-Time PCR Detection System.
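
The one-tailed one-sample t-test reported in (C) can be illustrated with a minimal Python sketch. It assumes that the tested quantity is the ratio of the plateau intensity at 58°C to that at 55°C, which is what a null hypothesis of H0 = 1 suggests but which the caption does not state. The function names and the six ratio values are hypothetical placeholders, not the measured data, and the p-value is obtained by numerically integrating the Student's t density rather than with a statistics package.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=10_000):
    """P(T > t), using Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3
    return tail if t >= 0 else 1.0 - tail

def one_tailed_one_sample_t(values, mu0=1.0):
    """Test Ha: mean(values) > mu0. Returns (t statistic, one-tailed p-value)."""
    n = len(values)
    t = (mean(values) - mu0) / (stdev(values) / math.sqrt(n))
    return t, upper_tail_p(t, n - 1)

# Hypothetical 58°C / 55°C plateau ratios (n = 6), for illustration only.
ratios = [1.21, 1.35, 1.18, 1.42, 1.27, 1.30]
t_stat, p_value = one_tailed_one_sample_t(ratios, mu0=1.0)
print(f"t = {t_stat:.2f}, one-tailed p = {p_value:.4f}")
```

With n = 6 the test has 5 degrees of freedom, so a ratio consistently above 1 across the six conditions is needed to reject H0.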

Appendix 2. A duplex qRT-PCR was used as a starting point to identify parameters for quantitating IL-7Rα, EPCR and Gata3. Two targets can be quantitated simultaneously despite a 10^4-fold difference in target abundance, as demonstrated by (A) the superimposition of the duplex qPCR curves on those of the singleplex assays, (B) two bands of the expected sizes when visualized by gel electrophoresis, and (C) an experiment in which EPCR was serially diluted 10-fold while Gata3 was held at a high concentration. (D) A decrease in primer concentration and an increase in Taq polymerase concentration were necessary for an efficient duplex reaction.

Appendix 3. Five of the most stable cross-dimers formed in the initial and final triplex primer sets are shown here with their corresponding ΔG values. A more negative ΔG indicates a more stable cross-dimer; the Beacon Designer primer design software suggests that ΔG should be no more negative than -5 kcal/mol for a 3′-end cross-dimer and -6 kcal/mol for an internal one. A more negative ΔG also implies a higher Tm, which determines whether the structure is present at the annealing temperature. NM_008091 = IL-7Rα, NM_011171 = EPCR, and NM_008372 = Gata3.

Initial primer combination

Current primer combination

Appendix 4. SYBR Green validation of new primer pairs. A dilution series of either D4T (EPCR+) or B62c (IL-7Rα+) cDNA was used to determine primer efficiency prior to use of the primers in the qRT-PCRs. The reactions were run for 40 cycles using the Bio-Rad iTaq Universal SYBR Green Supermix. All primer pairs had an efficiency between 0.91 and 1.1 with an R^2 > 0.99.
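
For illustration, primer efficiency is conventionally derived from the slope of a standard curve of CT against log10 template input, using E = 10^(-1/slope) - 1, where E = 1 corresponds to perfect doubling per cycle. The caption does not state which formula was used, so the sketch below assumes this standard relation. The function names and the CT values are hypothetical; only the acceptance window (0.91 to 1.1, R^2 > 0.99) comes from the caption.

```python
def fit_standard_curve(log10_input, ct):
    """Least-squares fit of Ct = slope * log10(input) + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(ct)
    mx = sum(log10_input) / n
    my = sum(ct) / n
    sxx = sum((x - mx) ** 2 for x in log10_input)
    syy = sum((y - my) ** 2 for y in ct)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log10_input, ct))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

def amplification_efficiency(slope):
    """E = 10^(-1/slope) - 1; a slope near -3.32 gives E near 1."""
    return 10 ** (-1.0 / slope) - 1.0

def passes_validation(efficiency, r_squared, lo=0.91, hi=1.1, r2_min=0.99):
    """Acceptance window taken from the Appendix 4 caption."""
    return lo <= efficiency <= hi and r_squared > r2_min

# Hypothetical 10-fold cDNA dilution series (relative input) and Ct values.
log10_input = [0, -1, -2, -3, -4]
ct_values = [18.1, 21.5, 24.9, 28.2, 31.6]
slope, intercept, r2 = fit_standard_curve(log10_input, ct_values)
eff = amplification_efficiency(slope)
print(f"slope = {slope:.3f}, E = {eff:.2f}, R^2 = {r2:.4f}, pass = {passes_validation(eff, r2)}")
```

An efficiency slightly above 1 can arise from pipetting error or inhibition at the concentrated end of the series, which is one reason an upper bound such as 1.1 is commonly tolerated.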

Appendix 5. Performance of qRT-PCR with and without cell lysate in (A) a triplex reaction where all three IVTs were highly abundant, and (B) singleplex reactions across different IVTs and IVT concentrations.

Appendix 6. Performance of RT-PCR in microdroplets prior to the identification of multiplex qRT-PCR parameters. The microfluidic platform was used to encapsulate 10 pg EL4 total RNA or H2O into microdroplets with qRT-PCR reagents. The microdroplets were analyzed after 50 thermal cycles. Representative images are shown with a histogram of the distribution of fluorescence intensities immediately below. Singleplex reactions were also performed so that the performance of the duplex reactions could be compared directly to reaction conditions that have been verified to be efficient.

Appendix 7. Effect of NP40 on microdroplet stability. Microdroplets were loaded with qRT-PCR buffer, buffer and 0.01% NP40, or water. After thermal cycling, the microdroplet sizes were measured. Microdroplets that were distinctly larger than those that had not been thermally cycled were considered to be the product of coalescence.