CHARACTERIZATION OF TCL1-MURINE B-1A CELL TRANSCRIPTOME DYNAMICS REVEALS NOVEL INSIGHTS INTO CHRONIC LYMPHOCYTIC LEUKEMIA ONSET

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Yuntao Dai, B.S., M.Sc

Graduate Program in Molecular, Cellular and Developmental Biology

The Ohio State University

2015

Dissertation Committee:

Carlo Croce, MD; Advisor

Jeffrey Parvin, MD, PhD

Qianben Wang, PhD

Flavia Pichiorri, PhD

Copyright by

Yuntao Dai

2015

ABSTRACT

B-cell chronic lymphocytic leukemia (B-CLL or CLL) is the most common leukemia in adults in western countries. The disease appears to arise from genetic lesions that block the differentiation of normal B lymphocytes. Patients with CLL (both aggressive and indolent) are at risk of developing an invasive over-proliferation of malignant CD5+ B lymphocytes, caused by an immature expansion of B-cell precursors (B-1a cells). B-1a cells are thus ideal for studying the initiation steps of CLL. Previous studies have shown that uncontrolled T-Cell Leukemia/Lymphoma 1A (TCL1) signaling is involved in aggressive CLL development. To substantiate the pathogenic effect of TCL1 and to provide a means of studying CLL in vivo, Eu-TCL1-transgenic mice (TCL1 mice), with targeted overexpression of human TCL1 in B cells, have been generated. TCL1 mice, which consistently develop aggressive CLL symptoms, represent a good model in which to screen, by transcriptome profiling, for novel factors that may play significant roles in CLL initiation. Such screening may lead to the discovery of new prognostic markers and/or therapeutic targets for clinical use.

Next-generation RNA sequencing (RNA seq) provides a comprehensive overview of transcriptome dynamics and is thus well suited to transcriptome profiling. Therefore, as described in Chapter 2, RNA seq was performed to compare the B-1a cell transcriptome of early-age (1-4 mo) TCL1-transgenic mice with that of their wild-type (WT) counterparts. We found that: i) the expression levels of several coding and non-coding genes are deregulated; ii) the number of deregulated genes increases with age; and iii) certain oncogenic pathways, such as NF-kB, are stimulated by the targeted TCL1 overexpression in mice. We focused on the top 15 up- and down-regulated genes in most genotype/age categories for further study.

As shown in Chapter 3, we validated the selected genes and picked the most promising candidates to focus on. In particular, quantitative real-time PCR (qRT-PCR) validated the transcriptional deregulation of two protein-coding genes in transgenic mice vs. WT mice, specifically Neto2 (Neuropilin and Tolloid-like 2) and Hbegf (upregulated and downregulated, respectively), whereas the transcriptional deregulation of the non-coding genes AI427909 and 1700097N02Rik (found to be upregulated and downregulated, respectively) was not validated. For the protein-coding genes Neto2 and Hbegf, western blots confirmed the expression changes at the protein level. The changes in Neto2 and Hbegf were then validated in human samples and cell lines. qRT-PCR on 31 human samples revealed that both NETO2 and HBEGF expression correlated linearly with TCL1 expression levels (positively and negatively, respectively); this was confirmed at the protein level by western blot experiments on TCL1-transfected human cell lines. Moreover, the correlation between NETO2 and TCL1 was also verified by western blot in randomly selected patients. Therefore, we decided to focus on the potential oncogene NETO2.

NETO2 encodes a transmembrane protein that regulates glutamate receptor function and modulates glutamate signaling in the central nervous system (CNS). Glutamate receptors, in addition to their primarily reported role in the CNS, were recently reported to be deregulated in several cancer types. Given that our results show that NETO2/Neto2 is upregulated in CLL-related samples from both human and mouse, as well as in TCL1-transfected cell lines, it is reasonable to conclude that NETO2 is associated with CLL and therefore represents a potential prognostic marker or therapeutic target for future clinical use. To test this, NETO2 transgenic mouse models are proposed in the discussion of future directions in Chapter 4. Taken together, our findings not only continue to decode the mechanism of CLL through appreciation of the signaling network, but also help us understand NETO2 and its potential prognostic and therapeutic value.


DEDICATION

This document is dedicated to my family.


ACKNOWLEDGEMENTS

First and foremost, I would like to express my deep and sincere gratitude to my advisor, Dr. Carlo Croce, for his mentorship, support, understanding and patience at all times. His wide knowledge, illuminating insight and persistent passion for science have inspired me all the way along my study.

I would like to acknowledge my graduate committee members, Dr. Jeffrey Parvin, Dr. Qianben Wang, and Dr. Flavia Pichiorri, for their insights, support and suggestions. I want to especially thank my laboratory mentors Dr. Mario Acunzo, Dr. Giulia Romano, Dr. Nicola Zanesi, and Dr. Veronica Balatti for their excellent mentorship.

I want to thank our lab manager Dr. Dorothee Wernicke-Jameson, our lab secretary Mrs Sharon R Palko, and our department staff members Mrs Tornik Colette and Mrs Erin Kimbrell for their assistance.

I also want to thank my supportive friends Dr. Mei Zhang, Dr. Fabienne McClanahan, and Dr. Christopher Walker for their consistent effort and encouragement to support me to move towards my life goal.

Thank you to all the past and present members of the Croce lab and friendship labs who helped me along the way: Dr. Dario Veneziano, Dr. Alessandro Lagana, Dr. Hui-Lung Sun, Mr. Douglas Cheung, Dr. Young-Jun Jeon, Dr. Pearlly Yan, Dr. Yuri Pekarsky, Dr. Stefano Volinia, Dr. Sukhinder Sandhu, Mr. Bryan McElwain, Dr. Giovanni Nigita, Dr. Lara Rizzotto, Dr. Ri Cui, Dr. Huijun Wei, Dr. Yong Peng, Dr. Zhenghua Luo, Dr. Taewan Kim, Dr. Sung Suk Suh, Miss. Pooja Josh, Dr. Jessica Consiglio, Dr. Pierluigi Gasparini, Dr. Jinghai Wu, Dr. Esmerina Tili, Dr. Francesca Lovat, Dr. Federica Calore, Dr. Alex Palamarchuk, Dr. Dario Palmieri, Dr. Anna Tessari, Mr. Timothy Richmond, Ms. Janae Dulaney, Ms. Prasanthi Kumchala, Dr. Dayong Wu, and Dr. Hongtao Jia.

Last but not least, I am grateful to my parents, my relatives and friends back in China for their constant love and support.


VITA

December 22, 1982 ...... Born in Wuhan, China

September 2001 to June 2005 ...... B.S., Biological Sciences, Huazhong Agriculture University, Wuhan

January 2007 to July 2009 ...... M.S., Plant Pathology, University of Arkansas, Fayetteville, AR

June 2010 to present ...... Graduate Research Associate, The Ohio State University


PUBLICATIONS

1. Nigita G, Acunzo M, Romano G, Lagana A, Veneziano D, Dai Y, Vitiello M, Wernicke D, Ferro A, Croce CM (2015) MicroRNA editing favors dynamic cellular changes in hypoxic conditions. Submitted.

2. Dai Y, Veneziano D, Lagana A, Balatti V, Zanesi N, McClanahan F, Nigita G, Sun HL, Walker C, Jeon YJ, Romano G, Yan P, Cheung D, Peng Y, Pekarsky Y, Acunzo M, Croce CM (2015) Characterization of TCL1-Murine B-1a cell transcriptome dynamics reveals novel insights into CLL onset. Submitted.

3. Srivastava AK, Han C, Zhao R, Cui T, Dai Y, Mao C, Zhao W, Zhang X, Yu J, Wang QE (2015) Enhanced expression of DNA polymerase eta contributes to cisplatin resistance of ovarian cancer stem cells. PNAS 112: 4411-4416.

4. Dai Y, Winston E, Correll JC, Jia Y (2014) Induction of avirulence by AVR-Pita1 in virulent U.S. field isolates of Magnaporthe oryzae. The Crop Journal 2: 1-9.

5. Peng Y, Dai Y, Hitchcock C, Yang X, Kassis ES, Liu L, Luo Z, Sun HL, Cui R, Wei H, Kim T, Lee TJ, Jeon YJ, Nuovo GJ, Volinia S, He Q, Yu J, Nana-Sinkam P, Croce CM (2013) Insulin growth factor signaling is regulated by micro-RNA 486, an underexpressed microRNA in lung cancer. PNAS 110: 15043-15048.

6. Dai Y, He H, Wise GE, Yao S (2011) Hypoxia promotes growth of stem cells in dental follicle cell populations. Journal of Biomedical Science and Engineering 4: 454-461.

7. Yao S, Gutierrez G, He H, Dai Y, Liu D, Wise GE (2011) Proliferation of dental follicle derived cell populations in heat-stress conditions. Cell Proliferation 44: 486-493.

8. Dai Y, Jia Y, Correll JC, Wang X, Wang Y (2010) Diversification and evolution of the avirulence AVR-Pita1 in field isolates of Magnaporthe oryzae. Fungal Genetics and Biology 47: 973-980.

9. Jia Y, Liu G, Costanzo S, Lee S, Dai Y (2009) Current progress on understanding of genetic interactions of rice with rice blast and sheath blight fungi. Frontier Research in China 3: 231-239.

10. Zhang L, Lu Q, Chen H, Pan G, Xiao S, Dai Y, Li Q, Zhang J, Wu X, Wu J, Tu J, Liu K (2007) Identification of a cytochrome P450 hydroxylase, CYP81A6, as the candidate for the bentazon and sulfonylurea herbicide resistance gene, Bel, in rice. Molecular Breeding 19: 59-68.

FIELDS OF STUDY

Major Field: Molecular, Cellular and Developmental Biology


TABLE OF CONTENTS

ABSTRACT

DEDICATION

ACKNOWLEDGEMENTS

VITA

LIST OF TABLES

LIST OF FIGURES

LIST OF ABBREVIATIONS

CHAPTER 1: INTRODUCTION

1.1 Overview of Chronic Lymphocytic Leukemia

1.1.1 B lymphocyte, CLL and CLL pathology

1.1.2 Clinical prognosis of CLL

1.1.3 Disease Management

1.2 T-Cell Leukemia/Lymphoma 1A (TCL1) and the TCL1-mouse model

1.2.1 Mouse models to study CLL

1.2.2 Molecular mechanisms of TCL1

1.2.3 The Eu-TCL1-transgenic mouse model

1.3 B-1a cells

1.3.1 Understanding the origin of CLL

1.3.2 Murine B-1a cell isolation

1.4 Hypothesis

CHAPTER 2: PROFILING OF TCL1-MURINE B-1A CELL USING RNA-SEQ

2.1 Introduction

2.2 Results and discussions

2.3 Materials and methods

CHAPTER 3: VALIDATION OF THE CANDIDATES FROM RNA SEQ

3.1 Introduction

3.2 Results and discussions

3.3 Materials and methods

CHAPTER 4: NETO2 AND FUTURE PERSPECTIVES

4.1 An overview of NETO2

4.2 Proposed future research: transgenic mice

4.2.1 NETO2-tg vs WT

4.2.2 NETO2/TCL1 double tg vs NETO2-tg or TCL1-tg alone

4.2.3 Neto2-/TCL1 vs TCL1-tg

CHAPTER 5: CONCLUDING REMARKS

REFERENCES

LIST OF TABLES

Table 1.1 Mouse models to study human CLL and their principles

Table 2.1 Expression level of top candidates by RNA-seq analysis

Table 2.2 Information of human samples analyzed with qRT-PCR

LIST OF FIGURES

Figure 1.1 B-lymphocyte formation and function

Figure 1.2 CLL disease symptoms

Figure 1.3 Genetic aberrations in CLL

Figure 1.4 Survival probabilities of different genetic aberrations

Figure 1.5 Protein structure of TCL1

Figure 1.6 The Eu-TCL1-transgenic mouse model

Figure 1.7 TCL1 oncogenic functions in T-cells and B-cells

Figure 1.8 Mature splenic B cell subsets

Figure 1.9 Total splenocyte compositions in mice

Figure 1.10 Mouse B-1a cell isolation

Figure 2.1 FACS analysis for B-1a cell collection from total splenocytes

Figure 2.2 RNA sequencing output quality control

Figure 2.3 Validation of TCL1 expression and transcriptome profiling

Figure 2.4 Superimposed diagrams displaying the overlapping relationship

Figure 2.5 Differentially expressed genes in TCL1 vs. WT mouse B-1a cells

Figure 3.1 Differentially expressed genes in mature mouse CLL

Figure 3.2 Protein expression analysis of NETO2 and Hbegf

Figure 3.3 NETO2 and HBEGF gene expression in human samples

Figure 3.4 Protein expression of NETO2 and HBEGF in human samples

Figure 3.5 NETO2 protein expression in patient samples of distinctive ZAP levels

Figure 3.6 TCL1 regulates the RNA expression of NETO2 and HBEGF

Figure 3.7 Protein expression analysis of NETO2 and HBEGF on transfected cells

Figure 4.1 The NETO protein

Figure 4.2 The transgenic mouse with human NETO2 expressed in mouse B cells

Figure 4.3 NETO2-tg vs WT

Figure 4.4 NETO2/TCL1 double tg vs NETO2-tg or TCL1-tg alone

Figure 4.5 Neto2-/TCL1 vs TCL1-tg

LIST OF ABBREVIATIONS

CLL ...... Chronic Lymphocytic Leukemia

CUB ...... Complement C1r/C1s, Uegf, Bmp1

DNMT3A/B ...... DNA (cytosine-5)-Methyltransferase 3A/B

FACS ...... Fluorescence-Activated Cell Sorting

FBS ...... Fetal bovine serum

FISH ...... Fluorescence in situ hybridization

FPKM ...... Fragments Per Kilobase of transcript per Million mapped reads

GAPDH ...... Glyceraldehyde 3-phosphate dehydrogenase

GFP ...... Green Fluorescent Protein

Hbegf ...... Heparin-binding EGF-like growth factor

HEK293 cells ...... Human Embryonic Kidney 293 cells

HRP ...... Horseradish peroxidase

KO ...... Knockout

LDL ...... Low Density Lipoprotein

lncRNA ...... long non-coding RNA

mo ...... month

Neto2 ...... Neuropilin (NRP) And Tolloid (TLL)-Like 2

NF-kB ...... Nuclear Factor kappa-light-chain-enhancer of activated B cells

NGS ...... Next Generation Sequencing

PBMC ...... Peripheral Blood Mononuclear Cell

QC ...... Quality Control

qRT-PCR ...... Quantitative RT-PCR (real-time PCR)

RNA seq ...... RNA sequencing

shRNA ...... short hairpin RNA

siRNA ...... short interfering RNA

TCL1 ...... T-Cell Leukemia/Lymphoma 1A

tg ...... transgenic

WT ...... Wild-type

ZAP-70 ...... Zeta-chain-associated protein kinase 70


CHAPTER 1: INTRODUCTION

1.1 Overview of Chronic Lymphocytic Leukemia (CLL)

1.1.1 B lymphocyte, CLL and CLL pathology

The human specific immune defense is a system composed of a variety of immune-associated cells (lymphocytes) and their secretory organs. B lymphocytes, or simply B cells, are a type of immune cell that contributes to the humoral immune response. B cells mature in the bone marrow and are then carried by the blood to the secondary lymphoid organs. Upon activation by antigens, B cells differentiate into plasma cells that secrete antibodies (Figure 1.1). When this process succeeds, the immune response eliminates the causal agent (antigen). Because B cells are so important to the host, defects in B cell homeostasis, such as genetic deregulation, may lead to severe health problems such as disease-related immune suppression followed by infectious complications [1]. One major genetic malignancy of B cells is B-cell chronic lymphocytic leukemia (CLL).


Figure 1.1 B-lymphocyte formation and function. Pluripotent stem cells develop into myeloid stem cells and lymphoid stem cells; part of the latter mature in the bone marrow and become B cells. B cells, upon activation by antigen, develop into plasma cells that secrete antibodies for immune purposes. (Image adapted from Vander’s Human Physiology 11th edition).


CLL is the most common adult leukemia type in western countries, accounting for 30% of all leukemias with >10,000 patients diagnosed annually in the U.S. alone. It is predominantly a disease affecting older people (>50y) with a survival time ranging from 2 to 20 years [2]. CLL occurs in two forms, aggressive and indolent, both characterized by the progressive accumulation of functionally incompetent B lymphocytes expressing CD5 antigen on their surface [3]. In detail, CLL is characterized by an accumulation of clonal B lymphocytes that express glycoprotein markers CD5, CD19, CD20, IgM/IgD and CD23 on their cell surface and exhibit kappa or lambda light chain restriction. In CLL patients, these cells can be found in peripheral blood, bone marrow, and in lymphoid organs such as spleen and lymph nodes, and lead to lymphocytosis, organomegaly and lymphadenopathy [4] (Figure 1.2).

Figure 1.2 CLL disease symptoms. Abnormal expansion of CD5+ B lymphocytes, fewer red blood cells and platelets, which are accompanied by enlarged liver, lymph nodes and spleen. (Image adapted from www.cll.cancerinformation.com)


1.1.2 Clinical prognosis of CLL

The Rai [5] and Binet [6] systems are the most established prognostic staging systems for clinical CLL evaluation and are based on physical examination and blood count. Rai stage 0 and Binet stage A represent patients with low-risk disease and a median survival of 17 years; Rai stage 1-2 and Binet stage B represent patients with intermediate-risk disease and a median survival of 5-8 years; Rai stage 3-4 and Binet stage C represent patients with high-risk disease and a median survival of less than 2 years [2]. As the clinical course of CLL is highly variable, the Rai and Binet systems help clinicians decide when therapy should start, but they cannot precisely predict the clinical course and are thus not suitable as long-term prognostic indicators. Instead, cytogenetic (Figures 1.3 and 1.4) and cellular molecular features can be used as markers to distinguish patients with distinct clinical courses [3].
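The staging groups described above can be written down as a small lookup. The following Python snippet is only an illustrative sketch: the dictionary and function names are hypothetical, and the survival figures are simply the medians quoted in the text [2], not values derived independently.

```python
# Illustrative lookup of the Rai/Binet risk groups summarized in the text.
# All names here are hypothetical; survival values are the medians quoted from [2].

RISK_BY_RAI = {0: "low", 1: "intermediate", 2: "intermediate", 3: "high", 4: "high"}
RISK_BY_BINET = {"A": "low", "B": "intermediate", "C": "high"}
MEDIAN_SURVIVAL = {
    "low": "17 years",
    "intermediate": "5-8 years",
    "high": "less than 2 years",
}


def staging_risk(rai=None, binet=None):
    """Return (risk group, median survival) for a Rai stage or a Binet stage."""
    if rai is not None:
        group = RISK_BY_RAI[rai]
    elif binet is not None:
        group = RISK_BY_BINET[binet.upper()]
    else:
        raise ValueError("provide a Rai stage (0-4) or a Binet stage (A-C)")
    return group, MEDIAN_SURVIVAL[group]
```

For example, `staging_risk(rai=2)` and `staging_risk(binet="B")` both fall into the intermediate-risk group. As the text notes, such a table supports the decision of when to start therapy but does not predict the individual clinical course.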

Figure 1.3 Genetic aberrations in CLL. Besides normal karyotype, cytogenetic aberrations present in CLL include 6q21 deletion, 13q14 deletion, 11q22-23 deletion, 17p13 deletion and trisomy 12.

Figure 1.4 Survival probabilities of different genetic aberrations. 17p deletion is associated with the shortest survival, followed by 6q deletion, 11q deletion and 12q trisomy; 13q deletion and normal karyotype are the least threatening, with the longest survival times. (Image acquired from Dohner (2000) The New England Journal of Medicine 343: 1910-1916)

Several prognostic molecular markers have been identified, such as the mutational status of the immunoglobulin heavy-chain variable-region gene (IgVH), the expression levels of the 70kDa zeta-associated protein (ZAP-70), and the presence of different chromosomal alterations [7, 8]. CLLs with an unmutated IgVH gene and high expression of ZAP-70 usually have an aggressive course, whereas patients with mutated IgVH clones and low ZAP-70 expression have an indolent course [9]. Cytogenetic aberrations are present in over 80% of CLL cases and mainly include deletions of 13q14 (>50%), 11q22-23 (18%) and 17p13 (7%-10%), and trisomy 12 (15%-18%) [10] (Figure 1.3). Each of these genetic aberrations has been found to lead to distinct survival probabilities (Figure 1.4). More specifically, the genomic alterations in CLL can be stratified into three groups: (i) low-risk: patients with a normal karyotype or isolated 13q deletion; (ii) intermediate-risk: subjects with 11q deletion, trisomy 12 or 6q deletion; and (iii) high-risk: patients with 17p deletion or a complex karyotype (Figure 1.3) [10]. Hence, genomic alterations in CLL are important independent predictors of CLL disease progression and survival.
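The three-group stratification above can be sketched as a simple rule-based classifier. The Python code below is a minimal illustration under stated assumptions: the aberration labels (`"del17p"`, `"del13q"`, and so on) and the function name are hypothetical, and the precedence (high-risk lesions override intermediate-risk ones, which override 13q deletion) is an assumption inferred from the wording "isolated 13q deletion", not a rule stated explicitly in the text.

```python
# Minimal sketch of the cytogenetic risk groups described in the text [10].
# Labels and precedence are illustrative assumptions, not a clinical tool.

HIGH_RISK = {"del17p", "complex karyotype"}
INTERMEDIATE_RISK = {"del11q", "trisomy 12", "del6q"}


def cytogenetic_risk(aberrations):
    """Classify a collection of aberration labels into a risk group."""
    found = set(aberrations)
    if found & HIGH_RISK:
        return "high"
    if found & INTERMEDIATE_RISK:
        return "intermediate"
    # An empty set stands for a normal karyotype; {"del13q"} is an isolated 13q deletion.
    if found <= {"del13q"}:
        return "low"
    # Anything not covered by the three groups in the text is left unclassified.
    return "unclassified"
```

Under these assumptions, a sample carrying both 13q and 11q deletions is assigned to the intermediate-risk group, because the 13q deletion is no longer isolated.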

1.1.3 Disease management

In order to avoid treatment-related side effects, patients with stable indolent disease are generally not recommended for treatment, and therapy is usually reserved for patients who show signs of progression toward a more aggressive stage [3]. Common therapeutic approaches include single-agent chemotherapy, combination chemotherapy and chemo-immunotherapy. Nevertheless, many newly developed treatment plans, such as chemo-immunotherapy approaches that have proven effective in prolonging patient survival (e.g. fludarabine, cyclophosphamide and the anti-CD20 antibody rituximab [11]), fail to be applied to patients: the lack of reliable targets for such treatment plans represents an important obstacle to the development of new and more efficient drugs. Therefore, specific prognostic markers and therapeutic targets need to be found [12].


1.2 T-Cell Leukemia/Lymphoma 1A (TCL1) and TCL1-mouse model

1.2.1 Mouse models to study CLL

Mouse models are valuable tools for preclinical studies because they simulate human malignancy and thus can be used to elucidate the underlying pathogenetic mechanisms [2]. Currently available mouse models for studying CLL include mir-15a/16-/- and mir-15a/16-1floxed CD19-Cre mice [13], 14qC3 minimal deleted region (MDR)-/- and MDRfloxed CD19-Cre mice [13], 14qC3 common deleted region (CDR)floxed CD19-Cre mice [14], Eu-TCL1 transgenic mice [15], APRIL transgenic mice [16], BCL2xtraf2dn transgenic mice [17], ROR1 transgenic mice [18], Eu-mir-29 transgenic mice [19], Vh11xirf4-/- mice [20], and IgH.T and IgH.TEu mice [21] (Table 1.1). Among them, Eu-TCL1 transgenic mice (TCL1 mice) are an optimal tool for investigating CLL and a preclinical model for novel therapeutics because they closely resemble the human disease in leukemia phenotype, antigen-receptor repertoire, and disease course [22].


Table 1.1 Mouse models to study human CLL and their principles.

Mouse models | Principles

mir-15a/16-/- and mir-15a/16-1floxed CD19-Cre mice [13] | Disruption of physiological expression of mir-15a/16-1

14qC3 minimal deleted region (MDR)-/- and MDRfloxed CD19-Cre mice [13] | Mir-15a/16-1, dleu2 and dleu5 deleted

14qC3 common deleted region (CDR)floxed CD19-Cre mice [14] | CDR deletion in mouse B lymphocytes

Eu-TCL1 transgenic mice [15] | Targeted expression of human TCL1 in mouse B lymphocytes

APRIL transgenic mice [16] | Accumulation of increased level of APRIL in sera

BCL2xtraf2dn transgenic mice [17] | Accumulated expression of human BCL2 and traf2dn in mice

ROR1 transgenic mice [18] | Targeted expression of the human ROR1 gene in mouse B lymphocytes

Eu-mir-29 transgenic mice [19] | Overexpression of the miR-29a/b cluster in mouse B cells

Vh11xirf4-/- mice [20] | IRF4 deficiency and vh11 expression in mouse B cells

IgH.T and IgH.TEu mice [21] | Sporadic SV40 T antigen expression in mature B cells

1.2.2 Molecular mechanisms of TCL1

Figure 1.5 TCL1 family members, TCL1 protein structure and TCL1 function. (A) Sequence alignment of the members of the human TCL1 family; (B) Crystal structure of TCL1; (C) TCL1 interacts with Akt, enhancing the kinase activity of Akt by facilitating its nuclear translocation. (Image acquired from Noguchi (2007) The FASEB Journal 21: 2273-2284)


T-cell leukemia/lymphoma 1 (TCL1) belongs to the proto-oncogene TCL1 family, which comprises three members in both the human and mouse genomes: TCL1 (14kDa), TCL1b (15kDa), and MTCP1 (16kDa) [23] (Figure 1.5A). TCL1 was first identified in a chromosomal translocation found in human T cell prolymphocytic leukemia [24]. TCL1 is an Akt co-activator [25] encoded by the TCL1 gene located at human 14q32.1. Its protein product is 114 amino acids long and forms a tight dimer. TCL1 consists of an orthogonal 8-stranded beta-barrel of the “filled barrel” category, the inside of which is tightly packed and hydrophobic. The antiparallel beta strands of variable length are arranged into two very similar up-and-down, four-stranded beta-meander motifs connected by a long, poorly structured loop that wraps around to form the barrel (Figure 1.5B). This unique topology allows TCL1 to interact with the pleckstrin homology domain of Akt, enhancing its kinase activity [23] (Figure 1.5C). TCL1 thus functions as a promoter of the PI3K-Akt (PKB) oncogenic pathway by activating Akt and driving its nuclear translocation, leading to increased cellular proliferation, inhibition of apoptosis, and malignant transformation [25, 26, 27]. Activation of the TCL1 oncogene is a central initiating event in the pathogenesis of aggressive CLL, and high TCL1 expression in patients correlates with an aggressive phenotype [27]. In physiological conditions, TCL1 is expressed early in T- and B-lymphocyte differentiation [26]. In pathological conditions, the overexpression of TCL1 in T- and B-cells leads to T-PLL/T-CLL and B-CLL, respectively. In aggressive CLL, TCL1 is an oncogenic protein critical for leukemogenesis and its deregulation is related to the disease pathogenesis [27, 28, 29]. For example, TCL1 protein is detectable in 90% of human CLL samples [29]; high TCL1 levels are a marker of adverse outcome in CLL [27]; and transgenic mice exclusively expressing TCL1 in B cells display disease symptoms of human aggressive CLL (Figure 1.6).

Figure 1.6 The Eu-TCL1-transgenic mouse model demonstrates human CLL symptoms. (A) Gross pathology of a representative >8mo old TCL1 mouse (right), and a WT control of the same age (left); (B) Schematic representation of the construct used to generate the TCL1(FL) mice; (C) Upper: Hematoxylin and eosin-stained spleen of mouse showing an expanded MZ in TCL1 mice; lower: Immunodetection of TCL1 protein in lymphoid cells of the MZ. (Image acquired from Efanov, et al. (2010) Leukemia 24: 970-975 and Bichi, et al. (2002) PNAS 99: 6955-6960)

Research in the past decade has expanded our knowledge of TCL1 by focusing on the proteomic level, revealing its involvement in Akt activation in T-cell transformation [24]. In addition, TCL1 activates NF-kB, inhibits AP-1 [30], and restrains DNMT3A [31], which causes epigenetic deregulation of gene expression and leads to CLL [32] (Figure 1.7). TCL1 also increases survival through activation of the endoplasmic reticulum stress response [33] and interaction with ATM [34] and HSP70 [35]. Post-transcriptionally, the expression of TCL1 can be inhibited by miR-29 and miR-181 [36]. Moreover, our preliminary results (unpublished microarray data) indicate that certain lncRNAs are aberrantly expressed in TCL1-expressing CLL samples, in accord with evidence that lncRNAs play a role in tumorigenesis. Thus TCL1 may also exert oncogenic effects in aggressive CLL through lncRNA modulation.

Figure 1.7 TCL1 oncogenic functions in CLL. Among other functions in CLL, TCL1 activates Akt, activates NF-kB, stabilizes HSP70, interacts with ATM, and inhibits AP-1.


1.2.3 The Eu-TCL1-transgenic mouse model

The Croce lab has established a Eu-TCL1-transgenic mouse model (TCL1 mouse) that expresses human TCL1 in murine B cells and displays an aggressive form of CLL. The TCL1 mouse model was genetically engineered by placing the entire coding region of the human TCL1 gene under the control of a mouse IgVH promoter and an IgH-Eu enhancer to promote TCL1 expression exclusively in mature and immature B cells [15] (Figure 1.6). Indeed, this model closely resembles human aggressive CLL in disease phenotype [28], epigenetic changes [37], response to treatment [38], and CLL-induced T-cell dysfunction [39]. Compared to other transgenic models, the TCL1 mouse is regarded as the gold standard animal model for studying aggressive CLL [40], and findings from this model are considered to be highly comparable to the human disease. As TCL1 is a co-activator of AKT, activates the NF-kB pathway in CLL cells, and inhibits DNMT3A and DNMT3B activity, it is suggested that leukemia development in TCL1-expressing individuals is at least partially dependent on enhanced AKT activity [25]. Due to the decisive role of TCL1 in disease initiation, we believe it maintains a signaling network whose complexity is beyond that previously documented. Therefore, an overall understanding of the transcription events taking place in CLL is helpful, and a transcriptome-level screening will be beneficial in generating such an expression outline. Accordingly, this mouse model helps us identify genes highly accountable for the CLL initiation mechanism, which might represent novel prognostic markers and/or targets for therapeutic intervention.


1.3 B-1a cells

1.3.1 Understanding the origin of CLL

CLL is characterized by an overproduction of abnormal B lymphocytes that express the CD5 glycoprotein marker (CD5+) on their surface. Previous studies showed that a small subgroup of CD5+ B cells in mammals, accounting for approximately 1% of the total lymphocytes, is the normal counterpart of malignant CLL cells [41, 42]. Indeed, human CD5+ B cells and malignant CLL B cells share similar gene expression profiles when compared to conventional B-cell subsets. More specifically, IgV unmutated CLL cells were found to be more similar to CD5+/CD27- B cells, whereas IgV mutated CLL cells showed a higher similarity to CD5+/CD27+ B cells. These findings indicate that certain CD5+ B cells might be the precursor and normal counterpart of CLL cells. Another recent study used microarray profiles of CD5+ B cells in three types of CLL-prone transgenic mice to demonstrate that a clonal expansion of CD5+ B cells can lead to malignant transformation, supporting the hypothesis that CD5+ B cells can act as CLL precursors [43]. Therefore, it is reasonable to consider CD5+ B lymphocytes as the primary object to investigate in order to further the understanding of the disease mechanism of CLL.


Figure 1.8 Mature splenic B cell subsets in mice. There are five types of mature B cells in the mouse spleen. The majority of splenic B cells derive from bone marrow: follicular B cells (>70%) and marginal zone B cells (15%). B-1 cells are minor subsets and are composed of B-1a cells (2%) and B-1b cells (<1%). B-1a cells are CD5+, whereas B-1b cells are CD5-. Regulatory B cells constitute 1% of the total splenic B cells, and their function is currently unknown. (Image acquired from Baumgarth (2011) Nature Reviews Immunology 11: 34-46)

1.3.2 Murine B-1a cell isolation

In mice, B-1 cells are a group of innate-like B cells that are long-lived and self-renewing and that produce most of the circulating natural IgM antibodies. The B-1a subgroup of B-1 cells accounts for the majority of CD5+ B cells that possess B-1 cell characteristics. In humans, a subset of CD5+ polyreactive IgM-producing B cells has been described as a potential functional human B-1a cell homologue (Figure 1.8) [44]. Due to the shared features of CLL cells and B-1a cells, attempts have been made to decipher the potential role of physiological B-1a cells as precursors of CLL [41, 43]. B-1a cells are more abundant in young mice than in old mice and can progress to CLL [42]. For example, these early-generated B-1a cells may become B-CLL when promoted by the human TCL1 overexpression engineered in TCL1 mice. TCL1 mice accumulate B-1a cells with disease progression and eventually exhibit fully developed CLL that resembles human aggressive CLL. In Chapter 2 of this dissertation, we describe how we employed the TCL1 mouse model to screen for and study transcriptome dynamics in B-1a cells that may contribute to the disease initiation of CLL.

Figure 1.9 Total splenocyte compositions in mice. B-1a cells account for only a tiny portion of splenocytes (varying from 1 to 3% between individuals, with the majority at ~1%).

As B-1a cells account for 1% (Figures 1.8 and 1.9) of the total splenocytes, their isolation for genetic material extraction is technically challenging. In my thesis research this challenge has been overcome by carrying out a combined protocol of positive and negative magnetic labeling selection (Figure 1.10), followed by the isolation of an adequate amount of B-1a cells for RNA isolation and RNA seq. The purity of the isolated B-1a cells was routinely checked by FACS analysis and confirmed to be >90% (Figure 2.1 in Chapter 2).

Figure 1.10 Mouse B-1a cell isolation. A combined protocol of positive and negative magnetic labeling selection is applied. Step 1: Magnetic labeling of non-B-1a cells; Step 2: Depletion of the labelled non-B-1a cells (negative screening); Step 3: Magnetic labeling of B-1a cells; Step 4: Positive selection of B-1a cells. (Image acquired from https://www.miltenyibiotec.com)

1.4 Hypothesis

We hypothesize that, at certain stages, significant alterations take place in B-1a cells that contribute to CLL transformation, and that these alterations can be detected by RNA-seq. Equally important, there may exist previously unrecognized factors that determine disease initiation. These could serve as new prognostic markers or therapeutic targets.

CHAPTER 2: PROFILING OF TCL1-MURINE B-1A CELL USING RNA-SEQ

2.1 Introduction

CLL is the most frequent type of adult leukemia in western countries. It is characterized by a clonal accumulation of abnormal CD5+ B lymphocytes involving the peripheral blood, bone marrow, and lymphoid organs [2] (Figure 1.2 in Chapter 1), which leads to variable clinical outcomes. CLL is a heterogeneous disease marked by a variety of genetic lesions such as chromosomal abnormalities, epigenetic alterations, and gene mutations of immunoglobulin heavy chains. There are two major types of CLL: aggressive and indolent. Although the exact mechanisms of progression from an indolent to an aggressive stage remain largely unknown, previous work provides evidence that specific genes, such as TCL1, play a decisive role in disease development [15, 29]. Therefore, we chose to study the modifications of specific gene pathways in cells at different stages, in order to provide new knowledge toward the identification of new prognostic factors and more effective targeted therapies.

Transcriptome analyses have recently identified CD5+ lymphocytes as the cell of origin

of malignant CLL cells [41]. In mice, CD5+ B lymphocytes are also known as B-1a cells

[44]. The well-established TCL1 mouse model, which mirrors the biological

characteristics of human CLL, possesses an expanded B-1a cell population starting at an

early age [45]. This model features the targeted expression of the human TCL1 gene in

mouse B cells, which drives the accumulation of B-1a cells with disease progression

(Figure 2.1). TCL1 transgenic mice eventually exhibit CLL-like disease starting at 6 to

12 mo of age [15]. For this reason, we study B-1a cells as precursors of CLL cells using

TCL1 mice at different ages during development and progression of the disease [45].

In this chapter we identify the transcriptome signature of murine B-1a cells in individual mice prior to the occurrence of CLL symptoms by gene expression profiling (mRNA-seq) in different age groups.

2.2 Results and discussions

We analyzed 1, 2, and 4mo old TCL1 mice and 1, 2, 4 (+8) mo age-matched wild-type (WT) controls (Figures 2.1 and 2.2). For each age group, three replicates were used, with each replicate consisting of an average of three mice in order to obtain sufficient B-1a cells for statistically valid comparisons between TCL1 and WT. The murine B-1a cells were isolated and verified by FACS analysis based on the characteristic cell surface markers CD5 and CD19 (Figure 2.1). TCL1 expression in TCL1 mice was confirmed by qRT-PCR (Figure 2.3A). These results were obtained following an RNA sequencing quality control analysis (Figure 2.2).

Figure 2.1 FACS analysis for B-1a cell collection from total splenocytes. B-1a cells isolated from (A) 1 mo, (B) 2 mo, and (C) 4 mo old TCL1 mice and the corresponding WT counterparts, respectively. In each collection the image on the left represents total splenocytes, while the one on the right represents B-1a cells isolated from the total population. The percentages in the upper right corner indicate the purity of B-1a cells in the isolated populations (8mo WT were excluded due to the lack of 8mo TCL1). Note: the upward-pointing arrows indicate the progressive shift of cells toward the CD5+ direction in TCL1 mice (from 1mo to 4mo).

Figure 2.2 RNA sequencing output quality control indicates that the transcriptome reads are of satisfactory quality. The two primary examinations are: (A) FastQC checks on the per base sequencing quality and (B) RSeQC checks on the read coverage. (Note: 8mo WTs are included in the quality control checks)

Figure 2.3 Validation of TCL1 expression and transcriptome profiling. (A) qRT-PCR was applied to determine the TCL1 gene expression levels in TCL1 mice in different strain/age groups; (B) Numbers of deregulated genes in different age groups obtained from total transcriptome RNA-seq; (C) Pathway analysis with deregulated genes: cell cycle control of chromosomal replication and G2/M DNA damage checkpoint regulation; (D) Heat maps summarizing the most up- or downregulated protein-coding and non-coding genes in each age group. Data are presented as the mean +/- SEM. ***P<0.005.

RNA-seq revealed a large number of deregulated genes. Total transcriptome RNA-seq was performed on TCL1 and WT mice in triplicate at 1 mo, 2 mo, and 4 mo of age. Using a 2.0-fold change threshold for both upregulated and downregulated genes and a cut-off value of >4.0 FPKM (Fragments Per Kilobase of transcript per Million mapped reads), the comparison of 1 mo old TCL1 and WT mice revealed 104 downregulated genes. With increasing age, the number of downregulated genes increased approximately 2-fold at 2 mo and 3-fold at 4 mo (218 genes at 2 mo and 310 genes at 4 mo). There were 61 upregulated genes in the 1 mo old group and 62 upregulated genes in the 2 mo old group, but this number increased more than 8-fold in the 4 mo old group, to a total of 530 genes. This implies that most of the gene upregulation in the transcriptome of TCL1 mouse B-1a cells begins between the second and fourth mo of age, suggesting a pattern of accelerated transcriptome dynamics due to the overexpression of the human TCL1 transgene (Figure 2.3B). These deregulations may result directly from TCL1 or be indirectly associated with TCL1 through a network of genetic events.
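The fold increases quoted in this paragraph follow directly from the reported gene counts. As a minimal sketch of that arithmetic (the variable and function names are illustrative and not part of the analysis pipeline), the ratios can be recomputed as follows:

```python
# Numbers of deregulated genes (TCL1 vs. WT) reported in the text,
# keyed by age in months.
down = {1: 104, 2: 218, 4: 310}   # downregulated genes
up = {1: 61, 2: 62, 4: 530}       # upregulated genes


def fold_increase(counts, from_age, to_age):
    """Ratio of the gene count at to_age to the gene count at from_age."""
    return counts[to_age] / counts[from_age]


down_2_vs_1 = fold_increase(down, 1, 2)
down_4_vs_1 = fold_increase(down, 1, 4)
up_4_vs_2 = fold_increase(up, 2, 4)

print(f"downregulated, 2 mo vs. 1 mo: {down_2_vs_1:.2f}-fold")
print(f"downregulated, 4 mo vs. 1 mo: {down_4_vs_1:.2f}-fold")
print(f"upregulated,   4 mo vs. 2 mo: {up_4_vs_2:.2f}-fold")
```

The sketch only restates the counts given above; it does not reproduce the Cuffdiff analysis that produced them.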

Signaling pathways were predicted based on the list of deregulated genes. Among the results, cell cycle-related pathways had the lowest P-values: cell cycle control of chromosomal replication (p<2.51×10^-8) and G2/M DNA damage checkpoint regulation (9.12×10^-8) (Figure 2.3C). Both pathways are predicted to be upregulated, indicating increased cell cycle activity, possibly due to TCL1 overexpression. These are followed by elevated mitotic roles of polo-like kinase, ATM signaling and aryl hydrocarbon receptor signaling. Among them, enhanced ATM signaling is consistent with the interaction between TCL1 and ATM reported by Gaudio et al. (2012) [34]; increased aryl hydrocarbon receptor signaling recapitulates the proliferating nature of CLL cells because it is involved in hematopoietic stem cell activation.

Differentially regulated genes were stratified between up- and down-regulated genes, and protein-coding and non-coding genes. For protein-coding genes, the top 15 most significantly up/down regulated genes (ranked according to linear fold changes) at each time point were selected as candidates; however, for non-coding genes, due to the limited number revealed by RNA-seq, all of them were included in the heat maps as well as subsequent analysis (Figure 2.3D). As summarized in the Venn diagrams shown in Figure

2.4, we focused on the candidates that were deregulated in all three age groups because they are likely to be constantly involved in the entire early transformation stage. The representative candidates were selected according to the gene linear fold change values and p-values, and were then successfully validated by real-time PCR. We validated the differential expression of the protein-coding genes Neto2 and Hbegf, and non-coding genes AI427809 and 1700097N02Rik (Figure 2.5).

Figure 2.4 Superimposed diagrams display the overlapping relationship among the deregulated genes from different age groups. The purple circles contain candidate genes differentially expressed between WT and TCL1 mice in the 1mo age group; the pink circles are the 2mo age group; and the green circles are the 4mo age group. The overlapping areas cover the candidates significantly deregulated in more than one age group. (A) protein coding genes upregulated; (B) protein coding genes downregulated; (C) non-coding RNA (ncRNA) upregulated; (D) ncRNA downregulated.

Upregulated genes. Neto2 was the most upregulated protein-coding gene in the B-1a cells of TCL1 mice compared to WT mice. It was upregulated in all three age groups and the mean ratio increased with age: 1 mo TCL1 vs. WT 70.64 (p<0.00005), 2 mo TCL1 vs. WT 97.36 (p<0.00005), and 4 mo TCL1 vs. WT 143.97 (p<0.00005) (Table 2.1). For validation, real-time PCR was performed using B-1a cells of TCL1 vs. WT mice at 1 mo, 2 mo, and 4 mo of age (Figure 2.5A). Neto2 has been shown to be highly differentially expressed in dnRAG1 and DTG mice, both of which are alternative murine models of CLL [43], and it has also been implicated in the development of solid tumors [46]. In response to expression of the wild-type tumor suppressor gene Nm23-H1 in the MDA-MB-435 cancer cell line, NETO2 was found to be downregulated, whereas this downregulation failed to occur upon expression of a non-functional mutant Nm23-H1 [47]. NETO2 expression is also associated with renal and lung cancers, where it may be a useful therapeutic target [46]. The exact mechanisms by which NETO2 affects cancer development, however, remain to be elucidated.

Figure 2.5 Differentially expressed genes in TCL1 vs. WT mouse B-1a cells of different age groups. Compared to WT, TCL1 mouse B-1a cells from all three age groups express higher levels of Neto2 and AI427809 and lower levels of Hbegf and 1700097N02Rik. This result is consistent with RNA seq. (A) Neto2; (B) Hbegf; (C) AI427809; and (D) 1700097N02Rik. Data are presented as the mean +/-SEM. * P<0.05; **P<0.01; ***P<0.005.

Additional upregulated genes with a potential role in CLL development in our study

were Fstl1, Grb7, Evc and Fkbp11. Fstl1 was upregulated in TCL1 mice in all three age

groups: 1mo TCL1 vs. WT 4.05 (p<0.00005), 2mo TCL1 vs. WT 10.31 (p<0.00005), and

4mo TCL1 vs. WT 17.08 (p<0.00005) (Table 2.1). FSTL1 has been described as being

secreted by Snail+ tumor cells that frequently metastasize to bone [48], and therefore might play a role in CLL migration. Grb7, which was upregulated in 1mo (TCL1 vs. WT

4.66, P<0.00005), 2mo (TCL1 vs. WT 9.38, P<0.00005) and 4mo (TCL1 vs. WT 4.15,

P<0.00005) old TCL1 mice (Table 2.1), encodes a non-catalytic intracellular adaptor protein. GRB7 protein interacts with EGFR, ephrin receptors and FAK to facilitate cell migration. It has been reported that late-stage CLL patients show enhanced

GRB7 expression accompanied by in vitro migration [49]. Evc was upregulated in both

2mo (TCL1 vs. WT 7.81, P<0.00005) and 4mo (TCL1 vs. WT 25.50, P<0.00005) old

TCL1 mice (Table 2.1). The product of this gene regulates leukemia cell survival through activation of the hedgehog pathway [50]. Fkbp11 was upregulated in 1mo (TCL1 vs. WT

2.67, p<0.00005), 2mo (TCL1 vs. WT 9.13, p<0.00005) and 4mo (TCL1 vs. WT 4.71, p<0.00005) old TCL1 mice (Table 2.1). This gene was identified as a biomarker in hepatocellular carcinoma [51] and is also highly expressed in lymphoma [52].

Downregulated genes. Hbegf was the most downregulated gene in the B-1a cells of

TCL1 mice compared to WT mice within all age groups, and the ratio decreased with age: 1mo TCL1 vs. WT 0.027 (p<0.00005), 2mo TCL1 vs. WT 0.0051 (p<0.00005), and

4mo TCL1 vs. WT 0.0026 (p<0.00005) (Table 2.1). We validated Hbegf downregulation by qRT-PCR in B-1a cells of <4mo old TCL1 vs. WT mice as well as in >8mo old mature mouse CLL (Figure 2.5B and Figure 3.1C). To confirm the differential expression at the protein level we used <4mo old mouse splenocytes but saw only very small differences in Hbegf (Figure 3.2A). This may be due to the still-low percentage of B-1a cells in young mice, which would leave the Hbegf difference between TCL1 and WT too small to detect (Note: total splenocytes were used for western blot instead of isolated B-1a cells because of the limited isolation yield).

Conversely, western blot using splenocytes from >8mo old mature mouse CLL cells

showed a marked downregulation of Hbegf in the TCL1 mice when compared to normal

splenocytes of WT counterparts (Figure 3.2B). We speculate that the different results between juvenile and mature mice come from the fact that very few B-1a cells are found in the spleens of 4 mo old TCL1 mice, whereas B-1a cells prevail in the spleens of mature TCL1 mice. Hbegf is conserved in both humans and mice, and regulates numerous

genes related to cell fate determination. HBEGF stimulates human embryo development by promoting the expression of specific human embryo genes [53]. Therefore, the

progressive loss of Hbegf expression in B-1a cells of TCL1 mice throughout development could result in a dysregulation of cell differentiation mechanisms, contributing to extended proliferation of malignant B cells and clinical symptoms of CLL. More specifically, the lowered expression of Hbegf in the CLL B-1a cells suggests a loss of developmental potential necessary for B cell maturation. This is in accord with a functional pathway analysis that showed the B cell development pathway is altered (data not shown), suggesting that an impedance of cell differentiation in the first stage of CLL may be partially due to downregulation of Hbegf. This hypothesis is supported by the negative relationship between TCL1 and Hbegf expression as revealed in this study.

The mechanisms by which Hbegf contributes to the onset of leukemogenesis remain to be elucidated.

Similar to Hbegf, the expression of stem cell antigen-1 (Ly6a) in TCL1 mouse B-1a

cells was progressively reduced with increased age: 1mo TCL1 vs. WT 0.086

(p<0.00005), 2mo TCL1 vs. WT 0.037 (p<0.00005), and 4mo TCL1 vs. WT 0.018

(p<0.00005) (Table 2.1). LY6A encodes an antigen upregulated on activated lymphocytes and is a common marker of hematopoietic stem cells [54]. Ly6a-/- mice have reduced

platelet and megakaryocyte counts, suggesting that this gene is involved in the

homeostasis of hematopoiesis [55]. Thus, this result might indicate a gradual loss of homeostasis of hematopoiesis in B-1a cells prior to the disease onset of CLL.

Additional downregulated genes with a potential role in CLL development in our study were MSH5, Ssm1b and LRRC49. MSH5 is downregulated in 1mo (TCL1 vs. WT 0.18,

P<0.00005), 2mo (TCL1 vs. WT 0.093, P<0.00005) and 4mo (TCL1 vs. WT 0.13,

P<0.00005) old TCL1 mice (Table 2.1). MSH5 is located in a susceptibility locus for lung

cancer [56]. Similarly, Ssm1b (2610305D13Rik) was downregulated in 1 mo (TCL1 vs.

WT 0.072, P<0.00005), 2mo (TCL1 vs. WT 0.0088, P<0.00005) and 4 mo (TCL1 vs.

WT 0.0029, P<0.00005) old TCL1 mice (Table 2.1). Ssm1b is a KRAB-zinc finger (ZF)

gene located on the distal arm of chromosome 4 and has been shown to be expressed in

early developmental stages. Ssm1b works in concert with Dnmt3b to mediate de novo

DNA methylation and chromatin modification in undifferentiated embryonic stem cells

(ESCs) and in turn regulates gene expression [57]. Leucine rich repeat containing 49

(LRRC49) is downregulated in 1mo (TCL1 vs. WT 0.076, P<0.00005), 2mo (TCL1 vs.

WT 0.017, P<0.00005) and 4mo (TCL1 vs. WT 0.026, P<0.00005) old TCL1 mice

(Table 2.1). This gene is located on human chromosome 15q23 and is silenced in breast

cancer due to hypermethylation of its promoter region [58]. Furthermore, 15q23 is one of

the recently identified susceptibility loci for CLL [59]. Therefore, the downregulation of

LRRC49 RNA expression could be a result of instability of the 15q23 region in CLL.

This downregulation may contribute to the malignancy of CLL similar to that in breast

cancer.

Deregulated ncRNAs include pre-miRs, lncRNAs and pseudogenes. We found

several interesting miRs, lncRNAs and pseudogenes deregulated between the TCL1 and

WT mice. Deregulated precursor miRs included pre-mir-568 and pre-mir-682, which

were downregulated in the 1mo and 2mo old groups, respectively. Downregulated

lncRNAs included 5730416F02Rik (1mo only), AW112010 (2mo only),

1500011B03Rik, 4933412E12Rik, E130102H24Rik and Tmem181b-ps (4mo only),

1700020N18Rik (1mo and 2mo), A430093F15Rik (2mo and 4mo), and 1700097N02Rik (all age groups). Upregulated lncRNAs included 4930481A15Rik and I730030J21Rik (1mo only), E330023G01Rik, F730043M19Rik, 2210039B01Rik, 4933421O10Rik,

A930005H10Rik, 2810025M15Rik, BC033916, AW011738, 1190002F15Rik and

1700063D05Rik (4mo only), E330020D12Rik (1mo and 4mo), and 2610035D17Rik and

AI427809 (all age groups). Among them, both 1700097N02Rik and AI427809 were validated by qRT-PCR in <4mo old mouse B-1a cells (Figure 2.5C and 2.5D).
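The age-group overlaps described above (and drawn as Venn diagrams in Figure 2.4) amount to simple set operations. The following is a minimal sketch using the upregulated lncRNA names listed in this paragraph; the variable names are illustrative, and the sets are typed in by hand rather than read from the RNA-seq output.

```python
# Upregulated lncRNAs per age group, as listed in the paragraph above.
up_1mo = {
    "4930481A15Rik", "I730030J21Rik", "E330020D12Rik",
    "2610035D17Rik", "AI427809",
}
up_2mo = {"2610035D17Rik", "AI427809"}
up_4mo = {
    "E330023G01Rik", "F730043M19Rik", "2210039B01Rik", "4933421O10Rik",
    "A930005H10Rik", "2810025M15Rik", "BC033916", "AW011738",
    "1190002F15Rik", "1700063D05Rik", "E330020D12Rik",
    "2610035D17Rik", "AI427809",
}

# Central region of the Venn diagram: deregulated in all three age groups.
in_all_groups = up_1mo & up_2mo & up_4mo

# Regions specific to one age group or shared by two groups only.
only_1mo = up_1mo - up_2mo - up_4mo
in_1mo_and_4mo_only = (up_1mo & up_4mo) - up_2mo

print("all age groups:", sorted(in_all_groups))
print("1mo only:", sorted(only_1mo))
print("1mo and 4mo only:", sorted(in_1mo_and_4mo_only))
```

The same intersection logic applies to the protein-coding and pseudogene lists; only the input sets change.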

Pseudogenes exist mostly due to gene duplications that make one copy expendable and mutations accumulating in the second copy [60]. Though not producing functional proteins, the transcript of pseudogenes may still have a regulatory role [60]. As identified by our RNA-seq analysis, downregulated pseudogenes included Gm8615 (2mo only),

Gm11346 and Gm15408 (4mo only), and Gm10653, Gm12505 and Gm6654 (all age

groups). Upregulated pseudogenes included Gm10451, Gm6402, Gm15987 and Gm8580

(4mo only) (Figure 2.4 and Table 2.1). As >98% of the genome is non-coding, the

identification of noncoding genes potentially contributing to leukemogenesis is of

particular importance in deciphering the molecular events in disease development. Thus,

the corresponding human genomic regions should be further studied.

Summary

In comparison to the microarray platform, the employment of RNA-seq technology in

the present study provides a highly extensive overview of the molecular alterations within

the potential CLL B cell precursor prior to the disease onset. In addition to known

annotated genes, RNA-seq identifies novel genomic regions with transcription events,

thus capturing comprehensive details of new potential causes leading to the disease

pathogenesis. In summary, we characterized the transcriptome of B-1a cells to test the

hypothesis that molecular dysregulation in mouse B-1a cells contributes to the occurrence

of mouse CLL. Indeed, we observed an age-related expansion of the B-1a cell population of CLL-prone mice, while the same cell population of the WT counterpart

remained constant (Figure 2.1). Moreover, the comparison of normal B-1a cells from the

WT mice vs. CLL-prone B-1a cells from TCL1 mice revealed a cascade of transcriptome

dynamics taking place within the transformation from B-1a to overt leukemia cells.

Identified targets can be of novel therapeutic value in the clinic. For instance, Neto2 has been reported to function in the mammalian central nervous system by encoding an auxiliary subunit of kainate receptors [61] and to be upregulated in tumors [46, 47]. This gene has not previously been implicated in CLL, and its role in leukemogenesis remains to be investigated.

Moreover, functional enrichment analysis of the dataset derived from RNA-seq reported a significant association of subsets of the deregulated genes with cancer-related pathways such as the cell cycle and NF-kB pathways, some of which have been reported to contribute to CLL development [62].

2.3 Materials and methods

Mice and preparation of murine B-1a cells. All animal experiments were

performed following procedures approved by The Ohio State University Institutional

Laboratory Animal Care and Use Committee. Homozygous TCL1 (B6C3H strain as

background) and strain/age-matched WT counterparts (Jackson Laboratory) were

sacrificed at the age of 1, 2 and 4 mo and spleens were collected for B-1a cell isolation.

Three samples were prepared for each strain/age group, and three mice were pooled to provide adequate numbers of B-1a cells for each sample. B-1a cells were purified by magnetic separation using the B-1a cell isolation kit (Miltenyi, cat #: 130-097-413) following the manufacturer’s instructions. Cell purities after magnetic isolation were determined by Fluorescence-Activated Cell Sorting (FACS) analysis using a BD FACS Aria III. Cells were stained with PE Rat Anti-Mouse CD5 and PerCP-CY 5.5 Rat Anti-Mouse CD19 (BD Biosciences) following standard protocols, and B-1a cells were identified as CD5+CD19+ lymphocytes (Figure 2.1). FlowJo software version 10 was used to analyze FCS files. Stopping gates were set on the CD19 vs. CD5 gate to record 20,000 events. The purity of isolated B-1a cell populations was above 90% for each sample: 92.5% for 1mo WT, 96.4% for 1mo TCL1, 95.7% for 2mo WT, 98.8% for 2mo TCL1, 95.6% for 4mo WT, and 98.2% for 4mo TCL1 (Figure 2.1).

RNA preparation and sequencing. Total RNA was extracted with TRIzol reagent

(Invitrogen, Carlsbad, California) following the manufacturer’s recommended protocol.

Total RNA concentration was quantified by Qubit (Life Tech). To ensure the quality of the subsequent cDNA library generation, contaminating DNA was removed with PureLink DNase (Invitrogen) used together with the PureLink RNA Mini Kit (Invitrogen). The integrity of the final RNA product was verified by NanoBioAnalyzer. Sample preparation and cDNA library generation were performed following the Illumina protocol, simplified as the

following steps: (1) mRNA purification and fragmentation, (2) first strand cDNA

synthesis, (3) second strand cDNA synthesis, (4) end repair, (5) 3’-end adenylation, (6) adapter ligation, (7) DNA fragment enrichment. In the final step, deep sequencing was applied to the library aiming for 35-40 million passed filter reads/sample. The reads were recorded to show the expression level of regions throughout the transcriptome. For CLL samples, RNA was extracted using standard TRIzol (Invitrogen, Carlsbad, California) method and checked for quality on Agilent Bioanalyzer.

Real-time PCR. Total cDNA was synthesized from 250 ng of total RNA using the High Capacity cDNA Reverse Transcription Kit (Applied Biosystems) following the manufacturer’s instructions. Genes of interest (Neto2, Hbegf, AI427809, and 1700097N02Rik) were analyzed for expression level by qRT-PCR using FAM-labelled TaqMan assays from Applied Biosystems following the protocol provided by the manufacturer. GAPDH was used as an internal control. Details of the assays are: mouse Gapdh (assay ID Mm99999915_g1, Cat# 4331182), mouse Neto2 (assay ID Mm01245002_m1, Cat# 4351372), mouse Hbegf (assay ID Mm00439306_m1, Cat# 4331182), mouse AI427809 (assay ID Mm01346743_m1, Cat# 4351372), mouse 1700097N02Rik (assay ID Mm03956926_m1, Cat# 4426961), human GAPDH (assay ID Hs03929097_g1, Cat# 4331182), human NETO2 (assay ID Hs00983152_m1, Cat# 4331182), human HBEGF (assay ID Hs00181813_m1, Cat# 4331182).

RNA-seq Data Analysis. Transcriptome data (in the format of fastq files produced

by the Illumina sequencer) were mapped to the murine reference genome sequence along

with corresponding annotation files [63] with the Tuxedo Package mapping software

TopHat v.2.0.9 [64]. Quality control assessments were performed both pre- and post-alignment in order to ensure the highest possible quality of the data selected for the following analysis. FastQC analysis was performed to assess the quality of the pre-alignment data and assure the integrity of the data. FastQC provides assessment of basic statistics, per base sequence quality, per tile sequence quality, etc. RSeQC was performed to assess post-alignment data quality. RSeQC provides evaluation of RNA-seq-specific metrics such as sequencing depth, mapped reads distribution, coverage uniformity and saturation checking [65]. Both the pre- and post-alignment data passed the QC checks according to the results given by the subordinate programs of FastQC and RSeQC.

Differential expression analysis was subsequently performed with the Tuxedo Package software Cuffdiff v.2.1.1 for each age group, where mapped data from the biological replicates were combined in comparing the two conditions, TCL1 vs. WT [66] (P-value <0.05). A cutoff value of FPKM >4 and a linear fold change >2 were applied to screen for significantly deregulated transcripts. To increase stringency, we also filtered out genes with an interquartile range above the 95th percentile, as this is indicative of high intraclass variability. Signaling pathway prediction was carried out with the Ingenuity IPA program.
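The screening cutoffs described above can be illustrated with a short sketch. This is not the code used in the study: the record layout, the function names and the example values are hypothetical, and the sketch assumes that the FPKM cutoff is applied to the higher-expressing of the two conditions and that a fold change >2 covers both directions (ratio >2 or <0.5). The interquartile-range filter is not included.

```python
import math
from dataclasses import dataclass


@dataclass
class Transcript:
    gene: str
    fpkm_wt: float
    fpkm_tcl1: float
    p_value: float


def fold_change(t: Transcript) -> float:
    """Linear fold change TCL1/WT; infinite when WT expression is zero."""
    if t.fpkm_wt == 0:
        return math.inf
    return t.fpkm_tcl1 / t.fpkm_wt


def is_deregulated(t, min_fpkm=4.0, min_fold=2.0, max_p=0.05):
    """Apply the FPKM, fold-change and P-value cutoffs to one transcript."""
    # Assumption: the FPKM cutoff refers to the higher-expressing condition.
    if max(t.fpkm_wt, t.fpkm_tcl1) <= min_fpkm:
        return False
    if t.p_value >= max_p:
        return False
    fc = fold_change(t)
    # Up- or downregulated by more than min_fold.
    return fc > min_fold or fc < 1.0 / min_fold


# Hypothetical records for illustration only (not data from this study).
records = [
    Transcript("GeneA", 5.0, 50.0, 0.00005),   # 10-fold up, passes
    Transcript("GeneB", 40.0, 4.0, 0.00005),   # 10-fold down, passes
    Transcript("GeneC", 1.0, 3.0, 0.00005),    # below the FPKM cutoff
    Transcript("GeneD", 10.0, 15.0, 0.001),    # fold change too small
    Transcript("GeneE", 10.0, 30.0, 0.2),      # not significant
]

deregulated = [t.gene for t in records if is_deregulated(t)]
print(deregulated)
```

In practice the FPKM values and P-values would come from the Cuffdiff output tables rather than being entered by hand.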

General Data Analysis. To generate heat maps, differentially expressed genes

common to all three time points were extracted and differential expression analysis was performed on all mapped data from each condition in all three age groups so as to retrieve comparable expression values for the common gene set. The heat maps were generated with the Hierarchical Clustering module of GenePattern based on normalized expression data

[67]. Pearson correlation was used as distance and pairwise complete-linkage as clustering method for both genes and samples. Pathway analysis on de-regulated genes was performed by Ingenuity Pathway Analyzer (IPA) (Ingenuity® Systems, www.ingenuity.com).

In qRT-PCR results, data are presented as the mean +/- SEM. Significance was evaluated by t-test. Independently, P-values in RNA-seq-based candidate profiling were calculated automatically by the Tuxedo Package software Cuffdiff v.2.1.1 during differential expression analysis. The trend lines of NETO2 and HBEGF expression in human samples were generated by regression analysis.
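As an illustration of the qRT-PCR summary statistics, the sketch below computes the mean, the SEM and a two-sample t statistic. The text does not state which t-test variant was used, so the pooled-variance Student's t-test here is an assumption; the function names are illustrative and the relative expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))


def student_t(a, b):
    """Two-sample Student's t statistic (pooled variance); returns (t, df)."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2
    ) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical relative expression values, three replicates per group
# (not data from this study).
wt = [1.0, 1.2, 0.8]
tcl1 = [3.0, 3.5, 2.5]

t_stat, df = student_t(tcl1, wt)
print(f"WT:   {mean(wt):.2f} +/- {sem(wt):.2f}")
print(f"TCL1: {mean(tcl1):.2f} +/- {sem(tcl1):.2f}")
print(f"t = {t_stat:.2f}, df = {df}")
```

The P-value is then obtained by comparing the t statistic with the t distribution for the given degrees of freedom, for example from a statistical table or a statistics package.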

Table 2.1 Expression level of top candidates by RNA-seq analysis

Fold change = TCL1/wt.

1 mo                              | 2 mo                              | 4 mo
Genes up      Fold change p-value | Genes up      Fold change p-value | Genes up      Fold change p-value
Neto2         70.64400139 0.00005 | Neto2         97.36328234 0.00005 | Neto2         143.9745497 0.00005
Fstl1         4.04911824  0.00005 | Fstl1         10.30648082 0.00005 | Fstl1         17.08052009 0.00005
Fkbp11        2.669598641 0.00005 | Fkbp11        9.12861212  0.00005 | Tcstv3        14.39514085 0.00005
Grb7          4.660167351 0.00005 | Grb7          9.378599514 0.00005 | Evc           25.49937533 0.00005
S100a6        2.573209692 0.00005 | S100a6        8.590260717 0.00005 | Fxyd6         83.91402203 0.0001
Stc1          6.376885136 0.00005 | Stc1          9.996619052 0.00005 | Gm20767       19.58164416 0.00005
Tcstv3        2.792703361 0.00005 | Evc           7.814936132 0.00005 | Gnb3          39.13479547 0.00005
Afap1         2.942525155 0.00005 | Bhlha15       8.161295354 0.00005 | Liph          29.52018792 0.00005
Apobec2       4.543234685 0.00005 | Fcgr4         10.36640489 0.00005 | Mcc           27.41132194 0.00005
Bst1          4.017830623 0.00005 | Hba-a2        12.51359822 0.00005 | Myl9          13.15199727 0.00005
Fam211a       4.409530308 0.00005 | Lmna          7.007701538 0.00005 | Nrep          24.13541954 0.00005
Igj           2.63754759  0.00005 | Pawr          7.202350635 0.00005 | Pgbd5         13.64518453 0.00005
Pltp          3.255559272 0.00735 | Sel1I3        7.510231515 0.00005 | Pilra         13.93614853 0.00005
S100a8        2.856966457 0.00005 | Tnfrs17       12.1079056  0.00005 | Robo1         16.4506194  0.00005
Sepn1         3.636279735 0.00005 | Trim6         10.99453234 0.00005 | Serpinf1      16.93469384 0.00005

Genes down    Fold change p-value | Genes down    Fold change p-value | Genes down    Fold change p-value
Hbegf         0.026906352 0.00005 | Hbegf         0.005149865 0.00005 | Hbegf         0.002636409 0.00005
Lrrc49        0.075693208 0.00005 | Lrrc49        0.016774639 0.00005 | Lrrc49        0.02642157  0.00005
Ly6a          0.086162969 0.00005 | Ly6a          0.036600816 0.00005 | Ly6a          0.018128709 0.00005
Pianp         0.059730901 0.00005 | Pianp         0.028466486 0.00005 | Pianp         0.015235176 0.00005
Msh5          0.177141283 0.00005 | Msh5          0.092970642 0.00005 | Gm5506        0.031282942 0.0002
Zfp534        0.091678674 0.00005 | Zfp534        0.030209941 0.00005 | Slc15a2       0.019868407 0.00005
H2-Bl         0.027493514 0.00005 | Gm5506        0.037985357 0.00005 | H2-Bl         0.041480584 0.00005
Cyp11a1       0.185258851 0.00005 | Slc15a2       0.020403664 0.00005 | 2610305D13Rik 0.002878853 0.00005
Folr2         0.159614749 0.00005 | Adcy6         0.03725557  0.00005 | 8430419L09Rik 0.048302109 0.00005
Gm6251        0.103316238 0.0086  | Cd3g          0.073743052 0.00005 | Acsf2         0.060979579 0.00005
Gpr124        0.134986159 0.00005 | H2-Ea-ps      0.003277525 0.00005 | Ccr6          0.057129511 0.00005
Gm5506        0.042276873 0.00005 | Ms4a4b        0.0470009   0.00005 | Cnn3          0.052262864 0.00005
Nme7          0.168964278 0.00005 | Nsg2          0.082381118 0.00005 | Dnahc8        0.067423251 0.00005
Olfr1033      0.093222693 0.00005 | Prkch         0.083771185 0.00005 | I830077J02Rik 0.042707108 0.00005
Ptprv         0.196304168 0.00005 | Zap70         0.041388505 0.00005 | Prg2          0.067057383 0.00005

ncRNAs up     Fold change p-value | ncRNAs up     Fold change p-value | ncRNAs up     Fold change p-value
2610035D17Rik 4.764858508 0.00005 | 2610035D17Rik 5.126204193 0.00005 | 2610035D17Rik 4.803846325 0.00005
AI427809      3.065749284 0.00005 | AI427809      6.827521216 0.00005 | AI427809      5.228329262 0.00005
E330020D12Rik 2.276589763 0.00005 |                                   | E330020D12Rik 4.025449807 0.00005
4930481A15Rik 3.576820746 0.00425 |                                   | E330023G01Rik 4.816015382 0.00005
I730030J21Rik 2.188697678 0.00005 |                                   | F730043M19Rik 3.911494672 0.00005
                                  |                                   | Gm10451       2.81407348  0.00005
                                  |                                   | 2210039B01Rik 2.186116067 0.0003
                                  |                                   | 4933421O10Rik 2.456033093 0.00005
                                  |                                   | A930005H10Rik 2.126454521 0.00005
                                  |                                   | 2810025M15Rik 4.458844848 0.00005
                                  |                                   | BC033916      5.253540424 0.00005
                                  |                                   | AW011738      2.313280159 0.00005
                                  |                                   | Gm6402        4.673877752 0.0097
                                  |                                   | 1190002F15Rik 5.689768679 0.00005
                                  |                                   | 1700063D05Rik 4.651158164 0.00005
                                  |                                   | Gm15987       2.844628001 0.00005
                                  |                                   | Gm8580        3.172235769 0.00005

ncRNAs down   Fold change p-value | ncRNAs down   Fold change p-value | ncRNAs down   Fold change p-value
1700097N02Rik 0.307141658 0.00005 | 1700097N02Rik 0.168869672 0.00005 | 1700097N02Rik 0.029950004 0.00005
Gm10653       0.118198653 0.00005 | Gm10653       0.126039933 0.00005 | Gm10653       0.131400257 0.00005
Gm12505       0.185645531 0.00005 | Gm12505       0.311104435 0.00005 | Gm12505       0.253468786 0.00005
Gm6654        0.051586475 0.0001  | Gm6654        0.03682046  0.00005 | Gm6654        0.030231786 0.0084
1700020N18Rik 0.225513099 0.00005 | 1700020N18Rik 0.198609179 0.00005 | A430093F15Rik 0.28016138  0.00005
5730416F02Rik 0.015841373 0.00125 | A430093F15Rik 0.428357348 0.00005 | AI504432      0.20779315  0.00005
Mir568        0.485881389 0.02785 | AI504432      0.489197616 0.00005 | 1500011B03Rik 0.433282538 0.00015
                                  | AW112010      0.467068124 0.00005 | 4933412E12Rik 0.318260496 0.0004
                                  | Gm8615        0.007655352 0.0004  | E130102H24Rik 0.481841313 0.0053
                                  | Mir682        0.419095647 0.0475  | Gm11346       0.362632466 0.00005
                                  |                                   | Gm15408       0.241760424 0.00005
                                  |                                   | Tmem181b-ps   0.452707151 0.00025
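Because up- and downregulation are asymmetric on a linear ratio scale (for example 143.97 vs. 0.0026), a log2 transform can make the age trends of the top candidates easier to compare. The sketch below is illustrative only: it uses the Neto2 and Hbegf fold changes from Table 2.1, and the function names are not part of the analysis pipeline.

```python
import math

# Linear fold changes (TCL1/WT) for two candidates, taken from Table 2.1,
# keyed by age in months.
fold_changes = {
    "Neto2": {1: 70.64400139, 2: 97.36328234, 4: 143.9745497},
    "Hbegf": {1: 0.026906352, 2: 0.005149865, 4: 0.002636409},
}


def log2_fold_changes(by_age):
    """Convert linear fold changes to log2, ordered by age."""
    return [math.log2(fc) for _, fc in sorted(by_age.items())]


def is_monotonic(values, increasing):
    """True if the values strictly increase (or strictly decrease)."""
    pairs = list(zip(values, values[1:]))
    if increasing:
        return all(b > a for a, b in pairs)
    return all(b < a for a, b in pairs)


neto2_log2 = log2_fold_changes(fold_changes["Neto2"])
hbegf_log2 = log2_fold_changes(fold_changes["Hbegf"])

print("Neto2 log2 fold changes:", [round(v, 2) for v in neto2_log2])
print("Hbegf log2 fold changes:", [round(v, 2) for v in hbegf_log2])
print("Neto2 rises with age:", is_monotonic(neto2_log2, increasing=True))
print("Hbegf falls with age:", is_monotonic(hbegf_log2, increasing=False))
```

On the log2 scale, positive values indicate upregulation in TCL1 mice and negative values indicate downregulation, so both trends can be read on a common axis.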

Table 2.2 Information of the human samples

Sample ID   ZAP%   VH%        FISH%   Karyotype
1150        2.3    100        92      11q deletion
1151        75.5   96.1       77      11q deletion
1153        26.7   99.3       93      11q deletion
1154        24.1   99.7       98      11q deletion
1155        85     100        98      11q deletion
1156        25.7   100        86.5    11q deletion
1157        66.2   100        95.5    11q deletion
1181        9.8    100        90      11q deletion
1188        13.6   100        92.5    11q deletion
1190-p      79.2   100        28      17P deletion
1193        36.8   99.7       95.5    17P deletion
1194        95.2   100        78.5    17P deletion
1195        66.7   100        95      17P deletion
1199        21.4   100        98      17P deletion
1198-p      63.8   100        49      17P deletion
1237        59.4   100        64      17P deletion
1241        44     100        98      17P deletion
1246        58.5   100        87.5    17P deletion
1248        65.3   100        99.5    17P deletion
1257        65.6   100        99      17P deletion
1299        0.84   100        88      17P deletion
1203        3      92.6       97.5    13q deletion
1204        0.6    91         95.5    13q deletion
1205-p      16.8   87.6       93.5    13q deletion
1206-p      5.3    92.5       90.5    13q deletion
1208        1      93.9/97.5  41      13q deletion
1210        0.12   96.5       96.5    13q deletion
1212        0.4    95.4       normal  normal karyotype
1226        26.5   100        normal  normal karyotype
1227        0.9    95.1       normal  normal karyotype
1228        7.1    96.2       normal  normal karyotype
1231        30.9   100        normal  normal karyotype
1232        0.6    90.9       normal  normal karyotype
33-V        83.9   100        71      Trisomy 12
39-V        67.7   100        69      Trisomy 12
58R         77.8   99.6       78      Trisomy 12
59R         92.4   99.6       56      Trisomy 12
1-V         84.2   100        52.5    Trisomy 12
2-V         91.7   93         86.5    Trisomy 12
cord blood

“FISH%” refers to the percentage of cells that carry the corresponding chromosomal aberration.

CHAPTER 3: VALIDATION OF THE CANDIDATES FROM RNA SEQ

3.1 Introduction

In Chapter 2 we obtained a list of candidate genes by RNA-seq performed on mouse B-1a cells of different age groups with the intention of identifying new key factors in CLL development. According to linear fold changes and statistical significance, two protein-coding genes and two non-coding RNAs were selected as the optimal targets for subsequent studies. Accordingly, CLL samples were analyzed to validate the candidates.

We collected >8mo old TCL1 mice that had developed aggressive CLL disease symptoms. WT counterparts of similar ages were prepared as controls. Similarly, RNA isolated from 30 human CLL samples (and one normal CD5+ B cell sample) was prepared. Additionally, to characterize the association of TCL1 with the two protein-coding candidates, NETO2 and HBEGF, human cell lines were transfected with TCL1 (see Materials and methods). HEK293, human embryonic kidney cells, and MEC2, human CLL cells, were both transfected in order to compare the TCL1-mediated regulatory mechanism in these two cell line models.

In this chapter we aim to (1) validate the results in mouse and human CLL samples, and (2) explore the correlation of the identified candidates with TCL1.


3.2 Results and discussions

Dysregulation of Neto2, Hbegf, AI427809 and 1700097N02Rik is found in mature mouse CLL samples. As shown in Chapter 2, these four candidates were selected and validated. Their expression was assessed in mature mouse CLL cells isolated from >8mo old TCL1 mice and compared to lymphocytes collected from their WT counterparts (Figure 3.1). As anticipated from the RNA-seq and qRT-PCR results of <4mo mice, Neto2 and AI427809 are also upregulated in mouse CLL, whereas Hbegf and 1700097N02Rik are downregulated. The differences in expression levels between TCL1 and WT are significant, and the results in triplicate are consistent (Figure 3.1). This result supports the previous findings in <4mo B-1a cells and suggests that (1) the RNA-seq datasets are reliable, (2) these four candidates are consistently deregulated, and (3) any of these abnormalities may contribute to CLL formation. Western blot results for Neto2 and Hbegf on <4mo and >8mo mouse samples confirm deregulation at the protein level (Figure 3.2). An exception is that, in <4mo mouse samples, Hbegf does not show an observable difference between TCL1 mice and WT mice. Since the samples (total splenocytes) from <4mo mice contain an extremely small proportion of CD5 B cells (1%), this discrepancy may be due to the predominant non-CD5-positive B cells that represent 99% of the total splenocytes collected (Figure 3.2A). This hypothesis is supported by the >8mo mouse results, in which the majority of splenocytes collected have developed into CD5 CLL B cells (Figure 3.2B).


Figure 3.1 Differentially expressed genes in adult mouse WT splenocytes vs. mouse CLL counterparts. qRT-PCR was performed to validate the top up- and downregulated coding and non-coding genes. Compared to WT, TCL1 mouse samples express higher levels of Neto2 and AI427809 and lower levels of Hbegf and 1700097N02Rik. These results are consistent with those derived from RNA-seq and <4mo B-1a cells. (A) TCL1; (B) Neto2; (C) Hbegf; (D) AI427809; (E) 1700097N02Rik. Data are presented as the mean +/- SEM. * P<0.05; **P<0.01; ***P<0.005.

Figure 3.2 Protein expression analysis of Neto2 and Hbegf. Western blot was performed on protein lysates derived from total splenocytes. The results from <4mo samples in (A) show higher Neto2 expression in TCL1 mice in all three age groups, whereas no such difference is seen for Hbegf (densitometry of HBEGF normalization on <4mo old mouse splenocytes follows), probably because B-1a cells make up only a small fraction of splenocytes in young mice. The results from >8mo samples in (B), however, show a distinct upregulation of Neto2 and downregulation of Hbegf in TCL1 mice, consistent with the RNA-seq and qRT-PCR results. (A) Immunoblots of 1mo, 2mo, and 4mo old mouse splenocytes. (B) >8mo old mouse splenocytes.

Correlations identified in human samples. qRT-PCRs of NETO2, HBEGF and TCL1 were performed on 30 CLL patient samples together with normal CD5+ B cells. The two ncRNAs are not conserved in human; therefore they were not tested on this CLL sample set. Higher gene expression of NETO2 was observed in 28 of 30 CLL patient samples when compared to normal CD5+ B cell controls. When the samples were ordered by TCL1 gene expression values (Figure 3.3A) and compared to NETO2 and HBEGF, a positive correlation between TCL1 and NETO2 gene expression was observed (Figure 3.3B), whereas TCL1 and HBEGF expression levels displayed a negative correlation (Figure 3.3C). This was also evidenced by the markedly high RNA expression of HBEGF in the patient samples with the lowest TCL1 RNA expression (#1226 and #1208 in Figure 3.3C). This finding aligns with the previous experiments and RNA-seq results. Notably, the correlation between TCL1 and NETO2 is stronger than the one between TCL1 and HBEGF.
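
The ordering and trend-line analysis described above can be sketched as follows. This is a minimal illustration with made-up placeholder values, not the study's data; the sample IDs, the `relative_expression` dictionary and the function `fit_trend` are hypothetical names introduced here. Samples are sorted by TCL1 expression, and an ordinary least-squares line is fitted to a second gene's values against sample rank, so a positive slope corresponds to a NETO2-like pattern and a negative slope to an HBEGF-like pattern.

```python
# Illustrative sketch only: placeholder values, not measurements from this study.

def fit_trend(values):
    """Ordinary least-squares line through (rank, value) pairs.

    Returns (slope, intercept), where rank is 0, 1, 2, ... in the given order.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a line")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical relative expression values per sample: (TCL1, NETO2, HBEGF).
relative_expression = {
    "S1": (0.5, 1.0, 9.0),
    "S2": (4.0, 6.0, 2.0),
    "S3": (2.0, 3.5, 4.0),
    "S4": (8.0, 9.0, 1.0),
}

# Order sample IDs by increasing TCL1 expression.
ordered_ids = sorted(relative_expression, key=lambda s: relative_expression[s][0])

# Fit a trend line to each candidate gene in TCL1 order.
neto2_slope, _ = fit_trend([relative_expression[s][1] for s in ordered_ids])
hbegf_slope, _ = fit_trend([relative_expression[s][2] for s in ordered_ids])
```

With these placeholder numbers the NETO2 slope is positive and the HBEGF slope is negative, which is the qualitative pattern reported for Figure 3.3.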

The qRT-PCR results are supported by western blots on randomly selected patient samples. In these samples, high TCL1 expression correlates with high NETO2 expression (Figure 3.4). However, HBEGF does not show a correlation with TCL1. This is likely attributable to the heterogeneity of the primary human samples (similar to that in <4mo mouse splenocytes, as shown in Figure 3.2A).

Moreover, further experiments show that NETO2 positively correlates with ZAP-70 (Figure 3.5), which is an important disease prognostic marker. Indeed, high-level expression of ZAP-70 is associated with more aggressive disease in patients. Therefore, the positive correlation between NETO2 and ZAP-70 also corroborates our hypothesis that NETO2 expression is associated with disease progression. This finding calls for more extensive investigation of the role and function of NETO2 in CLL using transgenic mouse models.

Figure 3.3 NETO2 and HBEGF gene expression in human samples. qRT-PCR was performed on 30 CLL samples and normal CD5+ B cells. Samples are ordered by levels of (A) TCL1 gene expression, (B) NETO2 gene expression, and (C) HBEGF gene expression. Trend lines of NETO2 and HBEGF were generated with linear regression.


Figure 3.4 Western blots on NETO2 and HBEGF. (A) Protein expression of NETO2 and HBEGF in human samples performed with the indicated antibodies; (B) Densitometry of HBEGF normalizations on patient samples.



Figure 3.5 NETO2 protein expression in patient samples with distinct ZAP-70 levels. (A) Western blot plus Ponceau loading control; (B) Densitometry of NETO2 normalization (normalization based on signal intensity relative to GAPDH). In a portion of the samples, those with high ZAP-70 expression (indicative of aggressive disease) express higher amounts of NETO2 than those with low ZAP-70 expression. * P<0.05.
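
The normalization described for panel (B) amounts to dividing each NETO2 band intensity by the GAPDH intensity from the same lane. A minimal sketch with placeholder intensities follows; the lane names, the values and the helper `normalize_to_loading_control` are hypothetical and do not come from the blots shown here.

```python
# Illustrative sketch only: band intensities are placeholders, not study data.

def normalize_to_loading_control(target, control):
    """Return target/control band-intensity ratios, keyed by lane.

    Both arguments map lane name -> raw densitometry signal.
    """
    ratios = {}
    for lane, signal in target.items():
        loading = control[lane]
        if loading <= 0:
            raise ValueError(f"non-positive loading control in lane {lane!r}")
        ratios[lane] = signal / loading
    return ratios


# Hypothetical raw signals for two lanes.
neto2_signal = {"lane1": 1200.0, "lane2": 300.0}
gapdh_signal = {"lane1": 1000.0, "lane2": 1500.0}

normalized = normalize_to_loading_control(neto2_signal, gapdh_signal)
```

Dividing by the loading control in the same lane keeps differences in the amount of protein loaded from being read as differences in NETO2 expression.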


Ectopic expression of TCL1 alters the expression of NETO2 and HBEGF. As a further validation of our profiling data, we sought to identify the relationship between the expression of TCL1 and that of the differentially regulated genes. We ectopically expressed TCL1 in two different human cell lines, HEK293 (human embryonic kidney 293 cells) and MEC2 (a human CLL-derived cell line), and then assessed NETO2 and HBEGF expression. Intriguingly, TCL1-transfected HEK293 and MEC2 cells both expressed higher levels of NETO2 than cells transfected with the empty vector control (Figure 3.6A and 3.6B). Also, as expected, the expression level of HBEGF decreased in both HEK293 and MEC2 cells when TCL1 was overexpressed (Figure 3.6C and 3.6D). This was confirmed at the protein level by western blot, which showed the same pattern (Figure 3.7).
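
Figures 3.6 and 3.7 report triplicate experiments as mean +/- SEM, with significance marked by asterisks. The sketch below shows how those summary values and labels can be derived. The replicate values are placeholders, the function names are hypothetical, and the statistical test that produces the P value is not specified in this chapter, so the P value is simply passed in.

```python
import math
import statistics


def mean_and_sem(replicates):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    n = len(replicates)
    if n < 2:
        raise ValueError("SEM needs at least two replicates")
    return statistics.mean(replicates), statistics.stdev(replicates) / math.sqrt(n)


def significance_label(p_value):
    """Map a P value to the asterisk notation used in the figure legends."""
    if p_value < 0.005:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Placeholder fold-change values from three hypothetical repeats.
mean, sem = mean_and_sem([2.0, 4.0, 6.0])
```

The thresholds in `significance_label` are the ones stated in the legends (* P<0.05; ** P<0.01; *** P<0.005).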


Figure 3.6 TCL1 regulates the RNA expression of NETO2 and HBEGF. (A) NETO2 gene expression in HEK293 cells transfected with TCL1-expressing vector [TCL1] vs empty vector [EV]; (B) NETO2 gene expression in MEC2 cells transfected with TCL1-expressing vector [TCL1] vs empty vector [EV]; (C) HBEGF gene expression in HEK293 cells transfected with TCL1-expressing vector [TCL1] vs empty vector [EV]; (D) HBEGF gene expression in MEC2 cells transfected with TCL1-expressing vector [TCL1] vs empty vector [EV]. All the experiments were repeated three times. Data are presented as the mean +/- SEM. *P<0.05; **P<0.01; ***P<0.005.


Figure 3.7 Protein expression analysis of NETO2 and HBEGF on (A) HEK293 cells and (B) MEC2 cells transfected with human TCL1. Immunoblots (top) and densitometry (bottom) of TCL1-transfected cells. All the experiments were repeated three times. Data are presented as the mean +/- SEM. ***P<0.005.

Summary

Using both mouse and human CLL samples, we demonstrated the TCL1-associated deregulation of NETO2 and HBEGF. In parallel, transfected human cell lines were employed to confirm that TCL1 alters NETO2 and HBEGF expression. In detail, the expression levels of TCL1, NETO2 and HBEGF were measured by qRT-PCR in 31 human samples. Samples were sorted according to TCL1 expression levels, and the corresponding expression of NETO2 and HBEGF was shown (Figure 3.3). NETO2 and HBEGF demonstrated positive and negative associations, respectively, with TCL1. These validation results are in accord with our RNA-seq data. The findings suggest a possible role for NETO2 and HBEGF expression as prognostic markers indicative of TCL1-dependent CLL disease stage, and serve as a preliminary validation of the results of our RNA-seq analysis.

Taken together, these results validate our RNA-seq profiling data. In particular, the remarkable upregulation of NETO2 upon TCL1 transfection suggests that NETO2 may be a potential oncogenic target in TCL1-driven CLL initiation. For this reason, we decided to further evaluate NETO2, a promising oncogene candidate (proposed in Chapter 4).

3.3 Materials and methods

Immunoblotting. Equal amounts of protein (30 µg) were loaded onto 4-20% CRITERION TGX precast western blot gels (Biorad, Cat# 5671093) and transferred to nitrocellulose membrane (Biorad, Cat# 1620115). Anti-GAPDH (Abcam, Cat# ab181602), anti-vinculin (Santa Cruz, Cat# sc-73614), anti-TCL1 (Abcam, Cat# ab91211), anti-NETO2 (Abcam, Cat# ab171651) and anti-HBEGF (Abcam, Cat# ab92620) primary antibodies were used at 1:1,000 dilution. Protein detection was performed with HRP-conjugated anti-mouse (GE Healthcare, Cat# NA931) and anti-rabbit (GE Healthcare, Cat# NA934) secondary antibodies and the ECL Plus chemiluminescence detection kit (Thermo Scientific, Cat# 34080).

Mouse CLL samples. >8mo old TCL1 mice with fully developed CLL symptoms were sacrificed to obtain mature mouse CLL cells. WT counterparts of the same age were used as controls. qRT-PCR (Figure 3.1) and western blot (Figure 3.2B) were performed to determine the expression of the candidate genes as well as TCL1 in these >8mo old mouse samples.

Primary human samples. The study was carried out in accordance with the institutional review board protocol approved by The Ohio State University. Primary cells were obtained from peripheral blood mononuclear cells (PBMCs) of 30 CLL patients from the CLL Research Consortium upon written informed consent. Samples were selected to represent typical cytogenetic risk groups (del11q, del17p, del13q, trisomy 12, and normal karyotype, with six samples per group). Further information on patient characteristics is summarized in Table 2.2. Cord blood RNA was used to represent normal CD5+ B cells (Allcells Cat #: RNA-CB003C; lot #: CB091217A).

Cells, cell culture and cell transfection. For transfection, HEK293 (ATCC) and MEC2 cells (ATCC) were cultured in RPMI-1640 medium supplemented with 10% FBS and 1% penicillin-streptomycin (Sigma-Aldrich). The TCL1-coding region was cloned into the CD512B-1 expression vector (System Biosciences). In each well of a six-well plate, 0.5x10^6 HEK293 cells were suspended in 3 ml medium and seeded 24 hr prior to transfection. Lipofectamine 2000-mediated cell transfection was conducted following the protocol of the supplier (Life Technologies). For MEC2 cells, 3.0x10^6 cells were electro-transfected with the Amaxa Nucleofector II device using the Amaxa Cell Line Nucleofector Kit V, program five, following the protocol of the supplier (Lonza). At 48 hours post-transfection, cells were collected, and RNA and protein were extracted for subsequent experiments. TCL1 expression was validated by qRT-PCR and western blot (Figure 3.7). Candidate gene expression was determined by qRT-PCR and western blot (Figures 3.6 and 3.7).

qRT-PCR and General Data Analysis

(See chapter 2 materials and methods)


CHAPTER 4: NETO2 AND FUTURE PERSPECTIVES

4.1 An overview of NETO2

Transcriptome profiling, CLL sample analysis and in vitro studies have validated the correlation of NETO2 with TCL1, thus associating the overexpression of NETO2 with the development/progression of CLL. NETO2 was first identified as an interacting partner for neuronal glutamate receptors [68]. Glutamate is the major excitatory neurotransmitter in the mammalian central nervous system, important for behavior, learning and memory. The glutamatergic system is composed of glutamate transporters and glutamate receptors, the latter of which are responsible for signal input [69]. The NETO2 gene is located on chromosome 16q11 in human, and on chromosome 8 (8C4) in mouse. With a coding region of 1.6kb in average length, NETO2 variants in human and mouse share 50% sequence similarity. The mature form of the translated product consists of two extracellular CUB (complement C1r/C1s, Uegf, Bmp1) domains followed by an LDL (low-density lipoprotein) domain, which are connected to a transmembrane segment anchoring the protein in the cellular membrane [61] (Figure 4.1). NETO2 functions as a glutamate receptor auxiliary subunit that modulates receptor activity [61]. Alteration of glutamate receptors modulates cancer cell proliferation; it also influences the expression and function of genes involved in invasion, metastasis, tumor suppression, activation and adhesion in different cancer cell lines [70]. Thus, glutamate receptors in cancer cells may be involved in the regulation of the malignant phenotype. In particular, NETO2 has been reported to be deregulated in breast cancer [47], hemangiomas [71], renal cancer [46] and lung cancer [46, 72].

Figure 4.1 The NETO2 protein. NETO2 contains two discrete CUB (complement C1r/C1s, Uegf and Bmp1) domains, ~110 amino acid protein interaction domains crucial for development. They are followed by an extracellular juxtamembrane LDLa (low-density lipoprotein class A) domain and a transmembrane segment. (Image acquired from Copits and Swanson (2012) Nature Reviews Neuroscience 13: 675-686)

As NETO2 is highly upregulated in CLL, we speculate that NETO2, as an accessory subunit of glutamate receptors, may regulate CLL development by modulating glutamate receptors. The first step in studying NETO2 is to evaluate its oncogenic effect in mouse models. We therefore propose further in vivo studies with the TCL1 transgenic mouse model.

4.2 Proposed future research: transgenic mice

Human diseases can be reproduced by introducing a target gene into animals. Compared to in vitro testing, intact organisms provide a complete and physiologically relevant picture of a transgene’s function. A transgenic mouse overexpressing human NETO2 in B cells will allow us to assess the oncogenic effect of NETO2 in this cell type. In this section, we propose to perform in vivo validation of the oncogenic effect of NETO2 and its potential synergy with TCL1 using transgenic mouse models.

In order to achieve a complete understanding of NETO2 functionality, three approaches will be applied:

(1) NETO2-tg (human NETO2 transgenic) vs WT, to assess the oncogenic effect of NETO2.

(2) NETO2/TCL1 double tg vs NETO2-tg or TCL1-tg alone, to assess the functional correlation between TCL1 and NETO2 in CLL development.

(3) Neto2-/TCL1 (mouse Neto2 knockout (KO) in TCL1 mouse) vs TCL1-tg, to further evaluate the synergistic effect of Neto2 with TCL1.


Figure 4.2 The transgenic mouse with human NETO2 expressed in mouse B cells. (A) Vector design. To target NETO2 expression to mouse B cells, a vector similar to the one used for TCL1 mice is constructed, with the TCL1 coding region replaced by the NETO2 coding region. Restriction sites: X, XhoI; S, SalI; E, EcoRV; B, BssHII. (B) The vector is micro-injected into the pronuclei of fertilized eggs, and the eggs are implanted into the uterus of foster mother mice. Offspring with incorporated NETO2 are identified by genotyping. A homozygous transgenic strain is obtained by mating the heterozygous offspring. The additional transgenic mouse strains in this chapter are produced with a similar procedure.

4.2.1 NETO2-tg vs WT

As a primary step to validate the oncogenic function of NETO2, we will create NETO2-tg mice and compare them to WT.

The candidate oncogene, NETO2, will be inserted into a pre-designed vector (Figure 4.2A) to target its expression to mouse B cells. The vector will then be injected into the pronuclei of fertilized eggs, which will be implanted into the uterus of a foster mother. The offspring will be genotyped and may be crossbred to produce a homozygous NETO2 transgenic strain (Figure 4.2B). Our hypothesis is that NETO2-tg mice may develop human CLL-like symptoms similar to those observed in TCL1 mice, which would provide evidence of the oncogenic potential of NETO2 (Figure 4.3).
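
The breeding step can be made concrete with a simple Mendelian expectation. The sketch below enumerates offspring genotypes for a cross at a single transgene locus, assuming single-site integration and no effect of the transgene on viability (both are assumptions, not results); `cross` and the allele labels `Tg` and `+` are hypothetical names introduced for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Expected offspring genotype frequencies for one locus.

    Each parent is a 2-tuple of alleles, e.g. ("Tg", "+").
    """
    counts = Counter(
        # Sort so that ("Tg", "+") and ("+", "Tg") count as the same genotype.
        tuple(sorted(pair, reverse=True))
        for pair in product(parent_a, parent_b)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Heterozygous x heterozygous: one quarter of the offspring are expected to be homozygous.
offspring = cross(("Tg", "+"), ("Tg", "+"))
```

Under these assumptions, about one in four pups from a heterozygous intercross would be expected to carry the transgene on both alleles, which is why genotyping of the litter is needed to pick out the homozygous founders.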

Figure 4.3 NETO2-tg vs WT. The NETO2 transgenic mouse is monitored for disease development. The WT mouse is used as a standard for control purposes. If CLL symptoms arise in the NETO2 transgenic mouse, it will provide evidence that NETO2 functions as an oncogene. In contrast, if the NETO2 transgenic mouse survives without CLL symptoms, it will then show that NETO2 alone is not sufficient for leukemic transformation.


4.2.2 NETO2/TCL1 double tg vs NETO2-tg or TCL1-tg alone

The NETO2/TCL1 double transgenic mouse model, when compared to NETO2-tg and/or TCL1-tg, will be a useful tool to study the synergistic effect of NETO2 with TCL1 (Figure 4.4).

Figure 4.4 NETO2/TCL1 double tg vs NETO2-tg and/or TCL1-tg alone. NETO2/TCL1 double tg is generated by crossing NETO2-tg and TCL1-tg. The F1 generation can be obtained for analysis. The disease symptoms are subsequently compared. If CLL symptoms in the NETO2/TCL1 double tg mouse are more aggressive than the NETO2-tg and TCL1-tg, it will provide evidence that NETO2 plays a synergistic role with TCL1 in CLL development.


4.2.3 Neto2-/TCL1 vs TCL1-tg

As a further approach to evaluate the synergistic effect of Neto2 with TCL1, comparisons between Neto2-/TCL1 and TCL1-tg mice will be conducted. Since Neto2 is upregulated in TCL1 mice, it could act synergistically with TCL1. By crossing Neto2 KO mice with TCL1 mice, we should obtain a Neto2-/TCL1 strain. If this strain shows attenuated disease aggressiveness relative to TCL1 mice, it would further support a synergistic role of Neto2 with TCL1.

Figure 4.5 Neto2-/TCL1 vs TCL1-tg. Neto2 knockout mice are generated and then crossed with TCL1 mice to obtain Neto2-/TCL1. Subsequently, the disease development of Neto2-/TCL1 is evaluated and compared to TCL1 mice. If the disease progression is delayed or alleviated in Neto2-/TCL1, it will indicate Neto2 plays a synergistic role with TCL1.


CHAPTER 5: CONCLUDING REMARKS

CLL has been studied for decades. However, much remains unknown about the molecular mechanisms that regulate the pathogenesis of this disease. The work presented here combines lines of evidence to identify molecular deregulations driven by the TCL1 oncogenic factor.

In Chapter 2, we showed that a number of genes were upregulated or downregulated in B-1a cells isolated from TCL1-tg mice compared to those from WT counterparts, which was subsequently validated. We also found that the deregulated genes constitute a network containing several oncogenic signaling pathways presumably involved in CLL disease initiation. In Chapter 3, we focused on a few representative candidates and validated them by transfecting TCL1 into human cell lines, followed by qRT-PCR and western blotting. We found that Neto2 mRNA and protein levels are significantly increased in TCL1 mice, human CLL patients, and human leukemia cells transfected with TCL1 in vitro.

Although NETO2 is known to be primarily neural, it has recently been reported to be an oncogenic marker for several cancer types [46, 47, 71, 72]. Thus NETO2 could exert oncogenic functions in cancer development. In Chapter 4 we propose to conduct three comparisons using transgenic mouse models: (1) NETO2-tg vs WT to address the oncogenic effect of NETO2, (2) NETO2/TCL1 double tg vs NETO2-tg and TCL1-tg to evaluate the synergistic effect of NETO2 with TCL1, and (3) Neto2-/TCL1 vs TCL1-tg to further confirm the synergistic effect of mouse endogenous Neto2 with TCL1. Future research can be expected to delineate the exact mechanisms underlying the effect of NETO2 in the development of CLL.


REFERENCES

1. Widmaier EP, Raff H, Strang KT (2008) Chapter NO.18 Vander's Human Physiology 11th edition. McGraw-Hill Higher Education press.
2. Gutman JA, Smith KM, Pagel JM (2012) Chapter NO.3 Chronic lymphocytic leukemia. Springer press.
3. Hallek M (2015) Chronic lymphocytic leukemia: 2015 update on diagnosis, risk stratification, and treatment. Am J Hematol 90: 446-460.
4. Gribben JG, O'Brien S (2011) Update on therapy of chronic lymphocytic leukemia. Journal of Clinical Oncology 29: 544-550.
5. Rai K, Sawitsky A, Cronkite E, Chanana A, Levy R, Pasternack B (1975) Clinical staging of chronic lymphocytic leukemia. Blood 46: 219-234.
6. Binet JL, Auquier A, Dighiero G, Chastang C, Piquet H, et al. (1981) A new prognostic classification of chronic lymphocytic leukemia derived from a multivariate survival analysis. Cancer 48: 198-206.
7. Orchard JA, Ibbotson RE, Davis Z, Wiestner A, Rosenwald A, et al. (2004) ZAP-70 expression and prognosis in chronic lymphocytic leukaemia. Lancet 363: 105-111.
8. Rassenti LZ, Huynh L, Toy TL, Chen L, Keating MJ, et al. (2004) ZAP-70 compared with immunoglobulin heavy-chain gene mutation status as a predictor of disease progression in chronic lymphocytic leukemia. N Engl J Med 351: 893-901.
9. Chiorazzi N, Allen SL, Ferrarini M (2005) Clinical and laboratory parameters that define clinically relevant B-CLL subgroups. Curr Top Microbiol Immunol 294: 109-133.
10. Döhner H, Stilgenbauer S, Benner A, Leupolt E, Kröber A, et al. (2000) Genomic aberrations and survival in chronic lymphocytic leukemia. N Engl J Med 343: 1910-1916.
11. Hallek M, Fischer K, Fingerle-Rowson G, Fink AM, Busch R, et al. (2010) Addition of rituximab to fludarabine and cyclophosphamide in patients with chronic lymphocytic leukemia: a randomized, open-label, phase 3 trial. Lancet 376: 1164-1174.
12. Chiorazzi N, Rai KR, Ferrarini M (2005) Chronic Lymphocytic Leukemia. New England Journal of Medicine 352: 804-815.


13. Klein U, Lia M, Crespo M, Siegel R, Shen Q, et al. (2010) The DLEU2/miR-15a/16-1 cluster controls B cell proliferation and its deletion leads to chronic lymphocytic leukemia. Cancer Cell 17: 28-40.
14. Lia M, Carette A, Tang H, Shen Q, Mo T, et al. (2012) Functional dissection of the chromosome 13q14 tumor-suppressor locus using transgenic mouse lines. Blood 119: 2981-2990.
15. Bichi R, Shinton SA, Martin ES, Koval A, Calin GA, et al. (2002) Human chronic lymphocytic leukemia modeled in mouse by targeted TCL1 expression. PNAS 99: 6955-6960.
16. Stein JV, Lopez-Fraga M, Elustondo FA, Carvalho-Pinto CE, Rodriguez D, et al. (2002) APRIL modulates B and T cell immunity. J Clin Invest 109: 1587-1598.
17. Zapata JM, Krajewska M, Morse HC III, Choi Y, Reed JC (2004) TNF receptor-associated factor (TRAF) domain and Bcl-2 cooperate to induce small B cell lymphoma/chronic lymphocytic leukemia in transgenic mice. PNAS 101: 16600-16605.
18. Widhopf GF II, Cui B, Ghia EM, Chen L, Messer K, et al. (2014) ROR1 can interact with TCL1 and enhance leukemogenesis in Eu-TCL1 transgenic mice. PNAS 111: 793-798.
19. Santanam U, Zanesi N, Efanov A, Costinean S, Palamarchuk A, et al. (2010) Chronic lymphocytic leukemia modeled in mouse by targeted miR-29 expression. PNAS 107: 12210-12215.
20. Shukla V, Ma S, Hardy RR, Joshi SS, Lu R (2013) A role for IRF4 in the development of CLL. Blood 122: 2848-2855.
21. ter Brugge PJ, Ta VB, de Bruijn MJ, Keijzers G, Maas A, et al. (2009) A mouse model for chronic lymphocytic leukemia based on expression of the SV40 large T antigen. Blood 114: 119-127.
22. Simonetti G, Bertilaccio MTS, Ghia P, Klein U (2014) Mouse models in the study of chronic lymphocytic leukemia pathogenesis and therapy. Blood 124: 1010-1019.
23. Noguchi M, Ropars V, Roumestand C, Suizu F (2007) Proto-oncogene TCL1: more than just a coactivator for Akt. The FASEB Journal 21: 2273-2284.
24. Virgilio L, Narducci MG, Isobe M, Billips LG, Cooper MD, et al. (1994) Identification of the TCL1 gene involved in T-cell malignancies. PNAS 91: 12530-12534.
25. Laine J, Künstle G, Obata T, Sha M, Noguchi M (2000) The protooncogene TCL1 is an Akt kinase coactivator. Mol Cell 6: 395-407.
26. Pekarsky Y, Koval A, Hallas C, Bichi R, Tresini M, et al. (2000) TCL1 enhances AKT kinase activity and mediates its nuclear translocation. PNAS 97: 3028-3033.
27. Herling M, Patel KA, Weit N, Lilienthal N, Hallek M, et al. (2009) High TCL1 levels are a marker of B-cell receptor pathway responsiveness and adverse outcome in chronic lymphocytic leukemia. Blood 114: 4675-4686.
28. Yan XJ, Albesiano E, Zanesi N, Yancopoulos S, Sawyer A, et al. (2006) B cell receptors in TCL1 transgenic mice resemble those of aggressive, treatment-resistant human chronic lymphocytic leukemia. PNAS 103: 11713-11718.

29. Herling M, Patel KA, Khalili J, Schlette E, Kobayashi R, et al. (2006) TCL1 shows a regulated expression pattern in chronic lymphocytic leukemia that correlates with molecular subtypes and proliferative state. Leukemia 20: 280-285.
30. Pekarsky Y, Palamarchuk A, Maximov V, Efanov A, Nazaryan N, et al. (2008) TCL1 functions as a transcriptional regulator and is directly involved in the pathogenesis of CLL. PNAS 105: 19643-19648.
31. Palamarchuk A, Yan PS, Zanesi N, Wang L, Rodrigues B, et al. (2012) TCL1 protein functions as an inhibitor of de novo DNA methylation in B-cell chronic lymphocytic leukemia. PNAS 109: 2555-2560.
32. Pekarsky Y, Zanesi N, Croce CM (2010) Molecular basis of CLL. Semin Cancer Biol 20: 370-376.
33. Kriss CL, Pinilla-Ibarz JA, Mailloux AW, Powers JJ, Tang CH, et al. (2012) Overexpression of TCL1 activates the endoplasmic reticulum stress response: a novel mechanism of leukemic progression in mice. Blood 120: 1027-1038.
34. Gaudio E, Spizzo R, Paduano F, Luo Z, Efanov A, et al. (2012) TCL1 interacts with ATM and enhances NF-kB activation in hematologic malignancies. Blood 119: 180-187.
35. Gaudio E, Paduano F, Ngankeu A, Lovat F, Fabbri M, et al. (2013) Heat shock protein 70 regulates TCL1 expression in leukemia and lymphomas. Blood 121: 351-359.
36. Pekarsky Y, Santanam U, Cimmino A, Palamarchuk A, Efanov A, et al. (2006) TCL1 expression in chronic lymphocytic leukemia is regulated by miR-29 and miR-181. Cancer Research 66: 11590-11593.
37. Chen SS, Raval A, Johnson AJ, Hertlein E, Liu TH, et al. (2009) Epigenetic changes during disease progression in a murine model of human chronic lymphocytic leukemia. PNAS 106: 13433-13438.
38. Johnson AJ, Lucas DM, Muthusamy N, Smith LL, Edwards RB, et al. (2006) Characterization of the TCL1 transgenic mouse as a pre-clinical drug development tool for human chronic lymphocytic leukemia. Blood 109: 1334-1338.
39. Gorgun G, Ramsay AG, Holderried TA, Zahrieh D, Le Dieu R, et al. (2009) Eu-TCL1 mice represent a model for immunotherapeutic reversal of chronic lymphocytic leukemia-induced T-cell dysfunction. PNAS 106: 6250-6255.
40. Bertilaccio MT, Scielzo C, Muzio M, Caligaris-Cappio F (2010) An overview of chronic lymphocytic leukaemia biology. Best Pract Res Clin Haematol 23: 21-32.
41. Seifert M, Sellmann L, Bloehdorn J, Wein F, Stilgenbauer S, et al. (2012) Cellular origin and pathophysiology of chronic lymphocytic leukemia. J Exp Med 209: 2183-2198.
42. Hayakawa K, Formica AM, Colombo MJ, Ichikawa D, Shinton SA, et al. (2015) B cells generated by B-1 development can progress to chronic lymphocytic leukemia. Annals of the New York Academy of Sciences 2015; 1-6.
43. Nganga VK, Palmer VK, Naushad H, Kassmeier MD, Anderson DK, et al. (2013) Accelerated progression of chronic lymphocytic leukemia in Eu-TCL1 mice expressing catalytically inactive RAG1. Blood 121: 3855-3865.


44. Baumgarth N (2011) The double life of a B-1 cell: self-reactivity selects for protective effector functions. Nature Reviews Immunology 11: 34-46.
45. Simonetti G, Bertilaccio MTS, Ghia P, Klein U (2014) Mouse models in the study of chronic lymphocytic leukemia pathogenesis and therapy. Blood 124: 1010-1019.
46. Oparina N, Sadritdinova AF, Snezhkina AV, Dmitriev AA, Krasnov GS, et al. (2012) Increase in NETO2 gene expression is a potential molecular genetic marker in renal and lung cancers. Genetika 48: 599-607.
47. Horak CE, Lee JH, Elkahloun AG, Boissan M, Dumont S, et al. (2007) Nm23-H1 suppresses tumor cell motility by down-regulating the lysophosphatidic acid receptor EDG2. Cancer Research 67: 7238-7246.
48. Kudo-Saito C, Fuwa T, Murakami K, Kawakami Y (2013) Targeting FSTL1 prevents tumor bone metastasis and consequent immune dysfunction. Cancer Research 73: 6185-6193.
49. Haran M, Chebatco S, Flaishon L, Lantner F, Harpaz N, et al. (2004) Grb7 expression and cellular migration in chronic lymphocytic leukemia: a comparative study of early and advanced stage disease. Leukemia 18: 1948-1950.
50. Takahashi R, Yamagishi M, Nakano K, Yamochi T, Yamochi T, et al. (2014) Epigenetic deregulation of EVC confers robust Hedgehog signaling in adult T-cell leukemia. Cancer Science 105: 1160-1169.
51. Lin IY, Yen CH, Liao YJ, Lin SE, Ma HP, et al. (2013) Identification of FKBP11 as a biomarker for hepatocellular carcinoma. Anticancer Res 33: 2763-2769.
52. Chng WJ, Remstein ED, Fonseca R, Bergsagel PL, Vrana JA, et al. (2009) Gene expression profiling of pulmonary mucosa-associated lymphoid tissue lymphoma identifies new biologic insights with potential diagnostic and therapeutic applications. Blood 113: 635-645.
53. Kimber SJ, Sneddon SF, Bloor DJ, EI-Bareq AM, Hawkhead JA, et al. (2008) Expression of genes involved in early cell fate decisions in human embryos and their regulation by growth factors. Reproduction 135: 635-647.
54. Holmes C, Stanford WL (2007) Concise review: stem cell antigen-1: expression, function, and enigma. Stem Cells 25: 1339-1347.
55. Ito CY, Li CY, Bernstein A, Dick JE, Stanford WL (2003) Hematopoietic stem cell and progenitor defects in Sca-1/Ly-6A-null mice. Blood 101: 517-523.
56. Doherty JA, Sakoda LC, Loomis MM, Barnett MJ, Julianto L, et al. (2013) DNA repair genotype and lung cancer risk in the beta-carotene and retinol efficacy trial. Int J Mol Epidemiol Genet 4: 11-34.
57. Ratnam S, Engler P, Bozek G, Mao L, Podlutsky A, et al. (2014) Identification of Ssm1b, a novel modifier of DNA methylation, and its expression during mouse embryogenesis. Development 141: 2024-2034.
58. De Souza Santos E, De Bessa SA, Netto MM, Nagai MA (2008) Silencing of LRRC49 and THAP10 genes by bidirectional promoter hypermethylation is a frequent event in breast cancer. International Journal of Oncology 33: 25-31.
59. Bernardo M, Crowther-Swanepoel D, Broderick P, Webb E, Sellick G, et al. (2008) A genome-wide association study identifies six susceptibility loci for chronic lymphocytic leukemia. Nature Genetics 40: 1204-1210.

60. Tam OH, Aravin AA, Stein P, Girard A, Murchison EP, et al. (2008) Pseudogene-derived small interfering RNAs regulate gene expression in mouse oocytes. Nature 453: 534-538.
61. Copits BA, Swanson GT (2012) Dancing partners at the synapse: auxiliary subunits that shape kainate receptor function. Nature Reviews Neuroscience 13: 675-686.
62. Zanesi N, Balatti V, Riordan J, Burch A, Rizzotto L, et al. (2013) A sleeping beauty screen reveals NF-kB activation in CLL mouse model. Blood 121: 4355-4358.
63. V.mm10: ftp://igenome:[email protected]/Mus_musculus/UCSC/mm10/Mus_musculus_UCSC_mm10.tar.gz, last accessed on Sep 1st, 2014.
64. Trapnell C, Pachter L, Salzberg SL (2009) TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25: 1105-1111.
65. Wang L, Wang S, Li W (2012) RSeQC: quality control of RNA-seq experiments. Bioinformatics 28: 2184-2185.
66. Trapnell C, Williams BA, Pertea G, Mortazavi AM, Kwan G, et al. (2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology 28: 511-515.
67. Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP (2006) GenePattern 2.0. Nature Genetics 38: 500-501.
68. Zhang W, St-Gelais F, Grabner CP, Trinidad JC, Sumioka A, et al. (2009) A transmembrane accessory subunit that modulates kainate-type glutamate receptors. Neuron 61: 385-396.
69. Kalariti N, Pissimissis N, Koutsilieris M (2005) The glutamatergic system outside the CNS and in cancer biology. Expert Opin Investig Drugs 14: 1487-1496.
70. Luksch H, Uckermann O, Stepulak A, Hendruschk S, Marzahn J, et al. (2011) Silencing of selected glutamate receptor subunits modulates cancer growth. Anticancer Research 31: 3181-3192.
71. Calicchio ML, Collins T, Kozakewich HP (2009) Identification of signaling systems in proliferating and involuting phase infantile hemangiomas by genome-wide transcriptional profiling. Am J Pathol 174: 1638-1649.
72. Kadara H, Fujimoto J, Yoo SY, Maki Y, Gower AC, et al. (2014) Transcriptomic architecture of the adjacent airway field cancerization in non-small cell lung cancer. J Natl Cancer Inst 106: 1-9.
