Leeds Thesis Template

- i - Burkitt lymphoma classification and MYC-associated non- Burkitt lymphoma investigation based on gene expression Chulin Sha Submitted in accordance with the requirements for the degree of Doctor of Philosophy The University of Leeds Faculty of Biology School of Molecular and Cellular Biology April 2015 - ii - Intellectual Property and Publication Statements 1st Authored, used in this thesis, Transferring genomics to the clinic: distinguishing Burkitt and diffuse large B cell lymphomas. Accepted for publication in Genome Medicine. The candidate confirms that the work submitted is her own and that appropriate credit has been given where reference has been made to the work of others. This copy has been supplied on the understanding that it is copyright material and that no quotation from the thesis may be published without proper acknowledgement. The right of Chulin Sha to be identified as Author of this work has been asserted by her in accordance with the Copyright, Designs and Patents Act 1988. © 2015 The University of Leeds and Chulin Sha - iii - Acknowledgements I would like to give my sincerest gratitude to my supervisor, David Westhead, for introducing me to Bioinformatics and Lymphoma research area and offering me guidance and care whenever I get confused or frustrated, particularly for his patience and encouragement when the project is going slow and for always considering my benefit where I am short of experience. Being honest, I feel so lucky to pursue my study here that I couldn‟t possibly have a better supervisor. I also want to thank to my collaborators from Leeds St. James Hospital Sharon, Andrew and Reuben for providing the data, advice and help on how to be involved in a project and cooperate with others. My dear lab mates, past and present, thanks for their companionship and providing a fantastic research atmosphere. Especially a big thank to Matt and Vijay, who help me along my whole Phd study, offering valuable ideas, encouragement and friendship. I hope I can grow to a good bioinformatician like them in the near future. Thank to my close friends for sharing my feelings so that I don‟t feel so lonely and isolated. Thank to my parents and my boyfriend for their unconditional support and endless love, which make me believe I can accomplish my objectives. At last, thank to China Scholarship Council and University of Leeds for the financial support, and to anyone who has helped me in any way during my Phd study period. - iv - Abstract Burkitt lymphoma and diffuse large B-cell lymphoma are two closely related types of lymphoma that are managed differently in clinical practice and the accurate diagnosis is a key point in treatment decisions. However based on current criteria combined with morphological, immunophenotypic and genetic characteristics, a significant number of cases exhibit overlapping features where diagnosis and treatment decisions are difficult to make. Especially, the prognosis have been reported significantly unfavourable in a subset of cases that are initially diagnosed as diffuse large B-cell lymphoma but bear MYC gene translocation, which is a defining feature of Burkitt lymphoma however can also be found in other lymphomas. Despite the adverse effect of MYC in aggressive lymphomas other than Burkitt lymphoma, the underlying mechanism and effective treatment is still unclear. Recent technological advances have made it possible to simultaneously investigate an enormous number of bio-molecules, and the scientific fields associated with measuring molecular data in such a high-throughput way are usually called “omics”. For example, genomics assesses thousands of DNA sequences and transcriptomics assays large numbers of transcripts in a single experiment. These techniques together with the rapidly emerging analytical methods in bioinformatics have introduced cancer research into a new era. The growing amount of omics data have significantly influenced the understanding of lymphomas and hold great promise in classifying subtypes, predicting treatment responses that will eventually lead to personalized therapy. Here in this study, we investigate the discrimination of Burkitt lymphoma and diffuse large B-cell lymphoma based on DNA microarray gene expression data, which has contributed most in molecular classification of lymphoma subtypes in the last decade. On the basis of two previous research level gene expression profiling classifiers, we developed a robust classifier that works effectively on different platforms and formalin fixed paraffin-embedded samples commonly used in routine clinic. The validation of the classifier on - v - the samples from clinical patients achieves a high agreement with diagnosis made in a central haematopathology laboratory, and leads to a potential outcome indication in the patients presenting intermediate features. In addition, we explore the role of MYC in the above lymphomas. Our investigation emphasizes the inferior impact of high level MYC mRNA expression on patients‟ outcome, and the functional analysis of MYC high expression associated genes show significantly enriched molecular mechanisms of proliferation and metabolic process. Moreover, the gene PRMT5 is found to be highly correlated with MYC expression which opens a possible therapeutic target for the treatment. - vi - Table of Contents Intellectual Property and Publication Statements .................................... ii Acknowledgements .................................................................................... iii Abstract ....................................................................................................... iv List of Abbreviations ................................................................................... x List of Tables ............................................................................................ xiii List of Figures ........................................................................................... xv Chapter 1 Introduction ................................................................................ 1 1.1 Burkitt lymphoma and diffuse large B-cell lymphoma .................... 2 1.1.1. Burkitt lymphoma .......................................................... 2 1.1.2. Diffuse large B-cell lymphoma....................................... 4 1.1.3. Cases with features intermediate between BL and DLBCL .................................................................................. 5 1.2 MYC in BCLU ................................................................................ 7 1.2.1. MYC translocation ......................................................... 8 1.2.2. MYC prognostic implication ........................................... 9 1.2.3. MYC potential mechanism in B-cell lymphoma ........... 10 1.3 High-throughput data in cancer research .................................... 13 1.3.1. New methods and techniques ..................................... 14 1.3.2. Omics-based tests translation from research to clinic 16 1.4 Microarray gene expression profiling .......................................... 17 1.4.1. DNA microarray technology ........................................ 18 1.4.2. General analysis of DNA microarray data ................... 20 1.4.3. Reality in DNA microarray GEP analysis .................... 22 1.5 Research overview ...................................................................... 23 1.5.1. Study design ............................................................... 23 1.5.2. Developing environment and tools .............................. 26 1.5.3. Data collection and collaboration ................................ 27 Chapter 2 Methods .................................................................................... 29 2.1 Low level analysis ....................................................................... 29 2.1.1. Quality check............................................................... 29 2.1.2. Preprocessing ............................................................. 30 2.1.3. Cross-platform normalization ...................................... 31 - vii - 2.2 Feature selection ......................................................................... 32 2.2.1. Introduction ................................................................. 33 2.2.2. SAM ............................................................................ 34 2.2.3. Smyth moderated t-statistic ......................................... 35 2.3 Classification methods ................................................................ 36 2.3.1. Introduction ................................................................. 37 2.3.2. Support vector machines ............................................ 38 2.3.3. Evaluation of classifier ................................................ 40 2.4 Survival analysis ......................................................................... 41 2.4.1 Introduction ......................................................................... 42 2.4.2 Kaplan-Meier survival estimate ........................................... 43 2.4.3 The cox proportional hazard model ..................................... 44 2.5 Mechanism analysis .................................................................... 45 2.5.1. Gene set enrichment analysis tool .............................. 45 2.5.2. DAVID functional annotation tool ................................ 47 Chapter 3 Development of a Burkitt lymphoma classifier .................... 49 3.1 Datasets summary .....................................................................

Leeds Thesis Template

A Computational Approach for Defining a Signature of Β-Cell Golgi Stress in Diabetes Mellitus

UNIVERSITY of CALIFORNIA, IRVINE Combinatorial Regulation By

GTF3A (NM 002097) Human Tagged ORF Clone Product Data

Integrative Bulk and Single-Cell Profiling of Premanufacture T-Cell Populations Reveals Factors Mediating Long-Term Persistence of CAR T-Cell Therapy

Download Special Issue

Pseudotime Ordering of Single Human Β-Cells Reveals States Of

1 Santa Cat.No Cloud-Clone Cat.No. Cloud-Clone Product Name

(12) Patent Application Publication (10) Pub. No.: US 2009/0269772 A1 Califano Et Al

Table S1. 103 Ferroptosis-Related Genes Retrieved from the Genecards

Evolving Neoantigen Profiles in Colorectal Cancers with DNA Repair

MOL #82305 TITLE PAGE Title: Induced CYP3A4 Expression In

A Novel Role for the Chromatin Remodeling Atpase Brg1 During Zygotic Genome Activation