Global Patterns Of Changes In The Gene Expression Associated With Genesis Of Cancer

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at George Mason University

By
Ganiraju Manyam
Master of Science, IIIT-Hyderabad, 2004
Bachelor of Engineering, Bharatiar University, 2002

Director: Dr. Ancha Baranova, Associate Professor
Department of Molecular & Microbiology

Fall Semester 2009
George Mason University
Fairfax, VA

Copyright: 2009 Ganiraju Manyam
All Rights Reserved

DEDICATION

To my parents Pattabhi Ramanna and Veera Venkata Satyavathi who introduced me to the joy of learning. To friends, family and colleagues who have contributed in work, thought, and support to this project.

ACKNOWLEDGEMENTS

I would like to thank my advisor, Dr. Ancha Baranova, whose tolerance, patience, guidance and encouragement helped me throughout the study. This dissertation would not have been possible without her never-ending support. She is very sincere and generous with her knowledge, availability, compassion, wisdom and feedback.

I would also like to thank Dr. Vikas Chandhoke for generously funding my research during my doctoral study at George Mason University. Special thanks go to Dr. Patrick Gillevet, Dr. Alessandro Giuliani and Dr. Maria Stepanova, who devoted their time to provide me with valuable contributions and guidance in formulating this project. Thanks to the faculty of the Molecular and Microbiology (MMB) department, Dr. Jim Willett and Dr. Monique Vanhoek, for contributing valuable thoughts to this dissertation by serving on my dissertation committee. I would also like to thank the present and previous doctoral program directors, Dr. Daniel Cox and Dr. Geraldine Grant, for facilitating, allowing, and encouraging me to work on this project.

My heartfelt thanks go to the former and current graduate students in the MMB department, Dr. Mohammed Jarrar, Dr. Michael Estep, Dr. Manpreet Randhawa, Dr. Aybike Birerdinc, Subashini Iyer, Beatrix Meltzer and Eshwar Iyer, who contributed their time and effort in facilitating and providing the scientific environment for this project. A very special thanks to my friends in galigang for their continual moral support, which lifted my spirits as this project progressed.

TABLE OF CONTENTS

List of Tables
List of Figures
Abstract
A Summary

1. Introduction
- Cancer and its genesis
- Genetic and epigenetic events causing cancer
- Diversity of human gene expression
- Bioinformatics & Cancer
- The analysis of microarrays and the cancer transcriptome
- Cancer – A systems biology perspective

2. Genome wide discrimination of normal and tumor samples
- Rationale
- Background
- Hypothesis
- Materials and Methods
- Results and Discussion
a) Modeling strategy
b) Assessment of the global and signature-specific gene expression distances for two-point (Normal-Tumor) datasets
c) Assessment of the global and signature-specific gene expression distances of multi-stage (three or more stage) datasets
d) Principal component analysis (PCA) of the distance spaces
e) Cancer – An attractor with intermediate regulatory framework
- Conclusion and Future Perspective

3. Abundance based transcriptome analysis as a tool for automated discovery of the tumor biomarkers
- Rationale
- Background
- Hypothesis
- Materials and methods
- Results and Discussion
a) Development of the standard vocabulary describing human tissues
b) Classification of the cDNA libraries used in the study
c) Unique and Common Unigenes
d) Estimation of the diversity of transcripts within normal and tumor tissues
e) An analysis of the unigenes for putative tumor biomarkers
f) Functional analysis of protein-coding unigenes identified as tumor biomarker candidates
- Conclusion

4. Effects of the tumor-specific telomere rearrangements on the adjacent gene expressions
- Rationale
- Background
- Hypothesis
- Materials and methods
- Results and Discussion
a) The definition of the over/underexpressed genes
b) Correlation with tumor stage
c) Variation in distribution of expression changes
d) Gene Ontology analysis
e) Defining the maximal length of the subtelomeric fragment that might be influenced by telomere rearrangements in cancer cells
f) Functionally, the behavior of the genes located in the subtelomeric regions of human chromosomes is not different from that of the genes located in other parts of the human chromosomes
- Conclusion

Appendices
A: KEGG Pathway Painter
B: Enriched pathways for cancer-specific markers
C: Enriched pathways for normal-specific (anti-cancer) markers
D: PCA results of the two-point paired datasets
E: PCA results of the two-point population datasets
F: PCA results of the multi-stage cancer and normal datasets
G: Potential Human tumor biomarkers
H: Potential Human biomarkers of Normal tissues

References

LIST OF TABLES

1: The attributes of two-point datasets describing paired normal and tumor tissue samples collected from the same individual
2: Description of the two-point datasets comprised of normal and tumor samples collected from the same tissue type across a number of subjects
3: Description of the datasets with three or more physiological groups of normal and tumor samples collected across the same subject or a number of subjects
4: Rankings of the tumor malignancy potential by relative distance to the normal sample space of two-point