Bioinformatics Strategies for Identification of Cancer Biomarkers and Targets in Pathogens Associated with Cancer
FEDERAL UNIVERSITY OF MINAS GERAIS
INSTITUTE OF BIOLOGICAL SCIENCES
DEPARTMENT OF GENERAL BIOLOGY
GRADUATE PROGRAM OF BIOINFORMATICS

BIOINFORMATICS STRATEGIES FOR IDENTIFICATION OF CANCER BIOMARKERS AND TARGETS IN PATHOGENS ASSOCIATED WITH CANCER

PhD STUDENT: Debmalya Barh
SUPERVISOR: Prof. Dr. Vasco Ariston de Carvalho Azevedo

BELO HORIZONTE - MG
February 2017

043 Barh, Debmalya.
Bioinformatics strategies for identification of cancer biomarkers and targets in pathogens associated with cancer [manuscript] / Debmalya Barh. 2017.
237 leaves : ill. ; 29.5 cm.
Supervisor: Vasco Ariston de Carvalho Azevedo.
Thesis (doctorate), Federal University of Minas Gerais, Institute of Biological Sciences.
1. Bioinformatics - Theses. 2. Biological markers. 3. Communicable diseases - Theses. 4. Transcriptomics - Theses. 5. MicroRNAs - Theses. I. Azevedo, Vasco Ariston de Carvalho. II. Universidade Federal de Minas Gerais. Instituto de Ciências Biológicas. III. Title.
CDU: 573:004

TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGEMENTS
LIST OF ABBREVIATIONS
ABSTRACT

I. PRESENTATION
    I.1 Collaborators
    I.2 Thesis Delineation

II. OBJECTIVES
    II.1 Main Objectives
    II.2 Specific Objectives
    II.3 Future Objectives

III. INTRODUCTION
    III.1 Preface
    III.2 Article I: Literature Review
        Molecular Biomarkers: Overview, Technologies, and Strategies
    III.3 Article II: Literature Review
        Next-generation Molecular Markers: Challenges, Applications, and Future Perspectives
    III.4 Article III: Literature Review
        In Silico Subtractive Genomics for Target Identification in Human Bacterial Pathogens

IV. CHAPTERS/ RESEARCH ARTICLES
    IV.1 Chapter I: Research Article
        A novel in silico reverse-transcriptomics-based identification and blood-based validation of a panel of sub-type specific biomarkers in lung cancer
        IV.1.1 Conclusions from this research
        IV.1.2 Media highlights of this research outcomes
    IV.2 Chapter II: Research Article
        miRegulome: a knowledge-base of miRNA regulomics and analysis
        IV.2.1 Conclusions from this research
        IV.2.2 Media highlights of this research outcomes
    IV.3 Chapter III: Research Article
        miRsig: a consensus-based network inference methodology to identify pan-cancer miRNA-miRNA interaction signatures
        IV.3.1 Conclusions from this research
    IV.4 Chapter IV: Research Article
        Globally conserved inter-species bacterial PPIs based conserved host-pathogen interactome derived novel target in C. pseudotuberculosis, C. diphtheriae, M. tuberculosis, C. ulcerans, Y. pestis, and E. coli targeted by Piper betel compounds
        IV.4.1 Conclusions from this research
        IV.4.2 Media highlights of this research outcomes
    IV.5 Chapter V: Research Article
        Exoproteome and Secretome Derived Broad Spectrum Novel Drug and Vaccine Candidates in Vibrio cholerae Targeted by Piper betel Derived Compounds
        IV.5.1 Conclusions from this research
        IV.5.2 Media highlights of this research outcomes

V. GENERAL CONCLUSIONS

VI. FUTURE PERSPECTIVES

VII. BIBLIOGRAPHY

VIII. APPENDIX
    A. Published Research & Review Articles
        A.a: Journal articles with Prof. Vasco Azevedo
        A.b: Journal articles with other co-authors
    B. Published Books
        B.a: Book with Prof. Vasco Azevedo
        B.b: Books with others
    C. Book Chapters
        C.a: Book chapters with Prof. Vasco Azevedo
        C.b: Book chapters with other co-authors
    D. Presentations in Brazil

DEDICATION

I dedicate this work to my parents, the late Purnendu Bhusan Barh and Ms. Mamata Barh, for being the inspiration of my life.

ACKNOWLEDGEMENTS

First and foremost, I would like to thank my supervisor, Prof. Dr. Vasco Ariston de Carvalho Azevedo. I am grateful for his humble friendship, unconditional cooperation, and committed guidance, and for the vast wisdom with which he granted me a high degree of freedom to conduct and complete the entire work successfully.

I am indebted to my colleagues and collaborators Dr. Mukesh Verma (Epidemiology and Genomics Research Program, Division of Cancer Control and Population Sciences, NCI/ NIH, USA), Prof. Triantafillos Liloglou (Department of Molecular and Clinical Cancer Medicine, University of Liverpool, UK), Dr.
Cedric Viero (Institute of Molecular and Experimental Medicine, Wales Heart Research Institute, School of Medicine, Cardiff University, UK), Dr. Nidia Leon-Sicairos and Adrian Canizalez-Roman (Unidad de investigacion, Facultad de Medicina, Universidad Autonoma de Sinaloa. Cedros y Sauces, Fraccionamiento Fresnos, Mexico), Prof. Ranjith Kumavath (Department of Genomic Science, Central University of Kerala, India), Prof. Preetam Ghosh (Department of Computer Science and Center for the Study of Biological Complexity, Virginia Commonwealth University, Virginia, USA), Prof. Artur Silva (Instituto de Ciências Biológicas, Universidade Federal do Pará, Brazil), Dr. Michel Herranz (Molecular Oncology and Imaging Program, Complejo Hospitalario Universitario, Santiago de Compostela, A Coruña, Spain), Dr. Elena Padin-Iruegas (Medical Oncology Department, Complejo Hospitalario Universitario, Santiago de Compostela, A Coruña, Spain), and Dr. Alvaro Ruibal (Nuclear Medicine Service, Complejo Hospitalario Universitario. Fundación Tejerina. Santiago de Compostela, A Coruña, Spain) for their support in experiments and analysis.

I am thankful to Prof. Anil Kumar (School of Biotechnology, Devi Ahilya University, Indore, India), Prof. Amarendra Narayan Misra (Center for Life Sciences, School of Natural Sciences, Central University of Jharkhand, Ranchi, India), Prof. Kenneth Blum (University of Florida, College of Medicine, Gainesville, Florida, USA), Prof. Jan Baumbach (Computational Biology Group, Department of Mathematics and Computer Science, University of Southern Denmark, Denmark), Prof. Eric R. Braverman (Weill-Cornell College of Medicine, Cornell University, New York, USA), and Prof. John K Field (University of Liverpool, Department of Molecular and Clinical Cancer Medicine, Liverpool, UK) for their assistance, suggestions, and guidance during the research.

I owe many thanks to my students and friends Sandeep Tiwari, Neha Jain, Amjad Ali, Marriam Bakhtiar, Anderson Rodrigues Santos, Bhanu Kamapantula, Joseph Nalluri, Syed Shah Hassan, Krishnakant Gupta, Siomar de Castro Soares, Sintia Silva de Almeida, Syed Babar Jamal, Rommel Ramos, Edson Luiz Folador, and Diego Mariano for their help during this research.

My special thanks go to my wife Priyanka, my daughter ShauryaShree, and other family members for all their support during this research and my difficult times.

My deepest gratitude goes to my loving parents for always being beside me and for their love, support, inspiration, and blessings.

LIST OF ABBREVIATIONS

BMI: Body Mass Index
BPs: Biological Processes
CAI: Codon Adaptation Index
CC: Cellular Component
CFUs: Colony-Forming Units
CLA: Caseous Lymphadenitis
CLR: Context Likelihood of Relatedness
CMNR: Corynebacterium, Mycobacterium, Nocardia, and Rhodococcus
CNPq: Conselho Nacional de Desenvolvimento Científico e Tecnológico
COG: Clusters of Orthologous Groups
Cp: Corynebacterium pseudotuberculosis
CTD: Comparative Toxicogenomics Database
DAVID: Database for Annotation, Visualization, and Integrated Discovery
DC: Distance Correlation
DEG: Database of Essential Genes
GENIE3: Gene Network Inference with Ensemble of Trees
GO: Gene Ontology
GSEA: Gene Set Enrichment Analysis
HP-PPIs: Host-Pathogen Protein-Protein Interactions
IEDB: Immune Epitope Database
KEGG: Kyoto Encyclopedia of Genes and Genomes
MF: Molecular Function
miRNA/miR: MicroRNA
miRsig: miRNA Signature
mRNAs: Messenger RNAs
MRNETB: Maximum Relevance Minimum Redundancy Backward
MVD: Molegro Virtual Docker
NCBI: National Center for Biotechnology Information
NSCLC: Non-Small Cell Lung Cancer
PAIDB: Pathogenicity Island Database
PATRIC: PathoSystems Resource Integration Center
PDB: Protein Data Bank
PPIs: Protein-Protein Interactions
PSAT: Prokaryotic Sequence Homology Analysis Tool
QC: Quality Control
qPCR: Quantitative Real-Time Polymerase Chain Reaction
ROC: Receiver Operating Characteristic curve
RV: Reverse Vaccinology
SAVS: Structure Analysis and Verification Server
SCLC: Small Cell Lung Cancer
STRING: Search Tool for the Retrieval of Interacting Genes
TFs: Transcription Factors
TWAS: The Academy of Sciences for the Developing World, Trieste, Italy
UFMG: Universidade