Divergent Microbial Profiles in Tumor and Adjacent Normal Tissue Across Cancer Types
Total Page:16
File Type:pdf, Size:1020Kb
DIVERGENT MICROBIAL PROFILES IN TUMOR AND ADJACENT NORMAL TISSUE ACROSS CANCER TYPES A DISSERTATION SUBMITTED TO THE GRADUATE DIVISION OF THE UNIVERSITY OF HAWAIʻI AT MĀNOA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN BIOMEDICAL SCIENCES (CLINICAL RESEARCH) BY REBECCA M RODRIGUEZ Dissertation Committee: Brenda Y Hernandez (Chairperson) Youping Deng (Co-Chairperson) Vedbar S Khadka Amy Brown Scott Kuwada Loic LeMarchand Lynne Wilkens, Outside Member Purposely Left Blank i DEDICATION To the patients who volunteered sample tissues from which the data was derived; To all my family and friends who have lost their battle, and those who continue to battle with all forms of cancer ii ACKNOWLEDGEMENTS Many thanks to Dr. Brenda Y. Hernandez, Dr. Youping Deng and Dr. Vedbar S. Khadka. Without their mentorship, restless support and encouragement this work would not have been possible. My sincere gratitude to Dr. Amy Brown for her continuous advice and editorial contributions; to Dr. Lynne Wilkens, Dr. Scott Kuwada, Dr. Loïc LeMarchand and Dr. John Chen for their guidance, flexibility and input to this work. Many thanks to the Cancer Center Pathology Shared Resources for their work with the population blocks in the validation phase of this project. My deepest appreciation to Dr. Mark Menor from the JABSOM Bioinformatics Core, as his dedicated support ensured the completion of this project. My heartfelt thanks to all those who encouraged and supported me throughout this journey. I am especially grateful to my dear husband, Anibal and my children, Anibal and Axel, who sacrificed so much so that I could achieve this dream. The results published here are in part based upon data generated by the Cancer Genome Atlas managed by the National Cancer Institute (NCI) and National Human Genome Research Institute (NHGRI). Information about TCGA can be found at http://cancergenome.nih.gov. All human data were handled in accordance with TCGA Data Use Certification Agreement and Data Access Request (DAR) 57292; Project-14778 (Deng, PI), Request ID 57292-2 and 57292-4 (renewal 10/10/2018) for level-1, controlled- data access phs000178 versions v9.p8 and v10.p8. This research was funded by the National Institute of Minority Health and Health Disparities, and the Health And Wellness Achieved by Impacting Inequalities (OlaHAWAII) pilot grant number 2U54MD007601-32 (Khadka, PI). The Bioinformatics Core is supported by National Institute of Health (NIH) Grant numbers 5P30GM114737, P20GM103466 and U54MD007584. R. M. Rodriguez was supported in part by the Population Sciences in the Pacific Epidemiology Program at the University of Hawaii Cancer Center. The Hawaii Tumor Registry is part of the National Cancer Institute’s (NCI) Surveillance, Epidemiology, and End Results (SEER) Program and operate affiliated Residual Tissue Repository (RTR). Tissue blocks from the Hawaii RTR were used for PCR validation. Exempt IRB approval was obtained prior to any study procedures relating to use of tissue blocks (protocol number 2018-00178). iii STRUCTURE OF THE DISSERTATION This dissertation is structured using UHM manuscript option. The research proposal is introduced first in Chapter I, followed by review of the literature in Chapter II. The research proposal was successfully submitted to OlaHAWAII Team Pilot Grant for funding. A portion of the literature review was submitted for publication as a review article to a peer review journal. A series of manuscripts on the microbial associations with cancer pathogenesis are presented in Chapter III. Subsequently, a summary of the results, overall conclusion and future research are discussed in Chapters IV. Work completed during the life of the project and supplemental tables or figures are included as appendices in Chapter V. iv PROJECT SUMMARY Background: There is growing evidence that microbial variation can influence cancer development, progression, response to therapy, and outcomes. We wanted to examine the microbial composition of paired tumor and adjacent normal tissue across various cancer types in order to provide an improved understanding of microbial diversity and abundance patterns of the tumor microenvironment and their influence on clinical presentation and survival. Methods: Using raw whole exome sequencing data from 22 cancer types from the The Cancer Genome Atlas (TCGA) network, we examined differential relative abundance and diversity data in primary tumor and adjacent solid tissue normal (adjacent normal), nine of which are presented in this work. Data were processed through a bioinformatics pipeline designed to extract microbial profiles from human sequencing data based on PathoScope 2.0. Differential abundance and diversity metrics were calculated using R-package tools to compare primary tumor and adjacent normal within cancer types and across cancer cohorts correlating to clinical features including histologic and pathologic features and survival data. Analyses were controlled for demographic, exposures and batch effects. Findings were then validated by qPCR for selected cancer types with tissue from the Hawaii Tumor Registry RTR. Results: As part of a pilot project we have created microbial composition and diversity profiles for a subset of solid tumors within TCGA cancers building a platform for ongoing and future studies. We screened over 10,000 files encompassing a total of 1,838 paired cases from which 813 are discussed here. From those, 767 tumors and 753 adjacent normal samples had positive microbial (viral and bacterial) sequence reads detection. Microbial composition and diversity (richness, within sample alpha diversity, evenness and beta diversity) varied across cohorts with similar patterns at the phylum level in compositional structures. Bacterial shifts were evident in tumor compared to adjacent normal. Proteobacteria phyla was observed to be increased in tumors of all cohorts except for STAD, where Proteobacteria species were reduced and Firmicutes levels increased. Differences between patient samples were evident at the higher taxonomic levels. Differential abundance analyses revealed significant differences in stomach adenocarcinoma (STAD) and colon adenocarcinoma (COAD). Compared to adjacent normal, tumor samples were found to have lower number of species present overall and lower diversity indices. We found significant association between microbial relative abundance and diversity to clinicopathological presentation and survival dependent on race in some cancers, particularly those of infectious origin like STAD and LIHC. Conclusion. This project demonstrates the feasibility of the utilization of exome sequencing data to derive complex microbial data with easy to interpret results. This project facilitates the understanding of the role of bacteria play in cancer pathogenesis across different race groups as demonstrated in LIHC and STAD cancer cohorts. In these cancers, relative abundance was associated with tumor stage and overall survival days and within sample diversity was associated with race in fully adjusted models. v TABLE OF CONTENTS STRUCTURE OF THE DISSERTATION ....................................................................................... iv PROJECT SUMMARY ....................................................................................................................v LIST OF FIGURES ......................................................................................................................... xi TABLE OF ACRONYMS ............................................................................................................... xii CHAPTER I. RESEARCH PROPOSAL......................................................................................... 1 1.1 Objectives ............................................................................................................................ 1 1.2 Specific Aims ....................................................................................................................... 1 1.2.1 Primary Aims .................................................................................................................. 1 1.2.2 Assumptions ................................................................................................................... 1 1.2.3 Hypotheses ..................................................................................................................... 1 1.2.4 Secondary aims .............................................................................................................. 2 1.3 Significance ......................................................................................................................... 2 1.4 Experimental Approach ...................................................................................................... 2 1.5 Research Design ................................................................................................................. 3 1.5.1 Study population ............................................................................................................. 3 1.5.2 Methods and Planned Statistical Analyses .................................................................... 4 1.5.3 Strengths of the research proposal ................................................................................ 7 1.5.4 Limitations and Alternative Strategies ........................................................................... 7 1.5.5 Changes to planned analyses .......................................................................................