Genetic Commonalities in Gynecologic Cancers Using Publicly Available Genome-Wide Association Study Summary Results: an Exploratory Meta-Analysis
Total Page:16
File Type:pdf, Size:1020Kb
Title Page Genetic Commonalities in Gynecologic Cancers Using Publicly Available Genome-Wide Association Study Summary Results: An Exploratory Meta-Analysis by Mark Francis Vater BS Biology, Northern Kentucky University, 2014 Submitted to the Graduate Faculty of the Graduate School of Public Health in partial fulfillment of the requirements for the degree of Master of Science University of Pittsburgh 2021 Committee Membership Page UNIVERSITY OF PITTSBURGH GRADUATE SCHOOL OF PUBLIC HEALTH This thesis was presented by Mark Francis Vater It was defended on August 4, 2021 and approved by Jeanine Buchanich, Associate Professor, Department of Biostatistics Ada Youk, Associate Professor, Department of Biostatistics John Shaffer, Assistant Professor, Department of Human Genetics Thesis Advisor: Jenna Carlson, Assistant Professor, Department of Biostatistics ii Copyright © by Mark Vater 2021 iii Abstract Genetic Commonalities in Gynecologic Cancers Using Publicly Available Genome-Wide Association Study Summary Results: An Exploratory Meta-Analysis Mark Francis Vater, MS University of Pittsburgh, 2021 Abstract Public Health Significance: Gynecologic cancers are responsible for millions of deaths worldwide every year. The goal of this research is to further scientific understanding of such cancers, potentially leading to improved diagnosis and treatment. The public health significance of this project is to uncover potential genetic underpinnings of gynecologic cancers, aiding in efforts to reduce the mortality rate of gynecologic cancers, which is imperative in protecting the health of susceptible persons across the globe. Gynecologic cancers are those which arise in the female reproductive system, chiefly, those of the ovaries, cervix, vulva, endometrium, and vagina. These conditions present a serious threat to the health of susceptible persons, leading to nearly three million deaths worldwide each year. Questions remain about the genetic origins and risk factors for each of them. Specifically, there is uncertainty regarding potential overlap in genetic architecture for these conditions. This analysis sought to answer the questions: are there shared genetic variants between ovarian cancers and endometrial and cervical cancers? If so, what are the genetic implications of these overlapping variants? This was accomplished by gathering summary-level data from published genome-wide association studies (GWAS) and analyzing them using fixed effects inverse variance meta- analysis. Four datasets were included, three ovarian cancer datasets, one cervical and one endometrial cancer dataset. Three meta-analyses were performed: ovarian + endometrial cancer, iv ovarian + cervical cancer, and ovarian cancer only. Several significant variants were found in the ovarian + endometrial cancer meta-analysis, including those located on genes TERT (p=3.1e-17), ABO (p=3.4e-11), and ATAD5 (p=3.7e-12). The ovarian + cervical cancer meta-analysis was inconclusive, but the ovarian-only meta-analysis provided evidence for significant variants across datasets, including those located on genes TIPARP (p=2.3e-14) and SKAP1 (p=3.0e-10). Some variants found may yield new insight if studied in conjunction with both types of cancer in the meta-analysis, such as ABO for endometrial cancer, a known genetic factor in ovarian cancer risk. Further study is needed to determine the relevance and level of involvement for shared genetic variants in multiple types of gynecologic cancers, potentially leading to new or improved treatments. v Table of Contents Preface ............................................................................................................................................ x 1.0 Introduction ............................................................................................................................. 1 1.1 Distinct Characteristics of Gynecologic Cancer Sub-Types ....................................... 1 1.2 General Commonalities Across Gynecologic Cancer Sub-Types .............................. 2 1.2.1 Genome-Wide Association Commonalities Across Sub-Types ........................7 1.3 Research Question .......................................................................................................... 8 2.0 Methods .................................................................................................................................. 10 2.1 Datasets Included .......................................................................................................... 10 2.2 Data Processing ............................................................................................................. 16 2.3 Analysis Steps ................................................................................................................ 18 2.4 Fixed Effects Inverse Variance Meta-Analysis .......................................................... 18 2.5 Additional Steps and Considerations .......................................................................... 21 3.0 Results .................................................................................................................................... 23 3.1 Individual Study Results .............................................................................................. 23 3.2 Meta-Analysis Results .................................................................................................. 28 3.2.1 Ovarian+Endometrial Cancer Meta-Analysis .................................................28 3.2.2 Ovarian+Cervical Cancer Meta-Analysis Results ..........................................34 3.2.3 Ovarian-Only Meta-Analysis Results ...............................................................39 4.0 Discussion............................................................................................................................... 46 4.1 Limitations .................................................................................................................... 52 4.2 Conclusion ..................................................................................................................... 54 vi Appendix A Other Highly Significant Variants from O+E and O-Only Meta-Analyses .................................................................................................................................................. 56 Appendix B Custom R Code and METAL Scripts .................................................................. 60 Bibliography .............................................................................................................................. 135 vii List of Tables Table 1: Significant Top Loci Variants for O+E Meta-Analysis ............................................ 33 Table 2: All Significant Variants for O+C Meta-Analysis ...................................................... 38 Table 3: Significant Top Loci Variants for Ovarian-Only Meta-Analysis ............................ 45 viii List of Figures Figure 1: Flowchart of data from identification to assessment .............................................. 17 Figure 2: Manhattan and QQ Plots for Individual Data Sources. ......................................... 27 Figure 3: Manhattan and QQ Plots for O+E Meta-Analysis .................................................. 30 Figure 4: Locus and Forest Plots for High Significant Variants in O+E. .............................. 32 Figure 5: Manhattan and QQ Plots for O+C Meta-Analysis .................................................. 35 Figure 6: Locus and Forest Plots for High Significant Variants in O+C. ............................. 37 Figure 7: Manhattan and QQ Plot for Ovarian-Only Meta-Analysis ................................... 41 Figure 8: Locus and Forest Plots of the High Significant Variants for Ovarian-Only Meta- Analysis ............................................................................................................................ 43 Figure 9: Locus and Forest Plots for Other Highly Significant Variants in the O+E and O- Only Meta-Analyses. ....................................................................................................... 59 ix Preface The creator of this work wishes to thank the Department of Biostatistics and the Department of Human Genetics at the University of Pittsburgh Graduate School of Public Health, as well as family and friends that supported him throughout this journey. x 1.0 Introduction Gynecologic cancers are a class of cancers that affect people susceptible to cancers of the ovaries, cervix, endometrium, vulva, and vagina (i.e., those with female reproductive organs), and result in approximately 2.9 million cancer deaths every year globally (Sankaranarayanan and Ferlay, 2006). Specific to the United States, the incidence of gynecologic cancers is approximately 40 diagnoses and 12.5 deaths per 100,000 susceptible people per year (Phelan et al., 2017). As such, these cancers pose a serious threat to the health of susceptible people. Each of these cancers has been well researched singularly. However, the similarities and differences of the underlying sub-types of gynecologic cancers are not fully understood. That is, there may be commonalities in the mechanisms of gynecologic cancers in terms of tumorigenesis, genomic makeup, etc., that are currently not characterized in the body of literature. Investigating such commonalities may lead to improved or new methods of treatment or prevention for gynecologic cancers. 1.1 Distinct Characteristics of Gynecologic Cancer