Retrospective Analysis of the Two Sister Study Using Haplotype-Based Association Testing To

medRxiv preprint doi: https://doi.org/10.1101/2020.12.04.20244251; this version posted December 28, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. 1 Retrospective analysis of The Two Sister Study using haplotype-based association testing to 2 identify loci associated with early-onset breast cancer 3 James R. Gilbert, Ph.D. 1, James J. Cray, Ph.D. 2, Joseph E. Losee, M.D. 1, Gregory M. Cooper, 4 Ph.D. 1,3,4. 5 6 1. Department of Plastic Surgery, University of Pittsburgh/Children’s Hospital of Pittsburgh, 7 Pittsburgh, PA 15201. 8 2. Division of Anatomy, The Ohio State University College of Medicine, Columbus, OH 9 43210. 10 3. Department of Oral Biology, University of Pittsburgh/Children’s Hospital of Pittsburgh, 11 Pittsburgh, PA 15201. 12 4. Department of Bioengineering, University of Pittsburgh/Children’s Hospital of Pittsburgh, 13 Pittsburgh, PA 15201. 14 15 Email addresses: 16 [email protected] 17 [email protected] 18 [email protected] 19 [email protected] 20 21 Running Title: 22 23 Funding: The work described within this study was funded through the Children’s Fund of 24 Children’s Hospital of Pittsburgh of UPMC and through the Ross H. Musgrave Endowment 25 (J.E.L). 26 27 Conflicts of Interest: There are no conflicts of interest to disclose. 28 Corresponding Author: 29 Dr. Gregory M. Cooper 30 Department of Plastic Surgery 31 3533 Rangos Research Building 32 530 45th St, Pittsburgh, PA 15201 33 412/692-5384 (office) 34 [email protected] 35 36 Keywords: young-onset; early-onset; cancer; familial; breast cancer 37 NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice. medRxiv preprint doi: https://doi.org/10.1101/2020.12.04.20244251; this version posted December 28, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. 38 ABSTRACT 39 Breast cancer is a polygenic disorder and is the leading cause of cancer related mortality among 40 women. Early-onset breast cancer (EOBC) is diagnosed in women prior to 45 years-of-age and is 41 associated with worse clinical outcomes, a more aggressive disease phenotype, and poor prognosis 42 for disease-free survival. While substantial progress has been made in defining the genetics of 43 breast cancer, EOBC remains less well understood. In the current study we perform a retrospective 44 analysis of data derived from The Two Sister Study. The use of alternate strategies for handling 45 age-at-diagnosis in conjunction with haplotype-based methods yielded novel findings that help to 46 explain the heritability of EOBC. These findings are validated through comparison against 47 discordant sibs from The Two Sister Study as well as using data derived The Cancer Genome Atlas 48 (TCGA). medRxiv preprint doi: https://doi.org/10.1101/2020.12.04.20244251; this version posted December 28, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. 49 INTRODUCTION 50 Breast cancer is the most frequently diagnosed oncogenic malignancy and a leading cause 51 of cancer-related mortality among women worldwide (1, 2). Early-onset breast cancer (EOBC) 52 accounts for approximately 5-10% of all new female breast cancer cases and young age at 53 diagnosis correlates with worse clinical outcomes (3, 4). Germline variants play a prominent role 54 in the etiology of breast cancer and an estimated 10-15% of women who develop breast cancer 55 report a familial history of the disease. Germline variants in BRCA1 or BRCA2 are observed in 15- 56 20% of familial breast cancer cases (5). Direct evidence for genetic modifiers of breast and ovarian 57 cancer risk for BRCA1 and BRCA2 mutation carriers has been provided through genome-wide 58 association study (GWAS) (6). Patients affected by EOBC exhibit shared patterns of gene 59 expression that differ from their older counterparts (7). These combined observations suggest a 60 genetic component contributes to EOBC although only a fraction of the heritability of EOBC has 61 been explained. 62 Deciphering the genetic basis for phenotypic heterogeneity in complex diseases remains a 63 major challenge. Single marker association studies often lack sufficient statistical power to support 64 the discovery of rare variants or epistatic interactions within a polygenic architecture. Haplotype- 65 based analysis is thought to have greater power than single marker association tests in the study of 66 complex disease (8-10). Haplotypes, which consist of a series of sequentially ordered single 67 nucleotide variants (SNVs), are a potentially more informative format for association testing than 68 single markers and may have improved sensitivity and specificity for discovery (11, 12). 69 Haplotype-based analysis has been used to gain insight in a wide array of complex disease models 70 including mood disorders, multiple sclerosis, orofacial clefting, and cancer (13-18). Moreover, medRxiv preprint doi: https://doi.org/10.1101/2020.12.04.20244251; this version posted December 28, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. 71 haplotype-based analysis has been effectively applied to investigate age-of-onset in human disease 72 although relatively few studies have specifically addressed EOBC (19-23). 73 Several approaches have been used to investigate the genetic regulation of breast cancer 74 age-of-onset. The Two Sister Study made use of a familial case-control design with affected cases 75 diagnosed ≤ 50 years-of-age and discordant sibs of EOBC patients defining a control population. 76 Parental samples were included in The Two Sister Study to allow for the identification of 77 maternally-mediated effects and Mendelian errors in transmission (24-26). Other studies have 78 instead used categorical thresholding with diagnosis at 35, 40, 45, and 50-years-of-age to define 79 EOBC populations contrasted against either unaffected familial controls or unrelated age-matched 80 controls (3, 27-29). Age-of-onset has further been evaluated in terms of phenotypic extremes by 81 comparing individuals diagnosed at ≤ 35 years-of-age against cancer-free controls at age ≥ 60 or 82 against an age-specific cohort diagnosed with breast cancer at ≥ 65 years-of-age (30, 31). Still 83 others have investigated breast cancer in terms of age-stratified risk or using quantitative trait 84 analysis to support discovery. Internally consistent logic has justified the use of these and other 85 study designs. Yet the genetic basis for EOBC remains poorly understood and more recent studies 86 have turned towards meta-analyses aimed at achieving sufficient statistical power to identify rare 87 variants with small effect size (32-34). 88 Design considerations for the study of complex polygenic disorders have been evaluated 89 across a range of disease models. For example, Peyrot and colleagues have convincingly argued 90 against familial trio designs when investigating complex disease traits with a polygenic 91 architecture or a lifetime risk ≥ 1% (35). Reasons given included a potential for reduced statistical 92 power, ascertainment bias, and a significant underestimation of SNV heritability. Additional 93 considerations in sib pair study design include the potential for misclassification and/or medRxiv preprint doi: https://doi.org/10.1101/2020.12.04.20244251; this version posted December 28, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. 94 overmatching (36). Misclassification of discordant sibs presents a challenge primarily in cases 95 associated with pronounced variation in age-of-onset. Overmatching presents a more significant 96 challenge in complex disease models where discordant sibs are likely to share an indeterminate 97 number of disease-related alleles. As a result, allele-frequency differences between affected and 98 unaffected sibs are generally underestimated relative to randomly selected affected and unaffected 99 individuals (36). Recent investigation of polygenic risk in multiplex melanoma families indicated 100 that familial controls may carry a significantly elevated polygenic load relative to unrelated cases 101 or healthy controls and thereby introduce bias (37). Kerber and colleagues likewise argued that 102 familial studies should be approached with caution, particularly when investigating complex 103 diseases such as cancer where variable onset, incomplete genetic penetrance, gene-environment 104 interactions, and environmental phenocopies have a dramatic potential to impact disease 105 occurrence and phenotype (38). The authors further argued in favor of a case-only analysis for an 106 initial scan followed by more comprehensive analysis of regions surrounding initial hits using both 107 affected and unaffected study participants. In keeping with this reasoning, we speculated that a 108 comparison of younger and older patients diagnosed with breast cancer might provide insight into 109 the genetic architecture of breast cancer age-of-onset. 110 We performed a retrospective analysis of “The Two Sister Study: A Family-Based Study of 111 Genes and Environment in Young-Onset Breast Cancer,” hereafter referred to as “The Two Sister 112 Study.” The Two Sister Study is one of the longest standing and best characterized studies of early- 113 onset breast cancer and hence was chosen to establish proof-of-principle.

Retrospective Analysis of the Two Sister Study Using Haplotype-Based Association Testing To

Transcriptional Control of Tissue-Resident Memory T Cell Generation

Rabbit Anti-LYPD6B/CT116/FITC Conjugated Antibody-SL18576R-FITC

Organization, Evolution and Functions of the Human and Mouse Ly6/Upar Family Genes Chelsea L

Human Induced Pluripotent Stem Cell–Derived Podocytes Mature Into Vascularized Glomeruli Upon Experimental Transplantation

35Th International Society for Animal Genetics Conference 7

Genomic and Transcriptome Analysis Revealing an Oncogenic Functional Module in Meningiomas

LYPD6B (F-13): Sc-243376

Nº Ref Uniprot Proteína Péptidos Identificados Por MS/MS 1 P01024

Research2007herschkowitzetvolume Al

Nicotinic Signaling: Alpha3 Beta4 Heteromers, Alpha5 Subunits, and the Rp Ototoxin Lypd6b Vanessa Ochoa University of Vermont

A Tale of Two Tails Eric M. Jennings Janezic

Transcriptional and Spatial Resolution of Cell Types in the Mammalian Habenula