Network Modeling Links Breast Cancer Susceptibility and Centrosome Dysfunction
Total Page:16
File Type:pdf, Size:1020Kb
ARTICLES Network modeling links breast cancer susceptibility and centrosome dysfunction Miguel Angel Pujana1,2,16,17, Jing-Dong J Han1,2,16,17, Lea M Starita3,16,17, Kristen N Stevens4,17, Muneesh Tewari1,2,16, Jin Sook Ahn1,2, Gad Rennert5,Vı´ctor Moreno6,7, Tomas Kirchhoff 8, Bert Gold9, Volker Assmann10, Wael M ElShamy2, Jean-Franc¸ois Rual1,2, Douglas Levine8, Laura S Rozek6, Rebecca S Gelman11, Kristin C Gunsalus12, Roger A Greenberg2, Bijan Sobhian2, Nicolas Bertin1,2, Kavitha Venkatesan1,2, Nono Ayivi-Guedehoussou1,2,16, Xavier Sole´7, Pilar Herna´ndez13, Conxi La´zaro13, Katherine L Nathanson14, Barbara L Weber14, Michael E Cusick1,2, David E Hill1,2, Kenneth Offit8, David M Livingston2, Stephen B Gruber4,6,15, Jeffrey D Parvin3,16 & Marc Vidal1,2 http://www.nature.com/naturegenetics Many cancer-associated genes remain to be identified to clarify the underlying molecular mechanisms of cancer susceptibility and progression. Better understanding is also required of how mutations in cancer genes affect their products in the context of complex cellular networks. Here we have used a network modeling strategy to identify genes potentially associated with higher risk of breast cancer. Starting with four known genes encoding tumor suppressors of breast cancer, we combined gene expression profiling with functional genomic and proteomic (or ‘omic’) data from various species to generate a network containing 118 genes linked by 866 potential functional associations. This network shows higher connectivity than expected by chance, suggesting that its components function in biologically related pathways. One of the components of the network is HMMR, encoding a centrosome subunit, for which we demonstrate previously unknown functional associations with the breast cancer– associated gene BRCA1. Two case-control studies of incident breast cancer indicate that the HMMR locus is associated with Nature Publishing Group Group Nature Publishing higher risk of breast cancer in humans. Our network modeling strategy should be useful for the discovery of additional cancer- 7 associated genes. 200 © Combinations of mutated and/or aberrantly expressed tumor sup- most cancer genes remain to be identified1. In addition, it is becoming pressor genes and oncogenes, or ‘cancer genes’, are thought to be increasingly clear that most genes and their products interact in responsible for most steps of cancer progression. Although funda- complex cellular networks, the properties of which might be altered mental principles have emerged from the study of known cancer genes in cancer cells as compared with their unaffected counterparts2. and their products, many questions remain unanswered. Notably, Achieving a deeper understanding of cancer molecular mechanisms 1Center for Cancer Systems Biology (CCSB) and 2Department of Cancer Biology, Dana-Farber Cancer Institute and Department of Genetics, Harvard Medical School, 44 Binney St., Boston, Massachusetts 02115, USA. 3Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, 77 Louis Pasteur Ave., Boston, Massachusetts 02115, USA. 4Department of Epidemiology, University of Michigan, 109 Zina Pitcher Pl., Ann Arbor, Michigan 48109, USA. 5CHS National Cancer Control Center, Department of Community Medicine and Epidemiology, Carmel Medical Center and Bruce Rappaport Faculty of Medicine, Technion, Haifa 34362, Israel. 6Department of Internal Medicine, University of Michigan, 109 Zina Pitcher Pl., Ann Arbor, Michigan 48109, USA. 7Department of Epidemiology and Cancer Registry, and Translational Research Laboratory, Catalan Institute of Oncology, IDIBELL, Gran Vı´a km 2.7, L’Hospitalet, Barcelona 08907, Spain. 8Clinical Genetics Service, Department of Medicine, Memorial Sloan-Kettering Cancer Center, 1275 York Ave., New York, New York 10021, USA. 9National Cancer Institute, Human Genetics Section, Laboratory of Genomic Diversity, Frederick, Maryland 21702, USA. 10Center for Experimental Medicine, Institute of Tumor Biology, University Hospital Hamburg–Eppendorf, Martinistrasse 52, Hamburg 20246, Germany. 11Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Department of Biostatistics, Harvard School of Public Health, 44 Binney St., Boston, Massachusetts 02115, USA. 12Center for Comparative Functional Genomics, Department of Biology, New York University, 100 Washington Square East, New York, New York 10003, USA. 13Translational Research Laboratory, Catalan Institute of Oncology, IDIBELL, Gran Vı´a km 2.7, L’Hospitalet, Barcelona 08907, Spain. 14Abramson Family Cancer Research Institute, University of Pennsylvania School of Medicine, 421 Curie Blvd., Philadelphia, Pennsylvania 19104, USA. 15Department of Human Genetics, University of Michigan, 109 Zina Pitcher Pl., Ann Arbor, Michigan 48109, USA. 16Present addresses: Bioinformatics and Biostatistics Unit, Translational Research Laboratory, Catalan Institute of Oncology, IDIBELL, Gran Vı´a km 2.7, L’Hospitalet, Barcelona 08907, Spain (M.A.P.); Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Datun Rd., Beijing 100101, China (J.-D.J.H.); Department of Genome Sciences, University of Washington, 1705 NE Pacific St., Seattle, Washington 98195, USA (L.M.S.); Human Biology Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave. North, Seattle, Washington 98109, USA (M.T.); Harvard School of Public Health, Boston, Massachusetts 02115, USA (N.A.-G.); Department of Biomedical Informatics, Ohio State University Medical Center, 460 West 12th Ave., Columbus, Ohio 43210, USA (J.D.P.). 17These authors contributed equally to this work. Correspondence should be addressed to M.V. ([email protected]), J.D.P. ([email protected]) or S.B.G. ([email protected]). Received 31 March; accepted 2 August; published online 7 October 2007; doi:10.1038/ng.2007.2 1338 VOLUME 39 [ NUMBER 11 [ NOVEMBER 2007 NATURE GENETICS ARTICLES may therefore require global strategies aimed at modeling the integrates coexpression profiles in human tissues and then integrates functional interrelationships between genes and/or proteins (genes/ functional associations derived from various functional genomic and proteins) as complex interdependent networks3. proteomic, or ‘omic’, data sets obtained in both humans and model Here we propose a complementary approach to systematic organisms. This integrated network modeling strategy provides a resequencing efforts4–6 for identifying cancer genes and/or proteins. ranking system to classify potential network components from low This approach is based on global network modeling of functional to high likelihood; the components are then functionally and geneti- associations between potential cancer genes and their products. cally tested. RESULTS Coexpression profiling A network modeling strategy To initiate our search for genes that are likely to be functionally Macromolecular networks can be modeled on the basis of both global associated with the reference genes/proteins, we used a data set correlations observed among transcriptional profiling compendia, containing transcript abundance measurements for 9,214 human protein-protein interaction or ‘interactome’ networks, and genome- genes in 101 samples originating from 43 healthy human tissues or wide phenotypic profiling data sets7, and comparisons of ‘interolog’ organs and three cell lines14. We determined Pearson correlation data sets from different organisms8. We combined these two strategies coefficient (PCC) values between each of the four reference genes to generate models of macromolecular networks that are possibly and all of the genes tested on the array. To determine the likelihood of perturbed in human cancer, starting from four known breast cancer– predicting functional associations by this coexpression approach, we associated genes and their products—BRCA1 and BRCA2 (both curated published data from the scientific literature (until 1 October identified by high-penetrance mutations)9–11 and ATM and CHEK2 2004) on protein interactions involving any of the four reference (both identified by low-penetrance mutations)12,13—referred to here- human proteins. The resulting ‘literature interaction’ (LIT-Int) net- after as the ‘reference genes/proteins’ (Fig. 1). The strategy first work contains 103 proteins and 129 functional associations (Fig. 2a and Supplementary Table 1 online). We determined that a PCC value of 4 0.4 cap- http://www.nature.com/naturegenetics Initial human tures 36% of the LIT-Int functional associa- gene set (9,214) tions (Fig. 2b). Beyond this threshold, the LIT-Int pairs are enriched between two- and tenfold (depending on the control set used) in Expression profiling coexpressed pairs as compared with 10,000 Human tissues gene pairs generated randomly from the same Similar expression data set. profiles Genes To identify potential functional associations involving all four reference genes, we focused BRCA on those transcripts found in the ‘expression Nature Publishing Group Group Nature Publishing XPRSS-Int intersection’ (XPRSS-Int) of the four coex- 7 (164) pression sets. The XPRSS-Int contains 164 200 genes, of which 15 are also present in the © LIT-Int data set (Fig. 2c and Supplementary Human Human and model organisms Table 2 online). To evaluate the significance of Expression Phenotypic Genetic Protein the XPRSS-Int set, we generated 200 ran- changes in breast tumors profiling interactions interactions domly chosen sets of four genes, calculated Upregulated Phenotypes Genes Proteins