Article
pubs.acs.org/jpr

Proteome Atlas of Human Chromosome 8 and Its Multiple 8p Deficiencies in Tumorigenesis of the Stomach, Colon, and Liver

Yang Zhang,†,○ Guoquan Yan,†,○ Linhui Zhai,‡,§,○ Shaohang Xu,∥,○ Huali Shen,† Jun Yao,† Feifei Wu,† Liqi Xie,† Hailin Tang,# Hongxiu Yu,† Mingqi Liu,† Pengyuan Yang,† Ping Xu,‡,§ Chengpu Zhang,‡,§ Liwei Li,‡,§ Cheng Chang,‡,§ Ning Li,‡,§ Songfeng Wu,‡,§ Yunping Zhu,‡,§ Quanhui Wang,∥,⊥ Bo Wen,∥ Liang Lin,∥ Yinzhu Wang,∥ Guiyan Zheng,∥ Lanping Zhou,□ Haojie Lu,*,† Siqi Liu,*,∥,⊥ Fuchu He,*,‡,§ and Fan Zhong*,†

† Institutes of Biomedical Sciences and Department of Chemistry, Fudan University, Shanghai 200032, China
‡ State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, China
§ National Engineering Research Center for Protein Drugs, Beijing 102206, China
∥ BGI-Shenzhen, Shenzhen 518083, China
⊥ Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China
# College of Mechanical & Electronic Engineering and Automatization, National University of Defense Technology, Changsha 410073, China
□ State Key Laboratory of Molecular Oncology, Cancer Institute & Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100021, China

*S Supporting Information

ABSTRACT: Chromosome 8 is a medium-length euchromatic unit in humans with an extraordinarily high mutation rate, which can be detected not only in evolution but also in multiple mutant diseases, such as tumorigenesis and further invasion/metastasis. The Chromosome-Centric Human Proteome Project of China systematically profiles the proteomes of three digestive organs (i.e., stomach, colon, and liver) and their corresponding carcinoma tissues/cell lines according to a chromosome organizational roadmap.
By rigorous standards, we identified 271 (38.7%), 330 (47.1%), and 325 (46.4%) of the 701 chromosome 8-coded proteins in Swiss-Prot from stomach, colon, and liver samples, respectively; the 413 proteins identified in total give a coverage of 58.9%. Using large-scale label-free proteome quantitation, we also found some 8p deficiencies, such as at 8p21−p23, in tumorigenesis of the above-described digestive organs, which is in good agreement with previous reports. To the best of our knowledge, this is the first study to have verified these 8p deficiencies at the proteome level, complementing genome and transcriptome data.

KEYWORDS: chromosome 8, proteome, 8p deletion, tumorigenesis, liver, colon, stomach

Special Issue: Chromosome-centric Human Proteome Project
Received: August 31, 2012
© XXXX American Chemical Society | dx.doi.org/10.1021/pr300834r | J. Proteome Res. XXXX, XXX, XXX−XXX

■ INTRODUCTION

The Chromosome-Centric Human Proteome Project (C-HPP) has received considerable attention for its significance in understanding gene function and structure in terms of chromosome proteins.¹,² The International Human Genome Sequencing Consortium recently completed a sequence of the human genome including chromosome 8. It has reported a manually curated gene catalog, containing 793 gene loci and 301 pseudogene loci, including all previously known genes on chromosome 8³ and giving 701 nonredundant proteins in Swiss-Prot (Version 2012-7-11).

Chromosome 8 is a medium-length euchromatic unit in humans that has an extraordinarily high mutation rate by positive selection.³,⁴ DEF⁵ and MCPH1⁶,⁷ are widely known rapidly evolving gene clusters in 8p. Telomere shortening, especially in chromosome 8, appears as a mechanism fostering the development of chromosomal instability during aging and chronic disease.⁸ This relatively high genomic instability of chromosome 8 is found not only in evolution but also in multiple mutant diseases, such as tumorigenesis and further invasion/metastasis. Chromosome 8 and its 8p deletion have been studied for many years and determined to be associated with hepatocellular carcinoma (HCC) metastasis.⁹

As part of the Chinese C-HPP consortium, chromosomes 1, 8, and 20 have been selected to systematically profile the proteomes of three digestive organs, namely, stomach, colon, and liver, as well as their corresponding carcinoma tissues/cell lines. From the perspective of pathophysiological significance in China and worldwide, digestive cancers, such as gastric cancer (GC), colorectal cancer (CRC), and HCC, are among the most frequently reported cancers and are characterized by metastatic potential and poor outcomes. This digestive group includes some of the most critical cancers (among them are those ranked second to fourth in cancer-related mortality) and, despite all sustained efforts, maintains a profile of low survival rates and lacks successful therapies.¹⁰ Herein, we report the proteome of chromosome 8 based on measured proteins from samples of selected digestive tissues/cells and summarize the total nonredundant proteins reported so far for the said proteome. We also discuss the connections of measured 8p deletions with these three digestive cancers.

Figure 1. Chromosome 8 proteome atlas with data on the quantification (A) and identification (B) of gastric (green text and frame), colorectal (blue text and frame), and hepatic (red text and frame) systems. (A) The colors denote the abundance of the protein expression data. The median-normalized protein data were transformed by log2 and rescaled into a [−1,1] region. The protein data above the median are shown in red, whereas those below the median are shown in blue. The color legend is provided at the top portion. (B) The protein coding genes are shown in blue.

■ MATERIALS AND METHODS

Proteome Identification and Quantification

Proteomes from 18 samples, namely, liver tissue,¹¹ Hep3B, SNU398, SNU449, SNU475, MHCC97L (97L), MHCC97H (97H), HCCLM3 (LM3), HCCLM6 (LM6), colon tissue, CRC tissue, SW480, HCT116, stomach tissue, GC tissue, AGS, BGC823, and SGC7901, were identified or researched as described in a companion article (DOI: 10.1021/pr3008286). Protein abundance in the human liver data set and that in the 17 other samples were quantified by SIN¹² and by LFQuant,¹³ respectively. Both are label-free proteome quantification methods, and the SIN method, which does not use XIC information, is more robust for LTQ data. More details on proteome quantification are provided in the related article.

The mass spectrometry results showed a rather high incidence of missing values (proteins without identification in some replicates). Using each proteomic experimental replicate as a separate input column would disrupt subsequent Gene Set Enrichment Analysis (GSEA) processing. To solve this problem, we used the average of the SIN or LFQuant values of each sample as its representative value, rather than dealing with each fluctuating experimental replicate in further analysis. For comparative analysis of liver samples, we further normalized the SIN and LFQuant values by dividing them by their medians.

Biological Category Statistics and Enrichment Analysis

“Data set Files” and “Analyses” under “Human Genes Chromosomal Location” in Ingenuity Pathway Analysis (IPA, Ingenuity Systems; www.ingenuity.com) Version 14197757 were used for cell, tissue, and organ (CTO) and cell line distribution statistics as well as biological category enrichment analysis.

Enrichment analysis of the 288 missing chromosome 8-coded proteins was carried out using the web-accessible Database for Annotation, Visualization, and Integrated Discovery (DAVID) tool.¹⁴ DAVID can recognize the UniProt AC from data sets. Medium classification stringency and default items were chosen for enrichment calculation.

GSEA

GSEA¹⁵ was performed to find down-regulated enriched cytobands in chromosome 8. The data set was loaded with HUGO Gene Nomenclature Committee-approved gene symbols¹⁶ and analyzed using the Java GSEA package.¹⁷ We scanned (Signal2Noise metric, weighted scoring, 1,000 phenotype permutations) the cytoband-organized c1 gene sets of the Molecular Signatures Database v3.0. As GSEA requires at least three (sample) columns for each phenotype group, we replicated the stomach, colon, and liver proteome data to appear thrice and the HCT116 and SW480 proteome data to appear twice for comparison. The gene sets with normalized enrichment scores lower than −1.00 were considered enriched in a down-regulated direction.

Figure 2. Overlapping status of samples in total and chromosome 8-coded protein identifications. (A−C) Unique (diagonal, red text) and overlapped identification scales within gastric (A), colorectal (B), and hepatic (C) systems. Numbers of chromosome 8-coded proteins are given in brackets. (D−K) Venn diagrams comparing total proteins (D−F) and chromosome 8-coded proteins (G−I) identified within the three digestive systems and among three normal digestive organs (J, K). (L, M) Venn diagrams comparing total proteins (L) and chromosome 8-coded proteins (M) from the present study, PeptideAtlas, and MaxQB.

■ RESULTS

Proteomes of the Three Digestive Organs

… respectively, using rigorous standards. The proteomes from these samples provided 413 identified proteins in total and gave a total coverage of 58.9%. The total identification coverages of chromosome 8-coded proteins in five other databases were Ensembl (gene level, v.69) 402/696 = 57.8%, neXtProt (v.2012-10-16) 413/700 = 59.0% (Table S2 in Supporting Information), GPMdb (green, v.2012-10-18) 355/472 = 75.2%, PeptideAtlas (1% FDR
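The coverage figures quoted in the abstract and in the Results above are plain ratios of identified to catalogued proteins, and the 288 "missing" proteins submitted to DAVID are the remainder of the Swiss-Prot catalog. As a minimal sketch of that arithmetic (the function name `coverage` and the dictionary layout are illustrative assumptions, not part of the original analysis pipeline), the numbers can be re-derived as follows:

```python
def coverage(identified: int, total: int) -> float:
    """Return identified/total as a percentage rounded to one decimal place."""
    return round(100.0 * identified / total, 1)


# Counts taken from the text; the variable names are hypothetical.
SWISS_PROT_CHR8 = 701      # nonredundant chromosome 8-coded proteins in Swiss-Prot
IDENTIFIED_TOTAL = 413     # proteins identified across all samples in this study

identified_by_organ = {"stomach": 271, "colon": 330, "liver": 325}

# Per-organ coverage against the Swiss-Prot catalog.
organ_coverage = {
    organ: coverage(count, SWISS_PROT_CHR8)
    for organ, count in identified_by_organ.items()
}

# Overall coverage and the number of catalogued proteins never observed.
total_coverage = coverage(IDENTIFIED_TOTAL, SWISS_PROT_CHR8)
missing = SWISS_PROT_CHR8 - IDENTIFIED_TOTAL

# (identified, catalogued) pairs quoted for other databases in the Results.
other_databases = {
    "Ensembl": (402, 696),
    "neXtProt": (413, 700),
    "GPMdb": (355, 472),
}
database_coverage = {
    name: coverage(found, total) for name, (found, total) in other_databases.items()
}

if __name__ == "__main__":
    print(organ_coverage)
    print(total_coverage, missing)
    print(database_coverage)
```

Note that each database uses its own denominator (696, 700, or 472 entries rather than 701), so the percentages are not directly comparable across resources.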