Original Article the Analyses of SRCR Genes Based on Protein-Protein Interaction Network in Esophageal Squamous Cell Carcinoma
Total Page:16
File Type:pdf, Size:1020Kb
Am J Transl Res 2019;11(5):2683-2705 www.ajtr.org /ISSN:1943-8141/AJTR0093089 Original Article The analyses of SRCR genes based on protein-protein interaction network in esophageal squamous cell carcinoma Zepeng Du1,2, Qiaoxi Xia6, Bingli Wu6, Jiyu Ding6, Yan Zhao2, Ling Lin6, Mantong Chen6, Zhixiong Cai3, Shaohong Wang2, Liyan Xu7, Enmin Li6, Zhiyong Wu4, Yun Li1, Haixiong Xu5, Dong Yin1 1Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Genes Regulation, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou 510120, China; Departments of 2Pathology, 3Cardiology, 4Surgical Oncology, 5Neurosurgery, Shantou Central Hospital, Affiliated Shantou Hospital of Sun Yat-sen University, Shantou 515041, China; 6Department of Biochemistry and Molecular Biology, 7Institute of Oncologic Pathology, Shantou University Medical College, Shantou 515041, China Received February 27, 2019; Accepted April 29, 2019; Epub May 15, 2019; Published May 30, 2019 Abstract: The scavenger receptor cysteine-rich (SRCR) proteins, with one to several SRCR domains, play important roles in human diseases. A full view of their functions in esophageal squamous cell carcinoma (ESCC) remain unclear. Sequence alignment and phylogenetic tree for all human SRCR domains were performed. Differentially- expressed SRCR genes were identified in ESCC, followed by protein-protein interaction (PPI) network construction, topological parameters, subcellular distribution, functional enrichment and survival analyses. The variation of con- served cysteines in each SRCR domain suggested a requirement for new classification of the SRCR family. Six genes (LGALS3BP, MSR1, CD163, LOXL2, LOXL3 and LOXL4) were upregulated, and four genes (DMBT1, PRSS12, TMPRSS2 and SCARA5) were downregulated in ESCC. These 10 SRCR genes form a unique biological network. Functional enrichment analyses provided important clues to investigate the biological functions for SRCR gene net- work in ESCC, such as extracellular structure organization and the PI3K-Akt signaling pathway. Kaplan-Meier curves confirmed that high expression of SCARA5, LOXL2, LOXL3, LOXL4 were related to poor survival, whereas high expres- sion of DMBTI and PRSS12 showed the opposite result. SRCR genes promote the development of ESCC through its network and could serve as potential prognostic factors and therapy targets of ESCC. Keywords: SRCR gene, protein-protein interaction network, sequence analysis, functional enrichment, esophageal squamous cell carcinoma Introduction teristics of the conserved cysteines in the SR- CR domains, SRCR genes are usually classified The scavenger receptor cysteine-rich (SRCR) into two types: type A genes, which contain six superfamily was first defined by Goldsteinet al. cysteine residues that are encoded by at least in 1979, describing their activity in the uptake two exons, and type B genes, which have eight of modified low-density lipoprotein (LDL) [1]. cysteine residues that generated from a single These SRCR genes were described upon the exon [3]. These cysteine residues participate in identification of a series of receptors contain- intra-domain disulfide bonds, creating the pro- ing one or several extracellular domains homol- tein structure for substrate recruitment. ogous to the cysteine-rich C-terminal domain of the type I class A scavenger receptor express- In the past few decades, new members of the ed in macrophages [2]. The SRCR domain is SRCR gene family have been identified and approximately 100-110 amino acids long and much has been learned about their biological characterized by the multiple conserved cyste- properties [4]. Accumulated research has re- ine residues. The number of SRCR domains in vealed that these SRCR genes/proteins have a SRCR genes varies, ranging from a single copy broad range of ligand-binding substrates, in- to up to 14 repeats. Depending on the charac- cluding polyribonucleotides, proteins, polysac- SRCR genes in ESCC charides, and lipids [5]. Nevertheless, many An identity matrix, which contained similar per- SRCR genes encode mosaic or multidomain centages of every pair of SRCR domain se- proteins proposed to function in the homeosta- quences, was generated by ClustalW after the sis of epithelia and the immune system. Some alignment. To visualize the results, the sequ- SRCR genes are associated with a number of ence similarity matrix was log-transformed, clu- diseases and pathogenic states, such as au- stered by Cluster 3.0 software (http://bonsai. toimmune diseases, Alzheimer’s disease, ath- hgc.jp/~mdehoon/software/cluster/software. erosclerosis and cancer, thus exhibiting prom- htm) and viewed in TreeView (http://jtreeview. ising potential as targets for diagnostic and/or sourceforge.net/). The Neighbor joining meth- therapeutic intervention [5]. od in MEGA X software was applied to construct a phylogenetic tree with bootstrap replicates The esophagus is an open organ, connecting set as 1000 [8]. mouth and stomach, and easily suffers from stimuli, such as heat, food and reflux. Globally, Differentially-expressed SRCR genes in esoph- esophageal cancer is the seventh most com- ageal cancer mon malignant tumor, as well as the sixth most common fatal cancer [6]. Based on these char- At least three mRNA expression profiles for acteristics, the esophagus is a good model to esophageal squamous cell carcinoma (ESCC) study the expression pattern of SRCR genes. were obtained and analyzed in this study, whi- Much research in the past has reported the ch were available from GEO database (https:// function, structure and evolution of specific, www.ncbi.nlm.nih.gov/geo/), to detect the di- single SRCR genes in many species [7]. How- fferential expression of SRCR genes. Both GS- ever, human SRCR genes have not been a fo- E53622 and GSE53624 were provided from cus and fully considered in human disease, the same previous research, which analyzed especially for cancer. In this study, human the lncRNA and mRNA expression profiles by SRCR genes in esophageal cancer were col- microarray in paired ESCC tumor and normal lected, and their specific protein-protein inter- tissues [9]. In GSE53622, 119 paired patients action networks were established. Multiple bio- were divided randomly into training (n=60) and informatics and clinical survival analyses of test (n=59) groups. A prognostic signature was SRCR genes were performed to understand developed from the training group and valid- their roles and clinical significance in the back- ated in the test group, and further in an ind- ground of esophageal cancer. ependent cohort (n=60), which generated the GSE53624 dataset. In GSE23400, Su et al. Materials and methods profiled global gene expression of 53 paired ESCC tumor and matched normal samples to Human SRCR protein sequence collection better characterize molecular changes in ESCC [10]. Human SRCR proteins were searched and col- lected through a query in the UniProt protein These datasets were re-annotated and norm- database (http://www.uniprot.org/). “SRCR” alized by the robust multi-array average (RMA) was used as keyword to search in the UniProt method in the R language limma package database, with a search condition set as “HU- (http://www.bioconductor.org/packages/2.9/ MAN” and “Reviewed”. The FASTA format of the bioc/html/limma.html). Significantly different- protein sequences was retrieved from Uniprot ially expressed genes were identified by the database as well. Since these proteins conta- SAM (Significance Analysis of Microarray, assi- in other functional domains other than SRCR milating a set of gene-specific t-tests) method, domain, we categorized each of these SRCR with a fold-change >1.5 or fold-change < 0.67 proteins by different domains in order to study (FDR P.adjust < 0.01) being defined as a signifi- the amino acid sequences of SRCRs. Then the cant change [11]. Since three expression pro- sequence alignment of these SRCR units was files were analyzed in this study, only genes performed using ClustalW embedded in BioEdit that were significantly changed in 2 out of the software (http://www.mbio.ncsu.edu/bioedit/ 3 ESCC expression profiles were considered bioedit.html), and modified by Jalview (http:// for future study. Hierarchical clustering was www.jalview.org/) to indicate the conserved performed using the data of differentially-ex- amino acid residues. pressed SRCR genes from these three ESCC 2684 Am J Transl Res 2019;11(5):2683-2705 SRCR genes in ESCC mRNA expression profiles to generate a heat- yzer, which can calculate network diameter, map. density, centralization, and clustering coeffi- cients, providing insight into the organization Protein-protein interaction network for differen- and structure of complex networks [14]. The tially-expressed SRCR genes degree distribution P(k) of a large-scale net- work was defined as the fraction of nodes in Updated human protein-protein interaction da- the network with degree k. The pattern of th- ta was acquired from HPRD (http://www.hprd. eir dependencies can be visualized by fitting a org/), BioGRID (http://thebiogrid.org/), and Int- line on the node degree distribution data. Act (http://www.ebi.ac.uk/intact/). These phys- NetworkAnalyzer calculates the positive coor- ical protein interactions were collected from dinate value for fitting the line, where the pow- published papers, which were confirmed by low er law curve is of the form y = bxa. The R2 value -throughput or high-throughput experiments, is a statistical measure of the linearity of the providing high confidence for future