Integrated Analysis of Differentially Expressed Genes in Breast Cancer Pathogenesis
Total Page:16
File Type:pdf, Size:1020Kb
2560 ONCOLOGY LETTERS 9: 2560-2566, 2015 Integrated analysis of differentially expressed genes in breast cancer pathogenesis DAOBAO CHEN and HONGJIAN YANG Department of Breast Surgery, Zhejiang Cancer Hospital, Hangzhou, Zhejiang 310022, P.R. China Received October 20, 2014; Accepted March 10, 2015 DOI: 10.3892/ol.2015.3147 Abstract. The present study aimed to detect the differences ducts or from the lobules that supply the ducts (1). Breast between breast cancer cells and normal breast cells, and inves- cancer affects ~1.2 million women worldwide and accounts tigate the potential pathogenetic mechanisms of breast cancer. for ~50,000 mortalities every year (2). Despite major advances The sample GSE9574 series was downloaded, and the micro- in surgical and nonsurgical management of the disease, breast array data was analyzed to identify differentially expressed cancer metastasis remains a significant clinical challenge genes (DEGs). Gene Ontology (GO) cluster analysis using affecting numerous of patients (3). The prognosis and survival the GO Enrichment Analysis Software Toolkit platform and rates for breast cancer are highly variable, and depend on Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway the cancer type, treatment strategy, stage of the disease and analysis for DEGs was conducted using the Gene Set Analysis geographical location of the patient (4). Toolkit V2. In addition, a protein-protein interaction (PPI) Microarray technology, which may be used to simultane- network was constructed, and target sites of potential transcrip- ously interrogate 10,000-40,000 genes, has provided new tion factors and potential microRNA (miRNA) molecules were insight into the molecular classification of different cancer screened. A total of 106 DEGs were identified in the current types (5). Many genes involved in the regulation of cell cycle, study. Based on these DEGs, a number of bio-pathways appear invasion, metastasis and angiogenesis have been indicated to be altered in breast cancer, including a number of signaling to be prognostic biomarkers based on microarray analyses. pathways and other disease-associated pathways, as indicated Furthermore, numerous researchers have proposed that the by KEGG pathway clustering analysis. ATF3, JUND, FOSB phenotypic diversity of breast tumors may be accompanied and JUNB were detected in the PPI network. Finally, the most by a corresponding diversity in gene expression patterns (6). significant potential target sites of transcription factors and Therefore, the systematic investigation of gene expression miRNAs in breast cancer, which are important in the regula- patterns in human breast tumors may aid in understanding the tion of gene expression, were identified. The results indicated pathogenesis of this disease (7). that miR-93, miR-302A, miR-302B, miR-302C, miR-302D, The present study aimed to investigate the differences miR-372, miR-373, miR-520E and miR-520A were closely between breast cancer cells and normal cells, and examine the associated with the occurrence and development of breast possible underlying mechanisms of breast cancer. Biological cancer. Therefore, changes in the expression of these miRNAs microarray analysis was used to analyze the expression profile may alter cell metabolism and trigger the development of breast of breast cancer and normal cells, and identify the differentially cancer and its complications. expressed genes (DEGs). In addition, altered metabolic path- ways in breast cancer were identified using a bioinformatics Introduction approach, while the target sites of potential transcription factors and miRNAs were screened. Furthermore, the current Breast cancer is a type of cancer that originates from breast study aimed to improve the understanding of the occurrence tissue and most commonly from the inner lining of the milk and development of breast cancer and facilitate the discovery of potential novel biomarkers for its treatment. Materials and methods Correspondence to: Dr Daobao Chen, Department of Breast Surgery, Zhejiang Cancer Hospital, 38 Guangji Road, Hangzhou, Gene expression microarray. In order to investigate the altera- Zhejiang 310022, P.R. China tions in breast cancer cells compared with normal cells, DEGs E-mail: [email protected] were screened at the gene level and the possible mechanisms were examined. The gene expression series, GSE9574 (8), was Key words: differentially expressed genes, transcription factors, downloaded from the Gene Expression Omnibus (GEO; http:// breast cancer, function and pathway annotation, protein-protein www.ncbi.nlm.nih.gov/geo/) of the National Center for Biotech- interaction network, microRNA nology Information (Bethesda, MD, USA), based on the GPL96 [HG-U133] platform data (Affymetrix Human Genome U133 Array), and included 14 breast cancer and 15 normal samples. CHEN and YANG: DIFFERENTIALLY EXPRESSED GENES IN BREAST CANCER 2561 Identification of DEGs. Microarray data were analyzed using R software v.2.13.0 (9) and further processed using Geoquery (10) and Limma (11) packages. Geoquery is used to rapidly obtain gene expression profiles from the GEO data- base (10), while the Limma package is the most popular method for the analysis of DEGs (11,12). The preprocessed expression data were obtained using Geoquery, and subjected to a log2 transformation. The breast cancer and normal samples were compared using Limma in order to identify the DEGs between the two tissue types. Gene P-values were determined by R soft- ware using the Student's t-test and P<0.001 in a Bayesian model was considered to indicate a DEG (12). Gene Ontology (GO) analysis of DEGs. In order to assess the changes in DEGs occurring at the cellular level and the functional clustering of DEGs, the GO Enrichment Analysis Software Toolkit (GOEAST) (13) in the GO database (14) was used. Hypergeometric algorithms were used for statistical analysis, and terms associated with biological process and molecular function were enriched. Bio‑pathway analysis of DEGs. In order to detect the changes in DEGs at the molecular level, all the metabolic and nonmetabolic pathways were obtained from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Pathway enrichment of DEGs was assessed using the Gene Set Analysis Toolkit v2 (15,16), and the number of genes was counted for each term. A gene Figure 1. Molecular function clustering of differentially expressed genes. number of ≥2 and P<0.05 were used as the cut‑off values. Yellow boxes are significantly enriched Gene Ontology terms (false dis- covery rate <0.051). Brighter color indicates stronger statistical significance. Protein‑protein interaction (PPI) network construction. PPI GO, Gene Ontology. data were integrated and verified using data obtained from the following databases (all accessed on November 11, 2012): Human Protein Reference Database (http://www.hprd. cancer when compared with normal tissues, and involved org/); Biological General Repository for Interaction Datasets 106 DEGs (Table I). (http://thebiogrid.org/); Biomolecular Object Network Data- bank (http://bond.unleashedinformatics.com/); Database of GO cluster of DEGs. The molecular functions enriched in the Interacting Proteins (http://dip.doe-mbi.ucla.edu/); IntAct identified DEGs included nucleic acid binding transcription (http://www.ebi.ac.uk/intact/); Molecular INTeraction database factor activity, sequence‑specific DNA binding transcription (http://mint.bio.uniroma2.it/mint/welcome.do); and Reactome factor activity and double-stranded DNA binding (Fig. 1). In (http://www.reactome.org/). Subsequently, a PPI network was addition, the biological processes enriched are shown in Fig. 2, constructed based on the identified DEGs. A hypergeometric and include positive regulation of biological process, positive algorithm was used, and P<0.05 was considered to indicate regulation of cellular process, cellular response to organic statistically significant differences. substance and positive regulation of transcription from RNA polymerase II promoter. The GO clustering results provide a Screening the target sites of potential transcription factors preliminary description of the potential functions of the DEGs and miRNAs. Based on the gene annotation data from the and their effects on cells. Molecular Signatures Database (http://www.broadinstitute. org/gsea/msigdb/index.jsp; accessed November 11, 2012), the Bio‑pathways altered in breast cancer. To further investigate abundance of the gene sets were analyzed. In addition, hyper- changes of the biological pathways within cancer cells in geometric algorithms and false discovery rate (FDR) correction detail, a KEGG pathway enrichment analysis of the identified were performed using the Benjamini & Hochberg method (17). DEGs was performed. KEGG clustering results indicated FDR<0.05 was selected as the statistical significance threshold that a number of bio-pathways were altered in breast cancer to indicate target sites of potential transcription factors and cells, primarily signaling and disease-associated path- miRNAs. ways (Table II). The alteration of the RNA transport pathway was consistent with the GO clustering results, suggesting Results that gene expression in breast cancer cells differs from that in normal cells. In addition, signaling pathways in the Identification of DEGs in breast cancer.Using P<0.001 as the cell surface were altered, including the nucleotide-binding statistical significance threshold, a total of 123 probes were oligomerization domain (NOD)-like receptor signaling