Int J Clin Exp Med 2016;9(10):19925-19929
www.ijcem.com /ISSN:1940-5901/IJCEM0017954

Original Article
Identification of prognostic genes in paediatric medulloblastoma from mRNA expression profiles

Changjun Cao, Wei Wang, Pucha Jiang

Department of Neurosurgery, Zhongnan Hospital of Wuhan University, China

Received October 15, 2015; Accepted March 1, 2016; Epub October 15, 2016; Published October 30, 2016

Abstract: Medulloblastoma is the most common malignant brain tumour of childhood. The identification of prognostic biomarkers correlated with overall survival remains a crucial step towards the refinement of medulloblastoma treatment. A total of 100 medulloblastoma samples from two independent cohorts were included in this study. A statistical modelling approach, the Bayesian Model Averaging algorithm, was used to discover the prognostic biomarkers. Six genes (BICD2, CD300LG, RAB21, RAD18, SYNRG and TNFSF13) were identified as related to medulloblastoma overall survival. We demonstrated that this six-gene signature could successfully discriminate low-risk from high-risk groups in two independent medulloblastoma cohorts. We have successfully identified a six-gene medulloblastoma prognostic signature. We anticipate that these genes could serve as biomarkers or drug targets in personalised therapy of medulloblastoma.

Keywords: Prognosis, medulloblastoma, mRNA expression, Bayesian model averaging

Introduction

Medulloblastoma is the most common malignant brain tumour of childhood and accounts for 15%-20% of all paediatric primary brain tumours [1]. Although there has been an approximately 70% improvement in 5-year survival rates of standard-risk patients as a result of advances in treatment regimens, many surviving children suffer from long-term neurologic and endocrinologic adverse effects [2, 3]. Despite the improvements in survival rates, approximately 30% of patients are incurable [4].

The identification of prognostic biomarkers correlated with overall survival remains a crucial step towards the refinement of medulloblastoma treatment. An elevated expression level of NTRK3 (neurotrophic tyrosine kinase, receptor, type 3) indicated good prognosis [5-7], while high expression of CDK6 has been associated with poorer prognosis [8]. With the advances of high-throughput mRNA expression profiling technologies, known as microarrays, genes related to cell proliferation, transcription and mitosis showed promising results in predicting medulloblastoma outcome [9]. The ability to measure mRNA expression levels of thousands of genes at one time has made microarray technology a promising direction in cancer research. Based on microarray data, many gene signatures have also been identified for the prediction of medulloblastoma prognosis [10, 11].

One major aim in microarray-based survival analysis is to select a few predictive candidates from the large number of genes measured in the analysed samples. In this way, the significantly smaller set of biomarkers would make clinical use more affordable in terms of time and cost. However, most existing prognostic biomarker identification methods rely on univariate analysis [12-14], which considers the expression profile of each gene individually. Multivariate analysis, on the other hand, evaluates multiple genes simultaneously and identifies predictive genes in combinations.

In the present study, we investigated a Bayesian Model Averaging (BMA) based approach on mRNA expression profiles from a large cohort of primary medulloblastoma samples to generate novel prognostic genes that could be used to predict patient overall survival. The BMA algorithm is one of the multivariate gene selection strategies and has many advantages [15, 16]. It is computationally efficient and systematically determines the number of predictive genes and models. The final selected models usually consist of only a few genes. We anticipate that these genes could serve as biomarkers or drug targets in personalised therapy of medulloblastoma.

Materials and methods

Patient samples

As shown in Table 1, a total of 100 samples from two independent cohorts were selected for the present study. Tumours were classified as classic (n=75; 75%), anaplastic (n=3; 3%) or desmoplastic (n=16; 16%), while no histological information was provided for 6 samples. At diagnosis, 19% of the patients were under the age of three. Median survival was 41 months (range: 2-277 months).

Table 1. Characteristics of clinical samples

                               Dataset 1, n=61    Dataset 2, n=39
NCBI GEO accession number      GSE10327           GSE12992
Age under 3
  Yes                          15                 4
  No                           46                 35
Histology
  Anaplastic                   1                  2
  Classic                      44                 31
  Desmoplastic                 13                 3
  NA                           3                  3

NCBI GEO (http://www.ncbi.nlm.nih.gov/geo/). NA: no histological information provided.

Microarray experiments

Although the microarray experiments were conducted in two different institutes, the researchers followed the same protocol to generate the mRNA expression datasets [17, 18]. In summary, 4 μg of total RNA was used for cRNA synthesis and fragmentation. Labelling was performed with the One-Cycle cDNA Synthesis Kit (Affymetrix, Santa Clara, California) according to the manufacturer's protocol. Sample quality was checked on a Bioanalyzer before and after fragmentation. 10 μg of labelled cRNA was hybridized to Affymetrix Human Genome U133 Plus 2.0 arrays according to the manufacturer's protocol (Affymetrix, Santa Clara, California). Arrays were scanned with a GeneChip Scanner 3000 (Affymetrix, Santa Clara, California) according to the manufacturer's protocol.

Combination of independent medulloblastoma cohorts

Expression datasets from the two independent cohorts were obtained from the National Center for Biotechnology Information Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo) with series accession numbers (dataset 1: GSE10327; dataset 2: GSE12992). The Ethical Committee of the Zhongnan Hospital, Wuhan University, China, approved the research. Since the same experimental platform was used, the datasets from the two cohorts were pre-processed together. Data from two patients (GEO accession numbers: GSM324138 and GSM260981) who died within one month after surgery were excluded to avoid the influence of surgery-related complications. The expression datasets were normalized using the RMA (Robust Multi-array Average) algorithm [19] in the Bioconductor package (http://www.bioconductor.org/) of the R statistical environment (http://www.r-project.org). Finally, the matrix consisting of mRNA expression datasets from 100 medulloblastoma patients was generated for BMA modelling.

Bayesian model averaging for survival analysis

We implemented the BMA algorithm for medulloblastoma survival analysis by using a Bioconductor package, iterativeBMAsurv [20]. Firstly, the genes were ranked in descending order of their log likelihood using the Cox Proportional Hazards Model [21]. Then the BMA algorithm was applied to the 10 top-ranked genes. In the next step, genes to which the BMA assigned low posterior probabilities (less than 1%) of being in the predictive model were removed. If n genes were removed, the next n genes from the previously ranked list were added to the set of genes and the BMA algorithm was applied again. These steps continued until all genes had been considered.

The BMA training process was applied to all 100 samples. After the prognostic model was established, it was tested on each of the two independent datasets.
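The iterative gene-swapping procedure described above for the BMA step can be sketched in a few lines. This is a minimal illustration in Python and not the iterativeBMAsurv implementation, which is an R package. The function `posterior_prob` is a hypothetical stand-in for fitting BMA on the current set of genes and returning each gene's posterior probability of being in the predictive model. The gene names and toy scores are invented for the example. The handling of the case in which no gene falls below the threshold (the weakest gene is dropped so that the next ranked gene can be tried) is an assumption of this sketch, since the text does not state how that case is treated.

```python
from typing import Callable, Dict, List, Sequence


def iterative_window_selection(
    ranked_genes: Sequence[str],
    posterior_prob: Callable[[List[str]], Dict[str, float]],
    window_size: int = 10,
    threshold: float = 0.01,
) -> List[str]:
    """Walk a pre-ranked gene list with a fixed-size window.

    ranked_genes: genes already sorted from most to least promising
        (in the text, by Cox model log likelihood).
    posterior_prob: hypothetical callback that scores the current window.
    """
    window = list(ranked_genes[:window_size])
    next_idx = len(window)  # position of the next gene not yet considered
    while True:
        probs = posterior_prob(window)
        # Keep genes whose posterior probability reaches the threshold (1% in the text).
        kept = [g for g in window if probs[g] >= threshold]
        if next_idx >= len(ranked_genes):
            return kept  # every ranked gene has now been considered
        if len(kept) == len(window):
            # Sketch assumption: nothing was removed, so drop the weakest gene
            # to free one slot and guarantee progress through the ranked list.
            weakest = min(window, key=lambda g: probs[g])
            kept = [g for g in window if g != weakest]
        n_removed = len(window) - len(kept)
        # Refill the window with the next n genes from the ranked list.
        refill = list(ranked_genes[next_idx:next_idx + n_removed])
        next_idx += len(refill)
        window = kept + refill


# Toy data, invented for illustration only. A real BMA posterior depends on
# which other genes are in the window; this lookup ignores that.
toy_scores = {"g1": 0.9, "g2": 0.001, "g3": 0.5, "g4": 0.005, "g5": 0.8}


def toy_posterior(genes: List[str]) -> Dict[str, float]:
    return {g: toy_scores[g] for g in genes}


selected = iterative_window_selection(
    ["g1", "g2", "g3", "g4", "g5"], toy_posterior, window_size=3
)
print(selected)
```

With a window of three, the toy run drops `g2` and then `g4` as each falls below the 1% threshold, and the loop ends once the last ranked gene has entered the window and been scored.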
Table 2. List of the identified six prognostic genes in medulloblastoma

Official symbol    Description                                                Gene ID
BICD2              Bicaudal D homolog 2 (Drosophila)                          23299
CD300LG            CD300 molecule-like family member g                        146894
RAB21              RAB21, member RAS oncogene family                          23011
RAD18              RAD18 E3 ubiquitin protein ligase                          56852
SYNRG              Synergin, gamma                                            11276
TNFSF13            Tumor necrosis factor (ligand) superfamily, member 13      8741

Characterisation of selected medulloblastoma prognostic genes

The selected medulloblastoma prognostic genes were categorized in Gene Ontology groups and mapped to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways using the Database for Annotation, Visualization and Integrated Discovery (DAVID, http://david.abcc.ncifcrf.gov/) software [22, 23].

Results

Identification of six genes associated with medulloblastoma overall survival

In the clinical practice of medulloblastoma therapy, using a small set of genes (biomarkers) to predict overall survival reduces the costs associated with high throughput time-consum-

[...] RAB21 were significantly enriched in one cellular component category: Golgi apparatus (GO term accession: GO:0005794). However, the prognostic gene signatures were not significantly enriched in any KEGG pathways.

Discussion

In the present study, we have successfully applied machine-learning approaches to identify six overall survival-associated genes in medulloblastoma to meet the pressing challenge in clinical treatment. Because this gene panel consists of a rather small number of genes, mRNA expression could be easily assessed by quantitative PCR (qPCR) experiments; we believe that this could establish a practical and inexpensive molecular diagnostic tool for clinical use in the near future.

To our knowledge, none of the six genes has been reported in previous medulloblastoma prognosis-related research. Only one gene, TNFSF13, was reported to be associated with prognostic prediction in other diseases. The protein