Original Article Tumor-Intrinsic and -Extrinsic (Immune) Gene Signatures
Total Page:16
File Type:pdf, Size:1020Kb
Am J Cancer Res 2021;11(1):181-199 www.ajcr.us /ISSN:2156-6976/ajcr0121853 Original Article Tumor-intrinsic and -extrinsic (immune) gene signatures robustly predict overall survival and treatment response in high grade serous ovarian cancer patients David P Mysona1,2, Lynn Tran3, Shan Bai3, Bruno dos Santos2, Sharad Ghamande4, John Chan5, Jin-Xiong She3,4 1University of North Carolina, Chapel Hill, NC 27517, USA; 2Jinfiniti Precision Medicine, Inc. Augusta, GA 30907, USA; 3Center for Biotechnology and Genomic Medicine, Medical College of Georgia at Augusta University, Augusta, GA 30912, USA; 4Department of OBGYN, Medical College of Georgia at Augusta University, Augusta, GA 30912, USA; 5Palo Alto Medical Foundation Research Institute, Palo Alto, CA 94301, USA Received September 6, 2020; Accepted September 14, 2020; Epub January 1, 2021; Published January 15, 2021 Abstract: In the present study, we developed a transcriptomic signature capable of predicting prognosis and re- sponse to primary therapy in high grade serous ovarian cancer (HGSOC). Proportional hazard analysis was per- formed on individual genes in the TCGA RNAseq data set containing 229 HGSOC patients. Ridge regression analy- sis was performed to select genes and develop multigenic models. Survival analysis identified 120 genes whose expression levels were associated with overall survival (OS) (HR = 1.49-2.46 or HR = 0.48-0.63). Ridge regression modeling selected 38 of the 120 genes for development of the final Ridge regression models. The consensus model based on plurality voting by 68 individual Ridge regression models classified 102 (45%) as low, 23 (10%) as moder- ate and 104 patients (45%) as high risk. The median OS was 31 months (HR = 7.63, 95% CI = 4.85-12.0, P < 1.0-10) and 77 months (HR = ref) in the high and low risk groups, respectively. The gene signature had two components: in- trinsic (proliferation, metastasis, autophagy) and extrinsic (immune evasion). Moderate/high risk patients had more partial and non-responses to primary therapy than low risk patients (odds ratio = 4.54, P < 0.001). We concluded that the overall survival and response to primary therapy in ovarian cancer is best assessed using a combination of gene signatures. A combination of genes which combines both tumor intrinsic and extrinsic functions has the best prediction. Validation studies are warranted in the future. Keywords: High grade serous ovarian cancer, gene signature, chemotherapy resistance, prognosis, immune eva- sion, machine learning Introduction interval (PFI) defined as the time to recurrence or progression after receiving platinum-based Ovarian cancer is the most lethal gynecologic chemotherapy. Extended PFIs are associated cancer in the United States [1]. The majority of with higher response rates to repeat platinum patients are diagnosed at an advanced stage treatments and longer survival times [5]. with an estimated 5-year survival between 30% However, a PFI of less than 6 months is consid- and 50% [2]. Current standard of care consists ered platinum resistant and is associated of cytoreductive surgery either before or after with a median survival of 9-12 months [5]. systemic chemotherapy with a combination of Unfortunately, little is known about platinum a platinum and taxane agents [3]. This regimen resistance or how to overcome it [6]. is effective at initially treating the cancer, as 80% of patients will have no evidence of dis- To better understand the mechanisms of plati- ease after therapy completion [4]. However, at num resistance, we applied machine learning to least half of patients recur within the first 18 the TCGA RNAseq data to develop multigenic months after therapy [4]. models capable of predicting prognosis and treatment response among high grade serous To date, one of the most important prognostic ovarian cancer patients (HGSOC). Although pre- factors for ovarian cancers is the platinum free vious studies have examined prognostic signa- Prognostic gene signature high grade ovarian cancer Table 1. Summary of demographic, pathologic, and treatment information for all patients Patients # (%) Characteristic Median OS HR (95% CI) p-value (n, total = 229) Age < 59 years 113 (49%) 49 months ref ref ≥ 59 years 113 (49%) 38 months 1.18 (0.85-1.64) 0.32 Unknown 3 (2%) 24 months 4.31 (1.34-13.89) 0.014 Stage Low 20 (9%) 71 months ref 0.06 High* 209 (91%) 43 months 2.20 (0.97-4.99) Histology Serous 229 (100%) NA NA NA Grade Moderate 32 (14%) 62 months ref 0.03 High 197 (86%) 42 months 1.76 (1.07-2.90) Lymphovascular Invasion Negative 36 (16%) 52 months ref ref Positive 64 (28%) 41 months 1.45 (0.78-2.70) 0.24 Unknown 129 (56%) 44 months 1.50 (0.85-2.63) 0.16 PDS Yes 229 (100%) NA NA NA Residual Disease R0 44 (19%) 57 months ref ref R1 102 (44%) 41 months 1.82 (1.07-3.09) 0.03 R2 61 (27%) 38 months 1.81 (1.04-3.18) 0.04 Unknown 22 (10%) 79 months 0.87 (0.41-1.84) 0.72 Treated Postoperatively Yes 229 (100%) NA NA NA Response to Primary Treatment Complete Response 136 (59%) 57 months ref ref Partial Response 29 (13%) 33 months 4.02 (2.49-6.49) < 0.001 No Response 20 (9%) 24 months 5.88 (3.43-10.06) < 0.001 Stable Disease 15 (6%) 34 months 2.81 (1.39-5.68) 0.004 Unknown 29 (13%) 32 months 4.00 (2.42-6.63) < 0.001 Abbreviations: HR hazard ratio, CI confidence interval, Low Stage IIA-IIC, High Stage IIIA-IV, High grade: cancers described as grade 3, Moderate Grade: cancers described as grade 2, R0 No residual disease, R1 between 1 mm - 10 mm of residual dis- ease, R2 greater than 10 mm of residual disease, PDS: primary debulking surgery. *Among high stage patients 170 (81%) and 22 (11%) were stage IIIC and IV, respectively. tures, the reported signatures have below years. Of the 229 patients, the median age was par survival prediction, poor validation in out- 59 and 209 (91.3%) were stage IIIA or later. All side datasets, and do not predict platinum patients had serous histology, were grade 2 or resistance [7]. As previous studies have not higher, underwent primary cytoreductive sur- addressed the significant clinical question of gery and received postoperative treatment. An understanding and overcoming platinum resis- optimal cytoreduction (R0+R1 resections) was tance [7-14], we undertook the present study to achieved in 146 (63.8%) of patients, and most develop a gene signature that can predict both patients had a complete response (n = 136, treatment response and survival prognosis. 59.4%) to initial chemotherapy Table 1. Methods Survival analyses with individual genes Patients and data All statistical analyses were performed using the R language and environment for statistical TCGA ovarian cancer patient cohort (n = 307) computing [16]. Genes with an even distribu- level 3, log2 transformed RNAseq data was tion of patients when divided into 4 quartiles obtained through the UCSC Xena platform [15]. were chosen for analysis (n = 14,262). In single Exclusion criteria were unknown stage, grade 1 gene analyses, patients were ranked by expres- differentiation, no post-operative treatment, or sion levels and divided into four quartiles. The censored at less than or equal to 6 months. first quartile was used as the reference and This left a final cohort of 229 patients. Overall compared to the 2nd, 3rd and 4th quartiles using survival was the primary endpoint of this study Cox proportional hazards for survival analyses. and all surviving patients were censored at 10 All survival analyses and Kaplan-Meier survival 182 Am J Cancer Res 2021;11(1):181-199 Prognostic gene signature high grade ovarian cancer curves were generated using the “survival dence interval for the HR associated with each package” in R [17]. model. Briefly, 70% of patients were randomly sampled for each bootstrap and 1,000 boot- Ridge regression straps with replacement were generated for this study. Each bootstrapped dataset, for the Ridge regression was carried out with different selected top models were analyzed by Ridge gene sets to calculate Ridge Regression Scores regression and Cox proportion Hazard. The (RRS) for each patient using the “glmnet” pack- mean HR from all 1000 bootstraps was also age in R [18]. Ridge regression combines mul- computed and the 95% confidence interval tiple inputs in a linear manner and then uses a was defined by the HR at the th5 and 95th per- penalty term (lambda) to tune the model. The centiles of the 1000 models. Conventionally, effect of this penalty term can be modified to models are considered validated if 95% or have no effect (lambda = 0) or if lambda equals more models have p values less than 0.05. infinity the coefficient of the input parameter equals 0, meaning the given parameter has no Plurality voting for consensus modeling impact on the model. The input factors are then summed together based on their coeffi- Our analytical pipeline generated a number of cients resulting in an individual score for each models that were validated by training, testing, patient. The lambda value was optimized using and bootstrapping. It was critical to assess the lambda.min function, which automatically the consistency of patient classification by chooses the lambda which results in the least each of the selected models. For this purpose, errors on cross validation. After the RRS is the RRS group assignment for each patient by computed for each patient, all patients were the selected models was compiled and the ranked and then divided into two groups (RRS_ percentages of models assigning a specific high and RRS_low) by the cumulative sum of patient to each RRS group were calculated.