S-Fig. 1. [Figure panels: Overall Survival CCNE1; Progression Free Survival CCNE1; Overall Survival MCM5; Progression Free Survival MCM5]
Supplemental material
Analysis of gene expression data
Procedure 1
Image addressing, segmentation and flagging
In the first procedure, image addressing and segmentation were carried out with the CSIRO Spot software (CSIRO Australia, http://spot.cmis.csiro.au/spot/), and data analysis with the R software (http://cran.at.r-project.org/). Each spot was located using the array grid and the “seeded region growing” algorithm applied to the image. For each fluorescent image, the average pixel intensity within each spot was determined, and the local background was set equal to the “morphX.close.open” value (where X is either R for red or G for green), measured as reported (ref). Spots deemed unsuitable for accurate quantification because of array artefacts were manually flagged so that they could be identified and, where appropriate, excluded from further analysis.
Intensity and log ratio computation
For each array data set, two distinct classes of log2 ratio were computed. The first, MA = log2[(FR − BGR)/(FG − BGG)], was the classic log ratio of the two net fluorescence signals (F − BG), obtained by subtracting the local background (BG) from the spot median intensity (F). The second, MB = log2[(FR/BGR)/(FG/BGG)], was instead the log ratio of the two relative intensities (F/BG), i.e. the fluorescence of each channel (F) relative to its own local background (BG). The log2 geometric mean intensities and the log ratios of both classes were arbitrarily set to 0 if a spot did not have a net fluorescence signal greater than 0 in both channels (i.e. it was an unreliable spot).
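The two log-ratio classes can be illustrated with a minimal Python sketch for a single spot. The function and argument names are hypothetical, and computing the log2 geometric mean intensity A from the net signals is an assumption made here for illustration; the original analysis was carried out in R and its scripts are not reproduced.

```python
import math

def spot_log_ratios(f_r, bg_r, f_g, bg_g):
    """Return (MA, MB, A) for one spot.

    f_r, f_g   : spot intensities in the red and green channels
    bg_r, bg_g : local backgrounds of the same channels (assumed > 0)
    """
    net_r = f_r - bg_r
    net_g = f_g - bg_g
    # Unreliable spot: the net signal is not positive in both channels,
    # so intensity and both log ratios are set to 0.
    if net_r <= 0 or net_g <= 0:
        return 0.0, 0.0, 0.0
    m_a = math.log2(net_r / net_g)                # background-subtracted log ratio
    m_b = math.log2((f_r / bg_r) / (f_g / bg_g))  # log ratio of signal-to-background ratios
    a = 0.5 * math.log2(net_r * net_g)            # log2 geometric mean (assumed: net signals)
    return m_a, m_b, a

# Example: red net signal is twice the green one, equal signal-to-background ratios.
m_a, m_b, a = spot_log_ratios(800, 200, 400, 100)
```

In this example MA is 1 (a two-fold difference in net signal) while MB is 0, which shows that the two classes can disagree when the backgrounds of the two channels differ.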
LOWESS normalisation
Both kinds of M-values were normalised and corrected for possible biases using the LOWESS (ref) smoothing algorithm, in an MA vs. A plot for the first class (ref) and in an MB vs. A plot for the second. Markers, controls, unreliable and flagged spots were excluded from the LOWESS computation, but all M-values were normalised by subtracting the LOWESS estimate. The R lowess function in the stats package was used. The best value of the smoother span f for each interpolation curve was found by iterating the LOWESS algorithm and visually inspecting the smoothed functions; the delta parameter was set to 0.01 times the range of the intensity in each scatter plot, and the iter parameter (number of robustifying iterations) was set to 1000.
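The normalisation step (fit on the reliable spots only, then subtract the fitted curve from every spot) can be sketched as follows. This is a simplified stand-in, not R's lowess: it performs a single tricube-weighted local linear fit and omits both the robustifying iterations and the delta shortcut. All function names are hypothetical.

```python
def tricube(u):
    """Tricube kernel used by LOWESS-type smoothers."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def local_linear_fit(x_fit, y_fit, x0, f):
    """Weighted straight-line fit evaluated at x0, using the fraction f of nearest points."""
    n = len(x_fit)  # at least one point is required
    k = min(n, max(2, int(round(f * n))))
    dists = sorted(abs(x - x0) for x in x_fit)
    # Bandwidth: distance to the k-th neighbour, nudged up so that it is never 0
    # and the k-th neighbour keeps a (tiny) positive weight.
    h = dists[k - 1] * (1.0 + 1e-9) + 1e-12
    w = [tricube((x - x0) / h) for x in x_fit]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, x_fit)) / sw
    my = sum(wi * y for wi, y in zip(w, y_fit)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, x_fit))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, x_fit, y_fit))
    slope = sxy / sxx if sxx > 1e-12 else 0.0  # fall back to a weighted mean
    return my + slope * (x0 - mx)

def lowess_normalise(a, m, include, f=0.5):
    """Fit the trend on included spots only, then subtract it from all M-values."""
    x_fit = [ai for ai, keep in zip(a, include) if keep]
    y_fit = [mi for mi, keep in zip(m, include) if keep]
    return [mi - local_linear_fit(x_fit, y_fit, ai, f) for ai, mi in zip(a, m)]

# Example: a purely intensity-dependent bias plus one flagged outlier spot.
a_vals = [0.5 * i for i in range(20)]
m_vals = [0.3 * x - 1.0 for x in a_vals]
m_vals[5] = 10.0                        # flagged spot, excluded from the fit
include = [i != 5 for i in range(20)]
normalised = lowess_normalise(a_vals, m_vals, include, f=0.5)
```

The flagged spot does not influence the fitted curve, yet it still receives a normalised value, mirroring the rule that excluded spots are left out of the LOWESS computation but not out of the subtraction.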
Z-score standardization and data quantization
To evaluate which genes had probably changed their expression level, an intensity-dependent Z-score (ref) was calculated for each microarray data set and log ratio class (MA and MB). The Z-scores were determined by separating the positive normalised log ratios from the negative ones, in order to account for the asymmetry of the data distribution when estimating the standard deviation. Again, markers, controls, unreliable and flagged spots were excluded from the selection. The selected positive (or negative) data were reflected over the M=0 axis, and the resulting negative (or positive) values were merged with the selected ones into a new symmetrical data set. Intensity-dependent standard deviations were computed on the two resulting data sets. The whole original data set was then standardised using both functions calculated for each log ratio class (MA and MB). For markers, controls, unreliable or flagged spots the Z-score was set to 0; otherwise, the Z-score was taken from the data set (positive or negative) corresponding to the sign of its original value. Z-score signs were inverted in the dye-swap data sets, and technical replicates were averaged by geometric mean for each patient. When the two technical replicates of a sample were discordant, the mean Z-score for the spot was set to 0. The geometric mean of MA and MB was then calculated for each patient in order to obtain a single estimate of spot expression. Again, if MA and MB differed in sign, the average spot Z-score was set to 0. Before the clustering phase, one more step was carried out: a new discrete data set was built from the original one, as described in table S1. Both data sets were used in the following steps of the analysis.
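The reflection step and the sign-concordance rule can be sketched in Python under simplifying assumptions: a single standard deviation is used for each side instead of an intensity-dependent one, the population form of the standard deviation is chosen, and the geometric mean of two concordant Z-scores is taken as a signed geometric mean. The function names are hypothetical.

```python
import math

def reflected_sd(one_sided):
    """SD of a one-sided sample after mirroring it over M = 0.

    The mirrored data set is symmetric with mean 0, so its (population)
    standard deviation reduces to sqrt(mean(v**2)) of the original values.
    """
    return math.sqrt(sum(v * v for v in one_sided) / len(one_sided))

def z_scores(m_values, usable):
    """Standardise positive and negative log ratios with their own SD.

    usable[i] is False for markers, controls, unreliable or flagged spots;
    those spots do not enter the SD estimate and get a Z-score of 0.
    """
    pos = [m for m, ok in zip(m_values, usable) if ok and m > 0]
    neg = [m for m, ok in zip(m_values, usable) if ok and m < 0]
    sd_pos = reflected_sd(pos) if pos else None
    sd_neg = reflected_sd(neg) if neg else None
    z = []
    for m, ok in zip(m_values, usable):
        if not ok or m == 0:
            z.append(0.0)
        elif m > 0:
            z.append(m / sd_pos)
        else:
            z.append(m / sd_neg)
    return z

def combine(z1, z2):
    """Signed geometric mean of two Z-scores; 0 when the signs are discordant."""
    if z1 * z2 <= 0:
        return 0.0
    return math.copysign(math.sqrt(z1 * z2), z1)

z = z_scores([1.0, 3.0, -2.0, -2.0, 5.0], [True, True, True, True, False])
```

The same combine rule applies to both averaging steps described above, first across the two technical replicates and then across the MA and MB classes.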
Unsupervised filtering and clustering
An unsupervised analysis was carried out to find gene signatures correlating with the patients' clinical data. First, a filter was applied to the 13496 genes in the platform (markers and controls were excluded). Only 3294 genes from the continuous data set and 3474 from the discrete data set, having |Z-score| ≥ 2 in at least 4 out of 68 Stage I patients, were selected. A divisive clustering algorithm based on neural networks, SOM (Self-Organizing Maps), was applied to both sets of genes. An 11-row by 6-column starting grid with hexagonal symmetry was used. It was initialised by projecting it onto the plane of the first two Principal Components calculated for the genes to be clustered. The gene clustering was implemented in R.
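The gene filter described above amounts to a simple count per gene. A minimal sketch follows, with a hypothetical function name and the thresholds taken from the text as default arguments:

```python
def passes_z_filter(z_row, threshold=2.0, min_patients=4):
    """True if |Z-score| >= threshold in at least min_patients patients."""
    n_extreme = sum(1 for z in z_row if abs(z) >= threshold)
    return n_extreme >= min_patients

# Example with a handful of patients instead of the 68 used in the study.
kept = passes_z_filter([2.5, -3.0, 2.0, -2.1, 0.0])
dropped = passes_z_filter([2.5, -3.0, 1.9, 0.0])
```

Because the test uses the absolute value, up- and down-regulated patients both count towards the minimum of four.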
An agglomerative hierarchical clustering of both genes and patients was used to define gene signatures associated with groups of samples. The Cluster3 program (ref) was used with the Average Linkage clustering rule and Euclidean distance. The algorithm was applied to each of the 132 clusters found (66 in the quantised data set and 66 in the continuous one).
For each cluster in one of the two data sets (continuous or discrete), the corresponding clusters in the other (discrete or continuous, respectively) sharing the most genes were identified. In this way a kind of “homology map” between the two data sets was obtained. Each cluster was visually inspected to find sample groups (defined by gene signatures and sample tree branches) that could be confirmed in its homologous clusters. In total, 74 redundant groups were found.
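One possible way to build such a homology map is to match each cluster to the cluster of the other data set with which it shares the most genes. The sketch below assumes clusters are stored as sets of gene identifiers in dictionaries; the names and the toy identifiers are hypothetical, and ties are resolved by taking the first best match.

```python
def homology_map(clusters_a, clusters_b):
    """Map each cluster of A to the cluster of B sharing the most genes.

    clusters_a, clusters_b: dict mapping cluster name -> set of gene identifiers.
    Returns a dict: name in A -> (best name in B, number of shared genes).
    """
    mapping = {}
    for name_a, genes_a in clusters_a.items():
        best = max(clusters_b, key=lambda name_b: len(genes_a & clusters_b[name_b]))
        mapping[name_a] = (best, len(genes_a & clusters_b[best]))
    return mapping

continuous = {"c1": {"g1", "g2", "g3"}, "c2": {"g4", "g5"}}
discrete = {"d1": {"g4", "g5", "g6"}, "d2": {"g1", "g2"}}
mapping = homology_map(continuous, discrete)
```

Running the function in both directions (continuous to discrete and discrete to continuous) gives the two halves of the map; the visual inspection of sample groups described above is not automated here.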
Procedure 2
Image analysis and data transformation
Images were analysed using the GenePix Pro software (Axon Instruments), version 3.0. GenePix Result files (GPR) and the related scan images, together with the Agilent cDNA pattern file and information on the hybridised samples and labelling signs (positive for normal labelling and negative for dye-swap labelling), were then loaded into the Rosetta Resolver SE software. The Axon GenePix error model was applied to transform the data, and duplicate ratio profiles were then combined to obtain one ratio experiment for each patient. Log ratios, log ratio errors and p-values were thus available for the non-flagged sequences of each ratio experiment.
Unsupervised filtering and clustering
As in Procedure 1, 13496 genes (markers and controls were excluded) were subjected to filtering. Only sequences with at most 20% missing values, a vector standard deviation of at least 0.4 and a p-value less than 0.01 in at least 7 out of 68 stage I patients (∼10%) were submitted to “weighted-by-error” self-organizing maps on a 6x6 grid. 1116 sequences passed the filter, and the 36 nodes obtained after SOM were subjected to weighted agglomerative hierarchical clustering, using cosine correlation as the metric for sequences, Euclidean distance for patients and average linkage as the linkage method. Subsequently, the patient dendrograms obtained for each cluster were split into two (or sometimes three) groups based on the shape of the tree branches or on signatures. In this way, 34 patient groups were selected to be tested for statistical correlation with the available clinical parameters.
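The three filter criteria can be sketched as a single predicate per sequence. The filtering itself was done in Rosetta Resolver, so this Python version is only an illustration: the function name is hypothetical, missing values are represented by None, the sample standard deviation is assumed, and a p-value is counted only where the log ratio is present.

```python
import statistics

def passes_procedure2_filter(log_ratios, p_values, max_missing=0.20,
                             min_sd=0.4, p_cut=0.01, min_hits=7):
    """Apply the missing-value, standard-deviation and p-value criteria to one sequence."""
    n = len(log_ratios)
    present = [x for x in log_ratios if x is not None]
    # Criterion 1: at most 20% missing values.
    if (n - len(present)) / n > max_missing:
        return False
    # Criterion 2: standard deviation of the log ratio vector of at least 0.4.
    if len(present) < 2 or statistics.stdev(present) < min_sd:
        return False
    # Criterion 3: p-value below 0.01 in at least 7 patients.
    hits = sum(1 for x, p in zip(log_ratios, p_values)
               if x is not None and p is not None and p < p_cut)
    return hits >= min_hits

# Toy example with 10 patients instead of 68.
ratios = [0.5, -0.6, None, 0.7, -0.8, 0.9, -1.0, 0.4, 0.3, -0.2]
accepted = passes_procedure2_filter(ratios, [0.001] * 8 + [0.5] * 2)
rejected = passes_procedure2_filter(ratios, [0.001] * 7 + [0.5] * 3)
```

In the second call only six significant p-values fall on patients with a measured log ratio, so the sequence is rejected.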
Metaclustering
In order to identify identical and complementary groups, and hence reduce their number, a “metacluster analysis” was carried out. Presence and absence information (coded as 1 and 0) for each patient in each group was reported in a table, which was the input for a hierarchical clustering (Euclidean distance and Complete linkage rule). The number of groups was thus reduced from 108 (i.e. 74+34) to 94 non-redundant clusters, which were tested for statistical correlation with the available clinical parameters.
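The presence/absence coding and the Euclidean distance between groups can be illustrated with a short Python sketch. The patient labels and function names are hypothetical, and the hierarchical clustering itself is not reproduced.

```python
import math

def membership_vector(group, patients):
    """Code presence (1) or absence (0) of each patient in a group."""
    return [1 if p in group else 0 for p in patients]

def euclidean(u, v):
    """Euclidean distance between two presence/absence vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

patients = ["p1", "p2", "p3", "p4"]
g1 = membership_vector({"p1", "p2"}, patients)
g2 = membership_vector({"p1", "p2"}, patients)   # identical to g1
g3 = membership_vector({"p3", "p4"}, patients)   # complementary to g1

d_identical = euclidean(g1, g2)
d_complementary = euclidean(g1, g3)
```

With this coding, identical groups lie at distance 0, while complementary groups lie at the maximum possible distance, the square root of the number of patients.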
Gene Name   Genebank Acc. Number   Sense   Antisense   Ta (°C)
ADAMTS9 BM469961 GTGACACCTCAGAACACAAA GCTAAGCCGCTGTTTAATGC 60
CDH9 NM_016279 GTCTGGAGTCGGTACATCTG GGCTGTCCTTGCAATATGCT 60
CTH NM_001902 GTGTATGGAGGTACAAACAG CAGCCTTCAATGTCAATCACC 60
GRB14 XM_001131281 CGCTTGGAGGAAAAAAGGAT GCTGGGACCGGTGGATAG 60
IGFBP1 NM_000596 TCCCCATGCTGCAGAGGCAGGGAG AGAGCCTTCGAGCCATCATA 60
NDRG1 AL550073 AACAGTTTGGGCTGAAAAGC ATAAGGACAAGGCCCTCCAC 60
CYP2S1 NM_030622 GATGGCCATGGGGTTTTCTT ATCAGCTCCTCGCCTTCTC 60
CYP4F12 AK091995 ATTGTCAGGAGAGGCCCAGT CCCGTCATGGGAGAGGTAAT 60
MGST2 BY796194 CGGGCACAACAAAACTGTGT CCAGACCCAGACAAGTAGCA 60
NQO1 BU167840 GATATTCCAGTTCCCCCTGC CGGAAGGGTCCTTTGTCATA 60
NR1I2 AK122990 AGACACTGCAGGTGGCTTCCA TCTGGGGAGAAGAGGGAGAT 60
TNF NM_000594 AGAGGGAAGAGTTCCCCAGG CAGCTTGAGGGTTTGCTACA 60
CCNE1 M74093 TGTACTGAGCTGGGCAAATA ACACACCTCCATTAACCAATCC 60
MCM5 NM_006739 ACTGCGACAGGTACCTGTGTG ACACGGATGTAGGAGCTTCG 60
Cyclophilin A NM_021130 GCGTCTCCTTTGAGCTGTTT GTCTTGGCAGTGCAGATGAA 60
Actin B XR_019170 CACCCACACTGTGCCCATCTA CAGCGGAACCGCTCATTGCCAATGG 60
S-Table I. List of primer pair sequences, GenBank accession numbers and annealing temperatures (Ta) of the genes analysed by real-time RT-PCR.
A) PFS
Variable   Hazard ratio   Lower 95% CI   Upper 95% CI   p
Age 1.03 0.99 1.08 0.186
Sub-stage 2.73 1.08 6.86 0.033
Clear cells 1.96 0.73 5.24 0.180
Grading 2.91 1.25 6.76 0.013
B) OS
Variable   Hazard ratio   Lower 95% CI   Upper 95% CI   p
Age 1.05 0.99 1.11 0.139
Sub-stage 1.95 0.78 4.87 0.151
Clear cells 4.19 1.34 13.06 0.013
Grading 2.56 0.98 6.63 0.054
S-Table II. Relationships between PFS (A) or OS (B) and age, sub-stage, clear-cell histotype and grading by univariate Cox proportional hazard models. p is the p-value referred to Welch t-test (p<0.05).
1) Type of cluster: frequencies and percentages of patients in each hierarchical cluster
Cluster   N. pts in cluster   Total pts   %   Cluster   N. pts in cluster   Total pts   %   Cluster   N. pts in cluster   Total pts   %   Cluster   N. pts in cluster   Total pts   %
A 19 68 0.2794 Y 30 68 0.4412 AW 12 68 0.1765 BU 20 68 0.2941
B 13 68 0.1912 Z 37 68 0.5441 AX 11 68 0.1618 BV 31 68 0.4559
C 13 68 0.1912 AA 5 68 0.0735 AY 14 68 0.2059 BW 12 68 0.1765
D 6 68 0.0882 AB 34 68 0.5000 AZ 9 68 0.1324 BX 14 68 0.2059
E 28 68 0.4118 AC 35 68 0.5147 BA 28 68 0.4118 BY 12 68 0.1765
F 16 68 0.2353 AD 50 68 0.7353 BB 31 68 0.4559 BZ 19 68 0.2794
G 3 68 0.0441 AE 6 68 0.0882 BC 12 68 0.1765 CA 7 68 0.1029
H 21 68 0.3088 AF 9 68 0.1324 BD 11 68 0.1618 BC 12 68 0.1765
I 25 68 0.3676 AG 5 68 0.0735 BE 13 68 0.1912 CC 35 68 0.5147
J 10 68 0.1471 AH 56 68 0.8235 BF 5 68 0.0735 CD 10 68 0.1471
K 13 68 0.1912 AI 35 68 0.5147 BG 19 68 0.2794 CE 13 68 0.1912
L 35 68 0.5147 AJ 24 68 0.3529 BH 32 68 0.4706 CF 14 68 0.2059
M 36 68 0.5294 AK 8 68 0.1176 BI 17 68 0.2500 CH 19 68 0.2794
Nn 51 68 0.7500 AL 6 68 0.0882 BJ 11 68 0.1618 CI 27 68 0.3971
O 7 68 0.1029 AM 17 68 0.2500 BK 13 68 0.1912 CJ 22 68 0.3235
P 10 68 0.1471 AN 8 68 0.1176 BL 10 68 0.1471 CL 33 68 0.4853
Q 6 68 0.0882 AO 47 68 0.6912 BM 8 68 0.1176 CN 11 68 0.1618
R 11 68 0.1618 AP 13 68 0.1912 BN 10 68 0.1471 CO 21 68 0.3088
S 29 68 0.4265 AQ 52 68 0.7647 BO 22 68 0.3235 CP 34 68 0.5000
T 13 68 0.1912 AR 4 68 0.0588 BP 36 68 0.5294 CQ 13 68 0.1912
U 25 68 0.3676 AS 20 68 0.2941 BQ 32 68 0.4706 CR 23 68 0.3382
V 7 68 0.1029 AT 11 68 0.1618 BR 16 68 0.2353 CS 28 67 0.4179
W 16 68 0.2353 AU 31 68 0.4559 BS 6 68 0.0882
X 33 68 0.4853 AV 6 68 0.0882 BT 4 68 0.0588
2) correlation among clusters and tumor characteristics: frequencies and percentages of patients in each cluster divided by substage (a, b, c)
Cluster   Substage a (N. pts in cluster, Tot pts, %)   Substage b (N. pts in cluster, Tot pts, %)   Substage c (N. pts in cluster, Tot pts, %)
A 5 17 .294 1 4 .25 13 47 .276
B 4 17 .235 0 4 .00 9 47 .191
C 3 17 .176 0 4 .00 10 47 .212
D 2 17 .117 0 4 .00 4 47 .085
E 7 17 .411 1 4 .25 20 47 .425
F 3 17 .176 2 4 .50 11 47 .234
G 1 17 .058 0 4 .00 2 47 .042
H 6 17 .352 1 4 .25 14 47 .297
I 8 17 .470 1 4 .25 16 47 .340
J 3 17 .176 0 4 .00 7 47 .148
K 4 17 .235 0 4 .00 9 47 .191
L 9 17 .529 1 4 .25 25 47 .531
M 9 17 .529 1 4 .25 26 47 .553
Nn 13 17 .764 2 4 .50 36 47 .766
O 2 17 .117 0 4 .00 5 47 .106
P 1 17 .058 0 4 .00 9 47 .191
Q 2 17 .117 0 4 .00 4 47 .085
R 3 17 .176 0 4 .00 8 47 .170
S 5 17 .294 1 4 .25 23 47 .489
T 3 17 .176 1 4 .25 9 47 .191
U 5 17 .294 1 4 .25 19 47 .404
V 0 17 .000 1 4 .25 6 47 .127
W 6 17 .352 1 4 .25 9 47 .191
X 9 17 .529 2 4 .50 22 47 .468
Y 8 17 .470 1 4 .25 21 47 .446
Z 11 17 .647 1 4 .25 25 47 .531
AA 2 17 .117 0 4 .00 3 47 .063
AB 7 17 .411 1 4 .25 26 47 .553
AC 9 17 .529 1 4 .25 25 47 .531
AD 14 17 .823 1 4 .25 35 47 .744
AE 2 17 .117 0 4 .00 4 47 .085
AF 3 17 .176 0 4 .00 6 47 .127
AG 2 17 .117 0 4 .00 3 47 .063
AH 14 17 .823 4 4 1.0 38 47 .808
AI 9 17 .529 1 4 .25 25 47 .531
AJ 5 17 .294 2 4 .50 17 47 .361
AK 2 17 .117 0 4 .00 6 47 .127
AL 3 17 .176 0 4 .00 3 47 .063
AM 5 17 .294 0 4 .00 12 47 .255
AN 3 17 .176 0 4 .00 5 47 .106
AO 13 17 .764 1 4 .25 33 47 .702
AP 3 17 .176 2 4 .50 8 47 .170
AQ 13 17 .764 2 4 .50 37 47 .787
AR 2 17 .117 1 4 .25 1 47 .021
AS 5 17 .294 1 4 .25 14 47 .297
AT 3 17 .176 2 4 .50 6 47 .127
AU 4 17 .235 2 4 .50 25 47 .531
AV 1 17 .058 0 4 .00 5 47 .106
AW 2 17 .117 1 4 .25 9 47 .191
AX 3 17 .176 1 4 .25 7 47 .148
AY 3 17 .176 0 4 .00 11 47 .234
AZ 3 17 .176 1 4 .25 5 47 .106
BA 7 17 .411 1 4 .25 20 47 .425
BB 8 17 .470 2 4 .50 21 47 .446
BC 3 17 .176 0 4 .00 9 47 .191
BD 2 17 .117 1 4 .25 8 47 .170
BE 4 17 .235 0 4 .00 9 47 .191
BF 1 17 .058 0 4 .00 4 47 .085
BG 5 17 .294 0 4 .00 14 47 .297
BH 10 17 .588 1 4 .25 21 47 .446
BI 2 17 .117 2 4 .50 13 47 .276
BJ 4 17 .235 1 4 .25 6 47 .127
BK 4 17 .235 0 4 .00 9 47 .191
BL 3 17 .176 1 4 .25 6 47 .127
BM 3 17 .176 0 4 .0 5 47 .106
BN 4 17 .235 0 4 .0 6 47 .127
BO 4 17 .235 2 4 .5 16 47 .340
BP 9 17 .529 2 4 .5 25 47 .531
BQ 8 17 .470 2 4 .5 22 47 .468
BR 5 17 .294 1 4 .2 10 47 .212
BS 3 17 .176 0 4 .0 3 47 .063
BT 3 17 .176 0 4 .0 1 47 .021
BU 6 17 .352 2 4 .5 12 47 .255
BV 9 17 .529 1 4 .3 21 47 .446
BW 4 17 .235 0 4 .0 8 47 .170
BX 4 17 .235 0 4 .0 10 47 .212
BY 4 17 .235 0 4 .0 8 47 .170
BZ 6 17 .352 0 4 .0 13 47 .276
CA 2 17 .117 0 4 .0 5 47 .106
BC 3 17 .176 0 4 .0 9 47 .191
CC 12 17 .705 2 4 .5 21 47 .446
CD 3 17 .176 0 4 .0 7 47 .148
CE 5 17 .294 0 4 .0 8 47 .170
CF 2 17 .117 0 4 .0 12 47 .255
CH 8 17 .47 0 4 .0 11 47 .234
CI 3 17 .176 2 4 .5 22 47 .468
CJ 6 17 .352 2 4 .0 14 47 .297
CL 8 17 .470 2 4 .5 23 47 .489
CN 3 17 .176 0 4 .0 8 47 .170
CO 2 17 .117 2 4 .5 17 47 .361
CP 12 17 .705 1 4 .3 21 47 .446
CQ 3 17 .176 1 4 .3 9 47 .191
CR 7 17 .411 1 4 .3 15 47 .319
CS 8 17 .470 2 4 .5 18 46 .391
3) correlation among clusters and tumor characteristics (Histotype): frequencies and percentages of patients in each cluster divided by histotype (serous, mucinous, endometrioid, clear cells and undefined)
Cluster   serous (pts in cluster, tot pts, %)   mucinous (pts in cluster, tot pts, %)   endometrioid (pts in cluster, tot pts, %)   Undiff (pts in cluster, tot pts, %)   clear cells (pts in cluster, tot pts, %)
A 5 24 .2083 5 10 .5000 4 17 .2353 0 1 .0000 5 16 .3125
B 2 24 .0833 0 10 .0000 9 17 .5294 0 1 .0000 2 16 .1250
C 1 24 .0417 0 10 .0000 0 17 .0000 0 1 .0000 12 16 .7500
D 0 24 .0000 6 10 .6000 0 17 .0000 0 1 .0000 0 16 .0000
E 7 24 .2917 8 10 .8000 4 17 .2353 1 1 1.000 8 16 .5000
F 8 24 .3333 1 10 .1000 3 17 .1765 1 1 1.000 3 16 .1875
G 1 24 .0417 0 10 .0000 1 17 .0588 0 1 .0000 1 16 .0625
H 5 24 .2083 3 10 .3000 4 17 .2353 0 1 .0000 9 16 .5625
I 8 24 .3333 2 10 .2000 12 17 .7059 1 1 1.000 2 16 .1250
J 2 24 .0833 0 10 .0000 7 17 .4118 0 1 .0000 1 16 .0625
K 2 24 .0833 1 10 .1000 8 17 .4706 0 1 .0000 2 16 .1250
L 9 24 .3750 5 10 .5000 10 17 .5882 1 1 1.000 10 16 .6250
M 9 24 .3750 5 10 .5000 10 17 .5882 1 1 1.000 11 16 .6875
Nn 16 24 .6667 7 10 .7000 13 17 .7647 1 1 1.000 14 16 .8750
O 0 24 .0000 6 10 .6000 1 17 .0588 0 1 .0000 0 16 .0000
P 1 24 .0417 1 10 .1000 2 17 .1176 0 1 .0000 6 16 .3750
Q 0 24 .0000 6 10 .6000 0 17 .0000 0 1 .0000 0 16 .0000
R 1 24 .0417 0 10 .0000 0 17 .0000 0 1 .0000 10 16 .6250
S 9 24 .3750 4 10 .4000 9 17 .5294 1 1 1.000 6 16 .3750
T 4 24 .1667 3 10 .3000 2 17 .1176 1 1 1.000 3 16 .1875
U 8 24 .3333 4 10 .4000 7 17 .4118 1 1 1.000 5 16 .3125
V 2 24 .0833 1 10 .1000 2 17 .1176 0 1 .0000 2 16 .1250
W 7 24 .2917 2 10 .2000 4 17 .2353 0 1 .0000 3 16 .1875
X 14 24 .5833 4 10 .4000 9 17 .5294 0 1 .0000 6 16 .3750
Y 12 24 .5000 2 10 .2000 8 17 .4706 0 1 .0000 8 16 .5000
Z 12 24 .5000 7 10 .7000 15 17 .8824 1 1 1.000 2 16 .1250
AA 0 24 .0000 5 10 .5000 0 17 .0000 0 1 .0000 0 16 .0000
AB 9 24 .3750 3 10 .3000 10 17 .5882 1 1 1.000 11 16 .6875
AC 9 24 .3750 5 10 .5000 10 17 .5882 1 1 1.000 10 16 .6250
AD 15 24 .6250 5 10 .5000 14 17 .8235 1 1 1.000 15 16 .9375
AE 2 24 .0833 0 10 .0000 4 17 .2353 0 1 .0000 0 16 .0000
AF 2 24 .0833 0 10 .0000 6 17 .3529 0 1 .0000 1 16 .0625
AG 1 24 .0417 0 10 .0000 4 17 .2353 0 1 .0000 0 16 .0000
AH 22 24 .9167 9 10 .9000 9 17 .5294 1 1 1.000 15 16 .9375
AI 9 24 .3750 5 10 .5000 9 17 .5294 1 1 1.000 11 16 .6875
AJ 9 24 .3750 7 10 .7000 2 17 .1176 0 1 .0000 6 16 .3750
AK 2 24 .0833 1 10 .1000 5 17 .2941 0 1 .0000 0 16 .0000
AL 1 24 .0417 0 10 .0000 5 17 .2941 0 1 .0000 0 16 .0000
AM 1 24 .0417 4 10 .4000 0 17 .0000 0 1 .0000 12 16 .7500
AN 1 24 .0417 0 10 .0000 6 17 .3529 0 1 .0000 1 16 .0625
AO 13 24 .5417 5 10 .5000 13 17 .7647 1 1 1.000 15 16 .9375
AP 7 24 .2917 1 10 .1000 2 17 .1176 1 1 1.000 2 16 .1250
AQ 17 24 .7083 8 10 .8000 12 17 .7059 0 1 .0000 15 16 .9375
AR 2 24 .0833 0 10 .0000 1 17 .0588 0 1 .0000 1 16 .0625
AS 5 24 .2083 3 10 .3000 4 17 .2353 0 1 .0000 8 16 .5000
AT 6 24 .2500 0 10 .0000 2 17 .1176 1 1 1.000 2 16 .1250
AU 10 24 .4167 3 10 .3000 9 17 .5294 1 1 1.000 8 16 .5000
AV 0 24 .0000 0 10 .0000 0 17 .0000 0 1 .0000 6 16 .3750
AW 4 24 .1667 2 10 .2000 3 17 .1765 1 1 1.000 2 16 .1250
AX 6 24 .2500 1 10 .1000 4 17 .2353 0 1 .0000 0 16 .0000
AY 1 24 .0417 0 10 .0000 0 17 .0000 0 1 .0000 13 16 .8125
AZ 3 24 .1250 2 10 .2000 3 17 .1765 0 1 .0000 1 16 .0625
BA 12 24 .5000 4 10 .4000 7 17 .4118 0 1 .0000 5 16 .3125
BB 14 24 .5833 5 10 .5000 7 17 .4118 0 1 .0000 5 16 .3125
BC 2 24 .0833 0 10 .0000 8 17 .4706 0 1 .0000 2 16 .1250
BD 8 24 .3333 0 10 .0000 3 17 .1765 0 1 .0000 0 16 .0000
BE 3 24 .1250 4 10 .4000 4 17 .2353 0 1 .0000 2 16 .1250
BF 0 24 .0000 5 10 .5000 0 17 .0000 0 1 .0000 0 16 .0000
BG 6 24 .2500 1 10 .1000 5 17 .2941 0 1 .0000 7 16 .4375
BH 12 24 .5000 4 10 .4000 10 17 .5882 0 1 .0000 6 16 .3750
BI 7 24 .2917 1 10 .1000 3 17 .1765 1 1 1.000 5 16 .3125
BJ 6 24 .2500 1 10 .1000 2 17 .1176 0 1 .0000 2 16 .1250
BK 2 24 .0833 7 10 .7000 3 17 .1765 0 1 .0000 1 16 .0625
BL 8 24 .3333 0 10 .0000 1 17 .0588 0 1 .0000 1 16 .0625
BM 5 24 .2083 1 10 .1000 1 17 .0588 0 1 .0000 1 16 .0625
BN 5 24 .2083 2 10 .2000 2 17 .1176 0 1 .0000 1 16 .0625
BO 9 24 .3750 3 10 .3000 6 17 .3529 0 1 .0000 4 16 .2500
BP 10 24 .4167 5 10 .5000 9 17 .5294 1 1 1.000 11 16 .6875
BQ 14 24 .5833 5 10 .5000 8 17 .4706 0 1 .0000 5 16 .3125
BR 3 24 .1250 3 10 .3000 2 17 .1176 0 1 .0000 8 16 .5000
BS 3 24 .1250 1 10 .1000 1 17 .0588 0 1 .0000 1 16 .0625
BT 2 24 .0833 1 10 .1000 1 17 .0588 0 1 .0000 0 16 .0000
BU 10 24 .4167 4 10 .4000 2 17 .1176 0 1 .0000 4 16 .2500
BV 7 24 .2917 6 10 .6000 8 17 .4706 0 1 .0000 10 16 .6250
BW 6 24 .2500 2 10 .2000 2 17 .1176 0 1 .0000 2 16 .1250
BX 1 24 .0417 6 10 .6000 0 17 .0000 0 1 .0000 7 16 .4375
BY 3 24 .1250 7 10 .7000 2 17 .1176 0 1 .0000 0 16 .0000
BZ 5 24 .2083 2 10 .2000 11 17 .6471 0 1 .0000 1 16 .0625
CA 0 24 .0000 6 10 .6000 0 17 .0000 0 1 .0000 1 16 .0625
BC 2 24 .0833 0 10 .0000 8 17 .4706 0 1 .0000 2 16 .1250
CC 10 24 .4167 5 10 .5000 10 17 .5882 0 1 .0000 10 16 .6250
CD 1 24 .0417 0 10 .0000 0 17 .0000 0 1 .0000 9 16 .5625
CE 2 24 .0833 1 10 .1000 1 17 .0588 0 1 .0000 9 16 .5625
CF 4 24 .1667 1 10 .1000 4 17 .2353 1 1 1.000 4 16 .2500
CH 9 24 .3750 1 10 .1000 5 17 .2941 0 1 .0000 4 16 .2500
CI 7 24 .2917 5 10 .5000 5 17 .2941 1 1 1.000 9 16 .5625
CJ 8 24 .3333 4 10 .4000 7 17 .4118 0 1 .0000 3 16 .1875
CL 15 24 .6250 5 10 .5000 8 17 .4706 0 1 .0000 5 16 .3125
CN 2 24 .0833 0 10 .0000 8 17 .4706 0 1 .0000 1 16 .0625
CO 7 24 .2917 2 10 .2000 6 17 .3529 0 1 .0000 6 16 .3750
CP 11 24 .4583 7 10 .7000 9 17 .5294 0 1 .0000 7 16 .4375
CQ 6 24 .2500 1 10 .1000 2 17 .1176 1 1 1.000 3 16 .1875
CR 5 24 .2083 6 10 .6000 3 17 .1765 0 1 .0000 9 16 .5625
CS 14 24 .5833 2 10 .2000 8 17 .4706 . 0 . 4 16 .2500
4) correlation among clusters and tumor characteristics (Grading): frequencies and percentages of patients in each cluster divided by grading (1, 2, 3)
Cluster   Grading 1 (pts in cluster, tot pts, %)   Grading 2 (pts in cluster, tot pts, %)   Grading 3 (pts in cluster, tot pts, %)
A 5 13 .3846 5 19 .2632 9 36 .2500
B 2 13 .1538 6 19 .3158 5 36 .1389
C 0 13 .0000 1 19 .0526 12 36 .3333
D 4 13 .3077 2 19 .1053 0 36 .0000
E 7 13 .5385 8 19 .4211 13 36 .3611
F 1 13 .0769 2 19 .1053 13 36 .3611
G 0 13 .0000 1 19 .0526 2 36 .0556
H 4 13 .3077 5 19 .2632 12 36 .3333
I 8 13 .6154 9 19 .4737 8 36 .2222
J 2 13 .1538 5 19 .2632 3 36 .0833
K 3 13 .2308 5 19 .2632 5 36 .1389
L 7 13 .5385 6 19 .3158 22 36 .6111
M 7 13 .5385 6 19 .3158 23 36 .6389
Nn 11 13 .8462 11 19 .5789 29 36 .8056
O 4 13 .3077 3 19 .1579 0 36 .0000
P 2 13 .1538 0 19 .0000 8 36 .2222
Q 4 13 .3077 2 19 .1053 0 36 .0000
R 0 13 .0000 1 19 .0526 10 36 .2778
S 7 13 .5385 6 19 .3158 16 36 .4444
T 3 13 .2308 1 19 .0526 9 36 .2500
U 6 13 .4615 4 19 .2105 15 36 .4167
V 2 13 .1538 0 19 .0000 5 36 .1389
W 4 13 .3077 4 19 .2105 8 36 .2222
X 7 13 .5385 13 19 .6842 13 36 .3611
Y 6 13 .4615 6 19 .3158 18 36 .5000
Z 11 13 .8462 14 19 .7368 12 36 .3333
AA 3 13 .2308 2 19 .1053 0 36 .0000
AB 6 13 .4615 5 19 .2632 23 36 .6389
AC 7 13 .5385 6 19 .3158 22 36 .6111
AD 9 13 .6923 12 19 .6316 29 36 .8056
AE 1 13 .0769 4 19 .2105 1 36 .0278
AF 2 13 .1538 4 19 .2105 3 36 .0833
AG 1 13 .0769 3 19 .1579 1 36 .0278
AH 10 13 .7692 13 19 .6842 33 36 .9167
AI 7 13 .5385 5 19 .2632 23 36 .6389
AJ 4 13 .3077 7 19 .3684 13 36 .3611
AK 3 13 .2308 4 19 .2105 1 36 .0278
AL 2 13 .1538 4 19 .2105 0 36 .0000
AM 2 13 .1538 3 19 .1579 12 36 .3333
AN 1 13 .0769 5 19 .2632 2 36 .0556
AO 9 13 .6923 10 19 .5263 28 36 .7778
AP 1 13 .0769 2 19 .1053 10 36 .2778
AQ 12 13 .9231 16 19 .8421 24 36 .6667
AR 0 13 .0000 1 19 .0526 3 36 .0833
AS 4 13 .3077 5 19 .2632 11 36 .3056
AT 0 13 .0000 1 19 .0526 10 36 .2778
AU 6 13 .4615 6 19 .3158 19 36 .5278
AV 0 13 .0000 0 19 .0000 6 36 .1667
AW 3 13 .2308 3 19 .1579 6 36 .1667
AX 3 13 .2308 2 19 .1053 6 36 .1667
AY 0 13 .0000 1 19 .0526 13 36 .3611
AZ 2 13 .1538 4 19 .2105 3 36 .0833
BA 5 13 .3846 13 19 .6842 10 36 .2778
BB 6 13 .4615 13 19 .6842 12 36 .3333
BC 2 13 .1538 6 19 .3158 4 36 .1111
BD 2 13 .1538 5 19 .2632 4 36 .1111
BE 4 13 .3077 6 19 .3158 3 36 .0833
BF 4 13 .3077 1 19 .0526 0 36 .0000
BG 3 13 .2308 5 19 .2632 11 36 .3056
BH 7 13 .5385 6 19 .3158 19 36 .5278
BI 1 13 .0769 8 19 .4211 8 36 .2222
BJ 3 13 .2308 1 19 .0526 7 36 .1944
BK 7 13 .5385 5 19 .2632 1 36 .0278
BL 1 13 .0769 3 19 .1579 6 36 .1667
BM 2 13 .1538 2 19 .1053 4 36 .1111
BN 3 13 .2308 4 19 .2105 3 36 .0833
BO 4 13 .3077 9 19 .4737 9 36 .2500
BP 6 13 .4615 6 19 .3158 24 36 .6667
BQ 7 13 .5385 13 19 .6842 12 36 .3333
BR 2 13 .1538 4 19 .2105 10 36 .2778
BS 2 13 .1538 2 19 .1053 2 36 .0556
BT 2 13 .1538 2 19 .1053 0 36 .0000
BU 5 13 .3846 6 19 .3158 9 36 .2500
BV 6 13 .4615 10 19 .5263 15 36 .4167
BW 3 13 .2308 2 19 .1053 7 36 .1944
BX 3 13 .2308 3 19 .1579 8 36 .2222
BY 6 13 .4615 6 19 .3158 0 36 .0000
BZ 6 13 .4615 8 19 .4211 5 36 .1389
CA 4 13 .3077 2 19 .1053 1 36 .0278
BC 2 13 .1538 6 19 .3158 4 36 .1111
CC 8 13 .6154 12 19 .6316 15 36 .4167
CD 0 13 .0000 1 19 .0526 9 36 .2500
CE 2 13 .1538 2 19 .1053 9 36 .2500
CF 3 13 .2308 2 19 .1053 9 36 .2500
CH 4 13 .3077 5 19 .2632 10 36 .2778
CI 3 13 .2308 5 19 .2632 19 36 .5278
CJ 6 13 .4615 9 19 .4737 7 36 .1944
CL 8 13 .6154 13 19 .6842 12 36 .3333
CN 2 13 .1538 6 19 .3158 3 36 .0833
CO 4 13 .3077 6 19 .3158 11 36 .3056
CP 8 13 .6154 11 19 .5789 15 36 .4167
CQ 1 13 .0769 2 19 .1053 10 36 .2778
CR 5 13 .3846 7 19 .3684 11 36 .3056
CS 6 13 .4615 11 19 .5789 11 35 .3143
Median age at diagnosis: 52
Histopathological parameters   N° of cases   Percentage (%)   Chemotherapeutic regimens: CBDCA   NT
Stage I 21
a 13 61.9 5 9
b 1 4.76 1 -
c 7 33.34 4 2
Grade
1 8 30.09 1 7
2 7 33.34 4 3
3 6 28.57 5 1
Histotype
Serous 6 28.57 3 3
Mucinous 9 42.86 1 8
Endometroid 5 23.81 5 -
Clear cell 1 4.76 1 -
S-Table IV. Clinical parameters and chemotherapy regimens of the 21 EOC samples used as test set. CBDCA, carboplatin; NT, not treated.
Genbank Acc. N°   Symbol   Name
GO Biological Process: Organic acid transport
BC040210 AKR1C2 Aldo-keto reductase family 1, member C2 (dihydrodiol dehydrogenase 2; bile acid binding protein; 3-alpha hydroxysteroid dehydrogenase, type III)
NM_007231 SLC6A14 Solute carrier family 6 (amino acid transporter), member 14
BC020744 AKR1C4 Aldo-keto reductase family 1, member C4 (chlordecone reductase; 3-alpha hydroxysteroid dehydrogenase, type I; dihydrodiol dehydrogenase 4)
AC092060 ARL6IP5 ADP-ribosylation-like factor 6 interacting protein 5
GO Biological Process: Carboxylic acid transport
BC040210 AKR1C2 Aldo-keto reductase family 1, member C2 (dihydrodiol dehydrogenase 2; bile acid binding protein; 3-alpha hydroxysteroid dehydrogenase, type III)
NM_007231 SLC6A14 Solute carrier family 6 (amino acid transporter), member 14
BC020744 AKR1C4 Aldo-keto reductase family 1, member C4 (chlordecone reductase; 3-alpha hydroxysteroid dehydrogenase, type I; dihydrodiol dehydrogenase 4)
AC092060 ARL6IP5 ADP-ribosylation-like factor 6 interacting protein 5
GO Biological Process: Second-messenger-mediated signaling
AB008193 LTB4R Leukotriene B4 receptor
NM_004080 DGKB Diacylglycerol kinase, beta 90kDa
AL121917 GNAS GNAS complex locus
NM_000823 GHRHR Growth hormone releasing hormone receptor
AK126708 GNAI2 Guanine nucleotide binding protein (G protein), alpha inhibiting activity polypeptide 2
CA307692 ADORA2A Adenosine A2a receptor
GO Biological Process: Cytoskeleton organization and biogenesis
NM_145754 KIFC2 Kinesin family member C2
NM_000928 PLA2G1B Phospholipase A2, group IB (pancreas)
AK023766 DNAH1 Dynein, axonemal, heavy polypeptide 1
NM_004240 TRIP10 Thyroid hormone receptor interactor 10
BG286577 TUBB Tubulin, beta
NM_183419 RNF19 Ring finger protein 19
AF177198 TLN1 Talin 1
CF457414 SORBS1 Sorbin and SH3 domain containing 1
DB565402 LASP1 LIM and SH3 protein 1
CA314918 LMO7 LIM domain 7
GO Biological Process: Regulation of progression through cell cycle
DB353455 MCM5 Minichromosome maintenance complex component 5
NM_003952 RPS6KB2 Ribosomal protein S6 kinase, 70kDa, polypeptide 2
AF195139 PNN Pinin, desmosome associated protein
M74093, BG761079 CCNE1 Cyclin E1
AY561635 KLK10 Kallikrein-related peptidase 10
NM_006191 PA2G4 Proliferation-associated 2G4, 38kDa
NM_138292 ATM Ataxia telangiectasia mutated
BC007255 PGF Placental growth factor, vascular endothelial growth factor-related protein
NM_004073 PLK3 Polo-like kinase 3 (Drosophila)
NM_003377 VEGFB Vascular endothelial growth factor B
DB509863 CCNL2 Cyclin L2
GO Biological Process: Regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolism
BC068292 BHLHB2 Basic helix-loop-helix domain containing, class B, 2
NM_002509 NKX2-2 NK2 homeobox 2
AF195139 PNN Pinin, desmosome associated protein
NM_005169 PHOX2A Paired-like homeobox 2a
M74093, BG761079 CCNE1 Cyclin E1
NM_006191 PA2G4 Proliferation-associated 2G4, 38kDa
BC001562 NCOA4 Nuclear receptor coactivator 4
AJ492196 ZNF248 Zinc finger protein 248
CB269721 NRIP1 Nuclear receptor interacting protein 1
BC007333 ETV5 Ets variant gene 5 (ets-related molecule)
NM_018429 BDP1 B double prime 1, subunit of RNA polymerase III transcription initiation factor IIIB
AF317391 BCOR BCL6 co-repressor
AF349038 TAF7 TAF7 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 55kDa
NM_002383 MAZ MYC-associated zinc finger protein (purine-binding transcription factor)
BC111408 ZNF276 Zinc finger protein 276
BE748366 PAX8 Paired box 8
BX385997 SQSTM1 Sequestosome 1
DB552558 GCN5L2 GCN5 general control of amino-acid synthesis 5-like 2 (yeast)
DQ895028 SMAD4 SMAD family member 4
CD364918 NR4A1 Nuclear receptor subfamily 4, group A, member 1
AI584154 CBFA2T3 Core-binding factor, runt domain, alpha subunit 2; translocated to, 3
BC012070 ZBTB7B Zinc finger and BTB domain containing 7B
CT004126 JUN Jun oncogene
AL161658 INSM1 Insulinoma-associated 1
DB509863 CCNL2 Cyclin L2
DB210960 SMARCA5 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 5
DB353455 MCM5 Minichromosome maintenance complex component 5
AAH07256 ZNF23 Zinc finger protein 23 (KOX 16)
NM_005599 NHLH2 Nescient helix loop helix 2
NM_005230 ELK3 ELK3, ETS-domain protein (SRF accessory protein 2)
NM_138292 ATM Ataxia telangiectasia mutated
DA493585 SMARCA1 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 1
GO Biological Process: Transcription
BC068292 BHLHB2 Basic helix-loop-helix domain containing, class B, 2
NM_002509 NKX2-2 NK2 homeobox 2
AF195139 PNN Pinin, desmosome associated protein
NM_005169 PHOX2A Paired-like homeobox 2a
M74093, BG761079 CCNE1 Cyclin E1
NM_006191 PA2G4 Proliferation-associated 2G4, 38kDa
BC001562 NCOA4 Nuclear receptor coactivator 4
AJ492196 ZNF248 Zinc finger protein 248
CB269721 NRIP1 Nuclear receptor interacting protein 1
BC007333 ETV5 Ets variant gene 5 (ets-related molecule)
AB209594 TAF1C TATA box binding protein (TBP)-associated factor, RNA polymerase I, C, 110kDa
NM_018429 BDP1 B double prime 1, subunit of RNA polymerase III transcription initiation factor IIIB
AF317391 BCOR BCL6 co-repressor
AF349038 TAF7 TAF7 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 55kDa
NM_002383 MAZ MYC-associated zinc finger protein (purine-binding transcription factor)
BC111408 ZNF276 Zinc finger protein 276
BE748366 PAX8 Paired box 8
BX385997 SQSTM1 Sequestosome 1
DB552558 GCN5L2 GCN5 general control of amino-acid synthesis 5-like 2 (yeast)
DQ895028 SMAD4 SMAD family member 4
CD364918 NR4A1 Nuclear receptor subfamily 4, group A, member 1
AI584154 CBFA2T3 Core-binding factor, runt domain, alpha subunit 2; translocated to, 3
BC012070 ZBTB7B Zinc finger and BTB domain containing 7B
CT004126 JUN Jun oncogene
AL161658 INSM1 Insulinoma-associated 1
DB509863 CCNL2 Cyclin L2
DB210960 SMARCA5 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 5
DB353455 MCM5 Minichromosome maintenance complex component 5
AAH07256 ZNF23 Zinc finger protein 23 (KOX 16)
NM_005599 NHLH2 Nescient helix loop helix 2
NM_005230 ELK3 ELK3, ETS-domain protein (SRF accessory protein 2)
NM_138292 ATM Ataxia telangiectasia mutated
DA493585 SMARCA1 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 1
S-Table V. Significant GO Biological Process ontologies resulting from the DAVID functional annotation enrichment analysis applied to the 188 genes defining the differences between relapsed and non-relapsed samples. For each ontology group, the GenBank accession number, gene symbol and name are reported. Red color refers to genes up-regulated in relapsers compared to non-relapsers. Green color refers to genes down-regulated in relapsers compared to non-relapsers.
A) PFS
Variable   Hazard ratio   Lower 95% CI   Upper 95% CI   p
CCNE1 1.417 1.1 1.826 0.0069
MCM5 1.797 0.943 3.425 0.075
Grading 3.091 1.313 7.276 0.0098
Histotype 0.728 0.46 1.15 0.173
Chemotherapy 0.417 0.164 1.058 0.066
Chemo*CCNE1* 3.393 0.952 12.089 0.0595
Chemo*MCM5* 6.032 0.772 47.12 0.087
B) OS
Variable   Hazard ratio   Lower 95% CI   Upper 95% CI   p
CCNE1 1.132 0.842 1.523 0.412
MCM5 0.766 0.371 1.580 0.47
Grading 2.656 1.017 6.939 0.046
Histotype 1.062 0.654 1.724 0.8075
Chemotherapy 0.384 0.125 1.178 0.094
Chemo*CCNE1* 1.651 0.358 7.616 0.5202
Chemo*MCM5* 0.670 0.173 2.592 0.562
C) PFS
Variable   Hazard ratio   Lower 95% CI   Upper 95% CI   p
CCNE1 1.238 0.931 1.646 0.1412
Grading 2.415 0.972 5.999 0.0577
S-Table VI: Relationships between PFS (A) or OS (B) and CCNE1, MCM5, histotype, grading and chemotherapy by univariate Cox proportional hazard models. Relationships between PFS (C) and CCNE1 and grading by multivariate Cox proportional hazard models. Chemo*CCNE1 refers to stratified CCNE1 according to the discriminative gene expression level both for OS and PFS. Chemo*MCM5 refers to stratified MCM5 according to the discriminative gene expression level both for OS and PFS. p is the p-value (p<0.05).