S-TABLE 1
GENE SYMBOL   FULL NAME AND DESCRIPTION

ABCF1         ATP-binding cassette, sub-family F, member 1
              This protein may be regulated by tumor necrosis factor-alpha and may play a role in the enhancement of protein synthesis and in the inflammation process.

CORO1C        Coronin, actin binding protein, 1C
              This gene encodes a member of the WD repeat protein family. Members of this family are involved in a variety of cellular processes, including cell cycle progression, signal transduction, apoptosis, and gene regulation.

DPP3          Dipeptidyl-peptidase 3
              This gene encodes a protein that is a member of the S9B family in clan SC of the serine proteases. Increased activity of this protein is associated with certain types of cancer.

PREB          Prolactin regulatory binding-element protein
              This protein may act as a transcriptional regulator and is thought to be involved in some developmental abnormalities.

UBE3A         Ubiquitin protein ligase E3A
              This gene encodes an E3 ubiquitin-protein ligase, part of the ubiquitin protein degradation system.

PTDSS1        Phosphatidylserine synthase 1
              This gene is related to phosphorus metabolism and lipid biosynthesis.
S-Table 1: Gene descriptions for the six-gene model.
S-TABLE 2
Variable                                  HR     p-value
Six-gene model                            0.30   <0.00001
Primary tumor size (≤2 cm vs. >2 cm)      0.56   0.006
Node (negative vs. positive)              0.93   0.8
Age (<45 vs. ≥45 years)                   1.62   0.02
Chemotherapy exposure (no vs. yes)        1.89   0.04
ER (negative vs. positive)                0.74   0.25
Differentiation: intermediate vs. well    1.15   0.61
Differentiation: poorly vs. well          1.06   0.83
S-Table 2: Multivariable proportional-hazards analysis of the risk of distant metastasis as a first event in the van de Vijver dataset, based on the six-gene model.
S-TABLE 3
Dataset    Total no. of patients   Kaplan-Meier (p)   HR     Cancer type
GSE4573    130                     0.04               0.52   SCLC
GSE11117   41                      0.09               0.51   NSCLC
HLM        79                      0.03               0.52   NSCLC
MICH       177                     0.08               0.66   NSCLC
DFCI       82                      0.07               0.49   NSCLC
MSKCC      104                     0.09               0.51   NSCLC
S-Table 3: Using the six-gene model to predict human lung cancer outcomes (SCLC: squamous cell lung carcinoma; NSCLC: non-small cell lung cancer).
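The Kaplan-Meier column in S-Table 3 refers to comparisons of survival curves between patient classes. As a minimal sketch of how such a curve is estimated, the product-limit estimator can be written in a few lines. The function name `kaplan_meier` and the toy follow-up data are hypothetical and are not taken from any dataset listed above; the log-rank test and the hazard ratio are not computed here.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient (e.g. in years)
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    observed = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # patients leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after t.
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data (assumption, for illustration only).
toy_times = [1, 2, 2, 3, 4, 5]
toy_events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Running the estimator separately on each class gives the two curves that a log-rank test would then compare.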
S-FIGURE 1
[Figure panels: A. van de Vijver; B. GSE4922; C. GSE2034. Each panel is divided into Class 1 and Class 2.]
S-Figure 1: Breast cancer patients expressing the SpMGS were segregated into two groups by hierarchical clustering, based on the first bifurcation in the clustering dendrogram, and assigned to Class 1 and Class 2.
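Splitting at the first bifurcation of a dendrogram is equivalent to stopping an agglomerative clustering when two clusters remain. The sketch below illustrates this under stated assumptions: average linkage and Euclidean distance are chosen only for illustration (the source does not state the linkage or the distance measure), and the function name and the toy expression profiles are hypothetical.

```python
import math


def split_at_first_bifurcation(profiles):
    """Agglomerate samples until two clusters remain.

    profiles : list of equal-length numeric tuples, one per sample
    Returns two lists of sample indices, i.e. the two branches below
    the root of the dendrogram.
    Assumptions: average linkage, Euclidean distance.
    """
    clusters = [[i] for i in range(len(profiles))]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        total = sum(math.dist(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 2:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


# Hypothetical toy profiles (assumption, for illustration only).
toy_profiles = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9)]
classes = split_at_first_bifurcation(toy_profiles)
print(classes)
```

Which of the two branches is labeled Class 1 or Class 2 is a separate decision that this sketch does not make.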
S-FIGURE 2
[Figure: Kaplan-Meier curves for Class 1 and Class 2; x-axis: year.]
Panel A. van de Vijver: probability of metastasis-free survival and probability of overall survival.
  Statistics, in the order printed for the two plots: hazard ratio = 0.68, log-rank p = 0.07; hazard ratio = 0.72, log-rank p = 0.17.
  No. at risk, first plot:  Class 1: 120 108 96 71 52 31 16 8; Class 2: 175 146 118 95 56 37 20
  No. at risk, second plot: Class 1: 120 114 102 80 58 38 20; Class 2: 175 167 145 111 69 45 22
Panel B. GSE4922: probability of overall survival. Hazard ratio = 0.57, log-rank p = 0.03.
  No. at risk: Class 1: 75 68 59 52 46 38 2; Class 2: 88 59 48 43 37 31 5
Panel C. GSE2034: probability of relapse-free survival. Hazard ratio = 0.60, log-rank p = 0.024.
  No. at risk: Class 1: 105 101 93 87 80 76 72 66; Class 2: 181 169 149 131 124 107 97 87
S-Figure 2: Kaplan-Meier analysis, among all breast cancer patients who expressed the EMGS, of the probability of metastasis-free survival and of overall survival in the van de Vijver dataset (panel A), of overall survival in the GSE4922 dataset (panel B), and of relapse-free survival in the GSE2034 dataset (panel C). Patients who exhibited the metastatic signature (EMGS) were assigned to Class 2 (blue), while those who did not were assigned to Class 1 (red).
S-FIGURE 3
[Figure: Kaplan-Meier curves of the probability of overall survival for Class 1 and Class 2, one panel per dataset; x-axis: year.]
GSE4573 dataset:  Class 1 (n=45), Class 2 (n=85); HR = 0.52, log-rank p = 0.04
GSE11117 dataset: Class 1 (n=25), Class 2 (n=216); HR = 0.5, log-rank p = 0.09
HLM dataset:      Class 1 (n=23), Class 2 (n=56); HR = 0.52, log-rank p = 0.03
MICH dataset:     Class 1 (n=94), Class 2 (n=83); HR = 0.66, log-rank p = 0.08
DFCI dataset:     Class 1 (n=38), Class 2 (n=44); HR = 0.49, log-rank p = 0.07
MSKCC dataset:    Class 1 (n=58), Class 2 (n=46); HR = 0.51, log-rank p = 0.09
S-Figure 3: Using the six-gene model to predict outcomes in human lung cancer patients.
S-FIGURE 4
[Flow chart]
4T1 Mouse Metastases Model
  -> Signature (79 SpMGS)
  -> 3 Human Breast Cancer Datasets (van de Vijver, GSE4922, GSE2034)
  -> 6-gene model
  -> Validation: 3 Original Human Breast Cancer Datasets (van de Vijver, GSE4922, GSE2034)
  -> Further Validation: 3 New Human Breast Cancer Datasets (GSE1456, GSE2990, GSE7390) and 6 Human Lung Cancer Datasets (GSE4573, GSE11117, HLM, MICH, DFCI, MSKCC)