Supplementary Materials

Identification of Compounds

The Connectivity Map (C-Map) concept is based on gene expression profiles, also known as gene fingerprints, and is used to identify compounds with similar effects and to find drugs for treating diseases [1]. The profiles in both the C-Map and the CLUE [2] websites were derived from the treatment of human cells with thousands of drugs. Therefore, the gene expression signature of any induced or organic cell state of interest can be compared with these profiles to identify drugs and shRNAs with similar mechanisms or reverse signatures. Pattern-matching algorithms were used to score each gene expression profile against the query signature and to quantify the strength of enrichment. The results were ranked by “connectivity score (τ)”; a positive score denoted a similar effect, whereas a negative score indicated an opposing effect. A τ of 90 indicated that only 10% of all perturbations exhibited stronger connectivity to the query [2].
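As a rough illustration of how τ can be read, the sketch below converts a query's connectivity score into a signed percentile against a set of reference scores. It follows the general idea described in [2] in simplified form; the function name `tau_score` and the toy reference values are assumptions for illustration, not CLUE's implementation.

```python
def tau_score(query_score, reference_scores):
    """Signed percentile of |query_score| among reference scores (simplified sketch).

    Returns a value in [-100, 100]. The magnitude is the percentage of
    reference scores whose absolute value is smaller than the query's;
    the sign follows the query score.
    """
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    weaker = sum(1 for r in reference_scores if abs(r) < abs(query_score))
    magnitude = 100.0 * weaker / len(reference_scores)
    return magnitude if query_score >= 0 else -magnitude


# Toy reference distribution (hypothetical values, not real CLUE data).
reference = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8, 0.9, -1.0]

positive = tau_score(0.95, reference)   # similar effect
negative = tau_score(-0.95, reference)  # opposing effect
```

In this toy case, nine of the ten reference scores are weaker than the query, so only one in ten reference perturbations shows stronger connectivity and the magnitude of τ is 90.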

Methodology of perturbagen classes (PCLs)

To help users of the CLUE database quickly find the mechanism of action (MOA) of a target drug, considerable effort went into codifying class-level annotations. Compounds were grouped by shared MOA, even when their chemical structures were distinct, and genetic perturbagens were grouped on the basis of belonging to the same gene family or being commonly targeted by the same compounds. CLUE termed these class-level annotations PCLs and further connected cognate class members according to the results of L1000 connectivity analyses to predict mechanisms [2].


Figure S1. Intersection compounds and analysis of curcumin using the CLUE and C-Map.

The L1000 gene expression data of HT29 and HepG2 cells treated with curcumin were analyzed by CLUE (https://clue.io/) (A and B) and C-Map (https://portals.broadinstitute.org/cmap/) (C and D), respectively. (A and B) Compounds and PCLs from HT29 and HepG2 were first predicted by CLUE and then intersected. The 9 intersected compounds (A) and 13 PCLs (B) are shown. (C) The same L1000 gene expression data of curcumin-treated HT29 and HepG2 were analyzed using the C-Map. (D) Intersected compounds are shown and annotated. Because C-Map does not provide drug information, we annotated these drugs using several public databases. To the best of our knowledge, no annotations are available for some drugs, which are accordingly labelled N/A. (E) The list, provided by the CLUE database, shows the targets of the 30 highest-scoring compounds predicted by CLUE. Compounds without CLUE annotations are labelled N/A.
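The intersection step described for panels A and B can be sketched as a set operation: keep the perturbagens whose score passes a threshold in each cell line, then intersect the two sets. The compound names and scores below are placeholders rather than the study's results, and the default threshold of 90 is an assumption borrowed from the score cutoff given in the Figure S3 legend.

```python
def high_scoring(scores, threshold=90.0):
    """Return the set of perturbagen names with a score at or above threshold."""
    return {name for name, score in scores.items() if score >= threshold}


def intersect_hits(scores_a, scores_b, threshold=90.0):
    """Perturbagens passing the threshold in both cell lines, sorted by name."""
    return sorted(high_scoring(scores_a, threshold) & high_scoring(scores_b, threshold))


# Placeholder names and scores (hypothetical, for illustration only).
ht29 = {"compound_1": 98.2, "compound_2": 91.0, "compound_3": 75.4}
hepg2 = {"compound_1": 95.1, "compound_2": 88.9, "compound_4": 99.0}

shared = intersect_hits(ht29, hepg2)
```

Only `compound_1` passes the threshold in both toy data sets, so it is the sole member of the intersection.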


Figure S2. Prediction of highly correlated pathways. Genes that were in the two sets were used to query CPDB in order to predict the pathways in which these genes were likely participating. The Venn diagram shows the two intersecting sets of PCLs (curcumin-treated HT29 and HepG2). We focused on the intersection results indicated by red circles; these comprised 13 PCLs, including 10 compounds and 3 shRNA (Supplementary Figure S1B). We used the shRNA gene lists, comprising PSMB5, PSMA1, PSMA3, PSMB1, PSMB2, COPA, COPB2, COPZ1, UVRAG, C2CD2, and RAB11FIP2, to query CPDB in order to analyze interaction network modules, biochemical pathways, and functional information. A total of 43 predicted pathways, listed at the bottom of the figure, were identified by the CPDB analysis (p < 0.001), and we selected the CD4-T-cell-receptor-signaling NF-κB cascade for further validation (highlighted in yellow).


Figure S3. The output data of compounds (CP) and PCLs from the CLUE analysis (score ≥ 90). CLUE profiled thousands of compounds across several cell lines, including PC3, VCAP, A375, A549, HA1E, HCC515, HT29, MCF7, and HEPG2, to obtain their gene expression profiles. The summary score reflects the connectivity scores across these cell lines, and ranking was based on this summary score. The abbreviation pc denotes the percentage of total perturbagens that queried the column sample against the Touchstone data set and exceeded the given thresholds; ts_pc denotes the percentage of total Touchstone perturbagens that connected to the given perturbagen above the indicated thresholds; and median_score denotes the median connectivity score across the nine cell lines. (A) The gene expression profile from HepG2 treated with curcumin was analyzed by CLUE. (B) The gene expression profile from HT29 treated with curcumin was analyzed by CLUE, focusing on the compounds with connectivity scores greater than 90. Because the raw data were extensive, we present only the 30 highest-scoring compounds. (C) PCLs with scores greater than 90, indicating effects similar to those of curcumin treatment in HepG2 and HT29.
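To make the ranking concrete, the following sketch summarizes per-cell-line connectivity scores by their median (as the name median_score suggests), keeps perturbagens whose summary exceeds 90, and returns the highest-scoring entries. The data layout, function name, and values are assumptions for illustration; CLUE's own summary computation may differ.

```python
from statistics import median


def top_by_summary(scores_by_perturbagen, threshold=90.0, limit=30):
    """Rank perturbagens by the median of their per-cell-line scores.

    Keeps only entries whose summary score is greater than threshold and
    returns at most `limit` (name, summary) pairs, highest first.
    """
    summaries = {
        name: median(cell_scores.values())
        for name, cell_scores in scores_by_perturbagen.items()
    }
    kept = [(name, s) for name, s in summaries.items() if s > threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]


# Hypothetical scores for three of the nine cell lines (not real data).
toy = {
    "compound_1": {"HT29": 99.0, "HEPG2": 95.0, "MCF7": 97.0},
    "compound_2": {"HT29": 92.0, "HEPG2": 60.0, "MCF7": 85.0},
    "compound_3": {"HT29": 93.0, "HEPG2": 94.0, "MCF7": 91.0},
}

ranked = top_by_summary(toy)
```

Here `compound_2` scores above 90 in one cell line but is dropped because its median does not pass the threshold, which is the reason for ranking on a summary score rather than on any single cell line.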

[Panels: HepG2, HT29]

Figure S4. The heat maps for HepG2 and HT29. The heat maps show the top 50 upregulated and downregulated probes. Differentially expressed probe sets were selected using arbitrary thresholds of fold change ≥ 1.5 and p-value < 0.01 (two-sample t-test). Gene names corresponding to the probe sets were taken from the HG_U133A.chip file downloaded from the GSEA website.
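The probe selection can be sketched as a filter on precomputed statistics followed by a ranking. The sketch assumes that fold change is expressed as a treated/control ratio, so that downregulation corresponds to a ratio of at most 1/1.5, and that p-values from the two-sample t-test are already available; the record layout, function name, and values are hypothetical.

```python
def select_probes(records, fc_cutoff=1.5, p_cutoff=0.01, top_n=50):
    """Split significant probes into top up- and downregulated lists.

    Each record is (probe_id, fold_change_ratio, p_value). A probe is kept
    when p_value < p_cutoff and its ratio is >= fc_cutoff (up) or
    <= 1 / fc_cutoff (down).
    """
    significant = [r for r in records if r[2] < p_cutoff]
    up = sorted((r for r in significant if r[1] >= fc_cutoff),
                key=lambda r: r[1], reverse=True)
    down = sorted((r for r in significant if r[1] <= 1.0 / fc_cutoff),
                  key=lambda r: r[1])
    return [r[0] for r in up[:top_n]], [r[0] for r in down[:top_n]]


# Hypothetical probe statistics (identifiers and values are made up).
records = [
    ("probe_a", 2.4, 0.001),   # up, significant
    ("probe_b", 1.6, 0.020),   # up, but p-value too high
    ("probe_c", 0.4, 0.005),   # down, significant
    ("probe_d", 1.2, 0.001),   # significant, but change too small
    ("probe_e", 3.1, 0.002),   # up, significant
]

up_probes, down_probes = select_probes(records)
```

If the fold changes were instead reported on a log scale, the two cutoffs would become symmetric around zero and the comparison in `select_probes` would need to be adjusted accordingly.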


References

1. Lamb, J.; Crawford, E.D.; Peck, D.; Modell, J.W.; Blat, I.C.; Wrobel, M.J.; Lerner, J.; Brunet, J.P.; Subramanian, A.; Ross, K.N.; Reich, M.; Hieronymus, H.; Wei, G.; Armstrong, S.A.; Haggarty, S.J.; Clemons, P.A.; Wei, R.; Carr, S.A.; Lander, E.S.; Golub, T.R., The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science (New York, N.Y.) 2006, 313, 1929-1935.

2. Subramanian, A.; Narayan, R.; Corsello, S.M.; Peck, D.D.; Natoli, T.E.; Lu, X.; Gould, J.; Davis, J.F.; Tubelli, A.A.; Asiedu, J.K.; Lahr, D.L.; Hirschman, J.E.; Liu, Z.; Donahue, M.; Julian, B.; Khan, M.; Wadden, D.; Smith, I.C.; Lam, D.; Liberzon, A.; Toder, C.; Bagul, M.; Orzechowski, M.; Enache, O.M.; Piccioni, F.; Johnson, S.A.; Lyons, N.J.; Berger, A.H.; Shamji, A.F.; Brooks, A.N.; Vrcic, A.; Flynn, C.; Rosains, J.; Takeda, D.Y.; Hu, R.; Davison, D.; Lamb, J.; Ardlie, K.; Hogstrom, L.; Greenside, P.; Gray, N.S.; Clemons, P.A.; Silver, S.; Wu, X.; Zhao, W.N.; Read-Button, W.; Haggarty, S.J.; Ronco, L.V.; Boehm, J.S.; Schreiber, S.L.; Doench, J.G.; Bittker, J.A.; Root, D.E.; Wong, B.; Golub, T.R., A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles. Cell 2017, 171, 1437-1452.e17.
