Supporting Information

Rangel et al. 10.1073/pnas.1613859113 (www.pnas.org/cgi/content/short/1613859113)

SI Materials and Methods

Tumor Xenograft Studies. Female 6- to 7-wk-old Crl:NU(NCr)-Foxn1nu mice were purchased from Charles River Laboratories. Mammary fat pad injections into athymic nude mice were performed using 3 × 10⁶ cells (HCC70, HCC1954, MDA-MB-468, and MDA-MB-231). The cancer cells were resuspended in 100 μL of a 1:1 mix of PBS and matrigel (TREVIGEN). For HCC1569 human breast cancer cells, we injected 4 × 10⁶ cells in 100 μL of a 1:1 mix of PBS and matrigel. Injections were done into the fourth mammary gland. Tumors were measured using a digital caliper, and the tumor volume was calculated using the following formula: volume (mm³) = width × length/2. At the end of the experiment, tumor tissues were sectioned for fixation (10% formalin or 4% paraformaldehyde) and RNA isolation.

Lung Metastasis Imaging. We injected 3 × 10⁶ MDA-MB-231 luciferase cells in 100 μL of a 1:1 mix of PBS and matrigel. Injections were done into the fourth mammary gland. Lung metastases were subsequently analyzed in vivo by bioluminescence imaging. Mice anesthetized with isoflurane were injected intraperitoneally with D-luciferin (150 mg/kg) and imaged using an IVIS Spectrum Xenogen machine (Caliper Life Sciences). Bioluminescence analysis was performed using Living Image software.

Detection of β-Galactosidase Activity in Frozen Sections. Mammary glands were formalin-fixed and embedded in an optimal cutting temperature compound. Next, 10-μm frozen sections were fixed with cold formalin for 10 min. Slides were then washed three times with PBS, rinsed with water, and incubated overnight at 37 °C with X-gal working solution. The slides were subsequently removed from the humidified chamber, washed with PBS, and rinsed with water. Finally, the tissue sections were counterstained with fast red and mounted with aqueous medium.

Immunostaining of Paraffin Sections. Harvested tumors were fixed in 10% neutral buffered formalin, dehydrated, and embedded in paraffin. Mammary tumor immunostaining was performed on 5-μm sections with antigen retrieval and overnight incubation with antibodies against cytokeratin 14 (1:100, Abcam), cytokeratin 18 (1:100, Progen), and Ki67 (1:50 dilution, Abcam). To detect nuclear transposase expression, we performed antigen retrieval and incubation with polyclonal goat anti-transposase antibody (1:200 dilution, R&D Systems). After primary antibody incubation, chromogen detection (Envision System from Dako) and hematoxylin counterstaining were performed.

Microarray Gene Expression Analysis. Gene expression profiling of 21 SB-induced mouse mammary tumors was performed using Affymetrix microarrays. RNA was extracted using a NORGEN Biotek Animal Tissue RNA Purification kit (cat. no. 25700), according to the manufacturer’s instructions. RNA was then labeled using an Affymetrix 3′ IVT Express kit (cat. no. 901229), using 100 ng of total RNA for each sample, as per the manufacturer’s instructions. After labeling, samples were hybridized to Affymetrix GeneChip Mouse Genome 430 2.0 arrays and scanned at the University of Otago Genomics & Bioinformatics Facility. Raw data were processed in R (version 2.15) (71) using the “rma” function of the “affy” package (72). The “affyQCReport” package (73) for R was used to perform quality assessment of the microarray data. For each sample, an intrinsic subtype was assigned based on the previously described PAM50 subtyping approach (31). Mouse gene orthologs for the PAM50 genes were identified, and the microarray data were used to determine the closest intrinsic subtype centroid for each sample, based on Spearman correlation using logged mean-centered expression data. To estimate cellular proliferation, a “proliferation signature” (32) was used to generate a proliferation score for each sample. Briefly, using the logged expression data for a subset of proliferation-related genes, singular value decomposition was used to produce a “proliferation metagene,” which was then scaled to generate a score between 0 and 1, with a higher score denoting an increased level of proliferation relative to samples with lower scores.

Human Data. Data from a combined cohort of 2,116 breast tumors were used for analysis in a human cancer context. These data comprise tumors profiled on the Affymetrix HGU133A, HGU133A2, and HGU133PLUS2 microarray platforms. The normalization and subtyping procedures associated with this dataset have been described previously (74). The Affymetrix probe set “218502_s_at” was used to define the expression of TRPS1 in this dataset.

Cell Migration and Invasion Assays. Transwell migration assays were performed in a 24-multiwell insert system with a porous polycarbonate membrane (8-μm pore size) according to the manufacturer’s instructions (Cell Biolabs). Cells were allowed to grow to subconfluency (∼75–80%) and were then serum-starved for 24 h. After detachment with trypsin, the cells were washed with PBS and resuspended in serum-free medium, and 300 μL of cell suspension (5 × 10⁵ cells/mL) was added to the upper chamber. We then added 500 μL of complete medium to the bottom wells of the chamber. Cells that did not migrate were removed from the upper face of the filters using cotton swabs, and cells that had migrated to the lower face of the filters were fixed, stained, and quantified according to the manufacturer’s instructions (Cell Biolabs). Similar inserts coated with matrigel were used in the invasion assay.

qRT-PCR. Total RNA was purified and DNase treated using the RNeasy Mini Kit (Qiagen). Synthesis of cDNA was performed using SuperScript VILO Master Mix (Life Technologies). Quantitative PCR analysis was performed on the QuantStudio 12K Flex System (Life Technologies). All signals were normalized to the levels of GAPDH TaqMan probes. TaqMan probes were obtained from Life Technologies: AFTPH (Hs00214281_m1), ANO6 (Hs03805835_m1), ASH1L (Hs00218516_m1), ERBB2IP (Hs01049966_m1), EP400 (Hs01566078_m1), LPP (Hs00944352_m1), LRRC4 (Hs01934623_s1), MAN1A1 (Hs00195458_m1), NIPBL (Hs00209846_m1), PKP4 (Hs00269305_m1), PPP1R12A (Hs01552899_m1), PUM2 (Hs00209677_m1), R3HCC1L (Hs00402062_m1), RAB10 (Hs00211643_m1), RASA1 (Hs00963554_m1), SOS2 (Hs00183311_m1), VPS26A (Hs01013219_g1), XPNPEP3 (Hs00223094_m1), YTHDF3 (Hs00405590_m1), ZNF143 (Hs00366181_m1), ZNF326 (Hs00299025_m1), TRPS1 (Hs00936363_m1), FOSL1 (Hs04187685_m1), SERPINE1 (Hs01126606_m1), SERPINB2 (Hs01010736_m1), TFPI2 (Hs04334126_m1), SERPIND1 (Hs00164821_m1), SERPINE2 (Hs00299953_m1), SERPINI1 (Hs01115397_m1), SERPINC1 (Hs00166654_m1), SERPINF2 (Hs00168686_m1), GAPDH (Hs03929097_g1), miRNA221 (Hs04231481_s1), miRNA222 (Hs04415495_s1), and RNU44 (001094). EMT RT-PCR arrays (PAHS-090Z) were purchased from QIAGEN. The assays were performed according to the manufacturer’s instructions, and the results were evaluated using the QIAGEN data analysis center.

ELISA. We collected the supernatants from HCC70 serum-free cell cultures after 24 h of culture. ELISA kits for SERPINE1 (SEA532Hu) and SERPINB2 (SEA531Hu) were obtained from Cloud-Clone Corp and run according to the manufacturer’s instructions.

ChIP Studies. HCC70 cells were fixed in 1% formaldehyde at 37 °C for 10 min. Cells were then washed twice with ice-cold PBS containing protease inhibitors, scraped, and centrifuged at 4 °C. Cell pellets were resuspended in lysis buffer and sonicated (Covaris S220) to shear DNA to a fragment size of 300–400 bp. After sonication, the lysates were centrifuged and the supernatants diluted 10-fold with ChIP dilution buffer (EZ-CHIP kit, EMD-MILLIPORE). Anti-TRPS1 (sc-26974X, Santa Cruz Biotechnology) or normal goat IgG (AB-108-C, R&D Systems) was added to the diluted chromatin and incubated overnight at 4 °C with rotation. Antigen–antibody complexes were precipitated with protein A/G agarose and washed sequentially with low-salt buffer, high-salt buffer, and lithium chloride wash buffer and then eluted with elution buffer (1% SDS, 0.1 M NaHCO3, and 200 mM NaCl). Reversal of cross-linking was then performed by heating at 65 °C overnight in the presence of NaCl. DNA was purified using DNA-binding columns provided by the EZ-ChIP kit. The amount of immunoprecipitated DNA was quantified using high-sensitive detection DNA reagent (Q32851, Life Technologies) and measured using a Qubit 3.0 Fluorometer (Life Technologies). qPCRs were run in triplicate using primers for the SERPINE1 promoter target region (forward, 5′-GCTCTTTCCTGGAGGTGGTC-3′; reverse, 5′-CCCTAGTGTTCAGCTTGGAG-3′) and control region (forward, 5′-GCGCTGTCAAGAAGACCCAC-3′; reverse, 5′-ATTGGCGGTTCGTCCTGCTCTG-3′), SERPINB2 promoter target region (forward, 5′-GAATCACTCAAAGGACACAGATC-3′; reverse, 5′-CATGAAACCCTATTTCCCATAGAC-3′) and control region (forward, 5′-TTCCCTCCCATGCCCTAAGC-3′; reverse, 5′-TCTTCTAGCTTTGGACAACCATG-3′), and ZEB2 promoter target region (forward, 5′-CCCGAGGTGTAGAGAGATTCAGAG-3′; reverse, 5′-GCTTCTGGAACAAAGTTCTCTGC-3′) and control region (forward, 5′-ATGATGCTCACGCTCAGG-3′; reverse, 5′-AGCATGAAGAAGCCGCGAAG-3′). We used the QuantStudio 12K Real-Time PCR system (Applied Biosystems) with SYBR green. Data were analyzed using the 2^(−ΔCt) method and normalized with the input sample.

Luciferase Assays. SERPINE1, SERPINB2, ZEB2, GAPDH, and negative control (RPC) promoter reporter clones were obtained from Active Motif (LightSwitch Promoter Reporter GoClone). For luciferase assays in HCC70 cells, we used nontargeting control and TRPS1 knockdown, and for MDA-MB-231 cells, we used vector control and TRPS1-ORF. Cells were plated in 96-well plates, and transfection was performed with promoter reporter plasmids (50 ng/well) using Lipofectamine 2000 (Life Technologies) according to the manufacturer’s instructions. After 48 h, 100 μL of luciferase assay reagent was added and incubated for 30 min at room temperature (LightSwitch luciferase assay system, Active Motif). Cell lysates were then transferred to a white 96-well plate, and each well was then read for 2 s in a luminometer (Synergy H1 Hybrid Reader, BioTek).

Bioinformatic Analysis. Mouse CISs were converted to human genes using two different annotation databases [MGI EntrezGene associations (www.informatics.jax.org/) and HGNC complete annotations (www.genenames.org/)]. Mouse genetic markers were downloaded from MGI (www.informatics.jax.org/tools.shtml). The gene catalog registered in The Cancer Gene Census was downloaded from the Catalogue of Somatic Mutations in Cancer (COSMIC) (cancer.sanger.ac.uk/census/). Enrichr Bioinformatics Resources (amp.pharm.mssm.edu/Enrichr/) was used for pathway analysis. To investigate the association between expression of genes and prognosis in breast cancer patients, we used expression data and clinical information from two websites (www.kmplot.com/analysis and co.bmc.lu.se/gobo/gsa.pl).

Statistics. All data are provided as mean ± SEM unless otherwise indicated. Statistical analyses were performed using a paired Student’s t test in GraphPad Prism 6 software (Version 6.0f), unless otherwise indicated.
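As an illustration of the 2^(−ΔCt) calculation mentioned under ChIP Studies, the following Python sketch computes the signal of an immunoprecipitated sample relative to its input from triplicate Ct values. It is a minimal sketch rather than the authors' analysis script: the function name relative_to_input, the Ct values, and the final target-to-control ratio are hypothetical, and no correction for the fraction of chromatin set aside as input is applied because the text does not state one.

```python
from statistics import mean


def relative_to_input(ct_ip, ct_input):
    """Return 2^(-dCt), where dCt = mean Ct(IP) - mean Ct(input).

    Both arguments are lists of replicate Ct values (e.g., triplicates).
    """
    delta_ct = mean(ct_ip) - mean(ct_input)
    return 2.0 ** (-delta_ct)


# Hypothetical triplicate Ct values, for illustration only.
target_ip = [26.0, 26.0, 26.0]    # promoter target region, anti-TRPS1 IP
control_ip = [29.0, 29.0, 29.0]   # control region, anti-TRPS1 IP
input_ct = [24.0, 24.0, 24.0]     # input chromatin

target_signal = relative_to_input(target_ip, input_ct)    # 2^-2 = 0.25
control_signal = relative_to_input(control_ip, input_ct)  # 2^-5 = 0.03125

# Enrichment of the target region over the control region (assumed readout).
print(target_signal / control_signal)  # 8.0
```

A lower Ct in the immunoprecipitated sample gives a larger input-normalized value, so a ratio above 1 indicates more signal at the promoter target region than at the control region in this toy example.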

Fig. S1. Detection of β-galactosidase activity in the epithelial cells of mouse mammary glands. (A and B) Shown are representative tissue sections of mouse mammary glands from 8-wk-old K5CreTg/+;LacZTg/+ mice stained with X-gal. Arrows indicate β-galactosidase activity in all mammary epithelium. Boxed regions are enlarged images. Left and Right magnification, 100× and 200×, respectively. [Scale bar, 100 μm (Left) and 200 μm (Right).]

[Fig. S2 image: transposon insertion maps, with 5′ and 3′ gene orientation marked, for Trps1, Pten, Nf1, Axin1, Jup, Pkp4, Arhgap35, Rab10, and Rasal1.]

Fig. S2. Location of transposon insertions in SB-identified trunk driver genes. Transposon insertions in the sense (green arrows) and antisense (red arrows) DNA strands are shown. Black arrow shows the transcription initiation site.

Fig. S3. RT-PCR analysis of the CCGs down-regulated by shRNA pools in the HCC70 cell line.

Fig. S4. RT-PCR analysis of the CCGs down-regulated by shRNA pools in the MDA-MB-231 cell line.

[Fig. S5 image: box plot of TRPS1 expression (log scale) by subtype: TNBC, Her2, LumA, LumB, Normal, and NoSubtype.]

Fig. S5. TRPS1 expression in different human breast cancer subtypes. Shown is a box plot of TRPS1 expression (probe set “218502_s_at”) versus breast cancer subtype across a collection of 2,116 human breast tumors profiled using Affymetrix microarrays.

[Fig. S6 image: relative RNA abundance of TRPS1 and FOSL1 (Top) and relative miRNA abundance of miRNA221 and miRNA222 (Bottom) in HCC70, HCC1569, MDA-MB-231, HCC1954, BT549, ZR-75-1, and HMEC1 cells.]

Fig. S6. TRPS1, FOSL1 (Top), and miRNA221/222 (Bottom) expression was determined by qRT-PCR across different breast cancer cell lines. The human mammary epithelial immortalized cell line (HMEC1) was used as the nontumorigenic cell line. Data represent means of triplicates ± SEM.

[Fig. S7 image: (A) HCC70 and (B) HCC1569 cell lines, relative TRPS1 levels in control, shTRPS1(1), and shTRPS1(2) cells, with TRPS1 and β-actin blots; (C) HCC1954 and (D) MDA-MB-231 cell lines, mRNA expression (fold change) in control and TRPS1-ORF cells, with TRPS1 and β-actin blots.]

Fig. S7. Stable TRPS1 knockdown in HCC70 (A) and HCC1569 (B) TNBC cell lines using two independent shRNAs was confirmed by qRT-PCR and Western blot. Error bars represent SEM (*P < 0.0001). Ectopic expression of TRPS1 in (C) HCC1954 and (D) MDA-MB-231 cells transduced with lentivirus particles expressing TRPS1 cDNA or empty vector (control). TRPS1 mRNA levels were quantified by qRT-PCR.

[Fig. S8 image: (A) survival probability versus time (months) for patients with high and low TRPS1 expression; (B) expression (log scale) of TRPS1, SERPINE1, SERPINB2, and FOSL1 in ER-negative and ER-positive tumors; (C) schematic linking FOSL1, mir221/222, TRPS1, SERPINE1, SERPINB2, and EMT genes (TGFB2, SNAI2, ZEB2, MMP2, MMP9) to tumor growth, lung colonization, and metastasis.]

Fig. S8. TRPS1 expression levels predict patient survival in ER+ breast cancer. (A) Kaplan–Meier plots depicting the recurrence-free survival of patients with different breast cancer subtypes based on TRPS1 expression levels. HR, hazard ratio. Log-rank test was used in the analysis. We used the probe set “222651_s_at” for all the analysis. (B) Box plots of TRPS1, SERPINE1, SERPINB2, and FOSL1 expression in ER+ and ER– breast cancer subtypes (n = 395 ER– and 1,225 ER+ breast cancer samples). Gene expression-based Outcome for Breast cancer Online (GOBO) was used to obtain the expression analysis. (C) Schematic model depicting the regulation of serpins by TRPS1 in ER+ and ER– breast cancer. TRPS1 expression in ER+ tumor cells directly regulates the expression of serpins and EMT genes. ER– tumor cells express high levels of FOSL1, which activates the expression of mir221/222, targeting TRPS1 mRNA for degradation and activating the expression of SERPINE1, SERPINB2, and EMT genes to promote tumor growth and metastasis.

Other Supporting Information Files

Dataset S1 (XLSX)