Supporting Information

Wang et al. 10.1073/pnas.1619917114

SI Materials and Methods

In Vitro RNA Probe Synthesis and RNA Pull-Down. In vitro RNA synthesis was performed with the MEGAscript T7 Transcription Kit for the target probe and the MEGAscript SP6 Transcription Kit for the antisense control probe (Life Technologies) according to the manufacturers' instructions. Template DNAs were prepared using T7/SP6 primer and specific primers by PCR of cDNA clones containing PTCSC2 isoform C or isoform D transcripts, respectively. This was followed by purification with QIAquick Gel Extraction Kit (QIAGEN) after the agarose gel band had been dissolved in RNase-free water. Sequences were verified by Sanger sequencing. Synthesized RNAs were analyzed qualitatively and quantitatively by electrophoresis and Nanodrop 2000 and then stored at –80 °C.

Primers for the target RNA probe are as follows: forward (T7), TAATACGACTCACTATAGGGACAACGTCAGGGCGGGGAGGGCGC; reverse, ATGAATTAAGAGTCCTTTATTAGC. Primers for the antisense control RNA probe are as follows: forward (SP6), ATTTAGGTGACACTATAGAAATGAATTAAGAGTCCTTTATTAGC; reverse, ACAACGTCAGGGCGGGGAGGGCGC. The whole lysate from human nontumorous thyroid sample was extracted using T-PER Tissue Protein Extraction Reagent (Thermo Fisher Scientific). Protein concentration was determined by using Bio-Rad Protein Assay Dye Reagent Concentrate (Bio-Rad) kit. RNA pull-down assay was performed using Pierce Magnetic RNA-Protein Pull-Down Kit (Thermo Fisher Scientific) according to the standard instructions. Briefly, the target RNA and antisense control RNA were labeled with biotin at the 3′ end and purified using Pierce RNA 3′ End Desthiobiotinylation Kit (Thermo Fisher Scientific). Labeled RNA probe (50 pmol) was used for the binding to Streptavidin Magnetic Beads after incubation in 1× RNA Capture Buffer for 30 min at room temperature. In total, 200 μg protein was used for the subsequent protein binding step in protein–RNA binding buffer for 150 min at 4 °C with agitation. The final RNA–magnetic beads–protein complex was washed three times with wash buffer, whereafter 12 μL elution buffer was added to retrieve the pull-down protein products. The retrieved proteins were resolved in gradient gel electrophoresis followed by MS identification.

ChIP Assay. ChIP assays to determine the degree of MYH9 enrichment were performed using the Magna ChIP A/G Chromatin Immunoprecipitation Kit (17-10085, EMD Millipore) on KTC1 cells according to the manufacturer's instructions. Briefly, chromatin was cross-linked with 1% formaldehyde for 10 min at room temperature. After sonication, chromatin was immunoprecipitated with rabbit anti-MYH9 antibody (sc-98978X, Santa Cruz Biotechnology) or IgG (sc-2027X, Santa Cruz Biotechnology) at 4 °C overnight. The protein/DNA complexes were eluted from the magnetic beads after standard washing steps. The cross-links were reversed by incubating at 62 °C for 2 h and 95 °C for 10 min. Final DNA products were purified by using QIAquick PCR Purification Kit (QIAGEN). Then, qPCR assays were performed by using the purified DNA as template with primers covering the transcription factor-enriched region of the FOXE1 promoter (Fig. 2A). ChIP assays on frozen thyroid tissue samples were performed as follows: Briefly, ∼50 mg of frozen tissue was minced into small pieces (∼2 mm) using sterile blades. Protein/DNA cross-linking was performed by incubating with formaldehyde at 1% concentration for 10 min at room temperature. After washing twice with PBS, fixed tissues were homogenized. Cell pellets were resuspended after centrifugation. Then, sonication and the following steps were carried out as previously described (18).

Luciferase Plasmid Constructs. The promoter region of FOXE1 was PCR-amplified from human BAC clone RP11-746L3 and cloned into the KpnI and SacI sites of the PGL4.10 vector (Promega). Two sets of plasmids were constructed with a long fragment (–2,062 to +1 of the FOXE1 upstream regulatory region from TSS and +1 to +458 of 5′ UTR) and a short fragment (–1,149 to +1 of the FOXE1 upstream regulatory region from TSS and +1 to +458 of 5′ UTR). Another two sets of plasmids with inverted promoter regions were made from their corresponding forward plasmids by end blunting of the insertion and religation into the vector. Orientation-specific clones were screened by PCR, and all of the constructs were validated by Sanger sequencing. The expression vectors were pCMV-mCherry-MHC-IIA (Addgene plasmid 35687, a gift from Venkaiah Betapudi, Case Western Reserve University, Cleveland) (47), pcDNA3-PTCSC2 (15), and pcDNA3 (Thermo Fisher Scientific), which was used as the empty vector control.

Quantitative Real-Time PCR Assay. Real-time qRT-PCR assay was performed in three biological replicates on ABI Prism 7900 HT Sequence Detection System (Applied Biosystems) according to the manufacturer's protocol. The TaqMan assays were carried out using TaqMan probe sets for PTCSC2 Isoform C (15), FOXE1 (Life Technologies, Hs00916085_s1), and 18S (Life Technologies, 4333760T) with TaqMan Fast Universal PCR Master Mix (Thermo Fisher Scientific). THBS1, IGFBP-3, and all of the primer sets used in ChIP assay (Table S3) were detected by Fast SYBR Green Master Mix kit (Thermo Fisher Scientific).

Human Primary Thyroid Cell Culture and Nucleofection. Human primary thyroid cells were cultured in customized 6H medium as previously described (20). Briefly, fresh nontumorous thyroid tissue samples (0.3∼1.5 g) were obtained from patients with PTC by surgery. The tissue was immediately dissected into fragments as small as possible using a sterile razor blade in a cell culture hood. After one wash in Hanks' Balanced Salt Solution (Life Technologies), the tissue fragments were transferred to 0.25% trypsin solution for an overnight digestion. On the second day, the fragments were digested with 1% trypsin (Life Technologies) and 0.35% collagenase 4 (Worthington Biochemical) solution for 90 min at 37 °C. The digested material was filtered through nylon mesh (100 μm, FALCON). After centrifugation at 1,000 × g for 5 min, the supernatant was discarded and 1 mL red blood cell lysing buffer (Sigma) was added for 2 min to eliminate the blood cells. The cells were washed twice with Hanks' solution and centrifuged at 1,000 × g for 5 min. Finally, the cells were counted using a TC20 Automated Cell Counter (Bio-Rad) and seeded to a density of 10^5∼10^6 cells per well on six-well plates.

Human FOXE1 siRNA (ON-TARGET plus SMART pool) and negative control siRNA (ON-TARGET plus control pool) were purchased from Dharmacon. Human thyroid primary cells were cultured in 6H medium for 5 d until they had reached 80∼90% confluency. Then, primary cells were electroporated by Nucleofector II device (Amaxa) with 75 pmol siRNA in 100 μL of Basic Nucleofector Medium for Primary Mammalian Epithelial Cells (Lonza, VPI-1005) for each well of the six-well plates. Cells were then resuspended in 6H medium, incubated for 24 h, and used for further analysis.
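The probe primers listed under In Vitro RNA Probe Synthesis and RNA Pull-Down can be checked for internal consistency. The minimal sketch below assumes the standard T7 and SP6 promoter prefixes (the source gives only the full primers, so the split point is an assumption) and shows that the target and antisense templates use the same two gene-specific sequences with the promoter placed on opposite ends. The helper name `strip_promoter` is hypothetical.

```python
# Promoter prefixes assumed here; the source lists only the complete primers.
T7_PROMOTER = "TAATACGACTCACTATAGGG"
SP6_PROMOTER = "ATTTAGGTGACACTATAGAA"

# Primers as given in the methods, with line-break hyphens removed.
TARGET_FWD = "TAATACGACTCACTATAGGGACAACGTCAGGGCGGGGAGGGCGC"
TARGET_REV = "ATGAATTAAGAGTCCTTTATTAGC"
ANTISENSE_FWD = "ATTTAGGTGACACTATAGAAATGAATTAAGAGTCCTTTATTAGC"
ANTISENSE_REV = "ACAACGTCAGGGCGGGGAGGGCGC"


def strip_promoter(primer: str, promoter: str) -> str:
    """Return the gene-specific part of a primer that starts with `promoter`."""
    if not primer.startswith(promoter):
        raise ValueError("primer does not start with the given promoter")
    return primer[len(promoter):]


# Gene-specific parts of the two promoter-tagged forward primers.
target_specific = strip_promoter(TARGET_FWD, T7_PROMOTER)
antisense_specific = strip_promoter(ANTISENSE_FWD, SP6_PROMOTER)

# Each promoter-tagged primer should carry the other probe's reverse primer.
print(target_specific == ANTISENSE_REV)
print(antisense_specific == TARGET_REV)
```

If both comparisons hold, the two PCR templates cover the same cDNA region and differ only in which strand is transcribed.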

Wang et al. www.pnas.org/cgi/content/short/1619917114

RNA-Seq Sample Preparation and Detection. The total RNA samples for RNA-seq were extracted by TRIzol reagent (Invitrogen) and then treated by DNase-I (Ambion) to eliminate DNA contamination. RNA concentration was determined by using Qubit 2.0 Fluorometer (Agilent Technologies) with an RNA HS Assay Kit. The integrity of the RNA samples was assessed by BioAnalyzer (Agilent). All RNA integrity numbers (RINs) were greater than 8. The purified RNAs with no visible sign of genomic DNA contamination from the HS Nanochip tracings were used for total RNA library generation. Furthermore, Illumina TruSeq Stranded Total RNA Sample Prep Kit with Ribo-Zero Gold (catalog no. RS-122-2201) was used to transform RNA into cDNA after removing rRNA and mitochondrial RNA. The RNA-seq libraries were prepared according to the manufacturer's protocol. Finally, 75 bp paired-end sequencing was performed using the Illumina HiSeq 2500 system.

Gene Abundance Estimate and Differential Gene Expression Analysis. RNA-seq reads were first mapped to the hg19 human genome using TopHat2 (48). Raw read counts for each gene were quantified by using featureCounts software (49) that uses the GENCODE v.22 Gene Transfer Format (GTF) file as a transcript reference (GENCODE annotation). Genes with counts below 5 for at least two samples out of three within each group were filtered out. Then, the counts were normalized toward the common library size. The count data were assumed to follow a negative binomial distribution. To improve the estimate of overdispersion and to identify genes differentially expressed between samples, R package DESeq2 (50) was used to estimate the smoothed overdispersion parameters and to calculate P values with Wald test for group comparison under a generalized linear model. The P value cutoffs were determined by controlling the mean number of false positives (51).

Western Blotting. Western blotting was performed according to standard procedures. Antibodies used were MYH9 (Santa Cruz, sc-98978), FOXE1 (Abcam, ab134129), and beta-actin (Santa Cruz, sc-47778).
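The low-count filter and library-size normalization described under Gene Abundance Estimate and Differential Gene Expression Analysis can be illustrated with a small sketch. It rests on two assumptions that the source does not settle: a gene is dropped only when every group has at least two of three replicates below 5, and "common library size" is taken as the mean total count across samples. It is not the DESeq2 procedure itself, which by default uses median-of-ratios size factors. All gene names, sample names, and counts are hypothetical.

```python
from statistics import mean

MIN_COUNT = 5          # count threshold stated in the text
MIN_LOW_SAMPLES = 2    # "at least two samples out of three"


def is_low_in_group(counts):
    """True if at least MIN_LOW_SAMPLES replicates fall below MIN_COUNT."""
    return sum(c < MIN_COUNT for c in counts) >= MIN_LOW_SAMPLES


def keep_gene(counts_by_group):
    """Keep a gene unless it is low in every group (assumed reading)."""
    return not all(is_low_in_group(c) for c in counts_by_group.values())


def library_scale_factors(library_sizes):
    """Factors that rescale each sample to the mean library size (assumed target)."""
    target = mean(library_sizes.values())
    return {sample: target / size for sample, size in library_sizes.items()}


# Hypothetical raw counts: gene -> group -> three replicates.
raw_counts = {
    "GENE_A": {"control": [0, 3, 12], "knockdown": [1, 2, 40]},
    "GENE_B": {"control": [50, 61, 47], "knockdown": [2, 3, 90]},
}
kept = [gene for gene, groups in raw_counts.items() if keep_gene(groups)]
print(kept)

# Hypothetical library sizes (total counted reads per sample).
factors = library_scale_factors({"s1": 1000, "s2": 2000, "s3": 3000})
print(factors)
```

Under the stricter alternative reading (drop a gene when any single group is low), `all` in `keep_gene` would become `any`.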

Fig. S1. A diagrammatic view of the 9q22 locus. The diagram shows the lead SNP rs965513 in GWAS, the two flanking coding genes (XPA and FOXE1), and the lncRNA PTCSC2 isoforms C and D. Red arrows indicate transcriptional orientations. Blue filled boxes represent exons.

Fig. S2. Identification of MYH9 as a PTCSC2 isoform D binding protein. (A) RNA pull-down experiment with nontumorous thyroid tissue extract. Biotin pull-down assays followed by SDS/PAGE separation were used to isolate the protein binding PTCSC2 isoform D. Antisense RNA of PTCSC2 isoform D was used as the negative control. The arrows indicate the binding protein bands ∼226 kDa and 42 kDa in size, respectively. (B) Information for the identification of MYH9 by MS.

Fig. S3. The detailed information of MS analysis for ACTB protein identified in the RNA pull-down assay. Information for the identification of beta-actin by MS.

Fig. S4. Expression of MYH9 and FOXE1 in thyroid cancer cell lines and nontumorous thyroid tissue. The protein levels of MYH9 and FOXE1 were detected by Western blotting using ACTB as the loading control. Expression of PTCSC2 spliced isoforms and GAPDH as detected by RT-PCR was described previously in ref. 15.

Fig. S5. Transcriptional activity of FOXE1 promoter with different alleles of rs1867277. Dual reporter luciferase assay using long promoter constructs with either G or A allele of rs1867277 and forward (Left) or inverted (Right) orientations cotransfected with MYH9, PTCSC2, or empty expression vectors. All values were normalized with the values of the corresponding groups using promoter plasmids containing the G allele. Results are shown as means ± SD of four independent experiments, each in four replicates. *P < 0.05; ***P < 0.001. Student's t test.

Fig. S6. The top 10 key biological functional groups predicted by dysregulated genes in FOXE1 knockdown thyroid primary cells by IPA analysis.

Fig. S7. qRT-PCR of FOXE1, THBS1, and IGFBP3 in FOXE1 knockdown cell lines. Expression levels of BCPAP cell line (Left) or TPC1 cell line (Right) were detected after treatment with FOXE1 siRNA or negative control siRNA for 24 h. Results are shown as means ± SD of three independent experiments, each in three replicates. All values were normalized with the values of the corresponding negative control siRNA-treated groups. *P < 0.05; **P < 0.01; ***P < 0.001. Student's t test.

Table S1. Top 107 dysregulated genes by FOXE1 knockdown in thyroid primary cell cultures

Gene name       P value     Fold change
GDF6            1.43E−20    2.435
OASL            4.58E−15    2.101
ACTC1           4.26E−13    2.099
AQP3            2.78E−09    1.925
FOXE1           1.14E−14    −1.908
IGFBP3          1.65E−12    1.894
SYTL2           1.08E−15    1.893
STC1            3.31E−17    1.883
PLAU            1.93E−14    1.878
IVL             1.23E−08    1.843
SALRNA2         6.07E−08    −1.824
RAET1L          7.85E−10    1.807
SLC19A2         3.20E−14    1.798
ELF3            7.60E−14    1.793
CORO6           4.56E−13    −1.782
ADM             1.00E−10    1.778
KRT17           1.88E−10    1.774
IL11            7.06E−09    1.769
TGM2            2.26E−15    1.754
NGFR            2.24E−07    1.751
ALOX15B         1.23E−11    −1.745
TREM1           1.05E−09    1.744
CNN1            6.15E−08    1.731
THBS1           1.74E−13    1.717
TACSTD2         3.59E−10    1.707
RTN4RL2         3.11E−07    −1.706
AJAP1           5.84E−13    1.703
P4HA3           6.19E−09    1.698
BRINP1          4.66E−07    1.692
SH3TC2          4.16E−09    1.689
SYBU            4.42E−11    1.688
PORCN           2.40E−10    1.683
AC096669.3      1.21E−06    −1.681
RGS4            1.07E−06    1.681
RP11-349F21.5   3.09E−06    −1.677
PLEKHD1         5.07E−10    −1.662
SLC5A5          1.64E−06    −1.653
PFKFB4          9.49E−10    1.639
TUBA4A          5.77E−10    1.631
MIR7848         1.11E−06    −1.630
CALCB           6.86E−08    1.625
ZNF114          1.38E−08    1.611
RASSF4          4.63E−10    1.610
RP11-362F19.1   7.87E−08    1.603
KHDRBS2         1.56E−06    −1.603
DIRAS3          5.83E−06    1.598
RP11-349F21.2   1.69E−05    −1.598
EDNRA           4.52E−08    1.591
LMCD1           1.08E−09    1.587
PPP1R1A         7.95E−07    1.585
KRT19           2.44E−10    1.580
SLC38A5         5.24E−09    1.570
SFN             6.22E−08    1.569
CXCL8           2.31E−06    1.569
HTRA3           1.74E−07    −1.569
RRAD            3.77E−07    1.569
RP11-626E13.1   3.73E−06    −1.569
SSX2            5.18E−05    1.568
RP11-486B10.4   2.27E−05    −1.567
SLA             8.89E−07    −1.566
CCL2            4.24E−05    1.564
C3orf36         1.89E−06    1.560
FAM110B         3.70E−09    1.559
NPPC            6.27E−05    1.559
PPP1R3C         1.09E−05    1.558
PDE4A           1.43E−05    −1.558
CNN3            6.67E−08    1.557
ADAMTS9         7.14E−08    1.557
PLCXD3          6.62E−05    1.557
SGK1            2.51E−08    1.555
IER3            2.63E−07    1.554
SHC4            3.95E−05    1.554
IPCEF1          8.07E−06    −1.551
SGCD            1.53E−08    −1.547
SLC43A3         2.23E−09    −1.547
TNC             5.65E−10    1.547
GBP1            1.56E−06    1.546
EDN1            4.25E−05    1.544
MAMDC2          1.07E−06    1.541
C1orf116        6.43E−09    1.540
KCNN4           2.98E−05    1.540
RAB36           4.48E−07    −1.540
ISM1            1.07E−06    −1.537
TNFRSF19        3.30E−07    1.536
PRKAR2B         6.80E−06    1.530
TGFA            4.70E−08    1.529
RSAD2           5.86E−06    1.527
SSX2B           1.48E−04    1.524
KRT83           3.56E−06    1.523
PHLDB2          7.77E−07    1.521
PTCHD1          3.76E−05    −1.519
HGF             1.70E−04    −1.518
NPBWR1          1.84E−04    1.515
TNFRSF11B       6.83E−06    1.515
LMLN            9.49E−06    −1.515
MDFI            8.24E−06    1.515
RP11-506K6.4    1.95E−04    −1.512
DCBLD2          1.03E−07    1.512
TIMP4           5.52E−05    1.510
EHF             1.95E−04    1.506
Y_RNA           2.32E−04    −1.505
WWC1            8.29E−10    1.505
CRABP2          2.52E−05    1.505
DPYSL4          3.94E−05    1.502
NTNG1           1.09E−04    1.502
TNNC1           1.08E−04    1.502
LINC00511       1.60E−04    1.501
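The values in Table S1 use typographic minus signs, and the fold changes appear to follow a signed convention in which no value lies between −1 and 1. The sketch below shows one way to parse such values and convert a signed fold change to a log2 fold change. The convention (a value of −x means expression reduced x-fold) is an assumption based on the table's value range, not something the source states, and both function names are hypothetical.

```python
import math


def parse_number(text: str) -> float:
    """Parse a number that may use a Unicode minus sign or an en dash."""
    return float(text.replace("\u2212", "-").replace("\u2013", "-"))


def signed_fc_to_log2(fold_change: float) -> float:
    """Convert a signed fold change to log2 scale.

    Assumed convention: +x means x-fold up, -x means x-fold down,
    so the magnitude is always at least 1.
    """
    if abs(fold_change) < 1:
        raise ValueError("signed fold change must have magnitude >= 1")
    return math.copysign(math.log2(abs(fold_change)), fold_change)


# FOXE1 row from Table S1, written with Unicode minus signs.
p_value = parse_number("1.14E\u221214")
log2_fc = signed_fc_to_log2(parse_number("\u22121.908"))
print(p_value, round(log2_fc, 3))
```

On this scale, up- and downregulated genes are symmetric around zero, which makes the two directions directly comparable.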

Table S2. Differential gene expression of 59 tumorous (T)–nontumorous (N) PTC tissue pairs from TCGA database for FOXE1, IGFBP3, and THBS1

Gene name       P value*    Fold change†
FOXE1           2.02E−05    −1.504
IGFBP3          4.19E−05    1.668
THBS1           1.73E−02    1.606

*P values were obtained by paired t test.
†The values represent the median fold changes of tumorous versus nontumorous tissue pairs for each gene.

Table S3. PCR primer sequences

Gene or region name   Forward                   Reverse
Primers for qRT-PCR using SYBR Green kit
THBS1                 CCTGTGATGATGACGATGA       CTGATCTGGGTTGTGGTTGTA
IGFBP-3               CCATGACTGAGGAAAGGAGCTC    TGCAGCAGGGCAGAGTCTC
Primers for qPCR in ChIP assay
R1                    GCCCAGCGCCAGTACTAACT      CTGTGGTGCCCGCTAGTTTA
R2                    CTAAACTAGCGGGCACCACA      CGTGACCGGGACTGGACT
R3                    CTTCAGCCGGAGACCAGAGT      ACAGAGGCTCGGGAGTGAC
R4                    TCGGCTAGCGGGTCACTC        GAGAGCTCAGGGGATCGTC
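The ChIP primers in Table S3 can be compared by reverse complement to see how the amplified regions relate to each other. The sketch below checks that the R1 reverse primer and the R2 forward primer cover almost the same site on opposite strands, which would be consistent with R1 and R2 being adjacent amplicons. That interpretation is inferred from the sequences alone and is not stated in the source; the function name is hypothetical.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Primer sequences copied from Table S3.
R1_REVERSE = "CTGTGGTGCCCGCTAGTTTA"
R2_FORWARD = "CTAAACTAGCGGGCACCACA"

r2_fwd_rc = reverse_complement(R2_FORWARD)

# The two primers share 19 bases, offset by a single position.
shared = R1_REVERSE[1:]
print(r2_fwd_rc)
print(shared == r2_fwd_rc[:-1], len(shared))
```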
