HOXB7 is an ERα cofactor in the activation of HER2 and multiple ER target genes leading to endocrine resistance

Kideok Jin, Sunju Park, Wei Wen Teo, Preethi Korangath, Sean Soonweng Cho, Takahiro Yoshida, Balázs Győrffy, Chirayu Pankaj Goswami, Harikrishna Nakshatri, Leigh-Ann Cruz, Weiqiang Zhou, Hongkai Ji, Ying Su, Muhammad Ekram, Zhengsheng Wu, Tao Zhu, Kornelia Polyak, and Saraswati Sukumar

Supplemental Methods

Plasmids

The full-length promoter region of CA12, including the ER binding site and one of two different HOXB7 binding sites, was cloned into the pGL3 promoter luciferase construct. Full-length miR-196a cDNA was cloned into the pcDNA3.1 vector. The full-length promoter region of the miR-196a gene, including the MYC binding region, was cloned into the pGL3 promoter luciferase plasmid. The 3’UTR of HOXB7 was cloned into the pGL2 promoter luciferase plasmid. The full-length human HER2 plasmid was a generous gift from Daniel Leahy (Johns Hopkins University, Baltimore, MD), and the HER2 cDNA was cloned into the pLPCX vector. EGFR WT, pRetrosuper Myc shRNA, and Lenti-sh1368 knockdown c-myc were purchased from Addgene (1-3). siRNAs against HOXB7 were purchased from Invitrogen, and shRNAs against HOXB7 were purchased from Thermo Scientific.

Immunohistochemistry

Immunohistochemical analysis of HOXB7 and HER2 protein expression was performed using monoclonal antibodies against HOXB7 (Santa Cruz Biotechnology) and HER2 (Cell Signaling). After blocking with 5% goat serum in PBST for 1 hour at room temperature, the sections were incubated with the HOXB7 or HER2 antibodies overnight at 4°C. The peroxidase-conjugated streptavidin complex method was then performed, followed by the 3,3'-diaminobenzidine (DAB) procedure, according to the manufacturer’s protocols (VECTASTAIN Elite ABC Kit, Vector Laboratories).

Site-directed mutagenesis

The 3’UTR-HOXB7-Luciferase mutant and the pcDNA3.1-MYC mutants (S62A and T58A) were generated using a QuikChange site-directed mutagenesis kit (Stratagene, Santa Clara, California).
The sequences were confirmed by automated DNA sequencing. Primer sequences were:
MYC (T58A), Forward (5’-GAG CTG CTG CCC GCC CCG CCC CTG TC-3’), Reverse (5’-GAC AGG GGC GGG GCG GGC AGC AGC TC-3’);
MYC (S62A), Forward (5’-CAC CCC GCC CCT GAC CCC CTA GCC GCC-3’), Reverse (5’-GGC GGC TAG GGG TCA GGG GCG GGG TG-3’).

Analysis of the Metabric dataset

Illumina microarray data generated by the Metabric consortium were downloaded from EGA (4). The database contains 1,988 patients with an average overall survival of 8.07 years; 76% of the patients are ER positive and 47.3% are lymph node positive. The raw Illumina microarray files were imported into R and summarized using the beadarray package (5). Follow-up was censored after 15 years. For annotation, the illuminaHumanv3 database of Bioconductor was used (http://www.bioconductor.org). Quantile normalization was performed using the preprocessCore package (6).

ChIP-PED analysis

The ChIP-PED data set is publicly available from http://www.biostat.jhsph.edu/~gewu/ChIPPED/ (7). It contains 13,182 human gene expression samples from the Affymetrix Human Genome U133A Array in the NCBI Gene Expression Omnibus (GEO). The data set is a matrix with columns corresponding to samples and rows corresponding to genes. The cross-sample gene expression correlation between two genes was calculated as the Pearson correlation coefficient between the two corresponding rows of the data set. The density curves (Figure 1B) for the distribution of correlations between HOXB7 and 144 ER target genes and for that between HOXB7 and 144 random genes were estimated using kernel density estimation. A two-tailed t-test was performed to evaluate the significance of the difference between these two distributions ($P < 10^{-15}$). We observed two groups of samples separated by the diagonal line $y = x$: one group ($y < x$, red circles in Figure S1D) showed good correlation between ER and HOXB7 gene expression, and the other group (blue circles in Figure S1D) did not.
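
For reference, the Pearson correlation coefficient between two rows of the expression matrix has the standard form written out below. The symbols are notation chosen here for illustration: $x_{tj}$ is the expression of gene $t$ in sample $j$, $y_j$ is the expression of the second gene (HOXB7 in this analysis) in the same sample, and $N$ is the number of samples.

```latex
r_t = \frac{\sum_{j=1}^{N} (x_{tj} - \bar{x}_t)(y_j - \bar{y})}
           {\sqrt{\sum_{j=1}^{N} (x_{tj} - \bar{x}_t)^2}\,
            \sqrt{\sum_{j=1}^{N} (y_j - \bar{y})^2}},
\qquad
\bar{x}_t = \frac{1}{N}\sum_{j=1}^{N} x_{tj},
\quad
\bar{y} = \frac{1}{N}\sum_{j=1}^{N} y_j
```
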
To rank the ER target genes, 8,425 samples from the group with good correlation between ER and HOXB7 were used to compute the cross-sample correlations between ER target genes and HOXB7 (or ER). To estimate the significance of correlation, given a target gene, HOXB7 (or ER) gene expression data was permuted 10,000 times and a null distribution was constructed from the correlations between the target gene and each permuted HOXB7 (or ER) gene expression. Then, the density curve of the null distribution was estimated by kernel density estimation and the p-value was calculated as the area under the curve past the observed correlation. The ER target genes were ranked according to their average cross-sample correlations with HOXB7 and ER.

In Figure 1B, the p-value was calculated as follows. Given a gene $t$, let $x_t = (x_{t1}, \ldots, x_{tN})$ be a vector consisting of its expression values in $N$ different samples. Let $y = (y_1, \ldots, y_N)$ be a vector consisting of the expression values of HOXB7 in the same $N$ samples. The correlation $r_t$ between gene $t$ and HOXB7 was calculated as the Pearson correlation coefficient between $x_t$ and $y$. To determine whether HOXB7 is significantly correlated with ER target genes, we first calculated the correlation between each of the 144 ER target genes and HOXB7, yielding a set of 144 correlation coefficients $A = \{r_t\}$. Next, we randomly selected 144 genes and calculated the correlation between each of the random genes and HOXB7, yielding another set of 144 correlation coefficients $B = \{r'_t\}$, which served as the control. Finally, a two-sample t-test was performed to evaluate whether the two sets of correlation coefficients $A$ and $B$ have the same mean. The two-sided p-value ($P < 10^{-15}$) was reported.
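
The comparison of the two sets of correlation coefficients can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' code. The source does not say whether the pooled-variance or the Welch form of the two-sample t-test was used, so the Welch form is assumed. The p-value uses a normal approximation to the t distribution, which is reasonable only when the degrees of freedom are large. The two sets of 144 values are synthetic stand-ins drawn at random, and all function and variable names are hypothetical.

```python
import math
import random


def welch_t(a, b):
    """Return the Welch two-sample t statistic and its approximate degrees of freedom."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    var_a = sum((v - mean_a) ** 2 for v in a) / (na - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (nb - 1)
    se2 = var_a / na + var_b / nb  # squared standard error of the mean difference
    t = (mean_a - mean_b) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((var_a / na) ** 2 / (na - 1) + (var_b / nb) ** 2 / (nb - 1))
    return t, df


def two_sided_p_normal(t):
    """Two-sided p-value from a normal approximation (assumption: large df)."""
    return math.erfc(abs(t) / math.sqrt(2))


# Synthetic stand-ins for the sets A (ER target genes) and B (random genes);
# the means and spread below are invented for illustration only.
rng = random.Random(1)
set_a = [max(-1.0, min(1.0, rng.gauss(0.3, 0.15))) for _ in range(144)]
set_b = [max(-1.0, min(1.0, rng.gauss(0.0, 0.15))) for _ in range(144)]

t_stat, df = welch_t(set_a, set_b)
p_value = two_sided_p_normal(t_stat)
print(f"t = {t_stat:.2f}, df = {df:.1f}, approximate two-sided p = {p_value:.3g}")
```

For an exact p-value, the t statistic and degrees of freedom would be passed to the t distribution of a statistics library rather than to the normal approximation used here.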

In Figure S1E to K, the p-values for the correlation between HOXB7 (or ER) and ER target genes were calculated as follows. To rank the ER target genes, 8,425 samples from the group with good correlation between ER and HOXB7 were used to compute the cross-sample correlations between ER target genes and HOXB7 (or ER). To estimate the significance of the correlation between HOXB7 and a given ER target gene, HOXB7 (or ER) gene expression data was permuted 10,000 times and a null distribution was constructed from the correlations between the target gene and the permuted HOXB7 (or ER) gene expression.

For example, consider HOXB7 and a target gene $t$. Let $x_t = (x_{t1}, \ldots, x_{tN})$ be a vector consisting of the expression values of gene $t$ in $N$ microarray samples ($N = 8{,}425$). Let $y = (y_1, \ldots, y_N)$ be a vector consisting of the expression values of HOXB7 in the same $N$ samples. Let $r_t$ be the Pearson correlation coefficient between $x_t$ and $y$. To determine the p-value of $r_t$, we randomly permuted $y$ 10,000 times. In each permutation $i$ ($i = 1, \ldots, 10{,}000$), the order of the $N$ numbers in $y$ was shuffled to create a permuted vector $y^{(i)}$ (e.g., $y = (2, 3, 6, 5, \ldots, 10)$ may become $y^{(i)} = (6, 10, 3, 2, \ldots, 5)$ after permutation). This process broke the association between HOXB7 and the target gene $t$. The correlation coefficient between $x_t$ and each permuted $y^{(i)}$ was computed and denoted $r_t^{(i)}$. This produced 10,000 correlation coefficients. We then applied kernel density estimation (using the R function density) to these 10,000 values of $r_t^{(i)}$ to estimate the probability density function of the correlation coefficients of the permuted data. The estimated density function was used as the null distribution for evaluating the significance of the observed correlation $r_t$. The p-value of $r_t$ was calculated as the area under the density curve with x-axis values larger than the observed correlation $r_t$ (Figure 1B).
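
The permutation procedure above can be sketched in a few lines. This is a minimal Python illustration, not the authors' code (the source used the R function density). Several choices are assumptions made here: a Gaussian kernel, a rule-of-thumb bandwidth of 0.9 · min(sd, IQR/1.34) · n^(-1/5) (the source does not state the bandwidth used), and a closed-form tail area for the Gaussian kernel in place of numerical integration over a grid. The expression vectors are synthetic, and all function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def quantile(sorted_vals, p):
    """Quantile by linear interpolation between order statistics."""
    pos = (len(sorted_vals) - 1) * p
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def rule_of_thumb_bandwidth(values):
    """Rule-of-thumb bandwidth (assumption: the source does not specify one)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    s = sorted(values)
    iqr = quantile(s, 0.75) - quantile(s, 0.25)
    spread = min(sd, iqr / 1.34) or sd or 1.0  # fall back if a spread is zero
    return 0.9 * spread * n ** -0.2


def kde_upper_tail(null_values, observed, bandwidth):
    """Area under a Gaussian kernel density estimate to the right of `observed`."""
    # Each kernel N(v, bandwidth^2) contributes its own upper-tail probability.
    total = sum(
        0.5 * math.erfc((observed - v) / (bandwidth * math.sqrt(2)))
        for v in null_values
    )
    return total / len(null_values)


def permutation_kde_pvalue(x_t, y, n_perm=10000, seed=0):
    """Observed correlation r_t and its one-sided permutation p-value."""
    rng = random.Random(seed)
    r_obs = pearson(x_t, y)
    y_perm = list(y)
    null = []
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # breaks the pairing between x_t and y
        null.append(pearson(x_t, y_perm))
    return r_obs, kde_upper_tail(null, r_obs, rule_of_thumb_bandwidth(null))


# Synthetic example: 40 samples, with the target gene built to track HOXB7.
rng = random.Random(42)
hoxb7 = [rng.gauss(0, 1) for _ in range(40)]
target = [v + 0.5 * rng.gauss(0, 1) for v in hoxb7]
r_t, p_t = permutation_kde_pvalue(target, hoxb7, n_perm=1000)
print(f"r_t = {r_t:.3f}, p = {p_t:.3g}")
```

Because the tail area is taken from a smoothed density rather than from the raw count of permuted correlations exceeding $r_t$, the p-value can fall below 1/10,000 even when no permuted correlation reaches the observed value.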

DNase-seq analysis

DNase I hypersensitive site sequencing (DNase-seq) (8) is a state-of-the-art technology for identifying accessible chromatin regions genome-wide. DNase I hypersensitive sites have proven to be robust markers (9) of regulatory DNA regions, which makes it possible to screen for candidate transcription factor binding sites (TFBSs) with a DNase-seq experiment. The ENCODE project has performed DNase-seq experiments on a number of human cell types, which provides a good resource for studying TFBSs. In this study, we obtained the DNase hotspot dataset for the estradiol-treated MCF7 cell line (downloaded from http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeUwDnase/; the DNase hotspots were generated by ENCODE using the hotspot algorithm) (10). Because the data contained two replicates, we first screened for DNase hotspots that overlapped between the two replicates. Then, we searched for DNase I hypersensitive hotspots in the proximal promoter region (15,000 base pairs upstream of the transcription start site) of the MYC gene and found five different DNase peaks:
(1) chr8: 128,735,412-128,735,765
(2) chr8: 128,737,633-128,738,395
(3) chr8: 128,738,780-128,739,066
(4) chr8: 128,739,355-128,740,018
(5) chr8: 128,746,080-128,746,796
We selected two peaks close to the MYC transcription start site. Using the regions between the peaks, we chose three putative HOXB7 binding sites based on the “TAAT” consensus HOX-binding motif. We created MYC luciferase constructs containing an ER enhancer region tagged to three different putative HOXB7 binding sites (MYC-B7-1, MYC-B7-2, and MYC-B7-3) using previously studied MYC enhancer/promoter luciferase constructs containing an ER-binding site (11).

Supplemental References

1. Lin CH, Jackson AL, Guo J, Linsley PS, Eisenman RN. Myc-regulated microRNAs attenuate embryonic stem cell differentiation. The EMBO Journal. 2009;28:3157-70.
2. Popov N, Wanzel M, Madiredjo M, Zhang D, Beijersbergen R, Bernards R, et al.
The ubiquitin-specific protease USP28 is required for MYC stability. Nature Cell Biology. 2007;9:765-74.
3.