(2007) 26, 2543–2553 & 2007 Nature Publishing Group All rights reserved 0950-9232/07 $30.00 www.nature.com/onc ORIGINAL ARTICLE Regulation of clustered expression by cofactor of BRCA1 (COBRA1) in breast cells

SE Aiyar1, AL Blair1, DA Hopkinson, S Bekiranov and R Li

Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA, USA

Eucaryotic that are coordinately expressed tend to the gene organization patterns within a genome. In be clustered. Furthermore, gene clusters across chromo- contrast to the long-held presumption that genes are somal regions are often upregulated in various tumors. randomly distributed throughout the eucaryotic ge- However, relatively little is known about how gene nomes, there is compelling evidence for clustering of clusters are coordinately expressed in physiological or co-expressed genes in every eucaryotic genome exam- pathological conditions. Cofactor of BRCA1 (COBRA1), ined so far (Hurst et al., 2004). For example, 25% of a subunit of the human negative elongation factor, has genes in budding yeast that are transcribed in the same been shown to repress estrogen-stimulated transcription cell-cycle stage are adjacently located (Cho et al., 1998). of trefoil factor 1 (TFF1 or pS2)by stalling RNA The gene-clustering phenomenon is perhaps even more polymerase II. Here, we carried out a genome-wide study prominent in multicellular organisms. It is reported that to identify additional physiological target genes of 20% of genes in the entire fly genome are organized into COBRA1 in cells. The study identified a clusters with similar expression patterns (Spellman and total of 134 genes that were either activated or repressed Rubin, 2002). Likewise, a large number of housekeeping upon small hairpin RNA-mediated reduction of COBRA1. genes and highly expressed genes in the Interestingly, many COBRA1-regulated genes reside as are clustered (Caron et al., 2001; Lercher et al., 2002). clusters on the and have been previously Co-expression of most gene clusters is unlikely owing to implicated in cancer development. Detailed examination a ‘bystander effect’ wherein genes might be activated of two such clusters on 21 (21q22)and simply because they are adjacent to other actively chromosome X (Xp11)reveals that COBRA1 is physically transcribed genes, as syntenic analyses of mouse and associated with a subset of its regulated genes in each human genomes indicate that the order of co-expressed cluster. In addition, COBRA1 was shown to regulate genes in a cluster is conserved by natural selection both estrogen-dependent and -independent transcription of (Singer et al., 2005). Therefore, clustering of co- the gene cluster at 21q22, which encompasses the pre- expressed genes may have a functional advantage in viously identified COBRA1-regulated TFF1 (pS2)locus. coordinating during cell proliferation, Thus, COBRA1 plays a critical role in the regulation of differentiation and development. clustered gene expression at preferred chromosomal In addition to their potential roles in normal domains in breast cancer cells. physiology, gene clusters across chromosomal regions Oncogene (2007) 26, 2543–2553. doi:10.1038/sj.onc.1210047; are often upregulated in various tumors (Zhou et al., published online 16 October 2006 2003; Koon et al., 2004). For example, members of the trefoil factor family (TFF1–3) are tandem clustered Keywords: COBRA1; clustered gene expression; estrogen; within a 55 kb genomic region at chromosome 21q22.3 transcription; NELF (Chinery et al., 1995; Schmitt et al., 1996). Clinical studies suggest a strong association of TFF1/TFF3 expression with breast cancer (Poulsom et al., 1997; Mikhitarian et al., 2005; Xu et al., 2005; Smid et al., 2006). In a recent study of primary breast tumor Introduction samples, overexpression of TFF1 (pS2) and TFF3 was most significantly correlated with bone in Recent rigorous examination of complete eucaryotic breast cancer (Smid et al., 2006). In addition, it has been genomes, combined with gene expression profiling shown that plasma levels of TFF1, TFF2 and TFF3 are analyses, has shed exciting and unexpected insight into increased in patients with advanced prostate cancer (Vestergaard et al., 2006). Therefore, coordinate expres- Correspondence: Dr R Li, Department of Biochemistry and Molecular sion of the TFF gene cluster may represent a significant Genetics, School of Medicine, University of Virginia, 1300 Jefferson marker in breast and prostate cancer progression. Park Avenue, Charlottesville, VA 22908, USA. Although part of the upregulated clustered gene E-mail: [email protected] 1These authors contributed equally to this work. expression in tumor samples could be attributed to gene Received 22 May 2006; revised 24 August 2006; accepted 6 September amplification, epigenetic changes in high-order chro- 2006; published online 16 October 2006 matin structure may also lead to dysregulation of COBRA1 regulates clustered gene expression SE Aiyar et al 2544 coordinated gene expression at the transcription level in human . Despite the abundant evidence for clustering of co-expressed genes in eucaryotic genomes, the mechanistic basis and regulators that dictate the coordinated expression of clustered genes are poorly understood. Cofactor of BRCA1 (COBRA1) was first identified as a BRCA1-interacting protein capable of inducing large- scale reorganization (Ye et al., 2001). COBRA1 was subsequently found to be one of the four subunits of the negative elongation factor complex NELF (NELF-B) (Narita et al., 2003). The NELF complex was biochemically purified based on its ability to inhibit transcription elongation by stalling RNA polymerase II (RNAPII) within the promoter-proximal region (Wada et al., 1998; Yamaguchi et al., 1999). Recent in vivo work shows that COBRA1 interacts with the ligand-binding domain of estrogen a (ERa) (Aiyar et al., 2004). In response to estrogen, COBRA1 and the rest of the NELF complex are recruited to a Figure 1 analysis of COBRA1-regulated genes as identified by microarray. (a) Immunoblots showing the decreased subset of ERa-regulated promoters such as TFF1 (pS2), expression of COBRA1 protein in the COBRA1 knockdown and inhibit the ligand-dependent gene activation by T47D-derived cells (sh2A and shGA). Parental T47D and interfering with the RNAPII movement (Aiyar et al., shEGFP-expressing cells were used as controls. (b) The Ingenuity 2004). To better understand the pattern of gene Gene Ontology Analysis tool is used to identify several biological pathways that are relatively enriched with COBRA1-regulated regulation by COBRA1, we conducted a genome-wide genes. The P-value for each category is shown on the top of the search for COBRA1 target genes by comparing gene bar. expression profiles in control and COBRA1 knockdown breast cancer cells. Our work led to the key discovery that COBRA1 regulates both ligand-dependent and - regulated genes. Of note, 31 out of the 134 COBRA1- independent expression of cancer-associated genes that regulated genes are either membrane-associated or tend to cluster on the chromosomes, including the TFF secretory (solid squares in Supplementary Table 1). gene cluster. Thus, this finding suggests a link between Using the Ingenuity Gene Ontology Analysis software regulation of transcription elongation and higher order program (www.ingenuity.com), we found that the list of chromatin structure in breast cancer cells. COBRA1-regulated genes is enriched with those in- volved in control, antigen presentation and amino-acid synthesis pathways (Figure 1b). A total of 21 COBRA1-regulated genes have been Results reported to be estrogen-responsive (http://defiant.i2r. a-star.edu.sg/projects/Ergdb-v2/index.htm) (solid trian- Whole-genome analysis of COBRA1-regulated genes gles in Supplementary Table 1), and 17 of them are To conduct a genome-wide search for genes regulated by repressed by COBRA1 (upregulated in the COBRA1 COBRA1, ERa-positive breast cancer cell line T47D knockdown cells). This extends our previous observa- was infected with a retrovirus expressing either a control tion of the COBRA1-mediated repression of estrogen- small hairpin RNA (shRNA) against enhanced green responsive transcription at the TFF1 (pS2) and C3 loci fluorescent protein (EGFP) (shEGFP) or shRNA (Aiyar et al., 2004). In addition, a large number of against COBRA1 (sh2A). shRNA-mediated knockdown COBRA1-regulated genes have been previously impli- of COBRA1 was verified by Western blotting cated in cancer development. These include CD74 in (Figure 1a). Total RNA isolated from duplicates of myeloma (Burton et al., 2004), Claudin 1 in colon cancer each shRNA-expressing cell line was subjected to (Dhawan et al., 2005), AZGP1 in prostate cancer microarray analyses using the Affymetrix microarray (Lapointe et al., 2004) and the cancer/testis-specific G- chips (HG-U133A). Examination of the microarray antigens (GAGEs) (Bodey, 2002). In particular, multiple data revealed a total of 134 genes with a P-value of COBRA1-regulated genes have been associated with less than or equal to 0.05 and a fold change in breast cancer. These include APOD (Lopez-Boado signal intensity greater than 1.5 (Supplementary et al., 1994; Blais et al., 1995; Rassart et al., 2000), Table 1). Among these genes, 84 were stimulated (i.e. CLDN1 (Kramer et al., 2000; Hoevel et al., 2002; COBRA1-repressed) and 50 were downregulated Morin, 2005; Tokes et al., 2005), MGP (Chen et al., (i.e. COBRA1-activated) in the COBRA1 knockdown 1990; Sheikh et al., 1993; Hirota et al., 1995), GAGEs cells (Supplementary Table 1). As expected, COBRA1 (Mashino et al., 2001; Duan et al., 2003), MCM2 was among the downregulated genes from the initial (Gonzalez et al., 2003; Moggs et al., 2005; Shetty et al., microarray analysis (reduced 2.96-fold; P-value 0.04), 2005), PIP (Pagani et al., 1994; Clark et al., 1999; Carsol but was not included in the final list of COBRA1- et al., 2002; Tian et al., 2004), TIMP1 (Nakopoulou

Oncogene COBRA1 regulates clustered gene expression SE Aiyar et al 2545 et al., 2003; Wu¨ rtz et al., 2005), WISP2 (Banerjee et al., of non-random distribution of many genes across the 2003) and TFF3 (Xu et al., 2005). genome. In two instances (MGP and ARHGDIB on ; DNAL4 and CBX6 on chromosome 22), the two COBRA1-regulated genes on the same Confirmation of microarray data by real-time RT–PCR chromosome are only separated by one gene. In Quantitative real-time reverse transcription–polymerase addition, TFF3 at chromosome 21q22 is clustered with chain reaction (RT–PCR) analysis was used to verify the TFF1 (pS2), another member of the TFF and a known microarray results for 15 COBRA1-repressed genes and COBRA1-repressed gene from our previous work four COBRA1-stimulated genes (bold in Supplementary (Aiyar et al., 2004). Lastly, GAGE2 and GAGE7, Table 1). To alleviate the concern over the ‘off-targeting’ which belong to the same cancer/testis-specific G effect commonly associated with shRNA-mediated gene antigen family on chromosome X (Xp11) (de Backer silencing, we also included an independent T47D pool et al., 1999), are immediate neighbors of each other. that expressed a different COBRA1-targeted shRNA Although some COBRA1-regulated genes such as (shGA; Figure 1a). All 19 genes verified by real-time TFF1/TFF3 and GAGE2/7 belong to gene families, RT–PCR were affected by COBRA1 knockdown in the many other clustered COBRA1 target genes are same direction as indicated by the microarray. A subset apparently structurally unrelated. A chromosomal gene of these genes is shown in Figure 2. For example, cluster is defined as two or more genes in which a consistent with the microarray data, expression of contiguous start site is within a distance (d) of the CLDN1, ARHGDIB, WISP2 and TXNIP was stimu- neighboring start site. When d is set at 20 Mb, 112 of the lated when COBRA1 was reduced by either COBRA1- 134 COBRA1-regulated genes appear to be clustered on specific shRNA (Figure 2a–d). On the other hand, the chromosomes (Figure 3 and Supplementary Table COBRA1 knockdown led to decreased expression of 4). We estimate the significance of this by randomly MYO5C and SPINK5 (Figure 2e and f). Thus, findings sampling 134 starts 100 000 times among 13 003 non- from the whole-genome approach can be validated by overlapping exemplar sequences from which Affymetrix an independent assay. probe sets were selected. The clustering phenomenon is statistically significant (P ¼ 0.004) with the median Clustering of COBRA1-regulated genes number of randomly selected genes found in clusters A close examination of the genomic locations of the equal to 80. As illustrated in Figure 3, the clustering COBRA1-regulated genes revealed an intriguing finding effect is observed on all chromosomes except 18.

Figure 2 Verification of the microarray results by real-time RT–PCR. Four cell lines are compared: parental T47D cells (white bars), T47D shEGFP control cells (gray bars), T47D shCOBRA1 cells expressing two independent shRNA oligos (black bars for 2A and striped bars for GA). Examples of COBRA1-repressed genes are shown by (a) CLDN1, (b) ARHGDIB, (c) WISP2 and (d) TXNIP. Examples of COBRA1-stimulated genes are shown by (e) MYO5C and (f) SPINK5. The error bars represent s.d. Results were representatives of at least four independent biological experiments.

Oncogene COBRA1 regulates clustered gene expression SE Aiyar et al 2546 Chromosome 1 has the largest number of COBRA1- regulated gene clusters (a total of five), with one of these clusters containing nine genes (1q41–q42).

Analysis of the COBRA1-regulated gene cluster at Xp11 Following the global analysis of the COBRA1-regulated gene expression, we carried out a more in-depth study at specific gene clusters. Aberrant gene expression on the has recently been associated with both hereditary and sporadic basal-like breast cancer (Ri- chardson et al., 2006). In addition, several COBRA1- repressed genes at the X chromosome including GAGE2/GAGE7 at Xp11.2 and TIMP1 at Xp11.2– 11.4 are proximal to each other and have been implicated in breast cancer development (Chen et al., 1998; Mashino et al., 2001; Bodey, 2002; Duan et al., 2003; Nakopoulou et al., 2003; Wu¨ rtz et al., 2005). We therefore examined the effect of COBRA1 knockdown on these and flanking genes by quantitative RT–PCR. As summarized in Figure 4 (top diagram), several additional members of the GAGE gene family as well as FOXP3 were upregulated by COBRA1 knockdown, whereas expression of GAGEB1 and LOC139163 was reduced in the absence of COBRA1. Expression of four other neighboring genes in this genomic region (GA- GEC1, LOC286408, PPP1R3F and JM1) was not affected by COBRA1. Except for FOXP3, all CO- BRA1-affected genes at this locus are contiguous. Figure 3 A schematic diagram of the chromosomal distribution of TIMP1, another COBRA1-repressed gene revealed by the 112 clustered COBRA1-regulated genes. Each solid circle on the chromosome represents a COBRA1-regulated gene. See the microarray study, is located approximately 2 Mb Supplementary Table 3 for the exact chromosomal location of downstream of the GAGE locus on the X chromosome. each gene in the clusters. The entire TIMP1 gene is situated within the sixth

Figure 4 COBRA1-regulated gene cluster at the Xp11.2–11.4. Quantitative real-time RT–PCR confirmed that COBRA1 modulated a cluster of genes on the X chromosome (see Supplementary Table 5 for the raw data). The solid up block arrows indicate genes that are upregulated in the absence of COBRA1, whereas the open down block arrows indicate genes that are downregulated in the COBRA1 knockdown cells. The gray horizontal bars designate those genes that are not significantly affected by COBRA1 knockdown. The thin horizontal arrows above the gene names designate the transcription directions.

Oncogene COBRA1 regulates clustered gene expression SE Aiyar et al 2547 intron of SYN1, which is transcribed in the opposite (mRNA) level in the COBRA1 knockdown cells (data direction. Examination of eight genes in this genomic not shown; but see Figure 8). region by quantitative RT–PCR indicates that expres- Given the clinical evidence for an association of sion of SYN1 and TIMP1 are elevated by COBRA1 upregulation of TFF1 and TFF3 with breast cancer knockdown, whereas none of the flanking genes were development (Poulsom et al., 1997; Mikhitarian et al., affected by COBRA1 (Figure 4, bottom diagram). Of 2005; Xu et al., 2005; Smid et al., 2006), we compared note, expression of the ZNF family (ZNF41, ZNF81 the impact of COBRA1 knockdown on the expression and ZNF157) located in the same region was not of TFF1/TFF3 and several other neighboring genes in regulated by COBRA1, suggesting that COBRA1 is not this region. As summarized in Figure 6, TFF1, TFF3 a general transcription regulator of all gene families. and two additional genes (TMPRSS3 and TSGA2) in To determine whether COBRA1 was physically the cluster were upregulated by COBRA1 knockdown. associated with any of the affected genes at the Xp11 On the other hand, expression of TFF2 and UBASH3A locus, we carried out chromatin immunoprecipitation was decreased in the COBRA1 knockdown cells, (ChIP) in a T47D derivative that expresses a moderate whereas ABCG1 and SLC37A on the outskirts of the amount of ectopic Flag-COBRA1. This cell line allows region were not affected by COBRA1. Interestingly, all for ChIP with both an anti-COBRA1 and anti-Flag antibody. As shown in Figure 5, positive ChIP signals were detected at the promoters of TIMP1 and SYN1, as well as the common promoter for GAGE 1, 2 and 8. The intensity of the ChIP signals varied between the two antibodies at the three COBRA1-regulated promoters (compare lanes 3 and 4), which could be owing to the variation in availability of the epitopes and conforma- tion of the COBRA1 protein at specific loci. Neither antibody revealed any COBRA1 association with the promoters for GAGE4-7/7B, ELK1 on chromosome X or glyceraldehyde-3-phosphate dehydrogenase on chromosome 12. These findings strongly suggest a preferential affinity of COBRA1 for a subset of COBRA1-regulated genes on chromosome X.

Characterization of the COBRA1 action at the gene cluster on 21q22 TFF3, which is upregulated by COBRA1 knockdown in the microarray experiment, belongs to the TFF that also includes TFF1 and TFF2. TFF1, the previously known COBRA1 target gene (Aiyar et al., 2004), was upregu- Figure 5 ChIP demonstrates COBRA1 association with a subset lated by COBRA1 knockdown as well (3.95-fold). of COBRA1-regulated genes at Xp11.2–11.4. ChIP analysis was However, it had a P-value of 0.066 and therefore was carried out using a T47D-derived cell line that ectopically expressed not included in the list of COBRA1-regulated genes Flag-COBRA1. The input lane represents 1/10 of the total genomic from the microarray analysis. This is most likely due to DNA used in each ChIP (lane 1). Samples were immunoprecipi- the exceedingly stringent criteria set for the current tated with IgG (Santa Cruz, Santa Cruz, CA, USA; lane 2), an anti- Flag antibody (Sigma Aldrich, St Louis, MO, USA; lane 3) and an microarray analysis (Po0.05), as quantitative RT–PCR anti-COBRA1 antibody (lane 4) (Aiyar et al., 2004). Primer using the same RNA for the microarray experiment sequences used in the ChIP experiments are listed in Supplementary clearly indicated an elevated TFF1 messenger RNA Table 3.

Figure 6 COBRA1 modulates clustered gene expression at the 21q22.3 chromosomal locus. The solid and open arrows indicate genes that are up- and downregulated in the absence of COBRA1, respectively. The gray bars indicate those genes that are not affected by COBRA1 knockdown. See Supplementary Table 6 for the raw quantitative RT–PCR data.

Oncogene COBRA1 regulates clustered gene expression SE Aiyar et al 2548 COBRA1-repressed genes (i.e. stimulated in the absence COBRA1 reduction even in the absence of exogenously of COBRA1) at this locus are transcribed in the same added estrogen (compare the first and third bars), direction. Consistent with our previous finding (Aiyar suggesting an effect of COBRA1 on the ligand- et al., 2004), ChIP analysis showed that COBRA1 was independent transcription as well. On the other hand, associated with the TFF1 promoter region (Figure 7). In TSGA2 expression was not estrogen-responsive, yet its addition, COBRA1 was present at the promoters of transcription was upregulated by COBRA1 knockdown several additional potential target genes in this region, in charcoal-stripped medium with and without exo- including TFF3, TMPRSS3, TSGA2 and UBASH3A. genously added estrogen (Figure 8). Therefore, our data COBRA1 was not detected at the promoters of TFF2, suggest that COBRA1 is capable of repressing both SLC37A1 or ABCG1. Therefore, COBRA1 is preferen- ligand-dependent and -independent gene expression at tially associated with most of its regulated genes at this the TFF gene cluster. particular cluster.

Impact of COBRA1 on the estrogen-dependent and Discussion -independent transcription at the 21q22 locus Because both TFF1 and TFF3 are responsive to In vitro biochemical characterization clearly demon- estradiol stimulation (May and Westley, 1997) and strate that NELF, of which COBRA1 is an integral ERa has been shown to bind to several regions along subunit, modulates transcription elongation by stalling (Carroll et al., 2005), we assessed the RNAPII at the promoter-proximal region (Yamaguchi impact of COBRA1 on the ligand-dependent and - et al., 1999; Palangat et al., 2005). Previous in vivo independent transcription of those COBRA1-repressed studies in human and Drosophila systems corroborate genes within the TFF-encompassing gene cluster (TFF1, the RNAPII-stalling function of NELF in gene regula- TFF3, TSGA2 and TMPRSS3). Control and COBRA1 tion (Wu et al., 2003, 2005a; Aiyar et al., 2004). knockdown T47D cells were starved in charcoal- Immunostaining of Drosophila polytene chromosomes stripped medium for 3 days, and subsequently treated shows fewer foci for NELF than RNAPII (Wu et al., with 10 nM 17-b-estradiol for 24 h. As shown by 2003), suggesting that NELF may not act as a general quantitative RT–PCR analysis (Figure 8), ligand-depen- transcription elongation factor for all RNAPII-depen- dent transcription of the estrogen-responsive genes dent gene expression. Findings from our whole-genome (TFF1, TFF3 and TMPRSS3) was significantly elevated in the COBRA1 knockdown cells (compare the second and fourth bars in each panel). In addition, expression of TFF3 and TMPRSS3 was also stimulated by the

Figure 8 COBRA1 regulates both estrogen-dependent and -independent gene expression at the 21q22.3 chromosomal locus. ShEGFP- and shCOBRA2A-expressing T47D cells were starved for 3 days in medium containing charcoal-stripped serum and then were treated with either vehicle (ethanol; open bars) or 10 nM 17-b- estradiol (solid bars) for 24 h. RNA was prepared for real-time Figure 7 ChIP reveals association of COBRA1 with multiple RT–PCR analysis of the indicated genes at 21q22.3. Among the COBRA1-regulated genes at the 21q22.3 locus. ChIP was carried COBRA1-repressed genes tested, TMPRSS3, TFF1 and TFF3 are out in the same way as described in Figure 5. estrogen-responsive, whereas TSGA2 is estrogen-independent.

Oncogene COBRA1 regulates clustered gene expression SE Aiyar et al 2549 study in breast cancer cells clearly demonstrate that in mouse, consistent with the notion COBRA1 preferentially modulates the expression of a that the co-expressed gene clusters are conserved owing relatively small number of genes that tend to have to functional selection (Singer et al., 2005). clustered genomic configurations. Furthermore, the The underlying mechanisms for the COBRA1-regu- ChIP results indicate a physical presence of COBRA1 lated clustered gene regulation remain to be elucidated. in at least a subset of these specific genes and At least two possible scenarios could be envisioned. chromosomal regions. A specialized function of NELF First, as a subunit of the NELF complex, COBRA1 may in regulating more complex chromatin structure and be simultaneously recruited to multiple promoters hormone-dependent gene expression in higher euca- within the same chromosomal region via its interactions ryotes would be consistent with the absence of their with a common DNA-binding such structural homologs in unicellular eucaryotes. as ERa at these promoters. The NELF complex that is Our previous gene-targeted investigation in breast present at each promoter may independently repress cancer cells strongly suggests that COBRA1 represses transcription of individual genes in the cluster. This ligand-dependent gene activation by ERa at a subset of mode of action is likely to occur for those duplicated estrogen-responsive genes including TFF1 (pS2). Find- gene families that may share common cis-regulatory ings from the unbiased approach in the current study promoter elements. For example, the three genes of the corroborate and extend the previous observation by TFF family at 21q22 display similar patterns in the uncovering an additional set of known estrogen- tissue-selective expression and regulation, and therefore responsive genes that are regulated by COBRA1. Our have been suggested to share a common locus activation new data also suggest that COBRA1 can modulate region (Tomasetto et al., 1992; Chinery et al., 1995; both ligand-dependent and -independent transcription Gott et al., 1996). In an alternative scenario, the of ER-responsive promoters (Figure 8). However, COBRA1-mediated regulation of clustered gene it is worth noting that the majority of COBRA1- expression may invoke a chromatin-based mechanism regulated genes uncovered in our study have not been that is associated with, but distinct from its role in reported to be estrogen-responsive, and this includes transcription elongation. In such an event, COBRA1 those that were verified by the real-time RT–PCR might be recruited initially to the promoter region analysis (e.g. TSGA2). Therefore, in addition to its role of one of the regulated genes in the cluster, and its in modulation of ER-dependent gene expression, impact on transcription may be subsequently propa- COBRA1 may also regulate transcription via its gated to the neighboring genes via a putative influence interactions with other site-specific transcription factors. of COBRA1 on higher-order chromatin structure. The Consistent with this notion, COBRA1 was found to second model would be in line with our previous interact with AP1 and represses its transcriptional observation that chromosome-tethered COBRA1 is activity (Zhong et al., 2004). capable of inducing large-scale chromatin reorganiza- An important finding from the current study is the tion (Ye et al., 2001). clustered organization of many COBRA1-regulated Given the known function of NELF in repressing genes. This is manifested in the following two aspects. transcriptional elongation, it is interesting to note that a First, in several occasions, two COBRA1-regulated gene significant number of genes are stimulated by COBRA1 are localized next to each other or only separated by a (i.e. their expression downregulated in the COBRA1 third gene, events that are unlikely to occur by chance. knockdown cells). ChIP analysis failed to detect any Second, many of the genes identified by the microarray COBRA1 association with several of the most down- study are localized within relatively small regions on the regulated genes in the COBRA1 knockdown cells (data chromosomes. Although several COBRA1-regulated not shown), suggesting that the observed effect of genes (e.g. GAGEs and TFF1-3) belong to tandem- COBRA1 on these genes might be indirect. However, localized gene families, most of the 112 clustered COBRA1 was detected at the promoter of UBASH3A, COBRA1-regulated genes are structurally unrelated. one of the genes in the TFF-containing gene cluster that We suspect that more COBRA1-regulated genes would were downregulated by COBRA1 knockdown (Figure 7). be found in individual clusters if less stringent filtering Therefore, the possibility of a direct gene activation criteria were applied to the microarray data. For function by COBRA1 cannot be excluded at this point. instance, several genes at the Xp11 and 21q22 clusters The TIMP1 and TFF1 genes that were chosen for (Figures 4 and 6) are unfiltered in the microarray yet in-depth characterization are overexpressed in breast identified as COBRA1-regulated by the real-time PCR cancer, making them promising prognostic markers assay. We find that 80% of these genes are in fact (Hahnel et al., 1994; Toi et al., 1998; Balleine and affected by COBRA1 knockdown in the same direction Clarke, 1999; Fata et al., 2004). In addition, expression in both assays, suggesting that microarray-based find- of both genes may be regulated by the breast cancer ings may have underestimated the actual number of tumor suppressor BRCA1. The TIMP1 gene frequently clustered COBRA1-regulated genes. It is also worth escapes from inactivation at the inactive X chromosome pointing out that the synteny of many of the COBRA1- allele (Anderson and Brown, 1999, 2002), and BRCA1 regulated gene clusters is conserved between mouse has been shown to participate in X chromosome and human. For example, the order of the neighboring inactivation (Ganesan et al., 2002). Furthermore, genes at the TFF-encompassing cluster (Figure 6) is elevated TIMP1 expression was found in BRCA1 maintained between chromosome 21 in human and mutation-associated ovarian tumors and ‘BRCA1-like’

Oncogene COBRA1 regulates clustered gene expression SE Aiyar et al 2550 sporadic ovarian tumors (Jazaeri et al., 2002, 2004). 2005b). Alteration of gene expression by COBRA1 knock- Likewise, it has been shown that BRCA1 directly binds down was considered significant if the P-value was less than or to the TFF1 (pS2) promoter and represses its transcrip- equal to 0.05, the fold change in signal intensity (increase or tion (Fan et al., 2001; Zheng et al., 2001). In light of the decrease) was greater than 1.5 and the absolute difference in physical interaction between COBRA1 and BRCA1, it is signal intensity was greater than 100. DMT chip software package was used to analyse the microarray data (Gallagher tempting to speculate that these two may act et al., 2003). The P-values were computed by a non-parametric jointly to downregulate transcription from a common Wilcoxon’s rank test comparing the population of PM probes set of breast cancer-relevant genes such as TFF1 and and MM probes for each gene on the chip (http://genes.med. TIMP1. virginia.edu). Many COBRA1-regulated genes that are uncovered in the current study have been previously associated RNA isolation and cDNA synthesis with various types of cancer, consistent with the Total RNA was extracted with TRIzol Reagent according to reported functions of COBRA1 in cancer cell prolifera- the manufacturer’s instructions (Invitrogen, Carlsbad, CA, tion and tumorigenesis (Aiyar et al., 2004; McChesney USA). The quality of the RNA was verified by absorbance et al., 2006). A search with the Oncomine Cancer measurements at 260 and 280 nm. To prepare cDNA, Profiling Database (www.oncomine.org) indicates that approximately 1 mg of RNA was reverse-transcribed using expression of multiple COBRA1-regulated genes within ImPromp (Promega, Madison, WI, USA). a given cluster is deregulated in cancer. For example, four out of the eight COBRA1-regulated genes at the Quantitative real-time RT–PCR cluster on chromosome 17 are deregulated in prostate Results obtained from the Affymetrix microarray were verified cancer (CPD, DUSP3, SMARCD2 and PSMD12). The by real-time RT–PCR for those genes highlighted in bold same is true for all three clustered genes on chromosome face in Supplementary Table 1. For the real time RT–PCR 16 (GOT2, TK2, and MBTPS1) in prostate and experiments, a neomycin-selected EGFP shRNA-expressing leukemia. Likewise, altered expression of the TFF gene cell pool was used. Primer Express software (Applied Biosystems, Foster City, CA, USA) was used to design primers family has been implicated in lung cancer. In addition, that cover two neighboring exons (Supplementary Table 2). two recent clinical studies show that concurrent over- SYBR Green technology and an ABI7300 real-time PCR expression of the TFF family members is associated machine were used for the quantitative real-time PCR analysis. with advanced breast and prostate cancers (Smid et al., Relative mRNA levels were determined and standard curves 2006; Vestergaard et al., 2006). Conceivably, alteration were generated by a serial dilution of cDNA. Expression levels of COBRA1 activity and/or expression may contribute were normalized against b-actin. Results were confirmed with to the deregulation of transcription at these clusters in at least four independent experiments. various cancer types, a notion that merits testing in To assess the mRNA abundance of GAGE1 from the highly future studies. homologous GAGE gene family on Xp11.2, a primer set was designed based on a unique 143-bp insertion in GAGE1. Because of the high degree of nucleotide identity between GAGE2 and GAGE8, Taqman primer probe sets from Materials and methods Applied Biosystems were used to measure separately the mRNA abundance of these two GAGE genes. A previously Cell lines and culture described primer set (VV1 and VDE24) (de Backer et al., 1999) T47D cells lines were obtained from the American Type was used in semiquantitative RT–PCR to measure the Culture Collection (Manassas, VA, USA), and grown in collective expression levels of GAGE-3, -4, -5, -6 and -7B. Dulbecco’s modified Eagle’s medium supplemented with 10% fetal bovine serum, penicillin (100 U/ml) and streptomycin (100 mg/ml). Chromatin immunoprecipitation ChIP assays were carried out according to a previously shRNA constructs described protocol (Aiyar et al., 2004). Primer sequences for shRNA-mediated gene silencing was achieved using the the ChIP assay are listed in Supplementary Table 3. pSuperRetro plasmid purchased from Oligoengine (Seattle, WA, USA). Two different oligonucleotides directed against the human COBRA1 (NELF-B) gene were used: COBRA-2A Immunoblotting (50-GTTCCACCAGTCGGTATTC-30) and COBRA-GA (50- An equal amount of whole-cell lysates was resolved by 10% GACCTTCTGGAGAAGAGCT-30). As a negative control, sodium dodecyl sulfate–polyacrylamide gel electrophoresis. an shRNA directed against EGFP was used (50-GAACGG The proteins were detected by anti-COBRA1 (Aiyar et al., CATCAAGGTGAAC-30). Viral supernatants were prepared 2004) and anti-a-tubulin (Calbiochem, San Diego, CA, USA) and used to infect T47D cells, and stable transfectants were antibodies. selected with 10 mg/ml puromycin as described previously (Aiyar et al., 2004). Acknowledgements

Microarray and data analysis We thank Asma Amleh and Jianlong Sun for critical reading shRNA-expressing T47D cells were grown to 70% confluency of the paper, and Yongde Bao for the microarray analysis. The and harvested for total RNA. Duplicates of the control EGFP work was supported by a postdoctoral fellowship from DOD. and COBRA1-2A shRNA cell lines were subjected to Breast Cancer Research Program to SEA (BC031441), a microarray analyses, using the human gene array chips Cancer Training Grant to ALB and an NIH grant to RL (HgU-133A) from Affymetrix as described earlier (Wu et al., (DK064604).

Oncogene COBRA1 regulates clustered gene expression SE Aiyar et al 2551 References

Aiyar SE, Sun J-L, Blair AL, Moskaluk CA, Lv Y, Ye Q-N de Backer O, Arden KC, Boretti M, Vantomme V, De Smet C, et al. (2004). Attenuation of alpha- Czekay S et al. (1999). Characterization of the GAGE genes mediated transcription through estrogen-stimulated re- that are expressed in various human cancers and normal cruitment of a negative elongation factor. Genes Dev 18: testis. Cancer Res 59: 3157–3165. 2134–2146. Dhawan P, Singh AB, Deane NG, No Y, Shiou SR, Schmidt C Anderson CL, Brown CJ. (1999). Polymorphic X-chromosome et al. (2005). Claudin-1 regulates cellular transformation inactivation of the human TIMP1 gene. Am J Hum Genet 65: and metastatic behavior in colon cancer. J Clin Invest 115: 699–708. 1765–1776. Anderson CL, Brown CJ. (2002). Variability of X chromo- Duan Z, Duan Y, Lamendola DE, Yusuf RZ, Naeem R, some inactivation: effect on levels of TIMP1 RNA and role Penson RT et al. (2003). Overexpression of MAGE/GAGE of DNA methylation. Hum Genet 110: 271–278. in paclitaxel/doxorubicin-resistant human cancer cell lines. Balleine RL, Clarke CL. (1999). Expression of the oestrogen Clin Cancer Res 9: 2778–2785. responsive protein pS2 in human breast cancer. Histol Fan S, Ma YX, Wang C, Yuan RQ, Meng Q, Wang JA et al. Histopathol 14: 571–578. (2001). Role of direct interaction in BRCA1 inhibition of Banerjee S, Saxena N, Sengupta K, Tawfik O, Mayo MS, estrogen receptor activity. Oncogene 20: 77–87. Banerjee SK. (2003). WISP-2 gene in human breast cancer: Fata JE, Werb Z, Bissell MJ. (2004). Regulation of mammary estrogen and progesterone inducible expression and regula- gland branching morphogenesis by the extracellular matrix tion of tumor cell proliferation. Neoplasia 5: 63–73. and its remodeling . Breast Cancer Res 6: 1–11. Blais Y, Sugimoto K, Carriere MC, Haagensen DE, Labrie F, Gallagher PG, Bao Y, Serrano SMT, Kamiguti AS, Theakston Simard J. (1995). Interleukin-6 inhibits the potent stimula- RDG, Fox JW. (2003). Use of microarrays for investigating tory action of androgens, glucocorticoids and interleukin-1 the subtoxic effects of snake venoms: insights into venom- alpha on apolipoprotein D and GCDFP-15 expression in induced in human umbilical vein endothelial cells. human breast cancer cells. Int J Cancer 62: 732–737. Toxicon 41: 429–440. Bodey B. (2002). Cancer-testis antigens: promising targets for Ganesan S, Silver DP, Greenberg RA, Avni D, Drapkin R, antigen directed antineoplastic immunotherapy. Expert Opin Miron A et al. (2002). BRCA1 supports XIST RNA Biol Ther 2: 577–584. concentration on the inactive X chromosome. Cell 111: Burton JD, Ely S, Reddy PK, Stein R, Gold DV, Cardillo TM 393–405. et al. (2004). CD74 is expressed by multiple myeloma Gonzalez MA, Pinder SE, Callagy G, Vowler SL, Morris LS, and is a promising target for therapy. Clin Cancer Res 10: Bird K et al. (2003). Minichromosome maintenance protein 6606–6611. 2 is a strong independent prognostic marker in breast cancer. Caron H, van Schaik B, van der Mee M, Baas F, Riggins G, J Clin Oncol 21: 4306–4313. van Sluis P et al. (2001). The human transcriptome map: Gott P, Beck S, Machado JC, Carneiro F, Schmitt H, Blin N. clustering of highly expressed genes in chromosomal (1996). Human trefoil peptides: genomic structure in 21q22.3 domains. Science 291: 1289–1292. and coordinated expression. Eu J Human Gen 4: 308–315. Carroll JS, Liu S, Brodsky AS, Li W, Meyer CA, Szary AJ Hahnel E, Harvey J, Robbins P, Sterrett G, Hahnel R. (1994). et al. (2005). Chromosome-wide mapping of estrogen Hormone-regulated genes (pS2, PIP, FAS) in breast cancer receptor binding reveals long-range regulation requiring and nontumoral mammary tissue. Pathobiology 62: 82–89. the forkhead protein FoxA1. Cell 122: 33–43. Hirota S, Ito A, Nagoshi J, Takeda M, Kurata A, Takatsuka Carsol JL, Gingras S, Simard J. (2002). Synergistic action of Y et al. (1995). Expression of bone matrix protein messenger prolactin (PRL) and androgen on PRL-inducible protein ribonucleic acids in human breast cancers. Possible involve- gene expression in human breast cancer cells: a unique model ment of osteopontin in development of calcifying foci. for functional cooperation between signal transducer and Lab Invest 72: 64–69. activator of transcription-5 and . Mol Hoevel T, Macek R, Mundigl O, Swisshelm K, Kubbies M. Endocrinol 16: 1696–1710. (2002). Expression and targeting of the tight junction protein Chen L, O’Bryan JP, Smith HS, Liu E. (1990). Overexpression CLDN1 in CLDN1-negative human breast tumor cells. of matrix Gla protein mRNA in malignant human J Cell Physiol 191: 60–68. breast cells: isolation by differential cDNA hybridization. Hurst LD, Pal C, Lercher MJ. (2004). The evolutionary Oncogene 5: 1391–1395. dynamics of eukaryotic gene order. Nat Rev Genet 5: Chen ME, Lin SH, Chung LW, Sikes RA. (1998). Isolation 299–310. and characterization of PAGE-1 and GAGE-7. New genes Jazaeri AA, Chandramouli GVR, Aprelikova O, Nuber UA, expressed in the LNCaP prostate cancer progression model Sotiriou C, Liu ET et al. (2004). BRCA1-mediated repres- that share homology with melanoma-associated antigens. sion of select X chromosome genes. J Translational Med 2: J Biol Chem 273: 17618–17625. 32–39. Chinery R, Williamson J, Poulsom R. (1995). The gene Jazaeri AA, Yee CJ, Sotiriou C, Brantley KR, Boyd J, Liu ET. encoding human intestinal trefoil factor (TFF3) is located on (2002). Gene expression profiles of BRCA1-linked, BRCA2- chromosome 21q22.3 clustered with other members of the linked, and sporadic ovarian cancers. J Natl Cancer Inst 94: trefoil peptide family. Genomics 32: 281–284. 990–1000. Cho RJ, Campbell MJ, Winzeler EA, Steinmetz L, Conway A, Koon N, Zaika A, Moskaluk CA, Frierson HF, Knuutila S, Wodicka L et al. (1998). A genome-wide transcriptional Powell SM et al. (2004). Clustering of molecular alterations analysis of the mitotic cell cycle. Mol Cell 2: 65–73. in gastroesophageal carcinomas. Neoplasia 6: 143–149. Clark JW, Snell L, Shiu RP, Orr FW, Maitre N, Vary CP et al. Kramer F, White K, Kubbies M, Swisshelm K, Weber BH. (1999). The potential role for prolactin-inducible protein (2000). Genomic organisation of claudin 1 and its assess- (PIP) as a marker of human breast cancer micrometastasis. ment in hereditary and sporadic breast cancer. Hum Genet Br J Cancer 81: 1002–1008. 107: 249–256.

Oncogene COBRA1 regulates clustered gene expression SE Aiyar et al 2552 Lapointe J, Li C, Higgins JP, van de Rijn M, Bair E, human intestinal trefoil factor, maps to 21q22.3. Cytogenet Montgomery K et al. (2004). Gene expression profiling Cell Genet 72: 299–302. identifies clinically relevant subtypes of prostate cancer. Proc Sheikh MS, Shao Z-M, Chen J-C, Fontana JA. (1993). Natl Acad Sci USA 101: 811–816. Differential regulation of matrix Gla protein (MGP) gene Lercher MJ, Urrutia AO, Hurst LD. (2002). Clustering of expression by retinoic acid and estrogen in human breast housekeeping genes provides a unified model of gene order carcinoma cells. Mol Cell Endocinol 92: 153–160. in the human genome. Nat Genet 31: 180–183. Shetty A, Loddo M, Fanshawe T, Prevost AT, Sainsbury R, Lopez-Boado YS, Tolivia J, Lopez-Otin C. (1994). Apolipo- Williams GH et al. (2005). DNA replication licensing and protein D gene induction by retinoic acid is concomitant cell cycle kinetics of normal and neoplastic breast. Br J with growth arrest and cell differentiation in human breast Cancer 93: 1295–1300. cancer cells. J Biol Chem 269: 26871–26878. Singer GA, Lloyd AT, Huminiecki LB, Wolfe KH. (2005). Mashino K, Sadanaga N, Tanaka F, Yamaguchi H, Clusters of co-expressed genes in mammalian genomes are Nagashima H, Inoue H et al. (2001). Expression of conserved by natural selection. Mol Biol Evol 22: 767–775. multiple cancer-testis antigen genes in gastrointestinal and Smid M, Wang Y, Klijn JG, Sieuwerts AM, Zhang Y, Atkins breast carcinomas. Br J Cancer 85: 713–720. D et al. (2006). Genes associated with breast cancer May FEB, Westley BR. (1997). Trefoil proteins: their role in metastatic to bone. J Clin Oncol 24: 2261–2267. normal and malignant cells. J Path 183: 4–7. Spellman PT, Rubin GM. (2002). Evidence for large domains McChesney PA, Aiyar SE, Lee OJ, Zaika A, Moskaluk C, of similarly expressed genes in the Drosophila genome. J Biol Li R et al. (2006). Cofactor of BRCA1: a novel transcription 1:5. factor regulator in upper gastrointestinal adenocarcinomas. Tian W, Osawa M, Horiuchi H, Tomita Y. (2004). Expression Cancer Res 66: 1346–1353. of the prolactin-inducible protein (PIP/GCDFP15) gene in Mikhitarian K, Gillanders WE, Almeida JS, Hebert Martin R, benign epithelium and adenocarcinoma of the prostate. Varela JC, Metcalf JS et al. (2005). An innovative Cancer Science 95: 491–495. microarray strategy identities informative molecular markers Toi M, Ishigaki S, Tominaga T. (1998). Metalloproteinases for the detection of micrometastatic breast cancer. Clin and tissue inhibitors of metalloproteinases. Breast Cancer Cancer Res 11: 3697–3704. Res Treat 52: 113–124. Moggs JG, Murphy TC, Lim FL, Moore DJ, Stuckey R, Tokes AM, Kulka J, Paku S. (2005). Claudin-1, -3 and -4 Antrobus K et al. (2005). Anti-proliferative effect of proteins and mRNA expression in benign and malig- estrogen in breast cancer cells that re-express ERa is nant breast lesions: a research study. Breast Cancer Res 7: mediated by aberrant regulation of cell cycle genes. J Mol R296–R305. Endocrinol 34: 535–551. Tomasetto C, Rockel N, Mattei MG, Fujita R, Rio MC. Morin PJ. (2005). Claudin proteins in human cancer: promis- (1992). The gene encoding the human spasmolytic protein ing new targets for diagnosis and therapy. Cancer Res 65: (SML1/hSP) is in 21q22.3, physically linked to the homo- 9603–9606. logous breast cancer gene BCEI/pS2. Genomics 13: Nakopoulou L, Giannopoulou I, Lazaris AC, Alexandrou P, 1328–1330. Tsirmpa I, Markaki S et al. (2003). The favorable prognostic Vestergaard EM, Borre M, Poulsen SS, Nexo E, Torring N. impact of tissue inhibitor of matrix metalloproteinases-1 (2006). Plasma levels of trefoil factors are increased in protein overexpression in breast cancer cells. APMIS 111: patients with advanced prostate cancer. Clin Cancer Res 12: 1027–1036. 807–812. Narita T, Yamaguchi Y, Yano K, Sugimoto S, Chanarat S, Wada T, Takagi T, Yamaguchi Y, Ferdous A, Imai T, Hirose H Wada T et al. (2003). Human transcription elongation factor et al. (1998). DSIF, a novel transcription elongation factor NELF: identification of novel subunits and reconstitution that regulates RNA polymeerase II processitivity, is of the functionally active complex. Mol Cell Biol 23: composed of human Spt4 and Spt5 homologs. Genes Dev 1863–1873. 12: 343–356. Pagani A, Sapino A, Eusebi V, Bergnolo P, Bussolati G. Wu CH, Lee C, Fan R, Smith MJ, Yamaguchi Y, Handa H (1994). PIP/GCDFP-15 gene expression and apocrine et al. (2005a). Molecular characterization of Drosophila differentiation in carcinomas of the breast. Virchows Arch NELF. Nucleic Acids Res 33: 1269–1279. 425: 459–465. Wu C-H, Yamaguchi Y, Benjamin LR, Horvat-Gordon M, Palangat M, Renner DB, Price DH, Landick R. (2005). A Washinsky J, Enerly E et al. (2003). NELF and DSIF cause negative elongation factor for human RNA polymerase II promoter proximal pausing on the hsp70 promoter in inhibits the anti-arrest transcript-cleavage factor TFIIS. Drosophila. Genes Dev 17: 1402–1414. Proc Natl Acad Sci USA 102: 15036–15041. Wu Y, Ghosh S, Nishi Y, Yanase T, Nawata H, Hu Y-F. Poulsom R, Hanby AM, Lalani EN, Hauser F, Hoffmann W, (2005b). The orphan nuclear receptors NURR1 and NGFI- Stamp GW. (1997). Intestinal trefoil factor (TFF 3) and pS2 B modulate aromatase gene expression in ovarian granulosa (TFF 1), but not spasmolytic polypeptide (TFF 2) mRNAs cells: a possible mechanism for repression of aromatase are co-expressed in normal, hyperplastic, and neoplastic expression upon luteinizing hormone surge. Endocrinology human breast epithelium. J Pathol 183: 30–38. 146: 237–246. Rassart E, Bedirian A, Do Carmo S, Guinard O, Sirois J, Wu¨ rtz SØ, Schrohl A-S, Srensen NM, Lademann U, Terrisse L et al. (2000). Apolipoprotein D. Biochim Biophys Christensen IJ, Mouridsen H et al. (2005). Tissue inhibitor Acta 1482: 185–198. of metalloproteinases-1 in breast cancer. Endocr Relat Richardson AL, Wang ZC, De Nicolo A, Lu X, Cancer 12: 215–227. Brown M, Miron A et al. (2006). X chromosomal Xu XQ, Emerald BS, Goh EL, Kannan N, Miller LD, abnormalities in basal-like human breast cancer. Cancer Gluckman PD et al. (2005). Gene expression profiling to Cell 9: 121–132. identify oncogenic determinants of autocrine human growth Schmitt H, Wundrack I, Beck S, Gott P, Welter C, Shizuya H hormone in human mammary carcinoma. J Biol Chem 280: et al. (1996). A third P-domain polypeptide gene (TFF3), 23987–24003.

Oncogene COBRA1 regulates clustered gene expression SE Aiyar et al 2553 Yamaguchi Y, Takagi T, Wada T, Yano K, Furuya A, repression of the estrogen receptor. Proc Natl Acad Sci USA Sugimoto S et al. (1999). NELF, a multisubunit complex 98: 9587–9592. containing RD, cooperates with DSIF to repress RNA Zhong H, Zhu J, Zhang H, Ding L, Sun Y, Huang C et al. polymerase II elongation. Cell 97: 41–51. (2004). COBRA1 inhibits AP-1 transcriptional activity Ye Q, Hu Y-F, Zhong H, Nye AC, Belmont AS, Li R. (2001). in transfected cells. Biochem Biophys Res Commun 325: BRCA1-induced large-scale chromatin unfolding and allele- 568–573. specific effects of cancer-predisposing mutations. J Cell Biol Zhou Y, Luoh SM, Zhang Y, Watanabe C, Wu TD, Ostland M 155: 911–921. et al. (2003). Genome-wide identification of chromosomal Zheng L, Annab LA, Afshari CA, Lee W-H, Boyer TG. regions of increased tumor expression by transcriptome (2001). BRCA1 mediates ligand-independent transcriptional analysis. Cancer Res 63: 5781–5784.

Supplementary Information accompanies the paper on the Oncogene website (http://www.nature.com/onc).

Oncogene