Molecular Vision 2004; 10:758-772 ©2004 Molecular Vision Received 12 August 2004 | Accepted 7 October 2004 | Published 7 October 2004

Analysis of transcriptional regulation of the small leucine rich proteoglycans

Elena S. Tasheva,1 Bernward Klocke,2 Gary W. Conrad1

1Kansas State University, Division of Biology, Manhattan, KS; 2Genomatix Software GmbH, München, Germany

Purpose: Small leucine rich proteoglycans (SLRPs) constitute a family of secreted proteoglycans that are important for collagen fibrillogenesis, cellular growth, differentiation, and migration. Ten of the 13 known members of the SLRP family are arranged in tandem clusters on human chromosomes 1, 9, and 12. Their syntenic equivalents are on mouse chromosomes 1, 13, and 10, and rat chromosomes 13, 17, and 7. The purpose of this study was to determine whether there is evidence for control elements that could regulate the expression of these clusters coordinately.

Methods: Promoters were identified using a comparative genomics approach and Genomatix software tools. For each gene a set of human, mouse, and rat orthologous promoters was extracted from genomic sequences. Transcription factor (TF) binding site analysis combined with a literature search was performed using MatInspector and Genomatix’ BiblioSphere. Inspection for the presence of interspecies conserved scaffold/matrix attachment regions (S/MARs) was performed using ElDorado annotation lists. DNAse I hypersensitivity assay, chromatin immunoprecipitation (ChIP), and transient transfection experiments were used to validate the results from bioinformatics analysis.

Results: Transcription factor binding site analysis combined with a literature search revealed co-citations between several SLRPs and the TFs Runx2 and IRF1, indicating that these TFs have potential roles in transcriptional regulation of the SLRP family members. We therefore inspected all of the SLRP promoter sets for matches to IRF factors and Runx factors. Positionally conserved binding sites for the Runt domain TFs were detected in the proximal promoters of chondroadherin (CHAD) and osteomodulin (OMD). Two significant models (two or more transcription factor binding sites arranged in a defined order and orientation within a defined distance range) were derived from these initial promoter sets: the HOX-Runx (homeodomain-Runt domain) model and the ETS-FKHD-STAT (erythroblast transformation specific-forkhead-signal transducers and activators of transcription) model. These models were used to scan the genomic sequences of all 13 SLRP genes. The HOX-Runx model was found within the proximal promoter, exon 1, or intron 1 sequences of 11 of the 13 SLRP genes. The ETS-FKHD-STAT model was found in only 5 of these genes. Transient transfections of MG-63 cells and bovine corneal keratocytes with Runx2 isoforms confirmed the relevance of these TFs to expression of several SLRP genes. The distribution of the HOX-Runx and ETS-FKHD-STAT models within 200 kb of genomic sequence on human chromosome 9 and 500 kb of sequence on chromosome 12 also was analyzed. Two regions with 3 HOX-Runx matches within a 1,000 bp window were identified on human chromosome 9: one located between the OMD and osteoglycin (OGN)/mimecan genes, and the second located upstream of the putative extracellular matrix 2 (ECM2) promoter. The intergenic region between OMD and mimecan was shown to coincide with different patterns of DNAse I hypersensitivity sites in MG-63 and U937 cells. ChIP analysis revealed that this region binds Runx2 in U937 cells (mimecan transcript not detectable), but binds Pitx3 in MG-63 cells (expressing high levels of mimecan), thereby demonstrating its functional association with mimecan expression. Upon comparing the predictions of S/MARs in the relevant chromosomal context of human chromosomes 9 and 12 and their rodent equivalents, no convincing evidence was found that the tandemly arranged genes build a chromosomal loop.

Conclusions: Twelve of 13 known SLRP genes have at least one HOX-Runx module match in their promoter, exon 1, intron 1, or intergenic region. Although these genes are located in different clusters on different chromosomes, the common HOX-Runx module could be the basis for co-regulated expression.

Correspondence to: Elena S. Tasheva, Division of Biology, Ackert Hall, Kansas State University, Manhattan, KS, 66506-4901; Phone: (785) 532-6553; FAX: (785) 532-6653; email: [email protected]

The process of transcription is the key element in gene expression and, as such, an attractive control point for regulation of gene expression in cell and tissue specific manners. It is not surprising that considerable research has been conducted on elucidating the mechanisms by which genes are regulated [1-7]. Current views of transcriptional regulation incorporate the histone code hypothesis, which proposes that different combinations of histone modifications function as recognition signals for transcription factors and that promoters bear the histone code: H3 hyperacetylation and methylation of lysine 4 [1]. Various models demonstrate how the two types of complexes, the nucleosome remodeling complexes of the SWI/SNF (switch/sucrose non-fermentable) type, which use the energy of ATP hydrolysis to alter histone-DNA contacts, and the enzymatic complexes that modify histones by acetylation, methylation, phosphorylation, and ubiquitinylation, participate in chromatin remodeling [1-7]. Models also explain how regulatory motifs act at a distance and involve looping to bring regulatory elements in contact with distant promoters. Regulatory motifs are seen as binding sites for proteins that induce chemical modifications and structural alterations propagating down the fiber [5]. In addition to studying regulation of transcription through a variety of biological and biochemical approaches, recently there has been much interest in the possibility of using bioinformatics approaches to identify gene regulatory elements [8-10]. Large scale cross-species DNA sequence comparisons reveal regions of highly conserved noncoding sequences located upstream of transcription initiation sites (gene promoters), or in introns and intergenic regions (enhancers, silencers, scaffold/matrix attachment regions (S/MARs), and locus control regions) [11-15]. Functional analyses of these conserved regions show that they represent cis-regulatory elements that control expression of nearby genes. Data indicate that these regulatory regions have modular structure and that regulatory effects of a control region depend on the specific combination of elements, as well as the order and orientation in which they occur [8,10,16]. The ability of a given sequence-specific transcription factor to interact with both co-activators and co-repressors and/or to recruit histone-modifying proteins is thought to provide a simple means for generating on-off switches in a tissue specific manner during the cell cycle and in development [1-4,7]. In addition to the order and orientation of units of transcription regulatory regions, the abundance of transcriptional cofactors in a certain cell type also influences the ability of site specific factors to regulate gene expression [17]. Thus, a combination of bioinformatics and wet lab experiments has emerged as a successful method for detecting cis-regulatory elements in many genomic loci, including those for homeodomain containing transcription factors (HOX), immunoglobulin, β-globin, and IL4/IL13/IL5 cytokine gene clusters [18-21].

The small leucine rich proteoglycans (SLRPs) are a well-known family of secreted proteoglycans present in many connective tissues. Crucial roles of these macromolecules in matrix assembly and modulation of cellular growth have been demonstrated extensively in the literature [22-25]. In the eye, SLRPs have been shown to be equally important for development and maintenance of corneal transparency and for providing the structural link between the neural retina and retinal pigment epithelium [26,27]. Alterations in the balance of corneal SLRPs result in increased hydration and loss of corneal transparency. In vitro data demonstrate that these molecules regulate axon guidance and synapse formation during the development of nervous tissue and the vertebrate retina. Individual roles of several SLRPs have been demonstrated by production of gene knock-out animals. All single or double SLRP-null mice generated so far displayed abnormal collagen fibrillogenesis and developed a variety of diseases such as osteoporosis, osteoarthritis, muscular dystrophy, Ehlers-Danlos syndrome, and corneal pathology, i.e., diseases that appear to result from collagen defects [28-33]. Phenotypic changes in different SLRP-null mice are mild, indicating that these proteins can compensate for one another, as evidenced by an increased amount of lumican in fibromodulin-null mice [31]. Although the molecular bases for these compensatory mechanisms presently are unknown, it is likely that compensation occurs at the level of transcription. The notion of regulated expression of the SLRPs at the level of transcription is supported by their genomic organization. Thus, 10 of the 13 members of the SLRP gene family are organized in clusters on three chromosomes: opticin (OPTC), proline arginine rich end leucine rich repeat protein (PRELP), and fibromodulin (FMOD) on human chromosome 1q3 (their syntenic equivalents on mouse chromosome 1); asporin (ASPN), osteoadherin/osteomodulin (OMD), and OGN/mimecan on human chromosome 9q2 (mouse chromosome 13); dermatan sulfate proteoglycan 3/epiphycan (DSPG3), keratocan (KERA), lumican (LUM), and decorin (DCN) on human chromosome 12q (mouse chromosome 10, Figure 1). Such clustered chromosomal arrangements resemble the organization of other genomic loci that have been shown to be precisely regulated in time and space to ensure normal development. For example, mammals have 39 HOX genes that are clustered in four genomic loci, and their spatial and temporal transcriptional activation corresponds to the gene order along the cluster [12,20]. Similarly, the human β-globin locus consists of five erythroid specific genes that are expressed sequentially during development [34]. As with other gene clusters, elucidation of the mechanisms of transcriptional regulation of the SLRP gene clusters may provide information about important control points for regulating the expression of these genes in particular cell types or in response to specific signals. In addition, increased understanding of transcriptional regulation of these genes will enable development of therapies directed against the right molecular targets for the treatment of different eye pathologies such as those caused by surgical procedures, systemic diseases, and tumors.

In this report, we describe the results of our analysis of the SLRP gene clusters on human chromosomes 1, 9, and 12 that were obtained using a combination of bioinformatics, a review of relevant publications, and wet lab experiments. Because co-regulated promoters often utilize the same framework of TF binding sites, our findings suggest that the HOX-Runx models may be the basis for co-regulated expression of the SLRP genes.

Figure 1. A dendrogram and chromosomal locations of the SLRP genes. The dendrogram shows relationships between SLRP family members in the human and mouse genomes. Based on their chromosomal organization, overall protein structure, and the structure of cysteine rich clusters in their N terminus they are subdivided into four classes. ECM2 is not a SLRP, but has a leucine rich repeat (LRR) domain and is physically linked to a SLRP cluster on chromosome 9. This analysis was done using GenWorks version 2.5 (IntelliGenetics, Inc. Oxford).

METHODS

Bioinformatics resources and genome databases: For these studies, tools from the Genomatix software package (Genomatix Suite release 3.0) were used. These tools are shown in Table 1. More detailed information is available at Genomatix.

TABLE 1. DESCRIPTION OF THE SOFTWARE TOOLS USED IN THIS STUDY

Software                                                                            Description
----------------------------------------------------------------------------------  ------------------------------------------------------------
GEMS Launcher MatInspector: To search for transcription factor binding sites [43]   task: Search for TF binding sites described by weight matrices.
GEMS Launcher ModelInspector: To search for user-defined models [92,93]             task: Search for complex regulatory patterns.
GEMS Launcher FrameWorker: For definition of common framework                       task: Search for a common framework of two or more TF binding sites in a set of sequences.
Genomatix Suite: Gene2Promoter                                                      Retrieve and Analyze Promoters
Genomatix Suite: ElDorado                                                           Extended Genome Annotation
Genomatix BiblioSphere                                                              Search for Gene-Gene, Gene-TF co-citations in PubMed abstracts

The GEMS Launcher software package (release 3.1) was used in this study. GEMS Launcher is an integrated, task-oriented software package which is based on proprietary software tools developed by Genomatix. The set of software tools used to solve the individual tasks and some representative publications [43,92,93] of the basic algorithms are shown.

DNAse I hypersensitivity PCR assay: Previously published protocols were used with modifications [35-37]. Briefly, MG-63 (human osteosarcoma cell line) and U937 (human promyelocytic cell line) cells were harvested by centrifugation. After two washes with PBS, cells were resuspended in 0.5 ml of permeabilizing buffer (15 mM Tris [pH 7.5], 60 mM KCl, 15 mM NaCl, 5 mM MgCl2, 0.5 mM EGTA, 300 mM sucrose, and 0.5 mM 2-mercaptoethanol) supplemented with 0, 500, 2,000, or 10,000 U/ml of DNAse I (Roche Applied Sciences, Indianapolis, IN). An equal volume of permeabilizing buffer supplemented with 0.1% lysolecithin (Sigma-Aldrich, St Louis, MO) was added, and the reaction was incubated at room temperature for 4 min. Genomic DNA was isolated using standard procedures. PCR was performed in 50 µl reactions using 100 ng of genomic DNA, 100 ng of each primer, 0.5 mM dNTPs, and 1 unit of Taq polymerase (Promega Corp., Madison, WI). The cycle of denaturing at 94 °C for 30 s, annealing at 60 °C for 30 s, and extension at 72 °C for 2 min was repeated 30 times. The PCR products were resolved by agarose gel electrophoresis and visualized with ethidium bromide staining. Primers synthesized by Integrated DNA Technologies Inc. (Coralville, IA) were used in this study and are listed in Table 2.

TABLE 2. PRIMERS USED FOR DNASE I HYPERSENSITIVITY PCR ASSAY

Primer name  Sequence                                   Amplification product
-----------  -----------------------------------------  ------------------------------------------------------------
Hm+4145      5'-GAACTCCACAGAGGCCAAGGTGG-3'              246 bp of the intergenic region (ig1) on human chromosome 9
Hm-4391      5'-GGCAAGGTCTTGCTCTGTCACCCAAGCTGG-3'
Hm+4351      5'-ATCGTGCCACTGCACTCCAGCTTGGGTG-3'         274 bp of ig2
Hm-4625      5'-GGCAAAGCCCAGACCTCTCTTTAGG-3'
Hm+4707      5'-GAGGTAGAGGTGGCTATGCCAGG-3'              271 bp of ig3
Hm-4978      5'-GAGATGCGTAGGGGCAGTGTCTAGAAGG-3'
Hm+1314      5'-CACGGTACCCTAGTACAACACACTGGATTTC-3'      410 bp encompassing exon 1 of the gene encoding human mimecan
Hm-904       5'-AGCACCCTATTCTTGCCTCGCTGG-3'

Primer pairs used for DNAse I hypersensitivity PCR experiments and their corresponding amplification products are indicated. Numbers in the primer name indicate the position of the primer relative to the translation initiation site of the human mimecan gene.

Chromatin immunoprecipitation: ChIP analysis was carried out essentially as described, with minor modifications [38]. Briefly, MG-63 and U937 cells were collected, resuspended in media containing 2.5% formaldehyde, and incubated at room temperature for 15 min to crosslink protein-DNA complexes. Cells were washed twice with ice cold TBS buffer and the pelleted nuclei were resuspended in 1 ml of ChIP lysis buffer (150 mM NaCl, 25 mM Tris, pH 7.5, 1% Triton X-100, 0.1% SDS, 0.5% deoxycholate) containing protease inhibitor cocktail (Complete Mini, Roche Diagnostics, Mannheim, Germany). The samples were sonicated ten times for 20 s each, with 1 min of cooling on ice between sonications, and then pre-cleared by incubation with protein A-Sepharose beads for 2 h at 4 °C. The various primary antibodies were then added and samples were incubated overnight at 4 °C. Immunocomplexes were precipitated for 3 h with protein A-Sepharose beads and the precipitates were washed once with 1 ml of RIPA buffer (50 mM Tris, pH 8, 150 mM NaCl, 0.1% SDS, 0.5% deoxycholate, 1% Nonidet P-40, and 1 mM EDTA), once with 1 ml of High Salt buffer (50 mM Tris, pH 8, 500 mM NaCl, 0.1%

SDS, 0.5% deoxycholate, 1% Nonidet P-40, and 1 mM EDTA), once with 1 ml of LiCl buffer (50 mM Tris, pH 8, 1 mM EDTA, 250 mM LiCl, 1% Nonidet P-40, and 0.5% deoxycholate), and twice with 1 ml of TE (10 mM Tris [pH 8] and 1 mM EDTA). All washes were for 5 min, rotating, at 4 °C. The samples were treated with 200 µg/ml proteinase K for 3 h at 55 °C. Formaldehyde crosslinks were reversed by overnight incubation at 65 °C. The DNA was isolated by phenol-chloroform extraction and ethanol precipitation. The primers for PCR amplifications are those listed for the DNAse I hypersensitivity assay (Table 2). The following antibodies were used: anti-PEBP2αA (sc-12488), anti-Pitx3 (sc-19307), anti-upstream stimulatory factor-1 (USF-1, sc-229), anti-c-Myc (sc-42X), and anti-IRF-2 (sc-498). All antibodies were obtained from Santa Cruz Biotechnology, Inc., Santa Cruz, CA.

Plasmids, transient transfection of mammalian cells, and semi-quantitative RT-PCR: Mammalian expression plasmids for Runx2 (MRIPV and MASNSL isoforms) and control plasmid (pCMV5) were gifts from Dr. Jennifer J. Westendorf (University of Minnesota, Minneapolis, MN) [39]. The first Runx2 isoform, also known as PEBP2aA1, type I, and p56, is a 513 amino acid protein that initiates in exon 2 at the sequence MRIPV [40]. The second isoform, also known as til-1, type II, or p57, initiates in exon 1 at the sequence MASNSL and is 15 amino acids longer than the first isoform [41]. Primary bovine corneal keratocytes and MG-63 cells were transiently transfected using FuGENE 6 transfection reagent (Roche Applied Sciences, Indianapolis, IN) according to the standard protocol (9 µl reagent per 3 µg DNA). Total RNA was isolated using Totally RNA, Total RNA Isolation Kit (Ambion Inc., Austin, TX). RNA (2 µg) was reverse-transcribed using the anchor primer oligonucleotide (dT)18 and Superscript II Reverse Transcriptase (Life Technologies, Inc., Gaithersburg, MD). The single stranded cDNA products (2 µl) were used as templates in PCR amplification reactions as described [42]. Gene specific primers used for human and bovine SLRPs are listed in Table 3.

TABLE 3. GENE SPECIFIC PRIMERS USED FOR HUMAN AND BOVINE SLRPS

Gene            Accession number  Primer name  Primer sequence (5'-3')                          Amplification product (bp)
--------------  ----------------  -----------  -----------------------------------------------  --------------------------
Mimecan         NM_033014         Hmim+1       GGCTAATGCACAGACATGAACATCTATTGAGG                 506 & 230
                                  Hmim-506     GCGTGAGTCCTGCTGGGTTGGTGG
Decorin         NM_001920         Hdec+124     GGGCTGGACCGTTTCAACAGAGAGG                        586
                                  Hdec-710     GTATCAGCAATGCAAATGTAGGAG
Biglycan        NM_001711         Hbgn+964     GATCAGGATGATCGAGAACGGGAG                         280
                                  Hbng-1244    GGTCAGTGACGCAGCGGAAAGTGG
Mimecan         M37974            Bm+411       TTTAGGATCCATTAAAATGAAGACTCTGCAATCTACACTTCTCCTG   902
                                  Bm-1313      GATAGCTCGAGAATGTATGACCCTATAGG
Keratocan       U48360            Bkera+150    ATGGATCCATGGCATCCACAATCTGCTTCATCCTCTGGG          189
                                  Bkera-339    CAGTACAAAGCAGTAGGGAAACTGGGGG
Lumican         L11063            Blum+671     TGACTTGAGCTTCAATCAGATGACC                        503
                                  Blum-1174    CCATAAACTGCTGTTCCAGGCTACACC
Chondroadherin  NM_174019         Bchad+595    CTCAGTTCCCTGCAGCCCGGCGCTC                        272
                                  Bchad-867    CATGTTTCAGCGTGGTCACACCC

Primer pairs used for amplification of indicated human and bovine SLRPs and their corresponding amplicons are shown. The primer pair for human mimecan detects two differentially spliced mRNA isoforms. The first letter of the primer name indicates whether the primer is human (H) or bovine (B).

RESULTS

Promoter sets and identification of candidate transcription factors and frameworks: Promoter sequences for the 13 proteoglycan genes (Figure 1) were extracted from genome sequences for human (NCBI build 34), mouse (MGSCv3 R3), and rat (NCBI build 2) using Genomatix’ Gene2Promoter software. These orthologous promoter sets are listed in Table 4. The presence of more than one putative transcription start site (TSS) in some genes reflects the fact that several transcripts are mapped to the same locus. The rodent BGN promoters were omitted because there were stretches of Ns within the sequence, indicating sequencing or assembly ambiguities.

The average length of these promoters was adjusted to 600 bp, 500 bp upstream of the most 5' mapped TSS and 100 bp downstream of the most 3' mapped TSS in the set (Table 4). These promoter sets were used to search with MatInspector [43]. The search produced a large and complex output and the results showed that orthologous promoters from human, mouse, and rat are similarly organized. To obtain matches that were more likely to identify functional sites, we combined the TF binding site analysis with a literature analysis using Genomatix’ BiblioSphere. BiblioSphere allows analysis of gene/gene and gene/transcription factor relations from their co-citation in PubMed abstracts. The group of proteoglycan genes was used as input to BiblioSphere. This approach allowed us to reveal relations between the cluster of proteoglycan genes and other genes/transcription factors that were co-cited with all or some of the genes from the input cluster. The results of this type of analysis, termed Cluster Centered BiblioSphere, are shown in Figure 2. The co-citation analysis from Figure 2 shows that OMD, FMOD, BGN, and DCN are co-cited with Runx2, whereas DCN, LUM, and mimecan are co-cited with IRF1 (interferon regulatory factor 1). We therefore inspected the outputs from MatInspector searches on all of the orthologous promoter sets for matches to IRF factors and Runx factors. The most striking similarities were found for the Runx matches in the orthologous CHAD and OMD promoters. In these six promoters Runx matches are found

TABLE 4. PROXIMAL PROMOTER SETS EXTRACTED FROM GENE2PROMOTER

Gene   Symbol  LocID   Org  Chr  Contig ID  Str.  Contig positions   Length  Putative TSS
-----  ------  ------  ---  ---  ---------  ----  -----------------  ------  ------------
FMOD   FMOD    331     Hs   1    NT_004671  (-)   14675085-14675774  690     564
PRELP  PRELP   5549    Hs   1    NT_079622  (+)   33923-34593        671     548
OPTC   OPTC    26254   Hs   1    NT_079622  (+)   52276-52876        601     501
FMOD   Fmod    14264   Mm   1    NW_000154  (+)   16009060-16009744  685     501 585
PRELP  Prelp   116847  Mm   1    NW_000154  (-)   15891829-15892501  673     501 502 526
OPTC   Optc    269120  Mm   1    NW_000154  (-)   15877461-15878278  818     501 718
FMOD   Fmod    64507   Rn   13   NW_047395  (+)   2159562-2160244    683     576
PRELP  Prelp   84400   Rn   13   NW_047395  (-)   2046495-2047160    666     566
OPTC   na      304802  Rn   13   NW_047395  (-)   2028701-2029301    601     501
ASPN   ASPN    54829   Hs   9    NT_008476  (-)   2565740-2566476    737     501 637
OMD    OMD     4958    Hs   9    NT_008476  (-)   2507656-2508265    610     501
OGN    OGN     4969    Hs   9    NT_008476  (-)   2487939-2488564    626     518
ASPN   Aspn    66695   Mm   13   NW_000075  (+)   9878489-9879091    603     501 503
OMD    Omd     27047   Mm   13   NW_000075  (+)   9916743-9917350    608     508
OGN    Ogn     18295   Mm   13   NW_000075  (+)   9939392-9940017    626     501 526
ASPN   na      306805  Rn   17   NW_047490  (-)   2462636-2463236    601     501
OMD    Omd     83717   Rn   17   NW_047490  (-)   2426721-2427329    609
OGN    na      291015  Rn   17   NW_047490  (-)   2411032-2411643    612
DCN    DCN     1634    Hs   12   NT_19546   (-)   15058688-15059337  650     550
LUM    LUM     4060    Hs   12   NT_19546   (-)   14987288-14987985  698     501
KERA   KERA    11081   Hs   12   NT_19546   (-)   14933783-14934822  1040    501
DSPG3  DSPG3   1833    Hs   12   NT_19546   (-)   14880840-14881498  659     501
DCN    Dcn     13179   Mm   10   NW_000032  (+)   10899310-10899910  601
LUM    Lum     17022   Mm   10   NW_000032  (+)   10982193-10982889  697     537 544 564
KERA   Kera    16545   Mm   10   NW_000032  (+)   11023517-1102453   1020    894 920
DSPG3  Dspg3   13516   Mm   10   NW_000032  (+)   11064216-11064872  657     557
DCN    Dcn     29139   Rn   7    NW_047774  (+)   11454854-11455504  651     501
LUM    Lum     81682   Rn   7    NW_047774  (+)   11534201-11534896  696     596
KERA   na      314771  Rn   7    NW_047774  (+)   11572274-11573298  1025
DSPG3  na      314772  Rn   7    NW_047774  (+)   11615311-11615968  658
CHAD   CHAD    1101    Hs   17   NT_010783  (-)   7199501-7200101    601     501
CHAD   Chad    12643   Mm   11   NW_000040  (+)   6062353-6062953    601     501
CHAD   Chad    29195   Rn   10   NW_047337  (+)   8376-8976          601     501
BGN    BGN     633     Hs   X    NT_025965  (+)   111914-112531      618     501 503
NYX    NYX     60506   Hs   X    NT_079573  (+)   4155996-4156596    601     501
NYX    Nyx     236690  Mm   X    NW_042619  (+)   4069343-4069943    601     501
NYX    na      302516  Rn   X    NW_048034  (-)   8864677-8865277    601     501

Promoter sequences for the proteoglycan genes were extracted from genome sequences for human (Hs), mouse (Mm), and rat (Rn) using Genomatix’ Gene2Promoter. Proximal promoter sets, with extracted sequence, locus ID (LocID), organism (Org), chromosomal location (Chr), contig ID, and contig positions are listed. Some of the extracted rat sequences do not have an official symbol annotated by LocusLink (Symbol na). Abbreviations used for the genes follow LocusLink nomenclature: ASPN (asporin; small leucine rich protein 1C), BGN (biglycan), CHAD (chondroadherin), DCN (decorin), DSPG3 (dermatan sulfate proteoglycan 3; Pg-Lb; epiphycan), FMOD (fibromodulin), KERA (keratocan), LUM (lumican), NYX (nyctalopin), OGN (osteoglycin; osteoinductive factor; mimecan), OMD (osteomodulin; osteoadherin), OPTC (opticin; oculoglycan), PRELP (proline arginine rich end leucine rich repeat protein; SLRR2A).
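The promoter window used for these sets is defined as 500 bp upstream of the most 5' mapped TSS and 100 bp downstream of the most 3' mapped TSS. As a worked check of that arithmetic, the following minimal Python sketch computes such a window from TSS coordinates. The function name `promoter_window` is hypothetical and is not part of Gene2Promoter. The example TSS coordinates are derived from Table 4 under the assumption that the "Putative TSS" column gives 1-based positions within the extracted sequence (read from the 5' end of the gene's strand). Some rows reproduce under this assumption (mouse Fmod, human ASPN), whereas others do not (human FMOD, for instance), so the sketch illustrates the rule rather than the software's exact behavior.

```python
def promoter_window(tss_positions, strand, upstream=500, downstream=100):
    """Return (start, end, length) of a promoter window in contig coordinates.

    tss_positions: contig coordinates of all mapped TSSs for one gene.
    strand: "+" or "-"; on the minus strand, upstream means higher coordinates.
    """
    if not tss_positions:
        raise ValueError("at least one TSS is required")
    lo, hi = min(tss_positions), max(tss_positions)
    if strand == "+":
        start, end = lo - upstream, hi + downstream
    elif strand == "-":
        start, end = lo - downstream, hi + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return start, end, end - start + 1


# Mouse Fmod, (+) strand: sequence positions 501 and 585 of a window that
# starts at contig position 16009060 (assumed 1-based positions).
mouse_fmod = promoter_window([16009060 + 500, 16009060 + 584], "+")

# Human ASPN, (-) strand: sequence positions 501 and 637 counted down
# from contig position 2566476 (assumed 1-based positions).
human_aspn = promoter_window([2566476 - 500, 2566476 - 636], "-")

print(mouse_fmod)  # (16009060, 16009744, 685)
print(human_aspn)  # (2565740, 2566476, 737)
```

Both results match the contig positions and lengths listed for these two genes in Table 4. With a single TSS the window length is 601 bp, which is the length listed for many of the promoters.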

Figure 2. Cluster centered BiblioSphere for the SLRP genes and co-cited transcription factors. A: Connecting edges originating from transcription factor Runx2. B: Connecting edges originating from transcription factor IRF1. The relevant references for this Table are listed [45,50,89-91].

Figure 3. Two and three element models derived from CHAD and OMD initial promoter sets. A: The two element TSS proximal HOX-Runx model. Note that in OMD there are several HOX matches that can be combined with Runx. However, the most specific model is obtained using the most proximal Runx matches, which has an element distance of 52 to 54 bp in all promoters shown. B: The three element ETS-FKHD-STAT model. Only the distance between ETS and FKHD seems fairly conserved. The same is true for the orientation of FKHD and ETS matches, if the first ETS in mouse OMD is not considered.
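A two element model of this kind (two binding sites in a fixed order and orientation, separated by a distance within a fixed range) can be sketched in a few lines of Python. This is only an illustration of the concept, not the FrameWorker or ModelInspector algorithm. The `Site` class, the function `find_two_element_matches`, the family labels, and all positions and strands in the example are hypothetical; only the 52 to 54 bp distance range is taken from the legend above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    family: str  # binding site family label, e.g. "HOX" or "RUNX" (illustrative)
    pos: int     # anchor position of the site in bp
    strand: str  # "+" or "-"


def find_two_element_matches(sites, first, second, min_dist, max_dist):
    """Return (a, b) pairs where a matches `first`, b matches `second`,
    and b lies min_dist..max_dist bp downstream of a.

    `first` and `second` are (family, strand) tuples, so both the order
    and the orientation of the two elements are enforced.
    """
    matches = []
    for a in sites:
        if (a.family, a.strand) != first:
            continue
        for b in sites:
            if (b.family, b.strand) != second:
                continue
            if min_dist <= b.pos - a.pos <= max_dist:
                matches.append((a, b))
    return matches


# Hypothetical binding site predictions for one promoter.
sites = [
    Site("HOX", 400, "+"),
    Site("HOX", 430, "+"),
    Site("RUNX", 483, "+"),
    Site("RUNX", 300, "-"),
]
hits = find_two_element_matches(sites, ("HOX", "+"), ("RUNX", "+"), 52, 54)
print(hits)  # only the HOX site at 430 pairs with the RUNX site at 483 (53 bp)
```

The sketch scans one strand only; a real scan would also need to recognize the same arrangement on the reverse strand.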

close to the putative TSS. To find potential regulatory models containing either Runx or IRF, the set of CHAD and OMD promoters was subjected to a search for common frameworks of TF binding sites with FrameWorker. Of note, a framework is a description of two or more transcription factor binding sites (elements) arranged in a defined order and orientation within a defined distance range. The model is a computational description of a framework for the purpose of computer-assisted detection of frameworks in DNA sequences. FrameWorker determined 10 models of 2 elements and 3 models of 3 elements. Resulting models were evaluated as follows. The variation of the distance between TF binding sites within the model matches should be small. The initial parameters allowed models with a distance of 10-100 bp (a maximal variation of 90 bp). Models with a variation exceeding 30 bp were rejected. Further, the strand orientation of TF binding sites had to be conserved, and models with a conserved position relative to the TSS in all promoters were ranked higher than others. A further ranking is done according to the specificity score (p value) that FrameWorker calculates. The number of matches within 5,000 random promoters is determined and subsequently the probability of obtaining an equal or greater number of matching sequences in a randomly drawn sample of the same size is calculated. The scores are useful for ranking but their absolute value is less significant since it depends much on the size of the sequence set, which usually is small. The HOX-Runx model fulfils these criteria: the minimal distance of absolutely strand conserved elements is 50-52 bp, the positions of model matches are conserved near the TSS in all six promoters, and the FrameWorker specificity score is third ranked among the two element models (p value of 3.82 x 10^-5). Furthermore, the possible importance of Runx is indicated by co-citation with OMD. From the 3 element models, the erythroblast transformation specific-forkhead-signal transducers and activators of transcription (ETSF-FKHD-STAT) model was further investigated, since it was the only one with small distance variation and strand conservation for the first two elements, and fairly conserved positioning around -300 bp relative to the TSS. Scanning the human genome with these two models showed 0.31 occurrences per 10,000 bp for the HOX-Runx model and 0.08 occurrences per 10,000 bp for the ETSF-FKHD-STAT model. Both values are within the quality standard of the experimentally verified modules in Genomatix’ Promoter Module Library. The models as generated by FrameWorker are shown in Figure 3.

These models were used to scan the remaining proteoglycan promoters. The HOX-Runx model was found in three other human promoters, DSPG3, KERA, and NYX (Table 5). Since the expression of mimecan is governed by elements within exon 1 [44,45], the models were used to search for matches in the first exon and first intron of the input genes.

TABLE 5. SUMMARY OF HOX-RUNX AND ETS-FKHD-STAT MODEL MATCHES

Columns: Gene, Org, HOX-Runx model (Promoter, Exon 1, Intron 1), ETS-FKHD-STAT model (Promoter, Exon 1, Intron 1)
------
ASPN Hs + +
ASPN Mm + + +
ASPN Rn +
BGN Hs +
CHAD Hs + +
CHAD Mm + +
CHAD Rn + + +
DCN Hs +
DCN Mm +
DCN Rn +
DSPG3 Hs + +
DSPG3 Mm +
DSPG3 Rn + +
FMOD Hs +
FMOD Mm +
KERA Hs + +
KERA Rn
LUM Hs +
LUM Mm +
LUM Rn +
NYX Hs +
OMD Hs + + + +
OMD Mm + + + +
OMD Rn + +
PRELP Hs +

Model matches found in promoter, intron 1, or exon 1 sequences of the SLRP genes. In the Org (organism) column, Hs represents Homo sapiens, Mm represents Mus musculus, and Rn represents Rattus norvegicus.

Figure 4. Frequency of the HOX-Runx and ETS-FKHD-STAT model matches on human chromosome 9. Distribution of the HOX-Runx and ETS-FKHD-STAT matches on the analyzed 200 kb region of human chromosome 9. A sliding window of 1,000 bp with steps of 100 bp was used. Matches within the window were counted and a graphical plot of matches per 1,000 bp window versus sequence position is shown. Positions of genes within the chromosomal region were superimposed to easily identify intergenic regions.

Figure 5. DNAse I hypersensitivity and ChIP analyses of the region between OGN/mimecan and OMD genes on human chromosome 9. A: Schematic of the region between the mimecan and OMD genes on human chromosome 9 and positions of the HOX-Runx model matches. Binding sites for factors that have been characterized previously also are indicated. These are the E-box (binding site for USF1), ISRE (binding site for IRF1 and IRF2), and p53 (binding site for tumor suppressor protein p53) [45-48]. B: DNAse I hypersensitivity PCR analysis. Cells were treated with the indicated amount of DNAse I, or incubated with buffer lacking DNAse I (input), lysed, and genomic DNA was isolated and used as template for PCR with indicated primers (Table 2). C: ChIP analysis. Chromatin immunoprecipitation was performed using indicated cells and antibodies. PCR of input DNA shows equivalent starting material for the assay.
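The window count described in the Figure 4 legend (a 1,000 bp window moved in 100 bp steps, with model matches counted per window) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Genomatix implementation: the function name `sliding_window_counts` is hypothetical, each match is reduced to a single coordinate, windows are half-open intervals, and the match positions in the example are invented.

```python
def sliding_window_counts(match_positions, region_length, window=1000, step=100):
    """Count matches in each window [start, start + window) along a region.

    Returns a list of (window_start, match_count) tuples.
    """
    positions = sorted(match_positions)
    counts = []
    last_start = max(region_length - window, 0)
    for start in range(0, last_start + 1, step):
        end = start + window
        n = sum(1 for p in positions if start <= p < end)
        counts.append((start, n))
    return counts


# Hypothetical match positions (bp) within a 10,000 bp region.
matches = [1200, 1500, 1950, 7300]
counts = sliding_window_counts(matches, region_length=10000)

# Windows holding three or more matches, the criterion used to flag
# candidate regulatory regions.
dense = [(start, n) for start, n in counts if n >= 3]
print(dense)  # [(1000, 3), (1100, 3), (1200, 3)]
```

Because adjacent windows overlap by 900 bp, a single cluster of matches is reported by several consecutive windows; merging consecutive flagged windows would give one region per cluster.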

Figure 6. Semi-quantitative RT-PCR analysis of SLRP expression in MG-63 cells. MG-63 cells were transiently transfected with the indicated Runx2 expression vectors. Total RNA was isolated and semi-quantitative RT-PCR was performed to assess the effect of Runx2 isoforms on SLRP expression. A: Photographs of ethidium bromide stained agarose gels. The 230 bp differentially spliced transcript corresponding to OGN/mimecan, the 586 bp transcript corresponding to decorin, and the 280 bp transcript corresponding to biglycan were amplified together with QuantumRNA 18S internal standard in the same (multiplex) PCR reaction. To modulate amplification efficiency of 18S rRNA, 18S primers were mixed with 18S competimers in a 3:7 ratio. B: Plot of the mean values of relative quantities of RNA obtained from two experiments.

Figure 7. Semi-quantitative RT-PCR analysis of SLRP expression in bovine corneal keratocytes. Semi-quantitative RT-PCR was performed to assess the effect of Runx2 isoforms on SLRP expression in bovine corneal keratocytes. A: Photographs of ethidium bromide stained agarose gels. The 902 bp transcript corresponding to mimecan, the 189 bp transcript corresponding to keratocan, the 503 bp transcript corresponding to lumican, and the 272 bp transcript corresponding to chondroadherin were amplified together with QuantumRNA 18S internal standard in the same (multiplex) PCR reaction. B: Plot of the mean values of relative quantities of RNA obtained from two experiments.

A number of model matches were found in intron 1 or exon 1 sequences of ASPN, BGN, CHAD, DCN, DSPG3, KERA, LUM, OMD, and PRELP (Table 5). Notably, among these are BGN, DCN, and FMOD, which display no matches in their proximal promoter sequences, but have a co-citation support for Runx2 by BiblioSphere. In contrast to Runx, no models could be derived from the promoter sequences of those SLRP genes co-cited with IRF.

Taken together, the results above show that 11 of the 13 human SLRP genes have at least one HOX-Runx match in their promoter, exon 1, or intron 1 sequences. Although these genes are located in different clusters on different chromosomes, the common HOX-Runx model could be the basis for a co-regulated expression.

Chromosomal context: Control of an entire cluster of genes at the chromosomal level would require them to reside within one chromosomal loop structure that is accessible to the transcription machinery. Usually such a loop is about 200 kb or less. S/MARs often define the borders of chromosomal loops [46,47]. Thus, one prerequisite of an analysis of clusters is that their genes reside on the same sequence contig in relative proximity to each other. Because the SLRP genes on human chromosome 1 were found on different sequence contigs, this cluster was not analyzed at the chromosomal level. On chromosome 9 there is a cluster of ASPN, OMD, and OGN. Upstream of these three genes another gene, extracellular matrix protein 2 (ECM2), coding for an extracellular matrix protein, is found (Figure 1 and Figure 4). All these genes reside within approximately 180 kb, making them suitable for common control mechanisms at the chromosomal level. The same arrangement of genes is found on chromosome 13 of mouse and chromosome 17 of rat. The genes for human DCN, LUM, KERA, and DSPG3 are found within a region of about 280 kb on chromosome 12. Inspection of S/MAR annotations in ElDorado did not give hints for the presence of chromosomal loops, for the human gene clusters, or for their syntenic rodent equivalents.

Next, the sequences of the chromosomal regions were extracted and scanned for the presence of matches to the models previously defined within the proximal promoter regions. In many cases, TFs that are functional in the proximal promoter are also involved in enhancer/repressor function. The clustered co-occurrence of putative TF binding sites has been successfully exploited to determine enhancer elements in Drosophila [48,49]. Therefore, accumulations of model matches in intergenic regions may indicate similar regulatory elements. However, our models imply the additional constraints of a defined site orientation and distance range between sites. They are expected to occur less frequently than the accumulation of the respective single sites. We therefore allowed an enlarged window size of 1,000 bp in contrast to the 500 bp and 700 bp sliding windows used in [48] and [49], respectively. Figure 4 shows that there are two occurrences of three matches to the HOX-Runx model within a 1,000 bp window on human chromosome 9. The first is found in the intergenic region between OGN/mimecan and OMD, the second is upstream of the putative promoter of ECM2. Given the overall low frequency of occurrence of this model within the human genome sequence, the vicinity of three model matches could indicate a regulatory module for ECM2 and either a downstream regulatory module for OMD or an upstream regulatory module for OGN/mimecan. The region on chromosome 12 was analyzed in the same way. However, model matches were much more evenly distributed and no occurrence of three or more matches within a 1,000 bp window was observed.

DNAse I hypersensitivity site formation on human chromosome 9: The relevance of the intergenic region with HOX-Runx models to mimecan expression was analyzed by DNAse I hypersensitivity (DH) PCR and ChIP assays. MG-63 cells (shown to express high levels of mimecan) and U937 cells (mimecan transcripts are not detectable) were used in these studies [44]. The cells were subjected to in vivo DNAse I treatment via membrane permeabilization, and the isolated genomic DNA was used for PCR amplifications. Primer sets that amplify separately the three HOX-Runx elements were used (Figure 5A, ig1, ig2, and ig3). To ensure the validity of the assay, we also analyzed the human mimecan promoter (Figure 5A, e1). It was shown previously that USF1 is a transcriptional activator of bovine and human mimecan promoters [45], therefore a DH site should be detected in this region. In our DH assay, an increasing DNAse I concentration correlates with a decreasing PCR product, as the template (if accessible) is degraded by DNAse I treatment. As shown in Figure 5B, the ig1 region was sensitive in MG-63 cells after treatment with as little as 2,000 U/ml of DNAse I. Interestingly, in U937 cells, ig1 and ig3 regions also were found to be sensitive to DNAse I treatment, although a higher concentration of DNAse I (10,000 U/ml) was needed for their detection. The ig2 region was not sensitive to DNAse I in either cell type. In agreement with our expectations, the mimecan promoter region was DNAse I sensitive in MG-63 cells, but not in U937 cells (Figure 5B, e1).

ChIP analysis was used to determine transcription factors that occupy the three HOX-Runx models in the intergenic region (Figure 5C). Pitx3 was present at the ig1 region in MG-63 cells but not in U937 cells, whereas Runx2 was present at the ig2 region in U937 cells but not in MG-63 cells. These results suggest that Pitx3 may act as a positive regulator of mimecan transcription in MG-63 cells, whereas Runx2 has an opposite effect in U937 cells. Consistent with our previous reports, USF1 was present at the e1 region in MG-63 cells, but not in U937 cells. The presence of IRF2 at e1 in MG-63 cells (Figure 5C, e1) also is consistent with our previous data that demonstrate the involvement of IRF2 in transcriptional regulation of human mimecan [45,50]. The presence of Pitx3 at e1 in U937 cells (Figure 5C, e1) was surprising. Mutually exclusive binding of Pitx3 or USF1 seems a plausible explanation for these results. Of note, Pitx and Runx sites within the first exon of human mimecan are not a part of the HOX-Runx model because the distance between these two sites is different from the distance described for the model.

Taken together, these data indicate that the intergenic region between OGN/mimecan and OMD is associated with the regulated transcription of the human mimecan gene.
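The search for three or more model matches within a 1,000 bp window, described in the chromosomal context analysis above, can be sketched as a simple sliding-window pass over sorted match positions. This is only an illustration under stated assumptions: the study used Genomatix tools, whose implementation is not described here, and the function name, the input format (one coordinate per model match), the rule for merging overlapping windows, and the example coordinates are all hypothetical.

```python
def find_match_clusters(positions, window=1000, min_matches=3):
    """Return merged (start, end) regions holding at least min_matches model matches.

    positions: coordinates (bp) of individual model matches on one chromosome.
    A group qualifies when its first and last match are less than `window` bp apart.
    Overlapping qualifying groups are merged into a single region.
    """
    pos = sorted(positions)
    clusters = []
    left = 0
    for right in range(len(pos)):
        # Shrink the window from the left until it spans less than `window` bp.
        while pos[right] - pos[left] >= window:
            left += 1
        if right - left + 1 >= min_matches:
            start, end = pos[left], pos[right]
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous region: extend it instead of adding a new one.
                clusters[-1] = (clusters[-1][0], end)
            else:
                clusters.append((start, end))
    return clusters


# Hypothetical match coordinates (bp), NOT positions from the paper.
example_positions = [100, 400, 900, 5000, 5200, 9000, 9100, 9950, 20000]
print(find_match_clusters(example_positions))  # [(100, 900), (9000, 9950)]
```

With evenly spread matches, as reported for the chromosome 12 region, such a scan would return an empty list, whereas two tight groups of three matches, as reported for chromosome 9, would each be returned as one region.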

Runx2 transcription factors affect expression of several SLRP genes: To test the effect of Runx2 transcription factors on expression of SLRPs we performed transient transfection experiments using MG-63 cells and primary bovine corneal keratocytes. These cells were chosen because both express mimecan, thereby allowing a comparison between bone specific and cornea specific transcriptional regulation of this gene by Runx2 factors. In addition, MG-63 cells also express decorin and biglycan, whereas corneal keratocytes express lumican and low levels of chondroadherin, and currently are the only cell type known that expresses keratocan. We limited our analysis to testing only the two isoforms of Runx2 TFs for the following reasons. First, Runx2 was the factor shown to bind the intergenic region in U937 cells, suggesting a repressor function on mimecan expression (Figure 5C). Second, the results from BiblioSphere analysis show co-citation of this transcription factor with DCN, BGN, FMOD, and OMD in other cell types. Third, considering the fact that there are about 40 HOX genes in vertebrates and most of them give rise to at least two isoforms, testing all of the HOX and Runx transcription factors that potentially could bind to the HOX-Runx modules described above will require detailed analyses that are beyond the aim of this study.

The results from transient transfections of MG-63 cells are shown in Figure 6. Overexpression of Runx2 (p56 isoform) led to decreased levels of mimecan and BGN mRNAs, whereas overexpression of Runx2 (p57 isoform) led to further decrease in mimecan mRNA and increase in biglycan mRNA. Overexpression of both isoforms had no effect on the level of decorin mRNA. Similar experiments using bovine corneal keratocytes are shown in Figure 7. Overexpression of Runx2 isoforms led to a slight decrease in mimecan mRNA and an increase in KERA and CHAD mRNAs. The level of LUM mRNA remained unchanged. Taken together these results show that Runx2 TFs affect the expression of 5 of the 7 SLRPs analyzed in this study. Unchanged expression of DCN and LUM indicates that Runx2 is not a transcriptional regulator of decorin in MG-63 cells and lumican in bovine corneal keratocytes. Whether another member of the Runx family of TFs could change the expression of DCN and LUM in these cell types remains to be determined.

DISCUSSION

We have used a bioinformatics approach to compare promoters of the SLRPs between human, mouse, and rat. Compared to approaches used by others, such as those used to search for cis-regulatory modules involved in pattern formation in the Drosophila genome [49], the main difference in our approach is that first it generates a defined model from promoter sequences and then it tries to find multiple occurrences in related intergenic regions. This approach is more stringent than the searches for site clustering in intergenic regions [49]. Furthermore, it implies that the very same TF modules that are active in regulation at the level of the proximal promoter also are involved in regulation at the level of intergenic elements, including enhancers and locus control regions (LCRs). Such involvement has been demonstrated for the hematopoietic lineage specific transcription factor GATA-1 [51]. By searching for a defined model instead of searching for the occurrences of single TF binding sites we identified common regulatory frameworks in promoters of the SLRP genes. One such framework, HOX-Runx, was detected within the proximal promoter, exon 1, or intron 1 sequences of 11 of the 13 human proteoglycan genes. In addition, three HOX-Runx frameworks were found in the intergenic region between OGN/mimecan and OMD and shown by DH and ChIP assays to be associated with expression of mimecan, thereby increasing the list of SLRP genes that contain this model to 12 of 13. The only remaining SLRP that could not be shown to contain the framework is opticin. However, because the gene cluster on human chromosome 1 was not scanned for the model, as the clusters on chromosomes 9 and 12 were, the presence of an intergenic HOX-Runx framework similar to those on chromosome 9, that might affect the expression of opticin, cannot be excluded at this time.

Figure 8. General model for transcriptional co-regulation of the SLRPs. General model proposing how transcription of the SLRPs may be regulated by members of the TGF-β superfamily of growth factors and by the HOX and Runx families of transcription factors. Different SLRP proteins bind TGF-β and/or bone morphogenetic protein (BMP), sequester them in the ECM without inactivation, and serve as a depot available during growth or tissue remodeling. Upon release, TGF-β and BMP interact with their respective receptors, serine-threonine kinases that signal through the SMAD family of transcriptional regulators. Other signaling pathways, such as p38 mitogen activated protein kinase (MAPK), also are activated by TGF-β and BMP signaling [78]. SMADs modulate transcription of Runx and HOX TF, and these in turn modulate transcription of SLRP genes. Some of the SLRP promoters also contain predictions of SMAD binding sites (MatInspector output). Increased/decreased expression of SLRPs that are exported back into ECM provide the feedback mechanism seen in many biological systems.

The homeobox gene family of transcription factors was first identified in Drosophila, encoding proteins that play fundamental roles in coordinating development and morphogenesis [52]. Analysis of their structure shows that they contain a highly conserved DNA binding domain 60 residues long, termed the homeodomain or homeobox. Based on differences in the amino acid at position nine of the DNA recognition helix of these proteins, they are subdivided into large subgroups. The homeodomain proteins related to Drosophila bicoid have a lysine at this position and bind to the sequence TCTAATCCC, whereas HD-proteins related to Drosophila antennapedia and fushi-tarazu have glutamine at this position and bind to sequence TCAATTAA [53,54]. The DNA binding specificity of these proteins is further affected by interactions between different homeodomain proteins [55]. In addition, various Hox genes are known to produce alternative transcripts encoding different isoforms whose physiological relevance during development is not yet understood [56]. There are only eight HOX proteins in Drosophila but about 40 in vertebrates. Pitx genes, also referred to as the RIEG/PITX homeobox gene family, are of particular interest as potential regulators of the SLRP genes because of their important role in eye, tooth, pituitary, and umbilical region development [57-60]. Pitx3 is associated with anterior segment mesenchymal dysgenesis (ASMD), congenital cataracts, and development of dopaminergic neurons in the substantia nigra [61-64]. The finding that Pitx3 binds the intergenic region on human chromosome 9 in MG-63 cells, but not in U937 cells, indicates that this TF is associated with regulated expression of mimecan.

The first member of the Runx family of transcription factors, runt, was also identified in Drosophila as a regulatory gene, which functions in establishing segmentation patterns during embryogenesis, and also in sex determination and neurogenesis [65]. A second Drosophila Runx gene, lozenge, is a multifunctional transcription factor that is required for cell patterning in the eye and for hematopoiesis [66]. There are three Runx genes in mammals: Runx1 is required for the formation of hematopoietic stem cells and is a frequently mutated gene in human leukemia [67,68]. Runx2 is required for osteogenesis and is associated with cleidocranial dysplasia [69,70], and Runx3 controls neurogenesis in dorsal root ganglia and cell proliferation in the gastric epithelium, and is frequently deleted or silenced in human gastric cancer [71,72]. Runx proteins activate or repress gene expression by binding to a TGPuGGTPu DNA sequence [73]. It has been hypothesized that Runx factors are actually organizers that facilitate the assembly of transcriptional regulatory complexes on gene regulatory elements [74]. Indeed, Runx factors have been shown to interact with many transcription factors and also with histone deacetylase 6 [39,75]. Our finding that Runx2 binds the intergenic region on human chromosome 9 in U937 cells is consistent with these reports (Figure 5C). Runx2 could attract histone deacetylase(s) to this region, thereby repressing transcription of mimecan in this cell type. It is of interest to note that most of the biological activities of Runx proteins are similar to those of TGF-β superfamily cytokines [76].

Both the homeodomain and runt domain containing transcription factors seem to be reasonable candidates for participating in transcriptional regulation of the SLRP family members for several reasons: First, TGF-β signaling has been shown to affect the expression of both homeodomain containing and runt domain containing transcription factors [77,78]. Second, TGF-β growth factors and cytokines also have been shown to be principal regulators of SLRP expression and matrix remodeling during development, inflammation and diseases [79-82]. Thus, it seems likely that HOX and Runx TFs are mediators of TGF-β effects on SLRP expression. Supporting this notion are several reports demonstrating that both homeodomain containing and runt domain containing TFs also regulate the expression of other ECM molecules, including collagen type I and type V, and procollagen lysyl hydroxylase, i.e., similarly to TGF-β and SLRPs, HOX and Runx TFs also are involved in matrix remodeling [83-87]. Third, the results from bioinformatics analysis, transient transfection, ChIP, and DH assays presented here support the above hypothesis. Fourth, four SLRP members already have been co-cited with Runx2 (Figure 3). A hypothetical model to explain how HOX and Runt families of transcription factors may mediate the effects of TGF-β on expression of the SLRP genes is shown in Figure 8. The model is based on results presented in this study and also is supported by data from several publications [79-88]. It is consistent with current views on interactions between certain members of the SLRPs and cytokines of the TGF-β superfamily. It is applicable to different cell types and suggests that depending on the cell type (as well as developmental stage or the type of tissue injury) different combinations of HOX and Runx TFs may regulate the expression of SLRPs, thereby maintaining/remodeling the ECM accordingly. Finally, the model is easily testable, and will serve as a focus for our future studies.

ACKNOWLEDGEMENTS

This work was supported by NIH Grant EY13395 to GWC and EST.

REFERENCES

1. Strahl BD, Allis CD. The language of covalent histone modifications. Nature 2000; 403:41-5.
2. Vignali M, Hassan AH, Neely KE, Workman JL. ATP-dependent chromatin-remodeling complexes. Mol Cell Biol 2000; 20:1899-910.
3. Wolffe AP. Transcriptional regulation in the context of chromatin structure. Essays Biochem 2001; 37:45-57.
4. Hassan AH, Neely KE, Vignali M, Reese JC, Workman JL. Promoter targeting of chromatin-modifying complexes. Front Biosci 2001; 6:D1054-64.
5. Cook PR. Nongenic transcription, gene regulation and action at a distance. J Cell Sci 2003; 116:4483-91.
6. Smale ST, Kadonaga JT. The RNA polymerase II core promoter. Annu Rev Biochem 2003; 72:449-79.
7. Papadakis ED, Nicklin SA, Baker AH, White SJ. Promoters and control elements: designing expression cassettes for gene therapy. Curr Gene Ther 2004; 4:89-113.
8. Werner T, Fessele S, Maier H, Nelson PJ. Computer modeling of promoter organization as a tool to study transcriptional coregulation. FASEB J 2003; 17:1228-37.
9. Lenhard B, Sandelin A, Mendoza L, Engstrom P, Jareborg N, Wasserman WW. Identification of conserved regulatory elements by comparative genome analysis. J Biol 2003; 2:13.
10. Wasserman WW, Sandelin A. Applied bioinformatics for the identification of regulatory elements. Nat Rev Genet 2004; 5:276-87.
11. Frisch M, Frech K, Klingenhoff A, Cartharius K, Liebich I, Werner T. In silico prediction of scaffold/matrix attachment regions in large genomic sequences. Genome Res 2002; 12:349-54.
12. Kmita M, Tarchini B, Duboule D, Herault Y. Evolutionary conserved sequences are required for the insulation of the vertebrate Hoxd complex in neural cells. Development 2002; 129:5521-8.
13. Frazer KA, Tao H, Osoegawa K, de Jong PJ, Chen X, Doherty MF, Cox DR. Noncoding sequences conserved in a limited number of mammals in the SIM2 interval are frequently functional. Genome Res 2004; 14:367-72.
14. Wang Z, Fan H, Yang HH, Hu Y, Buetow KH, Lee MP. Comparative sequence analysis of imprinted genes between human and mouse to reveal imprinting signatures. Genomics 2004; 83:395-401.
15. Dickmeis T, Plessy C, Rastegar S, Aanstad P, Herwig R, Chalmel F, Fischer N, Strahle U. Expression profiling and comparative genomics identify a conserved regulatory region controlling midline expression in the zebrafish embryo. Genome Res 2004; 14:228-38.
16. Duret L, Bucher P. Searching for regulatory elements in human noncoding sequences. Curr Opin Struct Biol 1997; 7:399-406.
17. Fry CJ, Farnham PJ. Context-dependent transcriptional regulation. J Biol Chem 1999; 274:29583-6.
18. Cocea L, Dahan A, Ferradini L, Reynaud CA, Weill JC. Negative regulation of Ig gene rearrangement by a 150-bp transcriptional silencer. Eur J Immunol 1998; 28:2809-16.
19. Li Q, Peterson KR, Fang X, Stamatoyannopoulos G. Locus control regions. Blood 2002; 100:3077-86.
20. Manzanares M, Wada H, Itasaki N, Trainor PA, Krumlauf R, Holland PW. Conservation and elaboration of Hox gene regulation during evolution of the vertebrate head. Nature 2000; 408:854-7.
21. Loots GG, Locksley RM, Blankespoor CM, Wang ZE, Miller W, Rubin EM, Frazer KA. Identification of a coordinate regulator of interleukins 4, 13, and 5 by cross-species sequence comparisons. Science 2000; 288:136-40.
22. Hocking AM, Shinomura T, McQuillan DJ. Leucine-rich repeat glycoproteins of the extracellular matrix. Matrix Biol 1998; 17:1-19.
23. Iozzo RV. The biology of the small leucine-rich proteoglycans. Functional network of interactive proteins. J Biol Chem 1999; 274:18843-6.
24. Kresse H, Schonherr E. Proteoglycans of the extracellular matrix and growth control. J Cell Physiol 2001; 189:266-74.
25. Ameye L, Young MF. Mice deficient in small leucine-rich proteoglycans: novel in vivo models for osteoporosis, osteoarthritis, Ehlers-Danlos syndrome, muscular dystrophy, and corneal diseases. Glycobiology 2002; 12:107R-16R.
26. Plaas AH, West LA, Thonar EJ, Karcioglu ZA, Smith CJ, Klintworth GK, Hascall VC. Altered fine structures of corneal and skeletal keratan sulfate and chondroitin/dermatan sulfate in macular corneal dystrophy. J Biol Chem 2001; 276:39788-96.
27. Tanihara H, Inatani M, Koga T, Yano T, Kimura A. Proteoglycans in the eye. Cornea 2002; 21:S62-9.
28. Danielson KG, Baribault H, Holmes DF, Graham H, Kadler KE, Iozzo RV. Targeted disruption of decorin leads to abnormal collagen fibril morphology and skin fragility. J Cell Biol 1997; 136:729-43.
29. Chakravarti S, Magnuson T, Lass JH, Jepsen KJ, LaMantia C, Carroll H. Lumican regulates collagen fibril assembly: skin fragility and corneal opacity in the absence of lumican. J Cell Biol 1998; 141:1277-86.
30. Xu T, Bianco P, Fisher LW, Longenecker G, Smith E, Goldstein S, Bonadio J, Boskey A, Heegaard AM, Sommer B, Satomura K, Dominguez P, Zhao C, Kulkarni AB, Robey PG, Young MF. Targeted disruption of the biglycan gene leads to an osteoporosis-like phenotype in mice. Nat Genet 1998; 20:78-82.
31. Svensson L, Aszodi A, Reinholt FP, Fassler R, Heinegard D, Oldberg A. Fibromodulin-null mice have abnormal collagen fibrils, tissue organization, and altered lumican deposition in tendon. J Biol Chem 1999; 274:9636-47.
32. Tasheva ES, Koester A, Paulsen AQ, Garrett AS, Boyle DL, Davidson HJ, Song M, Fox N, Conrad GW. Mimecan/osteoglycin-deficient mice have collagen fibril abnormalities. Mol Vis 2002; 8:407-15.
33. Liu CY, Birk DE, Hassell JR, Kane B, Kao WW. Keratocan-deficient mice display alterations in corneal structure. J Biol Chem 2003; 278:21672-7.
34. Stamatoyannopoulos G, Grosveld F. Hemoglobin switching. In: Stamatoyannopoulos G, Majerus PW, Perlmutter RM, Varmus H, editors. The molecular basis of blood Diseases. 3rd ed. Philadelphia: WB Saunders; 2001. p.135-182.
35. Ymer S, Jans DA. In vivo chromatin structure of the murine interleukin-5 gene region: a new intact cell system. Biotechniques 1996; 20:834-8,840.
36. Yoo J, Herman LE, Li C, Krantz SB, Tuan D. Dynamic changes in the locus control region of erythroid progenitor cells demonstrated by polymerase chain reaction. Blood 1996; 87:2558-67.
37. Zabel MD, Byrne BL, Weis JJ, Weis JH. Cell-specific expression of the murine CD21 gene depends on accessibility of promoter and intronic elements. J Immunol 2000; 165:4437-45.
38. Im H, Grass JA, Johnson KD, Boyer ME, Wu J, Bresnick EH. Measurement of protein-DNA interactions in vivo by chromatin immunoprecipitation. Methods Mol Biol 2004; 284:129-46.
39. Kahler RA, Westendorf JJ. Lymphoid enhancer factor-1 and beta-catenin inhibit Runx2-dependent transcriptional activation of the osteocalcin promoter. J Biol Chem 2003; 278:11937-44.
40. Ogawa E, Maruyama M, Kagoshima H, Inuzuka M, Lu J, Satake M, Shigesada K, Ito Y. PEBP2/PEA2 represents a family of transcription factors homologous to the products of the Drosophila runt gene and the human AML1 gene. Proc Natl Acad Sci U S A 1993; 90:6859-63.
41. Xiao ZS, Thomas R, Hinson TK, Quarles LD. Genomic structure and isoform expression of the mouse, rat and human Cbfa1/Osf2 transcription factor. Gene 1998; 214:187-97.
42. Tasheva ES, Ke A, Deng Y, Jun C, Takemoto LJ, Koester A, Conrad GW. Differentially expressed genes in the lens of mimecan-null mice. Mol Vis 2004; 10:403-16.
43. Quandt K, Frech K, Karas H, Wingender E, Werner T. MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data. Nucleic Acids Res 1995; 23:4878-84.
44. Tasheva ES, Maki CG, Conrad AH, Conrad GW. Transcriptional activation of bovine mimecan by p53 through an intronic DNA-binding site. Biochim Biophys Acta 2001; 1517:333-8.
45. Tasheva ES. Analysis of the promoter region of human mimecan gene. Biochim Biophys Acta 2002; 1575:123-9.
46. Dijkwel PA, Hamlin JL. Matrix attachment regions are positioned near replication initiation sites, genes, and an interamplicon junction in the amplified dihydrofolate reductase domain of Chinese hamster ovary cells. Mol Cell Biol 1988; 8:5398-409.
47. Loc PV, Stratling WH. The matrix attachment regions of the chicken lysozyme gene co-map with the boundaries of the chromatin domain. EMBO J 1988; 7:655-64.
48. Halfon MS, Grad Y, Church GM, Michelson AM. Computation-based discovery of related transcriptional regulatory modules and motifs using an experimentally validated combinatorial model. Genome Res 2002; 12:1019-28.
49. Berman BP, Nibu Y, Pfeiffer BD, Tomancak P, Celniker SE, Levine M, Rubin GM, Eisen MB. Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome. Proc Natl Acad Sci U S A 2002; 99:757-62.
50. Tasheva ES, Conrad GW. Interferon-gamma regulation of the human mimecan promoter. Mol Vis 2003; 9:277-87.
51. Horak CE, Mahajan MC, Luscombe NM, Gerstein M, Weissman SM, Snyder M. GATA-1 binding sites mapped in the beta-globin locus by using mammalian chIp-chip analysis. Proc Natl Acad Sci U S A 2002; 99:2924-9.
52. Lewis EB. A gene complex controlling segmentation in Drosophila. Nature 1978; 276:565-70.
53. Harrison SC, Aggarwal AK. DNA recognition by proteins with the helix-turn-helix motif. Annu Rev Biochem 1990; 59:933-69.
54. Treisman J, Gonczy P, Vashishtha M, Harris E, Desplan C. A single amino acid can determine the DNA binding specificity of homeodomain proteins. Cell 1989; 59:553-62.
55. Gehring WJ, Qian YQ, Billeter M, Furukubo-Tokunaga K, Schier AF, Resendez-Perez D, Affolter M, Otting G, Wuthrich K. Homeodomain-DNA recognition. Cell 1994; 78:211-23.
56. Dintilhac A, Bihan R, Guerrier D, Deschamps S, Pellerin I. A conserved non-homeodomain Hoxa9 isoform interacting with CBP is co-expressed with the ‘typical’ Hoxa9 protein during embryogenesis. Gene Expr Patterns 2004; 4:215-22.
57. Andersen B, Rosenfeld MG. Pit-1 determines cell types during development of the anterior pituitary gland. A model for transcriptional regulation of cell phenotypes in mammalian organogenesis. J Biol Chem 1994; 269:29335-8.
58. Szeto DP, Rodriguez-Esteban C, Ryan AK, O’Connell SM, Liu F, Kioussi C, Gleiberman AS, Izpisua-Belmonte JC, Rosenfeld MG. Role of the Bicoid-related homeodomain factor Pitx1 in specifying hindlimb morphogenesis and pituitary development. Genes Dev 1999; 13:484-94.
59. Semina EV, Reiter R, Leysens NJ, Alward WL, Small KW, Datson NA, Siegel-Bartelt J, Bierke-Nelson D, Bitoun P, Zabel BU, Carey JC, Murray JC. Cloning and characterization of a novel bicoid-related homeobox transcription factor gene, RIEG, involved in Rieger syndrome. Nat Genet 1996; 14:392-9.
60. Ryan AK, Blumberg B, Rodriguez-Esteban C, Yonei-Tamura S, Tamura K, Tsukui T, de la Pena J, Sabbagh W, Greenwald J, Choe S, Norris DP, Robertson EJ, Evans RM, Rosenfeld MG, Izpisua Belmonte JC. Pitx2 determines left-right asymmetry of internal organs in vertebrates. Nature 1998; 394:545-51.
61. Semina EV, Ferrell RE, Mintz-Hittner HA, Bitoun P, Alward WL, Reiter RS, Funkhauser C, Daack-Hirsch S, Murray JC. A novel homeobox gene PITX3 is mutated in families with autosomal-dominant cataracts and ASMD. Nat Genet 1998; 19:167-70.
62. Semina EV, Murray JC, Reiter R, Hrstka RF, Graw J. Deletion in the promoter region and altered expression of Pitx3 homeobox gene in aphakia mice. Hum Mol Genet 2000; 9:1575-85.
63. Rieger DK, Reichenberger E, McLean W, Sidow A, Olsen BR. A double-deletion mutation in the Pitx3 gene causes arrested lens development in aphakia mice. Genomics 2001; 72:61-72.
64. Hwang DY, Ardayfio P, Kang UJ, Semina EV, Kim KS. Selective loss of dopaminergic neurons in the substantia nigra of Pitx3-deficient aphakia mice. Brain Res Mol Brain Res 2003; 114:123-31.
65. Coffman JA. Runx transcription factors and the developmental balance between cell proliferation and differentiation. Cell Biol Int 2003; 27:315-24.
66. Yan H, Canon J, Banerjee U. A transcriptional chain linking eye specification to terminal determination of cone cells in the Drosophila eye. Dev Biol 2003; 263:323-9.
67. North T, Gu TL, Stacy T, Wang Q, Howard L, Binder M, Marin-Padilla M, Speck NA. Cbfa2 is required for the formation of intra-aortic hematopoietic clusters. Development 1999; 126:2563-75.
68. Look AT. Oncogenic transcription factors in the human acute leukemias. Science 1997; 278:1059-64.
69. Ducy P, Zhang R, Geoffroy V, Ridall AL, Karsenty G. Osf2/Cbfa1: a transcriptional activator of osteoblast differentiation. Cell 1997; 89:747-54.
70. Lee B, Thirunavukkarasu K, Zhou L, Pastore L, Baldini A, Hecht J, Geoffroy V, Ducy P, Karsenty G. Missense mutations abolishing DNA binding of the osteoblast-specific transcription factor OSF2/CBFA1 in cleidocranial dysplasia. Nat Genet 1997; 16:307-10.
71. Inoue K, Ozaki S, Ito K, Iseda T, Kawaguchi S, Ogawa M, Bae SC, Yamashita N, Itohara S, Kudo N, Ito Y. Runx3 is essential for the target-specific axon pathfinding of trkc-expressing dorsal root ganglion neurons. Blood Cells Mol Dis 2003; 30:157-60.
72. Li QL, Ito K, Sakakura C, Fukamachi H, Inoue K, Chi XZ, Lee KY, Nomura S, Lee CW, Han SB, Kim HM, Kim WJ, Yamamoto H, Yamashita N, Yano T, Ikeda T, Itohara S, Inazawa J, Abe T, Hagiwara A, Yamagishi H, Ooe A, Kaneda A, Sugimura T, Ushijima T, Bae SC, Ito Y. Causal relationship between the loss of RUNX3 expression and gastric cancer. Cell 2002; 109:113-24.
73. Meyers S, Downing JR, Hiebert SW. Identification of AML-1 and the (8;21) translocation protein (AML-1/ETO) as sequence-specific DNA-binding proteins: the runt homology domain is required for DNA binding and protein-protein interactions. Mol Cell Biol 1993; 13:6336-45.
74. Wang SW, Speck NA. Purification of core-binding factor, a protein that binds the conserved core site in murine leukemia virus enhancers. Mol Cell Biol 1992; 12:89-102.
75. Westendorf JJ, Zaidi SK, Cascino JE, Kahler R, van Wijnen AJ, Lian JB, Yoshida M, Stein GS, Li X. Runx2 (Cbfa1, AML-3) interacts with histone deacetylase 6 and represses the p21(CIP1/WAF1) promoter. Mol Cell Biol 2002; 22:7982-92.
76. Miyazono K, Maeda S, Imamura T. Coordinate regulation of cell growth and differentiation by TGF-beta superfamily and Runx proteins. Oncogene 2004; 23:4232-7.
77. Grienenberger A, Merabet S, Manak J, Iltis I, Fabre A, Berenger H, Scott MP, Pradel J, Graba Y. Tgfbeta signaling acts on a Hox response element to confer specificity and diversity to Hox protein function. Development 2003; 130:5445-55.
78. Lee KS, Hong SH, Bae SC. Both the Smad and p38 MAPK pathways play a crucial role in Runx2 expression following induction by transforming growth factor-beta and bone morphogenetic protein. Oncogene 2002; 21:7156-63.
79. Kinsella MG, Bressler SL, Wight TN. The regulated synthesis of versican, decorin, and biglycan: extracellular matrix proteoglycans that influence cellular phenotype. Crit Rev Eukaryot Gene Expr 2004; 14:203-34.
80. Long CJ, Roth MR, Tasheva ES, Funderburgh M, Smit R, Conrad GW, Funderburgh JL. Fibroblast growth factor-2 promotes keratan sulfate proteoglycan expression by keratocytes in vitro. J Biol Chem 2000; 275:13918-23.
81. Demoor-Fossard M, Galera P, Santra M, Iozzo RV, Pujol JP, Redini F. A composite element binding the vitamin D receptor and the retinoic X receptor alpha mediates the transforming growth factor-beta inhibition of decorin gene expression in articular chondrocytes. J Biol Chem 2001; 276:36983-92.
82. Tiede K, Stoter K, Petrik C, Chen WB, Ungefroren H, Kruse ML, Stoll M, Unger T, Fischer JW. Angiotensin II AT(1)-receptor induces biglycan in neonatal cardiac fibroblasts via autocrine release of TGFbeta in vitro. Cardiovasc Res 2003; 60:538-46.
83. Penkov D, Tanaka S, Di Rocco G, Berthelsen J, Blasi F, Ramirez F. Cooperative interactions between PBX, PREP, and HOX proteins modulate the activity of the alpha 2(V) collagen (COL5A2) promoter. J Biol Chem 2000; 275:16681-9.
84. Kern B, Shen J, Starbuck M, Karsenty G. Cbfa1 contributes to the osteoblast-specific expression of type I collagen genes. J Biol Chem 2001; 276:7101-7.
85. Hjalt TA, Amendt BA, Murray JC. PITX2 regulates procollagen lysyl hydroxylase (PLOD) gene expression: implications for the pathology of Rieger syndrome. J Cell Biol 2001; 152:545-52.
86. Chen XD, Fisher LW, Robey PG, Young MF. The small leucine-rich proteoglycan biglycan modulates BMP-4-induced osteoblast differentiation. FASEB J 2004; 18:948-58.
87. Ghannam G, Takeda A, Camarata T, Moore MA, Viale A, Yaseen NR. The oncogene Nup98-HOXA9 induces gene transcription in myeloid cells. J Biol Chem 2004; 279:866-75.
88. Ge G, Seo NS, Liang X, Hopkins DR, Hook M, Greenspan DS. Bone Morphogenetic Protein-1/Tolloid-related Metalloproteinases Process Osteoglycin and Enhance Its Ability to Regulate Collagen Fibrillogenesis. J Biol Chem 2004; 279:41626-33.
89. Balint E, Lapointe D, Drissi H, van der Meijden C, Young DW, van Wijnen AJ, Stein JL, Stein GS, Lian JB. Phenotype discovery by gene expression profiling: mapping of biological processes linked to BMP-2-mediated osteoblast differentiation. J Cell Biochem 2003; 89:401-26.
90. Lee TH, Klampfer L, Shows TB, Vilcek J. Transcriptional regulation of TSG6, a tumor necrosis factor- and interleukin-1-inducible primary response gene coding for a secreted hyaluronan-binding protein. J Biol Chem 1993; 268:6154-60.
91. Silverstein DM, Travis BR, Thornhill BA, Schurr JS, Kolls JK, Leung JC, Chevalier RL. Altered expression of immune modulator and structural genes in neonatal unilateral ureteral obstruction. Kidney Int 2003; 64:25-35.
92. Frech K, Danescu-Mayer J, Werner T. A novel method to develop highly specific models for regulatory units detects a new LTR in GenBank which contains a functional promoter. J Mol Biol 1997; 270:674-87.
93. Frech K, Quandt K, Werner T. Muscle actin genes: a first step towards computational classification of tissue specific promoters. In Silico Biol 1998; 1:29-38.

The print version of this article was created on 7 Oct 2004. This reflects all typographical corrections and errata to the article through that date. Details of any changes may be found in the online version of the article.