Exon Structure of the Nuclear Factor I DNA-Binding Domain from C. elegans to Mammals
Mammalian Genome 10, 390–396 (1999). Incorporating Mouse Genome © Springer-Verlag New York Inc. 1999

Exon structure of the Nuclear Factor I DNA-binding domain from C. elegans to mammals

Colin F. Fletcher,1 Nancy A. Jenkins,1 Neal G. Copeland,1 Ali Z. Chaudhry,2 Richard M. Gronostajski2

1 Mammalian Genetics Laboratory, ABL-Basic Research Program, NCI-Frederick Cancer Research and Development Center, Frederick, Maryland 21702, USA
2 The Lerner Research Institute, Cleveland Clinic Foundation, 9500 Euclid Avenue, Dept. of Cancer Biology NB40, Cleveland, Ohio 44195 and Case Western Reserve University, Dept. of Biochemistry, Cleveland, Ohio, USA

Received: 2 November 1998 / Accepted: 9 December 1998

Abstract. The Nuclear Factor I (NFI) family of DNA-binding proteins is essential for adenovirus DNA replication and the transcription of many cellular genes. Mammals have four genes encoding NFI proteins, C. elegans has only a single NFI gene, and prokaryotes have none. To assess the relationship between members of this unusually small family of transcription/replication factors, we mapped the chromosomal locations of the four murine NFI genes and analyzed the exons encoding the DNA-binding domains of the mouse, Amphioxus, and C. elegans NFI genes. The four murine NFI genes are on Chrs 4 (Nfia and Nfib), 8 (Nfix), and 10 (Nfic), suggesting early duplication of the genes and dispersal throughout the genome. The DNA-binding domains of all four NFI genes are encoded by large (532 bp) exons with identical splice acceptor and donor sites in each. In contrast, the C. elegans nfi-1 gene has four phased introns interrupting this DNA-binding, domain-encoding exon, and the last exon extends 213 bp past the splice site used in all four murine genes. In addition, the introns present in C. elegans nfi-1 are missing from the NFI genes of Amphioxus and all mammalian genomes examined. This analysis of the exon structure of the C. elegans and murine NFI genes indicates that the murine genes were probably generated by duplication of a C. elegans-like ancestral gene, but that significant changes have occurred in the genomic organization of either the C. elegans or murine NFI genes during evolution.

The nucleotide sequence data reported in this paper have been submitted to GenBank and have been assigned the accession numbers: AF059517 (xNFI-1), AF059518 (xNFI-2), AF059519 (xNFI-3), AF059520 (xNFI-4), AF059521 (amphi-NFI), AF111263 (Nfia), AF111264 (Nfib), AF111265 (Nfic), AF111266 (Nfix).

Correspondence to: R.M. Gronostajski, [email protected]

Introduction

The Nuclear Factor I (NFI) gene family encodes a set of highly conserved site-specific DNA-binding proteins required for both adenovirus DNA replication (Nagata et al. 1983; Bosher et al. 1991; Armentero et al. 1994; Coenjaerts and van der Vliet 1994) and the transcription of a variety of cellular and viral genes (Goldberg et al. 1992; Cardinaux et al. 1994; Furlong et al. 1996; Krebs et al. 1996; Spitz et al. 1997). cDNAs encoding NFI proteins have been cloned from a number of species and define a set of four paralogous genes (NFI-A, NFI-B, NFI-C, and NFI-X) that are conserved throughout vertebrate evolution (Rupp et al. 1990; Kruse et al. 1991). Transcripts from NFI genes have been cloned from human (Santoro et al. 1988; Apt et al. 1994; Qian et al. 1995; Kulkarni and Gronostajski 1996), mouse (Inoue et al. 1990; Nebl and Cato 1995; Chaudhry et al. 1997), rat (Paonessa et al. 1988; Osada et al. 1996), hamster (Gil et al. 1988), pig (Meisterernst et al. 1989), chicken (Rupp et al. 1990; Kruse et al. 1991) and Xenopus (Roulet et al. 1995; Puzianowska-Kuznicka and Shi 1996) and have been shown to be alternatively spliced, yielding as many as seven proteins from each gene (Kruse and Sippel 1994; Wenzelides et al. 1996). While the number of distinct transcripts from each gene is large, the same four NFI genes are present in all vertebrate species examined to date and, unlike the Hox gene family (Carroll 1995), no extra-familial homologs have been detected. The apparent genetic simplicity of this gene family makes it a useful system to investigate gene-family expansion during metazoan evolution.

NFI proteins function by binding to duplex DNA containing the consensus sequence TTGGC/AN5G/TCCAA (Gronostajski et al. 1985; Gronostajski 1986, 1987; Meisterernst et al. 1988) and modulating DNA replication and transcription. The DNA-binding activity of NFI proteins resides in a highly homologous N-terminal DNA-binding domain (Mermod et al. 1989; Gounari et al. 1990; Novak et al. 1992), while the C-terminal regions of the proteins are more divergent. Multiple C-terminal domains of each NFI gene product are generated by alternative splicing (Kruse and Sippel 1994). The N-terminal DNA-binding domains of all NFI proteins appear to bind to the same DNA consensus sequence with apparently equal affinities (Meisterernst et al. 1988; Goyal et al. 1990) and are sufficient to stimulate adenovirus DNA replication in vitro (Mermod et al. 1989; Gounari et al. 1990). In contrast, activation of transcription requires both the DNA-binding domain and C-terminal transactivation domains (Mermod et al. 1989). The finding that NFI genes are differentially expressed during mouse development and cellular differentiation suggests that NFI proteins may play an important role in gene expression during development (Kulkarni and Gronostajski 1996; Chaudhry et al. 1997). Since NFI proteins differ in their ability to transactivate NFI-responsive promoters, the precise mechanisms of stimulation of replication and transcription may differ among the NFI proteins (Apt et al. 1994; Krebs et al. 1996).

Although four NFI genes are present in mammals, only a single NFI gene has been detected in the nematode C. elegans, and the exon structure of the NFI gene in C. elegans differs considerably from those of some previously characterized mammalian NFI genes. Also, no NFI-like genes have been detected in any viral, fungal, bacterial, or archebacterial genomes sequenced to date. These data suggest either that the NFI gene family arose during metazoan evolution or was lost independently in a number of lineages. To examine the relationship between members of the NFI gene family, we have mapped the locations of the NFI genes in the mouse genome, characterized the exons encoding most of the DNA-binding domain of the mouse genes, and compared the DNA-binding domain-encoding exons of the C. elegans NFI gene (nfi-1) with the mammalian and Amphioxus exons. Our studies indicate that while the mammalian and C. elegans genes likely share a common ancestral gene, significant changes have occurred in exon organization during evolution of the NFI gene family.

Materials and methods

Degenerate PCR and cloning of genomic and cDNA fragments. Degenerate PCR to detect NFI genes was performed with either of two sets of primers: Deg 1 (5′TTCCGGATGARTTYCAYCITTYATYGARGC3′) and Deg 2 (5′AATCGATRTGRTGBGGCTGIAYRCAIAG3′) are primers used previously to clone NFI cDNAs from chicken (Rupp et al. 1990) and human (Kulkarni and Gronostajski 1996) genomic DNA and do not amplify the C. elegans nfi-1 gene (data not shown), while Deg 3 (5′TAYAMITGGTTYMAYCTICARG3′) and Deg 4 (5′CKYTCICCRTCIGTRCTYTCS3′) were designed from the regions most highly conserved between C. elegans nfi-1 and vertebrate NFI genes and amplify NFI genes from both C. elegans and vertebrates. Amplification was performed in 20- to 50-µl reactions containing 10–300 ng of genomic DNA, 3 µM degenerate primers, 200 µM dNTPs, 50 mM Tris-HCl, pH 8.2, 1.5 mM MgCl2, 50 mM KCl, and 20 µg/ml BSA. Cycle times were 94° – 5′, 35 × (94° – 1′, 50° – 2′, 72° – 3′), 72° – 7′ for reactions used to clone PCR products. PCR products were repaired with T4 polymerase, cloned and sequenced. Amphioxus genomic DNA was a gift from Peter Holland (Univ. Reading, UK).

Genomic fragments containing the 2nd exons of the murine Nfia and

in M. spretus DNA. The presence or absence of the 9.4-kb M. spretus-specific KpnI fragment was followed in backcross mice. The Nfic probe, an 0.631-kb EcoRI fragment of human cDNA, was labeled with [α-32P]dCTP as above, and washing was to a final stringency of 0.8 × SSCP, 0.1% SDS, 65°C. Fragments of 7.4 and 3.5 kb were detected in BamHI-digested C57BL/6J DNA, and fragments of 7.4 and 4.9 kb were detected in M. spretus DNA. The presence or absence of the 4.9-kb M. spretus-specific BamHI fragment was followed in backcross mice. The Nfix probe, an 1.472-kb EcoRI fragment of human cDNA, was labeled as above, and washing was to a final stringency of 0.8 × SSCP, 0.1% SDS, 65°C. Fragments of 3.9 and 2.6 kb were detected in PvuII-digested C57BL/6J DNA, and fragments of 6.0 and 2.6 kb were detected in M. spretus DNA. The presence or absence of the 6.0-kb M. spretus-specific PvuII fragment was followed in backcross mice. The map locations of several other loci used to position the Nfi loci on our interspecific backcross have been previously described (Copeland and Jenkins 1991; Fletcher et al. 1996, 1997). Recombination distances were calculated as described by Green (1981), using the computer program MapManager. Gene order was determined by minimizing the number of recombination events across the chromosome.

Results

Mapping of the mouse NFI genes. The Nfi loci are distributed in the mouse genome and map to regions of conserved synteny with their locations in the human genome (Fig.