Molecular and Cellular Endocrinology 260–262 (2007) 2–11

The evolution and genomic landscape of CGB1 and CGB2

Pille Hallast a, Kristiina Rull a,b, Maris Laan a,∗

a Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Riia 23, 51010 Tartu, Estonia
b Department of Obstetrics and Gynecology, University of Tartu, Estonia

Received 12 August 2005; accepted 28 November 2005

Abstract

The origin of completely novel proteins is a significant question in evolution. The luteinizing hormone (LHB)/chorionic gonadotropin (CGB) gene cluster in humans contains a candidate example of this process. Two genes in this cluster (CGB1 and CGB2) exhibit nucleotide sequence similarity with the other LHB/CGB genes, but as a result of frameshifting are predicted to encode a completely novel protein. Our analysis of these genes from humans and related primates indicates a recent origin in the lineage specific to humans and African great apes. While the function of these genes is not yet known, they are strongly conserved between human and chimpanzee and exhibit three-fold lower diversity than LHB across human populations, with no mutations that would disrupt the coding sequence. The 5′-upstream region of CGB1/2 contains most of the promoter sequence of hCGβ plus a novel region proximal to the putative transcription start site. In silico prediction of putative transcription factor binding sites supports the hypothesis that CGB1 and CGB2 gene products are expressed in, and may contribute to, implantation and placental development.

© 2006 Elsevier Ireland Ltd. All rights reserved.

Keywords: Chorionic gonadotropin beta 1 and 2; Gene evolution; Polymorphism patterns; Putative promoter region; In silico transcription factor binding site prediction

∗ Corresponding author. Tel.: +372 7375008; fax: +372 7420286.
E-mail address: [email protected] (M. Laan).
0303-7207/$ – see front matter © 2006 Elsevier Ireland Ltd. All rights reserved.
doi:10.1016/j.mce.2005.11.049

1. Introduction

The human luteinizing hormone/chorionic gonadotropin beta (LHB/CGB) gene cluster on 19q13.3 consists of one LHB gene and six CGB genes (Fiddes and Goodman, 1980; Talmadge et al., 1984a; Maston and Ruvolo, 2002; Fig. 1A). These seven genes are highly conserved at the nucleotide level (85–99% DNA sequence identity) and appear to have originated from an ancestral LHB gene as a result of duplication during primate evolution. Four of the genes (CGB, CGB5, CGB7 and CGB8) encode the beta subunit of human chorionic gonadotropin, a 163 amino acid protein that is produced by the implanting conceptus and is essential for alterations to the maternal reproductive system in support of pregnancy. The other CGB genes, CGB1 and CGB2, encode a hypothetical protein of 132 amino acids that is completely different from the hCGβ-subunit and lacks similarity to any known protein (Bo and Boime, 1992). These genes appear to have evolved by insertion of a DNA fragment (736 bp for CGB1, 724 bp for CGB2) that replaces 52 bp of the proximal end of the promoter and the entire 5′-UTR of an ancestral hCGβ-subunit coding gene (Bo and Boime, 1992; Hollenberg et al., 1994; Fig. 1B). This insertion creates a CGB1/CGB2-specific putative promoter fragment, an alternative 5′ untranslated region (5′-UTR) and a novel exon 1, leading to a one basepair frameshift in the open reading frame (ORF) for exons 2 and 3.

Although a protein product corresponding to CGB1 and CGB2 has not yet been isolated, mRNA from these genes has been detected in the placenta (Bo and Boime, 1992; Rull and Laan, 2005) as well as in the testis (Berger et al., 1994), pituitary (Dirnhofer et al., 1996), and in breast cancer tissue (Giovangrandi et al., 2001). The repeated observations of expression suggest that these genes are functional. In transgenic mice carrying a 36-kb cosmid insert with all six CGB genes, the CGB1 and CGB2 transcripts were also observed in brain at levels comparable with placenta, the expression site for all the CGB genes (Strauss et al., 1993).

As the next step toward understanding the evolution and functional relevance of CGB1 and CGB2 we sequenced and analyzed the genes from three human populations as well as from the closest living relatives of humans. As a reference for considering relative conservation of CGB1 and CGB2 we used LHB, the founding member of this gene cluster and a gene that has a well-established, essential and conserved function in mammalian reproduction. We used the resulting data to explore the following questions: (i) what is the origin of CGB1 and CGB2? (ii) how conserved are CGB1 and CGB2 among primates? (iii) does the variation pattern of human CGB1 and CGB2 support constraints on variation consistent with functionality? (iv) does the upstream region of human CGB1/2 have expected features of a functional promoter and what transcription factor binding sites are present that could direct expression of these genes to specific tissues?

Fig. 1. Genomic context of CGB1 and CGB2. (A) Schematic presentation of the structure of the LHB/CGB gene cluster (covering 39.76 kb from LHB to CGB7) drawn to an approximate scale. Individual LHB/CGB genes (white boxes) cover 1.11–1.466 kb. Arrows indicate the direction of transcription either from a sense or an antisense strand. Experimentally identified hCGβ promoter sequence (Otani et al., 1988; white ovals) is also present, although more distally, upstream of LHB, CGB1 and CGB2 genes. Detailed alignment of the promoter area is shown in C. CGB1 and CGB2 specific insert is divided into a transcribed segment coding for 5′-UTR, exon 1 and part of intron 1 of CGB1/CGB2 (black boxes; 255 bp) and an immediate 5′-upstream segment, which could serve as an additional promoter component (black ovals; CGB1 481 bp, CGB2 469 bp). Alignment of the non-coding 5′-upstream part of the insert is in D. Intergenic Neurotrophin 6 (psNTF6; striped boxes; <1.15 kb) sequences originate through duplication from Neurotrophin 5 (NTF5) exon 3 (Hallast et al., 2005). (B) Structure of CGB1 and CGB2 differs from a consensus hCGβ gene in the following aspects: (1) hCGβ 5′-UTR has been replaced by a CGB1/2-specific insert that codes for CGB1/2 5′-UTR, exon 1 (diagonally striped box) and part of intron 1 (black box) and also provides a 481/469 bp upstream fragment, which could function as an additional promoter segment (black oval); (2) hCGβ exon 1 (horizontally striped box) is a part of CGB1/2 intron 1; (3) the open reading frame (ORF) of exons 2 and 3 of CGB1/2 (grey boxes) is shifted by −1 bp compared to hCGβ coding genes; (4) the shifted ORF has led to an earlier STOP codon and a shorter exon 3. An alternative exon 1 and shifted ORF for exons 2 and 3 code for a putative CGB1/2 protein with no amino acid similarity to hCGβ-subunit. (C) Alignment of the proximal promoter of hCGβ subunit coding genes (CGB, CGB5, CGB8, CGB7) with the homologous upstream segment of CGB1 and CGB2. cAMP response element has been mapped from −311 to −202 (Albanese et al., 1991; black brackets), trophoblast-specific element TSE from −305 to −279 (Steger et al., 1993; dotted brackets). Other experimentally proven regulatory elements of hCGβ promoter include activating protein 2 (AP2) and selective promoter factor 1 (Sp1) (Johnson and Jameson, 1999) as well as Ets-2 binding sites (Ghosh et al., 2003). *CCAAT box has been identified by MatInspector and Alibaba TFBS prediction software. (D) Prediction of transcription factor binding sites (TFBS) in the 5′-upstream segment unique to CGB1 and CGB2 created by the insertion (B). TFBSs predicted by both MatInspector and Alibaba methods are marked with solid arrows above the aligned sequences of CGB1 and CGB2; TFBSs recognized by MatInspector alone are marked by broken arrows. TFBSs predicted solely based on CGB1 sequence are indicated with (*) and based on CGB2 (**). ATF: activating transcription factor; AP2: activating protein 2; Cdx2: Caudal-related transcription factor; CREB: cAMP responsive element binding protein; ERE: Estrogen response element; HIF: Hypoxia-inducible factor 1; NFkappaB: nuclear factor κB; GATA2: GATA-binding protein 2; SF1: steroidogenic factor 1; Sp1: selective promoter factor 1. Transcription start site has been indicated based on NCBI GenBank locus no NG_000019 information.

2. Materials and methods

2.1. Experimental subjects

The study was approved by the Ethics Committee of the University of Tartu, Estonia (protocol no. 117/9, 16 June 2003). CGB1 and CGB2 genes were resequenced for 47 Estonian (Europe), 23 Mandenka (Africa) and 25 Chinese Han (Asia) individuals. Estonian DNA samples originate from the DNA bank of the Department of Biotechnology, IMCB, University of Tartu. Mandenka and Han DNA samples were obtained from the HGDP-CEPH Diversity Cell Line Panel (http://www.cephb.fr/HGDP-CEPH-Panel/). Common chimpanzee (Pan troglodytes) DNA was extracted from sperm material obtained from Tallinn Zoo, Estonia; sources of orangutan (Pongo pygmaeus) and gorilla (Gorilla gorilla) DNAs were primary cell lines AG12256 and AG05251B, purchased from ECACC.

2.2. Sequencing of human and great ape CGB1 and CGB2 genes

The structure of the CGB1/CGB2 (#MIM, 608823; 608824) genomic region has been reconstructed by web-based global alignment (http://www.ebi.ac.uk/clustalw/; CLUSTALW) and BLAST (http://www.ncbi.nlm.nih.gov/BLAST/) tools. For the analysis we used the sequence obtained from NCBI GenBank database (http://www.ncbi.nlm.nih.gov; locus no NG_000019; 26 June 2002 release).

PCR and sequencing primers for CGB1, CGB2 and a reference gene LHB were designed based on human sequence using the web-based version of the

Primer3 software (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi; Table 1). To guarantee the unique amplification of CGB1 and CGB2, one of the primers was located within the most divergent segments of the CGB1/2-specific insertion region. The uniqueness of all PCR primer pairs was checked using BLAST. The genes were amplified to cover the entire coding sequence and part of flanking regions (for human CGB1 1600 bp; CGB2 1652 bp; LHB 1599 bp PCR products). Primers designed on human sequence were also used to amplify CGB1 and CGB2 as well as LHB from genomic DNA of common chimpanzee, gorilla and orangutan. However, in order to overcome possible divergence among the species, we used a panel of primers and primer combinations (Table 1). In total eight PCR primer combinations were tested for CGB1 and nine for CGB2 amplification from chimpanzee, gorilla and orangutan DNA. PCR amplification of 100 ng genomic DNA (Long PCR Enzyme Mix; MBI Fermentas) was performed in a PTC-200 thermal cycler (MJ Research) using a standard protocol recommended by the manufacturer. The reactions were initiated with a denaturation at 95 °C for 5 min, followed by 10 cycles of denaturation at 95 °C for 20 s, annealing at 68 °C for 30 s (decrease of temperature 1 °C per cycle), elongation at 68 °C for 2 min, 10 cycles 95 °C (20 s), 56 °C (30 s), 68 °C (2 min), 10 cycles 95 °C (20 s), 54 °C (30 s), 68 °C (2 min), 10 cycles 95 °C (20 s), 51 °C (30 s), 68 °C (2 min). A final extension step was performed at 68 °C for 10 min.

All amplified genes were sequenced from both strands. For removing unincorporated mononucleotides and PCR primers, PCR products were treated with shrimp alkaline phosphatase (1.5 U, USB) and exonuclease I (1 U, MBI Fermentas). Incubation was performed in a GeneAmp® PCR System 2700 thermal cycler (Applied Biosystems) at 37 °C for 20 min followed by enzyme inactivation at 80 °C for 15 min. Purified PCR product (1.5–3 µl) was used as a template in sequencing reactions (10 µl) along with a sequencing primer (2 pmol) and DYEnamic ET Terminator Cycle Sequencing Kit reagent premix (Amersham Biosciences Inc.) as recommended by the supplier. Sequencing reactions (1.5 µl) were resolved on an ABI 377 Prism automated DNA sequencer (Applied Biosystems) using ReproGel™ 377 gels (Amersham Biosciences Inc.). Genes of great apes were sequenced with both human-specific and species-specific primers designed by a primer-walking approach. Sequencing primers are listed in Table 1.

The sequence data was assembled into a contig using phred and phrap software, and the contig was edited in the consed package (http://www.phrap.org/phredphrapconsed.html). Human polymorphisms were identified using the polyPhred program (Version 4.2) (Nickerson et al., 1997) and confirmed by manual checking. Allele frequencies of identified human SNPs were estimated and conformance with Hardy–Weinberg equilibrium was computed by an exact test (α = 0.05) using the HaploView (Barrett et al., 2005) program.

Alignment of human and great ape CGB1 and CGB2 genomic sequences was performed by the web-based global alignment tool CLUSTALW (http://www.ebi.ac.uk/clustalw/).

Table 1
Primers used for human and great ape PCR amplification and DNA sequencing

Gene         Primers               Sequence 5′–3′

Human-specific PCR primer pairs
CGB1         CGB1 1F               GTTGGTCTGGAACCCCTCA
             CGB1 2R               ATGGAGCCAATCACAAGAGG
CGB2         CGB2 1F               AGGAGAGGCTCACCCCTGAC
             CGB2 2R               CCCGGATAACTTTTCGTATTTTTA
LHB          LHB 1F                ATGACATCAAGCGGGTCTAC
             LHB 4R                GGATTAGTGTCCAGGTTACCC

Additional PCR primers tried to amplify CGB1 and CGB2 of great apes
CGB1         CGB1 2F               TCTACAGGAAGGATGGCAGAGTG
             CGB1 4R               CTTTGATCTTACGCAGGGTGATG
             CGB1 5R               CATTCTGTTTACCACAGGTGACGA
             CGB1 6R               ATCACAAGAGGCTCATCCCTGAC
CGB2         CGB2 3R               CCACCCTCTTTTCTTTTTCTTTCT
             CGB2 5R               ACCGGCTAGGGGAAGAAAAGAAC
             CGB2pr1F              CGAGTAGTGGGACATCCCACTTGCT
             CGB2pr2F              CCTATCAAGGCGTCCTCCCTTTA

Human-specific sequencing primers
CGB1         CGB1 1Fseq            CCTCACCTCAGCTGATCCAC
             CGB1 2Rseq            ACAAGAGGCTCATCCCTGAC
CGB2         CGB2 1Fseq            GGAAGGGGAACTGTATCTGAGAG
             CGB2 2Rseq            CCCGGATAACTTTTCGTATTTTT
LHB          LHB 1Fseq             ATCAAGCGGGTCTACTCACTCT
             LHB 6Rseq             AGGTTACCCCAGCATCCTATC
All genes a  CGBsense 1seqR        TGGTACACCACCCACAAAGA
             CGBsense 2seqF        CTCTTTCTGGAGGAGCGTGA
             CGBsense 2seqR        ACATCGCGGTAGTTGCACA
             CGBsense 3seqF        CCCCTGAGTCTGAGACCTGT
             CGBantisense 1seqR    GTCAACACCACCATCTGTGC
             CGBantisense 2seqF    GTCCAGGAAGCCCTCTGTT
             CGBantisense 2seqR    CACCTTCCACCTCCTTCCAG
             CGBantisense 3seqF    TCCTATTCAGGACCCACCAC

Additional sequencing primers for great apes
CGB1         Chimp-CGB1-3R2-F      GGAAATGTGGATCTACCCTACCT
             Chimp-CGB1 1F1R-F     ATCACAGGTCAAGGGGTGGT
             Chimp-CGB1-2F2-F      AGAGGGAGACCACCCTTCCT
CGB2         Gor-CGB2 2F2-F        GAAGGGTCTCTGGGTCTTTGT
             Gor-CGB2-2R2-R        GTCTGGAAGCCGTGTGAGA
LHB          Chimp-LHB-1F1R-F      AGCTGGACCCACCCTATGTAT
             Gor-LHB 1F1R-F        CCACCAGAGTTCTGTACTGTGAC
             Orang-LHB 1R2-R       CTCAGCTGTCACTGTGGACTCT
             LHB 3Fseq2-F          ATAGGATGCTGGGGTAACCTG
             LHB 9Rseq-F           GCTTCTGCCCAGTGAGAGAG

F: forward primer; R: reverse primer.
a CGB2 is transcribed from the sense, and CGB1, LHB from the antisense strand.

2.3. Sequence diversity parameters and neutrality tests

Sequence diversity parameters were calculated by DnaSP software (Version 4.0) (Rozas and Rozas, 1999). The direct estimate of per-site heterozygosity (π) was derived from the average pairwise sequence difference, and Watterson's θ (Watterson, 1975) represents an estimate of the expected per-site heterozygosity based on the number of segregating sites (S). Tajima's D (DT) statistic (Tajima, 1989) was calculated to determine if the observed patterns of human CGB1, CGB2 and reference LHB gene diversity are consistent with the standard neutral model. The basis of the DT value is the difference between the π and θ estimates: under neutral expectation π = θ and DT = 0. Significant positive DT values indicate an excess of intermediate-frequency alleles in a population, consistent with either balancing selection or a population bottleneck, whereas significant negative DT values indicate an excess of rare SNPs, consistent with either recent directional selection or an increase in population size.

A simple neutral model (Kimura, 1983) predicts that drift and mutation rate determine the level of nucleotide variation accumulating within and between species. Therefore, the relative amount of within-species polymorphism should reflect the amount of between-species fixation under neutrality. Genetic diversity of human CGB1 and CGB2 was compared with fixation between human and chimpanzee, as well as human and gorilla sequences to test neutrality. We applied the Hudson, Kreitman and Aguade (HKA) test (Hudson et al., 1987) to estimate whether there was a significant difference in the ratio of polymorphism to divergence across CGB1 and CGB2 using LHB as a reference locus.

2.4. In silico prediction of TFBS to human CGB1 and CGB2 5′-upstream region

Prediction of transcription factor binding sites (TFBS) was performed using the MatInspector 2.2 (http://www.genomatix.de/products/MatInspector/; Cartharius et al., 2005) and Alibaba 2.1 (http://www.gene-regulation.com/pub/programs/alibaba2/index.html; Grabe, 2002) programs. Both approaches rely on the information about the experimentally defined TFBS collected in the TRANSFAC database (http://www.gene-regulation.com/pub/databases.html#transfac; Matys et al., 2003). MatInspector identifies TFBS in nucleotide sequences using a large library of position weight matrices (PWM). PWM is a common way to represent the degenerate sequence preferences of a DNA-binding protein (reviewed by Stormo, 2000). Briefly, the elements of a PWM correspond to scores reflecting the likelihood that a particular nucleotide at a particular position can be observed in the known or candidate TFBS. A weight matrix pattern definition is superior to a simple IUPAC (International Union of Pure and Applied Chemistry) consensus sequence as it represents the complete nucleotide distribution of each single position. It also allows the quantification of the similarity between the weight matrix and a potential TFBS detected within the sequence. The Alibaba program starts directly at the known binding sites instead of using predefined matrices in the database. The analysis is a process consisting of three steps: (1) it pairwise aligns known sites to the unknown sequence; (2) it forms small sets of sites by their position and their according class of factor; (3) it constructs matrices from these sets. We ran Alibaba 2.1 under the following conditions: Pairsim to known sites 64, matrix width 10 bp, minimum number of sites 4–5, minimum matrix conservation 75%, similarity to sequence matrix 1%, factor class level 4–5.

When the reliability of the methods was tested on the previously determined CGB5 promoter (Otani et al., 1988; Albanese et al., 1991; Steger et al., 1993; Hollenberg et al., 1994; Fig. 1C), the MatInspector algorithm predicted one of the two experimentally proved (Johnson and Jameson, 1999) AP2 binding sites and two of three Sp-sites as well as CCAAT and CG-boxes, while the Alibaba approach was capable of recognizing correctly one Sp1 site and the CCAAT box. In the subsequent analysis of the CGB1/2-specific upstream fragment, we relied predominantly on predictions by the MatInspector approach.

3. Results and discussion

3.1. CGB1 and CGB2 have possibly arisen in the common ancestor of African great apes

First, we addressed the question of conservation of CGB1 and CGB2 genes among the species. Human-specific primers were used to amplify a unique gene product of CGB1 for chimpanzee (primers CGB1 2F and CGB1 6R, predicted length based on human sequence 2312 bp; Genbank accession no. DQ238547) and of CGB2 for gorilla (CGB2 1F and CGB2 3R, 1812 bp; DQ238550). Chimpanzee CGB2 was inferred from the jointly amplified CGB1/2 products (primers CGB2 1F and CGB2 5R, 2269 bp; DQ238549) using the chimpanzee CGB1 sequence as a reference. With a similar approach, the gorilla CGB2 was used as a reference to derive CGB1 from a common CGB1/2 product amplified from gorilla DNA (primers CGB1 1F and CGB1 2R, 1600 bp; DQ238548). None of the human-specific CGB1 and CGB2 primer combinations was able to amplify the expected products from orangutan genomic DNA. Therefore, either the orangutan CGB1 and CGB2 sequences are highly divergent from those of the other studied primates, or this species lacks the CGB1/2 insertion region (target of one of the primers), and consequently CGB1/2 genes. The latter scenario is also supported by a recent study suggesting the total copy number of orangutan CGB genes to be four (Maston and Ruvolo, 2002). Consequently, we raise the hypothesis of the origin of CGB1 and CGB2 in the common ancestor of African great apes.

Amplification of the reference gene LHB, a functional ancestral member of the same gene cluster, was successful with human-specific primers in all four studied species: human, chimpanzee (Genbank accession no. DQ238551), gorilla (DQ238552) and orangutan (DQ238553).

3.2. CGB1 and CGB2 are conserved between human and chimpanzee

Divergence of chimpanzee (C) and gorilla (G) CGB1, CGB2 and LHB from human (H) sequences (Table 2; H/C: across the genes 1.35–2.19%, exons 0.5–1.42%, introns 1.53–2.68%; H/G: across the genes 1.44–3.00%, exons 1.42–4.04%, introns 1.36–3.31%) somewhat exceeds previous estimations. The average divergence across 53 autosomal intergenic regions has been reported as 1.24 ± 0.07% for H/C and 1.62 ± 0.08% for H/G (Chen and Li, 2001). Human/chimpanzee comparison of

Table 2
Nucleotide divergence between species

Gene   Region    Length (bp) a   Nucleotide diversity (%)
                                 H/C     H/G     H/O     C/G     C/O     G/O

LHB    mRNA      1111            1.35    1.44    5.39    1.88    6.29    6.38
       Exons     423             1.42    1.42    4.02    2.84    5.91    5.20
       Introns   588             1.53    1.36    6.12    1.19    6.47    6.81
       5′-UTR    9               0       0       11.11   0       11.11   11.11
       3′-UTR    91              0       2.20    6.45    2.15    6.45    8.60
CGB1   mRNA      1366            2.19    2.34    –       2.26    –       –
       Exons     396             0.50    1.51    –       1.49    –       –
       Introns   634             2.68    2.21    –       2.20    –       –
       5′-UTR    174             3.40    4.60    –       3.41    –       –
       3′-UTR    162             3.08    2.47    –       3.08    –       –
CGB2   mRNA      1366            1.46    3.00    –       3.80    –       –
       Exons     396             1.26    4.04    –       5.30    –       –
       Introns   634             1.89    3.31    –       3.47    –       –
       5′-UTR    174             1.70    2.30    –       3.97    –       –
       3′-UTR    162             0       1.23    –       1.23    –       –

H: human; C: common chimpanzee; G: gorilla; O: orangutan.
a Length based on human sequence.
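The percentages in Table 2 are pairwise divergence values between aligned sequences. As a minimal sketch of how such a value can be computed, assuming divergence is the share of differing sites among compared alignment columns and that gap or ambiguous columns are skipped (the treatment of indels used for Table 2 is not stated here, so this is an assumption), with hypothetical toy sequences rather than data from the study:

```python
def percent_divergence(seq_a, seq_b, skip="-N"):
    """Percent of differing sites between two aligned sequences.

    Assumption: columns holding a gap ('-') or ambiguous base ('N')
    in either sequence are ignored rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment (equal length)")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue  # skipped column, not a comparable site
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return 100.0 * differences / compared


# Hypothetical toy alignment, not sequence data from the paper.
human = "ATGGAGATGTTCCAGGGGCT"
chimp = "ATGGAGATGCTCCAGGG-CT"
print(round(percent_divergence(human, chimp), 2))  # 1 difference in 19 sites -> 5.26
```

For scale, 6 differing sites among 423 compared sites would correspond to about 1.42%; the underlying difference counts for Table 2 are not given in the text, so this is only an illustration of the arithmetic.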

127 genes mapped to human chr. 21 resulted in estimates of overall divergence for coding sequences 0.75 ± 0.01% (range 0.53–2.05%), for exons 0.51% ± 0.02 (range 0.08–2.52%), for exon/intron junction 0.85 ± 0.02% (range 0.41–2.78%), for 5′-UTR 1.00% ± 0.10 and for 3′-UTR 0.93% ± 0.09 (Shi et al., 2003). Relatively high divergence (across the gene 5.39% compared to 3.08% reported for intergenic regions; Chen and Li, 2001) was also estimated between human and orangutan (O) for LHB, including 11 non-synonymous changes. Higher interspecific divergence could result from the intraspecific gene conversion among highly homologous genes in the LHB/CGB cluster (Maston and Ruvolo, 2002; Hallast et al., 2005). For the gorilla CGB2 gene, the approximately two-fold higher sequence divergence for H/G compared to H/C largely arises from two gorilla-specific deletions (2 and 12 bp) that substantially increase the number of fixed nucleotide differences between species (Fig. 2; supplementary figure).

Fig. 2. SNP patterns and fixed differences between human and great apes in CGB1 (A), CGB2 (B) and LHB (C) genes. Human polymorphic positions (vertical black bars) are marked as long bars for common SNPs (minor allele frequency >10%) and short bars for rare SNPs (<10%). For human and great ape comparison fixed differences (black arrows), non-synonymous changes (black arrows with an asterisk), SNPs found in apes (black triangle) and protein altering insertions/deletions (exclamation mark) are shown.

Divergence patterns between human and chimpanzee CGB1 and CGB2 resemble the reference gene LHB, characterized by higher conservation in exons compared to introns (Table 2). We identified only a few fixed differences among species causing non-synonymous changes in chimpanzee CGB1 (3), CGB2 (5) and LHB (1) relative to human sequence (Fig. 2; supplementary figure). None of the sequence differences alters the ORF or creates a premature stop codon. The evolution of 5′- and 3′-UTR sequences is variable among the genes, from 0 differences to 3.4% divergence. In gorilla the overall number of non-synonymous changes is even higher for LHB (8) than for

CGB1 (1) and CGB2 (4). However, in gorilla we identified for c 0.87 0.62 0.79 1.31 0.13 both CGB1 (1 bp insertion in exon 2) and CGB2 (12 bp deletion 0.49 D − − − − − at the beginning of exon 2 removing an Ala-Val-Ala-Ala motif) − ) a change presumably leading to the disruption of a predicted 3 10

protein (Fig. 2). Whether these represent consensus sequences × ( b

for gorilla CGB1 and CGB2, or only mutations in the genome of θ the sequenced individual will be solved when additional gorilla ) sequences are available for comparison. 3 10

In summary, the interspecific analysis of CGB1 and CGB2 × ( -UTR a  indicates that the level of conservation between human and chim- π panzee is as high as for LHB, thus supporting the functional importance of these genes in these species. However, in gorilla 0.47 3.95 4.21 1.31 4.42 5.62 c D − the functionality of CGB1 and CGB2 is less likely as disrupted − ORF was identified for CGB1 and a large deletion in exon 2 for ) CGB2. 3 10 × ( b 3.3. Resequencing of human CGB1 and CGB2 genes θ

revealed low variation and no nonsense mutations ) 3 10 × (

As a next step we studied the polymorphism patterns -UTR 3 a  of human CGB1 and CGB2 in comparison to LHB gene. π Re-sequencing of total 190 from three human c 1.10 0 0 NA 2.16 1.38 0.76 1.39 1.94 2.62 1.490.90 5.55 0.74 5.13 2.62 0.18 0.73 2.76 0.44 2.72 1.12 1.64 0.39 1.21 populations (Estonians n = 94, Chinese Han n = 50 and Man- 0.52 0 0 NA 2.8 1.21 1.53 D − − − − − denka n = 46) identified 22 single nucleotide polymorphisms − )

(SNPs) in CGB1 (ss48399944–ss48399963), 30 in CGB2 3 (ss48399964–ss48399997) and 24 in LHB (ss48399882– 10 × ( ss48399908) (Fig. 2; supplementary table). Interestingly, the b θ LHB gene exhibited even three-fold higher variation than CGB1 ) and CGB2 (Table 3; average across populations: πLHB/kb = 3.92; 3 π π 10 /kb = 1.39; /kb = 1.26). Only one polymorphism in ×

CGB1 CGB2 ( a

CGB1,fiveinCGB2 and four in LHB were identified leading to π 0.96 1.76 a non-synonymous change. None of the polymorphisms found * in coding regions created a preliminary stop codon. Thus, c 1.08 1.65 3.23 0.13 3.62 3.48 0.11 0 0 NA 1.37 2.50 1.83 D − − the diversity patterns of human CGB1 and CGB2 comparable −

with a typical variation of human genes (African Americans ) π π 3 (74 genes)/kb = 1.00; European Americans (74 genes)/kb = 0.80; 10 × Crawford et al., 2004) and rare non-synonymous substitutions ( b give support to the functionality of these genes. Identification of θ = 23). ) ancestral alleles of human SNPs in comparison with other great n 3 ape sequences revealed that for most of the SNPs the major allele 10 × (

in human is also the ancestral variant (supplementary table). a < 0.05; NA: not applicable. π p

We performed two alternative analyses to test whether the * human LHB, CGB1 and CGB2 have evolved under standard c 1.23 2.41 4.02 0.23 3.58 3.77 0.980.95 0.69 0.74 0.56 0.57 0.31 0.40 0.68 1.94 1.76 2.87 neutral model. Tajima’s test examines whether the average num- 1.45 0.69 2.82 D − − − − −

ber of pairwise nucleotide differences between sequences (π)is = 25); M: Mandenka ( n ) larger than expected from the observed number of polymorphic 3 θ π θ 10 sites ( ). The expected difference (Tajima D) between and is × ( b roughly zero under the standard neutral model. As differences θ between π and θ for the studied genes were small and Tajima diversity parameters ) D values were close to zero (Table 3), the hypothesis of neu- 3 CGB2 10 statistics and significance level: × ( tral evolution of these genes was not rejected. The HKA test D and a = 47); H: Chinese Han ( was performed to test the neutral evolution of CGB1 and CGB2 π 3.553.093.04 2.82 2.61 3.28 0.74 0.54 3.75 3.69 2.31 2.64 1.34 0.96 4.37 3.87 3.65 3.03 0.52 0.75 0 0 0 0 NA NA 2.3 0.86 2.15 2.45 0.08 n among the studied species with LHB as a reference (Table 4). The CGB1 d , d test is based on prediction from the Neutral Theory of Molecu- d M 2.18 3.50 H M AllE 3.92HM 1.08 1.31 1.73 1.00 1.96 2.50 0.18 0.82 3.97 0.49 0.78 0.96 1.23 4.29 0 1.7 E All 1.26 1.08 1.13 0.53 3.01 AllE 1.39H 0.92 0.90 0.86 1.80 0.16 0.57 0.76 0.49 0.18 0.91 1.23 1.15 3.49 1.57 lar Evolution (Kimura, 1983) that the amount of within-species LHB

diversity should be correlated with levels of between-species Estimate of nucleotide diversity per siteEstimate from of average nucleotide pairwise diversity difference per among siteValue of individuals. from Tajima’s number of segregating sitesE: (S). Estonians ( a c b d Table 3 Human CGB1 Gene Pop mRNALHB Exons Introns 5 divergence, due to the dependence of both on the neutral muta- CGB2 P. Hallast et al. / Molecular and Cellular Endocrinology 260–262 (2007) 2–11 9

Table 4 Hudson–Kreitman–Aguade (HKA) test results Population Gene Sa Human vs. chimpanzee Human vs. gorilla

Fixed diff.b HKAc χ2 p-Valued Fixed diff.b HKAc χ2 p-Valued

Estonian   LHB    17   15                       17
           CGB1    7   35   5.616   0.0178*     36   4.492   0.0341*
           CGB2    6   23   3.469   0.0625      49   8.078   0.0045**
Han        LHB    14   15                       17
           CGB1   12   33   1.701   0.1922      35   0.971   0.3245
           CGB2   11   24   0.555   0.4565      48   2.566   0.1092
Mandenka   LHB    17   15                       16
           CGB1   14   31   1.768   0.1836      32   1.147   0.2841
           CGB2   21   21   0.006   0.9367      45   0.883   0.3472
All        LHB    24   15                       16
           CGB1   22   30   1.740   0.1871      32   1.030   0.3101
           CGB2   30   20   0.000   0.9858      41   1.024   0.3116

a Segregating sites = the number of human polymorphisms.
b The number of fixed differences between populations: nucleotide sites at which all of the human sequences are different from the chimpanzee or gorilla sequence.
c LHB was used as a reference gene for the HKA test.
d Significance level of HKA p-value: *p < 0.05; **p < 0.01.

tion rate. Consistent with Tajima's test, we could not reject the null hypothesis of neutral evolution. The exception was the Estonian sample set, where the significant result could be a spurious outcome of a European population history shaped by bottlenecks (Barbujani and Goldstein, 2004), so that this sample captures the least of the human intrapopulation variation used in the HKA test. However, as neither Tajima's test nor the HKA test takes into account gene conversion, which has been shown to shape the sequences and variation of LHB/CGB genes within a species (Maston and Ruvolo, 2002; Hallast et al., 2005), the overall test results should be interpreted with caution, and selection cannot be entirely excluded. Innan (2003) has shown that statistical tests of neutrality based on the standard coalescent theory for a single-copy gene may not be appropriate for duplicated genes.

3.4. CGB1 and CGB2 genes possess almost complete hCGβ promoter sequence

In order to predict the regulatory elements and patterns putatively involved in driving the expression of CGB1 and CGB2, we investigated the upstream regions of these genes. Alignment of the experimentally identified hCGβ promoter (−311 bp from hCGβ 5′-CAP; Otani et al., 1988; Hollenberg et al., 1994) with the 5′-upstream region of CGB1 and CGB2 genes revealed a more proximal location of an almost complete hCGβ promoter sequence (−757 to −481 for CGB1 and −745 to −469 for CGB2 from the predicted transcription start site), lacking only 52 bp of the proximal promoter segment of hCGβ (Fig. 1B and C). Despite the absence of the minimal promoter region (MPR; −37 to +104; Hollenberg et al., 1994) including two Ets-2 binding sites (Ghosh et al., 2003), the other sequence motifs playing a crucial role in regulating hCGβ expression are conserved among the genes (Fig. 1C). These include the cAMP-dependent transcription element mapped to −311 to −202 (Albanese et al., 1991), the trophoblast-specific element (TSE) between −305 and −279 maintaining basal expression (Steger et al., 1993), as well as binding sites for the AP-2 and Sp1 transcription factors required for the full activity of the promoter (Johnson et al., 1997; Johnson and Jameson, 1999; Knöfler et al., 2004). Hollenberg et al. (1994) have suggested that the individual domains of the hCGβ promoter act in an additive or combinatorial manner. Thus, the absence of the hCGβ MPR from the CGB1/2 putative promoter region could possibly be compensated by the sequences within the CGB1/2-specific insertion. However, whether this segment indeed has a regulatory function for CGB1/2 needs to be proven in wet-lab experiments.

3.5. CGB1- and CGB2-specific upstream region is predicted to harbor binding sites for transcription factors related to early placental development and implantation

In addition to an alternative 5′-UTR and exon 1 for the CGB1 and CGB2 genes, the CGB1/2-specific insert (736 and 724 bp, respectively) provides a novel putative proximal promoter segment (481 and 469 bp, respectively) upstream of the transcription start site (Fig. 1B). We evaluated this fragment as a potential additional promoter component. In silico analysis predicted several regulatory elements in the CGB1/2 insert that have been experimentally determined to be essential for gonadotrope expression (Fig. 1D): two copies of CRE sites binding cAMP-responsive element binding protein (CREB) and activating transcription factor (ATF), binding sites for the CGα and CGβ transcription inducer AP-2 as well as for the repressor c-Jun (Pestell et al., 1994; Johnson et al., 1997), and binding sites for GATA-2, which regulates CG α-subunit expression in placenta (Steger et al., 1994).

Interestingly, the CGB1/2-specific upstream region is predicted to harbor interaction sites for several transcription factors regulating implantation and placental development (Fig. 1D): NF-κB, Cdx-2, ERR-β, HIF1 and SF-1. Although NF-κB is a transcription factor involved mainly in inflammatory and immune responses, it also regulates several genes responsible for immunological adaptation at the feto-maternal interface and early embryonic development (Chen et al., 1999; Muggia et al.,

1999). Both in human (Page et al., 2002) and in mouse (Nakamura et al., 2004), NF-κB is activated in the pregnant uterus during the preimplantation period and is highly expressed during the implantation window.

Cdx-2 and ERR-β exhibit highly specific expression patterns during embryogenesis. Besides its main role in driving embryonic axial elongation and anterior–posterior patterning, Cdx2 is also essential for trophoblastic development (Chawengsaksophak et al., 2004). Consistently, aberrant expression of bovine Cdx2 in the preimplantation cloned embryo has been reported to lead to the failure of implantation (Hall et al., 2005). ERE (reviewed by Gruber et al., 2004) is a binding site (consensus sequence 5′-GGTCAnnnTGACC-3′) not only for the estrogen receptor ligand complex, but also for ERR-β, an orphan member of the superfamily of nuclear hormone receptors (Pettersson et al., 1996). Studies on mice have shown that ERR-β is expressed during embryogenesis by ectodermally derived regions of the amniotic fold, forming chorion. Homozygous mutant mouse embryos generated by targeted disruption of the Estrrb gene have severely impaired placental development, and die 10.5 days post-coitum (Luo et al., 1997).

Hypoxia-inducible factors (HIFs) mediating oxygen homeostasis have been suggested to regulate uterine vascular permeability and angiogenesis (Daikoku et al., 2003). Transcription factor SF-1 is a key regulator of the transcription of many genes involved in sexual differentiation, steroidogenesis and reproduction (GnRHR, α-GSU, FSHB and LHB; reviewed by Parker and Schimmer, 1997).

In silico prediction of putative transcription factor binding sites allows us to hypothesize that CGB1 and CGB2 gene products are involved in implantation and placental development. The hypothesis is supported by the detection of CGB1 and CGB2 transcripts in the placenta, although at a much lower level than for the hCGβ coding genes (Bo and Boime, 1992; Rull and Laan, 2005).

4. Conclusions

This report aimed to explore the evolution, variation and putative regulatory regions of CGB1 and CGB2 in order to seek indirect evidence for the functionality of these genes, originally considered as pseudogenes (Talmadge et al., 1984b).

As both genes were amplified not only in human but also in chimpanzee and gorilla, but not in orangutan, we suggest that they arose in the common ancestor of the African great apes. Gene duplication was accompanied by the replacement of the hCGβ 5′-UTR with a non-coding sequence providing a novel putative promoter segment, 5′-UTR and exon 1.

In humans, CGB1 and CGB2 exhibit three-fold lower diversity than LHB and no ORF-disrupting mutations in a sample representing three continents. Both genes are conserved between human and chimpanzee, exhibiting the same level of interspecific divergence as LHB. CGB1 in particular stands out for its strong exonic conservation, with only 0.5% divergence between human and chimpanzee, whereas the respective number for LHB is 1.42%. In contrast, in gorilla both CGB1 and CGB2 harbor insertion/deletion changes that disrupt the predicted protein, and thus there is little support for the functionality of these genes. We hypothesize that the fate of the duplicated CGB1 and CGB2 genes has split between the human–chimpanzee and gorilla lineages, evolving towards a novel functional gene in the former and pseudogenization in the latter.

Upstream of CGB1 and CGB2, an almost complete and well-conserved (among genes) sequence of the promoter of the hCGβ coding genes is preserved. Additionally, CGB1/2 possess a novel putative proximal promoter segment created by the CGB1/2-specific insertion. Analysis of this segment in silico for TFBSs highlighted several elements shown to regulate gene expression during implantation and placental development. However, as TFBS prediction programs can only infer the binding potential, and not the functionality of the site, only subsequent wet-lab experiments can reveal whether the predictions and the postulated hypothesis hold true.

Acknowledgements

We thank Mrs. Liina Nagirnaja, Imbi Taniel and Tallinn Zoo for the chimpanzee DNA sample, Tõnu Margus for the help in DNA sequence assembly, Viljo Soo for the technical assistance and Dr. Robert K. Campbell for the editing of the English language. The study has been supported by Wellcome Trust International Senior Research Fellowship in Biomedical Science in Central Europe (grant no. 070191/Z/03/Z for M.L.), Howard Hughes Medical Institute International Scholarship (grant #55005617 for M.L.), Estonian Science Foundation (grant no. 5796 for M.L.) and Estonian Ministry of Education and Science (Core grants no. 0182721s06 for M.L. and no. 0182641s04 for K.R.); as well as personal scholarships from the World Federation of Scientists (K.R. and P.H.) and Kristjan Jaak Stipend Program (travel to ICGR for K.R.); from Artur Lind Gene Technology Fund and Estonian-Revelia Harald Raudsepp Academic Fund (for P.H.).

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.mce.2005.11.049.

References

Albanese, C., Kay, T.W., Troccoli, N.M., Jameson, J.L., 1991. Novel cyclic adenosine 3′,5′-monophosphate response element in the human chorionic gonadotropin beta-subunit gene. Mol. Endocrinol. 5, 693–702.
Barbujani, G., Goldstein, D.B., 2004. Africans and Asians abroad: genetic diversity in Europe. Annu. Rev. Genomics Hum. Genet. 5, 119–150.
Barrett, J.C., Fry, B., Maller, J., Daly, M.J., 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21, 263–265.
Bo, M., Boime, I., 1992. Identification of the transcriptionally active genes of the chorionic gonadotropin beta gene cluster in vivo. J. Biol. Chem. 267, 3179–3184.
Berger, P., Kranewitter, W., Madersbacher, S., Gerth, R., Geley, S., Dirnhofer, S., 1994. Eutopic production of human chorionic gonadotropin beta (hCG beta) and luteinizing hormone beta (hLH beta) in the human testis. FEBS Lett. 343, 229–233.
Cartharius, K., Frech, K., Grote, K., Klocke, B., Haltmeier, M., Klingenhoff, A., Frisch, M., Bayerlein, M., Werner, T., 2005. MatInspector and beyond: promoter analysis based on transcription factor binding sites. Bioinformatics 21, 2933–2942.
Chawengsaksophak, K., de Graaff, W., Rossant, J., Deschamps, J., Beck, F., 2004. Cdx2 is essential for axial elongation in mouse development. Proc. Natl. Acad. Sci. U.S.A. 101, 7641–7645.
Chen, F., Castranova, V., Shi, X., Demers, L.M., 1999. New insights into the role of nuclear factor-kappaB, a ubiquitous transcription factor in the initiation of diseases. Clin. Chem. 45, 7–17.
Chen, F.-C., Li, W.H., 2001. Genomic divergences between humans and other hominoids and the effective population size of the common ancestors of humans and chimpanzees. Am. J. Hum. Genet. 68, 444–456.
Crawford, D.C., Bhangale, T., Li, N., Hellenthal, G., Rieder, M.J., Nickerson, D.A., Stephens, M., 2004. Evidence for substantial fine-scale variation in recombination rates across the human genome. Nat. Genet. 36, 700–706.
Daikoku, T., Matsumoto, H., Gupta, R.A., Das, S.K., Gassmann, M., DuBois, R.N., Dey, S.K., 2003. Expression of hypoxia-inducible factors in the peri-implantation mouse uterus is regulated in a cell-specific and ovarian hormone-dependent manner. Evidence for differential function of HIFs during early pregnancy. J. Biol. Chem. 278, 7683–7691.
Dirnhofer, S., Hermann, M., Hittmair, R., Hoermann, K., Kapelari, K., Berger, P., 1996. Expression of the human chorionic gonadotropin-beta gene cluster in human pituitaries and alternate use of exon 1. J. Clin. Endocrinol. Metab. 81, 4212–4217.
Fiddes, J.C., Goodman, H.M., 1980. The cDNA for the beta-subunit of human chorionic gonadotropin suggests evolution of a gene by readthrough into the 3′-untranslated region. Nature 286, 684–687.
Ghosh, D., Ezashi, T., Ostrowski, M.C., Roberts, R.M., 2003. A central role for Ets-2 in the transcriptional regulation and cyclic adenosine 5′-monophosphate responsiveness of the human chorionic gonadotropin-beta subunit gene. Mol. Endocrinol. 17, 11–26.
Giovangrandi, Y., Parfait, B., Asheuer, M., Olivi, M., Lidereau, R., Vidaud, M., Bieche, I., 2001. Analysis of the human CGB/LHB gene cluster in breast tumors by real-time quantitative RT-PCR assays. Cancer Lett. 168, 93–100.
Grabe, N., 2002. AliBaba2: context specific identification of transcription factor binding sites. In Silico Biol. 2, S1–S15.
Gruber, C.J., Gruber, D.M., Gruber, I.M., Wieser, F., Huber, J.C., 2004. Anatomy of the estrogen response element. Trends Endocrinol. Metab. 15, 73–78.
Hall, V.J., Ruddock, N.T., French, A.J., 2005. Expression profiling of genes crucial for placental and preimplantation development in bovine in vivo, in vitro, and nuclear transfer blastocysts. Mol. Reprod. Dev. 72, 16–24.
Hallast, P., Nagirnaja, L., Margus, T., Laan, M., 2005. Segmental duplications and gene conversion: human luteinizing hormone/chorionic gonadotropin beta gene cluster. Genome Res. 15, 1535–1546.
Hollenberg, A.N., Pestell, R.G., Albanese, C., Boers, M.E., Jameson, J.L., 1994. Multiple promoter elements in the human chorionic gonadotropin beta subunit genes distinguish their expression from the luteinizing hormone beta gene. Mol. Cell. Endocrinol. 106, 111–119.
Hudson, R.R., Kreitman, M., Aguade, M., 1987. A test of neutral molecular evolution based on nucleotide data. Genetics 116, 153–159.
Innan, H., 2003. The coalescent and infinite-site model of a small multigene family. Genetics 163, 803–810.
Johnson, W., Albanese, C., Handwerger, S., Williams, T., Pestell, R.G., Jameson, J.L., 1997. Regulation of the human chorionic gonadotropin alpha- and beta-subunit promoters by AP-2. J. Biol. Chem. 272, 15405–15412.
Johnson, W., Jameson, J.L., 1999. AP-2 (activating protein 2) and Sp1 (selective promoter factor 1) regulatory elements play distinct roles in the control of basal activity and cyclic adenosine 3′,5′-monophosphate responsiveness of the human chorionic gonadotropin-beta promoter. Mol. Endocrinol. 13, 1963–1975.
Kimura, M., 1983. The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, MA.
Knöfler, M., Saleh, L., Bauer, S., Galos, B., Rotheneder, H., Husslein, P., Helmer, H., 2004. Transcriptional regulation of the human chorionic gonadotropin beta gene during villous trophoblast differentiation. Endocrinology 145, 1685–1694.
Luo, J., Sladek, R., Bader, J.A., Matthyssen, A., Rossant, J., Giguere, V., 1997. Placental abnormalities in mouse embryos lacking the orphan nuclear receptor ERR-beta. Nature 388, 778–782.
Maston, G.A., Ruvolo, M., 2002. Chorionic gonadotropin has a recent origin within primates and an evolutionary history of selection. Mol. Biol. Evol. 19, 320–335.
Matys, V., Fricke, E., Geffers, R., Gossling, E., Haubrock, M., Hehl, R., Hornischer, K., Karas, D., Kel, A.E., Kel-Margoulis, O.V., Kloos, D.U., Land, S., Lewicki-Potapov, B., Michael, H., Munch, R., Reuter, I., Rotert, S., Saxel, H., Scheer, M., Thiele, S., Wingender, E., 2003. TRANSFAC: transcriptional regulation, from patterns to profiles. Nucl. Acids Res. 31, 374–378.
Muggia, A., Teesalu, T., Neri, A., Blasi, F., Talarico, D., 1999. Trophoblast giant cells express NF-kappa B2 during early mouse development. Dev. Genet. 25, 23–30.
Nakamura, H., Kimura, T., Ogita, K., Nakamura, T., Takemura, M., Shimoya, K., Koyama, S., Tsujie, T., Koyama, M., Murata, Y., 2004. NF-kappaB activation at implantation window of the mouse uterus. Am. J. Reprod. Immunol. 51, 16–21.
Nickerson, D.A., Tobe, V.O., Taylor, S.L., 1997. PolyPhred: automating the detection and genotyping of single nucleotide substitutions using fluorescence-based resequencing. Nucl. Acids Res. 25, 2745–2751.
Otani, T., Otani, F., Krych, M., Chaplin, D.D., Boime, I., 1988. Identification of a promoter region in the CG beta gene cluster. J. Biol. Chem. 263, 7322–7329.
Page, M., Tuckerman, E.M., Li, T.C., Laird, S.M., 2002. Expression of nuclear factor kappa B components in human endometrium. Reprod. Immunol. 54, 1–13.
Parker, K.L., Schimmer, B.P., 1997. Steroidogenic factor 1: a key determinant of endocrine development and function. Endocr. Rev. 18, 361–377.
Pestell, R.G., Hollenberg, A.N., Albanese, C., Jameson, J.L., 1994. c-Jun represses transcription of the human chorionic gonadotropin alpha and beta genes through distinct types of CREs. J. Biol. Chem. 269, 31090–31096.
Pettersson, K., Svensson, K., Mattsson, R., Carlsson, B., Ohlsson, R., Berkenstam, A., 1996. Expression of a novel member of estrogen response element-binding nuclear receptors is restricted to the early stages of chorion formation during mouse embryogenesis. Mech. Dev. 54, 211–223.
Rull, K., Laan, M., 2005. Expression of β-subunit of human chorionic gonadotropin genes during the normal and failed pregnancy. Hum. Reprod. 20 (12), 3360–3368.
Rozas, J., Rozas, R., 1999. DnaSP Version3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15, 174–175.
Shi, J., Xi, H., Wang, Y., Zhang, C., Jiang, Z., Zhang, K., Shen, Y., Jin, L., Zhang, K., Yuan, W., Wang, Y., Lin, J., Hua, Q., Wang, F., Xu, S., Ren, S., Xu, S., Zhao, G., Chen, Z., Jin, L., Huang, W., 2003. Divergence of the genes on human chromosome 21 between human and other hominoids and variation of substitution rates among transcription units. Proc. Natl. Acad. Sci. U.S.A. 100, 8331–8336.
Steger, D.J., Buscher, M., Hecht, J.H., Mellon, P.L., 1993. Coordinate control of the alpha- and beta-subunit genes of human chorionic gonadotropin by trophoblast-specific element-binding protein. Mol. Endocrinol. 7, 1579–1588.
Steger, D.J., Hecht, J.H., Mellon, P.L., 1994. GATA-binding proteins regulate the human gonadotropin alpha-subunit gene in the placenta and pituitary gland. Mol. Cell. Biol. 14, 5592–5602.
Stormo, G.D., 2000. DNA binding sites: representation and discovery. Bioinformatics 16, 16–23.
Strauss, B.L., Pittman, R., Pixley, M., Nilson, J.H., Boime, H.I., 1993. Expression of the beta subunit of chorionic gonadotropin in transgenic mice. J. Biol. Chem. 269, 4968–4973.
Tajima, F., 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123, 585–595.
Talmadge, K., Vamvakopoulos, N.C., Fiddes, J.C., 1984a. Evolution of the genes for the beta subunits of human chorionic gonadotropin and luteinizing hormone. Nature 307, 37–40.
Talmadge, K., Boorstein, W.R., Vamvakopoulos, N.C., Gething, M.J., Fiddes, J.C., 1984b. Only three of the seven human chorionic gonadotropin beta subunit genes can be expressed in the placenta. Nucl. Acids Res. 12, 8415–8436.
Watterson, G.A., 1975. On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7, 256–276.
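Note on reading the HKA results tabulated in Section 3.3. The table reports a test statistic and a p-value for each gene and comparison. The short Python sketch below checks how the two relate, under the assumption (not stated in the text) that each p-value is the upper-tail probability of a chi-square distribution with one degree of freedom. The function name `chi2_sf_1df` and the column labels "first" and "second" are illustrative only; the numbers are copied from the table. Because the tabulated statistics are rounded to three decimals, only approximate agreement can be expected.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 1 df.

    For one degree of freedom X = Z**2 with Z standard normal, so
    P(X > x) = 2 * (1 - Phi(sqrt(x))) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# (population, gene, comparison column, statistic, reported p) from the table.
# Which outgroup each column refers to is not assumed here.
rows = [
    ("Estonian", "CGB1", "first", 5.616, 0.0178),
    ("Estonian", "CGB1", "second", 4.492, 0.0341),
    ("Estonian", "CGB2", "second", 8.078, 0.0045),
    ("Han", "CGB1", "first", 1.701, 0.1922),
]

for population, gene, column, statistic, reported_p in rows:
    computed_p = chi2_sf_1df(statistic)
    print(f"{population:9s} {gene} {column:6s} "
          f"statistic={statistic:.3f} reported={reported_p:.4f} "
          f"computed={computed_p:.4f}")
```

Under this assumption the Estonian results are the only ones falling below the 0.05 threshold, which is the exception discussed in the text.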