Lamprey Proglucagon and the Origin of Glucagon-like Peptides

David M. Irwin,* Ozgur Huner,* and John H. Youson² *Department of Laboratory Medicine and Pathobiology, and Banting and Best Diabetes Centre, Faculty of Medicine, University of Toronto, Ontario, Canada; and ²Division of Life Sciences, University of Toronto at Scarborough, Ontario, Canada

We characterized two proglucagon cDNAs from the intestine of the sea lamprey Petromyzon marinus. As in other vertebrates, sea lamprey proglucagon genes encode three glucagon-like sequences: glucagon and glucagon-like peptides 1 and 2 (GLP-1 and GLP-2). This observation indicates that all three glucagon-like sequences encoded by the proglucagon gene originated prior to the divergence of jawed and jawless vertebrates. Estimates of the rates of evolution for the glucagon-like sequences suggest that glucagon originated first, about 1 billion years ago, while GLP-1 and GLP-2 diverged from each other about 700 MYA. The two sea lamprey intestinal proglucagon cDNAs have differing coding potential. Proglucagon I cDNA encodes the previously characterized glucagon and the glucagon-like peptide GLP-1, while proglucagon II cDNA encodes a predicted GLP-2 and, possibly, a glucagon. The existence of two proglucagon cDNAs which differ with regard to their potential to encode glucagon-like peptides suggests that the lamprey may use differential gene expression as a third mechanism, in addition to alternative proteolytic processing and mRNA splicing, to regulate the production of proglucagon-derived peptides.

Key words: proglucagon, exon duplication, gene duplication, regulatory evolution.

Address for correspondence and reprints: David M. Irwin, Department of Laboratory Medicine and Pathobiology, 100 College Street, University of Toronto, Toronto, Ontario, Canada M5G 1L5. E-mail: [email protected].

Mol. Biol. Evol. 16(11):1548–1557. 1999
© 1999 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038

Introduction

Glucagon is a 29-amino-acid peptide involved in the regulation of carbohydrate, amino acid, and lipid metabolism (Lefèbvre 1995). Glucagon is secreted by the A cells of the mammalian pancreatic islets in response to low glucose levels and acts as the counterregulatory hormone to insulin (Lefèbvre 1995). In mammals, glucagon is processed from a larger precursor. Alternative proteolytic processing results in the production of glucagon-like peptides 1 and 2 (GLP-1 and GLP-2) in intestinal L cells and some neurons of the brain stem and hypothalamus (Mojsov et al. 1986; Ørskov et al. 1986). While the function of glucagon appears to be conserved among vertebrates, the function of GLP-1 varies. In tetrapods (e.g., mammals and amphibians), GLP-1 is an incretin hormone that is produced by intestinal L cells, potentiates the release of insulin by pancreatic islet B cells, and thus executes insulin-like physiological actions (Mojsov, Weir, and Habener 1987; Drucker 1998). In contrast, GLP-1 in bony fish performs glucagon-like activity and is produced by both the pancreas and the intestine (Dugay and Mommsen 1994; Plisetskaya and Mommsen 1996). The function of GLP-2 has so far been described only for mammals, in which it was identified as an intestinal growth factor (Drucker et al. 1996).

The three glucagon-like sequences found in mammalian proglucagon genes are encoded by separate exons (Heinrich, Gros, and Habener 1984; White and Saunders 1986; Philippe 1991). It has been suggested that the three peptides originated via triplication of an ancestral exon that encoded a glucagon-like peptide (Philippe 1991; Holst and Ørskov 1994; Laser and Philippe 1997; Hoyle 1998). The first proglucagon cDNAs were isolated from the pancreas of bony fish (anglerfish) and birds (chicken), and the predicted proglucagon contained only two glucagon-like sequences, glucagon and a glucagon-like peptide (Lund et al. 1982, 1983; Hasegawa et al. 1989). Recently, additional proglucagon cDNAs from nonmammalian vertebrates have been characterized, and they were found to encode GLP-2-like molecules (Irwin and Wong 1995; Chen and Drucker 1997; Irwin et al. 1997). These results indicate that the triplication of a glucagon-like exon preceded the divergence of bony fish and mammals, and that this structure has been maintained since then. The absence of GLP-2 from the cDNA initially isolated from the pancreas of anglerfish and chickens (Lund et al. 1982, 1983; Hasegawa et al. 1989) was due to alternative splicing. GLP-2-encoding cDNAs from the trout and chicken were isolated from the intestine (Irwin and Wong 1995).

The discovery that proglucagon mRNAs are alternatively spliced in some species (fish, frogs, birds, and lizards) has led to the identification of GLP-2 sequences in species in which they were previously believed not to exist (Irwin and Wong 1995; Chen and Drucker 1997). A previous phylogenetic analysis of proglucagon gene sequences by Lopez et al. (1984) concluded that GLP-2 was of ancient origin and that glucagon and GLP-1 were more closely related to each other. The early origin of GLP-2 was inferred using the rate of evolution of GLP-2 within mammals, while the rates of evolution of glucagon and GLP-1 were inferred from anglerfish-to-mammal comparisons (Lopez et al. 1984). In our characterization of GLP-2-encoding proglucagon cDNAs from nonmammalian vertebrates (Irwin and Wong 1995; Irwin et al. 1997), we recognized that the GLP-2 sequence was evolving more rapidly in nonmammalian vertebrates than within mammals. This observation led us to hypothesize that GLP-2 may have a more recent origin.

To test our hypothesis of a recent origin for GLP-2, we isolated and characterized proglucagon cDNAs from the sea lamprey Petromyzon marinus, whose lineage can be traced to the earliest vertebrates, the Agnatha.

1548 Evolution of the Vertebrate Proglucagon Gene 1549

Table 1
Sequences of Primers Used for the Isolation and Characterization of Sea Lamprey Proglucagon Genes

Name        Sequenceᵃ                     Orientationᵇ
Degenerate
  Glu5′     GGGAATTCAYTCNGARGGNACNTT      Sense
  Glp       GCGGATCCTGCATRTCRTTNGTRAA     Antisense
Gene I
  I-5.1     GCGACTATAGCAAGTACCTG          Sense
  I-5.2     TGACCTCCTACCTGGACGCGAAG       Sense
  I-3.1     GCCTGTTTTGAATCCGATGAGGCC      Antisense
  I-3.2     TGCCTCCGCTGCAGCTCACT          Antisense
Gene II
  II-5.1    TGGGTGGCTTATTATTCGATTACC      Sense
  II-5.2    ACGTTCATGTCATTGGTGAAGCTG      Sense
  II-3.1    CGCTTCAGGCACCCTGCTGTCGTA      Antisense
  II-3.2    CCCAACCCTGTGTAGGTTCCAAGA      Antisense

ᵃ Sequences are 5′ to 3′.
ᵇ For locations of primers, see figure 1.
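The degenerate primers in table 1 use IUPAC ambiguity codes (Y = C or T, R = A or G, N = any base). As a minimal sketch, and not the authors' primer-design procedure, the Python below expands the Glu5′ primer into its concrete sequences and checks that each one still encodes the start of the glucagon motif HSEGTF. Treating the first seven bases (GGGAATT) as a 5′ cloning tail, whose GAATTC site shares its final C with the first codon, is an assumption made here from reading the sequence; the helper names are hypothetical.

```python
from itertools import product

# IUPAC ambiguity codes needed for the table 1 primers (a subset of the full code).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in primer))]

def translate(dna):
    """Translate complete codons from position 0; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

GLU5 = "GGGAATTCAYTCNGARGGNACNTT"  # Glu5' sense primer from table 1
CORE = GLU5[7:]                     # assumed coding part: CAY TCN GAR GGN ACN TT

variants = expand(CORE)
peptides = {translate(v) for v in variants}
print(len(variants), peptides)
```

Under these assumptions the 17-base core expands to 256 sequences, all of which translate to HSEGT; the final TT is the start of the phenylalanine codon that completes HSEGTF.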

It has been hypothesized that pancreatic endocrine cells evolved from gut endocrine cells and that the lampreys may represent an intermediate in which some but not all endocrine cell types have migrated to the pancreas (Falkmer and Van Noorden 1983). Insulin- and somatostatin-producing endocrine cells have been identified in the pancreas, but glucagon-producing cells are found only in the intestine (Cheung et al. 1991; Youson and Elliott 1993). We isolated and characterized two different sea lamprey intestinal proglucagon cDNAs and show that sea lamprey proglucagon does encode GLP-2. Comparison of lamprey and mammalian sequences of proglucagon-derived peptides suggests that GLP-1 and GLP-2 have had a more recent common ancestor than either has with glucagon, and that they diverged from each other just prior to the origin of vertebrates. The existence of two intestinal proglucagon cDNAs with differing coding potentials suggests that the regulation of the production of proglucagon-derived peptides in the lamprey differs from that described for other vertebrates.

Materials and Methods

Materials

Oligonucleotides for PCR were obtained from ACGT (Toronto, Ontario, Canada) and are listed in table 1. Radioisotopes were purchased from Amersham (Oakville, Ontario, Canada). pCRII and pCR2.1 cloning vectors were purchased from Invitrogen (San Diego, Calif.), and the λZAPII cDNA synthesis kit was from Stratagene (San Diego, Calif.). Tissues were obtained from sea lampreys captured during their upstream spawning migration in the Humber River and Duffins Creek near Toronto, Ontario, Canada. Following anesthetization with MS-222, the animals were killed by decapitation. The caudal islet organ, the intestine, and the liver were extirpated, immediately frozen in liquid nitrogen, and ultimately stored at −80°C until use.

Isolation of Proglucagon cDNA Clones

Degenerate primers were synthesized based on the known N-terminal amino acid sequence of glucagon (HSEGTF) and the midportion of the glucagon-like peptide sequence (FTNDMT) (Conlon, Nielson, and Youson 1993; see table 1). The glucagon primer was the sense primer, whereas the GLP-1 primer was the antisense primer. Reverse transcription-polymerase chain reaction (RT-PCR) was as previously described (Irwin and Wong 1995), using mRNA isolated from the intestine of a sea lamprey in the spawning (nonfeeding) phase of its life cycle. Products of RT-PCR were cloned into pCRII. A lambda ZAP II cDNA library (composed of approximately 2,000,000 independent clones) was constructed from mRNA purified from the intestine of a single sea lamprey that was in the spawning phase of its life cycle. The 5′ and 3′ ends of proglucagon cDNAs were amplified from subsets of the library using a nested PCR approach. Each end was amplified using two nested gene-specific primers (see table 1) and nested pairs of lambda ZAP vector primers (for the 5′-end primers, M13 reverse and then T7; for the 3′-end primers, M13 forward and then T3). PCR products were cloned into pCRII or pCR2.1. At least three clones were sequenced for each end of both proglucagon cDNAs.

Northern Blot and RT-PCR Analysis

RNA was isolated from sea lamprey tissues (intestine, pancreas, and liver) with RNAsol (Canadian Life Technologies, Burlington, Ontario, Canada). To identify the length of the proglucagon I transcript, RNA was separated in a formaldehyde agarose gel (Sambrook, Fritsch, and Maniatis 1983). Proglucagon I mRNA was detected by hybridization of the RNA at high stringency with a portion of the pancreatic cDNA that encodes glucagon, GLP-1, and GLP-2, as well as 3′ untranslated sequences (bases 202–796, fig. 2). Hybridization and washing conditions were as previously described (Irwin and Wong 1995). RT-PCR using pairs of gene-specific primers (table 1) assayed tissue-specific expression of each of the two proglucagon genes. The structures of the two mRNAs were determined by RT-PCR, and 3′ rapid amplification of cDNA ends (RACE) was used to test for alternative splicing. RACE was as previously described (Irwin and Wong 1995).

First-strand cDNA for RT-PCR and 3′ RACE was synthesized with a kit from Canadian Life Technologies (Burlington, Ontario, Canada); random primers were used to prime cDNA for RT-PCR, and a modified oligo dT primer (see table 1) was used for 3′ RACE.

Sequence Analysis

DNA sequence data were assembled and analyzed with the MacDNAsis 3.2 computer package (Hitachi, San Bruno, Calif.). Differences and divergences (and standard errors) between DNA and amino acid sequences were calculated with MEGA 1.01 (Kumar, Tamura, and Nei 1993). Rates of evolution for glucagon-like sequences and dates of divergence of these sequences were estimated essentially as described by Lopez et al. (1984). Briefly, we aligned the DNA sequences from the glucagon and glucagon-like peptides (GLP-1 and GLP-2) from lampreys and humans. Only the 29 codons that are shared by all three peptides were used in the analysis. Nonsynonymous divergences (Ka) were estimated with a Jukes-Cantor correction for multiple hits. Rates for each of the three peptides were calculated assuming a divergence date of 564 Myr between lampreys and humans (see Kumar and Hedges 1998).

Results

Isolation of Sea Lamprey Proglucagon cDNAs

Initial cDNA clones for sea lamprey proglucagon were isolated by RT-PCR using degenerate primers specific for the known sequences of sea lamprey glucagon and a glucagon-like peptide (Glu5′ and Glp; see table 1 and fig. 1; Conlon, Nielson, and Youson 1993). With our degenerate primer pair, we obtained a single DNA fragment that was cloned. The DNA sequences of five clones yielded a single cDNA sequence that predicted glucagon and partial N-terminal GLP-1 amino acid sequences identical to those deduced by peptide sequencing (Conlon, Nielson, and Youson 1993). The complete cDNA sequence of sea lamprey proglucagon I was extended in the 3′ and 5′ directions by PCR amplification of a cDNA library with gene-specific and vector-specific primers (see fig. 1 and table 1). The longest cDNA clone obtained predicted an mRNA sequence of 796 bases for sea lamprey proglucagon I (fig. 2). In addition to our proglucagon I cDNA clones, we obtained additional clones that differed from the proglucagon I sequence. The proglucagon II sequence was also extended by PCR amplifications of the cDNA library (see fig. 1 and table 1). Our longest cDNA clones for proglucagon II predict an mRNA sequence of 738 bases (fig. 3).

FIG. 1. Structure of sea lamprey proglucagon I and II cDNA. The boxes represent coding sequences, and the thin line represents the 5′ and 3′ untranslated sequences. The vertical lines within each coding region represent potential processing sites for the release of glucagon (Glu), GLP-1, and GLP-2. Arrows represent oligonucleotide primers used for PCR. Degenerate primers used for initial amplifications are shown above proglucagon I, while gene-specific primers are shown below each cDNA. The sequences of the primers are given in table 1.

Structure of Sea Lamprey Proglucagon mRNAs

Northern blot analysis identified a single proglucagon I mRNA transcript of approximately 1,000 bases in the intestine and the pancreas (fig. 4). The size of the mRNA suggests that we have a nearly full length cDNA sequence (796 bases plus polyA tail). Expression of proglucagon I was detected in intestinal mRNA isolated from sea lampreys in both the parasitic and the spawner phases of their life cycle (fig. 4). Levels of expression of proglucagon I mRNA are similar in both the pancreas and the intestine. RT-PCR analysis indicated that proglucagon II mRNAs were present in the intestine (both parasitic and spawner) and the pancreas (data not shown).

Proglucagon mRNAs are alternatively spliced in many vertebrate species (Irwin and Wong 1995; Chen and Drucker 1997; Irwin et al. 1997); thus, we attempted to identify alternatively spliced forms of the sea lamprey proglucagon I mRNA. Previously described patterns of alternative splicing of proglucagon alter the coding potential of the 3′ end of the mRNA (Irwin and Wong 1995; Chen and Drucker 1997; Irwin et al. 1997) and result in mRNAs that do or do not encode GLP-2.

We used 3′ RACE to amplify the 3′ end of the proglucagon I mRNA in both the intestine and the pancreas using two nested primers that are upstream of the GLP-2 sequence (I-5.1 and I-5.2; see fig. 1). All of the 3′ RACE products from both tissues were identical in size and sequence, suggesting that all the proglucagon I mRNAs were of the same structure.

The characterization of the sea lamprey proglucagon II cDNAs implied a novel potential alternative splicing pattern. The observation that the proglucagon II mRNA was missing a GLP-1 sequence (see below) suggested that a GLP-1-encoding exon might be alternatively spliced. This pattern of alternative splicing would be similar to the alternative splicing of the GLP-2-encoding exon found in the Xenopus laevis proglucagon gene (Irwin et al. 1997). To determine if an exon encoding GLP-1 was alternatively spliced in the sea lamprey proglucagon II transcript, we used primers near the 5′ end of the coding region (II-5.1, within the signal peptide) and downstream of GLP-2 (II-3.1) to amplify most of the coding region (see fig. 1). Again, only a single amplified DNA fragment was detected by RT-PCR amplification in the pancreas and the intestine, and all clones generated from these fragments had identical DNA sequences.

FIG. 2. Sequence of sea lamprey proglucagon I. The cDNA sequence is shown, with the predicted coding region shown above in single-letter code. The nucleotide sequence is numbered from the 5′ end of the longest cDNA. A potential polyadenylation site is underlined. The amino acid sequence is numbered from the potential signal peptidase cleavage site, with the signal peptide numbered backward. Glucagon, glucagon-like peptide 1, and glucagon-like peptide 2 sequences are indicated in bold type.

Structure of the Sea Lamprey Proglucagon cDNAs and Predicted Products

Our proglucagon I cDNA predicts an amino acid sequence that encodes peptide sequences identical to the glucagon and glucagon-like peptide sequences (fig. 5) deduced by peptide sequencing from intestinal extracts of the same species of lamprey (Conlon, Nielson, and Youson 1993).

Carboxy-terminal to the GLP-1 sequence of sea lamprey proglucagon I, the open reading frame predicts an additional 47 amino acid residues (fig. 2). Within this sequence, an additional glucagon-like peptide sequence was observed, although a gap of five amino acid residues had to be inserted into the highly conserved N-terminal region of this sequence for maximal alignment (fig. 5). The C-terminal location of this glucagon-like sequence suggests that it corresponds to the GLP-2 sequences of other vertebrate proglucagons.

The proglucagon II cDNA sequence predicted a glucagon sequence that differed from the proglucagon I glucagon sequence at eight residues (fig. 5). The glucagon sequence encoded by proglucagon II was found to be equally similar to the glucagon sequences from Lampetra fluviatilis (river lamprey) (Conlon et al. 1995) and Geotria australis (Wang et al. 1999a). A unique feature of our proglucagon II cDNA was that it appeared to contain only two glucagon-like sequences, a glucagon (glucagon II) and a GLP-2 (GLP-2 II), but not a GLP-1 sequence (figs. 3 and 6). Predicted sea lamprey proglucagon I and II both have potential signal peptides to allow secretion of processed products (i.e., glucagon, GLP-1, and GLP-2). Potential signal peptidase cleavage sites predict a 21-amino-acid residue signal peptide for both proglucagons (figs. 2, 3, and 6). Mature proglucagon would thus possess a 21- (proglucagon I) or 20-amino-acid-long (proglucagon II) N-terminal region that corresponds to the glicentin-related pancreatic peptide (GRPP) of mammalian proglucagons.

FIG. 3. Sequence of sea lamprey proglucagon II. The cDNA sequence is shown as in figure 2. The pair of open diamonds (◇) indicates the site that may not be recognized by processing enzymes in proglucagon II (see text for details).

Discussion

Predicted Sea Lamprey Proglucagon-Derived Peptides

Our proglucagon cDNAs predict glucagon-like sequences similar to glucagon, GLP-1, and GLP-2. The proglucagon I cDNA (fig. 2) encodes sequences that are identical to purified glucagon and glucagon-like peptide from the sea lamprey (Conlon, Nielson, and Youson 1993). The glucagon-like peptide sequence is homologous to GLP-1 sequences of other vertebrates (fig. 5). Carboxy-terminal to the GLP-1 sequence, there is a second glucagon-like sequence that appears to correspond to GLP-2 (fig. 2).

A pair of basic amino acid residues precedes the GLP-2 sequence, which should allow processing of this peptide from proglucagon I (see fig. 2). Maximal alignment of this GLP-2 sequence to other GLP-2 sequences requires the placement of a gap of five amino acids between residues 3 and 4 (figs. 5 and 6). These missing residues in GLP-2 of proglucagon I correspond to well-conserved residues in all glucagon-like sequences; e.g., glycine at position 4 is otherwise perfectly conserved (see fig. 5). Since the conserved N-terminal sequences have been shown to be important for receptor recognition for glucagon and GLP-1 (Adelhorst et al. 1994; Gallwitz et al. 1994; Unson and Merrifield 1994), it is likely that they are also important for recognition by the GLP-2 receptor. Therefore, we conclude that the GLP-2-like sequence encoded by proglucagon I probably does not function in the same manner as other vertebrate GLP-2 sequences. This GLP-2-like peptide either is nonfunctional or has acquired a new function.

FIG. 4. Northern blot analysis of proglucagon I gene expression. mRNA from the intestine (lanes 1 and 2), pancreas (lane 3), and liver (lane 4) were separated in an agarose gel and probed with a fragment of the sea lamprey intestinal proglucagon I cDNA. RNAs in lanes 1, 3, and 4 were from a sea lamprey in the spawning phase of its life cycle, while RNA from an individual in the parasitic phase is shown in lane 2. The mobility of sea lamprey 18S and 28S rRNA is indicated on the right.

In contrast to the sea lamprey proglucagon I sequence, a single basic residue (figs. 3 and 6) precedes the glucagon peptide encoded by the sea lamprey proglucagon II cDNA. The presence of a single residue, rather than paired basic residues, raises the possibility that this glucagon peptide is not processed from the proglucagon II precursor. Prohormone convertases typically cleave precursor molecules at paired basic amino acid sites, but in some cases, certain convertases cleave at single basic amino acid sites (Steiner et al. 1992; Dhanvantari, Seidah, and Brubaker 1996). The failure of a convertase to cleave, or to cleave at low efficiency, at the single basic residue N-terminal to glucagon in proglucagon II may explain the observation that only a single glucagon was isolated from intestinal tissue (Conlon, Nielson, and Youson 1993).

FIG. 5. Comparison of glucagon-like sequences. Alignment of glucagon, GLP-1, and GLP-2 sequences is shown. Dots represent identities, while dashes represent gaps introduced for maximum identity. Pm peptide sequences are the deduced sea lamprey sequences (Conlon, Nielson, and Youson 1993).
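The processing argument in the text depends on where basic residues (K, R) fall in the precursor: paired basic residues are the typical prohormone convertase sites, while single basic residues are cleaved only in some cases. The Python sketch below shows one way to list both kinds of site in a protein sequence. The function names and the toy sequence are hypothetical; the real proglucagon sequences are those of figures 2, 3, and 6, which are not reproduced here, and real convertase recognition depends on more sequence context than this simple pattern captures.

```python
import re

BASIC = "KR"  # lysine and arginine

def dibasic_sites(protein):
    """Return 0-based start indices of paired basic residues (KK, KR, RK, RR)."""
    # A zero-width lookahead reports overlapping pairs in runs such as 'KRR'.
    return [m.start() for m in re.finditer(r"(?=[KR][KR])", protein)]

def monobasic_sites(protein):
    """Return 0-based indices of basic residues that are not part of any pair."""
    paired = set()
    for start in dibasic_sites(protein):
        paired.update((start, start + 1))
    return [i for i, aa in enumerate(protein) if aa in BASIC and i not in paired]

# Toy sequence for illustration only (not a lamprey proglucagon sequence).
toy = "AAKRHSEGTFAARHADGTFKK"
print(dibasic_sites(toy))    # paired basic sites
print(monobasic_sites(toy))  # isolated basic residues
```

For the toy sequence the paired sites start at indices 2 and 19, and the only isolated basic residue is at index 12.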

The GLP-2 sequence encoded by proglucagon II, unlike the GLP-2 sequence encoded by proglucagon I, does not require any insertions or deletions for maximal alignment to other vertebrate GLP-2 sequences (see fig. 5). The GLP-2 II sequence is preceded by paired basic residues and therefore may be processed to yield a 32-amino-acid-long peptide. We hypothesize that the GLP-2 encoded by proglucagon II is the functional GLP-2 in the sea lamprey.

FIG. 6. Alignment of sea lamprey proglucagon I and II amino acid sequences. An alignment of the predicted amino acid sequences is shown, with gaps (dashes) to yield maximum identity. Potential processing sites are indicated by spaces, with the identities of the potential processed peptides indicated above.

Sea Lamprey Proglucagon cDNAs Differ in Coding Potential for Intestinal Proglucagon-Derived Peptides

The two sea lamprey intestinal proglucagon cDNAs that we characterized differ in coding potential (figs. 2, 3, and 6). The proglucagon I cDNA encodes glucagon and GLP-1, while the proglucagon II cDNA encodes GLP-2 and possibly glucagon. Proglucagon-derived peptides GLP-1 and GLP-2 are secreted in parallel in equimolar amounts by mammalian intestinal L cells (Drucker 1998). Differing amounts of GLP-1 and GLP-2 can be produced in some vertebrate species due to alternative splicing. In fish, GLP-1, but not GLP-2, can be produced in the pancreas (Irwin and Wong 1995). All intestinal transcripts in trout and other vertebrate species should allow equimolar production of GLP-1 and GLP-2 (Irwin and Wong 1995; Chen and Drucker 1997; Irwin et al. 1997). As a consequence of having two proglucagon cDNAs (produced by different genes) that encode different sets of hormones, the level of production of each hormone can potentially be regulated at the level of transcription in the sea lamprey.

Origin of Glucagon-like Sequences

Our characterization of sea lamprey proglucagon cDNAs demonstrates that there are at least three glucagon-like sequences encoded in the lamprey genome (figs. 2, 3, and 5). This observation indicates that the origin of these three peptides (glucagon, GLP-1, and GLP-2) occurred prior to the divergence of lampreys and perhaps the original vertebrates. However, sequences clearly homologous to glucagon or glucagon-like peptides have not yet been identified in any invertebrate species (Hoyle 1998). To estimate the date of origin of the three glucagon-like hormones, we used the same approach as Lopez et al. (1984), except that we used our new sea lamprey cDNA sequences. Shown in table 2 are the divergence rates for each of the three glucagon-like hormones that were calculated assuming a divergence time of 564 Myr between lampreys and other vertebrates (see Kumar and Hedges 1998). For divergence calculations, we used portions of the cDNAs that encoded both sea lamprey glucagons, GLP-1 from proglucagon I, and GLP-2 from proglucagon II. We did not use the portion of proglucagon I cDNA that encoded GLP-2, since this predicted peptide sequence requires an insertion of a five-amino-acid gap for maximum identity (see figs. 5 and 6). We also hypothesized that GLP-2 encoded by proglucagon I may not be functional (see above). Our estimated divergence rates for each of the glucagon-like hormone sequences are similar to those estimated by Lopez et al. (1984), except for the rate estimated for GLP-2. We estimated a rate of 0.44 × 10⁻⁹ nonsynonymous substitutions per year for GLP-2, which is higher than the rate of 0.25 × 10⁻⁹ substitutions per year estimated by Lopez et al. (1984). GLP-2 is evolving at a slower rate within mammals than it is in the remaining vertebrates. We used our estimated divergence rates to calculate the divergences of the sequences encoding glucagon, GLP-1, and GLP-2 from each other for both the sea lamprey and the human cDNA sequences (table 3). Similar divergence dates were estimated from both the lamprey and the human proglucagon cDNA sequences.

Table 2
Evolutionary Rates for Glucagon-like Peptides

Peptide       Average Kaᵃ    Divergence Rateᵇ
Glucagonᶜ     0.1958         0.18 × 10⁻⁹
GLP-1ᵈ        0.3610         0.32 × 10⁻⁹
GLP-2ᵉ        0.5002         0.44 × 10⁻⁹

ᵃ Average nonsynonymous divergence of lamprey glucagon-like peptides to homologous human, gila monster, chicken, Xenopus, and trout peptides (see text for references).
ᵇ Divergence rate is calculated assuming divergence between lampreys and other vertebrates occurred 564 MYA (Kumar and Hedges 1998).
ᶜ Both glucagon I and glucagon II.
ᵈ The Xenopus GLP-1A sequence was used as the homologous peptide.
ᵉ Only GLP-2 II.
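The rates in table 2 can be approximated with a few lines of Python. The sketch assumes the standard one-parameter Jukes-Cantor correction, d = −(3/4) ln(1 − 4p/3), applied to a proportion p of differing sites, and assumes that a rate is the corrected divergence divided by twice the divergence time, since both lineages accumulate changes after the split. The text does not spell out either formula, so both are assumptions here, as are the function names.

```python
import math

def jukes_cantor(p):
    """Jukes-Cantor distance for a proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def rate_per_year(divergence, split_years):
    """Substitution rate per year, assuming both lineages evolve at the same rate."""
    return divergence / (2.0 * split_years)

SPLIT_YEARS = 564e6  # lamprey versus other vertebrates (Kumar and Hedges 1998)

# Average Ka values copied from table 2 (already corrected for multiple hits).
TABLE2_KA = {"glucagon": 0.1958, "GLP-1": 0.3610, "GLP-2": 0.5002}

rates = {name: rate_per_year(ka, SPLIT_YEARS) for name, ka in TABLE2_KA.items()}
for name, rate in rates.items():
    print(f"{name}: {rate * 1e9:.2f} x 10^-9 per year")
```

Under these assumptions the GLP-1 and GLP-2 rates come out at 0.32 and 0.44 × 10⁻⁹, matching table 2, while glucagon comes out at about 0.17 × 10⁻⁹ rather than the tabulated 0.18 × 10⁻⁹; the small difference may reflect rounding or averaging in the original calculation.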

Table 3
Origin of Glucagon-like Peptides

                                       Divergence
Compared Peptides              Ka                    MYA
Glucagon-like peptides
  Glucagonᵃ–GLP-1
    Lamprey                    0.4952 (±0.1130)ᶜ     990 (±226)
    Human                      0.4617 (±0.1065)      923 (±213)
  Glucagonᵃ–GLP-2ᵇ
    Lamprey                    0.6036 (±0.1312)      974 (±212)
    Human                      0.6339 (±0.1384)      1022 (±223)
  GLP-1–GLP-2ᵇ
    Lamprey                    0.5226 (±0.1170)      688 (±154)
    Human                      0.5144 (±0.1160)      677 (±153)
Lamprey genes
  Glucagon I–glucagon II       0.1722 (±0.0545)      478 (±151)
  GLP-2 I–GLP-2 II             0.2591 (±0.0763)      294 (±87)

ᵃ Both glucagon I and glucagon II.
ᵇ Only GLP-2 II.
ᶜ ±SE.
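The dates in table 3 are consistent with dividing each pairwise Ka by the sum of the two peptides' rates from table 2 (equivalently, by twice their mean rate). The paper does not state this formula explicitly; it is inferred here because it reproduces the tabulated values. A minimal Python check, with hypothetical names:

```python
# Rates from table 2, in units of 1e-9 nonsynonymous substitutions per year.
RATES = {"glucagon": 0.18, "GLP-1": 0.32, "GLP-2": 0.44}

def divergence_mya(ka, peptide_a, peptide_b):
    """Divergence date in MYA, assuming Ka = (rate_a + rate_b) * time."""
    return 1000.0 * ka / (RATES[peptide_a] + RATES[peptide_b])

# (comparison, peptide A, peptide B, Ka from table 3, MYA reported in table 3)
TABLE3 = [
    ("Glucagon-GLP-1, lamprey", "glucagon", "GLP-1", 0.4952, 990),
    ("Glucagon-GLP-1, human",   "glucagon", "GLP-1", 0.4617, 923),
    ("Glucagon-GLP-2, lamprey", "glucagon", "GLP-2", 0.6036, 974),
    ("Glucagon-GLP-2, human",   "glucagon", "GLP-2", 0.6339, 1022),
    ("GLP-1-GLP-2, lamprey",    "GLP-1",    "GLP-2", 0.5226, 688),
    ("GLP-1-GLP-2, human",      "GLP-1",    "GLP-2", 0.5144, 677),
    ("Glucagon I-glucagon II",  "glucagon", "glucagon", 0.1722, 478),
    ("GLP-2 I-GLP-2 II",        "GLP-2",    "GLP-2", 0.2591, 294),
]

for label, a, b, ka, reported in TABLE3:
    estimate = divergence_mya(ka, a, b)
    print(f"{label}: {estimate:.0f} MYA (table 3: {reported})")
```

The same division applied to the standard errors of Ka gives the standard errors of the dates, for example 0.1130 / 0.50 × 1000 = 226 Myr for the lamprey glucagon–GLP-1 comparison.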

The sequence encoding glucagon diverged from sequences encoding both GLP-1 and GLP-2 approximately 900–1000 MYA, while the sequences encoding GLP-1 and GLP-2 have a more recent divergence of about 677–688 MYA (see fig. 7). Similar divergence dates were estimated if we calculated gamma distances between the peptide sequences (data not shown). These estimates of dates indicate that all three glucagon-like hormone sequences encoded by the proglucagon gene must have origins early in chordate or metazoan evolution.

FIG. 7. A hypothetical scheme for the evolution of sequences encoding glucagon, GLP-1, and GLP-2. Duplication events within the proglucagon gene are indicated by the open diamonds (◇), while duplication of sequences due to gene duplications are indicated by open squares (□). An initial sequence duplication about 1 billion years ago resulted in the divergence of sequences encoding glucagon and glucagon-like peptides and was followed by a second duplication event about 600 MYA allowing the divergence of the sequences encoding glucagon-like peptides. A gene duplication event about 300–400 MYA resulted in two proglucagon genes in lampreys. The dashed line leads to the GLP-1 sequence that was not found in our proglucagon II cDNA, while the thin line leads to the GLP-2 sequence encoded by proglucagon II that may have changed its function (see text).

Duplication of Proglucagon Genes Within the Lamprey Lineage

A single glucagon peptide was isolated from intestinal extracts from sea lamprey and river lamprey, while two glucagons were isolated from the Southern Hemisphere lamprey G. australis (Conlon, Nielson, and Youson 1993; Conlon et al. 1995; Wang et al. 1999a). These observations suggested that there were two proglucagon genes in lampreys, with both genes expressed in the intestine of Geotria, while one was predominantly expressed in the intestines of the two Northern Hemisphere (sea and river) lamprey species (Wang et al. 1999a). Alternatively, a single proglucagon gene could encode the two glucagons, a situation analogous to that of the multiple GLP-1 sequences encoded by the X. laevis proglucagon gene (Irwin et al. 1997). Our characterization of two proglucagon cDNAs from the intestine of the sea lamprey supports the conclusion that there are two different proglucagon genes in lampreys. The glucagon peptide encoded by the proglucagon I cDNA is most similar to the Geotria II glucagon, while the potential glucagon encoded by proglucagon II is most similar to Geotria I and the glucagon isolated from the river lamprey (fig. 5). Thus, the duplication of the proglucagon gene occurred prior to the earliest divergence of modern lampreys, that between the Northern (sea and river) and Southern (Geotria) Hemisphere species (Potter and Hilliard 1987). Wang et al. (1999a) suggested that the isolation of a single glucagon peptide from sea and river lampreys may be explained by species-specific expression of the two proglucagon genes. Our isolation of cDNAs from two sea lamprey proglucagon genes shows that both can be expressed, although reduced production of the hormone glucagon from the proglucagon II gene may be due to absent or reduced proteolytic processing of the precursor polypeptide (see above). Species-specific proteolytic processing of the two proglucagons, rather than species-specific gene expression, may regulate the production of glucagon.

The isolation of two distinct proglucagon cDNAs from sea lampreys allows us to use the same approach described above for glucagon-like sequences to estimate the date of divergence of these two proglucagon cDNAs. For the sequences that encode glucagon, a divergence date of 478 MYA is estimated from nonsynonymous divergences (table 3). A similar value was again obtained with gamma distances from the predicted peptide sequences (data not shown). The only other parts of the predicted proglucagon sequences that appear to be homologous and alignable are the GLP-2-like peptides. Using the sequences that encode the GLP-2-like sequences, we estimated a divergence date of nearly 300 MYA (see fig. 7). These divergence dates, together with the observation that the predicted N-terminal region (including the signal peptide) and intervening peptide sequences share little similarity (fig. 6), support the suggestion that the duplication of the proglucagon gene occurred hundreds of millions of years ago in the lamprey lineage, possibly early in lamprey evolution. To date, the lamprey lineage is traced back to approximately 250 MYA, when lampreys likely evolved from a naked anaspid derivative of the ostracoderms (Forey and Janvier 1994).

Duplication of Lamprey Hormone Genes

We have shown that the proglucagon gene has been duplicated in the lamprey lineage. At least two other hormone genes are also known to have been duplicated in lampreys, the proopiomelanocortin (POMC) and peptide tyrosine-tyrosine (PYY) genes (Heinig et al. 1995; Takahashi et al. 1995; Wang et al. 1999b). For both of these genes the gene duplication events must have also been early, rather than recent, in lamprey evolution, as the peptides encoded by these differ greatly (Heinig et al. 1995; Takahashi et al. 1995; Wang et al. 1999b). A possible explanation for the origin of these duplicated genes is a genome duplication event, an event that may also explain the large number of chromosomes found in lampreys (Potter and Rothwell 1970; Robinson and Potter 1981).

The specialization of duplicate proglucagon genes in the sea lamprey to encode different peptides is not unique to the proglucagon gene. The two POMC-like genes of lamprey (POM and POC) appear to have differing coding potentials (Heinig et al. 1995; Takahashi et al. 1995). The POM gene encodes melanotropin A and B and a beta-endorphin, while the POC paralog encodes nasohypophysial factor, corticotropin, and a different beta-endorphin (Takahashi et al. 1995). Like proglucagon, a single gene in most vertebrate species typically encodes POMC. Genome duplication may have allowed lampreys to use a novel transcriptional mechanism, in addition to posttranslational control, to regulate the production of hormones encoded by the POMC-like and proglucagon genes.

Acknowledgments

The sequences reported in this paper have been submitted to GenBank under accession numbers AF159707 and AF159708. This work was supported by grants from the Natural Sciences and Engineering Research Council (NSERC). We thank two anonymous reviewers for constructive and thoughtful suggestions. We thank Dr. John Holmes and Richard Manzon for animal collection and maintenance.

LITERATURE CITED

ADELHORST, K., B. B. HEDEGAARD, L. B. KNUDSEN, and O. KIRK. 1994. Structure-activity studies of glucagon-like peptide 1. J. Biol. Chem. 269:6275–6278.

CHEUNG, R., P. C. ANDREWS, E. M. PLISETSKAYA, and J. H. YOUSON. 1991. Immunoreactivity to peptides belonging to the pancreatic polypeptide family (NPY, aPY, PP, PYY) and to glucagon-like peptide in the endocrine pancreas and anterior intestine of adult lampreys, Petromyzon marinus: an immunohistochemical study. Gen. Comp. Endocrinol. 81:51–63.

CONLON, J. M., V. BONDAREVA, Y. RUSAKOV, E. M. PLISETSKAYA, D. C. MYNARCIK, and J. WHITTAKER. 1995. Characterization of insulin, glucagon, and somatostatin from the river lamprey, Lampetra fluviatilis. Gen. Comp. Endocrinol. 100:96–105.

CONLON, J. M., P. F. NIELSON, and J. H. YOUSON. 1993. Primary structures of glucagon and glucagon-like peptide isolated from the intestine of the parasitic phase lamprey Petromyzon marinus. Gen. Comp. Endocrinol. 91:96–104.

DHANVANTARI, S., N. G. SEIDAH, and P. L. BRUBAKER. 1996. Role of prohormone convertases in the tissue-specific processing of proglucagon. Mol. Endocrinol. 10:342–355.

DRUCKER, D. J. 1998. Glucagon-like peptides. Diabetes 47:159–169.

DRUCKER, D. J., P. EHRLICH, S. L. ASA, and P. L. BRUBAKER. 1996. Induction of intestinal epithelial proliferation by glucagon-like peptide 2. Proc. Natl. Acad. Sci. USA 93:7911–7916.

DUGAY, S. J., and T. P. MOMMSEN. 1994. Molecular aspects of pancreatic peptides. Fish Physiol. 13:225–271.

FALKMER, S., and S. VAN NOORDEN. 1983. Ontogeny and phylogeny of the glucagon cell. Pp. 81–119 in P. J. LEFEBVRE, ed. Handbook of experimental pharmacology. Springer-Verlag, Heidelberg, Germany.

FOREY, P., and P. JANVIER. 1994. Evolution of the early vertebrates. Am. Sci. 82:554–565.

GALLWITZ, B., M. WITT, G. PAETZOLD, C. MORYS-WORTMANN, B. ZIMMERMANN, K. ECKART, U. R. FOLSCH, and W. E. SCHMIDT. 1994. Structure/activity characterization of glucagon-like peptide 1. Eur. J. Biochem. 225:1151–1156.

HASEGAWA, S., K. TERAZONO, K. NATA, T. TAKADA, H. YAMAMOTO, and H. OKAMOTO. 1989. Nucleotide sequence determination of chicken glucagon precursor cDNA: chicken proglucagon does not contain glucagon-like peptide II. FEBS Lett. 264:117–120.

HEINIG, J. A., F. W. KEELEY, P. ROBSON, S. A. SOWER, and J. H. YOUSON. 1995. The appearance of proopiomelanocortin early in vertebrate evolution: cloning and sequencing of POMC from a lamprey pituitary cDNA library. Gen. Comp. Endocrinol. 99:137–144.

HEINRICH, G., P. GROS, and J. F. HABENER. 1984. Glucagon gene sequence: four of six exons encode separate functional domains of the rat pre-proglucagon gene. J. Biol. Chem. 259:14082–14087.

HOLST, J. J., and C. ØRSKOV. 1994. Glucagon and other proglucagon-derived peptides. Pp. 305–340 in J. H. WALSH and G. J. DOCRAY, eds. Gut peptides: biochemistry and physiology. Raven Press, New York.

HOYLE, C. H. V. 1998. Neuropeptide families: evolutionary perspectives. Regul. Pept. 73:1–33.

IRWIN, D. M., M. SATKUNARAJAH, Y. WEN, P. L. BRUBAKER, R. A. PEDERSON, and M. B. WHEELER. 1997. The Xenopus proglucagon gene encodes novel GLP-1-like peptides with insulinotropic properties. Proc. Natl. Acad. Sci. USA 94:
7915±7920. CHEN, Y. E., and D. J. DRUCKER. 1997. Tissue-speci®c ex- IRWIN, D. M., and J. WONG. 1995. Trout and chicken proglu- pression of unique mRNAs that encode proglucagon-de- cagon: alternative splicing generates mRNA transcripts en- rived peptides or exendin-4 in the lizard. J. Biol. Chem. coding glucagon-like peptide 2. Mol. Endocrinol. 9:267± 272:4108±4115. 277. Evolution of the Vertebrate Proglucagon Gene 1557

KUMAR, S., and S. B. HEDGES. 1998. A molecular timescale for vertebrate evolution. Nature 392:917–920.

KUMAR, S., K. TAMURA, and M. NEI. 1993. MEGA: molecular evolutionary genetics analysis. Version 1.01. Pennsylvania State University, University Park.

LASER, B., and J. PHILIPPE. 1997. Molecular aspects of the glucagon gene. Pp. 203–228 in D. LEROITH, ed. Advances in molecular and cellular endocrinology. Vol. 1. JAI Press, Greenwich, Conn.

LEFÈBVRE, P. J. 1995. Glucagon and its family revisited. Diabetes Care 18:715–730.

LOPEZ, L. C., W.-H. LI, M. L. FRAZIER, C.-C. LUO, and G. F. SAUNDERS. 1984. Evolution of glucagon genes. Mol. Biol. Evol. 1:335–344.

LUND, P. K., R. H. GOODMAN, P. C. DEE, and J. F. HABENER. 1982. Pancreatic preproglucagon cDNA contains two glucagon-related coding sequences arranged in tandem. Proc. Natl. Acad. Sci. USA 79:345–349.

LUND, P. K., R. H. GOODMAN, M. R. MONTMINY, P. C. DEE, and J. F. HABENER. 1983. Anglerfish islet pre-proglucagon II: nucleotide and corresponding amino acid sequence of the cDNA. J. Biol. Chem. 258:3280–3284.

MOJSOV, S., G. HEINRICH, I. B. WILSON, M. RAVAZZOLA, L. ORCI, and J. F. HABENER. 1986. Preproglucagon gene expression in the pancreas and intestine diversifies at the level of post-translational processing. J. Biol. Chem. 261:11880–11889.

MOJSOV, S., G. C. WEIR, and J. F. HABENER. 1987. Insulinotropin: glucagon-like peptide I (7–37) co-encoded in the glucagon gene is a potent stimulator of insulin release in the perfused rat pancreas. J. Clin. Invest. 79:616–619.

ØRSKOV, C., J. HOLST, S. KNUHTSEN, F. G. A. BALDISSERA, S. S. POULSEN, and O. V. NIELSEN. 1986. Glucagon-like peptides, GLP-1 and GLP-2, predicted products of the glucagon gene are secreted separately from pig small intestine but not pancreas. Endocrinology 119:1467–1475.

PHILIPPE, J. 1991. Structure and pancreatic expression of the insulin and glucagon genes. Endocr. Rev. 12:252–271.

PLISETSKAYA, E. M., and T. P. MOMMSEN. 1996. Glucagon and glucagon-like peptides in fishes. Int. Rev. Cytol. 168:187–257.

POTTER, I. C., and R. W. HILLIARD. 1987. A proposal for the functional and phylogenetic significance of differences in dentition of lampreys (Agnatha: Petromyzontiformes). J. Zool. 212:713–737.

POTTER, I. C., and B. ROTHWELL. 1970. The mitotic chromosomes of the lamprey, Petromyzon marinus L. Experientia 26:429–430.

ROBINSON, E. S., and I. C. POTTER. 1981. The chromosomes of the southern hemispheric lamprey, Geotria australis Gray. Experientia 37:239–240.

SAMBROOK, J., E. F. FRITSCH, and T. MANIATIS. 1989. Molecular cloning. 2nd edition. Cold Spring Harbor Press, Cold Spring Harbor, N.Y.

STEINER, D. F., S. P. SMEEKENS, S. OHAGI, and S. J. CHAN. 1992. The new enzymology of precursor processing endoproteases. J. Biol. Chem. 267:23435–23438.

TAKAHASHI, A., Y. AMEMIYA, M. SARASHI, S. A. SOWER, and H. KAWAUCHI. 1995. Melanotropin and corticotropin are encoded on two distinct genes in the lamprey, the earliest evolved extant vertebrate. Biochem. Biophys. Res. Commun. 213:490–498.

UNSON, C. G., and R. B. MERRIFIELD. 1994. Identification of an essential serine residue in glucagon: implication for an active site triad. Proc. Natl. Acad. Sci. USA 91:454–485.

WANG, Y., P. F. NIELSEN, J. H. YOUSON, I. C. POTTER, and J. M. CONLON. 1999a. Multiple forms of glucagon and somatostatin isolated from the intestines of the Southern-Hemisphere lamprey Geotria australis. Gen. Comp. Endocrinol. 113:274–282.

WANG, Y., P. F. NIELSEN, J. H. YOUSON, I. C. POTTER, V. A. LANCE, and J. M. CONLON. 1999b. Molecular evolution of peptide tyrosine-tyrosine: primary structure of PYY from the lamprey Geotria australis and Lampetra fluviatilis, bichir, python and desert tortoise. Regul. Pept. 79:103–108.

WHITE, J. W., and G. F. SAUNDERS. 1986. Structure of the human proglucagon gene. Nucleic Acids Res. 14:4719–4730.

YOUSON, J. H., and W. M. ELLIOTT. 1993. Morphogenesis and distribution of the endocrine pancreas in adult lampreys. Fish Physiol. Biochem. 7:125–131.

CLAUDIA KAPPEN, reviewing editor

Accepted July 26, 1999