
Turkish Honeybees


Turkish Honeybees: Genetic Variation and Evidence for a Fourth Lineage of Apis mellifera mtDNA

M. R. Palmer, D. R. Smith, and O. Kaftanoğlu

The mtDNA of bees from 84 colonies of Turkish honeybees (Apis mellifera) was surveyed for variation at four diagnostic restriction sites and the sequence of a noncoding intergenic region. These colonies came from 16 locations, ranging from European Turkey and the western Mediterranean coast to the Caucasus Mountains along the Georgian border, the eastern Lake Van region, and the extreme south. Combined restriction site and sequence data revealed four haplotypes. Three haplotypes belonged to the eastern Mediterranean mtDNA lineage. The fourth haplotype, which had a novel restriction site pattern and noncoding sequence, was found in samples from the extreme south, near the Syrian border. We found two different noncoding sequences among the eastern Mediterranean haplotypes. The "Caucasian" sequence matches that described from A. m. caucasica, and the "Anatolian" sequence matches that of A. m. carnica. The frequency of the "Caucasian" sequence was highest (98–100%) in sites near the Georgian border and decreased steeply to the south and west. Elsewhere the Anatolian sequence was found. In European Turkey (Thrace) a restriction site polymorphism previously reported from A. m. carnica in Austria and the Balkans was present at high frequency. A novel mtDNA haplotype with a unique restriction site pattern and noncoding sequence was found among bees from Hatay, in the extreme south near the Syrian border. This haplotype differed from the three previously known lineages of honeybee mtDNA (African, western European, and eastern Mediterranean) and may represent a fourth mitochondrial lineage.

Honeybees (Apis mellifera) are geographically diverse, with as many as 25 subspecies currently recognized (Ruttner 1988; Sheppard et al. 1997). The honeybee mitochondrial genome has provided abundant data for studies of honeybee phylogeny and biogeography. Studies of mitochondrial DNA (mtDNA) restriction site polymorphisms (e.g., Crozier et al. 1991; Garnery et al. 1993; Hall and Smith 1991; Meixner et al. 1993; Moritz et al. 1994; Sheppard et al. 1996; Smith 1991a,b; Smith and Brown 1990; Smith et al. 1991) and sequence polymorphisms (Arias and Sheppard 1996; Garnery et al. 1992; Lee and Hall 1996) have revealed three main lineages of honeybee mtDNA: western European, eastern Mediterranean, and African.

The mitochondrial genome of honeybees also contains a noncoding region located between a leucine tRNA gene and the cytochrome oxidase II gene (COII; Cornuet et al. 1991). This noncoding sequence has a complex structure, which gives rise to both sequence and length variation (Hall and Smith 1991). Two basic elements, called "P" (or "P0") and "Q", are found in the noncoding sequence; length variation stems from presence or absence of the P element and from tandem repeats of the Q element. The Q element itself consists of three subunits (Q1, Q2, and Q3); Q3 is usually identical to the P element from the same genome (Cornuet et al. 1991). Sequence variation includes both base substitutions and insertions/deletions. The three mitochondrial lineages differ in length and sequence of the noncoding region: in the eastern Mediterranean lineage one finds a single Q element; in the western European and African lineages one finds a P element followed by one, two, or three repeats of the Q element. The P element of the western European lineage has a 15-base deletion relative to the African "P0" element (Cornuet et al. 1991). These lineage-specific differences are summarized in Table 1 and Figure 1.

Table 1. Presence or absence of diagnostic restriction sites in four regions of honeybee mtDNA, and structure of the noncoding intergenic region

Gene                Enzyme   Western          Eastern        Eastern        African           Middle
                             European         Mediterranean, Mediterranean,                   Eastern
                                              type 1         type 2
Cytochrome b        BglII    +                +              +              -                 +
COI                 HincII   +                -              -              -                 -
lsrRNA              EcoRI    -                +              -              -                 -
COI                 XbaI     -                +              +              -                 -
COI                 XbaI     -                -              +              -                 -
Noncoding sequence           PQ, PQQ, PQQQ    Q              Q              P0Q, P0QQ, P0QQQ  P1Q

Western European, eastern Mediterranean, and African are three lineages of honeybee mtDNA characterized by restriction site and length differences. Type 1 and type 2 are two variants of the eastern Mediterranean lineage. "Gene" indicates the approximate location of restriction sites, "Enzyme" the restriction enzyme. The "+" sign indicates presence of a restriction site, "-" indicates its absence. A novel pattern of restriction sites was found in bees from Hatay; here it is called "Middle Eastern." Primers for cytochrome b reported in Crozier et al. (1991); others in Hall and Smith (1990). Terminology for structure of the noncoding region follows that of Cornuet et al. (1991).

The geographic distribution of these mitochondrial lineages corresponds roughly with the distributions of honeybee subspecies: western European mtDNA is found primarily in A. m. mellifera and some Spanish honeybees, eastern Mediterranean mtDNA in A. m. ligustica, A. m. carnica, and A. m. caucasica, and African mtDNA in A. m. scutellata and other bees from Africa.

The bees of Turkey are particularly relevant for studies of honeybee biogeography. Turkey is located at the geographic crossroads of Europe, Asia, and the Middle East and contains a wide range of climates and habitats within its borders. Not surprisingly, the honeybees of Turkey are also quite diverse; based on morphometric, behavioral, and ecological data, Ruttner (1988) suggested that four subspecies occur in Turkey: A. m. anatoliaca, A. m. caucasica, A. m. meda, and A. m. syriaca. According to Ruttner, A. m. caucasica occurs in the extreme northeast of Anatolia (Asian Turkey), with bees resembling A. m. caucasica occurring along the eastern Black Sea coast as far as Samsun. A. m. meda is found in the southeast, and A. m. syriaca in the extreme south, near the border with Syria. A. m. anatoliaca occurs throughout the rest of Turkey, including European Turkey.

Genetic studies of the honeybees of Turkey include allozyme studies (Asal et al. 1995; Kandemir and Kence 1995) and a study of mtDNA restriction site polymorphisms (Smith et al. 1997). The restriction site study showed that our Turkish samples possessed the eastern Mediterranean lineage of honeybee mtDNA. Even though two subspecies (A. m. anatoliaca and A. m. caucasica) were believed to be represented in our samples, the only variation we found was among the bees of Thrace (European Turkey), where we found an XbaI site previously known only from A. m. carnica (Meixner et al. 1993; Smith and Brown 1990).

Here we examine mtDNA restriction site and sequence variation in bees collected from 16 localities in Turkey: the 12 sites reported earlier (Smith et al. 1997) and 4 additional sites in southern and eastern Turkey. We survey the four restriction sites, which distinguish three major mtDNA lineages within A. mellifera, and sequence the noncoding intergenic region of the mitochondrial genome (Cornuet et al. 1991). Our restriction site survey and sequence data reveal a novel mtDNA haplotype, which may constitute a fourth mitochondrial lineage in A. mellifera.
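As an illustration of how the P/Q structure of the noncoding region separates the lineages, the rules stated above and in Table 1 can be written as a small lookup. This is only a sketch: the function name, the string notation (for example "P0QQ"), and the "unclassified" fallback are assumptions made for the example, not part of the study's methods, and a real assignment would also depend on the sequence itself and on the restriction sites.

```python
import re

# Notation assumed for this sketch: an optional P element ("P", "P0", or "P1")
# followed by one or more "Q" repeats, e.g. "Q", "PQQ", "P0QQQ", "P1Q".
_STRUCTURE = re.compile(r"^(P0|P1|P)?(Q+)$")


def classify_noncoding_structure(structure: str) -> str:
    """Map a P/Q structure string to the lineage given in the text and Table 1."""
    match = _STRUCTURE.match(structure)
    if match is None:
        raise ValueError(f"unrecognized structure: {structure!r}")

    p_element = match.group(1)
    q_repeats = len(match.group(2))

    if p_element is None:
        # Eastern Mediterranean: a single Q element and no P element.
        return "eastern Mediterranean" if q_repeats == 1 else "unclassified"
    if p_element == "P1":
        # Haplotype from Hatay: P1 followed by one Q element.
        return "Middle Eastern" if q_repeats == 1 else "unclassified"
    if q_repeats > 3:
        # The text describes one, two, or three Q repeats only.
        return "unclassified"
    # "P" carries the 15-base deletion relative to the African "P0".
    return "western European" if p_element == "P" else "African"


if __name__ == "__main__":
    for s in ["Q", "PQ", "PQQQ", "P0QQ", "P1Q"]:
        print(s, "->", classify_noncoding_structure(s))
```

Structure alone cannot distinguish the two eastern Mediterranean restriction site variants (type 1 and type 2), which both carry a single Q element; those are separated by the XbaI sites in Table 1.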

Methods

Collections

Adult worker honeybees were collected from comb (or in one case from a swarm) and frozen in liquid nitrogen or preserved in 70% ethanol. Samples were collected from colonies in the following locations (see Figure 2). In 1994, samples were collected from Bursa (one colony) and villages near Giresun (four colonies). In June 1995, samples were collected from Thrace (7 colonies from villages near Tekirdağ); Gökçeada (10 colonies); the Black Sea coast near Bolu and Yedigöller (4 colonies); Menemen, Aegean Agricultural Research Institute (9 colonies); Beypazari (7 colonies); Erzurum (7 colonies); [...] (5 colonies); villages near Ardanuç (6 colonies); villages near Artvin (5 colonies); and the villages of Posof, Şavşat, and Süngüllü near the Georgian border (16 colonies). The bees from Bolu, Şavşat, and Beypazari came from breeding colonies maintained by the Beekeeping Project of the Turkish Development Foundation (TKV); the rest were collected in situ. In 1998, samples were collected from Bitlis (9 colonies), Muş (2 colonies), Van (4 colonies), and Hatay (14 colonies).

These collection sites fall within the ranges of A. m. anatoliaca, A. m. caucasica, A. m. meda, and A. m. syriaca as described by Ruttner (1988; see Figure 2). However, migratory beekeeping is widely practiced in Turkey, and Caucasian bees (A. m. caucasica) are highly prized by beekeepers. Both factors (especially the latter) can lead to transplantation and mixing of populations. Although none of the samples used in this study were subjected to morphometric analysis, we are confident that they are, for the most part, representative of regional populations. First, Güler (1996) and Güler et al. (1999) carried out a morphometric analysis of other samples from some of our 1995 collection sites. Their samples from Posof, Süngüllü, and Ardahan corresponded to published characteristics of A. m. caucasica. They also examined samples from Gökçeada, Beypazari, Fethiye, and Thrace; although they found regional variation among these populations, all correspond to A. m. anatoliaca. Second, some samples were taken from research apiaries that maintain honeybee subspecies stocks (e.g., Aegean Agricultural Research Station, Menemen; Beekeeping Project of the Turkish Development Foundation (TKV), Beypazari; honeybee breeding station, Ardahan). Others are from regions in which migratory beekeeping is rare or prohibited (e.g., Gökçeada, Ardahan). Finally, since A. m. caucasica is so highly prized, "foreign" populations of honeybees are not likely to be imported into its range in Turkey, though A. m. caucasica is likely to be exported and established in new locations. In the final analysis, however, subspecies designations are not critical to our study. We are documenting variation present in honeybee mtDNA and its geographic distribution. The distribution of mtDNA haplotypes may not always match subspecies designations based on morphometrics.

Figure 1. Structure and sequence of the noncoding region located between leucine tRNA and cytochrome oxidase II genes in the A. mellifera mitochondrial genome. The "P" and "Q" terminology follows the usage of Cornuet et al. (1991). (A) Structure of the noncoding region in three major lineages of honeybee mtDNA (W = western European, E = eastern Mediterranean, A = African) and the novel mitochondrial haplotype found in Hatay (M = Middle Eastern). In African and western European mtDNAs, the Q element may be repeated one to three times. (B) Sequence of the noncoding region. Sequence of each element and subunit (P, Q1, Q2, Q3) shown separately. A = A. m. scutellata from African lineage; W = A. m. mellifera from western European lineage; M = Syrian sequence from the proposed Middle Eastern lineage; E-c = A. m. caucasica from the eastern Mediterranean lineage; E-l = A. m. ligustica from the east Mediterranean lineage. A. m. ligustica and A. m. scutellata sequences from Garnery et al. (1992).

Figure 2. Approximate ranges of A. m. anatoliaca, A. m. caucasica, A. m. meda, and A. m. syriaca in Turkey as suggested by the morphometric studies of Ruttner (1988; dotted lines), and distribution of four mtDNA haplotypes found in this study (pie charts). Samples were collected from starred locations. Key to pie charts: Gray shading indicates the frequency of haplotypes of the eastern Mediterranean mitochondrial lineage with type 1 restriction site pattern and Anatolian noncoding intergenic sequence; striped shading, eastern Mediterranean mitochondrial lineage, type 2 restriction site pattern, Anatolian noncoding sequence; black shading, eastern Mediterranean mitochondrial lineage, type 1 restriction site pattern, Caucasian noncoding sequence; white, Syrian sequence with Middle Eastern restriction site pattern. Restriction site patterns are described in Table 1, sequences are shown in Figure 1.

Laboratory Methods

Total DNA was prepared by proteinase K digestion of single thoraces, followed by phenol extraction and ethanol precipitation as described in Smith et al. (1997) or with Qiagen Tissue Prep kits (Qiagen, Valencia, CA). The four regions of the mitochondrial genome used to diagnose lineages of honeybee mtDNA were amplified by means of the polymerase chain reaction (PCR; Saiki et al. 1985) and digested with the appropriate restriction enzymes, as described in Smith et al. (1997). The amplified regions, restriction enzymes, and pattern of restriction sites characteristic of each honeybee mtDNA lineage are shown in Table 1.

One of these amplified regions extends from the 3' end of cytochrome oxidase I (COI) to the 5' end of COII; this fragment includes the noncoding intergenic region discussed above. The noncoding region was sequenced using the internal primer 5'-GGCAGAATAAGTGCATTG-3' (Cornuet et al. 1991). Sequencing reactions were carried out using the fMOL (Promega) cycle sequencing protocol with 32P-labeled sequencing primer.

Results

Three restriction site patterns were found in the mitochondrial genomes of these Turkish honeybee samples. Most colonies had the restriction site pattern characteristic of the eastern Mediterranean mtDNA lineage (Tables 1 and 2). As reported earlier (Smith et al. 1997), a variant of this pattern in which there are two (rather than one) XbaI sites in the amplified COI fragment was found in six of seven colonies from Thrace. A novel restriction site pattern was found in some colonies from Hatay (Tables 1 and 2); this haplotype may represent a fourth mitochondrial lineage. Here we refer to both the haplotype and the mitochondrial lineage as "Middle Eastern."

Table 2. Frequency of three restriction site patterns in the mtDNA of Turkish honeybees from 17 localities: eastern Mediterranean (type 1 and 2) and "Middle Eastern"

                    Eastern Mediterranean
                    Type 1           Type 2           Middle Eastern
Locality            Number  Percent  Number  Percent  Number  Percent
Tekirdag                 1       14       6       86       0        0
Gökçeada                10      100       0        0       0        0
Bursa                    1      100       0        0       0        0
Bolu                     2      100       0        0       0        0
Menemen                  9      100       0        0       0        0
Fethiye                  9      100       0        0       0        0
Beypazari                6      100       0        0       0        0
Giresun                  4      100       0        0       0        0
Ardahan                  6      100       0        0       0        0
Posof and Savsat         9      100       0        0       0        0
Artvin                   5      100       0        0       0        0
Ardanuç                  6      100       0        0       0        0
Erzurum                  6      100       0        0       0        0
Bitlis                   9      100       0        0       0        0
Van                      4      100       0        0       0        0
Mus                      2      100       0        0       0        0
Hatay                    6       43       0        0       8       57

The restriction site patterns are described in Table 1.

We also found three sequence variants in the noncoding intergenic region of Turkish honeybee mtDNA (Figure 1). All of our noncoding sequences differed in one respect from those reported by Garnery et al. (1992). In their study the noncoding sequence began with ATTTCCCC in A. m. ligustica and A. m. carnica, and ATTTCCC- (single-base deletion) only in A. m. caucasica; all of our Turkish sequences, regardless of origin, began ATTTCCC-.

One of our three mtDNA sequences matched (except for the missing "C" mentioned above) the sequence reported for A. m. carnica, and a second matched the sequence reported for A. m. caucasica (Garnery et al. 1992). Here we refer to these sequences as "Anatolian" and "Caucasian," respectively.

We call the third sequence "Syrian" (not reported in the literature), as it was found in bees collected in Hatay, near the Syrian border and within the reported range (Ruttner 1988) of A. m. syriaca. The novel Syrian sequence contained a P element (like the African and west European lineages) and one Q element. The Syrian P (we call it "P1") is identical to the African P0 element except for one base substitution and a 3-base insertion (Figure 1). The Syrian Q element differs from those of the other lineages by a series of base substitutions, additions, and deletions.

The Syrian sequence was always found with Middle Eastern restriction site pattern. Although we did not sequence all eight examples of mtDNA with the Middle Eastern restriction site pattern, we infer that they all had the Syrian intergenic sequence, based on size differences in the amplified mtDNA fragment: the Anatolian intergenic sequence (Q) is shorter than the Syrian (P1Q).

Table 3 shows the frequency and geographic distribution of the three intergenic sequences. Caucasian sequence is highest near the Georgian border and decreases to the west and south. Fourteen of 16 colonies (87.5%) from sites near the Georgian border (Posof, Şavşat) and 10 of 15 colonies (66%) from localities 20–40 km from the Georgian border (Ardahan, Ardanuç, and Artvin) had the Caucasian sequence. Two of 7 colonies (29%) from Erzurum (approximately 240 km south of the Georgian border) and 3 of 12 colonies (25%) in the Lake Van region (Van, Muş, Bitlis) had the Caucasian sequence. The novel Syrian sequence was found only in our samples from Hatay. All other colonies sampled from other localities had only the Anatolian sequence. Of these, six of seven colonies from Thrace had the eastern Mediterranean type 2 restriction site pattern, while the rest had the type 1 pattern.
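The regional frequencies of the Caucasian sequence quoted above are simple proportions of sequenced colonies, and recomputing them from the stated counts makes the north-to-south decline easy to check. The sketch below assumes only the counts given in the text; the dictionary keys, function names, and one-decimal rounding are choices made for this example.

```python
# (colonies with the Caucasian sequence, colonies sequenced), as stated in the Results.
CAUCASIAN_COUNTS = {
    "Posof and Savsat (Georgian border)": (14, 16),
    "Ardahan, Ardanuc, Artvin (20-40 km from border)": (10, 15),
    "Erzurum (about 240 km south of border)": (2, 7),
    "Lake Van region (Van, Mus, Bitlis)": (3, 12),
}


def percent(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, 1)


def caucasian_frequencies(counts=CAUCASIAN_COUNTS) -> dict:
    """Percentage of colonies carrying the Caucasian sequence in each region."""
    return {region: percent(c, n) for region, (c, n) in counts.items()}


if __name__ == "__main__":
    for region, pct in caucasian_frequencies().items():
        print(f"{region}: {pct}%")
```

With one-decimal rounding, 10 of 15 colonies comes out as 66.7% and 2 of 7 as 28.6%; the text reports these as 66% and 29%, so the published figures appear to use whole-number percentages.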

The Journal of Heredity 2000:91(1), Brief Communications

Table 3. Frequency of three noncoding mitochondrial sequences ("Anatolian," "Caucasian," and "Syrian") in Turkish honeybees

                    Anatolian        Caucasian        Syrian
Locality            Number  Percent  Number  Percent  Number  Percent
Tekirdag                 7      100       0        0       0        0
Gökçeada                 1      100       0        0       0        0
Bursa                    1      100       0        0       0        0
Bolu                     4      100       0        0       0        0
Menemen                  2      100       0        0       0        0
Beypazari                7      100       0        0       0        0
Giresun                  4      100       0        0       0        0
Ardahan                  0        0       5      100       0        0
Posof and Savsat         2     12.5      14     87.5       0        0
Artvin                   2       40       3       60       0        0
Ardanuç                  3       60       2       40       0        0
Erzurum                  5       71       2       29       0        0
Bitlis                   3       50       3       50       0        0
Van                      4      100       0        0       0        0
Mus                      2      100       0        0       0        0
Hatay (a)                4       50       0        0       4       50

Anatolian and Caucasian match published A. m. carnica and A. m. caucasica sequences, respectively (Garnery et al. 1992); Syrian is found in bees with the Middle Eastern restriction site pattern (Tables 1 and 2).
(a) Four of each restriction site pattern (Table 2) were selected for sequencing.

Discussion

Our restriction site and sequence data combined show four mitochondrial variants in Turkey. Three of the four haplotypes belong to the eastern Mediterranean mitochondrial lineage; the fourth, found only in samples from Hatay, does not correspond to any of three previously reported mitochondrial lineages.

The most common and widespread mtDNA haplotype in our samples has the Anatolian intergenic sequence and the restriction site pattern "eastern Mediterranean type 1" (Table 1). In Thrace (European Turkey) a haplotype with the Anatolian intergenic sequence and the restriction site pattern "eastern Mediterranean type 2" occurs at high frequency. Since this restriction site pattern is also found among A. m. carnica from Austria, Slovenia, and Croatia (Meixner et al. 1993; Smith and Brown 1990), it suggests maternal gene flow among the bees of Thrace, the Balkans, and southern Austria. The fact that this restriction site was not found in any bees from Anatolia suggests there may be a barrier to maternal gene flow between these two regions, though more sampling, especially in northwest Anatolia, is needed to test this.

A third haplotype has the eastern Mediterranean (type 1) restriction site pattern and the Caucasian intergenic sequence. This matches the mitochondrial haplotype previously described from A. m. caucasica (Garnery et al. 1992; Smith 1988). We found this sequence in high frequency near the Georgian border and in lower frequency in Erzurum and the region around Lake Van.

The "homeland" of A. m. caucasica is in the Caucasus mountains, southern valleys of the Caucasus, and the higher reaches of the Little Caucasus mountains (Ruttner 1988), primarily in [...] and neighboring republics. The full extent of the range of this bee is not clear. Ruttner states that bees resembling A. m. caucasica occur along the Black Sea coast of Anatolia as far as Samsun (Ruttner 1988; pp. 178, 192–198), but so far we have no evidence of the Caucasian sequence along the southern Black Sea coast, though our samples from this region are admittedly small.

The Caucasian sequence is found in Erzurum, significantly south of the proposed range of A. m. caucasica. If the intergenic sequence described by Garnery et al. (1992) for A. m. caucasica is indeed characteristic of the entire subspecies, then our results indicate a wide zone of interaction between A. m. caucasica and A. m. anatoliaca, at least from Lake Van to the Georgian border (Figure 2). This could be due either to natural gene flow and dispersal or to transportation of A. m. caucasica by humans. More extensive collections from the northwestern area of Turkey and the heart of the A. m. caucasica range in Georgia would enable us to determine if the intergenic sequence we call "Caucasian" is actually characteristic of all or most A. m. caucasica. Nothing in our mtDNA data suggested that the bees from the Lake Van region, supposedly in the range of A. m. meda, were in any way distinct from other Anatolian populations.

The fourth haplotype, which is characterized by a novel restriction site pattern and intergenic sequence, does not match any of the three previously documented honeybee mitochondrial lineages and may constitute a fourth lineage. This was found in high frequency (roughly 50%) in bees from Hatay. This southern city lies at the northern edge of the proposed range of A. m. syriaca: along the eastern coast of the Mediterranean north of the Negev desert, crossing parts of Israel, Jordan, Syria, and Lebanon. Little genetic work has been carried out on the native honeybees of the Middle East, principally because imported A. m. ligustica, A. m. carnica, and other races have largely replaced the native honeybee where modern apiculture is practiced (Lensky Y, personal communication to D.R.S.). Our data suggest that a fourth mitochondrial lineage occurs among Middle Eastern honeybees. If this were so, it could change our ideas about the relationships among mitochondrial lineages and the biogeography of A. mellifera.

Earlier morphometric studies (summarized in Ruttner 1988) indicated four subspecies groups: M, C, A, and O or Oriental. Ruttner (1988) pointed out that the morphometrically based groups do not necessarily reflect phylogenetic relationships, and we have found that the match between morphometric subspecies groups and mtDNA lineages is not exact (e.g., see Smith 1991a). For example, the morphometric C branch includes the subspecies A. m. carnica and A. m. ligustica, a grouping also supported by mtDNA data, as both these subspecies typically carry mtDNA belonging to the eastern Mediterranean lineage. However, the morphometric O branch includes A. m. caucasica, A. m. anatoliaca, and A. m. syriaca, a grouping not supported by mtDNA data. The mtDNA typically carried by A. m. caucasica and A. m. anatoliaca also belongs to the eastern Mediterranean lineage. We do not have any verifiable samples of A. m. syriaca, but our data show that some bees of the Middle East have a distinctive mtDNA haplotype.

An advantage of mitochondrial data over morphometric data is that DNA sequence data can be analyzed easily in a phylogenetic context. This is desirable for inferring the relationships among populations and for inferring the history of population movements. A drawback of mtDNA is that it is inherited uniparentally. When formerly isolated populations of honeybees come into contact, whether through range expansion or human transportation, mating between members of different populations can lead to introgression of mtDNA haplotypes into new populations. This is probably one source of discordance between morphometric and mitochondrial datasets. However, mtDNA preserves information on the relatedness of queens and queen lines, and it is an excellent source of data for inferring the history and [...] of A. mellifera.

On a practical note, honeybee subspecies and ecotypes differ in many physiological, ecological, and behavioral characters that make them particularly well-suited to their local environments. The variety of subspecies and ecotypes provides ample genetic variation for the selective breeder. Brother Adam, a respected (even revered) honeybee breeder and bee biologist, visited Turkey in 1954 and 1962 in the course of his quest for honeybees endowed with qualities desired by professional beekeepers. In his publications describing his travels he praised the excellent qualities of the Anatolian bees (Adam 1954, 1964, 1977). Genetic studies of Turkish honeybee populations will help in understanding their geographic variation and may aid in maintaining and utilizing their genetic diversity.

From the Department of Entomology, Haworth Hall, University of Kansas, Lawrence, KS 66045 (Palmer and Smith) and Çukurova Üniversitesi, Ziraat Fakültesi, Adana, Turkey (Kaftanoğlu). Work in Turkey was supported by NATO Science for Stability program TU-POLLINATION Project to O. Kaftanoğlu. Work at the University of Kansas was supported in part by a grant to O. R. Taylor, Jr., and D. Smith from the U.S. Department of Agriculture competitive grants program. We thank Dr. Ferat Genç, Dr. Ahmet Güler, and Hakan Kaftanoğlu for their help in collecting samples, and Özgen Aksu, Necati Dikilitaş, Ahmet Inci, and Dr. Ali Ihsan Öztürk for their help in locating suitable colonies. We thank Dr. Harrington Wells (Tulsa University) and Dr. Ibrahim Çakmak (Uludağ University, Bursa) for samples from Bursa and Giresun. We also thank Cengiz Erdem (Yüzüncü Yil University, Van) for providing samples from Van, Bitlis, and Muş, and Dr. Nuray K. Şahinler for providing bees from Hatay. J. Therrien and two anonymous reviewers provided valuable comments on the manuscript. Address correspondence to Deborah R. Smith at the address above or e-mail: deborahsmith@ukans.edu.

© 2000 The American Genetic Association

References

Adam, 1954. In search of the best strains of bees: second journey. Bee World 35:193–203, 233–244.
Adam, 1964. In search of the best strains of bee: concluding journeys. Bee World 45:70–83, 104–118.
Adam, 1977. In search of the best strains of bee: supplementary journey to Asia Minor, 1973. Bee World 58:57–66.
Arias MC and Sheppard WS, 1996. Molecular phylogenetics of honey bee subspecies (Apis mellifera L.) inferred from mitochondrial DNA sequences. Mol Phylog Evol 5:557–566.
Asal S, Kocabaş S, Elmaci C, and Yildiz MA, 1995. Enzyme polymorphism in honey bee (Apis mellifera L.) from Anatolia. Turk J Zool 19:153–156.
Cornuet J-M, Garnery L, and Solignac M, 1991. Putative origin and function of the intergenic region between COI and COII of Apis mellifera L. mitochondrial DNA. Genetics 128:393–403.
Crozier YC, Koulianos S, and Crozier RH, 1991. An improved test for Africanized honeybee mitochondrial DNA. Experientia 47:968–969.
Garnery L, Cornuet J-M, and Solignac M, 1992. Evolutionary history of the honey bee Apis mellifera inferred from mitochondrial DNA analysis. Mol Ecol 1:145–154.
Garnery L, Solignac M, Celebrano G, and Cornuet J-M, 1993. A simple test using restricted PCR-amplified mitochondrial DNA to study the genetic structure of Apis mellifera L. Experientia 49:1016–1021.
Güler A, 1996. Turkiye'deki onemli balarisi (Apis mellifera L.) irk ve ekotiplerinin morfolojik ozelliklerinin belirlenmesi ve performanslarinin saptanmasi [Morphometric characteristics and performances of honeybee (Apis mellifera L.) races and ecotypes in Turkey] (PhD dissertation). Çukurova Üniversitesi Fen Bilimleri Enstitüsü, Adana, Turkey.
Güler A, Kaftanoğlu O, Bek Y, and Yeninar H, 1999. Turkiye'deki cesitli balarisi (Apis mellifera) irk ve ekotiplerinin morfolojik karakterler acisindan iliskilerinin diskriminant analiz yontemi ile saptanmasi. TURBITAK Doga 23:337–344.
Hall HG and Smith DR, 1991. Distinguishing African and European honey bee matrilines using amplified mitochondrial DNA. Proc Natl Acad Sci USA 88:4548–4552.
Kandemir I and Kence A, 1995. Allozyme variability in a central Anatolian honeybee (Apis mellifera L.) population. Apidologie 26:503–510.
Lee ML and Hall HG, 1996. Identification of mitochondrial DNA of Apis mellifera (Hymenoptera: Apidae) subspecies groups by multiplex allele-specific amplification with competing fluorescent-labeled primers. Ann Entomol Soc Am 89:20–27.
Meixner MD, Sheppard WS, and Poklukar J, 1993. Asymmetrical distribution of a mitochondrial DNA polymorphism between 2 introgressing honey bee subspecies. Apidologie 24:147–153.
Moritz RFA, Cornuet J-M, Kryger P, Garnery L, and Hepburn HR, 1994. Mitochondrial DNA variability in South African honeybees (Apis mellifera L.). Apidologie 25:169–178.
Ruttner F, 1988. Biogeography and taxonomy of honey bees. Berlin: Springer-Verlag.
Saiki RK, Scharf S, Faloona F, Mullis KB, Horn GT, Erlich HA, and Arnheim N, 1985. Enzymatic amplification of β-globin genomic sequences and restriction site analysis for diagnosis of sickle cell anemia. Science 230:1350–1354.
Sheppard WS, Arias MC, Grech A, and Meixner MD, 1997. Apis mellifera ruttneri, a new honey bee subspecies from Malta. Apidologie 28:287–293.
Sheppard WS, Rinderer TE, Meixner MD, Yoo HR, Stelzer JA, Schiff NM, Kamel SM, and Krell R, 1996. HinfI variation in mitochondrial DNA of Old World honey bee subspecies. J Hered 87:35–40.
Smith DR, 1988. Mitochondrial DNA polymorphisms in five Old World subspecies of honey bees and in New World hybrids. In: Africanized honey bees and bee mites (Needham GR, Page RE, Delfinado-Baker M, and Bowman CE, eds). Chichester: Ellis Horwood; 303–312.
Smith DR, 1991a. African bees in the Americas: insights from biogeography and genetics. Trends Ecol Evol 6:17–21.
Smith DR, 1991b. Mitochondrial DNA and honey bee biogeography. In: Diversity in the genus Apis (Smith DR, ed). Boulder, CO: Westview Press; 131–176.
Smith DR and Brown WM, 1990. Restriction endonuclease cleavage site and length polymorphisms in mitochondrial DNA of Apis mellifera mellifera and A. m. carnica (Hymenoptera: Apidae). Ann Entomol Soc Am 83:81–88.
Smith DR, Palopoli MF, Taylor BR, Garnery L, Cornuet J-M, Solignac M, and Brown WM, 1991. Geographic overlap of two classes of mitochondrial DNA in Spanish honey bees (Apis mellifera iberica). J Hered 82:96–100.
Smith DR, Slaymaker A, Palmer M, and Kaftanoğlu O, 1997. Turkish honey bees belong to the east Mediterranean mitochondrial lineage. Apidologie 28:269–274.

Received May 26, 1998
Accepted August 25, 1999
Corresponding Editor: Robert Wayne

Classifying Genealogical Origins in Hybrid Populations Using Dominant Markers

L. M. Miller

In hybrid studies, potential for error is high when classifying genealogical origins of individuals (e.g., parental, F1, F2) based on their genotypic arrays. For codominant markers, previous researchers have considered the probability of misclassification by genotypic inspection and proposed alternative maximum-likelihood approaches to estimating genealogical class frequencies. Recently developed dominant marker systems may significantly increase the number of diagnostic loci available for hybrid studies. I examine probabilities of classification error based on the number of dominant loci. As in earlier studies, I assume that only parental and first- and second-generation hybrid crosses between two taxa potentially exist. Thirteen loci with dominant expression from each parental taxon (i.e., 26 total loci) are needed to reduce classification error below 5% for F2 individuals, compared to 13 codominant loci for the same error rate. Use of loci in similar numbers from both taxa most efficiently increases power to characterize all genealogical classes. In contrast, classification of backcrosses to one parental taxon is wholly dependent on loci from the other taxon. Use of dominant diagnostic markers may increase the power and expand the use of maximum-likelihood methods for evaluating hybrid mixtures.

Nason and Ellstrand (1993) and Epifanio and Philipp (1997) addressed theoretical aspects of using diagnostic codominant markers to estimate frequencies of genealogical classes in hybridized populations. Their work was inspired by recognition of the great potential for classification error when genealogical origins of individuals are assigned based

46 The Journal of Heredity 2000:91(1)

on their genotypic arrays (genotypic inspection) (Avise and van den Avyle 1984; Campton 1990). When hybrids survive past the first generation (i.e., F1 hybrids successfully reproduce), error arises because genotypic arrays of second- and later-generation crosses overlap with those of parental and first-generation crosses. Assignment based on genotype inspection can thus lead to incorrect conclusions about the proportions of various crosses (genealogical classes) in a hybrid mixture. Nason and Ellstrand (1993) considered this error and suggested the use of maximum-likelihood methods, which produce unbiased estimates of hybrid class frequencies. Epifanio and Philipp (1997) expanded this work to quantify the extent of classification error based on the number of diagnostic codominant loci. With the advent of several new dominant marker systems having great potential for identifying diagnostic loci, it is important to consider classification error for the case of dominant markers.

Recently researchers have developed several new types of polymerase chain reaction (PCR)-based DNA marker systems that are typically scored assuming dominant inheritance [e.g., random amplified polymorphic DNA (RAPD) (Welsh and McClelland 1990; Williams et al. 1990), AFLP (Vos et al. 1995), inter-simple-sequence repeats (ISSR) (Gupta et al. 1994; Zietkiewicz et al. 1994), SINE-PCR (Greene and Seeb 1997)]. Dominant markers are characterized by the presence or absence of bands on a gel, with each band considered a locus. The copy number of alleles producing the band (one or two) cannot be determined, so band-present homozygotes cannot be distinguished from heterozygotes. This results in less genetic information per locus than codominant markers when applied to questions of population genetic structure, paternity, and hybridization (Fritsch and Rieseberg 1996).

The loss of information due to dominance is countered by positive attributes that make dominant marker systems useful for hybrid analyses. Each technique produces numerous bands, so that single reactions can screen multiple loci for diagnostic bands. By varying PCR primers within marker systems, and by combining systems, essentially unlimited numbers of loci can be generated. Therefore dominant markers may provide for faster and more efficient discovery of diagnostic loci compared to other techniques (e.g., allozymes, microsatellites, and introns). Diagnostic loci at varying taxonomic levels have been documented for RAPD (Crawford et al. 1993), AFLP (Beismann et al. 1997), and ISSR (Wolfe et al. 1998). Diagnostic loci will be more prevalent for interspecies hybrids, but these techniques should also increase the likelihood of finding sufficient loci for intraspecific hybrid studies (i.e., subspecies, populations). For example, Williams et al. (1998) screened 17 RAPD primers and found three of them that produced 15 markers (bands) that distinguish two subspecies of largemouth bass (Micropterus salmoides). In contrast, only 2 of 28 allozyme loci were diagnostic for these subspecies (Philipp et al. 1983). Boecklen and Howard (1997) assessed the use of both codominant and dominant markers in hybrid studies, but restricted their analysis to repeated backcrossing to one parental taxon. I now consider the use of dominant markers under the model of Nason and Ellstrand (1993), which includes all first- and second-generation hybrids. I focus primarily on the issue of genealogical misclassification addressed by Epifanio and Philipp (1997). An understanding of the approaches taken by these authors for codominant markers is required first.

Codominant Markers

The model of Nason and Ellstrand (1993) assumes that a population consists of only six genealogical classes. Two parental taxa, A and B, and first- and second-generation products of mating between them, produce the six classes: P1 and P2 (crosses within parental taxa), F1 (P1 × P2), BP1 and BP2 (backcrosses F1 × P1 and F1 × P2, respectively), and F2 (F1 × F1). No advanced-generation crosses exist [e.g., F3 (F2 × F2) or BC-2 (BP1 × P1)]. When taxa-specific diagnostic alleles at codominant loci are identified for two parental taxa, the multilocus genotype of each individual in a hybrid population can be assigned to one of seven mutually exclusive and exhaustive genotypic categories. These authors considered the general case in which, at the population level, loci have some diagnostic alleles and some alleles shared by both parental taxa. One genotypic category, called "ambiguous," consists of individuals that share alleles at all loci; this condition is not possible when at least one locus is diagnostic (i.e., all alleles are taxa specific). When diagnostic loci are present, only six categories are possible.

For codominant, diagnostic loci, Epifanio and Philipp (1997) described six genotypic categories. Genotypic category A consists of homozygotes for taxon A alleles at all loci (typical of the P1 class); category B consists of homozygotes for taxon B alleles at all loci (typical of the P2 class); category H consists of heterozygotes for A and B alleles at all loci (typical of the F1 class); categories AI and BI consist of homozygotes for one, but not both, taxon, and at least one heterozygote (typical of the BP1 and BP2 classes, respectively); and category S consists of at least one homozygote for each taxon (restricted to the F2 class). All genotypic categories except S can contain members of multiple genealogical classes; category S is the only category restricted to a single class and acts as a signature for those F2 individuals (Epifanio and Philipp 1997). Using the category definitions above, Epifanio and Philipp (1997) determined the expected distribution of category assignments for the six genealogical classes based on the number of codominant, diagnostic loci (see their Table 2). This allowed them to quantify the expected misclassification error when mixed hybrids are assigned to genealogical classes based on genotype inspection.

Dominant Markers

With diagnostic dominant markers (i.e., fixed for presence in one taxon and fixed for absence in the other taxon), the genotypic categories described above must be redefined because heterozygous loci cannot be distinguished from homozygous dominant-taxon loci. Therefore multilocus phenotypes, based on band presence or absence, must be considered. For loci present in taxon A (LA), individuals from the P1, F1, and BP1 classes produce bands at all loci and a phenotypic category, call it presence (+), incorporates the codominant categories A, H, and AI. The P2 class does not produce bands at any loci and is categorized as absence (o), corresponding to the codominant category B. Finally, both the BP2 and F2 class are typified by individuals having at least one locus with bands and at least one without (+/o). This category corresponds to BI and S. As observed in the codominant case, BP2 and F2 can also produce some individuals with + or o for all loci, creating a potential for misclassification. For loci present in taxon B (LB), category + incorporates codominant categories B, H, and BI; category o corresponds to A; and category +/o includes AI and S.

One can determine the expected proportion of each genealogical class having multilocus phenotypes characteristic of phenotypic categories (+, o, or +/o) based on the number of dominant, diagnostic loci from a single taxon. A critical

Brief Communications 47

conclusion is that when multiple dominant loci are all present in a single taxon, that is, all LA or all LB, there is not a one-to-one correspondence between genealogical classes and phenotypic categories "typical" of those classes. An infinite number of LA loci can do no better than categorize P1, F1, and BP1's as A, H, or AI (+), and BP2 and F2's as BI or S (+/o). Also, there is no category that is restricted to a single genealogical class, so that no individual can be definitively classified based on its category (i.e., a signature).

When dominant loci from each taxon are combined, however, the one-to-one correspondence of categories to classes and the F2 signature of the S category are recovered. Six mutually exhaustive phenotypic categories can be defined: category A = all LA + and all LB o; category B = all LA o and all LB +; category H = all LA and LB +; category AI = all LA + and at least one LB + and at least one LB o; BI = at least one LA + and at least one LA o and all LB +; and category S = at least one LA o and at least one LB o. Then, as for codominant loci, the expected frequencies of phenotypes in each genealogical class can be calculated based on transmission probabilities from taxon A and B assuming Mendelian inheritance of unlinked loci (Table 1) (Nason and Ellstrand 1993). Note that only the F2 class can contain individuals "absent" (o) for both LA and LB loci, again making the S category a signature for F2 individuals.

Misclassifying Genealogical Origins: Dominant versus Codominant Markers

Epifanio and Philipp (1997) highlighted the potential for error when assigning offspring in a hybrid mixture to a genealogical class solely by genotypic inspection using codominant markers. They specifically examined the probability of misclassifying F2's, the class most likely to be incorrectly assigned (Figure 1 in Epifanio and Philipp 1997). I examined the power of dominant markers to classify F2's (Table 2). Based on the probability of assigning an F2 individual to category S, and thus correctly concluding that it is an F2, there is an approximate equivalence between two dominant loci, one from each taxon, and a single codominant locus. Thirteen pairs of dominant loci (i.e., 26 total loci) are needed to categorize F2's as S with probability greater than .95, equaling the number of single codominant loci needed (Epifanio and Philipp 1997). With fewer loci, and lower power, pairs of dominant loci result in higher probabilities of correct assignment than equivalent numbers of single codominant loci. This is because any two dominant loci from alternate taxa can categorize an individual as an S if they are both phenotype o. No single codominant locus is sufficient because two homozygotes, one from each taxon, must be observed to assign to category S. This power difference dissipates as the number of loci increases (Table 2).

Efficiency (power per locus) of dominant markers to classify F2's is best increased by having similar numbers of loci from each taxon (Table 2). With a single locus from one of the taxa and an infinite number from the other, only 0.25 of F2's are expected to be placed into category S. Even maintaining the infinite number of loci from one taxon, 11 are needed from the other to increase power to greater than 0.95. Less extreme marker differences also demonstrate this principle (Table 2). Contrast this with the probability of assigning backcross classes, BP1 and BP2, into their "typical" categories, AI and BI. Table 1 shows that the categorization of backcrosses to one parental taxon depends entirely on the number of loci from the other taxon (e.g., BP1 are categorized based on the number of LB) (see also Boecklen and Howard 1997). This is because a backcross to one taxon will produce the + phenotype for all dominant loci from that taxon (Table 2), making the phenotypic distributions of classes P, F1, and BP indistinguishable. A locus from the other taxon is needed to separate these classes.

Implications

The main emphasis of this article has been to address sources of misclassifying genealogical origins of individuals in hybrid populations using dominant markers, in the manner of Epifanio and Philipp (1997). Their key conclusion was that many codominant loci are needed to minimize misclassification based on genotypic inspection and assignment of individuals, even if all loci are diagnostic. I have shown that more dominant than codominant loci are needed and that it is important that loci come from both parental taxa if F2's and backcrosses to both parental taxa may be present. My results reinforce support for using maximum-likelihood methods for estimating class contributions (Nason and Ellstrand 1993) as an alternative to assignment by genotypic inspection.

My results also provide the basis for applying maximum-likelihood methods to diagnostic dominant loci. Following the procedures of Nason and Ellstrand (1993), individuals are first placed into genotypic categories according to Table 1. The probability of observing a genotypic category is then equated to its observed frequency in the sample. Six linear equations for the genealogical class estimates can then be solved, which consist of category frequencies and conditional probabilities of assigning a multilocus phenotype (genotype) of an individual in a given class to a given genotypic category (based on Table 1). These equations were shown to produce unbiased estimates of genealogical class frequencies assuming that model assumptions were met, particularly that no

Table 1. Expected proportion of each genealogical class (P1, P2, ..., F2) having multilocus phenotypes characteristic of phenotypic categories (A, B, ..., S) for unlinked dominant loci with at least one locus fixed for presence in taxon 1 (LA) and one locus fixed for presence in taxon 2 (LB)

Phenotypic  Genealogical class
category    P1  P2  F1  BP1            BP2            F2
A           1   -   -   (1/2)^LB       -              (3/4)^LA (1/4)^LB
B           -   1   -   -              (1/2)^LA       (1/4)^LA (3/4)^LB
H           -   -   1   (1/2)^LB       (1/2)^LA       (3/4)^LA (3/4)^LB
AI          -   -   -   1 − 2(1/2)^LB  -              (3/4)^LA (4^LB − 3^LB − 1)/4^LB
BI          -   -   -   -              1 − 2(1/2)^LA  (3/4)^LB (4^LA − 3^LA − 1)/4^LA
S           -   -   -   -              -              [(4^LA − 3^LA)/4^LA][1 − (3/4)^LB]

Cells marked - are empty.

Table 2. Probability of assigning the multilocus phenotype of an F2 individual to phenotypic category S based on the number of unlinked dominant or codominant loci

Dominant                             Codominant
Number of loci
LA        LB        Total     S      Number of loci  S
1         1         2         0.06   1               0.00
3         3         6         0.33   3               0.28
6         6         12        0.68   6               0.66
9         9         18        0.86   9               0.85
13        13        26        0.95   13              0.95
16        10        26        0.93
19        7         26        0.86
22        4         26        0.68
25        1         26        0.25
11        Infinite  Infinite  0.96
1         Infinite  Infinite  0.25

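The linear-equation step described in the Implications section can be sketched in a few lines of Python. This is a minimal illustration under the stated model (only the six genealogical classes are present, and loci are unlinked and diagnostic); it is not the implementation of Nason and Ellstrand (1993), and all function names are hypothetical. With categories and classes ordered as in Table 1, the matrix of conditional probabilities is upper triangular, so back substitution is enough.

```python
from fractions import Fraction

CATEGORIES = ("A", "B", "H", "AI", "BI", "S")
CLASSES = ("P1", "P2", "F1", "BP1", "BP2", "F2")


def category_given_class(n_a, n_b):
    """Rows are categories, columns are classes; entries follow Table 1."""
    h, q, t = Fraction(1, 2), Fraction(1, 4), Fraction(3, 4)
    bp1, bp2 = h**n_b, h**n_a
    return [
        # P1 P2 F1 BP1          BP2          F2
        [1, 0, 0, bp1,         0,           t**n_a * q**n_b],                 # A
        [0, 1, 0, 0,           bp2,         q**n_a * t**n_b],                 # B
        [0, 0, 1, bp1,         bp2,         t**n_a * t**n_b],                 # H
        [0, 0, 0, 1 - 2 * bp1, 0,           t**n_a * (1 - t**n_b - q**n_b)],  # AI
        [0, 0, 0, 0,           1 - 2 * bp2, t**n_b * (1 - t**n_a - q**n_a)],  # BI
        [0, 0, 0, 0,           0,           (1 - t**n_a) * (1 - t**n_b)],     # S
    ]


def expected_category_freqs(class_freqs, n_a, n_b):
    """Category frequencies implied by given class frequencies."""
    m = category_given_class(n_a, n_b)
    return {
        cat: sum(m[i][j] * Fraction(class_freqs[cls])
                 for j, cls in enumerate(CLASSES))
        for i, cat in enumerate(CATEGORIES)
    }


def estimate_class_frequencies(category_freqs, n_a, n_b):
    """Equate observed category frequencies to their expectations and solve
    the six linear equations by back substitution."""
    m = category_given_class(n_a, n_b)
    c = [Fraction(category_freqs[cat]) for cat in CATEGORIES]
    f = [Fraction(0)] * 6
    for i in range(5, -1, -1):
        if m[i][i] == 0:
            # Happens with a single locus from one taxon: AI or BI is impossible.
            raise ValueError("need at least two loci from each taxon")
        rest = sum(m[i][j] * f[j] for j in range(i + 1, 6))
        f[i] = (c[i] - rest) / m[i][i]
    return dict(zip(CLASSES, f))
```

The sketch deliberately does not clamp the estimates to the interval from 0 to 1. As noted in the text, estimates outside that range are themselves informative, because they suggest that classes outside the model are present in the sample.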
advanced generation crosses are present (Nason and Ellstrand 1993).

The maximum-likelihood method of Nason and Ellstrand (1993) has limitations (Epifanio and Philipp 1997). First, genealogical class frequencies in populations, rather than individual class membership, are estimated. For example, any one individual in genotype category A could be a P1, BP1, or F2. Therefore the common question, "Is this individual unhybridized, past or present?" is not answered by this method. Only those individuals of the F2 class that fall into category S can be classified with certainty. Second, the model cannot accommodate crosses beyond the F2 generation. Individuals from advanced generations will have multilocus genotypes that place them in one of the six categories, thereby biasing estimates of genealogical class frequencies. An extreme example of this is continuous backcrosses to the same parent (e.g., taxon A). By the fifth backcross (BC-5), more than 66% of individuals will likely be placed into parent category A using 13 codominant loci or dominant loci LB (Boecklen and Howard 1997). The remaining individuals will almost certainly be placed into AI. These authors showed that more than 70 markers are needed before the probability of assigning a BC-5 to a hybrid class would exceed .95, although under this framework it would be incorrectly considered a first-generation backcross (BP1).

Despite these limitations, maximum-likelihood methods for estimating genealogical origins in mixed hybrid populations should still be useful. If hybridization is a recent event, for example, due to population transfers (e.g., stocking), or if successful hybridization is restricted to the first or second generation of interbreeding (Avise 1994 and references therein), then the model of Nason and Ellstrand (1993) is applicable. Furthermore, Nason and Ellstrand (1993) showed that their model will often suggest the presence of advanced-generation hybrids by producing impossible estimates (i.e., frequencies < 0 or > 1.0) for some classes. In this case, class frequency estimates will be biased but advanced-generation hybrids will be detected, which may often be the goal of the research. The potential to significantly increase the number of diagnostic loci using dominant marker techniques will increase the power of maximum-likelihood methods and should advance the use of molecular data in hybrid studies.

From the Department of Fisheries and Wildlife, University of Minnesota, 1980 Folwell Ave., St. Paul, MN 55108. I thank Wansuk Senanan for her contributions to this article and William Ardren, William Eldridge, Raymond Newman, and John Epifanio for comments on the manuscript. This work is the result of research sponsored by the Minnesota Sea Grant College Program supported by the NOAA Office of Sea Grant, U.S. Department of Commerce, project no. R/A-12, under grant no. NOAA-NA86-RG0033; journal reprint no. 459. The U.S. Government is authorized to reproduce and distribute reprints for government purposes, notwithstanding any copyright notation that may appear hereon. This is article 984410022 of the Minnesota Agricultural Experiment Station Scientific Journal Series. Address correspondence to Loren M. Miller at the address above or e-mail: [email protected].

© 2000 The American Genetic Association

References

Avise JC, 1994. Molecular markers, natural history, and evolution. New York: Chapman & Hall.
Avise JC and van den Avyle MJ, 1984. Genetic analysis of reproduction of hybrid white bass × striped bass in Savannah River. Trans Am Fish Soc 113:563–568.
Beismann H, Barker JHA, Karp A, and Speck T, 1997. AFLP analysis sheds light on distribution of two Salix species and their hybrid along a natural gradient. Mol Ecol 6:989–993.
Boecklen WJ and Howard DJ, 1997. Genetic analysis of hybrid zones: numbers of markers and power of resolution. Ecology 78:2611–2616.
Campton DE, 1990. Application of biochemical and molecular markers to analysis of hybridization. In: Electrophoretic and isoelectric focusing techniques in fisheries management (Whitmore DH, ed). Boca Raton, FL: CRC Press; 241–264.
Crawford DJ, Brauner S, Cosner MB, and Stuessy TF, 1993. Use of RAPD markers to document the origin of the intergeneric hybrid Margyracaena skottsbergi (Rosaceae) on the Juan Fernandez Islands. Am J Bot 80:89–92.
Epifanio JM and Philipp DP, 1997. Sources for misclassifying genealogical origins in mixed hybrid populations. J Hered 88:62–65.
Fritsch P and Rieseberg LH, 1996. The use of random amplified polymorphic DNA (RAPD) in conservation genetics. In: Molecular genetic approaches in conservation (Smith TB and Wayne RK, eds). New York: Oxford University Press; 54–73.
Greene BA and Seeb JE, 1997. SINE and transposon sequences generate high-resolution DNA fingerprints, "SINE prints," that exhibit faithful Mendelian inheritance in pink salmon (Oncorhynchus gorbuscha). Mol Mar Biol Biotechnol 6:328–338.
Gupta M, Chyi Y-S, Romero-Severson J, and Owen JL, 1994. Amplification of DNA markers from evolutionarily diverse genomes using single primers of simple-sequence repeats. Theor Appl Genet 89:998–1006.
Nason JD and Ellstrand NC, 1993. Estimating the frequencies of genetically distinct classes of individuals in hybridized populations. J Hered 84:1–12.
Philipp DP, Childers WF, and Whitt GS, 1983. A biochemical genetic evaluation of northern and Florida subspecies of largemouth bass. Trans Am Fish Soc 112:1–20.
Vos P, Hogers R, Bleeker M, Reijans M, van de Lee T, Hornes M, Frijters A, Pot J, Peleman J, Kuiper M, and Zabeau M, 1995. AFLP: a new technique for DNA fingerprinting. Nucleic Acids Res 23:4407–4414.
Welsh J and McClelland M, 1990. Fingerprinting genomes using PCR with arbitrary primers. Nucleic Acids Res 18:7213–7218.
Williams DJ, Kazianis S, and Walter RB, 1998. Use of random amplified polymorphic DNA (RAPD) for identification of largemouth bass subspecies and their intergrades. Trans Am Fish Soc 127:825–832.
Williams JGK, Kubelik AR, Livak KJ, Rafalski JA, and Tingey SV, 1990. DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res 18:6531–6535.
Wolfe AD, Xiang Q-Y, and Kephart SR, 1998. Assessing hybridization in natural populations of Penstemon (Scrophulariaceae) using hypervariable intersimple sequence repeat (ISSR) bands. Mol Ecol 7:1107–1125.
Zietkiewicz E, Rafalski A, and Labuda D, 1994. Genome fingerprinting by simple sequence repeat (SSR)-anchored polymerase chain reaction amplification. Genomics 20:176–183.

Received January 4, 1999
Accepted September 14, 1999
Corresponding Editor: Bruce S. Weir

Is There Really Natural Selection Affecting the l Frequencies (Long Hair) in the Brazilian Cat Populations?

M. Ruiz-Garcia

The scientific literature on cat genetics contains a presumed typical example of natural selection affecting l frequencies (long hair) in 16 Brazilian cat populations. It has been observed that the hotter and more tropical the climate in Brazil, the lower the values of l frequencies in the cat populations. Nevertheless, this study of some new cat populations in Latin America showed that all of them, independent of the climate, had high or very high l frequencies. I postulate that an alternative migrational-historical hypothesis exists that explains the correlation between the l frequencies and climate characteristics (which are correlated with the latitude) without using natural selection explanations concerning the appearance of the l allele in Brazil.

Seven of 12 genes coding for coat characters such as color, tabby, and length, as well as certain skeletal anomalies have been studied in many domestic cat populations (Felis catus) worldwide (e.g., Ahmad et al. 1980; Lloyd 1985; Lloyd and Todd 1989; Ruiz-Garcia 1991, 1994, 1997b; Ruiz-Garcia et al. 1995, 1998, 1999). These loci have significant spatial patterns in different areas of the world (Ruiz-Garcia 1994, 1997b), showing a close relationship to historical and commercial human migrations (Ruiz-Garcia and Alvarez 1996; Todd 1977). Artificial human selection has been shown to have had little influence on the genetic profiles of the stray cat populations studied in different cities. Clark (1975) showed, for instance, that the human taste for specific mutant colors was not reflected in the genetic profile of the cat population of Glasgow. Nevertheless, it has been particularly difficult to establish exact geographical patterns of and influences on one gene, l (long hair). Todd et al. (1974) put forward that "...populations which are unarguably related in other aspects, show great disparity in their l frequencies." At first glance one would think that this allele would be favorably selected for in cold climates, and conversely would be negatively selected for in hot climates. It would therefore correspond to our expectations to find high frequencies of l in populations in very cold climates, such as Leningrad [q(l) = 0.64] and Alma Ata [q(l) = 0.56] in the former Soviet Union, or Inverness [q(l) = 0.52] in Scotland, for example. It would be paradoxical, however, that populations in places such as Cyprus [q(l) = 0.50], Jericho, Israel [q(l) = 0.42], and Phoenix, Arizona [q(l) = 0.51], with very high recorded temperatures, would have high frequencies of l. They are, in fact, much higher than other populations with notably colder climates, such as Poznan [q(l) = 0], Bialowieza [q(l) = 0.21], and Wroclaw [q(l) = 0] in Poland or in Iceland [q(l) = 0.14]. This has motivated the study of this characteristic in different areas of the world by several investigators. Lloyd (1983, 1985), for example, showed that a significantly negative correlation existed between the average minimum temperatures in 35 populations from the Atlantic coast of North America during January (winter) and the frequencies of l (r = −0.44). However, there was no significant correlation (r = −0.08) with respect to the average maximum temperature in July (summer). Still, the most famous example where the existence of natural selection affecting l frequencies was postulated was that described by Watanabe (1984) in Brazil. The author observed that the hotter and more tropical the climate, the lower the values of q(l) for 16 cat populations studied in Brazil (r = −0.95), concluding that l was strongly unfavorable in tropical climates. This finding has been recognized as a clear example of natural selection affecting a genetic character for a morphological trait in cat populations [see, e.g., Klein (1993), Lloyd (1987), Lloyd and Todd (1989), and Todd and Lloyd (1984), among others]. However, the analysis of other Latin American cat populations has allowed me to postulate an alternative hypothesis that would explain the correlation between q(l) and climate characteristics according to latitude, leaving aside the possible existence of selection in Brazil.

Materials and Methods

In order to demonstrate the absence of natural selection affecting the locus L in cats in Brazil, the allele frequencies of seven loci controlling fur color, tabby, and length were studied in the cities of La Havana (n = 334) in Cuba; Bogotá (n = 1105), Ibagué (n = 147), Bucaramanga (n = 240), Cali (n = 257), and Pasto (n = 210) in Colombia; Santiago (n = 126) in Chile;

Table 1. Some mutant allele frequencies of European (Spain, Portugal, and Italy), Hispanic settlements in the United States, and Latin American cat populations (of Spanish and Portuguese origin)

Columns: population, n, then allele frequencies at loci O, a, tb, d, l, S, W

Spain
Barcelona (1989)  709  0.16  0.70  0.27  0.27  0.14  0.27  0.004
Granollers  158  0.17  0.71  0.16  0.08  0.21  0.23  0.003
Girona  159  0.22  0.67  0.24  0.12  0.15  0.29  0.003
L'Estartit  100  0.19  0.81  0.23  0.00  0.15  0.31  0.000
Llansa  75  0.26  0.74  0.15  0.24  0.00  0.20  0.000
R Castell (1989)  319  0.24  0.80  0.17  0.22  0.15  0.28  0.007
R Castell (1994)  228  0.20  0.78  0.26  0.26  0.13  0.27  0.009
Sitges  204  0.13  0.74  0.32  0.25  0.20  0.25  0.006
Vilanova  118  0.20  0.72  0.22  0.23  0.09  0.35  0.005
Tarragona  216  0.21  0.68  0.38  0.15  0.00  0.29  0.000
Benidorm (1990)  207  0.23  0.70  0.55  0.16  0.00  0.28  0.005
Alicante  335  0.23  0.78  0.50  0.24  0.00  0.31  0.006
Murcia  171  0.18  0.74  0.62  0.08  0.08  0.38  0.012
Cadiz  64  0.16  0.71  0.45  0.25  0.00  0.23  0.010
Mahon (Balearics)  475  0.30  0.80  0.18  0.38  0.12  0.14  0.003
Villacarlos (Balearics)  226  0.24  0.75  0.13  0.44  0.17  0.18  0.002
Ciudadela (Balearics)  510  0.21  0.73  0.22  0.35  0.08  0.27  0.004
P.Majorca (Balearics)  475  0.20  0.72  0.36  0.35  0.27  0.23  0.004
Ibiza (Balearics)  273  0.24  0.76  0.30  0.23  0.19  0.29  0.010
Vigo  228  0.23  0.80  0.36  0.14  0.19  0.35  0.004
Santiago Compostela  120  0.24  0.75  0.40  0.16  0.16  0.37  0.000
Tenerife (Canary)  146  0.29  0.80  0.32  0.34  0.25  0.41  0.003
Puerto Cruz (Canary)  126  0.16  0.83  0.33  0.38  0.13  0.32  0.000

Portugal and Islands
Lisbon  371  0.07  0.65  0.45  0.27  0.09  0.21  0.020
Porto  256  0.14  0.77  0.43  0.30  0.23  0.29  0.010
Terceira ()  160  0.31  0.66  0.37  0.34  0.16  0.45  0.030
Faial/Pico (Azores)  109  0.25  0.57  0.38  0.34  0.17  0.34  0.010
S. Miguel (Azores)  159  0.23  0.63  0.50  0.34  0.11  0.29  0.000
133  0.18  0.78  0.35  0.41  0.00  0.39  0.010
Mindelo (C Verdes)  206  0.12  0.77  0.60  0.19  0.00  0.34  0.000
Praia (C Verdes)  168  0.23  0.83  0.47  0.11  0.00  0.44  0.010

Italy
Rome  480  0.09  0.66  0.49  0.34  0.10  0.31  0.010
Venice (1991)  145  0.10  0.56  0.27  0.34  0.19  0.20  0.014
San Remo  148  0.04  0.58  0.48  0.32  0.29  0.27  0.000
Rimini  518  0.13  0.68  0.38  0.41  0.22  0.26  0.012
Riccione  130  0.11  0.59  0.44  0.42  0.27  0.28  0.008

Hispanic settlements in the United States
Denver (Colorado)  286  0.20  0.84  0.26  0.38  0.35  0.29  0.010
Lubbock (Texas)  265  0.31  0.79  0.36  0.33  0.44  0.24  0.000
Dallas (Texas)  311  0.25  0.67  0.27  0.32  0.44  0.18  0.000
Denton (Texas)  311  0.25  0.81  0.27  0.33  0.46  0.22  0.010
Mineral Wells (Texas)  311  0.31  0.73  0.34  0.29  0.52  0.20  0.000
Houston (Texas)  294  0.25  0.69  0.29  0.29  0.35  0.19  0.000
Richmond (California)  107  0.19  0.77  0.27  0.33  0.35  0.28  0.030
San Francisco (California)  195  0.27  0.79  0.33  0.32  0.36  0.31  0.030
Humboldt County (California)  238  0.27  0.75  0.51  0.35  0.30  0.22  0.030

Hispanic settlements in Latin America
Los Mochis (Mexico)  141  0.31  0.71  0.39  0.24  0.31  0.32  0.010
Mexico City (Mexico)  170  0.16  0.62  0.23  0.29  0.57  0.29  0.020
Caracas (Venezuela)  164  0.13  0.79  0.32  0.10  0.33  0.33  0.009
Willemstadt (Curacao)  151  0.14  0.84  0.19  0.19  0.28  0.34  0.020
Havana (Cuba)  334  0.30  0.72  0.24  0.14  0.62  0.39  0.022
Bogota (Colombia)  1105  0.19  0.86  0.18  0.35  0.34  0.21  0.009
Ibague (Colombia)  147  0.24  0.82  0.11  0.37  0.32  0.23  0.003
Bucaramanga (Colombia)  240  0.16  0.87  0.11  0.32  0.29  0.24  0.002
Cali (Colombia)  258  0.23  0.84  0.18  0.47  0.33  0.32  0.005
Pasto (Colombia)  210  0.20  0.82  0.10  0.37  0.41  0.29  0.000
Santiago (Chile)  126  0.13  0.76  0.42  0.51  0.59  0.33  0.036
Buenos Aires (1992)  295  0.27  0.82  0.31  0.45  0.40  0.28  0.021
Buenos Aires (1996)  675  0.21  0.79  0.29  0.43  0.41  0.29  0.016

a new sample of cats in Buenos Aires (n = 675) in Argentina; a new sample in Rio de Janeiro (n = 232) in Brazil; and two nonoverlapping samples in the Canary Islands (one from Lloyd 1989, unpublished, in Tenerife City, n = 146; one from Puerto de la Cruz, Tenerife Island, n = 126). Each population was extensively sampled to minimize local effects that could cause deviations in the allele frequencies. The cats sampled were alley cats, or "pseudowild." Previously reported samples that were included in this analysis were from Rio de Janeiro (Watanabe 1984) and Buenos Aires (Kajon et al. 1992). The phenotypes of the individuals were recorded from direct observation. The genetic nomenclature used is in accordance with the Committee on Standardized Genetic Nomenclature for Cats (1968). The genetic characteristics studied included the sex-linked gene [O, o; Orange (epistatic to the observation of the A locus) versus non-orange] and the nonlinked autosomal loci: A [A, a; agouti versus non-agouti (epistatic to the observation of the T locus)], T (t+, tb, Ta; striped or mackerel tabby versus blotched tabby versus Abyssinian tabby), D (D, d; nondilution versus dilution), L (L, l; short hair versus long hair), S (S, s; piebald white spotting versus non-white spotting), and W [W, w; dominant white (epistatic to all the other colors) versus normal color]. For the characteristics of these genes see Robinson (1977). The frequency of the allele orange was calculated using a differential equation (Ahmad et al. 1980). The autosomic recessive frequencies (q) were calculated as the square roots of the observed phenotypic frequencies, while the dominant frequencies (p) were taken as 1 − q.

The genetic relationships between Brazilian populations reported by Watanabe (1981, 1984) were analyzed in this study, matching them against other populations. The populations were grouped in two ways: (1) The first group consisted of 40 cat populations including the Latin American (both of Spanish and Portuguese origins), southwestern United States, and two Canary Island populations. (2) A second group included 80 populations, among them the 40 populations of the group just described as well as a group of North American cat populations from the United States and Canada (reported by Lloyd and Todd 1989) of probable British origin, and 20 European populations from the countries of origin of all these American populations (Kajon et al. 1992; Lloyd and Todd 1989; Ruiz-Garcia, 1990a–d, 1993, 1994, 1997b). In order to analyze the relationship between these cat populations, two kinds of analysis were carried out. The first was to obtain matrices of genetic distances between pairs of populations. The genetic distances used were four: the Nei standard genetic distance (Nei 1978), the Cavalli-Sforza and Edwards (1967) chord distance, the distance of Prevosti (1974), and the DA distance (Nei et al. 1983). With these matrices, different dendrograms were constructed in order to explain the overall genetic relationships between all of these American and European populations. The algorithms used were the UPGMA (Sneath and Sokal 1973), WPGMA (using the recommendation of Pamilo 1990), COMPLETE, and neighbor-joining (Saitou and Nei 1987). To determine the reliability of the trees generated, three statistical methods were applied: the interior branch and Rzhetsky and Nei (1992) tests (Li 1989), the Felsenstein (1985) bootstrap test, and the cophenetic correlation coefficient (Sneath and Sokal 1973). The trees that showed the best statistics for reliability are shown here. All of the analyses were performed both including and excluding the locus L to determine its influence on the relationships found between the populations studied. The second analysis was a canonical analysis of populations. This separates groups of populations along axes of high discrimination power using the Mahalanobis square distance, and is based on the fulfillment of two hypotheses: (1) that there is homogeneity between all covariance matrices corresponding to the population groups (maximum likelihood test), and (2) that the means of the k groups are significantly different [Wilks' Λ test, and the associate value of the Fisher–Snedecor F test by means of the approximation of Rao (1951)]. Subsequently, a canonical transformation, the eigenvalues, the significance of the first canonical axes with the Bartlett's test, and the radius of the confidence regions (for a 90% level) were calculated. In this canonical population analysis the following groups were employed: (1) Buenos Aires (two samples), (2) Mexico (three samples), (3) Venezuela and Curacao (two samples), (4) southern Brazil (six samples), (5) northern Brazil (10 samples), (6) Colombia (five samples), (7) Canary Islands (two samples), and separately, La Havana, Santiago, and the second sample from Rio de Janeiro.

Results and Discussion

Table 1 and Figures 1–3 show the basis for the apparent association between the frequencies of l and climatic factors in Brazil without having to resort to the explanation of selection such as has been offered up until now. Figure 1a shows the analysis based on the application of the WPGMA algorithm with Cavalli-Sforza and Edwards chord distance, including the L locus for 40 populations. Figure 1b shows the tree derived from the neighbor-joining method with DA distance, including the L locus for 40 populations. Figure 1c shows the result from the COMPLETE algorithm with Cavalli-Sforza and Edwards chord distance, without the L locus for 40 populations. They show clearly that upon analyzing the genetic relationships between the Latin

Table 1. Continued

Columns: population, n, then allele frequencies at loci O, a, tb, d, l, S, W

Brazil
Porto Alegre  489  0.16  0.68  0.26  0.28  0.27  0.31  0.020
Curitiba  327  0.17  0.71  0.25  0.21  0.27  0.35  0.020
Sao Paulo  1164  0.22  0.71  0.26  0.18  0.35  0.42  0.030
Rio de Janeiro (1984)  1545  0.24  0.74  0.32  0.27  0.20  0.38  0.030
Rio de Janeiro (1996)  232  0.14  0.74  0.23  0.30  0.31  0.37  0.012
Belo Horizonte  859  0.18  0.71  0.26  0.25  0.22  0.36  0.020
Campo Grande  512  0.19  0.64  0.27  0.20  0.24  0.29  0.020
Brasilia  492  0.26  0.63  0.28  0.19  0.18  0.31  0.010
Cuiaba  302  0.16  0.66  0.20  0.12  0.10  0.29  0.040
Salvador  959  0.20  0.57  0.42  0.14  0.16  0.47  0.020
Rio Branco  235  0.25  0.49  0.29  0.06  0.06  0.34  0.000
J. Norte  503  0.33  0.60  0.24  0.21  0.05  0.39  0.000
Teresina  994  0.18  0.69  0.33  0.26  0.01  0.35  0.030
Fortaleza  1254  0.20  0.66  0.27  0.29  0.08  0.50  0.020
S. Luis  1323  0.24  0.66  0.19  0.24  0.00  0.53  0.010
Manaus  993  0.18  0.68  0.32  0.06  0.06  0.47  0.030
Belem  909  0.22  0.65  0.25  0.14  0.05  0.42  0.010

Brief Communications 51

Figure 1. (a) WPGMA phenetic analysis from Cavalli-Sforza and Edwards (1967) chord distance of Hispanic settlements in the United States, Hispanic America, two Canary Island, and Brazilian cat populations (40 populations) with the inclusion of the L locus. Cophenetic correlation coefficient, r = 0.71; approximate Mantel t test: t = 8.79, P < .0000; out of 1,000 random permutations: one-tail probability is p[random Z > observed Z] = 0.001. (b) Neighbor-joining tree with DA distance (Nei et al. 1983) of 40 populations with the inclusion of the L locus. The numbers in the figure are the bootstrap (1,000) percentages. (c) COMPLETE phenetic analysis from Cavalli-Sforza and Edwards (1967) chord distance of 40 populations without the inclusion of the L locus. Cophenetic correlation coefficient, r = 0.69; approximate Mantel t test: t = 7.17, P < .0000; out of 1,000 random permutations: one-tail probability is p[random Z > observed Z] = 0.001. The dendrograms shown are those with the better cophenetic correlation coefficients, better percentages of Felsenstein bootstraps, and better Rzhestsky and Nei (1992) statistics.
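The allele-frequency estimates and two of the genetic distances named in the Methods can be illustrated with a short Python sketch. It is not the author's software: the function names are hypothetical, the formulas are the commonly cited textbook forms of the Prevosti and DA distances (the article itself does not print them), and every locus is treated as biallelic for simplicity, which is an assumption (the T locus has three alleles and O is sex-linked). The two frequency vectors are copied from Table 1.

```python
import math


def recessive_allele_frequency(n_recessive, n_total):
    """Estimate q as the square root of the recessive phenotype frequency."""
    if n_total <= 0 or not 0 <= n_recessive <= n_total:
        raise ValueError("need 0 <= n_recessive <= n_total and n_total > 0")
    return math.sqrt(n_recessive / n_total)


def biallelic(q):
    """Expand a mutant-allele frequency q into the pair (q, 1 - q).

    Assumption for illustration only: each locus has exactly two alleles.
    """
    return (q, 1.0 - q)


def prevosti_distance(pop_x, pop_y):
    """Mean over loci of half the summed absolute allele-frequency differences."""
    total = 0.0
    for qx, qy in zip(pop_x, pop_y):
        total += 0.5 * sum(abs(x - y) for x, y in zip(biallelic(qx), biallelic(qy)))
    return total / len(pop_x)


def da_distance(pop_x, pop_y):
    """One minus the mean over loci of sum_i sqrt(x_i * y_i)."""
    total = 0.0
    for qx, qy in zip(pop_x, pop_y):
        total += sum(math.sqrt(x * y) for x, y in zip(biallelic(qx), biallelic(qy)))
    return 1.0 - total / len(pop_x)


# Mutant-allele frequencies (O, a, tb, d, l, S, W) copied from Table 1.
porto_alegre = [0.16, 0.68, 0.26, 0.28, 0.27, 0.31, 0.020]
sao_luis = [0.24, 0.66, 0.19, 0.24, 0.00, 0.53, 0.010]

print("q for 25 recessive phenotypes in 100 cats:", recessive_allele_frequency(25, 100))
print("Prevosti distance:", round(prevosti_distance(porto_alegre, sao_luis), 4))
print("DA distance:", round(da_distance(porto_alegre, sao_luis), 4))
```

A matrix of such pairwise distances over all populations would be the input to the clustering algorithms (UPGMA, WPGMA, COMPLETE, neighbor-joining) used for the dendrograms.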

American cat populations, the Brazilian populations do not form a homogeneous group. A group of Brazilian populations was observed which showed more of a genetic similarity to the Hispanic populations, such as Caracas (Venezuela), Willemstadt (Curacao), and Los Mochis (Mexico), than to the other group of Brazilian populations. A second group had marked differences from the first, and had no particular resemblance to the Hispanic populations. The composition of these two groups of Brazilian populations is not geographically random, however. The group resembling the Hispanic populations is located in southern Brazil (Porto Alegre, Curitiba, Sao Paulo, and Rio de Janeiro). These populations have the highest l frequencies in Brazil (0.22–0.35). On the contrary, the other group of Brazilian populations was made up of coastal populations from northern Brazil and extended inland into the Amazon regions, namely Salvador, Sao Luis, Rio Branco, J. Norte, Teresina, Fortaleza, Manaus, and Belem, and is characterized by very low or null l frequencies (0–0.16). Populations from Campo Grande, Brasilia, Belo Horizonte, and Cuiaba are found to cluster differently with both groups with regard to the algorithmic techniques and genetic distances employed.

The same analysis excluding the L locus shows a similar perspective, although it is slightly less clear. The northern Brazilian populations (Salvador, Manaus, Rio Branco, Sao Luis, Fortaleza, and J. Norte) are less related to the Hispanic American populations than are the southern Brazilian populations (particularly Porto Alegre, Curitiba, and the two samples from Rio de Janeiro). A canonical analysis of populations is shown in Figure 2. This analysis showed a Wilks' Λ = 0.0002 and F = 4.63 with 70 and 100 df, compared with a critical value of F = 1.43 at α = 0.05. Consequently the hypothesis that the representative group means are equal was rejected as expected. The first two canonical axes explained 90.09% of the variability. All the Hispanic American groups were clearly related. The southern Brazilian group and the new sample from Rio de Janeiro were also highly related to the Hispanic American groups. To the contrary, the northern Brazilian and Amazon groups were isolated from the other groups analyzed.

Figure 2. Canonical analysis of populations. The first and second axes explain 90.09% of the variation.

The main point which allows a new nonselectionist hypothesis of an historic-migrational character is the following. Ruiz-Garcia (1997a,c) and Ruiz-Garcia et al. (1998, 1999) observed that a constant feature of the Hispanic American cat populations was the high, or very high, l frequencies [e.g., Bogotá (0.34), Buenos Aires (0.41), Mexico City (0.57), Santiago, Chile (0.59), and La Havana (0.62)]. These values are much higher than those found in Spanish populations, mostly with values of q(l) between 0 and 0.20 (Ruiz-García 1990c,d, 1991, 1994, 1997b). Since in the entire geographic range of the Hispanic American cat population, from California, Colorado, and Texas, to Argentina and Chile (more than 7000 km), the climatic characteristics are extremely diverse, such high and constant q(l) values cannot be attributed to the action of natural selection, either in favor or against, after the formation of the original Hispanic American populations. Neither could genetic drift explain the systematic occurrence of such high l frequencies in all of the Hispanic American populations. The Spaniards have a great admiration for long-haired cats, a trait with low frequencies in Spain (Ruiz-García M, unpublished observations). It is quite likely that during the period when the Americas were being colonized, the l frequencies were lower still. A migrational selection could therefore have occurred due to the novelty of this character (Todd 1977, 1978). This hypothesis is much more parsimonious than the existence of human selection a posteriori at a time when the populations in question in Latin America had become important from a demographic point of view. As shown by Anderson and Jenkins (1979), Morrill and Todd (1978), and Ruiz-García (1991), once the human population of a locality approaches 30,000, the cat population may be large enough to be refractory against changes in the allele frequencies.

The populations from southern Brazil resembled the Hispanic populations more than those of the rest of Brazil. This could have originated from the establishment of a gene flow of a certain magnitude between those Brazilian populations and populations of Spanish origin near the southern frontier of Brazil. The commerce between southern Brazil and the Hispanic colonies in Uruguay, Paraguay, and the area of Rio de La Plata in Argentina was very intense from 1713 onward. In fact, the Portuguese succeeded in setting up some commercial colonies on the Rio de La Plata. For instance, the first explorer in current Paraguay was the Portuguese Alejo García in 1525 searching for the "Silver Mountain." Later the area was conquered by the Spaniards Juan de Salazar y Espinosa, Alvaro Nuñez de Vaca, and Domingo Martinez de Irala. Juan de Salazar founded Asunción in 1537, and coming from Buenos Aires, Alvaro Nuñez and Domingo Martínez founded small colonies in the south of today's Brazil. In 1588 the Spanish Jesuits founded numerous "Missions" or "Reducciones," where they congregated hundreds of Guaraní Indian families and lodged Spanish colonists. They traveled throughout the Paraguay and Paraná rivers as far as the frontier of southern Brazil. The tremendous ability of the Guaraní Indians in various jobs encouraged the Portuguese to enslave entire populations, which were made to travel into southern Brazil (1690–1767). Also, numbers of poor Portuguese colonists in Paraguay and Uruguay followed the Spanish Missions along their way.

When the Jesuits were expelled from the Spanish Empire in 1767 the majority of Guaraní Indian families, poor Portuguese colonists, and one-sixth of the Spanish colonists in Paraguay emigrated to the state of Sao Paulo (southern Brazil) to work on the rice and maize plantations and in cattle herding. In Uruguay there was also a relationship between the Spanish and the Portuguese populations. The first settlements in Uruguay were established by the Portuguese in Colonia in 1680. Later Spain established another colony in this area, Montevideo in 1726, and the area was finally controlled by the Spanish in 1777. In 1811 and 1816, the Portuguese invaded a large portion of Uruguay and relations between this country and southern Brazil have been very important ever since.

There was another Hispanic–Portuguese connection in southern Brazil: In 1680–1710, a small and as yet undeveloped harbor, Rio de Janeiro, started to gain importance. The finding of gold was of key importance to this little harbor, and many Spanish ships arrived from the Canary Islands. By 1763 Rio de Janeiro had grown into a very important city. The Canary Islands were and remain a key part of the commercial routes between Spain and Latin America. During the last decades of the century, more than 25,000 ships sailed to these islands.

Figure 3. (a) UPGMA phenetic analysis with DA distance with 80 populations with the inclusion of the L locus. Cophenetic correlation coefficient, r = 0.62; approximate Mantel t test: t = 32.66, P < .0000; out of 1,000 random permutations: one-tail is p[random Z > observed Z] = 0.001. The numbers in the figure are the bootstrap (1,000) percentages. (b) WPGMA phenetic analysis from Prevosti (1974) genetic distance of 80 populations without the inclusion of the L locus. Cophenetic correlation coefficient, r = 0.68; approximate Mantel t test: t = 25.81, P < .0000; out of 1,000 random permutations: one-tail is p[random Z > observed Z] = 0.001. The dendrograms shown are those which offered better cophenetic correlation coefficients, better Felsenstein bootstraps, and better Rzhestsky and Nei (1992) statistics.

One more point is very important to better explain the Spanish relationship with Rio de Janeiro. In 1650 Potosí (Bolivia) was the second largest city of the Western World (160,000 inhabitants), after London. The reason was the mining of silver. The Spaniards arrived in Potosí following this route: southern Spain–Canary Islands–Dominican Republic–Panamá (Puerto de Dios, Porto Belo, and Panamá City)–Guayaquil (Ecuador)–Perú (Callao and Lima). Once the silver had been loaded in Potosí, one of the most important routes back to Spain was through Bolivia (previously called Alto Perú)–northern Argentinean deserts–the Argentinean cities of Salta, Jujuy, Córdoba, Tucumán, La Plata (in Spanish, silver) and finally, to the Port of Buenos Aires. All of these cities were founded


during the years of the silver trade. Many of the ships carrying silver sailed from Buenos Aires to Spain via Rio de Janeiro (Fernández de Oviedo 1944; Perrottet 1990). It could have been that this intense contact with Hispanic populations, in which the high frequencies of l had been generated by migrational selection, based on novelty, from their original populations in Spain, promoted an increase in the frequencies of l in the populations of southern Brazil.

In northern Brazil the process of colonization was, in contrast, very different. Discovered by the Portuguese Pedro Alvares Cabral in 1500, the most important populations founded were located in northern Brazil, with Salvador de Bahía being the first capital of Brazil (one of the cat populations studied). The important crop behind this development was sugar, for which the shoots for the plantations had been brought from the Madeira Islands (African coast). This activity was sustained for nearly two centuries, during which southern Brazil remained forgotten. The populations of the north coast and inland Brazil were not affected by the Hispanic influence, since their q(l) frequencies are null or very low. These Brazilian populations are therefore probably constituted almost exclusively of Atlantic Portuguese influences, coming from the Islands of Madeira, , and the Azores, where the l frequencies are null or very low. Alternatively they could have their origin in some continental Portuguese populations which have not yet been studied.

The tree displayed in Figure 3a shows the analysis based on the application of the UPGMA algorithm with DA distance, including the L locus, for 80 populations, and Figure 3b shows the corresponding WPGMA algorithm with Prevosti distance, excluding the L locus, for the 80 populations, revealing that some Spanish populations were mainly clustered with the populations from southern Brazil, especially some Catalan populations such as Barcelona, Sitges, and Girona, which had previously been postulated to be the Spanish populations with probable ancient genetic profiles (Kajon et al. 1992; Ruiz-Garcia 1988, 1990a–d, 1991, 1994). The southern Brazilian populations are therefore more similar to certain Spanish populations than to recently sampled Portuguese populations such as Lisbon and Porto, confirming the proposed hypothesis.

From the historical and genetic evidence gathered so far, the apparent correlation between the frequencies of l and geographic latitude in Brazil in fact reveals a geographical correlation with the migrational and historical factors that were involved in the formation of the colonial settlements in Brazil. This correlation therefore does not result from the occurrence of natural selection, but rather from the particular location of the migrational and historical relationships, which by chance, in this case, coincide with the location, according to latitude, of the Brazilian populations and their genetic constitution.

One more group of results supports the conclusion that natural selection does not necessarily act against the l allele in tropical populations: the frequency of the l allele in the cat population of La Havana (Cuba), a place of tropical climate, is one of the highest worldwide (0.62). Also, other recently sampled tropical Caribbean populations, such as Santo Domingo [Dominican Republic, q(l) = 0.534], Veracruz [Mexican Carib, q(l) = 0.472], and Acapulco [Mexico, q(l) = 0.441], have q(l) values greater than 0.40, which could not be expected under the selection hypothesis (Ruiz-Garcia M, unpublished data). Additional studies on the subject would, however, be necessary in order to explain why there is apparently no migrational selection favoring l in the case of the Portuguese, as seems to have occurred for the Spanish migration. It could have been that the l allele was not systematically present in some Portuguese cities which were the sources of cats during the colonization of the Americas. Another possibility would be that in Portugal there were at least two different gene pools, one very similar or identical to certain Spanish populations postulated to be ancient, and which were the origin of southern Brazilian populations studied, and another(s) of different characteristics, which might have been the origin of certain Portuguese islander and northern and Amazonian Brazilian cat populations. Only a thorough study of the Portuguese cat populations would allow a more precise historical hypothesis. I conclude that the l allele can in fact be favorably selected for in cold or very cold climates (Lloyd 1983, 1985), but its behavior is neutral in temperate, hot, or very hot climates, such as the tropics.

From Unidad Genética (Genética de Poblaciones-Biología Evolutiva), Departamento de Biología, Pontificia Universidad Javeriana, Cra 7a No 43-82, Bogotá DC, Colombia, and CIGEEM, Barcelona, Spain. The author thanks Diana Alvarez (Bogotá DC, Colombia) and Drs. A. Kajon and S. Diaz for their help. This study was partially supported by the Convenios no. 139-94 and no. 140-96 (Decreto 1742 de 1994) between COLCIENCIAS and the author. Address correspondence to the author at the address above or e-mail: [email protected]. edu.co.

© 2000 The American Genetic Association

References

Ahmad M, Blumenberg B, and Chaudhary MF, 1980. Mutant allele frequencies and genetic distance in cat populations of Pakistan and Asia. J Hered 71:323–330.

Anderson MM and Jenkins SH, 1979. Gene frequencies in the domestic cats of Reno, Nevada: confirmation of a recent hypothesis. J Hered 70:267–269.

Cavalli-Sforza LL and Edwards AWF, 1967. Phylogenetic analysis: models and estimation procedures. Evolution 21:550–570.

Clark JM, 1975. The effects of selection and human preference on coat colour gene frequencies in urban cats. Heredity 35:195–210.

Committee on Standardized Genetic Nomenclature for Cats, 1968. Standardized genetic nomenclature for the domestic cat. J Hered 59:39–40.

Felsenstein J, 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791.

Fernández de Oviedo J, 1944. Historia general y natural de las Indias. Ed. Guaranía. Asunción.

Kajon A, Centron D, and Ruiz-Garcia M, 1992. Gene frequencies in the cat population of Buenos Aires, Argentina, and the possible origin of this population. J Hered 83:148–152.

Klein K, 1993. Population genetics and gene geography. In: Genetikii koshkii (Ruvinki A and Borodin PM, eds). Moscow: Russian Academy of Science; 118–150.

Li WH, 1989. A statistical test of phylogenies estimated from sequence data. Mol Biol Evol 6:424–435.

Lloyd AT, 1983. Population genetics of domestic cats (Felis catus L.) in New England and the Canadian Maritime Provinces: an investigation of the historical immigration hypothesis (PhD dissertation). Boston: Boston University.

Lloyd AT, 1985. Geographic distribution of mutant alleles in domestic cat populations of New England and the Canadian Maritimes. J Biogeog 12:315–322.

Lloyd AT, 1987. Cats from history and history from cats. Endeavour 11:112–115.

Lloyd AT and Todd NB, 1989. Domestic cat gene frequencies. A catalogue and bibliography. Newcastle upon Tyne: Tetrahedron Publications.

Morrill RB and Todd NB, 1978. Mutant allele frequencies in the domestic cats of Denver, Colorado. J Hered 69:131–134.

Nei M, 1978. Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics 89:583–590.

Nei M, Tajima F, and Tateno Y, 1983. Accuracy of estimated phylogenetic trees from molecular data. Gene frequency data. J Mol Evol 19:153–170.

Pamilo P, 1990. Statistical tests of phenograms based on genetic distances. Evolution 44:689–697.

Perrottet T, 1990. South America. Singapore: Hofer Press.

Prevosti A, 1974. La distancia genética entre poblaciones. Misc. Alcobé. Publicacions de l'Universitat de Barcelona; 109–118.

Rao CR, 1951. Advanced statistical methods in biometric research. Darien, CT: Hafner Publishing.

Robinson R, 1977. Genetics for cat breeders. Oxford: Pergamon Press.

Ruiz-Garcia M, 1988. Frecuencias alélicas mutantes en una población de gatos domésticos urbanos (Barcelona) y en una población de gatos rurales (Castelldefels rural) en Cataluña, España. Genét Ibér 40:157–187.

Ruiz-Garcia M, 1990a. Frecuencias alélicas en la población de gatos domésticos de la isla de Menorca (Baleares): diferentes modelos de evolución colonizadora. Evol Biol 4:307–342.

Ruiz-Garcia M, 1990b. Frecuencias alélicas en la población de gatos domésticos de Palma de Mallorca e Ibiza y relaciones genéticas con otras poblaciones de gatos Europeos y Norteafricanos. Evol Biol 4:189–216.

Ruiz-Garcia M, 1990c. Mutant allele frequencies in domestic cat populations in Catalonia, Spain, and genetic relationships between Spanish and English colonial cat populations. Genetica 82:209–214.

Ruiz-Garcia M, 1990d. Mutant allele frequencies in domestic cat populations on the Spanish Mediterranean coast and genetic distances from other European and North African cat populations. Genetica 82:215–221.

Ruiz-Garcia M, 1991. Más sobre la genética de poblaciones de Felis catus en la costa Mediterránea Española: un análisis de la estructura genética de las poblaciones naturales de gatos. Evol Biol 5:227–283.

Ruiz-Garcia M, 1993. Analysis of the evolution and genetic diversity within and between Balearic and Iberian cat populations. J Hered 84:173–180.

Ruiz-Garcia M, 1994. Genetic profiles from coat genes of natural Balearic cat populations: an eastern Mediterranean and North African origin. Genet Sel Evol 26:39–64.

Ruiz-Garcia M, 1997a. Características genéticas de las poblaciones de gatos domésticos (Felis catus) en las Américas. I. Estructura espacial inducida por colonización. Brazil J Biol (in press).

Ruiz-Garcia M, 1997b. Genetic relationships among some new cat populations sampled in Europe: a spatial autocorrelation analysis. J Genet 76:1–24.

Ruiz-Garcia M, 1997c. Perfiles genéticos de las poblaciones de gatos domésticos de La Habana (Cuba) y de Bogotá (Colombia) y posibles origenes Europeos de esas poblaciones. I: no existencia de paralelismo con el modelo colonizador británico. Misc Zool (in press).

Ruiz-Garcia M and Alvarez D, 1996. The use of the domestic cat as an extragenic marker of the historical and commercial human movements. Brazil J Genet 19:184.

Ruiz-Garcia M, Barrera MI, and Alvarez D, 1998. Genetic relationships between Caribbean and South-American cat populations: confirmation of the "Historical Migration Hypothesis." Genet Sel Evol (submitted).

Ruiz-Garcia M, Campos HA, Alvarez D, Kajon A, and Diaz S, 1999. Mutant allele frequencies of cat populations in Latin America, Havana (Cuba), Bogotá (Colombia), Ibagué (Colombia), Santiago (Chile) and Buenos Aires (Argentina): genetic relationships with other American and European cat populations. J Hered (submitted).

Ruiz-Garcia M, Ruiz S, and Alvarez D, 1995. Perfiles genéticos de poblaciones de gatos domésticos (Felis catus) de la provincia de Girona (Catalunya, NE, España) y posibles relaciones genéticas con otras poblaciones europeas occidentales. Misc Zool 18:169–196.

Rzhestsky A and Nei M, 1992. A simple method for estimating and testing minimum-evolution trees. Mol Biol Evol 9:945–967.

Saitou N and Nei M, 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4:406–425.

Sneath PH and Sokal RR, 1973. Numerical taxonomy. San Francisco: W. H. Freeman.

Todd NB, 1977. Cats and commerce. Sci Am 237:100–107.

Todd NB, 1978. An ecological, behavioural genetic model for the domestication of the cat. Carnivore 1:52–60.

Todd NB and Lloyd AT, 1984. Mutant allele frequencies in the domestic cats of Portugal and the Azores. J Hered 75:495–497.

Todd NB, Robinson R, and Clark JM, 1974. Gene frequencies in domestic cats of Greece. J Hered 65:227–231.

Watanabe MA, 1981. Mutant allele frequencies in the domestic cats of Sao Paulo, Brasil. Carnivore Genet Newslett 4:168–177.

Watanabe MA, 1984. Estudo populacional da cor de pelagem de gato domestico (Felis catus L.) em dezesseis localidades do Brasil (PhD dissertation). Sao Paulo: Universidade de Sao Paulo.

Corresponding Editor: Stephen J. O'Brien

Improved Estimation of the Proportion of Triploids in Populations with Diploid and Triploid Individuals

M. S. Ridout

We consider the estimation of the proportion of triploids in populations of plants or animals in which diploid and triploid individuals coexist, using data from electrophoretic analysis of isozyme or microsatellite markers. Individuals that have three distinct alleles at a locus are unambiguously triploid. However, other individuals cannot be classified with certainty as diploid or triploid, unless allelic dosage can be determined reliably. This is impossible for microsatellite markers, and for many isozyme markers. We therefore present a maximum likelihood method of estimating the proportion of triploids based only on the presence or absence of different alleles.

Populations that include both diploid and triploid individuals occur in various plant and animal species. Estimation of the proportion of triploid individuals within such mixed populations is important for population genetic studies and for ecological studies (Krieger and Keller 1998). One approach to this problem is to use phenotypic data from electrophoretic analysis of isozyme or microsatellite markers. These techniques identify the distinct alleles that are present. However, determination of allelic dosage is generally impossible with microsatellite markers. Dosage can sometimes be determined for isozyme markers, but is often unreliable. Clearly an individual that has three distinct alleles at a locus is necessarily triploid. However, if dosage cannot be determined reliably, an individual that has only two distinct alleles may be diploid, or it may be a triploid that has two identical alleles. Similarly, an individual that has only one distinct allele may be either diploid or triploid. Thus unambiguous classification of individuals as diploid or triploid is impossible in general. Krieger and Keller (1998) showed that it is possible, nonetheless, to estimate the proportion of triploids from phenotypic data, provided that the same alleles occur with the same frequencies in diploids and triploids and provided that genotype frequencies are in Hardy–Weinberg equilibrium.

In this article we demonstrate that there are advantages in using likelihood methods to address this problem. We show that the maximum likelihood estimate of the proportion of triploids is almost always preferable to the estimator of Krieger and Keller (1998). Another advantage of the likelihood method is that it leads naturally to a test of one of the key assumptions, namely that the alleles occur at the same frequency in diploids and triploids.

Triploidy Estimates

Single Locus with Three Alleles

We consider a locus with three alleles (A, B, and C) in a mixed population of diploids and triploids in which the proportion of triploids is t. The allele frequencies, a, b, and c, are assumed to be the same for diploids and triploids and genotype frequencies are assumed to be related to allele frequencies according to the Hardy–Weinberg law. There are seven phenotypic classes, as shown in Table 1 [identical to Table 1 in Krieger and Keller (1998)]. ABC individuals are necessarily triploid and may therefore be termed "overt triploids." Other individuals may be diploid or triploid.

The method of Krieger and Keller (1998) involves the following steps:

1. Estimate allele frequencies a, b, and c on the assumption that all individuals, apart from the overt triploids, are diploid.
2. Estimate t by equating the observed number of overt triploids ($n_7$) to the expected number, based on the current estimates of a, b, and c.
3. Calculate revised estimates a, b, and c, based on the current estimate of t.

Steps 2 and 3 are alternated until the estimates converge, or until the estimate of t exceeds one, in which case the estimate is taken as one. Krieger and Keller (1998) show that convergence is usually achieved in a few iterations.

An alternative method of estimating t is by maximum likelihood. Given a sample of N independent individuals, the distribution of the numbers of individuals in the

different phenotypic classes is multinomial.

Table 1. Frequencies of phenotypes and genotypes for a locus with three alleles

Phenotype   Genotype (2n)   Frequency      Genotype (3n)   Frequency
AA          AA              a^2(1 − t)     AAA             a^3 t
AB          AB              2ab(1 − t)     AAB, ABB        3a^2 b t, 3ab^2 t
AC          AC              2ac(1 − t)     AAC, ACC        3a^2 c t, 3ac^2 t
BB          BB              b^2(1 − t)     BBB             b^3 t
BC          BC              2bc(1 − t)     BBC, BCC        3b^2 c t, 3bc^2 t
CC          CC              c^2(1 − t)     CCC             c^3 t
ABC         -               -              ABC             6abc t

Ignoring combinatorial terms that are irrelevant to parameter estimation, the log-likelihood function is

$$L(a, b, t) = \sum_{i=1}^{7} n_i \ln(p_i),$$

where $n_i$ denotes the number of individuals in phenotypic class i (i = 1,…,7), $p_i$ is the probability that a randomly selected individual belongs to the ith phenotypic class, and where the allele frequency c is eliminated from the likelihood by writing c = 1 − a − b. Numerical methods are needed to maximize this function and thereby obtain maximum likelihood estimates of a, b, and t. We used an implementation of the Nelder–Mead simplex algorithm (Nelder and Mead 1965). An alternative approach would be to use the EM algorithm (e.g., Lange 1997, chap. 2).

To compare the Krieger and Keller (KK) estimator of t with the maximum likelihood (ML) estimator, we simulated datasets with N = 50, 100, or 250 and with various sets of allele frequencies. Five thousand datasets were simulated for each combination of parameter values.

Krieger and Keller (1998) note that their estimator of t is unbiased when the genotype frequencies within the sampled data are in exact Hardy–Weinberg equilibrium. However, this condition will seldom be met, because of sampling variation, even when the population frequencies are in Hardy–Weinberg equilibrium. Therefore in general the estimator is biased. For example, with N = 100, a = b = c = 1/3 and t = 0.5, the mean of the 5000 KK estimates was 0.508, a bias of 0.008 (SE = 0.0020). Although the bias is small, there is strong statistical evidence that it is not zero. For these parameter values, the estimated bias of the ML estimator was 0.010 (SE = 0.0018). Thus the ML estimator is also biased. However, these biases are small in relation to the standard deviations of the estimates (0.143 for KK, 0.126 for ML). Larger biases arose for some other combinations of parameter values (see below), but in most instances the bias remained small in relation to the variability of the estimates.

The efficiency of the KK estimator relative to the ML estimator (RE) was calculated using the formula

\[ \mathrm{RE}(\%) = 100 \times \frac{\mathrm{MSE}(\text{ML estimator})}{\mathrm{MSE}(\text{KK estimator})}, \]

where MSE denotes the mean squared error, the average squared difference between the parameter estimate and the true parameter value. The mean squared error incorporates the bias and the variance of the estimates.

Figure 1. Relative efficiency (RE, %) of the Krieger and Keller (1998) estimator of t compared to the maximum likelihood estimator for different values of t and for different sample sizes [N = 50 (solid circles), N = 100 (open circles), N = 250 (solid triangles)] for two sets of allele frequencies. Each relative efficiency value is based on data from 5,000 simulations.

The Journal of Heredity 2000:91(1)

Figure 1 shows the values of RE for two sets of allele frequencies. In the first set (Figure 1a), one allele was much more common than the other two. The KK estimator was more efficient than the ML estimator for small values of t when the sample size was 50 or 100, but otherwise the ML estimator was preferable. The relative efficiency of the KK estimator declined rapidly as t increased and as the sample size increased. In the second set of simulation runs (Figure 1b) the allele frequencies were equal. The KK estimator was almost always less efficient than the ML estimator. The decline in efficiency with increasing t was less rapid than in the first set of simulations and the effect of the sample size (N) was much smaller.

The poor relative performance of the ML estimator for small values of t and small sample sizes in the first set of simulations was due partly to a large positive bias, though the estimator was also more variable than the KK estimator. This illustrates that the ML estimator, though having desirable theoretical properties in large samples, need not be optimal for small samples. However, in these circumstances the KK estimator, though preferable to the ML estimator, was itself a poor estimator. For example, with N = 50 and t = 0.1, the KK estimate exceeded 0.5 in 583 of the first set of 5,000 simulations (11.7%), and in 222 instances (4.4%) the estimate was one, suggesting that all individuals were triploid. For comparison, in the second set of simulations, when the allele frequencies were equal and the two estimation methods performed similarly, only four of the KK estimates (0.08%) exceeded 0.5, the largest being 0.63. Our conclusion is that the ML method is as good as, and usually better than, the KK method, except in some circumstances in which neither method gives reliable estimates.

Although our primary interest is in the parameter t, it is of interest to compare briefly the estimates of allele frequencies obtained by the two methods. For the first set of simulations, with one allele much more common than the other two, the results were similar to those for parameter t. The ML method was more efficient, often substantially so, except when the sample size and the value of t were small. For the second set of simulations, with all allele frequencies equal, there was little difference in efficiency when t was small but, at large values of t, the KK method was more efficient, particularly when the sample size was small. When t = 0.9, the relative efficiency was about 120% when N = 50, and was still about 110% when N = 250.

Comparison with Estimation When Triploids Can Be Identified

As we have discussed earlier, the need for these rather elaborate approaches to estimating the proportion of triploids in the population arises because triploids cannot be identified reliably unless they carry three distinct alleles. Suppose, however, that such identification were possible, and that t could be estimated directly as the proportion of triploids in a sample of size n. This estimator is unbiased and its MSE is therefore equal to its variance, which is t(1 - t)/n. We can then ask what value of n would give the same MSE as the estimators considered earlier. For example, in the simulations with t = 0.1, a = 1/3, b = 1/3, and c = 1/3, the MSE of the ML estimator based on a sample of size 250 was 0.00172. This same value could have been obtained by taking n = 52.2, that is, with a sample of only 52 individuals, if triploids could be identified reliably. As can be seen from Table 2, even smaller samples would be needed if the true value of t were larger, or particularly if the true allele frequencies were a = 1/10, b = 1/10, and c = 4/5 and, although the ML estimator is preferable to the KK estimator, both estimators are very poor. Although the mean squared error of the ML and KK estimators can be reduced by increasing the sample size, a more effective approach, discussed below, will be to utilize data from several unlinked loci.

Table 2. Sample sizes needed to obtain the same precision as KK or ML estimators when unambiguous classification of diploids and triploids is possible

        a = 1/10, b = 1/10, c = 4/5     a = 1/3, b = 1/3, c = 1/3
t       KK        ML                    KK        ML
0.1     8         8                     51        52
0.3     5         9                     40        47
0.5     4         8                     30        40
0.7     3         8                     19        31
0.9     2         6                     10        17

Values are rounded to the nearest integer. The precision of the KK and ML estimates is based on the simulated mean square error in samples of size 250.

Goodness-of-Fit

Once the parameters have been estimated, it is sensible to check the goodness-of-fit to the observed phenotypic frequencies. A lack of fit suggests that the underlying assumptions are invalid. Since there are seven phenotypic classes, and three parameters have been estimated, three degrees of freedom remain to assess goodness-of-fit. One assumption that can be tested specifically is that allele frequencies are the same in diploids and triploids. To do this we fit a more complex model to the data in which the allele frequencies are different in diploids and triploids. The likelihood ratio test statistic is then twice the difference in the maximized log-likelihood values of these two models. Under the null hypothesis that the allele frequencies are the same, this statistic has, approximately, a chi-square distribution with two degrees of freedom.

Table 3. Phenotype frequencies simulated assuming different allele frequencies in diploids and triploids

Phenotype         AA      AB      AC      BB      BC      CC      ABC
Observed          26      86      119     15      108     95      51
Fitted, model 1   27.8    66.9    139.0   21.1    120.1   78.7    46.4
Fitted, model 2   25.7    84.0    120.4   18.0    102.8   96.3    52.8

The fitted frequencies are from two models. Model 1 is the standard model, assuming equal allele frequencies in diploids and triploids. Model 2 allows the frequencies to differ.

Table 4. Parameter estimates for two models fitted to the simulated data shown in Table 3

Parameter             t          a          b          c
Model 1               0.462      0.288      0.254      0.458
                      (0.0543)   (0.0138)   (0.0132)   (0.0153)
Model 2               0.489
                      (0.0732)
  Diploid                        0.198      0.209      0.593
                                 (0.0864)   (0.0541)   (0.0507)
  Triploid                       0.401      0.304      0.295
                                 (0.0894)   (0.0780)   (0.0315)

Figures in parentheses are standard errors. Model 1 is the standard model, assuming equal allele frequencies in diploids and triploids. Model 2 allows the frequencies to differ.

To illustrate the procedure, Table 3
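The single-locus fit behind Model 1 can be sketched in a few lines. The authors maximized the likelihood with a Nelder–Mead simplex routine and name the EM algorithm only as an alternative; the sketch below takes the EM route simply because it needs no external library. The function names (`genotype_terms`, `em_fit`, `fitted_counts`), the starting values, and the fixed iteration count are illustrative assumptions, not part of the paper. Run on the observed counts of Table 3, it should settle close to the Model 1 estimates in Table 4.

```python
# Phenotype classes follow Table 1: AA, AB, AC, BB, BC, CC, ABC.
# Hypothetical helper names; EM is used here instead of the authors' Nelder-Mead.

def genotype_terms(a, b, t):
    """Per phenotype: list of (probability, is_triploid, allele doses (A, B, C))."""
    c = 1.0 - a - b
    d = 1.0 - t  # proportion of diploids
    return [
        [(a * a * d, False, (2, 0, 0)), (a ** 3 * t, True, (3, 0, 0))],        # AA
        [(2 * a * b * d, False, (1, 1, 0)), (3 * a * a * b * t, True, (2, 1, 0)),
         (3 * a * b * b * t, True, (1, 2, 0))],                                # AB
        [(2 * a * c * d, False, (1, 0, 1)), (3 * a * a * c * t, True, (2, 0, 1)),
         (3 * a * c * c * t, True, (1, 0, 2))],                                # AC
        [(b * b * d, False, (0, 2, 0)), (b ** 3 * t, True, (0, 3, 0))],        # BB
        [(2 * b * c * d, False, (0, 1, 1)), (3 * b * b * c * t, True, (0, 2, 1)),
         (3 * b * c * c * t, True, (0, 1, 2))],                                # BC
        [(c * c * d, False, (0, 0, 2)), (c ** 3 * t, True, (0, 0, 3))],        # CC
        [(6 * a * b * c * t, True, (1, 1, 1))],                                # ABC
    ]


def em_fit(counts, a=1 / 3, b=1 / 3, t=0.5, iterations=5000):
    """EM estimates of (a, b, t) from the seven phenotype counts."""
    n_total = sum(counts)
    for _ in range(iterations):
        triploids = 0.0
        alleles = [0.0, 0.0, 0.0]
        for n_i, terms in zip(counts, genotype_terms(a, b, t)):
            p_i = sum(p for p, _, _ in terms)
            for p, is_triploid, doses in terms:
                w = n_i * p / p_i  # E-step: expected number with this genotype
                if is_triploid:
                    triploids += w
                for k in range(3):
                    alleles[k] += w * doses[k]
        # M-step: proportion of triploids and allele proportions
        t = triploids / n_total
        total = sum(alleles)
        a, b = alleles[0] / total, alleles[1] / total
    return a, b, t


def fitted_counts(a, b, t, n_total):
    """Expected phenotype counts under the fitted model."""
    return [n_total * sum(p for p, _, _ in terms)
            for terms in genotype_terms(a, b, t)]


observed = [26, 86, 119, 15, 108, 95, 51]  # Table 3, observed row
a_hat, b_hat, t_hat = em_fit(observed)
fitted = fitted_counts(a_hat, b_hat, t_hat, sum(observed))
```

EM updates both the triploid proportion and the allele proportions in closed form at every step, which is why no general-purpose optimizer is needed. A simplex search on the log-likelihood should reach the same maximum.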

gives simulated data for 500 individuals with t = 0.5. For diploids we took a = b = 0.2, c = 0.6, and for triploids a = b = c = 1/3. Parameter estimates under the two models are shown in Table 4 and the fitted values are shown in Table 3. For the standard model, the chi-square goodness-of-fit statistic is 15.38 (df = 3, P < .01), but for the extended model with different allele frequencies in diploids and triploids, this is reduced to 0.91 (df = 1, P = .34). If, instead of the usual chi-square statistic, we calculate the deviance (likelihood ratio) statistic using the formula

\[ D = 2 \sum_i \mathrm{observed}_i \times \log(\mathrm{observed}_i / \mathrm{expected}_i), \]

we obtain very similar values of 15.09 for the standard model and 0.93 for the extended model; the difference between these values, 14.16 (df = 2, P < .001), is the likelihood ratio test for comparing the two models. Thus, as one might hope with a sample of size N = 500, the analysis is able to detect the difference in allele frequencies of diploids and triploids very convincingly.

Single Locus with More than Three Alleles

The likelihood approach extends in a straightforward manner to loci with more than three alleles. For m alleles the number of phenotypes is $m(m^2 + 5)/6$. There are m parameters to be estimated, leaving $(m^3 - m - 6)/6$ degrees of freedom to assess goodness-of-fit. As with three alleles, a likelihood ratio test with m - 1 degrees of freedom can be constructed to test the assumption that allele frequencies in diploids and triploids are equal.

Several Loci

The likelihood approach can also be extended to analyze several loci, provided that they are unlinked and therefore segregate independently. In these circumstances, the log-likelihood functions for different loci can be added together to give an overall log-likelihood function. This function can be maximized to estimate separate allele frequencies for each locus, together with a single pooled value of t. The likelihood ratio statistic for testing the homogeneity of t is twice the difference between the maximized value of this function and the sum of the maximized log-likelihood values for each locus analyzed separately. This can be compared with the chi-square distribution with degrees of freedom one less than the number of loci. A significant value would cast doubt on the reliability of the pooled estimate.

Although the likelihood approach for several loci is straightforward conceptually, the amount of computation increases considerably with each new locus. A simpler method of obtaining a pooled estimate is to form a weighted average

\[ \hat{t}_{\mathrm{pool}} = \sum_{i=1}^{r} \frac{\hat{t}_i}{V_i} \Bigg/ \sum_{i=1}^{r} \frac{1}{V_i}, \]

where r is the number of independent loci, $\hat{t}_i$ is the estimate of t for the ith locus, and $V_i$ is the variance of this estimate. The estimate $\hat{t}_{\mathrm{pool}}$ can be shown to be fully efficient, in the sense that its theoretical variance is the same as the theoretical variance of the maximum likelihood estimator. Moreover, the statistic

\[ X^2 = \sum_{i=1}^{r} \frac{(\hat{t}_i - \hat{t}_{\mathrm{pool}})^2}{V_i} \]

may be used to test the homogeneity of the estimates $\{\hat{t}_i\}$. Under the null hypothesis of homogeneity, $X^2$ is distributed as chi-square with r - 1 degrees of freedom (e.g., Bailey 1961, Appendix A1.5).

Concluding Remarks

Likelihood methods provide a general approach to statistical inference that is known to be optimal in many senses for large sample sizes. However, as indicated by this example, maximum likelihood estimates may often be as good as, or better than, other ad hoc estimates, even with samples of small or moderate size. Another advantage of using the likelihood approach is that it allows simple tests of some of the underlying assumptions. Although the computations are somewhat more complex than the method of Krieger and Keller (1998), they can be completed very rapidly on a modern personal computer, and we therefore suggest that the likelihood approach should be the method of choice for this problem. A program for doing the basic calculations for a single locus, with up to nine distinct alleles, is available from the author as an executable program for IBM-compatible PCs.

From Horticulture Research International, East Malling, West Malling, Kent ME19 6BJ, UK.

© 2000 The American Genetic Association

References

Bailey NJT, 1961. Introduction to the mathematical theory of genetic linkage. Oxford: Oxford University Press.

Krieger MJB and Keller L, 1998. Estimation of the proportion of triploids in populations with diploid and triploid individuals. J Hered 89:275–279.

Lange K, 1997. Mathematical and statistical methods for genetic analysis. New York: Springer-Verlag.

Nelder JA and Mead R, 1965. A simplex method for function minimization. Comp J 7:303–313.

Received September 23, 1998
Accepted August 18, 1999

Corresponding Editor: James L. Hamrick

Inheritance of Unique Fruit and Foliage Color in NuMex Piñata

E. J. Votava, C. Balok, D. Coon, and P. W. Bosland

The inheritance of mature fruit color in peppers (Capsicum spp.) is controlled by several genes. However, the inheritance of the transition of colors the fruit undergo during ripening has not been described extensively. The authors describe the inheritance of a unique gene which affects foliage color and fruit color transition occurring in the jalapeño cultivar NuMex Piñata. The gene responsible is designated the tra gene.

The jalapeño cultivar NuMex Piñata was released in 1998 (Votava and Bosland 1998). It originated spontaneously in the cultivar Early Jalapeño, released by PetoSeed in 1977 (Tigchelaar 1980). NuMex Piñata is unique in the transition of colors the fruit undergo as they mature. Immature fruit are light green, maturing to yellow, orange, and finally red. The fruit color of standard jalapeño cultivars, for example, Early Jalapeño, changes from dark green to red.

The inheritance of mature fruit color in Capsicum has been described by several authors (Hurtado-Hernandez and Smith 1985; Kormos 1962; Kormos and Kormos 1954, 1960; Schifriss and Pilovsky 1992; Smith 1950). It is generally considered to be controlled by the combination and interaction of the following genes: c1 and c2 (carotene pigment inhibitors), cl (chlorophyll maintainer in fruit), and y (yellow mature fruit color) (Daskalov and Poulos 1994). NuMex Piñata ripens to a final red mature fruit color, indicating, according to general consensus, that the genotype for mature color is y + c1. However, it is the transition of colors the fruit undergo as they ripen as well as luteous foliage that makes NuMex Piñata interesting both
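The 3:1 goodness-of-fit test that this communication applies to its F2 counts (521 plants with the Early Jalapeño phenotype, 175 with the NuMex Piñata phenotype) is easy to reproduce. The sketch below is illustrative only: the function names are hypothetical, and the P value uses the closed form for a chi-square variable with one degree of freedom rather than whatever software the authors used. The result should agree with the reported statistic and P value up to rounding.

```python
import math


def chi_square_gof(observed, ratio):
    """Pearson chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


def p_value_one_df(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# F2 counts from the chi-square table: Early Jalapeño type, NuMex Piñata type
statistic, expected = chi_square_gof([521, 175], [3, 1])
p_value = p_value_one_df(statistic)
```

With 696 plants the expected counts are 522 and 174, so each class is off by a single plant and the statistic is very small.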

commercially and genetically, and which suggests that a separate genetic system is involved.

The inheritance of foliage color and fruit color transition in NuMex Piñata was studied. The mode of inheritance of this trait is described and the gene responsible is designated.

Materials and Methods

Reciprocal hybridizations were made between Early Jalapeño and NuMex Piñata. Hybridizations were performed in the environmentally controlled insect-proof greenhouse of the New Mexico State University Chile Pepper Breeding Program in the early summer of 1996. A total of 192 F1 seed, representing eight distinct crosses, were planted in the greenhouse in late summer. Plants were evaluated for phenotype as they grew. In the spring of 1997, F2 seedlings from 12 randomly chosen F1 plants were transplanted to the field at the Leyendecker Plant Science Research Center, Las Cruces, New Mexico. A total of 696 F2 plants were evaluated for phenotype as the first-set fruit reached maturity. Data from the F2 segregating population were analyzed using the chi-square goodness-of-fit method.

Results and Discussion

The foliage and fruit color transition phenotypes of the entire F1 population were those of Early Jalapeño, regardless of whether Early Jalapeño was the female or male parent, indicating the NuMex Piñata phenotype is recessive and displays no maternal effects. The segregating F2 population was scored for either of two observed phenotypes. One phenotype was that typical of the foliage and fruit color transition associated with NuMex Piñata; the other phenotype was that typical of the foliage and fruit color transition of Early Jalapeño. No other phenotypes were observed. A single gene appears to have a pleiotropic effect on the foliage of NuMex Piñata. The foliage of Early Jalapeño and other jalapeño cultivars is dark green, while NuMex Piñata has luteous foliage. There was an absolute association between fruit color and foliage color among all 696 F2 progeny. No recombination was observed that produced a NuMex Piñata–type fruit color transition with an Early Jalapeño–type foliage or, conversely, a normal Early Jalapeño fruit color transition with a NuMex Piñata foliage phenotype.

Table 1. Chi-square analysis for goodness-of-fit to a 3:1 ratio of inheritance of the Early Jalapeño phenotype compared to the NuMex Piñata phenotype in the F2 generation

                                       No. of plants
                                       Early Jalapeño   NuMex Piñata   Expected
                                       phenotype        phenotype      ratio      χ2       P
Early Jalapeño P1                      4                0
NuMex Piñata P2                        0                4
Early Jalapeño × NuMex Piñata F1       96               0
NuMex Piñata × Early Jalapeño F1       96               0
Selfed F2                              521              175            3:1        0.0076   .9305

Data from the F2 segregating population were tested for goodness-of-fit to a 3:1 ratio using chi-square analysis (Table 1). The results of the chi-square analysis indicate that the progeny fit a 3:1 ratio. Therefore the inheritance of the foliage color and fruit color transition phenotype of NuMex Piñata is due to a single homozygous recessive gene.

No other Capsicum accession possesses the same foliage color and fruit color transition phenotype associated with NuMex Piñata. Therefore true allele and complementation tests were not performed.

In accordance with rules prepared by the Capsicum and Eggplant Newsletter committee for Capsicum gene nomenclature (CENL 1994), the gene responsible for the unique fruit color transition seen in NuMex Piñata is designated tra for transition.

From the Department of Agronomy and Horticulture, MSC 3Q, New Mexico State University, Las Cruces, NM 88003. This article is a contribution of the New Mexico Agricultural Experiment Station, New Mexico State University, Las Cruces, New Mexico. Address correspondence to Eric J. Votava at the address above.

© 2000 The American Genetic Association

References

CENL Committee for Capsicum gene nomenclature, 1994. Rules for gene nomenclature of Capsicum. Capsicum Eggplant Newsl 13:13–14.

Daskalov S and Poulos JM, 1994. Updated Capsicum gene list. Capsicum Eggplant Newsl 13:15–26.

Hurtado-Hernandez H and Smith PG, 1985. Inheritance of mature fruit color in Capsicum annuum L. J Hered 76:211–213.

Kormos J, 1962. Einige bemerkungen über die karotinfarben der paprikafruct. Acta Bot (Acad Sci Hung) 8:279–281.

Kormos J and Kormos K, 1954. Hérédité des pigments et des caroténoides chez le piment. Ann Inst Biol (Tihany) Hung Acad Sci 22:253–259.

Kormos J and Kormos K, 1960. Die genetischen typen der carotinoid system der paprikafruct. Acta Bot (Acad Sci Hung) 6:305–319.

Schifriss C and Pilovsky M, 1992. Studies of the inheritance of mature fruit color in Capsicum annuum L. Euphytica 60:123–126.

Smith PG, 1950. Inheritance of brown and green mature fruit color in peppers. J Hered 41:138–140.

Tigchelaar EC (ed), 1980. New vegetable varieties list XXI. HortScience 15:565–576.

Votava EJ and Bosland PW, 1998. 'NuMex Piñata' jalapeño chile. HortScience 33:350.

Received March 1, 1999
Accepted August 18, 1999

Corresponding Editor: Kendall R. Lamkey

Population Genetics of Geographically Restricted and Widespread Species of Myrica (Myricaceae)

Y.-P. Cheng, C.-T. Chien, and T.-P. Lin

Allozyme variation of 11 putative loci in five populations of the rare Myrica adenophora Hance, and four populations of its widespread congeneric species, M. rubra (Lour.) Sieb. & Zucc., was studied. Among the 21 alleles studied, no unique allele was detected for M. adenophora, whereas M. rubra had 3 alleles not found in the former species. In terms of genetic diversity, populations of the rare species contained fewer alleles per locus (1.5 versus 1.7), a lower effective number of alleles per locus (1.12 versus 1.20), fewer alleles per polymorphic locus (2.14 versus 2.46), a lower percentage of polymorphic loci (30.9 versus 40.9), and lower expected heterozygosity (0.106 versus 0.163) than populations of the widespread species. Genetic distances within species average 0.043 for M. adenophora and 0.045 for M. rubra, and between species ranged from 0.052 to 0.177, with a mean of 0.103, which agrees with the very similar gross morphologies of these two species. Intrapopulation differentiation was similar in both species: GST = 0.152 for M. adenophora, and 0.146 for M. rubra, whereas estimated gene flow based on GST values was moderate in these two species (Nm = 1.39 versus 1.46). We inferred that M. rubra and M. adenophora are a progenitor-derivative species pair that emerged

before migrating into Taiwan during the last glacial period. We consider that the Hengchun population (Chiupeng, Hsuhai, and Chufengpi) and Taitung population (Tienkuan and Lanshan) of M. adenophora probably arose from two subsets of the genome of M. rubra. Genetic drift was inferred to be one of the forces shaping the observed genetic structure in M. adenophora and M. rubra.

Recent interest in conservation has directed attention to comparisons of widespread versus restricted congeneric species, especially with respect to levels of genetic variation. An understanding of the long-term demographic patterns is likely to be useful as an indicator of future prospects for a rare species (Milligam et al. 1994). Enzyme electrophoresis has been used to describe the levels and distribution of genetic variation and the population genetic structure of a variety of groups of plants. The use of allozymes has been particularly fruitful in this respect, enabling identification of progenitor-derivative species pairs in several plant genera (see reference in Kadereit et al. 1995). The general conclusion is that genetic similarity between progenitor and derivative is very high, and that in virtually all cases the derivative taxon contains a subset of the alleles present in the progenitor taxon and alleles unique to the derivative are rare or absent (Gottlieb 1973). Several studies on this subject were found in recent publications (Allen et al. 1991; Edwards and Wyatt 1994; Kadereit et al. 1995; Mayer et al. 1994; Purdy and Bayer 1995; Purps and Kadereit 1998).

Myrica has a wind-pollinated catkin inflorescence. Two species of Myrica occur in Taiwan, M. rubra and M. adenophora, both of which are favorites for landscape design. These two species are usually dioecious, but may be monoecious as in M. adenophora. M. adenophora has an extremely restricted distribution, only being found on grasslands (elevation of 50 m) of the east coast of the Hengchun Peninsula, and on mollisol of a mud stone area (elevation of 100 m) of the coastal mountain range in Taitung County. The Hengchun population is considered an endemic variety of M. adenophora Hance var. kusanoi Hayata (Hayata 1911). Illegal removal for commercial purposes has consistently reduced the population size of M. adenophora. M. adenophora was also reported in Guangdong and Fujian Provinces of China. M. rubra, known as bayberry or strawberry tree, is widespread but patchily distributed in Taiwan from low elevations up to 2000 m. M. rubra is a species of east Asia, including Korea, Japan, Ryukyu, China, and the Philippines. These two species are very similar in gross morphology and are never found sympatrically. Tropical weather and a strong prevailing northeast winter monsoon may cause plants of M. adenophora to change their morphology when compared with M. rubra, which occurs throughout the island except Hengchun Peninsula. In addition to the difference in environmental factors, we as yet do not know whether the morphologies shown in M. adenophora and M. rubra represent similar differences in the genetic composition of the two species.

For this study, we compare genetic variation in two species of Myrica. This species pair therefore should be well suited for a comparison of genetic variation. The comparison should be especially interesting to see if the results conform with the conclusions reported previously.

Materials and Methods

Sampling

Two species of Myrica were sampled in this study. M. rubra is found as a big tree in broadleaf forests and can be distinguished from M. adenophora by the following characteristics: glabrous branches, staminate flowers with six to eight stamens, and spheroid fruit with a diameter of 1.4–2 cm. M. adenophora, on the other hand, is a shrub or small tree with smaller, thicker leaves, pubescent young branches, staminate flowers with two to four stamens, and ellipsoid but slightly flattened fruit with a diameter of 0.7–1 cm.

Fresh materials of M. adenophora were collected from five natural populations at Chiupeng (50 individuals), Hsuhai (50), and Chufengpi (50) on the Hengchun Peninsula, and Tienkuan (43) and Lanshan (14) of Taitung County (Figure 1), for a total of 207 individuals. Hengchun and Taitung are separated by a geographical distance of about 100 km. Hengchun's populations usually flower from December to March, but those in Taitung do so between August and November. M. rubra samples were collected from four populations at Puli (50), Nanchuang (41), Chiayang (44), and Yangmingshan (36) for a total of 171 individuals (Figure 1). New shoots were carried back to the laboratory in sealed PE bags and stored in a refrigerator.

Figure 1. Location of nine sample populations in Taiwan. Closed squares indicate M. adenophora (five populations) and closed circles represent M. rubra (four populations).

Electrophoresis

Young leaves were ground with extraction buffer according to procedures described in Feret (1971) and absorbed onto Whatman 3MM filter paper (4 mm × 12 mm). Paper strips were arranged in a plastic Petri dish and stored at -70°C until use. Horizontal starch-gel electrophoresis was used for separating the following isozymes: IDH (isocitrate dehydrogenase, EC 1.1.1.42), MDH (malate dehydrogenase, EC 1.1.1.37), PER (peroxidase, EC 1.11.1.7), 6PGD (phosphogluconate dehydrogenase, EC 1.1.1.44), PGM (phosphoglucomutase, EC 5.4.2.2), SKDH (shikimate 5-dehydrogenase, EC 1.1.1.25), EST (carboxylesterase, EC 3.1.1.1), and PGI (glucose-6-phosphate isomerase, EC 5.3.1.9). Electrophoresis and staining followed the procedure described in Cheliak and Pitel (1984). The zone specifying the most anodally migrating variant was designated as 1, the next as 2, and so on. Within each zone the most anodally migrating variant was designated as a, the next b, and so on. Genotype frequencies were inferred directly from isozyme phenotypes.

Table 1. Allele frequencies and expected (He) and observed (Ho) heterozygosities for seven polymorphic loci in populations of M. adenophora and M. rubra

          M. adenophora                                       M. rubra
Locus/
allele    Chiupeng  Hsuhai  Chufengpi  Tienkuan  Lanshan     Puli    Chiayang  Nanchuang  Yangmingshan
Mdh-2
  a       0.000     0.020   0.010      0.151     0.179       0.050   0.227     0.050      0.029
  b       1.000     0.980   0.990      0.849     0.821       0.790   0.761     0.950      0.971
  c       0.000     0.000   0.000      0.000     0.000       0.160   0.012     0.000      0.000
  He      0.000     0.040   0.020      0.260     0.304       0.351   0.373     0.096      0.058
  Ho      0.000     0.040   0.020      0.302     0.357       0.420   0.477     0.100      0.058
Pgi-1
  a       0.990     0.970   1.000      0.837     0.679       0.960   0.977     1.000      0.569
  b       0.010     0.030   0.000      0.163     0.321       0.040   0.023     0.000      0.431
  He      0.020     0.059   0.000      0.276     0.452       0.078   0.045     0.000      0.497
  Ho      0.020     0.060   0.000      0.279     0.500       0.080   0.045     0.000      0.472
Pgm-1
  a       0.000     0.000   0.000      0.000     0.000       0.100   0.000     0.000      0.000
  b       0.070     0.000   0.010      0.105     0.036       0.590   0.239     0.549      0.667
  c       0.930     1.000   0.990      0.895     0.964       0.310   0.761     0.451      0.333
  He      0.132     0.000   0.020      0.190     0.071       0.551   0.368     0.501      0.451
  Ho      0.140     0.000   0.020      0.209     0.071       0.480   0.477     0.561      0.333
Skdh-2
  a       0.230     0.230   0.010      0.314     0.393       0.720   0.909     0.951      0.986
  b       0.770     0.770   0.990      0.686     0.607       0.280   0.091     0.049      0.014
  He      0.358     0.358   0.020      0.436     0.495       0.407   0.167     0.094      0.028
  Ho      0.340     0.260   0.020      0.302     0.357       0.400   0.182     0.098      0.028
6Pgd-1
  a       0.900     0.940   0.580      0.988     1.000       0.940   0.955     1.000      0.583
  b       0.100     0.060   0.420      0.012     0.000       0.060   0.045     0.000      0.417
  He      0.182     0.114   0.492      0.023     0.000       0.114   0.088     0.000      0.493
  Ho      0.200     0.120   0.600      0.023     0.000       0.120   0.091     0.000      0.000
6Pgd-2
  a       1.000     1.000   1.000      1.000     1.000       0.810   0.898     0.988      1.000
  b       0.000     0.000   0.000      0.000     0.000       0.190   0.102     0.012      0.000
  He      0.000     0.000   0.000      0.000     0.000       0.311   0.186     0.024      0.000
  Ho      0.000     0.000   0.000      0.000     0.000       0.340   0.205     0.024      0.000
Per-1
  a       0.000     0.000   0.040      0.012     0.000       0.010   0.012     0.232      0.014
  b       0.350     0.290   0.280      0.895     1.000       0.220   0.602     0.561      0.292
  c       0.650     0.710   0.680      0.093     0.000       0.770   0.386     0.207      0.694
  He      0.460     0.416   0.462      0.192     0.000       0.362   0.492     0.596      0.439
  Ho      0.420     0.420   0.440      0.163     0.000       0.280   0.477     0.537      0.250

Alleles a, b, and c were designated according to their migration distance from the origin. Mdh-1, Pgi-2, Idh-1, and Est-1 are monomorphic loci.

Table 2. Genetic diversity statistics for each population and the species level of M. adenophora and M. rubra

Population       N     A     Ae     AP     P(a)   He (SD)(b)        Ho (SD)(b)
M. adenophora
  Chiupeng       50    1.5   1.12   2.00   36.4   0.105 (0.050)     0.102 (0.047)
  Hsuhai         50    1.5   1.10   2.00   27.3   0.090 (0.046)     0.082 (0.042)
  Chufengpi      50    1.5   1.10   2.50   18.2   0.092 (0.057)     0.100 (0.064)
  Tienkuan       43    1.6   1.14   2.20   45.5   0.125 (0.046)     0.116 (0.041)
  Lanshan        14    1.4   1.14   2.00   27.3   0.120 (0.059)     0.117 (0.057)
  Mean           41    1.5   1.12   2.14   30.9   0.106 (0.013)     0.103 (0.012)
  Species level  207   1.6   1.14   2.20   54.5   0.126 (0.051)     0.101 (0.036)
M. rubra
  Puli           50    1.9   1.25   2.67   54.5   0.198 (0.061)     0.193 (0.058)
  Chiayang       44    1.8   1.18   2.60   45.5   0.156 (0.054)     0.178 (0.062)
  Nanchuang      41    1.5   1.14   2.33   27.3   0.119 (0.065)     0.120 (0.065)
  Yangmingshan   36    1.6   1.22   2.25   36.4   0.179 (0.070)*    0.104 (0.051)*
  Mean           43    1.7   1.20   2.46   40.9   0.163 (0.047)     0.149 (0.036)
  Species level  171   1.9   1.24   2.57   63.6   0.191 (0.060)     0.153 (0.050)

(a) The frequency of the most common allele is less than 0.95.
(b) Difference between He and Ho is significant: * P < .01.

Data Analysis

Allozyme genotypes from young leaf tissue of individual trees from each population were used in conjunction with the BIOSYS-1 package (Swofford and Selander 1989) for estimates of allele frequencies, mean number of alleles per locus (A), effective number per locus (Ae), the number of alleles per polymorphic locus (AP), percentage of polymorphic loci (P), and mean observed (Ho) and expected (He) heterozygosities. Deviations of observed from expected heterozygosities over all polymorphic loci for each population were assessed with chi-square testing. Wright's F coefficient was used to estimate deviations of observed and expected Hardy–Weinberg heterozygotic frequencies for each polymorphic locus in each population (Wright 1965). The amount of gene flow among these populations was estimated as Nm = (1 - GST)/4GST (Wright 1951). Genetic diversity within and among populations was partitioned using Nei's (1973) genetic diversity statistics.

A UPGMA (Sneath and Sokal 1973) cluster analysis of identity values was generated using BIOSYS-1 to examine genetic associations among populations. Principal component analysis (PCA) (Sneath and Sokal 1973) using the NTSYS-pc 1.8 program (Rohlf 1992) helped evaluate the phenetic interpopulational relationships based on allele frequency distributions from polymorphic loci only. PCA is presented with OTUs (populations) plotted

Table 3. F statistics, G statistics, and contingency chi-square test for polymorphic loci in populations of M. adenophora and M. rubra

          M. adenophora                                        M. rubra
Locus     χ2(a)     FIS      FIT      HT      HS      GST      χ2(a)      FIS      FIT      HT      HS      GST
Mdh-2     4.845*    -0.153   -0.068   0.124   0.125   0.066    15.956**   -0.216   -0.120   0.237   0.220   0.073
Pgi-1     0.745     -0.060   0.102    0.188   0.161   0.140    0.090      0.023    0.309    0.216   0.155   0.284
Pgm-1     1.603     -0.088   -0.048   0.085   0.083   0.023    0.000      -0.001   0.115    0.523   0.468   0.106
Skdh-2    7.085**   0.185    0.252    0.360   0.333   0.074    0.125      -0.027   0.085    0.193   0.174   0.100
6Pgd-1    6.632*    -0.179   0.103    0.209   0.162   0.223    81.866**   0.692    0.768    0.227   0.174   0.234
6Pgd-2    -         -        -        -       -       -        1.850      -0.104   -0.010   0.140   0.130   0.073
Per-1     0.915     0.047    0.421    0.501   0.306   0.389    10.236*    0.173    0.305    0.556   0.472   0.150
Mean                0.011    0.226    0.246   0.195   0.152               0.078    0.219    0.299   0.256   0.146

Chi-square test according to Li and Horvitz (1953) and referred to as FIS.
(a) Significance level: * P < .05, ** P < .01.

Pgi-1, Pgm-1, Skdh-2, 6Pgd-1, 6Pgd-2, and Per-1 are polymorphic (95% criterion) in at least one population, while Mdh-1, Pgi-2, Idh-1, and Est-1 are monomorphic. Skdh-1, Got-1, Got-2, Mr-1, and Mr-2 are probably polymorphic but were not included for analysis because they were not adequately resolved on most gels. Observed allele frequencies at each locus for each population, along with overall mean allele frequencies and observed and expected heterozygosities, are presented in Table 1. 6Pgd-2 is monomorphic in M. adenophora, but polymorphic in M. rubra. Among 21 alleles of 11 loci, Mdh-2c, Pgm-1a, and 6Pgd-2b do not occur in M. adenophora, and therefore are unique to M. rubra. Pgm-1a is a rare allele and only found in the Puli population. M. adenophora has no unique alleles relative to M. rubra.

Genetic Diversity

Measures of genetic diversity are presented in Table 2. M. adenophora has six polymorphic loci. At the species level, the av-

were categorized between endemic and narrowly distributed species as defined by Hamrick et al. (1992). At the population level, M. adenophora exhibits a lower level of genetic diversity than M. rubra. The average number of alleles per locus (A) was 1.5; effective numbers of alleles per locus (Ae) varied from 1.10 to 1.14, with an average of 1.12; the number of alleles per polymorphic locus (AP) varied from 2.0 to 2.5, with an average of 2.14; the percentage of polymorphic loci per individual (P) varied from 18.2% to 45.5%, with an average of 30.9%; and the expected heterozygosity was 0.106.

M. rubra has seven polymorphic loci. At the species level, estimates of parameters As, Aes, APs, Ps, and Hes were 1.9, 1.24, 2.57, 63.6, and 0.191, respectively. This level of diversity is similar to that observed in regional and widespread species. At the population level, A, Ae, AP, P, and He were 1.7, 1.20, 2.46, 40.9, and 0.163, respectively, and fall between the narrow and regional species. Differences between observed and expected heterozygosities were not significant for any population of M. adenophora and M. rubra, indicating that the populations are in Hardy–Weinberg equilibrium.

Genetic Differentiation

In M. adenophora the FIS value for all enzyme systems was 0.011 (Table 3). This indicates that genotype frequencies within populations are close to those expected under random mating. M. rubra also has a positive FIS (0.078), which does not deviate significantly from zero as verified by the formula of Li and Horvitz (1953) (χ2 = 2.08, P > .05). Contingency

onto the derived variables (principal erage number of alleles per locus (As) was chi-square tests showed that Mdh-2, Skdh- axes). 1.6, effective number of alleles per locus 2, and 6Pgd-1 of M. adenophora have sig-

(Aes) was 1.14, the number of alleles per nificant deviations from Hardy–Weinberg

Results polymorphic locus (APs) was 2.2, the per- expectations. In M. rubra, however, Mdh-2, Eleven loci in eight enzyme systems tested centage of polymorphic loci per individual 6Pgd-1, and Per-1 showed significant differ-

could be resolved clearly enough for ge- (Ps) was 54.5, and expected heterozygosity ences. netic diversity analysis: Mdh-1, Mdh-2, Pgi- was 0.126. Estimates of the genetic diver- The extent of genetic differentiation

1, Pgi-2, Pgm-1, Skdh-2, 6Pgd-1, 6Pgd-2, Idh- sity parameter of M. adenophora at the among the populations (GST)ofM. adeno- 1, Est-1, and Per-1. Among them, Mdh-2, species level or at the population level phora averaged 0.152. Thus about 15% of genetic variation resides among popula- tions. Similar genetic differentiation Table 4. Nei’s measure of genetic identity (I) (above the diagonal) and genetic distance (D) (below the among populations was also observed in M. adenophora M. rubra diagonal) for populations of and M. rubra (0.146). Gene flow per generation Population 123456789 was moderate in both M. adenophora (Nm ϭ 1.39) and M. rubra (Nm ϭ 1.46). Unbi- Chiupeng ***** 0.999 0.985 0.964 0.943 0.932 0.937 0.908 0.875 Hsuhai 0.001 ***** 0.983 0.958 0.937 0.926 0.932 0.898 0.866 ased genetic distances between pairs of Chufengpi 0.015 0.017 ***** 0.932 0.907 0.885 0.882 0.847 0.838 populations of M. adenophora ranged from Tienkuan 0.037 0.043 0.070 ***** 0.998 0.896 0.950 0.926 0.855 0.001 to 0.098, with an average of 0.043 Lanshan 0.059 0.065 0.098 0.002 ***** 0.873 0.944 0.917 0.850 Puli 0.070 0.077 0.122 0.110 0.136 ***** 0.960 0.963 0.955 (Table 4). Among them, Chiupeng and Chiayang 0.065 0.070 0.126 0.052 0.058 0.041 ***** 0.983 0.933 Hsuhai showed very low genetic distance; Nanchuang 0.096 0.107 0.167 0.076 0.087 0.038 0.017 ***** 0.944 Yangmingshan 0.134 0.144 0.177 0.157 0.163 0.046 0.070 0.058 ***** the same observation occurred between Tienkuan and Lanshan. Genetic distances

among populations of M. rubra varied from 0.017 to 0.058, with an average of 0.045. At the species level, the genetic distance between M. adenophora and M. rubra was 0.103.

The UPGMA dendrogram produced clusters of populations separating M. adenophora and M. rubra into two distinct groups (Figure 2) with a genetic distance of 0.103. PCA of allele frequencies revealed that Skdh-2 and Pgm-1 were the loci that differentiated the populations of M. adenophora from populations of M. rubra the most. PCA also indicated a close genetic relationship between populations of M. adenophora where they co-occurred at Lanshan and Tienkuan, and Hsuhai and Chiupeng (Figure 3).

Figure 2. Dendrogram from the cluster analysis using the UPGMA method for populations of M. adenophora and M. rubra.

Figure 3. Principal component analysis (PCA) based on allele frequencies for populations of M. adenophora (populations 1–5) and M. rubra (populations 6–9). The first two principal components explain 54.8% and 28.5% of the variances, respectively. 1, Chiupeng; 2, Hsuhai; 3, Chufengpi; 4, Tienkuan; 5, Lanshan; 6, Puli; 7, Chiayang; 8, Nanchuang; 9, Yangmingshan.

Discussion

The relationship between M. adenophora and M. rubra agrees with that of other closely related congeners, suggesting that they are a progenitor-derivative species pair. The genetic similarity of these two species is 0.897, which, in fact, is higher than the 0.67 ± 0.04 for congeneric species reported by Gottlieb (1981).

Even though rare, M. adenophora maintains higher levels of genetic diversity than many rare tree species in Taiwan, such as Amentotaxus formosana (He = 0.004; Wang et al. 1996), Bretschneidera sinensis (He = 0.062; Huang 1994), and Rhododendron kanehirai (He = 0.068; Huang et al. 1995). This pattern may be due to the relationships between the specific progenitor-derivative species pairs. The progenitors of A. formosana, B. sinensis, and R. kanehirai occurring in mainland China have never been studied, though theoretically they should have higher genetic diversities. Cunninghamia lanceolata, native to China, and C. konishii, found only in Taiwan, are another progenitor-derivative species pair; the former has a much higher genetic diversity than the latter (Lin et al. 1998). It was reported that during the late Pleistocene, temperate forests dominated the lowlands of Taiwan in the Tali glacial stage from about 50,000 to 15,000 BP (Tsukada 1967). During that time, although glaciation did not advance into southern China, the climate did cool. Taiwan would then likely have been connected with mainland China by a land bridge due to a lowering of the sea level (Liu 1988), across which plants could have migrated toward the warmer south without a barrier. M. adenophora, being better adapted to a tropical climate, might have survived in refugia on the Hengchun Peninsula. As glaciers retreated northward around 15,000 BP, warming returned to the lowlands. M. adenophora probably remained in southern Taiwan because there was less competition, owing to the lack of open grasslands beyond the Hengchun Peninsula and to its small size. M. rubra, a species more adapted to a temperate climate, spread from lowland forests up to elevations of about 2000 m. Regarding species divergence, we inferred that M. adenophora emerged from M. rubra in temperate mainland China before migrating into Taiwan. One alternative scenario for speciation of M. adenophora could be that it originated from M. rubra in Taiwan. However, it is unlikely that M. adenophora could have experienced the same speciation independently in two distant localities: Guangdong of China and Taiwan. Certainly, a comparison between populations from both Hengchun and Guangdong will shed

light on our understanding of the origin of M. adenophora.

The lower genetic diversity of M. adenophora could have arisen because its gene pool is a subset of that of the progenitor. M. adenophora also exhibits no unique alleles, probably because of its relatively recent origin and the effects of genetic drift. These two groups, the Hengchun (Chiupeng, Hsuhai, and Chufengpi) and Taitung populations (Tienkuan and Lanshan), distinctly differed in their gene frequencies, for example at Mdh-2, Pgi-1, Skdh-2, 6Pgd-1, and Per-1; this may indicate they represent different subsets of the genetic variability of M. rubra. To further test this hypothesis, characterization of chloroplast DNA or mitochondrial DNA haplotypes would provide additional evidence (Kadereit et al. 1995). At present each population of M. adenophora contains fewer than 500 individuals. Illegal removal for commercial purposes and human-caused fires are the major disturbances that reduce the number of individuals. Genetic drift, in fact, may have occurred since M. adenophora is restricted to the southern part of the island. For example, Pgm-1b becomes a rare allele, 6Pgd-2a is a fixed allele, and 6Pgd-1a is fixed in the Taitung population. The Taitung populations grow on mollisol in a mudstone area; the soil is generally grayish, holds little moisture during the dry season, has an abundant mineral component, and is alkaline or neutral in reaction but lacks organic compounds. Reproductive isolation was found between the Hengchun and Taitung populations of M. adenophora. We suspect that adaptation to mollisol in a mudstone area might cause reproductive isolation, as has been reported for many plant species that evolved in serpentine habitats (Kruckeberg 1957). An edaphic factor can be the chief stimulus for the evolution of endemic taxa (Kruckeberg 1986; Mason 1946a,b; Proctor and Woodell 1975). This isolation must have persisted for a long time, since the Hengchun and Taitung populations have differentiated to the same genetic distance as that between populations of the widespread M. rubra (Figure 2). Within M. rubra, the gene frequency distributions of Pgi-1, Pgm-1, and 6Pgd-1 in Nanchuang, and of Pgm-1 and 6Pgd-2 in the Yangmingshan population, are consistent with the action of genetic drift.

M. adenophora and M. rubra both have low FIS, which is consistent with the strictly outcrossing pollination behavior of Myrica. A high FIT value for both species is probably due to a spatial Wahlund effect. Differences in allele frequencies between the populations may produce the appearance of a deficiency of heterozygotes. M. rubra is patchily distributed in dense forest and also has a restricted neighborhood size that prevents gene flow, thus favoring population substructuring and the Wahlund effect.

Genetic differentiation between populations of M. adenophora (GST = 0.152) is similar to that of M. rubra (GST = 0.146). These values are higher than the mean reported for wind-pollinated, outcrossing, long-lived woody species (GST = 0.077) (Hamrick et al. 1992), and slightly higher than that reported for tropical woody species (GST = 0.135) (Hamrick 1994). In general, not only is fruit dispersal restricted, but differences in flowering period also contribute to the genetic differentiation of the Hengchun and Taitung populations of M. adenophora. The same phenomenon was also observed within M. rubra but has not been well defined, especially among localities at different elevations.

From the Taiwan Forestry Research Institute, 53 Nan-Hai Road, Taipei, Taiwan. This research was financially supported by the Taiwan Forestry Research Institute (TFRI). This paper is contribution no. 133 of the TFRI. We thank Chien-Lien Pan and Wan-Long Chang, Hengchun Station of Taiwan Forestry Research Institute, for assistance with field sampling. Address correspondence to Yu-Pin Cheng at the address above or e-mail: [email protected].

© 2000 The American Genetic Association

References

Allen GA, Gottlieb LD, and Ford VS, 1991. Electrophoretic evidence for the independent origins of two self-pollinating subspecies of Clarkia concinna (Onagraceae). Can J Bot 69:2299–2301.
Cheliak WM and Pitel JA, 1984. Techniques for starch gel electrophoresis of enzymes from forest tree species. Information Report PI-X-42. Chalk River, Ontario: Petawawa National Forest Institute; 19–45.
Edwards AL and Wyatt R, 1994. Population genetics of the rare Asclepias texana and its widespread sister species, A. perennis. Syst Bot 19:291–307.
Feret PP, 1971. Isozyme variation in Picea glauca (Moench) Voss seedlings. Silvae Genet 20:46–50.
Gottlieb LD, 1973. Genetic differentiation, sympatric speciation and the origin of diploid species of Stephanomeria. Am J Bot 60:545–553.
Gottlieb LD, 1981. Electrophoretic evidence and plant populations. Progr Phytochem 7:1–46.
Hamrick JL, 1994. Genetic diversity and conservation in tropical forests. In: Proceedings of the International Symposium on Genetic Conservation and Production of Tropical Forest Tree Seeds, Chiang Mai, Thailand, 1993 (Drysdale RM, John SET, and Yapa AC, eds). Dordrecht, The Netherlands: Kluwer Academic; 95–124.
Hamrick JL, Godt MJW, and Sherman-Broyles SL, 1992. Factors influencing levels of genetic diversity in woody plant species. New For 6:95–124.
Hayata B, 1911. Materials for a flora of Formosa. J Coll Sci Imp Univ Tokyo 30(1):255.
Huang S, 1994. Genetic structure of Bretschneidera sinensis Hemsl. of Taiwan [in Chinese]. Ann Taiwan Mus 37:49–67.
Huang S, Chen CM, Hsu SL, and Lu SY, 1995. Population genetic of endangered species, Rhododendron kanehirai [in Chinese]. Biol Bull Nat Taiwan Normal Univ 30(2):63–68.
Kadereit JW, Comes HP, Curnow DJ, Irwin JA, and Abbott RJ, 1995. Chloroplast DNA and isozyme analysis of the progenitor-derivative species relationship between Senecio nebrodensis and S. viscosus (Asteraceae). Am J Bot 82:1179–1185.
Kruckeberg AR, 1957. Variation in fertility of hybrids between isolated populations of the serpentine species, Streptanthus glandulosus Hook. Evolution 11:185–211.
Kruckeberg AR, 1986. An essay: the stimulus of unusual geologies for plant speciation. Syst Bot 11:455–463.
Li CC and Horvitz DG, 1953. Some methods of estimating the inbreeding coefficient. Am J Hum Genet 5:107–117.
Lin TP, Wang CT, and Yang JC, 1998. Comparison of genetic diversity between Cunninghamia konishii and C. lanceolata. J Hered 89:370–373.
Liu KB, 1988. Quaternary history of the temperate forest of China. Q Sci Rev 7:1–20.
Mason HL, 1946a. The edaphic factor in narrow endemism. I. The nature of environmental influences. Madrono 8:209–240.
Mason HL, 1946b. The edaphic factor in narrow endemism. II. The geographic occurrence of plants of highly restricted patterns of distribution. Madrono 8:241–272.
Mayer MS, Soltis PS, and Soltis DE, 1994. The evolution of the Streptanthus glandulosus complex (Cruciferae): genetic divergence and gene flow in serpentine endemics. Am J Bot 81:1288–1299.
Milligan BG, Leebens-Mack J, and Strand E, 1994. Conservation genetics: beyond the maintenance of marker diversity. Mol Ecol 3:423–435.
Nei M, 1973. Analysis of gene diversity in subdivided populations. Proc Natl Acad Sci USA 70:3321–3323.
Proctor J and Woodell SRJ, 1975. The ecology of serpentine soils. Adv Ecol Res 9:255–365.
Purdy BG and Bayer RJ, 1995. Allozyme variation in the Athabasca sand dune endemic, Salix silicicola, and the closely related widespread species, S. alaxensis. Syst Bot 20:179–190.
Purps DML and Kadereit JW, 1998. RAPD evidence for a relationship of the presumed progenitor-derivative species pair Senecio nebrodensis and S. viscosus (Asteraceae). Plant Syst Evol 211:57–70.
Rohlf FJ, 1992. NTSYS-pc: numerical taxonomy and multivariate analysis system, version 1.8. Setauket, NY: Exeter Software.
Sneath PH and Sokal RR, 1973. Numerical taxonomy: the principles and practice of numerical classification. San Francisco: W.H. Freeman.
Swofford DL and Selander RB, 1989. A computer program for the analysis of allelic variation in population genetics and biochemical systematics, version 1.7. Champaign, IL: Illinois Natural History Survey.
Tsukada M, 1967. Vegetation in subtropical Formosa during the Pleistocene glaciations and the Holocene. Paleogeog Paleoclimat Paleoecol 3:49–64.
Wang CT, Wang WY, Chiang CH, Wang YN, and Lin TP, 1996. Low genetic variation in Amentotaxus formosana Li revealed by isozyme analysis and random amplified polymorphic DNA markers. Heredity 77:388–395.
Wright S, 1951. The genetic structure of populations. Ann Eugen 15:323–354.
Wright S, 1965. The interpretation of population structure by F-statistics with special regard to systems of mating. Evolution 19:395–420.

Received December 28, 1998
Accepted August 18, 1999
Corresponding Editor: James L. Hamrick
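As a worked check on two of the derived statistics reported in this paper, the following minimal Python sketch recomputes the gene-flow estimate Nm = (1 − GST)/4GST (Wright 1951) from the reported GST values, and compares a few identity and distance pairs from Table 4. The function names are hypothetical and chosen only for illustration. The relation D = −ln(I) is Nei's standard definition of genetic distance; it is assumed here because the text does not state it, although the tabulated values appear consistent with it.

```python
import math


def gene_flow_nm(gst: float) -> float:
    """Estimate gene flow as Nm = (1 - GST) / (4 * GST) (Wright 1951)."""
    if not 0.0 < gst <= 1.0:
        raise ValueError("GST must lie in the interval (0, 1]")
    return (1.0 - gst) / (4.0 * gst)


def nei_distance(identity: float) -> float:
    """Genetic distance from identity, assuming Nei's relation D = -ln(I)."""
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must lie in the interval (0, 1]")
    return -math.log(identity)


# GST values reported in the text for each species.
nm_adenophora = gene_flow_nm(0.152)
nm_rubra = gene_flow_nm(0.146)
print(f"M. adenophora: Nm = {nm_adenophora:.2f}")  # text reports 1.39
print(f"M. rubra:      Nm = {nm_rubra:.2f}")       # text reports 1.46

# Identity values taken from above the diagonal of Table 4.
pairs = {
    ("Chiupeng", "Hsuhai"): 0.999,
    ("Tienkuan", "Lanshan"): 0.998,
    ("Chufengpi", "Lanshan"): 0.907,
}
for (pop_a, pop_b), identity in pairs.items():
    print(f"{pop_a} vs {pop_b}: D = {nei_distance(identity):.3f}")
```

Under these assumptions the recomputed Nm values round to the figures given in the Results, and the three distances round to 0.001, 0.002, and 0.098, the entries below the diagonal of Table 4 for the same population pairs.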

66 The Journal of Heredity 2000:91(1)