MOLECULAR GENETIC ANALYSIS OF THE PHYLOGENETIC

RELATIONSHIPS OF

DISSERTATION

Presented in Partial Fulfillment of the Requirements for

the Degree Doctor of Philosophy in the Graduate

School of The Ohio State University

By

Gregory Charles Bonner Booton, B. S.

The Ohio State University

1995

Dissertation Committee:

Paul A. Fuerst

C. W. Birky

P. Parker

R. Skavaril

J. Downhower

Approved by

Advisor

Department of Molecular Genetics

UMI Number: 9525995

UMI Microform 9525995 Copyright 1995, by UMI Company. All rights reserved.

This microform edition is protected against unauthorized copying under Title 17, United States Code.

UMI, 300 North Zeeb Road, Ann Arbor, MI 48103

To my , thank you

ACKNOWLEDGEMENTS

I would like to sincerely thank my advisor, Dr. Paul A. Fuerst, for his support and encouragement from the beginning of this project. Without him as my advisor, I am sure my project would never have come to fruition. My thanks also go to the other members of my committee for their helpful suggestions, and to the staff of The Johnson Aquatic Complex and The Columbus Zoo for their support, both personally and financially. In addition, the financial support of The Ohio State University and The National Science Foundation has been critical to the completion of this project, and I greatly appreciate that assistance. I would also like to thank Drs. Les Kaufman and Mark Chandler for their support and insight into the Lake Victoria cichlid problem. Also, without the support of the fellow members of my laboratory, none of this work would have been possible.

Finally, I would like to deeply thank my family; without their support, understanding, and faith, this achievement would certainly never have been possible.

VITA

November 7, 1958 ...... Born, Columbus, Ohio

1988 ...... B.S., The Ohio State University, Columbus, Ohio

1988-1993 ...... Graduate Assistant, The Ohio State University, Columbus, Ohio

1993-1995 ...... Research Assistant, The Ohio State University, Columbus, Ohio

PUBLICATIONS

Booton, Gregory C., Doug Warmolts, Sandy Andromeda, and Paul A. Fuerst. 1991. Analysis of Mitochondrial Sequence Variation in Endangered of Lake Victoria cichlid fish. Ohio Journal of Science, 92: 2.

FIELDS OF STUDY

Major Field: Molecular

Studies in Phylogenetic Reconstruction, Molecular Genetic Analysis of Taxonomic Relationships, Dr. Paul A. Fuerst

TABLE OF CONTENTS

DEDICATION

ACKNOWLEDGEMENTS

VITA

LIST OF TABLES

LIST OF FIGURES

CHAPTERS

I. INTRODUCTION

   A. The Cichlidae Taxa of The Great Lakes of Africa
   B. The Family Cichlidae
   C. Taxonomic History of The Lake Victoria Cichlid Species Flock
   D. Historical and Current Challenges to the Lake Victoria Cichlid Fauna, and Conservation Efforts
   E. Phylogenetic Reconstruction and its Importance to Conservation Efforts Regarding Lake Victoria

II. TAXONOMIC SYNONYMIES AND PHYLOGENETIC RECONSTRUCTION METHODS

   A. Taxa
   B. Phylogenetic Reconstruction Methodology

III. PHYLOGENETIC RELATIONSHIPS OF A LAKE VICTORIA SPECIES DERIVED FROM 18S RIBOSOMAL DNA

   A. Introduction
   B. Materials and Methods
   C. Results
   D. Discussion

IV. AN ANALYSIS OF PHYLOGENETIC RELATIONSHIPS OF LAKE VICTORIA CICHLID SPECIES USING INTERNAL TRANSCRIBED SPACER ONE (ITS 1) OF THE RIBOSOMAL GENE OPERON

   A. Background
   B. Methods and Materials
   C. Results
   D. Discussion

V. COMPARISON OF A RANDOMLY AMPLIFIED POLYMORPHIC DNA (RAPD) PHYLOGENY OF LAKE VICTORIA CICHLID SPECIES WITH A PHYLOGENY PRODUCED FROM MORPHOLOGICAL DATA

   A. Background
   B. Methodology
      a. Experimental Design
      b. Collections
      c. Experimental Conditions
   C. Results
   D. Discussion

SUMMARY

APPENDICES

   Appendix A: Data relevant to Chapter III
   Appendix B: Data relevant to Chapter V
   Appendix C: Use Protocol

LIST OF REFERENCES

LIST OF TABLES

Tables

1. Taxonomic synonymy of 18S rDNA analysis species

2. Taxonomic synonymy of taxa analyzed for Internal Transcribed Spacer One (ITS 1)

3. Taxonomic synonymy of taxa examined by RAPD analysis

4. 18S Ribosomal DNA Primers used in Amplification and Sequencing

5. Distance matrix of the five species in the first 18S rRNA primary sequence alignment

6. Corrected Genetic Distances of 18S rRNA Sequences

7. Species examined for ribosomal ITS 1 study

8. Internal Transcribed Spacer One Amplification and Sequencing Primers

9. Distance Matrix and Sequence Dissimilarity of Internal Transcribed Spacer One (ITS 1)

10. Primers used for RAPD Analysis

11. Taxonomic Designation of Species Analyzed in RAPD study

12. Species sample size summary for RAPD analysis

13. Generic Abbreviations used in RAPD phylogenetic trees

14. Distribution of differences between 100 pairwise comparisons of RAPD derived trees

LIST OF FIGURES

Figures

1. Phylogenetic Relationship within the Suborder Labroidei

2. Proposed Relationship within the Family Cichlidae based on Morphological Characteristics

3. Eukaryotic rRNA Gene Cluster Arrangement

4. Proposed Inter-familial Relationship of the Family Cichlidae

5. Proposed Secondary Structure of Gaurochromis sp. 18S rRNA

6. Neighbor-joining tree of five taxa based on truncated 18S rDNA sequences

7. 18S rRNA N-J Gene Tree

8. 18S rRNA N-J Gene Tree with relative distances

9. Maximum Parsimony 18S rRNA Gene Tree

10. Internal Transcribed Spacer One (ITS 1)

11. Internal Transcribed Spacer One (ITS 1) Sequence Alignment

12. Internal Transcribed Spacer One (ITS 1) Consensus Distance and Parsimony Gene Tree

13. Representative RAPD gel

14. Majority Rule Consensus Tree for the 100 Most Parsimonious RAPD Trees

15. Majority Rule Consensus Tree for the 900 Most Parsimonious RAPD Trees

16. Majority Rule consensus tree of the 100 most parsimonious trees on truncated data set

17. Majority Rule consensus tree of the 100 most parsimonious trees following removal of taxa with missing data

18. Kenyan taxa only RAPD tree

19. Ugandan taxa only RAPD tree

20. Proposed phylogeny of haplochromine taxa based on scale and squamation characters

21. Truncated morphological phylogeny

22. First Majority Rule consensus tree for RAPD data using taxa previously studied by Lippitsch

23. Second Majority Rule consensus tree for RAPD data using taxa previously studied by Lippitsch

24. RAPD phylogeny using genera tested by Lippitsch using Kenyan derived specimens

25. RAPD phylogeny using genera tested by Lippitsch using Ugandan derived specimens

26. Primary Sequence alignment of 18S rDNA sequences for taxa analyzed in this study

27. RAPD data matrix

CHAPTER I

INTRODUCTION

A. The Cichlidae Taxa of The Great Lakes of Africa

The great lakes of Africa harbor some of the most striking diversity of freshwater fish on earth (Kaufman and Cohen, 1993). Lakes Victoria, Tanganyika, and Malawi (and other nearby smaller lakes) collectively contain hundreds of fish taxa, the overwhelming majority in the family Cichlidae. Nearly all cichlid species found in these lakes are endemic (Ribbink, 1991). The two oldest of the lakes, Tanganyika and Malawi, are deep rift lakes whose origins date from five to twenty million years ago (Tanganyika) and one hundred thousand to two million years ago (Malawi). By contrast, Lake Victoria is a shallow lake whose origins are dated between one million and 15,000 years before present (ybp). Victoria's waters are also more turbid than those of the other two great lakes, owing to higher levels of vegetation, shallow depth, and a higher rate of turnover of substrate sediment, and the lake does not contain the same quantity of rocky habitat found in Tanganyika and Malawi.

The hypothetical phylogenetic relationship between Cichlidae and other families in the Suborder Labroidei is shown in Figure 1. This cladogram is based on morphological and behavioral characters which are able to discriminate familial relationships.

[Figure 1: cladogram of the families Cichlidae, Embiotocidae, Pomacentridae, and Labridae]

Figure 1. Phylogenetic Relationship within the Suborder Labroidei. This phylogeny proposes Cichlidae as a sister group to the remaining families. Numbers reflect morphological or behavioral character changes. Crossbars joining closed circles represent gain or loss of derived characters which were used to determine this phylogeny. (Redrawn from Stiassny, 1991)

[Figure 2: cladogram of the major cichlid groups, including the Neotropical cichlids]

Figure 2. Proposed Relationship within the Family Cichlidae based on Morphological Characteristics. This tree is derived from the consensus of the 54 most parsimonious trees analyzing 28 morphological or behavioral characteristics. (Redrawn from Stiassny, 1991)

These characters are insufficient to distinguish taxa within the families, including Cichlidae. Other, higher resolution phylogenies which discriminate within the Cichlidae have been proposed, and one of these is presented in Figure 2. This phylogeny is able to distinguish between groups of cichlid taxa, as well as specific genera in some cases, e.g., Heterochromis. Even together, the characters analyzed for Figures 1 and 2 are insufficient to resolve close taxonomic relationships. The effort to elucidate these relationships between cichlid species has confounded researchers for many years (Kornfield, 1991). In Africa, major groups of endemic cichlids (those taxa which are found only in that particular lake), sometimes referred to as species flocks, are found in the Great Lakes of Tanganyika, Malawi, and Victoria (Fryer, 1977, Greenwood, 1984a, 1984b, Avise, 1990). In addition, limited endemic flocks are found in several nearby smaller lakes. In the phylogeny of Figure 2, the Lake Victorian cichlids which are the subject of this research reside with other taxa in the group titled "The Rest". The aim of my research is to determine the phylogenetic relationships of the Lake Victoria taxa to other Cichlidae species, as well as to determine the phylogenetic relatedness among the Lake Victoria species.

1. Lakes Tanganyika and Malawi:

Lake Tanganyika is a long, deep, narrow rift lake of approximately 19,500 km², with a maximum depth of 1470 m, making it the second deepest lake in the world (after Lake Baikal). Twenty million years ago (Miocene), tectonic activity in this area deepened the predecessor lake(s) of Tanganyika into two lakes whose water levels changed vertically with relative frequency during climatic fluctuations. As recently as 200,000 ybp the lake was separated into three distinct lakes (Scholz and Rosendahl, 1988). This history has led to many fish invasions from connected rivers of the Zaire basin. The separation into distinct lakes in the past is hypothesized to have led to the subsequent intralacustrine subdivision found in many of these fish taxa (Ribbink, 1986). Currently, Lake Tanganyika contains nearly three hundred species of fish, representing twenty-four families (Sturmbauer and Meyer, 1992). Of these, almost 80% are endemic to the lake. In addition to the large number of endemic species, there are many color forms of particular species within the lake.

In contrast to Lake Tanganyika, Lake Malawi appears to have originated as a single lake (Scholz and Rosendahl, 1988). Like Tanganyika, it is a deep rift lake (maximum depth 785 m) of approximately 45,000 km² (Lowe-McConnell, 1993). The origin of this lake is placed at one and a half to two million years ago. Since that time the lake has undergone major fluctuations in water level, events which have continued into historical time, the last ending in the mid-nineteenth century (Owen et al., 1990). Lake Malawi is estimated to contain nearly five hundred cichlid species, with many color morphs of particular species which are endemic to specific locations within the lake (Konigs, 1990). Of this large number, only four are non-endemic to Lake Malawi. In addition to the large number of cichlids, there are many endemic non-cichlids.

Many of the species found in Lake Malawi are morphologically very similar to taxa found in Lake Victoria, and until recently many had been placed in the same genus, Haplochromis. This suggests that taxa found in Lake Victoria are closely related to, or derived from, species in Lake Malawi.

2. Lake Victoria:

Lake Victoria is the youngest of the major great lakes of Africa. Its origin has been estimated at between one million and 15,000 years before present. The older estimated ages are proposed for the formation of the original lake in this location. Changes in precipitation, as evidenced by core samples in the lake basin, reveal numerous times in the past when lake levels dropped dramatically. The most recent is estimated by these methods to have occurred approximately 15,000 years ago (Stager, Reinthal, and Livingstone, 1986, Owen et al., 1990). Unlike Lakes Tanganyika and Malawi, Victoria is not a rift lake, but rather was formed by the ponding of numerous rivers and the fusion of older, smaller lakes which have occurred in the region since the Miocene (Lowe-McConnell, 1993). The surface area of the lake is 69,000 km², making it the second largest freshwater lake (in surface area) in the world, following Lake Superior. The average depth of Lake Victoria, in stark contrast to the rift lakes, is only 40 meters. In comparison to the generally rocky habitat found in the other two great lakes of Africa, Lake Victoria is generally sandy bottomed, with large amounts of vegetation found along the shoreline. There are many large bays around the perimeter of the lake, and numerous islands (hilltops in a previous landscape) dot the lake (Lowe-McConnell, 1993). Climatic fluctuations, as evidenced by the core samplings, led to changes in lake depth within Lake Victoria over time. Due to the shallowness of the lake, increases in precipitation or periods of flooding would have acted differently in Lake Victoria than in the rift lakes. Instead of a vertical change in water level, Lake Victoria's waters would spread in a horizontal fashion (Stager, Reinthal, and Livingstone, 1986). The effect that this horizontal/vertical fluctuation dichotomy between the rift lakes and Victoria would have had on speciation is speculative, but is under investigation by examination of similar species which are found in Lake Victoria and the nearby, but presently isolated, smaller lakes which surround it (L. Kaufman, pers. comm.). Whatever the mode of speciation, hundreds of endemic cichlid species were previously found in the lake, although current conditions have substantially lowered that number (see below; Baskin, 1994).

In summary, the geologic conditions which gave rise to these lakes were distinctly different, but each has evolved a rich fauna of endemic taxa. These taxa attract study from biologists for many reasons. To evolutionary biologists, one of the more interesting phenomena is that, on initial inspection, Lake Victoria may appear to have evolved a rich species flock of cichlid fish under sympatric conditions, in a relatively short evolutionary time (Maynard-Smith, 1966, Futuyma and Mayer, 1980, Kondrashov and Mina, 1986). The possibility of sympatric speciation continues to be a controversial issue, and Lake Victoria provides a natural laboratory in which to study possible modes of speciation by examination of geographically isolated taxa. Traditional allopatric speciation, where species arise following geographic isolation of a population, may have occurred in Lake Victoria. However, the existence of these taxa in a single lake raises the possibility that these species arose sympatrically.

Cichlid success is not limited to the Great Lakes of Africa. The Cichlidae are a prolific group in many areas of the world. The following section presents a brief examination of the evolutionary history of this highly successful family.

B. The Family Cichlidae

Bony fish are among the most successful of vertebrate groups, comprising about as many species as all other vertebrates combined. Among the large number of families in the fish order Perciformes is the family Cichlidae. Fossil specimens which clearly show anatomical characteristics found in cichlids appeared in either the Eocene or the Oligocene (Woodward, 1939). These earliest fossils were found in South America and have physical characteristics of modern day cichlids. Intermediate forms leading to these fully cichlid forms have not been found, and thus the origin of the family Cichlidae remains unclear (Van Couvering, 1982).

Physical characteristics of this family are a single nostril on each side of the head, spiny rays, and a variety of forms of secondary (pharyngeal) teeth (McCallister, 1968). Behaviorally, this family is noted for its complicated social structure, including a high level of parental care, leading in some species to maternal mouth brooding of young (Baerends and Baerends-van Roon, 1950, Fryer and Iles, 1972, Barlow, 1991, Keenleyside, 1991). The distribution of the Cichlidae is generally in tropical areas of the world (Berra, 1981). In Africa they are widely distributed throughout the continent. Species are also found in Israel and Syria, and on the island of Madagascar. They are also widely distributed across Central and South America, including the Caribbean, and extend as far north as Texas in the United States (Lowe-McConnell, 1991). Some Asian species are found in India.

The relationship of Cichlidae to other Perciformes families has been an area of controversy for many years. The originally proposed grouping of the families Embiotocidae, Pomacentridae, and Labridae with Cichlidae (Muller, 1843) appears robust, given the recent inclusion of these families in the suborder Labroidei following their reexamination (Kaufman and Liem, 1982). The relationships within this suborder are still being investigated. Proposed phylogenies include some in which Cichlidae is most closely related to Pomacentridae. The Pomacentridae are colorful marine species, including the familiar damselfish taxa, which exhibit a range of behavioral characteristics similar to cichlids. Such a relationship is also supported by the shared morphological character of a single nostril on each side of the head, similarities in feeding mechanisms in both groups, and the presence of a variety of coloration patterns in many species. Other workers have proposed the Cichlidae as a sister group to Embiotocidae, Scaridae, and Labridae based on other pharyngeal bone characteristics (Van Couvering, 1982, Stiassny and Jensen, 1987, Fig. 1). Accurate determination of these relationships is confounded by the phylogenetic noise produced by the number of morphological homoplasies which are observed. The resolution of taxonomic hierarchy in this suborder awaits further investigation.

The intrafamilial relationships between the major groups of Cichlidae are also an area of active investigation, and are similarly not fully resolved. The results of a recent review, obtained by the analysis of 28 morphological and behavioral characteristics, are shown in Fig. 2 (adapted from Stiassny, 1991). The tree is the consensus of the 54 most parsimonious trees produced by that analysis. In this phylogeny, as well as in previously proposed phylogenies, the Eutroplines (which include a Madagascan genus) are ancestral to both the Neotropical cichlids and many African genera.
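As an illustration of how a consensus of several equally parsimonious trees can be formed, the following is a minimal sketch in Python. It assumes a majority-rule consensus, which is only one of several consensus methods; the text does not state which method was used for Figure 2. The four taxa A to D and the three input trees are hypothetical, and trees are written as nested tuples of leaf names.

```python
from collections import Counter

def leaves(tree):
    """Return the frozenset of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clades(tree):
    """Return every internal clade of a rooted tree, excluding the root."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return
        found.add(leaves(node))
        for child in node:
            walk(child)

    walk(tree)
    found.discard(leaves(tree))  # the clade of all taxa carries no information
    return found

def majority_rule_clades(trees):
    """Keep clades present in more than half of the trees, with their frequency."""
    counts = Counter(clade for tree in trees for clade in clades(tree))
    cutoff = len(trees) / 2
    return {clade: n / len(trees) for clade, n in counts.items() if n > cutoff}

# Three hypothetical, equally parsimonious rooted trees on taxa A, B, C, D.
trees = [
    ((("A", "B"), "C"), "D"),
    ((("A", "B"), "D"), "C"),
    (("A", ("B", "C")), "D"),
]
support = majority_rule_clades(trees)
for clade, freq in sorted(support.items(), key=lambda item: len(item[0])):
    print(sorted(clade), round(freq, 2))
```

In this toy case the groupings (A, B) and (A, B, C) each occur in two of the three trees and are retained, while groupings found in only one tree are collapsed.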

Wherever they are found throughout their range, the Cichlidae exploit a variety of ecological niches, and in many cases dominate the fish fauna of a region. Their success, interesting behavioral characteristics, phenotypically plastic nature, and, in many cases, their explosive speciation have made them a focus of a large number of scientific investigations. In addition to their merit for scientific study, cichlids are a major food resource in many areas of the world, comprising the major source of protein in some regions (Pullin, 1991). The intrinsic beauty of some cichlid species also deserves mention: many species are as colorful as popular marine fish. This attractiveness, and their interesting social behavior, has made this family a favorite with aquarists for many years. However, in many areas of the globe, these important taxa are threatened with extinction. This situation has led to efforts to preserve and restore some species. The preservation effort includes not only the major food species, but also some of the lesser known species (Lowe-McConnell, 1990, Bootsma and Hecky, 1993, Coulter and Muramba, 1993, Reinthal, 1993). These include cichlid species from Lake Victoria, whose phylogenetic evaluation is the focus of the studies in this dissertation.

C. Taxonomic History of The Lake Victoria Cichlid Species Flock

The majority of the cichlid species of Lake Victoria were originally placed in the genus Haplochromis (Hilgendorf, 1888). This designation has its origins in the late nineteenth century, when Hilgendorf erected Haplochromis as a subgenus of Chromis based on anatomical differences found in one species from Lake Victoria. Much of the early work on the description of the cichlid species of Lake Victoria was carried out by taxonomists from the British Museum of Natural History, including Regan, Boulenger, and Trewavas (Trewavas, 1933). After 1945, work on the haplochromine species flock of Lake Victoria was associated with the East African Freshwater Fisheries Research Organization (EAFFRO) in Jinja, Uganda. Much of this work, including studies on phylogeny, ecology, fisheries, and conservation, was carried out by P. H. Greenwood. Eventually, hundreds of species from Lake Victoria were placed in the genus Haplochromis, and the generic justification was not challenged until a reexamination and revision by Greenwood himself in the late 1970s and early 1980s (Greenwood, 1959, 1966, 1974, 1979, 1980, 1991). The Haplochromis taxa were morphologically very similar, and Greenwood's revision of the genus was based on examination of characters associated with the trophic apparatus of these taxa. This resulted in the division of the genus Haplochromis into 20 separate lineages which were assigned generic or (less often) subgeneric rank.

This revision was based on few character differences, which made identification of new species based on these characters somewhat difficult. In addition, the characters he examined, those associated with the trophic apparatus, are known to be subject to strong selection pressures and ecological effects (Lippitsch, 1993). Greenwood also excluded the Malawi Haplochromis species from his reexamination. For these reasons, and because of the inherent reluctance to give up an established taxonomic system, this revision has met with limited acceptance. However, subsequent studies which explored Greenwood's revision have supported his conclusions. These studies examined many characteristics of these taxa, including egg spots, scale characteristics, body size, fecundity, egg size, characteristics of breeding (including timing, sites, and distribution of young), and other life history characteristics (Lippitsch, 1993, Goldschmidt and Witte, 1990, Goldschmidt, 1991). These studies found many interspecific differences, even among these closely related species, lending support to the taxonomic revision. Nonetheless, many workers still refer to many of these species as "haplochromines", and numerous novel "species" are still being discovered which are not easily identified or classified using Greenwood's revisions.

Regardless of the current controversy regarding the taxonomic characteristics of the species flock of Lake Victoria, its members are prolific and exploit all trophic niches available in their habitat. The largest number of described species (134 taxa) are classified as piscivores. Other large groups feed on insects (29 taxa), zooplankton (21 taxa), snails (23 taxa), detritus (16 taxa), and prawn (13 taxa). Smaller groups include those which feed on algae (10 taxa), parasites (2 taxa), plants (2 taxa), and crabs (1 taxon). The food source of a substantial number (53 taxa) of these species is unknown (Van Oijen, 1982, Witte and Van Oijen, 1990, Yamaoka, 1991, Goldschmidt and Witte, 1992). Witte has shown that, contrary to previous speculation regarding the lakewide distribution of Victorian cichlids, many species are not widely distributed (Witte, 1984). In fact, many species are endemic to specific areas of the lake, such as Mwanza Gulf, where a great deal of research has been done (Van Oijen, Witte, and Witte-Maas, 1981). An ecological segregation of these species has been proposed by Goldschmidt and Witte, which separates them by substrate, vertical and horizontal distribution, and food source (Goldschmidt, Witte, and deVisser, 1990, Goldschmidt and Witte, 1992). This hypothesized scheme separates the taxa according to the ecological niches they exploit. It does not, however, reveal the evolutionary relationships among these taxa, and would only allow accurate classification if evolutionary relationship were correlated with trophic type, which is problematic.
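The trophic group counts quoted above can be tallied in a few lines of Python. This is a simple arithmetic sketch: the counts are those given in the text, the dictionary labels are shorthand introduced here, and the total assumes that each taxon is counted in exactly one trophic group, which the text does not state explicitly.

```python
# Number of described taxa per trophic group, as quoted in the text.
trophic_groups = {
    "fish": 134,
    "insects": 29,
    "zooplankton": 21,
    "snails": 23,
    "detritus": 16,
    "prawn": 13,
    "algae": 10,
    "parasites": 2,
    "plants": 2,
    "crabs": 1,
    "unknown": 53,
}

# Total number of taxa, assuming the groups do not overlap.
total = sum(trophic_groups.values())

# Share of each group as a percentage of the total, to one decimal place.
shares = {group: round(100 * n / total, 1) for group, n in trophic_groups.items()}

print("total taxa:", total)
for group, n in sorted(trophic_groups.items(), key=lambda item: -item[1]):
    print(f"{group:12s} {n:4d} {shares[group]:5.1f}%")
```

Under that assumption the listed groups sum to 304 taxa, of which piscivores make up roughly 44% and taxa of unknown diet roughly 17%.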

Regardless of their phylogenetic relationship, the radiation of such a large number of species in a short period of time is an area of much speculation. On first examination it would appear that this large assemblage of fish taxa may have speciated under sympatric conditions within Lake Victoria. One sympatric speciation model, developed by Maynard-Smith, postulates selection, specifically habitat selection, acting within a panmictic population that contains polymorphisms which allow it to exploit new niches. This would result in the formation of sub-populations able to exploit new niches (Maynard-Smith, 1966). Initially, these sub-populations adapted to the new niche would coexist and still be capable of interbreeding. Later, reproductive isolation is hypothesized to follow. This would favor interbreeding within each sub-population and strengthen its association with the newly exploited niche. Eventually this would lead to reproductive isolation and new species formation. Given this scenario, one might expect to observe various morphological steps, referred to as a morphocline, as these forces of habitat selection proceeded. A morphocline is defined as the existence of species which represent adaptive steps leading from an ancestral generalized form to species representing the most specialized type. The cichlid species of Lake Victoria do exhibit such a morphocline in derived feeding apparatus characters. Since these taxa are all found within one lake, this may be used as evidence for sympatric speciation, as it has been in smaller lakes (Schliewen, Tautz, and Pääbo, 1994). However, the rarity of strongly supported cases of sympatric speciation makes it difficult to accept this type of scenario (Futuyma and Mayer, 1980). Also, many of the observations on these fish are consistent with more traditional allopatric models of speciation.

The alternative explanation is that the large number of species within Lake Victoria diverged under allopatric conditions. Soon after the formation of Lake Victoria, it is hypothesized that one or more riverine species invaded the lake (Temple, 1969). Without competition, this founder was able to exploit these new and different environments, and quickly spread to the many ecological niches which the lake provided (Barel et al., 1989). Species such as cichlids may be particularly adept at exploiting such situations. The known plasticity of many of their key morphological characters, such as pharyngeal teeth form, would position them to adapt rapidly to a new environment (Witte, Barel, and Hoogerhound, 1990, Liem, 1991, Sackley, 1992). An aquarist who has kept cichlids of any species can attest to their ability to change their environment! This scenario could lead to rapid geographical isolation of populations within the lake, each exploiting different ecological niches. This type of isolation is supported by the observation of species which apparently only inhabit specific gulfs or bays (Kaufman and Ochumba, 1993). The broad range of open ecological niches, and the cichlids' ability to exploit them rapidly, could lead to their dominance of the resident fish fauna in a relatively short period of time. Following geographical isolation, acquisition of behavioral or ecological characteristics may lead to reproductive barriers between these separated populations, leading the way to eventual speciation (McKeye, 1991, Nelissen, 1991, Noakes, 1991, Wilson, 1992). This process, repeated enough times in this large lake, could lead to the large number of species observed today, satisfying the conditions of an allopatric model.

This scenario may have been followed in all of the African Great Lakes to varying degrees. An opportunistic, phenotypically plastic, riverine cichlid found itself in a new and varied environment that it was able to exploit quickly to full advantage. Subsequent reproductive, ecological, or behavioral isolation would eventually lead to the formation of nascent species. The results may be morphologically different in the lakes, but the process may have been similar.

Further confusing the problem of understanding speciation in Lake Victoria is the controversy over which of these described forms are indeed species as defined by the criteria of the biological species concept (Mayr, 1964). The description of the various forms of Lake Victoria cichlids by taxonomists has rarely been followed by direct examination of their validity as true species

(Capron de Caprona and Fritzsch, 1984). The major justification of these forms as biological species comes from less direct evidence.

Differences in morphological characteristics (such as male breeding coloration), and ecological segregation, are cited as evidence for genetic isolation (Fryer, 1977). Regardless of the lack of direct evidence of true species characteristics of the members of this flock, groupings of forms based on similarities of morphological and ecological characteristics provides a hypothetical taxonomic framework that can be experimentally tested.

Although the taxonomic designations of these species are still under investigation, the occurrence of such a large number of putative taxa which arose in a short evolutionary period is of great interest to scientists studying speciation processes. In instances such as these, where close taxonomic groupings of endemic species are found, they are often referred to as a species flock (Mayr, 1984). Generally accepted criteria to define a species flock include: 1) monophyly, 2) the species are endemic within an area, 3) they are speciose, and 4) relative to other areas of a similar size, there is a large number of these taxa (Ribbink, 1984). Recent allozyme and mitochondrial DNA analyses support the monophyly of the Victoria flock (Meyer et al., 1990, see below). While not all of the above criteria have been discussed in detail here as they relate to the Lake Victoria taxa, these taxa do have all of these characteristics.

In summary, the Lake Victoria cichlid species flock represents a recent, possibly continuing, explosive radiation of vertebrate taxa. The ability to distinguish between these incipient species, and verify their identity, is a challenge to evolutionary biologists. Molecular analyses, which are the subject of this dissertation, may be better suited for comparisons of closely related taxa. Unfortunately, the ability to study this fascinating radiation of fish is evaporating due to ecological degradation, and human pressures on this lake.

D. Historical and Current Challenges to the Lake Victoria Cichlid Fauna, and Conservation Efforts

The fish fauna of Lake Victoria has been under pressure for much of the twentieth century. Gill net fisheries in the early part of the century were responsible for the decimation of the major food fish in the lake, Oreochromis esculentus, a tilapine cichlid species (Achieng, 1990). Within the past thirty years the large species flock has been decimated due to three major factors. One has been the introduction in the early sixties of a non-native species, the Nile perch (Lates niloticus), a large predatory fish (Ogutu-Ohwayo, 1993). This introduction was originally an experimental attempt to transfer the protein biomass found in the large number of cichlid species into a form more easily harvested by native fisheries (Achieng, 1990, Barel et al., 1991, Goldschmidt, Witte, and Wanink, 1993). As an experiment to convert biomass this must be considered at least a temporary success. The large increase in fisheries in the area attests to this. However, much of the catch is exported and little but scraps remain for local populations. Furthermore, the decimation of the haplochromine species has altered the conversion of lower organisms into fish biomass. Nile perch are unable to exploit the variety of food sources taken previously by haplochromines. Further, as the Nile perch have fewer haplochromines to feed on, they have turned on the younger members of their own species. Nile perch catches have declined in recent years, while at the same time the average size of Nile perch has decreased.

A second factor leading to the haplochromine decline has been the overfishing which has occurred in the twentieth century due to more efficient techniques and larger fishing operations. A third detrimental factor has been increased pollution of Lake Victoria due to erosion runoff and environmental pollutants (Ogutu-Ohwayo, 1990, Cohen et al., 1993). These factors have led to a situation where perhaps more than half of the endemic species of Victorian cichlids are either extinct, endangered, or threatened in the wild (Kaufman, 1989, 1992). Without intervention this situation will probably lead to the eradication of most of these species, which are potentially informative for evolutionary and other research.

With this threat in mind a campaign was initiated to provide for the maintenance in captivity of a representative number of these species (Kaufman, 1989). The group responsible for this program is the Captive Breeding Specialty Group (CBSG), which is part of the International Union for the Conservation of Nature (IUCN). Several international institutions are involved in this work. A Species Survival Plan (SSP) was approved in February, 1991 by the Association of Zoos and Aquariums, the North American governing body of zoos involved in conservation, captive breeding, and display of (Kaufman, 1989). This plan proposes to work with approximately 36 of the Victorian cichlid taxa, representing about 10% of the number of species originally thought to be present in the lake. Aspects of this plan involve research into the husbandry of these fish in captivity, examination of phenotypic plasticity, taxonomic investigations, as well as research on the population and evolutionary genetics of the species. The ultimate goal of this program would be reintroduction, if possible, of the managed taxa to Lake Victoria when environmental conditions have improved (Williams et al., 1988).

It is important to the long term survival of these species that reintroduction is done as soon as possible to minimize the loss of genetic variation in the captive breeding group. Maintenance of genetic variation is important to maximize the ability of the taxa to respond to future environmental changes, such as would be encountered following reintroduction to a Lake Victoria which has undergone ecological changes (Frankel and Soule, 1981). Species which are brought into captive breeding programs face a variety of threats to genetic variation. Among these are the effects of population bottlenecks, genetic drift, and inbreeding. When a sample of individuals is taken from the wild for inclusion in the program, the sample will go through a population bottleneck where the number of individuals is greatly reduced compared to its "natural population size" (Nei, Maruyama, and Chakraborty, 1975). After this population bottleneck, only a part of the original genetic variation from the species will remain in the sample. The long term severity of the genetic effects due to the bottleneck is dependent upon how fast the population size recovers and upon the original size of the founding population (Maruyama and Fuerst, 1984, 1985, Fuerst and Maruyama, 1986). Fortunately, if the population size recovers quickly, the losses of genetic variation may be minimal, even when founder stocks are small (Frankel and Soule, 1981). Due to the small population sizes of most of the program species they will also be subject to the effects of genetic drift. This will result in a further erosion of variation, and rare alleles which were present in the original population may be lost (Wright, 1955, Dobzhansky, 1951). Thus far, fish species in the captive propagation program have increased their population size quickly, but population sizes remain small. A third factor can be loss of genetic variation due to inbreeding. Reproduction between related individuals can lead to the expression of detrimental alleles. It also has an effect similar to genetic drift, a decrease in heterozygosity, or genotypic variation. In addition to these problems, there is the probability that captive propagation programs artificially select for those individuals which are best suited to captivity (Meffe, 1986). Some loss of genetic variation due to these factors is unavoidable in managed populations. The goal must be to minimize these losses over the time frame that the species is captively maintained. The captive breeding program for the Lake Victoria cichlids is designed to address these problems, but they cannot be completely avoided.
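The dependence of lost variation on founder number and on the speed of recovery, described above, can be illustrated with the textbook expectation for an idealized randomly mating population, in which expected heterozygosity declines by a factor of 1 - 1/(2N) each generation. The sketch below is only an illustration under that assumption; the function names, founder number, growth rates, and carrying capacity are hypothetical and are not taken from the breeding program.

```python
def growth_trajectory(founders, growth, capacity, generations):
    """Population size per generation under capped geometric growth.

    All parameter values passed in are hypothetical illustration values.
    """
    sizes = []
    n = float(founders)
    for _ in range(generations):
        sizes.append(n)
        n = min(float(capacity), n * growth)
    return sizes


def expected_heterozygosity(h0, sizes):
    """Expected heterozygosity after drift through the given population sizes.

    Assumes an idealized population in which heterozygosity is multiplied
    by (1 - 1/(2N)) in each generation of size N.
    """
    h = h0
    for n in sizes:
        h *= 1.0 - 1.0 / (2.0 * n)
    return h


if __name__ == "__main__":
    fast = growth_trajectory(founders=10, growth=2.0, capacity=500, generations=20)
    slow = growth_trajectory(founders=10, growth=1.1, capacity=500, generations=20)
    print("fast recovery:", round(expected_heterozygosity(1.0, fast), 3))
    print("slow recovery:", round(expected_heterozygosity(1.0, slow), 3))
```

With the same ten founders, the rapidly recovering population retains more of its initial heterozygosity than the slowly recovering one, which is the qualitative point made in the text.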

With a well designed captive breeding program in place, one key to the success of the species survival plan becomes the selection of taxa for captive propagation. As discussed in the previous section, current information has not allowed researchers to reliably discern the phyletic relationships between many of the cichlid taxa of Lake Victoria. Without a better understanding of these relationships, selection of taxa for inclusion in the program is subjective, and may not be representative of the whole flock. Rapidly deteriorating conditions in the lake have already forced action, and currently a number of seriously threatened taxa are being propagated internationally. Reintroduction of some of these taxa to ponds or small lakes near Lake Victoria is now being initiated. Justification of inclusion of taxa into the SSP program based on the degree of the threat facing an individual species is also somewhat subjective. Since the phylogenetic relationships between these putative species are not clear, it may be that one species which is currently under severe pressure may have a genetically close relative which is not under the same pressure. In fact, it has been observed in Lake Victoria that haplochromine species respond differentially to Nile perch predation and environmental decline (Kaufman and Ochumba, 1993). If closely related, differentially sensitive taxa are observed, inclusion of the severely threatened taxa may not be an effective use of resources.

For such practical reasons, and to better understand the explosive speciation of the Lake Victoria haplochromine fauna, high resolution phylogenetic analysis is warranted.

E. Phylogenetic Reconstruction and its Importance to Conservation Efforts Regarding Lake Victoria Cichlids

The cichlid species flocks of the Great Lakes of Africa represent one of the most explosive speciation events of vertebrates observed (Stiassny, 1991, Meyer, 1989). In particular, the cichlid flock of Lake Victoria is thought to have arisen as a monophyletic group within the last 100,000 years, possibly as recently as 15,000 years ago (Meyer et al., 1990, Livingston, 1980). This group represents a large number of species, but the nature of the evolutionary relationships between these cichlid species, and their affinity to other Cichlidae species of the world, is still an area of active investigation (Kornfield, 1991). Better knowledge of these phylogenies may help us to understand the evolutionary processes that acted upon these taxa, for example by allowing tests of allopatric or sympatric models. However, after a century of research, these relationships are still debated. The difficulty in distinguishing among the Lake Victoria species is clear; it has even been suggested that accurate determination of the taxonomic relationships between these species may not be possible (Kornfield, 1991).

Nonetheless, the determination of phylogenetic relationships remains fundamental to our understanding of evolution and species relatedness in fishes. Traditionally, comparisons of morphological, behavioral, paleontological, and other data have been used to determine these relationships in cichlids (Lippitsch, 1993, Dominey, 1984, Van Couvering, 1982). The analysis of informative regions of protein and nucleic acid sequences is now being used to expand such studies in other fish groups (Leslie and Vrijenhoek, 1978, Kornfield et al., 1982, Avise and Saunders, 1984, Beckwitt, 1987, Powers, 1991, Echelle and Dowling, 1992). A few studies have sought to combine morphological and molecular data to determine the relationships between Lake Victoria cichlids. In Lake Victoria cichlids, allozyme analysis was able to discriminate between Tilapia, Astatoreochromis, and Hoplotilapia, but was unable to differentiate between ten species of "Haplochromines" (Sage et al., 1984). It is evident from these studies that isozyme analysis is unable to fully delineate relationships within Lake Victoria taxa. Identification of more phylogenetically informative characters is required.

The use of comparative molecular sequence analysis has now become a standard means of obtaining phylogenetic information on species from bacteria to man (Li and Graur, 1991, Swofford and Olsen, 1990). Widespread application of the Polymerase Chain Reaction (PCR) has accelerated these studies (Arnheim, White, and Rainey, 1990, Erlich, Gelfand, and Sninsky, 1991). Nuclear DNA regions which are now being studied for their possible phylogenetic information in cichlid taxa include the highly variable Major Histocompatibility Genes (MHC) (Ono et al., 1993). In addition to nuclear DNA, the cytoplasmically inherited mitochondrial DNA molecule has been examined by sequence analysis. The mitochondrial DNA molecule is one of the most rapidly evolving sequences in eucaryotes (Avise et al., 1987, Avise, 1991). Recently, studies of cichlid species from the African Great Lakes using DNA sequence data from mitochondrial DNA (mtDNA displacement loop [D-loop], cytochrome b, tRNAs, and RNA Pro regions) have been reported (Meyer, 1990, Sturmbauer and Meyer, 1992, Sturmbauer and Meyer, 1993). These genes evolve at different rates within the mitochondrial genome, and the information derived from them can be applied to phylogenetic questions at differing levels of relatedness. The most rapidly evolving region of the mitochondrial DNA, and thus best suited for examination of close phylogenetic relationships, is the D-loop.

Initial examination of this region in 36 mitochondrial sequences from African cichlids was able to discriminate between three groups of "haplochromines", two Haplochromis species groups from Lake Malawi and one from Lake Victoria (Meyer, 1990). The Lake Malawi groups apparently diverged three to four hundred thousand years before present (ybp). However, the group of Victoria "haplochromines" could not be separated from one another in this analysis. The entire group appeared to have an origin approximately one hundred thousand years ago, and the results support the monophyletic origin of the Haplochromine species within Lake Victoria (Meyer et al., 1990, Kocher et al., 1989). In subsequent studies using the rapidly evolving D-loop region, Sturmbauer and Meyer examined the relative substitution rates between a larger sample of the cichlid genera of the African Great Lakes (Sturmbauer and Meyer, 1992). One focus of this study was to examine the relative sequence divergence of this region in the widely distributed genus Tropheus in Lake Tanganyika, twenty four species of cichlids from Lake Malawi, and fourteen cichlid taxa from Lake Victoria. These results showed that Tropheus was more diverse than either of the two flocks combined, with the Lake Victoria taxa depauperate in genetic variation in this region. Although rapidly evolving, the mitochondrial genome's evolutionary history differs from that of nuclear genes. It is maternally inherited, and possible important contributions from males may be overlooked in mitochondrial analyses. Therefore, results from mitochondrial DNA may not be reflective of the nuclear genome.

Other, fine-grained methods which examine nuclear DNA exist which might be able to resolve these questions. One such method is the use of nuclear VNTR (Variable Number of Tandem Repeat) loci (DNA fingerprint analysis) (Jeffreys, Wilson, and Thein, 1985). This method has been used in fish research, and while it might be useful in separating these species, it is not without its drawbacks in interpretation, as well as practical obstacles of time, technical difficulty, and cost (Baker et al., 1992).

In the studies outlined in this thesis, regions of the nuclear genome which exhibit increasingly finer levels of resolution have been examined to elucidate aspects of the phylogenetic relationships of the cichlid taxa of Lake Victoria. The initial portion of this research focuses on the gross phylogenetic relationship of the Cichlidae to other fish and related animal taxa. A second analysis exploits an evolutionarily more variable nuclear DNA region to examine the relationship among a sample of Lake Victoria cichlid taxa. The third part of the thesis will examine the phylogeny of Lake Victoria cichlid taxa using higher resolution genetic methods. This last derived phylogeny will be tested against a previously proposed phylogeny based on morphological characters.

These studies will further the search for molecular methods which are able to distinguish between these putative taxa. They provide information on nuclear loci which can be compared with previously reported mitochondrial data (Slade, Moritz, and Heideman, 1994). Finally, the phylogenetic results of these studies can be used by the Lake Victoria conservation program to assist in the selection of species for captive propagation, and to utilize limited conservation resources effectively.

CHAPTER II

TAXONOMIC SYNONYMIES AND PHYLOGENETIC RECONSTRUCTION METHODS

A. Taxa

In the following chapters a large number of Cichlidae genera and species are examined, most of which are endemic to Lake Victoria. Tables 1, 2, and 3 present a synonymy of the teleost taxa according to the chapter in which they are introduced.

Table 1: Taxonomic synonymy of 18S rDNA analysis species.

Species                                Author of Species Name

1. Agnatha
   Styela plicata                      Fleming
   Lampetra aepyptera                  Abbott
   Petromyzon marinus                  Linnaeus

2. Chondrichthyes
   Echinorhinus cookei                 Pietschmann
   Notorynchus cepedianus              (Peron)
   Rhinobatos lentiginosus             Garman
   Squalus acanthias                   Linnaeus

3. Acipenseriformes
   Acipenser fulvescens                Rafinesque

4. Coelacanth
   Latimeria chalumnae                 Smith

5. Teleostei
   Cichlasoma cyanoguttatum            Baird and Girard
   Etheostoma zonale                   Cope
   Fundulus heteroclitus               Linnaeus
   Gaurochromis sp.                    Greenwood, 1981
                                       Haplochromis empodisma, Greenwood, 1960
                                       Haplochromis velifer, Trewavas, 1933
   Neochromis nigricans                Greenwood, 1981
                                       Haplochromis (Neochromis) nigricans, Regan, 1922
                                       Neochromis nigricans, Regan, 1920
                                       Tilapia simotes, Boulenger, 1911
                                       Tilapia nigricans, Boulenger, 1906
   Oreochromis esculentus              Trewavas, 1981
                                       Graham, 1928
                                       Tilapia sp., Dobbs, 1927
                                       Tilapia eudardiana, Boulenger, 1915
                                       Tilapia variabilis, Boulenger, 1906
                                       Tilapia galilaea, Pellegrin, 1905
   Paretroplus polyactis               Bleeker
   Pomacentrus melanochir              Bleeker
   Sebastolobus altivelis              Gilbert

Table 2: Taxonomic synonymy of taxa analyzed for Internal Transcribed Spacer One (ITS1).

Species a                              Author of Species Name

1. Tilapine
   Oreochromis esculentus              Trewavas, 1981
                                       Graham, 1928
                                       Tilapia sp., Dobbs, 1927
                                       Tilapia eudardiana, Boulenger, 1915
                                       Tilapia variabilis, Boulenger, 1906
                                       Tilapia galilaea, Pellegrin, 1905
   Oreochromis niloticus               Trewavas, 1981
                                       Saratherodon niloticus, Trewavas, 1978
                                       Tilapia nilotica nilotica, Thys van den Audenaerde, 1964
                                       Tilapia nilotica, Boulenger, 1908
                                       Tilapia nilotica, Boulenger, 1899
                                       Chromis niloticus, Gunther, 1862
                                       Chromis nilotica, Cuvier, 1817
                                       Labrus niloticus, Linnaeus, 1757

2. Haplochromines
   Astatoreochromis alluaudi           Greenwood, 1981
                                       Poll, 1939
                                       Regan, 1922
                                       Boulenger, 1907
                                       Pellegrin, 1903
   Astatotilapia nubilus               Greenwood, 1981
                                       Haplochromis nubilus, Trewavas, 1933
                                       Haplochromis annectidens, Trewavas, 1933
                                       Haplochromis nubilus, Regan, 1922
                                       Tilapia nubila, Boulenger, 1906
   Gaurochromis sp.                    Greenwood, 1981
                                       Haplochromis empodisma, Greenwood, 1960
                                       Haplochromis velifer, Trewavas, 1933
   "Haplochromis" (Neochromis)         Kaufman
      "krusing"
   "Haplochromis" (Paralabidochromis)  Kaufman
      "rock kribensis"
   "Haplochromis" (Prognathochromis)   Kaufman
      "kachira deep"
   Neochromis nigricans                Greenwood, 1981
                                       Haplochromis (Neochromis) nigricans, Regan, 1922
                                       Neochromis nigricans, Regan, 1920
                                       Tilapia simotes, Boulenger, 1911
                                       Tilapia nigricans, Boulenger, 1906
   Pseudocrenalabis multicolor/        Loiselle
      victoriae
   Ptyochromis xenognathus             Greenwood, 1981
                                       Haplochromis xenognathus, Greenwood, 1956b
   Xystichromis phytophagous           Greenwood, 1981
                                       Haplochromis phytophagous, Greenwood, 1965
                                       Haplochromis cinereus, Regan, 1922
                                       Haplochromis nuchisquamulatus, Boulenger, 1915
                                       Haplochromis ishmaeli, Boulenger, 1915

a. Some of the species used in this study are recently discovered taxa, and as yet, formally undescribed. In those cases, taxonomic designation is given as follows: The putative species is temporarily placed in the genus "Haplochromis" until it is formally described. "Haplochromis" is then followed by the generic name (in parentheses) that it would be placed in if the taxonomic revision of Greenwood (1981) is followed. The species name (in quotation marks) is one that has been tentatively ascribed by our collaborators at the time of discovery. This follows the convention established in Kaufman and Ochumba (1993).

Table 3: Taxonomic synonymy of taxa examined by RAPD analysis.

Species b                              Author of Species Name

   Astatoreochromis alluaudi           Greenwood, 1981
                                       Poll, 1939
                                       Regan, 1922
                                       Boulenger, 1907
                                       Pellegrin, 1903
   Pseudocrenalabis multicolor/        Loiselle
      victoriae
   Astatotilapia nubilus               Greenwood, 1981
                                       Haplochromis nubilus, Trewavas, 1933
                                       Haplochromis annectidens, Trewavas, 1933
                                       Haplochromis nubilus, Regan, 1922
                                       Tilapia nubila, Boulenger, 1906
   Gaurochromis sp.                    Greenwood, 1981
                                       Haplochromis empodisma, Greenwood, 1960
                                       Haplochromis velifer, Trewavas, 1933
   [...]                               Greenwood, 1981
                                       Haplochromis nubilus, Regan, 1922
                                       Haplochromis desfontainesii, Boulenger, 1915
                                       Haplochromis nuchisquamulatus, Boulenger, 1915
                                       Chromis (Haplochromis) obliquidens, Hilgendorf, 1888
   "Haplochromis" (Astatotilapia)      Kaufman
      "black group"
   "Haplochromis" (Astatotilapia)      Kaufman
      "flameback"
   "Haplochromis" (Harpagochromis)     Kaufman
      "frogmouth"
   "Haplochromis" (Neochromis)         Kaufman
      "madonna"
   "Haplochromis" (Paralabidochromis)  Kaufman
      "chestnut"
   "Haplochromis" (Paralabidochromis)  Kaufman
      "fried egg"
   "Haplochromis" (Paralabidochromis)  Kaufman
      "rock kribensis"
   "Haplochromis" (Paralabidochromis)  Kaufman
      "xenodon"
   "Haplochromis" (Prognathochromis)   Kaufman
      "kachira deep"
   "Haplochromis" (Ptyochromis)        Kaufman
      "rusingal oral sheller"
   Neochromis nigricans                Greenwood, 1981
                                       Haplochromis (Neochromis) nigricans, Regan, 1922
                                       Neochromis nigricans, Regan, 1920
                                       Tilapia simotes, Boulenger, 1911
                                       Tilapia nigricans, Boulenger, 1906
   Paralabidochromis beadlei           Greenwood, 1981
                                       Paralabidochromis victoriae, Greenwood, 1956
                                       Haplochromis beadlei, Trewavas, 1933
   Paralabidochromis plagiodon         Greenwood, 1981
                                       Paralabidochromis victoriae, Greenwood, 1956
                                       Haplochromis plagiodon, Regan and Trewavas, 1928
                                       [...], Hilgendorf, 1922
                                       Clinodon bayoni, Regan, 1920
                                       Hemitilapia bayoni, Boulenger, 1908
                                       Haplochromis crassilabris, Regan, 1922
                                       [...] crassilabris, Boulenger, 1915
                                       Haplochromis crassilabris, Boulenger, 1906
   Prognathochromis venator            Greenwood, 1981
                                       Haplochromis pellegrini, Trewavas, 1933
                                       Paratilapia prognatha, Pellegrin, 1904
   Psammochromis riponianus            Greenwood, 1981
                                       Haplochromis riponianus, Lohberger, 1929
                                       Haplochromis cinereus, Regan, 1922
                                       Paratilapia victoriana, Boulenger, 1915
                                       Paratilapia serranus, 1915
                                       Pelmatochromis riponianus, Boulenger, 1911
                                       Haplochromis ishmaeli, Boulenger, 1906
   Ptyochromis xenognathus             Greenwood, 1981
                                       Haplochromis xenognathus, Greenwood, 1956b
   Pyxichromis orthostoma              Greenwood, 1981
                                       Haplochromis parorthostoma, Greenwood, 1967
                                       Haplochromis orthostoma, Regan, 1922
                                       Pelmatochromis spekii, Boulenger, 1915
   Xystichromis phytophagous           Greenwood, 1981
                                       Haplochromis phytophagous, Greenwood, 1965
                                       Haplochromis cinereus, Regan, 1922
                                       Haplochromis nuchisquamulatus, Boulenger, 1915
                                       Haplochromis ishmaeli, Boulenger, 1915
   Yssichromis laparogramma            Greenwood, 1981
                                       Haplochromis fusiformis, Greenwood and Gee, 1969
                                       Haplochromis laparogramma, 1969

b. Many of the species used in this study are recently discovered taxa, and as yet, formally undescribed. In those cases, taxonomic designation is given as follows: The putative species is temporarily placed in the genus "Haplochromis" until it is formally described. "Haplochromis" is then followed by the generic name (in parentheses) that it would be placed in if the taxonomic revision of Greenwood (1981) is followed. The species name (in quotation marks) is one that has been tentatively ascribed by our collaborators at the time of discovery. This follows the convention established in Kaufman and Ochumba (1993).

B. Phylogenetic Reconstruction Methodology

In the following chapters similar methods are used in the construction of gene or character phylogenies. Since all chapters use similar methods, a single description of those methods is presented here. Phylogenetic reconstruction methods can be placed into two categories, those which use distances between taxa and those which are based on discrete characters. In distance methods (phenetic approaches), a single evolutionary distance is computed between each pair of taxa based on the comparison of characters, and phylogenetic trees are built from these distances using various algorithms. In character methods (cladistic approaches), shared states at single characters are used to construct evolutionary relationships and information is combined over the various single character relationships. Data from discrete characters, as opposed to continuous characters, can be converted into distance measurements, so trees may be constructed from such data using either distance or character methods (Avise, 1994, Swofford and Olsen, 1990, and references therein). DNA sequence data can be analyzed using either distance or character methodologies.

Reconstruction of a gene or character phylogeny may require the examination of a vast number of possible trees as the number of taxa grows. For n taxa there are $(2n-3)!/[2^{n-2}(n-2)!]$ possible different bifurcating rooted trees (Li and Graur, 1991). This number grows faster than exponentially with the addition of taxa and quickly outstrips current computing capabilities to do an exhaustive search of all possible trees. Various algorithms have been developed which search for a tree which minimizes the total evolutionary distance between taxa while examining only a subset of all possible trees. These algorithms have been incorporated into computer programs (such as PAUP and PHYLIP, which are used in these studies) that calculate distances and reconstruct trees. Further description of the algorithms can be found in the manuals accompanying these programs or in various reviews of phylogenetic reconstruction methodology (Felsenstein, 1982, 1988a, 1988b, Swofford, 1990, Hillis and Moritz, 1990, Swofford and Olsen, 1990, Avise, 1994).
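To give a sense of how quickly the number of rooted bifurcating trees grows, the count given above can be evaluated directly. The following is a minimal sketch; the function name is an illustrative choice and does not correspond to anything in PAUP or PHYLIP.

```python
from math import factorial


def rooted_tree_count(n):
    """Number of distinct bifurcating rooted trees for n labelled taxa.

    Evaluates (2n-3)! / (2^(n-2) * (n-2)!) with exact integer arithmetic.
    """
    if n < 2:
        raise ValueError("at least two taxa are required")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


if __name__ == "__main__":
    for taxa in (3, 4, 5, 10, 20):
        print(taxa, rooted_tree_count(taxa))
```

Three taxa give 3 trees and four taxa give 15, but ten taxa already give 34,459,425, which is why exhaustive searches are practical only for small data sets.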

Of the distance reconstruction algorithms, the Neighbor-Joining method has been shown to be the most efficient in reconstructing true phylogenies (Saitou and Nei, 1987). This method produces a single unrooted tree unless an outgroup is specified. This method was used in construction of distance gene trees in the following chapters.
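For readers unfamiliar with the method, the following is a compact sketch of the neighbor-joining procedure as it is usually described in textbooks: at each step the pair of nodes minimizing a rate-corrected distance criterion is joined, and distances to the new internal node are recomputed. It is an illustration only and is not the code of the NEIGHBOR program; the function name, the internal node labels, and the four-taxon distances in the usage example are all hypothetical.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Build an unrooted tree by neighbor-joining.

    names: list of distinct taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns a list of (node, neighbor, branch_length) edges.
    """
    d = dict(dist)          # working copy, extended with internal nodes
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all other active nodes.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair that minimizes (n-2)*d_ij - r_i - r_j.
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]])
        dij = d[frozenset((i, j))]
        len_i = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        len_j = dij - len_i
        u = "node%d" % next_id   # hypothetical label for the new internal node
        next_id += 1
        edges.append((i, u, len_i))
        edges.append((j, u, len_j))
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (d[frozenset((i, k))]
                                              + d[frozenset((j, k))] - dij)
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


if __name__ == "__main__":
    # Hypothetical additive distances for four taxa.
    pairs = {("A", "B"): 3, ("A", "C"): 8, ("A", "D"): 9,
             ("B", "C"): 9, ("B", "D"): 10, ("C", "D"): 9}
    dist = {frozenset(p): float(v) for p, v in pairs.items()}
    for edge in neighbor_joining(["A", "B", "C", "D"], dist):
        print(edge)
```

Because the example distances are exactly additive, the recovered branch lengths match the tree from which they were generated; with real sequence distances the result is an estimate.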

The algorithm generally used in construction of phylogenetic trees from discrete character data is maximum parsimony (Sober, 1989, Wheeler, 1990). This method searches for the tree which minimizes the number of changes which must have occurred from the ancestral states. To do this, the method only examines informative sites. For DNA sequence data, these are defined as sites where at least two different nucleotides are each shared by at least two taxa. In addition, parsimony analysis can allow insertions and deletions in the sequence as informative characters if they meet the above criteria. Distance methods do not easily integrate insertion and deletion data. Any parsimony analysis may produce a number of equally parsimonious trees. All "most parsimonious trees" may be presented or they may be condensed into a single representation using various methods. This is generally done by producing a strict consensus tree in which any conflicting topologies are removed by production of multifurcations at the conflicting node. In an alternative method, majority rule consensus, only those groupings which are maintained in a certain percentage (greater than 50%) of the most parsimonious trees are retained, and those nodes which fall below the majority rule cutoff are converted into multifurcations in the final tree. While maximum parsimony methods search for the most parsimonious tree, there is no guarantee that the most parsimonious tree is the correct, or true, evolutionary tree (Hillis, Huelsenbeck, and Cunningham, 1994).
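The definition of an informative site given above can be made concrete with a short sketch that scans an alignment column by column. The function names and the toy alignment are hypothetical. As a simplifying assumption, gaps and ambiguity codes are skipped here, although, as noted in the text, a parsimony analysis may instead score insertions and deletions as characters.

```python
from collections import Counter


def is_informative(column, ignore="-N?"):
    """True if at least two different states each occur in at least two taxa.

    Characters listed in `ignore` (gaps, ambiguity codes) are skipped;
    this is an assumption of the sketch, not a rule from the text.
    """
    counts = Counter(base for base in column.upper() if base not in ignore)
    return sum(1 for c in counts.values() if c >= 2) >= 2


def informative_sites(alignment):
    """Return the 0-based positions of parsimony-informative sites."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to equal length")
    return [pos for pos in range(length)
            if is_informative("".join(seq[pos] for seq in alignment))]


if __name__ == "__main__":
    toy = ["AAGT", "AAGT", "ACCT", "ATCT"]   # hypothetical aligned sequences
    print(informative_sites(toy))
```

In the toy alignment only the third column qualifies: the second column is variable, but only one of its states is shared by two taxa, so it cannot favor one tree over another.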

In the following studies the search for the maximum parsimony trees was carried out using a heuristic search method. This was done because the large number of taxa included in these studies would greatly increase the computer time required by a more complete search, such as an exhaustive or branch and bound method. The drawback of this type of search is that the most parsimonious tree(s) have a higher likelihood of being overlooked, since not all topologies are examined. The order of addition of taxa in a search can produce associations which bias the choice of tree topology. To make sure this was not a problem, random addition of taxa was tested to determine if the order of addition of taxa affected the final outcome.

Statistical analysis of the validity of a particular tree is an area of ongoing research and controversy (Felsenstein, 1988b). One of the most commonly used methods is the bootstrap. The characters of the original data set are sampled with replacement to produce a series of replicate data sets for analysis. Bootstrapping methods are used to estimate variability when the underlying sampling distribution is unknown. Bootstrap samplings are generally done 100 to 1000 times and are limited by computing power and number of taxa included in a particular study. Phylogenetic reconstruction is performed on each bootstrap data set and a consensus bootstrap tree is produced. Numbers at branch points in these trees reflect the percentage of resampled trees which contained a particular branch. This method was also used in the following chapters.
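The resampling step of the bootstrap can be sketched as follows: alignment columns are drawn with replacement until a replicate of the original length is obtained, and the same columns are taken from every taxon so that each site stays intact. This is an illustration of the general idea rather than the SEQBOOT implementation; the function names, seed, and toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one replicate built by sampling columns with replacement."""
    n_sites = len(alignment[0])
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    # The same sampled columns are used for every sequence.
    return ["".join(seq[c] for c in columns) for seq in alignment]


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Return a list of bootstrap replicate alignments."""
    rng = random.Random(seed)   # fixed seed so the sketch is repeatable
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


if __name__ == "__main__":
    toy = ["ACGT", "ACGA", "TCGA"]   # hypothetical aligned sequences
    for replicate in bootstrap_replicates(toy, n_replicates=3):
        print(replicate)
```

Each replicate would then be passed to the tree-building step, and the proportion of replicate trees containing a given grouping would be reported at the corresponding branch point.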

All of the analyses done in the following chapters were performed in PAUP or PHYLIP. Using PHYLIP, distances were calculated in DNADIST utilizing the Kimura 2-parameter correction which adjusts distances for multiple substitutions. Tree reconstruction was done using the NEIGHBOR program which is used to construct neighbor-joining trees, or DNAPARS which was used to construct maximum parsimony trees. Bootstrapping of the data was done using the program SEQBOOT.
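As an illustration of the correction mentioned above, the Kimura 2-parameter distance can be computed from the observed proportions of transitions (P) and transversions (Q) between two aligned sequences as d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q). The sketch below is not the DNADIST code; the function name is hypothetical, and the decision to skip sites containing gaps or ambiguous bases is an assumption of this example.

```python
from math import log

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a character other than A, C, G, or T
    are skipped (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    w1 = 1.0 - 2.0 * p - q
    w2 = 1.0 - 2.0 * q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for the correction")
    return -0.5 * log(w1) - 0.25 * log(w2)


if __name__ == "__main__":
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

The corrected distance is always at least as large as the raw proportion of differing sites, reflecting substitutions hidden by multiple changes at the same site.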

In addition, PAUP was used in some of the following analyses, using either heuristic, branch and bound, or exhaustive search algorithms. However, due to the large number of taxa, few of the analyses permitted exhaustive searches. Taxa were added during the analyses in either a stepwise (general), or random, manner. When multiple maximum parsimony trees were obtained, they were condensed using either the strict, or majority rule (set at 50%), consensus methods within PAUP. Bootstrapping of the data is also available in PAUP, but due to the large number of taxa which were under study, bootstrapping was not done on all of the data sets. Specific adjustments or additional settings within the program, such as selection of an outgroup taxon, are discussed in the chapter in which they are relevant.

In summary, distance and discrete character data sets may be used in the reconstruction of a gene, genetic character, or morphological character tree. The production of a tree does not directly imply that this is in fact the species phylogeny of these organisms. Construction of phylogenies from various evolutionarily informative regions is required before robust arguments of true species relationships may be entertained. Hopefully, the construction of each phylogenetic tree will make the resolution of the true species relationships more achievable.

CHAPTER III

PHYLOGENETIC RELATIONSHIPS OF A LAKE VICTORIA CICHLID SPECIES DERIVED FROM 18S RIBOSOMAL DNA DATA

A. Introduction

The cichlid species flocks of the Great Lakes of Africa represent one of the most explosive speciation events of vertebrates observed (Stiassny, 1991, Mayr, 1984). The cichlid flock of Lake Victoria (previously referred to as the Haplochromine species flock, Greenwood, 1984a, 1984b, Greenwood, 1981) is thought to have arisen as a monophyletic group within the last 100,000 years, possibly as recently as 15,000 years ago (Meyer et al., 1990, Livingston, 1980). This group represents a very large number of species, but the nature of the evolutionary relationships between these recently divergent cichlid species, as well as their position with respect to other Cichlidae species of the world, is still an area of active investigation (Kornfield, 1991). The ability to determine a robust phylogeny of these taxa will help in analyzing the evolutionary phenomena associated with these rapid speciation events.

In general, determination of phylogenetic relationships is fundamental to our understanding of evolution and species relatedness. In Lake Victoria cichlids, comparisons of morphological, behavioral, and paleontological data have previously been used to estimate these relationships (Lippitsch, 1993, Dominey, 1984, Van Couvering, 1982). More recently, analysis of protein and nucleic acid sequences has been used to expand these studies in cichlids (Sage and Selander, 1975, Kornfield et al., 1982, Kornfield et al., 1979). Some studies have sought to combine morphological and molecular data to determine the relationships between the African Great Lake cichlids (Sage et al., 1984, Meyer, 1990, Sturmbauer and Meyer, 1992, Sturmbauer and Meyer, 1993, Ono et al., 1993). A common result of these studies has been difficulty in discriminating between the recently diverged taxa of Lake Victoria. In addition, the relationship of the family Cichlidae to other teleost taxa is not fully resolved (Van Couvering, 1982).

In studies in our laboratory we are working to resolve the phylogenetic relationships of Lake Victoria cichlid species using molecular data from different regions of the nuclear genome. The series of selected loci allows for increasingly finer resolution of evolutionary relationships, ranging from the familial to the species level depending upon the locus that is examined. One of the nuclear regions which we have examined is the ribosomal RNA operon (Sogin, Elwood, and Gunderson, 1986). This operon consists of the 18S, 5.8S, and 28S ribosomal DNAs and the internal transcribed spacers between them (Fedoroff, 1979). Figure 3 shows the arrangement of the eukaryotic rRNA gene cluster as well as the approximate location of primers used in the 18S rRNA study. Within the operon, analysis of the 18S rRNA gene allows for comparison of species at the generic or higher levels. Comparison of the internal transcribed spacers allows for comparison of species, as well as populations in some cases (Pleyte, Duncan, and Phillips, 1992). Additionally, sequence and Restriction Fragment Length Polymorphism (RFLP) analysis of the non-transcribed spacers between ribosomal operon repeats has been used in the evaluation of genetic variation and in phylogenetic reconstruction (Cordesse et al., 1993).

Figure 3. Eukaryotic rRNA Gene Cluster Arrangement. The ribosomal gene cluster is a series of repeated units consisting of the 18S, 5.8S, and 28S rRNA genes. In the lower portion of the figure is the 18S gene. Above the gene are the approximate locations of reverse primers. Below the gene are shown the locations of the forward primers (SSU1, CRN5, 373C, 570C, 892C, PCR2, 1262C, 1200C, 1712C, and 1/F). Shaded areas represent: IGS = Intergenic Spacer. 5'ETS = 5' External Transcribed Spacer. 5'ITS = 5' Internal Transcribed Spacer. 3'ITS = 3' Internal Transcribed Spacer. 3'ETS = 3' External Transcribed Spacer.

In this chapter, results of sequence determination of the 18S rRNA gene in teleosts are presented. This has been an attractive molecule for phylogenetic research because it contains areas of high sequence conservation, which facilitate sequence alignment, as well as variable regions harboring phylogenetically informative sites. A large database of rRNA sequences is available for sequence comparison (Neefs et al., 1991, Olsen, Larsen, and Woese, 1991, De Rijk et al., 1992). In addition, hypothetical secondary structures of this molecule have been proposed for some organisms, which allows substitutions to be examined for complementarity and used in secondary structure construction (Olsen, Larsen, and Woese, 1991). While this molecule has been used extensively in the past to study ancient evolutionary branchings and many rRNA sequences are currently available, at the initiation of this project there were no fish 18S rRNA sequences in the database (Pace, Olsen, and Woese, 1986, Olsen, 1987, Cedergren et al., 1988). Recently, however, some 18S rRNA sequence studies, as well as restriction analysis studies, have been reported in fish (Stock et al., 1991, Bernardi, Sordino, and Powers, 1992, Bernardi and Powers, 1992, Phillips, Pleyte, and Brown, 1992, Stock and Whitt, 1992). Although not available at the beginning of this project, these other fish 18S rRNA sequences were incorporated into the following studies.

The 18S rRNA can be used to study deep evolutionary branchings. In addition, it is sufficiently informative to determine the gross phylogenetic relationships between selected Cichlidae and other fish taxa. A phylogeny of the Cichlidae which has been proposed by Stiassny is presented in Figure 4. In the current study the results of the analysis of the 18S rRNA sequence in several teleost species are presented. This analysis allows the determination of the phylogenetic relationships between these taxa. The 18S rRNA gene sequence was determined in five Cichlidae species: two haplochromine species, Gaurochromis sp. and Neochromis nigricans; a tilapine, Oreochromis esculentus; an Etropline cichlid, Paretroplus polyactis; and a Neotropical cichlid, Cichlasoma cyanoguttatum. Cichlids from Lake Victoria included Gaurochromis sp., Neochromis nigricans, and Oreochromis esculentus. Also, the 18S rRNA gene sequence was determined from a marine pomacentrid damselfish, Pomacentrus melanochir. In addition, the full sequence of this gene was determined in two more distantly related freshwater teleost taxa: Etheostoma zonale (a North American freshwater darter), and a member of the order Acipenseriformes, Acipenser fulvescens (lake sturgeon).

The first cichlid 18S rRNA gene sequenced was compared with frog and human to examine the rate of evolution of this gene in vertebrates. Next, the full set of fish sequences determined in this study was compared with the small number of previously derived fish 18S rRNA gene sequences for phylogenetic analyses.

While the results indicate that the 18S rRNA gene lacks the resolution to determine the phylogenetic relationships between the closely related cichlid species of Lake Victoria, it is nonetheless useful for examining the broad evolutionary relationships of the cichlid taxa with other fish species, as well as other vertebrate phyla (Neefs et al., 1991).

B. Materials and Methods

Source of Specimens:

Specimens of Gaurochromis sp. and Neochromis nigricans, being propagated as part of the Lake Victoria Species Survival Plan, were obtained from the Johnson Aquatic Complex of The Columbus Zoological Gardens, Columbus, Ohio. Cichlasoma cyanoguttatum, the common Texas cichlid, was obtained through a fish wholesaler. The Madagascan cichlid Paretroplus polyactis specimens were collected from the wild by Dr. P. Loiselle of the Aquarium for Wildlife Conservation, New York. The samples of Pomacentrus melanochir, Etheostoma zonale, and Acipenser fulvescens were collected from the wild and provided by Dr. T. Cavender and Mr. Brady Porter. Voucher specimens of program fish, or others from Lake Victoria, are maintained at the Museum of Comparative Zoology at Harvard. Other specimens are maintained at the Museum of Biological Diversity at The Ohio State University.

Figure 4. Proposed Inter-familial Relationship of the Family Cichlidae. In the 18S rRNA study, gene sequences were obtained from a member of the Etropline, Paretroplus polyactis, one of the Neotropical cichlids, Cichlasoma cyanoguttatum, one tilapine, Oreochromis esculentus, and two from the category above titled "The Rest", Gaurochromis sp. and Neochromis nigricans. (Tree redrawn from Stiassny, 1991)

DNA Extraction:

DNA extraction was performed as follows: first, 1 cm³ of muscle tissue was removed from individual fish. Tissue was then sheared by razor blade in the presence of ABI lysis buffer (Applied Biosystems Inc., 0.1 M Tris, 4 M urea, 0.2 M NaCl, 0.01 M CDTA, and 0.5% N-lauroylsarcosine). Chopped tissue was then placed in a 1.5 ml centrifuge tube to a total volume of 1 ml, and 10 µl of a 20 mg/ml solution of proteinase K was added. Samples were incubated overnight at 50-55 °C. Following digestion, DNA was extracted by two phenol/chloroform extractions and one chloroform extraction. DNA was precipitated by addition of 2 volumes of 95% EtOH, followed by a 70% EtOH wash. DNA was quantified by spectroscopy, and quantifications were confirmed by electrophoresis on a 1% agarose gel.

PCR Amplifications:

18S ribosomal RNA gene amplifications were performed as follows: 100 ng of genomic template DNA was added to each reaction (final volume 100 µl), which contained 10 pM of each amplification primer, 2.5 mM MgCl2, 1.5 mM dNTPs, and 1-2.5 U Taq DNA Polymerase (Promega Corp.). 18S rRNA genes were amplified in two halves. The 5' portion was amplified using primers CRN5 and 1262 (Table 4), which are forward and reverse primers, respectively. The 3' section of the gene was amplified using primers 892C and SSU2. All primers used in PCR amplification and subsequent 18S rRNA sequencing reactions are shown in Table 4. The PCR reactions were carried out as follows: denaturation at 94 °C for 1 min, annealing at 50-52 °C for 1.5 min, and extension at 72 °C for 3 min, for 35 cycles. Reactions were performed in a Perkin Elmer Cetus PCI thermocycler. Ten microlitres of PCR product was electrophoresed on a 1% agarose gel to determine size and approximate quantity before sequencing. Separate PCR reactions from the same individual which produced a band of the correct size were pooled and prepared in one of the following ways for subsequent sequencing reactions: 1) no preparation (straight PCR product used for sequencing), 2) band isolation followed by glass milk purification, or 3) band isolation through dense nylon without further preparation.

In the first method, 3 µl of PCR product was used directly in cycle sequencing reactions. For the second method, PCR product was first run out on a gel and stained with ethidium bromide. The desired band was excised and purified with a glass milk suspension according to the manufacturer's instructions (Gene Clean, Bio 101, Inc.). For the final method, DNA was run on an agarose gel and the desired band was excised. This band was then placed in the cap of a 1.5 ml Eppendorf tube. Densely woven nylon was placed over the opening of the tube and the tube was closed. The tube was spun in a microfuge at 10,000 rpm for 2 min. The desired product was spun into the tube while the agarose remained on the nylon. Three to ten microlitres of product prepared by methods two or three was subsequently used in cycle sequencing reactions. The best results were obtained using 3 µl of straight PCR amplification product which had not been altered in any way. Occasionally there were regions which were difficult to read on the sequencing gels using this method, but annealing at a higher temperature during the sequencing reaction solved this problem.

Table 4. 18S Ribosomal DNA Primers used in Amplification and Sequencing.

Primer   Primer Sequence (all primers           Primer   Melt. Temp.   Orientation c
Name     shown in 5'->3' orientation)           Size     (°C)

SSU1     ccgcggccgcgtcgactggttgatcctgccagtag    35-mer   82.5          Forward
CRN5     tggttgatcctgccagtag                    19-mer   48.0          Forward
18S66    cacacgggcggtacagtg                     18-mer   52.8          Reverse
170      gcatgtattagctctaga                     18-mer   33.7          Reverse
373      aggctccctctccggaatc                    19-mer   54.5          Reverse
373C     gattccggagagggagcct                    19-mer   54.5          Forward
570      gctattggagctggaattac                   20-mer   46.0          Reverse
570C     gtaattccagctccaatagc                   20-mer   46.0          Forward
892C     gtcagaggtgaaattcttgg                   20-mer   46.0          Forward
1137     gtgcccttccgtcaat                       16-mer   45.1          Reverse
PCR2     gaaacttaaaaggaattga                    19-mer   38.7          Forward
1262     gaacggccatgcaccac                      17-mer   52.4          Reverse
1262C    gtggtgcatggccgttctta                   20-mer   55.7          Forward
1200     gggcatcacagacctg                       16-mer   42.9          Reverse
1200C    caggtctgtgatgccc                       16-mer   42.9          Forward
1712C    agcgccgagaagacgatcaaa                  21-mer   58.6          Forward
1/F      cacaccgcccgtcg                         14-mer   49.3          Forward
SSU2     ccgcggccgcggatcctgatccctccgcaggttcac   36-mer   86.2          Reverse

c. Forward = primers which are oriented in the 5' to 3' direction relative to the gene. Reverse = primers which are oriented in the 3' to 5' direction relative to the gene.
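Several primers in Table 4 come in numbered pairs (for example 373 and 373C), where the "C" primer appears to be the reverse complement of its partner so that the same site can be read on both strands. The short Python sketch below checks that relationship using sequences copied from Table 4. The function names are illustrative only and are not part of any software used in this study.

```python
# Primer sequences copied from Table 4 (all written 5'->3').
PRIMERS = {
    "373": "aggctccctctccggaatc",
    "373C": "gattccggagagggagcct",
    "570": "gctattggagctggaattac",
    "570C": "gtaattccagctccaatagc",
    "1200": "gggcatcacagacctg",
    "1200C": "caggtctgtgatgccc",
    "1262": "gaacggccatgcaccac",
    "1262C": "gtggtgcatggccgttctta",
}

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (output is lowercase)."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def is_complement_pair(reverse_name: str) -> bool:
    """True if primer '<name>C' is the exact reverse complement of '<name>'."""
    return reverse_complement(PRIMERS[reverse_name]) == PRIMERS[reverse_name + "C"]


if __name__ == "__main__":
    for name in ("373", "570", "1200", "1262"):
        print(name, name + "C", is_complement_pair(name))
```

Primer 1262C is three bases longer than 1262, so the exact-match check fails for that pair even though 1262C begins with the reverse complement of 1262.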

Sequencing Analysis:

Sequencing was performed using the dsDNA Cycle Sequencing System (BRL). Sequencing primers were γ-32P labeled using 1 U of T4 polynucleotide kinase in a 5 µl reaction. Labeled primer was then added to the 31 µl sequencing reaction, which contained 2.5 U Taq polymerase (0.5 µl), 4.5 µl 10X Taq sequencing buffer, and 3 µl of straight, or 3-10 µl of Gene Cleaned or nylon-spun, template. Eight microlitres of this reaction was placed in each of four separate termination reaction tubes, each containing 2 µl of one ddNTP.

Sequencing was carried out as follows: denaturation at 94-97 °C for 30 sec., annealing at 48-60 °C (temperature was dependent on primer) for 1 min., and extension at 70 °C for 1 min., for 20 cycles. This was followed by 10 cycles of 30 sec. at 94-97 °C and 1 min. at 70 °C. Products were electrophoresed on either a 6% or 8% polyacrylamide gel. Gels were exposed to Amersham Hyperfilm-MP autoradiography film for 16-48 hours.

Phylogenetic Reconstruction and Data Analysis:

The primary sequences of Gaurochromis sp., N. nigricans, C. cyanoguttatum, O. esculentus, E. zonale, Pomacentrus melanochir, Paretroplus polyactis, and A. fulvescens were aligned with other 18S rRNA gene sequences examined in this study using the sequence alignment program Eyeball Sequence Editor (ESEE) for the PC (Cabot and Beckenbach, 1989). The initial 18S rRNA gene primary sequence derived from Gaurochromis sp. was aligned with two other vertebrates: X. laevis and H. sapiens (Maden, 1986). In addition, two outgroups were added to this alignment, Artemia salina (brine shrimp) and Placopecten magellanicus (a sea scallop) (Nelles et al., 1984, Rice, 1990; for sequence alignment, see Fig. 26, Appendix A). Three regions (bases 227-258, 673-734, and 1720-1747) were removed from the alignment before the data were used in subsequent analyses because unambiguous alignment of these regions was not possible. The remaining 1714 bases were used in phylogenetic analyses.
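The removal of the three ambiguously aligned regions can be illustrated with a short Python sketch. The region coordinates come from the text and are treated here as 1-based, inclusive alignment columns, which is an assumption. The function names and the toy alignment are hypothetical. The three regions total 122 columns, so if the 1714 retained positions are taken at face value the starting alignment would have had 1836 columns; that total is inferred here and is not stated in the source.

```python
# Regions excluded from the first alignment (assumed 1-based, inclusive).
EXCLUDED_REGIONS = [(227, 258), (673, 734), (1720, 1747)]


def excluded_count(regions):
    """Number of alignment columns covered by the excluded regions."""
    return sum(end - start + 1 for start, end in regions)


def strip_regions(seq, regions):
    """Return seq with the columns in the given 1-based inclusive regions removed."""
    drop = set()
    for start, end in regions:
        drop.update(range(start - 1, end))  # convert to 0-based indices
    return "".join(base for i, base in enumerate(seq) if i not in drop)


def strip_alignment(alignment, regions):
    """Apply strip_regions to every sequence in a {name: aligned_sequence} dict."""
    return {name: strip_regions(seq, regions) for name, seq in alignment.items()}


if __name__ == "__main__":
    # Toy alignment of placeholder sequences, used only to show the bookkeeping.
    toy = {"taxon_a": "a" * 1836, "taxon_b": "c" * 1836}
    trimmed = strip_alignment(toy, EXCLUDED_REGIONS)
    print(excluded_count(EXCLUDED_REGIONS), len(trimmed["taxon_a"]))
```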

A second alignment was done when all of the fish 18S rRNA sequences were completed. This consisted of the primary sequences determined in this study as well as other fish 18S rRNA primary sequences which were available in the GenBank database. Included in this were three sharks: Squalus acanthias (Bernardi, Sordino, and Powers, 1992), Echinorhinus cookei (Bernardi and Powers, 1992), and Notorynchus cepedianus (Bernardi and Powers, 1992), and one ray: Rhinobatos lentiginosus (Stock and Whitt, 1992). Also included were teleost fish sequences from the killifish Fundulus heteroclitus (Bernardi, Sordino, and Powers, 1992), the sea bass Sebastolobus altivelis (Bernardi, Sordino, and Powers, 1992), and the coelacanth Latimeria chalumnae (Stock et al., 1991). Outgroups in this analysis consisted of two lampreys, Petromyzon marinus and Lampetra aepyptera, and one tunicate, Styela plicata (all from Stock and Whitt, 1992). The alignment was truncated at both ends due to the location of amplification primers in this and the other studies. This removed fifteen bases from the 5' end of the molecule and twenty-one bases from the 3' end. Secondly, bases were removed from the alignment in the highly variable E10-1 stem and loop region (see secondary structure in Figure 5 for the location of this stem-loop). Sixteen bases were removed from the Gaurochromis sp. sequence in this region; other species varied in the number of bases removed across the same region. These truncations produced a final sequence alignment of 1762 bases (again relative to the Gaurochromis sequence) from a total sequence of 1825 bases for this gene (Fig. 26, Appendix A).

Phylogenetic analysis was done using the phylogenetic analysis package PHYLIP (Felsenstein, 1989) on a Silicon Graphics, IBM, or Macintosh computer. Using the PHYLIP package, corrected sequence divergences were quantified using the Kimura two-parameter model (Kimura, 1980), which takes into account multiple substitution events. Distances derived by this method were then used to produce phylogenetic trees using various algorithms available in this package: UPGMA, KITSCH, FITCH, and NEIGHBOR. A parsimony analysis was also performed using the DNAPARS program of PHYLIP or using the program PAUP (Swofford, 1990) on the Macintosh.
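As an illustration of the Kimura two-parameter correction mentioned above, the following Python sketch implements the textbook closed-form estimator, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the observed proportions of transitional and transversional differences. This is a minimal sketch under stated assumptions: positions with gaps or ambiguous bases are simply skipped, and the calculation is not claimed to reproduce the exact procedure or settings of the PHYLIP programs used in this study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura (1980) two-parameter distance between two aligned DNA sequences.

    Positions where either sequence has a gap or a non-ACGT symbol are ignored
    (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        n += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if n == 0:
        raise ValueError("no comparable positions")
    p = transitions / n    # proportion of transitional differences
    q = transversions / n  # proportion of transversional differences
    inner = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)


if __name__ == "__main__":
    # Toy sequences with a single transition in ten sites.
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

For small divergences such as those reported in this chapter, the corrected distance is only slightly larger than the raw proportion of differing sites.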

Secondary Structure:

The secondary structure of the 18S rRNA was drawn using the graphics package Canvas 3.0.1 on the Macintosh. The proposed secondary structure was constructed as follows. The proposed secondary structure of the Drosophila melanogaster 18S rRNA was used as the initial template (De Rijk et al., 1992). The Drosophila primary sequence and proposed secondary structure were then compared with the known primary sequence of the 18S rRNA of Xenopus laevis to produce a secondary structure for Xenopus. There were no fish 18S rDNA sequences available at this time. This structure was then compared with the primary sequence determined for Gaurochromis sp. Changes in the primary sequence between Xenopus and Gaurochromis were then transferred to the secondary structure to produce the final Gaurochromis sp. 18S rRNA secondary structure. The proposed secondary structure of the 18S rDNA of Gaurochromis sp. is presented in Figure 5. Comparisons between Gaurochromis and the other species analyzed in this study used Gaurochromis as the reference species for construction of hypothetical secondary structures.

C. Results

18S ribosomal DNA Sequence Determination:

Initial DNA amplification of the 18S ribosomal RNA gene was performed on Gaurochromis sp. (a representative Lake Victoria piscivorous cichlid) using two sets of overlapping primers (CRN5-1262 and 892C-SSU2). Previous workers in our laboratory had constructed a set of conserved eukaryotic primers which cover the entire 18S ribosomal RNA gene in both orientations. Primer set CRN5-1262 amplifies a fragment of 1322 bases, while 892C-SSU2 produces a product of 915 bases. These amplifications produce overlapping fragments spanning essentially the entire ribosomal gene of Gaurochromis sp. (approximately 1840 bases). Using other conserved eukaryotic small subunit (SSU) primers we were able to sequence the gene using direct dsDNA cycle sequencing. Sequence information derived from the opposite strand resolved remaining ambiguous positions. In total, approximately 90% of the gene was sequenced in both orientations. One primer, complementary to 892C, did not work in these studies, which prevented sequencing of both strands in the region spanning bases 600 to 850.

Comparison of the primary sequences from O. esculentus, N. nigricans, and Gaurochromis sp. revealed no differences between these taxa. This is not unexpected between the latter two taxa due to the recent divergence of these species. The lack of differences between tilapine and haplochromine taxa is also not completely unexpected, since these two lineages are thought to have diverged within the last 10 million years (Trewavas, 1983). However, differences are found in the first internal transcribed spacer (see following chapter) between these taxa. This helps to confirm that the 18S rRNA gene sequences from the three species are in fact identical, and rules out the possibility that we are observing a contamination artifact leading to sequence determination of the same sample three times.

The 18S rRNA gene sequence determination from Paretroplus polyactis, the Madagascan cichlid, revealed only three single base substitutions relative to Gaurochromis sp.

The primary sequence of C. cyanoguttatum exhibits several changes when compared to the African sequences. Two insertions totaling seven bases are found in the highly variable E10-1 stem. A duplication of a dinucleotide repeat is observed in the E21-8 region. Four other single base substitutions are found in loop regions, three in known variable areas. The two insertions in the E10-1 region fall within the 227-258 region, which was not included in the larger phylogenetic analysis because of problematic alignment.

Comparison of the primary sequence of the cichlids with a more distantly related freshwater teleost taxon (Etheostoma zonale) revealed seventeen substitutions or insertions. Eight single and one trinucleotide insertion are found in loops. Four substitutions which maintain base pairing are found in variable stem regions. One compensatory base pair change is observed in variable region V7. One base substitution is observed in variable region V9 which does not maintain the hypothesized base pairing. This region was sequenced multiple times from pooled PCR product, and the substitution does not appear to be an artifact.

Comparison of the primary sequence of Gaurochromis with that of the primitive freshwater sturgeon Acipenser fulvescens revealed 96 substitutions and six insertion or deletion events (indels). Both were distributed throughout the gene, the majority being found in known variable stems and loops.

18S rRNA Secondary Structure Construction:

The Gaurochromis sp. 18S rRNA gene primary sequence was used in the construction of a proposed secondary structure. It is important to have a working secondary structure for this cichlid species so that base substitutions observed in other fish species can be compared to it to determine whether they are compensatory or non-compensatory changes. In this way, observed base changes in other species can be examined not only by primary sequence comparison, but also in the context of the conserved secondary structure.
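The distinction between compensatory and non-compensatory changes can be made concrete with a small Python sketch. It classifies a change at a paired stem position by asking whether the new pair of bases can still pair. Treating G-U wobble pairs as valid pairing, and the three category labels themselves, are assumptions of this sketch and not definitions taken from the analyses in this chapter.

```python
# Base pairs treated as able to pair in a stem (Watson-Crick plus G-U wobble;
# including the wobble pair is an assumption of this sketch).
_PAIRING = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def _rna(base: str) -> str:
    """Normalise a DNA or RNA base symbol to uppercase RNA."""
    return base.upper().replace("T", "U")


def can_pair(base5: str, base3: str) -> bool:
    """True if the two bases can form a pair in a stem."""
    return (_rna(base5), _rna(base3)) in _PAIRING


def classify_stem_change(ref_pair, new_pair) -> str:
    """Classify a change at one paired stem position.

    ref_pair and new_pair are (5' base, 3' base) tuples for the reference
    species and the species being compared.
    """
    ref = tuple(_rna(b) for b in ref_pair)
    new = tuple(_rna(b) for b in new_pair)
    n_changed = sum(r != n for r, n in zip(ref, new))
    if n_changed == 0:
        return "unchanged"
    if not can_pair(*new):
        return "non-compensatory"
    # Both partners changed and pairing survives: a compensatory change.
    return "compensatory" if n_changed == 2 else "pairing maintained"


if __name__ == "__main__":
    print(classify_stem_change(("G", "C"), ("A", "T")))
    print(classify_stem_change(("G", "C"), ("G", "T")))
    print(classify_stem_change(("G", "C"), ("G", "A")))
```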

Figure 5. Proposed Secondary Structure of Gaurochromis sp. 18S rRNA. Secondary structure was based on derived primary DNA sequence.

Phylogenetic Analyses:

1. Vertebrate Relationships:

An analysis was carried out on the alignment which consisted of G. simpsoni as the representative teleost taxon, Xenopus laevis, Homo sapiens, Artemia salina, and Placopecten magellanicus, after removal of ambiguous alignment positions. Corrected nucleotide sequence divergences calculated between these species are shown in Table 5. Tree construction was performed using various reconstruction algorithms. All produced the same topology. The representative tree shown in Figure 6 is the Neighbor-Joining tree. Comparison of the number of substitutions was performed between the Gaurochromis sp., Xenopus laevis, and Homo sapiens 18S rRNA gene sequences. If the substitution rate for the 18S rRNA gene is relatively constant, the percent substitution can provide useful insights into potential separation times between related taxa, i.e., it yields a molecular clock, especially when closely related groups are involved (Zuckerkandl and Pauling, 1965, Wilson, Carlson, and White, 1977, Takahata, 1988, Wolfe, Sharp, and Li, 1989). A test of a molecular clock was carried out on the fish-frog-human comparisons. Based on an average assumed substitution rate of 1% per 50 million years in the small subunit ribosomal gene (Wilson, Ochman, and Prager, 1987), the time of divergence of the fish-[mammal-amphibian] group (400 mya), and that of the mammal-amphibian group (350 mya), we calculated the expected numbers of substitutions over the 1714 bases examined, assuming lineages were accumulating substitutions at an average eukaryotic rate for this gene. For the fish-frog and the fish-human lineages we expect 137 substitutions, and for the frog-human lineage 120 changes. Our observed substitutions were fewer in all cases: fish-frog, 80 substitutions; fish-human, 102; and frog-human, 81. The observed average rate across the lineages was 0.67% per 50 my. The rate observed was significantly lower than the expected rate based on the Wilson-Ochman-Prager clock (χ² test, p < .001). This result is not entirely surprising, as it has already been reported that vertebrate substitution rates are lower than average in the 18S rRNA gene (Hillis and Dixon, 1991).

Table 5. Distance matrix of the five species in the first 18S rRNA primary sequence alignment. d

        G.s.     X.l.     H.s.     A.s.     P.m.
G.s.    0.00     0.0438   0.0557   0.1499   0.1371
X.l.             0.00     0.0447   0.1449   0.1358
H.s.                      0.00     0.1490   0.1397
A.s.                               0.00     0.1432
P.m.                                        0.00

d. Distances were derived using the truncated sequences (see text) using the program DNADIST of the phylogenetic analysis package PHYLIP. Distances are corrected for multiple substitution using the Kimura two parameter method. G. s. = Gaurochromis sp. X. l. = Xenopus laevis. H. s. = Homo sapiens. A. s. = Artemia salina. P. m. = Placopecten magellanicus.

Figure 6. Neighbor-joining tree of five taxa based on truncated 18S rRNA sequences. The distances on horizontal branches are proportional to nucleotide divergence (corrected for multiple substitutions). Taxa shown: Gaurochromis sp., X. laevis, H. sapiens, P. magellanicus, and A. salina. Scale bar = 2.5% divergence.
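The expected substitution counts and the observed average rate quoted above follow from simple arithmetic, sketched below in Python. The sketch assumes, as the text's numbers imply, that the rate of 1% per 50 million years is applied to the pairwise difference between two taxa over the time since their divergence. The function and variable names are illustrative only.

```python
SITES = 1714            # aligned bases examined
CLOCK_RATE = 0.01       # assumed clock: 1% substitution per 50 million years
CLOCK_INTERVAL_MY = 50.0


def expected_substitutions(divergence_my, sites=SITES, rate=CLOCK_RATE):
    """Expected pairwise substitutions for taxa that diverged divergence_my ago."""
    return sites * rate * divergence_my / CLOCK_INTERVAL_MY


def observed_rate_per_50my(observed, divergence_my, sites=SITES):
    """Observed substitutions as a fraction of sites per 50 million years."""
    return observed / sites / (divergence_my / CLOCK_INTERVAL_MY)


# (divergence time in millions of years, observed substitutions), from the text.
COMPARISONS = {
    "fish-frog": (400, 80),
    "fish-human": (400, 102),
    "frog-human": (350, 81),
}

if __name__ == "__main__":
    rates = []
    for name, (time_my, observed) in COMPARISONS.items():
        rate = observed_rate_per_50my(observed, time_my)
        rates.append(rate)
        print(name, round(expected_substitutions(time_my)), observed,
              f"{100 * rate:.2f}% per 50 my")
    print(f"mean observed rate: {100 * sum(rates) / len(rates):.2f}% per 50 my")
```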

To examine vertebrate substitution rates in this gene further, substitution rates were compared between more closely related vertebrate taxa. Two of the fish taxa in the study with a previously hypothesized time of separation of approximately fifty million years (Van Couvering, 1982, T. Cavender, pers. comm.), Gaurochromis sp. and Cichlasoma cyanoguttatum, were analyzed. Again using the average substitution rate above, we would expect to see approximately seventeen substitutions. By contrast, the observed number of substitutions is seven if we consider each insertion event equivalent to a single substitution. The calculated χ² value is 5.88, which is significant at the .05 level (p ≈ .015 with one degree of freedom). Either the actual divergence is more recent than the proposed time or the 18S rRNA gene is evolving at a slower than expected rate in these lineages, which would further support the hypothesized slower than expected rate of this gene in vertebrates.
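The χ² value reported for the Gaurochromis/Cichlasoma comparison can be reproduced with the single-cell goodness-of-fit formula (observed - expected)² / expected. The Python sketch below does this and compares the result with the tabulated 5% critical value for one degree of freedom (3.841). Using one degree of freedom is an assumption of this sketch, since the text does not state how the test was set up.

```python
CRITICAL_VALUE_05_DF1 = 3.841  # tabulated chi-square critical value, alpha = 0.05, 1 df


def chi_square_term(observed: float, expected: float) -> float:
    """Single-cell chi-square contribution: (O - E)^2 / E."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return (observed - expected) ** 2 / expected


if __name__ == "__main__":
    # Expected (17) and observed (7) substitutions are taken from the text.
    statistic = chi_square_term(observed=7, expected=17)
    print(f"chi-square = {statistic:.2f}")
    print("exceeds 5% critical value:", statistic > CRITICAL_VALUE_05_DF1)
```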

Since knowledge of the divergence time is at best rather uncertain in many cases, relative rate tests have been developed which do not require knowledge of divergence times (Sarich and Wilson, 1973, Li and Graur, 1991). These tests examine the evolutionary rates of two lineages by comparison to an outgroup taxon. The test asks whether the difference between the two rates being examined is significantly different from zero. This test was applied to the Cichlasoma/Gaurochromis species pair, with Squalus as the outgroup taxon, and the difference was not significant. In addition, an examination of Figure 8 shows an apparent accelerated rate of evolution in Acipenser versus the other fish taxa. Therefore, the Gaurochromis/Acipenser pair was also tested, using the tunicate Styela as the outgroup taxon. In addition, the test was performed on this pair using the more closely related outgroup Lampetra. In both cases, the differences in the rates were not significantly different from zero.
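The quantity examined in a relative rate test is the difference between the distances from each of the two ingroup taxa to the outgroup. The Python sketch below computes that difference for the three comparisons described above, using corrected distances copied from Table 6. It is only a partial illustration: the significance test also requires the sampling variance of the difference, which depends on the underlying sequences and is not computed here. The function name and data layout are hypothetical.

```python
# Corrected distances to the outgroup, copied from Table 6.
DISTANCE_TO_OUTGROUP = {
    ("Cic", "Squ"): 0.0397,
    ("Gau", "Squ"): 0.0389,
    ("Aci", "Sty"): 0.1680,
    ("Gau", "Sty"): 0.1497,
    ("Aci", "Lam"): 0.1070,
    ("Gau", "Lam"): 0.0867,
}


def relative_rate_difference(taxon_a: str, taxon_b: str, outgroup: str) -> float:
    """Return d(A, outgroup) - d(B, outgroup).

    A value near zero is what equal rates in the two lineages would predict.
    """
    return (DISTANCE_TO_OUTGROUP[(taxon_a, outgroup)]
            - DISTANCE_TO_OUTGROUP[(taxon_b, outgroup)])


if __name__ == "__main__":
    for a, b, out in (("Cic", "Gau", "Squ"), ("Aci", "Gau", "Sty"), ("Aci", "Gau", "Lam")):
        print(a, b, out, round(relative_rate_difference(a, b, out), 4))
```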

2. Phylogenetic Relationships of Cichlidae with other Teleost Taxa:

A second sequence alignment was produced using all of the

18S rRNA data collected in this study: Gaurochromis sp.,N. nigricans, C. cyanonguttatum,. esculentus, O E, zonale,P. melanochir, P. polyactis, and A. fulvescens. This data was combined with one sea bass ( Sebastolobus altevelis), one killifish

(Fundulus heteoclitus), the coelacanth ( Latameria chalumnae), Table 6. Corrected Genetic Distances of 18S rRNA Sequences.6

A ci Eth Pom Par Gau Cic Fun Seb Lat Squ Not Rhi Ech Lam Pet Sty Aci ------Eth .0566 ------Pom .0582 .0092 ------Par .0593 .0121 .0115 ------Gau .0575 .0103 .0098 .0011 ------Cic .0590 .0116 .0092 .0029 .0017 ...... Fun .0664 .0226 .0256 .0273 .0255 .0239 ------Seb .0664 .0185 .0244 .0232 .0226 .0233 .0350 ...... Lat .0683 .0413 .0470 .0465 .0456 .0465 .0542 .0394 Squ .0658 .0365 .0414 .0395 .0389 .0397 .0486 .0262 .0292 Not .0633 .0334 .0384 .0365 .0359 .0366 .0468 .0250 .0268 .0074 ------Rhi .0772 .0480 .0499 .0491 .0486 .0494 .0591 .0405 .0388 .0291 .0267 ------Ech .0651 .0364 .0420 .0407 .0401 .0409 .0486 .0250 .0303 .0080 .0086 .0291 Lam .1070 .0853 .0869 .0879 .0867 .0858 .0899 .0912 .0873 .0828 .0809 .0834 .0827 ...... Pet .1064 .0860 .0876 .0886 .0874 .0865 .0893 .0919 .0877 .0841 .0822 .0858 .0840 .0028 ...... Sty .1680 .1477 .1532 .1510 .1497 .1490 .1518 .1505 .1483 .1433 .1441 .1443 .1439 .1527 .1535 c. Shown on the lower diagonal arc the corrected genetic distances using Kimura corrections for multiple substitutions. Key: Aci= Acipenser fulvescens, Eth= Etheostoma zonale, Pom= Pomacentrus melanochir, Gau= Gaurochromis sp., Cic= Cichlasoma cyanoguttatum, Fun= Fundulus heteroclitus, Seb= Sebastolobus altivelis, Lai=Latameria chalumnae, Squ= Squalus acanthias, Not= Notorynchus cepedianus, Rhi= Rhinobatos lentiginosus, Ech= Echinorhinus cookei, Lam= Lampetra aepyptera, Pet= Petromyzon marinus, Sty= Styela p lie ata.

o\ Ut 66

Cichlasoma 100.0 Gaurochromis 87.0 92.0 I — Paretroplus 63.0 1 Pomacentrus 85.0 I 1 Etheostoma 9l*° I i Fundulus 82.0 Sebastolobus Acipenser 100.0 Notorynckus 100.0 Echinorhinus 68.0 74.0 I ■■■ ■ Squalus 100.0 Rhinobatos 84.0 Latameria

Styela

Lampetra

Petromyzon

Figure 7: 18S rRNA N-J Gene Tree. This tree was produced using the Neighbor-joining algorithm in PHYLIP. Numbers at nodes are bootstrap percentages. 67

Sebastolobus

r Etheostoma

— Pomacentrus

- Paretroplus

Gaurochromis

Cichlasoma

Fundulus

------Acipenser

'Latameria

- Squalus

- Echinorhinus

Notorynchus

Rhinobatos Lampetra (Petromyzon Styela

= 2% divergence

Figure 8. 18S rRNA N-J Tree with relative distances. This tree was produced using the Neighbor-joining algorithm in PHYLIP. In this tree branch lengths reflect sequence divergence between taxa. 68

Figure 9: Maximum Parsimony 18S rRNA Gene Tree. This tree was produced using the Maximum Parsimony algorithm in PHYLIP. Numbers at nodes are bootstrap percentages.

three sharks (Notorynchus cepedianus, Squalus acanthias, and Echinorhinus cookei), one ray (Rhinobatos lentiginosus), two lampreys (Lampetra aepyptera and Petromyzon marinus), and one tunicate (Styela plicata) to produce the alignment in Figure 26, Appendix A. Analysis of these data by the DNADIST program in PHYLIP, correcting for multiple substitutions, produced the table of evolutionary distances presented in Table 6. Reconstruction of phylogenetic relationships was done using the Neighbor-Joining algorithm in PHYLIP. In addition, the consistency of the data in the phylogenies was tested using bootstrapping, with 100 bootstraps of the data being performed. Bootstrapped parsimony analysis was also performed on this data set. The resultant trees had nearly the same topology, with different bootstrap values. These gene trees, with bootstrap values, are presented in Figures 7, 8, and 9.
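The bootstrap resampling referred to above can be illustrated with a minimal sketch. It is not the PHYLIP implementation; it only shows the general idea of sampling alignment columns with replacement to build 100 pseudo-replicate data sets. The function names and the toy sequences are hypothetical and are not taken from this study.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    `alignment` maps taxon name -> aligned sequence (all the same length).
    Columns are sampled with replacement, and the same sampled columns are
    used for every taxon so that each site stays intact.
    """
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Build a list of bootstrap replicates (100 by default, as in the text)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy alignment (hypothetical sequences, for illustration only).
toy = {"taxonA": "ACGTACGT", "taxonB": "ACGTTCGT", "taxonC": "ACCTTCGA"}
replicates = bootstrap_replicates(toy, n_replicates=100)
```

Each replicate would then be analyzed by the tree-building method of choice, and the bootstrap percentage at a node is the proportion of replicate trees in which that grouping appears.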

Examination of the Neighbor-joining tree reveals three major groups. One group consists of the lampreys, a second group of sharks and rays, with the coelacanth Latimeria basal to this group. A third major group consists of the bony fish, including the Cichlidae, with the lake sturgeon, Acipenser, basal to this group. Acipenser appears to be an ancient lineage, and one that appears to be evolving at a more rapid rate than the other members of this group.

The Cichlidae taxa constitute a closely related clade. This association was found in 100% of the Neighbor-joining bootstrapped trees, and 99% of the parsimony trees. The sister taxon to this clade is Pomacentrus, the marine damselfish. Interestingly, within the cichlid clade, Cichlasoma is the sister taxon to Gaurochromis and Paretroplus (the Madagascan cichlid). This grouping was found in 92% of the Neighbor-joining bootstraps, but in only 43% of the parsimony trees. As discussed previously, most traditional phylogenies of the Cichlidae hypothesize Paretroplus basal to both the New World Cichlidae as well as many African genera, including Gaurochromis. While this is potentially an interesting result, the scarcity of phylogenetically informative sites between these species in the 18S rRNA gene will require the addition of more Madagascan taxa, as well as the analysis of these species at more phylogenetically informative loci. The maximum parsimony tree contained the same topology except for the lack of strong association of the lamprey species.

D. Discussion

The primary sequence of the 18S rRNA gene for several species of cichlids, including a Lake Victorian haplochromine cichlid, has been determined. This gene is one of the most widely used phylogenetically informative molecules available in current evolutionary research. A secondary structure for this molecule is proposed which will be useful when additional fish 18S ribosomal RNA gene sequences are determined. A phylogenetic tree based upon these data for the included taxa indicated that the degree of divergence between teleost forms for this gene is small. The slower than expected rate of substitution of the gene in the fish species underscores a lack of confidence in the ability of this molecule to distinguish between the closely related African species, especially those of Lake Victoria. Specifically, the upper limit of speciation of the haplochromines of Lake Victoria may be 100,000 years. Using our empirically determined rate, we would expect to observe only about 0.025 changes between two species which had diverged 100,000 years ago. Obviously, given this evolutionary rate this molecule will not be useful to distinguish between these species.

In conclusion, this study represents the first determination of any Cichlidae 18S rRNA gene sequences. The determination of the cichlid sequences, and those of the other taxa examined, expands the teleost database by nearly 400%. As such it begins to represent a significant framework for future evolutionary analyses within the Teleostei. Furthermore, the proposed secondary structure will be useful for comparison of 18S rRNA of other taxa as they are acquired. It will permit the correct identification of substitutional patterns within the gene as they are affected by selective forces related to gene function.

The evaluation of the genetic relationships of the closely related Cichlidae will require analysis of a more phylogenetically informative locus, e.g., the ribosomal operon spacer regions, which are being investigated and form the focus of the following chapter.

CHAPTER IV

AN ANALYSIS OF PHYLOGENETIC RELATIONSHIPS OF LAKE VICTORIA CICHLID SPECIES USING INTERNAL TRANSCRIBED SPACER ONE (ITS 1) OF THE RIBOSOMAL GENE OPERON

A. Background

Molecular methods to study genetic differentiation between populations have developed rapidly during the past ten years utilizing techniques such as the polymerase chain reaction (PCR) and dideoxy sequencing. These allow the survey of larger numbers of individuals from diverse taxa relatively quickly.

Approaches such as these are powerful because they provide a large amount of information by examining variation at the highest degree of resolution, the level of DNA. In addition, since different classes of DNA evolve at different rates, markers can be identified which provide information about evolutionary events which have occurred at different times in the past. The choice of the genomic region for analysis depends strongly on the suspected time since divergence of the study groups, and the degree of genetic differentiation which is thus expected.

Mitochondrial DNA has been used extensively for studies of animals at both intraspecific and interspecific levels because of the rapid evolution of mtDNA sequences and the uniparental mode of inheritance (Brown 1985, Avise 1994). Nevertheless, analysis of mtDNA has deficiencies when used as a marker in some situations. In animals, the uniparental mode of inheritance can leave some questions unanswered because of introgression. Furthermore, it is inherited essentially as a haploid molecule (Birky, Maruyama, and Fuerst, 1983). Together, these factors act to reduce the effective population size for the molecule to about one-fourth that of nuclear genes in the same population. Thus, population studies which analyze nuclear DNA with similar mutation rates to mtDNA should be much more likely to reveal nucleotide variation.
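The one-fourth figure can be made concrete with a simple count of transmitted gene copies. The sketch below is a deliberate simplification that assumes an equal breeding sex ratio, strictly maternal transmission of mtDNA, and one effective mtDNA copy per female; the function names and the population size of 500 are hypothetical.

```python
def nuclear_gene_copies(n_females, n_males):
    """Diploid, biparentally inherited locus: two copies per breeding adult."""
    return 2 * (n_females + n_males)


def mtdna_gene_copies(n_females):
    """Effectively haploid, maternally inherited molecule: one copy per female."""
    return n_females


# With equal numbers of breeding females and males (assumed here),
# mtDNA has one-fourth as many transmitted copies as a nuclear gene.
ratio = mtdna_gene_copies(500) / nuclear_gene_copies(500, 500)
print(ratio)  # 0.25
```

Halving for haploidy and halving again for uniparental transmission gives the factor of one-fourth; unequal sex ratios would change the result.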

Several highly variable nuclear gene markers which evolve rapidly may be examined, each with potential drawbacks. Hypervariable VNTR or microsatellite loci may acquire variation primarily by recombinational, rather than mutational input (Jeffreys, Wilson, and Thein, 1985) and may evolve too rapidly to be useful for many phylogenetic studies. Identification of restriction fragment length polymorphisms (RFLPs) can be laborious, and polymorphisms are often species specific (Avise 1994, Parker et al., 1994). To date, it is unclear whether there is a class of nuclear sequences which accumulate differences by nucleotide substitution and can be used for both population and species comparisons. Consideration of nucleotide substitution rates (Li, Luo, and Wu, 1985, Gillespie, 1986) suggests that intervening sequences are one class of nuclear sequences which may contain these characteristics. Regions of the nuclear genome that may be amenable to sequence analysis to examine nucleotide substitutions include the internal transcribed spacers between the ribosomal RNA genes, as well as introns found between coding regions of genes.

Intervening sequences are among the most rapidly evolving classes of nuclear sequences (Li, Luo, and Wu, 1985). Since overall nucleotide substitution rate is correlated with neutral mutation rate, average population heterozygosity for these sequences is also expected to be higher, on average, than most other nuclear regions.

Also, intervening regions tend to be conserved in location within many animal genes and are flanked by exon sequences which may often be conserved enough to develop primers for PCR. This provides a technique to rapidly identify and analyze the spacer, or intron, from a large number of individuals from different populations and species (Lessa, 1992).

As a class of genetic markers, rapidly evolving regions of genomic DNA such as intervening sequences have been studied very little by evolutionary biologists. Intervening sequences have the potential to provide a substantial amount of information about genetic variability among recently divergent populations and species for several reasons. Because they are noncoding, they are among the most rapidly evolving regions of genomic DNA, thus having the potential to diverge in sequence very soon after populations diverge. Sequence data from introns can be used to resolve questions about divergence which mtDNA sequences often leave unresolved because of introgression. The likelihood of detecting allelic variation in non-coding sequences is higher than for analogous mtDNA regions because the biparental mode of transmission makes the effective population size of the molecule four times that of mtDNA. Importantly, it is now technically feasible to identify and sequence introns using available amino acid, or DNA sequence data, to develop primers. Finally, using the PCR and standard sequencing technology, amplification and DNA analysis are possible. In this chapter, a method is described which can be replicated to identify other potentially informative intervening sequences.

The first internal transcribed spacer of the ribosomal gene operon was the target sequence of these studies. The ribosomal gene operon consists of repeat units containing the 18S, 5.8S, and 28S rDNA genes. Flanking the 18S and 28S are external transcribed regions, while between the 18S, 5.8S, and 28S lie internal transcribed spacers. Between the operons there are non-transcribed spacer regions. This gene is transcribed as one unit and later processing removes the spacers. The location of this spacer as well as primer locations used for amplification and sequencing are shown in Figure 10.

On first examination this region may not appear attractive for this type of study due to the large number of repeats of the same gene within an individual which could contain variation.

However, in the case of repeated genes, the genes undergo interactions among themselves which lead to homogenization. This is known as concerted evolution, and as a consequence the genes do not evolve independently of one another (Dover et al., 1982, Birky and Skavaril, 1976). Two mechanisms are thought to be involved in concerted evolution. One involves unequal crossing over between homologous chromosomes which leads to duplication in one and loss of sequence in the other. With iteration, homogenization to one allele occurs. A second mechanism is gene conversion, a non-reciprocal mechanism which results in the replacement of one area of sequence with another. Allelic or non-allelic gene conversion may occur in this case. Concerted evolution has led to homogenization in the ribosomal gene operon, and thus makes it amenable to study. Within the operon, known sequences of flanking 18S and 5.8S rDNA genes allow selection of conserved PCR primers which will amplify the internal transcribed spacer one (ITS 1) region.

Here we present an examination of ITS 1 of 12 species of East African, predominantly Lake Victorian, cichlids. This sequence represents a model for the study of other intervening sequences which can be evaluated to determine whether they are useful molecular markers for studying recently divergent groups of organisms (Torres, Ganol, and Hemblem, 1990, Baldwin, 1992, Soltis and Kuzoff, 1993, Vogler and DeSalle, 1994, and Pleyte, Duncan, and Phillips, 1994).

B. Methods and Materials

Taxa:

Included in this study were the following twelve species: two tilapine cichlids: Oreochromis niloticus and Oreochromis esculentus, two cichlid species distributed elsewhere in East Africa in addition to Lake Victoria, Pseudocrenalabis multicolor, and Astatoreochromis alluaudi. Endemic Lake Victoria species included: Gaurochromis sp., Neochromis nigricans, "Haplochromis" (Paralabidochromis) "rock kribensis", Ptyochromis xenognathus, Xystichromis phytophagous, Astatotilapia nubilus, Prognathochromis venator and "Haplochromis" (Neochromis) "krusing". Table 7 summarizes these species and their trophic type.

Table 7. Species examined for ribosomal ITS 1 study.

Species                                              Trophic Type f

Oreochromis niloticus                                Phytoplanktivore (PH)
Oreochromis esculentus                               Algal scraper (AS)
Astatoreochromis alluaudi                            Pharyngeal crusher (PC)
Pseudocrenalabis multicolor                          Insectivore (I)
Astatotilapia nubilus                                Insectivore (I)
Gaurochromis sp.                                     Insectivore (I)
Neochromis nigricans                                 Algal scraper (AS)
"Haplochromis" Paralabidochromis "rock kribensis"    Insectivore (I)
Ptyochromis xenognathus                              Oral sheller (OS)
Xystichromis phytophagous                            Plant eater (PE)
Prognathochromis venator                             Piscivore (P)
"Haplochromis" Neochromis "krusing"                  Algal scraper (AS)

f. In parentheses is shown the abbreviation for each trophic type.

DNA Extraction:

DNA was extracted from an excised 1 cm³ piece of muscle tissue. Tissue was sheared by razor blade in the presence of ABI lysis buffer (Applied Biosystems Inc., 0.1 M Tris, 4 M Urea, 0.2 M NaCl, 0.01 M CDTA, and 0.5% N-lauroylsarcosine). Chopped tissue was placed in a 1.5 ml centrifuge tube to a total volume of 1 ml. 10 µl of a 20 mg/ml solution of proteinase K was added. Samples were incubated overnight at 50°C. Following digestion, DNA was extracted by two phenol/chloroform extractions and one chloroform extraction. DNA was precipitated by addition of 2 volumes of 95% EtOH, followed by a 70% EtOH wash. DNA was quantified by spectroscopy and quantifications were confirmed by electrophoresis on a 1% agarose gel.

PCR Amplifications:

Ribosomal DNA ITS 1 spacer amplifications were performed as follows: 100 ng of genomic template DNA was added to each reaction (final volume 100 µl) which contained 10 pM of each amplification primer, 2.5 mM MgCl2, 1.5 mM dNTPs, and 1-2.5 U Taq DNA Polymerase (Promega Corp.). A number of combinations of forward and reverse primers were used for ITS 1 amplification, although the greatest success was obtained using primer combination 1712C and ITS2. ITS 1 amplification and sequencing primers are summarized in Table 8. The relative locations of these primers are summarized in Figure 10. The PCR reactions were carried out as follows: denaturation at 94°C for 1 min, annealing at 50-52°C for 1.5 min, and extension at 72°C for 3 min, for 35 cycles. Reactions were performed in a Perkin Elmer Cetus PCI thermocycler. 10 µl of PCR products were electrophoresed on a 1% agarose gel to determine size and approximate quantity before sequencing. Separate PCR reactions from the same individual which produced the correct size bands were pooled.

DNA pools were prepared in the following manners for subsequent sequencing reactions: 1. no preparation (straight PCR product used for sequencing); or 2. T/A cloned for subsequent double stranded sequencing. In the first method 3 µl of PCR product was used directly in cycle sequencing reactions. In the second method, pooled PCR products were quantified by spectroscopy without further manipulation. A dilution of 25 ng/µl was prepared. This was used for cloning using a commercially available T/A cloning kit following the manufacturer's instructions (Invitrogen, Inc.; Marchuk, Mitchell, and Collins, 1991). T/A cloning takes advantage of the tendency of Taq polymerase to add a terminal adenosine to PCR products. Ligation of these A-overhang products to an engineered vector which has a complementary thymine (T) is done using 1 U of T4 DNA ligase which is provided in the kit. This is carried out in a 10 µl reaction volume. Following overnight ligation at 12°C, 1-2 µl of the ligation is used to transform competent cells supplied with the kit (T/A Cloning OneShot Competent Cells, Invitrogen). Cells are plated in the presence of 50 µg/ml ampicillin, and 25 µl of a 40 mg/ml stock of X-gal on 100 mm Petri plates. Incubation is overnight to 24 hours. Inclusion of the ampicillin resistance gene and the LacZ gene in the vector allows antibiotic and colorimetric selection of positive clones. Positive clones are chosen and prepared using standard laboratory protocols (Sambrook, Maniatis, and Fritsch, 1989). The plasmid was resuspended in 20 µl TE, pH 7.5. 5 µl of the resuspended product was restricted with EcoRI to confirm the presence of an insert of the correct size. After confirmation, 3 µl of the cloned plasmid product was used directly in double stranded cycle sequencing reactions.

Sequencing Analysis:

Sequencing was performed using the dsDNA Cycle Sequencing System (BRL Inc.). Sequencing primers were γ-32P-ATP end labeled using 1 U of T4 Polynucleotide Kinase in a 5 µl reaction. Labeled primer was then added to the 31 µl sequencing reaction which contained 2.5 U Taq polymerase (0.5 µl), 4.5 µl 10X Taq sequencing buffer, and 3 µl of straight or 3-10 µl of resuspended cloned product. 8 µl of this reaction was placed in a termination reaction containing 2 µl of ddNTPs. Sequencing was carried out as follows: following an initial heat soak at 96°C for 3 minutes, denaturation was carried out at 96°C for 30 s, annealing at 48-60°C for 1 min, and extension at 70°C for 1 min, for 20 cycles. This was followed by 10 cycles of: 30 s at 94-97°C, and 1 min at 70°C. Products were electrophoresed on either a 6 or 8% polyacrylamide gel. Gels were exposed to Amersham Hyperfilm-MP autoradiography film for 16-48 hours. Scoring was done manually.

Figure 10. Internal Transcribed Spacer One (ITS1). Approximate location of ITS 1 amplification and sequencing primers are shown. Primers appearing below the schematic representation of ITS 1 are forward primers, those above are reverse primers. Lines perpendicular to the ITS 1 region represent the approximate location of these primers. The spacer size (517 bp) is given for a representative species, Astatoreochromis alluaudi, since variation was observed in the size of the spacer (see text). Shaded areas represent gene coding regions.

Table 8. Internal Transcribed Spacer One Amplification and Sequencing Primers.

Primer g      Sequence 5'->3'                       Forward/Reverse   Melting Point (°C)

1262C         gtggtgcatggccgttctta                  Forward           55.7
1262C-Cln     gcggatccgtggtgcatggccgttctta          Forward           73.7
1712C         agcgccgagaagacgatcaaa                 Forward           58.6
1/F           cacaccgcccgtcg                        Forward           49.3
SSU2C         gtgaacctgcggaaggatca                  Forward           54.8
236C-ITS      ggaccgtggctcgttgg                     Forward           54.3
538RE-ITS     ttgccacattcgtagacggg                  Reverse           55.8
ITS2          gctgcgttcttcatcgacgc                  Reverse           58.1
ITS2-Cln      cgaagcttgtcgacgctgcgttcttcatcgacgc    Reverse           78.1

g. Cln designation at the end of primer name represents a primer which has cloning sites engineered into the ends. Primers 1262C-Cln, 236C-ITS, 538RE-ITS, and ITS2-Cln were designed for this study. Other primers shown here were previously developed.

Phylogenetic Reconstruction and Data Analysis:

The primary sequences from the twelve taxa in this study were aligned using the sequence alignment program Eyeball Sequence Editor (ESEE) for the PC (Cabot and Beckenbach, 1989). Phylogenetic analysis was done using the phylogenetic analysis package PHYLIP on Silicon Graphics, IBM, and Macintosh computers (Felsenstein, 1989). Using the PHYLIP package, the corrected proportions of nucleotide substitutions were estimated using the Kimura two parameter model to correct for multiple substitutions. Distances derived by this method were then used to produce phylogenetic trees using various algorithms available in this package, e.g., UPGMA, KITSCH, FITCH, and NEIGHBOR. A parsimony analysis was also performed on these data using the DNAPARS program of PHYLIP or using the program PAUP on a Macintosh computer (Swofford, 1990). Bootstrapping of the data was performed to examine the consistency of the data.
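For readers unfamiliar with the Kimura two parameter correction, the following minimal sketch shows the standard formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites differing by a transition and by a transversion, respectively. It is an illustration only and not the PHYLIP code; the function name is hypothetical, and the choice to skip gapped or ambiguous positions is an assumption that may differ from how DNADIST treats such sites.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(seq1, seq2):
    """Kimura two parameter distance between two aligned sequences.

    Positions where either sequence has a gap or an ambiguity code are
    skipped (an assumption made for this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For closely related sequences the corrected distance is only slightly larger than the raw proportion of differing sites, which is the situation expected for recently diverged taxa.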

C. Results

The sequences of ITS1 were obtained for eight endemic Lake Victorian cichlid species. In addition, four more widely distributed taxa were amplified and sequenced for this region. Most species have an ITS1 of 517 bp, although variation in overall size due to insertion/deletion (indel) events is observed. An alignment of 537 bp was produced from the sequence data of this region (Fig. 11). Of the 537 aligned sites, 57 were variable. Of these variable sites, seventeen were deletions or insertions, twenty-one were transitions, sixteen were transversions, and three sites contained three different nucleotides. The transition/transversion ratio for the total data set was 1.31. Of the variable sites, 36 were phylogenetically informative. Of these, thirteen were insertions or deletions, eleven were transitions, nine were transversions, and three were sites which contained more than two different nucleotides. For the phylogenetically informative data, the transition/transversion ratio was 1.22. In the three sites which contained more than two different nucleotides, the two tilapine outgroup species of Oreochromis shared the same nucleotide. Data sets containing only transitions and transversions were made to test whether they produced different trees.

Figure 11. Internal Transcribed Spacer One (ITS1) Sequence Alignment of the 12 taxa examined for this region. This is a comparison which shows nucleotide changes relative to a reference sequence, Astatoreochromis alluaudi. Key: . = nucleotide identity with reference sequence, - = insertion/deletion events (gaps), N = ACGT. Species are abbreviated as follows: Astatore a = Astatoreochromis alluaudi, Gaurochrom = Gaurochromis sp., Prognathoc = Prognathochromis kachira deep, "Haplochr" = "Haplochromis" Neochromis "krusing", Xstichromi = Xystichromis phytophagous, Pseudocren = Pseudocrenalabis multicolor, Paralabido = "Haplochromis" Paralabidochromis "rock kribensis", Ptyochromi = Ptyochromis xenognathus, Astatotila = Astatotilapia nubilus, Neochromis = Neochromis nigricans, Oreoch nil = Oreochromis niloticus, Oreoch esc = Oreochromis esculentus.
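The site counts reported in the Results above can be checked with a short calculation. The sketch below simply re-derives the totals and the transition/transversion ratios from the counts given in the text; the dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
# Counts of variable sites by category, as reported in the text.
all_variable = {"indels": 17, "transitions": 21, "transversions": 16, "multi_state": 3}
informative = {"indels": 13, "transitions": 11, "transversions": 9, "multi_state": 3}


def summarize(counts):
    """Return (total number of sites, transition/transversion ratio)."""
    total = sum(counts.values())
    ratio = counts["transitions"] / counts["transversions"]
    return total, round(ratio, 2)


print(summarize(all_variable))  # (57, 1.31)
print(summarize(informative))   # (36, 1.22)
```

Both totals (57 variable sites, 36 informative sites) and both ratios agree with the values stated in the text.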

Figure 11 (continued). ITS 1 Sequence Alignment.

Table 9. Distance Matrix and Sequence Dissimilarity of Internal Transcribed Spacer One (ITS 1). h

       Aall   Gaur   Prog   Hapl   Xsti   Pseu   Para   Ptyo   Asta   Neoc   O.nil  O.esc
Aall   ----   .0120  .0099  .0160  .0119  .0492  .0140  .0119  .0140  .0139  .0449  .0449
Gaur   1.9    ----   .0099  .0039  0.00   .0591  .0020  0.00   .0020  .0020  .0412  .0412
Prog   1.4    1.2    ----   .0059  .0099  .0522  .0119  .0099  .0119  .0079  .0433  .0433
Hapl   2.1    0.6    0.6    ----   .0039  .0544  .0059  .0039  .0059  .0020  .0412  .0412
Xsti   1.7    0.2    1.0    0.2    ----   .0500  .0020  0.00   .0020  .0020  .0411  .0411
Pseu   5.2    6.0    6.0    6.2    5.4    ----   .0522  .0500  .0501  .0521  .0699  .0699
Para   1.7    0.4    1.2    0.6    0.2    6.0    ----   .0020  .0039  .0039  .0433  .0433
Ptyo   1.5    0.2    1.0    0.2    0.0    6.0    0.2    ----   .0020  .0020  .0411  .0411
Asta   1.7    0.4    1.2    0.4    0.2    5.0    0.4    0.2    ----   .0039  .0433  .0433
Neoc   1.7    0.4    0.4    0.2    0.2    6.0    0.4    0.2    0.4    ----   .0390  .0390
O.nil  7.2    6.6    6.0    6.2    6.0    9.5    6.6    6.4    6.6    6.0    ----   0.00
O.esc  7.2    6.6    6.0    6.2    6.0    9.5    6.6    6.4    6.6    6.0    0.0    ----

h. Numbers shown above the diagonal are distances calculated in PHYLIP using the Kimura two parameter model. Numbers below the diagonal are the percent pairwise sequence dissimilarities calculated as follows: sum of the number of mismatches in a pairwise comparison divided by the total number of bases, multiplied by 100%. Each insertion or deletion was scored as a single difference in this method. Generic abbreviations are: Aall = Astatoreochromis alluaudi, Gaur = Gaurochromis sp., Prog = "Haplochromis" (Prognathochromis) "kachira deep", Hapl = "Haplochromis" (Neochromis) "krusing", Xsti = Xystichromis phytophagous, Pseu = Pseudocrenalabis multicolor, Para = "Haplochromis" (Paralabidochromis) "rock kribensis", Ptyo = Ptyochromis xenognathus, Asta = Astatotilapia nubilus, Neoc = Neochromis nigricans, O.nil = Oreochromis niloticus, O.esc = Oreochromis esculentus.
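The percent dissimilarity described in the footnote to Table 9 can be sketched as follows. This is one possible reading of that description, not the procedure actually used: it assumes that a run of consecutive gapped columns counts as one insertion or deletion event and that the denominator is the length of the pairwise alignment. The function name and example sequences are hypothetical.

```python
def percent_dissimilarity(seq1, seq2, gap="-"):
    """Percent pairwise dissimilarity with each indel scored once.

    Assumptions of this sketch: a run of adjacent columns in which either
    sequence has a gap is one event (regardless of which sequence carries
    the gap), columns gapped in both sequences are ignored, and the
    denominator is the alignment length.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = 0
    in_indel = False
    for a, b in zip(seq1, seq2):
        a_gap, b_gap = a == gap, b == gap
        if a_gap and b_gap:
            continue  # shared gap: neither a difference nor the end of a run
        if a_gap or b_gap:
            if not in_indel:
                differences += 1  # count the indel once, at its first column
                in_indel = True
            continue
        in_indel = False
        if a != b:
            differences += 1
    return 100.0 * differences / len(seq1)


# One two-base indel plus one substitution over ten aligned positions.
print(percent_dissimilarity("ACGT--ACGT", "ACGTTTACGA"))  # 20.0
```

Scoring indels as single events keeps a long insertion or deletion from dominating the comparison, which matters here because indels are a large share of the variable sites.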


Figure 12. Internal Transcribed Spacer One (ITS1) Consensus Distance and Parsimony Gene Tree. Both trees were produced by consensus of trees produced with bootstrapped data. The distance tree used the Neighbor-Joining algorithm. The parsimony tree used maximum parsimony. Numbers at nodes reflect the number of times the groupings terminal to that node occurred out of 100 trees. The first number is that for distance; the number in parentheses is for maximum parsimony.

This alignment was used to calculate the corrected proportion of nucleotide substitutions using the Kimura two parameter model. No differences were observed between the two species of Oreochromis, or among the three taxa Gaurochromis, Xystichromis, and Ptyochromis. At the upper end of the range of differentiation was that found between the Oreochromis species and Pseudocrenalabis at 0.0699. Also shown is the overall sequence dissimilarity between all pairwise combinations of taxa, which ranged from 0.0 to 0.095 (Table 9).

The distance matrix shown in Table 9 was used for phylogenetic reconstruction with the neighbor-joining algorithm in PHYLIP, with the two Oreochromis species as outgroups. These data were also used to produce a maximum parsimony tree using the same package. In both cases the data were sampled with replacement using bootstrapping, and a consensus tree was built from 100 bootstrapped data sets. Both methods produced identical trees. Trees were also produced using only the transition and transversion data. These trees had the same gross topology as the trees shown but were less well resolved due to the truncated data sets which were used.

D. Discussion

Sequence analysis of ITS1 from the twelve taxa involved in this study revealed low levels of sequence divergence. This is not unexpected considering the recent divergence of these species.

However, interesting results were observed. A large number of insertion and deletion events are found in ITS 1. These were most prevalent in the tilapine taxa, Oreochromis, and to a lesser extent in Pseudocrenalabis, relative to the other taxa. This is similar to other studies which have observed numerous indels in species ranging from salmon to beetles (Pleyte, Duncan, and Phillips, 1992, Vogler and DeSalle, 1994). The lack of substantial variation does, however, allow an unambiguous alignment of the entire ITS1 in the taxa studied. The calculated distances in most comparisons are small, with no differences found between the two Oreochromis species as well as identity between three of the Lake Victoria taxa. The greatest distances are found between Oreochromis species and Pseudocrenalabis. The genetic distances between Astatoreochromis alluaudi and its sister genera range from 0.99% for Xystichromis to 1.4% for Paralabidochromis.

Analysis of phylogenetic gene trees produced identical branching patterns using both distance and parsimony methods (as well as other methods not shown), although bootstrap values were low for many branches. This is due to the small number of phylogenetically informative sites. Although this indicates that the data should be interpreted cautiously, a comparison of Astatoreochromis alluaudi with the taxa terminal to it does yield some interesting findings.

The average distance between A. alluaudi and Prognathochromis, Haplochromis, Neochromis, Astatotilapia, Xystichromis, Ptyochromis, Paralabidochromis, and Gaurochromis is 1.3%. A recent study examined this same ITS 1 region in six species of salmon (Pleyte, Duncan, and Phillips, 1992). The two most closely related species in that study, Salvelinus alpinus and Salvelinus malma, had a sequence divergence between them of 0.53%. It was estimated by other methods that these two taxa diverged approximately 10,000 years ago. As long as precursor rRNA processing sites are maintained within the spacer region, there is little reason to believe that this region would evolve at greatly different rates in salmon and cichlids. Also, the inability to align sequences from other fish species in the case of this study or in the salmon study would suggest that there are not many selective constraints on ITS 1 sequences, excluding processing sites. In addition, the G+C content is similar in both cases, 63% in the salmons, and 70% in cichlids.

While molecular clocks are certainly not universal, the salmon clock does provide an interesting comparison to the cichlids studied here (Korey, 1981, Li, 1993). Astatoreochromis alluaudi is a broadly distributed species in East Africa found in both lacustrine and riverine environments. It has been shown to be phenotypically plastic in some of its morphological characteristics (Sackley, 1992). Further, it has been hypothesized that a riverine cichlid exploited the newly formed Lake Victoria following drying events in the past, rapidly exploiting new ecological niches (Temple, 1969). Previous mitochondrial DNA work has shown A. alluaudi to be a sister group to the Victoria taxa (Meyer et al., 1990). The ITS1 "clock" provided independently by the salmon data suggests a 0.053% sequence divergence per one thousand years for the ITS 1. Given the average sequence divergence between A. alluaudi and the remainder of the Victorian taxa, we would calculate a time of divergence between these groups of 24,500 years b.p. The most recent estimate of the last drying out event in Lake Victoria is thought to have occurred about 15,000 years ago (Livingston, 1980). The calculated time of divergence is in rough agreement with this estimate, and would suggest all other species of Lake Victoria haplochromine cichlids studied here, with the exception of Pseudocrenalabis, have evolved since that time.
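The arithmetic behind this divergence estimate can be laid out explicitly. The sketch below uses only the figures quoted in the text (0.53% divergence over approximately 10,000 years for the two Salvelinus species, and an average divergence of 1.3% for the cichlid comparison) and assumes a strictly linear clock; the function names are hypothetical.

```python
def clock_rate(percent_divergence, years):
    """Percent sequence divergence per 1,000 years, assuming a linear clock."""
    return percent_divergence / (years / 1000.0)


def divergence_time(percent_divergence, rate_per_thousand_years):
    """Years since divergence implied by a given rate."""
    return 1000.0 * percent_divergence / rate_per_thousand_years


rate = clock_rate(0.53, 10_000)       # about 0.053% per 1,000 years
years = divergence_time(1.3, rate)    # about 24,500 years
print(round(rate, 3), round(years, -2))
```

The result, roughly 24,500 years, reproduces the value given in the text and is only as reliable as the assumed salmonid calibration.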

In summary, while the ITS1 contains low levels of variation, it is able to discriminate between a hypothesized sister taxon of the Lake Victoria flock and genera within the flock. Also, the calculated time of this divergence is in rough agreement with a recent drying event in the region. Addition of further Lake Victoria cichlid taxa to the ITS1 database, as well as examination of other regions of the genome, will be needed to establish the relationships among Lake Victoria cichlids more robustly.

CHAPTER V

COMPARISON OF A RANDOMLY AMPLIFIED POLYMORPHIC DNA (RAPD) PHYLOGENY OF LAKE VICTORIA CICHLID SPECIES WITH A PHYLOGENY PRODUCED FROM MORPHOLOGICAL DATA

A. Background

The large species flock of endemic cichlid fish in Lake Victoria represents one of the most rapid and significant vertebrate radiations. In less than 100,000 years more than 350 species have diverged to occupy all possible ecological niches (Goldschmidt and Witte, 1992). New species are still being discovered (Kaufman and Ochumba, 1993). Due to the ecological decline of Lake Victoria's endemic fish fauna, these discoveries are rapidly declining (Baskin, 1992). Nonetheless, this large species assemblage is of great interest to scientists studying explosive evolution, putative sympatric speciation, adaptation processes, as well as extinction events. The evaluation of molecular methods which can aid in the determination of the phylogenetic relationships between these taxa is a primary goal in this research.

Due to the recent divergence of these taxa it has proven difficult to determine phyletic relationships by traditional taxonomic methods, and the phylogeny of species within Lake Victoria remains largely unresolved (Kornfield, 1991).

Recently, however, a phylogeny has been produced for a number of the Lake Victoria taxa based on scale and squamation characters (Lippitsch, 1993). This study was a cladistic analysis of eighteen genera representing a majority of the Lake Victoria taxa. Included were species in the following genera: Astatoreochromis, Astatotilapia (representing two evolutionary lines), Haplochromis, Neochromis, Xystichromis, Hoplotilapia, Ptyochromis, Platytaeniodus, Macropleurodus, Prognathochromis, Psammochromis, Gaurochromis, Enterochromis, Yssichromis, Lipochromis, Schubotzia, and Harpagochromis. Lippitsch presented a phylogenetic tree based on these characters which supported the recent revision that split the haplochromine genus into new genera (Greenwood, 1981).

Previous studies have revealed the general difficulty of morphologically based methods to differentiate among the taxa in Lake Victoria, and have led to more recent analyses of molecular traits, which are potentially phylogenetically more informative.

The methods include, among others: allozyme studies, DNA restriction endonuclease analysis, DNA repetitive element analysis, and DNA sequencing. Initial allozyme analysis was able to discriminate between the more distantly related tilapia, Astatoreochromis, and Hoplotilapia genera, but was unable to differentiate between ten species of "Haplochromines" (Sage et al., 1984). Hoplotilapia is found in Lake Victoria, but is also found in the Victoria Nile, which is connected to Lake Victoria. However, the low levels of variation among related species rendered allozyme analysis insufficient to discriminate between the Lake Victoria taxa.

Unlike allozyme analysis, which examines the products of genes, DNA analysis studies the genes themselves and reveals additional variation hidden in allozyme studies. Methods exist to examine variability at the DNA level which may be able to resolve the phylogenetic questions. One such method is the study of "Variable Number of Tandem Repeat" (VNTR) loci (DNA fingerprint analysis) (Franck and Wright, 1993, Goff et al., 1992, Jeffreys, Wilson, and Thein, 1985). This method has been used in fish research, and while it might be useful in differentiating between the haplochromine species, it is not without its drawbacks in interpretation, as well as difficulties in implementing the techniques related to time, technical difficulty, and money (Baker et al., 1992).

In addition to VNTR analysis, direct sequencing of DNA is also being used. Comparative DNA sequence analysis is now being used widely to infer phylogenetic relationships in prokaryotes and eukaryotes (Li and Grauer, 1991, Swofford and Olsen, 1990). One of the most studied targets when examining closely related taxa has been the cytoplasmically inherited, rapidly evolving, mitochondrial DNA (mtDNA) molecule. Several studies of cichlid species from the African Great Lakes using DNA sequence data from differentially evolving loci of the mitochondrial DNA genome (mtDNA displacement loop [D-loop], cytochrome b, tRNAs, and RNA Pro regions) have been reported (Meyer, 1990, Sturmbauer and Meyer, 1992, Sturmbauer and Meyer, 1993). Using the most rapidly evolving mtDNA region, the D-loop, Meyer was able to discriminate between three groups of "Haplochromines", two Haplochromis species groups from Lake Malawi and one from Lake Victoria (Meyer, 1990). The Lake Malawi groups apparently diverged three to four hundred thousand years before present (ybp).

The Lake Victoria taxa contained different mitochondrial haplotypes, but there were not enough phylogenetically informative sites to produce phylogenetic trees. Therefore, the taxa of Lake Victoria haplochromines studied could not be resolved from one another in this analysis. However, results suggest an origin approximately one hundred thousand years ago.

The estimated time of separation between the haplochromine species of Malawi and Victoria of approximately two million years ago lends support to the monophyletic origin of the haplochromine species within Lake Victoria (Meyer et al., 1990, Kocher et al., 1989). Like mitochondrial genes, nuclear genes with rapid evolutionary rates can be sequenced for their possible phylogenetic information. In cichlids the Internal Transcribed Spacer One of the Ribosomal Operon, and the highly variable Major Histocompatibility Genes (MHC) are being examined in this way (Booton, 1995, this thesis, Ono et al., 1993).

However, as discussed, even these rapidly evolving regions have been unable to fully differentiate between these taxa. In addition, there may not be more rapidly evolving regions that can be found. If this is the case, determination of enough phylogenetically informative sites may require the accumulation of large amounts of sequence data.

In contrast to these more traditional types of sequence analyses, a new method which potentially screens the entire genome for polymorphisms has been described (Williams et al., 1990). Using single ten base oligonucleotides as primers in the polymerase chain reaction (PCR), a series of amplified products are produced from genomic DNA. Following electrophoresis, polymorphisms between individuals and species may be observed arising from changes within the primer sites. This method, Randomly Amplified Polymorphic DNA (RAPD) analysis, is potentially useful to develop species specific banding profiles, and to determine genetic relatedness based on a character state analysis of these patterns.

In the studies presented in this chapter, we propose to test aspects of the Lippitsch phylogeny using results derived from molecular based analysis, specifically RAPD analysis. RAPD analysis theoretically allows us to screen the entire genome to find markers which delimit species. The results can be used to produce phylogenetic trees based on these markers (Williams et al., 1990, Chalmers et al., 1992, Chapco et al., 1992, Kambhampata, Black, and Rai, 1992, Scott, Haymes, and Williams, 1992, Russell et al., 1993, Lynch and Milligan, 1994).

In this study we used data derived by RAPD analysis of a subset of the Lippitsch taxa, as well as some other species of Lake Victoria cichlids, to answer the following questions:

1. Are polymorphisms produced using RAPD markers sufficient to produce a reliable phylogenetic tree of these species? That is, do repeated analyses of the data consistently produce the same tree? Data derived from RAPD studies will be treated as discrete characters, and will be analyzed using maximum parsimony to determine a hypothetical RAPD phylogeny.

2. Does the phylogenetic tree produced from RAPD markers concur with that derived from morphological data? We hypothesize that the phylogeny based on molecular data (RAPD's) will be concordant with that based on morphological data if they both reflect the actual phylogenetic relationships between these taxa. Alternatively, do the RAPD data produce a tree in conflict with the previously proposed tree? To examine this, a phylogenetic tree generated from RAPD data by maximum parsimony will be compared with the morphologically derived Lippitsch phylogeny.

3. Do the RAPD data produce a tree which groups taxa by trophic type? Lake Victoria cichlids exploit a wide variety of ecological niches, and previous classification schemes have incorporated trophic associations to group taxa. In addition, recent analysis of different trophic groups reveals that they have different behavioral characteristics (Goldschmidt, T., 1990, Goldschmidt, T., and F. Witte, 1990). We will examine whether the genetic data from RAPD's support a phylogenetic basis for such hypothesized relationships.

B. Methodology

a. Experimental Design

RAPD analysis is one approach to determine levels of genetic relatedness and genetic variation in Lake Victoria cichlids. To evaluate the usefulness of these methods to produce unbiased estimates of genetic relationships, the main series of experiments was performed on samples whose identity was concealed from us until late in the analysis. This prevented bias in evaluating band sharing because of any a priori assumption of relatedness between samples.

b. Collections

With the single blind study design in mind, cichlid samples were collected in Ugandan and Kenyan waters by Drs. Les Kaufman and Mark Chapman, and Mr. Mwanja Wilson, and identified by them in situ, and at The Museum of Comparative Zoology (MCZ) at Harvard University, where the voucher specimens from which tissues were collected have been deposited. The MCZ or other identification number, and the source of all samples, is presented in Table 11. Muscle plugs were removed from the specimens in the field, and subjected to a series of alcohol washes before shipment to our laboratory. Specimens were given the numerical voucher code before shipping. The voucher number ultimately allowed us to identify an individual, but did not further identify the specimen. Included in the sampling strategy was a known outgroup taxon, Astatoreochromis alluaudi, a related species from the Lake Victoria basin suggested by allozyme and mitochondrial studies to have diverged basal to the remainder of the Victoria flock. Sample sizes from each species were between one and six individuals.

Seventy specimens collected from Lake Victoria, and coded before being sent to our laboratory, were included in the RAPD study. Tissue samples had DNA extracted and quantified, and were used in the studies discussed below. In addition, seven samples were obtained from populations being managed as part of the Lake Victoria cichlid Species Survival Plan. Five of these are known samples of "Haplochromis" (Astatotilapia) "flameback". Two others were also classified as "Haplochromis" (Astatotilapia) "flameback", but there was concern at the institution that was propagating these individuals that this classification was incorrect. This pair was included to determine if these two individuals grouped with the other five "flamebacks".

c. Experimental Conditions

1. Tissue collection:

Frozen tissue or tissue preserved in 95% ethanol was used.

If solid tissue was to be preserved in ethanol, the tissues were immersed in five volumes of ethanol per volume of tissue, and the ethanol was changed within one hour after the tissue was originally exposed to alcohol.

2. DNA Extraction From Fish Tissues:

For RAPD analysis, DNA was extracted exclusively from muscle tissue of alcohol preserved specimens. To prepare muscle, a small piece of tissue (about 1 cm³) was minced with a fresh single edge razor blade on a glass plate in the presence of 0.5 ml ABI lysis buffer (Applied Biosystems Inc., 0.1 M Tris, 4 M urea, 0.2 M NaCl, 0.01 M CDTA, and 0.5% N-lauroylsarcosine). The slurry was scraped into a 1.5 ml microfuge tube. A fresh blade was used for each sample and the glass plate was cleaned with soap and water, and wiped with alcohol between samples. The total volume in the centrifuge tube was brought to 1 ml, and 10 µl of a 20 mg/ml solution of proteinase K was added. Samples were incubated overnight at 45-50 °C. If the samples appeared viscous the next day, another 25 µl of proteinase K was added and incubation continued for at least 2 more hours.

Following overnight digestion, the tissue slurry was divided into two 1.5 ml eppendorf tubes, each containing 500 µl of liquid.

DNA was extracted by two phenol/chloroform extractions and one chloroform extraction. Many protocols call for the addition of phenol without chloroform during the first extraction, but because of the high salt concentration in the ABI lysis buffer the layers will not always separate if only phenol is added. 250 µl of Tris saturated phenol, pH 8.0 or higher, and 250 µl chloroform/isoamyl alcohol were added to each tube for the first two extractions, followed by 500 µl of chloroform/isoamyl alcohol (24:1). Samples were mixed by inversion for five minutes and then spun in a microcentrifuge for two minutes after each step. Following each step the top layer was removed, and put in a fresh tube. DNA was precipitated by addition of 2 volumes (1 ml) of 95% EtOH, followed by a 70% EtOH wash. After addition of the alcohol, a white precipitate of DNA was usually visible. If not, samples were placed in the -20 °C freezer for fifteen minutes. Tubes were then spun at full speed in the microfuge for ten to fifteen minutes. Tubes were turned upside down on fresh paper towels and the pellets were left to dry overnight. After overnight drying, pellets were resuspended by incubation at 65 °C in 250-400 µl TE, pH 7.5. DNA was quantified by spectroscopy and quantifications were confirmed by electrophoresis on a 1% agarose gel. After quantification, a 25 ng/µl genomic DNA stock of each sample was prepared for analysis by RAPD's.
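The quantification and dilution step at the end of this protocol can be sketched as follows. This is an illustration only: the conversion of one A260 unit to 50 ng/µl is the standard figure for double-stranded DNA in a 1 cm path, but the absorbance reading, the cuvette dilution factor, and the 100 µl final volume are hypothetical values, not taken from the thesis.

```python
# Sketch: estimate DNA concentration from A260 and plan a 25 ng/ul working stock.
# The A260 reading, cuvette dilution, and 100 ul final volume are hypothetical.

DSDNA_NG_PER_UL_PER_A260 = 50.0  # standard conversion for double-stranded DNA, 1 cm path

def dna_conc_ng_per_ul(a260: float, dilution_factor: float) -> float:
    """Concentration of the undiluted extract in ng/ul."""
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor

def dilution_plan(stock_ng_per_ul: float, target_ng_per_ul: float, final_ul: float):
    """Return (ul of stock, ul of TE) using C1 * V1 = C2 * V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target")
    stock_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return stock_ul, final_ul - stock_ul

conc = dna_conc_ng_per_ul(a260=0.25, dilution_factor=20)
stock_ul, te_ul = dilution_plan(conc, target_ng_per_ul=25.0, final_ul=100.0)
print(f"{conc:.0f} ng/ul extract: mix {stock_ul:.1f} ul DNA with {te_ul:.1f} ul TE")
```

With a 25 ng/µl working stock, the 2 µl of template used per RAPD reaction corresponds to 50 ng of genomic DNA.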

3. RAPD Study Primers:

A large number of commercially synthesized RAPD primers are available for screening. The primers used in this study were obtained from Operon Inc. Identified at the time of purchase by a capital letter, these primers came in sets of twenty, all of which have the same G+C content within each set. Within a set, the twenty primers are labeled according to the capital letter of the set, followed by a number, e.g., A1, A2, ..., A20 in set A.

Preliminary screening of a subset of the DNA samples used in this study with the M set of RAPD primers showed that most primers were potentially useful, producing 3-10 bands which were polymorphic in the series of test samples. For the experiments discussed here, a total of ten primers were used with screening done in duplicate on the entire experimental sample (Table 10).

4. RAPD Reaction Preparation:

After preparation of DNA, samples were screened with the previously identified RAPD primers. All RAPD conditions were established in preliminary studies and each aspect of the reaction was tightly controlled as described in the following sections.

Although there has been some evidence to the contrary, we approached our studies with the belief that variation in many of the factors of the polymerase chain reaction, such as primer concentration, number of cycles, cycling temperatures, and others, can produce variable RAPD banding patterns (Schierwater and Ender, 1993, Venugopal et al., 1993, Park and Kohel, 1994). Consequently, we attempted to control these factors in each reaction to eliminate as much variability as possible. Reactions were carried out in duplicate to ensure repeatability. A constant positive control of a known species was used with every primer. This was a DNA specimen from Astatoreochromis alluaudi whose RAPD pattern had been determined for all of the primers involved in the study. It provided a continuous control, useful to check for repeatability of the reactions. In addition, since the banding pattern of the known control was repeatedly scored, the amount of variation in size determination of bands could be assessed.

A set of pipettors was set aside solely for PCR reactions to minimize problems from contamination from other sources in the laboratory. These were used for all manipulations except pipetting of genomic DNA. For genomic DNA, another pipettor was used. In most cases forty-eight RAPD reactions were set up at once. First, sixteen 0.75 ml eppendorf tubes were placed in a microfuge tube rack. Next, 2 µl (50 ng) of genomic DNA was placed in the bottom of each of the sixteen tubes. These first sixteen tubes were closed and labeled. The next set of sixteen tubes was then done. On the final set of sixteen, the last two reactions contained 1 µl of the positive A. alluaudi sample, and 1 µl of water for the negative, respectively. A bulk solution was then prepared for the remainder of the reagents.

Enough of this stock solution was made for fifty two reactions. Each reaction contained 2.5 µl RAPD buffer (100 mM Tris-Cl, pH 8.8, 500 mM KCl, and 15 mM MgCl2), 2.5 µl dNTP's (1.25 mM final concentration of each nucleotide), 1 µl RAPD primer, 0.2 µl (1 Unit) Taq Polymerase (Perkin Elmer Cetus), and 16.8 µl sterile H2O. These 23 µl of stock, together with the 2 µl of genomic DNA already in each tube, gave a final volume of 25 µl. The Taq Polymerase was added to this stock last and the solution was placed on ice.
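The bulk solution described above is a simple scaling of the per-reaction recipe. The sketch below uses the per-reaction volumes and the fifty two reaction batch size given in the text; the dictionary layout and the function name are illustrative choices, not part of the original protocol.

```python
# Sketch: scale the per-reaction recipe from the text to a 52-reaction bulk mix.

PER_REACTION_UL = {
    "RAPD buffer": 2.5,
    "dNTPs": 2.5,
    "RAPD primer": 1.0,
    "Taq polymerase": 0.2,
    "sterile water": 16.8,
}
DNA_UL = 2.0       # genomic DNA is added to each tube separately
N_REACTIONS = 52   # 48 tubes plus overage, as stated in the text

def master_mix(per_reaction: dict, n: int) -> dict:
    """Total volume of each reagent (ul) needed for n reactions."""
    return {name: round(vol * n, 1) for name, vol in per_reaction.items()}

mix = master_mix(PER_REACTION_UL, N_REACTIONS)
mix_per_tube = sum(PER_REACTION_UL.values())   # volume dispensed into each tube
final_volume = mix_per_tube + DNA_UL           # reaction volume after adding DNA

for name, total in mix.items():
    print(f"{name:15s} {total:7.1f} ul")
print(f"dispense {mix_per_tube:.1f} ul per tube; final volume {final_volume:.1f} ul")
```

The per-tube total of 23 µl plus 2 µl of template reproduces the 25 µl reaction volume reported in the text.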

The first sixteen reactions were next placed in the thermal cycler and the tops opened. Twenty-three microlitres of the stock solution was placed in each tube using a fresh pipet tip for each tube, followed by one drop of sterile mineral oil, and the tubes were closed. This procedure was followed for the remaining two sets of sixteen reactions.

5. RAPD PCR Conditions:

The cycle conditions which were used to amplify RAPD bands were as follows:

Step 1: 30sec @ 94 °C

Step 2: 1 min @ 35 °C

Step 3: 2 min @ 72 °C

Number of cycles: 45

Soak cycles: 5 minutes at 72 °C after the final cycle, followed by a step to a 4 °C soak.

When the thermal cycler reached 93 °C on the first cycle, the reaction was paused for three minutes (hot start reaction), then the cycling reaction was started. After the PCR reaction was completed, 3 µl of "blue juice" (Bromphenol Blue, Xylene Cyanol, and Ficoll) was added to each tube. The tube was centrifuged momentarily and the reaction was placed at 4 °C until electrophoresis.
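For reference, the cycling program can be written down as data, which also gives a lower bound on the run time. In this sketch the step labels (denature, anneal, extend) are an interpretation of the three temperatures rather than terms used in the text, and ramping time between temperatures is ignored.

```python
# Sketch: the cycling program from the text as data, with a lower bound on run time.
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: float

HOT_START = Step("hot-start pause", 93, 180)   # three-minute pause on the first cycle
CYCLE = (
    Step("denature", 94, 30),    # Step 1 in the text
    Step("anneal", 35, 60),      # Step 2
    Step("extend", 72, 120),     # Step 3
)
N_CYCLES = 45
FINAL_EXTENSION = Step("final extension", 72, 300)

def min_run_minutes() -> float:
    """Sum of hold times only; ramping between temperatures is ignored."""
    cycle_s = sum(step.seconds for step in CYCLE)
    total_s = HOT_START.seconds + N_CYCLES * cycle_s + FINAL_EXTENSION.seconds
    return total_s / 60

print(f"minimum run time: {min_run_minutes():.1f} minutes")
```

The hold times alone add up to roughly two and three quarter hours before the 4 °C soak.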

6. Electrophoresis:

RAPD product bands were electrophoresed on an Agarose/Synergel matrix with the positive control and two 1 kb marker lanes. Agarose gel electrophoresis was carried out in 0.5X TE buffer. For these studies, an Agarose/Synergel mixture equivalent to a 2% agarose gel was used in all cases. Depending on the size of the agarose gel apparatus that was used, 200-250 ml of solution was used. Dry agarose and Synergel powder were shaken together on the bottom of a 500 ml flask. The appropriate volume of 0.5X TE was then added and the mixture was microwaved on high for three minutes and thirty seconds. After heating, and addition of a stir bar, the gel matrix was placed on a stir plate and allowed to cool to 50 °C. Gels were consistently poured within 2 °C of this target temperature for these studies. At this point, the molten gel was gently poured into the gel mold, which was 20 cm wide x 21 cm in length. The custom made electrophoresis apparatus used in this study permitted loading of 26 samples onto the upper, and 26 samples onto the lower section of each gel. After cooling, 25 µl (of a total volume of 28 µl) of sample was loaded into each well.

Table 10. Primers used for RAPD Analysis. All primers are from set OP-M from Operon Inc.

Primer    Sequence (5'->3')
M2        ACAACGCCTC
M4        GGCGGTTGTC
M5        GGGAACGTGT
M6        CTGGGCAACT
M8        TCTGTTCCCC
M10       TCTGGCGCAC
M12       GGGACGTTGG
M15       GACCTACCAC
M19       CCTTCAGGCA
M20       AGGTCTTGGG

Twenty four experimental samples were loaded onto the top half of the gel. Twenty two experimental samples, one positive control, and one negative control were loaded onto the bottom half. A size standard, the 123 bp ladder or in some cases a 1 kb ladder, was added to the remaining two wells in each section of the gel. Gels were electrophoresed at 110 volts for 4 hours. Following electrophoresis, the gels were cut in half, and stained in an ethidium bromide solution for fifteen minutes (1 ml of stock EtBr solution in 600 ml distilled water, prepared fresh daily). Following staining, gels were destained for fifteen minutes. Polaroid photographs were taken of each gel. The aperture and shutter speed were the same for all photographs: f-stop of 8, and shutter speed of 0.5 seconds. RAPD's were then scored. As discussed above, all reactions were duplicated to determine repeatability. In preliminary studies, the migration distances of the marker bands were determined by measurement to the nearest 0.25 mm. This information was then used to determine a regression line for a particular gel. The regression was used to determine the size of the experimental A. alluaudi RAPD bands from their migration distance. By repeated analysis of the known samples, we can examine the error associated with this type of measurement scheme. Experimental bands whose sizes were determined in this manner in separate gel runs produced a correlation coefficient of 0.99 for size determination between experiments. We feel confident that this method reliably determines band size.
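The size calibration described here can be illustrated with a short least-squares fit. The text does not say which transformation was used for the regression; the sketch below assumes the common choice of regressing log10(fragment size) on migration distance. The ladder distances are synthetic illustration values rounded to the nearest 0.25 mm, not measurements from the thesis, and the function names are hypothetical.

```python
# Sketch: calibrate band size from migration distance with a marker ladder.
# ASSUMPTION: log10(size) is roughly linear in distance over the scored range.
# The ladder distances below are synthetic illustration values, not thesis data.
import math

def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def size_from_distance(distance_mm, slope, intercept):
    """Predicted fragment size in bp for a measured migration distance."""
    return 10 ** (slope * distance_mm + intercept)

# Rungs of a 123 bp ladder (246 ... 1230 bp) with made-up distances,
# rounded to the nearest 0.25 mm as in the measurement scheme described above.
ladder_bp = [123 * k for k in range(2, 11)]
ladder_mm = [round((250.0 - 80.0 * math.log10(bp)) * 4) / 4 for bp in ladder_bp]

slope, intercept = fit_line(ladder_mm, [math.log10(bp) for bp in ladder_bp])
unknown_band_bp = size_from_distance(30.0, slope, intercept)
print(f"band at 30.0 mm is about {unknown_band_bp:.0f} bp")
```

Repeating the fit on each gel and comparing the sizes assigned to the control bands is one way to obtain the kind of between-run agreement reported above.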

7. Data Scoring:

After determination of band size for the various RAPD loci on a gel, individuals were scored for the presence (1) or absence (0) of a particular band as a character state. Missing data were scored as (-) (Maddison, 1993). Missing data occurred when an individual produced no bands in repeated trials for a particular primer. This information was accumulated across all primers used in the study. A matrix (Appendix B) was produced which can be used to determine the pattern of band sharing between individuals, and hence similarity, by maximum parsimony methods. It is also possible to score similarity between individuals as the proportion of bands shared, but we have not yet done such an analysis as part of the present studies. This latter approach would permit the application of distance approaches (Clark and Lanigan, 1993). Results were scored as described above and tabulated. These studies represent the first large scale analyses of fish using RAPD primers.

Figure 13. Representative RAPD gel. The RAPD primer used here is M10. Lanes 1-6 are Pseu/mu6, Ptyo/os3, Asta/nu4, Asta/nu3, Asta/nu6. Lanes 8-13 are Ptyo/xe1, Ptyo/xe2, Ptyo/xe3, Ptyo/xe4, Ptyo/xe7, and Ptyo/xe8. Lanes 15-20 are blank, Yssi/la1, Hapl/li1, Yssi/la2, A./allu1, and the negative. Lanes 7 and 14 are the 1 kb ladder.

8. Data Analysis:

Phylogenetic analysis was done using the parsimony program PAUP (Phylogenetic Analysis Using Parsimony) (Swofford, 1990). Bands were scored as present, absent, or missing in experimental samples. These data were then tabulated in a NEXUS file for parsimony analysis using PAUP.
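As an illustration of the tabulation step, the sketch below writes a small presence/absence/missing matrix as a NEXUS DATA block. The taxon labels and scores are made up for the example. One assumption should be noted: the "-" used for missing data in the text is written out as "?", because "-" is the default gap symbol in the NEXUS format. The exact layout of the file used in this study is not given in the text.

```python
# Sketch: turn presence/absence/missing band scores into a NEXUS DATA block.
# Taxon labels and scores are made up. The "-" used for missing data in the
# text is written as "?" because "-" is the default gap symbol in NEXUS.

def to_nexus(scores: dict) -> str:
    """scores maps a taxon label to a string over the characters 1, 0 and -."""
    lengths = {len(row) for row in scores.values()}
    if len(lengths) != 1:
        raise ValueError("all taxa must be scored for the same markers")
    nchar = lengths.pop()
    width = max(len(taxon) for taxon in scores)
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(scores)} NCHAR={nchar};",
        '  FORMAT DATATYPE=STANDARD SYMBOLS="01" MISSING=?;',
        "  MATRIX",
    ]
    for taxon, row in scores.items():
        if set(row) - set("01-"):
            raise ValueError(f"unexpected score for {taxon}")
        lines.append(f"    {taxon.ljust(width)}  {row.replace('-', '?')}")
    lines += ["  ;", "END;"]
    return "\n".join(lines)

example = {"allua": "1101", "pseu1": "10-1", "xen_4": "0110"}
nexus_text = to_nexus(example)
print(nexus_text)
```

Underscores are used in the example labels because unquoted hyphens in NEXUS taxon names can be misread by some parsers.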

C. Results

1. RAPD Amplifications and Phylogenetic Tree Reconstruction:

Genomic DNA from 76 experimental individuals and one positive control individual was subjected to PCR amplification using ten RAPD primers. These amplifications resulted in 65 scorable markers across the study individuals. A representative RAPD gel is shown in Figure 13. Bands were scored as described above.

Some individuals did not amplify at all using particular primers after repeated attempts. Since the entire banding profile was missing from these individuals, not just the band being scored, they were classified as missing for that particular band. A data matrix consisting of 1's, 0's, and -'s was produced and appears in the appendix.
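Not every marker in such a matrix contributes to a parsimony analysis. A binary character is usually counted as parsimony-informative only when each of the two states occurs in at least two individuals. The sketch below applies that standard criterion to a made-up matrix, ignoring missing scores; it is not the procedure PAUP uses internally, and the function name is hypothetical.

```python
# Sketch: find parsimony-informative binary markers in a 1/0/- matrix.
# A marker is kept when both 0 and 1 occur in at least two individuals;
# missing scores ("-") are ignored. The rows below are made-up data.

def informative_columns(rows):
    """Return indices of markers that are informative under parsimony."""
    keep = []
    for i in range(len(rows[0])):
        column = [row[i] for row in rows]
        if column.count("0") >= 2 and column.count("1") >= 2:
            keep.append(i)
    return keep

rows = ["1101", "10-1", "0110", "0011"]
print(informative_columns(rows))
```

In the example only the first two markers qualify: the third has a single scored absence and the fourth has a single absence, so neither can favor one tree over another.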

The data matrix produced was used in a discrete character analysis using parsimony methods in PAUP. Of the 65 scored markers, 56 were phylogenetically informative. The initial analysis was done prior to breaking the blind coding of these individuals. Although the individuals sent for the study were coded anonymously, some of the individuals used in this RAPD analysis had previously been extracted and identified for other studies. For this reason the first tree contains both the numerical codes for most of the group, as well as names of some known groups. Prior to breaking the code, these known individuals were used informally to determine if any clustering was occurring. Due to the overwhelming number of possible trees with this many individuals, an exhaustive search was not attempted. Various search methodologies are available which analyze only a subset of the possible trees. In this study we used a heuristic search method to search for most parsimonious trees, those trees that minimize the total number of character changes. For the initial analysis, the total number of trees retained was limited to 100. A majority rule consensus tree of these 100 most parsimonious trees was determined (Figure 14).
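The scale of the search problem can be made concrete. For n taxa the number of distinct unrooted, fully bifurcating trees is (2n - 5)!!, the product of the odd numbers from 3 to 2n - 5. The sketch below evaluates that standard formula; the function name is illustrative.

```python
# Sketch: number of unrooted, fully bifurcating trees for n taxa, (2n - 5)!!.

def n_unrooted_trees(n_taxa: int) -> int:
    """Product of the odd numbers 3, 5, ..., 2n - 5 (equal to 1 when n is 3)."""
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

for n in (4, 5, 10, 77):
    print(n, "taxa:", f"{n_unrooted_trees(n):.3e}", "trees")
```

With 77 individuals the count has well over one hundred digits, which is why only heuristic searches are practical for a data set of this size.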

A determination to break the anonymous code was made after examination of this tree for three reasons. First, while at this point most individuals were coded, there did appear to be clustering of taxa at several locations in the tree. Second, known individuals, such as Pseudocrenalabis multicolor (pseu9 through pseu7 in lower portion of tree), and Xystichromis phytophagous (phyt5 through phyt8 near lower middle) were clustering. This was not universal though, as members of another known group, Ptyochromis xenognathus (xen-4, xen-7, and xen-8), appeared scattered in the lower portion of the tree. Third, the majority rule numbers were relatively high, indicating that the same branching pattern was appearing in many of the trees.

To examine whether any important trees were missed by limiting our search to the 100 most parsimonious trees, another search was performed which maintained the 900 most parsimonious trees. A majority rule consensus tree was produced in a manner analogous to that described above for the 100 trees (Figure 15). In comparison to the 100 tree analysis, areas of the topology which had lower majority rule numbers (60's to 70's) in the 100 tree consensus resulted in multifurcations in the 900 tree data set. Clustering, such as for Pseudocrenalabis multicolor and P. xenognathus, remained strong.

Therefore, examination of the 100 most parsimonious trees appears sufficient, as long as lower majority rule values are interpreted as indicating branching points with weak support. At this point our field collaborators were asked to break the code.
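The majority rule values discussed above are simply the percentage of retained trees in which a given grouping appears. The sketch below illustrates the idea on four made-up trees, each represented as a set of clades. Representing trees as sets of rooted clades is a simplification chosen for the example, and the labels are hypothetical; it does not reproduce PAUP's consensus routine.

```python
# Sketch: majority-rule clade support across a set of equally parsimonious trees.
# Each tree is given as a set of clades (frozensets of taxon labels); made-up data.
from collections import Counter

def clade_support(trees):
    """Map each clade to the percentage of trees that contain it."""
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: 100.0 * n / len(trees) for clade, n in counts.items()}

def majority_rule(trees, threshold=50.0):
    """Keep only clades found in more than `threshold` percent of the trees."""
    return {c: p for c, p in clade_support(trees).items() if p > threshold}

A, B, C, D = "pseu1", "pseu3", "xen_4", "nu_1"
trees = [
    {frozenset({A, B}), frozenset({A, B, C})},
    {frozenset({A, B}), frozenset({A, B, D})},
    {frozenset({A, B}), frozenset({C, D})},
    {frozenset({A, C}), frozenset({A, B, C})},
]
consensus = majority_rule(trees)
for clade, pct in consensus.items():
    print(sorted(clade), f"{pct:.0f}%")
```

In this toy example only the pairing of the two pseu individuals survives; a grouping present in exactly half of the trees is dropped, which is how weakly supported nodes collapse into multifurcations.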

The anonymous numerical designations of the individuals and their species equivalents are shown in Table 11. This table also shows the geographic origin of a sample, if known, and the trophic type of the species, again if known. Table 12, which follows, summarizes the total sample size for each taxon.

The original sampling structure called for our collaborators to identify from three to five individuals of each species to be analyzed in these experiments. While many of the taxa contain the target sample size, examination of Table 12 reveals many taxa which have only one representative. We were unaware of these sizes before breaking the code. The scarcity of many species in recent collections forced smaller than desired samples in some 115

ohvifi 16260 xen-4 ros-3 nu-3 nu-^

P lb k 4 ?tDK2 Pibka Flpkl R b k l oseua oseut oseu6 oseu3 oseud oseur atiua

Figure 14. Majority Rule Consensus Tree for 100 Most Parsimonious RAPD Trees. Numbers at nodes correspond to the number of times that the node appeared m the 100 trees. Taxa numbers are coded using anonymous designations (see text). 116

Mawnwroie 97096 16782 97085

16820 97130 97110

97096 97132 97115

16876 97102

97116 18903 16865 97109

16250 xen-8 16264

14707

Figure 15. Majority Rule Consensus Tree for 900 Most Parsimonious RAPD Trees. Numbers at nodes correspond to the percentage that the node appeared in the 900 trees. Taxa numbers are coded using anonymous designations (see text). 117 Table 11. Taxonomic Designation of Species Analyzed in RAPD stu d y .

CQDE' SPECIES______ORIGIN TROPHIC 13376 not identified 14702 Yssichromis laparogramma K z 14703 Yssichromis laparogramma K z 14704 Yssichromis laparogramma K z 14707 Haplochromis lividus K 16250 Ptyochromis xenognathus K OS 16256 Ptyochromis xenognathus K C6 16259 "Haplochromis" (Ptyochromis) K 06 "rusinga oral sheller" 16260 Ptyochromis xenognathus K 06 16264 "Haplochromis" (Ptyochromis) K 06 "rusinga oral sheller" 16265 not identified 16290 Neochromis nigricans K AS 16291 "Haplochromis" (Harpagochromis) K P " frogmouth" 16297 Neochromis nigricans K AS 16375 Paralabidochromis beadlei U I 16380 Gaurochromis sp. U I 16467 Prognathochromis Venator U P 16468 Prognathochromis Venator U P 16476 Prognathochromis Venator U P 16496 Oreochromis esculentus u AS 16576 "Haplochromis" (Paralabidochromis) u PC "rock kribensis" 16578 Psammochromis riponianus u I 16579 "Haplochromis" (Paralabidochromis) u PC "rock kribensis” 16582 "Haplochromis" (Neochromis) u AS "m adonna" 16589 "Haplochromis" Paralabidochromis u PC "rock kribensis" 16729 "Haplochromis" (Paralabidochromis) u "black" 16730 "Haplochromis" (Paralabidochromis) u "black" 16782 Pyxichromis orthostoma u 118 Table 11 (continued)

16806 "Haplochromis" (Paralabidochromis) U "black" 16817 Paralabidochromis plagiodon U 16820 Paralabidochromis plagiodon U 16865 "Haplochromis" (Harpagochromis) U "frogmouth" 16875 Neochromis nigricans U AS 16876 "Haplochromis" (Harpagochromis) U P "frogmouth" 16879 Neochromis nigricans U AS 16889 Paralabidochromis xenodon U 16891 Neochromis nigricans U AS 16903 Paralabidochromis xenodon U 16910 ’Haplochromis" (Paralabidochromis) U 'fried egg" 16912 ’Haplochromis" (Paralabidochromis) U 'fried egg" 16921 ’Haplochromis" (Paralabidochromis) U 'fried egg" 97048 ’Haplochromis" (Paralabidochromis) U I? ’chestnut" 97085 ’Haplochromis" (Prognathochromis) U P 'kachira deep" 97093 not identified 97096 ’Haplochromis" (Prognathochromis) U P ’kachira deep" 97098 ’Haplochromis" (Prognathochromis) U P 'kachira deep" 97102 not identified 97109 "Haplochromis" (Prognathochromis) U P 'kachira deep" 97110 ’Haplochromis" (Astatotilapia) "black U I group" 97115 ’Haplochromis" (Astatotilapia) "black U I group" 97116 ’Haplochromis" (Astatotilapia) "black u I group" 97130 not identified 97132 not identified allua Astatoreochromis alluaudi K PC F lbkl "Haplochromis" (Astatotilapia) P I "flameback" Flbk2 "Haplochromis" (Astatotilapia) I " flameback" Flbk3 "Haplochromis" (Astatotilapia) I "flameback" 119 Table 11 (continued)

Flbk4 "Haplochromis" (Astatotilapia) P I " flameback" Flbk5 "Haplochromis" (Astatotilapia) P I " flameback" HoFBl Putative "Haplochromis" P I (Astatotilapia ) "flameback" 1 HoFB2 Putative "Haplochromis" P I (Astatotilapia) "flameback" 2 n u - 1 Astatotilapia nubilus KI n u -3 Astatotilapia nubilus KI n u -4 Astatotilapia nubilus K I p h y t5 Xystichromis phytophagous K PE p h y t6 Xystichromis phytophagous K PE phy t8 Xystichromis phytophagous K PE pseu 1 Pseudocrenalabis multicolor KI pseu3 Pseudocrenalabis multicolor KI pseu6 Pseudocrenalabis multicolor KI pseu7 Pseudocrenalabis multicolor KI pseu8 Pseudocrenalabis multicolor KI pseu9 Pseudocrenalabis multicolor KI ros-3 "Haplochromis" (Ptyochromis) K OS "rusinga oral sheller" xen-4 Ptyochromis xenognathus K OS xen-7 Ptyochromis xenognathus K OS xen-8 Ptyochromis xenognathus K OS

Table Key: Code = anonymous number assigned to an individual by the field team; Species = genus and species (or putative species) of the individual with that code; Origin = location of origin of that individual (U = Ugandan, fish caught in Ugandan waters in the northern part of the lake; K = Kenyan, fish caught in Kenyan waters in the eastern portion of the lake; P = Captive Propagation Program); Trophic = trophic type of that species: OS = oral sheller, PC = pharyngeal crusher, I = insectivore, Z = zooplanktivore, P = piscivore, AS = epilithic algal scraper, PE = plant eater.

Table 12: Species sample size summary for RAPD analysis.

SPECIES                                                   NUMBER OF INDIVIDUALS
"Haplochromis" (Prognathochromis) "kachira deep"          4
Prognathochromis venator                                  3
  GENERIC TOTAL                                           7
"Haplochromis" (Paralabidochromis) "fried egg"            3
"Haplochromis" (Paralabidochromis) "rock kribensis"       3
"Haplochromis" (Paralabidochromis) "black"                3
"Haplochromis" (Paralabidochromis) "chestnut"             1
Paralabidochromis plagiodon                               2
Paralabidochromis xenodon                                 2
Paralabidochromis beadlei                                 1
  GENERIC TOTAL                                           15
"Haplochromis" (Ptyochromis) "rusinga oral sheller"       3
Ptyochromis xenognathus                                   6
  GENERIC TOTAL                                           9
"Haplochromis" (Astatotilapia) "black group"              3
"Haplochromis" (Astatotilapia) "flameback"                5
Putative Astatotilapia flameback                          2
  GENERIC TOTAL                                           10
Pyxichromis orthostoma                                    1
  GENERIC TOTAL                                           1
"Haplochromis" (Harpagochromis) "frogmouth"               3
  GENERIC TOTAL                                           3
"Haplochromis" (Neochromis) "madonna"                     1
Neochromis nigricans                                      5
  GENERIC TOTAL                                           6
Gaurochromis sp.                                          1
  GENERIC TOTAL                                           1
Oreochromis niloticus                                     1
  GENERIC TOTAL                                           1
Psammochromis riponianus                                  1
  GENERIC TOTAL                                           1
Yssichromis laparogramma                                  3
  GENERIC TOTAL                                           3
Xystichromis phytophagous                                 3
  GENERIC TOTAL                                           3
Astatotilapia nubilus                                     3
  GENERIC TOTAL                                           3
Haplochromis lividus                                      1
  GENERIC TOTAL                                           1
Pseudocrenalabis multicolor                               6
  GENERIC TOTAL                                           6
Astatoreochromis alluaudi                                 1
  GENERIC TOTAL                                           1
Not Identified                                            6
  NOT IDENTIFIED TOTAL                                    6
GRAND TOTAL                                               77

cases. In addition, one genus, Paralabidochromis, makes up over 20% of the entire sample, with representatives from seven species.

While the sample structure is not ideal, and not exactly what we had hoped, we did not know until breaking of the code what the sample actually looked like.

Following identification of the individuals, a second representation of the data was produced which identified each individual. Abbreviations for taxa which were already known, e.g., Pseudocrenalabis multicolor, were also changed to follow this scheme. The new identification was followed by a U or K to identify the geographic location of the individual, followed by a sequential number for each member of a species sample. The new abbreviations, which are used in all of the remaining phylogenetic trees, are summarized in Table 13.

Individuals which were not identified at this time by the field collaborators were dropped from the remaining analyses. One specimen appeared to have been clearly misidentified or mislabeled. Oreochromis niloticus, a tilapiine cichlid, should differ substantially in banding patterns from all haplochromines, but its RAPD pattern grouped it with the haplochromines. Another study currently underway in our laboratory, with large samples of Oreochromis niloticus and Oreochromis esculentus and using the same primers utilized in this study, has not produced the banding patterns seen in the putative Oreochromis niloticus used here.

Reexamination of the RAPD gels of this sample confirmed the banding patterns to be very similar to the haplochromines in all cases. An examination of the voucher specimen for this sample is underway by field collaborators. Because of the likely mislabeling of this DNA specimen, it was dropped from further analyses. The remaining data set consisted of 70 individuals.

Table 13. Generic abbreviations used in RAPD phylogenetic trees.

GENUS (j)                                                 Ugandan Abbrev.   Kenyan Abbrev.
Astatoreochromis alluaudi                                 A./allu1          A./allu1
Pseudocrenalabis multicolor                               -                 Pseu/mu#K
Astatotilapia nubilus                                     -                 Asta/nu#K
Gaurochromis sp.                                          Gaur/sp#U         -
Haplochromis lividus                                      -                 Hapl/li#K
"Haplochromis" (Astatotilapia) "black group"              Asta/bl#U         Asta/bl#U
"Haplochromis" (Astatotilapia) "flameback"                Asta/fb#          Asta/fb#
Putative "Haplochromis" (Astatotilapia) "flameback"       Ho/AsFB#          Ho/AsFB#
"Haplochromis" (Harpagochromis) "frogmouth"               Harp/fr#U         Harp/fr#K
"Haplochromis" (Neochromis) "madonna"                     Neoc/ma#U         -
"Haplochromis" (Paralabidochromis) "black group"          Para/bl#U         -
"Haplochromis" (Paralabidochromis) "chestnut"             Para/ch#U         -
"Haplochromis" (Paralabidochromis) "fried egg"            Para/fe#U         -
"Haplochromis" (Paralabidochromis) "rock kribensis"       Para/rk#U         -
"Haplochromis" (Paralabidochromis) "xenodon"              Para/xe#U         -
"Haplochromis" (Prognathochromis) "kachira deep"          Prog/kd#U         -
"Haplochromis" (Ptyochromis) "rusinga oral sheller"       -                 Ptyo/os#K
Neochromis nigricans                                      Neoc/ni#U         Neoc/ni#K
Paralabidochromis beadlei                                 Para/be#U         -
Paralabidochromis plagiodon                               Para/pl#U         -
Prognathochromis venator                                  Prog/ve#U         -
Psammochromis riponianus                                  Psam/ri#U         -
Ptyochromis xenognathus                                   -                 Ptyo/xe#K
Pyxichromis orthostoma                                    Pyxi/or#U         -
Xystichromis phytophagous                                 -                 Xsti/ph#K
Yssichromis laparogramma                                  -                 Yssi/la#K

j. These abbreviations were used following identification of species in RAPD analysis. Abbreviation Key: (Genus or putative genus)/(species or putative species)(# of individual in study)(Geographic location, U=Ugandan waters, K=Kenyan waters). A. alluaudi, "Haplochromis" (Astatotilapia) "flameback", and the putative "Haplochromis" (Astatotilapia) "flameback" are shown in both columns and not identified by location, as location is not known for these individuals.
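The abbreviation scheme of Table 13 is regular enough to be read mechanically, which is convenient when checking whether terminal labels in a tree fall out by geographic origin. The following is a minimal sketch only: the `Label` tuple, the regular expression, and the function names are illustrative assumptions and were not part of the original analysis.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional


class Label(NamedTuple):
    genus: str             # abbreviated genus, e.g. "Para"
    species: str           # abbreviated species or trade name, e.g. "fe"
    individual: int        # sequential number within the species sample
    origin: Optional[str]  # "U", "K", or None when location is unknown


# genus / species letters, individual number, optional U or K suffix
LABEL_RE = re.compile(r"^([A-Za-z.]+)/([A-Za-z]+?)(\d+)([UK]?)$")


def parse_label(label: str) -> Label:
    """Split a Table 13 style label such as 'Para/fe3U' into its parts."""
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"not a recognised label: {label!r}")
    genus, species, number, origin = match.groups()
    return Label(genus, species, int(number), origin or None)


def origin_counts(labels):
    """Count how many labels carry each origin code (None = unknown)."""
    return Counter(parse_label(label).origin for label in labels)


if __name__ == "__main__":
    example = ["Para/fe3U", "Ptyo/xe2K", "Asta/fb4", "A./allu1"]
    for text in example:
        print(parse_label(text))
    print(origin_counts(example))
```

Labels without a trailing U or K (the flameback and outgroup individuals) are returned with an origin of `None`, mirroring the footnote to Table 13.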

The analysis was run again using the new data matrix which contained the abbreviations for each species. This was done with A. alluaudi as the outgroup, and used random addition of taxa in the tree building algorithm so that the order of entry of an individual within the data matrix would not bias the final tree topology. The result of a majority rule consensus of 100 most parsimonious trees is shown in Figure 16. The numbers at each branch point represent the number of times that the branching pattern appeared in the 100 most parsimonious trees. Using the newly identified taxa, we compared the taxa with their groupings in the phylogenetic tree. Some results were immediately apparent.

First, RAPD character analysis did not appear to robustly group multiple members of all species, although some associations were strong, such as Pseudocrenalabis multicolor and Xystichromis phytophagous. In fact, individuals from different species within a genus did not always group together. Second, the most obvious distinction is that there appeared to be a split of the species based on the geographic origin of the sample, either Ugandan or Kenyan waters. With the exception of two individuals, "Haplochromis" (Ptyochromis) "rusinga oral sheller" 2 and Yssichromis laparogramma 3, all species grouped according to geographic origin within the lake. To examine these trees more closely, the total number of differences was tabulated for all pairwise combinations of these 100 most parsimonious trees. In this matrix there are 10,000 pairwise comparisons, 4950 of which are unique ([number of comparisons - comparisons of a tree with itself]/2). An examination of how different the trees are from one another gives an indication of the noise which may be in the data set. The range of tree to tree differences was 1 to 39. Fifty-five percent of these trees differ from each other by 10 changes or fewer, eighty-one percent by 20 changes or fewer.
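The count of unique tree-to-tree comparisons quoted above follows from simple arithmetic, sketched here in Python. The function name is an illustrative assumption; only the figures 100, 10,000 and 4950 come from the text.

```python
from itertools import combinations


def unique_pairs(n_trees: int) -> int:
    """Number of distinct unordered pairs among n_trees trees.

    The full n x n matrix has n*n cells; removing the n self comparisons
    on the diagonal and halving the remainder gives n*(n-1)/2.
    """
    return (n_trees * n_trees - n_trees) // 2


if __name__ == "__main__":
    n = 100
    print(n * n)            # cells in the full comparison matrix
    print(unique_pairs(n))  # unique pairwise comparisons
    # Cross-check by explicit enumeration of unordered pairs.
    print(len(list(combinations(range(n), 2))))
```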

A number of the taxa in this study contained missing data, which can act as noise in these analyses. All taxa which contained missing data were removed, and another analysis was performed using the resulting matrix, which contained 56 individuals. The majority rule consensus tree of the 100 most parsimonious trees produced in this analysis is shown in Figure 17. A similar examination of the tree to tree differences was done. The range of differences was 1 to 31. Fifty-six percent of the trees differed by 10 changes or fewer, and fully ninety-four percent differed by 20 changes or fewer. These results are shown in Table 14. These numbers were similar in repeated analyses for both sets of data, and the set containing the missing data does not appear to result in significantly more tree to tree differences, although there are more trees with greater numbers of differences.

Figure 16. Majority rule consensus tree of 100 most parsimonious trees on the truncated data set. Unidentified individuals and Oreochromis niloticus were removed from this data set. Individual designations are as described in Table 13.
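The majority rule consensus and the node values attached to it can be illustrated with a toy calculation. This is a sketch under stated assumptions, not the procedure of the phylogenetic software used in the study: each tree is represented simply as a set of clades, each clade as a `frozenset` of taxon labels, and the four taxa A to D are hypothetical.

```python
from collections import Counter


def clade_support(trees):
    """Percentage of trees in which each clade appears.

    `trees` is a list of trees; each tree is a set of clades, and each
    clade is a frozenset of taxon labels.
    """
    counts = Counter()
    for clades in trees:
        counts.update(set(clades))
    n_trees = len(trees)
    return {clade: 100.0 * count / n_trees for clade, count in counts.items()}


def majority_rule(trees):
    """Keep only clades found in more than half of the trees."""
    return {clade: pct for clade, pct in clade_support(trees).items() if pct > 50.0}


if __name__ == "__main__":
    # Three hypothetical equally parsimonious trees on taxa A, B, C, D.
    trees = [
        {frozenset("AB"), frozenset("ABC")},
        {frozenset("AB"), frozenset("ABD")},
        {frozenset("AC"), frozenset("ABC")},
    ]
    for clade, pct in sorted(majority_rule(trees).items(), key=lambda kv: sorted(kv[0])):
        print(sorted(clade), round(pct, 1))
```

In this toy case the clades (A,B) and (A,B,C) each occur in two of three trees and are retained; the clades found in only one tree are not.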

Figure 17. Majority rule consensus tree of 100 most parsimonious trees following removal of taxa with missing data. Individual designations are as previously described in Table 13.

Table 14. Distribution of differences between pairwise comparisons of 100 RAPD derived trees. (k)

          0-10    11-20   21-30   31-40
AD        2748    1580    488     134
W/O MD    2789    1877    284     0

k. Key: AD = trees produced which included all individuals (77), some of which contained missing data; W/O MD = pairwise tree comparisons for those trees produced using only individuals with complete data sets (62 individuals). Column headings: 0-10 = 0 to 10 total differences between trees, 11-20 = 11 to 20 total differences between trees, etc.
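The cumulative percentages discussed in the text can be recomputed directly from the counts in Table 14. The sketch below does only that arithmetic; the dictionary and function names are illustrative assumptions.

```python
# Counts of pairwise tree comparisons per difference class, from Table 14.
# Classes, in order: 0-10, 11-20, 21-30, 31-40 differences.
TABLE_14 = {
    "AD": [2748, 1580, 488, 134],
    "W/O MD": [2789, 1877, 284, 0],
}


def cumulative_percent(counts):
    """Cumulative percentage of comparisons at or below each class."""
    total = sum(counts)
    running = 0
    result = []
    for count in counts:
        running += count
        result.append(100.0 * running / total)
    return result


if __name__ == "__main__":
    for name, counts in TABLE_14.items():
        percents = [round(p, 1) for p in cumulative_percent(counts)]
        print(name, sum(counts), percents)
```

Each row sums to 4950, the number of unique comparisons among 100 trees.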

The next analysis split the data set which contained no missing data into subsets containing specimens from the two different geographic locations of the lake. Also included in both of these data sets was the outgroup A. alluaudi. The "Haplochromis" (Astatotilapia) "flameback", the captive propagation program taxon, was left in both analyses. For the Kenyan group, this resulted in a set of 34 individuals. Of the 65 characters, 59 were variable, and 50 were phylogenetically informative in this analysis. A majority rule consensus of the 100 most parsimonious trees is shown in Figure 18. This tree was 146 steps long. Except for Pseudocrenalabis and Xystichromis, grouping of all individuals of a species into a clade does not occur in this tree. There is some grouping of the Ptyochromis taxa ("rusinga oral sheller" and xenognathus), but in two different locations of the tree. Repeated analyses and retention of a larger number of most parsimonious trees did not result in significantly different consensus trees. The data set with Ugandan specimens contained 29 individuals. For the Ugandan analysis, 47 of the 65 characters were variable, and 37 were phylogenetically informative. A maximum parsimony tree was 115 steps, and again little resolution between species was observed. A majority rule consensus tree appears in Figure 19.
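The distinction between variable and phylogenetically (parsimony) informative characters reported above can be made concrete for two-state data such as band presence or absence. The sketch below assumes a matrix coded as strings of '1' (band present), '0' (band absent) and '?' (missing); the coding, the toy matrix and the function name are illustrative assumptions. A two-state character is counted as variable when both states occur, and as parsimony informative when each state occurs in at least two individuals.

```python
def classify_characters(matrix):
    """Return (n_variable, n_informative) for a presence/absence matrix.

    `matrix` is a list of equal-length strings, one per individual,
    using '1', '0', and '?' for missing scores.
    """
    n_chars = len(matrix[0])
    variable = 0
    informative = 0
    for j in range(n_chars):
        # Ignore missing scores when tallying the two states.
        column = [row[j] for row in matrix if row[j] in ("0", "1")]
        ones = column.count("1")
        zeros = column.count("0")
        if ones >= 1 and zeros >= 1:
            variable += 1
        if ones >= 2 and zeros >= 2:
            informative += 1
    return variable, informative


if __name__ == "__main__":
    # Five hypothetical individuals scored for four hypothetical bands.
    toy = ["1100", "1101", "1010", "1010", "1000"]
    print(classify_characters(toy))
```

In the toy matrix the first band is constant, the second and third are informative, and the fourth is variable but present in a single individual only, so it carries no grouping information under parsimony.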

Figure 18. Kenyan only RAPD tree. Majority consensus tree of 100 most parsimonious trees of 115 steps, containing no missing data.

Figure 19. Ugandan only RAPD tree. Majority consensus tree of 100 most parsimonious trees of 146 steps, containing no missing data.

In summary, division of the data set into subsets by geographic origin did not result in greater resolution of the species, although some association of species within a genus did appear in the trees. In the larger data set, which contained individuals from both areas of the lake, a nearly complete geographic split was the most striking feature of the tree.

2. Comparison of RAPD phylogenetic trees with one produced by morphological characteristics.

Lippitsch (1993) has produced a phylogeny of many of the haplochromine taxa of Lake Victoria using various scale and squamation characteristics. This included the examination of nearly 190 species from 85 genera. The total number of characters established was 96, with a total of 300 discrete character states (Lippitsch, 1993). This robust analysis of morphological characters resulted in the proposed phylogeny redrawn in Figure 20. Of note in this phylogeny are the two "Astatotilapia" genera shown. Riverine Astatotilapia which were examined in the Lippitsch study fall basal to the remainder of the group, while the species of lacustrine Astatotilapia (designated 'Astatotilapia' in Figure 20) are within the Lake Victoria flock.

Not all of the genera examined in the morphological study were analyzed in the RAPD study. A modified Lippitsch phylogeny including only those taxa which were examined in the RAPD analysis is shown in Figure 21. This is the phylogeny which will be compared to the RAPD trees.

Terminal taxa (top to bottom): Astatoreochromis, Astatotilapia, Haplochromis, Neochromis, Xystichromis, Hoplotilapia, Ptyochromis, Platytaeniodus, Macropleurodus, 'Astatotilapia', Prognathochromis, Psammochromis, Gaurochromis, Enterochromis, Yssichromis, Lipochromis, Schubotzia, Harpagochromis.

Figure 20. Proposed phylogeny of haplochromine taxa based on scale and squamation characters. Redrawn from Lippitsch (1993). Abbreviations on each lineage refer to trophic type, see Table 7 or Table 11.

Figure 21. Truncated Morphological Phylogeny. Lippitsch phylogeny produced by maintaining only those genera which were included in RAPD analysis. Note that the Astatotilapia which remains in this tree is the lacustrine Astatotilapia, designated 'Astatotilapia'. Letter designations on branches leading to terminal taxa refer to trophic type, see Table 7 or Table 11.

The comparison of Lippitsch's morphological study with the RAPD study was done in a succession of steps to determine if there was sufficient signal in the RAPD data for comparison. The RAPD data were broken down into subsets which were then analyzed. The first data set analyzed was one that included all the taxa from the full RAPD study that were contained within genera examined in the Lippitsch study. This reduced the total number of individuals to 53. Individuals from both geographic locations, Ugandan and Kenyan waters, were maintained in this analysis. The outgroup taxa Astatoreochromis alluaudi and Pseudocrenalabis multicolor were also maintained. The 100 most parsimonious trees were retained, and a majority rule tree was produced (Figure 22). A similar analysis was carried out on these data following the removal of any individuals which contained missing data. This tree contained 46 individuals. Again, a majority rule tree of the 100 most parsimonious trees was calculated and is shown in Figure 23. In both cases the primary split in the tree was again a split along geographic lines, and it was difficult to observe any similarity to the Lippitsch tree. Repeated runs produced similar results with the maintenance of the geographic split. Therefore, Ugandan and Kenyan localities of the lake were analyzed separately, using those genera Lippitsch had studied. Both of these groups also contained the outgroup taxa A. alluaudi and Pseudocrenalabis multicolor.

Figure 22. First majority rule consensus tree for RAPD data using taxa previously studied by Lippitsch. This is produced by majority rule consensus of 100 most parsimonious trees. Numbers at nodes represent the percentage of trees in which the node was found.

Figure 23. Second majority rule consensus tree for RAPD data using taxa previously studied by Lippitsch. The data used to produce this tree contained no missing data. This is a majority rule consensus of 100 most parsimonious trees. Numbers at nodes represent the percentage of trees in which the node was found.

The two data sets containing individuals from either Kenyan or Ugandan waters were prepared from the data set containing no missing data. Consequently the Ugandan and Kenyan data contain no missing data.

The Kenyan tree contained 28 individuals. Of the 65 characters analyzed, 57 were variable, and 45 were phylogenetically informative. A majority rule consensus tree is shown in Figure 24. The Ugandan tree contained 13 individuals. In this case, 37 of the 65 characters were variable, and 18 were phylogenetically informative. The result of the majority rule consensus is shown in Figure 25. In both cases, strong association within species was not observed. In the Kenyan group the strong association of Pseudocrenalabis, observed previously, was maintained, as well as grouping of Xystichromis phytophagous. Beyond this, individuals of like species, and even genera, did not show strong association. The Ugandan tree had fewer individuals, as well as fewer informative characters, but still did not show strong association within genera.

3. Correlation of trophic type and derived RAPD phylogenies.

Trophic type has been used in the past in the classification schemes of the Lake Victoria cichlids (Goldschmidt and Witte, 1992). To investigate phylogenetic associations of various trophic types we compared the phylogenies of Lippitsch based on morphological characters with the ones derived from RAPD analysis. Examination of the full Lippitsch tree reveals little correlation between derived clades and trophic type, although some apparent associations appear (Figure 20). All of the oral shellers which were studied genetically are found in one clade. However, this clade also contains a detritivore. Fluviatile Astatotilapia species, which are insectivores, are basal to this group and to another sister group which contains algal scrapers and one plant eating taxon. The lacustrine Astatotilapia is also an insectivore, as are others in the lower portion of the tree. Also found in the lower portion of the Lippitsch tree are piscivores, which are not found anywhere else in the tree. Finally, although associations of trophic type are not complete, there does appear to be some consistency of associations, with a basal insectivore giving rise to two groups. One of these groups exploited algal and snail foods, while the second retained feeding on insects, and eventually diverged to eat other fish and fish eggs, e.g., the paedophage Lipochromis. The truncated Lippitsch tree does not contain as many genera as the full tree, but does maintain these basic patterns within the topology. The algal scrapers and oral shellers appear as a distinct group from the insectivores and piscivores, with the lacustrine Astatotilapia between them.

Examination of the results of RAPD analysis, using only those taxa examined by Lippitsch which are from either Kenyan or Ugandan waters, does not confirm or refute the trophic relationships, owing to the lack of strong associations between species discussed previously.

Figure 24. RAPD phylogeny using genera tested by Lippitsch using Kenyan derived specimens.

Figure 25. RAPD phylogeny using genera tested by Lippitsch using Ugandan derived specimens.

D. Discussion.

The RAPD trees produced in this study do not appear to separate genera in a manner consistent with the analysis of the morphological data. However, they do indicate that there may be a significant geographic factor involved. Previous studies on the widely distributed Tropheus taxa in Lake Tanganyika have revealed the importance of geographic location on species relatedness (Sturmbauer and Meyer, 1992). Regardless of the number of taxa maintained or the methodology of the search algorithm applied, the taxa divide almost entirely according to Kenyan or Ugandan geographic origin. If real, this could be due to convergent evolution, with like morphological types evolving independently in different areas of the lake. A recent study of matched morphologically similar species from Lake Victoria and Lake Tanganyika has been reported (Kocher, et al., 1994). Using mitochondrial DNA analysis, they were able to show that any taxon chosen from one of the lakes was more similar to all of the morphologically dissimilar species within that lake than it was to its morphologically similar pair from the other lake. We may be observing a similar phenomenon within Lake Victoria.

Alternatively, geographic isolation of previously panmictic populations may have led to apparent divergence between these taxa. The inability of the RAPD data to distinguish genera within these geographic groups may be due to a number of factors. First, the number of characters which have been scored may be insufficient to provide a signal to noise ratio high enough to support a conclusion. It has been suggested that 2n (n = number of taxa) characters should be studied to provide robust information in character analyses (get ref). The number of phylogenetically informative characters studied for the whole taxa set of 70 is less than half of that (56). Removal of taxa to a minimum of 17 in the Ugandan only tree, while retaining all characters, however, did not yield a tree which was in better agreement with the morphological study. In this case the number of characters is approximately n (18 phylogenetically informative characters). The Kenyan tree, which contains 28 individuals and 45 phylogenetically informative characters, approaches 2n, but still does not appear to show improved resolution. That is, individuals of the same species do not robustly group together. A second possibility is that retained polymorphisms which predate speciation events are being observed and are obscuring taxonomic relationships. If this is true, RAPD analysis may simply be unable to distinguish between these closely related taxa.
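The comparison of character counts against the 2n guideline can be laid out as a short calculation. The sketch below uses only the taxon and informative character counts quoted in the discussion above; the dictionary and function names are illustrative assumptions, and the 2n threshold is taken as stated in the text.

```python
# (number of taxa n, phylogenetically informative characters), as quoted
# in the discussion for each data set.
DATA_SETS = {
    "all taxa": (70, 56),
    "Ugandan only": (17, 18),
    "Kenyan only": (28, 45),
}


def characters_per_taxon(n_taxa: int, n_informative: int) -> float:
    """Informative characters available per taxon (the guideline asks for 2)."""
    return n_informative / n_taxa


if __name__ == "__main__":
    for name, (n_taxa, n_informative) in DATA_SETS.items():
        ratio = characters_per_taxon(n_taxa, n_informative)
        target = 2 * n_taxa
        print(f"{name}: {n_informative} informative, 2n = {target}, ratio = {ratio:.2f}")
```

None of the three data sets reaches a ratio of 2; the Kenyan set comes closest.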

Further work which expands the number of taxa studied to include key genera not studied here, such as the riverine Astatotilapia, may help determine the utility of this method. Analysis of known samples from matched genera, and of larger samples of each species taken from different geographic localities of the lake, may provide insight into the apparent geographic divergence of taxa. While the phylogenetic picture provided by RAPD analysis is not clear, it does raise some interesting questions which await more analysis.

SUMMARY

The determination of the phylogenetic relationships of the Lake Victoria cichlid taxa is certainly daunting; however, results of the analyses presented in this thesis indicate that it may be possible. The study of the phylogenetically informative and widely studied 18S rRNA gene reveals the gross phyletic relationships of the Cichlidae, including the Lake Victoria taxa. Further analysis of the more recently investigated, and theoretically more neutrally evolving, ITS1 region expanded the resolution to provide more insight into the relationships between these species. Application of the emergent technology of RAPD analysis raised some interesting questions regarding the importance of geographic location as a factor in phylogenetic reconstruction within Lake Victoria cichlids. RAPDs were not able to confirm or refute a morphologically produced tree. More informative methods yet to be studied will be needed to resolve these close relationships fully. These include methods such as microsatellites, which have previously been used on populations.

The extinction of a large percentage of this large assemblage coincides with an explosion of molecular technology to study species and population relationships in a way not possible in the past. The determination of the correct methodology to apply to a particular problem is one of the biggest challenges of these types of studies (Avise, 1994). It remains to be seen whether or not a subset of the original Lake Victoria taxa can be saved in captivity and subsequently reintroduced to the lake. However, determinations of the appropriate methodologies to study a particular problem, like those presented in this thesis, are very useful.

In the future, when faced with a crisis like that in Lake Victoria, where a large number of closely related species are threatened with extinction, decisions on which taxa to try to save will inevitably need to be made. In situations such as these, where imminent extinction pressures exist and rapid decisions must be made for a conservation program, the need to apply the correct methodology rapidly in an attempt to determine species relatedness is apparent. The ability and knowledge to use the method which most closely matches the level of relationships under study is of primary concern in terms of species survival, success of the conservation program, time, and resources. While it appears that more fine grained analyses, such as those previously used in population studies, will be needed to differentiate fully between the Lake Victoria cichlids, the results presented here provide the initial steps towards the resolution of these questions.

APPENDICES

Appendix A: Data relevant to Chapter III, 18S rRNA Analysis

Figure 26. Primary sequence alignment of 18S rDNA sequences for taxa analyzed in this study. The primary sequence alignment of the 16 taxa analyzed for this gene is presented on the following seven pages.

Figure 26. 18S Primary Sequence Alignment. Rows, top to bottom: a reference sequence (label illegible), Cichlasoma, Paretroplus, Pomacentrus, Etheostoma, Fundulus, Sebastolobus, Acipenser, Latimeria, Squalus, Notorynchus, Rhinobatos, Echinorhinus, Lampetra, Petromyzon, Styela, followed by a match row. Cumulative alignment positions at the ends of successive blocks: 128, 247, 376, 506, 636.

Gauot ochro CGCIAGAGGIGAAATTCTTCXACCGGCGCAAGACGGACGAAAGCGAAAGCAII TGCCAAGAAf GTIITCA1TAATCAAGAACGAAAG1CGGAGGT TCGAAGACGA1CAGATACCGICGIAGTTCCGACCA 1017 C i t h I as o m n ...... Par rti npltj ...... Pomacentru...... G...... f 1 tiros torwi ...... 1 IRRltlltlS ...... G...... Srbastotob...... Ar ipenser ...... GC__ CG...... I iit jHnr i in ...... A ...... S<|uat

Cauorochro TAAACGATGCCAACTAGCGATCCGGCGGCGTTAf TCCCAfGACCCGCCGGGCAGCGTCCGGGAAACCAAAGICTTTGGGnCCGGGGGGAGTAIGGTTGCAAAGCTGAAAC?TAAAGGAA11GACGGAAC 1 U 7 Cirhljisomn...... _...... Pnretfoplu...... Pomacentru...... G... Etheostoma ...... „.... f isidul US ...... Srtvastoloh A ...... Acipenser ...... G....,...... A .... C ...... C...... ?...... L at amor ia ...... C ...... I...... Gt...... Squalu*...... A .... 1......

Not 01 ync hu ...... A .... I...... Phi nobat os ...... G ...... G...FC...... -...... Echtnorhin...... A.... r ...... I ...... Ldmpetra ...... G A ...... 1. ..T..... 1.0...... G ...... A...... Petfomyton ...... G A...... T.G...... G ...... A ...... Styeln ...... G..A...... CCAIG.C....ITT.C T ...... A ...... match **•*» **••* •* ••••••* * *•** • * •• •»*•**«•*••••• ***** *•*••••••*•*•»• • •• *•»••••••*•*«.*

Gatiof orhro GGCACCACrAGGAGIGGAGCClGCGGCT lAAIITGACrCAACACGGGAAACCfCACCCGGCCCGGACACGGAAAGGAIIGACAGAT IGATAGCTCITICICGAI TCfGTGGGIGGTGG7GCATGGCCr.l I U 7 7 Cichlas.omn...... NNN...... Paretroplu ...... Pomacentru ...... ,...... f theostomn ...... fundulus ...... G ...... Sebastolob...... Ac i prnsri ...... T...... I a t nmoi i a ...... Squ.il ir; ...... Notorynchu...... Rhinobntos ...... Fchinoi hi n ...... lampetta ...... A ...... ,..G...... G ...... C Petiomymi ...... A ...... G ...... G...... C Styela ...... G..A...... A.G1...... G...... I____ ... rn.it rh •* ••«•••••*«*•**••«* * •«»*•*•••*•*«*•• «•••««•«■ t 4 5 1 Figure 26 (continued).

Gauorochro CITAGIIGGIGGAGCGAtfIGICIGGI lAAnCCGAIAACGAACGAGACICCGACAIGCIAACIAGfIACICGACCCC GTGCGGTCCGAGIC -CAACI ICI IAGAGGGACAAGIGGCCiMCAGL'CACA K(J4

Paretroplu Pomacentru Etheostoeia fundulus ...... G .... Srbastolnt) ...... G ...... IG. n __ . I ...... IG. ... II

Squat us ...... tc...... A,, ...... It...... A...... G. . .c * ...... f ...... A!...... c...... A,. ___G.C......

Loapctro • G ...... Petromyion .G , .C...... G...... C...G..G___ I...... G.C. .T- G.C...... G. Styelo ...... f.fi...... A ...... G .... TIC. .... G.C... I...... T. match * ••«*•«*»**••• ••*•«••••*•«•••«**•••• « •••»** «** **«*•* • • • * « *

Gauot ochro CGAGAIIGAGCAAlAACAGfilCfGIGAIGCCCr lAGAIGICCGGGGCIGCACGCGCGCCACACtGAGIGGATCAr.CGlGlGICIACCCiJ ICGCCGAGAGGCGIGGGIAACCCGI IGAACM r.AC M.GIGA V»W Cichlasoma ...... MM...... Paretroplu...... ,C...... Pomacentru...... N ...... Etheostoma ...... C ...... fuidul us ...... I...... C...... Si't.iMoiuii...... i...... * ...... a .... rr... i...... i.....

Aciprtiscr ...... C ...... CG...... I CA. . .C...... C ...... C ...... IG___ C latmcria ...... I A...... A.A C...I.C...... I..... Squalus ...... I...... A ...... A .... CC...T...... I..... Nrilotyiw hu ...... I...... A ...... A .... CC...I...... I ..... RhimAuilus ...... C ...... I...... A ...... A CC...I.C...... I..... Echinoihin...... 1...... A ...... A .... CC...I...... I..... lampetra ...... C ...... 1...... A ...... CG...... GC______A___ I..... Petraayion...... C ...... I...... A ...... CG...... GC______A___ I..... Styrln ...... C ...... I...... A.G A ...... A.. .AG A. ..ICC...... I ..I..... mitch tttttltll* *••••••••**••«••« •*••**•«********• «•••»• * •«(*• ttt | **• ill lliiliilll «*«* * li *•«« 155 Figure 26 (continued).

Gauorochro FAGGGA1FGGGGATTGCAAFF Af f TCCCAFGAACGAGGAAf FCCCAGIAAGCGCGGGFCAFAAGCFCGCGTFGAIfAAGFCCClGCCCTf FGTACACACCGCCCGTCGCTACFACCGAFFGGAFGGFFFA 1664 Cichlasoma...... Paretroplu...... ,...... „......

Pomacentru ...... n ...... Etheostoma...... fundulus C ...... G ...... *. ,C C ...... G Sebastolob...... A ...... G..I...... G ...... Acipenser .C...... A ...... C ...... T...... C ...... Lntamorio C ...... G ...... T...... Squnlus .G...... A ...... G..I..A...... Notorynchu...... A ...... 1.,..... C...

Rhinobntos . 1 .... C...A...... G...... 1.1.A...... fcltinoihtn ...... A ...... G..I...... G...... Lonpctia ...... G. -C.G. .C...... A ...... 1...... Petroayten ...... G. .C.G..C...... A ...... I...... Styela .1.... A...AC...... G ..... T...... AA.... C.A..T...... C ...... A ...... match ***** *** *** ** • •• ••• •**•••• *•***•*••• ** * ***** * ** ********** ***•****•***•*•••***•*• ******** *•*•*•**•• »•*# **

Gauorochro GTGAGGICCICGGAICGGCCCCGCCGGGGTCGGICACGGCCCTGGC-GGAGCGCCGAGAAGACCAtCAAAClfGACIAlClAGAGGAAGIAAAAGtCGI 1762 Cirhlnsowi ...... NN...... Pin rli tkf»l ti ...... C ...... l'

Appendix B: Data relevant to Chapter V, RAPD Analysis

Prog/kdLU 11111110000X01100110001000011000010000100011X00110010001100100110
Para/rU tJ 11111111000101100110001100010100010001100011100-00-000------10100
Para/velU 1111111100011010011000100001101001000—00011100111010001100100110
prog/ve2u 1111111100310110011000100001100001000—00011110110010001100100110
Para/xtlU 10111111000101100110001000010100010001110011010110010001100100100
Para/bllU 1011111100010110011000110001100001000—OOOllllOllOOlOOOllOOlOOlOO
Harp/frlU 11111111000101100110001000011000010001100011110111010001100100110
Gaur/splU 11111111000101100110001010010100010001110011110111010001100100 -----
Para/t>12U 111111110001011001100011000101000100011000U110110110001100110100
Psam/rllU 1011111100010110011000110001100001000—00011010111010001100100100
Prog/M2U lOllllllOOOlOllOOllOOlOOOOOllOOOOlQOOlllOOllllOlllOlOOOUOOlOOllO
Para/balU 11111111000110100110001000010100010001100011110111010001100100100
Neoc/nilU 1111111100010110011000100001010001000—00011110111010001100100100
Neoc/aalU 1010111100010110011000100001010001000—00011110 ---- 0-000------110100
Para/felU 1110111100013110011000100001010001000—00011010110010001100100100
Para/rk2U lllOllllOOOlOllOOllOOOllOOOlOlOOOlOOOlOOOOllOlOlllOlOOOHOOlOOlOO
Para/bl3U 1011111000010010011000111001010001000-100011010110010001100110100
Para/2e2U 10101111000101100110001000010100010001010011110110010001100100100
Prog/velU 111011110001111001100011000101000100Q1100011110111010001100100100
ASta/bllU 11101110000101100110001100011000010001100011110111010001100100100
para/xe2u loioiiiiooaiioiioiioooiioooioioooiooo-ooooiioiouioioooiiooiooioo
Prog/Xd3U 1010111100C101100110001000010100010000000011110111010001100100110
ASta/bl2U 10101111000101100110001100010100010000000011110110010001100100110
Neoc/ni2U loioiiiiooaioiiooiicooiioooioioooiooooooooiiiioiiioioooiiooiooioo
Neoc/ni3U 10101111000001100110001000010100010000000011110111010001100100100
Para/rk3U 11101101000101100110001100010100010001000011110111010001100100100
Pyxi/orlU 11011111100000100110001010000000-000-000011—0 -----0 -000 ------100
Harp/2r2U 11101111103101000113001000010100010000000011110111010001100101100
Prog/fcd4U 11111110100101000110001000011000010000000011110111010001100100000
Para/ehlU lOlllllOlOOlOllOOllOOOlOOOOlOlOOOlOOOOOOOOllllOlllOlOOOHOOlOOlOO
Para/2e3U 11111110103101100110001100011000010000010011110111010001100100100
Asta/bi3U 10000000003101100110001100010100010000000011110110010001100000100
Para/pllU lOlllOlOlOOlOllOOllSOOllOOOlOlOOOlOOOlOOOOllllOlllOlOOOHOOllOlOO
Para/pl2U 11001110000101100110001100010100010010000011110111010001100100100
Pseu/BUlK 01000000111311100000001001100101000100001011100101100010011001001
Xstl/PhlK 01001111100101100010001000010100010011110011110111010000000110100
Paeu/au2K 01000000111311100010001001100101000100001011100101100010011001001
Pseu/=u3K 01000000111311100010001001100100000100000011100111100010011100001
pseu/mu4K 01000000111011100010001001100101000100001011100111100010011001001
XsCl/ph2K 11001111100100100100001000010100010011000011110111010000000110000
Pseu/su5K 01000000111011000000001Q011001000QQ1Q100101110Q110100010011001001
Xsti/bh3K 11001111103100100113001000010100010011100011110110010000000011000
PSAU/BU6K 01000000111011100000001C01101011000101001011100110100010011001001
Ptyo/OSlK 110011111C1101100103001100010100010001000011110110010001100101000
ASta/nulK 10001111103101100103001000010100010011000011110110010001100100000
ASta/rtu2K 11001111100101100103001100010100010001000011010110010001100101000
ASta/r.u3K 10001111133100100103001000010100010011000011110110010001100101010
Harp/fr3K 11001111100101100103001000010100010000100011110110010001100101010
Ptyo/xelK 110011111331311001C3001000011000000000100011110111010001100100000
Ptyo/xe2K 11001101133131100103331100011000010011000011110111010001100101000
pt.yo/xe3K 110011111CC100100103001100010100000001100011110111010001100111000
Ptyo/xe4K 11001111133100100100001100010100010001100011110111010001100111000
Ptyo/xe5K 11001101133131100103001000010100010000100011100111010001100100000
Ptvo/xe6K 11001101133331100133001100011000010011000011111111010001100100000
V asl/lalK 11001111100101100100001000010100010000100011110111010001100100000
U apl/lllK 11001001103101100103001000010100010000100011110111010001100101000
¥ssi/la2K 10001100100101100100001100010100010000100011110111010001100100000
Ptyo*os2K 10111011000101100130001000011000 ------0100011110111010001100100000
Neoc/nl4K 11111001000101100100001000010100010000100011110111010001100101000
Yssi/la3K 11111101003101100100001000011000010000100011110111010001100100000
Ptyo/os3K 11001001000101100100001100011000010001000011110111010001100100000
Haoc/nlSK 10001001003101100100001000010100010000100011110110010001100100000
M ta/2bl 11000011000000000103001000010000------0100001100010000000000001010
Asta/2b2 01001001000000000100001100010100010000100001110011010001100001000
Asta/fb3 01000001000000000100001100010100010010000001110011010000000001000
Aaca/£b4 1100100000C100101100001000000100010010000001010011010001100001000
Asta/2b5 11000001000100000100001000010100010000100001110011010000000001000
Ho/AaFBl 00111000003101001100001100110100010001000011111111000001100100000
Ho/AaTB2 11111001000101000330001100110100010001000011111111000001100100000
A ./allul 01110001000100100031100000010100110001000101010110001000100000000
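The rows above pair a sample label with a string of character states, coded 1 for presence, 0 for absence, and - for missing data. As a reading aid only, the following Python sketch shows one way such rows could be parsed and compared. It is not the PAUP analysis used in this study: the function names, the toy labels, and the choice of a simple mismatch proportion are all assumptions made for illustration, and the toy rows are invented rather than taken from the matrix.

```python
from itertools import combinations

MISSING = "-"  # symbol for a missing character, as in the Figure 27 key


def parse_rows(text):
    """Parse 'label states' lines into a dict mapping label -> state string."""
    taxa = {}
    for line in text.strip().splitlines():
        label, states = line.split()
        if set(states) - {"0", "1", MISSING}:
            raise ValueError(f"unexpected symbol in row {label!r}")
        taxa[label] = states
    return taxa


def mismatch_distance(a, b):
    """Proportion of differing characters among those scored in both rows.

    Characters missing in either row are skipped. Returns None when the two
    rows share no scored characters.
    """
    if len(a) != len(b):
        raise ValueError("rows must have the same number of characters")
    scored = [(x, y) for x, y in zip(a, b) if MISSING not in (x, y)]
    if not scored:
        return None
    return sum(x != y for x, y in scored) / len(scored)


# Hypothetical toy rows (not data from this study).
TOY = """
taxonA 110100-1
taxonB 110110-1
taxonC 0101--01
"""

taxa = parse_rows(TOY)
distances = {
    (p, q): mismatch_distance(taxa[p], taxa[q])
    for p, q in combinations(taxa, 2)
}
```

Skipping characters that are missing in either row keeps unscored bands from counting as either agreement or disagreement. Rows from the scanned matrix would need their misread symbols corrected before they pass the symbol check in `parse_rows`.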

Figure 27. RAPD data matrix. This is the full data matrix used in the RAPD study. It was a NEXUS file used in PAUP. All other data sets were derived from this data matrix. Key: 1 = character presence, 0 = character absence, - = missing character. For more details, see Chapter V.

Appendix C: Animal Use Protocol

ANIMAL USE PROTOCOL

Protocol No. _____ Date Received _____

THE OHIO STATE UNIVERSITY
Institutional Laboratory Animal Care and Use Committee

PROTOCOL TITLE (limit to 100 spaces): Isolation and Genetic Analysis of Endangered Cichlid Fishes

PRINCIPAL INVESTIGATOR: Paul A. Fuerst (must be a member of OSU faculty; typed name and signature)

NOTE: ALL OSU personnel involved in this protocol must complete the Animal Care and Use Training Program BEFORE animals can be procured. In submitting this protocol, I, as principal investigator, accept the responsibility for compliance with this requirement.

Academic Title: Associate Professor   Work No. 293-6403   Emergency Phone No. [first digit illegible]61-6629

Department: Molecular Genetics   College: Biological Sciences

Address: [illegible]21 Biological Sciences Bldg.

Co-investigator(s): (typed name and signature)

REVIEW STATUS: Initial Review X   Continuation: No Change _____ With Change _____

PERIOD OF PROTOCOL: Dates protocol will be in effect: from [illegible] to [illegible]/93 (not to exceed [illegible] years).

DEPARTMENT CHAIRPERSON'S ENDORSEMENT: I have reviewed this Animal Use Protocol and endorse its submission. Comments:

Lee Johnson (typed name)   [signature and date illegible]

ATTENDING VETERINARIAN'S REVIEW: I have reviewed this protocol with regard to proposed care and use of animals and utilization of appropriate facilities. The Principal Investigator (or designee) has been informed about my concerns/comments that are summarized below. Comments: [signature and date illegible]

SOURCE(S) OF FUNDING FOR PROPOSED PROJECT (check known): [entries illegible; the form provides fields for OSURF sponsor and proposal/project number, Development Fund donor and account number, College/Department, and Pending]

Form ILACUC: A-01 (02/90)

LIST OF REFERENCES

Achieng, A. P., 1990. The impact of the introduction of Nile perch, Lates niloticus (L.) on the fisheries of Lake Victoria. Journal of Fish Biology, 37 (Supplement A): 17-23.

Arnheim, N., T. White, and W. E. Rainey 1990. Application of PCR: Organismal and Population Biology. BioScience, 40(3): 174-182.

Avise, J., 1990. Flocks of African Fishes. Nature 347: 512-513.

Avise, J., 1991. Ten Unorthodox Perspectives on Evolution Prompted by Comparative Population Genetic Findings on Mitochondrial DNA. Ann. Rev. Genet. 25: 45-69.

Avise, J. 1994. Molecular Markers, Natural History, and Evolution. Chapman and Hall, N.Y.

Avise, J. C. and N. C. Saunders 1984. Hybridization and Introgression Among Species of Sunfish (Lepomis): Analysis by Mitochondrial DNA and Allozyme Markers. Genetics, 108: 237-255.

Avise, J. C., J. Arnold, R. M. Ball, E. Bermingham, T. Lamb, J.E. Neigel, C.A. Reeb, and N.C. Saunders, 1987. Intraspecific Phylogeography: The Mitochondrial DNA Bridge Between Population Genetics and Systematics. Ann. Rev. Ecol. Syst., 18: 489-522.

Baerends, G. P. and J. M. Baerends-van Roon, 1950. An introduction to the study of the ethology of cichlid fishes. Behaviour (Supp.) 1, 1-243.

Baker, C. S., M. MacCarthy, P.J. Smith, A.P. Perry, and G.K. Chambers, 1992. DNA fingerprints of orange roughy, Hoplostethus atlanticus: a population comparison. Marine Biology, 113: 561-567.

Baldwin, B. G., 1992. Phylogenetic Utility of the Internal Transcribed Spacers of Nuclear Ribosomal DNA in Plants: An Example from the Compositae. Molecular Phylogenetics and Evolution 1(1): 3-16.

Barel, C. D. N., G. C. Ankel, F. Witte, R. J. C. Hoogerhoud, and T. Goldschmidt 1989. Constructional Constraint and its Ecomorphological Implications. Acta Morphol. Neerl-Scand. 27: 83-109.

Barel, C. D. N., W. Ligtvoet, T. Goldschmidt, F. Witte, and P. C. Goudswaard 1991. The haplochromine cichlids in Lake Victoria: an assessment of biological and fisheries interests. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London. 258-279 Fish and Fisheries Series 2.

Barlow, G. W. 1991. Mating systems among cichlid fishes. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London. 173-190 Fish and Fisheries Series 2.

Baskin, Y., 1992. Africa's troubled waters. BioSci. 42: (7) 476-481.

Baskin, Y., 1994. Losing a Lake. Discover. March, 1994: 73-81.

Beckwitt, R. 1987. Mitochondrial DNA Sequence Variation in Domesticated Goldfish, Carassius auratus. Copeia, 1987(1): 219-222.

Bernardi, G., and D. A. Powers, 1992. Molecular phylogeny of the prickly shark, Echinorhinus cookei, based on a nuclear (18S rRNA) and a mitochondrial (cytochrome b) gene. Mol. Phylogenet. Evol. 1: 161-167.

Bernardi, G., P. Sordino, and D. A. Powers, 1992. Nucleotide Sequence of the 18S rRNA gene from two sharks and their molecular phylogeny. Mol. Marine Biol. Biotechnol.

Berra, T. M. 1981. An Atlas of Distribution of the Freshwater Fish Families of the World. University of Nebraska Press, Lincoln and London.

Birky, C.W., and R. Skavaril, 1976. Maintenance of genetic homogeneity in systems with multiple genomes. Genet. Res. Camb. 27: 249-265.

Birky, C. W., T. Maruyama, and P. A. Fuerst, 1983. An approach to population and evolutionary genetic theory for genes in mitochondria and chloroplasts, and some results. Genetics, 103: 513-527.

Bootsma, H. A., and R. E. Hecky 1993. Conservation of the African Great Lakes: A Limnological Perspective. Cons. Biol. 7(3): 644-656.

Brown, W. M., 1985. The mitochondrial genome of animals, pp. 95- 130. In R. MacIntyre (ed.), Molecular Evolutionary Genetics. Plenum, New York.

Cabot, E. L. and A. T. Beckenbach. 1989. Simultaneous editing of multiple nucleic acid and protein sequences with ESEE. Computer Applications in the Biosciences, 5: 233-234.

Capron de Caprona, M. D., and B. Fritzsch, 1984. Interspecific fertile hybrids of haplochromine cichlidae teleostei and their possible importance for speciation. Neth. J. Zool. 34(4): 503-538.

Cedergren, R., M. W. Gray, Y. Abel, and D. Sankoff 1988. The Evolutionary Relationships among Known Life Forms. J. Mol. Evol., 28: 98-112.

Chalmers, K. J., R. Waugh, J. I. Sprent, and W. Powell, 1992. Detection of genetic variation between and within populations of Gliricidia sepium and G. maculata using RAPD markers. Heredity, 69: 465-472.

Chapco, W., N. Ashton, R. Martel, and N. Antonishyn, 1992. A feasibility study of the use of random amplified polymorphic DNA in the population genetics and systematics of grasshoppers. Genome 35: 569-574.

Clark, A., and C. Lanigan, 1993. Prospects for Estimating Nucleotide Divergence with RAPD's. Mol. Biol. Evol. 10(5): 1096-1111.

Cohen, A. S., R. Bills, C. Z. Cocquyt, and A. G. Caljon 1993. The Impact of Sediment Pollution on Biodiversity in Lake Tanganyika. Cons. Biol. 7(3): 667-677.

Cordesse, F., R. Cooke, D. Tremousaygue, F. Grellet, and M. Delseny 1993. Fine Structure and Evolution of the rDNA Intergenic Spacer in Rice and Other Cereals. J. Mol. Evol., 36: 369-379.

Coulter, G. W., and R. Mubamba 1993. Conservation in Lake Tanganyika, with Special Reference to Underwater Parks. Cons. Biol. 7(3): 678-685.

De Rijk, P., J-M. Neefs, Y. Van de Peer, and R. De Wachter, 1992. Compilation of small ribosomal subunit RNA sequences. Nucleic Acids Research, 20 (Supplement): 2075-2089.

Dobzhansky, T. 1951. Genetics and the Origin of Species, 3rd ed. Columbia Univ. Press, New York.

Dominey, W. J. 1984. Effects of Sexual Selection and Life History on Speciation: Species Flocks in African Cichlids and Hawaiian Drosophila, 231-250. In A. A. Echelle and I. Kornfield (eds.), Evolution of Fish Species Flocks, University of Maine Press, Orono.

Dover, G., S. Brown, E. Coen, J. Dallas, T. Strachan, and M. Trick, 1982. The dynamics of genome evolution and species differentiation, 343-372. In G. A. Dover and R. B. Flavell (eds.), Genome Evolution. Academic Press, New York.

Echelle, A. and T. E. Dowling 1992. Mitochondrial DNA Variation and Evolution of the Death Valley Pupfishes (Cyprinodon, Cyprinodontidae). Evolution 46(1): 193-206.

Erlich, H. A., D. Gelfand, and J. J. Sninsky 1991. Recent Advances in the Polymerase Chain Reaction. Science, 252: 1643-1651.

Fedoroff, N. V. 1979. On Spacers. Cell, 16, 697-710.

Felsenstein, J. 1982. Numerical Methods for Inferring Evolutionary Trees. Quart. Rev. Biol. 57(4): 379-404.

Felsenstein, J. 1988a. The detection of phylogeny, 113-127. In D. L. Hawksworth (ed.), Prospects in Systematics. Clarendon Press, Oxford.

Felsenstein, J. 1988b. Phylogenies from molecular sequences: Inference and reliability. Annual Review of Genetics, 22:521-565.

Felsenstein, J. 1989. PHYLIP — phylogeny inference package (Version 3.2). Cladistics, 5.

Frankel, O. H., and M. E. Soule, 1981. Conservation and Evolution. Cambridge University Press, Cambridge.

Fryer, G., 1977. Evolution of species flocks of cichlid fishes in African lakes. Zool. Syst. Evolut.-forsch. 15: 141-165.

Fryer, G. and T. D. Iles 1972. The Cichlid Fishes of the Great Lakes of Africa: their Biology and Evolution, Oliver and Boyd, London.

Fuerst, P., and T. Maruyama, 1986. Considerations on the Conservation of Alleles and of Genic Heterozygosity in Small Managed Populations. Zoo Biol. 5: 171-179.

Futuyma, D., and G. Mayer, 1980. Non-Allopatric Speciation in Animals. Systematic Zoology, 29(3): 254-271.

Gillespie, J. H. 1986. Rates of Molecular Evolution. Ann. Rev. Ecol. Syst., 17: 637-665.

Goff, D. J., K. Galvin, H. Katz, M. Westerfield, E. S. Lander, and C. J. Tabin 1992. Identification of Polymorphic Simple Sequence Repeats in the Genome of the Zebrafish. Genomics 14: 200-202.

Goldschmidt, T., 1990. Egg Mimics in Haplochromine Cichlids (Pisces, Perciformes) from Lake Victoria. Ethology 88, 177-190.

Goldschmidt, T., and F. Witte, 1990. Reproductive strategies of zooplanktivorous haplochromine cichlids (Pisces) from Lake Victoria before the Nile perch boom. OIKOS 58: 356-368.

Goldschmidt, T., and F. Witte 1992. Explosive speciation and extinction of haplochromine cichlids from Lake Victoria: An illustration of the scientific value of a lost species flock. Mitt. Internat. Verein. Limnol. 23: 101-107.

Goldschmidt, T., F. Witte, and J. de Visser 1990. Ecological segregation in zooplanktivorous haplochromine species (Pisces: Cichlidae) from Lake Victoria. OIKOS 58: 343-355.

Goldschmidt, T., F. Witte, and J. Wanink 1993. Cascading Effects of the Introduced Nile Perch on the Detritivorous/Phytoplanktivorous Species in the Sublittoral Areas of Lake Victoria. Cons. Biol. 7(3): 686-700.

Greenwood, P. H., 1959. The Monotypic Genera of Cichlid Fishes in Lake Victoria, Part II, and A Revision of the Lake Victoria Haplochromis Species (Pisces, Cichlidae), Part III. 5(7): 7-177.

Greenwood, P. H., 1966. The Fishes of Uganda. Second edition. The Uganda Society, Kampala. 131 pp.

Greenwood, P. H., 1974. The cichlid fishes of Lake Victoria, East Africa: the biology and evolution of a species flock. Bull. Br. Mus. Nat. Hist. (Zool.), Supp. 6:1-134.

Greenwood, P. H., 1979. Towards a phyletic classification of the 'genus' Haplochromis (Pisces, Cichlidae) and related taxa. Part I. Bull. Br. Mus. Nat. Hist. (Zool.), 35: 265-322.

Greenwood, P. H., 1980. Towards a phyletic classification of the 'genus' Haplochromis (Pisces, Cichlidae) and related taxa. Part II; the species from Lakes Victoria, Nabugabo, Edward, George, and Kivu. Bull. Br. Mus. Nat. Hist. (Zool.), 39: 1-110.

Greenwood, P. H., 1981. The Haplochromine Fishes of the East African Lakes. Kraus International Publications, München. 839 pp.

Greenwood, P. H. 1984a. African Cichlids and Evolutionary Theories, 141-154. In Anthony A. Echelle and Irv Kornfield (eds.), Evolution of Fish Species Flocks, University of Maine Press, Orono.

Greenwood, P. H. 1984b. What Is a Species Flock?, 13-19. In Anthony A. Echelle and Irv Kornfield (eds.), Evolution of Fish Species Flocks, University of Maine Press, Orono.

Greenwood, P. H. 1991. Speciation. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London. 86-102. Fish and Fisheries Series 2.

Hilgendorf, F., 1888. Fische aus dem Victoria-Nyanza (Ukerewe See). Sber. Ges. naturf. Freunde Berl., 75-9.

Hillis, D. M., and M. T. Dixon, 1991. Ribosomal DNA: molecular evolution and phylogenetic inference. Quart. Rev. Biol. 66: 411-453.

Hillis, D. M., J. P. Huelsenbeck, C. W. Cunningham 1994. Application and Accuracy of Molecular Phylogenies. Sci. 264: 671-677.

Hillis, D. M., and C. Moritz, 1990. An Overview of Applications of Molecular Systematics, 502-515. In D. Hillis and C. Moritz (eds.), Molecular Systematics. Sinauer, Sunderland, MA.

Jeffreys, A. J., V. Wilson, & S. L. Thein 1985. Individual-specific 'fingerprints' of human DNA. Nature, 316, 76-79.

Kambhampati, S., W. Black IV, and K. Rai, 1992. Random Amplified Polymorphic DNA of Mosquito Species and Populations (Diptera: Culicidae): Techniques, Statistical Analysis, and Applications. J. Med. Entomol. 29(6): 939-945.

Kaufman, L., 1989. Challenges to fish faunal conservation programs as illustrated by the captive biology of Lake Victoria cichlids. 5th World Conference on Breeding Endangered Species in Captivity. 105-120.

Kaufman, L., 1992. Catastrophic change in species-rich freshwater ecosystems: The lessons of Lake Victoria. BioScience, 42(11): 846-858.

Kaufman, L., and A. S. Cohen 1993. The Great Lakes of Africa. Cons. Biol. 7(3): 719-730.

Kaufman, L., and K. F. Liem, 1982. Fishes of the suborder Labroidei (Pisces; Perciformes): phylogeny, ecology, and evolutionary significance. Breviora, 472: 1-19.

Kaufman, L., and P. Ochumba, 1993. Evolutionary and Conservation Biology of Cichlid Fishes as Revealed by Faunal Remnants in Northern Lake Victoria. Conservation Biology, 7(3): 719-730.

Keenlyside, M. H., 1991. Parental care. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London. 191-208, Fish and Fisheries Series 2.

Kimura, M. 1980. A Simple Method for Estimating Evolutionary Rates of Base Substitutions Through Comparative Studies of Nucleotide Sequences. J. Mol. Evol., 16, 111-120.

Kocher, T. D., J. A. Conroy, K. R. McKaye, and J. R. Stauffer, 1994. Similar Morphologies of Cichlid Fish in Lakes Tanganyika and Malawi Are Due to Convergence. Mol. Phy. Evol., 2(2): 158-165.

Kocher, T. D., W.K. Thomas, A. Meyer, S.V. Edwards, S. Pääbo, F.X. Villablanca, and A.C. Wilson, 1989. Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci., 86: 6196-6200.

Kondrashov, A., and M. Mina, 1986. Sympatric speciation: when is it possible?. Biological Journal of the Linnean Society, 27:201-223.

Konings, A., 1990. Tanganyika Cichlids. Raket, B. V., Pijnacker, The Netherlands.

Korey, K. A., 1981. Species Number, Generation Length, and the Molecular Clock. Evol. 35(1): 139-147.

Kornfield 1982 p. 21

Kornfield, I. 1991. Genetics. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London. 103-128 Fish and Fisheries Series 2.

Kornfield, I., D.C. Smith, P.S. Gagnon, and J.N. Taylor, 1982. The Cichlid Fish of Cuatro Cienegas, Mexico: Direct Evidence of Conspecificity Among Distinct Trophic Morphs. Evolution, 36(4): 658-664.

Kornfield, I., U. Ritte, C. Richler, J. Wahrman, 1979. Biochemical and Cytological Differentiation Among Cichlid Fishes of the Sea of Galilee. Evolution, 33: 1-13.

Leslie, J. F. and R. C. Vrijenhoek 1978. Genetic Dissection of Clonally Inherited Genomes of Poeciliopsis. I. Linkage Analysis and Preliminary Assessment of Deleterious Gene Loads. Genetics, 90: 801-811.

Lessa, E., 1992. Rapid Surveying of DNA Sequence Variation in Natural Populations. Mol. Biol. Evol. 9(2): 323-330.

Li, W-H., 1993. What about the Molecular clock hypothesis? 896-901.

Li, W-H., and J. Bousquet, 1992. Relative-Rate Test for Nucleotide Substitutions between Two Lineages. Mol. Biol. Evol. 9(6): 1185-1189.

Li, W-H., and D. Graur 1991. Fundamentals of Molecular Evolution. Sinauer Associates, Inc., Sunderland, Mass. 284 pp.

Li, W-H., C. C. Luo, and C. I. Wu 1985. Evolution of DNA Sequences, pp. 1-94. In R. J. MacIntyre (ed.), Molecular Evolutionary Genetics. Plenum, N. Y.

Liem, K. F., 1991. Functional morphology. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London. 129-150, Fish and Fisheries Series 2.

Lippitsch, E., 1993. A phyletic study on lacustrine haplochromine fishes (Perciformes, Cichlidae) of East Africa, based on scale and squamation characters. J. Fish Biol., 42: 903-946.

Livingstone, D. A., 1980. Environmental changes in the Nile Headwaters, 339-359. In M. A. J. Williams and H. Faure (eds.), The Sahara and the Nile. Balkema Press, Rotterdam.

Lowe-McConnell, R. H., 1990. Summary address: rare fish, problems, progress and prospects for conservation. J. Fish Biol. 37 (Supplement A): 263-269.

Lowe-McConnell, R. H., 1991. Ecology of cichlids in South American and African waters, excluding the African Great Lakes. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London. 60-85. Fish and Fisheries Series 2.

Lowe-McConnell, R., 1993. Fish Faunas of the African Great Lakes: Origins, Diversity, and Vulnerability. Conservation Biology 7(3): 634-643.

Lynch, M. and B. G. Milligan, 1994. Analysis of Population Structure with RAPD Markers. Molecular Ecology, 3: 91-99.

Maddison, W. P., 1993. Missing Data versus Missing Characters in Phylogenetic Analysis. Syst. Biol. 42(4): 576-581.

Maden, B. E., 1986. Identification of the location of the methyl groups in 18S ribosomal RNA from Xenopus and man. J. Mol. Biol. 189: 681-699.

Marchuk, D., A. S. Mitchell and F. S. Collins. 1991. Construction of T-vectors, a rapid and general system for direct cloning of unmodified PCR products. Nucleic Acids Research, 19: 1154.

Maruyama, T., and P. A. Fuerst, 1984. Population bottlenecks and nonequilibrium models in population genetics. I. Allele numbers when populations evolve from zero variability. Genetics 108: 745-763.

Maruyama, T., and P. A. Fuerst, 1985. Population bottlenecks and nonequilibrium models in population genetics. II. Number of alleles in a small population that was formed by means of a recent bottleneck. Genetics 111: 675-689.

Maynard-Smith, J. 1966. Sympatric speciation. American Naturalist, 100:637-650.

Mayr, E., 1964. Animal species and evolution. Harvard University Press, Cambridge, Mass. 797 pp.

Mayr, E. 1984. Evolution of Fish Species Flocks: A Commentary, 3- 11. In Anthony A. Echelle and Irv Kornfield (eds.), Evolution of Fish Species Flocks, University of Maine Press, Orono.

McCallister, D.E. 1968. Evolution of branchiostegals and classification of teleostome fishes. Bulletin of the National Museum of Canada, 221, 1-239.

McKaye, K. R., 1991. Sexual selection and the evolution of the cichlid fishes of Lake Malawi, Africa. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London. 241-257. Fish and Fisheries Series 2.

Meffe, G., 1986. Conservation Genetics and the Management of Endangered Fishes. Fisheries, 11(1), 14-22.

Meyer, A., 1989. Trophic polymorphisms in cichlid fishes: do they represent intermediate steps during sympatric speciation and explain their rapid adaptive radiation? in New Trends in Ichthyology (ed. J. H. Schroder), pp

Meyer, A., 1990. Ecological and evolutionary consequences of the trophic polymorphism in Cichlasoma citrinellum (Pisces: Cichlidae). Biol. J. Linn. Soc. 39: 279-299.

Meyer, A., T. D. Kocher, P. Basasibwaki and A. C. Wilson 1990. Monophyletic origin of Lake Victoria cichlid fishes suggested by mitochondrial DNA sequences. Nature, 347, 550-553.

Müller, J., 1843. Nachträge zu der Abhandlung über die natürlichen Familien der Fische. Arch. Naturgesch., 9, 381-384.

Neefs, J-M., Y. Van de Peer, P. De Rijk, A. Goris, and R. De Wachter 1991. Compilation of small ribosomal subunit RNA sequences. Nucleic Acids Research, 19 (supplement): 1987-1999.

Nei, M., T. Maruyama, and R. Chakraborty, 1975. The bottleneck effect and genetic variability in populations. Evolution 29:1-10.

Nelissen, M. H. J., 1991. Communication. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London. 209-240, Fish and Fisheries Series 2.

Nelles, L., B.L. Fang, G. Volckaert, A. Vandenberghe, and R. DeWachter, 1984. Nucleotide sequence of a crustacean 18S ribosomal RNA gene and secondary structure of eucaryotic small subunit ribosomal RNA's. Nuc. Acids Res. 12: 8749-8768.

Noakes, D. L. G., 1991. Ontogeny of behaviour in cichlids. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London. 209-224, Fish and Fisheries Series 2.

Ogutu-Ohwayo, R., 1990. The decline of the native fishes of Lakes Victoria and Kyoga (East Africa) and the impact of introduced species, especially the Nile perch, Lates niloticus, and the Nile tilapia, Oreochromis niloticus. Envir. Biol. Fishes. 27: 81-96.

Ogutu-Ohwayo, R., 1993. The Effects of Predation by Nile Perch, Lates niloticus L., on the Fish of Lake Nabugabo, with Suggestions for Conservation of Endangered Endemic Cichlids. Cons. Biol. 7(3): 701-711.

Olsen, G. J., 1987. Earliest Phylogenetic Branchings: Comparing rRNA-based Evolutionary Trees Inferred with Various Techniques Cold.

Olsen, G.J., N. Larson, and C. R. Woese 1991. The ribosomal RNA Database project. N.A.R, 19 (supplement) 2017-2021.

Ono, H., C. O'hUigin, V. Vincek, and J. Klein, 1993. Exon-Intron Organization of fish major histocompatibility complex II B genes. Immunogenetics 38(3): 223-234.

Owen, R. B., R. Crossley, T. C. Johnson, D. Tweddle, I. Kornfield, S. Davidson, D. H. Eccles, and D. E. Engstrom 1990. Major low levels of Lake Malawi and their implications for speciation rates in cichlid fishes. Proceedings of the Royal Society, London. B240: 519-553.

Pace, Norman R., Gary J. Olsen, and Carl R. Woese 1986. Ribosomal RNA Phylogeny and the Primary Lines of Evolutionary Descent. Cell, 45: 325-326.

Park, Y., and R. Kohel, 1994. Effect of Concentration of MgCl2 on Random-Amplified DNA Polymorphism. Biotechniques. 16(4): 652-655.

Parker, P.P., A. Snow, M. D. Schug, G.C. Booton, P.A. Fuerst. 1994. Molecular markers for population biology (submitted). Ecology

Phillips, R. B., K. A. Pleyte, M. R. Brown 1992. Salmonid Phylogeny Inferred from Ribosomal DNA Restriction Maps. Can. J. Fish. Aquat. Sci. 49: 2345-2353.

Pleyte, K. A., S. Duncan, and R. Phillips, 1992. Evolutionary Relationships of the Salmonid Fish Genus Salvelinus Inferred from DNA Sequences of the First Internal Transcribed Spacer (ITS 1) of Ribosomal DNA. Molec. Phy. Evol. 1(3): 223-230.

Powers, D., 1991. Evolutionary Genetics of Fish. In J. Scandalios and T. Wright (eds.), Advances in Genetics 29: 119-227.

Pullin, R. S. V., 1991. Cichlids in aquaculture. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London. 280-309, Fish and Fisheries Series 2.

Reinthal, P., 1993. Evaluating Biodiversity and Conserving Lake Malawi's Cichlid Fish Fauna. Cons. Biol. 7(3): 712-718.

Ribbink, A. J., 1984. Is the Species Flock Concept Tenable? 21-26. In A. A. Echelle and I. Kornfield (eds.), Evolution of Fish Species Flocks, University of Maine Press, Orono.

Ribbink, A. J., 1986. The species concept, sibling species and speciation. Annls. Mus. r. Afr. Cent. Sci. zool., 251:109-116.

Ribbink, A. J. 1991. Distribution and ecology of the cichlids of the African Great Lakes, 36-59. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London.

Rice, L., 1990. Nucleotide sequence of the 18S ribosomal RNA gene from the Atlantic sea scallop Placopecten magellanicus (Gmelin, 1791). Nuc. Acids. Res. 18:5551.

Russell, J. R., F. Hosein, E. Johnson, R. Waugh, and W. Powell, 1993. Genetic differentiation of cocoa (Theobroma cacao L.) populations revealed by RAPD analysis. Molecular Ecology, 2:89-97.

Sackley, P. 1992. Phenotypic plasticity in fishes. M.A. Dissertation. University of Massachusetts at Boston, Boston, Mass.

Sage, R. D., P. V. Louiselle, P. Basasibwaki and A. C. Wilson 1984. Molecular Versus Morphological Change Among Cichlid Fishes of Lake Victoria, 185-202. In A. A. Echelle and I. Kornfield (eds.), Evolution of Fish Species Flocks, University of Maine Press, Orono.

Sage, R. D., and R. K. Selander, 1975. Trophic radiation through polymorphism in cichlid fishes. Proc. Natl. Acad. Sci. USA 72: 4669-4673.

Saitou, N., and M. Nei, 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4(4): 406-425.

Sambrook, J., E. F. Fritsch, and T. Maniatis, 1989. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Plainview, New York.

Sarich, V. M. and A. C. Wilson, 1973. Generation time and genomic evolution in primates. Science 179: 1144-1147.

Schierwater, B., and A. Ender, 1993. Different thermostable DNA polymerases may amplify different RAPD products. Nuc. Acids Res. 21(19):4647-4648.

Schliewen, U., D. Tautz, and S. Pääbo, 1994. Sympatric speciation suggested by monophyly of crater lake cichlids. Nature, 368: 629-632.

Scholz, C.A., and B.R. Rosendahl 1988. Low lake stands in Lakes Tanganyika and Malawi, East Africa, delineated with multifold seismic data. Science, 240, 1645-1648.

Scott, M., K. Haymes, and S. Williams, 1992. Parentage analysis using RAPD PCR. Nuc. Acids Res. 20(20):5493.

Slade, R. W., C. Moritz, and A. Heideman, 1994. Multiple Nuclear-Gene Phylogenies: Application to Pinnipeds and Comparison with a Mitochondrial DNA Gene Phylogeny. Mol. Biol. Evol. 11(3): 341-356.

Sober, E., 1989. Reconstructing the Past: Parsimony, Evolution, and Inference. MIT Press, Cambridge, MA.

Sogin, M. L., H. J. Elwood, and J. H. Gunderson, 1986. Evolutionary diversity of eukaryotic small-subunit rRNA genes. Proc. Natl. Acad. Sci. USA, 83: 1383-1387.

Soltis, P., and R. Kuzoff, 1993. ITS Sequence Variation within and among Populations of Lomatium grayi and L. laevigatum (Umbelliferae). Mol. Phy. Evol. 2(2):166-170.

Stager, J. C., P. N. Reinthal, and D. A. Livingstone, 1986. A 25,000-year history for Lake Victoria, East Africa, and some comments on its significance for the evolution of cichlid fishes. Fresh. Biol. 16: 15-19.

Stiassny, M. L. J. 1991. Phylogenetic intrarelationships of the family Cichlidae: an overview, 1-35. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London.

Stiassny, M. L. J., and J. S. Jensen, 1987. Labroid intrarelationships revisited: morphological complexity, key innovations, and the study of comparative diversity. Bull. Mus. Comp. Zool. 151: 269-319.

Stock, D. W., K. D. Moberg, L. R. Maxson, and G. S. Whitt, 1991. A phylogenetic analysis of the 18S ribosomal RNA sequence of the coelacanth Latimeria chalumnae. Envir. Biol. Fishes, 32: 99-117.

Stock, D. W., and G. S. Whitt, 1992. Evidence from 18S ribosomal RNA sequences that lampreys and hagfishes form a natural group. Science 257: 787-789.

Sturmbauer, C. and A. Meyer 1992. Genetic divergence, speciation and morphological stasis in a lineage of African cichlid fish. Nature, 358: 578-581.

Sturmbauer, C., and A. Meyer, 1993. Mitochondrial phylogeny of the endemic mouthbrooding lineages of cichlid fishes from Lake Tanganyika in eastern Africa. Mol. Biol. Evol. 10(9): 751-768.

Swofford, D. L. 1990. PAUP: Phylogenetic Analysis Using Parsimony (Version 3.0). Computer program distributed by the Illinois Natural History Survey, Champaign, Illinois.

Swofford, D. L., and G. Olsen, 1990. Phylogeny Reconstruction. In D. Hillis and C. Moritz (eds.), Molecular Systematics. Sinauer Assoc., Sunderland, Ma.

Takahata, N., 1988. More of the episodic clock. Genetics, 118: 387-388.

Temple, P., 1969. Some biological implications of a revised geological history for Lake Victoria. Biological Journal of the Linnean Society, 1:363-371.

Torres, R. A., M. Ganal, and V. Hemleben, 1990. GC balance in the internal transcribed spacers ITS 1 and ITS 2 of nuclear ribosomal RNA genes. J. Mol. Evol. 30:170-181.

Trewavas, E., 1933. Scientific Results of the Cambridge Expedition to the East African Lakes, 1930-1.-11. The Cichlid Fishes.

Trewavas, E., 1983. Tilapiine Fishes of the Genera Sarotherodon, Oreochromis and Danakilia, British Museum (Natural History), London.

Van Couvering, 1982. Fossil cichlid fishes of Africa. Special papers in Palaeontology. Palaeontological Ass. Lond. 29: 1-103.

Van Oijen, M. J. P., 1982. Ecological Differentiation Among the Piscivorous Haplochromine Cichlids of Lake Victoria (East Africa). Neth. J. Zool. 32(3): 336-363.

Van Oijen, M. J. P., F. Witte, and E. L. M. Witte-Maas, 1981. An Introduction to Ecological and Taxonomic Investigation on the Haplochromine Cichlids from the Mwanza Gulf of Lake Victoria. Neth. J. Zool., 31(1): 149-174.

Venugopal, G., S. Mohapatra, D. Salo, and S. Mohapatra, 1993. Multiple mismatch annealing: Basis for Random Amplified Polymorphic DNA fingerprinting. Bioch. Bioph. Res. Com. 197(3): 1382-1387.

Vogler, A. P., and R. DeSalle, 1994. Evolution and Phylogenetic Information Content of the ITS-1 Region in the Tiger Beetle Cicindela dorsalis. Mol. Biol. Evol. 11(3): 393-405.

Wheeler, Q., 1990. Ontogeny and Character Phylogeny. Cladistics 6: 225-268.

Williams, J., A. Kubelik, K. Livak, J. Rafalski, and S. Tingey, 1990. DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res. 18: 6531-6535.

Williams, J. E., D. W. Sada, C. D. Williams, and Other Members of the Western Division Endangered Species Committee, 198?. American Fisheries Society Guidelines for Introduction of Threatened and Endangered Fishes. Fisheries 13(5): 5-11.

Wilson, E. O., 1992. The Diversity of Life. W. W. Norton and Company, New York, London. 424 pp.

Wilson, A. C., S. S. Carlson, and T. J. White, 1977. Biochemical evolution. Annu. Rev. Biochem. 46: 473-639.

Wilson, A. C., H. Ochman and E. Prager 1987. Molecular time scale for evolution. Trends in Genetics, 3(9):241-247.

Witte, F., 1984. Ecological differentiation in Lake Victoria haplochromines: comparison of cichlid species flocks in African lakes, in Evolution of Fish Flocks (eds A. A. Echelle and I. Kornfield), University of Maine at Orono Press, Orono, Maine, 155-67.

Witte, F., C. D. N. Barel, and R. J. C. Hoogerhoud, 1990. Phenotypic Plasticity of Anatomical Structures and its Ecomorphological Significance. Neth. J. Zool. 40(1-2): 278-298.

Witte, F., and M. J. P. van Oijen, 1990. Ecology and fishery of Lake Victoria haplochromine trophic groups. 262: 1-47.

Wolfe, K. H., P. M. Sharp, and W-H. Li, 1989. Mutation rates differ among regions of the mammalian genome. Nature 337: 283-285.

Woodward, A. S., 1939. Tertiary fossil fishes from Maranhao, Brasil. Ann. Mag. Nat. Hist., 2(3):450-453.

Wright, S., 1955. Classification of the factors of evolution. Cold Spring Harbor Symp. Quant. Biology. 20:16-24D.

Yamaoka, K., 1991. Feeding relationships. In Miles H. A. Keenlyside (ed.), Cichlid Fishes: Behaviour, ecology, and evolution, Chapman and Hall, London. 151-172, Fish and Fisheries.

Zuckerkandl, E., and L. Pauling, 1965. Evolutionary divergence and convergence in proteins. In V. Bryson and H. J. Vogel (eds.), Evolving Genes and Proteins, Academic Press, New York, 97-166.