
Retrospective Theses and Dissertations, Iowa State University Capstones, Theses and Dissertations

1-1-2003

Phylogeny of lamniform sharks based on whole mitochondrial genome sequences

Toni Laura Ferrara Iowa State University


Recommended Citation Ferrara, Toni Laura, "Phylogeny of lamniform sharks based on whole mitochondrial genome sequences" (2003). Retrospective Theses and Dissertations. 19960. https://lib.dr.iastate.edu/rtd/19960

This Thesis is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Retrospective Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository.

Phylogeny of lamniform sharks based on whole mitochondrial genome sequences

by

Toni Laura Ferrara

A thesis submitted to the graduate faculty in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE

Major: Zoology

Program of Study Committee: Gavin Naylor (Major Professor), Dean Adams, Bonnie Bowen, Jonathan Wendel

Iowa State University

Ames, Iowa

2003

Graduate College Iowa State University

This is to certify that the master's thesis of Toni Laura Ferrara has met the thesis requirements of Iowa State University

Signatures have been redacted for privacy

TABLE OF CONTENTS

GENERAL INTRODUCTION

LITERATURE REVIEW

CHAPTER ONE: THE BIOLOGY OF LAMNIFORM SHARKS

CHAPTER TWO: LAMNIFORM SYSTEMATICS AND PHYLOGENY

CHAPTER THREE: THE FOSSIL RECORD OF LAMNIFORM SHARKS

CHAPTER FOUR: MITOCHONDRIAL GENOMES, TAXON SAMPLING AND PHYLOGENY

CHAPTER FIVE: A PHYLOGENY OF LAMNIFORM SHARKS BASED ON WHOLE MITOCHONDRIAL GENOME SEQUENCES

CHAPTER SIX: A TANDEM DUPLICATION IN THE MITOCHONDRIAL GENOME OF THE GOBLIN SHARK MITSUKURINA OWSTONI

GENERAL CONCLUSIONS

REFERENCES

ACKNOWLEDGEMENTS

APPENDIX

GENERAL INTRODUCTION

The Lamniformes comprises 15 living species of sharks organized into seven families. The diversity among these species is impressive. Within the Lamniformes are the notorious predator, the great white shark (Carcharodon carcharias), and two species of harmless filter-feeding sharks, the megamouth (Megachasma pelagios) and the basking shark (Cetorhinus maximus). The remarkable discovery of the megamouth shark in 1976 renewed interest in the phylogenetic relationships among lamniform sharks. Morphology-based analyses of lamniform phylogeny have produced conflicting results (Maisey, 1985; Compagno, 1990; Shirai 1992, 1996; De Carvalho, 1996). Subsequent molecular-based analyses of this group were also unable to discern their phylogeny (Morrissey et al., 1997; Martin and Naylor, 1997; Naylor et al., 1997; Martin et al., 2002; Martin and Burg, 2002; Lopez et al., MS). Although lamniform teeth are abundant in the fossil record, these have proved of limited use in revealing relationships among living species. The aim of this thesis is to attempt to resolve lamniform phylogeny using entire mitochondrial genome sequences from all living taxa. As there are only 15 living species of lamniform sharks, this group represents an excellent opportunity to test the idea that complete taxon sampling combined with large datasets (such as entire mitochondrial genome sequences) can improve phylogenetic reconstruction.

Thesis Organization

This thesis is written in an alternative format. It includes an extensive literature review, which is divided into four chapters: Chapter One is an overview of the biology of lamniform sharks; Chapter Two examines lamniform systematics; Chapter Three covers the fossil record of these sharks; and Chapter Four discusses the utility of mitochondrial genes and genomes in phylogenetic analysis. Chapters Five and Six present the results of this study and are written in manuscript format. A general conclusion summarizing the contents of all six Chapters is included, followed by a final list of references combined for all Chapters. Complete mitochondrial genome sequences for all 15 species of lamniform sharks are listed in the Appendix in GENBANK format.

LITERATURE REVIEW

CHAPTER ONE: THE BIOLOGY OF LAMNIFORM SHARKS

Introduction

The order Lamniformes comprises 15 living species of sharks organized into seven families. The diversity among these species is impressive. Within the Lamniformes are the notorious predator, the great white shark (Carcharodon carcharias), and two species of harmless filter-feeding sharks, the megamouth (Megachasma pelagios) and the basking shark (Cetorhinus maximus). Individual lamniforms also exhibit an array of biological adaptations unique among cartilaginous fishes. For example, the sand tiger shark Carcharias taurus, a common inhabitant of many public aquaria, was the first shark in which the unusual reproductive behavior of uterine cannibalism was documented (Bass et al., 1975). Thresher sharks (Alopias spp.) have enormously long and asymmetrical tails that make up about half the shark's total body length. Endothermy, which is rare among fishes, has been described for several species of lamniform sharks. Also included in the order are poorly studied species such as the crocodile shark (Pseudocarcharias kamoharai) and the bizarre deep-water goblin shark (Mitsukurina owstoni). This Chapter provides an introduction to the unique adaptations exhibited by lamniform species as well as an overall guide to their biology. Subsequent chapters include detailed information on the fossil record of the group, and the systematics and evolutionary relationships among taxa. A list of families and species is presented (for a key to species, see Chapter Two) as a guide to recognized species within the Lamniformes and as an outline for the Chapter, since species are also discussed in the order listed. Adaptations such as filter-feeding, endothermy and reproduction are discussed separately.

List of families and species in the order Lamniformes

Order Lamniformes Compagno, 1973

Family Megachasmidae Taylor et al., 1983
Megamouth shark, one species in one genus:
Genus Megachasma Taylor et al., 1983
Species M. pelagios Taylor et al., 1983

Family Cetorhinidae Gill, 1862
Basking shark, one species in one genus:
Genus Cetorhinus Blainville, 1816
Species C. maximus (Gunnerus, 1765)

Family Mitsukurinidae Jordan, 1898
Goblin shark, one species in one genus:
Genus Mitsukurina Jordan, 1898
Species M. owstoni Jordan, 1898

Family Pseudocarchariidae Compagno, 1973
Crocodile shark, one species in one genus:
Genus Pseudocarcharias Cadenat, 1963
Species P. kamoharai (Matsubara, 1936)

Family Odontaspididae Müller and Henle, 1839
Sand tiger sharks, three species in two genera:
Genus Carcharias Rafinesque, 1810
Species C. taurus Rafinesque, 1810
Genus Odontaspis Agassiz, 1838
Species O. ferox (Risso, 1810)
Species O. noronhai Maul, 1955

Family Alopiidae Bonaparte, 1838
Thresher sharks, three species in one genus:
Genus Alopias Rafinesque, 1810
Species A. vulpinus (Bonnaterre, 1788)
Species A. pelagicus Nakamura, 1935
Species A. superciliosus Lowe, 1839

Family Lamnidae Müller and Henle, 1838
Mackerel sharks, five species in three genera:
Genus Lamna Cuvier, 1817 (porbeagle and salmon shark)
Species L. nasus (Bonnaterre, 1788)
Species L. ditropis Hubbs and Follett, 1947
Genus Isurus Rafinesque, 1809 (mako sharks)
Species I. oxyrinchus Rafinesque, 1809
Species I. paucus Guitart Manday, 1966
Genus Carcharodon Smith, 1838 (great white shark)
Species C. carcharias (Linnaeus, 1758)

Family Megachasmidae

In November 1976, the crew of a research vessel operating off the coast of Oahu, Hawaii, was stunned when they found a large, bizarre shark caught in their gear. Realizing the uniqueness of such a creature, the crew kept the shark (despite complications imposed by its large size) and made it available for scientific study. The shark, a 4.46m male weighing 750kg, was determined not only to be a new species of shark, but to represent a new genus and family belonging to the order Lamniformes (see Chapter Two for diagnostic features). The shark was dubbed "megamouth shark" by the press and the species was formally named Megachasma pelagios, from the Greek "megachasma" for "large opening", and "pelagios" meaning "of the open sea" (Taylor et al., 1983). The remarkable discovery of this large, previously unknown and bizarre species of shark has been likened in importance to the 1938 capture of the coelacanth (Berra, 1997). The unusual nature of the animal makes this find even more impressive; Megachasma pelagios is believed to be a filter-feeder, one of only three species of sharks to possess such a feeding mechanism (Taylor et al., 1983). The following section details current information on the biology of Megachasma pelagios. A discussion of filter-feeding in this animal as compared to other filter-feeding sharks (including the lamniform, Cetorhinus maximus) is presented elsewhere in this Chapter. Since its discovery in 1976, fewer than 20 specimens of Megachasma pelagios have been documented. Almost all of these have been males caught in the Pacific Ocean (including Hawaii and the Philippines) (Amorim et al., 2000). One specimen (a male) was caught off Western Australia, making it the first specimen captured in the Indian Ocean (Berra and Hutchins, 1990). Although Megachasma was then unknown from the Atlantic Ocean, Berra and Hutchins (1991) believed the shark should inhabit these waters.
This statement was based on previous studies of the other two known species of large planktivorous sharks, Cetorhinus maximus (the basking shark) and Rhiniodon typus (the whale shark). In 1995, two specimens of megamouth sharks were caught in the Atlantic Ocean. The first specimen was caught in a net in Senegal (Seret, 1995), while the second specimen was hooked on a longline off Brazil (Amorim et al., 2000). Unfortunately the Senegalese shark was discarded before it could be examined. The description of this animal matched that of Megachasma and the size, estimated at 1.8m, suggests it was an immature male. However, the verbal description of the animal is all that remains of this discovery since there was no other evidence recorded (Seret, 1995). The Brazilian specimen, however, was kept and it represents both the first documented specimen of a megamouth in the Atlantic Ocean and the first description of a juvenile of the species (Amorim et al., 2000). The 1.9m male was caught on a longline, and is the only specimen to be captured on a hook. A detailed morphological description of this shark is presented in Amorim et al. (2000). It is interesting to note that the only two specimens caught in the Atlantic have both been immature. The majority of reported captures of megamouth sharks are males ranging in size from 1.8m to approximately 5.49m (Amorim et al., 2000). Very few of these reported specimens have actually been studied and these are on display around the world. Several specimens are known only from photographs or anecdotal evidence. Aside from the capture of the first juvenile, other important records of this species occurred in 1990, when the first (and only) megamouth shark was acoustically tracked, and in 1994, when the first female specimen was preserved and studied. These specimens provided the only documented behavioral information on megamouths as well as an important first examination of a female of the species.
Before October 1990 only four dead specimens of the mysterious megamouth shark had been reported. A fifth megamouth was captured alive but was subsequently released with only photographs to record the discovery (Amorim et al., 2000). Given the paucity of data on this rare, unusual shark, one can imagine the excitement when the sixth specimen of megamouth was captured alive and made available for examination. Megamouth VI was a 4.9m male that was caught in a driftnet off Southern California. The fisherman who caught the shark was unfamiliar with it and, as it was in such good shape, decided to bring the animal back to shore. The color of the shark was very unusual: it was counter-shaded, with its flanks and dorsal surface golden bronze (as opposed to the normal dark black of previous specimens), and a white ventral surface. Such a color pattern was consistent with sharks inhabiting nearshore or shallow oceanic environments, contrary to the deep oceanic waters megamouths were believed to inhabit (Lavenberg, 1991). This unusual specimen survived in the harbor tied to a rope for over 24 hours by remaining motionless on the bottom and passively pumping water over its gills. It was observed, filmed, photographed and fitted with acoustic telemetry equipment before its release (Lavenberg, 1991; Nelson et al., 1997). The shark was tracked continuously for 50.5 hours, during which it exhibited a distinct pattern: at night the shark entered shallow water (12-25m) while during the day it retreated to much deeper waters (166m below the surface). The shark repeated this pattern the next evening before it was lost. The reason for this pattern is unknown, although two suggestions have been made. The first is that megamouths may be responding to changes in light levels, resulting in the onset of deeper dives just before sunrise. However, it is unknown how the shark would perceive such changes, so a second alternative was proposed.
It is known that light levels affect vertical migrations in plankton, so perhaps Megachasma is simply following its prey throughout the water column (Nelson et al., 1997). A second advance in our knowledge of Megachasma biology did not occur until 1994, when the first female Megachasma was found washed up on a beach in Japan (Takada et al., 1997). The 4.8m shark was immature (Clark and Castro, 1995; Nakaya et al., 1997). This female did reveal, however, that the reproductive tract of Megachasma was very unusual. It consisted of two unfused uteri, each with separate vaginas covered by a thick hymen. Although this type of reproductive tract is known for hexanchiforms, it is atypical in lamniforms and may represent an ancestral feature of megachasmids (Clark and Castro, 1995; Castro et al., 1997). Since this time at least three mature female megamouth sharks, ranging in size from 5.2-5.5m, have been reported, two of which displayed prominent mating scars on their bodies (Yano et al., 1997a, 1999). None of these sharks were pregnant, and in fact, a pregnant female of this species has never been reported. Unfortunately only one of these sharks was kept; the others are known only from photographs (one was released, while the other was discarded after it could not be sold at a Japanese market; Yano et al., 1997a, 1999).

Family Cetorhinidae

The second species of planktivorous lamniform is the basking shark, Cetorhinus maximus. This species was the first filter-feeding shark to be described, and its original name Squalus maximus ("largest shark") was certainly appropriate given the size of these animals (see below). The only shark larger than the basking shark is the whale shark (Rhiniodon typus, Smith, 1828). The whale shark, described 63 years after C. maximus, is the third species of planktivorous shark and may reach up to 18m in length (Compagno, 1984a).

The maximum size of the basking shark is controversial since past records may have been overestimated. In the most comprehensive anatomical study of Cetorhinus to date, Matthews and Parker (1950) discussed how easy it was to overestimate the size of this animal. Given that their own estimates were off by as much as 3m, these authors cautioned against relying on lengths obtained by descriptions of sharks not measured directly. Despite records of 12.2-15.2m, it seems that 10m is a fair estimate for the total length of this shark (Last and Stevens, 1994). Males mature between 4-5m while females mature at approximately 8-10m (Compagno, 1984a). Although we have known about the basking shark since the late 1700's, and it has been targeted by fisheries for its oil since the 1800's (Fairfax, 1998), the basic biology of these sharks is still poorly known. Information on these animals is often based on anecdotal evidence of sharks that were not available for study (Springer and Gilbert, 1976). The size of this animal makes it difficult to store: there are no intact museum specimens available for study (Springer and Gilbert, 1976) and simply handling the animal for basic research is difficult:

They are not easy subjects for dissection, the size and weight of the individual organs making handling difficult; and woe betide the anatomist who inadvertently punctures the stomach and releases perhaps the better part of a ton of semi-digested over his dissection. Matthews (1950; p.248-249).

One need only imagine the difficulty of studying an 8.8m shark in which the liver alone weighs 907 kg (Matthews and Parker, 1950)! A second reason for the lack of knowledge may be the behavior of these animals (Springer and Gilbert, 1976). These animals are highly migratory and exhibit complex social behaviors that we have yet to understand (Castro, 1983; Francis and Duffy, 2002). Basking sharks are found in all of the major oceans of the world (Compagno, 1984a). These sharks seem to prefer cool temperate waters (Francis and Duffy, 2002) and although they are occasionally found in warm tropical waters (an emaciated 8.2m female was described by Springer and Gilbert [1976] off the coast of Florida), this appears to be out of their normal range (Fairfax, 1998). Basking sharks seem to be seasonal, often appearing inshore during spring and summer and disappearing in autumn and winter months (Francis and Duffy, 2002). However, this pattern does not explain all sightings of basking sharks (for example, they appear year round in waters off California) and there appears to be no universal migratory pathway that explains the seasonal habitats of these sharks (Stott, 1982; Compagno, 1984a; Castro, 1983; Francis and Duffy, 2002).

The social behavior of these sharks when they do appear is also a mystery. Large groups of basking sharks (up to 200 at a time; Fairfax, 1998) are often witnessed, sometimes within 100m of the shoreline (Francis and Duffy, 2002). The significance of these groups is not understood, although their presence may be associated with mating (Fairfax, 1998). Smaller groups of sharks (up to 21) have also been seen in brackish lakes, such as Lake Ellesmere in New Zealand (Francis and Duffy, 2002). This is considered an unusual habitat for these sharks, and these sightings may have been linked to abundant populations of a favorite food source in the lake (Francis and Duffy, 2002). Basking sharks have also been caught at depths of at least 904m, indicating that the habitat range for these sharks extends from brackish lakes and shallow coastal waters to the open ocean. However, the manner in which these animals utilize these various habitats is unknown (Francis and Duffy, 2002). In addition to seasonal variations, basking sharks may also show sexual segregation. For example, analysis of catches of Cetorhinus during summer months obtained from fisheries data in the British Isles shows that females outnumbered males by a ratio of 40:1 (Compagno, 1984a; Matthews and Parker, 1950). Francis and Duffy (2002) have also noticed a similar pattern of sexual segregation in waters off New Zealand. Although sexual segregation is not uncommon in sharks, in basking sharks this pattern is unusual since these sharks are often immature (or non-breeding adults) and none of these females are pregnant. In fact, records of pregnant females, with the exception of a few ambiguous reports, are almost unheard of (Compagno, 1984a; Izawa and Shibata, 1993; Francis and Duffy, 2002). Perhaps the only modern record of such an event describes the birth of five living and one stillborn basking shark from a harpooned female.
The young ranged in length from 150-200cm and, upon release, were quite capable of feeding and swimming at this size (Branstetter, 2002). Given the dearth of pregnant females, it is not surprising that little is known of the juveniles of this species. The smallest record (besides the account mentioned above) for the basking shark was a free-swimming 1.65m individual in the Atlantic (Compagno, 1984a; Castro, 1983) and only a handful of specimens less than 3m have been reported (Izawa and Shibata, 1993). Studies of young specimens indicate a significant difference in the morphology of the juvenile snout as compared to the adult: in juveniles the snout is proportionally longer and displays a prominent "hook-shape" which is lacking in adults (Matthews and Parker, 1950; Springer and Gilbert, 1976; Izawa and Shibata, 1993). This unusual morphology may help direct water into the mouth, thus increasing the efficiency of filter-feeding in juveniles that may be hampered in swimming ability as compared to adults (Izawa and Shibata, 1993). The size at which the adult morphology is achieved is unknown, although sharks examined by Matthews and Parker (1950) between 1.67 and 3.65m in length displayed this condition.

Since past descriptions of juveniles, often within this range, have omitted this prominent feature, the accuracy of these reports has been questioned (Matthews and Parker, 1950; Izawa and Shibata, 1993; Fairfax, 1998). Growth estimates in the basking shark indicate that the size at birth is approximately 1.7 to 1.8m, which is perhaps the largest known for any viviparous species of shark (but see A. pelagicus below; Compagno, 1984a; Izawa and Shibata, 1993). Both the lack of pregnant females and the apparent absence of these sharks in winter have prompted speculation on the potential deep-water habitats of these sharks (Matthews, 1950; Parker and Boseman, 1954). Parker and Boseman (1954) proposed the idea that these sharks may hibernate during winter months. This theory (now defunct) was based on two primary sets of evidence: first, several captures of basking sharks with non-functional gill-rakers, believed essential for feeding, have been reported in winter months; and second, plankton concentrations during this season may have been insufficient to support the needs of these sharks (Weihs, 1999). However, recent evidence argues against these points since specimens without gill-rakers are caught throughout the year (this may simply be a mechanism to replace the worn feeding apparatus; Francis and Duffy, 2002) and energy costs associated with feeding in these sharks have been severely overestimated (Weihs, 1999). In addition, observations that these sharks feed in water with very low plankton densities (Hallacher, 1977) and the theory that their huge liver (which, at 907 kg, is 25% of the total body weight; Matthews and Parker, 1950) may also help sustain the shark when reserves are low (Last and Stevens, 1994; Francis and Duffy, 2002) suggest that food may not be the limiting factor as once believed. And what of the apparent absence of this shark during winter months?
In a comparison of New Zealand fisheries conducted by Francis and Duffy (2002), areas which utilized nets designed to trawl along the bottom at depth (greater than 700m) reported significant captures of Cetorhinus in winter months as compared to areas which only trawled in shallower waters. Accounts of basking sharks in winter may therefore be biased by the technique utilized, since worldwide captures at these depths often yield similar results. A deep-water habitat for these sharks is also supported by both the presence of mesopelagic species found in the stomach and a squalene (a low-density hydrocarbon) composition in their liver similar to that found only in deep-sea sharks (Francis and Duffy, 2002). Perhaps basking sharks live at depth year round and are only brought to the surface when conditions (such as food sources) are beneficial (Springer and Gilbert, 1976). So far, much of the information we have on basking sharks has been obtained from analysis of fisheries data (e.g., Stott, 1982; Francis and Duffy, 2002). Basking sharks have been the subject of local fisheries since the 1800's. Early fisheries focused on the oil of these sharks as a source of both lamp oil and Vitamin A, and today they are still utilized commercially (Fairfax, 1998; Castro et al., 1999). However, these local fisheries often quickly collapsed due to over-fishing, and this fact has been used as evidence for the vulnerability of basking sharks to exploitation (Castro, 1983; Compagno, 1990a; Castro et al., 1999). In addition to their use in fisheries, basking sharks have also been killed as pests since their surface habits cause them to become entangled in salmon nets. These animals often destroy these nets both in an effort to escape and by the mucus that they secrete, which is so noxious that it is actually corrosive to natural fibers (Compagno, 1990a; Fairfax, 1998). The vulnerability of these sharks has been recognized by several efforts to protect them. Since 1998, it has been illegal to harm a basking shark in any way in waters off the United Kingdom (Fairfax, 1998) and they were placed on the IUCN-World Conservation Union Red List of Threatened Animals in 1996 (Sims et al., 2000). In the past, misidentifications of the carcasses of these sharks (perhaps discarded from fisheries usage) have been used as proof of sea monsters. The remains of this animal when washed ashore were often mistaken for sea serpents. During decomposition, the jaws and gill arches detach from the cartilaginous "skull", which is proportionally small in this animal. This gives the appearance of an animal with a very long neck attached to a small skull. This "sea creature" has three sets of "legs", with each pair formed by the remnants of the pectoral girdle, pelvic girdle and the claspers (Fairfax, 1998), which are over 1m in length in adult males (Matthews, 1951). Proof of this spectacular beast was given by eyewitness accounts of the creature, although these were probably just several basking sharks swimming in a row (Compagno, 1984a). Belief in the creature was so certain that in 1808 a new species, Halsydrus pontoppidani, was described from the unusual skeletal remains of what was believed to be a sea serpent (Fairfax, 1998).
Although we now question the existence of sea monsters and know that the sea serpent Halsydrus pontoppidani was based on the remains of the basking shark, the biology of Cetorhinus maximus, like the myths that have surrounded it, still evokes a sense of mystery. How do these animals utilize their various environments and how are populations segregated both by sex and depth? What are the habitats of juveniles and pregnant females? The answers to these questions may be vital to future conservation efforts for a species that is perhaps more vulnerable to over-fishing than any other species of shark (Compagno, 1984a; Castro et al., 1999).

Biological adaptations in lamniform sharks I: Filter-feeding

Filter-feeding in modern sharks is restricted to three species: the two lamniform species, Megachasma pelagios and Cetorhinus maximus, and the whale shark, Rhiniodon typus (Orectolobiformes: Rhiniodontidae). The morphological and behavioral adaptations to filter-feeding in these sharks are remarkably different and hence support the idea that planktivory evolved more than once in lamniform sharks (Compagno, 1990b; Martin and Naylor, 1997; Morrissey et al., 1997). A comparison of filter-feeding in these sharks, including design of the feeding apparatus, behavioral adaptations and prey selection, is discussed below. In sharks, gill-rakers guard the entrance to the internal gill slits, preventing debris from damaging the delicate gill lamellae required for respiration. In filter-feeding sharks, the gill-rakers have been modified in various ways to assist in feeding. In M. pelagios each gill arch possesses approximately four rows of densely packed papillae-like (papillose) gill-rakers. These gill-rakers are small and slender, approximately 10-15mm long. Each raker is supported by a core of hyaline cartilage, is covered with numerous dermal denticles and is surrounded by a mucous membrane (Taylor et al., 1983; Yano et al., 1997b). This design creates a sticky sieve that traps organisms suspended in the water column as water passes through the internal gill slits. Unlike those of C. maximus, the gill-rakers of M. pelagios are not shed (Compagno, 1990b). Processes (10-20mm long) similar in morphology to the gill-rakers may also assist in filter-feeding (Yano et al., 1999). The branchial arches of C. maximus support numerous (1000-1300), thin, flexible, rod-shaped gill-rakers (Matthews and Parker, 1950) that are up to 11.5cm long (Fairfax, 1998).
Unlike those of other sharks, the gill-rakers of the basking shark are made of keratin, the same material that forms the baleen plates found in filter-feeding whales (Mysticeti; Fairfax, 1998). The bases of the gill-rakers are attached to the pharyngeal (inner) surface of the arch by elastic connective tissue and on the outer surface by a layer of muscle. This arrangement allows the rakers to assume an upright position when the mouth is open and return to a flattened position when the animal is not feeding (Matthews and Parker, 1950; Fairfax, 1998). Thickened epithelial mucous membranes are found at the base of the rakers, and mucus secreted by glands in this membrane helps trap food particles on the gill-rakers (Matthews and Parker, 1950; Fairfax, 1998). Papillae in the esophagus act as a valve for food entering the stomach (Fairfax, 1998), which is mostly filled with the mucus secreted by the membrane associated with the rakers (Matthews and Parker, 1950). Although it may seem that the feeding apparatus of the basking shark is similar to that of M. pelagios, the design of these systems is in fact very different, allowing for unique adaptations to filter-feeding (see below). C. maximus possesses a unique method of planktivory known as "ram filter-feeding" (Weihs, 1999). This strategy is dependent on the passive flow of water over the gills as the animal moves forward through the water column (Taylor et al., 1983; Weihs, 1999). The basking shark is the only known passive filter-feeder, with forward motion generating water flow over the gills (Duffin, 1998). As the shark swims slowly with its large mouth open (1m wide in an 8m specimen; Hallacher, 1977), it is capable of filtering very large volumes of water (at least 2215 cubic meters of water per hour).
The delicate feeding apparatus of the basking shark is designed to effectively filter large quantities of very small (primarily microscopic , although small fish may also be captured; Francis and Duffy, 2002) prey by combining a high flow rate of water with a highly proficient filtering apparatus (Taylor et al., 1983). This effective filtering mechanism may allow basking sharks to maximize feeding capabilities under adverse conditions, such as low plankton densities. Basking sharks may also maximize feeding by selective foraging. In a recent study conducted by Sims and Quayle (1998), basking sharks were found to locate and track specific patches of that possessed certain characteristics (i.e., high densities of zooplankton above a certain threshold; large species of copepod, specifically ). These sharks were also found to locate their prey by foraging along thermal fronts (characterized by unusual horizontal gradients in water temperature, that separate warm and cold waters; Sims et al., 2000) -conditions that are known to support high plankton populations (Sims and Qualye, 1998). Perhaps the most unusual device is that of the non-lamniform whale shark R. typus. Unlike, M. pelagios or C. maximus, the filtering apparatus of R. typus is not restricted to the margins of the internal gill slits. In this shark, an intricate arrangement of flattened, triangle-shaped parallel plates traverses the internal gill slits, forming a dense grid (Taylor et al., 1983; Compagno, 1984). Unlike C. maximus, the complex, dense network of plates found in the whale sharks prohibits the filtration of large volumes of water, and hence the capture of the small prey preferred by C.maximus (Taylor et al., 1983). However, since R. typus also employs a type of "suction filtration", the filtering apparatus of the whale shark, unlike that of Cetorhinus, is not dependent on the shark's forward movement through the water. R. 
typus often sits stationary in a vertical position near the surface of the water, and utilizes the expansion of the "bellows-like" pharynx to suck food into its large, slot-shaped, terminal mouth (Taylor et al., 1983). The shark is not limited to the small organisms utilized by Cetorhinus and can hence exploit a variety of food items including and small (Taylor et al., 1983; Compagno, 1984a). Other orectolobiform sharks employ a suction style of feeding, but only in the whale shark has it been adopted for filter-feeding (Compagno, 1984a). Both basking sharks and whale sharks are strong-swimming sharks that inhabit nutrient-rich waters. This is in stark contrast to M. pelagios, which, based on both anatomical characters (such as its soft, flabby musculature and poorly calcified skeletal elements) and tracking data, is an inactive shark that lives in deep-water habitats with limited nutrients (Taylor et al., 1983; Diamond, 1985; Nelson et al., 1997). How does the megamouth shark, which attains lengths of 5.5m, survive in these nutrient-depleted depths? While adaptations to the nutrient-poor environments inhabited by these sharks are still being investigated (for example, analysis of the intestine reveals an increased digestive capacity as compared to other filter-feeding elasmobranchs; Yano et al., 1997), two major theories have been proposed to answer this question. The first is that the megamouth possesses bioluminescent tissue in its mouth that is believed to lure prey at depth; the second is that these sharks are vertical migrators that follow plankton populations (Taylor et al., 1983; Diamond, 1985). Since feeding in megamouths has never been witnessed directly, inferences on this subject must be made through anatomical studies. M. pelagios possesses a jaw articulation designed to substantially increase the volume of the pharynx while creating a significant anterior protrusion of the jaw (Compagno, 1990b). 
The sparse distribution of dermal denticles on the throat of Megachasma also supports the idea that the throat of this shark is easily expandable (Nakaya et al., 1997). Given this information, Compagno (1990b) suggests that megamouths feed by swimming slowly through the water with their mouths open and then quickly snapping their jaws forward while expanding the throat, thus creating suction that forces water into the mouth. The reverse of this action would force water out through the gills, which are guarded by the elaborate gill-rakers, thus filtering any organisms, including krill (Euphausiacea), copepods and present in the water (Berra and Hutchins, 1990; Compagno, 1990b). Prey is believed to be concentrated in front of the mouth by the iridescent, possibly bioluminescent, tissue in the mouth (Taylor et al., 1983; Diamond, 1985; Compagno, 1990b). Bioluminescence in megamouths, however, is still speculative, since analyses of tissue samples, while promising, are still inconclusive (Compagno, 1990b) and the only study conducted on a living megamouth showed no evidence of luminescence, even at night (Lavenberg, 1991). Nakaya et al. (1997) and Nakaya (2001) state that bioluminescent tissue may not be the only mechanism utilized by megamouths to maximize feeding potential. These authors note that all megamouths captured to date possess a prominent white band along the margin of the upper jaw that is only visible when the upper jaw is protruded. When exposed, the white band would be in spectacular contrast against the black skin of the head and the darkness of the . Since this band is associated with movements of the jaw, and is situated in front of the mouth, it is believed that its function may be associated with feeding and that it may serve to attract the bioluminescent shrimp found in the stomachs of these sharks (Nakaya et al., 1997; Nakaya, 2001). 
The first, and only, tracking experiment (see above) conducted on a megamouth shark revealed that, as previously suspected (Taylor et al., 1983), megamouth sharks are vertical migrators (Nelson et al., 1997). The depth changes displayed by the shark at first appeared to be influenced by changes in light levels. However, since changes in illumination are used as cues by vertically migrating euphausiid shrimp (a primary food source for Megachasma), it is believed that these sharks may simply respond to prey availability (Nelson et al., 1997). Thus Megachasma is found at shallow depths in the evening, when concentrations of shrimp are highest (Nelson et al., 1997). The morphological and behavioral adaptations displayed by the three species of filter-feeding sharks are indeed remarkable. Each species of shark has developed its own unique method of exploiting a food source underutilized by chondrichthyan fishes. The variety in both the design and usage of the filtration apparatus, as well as the various behavioral mechanisms utilized to maximize feeding, are truly some of the most unusual traits seen in living elasmobranchs.

Family Mitsukurinidae

In 1897, Kakichi Mitsukuri, a distinguished professor of Zoology at the University of Tokyo, brought a specimen of a rather unusual shark to the International Conference in the . The animal was obtained from a Japanese fisherman by Captain Alan Owston, who immediately recognized the uniqueness of the specimen. The shark was subsequently described by David S. Jordan and named Mitsukurina owstoni in honor of Professor Mitsukuri and Captain Owston (Jordan, 1898; Bean, 1905). The "grotesque appearance" (Dean, 1903) was indeed remarkable: the snout formed a "long, flat, flexible, leaf-like blade" which protruded well in front of the head, and the "thin, flexible and papery" fins were attached to an elongate body in which the "flesh and skeleton [were] extremely limp, folding like a wet rag" (Jordan, 1898). The type specimen, a 107cm male believed to be immature, was designated as a new genus in a new family, the Mitsukurinidae. It was considered of "lamnoid affinities" and related to the genera Carcharias and Odontaspis (Jordan, 1898). The shark was also remarkable since it resembled extinct sharks in the genus Scapanorhynchus (Dean, 1903). This resemblance resulted in a debate over the correct genus for the living species (Mitsukurina versus Scapanorhynchus) until examination of fossil Scapanorhynchus by Cappetta (1980) supported their separation into two distinct genera (Chapter Three). An exhaustive search among Japanese fishermen failed to produce a second specimen of M. owstoni, and indeed it appeared that no one had even heard of the shark (Jordan and Fowler, 1903). Several years later, a large (3.53m) female was caught, supporting the notion that the type specimen was indeed immature. It appeared that the shark was quite common in one Japanese fishing community, in which it was locally known as tengu-zame, the goblin or elfin shark (Jordan and Snyder, 1904). 
The area was considered a possible breeding ground since predominantly female sharks were captured, but only during the spring. Remarkably, fishermen only four miles away had never heard of the shark (Jordan and Snyder, 1904). Early records of Mitsukurina showed the variation in length within this species. The largest recorded specimen of M. owstoni is 3.84m (Stevens and Paxton, 1985; Compagno, 2001), while the smallest record, approximately 1m, is comparable in size to the type specimen (Stewart and Clark, 1988). Males and females are mature at around 2.6m and 3.4m, respectively, although due to the paucity of specimens, both sexes may attain maturity before this (Compagno, 2001). Pregnant females have yet to be recorded (Compagno, 2001). Although four species of Mitsukurina have been named from extant specimens (M. owstoni, M. dofleini, M. nasutus, M. jordani; Hussakof, 1909), it appears that these all represent the same species, M. owstoni (Bass et al., 1975). Although M. owstoni has been known since the late 1800's, it is still poorly known scientifically. It is a very deep-water shark, usually caught between 300 and 1300m, and as such is not often seen (Duffy, 1997). Goblin sharks are occasionally, though rarely, reported in shallower water (Duffy, 1997; Ugoretz and Seigel, 1999). Goblin sharks are usually light reddish gray to brownish in color, with dark brown fins (Jordan and Fowler, 1903). Certain smaller specimens (juveniles) were reported as having a conspicuous white spot on the dorsal surface of the head consistent with the presence of a "pineal window", a possible light-detecting common in deep-sea species (Duffy, 1997). This feature is not reported in adult specimens. Two specimens have been recorded with unusual color patterns consisting of skin that was either white to light purplish-gray and semi-transparent (Uyeno et al., 1976), or transparent with the underlying blood vessels giving the specimen a pink hue (Duffy, 1997). The diet of M. 
owstoni is poorly known, and the species has never been observed feeding. Feeding mechanisms are inferred indirectly from its anatomy and stomach contents. The body musculature is weak and the caudal fin lacks a well-developed ventral lobe, which indicates that the shark is a poor swimmer (Duffy, 1997). The teeth are specialized for grasping (Bass et al., 1975; Castro, 1983) and fit tightly together when the jaw is closed, thus preventing small prey from escaping (Duffy, 1997). The are very small, and the snout is covered with sensory pits () specialized for (Bass et al., 1975). The function of the elongate snout of goblin sharks is unknown, and it has been suggested that it may be used for detecting prey buried in the ooze at the bottom of the ocean (Bass et al., 1975; Castro, 1983). However, an alternative feeding strategy in which this sluggish shark feeds in the water column has also been proposed. The long blade-like snout of Mitsukurina extends far beyond its highly protrusible jaws, and hence may detect prey that passes in front of the mouth (Duffy, 1997).

It is thought that M. owstoni feeds by remaining motionless or near-motionless in the water column, waiting for prey to swim close by. When prey is detected with the snout, the jaws are rapidly protruded forward using specializations in the branchial arches and associated musculature (Compagno, 2001). The throat is thin and pliable and forms a pelican-like pouch, which, together with the highly protrusible jaws, aids the shark in gulping prey (Compagno, 2001). Rapid projection of the jaws is accompanied by depression of the tongue, which expands the volume of the buccal cavity and creates suction that draws in prey. The jaws then shut tight with prey caught in the zipper-like occlusion of the teeth (Duffy, 1997; Compagno, 2001). Significant expansion of the buccal cavity suggests that M. owstoni has the ability to engulf large prey; in mesopelagic , similar features are considered adaptations to low-prey environments (Duffy, 1997). Bass et al. (1975) suggested that, based on anatomical features, Mitsukurina fed on squid and fish; stomach contents from several goblin sharks verified this assumption (Stevens and Paxton, 1985; Duffy, 1997; Ugoretz and Seigel, 1999). Duffy (1997) found , mesopelagic and teleosts, indicating that this shark feeds in mid-water. These prey items include species that vertically migrate (Duffy, 1997), which may explain why M. owstoni is occasionally found in shallow water. Goblin sharks display irregular distribution patterns in the Pacific, Atlantic, and Indian Oceans, with catches from only a few known localities, such as Japan, , , Guyana, South , California, , and New Zealand (D'Aubrey, 1969; Glover, 1976; Piotrovskiy and Prut'ko, 1977; Castro, 1983; Noden, 1984; Stewart and Clark, 1988; Duffy, 1997; Ugoretz and Seigel, 1999). The majority of goblin sharks are caught in Sagami Bay, Japan (Stevens and Paxton, 1985; Duffy, 1997). Fossil remains of M. 
owstoni are recorded from the , but it is not known to exist there today (Cigala Fulgosi, 1986).

Family Pseudocarchariidae

In 1936, Professor Toshiji Kamohara noticed an unfamiliar shark on a trip to a Japanese fish market. This shark, known as Mizu-wani, or crocodile shark, by the local fishermen, was formally named and described in his honor as Carcharias kamoharai by Matsubara (1936). In subsequent years new species of crocodile sharks were described, including Carcharias yangi from and Carcharias pelagicus in . However, D'Aubrey (1964) synonymized these species under Carcharias kamoharai. Compagno (1973) disagreed with the inclusion of C. kamoharai in Carcharias, and raised it to its own genus, Pseudocarcharias (previously used as a subgenus), and its own family, Pseudocarchariidae, based on morphological criteria (Chapter Two).

P. kamoharai is the smallest living lamniform species, with a maximum known length of 1.1m (Last and Stevens, 1994). Size at birth is about 40cm, and males and females mature at around 74cm and 89cm, respectively (Matsubara, 1936; Last and Stevens, 1994; Compagno, 2001). The crocodile shark has huge eyes, gill openings that extend onto the dorsal surface of the head, a large mouth lined with long and slender teeth, and jaws that are strong and highly protrusible (Compagno, 2001). The body is more slender than in other lamniforms. It is counter-shaded, with a dark brown dorsal surface and pale belly, sometimes with dark blotches on the sides and ventral surface and fins with white or translucent edges (Last and Stevens, 1994). The dorsal color has also been described as gray-brown to "muddy lilac" (Romanov and Samarov, 1994). A prominent white blotch behind the mouth has been described in some specimens (including the type specimen) and , especially those from the Pacific Ocean (Matsubara, 1936; Abe et al., 1969; Bass et al., 1975), but not in other specimens (D'Aubrey, 1964). The color pattern displayed by crocodile sharks indicates that this is an epipelagic shark (Compagno, 2001), since counter-shading is beneficial only in areas penetrated by light. This habitat is also supported by the fact that it is frequently caught on pelagic longlines (Compagno, 2001). The large, non-reflective eyes suggest nocturnal behavior and, as in megamouths, crocodile sharks may exhibit diel vertical migration patterns (Compagno, 2001). The feeding habits of the crocodile shark are poorly known. The strong jaws and grasping dentition, combined with the strong musculature and large caudal fin, suggest that this is an active species (Compagno, 2001). It will jump out of the water to seize bait, and is aggressive when captured, snapping its jaws vigorously (Compagno, 2001). Thus, it appears that the crocodile shark is an active swimmer capable of chasing prey. Bass et al. 
(1975) suggested that the teeth were designed for grasping, not cutting, so it probably fed on small prey, such as squid and small . This was confirmed by stomach contents, which show that it feeds on small fish, shrimp and squid (Compagno, 2001). P. kamoharai has a widespread but patchy distribution in the world's oceans. It has been recorded in the epipelagic and mesopelagic zones of tropical and subtropical waters of the Pacific, Atlantic and Indian Oceans (Last and Stevens, 1994; Romanov and Samarov, 1994). This species is caught between the surface and a depth of 590m, and though usually found in deep oceanic waters, it is occasionally found inshore (Last and Stevens, 1994). It has been found as far north as Japan and and as far south as , and New Zealand (Matsubara, 1936; D'Aubrey, 1964; Long and Seigel, 1996; Bearez, 2001; Stewart, 2001).

The crocodile shark is of no commercial value, and is a common bycatch of fishing expeditions that target tuna and squid (Compagno, 2001); in fact, Pseudocarcharias may be prone to capture by the methods used by such commercial fishermen, since it uses its strong, wide jaws to grab bait off longlines. Even though it is frequently caught in some areas, the biology of the crocodile shark is still poorly known. Since these sharks are small and not considered edible, crocodile sharks are often regarded as "trash fish" (Abe et al., 1969). As "trash fish", crocodile sharks are discarded and not recorded, and so the number of these sharks caught may be significantly understated (Hazin et al., 1989). In addition, since crocodile sharks are often misidentified (Compagno, 2001), capture records may also be biased when Pseudocarcharias is mistaken for other sharks (Hazin et al., 1989). Since it is unknown exactly how many crocodile sharks are caught and killed annually, the of this shark remains unknown. Due to the lack of data and the risk factors listed above, P. kamoharai was listed as Lower Risk (Near Threatened) on the Red List of the IUCN Shark Specialist Group (Castro et al., 1999; Compagno, 2001). Ironically, the liver of crocodile sharks is very large and rich in squalene (Abe et al., 1969); hence these sharks may actually be commercially valuable (Compagno, 2001). Their small size (cited as one of the reasons for their "trash fish" status) may make them amenable to captivity, which, if so, would significantly enhance the opportunity to observe this poorly known shark (Compagno, 2001).

Family Odontaspididae

The family Odontaspididae contains two extant genera with three living species: Carcharias taurus (sand tiger), Odontaspis ferox (ragged-toothed shark or small-toothed sand tiger), and O. noronhai (big-eyed sand tiger). The sand tiger, C. taurus, is the only lamniform to be successfully kept in captivity, and is one of the more common sharks kept in public aquaria. As a result, C. taurus is one of the best known of all shark species, and its behavior has been observed extensively. In captivity, they can live at least 13 years (Pollard, 1996). At the opposite extreme, the big-eyed sand tiger, O. noronhai, is arguably the most poorly known lamniform species, since it is known only from a handful of specimens. C. taurus and O. ferox were both named in the early 1800's, whereas O. noronhai was not named and described until 1955. The generic placement of C. taurus has caused significant controversy. Although the genus Carcharias was named before Odontaspis, the name Odontaspis came to be more widely used, and so a group of British paleontologists (E.I. White, W. Tucker, N.B. Marshall) argued that the name Odontaspis should replace Carcharias as the valid genus name for sand tigers. The International Commission on Zoological Nomenclature (ICZN) agreed, and the name Carcharias was suppressed in favor of Odontaspis. However, morphological analysis of these species subsequently revealed that O. taurus and O. ferox did not belong in the same genus. The species taurus was therefore put in a separate genus, but different authors disagreed over what this genus should be called. In the literature, this species has been called Eugomphodus taurus and Synodontaspis taurus, as well as Carcharias taurus and Odontaspis taurus. In 1987, the ICZN reinstated Carcharias as a valid genus (in response to a petition by Leonard Compagno and William Follett), and so C. taurus became the valid name for the sand tiger, and is used here. This is relevant to the fossil record (Chapter Three), as fossil odontaspidids thought to be related to C. taurus have been put in the genera Eugomphodus and Synodontaspis.

Carcharias

Carcharias taurus is a large, stocky shark with a very "toothy" appearance, since the teeth clearly stick out from the jaws. Although the maximum known length of this species is 4.3m, most specimens are less than 3.4m (Krogh, 1994; Pollard, 1996). It reaches sexual maturity at around 1.9m for males, and 2.2m for females (Gilmore et al., 1983; Branstetter and Musick, 1994; Compagno, 2001). The eyes are smaller than those of other known odontaspidids. C. taurus is light brown in color, darker dorsally than ventrally, and often with dark reddish or brownish spots along the body (Compagno, 2001). Groups of C. taurus feed cooperatively, surrounding and rounding up schooling fish (Smith and Pollard, 1999). These sharks thrash their tails (known as "tail-thumping") in an effort to scare and confuse these fish (Pollard, 1996). Groups of C. taurus may exhibit a feeding hierarchy, with the dominant individuals feeding first (Pollard, 1996). Hydroids are occasionally observed growing on their teeth, and from this it has been inferred that they are not feeding during these times (Pollard, 1996). C. taurus feeds on a wide range of teleosts, smaller sharks and rays, squids and (Pollard, 1996). Larger C. taurus eat proportionally less food than smaller and younger individuals in captivity, suggesting that the growth rate is most rapid in juveniles, and decreases into adulthood (Schmid et al., 1990; Branstetter and Musick, 1994). These results are similar to those observed for wild C. taurus (Schmid et al., 1990). C. taurus is a wide-ranging species in temperate and subtropical coastal waters of the Atlantic Ocean, Mediterranean Sea, and western Pacific Ocean (Compagno, 1990a; Pollard, 1996). It is common (but see below) inshore along the continental of southeastern , Australia (but not Tasmania), and South Africa (Compagno, 1990a). Worldwide migration patterns are not understood (Smith and Pollard, 1999). 
It is generally found at depths between 15 and 25m, although occasionally it is reported as deep as 200m (Smith and Pollard, 1999). C. taurus is often found in very shallow water less than 4m deep (Castro, 1983). These sharks are often reported as hovering motionless in shallow water during the day. This is accomplished by their excellent buoyancy control: they rise to the surface to gulp air, and this helps them maintain neutral buoyancy. This is very unusual for sharks, and unique among Lamniformes (Feldmeth and Waggoner, 1972). These sharks appear to be more active at night, when they are seen swimming slowly but strongly through the water. The sand tiger is a very fierce-looking shark, and this is one of the reasons it is a popular feature in public aquaria. Unfortunately, because of their formidable appearance, sand tigers were once assumed to be man-eaters, and hunted. In the past, C. taurus was the target of sportfishing, and was shot with spears and explosive powerheads, mostly in Australia and South Africa (Compagno, 1990a). In reality, these sharks are sluggish and docile. They are easily killed since, as mentioned above, they often hover motionless in shallow, inshore water (Pollard, 1986). Large numbers of these sharks are known to congregate, particularly during mating season, and as many as 80 sand tigers were spotted clustered together off South Africa (Castro et al., 1999; Smith and Pollard, 1999). All these factors made C. taurus an easy target. The low birth-rate also makes this species vulnerable to over-fishing (Branstetter and Musick, 1994; Castro et al., 1999). When its non-aggressive nature was realized, C. taurus became even more attractive as a trophy kill, since it was unlikely to put up a fight (Pollard, 1986). Divers noticed that numbers of C. taurus were declining, and in 1984, the State of New South Wales in Australia officially protected C. 
taurus, banning hunting and sportfishing, and requiring aquaria to obtain a permit to capture and display them (Pollard, 1986). Thus C. taurus became the first shark species in the world to be protected. C. taurus is now protected across all of Australia (Pepperell, 1992). C. taurus populations in other parts of the world were also in decline. The United States witnessed a steep decline in sand tiger sharks during the 1990's. Reductions in the numbers of sand tiger sharks were estimated to be as high as 75%, and populations disappeared completely off North Carolina and Florida (Pollard, 1986). So in 1997, the U.S. prohibited all directed (recreational and commercial) fishing along the Atlantic Coast (Castro et al., 1999). Populations were also dwindling in other areas, such as South Africa (Smith and Pollard, 1999). In 1996, C. taurus was listed by the IUCN as vulnerable to . Although hunting and fishing are prohibited, their numbers may still be adversely affected through accidental capture, as a result of bycatch and entanglement in nets designed to protect public beaches from dangerous shark species (Smith and Pollard, 1990; Krogh, 1994).

Odontaspis

The genus Odontaspis contains two species, O. ferox and O. noronhai, of which the former is by far the better known. The two species can be distinguished by dentition (Chapter Two), color and by the larger eyes in O. noronhai. O. ferox is lighter in color, often gray or gray-brown and lighter ventrally, with dark spots on its sides, but without a white tip on its first dorsal fin; whereas O. noronhai is a dark reddish color and often has a prominent white "blotch" on the tip of its first dorsal fin. Young specimens of O. ferox possess black margins on their fins that are absent in adults (Compagno, 1990; Villavicencio-Garayzar, 1996). Odontaspis species are similar in appearance to C. taurus, with the same fierce "toothy" look (hence the species name ferox). Sand tigers of the genus Odontaspis are also large, bulky sharks, but with a deeper belly than observed in C. taurus, due to their much larger liver (see below; Abe et al., 1968; Garrick, 1974). Early descriptions (1800's to early 1900's) of O. ferox were inaccurate and include significant variation in count, position of dorsal fins, and color (gray versus red with spots). Color variation within the species led to the description of a new species, O. herbsti, which was later synonymized with O. ferox (Daugherty, 1964; Compagno, 1984; Bonfil, 1995). O. ferox can reach up to 4.1m in body length, is about 1m at birth, and matures at 2.75m in males and at least 3.64m in females; these sharks presumably reach sexual maturity below this size (Seigel and Compagno, 1986; Compagno, 2001). Three captured adult male specimens of O. noronhai were 3.2-3.4m, while one female was immature at 3.21m, and another was adult at 3.26m (Sadowsky et al., 1984; Branstetter and McEachran, 1986; Compagno, 2001). Only six specimens of O. noronhai have been collected in total, of which only four are complete (Shimada, 2001). The type specimen (a 1.7m female) was caught off (northeast Atlantic; Maul, 1955). O. 
noronhai is known from southern Brazil, the Gulf of , Hawaiian Islands, and possibly the Seychelles and South China Sea (Sadowsky et al., 1984; Branstetter and McEachran, 1986; Humphreys et al., 1989). The head of one of the specimens collected off Brazil was estimated to come from an individual that measured around 3.6m, which is the largest recorded size for this species (Sadowsky et al., 1984). Stomach contents for O. ferox indicate that it feeds on small bony fishes, squid, crustaceans and rays (Compagno, 1984, 1990a). The diet and feeding habits of O. noronhai are very poorly known, but stomach contents from one individual included a squid beak and of fish (Branstetter and McEachran, 1986). A male specimen caught in deep water off Hawaii was very aggressive (Humphreys et al., 1989), suggesting that the shark may be an active predator. The teeth of O. ferox and O. noronhai are smaller, and less specialized for cutting than those of C. taurus, while the posterior teeth are less specialized for crushing compared to C. taurus (Compagno, 2001). These features suggest that Odontaspis species feed on smaller, less active and softer-bodied prey than C. taurus (Compagno, 2001). The longer body cavity of both Odontaspis species has a very large, oily liver, full of squalene (Abe et al., 1968; Branstetter and McEachran, 1986), that is believed to assist in maintaining neutral buoyancy (Compagno, 2001). O. ferox is a relatively uncommon deep-water species with a wide-ranging but spotty distribution in the world's oceans (Daugherty, 1964; Abe et al., 1968; D'Aubrey, 1969; Garrick, 1974; Gubanov, 1985; Galvan-Magana et al., 1989; Compagno, 1990a; Bonfil, 1995; Villavicencio-Garayzar, 1996). O. ferox is apparently absent from the southeastern Atlantic and southeastern Pacific Oceans (Bonfil, 1995). Although first described in the Mediterranean, it appears to be uncommon there; but this may be due to the fact that it mostly lives in deep water (Daugherty, 1964). The spotty distribution of O. ferox may be due to a preference for certain habitats, or due to the paucity of deep-water fishing expeditions (Bonfil, 1995). The second species, O. noronhai, has an even spottier distribution (see above); although it is reported from all three major oceans, its presence in the Indian Ocean is questionable. Aside from deep-water catches, O. ferox specimens are sometimes caught in open pelagic waters, especially during early morning and late evening, suggesting possible diel vertical migrations (Bonfil, 1995). 
These sharks are usually found at a depth of 250-300m (Kobayashi et al., 1982), but some specimens have been taken between 12 and 15m and close to shore (Daugherty, 1964). O. noronhai is also considered a deep-water shark (at least 450m; the type was caught at a depth of 800-1000m), but it has also been captured in water less than 100m deep (Branstetter and McEachran, 1986; Shimada, 2001). The capture of one O. noronhai specimen (Marshall Islands) at night at a depth of 75m over extremely deep oceanic water (4500-5300m deep) may suggest diel vertical migration in this species as well (Compagno, 2001). The very large eyes and uniform dark coloration of O. noronhai are also consistent with a deep-water habitat (Compagno, 2001). Most of the recorded catches of O. noronhai are from southern Brazil, where it is known to local fishermen (Sadowsky et al., 1984; Amorim et al., 1998). However, it is only caught off Brazil in the southern spring season, suggesting seasonal migration (Sadowsky et al., 1984). O. ferox is fished for meat, squalene and fins in some parts of the world, but does not appear to be exploited (Compagno, 1990a; Castro et al., 1999). O. noronhai is of no commercial value (Compagno, 1990a), and it probably lives too deep to be frequently caught by commercial fisheries (Compagno, 2001).

Family Alopiidae

The family Alopiidae (thresher sharks) contains only one living genus, Alopias. Thresher sharks have been known to since ancient times; the philosopher Aristotle (384-322 B.C.) was familiar with thresher sharks, and described their behavior (Gruber and Compagno, 1981). Only three extant species are currently recognized: the (A. vulpinus), the big-eyed thresher (A. superciliosus), and the (A. pelagicus). Threshers are considered harmless to humans, although there are two reports of an undetermined Alopias species attacking boats (Bass et al., 1975). Threshers are best known for their greatly elongated tails, which can be as long as the rest of the body. Since only the upper caudal lobe is elongated, the tail is highly asymmetrical in lateral view. Threshers are believed to use this specialized tail to whip and stun (or kill) their prey (Gubanov, 1972; Gruber and Compagno, 1981). Gubanov (1972) provided evidence for this theory by noting that 97% of A. vulpinus specimens were hooked by the tail when attempting to strike live bait. The same result was also observed for A. superciliosus (Stillwell and Casey, 1976). In addition to attacking individual fish, threshers are known to feed on schools of fish. They confuse these fish by thrashing their tails at the surface and leaping out of the water. The shark then corrals the fish by swimming in a circle until it concentrates its prey, at which point it feeds by actively engulfing the fish huddled in the confused mass (Bass et al., 1975). At least one species (A. vulpinus) has been reported to act cooperatively in herding prey (Compagno, 2001). In addition to feeding on small species of schooling fish, threshers are known to capture a variety of prey (Bass et al., 1975). For example, A. vulpinus eats squid, octopus and pelagic crustaceans and is also known to whip sea with its tail, and eat them (Bass et al., 1975; Compagno, 2001). A. 
superciliosus appears to have a more varied diet that includes a higher proportion of bottom-dwelling prey, as well as pelagic fishes and especially squid (Stillwell and Casey, 1976; Compagno, 2001). A. superciliosus has also been known to eat small elasmobranchs (Bass et al., 1975; Compagno, 2001). Feeding in A. superciliosus may also be augmented by its greatly enlarged eyes (hence the name big-eyed thresher) that can roll up onto the dorsal surface of the head (Gruber, 1980). This adaptation is unique among threshers, and is thought to increase the visual field of this shark, allowing it see prey directly above it and strike from below (Gruber, 1980; Gruber and Compagno, 1981). A. superciliosus is also characterized by a V-shaped, horizontal groove (the nuchal groove; see 24

Chapter Two), which is also found in certain teleost fish; it is believed that these grooves might make the shark more hydrodynamic, and improve maneuverability (Gruber and Compagno, 1981).

The three Alopias species differ in size. The largest species is A. vulpinus, which measures up to at least 5.7m, with unconfirmed reports of individuals measuring 7.6m long (Compagno, 2001). These may be extreme examples, however, since most individuals are around 4.0-4.9m (Bass et al., 1975). A. pelagicus, which attains a total body length of less than 3.7m, is the smallest thresher (Compagno, 2001), while A. superciliosus achieves a body length of at least 4.6m (Compagno, 2001). Size at birth is around 1.1 to 1.6m for A. vulpinus; 1.0 to 1.4m for A. superciliosus; and 1.3 to 1.6m (possibly up to 1.9m) for A. pelagicus, despite the latter being the smallest species as an adult (Bigelow and Schroeder, 1948; Bass et al., 1975; Compagno, 2001). All three species grow slowly, and reach maturity at a large body size and a late age (e.g., for A. superciliosus, 12.3-13.4 years old for females and 9-10 years for males, at a body length of 3.5m and 2.7-2.9m, respectively; Stillwell and Casey, 1976; Liu et al., 1997).

Size at maturity and number of offspring within particular Alopias species vary according to location (Gubanov, 1972; Caillet and Bedford, 1983; Liu et al., 1997; Compagno, 2001), which may indicate separate populations. Eitner (1995) examined allozymes from threshers in various localities. This study found that specimens believed to be A. superciliosus from Baja California were significantly different from other A. superciliosus specimens. For example, 8 out of 13 loci were different in the Baja specimens, with four of these eight sites completely unique (autapomorphic). One particular locus was missing from all but one of the eight Baja California specimens. 
Eitner attributed this high degree of genetic divergence to the presence of an unrecognized Alopias species off Baja California.

A. vulpinus has the most extensive geographical distribution of the threshers (but see below). It is found in oceanic and coastal waters, and although most common in temperate waters, A. vulpinus is also found in tropical and cold waters. It is found throughout all major oceans (as well as the Mediterranean), as far north as the shores of and Scandinavia, and as far south as Patagonia and New Zealand (Gubanov, 1972; Compagno, 2001). A. vulpinus appears to be the only thresher that can tolerate such temperature variation. A. vulpinus is known to have seasonal migrations off the west coast of North America (Compagno, 2001).

After A. superciliosus was first named and described in 1839 (off Madeira, southwest ), it was not reported again for 100 years. It was therefore believed that this was a very rare species of shark, possibly due to its preference for deep water (Nakamura, 1935). However, although A. superciliosus can go down to a depth of at least 500m (Gruber and Compagno, 1981), it, as well as all Alopias species, can be found close to the surface in shallow inshore waters (Bass et al., 1975; Compagno, 2001). In addition, the increase in for , which has a similar habitat to A. superciliosus, has resulted in the frequent capture of this shark (especially in Russia, Japan and ) indicating that it is abundant with a wide distribution (Gruber, 1980). As with A. vulpinus, the distribution of A. superciliosus is also circumglobal, but it favors tropical and temperate ; thus, although found in the Atlantic, Pacific and Indian Oceans (and the Mediterranean), its range does not extend as far north or south as that of A. vulpinus (Bass et al., 1975; Cigala Fulgosi, 1983; Compagno, 2001). A. 
pelagicus has the most restricted range of the three threshers, and is found only in the Indian and Pacific Oceans, where it favors tropical latitudes (Compagno, 2001).

Unfortunately, despite their wide distribution, all three Alopias species are either threatened or endangered. A. vulpinus was once exploited for oil (the liver is rich in Vitamin A), but now this shark is fished for its flesh and fins (Compagno, 1990a). The flesh and fins of A. superciliosus are considered of poorer quality than those of A. vulpinus, and are usually discarded, but are marketed in some countries, such as Taiwan, where this species makes up as much as 13% of the total annual shark catch (Liu et al., 1997). A. vulpinus is also a prized sports fish, popular with anglers because it puts up a fight (Compagno, 1990a). All three thresher species are often caught as bycatch, particularly by swordfish and tuna fisheries (Castro et al., 1999). Threshers are especially vulnerable to over-fishing, due to their slow growth, low fecundity (litter of two pups per brood for A. superciliosus and A. pelagicus, and four-to-six per brood for A. vulpinus) and their frequency as bycatch (Castro et al., 1999).

Annual catches of Alopias species are poorly known. This is compounded by the fact that A. vulpinus and A. pelagicus are easily confused. However, there is one example where over-fishing was found to adversely affect populations. In Californian waters, threshers and other sharks were targeted for meat in the 1970s as an alternative to red meat. The use of

driftnets dramatically increased the efficiency of captures: for example, in 1982, 1059 metric tons (129,000 pounds) of A. vulpinus were caught (Caillet and Bedford, 1983). As a result of such intense pressure, the industry collapsed due to the swift decline of A. vulpinus. Legislation was passed in 1986 that limited the targeted fishing of threshers to the month of May every year and although 50% of threshers are caught during this period, the remaining 50% represent bycatch. In addition, both catches consist mostly of immature (one- or two-year-old) A. vulpinus; hence, off California, where this species was once abundant, A. vulpinus may be facing (Castro et al., 1999).

Family Lamnidae

The family Lamnidae includes five living species in three genera: Lamna, Isurus, and Carcharodon. This group includes fast and powerful predators, and all species seem to have strategies to avoid cannibalism of juveniles. Endothermy has been demonstrated for all members of this group (see below).

Lamna

The genus Lamna contains two species: the porbeagle (L. nasus) and the salmon shark (L. ditropis). Although L. nasus was described over 200 years ago, L. ditropis was not named until 1947. While there have been several accounts of Lamna circling and bumping boats, as well as charging at divers, there have been no reports of attacks on humans (Paust and Smith, 1986; Compagno, 2001). This fact, however, may be influenced by their habitat preference (see below), as these are certainly large and powerful predators. Both species are commonly caught, but their biology is not well known.

Lamna species are robust, stocky sharks that are sometimes confused with Carcharodon due to their outward appearance. Like the great white, both Lamna species are gray over most of the body, but with a white underbelly. The two species differ in color: the free rear tip of the first dorsal fin of L. nasus is white, whereas in L. ditropis the area above the pectoral fin base is white, and the ventral surface of both the head and abdomen display "dusky blotches" (particularly in North American specimens, although Southern Hemisphere specimens may also display blotches). Lamna are also smaller than Carcharodon: both L. nasus and L. ditropis have a maximum length of around 3m, although specimens up to 3.7m (and possibly up to 4.3m for L. ditropis) have been reported but not confirmed (Paust and Smith, 1986; Compagno, 2001). Size at birth is 60-80cm for L. nasus, and slightly smaller for L. ditropis (40-85cm; Last and Stevens, 1994; Compagno, 2001). For L. nasus, size at maturity is 2.0-2.5m for females and 1.5-2.0m for males (Francis and Stevens, 2000; Compagno, 2001). For L. ditropis, females mature at around 2.2m and males mature at around 1.82m (Compagno, 2001). Growth in L. nasus is considered very rapid, especially in the first few years, and both Lamna species are considered to have long life spans of between 20 and 30 years (Francis and Stevens, 2000; Compagno, 2001). 
Lamna prefer cool to cold waters. These sharks can live in low temperatures (down to 1-3°C) due to their endothermic adaptations. In fact, L. ditropis possesses the highest known body temperature of any shark, and can raise its body temperature 13.6°C above ambient temperature (see below). Both L. nasus and L. ditropis are primarily epipelagic, but are occasionally found in inshore shallow water, as well as deep water. For example, L. nasus has been reported in water 1200m deep, while L. ditropis can be found at a depth of at least 225m (Amorim et al., 1998; Compagno, 2001). A juvenile L. nasus was even found in a coastal lagoon in Argentina, indicating a tolerance for in this species (Lucifora and Menni, 1998).

Although both species prefer cold waters, they do not have overlapping distributions. L. nasus is found in the North Atlantic, as far north as Greenland and above Scandinavia, but only as far south as (Compagno, 2001). It is also found across a circumglobal belt that includes the southern regions of the Pacific, Atlantic and Indian Oceans (Compagno, 2001). This southern belt is very restricted and extends as far south as the southern tip of South America, but only as far north as the southern parts of Brazil, Australia and South Africa (Compagno, 2001). L. ditropis is found only in the North Pacific, as far south as Japan and southern California, and as far north as the Alaskan and Siberian coastlines (Bering Sea; Compagno, 2001).

Although tolerant of cold water, both species appear to migrate seasonally in a north-south direction in order to avoid temperature extremes found in these areas (Paust and Smith, 1986; Nakano and Nagasawa, 1996). These migrations may be extensive: for example, L. ditropis migrates from waters in Japan to the Bering Sea, a distance of 3230 km (Blagoderov, 1994; Compagno, 2001). Both species exhibit segregation by both size and sex (Blagoderov, 1994; Ellis and Shackley, 1995; Nakano and Nagasawa, 1996; Nagasawa, 1998). 
Adults are found in the coolest waters of their respective ranges. Sexual segregation is especially strong in L. ditropis, in which males are found in the western North Pacific, whereas females are found in the eastern North Pacific. It is possible that breeding grounds and feeding grounds are separate: males may not enter the breeding grounds, while females do not feed in such areas (Paust and Smith, 1986). In addition to size segregation, this is believed to be another way of preventing cannibalism of juveniles (Lamna species do not appear to be cannibalistic, as juveniles are not found in the stomachs of adults; Paust and Smith, 1986). Very little is known of population structure in Lamna, but it is thought that L. nasus in the western and eastern North Atlantic comprise two separate populations (Compagno, 2001).

In addition to avoiding temperature extremes, Lamna migrations may be guided by prey availability (Paust and Smith, 1986). For example, it has been suggested that L. nasus migrates to follow schools of Atlantic , whereas L. ditropis follows salmon populations (Paust and Smith, 1986; Nagasawa, 1998). Indeed, L. ditropis acquired its common name of the salmon shark because it is believed to be the major predator of salmon in the Pacific Ocean (Paust and Smith, 1986; Nagasawa, 1998). L. ditropis may even target specific species of salmon, such as sockeye salmon and chum (Paust and Smith, 1986). Not all researchers believe that L. ditropis shows such a strong preference for salmon. Blagoderov (1994) found that the distribution of L. ditropis in the Bering Sea was outside the range of migrating salmon, and these sharks were instead feeding on and herring. In addition to targeting specific fish species, Lamna will also feed on a wide variety of benthic and pelagic teleosts, and occasionally dogfish (Squalus acanthias) and squid (Ellis and Shackley, 1995; Nagasawa, 1998; Compagno, 2001). Stomach contents of L. 
ditropis have also included crabs (Nagasawa, 1998), while those of L. nasus have included sea urchins and whelks, although these may have been swallowed accidentally (Gauld, 1989). Both species of Lamna are known to occasionally school and form feeding aggregations (Paust and Smith, 1986; Compagno, 2001). They are voracious predators, and there are even reports of L. ditropis jumping high out of the water into dense patches of kelp to chase prey (Paust and Smith, 1986; Compagno, 2001).

L. nasus and L. ditropis are both common bycatch of commercial fisheries (Gauld, 1989; Francis and Stevens, 2000). Both are used for their fins and meat, although L. nasus seems to have a higher market value. In fact, L. nasus has been exploited by Scandinavian fisheries (especially Norway), which target this shark for its fins and meat (Gauld, 1989; Castro et al., 1999). In the 1960s Norway exploited the porbeagle in its local waters, until the industry was no longer viable. This prompted Norway to fish this shark from North American waters and within six years, L. nasus was depleted there as well (Castro et al., 1999; Francis and Stevens, 2000). Norway currently has a quota on porbeagle catches (Gauld, 1989; Castro et al., 1999). L. nasus is not, however, targeted in the Southern Hemisphere, although it is a frequent bycatch of commercial fisheries (Francis and Stevens, 2000). The Norwegian example indicates that this species is seriously vulnerable to over-fishing. L. nasus is protected in the United States, with regulated fisheries in Europe and Canada (Compagno, 2001). The porbeagle is considered a record in the British Isles, although it is not as popular as makos or the great white (Gauld, 1989). For all the above reasons, there is considerable concern about the conservation status of L. nasus (Compagno, 2001).

The salmon shark, L. ditropis, is considered a nuisance fish by salmon fisheries, because it damages nets and destroys their catch (Paust and Smith, 1986; Compagno, 2001). 
As a result, this species is often killed on sight by salmon fishermen. Although there is a regulated sports fishery in Alaska for L. ditropis, it does not appear to be as popular a game fish as L. nasus (Compagno, 2001). In some parts of the world, the salmon shark is finned if caught, and the heart of L. ditropis is even considered a delicacy in certain parts of Japan (Compagno, 2001). The aforementioned factors, combined with the lack of protection of this species, mean there is concern for the conservation of this species as well (Castro et al., 1999; Compagno, 2001).

Isurus

Species within the genus Isurus are known as mako sharks, a name which comes from the Maori (native New Zealand) word for these sharks (Compagno, 2001). Makos are more slender and sleeker sharks than Lamna or Carcharodon. The two species are the short-finned mako, I. oxyrinchus, and the long-finned mako, I. paucus. The latter species was not described until 1966, and is very poorly known. In fact, I. paucus has never been observed underwater (Compagno, 2001). I. oxyrinchus was named and described in 1809, and is better known; hence the information in this section pertains to I. oxyrinchus unless otherwise indicated.

Mako sharks are impressive in appearance, due to their striking metallic blue color. As in other lamnids, the body is counter-shaded, with a white ventral surface. I. oxyrinchus reaches an estimated body length of up to 4.1m, and females are larger than males. Male I. oxyrinchus reach a body size of about 3m, and mature at around 2m, while females mature at about 2.8m (Bass et al., 1975; Stevens, 1984; Compagno, 2001). I. paucus is slightly larger, with a maximum reported size of 4.2m (females), and both sexes in this species appear to mature at around the same size (2.45m; Compagno, 2001). I. oxyrinchus are smaller at birth than I. paucus (60-70cm and 97-120cm respectively; Bass et al., 1975; Compagno, 2001). Males and females appear to have similar growth rates until they are 4-5 years old (around 2.3m; Pratt and Casey, 1983; Casey and Kohler, 1992). Makos may exhibit rapid growth rates when compared to other species (Pratt and Casey, 1983; Casey and Kohler, 1992). For example, I. oxyrinchus may grow at approximately twice the rate of L. nasus (Pratt and Casey, 1983; Casey and Kohler, 1992).

Makos are perhaps the fastest of all shark species, and they are among the most active of all fishes (including both cartilaginous and bony fishes; Compagno, 2001). 
Because of their speed, makos have few natural predators, although juvenile makos are occasionally taken by great white sharks (Compagno, 2001). Makos are known to jump several times their own body length out of the water, and are capable of sudden bursts of speed (Compagno, 2001). Their speed allows makos to catch fast-swimming prey. I. paucus may not be as fast as I. oxyrinchus, based on the former's broad, long pectoral fins and slimmer build, a morphology found in other slow swimmers (such as Carcharhinus longimanus; Compagno, 2001).

Makos are famous for their distinctive behaviors, which have been interpreted as threat displays. They are known to charge at other sharks (as well as divers) at high speed, and suddenly change direction at the last moment to avoid a collision (Compagno, 2001). Makos rarely bite divers, but the sheer speed of the charge means that conventional methods of warding off sharks do not work for makos (Compagno, 2001). Another behavior exhibited by makos (and Carcharodon, see below) is known as "gaping"; during gaping the shark swims underwater with its mouth open, which is believed to be a warning signal (Compagno, 2001). Makos are also known to make short jumps out of the water ("porpoising"), followed by rapid swimming in a figure-8 pattern (Compagno, 2001). Apart from these putative threat displays, very little is known of the behavior of makos, although it is assumed to be similar to that of Carcharodon.

A wide variety of prey items (both pelagic and benthic) have been recorded from the stomachs of I. oxyrinchus (Compagno, 2001). The diet of this species seems to be associated with prey availability. For example, for makos caught inshore in the western North Atlantic, almost 78% of their diet is bluefish (Pomatomus saltatrix; Stillwell and Kohler, 1982). Makos caught offshore in the western North Atlantic had mostly in their stomachs, including deep-water squids that are known to be vertical migrators (Stillwell and Kohler, 1982). 
Tracking studies conducted by Holts and Bedford (1993), however, revealed no pattern of diurnal vertical migrations in I. oxyrinchus. In the western North Atlantic, makos do not appear to feed on other shark species; in South Africa, however, elasmobranchs (primarily blue sharks [Prionace], although Carcharhinus, small hammerheads and several batoid species have also been recorded) appear to be the most important prey item in the diet of these sharks (Compagno, 2001). I. oxyrinchus specimens caught off the Australian coast rarely feed on elasmobranch prey, but feed on various teleost species (but not bluefish, which are absent from these waters; Stevens, 1983; Compagno, 2001). Odd prey items found among the stomach contents of I. oxyrinchus include sea horses, boxfish, pufferfish, sargassum weed, and isopods (Stillwell and Kohler, 1982; Compagno, 2001), but some of these items may have been ingested by accident. The diet of I. paucus is not well known, but based on stomach contents it appears to eat schooling fish and cephalopods (Compagno, 2001).

As makos increase in size, they shift to larger prey that is closer to their own size. For example, adult males weighing about 136kg frequently attack and kill swordfish (Xiphias sp.) that weigh about 180 kg (Compagno, 2001). When makos reach over 3m in length they will hunt and kill small cetaceans. It is thought that these sharks can capture dolphins because the shape of the teeth changes (from awl-shaped to more triangular and flattened) to a shape similar to that found in Carcharodon carcharias (Chapter Two). Unlike C. carcharias, however, makos are not known to prey upon or scavenge dead whales (Compagno, 2001).

I. oxyrinchus and I. paucus are found in tropical and warm temperate oceanic water in all three major oceans, and seem to prefer water temperatures between 17 and 22°C (Casey and Kohler, 1982; Compagno, 2001). Nevertheless, I. 
oxyrinchus is found as far south as New Zealand and central , and as far north as Scandinavia and the Bering Sea (Nakano and Nagasawa, 1996;

Compagno, 2001). I. oxyrinchus are found down to a depth of at least 500m, and sometimes come inshore (Compagno, 2001). The known distribution of I. paucus is very patchy, although it appears to be very common in certain areas (e.g., southern Brazil; Amorim et al., 1998). I. paucus may be a more oceanic and deep-water shark, which rarely (if at all) ventures inshore, which would explain why it is encountered less frequently than I. oxyrinchus. This is also consistent with the fact that I. paucus has larger eyes than I. oxyrinchus (Garrick, 1967) and a morphology similar to other oceanic species, such as the oceanic white-tipped shark Carcharhinus longimanus (C. longimanus was described much later than other Carcharhinus species, possibly because it also almost exclusively inhabits oceanic waters; Garrick, 1967).

I. oxyrinchus is capable of traveling long distances, based on "tag and release" studies (Casey and Kohler, 1982). For example, one individual (1.6m long) had traveled 2452 nautical miles (about 4,000 km) from the northeastern coast of the United States to Spain (the majority of sharks, however, were captured within 500 km of where they were tagged; Casey and Kohler, 1982). There did not appear to be a relationship between size or sex and the distance traveled (Casey and Kohler, 1982). The difficulty in determining migratory patterns in this species is that the majority of I. oxyrinchus specimens caught are juveniles or subadults (Nakano and Nagasawa, 1996). As a result, little information is available on adult makos, especially pregnant females, which are rarely caught (Casey and Kohler, 1982). Juveniles thus appear to have a wider geographical range than adults, and appear to come into coastal waters more frequently (Casey and Kohler, 1982). This may represent segregation between juveniles (inshore) and adults (offshore) to protect smaller individuals from cannibalism and larger oceanic predators (Casey and Kohler, 1982). 
Based on tagging studies, it has been suggested that there may be separate populations of I. oxyrinchus within the Atlantic (Compagno, 2001). For example, Moreno and Moron (1992a) suggest that there is an endemic population of I. oxyrinchus off the Azores (northeast Atlantic). I. oxyrinchus can usually be distinguished from I. paucus by coloration, with the latter exhibiting a darker coloration on the ventral surface of the snout. In contrast, I. oxyrinchus typically has a white ventral surface along its entire length. Specimens of I. oxyrinchus in the Azores, however, possess darker undersides, the degree of which appears to change with size: small specimens have dark blotches on the underside (especially under the snout), while larger specimens show an underside that is almost completely dark with only a few white areas still remaining (Moreno and Moron, 1992a). Since the color variation is related to size, and since this color pattern is common in Azorean waters, Moreno and Moron (1992a) believe this pattern is due to fixed genetic variation in this population; thus, there appears to be a variant population of I. oxyrinchus endemic to Azorean waters.

Another example of potential differences within Atlantic I. oxyrinchus populations was found by Heist et al. (1996). A restriction fragment length polymorphism (RFLP) analysis of a 220 base pair region of mitochondrial DNA found significant differences between sharks in the North and South Atlantic, which may therefore represent separate populations. This implies that if populations of I. oxyrinchus are over-fished in the northern Atlantic, stocks may not be replenished by migration from the south Atlantic (Heist et al., 1996); this observation, if correct, may have serious implications for the management of this species.

I. oxyrinchus is targeted by commercial fisheries (Castro et al., 1999). This species is fished for its high-quality meat, and targeted by big game sports anglers because these sharks jump out of the water and put up a fight when hooked (Compagno, 2001). I. oxyrinchus is also a common bycatch of the tuna and swordfish industries (Castro et al., 1999). Since most of the catches are of immature sharks, nurseries may be being depleted, and in some areas in the Atlantic and eastern Pacific there has been a significant decline in catches (Casey and Kohler, 1982; Castro et al., 1999). In response to concerns over these declines, the United States and Australia have imposed limits on the sizes of I. oxyrinchus that can be kept by sports anglers (Casey and Kohler, 1992; Pepperell, 1992), and the United States has also imposed a limit of one shark per boat per day (Casey and Kohler, 1992). This is one of the few species of sharks for which minimum length restrictions have been implemented (Casey and Kohler, 1992).

I. paucus is also frequently caught by commercial fisheries, but since the meat is of lesser quality, these sharks are often finned and thrown back. The frequent practice of discarding these sharks at sea, combined with the fact that this species is often confused with I. 
oxyrinchus, makes it difficult to determine how many are actually caught. I. paucus has similar risk factors to I. oxyrinchus (such as large size, low fecundity and vulnerability to capture via commercial fisheries) that also make this species susceptible to over-fishing (Castro et al., 1999).

Carcharodon

The genus Carcharodon is represented by perhaps the best known (and most notorious) of all modern shark species, the great white shark, Carcharodon carcharias. Despite the fact that it is one of the most feared of all sharks, little is known of its biology. Great whites are difficult to study since adult specimens of C. carcharias are rarely caught, partly because (unlike other lamnids) they are strong enough to free themselves from long-lines, and also because this species appears to be uncommon throughout its distribution (Caillet et al., 1985). Pregnant females are especially rare (Compagno et al., 1997).

The common name for the great white shark derives from its white underbelly; sharks that had been caught and brought up on deck were usually observed lying on their backs, with their lighter ventral surface exposed (Klimley and Ainley, 1996). This name, however, is somewhat of a misnomer, since most of the surface of the great "white" shark ranges from gray to black in color, and only the underside of the body is white (Compagno, 2001). Carcharodon carcharias, however, certainly deserves to be called "great" due to the impressive size this shark can reach. The maximum size of the shark is controversial, since many estimates have been based on extrapolating total body length from isolated teeth, jaws or bite marks inflicted on carcasses of prey (Randal, 1973; Mollet et al., 1996). Such estimates (up to 9m in body length) are difficult to verify (Mollet et al., 1996) and as such, 6.4m is the largest maximum length based on the actual measurement of a whole specimen (Bigelow and Schroeder, 1948; Compagno, 2001). Males are smaller than females, and reach a maximum body length of around 5-5.5m, maturing between 3.5 and 4.5m (approximately 9-10 years old; Caillet et al., 1985; Compagno, 2001). Females mature between 4 and 5m at approximately 12-14 years old, and can live at least 23 years (Compagno et al., 1997). The size of great whites at birth varies from 1.2 to 1.7m (Compagno, 2001). As well as being impressively long, great whites are very stocky and extremely powerful.

The large size and enormous strength of C. carcharias mean they have few natural predators. Nevertheless, there is one report of a killer whale (Orcinus orca) killing and consuming a 3-4m great white. As observed by Pyle et al. (1999), the killer whale rammed the shark at high speed, and then held the shark (which was motionless) upside-down in its jaws for 15 minutes while swimming slowly. 
It is assumed the shark was killed either by the ramming or by asphyxiation (from being held immobile on its back; Pyle et al., 1999). Nevertheless, great whites remain at the apex of the marine predatory food chain.

The diet of great whites is very broad, and includes prey which range in size and habitat (both benthic and pelagic species). These include many species of both bony and cartilaginous fish including chimeras, batoids, and even other lamniforms (e.g., C. taurus, I. oxyrinchus, C. maximus) and the whale shark (Bass et al., 1975; Cliff et al., 1989; Compagno, 2001). It is not known if basking and whale sharks are attacked or scavenged, although juveniles of both species may be easily taken. So far, there is no evidence that adult great whites feed on their young, and it appears that there may be behavioral traits that prevent cannibalism (see below). Great whites also kill birds (e.g., pelicans, gulls, penguins), but they do not appear to eat the birds; they merely "toy" with them (Compagno, 2001). Invertebrate prey includes squid, abalone, bivalves and crustaceans; one 4.4m individual was even found with 150 crabs in its stomach (Compagno, 2001). Feeding on benthic fish may be especially important in juvenile great whites, which can feed by sucking up these fishes with their jaws (Casey and Pratt, 1985; Tricas, 1985).

As in Isurus, great whites over 3m in length switch to a diet that includes large marine mammals (Tricas, 1985; Cliff et al., 1989). When available, large great whites feed almost exclusively on marine mammals. Pinnipeds seem to be an important staple in the diet of large white sharks, but occasionally small cetaceans, such as dolphins, pygmy sperm whales and calves, are also taken (Compagno, 2001). Juvenile great whites (smaller than 2m) may also feed on pinnipeds, although at this size they are restricted to hunting juveniles (Compagno, 2001). 
Unlike makos, great whites will scavenge the carcasses of large whales (e.g., baleen whales), and this appears to be an important food source (Compagno, 2001). In addition to marine mammals, the remains of terrestrial mammals have been found in the stomachs of great whites. Many of these remains, such as cows, lambs and pigs, are the result of opportunistic scavenging of carcasses dumped by slaughterhouses into the sea (Compagno, 2001).

Great whites appear to exhibit an array of behaviors associated with feeding, which have been interpreted as threat displays. These behaviors are directed at other great whites (and occasionally humans). There appears to be a feeding hierarchy among great whites, and these behaviors may be used to assert dominance. For example, larger individuals will often follow a smaller individual toward bait, but at the last moment the smaller individual will suddenly veer away from the bait, allowing the larger shark exclusive access to it (Compagno, 2001). In addition, two individuals will swim parallel to each other, and slap their tails at the surface, creating a large splash directed at the other shark (Klimley et al., 1996). This appears to be a contest, and will continue until a "winner" is established and is allowed to feed (the stronger shark appears to be determined by both the frequency and vigor of the splashes; Klimley et al., 1996). This tail-slapping is sometimes accompanied by the shark launching itself into the air, a behavior known as "breaching" (Klimley et al., 1996). As in Isurus, great whites also exhibit "gaping" behavior, except that in the great white this display occurs both above and below the surface. More elaborate "gaping" behavior was also frequently observed when bait was offered to the shark, and then removed before the shark could feed. The great white would swim awkwardly on its side, with its head above the surface and mouth open, flexing its jaws ("rhythmic partial gapes"; Strong, 1996). 
This behavior was interpreted as a sign of frustration on the part of the great white (Strong, 1996). Anatomical studies on the brain and eyes of white sharks provide insights into how these animals hunt. The morphology of the brain suggests that white sharks rely heavily on olfaction and vision to detect prey, rather than electroreception (Demski and Northcutt, 1996). Traditionally, it has been thought that all sharks have poor eyesight and are color-blind. However, histological studies of the eyes of sharks suggest otherwise (Gruber and Cohen, 1985). For example, the distribution of photoreceptors in the retina is similar to that of animals with acute vision, as the cones are concentrated where the maximum amount of light is focused (similar to the fovea centralis of mammals; Gruber and Cohen, 1985). Great whites are commonly associated with attacks on humans. The great white has been blamed for more attacks on humans than any other shark species. The infamy of this shark makes it the primary culprit in such attacks even when other species (such as tiger and bull sharks) may actually be responsible (Bass et al., 1975). Great whites appear to be very inquisitive, and are very curious about human activities. For example, they are known to swim up to within a few feet of divers, pause, then swim away, without any aggressive behavior. Great whites are also drawn to boats, and often stick their heads out of the water to investigate (this is called "spy hopping"; Compagno, 2001). The reason(s) behind great white attacks on humans (such as divers, swimmers and surfers) are often unclear. Attacks may stem from pure curiosity, as exemplified by instances where great whites will swim up to humans and bite, but then depart without feeding (Compagno, 2001). Another popular view is that humans are mistaken for marine prey. For example, surfers paddling on surfboards are said to look like pinnipeds from below.
However, this "mistaken identity" hypothesis is undermined by the excellent vision ascribed to these sharks (see above; Compagno, 2001). Unlike Lamna, white sharks are found in areas frequented by human activities. This shark has an extremely wide geographical distribution, and can enter almost any marine habitat including very shallow water, bays, lagoons and even estuaries (Bass et al., 1975; Compagno, 2001). Although great whites are rarely caught in deep water, one specimen was caught at a depth of 1280m (Compagno, 2001). Great whites can tolerate a wide range of temperatures. They are found in very warm (tropical) and very cold waters, as far north as the Bering Sea and as far south as sub-Antarctic islands (Compagno, 2001). Adult and juvenile white sharks appear to be segregated by both habitat and temperature, with juveniles restricted to warm temperate, coastal waters (Bass et al., 1975; Casey and Pratt, 1985; Klimley, 1985; Goldman et al., 1996; Fergusson, 1996). Such segregation may (as in other lamnids) be a strategy to minimize cannibalism of young white sharks by adults (Klimley, 1985). Goldman et al. (1996) have suggested that temperature segregation may be influenced by size, since young great whites may dissipate heat faster than larger individuals and therefore may be more sensitive to cooler temperatures. Juvenile sharks begin to move into cooler waters only when they reach approximately 2m in length (Casey and Pratt, 1985; Klimley, 1985). In addition, small immature sharks caught off the east coast of North America were predominantly caught in shallow coastal waters, which may represent nursery areas (Casey and Pratt, 1985). Migration patterns (other than those possibly associated with maturity) are poorly understood in C. carcharias (Compagno, 2001). However, a study conducted by Pardini et al. (2001) indicates that migration in this species may also be influenced by sex.
In this study an RFLP analysis of nuclear microsatellite DNA was compared to mitochondrial sequence data (the D-loop) for great whites in Australia, New Zealand and South Africa. The mitochondrial data showed that populations from Australia and New Zealand, when compared to that of South Africa, differed by about 4% sequence divergence. In contrast, there was no significant difference in the nuclear data. Pardini et al. (2001) suggest that, assuming maternal inheritance of mitochondrial DNA (Chapter Four), this may indicate that females do not migrate (they also state, however, that the lack of any divergence in the nuclear data can be accounted for by the migration of males). If these conclusions are true, then local populations of great whites could be hunted to extinction by wiping out endemic female populations. Unlike other lamnids, great whites are uncommon catches in commercial fisheries. This is because larger great whites are strong enough to break off lines and, since small great whites are not usually found in offshore waters, these are rarely caught by long-lines. However, the great white has a unique problem due to its reputation. Prior to the movie Jaws, any time great whites were caught by accident, they were usually discarded (Compagno, 1990a). After Jaws, capturing and killing great whites was seen as a public service, due to the alleged danger of this shark (Compagno, 1990a). It also became very lucrative, since products associated with white sharks became much sought after; the jaws, teeth and fins are highly prized by trophy hunters, especially those from specimens over 5m in length. For example, the jaws of great whites have been sold for $20,000 to $50,000, individual teeth for $600 to $800, and a pair of fins can go for over $1000 (Compagno, 2001). Although great whites are currently protected in Australia, South Africa, Namibia, , and the United States, there is often inadequate enforcement of these laws.
The value of great white shark parts, combined with the protected status of the species, has given rise to a black market for these products. Unless the existing laws are more strictly enforced and adopted globally, any measures to protect the species will be ineffective (Compagno et al., 1997).
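The figure of about 4% mitochondrial sequence divergence cited above from Pardini et al. (2001) can be made concrete with a small calculation. The sketch below computes an uncorrected pairwise distance (the proportion of aligned sites that differ), which is one common way to express divergence; whether Pardini et al. used this exact measure is an assumption, and the two sequences are invented for illustration and are not real D-loop data.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Return the proportion of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":  # skip alignment gaps
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical 25-site fragments (assumption: invented, not from any study).
hap_1 = "ACGTTGCAAGCTTAGGCTAACGTTA"
hap_2 = "ACGTTGCAAGCTTAGGCTAACGTTG"

print(p_distance(hap_1, hap_2))  # 1 difference in 25 sites -> 0.04, i.e. 4%
```

Uncorrected distances of this kind underestimate divergence when multiple substitutions have occurred at the same site, which is why model-based corrections are often preferred for more distantly related sequences.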

Biological adaptations in lamniform sharks II: Endothermy

The family Lamnidae (as well as the genus Alopias) is endothermic. Endothermy, or the ability to maintain constant body temperatures above that of the environment, is restricted to 0.1% of all fish species (Hochachka and Somero, 2002). Endothermy is limited in fish species due to rapid heat loss across their gills and skin. The rate at which oxygen diffuses into the blood at the gills is ten times slower than the rate at which heat is lost to the environment (Carey et al., 1971). As a result, blood passing through the gills is rapidly cooled by the surrounding water and leaves the gills at a temperature close to that of the environment. Any heat generated by fish (through metabolic processes) is therefore unavailable to maintain stable body temperatures. Despite this obvious limitation, endothermy has evolved independently in both teleost fishes (Scombroidea: tuna and ) and in sharks (Lamnidae: see above). The mechanisms developed for heat retention in these two groups are a striking example of convergent evolution; it is indeed remarkable that such divergent taxa have developed such a similar suite of complex traits. This section presents an overview of the anatomical and physiological adaptations for endothermy present in lamnid sharks. Evidence for endothermy in other elasmobranch groups, such as thresher sharks and batoids, as well as the advantages of endothermy in these fish, are also discussed.

The : Endothermic versus ectothermic sharks

Lamnid sharks possess several cardiovascular specializations necessary to fuel the high metabolic demands required for endothermy. Compared to ectothermic sharks, the aerobic tissues of lamnids require increased amounts of oxygen to sustain these high metabolic activities; this is accomplished in several ways (Bernal et al., 2001). These sharks possess a large gill surface area, with small lamellae, which facilitates gas exchange. The heart is also unique: it is larger, with thicker ventricular walls; possesses an increased coronary blood supply; and shows greater activity of both anaerobic and aerobic enzymes in cardiac muscles. The blood itself has a higher oxygen carrying capacity, as indicated by unusually high hematocrits (an indicator of the percentage of red blood cells in the blood) and hemoglobin concentrations; these have been shown to equal or surpass levels in birds and mammals (Carey et al., 1981; Emery, 1985). Although concentrations of myoglobin (responsible for delivering oxygen to the mitochondria for use in aerobic ) in sharks are unknown, it is believed that concentrations of this protein will equal the high levels seen in endothermic (Bernal et al., 2001). Recall that the main obstacle for endothermy in fish is the loss of metabolic heat across the gills to the environment. To circumvent this problem, lamnid sharks have developed a highly modified circulatory system. In order to appreciate the complex modifications present in these animals, a brief review of circulation in sharks is necessary.

In sharks, deoxygenated blood leaves the heart through the ventral aorta. This vessel branches into several afferent branchial arteries that carry blood towards the gills. After gas exchange has occurred, oxygen rich blood leaving the gills travels through a series of efferent branchial arteries that collectively drain into the dorsal aorta. The dorsal aorta runs along the midline of the body, and branches off this vessel supply blood to all of the major organs of the shark. The venous system in sharks (i.e., the pathway by which blood is returned to the heart) is modified to form both a renal portal system (where blood from the tail is filtered at the kidneys before returning to the heart) and a hepatic portal system (where blood from the internal organs is shunted to the liver before its return to the heart). Other major veins include the anterior and posterior cardinal veins that drain blood from the anterior and posterior sections of the body, respectively. These blood vessels are therefore similar in function to the cranial and caudal venae cavae found in mammals. Both the anterior and posterior cardinal veins drain into the common cardinal vein, and this vessel returns the blood to the heart. Lamnid sharks have modified several aspects of this basic circulatory pattern by including several "retia mirabilia" or "miraculous networks" (circulatory structures in which arterial and venous vessels run in close contact) along pathways leading to muscles in the trunk, the visceral organs and the brain and eyes. These "retia mirabilia" (from here on referred to simply as retia) prevent heat loss by acting as countercurrent heat exchangers in which cold arterial blood traveling towards the tissues comes in contact with the warm venous blood leaving the tissue. Metabolic heat generated in the tissues is then conducted from the venous circulation to the arterial circulation where it is carried back towards the tissues.
In this manner heat generated in the tissue returns to the organs and thus bypasses the gills, preventing heat loss to the environment. Endothermy in sharks, however, is limited to the organs serviced by the retia, and sharks are therefore only capable of "regional endothermy" since not all tissues in the body are maintained above the temperature of ambient seawater. Three primary retial systems (the muscle retia, the visceral retia and the orbital retia) are present in lamnid sharks. The addition of these retia requires alterations in the circulatory pathways to the organs that they service, and this has resulted in reductions in the importance of both the dorsal aorta and posterior cardinal vein in these sharks (Carey et al., 1971). This section discusses the anatomical and physiological modifications found in these three heat exchanging retia. Although a fourth retial system has been found associated with the in Lamna ditropis (Anderson and Goldman, 2001), this system is poorly understood and is therefore not discussed.
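The heat-recovery principle of a countercurrent exchanger described above can be illustrated with a minimal sketch. It uses the standard engineering result for an idealized, balanced countercurrent exchanger (equal heat-capacity flow in the arterial and venous streams), for which the fraction of the available heat transferred is NTU / (1 + NTU), where NTU is a dimensionless measure of exchanger size. The function names, the NTU value and the temperatures are assumptions chosen for illustration; they are not measurements from lamnid retia.

```python
def counterflow_effectiveness(ntu: float) -> float:
    """Fraction of available heat transferred in a balanced countercurrent exchanger.

    Assumes equal heat-capacity flow in both streams and no heat loss to the
    surroundings (an idealization, not a model of a real rete).
    """
    if ntu < 0:
        raise ValueError("ntu must be non-negative")
    return ntu / (1.0 + ntu)


def arterial_outlet_temp(t_arterial_in: float, t_venous_in: float, ntu: float) -> float:
    """Temperature of arterial blood as it leaves the exchanger toward the tissue."""
    effectiveness = counterflow_effectiveness(ntu)
    return t_arterial_in + effectiveness * (t_venous_in - t_arterial_in)


# Hypothetical values: arterial blood arrives at ambient 15 degrees C,
# venous blood leaves the warm tissue at 25 degrees C, exchanger size NTU = 4.
warmed = arterial_outlet_temp(t_arterial_in=15.0, t_venous_in=25.0, ntu=4.0)
print(f"{warmed:.1f}")  # 80% of the 10-degree difference is recovered: 23.0
```

In this idealization a larger exchanger (higher NTU) returns a larger share of metabolic heat to the tissue, so less heat is carried back to the gills with the venous blood.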

The retial system of locomotor muscles

There are two types of retia involved with the locomotor muscles of fish: the central retia and the lateral (cutaneous) retia. The central heat exchanger in tunas consists of a rete between the dorsal aorta and the posterior cardinal vein (Satchell, 1991). This type of system is absent in sharks and is considered an ancestral trait in tunas (Bernal et al., 2001). The second type of heat exchanger, the lateral (cutaneous) retia, is present in both tunas and sharks. Two major anatomical modifications were necessary in order to retain the metabolic heat generated by the swimming muscles of sharks. The first of these was a change in position of the aerobic, highly metabolically active, red muscle (RM) from a lateral position to a more medial one. In most fish, the RM is a thin, lateral layer of muscle located just beneath the skin (Satchell, 1991). In this position, heat generated by this highly metabolically active tissue is lost both across the gills and through the skin. A shift in the position of the RM from this lateral position to a more medial position (flanking the vertebrae) is believed to be vital for heat conservation; indeed this feature separates the scombroids and lamnid sharks from all other fish species. The centralization of the RM by itself is not sufficient for endothermy: a retial heat exchanger must also be present. The development of the lateral (cutaneous) retia was the second major anatomical modification developed in sharks. In order to supply blood to the medial RM, and conserve the heat generated in this tissue, a reversal in the pattern of circulation was necessary. Recall that the main blood supply to the body in sharks is the dorsal aorta and that the locomotor muscles are usually drained via the posterior cardinal vein; both these vessels are located along the midline of the shark. In lamnids the size and the function of these two primary blood vessels are reduced.
Instead, the lateral cutaneous vein, which is normally small and utilized to drain the skin, is modified to serve the retia of the RM (Carey et al., 1971). In lamnid sharks, a new blood vessel, the lateral cutaneous artery, is derived from the fourth efferent branchial artery (one of the blood vessels which leave the gills) (Bernal et al., 2001). This blood vessel runs parallel to the lateral cutaneous vein in the area near the RM. Blood flow to the RM travels through the lateral cutaneous artery, bypassing the dorsal aorta. The lateral cutaneous artery then branches into many smaller blood vessels that supply the retia with cool arterial blood (temperature equal to that of the surrounding water). Arterial blood traveling through the retia comes in contact with vessels running parallel and in the opposite direction; these are branches off the lateral cutaneous vein. Blood traveling through the lateral cutaneous vein is leaving the RM and hence has been warmed by the metabolic activity of this muscle. The structure of the retia therefore ensures that heat is returned to the RM, where it can be used to maintain body temperature, and is not lost across the gills. The lateral cutaneous veins drain into the posterior cardinal vein just prior to their entry into the common cardinal veins, thus reducing the function of the posterior cardinal veins (Bernal et al., 2001). The anatomy of the blood vessels present in the lateral retia varies within lamnid sharks (Carey et al., 1971; Satchell, 1991; Bernal et al., 2001). In Carcharodon and Lamna species the blood vessels of the retia are dispersed within the white muscle (WM) tissue (this tissue surrounds and insulates the RM), while in Isurus the rete is a compact mass (up to a centimeter thick) of vessels leading to the RM. Despite these differences, it is clear that these heat exchangers are efficient since these animals conserve heat.
Muscle temperatures of lamnid sharks range from 5°C above ambient seawater in Carcharodon carcharias (Carey et al., 1982) to 15.6°C above ambient in Lamna ditropis (Anderson and Goldman, 2001); mako sharks (Isurus) fall within this range (Smith and Rhodes, 1984). Carey et al. (1982) believe that the RM of C. carcharias may be warmer than previously recorded, since measurements were not taken in what should be the warmest part of the shark (i.e., directly in the RM).

Visceral endothermy: Suprahepatic retia

Visceral retia are present in all lamnid sharks but only in a handful of scombroid fishes (Bushnell and Jones, 1992). Once again, lamnid sharks have modified their circulatory pathways; in this case it is the blood supply to the visceral organs that has been modified. Normally, the hepatic portal system enables blood returning from these organs to be filtered in the liver before it returns to the heart (via the hepatic vein to the common cardinal veins to the heart). However, in lamnid sharks the blood supply to the viscera is modified to travel through the visceral rete. In lamnid sharks a rete (the suprahepatic rete) is located anterior to the liver at the base of the esophagus. Blood flow to the rete travels through an enlarged pericardial artery: this artery is formed by a combination of several efferent branchial arteries (vessels traveling from the gills) that converge ventrally to create the vessel (Bernal et al., 2001; Satchell, 1991). The pericardial artery branches into several small arterioles that form the blood supply to the rete. Unlike in other retia, these arterioles are not paralleled by venules; rather, they form a meshwork of blood vessels that sits in a sinus (an expansion of a vein) formed by the hepatic vein. In this arrangement, the vessels carrying cool arterial blood are bathed by the warm venous blood that enters the sinus, so the venous blood does not travel in small veins as in the lateral rete. The pathway of circulation in lamnids is modified to allow blood traveling to the viscera to pass through the suprahepatic rete. In this system, blood bypasses the dorsal aorta and travels directly from the gills to the viscera. Branches off the dorsal aorta that normally serve these internal organs are greatly reduced (Carcharodon, Lamna) or absent (Isurus) in these sharks (Carey et al., 1971).

However, these blood vessels may function as a mechanism to allow blood to bypass the rete, thus helping to regulate the temperature of the viscera (Carey et al., 1981). A second potential mechanism for regulating blood flow is in the structure of the rete itself. A large channel travels through the center of the hepatic sinus; this delivers blood directly to the heart and thus bypasses the rete. The walls of this channel are surrounded by smooth muscle and it is believed that constriction of these muscles can also regulate the flow of blood through the rete (Carey et al., 1981). Visceral temperatures of lamnid sharks are from 4 to 14°C higher than the environment (Carey et al., 1981; Goldman et al., 1996; Bernal et al., 2001). The suprahepatic retia in Lamna and Carcharodon species are large and similar in size: in a 183cm long porbeagle the rete had a cross-sectional area of 30 square cm and weighed 267 grams; that of Isurus is approximately one third this size. The enormous size of the suprahepatic rete may be necessary to maintain such high visceral temperatures (Carey et al., 1981). Unlike in tunas, heat retention in the viscera does not appear to be related to either feeding activities or water temperatures, since the temperature of these organs is consistently high (Carey et al., 1981; Bernal et al., 2001). Despite this independence from feeding activities, the heat is believed to be generated by digestion, primarily that which occurs in the intestine (Bernal et al., 2001). However, it must be emphasized that the exact mechanism of heat production in the viscera of lamnid sharks is currently unknown.

Orbital retia: Endothermy in the eyes and brain

Unlike the lateral and visceral retia, arterial circulatory pathways associated with the orbital rete (composed of hyoidean and pseudobranchial retia; see below) of endothermic sharks are almost identical to those of ectothermic species that lack these retia (Alexander, 1998). In all sharks, two arteries, the pseudobranchial artery and the efferent hyoidean artery, branch off the first efferent branchial artery. The efferent hyoidean artery unites anteriorly with the paired dorsal aorta and becomes the internal carotid artery. In lamnid sharks the internal carotid is reduced in size; this vessel is the primary source of blood to the brain in sharks without an orbital rete (Block and Carey, 1985). In addition, the efferent hyoidean and paired dorsal aorta in lamnids branch into smaller blood vessels before uniting with the internal carotid; this forms the arterial blood supply to the hyoidean rete. Branches off the hyoidean rete supply blood to both the eye and its locomotor muscles (Block and Carey, 1985; Alexander, 1998). The hyoidean rete is well developed in Carcharodon and Isurus but is small in both Lamna species (Alexander, 1998). The pseudobranchial artery in lamnid sharks is large and highly coiled compared to other species of sharks. In some lamnids, this artery branches into smaller vessels that comprise the arterial blood supply to a pseudobranchial rete. The presence of the pseudobranchial rete among lamnid sharks is debated: Alexander (1998) reports a pseudobranchial rete in Lamna species only, whereas Block and Carey (1985) state that it is present in all five lamnid species. These authors also differ in their descriptions of the internal carotids.
Alexander (1998) contends that the pseudobranchial artery branches into two main vessels: one of these leads to the locomotor muscles of the eye while the second further divides into two vessels, both of which unite with the internal carotid that supplies blood to the brain; and both the hyoidean and the pseudobranchial arteries eventually unite with the internal carotids. Block and Carey (1985), however, state that neither the pseudobranchial nor hyoidean arteries unite with the internal carotid. According to these authors, blood to the brain and eye (retina and locomotor muscles) is primarily supplied through the pseudobranchial retia and hyoidean retia respectively (these authors collectively call both retia the orbital retia) and not via the internal carotid. However, they propose that since blood traveling through the internal carotid bypasses the retia, this may be a mechanism regulating the amount of heat delivered to the brain and eyes (Block and Carey, 1985). Despite these differences in opinion, both sets of authors agree that the eyes and brains of lamnid sharks are indeed endothermic. Brain temperatures in lamnid sharks are, on average, about 5°C above ambient, and recent findings for Lamna ditropis indicate temperatures of 9.4°C above ambient (Bernal et al., 2001). Eye temperatures of lamnids are on average 2.8°C above ambient (Block and Carey, 1985), although a temperature differential of 12.9°C was recorded in L. ditropis (Bernal et al., 2001). How do the eyes and brains of lamnid sharks retain heat? It has been proposed that metabolic activities of the extrinsic eye muscles (which are large and dark red, indicating the potential for high metabolic activity) may play a part in generating heat for this system (Carey et al., 1985; Alexander, 1998); a system similar to this exists in swordfish (Satchell, 1991). However, Bushnell et al.
(1992) state that metabolic processes in the eye and brain alone are insufficient to cause the temperature elevations in these tissues, and that an additional mechanism must be responsible for this phenomenon. The answer lies in a specialized vein, the red muscle vein (RMV), found only in lamnid sharks. This vein carries blood from the RM to the orbital sinus, an expansion of distal branches of the anterior cardinal vein. Both the hyoidean rete and the pseudobranchial rete (or pseudobranchial artery according to Alexander [1998]) are contained within the orbital sinus and thus are bathed with warm blood traveling through the RMV from the locomotor muscles (Bernal et al., 2001).

Endothermy in thresher sharks and batoids

Anatomical evidence exists which suggests that endothermy in elasmobranchs may be more prevalent than previously believed. Two examples of this are found in the Alopiidae (thresher sharks) and certain batoids (: eagle rays, manta rays, sting rays). Thresher sharks possess a lateral circulation to a medially placed RM (Bernal et al., 2001), a condition believed to be necessary for endothermy for reasons discussed above. Alopias vulpinus has hematocrit and hemoglobin levels similar to those of lamnid sharks, indicating a similar physiology to endothermic species (Emery, 1985). In addition, lateral retia associated with the RM have been attributed to two species of thresher, A. superciliosus (Carey et al., 1971) and A. vulpinus ( and Chubb, 1983), although these retia are poorly characterized. Temperature measurements of the RM needed to confirm endothermy in these species are lacking. Carey et al. (1971) reported RM temperatures in A. superciliosus to be ~4.3°C above that of the heart and the coldest muscle masses (water temperatures could not be recorded since it was unknown where the shark had been swimming), and Alexander (1998) reports that the body temperatures of A. vulpinus are indeed warm (no temperature measurements given). While the presence of visceral retia in threshers is only suggested by limited anatomical evidence in A. vulpinus (Alexander, 1998), the ability to warm the brain and eyes may be present in all three species. Orbital retia similar to those of lamnid sharks have been reported in both A. superciliosus and A. pelagicus but are lacking in A. vulpinus (Block and Carey, 1985; Alexander, 1998). However, according to Alexander (1998) the lack of orbital retia does not necessarily mean that these sharks cannot control the temperature of their eyes and brain. Alexander (1998) notes that while hyoidean retia were present in both I. oxyrinchus and C. carcharias, a pseudobranchial rete was not observed in these species.
Despite this, brain temperatures of these species are known to be above ambient, and it was hypothesized that a pseudobranchial rete is unnecessary and that heat could be exchanged via the highly coiled pseudobranchial artery itself. Neither the hyoidean nor pseudobranchial rete occurs in A. vulpinus; however, since both of these blood vessels are highly coiled and sit within an orbital sinus, it is possible that these arteries act as heat exchangers in the absence of retia (Alexander, 1998). The size and color (dark red) of the locomotor muscles of the eyes of thresher sharks are also similar to those reported for lamnids; it has been suggested that metabolic processes in these muscles play a part in warming blood to the eyes of endothermic sharks (Carey et al., 1985; Alexander, 1998). Although anatomical evidence exists for threshers, temperature measurements in these animals are necessary to confirm this phenomenon. Anatomical evidence also exists for the possibility of endothermy in two families of batoids: Myliobatidae (eagle rays; ) and Mobulidae (manta rays; , Manta). Several species within the Mobulidae possess cranial retia that may function as countercurrent heat exchangers. These species include: Manta birostris, Mobula thurstoni, Mobula japanica, and Mobula tarapacana (Alexander, 1996). Among the myliobatids, simplified precerebral retia appear to be limited to the genus Rhinoptera, and one species, Rhinoptera javanica, possesses a well developed cranial rete similar to that found in mobulids (Alexander, 1995). Although the mechanism responsible for generating heat in these systems is unknown, eye muscles found in Mobula species are large and contain red muscle fibers that could be used for this purpose (Alexander, 1996). Mobula tarapacana possesses a retial system associated with the pectoral fins; this type of system is not found in sharks.
This retial system is derived from the arteries and veins that normally supply blood to the pectoral fins of elasmobranchs (subclavian arteries and segmental veins respectively; Alexander, 1995). Unlike sharks, in which locomotion is achieved via lateral movements of the caudal fin and trunk muscles, rays swim by movements of enlarged pectoral fins. Large amounts of RM are associated with these fins and are believed to be the heat source for this system (Alexander, 1995). Two additional retial systems, those around reproductive organs (in Mobula tarapacana) and visceral retia (Mobula japanica and possibly Mobula tarapacana) are also found in rays (Alexander, 1995). As in threshers, temperature measurements are needed to confirm the existence of endothermy in batoids.

Advantages of endothermy

Several hypotheses about the possible advantages of endothermy have been proposed. Lamnid sharks are very fast swimming, active animals and initially it was believed that increased body temperatures (due to the lateral rete) may have allowed these sharks to achieve such activity by increasing metabolic rates in the RM (Carey et al., 1971). However, Block (1991) states that this argument is circular, since high metabolic rates in the RM are a precursor for endothermy (since this is the source of heat to the system) and thus cannot be a selective advantage in these animals. Perhaps an advantage of the lateral rete may be that, as in mammals, elevated body temperatures provide a stable environment for the activity of important enzymes (Bernal et al., 2001). Indeed, the activity of two key metabolic enzymes, citrate synthase (this is used as an indicator of aerobic metabolism since it catalyzes the first reaction in the Krebs cycle) and lactate dehydrogenase (a key enzyme in anaerobic metabolism), were shown to be higher in the RM and WM of mako sharks as compared to ectothermic species (Bernal et al., 2001). The selective advantages of the visceral and orbital retia seem to be more straightforward: the visceral rete is believed to enhance digestion and absorption rates by elevating and maintaining temperatures of the gut (Carey et al., 1971, 1981). Given the high activity levels of these animals, increased turnover of food into energy is a likely advantage in these sharks. By increasing temperature in the brains and eyes, the orbital rete may have allowed these sharks to exploit niches previously underutilized by sharks. Lamniform sharks encounter rapid changes in water temperature during both global migrations and excursions into deep water. These journeys are believed to be motivated primarily by feeding opportunities (Block and Carey, 1985; Bernal et al., 2001).
By stabilizing temperatures of the central nervous system and sensory structures (such as the eyes) these sharks can travel into cool waters and hunt prey that may have otherwise been off limits (Block and Carey, 1985; Block, 1991; Alexander, 1998). An analysis of the fossil record of Carcharodon by Purdy (1996) suggests a different reason for the development of endothermy in these sharks. By today's standards, the great white shark (maximum length 6.4m; Compagno, 1984a) is certainly one of the most formidable shark species. But C. carcharias is dwarfed by the fossil species C. megalodon, which is the largest known predatory shark species ever to inhabit the world's oceans, with an estimated length of 15.9m (52.2 ft) (Gottfried et al., 1996; Chapter Three). The extant great white and its enormous relative were both common in the and it has been suggested that endothermy evolved in C. carcharias in order to escape predation by the larger species. Purdy (1996) suggests that the fossil record of C. carcharias indicates a steady preference for cooler water, thus avoiding the warm waters preferred by C. megalodon. However, this runs counter to the idea that a shark as large as C. megalodon would be better able to maintain a constant body temperature than the modern, smaller species due to its sheer mass and lower surface area to volume ratio. In conclusion, lamnid sharks have developed a suite of complex anatomical and physiological adaptations necessary to maintain stable internal body temperatures. Compared to tunas, the physiological adaptations necessary for endothermy in lamnids (such as metabolic and hematological properties) are still poorly known (Bernal et al., 2001). Although heat retention is limited to certain areas of the body (locomotor muscles, viscera, eyes and brain), this capability may have enabled lamnids to expand their ecological habitats, thus exploiting previously unavailable food supplies and avoiding predation by larger species.
Although discussions of endothermy in fish have been limited to lamnid sharks and scombroid fishes, anatomical evidence from thresher sharks and myliobatiform rays suggests that endothermy may be more prevalent among elasmobranchs than previously believed.
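The body-size argument in this section (a larger shark has a lower surface-area-to-volume ratio and so should retain heat more easily) can be illustrated with a rough calculation. The following is only a minimal sketch, assuming isometric scaling, i.e. that body shape does not change with size, so that surface area grows with the square of length and volume with its cube. That assumption and the function name are illustrative and are not taken from the studies cited; the only values drawn from the text are the two body lengths.

```python
def relative_sa_to_v(length_m: float, reference_length_m: float) -> float:
    """Surface-area-to-volume ratio of an animal relative to a reference animal.

    Illustrative assumption (not from the cited studies): isometric scaling,
    so area ~ L**2 and volume ~ L**3, which gives SA:V ~ 1/L.
    """
    if length_m <= 0 or reference_length_m <= 0:
        raise ValueError("lengths must be positive")
    return reference_length_m / length_m


great_white_m = 6.4   # maximum length of the great white cited in the text
megalodon_m = 15.9    # estimated length of the fossil species cited in the text

ratio = relative_sa_to_v(megalodon_m, great_white_m)
print(f"SA:V of the larger species relative to the great white: {ratio:.2f}")
```

Under these assumptions the larger species would have roughly 40% of the great white's surface area per unit of body volume. Real sharks differ in shape, insulation and circulation, so the figure indicates only the direction and approximate size of the effect.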

Biological adaptations in lamniform sharks III: Uterine cannibalism

Elasmobranch reproduction

Reproduction in elasmobranchs is characterized by internal fertilization, low fecundity and the birth of relatively large, precocial offspring (Wourms and Demski, 1993). Essentially there are two basic types of reproduction in elasmobranchs: oviparity, or egg laying, and viviparity, or "live birth". Both of these reproductive strategies have been subdivided further into numerous categories by various authors; therefore, for simplicity, this review follows Compagno (1990c).

Oviparity

There are two types of oviparity, distinguished by how long the eggs are retained by the mother prior to deposition in the substrate. In extended oviparity, eggs are laid soon after fertilization and development occurs within the case, mostly outside of the mother. This strategy is considered ancestral for elasmobranchs and is practiced by skates (rajiforms), heterodontiforms, and certain orectolobiforms and carcharhiniforms (see Chapter Two for a review of elasmobranchs). A second form of oviparity involves the retention of egg cases within the mother until the young are well developed. This strategy, in which the eggs hatch soon after deposition, is characteristic of only 1% of elasmobranch species and is limited to certain carcharhiniforms and orectolobiforms.

Viviparity

There are four types of viviparity in elasmobranchs, which differ in the method used to supply nutrients to the young. Yolk-sac viviparity (sometimes called ovoviviparity) is essentially extended oviparity, with the exception that the young are born live, rather than within an egg case. This intermediate reproductive strategy is found in certain batoids (rhinobatiforms, pristiforms and torpediniforms), squaliforms, squatiniforms, and some orectolobiforms and carcharhiniforms. The remaining three forms of viviparity are highly specialized and found only within certain groups: (1) uterine viviparity (only in certain myliobatiforms), in which the embryos subsist on "uterine milk", a nutritious fluid that is secreted from the uterus; (2) placental viviparity, in which nutrients are supplied to the young via a placenta (limited to certain carcharhiniforms); and (3) uterine cannibalism, which is known only in lamniforms. The remainder of this section focuses on uterine cannibalism and presents a brief overview of reproduction in lamniform sharks.

Uterine cannibalism: oophagy and adelphophagy

In the early 1900s, examination of the embryos of Lamna nasus revealed that these embryos possessed a grossly enlarged abdomen that was filled with large amounts of yolk. This was the first evidence of oophagy, a reproductive strategy in which the mother continues to ovulate after fertilization, releasing unfertilized eggs into the uterus on which the developing young feed. Oophagy has been shown in all lamniform species except for O. noronhai, O. ferox, M. owstoni, C. maximus and M. pelagios, for which little (or no) information on pregnant females is available. In several species, the young inside the mother develop specialized teeth that are used to rip open egg capsules (Liu et al., 1999; Francis and Stevens, 2000). As the embryos develop, these teeth are replaced with those that more closely resemble the adult dentition. Stribling et al. (1980) stated that oophagy is the most efficient adaptation for nourishing developing young, based on an examination of the extremely high caloric content of the eggs of C. taurus. Such a high caloric intake is necessary to produce the very large young characteristic of lamniform sharks. The size at birth for many lamniforms is at least 1m in length: the only known exceptions are L. nasus (60-80cm), L. ditropis (40-85cm), I. oxyrinchus (60-70cm) and P. kamoharai (in which the adult size is only 1m). However, since the young of P. kamoharai are born at about half the adult body length (Fujita, 1981), they represent (in comparison to the adult size) the largest embryos of any lamniform. In addition to oophagy, the embryos of C. taurus have been shown to practice adelphophagy, in which embryos attack and consume their smaller siblings while still in the uterus (Bass et al., 1975). Although adelphophagy has only been reported in C. taurus, it is thought that other lamniform species (such as P. kamoharai) may also exhibit this behavior (Compagno, 2001). The development of C. taurus has been described in detail by Gilmore et al. (1983). In early stages of development, encapsulated C. taurus embryos feed on yolk supplies within the egg. Upon hatching, any remaining yolk-sac material is absorbed until the embryo reaches a size of about 100mm. At this size the embryo possesses functional teeth and a well-developed caudal fin and begins to actively attack and consume smaller siblings. Embryos will continue this struggle for survival until only one embryo remains. This embryo, having fed on its smaller siblings, will achieve a size at birth of about 1m in length. Most lamniform sharks have adopted a reproductive strategy in which only a few, very large young (2-4 per brood) are born. Exceptions include I. oxyrinchus (4-25 pups) and C. carcharias (6-14 pups; Uchida et al., 1996; Mollet et al., 2000). The gestation period is only known for L. nasus, C. taurus, A. superciliosus and I. oxyrinchus, and ranges from 8-12 months in these species (Gilmore et al., 1983; Moreno and Moron, 1992; Francis and Stevens, 2000; Mollet et al., 2000). As discussed above, the pups can reach such large sizes due to the high nutritional value of their food, which is either unfertilized eggs or, in at least one species, smaller siblings. Such large sizes at birth make these pups less vulnerable to predation. Unfortunately, the low fecundity of lamniform sharks makes certain species more vulnerable to over-fishing.

Summary

The order Lamniformes possesses several distinctive biological adaptations including planktivory, endothermy, and uterine cannibalism. Species in this order inhabit a variety of marine environments including very cold, polar waters. With few exceptions (such as L. ditropis, A. pelagicus and possibly O. noronhai) lamniforms are found in all three major oceans of the world. Reproduction strategies of lamniforms (and other sharks) leave them vulnerable to over-fishing, and as such, the conservation status of many species is unknown. Although many species within this group are common, their biology is still poorly known. The next section presents a morphological guide to lamniform sharks as well as the current theories on their systematics and evolutionary relationships.

CHAPTER TWO: LAMNIFORM SYSTEMATICS AND PHYLOGENY

Introduction

The Chondrichthyes comprise the Elasmobranchii (sharks and rays) and the Holocephali (chimaeras, ratfishes, elephantfishes). This Chapter introduces current theories in elasmobranch systematics as a basis for discussing relationships of Lamniformes to other elasmobranch taxa. A brief introduction to the elasmobranchs, including the seven other living orders of sharks (based on Compagno, 1984a,b) and the rays or batoids (based on Compagno 1977, 1990c), is presented as a framework for this discussion. A guide to the major morphological characters used in elasmobranch systematics is also presented, not as a comprehensive list of morphological traits, but rather as an introduction to important characters used in anatomical analyses. This discussion also provides background material for subsequent discussions of lamniform phylogeny. Evolutionary relationships based on morphological characters are compared to molecular-based studies of phylogeny in an effort to understand the current theories of the relationships both among elasmobranchs and within the lamniforms.

Introduction to extant orders of sharks and rays

There are 375 species of sharks (Compagno, 1990c) arranged in thirty families within eight orders (including Lamniformes; see Table 2.1).

Hexanchiformes

This order contains five species of poorly known, deep-water sharks. Most species in this group are smaller than 2m, although Hexanchus griseus may attain lengths of 4.7m. Chlamydoselachus and Hexanchus have six pairs of gill slits, whereas Heptranchias and Notorynchus possess seven pairs of gill slits. The majority of other shark species (with the exception of Pliotrema, see below) have only five pairs of gill slits.

Squaliformes

This is the second largest order of sharks with at least 87 living species, making up a total of 23% of all living shark species (Compagno, 1990c). The Echinorhinidae and Oxynotidae are poorly known deep-water sharks. Echinorhinids may reach up to 4m while oxynotids are rarely larger than 1.5m. The Squalidae is a large and diverse family containing 17 genera. Species within this family primarily inhabit deep water, with several genera exhibiting bioluminescence. Gulper sharks (Centrophorus spp.) have been recorded at record depths of over 6000m! Members of the Squalidae range from several dwarf species, including the spined pygmy shark, Squaliolus laticaudus (which may be the smallest species of shark, since it matures at 15cm and attains a maximum length of 25cm), to the large sleeper sharks (Somniosus spp.) that may exceed 6m in length. This family also includes the only known parasitic chondrichthyans: the cookie cutter sharks (Isistius spp.). These sharks utilize their enlarged "lips" and pharynx to suck onto their prey, insert their sawlike lower teeth into the flesh of their victim, then spin and twist to remove plugs of flesh. The "crater-like" scars from these attacks have been found on a variety of victims including dolphins, whales, large teleosts, megamouth sharks, and even the rubber sonar domes of U.S. Navy nuclear submarines.

Table 2.1. Extant orders and families of sharks (Compagno 1984a,b).

Order                                 Family                                   Common name
Hexanchiformes Compagno, 1973         Chlamydoselachidae Garman, 1884          Frilled sharks
                                      Hexanchidae Gray, 1851                   Sixgill & sevengill sharks
Squaliformes Compagno, 1973           Echinorhinidae Gill, 1862                Bramble sharks
                                      Oxynotidae Gill, 1872                    Prickly dogfish, roughsharks
                                      Squalidae Blainville, 1816               Dogfish sharks
Pristiophoriformes Compagno, 1973     Pristiophoridae Bleeker, 1859            Saw sharks
Squatiniformes Compagno, 1973         Squatinidae Bonaparte, 1838              Angel sharks
Heterodontiformes Compagno, 1973      Heterodontidae Gray, 1851                Bullhead or horn sharks
Orectolobiformes Compagno, 1973       Parascylliidae Gill, 1862                Collared carpet sharks
                                      Brachaeluridae Applegate, 1974           Blind sharks
                                      Hemiscylliidae Gill, 1862                Bamboo sharks
                                      Orectolobidae Gill, 1896                 Wobbegongs
                                      Ginglymostomatidae Gill, 1862            Nurse sharks
                                      Stegostomatidae Gill, 1862               Zebra sharks
                                      Rhiniodontidae Muller & Henle, 1839      Whale shark
Carcharhiniformes Compagno, 1973      Scyliorhinidae Gill, 1862                Catsharks
                                      Proscylliidae Fowler, 1941               Finback catsharks
                                      Pseudotriakidae Gill, 1893               False catsharks
                                      Triakidae Gray, 1851                     Houndsharks
                                      Leptochariidae Gray, 1851                Barbeled houndsharks
                                      Hemigaleidae Hasse, 1879                 Weasel sharks
                                      Sphyrnidae Gill, 1872                    Hammerhead sharks
                                      Carcharhinidae Jordan & Evermann, 1896   Requiem sharks
Lamniformes Compagno, 1973            See Chapter One

Pristiophoriformes

This is a poorly known order of relatively small (maximum size 1.37m) sharks that are found inshore and offshore at depths of at least 915m. The saw sharks are a morphologically distinct group of sharks so named because of their elongated, flattened snout that bears numerous lateral teeth (thus resembling a saw). The single family, Pristiophoridae, contains five species in two genera, which are easily distinguished (Pliotrema is differentiated from Pristiophorus by the presence of six pairs of gill slits, an unusual trait among elasmobranchs).

Squatiniformes

The order Squatiniformes contains one family (Squatinidae) and one genus (Squatina) containing 12 species of angel sharks. These are dorsoventrally flattened, primarily benthic sharks that inhabit coastal waters and are occasionally found at depths of up to 1300m. The maximum length of these sharks is 2m, although most species attain less than 1.6m. These sharks are often found buried in sediments, where they ambush their prey by quickly snapping their jaws, which are equipped with needle-sharp teeth. It is because of their speed and dentition that these sharks should be regarded as potentially dangerous. Several angel shark species are fished commercially for food, oil and leather.

Heterodontiformes

This order contains one family (Heterodontidae) and one genus (Heterodontus) with eight species of horn sharks. These are bottom-dwelling, nocturnal sharks and, although one species (H. ramalheira) occurs at depths between 108-275m, most occur in shallow water. Heterodontids are small sharks (a maximum size of 1.65m is reported but they rarely exceed 1.37m) and are easily distinguished from other sharks by their blunt, pig-like snout and dorsal fin spines. These sharks are easily kept in public aquaria and because of this, much is known of their reproduction and biology. Courtship behavior, which is generally poorly understood in elasmobranchs, has been documented from several captive species. In addition, their shallow water habitat allows them to be easily studied and at least one species, H. portusjacksoni, has been observed extensively in the wild. Horn sharks lay unique, spiral-shaped eggs, and wild sharks are known to carry these eggs in their mouths and presumably insert ("screw") them into crevices and between rocks.

Orectolobiformes

Collectively known as carpet sharks, this is the third largest order of sharks, with 9% of all shark species (Compagno, 1990c). There are at least 32 species organized in seven families. Many species are benthic, coastal sharks, around 1m in length. These include the collared carpet sharks (Parascylliidae), which can change their color to match the substrate; blind sharks (Brachaeluridae), named for their habit of closing their eyes when caught (not because of impaired vision); and bamboo sharks (Hemiscylliidae), also called long-tailed sharks. Several other orectolobiform families include species that exceed 3m in length, including the wobbegongs (Orectolobidae), nurse sharks (Ginglymostomatidae), zebra sharks (Stegostomatidae) and the largest living shark, the whale shark (Rhiniodontidae; see Chapter One). Although carpet sharks are generally harmless, attacks have been reported for orectolobids and ginglymostomids. These attacks are not fatal and are often provoked, occurring when the shark, which normally rests on the bottom, is stepped on or molested. Several species, including the nurse shark, Ginglymostoma cirratum, are maintained in public aquaria. As for the Heterodontiformes, captivity provides a unique opportunity to study these animals and hence the mating behavior of G. cirratum is well known.

Carcharhiniformes

The Carcharhiniformes (ground sharks) are the largest order of sharks, with a total of at least 210 species in 48 genera and eight families, representing 53% of all living shark species (Compagno, 1990c). This order includes the largest family of sharks, the Scyliorhinidae, with at least 89 species of catsharks. The Carcharhiniformes are a diverse group of sharks that have been successful in almost every marine ecological niche. The Scyliorhinidae, Triakidae and Pseudotriakidae are found at depths of at least 1500m. The Leptochariidae, Hemigaleidae and Proscylliidae contain harmless species that rarely exceed 1.4m in length. This last family includes the pygmy ribbontail catshark, Eridacnis radcliffei, which matures at 15cm and achieves a maximum length of only 24cm. At the other extreme, the Carcharhiniformes contains two families of very large sharks, the Sphyrnidae (hammerhead sharks) and the Carcharhinidae (requiem sharks), that each achieve lengths of at least 3m. Some of these species have been implicated in numerous attacks on humans, and carcharhinids may be responsible for more attacks on humans than any other neoselachian group. The Carcharhinidae are primarily tropical sharks, and also contain the only known sharks that can live in freshwater for extended periods of time: the bull shark (Carcharhinus leucas) and river sharks (Glyphis spp.). As a consequence, the bull shark and the Ganges shark (Glyphis gangeticus) are believed responsible for attacks (sometimes fatal) on humans in freshwater lakes, rivers and streams. Other dangerous species include the tiger shark (Galeocerdo cuvier; perhaps the most dangerous species of tropical shark) and the oceanic white tip (Carcharhinus longimanus; believed responsible for attacking victims of ships sunk in the open sea).

Skates and rays (Batoidea)

The batoids are the largest group of cartilaginous fishes with 494 species (55% of all chondrichthyans; Compagno, 1990c). Compagno (1973) recognized five orders of batoids. The two largest orders are the Rajiformes (skates; 223 species) and the Myliobatiformes (stingrays, eagle rays, manta rays; 171 species), some of which are the only known chondrichthyan species confined to freshwater habitats. The Rhinobatiformes (guitarfishes; 53 species), Torpediniformes (electric rays; 43 species) and the Pristiformes (sawfishes; four species) make up the remainder of the Batoidea (Compagno, 1990c). Only relationships between batoids and sharks are considered in this Chapter.

Guide to the major morphological characters used in elasmobranch systematics

The anatomical characters used in elasmobranch systematics are often controversial. This is illustrated by the lack of a standard morphological terminology for this group. This problem is compounded by an often inadequate morphological characterization of elasmobranch taxa, and has made the study of this group difficult (Compagno, 1988). Compagno's (1988) morphological analysis of the Carcharhiniformes is one of the most comprehensive anatomical examinations of a group of sharks to date. Therefore, terminology and morphological characters utilized in this section are based on Compagno's study (unless otherwise noted) in an effort to simplify this introduction to important anatomical characters.

Dentition

Dental characters are of primary importance in the analysis of extinct taxa, since most fossil shark species are based solely on teeth (Chapter Three). However, a general introduction to dental characters is presented here because of their usefulness in assessing relationships among living sharks. The base of the tooth, which is embedded in a dental membrane and anchors the tooth in the jaw, is called the root. The crown of the tooth provides the cutting surface and is covered with a shiny substance termed "enameloid", which is distinct from both dentine and enamel. The crown is subdivided into three regions: the foot, the neck, and the cusp. The expanded base of the crown, called the foot, is separated from the root by the neck, which is a band of varying thickness. Above the foot is the primary cusp, the cutting surface of the tooth. The foot may also possess cusplets, and the number, position, and size of these cusplets are often useful systematic characters.

The surface of the tooth is described according to its position in the mouth as either labial (facing outside; "labial" means pertaining to the "lips") or lingual (facing inside; "lingual" refers to the tongue). Histologically, the teeth can be divided into two major groups: orthodont and osteodont. Although the composition of the tooth is controversial, orthodont teeth are easily distinguished by the presence of a pulp cavity (a central vascular canal) that is absent from osteodont teeth, such as those found in lamniforms. The teeth of sharks are arranged in labiolingual files (which contain teeth in various stages of development) and mesiodistal rows (where teeth are in comparable stages of development). Tooth replacement is continuous throughout the lifetime of the shark, and proceeds such that teeth developing in more lingual positions move labially to replace those in front of them. Shark dentition may be heterodont, in which teeth along the jaw are morphologically distinct, or homodont, in which all teeth are identical along the jaw. Shark teeth may also be differentiated along one jaw (either the upper or lower jaw; monognathic heterodonty) or there may be differences between the teeth of the upper and lower jaws (dignathic heterodonty). Lamniforms possess five different types of teeth that can be distinguished by both their morphology and their position in the jaw: anterior teeth (A); symphysial teeth (S); intermediate teeth (I); lateral teeth (L); and posterior teeth (P) (Applegate, 1965). Most lamniform taxa (except Mitsukurina, Megachasma, Cetorhinus, A. superciliosus) possess a small intermediate tooth, not found in other shark groups, which divides the anterior teeth from the lateral teeth in the upper jaw. This arrangement is known as the "lamnoid" tooth pattern, and it is unique to lamniforms. Other forms of heterodonty depend upon growth and sex. In lamniform sharks, ontogenetic heterodonty is known for C. taurus, A. superciliosus, C. carcharias, I. oxyrinchus (Chapter One) and L. nasus (Cigala Fulgosi, 1983; Compagno 1988) and is suspected in O. noronhai and P. kamoharai (Shimada, 2001, 2002a). For example, lateral cusplets may be lacking in embryos but present in adults (e.g., C. taurus), or vice versa (e.g., C. carcharias). Sexual heterodonty is known in C. maximus (Compagno, 1988), A. superciliosus (Cigala Fulgosi, 1983; 1988; Compagno, 1988) and P. kamoharai (Bass et al., 1975).

Chondrocranial characters and jaw suspension

The "skull" of sharks is composed of two parts: (1) the splanchnocranium (also called the visceral arches), which in sharks forms the jaws, the hyoid arch (which supports the jaw) and the branchial arches (which support the gills and respiratory musculature); and (2) the chondrocranium, cartilaginous blocks that surround sensory structures. In sharks and other cartilaginous fishes, the dermatocranium does not develop, and instead, the chondrocranium (which remains rudimentary in other vertebrates) develops further and becomes the only structure that protects the brain and associated sensory systems of the head.

Unlike that of mammals, the upper jaw (palatoquadrate cartilage) of elasmobranchs is not fused to the skull. The way in which the palatoquadrate attaches to the chondrocranium, called jaw suspension, has typically been described in sharks as either amphistylic or hyostylic. In sharks with an amphistylic jaw suspension the palatoquadrate connects to the chondrocranium anteriorly by a ligament and posteriorly via the hyomandibula (the top part of the hyoid arch). In sharks with a hyostylic jaw suspension the jaws are only attached to the hyomandibula. Examination of jaw suspension in sharks reveals that it is more complicated than this simple dichotomy, and may provide useful taxonomic characters (Compagno, 1988). The chondrocranium of sharks is divided into four regions: (1) the ethmoidal region, including the rostrum (cartilage of the snout) and nasal capsules; (2) the orbital region, including the orbits (the "eye sockets"); (3) the otic region (which encloses the inner ear); and (4) the occipital region, the posterior chondrocranium. Differences in both the morphology and proportions of these four regions are useful traits in shark taxonomy. For example, the rostrum of both lamniforms and carcharhiniforms possesses a characteristic "tripodal" shape; this feature is often used to unite these two groups (see below). Several specific chondrocranial characters worth noting include both the suborbital (subocular) shelf and the chondrocranial foramina. The suborbital shelf, a horizontal plate in the floor of the orbit, is present in all galeomorph and squatinomorph sharks but absent in squalomorphs and batoids. Chondrocranial foramina are openings in the braincase usually associated with the passage of blood vessels or nerves. Of particular taxonomic importance are the positions of foramina for both the various cranial nerves and their branches (peripheral nerves originating from the brain) and several blood vessels that serve the brain and eye.
These include several vessels important in endothermy such as the efferent hyoidean artery, paired dorsal aortae, internal carotids and the stapedial artery, a branch off the efferent hyoidean that goes to the orbit and snout. The stapedial artery may enter the chondrocranium via a foramen, or this foramen may enlarge to become the stapedial fenestra (a fenestra is an opening in the braincase), as in lamniforms. Several muscles associated with the head and jaws of sharks are useful in differentiating individual shark taxa. These include: the preorbitalis muscle (which protrudes the upper jaw and helps close the mouth); the adductor mandibulae (the main muscle responsible for closing the mouth); the levator palatoquadrate (raises the upper jaw); and the levator hyomandibulae (involved in movement of the hyomandibula). Several aspects of these muscles, including the position, function, composition and both the origin and insertion (points of attachment of muscles that are least and most moveable, respectively) are used in shark systematics.

Fins

Elasmobranchs possess a caudal (tail) fin, one or more dorsal fins along the back, and the ventrally placed paired pectoral and pelvic fins. The presence of an anal fin, located ventrally between the pelvic fins and tail, is an important character present in all galeomorph sharks and hexanchiforms. Characters such as the relative position and proportion of the fins, and the presence of spines on the dorsal fins, are also utilized in taxonomy. The presence of precaudal pits (a notch or depression which occurs anterior to the caudal fin) and their position (either dorsal, ventral or both), or keels (which protrude laterally from the base of the caudal fin) are also useful taxonomic characters, particularly within Lamniformes. The skeleton that supports the pectoral fin also provides important taxonomic characters. The organization of the pectoral fin falls into two categories that differ in the distal extension of the radials into the fin web. In aplesodic fins, the distal segments of the radials are truncated and hence the distal fin web is supported primarily by ceratotrichia. Since the ceratotrichia are thin and flexible, the distal portion of the fin is easily bent. In plesodic fins, the radial cartilages extend well into the fin web in order to stiffen and support the fin. Pectoral fins with a plesodic fin skeleton are therefore less flexible than those with aplesodic fin skeletons. Plesodic fins are considered a derived lamniform feature by Compagno (1990b), but this view is not unanimous (e.g., Morrissey et al., 1997). Certain lamniforms also have plesodic pectoral fins (Chapter Three).

Other external characters

Several other important external characters used in shark taxonomy include: the presence/absence of a nictitating lower eyelid (moveable lower eyelid; NLE); presence and size of the spiracle (the first gill opening, which resembles a hole between the eye and the first gill slit); number, position and size of the external gill slits; morphology of the nostril region including the presence of a nasoral groove (a shallow depression between the nostril and the mouth on the ventral surface of the snout); and morphology of the dermal denticles (the tooth-like scales of chondrichthyans).

Vertebral column and calcification

In elasmobranchs the notochord (a long, fluid-filled rod composed of fibrous connective tissue) is constricted along its axis by a series of cartilaginous blocks (the vertebrae). The absence of vertebrae (and hence dependence on a notochord) is considered an ancestral feature in sharks (see Hexanchiformes below). In sharks, there are two main types of vertebrae: trunk vertebrae, which possess attachment sites for ribs; and caudal vertebrae, which are easily recognized by the presence of the hemal arch, a ventral V-shaped projection off the centrum (the circular midsection of the vertebra) that protects the blood vessels that serve the caudal fin. In sharks, the centra of trunk vertebrae may be monospondylous or diplospondylous, in which there are one or two centra per muscle block, respectively. The diplospondylous condition in sharks is believed to arise during development, when the centra elongate and subsequently divide, forming two vertebrae of varying lengths. This process results in a "stutter zone": a series of alternating long and short vertebral centra, characteristic of regions containing diplospondylous centra. The total number of vertebrae, including the exact number of the three different types of vertebrae, and various length and width ratios of the vertebrae, are widely used taxonomic characters in sharks. Both the structure of the vertebral centra and their calcification patterns were important characters in elasmobranch systematics throughout the early twentieth century. However, the patterns of calcification are controversial and may be influenced by environmental factors, since deep-water species often display decreased calcification of the skeleton. For example, among lamniform sharks, species known to inhabit deep water, such as O. ferox and A. superciliosus, display reduced calcification patterns as compared to species that do not inhabit these environments (C. taurus and A. vulpinus, respectively). Other deep-water lamniforms, including Mitsukurina and Pseudocarcharias, also display simplified calcification patterns. An extreme example is observed in Megachasma, which possesses a very rudimentary axial skeleton. For these reasons, caution must be applied when considering these characters in systematic analysis. The presence of dense, irregular masses of calcified cartilage (hypercalcification) may also be a useful taxonomic character in sharks. One example is Lamna, in which the rostrum is a heavily calcified, solid structure as compared to the hollow rostra of other shark species.

The intestine

The type of intestine, and the number of rings or turns in the intestine, are used as morphological traits for taxonomy. Unlike the long intestinal tracts of mammals, the intestines of sharks are short and rely upon modifications of mucosal folds to increase the surface area available for absorption of nutrients. These modifications define the three different types of intestines found in sharks. (1) A spiral valve intestine is considered the most ancestral type and is found in most shark species. It is characterized by a helix-like folding of the intestinal mucosa such that when opened, the inside of the intestine resembles a series of overlapping cones. (2) A ring valve intestine, so named for its series of transverse mucosal folds that surround the intestinal lumen, is present primarily in lamniform and orectolobiform sharks, although Hexanchus (Hexanchiformes) and a carcharhiniform genus also possess this type of intestine (Compagno 1988). (3) A scroll valve intestine is restricted to the families Carcharhinidae and Sphyrnidae, and consists of a large flap of mucosal tissue that rolls up on itself. Sharks with scroll valves are the only known species of sharks capable of everting their intestine through their cloaca, thus releasing its contents.

Classification of sharks

To date, the majority of morphology-based elasmobranch taxonomy is based on inference (phenetic gestalt) rather than tested with phylogenetic analysis. Theories on relationships among taxa are often presented as vague statements with little explanation or supporting evidence. This results in elasmobranch classifications that depend upon both the characters chosen and on individual interpretations of these characters. Thus, it is not surprising that relationships among elasmobranchs are highly controversial. Compagno (1973, 1977) divided the elasmobranchs into four superorders: Squalomorphii (comprising Hexanchiformes, Squaliformes and Pristiophoriformes); Squatinomorphii (Squatiniformes only); Batoidea (skates and rays); and Galeomorphii (comprising Heterodontiformes, Orectolobiformes, Carcharhiniformes and Lamniformes). The first morphology-based cladistic analyses of elasmobranchs were conducted by Shirai (1992a,b), followed by Shirai (1996) and De Carvalho (1996). These analyses were often at odds with several traditional views and are briefly discussed below.

Batoidea and Sharks

Debate continues on the relationships between batoids and sharks. Traditionally, living elasmobranchs have been divided into a shark-ray dichotomy (e.g., Regan, 1906; Garman, 1913; White 1936, 1937; Bigelow and Schroeder 1948; Compagno, 1973, 1977). However, Jordan (1923) and Moy-Thomas (1938) grouped squatiniforms with rays, rather than sharks. Cladistic analyses recovered a clade that comprised Squatiniformes, Pristiophoriformes, and Rajiformes (=Batoidea), referred to as the "hypnosqualean" group (Shirai, 1992a,b, 1996; De Carvalho, 1996).

Hexanchiformes

The Hexanchiformes are phylogenetically controversial for two reasons: (1) their position at the base of the Neoselachii, and (2) whether the order is monophyletic. The position of Hexanchiformes is based principally on the amphistylic jaw articulation and the nature of the vertebral column (see above). Compagno (1977), however, stated that certain non-hexanchiforms (e.g., Pseudocarcharias) have an amphistylic jaw suspension, and that the loss of vertebrae is secondary. Studies also disagree over whether the Hexanchiformes are monophyletic (Compagno, 1977; De Carvalho, 1996) or paraphyletic (Bigelow and Schroeder, 1948; Shirai, 1992a, 1996).

Squalomorphii

The monophyly of the Squalomorphii (sensu Compagno, 1973, 1977) is also controversial. According to cladistic analyses (Shirai, 1992a, 1996; De Carvalho, 1996), the Squaliformes is paraphyletic, and all agree on the removal of the Echinorhinidae. These cladistic analyses regard Galeomorphii as the sister taxon to a clade (called Squalea) that comprises hexanchiforms, squaliforms, squatiniforms, pristiophoriforms and batoids.

Galeomorphii

Historically, the monophyly of the Galeomorphii has proven less contentious than that of the Squalomorphii. The two main issues regarding galeomorph sharks are (1) the inclusion of the Heterodontiformes, and (2) relationships among galeomorph taxa (including the sister group of lamniform sharks).

Heterodontiformes

Heterodontiformes were not always included with other galeomorph sharks; in fact, they were once considered ancestral, and allied with the fossil hybodonts (e.g., Jordan 1923; White, 1936, 1937). The traits used to unite heterodontiforms and hybodonts are now recognized as either basal neoselachian characters or convergent (Compagno, 1977, 1988). Recent cladistic analyses strongly support the inclusion of Heterodontiformes in a clade with Orectolobiformes, Carcharhiniformes and Lamniformes, thus affirming the monophyly of the Galeomorphii (Shirai, 1992a, 1996; De Carvalho, 1996). The relationship between heterodontiforms and other galeomorphs is still unresolved, as this group is placed either at the base of the Galeomorphii (Shirai, 1992a, 1996; De Carvalho, 1996), or in a clade with the orectolobiforms (Compagno, 1977, 1988; contra Applegate, 1972).

Orectolobiformes and Lamniformes

These two orders have been united based on vertebral calcification patterns, the shared presence of a ring valve intestine, and the absence of an NLE (Regan, 1906; White 1936, 1937). Individual orectolobiform taxa have been allied with lamniforms, including Rhiniodon (Garman, 1913; this was based on filter-feeding, see Chapter One) and Ginglymostomatidae (Applegate, 1972). A close relationship between Lamniformes and Orectolobiformes has also been suggested based on the fossil shark Palaeocarcharias, which has lamniform-like teeth combined with an orectolobiform-like body (Chapter Three). Compagno (1988), however, rejects the characters used to unite Orectolobiformes and Lamniformes, stating that presumed similarities in vertebral calcification patterns and the absence of an NLE are ancestral characters present in other groups of sharks, including non-galeomorphs. Compagno (1988) also contends that vertebral calcification patterns are influenced by habitat, and are hence unreliable for phylogenetic purposes (see above). In addition, Compagno states that the ring valve intestines of these sharks are different (long and narrow in lamniforms, but short and wide in orectolobiforms), and that not all orectolobiforms possess this intestinal type. A ring valve intestine is also present in other sharks (see above).

Lamniformes and Carcharhiniformes

Compagno (1998) claims that the Carcharhiniformes are the sister taxon to Lamniformes, a relationship that is supported by cladistic analyses (Shirai, 1992a, 1996; De Carvalho, 1996). De Carvalho (1996) recovered six characters that unite lamniforms and carcharhiniforms (although he cautions that almost all display some degree of homoplasy): ethmoidal region not downcurved; similarities in both the origin and insertion of the preorbitalis muscle; absence of nasooral groove; presence of tripodal rostrum; and absence of special folds (called "aprons") over the root of the tooth. Shirai (1996) retained the Lamniformes + Carcharhiniformes clade based on the shared presence of a tripodal rostrum and a secondarily shortened inter-orbitonasal region (in contrast to the elongated space found in orectolobiforms and heterodontiforms). Maisey (1984) advanced several shared chondrocranial characters (e.g., enlarged stapedial fenestrae; a pocket for the attachment of the deep adductor muscle) as well as similarities in both vertebral calcification patterns and dorsal fin size. Compagno (1988) stated that an elongated preoral snout, labially expanded bilobed tooth roots, and clasper siphons may be used to unite lamniforms and carcharhiniforms. Munoz-Chapuli et al. (1994) supported a close relationship between lamniforms and carcharhiniforms (particularly Carcharhinidae and Sphyrnidae) based on the arrangement of coronary arteries (the blood vessels which serve the heart), a character not discussed by other authors. However, several of the characters used to link Lamniformes and Carcharhiniformes are contentious. For example, a tripodal rostrum is often considered unique to Lamniformes and Carcharhiniformes (Compagno, 1977; Maisey, 1984; De Carvalho, 1996), but a putative tripodal rostrum is found in a fossil orectolobiform (Acanthoscyllium; Cappetta 1980), although Compagno (1988) believes this trait was possibly misidentified.
Also, developmental studies conducted on Squalus and some batoids suggest that rostral elements that may be homologous to those forming the tripodal rostra of galeomorphs may develop in these taxa (Maisey, 1984). In addition, lamniforms possess an osteodont tooth histology (carcharhiniforms are primarily orthodont) and a unique "lamnoid tooth" pattern (see above). Compagno (1988) disputed several of Maisey's characters, including vertebral calcification patterns (for reasons listed above) and enlarged first dorsal fin with reduced second dorsal fin (not in Mitsukurina and Carcharias). In addition, Compagno (1988) noted that enlarged stapedial fenestrae are not present in Mitsukurina or basal carcharhiniforms, and are greatly enlarged only in the Lamnidae. Nevertheless, although there is disagreement on the characters used to unite lamniforms and carcharhiniforms, this relationship appears to be generally accepted.

Molecular systematics of Elasmobranchii

Analyses of elasmobranchs based on anatomical characters remain controversial, as the relationships (and monophyly) of Compagno's (1973, 1977) orders remain uncertain. Compagno (1988) stated that further evidence was necessary to resolve discrepancies regarding elasmobranch systematics. In this regard, several molecular studies have been conducted. The first molecular-based studies of elasmobranch systematics utilized a technique known as microcomplement fixation (MCF) to assess relationships among taxa. In MCF, the antigen binding sites of a particular protein are examined by subjecting the protein to polyclonal antibodies. The amount of antigen binding is examined by comparing the ratio of bound versus unbound complement (proteins that associate only with tightly-bound antigen-antibody complexes): the amount of bound complement is proportional to the number of antigen sites bound by antibodies. Since the binding of antibodies and antigens is highly specific, this technique is used as an estimate of sequence similarity (Quicke, 1993). Several studies assessed relationships among sharks (although not all orders were included) by examination of the iron-binding serum protein transferrin by MCF (Burch et al., 1984; Davies et al., 1987; Lawson et al., 1995). Results of these studies showed a strong split between squalomorph and galeomorph sharks. The inclusion of Heterodontiformes in the Galeomorphii was strongly supported (Davies et al., 1987; Lawson et al., 1995; this order was not included by Burch et al. [1984]). Lawson et al. (1995) also support the placement of batoids as basal elasmobranchs rather than derived sharks (this was the only analysis to include a member of the Batoidea). The first analysis to utilize sequence data was conducted by Dunn and Morrissey (1995).
In this preliminary study, 303 bp of the mitochondrial 12S rRNA gene were examined from five elasmobranch taxa (representing Hexanchiformes, Squaliformes, Lamniformes, Heterodontiformes, and Batoidea). The analysis supported a close relationship between hexanchiforms and squaliforms, and between heterodontiforms and lamniforms, and supported the basal shark-ray dichotomy. Kitamura et al. (1996) examined the mitochondrial cytochrome b gene of four shark species (a hexanchiform, squaliform, pristiophoriform, and squatiniform) and several batoids. The resulting analysis (which was based on the amino acid sequence of a partial cytochrome b sequence) supported Shirai's (1992a, 1996) and De Carvalho's (1996) placement of the hexanchiforms as the sister taxon to a clade containing squaliforms, pristiophoriforms and squatiniforms, but disagreed with the monophyly of the "hypnosqualean clade". Kitamura et al. (1996) argued that Squaliformes was the sister taxon to a Pristiophoriformes + Squatiniformes clade. This differs from Shirai (1992a, 1996) and De Carvalho (1996) in the placement of the batoids, which according to Kitamura et al. (1996) are the sister taxon to all sharks rather than related to the pristiophoriforms. Unfortunately, since no galeomorph sharks were included in the analysis, relationships within this group were not examined. In an attempt to examine relationships between osteichthyan and chondrichthyan fishes, Arnason et al. (2001) examined twelve mitochondrial protein-coding genes from a heterodontiform, a squaliform, a batoid, two carcharhiniform sharks and eleven osteichthyan taxa. Although this study was not specifically designed to test relationships among elasmobranchs, the authors state that a split between galeomorph and squalomorph sharks (sensu Compagno, 1973) was strongly supported.
In addition, no evidence was found to support a close relationship between squaliforms and batoids, thereby disagreeing with Shirai (1992a, 1996) and De Carvalho (1996). However, the authors indicate that because the taxa necessary to examine sharks (i.e., squatiniforms and pristiophoriforms) relative to rays were not included in the analysis, it was premature to refute the "hypnosqualean clade". The most recent molecular analysis of elasmobranch relationships was conducted by Douady et al. (2003). This analysis criticized both the number of taxa sampled and the methodology of previous molecular-based analyses. For example, neither the study by Dunn and Morrissey (1995) nor that of Arnason et al. (2001) included the taxa necessary to resolve the monophyly of the hypnosqualean clade; and although Arnason et al. (2001) used a large dataset, Dunn and Morrissey's (1995) analysis was limited to a fragment of the 12S rRNA gene. Kitamura et al. (1996) was criticized in several respects, including limited taxon sampling; choice of outgroups (instead of a chimaera, a very distantly related taxon was included); the use of cytochrome b (which is questionable for analyses regarding deep divergences, see Chapter Four); and the absence of support values for their study. Douady et al. (2003) endeavored to redress these problems by using a 2.4 kb fragment of the mitochondrial genome composed of 12S rRNA, 16S rRNA and the intervening tRNA-Val sequences for all shark orders and two batoid orders (Rajiformes and Myliobatiformes). This study found strong support for an ancient split between sharks and rays, and neither the choice of outgroup nor the tree reconstruction method changed this result. The results concur with those of Kitamura et al. (1996), in which hexanchiforms are the basal outgroup to a clade containing squaliforms, pristiophoriforms and squatiniforms.
Among galeomorph sharks, the sister group relationship between lamniforms and carcharhiniforms was strongly supported, but the placement of both the orectolobiforms and heterodontiforms was variable. Douady et al. (2003) suggested that significantly more information (both in number of taxa and molecular characters) was required to determine their position more accurately.

Molecular data versus Morphological data

Given the controversy regarding the systematics of this group, surprisingly few molecular studies have been conducted on elasmobranchs. With the exception of Douady et al. (2003), these analyses fail to include representatives of all orders of sharks. However, despite these shortcomings, all molecular-based analyses seem to support the traditional shark-ray dichotomy rather than a hypnosqualean clade. Molecular analyses also place Heterodontiformes within the Galeomorphii, and Douady et al. (2003) found strong support for the Carcharhiniformes as the sister taxon to the Lamniformes. Further molecular and morphology-based investigations of elasmobranch systematics must be more rigorous if relationships among these groups are to be tested and evaluated.

Lamniform interrelationships

Some early classifications split the Lamniformes into two main groups: the first included Odontaspididae and Mitsukurinidae, and the second included Lamnidae, Alopiidae and Cetorhinidae (e.g., Regan, 1906; Garman, 1913; White, 1936, 1937; the Pseudocarchariidae and Megachasmidae were not yet known to science; Chapter One). White (1936, 1937) based this split on several characters including: dentition; position and size of the dorsal fins; the presence or absence of precaudal pits; pectoral fin and clasper morphology; vertebral calcification patterns; and cardiac characters (specifically the placement of heart valves). Compagno (1990b) constructed a cladogram of the Lamniformes based on his interpretation of the polarity of characters (Fig. 2.1); these characters were not put through a cladistic analysis. Several characters were considered of particular importance, such as jaw suspension, pectoral fin skeleton, and number of turns of the intestine. Carcharias, Odontaspis and Alopias have a similar jaw suspension, with a well-developed orbital process that articulates with a depression in the ventral surface of the chondrocranium; this is similar to the condition in carcharhiniforms, and hence is deemed ancestral. The orbital process is reduced in Mitsukurina and Pseudocarcharias, and is absent from Cetorhinus and Lamnidae; in these sharks, ethmopalatine ligaments aid in the suspension of the palatoquadrate (Compagno, 1990b). Mitsukurina, odontaspidids and Pseudocarcharias all possess aplesodic pectoral fins (considered plesiomorphic by Compagno), whereas the pectoral fins of Megachasma, Alopias, Cetorhinus and Lamnidae are plesodic. The third character is the number of turns in the spiral valve, with higher numbers considered derived.
Compagno (1990b) also considered chondrocranial characters "especially useful"; unfortunately he did not elaborate upon these, and regarded a "detailed account" of these characters as "beyond the scope of this account". Further, it is difficult to discern the evolutionary significance of many of his characters, making his non-cladistic analysis difficult to evaluate. Compagno (1990b) regarded the highly autapomorphic genus Mitsukurina as the basal lamniform taxon. The Odontaspididae (sand tigers) were considered paraphyletic, with Carcharias as the sister taxon to a clade comprising all remaining lamniforms. The genera Odontaspis and Pseudocarcharias were considered successive outgroups to the clade containing lamniforms with plesodic fins. Within this plesodic fin clade, Megachasma was regarded as the basal taxon. The remaining lamniforms formed an Alopias+(Cetorhinus+Lamnidae) clade. Compagno (1990b) also suggested possible alternative scenarios to this arrangement, such as a close relationship between Mitsukurina and Carcharias, as well as between Odontaspis and Pseudocarcharias, based on shared dental and chondrocranial features (although it is unclear exactly which features he uses in this assessment). The family Odontaspididae was considered paraphyletic on "weak evidence"; on the other hand, there was no strong evidence in support of a Carcharias+Odontaspis clade. Also, Compagno (1990b) suggested that O. noronhai (based on its low anal fin and large eyes) may be more closely related to Pseudocarcharias than to O. ferox. However, he later (2001) asserted the distinction between the Odontaspididae and the monotypic Pseudocarchariidae. Compagno was also uncertain of his placement of Pseudocarcharias at the base of the plesodic fin clade, because the characters that supported this position (e.g., reduction of third lower anterior teeth) may be convergent.
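
The branching order described above for Compagno's (1990b) preferred arrangement can be written compactly in Newick notation. The following Python sketch is illustrative only: the nested-tuple encoding, the variable names and the helper functions are assumptions introduced here, and the topology is transcribed from the verbal description in this section rather than from Fig. 2.1 itself.

```python
# Compagno's (1990b) preferred lamniform topology, transcribed from the text.
# Each tuple is a clade; a string is a terminal taxon. (Encoding is hypothetical.)
COMPAGNO_1990B = (
    "Mitsukurina",                      # basal lamniform
    ("Carcharias",                      # sister to all remaining lamniforms
     ("Odontaspis",                     # successive outgroups to the
      ("Pseudocarcharias",              # plesodic fin clade
       ("Megachasma",                   # basal within the plesodic fin clade
        ("Alopias", ("Cetorhinus", "Lamnidae")))))),
)


def to_newick(node):
    """Return a Newick string (without the trailing semicolon) for a nested tuple."""
    if isinstance(node, str):
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"


def leaves(node):
    """Return the terminal taxa of a clade in left-to-right order."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


newick = to_newick(COMPAGNO_1990B) + ";"
# Descend past Mitsukurina, Carcharias, Odontaspis and Pseudocarcharias.
plesodic_clade = COMPAGNO_1990B[1][1][1][1]

print(newick)
print(leaves(plesodic_clade))
```

The alternative scenarios mentioned by Compagno (e.g., Mitsukurina with Carcharias) could be encoded the same way by regrouping the tuples, which makes the competing hypotheses easy to compare side by side.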

Morphological guide to lamniform sharks

Although a comprehensive review of morphological characters that separate lamniform taxa is beyond the scope of this text, it is worthwhile to review some of the major morphological traits that distinguish both lamniform genera and species. This discussion is based on Compagno (1990b, 2001) unless otherwise noted. Sharks with an aplesodic pectoral fin skeleton are discussed first (Mitsukurina, Odontaspididae, Pseudocarcharias), followed by those with plesodic pectoral fins (Megachasma, Cetorhinus, Alopias, Lamnidae).

Mitsukurina

This bizarre-looking shark is easily distinguished by its unusually elongated "daggerlike" snout. The rostrum of this shark is unique, and possesses several modifications of the tripodal rostrum (e.g., its greater length; and a medial rostral cartilage that is expanded to form a plate). The jaw suspension of this shark is also unusual in that the position of the hyoid arch is reversed, thus allowing the jaws to swing forward. Aside from these autapomorphies, Mitsukurina possesses several characters regarded as ancestral by Compagno (1990b). The upper jaw of Mitsukurina bears three rows of large anterior teeth that are narrow and awl-like (as in Carcharias), and there are no intermediate teeth (these are replaced by a diastema). The intestine of Mitsukurina contains fewer than 20 turns, and Mitsukurina lacks enlarged stapedial fenestrae and precaudal pits. The dorsal and anal fins are distinct in that they are small and rounded, in contrast to the large angular fins of most lamniforms, and the anal fin is larger than either of the two dorsal fins. The caudal fin possesses features that may be ancestral, such as a poorly elevated dorsal lobe and small ventral lobe (although the fossil mitsukurinid Scapanorhynchus has a prominent ventral lobe, and is more similar to other lamniforms in this respect [Chapter Three]).

Odontaspididae (Carcharias and Odontaspis)

The monophyly of the Odontaspididae was questioned by Compagno (1990b). However, recent classifications (e.g., Compagno, 2001) include Carcharias in the Odontaspididae. The genus Carcharias can be distinguished from the genus Odontaspis by the presence of three rows of large anterior teeth in the upper jaw, and the relative sizes of the first and second dorsal fins (which are equal in size in Carcharias, whereas the first dorsal fin is larger than the second in Odontaspis). Elsewhere among Lamniformes, both these features are found only in Mitsukurinidae, and may explain the basal placement of Carcharias. Other characters used to separate Carcharias from Odontaspis include the posterior shift of the first dorsal fin in Carcharias (this fin is closer to the pelvic fins, while in Odontaspis it is closer to the pectoral fin); jaw suspension, with Carcharias showing an arching of the basal plate below the anterior part of the suborbital shelves; a single row of intermediate teeth in the upper jaw of Carcharias (compared to two or more rows in Odontaspis); and only a single row of symphysial teeth in the lower jaw of Carcharias with none at all in the upper jaw (Odontaspis has one or more rows of symphysial teeth in both the upper and lower jaws). Both genera possess awl-like anterior teeth with blade-like lateral teeth, a caudal peduncle that lacks keels, and a tail that is non-lunate (i.e., it has a long upper lobe and a short ventral lobe). O. ferox can be distinguished from O. noronhai by the dentition. The teeth of O. ferox possess 2-3 cusplets on either side of the primary cusp; up to 5 (although usually 4) rows of intermediate teeth; and two rows of symphysial teeth in both the lower and upper jaws. O. noronhai possesses a single cusplet on either side of the primary cusp; only 1-2 rows of intermediate teeth; no more than 2 rows of symphysial teeth in the upper jaw (sometimes totally absent); and 2-4 rows of symphysial teeth in the lower jaw. Symphysial teeth are present only in odontaspidids and Mitsukurina and are absent from all other lamniforms (except for Alopias; see below).

Pseudocarcharias

This genus was previously placed in the Odontaspididae, with which it shares awl-like anterior teeth and blade-like lateral teeth, and a tail that has a much longer upper than lower lobe. Pseudocarcharias can be separated from the Odontaspididae by its large orbits (and hence eyes); absence of symphysial teeth; teeth without cusplets (odontaspidids have 1-3 cusplets per tooth); gill openings that extend onto the dorsal surface of the head (further than in odontaspidids); small and slender build, with an elongated body (odontaspidids are larger and stockier sharks); an anal fin with a narrow base capable of pivoting (the anal fin of odontaspidids is broad and cannot pivot); presence of both upper and lower precaudal pits (only an upper is present in odontaspidids); and a caudal peduncle with lateral keels (absent in odontaspidids). Several unique chondrocranial features of Pseudocarcharias include the shortened otic capsules; jaw suspension (the orbital process is fused with the dental bullae [hollow expansions near the symphysis of the jaws], which articulates with the orbits rather than the nasal capsules); and a wedge-shaped ventral process on the internasal septum. Pseudocarcharias (as well as all other lamniforms except Mitsukurina and odontaspidids) has only two rows of anterior teeth in the upper jaw, and a reduction of the third lower anterior tooth (hence the third anterior tooth is the same size as the first lateral tooth). This last feature (combined with ancestral characters, such as aplesodic fins) prompted Compagno (1990b) to place Pseudocarcharias as the sister taxon to lamniforms with plesodic pectoral fins.

Megachasma

The discovery of the megamouth shark stimulated a renewed interest in the order Lamniformes, when Taylor et al. (1983) assigned Megachasma to this order (Chapter One). The megamouth was placed in the Lamniformes based on an array of morphological characters, such as the presence of a tripodal rostrum; osteodont tooth type; non-molariform posterior teeth; the arrangement of neural foramina; chondrocranial and muscle characters; and an elongated, ring valve intestine with over 15 turns (Taylor et al., 1983). Megachasma also possesses the plesodic pectoral fins shared by Alopias, Cetorhinus and lamnids, but no other traits were found to link these families. Taylor et al. (1983) suggested that Megachasma may be the sister taxon to all other living lamniforms based on two possible plesiomorphic characters: the undifferentiated dentition (but also seen in Cetorhinus and Rhiniodon); and the presence of a well-developed orbital process on the palatoquadrate cartilage. Maisey (1985) argued for a Cetorhinus+Megachasma clade within Lamniformes, and cited the following characters in support: modified tripodal rostrum (relative position of the median rostral cartilage [MRC] and lateral rostral cartilage [LRC]); dentition (increased number of tooth rows; simplified tooth morphology and loss of dental differentiation); and enlarged gill-rakers that extend to the margins of the gill openings and are covered by modified oropharyngeal scales. Compagno (1990b) takes issue with all of Maisey's characters, emphasizing that the filter-feeding apparatus of Megachasma and that of Cetorhinus are so radically different (Chapter One) that, according to Compagno, they could not share a common origin. Compagno (1990b) places Megachasma in the "advanced" lamniforms, which also include Alopias, Cetorhinus and Lamnidae.
The extreme protrusibility of the jaws, combined with the number of turns of the intestine (24 in Megachasma versus 33-55 in Alopias, Cetorhinus and lamnids) and the odontaspidid-like size, shape and position of the fins were considered by Compagno to be ancestral features, placing Megachasma at the base of this plesodic pectoral fin clade. Although several morphological characters of this shark have already been discussed, several more are worth mentioning. The chondrocranium of Megachasma is broad and flattened with an extremely short rostrum and, in overall shape, resembles the skull of the extinct predatory lamniform Squalicorax (Chapter Three). Compagno (1990b), however, explicitly rejected a close relationship between anacoracids and Megachasma. Both the anterior fontanelle and parietal fossa (depressions in the roof of the chondrocranium) of Megachasma are unusual in that the former is large and greatly expanded laterally and the latter is reduced to a single deep slit. The jaws are greatly enlarged, such that the palatoquadrate cartilage is nearly twice the length of the chondrocranium, and the jaws extend to behind the level of the orbits. The manner of jaw suspension allows for significant forward protrusion of the jaws. Megachasma is also unique in that remnants of the notochord are still prominent as expansions between vertebral centra.

Alopias

Thresher sharks (Alopias spp.) share with Cetorhinus and Lamnidae an enlarged first dorsal fin, increased intestinal valve count (to 33-55), and limited protrusibility of the jaws. The teeth are smaller than those of odontaspidids and Mitsukurina, and are more blade-like. Thresher sharks are readily recognized by their extremely elongate caudal fins. In addition, Alopias is the only lamniform (other than Megachasma) in which the last two gill openings are above the pectoral fin base. Both upper and lower precaudal pits are present, but lateral keels are absent. Both the pectoral and pelvic fins are enlarged. The chondrocranium contains enlarged orbits, with shortened otic capsules. Vertebral counts of Alopias are the highest known for any living shark. Although not directly addressed by Compagno (1990b, 2001), modifications associated with endothermy may also unite the Alopiidae. Compagno states that there is very strong evidence to support the monophyly of this group. The three species of thresher sharks can be distinguished as follows. A. superciliosus is separated from other threshers by the presence of a deep, V-shaped, horizontal groove (the nuchal groove) which extends from the eyes to about the middle of the pectoral fin and is visible on both the lateral and dorsal surfaces of the head; by its extremely enlarged eyes (and hence orbits), which extend onto the dorsal surface of the head; and by the absence of intermediate and symphysial teeth (these are usually present in both A. pelagicus and A. vulpinus). A. pelagicus can be distinguished from A. vulpinus by the more elongate snout in the former; no labial furrows (external, comma-shaped grooves present at the angle of the mouth, found in A. vulpinus); a weak nuchal groove (absent in A. vulpinus); nearly straight, broad-tipped pectoral fins without a white patch above their base (in A. vulpinus, these fins are curved and narrow-tipped, with a prominent white patch above their base); and a total of 453-477 vertebrae, the highest known vertebral count of any living shark (versus 339-364 in A. vulpinus). Compagno (1990b) unites A. superciliosus and A. pelagicus based on (but not limited to) the following characters: larger eyes than A. vulpinus; labial furrows reduced or absent; nuchal grooves present; pectoral fins with broadened tips; ribs of monospondylous vertebrae modified to form an anterior hemal canal that protects the aorta, and extending all the way to the head; thickened and laterally expanded lateral rostral cartilages; and an increase in the number of rings in the intestine (37-45 versus 33-34 in A. vulpinus). The claspers of both A. superciliosus and A. pelagicus are only moderately slender, in contrast to the extremely slender and whip-like claspers of A. vulpinus.

Cetorhinus

As with Megachasma, traits associated with filter-feeding were discussed in Chapter One. These include the huge gill openings that extend onto the dorsal surface of the head; modifications to the gill-rakers; and numerous weakly differentiated teeth (up to 255 rows), which are reduced in size and hook-like in shape. The large size of this shark is also characteristic (Chapter One). Compagno (1990b) allied Cetorhinus with the Lamnidae based on several features including: spindle-shaped body; a depressed caudal peduncle with both upper and lower precaudal pits and strong lateral keels; enlarged gill openings (although these do not extend as far dorsally in lamnids as in Cetorhinus); a shortened and lunate caudal fin (the lower lobe is nearly the same length as the upper lobe); and the presence of an ectethmoid process, a lateral expansion of the suborbital shelf, which inhibits forward protrusion of the jaws.

Lamnidae (Carcharodon, Isurus and Lamna)

The monophyly of the Lamnidae appears to be strongly supported. Synapomorphies include: a relatively small number of large teeth (43-65 rows per jaw), which are blade-like, awl-like or triangular in shape; no symphysial teeth; several chondrocranial characters (e.g., greatly enlarged stapedial fenestrae; large orbits with strong supraorbital crests; nasal capsules depressed below the level of the basal plate); palatoquadrate cartilage with a mesial process present at the symphysis; and the strongly lunate tail. Traits associated with endothermy (discussed in Chapter One) also unite this group. Within this family, Compagno (1990b) allies Carcharodon with Isurus based on the following: shared presence of enlarged jaws and anterior teeth; teeth without lateral cusplets (except in juveniles of C. carcharias); increase in both the number of intestinal valves (47-55) and the total number of vertebrae (170-197); and possibly an adult body size exceeding 4 m. Carcharodon is easily differentiated from other lamnids by its dentition: the teeth are serrated and flattened with broadly triangular cusps (those of both Isurus and Lamna are smooth, and curved with narrow triangular cusps). The intermediate teeth are large (over 2/3 the length of the anterior teeth) and "reversed" (i.e., they are inclined medially rather than distally), and all teeth (except those of juveniles less than 3 m in length) lack cusplets. In addition, several other features separate Carcharodon from other modern lamnid genera, such as enlarged jaw muscles (combined with an enlarged cranium to support them); the presence of discrete epiphysial fenestrae (openings for the epiphysis, an extension of the diencephalon, believed to be the equivalent of the pineal body [Wingerd, 1988]); and reduced rostral cartilages.
The body length of Carcharodon, which may exceed 6m, makes this the largest extant lamnid, but it pales in comparison to certain extinct predatory lamnids (Chapter Three).

Lamna is differentiated from Isurus by several features: dentition (lateral cusplets are present on all teeth of Lamna except in newborns, but are absent altogether from Isurus, and the anterior teeth of Lamna are not strongly flexed as they are in Isurus); the presence of a prominent secondary caudal keel in Lamna (which is weak or absent in other lamnids); certain chondrocranial characters (e.g., the rostrum of Lamna is expanded and hypercalcified with the base of the LRC elevated far above the nasal capsules); lower intestinal valve count (38-41 versus 47-55 in both Isurus and Carcharodon); fewer vertebrae (150-173 total vertebrae in Lamna versus 182-197 in Isurus and 170-187 in Carcharodon); color (both Isurus and Carcharodon possess pectoral fin tips which are black on their ventral surface); and position of the dorsal fins (the first dorsal fin of Lamna is more anterior than in Isurus and begins over the base of the pectoral fin, while the second dorsal fin of Isurus [which is in front of the anal fin] is more anterior than that of Lamna).

The common names of the two Isurus species, the shortfin mako (I. oxyrinchus) and the longfin mako (I. paucus), yield the first clue in how to tell them apart, as the pectoral fins of I. oxyrinchus are much shorter than the length of the head. However, since this character may not hold for juveniles, several other characters should be considered including: size of the eye (much larger in I. paucus); shape of the mouth when viewed ventrally ("U" shaped in I. oxyrinchus, parabolic in I. paucus); dentition (the cusps of the anterior teeth in I. oxyrinchus are strongly flexed with "reversed" tips); vertebral count (which is usually lower [< 190] in I. oxyrinchus than in I. paucus [195-197]); and color (but this may not hold for all I. oxyrinchus; see Chapter One).
Aside from color (Chapter One), two principal features distinguish the two species of Lamna. The first is the snout, which is longer and sharply pointed in L. nasus and short and blunt in L. ditropis (due to the more extensive hypercalcification of the rostrum into a "knob" in L. ditropis). The second distinguishing feature is vertebral counts: total of 150-162, with 85-91 precaudal vertebrae, in L. nasus, versus a total of 170 vertebrae with 103 precaudal vertebrae in L. ditropis.
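The vertebral counts quoted above can be read as a simple decision rule. The following is only a minimal sketch under stated assumptions: the function name `classify_lamna` is hypothetical, the counts for L. ditropis are treated as exact values because the text gives single figures rather than ranges, and no allowance is made for individual variation. It is not a substitute for examining snout shape and color.

```python
def classify_lamna(total_vertebrae, precaudal_vertebrae):
    """Return a tentative Lamna species label from vertebral counts alone.

    Hypothetical helper: the thresholds are the figures quoted in the text
    (L. nasus: 150-162 total, 85-91 precaudal; L. ditropis: 170 total,
    103 precaudal). Anything outside those figures is left undetermined.
    """
    if 150 <= total_vertebrae <= 162 and 85 <= precaudal_vertebrae <= 91:
        return "Lamna nasus"
    # Assumption: the single values given for L. ditropis are matched exactly.
    if total_vertebrae == 170 and precaudal_vertebrae == 103:
        return "Lamna ditropis"
    return "undetermined"
```

A specimen with 155 total and 88 precaudal vertebrae would be labeled L. nasus by this rule, while counts that fall between the two quoted sets are deliberately left undetermined.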

Alternatives to Compagno's lamniform phylogeny

Compagno (1990b) supported a clade of "advanced" lamniforms, containing Megachasma, Alopias, Cetorhinus and Lamnidae, based on the presence of a plesodic pectoral fin. This relationship was supported by the preliminary cladistic analysis of Shirai (1996). De Carvalho (1996), however, differed from Compagno in inferring the following relationships: Alopias was the sister taxon to the aplesodic fin clade; Mitsukurina was allied with Pseudocarcharias; and a monophyletic Odontaspididae was supported. However, De Carvalho cautioned that these results were preliminary, and did not discuss the characters that support these relationships. The importance of plesodic pectoral fins as a derived character linking "advanced lamniforms" (a character used by both Maisey and Compagno) was questioned by Morrissey et al. (1997), since many sharks, including Paleozoic species, exhibit plesodic pectoral fins. A cladistic analysis by Long and Waggoner (1996), based solely on dentition, was mostly congruent with Compagno's phylogeny. (Given the reduced dentition of both Megachasma and Cetorhinus, the placement of these taxa was highly uncertain.) However, Long and Waggoner disagreed with Compagno on two points. Firstly, they supported a monophyletic Odontaspididae (Carcharias+Odontaspis; albeit O. noronhai was not included in the analysis), and this family was placed at the base of the Pseudocarcharias+plesodic fin clade. Secondly, Long and Waggoner questioned Compagno's Carcharodon+Isurus clade within the Lamnidae, and stated that Lamna and Carcharodon were sister taxa based on (1) the shared presence of anterior teeth that are smaller at the base and (2) teeth that are more erect and reduced in their distal inclination. The order Lamniformes has yet to be subjected to a thorough, anatomically based cladistic analysis.
The characters discussed by Compagno (1990) (including the chondrocranial characters that he suggests might be of critical importance for lamniform phylogeny) have yet to be incorporated into such an analysis. While subsets of morphological characters can be offered in support of uniting individual lamniform taxa, these characters need to be evaluated in a phylogenetic context. Also, most extant lamniform taxa are highly autapomorphic, obscuring characters that could link individual genera together. The problem of discerning relationships among extant lamniforms appears to be further compounded by the prevailing belief that modern lamniform species represent a small sample of the phylogenetic and morphological diversity of this group (Compagno, 1990b; Shirai, 1996; Chapter Three). Also, typical of Chondrichthyes, most fossil lamniform taxa are represented solely by teeth (Chapter Three), which precludes a greater knowledge of the morphology of fossil taxa that may be intermediate between modern forms (Gaudin, 1991). In contrast, this problem is not as acute for their sister group, the Carcharhiniformes, owing to the improved representation among extant taxa of this group's total diversity.

Relationships of sharks within the Lamniformes: Molecular data

The discrepancies among morphology-based analyses of lamniforms prompted the examination of lamniform systematics using molecular data. (A summary of lamniform systematics based on molecular data is presented in Table 2.2.) As in morphological investigations, initial examinations of lamniform taxa focused on the relationships of Megachasma to other lamniforms, especially Cetorhinus (Martin and Naylor, 1997; Morrissey et al., 1997). These analyses found no evidence to support the Megachasma+Cetorhinus clade proposed by Maisey (1985). The phylogenies recovered from these analyses were significantly different from those derived from morphological data, suggesting that lamniform systematics was far from resolved. Unfortunately, no published analysis of molecular-based lamniform systematics includes all extant taxa, and several hypotheses (e.g., paraphyletic Odontaspididae) were not evaluated in all analyses. The results of these studies are compared with morphological analyses and discussed below.

Lamniform systematics has been examined with a variety of phylogenetic methods applied to both mitochondrial and nuclear sequences (Table 2.2). In several cases the results depended upon either the method employed, the sequences examined, or both (Naylor and Martin, 1997; Naylor et al., 1997; Martin and Burg, 2002; Lopez et al., MS). Indeed, molecular-based analyses of lamniforms are largely unresolved. The only relationships that are strongly supported across all analyses are a monophyletic Lamniformes; monophyletic Lamnidae; two separate origins of filter-feeding; and a polyphyletic Odontaspididae (Martin and Naylor, 1997; Morrissey et al., 1997; Naylor et al., 1997; Martin et al., 2002; Martin and Burg, 2002; Lopez et al., MS). Although a monophyletic Lamnidae is well-supported, relationships within this clade are variable.
A sister taxon relationship between Carcharodon and Isurus was recovered by several analyses (Martin and Naylor, 1997; Morrissey et al., 1997; Naylor et al., 1997; Martin and Burg, 2002; Lopez et al., MS), although this could be weakly supported (Naylor et al., 1997; Lopez et al., MS). Lopez et al. (MS) also recovered a Carcharodon+Lamna clade (based on CYTB), as well as an Isurus+Lamna clade (NADH2; NADH4; combined CYTB+NADH2+NADH4+RAG-1); however, these relationships depended on the method employed. Martin et al. (2002) failed to resolve relationships among lamnid sharks. Although the genus Lamna was well supported, the monophyly of the genus Isurus was either not recovered (RAG-1) or only weakly supported (CYTB) by Lopez et al. (MS); note that the status of these genera could only be evaluated when all taxa were included (see Table 2.2). Carcharias and Cetorhinus were frequently allied with the Lamnidae, although this clade was not always well-supported. Cetorhinus was recovered as the sister taxon to Lamnidae either alone (Martin and Naylor, 1997; Morrissey et al., 1997; Martin et al., 2002; Lopez et al., MS) or as a Carcharias+Cetorhinus clade (Naylor et al., 1997; Martin and Burg, 2002). Carcharias was frequently recovered as the sister taxon to a Cetorhinus+Lamnidae clade (Martin and Naylor, 1997; Martin et al., 2002; Lopez et al., MS). The three odontaspidid species were never recovered together in a single clade, although an O. noronhai+Carcharias clade (excluding O. ferox) was recovered as sister taxon to Lamnidae+Cetorhinus (RAG-1; Lopez et al., MS). The analysis of Morrissey et al. (1997) placed O. ferox at the base of the Cetorhinus+Lamnidae clade; however, C. taurus was not included in this analysis.

Table 2.2. Summary of molecular-based analyses of lamniform sharks. Abbreviations: AP (A. pelagicus), AS (A. superciliosus), AV (A. vulpinus), CC (C. carcharias), CM (C. maximus), IO (I. oxyrinchus), IP (I. paucus), LN (L. nasus), MP (M. pelagios), OF (O. ferox), ON (O. noronhai), TS (transition), TV (transversion).

Authors of study | Taxa | Sequences | Analysis
Morrissey et al. (1997) | AV, CC, CM, IO, LN, MP, OF | 12S rRNA | Maximum Parsimony (TS/TV weighted equally); Bootstrap
Martin and Naylor (1997) | All except ON | CYTB | Maximum Parsimony; Bootstrap; Neighbor-Joining cluster analysis; only TV substitutions included
Naylor et al. (1997) | All except ON | ND2 and CYTB combined | Maximum Parsimony (TV/TS not equal); Neighbor-Joining cluster analysis (with LogDet transformation)
Martin et al. (2002) | All except AP, AS, IP, LN, ON | Simple sequence repeat (SSR) and its flanking sequence | Maximum Parsimony (TS/TV not equal); Neighbor-Joining cluster analysis*; Maximum Likelihood*; Bootstrap
Martin and Burg (2002) | All except ON, IP, LN | Heat shock protein 70 (HSP70) | Parsimony (criteria: minimize gene duplication and losses to account for paralogs)
Lopez et al. (MS) | All taxa included | CYTB, ND2, ND4, RAG-1 (analyzed both combined and individually) | Maximum Parsimony; Maximum Likelihood; Minimum Evolution; Bootstrap

* Used HKY model of sequence evolution.

In addition to the (Carcharias+[Cetorhinus+Lamnidae]) clade, a clade containing Alopias, Megachasma, Odontaspis and Pseudocarcharias (the "AMOP" clade of Martin and Burg [2002]) was often recovered (Martin and Naylor, 1997; Naylor et al., 1997; Martin et al., 2002; Martin and Burg, 2002; Lopez et al., MS). Aside from strong support for a Megachasma+Pseudocarcharias clade (Naylor et al., 1997) and a possible A. vulpinus+A. pelagicus clade (Naylor et al., 1997; Lopez et al., MS; based on CYTB, RAG-1, and the combined dataset), relationships within this AMOP group remain unresolved. Perhaps the most surprising result obtained from molecular data was that there was little support for either a monophyletic Alopias or Odontaspis (O. noronhai, however, was only included in Lopez et al. [MS]). In addition, although the AMOP clade and the (Carcharias+[Cetorhinus+Lamnidae]) clade were frequently recovered, basal relationships of lamniforms were also uncertain, and there was little support for Mitsukurina as the basal lamniform taxon. Hence, apart from a monophyletic Lamnidae, relationships among lamniform taxa remain largely unresolved.
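The notions of monophyly and polyphyly used in this comparison can be made concrete with a small check on a tree. The sketch below is illustrative only. The nested-tuple `TREE` is a hypothetical simplification of the two frequently recovered groupings (the AMOP group, left as an unresolved polytomy, and Carcharias+[Cetorhinus+Lamnidae]); it treats genera and the family Lamnidae as single tips, omits Mitsukurina because its position was uncertain, and does not reproduce any one published analysis. The function names are likewise assumptions made for illustration.

```python
def leaves(node):
    """Return the set of tip names under a node (a name or a tuple of child nodes)."""
    if isinstance(node, str):
        return {node}
    tips = set()
    for child in node:
        tips |= leaves(child)
    return tips


def is_monophyletic(tree, taxa):
    """True if some node in the tree has exactly the given taxa as its tips."""
    taxa = set(taxa)
    if leaves(tree) == taxa:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, taxa) for child in tree)


# Hypothetical summary tree, for illustration only.
AMOP = ("Alopias", "Megachasma", "Odontaspis", "Pseudocarcharias")
TREE = (AMOP, ("Carcharias", ("Cetorhinus", "Lamnidae")))

# The odontaspidid genera do not form a clade on this tree.
print(is_monophyletic(TREE, {"Carcharias", "Odontaspis"}))
# Cetorhinus and Lamnidae do.
print(is_monophyletic(TREE, {"Cetorhinus", "Lamnidae"}))
```

On this simplified tree, Carcharias and Odontaspis fall in different groups, which is the sense in which the molecular results imply a polyphyletic Odontaspididae.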

Morphology versus molecules

Both morphological and molecular-based studies of lamniform phylogeny support a monophyletic Lamniformes and Lamnidae. The relationships within the Lamnidae are unresolved, as molecular-based analyses fail to distinguish between the hypotheses of Compagno (1990b) and Long and Waggoner (1996). Both methods support a Cetorhinus+Lamnidae clade. Molecular data show a tendency for Carcharias to be recovered as the sister taxon to the Cetorhinus+Lamnidae clade; this relationship, which creates a polyphyletic Odontaspididae, is not discussed in morphological studies. There is no support in the molecular data for a sister taxon relationship between Cetorhinus and Megachasma, and only Taylor et al. (1983) and Morrissey et al. (1997) favor Megachasma as basal to all remaining lamniforms (however, the latter study did not include representatives from all lamniform genera; see Table 2.2). Molecular-based analyses also showed a tendency to recover the AMOP clade; affinities among Alopias, Odontaspis and Pseudocarcharias were only recovered in the preliminary morphological analysis of De Carvalho (1996). Perhaps the most surprising result obtained by molecular studies was the lack of support for the monophyly of both Alopias and Odontaspis, both of which are supported by morphological evidence.

Summary

Although elasmobranch phylogeny is still unresolved, certain relationships appear to be well supported, such as the monophyly of the Galeomorphii, and a sister group relationship between Carcharhiniformes and Lamniformes. Given the controversy regarding relationships among elasmobranch orders, it is not surprising that relationships among taxa, such as those within the Lamniformes, remain uncertain. As in elasmobranch phylogeny in general, molecular and morphological examinations of the relationships within the Lamniformes were discordant. Indeed, very few relationships among lamniform taxa were resolved using these methods. The next section discusses the fossil record of lamniform sharks and its utility in assessing relationships among extant lamniform species.

[Figure 2.1: cladogram. Terminal taxa as labeled in the figure: Isurus oxyrinchus, Isurus paucus, Carcharodon carcharias, Lamna nasus, Lamna ditropis, Cetorhinus maximus, Alopias superciliosus, Alopias pelagicus, Alopias vulpinus, Megachasma pelagios, Pseudocarcharias kamoharai, Odontaspis noronhai, Odontaspis ferox, Carcharias taurus, Mitsukurina owstoni.]

Fig. 2.1. Relationships among living lamniforms based on morphological characters, as determined by Compagno (1990b).

CHAPTER THREE: THE FOSSIL RECORD OF LAMNIFORMES

Introduction

Sharks have a long and rich fossil record that is represented almost entirely by fossil teeth; in fact, fossil shark teeth are the most abundant (Hubbell, 1996; Maisey, 1996). Unfortunately, these isolated teeth, along with scales and fin spines (known as ichthyoliths), are usually all that is left to decipher this vast evolutionary history. This is largely because the cartilaginous skeletons of elasmobranchs do not fossilize well, and usually disintegrate before they can be preserved (Hubbell, 1996; Maisey, 1996). The paucity of skeletal material makes the task of understanding shark evolution very difficult and as such, "[t]here is no gnathostome group whose phylogeny is as poorly elucidated as the chondrichthyans, and in particular the elasmobranchs." (Janvier, 1996, p.135-136). The first evidence of elasmobranchs in the fossil record is the presence of microscopic scales in the early , approximately 420 million years ago (mya; Turner, 1985; Maisey, 1996). These fossils, however, tell us little about early shark evolution. The first complete skeleton of an elasmobranch, that of Antarctilamna, a putative xenacanthiform (see below), is known from the middle Devonian (Givetian; Janvier, 1996). Such complete fossils are rare for chondrichthyans and have revealed much of what we know about early shark evolution.

The fossil record indicates that elasmobranch evolution is characterized by two major, rapid radiations (Carroll, 1987). The first burst of shark evolution occurred in the late Silurian/early Devonian. Several groups of these Paleozoic sharks are characterized by rudimentary skeletons: the notochord (a flexible rod) is the sole component of the backbone (there are no vertebrae); the fin girdles are not fused; and the pectoral fins lack the tribasal structure of modern forms. Except for one group, the eel-like Xenacanthiformes, which were found in freshwater swamps, shark evolution is confined primarily to marine environments.
The diversity of these early sharks is unmatched by modern forms, and groups such as the Eugeneodontida (with their unique "tooth whorls") and the superficially teleost-like Petalodontida are unlike any living elasmobranchs. Indeed, some of these early taxa are so unusual that certain authors (see Maisey, 1984; Janvier, 1996) suggest they may not even be elasmobranchs; thus, relationships among these and other early forms remain unresolved.

Many of these Paleozoic forms were extinct by the early Mesozoic and it was during this time that the second major radiation of elasmobranchs, the one that would lead to modern groups, occurred. Fossils from the lower Jurassic, such as the complete skeleton of Palaeospinax, begin to display features found in modern sharks. Unlike their Paleozoic predecessors, certain Mesozoic elasmobranchs possessed a backbone supported by strongly calcified vertebrae and fused pelvic and pectoral fin girdles. Increased flexibility of the jaws (due to a hyostylic jaw suspension) combined with an elongated rostrum (resulting in a mouth which was ventral rather than terminal) expanded feeding opportunities. Larger nasal capsules and an enlarged braincase suggest that the sensory abilities of these sharks were more acute than those of earlier forms. This group of Mesozoic forms represents members of the Neoselachii, which comprises modern elasmobranchs. Although the exact ancestor of neoselachians is unknown, this group underwent a rapid radiation in the Mesozoic, resulting in the establishment of all modern shark orders by the end of the Cretaceous. Lamniform sharks, which were well established by the lower Cretaceous, are the focus of the remainder of this Chapter.

The fossil record: Lamniform sharks

The fossil record of lamniform sharks is based almost entirely on fossil teeth. As one can imagine, reconstructing lamniform evolutionary history based almost solely on isolated teeth is a very difficult task. Isolated teeth pose a problem for several reasons. One reason (discussed in Chapter Two) is that lamniform dentition is influenced by both sexual and developmental factors that are poorly understood (Compagno, 1988; Shimada, 2001, 2002a). Teeth from different sexes or from different developmental stages may be mistaken for separate species. A second problem is that teeth are differentiated along the jaw (monognathic heterodonty) and may even be different between upper and lower jaws (dignathic heterodonty). Although certain teeth (such as the central teeth in the upper jaw) may be more useful in determining species, these teeth are difficult to identify without a complete tooth set (Naylor, 1990; Naylor and Marcus, 1994). Finally, teeth are prone to convergence, and may therefore lead to erroneous conclusions about phylogeny (Cappetta, 1987).

Given the difficulty in correctly identifying fossil shark teeth, it is not surprising that different interpretations have been put forward regarding the fossil record of lamniform sharks. Disagreements concerning both tooth terminology (Shimada, 2002b) and the relevance of different characters (Applegate and Espinosa-Arrubarrena, 1996) lead to conflicting interpretations. Since the identity and affinities of these teeth are difficult to ascertain, recovering an accurate phylogeny of lamniform sharks based on fossil teeth has been fraught with problems. While an accurate phylogeny of lamniform sharks may not be possible using fossil remains, the teeth do provide a potential timeframe for divergences within the group (John Maisey, pers. comm.). Therefore the remainder of this section concentrates on the earliest known records of lamniform sharks.
The geological time scale used throughout the text follows Benton (1993) (see Table 3.1).

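Table 3.1. Geological timetable of the Cenozoic Era and the Cretaceous Period. Modern lamniform families are first recorded from the Early Cretaceous. The Jurassic (208-144 mya) and Triassic (245-208 mya) Periods preceded the Cretaceous Period in the Mesozoic Era. Evidence for lamniform sharks before the Cretaceous is controversial.

Era | Period | Stage | Duration (mya)
Cenozoic (67 mya to present) | Recent | | <10,000 years
Cenozoic | Pleistocene | | 1.6-0.04
Cenozoic | Tertiary | [Pliocene] | 5.3-1.6
Cenozoic | Tertiary (Miocene) | Messinian | 6-5.3
Cenozoic | Tertiary (Miocene) | Tortonian | 10-6
Cenozoic | Tertiary (Miocene) | Serravalian | 14-10
Cenozoic | Tertiary (Miocene) | Langhian | 17-14
Cenozoic | Tertiary (Miocene) | Burdigalian | 22-17
Cenozoic | Tertiary (Miocene) | Aquitanian | 24-22
Cenozoic | Tertiary | Chattian | 31-24
Cenozoic | Tertiary | [Rupelian] | 37-31
Cenozoic | Tertiary | [Priabonian] | 42-37
Cenozoic | Tertiary | [Bartonian] | 46-42
Cenozoic | Tertiary | [Lutetian] | 52-46
Cenozoic | Tertiary | [Ypresian] | 58-52
Cenozoic | Tertiary | [Thanetian] | 62-58
Cenozoic | Tertiary | Danian | 67-62
Mesozoic (245-67 mya) | [Late] Cretaceous | [Maastrichtian] | 74-67
Mesozoic | [Late] Cretaceous | [Campanian] | 83-74
Mesozoic | [Late] Cretaceous | [Santonian] | 86-83
Mesozoic | [Late] Cretaceous | [Coniacian] | 89-86
Mesozoic | [Late] Cretaceous | [Turonian] | 90-89
Mesozoic | [Late] Cretaceous | [Cenomanian] | 97-90
Mesozoic | Early Cretaceous | [Albian] | 112-97
Mesozoic | Early Cretaceous | [Aptian] | 123-112
Mesozoic | Early Cretaceous | Barremian | 130-123
Mesozoic | Early Cretaceous | Hauterivian | 136-130
Mesozoic | Early Cretaceous | Valanginian | 140-136
Mesozoic | Early Cretaceous | Berriasian | 144-140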

Extinct lamniform families

Lamniforms appear to have been the dominant predatory sharks of the Cretaceous, and the order was most diverse in the later part of this period. However, since the early Tertiary, lamniforms have slowly declined as other predatory families, such as the Carcharhinidae and Sphyrnidae, diversified (Siverson, 1992). Thus, modern lamniforms represent a very small sample of the total diversity that once existed within the order. The Lamniformes is the only galeomorph order to have families based exclusively on fossil species; in other galeomorph orders, all fossil species can be referred to extant families (Cappetta, 1987). Those lamniform families known exclusively from fossil material (mostly teeth and vertebrae) are recovered from the Cretaceous and early Tertiary. Teeth of fossil lamniforms (such as Cretoxyrhina species) are so abundant during this time that they are used to date fossil strata (Siverson, 1996). Three major families of lamniforms (Cretoxyrhinidae, Anacoracidae, Otodontidae) have been recognized and described based almost exclusively on the distinctiveness of their dentition (Cappetta, 1987). Although other extinct lamniform families have also appeared in the literature, such as the Jaekelotodontidae (regarded by Cappetta [1987] as belonging in the family Odontaspididae), the following discussion will concentrate on the three main families mentioned above.

Family Cretoxyrhinidae

The Cretoxyrhinidae was abundant in the upper Cretaceous, but was extinct by the late Eocene (Case and Cappetta, 1990; Cappetta et al., 1993). Numerous genera have been referred to the Cretoxyrhinidae including , , (=Plicatolamna), Cretalamna (sometimes incorrectly spelled Cretolamna), Cretoxyrhina, Dwardius, (=Megarhizodon), , Paraisurus, Protolamna, Pseudoisurus, and (Cappetta, 1987; Siverson, 1996, 1997, 1999). The relationships within this family are uncertain, partly because it has often been used as a wastebasket for Cretaceous and early Tertiary sharks that cannot be easily assigned to other families (Siverson, 1999). The Cretoxyrhinidae is therefore likely paraphyletic and efforts have been made to split this group into different families (e.g., Serratolamnidae, Cardabiodontidae), with the Cretoxyrhinidae limited to Cretoxyrhina and its closest relatives (Siverson, 1999). The status of the Cretoxyrhinidae is controversial and unresolved; therefore, for simplicity, the Cretoxyrhinidae will be treated as a single group. The earliest known cretoxyrhinid is Protolamna infracretacea from early in the Cretaceous (Valanginian; Cappetta et al., 1993). The cretoxyrhinids have been associated with the ancestry of many modern families, though individual authors disagree about which cretoxyrhinid genera gave rise to which family. Siverson (1992) proposed Archaeolamna kopingensis as the ancestor of the extant Lamnidae, whereas Applegate and Espinosa-Arrubarrena (1996) regard Cretalamna appendiculata as close to the lineage that led to modern Carcharodon. Carcharodon orientalis (Paleocene-Eocene) has been regarded as both a lamnid (Applegate and Espinosa-Arrubarrena, 1996) and a cretoxyrhinid (as Palaeocarcharodon orientalis) (Cappetta, 1987). The Cretoxyrhinidae included some very large sharks.
The largest Mesozoic shark vertebra known to date has been referred to this family, and comes from a shark estimated to have measured at least 8.3m long, and possibly up to 9.8m (Shimada, 1997a). The most widespread and well-known representative is the Late Cretaceous species Cretoxyrhina mantelli. This species, known from the Cenomanian to the Santonian, has been reported from Europe, North and South America, and Brazil (Cappetta, 1987). At least four partial skeletons are known for C. mantelli (Shimada, 1997b). It had a body form similar to that of Carcharodon carcharias, and measured up to 6m long (Shimada, 1997b). Morphological features of this species include a lamnoid tooth formula, a large orbit (as in Pseudocarcharias and Alopias), five pairs of gill slits, a plesodic fin (based on an isolated pectoral fin tentatively referred to C. mantelli), and a long axial skeleton composed of 230 vertebrae (Shimada, 1997b,c). Cretoxyrhina mantelli is the only extinct lamniform for which the dentition can be confidently reconstructed (Shimada and Hubbell, 2001), since partial or nearly complete sets of teeth are known from fossils believed to represent individual sharks (Shimada, 1997c). Based on these fossils, C. mantelli shares dental features with Alopiidae, Cetorhinidae, Lamnidae, Megachasmidae, and Pseudocarchariidae, such as reduction of the third lower anterior tooth; size and shape of lateral teeth; and absence of the first upper anterior tooth (Shimada, 1997b,c). A cladistic analysis of lamniform morphology by Shimada (1999) grouped C. mantelli within a clade that also included Alopias, Cetorhinus and Lamnidae. Before this, C. mantelli had been associated with the origin of extant lamniform genera. C. mantelli was once included in the genus Isurus, because the jaws, teeth and inferred feeding behavior resemble those of Isurus (Shimada, 1997b). C. mantelli has also been linked to the origin of the family Alopiidae, again based on perceived dental similarities (Ward, 1978). Interestingly, small symmetrical teeth, believed to be dental abnormalities, are known for both C. mantelli and Alopias superciliosus (Shimada and Hubbell, 2001). C. mantelli is thought to have been a fast-swimming shark, since it preyed upon large and active marine vertebrates (Shimada, 1997d; Shimada and Hubbell, 2001).
One specimen shows remains of the 4m-long teleost fish Xiphactinus in its fossilized stomach contents (Shimada, 1997d). C. mantelli also appears to have fed upon large sea reptiles such as plesiosaurs and . This is supported by the presence of that possess puncture marks or embedded teeth consistent with bites inflicted by C. mantelli (Shimada, 1997d). C. mantelli has been recovered from deposits believed to represent off-shore ("deep water") environments (Shimada, 1997e). This species is rare in near-shore ("shallow") marine deposits, in which another large contemporary cretoxyrhinid (Cretodus crassidens) predominated (Shimada, 1997e).

Family Anacoracidae

Most of the larger cretoxyrhinids disappeared near the end of the Cretaceous (mid-Campanian; Siverson, 1995). The Anacoracidae (=Squalicoracidae) is another lamniform family known from the upper Cretaceous (Albian-Maastrichtian; Cappetta, 1987; Cappetta et al., 1993). After the decline of large cretoxyrhinids, anacoracid fossils became increasingly common in the mid-Campanian. In the past, the Anacoracidae has been placed in the Hexanchiformes, based on similar root morphology. However, Cappetta (1987) and Siverson (1995) regard this feature as convergent, and since most anacoracids have a lamniform-like tooth histology, they place this family in the Lamniformes. The Anacoracidae includes , Microcorax, Paracorax and Squalicorax (Cappetta, 1987). These lamniforms have teeth that appear specialized for cutting (Siverson, 1995). At least some anacoracid sharks may have been marine scavengers. First, the teeth of certain anacoracids (Squalicorax yangaensis, Paracorax jaekeli) show some resemblance to those of modern Galeocerdo (tiger shark), an opportunistic feeder famous for eating almost anything (Cappetta, 1987; Shimada, 1997d). In fact, the 1cm-high teeth of P. jaekeli were originally referred to Galeocerdo (as Galeocerdo jaekeli). Second, Squalicorax teeth have been found embedded in skeletal remains of cretoxyrhinids such as C. mantelli. Since Squalicorax was much smaller than C. mantelli, it is unlikely that the former shark would have attacked the latter (Shimada, 1997d). Nevertheless, Squalicorax appears to have been an active predator: in addition to the large serrated teeth designed for cutting, this shark had strongly calcified jaws and vertebrae, stiff plesodic pectoral fins, and a caudal fin with a strong ventral lobe (Compagno, 1990b).
Other anacoracids were probably also predatory: the teeth of other Squalicorax species resemble the upper teeth of Carcharhinus (e.g., bull shark), and the teeth of Pseudocorax show similarities to the teeth of some modern Sphyrna (hammerhead) species (Cappetta, 1987). Anacoracids did not survive the terminal Cretaceous extinction, and they were replaced in the early Tertiary by other shark groups, including hexanchiforms (e.g., Notidanodon, Sphenodus) (Siverson, 1995).

Family "Otodontidae"

The Otodontidae is a controversial family of lamniforms, recognized by Cappetta (1987), who lists three genera in this family: Otodus, Parotodus, and Carcharocles. Siverson (1992) proposed that the Otodontidae evolved from the cretoxyrhinid genus Cretalamna. However, many authors dispute the monophyly of the family Otodontidae, and assign the constituent genera to other groups. For example, Applegate and Espinosa-Arrubarrena (1996) regard the type species, Otodus obliquus, of the early Tertiary (Thanetian-Ypresian), as a close relative of Carcharias, based on the presence of three anterior teeth in the upper jaw. Purdy et al. (2001) tentatively assign Parotodus benedenii (= Isurus benedenii) to the Lamnidae, since these teeth resemble the dentition of I. oxyrinchus. The remaining genus, Carcharocles, is especially contentious, and is discussed below in the section on fossil Lamnidae.

Palaeocarcharias: a Late Jurassic lamniform?

The earliest known lamniform (or putative lamniform) is Palaeocarcharias stromeri from the Late Jurassic (Kimmeridgian) of (Duffin, 1988). This species is based on three well-preserved specimens of a 1m long shark that had lamniform-like teeth but an orectolobiform-like body (Duffin, 1988). However, not all authors agree that Palaeocarcharias was a lamniform. Compagno (1973) regarded Palaeocarcharias as an orectolobiform based on several features (e.g., arrangement of three large teeth at the symphysis; dorsoventrally flattened body; very wide pelvic girdle), though he recognized that the shape and histology of the teeth were lamnoid. Based on such dentition, both Cappetta (1987) and Duffin (1988) supported lamniform affinities for Palaeocarcharias. Duffin (1988) used Palaeocarcharias as evidence supporting an origin for Lamniformes from orectolobiform ancestors, as suggested earlier by White (1936, 1937). However, this view has been disputed by Compagno (1973, 1977).

Family Odontaspididae

The Odontaspididae was far more diverse in the Late Cretaceous than today (Siverson, 1992) and two extinct genera, and Hispidaspis, are known from this time (Cappetta, 1987). Siverson (1996) adds two further Cretaceous genera to this family: (often considered a mitsukurinid) and . From the Paleocene come the odontaspidid genera Jaekelotodus, Palaeohypotodus and , and from the Eocene comes Hypotodus (Cappetta, 1987). However, only two genera, Odontaspis and Carcharias, both of which are reported as far back as the Cretaceous, exist today. Chapter One discussed the confusion regarding the nomenclature of extant genera of the Odontaspididae. Unfortunately, taxonomy of the fossil remains in this group, especially at the generic level, is equally ambiguous. A discussion of all of the fossil genera within the Odontaspididae is beyond the scope of this text, especially since the systematics of Cretaceous taxa are highly uncertain (Case, 1979; Cappetta, 1987; Siverson, 1992, 1997; Glickman and Averianov, 1998). This has been attributed to many factors, including: rapid dental evolution of the group; misidentification of taxa; convergence in dental morphology; the plesiomorphic ("archaic") lamniform dentition of many early odontaspidids; and the paucity of fossil odontaspidid skeletal remains. Therefore, only key fossil odontaspidid taxa, especially those that may be related to extant species, will be discussed here. The teeth of small odontaspidids are extremely common in assemblages (Glickman and Averianov, 1998). Cretaceous odontaspidid species have a restricted geographical range, in contrast to modern species, and it is thought that the former did not require seasonal migrations due to the warmer waters of the Cretaceous compared to today (Siverson, 1992). C.
acutissima (= Odontaspis acutissima, Synodontaspis acutissima) is known from the middle Eocene to its extinction in the Pliocene (Pledge, 1985), and was one of the most abundant shark species in Miocene seas (Case, 1980, 1981). Cappetta (1987) lists Carcharias (which he calls Synodontaspis) and Odontaspis as first appearing in the Cretaceous.

The genus Striatolamia has traditionally been considered an odontaspidid, and the dentition of S. macrota (Eocene: Ypresian) closely resembles that of C. taurus (Cunningham, 2000). This species (also called Odontaspis macrota) measured around 2.7 m long and is known from associated remains (i.e., those believed to represent a single individual) that include teeth, vertebrae, and other skeletal remains (Applegate, 1968). Other fossil odontaspidid species are small; for example, C. gracilis (known only from teeth) attained an estimated total length of 1.1 m, the size of a newborn C. taurus (Siverson, 1992, 1995). The oldest known species of Striatolamia, S. cederstroemi (early Paleocene: Danian), has a less Carcharias-like dentition than later Striatolamia species (Siverson, 1995). Thus, Siverson (1995) proposed that the dental similarities shared by Carcharias and later Striatolamia are due to convergence, and that Striatolamia is possibly descended from Anomotodon. There is disagreement over whether Anomotodon is an odontaspidid or a mitsukurinid; Siverson (1996) regards A. plicatus (the type species for Anomotodon) as an odontaspidid, because the teeth strongly resemble juvenile Carcharias teeth.

The oldest known odontaspidid species is Carcharias striatula, first known from the upper Aptian (Early Cretaceous). This species lived in the epicontinental seas of Europe (Glickman and Averianov, 1998). However, Glickman and Averianov (1998) assign this species to the genus Eostriatolamia, and separate it from true Carcharias by a different tooth formula ("archaic" in E. striatula, "advanced" in C.
taurus) and a greater number of tooth rows in C. taurus. However, these authors acknowledge that the morphology of Eostriatolamia teeth grades into that of Carcharias teeth, making separation of these genera difficult (Glickman and Averianov, 1998). Cappetta (1987) refers Eostriatolamia to Synodontaspis, and claims that Carcharias taurus is also referable to the genus Synodontaspis (Chapter One). Other species assigned to Carcharias (= Synodontaspis), such as C. gracilis (Siverson, 1995), are also referred to Eostriatolamia by Glickman and Averianov (1998) based on dental formulae and other aspects of tooth morphology.

The composition of the genus Odontaspis is also disputed, but the oldest valid species may be two North American taxa of Late Cretaceous age: O. saskatchewanensis (Coniacian; Case et al., 1990) and O. aculeatus (late Campanian; Cappetta, 1987). The extant species O. ferox and C. taurus are both first recorded from the lower Miocene (Burdigalian) of North America: O. ferox is known from teeth, while C. taurus is known from both teeth and a single vertebra (Iturralde-Vinent et al., 1996; Purdy et al., 2001). The fossil history of C. taurus is complicated by the fact that its teeth are very difficult to distinguish from those of C. acutissima (Cappetta, 1987). In addition, Siverson (1992) states that fossil teeth of C. taurus and O. noronhai are difficult to distinguish because their dentition is similar and the characters used to differentiate them, such as the number of rows of anterior teeth (three in C. taurus, two in O. noronhai), are not discernible from isolated fossil teeth.

Family Mitsukurinidae

Cappetta (1987) assigns three genera to this family: Anomotodon, Scapanorhynchus and Mitsukurina. Fossil teeth from these genera are often confused with those of odontaspidids (Cappetta, 1987), and as mentioned above, there is some question as to whether the genus Anomotodon is actually a mitsukurinid (Siverson, 1996). It is possible that, due to convergence of dental characters, the genus Anomotodon is polyphyletic (Cappetta, 1987). The type species for Anomotodon (A. plicatus) has been assigned to the Odontaspididae; hence, other Anomotodon species, which are likely true mitsukurinids, should be assigned to a new genus. A. principalis from the middle Cretaceous (late Aptian) of France appears to represent the earliest known mitsukurinid (Cappetta et al., 1993). A related species (A. cravenensis) is known from fossil teeth from the lower Eocene (Ypresian) of (Case, 1980).

The genus Scapanorhynchus is remarkable in that it is known from several complete skeletons, as well as isolated teeth (Cappetta, 1980, 1987). Comparison of the skeletal remains of S. lewisii (late Santonian of Lebanon) with extant M. owstoni reveals that, morphologically, the Mitsukurinidae has changed very little since the Late Cretaceous (Cappetta, 1980, 1987). Scapanorhynchus differs from M. owstoni in its fins: it has a very long, ribbon-like anal fin (which is broadly rounded in Mitsukurina), and possesses a well-developed ventral caudal lobe (which is absent in Mitsukurina). The skeleton of S. lewisii is quite small, with an approximate length of only 55 cm (Cappetta, 1987). Small teeth (less than 9 mm in height) from an associated tooth set of S. raphiodon indicate an estimated total length of 60 cm, and although this specimen may have been a juvenile, it is comparable to the size of the skeletal remains of S. lewisii (Hamm and Shimada, 2002). Several large species of Scapanorhynchus (such as S.
texanus, represented by teeth 2-5 cm in height) are known from the end of the Cretaceous (Case, 1979; Cappetta, 1987). Hence, either Scapanorhynchus displayed a wide variation in body sizes, or the skeletal remains represent juvenile specimens. This scenario parallels the history of the only living mitsukurinid, M. owstoni, in which the type specimen was shown to be immature only when much larger individuals were discovered much later. Scapanorhynchus is known from the lower Cretaceous (Aptian), and was extinct by the end of the Cretaceous (Maastrichtian; Cappetta, 1987). The dentition of S. raphiodon is similar to that of M. owstoni, and the two species may have had a similar diet. Although the diet of M. owstoni is not well established (Chapter One), it is believed to feed on small teleosts and squids, both of which are common fossils in areas where S. raphiodon teeth are found (Hamm and Shimada, 2002).

The earliest known member of the genus Mitsukurina is M. maslinensis, from the middle or late Eocene of Australia (Cappetta, 1987). The discovery of fossil teeth of M. owstoni in the lower Pliocene (Zanclian) of reveals that this species was once found in the Mediterranean, although it is currently unknown in this area (Cigala Fulgosi, 1986). Indeed, fossil mitsukurinids (Mitsukurina, Scapanorhynchus and "Anomotodon") exhibit an almost worldwide distribution and are common in areas (such as the Mediterranean and Atlantic) where extant M. owstoni has yet to be reported (Cigala Fulgosi, 1986; Cappetta, 1987).

Family Alopiidae

The family Alopiidae consists of Alopias and possibly the extinct genus Paranomotodon. The placement of Paranomotodon, which is known primarily from fossil teeth, in the Alopiidae is controversial, and those who support this referral acknowledge that similarities in the dentition of these two genera may be caused by convergence. If, however, Paranomotodon is an alopiid, then the type species, P. angustidens, known from the Cenomanian (Late Cretaceous) of Europe, represents the oldest known member of this family (Cappetta, 1987; Siverson, 1992; Cappetta et al., 1993).

Several species of Alopias are known from the Eocene, including the oldest known species, A. denticulatus, from the lower Ypresian of Morocco (Cigala Fulgosi, 1988; Cappetta et al., 1993). A. crochardi and A. leeensis are slightly younger species, of upper Ypresian age (Ward, 1978). A. denticulatus may belong to the lineage that led to A. superciliosus (Cigala Fulgosi, 1983), while A. leeensis appears to belong to the lineage that led to A. vulpinus. The teeth of A. crochardi appear intermediate in morphology between the A. vulpinus and A. superciliosus groups, and therefore may belong to a third group.

Fossil Alopias teeth from the middle Tertiary are often assigned to two species, A. exigua and A. latidens, but the validity of these species is controversial (Ward, 1978; Purdy et al., 2001). The controversy centers on the fact that the dentition of A. superciliosus is sexually dimorphic (Chapter Two), and thus fossil specimens may be representatives of female A. superciliosus rather than new species. Both Purdy et al. (2001) and Ward (1978) believe that A. exigua and A. latidens are fossil representatives of extant Alopias species. Cigala Fulgosi (1983, 1988) believes that A. exigua is a valid species that can be distinguished from A. superciliosus, but that certain alopiid teeth of Pliocene age previously assigned to A. exigua really belong to A. superciliosus. Certain A.
latidens teeth are possibly female A. superciliosus teeth, whereas others pertain to A. vulpinus (Cigala Fulgosi, 1983, 1988). If this interpretation is correct, then it extends the fossil record of A. vulpinus back to the lower Eocene as A. latidens (Cigala Fulgosi, 1983), and that of A. superciliosus back to the late Oligocene-early Miocene, also as A. latidens (Ward, 1978). Otherwise, the earliest records of A. superciliosus and A. vulpinus are lower Miocene (Aquitanian and Burdigalian, respectively; Case, 1980; Purdy et al., 2001). The Burdigalian deposit of A. vulpinus includes a single vertebra in addition to teeth (Purdy et al., 2001).

The genus Alopias may have evolved from Cretaceous cretoxyrhinids such as Cretoxyrhina mantelli (Ward, 1978) or from within the genus Cretalamna (Cappetta, 1987). The lineages that led to A. vulpinus and A. superciliosus had already diverged by the Oligocene (Cigala Fulgosi, 1983), and, as discussed above, the two groups may have split as early as the Eocene. The origins of A. pelagicus are uncertain, and there are opposing views as to whether this species is closer to A. vulpinus (Cigala Fulgosi, 1983) or to A. superciliosus (Compagno, 1990b).

Family Lamnidae

The family Lamnidae includes five extant species in three genera (Carcharodon, Isurus, and Lamna). Cretaceous species originally assigned to lamnid genera are now often placed in the Cretoxyrhinidae. The Cretoxyrhinidae is believed to contain the ancestors of the Lamnidae (Cappetta, 1987), although the exact ancestor is often disputed. For example, Siverson (1992) has suggested that the Lamnidae arose from a cretoxyrhinid ancestor close to Archaeolamna, while other authors propose that individual lamnid genera evolved from different Cretalamna species (e.g., Cappetta, 1987; Applegate and Espinosa-Arrubarrena, 1996). The taxonomy of fossil lamnid species (which so far are known principally or exclusively from teeth) is confusing and controversial, especially the inclusion of fossil species in modern genera. The following details the controversial evolutionary history of the living genera, Carcharodon, Isurus and Lamna.

The presence of very large (up to at least 168 mm in height; Gottfried et al., 1996) fossil shark teeth that resemble those of C. carcharias has sparked one of the most heated debates in elasmobranch . The debate focuses on whether or not these "mega-toothed" sharks are indeed related to the extant great white shark. There are two competing theories. The first theory contends that C. carcharias evolved relatively recently (late Miocene/early Pliocene) from within the genus Isurus and hence has a separate evolutionary history from fossil mega-toothed sharks. The second theory is that C. carcharias is in fact closely related to these ancient mega-toothed sharks, and is the only surviving member of a lineage that dates back to the early Tertiary (Gottfried and Fordyce, 2001). The two theories offer differing interpretations of how the tooth morphology of C. carcharias arose. Since Carcharodon was named in 1838, at least 96 species have been assigned to this genus (Gottfried et al., 1996; Gottfried and Fordyce, 2001).
The validity of the vast majority of these species is highly questionable (Bhalla and Dev, 1985; Applegate and Espinosa-Arrubarrena, 1996). Applegate and Espinosa-Arrubarrena (1996) recognize only eight or nine fossil species as valid. However, the issue of whether these species should be retained in the genus Carcharodon rests upon which of the two competing theories above is correct. These fossil Carcharodon species (and other fossil species) have often been assigned to new genera, including Procarcharodon, Palaeocarcharodon, Megaselachus, and Carcharocles. Cappetta (1987) synonymized all these genera under the name Carcharocles, and placed this genus in the extinct family Otodontidae. If the mega-toothed sharks represent a lineage of sharks separate from modern C. carcharias, then these species (e.g., C. angustidens, C. auriculatus, C. megalodon) are referred to the genus Carcharocles. If, however, the second theory is correct, and the mega-toothed sharks are closely related to C. carcharias, then the mega-toothed species are retained in the genus Carcharodon, and all are regarded as belonging to a single lineage.

Both theories hold that the mega-toothed sharks evolved in the early Tertiary from ancestors close to or within the extinct family Cretoxyrhinidae. Carcharodon orientalis (= Palaeocarcharodon orientalis) from the late Paleocene to early Eocene has been proposed as the ancestor of this mega-toothed lineage, which itself evolved from an ancestor close to Cretalamna appendiculata (Applegate and Espinosa-Arrubarrena, 1996). The two theories differ, however, over whether C. carcharias evolved from within this lineage, or separately, from within the genus Isurus. The teeth of I. hastalis (lower Miocene-middle Pliocene) share several features with those of C. carcharias, including labiolingual flattening of the teeth and similar overall tooth shape (Cappetta, 1987; Yabe and Goto, 1996). The teeth of I.
escheri (late Miocene) possess weak serrations, and thus it has been suggested that this species is an intermediate form between I. hastalis (unserrated teeth, as in modern Isurus) and C. carcharias (strongly serrated teeth). The teeth of I. hastalis, I. escheri and C. carcharias are therefore regarded as showing consecutive evolutionary stages consistent with the chronological record of these species. The discovery of an associated tooth set of I. hastalis from the late Miocene of Japan may also support this theory, since I. hastalis shows a heterodont pattern similar to that of C. carcharias (Yabe, 2000). However, the presence of C. carcharias teeth from as far back as the middle Miocene (Purdy, 1996; Purdy et al., 2001) undermines this theory.

Opponents of an origin of C. carcharias from fossil Isurus attribute the dental similarities to convergence, and regard C. megalodon and other mega-toothed sharks as the closest known relatives of modern C. carcharias (Keyes, 1972; Yabumoto, 1987; Applegate and Espinosa-Arrubarrena, 1996; Gottfried and Fordyce, 2001). Dental characters shared by both the mega-toothed sharks and C. carcharias support this second theory (Applegate and Espinosa-Arrubarrena, 1996; Gottfried and Fordyce, 2001). (For simplicity, the name Carcharodon will be used for these mega-toothed sharks for the remainder of this section.) Further evidence for the second theory is found by examining the fossil remains of these ancient mega-toothed sharks.

Giant fossil teeth and huge vertebrae are all that remain of the largest predatory shark that ever lived, Carcharodon megalodon. Although the size of this shark has been exaggerated (Keyes [1972] estimates a total length of 30 m!), a conservative estimate of 16 m by Gottfried et al. (1996), over twice the length of the living great white shark, is still impressive. Fossils of C. megalodon have been used to support its close relationship to C. carcharias.
The most complete specimen of this shark, represented by 150 associated vertebrae from the Miocene of Belgium, displays vertebral calcification patterns remarkably similar to those of C. carcharias (Gottfried et al., 1996; Gottfried and Fordyce, 2001). Uyeno et al. (1989) describe the first (of only two) associated tooth set of C. megalodon, which displays the large "reversed" intermediate teeth considered diagnostic for the genus Carcharodon (Chapter Two). Bendix-Almgreen (1983) noted histological similarities in the teeth of both C. carcharias and C. megalodon and states that this "indicates the close phyletic relationship between the two [Carcharodon] species" (p. 21).

Even though large marine mammals continue to inhabit the oceans, giant predatory sharks went extinct by the late Tertiary (or early Quaternary), with C. megalodon the last to disappear. Although it is possible that C. megalodon existed into the (Applegate and Espinosa-Arrubarrena, 1996; Gottfried and Fordyce, 2001), reliable fossils suggest the youngest remains are of middle Miocene/upper Pliocene age (Applegate and Espinosa-Arrubarrena, 1996). The earliest record of this shark is also contested. Gottfried and Fordyce (2001) state that the oldest unambiguous record of C. megalodon is from the late Oligocene (Chattian) of New Zealand, while Keyes (1972) believes New Zealand specimens may be as old as the Eocene. Applegate and Espinosa-Arrubarrena (1996), however, believe the ages of the New Zealand specimens are unreliable and suggest the oldest record is from the lower middle Miocene of Mexico. Although Applegate and Espinosa-Arrubarrena (1996) believe the oldest record of C. carcharias is also from the (upper) Miocene of Mexico, they do not believe C. megalodon is its direct ancestor, as once believed.

Fossils of the giant mega-toothed sharks are highly correlated with those of their suspected prey (large cetaceans) and as such, the availability of prey may account for the distribution of fossil Carcharodon (Purdy, 1996). The record of these teeth may be biased in favor of areas rich in fossils, and this would bias the collection of fossil teeth of giant sharks. In addition, living great white sharks are known to possess nursery areas and display size segregation of individuals (Chapter One). Both of these features are suggested by the fossil record: sediments in North Carolina display an abundance of juvenile teeth (as well as small whales that may represent a food source for young sharks), and large adult teeth are noticeably absent from these putative nursery areas. Also, fossils of small C. carcharias are rarely found with those of larger species (e.g., C. megalodon), suggesting that size segregation occurred between species. Given that the prey items of these sharks still exist, some other factor must have been responsible for the extinction of C. megalodon (Gottfried et al., 1996).

Many fossil species have been referred to Lamna, but most have now been split off into new genera (e.g., Cretalamna, Cretodus, Carchariolamna, , ). The teeth of Carchariolamna (Miocene) are regarded by Cappetta (1987) as closely resembling those of Lamna, differing only in the presence of a serrated cutting edge (absent in modern Lamna). Thus, he has suggested that C. heroni (the type species of Carchariolamna) might be a species of Lamna. The dentition of Carcharoides (middle Oligocene to middle Miocene) appears to display a combination of features found in several lamniform genera.
For example, the anterior teeth of Carcharoides are morphologically similar to those of odontaspidids, whereas the lateral teeth are more lamnid-like, and similar to those of Lamna (Cappetta, 1987). Species also differ in the presence of serrations (a trait largely associated with Carcharodon). For example, the teeth of Carcharoides totuserratus have serrations, while those of Carcharoides catticus do not (Cappetta, 1987). The affinities of these genera are debated, and some authors, such as Cione and Reguero (1994), do not even consider Carchariolamna or Carcharoides to be lamniforms.

Very few fossil remains can be confidently assigned to Lamna (Cappetta, 1987). Lamna rupeliensis (Oligocene, western Europe) may be the earliest representative of this genus (Cappetta, 1987). Teeth referred to L. nasus have been recorded from the Pliocene of western Europe (Cappetta, 1987). Hypercalcified rostral nodes characteristic of extant Lamna species are known from the Miocene (Burdigalian) of the United States (Purdy et al., 2001).

There is also very little agreement on which fossil specimens can be assigned to the genus Isurus. The diagnosis of many fossil Isurus (and related species) is problematic due to both ontogenetic heterodonty and variation in dentition observed within a species (Uyeno et al., 1980; Kuga, 1985; Purdy et al., 2001). For example, the deep-water genus Xiphodolamia (lower-middle Eocene) has often been placed within the Hexanchiformes. However, Cappetta (1987) states that this genus may have affinities to Isurus since its anterior teeth resemble the teeth of juvenile Isurus. I. winkleri from the late Paleocene (Thanetian; western Europe) is regarded as both the earliest known Isurus species and the earliest known lamnid by Cappetta (1987, 1993). However, Cione and Reguero (1994) suggest that the teeth of I. winkleri may represent the upper anterior teeth of Striatolamia macrota; hence, they regard the late Paleocene I. novus as the earliest known Isurus.
Cappetta (1987) refutes Cione and Reguero (1994), and states that I. novus is more likely to be a mitsukurinid.

The fossil record of extant Isurus species is also controversial. I. oxyrinchus material has been recorded from the late Eocene (Case, 1981) and middle Miocene (Case, 1980; Uyeno et al., 1980), but this material has subsequently been assigned to the extinct species I. praecursor and I. desori (Karasawa, 1989; Case and Cappetta, 1990). I. desori is considered a junior synonym of I. oxyrinchus by Purdy et al. (2001), but is considered a valid species ancestral to modern Isurus by Kuga (1985). Since teeth of I. oxyrinchus are often confused with teeth of other Isurus species, it is difficult to determine the earliest record, but teeth assigned to this species are reported from the lower Miocene (Burdigalian; United States; Purdy et al., 2001) and early Pliocene (Zanclian, Italy; Cigala Fulgosi, 1986).

According to Kuga (1985) and Purdy et al. (2001), the fossil record of I. paucus is difficult to evaluate because of insufficient information regarding living specimens. Certain I. desori teeth (Oligocene of Belgium) appear to belong to I. paucus (Purdy et al., 2001). Also, based on dental morphology alone, I. paucus may actually be the same as the fossil species I. hastalis, which would make I. paucus a junior synonym of I. hastalis (Purdy et al., 2001). I. hastalis has been mentioned earlier in connection with one hypothesis for the evolution of Carcharodon carcharias. Teeth resembling those of I. paucus are also known from the Pliocene of Japan (Kuga, 1985).

Family Cetorhinidae

Cetorhinids appear to have developed a filter-feeding lifestyle early in their evolution, based on the tooth morphology of lower Miocene specimens (Cappetta, 1987). The reduced, inconspicuous teeth (usually less than 1 cm in height) of planktivorous cetorhinids may be overlooked in sediments where larger shark teeth (such as those of Carcharodon and Isurus) are dominant (Gottfried, 1995). For this reason, the unique gill-rakers of cetorhinids are the first recognized fossil remains for this group (Cappetta, 1987), and thus may be the best indicators of their history.

At least two species of Cetorhinus are recognized: remains from the Oligocene to Miocene are often attributed to C. parvus, while later remains from the Pliocene are assigned to the extant species C. maximus (Cappetta, 1987). The teeth of C. parvus differ from those of C. maximus in their root morphology, and in this respect show some resemblance to the teeth of the fossil species Alopias exigua (Cappetta, 1987). Thus, it has been suggested that cetorhinids arose from a lineage close to alopiids (Cappetta, 1987). This conflicts with the close affinities proposed between cetorhinids and lamnids based on the modern species (Chapter Two).

Confusion regarding fossil remains of cetorhinids may influence the interpretation of their evolutionary history. For example, a tooth (Karasawa, 1989), gill-rakers (Bendix-Almgreen, 1983; Itoigawa et al., 1985) and vertebral centra (Bendix-Almgreen, 1983) from the Miocene have all been assigned to C. maximus, and indicate that this species may have an earlier fossil record than previously believed. In addition to teeth and gill-rakers, a clasper spine (similar to that of C. maximus) is known from the lower Miocene (Burdigalian; Purdy et al., 2001). Vertebral centra of cetorhinids have been confused with those of Carcharodon megalodon, due in part to their large size, and cetorhinids may therefore be more prevalent in the fossil record than previously believed. Vertebral centra from the Miocene have been assigned to the genus Cetorhinus by Gottfried (1995). The earliest record of Cetorhinus is a fragment of a from the middle to late Eocene of Antarctica (Cione and Reguero, 1998).

Shark remains from the Late Triassic (Rhaetian) of Western Europe resemble the teeth and gill-rakers of cetorhinids (Duffin, 1998; Cuny et al., 2000).
Duffin (1998) believed these remains, comprising isolated teeth and gill-rakers, belong to a new genus and species of cetorhinid, which he named Pseudocetorhinus pickfordi. His description is followed here. The teeth of Pseudocetorhinus are smaller and more thorn-like than those of C. maximus, but share the same construction of the root and crown. The teeth of the two species differ in that adult C. maximus crowns are more flattened, have better-developed cutting edges, and are more spatulate in shape. The Pseudocetorhinus teeth also show certain similarities to the reduced teeth of Megachasma, including the weak cutting edges and triangular central cusp. The gill-rakers of Pseudocetorhinus resemble those of modern and Tertiary cetorhinids.

The triple layer of enameloid found in Pseudocetorhinus teeth suggests that the teeth belong to a neoselachian (Duffin, 1998; Cuny et al., 2000). Other Mesozoic sharks known from this time, such as hybodonts and ctenacanths, possess only a single layer of enameloid (Cuny et al., 2000). The morphology of the teeth and gill-rakers suggests that pelagic filter-feeding sharks were present in the Late Triassic (Duffin, 1998). However, Cuny et al. (2000) question the referral of Pseudocetorhinus to the Cetorhinidae, or to the Lamniformes, since it creates a large gap in the fossil record between the Late Triassic and early Tertiary, and no undisputed lamniforms are known before the Cretaceous. Pseudocetorhinus certainly documents the existence of filter-feeding neoselachians in the early Mesozoic, but further material is required to establish whether it is referable to the Cetorhinidae or is an example of convergent evolution. This highlights one of the difficulties of constructing phylogenies based on fossil teeth (and gill-rakers, in this case).

Family Megachasmidae

The fossil record of Megachasma illustrates another difficulty encountered when isolated shark teeth are all that is known of the fossil history of a group. In the 1960s, small teeth (2-15 mm in height) were found in late Oligocene/early Miocene deposits of California and Oregon (Taylor et al., 1983). Paleontologists debated the identity of these teeth, and some argued that they might be primitive carcharhiniforms (especially Scyliorhinidae or Pseudotriakidae) despite their osteodont tooth histology (Compagno, 1990b). It was not until much later that the discovery of the first megamouth shark suggested the possible nature of these fossils (Taylor et al., 1983; Lavenberg, 1991). These fossil teeth were similar to, but "distinctly more primitive" than, those of the newly discovered M. pelagios, and thus may represent extinct members of this genus (Taylor et al., 1983; Purdy et al., 2001).

Teeth belonging to the genus Megachasma have been reported from the Burdigalian (lower Miocene) of North Carolina (Purdy et al., 2001). These teeth are different from the California/Oregon fossils in several respects and "exhibit considerable morphological diversity" (Purdy et al., 2001). This "morphological diversity", combined with the fact that this species was only recently discovered, may be responsible for the misidentification of fossil Megachasma teeth in the past. For example, small, osteodont teeth assigned to the genus Megascyliorhinus (family Scyliorhinidae) may actually belong to a fossil megachasmid. These teeth are nevertheless quite distinct from those of Megachasma and thus warrant the retention of a separate genus. If these teeth, known from as early as the early Eocene, do belong to a megachasmid, then they would represent the earliest known member of this family. Given the aforementioned difficulties, it is possible that a review of fossil shark remains may yield further insight into the evolutionary history of the Megachasmidae.

Family Pseudocarchariidae

Given the paucity of data on the only living species in this family, Pseudocarcharias kamoharai (Chapter One), it is no surprise that fossil remains for this group are also poorly known. The fossil record of Pseudocarcharias may be underrepresented for two reasons. First, P. kamoharai is considered a pelagic, possibly deep-water shark, and remains from this environment are uncommon. Second, the awl-like teeth of Pseudocarcharias resemble those of certain other lamniforms (e.g., Carcharias and Isurus) and may have been mistaken for these taxa (Cigala Fulgosi, 1992). Despite these difficulties, teeth attributable to the genus Pseudocarcharias are found in the middle Miocene (lower Serravallian) of Italy (Itoigawa et al., 1985; Cigala Fulgosi, 1992).

Summary

Elasmobranchs have a long and rich evolutionary history. The first elasmobranchs are known only from microscopic scales present in the early Silurian. At the end of the Silurian and during the early Devonian, elasmobranchs diversified into several groups of unusual sharks, the likes of which are unknown today. Many of the Paleozoic forms went extinct during the early Mesozoic, and it was during this time that the second major radiation of elasmobranchs occurred. This second radiation would lead to modern shark groups. Fossil remains present in the lower Jurassic, such as the complete skeleton of Palaeospinax, introduce us to the first members of the Neoselachii, the group that includes all modern elasmobranchs. These Mesozoic fossils display features found in modern sharks, and provide insight into the early evolution of modern elasmobranchs. Although the exact ancestor of neoselachians is unknown, this group underwent a rapid radiation in the Mesozoic, resulting in the establishment of all modern shark orders, including the Lamniformes, by the end of the Cretaceous.

Lamniforms appear to have been the dominant predatory sharks of the Cretaceous and were well established by the lower part of this period. The order was most diverse in the later part of this period, but most of these Cretaceous lamniform taxa left no descendants beyond the end of the Cretaceous. Since the early Tertiary, lamniforms have slowly declined (Siverson, 1992), though this group did produce some spectacular forms, notably C. megalodon, the largest predatory shark ever known. Thus, modern lamniforms probably represent a small sample of the total diversity that once existed within the order. Unlike the Carcharhiniformes, the diversity of the Lamniformes was greatest in the Cretaceous and early Tertiary, and they never recovered from the decline that followed this success.

As with other chondrichthyans, the fossil record of lamniforms is almost exclusively composed of isolated teeth, and reconstructing the phylogeny of these sharks based on the fossil record is fraught with difficulty. The dental characters used to distinguish groups are confusing and highly controversial. Interpretations regarding the affinities of fossil shark teeth largely depend upon the individual researcher examining them, and which particular tooth characters they favor. It is only when we have complete tooth-sets and/or associated skeletal remains that we have a better chance of inferring the relationships of fossil shark taxa. Unfortunately, these types of remains are extremely rare for cartilaginous fishes. While an accurate phylogeny of lamniform sharks may not be possible based on fossil remains, the teeth do provide a potential timeframe for divergences within the group (see Tables 3.2 and 3.3).

Table 3.2. Temporal distribution of lamniform families.

Family                Temporal distribution
Cretoxyrhinidae       Early Cretaceous → Eocene
Anacoracidae          Cretaceous only
Odontaspididae        Early Cretaceous → Recent
Mitsukurinidae        Early Cretaceous → Recent
Alopiidae             ?Late Cretaceous* → Recent
Lamnidae              Paleocene → Recent
Cetorhinidae          ?Late Triassic† → Recent
Megachasmidae         ?Eocene‡ → Recent
Pseudocarchariidae    Miocene → Recent

* The Cretaceous record of Alopiidae comprises the genus Paranomotodon, the alopiid status of which is controversial. The otherwise earliest alopiid is Alopias denticulatus.

† The Late Triassic record of Cetorhinidae comprises the genus Pseudocetorhinus, the cetorhinid status of which is controversial. The otherwise earliest cetorhinid belongs to the extant genus Cetorhinus.

‡ The Eocene record of Megachasmidae comprises the genus Megascyliorhinus, the megachasmid status of which is controversial. The otherwise earliest megachasmid belongs to the extant genus Megachasma.

The previous Chapter discussed the difficulty in resolving the systematics of extant lamniforms based on both morphology and molecular data. Unfortunately, the fossil record of lamniform sharks does little to help resolve these relationships. It has been suggested that large molecular datasets, such as entire mitochondrial genome sequences, combined with complete taxon sampling, may help resolve controversial phylogenies. Since only 15 extant lamniform sharks are known, this group represents an excellent opportunity to test this hypothesis. The following Chapter discusses our current knowledge of mitochondrial genomes and how mitochondrial sequence data have been used in phylogenetic analyses.

Table 3.3. Earliest record of extant lamniform genera.

Family                Living genera       Earliest member of genus
Odontaspididae        Odontaspis          O. saskatchewanensis, Late Cretaceous
                      Carcharias          C. striatula, Early Cretaceous
Mitsukurinidae        Mitsukurina         M. maslinensis, Eocene
Alopiidae             Alopias             A. denticulatus, Eocene
Lamnidae              Carcharodon         C. orientalis, Paleocene
                      Isurus              I. winkleri or I. novas, both Paleocene
                      Lamna               L. rupeliensis, Oligocene
Cetorhinidae          Cetorhinus          Cetorhinus sp., Eocene
Megachasmidae         Megachasma          Megachasma pelagios, Miocene
Pseudocarchariidae    Pseudocarcharias    Pseudocarcharias sp., Miocene

CHAPTER FOUR: MITOCHONDRIAL GENOMES, TAXON SAMPLING AND PHYLOGENY

Introduction

Several unique properties of mitochondrial genomes have made them of great interest to evolutionary research. The first part of this Chapter discusses these properties and provides a general introduction to the organization and function of mitochondrial genes. Due to the significant variation in mitochondrial genomes among organisms, only the characteristics of vertebrate mitochondrial genomes will be discussed. The second part of this Chapter discusses the popular notion that larger datasets, such as the entire mitochondrial genome, and increased numbers of taxa (taxon sampling), improve the resolution of phylogenetic analyses.

Mitochondrial genome organization

By eukaryotic standards, mitochondrial genomes of vertebrates are small, at approximately 16-25 kb. Early studies on the mitochondrial genomes of mammals and the frog Xenopus revealed several common features. This circular molecule was composed of 37 genes (22 tRNA genes, two rRNA genes, and 13 protein-coding genes) in the same order. The genome was double-stranded, and composed of a heavy strand (which contained the majority of coding sequence) and a light strand (with eight tRNA genes and only one protein-coding gene, ND6; see below). The organization of the vertebrate mitochondrial genome is highly compact: with the exception of the displacement loop (D-loop) and the origin of light strand synthesis (OL), almost the entire mitochondrial genome codes for proteins, rRNAs and tRNAs, and lacks the introns characteristic of nuclear DNA. Although intergenic spaces are present within the mitochondrial genome, these are usually very small, consisting of only several bases. Such tight organization is prohibitive to transpositions since insertions would need to occur in precise locations between genes in order to maintain the function of these genes (Brown, 1983).

Nevertheless, mitochondrial gene rearrangements do occur. Although the gene arrangement common to humans and Xenopus seems conserved across the majority of vertebrates (including sharks), exceptions occur in several vertebrate groups. These rearrangements sometimes involve protein-coding genes, but novel arrangements of tRNA genes are far more common (Macey et al., 1998). Such rearrangements are documented in lampreys, certain bony fishes, certain frogs, many reptiles (tuatara, lizards, snakes, crocodilians), marsupials, and birds (Paabo et al., 1991; Macey et al., 1997a,b; Boore, 1999; Pereira, 2000; Inoue et al., 2001).

A unique arrangement, in which a second copy of the D-loop lies between the genes for NADH1 and NADH2, is found in three families of snakes (Viperidae, Colubridae, Boidae; Kumazawa and Nishida, 1993). Mitochondrial gene arrangements in vertebrates are believed to be useful as phylogenetic markers, particularly in birds, marsupials, snakes, iguanas and eels (Paabo et al., 1991; Boore and Brown, 1998; Macey et al., 2000; Pereira, 2000; Inoue et al., 2001). However, several clades of birds were found to have acquired the same mitochondrial rearrangements independently, indicating that identical rearrangements may be subject to convergence, which would undermine their utility as phylogenetic markers (Mindell et al., 1998).

In addition to gene rearrangements, duplications are surprisingly common in vertebrate mitochondrial genomes (Wolstenholme, 1992; Savolainen et al., 2000). Duplications frequently occur close to or within the non-coding D-loop and are usually responsible for the difference in size between vertebrate mitochondrial genomes. Tandemly repeated sequences occur within the D-loop of the lizard genus Cnemidophorus, several mammals (including rabbits, wolves, shrews, and pigs) and three species of fish (Alosa sapidissima, Acipenser transmontanus, Gadus morhua). These range in size from the 20 bp sequence of the rabbit to between one and three copies of a 1500 bp sequence in A. sapidissima (shad). Nontandemly arranged repeats have also been reported in the D-loop of Xenopus laevis (two copies of a 45 bp sequence) and the chicken, Gallus domesticus (two copies of a 29 bp sequence) (Savolainen et al., 2000).

Duplications are not restricted to the D-loop, as large-scale duplications involving protein-coding genes and tRNAs have also been reported. A 5 kb repeat of the sequence including 12S rRNA, 16S rRNA, the D-loop and almost the entire ND1 gene was reported in the gecko Heteronotia binoei (Zevering et al., 1991). Variation in both length (0.8 to 8 kb) and gene content occurs in several species of lizards in the genus Cnemidophorus (Moritz and Brown, 1986; Moritz and Brown, 1987). In this genus, 70% of reported duplications included the entire D-loop and at least one rRNA gene. The largest duplication was 8.0 kb, and included all genes between ND6 and ND1. Several duplications also included the protein-coding genes ND2 and ND5 (Moritz and Brown, 1987).

A unique repeat is found in the squamate reptile Bipes biporus (Amphisbaenidae). In B. biporus, a single tandem repeat of the tRNAs for proline and threonine occurs, in which one copy of each tRNA possesses mutations that render it nonfunctional, turning it into a pseudogene. This is the first report of a repeat that consists entirely of two tRNAs (Macey et al., 1998). In Chapter Six, a second occurrence of this is documented in a lamniform shark.

How might the unusual duplication present in B. biporus arise? Mitochondrial genomes have been reported for some iguanas (Acrodonta) and the tuatara (Sphenodon) in which the OL sequence is either absent or displaced. These genomes also possess tandemly duplicated genes, and it is believed that errors in light strand replication may be the mechanism for generating these duplications (Macey et al., 1997a,b, 1998). The OL in the genome of B. biporus is also displaced; hence errors in light strand synthesis may be responsible for the duplications (Macey et al., 1998). Until the exact mechanisms responsible for these gene rearrangements are understood, the usefulness of these characters as phylogenetic markers should be regarded with caution (Curole and Kocher, 1999).

Function of mitochondrial genes

The majority of mitochondrial proteins are encoded by the nucleus, manufactured in the cytoplasm, and imported into the mitochondrion. In fact, all of the enzymes required for the replication, transcription and translation of the mitochondrial genome are encoded by the nucleus (Gillham, 1994). Other nuclear-encoded proteins include those associated with the outer mitochondrial membrane and enzymes required for both the Citric Acid Cycle and fatty acid oxidation (Gillham, 1994). However, mitochondrial genes are involved in two important functions: the synthesis of mitochondrial proteins and cellular respiration.

Although mitochondrial ribosomal proteins are encoded by the nucleus, the mitochondrial genome contains all of the RNA components necessary for translation. This includes both the small (12S rRNA) and large (16S rRNA) subunits of the mitochondrial ribosome, and a complete set of tRNA genes. Unlike nuclear DNA, in which an amino acid may be recognized by several tRNAs (due to the 'wobble' effect), each mitochondrial tRNA recognizes a single codon (Cantatore and Saccone, 1987). tRNAs are present for all 20 amino acids, with two each for leucine and serine. Modifications of the tRNA for methionine result in the recognition of four start codons, instead of only the usual AUG sequence (Cantatore and Saccone, 1987). Other modifications to the mitochondrial genetic code include the use of AGA and AGG as stop codons rather than arginine codons, and the use of AUA and UGA as methionine and tryptophan, respectively, instead of isoleucine and a stop codon.

Mitochondrial genes are also involved in both the electron transport chain and ATP synthesis. The inner mitochondrial membrane contains five protein complexes associated with cellular respiration. With the exception of Complex II (the succinate-ubiquinone oxidoreductase complex), all of these complexes contain proteins encoded by mitochondrial genes. It is interesting to note that mitochondrial proteins associated with cellular respiration are all embedded in the inner mitochondrial membrane and are hydrophobic in nature. This feature may influence the evolution of these proteins, and hence the genes that code for them (see below).
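The codon reassignments described above can be made concrete with a short Python sketch. It builds the standard genetic code, applies the four vertebrate mitochondrial changes named in the text (AGA and AGG as stops, AUA as methionine, UGA as tryptophan), and translates the same sequence under both codes. The function and table names, and the 12-base example sequence, are invented for illustration and are not taken from any of the cited sources.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}

# Vertebrate mitochondrial reassignments described in the text (DNA alphabet, T for U).
VERT_MITO = dict(STANDARD)
VERT_MITO.update({"AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"})


def translate(dna, table):
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical sequence chosen to contain three reassigned codons (ATA, TGA, AGA).
seq = "ATGATATGAAGA"
print("nuclear reading:      ", translate(seq, STANDARD))
print("mitochondrial reading:", translate(seq, VERT_MITO))
```

Reading a mitochondrial gene with the nuclear code would therefore introduce spurious stop codons and amino acid differences, which is why the appropriate code must be specified when mitochondrial protein-coding genes are translated for analysis.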

NADH ubiquinone reductase (Complex I)

This complex has been called one of the most complicated enzyme systems known, with at least 19 polypeptides (41 are known for bovine heart mitochondria) necessary for its function (Gillham, 1994). Given the complexity of this system, it is no wonder that its function is still largely unknown (Shevchuk and Allard, 2001). The majority of the proteins associated with this complex are nuclear in origin, although seven subunits (NADH1 through 6 and NADH4L) are derived from the mitochondrial genome. Nuclear-derived proteins comprise the hydrophilic portion of the complex whereas mitochondrial genes encode the hydrophobic portion of the complex. This complex catalyses the oxidation of NADH by ubiquinone and is the entry point for electrons transferred from NADH into the electron transport chain. However, the discovery that NADH1 protein acts as a cell surface antigen in mice reveals that the function of these genes is still far from understood (Boore, 1997).

Ubiquinol-cytochrome c reductase (Complex III)

This complex contains the multi-protein complex Cytochrome bc1, composed of both Cytochrome b and c1 molecules, and an iron-sulfur cluster (FeS). This complex catalyzes the oxidation of reduced coenzyme-Q by Cytochrome c1. Cytochrome b is the only mitochondrially encoded component of this complex; the gene encodes the highly hydrophobic apocytochrome b protein.

Cytochrome c oxidase (Complex IV)

This complex is composed of Cytochrome c oxidase, a multi-protein complex in which three subunits of Cytochrome oxidase (CO) are derived from mitochondrial genes. The mitochondrial components are embedded in the membrane and are thus hydrophobic. Cytochrome oxidase subunits I-III form the catalytic core of the complex and are essential for its function: creating an electrochemical gradient that provides the majority of energy used to drive ATP synthesis (Adkins et al., 1996).

ATP synthase (Complex V)

The protein ATP synthase is composed of two parts, Fo and F1. F1 is composed of five nuclear-encoded proteins, and is involved in the phosphorylation of ADP. Fo sits in the inner mitochondrial membrane and contains protein channels; subunits 6 and 8 are mitochondrial in origin while subunit 9 is derived from the nucleus.

Utility of mitochondrial genomes in phylogenetic analysis

Mitochondrial genomes have several properties that make them potentially useful for phylogenetic analyses. Mitochondrial genomes can be easily purified, since they are separate from the nuclear genome and exist in high copy numbers. A growing number of vertebrate mitochondrial genomes are being sequenced, and databases such as Vertebrate MitBASE make information on these genomes readily accessible (Carone, 1999). Mitochondrial genomes are almost ubiquitous; they are known for almost all eukaryotic organisms. Mitochondrial genes evolve rapidly (as compared to nuclear genes), and since different parts of the molecule evolve at different rates, a wide spectrum of evolutionary time can (in theory) be captured using mitochondrial genomes (Hillis et al., 1996a). For example, slowly evolving ribosomal genes have been regarded as useful for resolving deep divergences. At the other extreme, the D-loop can vary intraspecifically and has been used in population studies (see below).

Several generalizations concerning the evolution of mitochondrial genomes have often been accepted with little question. For example, the theory that mitochondrial genomes are maternally inherited and do not recombine has made this molecule important in both phylogenetic and population genetic studies. The theory that mtDNA follows the neutral theory of evolution has prompted its use as a molecular clock. The next section examines some of these properties in detail, as well as how recent evidence indicates that the evolution of these molecules may not be as simple as once believed.

Rapid evolution

As noted above, early studies on vertebrate mitochondrial genomes revealed that they were genetically highly compact structures that encode the same 37 genes required for cellular metabolism. Such organization and function seemed to imply that these organelles could not tolerate change since any disruption of the molecule may prove deleterious. Hence, when it was shown that the mitochondrial genomes of primates evolved five to ten times faster than nuclear DNA (Brown et al., 1982) this result was unexpected, especially given that the function of mitochondria is dependent on their interactions with nuclear-encoded proteins (Cantatore and Saccone, 1987; Avise, 1991). It has been observed that almost all mutations involve either variations in length (mostly in the D-loop) or base substitutions, with transitions significantly outnumbering transversions (Wilson et al., 1985; Avise, 1994). In addition, these substitutions accrued predominantly at silent sites; thus the rate of mutations resulting in amino acid changes was similar to that of nuclear genes (Cantatore and Saccone, 1987).

Mitochondrial replication is inherently inaccurate since mitochondrial polymerases incorporate mispaired bases at a rate five times higher than nuclear polymerases (Brown and Simpson, 1982). Further, there is evidence that mutations accumulate in mitochondrial genomes due to inefficient repair of damaged sequences. Mutation experiments in bacteria (from which mitochondria are believed to have originated) are consistent with this hypothesis, although the exact repair mechanisms that operate in mtDNA are unknown (Avise, 1991).

Further, there is evidence that mitochondrial evolution is affected by metabolic rate. The rates of substitution in mitochondrial protein-coding genes in mammals and birds are around one order of magnitude higher than in fishes, sharks and amphibians. This has been demonstrated using complete mitochondrial sequences (Adachi et al., 1993) and CYTB (Kocher et al., 1989; Cantatore et al., 1994; Martin, 1999). Two hypotheses have been advanced to account for this, both associated with an endothermic versus ectothermic metabolism: (1) there is a relaxation of selective constraints operating on proteins in endothermic vertebrates; or (2) there is a higher mutation rate associated with endothermy. According to the first hypothesis, mitochondrial proteins must perform the same functions in ectotherms as in endotherms, but in the former these functions must be performed over a greater range of temperatures, since the animal cannot maintain a constant body temperature; this may constrain amino acid changes in ectotherms compared to endotherms. According to the second hypothesis, an endothermic metabolism requires a greater number of mitochondria per cell, and this may require a faster replication rate, which perhaps leads to a higher mutation rate due to inefficient repair of damaged mitochondrial DNA. Endotherms tend to have smaller mitochondrial genomes than ectotherms, and it is thought that a faster replication rate selectively favors smaller genomes (Rand, 1993).

Aerobic energy production by the mitochondria is known to generate reactive oxygen species that can damage DNA (Cadenas and Davies, 2000). This endogenous damage to mitochondrial DNA has been reported to lead to directional nucleotide substitutions in the form of GC → AT transitions (Martin, 1995). Thus, there is an association between weight-specific metabolic rate (SMR) and directional substitution in mammals, with an accelerated rate of O2 consumption leading to an increase in AT nucleotides in the mitochondrial genome (Martin and Palumbi, 1993; Martin, 1995). This is the "physiological clock" hypothesis. However, in a comparison of both nuclear and mitochondrial genes between mammals and sharks, Martin (1999) found that the rate of synonymous substitutions was an order of magnitude lower in sharks (Lamniformes) than in mammals for both mitochondrial and nuclear genes. This study also found little variation in substitution rate among the 11 species of Lamniformes tested, despite the presence of endothermic species in this order (Chapter One). Although metabolic activity may influence mitochondrial evolution, this is not the whole story. The mitochondria of some ectotherms evolve faster than those of birds, which are endotherms (Kumazawa and Nishida, 1993).
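The bias toward transitions noted above is usually assessed by classifying each difference between two aligned sequences as a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion (purine to pyrimidine, or the reverse). The following Python sketch shows one simple way to make that count. The function name and the two short sequences are hypothetical, and the raw count makes no correction for multiple substitutions at the same site.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ti_tv(seq1, seq2):
    """Return (transitions, transversions) between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    ti = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # skip identical sites, gaps and ambiguity codes
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            ti += 1  # change within the same chemical class
        else:
            tv += 1  # change between purine and pyrimidine
    return ti, tv


# Hypothetical aligned fragments, invented for illustration only.
x = "ATGCCGTAAGCT"
y = "ATACTGTGAGAT"
transitions, transversions = count_ti_tv(x, y)
print("transitions:", transitions, "transversions:", transversions)
```

Because each base can undergo one transition but two different transversions, substitutions that occurred without bias would be expected to yield roughly twice as many transversions as transitions; an excess of transitions therefore indicates a genuine bias.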

Paralogy and lineage sorting

It is often assumed that a phylogeny constructed for a given gene across several species is equivalent to the phylogeny of the species. However, genes undergo their own processes of evolution, which may confound organismal phylogeny. For example, transposition or duplication of a gene may lead to loss (i.e., a pseudogene) or change of function in the new copy. Thus several copies of a gene may exist that evolve under very different circumstances. These copies, which are related through duplication rather than through speciation, are known as paralogous genes. Mitochondrial genes, however, have been traditionally considered single copy and free from introgression (Cao et al., 1994; Hillis et al., 1996b; Hwang and Kim, 1999). Thus, all mitochondrial genes should be orthologous. However, as mentioned above, large-scale duplications do occur in some vertebrate mitochondrial genomes, and thus such assumptions concerning mitochondrial evolution may not be absolute. In addition, mitochondrial pseudogenes exist in the nuclear genome of several reptiles, birds and mammals (De Woody et al., 1999; Bensasson et al., 2001; Nielsen and Arctander, 2001); such nuclear mitochondrial pseudogenes (NUMTs) indicate that even mitochondrial genes may not be strictly single-copy.

A second problem that leads to error in estimations of species trees from genetic data is lineage sorting. Lineage sorting occurs when alleles are removed from populations at varying rates, due to genetic drift. Differential survival of alleles can result in an allele phylogeny that is different from the species phylogeny (Page and Holmes, 1998). There are two factors that affect the ability of a gene to track species phylogeny. The first is the rate of speciation: if rapid speciation occurs, new species are generated in less time than it takes for alleles to become fixed and therefore species phylogenies may not reflect allele phylogenies. The length of the internodes resulting from speciation is the same for both mitochondrial genes and nuclear genes. The second factor is population size: an allele is more likely to become fixed in a smaller population.

However, certain properties of mitochondria may help alleviate the above problems. Since mitochondria are believed to be maternally inherited, and hence can be considered haploid, the effective population size is one quarter that of nuclear autosomal genes. Therefore, mitochondrial genes are more likely to become fixed prior to speciation and hence more likely to reflect species phylogeny than nuclear autosomal genes (Moore, 1995). Also, the rapid evolution of mitochondrial protein-coding genes (see above) means that they may be better able to track species-level divergence (Moore, 1995).
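The one-quarter figure follows from counting the gene copies that can be passed to the next generation. As a minimal sketch, assuming equal numbers of breeding females and males, strictly maternal inheritance, and a single mitochondrial haplotype per female, the arithmetic can be written in Python as follows. The function name and the population of 500 females and 500 males are hypothetical.

```python
def transmitted_gene_copies(n_females, n_males):
    """Count transmissible gene copies under the simplifying assumptions above."""
    autosomal = 2 * (n_females + n_males)  # two copies in every diploid parent
    mitochondrial = n_females              # one maternally transmitted haplotype per female
    return autosomal, mitochondrial


autosomal, mitochondrial = transmitted_gene_copies(500, 500)
print("autosomal copies:    ", autosomal)
print("mitochondrial copies:", mitochondrial)
print("ratio:               ", mitochondrial / autosomal)
```

The ratio of one quarter holds only under an even sex ratio; if breeding females outnumber breeding males, or the reverse, the ratio shifts accordingly.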

Maternal inheritance and neutral theory

The above scenario depends on two features of mitochondrial evolution: that mitochondrial DNA is maternally inherited and that its evolution is a stochastic process (i.e., neutral and hence influenced by genetic drift). A second application of mitochondrial DNA, its use as a molecular clock, also depends on both of these principles. Although these are widely accepted properties of mitochondrial DNA, increasing evidence suggests that there are exceptions.

Maternal inheritance

Paternal inheritance of mitochondrial DNA is considered unlikely since sperm mitochondria are greatly outnumbered by those in the egg and mechanisms exist by which sperm mitochondria are located and destroyed (Bromham et al., 2003). However, a few known examples of paternal inheritance do exist in invertebrates (e.g., Drosophila, Mytilus; Avise, 1994). Although vertebrate examples of paternal inheritance exist in mice (Mus musculus and M. spretus) and the great tit (Parus major), these are limited to hybridization events and are considered exceptional (Bromham et al., 2003). Thus, in vertebrates, maternal inheritance is considered the norm. However, recent evidence of paternal inheritance in humans calls this generalization into question. In one example, a man suffering from a debilitating metabolic disorder was found to possess two distinct forms of mtDNA in his cells (heteroplasmy). Upon further examination, it was revealed that the majority of his mitochondria were derived from his father, not his mother (Bromham et al., 2003). Indeed, incidents of heteroplasmy have been documented in many vertebrate taxa (Wilson et al., 1985). In light of these discoveries in humans, the assumption that all cases of heteroplasmy are due to mutation should be tested, when feasible.

Mitochondrial evolution is a stochastic process

The neutral theory of evolution states that most mutations are deleterious and are hence quickly removed from populations. Therefore most mutations that remain in populations are neutral (i.e., convey no selective advantage to the organisms) and are fixed by a random process (genetic drift; Page and Holmes, 1998). However, what if, in contrast to the neutral theory, slightly deleterious alleles could become fixed in populations? Such a scenario may be responsible for the presence of metabolic disorders in humans (Nachman et al., 1996). Several metabolic disorders in humans are believed to be associated with defects in mitochondrial genes associated with oxidative phosphorylation (Nachman et al., 1996). Although the exact cause of such disorders is unknown, approximately 20 amino acid changes in mitochondrial genes may be associated with such diseases. Many of these changes are considered only slightly deleterious, and since their effects occur late in life they may be passed on to future generations (Nachman et al., 1996).

A second violation of the neutral theory is also found in mtDNA. Analysis of the mitochondrial NADH3 gene in mice, chimpanzees, and humans revealed that the ratio of replacement substitutions to silent substitutions was significantly higher within species compared to between species (Nachman et al., 1996). If mtDNA follows a strictly neutral mode of evolution then such a difference should not exist. Further analysis of the genes encoding both the remaining subunits of NADH and CO in humans also revealed this pattern, suggesting that it was not unusual (Nachman et al., 1996). In addition, humans with some of the metabolic disorders described above displayed higher replacement to silent substitution ratios than healthy individuals, suggesting such ratios may be influenced by deleterious mutations (Nachman et al., 1996).
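The comparison described above rests on a simple expectation: under strict neutrality, the ratio of replacement to silent changes should be about the same for polymorphisms within species as for fixed differences between species. The Python sketch below shows the arithmetic only. The counts are hypothetical values invented for illustration, not data from Nachman et al. (1996), and a real analysis would also test whether the difference is statistically significant.

```python
def replacement_to_silent(replacement, silent):
    """Ratio of replacement (amino acid changing) to silent substitutions."""
    if silent <= 0:
        raise ValueError("at least one silent substitution is needed for a ratio")
    return replacement / silent


# Hypothetical counts, invented for illustration only.
within_species = replacement_to_silent(replacement=6, silent=12)
between_species = replacement_to_silent(replacement=4, silent=40)

print("within species: ", within_species)
print("between species:", between_species)
print("within/between: ", within_species / between_species)
```

A within-to-between quotient well above one, as in this invented example, is the pattern that would be expected if slightly deleterious replacement mutations persist as polymorphisms but rarely reach fixation.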

Mitochondrial DNA as a molecular clock

Analysis of mitochondrial genomes of a variety of vertebrate taxa revealed that the mean rate of divergence over the entire mitochondrial genome for all of these species is approximately 2% per million years (Wilson et al., 1985). The idea that this constant rate of evolution existed across such diverse species prompted the use of mtDNA as a molecular clock. In addition, exclusive maternal inheritance implies that the evolutionary history of species could potentially be traced to a single individual. These assumptions inspired the search for the single ancestor of all mankind ("Mitochondrial Eve"). However, a constant rate of 2% per million years is only applicable to the mitochondrial genome as a whole, and as mentioned above (and discussed below) individual mitochondrial genes evolve at variable rates. Therefore, the clock-like behavior of any individual mitochondrial gene must be tested before it can be employed as an evolutionary chronometer (Wilson et al., 1985).
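If a constant rate is assumed, converting sequence divergence into time is a single division. The Python sketch below applies the approximate whole-genome rate of 2% per million years quoted above to a hypothetical pair of taxa that differ at 10% of sites. The function name and the 10% figure are illustrative assumptions, and the calculation ignores saturation and rate variation among genes and lineages.

```python
def divergence_time_my(percent_divergence, rate_percent_per_my=2.0):
    """Estimate time since divergence (million years) under a strict clock."""
    if rate_percent_per_my <= 0:
        raise ValueError("rate must be positive")
    return percent_divergence / rate_percent_per_my


# Hypothetical pairwise divergence of 10% between two taxa.
print(divergence_time_my(10.0), "million years")
```

The same observed divergence would imply a very different date if the true rate were lower, as has been reported for sharks relative to mammals, which is why the rate must be calibrated for the group under study.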

Summary

Although vertebrate mitochondrial genomes were once considered compact, highly structured systems, several species exhibit variations in gene arrangement and in length, due to both insertions and deletions (such as in the D-loop and OL). It is unknown why mitochondrial genes, which are so vital in metabolic function, possess extremely high mutation rates as compared to nuclear genes. Such rapid mutation rates, combined with maternal inheritance, lack of recombination, and its potential use as a molecular clock, have made this molecule a favorite in both phylogenetics and population genetics. Although there are exceptions to such generalizations, mitochondrial genes and genomes are still widely used in phylogenetic analyses. The next section will discuss the utility of mitochondrial genes and genomes in molecular evolution.

Displacement loop (D-loop)

The D-loop contains both the initiation site for replication and the promoters for transcription for both the heavy and light strands. Despite the functional importance of this region, the D-loop is the fastest evolving region of the genome: rates of evolution in human D-loop sequences have been estimated to be between three and five times faster than CYTB (Meyer, 1994). In addition to elevated substitution rates, the D-loop is frequently associated with insertions, deletions, and duplications (see above) and accumulates these mutations more frequently than any other portion of the mitochondrial genome (Stewart and Baker, 1994). Due to such exceptionally high mutation rates, D-loop sequences are often used to investigate phylogenies within species and populations (Brown et al., 1993; Hwang and Kim, 1999).

Despite such high mutation rates, there are conserved regions in the sequence of the D-loop. Identification of such conserved sequence blocks (CSBs) has been important in understanding the structure and possible function of such areas. Brown et al. (1986) divide the D-loop into three domains: a central region, and 5' and 3' flanking regions. These regions are differentiated by both their adenine composition (which is low in the central region, and high in the flanking regions) and their relative mutation rates, with the flanking sequences hypervariable and the central region very highly conserved (Brown et al., 1986; Tamura and Nei, 1993). Despite their hypervariability, both 5' and 3' regions contain sequences capable of forming secondary stem and loop structures, which are believed to be important in replication (Brown et al., 1986; Saccone et al., 1991). The function of the secondary structure of the control region has proven difficult to interpret since the above properties vary in both their position and their sequence across species (e.g., Brown et al., 1993; Dillon and Wright, 1993; Perna and Kocher, 1995; Kitamura et al., 1996).

Indeed, there appear to be very few generalizations one can make regarding the mitochondrial control region. For example, high mutation rates attributed to this area may in some lineages approach values equal to or even below those of protein-coding genes (Brown et al., 1986). In addition, certain groups (e.g., gallinaceous birds) may exhibit a higher percentage of transversions, as opposed to the high transition rate characteristic of mtDNA sequences (Desjardins and Morais, 1991; see below). The presence of possible protein-coding regions (open reading frames) within the control regions of cetaceans (Hoelzel et al., 1991) further indicates that the function, structure and evolution of the D-loop are still far from understood.

Ribosomal genes: 12S and 16S rRNA The mitochondrial ribosome is composed of a small and large subunit, encoded by the 12S and 16S rRNA genes, respectively. Both ribosomal genes form a complex secondary structure composed of a series of stem-and-loops. These enable the subunits to bind to nuclear encoded proteins, thus forming the mitochondrial ribosome. The ribosome is essential in the synthesis of mitochondrial proteins and hence is more sensitive to mutations. Consequently, ribosomal genes evolve at a slower rate than protein-coding genes and, in addition to tRNAs, are the most conserved regions of the mitochondrial genome (Brown, 1983; Hwang and Kim, 1999). Given their estimated substitution rate for these genes in various metazoan taxa, Mindell and Honeycutt (1990) suggest that ribosomal genes may be useful for resolving divergences as old as 300 mya, although they believe 150 mya to be a more suitable timeframe. Hillis and Dixon (1991), however, take a much more conservative view and state that mitochondrial ribosomal genes are only appropriate for resolving lineages that split during the Cenozoic (< 65 mya). However, differences in the evolution of 12S rRNA and 16S rRNA impact on their utility as evolutionary chronometers. For example, studies in certain mammals suggest that 12S rRNA evolves slower than 16S rRNA (Cantatore and Saccone, 1987; Hwang and Kim, 1999). Given the variation in substitution rate, Hwang and Kim (1999) suggest that the highly conserved 12S rRNA gene may be useful for resolving relationships as deep as the origin of phyla, whereas 16S rRNA is more appropriate for resolving relationships at the taxonomic level of family or genus. The secondary structure of ribosomal genes has important implications in the evolution of this molecule, and its utility in phylogenetic studies. 
Many models of sequence evolution require that substitutions events are independent; this assumption is violated when substitutions in stem regions require additional compensatory changes in order to maintain base pairing (Page and Holmes, 1998). Given the necessity for complementary base pairing in stems, it is often shown that these regions 107 evolve at different rates than do loops. For example, Springer and Douzery (1996) found that, for certain mammals, substitutions involving transversions accrue three to four times faster in loops than in stems. They suggest that transversion substitutions that maintain base pairing in stems are rare and, as such, explain why transition substitutions are higher in these regions. Transition substitutions in loops also accumulate faster than in stems, presumably because of the relaxed constraints on secondary structure in loops, which are single-stranded. The difference in rates between stems and loops has lead to differential weighting schemes for these areas in phylogenetic analyses. However, Sullivan et al. (1995) suggest that simply dividing the molecule into stems and loops underestimates rate variation observed within ribosomal RNAs. Their analysis of 12S rRNA in rodents revealed extreme among-site rate variation within this gene, which could not be explained by simple consideration of stems versus loops. Indeed, substitution rates within both loops and stems was variable, as certain stem regions evolved faster than others. They suggest that, in addition to constraints imposed by secondary structure, interactions with nuclear-encoded proteins may also restrict certain portions of the molecule (Sullivan et al., 1995). Stem and loop regions of mitochondrial ribosomal genes also exhibit differences in base composition. Springer and Douzery (1996) examined the base composition of ,both mitochondrial ribosomal genes in mammals and found their results comparable to those of other metazoans. 
Loop regions were adenine-rich (48.5%), whereas base frequencies in stems were less biased, although a higher GC content was observed. It has been suggested that this pattern is related to the structure of the ribosome: a high percentage of adenine in loops may promote hydrophobic interactions with proteins (adenine is the least polar [hydrophilic] of the bases), while G-C bonds increase the stability of the stem structure (Springer and Douzery, 1996).
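The stem-versus-loop comparison of base composition described above can be illustrated with a minimal sketch. It assumes that a secondary-structure mask is already available for the sequence (here a hypothetical string of 'S' for stem and 'L' for loop positions); the toy sequence and mask are invented for illustration and are not taken from any of the studies cited.

```python
from collections import Counter


def base_frequencies(seq: str) -> dict[str, float]:
    """Return the proportions of A, C, G and T among unambiguous bases."""
    seq = seq.upper().replace("U", "T")  # treat RNA and DNA alike
    counts = Counter(b for b in seq if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return {b: counts[b] / total for b in "ACGT"}


def partition_by_structure(seq: str, mask: str) -> tuple[str, str]:
    """Split a sequence into stem and loop bases using an 'S'/'L' mask."""
    if len(seq) != len(mask):
        raise ValueError("sequence and mask must have the same length")
    stems = "".join(b for b, m in zip(seq, mask) if m == "S")
    loops = "".join(b for b, m in zip(seq, mask) if m == "L")
    return stems, loops


# Hypothetical hairpin: two GC-rich stem arms flanking an A-rich loop.
toy_seq = "GGCCAAAAUAGGCC"
toy_mask = "SSSSLLLLLLSSSS"
stems, loops = partition_by_structure(toy_seq, toy_mask)
stem_freqs = base_frequencies(stems)
loop_freqs = base_frequencies(loops)
stem_gc = stem_freqs["G"] + stem_freqs["C"]
loop_a = loop_freqs["A"]
```

In practice the mask would come from a published secondary-structure model for the gene in question, and frequencies would be averaged across taxa.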

Transfer RNA

As in ribosomal genes, tRNAs are composed of paired stem and unpaired loop regions. However, the structure of tRNA is far less complicated than that of rRNAs: most of the secondary and tertiary base pair interactions in tRNAs have been elucidated using techniques (such as X-ray crystallography) that have proved difficult to apply to the complex structures of ribosomal molecules (Springer and Douzery, 1996). The secondary structure of tRNA forms the classic "cloverleaf", including the anti-codon loop, which acts in codon recognition during translation. The L-shaped tertiary structure is dictated by the interactions between several nucleotides. Interactions that lead to tertiary structures of non-mitochondrial tRNAs are quite different from those of mitochondrial tRNAs and may influence the variation in evolutionary rates between these molecules (see below).

tRNAs are much shorter and less complex than rRNAs, and therefore contain fewer ambiguous sites when the sequences are manipulated and aligned for analysis. As with ribosomal genes, tRNA genes evolve much more slowly than mitochondrial protein-coding genes. Thus their ease of alignment, combined with their slow evolutionary rate, may make tRNAs more suitable for analyses of deep divergences than their ribosomal counterparts, although individual tRNAs must be concatenated in order to provide a sufficient dataset (Kumazawa and Nishida, 1993). Studies have claimed that tRNA sequences were useful for resolving divergences of up to 600 mya (although 350 mya may be a more conservative estimate based on saturation plots) and performed well in recovering phylogenies at both deep and shallow levels (Kumazawa and Nishida, 1993; Miya and Nishida, 2000). As stated above, tRNAs may also be useful in resolving deep-level phylogenies through examination of the gene rearrangements that involve them.

As with ribosomal genes, tRNAs exhibit variation in substitution rates, not only between stems and loops (stems are more conservative), but also among individual loops and stems. For example, the anti-codon loop is practically invariable, whereas there is less requirement for strict Watson-Crick base pairing in stem regions of mitochondrial tRNAs; this also alleviates the need for immediate compensatory changes to maintain base pairing (Kumazawa and Nishida, 1993, 1995). In addition, nucleotides necessary for tertiary interactions in non-mitochondrial tRNAs are free to vary in mitochondrial tRNAs. The relaxation of constraints on mitochondrial tRNAs may help explain the significant difference in evolutionary rates between mitochondrial and nuclear tRNAs: mitochondrial tRNAs evolve at a rate 100 times faster than their nuclear counterparts (Brown et al., 1982; Kumazawa and Nishida, 1993). The evolutionary constraints on individual tRNAs can also vary within the mitochondrial genome.
Certain tRNAs appear to contain signal sequences required for processing polycistronic transcripts (Paabo et al., 1991), whereas tRNA-Cys forms part of the OL in most vertebrates.

Protein-coding genes

Rapid rates of evolution in mitochondrial genomes are due in large part to very high substitution rates in protein-coding genes (Brown and Simpson, 1982). The vast majority of these substitutions occur at silent sites (third base positions). For example, analysis of substitutions in CO2 sequences of rats revealed that 94% of observed substitutions did not result in amino acid replacements; hence, strong selection against replacement mutations must exist in mitochondrial genomes (Brown and Simpson, 1982).
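The concentration of substitutions at third base positions can be made concrete with a small sketch that tallies pairwise differences by codon position. It assumes two in-frame, gap-free aligned coding sequences of equal length; the function name and the toy sequences are hypothetical and serve only as an illustration.

```python
def differences_by_codon_position(seq_a: str, seq_b: str) -> tuple[int, int, int]:
    """Count mismatches at first, second and third codon positions.

    Both sequences must be aligned, in frame, and of equal length.
    Sites where either sequence has an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if len(seq_a) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    counts = [0, 0, 0]
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if a in "ACGT" and b in "ACGT" and a != b:
            counts[i % 3] += 1  # i % 3 == 2 is the third codon position
    return counts[0], counts[1], counts[2]


# Toy example: both differences fall at third positions (CTA/CTG, GGA/GGC)
# and neither changes the encoded amino acid.
toy_a = "ATGCTAGGA"
toy_b = "ATGCTGGGC"
first, second, third = differences_by_codon_position(toy_a, toy_b)
```

Deciding whether a given difference is silent additionally requires translation with the appropriate (vertebrate mitochondrial) genetic code, which this sketch deliberately omits.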

Hwang and Kim (1999) suggest that the high rate of substitution at the third codon position in mitochondrial protein-coding genes makes this position useful for species- and population-level phylogenetic studies. The advantage of using protein-coding genes, rather than the D-loop, in such studies is that the former can be more easily aligned across taxa due to the necessity of keeping the sequence in-frame (Hwang and Kim, 1999). However, these analyses must take into account the fact that third codon positions in mitochondrial protein-coding genes are saturated (Irwin et al., 1991). Rapid mutation rates at synonymous sites within nucleotide sequences of protein-coding genes are often prohibitive to their use in resolving deep divergences, and Meyer (1994) suggests that nonsynonymous sites and amino acid sequences of protein-coding genes may prove useful in this regard. Due to functional constraints, nonsynonymous sites (usually first and second base positions) and amino acid sequences of protein-coding genes are often slow to evolve, and (provided enough sites change) may therefore be useful in tracing ancient lineages (Meyer, 1994). However, certain codons, such as those for leucine, may evolve rapidly, due to silent mutations at the first codon position. Leucine may also undergo frequent substitutions to other non-polar amino acids (especially leucine → isoleucine; leucine → valine) and for this reason may not be as reliable (Janke et al., 1994; Meyer, 1994). In addition, mitochondrial protein-coding genes may contain a strong bias for T and C at the second position that is directly correlated with the functional constraint of hydrophobicity (Naylor et al., 1995). Such bias may lead to rapid saturation at second codon positions; anti-G bias at the third position limits variability at this site, and hence first positions (which do not possess strong base compositional biases) may prove to be the most informative (Naylor et al., 1995).
However, amino acid replacement rates vary both among lineages and among genes. For example, CO2 sequences of certain primates exhibit very high amino acid replacement rates, whereas CO1 and CO3 evolve at much slower rates in these species (Adkins et al., 1996; Wu et al., 2000). Elevated nonsynonymous substitution rates in these primates are also found in genes (such as CYTB and the nuclear-encoded gene for cytochrome c) that interact with CO2, suggesting coevolution of these genes (Wu et al., 2000). Zardoya and Meyer (1996) found that amino acid replacement rates in vertebrates varied considerably between genes: ND4L and ATP8, for example, exhibited faster amino acid substitution rates than more conservative genes such as CO1 and CO3.

The most commonly used mitochondrial protein-coding gene for phylogenetic analysis is CYTB. It was a favorite choice in early phylogenetic studies, since universal primers were available for amplifying this gene from various taxa (Kocher et al., 1989; Irwin et al., 1991) and there existed extensive knowledge of both the structure and function of this gene (Irwin et al., 1991; Whitmore et al., 1994). Indeed, CYTB (in addition to CO3; Griffiths, 1998) is one of the few mitochondrial proteins in which evolution of sequences can be examined at both a structural and functional level. Such analyses found that different regions of CYTB evolved at different rates, with regions believed to be essential to its role in the respiratory chain constrained and evolving at very slow rates (Irwin et al., 1991; Yoder et al., 1996). Indeed, there was significant correlation between location and type of protein function and the degree of amino acid conservation (Yoder et al., 1996). As CYTB sequences were obtained from an increasing number of taxa, it became apparent that this gene could not be used to resolve all phylogenetic questions.
Several factors were proposed as limiting the use of this gene as a molecular tool for every taxon sample at every taxonomic level: base compositional biases; variation in rates between different lineages; early saturation of third codon positions; and limited variation in amino acid substitutions and in the associated first and second base positions of each codon (Cantatore et al., 1994; Meyer, 1994). These criticisms hold for all protein-coding genes, but not all individual protein-coding genes can be considered equal for the purposes of phylogenetic analysis. Several studies have examined the utility of individual mitochondrial protein-coding genes, which has led to the practice of ranking these genes based on their perceived ability to recover intuitively attractive (especially if in broad agreement with morphological studies) and/or strongly supported phylogenies (Russo et al., 1996; Zardoya and Meyer, 1996; Miya and Nishida, 2000; Shevchuk and Allard, 2001). There was consensus on the performance of certain genes: ND1, ND4L, ND6, ATP6 and ATP8 were generally considered to be 'poor', whereas ND5 and ND4 were found to be 'good', and CYTB was usually ranked in the middle. However, other mitochondrial protein-coding genes were the source of disagreement: CO1 was regarded as 'poor' by some studies (Cao et al., 1994; Zardoya et al., 1998), but 'good' in others (Zardoya and Meyer, 1996; Miya and Nishida, 2000); and CO2 has been ranked as either 'poor' (Russo et al., 1996; Shevchuk and Allard, 2001) or 'good' (Miya and Nishida, 2000). In addition, the performance of these genes was also dependent on the choice of method (e.g., evolutionary model; amino acid versus nucleotide sequence), with genes placed in the 'good' category often fluctuating in their performance based on the method used (Cao et al., 1994; Honeycutt et al., 1995; Zardoya and Meyer, 1996; Naylor and Brown, 1997, 1998; Zardoya et al., 1998).
Also, analyses of individual genes often showed strong statistical support for implausible phylogenies (Cao et al., 1994; Zardoya and Meyer, 1996). This was also the case when individual genes were combined into a larger dataset, which could likewise produce different phylogenies with different methods and/or recover unorthodox trees with strong bootstrap values (e.g., Russo et al., 1996; Naylor and Brown, 1997).

Nevertheless, the approach of expanding the dataset in the hope that it will produce a better tree has been adopted by some studies (e.g., Cao et al., 1994; Russo et al., 1996; Zardoya and Meyer, 1996). This approach is based on the statistical principle (the Law of Large Numbers) that as more independent trials are conducted, there will be a tendency for these trials to converge on the correct answer with greater frequency (Churchill et al., 1992; Mindell and Thacker, 1996). As a result, ambiguities in the dataset will either cancel out, or appear at such low frequencies that their overall effect will become negligible (Mindell and Thacker, 1996). This theory has been adapted to phylogenetic analyses to suggest that as more data are collected, phylogenetic noise (homoplasy) will cancel out and, as such, the 'true' tree should be revealed (Churchill et al., 1992; Mindell and Thacker, 1996). However, two important assumptions of this theory may not be met when using sequence data: the requirement of independence, and the random distribution of error. For example, it is well known that functional and evolutionary constraints exist in both lineages and genes to varying degrees (see above). Such constraints indicate that individual nucleotide sites are not equivalent in their ability to change and, as such, cannot be regarded as independent characters. It is often unknown how such constraints affect phylogenetic reconstruction, and hence there is no reason to assume that error (homoplasy) is randomly distributed and will cancel out as more information is obtained (Mindell and Thacker, 1996; Naylor and Brown, 1998). Indeed, addition of more data with similar biases may actually obscure phylogenetic reconstructions, as the cumulative effects of homoplasy may overwrite the evolutionary signal (Mindell and Thacker, 1996; Naylor and Brown, 1998).
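The difference between random and systematically biased error can be illustrated with a deliberately simplified toy model. The sketch below assumes that each site independently "votes" for the correct tree with probability p_site and that the tree favoured by the majority of sites is chosen; this is an assumption made for illustration only and is not how parsimony, distance or likelihood methods actually operate. Under this assumption, more data help when p_site exceeds 0.5 and hurt when a shared bias pushes p_site below 0.5.

```python
import math


def prob_majority_correct(n_sites: int, p_site: float) -> float:
    """Probability that more than half of n_sites favour the correct tree.

    Each site is treated as an independent Bernoulli trial with success
    probability p_site (a toy assumption). n_sites must be odd so that
    ties cannot occur. Terms are summed in log space to avoid overflow.
    """
    if n_sites < 1 or n_sites % 2 == 0:
        raise ValueError("n_sites must be a positive odd integer")
    if not 0.0 < p_site < 1.0:
        raise ValueError("p_site must lie strictly between 0 and 1")
    total = 0.0
    for k in range(n_sites // 2 + 1, n_sites + 1):
        log_term = (
            math.lgamma(n_sites + 1)
            - math.lgamma(k + 1)
            - math.lgamma(n_sites - k + 1)
            + k * math.log(p_site)
            + (n_sites - k) * math.log(1.0 - p_site)
        )
        total += math.exp(log_term)
    return total


# Unbiased-but-noisy sites (p = 0.6) versus systematically misleading
# sites (p = 0.4), evaluated for increasing amounts of data.
for n in (11, 101, 501):
    helpful = prob_majority_correct(n, 0.6)
    misleading = prob_majority_correct(n, 0.4)
    print(f"n={n:4d}  p=0.6 -> {helpful:.4f}   p=0.4 -> {misleading:.4f}")
```

In this toy setting the probability of recovering the correct answer approaches one as sites are added when the per-site signal is weakly correct, and approaches zero when the per-site signal is weakly but consistently wrong.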
In addition, methods used to estimate phylogeny (such as parsimony) may not obtain the correct tree as more data are acquired (Felsenstein, 1978; Huelsenbeck et al., 1996); and, even worse, may generate an incorrect phylogeny with high statistical support (Huelsenbeck et al., 1996; Naylor and Brown, 1998). Despite these concerns, the notion that more data will result in improved phylogenies has resulted in the sequencing of large numbers of mitochondrial genomes (Zardoya et al., 1998; Pollock et al., 2000; Miya et al., 2001, 2003). Mitochondrial genomes have been viewed by some as a panacea for solving phylogenetic problems for which morphological and/or paleontological evidence has yet to provide a clear answer. With regard to the resolution of interordinal relationships of mammals, it has been said:

Progress will now be rapid, with mitochondrial DNAs from other vertebrates becoming available and other laboratories becoming involved in sequencing. After 200 years of research, the main outline of the mammalian radiation should be resolved within the next five years.

Penny and Hasegawa (1997, p.550)

Those five years have passed. However, mammal phylogenies based on mitochondrial genome sequences have raised more questions than answers (e.g., Arnason et al., 1997). There are relationships that have strong morphological support (e.g., the Rodentia+Lagomorpha clade) that are not recovered from mitochondrial phylogenies, and relationships for which the opposite is true (e.g., Afrotheria). Indeed, for all vertebrate clades, while phylogenies based on mitochondrial genomes have offered support for traditional phylogenies (e.g., Braun and Kimball, 2002; Paton et al., 2002), several analyses have recovered rather unorthodox relationships (e.g., Xu et al., 1996; Cao et al., 1998; Reyes et al., 1998; Rasmussen and Arnason, 1999a,b; Takezaki and Gojobori, 1999; Miya and Nishida, 2000; Miya et al., 2001, 2003). For some researchers, faith in phylogenies based on whole mitochondrial genomes is so strong that these phylogenies have been used to question or even overturn traditional phylogenies based on sound morphological and/or paleontological evidence (Arnason et al., 1997; Rasmussen and Arnason, 1999a,b; Miya and Nishida, 2000; Miya et al., 2001, 2003). For example, Rasmussen and Arnason (1999a,b) recovered a topology in which cartilaginous fishes (Elasmobranchii) were nested within bony fishes. Based on this result, these authors state:

[R]eevaluation of the morphological characters uniting the Chondrichthyes will be necessary.

Rasmussen and Arnason (1999a, p.2181)

Although phylogenies based on mitochondrial genome sequences may yield strange results, it is believed that sequencing more genomes from more taxa will help resolve such discrepancies (e.g., Cao et al., 1998; Mindell and Thacker, 1998; Pollock et al., 2000; Braun and Kimball, 2002; Miya et al., 2003). It has been suggested that as the taxon sample increases, the number of characters necessary to resolve phylogenies must also increase accordingly, as smaller datasets may not contain sufficient numbers of variable characters to resolve a large number of taxa (Graybeal, 1998; Poe, 1998). Analysis of entire mitochondrial genome sequences may provide the number of characters required to analyze large numbers of taxa (Pollock et al., 2000; Miya et al., 2003). Addition of taxa may assist phylogenetic analyses by improving both estimates of evolutionary model parameters (Sullivan et al., 1999; Pollock and Wilson, 2000; Zwickl and Hillis, 2002) and ancestral state reconstruction (Cao et al., 1998; Pollock et al., 2000; Salisbury and Kim, 2001). Examination of such estimates by greater taxon sampling may help enhance our understanding of the dynamics of molecular evolution (Pollock et al., 2000).

The main rationale for inclusion of additional taxa in phylogenetic analyses, however, is its potential utility in alleviating the "long branch attraction" problem, as addition of taxa may help "break up" such long branches (Graybeal, 1998; Kim, 1998; Poe, 1998; Rannala et al., 1998). Long branch attraction occurs when taxa (such as those with rapid substitution rates) possess more characters in common by chance than by shared evolutionary history (Felsenstein, 1978). The effect of this accumulation of homoplastic sites is especially pronounced when taxa with long branch lengths (i.e., those with numerous substitutions) are drawn together based on this misleading signal; as branch length increases, so does the probability that evolutionary signal will be overridden by homoplasy (Purvis and Quicke, 1997). Long branch attraction seems to be especially problematic for methods such as maximum parsimony (Felsenstein, 1978), which cannot distinguish the homoplasy in rapidly changing sites (Purvis and Quicke, 1997). It has been suggested that although the addition of taxa may increase the reliability of phylogenetic estimation, this must be done judiciously, by choosing taxa that break up long branches and that represent the group under investigation (Kim, 1996; Purvis and Quicke, 1997; Graybeal, 1998; Hillis, 1998; Poe and Swofford, 1999). However, the choice of which taxa to add is not a simple one. For example, if the phylogeny of the group in question is unknown, how can one know which taxa are likely to break up long branches (Poe, 1998)?
In addition, when examining highly speciose groups (e.g., > 23,500 species of teleosts; Miya et al., 2003), how can one decide which taxa accurately represent the diversity of the group being investigated (Pollock et al., 2002; Rosenberg and Kumar, 2003)? Furthermore, addition of taxa is no guarantee that phylogenies will improve. Indeed, the inclusion of taxa may simply compound the error by creating additional long branches, such as when added taxa themselves possess rapid evolutionary rates (Hillis, 1998; Kim, 1996; Poe, 1998). Finally, the choice of taxa intended to break up long lineages usually requires some acquaintance with the group being tested. Hence, although increased taxon sampling may improve phylogenetic estimations, choosing which taxa to add is not a simple procedure.

Summary

Variation in evolutionary rates of mitochondrial genome sequences may provide insight into a wide spectrum of evolutionary history, from deep-level divergences (ribosomal and transfer RNAs) to population genetic studies (D-loop, third codon position of protein-coding genes). Evolutionary rates of mitochondrial protein-coding genes vary both among genes and among lineages. Efforts to distinguish useful genes for phylogenetic analyses often resulted in inconsistent reconstructions, which were dependent on the genes, methods and taxa included in analyses. Use of entire mitochondrial genome sequences combined with increased taxon sampling is believed to increase the accuracy of phylogenetic estimation. However, the question of which taxa to include in such analyses remains unresolved. In an effort to resolve ambiguities in lamniform systematics, an analysis of entire mitochondrial genome sequences was conducted. As all living taxa were included in this study, the question of which taxa to include did not arise. Lamniform sharks thus represent an ideal opportunity to test whether entire mitochondrial genome sequences combined with complete taxon sampling will lead to improved phylogenies.

CHAPTER FIVE: A PHYLOGENY OF LAMNIFORM SHARKS BASED ON WHOLE MITOCHONDRIAL GENOME SEQUENCES

Abstract

Mitochondrial genomes for all 15 species of lamniform sharks were sequenced. Mitochondrial sequences (individual genes and multigene datasets) were analyzed by a variety of methods (maximum parsimony, distance, maximum likelihood). The current study found support for the same relationships as previous molecular studies, but was unable to resolve relationships further than past analyses. As in previous molecular studies, the genus Alopias proved to be especially problematic; monophyly of this genus was rarely recovered, and never strongly supported. The non-monophyly of Alopias is highly improbable on morphological grounds, and this result strongly indicates that further examination of the genes is required in order to determine an accurate phylogeny. Based on the current study of all living lamniforms, complete taxon sampling and larger datasets do not necessarily provide a better phylogeny.

Introduction

The order Lamniformes comprises 15 living species of sharks organized into seven families. Sharks within this group include notorious predators, such as the great white shark (Carcharodon carcharias), harmless filter-feeders such as the basking shark (Cetorhinus maximus) and megamouth (Megachasma pelagios), and unusual deep-water sharks such as the goblin shark (Mitsukurina owstoni). In addition to filter-feeding, lamniform sharks display a variety of unique adaptations such as endothermy (Lamnidae and Alopiidae) and uterine cannibalism (Chapter One).

The systematics of the order Lamniformes have yet to be subjected to a thorough anatomy-based cladistic analysis; this makes it difficult to determine the evolutionary significance of morphological characters (Chapter Two). Also, with the possible exception of the odontaspidids, most extant lamniform taxa are highly autapomorphic, and this may obscure characters that could be used to link individual taxa together (Compagno, 1990b). Further, if modern lamniform species represent a small and non-representative sample of the past diversity of this group, this would further compound the difficulty of discerning relationships among extant lamniform taxa (Compagno, 1990b; Shirai, 1996; Chapter Three). Lamniformes are common in the Cretaceous, but since then they have diminished in abundance, whereas other neoselachian orders (especially Carcharhiniformes) have increased in diversity. Also, as is typical of Chondrichthyes, most fossil lamniform taxa are represented solely by teeth, which prevents a greater knowledge of the morphology of fossil taxa that may be intermediate between extant species (Gaudin, 1991). Unfortunately, the fossil record of lamniform sharks does little to help resolve the relationships of this group.
Molecular studies, using nuclear and/or mitochondrial genes, have yet to clarify most of the relationships within the Lamniformes (Martin and Naylor, 1997; Morrissey et al., 1997; Naylor et al., 1997; Martin et al., 2002; Martin and Burg, 2002; Lopez et al., MS; Chapter Two). Although a monophyletic Lamnidae is strongly supported in most analyses, these same analyses have yet to elucidate relationships within this family. Carcharias and Cetorhinus were frequently allied with the Lamnidae, although this was not always well-supported. The placement of Cetorhinus as the sister taxon to the Lamnidae is strongly supported by anatomical evidence (Compagno, 1990b), whereas a close relationship between this Cetorhinus+Lamnidae clade and Carcharias has no precedent in the morphological literature. An "AMP" clade containing Alopias, Megachasma, Odontaspis and Pseudocarcharias was often recovered in this study, but relationships within this clade have proved intractable. Curiously, Alopias and Odontaspis were not always recovered as monophyletic genera, although there is no basis to dispute the monophyly of these genera on morphological grounds. These molecular analyses also reached no consensus on the position of Mitsukurina as the basal lamniform taxon, as suggested by Compagno (1990b).

It has been suggested that large molecular datasets, such as entire mitochondrial genome sequences, combined with increased taxon sampling, may help resolve controversial phylogenies (Cao et al., 1998; Mindell and Thacker, 1998; Miya et al., 2001, 2003; Braun and Kimball, 2002). As only 15 extant lamniform sharks are known, this group presents an excellent opportunity to test the hypothesis that complete taxon sampling combined with entire mitochondrial genome sequences will resolve phylogeny.

Methods and Materials

DNA extraction and PCR

DNA was extracted from tissue using the High Pure PCR Template Preparation Kit (Roche, Nutley, NJ). Three mitochondrial genes of lamniform sharks (CYTB, ND2, ND4) have previously been sequenced (Naylor et al., 1997; Lopez et al., MS) and are available in GENBANK. These three genes are distributed in such a way as to subdivide the genome into three separate, unequally sized fragments: CYTB→ND2, ND2→ND4, and ND4→CYTB. Primers were designed to amplify these three fragments with at least 450 bp of overlap, as required for assembly of the final genome sequence (see below). Primer sequences are shown in Table 5.1; successful primer combinations are shown in Table 5.2. The three fragments were amplified by Polymerase Chain Reaction (PCR) using Takara LA Taq polymerase (Takara Shuzo, Japan).

Table 5.1. Primers used to amplify mitochondrial genomes for 15 lamniform species.

Primer       Sequence (5' → 3')
16SFOR1L     CGA GTA GCG GTG ACA AGC C
16SREV1L     CGC AAT CCT TTC TCA GAG TCC
16SFOR2L     CGC AAT CCT TTC TCA GAG TCC
16SREV2L     CAG ACT AGA AGT CAG TGG GAA CC
ASNM         AAC GCT TAG CTG TTA ATT AA
C61211H      CTC CAG TCT TCB RCT TAC AAG
CO1FOR1L     CTA GTG CCC TTA ATA ATT GGT GC
CO1REV1L     AAA TTA AAG AGC CGA TAG AGG AG
COIIFOR1L    TGA CTC CTA AGT CCA GAC CC
COIIREV1L    TGG TCA GTT TCA GGG CTC G
CYTBFOR5     CAA CTA TAA GAA TTT ATG GCC
CYTBFOR6     CCG TAA TAT YCA YGC CAA CGG AGC
CYTBFOR7     GGC TGA CTT ATC CGC AAC ATC CAC
CYTBFOR8     CCG YAA YAT YCA TGC CAA CGG AGC
CYTBFOR9     CCG CAG ACA TYT CYA TAG CC
CYTBREV6     TGG TTG TTC AAC TGG TTG KCC TCC
CYTBREV7     ATG GYT GTT CAA CKG GYT GWC C
GLUDG        TGA CTT GAA RAA CCA YCG TTG
GLYFOR1L     TTA CTT CAC AGC CCT CCA AGC
ND4REV1L     ACA AGG AAG GCG ATG AGG C
ILEM         AAG GAC CAC TTT GAT AGA GT
LEU-S        CAT AAC TCT TGC TTG GAG TTG CAC CA
ND2FOR8      CCG RGC RGT AGA AGC YTC CAC
ND2FOR9      GCC ACA CTR GCY ACA ATC GC
ND2REV6      CTG GGT TGC ATT CAG AAG ATG TGA GG
ND2REV7      TCT GGG TTG CRT TCR GAR GAT GTG
ND4FOR3      ACC AGT TCC ATC TGC TTA CGG C
ND4FOR4      CCG AAC CAT ACT CCT AGC CCG A
ND4FOR5      CGA ACW ATA CTW CTR GCY CGA GG
ND4IFMT6     GAG AGA GGT CYG GGA CAC GAA GAY CTG CTA
ND4REV3      TCG TAT CCC TGA CCT CTC TCG G
ND4REV4      TCG TGT CCY KGA CCT CTC TCG
ND4L         TGA CTA CCA AAA GCT CAT GTA GAA GC
ND5FOR1L     GTC AGG GAC ACG AAG AAC TGC
ND5REV1L     GGA TCA GAG TGT ATG TAT CAT AAG GC
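Several primers in Table 5.1 contain degenerate positions written with standard IUPAC ambiguity codes (e.g., Y = C or T; R = A or G). As an illustrative sketch only, the following shows how the number of distinct oligonucleotides represented by a degenerate primer can be computed and enumerated. The function names are hypothetical, and the CYTBFOR8 sequence is copied from the table as printed.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    n = 1
    for base in primer.replace(" ", "").upper():
        n *= len(IUPAC[base])
    return n


def expand(primer: str) -> list[str]:
    """Enumerate every unambiguous sequence encoded by the primer."""
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


# CYTBFOR8 as listed in Table 5.1; it contains three Y (C/T) positions.
cytbfor8 = "CCG YAA YAT YCA TGC CAA CGG AGC"
n_variants = degeneracy(cytbfor8)
variants = expand(cytbfor8)
```

Enumeration is only practical for primers of low degeneracy, since the number of variants grows multiplicatively with each ambiguous position.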

A touchdown PCR method was used. This began with an initial denaturation at 94°C for 6 mins, followed by the touchdown phase, composed of 17 cycles of denaturation (98°C for 20s), annealing (65-49°C for 30s; the annealing temperature was decreased by 1°C with each successive cycle), and extension (68°C for 15 mins). The touchdown phase was followed by 24 cycles of denaturation (98°C for 20s), annealing (48°C for 30s), and extension (68°C for 8 mins), and then by a final extension step at 72°C for 10 mins. Reaction mixtures were held at 4°C after completion of PCR. If the product was not "clean" (i.e., diffuse or multiple bands), or if there was no apparent product, the reaction was adjusted by several methods, such as raising or lowering the range of annealing temperatures over which the touchdown phase occurred, or altering the number of cycles. Amplicons were observed by agarose gel electrophoresis (0.8-1.0% agarose in 1xTAE). If undesirable bands could not be removed by altering the PCR program, the remaining reaction product was run on Low Melting Point (LMP) agarose (Amersham Pharmacia, Piscataway, NJ) and the desired amplicon was extracted from the gel using the β-Agarase I Extraction Kit (New England Biolabs, Beverly, MA). Amplicons were purified using MicroCon columns (Millipore, Amicon, Billerica, MA), and DNA was quantified using fluorometry.
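The annealing temperatures implied by the touchdown program can be listed cycle by cycle. The sketch below is only a bookkeeping aid, with hypothetical function and parameter names; the default values are those stated in the text (65°C decreasing by 1°C per cycle to 49°C, then 24 cycles at 48°C).

```python
def touchdown_annealing_schedule(
    start_c: int = 65,
    end_c: int = 49,
    step_c: int = 1,
    plateau_c: int = 48,
    plateau_cycles: int = 24,
) -> list[int]:
    """Return the annealing temperature (°C) for every PCR cycle, in order.

    The touchdown phase runs from start_c down to end_c inclusive,
    dropping by step_c each cycle; it is followed by plateau_cycles
    cycles at a constant plateau_c.
    """
    touchdown = list(range(start_c, end_c - 1, -step_c))
    plateau = [plateau_c] * plateau_cycles
    return touchdown + plateau


schedule = touchdown_annealing_schedule()
touchdown_cycles = [t for t in schedule if t > 48]
total_cycles = len(schedule)
```

With the default values the touchdown phase spans 17 cycles, in agreement with the cycle count given in the protocol; adjusting `start_c` and `end_c` corresponds to raising or lowering the annealing range as described above.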

DNA Sequencing I. DOE/JGI

The majority of sequencing of mitochondrial genomes was performed by researchers at the Department of Energy Joint Genome Institute (DOE/JGI, Walnut Creek, CA) under the supervision of Dr J. Boore. The following is a brief overview. Complete protocols are available at http://www.jgi.doe.gov/Internal/prots_index.html. This protocol is designed for intact circular DNA, such as whole mitochondrial or microbial genomes, but was adapted for use with mitochondrial fragments; this required that (1) the amount of DNA was at least 2 µg; (2) the three fragments overlapped; and (3) the fragments were combined in equimolar amounts. The combined DNA fragments were mechanically sheared into blunt-ended inserts that were ligated into plasmid pUC18 vector and transformed into competent E. coli. The plasmid DNA was then amplified for automated sequencing.

DNA Sequencing II

Sequences were edited and aligned in Vector NTI (InforMax, North Bethesda, MD) and BioEdit (© T. Hall, Dept. Microbiology, North Carolina State University, NC). The identity of ambiguous nucleotides was determined by visual inspection of chromatograms, viewed in Sequencher (Gene Codes, Ann Arbor, MI). Although most of the genome was sequenced at DOE/JGI using the above protocol, sequences provided by DOE/JGI sometimes had missing regions, or stretches of undetermined or ambiguous sequence. These "gaps" added up to approximately 3.5 kb across three species (A. superciliosus, C. taurus, L. ditropis), and were filled in using PCR with primers based on flanking sequence (Tables 5.1, 5.3). Amplicons were purified and quantified as described above. Automated sequencing was performed at the Iowa State University DNA Sequencing and Synthesis Facility, Ames, IA. Reactions were set up for dideoxy sequencing using the Prism BigDye Terminator Cycle Sequencing Ready Reaction Kit Version 3.1 with AmpliTaq DNA Polymerase, FS (Applied Biosystems, Foster City, CA), and electrophoresed on a Prism 377 DNA Sequencer (Applied Biosystems).

Table 5.2. Successful primer combinations used to amplify mitochondrial genomes for 15 lamniform species.

Taxon & specimen #                CYTB→ND2              ND2→ND4              ND4→CYTB
Alopias pelagicus 925             CYTBFOR6 + ND2REV3    ND2FOR8 + ND4REV4    ND4FOR4 + CYTBREV6
Alopias superciliosus 874         CYTBFOR6 + ND2REV3    ILEM + LEU-S         ND4IFMT6 + C61211H
Alopias vulpinus 429              CYTBFOR8 + ND2REV7    ND2FOR8 + ND4REV4    ND4FOR5 + CYTBREV7
Carcharias taurus 238             CYTBFOR8 + ND2REV7    ILEM + LEU-S         ND4FOR5 + CYTBREV7
Carcharodon carcharias 864        CYTBFOR8 + ND2REV7    ILEM + LEU-S         ND4IFMT6 + C61211H
Cetorhinus maximus 1058           GLUDG + ASNM          ND2FOR8 + ND4REV4    ND4FOR4 + CYTBREV6
Isurus oxyrinchus 412             CYTBFOR8 + ND2REV7    ILEM + LEU-S         ND4IFMT6 + C61211H
Isurus paucus 614                 CYTBFOR8 + ND2REV7    ILEM + LEU-S         ND4L + C61211H
Lamna ditropis 1062               CYTBFOR9 + ND2REV7    ND2FOR9 + ND4REV4    ND4FOR4 + CYTBREV6
Lamna nasus 632                   CYTBFOR8 + ND2REV7    ND2FOR9 + ND4REV4    ND4FOR5 + CYTBREV7
Megachasma pelagios 2724          CYTBFOR7 + ND2REV6    ND2FOR8 + ND4REV3    ND4FOR3 + CYTBREV6
Mitsukurina owstoni 1057          CYTBFOR9 + ND2REV7    ILEM + LEU-S         ND4L + C61211H
Odontaspis ferox 1084             CYTBFOR5 + ASNM       ND2FOR8 + ND4REV4    ND4L + C61211H
Odontaspis taurus 1422            CYTBFOR9 + ND2REV7    ND2FOR9 + ND4REV4    ND4FOR4 + CYTBREV6
Pseudocarcharias kamoharai 1033   CYTBFOR9 + ND2REV7    ILEM + LEU-S         ND2FOR4 + CYTBREV6

Table 5.3. Gaps in genomic sequence filled in by individual PCR reactions.

Species            Location of gap    Primers
A. superciliosus   ND5                ND5FOR1L + ND5REV1L
C. taurus          CO1                CO1FOR1L + CO1REV1L
C. taurus          tRNA-Gly→ND4       GLYFOR1L + ND4REV1L
L. ditropis        16S                16SFOR1L + 16SREV1L
L. ditropis        16S                16SFOR2L + 16SREV2L
L. ditropis        COII               COIIFOR1L + COIIREV1L

Genome annotation

Nucleotide sequences were edited and aligned in Vector NTI and BioEdit, and were translated into amino acid sequences using Se-Al (© A. Rambaut, Dept. Zoology, University of Oxford, UK). Lamniform mitochondrial genomes maintained the same gene order as other shark mitochondrial genomes deposited in GENBANK (Mustelus manazo, Heterodontus francisci). Hence, annotation of lamniform genomes was accomplished by comparison with these genomes; annotations for mitochondrial genomes of all 15 lamniform sharks are listed in the Appendix in GENBANK format.

Sequence alignment and outgroup taxa

Only a limited number of entire mitochondrial genome sequences of sharks are available in GENBANK. Two taxa were nominated as outgroups: H. francisci (Heterodontiformes; a putative basal galeomorph; Chapter Two) and M. manazo (Carcharhiniformes, the clade widely regarded as the sister taxon to Lamniformes; Chapter Two). As mentioned above, lamniform sequences were fairly conserved when compared to outgroup taxa; however, several ambiguous sites were removed from the analyses. The "whole genome" dataset for each species consisted of all individual genes aligned end-to-end; the total number of characters for the entire genome combined in this manner was 16,672 base pairs.
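Joining individual gene alignments end-to-end can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the thesis: each gene alignment is assumed to be a dictionary mapping taxon name to an aligned sequence, and the function name `concatenate_alignments` and the toy sequences are hypothetical.

```python
def concatenate_alignments(gene_alignments, gene_order):
    """Join per-gene alignments end-to-end.

    Returns the combined alignment and, for each gene, the half-open
    column range (start, end) it occupies in the combined alignment.
    """
    taxa = sorted(gene_alignments[gene_order[0]])
    supermatrix = {taxon: "" for taxon in taxa}
    partitions = {}
    start = 0
    for gene in gene_order:
        alignment = gene_alignments[gene]
        if sorted(alignment) != taxa:
            raise ValueError(f"{gene}: taxon set differs from {gene_order[0]}")
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not of equal length")
        length = lengths.pop()
        for taxon in taxa:
            supermatrix[taxon] += alignment[taxon]
        partitions[gene] = (start, start + length)
        start += length
    return supermatrix, partitions


# Toy data, made up for illustration only.
toy = {
    "CYTB": {"Lamna_nasus": "ATGGCA", "Isurus_paucus": "ATGGCC"},
    "ND2": {"Lamna_nasus": "ATGA", "Isurus_paucus": "ATGG"},
}
matrix, parts = concatenate_alignments(toy, ["CYTB", "ND2"])
print(matrix["Lamna_nasus"], parts)
```

Keeping the column range of each gene makes it straightforward to analyse the same combined alignment gene by gene, or to exclude a gene later.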

Phylogenetic Analysis

Mitochondrial genes (16 total) were examined individually (with the exception of the 22 tRNAs, which were combined into a single dataset) using nucleotide sequences and (for protein-coding genes only) amino acid sequences. Datasets were combined in three ways: (1) nucleotide sequences for the whole genome (see above); (2) combined nucleotide sequences for all protein-coding genes; and (3) combined amino acid sequences for all protein-coding genes. In addition, homogeneity between different protein-coding genes was examined using the Incongruence Length Difference (ILD) test (Farris et al., 1995) as implemented in PAUP v. 4.0 (Swofford, 1996). Protein-coding genes were also examined using DRUIDS (Fedrigo et al., MS), a program designed to identify regions in multiple alignments that exhibit statistically significant deviation from stationarity (DFS) for various biochemical properties. DFS for two parameters believed to affect phylogenetic analyses (hydrophobicity and volume; Fedrigo et al., MS) was calculated for individual protein-coding genes. Regions in which both parameters displayed significant DFS (F%=10) were recorded and used in analyses of both individual and combined protein-coding genes.

For each set of both individual and combined genes (20 total), several standard phylogenetic analyses (i.e., maximum parsimony [MP], maximum likelihood [ML], and distance optimality criteria) were conducted using PAUP v. 4.0; amino acid sequences, however, were only examined using maximum parsimony. Protein-coding genes of lamniform sharks are known to exhibit extreme homoplasy of transitions relative to transversions, especially at the third codon position (Martin and Naylor, 1997; Lopez et al., MS); this factor was therefore considered in phylogenetic analyses. Five different analyses using MP with a heuristic search strategy were conducted: two for all genes (unweighted nucleotide sequences; transversion substitutions only), and three for protein-coding genes only (unweighted amino acid sequences; nucleotide sequences without third codon position; deletion of regions with significant DFS). The settings for likelihood analyses were determined with Modeltest (v. 3.0; Posada and Crandall, 1998), which uses likelihood ratio tests to choose, among commonly used substitution models, the one that best fits the data. A Neighbor-Joining clustering analysis (NJ) was conducted using five standard models of sequence evolution (Jukes-Cantor; Kimura's two-parameter model; Felsenstein [1981]; Hasegawa, Kishino and Yano [1985]; general reversible model), as well as the settings suggested by Modeltest. All trees generated using MP and distance criteria were bootstrapped.
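Two of the data manipulations and two of the distance models named above can be sketched in a few lines. This is an illustration only and does not reproduce the PAUP settings: it shows removal of third codon positions, the classification of differences into transitions and transversions, and the standard Jukes-Cantor and Kimura two-parameter distance formulas. All function names and the example sequences are hypothetical; sequences are assumed to be aligned and, for codon handling, in frame.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def drop_third_positions(seq):
    """Remove every third base from an in-frame coding sequence."""
    return "".join(base for i, base in enumerate(seq) if i % 3 != 2)


def ts_tv_proportions(seq1, seq2):
    """Return (P, Q): proportions of transitions and transversions."""
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G or C<->T is a transition; anything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return transitions / compared, transversions / compared


def jukes_cantor(p):
    """Jukes-Cantor distance from the proportion of differing sites p."""
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p(P, Q):
    """Kimura two-parameter distance from transition (P) and transversion (Q) proportions."""
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))


# Hypothetical pair of aligned sequences: two transitions, one transversion.
s1, s2 = "ATGCCGTTA", "ATACCTTTG"
P, Q = ts_tv_proportions(s1, s2)
print(P, Q, jukes_cantor(P + Q), kimura_2p(P, Q))
print(drop_third_positions(s1))
```

A transversion-only analysis uses Q alone, which discards the transition signal that is most prone to saturation at third codon positions.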

Criteria for evaluating trees

As relationships among lamniform sharks are largely unresolved, two criteria were used to evaluate trees: morphological evidence and consistency. Morphological evidence strongly supports the monophyly of the Lamnidae, and of all non-monotypic genera in the order (Lamna, Isurus, Alopias, Odontaspis); hence, these should be recovered with strong bootstrap support. The use of clades with strong morphological support as phylogenetic approximators has been suggested by Miyamoto (1994). In addition, the same datasets exposed to different analyses (e.g., nucleotides versus amino acids) should yield the same (consistent) results.
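The first criterion amounts to asking whether a given set of taxa forms a clade in a recovered tree. A minimal sketch of such a check is given here, assuming a rooted tree written as nested tuples of taxon names; the function names and the toy topology are hypothetical and do not represent any tree recovered in this study.

```python
def leaves(tree):
    """Return the set of taxon names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def is_monophyletic(tree, taxa):
    """True if some subtree of the rooted tree contains exactly `taxa`."""
    taxa = set(taxa)
    if leaves(tree) == taxa:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, taxa) for child in tree)


# Toy rooted topology, made up for illustration only.
toy_tree = (
    (("A_vulpinus", "A_pelagicus"), ("A_superciliosus", "M_pelagios")),
    (("L_nasus", "L_ditropis"), "C_carcharias"),
)
print(is_monophyletic(toy_tree, {"L_nasus", "L_ditropis"}))
print(is_monophyletic(toy_tree, {"A_vulpinus", "A_pelagicus", "A_superciliosus"}))
```

In the toy topology the two Lamna species form a clade, whereas the three Alopias species do not, because A_superciliosus groups with another taxon.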

Results

Composition of lamniform genomes

In general, the arrangement of genes in lamniform sharks was similar to that of M. manazo and H. francisci. However, three species have an insert between tRNA-Thr and tRNA-Pro: A. superciliosus (36 bp), C. taurus (35 bp) and M. owstoni (1058 bp; see Chapter Six). C. carcharias has an insert (40 bp) inside the D-loop. There was slight variation in the size of lamniform mitochondrial genomes (Appendix).

Incongruence Length Difference test

The results of the ILD test (Table 5.4) suggest that not all mitochondrial genes of lamniform sharks are combinable. Previous analyses reported that ND6 was incongruent with other mitochondrial genes based on results of ILD tests (Shevchuk and Allard, 2001). This result was explained by the fact that the light strand and heavy strand of mitochondrial DNA possess different base compositions: the light strand is very low in G, and therefore the heavy strand is very low in C (Cao et al., 1994; Janke et al., 1994; Shevchuk and Allard, 2001). However, the ILD results showed that ND6 was not the only incongruent gene. This contrasts with the assumption that, since all mitochondrial genes are physically linked and mitochondrial genomes do not recombine, genes within the genome should share the same evolutionary history (e.g., Sullivan et al., 1995; Yoder et al., 1996). A subset of seven genes (CYTB, ND3, ATP8, ATP6, ND4L, CO2, CO3) was selected based upon the results of the ILD test; these genes were combined because the test did not reject congruence among them (p-values > 0.05). Using only combinable genes, the expected relationships of a monophyletic Lamnidae and Cetorhinus+Lamnidae were usually recovered; but standard distance analyses did not recover Carcharias as the sister taxon to the latter clade. Using the ILD dataset, there was discordance among methods over whether Lamna or Carcharodon was the sister taxon to Isurus. A monophyletic Odontaspis was recovered (but not well-supported), but never a monophyletic Alopias. AMOP (unresolved) and an A. vulpinus + A. pelagicus clade were usually well-supported.

Phylogenetic analyses

Table 5.5 outlines the mitochondrial datasets used in this study. Table 5.6 shows the results of lamniform mitochondrial datasets as analyzed by various methods; these results are summarized below.

Whole genome

Table 5.7 shows the sequence divergence between individual shark taxa for aligned nucleotide sequences for the whole genome. Using the whole genome, most methods of analysis found strong support for the following clades: Lamnidae; Lamna; Isurus; Lamna+Isurus; Cetorhinus+Lamnidae; Carcharias + (Cetorhinus+Lamnidae); Odontaspis; and AMOP. However, two topologies for Mitsukurina were recovered (basal to other lamniforms; sister taxon to AMOP) depending upon the method used. Further, relationships within AMOP were unresolved, and Alopias was rarely recovered as monophyletic (and monophyly was never well-supported). The same lack of resolution for these taxa is exhibited by analyses of individual genes, and demonstrates that adding more data does not automatically generate a 'better' tree. Certain of the trees resulting from whole genome datasets are shown in Figs. 5.1-5.5.

Table 5.4. Results of the Incongruence Length Difference test. Numbers in the table represent p-values; those in bold indicate p-values greater than 0.05.

      ATP8  CYTB  CO1   CO2   CO3   ND1   ND2   ND3   ND4   ND4L  ND5   ND6
ATP6  0.87  0.73  0.34  0.46  0.44  0.10  0.04  0.41  0.23  0.24  0.13  0.01
ATP8        0.62  0.87  0.31  0.42  0.27  0.07  0.68  0.27  0.58  0.35  0.01
CYTB              0.83  0.32  0.28  0.42  0.01  0.85  0.51  0.38  0.34  0.01
CO1                     0.74  0.95  0.28  0.39  0.85  0.44  0.68  0.62  0.01
CO2                           0.46  0.01  0.01  0.27  0.07  0.11  0.02  0.01
CO3                                 0.04  0.04  0.95  0.11  0.65  0.16  0.01
ND1                                       0.01  0.46  0.11  0.13  0.04  0.01
ND2                                             0.14  0.01  0.12  0.06  0.01
ND3                                                   0.46  0.80  0.36  0.02
ND4                                                         0.30  0.39  0.01
ND4L                                                              0.62  0.01
ND5                                                                     0.01
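The pairwise p-values in Table 5.4 can be used to check whether a proposed set of genes is mutually combinable under the p > 0.05 criterion. The sketch below only performs that pairwise check; it does not reproduce how the seven-gene subset was chosen, and the function name `incongruent_pairs` is hypothetical. The p-values are transcribed from Table 5.4.

```python
from itertools import combinations

GENES = ["ATP6", "ATP8", "CYTB", "CO1", "CO2", "CO3", "ND1",
         "ND2", "ND3", "ND4", "ND4L", "ND5", "ND6"]

# Upper triangle of Table 5.4: row i holds p-values against GENES[i + 1:].
UPPER = [
    [0.87, 0.73, 0.34, 0.46, 0.44, 0.10, 0.04, 0.41, 0.23, 0.24, 0.13, 0.01],
    [0.62, 0.87, 0.31, 0.42, 0.27, 0.07, 0.68, 0.27, 0.58, 0.35, 0.01],
    [0.83, 0.32, 0.28, 0.42, 0.01, 0.85, 0.51, 0.38, 0.34, 0.01],
    [0.74, 0.95, 0.28, 0.39, 0.85, 0.44, 0.68, 0.62, 0.01],
    [0.46, 0.01, 0.01, 0.27, 0.07, 0.11, 0.02, 0.01],
    [0.04, 0.04, 0.95, 0.11, 0.65, 0.16, 0.01],
    [0.01, 0.46, 0.11, 0.13, 0.04, 0.01],
    [0.14, 0.01, 0.12, 0.06, 0.01],
    [0.46, 0.80, 0.36, 0.02],
    [0.30, 0.39, 0.01],
    [0.62, 0.01],
    [0.01],
]

PVALUE = {}
for i, row in enumerate(UPPER):
    for offset, p in enumerate(row, start=1):
        PVALUE[frozenset((GENES[i], GENES[i + offset]))] = p


def incongruent_pairs(genes, alpha=0.05):
    """Return gene pairs whose ILD p-value is not greater than alpha."""
    return [pair for pair in combinations(genes, 2)
            if PVALUE[frozenset(pair)] <= alpha]


subset = ["CYTB", "ND3", "ATP8", "ATP6", "ND4L", "CO2", "CO3"]
print(incongruent_pairs(subset))                  # prints []
print(len(incongruent_pairs(subset + ["ND6"])))   # prints 7
```

Every pair within the seven-gene subset has p > 0.05, whereas ND6 is incongruent with each of the seven genes.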

Protein-coding genes

Table 5.8 shows the sequence divergence between individual shark taxa for both nucleotide and amino acid sequences for combined protein-coding genes. Combining all mitochondrial protein-coding genes into a single dataset produced similar results to the whole genome dataset. Notable exceptions include: Lamna+Isurus was no longer well-supported (and some methods failed to recover this clade); and a monophyletic Alopias was recovered in more analyses (though, again, this was never well-supported). Using amino acid sequences instead of nucleotides did not improve the resolution of any tree, even for this large dataset (Fig. 5.6). ND5, ND4, and CYTB have been described as 'good' genes for phylogenetic analysis (Russo et al., 1996; Zardoya and Meyer, 1996; Zardoya et al., 1998; Miya and Nishida, 2000). However, unlike the majority of other genes or combined datasets, ND5 did not always recover a monophyletic Lamnidae, nor a Cetorhinus+Lamnidae clade, using parsimony analyses. ND5 was also unable to resolve relationships within Lamnidae or AMOP. Neither ND5 nor ND4 could clearly resolve the position of Mitsukurina. ND4 usually recovered a well-supported Lamnidae, and the distance analyses recovered a well-supported Isurus+Carcharodon clade (contra the Lamna+Isurus clade recovered by using the whole genome dataset; see above). Certain methods recovered a monophyletic Alopias for CYTB, ND4, and ND5; but this collapsed upon bootstrapping. Distance analyses of ND4 and ND5 found strong support for a monophyletic Odontaspis; this was not the case for any other individual gene.

Ribosomal and transfer RNA genes

Most methods that used ribosomal and transfer RNAs showed strong support for monophyletic Lamnidae, Lamna, and Isurus. Lamna+Isurus was frequently recovered using these genes, although this result depended upon the method used. Most distance methods for 12S and 16S recovered a monophyletic Odontaspis; in contrast, only distance methods for 12S recovered a monophyletic Alopias (although this result was not upheld in bootstrap analyses). A monophyletic Alopias was never recovered using 16S. Distance methods for 12S and 16S differed in the placement of Mitsukurina: 12S recovered a Mitsukurina+Carcharias clade at the base of the Lamniformes (except when Modeltest settings were used, in which case this clade was placed at the base of the Lamnidae; Fig. 5.7); while 16S placed Mitsukurina either as the sister taxon to all other lamniforms or at the base of the AMOP clade. However, all of these clades collapsed upon bootstrapping. The combined tRNA gene dataset rarely recovered Odontaspis as monophyletic; never recovered Alopias as monophyletic; and was inconsistent with regard to the position of Mitsukurina. 12S and combined tRNA genes have been suggested as useful for resolving deep divergences (Chapter Four); but in the current study, only distance methods for 12S recovered Mitsukurina and Carcharias as basal sister taxa in Lamniformes, a position tentatively suggested by Compagno (1990b). Indeed, trees generated using 12S and standard distance methods were the most consistent with morphological evidence, as all genera were found to be monophyletic.

D-loop

Analysis of the D-loop showed strong support for Lamnidae, Lamna, and Isurus; Lamna+Isurus was recovered in all methods of analysis except ML. A Cetorhinus+Lamnidae clade, with Carcharias as the sister taxon to this clade, was also strongly supported. Although the monophyly of Odontaspis was frequently recovered, this clade was never supported by bootstrap. A monophyletic Alopias was never recovered. The majority of analyses placed Mitsukurina as the sister taxon to the AMOP clade, often with high bootstrap support.

Phylogenetic relationships in the order Lamniformes

Lamnidae

The analyses in this study consistently upheld a monophyletic Lamnidae, usually with strong bootstrap support (>79%) across most analyses and datasets. The monophyly of both Lamna and Isurus was also strongly supported. Analyses disagreed over whether Carcharodon is closer to Isurus or to Lamna, or is the sister taxon to an Isurus+Lamna clade.

Cetorhinus+Lamnidae

A sister taxon relationship between Cetorhinus and Lamnidae was strongly supported in large datasets, but results were variable in other analyses, and depended upon both the gene and the method employed. A Cetorhinus+Lamnidae clade is well-founded morphologically (Compagno, 1990b, 2001; Shirai, 1996; Chapter Two).

Carcharias and Odontaspis

Carcharias was often recovered as the sister taxon to a Cetorhinus+Lamnidae clade. This position for Carcharias has no precedent in morphological studies (see Discussion). Alternative topologies for Carcharias were recovered in a minority of analyses, but these were never well-supported (e.g., Carcharias as sister taxon to all other lamniforms; Carcharias+Cetorhinus as sister taxon to Lamnidae). There was no support for a monophyletic Odontaspididae (Carcharias+Odontaspis): this relationship was never recovered. The monophyly of the genus Odontaspis was largely dependent upon both the method and the dataset used, and was rarely supported by bootstrap.

Mitsukurina

The current study was unable to resolve the position of Mitsukurina. The majority of analyses, however, indicated that Mitsukurina is either the basal outgroup to all other lamniforms or the sister taxon to the AMOP clade. One alternative topology was Mitsukurina nested within AMOP; but this was rare and never well-supported (Table 5.5). Another topology, also rarely recovered and never well-supported, was a Mitsukurina+Carcharias clade. This relationship was found using distance criteria for three genes: 12S, ND4L, and CO2. Although the position of this clade was variable, it was usually at the base of all lamniforms (12S) or the sister taxon to AMOP (ND4L, CO2).

AMOP

An AMOP clade was frequently recovered in this study, although the bootstrap support for this clade was variable. The AMOP clade is a feature of previous molecular studies (Martin and Naylor, 1997; Naylor et al., 1997; Martin et al., 2002; Martin and Burg, 2002; Lopez et al., MS). Relationships within AMOP, as in previous molecular studies, were unresolved, with Alopias frequently recovered as polyphyletic. Mitochondrial analyses found no support for a Cetorhinus+Megachasma clade. A Megachasma+Pseudocarcharias clade was occasionally recovered, but this was usually not strongly supported by bootstrap.

Discussion

The results of this study indicate that using the entire mitochondrial genome of all extant lamniform taxa does not improve the resolution of relationships within this group. Indeed, results obtained from individual genes and multigene datasets (including all mitochondrial genes) continue to recover the same topologies as previous analyses based on smaller datasets and/or fewer taxa (Martin and Naylor, 1997; Naylor et al., 1997; Martin and Burg, 2002; Martin et al., 2002; Lopez et al., MS; Chapter Three). As in previous molecular studies, the five lamnid species were found to comprise a strongly supported clade (Lamnidae); however, relationships within Lamnidae remain unresolved. Also, the current study found frequent support for Cetorhinus as sister taxon to Lamnidae, and Carcharias as sister taxon to this Cetorhinus+Lamnidae clade. These topologies, although strongly supported for multigene datasets, were sensitive to the method and dataset used. As in previous studies, AMOP was frequently recovered, but this seven-species cluster could not be resolved, even with whole mitochondrial datasets. The position of Mitsukurina fluctuated, but almost always remained close to the base of the lamniform tree; however, expanded datasets did not cement its position at the base of the Lamniformes, a position suggested by Compagno (1990b). The Lamniformes include four extant genera that contain more than one species. Lamna and Isurus were each strongly supported as clades for almost all methods and datasets; but this did not hold for Odontaspis and Alopias. The three species of Alopias never formed a clade with bootstrap support above 65%, irrespective of the choice of dataset or method. Thus, expanded multigene datasets and use of all lamniform taxa did not overcome the problem encountered by previous analyses with smaller datasets and/or fewer taxa: failure to recover a monophyletic Alopias.
It has been suggested that maximum parsimony may converge on an incorrect phylogeny with high statistical support as more data are examined (Felsenstein, 1978; Huelsenbeck et al., 1996; Naylor and Brown, 1998). However, this was not apparent in the current study. For example, although a strongly supported monophyletic Alopias proved elusive, alternative arrangements that included a polyphyletic Alopias were not strongly supported by bootstrap analysis.

Reconciling mitochondrial and morphological data

The inability to resolve relationships within the Lamnidae mirrors the difficulty in resolving relationships within AMOP. The Odontaspididae invariably emerged as polyphyletic; but this family was not regarded as monophyletic by Compagno (1990b), since the characters shared by Carcharias and Odontaspis are probably symplesiomorphic. Having Carcharias on the line leading to lamnids has yet to be proposed by morphological studies, and no synapomorphies have been reported that unite Carcharias with Cetorhinus and lamnids. Thus, the current study does not advocate erecting a Carcharias-Cetorhinus-Lamnidae clade based on the results of the mitochondrial analyses. However, considering the plesiomorphic morphology of Carcharias, combined with the paucity (lack?) of autapomorphies in this taxon (Compagno, 1990b), there appears to be no strong evidence to preclude such a relationship at the current time. Also, teeth referred to Carcharias have been reported from the Early Cretaceous, long before the appearance of the first undisputed cetorhinids and lamnids (Paleocene; Chapter Three). There was no support for close affinities between Alopias and lamnids and/or Cetorhinus as proposed by some studies (Cappetta, 1987; Compagno, 1990b; Shirai, 1996; Shimada, 1999). The existence of an AMOP clade requires that the plesodic fin evolved at least twice in Lamniformes, as previously suggested (De Carvalho, 1996; Morrissey et al., 1997; contra Compagno, 1990b). There was no support for referring Megachasma to the Cetorhinidae, thus supporting the hypothesis that filter-feeding evolved independently in Megachasma and Cetorhinus (Taylor et al., 1983; Compagno, 1990b; Martin and Naylor, 1997; contra Maisey, 1985). Compagno's (1990b) view that the filtration apparatus of Megachasma evolved from an odontaspidid morphology is also consistent with the current molecular study.
The current study also finds no support for Megachasma as the basal lamniform taxon (contra Taylor et al., 1983; Morrissey et al., 1997). Based on morphology, the monophyly of Alopias is undisputed. However, a monophyletic Alopias was rarely recovered by any method of analysis, and when recovered, it was never strongly supported. The same can be said for an A. pelagicus + A. superciliosus clade, which also has strong morphological support; by contrast, an A. vulpinus + A. pelagicus clade was frequently recovered and strongly supported by the molecular analyses in the current study. The monophyly of the genus Alopias, and the validity of an A. pelagicus + A. superciliosus clade, should not be questioned in light of these results. Instead, the instability within AMOP, and the inability to recover a strongly supported monophyletic Alopias, suggest that the dataset requires further examination. Mitsukurina was recovered either at the base of the entire Lamniformes, or as sister taxon to AMOP. In either case, this is congruent with Compagno's (1990b) interpretation that Mitsukurina lies close to the base of the Lamniformes. Also, fossil evidence suggests that mitsukurinids evolved early in the Cretaceous, with forms very similar to Mitsukurina known from the Late Cretaceous (Santonian; Chapter Three). Compagno (1990b) also regarded Carcharias as the next outgroup in the Lamniformes, although allowing for the possibility that Mitsukurina and Carcharias may form a clade at the base of the Lamniformes. The latter hypothesis has little support in the current analyses.

Problems in phylogenetic analysis

The overall topology recovered for Lamniformes was strongly dependent on several factors: (1) dataset (i.e., genes or combination of genes); (2) manipulation of the dataset (e.g., nucleotides versus amino acids; transversion substitutions only; removal of sites with DFS); (3) optimality criteria chosen (MP; distance; ML). These results are not unique to lamniform sharks, as previous phylogenies that examined other taxa (Vertebrata; Tetrapoda; Teleostei) were also sensitive to these factors (Russo et al., 1996; Naylor and Brown, 1997; Zardoya and Meyer, 1998; Cao et al., 1998; Miya and Nishida, 2000). The majority of these studies suggest that these problems can be overcome by the analysis of more data (especially whole mitochondrial genomes) and/or adding more taxa. However, this approach was not successful for Lamniformes, as demonstrated by the current study. An alternative view is that phylogenetic reconstruction will improve only through an assessment of the evolutionary process at individual sites (e.g., Naylor and Brown, 1997; Pollock and Bruno, 2000). This "quality versus quantity" approach is worth investigating for lamniform phylogeny, as incorporating all lamniform taxa and all mitochondrial genes did not resolve relationships within this order. In the current study, results were also not improved by previously suggested methods, such as: using settings suggested by Modeltest (Posada and Crandall, 1998); combining data suggested by ILD (Farris et al., 1995); or only using transversion substitutions (Braun and Kimball, 2001). Also, simply deleting regions that deviate from stationarity did not provide a solution. When mitochondrial genes fail to produce a good tree, it has been suggested that nuclear genes should be examined (Springer et al., 2001); but RAG-1, a common nuclear gene for phylogenetic analysis, was also unable to resolve lamniform phylogeny (Lopez et al., MS).
As none of the aforementioned methods provided a simple solution, alternative methods are required. Therefore, for lamniforms as well as for other groups, examination of individual genes and their inherent properties may be more productive than simply compiling more expansive datasets. Certain phylogenetic studies have regarded a tree based on whole mitochondrial genomes as the 'true' tree, especially when the recovered topologies receive strong bootstrap support (e.g., Rasmussen and Arnason, 1999a,b; Miya et al., 2001, 2003). These studies argued that the resulting phylogenies could be used to dispute or overturn traditional morphology-based phylogenies, even those that have very strong support. One example is the teleost phylogeny of Miya et al. (2003), which included 100 complete mitochondrial genomes representing 75 families. This study broke up some teleost orders that were well-supported morphologically, and was used to advocate novel interordinal relationships that had no morphological support. Thus, Miya et al. (2003) regarded unorthodox relationships above the level of family as 'true'; since the constituent genera were recovered as monophyletic, this study did not question the validity of higher-level clades. For the current study, the inability to recover a strongly supported monophyletic Alopias using expanded mitochondrial datasets suggests that mitochondrial genomes cannot always be assumed to generate the 'true' tree. Further, as discussed above, more data and taxa do not resolve this problem. Nevertheless, expanding the number of taxa might provide one potential benefit: sampling more species per genus offers the potential of uncovering problems in the phylogeny. For example, an analysis of whole mitochondrial genome sequences in Lamniformes recovered a strongly supported A. pelagicus + A. vulpinus clade. Thus, if A. superciliosus were omitted from the analysis, the result would be a tree that contained a monophyletic Alopias with strong statistical support. As the O. ferox + O. noronhai clade was also recovered with strong bootstrap support, such a tree would be intuitively attractive, and could be promoted as the 'true' tree. Miya et al. (2003) included no more than two species per teleost genus; most genera were represented by only one species in the analysis. If Miya et al. (2003) had added more congeneric species to this analysis, the resulting phylogenies would perhaps split up well-supported genera, thus revealing that phylogenies based on entire mitochondrial genome sequences contain flaws. Continued sequencing of mitochondrial genomes from such taxa may hence destroy the illusion that phylogenies based on large datasets are always correct; perhaps then our efforts may be directed at more appropriate alternative approaches, such as understanding the fundamental dynamics of evolution at the DNA level.

Table 5.5. Summary of results of phylogenetic analysis using maximum parsimony for unweighted, aligned nucleotide sequences.

Dataset                    Sequence     Number of    Number of     Tree     Retention
                           length (bp)  constant     informative   length   index
                                        characters   characters
Whole genome               16672        9883         4830          19700    0.387
All protein-coding genes   11436        6452         3736          15645    0.365
All tRNAs                  1556         1080         259           949      0.434
12S rRNA                   936          684          115           510      0.433
16S rRNA                   1690         1224         281           1063     0.458
D-loop                     1054         443          417           1536     0.522
ATP6                       684          365          232           988      0.371
ATP8                       168          78           57            221      0.492
CYTB                       1146         624          397           1632     0.358
ND1                        978          550          325           1326     0.386
ND2                        1047         544          361           1495     0.415
ND3                        351          194          127           476      0.391
ND4                        1380         740          472           1970     0.386
ND4L                       297          153          111           432      0.411
ND5                        1830         992          625           2731     0.355
ND6                        522          258          185           789      0.360
CO1                        1557         995          437           1821     0.391
CO2                        690          452          182           674      0.384
CO3                        786          505          215           845      0.402
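Because the combined datasets consist of individual genes aligned end-to-end, the sequence lengths in Table 5.5 should be additive. The short check below, a sketch using only the lengths listed in the table, confirms that the 13 protein-coding genes sum to the combined protein-coding length and that adding the tRNA, rRNA and D-loop datasets gives the whole genome length. The variable names are illustrative.

```python
# Aligned lengths (bp) of the 13 protein-coding genes, from Table 5.5.
PROTEIN_CODING = {
    "ATP6": 684, "ATP8": 168, "CYTB": 1146, "ND1": 978, "ND2": 1047,
    "ND3": 351, "ND4": 1380, "ND4L": 297, "ND5": 1830, "ND6": 522,
    "CO1": 1557, "CO2": 690, "CO3": 786,
}

# Aligned lengths (bp) of the remaining datasets, from Table 5.5.
OTHER = {"All tRNAs": 1556, "12S rRNA": 936, "16S rRNA": 1690, "D-loop": 1054}

protein_coding_total = sum(PROTEIN_CODING.values())
whole_genome_total = protein_coding_total + sum(OTHER.values())

print(protein_coding_total)  # prints 11436
print(whole_genome_total)    # prints 16672
```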

Table 5.6. Results of phylogenetic analyses used to test relationships within the Lamniformes. Relationships that have strong morphological support (Compagno, 1990b) are underlined. AMOP = clade comprising Alopias, Megachasma, Odontaspis and Pseudocarcharias (genera and species in any topology). Analyses carried out were as follows: MP (unweighted); AA (MP, unweighted amino acid sequences); W3 (MP, without third codon position); TV (MP, transversion substitutions only); DFS (MP, minus DFS); JC (Jukes-Cantor); K2P (Kimura's two-parameter model); F81 (Felsenstein, 1981); HKY (Hasegawa, Kishino and Yano, 1985); GTR (general reversible model); MTP (neighbor joining with Modeltest settings); and ML (maximum likelihood, with Modeltest settings). Bootstrap analyses were carried out for all trees except ML. AA, W3 and DFS were limited to protein-coding genes; W3 for whole genome only applies to the third codon position in protein-coding genes. * denotes >79% bootstrap support.

Lamnidae clade   Lamna clade   Isurus clade   Isurus+Carcharodon   Lamna+Carcharodon   Lamna+Isurus

Whole genome   MP*, W3* MP*, W3* MP*, W3* ML TV MP, W3* TV*, DFS* TV*, DFS* TV*, DFS* DFS, JC JC*, K2P* JC*, K2P*, JC*, K2P* K2P*, F81 F81*, HKY* F81*, HKY* F81*, HKY* HKY* GTR*, GTR*, GTR*, GTR* MTP*, ML MTP* ML MTP*, ML MTP
Protein-coding genes   MP*, W3* MP*, AA* MP*, W3* AA, W3, TV MP, DFS, JC AA*, TV* W3*, TV* AA*, TV* MTP*, ML K2P, F81 DFS* JC* DFS*, JC* DFS*, JC* HKY, GTR K2P*, F81* K2P*, F81* K2P*, F81 HKY* HKY* HKY* GTR* GTR* GTR* MTP*, ML MTP*, ML MTP*, ML
ILD   MP*, W3* MP*, W3* MP, W3, TV MP, W3 DFS, JC AA*, TV* AA*, TV* DFS*, JC* TV*, MTP* F81, K2P DFS*, JC* DFS*, JC* F81*, K2P* ML HKY, GTR F81*, K2P* F81*, K2P* HKY*, HKY* HKY* GTR* GTR* GTR* MTP*, ML MTP*, ML MTP*, ML
tRNAs   MP*, TV MP*, TV MP*, JC* MP, JC, K2P JC*, K2P* JC*, K2P* K2P*, F81* F81, HKY F81*, HKY* F81*, HKY* HKY* GTR, MTP GTR* GTR* GTR* ML MTP*, ML MTP*, ML MTP*, ML
ATP6   MP*, AA MP*, AA TV, JC, K2P TV, JC, K2P W3, TV* W3, TV* F81, HKY F81, HKY DFS*, JC* DFS*, JC* GTR, ML GTR, ML K2P*, F81* K2P*, F81 HKY* HKY GTR* GTR* MTP*, ML MTP*, ML

Table 5.6. (continued.)

Lamnidae clade   Lamna clade   Isurus clade   Isurus+Carcharodon   Lamna+Carcharodon   Lamna+Isurus

ATP8   MP*, AA MP, AA, W3 JC, K2P, F81 MP, AA, W3 W3, TV TV, JC* HKY, GTR JC, K2P DFS*, JC* K2P, F81 F81, HKY K2P*, F81* HKY, GTR GTR, ML HKY* MTP, ML GTR MTP*, ML
CO1   MP*, AA MP*, W3 MP, AA MP, TV* AA, DFS W3*, TV* TV*, DFS* TV*, DFS ML JC*, K2P* DFS*, JC* JC*, K2P* JC*, K2P* F81*, HKY* K2P*, F81* F81*, HKY* F81*, HKY* GTR*, MTP HKY* GTR* GTR* GTR* MTP*, ML MTP*, ML MTP*, ML
CO2   MP*, TV, MP*, AA JC, K2P, F81 W3 JC, K2P, F81 DFS*, JC* W3*, TV HKY, GTR HKY, GTR K2P*, F81* DFS*, JC HKY* K2P*, F81 GTR* HKY*, MTP*, ML GTR* MTP*, ML
CO3   MP*, AA MP*, AA MP, AA, W3 MP*, AA W3*, TV* W3*, TV TV, DFS W3, TV DFS*, JC* DFS*, JC* JC*, K2P* DFS, JC* K2P*, F81* K2P*, F81* F81*, HKY* K2P*, F81 HKY* HKY* GTR*, HKY* GTR* GTR*, ML MTP*, ML GTR* MTP*, ML
CYTB   MP*, W3 MP*, W3 MP, TV* MP, W3, TV DFS AA, TV* TV*, DFS DFS, JC* JC*, K2P* DFS*, JC* JC*, K2P* K2P*, F81* F81*, HKY* K2P*, F81* F81*, HKY* HKY* GTR* HKY* GTR* GTR* MTP*, ML GTR* MTP*, ML MTP*, ML MTP*, ML
ND1   MP*, AA* MP*, AA* MP, AA, W3 MP*, W3 AA W3*, TV* W3, TV* DFS, JC DFS, JC* DFS, JC* DFS*, JC* K2P, F81 K2P, F81 K2P*, F81* K2P*, F81* HKY, GTR HKY, GTR HKY* HKY* MTP MTP* GTR* GTR MTP*, ML MTP*, ML
ND2   MP*, AA MP*, AA* MP*, AA* MTP, ML MP, AA, W3 W3*, TV* W3*, TV* W3*, DFS* DFS, JC DFS*, JC* DFS*, JC* JC*, K2P* K2P, F81 K2P*, F81* K2P*, F81* F81*, HKY* HKY, GTR HKY* HKY* GTR GTR* GTR* MTP*, ML MTP*, ML MTP*, ML

Table 5.6. (continued.)

Lamnidae clade   Lamna clade   Isurus clade   Isurus+Carcharodon   Lamna+Carcharodon   Lamna+Isurus

ND3   MP*, W3 MP, W3, TV MP, DFS ML MP, DFS TV, DFS DFS, JC* JC*, K2P* JC, K2P, F81 JC*, K2P* K2P*, F81* F81*, HKY* HKY, GTR F81*, HKY* HKY* GTR* MTP GTR* GTR* MTP*, ML MTP*, ML MTP, ML
ND4   MP*, AA MP*, AA* MP*, TV* MP, DFS TV + W3, TV* W3*, TV* DFS*, JC* JC*, K2P* DFS*, JC* DFS*, JC* K2P*, F81* F81*, HKY* K2P*, F81* K2P*, F81* HKY*, GTR*, HKY* HKY* GTR* MTP*, ML GTR* GTR* MTP*, ML MTP*, ML MTP*, ML
ND4L   MP, AA, W3 MP, DFS MP*, AA MP, JC, K2P W3, DFS TV, DFS, JC JC*, K2P* W3, DFS* F81, GTR HKY, MTP K2P, F81 F81*, HKY* JC*, K2P* HKY, GTR GTR, MTP* F81*, HKY* MTP, ML GTR MTP*, ML
ND5   W3, TV MP*, AA MP*, AA TV, DFS AA*, JC DFS, JC* TV*, DFS* W3*, TV* ML K2P, F81 K2P*, F81* JC*, K2P* DFS, JC* HKY, GTR HKY, GTR* F81*, HKY* K2P*, F81* MTP MTP*, ML GTR* HKY* MTP*, ML GTR* MTP*, ML
ND6   MP*, AA MP*, AA* MP*, AA MP, AA W3 W3, TV* W3, TV* W3*, DFS DFS, JC DFS*, JC* DFS, JC* JC*, K2P* K2P, F81 K2P*, F81* K2P*, F81* F81*, HKY* HKY, GTR HKY* HKY* GTR*, MTP MTP, ML GTR* GTR* ML MTP*, ML MTP*, ML
12S   MP*, TV MP*, TV MP, JC* MTP, ML JC, K2P JC*, K2P* JC*, K2P* K2P*, F81* F81, HKY F81*, HKY* F81*, HKY* HKY* GTR GTR* GTR* GTR MTP*, ML MTP*, ML MTP*, ML
16S   MP*, TV* MP*, TV* MP*, JC* MP, JC, F81 JC*, K2P* JC*, K2P* K2P*, F81* HKY, GTR F81*, HKY* F81*, HKY* HKY* MTP, ML GTR* GTR* GTR MTP*, ML MTP*, ML MTP*, ML
D-loop   MP*, TV* MP*, TV* MP*, TV ML MP, TV, JC*, K2P* JC*, K2P* JC*, K2P* JC*, K2P* F81*, HKY* F81*, HKY* F81*, HKY* F81*, HKY GTR* GTR* GTR*, MTP GTR*, MTP MTP*, ML MTP*, ML ML

Table 5.6. (continued.)

Cetorhinus + Lamnidae   Carcharias + (Cetorhinus + Lamnidae)   Cetorhinus + Carcharias   AMOP   Mitsukurina + AMOP   Mitsukurina basal lamniform

Whole genome   MP*, W3* MP*, W3* MP*, W3* JC*, K2P* MP*, W3* TV*, DFS* TV*, DFS* TV*, DFS* F81*, HKY* TV*, DFS JC*, K2P* JC*, K2P* JC*, K2P* GTR* MTP, ML F81*, HKY* F81*, HKY* F81*, HKY* GTR*, MTP* GTR*, MTP* GTR*, MTP* ML ML ML
Protein-coding genes   MP*, W3 MP*, W3* MP*, W3 AA, JC* MP*, W3 AA*, TV* TV*, DFS* TV*, DFS* K2P*, F81* TV*, DFS DFS, JC* JC*, K2P* JC*, K2P* HKY*, GTR* MTP*, ML K2P*, F81* F81*, HKY* F81*, HKY* HKY*, GTR* GTR*, MTP* GTR*, MTP MTP*, ML ML + ML
ILD   MP*, AA, TV MP, AA, TV MP, TV JC*, F81* MP, W3, TV DFS, JC DFS, MTP* DFS*, JC* K2P*, HKY* DFS, MTP F81*, K2P ML F81*, K2P* GTR* ML HKY, GTR HKY*, GTR* MTP*, ML MTP, ML
tRNAs   JC, K2P JC, K2P, F81 MP, ML MP, TV, JC* TV, MTP MP, JC, F81 F81, HKY HKY, GTR K2P*, F81* ML K2P, HKY GTR, MTP MTP HKY*, GTR GTR* MTP, ML
ATP6   DFS, K2P TV, MTP F81, HKY ML GTR, MTP
ATP8   TV, MTP MP, MTP
CO1   MP, AA W3, DFS MP, DFS W3, DFS MTP JC*, K2P JC*, K2P F81*, HKY F81, HKY GTR*, ML GTR, MTP ML
CO2   JC, K2P JC, K2P, F81 F81, HKY HKY, GTR MTP
CO3   TV, MTP* TV, ML ML
CYTB   MP, W3 W3, TV*, JC DFS TV, ML MP, DFS TV*, DFS TV*, JC K2P, F81 JC*, K2P* ML K2P, F81 HKY, GTR F81, HKY* HKY, GTR MTP, ML GTR*, MTP MTP, ML
ND1   MP, AA, W3 MP, TV MP, TV, JC MP*, W3 TV, DFS, JC DFS, JC K2P, F81 TV, DFS K2P, F81 K2P, F81 HKY, GTR JC, K2P HKY, GTR HKY, GTR ML F81, HKY MTP*, ML MTP, ML GTR, MTP ML

Table 5.6.(continued. ) Cetorhinus Carcharias + Cetorhinus+ AMOP Mitsukurina Mitsukurina + Lamnidae (Cetorhinus Carcharias + AMOP basal + Lamnidae) lamniform ND2 MP, W3 ML W3, TV, JC AA TV DFS, JC K2P, F81 K2P, F81 HKY, GTR HKY, GTR MTP MTP ND3 TV, JC, K2P JC, K2P, F81 MP, TV F81, HKY HKY, ML DFS, JC MTP, ML K2P, F81 HKY, GTR ML ND4 AA, W3, TV W3, TV* MP*, TV AA, W3 TV, MTP DFS, JC DFS, JC DFS, JC* JC, K2P ML K2P*, F81* K2P, F81 K2P*, F81* F8 1, HKY HKY, GTR HKY, GTR HKY* GTR* MTP*, ML MTP, ML GTR MTP, ML ND4L AA, ML ML ND5 W3, AA, JC JC, K2P, F81 MP, AA MP, W3, TV K2P, F81 HKY, GTR T V *, DFS MTP, ML HKY, GTR JC *, K2P F81*, HKY GTR, MTP ML ND6 W 3 MP, DFS, JC ML ML MP, AA, W 3 K2P, F81 MTP HKY, GTR MTP 12S JC, K2P, F81 TV, ML JC, K2P HKY, GTR F81, HKY GTR 16S MTP, ML MTP, ML MP, TV, JC MP*, JC* JC, K2P MP, TV, ML K2P, F81 K2P*, F81 * F81, HKY HKY, GTR HKY* GTR, MTP GTR* MTP, ML D-loop MP*, TV MP*, TV* MP*, TV MP*, JC* TV JC*, K2P* JC*, K2P* JC*, K2P* K2P*, F81 F81*, HKY * F81 *, HKY * F81 *, HKY * HKY GTR* GTR* GTR*, MTP GTR* MTP*, ML MTP*, ML ML MTP, ML 136

Table 5.6. (continued. ) Odontaspis Pseudo- Alopias A vulpinus+ A. supercil. + A vulpinus+ ,_cl de carcharias+ =cl A.pelagicus A.pela~icus A.supercil. Megachas. Whole MP*, W3 MP*, W3 JC, F81 MP*, W3* genome DFS *, JC* DFS *, K2P TV *, DFS K2P*, F81* HKY, GTR JC, K2P HKY * MTP, ML F81, HKY GTR* GTR, MTP* MTP*, ML ML Protein MP, JC * MP*, DFS* W 3, JC, K2P MP*, W 3 -coding K2P*, F81* F81, HKY AA, TV* genes HKY *, GTR DFS *, JC GTR* K2P, F81 MTP, ML HKY, GTR MTP, ML ILD MP, W3, JC MP, MTP MP, AA, TV F81, K2P JC *, F81 HKY, GTR K2P*, HKY MTP GTR*, ML tRNAs TV, ML MP, JC, K2P F81, GTR MTP, ML ATP6 W3 W3, DFS, JC K2P, F81 HKY, GTR MTP ATP8 JC, K2P JC, K2P, F81 F81, HKY HKY, GTR GTR CO1 AA, W3, JC AA JC, K2P K2P, F81 F81, HKY HKY, GTR GTR, MTP MTP CO2 W3 AA, JC, K2P F81, HKY ML CO3 JC, K2P W3 HKY, GTR CYTB DFS, MTP MP, W3 MP, AA, JC TV MTP, ML K2P, F81 HKY, GTR MTP, ML ND1 TV, ML TV, MTP ML ND2 MP, DFS MP, DFS, JC TV, JC, K2P JC*, K2P* K2P, F81 F81, HKY F81 *, HKY* HKY, GTR GTR, ML GTR, MTP MTP ML 137

Table 5.6. (continued.) Odontaspis Pseudo- Alopias A vulpinus+ A.supercil.+ A vulpinus+ cla_de_ carcharias+ Glade A.pelagicus A.pela~icus A.supercil. Megachas. ND3 DFS ND4 JC, K2P MP, DFS JC, K2P AA, TV, JC F81, HKY F81, HKY K2P, F81 GTR, MTP GTR HKY, GTR _ MTP, ML ND4L MP, DFS, JC DFS, JC TV K2P, F81 K2P, F81 HKY, GTR HKY, GTR MTP ND5 DFS, JC* MP, TV, ML MTP, ML TV, ML MP, JC* K2P*, F81* K2P*, F81 HKY, GTR* HKY, GTR* MTP, ML _ MTP ND6 JC, K2P, F81 MP, JC, K2P HKY, GTR F81, HKY GTR 12S MP, JC, K2P MP JC, K2P, F81 MP, JC, K2P F81, HKY HKY, GTR F81, HKY GTR, ML GTR, ML 16S MP, JC, K2P MP, TV, JC* MP*, JC F81, HKY K2P*, F81* K2P, F81 GTR, MTP HKY* HKY, GTR ML GTR* MTP, ML MTP*, ML D-loop MP, JC, K2P MP*, TV F81, HKY JC* K2P GTR, MTP F81*, HKY* ML GTR, MTP ML 138

[Rotated table: cell values not recoverable from the extracted text. The surviving legend fragments name A. pelagicus, A. superciliosus, M. pelagios, P. kamoharai and C. maximus.]

Table 5.8. Sequence divergence between individual shark taxa for combined protein coding genes: (a) nucleotides only (plain text); (b) amino acid sequence (italics), expressed as percentage (%). [Rotated table: cell values not recoverable from the extracted text.]

[Tree figure: branching pattern not recoverable from the extracted text. Taxon labels in extracted order: Mustelus manazo, Heterodontus francisci, Mitsukurina owstoni, Megachasma pelagios, Pseudocarcharias kamoharai, Alopias pelagicus, Alopias vulpinus, Odontaspis noronhai, Odontaspis ferox, Alopias superciliosus, Carcharodon carcharias, Lamna nasus, Lamna ditropis, Isurus paucus, Isurus oxyrinchus, Cetorhinus maximus, Carcharias taurus.]

Fig. 5.1. Phylogeny based on maximum parsimony analysis using unweighted, aligned nucleotide sequences of the whole genome dataset.

[Tree figure: branching pattern not recoverable from the extracted text. Taxon labels in extracted order: Mustelus manazo, Heterodontus francisci, Mitsukurina owstoni, Megachasma pelagios, Alopias vulpinus, Alopias pelagicus, Pseudocarcharias kamoharai, Odontaspis noronhai, Odontaspis ferox, Alopias superciliosus, Carcharodon carcharias, Lamna nasus, Lamna ditropis, Isurus paucus, Isurus oxyrinchus, Cetorhinus maximus, Carcharias taurus.]

Fig. 5.2. Phylogeny based on maximum parsimony analysis using transversion substitutions only for the whole genome dataset.

[Tree figure: branching pattern not recoverable from the extracted text. Taxon labels in extracted order: Mustelus manazo, Heterodontus francisci, Mitsukurina owstoni, Megachasma pelagios, Pseudocarcharias kamoharai, Alopias vulpinus, Alopias pelagicus, Odontaspis ferox, Odontaspis noronhai, Alopias superciliosus, Carcharodon carcharias, Isurus paucus, Isurus oxyrinchus, Lamna nasus, Lamna ditropis, Cetorhinus maximus, Carcharias taurus.]

Fig. 5.3. Phylogeny based on maximum parsimony analysis using aligned nucleotide sequences without the third codon position for the whole genome dataset.

[Tree figure: branching pattern and branch lengths not recoverable from the extracted text. Taxon labels in extracted order: Mustelus manazo, Heterodontus francisci, Mitsukurina owstoni, Megachasma pelagios, Alopias vulpinus, Alopias pelagicus, Alopias superciliosus, Odontaspis ferox, Odontaspis noronhai, Pseudocarcharias kamoharai, Carcharodon carcharias, Isurus paucus, Isurus oxyrinchus, Lamna nasus, Lamna ditropis, Cetorhinus maximus, Carcharias taurus.]

Fig. 5.4. Phylogeny based on a Neighbor Joining analysis using the Jukes Cantor model of sequence evolution for aligned nucleotide sequences of the whole genome dataset.

[Tree figure: branching pattern and branch lengths not recoverable from the extracted text. Scale bar: 0.05 substitutions/site. Taxon labels in extracted order: Mustelus manazo, Heterodontus francisci, Mitsukurina owstoni, Megachasma pelagios, Pseudocarcharias kamoharai, Odontaspis ferox, Odontaspis noronhai, Alopias vulpinus, Alopias pelagicus, Alopias superciliosus, Carcharodon carcharias, Isurus paucus, Isurus oxyrinchus, Lamna nasus, Lamna ditropis, Cetorhinus maximus, Carcharias taurus.]

Fig. 5.5. Phylogeny based on a maximum likelihood analysis using Modeltest settings (Tamura Nei with optimized parameter values) for aligned nucleotide sequences of the whole genome dataset.

[Tree figure: branching pattern not recoverable from the extracted text. Taxon labels in extracted order: Mustelus manazo, Heterodontus francisci, Mitsukurina owstoni, Megachasma pelagios, Odontaspis ferox, Odontaspis noronhai, Pseudocarcharias kamoharai, Alopias superciliosus, Alopias vulpinus, Alopias pelagicus, Carcharodon carcharias, Isurus paucus, Isurus oxyrinchus, Lamna nasus, Lamna ditropis, Cetorhinus maximus, Carcharias taurus.]

Fig. 5.6. Phylogeny based on maximum parsimony analysis using aligned amino acid sequences for all 13 protein-coding genes combined.

[Tree figure: branching pattern and branch lengths not recoverable from the extracted text. Scale bar: 0.01 substitutions/site. Taxon labels in extracted order: Mustelus manazo, Heterodontus francisci, Mitsukurina owstoni, Carcharias taurus, Megachasma pelagios, Alopias vulpinus, Alopias pelagicus, Alopias superciliosus, Odontaspis ferox, Odontaspis noronhai, Pseudocarcharias kamoharai, Carcharodon carcharias, Isurus paucus, Isurus oxyrinchus, Lamna nasus, Lamna ditropis, Cetorhinus maximus.]

Fig. 5.7. Phylogeny based on a Neighbor Joining analysis using the Jukes Cantor model of sequence evolution for aligned nucleotide sequences of 12S rRNA.

CHAPTER SIX: A TANDEM DUPLICATION IN THE MITOCHONDRIAL GENOME OF THE GOBLIN SHARK MITSUKURINA OWSTONI

Abstract

The mitochondrial genome of the goblin shark Mitsukurina owstoni (Lamniformes: Mitsukurinidae) has an insert 1058bp in length that is composed of tandem repeats of tRNA-threonine and tRNA-proline. Each tandem repeat is separated by a 37bp region that shows sequence similarity to the 5' region of the D-loop. A similar but smaller mitochondrial insert has been reported for the reptile Bipes biporus, which has a degenerate origin of light-strand replication (OL). In M. owstoni, by contrast, the OL appears to form a normal stem-and-loop structure, and a substitution outside the stem-and-loop structure might be responsible for disrupting OL and causing tandem duplication. Thus, different errors in the OL may be responsible for similar tandem duplications.

Introduction

Mitochondrial genomes can increase in size due to tandem duplications of parts of the genome (e.g., Moritz and Brown, 1986, 1987; Berg et al., 1995; Macey et al., 1998; Pereira, 2000). The goblin shark M. owstoni has a duplicated region of over 1kb in length, located between the genes for cytochrome b and the D-loop. This insert is made up mostly of tandem repeats of tRNA-Thr and tRNA-Pro gene copies. This is only the second record of this type of repeat in a vertebrate; the first report was in the reptile Bipes biporus (Squamata: Amphisbaenidae), which shows a much smaller tRNA-Thr/tRNA-Pro tandem duplication (Macey et al., 1998). In B. biporus, this duplication is associated with an abnormal OL (Macey et al., 1997). Changes in OL and the resulting duplication of genes are regarded as a prelude to changes in gene order, as a result of the subsequent deletion of redundant gene copies (Macey et al., 1997a,b, 1998; Pereira, 2000; Inoue et al., 2001; Townsend and Larson, 2002; Shao and Barker, 2003; Yamauchi et al., 2003). This chapter describes the insert in M. owstoni, and compares it to the tRNA-Thr/tRNA-Pro duplication in B. biporus.

Methods and Materials

The tandem duplication in M. owstoni was found as a result of sequencing the entire mitochondrial genome of this species, and hence the methods follow those outlined in Chapter Five. Three additional M. owstoni specimens were examined for the presence of this insert. DNA from these tissue samples was isolated by phenol/chloroform extraction. The PCR primers used to amplify the tandem duplication in these samples anneal in CYTB and the D-loop, and were designed from the sequence of the first M. owstoni specimen. These primer sequences were: 5'-CGGACAAGTCGCATCCA-3' and 5'-TGTACTATATTAGGATATGTGGGC-3'.
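As a quick sanity check on the primer pair given above, the following Python sketch reports the length, GC content, and a rough melting temperature of each primer. Only the two primer sequences come from the text. The labels primer_1 and primer_2, the helper names, and the use of the Wallace rule (2 °C per A or T, 4 °C per G or C), which is only a rule of thumb for short oligonucleotides, are assumptions made for illustration and are not part of the original methods.

```python
# Primer sequences are copied from the text; the labels and helper
# functions below are hypothetical and exist only for illustration.
PRIMERS = {
    "primer_1": "CGGACAAGTCGCATCCA",
    "primer_2": "TGTACTATATTAGGATATGTGGGC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return seq.count("G") + seq.count("C")


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in degrees C (Wallace rule)."""
    at = seq.count("A") + seq.count("T")
    return 2 * at + 4 * gc_count(seq)


def summarize(seq):
    """Collect the basic descriptive statistics for one primer."""
    return {
        "length": len(seq),
        "gc_count": gc_count(seq),
        "wallace_tm": wallace_tm(seq),
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        stats = summarize(seq)
        gc_percent = 100.0 * stats["gc_count"] / stats["length"]
        print(name, stats, f"GC = {gc_percent:.1f}%")
```

The Wallace estimate ignores salt and primer concentration, so it should be read only as a coarse comparison between the two primers, not as the annealing temperature used in this study.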

Results

In M. owstoni, the sequence between cytochrome b and the D-loop is composed of 14 tandem repeats: seven sequences are homologous to tRNA-Thr, and seven sequences are homologous to tRNA-Pro (Fig. 6.1). The first tRNA-Thr and last tRNA-Pro genes in this series show the closest homology to the tRNA-Thr and tRNA-Pro genes in the mitochondrial genomes of other lamniform species. Thus, the other copies are presumably non-functional (pseudogenes). A 37bp sequence (ARCGCTATTAAAAYATAGCCCTAAAGAAAAATAACTA) is present between each tRNA-Thr-tRNA-Pro pair, and shares 70% homology with the very beginning of the D-loop of M. owstoni. The tandem duplication comprises 1198bp, including the two putative functional tRNA genes. The presumed redundant portion of this sequence is 1058bp in the first specimen tested. This is 6.0% of the total mitochondrial genome size. No changes in gene arrangement were observed in the M. owstoni mitochondrial genome.

Three additional M. owstoni specimens were found to possess a mitochondrial insert in the same region. These inserts varied in size, but the determination of the exact size and identity of these inserts will require sequencing.

The mitochondrial genomes of certain other lamniform species have smaller inserts in the region close to or within the D-loop. Two species have an expanded region between tRNA-Thr and tRNA-Pro: A. superciliosus (36bp) and C. taurus (35bp). These show some homology (and the same orientation) to a partial tRNA-Thr gene, and may represent previous duplications that have been truncated. C. carcharias has a duplicated region inside the D-loop that results from an insert about 40bp long.
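The 37bp intervening sequence is written with the IUPAC ambiguity codes R (A or G) and Y (C or T), so locating its copies in a sequence requires expanding those codes. The sketch below shows one way this could be done in Python with a regular expression. The function names are hypothetical, and the test string is synthetic (two spacer variants joined by filler bases); it is not the M. owstoni sequence, and the sketch is not the procedure used in this study.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# The 37 bp intervening sequence as written in the Results.
SPACER = "ARCGCTATTAAAAYATAGCCCTAAAGAAAAATAACTA"


def iupac_to_regex(motif):
    """Compile an IUPAC-coded motif into a regular expression."""
    return re.compile("".join(IUPAC_REGEX[base] for base in motif.upper()))


def find_motif(sequence, motif):
    """Return 0-based start positions of non-overlapping motif matches."""
    pattern = iupac_to_regex(motif)
    return [match.start() for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Synthetic example only: two concrete variants of the spacer.
    variant_1 = SPACER.replace("R", "G").replace("Y", "C")
    variant_2 = SPACER.replace("R", "A").replace("Y", "T")
    toy_sequence = "TTT" + variant_1 + "CCCC" + variant_2 + "GG"
    print(find_motif(toy_sequence, SPACER))
```

Applied to a real sequence, the number of positions returned would be the number of spacer copies that match the consensus exactly; copies with additional substitutions would be missed, so an approximate-matching approach may be preferable in practice.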

Discussion

The insert in B. biporus is 318bp long, much shorter than that of M. owstoni. Like M. owstoni, the tandem duplication of B. biporus is between CYTB and the D-loop and is composed of (in order): functional tRNA-Thr; non-functional tRNA-Pro; non-functional tRNA-Thr; and functional tRNA-Pro (Macey et al., 1998). For the B. biporus mitochondrial genome, the extra copies of the tRNA genes were inferred to be pseudogenes based on the secondary structure of the tRNAs, which is inconsistent with function (Macey et al., 1998). B. biporus has short, directly repeated sequences between the tRNA homologs, of two types: the first ("R1") is represented by CCAGG or ACAGG; the second ("R2") is represented by CTCTGC, CACTGC, or CTTGC (Macey et al., 1998). These sequences show homology to D-loop sequences (Macey et al., 1998). The tandem duplication of M. owstoni is much longer than that of B. biporus: there are six extra copies of tRNA-Thr and tRNA-Pro in M. owstoni, compared to only one extra copy of each in B. biporus; M. owstoni also has longer intervening sequences.

G W C E N K I L N L N
15451 GCTGATGTGAAAATAAAATCCTCAACCTAAACTAATCCTGGTAGCTTAAC
15501 TTAAAAGCGTCGGCCTTGTAAGCCGGAGACTGGAGATTTAATTCTCCCTA
15551 AGATACATTAGG[illegible]GGGGTTAAACTCTTTCCCTTGGCCCCAAGGCCA
15601 GGGCACCCTCCGAGTCCGCCCCCTAAGCGCTATTAAAACATAGCCCTAAA
15651 GAAAAATAACTAATCCTGGTAGCTTTACTTAAAAGCGTCGGCCTTATAGG
15701 CTGGAAACTGGGAATTTTAATTCCCCTAAATACATTAGGAAAAGAAGGAT
15751 TAAACTCTTTCCCTTGACCCCAAGGCTGGGGCACCCTCCGAGCCGCCCCC
15801 TAAACGCTATTAAAATATAGCCCTAAAGAAAAATAACTAACCCTGGTAGC
15851 TTTACTTAAAAGCGTCGGCCTTATAGGCCGGAAACTGGGAATTTTTATTC
15901 CCCTAAATACATTAGGAAAAGAAGGGTTAAACTCTTTCCCTTGGCCCCAA
15951 GGCTGGGACACCCTCCGAGCCGCCCCCTGAACGCTATTAAAACATAGCCC
16001 TAAAGAAAAATAACTAATCCTGGTAGCTTTACTTAAAAGCGCCGGCCTTA
16051 TAGGCTGGAAACTGGGAATTTTAATTCCCCTAAATACATTAGGAAAAGAA
16101 GGATTAAACTCTTTCCCTTGACCCCAAGGCTGGGACACCCTCCGAGCCGC
16151 CCCCTAAACGCTATTAAAATATAGCCCTAAAGAAAAATAACTAACCCTGG
16201 TAGCTTTACTTAAAAGCGTCGGCCTTATAGGCCGAAAACTGGGAATTTTT
16251 ATTCCCCTAAATACATTAAGAAAAGAAGGGTTAAACTCTTTCCCTTGGCC
16301 CCAAGGCCAGGGCACCCTCCGAGTCCGCCCCCTAAGCGCTATTAAAACAT
16351 AGCCCTAAAGAAAAATAACTAATCCTGGTAGCTTTACTTAAAAGCGTCGG
16401 CCTTATAGGCCGGAAACTGGGAATTTTTATTCCCCTAAATACATTAGGAA
16451 AAGAAGGGTTAAACTCTTTCCCTTGGCCCCAAGGCTGGGACACCCTCCGA
16501 GCCGCCCCCTGAACGCTATTAAAACATAGCCCTAAAGAAAAATAACTAAT
16551 CCTGGTAGCTTTACTTAAAAGCGCCGGCCTTATAGGCTGGAGACTTAATT
16601 CTCCCCTAGATATATCAGGGGAAGGAGGGTTAAACTCCCGCCTTTGGCCC
16651 CCAAAGCCAAGATTCTGCCCAAACTGCCCCCTGGACACTATT[illegible]AAATAT
16701 G[illegible]AACCTAAAGAAAATTTTTTAC[illegible]AAAGTTAGTCAGATTAACATATTA

Fig. 6.1.a. Gene sequence for the tandem duplication in M. owstoni, positioned between CYTB (end of gene shown with letters for amino acids above the sequence; * = stop codon) and the D-loop (in italics). Homologous sequences to tRNA-Thr are shown in bold, and homologous sequences to tRNA-Pro are underlined.

Fig. 6.1.b. Tandem duplication of tRNA-Thr (black arrows) and tRNA-Pro (pale arrows) homology in M. owstoni, shown 5' → 3'. The putative functional tRNA-Thr and tRNA-Pro are labeled. Intervening regions are in gray, and show homology to the 5' region of the D-loop.

In B. biporus, the sequence between tRNA-Asn and tRNA-Cys, where the OL is usually situated in vertebrates, is abnormal and presumably nonfunctional (Macey et al., 1998). The tRNA-Cys gene has been proposed to serve as a replication origin for light strand synthesis in mitochondria (Clayton, 1982), since it forms part of the OL stem region (Macey et al., 1997a). Changes in the secondary structure of tRNA-Cys may remove OL function, which, in turn, may lead to shifts in gene order (Macey et al., 1997a, 1998). In both M. owstoni and B. biporus, part of the tRNA-Cys gene is in the stem of the OL stem-and-loop structure.

A sequence within the tRNA-Cys gene directly next to the stem is also believed to be necessary for mitochondrial replication (Hixson et al., 1986). This sequence is 3'-GBCCB-5' in most vertebrates (Macey et al., 2000). M. owstoni differs in having GGTCC (Fig. 6.2). All other lamniform species have GGCCC, as do Mustelus manazo (Carcharhiniformes) and Heterodontus francisci (Heterodontiformes). No tandem duplications are known in the mitochondrial genomes of these species. Other vertebrate species that show a normal OL between tRNA-Asn and tRNA-Cys also have GBCCB (Macey et al., 1997a,b). Exceptions include certain marsupials, in which the sequence here is AACCG or AGTCA (Paabo et al., 1991; Janke et al., 1994). However, in the mitochondrial genomes of marsupials the OL is located between tRNA-Trp and tRNA-Asn, and tRNA-Cys no longer forms part of the OL stem (Paabo et al., 1991; Janke et al., 1994). Therefore, it may be the deviation from the consensus sequence in M. owstoni that is responsible for disrupting replication in the mitochondrial genome and generating tandem repeats. However, the precise function of this sequence is unknown.

As discussed above, the only other vertebrate species in which a tRNA-Thr/tRNA-Pro tandem duplication has been recorded is B. biporus, which has a smaller duplicated region than documented here for M. owstoni. The OL of B. biporus has a degenerate stem-and-loop structure that lacks the GCC sequence that is regarded as the starting point for light-strand elongation (Macey et al., 1997b; Fig. 6.2). Thus, the mutation that may have caused the tRNA-Thr/tRNA-Pro tandem duplication in B. biporus appears to occur in a different part of the OL to that of M. owstoni, which contains the GCC sequence in the stem of the OL stem-and-loop structure (Fig. 6.2).

In certain lizards, it is profound changes to the secondary structure of tRNA-Cys that are believed responsible for disrupting the function of the OL (Macey et al., 1997a; Townsend and Larson, 2002). This is not the case for M. owstoni, which appears to have a normal secondary structure for tRNA-Cys. The mitochondrial genomes of these lizards lack gene duplications, but they show gene rearrangements that are thought to be the result of tandem duplications followed by selective elimination of extra gene copies (Macey et al., 1997a, 1998; Townsend and Larson, 2002).

Preliminary analysis indicates that the exact size of the tandem duplication varies between individuals of M. owstoni. Some lizards (Cnemidophorus and Heteronotia spp.) have duplicated regions in their mitochondrial genomes that range from 0.8 to 8.0 kb in length (Moritz and Brown, 1987). These duplications are ephemeral in these lizards: they are of low frequency, and vary from one generation to the next. It is not known if this is the case in M. owstoni.
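The comparison of each pentamer against the GBCCB consensus can be made explicit with a short Python sketch. In IUPAC notation B stands for C, G or T (any base except A). The function name and the 1-based position reporting are assumptions made for illustration; the pentamers are those quoted in the text and are compared in the orientation in which the text writes them.

```python
# IUPAC codes mapped to the set of bases each code allows.
IUPAC_SETS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

CONSENSUS = "GBCCB"


def mismatch_positions(seq, consensus=CONSENSUS):
    """Return 1-based positions where seq departs from the consensus."""
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must have equal length")
    return [
        index + 1
        for index, (base, code) in enumerate(zip(seq.upper(), consensus))
        if base not in IUPAC_SETS[code]
    ]


if __name__ == "__main__":
    # Pentamers quoted in the Discussion.
    for label, pentamer in [
        ("other lamniforms", "GGCCC"),
        ("M. owstoni", "GGTCC"),
        ("marsupial variant 1", "AACCG"),
        ("marsupial variant 2", "AGTCA"),
    ]:
        print(label, pentamer, mismatch_positions(pentamer))
```

Under these assumptions, GGCCC satisfies the consensus, while the M. owstoni pentamer GGTCC departs from it only at the third base, in agreement with the C to T substitution described for Fig. 6.2.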

[Stem-and-loop diagrams flattened by extraction; bases and annotations are given in extracted order.]

(a) C T T T T G T T T T C T G-C G-C C-G ← position of nascent light-strand elongation G-C G-C G-C A-T G-C A-T A-T sequence necessary for replication (consensus GBCCB) A-T ↓ 3'-A T T T G-C G G G G T C C T C C-5'

(b) T C T T T A C G C G T G-C A-T G-C A-T A-T sequence necessary for replication (consensus GBCCB) G-C ↓ 3'-C A G G C-G A G A C C G G G C G-5'

Fig. 6.2. The stem-and-loop structure formed by the origin of light-strand replication (OL) in the mitochondrial genome of (a) M. owstoni and (b) B. biporus. The GCC sequence (in bold), regarded as the point of light-strand elongation, is present in M. owstoni, but not in B. biporus. In M. owstoni, the GBCCB sequence (underlined) that is thought to be necessary for replication has a substitution at the third base position (C→T). (b) is after Macey et al. (1997b).

GENERAL CONCLUSIONS

The order Lamniformes contains 15 living species of sharks organized into seven families. An overview of the biology, systematics, morphology and fossil record of these sharks is presented. Previous investigations of lamniform phylogeny, using both anatomical and molecular-based data, have been unable to resolve relationships within this group. In an effort to elucidate lamniform phylogeny, this study examined the entire mitochondrial genome sequence for all living species in this order. Mitochondrial genomes for all 15 lamniform species were sequenced, and both individual genes and multigene datasets were analyzed using several standard phylogenetic methods (maximum parsimony, maximum likelihood, and distance optimality criteria). Overall, the current study recovered the same relationships as previous molecular studies. The clade Lamnidae is strongly supported, although internal relationships could not be resolved. Carcharias and Cetorhinus were frequently allied with the Lamnidae, especially for multigene datasets, but results were variable for individual genes and depended on the method employed. There was no support for a monophyletic Odontaspididae (Carcharias+Odontaspis). A clade ("AMOP") comprising Alopias, Megachasma, Odontaspis, and Pseudocarcharias was frequently recovered in this study, but relationships within this clade could not be resolved further. As in previous molecular studies, the genus Alopias proved to be especially problematic. The majority of analyses recovered Mitsukurina as either the basal outgroup to all other lamniforms or the sister taxon to the AMOP clade. Based on the current study of all living lamniforms, complete taxon sampling and larger datasets do not necessarily provide a better phylogeny. Finally, a large tandem duplication is documented for the mitochondrial genome of Mitsukurina owstoni.

REFERENCES

Abe, T., Isokawa, S., Misu, T., Kishimoto, T., Shimma, Y. and Shimma, H. 1968. Notes on some members of Osteodonti (Class Chondrichthyes) I. Bull. Tokai. Reg. Fish. Res. Lab. 56: 1-7.

Abe, T., Isokawa, S., Aoki, K., Kishimoto, T., Shimma, Y. and Shimma, H. 1969. Notes on some members of Osteodonti (Class Chondrichthyes) II. Bull. Tokai. Reg. Fish. Res. Lab. 60: 1-3.

Adachi, J., Cao, Y. and Hasegawa, M. 1993. Tempo and mode of mitochondrial DNA evolution in vertebrates at the amino acid sequence level: rapid evolution in warm-blooded vertebrates. J. Mol. Evol. 36: 270-281.

Adkins, R.M., Honeycutt, R.L. and Disotell, T.R. 1996. Evolution of eutherian cytochrome c oxidase subunit II: Heterogeneous rates of protein evolution and altered interaction with cytochrome c. Mol. Biol. Evol. 13: 1393-1404.

Alexander, R.L. 1995. Evidence of a counter-current heat exchanger in the ray Mobula tarapacana (Chondrichthyes: Elasmobranchii: Batoidea: Myliobatiformes). J. Zool. Lond. 237: 377-384.

Alexander, R.L. 1996. Evidence of brain-warming in the mobulid rays, Mobula tarapacana and Manta birostris (Chondrichthyes: Elasmobranchii: Batoidea: Myliobatiformes). Zool. J. Linn. Soc. 118: 151-164.

Alexander, R.L. 1998. Blood supply to the eyes and brain of lamniform sharks (Lamniformes). J. Zool. Lond. 245: 363-369.

Amorim, A.F., Arfelli, C.A. and Fagundes, L. 1998. Pelagic elasmobranchs caught by longliners off southern Brazil during 1974-97: an overview. Mar. Freshw. Res. 49: 621-632.

Amorim, A.F., Arfelli, C.A. and Castro, J.I. 2000. Description of a juvenile megamouth shark, Megachasma pelagios, caught off Brazil. Environ. Biol. Fishes. 59: 117-123.

Anderson, S.D. and Goldman, K.J. 2001. Temperature measurements from salmon sharks, Lamna ditropis, in Alaskan waters. Copeia. 2001: 794-796.

Applegate, S.P. 1965. Tooth terminology and variation in sharks with special reference to the , Carcharias taurus Rafinesque. L.A. Cnty Mus. Contrib. Sci. 86: 1-18.

Applegate, S.P. 1972. A revision of the higher taxa of orectolobids. J. Mar. Biol. Ass. . 14: 743-751.

Applegate, S.P. 1968. A large fossil of the genus Odontaspis from Oregon. The Ore Bin. 30: 32-36.

Applegate, S.P. and Espinosa-Arrubarrena, L. 1996. The fossil history of Carcharodon and its possible ancestor, Cretolamna: A study in tooth identification. In: Klimley, A.P. and D.G. Ainley (Eds.). Great White Sharks. The Biology of Carcharodon carcharias. Academic Press, San Diego, pp. 19-36.

Arnason, U., Gullberg, A. and Janke, A. 1997. Phylogenetic analyses of mitochondrial DNA suggest a sister group relationship between Xenarthra (Edentata) and ferungulates. Mol. Biol. Evol. 14: 762-768.

Arnason, U., Gullberg, A. and Janke, A. 2001. Molecular of gnathostomous (jawed) fishes: old bones new cartilage. Zool. Scr. 30: 249-255.

Avise, J.C. 1991. Ten unorthodox perspectives on evolution prompted by comparative population genetic findings on mitochondrial DNA. Annu. Rev. Genet. 25: 45-69.

Avise, J.C. 1994. Molecular Markers and Natural History. Chapman and Hall, U.S.A. 510pp.

Bass, A.J., D'Aubrey, J.D. and Kistnasamy, N. 1975. Sharks of the east coast of Southern Africa. IV. The families Odontaspididae, Scapanorhynchidae, Isuridae, Cetorhinidae, Alopiidae, Orectolobidae and Rhiniodontidae. Invest. Rep. Oceanogr. Res. Inst. 39: 1-102.

Bean, B.A. 1905. Notes on an adult goblin shark (Mitsukurina owstoni) of Japan. Proc. U.S. Natn. Mus. 28: 815-818.

Bearez, P., Zambrano, M. and Trevino, H. 2001. Premier signalement pour le Perou de trois poisons oceaniques: Pseudocarcharias kamoharai (Chondrichthyes, Pseudocarchariidae), (Osteichthyes, Alepisauridae) et Pteraclis velifera (Osteichthyes, Bramidae). Cybium. 25: 181-184.

Bendix-Almgreen, S.E. 1983. Carcharodon megalodon from the upper Miocene of Denmark, with comments on elasmobranch tooth enameloid: coronoin. Bull. Geol. Soc. Den. 32: 1-32.

Bensasson, D., Zhang, D.-X., Hartl, D. and Hewitt, G.M. 2001. Mitochondrial pseudogenes: evolution's misplaced witnesses. TREE 16: 314-321.

Benton, M.J. and Whyte, M.A. 1993. (Eds.). The Fossil Record 2. Chapman and Hall, London. 864pp.

Berg, T., Moum, T. and Johansen, S. 1995. Variable numbers of simple tandem repeats make birds of the order Ciconiiformes heteroplasmic in their mitochondrial genomes. Curr. Genet. 27: 257-262.

Bernal, D., Dickson, K.A., Shadwick, R.E. and Graham, J.B. 2001. Review: Analysis of the evolutionary convergence for high performance swimming in lamnid sharks and tunas. Comp. Biochem. Physiol. (A). 129: 695-726.

Berra, T. 1997. Some 20th century fish discoveries. Environ. Biol. Fish. 50: 1-12.

Berra, T.M. and Hutchins, J.B. 1990. A specimen of megamouth shark, Megachasma pelagios (Megachasmidae) from . Rec. West. Aust. Mus. 14: 651-656.

Berra, T.M. and Hutchins, J.B. 1991. Natural history notes on the megamouth shark, Megachasma pelagios, from Western Australia. West. Aust. Nat., 18: 224-233.

Bhalla, S.N. and Dev, P. 1985. Taxonomic validity of some species of the shark genus Carcharodon Mueller and Henle, 1838. Ind. J. Earth Sci. 12: 141-144.

Blagoderov, A.I. 1994. Seasonal distribution and some notes on the biology of salmon shark (Lamna ditropis) in the Northwestern Pacific Ocean. J. Ichthyol. 34: 115-121.

Block, B.A. 1991. Endothermy in fish: thermogenesis, ecology and evolution. In: Hochachka, P.W. and Mommsen, T.P. (Eds.). Biochemistry and Molecular Biology of Fishes Vol. 1: Phylogenetic and Biochemical Perspectives. pp. 269-311. Elsevier.

Block, B.A. and Carey, F.G. 1985. Warm brain and eye temperatures in sharks. J. Comp. Physiol. (B). 156: 229-236.

Bone, Q. and Chubb, A.D. 1983. The retial system of the locomotor muscles in the thresher shark. J. Mar. Biol. Ass. U.K. 63: 239-241.

Bonfil, R. 1995. Is the ragged-tooth shark cosmopolitan? First record from the western North Atlantic. J. Fish Biol. 47: 341-344.

Boore, J.L. 1997. Transmission of mitochondrial DNA - playing favorites? Bioessays 19: 751-753.

Boore, J.L. 1999. Survey and Summary: Animal mitochondrial genomes. Nucleic Acids Res. 27: 1767-1780.

Boore, J.L. and Brown, W.M. 1998. Big trees from little genomes: mitochondrial gene order as a phylogenetic tool. Curr. Opinion Genetics. Dev. 8: 668-674.

Branstetter, S. and McEachran, J.D. 1986. A first record of Odontaspis noronhai (Lamniformes: Odontaspididae) for the western North Atlantic, with notes on two uncommon sharks from the . Northeast Gulf Sci. 8: 153-160.

Branstetter, S. 2002. Basking Shark. Family Cetorhinidae. In: Collette, B.B. and Klein-MacPhee, G. (Eds.). Bigelow and Schroeder's Fishes of the Gulf of Maine. pp. 32-34.

Branstetter, S. and Musick, J.A. 1994. Age and growth estimates for the sand tiger in the Northwestern Atlantic Ocean. Trans. Am. Fisheries Soc. 123: 242-254.

Braun, E.L. and Kimball, R.T. 2002. Examining basal avian divergences with mitochondrial sequences: model complexity, taxon sampling, and sequence length. Syst. Biol. 51: 614-625.

Bromham, L., Eyre-Walker, A., Smith, N.H. and Maynard Smith, J. 2003. Mitochondrial Steve: paternal inheritance of mitochondria in humans. TREE 18: 2-4.

Brown, G.G., Gadaleta, G., Pepe, G., Saccone, C. and Sbisa, E. 1986. Structural conservation and variation in the D-loop-containing region of vertebrate mitochondrial DNA. J. Mol. Biol. 192: 503-511.

Brown, J.R., Beckenbach, A.T. and Smith, M.J. 1993. Intraspecific DNA sequence variation of the mitochondrial control region of white sturgeon (Acipenser transmontanus). Mol. Biol. Evol. 10: 326-341.

Brown, W.M. 1983. Evolution of Animal Mitochondrial DNA. In: Nei, M. and Koehn, R.K. (Eds.). Evolution of Genes and Proteins. Sinauer Associates Inc., MA. pp. 62-88.

Brown, W.M., Prager, E.M., Wang, A. and Wilson, A.C. 1982. Mitochondrial DNA sequences of primates: tempo and mode of evolution. J. Mol. Evol. 18: 225-239.

Brown, W.M. and Simpson, M.V. 1982. Novel features of animal mtDNA evolution as shown by sequences of two rat cytochrome oxidase subunit II genes. Proc. Natl. Acad. Sci. 79: 3246-3250.

Burch, S.J., Lawson, R. and Davies, D.H. 1984. The relationships of cartilaginous fishes: an immunological study of serum transferrins of holocephalans and elasmobranchs. J. Zool. Lond. 203: 303-310.

Burgess, G.H. 1985. Biology of sharks as related to commercial shark fishing. Manual on Shark Fishing. Florida Sea Grant College. SRG-73. pp. 4-10.

Bushnell, P.G. and Jones, D.R. 1992. The Arterial System. In: Hoar, W.S., Randall, D.J. and A.P. Farrell (Eds.). . Volume XII A: The cardiovascular system. Academic Press, San Diego, pp. 89-139.

Cadenas, E. and Davies, K.J.A. 2000. Mitochondrial free radical generation, oxidative stress, and aging. Free Radical Biol. Med. 29: 222-230.

Cailliet, G.M. and Bedford, D.W. 1983. The biology of three pelagic sharks from California waters, and their emerging fisheries: a review. California Cooperative Oceanic Fisheries Investigational Report. 24: 57-69.

Cailliet, G.M., Natanson, L.J., Weldon, B.A. and Ebert, D.A. 1985. Preliminary studies on the age and growth of the white shark, Carcharodon carcharias, using vertebral bands. Mem. So. Calif. Acad. Sci. 9: 49-60.

Cantatore, P., Roberti, M., Pesole, G., Ludovico, A., Milella, F., Gadaleta, M.N. and Saccone, C. 1994. Evolutionary analysis of cytochrome b sequences in some Perciformes: Evidence for a slower rate of evolution than in mammals. J. Mol. Evol. 39: 589-597.

Cantatore, P. and Saccone, C. 1987. Organization, structure and evolution of mammalian mitochondrial genes. Int. Rev. Cytol. 108: 149-208.

Cao, Y., Adachi, J., Janke, A., Paabo, S. and Hasegawa, M. 1994. Phylogenetic relationships among eutherian orders estimated from inferred sequences of mitochondrial proteins: Instability of a tree based on a single gene. J. Mol. Evol. 39: 519-527.

Cao, Y., Waddell, P.J., Okada, N. and Hasegawa, M. 1998. The complete mitochondrial DNA sequence of the shark Mustelus manazo: evaluating rooting contradictions to living bony vertebrates. Mol. Biol. Evol. 15: 1637-1646.

Cappetta, H. 1980. Les selaciens du Cretace superieur du Liban. I. Requins. Palaeontographica Abt. A. 168: 69-148.

Cappetta, H. 1987. Chondrichthyes. II. Mesozoic and Cenozoic Elasmobranchii, In: Handbook of Paleoichthyology, Vol. 3B. Gustav Fischer Verlag Stuttgart, .

Cappetta, H., Duffin, C. and Zidek, J. 1993. Chondrichthyes. In: Benton, M.J. (Ed.). The Fossil Record 2. Chapman and Hall, London.

Carey, F.G., Teal, J.M., Kanwisher, J.W., Lawson, K.D. and Beckett, J.S. 1971. Warm-bodied fish. Am. Zool. 11: 137-145.

Carey, F.G., Teal, J.M. and Kanwisher, J.W. 1981. The visceral temperatures of mackerel sharks (Lamnidae). Physiol. Zool. 54: 334-344.

Carey, F.G., Kanwisher, J.W., Brazier, O., Gabrielson, G., Casey, J.G. and Pratt, H.L. 1982. Temperature and activities of a white shark, Carcharodon carcharias. Copeia 1982: 254-260.

Carone, A., Malladi, S.B., Attimonelli, M. and Saccone, C. 1999. Vertebrate MitBASE: a specialized database on vertebrate mitochondrial DNA sequences. Nucleic Acids Res. 27: 150-152.

Case, G.R. 1979. Cretaceous selachians from the Pedee Formation (Late Maestrichtian) of Duplin County, North Carolina. Brimleyana. 2: 77-89.

Case, G.R. 1980. A selachian fauna from the Trent Formation, Lower Miocene (Aquitanian) of Eastern North Carolina. Palaeontographica Abt. A. 171: 75-103.

Case, G.R. 1981. Late Eocene selachians from South-Central Georgia. Palaeontographica Abt. A. 176: 52-79.

Case, G.R. and Cappetta, H. 1990. The Eocene selachian fauna from the Fayum depression in Egypt. Palaeontographica Abt. A. 212: 1-30.

Case, G.R., Tokaryk, T.T. and Baird, D. 1990. Selachians from the Niobrara Formation of the Upper Cretaceous Coniacian of Carrot River, Saskatchewan, Canada. Can. J. Earth Sci. 27: 1084-1094.

Casey, J.G. and Pratt, H.L., Jr. 1985. Distribution of the white shark, Carcharodon carcharias, in the Western North Atlantic. Mem. So. Calif. Acad. Sci. 9: 2-14.

Casey, J.G. and Kohler, N.E. 1992. Tagging studies on the shortfin mako shark (Isurus oxyrinchus) in the western North Atlantic. Aust. J. Mar. Freshw. Res. 43: 45-60.

Castro, J.I. 1983. The Sharks of North American Waters. Texas A&M University Press, College Station, TX, 180 pp.

Castro, J.I., Clark, E., Yano, K. and Nakaya, K. 1997. The gross anatomy of the female reproductive tract and associated organs of the Fukuoka megamouth shark (Megachasma pelagios). In: Yano, K., Morrissey, J. F., Yabumoto, Y. and Nakaya, K. (Eds.). Biology of the Megamouth Shark. Tokai University Press, Japan, pp. 115-119.

Castro, J.I., Woodley, C.M. and Brudek, R.L. 1999. A preliminary evaluation of the status of shark species. FAO Fisheries Technical Paper (Food and Agriculture Organisation of the United Nations) (Rome). Issue No. 380 pp. 1-72.

Churchill, G.A., von Haeseler, A. and Navidi, W.C. 1992. Sample size for phylogenetic inference. Mol. Biol. Evol. 9: 753-769.

Cigala Fulgosi, F. 1983. First record of Alopias superciliosus (Lowe, 1840) in the Mediterranean, with notes on some fossil species of the genus Alopias. Annali del Museo Civico di Storia Naturale di Genova 84: 21-229.

Cigala Fulgosi, F. 1986. A deep water elasmobranch fauna from a lower Pliocene outcropping (Northern Italy). In: Uyeno, T., Arai, R., Taniuchi, T. and Matsuura, K. (Eds.). Indo-Pacific Fish Biology: Proceedings of the Second International Conference on Indo-Pacific Fishes, pp. 133-139.

Cigala Fulgosi, F. 1988. Additions to the Eocene and Pliocene fish fauna of Italy. Evidence of Alopias cf. denticulatus, Cappetta, 1981 in the Bartonian-Priabonian of the Monte Piano Marl (Northern Apennines) and of A. superciliosus (Lowe, 1840) in the Pliocene of Tuscany (Chondrichthyes, Alopiidae). Tertiary Res. 10: 93-99.

Cigala Fulgosi, F. 1992. Additions to the fish fauna of the Italian Miocene. The occurrence of Pseudocarcharias (Chondrichthyes, Pseudocarchariidae) in the lower Serravallian of Parma Province, Northern Apennines. Tertiary Res. 14: 51-60.

Cione, A.L. and Reguero, M. 1994. New records of the sharks Isurus and Hexanchus from the Eocene of Seymour Island, Antarctica. Proc. Geolog. Assoc. 105: 1-14.

Cione, A.L. and Reguero, M.A. 1998. A Middle Eocene basking shark (Lamniformes, Cetorhinidae) from Antarctica. Antarctic Sci. 10: 83-88.

Clark, E. and Castro, J.I. 1995. "Megamamma" is a virgin: dissection of the first female specimen of Megachasma pelagios. Environ. Biol. Fish. 43: 329-332.

Clayton, D.A. 1982. Replication of animal mitochondrial DNA. Cell 28: 693-705.

Cliff, G., Dudley, S.F.J. and Davis, B. 1989. Sharks caught in the protective gill nets off Natal, South Africa. 2. The great white shark, Carcharodon carcharias (Linnaeus). S. Afr. J. Mar. Sci. 8: 131-144.

Compagno, L.J.V. 1973. Interrelationships of living elasmobranchs. In: Greenwood, P.H., Miles, R.S. and Patterson, C. (Eds.). Interrelationships of Fishes, Suppl. 1, Zool. J. Linn. Soc. 53: 15-61.

Compagno, L.J.V. 1977. Phyletic relationships of living sharks and rays. Am. Zool. 17: 303-322.

Compagno, L.J.V. 1984a. Sharks of the world. An annotated and illustrated catalogue of shark species known to date. Part 1. Hexanchiformes to Lamniformes. FAO species catalogue. Vol. 4. Food Agric. Org. United Nations; FAO Fisheries Synopsis 125, Vol. 4, Pt. 1. pp. 1-249.

Compagno, L.J.V. 1984b. Sharks of the world. An annotated and illustrated catalogue of shark species known to date. Part 2. Carcharhiniformes. FAO species catalogue. Vol. 4. Food Agric. Org. United Nations; FAO Fisheries Synopsis 125, Vol. 4, Pt. 2. pp. 251-655.

Compagno, L.J.V. 1988. Sharks of the order Carcharhiniformes. Princeton University Press, New Jersey, 486 pp.

Compagno, L.J.V. 1990a. Shark Exploitation and Conservation. Pages 391-414 In: Pratt, H.L., Gruber, S.H. and Taniuchi, T. (Eds.). Elasmobranchs as living resources: advances in the biology, ecology, systematics and the status of the fisheries. U.S. Dept. of Commerce. NOAA Tech. Report NMFS 90.

Compagno, L.J.V. 1990b. Relationships of the Megamouth shark, Megachasma pelagios (Lamniformes: Megachasmidae) with comments on its feeding habits. Pages 357-379 In: Pratt, H.L., Gruber, S.H. and Taniuchi, T. (Eds.). Elasmobranchs as living resources: advances in the biology, ecology, systematics and the status of the fisheries. U.S. Dept. of Commerce. NOAA Tech. Report NMFS 90.

Compagno, L.J.V. 1990c. Alternative life-history styles of cartilaginous fishes in time and space. Environ. Biol. Fish. 28: 33-75.

Compagno, L.J.V. 2001. Sharks of the world. An annotated and illustrated catalogue of shark species known to date. Volume 2. Bullhead, mackerel and carpet sharks (Heterodontiformes, Lamniformes and Orectolobiformes). FAO Species Catalogue for Fishery Purposes. No. 1 (vol. 2), p. i-viii + 1-269.

Compagno, L.J.V., Ebert, D.A. and Cowley, P.D. 1991. Distribution of offshore demersal cartilaginous fish (Class Chondrichthyes) off the west coast of South Africa, with notes on their systematics. S. Afr. J. Mar. Sci. 11: 43-139.

Compagno, L.J.V., Marks, M.A. and Ferguson, I.K. 1997. Threatened fishes of the world: Carcharodon carcharias (Linnaeus, 1758) (Lamnidae). Environ. Biol. Fish. 50: 61-62.

Cornell, P.S. and Ward, R.H. 2000. Mitochondrial genes and mammalian phylogenies: increasing reliability of branch length estimation. Mol. Biol. Evol. 17: 224-234.

Cunningham, S.B. 2000. A comparison of isolated teeth of early Eocene Striatolamia macrota (Chondrichthyes, Lamniformes), with those of a recent sand shark, Carcharias taurus. Tertiary Res. 20: 17-31.

Cuny, G., Hunt, A., Mazin, J.-M. and Rauscher, R. 2000. Teeth of enigmatic neoselachian sharks and an ornithischian from the uppermost Triassic of Lons-le-Saunier (Jura, France). Palaontologische Zeitschrift. 74: 171-185.

Curole, J.P. and Kocher, T.D. 1999. Mitogenomics: digging deeper with complete mitochondrial genomes. TREE 14: 394-398.

D'Aubrey, J.D. 1964. A carchariid shark new to South African waters. Invest. Rep. Oceanogr. Res. Inst. 9: 1-16.

D'Aubrey, J.D. 1969. Two species of shark new to South African waters. Bull. South Afric. Assoc. Marine Biol. Res. N7: 30-31.

Daugherty, A.E. 1964. The sand shark, Carcharias ferox (Risso), in California. Calif. Fish and Game. 50: 4-10.

Davies, D.H., Lawson, R., Burch, S.J. and Hanson, J.E. 1987. Evolutionary relationships of a "primitive" shark (Heterodontus) assessed by micro-complement fixation of serum transferrin. J. Mol. Evol. 25: 74-80.

Dean, B. 1903. Additional specimens of the Japanese shark, Mitsukurina. Science. 17: 630-631.

De Carvalho, M.R. 1996. Higher-level elasmobranch phylogeny, basal squaleans and paraphyly. In: Stiassny, M.L.J., Parenti, L.R. and Johnson, G.D. (Eds.). Interrelationships of fishes, pp. 35-62. Academic Press.

Demski, L.S. and Northcutt, R.G. 1996. The brain and cranial nerves of the white shark: an evolutionary perspective. In: Klimley, A.P. and Ainley, D.G. (Eds.). Great White Sharks. The Biology of Carcharodon carcharias. Academic Press, San Diego, pp. 121-130.

Desjardins, P. and Morais, R. 1991. Nucleotide sequences and evolution of coding and noncoding regions of a quail mitochondrial genome. J. Mol. Evol. 32: 153-161.

DeWoody, J.A., Chesser, R.K. and Baker, R.J. 1999. A translocated mitochondrial cytochrome b pseudogene in voles (Rodentia: Microtus). J. Mol. Evol. 48: 380-382.

Diamond, J.M. 1985. Filter-feeding on a grand scale. Nature 316: 679-680.

Dillon, M.C. and Wright, J.M. 1993. Nucleotide sequence of the D-loop region of the sperm whale (Physeter macrocephalus) mitochondrial genome. Mol. Biol. Evol. 10: 296-305.

Douady, C.J., Dosay, M., Shivji, M.S. and Stanhope, M.J. 2003. Molecular phylogenetic evidence refuting the hypothesis of Batoidea (rays and skates) as derived sharks. Mol. Phyl. Evol. 26: 215-221.

Dowton, M. and Campbell, N.J.H. 2001. Intramitochondrial recombination - is it why some mitochondrial genes sleep around? TREE 16: 269-271.

Duffin, C.J. 1988. The Upper Jurassic selachian Palaeocarcharias de Beaumont (1960). Zool. J. Linn. Soc. 94: 271-286.

Duffin, C.J. 1998. New shark remains from the British Rhaetian (latest Triassic) 1. The earliest basking shark. N. Jb. Geol. Palaont. Mh. 1998: 157-181.

Duffy, C.A.J. 1997. Further record of the goblin shark, Mitsukurina owstoni (Lamniformes: Mitsukurinidae), from New Zealand. N.Z. J. Zool. 24: 167-171.

Dunn, K.A. and Morrissey, J.F. 1995. Molecular phylogeny of elasmobranchs. Copeia. 1995: 526-531.

Eitner, B.J. 1995. Systematics of the genus Alopias (Lamniformes: Alopiidae) with evidence for the existence of an unrecognized species. Copeia. 1995: 562-571.

Ellis, J.R. and Shackley, S.E. 1995. Notes on porbeagle sharks, Lamna nasus, from the Bristol Channel. J. Fish Biol. 46: 368-370.

Emery, S.H. 1986. Haematological comparisons of endothermic vs. ectothermic elasmobranch fishes. Copeia 1986: 700-705.

Fairfax, D. 1998. The Basking Shark in Scotland, Natural History, Fishery and Conservation. Tuckwell Press, Great Britain, 206 pp.

Farris, J.S., Kallersjo, M., Kluge, A.G. and Bult, C. 1995. Constructing a significance test for incongruence. Syst. Biol. 51: 19-31.

Fedrigo, O., Adams, D.C. and Naylor, G.J.P. MS. DRUIDS —Detection of Regions with Unexpected Internal Deviation from Stationarity.

Feldmeth, R.C. and Waggoner, J.P., III. 1972. Buoyancy control in the shark Odontaspis taurus (Rafinesque). Copeia 1972: 594-595.

Felsenstein, J. 1978. Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27: 401-410.

Ferguson, I.K. 1996. Distribution and autoecology of the white shark in the eastern North Atlantic and the Mediterranean Sea. In: Klimley, A.P. and Ainley, D.G. (Eds.). Great White Sharks. The Biology of Carcharodon carcharias. Academic Press, San Diego, pp. 321-345.

Francis, M.P. and Duffy, C. 2002. Distribution, seasonal abundance and bycatch of basking sharks, Cetorhinus maximus, in New Zealand, with observations on their winter habitat. Marine Biol. 140: 831-842.

Francis, M.P. and Stevens, J.D. 2000. Reproduction, embryonic development, and growth of the porbeagle shark, Lamna nasus, in the Southwest Pacific Ocean. Fish. Bull. 98: 41-63.

Fujita, K. 1981. Oviphagous embryos of the pseudocarchariid shark, Pseudocarcharias kamoharai, from the central Pacific. Jap. J. Ichthyol. 28: 37-44.

Galvan-Magana, F., Nienhuis, H.J. and Klimley, A.P. 1989. Seasonal abundance and feeding habits of sharks of the lower Gulf of California, Mexico. Calif. Fish and Game. 75: 74-84.

Garman, S. 1913. The Plagiostomia (Sharks, Skates and Rays). Mem. Mus. Comp. Zool. Harv. 36: 1-515.

Garrick, J.A.F. 1967. Revision of sharks of genus Isurus with description of a new species (Galeoidea, Lamnidae). Proc. U.S. Natl Mus. 118: 663-690.

Garrick, J.A.F. 1974. First record of an odontaspidid shark in New Zealand waters. N.Z. J. Mar. Freshw. Res. 8: 621-630.

Gauld, J.A. 1989. Records of porbeagles landed in Scotland, with observations on the biology, distribution and exploitation of the species. DAFS Scottish Fisheries Report no. 45, 15 pp.

Gillham, N.W. 1994. Organelle genes and genomes. Oxford University Press, Inc., New York, 424 pp.

Gilmore, R.G., Dodrill, J.W. and Linley, P.A. 1983. Reproduction and embryonic development of the sand tiger shark, Odontaspis taurus (Rafinesque). Fish. Bull. 81: 201-225.

Glickman, L.S. and Averianov, A.O. 1998. Evolution of the Cretaceous lamnoid sharks of the genus Eostriatolamia. Paleontol. J. 32: 376-384.

Glover, C.J.M. 1976. The goblin shark Scapanorhynchus owstoni (Jordan, 1898): confirmation of the first Australian record. S. Aust. Natur. 50: 69-72.

Goldman, K.J., Anderson, S.D., McCosker, J.E. and Klimley, A.P. 1996. Temperature, swimming depth and movements of a white shark at the South Farallon Islands, California. In: Klimley, A.P. and Ainley, D.G. (Eds.). Great White Sharks. The Biology of Carcharodon carcharias. Academic Press, San Diego, pp. 111-120.

Gottfried, M.D. 1995. Miocene basking sharks (Lamniformes: Cetorhinidae) from the Chesapeake Group of Maryland and Virginia. J. Vert. Paleontol. 15: 443-447.

Gottfried, M.D., Compagno, L.J.V. and Bowman, S.C. 1996. Size and skeletal anatomy of the giant "megatooth" shark Carcharodon megalodon. In: Klimley, A.P. and Ainley, D.G. (Eds.). Great White Sharks. The Biology of Carcharodon carcharias. Academic Press, San Diego, pp. 55-66.

Gottfried, M.D. and Fordyce, R.E. 2001. An associated specimen of Carcharodon angustidens (Chondrichthyes, Lamnidae) from the late Oligocene of New Zealand, with comments on Carcharodon interrelationships. J. Vert. Paleontol. 21: 730-739.

Graybeal, A. 1998. Is it better to add taxa or characters to a difficult phylogenetic problem? Syst. Biol. 47: 9-17.

Griffiths, C.S. 1998. The correlation of protein structure and evolution of a protein-coding gene: phylogenetic inference using cytochrome oxidase III. Mol. Biol. Evol. 15: 1337-1345.

Gruber, S.H. 1980. The shark. Sea Frontiers. 26: 306-308.

Gruber, S.H. and Cohen, J.L. 1985. Visual system of the white shark, Carcharodon carcharias, with emphasis on retinal structure. Mem. So. Calif. Acad. Sci. 9: 61-72.

Gruber, S.H. and Compagno, L.J.V. 1981. Taxonomic status and biology of the bigeye thresher, Alopias superciliosus. Fish. Bull. 79: 617-640.

Gubanov, Y.P. 1972. On the biology of the thresher shark [Alopias vulpinus (Bonnaterre)] in the northwest Indian Ocean. J. Ichthyol. 12: 591-600.

Gubanov, E.P. 1985. Presence of the sharp tooth sand tiger shark, Odontaspis ferox (Odontaspididae), in the open waters of the Indian Ocean. J. Ichthyol. 25: 156-158.

Hallacher, L.E. 1977. On the feeding behavior of the basking shark, Cetorhinus maximus. Env. Biol. Fish. 2: 297-298.

Hamm, S.A. and Shimada, K. 2002. Associated tooth set of the late Cretaceous lamniform shark Scapanorhynchus raphiodon (Mitsukurinidae) from the Niobrara Chalk of Western Kansas. Trans. Kansas Acad. Sci. 105: 18-26.

Hazin, F.H., Couto, A.A., Kihara, K., Otsuka, K. and Ishino, M. 1990. Distribution and abundance of pelagic sharks in the South-Western Equatorial Atlantic. J. Tokyo Univ. Fisheries. 77: 51-64.

Heist, E.J., Musick, J.A. and Graves, J.E. 1996. Genetic population structure of the shortfin mako (Isurus oxyrinchus) inferred from restriction fragment length polymorphism analysis of mitochondrial DNA. Can. J. Fish. Aquat. Sci. 53: 583-588.

Hillis, D.M. 1998. Taxonomic sampling, phylogenetic accuracy, and investigator bias. Syst. Biol. 47: 3-8.

Hillis, D.M. and Dixon, M.T. 1991. Ribosomal DNA: Molecular evolution and phylogenetic inference. Quart. Rev. Biol. 66: 411-453.

Hillis, D.M., Mable, B.K., Larson, A., Davis, S.K. and Zimmer, E.A. 1996a. Nucleic acids IV: Sequencing and Cloning. In: Hillis, D.M., Moritz, C. and Mable, B.K. (Eds.). Molecular Systematics, pp. 321-381.

Hillis, D.M., Mable, B.K. and Moritz, C. 1996b. Applications of molecular systematics: the state of the field and a look to the future. In: Hillis, D.M., Moritz, C. and Mable, B.K. (Eds.). Molecular Systematics, pp. 515-543.

Hixson, J.E., Wong, T.W. and Clayton, D.A. 1986. Both the conserved and divergent 5-flanking sequences are required for initiation of the human mitochondrial origin of light strand replication. J. Biol. Chem. 256: 10613-10617.

Hoelzel, A.R., Hancock, J.M. and Dover, G.A. 1991. Evolution of the cetacean mitochondrial D-loop region. Mol. Biol. Evol. 8: 475-493.

Honeycutt, R.L., Nedbal, M.A., Adkins, R.M. and Janecek, L.L. 1995. Mammalian mitochondrial DNA evolution: A comparison of the cytochrome b and cytochrome c oxidase II genes. J. Mol. Evol. 40: 260-272.

Hubbell, G. 1996. Using tooth structure to determine the evolutionary history of the white shark. In: Klimley, A.P. and Ainley, D.G. (Eds.). Great White Sharks. The Biology of Carcharodon carcharias. Academic Press, San Diego, pp. 9-18.

Huelsenbeck, J.P., Bull, J.J. and Cunningham, C.W. 1996. Combining data in phylogenetic analysis. TREE 11: 152-158.

Humphreys, R.L. 1989. First record of the shark, Odontaspis noronhai, from the Pacific Ocean. Jap. J. Ichthyol. 36: 357-362.

Hussakoff, L. 1909. A new goblin shark Scapanorhynchus jordani, from Japan. Bull. Amer. Mus. Nat. Hist. 26: 257-263.

Hwang, U.-W. and Kim, W. 1999. General properties and phylogenetic utilities of nuclear ribosomal DNA and mitochondrial DNA commonly used in molecular systematics. Korean J. Parasitol. 37: 215-228.

Inoue, J.G., Miya, M., Tsukamoto, K. and Nishida, M. 2001. Complete mitochondrial DNA sequence of Conger myriaster (Teleostei: Anguilliformes): Novel gene order for vertebrate mitochondrial genomes and the phylogenetic implications for anguilliform families. J. Mol. Evol. 52: 311-320.

Irwin, D.M., Kocher, T.D. and Wilson, A.C. 1991. Evolution of the cytochrome b gene in mammals. J. Mol. Evol. 32: 128-144.

Itoigawa, J., Nishimoto, H., Karasawa, H. and Okumura, Y. 1985. Miocene fossils of the Mizunami group, central Japan. 3. Elasmobranchs. Monograph of the Mizunami Fossil Museum. 5: 1-89.

Iturralde-Vinent, M., Hubbell, G. and Rojas, R. 1996. Catalogue of Cuban fossil Elasmobranchii (Paleocene to Pliocene) and paleogeographic implications of their lower to middle Miocene occurrence. J. Geol. Soc. Jamaica. 31: 7-21.

Izawa, K. and Shibata, T. 1993. A young basking shark, Cetorhinus maximus, from Japan. Japan. J. Ichthyol. 40: 237-245.

Janke, A. and Arnason, U. 1997. The complete mitochondrial genome of Alligator mississippiensis and the separation between recent Archosauria birds and crocodiles. Mol. Biol. Evol. 14: 1266-1272.

Janke, A., Feldmaier-Fuchs, G., Thomas, W.K., von Haeseler, A. and Paabo, S. 1994. The marsupial mitochondrial genome and the evolution of placental mammals. Genetics 137: 243-256.

Janvier, P. 1996. Early Vertebrates. Oxford Monographs on Geology and Geophysics. Oxford University Press, Inc., New York 393 pp.

Jordan, D.S. 1898. Description of a species of fish (Mitsukurina owstoni) from Japan, the type of a distinct family of lamnoid sharks. Proc. Calif. Acad. Sci. Ser 3: 199-205.

Jordan, D.S. 1923. A classification of fishes. Publications, University Series, Biol. Sci. 3: 79-243.

Jordan, D.S. and Fowler, H.W. 1903. A review of the elasmobranchiate fishes of Japan. Proc. U.S. Nat. Mus. 26: 593-674.

Jordan, D.S. and Snyder, J.O. 1904. On a collection of fishes made by Mr. Alan Owston in the deep waters of Japan. Smithsonian Misc. Coll. 45: 230-240.

Karasawa, H. 1989. Late Cenozoic elasmobranchs from the Hokuriku district, central Japan. Sci. Rep. Kanazawa Univ. 34: 1-57.

Keyes, I.W. 1972. New records of the elasmobranch C. megalodon (Agassiz) and a review of the genus Carcharodon in the New Zealand fossil record. N.Z. J. Geol. Geophys. 15: 229-242.

Kim, J. 1996. General inconsistency conditions for maximum parsimony: Effects of branch lengths and increasing numbers of taxa. Syst. Biol. 45: 363-374.

Kim, J. 1998. Large-scale phylogenies and measuring the performance of phylogenetic estimators. Syst. Biol. 47: 43-60.

Kitamura, T., Takemura, A., Watabe, S., Taniuchi, T. and Shimizu, M. 1996. Molecular phylogeny of the sharks and rays of Superorder Squalea based on mitochondrial cytochrome b gene. Fisheries Sci. 62: 340-343.

Klimley, A.P. 1985. The areal distribution and autoecology of the white shark, Carcharodon carcharias, off the west coast of North America. Mem. So. Calif. Acad. Sci. 9: 15-40.

Klimley, A.P. and Ainley, D.G. 1996. White shark research in the past: A perspective. In: Klimley, A.P. and Ainley, D.G. (Eds.). Great White Sharks. The Biology of Carcharodon carcharias. Academic Press, San Diego, pp. 3-4.

Klimley, A.P., Pyle, P. and Anderson, S.D. 1996. Tail slap and breach: Agonistic displays among white sharks. In: Klimley, A.P. and Ainley, D.G. (Eds.). Great White Sharks. The Biology of Carcharodon carcharias. Academic Press, San Diego, pp. 241-260.

Kobayashi, H., Yamaguchi, Y., Nonoda, T., Izawa, K. and Hidefumi, B. 1982. The sharks caught on the Continental Shelf and Slope in Kumano Nada Region along the Pacific coast of Japan. Bull. Fac. Fish. Mie Univ. 9: 101-123.

Kocher, T.D., Thomas, W.K., Meyer, A., Edwards, S.V., Paabo, S., Villablanca, F.X. and Wilson, A.C. 1989. Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. U.S.A. 86: 6196-6200.

Krogh, M. 1994. Spatial, seasonal and biological analysis of sharks caught in the protective beach meshing programme. Aust. J. Mar. Freshw. Res. 45: 1087-1106.

Kuga, N. 1985. Revision of mackerel shark of genus Isurus from Japan. Mem. Faculty Sci., Kyoto Univ, Series of Geol. and Mineral. LI (1-2): 1-20.

Kumazawa, Y. and Nishida, M. 1993a. Sequence evolution of mitochondrial tRNA genes and deep-branch animal phylogenetics. J. Mol. Evol. 37: 380-398.

Kumazawa, Y. and Nishida, M. 1993b. Variations in mitochondrial tRNA gene organization of reptiles as phylogenetic markers. Mol. Biol. Evol. 12: 759-772.

Kumazawa, Y. and Nishida, M. 1999. Complete mitochondrial DNA sequences of the green turtle and blue-tailed mole skink: statistical evidence for archosaurian affinity of turtles. Mol. Biol. Evol. 16: 784-792.

Kumazawa, Y. and Nishida, M. 2000. Variations in mitochondrial tRNA gene organization of reptiles as phylogenetic markers. Mol. Biol. Evol. 12: 759-772.

Kumazawa, Y., Ota, H., Nishida, M. and Ozawa, T. 1996. Gene rearrangements in snake mitochondrial genomes: highly concerted evolution of control-region-like sequences duplicated and inserted into a tRNA gene cluster. Mol. Biol. Evol. 13: 1242-1254.

Kumazawa, Y., Ota, H., Nishida, M. and Ozawa, T. 1998. The complete nucleotide sequence of a snake Dinodon semicarinatus mitochondrial genome with two identical control regions. Genetics 150: 313-329.

Last, P.R. and Stevens, J.D. 1994. Sharks and Rays of Australia. East Melbourne, Australia: CSIRO Australia. 513 pp. + plates.

Lavenberg, R.J. 1991. Megamania - the continuing saga of megamouth sharks. Terra 30: 30-39.

Lawson, R., Burch, S.J., Oughterson, S.M., Heath, S. and Davies, D.H. 1995. Evolutionary relationships of cartilaginous fishes: an immunological study. J. Zool. Lond. 237: 101-106.

Lee, W.-J. and Kocher, T.D. 1995. Complete sequence of a Petromyzon marinus mitochondrial genome: early establishment of the vertebrate genome organization. Genetics 139: 873-887.

Liu, F.-G. R. and Miyamoto, M.M. 1999. Phylogenetic assessment of molecular and morphological data for eutherian mammals. Syst. Biol. 48: 54-64.

Liu, K-M., Chiang, P-J. and Chen, C-T. 1998. Age and growth estimates of the bigeye thresher shark, Alopias superciliosus, in northeastern Taiwan waters. Fish. Bull. 96: 482-491.

Liu, K-M., Chen, C-T., Liao, T-H. and Joung, S-J. 1999. Age, growth and reproduction of the pelagic thresher shark, Alopias pelagicus, in the northwestern Pacific. Copeia 1999: 68-74.

Long, D.J. and Seigel, J.A. 1997. A crocodile shark, Pseudocarcharias kamoharai (Selachii: Lamnidae) from pelagic waters of Baja California, Mexico. Oceanides. 12: 61-63.

Long, D.L. and Waggoner, B.M. 1996. Evolutionary relationships of the white shark: A phylogeny of lamniform sharks based on dental morphology. In: Klimley, A.P. and Ainley, D.G. (Eds.). Great White Sharks. The Biology of Carcharodon carcharias. Academic Press, San Diego, pp. 37-47.

Lopez, A., Ryburn, J.A., Fedrigo, O. and Naylor, G.J.P. MS. Much more data and exhaustive species sampling does not improve resolution of the phylogeny of lamniform sharks.

Lucifora, L.O. and Menni, R.C. 1998. First record of a porbeagle shark, Lamna nasus, in brackish waters of Mar Chiquita Lagoon, Argentina. Cybium 22: 87-88.

Macey, J.R., Larson, A., Ananjeva, N.B. and Papenfuss, T.J. 1997a. Evolutionary shifts in three major structural features of the mitochondrial genome among iguanian lizards. J. Mol. Evol. 44: 660-674.

Macey, J.R., Larson, A., Ananjeva, N.B. and Papenfuss, T.J. 1997b. Two novel gene orders and the role of light-stranded replication in rearrangement of the vertebrate mitochondrial genome. Mol. Biol. Evol. 14: 91-104.

Macey, J.R., Schulte, J.A., II, Larson, A. and Papenfuss, T.J. 1998. Tandem duplication via light-strand synthesis may provide a precursor for mitochondrial genomic rearrangement. Mol. Biol. Evol. 15: 71-75.

Macey, J.R., Schulte, J.A., II and Larson, A. 2000. Evolution and phylogenetic information content of mitochondrial genomic structural features illustrated with acrodont lizards. Syst. Biol. 49: 257-277.

Maisey, J.G. 1984. Higher elasmobranch phylogeny and biostratigraphy. Zool. J. Linn. Soc. 82: 33-54.

Maisey, J.G. 1996. Discovering Fossil Fishes. Henry Holt and Co., New York. 223 pp.

Martin, A.P. 1995. Metabolic rate and directional nucleotide substitution in animal mitochondrial DNA. Mol. Biol. Evol. 12: 1124-1131.

Martin, A.P. 1999. Substitution rates of organelle and nuclear genes in sharks: implicating metabolic rate (again). Mol. Biol. Evol. 16: 996-1002.

Martin, A.P. and Burg, T.M. 2002. Perils of paralogy: using HSP70 genes for inferring organismal phylogenies. Syst. Biol. 51: 570-587.

Martin, A.P. and Naylor, G.J.P. 1997. Independent origins of filter-feeding in megamouth and basking sharks (Order Lamniformes) inferred from phylogenetic analysis of cytochrome b gene sequences. In: Yano, K., Morrissey, J. F., Yabumoto, Y. and Nakaya, K. (Eds.). Biology of the Megamouth Shark. Tokai University Press, Japan. pp. 39-50.

Martin, A.P. and Palumbi, S.R. 1993. Protein evolution in different cellular environments: Cytochrome b in sharks and mammals. Mol. Biol. Evol. 10: 873-891.

Martin, A.P., Pardini, A.T., Noble, L.R. and Jones, C.S. 2002. Conservation of a dinucleotide simple sequence repeat locus in sharks. Mol. Phyl. Evol. 23: 205-213.

Matsubara, K. 1936. A new carcharoid shark found in Japan. Zool. Mag. (Japan). 48: 380-382.

Matthews, L.H. 1950. Reproduction in the basking shark, Cetorhinus maximus (Gunner). Phil. Trans. Roy. Soc. London, Ser. B 234: 247-316.

Matthews, L.H. and Parker, H.W. 1950. Notes on the anatomy and biology of the basking shark (Cetorhinus maximus (Gunner)). Proc. Zool. Soc. London. 120: 535-576.

Maul, G.E. 1955. Five species of rare sharks new for Madeira including two new to science. Notulae Naturae. 279: 1-13.

Meyer, A. 1994. Shortcomings of the cytochrome b gene as a molecular marker. TREE 9: 278-280.

Mindell, D.P. and Honeycutt, R.L. 1990. Ribosomal RNA in vertebrates: Evolution and phylogenetic applications. Annu. Rev. Ecol. Syst. 21: 541-566.

Mindell, D.P. and Thacker, C.E. 1996. Rates of molecular evolution: Phylogenetic issues and applications. Annu. Rev. Ecol. Syst. 27: 279-303.

Mindell, D.P., Sorenson, M.D. and Dimcheff, D.E. 1998. Multiple independent origins of mitochondrial gene order in birds. Proc. Natl. Acad. Sci. U.S.A. 95: 10693-10697.

Miya, M. and Nishida, M. 2000. Use of mitogenomic information in teleostean molecular phylogenetics: A tree-based exploration under the maximum-parsimony optimality criterion. Mol. Phyl. Evol. 17: 437-455.

Miya, M., Kawaguchi, A. and Nishida, M. 2001. Mitogenomic exploration of higher teleostean phylogenies: a case study for moderate-scale evolutionary genomics with 38 newly determined complete mitochondrial DNA sequences. Mol. Biol. Evol. 18: 1993-2009.

Miya, M., Takeshima, H., Endo, H., Ishiguro, N.B., Inoue, J.G., Mukai, T., Satoh, T.P., Yamaguchi, M., Kawaguchi, A., Mabuchi, K., Shirai, S.M. and Nishida, M. 2003. Major patterns of higher teleostean phylogenies: a new perspective based on 100 complete mitochondrial DNA sequences. Mol. Phyl. Evol. 26: 121-138.

Miyamoto, M.M., Allard, M.W., Adkins, R.M., Janecek, L.L. and Honeycutt, R.L. 1994. A congruence test of reliability using linked mitochondrial DNA sequences. Syst. Biol. 43: 236-249.

Mollet, H.F., Cliff, G., Pratt, Jr., H.L. and Stevens, J.D. 1999. Reproductive biology of the female shortfin mako, Isurus oxyrinchus Rafinesque, 1810, with comments on the embryonic development of lamnoids. Fish. Bull. 98: 299-318.

Mollet, H.F., Ebert, D.A., Cailliet, G.M., Testi, A.D., Klimley, A.P. and Compagno, L.J.V. 1996. A review of length validation methods and protocols to measure large white sharks. In: Klimley, A.P. and Ainley, D.G. (Eds.). Great White Sharks. The Biology of Carcharodon carcharias. Academic Press, San Diego, pp. 91-108.

Moore, W.S. 1995. Inferring phylogenies from mtDNA variation: Mitochondrial gene-trees versus nuclear-gene trees. Evolution. 49: 718-726.

Moreno, J.A. and Moron, J. 1992a. Comparative study of the genus Isurus (Rafinesque, 1810), and description of a form ("Marrajo Criollo") apparently endemic to the Azores. Aust. J. Mar. Freshw. Res. 43: 109-122.

Moreno, J.A. and Moron, J. 1992b. Reproductive biology of the bigeye thresher shark, Alopias superciliosus (Lowe, 1839). Aust. J. Mar. Freshw. Res. 42: 77-86.

Moritz, C. and Brown, W.M. 1986. Tandem duplication of D-loop and ribosomal RNA sequences in lizard mitochondrial DNA. Science 233: 1425-1427.

Moritz, C. and Brown, W.M. 1987. Tandem duplications in animal mitochondrial DNAs: Variation in incidence and gene content among lizards. Proc. Natl. Acad. Sci. U.S.A. 84: 7183-7187.

Morrissey, J. F., Dunn, K. A. and Mule, F. 1997. The phylogenetic position of Megachasma pelagios inferred from mtDNA sequence data. In: Yano, K., Morrissey, J. F., Yabumoto, Y. and Nakaya, K. (Eds.). Biology of the Megamouth Shark. Tokai University Press, Japan. pp. 33-37.

Munoz-Chapuli, R., De Andres, A.V. and Dingerkus, G. 1994. Coronary artery anatomy and elasmobranch phylogeny. Acta Zool. 75: 249-254.

Nachman, M.W., Brown, W.M., Stoneking, M. and Aquadro, C.F. 1996. Nonneutral mitochondrial DNA variation in humans and chimpanzees. Genetics 142: 953-963.

Nakamura, H. 1935. On the two species of thresher shark from Formosan waters. Mem. Fac. Sci. Agric. 14: 1-6.

Nakano, H. and Nagasawa, K. 1996. Distribution of pelagic elasmobranchs caught by salmon research gillnets in the North Pacific. Fish. Sci. 62: 860-865.

Nagasawa, K. 1998. Predation by salmon sharks (Lamna ditropis) on pacific salmon (Oncorhynchus spp.) in the North Pacific Ocean. N. Pac. Anadr. Fish Comm. Bull. No. 1:419-433.

Nakaya, K. 2001. White band on upper jaw of megamouth shark, Megachasma pelagios, and its presumed function (Lamniformes: Megachasmidae). Bull. Fac. Fish. Hokkaido Univ. 52: 125-129.

Nakaya, K., Yano, K., Takada, K. and Hiruda, H. 1997. Morphology of the first female megamouth shark, Megachasma pelagios (Elasmobranchii: Megachasmidae), landed at Fukuoka, Japan. In: Yano, K., Morrissey, J. F., Yabumoto, Y. and Nakaya, K. (Eds.). Biology of the Megamouth Shark. Tokai University Press, Japan, pp. 51-62.

Naylor, G.J.P. 1990. A morphometric approach to distinguish between the upper dentitions of Carcharhinus limbatus and C. brevipinna with comments on its application to tracing shark phylogenies through their fossil teeth. In: Pratt, H.L., Gruber, S.H. and Taniuchi, T. (Eds.). Elasmobranchs as living resources: advances in the biology, ecology, systematics and the status of the fisheries. U.S. Dept. of Commerce. NOAA Tech. Report NMFS 90.

Naylor, G.J.P. and Brown, W.M. 1995. Structural biology and phylogenetic estimation. Nature 388: 527-528.

Naylor, G.J.P. and Brown, W.M. 1998. Amphioxus mitochondrial DNA, phylogeny, and the limits of inference based on comparison of sequences. Syst. Biol. 47: 61-76.

Naylor, G.J.P. and Marcus, L.F. 1994. Identifying isolated shark teeth of the genus Carcharhinus to species: Relevance for tracking phyletic change through the fossil record. Am. Mus. Novit. 3109: 1-53.

Naylor, G.J.P., Collins, T.M. and Brown, W.M. 1995. Hydrophobicity and phylogeny. Nature 373: 565-566.

Naylor, G.J.P., Martin, A.P., Mattison, A.P. and Brown, W.M. 1997. Interrelationships of lamniform sharks: testing phylogenetic hypotheses with sequence data. Molecular Systematics of Fishes, Chapter 13 pp.199-218. Academic Press.

Nelson, D.R., McKibben, J.N., Strong, W.R., Lowe, C.G., Sisneros, J.A., Schroeder, D.M. and Lavenberg, R.J. 1997. An acoustic tracking of a megamouth shark, Megachasma pelagios: a crepuscular vertical migrator. Environ. Biol. Fish. 49: 389-399.

Nielsen, K.K. and Arctander, P. 2001. Recombination among multiple mitochondrial pseudogenes from a passerine genus. Mol. Phyl. Evol. 18: 362-369.

Noden, R.G. 1984. Another Goblin. Aust. Fisher. 43: 56.

Paabo, S., Thomas, W.K., Whitfield, K.M., Kumazawa, Y. and Wilson, A.C. 1991. Rearrangements of mitochondrial transfer RNA genes in marsupials. J. Mol. Evol. 33: 426-430.

Page, R.D.M. and Holmes, E.C. 1998. Molecular Evolution: A phylogenetic approach. Blackwell Science Ltd. United Kingdom, 346 pp.

Pardini, A.T., Jones, C.S., Noble, L.R., Kreiser, B., Malcolm, H., Bruce, B.D., Stevens, J.D., Cliff, G., Scholl, M.C., Francis, M., Duffy, C.A.J. and Martin, A.P. 2001. Sex-based dispersal of great white sharks. Nature 412: 139-140.

Parker, H.W. and Boseman, M. 1954. The basking shark, Cetorhinus maximus, in winter. Proc. Zool. Soc. Lond. 124: 185-194.

Paton, T., Haddrath, O. and Baker, A.J. 2002. Complete mitochondrial DNA genome sequences show that modern birds are not descended from transitional shorebirds. Proc. R. Soc. Lond. B 269: 839-846.

Paust, B. and Smith, R. 1986. Salmon Shark Manual: the development of a commercial salmon shark, Lamna ditropis, fishery in the North Pacific. Alaska Sea Grant Rep. 86-01. Petersburg, 430 pp.

Penny, D. and Hasegawa, M. 1997. The platypus in its place. Nature 387: 549-550.

Pepperell, J.G. 1992. Trends in the distribution, species composition, and size of sharks caught by gamefish anglers off South-eastern Australia. Aust. J. Mar. Freshw. Res. 43: 213-225.

Perna, N.T. and Kocher, T.D. 1995. Unequal base frequencies and the estimation of substitution rates. Mol. Biol. Evol. 12: 359-361.

Pereira, S.L. 2000. Mitochondrial genome organization and vertebrate phylogenetics. Gen. Mol. Biol. 23: 745-752.

Piotrovskiy, A.S. and Prut'ko, V.G. 1980. The occurrence of the goblin shark, Scapanorhynchus owstoni (Chondrichthyes, Scapanorhynchidae) in the Indian Ocean. J. Ichthyol. 20: 124-125.

Pledge, N.S. 1985. An early Pliocene shark tooth assemblage in South Australia. Spec. Publ., S.Aust. Dept. Mines and Energy. 5: 287-299.

Poe, S. 1998. The effect of taxonomic sampling on accuracy of phylogeny estimation: Test case of a known phylogeny. Mol. Biol. Evol. 15: 1086-1090.

Poe, S. and Swofford, D.L. 1999. Taxon sampling revisited. Nature 398: 299-300.

Pollard, D.A. 1996. The biology and conservation status of the grey nurse shark (Carcharias taurus Rafinesque 1810) in New South Wales, Australia. Aquatic Conservation: Marine and Freshwater Ecosystems. 6: 1-20.

Pollock, D.D. and Bruno, W.J. 2000. Assessing an unknown evolutionary process: effect of increasing site-specific knowledge through taxon addition. Mol. Biol. Evol. 17: 1854-1858.

Pollock, D.D., Eisen, J.A., Doggett, N.A. and Cummings, M.P. 2000. A case for evolutionary genomics and the comprehensive examination of sequence biodiversity. Mol. Biol. Evol. 17: 1776-1788.

Pollock, D.D., Zwickl, D.J., McGuire, J.A. and Hillis, D.M. 2002. Increased taxon sampling is advantageous for phylogenetic inference. Syst. Biol. 51: 664-671.

Posada, D. and Crandall, K.A. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics Applic. Note 14: 817-818.

Pratt, H.L., Jr and Casey, J.G. 1983. Age and growth of the shortfin mako, Isurus oxyrinchus. NOAA Tech. Rep. NMFS 8: 175-177.

Purdy, R.W. 1996. Paleoecology of fossil white sharks. In: Klimley, A.P. and D.G. Ainley (Eds.). Great White Sharks. The Biology of Carcharodon carcharias. Academic Press, San Diego, pp. 67-78.

Purdy, R.W., Schneider, V.P., Applegate, S.P. et al. 2001. The Neogene Sharks, Rays and Bony Fishes from Lee Creek Mine, Aurora, North Carolina. Smithson. Contrib. Paleobiol. 90: 71-202.

Purvis, A. and Quicke, D.L.J. 1997. Building phylogenies: are the big easy? TREE 12: 49-50.

Pyle, P., Schramm, M.J., Keiper, C. and Anderson, S.D. 1999. Predation on a white shark (Carcharodon carcharias) by a killer whale (Orcinus orca) and a possible case of competitive displacement. Mar. Mammal Res. 15: 563-568.

Quicke, D.L.J. 1993. Principles and techniques of contemporary taxonomy. Chapman and Hall, Great Britain, 311 pp.

Quinn, T.W. and Mindell, D.P. 1996. Mitochondrial gene order adjacent to the control region in crocodile, turtle, and tuatara. Mol. Phylogenet. Evol. 5: 344-351.

Quinn, T.W. and Wilson, A.C. 1993. Sequence evolution in and around the mitochondrial control region in birds. J. Mol. Evol. 37: 417-425.

Ramirez, V., Savoie, P. and Morais, R. 1993. Molecular characterization and evolution of a duck mitochondrial genome. J. Mol. Evol. 37: 296-310.

Rand, D.M. 1993. Endotherms, ectotherms, and mitochondrial genome-size variation. J. Mol. Evol. 37: 281-295.

Randall, J.E. 1973. Size of the great white shark (Carcharodon). Science 181: 169-170.

Rannala, B., Huelsenbeck, J.P., Yang, Z. and Nielsen, R. 1998. Taxon sampling and the accuracy of large phylogenies. Syst. Biol. 47: 702-710.

Rasmussen, A.-S. and Arnason, U. 1999a. Molecular studies suggest that cartilaginous fishes have a terminal position in the piscine tree. Proc. Natl. Acad. Sci. U.S.A. 96: 2177-2182.

Rasmussen, A.-S. and Arnason, U. 1999b. Phylogenetic studies of complete mitochondrial DNA molecules place cartilaginous fishes within the tree of bony fishes. J. Mol. Evol. 48: 118-123.

Regan, C.T. 1906. A classification of the selachian fishes. Proc. Zool. Soc. Lond. 1906: 722-758.

Reyes, A., Pesole, G. and Saccone, C. 1998. Complete mitochondrial DNA sequence of the fat dormouse, Glis glis; further evidence of rodent paraphyly. Mol. Biol. Evol. 15: 499-505.

Romanov, E.V. and Samorov, V.V. 1994. On discoveries of the crocodile shark, Pseudocarcharias kamoharai (Pseudocarchariidae) in the Equatorial Indian Ocean. J. Ichthyol. 34: 155-157.

Rosenberg, M.S. and Kumar, S. 2003. Taxon sampling, bioinformatics, and phylogenomics. Syst. Biol. 52: 119-124.

Russo, C.A.M., Takezaki, N. and Nei, M. 1996. Efficiencies of different genes and different tree- building methods in recovering a known vertebrate phylogeny. Mol. Biol. Evol. 13: 525-536.

Saccone, C., Pesole, G. and Sbisa, E. 1991. The main regulatory region of mammalian mitochondrial DNA: Structure-function model and evolutionary pattern. J. Mol. Evol. 33: 83-91.

Sadowsky, V., Amorim, A.F. and Arfelli, C.A. 1984. Second occurrence of Odontaspis noronhai (Maul, 1955). B. Inst. Pesca. 11: 69-79.

Salisbury, B.A. and Kim, J. 2001. Ancestral state estimation and taxon sampling density. Syst. Biol. 50: 557-564.

Satchell, G.H. 1991. Physiology and form of fish circulation. Cambridge University Press, Great Britain. 235 pp.

Savolainen, P., Arvestad, L. and Lundeberg, J. 2000. mtDNA tandem repeats in domestic dogs and wolves: mutation mechanisms studied by analysis of the sequence of imperfect repeats. Mol. Biol. Evol. 17: 474-488.

Shevchuk, N.A. and Allard, M.W. 2001. Sources of incongruence among mammalian mitochondrial sequences: COII, COIII, and ND6 genes are main contributors. Mol. Phyl. Evol. 21: 43-54.

Seigel, J.A. and Compagno, L.J.V. 1985. New records of the ragged-tooth shark, Odontaspis ferox, from California waters. Calif. Fish and Game. 72: 172-176.

Seret, B. 1995. First record of a megamouth shark (Chondrichthyes Megachasmidae) in the Atlantic Ocean, off Senegal. Cybium. 19: 425-427.

Schmid, T.H., Murru, F.L. and McDonald, F. 1990. Feeding habits and growth rates of bull (Carcharhinus leucas (Valenciennes)), sandbar (Carcharhinus plumbeus (Nardo)), sandtiger (Eugomphodus taurus (Rafinesque)) and nurse (Ginglymostoma cirratum (Bonnaterre)) sharks maintained in captivity. J. Aquariculture Aquatic Sci. 5: 100-105.

Shao, R. and Barker, S.C. 2003. The highly rearranged mitochondrial genome of the plague thrips, Thrips imaginis (Insecta: Thysanoptera): Convergence of two novel gene boundaries and an extraordinary arrangement of rRNA genes. Mol. Biol. Evol. 20: 362-370.

Shimada, K. 1997a. Gigantic lamnoid shark vertebra from the lower Cretaceous Kiowa shale of Kansas. J. Paleontol. 71: 522-524.

Shimada, K. 1997b. Skeletal anatomy of the late Cretaceous lamniform shark, Cretoxyrhina mantelli, from the Niobrara chalk in Kansas. J. Vert. Paleontol. 17: 642-652.

Shimada, K. 1997c. Dentition of the late Cretaceous lamniform shark, Cretoxyrhina mantelli, from the Niobrara chalk in Kansas. J. Vert. Paleontol. 17: 269-279.

Shimada, K. 1997d. Paleoecological relationships of the late Cretaceous lamniform shark, Cretoxyrhina mantelli (Agassiz). J. Paleontol. 71: 926-933.

Shimada, K. 1997e. Stratigraphic record of the late Cretaceous lamniform shark, Cretoxyrhina mantelli (Agassiz), in Kansas. Trans. Kansas Acad. Sci. 100: 139-149.

Shimada, K. 1999. Contribution of dental characters to elucidate the phylogeny of lamniform sharks. J. Vert. Paleontol. 19: 75A.

Shimada, K. 2001. Notes on the dentition of the bigeye sandtiger shark, Odontaspis noronhai (Lamniformes: Odontaspididae). J. Fossil Res. 34: 15-17.

Shimada, K. 2002. Teeth of embryos in lamniform sharks (Chondrichthyes: Elasmobranchii). Environ. Biol. Fish. 63: 309-319.

Shimada, K. and Hubbell, G. 2001. Identity of small symmetrical teeth of the late Cretaceous lamniform shark, Cretoxyrhina mantelli, from Western Kansas, U.S.A. J. Fossil Res. 34: 55-57.

Shirai, S. 1992a. Squalean phylogeny: A new framework of "squaloid" sharks and related taxa. Hokkaido University Press, Sapporo. 151 pp. + plates.

Shirai, S. 1992b. Phylogenetic relationships of the angel sharks, with comments on elasmobranch phylogeny (Chondrichthyes, Squatinidae). Copeia 1992: 505-518.

Shirai, S. 1996. Phylogenetic interrelationships of neoselachians (Chondrichthyes: Euselachii). In: Stiassny, M.L.J., Parenti, L.R. and Johnson, G.D. (Eds.). Interrelationships of fishes, pp. 9-34. Academic Press, San Diego.

Sims, D.W. and Quayle, V.A. 1998. Selective foraging behavior of basking sharks on zooplankton in a small-scale front. Nature 393: 460-464.

Sims, D.W., Southall, E.J., Quayle, V.A. and Fox, A.M. 2000. Annual social behavior of basking sharks associated with coastal front areas. Proc. R. Soc. Lond. B. 267: 1897-1904.

Siverson, M. 1992. Biology, dental morphology and taxonomy of lamniform sharks from the Campanian of the Kristianstad Basin, Sweden. Palaeontol. 35: 519-554.

Siverson, M. 1995. Revision of the Danian cow sharks, sand tiger sharks and goblin sharks (Hexanchidae, Odontaspididae, and Mitsukurinidae) from southern Sweden. J. Vert. Paleontol. 15: 1-12.

Siverson, M. 1996. Lamniform sharks of the mid Cretaceous Alinga formation and Beedagong Claystone, Western Australia. Palaeontol. 39: 813-849.

Siverson, M. 1997. Sharks from the Mid-Cretaceous Gearle Siltstone, Southern Carnarvon Basin, Western Australia. J. Vert. Paleontol. 17: 453-465.

Siverson, M. 1999. A new large lamniform shark from the uppermost Gearle Siltstone (Cenomanian, Late Cretaceous) of Western Australia. Trans. R. Soc. Edinburgh Earth Sci. 90: 49-66.

Smith, A.K. and Pollard, D.A. 1999. Threatened fishes of the world: Carcharias taurus (Rafinesque, 1810) (Odontaspididae). Environ. Biol. Fish. 56: 365.

Smith, R.L. and Rhodes, D. 1984. Body temperature of the salmon shark, Lamna ditropis. J. Mar. Biol. Ass. U.K. 63: 243-244.

Springer, M.S., DeBry, R.W., Douady, C., Amrine, H.M., Madsen, O., de Jong, W.W. and Stanhope, M.J. 2001. Mitochondrial versus nuclear gene sequence in deep-level mammalian phylogeny reconstruction. Mol. Biol. Evol. 18: 132-143.

Springer, M.S. and Douzery, E. 1996. Secondary structure and patterns of evolution among mammalian mitochondrial 12S rRNA molecules. J. Mol. Evol. 43: 357-373.

Springer, S. and Gilbert, P.W. 1976. The basking shark, Cetorhinus maximus, from Florida and California, with comments on its biology and systematics. Copeia. 1976: 47-54.

Stevens, J.D. 1984. Biological observations on sharks caught by sport fishermen off New South Wales. Aust. J. Mar. Freshw. Res. 35: 573-590.

Stevens, J.D. and Paxton, J.R. 1985. A new record of the goblin shark, Mitsukurina owstoni (Family Mitsukurinidae), from eastern Australia. Proc. Linn. Soc. N.S.W. 108: 37-45.

Stewart, A.L. 2001. First record of the crocodile shark, Pseudocarcharias kamoharai (Chondrichthyes: Lamniformes), from New Zealand. N.Z. J. Mar. Freshw. Res. 35: 1001-1006.

Stewart, A.L. and Clark, M.R. 1988. Records of three families and four species of fish new to the New Zealand fauna. N.Z. J. Zool. 15: 577-583.

Stewart, D.T. and Baker, A.J. 1994. Patterns of sequence variation in the mitochondrial D-loop region of shrews. Mol. Biol. Evol. 11: 9-21.

Stillwell, C.E. and Casey, J.G. 1976. Observations on the bigeye thresher shark, Alopias superciliosus, in the western north Atlantic. Fish. Bull. 74: 221-225.

Stillwell, C.E. and Kohler, N.E. 1982. Food, feeding habits, and estimates of daily ration of the shortfin mako (Isurus oxyrinchus) in the Northwest Atlantic. Can. J. Fish. Aquat. Sci. 39: 407-414.

Stott, F.C. 1982. A note on catches of basking sharks, Cetorhinus maximus (Gunnerus), off Norway and their relation to possible migration paths. J. Fish Biol. 21: 227-230.

Stribling, M.D., Hamlett, W.C. and Wourms, J.P. 1980. Developmental efficiency of oophagy, a method of viviparous embryonic nutrition displayed by the sand tiger shark (Eugomphodus taurus). Bull. So. Carolina Acad. Sci. XLII: 111.

Strong, W.R., Jr. 1996. Repetitive aerial gaping: a thwart-induced behavior in white sharks. In: Klimley, A.P. and Ainley, D.G. (Eds.). Great White Sharks. The Biology of Carcharodon carcharias. Academic Press, San Diego, pp. 207-215.

Sullivan, J., Holsinger, K.E. and Simon, C. 1995. Among-site rate variation and phylogenetic analysis of 12S rRNA in sigmodontine rodents. Mol. Biol. Evol. 12: 988-1001.

Sullivan, J., Swofford, D.L. and Naylor, G.J.P. 1999. The effect of taxon sampling on estimating rate heterogeneity parameters of maximum-likelihood models. Mol. Biol. Evol. 16: 1347-1356.

Takada, K., Hiruda, H., Wakisaka, S., Mori, T. and Nakaya, K. 1997. Capture of the first female megamouth shark, Megachasma pelagios, from Hakata Bay, Fukuoka, Japan. In: Yano, K., Morrissey, J. F., Yabumoto, Y. and Nakaya, K. (Eds.). Biology of the Megamouth Shark. Tokai University Press, Japan, pp. 3-9.

Takezaki, N. and Gojobori, T. 1999. Correct and incorrect vertebrate phylogenies obtained by the entire mitochondrial DNA sequences. Mol. Biol. Evol. 16: 590-601.

Tamura, K. and Nei, M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10: 512-526.

Taylor, L.R., Compagno, L.J.V. and Struhsaker, P.J. 1983. Megamouth - a new species, genus, and family of lamnoid shark (Megachasma pelagios, Family Megachasmidae) from the Hawaiian Islands. Proc. Calif. Acad. Sci. 43: 87-110.

Townsend, T. and Larson, A. 2002. Molecular phylogenetics and mitochondrial genomic evolution in the Chamaeleonidae (Reptilia, Squamata). Mol. Phyl. Evol. 23: 22-36.

Tricas, T.C. 1985. Feeding ethology of the great white shark, Carcharodon carcharias. Mem. So. Calif. Acad. Sci. 9: 81-91.

Turner, S. 1985. Remarks on the early history of chondrichthyans, thelodonts, and some higher elasmobranchs. In: Hornibrook Symposium, 1985, extended abstracts. N.Z. Geol. Surv. Rec. 9: 93-95.

Uchida, S., Toda, M., Teshima, K. and Yano, K. 1996. Pregnant white sharks and full term embryos from Japan. In: Klimley, A.P. and Ainley, D.G. (Eds.). Great White Sharks. The Biology of Carcharodon carcharias. Academic Press, San Diego, pp. 139-155.

Ugoretz, J.K. and Seigel, J.A. 1999. First Eastern Pacific record of the goblin shark Mitsukurina owstoni (Lamniformes: Mitsukurinidae). Calif. Fish and Game. 85: 118-120.

Uyeno, T., Hasegawa, Y. and Tamotsu, K. 1980. Some shark teeth from Miocene Ichishi Formation in Mie Prefecture, Japan. Bull. Natn. Sci. Mus., Ser. C (Geol). 6: 125-128.

Uyeno, T., Nakamura, K. and Mikami, S. 1976. On the body coloration and an abnormal specimen of the goblin shark, Mitsukurina owstoni Jordan. Bull. Kanagawa Pref. Mus. 9: 67-72.

Uyeno, T., Sakamoto, O. and Sekine, H. 1989. Description of an almost complete tooth set of Carcharodon megalodon from a middle Miocene bed in Saitama Prefecture, Japan. Bull. Saitama Mus. Nat. Hist. 7: 73-85.

Villavicencio-Garayzar, C.J. 1996. The ragged tooth shark, Odontaspis ferox (Risso, 1810), in the Gulf of California. Cal. Fish and Game. 82: 195-196.

Ward, D. J. 1978. Additions to the fish fauna of the English Palaeogene. I. Two new species of Alopias (thresher shark) from the English Eocene. Tertiary Res. 2: 23-28.

Weihs, D. 1999. No hibernation for basking sharks. Nature 400: 717-718.

White, E.G. 1936. A classification and phylogeny of the elasmobranch fishes. Am. Mus. Novit. 837: 1-16.

White, E.G. 1937. Interrelationships of the elasmobranchs with a key to the order Galea. Bull. Am. Mus. Nat. Hist. 74: 25-138.

Whitmore, D.H., Thai, T.H. and Craft, C.M. 1994. The largemouth bass cytochrome b gene. J. Fish. Biol. 44: 637-645.

Wilson, A.C., Cann, R.L., Carr, S.M., George, M., Gyllensten, U.B., Helm-Bychowski, K.M., Higuchi, R.G., Palumbi, S.R., Prager, E.M., Sage, R.D. and Stoneking, M. 1985. Mitochondrial DNA and two perspectives on evolutionary genetics. Biol. J. Linn. Soc. 26: 375-400.

Wolstenholme, D.R. 1992. Animal mitochondrial DNA: structure and evolution. Int. Rev. Cytol. 141: 173-216.

Wourms, J.P. and Demski, L.S. 1993. The reproduction and development of sharks, skates, rays and ratfishes: introduction, history, overview, and future prospects. Environ. Biol. Fish. 38: 7-21.

Wu, W., Schmidt, T.R., Goodman, M. and Grossman, L.I. 2000. Molecular evolution of cytochrome c oxidase subunit I in Primates: Is there coevolution between mitochondrial and nuclear genomes? Mol. Phyl. Evol. 17: 294-304.

Xu, X., Janke, A. and Arnason, U. 1996. The complete mitochondrial DNA sequence of the greater Indian rhinoceros, Rhinoceros unicornis, and the phylogenetic relationships among Carnivora, Perissodactyla, and Artiodactyla (+ Cetacea). Mol. Biol. Evol. 13: 1167-1193.

Yabe, H. and Goto, M. 1996. Fossil shark teeth of the genus Carcharocles (Elasmobranchii: Lamniformes) from the middle Miocene at Kuzubukuro, Higashi-Matsuyama City, Saitama Prefecture, central Japan. Earth Sci. 50: 432-440.

Yabe, H. 2000. Teeth of an extinct great white shark Carcharodon sp., from the Neogene Senhata Formation, Miura Group, Chiba Prefecture, Japan. Tertiary Res. 29: 95-105.

Yabumoto, 1987. Oligocene lamnid shark of the genus Carcharodon from Kitakyushu, Japan. Bull. Kitakyushu Mus. Nat. Hist. 6: 239-264.

Yamauchi, M.M., Miya, M.U. and Nishida, M. 2003. Complete mitochondrial DNA sequence of the swimming crab, Portunus trituberculatus (Crustacea: Decapoda: Brachyura). Gene (Amsterdam) 311: 129-135.

Yano, K., Goto, M. and Yabumoto, Y. 1997. Dermal and mucous denticles of a female megamouth shark, Megachasma pelagios, from Hakata Bay, Japan. In: Yano, K., Morrissey, J.F., Yabumoto, Y. and Nakaya, K. (Eds.). Biology of the Megamouth Shark. Tokai University Press, Japan, pp. 77-91.

Yano, K., Toda, M., Uchida, S. and Yasuzumi, F. 1997b. Gross anatomy of the viscera and stomach contents of a megamouth shark, Megachasma pelagios, from Hakata Bay, Japan, with a comparison of the intestinal structure of other planktivorous elasmobranches. In: Yano, K., Morrissey, J.F., Yabumoto, Y. and Nakaya, K. (Eds.). Biology of the Megamouth Shark. Tokai University Press, Japan, pp. 105-113.

Yano, K., Tsukada, O. and Furuta, M. 1998. Capture of megamouth shark No. 12 from Atawa, Mie, Japan. Ichthyol. Res. 45: 424-426.

Yano, K., Yabumoto, Y., Tanaka, S., Tsukada, O. and Furuta, M. 1997a. Capture of a mature female megamouth shark, Megachasma pelagios, from Mie, Japan. pp. 335-349. In Seret, B. and J.Y. Sire (Eds.). Proc. 5th Indo-Pac. Fish. Conf., Noumea, 1997. Soc. Fr. Ichtyol., Paris.

Yano, K., Yabumoto, Y., Tanaka, S., Tsukada, O. and Furuta, M. 1999. Capture of a mature female megamouth shark, Megachasma pelagios, from Mie, Japan. pp. 335-349. In Seret, B. and J.Y. Sire (Eds.). Proc. 5th Indo-Pac. Fish. Conf., Noumea, 1997. Soc. Fr. Ichtyol., Paris.

Yoder, A.D., Vilgalys, R. and Ruvolo, M. 1996. Molecular evolutionary dynamics of a cytochrome b in strepsirrhine primates: The phylogenetic significance of third-position transversions. Mol. Biol. Evol. 13: 1339-1350.

Yoneyama, Y. 1987. The nucleotide sequence of the heavy and light strand replication origins of the Rana catesbeiana mitochondrial genome. J. Nippon Med. Sch. 54: 429-440.

Zardoya, R., Cao, Y., Hasegawa, M. and Meyer, A. 1998. Searching for the closest living relative(s) of tetrapods through evolutionary analyses of mitochondrial and nuclear data. Mol. Biol. Evol. 15: 506-517.

Zardoya, R. and Meyer, A. 1996. The complete nucleotide sequence of the mitochondrial genome of the lungfish (Protopterus dolloi) supports its phylogenetic position as a close relative of land vertebrates. Genetics 142: 1249-1263.

Zevering, C.E., Moritz, C., Heideman, A. and Sturm, R.A. 1991. Parallel origins of duplications and the formation of pseudogenes in mitochondrial DNA from parthenogenic lizards (Heteronotia binoei; Gekkonidae). J. Mol. Evol. 33: 431-441.

Zwickl, D.J. and Hillis, D.M. 2002. Increased taxon sampling greatly reduces phylogenetic error. Syst. Biol. 51: 588-598.

ACKNOWLEDGEMENTS

First and foremost I would like to thank my husband Tim, whose continued support, understanding and devotion enabled me to finish my "tome" on lamniform sharks. To my family, your encouragement, especially during difficult and trying times, helped me immensely. I would also like to thank Janine Caira for never giving up on me, even when at times I questioned my own abilities. To my fellow lab mates, Julie Ryburn, for teaching me molecular techniques; Olivier Fedrigo, for always being there to answer "one more question"; Aspen Garry, you dragged me kicking and screaming into the 21st century; and Vicente Faria, what else can I say: you always make me smile. To Jeff Boore and Matt Fourcade at DOE/JGI for all your hard work; this project would not have been possible without you. To the members of my POS committee, Dean Adams, Bonnie Bowen and Jonathan Wendel, thank you for helpful comments on my thesis. To my advisor, Gavin Naylor, when I told you I was going to write this monster of a thesis, you could have stopped me, but thank you for allowing me to see what I could achieve. And finally, to Eugenie Clark, who showed me that a kid from Queens could study sharks. Your humility, grace and perseverance will always be an inspiration to me.

APPENDIX

Annotated sequences for the complete genomes of 15 species of lamniform sharks (Vertebrata: Chondrichthyes: Elasmobranchii: Neoselachii: Lamniformes).

Alopias pelagicus mitochondrion, complete genome

1 GCTAGTGTAG CTTAATTTAA AGCATGGCAC TGAAGATGCT AAGATGAAAA CGATCACATC GAATTAAATT TCGTACCGTG ACTTCTACGA TTCTACTTTT ACAAAGGTTT 51 ATP.~A.AAATTT TTCCACGAGC GGTCCTGGCC TCAGTGTTAA TATTTTTAAA AAGGTGCTCG TGTTTCCAAA CCAGGACCGG AGTCACAATT 101 TTGTAACTAA AATTATACAT GCAAGTTTCA ACATCCCTGT GAGAATGCCC AACATTGATT TTAATATGTA CGTTCAAAGT TGTAGGGACA CTCTTACGGG 151 TAATTATTCT ATCAATTAAT TAGGAGCGGG TATCAGGCAC GCACACGCAG ATTAATAAGA TAGTTAATTA ATCCTCGCCC ATAGTCCGTG CGTGTGCGTC 201 CCCAAGACAC CTTGCTAAGC CACACCCCCA AGGGATTTCA GCAGTAATAA GGGTTCTGTG GAACGATTCG GTGTGGGGGT TCCCTAAAGT CGTCATTATT 251 ATATTGATCT CATAAGCGCA AGCTTGAATC AGTTAAAGTT AACAGAGTTG TATAACTAGA GTATTCGCGT TCGAACTTAG TCAATTTCAA TTGTCTCAAC 301 GTAAATCTCG TGCCAGCCAC CGCGGTTATA CGAGTAACTC ACATTAACAT CATTTAGAGC ACGGTCGGTG GCGCCAATAT GCTCATTGAG TGTAATTGTA ACTA.AAGTTA 351 TTGCCCGGCG TAAAGAGTGA TTTAAGGAGT ATCTATTATA AACGGGCCGC ATTTCTCACT AAATTCCTCA TAGATAATAT TGATTTCAAT 401 AGACCTCGTC AAGCTGTTAC ATGCACCCAC GAATGGAATT ACCAACAACG TCTGGAGCAG TTCGACAATG TACGTGGGTG CTTACCTTAA TGGTTGTTGC 451 AAGGTGACTT TACTCCACCA GAAATCTTGA TGTCACGACA GTTAGACCCC TTCCACTGAA ATGAGGTGGT CTTTAGAACT ACAGTGCTGT CAATCTGGGG 501 A.AACTAGGAT TAGATACCCT ACTATGTCTA ACCGCAAACT TAAACAATAA TTTGATCCTA ATCTATGGGA TGATACAGAT TGGCGTTTGA ATTTGTTATT 551 TTTACTATAT TGTTCGCCAG AGTACTACAA GCGCTAGCTT AAAACCCAAA AAATGATATA ACAAGCGGTC TCATGATGTT CGCGATCGAA TTTTGGGTTT 601 GGACTTGGCG GTGTCCCAAA CCCACCTAGA GGAGCCTGTT CTGTAACCGA CCTGAACCGC CACAGGGTTT GGGTGGATCT CCTCGGACAA GACATTGGCT 651 TAATCCCCGT TAAACCTCAC CACTTCTGGC CATCCCCGTC TATATACCGC ATTAGGGGCA ATTTGGAGTG GTGAAGACCG GTAGGGGCAG ATATATGGCG 701 CGTCGTCAGC TCACCCTGTG AAGGTAATAA AAGTAAGCAA AAAGAACTAA GCAGCAGTCG AGTGGGACAC TTCCATTATT TTCATTCGTT TTTCTTGATT 751 CTTCCACACG TCAGGTCGAG GTGTAGCAAA TGAAGTGGAT AGAAATGGGC GAAGGTGTGC AGTCCAGCTC CACATCGTTT ACTTCACCTA TCTTTACCCG 801 TACATTTTCT ATAAAGAAAA CACGAATGGT AAACTGAAAA ATTACCTAAA ATGTAAAAGA TATTTCTTTT GTGCTTACCA TTTGACTTTT TAATGGATTT 851 GGCGGATTTA GCAGTAAGAA AAGATTAGAG AACTTCTCTG A.AACTGGCTC 
CCGCCTAA.AT CGTCATTCTT TTCTAATCTC TTGAAGAGAC TTTGACCGAG 901 TGGGAC GC GC ACACACCGCC CGTCACTCTC CTC T CTACTTATTC ACCCTGCGCG TGTGTGGCGG GCAGTGAGAG GAGTTTTTTA GATGAATAAG 951 TTAATTAA.AA GAAAACCACC AAGAGGAGGC AAGTCGTAAC ATGGTAAGTG AATTAATTTT CTTTTGGTGG TTCTCCTCCG TTCAGCATTG TACCATTCAC 1001 TACTGGAAAG TGCACTTGGA ATCP~AAATGT GGC TA.AAC CA GCAAAGCACC ATGACCTTTC ACGTGAACCT TAGTTTTACA CCGATTTGGT CGTTTCGTGG 1051 TCCCTTACAC CGAGGAGATA CCCGTGCAAT TCGGATCATT TTGAACATTA AGGGAATGTG GCTCCTCTAT GGGCACGTTA AGCCTAGTAA AACTTGTAAT 1101 AAGCTAGCCT GTATACCTAA CCTTAA.ACCT AACCTTATTA ATTACCTTAT TTCGATCGGA CATATGGATT GGAATTTGGA TTGGAATAAT TAATGGAATA 1151 ACACAAATCC ATAATTAAAA CATTTTACAT TTTTAGTATG GGCGACAGAA TGTGTTTAGG TATTAATTTT GTA.A.AATGTA AAA.ATCATAC CCGCTGTCTT 182

TGP~~AAAGA 12 01 CP►~~A,AACTCA GCGCAATAGA CCATGTACCG CAAGGGAAAG C GTTTTTGAGT CGCGTTATCT GGTACATGGC GTTCCCTTTC GACTTTTTCT 1251 AATGAAATAA ATAATTAAAG TAACP►AAA.AG CAGAGATTTA ACCTCGTACC TTACTTTATT TATTAATTTC ATTGTTTTTC GTCTCTAAAT TGGAGCATGG 13 01 TTTTGCATCA TGATTTAGCT AGP►AAAAC TA GACAAAGAGA TCTTAAGCCT A.A.AAC GTAGT ACTAAATCGA TCTTTTTGAT CTGTTTCTCT AGAATTCGGA 1351 ATCCCCCCGA AACTAAACGA GCTACTCCGA AGCAGCATAA ATAGAGCCAA TAGGGGGGCT TTGATTTGCT CGATGAGGCT TCGTCGTATT TATCTCGGTT 1401 CCCGTCTCTG TGGCAAAAGA GTGGGAAGAC TTCCGAGTAG CGGTGACAAA GGGCAGAGAC ACCGTTTTCT CACCCTTCTG AAGGCTCATC GCCACTGTTT 1451 CCTATCGAGT TTAGTGATAG CTGGTTGCCT AAGAAAAGAA CTTTAATTCT GGATAGCTCA AATCACTATC GACCAACGGA TTCTTTTCTT GAAATTAAGA 1501 GCATTAATTC CTTCATCACC P.~P~AAAGTC TA TCTTATTAAG GTCAAACATA CGTAATTAAG GAAGTAGTGG TTTTTCAGAT AGAATAATTC CAGTTTGTAT 1551 P.~AA.ATTAATA GTTATTCAGA AGAGGTACAG CCCTTCTGAA CCAAGATACA TTTTAATTAT CAATAAGTCT TCTCCATGTC GGGAAGACTT GGTTCTATGT 1601 ACTTTTAAAG GC GGA,AAATG ATCATATTTA CCAAGGTTTC TACCTCAGTA TGAA.AATTTC CGCCTTTTAC TAGTATA.AAT GGTTCCAAAG ATGGAGTCAT 1651 GGCTCAAAAG CAGCCACCTG TAAAGTAAGC GTCACAGCTC CAGTTCCTCG CCGAGTTTTC GTCGGTGGAC ATTTCATTCG CAGTGTCGAG GTCAAGGAGC 1701 P.~AAAC C TATA ATTTAGATAT TTTCTCATAA CCCCCTTAAC CCATATTAGG TTTTGGATAT TAAATCTATA AAAGAGTATT GGGGGAATTG GGTATAATCC 1751 CTATTTTATA AAATTATAAA AGAACTTATA C TAAA.ATGAG TAATAAGAGA GATAAAATAT TTTAATATTT TCTTGAATAT GATTTTACTC ATTATTCTCT 1801 ACAAACCTCT CCAGACACGA GTGTATGTTA GAAAGAATTA AATCACTAAC TGTTTGGAGA GGTCTGTGCT CACATACAAT CTTTCTTAAT TTAGTGATTG 1851 AATTA.AAC GA ACCCAGATTG AGGCTATTAT ATTAACATTA CTTTAACTAG TTAATTTGCT TGGGTCTAAC TCCGATAATA TAATTGTAAT GAAATTGATC 1901 AAAATCCTAT TACATTACTC GTTAACCCTA CACAGGAGTG TCTTATGGAA TTTTAGGATA ATGTAATGAG CAATTGGGAT GTGTCCTCAC AGAATACCTT 1951 AGATTAAAAG AAAATAA.AGG AACTCGGCAA ACACAAACTC CGCCTGTTTA GCGGACAA.AT TCTAATTTTC TTTTATTTCC TTGAGCCGTT TGTGTTTGAG 2001 C CP~~AA.ACAT CGCCTCTTGA ATATTATAAG AGGTCCCGCC TGCCCTGTGA GGTTTTTGTA GC GGAGAAC T TATAATATTC TCCAGGGCGG ACGGGACACT 2051 
CAATGTTTTA ACGGCCGCGG TATTTTGACC GTGCAAAGGT AGCGTAATCA GTTACAAAAT TGCCGGCGCC ATAAAACTGG CACGTTTCCA TCGCATTAGT 2101 CTTGTCTTTT AAATGAAGAC C C GTAT GA.AA GGCATCACGA GAGTTTAACT GAACAGAAAA TTTACTTCTG GGCATACTTT CCGTAGTGCT CTCA.AATTGA 2151 GTCTCTATTT TCTAATCAAT GA.AATTGATC TACTCGTGCA GAAGCGAGTA CAGAGATA.AA AGATTAGTTA CTTTAACTAG ATGAGCACGT CTTCGCTCAT 2201 TAATCACATC AGACGAGAAG ACCCTATGGA GCTTCAAACA CATAAATTAA ATTAGTGTAG TCTGCTCTTC TGGGATACCT CGAAGTTTGT GTATTTAATT 2251 CTACATAAAT TAATTATTCC ACGGATATAA ATP.►A.AAATAT AGTACCTTTA GATGTATTTA ATTAATAAGG TGCCTATATT TATTTTTATA TCATGGAAAT 2301 ATTTAACTGT TTTTGGTTGG GGTGACCAAG GGGP.~A.AA.ACA AATCCCCCTT TA.AATTGACA AAAACCAACC CCACTGGTTC CCCTTTTTGT TTAGGGGGAA 2351 ATCGACTGAG TACTCAAGTA C TTAGA.AATT AGATTTACAA TTCTAATTAA TAGCTGACTC ATGAGTTCAT GAATCTTTAA TCTAAATGTT AAGATTAATT 2401 TP~AA~.TATTT ATCGAACAAT GACCCAGGAT TTCCTGATCA ATGAACCAAG ATTTTATAAA TAGCTTGTTA CTGGGTCCTA AAGGACTAGT TACTTGGTTC 2451 TTACCCTAGG GATAACAGCG CAATCCTTTC TCAGAGTCCC TATCGCCGAA AATGGGATCC CTATTGTCGC GTTAGGAAAG AGTCTCAGGG ATAGCGGCTT 2501 AGGGTTTACG ACCTCGATGT TGGATCAGGA CATCCTAATG ATGCAACCGT TCCCAAATGC TGGAGCTACA ACCTAGTCCT GTAGGATTAC TACGTTGGCA 2551 TATTAAGGGT TCGTTTGTTC AACGATTAAT AGTCCTACGT GATCTGAGTT 183

ATAATTCCCA AGCA.AACAAG TTGCTAATTA TCAGGATGCA CTAGACTCAA 2601 CAGACCGGAG AAATCCAGGT CAGTTTCTAT CTATGAATTA ATTTTTCCTA GTCTGGCCTC TTTAGGTCCA GTCA.AAGATA GATACTTAAT TP.►A.P.~AAGGAT 2651 GTACGAAAGG AC C GGP~~?~A TGGAGCCAAT ACCCCAGGCA CGCTCCATTT CATGCTTTCC TGGCCTTTTT ACCTCGGTTA TGGGGTCCGT GCGAGGTAAA 2701 TCATCTATTG TAATA.AAC TA AAATAGATAA GAAAAAATTA TCTATTACCC AGTAGATAAC ATTATTTGAT TTTATCTATT CTTTTTTAAT AGATAATGGG 2751 AAGA►AA.AGGG TTGTTGAGGT GGCAGAGCCT GGTAAGTGCA A.AAGAC C TAA TTCTTTTCCC AACA.ACTCCA CCGTCTCGGA CCATTCACGT TTTCTGGATT 2801 GCTCTTTAAT TCAGAGGTTC A.AATC C TC TC CTCAACCATG CTTGAAAACC CGAGAAATTA AGTCTCCAAG TTTAGGAGAG GAGTTGGTAC GAACTTTTGG 2851 TCCTACTTTA CCTAATTAAT CCACTTACCT ATATTATTCC CATCCTATTA AGGATGAAAT GGATTAATTA GGTGAATGGA TATAATAAGG GTAGGATAAT 2901 GCCACAGCTT TCCTCACCCT AGTTGAACGA AAAATCCTGG GCTATATACA CGGTGTCGAA AGGAGTGGGA TCAACTTGCT TTTTAGGACC CGATATATGT 2951 ACTCCGGAAA GGCCCCAACA TCGTAGGTCC TTATGGACTC CTTCAACCAA TGAGGCCTTT CCGGGGTTGT AGCATCCAGG AATACCTGAG GAAGTTGGTT 3001 TCGCAGATGG ACT~TTA TTTATTAAAG AACCCATCCA CCCATCAACA AGCGTCTACC TGATTTTAAT AAATAATTTC TTGGGTAGGT GGGTAGTTGT 3051 TCCTCCCCAT TTCTATTTTT AGCTACCCCC ACAATAGCCC TAACACTAGC AGGAGGGGTA AAGATp~~AAA TCGATGGGGG TGTTATCGGG ATTGTGATCG 3101 TCTCCTTATA TGAATACCTC TTCCTCTCCC CCATTCCATC ATCAATCTTA AGAGGAATAT ACTTATGGAG AAGGAGAGGG GGTAAGGTAG TAGTTAGAAT 3151 ATTTAGGTCT ACTATTCATT CTAGCAATCT CCAGCTTAAC CGTTTATACT TAAATCCAGA TGATAAGTAA GATCGTTAGA GGTCGAATTG GCAAATATGA 3201 ATTTTAGGTT CCGGATGAGC ATCTAATTCG AA.ATATGCTC TGATAGGAGC TP~AAATCCAA GGCCTACTCG TAGATTAAGC TTTATACGAG ACTATCCTCG 3251 CCTACGAGCC GTAGCACAAA CAATCTCCTA TGAAGTAAGT CTAGGACTAA GGATGCTCGG CATCGTGTTT GTTAGAGGAT ACTTCATTCA GATCCTGATT 3301 TCCTCTTATC AATAATCGTA TTTGCAGGAG GTTTCACCCT CCATACCTTT AGGAGAATAG TTATTAGCAT AAACGTCCTC CAAAGTGGGA GGTATGGAAA 3351 AATTTAGCAC AAGAAACAAT CTGACTAATT ATTCCAGGAT GACCCTTAGC TTAAATCGTG TTCTTTGTTA GACTGATTAA TAAGGTCCTA CTGGGAATCG 3401 CCTAATATGA TACGTTTCAA CCCTAGCAGA AACTAACCGA GTCCCATTTG GGATTATACT ~ATGCAAAGTT 
GGGATCGTCT TTGATTGGCT CAGGGTAAAC 3451 ATTTAACAGA AGGAGAATCA GAACTAGTCT CAGGGTTCAA CATCGAATAT TAAATTGTCT TCCTCTTAGT CTTGATCAGA GTCCCAAGTT GTAGCTTATA 3501 GCAGGAGGCT CATTCGCCCT ATTTTTCCTT GCTGAATACA CA.AACATTTT CGTCCTCCGA GTAAGCGGGA TAAAAAGGAA CGACTTATGT GTTTGTAAAA 3551 ATTAATAAAT ACCCTATCAG TTATCTTATT CATAGGCTCT TCCTACAACC TAATTATTTA TGGGATAGTC AATAGAATAA GTATCCGAGA AGGATGTTGG 3601 CACTTCTCCC AGAA.ATTTCA ACACTCAGCC TTATAATGAA AGCTACCCTA GTGAAGAGGG TC TTTA.AAGT TGTGAGTCGG AATATTACTT TCGATGGGAT 3651 TTAACCTTAT TTTTCTTATG AATTCGAGCA TCCTATCCCC GTTTTCGCTA AATTGGAATA AAAAGAATAC TTAAGCTCGT AGGATAGGGG CA.AA.AGC GAT 3701 TGACCAACTC ATACATTTAG TATG TTTTCTCCCA TTAACCTTAG ACTGGTTGAG TATGTA.AATC ATACTTTTTT AAAAGAGGGT AATTGGAATC 3751 CAATTATACT ATGACATATC GCCCTCCCCA TAGCTACAGC AAGCCTACCT GTTAATATGA TACTGTATAG CGGGAGGGGT ATCGATGTCG TTCGGATGGA 3801 CCCCTAACTT AACGGAAGCG TGCCTGAATA AAGGACCACT TTGATAGAGT GGGGATTGAA TTGCCTTCGC ACGGACTTAT TTCCTGGTGA AACTATCTCA 3851 GGATAATGAA AGTTAAAATC TTTCCCCTTC C TAGA►~~A,AAT AGGACTTGAA CCTATTACTT TCAATTTTAG AAAGGGGAAG GATCTTTTTA TCCTGAACTT 3901 CCTATAATTA AGAGATCAAA ACTCCTTGTG CTTCCAATTA TACTATTTCC GGATATTAAT TCTCTAGTTT TGAGGAACAC GAAGGTTAAT ATGATA.AAGG 184

3951 TAAGTAAAGT CAGCTAACAA AGCTTTTGGG CCCATACCCC AACCATGTTG ATTCATTTCA GTCGATTGTT TC GAA.P~AC C C GGGTATGGGG TTGGTACAAC 4001 GTTP~ATC C TTCTTTTACT AATGAACCCA ATCGTATTAA CCATCATCAT CAATTTTAGG AAGAAAATGA TTACTTGGGT TAGCATAATT GGTAGTAGTA 4051 TTCAAGCCTA GGCCTAGGAA CTATCTTAAC ATTCATTGGT TCACACTGAC AAGTTCGGAT CCGGATCCTT GATAGAATTG TAAGTAACCA AGTGTGACTG 4101 TCCTAGTTTG AATAGGCCTC GAAATTAATA CTCTAGCTAT TATTCCCCTA AGGATCAAAC TTATCCGGAG CTTTAATTAT GAGATCGATA ATAAGGGGAT 4151 ATAATTCGCC AACACCACCC ACGGGCAGTA GAAGCCTCTA CAAAATATTT TATTAAGCGG TTGTGGTGGG TGCCCGTCAT CTTCGGAGAT GTTTTATAAA 4201 CATTACGCA.A GCGACTGCCT CAGCTTTACT TTTATTTGCT AGTGTCACAA GTAATGCGTT CGCTGACGGA GTCGAAATGA AAATAAACGA TCACAGTGTT 4251 ACGCTTGGAC TTCAGGTGAA TGAAGTCTAA TC GAA.ATAAT TAATCCAACC TGCGAACCTG AAGTCCACTT ACTTCAGATT AGCTTTATTA ATTAGGTTGG 4301 TCTGCCACAC TGGCCACAAT CGCATTAGCA TTP~~AA,ATTG GCCTAGCCCC AGACGGTGTG ACCGGTGTTA GCGTAATCGT AATTTTTAAC CGGATCGGGG 4351 CCTCCACTTC TGATTACCAG AAGTACTTCA AGGCTTAGAC CTTACCACAG GGAGGTGAAG ACTAATGGTC TTCATGAAGT TCCGAATCTG GAATGGTGTC 4401 GTCTCATTCT TTCCACATGA CP.►AAAAC TTG CCCCATTCGC TATTCTCTTA CAGAGTAAGA AAGGTGTACT GTTTTTGAAC GGGGTAAGCG ATAAGAGAAT 4451 CAACTCTACC C C TCAC TA.AA TTCTAACTTA CTCATCTTCC TTGGAGTACT GTTGAGATGG GGAGTGATTT AAGATTGAAT GAGTAGAAGG AACCTCATGA 4501 CTCAACTATA GTAGGGGGCT GAGGAGGATT AAACCAAACC CAACTACGAA GAGTTGATAT CATCCCCCGA CTCCTCCTAA TTTGGTTTGG GTTGATGCTT 4551 AAATCCTAGC CTACTCCTCA ATCGCACATC TTGGCTGAAT AATTACAATC TTTAGGATCG GATGAGGAGT TAGCGTGTAG AACCGACTTA TTAATGTTAG 4601 CTACATTACT CCTATAATTT AACCCAACTA AACCTAATAC TTTACATCAT GATGTAATGA GGATATTAAA TTGGGTTGAT TTGGATTATG AAATGTAGTA 4651 CATAACATCA ACAACTTTTC TGCTATTCAA AACATTTAAT TCAAC CA,AAA GTATTGTAGT TGTTGAAAAG ACGATAAGTT TTGTAAATTA AGTTGGTTTT 4701 TCAACTCTAT CTCCTCTTCT TCATCAAAAT CCCCCTTACT ATCTATCATT AGTTGAGATA GAGGAGAAGA AGTAGTTTTA GGGGGAATGA TAGATAGTAA 4751 GCTCTCATAA CTCTCCTTTC TCTAGGAGGC CTACCCCCAC TTTCAGGCTT CGAGAGTATT GAGAGGA.AAG AGATCCTCCG GATGGGGGTG A.AAGTC C GAA 4801 TATAC CAA.A.A 
TGATTAATTT TACAAGAATT AACAAAACAA AATCTAATTA ATATGGTTTT ACTAATTAAA ATGTTCTTAA TTGTTTTGTT TTAGATTAAT 4851 TCCCAGCTAC TATCATAGCC ATAATAGCCC TCCTCAGTCT ATTCTTCTAT AGGGTCGATG ATAGTATCGG TATTATCGGG AGGAGTCAGA TAAGAAGATA 4901 TTACGCCTAT GCTACGCTAC AACATTAACC ATAACTCCAA ATTCAATTAA AATGCGGATA CGATGCGATG TTGTAATTGG TATTGAGGTT TAAGTTAATT 4951 TATATTAACA TCATGACGAA CCAATTACCC C ATAA.AC C TA ATCCTAACAA ATATAATTGT AGTACTGCTT GGTTAATGGG GTATTTGGAT TAGGATTGTT 5001 CAACTGCCTC ACTATCTATT TTACTCCTCC CAATTACCCC CGCTATTCTC GTTGACGGAG TGATAGATAA AATGAGGAGG GTTAATGGGG GCGATAAGAG 5051 ATATTAATAT CTTAAGAAAT TTAGGTTAAC AATAGACCAA AAGCCTTCAA TATAATTATA GAATTCTTTA AATCCAATTG TTATCTGGTT TTCGGAAGTT 5101 AGCTTTAAGT AGAAGTGAAA A'TCTCCTAAT TTCTGTTAAG ATCTGTAAGA TCGAAATTCA TCTTCACTTT TAGAGGATTA AAGACAATTC TAGACATTCT 5151 CTTTATCTCA CATCTTCTGA ATGCAACCCA GATGCTTTAA TTAAGCTAA.A GAAATAGAGT GTAGAAGACT TACGTTGGGT CTACGAAATT AATTCGATTT 5201 ATCTCCTAGA CAAATAGGCC TTGATCCTAC AAAATCTTAG TTAACAGCTA TAGAGGATCT GTTTATCCGG AACTAGGATG TTTTAGAATC AATTGTCGAT 5251 AGCGTTCAAT CCAGCGAACT TTTATCTACT TTCTCCCGCC ATAAGAACAA TCGCAAGTTA GGTCGCTTGA AAATAGATGA AAGAGGGCGG TATTCTTGTT 5301 AAGGCGGGAG AAAGTCCCGG GAGAAGTCAA CCTCCGGTTT TGGATTTGCA 185

TTCCGCCCTC TTTCAGGGCC CTCTTCAGTT GGAGGCCAAA ACCTAAACGT 5351 ATCCAACGTA ATCATCTACT GCAGGACTAT GATAAGAAGA GGAATTTGAC TAGGTTGCAT TAGTAGATGA CGTCCTGATA CTATTCTTCT CCTTAAACTG 5401 CTCTGTCCAC GGAGCTACAA TCCGCCACTT AGTTCTCAGT CACCTTACCT GAGACAGGTG CCTCGATGTT AGGCGGTGAA TCAAGAGTCA GTGGAATGGA 5451 GTGGCAATTA ATCGTTGACT ATTTTCTACA AACCACAAAG ATATTGGCAC CACCGTTAAT TAGCAACTGA TAAAAGATGT TTGGTGTTTC TATAACCGTG 5501 CCTATATTTA ATCTTTGGTG CATGAGCAGG AATAGTGGGA ACAGCCCTAA GGATATAAAT TAGA.AAC CAC GTACTCGTCC TTATCACCCT TGTCGGGATT 5551 GCCTCTTAAT TCGAGCCGAA TTAGGACAGC CAGGATCACT TCTAGGAGAT CGGAGAATTA AGCTCGGCTT AATCCTGTCG GTCCTAGTGA AGATCCTCTA 5601 GATCAAATCT ATAATGTTAT TGTAACCGCC CATGCATTCG TAATAATCTT CTAGTTTAGA TATTACAATA ACATTGGCGG GTACGTAAGC ATTATTAGAA 5651 CTTTATAGTT ATACCCGTGA TAATTGGCGG ATTTGGAAAC TGACTAGTGC GAAATATCAA TATGGGCACT ATTAACCGCC TAAACCTTTG ACTGATCACG A.AATAATATA 5701 CATTAATAAT TGGTGCACCA GACATAGCTT TTCCACGAAT GTAATTATTA ACCACGTGGT CTGTATCGAA AAGGTGCTTA TTTATTATAT 5751 AGCTTTTGAC TCCTTCCCCC CTCTTTTCTT TTACTTCTAG CTTCAGCTGG TCGA.A.AACTG AGGAAGGGGG GAGA.AA.AGAA AATGAAGATC GAAGTCGACC 5801 AGTTGAAGCC GGAGCCGGTA CTGGTTGAAC AGTTTATCCT CCATTAGCTG TCAACTTCGG CCTCGGCCAT GACCAACTTG TCAAATAGGA GGTAATCGAC 5851 GCAATTTAGC ACATGCTGGA GCATCCGTTG ACTTAGCTAT TTTCTCTCTC CGTTAAATCG TGTACGACCT CGTAGGCAAC TGAATCGATA AAAGAGAGAG 5901 CATTTAGCAG GTATTTCATC AATTTTAGCC TCAATCAACT TTATTACAAC GTA.AATCGTC CATAAAGTAG TTAAAATCGG AGTTAGTTGA AATAATGTTG 5951 TATTATTAAT ATA.A.AACCCC CTGCAATCTC CCAATATCAA ACACCATTAT ATAATAATTA TATTTTGGGG GACGTTAGAG GGTTATAGTT TGTGGTAATA 6001 TTGTGTGATC AATTCTAGTA ACAACTATTC TCCTTCTATT ATCCCTCCCA AACACACTAG TTAAGATCAT TGTTGATAAG AGGAAGATAA TAGGGAGGGT 6051 GTACTTGCAG CCGGCATTAC AATACTACTT ACTGATCGAA ACCTAAACAC CATGAACGTC GGCCGTAATG TTATGATGAA TGACTAGCTT TGGATTTGTG 6101 AACATTCTTT GATCCAGCAG GCGGAGGAGA TCCAATTCTT TATCAACATT TTGTAAGAAA CTAGGTCGTC CGCCTCCTCT AGGTTAAGAA ATAGTTGTAA 6151 TATTTTGATT TTTTGGTCAC CCAGAAGTTT ACATTTTAAT TCTACCCGGT ATAAAACTAA AAAACCAGTG GGTCTTCAAA 
TGT~~AAATTA AGATGGGCCA 6201 TTCGGGATAA TTTCTCATGT AGTAGCTTAT TATTCTGGCA P~~AAAGAAC C AAGCCCTATT AAAGAGTACA TCATCGAATA ATAAGACCGT TTTTTCTTGG 6251 ATTTGGTTAT ATAGGAATAG TCTGAGCAAT AATAGCAATT GGATTATTAG TAAACCAATA TATCCTTATC AGACTCGTTA TTATCGTTAA CCTAATAATC 6301 GTTTCATTGT ATGAGCCCAT CATATATTTA CGGTAGGAAT AGACGTTGAC CAAAGTAACA TACTCGGGTA GTATATA.AAT GCCATCCTTA TCTGCAACTG 6351 ACACGAGCCT ATTTTACTTC AGCAACAATA ATTATCGCTA TCCCCACAGG TGTGCTCGGA TA.AAATGAAG TCGTTGTTAT TAATAGCGAT AGGGGTGTCC 6401 CGTAAAAGTA TTTAGTTGAC TAGCAACTCT TCATGGAGGC TCTGTTAAAT GCATTTTCAT AAATCAACTG ATCGTTGAGA AGTACCTCCG AGACAATTTA 6451 GAGAAACCCC ATTATTATGA GCTCTTGGAT TTATCTTCTT ATTCACAGTG CTCTTTGGGG TAATAATACT CGAGAACCTA AATAGAAGAA TAAGTGTCAC 6501 GGAGGGTTAA CAGGCATCGT CTTGGCTAAC TCTTCCCTAG ATATTGTACT CCTCCCAATT GTCCGTAGCA GAACCGATTG AGAAGGGATC TATAACATGA 6551 TCATGATACC TACTACGTAG TAGCCCACTT CCATTATGTC CTTTCAATAG AGTACTATGG ATGATGCATC ATCGGGTGAA GGTAATACAG GAAAGTTATC 6601 GAGCAGTATT TGCTATCATA GCAGGCTTTA TCCACTGATT TCCTCTCATC CTCGTCATAA ACGATAGTAT CGTCCGAAAT AGGTGACTAA AGGAGAGTAG 6651 TCTGGCTATA CTCTCCATTC AACATGAACA AAAATCCAAT TTGCAGTAAT AGACCGATAT GAGAGGTAAG TTGTACTTGT TTTTAGGTTA AACGTCATTA 186

6701 ATTTATTGGA GTAAATTTAA CATTTTTCCC ACAACACTTC CTAGGTCTAG TAAATAACCT CATTTAAATT GTP.~~AA.AGGG TGTTGTGAAG GATCCAGATC 6751 CTGGTATACC ACGACGCTAC TCAGATTATC CAGACGCATA TACCCTATGA GACCATATGG TGCTGCGATG AGTCTAATAG GTCTGCGTAT ATGGGATACT 6801 AACACAGTCT CCTCCATTGG CTCTTTAATT TCACTTGTAG CAGTAATTAT TTGTGTCAGA GGAGGTAACC GAGAAATTAA AGTGAACATC GTCATTAATA 6851 ACTTCTATTT ATTATCTGAG AAGCATTTGC CTCAAAACGA GAAGTCTTAT TGAAGATAAA TAATAGACTC TTCGTA.AACG GAGTTTTGCT CTTCAGAATA 6901 CCGTTGAATT ACCTCACACA AACGTTGAAT GATTACACGG CTGCCCTCCA GGCAACTTAA TGGAGTGTGT TTGCAACTTA CTAATGTGCC GACGGGAGGT 6951 CCATATCACA CATATGAAGA ACCAGCATTT GTCCAAGTTC AACGAACCTT GGTATAGTGT GTATACTTCT TGGTCGTA.AA CAGGTTCAAG TTGCTTGGAA 7001 TTAAAACAAG AAAGGAAGGA ATTGAACCCC CATATGTTAG TTTCAAGCCA AATTTTGTTC TTTCCTTCCT TAACTTGGGG GTATACAATC AAAGTTCGGT 7051 ACCACATCAC CACTCTGTCA CTTTCTTTAT TAAGATTCTA GTAAAATACA TGGTGTAGTG GTGAGACAGT GAAAGAAATA ATTCTAAGAT CATTTTATGT 7101 TTACACTGCC TTGTCAAGGC AAAATTGTGA GTTTA.AATCC CACGAGTCTT AATGTGACGG AACAGTTCCG TTTTAACACT CAAATTTAGG GTGCTCAGAA 7151 AACTTATAAT GGCACATCCC TCACAATTAG GATTTCAAGA CGCAGCCTCC TTGAATATTA CCGTGTAGGG AGTGTTAATC CT~.AAGTTCT GCGTCGGAGG 7201 CCAGTTATGG AAGAACTTAT TCATTTTCAC GACCACACAT TAATAATTGT GGTCAATACC TTCTTGAATA AGTA~A.AAGTG CTGGTGTGTA ATTATTAACA 7251 ATTTCTTATT AGCACTCTGG TTCTTTATAT TATTACAGCA ATAGTATCAA TAAAGAATAA TCGTGAGACC AAGAAATATA ATAATGTCGT TATCATAGTT 7301 CA►A,A,AC TTAC AAACAAATAT ATTCTTGACT CTCAAGAAAT TGAA.ATTGTC GTTTTGAATG TTTGTTTATA TAAGAACTGA GAGTTCTTTA ACTTTAACAG 7351 TGAACTATTC TGCCCGCCAT TATTCTCATT ATAATCGCCC TACCATCCCT ACTTGATAAG ACGGGCGGTA ATAAGAGTAA TATTAGCGGG ATGGTAGGGA 7401 ACGAATTCTA TACCTTATAG ACGAAATCAA TGATCCCCAC CTAACAATCA TGCTTAAGAT ATGGAATATC TGCTTTAGTT ACTAGGGGTG GATTGTTAGT 7451 AAGCTATGGG TCATCAATGA TACTGAAGTT ATGAATATAC AGATTATGAA TTCGATACCC AGTAGTTACT ATGACTTCAA TACTTATATG TCTAATACTT 7501 GACTTAGGAT TCGACTCTTA CATAATTCAA ACCCAAGACT TAACCCCAGG CTGAATCCTA AGCTGAGAAT GTATTAAGTT TGGGTTCTGA ATTGGGGTCC 7551 CCAATTCCGT TTATTAGAAA 
CAGACCACCG AATAGTTGTA CCCATAGAAT GGTTAAGGCA AATAATCTTT GTCTGGTGGC TTATCAACAT GGGTATCTTA 7601 CACCTATTCG TGTATTAGTA TCTGCAGAAG ATGTCTTACA TTCATGAGCT GTGGATAAGC ACATAATCAT AGACGTCTTC TACAGAATGT AAGTACTCGA 7651 GTTCCAGCCT TAGGAATTAA AATAGACGCC GTACCAGGAC GCCTAAATCA CAAGGTCGGA ATCCTTAATT TTATCTGCGG CATGGTCCTG CGGATTTAGT 7701 AACTGCCTTT ATTACCTCCC GACCAGGCAT CTATTATGGT CAATGTTCAG TTGAC GGA.AA TAATGGAGGG CTGGTCCGTA GATAATACCA GTTACAAGTC 7751 AAATTTGTGG TGCTAACCAT AGCTTTATAC CTATCGTAGT AGAAGCAGTT TTTAAACACC ACGATTGGTA TCGAAATATG GATAGCATCA TCTTCGTCAA 7801 CCCTTAGAAC ACTTCGAAGC CTGATCTTCA TTAATATTAG AAGAAGCCTC GGGAATCTTG TGAAGCTTCG GACTAGAAGT AATTATAATC TTCTTCGGAG 7851 ACTAAGAAGC TAAATTGGGT CTAGCATTAG CCTTTTAAGC Tp~~A.AAC TGG TGATTCTTCG ATTTAACCCA GATCGTAATC GGAA.AATTC G ATTTTTGACC 7901 TGACTCCCTA CCACCCTTAG TGATATGCCT CAATTAAATC CCCACCCTTG ACTGAGGGAT GGTGGGAATC ACTATACGGA GTTAATTTAG GGGTGGGAAC 7951 ATTCATTATT CTCCTATTCT CGTGAATAAT TTTCCTTATC ATTTTACCCA TAAGTAATAA GAGGATAAGA GCACTTATTA AAAGGAATAG TA,A.AATGGGT 8001 AAAAAGTAAT AA.ATCACACA TTCAACAATA ACCCAACATT P~~AA.AGTATC TTTTTCATTA TTTAGTGTGT AAGTTGTTAT TGGGTTGTAA TTTTTCATAG 8051 GP.~~AAATC TA AACCTGAACC TTGAAACTGA CCATGATCAT AAGCTTCTTT 187

CTTTTTAGAT TTGGACTTGG AACTTTGACT GGTACTAGTA TTCGAAGAAA 8101 GACCAATTCC TAAGCCCCAC CCTCATTGGA ATCCCATTAA TTGCCCTGGC CTGGTTAAGG ATTCGGGGTG GGAGTAACCT TAGGGTAATT AACGGGACCG 8151 AATTGCATTA CCATGATTAA CTTTCCCAAC CCCAACTAAT CGCTGGTTAA TTAACGTAAT GGTACTAATT GAAAGGGTTG GGGTTGATTA GCGACCAATT 8201 ATAACCGACT AATAACCCTC CAAAGCTGAT TTATTAACCG ATTTATCTAT TATTGGCTGA TTATTGGGAG GTTTCGACTA A.ATAATTGGC TA.AATAGATA 8251 CAACTTATAC AGCCCATTAA CTTTGCTGGT CACAA.ATGAG CCATATTATT GTTGAATATG TCGGGTAATT GAAACGACCA GTGTTTACTC GGTATAATAA 8301 CACAGCACTA ATATTATTCC TAATTACTAT TAACCTATTA GGTCTTCTCC GTGTCGTGAT TATAATAAGG ATTAATGATA ATTGGATAAT CCAGAAGAGG 8351 CTTATACCTT CACACCCACA ACCCAACTCT CCCTTAATAT AGCATTTGCC GAATATGGAA GTGTGGGTGT TGGGTTGAGA GGGAATTATA TCGTAAACGG 8401 CTTCCCTTGT GATTTACAAC CGTATTAATC GGAATACTTA ACCAACCAAC GAAGGGAACA C TA.A.ATGTTG GCATAATTAG CCTTATGAAT TGGTTGGTTG 8451 AATTGCACTA GGCCATTTTC TACCAGAAGG TACCCCCACC CCTCTAGTAC TTAACGTGAT C C GGTA►,AAAG ATGGTCTTCC ATGGGGGTGG GGAGATCATG 8501 CAGTCCTAAT TATCATCGAA ACCATCAGCC TCTTTATCCG ACCACTAGCA GTCAGGATTA ATAGTAGCTT TGGTAGTCGG AGAAATAGGC TGGTGATCGT 8551 CTAGGAGTCC GACTCACCGC TAATTTAACA GCTGGCCACC TACTAATACA GATCCTCAGG CTGAGTGGCG ATTAAATTGT CGACCGGTGG ATGATTATGT 8601 ATTAATCGCA ACCGCAGCCT TCGTCCTTAT TACTATTATA CCAACCGTAG TAATTAGCGT TGGCGTCGGA AGCAGGAATA ATGATAATAT GGTTGGCATC 8651 CATTATTAAC ATCAATTATC CTATTCCTAT TAACAATTCT AGAAGTAGCT GTAATAATTG TAGTTAATAG GATAAGGATA ATTGTTAAGA TCTTCATCGA 8701 GTAGCAATAA TTCAAGCATA TGTATTTGTT CTTCTACTAA GCCTATACCT CATCGTTATT AAGTTCGTAT ACATA.AACAA GAAGATGATT CGGATATGGA 8751 AC AAGAA.AAT GTTTAATGGC TCACCAAGCA CATGCATATC ATATAGTTGA TGTTCTTTTA CAAATTACCG AGTGGTTCGT GTACGTATAG TATATCAACT 8801 CCCTAGCCCA TGACCACTAA CCGGAGCTAC TGCCGCCCTT CTAATAACAT GGGATCGGGT ACTGGTGATT GGCCTCGATG ACGGCGGGAA GATTATTGTA 8851 CCGGGTTGGC CATCTGATTT CACTTCCACT CATTACTACT TCTTTACCTA GGCCCAACCG GTAGACTAA.A GTGAAGGTGA GTAATGATGA AGAAATGGAT 8901 GGCTTAACCC TTCTACTACT AACTATAATC CAATGATGAC GTGATATTAT CCGAATTGGG AAGATGATGA 
TTGATATTAG GTTACTACTG CACTATAATA 8951 TCGAGAAGGA ACATTTCAAG GTCACCATAC ACCTCCTGTA CA~P~AA.AGGTC AGCTCTTCCT TGTAAAGTTC CAGTGGTATG TGGAGGACAT GTTTTTCCAG 9001 TCCGTTATGG AATAATCTTA TTTATCACAT CAGAAGTATT TTTCTTTTTA AGGCAATACC TTATTAGAAT AAATAGTGTA GTCTTCATAA A.AAGP.~T 9051 GGCTTTTTCT GAGCCTTTTA CCACTCAAGC CTTGCCCCAA CCCCAGAACT C C GA~AA.A.AGA CTCGGAAAAT GGTGAGTTCG GAACGGGGTT GGGGTCTTGA 9101 AGGAGGATGT TGACCACCAA CAGGAATTAA CCCATTAGAC CCATTTGAAG TCCTCCTACA ACTGGTGGTT GTCCTTAATT GGGTAATCTG GGTAAACTTC 9151 TACCACTCCT AAACACTGCA GTACTCTTAG CTTCCGGTGT AACAGTTACC ATGGTGAGGA TTTGTGACGT CATGAGAATC GAAGGCCACA TTGTCAATGG 9201 TGAACCCATC ATAGTTTAAT AGAAGGTAAC C GA►AAAGAAG CTATCCAAGC ACTTGGGTAG TATCA.AATTA TCTTCCATTG GCTTTTCTTC GATAGGTTCG 9251 CCTCACCCTC ACTATTATCC TAGGAGTCTA CTTCACAGCC CTTCAAGCCA GGAGTGGGAG TGATAATAGG ATCCTCAGAT GAAGTGTCGG GAAGTTCGGT 9301 TAGAATATTA CGAAGCACCA TTTACAATCG CCGATGGAGT TTACGGAACA ATCTTATAAT GCTTCGTGGT AAATGTTAGC GGCTACCTCA AATGCCTTGT 9351 ACATTCTTCG TTGCCACAGG ATTCCACGGT CTCCATGTCA TTATTGGCTC TGTAAGAAGC AACGGTGTCC TAAGGTGCCA GAGGTACAGT AATAACCGAG 9401 AACATTTCTA GCAGTCTGTT TACTACGACA AATCCAATAT CATTTTACAT TTGTAAAGAT CGTCAGACAA ATGATGCTGT TTAGGTTATA GTAAAATGTA 188

9451 CAGAACATCA TTTTGGCTTC GAAGCCGCTG CATGATACTG ACATTTTGTA GTCTTGTAGT AA.AAC C GAAG CTTCGGCGAC GTACTATGAC TGT~CAT 9501 GATGTAGTAT GATTATTCCT TTATGTATCC ATCTATTGAT GAGGCTCATA CTACATCATA CTAATAAGGA AATACATAGG TAGATAACTA CTCCGAGTAT 9551 ATTACTTTTC TAGTATAAAC TAGTACAAAT GATTTCCAAT CATTTAATCT TAATGF~AAAG ATCATATTTG ATCATGTTTA CTAAAGGTTA GTA.AATTAGA 9601 TGGTTAGAAT CCAAGGA.A.A.A GTAATGAACC TCATCGCATC TTCTATCGCA ACCAATCTTA GGTTCCTTTT CATTACTTGG AGTAGCGTAG AAGATAGCGT 9651 GCTACGGCCC TGATTTCCCT AATCCTTGTA TTAGTTGCAT TTTGACTTCC CGATGCCGGG AC TA.AAGGGA TTAGGAACAT AATCAACGTA AA.AC TGAAGG 9701 ATCACTAAAT CCAGATAACG P►AAAATTATC CCCATACGAA TGCGGCTTTG TAGTGATTTA GGTCTATTGC TTTTTAATAG GGGTATGCTT AC GC C GAA.AC 9751 ACCCCCTAGG AAGTGCACGC CTTCCATTCT CCTTACGCTT CTTTCTTGTA TGGGGGATCC TTCACGTGCG GAAGGTAAGA GGAATGCGAA GA.AAGAAC AT 9801 GCTATTCTAT TTTTATTATT TGACCTAGAA ATCGCCCTCC TTCTTCCCTT CGATAAGATA AAAATAATAA ACTGGATCTT TAGCGGGAGG AAGAAGGGAA 9851 ACCATGAGGT GATCAATTAC TATCACCACT TTCCACACTA CTTTGAGCAG TGGTACTCCA CTAGTTAATG ATAGTGGTGA AAGGTGTGAT GAAACTCGTC 9901 CGATTATCCT AATTCTATTA ACTTTAGGTC TTATCTATGA ATGATTTCAA GCTAATAGGA TTAAGATAAT TGAAATCCAG AATAGATACT TACTAAAGTT 9951 GGAGGCCTAG AATGAGCAGA ATGGATATTT AGTCTAAATA AAGACCACTA CCTCCGGATC TTACTCGTCT TACCTATAAA TCAGATTTAT TTCTGGTGAT 10001 ATTTCGACTT AGTA.AATTAC GGTGAAAATC CATAAATATC CTATGTCTCC TAA.AGC TGAA TCATTTAATG CCACTTTTAG GTATTTATAG GATACAGAGG 10051 AATATATTTC AGCCTCAGCT CAGCATTCAT CTTAGGCCTC ATAGGTCTTG TTATATAAAG TCGGAGTCGA GTCGTAAGTA GAATCCGGAG TATCCAGAAC 10101 CACTCAACCG TTACCACCTT CTATCAGCAC TCTTATGCTT AGAAAGTATA GTGAGTTGGC AATGGTGGAA GATAGTCGTG AGAATACGAA TCTTTCATAT 10151 TTATTAACTC TTTTCATTAC CATTGCCATC TGAACCCTAA CATTAAACTC AATAATTGAG AAAAGTAATG GTAACGGTAG ACTTGGGATT GTAATTTGAG 10201 CACTTCATGT TCAATTATTC CTATAATTCT CCTCACATTC TCAGCCTGTG GTGAAGTACA AGTTAATAAG GATATTAAGA GGAGTGTAAG AGTCGGACAC 10251 AAGCTAGCGC AGGCCTAGCT ATTCTAGTAG CTACCTCACG TTCTCACGGC TTCGATCGCG TCCGGATCGA TAAGATCATC GATGGAGTGC AAGAGTGCCG 10301 TCTGACAACC 
TACAAAACCT AAACCTTCTC CAATGCTA.AA AATTTTAATC AGACTGTTGG ATGTTTTGGA TTTGGAAGAG GTTACGATTT TTAAAATTAG 10351 CCAACAATCA TACTCTTCCC TACCACATGA ATTATTAACA P~~AAATGACT GGTTGTTAGT ATGAGAAGGG ATGGTGTACT TAATAATTGT TTTTTACTGA 10401 ATGACCCATA ACCACCACTC ACAGCCTTCT AATCGCATTA CTGAGCCTGC TACTGGGTAT TGGTGGTGAG TGTCGGAAGA TTAGCGTAAT GACTCGGACG 10451 TCCTATTCAA ATGA.AATACA GATATTGGCT GAGATTTTTC TAACCAATTC AGGATAAGTT TACTTTATGT CTATAACCGA C TC TP.~AAAAG ATTGGTTAAG 10501 ATAGCCGTTG ACCCTCTATC AACCCCTTTA CTAATCTTGA CATGTTGACT TATCGGCAAC TGGGAGATAG TTGGGGAAAT GATTAGAACT GTACAACTGA 10551 TCTACCATTA ATAATCTTAG CCAGCCAAAA CCACATTTCT CCAGAACCTA AGATGGTAAT TATTAGAATC GGTCGGTTTT GGTGTAAAGA GGTCTTGGAT 10601 TCATCCGACA ACGAACATAC ATTACACTTC TAATTTCCCT CCAAACCTTC AGTAGGCTGT TGCTTGTATG TAATGTGAAG ATTAAAGGGA GGTTTGGAAG 10651 CTTATCATAG CATTTTCTGC AACCGAAATA ATTATATTTT ACATCATATT GAATAGTATC GTAAAAGACG TTGGCTTTAT TAATATP.►AAA TGTAGTATAA 10701 TGAAGCTACA CTTATCCCAA CCCTTATTAT TATTACACGA TGAGGAAATC ACTTCGATGT GAATAGGGTT GGGAATAATA ATAATGTGCT ACTCCTTTAG 10751 AAACAGAACG CCTAAATGCA GGCACCTATT TCCTATTCTA TACACTAATT TTTGTCTTGC GGATTTACGT CCGTGGATAA AGGATAAGAT ATGTGATTAA 10801 GGTTCACTCC CCCTCCTTAT TGCCCTCCTA TTCATACAAA ACAATTTAGG 189

CCAAGTGAGG GGGAGGAATA ACGGGAGGAT AAGTATGTTT TGTTAAATCC 10851 TACCCTTTCT ATAATTATTA TACAACATTC ACAACTTACA AACCTACTTT ATGGGAA.AGA TATTAATAAT ATGTTGTAAG TGTTGAATGT TTGGATGAAA 10901 CATGAGCAGA CAAATTATGA TGAATGGCCT GTCTCATCGC TTTCCTCGTC GTACTCGTCT GTTTAATACT ACTTACCGGA CAGAGTAGCG AAAGGAGCAG 10951 AAAATACCTT TATATGGAAT TCACCTCTGA CTCCCCA.A.AG CCCATGTCGA TTTTATGGAA ATATACCTTA AGTGGAGACT GAGGGGTTTC GGGTACAGCT 11001 AGCCCCAATC GCTGGCTCAA TAATCCTAGC AGCAGTATTA C T TAAAC TAG TCGGGGTTAG CGACCGAGTT ATTAGGATCG TCGTCATAAT GAATTTGATC 11051 GCGGTTACGG AATAATACGA ATTATTGTAA TAC TA.AAC C C ATTAACCAAA CGCCAATGCC TTATTATGCT TAATAACATT ATGATTTGGG TAATTGGTTT 11101 GAAATAGCCT ACCCATTCTT AATCTTAGCT ATTTGAGGAA TTATTATAAC CTTTATCGGA TGGGTAAGAA TTAGAATCGA TAAACTCCTT AATAATATTG 11151 CAGCTCCATC TGTTTACGAC AAACAGACCT AAAATCCCTA ATTGCCTACT GTCGAGGTAG ACAAATGCTG TTTGTCTGGA TTTTAGGGAT TAACGGATGA 11201 CATCAGTAAG TCATATAGGC CTAGTAGCCG GAGCAATTTT TATCCAAACA GTAGTCATTC AGTATATCCG GATCATCGGC CTCGTTAAAA ATAGGTTTGT 11251 CCATGAAGTT TTGCAGGAGC AATCACACTT ATAATCGCCC ATGGCCTAGT GGTACTTCAA AACGTCCTCG TTAGTGTGAA TATTAGCGGG TACCGGATCA 11301 TTCATCAGCC TTATTTTGTT TAGCCAACAC CAACTATGAA CGAATTCATT AAGTAGTCGG AATAAAACAA ATCGGTTGTG GTTGATACTT GCTTAAGTAA 11351 GCCGAACCAT ACTCCTAGCC CGAGGTATGC AGGTTATCCT CCCACTAATG CGGCTTGGTA TGAGGATCGG GCTCCATACG TCCAATAGGA GGGTGATTAC 11401 GCAACCTGAT GATTCTTTGC TAGCCTAGCT AATCTTGCCC TACCCCCTTC CGTTGGACTA C TAAGA.AAC G ATCGGATCGA TTAGAACGGG ATGGGGGAAG 11451 ACCTAACCTT ATAGGAGAAC TCCTCATCAT CACCTCATTA TTTAATTGAT TGGATTGGAA TATCCTCTTG AGGAGTAGTA GTGGAGTAAT AAATTAACTA 11501 CCAATTGAAC CATAATCTTA TCAGGCCTTG GAGTATTAAT CACAGCCTCC GGTTAACTTG GTATTAGAAT AGTCCGGAAC CTCATAATTA GTGTCGGAGG 11551 TATTCACTCT ACATATTCTT AATAACTCAA CGAGGCCCAA CCCCCCATCA ATAAGTGAGA TGTATAAGAA TTATTGAGTT GCTCCGGGTT GGGGGGTAGT 11601 TATTCTATCA C TA.AAC C C AA CTTATACACG AGAACATCTT CTCCTAAGCC ATAAGATAGT GATTTGGGTT GAATATGTGC TCTTGTAGAA GAGGATTCGG 11651 TCCATCTCTT ACCAGTCCTA CTACTAATAC TTA.AACCAGA ACTCATCTGA 
AGGTAGAGAA TGGTCAGGAT GATGATTATG AATTTGGTCT TGAGTAGACT 11701 GGATGAACAC TTTGTATTTA TAGTTTAATC AA.AACATTAG ATTGTGGTTC CCTACTTGTG AAACATAAAT ATCAAATTAG TTTTGTAATC TAACACCAAG 11751 TP►~~A,AATAA.A AGTTAAAACC TTTTTAATTA CCGAGAGAGG TCAGGGACAC ATTTTTATTT TCAATTTTGG P.,~~A.AATTAAT GGCTCTCTCC AGTCCCTGTG 11801 GAAGAACTGC TAATTCTTCC TATCATGGCT CA.AATCCATG GCTCACTCAG CTTCTTGACG ATTAAGAAGG ATAGTACCGA GTTTAGGTAC CGAGTGAGTC 11851 CTTCTGAAAG ATAATAGCAA TCTATTGGTC TTAGGAACCA AAAACTCTTG GAAGACTTTC TATTATCGTT AGATAACCAG AATCCTTGGT TTTTGAGAAC 11901 GTGCAACTCC AAGCAAAAGC CATGAACACC ATTTTCAATT CATCATTCCT CACGTTGAGG TTCGTTTTCG GTACTTGTGG TA.A.AAGTTAA GTAGTAAGGA 11951 ACTAATCTTT ATTATCCTTA TTTTACCATT ACTAACCTCA TTAAACCCCA TGATTAGAAA TAATAGGAAT AAA.ATGGTAA TGATTGGAGT AATTTGGGGT 12001 AAGAACTTAA TCCAAATTGA TCCTCATCCT ATGTP►~~AAAC AGC TGTA,A.AA TTCTTGAATT AGGTTTAACT AGGAGTAGGA TACATTTTTG TCGACATTTT 12051 ACTTCCTTCT TCATTAGCCT TATCCCCCTA TTCATTTTTC TAGACCAAGG TGAAGGAAGA AGTAATCGGA ATAGGGGGAT AAGTP~A.AAAG ATCTGGTTCC 12101 CCTAGAATCA ATTATAACCA ATTACAACTG AATAAACATT GGACCATTCG GGATCTTAGT TAATATTGGT TAATGTTGAC TTATTTGTAA CCTGGTAAGC 12151 ATATTAATAT GAGCTTTAAA TTTGATATAT ACTCAATTAT ATTTACCCCC TATAATTATA C TC GA.AATTT AAACTATATA TGAGTTAATA TAAATGGGGG 190

12201 GTAGCCCTCT ACGTCACCTG ATCTATTCTC GAATTCGCCC TATGATACAT CATCGGGAGA TGCAGTGGAC TAGATAAGAG CTTAAGCGGG ATACTATGTA 12251 ACACTCTGAC CCTAACATCA ATCGCTTCTT CAAATACCTA CTACTCTTTT GATGAGA,AA.A TGTGAGACTG GGATTGTAGT TAGCGAAGAA GTTTATGGAT 12301 TGATTTCAAT AATTATCCTA GTGACCGCCA ACAATATATT TCAATTATTC ACTAAAGTTA TTAATAGGAT CACTGGCGGT TGTTATATAA AGTTAATAAG 12351 ATTGGATGAG AAGGAGTTGG AATCATATCA TTCCTCCTAA TCGGCTGATG TAACCTACTC TTCCTCAACC TTAGTATAGT AAGGAGGATT AGCCGACTAC 12401 ATACAGCCGA ACAGATGCTA ACACGGCTGC CCTCCAAGCC GTGATTTATA TATGTCGGCT TGTCTACGAT TGTGCCGACG GGAGGTTCGG CACTAAATAT 12451 ACCGCGTGGG AGATATCGGA TTAATCCTCA CCATAGCCTG ACTAGCTATA TGGCGCACCC TCTATAGCCT AATTAGGAGT GGTATCGGAC TGATCGATAT 12501 AATTTAAATT CATGAGAAAT CCAACAACTA TTTATTTTAT CTAAAGATAT TTAAATTTAA GTACTCTTTA GGTTGTTGAT A.AATAAAATA GATTTCTATA TGGA~A,AAT 12551 AAACTTAACA CTACCCCTTC TAGGCCTCGT CCTAGCCGCA GC TTTGAATTGT GATGGGGAAG ATCCGGAGCA GGATCGGCGT CGACCTTTTA 12601 CCGCACAATT CGGCCTCCAT CCCTGACTCC CTTCCGCTAT AGAAGGTCCA GGCGTGTTAA GCCGGAGGTA GGGACTGAGG GAAGGCGATA TCTTCCAGGT 12651 ACACCAGTCT CCGCCCTACT TCACTCCAGC ACAATAGTCG TCGCTGGCAT TGTGGTCAGA GGCGGGATGA AGTGAGGTCG TGTTATCAGC AGCGACCGTA 12701 TTTCCTGCTA ATCCGCCTTC ACCCTCTAAT CCAAGACAAC CAACTAATCC AAAGGACGAT TAGGCGGAAG TGGGAGATTA GGTTCTGTTG GTTGATTAGG 12751 TAACAACATG CCTCTGCCTT GGAGCACTAA CCACCCTTTT TACCGCAACA ATTGTTGTAC GGAGACGGAA CCTCGTGATT GGTGGGAAAA ATGGCGTTGT 12801 TGCGCACTAA CACAAAATGA TATC ATCGTCGCCT TCTCAACATC ACGCGTGATT GTGTTTTACT ATAGTTTTTT TAGCAGCGGA AGAGTTGTAG 12851 AAGTCAACTC GGACTAATAA TAGTAACAAT CGGCCTTAAC CAACCCCAAC TTCAGTTGAG CCTGATTATT ATCATTGTTA GCCGGAATTG GTTGGGGTTG 12901 TTGCCTTTCT CCATATCTGT ACCCACGCCT TCTTCAAAGC CATACTTTTC AACGGAAAGA GGTATAGACA TGGGTGCGGA AGAAGTTTCG GTATGP.~AAAG 12951 CTTTGCTCAG GATCCATTAT CCATAGTCTT AATGATGAAC AAGACATTCG GAAACGAGTC CTAGGTAATA GGTATCAGAA TTACTACTTG TTCTGTAAGC 13001 CAAA.ATAGGA GGCCTCCATA AACTTCTACC ATTCACATCA TCTTCTTTAA GTTTTATCCT CCGGAGGTAT TTGAAGATGG TAAGTGTAGT AGAAGAA.ATT 13051 CAGTTGGAAG 
TTTGGCCCTT ACAGGAATAC CTTTCTTATC AGGCTTCTTC GTCAACCTTC A.A.ACCGGGAA TGTCCTTATG GAAAGAATAG TCCGAAGAAG 13101 TCAAAAGACG CTATTATTGA ATCCATAAAC ACTTCACACC TA.AAC GC C T G AGTTTTCTGC GATAATAACT TAGGTATTTG TGAAGTGTGG ATTTGCGGAC 13151 AGCCCTAATC CTTACCCTCA TTGCAACATC TTTCACAGCC ATCTACAGCC TCGGGATTAG GAATGGGAGT AACGTTGTAG AAAGTGTCGG TAGATGTCGG 13201 TACGCCTTGT ATTCTTCACA TTAATAAACT TCCCACGATT TAATTCATTT ATGCGGAACA TAAGAAGTGT AATTATTTGA AGGGTGCTAA ATTAAGTAAA 13251 TCCCCAATCA AC GAA.A.ACAA CCCAATAGTC ATTAACCCTA TTAAACGCCT AGGGGTTAGT TGCTTTTGTT GGGTTATCAG TAATTGGGAT AATTTGCGGA 13301 AGCTTACGGA AGCATTCTCG CCGGTCTCAT CATTACATCA AATTTAACCC TCGAATGCCT TCGTAAGAGC GGCCAGAGTA GTAATGTAGT TTAAATTGGG 13351 CAACP.►AAA.AC CCAAATCATA ACAATATCCC CTTTACTAAA ACTCTCTGCC GTTGTTTTTG GGTTTAGTAT TGTTATAGGG GA.AATGATTT TGAGAGACGG 13401 CTACTAGTTA CAATTATTGG CCTATTATTA GCCTTAGAAT TAGCTAACTT GATGATCAAT GTTAATAACC GGATAATAAT CGGAATCTTA ATCGATTGAA 13451 AACTAACACC CAACTTAAAA TTAACCCCGT CCTTTACACA CATCACTTCT TTGATTGTGG GTTGAATTTT AATTGGGGCA GGA.AATGTGT GTAGTGAAGA 13501 CTAATATACT CGGATATTTC CCACAAATCA TCCACCGCCT C C TAC C A►AAA GATTATATGA GCCTATAAAG GGTGTTTAGT AGGTGGCGGA GGATGGTTTT 13551 ATCAACCTAA ACTGAGCCCA ATACATCTCA ACGCAACTAA TTGATCAATC 191

TAGTTGGATT TGACTCGGGT TATGTAGAGT TGCGTTGATT AACTAGTTAG 13601 ATGAAATGAA A,AAATTGGAC CP.~AAAAGTAC TCTCATTCAA CAAACTCCAC TACTTTACTT TTTTAACCTG GTTTTTCATG AGAGTAAGTT GTTTGAGGTG 13651 TAATTAAACT ATCTACTCAA CCCCAACAAG GCCATATTAA AGTTTACCTC ATTAATTTGA TAGATGAGTT GGGGTTGTTC CGGTATAATT TCAAATGGAG 137 O1 ATACTACTTT TCCTCACATT AACCTTAGCC CTATTGACCT CACTAGCCTA TATGATGAAA AGGAGTGTAA TTGGAATCGG GATAACTGGA GTGATCGGAT 13751 ACTGCACGTA AAGCCCCCCA AGATAATCCC CGAGTTAATT CCAACACCAC TGACGTGCAT TTCGGGGGGT TCTATTAGGG GCTCAATTAA GGTTGTGGTG 13801 AAACAAAGTC AACAACAATA CCCATCCACT CAAA.AC TAAC AATCACCCAC TTTGTTTCAG TTGTTGTTAT GGGTAGGTGA GTTTTGATTG TTAGTGGGTG 13851 CATTAGCGTA C AATAA.AG C T ACCCCCACAA AATCCCCACG AACCATCTCC GTAATCGCAT GTTATTTCGA TGGGGGTGTT TTAGGGGTGC TTGGTAGAGG 13901 ATACTACTCA TTTCCTCCAC CCCTATCCAA TTTAACTCAA ATCACTCTAC TATGATGAGT AAAGGAGGTG GGGATAGGTT AAATTGAGTT TAGTGAGATG 13951 TATAAAATAT TTACCTACAA AA.AC CAGAGC TATTA.AATAA AACCCAACAT ATATTTTATA AATGGATGTT TTTGGTCTCG ATAATTTATT TTGGGTTGTA 14001 ATAACAACAC AGATCAATTA CCCCATGATT CAGGATAAGG TTCAGCAGCA TATTGTTGTG TCTAGTTAAT GGGGTACTAA GTCCTATTCC AAGTCGTCGT 14051 AGAGCCGCTG TATAAGCAAA CACTACTAAC ATCCCGCCCA AATAA.ATTAA TCTCGGCGAC ATATTCGTTT GTGATGATTG TAGGGCGGGT TTATTTAATT 14101 A.AACA.A.AACC AATGATP.~AA.A AAGACCCTCC ATGGCCCACT AATAATCCAC TTTGTTTTGG TTACTATTTT TTCTGGGAGG TACCGGGTGA TTATTAGGTG 14151 ACCCTACCCC AGCAGCTATA ACTAACCCTA ACGCAGCATA GTAAGGAGAA TGGGATGGGG TCGTCGATAT TGATTGGGAT TGCGTCGTAT CATTCCTCTT 14201 GGATTGGACG CCACCCCTAT TAAACCTAAA ACTAAACAAA CTATTATTAA CCTAACCTGC GGTGGGGATA ATTTGGATTT TGATTTGTTT GATAATAATT 14251 AAACATAA.AA TATACCATTA CTCCTACCTG GACTCTAACC AAGACTGATA TTTGTATTTT ATATGGTAAT GAGGATGGAC CTGAGATTGG TTCTGACTAT 143 01 ACTTGP~~A.AA CTACCGTTGT TTATTCAACT ATAAGAATTT TATGGCCATA TGAACTTTTT GATGGCAACA AATAAGTTGA TATTCTTAAA ATACCGGTAT 14351 AATATCCGAA AAACCCACCC AC TAC TAAA.A ATTATTAACC AAACCCTAAT TTATAGGCTT TTTGGGTGGG TGATGATTTT TAATAATTGG TTTGGGATTA 14401 TGATCTCCCA GCCCCTTCCA ACATTTCAAT TTGATGAAAC TTCGGTTCAC 
ACTAGAGGGT CGGGGAAGGT TGTAAAGTTA AACTACTTTG AAGCCAAGTG 14451 TCCTAGGACT ATGCCTAATC ATTCAAATCC TTACAGGACT ATTCCTAGCT AGGATCCTGA TACGGATTAG TAAGTTTAGG AATGTCCTGA TAAGGATCGA 14501 ATACATTACA CCGCAGACAT CTCCATAGCC TTTTCCTCAG TAGTTCACAT TATGTAATGT GGCGTCTGTA GAGGTATCGG A,AAAGGAGTC ATCAAGTGTA 14551 CTGTCGTGAC GTTAATTACG GCTGACTTAT CCGTAATATT CATGCCAACG GACAGCACTG CAATTAATGC CGACTGAATA GGCATTATAA GTACGGTTGC 14601 GAGCCTCACT ATTCTTTATC TGCGTCTACC TACATATTGC CCGAGGACTT CTCGGAGTGA TAAGAA.ATAG ACGCAGATGG ATGTATAACG GGCTCCTGAA 14651 TACTACGGCT CCTACCTCTA CAA.AGA.AACA TGAAATATTG GAGTAATCTT ATGATGCCGA GGATGGAGAT GTTTCTTTGT ACTTTATAAC CTCATTAGAA 147 01 ATTATTCCTA TTAATAGCCA CAGCCTTCGT AGGCTATGTT CTACCTTGAG TAATAAGGAT AATTATCGGT GTCGGAAGCA TCCGATACAA GATGGAACTC 14751 GACAAATATC CTTTTGAGGC GCTACAGTCA TTACTAACCT CCTATCCGCC CTGTTTATAG GAAAACTCCG CGATGTCAGT AATGATTGGA GGATAGGCGG 14801 TTCCCTTACA TTGGAGATAT GCTAGTCCAA TGAATTTGAG GCGGCTTTTC AAGGGAATGT AACCTCTATA CGATCAGGTT ACTTAAACTC CGCCGAAAAG 14851 AGTAGATAAT GCCACCCTAA CACGATTCTT CGCATTCCAC TTCCTCCTAC TCATCTATTA CGGTGGGATT GTGCTAAGAA GCGTAAGGTG AAGGAGGATG 14901 CCTTCCTAAT TACAGCACTA ATAATTATCC ATATCCTCTT CCTACATGAA GGAAGGATTA ATGTCGTGAT TATTAATAGG TATAGGAGAA GGATGTACTT 192

14951 ACAGGCTCAA ATAACCCTAT AGGACTTAAT TCTGATACAG ACAAAATTTC TGTCCGAGTT TATTGGGATA TCCTGAATTA AGACTATGTC TGTTTTAAAG 15001 CTTCCACCCA TACTATACCT ACAAAGATGC ACTAGGCTTC TTCACCCTAA GAAGGTGGGT ATGATATGGA TGTTTCTACG TGATCCGAAG AAGTGGGATT 15051 TTATAATATT AGGGGTACTA GCCCTATTCC TACCTAATTT ATTAGGGGAC AATATTATAA TCCCCATGAT CGGGATAAGG ATGGATTAA.A TAATCCCCTG 15101 GCC GA.AAAC T ATATCCCTGC TAACCCCCTT GTTACCCCTC CCCATATCAA CGGCTTTTGA TATAGGGACG ATTGGGGGAA CAATGGGGAG GGGTATAGTT 15151 ACCCGAATGA TATTTTTTAT TCGCCTACGC CATTCTTCGA TCCATCCCTA TGGGCTTACT AT TA AGCGGATGCG GTAAGAAGCT AGGTAGGGAT 15201 ATA.AATTAGG AGGGGTTCTA GCCCTCCTAT TCTCCATCCT CATCCTTATA TATTTAATCC TCCCCAAGAT CGGGAGGATA AGAGGTAGGA GTAGGAATAT 15251 CTAGTCCCCC TACTACACAC CTCTAAACAA CGAAGCAGCA CCTTTCGCCC GATCAGGGGG ATGATGTGTG GAGATTTGTT GCTTCGTCGT GGA.AAGCGGG 153 01 ACTCACACAA ATTTTCTTTT GAACCCTCGT AACCAATATG CTTATTTTAA TGAGTGTGTT TAAAAGAAAA CTTGGGAGCA TTGGTTATAC GAATAAAATT 15351 CCTGAATCGG AGGACAGCCA GTTGAACAAC CATATATCCT CATCGGACAA GGACTTAGCC TCCTGTCGGT CAACTTGTTG GTATATAGGA GTAGCCTGTT 15401 ATCGCATCCA TCACCTACTT TTCATTATTT CTTATTATCA TTCCGCTCAC TAGCGTAGGT AGTGGATGAA AAGTAATAAA GAATAATAGT AAGGCGAGTG 15451 AGGCTGATGA GAAAACAAAA TCCTCAGCCT AA.ACTAGTC T TGGTAGCTTA TCCGACTACT CTTTTGTTTT AGGAGTCGGA TTTGATCAGA ACCATCGAAT 15501 ATTTAACA.AA GCGTCGACCT TGTAAGTCGA AGACCAGAGG TTTA.AATCCT TAAATTGTTT CGCAGCTGGA ACATTCAGCT TCTGGTCTCC AAATTTAGGA 15551 CTCCAAGACA TATCAGGGGA AGGAGGGTTA AACTCCTGCC CTTGGCTCCC GAGGTTCTGT ATAGTCCCCT TCCTCCCAAT TTGAGGACGG GAACCGAGGG 15601 AAAGCCAAGA TTCTGCCCAA ACTGCCCCCT GAAATGCTAT TAAAGCAGGG TTTCGGTTCT AAGACGGGTT TGACGGGGGA CTTTACGATA ATTTCGTCCC 15651 AAACCAAATG AA.AATTTGGT TTTCCAAAAG TAAGTCAGTA TGACATATTA TTTGGTTTAC TTTTAAACCA AAAGGTTTTC ATTCAGTCAT ACTGTATAAT 15701 ATGATATAGC CCACATACCT TAATATAGTA CATTACTTAA CTCGACTAAT TACTATATCG GGTGTATGGA ATTATATCAT GTAATGAATT GAGCTGATTA 15751 CAACATTAAT AGATTATTCC CTACTACTAT AATTATCTAT GCTTAATCCT GTTGTAATTA TCTAATAAGG GATGATGATA TTAATAGATA CGAATTAGGA 15801 CATTAATCTA 
TATTCCACTA TATCATAACA TACTATGCTT AATACTCATT GTAATTAGAT ATAAGGTGAT ATAGTATTGT ATGATACGAA TTATGAGTAA 15851 AATATACTAT CCACTATTTC ATTACATTCA TTCCTTAGAC CTCATAAACC TTATATGATA GGTGATAAAG TAATGTAAGT AAGGAATCTG GAGTATTTGG 15901 TAAAATCAGA ATTTTCATGA CATTGAATTA TACCTTTGAC TCTCAATTAC ATTTTAGTCT TAAAAGTACT GTAACTTAAT ATGGAAACTG AGAGTTAATG 15951 TTAAGTATAT ATCATGCAGG CTGGTAAGAA CATCGCATCC CGCTATTGTA AATTCATATA TAGTACGTCC GACCATTCTT GTAGCGTAGG GCGATAACAT 16001 AGAATAAAAT AGCTCTATTT GTGGCACTGC ACTCGACTAA TCCCCATTAA TCTTATTTTA TCGAGATA.AA CACCGTGACG TGAGCTGATT AGGGGTAATT 16051 TTGACCAGAA CTGGCATCTG ATTAATGCTT GTAATACTCT AATCCTTAAT AACTGGTCTT GACCGTAGAC TAATTACGAA CATTATGAGA TTAGGAATTA 16101 CGCGTCAAGA ATGAATGTAC CCTAGCTCCC TTTTATGCCA TTTTCGTCCT GCGCAGTTCT TACTTACATG GGATCGAGGG AAAATACGGT A,AAAGCAGGA 16151 TGATCGTC TC AAGATTTCTT GTCCGCCCTG ATTTTTTTTT TCGGGGATGA ACTAGCAGAG TTCTAAAGAA CAGGCGGGAC T AGCCCCTACT 16201 AGCAATTACT ATGCCCGGGA GGGCTGATCT GGGACACCGA GTTAAACTTG TCGTTAATGA TACGGGCCCT CCCGACTAGA CCCTGTGGCT CAATTTGAAC 16251 AATCCACCTC GACATTTACT TATAATACCC ATTACTCTCA TTCATGAATT TTAGGTGGAG CTGTAAATGA ATATTATGGG TAATGAGAGT AAGTACTTAA 16301 GTAATTGTCA AGTTGACCAT AACTGAA.AGG GATAGAGAAA TTGACGTCAT 193

CATTAACAGT TCAACTGGTA TTGACTTTCC CTATCTCTTT AACTGCAGTA 16351 AGACGTCAAG TTTCGATTTT TTTGATTAAT GA.AACTATGG TTT T TCTGCAGTTC AAAGCTAAAA AA.ACTAATTA CTTTGATACC AAATTTTTTA 16401 CCATATCCTT AACCCTCATC ATAAGTGCGA TTTGAA.ATAA ATTTGCATGT GGTATAGGAA TTGGGAGTAG TATTCACGCT AAACTTTATT TAAACGTACA 16451 AAGGCGCATT GAATAATCCT AATACATTAA TCACTTTACT TGGCATAAAT TTCCGCGTAA CTTATTAGGA TTATGTAATT AGTGAAATGA ACCGTATTTA 16501 TTTTTTTATT AAGTTTCCCC CTACGTTTCA AA.ATTTCGGA GCCGCTTAAA TAA TTCAAAGGGG GATGCAAAGT TTTAAAGCCT CGGCGAATTT 16551 TA CATTTTTTTG GTP.~~AAACCC CCCTCCCCCT AATATACACG TTTTTTTTAT GT C CATTTTTGGG GGGAGGGGGA TTATATGTGC 16601 GACTCCTCGA AAAACCCCTA AAACGAGGGC CGGACATATA TCTTCAAATT CTGAGGAGCT TTTTGGGGAT TTTGCTCCCG GCCTGTATAT AGAAGTTTAA 16651 AGCATACGAA ATATACTCTA TATATATAGT GTTACACTAT GAT TCGTATGCTT TATATGAGAT ATATATATCA CAATGTGATA CTA

tRNA    1..71                       product = tRNA-Phe
rRNA    70..1022                    product = 12S ribosomal RNA
tRNA    1023..1094                  product = tRNA-Val
rRNA    1095..2762                  product = 16S ribosomal RNA
tRNA    2763..2837                  product = tRNA-Leu
gene    2838..3812                  gene = ND1    product = NADH dehydrogenase subunit 1
tRNA    3814..3882                  product = tRNA-Ile
tRNA    complement (3881..3852)     product = tRNA-Gln
tRNA    3953..4021                  product = tRNA-Met
gene    4022..5065                  gene = ND2    product = NADH dehydrogenase subunit 2
tRNA    5065..5135                  product = tRNA-Trp
tRNA    complement (5137..5205)     product = tRNA-Ala
tRNA    complement (5206..5278)     product = tRNA-Asn
tRNA    complement (5312..5378)     product = tRNA-Cys
tRNA    complement (5380..5449)     product = tRNA-Tyr
gene    5451..7006                  gene = CO1    product = cytochrome c oxidase subunit 1
tRNA    complement (7007..7077)     product = tRNA-Ser
tRNA    7082..7151                  product = tRNA-Asp
gene    7159..7849                  gene = CO2    product = cytochrome c oxidase subunit 2
tRNA    7850..7923                  product = tRNA-Lys
gene    7925..8092                  gene = ATP8   product = ATP synthase F0 subunit 8
gene    8083..8766                  gene = ATP6   product = ATP synthase F0 subunit 6
gene    8766..9551                  gene = CO3    product = cytochrome c oxidase subunit 3
tRNA    9554..9623                  product = tRNA-Gly
gene    9624..9974                  gene = ND3    product = NADH dehydrogenase subunit 3
tRNA    9973..10042                 product = tRNA-Arg
gene    10043..10339                gene = ND4L   product = NADH dehydrogenase subunit 4L
gene    10333..11713                gene = ND4    product = NADH dehydrogenase subunit 4
tRNA    11714..11782                product = tRNA-His
tRNA    11783..11849                product = tRNA-Ser
tRNA    11850..11921                product = tRNA-Leu
gene    11922..13751                gene = ND5    product = NADH dehydrogenase subunit 5
gene    complement (13747..14268)   gene = ND6    product = NADH dehydrogenase subunit 6
tRNA    complement (14269..14338)   product = tRNA-Glu
gene    14342..15487                gene = CYTB   product = cytochrome b
tRNA    15487..15560                product = tRNA-Thr
tRNA    complement (15563..15631)   product = tRNA-Pro
D-Loop  15633..16693
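The coordinates in the feature table above are 1-based and inclusive, so a feature's length is end - start + 1, and adjacent features that share positions overlap. The following Python is a minimal sketch of that arithmetic for a few entries copied from the table. The names `Feature`, `parse_feature`, `length` and `overlap` are hypothetical helpers introduced here for illustration and are not part of the thesis or of any annotation software.

```python
import re
from typing import NamedTuple


class Feature(NamedTuple):
    kind: str         # feature key, e.g. "gene" or "tRNA"
    start: int        # first position, 1-based inclusive
    end: int          # last position, 1-based inclusive
    complement: bool  # True if annotated on the complementary strand
    name: str         # gene or product label


# Accepts "7925..8092" as well as "complement (13747..14268)".
LOCATION = re.compile(r"^(complement\s*\()?\s*(\d+)\.\.(\d+)\s*\)?$")


def parse_feature(kind: str, location: str, name: str) -> Feature:
    """Turn one row of the feature table into a Feature record."""
    m = LOCATION.match(location.strip())
    if m is None:
        raise ValueError(f"unrecognised location: {location!r}")
    return Feature(kind, int(m.group(2)), int(m.group(3)),
                   m.group(1) is not None, name)


def length(f: Feature) -> int:
    """Number of positions covered by the feature (inclusive coordinates)."""
    return f.end - f.start + 1


def overlap(a: Feature, b: Feature) -> int:
    """Number of positions shared by two features, 0 if they are disjoint."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


# Rows copied from the feature table above.
atp8 = parse_feature("gene", "7925..8092", "ATP8")
atp6 = parse_feature("gene", "8083..8766", "ATP6")
nd4l = parse_feature("gene", "10043..10339", "ND4L")
nd4 = parse_feature("gene", "10333..11713", "ND4")
nd6 = parse_feature("gene", "complement (13747..14268)", "ND6")

for f in (atp8, atp6, nd4l, nd4, nd6):
    strand = "complement" if f.complement else "forward"
    print(f"{f.name:5s} {length(f):5d} bp  {strand}")

print("ATP8/ATP6 overlap:", overlap(atp8, atp6), "bp")
print("ND4L/ND4 overlap:", overlap(nd4l, nd4), "bp")
```

With the coordinates as listed, ATP8 and ATP6 share 10 positions and ND4L and ND4 share 7. Overlapping reading frames of this kind are common in vertebrate mitochondrial genomes.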

Alopias superciliosus mitochondrion, complete genome

1 GCTAGTGTAG CTTAATTTAA AGCATGGCAC TGAAGATGCT AAGATGAAAA CGATCACATC GAATTAAATT TCGTACCGTG ACTTCTACGA TTCTACTTTT 51 ATAAAATTTT TCCGCAGGCA TAAAGGTTTG GTCCTGGCCT CAGTATTAAT TATTTTAAAA AGGCGTCCGT ATTTCCAAAC CAGGACCGGA GTCATAATTA 101 TGCAACCAAA ATTATACATG CAAGTTTCAG CATCCCTGTG AGAATGCCCT ACGTTGGTTT TAATATGTAC GTTCAAAGTC GTAGGGACAC TCTTACGGGA 151 AATTAATCTA TTAATTAATT AGGAGCAGGT ATCAGGCACA CACACGTAGC TTAATTAGAT AATTAATTAA TCCTCGTCCA TAGTCCGTGT GTGTGCATCG 201 CCAAGACACC TTGCTAAGCC ACACCCCCAA GGGATTTCAG CAGTAATAAA GGTTCTGTGG AACGATTCGG TGTGGGGGTT C C C TAA.AGTC GTCATTATTT 251 TATTGATTTA ATAAGCGCAA GCTTGAATCA GTTA.AAGTTA ACAGAGTTGG ATAACTAAAT TATTCGCGTT CGAACTTAGT CAATTTCAAT TGTCTCAACC 301 TAAATCTCGT GCCAGCCACC GCGGTTATAC GAGTAACTCA CATTAATACT ATTTAGAGCA CGGTCGGTGG CGCCAATATG CTCATTGAGT GTAATTATGA 351 TTCCCGGCGT AAAGAGTGAT TTAAGGAATA TCTATAATAA C TAA.AGTTAA A.AGGGCCGCA TTTCTCACTA AATTCCTTAT AGATATTATT GATTTCAATT 401 GACCTCATCA AGCTGTTACA CGCACTCACG AGTGGAATTA TCAATAACGA CTGGAGTAGT TCGACAATGT GCGTGAGTGC TCACCTTAAT AGTTATTGCT 451 AAGTGACTTT ACCCCACCAG AA.ATC TTGAC GTCACGACAG TTAGACCCCA TTCAC TGA.AA TGGGGTGGTC TTTAGAACTG CAGTGCTGTC AATCTGGGGT 501 A.AC TAGGATT AGATACCCTA CTATGTCTAA C CACA.AAC TT AAACAATAAT TTGATCCTAA TCTATGGGAT GATACAGATT GGTGTTTGAA TTTGTTATTA 551 TCACTATATT GTTCGCCAGA GTACTACAAG CGCTAGCTTA AAACCCAAAG AGTGATATAA CAAGCGGTCT CATGATGTTC GCGATCGAAT TTTGGGTTTC 601 GACTTGGCGG TGTCCCAAAC CCACCTAGAG GAGCCTGTTC TGTAACCGAT CTGAACCGCC ACAGGGTTTG GGTGGATCTC CTCGGACAAG ACATTGGCTA 651 AATCCCCGTT AAACCTCACC ACTTCTGGCC ATCCCCGTCT ATATACCGCC TTAGGGGCAA TTTGGAGTGG TGAAGACCGG TAGGGGCAGA TATATGGCGG 701 GTCGTCAGCT CACCCTATGA AGGTTP~~AAA GTAAGCAAAA AGAACCAACT CAGCAGTCGA GTGGGATACT TCCAATTTTT CATTCGTTTT TCTTGGTTGA 751 CCCATACGTC AGGTCGAGGT GTAGCAAATG AAGTGGATAG AAATGGGCTA GGGTATGCAG TCCAGCTCCA CATCGTTTAC TTCACCTATC TTTACCCGAT 801 CATTTTCTAT A.AAGAAAACA CGAATGGTAG AC TGP.►~~AAAT TACCTAAAGG GTAA.AAGATA TTTCTTTTGT GCTTACCATC TGACTTTTTA ATGGATTTCC 851 TGGATTTAGC AGTAAGA.AAA GATTAGAGAA 
TTTCTCTGAA ACCGGCTCTG ACCTAAATCG TCATTCTTTT CTAATCTCTT AAAGAGACTT TGGCCGAGAC 901 GGACGCGCAC ACACCGCCCG TCACTCTCCT C CCC TACTTATTTT CCTGCGCGTG TGTGGCGGGC AGTGAGAGGA GTTTTTTGGG ATGAATAA.AA 951 TAATTAAAAG A.AAATTAC TA AGAGGAGGCA AGTCGTAACA TGGTAAGTGT ATTAATTTTC TTTTAATGAT TCTCCTCCGT TCAGCATTGT ACCATTCACA 1001 ACTGGAAAGT GCACTTGGAA TCA,AAATGTG GC TA.AAC TAG CAAAGCACCT TGACCTTTCA CGTGAACCTT AGTTTTACAC CGATTTGATC GTTTCGTGGA 1051 CCCTTACACC GAGGAAATAC CCGTGCAATT CGAGTCATTT TGAACATTAA GGGAATGTGG CTCCTTTATG GGCACGTTAA GCTCAGTAAA ACTTGTAATT 1101 AGCTAGCCTG TATACCTACC CTAAACCTAA CCCTATTAAT TACCTTATAT TCGATCGGAC ATATGGATGG GATTTGGATT GGGATAATTA ATGGAATATA 1151 ATTAATACCT AACTAAAACA TTTTACCTTT TTAGTATGGG CGACAGAACA TAATTATGGA TTGATTTTGT A.AAATGGAAA AATCATACCC GCTGTCTTGT 1201 AAAACTCAGC GCAATAGACT ATGTACCGCA AGGGA.AAGCT GP►AA.AAGA.AA TTTTGAGTCG CGTTATCTGA TACATGGCGT TCCCTTTCGA CTTTTTCTTT 1251 TGAAACAA.AT AATTAA.AGTA ATp,~~AAAGCA GAGATTTAAC CTCGTACCTT ACTTTGTTTA TTAATTTCAT TATTTTTCGT CTCTAAATTG GAGCATGGAA 13 01 TTGCATCATG ATTTAGCTAG P.~C TAGA CA.AAGAGATC TTAAGCCTAT 196

AACGTAGTAC TAAATCGATC TTTTTGATCT GTTTCTCTAG AATTCGGATA 1351 CCCCCCGAAA CTAAACGAGC TACTCCGAAG CAGCACGACT AGAGCAAACC GGGGGGCTTT GATTTGCTCG ATGAGGCTTC GTCGTGCTGA TCTCGTTTGG 1401 CGTCTCTGTG GCAAAAGAGT GGGAAGACTT CCGAGTAGCG GTGACAAGCC GCAGAGACAC CGTTTTCTCA CCCTTCTGAA GGCTCATCGC CACTGTTCGG 1451 TACCGAGTTT AGTGATAGCT GGTTACCCAA P~A.A.AAGAAC T TTAATTCTGC ATGGCTCA.AA TCACTATCGA CCAATGGGTT TTTTTCTTGA AATTAAGACG 1501 ATTAATTCCT TTTCCACCAA AGAGTCCATC TTACCAAGGT TAAACATAAA TAATTAAGGA AAAGGTGGTT TCTCAGGTAG AATGGTTCCA ATTTGTATTT 1551 AATTAATAGT TATTCAGAAG AGGAACAGCC CTTCTGAACT AAGATACAAC TTAATTATCA ATAAGTCTTC TCCTTGTCGG GAAGACTTGA TTCTATGTTG 1601 TTTTAAAGGT GGTAAATGAT CATATTCACC AAGGTTTTTA CCTCAGTGGG P~AAATTTC CA CCATTTACTA GTATAAGTGG TTC Cp~~AAAT GGAGTCACCC 1651 CCCAAAAGCA GCCATCTGTA AAGTAAGCGT CACAGCTCCA GTCTCACAAA GGGTTTTCGT CGGTAGACAT TTCATTCGCA GTGTCGAGGT CAGAGTGTTT 1701 AACCTATAAT TTAGATATTT TTCTCATAAT CCCCTTAACT ATATTGGGTT TTGGATATTA AATC TATA.AA AAGAGTATTA GGGGAATTGA TATAACCCAA 1751 ATTTTATA.AA ATTATAAAAG AACTTATGCT AAAATGAGTA ATAAGAGGAC TAAAATATTT TAATATTTTC TTGAATACGA TTTTACTCAT TATTCTCCTG 1801 TAACCTCTCC AAACACAAGT GTATGTCAGA AAGAATTAAA TCACTGACAA ATTGGAGAGG TTTGTGTTCA CATACAGTCT TTCTTAATTT AGTGACTGTT 1851 TTAAACGAAC CCAGACTGAG GTCATTATAC TAATATTACC TTAACTAGAA AATTTGCTTG GGTCTGACTC CAGTAATATG ATTATAATGG AATTGATCTT 1901 AATCTTATCG TAACATTCGT TAACCCTACA CAGGAGTGTC TTAAGGAA.AG TTAGAATAGC ATTGTAAGCA ATTGGGATGT GTCCTCACAG AATTCCTTTC 1951 ATTTAAAGAA AATA.A.AGGAA CTCGGCAAAC ACAA.ACTCCG CCTGTTTACC TA.AATTTCTT TTATTTCCTT GAGCCGTTTG TGTTTGAGGC GGACAA.ATGG 2001 P~~AAACATC G CCTCTTGAAC ATTATAAGAG GTCCCGCCTG CCCTGTGACA TTTTTGTAGC GGAGAACTTG TAATATTCTC CAGGGCGGAC GGGACACTGT 2051 ATGTTTTAAC GGCCGCGGTA TTTTGACCGT GCAAAGGTAG CGTAATCACT TACAA.AATTG CCGGCGCCAT AAAACTGGCA CGTTTCCATC GCATTAGTGA 2101 TGTCTTTTAA ATGAAGACCC GTATGAAAGG CATCACGAGA GTTCAACTGT ACAGAAAATT TACTTCTGGG CATACTTTCC GTAGTGCTCT CAAGTTGACA 2151 CTCTATTTTC TAATCAATGA AATTGATCTA CTCGTGCAGA AGCGAGTATA GAGATP~AAAG ATTAGTTACT 
TTA.ACTAGAT GAGCACGTCT TCGCTCATAT 2201 ACTACATTAG ACGAGAAGAC CCTATGGAGC TTCA.AACACA TGAATTAAAT TGATGTAATC TGCTCTTCTG GGATACCTCG AAGTTTGTGT ACTTAATTTA 2251 ATGTAAACTA ACTACTCCCC GGACATAAAT AAAATAATAT TTTTAATTTA TACATTTGAT TGATGAGGGG CCTGTATTTA TTTTATTATA AAAATTAAAT 2301 ACTGTTTTTG GTTGGGGTGA CCAAGGGGAA AAATAAATCC CCCTTATCGA T GAC P►AAA.AC CAACCCCACT GGTTCCCCTT TTTATTTAGG GGGAATAGCT 2351 TTGAGTACTC AAGTACTTAA AAATCAGAAT TACAATTCTG ATTAATAAAA AACTCATGAG TTCATGAATT TTTAGTCTTA ATGTTAAGAC TAATTATTTT 2401 TATTTATCGA P~TGACCC AGGATTTCCT GATCAATGAA CCAAGTTACC ATA.AATAGCT TTTTACTGGG TC C TA.AAGGA CTAGTTACTT GGTTCAATGG 2451 CTAGGGATAA CAGCGCAATC CTTTCTTAGA GTCCCTATCG CCGAAAGGGT GATCCCTATT GTCGCGTTAG GAAAGAATCT CAGGGATAGC GGCTTTCCCA 2501 TTACGACCTC GATGTTGGAT CAGGACATCC TAATGATGCA ACCGTTATTA AATGCTGGAG CTACAACCTA GTCCTGTAGG ATTACTACGT TGGCAATAAT 2551 AGGGTTCGTT TGTTCAACGA TTAATAGTCC TACGTGATCT GAGTTCAGAC TCCCAAGCAA ACAAGTTGCT AATTATCAGG ATGCACTAGA CTCAAGTCTG 2601 CGGAGA.AATC CAGGTCAGTT TCTATCTATG AATTAATTTT TCCTAGTACG GCCTCTTTAG GTCCAGTCAA AGATAGATAC TTAATTAAAA AGGATCATGC 2651 AA.AGGAC C GG P.~~AAATGGAG CCAATACCAC AGGCACGCTC CATTTTCATC TTTCCTGGCC TTTTTACCTC GGTTATGGTG TCCGTGCGAG GTA~A,AAGTAG 197

2701 TATTGAAACA AAC TP~AAATA GATAAGAAAA AATTATCTAC TACCCAAGAA ATAACTTTGT TTGATTTTAT CTATTCTTTT TTAATAGATG ATGGGTTCTT 2751 AAGGGTTGTT GAGGTGGCAG AGCCTGGTAA GTGCAAAAGA CCTAAGCTCT TTCCCAACAA CTCCACCGTC TCGGACCATT CACGTTTTCT GGATTCGAGA 2801 TTAATTCAGA GGTTCAA.ATC CTCTCCTCAA CCATGCTTGA AACTCTCCTG AATTAAGTCT CCAAGTTTAG GAGAGGAGTT GGTACGAACT TTGAGAGGAC 2851 CTTTACCTAA TTAACCCACT TACCTATATT ATTCCAATCC TATTAGCTAC GAAATGGATT AATTGGGTGA ATGGATATAA TAAGGTTAGG ATAATCGATG 2901 AGCTTTTCTA ACCCTAGTCG AAC GP►AA.AAT TCTTGGCTAC ATACAACTCC TCGAAAAGAT TGGGATCAGC TTGCTTTTTA AGAACCGATG TATGTTGAGG 2951 GCAAAGGCCC TAACATCGTA GGTCCATACG GACTCCTTCA ACCTATCGCA CGTTTCCGGG ATTGTAGCAT CCAGGTATGC CTGAGGAAGT TGGATAGCGT 3001 GATGGTTTAA AATTATTTAT TAAAGAACCC ATCCACCCAT CAACATCCTC C TAC CA.AATT TTAATAAATA ATTTCTTGGG TAGGTGGGTA GTTGTAGGAG 3051 TCCATTTCTA TTTTTAGCCA CCCCCACAAT AGCTCTAACA CTAGCTCTTC AGGTAA.AGAT P.~~A.AATC GGT GGGGGTGTTA TCGAGATTGT GATCGAGAAG 3101 TTATATGAAT ACCCCTCCCT CTTCCACACT CCATTATTAA CCTTAATCTA AATATACTTA TGGGGAGGGA GAAGGTGTGA GGTAATAATT GGAATTAGAT 3151 GGTTTATTAT TTATCTTAGC AATCTCAAGC CTAACCGTTT ACACTATTTT C CA.AATAATA AATAGAATCG TTAGAGTTCG GATTGGCAAA TGTGATAAAA 3201 AGGCTCTGGA TGAGCATCCA ATTCAP~AATA TGCCCTAATA GGGGCTCTAC TCCGAGACCT ACTCGTAGGT TAAGTTTTAT ACGGGATTAT CCCCGAGATG 3251 GAGCTGTAGC ACA.AACAATC TCCTATGAAG TAAGCCTCGG ATTAATCCTC CTCGACATCG TGTTTGTTAG AGGATACTTC ATTCGGAGCC TAATTAGGAG 3301 TTATCAATAA TTATATTTAC AGGAGGTTTC ACCCTCCATA CCTTCAATCT AATAGTTATT AATATA.AATG TCCTCCAAAG TGGGAGGTAT GGAAGTTAGA 3351 AGCCCAAGAA ACAATCTGAC TAATTATCCC AGGATGACCA CTAGCCCTAA TCGGGTTCTT TGTTAGACTG ATTAATAGGG TCCTACTGGT GATCGGGATT 3401 TATGATACGT CTCTACCCTA GCAGAAACCA ACCGAGTACC ATTTGATTTA ATACTATGCA GAGATGGGAT CGTCTTTGGT TGGCTCATGG TA.AAC TAAAT 3451 ACAGAAGGAG AATCAGAACT AGTCTCAGGT TTTAATATCG AATATGCAGG TGTCTTCCTC TTAGTCTTGA TCAGAGTCCA AAATTATAGC TTATACGTCC 3501 AGGTTCATTT GCCCTATTCT TCCTTGCTGA ATATACA.AAT ATTCTATTAA TCCAAGTAAA CGGGATAAGA AGGAACGACT TATATGTTTA TAAGATAATT 3551 TAAATACCCT 
CTCAGTCATC CTATTTATAG GCACCTCTTA TAACCCACTT ATTTATGGGA GAGTCAGTAG GATAAATATC CGTGGAGAAT ATTGGGTGAA 3601 CTACCAGAAA TCTCCACGCT CAGCTTAATA ATGAAAGCAA CCCTACTTAC GATGGTCTTT AGAGGTGCGA GTCGAATTAT TACTTTCGTT GGGATGAATG 3651 TCTATTCTTC TTATGAATCC GGGCATCTTA TCCCCGCTTC CGTTATGATC AGATAAGAAG AATACTTAGG CCCGTAGAAT AGGGGCGAAG GCAATACTAG 3701 AACTTATACA CCTAGTATGA P~~AAACTTCC TTCCCCTAAC CTTAGCAATT TTGAATATGT GGATCATACT TTTTTGAAGG AAGGGGATTG GAATCGTTAA 3751 ATATTATGAC ATATCGCCCT ACCAATAGCC ACAGCAAGCC TACCTCCCCT TATAATACTG TATAGCGGGA TGGTTATCGG TGTCGTTCGG ATGGAGGGGA 3801 AAC TTAA.AC G GAAGCGTGCC TGAACA.AAGG ACCACTTTGA TAGAGTGGAT TTGAATTTGC CTTCGCACGG ACTTGTTTCC TGGTGA.AACT ATCTCACCTA 3851 AATGGAAGTT AAAACCTCCC CTCTTCCTAG P~AAAATAGGA CTTGAACCCA TTACCTTCAA TTTTGGAGGG GAGAAGGATC TTTTTATCCT GAACTTGGGT 3901 TAACTAAGAG ATCAAAACTC TTTGTACTTC CAATTATACT ATTTCCTAAG TA.AAGGATTC ATTGATTCTC TAGTTTTGAG AAACATGAAG GTTAATATGA 3951 TAAAGTCAGC TAACAAAGCT TTTGGGCCCA TACCCCAACC ATGTTGGTTA ATTTCAGTCG ATTGTTTCGA AAACCCGGGT ATGGGGTTGG TACAACCAAT 4001 AAATCCTTCC TTCACTAATG AACCCAATCG TATTAACCAT TATCATTTCA TTTAGGAAGG AAGTGATTAC TTGGGTTAGC ATAATTGGTA ATAGTAAAGT 4051 AGCCTAGGCC TAGGAACTAT CTTAACATTT ATTGGCTCAC ATTGACTCCT 198

TCGGATCCGG ATCCTTGATA GAATTGTA.AA TAACCGAGTG TAACTGAGGA 4101 AGTTTGAATA GGCCTCGAAA TCAACAC TC T AGCCATCATC CCCTTAATAA TCAAACTTAT CCGGAGCTTT AGTTGTGAGA TCGGTAGTAG GGGAATTATT 4151 TTCGTCAGCA CCATCCCCGG GCAGTAGAAG CTTCTACAAA ATATTTCATT AAGCAGTCGT GGTAGGGGCC CGTCATCTTC GAAGATGTTT TATAAAGTAA 4201 ACACA.AGCAA CTGCCTCAGC CTTACTTTTA TTTGCTAGCG TCACA.AACGC TGTGTTCGTT GACGGAGTCG GAATGAAAAT AAACGATCGC AGTGTTTGCG 4251 TTGGACTTCA GGCGAATGAA GTCTAATTGA AATAACTAAT CCAACCTCTG AACCTGAAGT CCGCTTACTT CAGATTAACT TTATTGATTA GGTTGGAGAC 4301 CCACACTAGC CACAATCGCA CTAGCACTAA AAATTGGCCT AGCCCCTCTC GGTGTGATCG GTGTTAGCGT GATCGTGATT TTTAACCGGA TCGGGGAGAG 4351 CACTTCTGAC TGCCTGAAGT CCTCCAAGGC CTAGACTTAA CCACGGGCCT GTGAAGACTG ACGGACTTCA GGAGGTTCCG GATCTGAATT GGTGCCCGGA 4401 TATTCTCTCC ACATGGCAAA AACTCGCCCC ATTCGCTATT CTCTTACAAC ATAAGAGAGG TGTACCGTTT TTGAGCGGGG TAAGCGATAA GAGAATGTTG 4451 TTTACCCCTC ATTAAATTCC AATCTACTCG TATTCCTCGG TGTCCTCTCA AAATGGGGAG TAATTTAAGG TTAGATGAGC ATAAGGAGCC ACAGGAGAGT 4501 ACCATAGTGG GAGGCTGAGG GGGACTAAAC CAA.ACCCAAC TAC GP~~A.AAT TGGTATCACC CTCCGACTCC CCCTGATTTG GTTTGGGTTG ATGCTTTTTA 4551 CCTAGCCTAC TCCTCAATTG CACACCTCGG TTGAATAATC ACAATTCTAC GGATCGGATG AGGAGTTAAC GTGTGGAGCC AACTTATTAG TGTTAAGATG 4601 ATTATTCCTA CAACCTAACC CAATTAAATC TAATTCTCTA CATTATTATA TAATAAGGAT GTTGGATTGG GTTAATTTAG ATTAAGAGAT GTAATAATAT 4651 ACATCAACAA CCTTCCTATT ATTCAAAACA TTCAACTCAA C CAA.AATTAA TGTAGTTGTT GGAAGGATAA TAAGTTTTGT AAGTTGAGTT GGTTTTAATT 4701 CTCCATCTCC TCCTCCTCAT CAAAWACTCC CTTATTATCC ATCATTGCTC GAGGTAGAGG AGGAGGAGTA GTTTWTGAGG GAATAATAGG TAGTAACGAG 4751 TCATAACCCT CCTCTCTCTC GGAGGATTAC CTCCACTTTC AGGCTTCATA AGTATTGGGA GGAGAGAGAG CCTCCTAATG GAGGTGAAAG TCCGAAGTAT 4801 CCAAAATGAT TAATCCTACA AGAATTAACA AAACAAAACT TAGCTATCCC GGTTTTACTA ATTAGGATGT TCTTAATTGT TTTGTTTTGA ATCGATAGGG 4851 AGCCACTATC ATAGCTATAA TAGCCCTCCT CAGTCTATTC TTCTACCTAC TCGGTGATAG TATCGATATT ATCGGGAGGA GTCAGATAAG AAGATGGATG 4901 GCCTATGCTA CGCTACAACA TTAACCATAA CTCCAAATTC AATCAATATA CGGATACGAT GCGATGTTGT AATTGGTATT 
GAGGTTTAAG TTAGTTATAT 4951 TCAACATCAT GACGAACTAA ACTATCCCAT AACCTAACCC TAACAACAGC AGTTGTAGTA CTGCTTGATT TGATAGGGTA TTGGATTGGG ATTGTTGTCG 5001 TGCCTCACTA TCCATCCTCC TCCTCCCAAT CACCCCCGCC ATCCTCATAT ACGGAGTGAT AGGTAGGAGG AGGAGGGTTA GTGGGGGCGG TAGGAGTATA 5051 TATTATCTTA AGAAATTTAG GTTAACAATA AACCAAAAGC CTTCAAAGCT ATAATAGAAT TCTTTA.AATC CAATTGTTAT TTGGTTTTCG GAAGTTTCGA 5101 TTAAATAGAA GTGP~AAATCT CCTAATTTCT GCTAAGATTT GCAAGTCTTT AATTTATCTT CACTTTTAGA GGATTAAAGA C GATTC TA.AA CGTTCAGA.AA 5151 ATCTCACATC TTCTGAATGC AACCCAGATG CTTTCATTAA GC TAAA.AC C T TAGAGTGTAG AAGACTTACG TTGGGTCTAC GAAAGTAATT CGATTTTGGA 5201 TC TAGGTA.AA TAGGCCTTGA TCCTACA.A.AA TCTTAGTTAA CAGCTAAGTG AGATCCATTT ATCCGGAACT AGGATGTTTT AGAATCAATT GTCGATTCAC 5251 TTCA.AACCAG CGAACTTCTA CCTACTTTCT CCCGCCGTAA AAACAAAAGG AAGTTTGGTC GCTTGAAGAT GGATGAAAGA GGGCGGCATT TTTGTTTTCC 5301 CGGGAGAAAG CCCCGGGAGA AACTAACCTC CGGTTTTGGA TTTGCAATCC GCCCTCTTTC GGGGCCCTCT TTGATTGGAG GCCP~AAACCT AAACGTTAGG 5351 AACGTAATCA TCTACTGCAG GACTTTGATA AGAAGAGGGA TTTGACCTCT TTGCATTAGT AGATGACGTC C TGAA.AC TAT TCTTCTCCCT AA.ACTGGAGA 5401 GTTTACGGAG CTACAATCCG CCACTTAGTT CTCAGTCACC TTACCTGTGG CA.AATGCCTC GATGTTAGGC GGTGAATCAA GAGTCAGTGG AATGGACACC 199

5451 CAATTAATCG TTGACTATTT TC TACAAAC C ACAAAGATAT CGGCACCCTT GTTAATTAGC AAC TGATA.AA AGATGTTTGG TGTTTCTATA GCCGTGGGAA 5501 TATTTAATCT TTGGTGCATG AGCAGGAATA GTGGGAACAG CCCTCAGCCT ATAAATTAGA AACCACGTAC TCGTCCTTAT CACCCTTGTC GGGAGTCGGA 5551 TCTAATTCGA GCCGAGTTAG GCCAGCCCGG ATCACTCCTA GGGGATGATC AGATTAAGCT CGGCTCAATC CGGTCGGGCC TAGTGAGGAT CCCCTACTAG 5601 AGGTCTATAA TGTTATCGTA ACCGCCCATG CATTTGTAAT AATCTTCTTC TCCAGATATT ACAATAGCAT TGGCGGGTAC GTAAACATTA TTAGAAGAAG 5651 ATGGTTATAC CCGTAATAAT TGGGGGATTT GGAAACTGAT TAGTACCCTT TACCAATATG GGCATTATTA ACCCCCTAAA CCTTTGACTA ATCATGGGAA 5701 AATAATTGGT GCACCAGACA TGGCCTTCCC GCGAATAAAT AACATAAGCT TTATTAACCA CGTGGTCTGT ACCGGAAGGG CGCTTATTTA TTGTATTCGA 5751 TTTGACTCCT TCCCCCTTCT TTTCTCTTAC TCCTAGCTTC AGCTGGGGTT AAACTGAGGA AGGGGGAAGA A.AAGAGAATG AGGATCGAAG TCGACCCCAA 5801 GAAGCTGGAG CTGGCACTGG TTGAACAGTT TATCCCCCCT TAGCTGGCAA CTTCGACCTC GACCGTGACC AACTTGTCAA ATAGGGGGGA ATCGACCGTT 5851 CTTAGCACAT GCTGGGGCAT CTGTTGACTT GGCCATTTTC TCGCTTCATT GAATCGTGTA CGACCCCGTA GACAACTGAA CCGGTAAAAG AGCGAAGTAA 5901 TAGCAGGTAT CTCATCAATT TTAGCTTCAA TTAACTTTAT TACAACTATC ATCGTCCATA GAGTAGTTAA AATCGAAGTT AATTGA.AATA ATGTTGATAG 5951 ATTAATATAA AACCACCAGC CATCTCTCAA TATCAAACAC CATTATTTGT TAATTATATT TTGGTGGTCG GTAGAGAGTT ATAGTTTGTG GTAATAAACA 6001 ATGATCAATC CTAGTAACAA CCATCCTCCT CCTCTTATCC CTCCCAGTAC TACTAGTTAG GATCATTGTT GGTAGGAGGA GGAGAATAGG GAGGGTCATG 6051 TCGCAGCCGG CATCACAATA TTATTAACTG ATC GAAAC C T AAACACAACA AGCGTCGGCC GTAGTGTTAT AATAATTGAC TAGCTTTGGA TTTGTGTTGT 6101 TTCTTTGACC CAGCAGGAGG AGGAGATCCA ATTCTTTACC AACATCTATT AAGAAACTGG GTCGTCCTCC TCCTCTAGGT TAAGAAATGG TTGTAGATAA 6151 TTGATTTTTT GGTCACCCAG AAGTCTACAT TTTGATTCTC CCCGGCTTTG AAC T CCAGTGGGTC TTCAGATGTA AAACTAAGAG GGGCCGAAAC 6201 GAATAATCTC CCATGTAGTA GCTTATTATT C TGGT~►~~AAA AGAACCATTC CTTATTAGAG GGTACATCAT CGAATAATAA GACCATTTTT TCTTGGTAAG 6251 GGCTACATAG GCATAGTTTG AGCAATAATA GCGATTGGAT TACTAGGTTT CCGATGTATC CGTATCAAAC TCGTTATTAT CGCTAACCTA ATGATCCAAA 63 01 TATTGTCTGA GCCCACCATA 
TGTTTACAGT AGGAATAGAC GTTGATACAC ATAACAGACT CGGGTGGTAT ACAAATGTCA TCCTTATCTG CAACTATGTG 6351 GAGCCTATTT CACCTCAGCA ACAATAATTA TTGCTATCCC CACAGGTGTA CTCGGATAAA GTGGAGTCGT TGTTATTAAT AACGATAGGG GTGTCCACAT 6401 AAAGTATTTA GCTGATTAGC AACTCTTCAC GGGGGCTCTG TTAAATGAGA TTTCATA.AAT CGACTAATCG TTGAGAAGTG CCCCCGAGAC AATTTACTCT 6451 CACCCCATTA CTATGAGCCC TTGGATTTAT CTTCTTATTC ACAGTAGGAG GTGGGGTAAT GATACTCGGG AAC C TAA.ATA GAAGAATAAG TGTCATCCTC 6501 GACTAACAGG TATCGTCCTG GCCAACTCCT CCTTAGATAT CGTCCTTCAT CTGATTGTCC ATAGCAGGAC CGGTTGAGGA GGAATCTATA GCAGGAAGTA 6551 GATACCTACT ATGTAGTAGC CCATTTCCAT TATGTCCTTT CAATAGGAGC CTATGGATGA TACATCATCG GGTAAAGGTA ATACAGGA.AA GTTATCCTCG 6601 AGTGTTCGCT ATTATAGCAG GCTTCATCCA CTGATTCCCT CTCATCTCTG TCACAAGCGA TAATATCGTC CGAAGTAGGT GACTAAGGGA GAGTAGAGAC 6651 GTTATACTCT CCATTCAACA TGA.ACAAAA.A TCCAATTTGC AGTAATATTT CAATATGAGA GGTAAGTTGT ACTTGTTTTT AGGTTAAACG TCATTATA.AA 6701 ATTGGAGTTA ATTTAACATT CTTTCCACAA CATTTCCTAG GCCTTGCCGG TAACCTCAAT TAA.ATTGTAA GAA.AGGTGTT GTAAAGGATC CGGAACGGCC 6751 AATACCACGA CGTTACTCAG ACTACCCAGA TGCGTACACT CTATGAAACA TTATGGTGCT GCAATGAGTC TGATGGGTCT ACGCATGTGA GATACTTTGT 6801 CAGTCTCCTC TATTGGTTCT TTAGTTTCAC TTGTAGCAGT AATTATGCTC 200

GTCAGAGGAG ATAACCAAGA AATCAA.AGTG AACATCGTCA TTAATACGAG A.AAC GAGAAG 6851 CTATTTATTA TCTGAGAAGC ATTTGCCTCA TATTATCCGT GATA.AATAAT AGACTCTTCG TA.AAC GGAGT TTTGCTCTTC ATAATAGGCA 6901 TGAATTACCT CATACAAACG TTGAATGACT ACACGGCTGC CCTCCACCAT ACTTAATGGA GTATGTTTGC AACTTACTGA TGTGCCGACG GGAGGTGGTA 6951 ACCATACATA TGAAGAACCA GCATTTGTTC AGGTTCAACG AACTTTTTAA TTGP~TT TGGTATGTAT ACTTCTTGGT CGTA.AACAAG TCCAAGTTGC 7001 AACAAGAAAG GAAGGAATCG AACCCCCATA TGTCAGTTTC AAGCCAACCA TTGTTCTTTC CTTCCTTAGC TTGGGGGTAT ACAGTCAAAG TTCGGTTGGT 7051 CATCACCACT CTGTCACTTT CTTCATTAAG ATTCTAGTAA AATATATTAC GTAGTGGTGA GACAGTGAAA GAAGTAATTC TAAGATCATT TTATATAATG 7101 ACTGCCTTGT CAAGGCAAAA TTGTGAGTTT AAACCCCACG AATCTTAACT TGACGGAACA GTTCCGTTTT AACACTCAAA TTTGGGGTGC TTAGAATTGA 7151 TATGGCACAC CCCTCACAAT TAGGATTTCA AGACGCAGCC TCCCCAGTTA ATACCGTGTG GGGAGTGTTA ATCCTAAAGT TCTGCGTCGG AGGGGTCAAT 7201 TGGAAGAACT TATTCATTTT CACGACCACA CATTAATAAT TGTATTTCTA ACCTTCTTGA ATAAGTAAAA GTGCTGGTGT GTAATTATTA ACATAAAGAT 7251 ATTAGTACCC TAGTCCTTTA TATTATCACA GCAATAGTAT CAACAAAACT TAATCATGGG ATCAGGAA.AT ATAATAGTGT CGTTATCATA GTTGTTTTGA 7301 CACAA.ACAAA TATATTCTTG ATTCCCAAGA AATTGAGATT GTCTGAACTA GTGTTTGTTT ATATAAGAAC TAAGGGTTCT TTAACTCTAA CAGACTTGAT 7351 TTCTCCCCGC CATCATCCTC ATTATAATTG CCCTACCATC TTTACGAATT AAGAGGGGCG GTAGTAGGAG TAATATTAAC GGGATGGTAG A.AATGC TTAA 7401 TTATACCTTA TAGACGAAAT CAATGACCCT CACCTGACCA TCAAAGCTAT AATATGGAAT ATCTGCTTTA GTTACTGGGA GTGGACTGGT AGTTTCGATA 7451 AGGACACCAA TGATACTGAA GTTATGAATA TACAGATTAT GAAGATCTTG TCCTGTGGTT ACTATGACTT CAATACTTAT ATGTCTAATA CTTCTAGAAC 7501 GATTCGACTC TTACATAATT CAAACCCAAG ACTTAACCCC AGGCCAATTT CTAAGCTGAG AATGTATTAA GTTTGGGTTC TGAATTGGGG TCCGGTTAAA 7551 CGTTTATTAG AAACAGACCA TCGAATAGTT GTCCCCATAG AATCACCTGT GCAA.ATAATC TTTGTCTGGT AGCTTATCAA CAGGGGTATC TTAGTGGACA 7601 TCGTGTATTA GTATCTGCAG AAGATGTCTT ACATTCATGA GCTGTTCCAG AGCACATAAT CATAGACGTC TTCTACAGAA TGTAAGTACT CGACAAGGTC 7651 CCTTAGGAAT TP.~3AATAGAT GCCGTACCAG GACGCCTAAA C CAA.AC TGCC GGAATCCTTA ATTTTATCTA 
CGGCATGGTC CTGCGGATTT GGTTTGACGG 7701 TTTATCATCT CCCGACCAGG TGTCTATTAT GGTCAATGTT CAGAAATTTG AAATAGTAGA GGGCTGGTCC ACAGATAATA CCAGTTACAA GTCTTTAAAC 7751 TGGCGCTAAC CACAGTTTTA TGCCCATCAT AGTAGAAGCA GTCCCCCTAG ACCGCGATTG GTGTCAAA.AT ACGGGTAGTA TCATCTTCGT CAGGGGGATC 7801 AACATTTCGA AGCCTGATCT TCATTAATAT TAGAAGAAGC CTCACTAAGA TTGTAAAGCT TCGGACTAGA AGTAATTATA ATCTTCTTCG GAGTGATTCT 7851 AGCTAAACTG GGCCTAGCAT TAGCCTTTTA AGC TP.►~~AAAT TGGTGACTCC TCGATTTGAC CCGGATCGTA ATCGGAAAAT TCGATTTTTA ACCACTGAGG 7901 CTACCACCCT TAGTGATATG CCTCAATTAA ACCCCCACCC TTGATTCATT GATGGTGGGA ATCACTATAC GGAGTTAATT TGGGGGTGGG AACTAAGTAA 7951 ATTCTCCTAT TTTCATGAAT AATTTTCCTT ACCATTCTCC CT GT TAAGAGGATA A.AAGTAC TTA TTAA.AAGGAA TGGTAAGAGG GATTTTTTCA 8001 AGTAAATCAT ACATTTAATA ATAACCCAAC ATTP.►~~AA.AGT AC TGP.~~AAAC TCATTTAGTA TGTAAATTAT TATTGGGTTG TAATTTTTCA TGACTTTTTG 8051 C TA.AAC C T GA GCCCTGAAAT TGACCATGAT CATAAGCTTC TTCGACCAAT GATTTGGACT CGGGACTTTA ACTGGTACTA GTATTCGAAG AAGCTGGTTA 8101 TCCTAAGCCC CTCCCTCCTT GGAATTCCAC TAATTGCTCT AGCAATTATA AGGATTCGGG GAGGGAGGAA CCTTAAGGTG ATTAACGAGA TCGTTAATAT 8151 TTACCATGAT TAACTTTTCC AACCCCAACT AACCGCTGAT TA.AATAAC C G AATGGTACTA ATTGAAAAGG TTGGGGTTGA TTGGCGACTA ATTTATTGGC 201

8201 GCTAATGACC CTCCAAAGCT GATTTATTAA TCGATTTATT TATCAACTTA TA.AATAATT CGATTACTGG GAGGTTTCGA C AGCTAAATAA ATAGTTGAAT 8251 TACAACCCAT TAACTTCGCT GGCCATAAGT GAGCTATATT ATTCACAGCA ATGTTGGGTA ATTGAAGCGA CCGGTATTCA CTCGATATAA TAAGTGTCGT 8301 CTGATACTGT TCCTAATTAC CATTAACCTT TTAGGACTTC TCCCTTATAC GACTATGACA AGGATTAATG GTAATTGGAA AATCCTGAAG AGGGAATATG 8351 CTTCACGCCC ACAACCCAAC TATC C C TA.AA CATAGCATTT GCTCTACCCC GAAGTGCGGG TGTTGGGTTG ATAGGGATTT GTATC GTA.AA CGAGATGGGG 8401 TATGATTTAT AACCGTACTA ATTGGTATAC TCAACCAGCC AACAATTGCA ATACTAAATA TTGGCATGAT TAACCATATG AGTTGGTCGG TTGTTAACGT 8451 CTAGGCCATT TTTTACCAGA AGGCACCCCC ACACCCTTAG TACCTGTCCT GATCCGGTAA AAAATGGTCT TCCGTGGGGG TGTGGGAATC ATGGACAGGA 8501 GATCATCATC GA.A.ACTATCA GCCTATTTAT TCGACCATTA GCATTAGGAG CTAGTAGTAG CTTTGATAGT CGGATA.AATA AGCTGGTAAT CGTAATCCTC 8551 TTCGATTAAC TGCTAATTTA ACAGCAGGCC ACCTACTAAT ACAACTAATT AAGCTAATTG ACGATTAAAT TGTCGTCCGG TGGATGATTA TGTTGATTAA 8601 GCAACCGCAG CCTTCGTCCT CATCACCATC ATACCAACCG TAGCATTATT CGTTGGCGTC GGAAGCAGGA GTAGTGGTAG TATGGTTGGC ATCGTAATAA 8651 AACATCAATT ATCCTATTCC TATTAACAAT TCTAGAAGTA GCTGTAGCAA TTGTAGTTAA TAGGATAAGG ATAATTGTTA AGATCTTCAT CGACATCGTT 8701 TAATTCAGGC ATATGTATTT GTTCTACTGT TAAGTCTATA CCTACAAGAA ATTAAGTCCG TATACATAAA CAAGATGACA ATTCAGATAT GGATGTTCTT 8751 AATGTATAAT GGCTCACCAA GCACACGCAT ATCATATAGT TGACCCTAGT TTACATATTA CCGAGTGGTT CGTGTGCGTA TAGTATATCA ACTGGGATCA 8801 CCATGACCAT TAACCGGAGC TACAGCCGCC CTTCTTATAA CATCCGGGCT GGTACTGGTA ATTGGCCTCG ATGTCGGCGG GAAGAATATT GTAGGCCCGA 8851 AGCCATCTGA TTTCACTTCC ACTCATTATT ACTCCTCTAT TTAGGATTAA TCGGTAGACT AA.AGTGAAGG TGAGTAATAA TGAGGAGATA AATCCTAATT 8901 CCCTCCTACT ATTAACTATA ATTCAATGAT GACGTGACAT TATCCGAGAA GGGAGGATGA TAATTGATAT TAAGTTACTA CTGCACTGTA ATAGGCTCTT 8951 GGAACATTTC AAGGTCATCA TACACCTCCC GTTCA~~AAAG GCCTCCGTTA CCTTGTAAAG TTCCAGTAGT ATGTGGAGGG CAAGTTTTTC CGGAGGCAAT 9001 CGGGATAATC TTATTTATTA CATCAGAAGT ATTCTTCTTC CTAGGCTTCT GCCCTATTAG AATAAATAAT GTAGTCTTCA TAAGAAGAAG GATCCGAAGA 9051 TCTGAGCCTT TTACCACTCA 
AGTCTTGCCC CAACCCCAGA ATTAGGAGGA AGACTCGGAA AATGGTGAGT TCAGAACGGG GTTGGGGTCT TAATCCTCCT 9101 TGCTGACCAC CAACAGGAAT TAGCCCACTA GATCCATTTG AAGTACCACT ACGACTGGTG GTTGTCCTTA ATCGGGTGAT C TAGGTA.AAC TTCATGGTGA 9151 TCTAAATACC GCAGTACTTC TAGCCTCTGG TGTAACAGTA ACCTGAACCC AGATTTATGG CGTCATGAAG ATCGGAGACC ACATTGTCAT TGGACTTGGG 9201 ACCATAGTTT AATAGAAGGT AATC GA,AA,AG AGGCTATTCA AGCCCTCGCC TGGTATCAAA TTATCTTCCA TTAGCTTTTC TCCGATAAGT TCGGGAGCGG 9251 CTGACTATTC TTTTAGGATT CTACTTTACA GCCCTCCAAG CCATAGAATA GACTGATAAG AAAATCCTAA GATGAA.ATGT CGGGAGGTTC GGTATCTTAT 9301 TTACGAAGCA CCCTTCACAA TCGCCGATGG AGTTTATGGA ACAACATTCT AATGCTTCGT GGGAAGTGTT AGCGGCTACC TCAAATACCT TGTTGTAAGA 9351 TCGTTGCCAC AGGATTCCAC GGCCTCCATG TCATTATTGG TTCAACATTT AGCAACGGTG TCCTAAGGTG CCGGAGGTAC AGTAATAACC AAGTTGTAAA 9401 TTAGCAATCT GCTTACTACG ACAAATTCAA TACCACTTCA CATCAGAACA AATCGTTAGA CGAATGATGC TGTTTAAGTT ATGGTGAAGT GTAGTCTTGT 9451 TCACTTTGGT TTTGAAGCTG CCGCATGATA TTGACACTTT GTAGATGTAG AGTGAA.ACCA AAACTTCGAC GGCGTACTAT AAC TGTG.AAA CATCTACATC 9501 TATGACTATT CCTTTATGTA TCCATCTATT GATGAGGCTC ATAATTACTT ATACTGATAA GGAAATACAT AGGTAGATAA CTACTCCGAG TATTAATGAA 9551 TTCTAGTATA ACTAGTACAA ATGATTTCCA ATCATTTAGT CTTGGTTAA.A 202

AAGATCATAT TGATCATGTT TACTAAAGGT TAGTAAATCA GAACCAATTT 9601 CTCCAAGGAA AAGTAATGAA CCTCATCACG TCTTCTATCG CAGCTACGGC GAGGTTCCTT TTCATTACTT GGAGTAGTGC AGAAGATAGC GTCGATGCCG 9651 CCTGATTTCC CTAATCCTTG TATTAATTGC ATTTTGACTT CCGTCATTAA TGAA GGACTAAAGG GATTAGGAAC ATAATTAACG TA.A.AAC GGCAGTAATT 9701 ACCCAGATAA TGP►~3AA,AC TA TCCCCATATG AATGCGGCTT TGACCCCCTA TGGGTCTATT ACTTTTTGAT AGGGGTATAC TTACGCCGAA ACTGGGGGAT 9751 GGAAGTGCAC GCCTTCCATT TTCATTACGT TTCTTCCTTG TAGCTATTCT CCTTCACGTG CGGAAGGTAA AAGTAATGCA AAGAAGGAAC ATCGATAAGA 9801 ATTCTTACTA TTTGACCTAG AAATCGCCCT ACTCCTTCCC TTGCCATGAG TAAGAATGAT AAACTGGATC TTTAGCGGGA TGAGGAAGGG AACGGTACTC 9851 GTGATCAACT ATTATCCCCA CTTTCCACGC TACTCTGAGC AACAATTATT CACTAGTTGA TAATAGGGGT GAAAGGTGCG ATGAGACTCG TTGTTAATAA 9901 CTAATTTTAC TAACTCTAGG CCTTATCTAT GAATGACTTC AAGGAGGATT GATTAAAATG ATTGAGATCC GGAATAGATA CTTACTGAAG TTCCTCCTAA 9951 AGAATGAGCA GAATGAATAT TTAGTCTAAA TGAAGACCAC TAATTTCGAC TCTTACTCGT CTTACTTATA AATCAGATTT ACTTCTGGTG ATTAAAGCTG 10001 TTAGTAAACT ATGGTGA~AAA TCCATAAATA TTCTATGTCT CTCATACATT AATCATTTGA TACCACTTTT AGGTATTTAT AAGATACAGA GAGTATGTAA 10051 TCAGCCTCAA CTCAGCATTT ATTCTAGGTC TCATGGGCCT CGCACTTAAT AGTCGGAGTT GAGTCGTAAA TAAGATCCAG AGTACCCGGA GCGTGAATTA 10101 CGTTATCACC TTCTATCTGC ACTCCTATGC TTAGAAAGCA TACTACTAAC GCAATAGTGG AAGATAGACG TGAGGATACG AATCTTTCGT ATGATGATTG 10151 TCTATTCATC ACCATTGCTA TCTGAACCCT AACACTGAAC TCCACTTCAT AGATAAGTAG TGGTAACGAT AGACTTGGGA TTGTGACTTG AGGTGAAGTA 10201 CCTCAATTAT CCCCATAATC CTCCTTACAT TTTCAGCCTG TGAAGCTAGC GGAGTTAATA GGGGTATTAG GAGGAATGTA A.AAGTC GGAC ACTTCGATCG 10251 GCAGGC C TAG CCATTCTAGT AGCCACCTCA CGCTCTCATG GCTCCGATAA CGTCCGGATC GGTAAGATCA TCGGTGGAGT GCGAGAGTAC CGAGGCTATT 10301 CTTGCAAAAC CTAAACCTTC TCCAATGCTA AAAATCCTAA TTCCAACAAT GAACGTTTTG GATTTGGAAG AGGTTACGAT TTTTAGGATT AAGGTTGTTA 10351 TATACTCTTT CCAACCACAT GAATTATTAA C TGA TTATGACCTA ATATGAGA.AA GGTTGGTGTA CTTAATAATT GTTTTTTACT AATACTGGAT 10401 TAATTACCAC CCATAGCCTT ATAATCGCAT TACTAAGCCT ACTTCTATTC ATTAATGGTG GGTATCGGAA 
TATTAGCGTA ATGATTCGGA TGAAGATAAG 10451 AAATGAAATA TAGATATCGG CTGAGATTTT TCTAACCAAT TTATAGCCGC TTTACTTTAT ATCTATAGCC GACTCTAAAA AGATTGGTTA AATATCGGCG 10501 TGATCCTTTA TCCACCCCCT TGCTAATTCT TACATGCTGA CTCCTCCCAT ACTAGGAAAT AGGTGGGGGA ACGATTAAGA ATGTACGACT GAGGAGGGTA 10551 TAATAATTTT AGCCAGTCAA AACCACATTT CCCCAGAACC AATTATCCGG ATTATTAAAA TCGGTCAGTT TTGGTGTAAA GGGGTCTTGG TTAATAGGCC 10601 CAACGAACAT ACATTACACT TCTAATCTCC CTCCAAGCCT TCCTCATTAT GTTGCTTGTA TGTAATGTGA AGATTAGAGG GAGGTTCGGA AGGAGTAATA 10651 AGCATTCTCT GCAACCGAA.A TAATTCTATT TTACATTATA TTTGAAGCTA TCGTAAGAGA CGTTGGCTTT ATTAAGATAA AATGTAATAT AAACTTCGAT 10701 CGCTTATCCC AACCCTCATC ATTATTACAC GATGAGGGAA C CA.AACAGAA GCGAATAGGG TTGGGAGTAG TAATAATGTG CTACTCCCTT GGTTTGTCTT 10751 CGCTTAAATG CGGGCACCTA CTTTTTATTT TACACCTTAA TTGGCTCACT GCGAATTTAC GCCCGTGGAT GAAAAATAAA ATGTGGAATT AACCGAGTGA 10801 TCCTCTTCTT ATTGCCCTTC TACTTATACA A~ATAACCTA GGCACCCTGT AGGAGAAGAA TAACGGGAAG ATGAATATGT TTTATTGGAT CCGTGGGACA 10851 CCATAATTAT TATGCAACAC TCACAATTTC CAA.ACCTATT CTCATGAGCA GGTATTAATA ATACGTTGTG AGTGTTA.AAG GTTTGGATAA GAGTACTCGT 10901 GATAAATTAT GATGAATAGC CTGTCTCATC GCCTTCCTTG TCAAAATAC C CTATTTAATA CTACTTATCG GACAGAGTAG CGGAAGGAAC AGTTTTATGG 203

10951 TTTATATGGA ATCCACCTCT GACTTCCCAA AGCCCATGTT GAAGCCCCAA AAATATACCT TAGGTGGAGA CTGAAGGGTT TCGGGTACAA CTTCGGGGTT TACTTAA.ACT 11001 TCGCCGGCTC AATAATCCTA GCAGCAGTAT AGGAGGTTAT AGCGGCCGAG TTATTAGGAT CGTCGTCATA ATGAATTTGA TCCTCCAATA 11051 GGAATAATAC GAATTATTGT GATACTAAAT CCACTAACCA AAGAAATAGC CCTTATTATG CTTAATAACA CTATGATTTA GGTGATTGGT TTCTTTATCG 11101 CTATCCATTC TTAATTCTAG CTATCTGAGG AATCATTATA ACTAGTTCTA GATAGGTAAG AATTAAGATC GATAGACTCC TTAGTAATAT TGATCAAGAT 11151 TTTGTCTACG ACAAACAGAC TTAAAGTCCC TGATCGCTTA CTCATCAGTA AAACAGATGC TGTTTGTCTG AATTTCAGGG ACTAGCGAAT GAGTAGTCAT 11201 AGCCACATGG GCTTAGTCGC CGGAGCAATT CTTATCCAAA CACCATGAAG TCGGTGTACC CGAATCAGCG GCCTCGTTAA GAATAGGTTT GTGGTACTTC 11251 CTTCGCAGGA GCAATCACAC TTATAATCGC CCACGGCTTA ATTTCCTCAG GAAGCGTCCT CGTTAGTGTG AATATTAGCG GGTGCCGAAT TAA.AGGAGTC 11301 CCTTATTTTG CTTAGCTAAT ACCAATTATG AACGTATTCA CAGCCGAACC GGAATAAAAC GAATCGATTA TGGTTAATAC TTGCATAAGT GTCGGCTTGG 11351 ATACTCCTAG CCCGAGGCGT ACAAATTATC CTCCCACTGA TAGCCACCTG TATGAGGATC GGGCTCCGCA TGTTTAATAG GAGGGTGACT ATCGGTGGAC 11401 ATGATTCTTT GCTAGCCTAG CTAACCTTGC CCTTCCCCCA TCCCCCAATC TACTAAGAAA CGATCGGATC GATTGGAACG GGAAGGGGGT AGGGGGTTAG 11451 TTATAGGAGA ACTTCTTATT ATTACTTCAT TATTCAATTG ATCCAACTGA AATATCCTCT TGAAGAATAA TAATGAAGTA ATAAGTTAAC TAGGTTGACT 11501 ACTATGATCC TATCAGGTCT TGGAGTATTA ATTACAGCCT CCTACTCGCT TGATACTAGG ATAGTCCAGA ACCTCATAAT TAATGTCGGA GGATGAGCGA 11551 CTACATGTTC TTAATAACCC AACGAGGTCC AACCCCCCAT CATATTCTAG GATGTACAAG AATTATTGGG TTGCTCCAGG TTGGGGGGTA GTATAAGATC 11601 CATTAAACCC AAATTATACA CGAGAACATC TCCTCCTAAG CCTTCACCTC GTAATTTGGG TTTAATATGT GCTCTTGTAG AGGAGGATTC GGAAGTGGAG 11651 ATACCTGTCC TACTACTGAT ATTTA.AACCA GAACTTATTT GAGGGTGAAC TATGGACAGG ATGATGACTA TAAATTTGGT CTTGAATAAA CTCCCACTTG 11701 ACTCTGTATT TATAGTTTAA CTAAAACATT AGATTGTGGT TCTp~~AAATA TGAGACATAA ATATCAAATT GATTTTGTAA TCTAACACCA AGATTTTTAT 11751 AAAGTT~ CCTTTTTAAT TACCGAGAGA GGTCAGGGAC ACGAAGAACT TTTCAATTTT GGAAAAATTA ATGGCTCTCT CCAGTCCCTG TGCTTCTTGA 11801 GCTAATTCTT 
CTTATCATGG CTCAA.ATCCA TGACTCACTC AGCTTCTGAA CGATTAAGAA GAATAGTACC GAGTTTAGGT ACTGAGTGAG TCGAAGACTT 11851 AGATAATAGT AATCTATTGG TCTTAGGAAC CA,AA.A.AC TC T TGGTGCAACT TCTATTATCA TTAGATAACC AGAATCCTTG GTTTTTGAGA ACCACGTTGA 11901 CCAAGCAAAA GCTATGAACA CCATCTTCAA TTCATCATTC CTTCTAATCT GGTTCGTTTT CGATACTTGT GGTAGAAGTT AAGTAGTAAG GAAGATTAGA 11951 TTATTATCCT TATCTTCCCA CTAATAACCT CATTAAGTCC TAAAGAATTT AATAATAGGA ATAGAAGGGT GATTATTGGA GTAATTCAGG ATTTCTTA.AA 12001 AACTCCAATT GATCATCATC CTATGTAAAA ACAGCCGTAA AAATTTCCTT TTGAGGTTAA CTAGTAGTAG GATACATTTT TGTCGGCATT TTTAAAGGAA 12051 TTTTATTAGC CTTATCCCCT TATTTATTTT CCTCGACCAA GGCCTAGAAT A.A.AATAATC G GAATAGGGGA ATAAATAAAA GGAGCTGGTT CCGGATCTTA 12101 CAATTATAAC CAATTATAAT TGAATAA.ATA TTGGACCATT TGACATTAAC GTTAATATTG GTTAATATTA ACTTATTTAT AACCTGGTAA ACTGTAATTG 12151 ATAAGCTTCA AATTTGATAT ATATTCAATC ATATTCACCC CCGTAGCCCT TATTCGAAGT TTAAACTATA TATAAGTTAG TATAAGTGGG GGCATCGGGA 12201 CTATGTTACC TGGTCCATCC TTGAATTTGC CTTATGATAC ATACACTCTG GATACAATGG ACCAGGTAGG AAC TTA.AAC G GAATACTATG TATGTGAGAC 12251 ATCCCAATAT TAACCGCTTT TTCA.AATATT TATTACTCTT TTTAATTTCA TAGGGTTATA ATTGGCGA.AA AAGTTTATAA ATAATGAGAA AAATTA.AAGT 12301 ATAATTATTC TAGTTACAGC TAATAACATA TTTCAACTTT TCATCGGATG 204

A.A.AGTTGAAA TATTAATAAG ATCAATGTCG ATTATTGTAT AGTAGCCTAC 12351 AGAAGGAGTT GGAATCATAT CATTTCTCCT CATTGGCTGA TGATACAGCC TCTTCCTCAA CCTTAGTATA GTA.AAGAGGA GTAACCGACT ACTATGTCGG 12401 GAACAGATGC TAATACTGCC GCTCTCCAAG CTGTAATTTA TAACCGAGTG CTTGTCTACG ATTATGACGG CGAGAGGTTC GACATTAAAT ATTGGCTCAC 12451 GGAGATATCG GATTAATCCT CAGCATAGCT TGATTGGCTA TAAACTTAAA CCTCTATAGC CTAATTAGGA GTCGTATCGA ACTAACCGAT ATTTGAATTT 12501 TTCATGAGAA ATCCAACAAC TATTCATTCT ATC TP.~~AAAC ATAGACCTAA AAGTACTCTT TAGGTTGTTG ATAAGTAAGA TAGATTTTTG TATCTGGATT 12551 CATTACCTCT CTTCGGCCTG GTCCTAGCTG CAGCTGGAAA ATCCGCACAA GTAATGGAGA GAAGCCGGAC CAGGATCGAC GTCGACCTTT TAGGCGTGTT 12601 TTTGGCCTTC ACCCCTGACT TCCCTCTGCT ATAGAAGGAC CAACACCAGT AAACCGGAAG TGGGGACTGA AGGGAGACGA TATCTTCCTG GTTGTGGTCA 12651 CTCCGCCCTG CTCCACTCCA GCACAATAGT TGTCGCCGGC ATTTTTCTAC GAGGCGGGAC GAGGTGAGGT CGTGTTATCA ACAGCGGCCG TAAAAAGATG 12701 TAATCCGTCT CCACCCATTA ATTCAAGACA ACCAATTAAT TTTAACAACA ATTAGGCAGA GGTGGGTAAT TAAGTTCTGT TGGTTAATTA AAATTGTTGT 12751 TGCCTATGTC TAGGAGCACT AACTACCCTT TTTACTGCAA CATGTGCCCT ACGGATACAG ATCCTCGTGA TTGATGGGAA AAATGACGTT GTACACGGGA 12801 CAC C C1?~AAAT GACATCAAAA A.AATCGTTGC CTTCTCAACA TCAAGCCAAC GTGGGTTTTA CTGTAGTTTT TTTAGCAACG GAAGAGTTGT AGTTCGGTTG 12851 TCGGACTAAT AATAGTAACA ATTGGCCTTA ACCAACCCCA ACTCGCCTTC AGCCTGATTA TTATCATTGT TAACCGGAAT TGGTTGGGGT TGAGCGGAAG 12901 CTCCATATTT GTACCCACGC CTTCTTCAAA GCCATACTCT TCCTCTGCTC GAGGTATAA.A CATGGGTGCG GAAGAAGTTT CGGTATGAGA AGGAGACGAG 12951 AGGTTCTATT ATCCACAGCC TTAATGATGA ACAAGACATT CGCA.A.AATAG TCCAAGATAA TAGGTGTCGG AATTACTACT TGTTCTGTAA GCGTTTTATC 13001 GAGGACTCCA CAAACTCCTA CCATTTACCT CATCTTCTCT AACTATCGGA CTCCTGAGGT GTTTGAGGAT GGTAAATGGA GTAGAAGAGA TTGATAGCCT 13051 AGTTTAGCCC TTACAGGCAT ACCCTTCTTA TCAGGCTTCT TC TCA.A.AAGA TCAAATCGGG AATGTCCGTA TGGGAAGAAT AGTCCGAAGA AGAGTTTTCT 13101 CGCCATCATC GAATCCATAA ACACTTCACA CCTAAACGCC TGAGCCCTAG GCGGTAGTAG CTTAGGTATT TGTGAAGTGT GGATTTGCGG ACTCGGGATC 13151 TCCTCACCCT CATCGCAACA TCATTTACAG CTATCTATAG TCTACGCCTT AGGAGTGGGA 
GTAGCGTTGT AGTAAATGTC GATAGATATC AGATGCGGAA
13201 GTATTCTTCA CATTAATAAA TTTTCCACGA TTCAACCCAC TCTCCCCAGT CATAAGAAGT GTAATTATTT AAAAGGTGCT AAGTTGGGTG AGAGGGGTCA
13251 CAATGAAAAT AACCCAATAG TGATTAACCC TATCAAACGC CTAGCCTATG GTTACTTTTA TTGGGTTATC ACTAATTGGG ATAGTTTGCG GATCGGATAC
13301 GAAGCATCCT AGCCGGCCTT ATTATCACAT CAAACTTAAC CCCAACAAAA CTTCGTAGGA TCGGCCGGAA TAATAGTGTA GTTTGAATTG GGGTTGTTTT
13351 ACCCAAATTA TGACAATACC CCCCTTACTA AAACTCTCCG CCCTACTAGT TGGGTTTAAT ACTGTTATGG GGGGAATGAT TTTGAGAGGC GGGATGATCA
13401 CACAATTATT GGTCTCCTAT TAGCCTTAGA ATTAGCTAAT TTAACTAACA GTGTTAATAA CCAGAGGATA ATCGGAATCT TAATCGATTA AATTGATTGT
13451 CCCAACTTAA AATAAACCCC ACCCTATATA CCCACCACTT CTCCAACATG GGGTTGAATT TTATTTGGGG TGGGATATAT GGGTGGTGAA GAGGTTGTAC
13501 CTAGGATATT TTCCACAAAT CATTCATCGT CTCCTACCAA AAATTAACTT GATCCTATAA AAGGTGTTTA GTAAGTAGCA GAGGATGGTT TTTAATTGAA
13551 AAGCTGAGCC CAACATATCT CAACCCACCT GATTGACCAA ACATGAAATG TTCGACTCGG GTTGTATAGA GTTGGGTGGA CTAACTGGTT TGTACTTTAC
13601 AAAAAATCGG ACCAAAAAGC ACCCTCATTC AACAGACCCC ACTAATCAAA TTTTTTAGCC TGGTTTTTCG TGGGAGTAAG TTGTCTGGGG TGATTAGTTT
13651 CTATCCACCC AACCACAACA AGGTTATATT AAAATTTATC TCATACTACT GATAGGTGGG TTGGTGTTGT TCCAATATAA TTTTAAATAG AGTATGATGA

13701 TTTCCTTACA TTAACCTTAG CCCTACTGAC CTCACTAACC TAACTGCACG AAAGGAATGT AATTGGAATC GGGATGACTG GAGTGATTGG ATTGACGTGC 13751 TAAGGTCCCC CAAGATAATC CCCGAGTCAA TTCCAACACC ACAAACAGAG ATTCCAGGGG GTTCTATTAG GGGCTCAGTT AAGGTTGTGG TGTTTGTCTC 13801 TTAACAATAG TACCCACCCA CTTAAAATTA ACAACCACCC ACCATTAGCA AATTGTTATC ATGGGTGGGT GAATTTTAAT TGTTGGTGGG TGGTAATCGT 13851 TATAATAAAG CCACCCCTAT AAAATCTCCA CGGACTACCG CCATACTATT ATATTATTTC GGTGGGGATA TTTTAGAGGT GCCTGATGGC GGTATGATAA 13901 TAACTCCTCT ACCCCTGCCC AACTTAACTC AAACCATTTA AC TATAA.A.AT ATTGAGGAGA TGGGGACGGG TTGAATTGAG TTTGGTAAAT TGATATTTTA 13951 ATTTACCAAC P►~~AAATTAAG GCTATTGCAT AAAACCCAAC ATACAATAAC TAAATGGTTG TTTTTAATTC CGATAACGTA TTTTGGGTTG TATGTTATTG 14001 ACAGATCAAT TACCCCACGA CTCAGGATAA GGCTCAGCAG CAAGCGCTGC TGTCTAGTTA ATGGGGTGCT GAGTCCTATT CCGAGTCGTC GTTCGCGACG 14051 TGTATAAGCA AATACCACCA ACATTCCCCC CAA.ATAAATT ~ACAAAA ACATATTCGT TTATGGTGGT TGTAAGGGGG GTTTATTTAA TTTTTGTTTT 14101 CTAATGACAA AAAAGATCCA CCATGACCCA CCAACAACCC ACATCCCACC GATTACTGTT TTTTCTAGGT GGTACTGGGT GGTTGTTGGG TGTAGGGTGG 14151 CCAGCAGCCA TAACCAACCC CAATGCAGCA TAATAAGGTG AAGGATTAGA GGTCGTCGGT ATTGGTTGGG GTTACGTCGT ATTATTCCAC TTCCTAATCT 14201 CGCCACCCCT ATTA.AAC C TA A.AAC TAAACA AACTATTATT A~~AA.AC ATAA GCGGTGGGGA TAATTTGGAT TTTGATTTGT TTGATAATAA TTTTTGTATT 14251 AATATACCAT TATTCCTACC TGGACTCTAA CCAAGACCAA TAACTTGAAA TTATATGGTA ATAAGGATGG ACCTGAGATT GGTTCTGGTT ATTGAACTTT 14301 AACTATCGTT GTTCATTCAA CTATAAGAAT TTATGGCCAC A.AACATCCGA TTGATAGCAA CAAGTAAGTT GATATTCTTA AATACCGGTG TTTGTAGGCT 14351 AAAACCCACC CATTACTAAA AATCGTTAAC CA.AACC TTAA TTGACCTCCC TTTTGGGTGG GTAATGATTT TTAGCAATTG GTTTGGAATT AACTGGAGGG 14401 AGCCCCGTCC AACATTTCAA TTTGATGAAA CTTTGGCTCA CTTCTGGGGT TCGGGGCAGG TTGTAA.AGTT AA.AC TAC TTT GAAACCGAGT GAAGACCCCA 14451 TATGCCTAAT TATTCAAATC CTCACAGGTC TTTTCCTAGC AATACATTAT ATACGGATTA ATAAGTTTAG GAGTGTCCAG AAAAGGATCG TTATGTAATA 14501 ACTGCAGATA TCTCCATAGC CTTTTCCTCA GTAATCCATA TTTGCCGCGA TGACGTCTAT AGAGGTATCG GAAAAGGAGT CATTAGGTAT AAACGGCGCT 14551 
CGTCAATTAT GGTTGACTTA TCCGTAATAT TCACGCCAAC GGAGCCTCAT GCAGTTAATA CCAACTGAAT AGGCATTATA AGTGCGGTTG CCTCGGAGTA 14601 TATTCTTTAT CTGCGTTTAC TTACATATCG CCCGAGGACT TTACTACGGC ATAAGA.AATA GACGCAA.ATG AATGTATAGC GGGCTCCTGA AATGATGCCG 14651 TCCTACCTTT ATAAAGAGAC ATGAAATATT GGAGTAGTCT TACTATTTCT AGGATGGAAA TATTTCTCTG TACTTTATAA CCTCATCAGA ATGATAAAGA 14701 ATTAATAGCC ACAGCCTTCG TAGGCTATGT ACTGCCATGA GGACAAATAT TAATTATCGG TGTCGGAAGC ATCCGATACA TGACGGTACT CCTGTTTATA 14751 CCTTCTGAGG CGCCACAGTC ATCACCAACC TCCTATCCGC CTTTCCTTAT GGAAGACTCC GCGGTGTCAG TAGTGGTTGG AGGATAGGCG GAAAGGAATA 14801 ATTGGAGATA TACTAGTCCA ATGAATTTGA GGAGGCTTCT CAGTAGATAA TAACCTCTAT ATGATCAGGT TAC T TA.AAC T CCTCCGAAGA GTCATCTATT 14851 CGCCACCCTA ACACGATTCT TCGCATTCCA CTTTCTCCTA CCTTTTCTAA GCGGTGGGAT TGTGCTAAGA AGCGTAAGGT GAAAGAGGAT GGAAAAGATT 14901 TCACAGCGCT AATAATGATC CACATCCTCT TCCTACATGA AACAGGCTCA AGTGTCGCGA TTATTACTAG GTGTAGGAGA AGGATGTACT TTGTCCGAGT 14951 AACAATCCTA TAGGCCTTAA CTCTGACATA GATAAAATCT CCTTTCACCC TTGTTAGGAT ATCCGGAATT GAGACTGTAT CTATTTTAGA GGAAAGTGGG 15001 CTACTTCTCC TATAAAGACG CACTTGGCTT TCTCACCCTA CTCATACTCC GATGAAGAGG ATATTTCTGC GTGAACCGAA AGAGTGGGAT GAGTATGAGG 15051 TAGGAGTCCT AGCCCTATTT CTCCCTAATC TATTAGGAGA C GC C GA►AAAC 206

ATCCTCAGGA TCGGGATAAA GAGGGATTAG ATAATCCTCT GCGGCTTTTG 15101 TACATCCCCG CA.AACCCTCT TGTCACCCCT CCCCACATTA AACCTGAATG ATGTAGGGGC GTTTGGGAGA ACAGTGGGGA GGGGTGTAAT TTGGACTTAC 15151 ATACTTCCTA TTTGCCTATG CCATTCTTCG ATCCATCCCT AATA.AACTAG TATGAAGGAT AAACGGATAC GGTAAGAAGC TAGGTAGGGA TTATTTGATC 15201 GAGGAGTCCT AGCCCTCCTA TTCTCCATTC TCATCCTTAT ATTAGTTCCC CTCCTCAGGA TCGGGAGGAT AAGAGGTAAG AGTAGGAATA TAATCAAGGG 15251 CTTCTACACA CCTCTAAACA ACGAAGCAGC ACCTTTCGCC CACTCACACA GAAGATGTGT GGAGATTTGT TGCTTCGTCG TGGAAAGCGG GTGAGTGTGT 15301 AGTCTTCTTC TGAACCCTCG TAACCAATAT ACTAATCTTA ACCTGAATTG TCAGAAGAAG ACTTGGGAGC ATTGGTTATA TGATTAGAAT TGGACTTAAC 15351 GAGGACAACC AGTTGAACAA CCATTCATTC TTATTGGACA AATTGCATCT CTCCTGTTGG TCAACTTGTT GGTAAGTAAG AATAACCTGT TTAACGTAGA 15401 ATCACCTACT TTTCCTTATT TCTCATTGTA ATTCCACTCA CAGGCTGATG TAGTGGATGA A.AAGGAATAA AGAGTAACAT TAAGGTGAGT GTCCGACTAC 15451 AG~TAAA ATCCTCAACC TAA.AC TAGTT TTGGTAGCTT AACTTAACAA TCTTTTATTT TAGGAGTTGG ATTTGATCAA AACCATCGAA TTGAATTGTT 15501 AGCGTCGACC TTGTAAGTCG AAGACCGGAG GTTCAAACCC TCCCCA,AAAC TCGCAGCTGG AACATTCAGC TTCTGGCCTC CAAGTTTGGG AGGGGTTTTG 15551 AATTAAGAAA AAAAGGACTA AACTCCTATT AAAATATATC AGGGGAAGGA TTAATTCTTT TTTTCCTGAT TTGAGGATAA TTTTATATAG TCCCCTTCCT 15601 GGGTTAA.AC T CCTGCCCTTG GCTCCCAAAG CCAAGATTCT GCCCAAACTG CCCAATTTGA GGACGGGAAC CGAGGGTTTC GGTTCTAAGA CGGGTTTGAC 15651 CCCCCTGTAA TGTAATA.A.A.A GCATGAAAAC CGAATAGACA TTCGGTTTTC GGGGGACATT ACATTATTTT CGTACTTTTG GCTTATCTGT AAGC CAAA.AG 15701 p►~~A.AAGTAAG TCAGTGTGAC ATATTAATGA CATAGCCCAC ATACCTTAAT TTTTTCATTC AGTCACACTG TATAATTACT GTATCGGGTG TATGGAATTA 15751 ATAGTACATT ACTTAACTCG ATTAACCAAC ATTAATAGCC TATTCCCTAC TATCATGTAA TGAATTGAGC TAATTGGTTG TAATTATCGG ATAAGGGATG 15801 TACTATTATT ATCTATGCTT AATCCCCATT AATCTATATT CCACTATATC ATGATAATAA TAGATACGAA TTAGGGGTAA TTAGATATAA GGTGATATAG 15851 ATAACATACT ATGCTTAATA CTCATTAATA TACTATCCAC TATTTCATTA TATTGTATGA TACGAATTAT GAGTAATTAT ATGATAGGTG ATA.AAGTAAT 15901 CATTTTGGGC TTTAGCCCCC ATTTATCTAA AATCAAAATT TCCATATCAT GTAAAACCCG 
AAATCGGGGG TAAATAGATT TTAGTTTTAA AGGTATAGTA
15951 AGAATTATTC ATTCCACCCA CTATTACTTA AGTTATGAAT ATGCGGGTTG TCTTAATAAG TAAGGTGGGT GATAATGAAT TCAATACTTA TACGCCCAAC
16001 GTAAGAACAT CACATCCCGC TATTATAAGG AAAAAATTGC TCTATTTGTG CATTCTTGTA GTGTAGGGCG ATAATATTCC TTTTTTAACG AGATAAACAC
16051 GCACTGTACT CGATTAATCC CTATCAATTG ACCAGAACTG GCATCTGATT CGTGACATGA GCTAATTAGG GATAGTTAAC TGGTCTTGAC CGTAGACTAA
16101 AATGCTTGAG CTACTTCAGT CCTTGATCGC GTCAAGAATG CCAGCCCCCT TTACGAACTC GATGAAGTCA GGAACTAGCG CAGTTCTTAC GGTCGGGGGA
16151 AGTTCCCTTT AATGGCACCT TCGTCCTTGA CTGTCTCAAG ATTTATTGTC TCAAGGGAAA TTACCGTGGA AGCAGGAACT GACAGAGTTC TAAATAACAG
16201 CGCCCTTCAA TTTTTTTCGG GGATGAAGCA ATTACTAAGC CCGGGAGGGC GCGGGAAGTT AAAAAAAGCC CCTACTTCGT TAATGATTCG GGCCCTCCCG
16251 TGATCTGGGA CTCTGAGATA AACCTGAATC CACCTCGACA TTTACTTAAA ACTAGACCCT GAGACTCTAT TTGGACTTAG GTGGAGCTGT AAATGAATTT
16301 ATACTCATTA CTCACCATTC ATGAATTATA ATTGTCAAGT TGACCATAGC TATGAGTAAT GAGTGGTAAG TACTTAATAT TAACAGTTCA ACTGGTATCG
16351 TGAAAGGAAT AGAGAAATTG ACGCCATAGG CGACAAGTTT CGATTTTTTT ACTTTCCTTA TCTCTTTAAC TGCGGTATCC GCTGTTCAAA GCTAAAAAAA
16401 GATTAATGAA ACTACGGTTT AAAAAATACA TTCTCTTAAT CCTCATCAAA CTAATTACTT TGATGCCAAA TTTTTTATGT AAGAGAATTA GGAGTAGTTT

16451 AGCAAGACGC GATAAATATT CGTGTAAAGC GCACCTAATA ATCCTAGTAC TCGTTCTGCG CTATTTATAA GCACATTTCG CGTGGATTAT TAGGATCATG
16501 ATACTTCACT TTACTAGGCA TAAATTTATT ATTATTAAGT TTCCCCCTGG TATGAAGTGA AATGATCCGT ATTTAAATAA TAATAATTCA AAGGGGGACC
16551 GTTGTAAAAA TTTCGGAGCC GCTTTAAAAA AAAAAAACAT TTTTTGGTAA CAACATTTTT AAAGCCTCGG CGAAATTTTT TTTTTTTGTA AAAAACCATT
16601 AAACCCCCCT CCCCCTAATA TACACGGACT CCTCGAAAAA CCCCTAAAAC TTTGGGGGGA GGGGGATTAT ATGTGCCTGA GGAGCTTTTT GGGGATTTTG
16651 GAGGGCCGGA CATATATCTT TGAATTAGCA TGCGAAATAT TCTCTATATA CTCCCGGCCT GTATATAGAA ACTTAATCGT ACGCTTTATA AGAGATATAT
16701 TAGTGTTACA CTATGAT ATCACAATGT GATACTA

tRNA 1..70 product = tRNA-Phe
rRNA 69..1021 product = 12S ribosomal RNA
tRNA 1022..1093 product = tRNA-Val
rRNA 1094..2757 product = 16S ribosomal RNA
tRNA 2758..2832 product = tRNA-Leu
gene 2833..3807 gene = ND1 product = NADH dehydrogenase subunit 1
tRNA 3810..3878 product = tRNA-Ile
tRNA 3877..3948 product = tRNA-Gln
tRNA 3949..4017 product = tRNA-Met
gene 4018..5061 gene = ND2 product = NADH dehydrogenase subunit 2
tRNA 5061..5131 product = tRNA-Trp
tRNA complement (5133..5201) product = tRNA-Ala
tRNA complement (5202..5274) product = tRNA-Asn
tRNA complement (5308..5374) product = tRNA-Cys
tRNA complement (5376..5445) product = tRNA-Tyr
gene 5447..7000 gene = CO1 product = cytochrome c oxidase subunit 1
tRNA complement (7003..7073) product = tRNA-Ser
tRNA 7078..7147 product = tRNA-Asp
gene 7152..7842 gene = CO2 product = cytochrome c oxidase subunit 2
tRNA 7843..7916 product = tRNA-Lys
gene 7918..8085 gene = ATP8 product = ATP synthase F0 subunit 8
gene 8076..8759 gene = ATP6 product = ATP synthase F0 subunit 6
gene 8759..9544 gene = CO3 product = cytochrome c oxidase subunit 3
tRNA 9547..9615 product = tRNA-Gly
gene 9616..9966 gene = ND3 product = NADH dehydrogenase subunit 3
tRNA 9965..10034 product = tRNA-Arg
gene 10035..10331 gene = ND4L product = NADH dehydrogenase subunit 4L
gene 10325..11705 gene = ND4 product = NADH dehydrogenase subunit 4
tRNA 11706..11774 product = tRNA-His
tRNA 11775..11841 product = tRNA-Ser
tRNA 11842..11913 product = tRNA-Leu
gene 11914..13743 gene = ND5 product = NADH dehydrogenase subunit 5
gene complement (13739..14260) gene = ND6 product = NADH dehydrogenase subunit 6
tRNA complement (14261..14330) product = tRNA-Glu
gene 14333..15478 gene = CYTB product = cytochrome b
tRNA 15478..15551 product = tRNA-Thr
insert 15552..15588
tRNA complement (15589..15657) product = tRNA-Pro
D-Loop 15659..16717
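
The feature coordinates listed above appear to follow the GenBank-style convention: positions are 1-based and inclusive, and "complement (a..b)" marks a feature encoded on the opposite strand. As a minimal sketch under that assumption, the Python fragment below shows how a feature could be pulled out of a sequence string. The function names and the toy sequence are hypothetical and are not part of the thesis; the toy string is not genome data.

```python
# Translation table for the four unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_feature(genome: str, start: int, end: int,
                    on_complement: bool = False) -> str:
    """Extract a feature using 1-based, inclusive coordinates.

    If on_complement is True, the region is returned as its reverse
    complement, i.e. read 5'->3' on the opposite strand.
    """
    if not (1 <= start <= end <= len(genome)):
        raise ValueError("coordinates must satisfy 1 <= start <= end <= len(genome)")
    region = genome[start - 1:end]  # convert to Python's 0-based, half-open slice
    return reverse_complement(region) if on_complement else region


# Toy sequence for illustration only (hypothetical, not taken from the listing).
toy = "ATGCCGTTAGGA"
print(extract_feature(toy, 1, 3))                      # forward-strand feature
print(extract_feature(toy, 4, 9, on_complement=True))  # opposite-strand feature
```

Note that the listings in this appendix print the second strand as a position-by-position complement read left to right, whereas the function above returns the reverse complement, which is the usual orientation for reading a complement-strand feature.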

Alopias vulpinus mitochondrion, complete genome

1 GCTAGTGTAG CTTAATTTTA AAGCATGGCA CTGAAGATGC TAAAATGAAA CGATCACATC GAATTAAAAT TTCGTACCGT GACTTCTACG ATTTTACTTT

51 AATP~~AAATT TTCCACAGGC ATAAAGGTTT GGTCCTGGCC TCAGTATTA.A TTATTTTTAA AAGGTGTCCG TATTTCCAAA CCAGGACCGG AGTCATAATT 101 TTGTAACTAA AATTATACAT GCAAGTTTCA GCACCCCCGT GAGAATGCCC AACATTGATT TTAATATGTA CGTTCAAAGT CGTGGGGGCA CTCTTACGGG 151 TAATTATTCT ATTA.ATTAAT TAGGAGCAGG TATCAGGCAC GCACACGTAG ATTAATAAGA TAATTAATTA ATCCTCGTCC ATAGTCCGTG CGTGTGCATC 201 CCCAAGACAC CTTGCTAAGC CACACCCCCA AGGGAGTTCA GCAGTAATAA GGGTTCTGTG GAACGATTCG GTGTGGGGGT TCCCTCAAGT CGTCATTATT 251 ATATTGATTC TATAAGCGCA AGCTTGAATC AGTTAAAGTT AACAGAGTTG TATAACTAAG ATATTCGCGT TCGAACTTAG TCA.ATTTCAA TTGTCTCAAC 301 GTAAATCTCG TGCCAGCCAC CGCGGTTATA CGAGTAACTC ACATTAATAT CATTTAGAGC ACGGTCGGTG GCGCCAATAT GCTCATTGAG TGTAATTATA 351 TTTCCCGGCG TAAAGAGTGA TTTAAGGAAT ATCTATCATA ACTAAAGTTA AAAGGGCCGC ATTTCTCACT AAATTCCTTA TAGATAGTAT TGATTTCAAT 401 AGACCTCGTC AAGCTGTTAC ACGCACCCAC GAATGGAATT GTCAATAACG TCTGGAGCAG TTCGACAATG TGCGTGGGTG CTTACCTTAA CAGTTATTGC 451 AAAGTGACTT TATCCCACTA GA.AATC TTGA TGTCACGACA GTTAGACCCC TTTCACTGAA ATAGGGTGAT CTTTAGAACT ACAGTGCTGT CAATCTGGGG 5 01 A.AAC TAGGAT TAGATACCCT ACTATGTCTA AC CACAA.AC T TAAACAATAA TTTGATCCTA ATCTATGGGA TGATACAGAT TGGTGTTTGA ATTTGTTATT 551 TTCACTATAT TGTTCGCCAG AGTACTACAA GCGCTAGCTT GAAACCCAAA AAGTGATATA ACAAGCGGTC TCATGATGTT CGCGATCGAA CTTTGGGTTT 601 GGACTTGGCG GTGTCCCA.AA CCCACCTAGA GGAGCCTGTT CTGTAACCGA CCTGAACCGC CACAGGGTTT GGGTGGATCT CCTCGGACAA GACATTGGCT 651 TAATCCCCGT TA.AAC C TCAC CACTTCTGGC CATCCCCGTC TATATACCGC ATTAGGGGCA ATTTGGAGTG GTGAAGACCG GTAGGGGCAG ATATATGGCG 701 CGTCGTCAGC TCACCCTGTG AAGGTP.~AA,AA AGTAAGCAAA AAGAATTAAC GCAGCAGTCG AGTGGGACAC TTCCATTTTT TCATTCGTTT TTCTTAATTG 751 TTCTACACGT CAGGTCGAGG TGTAGCAAAT GAAGTGGATA GAAATGGGCT AAGATGTGCA GTCCAGCTCC ACATCGTTTA CTTCACCTAT CTTTACCCGA 801 ACATTTTCTA TAAAGAAGAT ACGAATGGTA AAC TGP~~AAA TTACCTAAAG TGTAA.AAGAT ATTTCTTCTA TGCTTACCAT TTGACTTTTT AATGGATTTC 851 GTGGATTTAG CAGTAAGAAA AGATTAGAGA GCTTCTCTGA AACTGGCTCT CACCTAAATC GTCATTCTTT TCTAATCTCT CGAAGAGACT TTGACCGAGA 901 GGGACGCGCA CACACCGCCC GTCACTCTCC 
TCAAAAAAAT CTACTTATTT CCCTGCGCGT GTGTGGCGGG CAGTGAGAGG AGTTTTTTTA GATGAATAAA
951 ATAATTAAAA GAAAATTATC AAGAGGAGGC AAGTCGTAAC ATGGTAAGTG TATTAATTTT CTTTTAATAG TTCTCCTCCG TTCAGCATTG TACCATTCAC
1001 TACTGGAAAG TGCACTTGGA ATCAAAATGT GGCTAAATTA GCAAAGCACC ATGACCTTTC ACGTGAACCT TAGTTTTACA CCGATTTAAT CGTTTCGTGG
1051 TCCCTTACAC CGAGGAAATA CCCGTGCAAT TCGAGTCATT TTGAACATTA AGGGAATGTG GCTCCTTTAT GGGCACGTTA AGCTCAGTAA AACTTGTAAT
1101 AAGCTAGCCT GTATACCTTA ACCTTAAACC TAACCTTATT AAATTACCTT TTCGATCGGA CATATGGAAT TGGAATTTGG ATTGGAATAA TTTAATGGAA
1151 ATATGTTAAA TCATAACTAA AACATTTTAA CCTTTTAGTA TGGGCGACAG TATACAATTT AGTATTGATT TTGTAAAATT GGAAAATCAT ACCCGCTGTC
1201 AACAAAAACT CAGCGCAATA GATTATGTAC CGCAAGGGAA AGCTGAAAAA TTGTTTTTGA GTCGCGTTAT CTAATACATG GCGTTCCCTT TCGACTTTTT
1251 GAAATGAAAC AAATAATTAA AGTAACGAAA AGCAGAGATT CAACCTCGTA CTTTACTTTG TTTATTAATT TCATTGCTTT TCGTCTCTAA GTTGGAGCAT
1301 CCTTTTGCAT CATGATTTAG CTAGAAAAAC TAGACAAAGA GATCTTAAGC GGAAAACGTA GTACTAAATC GATCTTTTTG ATCTGTTTCT CTAGAATTCG
1351 CTACCCTCCC GAAACTAAAC GAGCTACTCC GAAGCAGCAC AATTTTTAGA GATGGGAGGG CTTTGATTTG CTCGATGAGG CTTCGTCGTG TTAAAAATCT
1401 GCCAACCCGT CTCTGTGGCA AAAGAGTGGG AAGACTTCCG AGTAGCGGTG

CGGTTGGGCA GAGACACCGT TTTCTCACCC TTCTGAAGGC TCATCGCCAC 1451 ACAAGCCTAT CGAGTTTAGT GATAGCTGGT TGCCCAAGAA AAGAACTTTA TGTTCGGATA GCTCAAATCA CTATCGACCA ACGGGTTCTT TTCTTGAAAT 1501 ATTCTGCATT AATTTCTTCA C C C C CP.~P~AAA GTCTATCTTA TTAAGGTTAA TAAGACGTAA TTAAAGAAGT GGGGGTTTTT CAGATAGAAT AATTCCAATT 1551 AC ATP~A.AAAT TAATAGTTAT TCAGAAGAGG TACAGCCCTT CTGAACTAAG TGTATTTTTA ATTATCAATA AGTCTTCTCC ATGTCGGGAA GACTTGATTC 1601 ATACAACTTT TCAAGGCGGA AAATGATCAT ACTTACCAAG GTTTTTACCT TATGTTGAAA AGTTCCGCCT TTTACTAGTA TGAATGGTTC Cp~~AAATGGA 1651 CAGTGGGCCC AAAAGCAGCC ATCTGTAAAG TAAGCGTCAC AGCTCCAGTC GTCACCCGGG TTTTCGTCGG TAGACATTTC ATTCGCAGTG TCGAGGTCAG 1701 TCACP~~AAAC CTATAATTTA GATATCCTCT CATAAACCCC TTAACCATTA AGTGTTTTTG GATATTAAAT CTATAGGAGA GTATTTGGGG AATTGGTAAT 1751 CTGGGCTATT TTATAAAATT ATAA,AAGAAC TTATGCTAAA ATGAGTAATA GACCCGATAA AATATTTTAA TATTTTCTTG AATACGATTT TACTCATTAT 1801 AGAGAACAAA CCTCTCCAGA CATGAGTGTA TGTTAGAAAG AATTAAATCA TCTCTTGTTT GGAGAGGTCT GTACTCACAT ACAATCTTTC TTAATTTAGT 1851 CTAACAATTA AACGAACCCA AACTGAGGCC ATTATATTAA TATTACCTTA GATTGTTAAT TTGCTTGGGT TTGACTCCGG TAATATAATT ATAATGGAAT 1901 AC TAGA►AAAC CTTATTGTAT TATTCGTTAA CCCTACACAG GAATGTCCTA TGATCTTTTG GAATAACATA ATAAGCAATT GGGATGTGTC CTTACAGGAT 1951 AGGA.AAGATT TA.AAGA.AAAT AAAGGAACTC GGCAAACACA AACTCCGCCT TCCTTTCTAA ATTTCTTTTA TTTCCTTGAG CCGTTTGTGT TTGAGGCGGA 2001 GTTTACCA.AA AACATCGCCT CTTGAATATT ATAAGAGGTC CCGCCTGCCC CA.AATGGTTT TTGTAGCGGA GAACTTATAA TATTCTCCAG GGCGGACGGG 2051 TGTGACAATG TTTTAACGGC CGCGGTATTT TGACCGTGCA AAGGTAGCGT ACACTGTTAC A,.A.P~.TTGCCG GCGCCATAAA ACTGGCACGT TTCCATCGCA 2101 AATCATTTGT CTTTTAA.ATG AAGACCCGTA TGAA.AGGCAT CACGAGAGTT TTAGTAAACA GAAAATTTAC TTCTGGGCAT ACTTTCCGTA GTGCTCTCAA 2151 TAACTGTCTC TATTTTCTAA TCAATGAAAT TGATCTACTC GTGCAGAAGC ATTGACAGAG ATP~AAAGATT AGTTACTTTA ACTAGATGAG CACGTCTTCG 2201 GAGTATAATT ACATTAGACG AGAAGACCCT ATGGAGCTTC AAACACATAA CTCATATTAA TGTAATCTGC TCTTCTGGGA TACCTCGAAG TTTGTGTATT 2251 ATTAATTATG TAAACTAATT ACTCCACGGA TATAAATAAA AATATAATAT TAATTAATAC 
ATTTGATTAA TGAGGTGCCT ATATTTATTT TTATATTATA 2301 ATTTAATTTA AGTGTTTTTG GTTGGGGTGA CCAAGGGGAA AAATA.AATC C TAAATTAAAT TGACP.~~AAAC CAACCCCACT GGTTCCCCTT TTTATTTAGG 2351 CCCTTATCGA CTGAGTACTC AAAGTACTTA AAAATTAGAA CTACAATTCT GGGAATAGCT GACTCATGAG TTTCATGAAT TTTTAATCTT GATGTTAAGA 2401 AATTAATA.AA ATATTTATCG AACAATGACC CAGGATTTCC TGATCAATGA TTAATTATTT TATA.AATAGC TTGTTACTGG GTCCTAAAGG ACTAGTTACT 2451 ACCAAGTTAC CCTAGGGATA ACAGCGCAAT CCTTTCTCAG AGTCCCTATC TGGTTCAATG GGATCCCTAT TGTCGCGTTA GGAA.AGAGTC TCAGGGATAG 2501 GCCGAAAGGG TTTACGACCT CGATGTTGGA TCAGGACATC CTAATGATGC CGGCTTTCCC AAATGCTGGA GCTACAACCT AGTCCTGTAG GATTACTACG 2551 AACCGTTATT AAGGGTTCGT TTGTTCAACG ATTAATAGTC CTACGTGATC TTGGCAATAA TTCCCAAGCA AACAAGTTGC TAATTATCAG GATGCACTAG 2601 TGAGTTCAGA CCGGAGAAAT CCAGGTCAGT TTCTATCTAT GAATTAATTT ACTCAAGTCT GGCCTCTTTA GGTCCAGTCA AAGATAGATA CTTAATTAAA 2651 TTCCTAGTAC GAAAGGACCG GP.~~AAATGGA GCCAATACCA TAGGCACGCT AAGGATCATG CTTTCCTGGC GTTTTTACCT CGGTTATGGT ATCCGTGCGA 2701 CCATTTTCAT CTATTGAA.AC AAAC TP~AAAT AGATAAGAAA AAATTATCTA GGTA,,AAAGTA GATAACTTTG TTTGATTTTA TCTATTCTTT TTTAATAGAT 2751 CTACCCGAGA AAAGGGTTGT TGAGGTGGCA GAGCCTGGTA AGTGC~,AA.AG GATGGGCTCT TTTCCCAACA ACTCCACCGT CTCGGACCAT TCACGTTTTC 211

2801 ACCTAAGCTC TTTAATTCAG AGGTTCAAAT CCTCTCCTCA ACCATGCTTG TGGATTCGAG AAATTAAGTC TCCAAGTTTA GGAGAGGAGT TGGTACGAAC 2851 AA.ACTCTCCT ACTTTACTTA ATTAATCCAC TTACCTATAT TATTCCCATC TTTGAGAGGA TGA.AATGAAT TAATTAGGTG AATGGATATA ATAAGGGTAG 2901 CTATTAGCTA CAGCTTTCCT TACCTTAGTT GAAC GP.~~AAA TCCTCGGCCA GATAATCGAT GTC GA.AAGGA ATGGAATCAA CTTGCTTTTT AGGAGCCGGT 2951 CATACAACTC CGAAAAGGTC CCAACATCGT AGGCCCATAT GGCCTCCTTC GTATGTTGAG GCTTTTCCAG GGTTGTAGCA TCCGGGTATA CCGGAGGAAG 3001 AACCAATTGC AGATGGCCTG AAACTATTTA TTAAAGAACC TATCTACCCA TTGGTTAACG TCTACCGGAC TTTGATAAAT AATTTCTTGG ATAGATGGGT 3051 TCAACTTCTT CCCCCCTTCT ATTTTTAATC ACCCCTACAA TAGCCCTAAC AGTTGAAGAA GGGGGGAAGA TP~~AAATTAG TGGGGATGTT ATCGGGATTG 3101 ACTAGCCCTC CTCATGTGAA TACCTCTCCC CCTCCCCCAT TCCATTATTA TGATCGGGAG GAGTACACTT ATGGAGAGGG GGAGGGGGTA AGGTAATAAT 3151 ATCTCAACCT AGGCTTACTA TTTATTCTAG CAATCTCAAG TCTAACTGTC TAGAGTTGGA TCCGAATGAT AAATAAGATC GTTAGAGTTC AGATTGACAG 3201 TACACTATTC TAGGATCCGG ATGAGCATCC AATTCAAAAT ATGCTTTAAT ATGTGATAAG ATCCTAGGCC TACTCGTAGG TTAAGTTTTA TAC GA.AATTA 3251 AGGGGCCCTA CGAGCCGTAG CACAA.ACAAT TTCCTATGAA GTAAGCCTTG TCCCCGGGAT GCTCGGCATC GTGTTTGTTA AAGGATACTT CATTCGGAAC 3301 GATTAATCCT TTTATCAATA ATTATATTTA CAGGAGGTTT TACCCTCCAC CTAATTAGGA A.AATAGTTAT TAATATAAAT GTCCTCCAAA ATGGGAGGTG 3351 ACCTTCAATT TAGCACAAGA AACAATCTGA TTAATTATCC CAGGATGACC TGGAAGTTAA ATCGTGTTCT TTGTTAGACT AATTAATAGG GTCCTACTGG 3401 CTTAGCCCTA ATATGGTATG TCTCAACCTT AGCAGAAACC AACCGAGTAC GAATCGGGAT TATACCATAC AGAGTTGGAA TCGTCTTTGG TTGGCTCATG 3451 CATTTGATCT AACAGAAGGA GAATCAGAAC TAGTCTCAGG ATTTAACATT GTAAACTAGA TTGTCTTCCT CTTAGTCTTG ATCAGAGTCC TAAATTGTAA 3501 GAATATGCAG GAGGCCCATT TGCTTTATTT TTCCTTGCTG AATACACA.AA CTTATACGTC CTCCGGGTAA AC GA.AATA.AA AAGGAACGAC TTATGTGTTT 3551 TATTTTATTA ATAA.ATAC C C TATCAGTTAT CCTATTTATA GGCTCCTCCT ATAAAATAAT TATTTATGGG ATAGTCAATA GGATAAATAT CCGAGGAGGA 3601 ATAATCCACT TCTCCCAGAA ATTTCAACAC TCAGCCTAAT AATAAAAGCA TATTAGGTGA AGAGGGTCTT TAA.AGTTGTG AGTCGGATTA TTATTTTCGT 3651 ACCATACTAA 
CCCTATTTTT CTTATGAATC CGAGCATCCT ACCCCCGTTT TGGTATGATT GGGATP.~~AAA GAATACTTAG GCTCGTAGGA TGGGGGCA.A.A 3701 TCGTTACGAC CAACTTATAC ATTTAGTATG CTTC CTCCCATTAA AGCAATGCTG GTTGAATATG TAAATCATAC TTTTTTGAAG GAGGGTAATT 3751 CCTTAGCAAT TATACTATGA CATATTGCCC TCCCCATAGC TACAGCAAGC GGAATCGTTA ATATGATACT GTATAACGGG AGGGGTATCG ATGTCGTTCG 3801 CTACCTCCCC TAACTTAACG GAAGTGTGCC TGAACTAAAG GACCACTTTG GATGGAGGGG ATTGAATTGC CTTCACACGG ACTTGATTTC C TGGTGAA.AC 3851 ATAGAGTGGA TAATGAAAGT TAAAACCTTT CCTCTTCCTA GP~~AAATAGG TATCTCACCT ATTACTTTCA ATTTTGGAAA GGAGAAGGAT CTTTTTATCC 3901 ACTTGAACCT ATAACTAAGA GCTCAAAACT CCTCGTATTT CCAATTATAC TGAACTTGGA TATTGATTCT CGAGTTTTGA GGAGCATAAA GGTTAATATG 3951 TATTTCCTAA GTAAAGTCAG CTAATAAAGC TTTTGGGCCC ATACCCCAAC ATAAAGGATT CATTTCAGTC GATTATTTCG AAAACCCGGG TATGGGGTTG 4001 CATGTTGATT AA.AATCC TTC CTTTACTAAT GAACCCAATT GTACTAACCA GTACAACTAA TTTTAGGAAG GAAATGATTA CTTGGGTTAA CATGATTGGT 4051 TCATTATTTC AAGCCTAGGC CTAGGAACTA TCCTAACATT TATTGGTTCA AGTAATAAAG TTCGGATCCG GATCCTTGAT AGGATTGTAA ATAACCAAGT 4101 CATTGACTCC TAATTTGAAT AGGCCTCGAA ATCAATACTC TAGCCATCAT GTAACTGAGG ATTAAACTTA TCCGGAGCTT TAGTTATGAG ATCGGTAGTA 4151 CCCCCTAATA ATCCGCCAAC ACCACCCCCG GGCAGTAGAA GCTTCCACAA 212

GGGGGATTAT TAGGCGGTTG TGGTGGGGGC CCGTCATCTT CGAAGGTGTT 4201 AATATTTCAT CACACAAGCA ACTGCCTCAG CCTTACTTTT ATTTGCTAGC TTATA.AAGTA GTGTGTTCGT TGACGGAGTC GGAATGAAAA TA.AAC GATC G 4251 GTCACAAACG CTTGGACTTC AGGCGAATGA AGTCTAATCG AAATAATTAA CAGTGTTTGC GAACCTGAAG TCCGCTTACT TCAGATTAGC TTTATTAATT 4301 TCCAACCTCT GCCACACTGG CCACAATCGC ATTAGCATTA A~AAATTGGCC AGGTTGGAGA CGGTGTGACC GGTGTTAGCG TAATCGTAAT TTTTAACCGG 4351 TAGCCCCACT TCACTTCTGA TTACCCGAAG TTCTCCAAGG CTTAGACCTT ATCGGGGTGA AGTGAAGACT AATGGGCTTC AAGAGGTTCC GAATCTGGAA 4401 ACTACCGGCC TCATTCTTTC TACATGACAA A.AACTTGCCC CATTCGCTAT TGATGGCCGG AGTAAGAAAG ATGTACTGTT TTTGAACGGG GTAAGCGATA 4451 TCTCTTACAA CTTTACCCTT CACTAA.ACTC CAATTTACTT ATTTTCCTAG AGAGAATGTT GAAATGGGAA GTGATTTGAG GTTAAATGAA TAA.AAGGATC 4501 GAGTACTATC AACTATAGTA GGGGGCTGAG GAGGGTTAAA CCAAACCCAA CTCATGATAG TTGATATCAT CCCCCGACTC CTCCCAATTT GGTTTGGGTT 4551 C TAC GA►~~~ TCCTAGCCTA CTCCTCAATC GCACACCTTG GTTGAATAGT GATGCTTTTT AGGATCGGAT GAGGAGTTAG CGTGTGGAAC CAACTTATCA 4601 TACAATCCTA CATTATTCCC ATAACTTAAC C CAAC TA.AAT TTGATCCTTT ATGTTAGGAT GTAATAAGGG TATTGAATTG GGTTGATTTA AACTAGGAAA 4651 ATATTATTAT AACATCAACA ACCTTTTTAT TATTTAA.AAC ATTTAATTCA TATAATAATA TTGTAGTTGT TGGP.~A.A.AATA ATAAATTTTG TAAATTAAGT ACTA.AAATTA 4701 ACTCTATCTC CTCTTCTTCA TCA►AAATCCC CCTTACTATC TGATTTTAAT TGAGATAGAG GAGAAGAAGT AGTTTTAGGG GGAATGATAG 4751 TATTATCGCC CTCATAACCC TCCTTTCTCT TGGAGGCTTA CCCCCTCTCT ATAATAGCGG GAGTATTGGG AGGAAAGAGA ACCTCCGAAT GGGGGAGAGA 4801 CAGGATTTAT GCCAAAATGA TTAATTCTAC AAGAACTAAC AA.AACAAAAT GTCCTAAATA CGGTTTTACT AATTAAGATG TTCTTGATTG TTTTGTTTTA 4851 CTAATTATCC CAGCCCTTAT TATAGCTATA ATAGCCCTCC TAAGTTTATT GATTAATAGG GTCGGGAATA ATATCGATAT TATCGGGAGG ATTCAAATAA 4901 CTTCTACCTA CGTCTATGTT ACGCCACAAC ATTAACCATA ACCCCAAATT GAAGATGGAT GCAGATACAA TGCGGTGTTG TAATTGGTAT TGGGGTTTAA 4951 CAATTAATAT ATTAACATCA TGACGAACCA AATTACCCCA TA.ATC TAAC C GTTAATTATA TAATTGTAGT ACTGCTTGGT TTAATGGGGT ATTAGATTGG 5001 CTAACAACAA CCGCCTCGCT ATCCCTCTTA CTCCTCCCAA TCACCCCCGC GATTGTTGTT 
GGCGGAGCGA TAGGGAGAAT GAGGAGGGTT AGTGGGGGCG 5051 TATCCTCATA TTAATATCTT AAGAAATTTA GGTTAACAAC AGACCAAAAG ATAGGAGTAT AATTATAGAA TTCTTTAAAT CCAATTGTTG TCTGGTTTTC 5101 CCTTCAA.AGC CTTAAGTAGA AGTGA.AAATC CTCTAATTTC TGCTAAGATT GGAAGTTTCG GAATTCATCT TCACTTTTAG GAGATTAA.AG ACGATTCTAA 5151 TGTAAGACTT TATCTCACAT CTTCTGAATG CAACCCAGAT GCTTTAATTA ACATTCTGAA ATAGAGTGTA GAAGACTTAC GTTGGGTCTA C GAA.ATTAAT AGC C 5201 TA~AA.AC TCCTAGATAA ATAGGCCTTG ATCCTACAAA ATCTTAGTTA TCGATTTTGG AGGATCTATT TATCCGGAAC TAGGATGTTT TAGAATCAAT 5251 ACAGCTAAGC GTTCAATCCA GCGAACTTTT ATCTACTTTC TCCCGCCGTA TGTCGATTCG CAAGTTAGGT CGCTTGAAAA TAGATGAAAG AGGGCGGCAT 5301 AA.AACAAAAG GCGGGAGAAA GTCCCGGGAG AA.ATTAATC T CCGGTTTTGG TTTTGTTTTC CGCCCTCTTT CAGGGCCCTC TTTAATTAGA GGC CA,AAACC 5351 ATTTGCAATC CAACGTAATC ATTTACTGCA GGACTATGGT AAGAAGAGGA TAAACGTTAG GTTGCATTAG TA.AATGAC GT CCTGATACCA TTCTTCTCCT 5401 ATTTGACCTC TGTTTACGGA GCTACAATCC GCCACTTAAG TTCTCAGTCA TAAACTGGAG ACAAATGCCT CGATGTTAGG CGGTGAATTC AAGAGTCAGT 5451 CCTTACCTGT GGCAATTAAT CGTTGACTAT TTTCTACAAA CCACAAAGAT GGAATGGACA CCGTTAATTA GCAACTGATA AAAGATGTTT GGTGTTTCTA TTTATTTA.AT 5501 ATCGGCACCC CTTTGGTGCA TGAGCAGGAA TAGTGGGAAC TAGCCGTGGG AAATAAATTA GAAACCACGT ACTCGTCCTT ATCACCCTTG 213

5551 AGCCCTAAGC CTTCTAATTC GAGCTGAATT AGGACAACCC GGATCACTTC TCGGGATTCG GAAGATTAAG CTCGACTTAA TCCTGTTGGG CCTAGTGAAG 5601 TAGGAGATGA TCAAGTCTAT AATGTTATTG TAACCGCCCA TGCATTTGTA ACGTAAACAT ATCCTCTACT AGTTCAGATA TTACAATAAC ATTGGCGGGT 5651 ATAATCTTCT TCATGGTTAT ACCCGTAATA ATTGGTGGAT TTGGAAATTG TATTAGAAGA AGTACCAATA TGGGCATTAT TAACCACCTA AACCTTTAAC 5701 ACTAGTGCCC TTAATAATTG GTGCACCAGA TATAGCTTTC CCGCGAATAA TGATCACGGG AATTATTAAC CACGTGGTCT ATATCGAAAG GGCGCTTATT 5751 ATAACATAAG CTTTTGACTA CTTCCCCCTT CTTTTCTTTT ACTTCTGGCC GAA.AAGP~AAA TATTGTATTC GAAAACTGAT GAAGGGGGAA TGAAGACCGG 5801 TCAGCTGGAG TTGAAGCCGG AGCTGGCACT GGTTGAACAG TTTATCCTCC AGTCGACCTC AACTTCGGCC TCGACCGTGA CCAACTTGTC AAATAGGAGG 5851 CTTAGCGGGA AATCTAGCAC ATGCTGGAGC ATCCGTTGAT TTAGCCATTT GGTA.AA GAATCGCCCT TTAGATCGTG TACGACCTCG TAGGCAACTA AATC 5901 TCTCCCTCCA TTTAGCAGGT ATCTCATCAA TTTTAGCTTC CATTAATTTT AGAGGGAGGT AAATCGTCCA TAGAGTAGTT AAAATCGAAG GTAATTAAAA 5951 ATTACAACTA TCATTAACAT A.AAAC CAC C G GCCATCTCTC AATAC CA.AAC TAATGTTGAT AGTAATTGTA TTTTGGTGGC CGGTAGAGAG TTATGGTTTG 6001 ACCATTATTT GTATGATCAA TTCTAGTTAC AACTATCCTT CTTCTATTAT TGGTAATAAA CATACTAGTT AAGATCAATG TTGATAGGAA GAAGATAATA 6051 CCCTCCCAGT GCTCGCAGCC GGTATTACAA TATTACTTAC TGATCGAAAC GGGAGGGTCA CGAGCGTCGG CCATAATGTT ATAATGAATG ACTAGCTTTG 6101 CTAAACACAA CATTCTTTGA TCCGGCAGGA GGGGGAGATC CAATTCTTTA GATTTGTGTT GTAAGAAACT AGGCCGTCCT CCCCCTCTAG GTTAAGAAAT 6151 CCA.ACATCTA TTTTGGTTTT TTGGCCACCC AGAAGTTTAT ATTTTAATTC GGTTGTAGAT A.AAAC CA.A.AA AACCGGTGGG TC TTCA.AATA TAAAATTAAG 6201 TTCCCGGTTT TGGAATAATT TCCCATGTAG TAGCTTACTA TTC C GGTAA.A AAGGGCCAAA ACCTTATTAA AGGGTACATC ATCGAATGAT AAGGCCATTT 6251 A.AAGAAC C AT TCGGCTATAT GGGTATAGTC TGGGCAATAA TAGCAATCGG TTTCTTGGTA AGCCGATATA CCCATATCAG ACCCGTTATT ATCGTTAGCC 6301 ACTATTAGGT TTTATTGTAT GAGCCCATCA TATATTTACA GTAGGAATAG TGATAATCCA A.AATAACATA CTCGGGTAGT ATATAAATGT CATCCTTATC 6351 ACGTTGACAC ACGTGCCTAT TTTACTTCAG CAACAATAAT TATTGCCATC TGCAACTGTG TGCACGGATA A.AATGAAGTC GTTGTTATTA ATAACGGTAG 6401 CCCACAGGTG 
TAAAAGTATT TAGCTGACTA GCAACTCTCC ACGGGGGCTC GGGTGTCCAC ATTTTCATAA ATCGACTGAT CGTTGAGAGG TGCCCCCGAG 6451 CATTAAATGA GAGACCCCAT TACTATGAGC TCTCGGATTC ATTTTCTTAT GTAATTTACT CTCTGGGGTA ATGATACTCG AGAGCCTAAG T~GAATA 6501 TCACAGTAGG AGGTTTAACA GGTATCGTCT TAGCCAACTC CTCCTTAGAT AGTGTCATCC TCCAAATTGT CCATAGCAGA ATCGGTTGAG GAGGAATCTA 6551 ATTGTTCTCC ATGATACCTA TTATGTAGTA GCTCACTTCC ATTATGTCCT TAACAAGAGG TACTATGGAT AATACATCAT CGAGTGAAGG TAATACAGGA 6601 TTCAATAGGA GCAGTATTTG CTATTATAGC AGGTTTTATC CACTGATTTC AAGTTATCCT CGTCATAAAC GATAATATCG TCCAAAATAG GTGACTAAAG 6651 CTCTCATCTC TGGCTACACC CTTCATTCAA CATGAACA.AA AATCCAATTT GAGAGTAGAG ACCGATGTGG GAAGTAAGTT GTACTTGTTT TTAGGTTAAA 6701 GCAGTAATAT TTATCGGGGT AAACTTAACA TTTTTCCCAC AACATTTCCT CGTCATTATA AATAGCCCCA TTTGAATTGT P.~~AAAGGGTG TTGTA.AAGGA 6751 AGGTCTCGCC GGTATACCAC GACGTTACTC AGATTATCCA GACGCATACA TCCAGAGCGG CCATATGGTG CTGCAATGAG TCTAATAGGT CTGCGTATGT 6801 CTTTATGAAA TACAGTTTCC TCTATTGGTT CCTTAATTTC ACTTGTAGCA GAAATACTTT ATGTCAAAGG AGATAACCAA GGAATTA.AAG TGAACATCGT 6851 GTAATTATGC TCCTATTTAT TATCTGAGAA GCATTCGCCT CP.~AAAC GAGA CATTAATACG AGGATAAATA ATAGACTCTT CGTAAGCGGA GTTTTGCTCT 6901 AGTACTATCC ATTGAATTAC CTCACACAAA CGTTGAATGA TTACACGGTT 214

TCATGATAGG TAACTTAATG GAGTGTGTTT GCAACTTACT AATGTGCCAA 6951 GTCCCCCACC TTACCATACA TATGAAGAAC CAGCATTTGT TCAAGTTCAA CAGGGGGTGG AATGGTATGT ATACTTCTTG GTCGTA.AACA AGTTCAAGTT 7001 CGAACCTTCT AAGACAAGAA AGGAAGGAAT CGAACCCCCA TATGTTAGTT GCTTGGAAGA TTCTGTTCTT TCCTTCCTTA GCTTGGGGGT ATACAATCAA 7051 TCAAGCCAAC CACATCACCA CTCTGTCACT TTCTTTATTA AGATTCTAGT AGTTCGGTTG GTGTAGTGGT GAGACAGTGA AAGA.AATAAT TCTAAGATCA 7101 AAA.ATATATT ACACTGCCTT GTCAAGGCAA AATTGTGAGT TTAAATCCCA TTTTATATAA TGTGACGGAA CAGTTCCGTT TTAACACTCA AATTTAGGGT 7151 CGAATCTTAA CTTATAATGG CACACCCTTC ACAATTAGGA TTTCAAGACG GCTTAGAATT GAATATTACC GTGTGGGAAG TGTTAATCCT AAAGTTCTGC 7201 CAGCCTCCCC AGTTATGGAA GAACTTATTC ACTTTCACGA CCACACACTA GTCGGAGGGG TCAATACCTT CTTGAATAAG TGAAAGTGCT GGTGTGTGAT 7251 ATAATTGTAT TTCTAATTAG CACTCTAGTT CTTTATATCA TTACAGCAAT TATTAACATA AAGATTAATC GTGAGATCAA GAAATATAGT AATGTCGTTA 7301 AGTATCAACA AAACTTACAA ATAAATATAT TCTTGATTCT CAAGAAATTG TCATAGTTGT TTTGAATGTT TATTTATATA AGAACTAAGA GTTCTTTAAC 7351 AAATTGTCTG AACTATTCTA CCCGCCATCA TCCTCATTAT AATCGCCCTA TTTAACAGAC TTGATAAGAT GGGCGGTAGT AGGAGTAATA TTAGCGGGAT 7401 CCATCCCTAC GAATTTTATA TCTTATAGAC GAAATTAATG ATCCCCACCT GGTAGGGATG C TTP~AAATAT AGAATATCTG CTTTAATTAC TAGGGGTGGA 7451 AAC CATCA.AA GCTATAGGTC ATCAATGATA CTGAAGTTAT GAATACACAG TTGGTAGTTT CGATATCCAG TAGTTACTAT GACTTCAATA CTTATGTGTC 7501 ATTATGAAGA TTTAAGCTTT GACTCTTACA TAATTCAA.AC CCAAGACTTA TAATACTTCT AAATTCGAAA CTGAGAATGT ATTAAGTTTG GGTTCTGAAT 7551 ACCCCAGGCC AATTTCGTCT ATTAGAAACA GATCACCGAA TAGTTGTACC TGGGGTCCGG TTAAAGCAGA TAATCTTTGT CTAGTGGCTT ATCAACATGG 7601 CATAGAATCA CCTATTCGCG TATTAGTATC TGCAGAAGAT GTTTTACACT GTATCTTAGT GGATAAGCGC ATAATCATAG ACGTCTTCTA CAAAATGTGA 7651 CATGAGCTGT CCCAGCCTTA GGAATTA.A.AA TAGACGCTGT ACCAGGACGC GTACTCGACA GGGTCGGAAT CCTTAATTTT ATCTGCGACA TGGTCCTGCG 7701 C TA.AAC C A.AA CTGCCTTCAT TACCTCCCGA CCAGGTATTT ATTATGGCCA GATTTGGTTT GACGGAAGTA ATGGAGGGCT GGTCCATA.AA TAATACCGGT 7751 ATGTTCAGAA ATTTGTGGTG CTAACCATAG TTTTATACCT ATCGTAGTAG TACAAGTCTT TA.AACAC CAC 
GATTGGTATC AAAATATGGA TAGCATCATC 7801 AA.ACAGTC C C CCTAGATCAC TTCGAAGCCT GATCTTCATT AATACTAGAA TTTGTCAGGG GGATCTAGTG AAGCTTCGGA CTAGAAGTAA TTATGATCTT 7851 GAAGCCTCAC TAAGAAGCTA AATTGGGTCT AGCATTAGCC TTTTAAGCTA CTTCGGAGTG ATTCTTCGAT TTAACCCAGA TCGTAATCGG A.AAATTC GAT 7901 AAAACTGGTG ATTCCCTACC ACCCTTAGTG ATATGCCTCA ACTAAATCCA TTTTGACCAC TAAGGGATGG TGGGAATCAC TATACGGAGT TGATTTAGGT 7951 CATCCCTGAT TCATTATCCT CCTATTTTCA TGAATAATTT TTCTTATTAT GTAGGGACTA AGTAATAGGA GGATAAA.AGT AC TTATTAA.A AAGAATAATA 8001 TCTGCCCCAA AAAGTAATAA ATTATACATT TAATAATAAC CCAACATTAA AGACGGGGTT TTTCATTATT TAATATGTAA ATTATTATTG GGTTGTAATT 8051 AAAATATCGA AAAATCTAAA CCTGAACCTT GA.AAC TGAC C ATGATTGTAA TTTTATAGCT TTTTAGATTT GGACTTGGAA CTTTGACTGG TACTAACATT 8101 ACTTCTTTGA CCAATTCCTA AGTCCCTCCC TCCTTGGAAT CCCATTAATT TGAAGA.AACT GGTTAAGGAT TCAGGGAGGG AGGAACCTTA GGGTAATTAA 8151 GCCTTAGCAA TTACATTACC TTGACTAACT TTCCCAACCC CAACTAATCG CGGAATCGTT AATGTAATGG AACTGATTGA AAGGGTTGGG GTTGATTAGC 8201 CTGATTAAAT AATCGATTAA TAACTCTTCA AAGTTGATTT ATTAATCGAT GACTAATTTA TTAGCTAATT ATTGAGAAGT TTCAACTAAA TAATTAGCTA 8251 TTATTTATCA ACTTATACAA CCCATTAACT TTGCTGGTCA CAA.ATGAGCT AATAAATAGT TGAATATGTT GGGTAATTGA AACGACCAGT GTTTACTCGA 215

8301 ATATTATTTA CAGCACTAAT AATATTCTTA ATTACCATCA ACCTTCTAGG TATAATAAAT GTCGTGATTA TTATAAGAAT TAATGGTAGT TGGAAGATCC 8351 ACTTCTCCCA TACACCTTCA CACCTACAAC CCAACTCTCC CTTAATATAG TGAAGAGGGT ATGTGGAAGT GTGGATGTTG GGTTGAGAGG GAATTATATC 8401 CATTTGCTTT ACCCCTATGA CTTATAACCG TACTAATTGG AATACTTAAT GTAAACGAAA TGGGGATACT GAATATTGGC ATGATTAACC TTATGAATTA 8451 CAACCAACAA TCACATTAGG CCATTTTCTA CCAGAAGGCA CCCCCACCCC GTTGGTTGTT AGTGTAATCC GGTAAAAGAT GGTCTTCCGT GGGGGTGGGG 8501 ACTAGTACCC GTCCTAATTA TCATCGAAAC TATCAGCCTA TTTATTCGAC TGATCATGGG CAGGATTAAT AGTAGCTTTG ATAGTCGGAT AA.ATAAGC T G 8551 CACTAGCATT AGGAGTTCGA CTAACTGCCA ACTTAACAGC CGGCCACCTA GTGATCGTAA TCCTCAAGCT GATTGACGGT TGAATTGTCG GCCGGTGGAT 8601 TTAATACAAT TAATCGCAAC CGCAGCCTTT GTCGTGATTA CTATTATACC AATTATGTTA ATTAGCGTTG GCGTCGGA.AA CAGGAGTAAT GATAATATGG 8651 AACCGTAGCA TTACTAACAT CAATCATCCT ATTCCTATTA ACAACTCTAG TTGGCATCGT AATGATTGTA GTTAGTAGGA TAAGGATAAT TGTTGAGATC 8701 AAGTAGCTGT AGCAATAATT CAAGCATATG TATTTGTACT CCTACTAAGT TTCATCGACA TCGTTATTAA GTTCGTATAC ATA.AACATGA GGATGATTCA 8751 TTGTATTTAC AAGAA.AATGT TTAATGGCTC ACCAAGCACA CGCATATCAT AACATAAATG TTCTTTTACA AATTACCGAG TGGTTCGTGT GCGTATAGTA 8801 ATAGTTGACC CCAGCCCATG ACCATTAACC GGAGCTATCG CCGCCCTTCT TATCAACTGG GGTCGGGTAC TGGTAATTGG CCTCGATAGC GGCGGGAAGA 8851 AATAACATCC GGGTTGGCCA TCTGATTTCA TTTCCACTCA TTATTACTTC TTATTGTAGG CCCAACCGGT AGACTAAAGT A.AAGGTGAGT AATAATGAAG 8901 TCTATTTAGG CCTAACCCTT TTATTATTAA CCATAATTCA ATGATGACGT AGATAAATCC GGATTGGGAA AATAATAATT GGTATTAAGT TACTACTGCA 8951 GATATTATCC GAGAAGGAAC ATTTCAAGGT CATCATACAC CCCCTGTCCA CTATAATAGG CTCTTCCTTG TAAAGTTCCA GTAGTATGTG GGGGACAGGT 9001 AA.AAGGTCTT CGTTATGGAA TAATCTTATT TATTACATCA GAAGTATTCT TTTTCCAGAA GCAATACCTT ATTAGAATAA ATAATGTAGT CTTCATAAGA 9051 TCTTCCTAGG CTTTTTCTGA GCCTTTTACC ATTCTAGCCT TGCCCCAACC AGAAGGATCC GP►~~AAAGAC T C GGAAA.ATGG TAAGATCGGA ACGGGGTTGG 9101 CCAGAACTAG GAGGATGTTG ACCACCAACA GGAATTAACC CATTAGATCC GGTCTTGATC CTCCTACAAC TGGTGGTTGT CCTTAATTGG GTAATCTAGG 9151 TTTTGAAGTA CCACTTCTAA 
ATACTGCAGT ACTTTTAGCT TCTGGTGTAA AAAACTTCAT GGTGAAGATT TATGACGTCA TGAAAATCGA AGACCACATT 9201 CAGTAACCTG AACCCACCAT AGCTTAATAG AAGGTAACCG P.►,AAAGAAGC T GTCATTGGAC TTGGGTGGTA TCGAATTATC TTCCATTGGC TTTTCTTCGA 9251 ATCCAAGCCC TCACCCTCAC TATTATTTTA GGATTTTACT TTACAGCCCT TAGGTTCGGG AGTGGGAGTG ATAATA.A.AAT CCTAAAATGA AATGTCGGGA 9301 CCAAGCTATA GAATATTATG AAGCACCATT CACAATTGCT GACGGAATTT GGTTCGATAT CTTATAATAC TTCGTGGTAA GTGTTAACGA CTGCCTTAAA 9351 ATGGAACAAC ATTCTTTGTC GCCACAGGAT TTCACGGTCT CCATGTCATC TACCTTGTTG TAAGA.AACAG CGGTGTCCTA AAGTGCCAGA GGTACAGTAG 9401 ATCGGTTCAA CATTTTTAGC AATCTGTTTA CTACGACAAA TTCAATATCA TAGCCAAGTT GTP►AA.AATC G TTAGACAAAT GATGCTGTTT AAGTTATAGT 9451 CTTCACATCA GAACATCATT TCGGCTTTGA AGCTGCTGCA TGATACTGAC GAAGTGTAGT CTTGTAGTAA AGCCGAAACT TCGACGACGT ACTATGACTG 9501 ACTTTGTAGA TGTAGTATGA TTATTTCTTT ATGTATCCAT CTATTGATGA TGAAACATCT ACATCATACT AATAAAGAAA TACATAGGTA GATAACTACT 9551 GGCTCATAAT TACTTTTCTA GTACAGACTA GTACA.AATGA TTTCCAATCA CCGAGTATTA ATGAAAAGAT CATGTCTGAT CATGTTTACT A.AAGGTTAGT 9601 TTTAATCTTG GTTAA.AATCC AAGGAA.A.AGT AATGAACCTC ATCACATCTT AAATTAGAAC CAATTTTAGG TTCCTTTTCA TTACTTGGAG TAGTGTAGAA 9651 CTATCGCAGC TACGGCCCTG ATTTCCCTAA TCCTTGTATT TATTGCATTT 216

GATAGCGTCG ATGCCGGGAC TA.AAGGGATT AGGAACATAA ATAAC GTA.AA 9701 TGGCTTCCAT CATTAAACCC AGATAATGAA AAATTATCCC CATACGAATG ACCGAAGGTA GTAATTTGGG TCTATTACTT TTTAATAGGG GTATGCTTAC 9751 TGGTTTTGAC CCCCTAGGAA GCGCACGCCT CCCATTTTCC CTACGCTTCT ACCAAAACTG GGGGATCCTT CGCGTGCGGA GGGTAAAAGG GATGCGAAGA 9801 TTCTCGTAGC TATCTTATTC CTGTTATTCG ATTTAGAAAT CGCCCTCCTC AAGAGCATCG ATAGAATAAG GACAATAAGC TAAATCTTTA GCGGGAGGAG 9851 CTTCCCCTAC CATGAGGCGA TCAATTACTA TCGCCACTCT CCACATTACT GAAGGGGATG GTACTCCGCT AGTTAATGAT AGCGGTGAGA GGTGTAATGA 9901 CTGAGCAACA ATTATCCTAA TCCTATTAAC CCTAGGTCTT ATCTATGAAT GAC TC GTTGT TAATAGGATT AGGATAATTG GGATCCAGAA TAGATACTTA 9951 GACTTCAAGG AGGATTAGAA TGAGCAGAAT GGATGTTTAG TCTAAATAAA CTGAAGTTCC TCCTAATCTT ACTCGTCTTA CCTACAAATC AGATTTATTT 10001 GACCACTAAT TTCGACTTAG TAAATTATGG TGAA.AATC CA TAAATATCCT CTGGTGATTA AAGCTGAATC ATTTAATACC ACTTTTAGGT ATTTATAGGA 10051 ATGTCTCCAA TATATTTCAG CCTTAATTCA GCATTTATTC TAGGTCTCAT TACAGAGGTT ATATAAAGTC GGAATTAAGT CGTAAATAAG ATCCAGAGTA 10101 GGGCCTCGCA CTTAATCGCT ATCACCTTTT ATCTGCACTC TTATGTTTAG CCCGGAGCGT GAATTAGCGA TAGTGGAAAA TAGACGTGAG AATACAA.ATC 10151 AAAGTATACT ACTAACCCTT TTCATTACCA TTGCCATCTG AACCCTAACC TTTCATATGA TGATTGGGAA AAGTAATGGT AACGGTAGAC TTGGGATTGG 10201 CTAAACTCAA CCTCATGCTC AATCATTCCT ATGATCCTAC TTACATTTTC GATTTGAGTT GGAGTACGAG TTAGTAAGGA TACTAGGATG AATGTAAAAG 10251 AGCCTGTGAA GCTAGCGCAG GCCTAGCCAT TCTAGTAGCT ACCTCACGCT TCGGACACTT CGATCGCGTC CGGATCGGTA AGATCATCGA TGGAGTGCGA 10301 CTCACGGCTC TGATAATTTA CAAAACCTGA ACCTTCTTCA ATGC TA.AAAA GAGTGCCGAG AC TATTA.AAT GTTTTGGACT TGGAAGAAGT TACGATTTTT 10351 TTTTAATTCC AACAATCATA CTCTTCCCCA CCACCTGAAT TATTAACA.AA ~~AAATTAAGG TTGTTAGTAT GAGAAGGGGT GGTGGACTTA ATAATTGTTT 10401 AAATGATTAT GACCCATAAT CACCACTCAT AGCCTCCTAA TTGCATTACT TTTACTAATA CTGGGTATTA GTGGTGAGTA TCGGAGGATT AACGTAATGA 10451 AAGCCTACCC CTATTCAAAT GAAACATAGA TATTGGCTGA GATTTCTCTA TTCGGATGGG GATAAGTTTA CTTTGTATCT ATAACCGACT C TA.AAGAGAT 10501 ACAAATTTAT AGCCGTAGAT CCATTATCAA CCCCCCTATT AATTCTAACA TGTTTAAATA 
TCGGCATCTA GGTAATAGTT GGGGGGATAA TTAAGATTGT 10551 TGCTGACTCC TACCATTAAT AGTCTTAGCC AGCCAAAACC ACATTTCCCC ACGACTGAGG ATGGTAATTA TCAGAATCGG TCGGTTTTGG TGTAAAGGGG 10601 AGAAC CA.ATT ATTCGACAAC GAACATACAT CACACTTCTA ATTTCCCTAC TCTTGGTTAA TAAGCTGTTG CTTGTATGTA GTGTGAAGAT TAAAGGGATG 10651 A.AACTTTCCT CATCATAGCA TTCTCTGTAA CCGAAATAAT TATATTCTAT TTTGAAAGGA GTAGTATCGT AAGAGACATT GGCTTTATTA ATATAAGATA 10701 ATTATATTTG AAGCCACACT CATCCCAACC CTTATTATTA TTACACGATG TAATATAAAC TTCGGTGTGA GTAGGGTTGG GAATAATAAT AATGTGCTAC 10751 AGGAAATCAA ACAGAACGCT TAAATGCAGG CACTTACTTT TTATTTTATA TCCTTTAGTT TGTCTTGCGA ATTTACGTCC GTGAATGAAA AATAAAATAT 10801 CTTTAATTGG CTCACTTCCC CTTCTTATTG CCCTCCTACT TATACAAAAT GAAATTAACC GAGTGAAGGG GAAGAATAAC GGGAGGATGA ATATGTTTTA 10851 AACCTAGGCA CCCTATCTAT AATTATTATA CAATACTCAC AATTTCCAAA TTGGATCCGT GGGATAGATA TTAATAATAT GTTATGAGTG TTAAAGGTTT 10901 CCTACTTTCA TGAGCAGACA AACTATGATG AATAGCCTGT CTCATCGCCT GGATGA.AAGT ACTCGTCTGT TTGATACTAC TTATCGGACA GAGTAGCGGA 10951 TTCTTGTCAA AATACCTTTA TACGGAATCC ACCTCTGACT CCCCAA.AGCT AAGAACAGTT TTATGGAAAT ATGCCTTAGG TGGAGACTGA GGGGTTTCGA 11001 CATGTCGAAG CCCCAATCGC TGGCTCAATA ATCCTAGCAG CAGTGTTACT GTACAGCTTC GGGGTTAGCG ACCGAGTTAT TAGGATCGTC GTCACAATGA 217

11051 CAAATTAGGG GGTTATGGAA TAATACGAAT TATTGTTATA CTAGACCCAT GTTTAATCCC CCAATACCTT ATTATGCTTA ATAACAATAT GATCTGGGTA 11101 TAAC C A.AAGA AATAGCCTAT CCATTCTTAA TTTTAGCTAT TTGAGGAATC ATTGGTTTCT TTATCGGATA GGTAAGAATT ~TCGATA AACTCCTTAG 11151 ATTATAACCA GTTCTATTTG TTTACGACAA ACAGATTTAA AATCTCTTAT TAATATTGGT CAAGATAAAC AAATGCTGTT TGTCTA.AATT TTAGAGAATA 11201 TGCTTACTCA TCAGTAAGTC ACATAGGCCT AATTGCAGGA GCAATTTTTA ACGAATGAGT AGTCATTCAG TGTATCCGGA TTAACGTCCT CGTTAAAAAT 11251 TCCA.AACACC TTGAAGTTTT GCAGGAGCAA TCACACTTAT AATTGCCCAT AGGTTTGTGG AACTTCAAAA CGTCCTCGTT AGTGTGAATA TTAACGGGTA 11301 GGCTTAATTT CATCAGCCTT ATTTTGCTTA GCTAACACCA ACTATGAACG C C GAATTAA.A GTAGTCGGAA TA►AAAC GAAT CGATTGTGGT TGATACTTGC 11351 AATCCACAGC C GAAC'~'ATAC TACTAGCCCG AGGTATACAA ATCATTCTCC TTAGGTGTCG GCTTGATATG ATGATCGGGC TCCATATGTT TAGTAAGAGG 11401 CATTAATAGC AACCTGATGA TTCTTTGCTA GCCTAGCTAA CCTTGCCCTA GTAATTATCG TTGGACTACT AAGAAACGAT CGGATCGATT GGAACGGGAT 11451 CCCCCATCAC CCAACCTTAT AGGAGAACTT CTCATCATTA CTTCATTATT GGGGGTAGTG GGTTGGAATA TCCTCTTGAA GAGTAGTAAT GAAGTAATAA 11501 TAACTGATCA AACTGAACCA TGATCTTATC AGGCCTTGGA GTATTAATTA ATTGACTAGT TTGACTTGGT ACTAGAATAG TCCGGAACCT CATAATTAAT 11551 CAGCCTCCTA TTCACTCTAC ATATTCTTAA TAACACAACG AGGCCCAACC GTCGGAGGAT AAGTGAGATG TATAAGAATT ATTGTGTTGC TCCGGGTTGG' 11601 CCCCATCACA TTCTATCATT AAACCCAAAT TACACACGAG AACATCTTCT GGGGTAGTGT AAGATAGTAA TTTGGGTTTA ATGTGTGCTC TTGTAGAAGA 11651 CCTAAGTCTT CATCTCATAC CAGTCCTATT ACTAATACTT AAACCAGAAC GGATTCAGAA GTAGAGTATG GTCAGGATAA TGATTATGAA TTTGGTCTTG 117 01 TCATCTGAGG CTGAACACTT TGTATTTATA GTTTAACTAA AACATTAGAT AGTAGACTCC GACTTGTGAA ACATAAATAT CA.AATTGATT TTGTAATCTA 11751 TGTGGTTCTA A►AAA.TA,AAAG TTA.AAAC C TT TTTAATTACC GAGAGAGGTC ACACCAAGAT TTTTATTTTC AATTTTGGAA A.AATTAATGG CTCTCTCCAG 11801 AGGGACACGA AGAACTGCTA ATTCTTCCTA TCATGGCTCA AATCCATGAC TCCCTGTGCT TCTTGACGAT TAAGAAGGAT AGTACCGAGT TTAGGTACTG TGA.AAGAT 11851 TCACTCAGCT TC AATAGTAATC TATTGGTCTT AGGAACCAAA AGTGAGTCGA AGACTTTCTA TTATCATTAG ATAACCAGAA TCCTTGGTTT 11901 
AACTCTTGGT GCAATTCCAA GCA,A.AAGCTA TGAACACCAT CTTCAACTCA TTGAGAACCA CGTTAAGGTT CGTTTTCGAT ACTTGTGGTA GAAGTTGAGT 11951 TCATTTCTCC TAATTTTTAT TATTCTCATC TTTCCATTAT TAACTTCGTT ATTp~3AAATA AGTAAAGAGG ATAAGAGTAG AA.AGGTAATA ATTGAAGCAA 12001 AAGTCCTAA.A GAATCTAATC CAAACTGATC ATCCTCTTAT GTp~~AAATAG TTCAGGATTT CTTAGATTAG GTTTGACTAG TAGGAGAATA CATTTTTATC C C GTP.~A.AAAT 12051 TTCCTTCTTT ATTAGCCTTA TCCCCTTATT TATTTTCCTA GGCATTTTTA AAGGAAGAAA TAATCGGAAT AGGGGAATAA ATAAAAGGAT 12101 GATCAAGGTC TAGAATCAAT TATAACCAAC TATAATTGAA TA.AACATTGG CTAGTTCCAG ATCTTAGTTA ATATTGGTTG ATATTAACTT ATTTGTAACC 12151 ACCATTTGAT ATTAATATAA GC TTTA.AATT TGATATATAT TCAATCATAT TGGTAAACTA TAATTATATT CGAAATTTAA ACTATATATA AGTTAGTATA 12201 TCACCCCCGT AGCCCTCTAT GTTACCTGAT CTATCCTTGA ATTCGCCTTA AGTGGGGGCA TCGGGAGATA CAATGGACTA GATAGGAACT TAAGCGGAAT 12251 TGATATATAC ATTCAGACCC TAACATTAAC CGCTTTTTCA AATATTTACT ACTATATATG TAAGTCTGGG ATTGTAATTG GCGAAAAAGT TTATAA.ATGA 12301 GCTCTTCCTA GTCTCAATAA TTATTCTAGT AACAGCTAAT AATATATTCC CGAGAAGGAT CAGAGTTATT AATAAGATCA TTGTCGATTA TTATATAAGG 12351 AACTATTCAT TGGATGAGAA GGAGTTGGAA TCATATCATT CCTCCTAATT TTGATAAGTA ACCTACTCTT CCTCAACCTT AGTATAGTAA GGAGGATTAA 12401 GGCTGATGAT ACAGCCGAGC AGATGCTAAT ACAGCTGCTC TTCAAGCTGT 218

CCGACTACTA TGTCGGCTCG TCTACGATTA TGTCGACGAG AAGTTCGACA 12451 AATTTATAAC CGAGTCGGAG ATATTGGATT AATCCTTAGC ATAGCCTGAC TTA.AATATTG GCTCAGCCTC TATAACCTAA TTAGGAATCG TATCGGACTG 12501 TAGCCATAAA TCTAAATTCA TGAGAAATCC AACAATTATT TATCTTATCT ATCGGTATTT AGATTTAAGT ACTCTTTAGG TTGTTAATAA ATAGAATAGA 12551 P~AAAACATAG ACCTAACATT ACCTCTATCC GGCCTCGTCC TAGCCGCAGC TTTTTGTATC TGGATTGTAA TGGAGATAGG CCGGAGCAGG ATCGGCGTCG 12601 TGGAAAATCC GCACAATTTG GCCTTCACCC CTGACTTCCT TCTGCTATAG ACCTTTTAGG CGTGTTAAAC CGGAAGTGGG GACTGAAGGA AGACGATATC 12651 AAGGACCAAC ACCAGTCTCC GCCCTACTCC ACTCCAGCAC AATAGTTGTT TTCCTGGTTG TGGTCAGAGG CGGGATGAGG TGAGGTCGTG TTATCAACAA 12701 GCCGGCATTT TCCTACTAAT CCGCCTCCAC CCCCTAATTC AA.AATAATCA CGGCCGTAA.A AGGATGATTA GGCGGAGGTG GGGGATTAAG TTTTATTAGT 12751 ATTAATCTTA ACAACATGCC TTTGCCTAGG AGCATTAACT ACTCTTTTTA TAATTAGAAT TGTTGTACGG A.AACGGATCC TCGTAATTGA TGAGp~~AAAT 12801 CTGCAACATG TGCACTCACC CAAAACGATA TC T CGTCGCTTTT GACGTTGTAC ACGTGAGTGG GTTTTGCTAT AGTTTTTTTA GCAGCGAAAA 12851 TCAACATCAA GCCAACTCGG ATTAATAATA GTAACAATCG GCCTTAACCA AGTTGTAGTT CGGTTGAGCC TAATTATTAT CATTGTTAGC CGGAATTGGT 12901 ACCCCAACTC GCCTTCCTAC ATATCTGCAC CCACGCCTTC TTCAAAGCCA TGGGGTTGAG CGGAAGGATG TATAGACGTG GGTGCGGAAG AAGTTTCGGT 12951 TACTCTTTCT CTGCTCAGGC TCTATTATTC ATAGTCTTAA TGATGAACAA ATGAGAA.AGA GACGAGTCCG AGATAATAAG TATCAGAATT ACTACTTGTT 13001 GACATTCGCA AAATAGGAGG ACTCCATAAA CTCCTACCAT TTACATCATC CTGTAAGCGT TTTATCCTCC TGAGGTATTT GAGGATGGTA AATGTAGTAG 13051 CTCTTTAACC ATCGGAAGTT TAGCCCTCAC AGGCATGCCC TTCTTGTCAG GAGAAATTGG TAGCCTTCAA ATCGGGAGTG TCCGTACGGG AAGAACAGTC 13101 GTTTCTTCTC AAA.AGAC GC T ATCATTGAGT CCATAAACAC CTCACACCTA CAAAGAAGAG TTTTCTGCGA TAGTAACTCA GGTATTTGTG GAGTGTGGAT 13151 AACGCCTGAG CCCTAACCCT~ TACCCTTATC GCAACATCAT TCACAGCTAT TTGCGGACTC GGGATTGGGA ATGGGAATAG CGTTGTAGTA AGTGTCGATA 13201 CTACAGCCTA CGCCTTGTAT TCTTCACATT AATAAACTTC CCACGATTCA GATGTCGGAT GCGGAACATA AGAAGTGTAA TTATTTGAAG GGTGCTAAGT 13251 ATTCACTTTC CCCAATTAAT GAAAATAACC CTACAATAAT TAACCCAATC TAAGTGAAAG GGGTTAATTA 
CTTTTATTGG GATGTTATTA ATTGGGTTAG 13301 AAACGCCTAG CCTATGGAAG TATCCTAGCT GGCCTCATTA TTACATCAAA TTTGCGGATC GGATACCTTC ATAGGATCGA CCGGAGTAAT AATGTAGTTT 13351 TTTAACCCCA ACP.~~AAAC C C A.AATCATGAC AATATCCCCT TTATTA.AAAC AAATTGGGGT TGTTTTTGGG TTTAGTACTG TTATAGGGGA AATAATTTTG 13401 TTTCCGCCTT ACTAGTCACA ATTATTGGCC TATTACTAGC CTTAGAACTA AAAGGCGGAA TGATCAGTGT TAATAACCGG ATAATGATCG GAATCTTGAT 13451 GCTAACTTAA CTAATACCCA AC T TAAA.AT T AACCCCTCCC TTTATACTCA CGATTGAATT GATTATGGGT TGAATTTTAA TTGGGGAGGG A.AATATGAGT 13501 CCACTTCTCC AACATATTAG GTTATTTCCC ACAAATTATC CACCGTCTCC GGTGAAGAGG TTGTATAATC CAATA.AAGGG TGTTTAATAG GTGGCAGAGG 13551 TAC C A.AAAAT CAACCTAAAC TGAGCCCAAC ATATCTGCAC CCACCTAATT ATGGTTTTTA GTTGGATTTG ACTCGGGTTG TATAGAGGTG GGTGGATTAA 13601 GAC CA.AACAT GAAATGAAAA AATCGGACCA AAAAGTACTC TTATCCAACA CTGGTTTGTA CTTTACTTTT TTAGCCTGGT TTTTCATGAG AATAGGTTGT 13651 AACCCCACTA ATCA.AAC TAT CCACCCAACC ACAACAAGGT TATATTAAAA TTGGGGTGAT TAGTTTGATA GGTGGGTTGG TGTTGTTCCA ATATAATTTT 13701 TTTATCTAAT ATTACTTTTT CTTACATTAA CCTTAGCTCT ATTAACTTCA AA.ATAGATTA TAATGP►~~AAA GAATGTAATT GGAATCGAGA TAATTGAAGT TGCACGC.A.AA 13751 CTAACCTAAC GTCCCCCAAG ATAGCCCTCG AGTTAACTCC GATTGGATTG ACGTGCGTTT CAGGGGGTTC TATCGGGAGC TCAATTGAGG 219

13801 AATACCACAA ACAAAGTTAA CAATAATACC CAACCACTTA AAACTAACAA TTATGGTGTT TGTTTCAATT GTTATTATGG GTTGGTGAAT TTTGATTGTT 13851 TCACCCCCCA TCCGCATATA ATAAGGCTAC C C C CACA,AAA TCTCCACGAA AGTGGGGGGT AGGCGTATAT TATTCCGATG GGGGTGTTTT AGAGGTGCTT 13901 CCATTTCTAT ATTACTTATC TCCTCCACCC CTACTCAACT TAACTCAAAT GGTAAAGATA TAATGAATAG AGGAGGTGGG GATGAGTTGA ATTGAGTTTA 13951 CACTCAACTA TAAAATATTT ACCAACAAAA ACTAAAACTA C TAAATP.~3AA GTGAGTTGAT ATTTTATA.AA TGGTTGTTTT TGATTTTGAT GATTTATTTT 14001 CCCGACATAT AATAATACAG ATCAACTGCC TCACGATTCA GGATAAGGCT GGGCTGTATA TTATTATGTC TAGTTGACGG AGTGCTAAGT CCTATTCCGA 14051 CAGCAGCAAG TGCTGCCGTA TAAGCAAACA CTACTAATAT CCCACCCA.AA GTCGTCGTTC ACGACGGCAT ATTCGTTTGT GATGATTATA GGGTGGGTTT 14101 TA.AATTAGAA ACAAAACCAA TGAT GATCCTCCAT GTCCCACCAA ATTTAATCTT TGTTTTGGTT ACTATTTTTT CTAGGAGGTA CAGGGTGGTT 14151 CAACCCACAC CCTACCCCAG CAGCCATAAC TAACCCCAAC GCAGCATAAT GTTGGGTGTG GGATGGGGTC GTCGGTATTG ATTGGGGTTG CGTCGTATTA 14201 AAGGAGAAGG ATTAGACGCT ACCCCTATTA AACCTAAAAC TA.AAC AGAC T TTCCTCTTCC TAATCTGCGA TGGGGATAAT TTGGATTTTG ATTTGTCTGA 14251 ATTATTP~AAA ACATAPsAATA TACCATTATT CCTACCTGGA CTTTAACCAA TAATAATTTT TGTATTTTAT ATGGTAATAA GGATGGACCT GAAATTGGTT 14301 GACCAATAAC TTGp~~AAAC T ATCGTTGTCT ATTCAACTAC AAGAATTTAT CTGGTTATTG AACTTTTTGA TAGCAACAGA TAAGTTGATG TTCTTAA.ATA 14351 GGCCATAAAT ATC C GP.~~AAA CCCACCCATT AC TP~~AAATC ATTAACCAAA CCGGTATTTA TAGGCTTTTT GGGTGGGTAA TGATTTTTAG TAATTGGTTT 14401 CCTTAATTGA CCTTCCAGCT CCATCAAATA TTTCAATTTG ATGAAACTTC GGAATTAACT GGAAGGTCGA GGTAGTTTAT AAAGTTAAAC TACTTTGAAG 14451 GGTTCACTCC TAAGTCTATG CCTAATTATC CAAATCCTCA CAGGCCTTTT CCAAGTGAGG ATTCAGATAC GGATTAATAG GTTTAGGAGT GTCCGGAAAA 14501 CCTAGCAATA CATTACACCC CAGACATCTC AATAGCCTTC TCCTCAGTAA GGATCGTTAT GTAATGTGGG GTCTGTAGAG TTATCGGAAG AGGAGTCATT 14551 TCCATATCTC CCGCGACGTT AACTATGGCT GACTCATCCG TAATATTCAC AGGTATAGAG GGCGCTGCAA TTGATACCGA CTGAGTAGGC ATTATAAGTG 14601 GCCAACGGAG CCTCATTATT CTTTATCTGC GTATACTTAC ATATTGCCCG CGGTTGCCTC GGAGTAATAA GAAATAGACG CATATGAATG TATAACGGGC 14651 
AGGACTTTAC TATGGTTCCT ACCTTTATAA AGAAACATGA AATGTTGGAG TC C TGA.AATG ATACCAAGGA TGGAAATATT TCTTTGTACT TTACAACCTC 14701 TAATCCTATT ATTTCTGTTA ATAGCCACAG CCTTCGTAGG CTATGTACTA ATTAGGATAA TAAAGACAAT TATCGGTGTC GGAAGCATCC GATACATGAT 14751 CCATGAGGCC AA.ATATCCTT TTGAGCCGCT ACAGTCATTA CCAACCTCTT GGTACTCCGG TTTATAGGAA AACTCGGCGA TGTCAGTAAT GGTTGGAGAA 14801 ATCCGCCTTT CCCTATATTG GAGATATACT AGTACAATGA ATTTGAGGTG TAGGCGGAAA GGGATATAAC CTCTATATGA TCATGTTACT TAAACTCCAC 14851 GCTTTTCAGT AGACAACGCC ACCCTAACAC GATTCTTCGC ATTTCATTTT CGAA.AAGTCA TCTGTTGCGG TGGGATTGTG CTAAGAAGCG TAA.AGTAA.AA 14901 CTCCTTCCCT TCCTAATTAC AGCATTAATA ATTATCCACA TCCTCTTCTT GAGGAAGGGA AGGATTAATG TCGTAATTAT TAATAGGTGT AGGAGAAGAA 14951 ACATGA.AACA GGTTCA.AACA ACCCCATAGG ACTTAATTCT GACATAGACA TGTACTTTGT CCAAGTTTGT TGGGGTATCC TGAATTAAGA CTGTATCTGT 15001 A.AATTTCCTT CCACCCCTAC TACTCCTACA AAGACGCACT TGGCTTCTTT TTTAAAGGAA GGTGGGGATG ATGAGGATGT TTCTGCGTGA ACCGAAGAAA 15051 ACCATAATTA TACTACTAGG GATCCTAGCC CTATTCCTTC CTAATCTTCT TGGTATTAAT ATGATGATCC CTAGGATCGG GATAAGGAAG GATTAGAAGA 15101 AGGAGACGCT GP~AAAC TATA TCCCCGCTAA TCCTCTCGTT ACACCCCCTC TCCTCTGCGA CTTTTGATAT AGGGGCGATT AGGAGAGCAA TGTGGGGGAG 15151 ATATTAAACC AGAATGATAC TTCCTATTCG CCTATGCCAT CCTCCGATCT 220

TATAATTTGG TCTTACTATG AAGGATAAGC GGATACGGTA GGAGGCTAGA 15201 ATCCCTAATA AATTAGGAGG AGTACTAGCC CTCTTATTCT CCATTCTTAT TAGGGATTAT TTAATCCTCC TCATGATCGG GAGAATAAGA GGTAAGAATA 15251 TCTTATATTA ATTCCCCTAT TACATACCTC TAAACAACGA AGTAGCACCT AGAATATAAT TAAGGGGATA ATGTATGGAG ATTTGTTGCT TCATCGTGGA 153 01. TTCGCCCACT CACACAAATT TTCTTTTGAA CCCTAGTGAC CAATATACTA AAGCGGGTGA GTGTGTTTAA AAGAAAACTT GGGATCACTG GTTATATGAT 15351 ATCTTAACCT GAATTGGTGG ACAACCAGTT GAACAACCAT TTATCCTTAT TAGAATTGGA CTTAACCACC TGTTGGTCAA CTTGTTGGTA AATAGGAATA 15401 CGGACAAATC GCATCTATCA CCTACTTTTC TTTATTTCTT ATTGTGATCC GCCTGTTTAG CGTAGATAGT GGATGA.AAAG A.AATA.AAGAA TAACACTAGG 15451 CACTCACAGG CTGATGAGAA AAC A~AAATC C TCAGCCTAAA CTAGTTTTGG GTGAGTGTCC GACTACTCTT TTGTTTTAGG AGTCGGATTT GATCAAAACC 15501 TAGCTTAACT TAACAAAGCG TCGACCTTGT AAGTCGAAGA TCGGAGGTTA ATCGAATTGA ATTGTTTCGC AGCTGGAACA TTCAGCTTCT AGCCTCCAAT 15551 AAACCCTCTC Cp~AAACATAT CAGGGGAAGG AGGGTTAAAC TCCTGCCCTT TTTGGGAGAG GTTTTGTATA GTCCCCTTCC TCCCAATTTG AGGACGGGAA 15601 GGCTCCCAA.A GCCAAGATTC TGCCCAAACT GCCCCCTGAA ATGCTATTAA CCGAGGGTTT CGGTTCTAAG ACGGGTTTGA CGGGGGACTT TACGATAATT 15651 AGCATGGAAA CCGAATGAAA ATTTGGTTTC CAAAAAGTAA GTCAGAGTGA TCGTACCTTT GGCTTACTTT T.A.AAC CA.AAG GTTTTTCATT CAGTCTCACT 157 01 CATATTAATG ATATAGCCCA CATACCTTAA TATAGTACAT TACTTAACTC GTATAATTAC TATATCGGGT GTATGGAATT ATATCATGTA ATGAATTGAG 15751 GACTAACCAA CATTAATAGA TTATTCCCTA CTACTATAAT TATCTATGCT CTGATTGGTT GTAATTATCT AATAAGGGAT GATGATATTA ATAGATACGA 15801 TAATCCTCAT TAATCTATAT TCCCCTATAT CATAACATAC TATGCTTAAT ATTAGGAGTA ATTAGATATA AGGGGATATA GTATTGTATG ATACGAATTA 15851 ACTCATTAAT ATACTATCCA CTATTTCATT ACATTCTATT CCTTAGTCCT TGAGTAATTA TATGATAGGT GATA.AAGTAA TGTAAGATAA GGAATCAGGA 15901 CATAAACTTA A.AATCAGAAT TTTCATTACA TAA.AATAATT CATTTAACAC GTATTTGAAT TTTAGTCTTA AAAGTAATGT ATTTTATTAA GTAAATTGTG 15951 TCAATTATCT AATTATGAAT TATGCGGGTT GGTAAGAACA TCACAACCCG AGTTAATAGA TTAATACTTA ATACGCCCAA CCATTCTTGT AGTGTTGGGC 16001 CTATTGTAAG TAG CTCTATTTGT GGCACTATAC TCGATTAATC GATAACATTC 
TTTTTTTATC GAGATAAACA CCGTGATATG AGCTAATTAG 16051 CCTATCAATT GATCAAAACT GGCATCTGAT TAATGCTCGA AATACTTTAA GGATAGTTAA CTAGTTTTGA CCGTAGACTA ATTACGAGCT TTATGAAATT 16101 TCCTTGATCG CGTCAAGAAT GTAAGTACCC CTAGTTCCCT TTAATGGCAC AGGAACTAGC GCAGTTCTTA CATTCATGGG GATCAAGGGA AATTACCGTG 16151 CTCCGTCCTT GATCGTCTCA AGATTTACTG TCCGCCCTAT ATTTTTTATC GAGGCAGGAA CTAGCAGAGT TCTAAATGAC AGGCGGGATA TAA,AAAATAG 16201 GGGGATGAAG CAATTACTCA GCCCGGGAGG GCTGATCTGG GACACTGAGA CCCCTACTTC GTTAATGAGT CGGGCCCTCC CGACTAGACC CTGTGACTCT 16251 TA.AATTTGAG TCCACCTCGA CATCTATTTA TAATACTCAT TACTCACCAT ATTTA.AAC TC AGGTGGAGCT GTAGATAAAT ATTATGAGTA ATGAGTGGTA 16301 TCATGAATAA TAGTTGTCAA GTTGACCATT ACTGAGAGGG ATAGAGAAAC AGTACTTATT ATCAACAGTT CAACTGGTAA TGACTCTCCC TATCTCTTTG 16351 TGACGCCATA GGCGACACGT TTCGATTTTT TTGATTAATG AAGCTATGGT ACTGCGGTAT CCGCTGTGCA AAGC TP►,AAAA AACTAATTAC TTCGATACCA 16401 TT TA CATTCTTTTA ACCCTCATCA AAAGCGATTC GTAATAA.ATG AATTTTTTAT GTAAGAAAAT TGGGAGTAGT TTTCGCTAAG CATTATTTAC 16451 TTCATGTAAA GCGCATTGAA TAATCCTAAT ACATTCTTCA CTTTACTTGG AAGTACATTT CGCGTAACTT ATTAGGATTA TGTAAGAAGT GA.A.ATGAAC C 16501 CATAATTTTT TTTTTATTAA GCTTTCCCCT AGGTC TTA.AA ATTTTGGAGC GTATTAAAAA P~~AAATAATT C GAAAGGGGA TCCAGAATTT TAAA.ACCTCG 221

16551 CGCCTAGAAA AAAAAAATAC ATTTTTTGGT AAAAACCCCC CTCCCCCTAA
      GCGGATCTTT TTTTTTTATG TAAAAAACCA TTTTTGGGGG GAGGGGGATT
16601 TATACACGGA CTCCTCGAAA AACCCCTAAA ACGAGGGCCG GACATATATC
      ATATGTGCCT GAGGAGCTTT TTGGGGATTT TGCTCCCGGC CTGTATATAG
16651 TTTGAATTAG CATGCGAAAT ATACTCTATA TATATAGTGT AACACTATGA
      AAACTTAATC GTACGCTTTA TATGAGATAT ATATATCACA TTGTGATACT
16701 T
      A
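Each numbered row of the listing gives a stretch of one strand followed by the same stretch of the complementary strand, in blocks of ten bases, so either half of a row can be checked against, or regenerated from, the other. The following is a minimal Python sketch of that check. The function names and the damaged block in the usage note are hypothetical illustrations and are not part of the original analysis.

```python
# Base-pairing table for position-by-position complementing (not reversed).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(block: str) -> str:
    """Return the complementary strand of a block, read in the same direction."""
    return block.translate(COMPLEMENT)


def is_clean(block: str, size: int = 10) -> bool:
    """A block is clean if it has the expected length and only A, C, G, T."""
    return len(block) == size and set(block) <= set("ACGT")


def reconcile(upper: str, lower: str, size: int = 10) -> tuple[str, str]:
    """Return a consistent (upper, lower) pair of blocks.

    If both blocks are clean they must agree. If only one is clean,
    the other is rebuilt from it. If neither is clean, give up.
    """
    if is_clean(upper, size) and is_clean(lower, size):
        if complement(upper) != lower:
            raise ValueError(f"strands disagree: {upper} / {lower}")
        return upper, lower
    if is_clean(lower, size):
        return complement(lower), lower
    if is_clean(upper, size):
        return upper, complement(upper)
    raise ValueError(f"both blocks damaged: {upper!r} / {lower!r}")
```

For example, `reconcile("CGCCTAGAAA", "GCGGATCTTT")` returns the pair unchanged, while a hypothetical damaged upper block such as `"CTCCTCGAA.A"` paired with `"GAGGAGCTTT"` is rebuilt as `"CTCCTCGAAA"`. A block damaged on both strands cannot be recovered this way and is reported as an error rather than guessed.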

tRNA 1..71
     product = tRNA-Phe
rRNA 70..1022
     product = 12S ribosomal RNA
tRNA 1023..1094
     product = tRNA-Val
rRNA 1095..2768
     product = 16S ribosomal RNA
tRNA 2769..2843
     product = tRNA-Leu
gene 2844..3818
     gene = ND1
     product = NADH dehydrogenase subunit 1
tRNA 3820..3889
     product = tRNA-Ile
tRNA 3888..3959
     product = tRNA-Gln
tRNA 3960..4028
     product = tRNA-Met
gene 4029..5072
     gene = ND2
     product = NADH dehydrogenase subunit 2
tRNA 5072..5142
     product = tRNA-Trp
tRNA complement (5144..5212)
     product = tRNA-Ala
tRNA complement (5213..5285)
     product = tRNA-Asn
tRNA complement (5319..5385)
     product = tRNA-Cys
tRNA complement (5387..5457)
     product = tRNA-Tyr
gene 5459..7016
     gene = CO1
     product = cytochrome c oxidase subunit 1
tRNA complement (7015..7085)
     product = tRNA-Ser
tRNA 7090..7159
     product = tRNA-Asp
gene 7167..7857
     gene = CO2
     product = cytochrome c oxidase subunit 2
tRNA 7858..7931
     product = tRNA-Lys
gene 7933..8100
     gene = ATP8
     product = ATP synthase F0 subunit 8
gene 8091..8774
     gene = ATP6
     product = ATP synthase F0 subunit 6
gene 8774..9559
     gene = CO3
     product = cytochrome c oxidase subunit 3
tRNA 9562..9631
     product = tRNA-Gly
gene 9632..9982
     gene = ND3
     product = NADH dehydrogenase subunit 3
tRNA 9981..10050
     product = tRNA-Arg
gene 10051..10347
     gene = ND4L
     product = NADH dehydrogenase subunit 4L
gene 10341..11721
     gene = ND4
     product = NADH dehydrogenase subunit 4
tRNA 11722..11790
     product = tRNA-His
tRNA 11791..11857
     product = tRNA-Ser
tRNA 11858..11929
     product = tRNA-Leu
gene 11930..13759
     gene = ND5
     product = NADH dehydrogenase subunit 5
gene complement (13755..14276)
     gene = ND6
     product = NADH dehydrogenase subunit 6
tRNA complement (14277..14346)
     product = tRNA-Glu
gene 14349..15494
     gene = CYTB
     product = cytochrome b
tRNA 15494..15567
     product = tRNA-Thr
tRNA complement (15570..15638)
     product = tRNA-Pro
D-Loop 14349..15494
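The feature table above gives each feature as a key, a location and one or more qualifiers, with `complement (...)` marking features encoded on the opposite strand. As a minimal sketch of how such locations can be read and turned into feature lengths, the Python below parses the two location forms used in the table. The function names are assumptions made for illustration and do not come from the thesis.

```python
import re

# Matches "2844..3818" and "complement (5144..5212)"; coordinates are
# 1-based and inclusive, as in the feature table.
LOCATION = re.compile(r"(complement)?\s*\(?\s*(\d+)\.\.(\d+)\s*\)?")


def parse_location(text: str) -> tuple[int, int, str]:
    """Return (start, end, strand) for a location string from the table."""
    match = LOCATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location: {text!r}")
    strand = "-" if match.group(1) else "+"
    return int(match.group(2)), int(match.group(3)), strand


def feature_length(start: int, end: int) -> int:
    """Length in bases of a feature with inclusive coordinates."""
    return end - start + 1


# Two entries taken from the table above.
nd1 = parse_location("2844..3818")
trna_ala = parse_location("complement (5144..5212)")
print(feature_length(*nd1[:2]))       # ND1 length in bases
print(feature_length(*trna_ala[:2]))  # tRNA-Ala length in bases
```

With inclusive coordinates, ND1 at 2844..3818 spans 975 bases and tRNA-Ala at complement (5144..5212) spans 69 bases. The same arithmetic shows where adjacent features overlap by a base or two, for example ND2 ending at 5072 and tRNA-Trp starting at 5072.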

Carcharodon carcharias mitochondrion, complete genome

1 GCTAGTGTAG CTTAATTTTA AAGCATGGCA CTGAAGATGC TAATATGAAA
  CGATCACATC GAATTAAAAT TTCGTACCGT GACTTCTACG ATTATACTTT
51 AATGAGAATT TTCCGCAGGC ATTAAGGTTT GGTCCTGGCC TCAGTATTAA
   TTACTCTTAA AAGGCGTCCG TAATTCCAAA CCAGGACCGG AGTCATAATT
101 TTGTAACCAA AATTATACAT GCAAGTTTCA GCATCCCTGT GAGAATGCCC
    AACATTGGTT TTAATATGTA CGTTCAAAGT CGTAGGGACA CTCTTACGGG
151 TAACTACTCT ATCAATTAGT TAGGAGCAGG TATCAGGCAC ACACTCACGT

ATTGATGAGA TAGTTAATCA ATCCTCGTCC ATAGTCCGTG TGTGAGTGCA 201 AGCCCAAGAC ACCTTGCTAA GCCACACCCC CAAGGGACAT CAGCAGTAAC TCGGGTTCTG TGGAACGATT CGGTGTGGGG GTTCCCTGTA GTCGTCATTG 251 AGATATTGAT TATATGAGCG CAAGCTCGAA CCAGTTAAAG TTAACAGAGT TCTATAACTA ATATACTCGC GTTCGAGCTT GGTCAATTTC AATTGTCTCA 301 TGGTCAATCT CGTGCCAGCC ACCGCGGTTA TACGAGTAAC TCACATTAAT ACCAGTTAGA GCACGGTCGG TGGCGCCAAT ATGCTCATTG AGTGTAATTA 351 ACTTCCCGGC GTAAAGAGTG ATTTAAGAAG TATCTAACAG TAATTAAAGT TGAAGGGCCG CATTTCTCAC TAAATTCTTC ATAGATTGTC ATTAATTTCA 401 TAAGACCTTA TCAAGCTGTC ACACGCACCC ATAA.AC GGAA TTATCAACAA ATTCTGGAAT AGTTCGACAG TGTGCGTGGG TATTTGCCTT AATAGTTGTT 451 CGAA.AGTGAC TTTATCTCAC TAGAAATCTT GATGTCACGA CAGTTAGACC GCTTTCACTG A.AATAGAGTG ATCTTTAGAA CTACAGTGCT GTCAATCTGG 501 CCAAACTAGG ATTAGATACC CTACTATGTC TAACCACAAA CTTAAACAAT GGTTTGATCC TAATCTATGG GATGATACAG ATTGGTGTTT GAATTTGTTA 551 AACTCACTGT ATTGTTCGCC AGAGTACTAC AAGCGCTAGC TTA,AAACCCA TTGAGTGACA TAACAAGCGG TCTCATGATG TTCGCGATCG AATTTTGGGT 601 AAGGACTTGG CGGTATCCCA AACCCACCTA GAGGAGCCTG TTCTGTAACC TTCCTGAACC GCCATAGGGT TTGGGTGGAT CTCCTCGGAC AAGACATTGG 651 GATAATCCCC GTTAAACCTC ACCACTTCTA GCCATCCCTG TCTATATACC CTATTAGGGG CAATTTGGAG TGGTGAAGAT CGGTAGGGAC AGATATATGG 701 GCCGTCGTCA GCTCACCCTA TGAAGGCTTA AAAGTAAGCA AAAAGAATTA CGGCAGCAGT CGAGTGGGAT ACTTCCGAAT TTTCATTCGT TTTTCTTAAT 751 ACTCCCATAC GTCAGGTCGA GGTGTAGCTA ATGAAGTGGA TAGAAATGGG TGAGGGTATG CAGTCCAGCT CCACATCGAT TACTTCACCT ATCTTTACCC 801 CTACATTTTC TAT.AAAGAAA ACACGAATGG TAGACTGAAA AACTGCCTAA GATGTAAAAG ATATTTCTTT TGTGCTTACC ATCTGACTTT TTGACGGATT 851 AGGTGGATTT AGCAGTAAGA AA.AGAC TAGA GAGCTTCTCT GAA.AC C GGC T TCCACCTAAA TCGTCATTCT TTTCTGATCT CTCGAAGAGA CTTTGGCCGA 901 CTGGGATGCG CACACACCGC CCGTCACTCT CCTC AATCTACTTA GACCCTACGC GTGTGTGGCG GGCAGTGAGA GGAGTTTTTT TTAGATGAAT 951 TTTTTAATTA ATGP~TC TCAAGAGGAG GCAAGTCGTA ACATGGTAAG P.►AAAATTAAT TACTTTTTAG AGTTCTCCTC CGTTCAGCAT TGTACCATTC 1001 TGTACTGGAA AGTGCACTTG GAATCA.A.AAT GTGGCTAAAC TAGTA.AAGCA ACATGACCTT TCACGTGAAC CTTAGTTTTA CACCGATTTG 
ATCATTTCGT 1051 CCTCCCTTAC AC C GAGGA.A.A TACTCGTGCA ATCCGAGTCA TTTTGAACAT GGAGGGAATG TGGCTCCTTT ATGAGCACGT TAGGCTCAGT AAA.AC TTGTA 1101 TAA.AGC TALC CTGTCCACCT AC C C CAA.AC C CAACATTATT AACTACCTTA ATTTCGATCG GACAGGTGGA TGGGGTTTGG GTTGTAATAA TTGATGGAAT 1151 CATATTAA.AC CCTAACTAAA ACATTTTATC ATTCTAGTAT GGGCGACAGA GTATAATTTG GGATTGATTT TGTAAAATAG TAAGATCATA CCCGCTGTCT 1201 AACA.A.AACTC AGCGCAATAG ACTATGTACC GCAAGGGAGA GCTGAAAGAG TTGTTTTGAG TCGCGTTATC TGATACATGG CGTTCCCTCT CGACTTTCTC 1251 A.AATGAAACA AATAATTA.AA GTAACAAAAA GCAGAGATTC TACCTCGTAC TTTACTTTGT TTATTAATTT CATTGTTTTT CGTCTCTAAG ATGGAGCATG 1301 CTTTTGCATC ATGATTTAGC TAGAA,A,AAC T AGAC A.AAGAG ATCTTTAGCC GAAAACGTAG TACTAAATCG ATCTTTTTGA TCTGTTTCTC TAGA.AATC GG 1351 TATCTTCCCG A.AAC TAAAC G AGCTACTCCG AAGCAGCACA ATTAGAGCCA ATAGAAGGGC TTTGATTTGC TCGATGAGGC TTCGTCGTGT TAATCTCGGT 1401 ACCCGTCTCT GTGGCP.~A.AAG AGTGGGAAGA CTTCCGAGTA GCGGTGATAA TGGGCAGAGA CACCGTTTTC TCACCCTTCT GAAGGCTCAT CGCCACTATT 1451 GCCTACCGAG TTTAGTGATA GCTGGTTGTC CAAGAAAAGA ACTTTAGTTC CGGATGGCTC AAATCACTAT CGACCAACAG GTTCTTTTCT TGA.AATCAAG 1501 TGCATTAATT CTTTCATTAC CAGAAA.ATCT ATTATATTAA GGTCA.AACAT ACGTAATTAA GAA.AGTAATG GTCTTTTAGA TAATATAATT CCAGTTTGTA 224

1551 AAGAATTAAT AGTTATTCAG AAGAGGTACA GCCCTTCTGA ACCAGGATAC TTCTTAATTA TCAATAAGTC TTCTCCATGT CGGGAAGACT TGGTCCTATG 1601 AACTTTCAAA GGAGGGAAAT GATCATATTT ATTAAGGTTT TCACCTCAGT TTGAAAGTTT CCTCCCTTTA CTAGTATAA.A TAATTCCA.AA AGTGGAGTCA 1651 AGGC C TAA.A.A GCAGCCACCT GTAAAGTAAG CGTCATAGCT CCAGTCCCAC TCCGGATTTT CGTCGGTGGA CATTTCATTC GCAGTATCGA GGTCAGGGTG 1701 P►~~AAAC C TAT AATCCAGATA TTCTTCTCAC CGCCCCCTTA GCTATATTGG TTTTTGGATA TTAGGTCTAT AAGAAGAGTG GCGGGGGAAT CGATATAACC 1751 ACTATTTTAT AAAATTATAA AAGAACTTAT GC TA.A.AATGA GTAATAGGAG TGATAA.AATA TTTTAATATT TTCTTGAATA CGATTTTACT CATTATCCTC 1801 GATAAACCTC TCCCGACACA AGTGTACGTC AGA.AAGAATT AACTCACTGA CTATTTGGAG AGGGCTGTGT TCACATGCAG TCTTTCTTAA TTGAGTGACT 1851 CAATTAAACG AACCCAAACT GAGGTCATTA TACTTTTATT TTAACCCAAC GTTAATTTGC TTGGGTTTGA CTCCAGTAAT ATGAAAATAA AATTGGGTTG 1901 TAGp~AAATC T TATTATAACA TTCGTTAACC CAACACAGGA GTGTCCTAAG ATCTTTTAGA ATAATATTGT AAGCAATTGG GTTGTGTCCT CACAGGATTC 1951 GAAAGATTAA AAGAAAGTAA AGGAACTCGG CAAACACGAA CTCCGCCTGT CTTTCTAATT TTCTTTCATT TCCTTGAGCC GTTTGTGCTT GAGGCGGACA 2001 TTACCP.~~AAA CATCGCCTCT TGAAACTCCA TAAGAGGTCC CGCCTGCCCT AATGGTTTTT GTAGCGGAGA ACTTTGAGGT ATTCTCCAGG GCGGACGGGA 2051 GTGACAATGT TTAACGGCCG CGGTATTCTG ACCGTGCAAA GGTAGCGTAA CACTGTTACA AATTGCCGGC GCCATAAGAC TGGCACGTTT CCATCGCATT 2101 TCACTTGTCT TTTA.AATGAA GACCCGTATG AAAGGCATCA CGAGAGTTCA AGTGAACAGA AAATTTACTT CTGGGCATAC TTTCCGTAGT GCTCTCAAGT 2151 ACTGTCTCTA TTTTCTAATC AATGAAATTG ATCTACCCGT GCAGAAGCGA TGACAGAGAT AAA.AGATTAG TTACTTTAAC TAGATGGGCA CGTCTTCGCT 2201 GTATAACCAC ATTAGACGAG AAGACCCTAT GGAGCTTCAA ACACATA.AAT CATATTGGTG TAATCTGCTC TTCTGGGATA CCTCGAAGTT TGTGTATTTA 2251 TAATTATGTA AATTAACCAC TCTACGGATA TAAAC AGA.AA TACAATACTT ATTAATACAT TTAATTGGTG AGATGCCTAT ATTTGTCTTT ATGTTATGAA 2301 TTAATTTAGC TGTTTTTGGT TGGGGTGACC AAGGGGAA.AA ATTAATCCCC AATTAAATCG ACP.~AAAACCA ACCCCACTGG TTCCCCTTTT TAATTAGGGG 2351 CTTATCGACC GAGTACTCTC AAGTACTTAA AAATTAGAAC CACAATTCTA GAATAGCTGG CTCATGAGAG TTCATGAATT TTTAATCTTG GTGTTAAGAT 2401 ATTAATA.AAA 
CATTTATCGA A.AA.ATGATC C AGGATTTCCT GATCAATGAA TAATTATTTT GTAA.ATAGC T TTTTACTAGG TCCTAAAGGA CTAGTTACTT 2451 CCAAGTTACC CTAGGGATAA CAGCGCAATC CTTTCTTAGA GTCCCTATCG GGTTCAATGG GATCCCTATT GTCGCGTTAG GAAAGAATCT CAGGGATAGC 2501 AC GA.AAGGGT TTACGACCTC GATGTTGGAT CAGGACATCC TAATGATGCA TGCTTTCCCA AATGCTGGAG CTACAACCTA GTCCTGTAGG ATTACTACGT 2551 GCCGTTATTA AGGGTTCGTT TGTTCAACGA TTAACAGTCC TACGTGATCT CGGCAATAAT TCCCAAGCAA ACAAGTTGCT AATTGTCAGG ATGCACTAGA 2601 GAGTTCAGAC C GGAGA.AATC CAGGTCAGTT TCTATCTATG AATTTATTTT CTCAAGTCTG GCCTCTTTAG GTCCAGTCAA AGATAGATAC TTAAATAAA.A 2651 TCCTAGTACG AA.AGGACCGG P.~AAAATGGAG CCAATACTAC AAGCACGCTT AGGATCATGC TTTCCTGGCC TTTTTACCTC GGTTATGATG TTCGTGCGAA 2701 CATTTTCATC TGTTGA.AATA AACTAAAATA GATAAGAA.A.A AGTCAACTTC GTAAAAGTAG ACAACTTTAT TTGATTTTAT CTATTCTTTT TCAGTTGAAG 2751 TACCCAAGAA AAGGGTTGTT GGGGTGGCAG AGCCTGGTAA AC GCAA.AAGA ATGGGTTCTT TTCCCAACAA CCCCACCGTC TCGGACCATT TGCGTTTTCT 2801 CCTAAGTTCT TTATTTCAGA GGTTCAAATC CTCTCCTCAA CTATGCTTGA GGATTCAAGA AATAA.AGTC T CCAAGTTTAG GAGAGGAGTT GATACGAACT 2851 AGCCCTCCTC CTGTACTTTA TTTCCCCCCT TACCTATATT GTTCCCATCC TCGGGAGGAG GACATGA.AAT AAAGGGGGGA ATGGATATAA CAAGGGTAGG 2901 TATTAGCCAC AGCCTTCCTC ACCCTAGTCG AAC GP~~AAAT TCTTGGTTAT 225

ATAATCGGTG TCGGAAGGAG TGGGATCAGC TTGCTTTTTA AGAACCAATA 2951 ATACAACTCC GTAAAGGCCC CAATATTGTA GGCCCCTACG GACTACTTCA TATGTTGAGG CATTTCCGGG GTTATAACAT CCGGGGATGC CTGATGAAGT 3001 GCCTGTCGCA GACGGCCTAA AATTATTTAC C AAAGA.AC C C ATTTATCCAT CGGACAGCGT CTGCCGGATT TTAATAAATG GTTTCTTGGG TAAATAGGTA 3051 CAGCATCCTC TCCCCTCCTA TTTCTAATCA CCCCCACAAT AGCCCTCACA GTCGTAGGAG AGGGGAGGAT AA.AGATTAGT GGGGGTGTTA TCGGGAGTGT 3101 CTAGCCCTCC TCATATGAAT GCCCCTACCT CTCCCCTACT CTGTTATCAA GATCGGGAGG AGTATACTTA CGGGGATGGA GAGGGGATGA GACAATAGTT 3151 CCTCAACTTA GGCTTGTTAT TTATTCTAGC AATCTCAAGT CTGACCGTCT GGAGTTGAAT CCGAACAATA AATAAGATCG TTAGAGTTCA GACTGGCAGA 3201 ATACCGTCTT AGGCTCTGGA TGAGCATCAA ATTC~TA TGCCTTAATA TATGGCAGAA TCCGAGACCT ACTCGTAGTT TAAGTTTTAT ACGGAATTAT 3251 GGGGCTTTAC GAGCTGTTGC AC AA.AC TAT T TCCTATGAAG TCAGTCTTGG CCCCGA.AATG CTCGACAACG TGTTTGATAA AGGATACTTC AGTCAGAACC 3301 ACTAATCCTC CTATCAATAA TTATCTTCAC AGGAGGCTTC ACCCTCCATA TGATTAGGAG GATAGTTATT AATAGAAGTG TCCTCCGAAG TGGGAGGTAT 3351 CTTTTAACTT AGCACAAGAA ACAATCTGAC TAATTATTCC AGGATGACCA GAA.AATTGAA TCGTGTTCTT TGTTAGACTG ATTAATAAGG TCCTACTGGT 3401 TTAGCTTTAA TATGATATGT ATCCACCTTA GCAGAGACTA ACCGAGTACC AATCGAAATT ATACTATACA TAGGTGGAAT CGTCTCTGAT TGGCTCATGG 3451 ATTTGATCTA ACAGAAGGAG AATCAGAACT AGTTTCAGGC TTTAATATTG TAAACTAGAT TGTCTTCCTC TTAGTCTTGA TCA.AAGTCCG AAATTATAAC 3501 AATATGCAGG AGGCTCCTTC GCCCTATTCT TTCTTGCTGA ATATACAAAT TTATACGTCC TCCGAGGAAG CGGGATAAGA AAGAACGACT TATATGTTTA 3551 ATTCTACTAA TAAATACCCT TTCAGTAATC CTTTTCATAG GCTCCTCCTA TAAGATGATT ATTTATGGGA AAGTCATTAG GAAA.AGTATC CGAGGAGGAT 3601 TAACCCACTC TTTC CAGAA.A TCTCAACCCT CACTCTAATA AT~GCAA ATTGGGTGAG AAAGGTCTTT AGAGTTGGGA GTGAGATTAT TATTTTCGTT 3651 CCCTGCTTAC CCTACTTTTT TTATGGATCC GAGCATCATA CCCCCGCTTT GGGACGAATG GGATGAAAAA AATACCTAGG CTCGTAGTAT GGGGGC GA.AA 3701 CGCTATGATC AACTTATACA CTTAGTGTGA P.~~,AAAC TTC T TACCCCTAAC GCGATACTAG TTGAATATGT GAATCACACT TTTTTGAAGA ATGGGGATTG 3751 CTTAGCAATT ATACTATGAC ATATCGCCCT TCCCGTAGCC ACAACAAGCC GAATCGTTAA TATGATACTG 
TATAGCGGGA AGGGCATCGG TGTTGTTCGG 3801 TGCCCCCTTT AAC C T~►A.A.AG GAAGTGTGCC TGAACAAAGG ACCACTTTGA ACGGGGGA.AA TTGGATTTTC CTTCACACGG ACTTGTTTCC TGGTGAAACT 3851 TAGAGTGGAT AATGAAAGTT AAA.ATCTTTC CTCTTCCTAG P.~3AAATAGGA ATCTCACCTA TTACTTTCAA TTTTAGAAAG GAGAAGGATC TTTTTATCCT 3901 CTTGAACCTA TACCCAAGAG ATCAA.AACTC TTCGTACTTC CAGTTATACT GAACTTGGAT ATGGGTTCTC TAGTTTTGAG AAGCATGAAG GTCAATATGA 3951 ATTTTC TA.AA GTAAGGTCAG CTAATTAAGC TTTTGGGCCC ATACCCCAAC TA,AAAGATTT CATTCCAGTC GATTAATTCG AAAACCCGGG TATGGGGTTG 4001 CATGTCGGTT A~A.AATCCCTC CTTTACTAAT GAATCCCCTT GTATTAACCA GTACAGCCAA TTTTAGGGAG GAAATGATTA CTTAGGGGAA CATAATTGGT 4051 TTATTATCTC AAGCCTGGGT CTAGGAACTA TTCTCACATT CATCAGCTCT AATAATAGAG TTCGGACCCA GATCCTTGAT AAGAGTGTAA GTAGTCGAGA 4101 CACTGACTAC TAGTCTGAAT AGGCCTTGAA ATCAACACTC TGGCCATTCT GTGACTGATG ATCAGACTTA TCCGGAACTT TAGTTGTGAG ACCGGTAAGA 4151 TCCCCTAATA ATTCACCAGC ACCACCCTCG CGCAGTAGAA GCCTCAACAA AGGGGATTAT TAAGTGGTCG TGGTGGGAGC GCGTCATCTT CGGAGTTGTT 4201 AATACTTTAT TACACAAGCA ACCGCCTCAG CCCTACTTCT ATTCGCTAGC TTATGAAATA ATGTGTTCGT TGGCGGAGTC GGGATGAAGA TAAGCGATCG 4251 GCTACAAACG CCTGAACCTC AGGCGAATGA AGCCTAATCG AAATAATCAA CGATGTTTGC GGACTTGGAG TCCGCTTACT TCGGATTAGC TTTATTAGTT 226

P~lAATCGGCT 4301 TCCAGGCTCT GCCACACTGG CCACAATCGC ATTAGCATTA AGGTCCGAGA CGGTGTGACC GGTGTTAGCG TAATCGTAAT TTTTAGCCGA 4351 TAGCTCCCCT CCACTTCTGA CTCCCCGAAG TTCTTCAAGG CCTAAATCTT ATCGAGGGGA GGTGAAGACT GAGGGGCTTC AAGAAGTTCC GGATTTAGAA 4401 ACTACAGGCC TCATCCTTTC CACCTGACAA AAACTTGCAC CATTTGCCGT TGATGTCCGG AGTAGGAAAG GTGGACTGTT TTTGAACGTG GTAAACGGCA 4451 CCTCCTACAA CTCTATCCCT CCTTAAACCC CAACCTATTA ATTTTTCTTG TP.,~~AAAGAAC GGAGGATGTT GAGATAGGGA GGAATTTGGG GTTGGATAAT 4501 GAATGCTCTC CACCATAGTA GGAGGATGAG GCGGATTAAA TCAAACCCAA CTTACGAGAG GTGGTATCAT CCTCCTACTC CGCCTAATTT AGTTTGGGTT 4551 CTACGP►~~AAA TCCTAGCCTA CTCCTCAATT GCCCATCTAG GCTGAATAGT GATGCTTTTT AGGATCGGAT GAGGAGTTAA CGGGTAGATC CGACTTATCA 4601 CTCCATTCTC CCCTATTCCT ATAATTTAAC CCAACTTAAC TTAATTCTCT GAGGTAAGAG GGGATAAGGA TATTAA.ATTG GGTTGAATTG AATTAAGAGA 4651 ATATTATTAT AACTTCAACA ACCTTCCTTC TATTCA►,AAAC ACTTAACTCA TATAATAATA TTGAAGTTGT TGGAAGGAAG ATAAGTTTTG TGAATTGAGT 4701 ACCAA.AATTA ACTCTATCTC CTCATCTTCA TCAAAATCCC CCTTACTTTC TGGTTTTAAT TGAGATAGAG GAGTAGAAGT AGTTTTAGGG GGAATGAAAG 4751 CATTATTGCC CTCATAACCC TCCTCTCTCT CGGAGGCCTT CCTCCCCTAT GTAATAACGG GAGTATTGGG AGGAGAGAGA GCCTCCGGAA GGAGGGGATA 4801 CAGGCTTTAT ACCAA.AGTGA CTTATCTTAC AAGAACTAAC CA.AACAAAAT GTCCGAAATA TGGTTTCACT GAATAGAATG TTCTTGATTG GTTTGTTTTA 4851 TTAATTATCC CCGCTACTGT CATGGCCATA ACGGCCCTCC TCAGTCTATT AATTAATAGG GGCGATGACA GTACCGGTAT TGCCGGGAGG AGTCAGATAA 4901 TTTTTACCTA CGTTTATGTT ATGCTACAAC ACTAACCATA TCCCCCGCCC P.~AAAATGGAT GCAAATACAA TACGATGTTG TGATTGGTAT AGGGGGCGGG 4951 CAATTAATAT ATTAACATCA TGACGAACCA AAATATCCCA TAACCTGATT GTTAATTATA TAATTGTAGT ACTGCTTGGT TTTATAGGGT ATTGGACTAA 5001 TTAACAACCA CTGCCTCCCT ATCTATTTTC CTCCTCCCAA TCACCCCAGC AATTGTTGGT GACGGAGGGA TAGATAAAAG GAGGAGGGTT AGTGGGGTCG .5051 CATCCTCATG CTAATATCCT AAGAAATTTA GGTTAATAAT AGAC C A,AAAG GTAGGAGTAC GATTATAGGA TTCTTTAAAT CCAATTATTA TCTGGTTTTC 5101 CCTTCAAAGC TTTAAGTAGA AGTGAAA.ATC TCCTAATTTC TGTTAAGGTC GGAAGTTTCG AAATTCATCT TCACTTTTAG AGGATTAAAG ACAATTCCAG 5151 TGCAAGACTT 
TATCTCACAT CTTCCGAATG CAACCCAGAT ACTTTTATTA ACGTTCTGAA ATAGAGTGTA GAAGGCTTAC GTTGGGTCTA TGAAAATAAT 5201 AGC TA.A.AAC C TTCTAGATAA ATAGGCCTTG ATCCTACAAG ATCTCAGTTA TCGATTTTGG AAGATCTATT TATCCGGAAC TAGGATGTTC TAGAGTCAAT 5251 ACAGCTAAGC GTTCAATCCA GCGAACTTTT ATCTACTTTC TCCCGCCGTA TGTCGATTCG CAAGTTAGGT C GC TTGA.A.AA TAGATGAAAG AGGGCGGCAT 5301 GAG G GCGGGAGAAA GCCCCGGGAG AAACTAATCT CCATCTTTGG CTCTTTTTTC CGCCCTCTTT CGGGGCCCTC TTTGATTAGA GGTAGA.AAC C 5351 GTTTGCAACC CAACGTAAAC ATCTACTGCA GGACTGGTAA GAAGAGGAAT CAAACGTTGG GTTGCATTTG TAGATGACGT CCTGACCATT CTTCTCCTTA 5401 TGAACCTCTG TCTGTGGAGC TACAATCCAT TACTTAGTTC TCAGTCACCT ACTTGGAGAC AGACACCTCG ATGTTAGGTA ATGAATCAAG AGTCAGTGGA 5451 TACCTGTGGC AATTAATCGA TGACTATTTT C TACAA.AC CA CAAAGATATT ATGGACACCG TTAATTAGCT AC TGATA.A.AA GATGTTTGGT GTTTCTATAA 5501 GGTACCCTTT ATTTAATTTT TGGTGCATGA GCAGGAATAG TGGGAACAGC CCATGGGAAA TAAATTAAAA ACCACGTACT CGTCCTTATC ACCCTTGTCG 5551 CCTAAGCCTT TTAATCCGTG CCGAGCTGGG TCAACCAGGT TCCCTCCTCG GGATTCGGAA AATTAGGCAC GGCTCGACCC AGTTGGTCCA AGGGAGGAGC 5601 GAGATGACCA GATTTATAAT GTTATTGTGA CCGCCCATGC CTTCGTAATA CTCTACTGGT C TA.AATATTA CAATAACACT GGCGGGTACG GAAGCATTAT 5651 ATCTTCTTCA TGGTAATGCC CATCATAATT GGGGGTTTTG GGAATTGACT 227

TAGAAGAAGT ACCATTACGG GTAGTATTAA CCCCCAAAAC CCTTAACTGA 57 01 AATCCCGTTA ATAATTGGTG CCCCGGACAT AGCCTTCCCC C GAATA.AATA TTAGGGCAAT TATTAACCAC GGGGCCTGTA TCGGAAGGGG GCTTATTTAT 5751 ACATAAGCTT CTGACTCCTT CCCCCTTCCT TTTTACTACT CCTAGCTTCA TGTATTCGAA GACTGAGGAA GGGGGAAGGA AAAATGATGA GGATCGAAGT 5801 GCCGGAGTTG AAGCAGGAGC CGGCACTGGT TGAACAGTCT ACCCTCCCCT CGGCCTCAAC TTCGTCCTCG GCCGTGACCA ACTTGTCAGA TGGGAGGGGA 5851 GGCCGGTAAT TTAGCACACG CAGGAGCATC CGTTGACCTG GCTATCTTCT CCGGCCATTA AATCGTGTGC GTCCTCGTAG GCAACTGGAC CGATAGAAGA 5901 CCCTTCACCT AGCAGGTATT TCCTCAATCT TGGCCTCAAT TAACTTTATT GGGAAGTGGA TCGTCCATAA AGGAGTTAGA ACCGGAGTTA ATTGAAATAA 5951 ACAACTATCA TCAATATGAA ACCCCCAGCA ATCTCCCAAT ACCAAACACC TGTTGATAGT AGTTATACTT TGGGGGTCGT TAGAGGGTTA TGGTTTGTGG 6001 CCTGTTCGTA TGATCCATCT TAGTAACAAC CATCCTTCTT CTCCTAGCCC GGACAAGCAT ACTAGGTAGA ATCATTGTTG GTAGGAAGAA GAGGATCGGG 6051 TTCCAGTGCT CGCAGCCGGT ATCACAATGT TACTTACTGA CCGAAATCTA AAGGTCACGA GCGTCGGCCA TAGTGTTACA ATGAATGACT GGCTTTAGAT 6101 AACACAACAT TCTTTGATCC AGCAGGAGGA GGAGACCCTA TTCTTTACCA TTGTGTTGTA AGAAACTAGG TCGTCCTCCT CCTCTGGGAT AAGAAATGGT 6151 ACATCTTTTC TGATTTTTTG GTCACCCTGA AGTCTACATT CTCATCCTTC TGTAGAAA.AG ACT C CAGTGGGACT TCAGATGTAA GAGTAGGAAG 6201 CTGGTTTTGG TATAATCTCC CATATTGTGG CTTATTATTC TGGT GACCAAAACC ATATTAGAGG GTATAACACC GAATAATAAG ACCATTTTTT 6251 GAACCATTTG GTTATATAGG AATGGTTTGG GCAATAATAG CAATTGGCCT CTTGGTA.AAC CAATATATCC TTAC CA.AAC C CGTTATTATC GTTAACCGGA 6301 ACTTGGGTTT ATTGTCTGAG CCCACCACAT ATTTACCGTA GGAATGGACG TGAACCCAAA TAACAGACTC GGGTGGTGTA TAAATGGCAT CCTTACCTGC 6351 TTGATACACG GGCCTACTTT ACCTCAGCAA CGATAATTAT TGCCATCCCC AACTATGTGC CCGGATGAAA TGGAGTCGTT GCTATTAATA ACGGTAGGGG 6401 ACAGGTGTAA AAGTCTTCAG CTGATTAGCT ACCCTCCATG GAGGCTCAGT TGTCCACATT TTCAGAAGTC GACTAATCGA TGGGAGGTAC CTCCGAGTCA 6451 TAAATGAGAA ACCCCCTTAC TATGGGCTCT CGGATTCATT TTCCTATTTA ATTTACTCTT TGGGGGAATG ATACCCGAGA GCCTAAGTAA AAGGATAAAT 6501 CAGTAGGGGG TTTAACAGGA ATTGTCCTAG CTAACTCTTC TCTCGATATT GTCATCCCCC AAATTGTCCT TAACAGGATC GATTGAGAAG 
AGAGCTATAA 6551 GTTCTCCACG ATACTTACTA TGTAGTAGCC CACTTTCACT ATGTTCTTTC CAAGAGGTGC TATGAATGAT ACATCATCGG GTGAA.AGTGA TACAAGAA.AG 6601 AATAGGGGCA GTATTTGCTA TCATGGCAGG CTTTATCCAC TGATTCCCAT TTATCCCCGT CATAAACGAT AGTACCGTCC GAAATAGGTG ACTAAGGGTA 6651 TAATAACGGG TTACACACTC CATTCAACCT GAACp,~~AAAT CCAATTCGCA ATTATTGCCC AATGTGTGAG GTAAGTTGGA CTTGTTTTTA GGTTAAGCGT 6701 GTTATGTTTA TTGGGGTAAA TTTAACATTC TTCCCTCAAC ACTTCCTAGG C AATAC A.AAT AACCCCATTT AAATTGTAAG AAGGGAGTTG TGAAGGATCC 6751 CCTCGCCGGT ATGCCACGAC GTTACTCAGA CTACCCAGAC GCTTACACTT GGAGCGGCCA TACGGTGCTG CAATGAGTCT GATGGGTCTG CGAATGTGAA 6801 TATGA.AATAC AGTTTCCTCT ATTGGCTCTT TAATCTCACT TGTAGCAGTA ATACTTTATG TCAAAGGAGA TAACCGAGAA ATTAGAGTGA ACATCGTCAT 6851 ATTATACTAT TATTTATTAT TTGAGAAGCA TTCGCCTCAA AACGAGAAGT TAATATGATA ATAAATAATA AACTCTTCGT AAGCGGAGTT TTGCTCTTCA 6901 TCTGTCTATT GAGCTACCCC ACACAAACGT TGAATGACTC CATGGTTGCC AGACAGATAA CTCGATGGGG TGTGTTTGCA ACTTACTGAG GTACCAACGG 6951 CCCCGCCCTA TCACACATAC GAAGAACCAG CGTTTGTTCA AGTCCAACGA GGGGCGGGAT AGTGTGTATG CTTCTTGGTC GCA.A.ACAAGT TCAGGTTGCT 7001 AACCTTTAAG ACAAGAAAGG AAGGAATTGA ACCCCCATAT .GTTAGTTTCA TTGGA.AATTC TGTTCTTTCC TTCCTTAACT TGGGGGTATA CAATCAAAGT 228

7051 AGCCAACCAC ATCACCACTC TGTCACTTTC TTCATAGAGA CCCTAGTAAA TCGGTTGGTG TAGTGGTGAG ACAGTGAAAG AAGTATCTCT GGGATCATTT 7101 TACATTACAC TGCCTTGTCA GGGCAA.AATT GTGGGTTTAA ATCCCACGGA ATGTAATGTG ACGGAACAGT CCCGTTTTAA CACCCAAATT TAGGGTGCCT 7151 TCTTAACCAA TGGCACACCC CTCACAATTA GGTTTTCAAG ATGCAGCCTC AGAATTGGTT ACCGTGTGGG GAGTGTTAAT CCAAAAGTTC TACGTCGGAG 7201 CCCAGTTATG GAAGAACTTA TCCACTTTCA CGACCACACA CTAATAATTG GGGTCAATAC CTTCTTGAAT AGGTGAAAGT GCTGGTGTGT GATTATTAAC 7251 TATTTCTAAT TAGCGCCCTA GTTCTTTATA TCATCACAGC GATAGTATCC ATAAAGATTA ATCGCGGGAT CAAGAA.ATAT AGTAGTGTCG CTATCATAGG 7301 ACAAAACTCA CAAACAA.ATA CATTCTTGAC TCCCAAGAAA TTGAAATCGT TGTTTTGAGT GTTTGTTTAT GTAAGAACTG AGGGTTCTTT AACTTTAGCA 7351 CTGGACTATC CTCCCTGCTA TTATCCTCAT TATAATTGCC CTCCCATCTT GACCTGATAG GAGGGACGAT AATAGGAGTA ATAT'I'AAC GG GAGGGTAGAA 7401 TACGAATCTT ATATCTTATA GAC GA.AATTA ACGATCCCCA CTTAACCATT ATGCTTAGAA TATAGAATAT CTGCTTTAAT TGCTAGGGGT GAATTGGTAA 7451 AAAGCTATAG GTCATCAATG ATACTGAAGC TACGAATATA CAGACTATGA TTTCGATATC CAGTAGTTAC TATGACTTCG ATGCTTATAT GTCTGATACT 7501 AGATCTAGGC TTCGACTCTT ACATAATTCA AACCCAAGAC TTAACCCCCG TCTAGATCCG AAGCTGAGAA TGTATTAAGT TTGGGTTCTG AATTGGGGGC 7551 GCCAATTTCG TCTACTAGAA ACAGACCATC GAATAGTAGT CCCCATAGAG CGGTTAAAGC AGATGATCTT TGTCTGGTAG CTTATCATCA GGGGTATCTC 7601 TCACCTGTTC GTGTCCTAGT GTCCGCAGAA GATGTTTTAC ACTCATGAGC AGTGGACAAG CACAGGATCA CAGGCGTCTT CTACAAAATG TGAGTACTCG 7651 CGTTCCAGCT TTAGGGGTTA AA.ATAGATGC TGTCCCAGGA CGTTTA.AACC GCAAGGTCGA AATCCCCAAT TTTATCTACG ACAGGGTCCT GCAAATTTGG 7701 AAACTGCCTT TATCATTTCT CGACCAGGCA TCTATTACGG CCAATGTTCA TTTGACGGAA ATAGTAAAGA GCTGGTCCGT AGATAATGCC GGTTACAAGT 7751 GAAATTTGTG GGGCTAATCA CAGTTTTATA CCCATTGTAG TAGAAGCAGT CTTTAAACAC CCCGATTAGT GTCAAAATAT GGGTAACATC ATCTTCGTCA 7801 CCCTTTAGAA CACTTTGAAG CCTGATCTTC ATTAATATTA GAAGAAGCCT GGGAAATCTT GTGAAACTTC GGACTAGAAG TAATTATAAT CTTCTTCGGA 7851 CACTGAGAAG CTA.AACTGGG ACTAGCGTTA GCCTTTTAAG CTP~~~ACTG GTGACTCTTC GATTTGACCC TGATCGCAAT CGGA,AAATTC GATTTTTGAC 7901 GTGATTCCCT CCCACCCTTA 
GTGATATGCC TCAATTAA.AC CCTCACCCTT CACTAAGGGA GGGTGGGAAT CACTATACGG AGTTAATTTG GGAGTGGGAA 7951 GATTAATGAT CCTTTTGTTC TCATGAACAA TTTTCCTCAT TATCTTACCA CTAATTACTA GGA,AAACAAG AGTACTTGTT A.A.AAGGAGTA ATAGAATGGT 8001 GTAA CAAACCACCT ATATAGCAAC AACCCAACAT TP.,~~AA.AGTAC TTTTTTCATT GTTTGGTGGA TATATCGTTG TTGGGTTGTA ATTTTTCATG 8051 AGP~~AA.ATCT A.AAC C C GAAC CCTGAAATTG ACCATGATCC TAAACTTTTT TCTTTTTAGA TTTGGGCTTG GGACTTTAAC TGGTACTAGG ATTTGP~~AAA 8101 TGATCAATTC CTAAGTCCCT CCATCCTCGG AATTCCACTT ATTATCCTCG ACTAGTTAAG GATTCAGGGA GGTAGGAGCC TTAAGGTGAA TAATAGGAGC 8151 CAATCGCTTT ACCCTGACTA ATCTTCCCAA CCCCAACTAA CCGATGACTT GTTAGCGA.AA TGGGACTGAT TAGAAGGGTT GGGGTTGATT GGCTACTGAA 8201 AACAACCGAC TAATAACACT CCAAAGCTGA TTTATTAATC GATTTACTTA TTGTTGGCTG ATTATTGTGA GGTTTCGACT AAATAATTAG CTAAATGAAT 8251 TCAACTCATT CAACCTATAA ACTTCACCGG CCATAAATGA GCTATACTAT AGTTGAGTAA GTTGGATATT TGAAGTGGCC GGTATTTACT CGATATGATA 8301 TTACAACACT AATACTATTT TTAATCACTA TTAACCTACT AGGCCTTCTC AATGTTGTGA TTATGATAAA AATTAGTGAT AATTGGATGA TCCGGAAGAG 8351 CCTTACACCT TTACACCAAC AACACAACTC TCCCTTAACA TGGCATTCGC GGAATGTGGA AATGTGGTTG TTGTGTTGAG AGGGAATTGT ACCGTAAGCG 8401 CCTTCCCCTA TGATTTACCA CCGTCCTAGT GGGTATACTT AACCAACCCA 229

GGAAGGGGAT ACTAAATGGT GGCAGGATCA CCCATATGAA TTGGTTGGGT 8451 CAGTGGCCCT AGGACACTTC TTACCAGAAG GTACCCCCAC CCTTCTAGTA GTCACCGGGA TCCTGTGAAG AATGGTCTTC CATGGGGGTG GGAAGATCAT 8501 CCAGCCCTAA TTATCATTGA GACCATTAGC CTATTTATCC GACCACTAGC GGTCGGGATT AATAGTAACT CTGGTAATCG GATAAATAGG CTGGTGATCG 8551 ATTAGGAGTA CGATTAACCG CTAATCTGAC AGCCGGTCAC CTACTTATAC TAATCCTCAT GCTAATTGGC GATTAGACTG TCGGCCAGTG GATGAATATG 8601 AACTAATTGC AACTGCCACC TTTATGCTCA TCACCATTAT ACCAGCCGTG TTGATTAACG TTGACGGTGG AAATACGAGT AGTGGTAATA TGGTCGGCAC 8651 GCACTCCTCA CATCAATTAT TTTATTTTTA CTAACAATTT TAGAAGTAGC CGTGAGGAGT GTAGTTAATA AAATP~~AAAT GATTGTTAAA ATCTTCATCG 8701 TGTAGCAATA ATCCAAGCAT ATGTATTTGT CCTCCTATTA AGCCTTTATC ACATCGTTAT TAGGTTCGTA TACATAAACA GGAGGATAAT TCGGAAATAG 8751 TACAAGAGAA TATCTAATGG CCCACCAAGC ACACGCATAT CACATAGTTG ATGTTCTCTT ATAGATTACC GGGTGGTTCG TGTGCGTATA GTGTATCAAC 8801 ACCCAAGCCC TTGACCACTA ACCGGAGCTA CAGCCGCCCT TCTAATAACA TGGGTTCGGG AACTGGTGAT TGGCCTCGAT GTCGGCGGGA AGATTATTGT 8851 TCTGGACTAA CCATCTGGTT TCACTTCCAC TCATTAATTC TCCTTTATCT AGACCTGATT GGTAGACCAA AGTGAAGGTG AGTAATTAAG AGGA.AATAGA 8901 AGGACTAACC CTTCTTTTAC TAACTATAGT TCAATGATGA CGCGATATCA TCCTGATTGG GAAGAAAATG ATTGATATCA AGTTACTACT GCGCTATAGT 8951 TCCGAGAGGG GACATTTCAA GGTCATCATA CACCCCCCGT TCP.~~A.AAGGC AGGCTCTCCC CTGTAAAGTT CCAGTAGTAT GTGGGGGGCA AGTTTTTCCG 9001 CTCCGCTATG GAATAATTCT ATTCATTACA TCAGAAGTAT TCTTCTTCTT GAGGCGATAC CTTATTAAGA TAAGTAATGT AGTCTTCATA AGAAGAAGAA 9051 AGGGTTTTTC TGAGCCTTCT ATCATTCGAG CCTTGCCCCC ACCCCCGAAC TCCCP►~~AAAG ACTCGGAAGA TAGTAAGCTC GGAACGGGGG TGGGGGCTTG 9101 TAGGGGGATG CTGGCCACCA ACAGGAATTA GCCCTATAGA CCCATTTGAA ATCCCCCTAC GACCGGTGGT TGTCCTTAAT CGGGATATCT GGGTAA.ACTT 9151 GTCCCACTAT TA.AATACTGC AGTACTCCTA GCCTCAGGTG TAACAGTGAC CAGGGTGATA ATTTATGACG TCATGAGGAT CGGAGTCCAC ATTGTCACTG 9201 TTGAACACAT CATAGCCTTA TAGAAGGAAA CCGAAAGGAA ACTATCCAAG AACTTGTGTA GTATCGGAAT ATCTTCCTTT GGCTTTCCTT TGATAGGTTC 9251 CCCTCACCCT TACTATTCTC CTAGGAGTGT ACTTTACAGC CC TTCA.AAC C GGGAGTGGGA ATGATAAGAG GATCCTCACA 
TGAAATGTCG GGAAGTTTGG 9301 ATAGAATATT ATGAAGCACC TTTCACAATC GCTGACGGAA TTTACGGTAC TATCTTATAA TACTTCGTGG AAAGTGTTAG CGACTGCCTT AAATGCCATG 9351 AACCTTCTTC GTCGCCACAG GATTCCACGG CCTCCATGTT ATTATTGGCT TTGGAAGAAG CAGCGGTGTC CTAAGGTGCC GGAGGTACAA TAATAACCGA 9401 CAACATTCCT AATAATCTGC CTATTACGAC A.AATCCAATA TCACTTTACA GTTGTAAGGA TTATTAGACG GATAATGCTG TTTAGGTTAT AGTGA.AATGT 9451 TC CA.AACAC C ACTTTGGATT TGAAGCTGCT GCATGATACT GACACTTTGT AGGTTTGTGG TGAAACCTAA ACTTCGACGA CGTACTATGA CTGTGAAACA 9501 AGACGTAGTA TGATTATTCC TTTATATTTC CATCTATTGA TGAGGCTCAT TCTGCATCAT ACTAATAAGG AAATATA.AAG GTAGATAACT ACTCCGAGTA 9551 AACTACTTTT CTAGTATAGA CTAGTACAAA TGATTTCCAA TCATTA.AATC TTGATGAAAA GATCATATCT GATCATGTTT ACTAAAGGTT AGTAATTTAG 9601 TTGGTTAA.AA CCCAAGGAAA AGTAATGAAT CTCATCATAT CTTCTGTCGC 'AACCAATTTT GGGTTCCTTT TCATTACTTA GAGTAGTATA GAAGACAGCG 9651 AGCTACGGCC CTGATTTCCC TAATCCTTGT ATTTATTGCA TTTTGACTCC TCGATGCCGG GACTAAAGGG ATTAGGAACA TAAATAACGT AAAACTGAGG 9701 CATCACTCAA CCCAGACAAC GAAAAATTAT CCCCATATGA GTGCGGCTTC GTAGTGAGTT GGGTCTGTTG CTTTTTAATA GGGGTATACT CACGCCGAAG 9751 GACCCTCTTG GAAATGCACG TCTCCCATTC TCCCTGCGCT TCTTCCTCGT CTGGGAGAAC CTTTACGTGC AGAGGGTAAG AGGGACGCGA AGAAGGAGCA 230

9801 AGCTATCTTA TTCCTACTAT TTGACCTAGA AATCGCCCTC CTTCTCCCTC TCGATAGAAT AAGGATGATA AACTGGATCT TTAGCGGGAG GAAGAGGGAG 9851 TACCCTGAGG AAACCAATTA TTATCTCCGC TCCATACACT ATTCTGAGCA ATGGGACTCC TTTGGTTAAT AATAGAGGCG AGGTATGTGA TAAGACTCGT 9901 ACAACCATCT TAATTCTGCT CACCCTGGGC CTTATTTATG AATGACTTCA TGTTGGTAGA ATTAAGACGA GTGGGACCCG GAATAAATAC TTACTGAAGT 9951 AGGGGGATTA GAGTGAGCAG AATAGATACT TAGTC C.F~AAA TAAAGACCAC TCCCCCTAAT CTCACTCGTC TTATCTATGA ATCAGGTTTT ATTTCTGGTG 10001 TAATTTCGGC TTAGTAAATT ATGGTGA~AAA TCCATAAGTA TCTTATGTCC ATTAAAGCCG AATCATTTAA TACCACTTTT AGGTATTCAT AGAATACAGG 10051 CCTATATATT TTAGCCTTAA CTCAGCATTT ATACTAGGCC TAATAGGTCT GGATATATAA AATCGGAATT GAGTCGTAAA TATGATCCGG ATTATCCAGA 10101 TGCACTTAAC CGCTATCACC TTTTATCCGC ACTTTTATGT TTAGAA.AGCA ACGTGAATTG GCGATAGTGG AAAATAGGCG TGA.A.AATACA AATCTTTCGT 10151 TACTATTAAC CCTATTTATT ACTATTGCTA TCTGAACCCT TACACTAAAC ATGATAATTG GGATAAATAA TGATAACGAT AGACTTGGGA ATGTGATTTG 10201 TCTACCTCTT CTTCAATTAC CCCTATAATC CTCCTCACAT TTTCAGCTTG AGATGGAGAA GAAGTTAATG GGGATATTAG GAGGAGTGTA AAAGTCGAAC 10251 CGAAGCTAGT GCAGGCTTGG CTATTCTAGT TGCCACTTCA CGCTCACATG GCTTCGATCA CGTCCGAACC GATAAGATCA ACGGTGAAGT GCGAGTGTAC 10301 GTTCCGATAA CTTACAAAAC CTAAACCTTC TCCAATGCTA AAAATTCTTA CAAGGCTATT GAATGTTTTG GATTTGGAAG AGGTTACGAT TTTTAAGAAT 10351 TCCCAACAAT CATACTCTTT CCAACCACAT GAATTATTAA C TGA AGGGTTGTTA GTATGAGAAA GGTTGGTGTA CTTAATAATT GTTTTTTACT 10401 CTATGACCCA TAACTACCTC CTATAGCCTT CTAATCGCAT TATTAAGTTT GATACTGGGT ATTGATGGAG GATATCGGAA GATTAGCGTA ATAATTCAAA 10451 AGTCTGATTC AAATGAAACA CAGACATCGG ATGAGACTTT TCCAATCAAT TCAGACTAAG TTTACTTTGT GTCTGTAGCC TACTCTGAAA AGGTTAGTTA 10501 TTATGGCTAT TGATCCCTTA TCAGCCCCTC TACTCATTCT CACCTGCTGA AATACCGATA ACTAGGGAAT AGTCGGGGAG ATGAGTAAGA GTGGACGACT 10551 CTTCTCCCAC TAATAATTTT AGCCAGCCAA AATCATATCT CCACAGAACC GAAGAGGGTG ATTATTAAAA TCGGTCGGTT TTAGTATAGA GGTGTCTTGG 10601 AGA.AAC C C GA CAACGAATAT ACCTTTCCCT ACTCATTTCT CTCCA.AACTT TCTTTGGGCT GTTGCTTATA TGGAAAGGGA TGAGTAAAGA GAGGTTTGAA 10651 TCCTTATTAT 
AGCCTTTTCT GCAACCGAAA TAATTATATT TTACATTATA AGGAATAATA TCGGAAAAGA CGTTGGCTTT ATTAATATAA AATGTAATAT 10701 TTTGAAGCTA CACTTATCCC CACCCTCATT ATTATTACAC GATGAGGAAA AAACTTCGAT GTGAATAGGG GTGGGAGTAA TAATAATGTG CTACTCCTTT 10751 CCA.AACAGAA CGCCTAAGTG CAGGAACTTA CTTCCTATTC TACACCTTAA GGTTTGTCTT GCGGATTCAC GTCCTTGAAT GAAGGATAAG ATGTGGAATT 10801 TTGGTTCCCT CCCCCTTCTT ATTGCCCTCC TACTAATACA AAACGACCTA AACCAAGGGA GGGGGAAGAA TAACGGGAGG ATGATTATGT TTTGCTGGAT 10851 GGCACTCTGT CCATACTTAT TATACAATAC ACACAACCCA TGACCCTAAC CCGTGAGACA GGTATGAATA ATATGTTATG TGTGTTGGGT ACTGGGATTG 10901 CTCATGAGCA GACA,A.AC TAT GATGAGTGGC CTGCCTCTTC GCCTTTCTTG GAGTACTCGT CTGTTTGATA CTACTCACCG GACGGAGAAG CGGAAAGAAC 10951 TTAA.AATACC CCTATATGGA ATACACCTTT GATTACCCAA AGCCCACGTC AATTTTATGG GGATATACCT TATGTGGAAA CTAATGGGTT TCGGGTGCAG 11001 GAGGCTCCAA TTGCCGGTTC AATAATCTTA GCTGCCGTAT TACTTAAAAT CTCCGAGGTT AACGGCCAAG TTATTAGAAT CGACGGCATA ATGAATTTTA 11051 AGGAGGCTAT GGCATAATAC GAATTATTGT AATGCTAAAT CCCCTCACTA TCCTCCGATA CCGTATTATG CTTAATAACA TTACGATTTA GGGGAGTGAT 11101 AAGAAATAGC CTACCCCTTC CTAATTCTGG CCATCTGAGG AATTATCATA TTCTTTATCG GATGGGGAAG GATTAAGACC GGTAGACTCC TTAATAGTAT 11151 ACCAGCTCTA TTTGTCTACG ACAGACTGAC CTCA.AATCTC TAATCGCTTA 231

TGGTCGAGAT AAACAGATGC TGTCTGACTG GAGTTTAGAG ATTAGCGAAT 11201 CTCATCCGTA AGCCACATAG GCCTGGTAAC AGGAGCAATC C TAATC CA.AA GAGTAGGCAT TCGGTGTATC CGGACCATTG TCCTCGTTAG GATTAGGTTT 11251 CACCATGAAG CTTTGCAGGA GCAATTACAC TAATAATTGC CCACGGCCTG GTGGTACTTC GAAACGTCCT CGTTAATGTG ATTATTAACG GGTGCCGGAC 11301 ATTTCATCCG CCCTATTCTG CTTAGCCAAC ACTAACTACG AGCGAATCCA TAAAGTAGGC GGGATAAGAC GAATCGGTTG TGATTGATGC TCGCTTAGGT 11351 CAGCCGAACA ATACTCCTAG CCCGGGGTAT ACA.AATTATT TTTCCACTAA GTCGGCTTGT TATGAGGATC GGGCCCCATA TGTTTAATAA AAAGGTGATT 11401 TAGCTACCTG ATGATTCCTT GCCATCCTAG CCAACCTTGC CCTTCCCCCC ATCGATGGAC TACTAAGGAA CGGTAGGATC GGTTGGAACG GGAAGGGGGG 11451 TCTCCCAATT TTATAGGAGA ACTCCTTATT ATTACCTCCT TATTTAACTG AGAGGGTTAA AATATCCTCT TGAGGAATAA TAATGGAGGA ATAAATTGAC 11501 ATCCAACTGA ACTATTACCC TCTCGGGCCT TGGAGTATTA ATCACCGCTT TAGGTTGACT TGATAATGGG AGAGCCCGGA ACCTCATAAT TAGTGGCGAA 11551 CCTACTCCCT TTATATATTC CTAATAATTC AACGCGGACC AACCCCCTAC GGATGAGGGA AATATATAAG GATTATTAAG TTGCGCCTGG TTGGGGGATG 11601 CATATCTTAT CATTAA.ACC C AAGCCATACA CGAGAACACC TCCTCCTGAG GTATAGAATA GTAATTTGGG TTCGGTATGT GCTCTTGTGG AGGAGGACTC 11651 CCTCCACCTC ATGCCCGTCC TACTTCTAAT ATTTAAACCA GAACTTATCT GGAGGTGGAG TACGGGCAGG ATGAAGATTA TAAATTTGGT CTTGAATAGA 11701 GAGGCTGAAC ACTCTGTGCT TATAGTTTAA CCAAAACATT AGATTGTGGT CTCCGACTTG TGAGACACGA ATATCA.AATT GGTTTTGTAA TCTAACACCA TCTP~~AAATA 11751 AAAGTTAAAA CCTTTTTAAC CACCGAGAGA GGTCGGGGAC AGATTTTTAT TTTCAATTTT GGAAAAATTG GTGGCTCTCT CCAGCCCCTG 11801 ATGAAGATCT GCTAATTCTT CTCATCATGG CTCAAATCCA TGACTCACTC TACTTCTAGA CGATTAAGA.A GAGTAGTACC GAGTTTAGGT ACTGAGTGAG 11851 AGCTTCTGAA AGATATAAGT AATCTATTGG TCTTAGGAAC Cp,~~AAACCCT TCGAAGACTT TCTATATTCA TTAGATAACC AGAATCCTTG GTTTTTGGGA C CAAGCA.A.AA 11901 TGGTGCAACT GCTATGAATA CTATCTTTAA CTCATCCTTC ACCACGTTGA GGTTCGTTTT CGATACTTAT GATAGAAATT GAGTAGGAAG 11951 CTCCTAATCT TTGTTACCCT CCTCTTCCCA CTAATAACCT CATTAAATCC GAGGATTAGA AACAATGGGA GGAGAAGGGT GATTATTGGA GTAATTTAGG 12 0 01 A►AAAGAAC TC ACTCCTAACT GGGCCTCCTC CTACGCAAAA ACATCTGTAA 
TTTTCTTGAG TGAGGATTGA CCCGGAGGAG GATGCGTTTT TGTAGACATT 12051 AAATCTCCTT CTTCATTAGC CTTATCCCAC TATCCATTTT TCTAGACCAA TTTAGAGGAA GAAGTAATCG GAATAGGGTG ATAGGTAAAA AGATCTGGTT 12101 GGTTTAGAGT CAATCATAAC CAACTACAAC TGAATCAACA TTGGGCCTTT CCAAATCTCA GTTAGTATTG GTTGATGTTG ACTTAGTTGT AACCCGGAAA 12151 CGATATTAAT ATAAGCTTCA AATTTGATAC ATACTCTGTT CTATTTACCC GCTATAATTA TATTCGAAGT TTA.AAC TATG TATGAGACAA GATA.AATGGG 12201 CTGTGGCCCT CTACGTTACT TGATCTATCC TTGAATTTGC CCTATGATAC GACACCGGGA GATGCAATGA ACTAGATAGG A.AC TTAAAC G GGATACTATG AC 12251 ATATACTCCG C CA.A.ACAT CAACCGCTTC TTTA.AATATC TCCTACTCTT TATATGAGGC TGGGTTTGTA GTTGGCGAAG A.AATTTATAG AGGATGAGAA 12301 CCTAATCTCA ATAATCATTT TAGTGACTGC CAACAACATA TTCCAACTGT GGATTAGAGT TATTAGTAAA ATCACTGACG GTTGTTGTAT AAGGTTGACA 12351 TCATTGGCTG AGAGGGAGTT GGAATCATAT CATTTCTCCT CATTGGTTGA AGTAACCGAC TCTCCCTCAA CCTTAGTATA GTAAAGAGGA GTAACCAACT 12401 TGACATAGTC GAACAGACGC CAACACAGCC GCCCTACAAG CTGTAATCTA ACTGTATCAG CTTGTCTGCG GTTGTGTCGG CGGGATGTTC GACATTAGAT 12451 TAACCGAGTA GGAGATATTG GCCTAATTCT TAGCATAGCT TGACTAGCTA ATTGGCTCAT CCTCTATAAC CGGATTAAGA ATCGTATCGA ACTGATCGAT TA.AATTTAAA 12501 TTCCTGAGAA ATTCAACAAC TATTCATCTT ATCTA.AAGAC ATTT.A.AATTT AAGGACTCTT TAAGTTGTTG ATAAGTAGAA TAGATTTCTG 232

12551 ATAAACTTAA CCCTACCTCT CCTTGGTCTT GTCCTAGCCG CAGCTGGAAA TATTTGAATT GGGATGGAGA GGAACCAGAA CAGGATCGGC GTCGACCTTT 12601 ATCCGCACAA TTCGGCCTTC ACCCATGACT TCCATCGGCC ATAGAAGGAC TAGGCGTGTT AAGCCGGAAG TGGGTACTGA AGGTAGCCGG TATCTTCCTG 12651 CCACACCAGT CTCCGCCTTA CTCCACTCCA GCACAATAGT TGTAGCCGGT GGTGTGGTCA GAGGCGGAAT GAGGTGAGGT CGTGTTATCA ACATCGGCCA 12701 ATCTTCCTCC TAATTCGCCT TCACCCATTA ATCCACAATA ACCAATTAAT TAGAAGGAGG ATTAAGCGGA AGTGGGTAAT TAGGTGTTAT TGGTTAATTA 12751 TCTAACAACA TGCTTATGCC TAGGAGCATT AACTACCCTC TTCACTGCAA AGATTGTTGT ACGAATACGG ATCCTCGTAA TTGATGGGAG AAGTGACGTT 12801 CATGCGCACT TACCCAAAAT GATATTP~AAA A.AATCATTGC CTTCTCAACA GTACGCGTGA ATGGGTTTTA CTATAATTTT TTTAGTAACG GAAGAGTTGT 12851 TCCAGCCAAC TAGGACTTAT AATGGTAACA ATTGGCCTTA ACCAACCACA AGGTCGGTTG ATCCTGAATA TTACCATTGT TAACCGGAAT TGGTTGGTGT 12901 ACTAGCCTTC CTCCACATCT GCACCCACGC CTTCTTTA.AA GCTATACTTT TGATCGGAAG GAGGTGTAGA CGTGGGTGCG GAAGAAATTT CGATATGAAA 12951 TCCTCTGTTC CGGATCCATC ATCCATAGCC TCAATGACGA ACAAGATATC AGGAGACAAG GCCTAGGTAG TAGGTATCGG AGTTACTGCT TGTTCTATAG 13001 CGCA,AAATAG GAGGTCTCCA CA.AAC TC C TA CCATTCACCT CCTCCTCCTT GCGTTTTATC CTCCAGAGGT GTTTGAGGAT GGTAAGTGGA GGAGGAGGAA 13051 AACTATTGGA AGCCTAGCCC TCATAGGCAT GCCCTTCTTA TCAGGCTTCT TTGATAACCT TCGGATCGGG AGTATCCGTA CGGGAAGAAT AGTCCGAAGA 13101 TTTCA.AAAGA TGCCATCATC GAAGCCATGA ACACTTCTCA CCTCAACGCC A.AAGTTTTCT ACGGTAGTAG CTTCGGTACT TGTGAAGAGT GGAGTTGCGG 13151 TGAGCCCTTA TCCTTACCCT AATTGCAACA TCATTTACAG CCATTTACAG ACTCGGGAAT AGGAATGGGA TTAACGTTGT AGTAAATGTC GGTAAATGTC 13201 CCTCCGCCTC ATTTTTTTTA CATTAATAAA TTTTCCACGG TTTAATTCAC GGAGGCGGAG TAA T GTAATTATTT AAAAGGTGCC A.AATTAAGTG 13251 TCTCCCCTAT TAATGAAAAC AACCCCCTAA TTATTAACCC AATCAAACGC AGAGGGGATA ATTACTTTTG TTGGGGGATT AATAATTGGG TTAGTTTGCG 13301 CTAGCTTACG GAAGTATCCT GTCTGGCCTC ATTATCACAT CCAACATAAT GATCGAATGC CTTCATAGGA CAGACCGGAG TAATAGTGTA GGTTGTATTA 13351 C C C TAC AA.AA ACCCAAATTA TAACTATAAT TCCCCTACTA AAACTCTCCG GGGATGTTTT TGGGTTTAAT ATTGATATTA AGGGGATGAT TTTGAGAGGC 13401 CCCTACTAAT 
TACTATTGCT GGCCTTCTAC TAGCCTTAGA ACTAGCAAAC GGGATGATTA ATGATAACGA CCGGAAGATG ATCGGAATCT TGATCGTTTG 13451 CTAACTAACT CTCAGCTTAA AATAACCCCC ACCCCCTATC CCCATCACTT GATTGATTGA GAGTCGAATT TTATTGGGGG TGGGGGATAG GGGTAGTGAA 13501 CTCAA.ATATA TTGGGATACT TTCCCCAAAT TATCCACCGC CTCCTGCCCA GAGTTTATAT AACCCTATGA AAGGGGTTTA ATAGGTGGCG GAGGACGGGT 13551 AAATTAACCT ATCCTGAGCC CAACATGTTT CCACTCACCT AATTGACCAA TTTAATTGGA TAGGACTCGG GTTGTACAAA GGTGAGTGGA TTAACTGGTT 13601 ACATGGTCTG TTGG AC CP.~AAAAGT GCCTTTATTC AACAAACCCT TGTACCAGAC TTTTTTAACC TGGTTTTTCA CGGAAATAAG TTGTTTGGGA 13651 TCTAATTAAA TTATCCACCC AACCTCAACA AGGCTATATT AAAGTTTACC AGATTAATTT AATAGGTGGG TTGGAGTTGT TCCGATATAA TTTCA.AATGG 13701 TAACACTACT TTTCCTCACG CTAGCCTTAG CCATACTCAC TACATTAACC ATTGTGATGA AAAGGAGTGC GATCGGAATC GGTATGAGTG ATGTAATTGG 13751 TAACCACACG TAATGTCCCT CATGCTAGAC CTCGAGTCAA CTCCAACACC ATTGGTGTGC ATTACAGGGA GTACGATCTG GAGCTCAGTT GAGGTTGTGG 13801 ACAAACAAAG TCAATAATAA CACCCACCCA CTTAAAACTA ACA.ACCACCC TGTTTGTTTC AGTTATTATT GTGGGTGGGT GAATTTTGAT TGTTGGTGGG 13851 CCCATCCTCA TA.AAGCAAAG CCACCCCCAC A.AA.ATC C CCA CGAGTTATCT GGGTAGGAG.T ATTTCGTTTC GGTGGGGGTG TTTTAGGGGT GCTCAATAGA 13901 CCATATTGCT CAACTCCTCT ACCCCTGACC AATCCAACTC AAATCACTCT 233

GGTATAACGA GTTGAGGAGA TGGGGACTGG TTAGGTTGAG TTTAGTGAGA 13951 ACCATAAAAT AATTACCAAC TAAC ATTACTAAAT P►~~AAAC CAAC TGGTATTTTA TTAATGGTTG TTTTTTATTG TAATGATTTA TTTTTGGTTG 14001 ATACAACAAA ACAGATCAAT TACCCCATGA CTCAGGATAT GGCTCAGCAG TATGTTGTTT TGTCTAGTTA ATGGGGTACT GAGTCCTATA CCGAGTCGTC 14051 CAAGAGCTGC CGTATAAGCA AACACCACCA ACATTCCCCC CAA.ATAAATT GTTCTCGACG GCATATTCGT TTGTGGTGGT TGTAAGGGGG GTTTATTTAA 14101 AAGAACAAAA CTAATGATAA AA.AAGAC C CA CCATGTCCTA CCAACAGACC TTCTTGTTTT GATTACTATT TTTTCTGGGT GGTACAGGAT GGTTGTCTGG 14151 ACACCCTATC CCAGCAGCCA TAACCAACCC TAACGCAGCA TAATAAGGTG TGTGGGATAG GGTCGTCGGT ATTGGTTGGG ATTGCGTCGT ATTATTCCAC 14201 AAGGATTGGA CGCTACTCCT ATTAA.AC C TA GCACTA.AACA AACCATCATC TTCCTAACCT GCGATGAGGA TAATTTGGAT CGTGATTTGT TTGGTAGTAG 14251 P►~~AAACATAA AATATACCAT CATTCCTACC TGGACTTTAA CCAAGACCAA TTTTTGTATT TTATATGGTA GTAAGGATGG AC CTGAA,ATT GGTTCTGGTT 14301 CAACTTGAAA AACTATCGTT GTTTATTCAA CTATAAGAAT TTATGGCCCT GTTGAACTTT TTGATAGCAA CA.AATAAGTT GATATTCTTA AATACCGGGA 14351 CAATATTCGA AAAATCCATC CCCTACTAAA AATTATAA.AC CAAACTCTAA GTTATAAGCT TTTTAGGTAG GGGATGATTT TTAATATTTG GTTTGAGATT 14401 TTGATCTTCC AGCTCCATCT AACATCTCCA TCTGATGAA.A CTTCGGCTCA AACTAGAAGG TCGAGGTAGA TTGTAGAGGT AGACTACTTT GAAGCCGAGT 14451 CTCCTAGGAC TGTGTTTAGT AATC CA.AATT GTCACAGGAC TCTTCCTAGC GAGGATCCTG ACACAAATCA TTAGGTTTAA CAGTGTCCTG AGAAGGATCG 14501 AATACATTAC ACCGCAGATA TCACTATAGC CTTCTCCTCA GTAACCCACA TTATGTAATG TGGCGTCTAT AGTGATATCG GAAGAGGAGT CATTGGGTGT 14551 TCTGCCGTGA CGTCAATTAC GGCTGACTTA TTCGTAACAT CCATGCTAAC AGACGGCACT GCAGTTAATG CCGACTGAAT AAGCATTGTA GGTACGATTG 14601 GGAGCCTCTT TATTCTTTGT CTGCATCTAC TTCCACATTG CCCGAGGACT CCTCGGAGAA ATAAGAAACA GACGTAGATG AAGGTGTAAC GGGCTCCTGA 14651 TTATTACGGC TCCTACCTCT ACAA.AGAGAC TTGA.AATATT GGAGTTATCT AATAATGCCG AGGATGGAGA TGTTTCTCTG AACTTTATAA CCTCAATAGA 14701 TACTATTTCT ACTCATAGCC ACAGCCTTCG TAGGCTATGT CCTACCCTGA ATGATAAAGA TGAGTATCGG TGTCGGAAGC ATCCGATACA GGATGGGACT 14751 GGCCAA.ATAT CCTTCTGAGG CGCAACTGTC ATCACTAACC TCCTTTCTGC CCGGTTTATA 
GGAAGACTCC GCGTTGACAG TAGTGATTGG AGGAAAGACG 14801 TTTCCCTTAT ATTGGTGACA CATTAGTCCA ATGAATCTGA GGTGGCTTCT AAAGGGAATA TAACCACTGT GTAATCAGGT TACTTAGACT CCACCGAAGA 14851 CAGTAGACAA CGCCACCCTA ACACGATTCT TCGCATTCCA CTTTCTCCTT GTCATCTGTT GCGGTGGGAT TGTGCTAAGA AGCGTAAGGT GAAAGAGGAA 14901 CCCTTCCTAA TCACCGCATT AATAATTATC CATATTCTCT TCTTACACGA GGGAAGGATT AGTGGCGTAA TTATTAATAG GTATAAGAGA AGAATGTGCT 14951 AACAGGCTCA AACAACCCCA TGGGTCTCAA CTCCGACATA GACAAAATCT TTGTCCGAGT TTGTTGGGGT ACCCAGAGTT GAGGCTGTAT CTGTTTTAGA 15001 CCTTCCACCC CTATTTCATC TACA.AAGACG CACTAGGGTT TCTAAGTCTC GGAAGGTGGG GATAAAGTAG ATGTTTCTGC GTGATCCCAA AGATTCAGAG 15051 CTTATTCTCC TAGGAATCCT AGCCCTATTT CTCCCCAACC TCTTAGGAGA GAATAAGAGG ATCCTTAGGA TCGGGATAAA GAGGGGTTGG AGAATCCTCT 15101 TAC C GAA.AAC TTCATCCCCG CCAACCCTCT CGTCACCCCT CCACACATTA ATGGCTTTTG AAGTAGGGGC GGTTGGGAGA GCAGTGGGGA GGTGTGTAAT 15151 AACCAGAATG ATACTTCCTA TTTGCCTACG CCATCCTCCG ATCCATTCCT TTGGTCTTAC TATGAAGGAT AAACGGATGC GGTAGGAGGC TAGGTAAGGA 15201 AATA.AATTAG GAGGAGTCTT AGCCCTTCTA TTTTCCATCC TCATTCTTAT TTATTTAATC CTCCTCAGAA TCGGGAAGAT AA.AAGGTAGG AGTAAGAATA 15251 ATTAGTCCCC TTTCTTCACA CCTCCAAACA ACGTAGCAGC ACCTTCCGCC TAATCAGGGG AAAGAAGTGT GGAGGTTTGT TGCATCGTCG TGGAAGGCGG 234

153 01 CCCTTACACA AGTCTTTTTC TGAATTCTTG TAGCCAACAT ACTAGTTCTG GGGAATGTGT TCAGP.~~A,AAG ACTTAAGAAC ATCGGTTGTA TGATCAAGAC 15351 ACTTGAATCG GAGGCCAACC AGTTGAACAA CCATTTATTC TCATCGGACA TGAACTTAGC CTCCGGTTGG TCAACTTGTT GGTA.AATAAG AGTAGCCTGT 15401 AATTGCATCT ATCTCCTACT TCTCTCTGTT TCTCATTGCA ATTCCACTCG TTAACGTAGA TAGAGGATGA AGAGAGACAA AGAGTAACGT TAAGGTGAGC 15451 CAGGCTGATG AGP~AACAAA ATCCTCGGCC TTAACTAATC TTGGTAGCTT GTCCGACTAC TCTTTTGTTT TAGGAGCCGG AATTGATTAG A.ACCATCGAA 15501 AAC TTAA,AAG CGTCGGCCTT GTAAGCCGAA GACTGGAGGT TTA.AACC C TC TTGAATTTTC GCAGCCGGAA CATTCGGCTT CTGACCTCCA AATTTGGGAG 15551 CCCAAAATAC ATCAGGGGAA GGAGGGTTAA ACTCCTGCCC TTGGCTCCCA GGGTTTTATG TAGTCCCCTT CCTCCCAATT TGAGGACGGG AACCGAGGGT 15601 AAGCCAAGAT TCTGCCCAAA CTGCCCCCTG AATGCTGTCA AA.ACATGAAA TTCGGTTCTA AGACGGGTTT GACGGGGGAC TTACGACAGT TTTGTACTTT 15651 GCCAAACATC CATTTGGCCT TCP.►AAAAGTA AGTCAGTTTA ACATATTAAT CGGTTTGTAG GTAAACCGGA AGTTTTTCAT TCAGTCAAAT TGTATAATTA 157 01 GACATGACCC ACATACCTTA ATATAAAGAC ATATATCATC TCAACTACAC CTGTACTGGG TGTATGGAAT TATATTTCTG TATATAGTAG AGTTGATGTG 15751 CACATTAATT GACCTTCACC TAATGGTATT ATACTCTATG TATAATACTC GTGTAATTAA CTGGAAGTGG ATTACCATAA TATGAGATAC ATATTATGAG 15801 ATTAATTTAC ATTCCCCTAT ATCATAACAT ACTATGCTTT ATCCCCATTC TAATTAAATG TAAGGGGATA TAGTATTGTA TGATACGAAA TAGGGGTAAG 15851 ATCTACTTAC AGCAATTTCA TTACATTATA TTTTTAACCT TCATTAATTT TAGATGAATG TCGTTAAAGT AATGTAATAT P.~AAAATTGGA AGTAATTAAA 15901 AAAATCAAAA TCTTCATACC ATA.AATTTAT TTCTTCCACT TACA.AAGAC T TTTTAGTTTT AGAAGTATGG TATTTA.AATA AAGAAGGTGA ATGTTTCTGA 15951 TAAGTATATA TCATGAAAGC TAACAAGAAC ATCATATTCC ATTATCGTAA ATTCATATAT AGTACTTTCG ATTGTTCTTG TAGTATAAGG TA.ATAGCATT 16001 G TTA TTCTACCTAT GACATTACAT TCGATTAATC CTCATCAACT CTTTTTTAAT AAGATGGATA CTGTAATGTA AGCTAATTAG GAGTAGTTGA 16051 GATCAAACCT GACATTTAAT TAATGCTTGT TACACTTCAG TCCTTGATCG CTAGTTTGGA CTGTAAATTA ATTACGAACA ATGTGAAGTC AGGAACTAGC 16101 CGTCAAGAAT GCCAGTCCTC TAGTTCCCTT TAACAGCCCC CATCCTTGAT GCAGTTCTTA CGGTCAGGAG ATCAAGGGAA ATTGTCGGGG GTAGGAACTA 16151 
CGCGTCAAGA ATGCCAGCCC CCTAGTTCCC TTTAATGACA CCTTCGTCCT GCGCAGTTCT TACGGTCGGG GGATCAAGGG AAATTACTGT GGAAGCAGGA 16201 TGATCGCGTC AAGATTTATT TTCCACCCTG TTTTTTTTGG GGGGAATGAA ACTAGCGCAG TTCTAAATAA AAGGTGGGAC CC CCCCTTACTT 16251 GCCATCGCTA TTCCTCGGAG AGGCTCATCT GGGACACTAA GGTAAACCTG CGGTAGCGAT AAGGAGCCTC TCCGAGTAGA CCCTGTGATT CCATTTGGAC 163 01 TACTCCCCTC GACACTCTTC TATTATACTC ATTACTTATC ATTCATGAAT ATGAGGGGAG CTGTGAGAAG ATAATATGAG TAATGAATAG TAAGTACTTA 16351 TAAGATTGTC AAGTTGACCA AA.AC TGAAAG GGATAGAGAG ATTGACGCCA ATTCTAACAG TTCAACTGGT TTTGACTTTC CCTATCTCTC TAACTGCGGT 16401 TAATGGATAC GTTTCGATTT TTTTGATTAA AGAAGCTATG GTTTP~AAATA ATTACCTATG CA.AAGC TAAA AAAACTAATT TCTTCGATAC CAAATTTTAT 16451 GACATTTTCT TAACCCTCAT TTA.AATCTGC AGCTAGCAAT ATACGTGCGT CTGTAAAAGA ATTGGGAGTA AATTTAGACG TCGATCGTTA TATGCACGCA 16501 GTA.P~A.AAGCA TTCCATTATT TGGTACATCA ATCACTTTAT TGGACATAAT CATTTTTCGT AAGGTAATAA ACCATGTAGT TAGTGA.AATA ACCTGTATTA 16551 TTTATCTTTA TTAGGATACC CCCTGGGTTG T~~A.AA.A.TTTG GAGTAGATAA A.AATAGAAAT AATCCTATGG GGGACCCAAC ATTTTTAAAC CTCATCTATT 16601 CG AGACATTATT TGGTP►~~AAAC CCCCCTCCCC CTAATATACA TTTTTTTTGG TCTGTAATAA ACCATTTTTG GGGGGAGGGG GATTATATGT 16651 CGGATTCCTC GA.A.AAAC C C C TA.AAAC GAGA ACCGGACATA TATTTTGAAA 235

GCCTAAGGAG CTTTTTGGGG ATTTTGCTCT TGGCCTGTAT ATAAAACTTT
16701 TTAGCATGCG AAATGTATTC TGTATTTATA TTGTTACACT ATGAT
AATCGTACGC TTTACATAAG ACATAAATAT AACAATGTGA TACTA

Feature  Location                    Gene   Product
tRNA     1..71                              tRNA-Phe
rRNA     70..1024                           12S ribosomal RNA
tRNA     1025..1096                         tRNA-Val
rRNA     1097..2767                         16S ribosomal RNA
tRNA     2768..2842                         tRNA-Leu
gene     2843..3817                  ND1    NADH dehydrogenase subunit 1
tRNA     3820..3888                         tRNA-Ile
tRNA     3887..3958                         tRNA-Gln
tRNA     3960..4028                         tRNA-Met
gene     4029..5072                  ND2    NADH dehydrogenase subunit 2
tRNA     5072..5142                         tRNA-Trp
tRNA     complement(5144..5212)             tRNA-Ala
tRNA     complement(5213..5285)             tRNA-Asn
tRNA     complement(5319..5385)             tRNA-Cys
tRNA     complement(5385..5454)             tRNA-Tyr
gene     5456..7013                  CO1    cytochrome c oxidase subunit 1
tRNA     complement(7012..7082)             tRNA-Ser
tRNA     7087..7155                         tRNA-Asp
gene     7160..7850                  CO2    cytochrome c oxidase subunit 2
tRNA     7851..7924                         tRNA-Lys
gene     7926..8093                  ATP8   ATP synthase F0 subunit 8
gene     8084..8767                  ATP6   ATP synthase F0 subunit 6
gene     8767..9552                  CO3    cytochrome c oxidase subunit 3
tRNA     9555..9624                         tRNA-Gly
gene     9625..9975                  ND3    NADH dehydrogenase subunit 3
tRNA     9974..10044                        tRNA-Arg
gene     10045..10341                ND4L   NADH dehydrogenase subunit 4L
gene     10335..11715                ND4    NADH dehydrogenase subunit 4
tRNA     11716..11784                       tRNA-His
tRNA     11785..11851                       tRNA-Ser
tRNA     11852..11923                       tRNA-Leu
gene     11924..13753                ND5    NADH dehydrogenase subunit 5
gene     complement(13749..14270)    ND6    NADH dehydrogenase subunit 6
tRNA     complement(14271..14340)           tRNA-Glu
gene     14343..15488                CYTB   cytochrome b
tRNA     15488..15559                       tRNA-Thr
tRNA     complement(15562..15630)           tRNA-Pro
D-Loop   15631..16745
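The coordinates in the feature table above are 1-based and inclusive, so a feature spanning a..b covers b - a + 1 bases, and adjacent genes may share bases. The following is a minimal illustrative sketch, not part of the original analysis, that computes lengths and pairwise overlaps for a few of the protein-coding genes listed. The names `Feature`, `length`, and `overlap` are hypothetical helpers introduced here for illustration only; the coordinates are copied from the table.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """A feature with 1-based, inclusive coordinates (as in the table)."""
    name: str
    start: int
    end: int


def length(f: Feature) -> int:
    """Number of bases covered by a feature."""
    return f.end - f.start + 1


def overlap(a: Feature, b: Feature) -> int:
    """Number of bases shared by two features (0 if they do not overlap)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


# A subset of the genes from the feature table, in genome order.
# Strand is ignored here: ND6 is annotated on the complement strand,
# but its coordinates are still given on the forward numbering.
FEATURES = [
    Feature("ATP8", 7926, 8093),
    Feature("ATP6", 8084, 8767),
    Feature("CO3", 8767, 9552),
    Feature("ND4L", 10045, 10341),
    Feature("ND4", 10335, 11715),
    Feature("ND5", 11924, 13753),
    Feature("ND6", 13749, 14270),
]

if __name__ == "__main__":
    for f in FEATURES:
        print(f"{f.name}: {length(f)} bp")
    # Compare each gene with the next one in the list.
    for a, b in zip(FEATURES, FEATURES[1:]):
        shared = overlap(a, b)
        if shared:
            print(f"{a.name}/{b.name} overlap: {shared} bp")
```

Under these assumptions the sketch reports, for example, that ATP8 and ATP6 share 10 positions (8084..8093) and that ND4L and ND4 share 7 positions (10335..10341), which follows directly from the listed coordinates.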

Carcharias taurus mitochondrion, complete genome

1 GCTAGTGTAG CTTAATTTAA AGTATGGCAC TGAAGATGCT AAGATGAA.AA CGATCACATC GAATTAA.ATT TCATACCGTG ACTTCTACGA TTCTACTTTT 51 ATGA.AA.ATTT TCCACAGGCA TGAAGGTTTG GTCCTAGCCT CAGTATTAAT TACTTTTAAA AGGTGTCCGT- ACTTCCA.AAC CAGGATCGGA GTCATAATTA 101 TGCAACCAAA ATTATACATG CAAGTTTCAG CACCCCTGTG AGAATGCCCT ACGTTGGTTT TAATATGTAC GTTCAAAGTC GTGGGGACAC TCTTACGGGA 151 AATCATTCTA TCAATTAATT AGGAGCAGGT ATCAGGCACA CACACGTAGC TTAGTAAGAT AGTTAATTAA TCCTCGTCCA TAGTCCGTGT GTGTGCATCG 201 CCAAGACACC TTGCTAAGCC ACACCCCCAA GGGATTTCAG CAGTAATAA.A GGTTCTGTGG AACGATTCGG TGTGGGGGTT CCCTAAAGTC GTCATTATTT 251 TATTGATCAC ATAAGCGCAA GCTTGAATCA GTTAAAGTTA ACA.AAGTTGG ATAACTAGTG TATTCGCGTT CGAACTTAGT CAATTTCAAT TGTTTCAACC 237

3 01 TAA.ATCTCGT GCCAGCCACC GCGGTTATAC GAGTAACTTA TATTAATACT ATTTAGAGCA CGGTCGGTGG C GC CA.ATATG CTCATTGAAT ATAATTATGA 351 TCCCGGCGTA AAGAGTGATT TAAGAAACAT CTACAATAAC TAAAGTTGAG AGGGCCGCAT TTCTCACTAA ATTCTTTGTA GATGTTATTG ATTTCAACTC 401 ACCTCACCAA GCTGTTACAC GCACCCATAA ATAGAAATAT CAACAACGAA TGGAGTGGTT CGACAATGTG CGTGGGTATT TATCTTTATA GTTGTTGCTT 451 AGTGACTTTA CCCTACTAGA AATCTTGATG CCACGACAGT TAGACTCCAA TCACTGAAAT GGGATGATCT TTAGAACTAC GGTGCTGTCA ATCTGAGGTT 501 ACTAGGATTA GATACCCTAC TATGTCTAAC CATAAACTTA AACAATAATT TGATCCTAAT CTATGGGATG ATACAGATTG GTATTTGAAT TTGTTATTAA 551 TACTATATTG TTCGCCAGAG AACTACAAGC GCTAGCTTAA AACCCAAAGG ATGATATAAC AAGCGGTCTC TTGATGTTCG CGATCGAATT TTGGGTTTCC 601 ACTTGGCGGT GTCCCAAACC CACCTAGAGG AGCCTGTTCT ATAACCGATA TGAACCGCCA CAGGGTTTGG GTGGATCTCC TCGGACAAGA TATTGGCTAT 651 ATCCCCGTTA AACCTCACCA CTTCTAGCCA TCCCCGTCTA TATACCGCCG TAGGGGCAAT TTGGAGTGGT GAAGATCGGT AGGGGCAGAT ATATGGCGGC 701 TCGTCAGCTC ACCCTATGAA GGCCAAAAAG TAAGCP.~~AA.A GAATCAACTC AGCAGTCGAG TGGGATACTT CCGGTTTTTC ATTCGTTTTT CTTAGTTGAG 751 CCATACGTCA GGTCGAGGTG TAGCAAATGA AGTGGATAGA AATGGGCTAC GGTATGCAGT CCAGCTCCAC ATCGTTTACT TCACCTATCT TTACCCGATG 801 ATTTTCTATA AAGAAAACAC GAATGGTA.AA C TGP~AAAATT ACCTAAAGGT TAAAAGATAT TTCTTTTGTG CTTACCATTT GACTTTTTAA TGGATTTCCA 851 GGATTTAGCA GTAAGP.~G ATTAGAGAGC TTCTCTGA.AA CTGGCTCTGG C C TAA.ATC GT CATTCTTTTC TAATCTCTCG AAGAGACTTT GACCGAGACC 901 GACGCGCACA CACCGCCCGT CACTCTCCTC TCC ATTCATTTTT CTGCGCGTGT GTGGCGGGCA GTGAGAGGAG TTTTTTTAGG TAAGTP~~AAA 951 AATT AATCATCAAG AGGAGGCAAG TCGTAACATG GTAAGTGTAC TTAATTTTTT TTAGTAGTTC TCCTCCGTTC AGCATTGTAC CATTCACATG 1001 TGGAAAGTGC ACTTGGAATC AAAATGTGGC TAAACTAGTA AAGCACCTCC ACCTTTCACG TGAACCTTAG TTTTACACCG ATTTGATCAT TTCGTGGAGG 1051 CTTACACCGA GGAA.ATGCCT GTGCAATTCA GGTCATTTTG AACATTA.AAG GAATGTGGCT CCTTTACGGA CACGTTAAGT C CAGTA.AA.AC TTGTAATTTC 1101 CTAGCCTGTA CACCTACCTT AAACCCAACC CTATTAATTA CTTCACATAC GATCGGACAT GTGGATGGAA TTTGGGTTGG GATAATTAAT GAAGTGTATG 1151 TAATCCCTAA CTA.AAACATT TTTACCTTTT TAGTATGGGC 
GACAGAACAA ATTAGGGATT GATTTTGTAA AAATGGAAAA ATCATACCCG CTGTCTTGTT 1201 AAACTCAGCG CAATAAATTA TGTACCGCAA GGGA.AAGCTG P.~AAA.AGA.AAT TTTGAGTCGC GTTATTTAAT ACATGGCGTT CCCTTTCGAC TTTTTCTTTA 12 51 GA.AACAAATA ATTA.AAGTAA CAAAAAGCAG AGATTCCACC TCGTACCTTT CTTTGTTTAT TAATTTCATT GTTTTTCGTC TCTAAGGTGG AGCATGGAAA 1301 TGCATCATGA TTTAGCTAGA A.AAAC TAGAC AAAGAGATCT TAAGCCTACC ACGTAGTACT AA.ATC GATC T TTTTGATCTG TTTCTCTAGA ATTCGGATGG 1351 TTCCCGAAAC TAAACGAGCT AC TC C GA.AGC AGCACAATTC TAGAGCCAAC AAGGGCTTTG ATTTGCTCGA TGAGGCTTCG TCGTGTTAAG ATCTCGGTTG 1401 CCGTCTCTGT GGCAAAAGAG TGGGAAGACT TCCGAGTAGC GGTGAAAAGC GGCAGAGACA CCGTTTTCTC ACCCTTCTGA AGGCTCATCG CCACTTTTCG 1451 CTATCGAGTT TAGTGATAGC TGGTTGCCCA AGAAAAGAAC TTTAATTCTG GATAGCTCAA ATCACTATCG ACCAACGGGT TCTTTTCTTG AAATTAAGAC 1501 CATTAATTTC TTCATTACCA AAAAGTCTAT CTTACCAAGG TTAA.ACATAA GTAATTAAAG AAGTAATGGT TTTTCAGATA GAATGGTTCC AATTTGTATT 1551 AAATTAATAG TTATTCAGAA GAGGTACAGC CCTTCTGAAC CAAGATACAA TTTAATTATC AATAAGTCTT CTCCATGTCG GGAAGACTTG GTTCTATGTT 1601 CTTTTGAAGG AGGGAAAATG ATCATATTTA TTAAGGTTTC CACCTCAGTG GAAA.AC TTCC TCCCTTTTAC TAGTATAAAT AATTCCAAAG GTGGAGTCAC 1651 GGCTCAAGAG CAGCCACCTG TAAAGTAAGC GTCATAGCTC CAGTTTCACG 238

CCGAGTTCTC GTCGGTGGAC ATTTCATTCG CAGTATCGAG GTCAAAGTGC 1701 AAAACCTATA ATTTAGATAT CTTCCTCATA ACCCCCTTAA CTATATTGGG TTTTGGATAT TAAATCTATA GAAGGAGTAT TGGGGGAATT GATATAACCC 1751 CTATTTTATA A.AATTATAAA AGAACTTATG C TAA.AATGAG TAATAAGAGG GATAAAATAT TTTAATATTT TCTTGAATAC GATTTTACTC ATTATTCTCC 1801 ATAAACCTCT CCAGACACAA GTGTACGTCA GAAAGAATTA AATCACTGAC TATTTGGAGA GGTCTGTGTT CACATGCAGT CTTTCTTAAT TTAGTGACTG 1851 AAT TAA.AC GA CTCCAGACTG AGGTCATCAT ACCAATATTA TCTTAACTAG TTAATTTGCT GAGGTCTGAC TCCAGTAGTA TGGTTATAAT AGAATTGATC 1901 AAAATCCTAT TATAACATTC GTTAACCCTA CACAGGAGTG TCTCAAGGAA TTTTAGGATA ATATTGTAAG CAATTGGGAT GTGTCCTCAC AGAGTTCCTT 1951 AGATTAAAAG AAAATAA.AGG AACTCGGCAA ACACA.AACTC CGCCTGTTTA GCGGACAA.AT TCTAATTTTC TTTTATTTCC TTGAGCCGTT TGTGTTTGAG 2001 CCP.~~A,AACAT CGCCTCTTGC AACACCATAA GAGGTCCCGC CTGCCCTGTG GGTTTTTGTA GCGGAGAACG TTGTGGTATT CTCCAGGGCG GACGGGACAC GTGCAA.AGGT 2051 ACAATGTTTA ACGGCCGCGG TATTTTGACC AGCGTAATCA TGTTACAAAT TGCCGGCGCC ATAAAACTGG CACGTTTCCA TCGCATTAGT 2101 CTTGTCTTTT AAATGAAGAC CCGTATGAAA GGCACCACGA GAGTTTAACT GAACAGAAA.A TTTACTTCTG GGCATACTTT CCGTGGTGCT CTCAAATTGA 2151 GTCTCTATTT TCCAATCAAT GAAATTGATC TACTCGTGCA GAAGCGAGTA CAGAGATAAA AGGTTAGTTA CTTTAACTAG ATGAGCACGT CTTCGCTCAT CATA.AATTAA 2201 TAACCACATT AGACGAGAAG ACCCTATGGA GCTTCAAACA ATTGGTGTAA TCTGCTCTTC TGGGATACCT CGAAGTTTGT GTATTTAATT 2251 CTATGTAAAT CAACCATTCC ACGGATATAA ACP~~AAATAC AATATTTTTA TTATp~~AAAT GATACATTTA GTTGGTAAGG TGCCTATATT TGTTTTTATG 2301 ACTTAACTGT TTTTGGTTGG GGTGACCAAG GGGP.~AAA.ACA AATCCCCCTT TGAATTGACA AAAACCAACC CCACTGGTTC CCCTTTTTGT TTAGGGGGAA 2351 ATCGATTGAG TATTCAAGTA C TTP~AAAATT AGAATTACAA TTCTAATTAA TAGCTAACTC ATAAGTTCAT GAATTTTTAA TCTTAATGTT AAGATTAATT 2401 TA.A.AACATTT AC C GP~~AAAT GATCCAGGAT TTCCTGATCA ATGAACCAAG ATTTTGTA.AA TGGCTTTTTA CTAGGTCCTA AAGGACTAGT TACTTGGTTC 2451 TTACCCTAGG GATAACAGCG CAATCCTTTC TCAGAGTCCC TATCGCCGAA AATGGGATCC CTATTGTCGC GTTAGGAAAG AGTCTCAGGG ATAGCGGCTT 2501 AGGGTTTACG ACCTCGATGT TGGATCAGGA CATCCTAATG ATGCAACCGT TCCCAAATGC 
TGGAGCTACA ACCTAGTCCT GTAGGATTAC TACGTTGGCA 2551 TATTAAGGGT TCGTTTGTTC AACGATTAAC AGTCCTACGT GATCTGAGTT ATAATTCCCA AGCAAACAAG TTGCTAATTG TCAGGATGCA CTAGACTCAA 2601 CAGACCGGAG AAATCCAGGT CAGTTTCTAT CTATGAATTT ATTTTTCCTA TP.~~A.AAGGAT GTCTGGCCTC TTTAGGTCCA GTCA.AAGATA GATAC TTA.AA 2651 GTACGAAAGG AC C GGP►AAAA TGGAGCCAAT ACCCTAGGCA CGCTCCATTT CATGCTTTCC TGGCCTTTTT ACCTCGGTTA TGGGATCCGT GCGAGGTAAA 2701 TCATCTATTG AAACAAACTA A.AATAGATAA G TCA ACCACTGCCC AGTAGATAAC TTTGTTTGAT TTTATCTATT CTTTTTTAGT TGGTGACGGG 2751 AAGAAAAGGG CTGTTGAGGT GGCAGAGCCT GGTAAATGCA AAAGACCTAA TTCTTTTCCC GACAACTCCA CCGTCTCGGA CCATTTACGT TTTCTGGATT 2801 GCTCTTTAAT CCAGAGGTTC AA.ATC C TC TC CTCAACCATG CTTGAAGCCC CGAGAAATTA GGTCTCCAAG TTTAGGAGAG GAGTTGGTAC GAACTTCGGG 2851 TTCTACTTTA CCTAATTAAT CCACTTGCCT ATATTATTCC CATCCTATTG AAGATGAAAT GGATTAATTA GGTGAACGGA TATAATAAGG GTAGGATAAC 2901 GCCACGGCCT TCCTTACCCT AGTTGAACGA AAAATTCTCA GTTCCATACA CGGTGCCGGA AGGAATGGGA TCAACTTGCT TTTTAAGAGT CA.AGGTATGT 2951 ACTCCGCAAA GGTCCCAACA TCGTAGGCCC CTACGGCCTC CTTCAACCCA TGAGGCGTTT CCAGGGTTGT AGCATCCGGG GATGCCGGAG GAAGTTGGGT 3001 TTGCAGACGG CCT~TTA TTTATTAAAG AACCCGTTTA CCCATCAACA AACGTCTGCC GGATTTTAAT AAATAATTTC TTGGGCAAAT GGGTAGTTGT 239

3051 TCCTCCCCAT TTCTATTCCT CGCCACACCC ACAATAGCCC TGACACTAGC AGGAGGGGTA AAGATAAGGA GCGGTGTGGG TGTTATCGGG ACTGTGATCG 3101 CCTCCTCATA TGAATACCCC TCCCTCTCCC CTACTCCATT ATTAACCTCA GGAGGAGTAT ACTTATGGGG AGGGAGAGGG GATGAGGTAA TAATTGGAGT 3151 ATCTAGGCCT ATTATTTATT CTAGCGATCT CAAGCCTGAC TGTCTACACT TAGATCCGGA TAATAAATAA GATCGCTAGA GTTCGGACTG ACAGATGTGA 3201 ATTTTAGGCT CCGGATGGGC ATCTAATTCA AAATACGCTT TAATAGGGGC TAA.AATC C GA GGCCTACCCG TAGATTAAGT TTTATGCGAA ATTATCCCCG 3251 CCTGCGAGCT GTAGCACAAA CAATCTCTTA TGAAGTAAGC CTAGGATTAA GGACGCTCGA CATCGTGTTT GTTAGAGAAT ACTTCATTCG GATCCTAATT 3301 TCCTCCTGTC AATAATTATC TTCGCAGGAG GTTTCACCCT CCACACCTTT AGGAGGACAG TTATTAATAG AAGCGTCCTC CAAAGTGGGA GGTGTGGAAA 3351 AATCTGGCAC AAGAA.ACAAT TTGATTACTC ATCCCAGGAT GACCTCTAGC TTAGACCGTG TTCTTTGTTA AACTAATGAG TAGGGTCCTA CTGGAGATCG 3401 CCTAATATGA TACGTATCTA CCCTGGCAGA GACTAACCGA ATCCCTTTCG GGATTATACT ATGCATAGAT GGGACCGTCT CTGATTGGCT TAGGGAAAGC 3451 ACCTAACAGA AGGGGAATCA GAACTAGTCT CAGGTTTTAA CATCGAATAC TGGATTGTCT TCCCCTTAGT CTTGATCAGA GTCCAAAATT GTAGCTTATG 3501 GCCGGGGGCT CCTTCGCCCT ATTCTTTCTC GCTGAATATA CAAACATCCT CGGCCCCCGA GGAAGC GGGA TAAGAAAGAG CGACTTATAT GTTTGTAGGA 3551 ATTAATAAAT ACCCTCTCAG TCATTCTATT CATAGGTTCC TCTTACAACC TAATTATTTA TGGGAGAGTC AGTAAGATAA GTATCCAAGG AGAATGTTGG 3601 CCCTCCTCCC AGAAATTTCA ACACTCAACC TGATAATAA.A AGCAACCTTA GGGAGGAGGG TCTTTA.AAGT TGTGAGTTGG ACTATTATTT TCGTTGGAAT 3651 CTAACCCTAT TCTTCTTATG AATTCGAGCA TCATACCCCC GCTTCCGTTA GATTGGGATA AGAAGAATAC TTAAGCTCGT AGTATGGGGG CGAAGGCAAT 3701 TGACCAACTC ATACACCTAG TATG~~AAAAA TTTCCTACCC TTAACCCTAG ACTGGTTGAG TATGTGGATC ATACTTTTTT AA.AGGATGGG AATTGGGATC 3751 CAATTATACT ATGACACATA GCCCTTCCCA TAACTACAGC AAGCCTACCT GTTAATATGA TACTGTGTAT CGGGAAGGGT ATTGATGTCG TTCGGATGGA 3801 CCCCTAACCT AAACGGAAGC GTGCCTGAAC A.AAGGAC CAC TTTGATAGAG GGGGATTGGA TTTGCCTTCG CACGGACTTG TTTCCTGGTG AAACTATCTC 3851 TGGACAATGA AAGTTAAAAT CTTTCCTCTT C C TAGP~~AAA TAGGATTCGA ACCTGTTACT TTCAATTTTA GAAAGGAGAA GGATCTTTTT ATCCTAAGCT 3901 ACCCATACCT AAGAGATCAA 
AACTCTTCGT GCTTCCAATT ATACTATCTT TGGGTATGGA TTCTCTAGTT TTGAGAAGCA CGAAGGTTAA TATGATAGAA 3951 CTAAGTAAAG TCAGCTAATA AAGCTCTTGG GCCCATACCC CAACCATGTT GATTCATTTC AGTCGATTAT TTCGAGAACC CGGGTATGGG GTTGGTACAA 4001 GGTTA.A.AATC CTTCCTTTAC TAATGAACCC AATTGCATTA ACTATTATCA CCAATTTTAG GAAGGAAATG ATTACTTGGG TTAACGTAAT TGATAATAGT 4051 TTTCAAGCCT AGGCCTAGGG ACTATCCTAA CATTTATTGG TTCACACTGA A.AAGTTC GGA TCCGGATCCC TGATAGGATT GTA.AATAAC C AAGTGTGACT 4101 CTCCTAGTCT GAATAGGCCT CGAAATCAAC ACTTTAGCCA TTATCCCCTT GAGGATCAGA CTTATCCGGA GCTTTAGTTG TGAAATCGGT AATAGGGGAA 4151 AATAATCCAC CAACACCATC CCCGGGCAGT AGAAGCTTCC ACAAA.ATAC T TTATTAGGTG GTTGTGGTAG GGGCCCGTCA TCTTCGAAGG TGTTTTATGA 4201 TCATTACACA AGCTACTGCC TCAGCCTTAC TCTTATTTGC TAGCATTACA AGTAATGTGT TCGATGACGG AGTCGGAATG AGAATA.AAC G ATCGTAATGT 4251 AACGCTTGAA CTTCAGGCGA ATGGAGTTTA ATTGAAATAA TTAACCCAAG TTGCGAACTT GAAGTCCGCT TACCTCAAAT TAACTTTATT AATTGGGTTC 4301 CTCTGCCACA CTAGTCACAA TCGCATTAGC AC TP~~AAATT GGCTTAGCCC GAGACGGTGT GATCAGTGTT AGCGTAATCG TGATTTTTAA CCGAATCGGG 4351 CCCTCCATTT CTGATTACCC GAAGTCCTTC AAGGAC TA.AA TCTTACTACA GGGAGGTA.AA GACTAATGGG CTTCAGGAAG TTCCTGATTT AGAATGATGT 4401 GGCCTTATCC TCTCTACCTG ACP.►~~A.AACTT GCCCCATTCG CTATTCTCTT 240

CCGGAATAGG AGAGATGGAC TGTTTTTGAA CGGGGTAAGC GATAAGAGAA 4451 ACAACTTTAC CCTTCACTAA ACCCTAACCT ACTAGTATTC CTTGGGGTTC TGTTGAAATG GGAAGTGATT TGGGATTGGA TGATCATAAG GAACCCCAAG 4501 TTTCAACAAT AGTAGGAGCC TGAGGAGGCT TAAACCAAAC CCAACTACGA AAAGTTGTTA TCATCCTCGG ACTCCTCCGA ATTTGGTTTG GGTTGATGCT 4551 A,AAATCCTAG CCTACTCATC AATCGCCCAC CTTGGCTGAA TAATTACAAT TTTTAGGATC GGATGAGTAG TTAGCGGGTG GAACCGACTT ATTAATGTTA 4601 CCTACATTAT TCCCATAATT TAACTCAACT TAATTTATTT CTTTACATTG GGATGTAATA AGGGTATTAA ATTGAGTTGA ATTA.AATAAA GAAATGTAAC 4651 TCATAACATC AACAACCTTC CTTTTATTCA AGACATTTAA CTCCACCAAA AGTATTGTAG TTGTTGGAAG GA.A.AATAAGT TCTGTAAATT GAGGTGGTTT 4701 ATTAACTCCA TCTCCTCCTC CTCATCA,AA.A TCCCCCCTAC TATCTATTAT TAATTGAGGT AGAGGAGGAG GAGTAGTTTT AGGGGGGATG ATAGATAATA 4751 TGCTCTCATA ACCCTCCTCT CTCTCGGAGG CCTACCTCCA CTCTCAGGCT ACGAGAGTAT TGGGAGGAGA GAGAGCCTCC GGATGGAGGT GAGAGTCCGA 4801 TTATACCAAA ATGATTAATT TTACAAGAAC TAAC~CA AAACCTAGCC AATATGGTTT TACTAATTAA AATGTTCTTG ATTGTTTTGT TTTGGATCGG 4851 ATTCCAGCTA CTATCATAGC CATAACTACC CTCCTCAGCC TATTCTTCTA TAAGGTCGAT GATAGTATCG GTATTGATGG GAGGAGTCGG ATAAGAAGAT 4901 TCTACGTCTC TGCTATGCTA CAACATTAAC TATAACCCCA ACCCCAATCA AGATGCAGAG ACGATACGAT GTTGTAATTG ATATTGGGGT TGGGGTTAGT 4951 ATATACTAAC ATCATGACGA ACCAAACTAT CCCACAACCT AATCTTAACA TATATGATTG TAGTACTGCT TGGTTTGATA GGGTGTTGGA TTAGAATTGT 5001 ACAACCACCT CACTATCCAT CCTTCTTCTT CCAATCACCC CAGCCATCCT TGTTGGTGGA GTGATAGGTA GGAAGAAGAA GGTTAGTGGG GTCGGTAGGA 5051 CATATTATTA TCTTAAGAAA TTTAGGTTAA CTAGACCAAA AGCCTTCAA.A GTATAATAAT AGAATTCTTT AAATCCAATT GATCTGGTTT TCGGAAGTTT 5101 GCTTTAAGTA GAAGTGA,AA.A CCTCCTAATT TC TGC TAA.AA TTTGCGAGAC CGAAATTCAT CTTCACTTTT GGAGGATTAA AGACGATTTT AAACGCTCTG 5151 TTTACCTCAC ATCTTCTGAA TGCAACCCAG ATGCTTTCAT TAAGC TAAA.A AAATGGAGTG TAGAAGACTT ACGTTGGGTC TACGAAAGTA ATTCGATTTT 5201 TCTTCTAGAC A.AATAGGC C T CGATCCTACA AAATCTTAGT TAACAGCTAA AGAAGATCTG TTTATCCGGA GCTAGGATGT TTTAGAATCA ATTGTCGATT 5251 GCGTTCAATC CAGCGAACTT TTATCTACTT TCTCCCGCCG TAAGAACA.AA CGCAAGTTAG GTC GC TTGAA 
AATAGATGAA AGAGGGCGGC ATTCTTGTTT 5301 AGGCGGGAGA AAGTCCCGGG AGAAATTAAC CTCCATTTTT GGATTTGCAA TCCGCCCTCT TTCAGGGCCC TCTTTAATTG GAGGTP►~~AAA CCTAAACGTT 5351 TCCAACGTAA ACATTTACTG CAGAACTATG GCAAGAAGAG GAATTTGACC AGGTTGCATT TGTAAATGAC GTCTTGATAC CGTTCTTCTC CTTAAACTGG 5401 TCTGTATACG GAGCTACAAT CCGCCACTTA GTTCTCAGTC ACCTTACCTG AGACATATGC CTCGATGTTA GGCGGTGAAT CAAGAGTCAG TGGAATGGAC 5451 TGGCAATTAA TCGTTGACTA TTTTCTACAA ACCACAAAGA CATTGGCACC ACCGTTAATT AGCAACTGAT A.AAAGATGTT TGGTGTTTCT GTAACCGTGG 5501 CTATACTTAA TCTTTGGTGC ATGGGCAGGA ATAGTAGGAA CAGCCCTAAG GATATGAATT AGAAACCACG TACCCGTCCT TATCATCCTT GTCGGGATTC 5551 CCTTCTAATT CGAGCTGAAC TAGGACGACC CGGATCACTC CTAGGAGATG GGAAGATTAA GCTCGACTTG ATCCTGCTGG GCCTAGTGAG GATCCTCTAC 5601 ATCAGATTTA TAATGTTATT GTAACCGCCC ATGCATTTGT AATAATTTTC TAGTCTAAAT ATTACAATAA CATTGGCGGG TAC GTAA.ACA TTATTA.A.AAG 5651 TTCATGGTCA TACCTGTAAT AATTGGTGGG TTCGGGAACT GACTAGTGCC AAGTACCAGT ATGGACATTA TTAACCACCC AAGCCCTTGA CTGATCACGG 5701 CTTAATAATT GGTGCACCAG ACATGGCCTT CCCCCGAATA AACAATATAA GAATTATTAA CCACGTGGTC TGTACCGGAA GGGGGCTTAT TTGTTATATT 5751 GCTTTTGACT TCTTCCCCCC TCTTTTCTTT TACTCCTAGC TTCAGCCGGA CGA,AAACTGA AGAAGGGGGG AGP~AAAGAAA ATGAGGATCG AAGTCGGCCT 241

5801 GTCGAAGCTG GAGCCGGCAC CGGTTGGACA GTGTATCCTC CTTTAGCCGG CAGCTTCGAC CTCGGCCGTG GCCAACCTGT CACATAGGAG GAAATCGGCC 5851 TAACTTAGCC CATGCCGGAG CATCCGTTGA CTTAGCTATC TTTTCTCTTC ATTGAATCGG GTACGGCCTC GTAGGCAACT GAATCGATAG A.AAAGAGAAG 5901 ATTTAGCAGG TATTTCATCA ATCTTAGCCT CAATCAACTT CATTACAACC TAAATCGTCC ATAAAGTAGT TAGAATCGGA GTTAGTTGAA GTAATGTTGG 5951 ATTATTAACA TAAA.ACCCCC AGCTATCTCT CAATATCAAA CACCATTATT TAATAATTGT ATTTTGGGGG TCGATAGAGA GTTATAGTTT GTGGTAATAA 6001 TGTATGATCA ATTTTAGTAA CAACTATCCT CCTCCTCCTG TCCCTTCCAG ACATACTAGT TP~AAATCATT GTTGATAGGA GGAGGAGGAC AGGGAAGGTC 6051 TACTTGCAGC TGGCATCACT ATACTTCTTA CGGACCGAAA CTTAAACACA ATGAACGTCG ACCGTAGTGA TATGAAGAAT GCCTGGCTTT GAATTTGTGT 6101 ACATTCTTTG ACCCAGCTGG GGGAGGAGAT CCAATCCTCT ATCAACATCT TGTAAGAAAC TGGGTCGACC CCCTCCTCTA GGTTAGGAGA TAGTTGTAGA 6151 ATTTTGATTC TTTGGTCACC CAGAAGTGTA CATTTTGATT CTTCCTGGTT TAAAACTAAG AAACCAGTGG GTCTTCACAT GTAAAACTAA GAAGGACCAA 6201 TTGGAATAAT TTCCCATGTA GTAGCCTACT ATTCTGGTAA AAAAGAACCA AACCTTATTA AAGGGTACAT CATCGGATGA TAAGACCATT TTTTCTTGGT 6251 TTCCGCTACA TAGGAATAGT CTGAGCAATA ATAGCAATTG GTCTACTAGG AAGGCGATGT ATCCTTATCA GACTCGTTAT TATCGTTAAC CAGATGATCC 6301 CTTTATTGTT TGAGCCCACC ATATATTTAC AGTAGGTATG GACGTTGACA GAAATAACAA ACTCGGGTGG TATATAAATG TCATCCATAC CTGCAACTGT 6351 CACGAGCCTA TTTTACTTCA GCAACAATAA TTATCGCCAT CCCTACAGGT GTGCTCGGAT ~TGAAGT CGTTGTTATT AATAGCGGTA GGGATGTCCA 6401 GTAAAAGTAT TTAGTTGATT AGCAACCCTT CATGGAGGTT CTGTTAAATG CATTTTCATA AATCAACTAA TCGTTGGGAA GTACCTCCAA GACAATTTAC 6451 AGAGACCCCA TTGTTATGAG CTCTCGGCTT CATTTTTTTA TTTACAGTAG TCTCTGGGGT AACAATACTC GAGAGCCGAA GT T AAATGTCATC 6501 GAGGACTTAC AGGCATCGTC CTAGCCAATT CTTCCCTAGA TATTGTTCTC CTCCTGAATG TCCGTAGCAG GATCGGTTAA GAAGGGATCT ATAACAAGAG 6551 CACGACACTT ATTATGTAGT AGCCCATTTC CACTATGTTC TTTCAATAGG GTGCTGTGAA TAATACATCA TCGGGTAAAG GTGATACAAG AAAGTTATCC 6601 AGCAGTATTT GCCATTATAG CAGGTTTCAT CCACTGATTC CCTCTAATAT TCGTCATA.AA CGGTAATATC GTCCAAAGTA GGTGACTAAG GGAGATTATA 6651 CTGGTTTTAC CCTCCATTCA ACATGAACAA AAATCCAATT 
TGCAGTTATA GACCAAAATG GGAGGTAAGT TGTACTTGTT TTTAGGTTAA ACGTCAATAT 6701 TTCATTGGAG TAA.AC TTAAC ATTCTTTCCA CAACATTTCT TAGGCCTTGC AAGTAACCTC ATTTGAATTG TAAGAAAGGT GTTGTAA.AGA ATCCGGAACG 6751 TGGCATACCA CGACGATACT CAGACTACCC AGACGCATAC ACTTTATGAA ACCGTATGGT GCTGCTATGA GTCTGATGGG TCTGCGTATG TGAAATACTT 6801 ATGCAGTCTC CTCTATCGGC TCTTTAATTT CACTTGTAGC AGTAATTATA TACGTCAGAG GAGATAGCCG AGAAATTAAA GTGAACATCG TCATTAATAT 6851 CTCCTATTTA TTATCTGAGA AGCATTTGCC TCAAA.AC GAG AAGTATTATC GAGGATAAAT AATAGACTCT TCGTA.AACGG AGTTTTGCTC TTCATAATAG 6901 ACTTGAACTT CCCCACACAA ACGTTGAATG ACTTCACGGC TGTCCTCCAC TGAACTTGAA GGGGTGTGTT TGCAACTTAC TGAAGTGCCG ACAGGAGGTG 6951 CATATCACAC GTATGAAGAA CCAGCATTTG TTCAAATTCA ACGAACTTTT GTATAGTGTG CATACTTCTT GGTC GTAA.AC AAGTTTAAGT TGCTTGAAAA 7001 TA,AAACAAGA AAGGAAGGAA TTGAACCCTC ATATGTTAGT TTCAAGCCAA ATTTTGTTCT TTCCTTCCTT AACTTGGGAG TATACAATCA AAGTTCGGTT 7051 CCACATTACC ACTCTGCCAC TTTCTTTATT AAGGTTCTAG T~~AAACATAT GGTGTAATGG TGAGACGGTG AAAGAAATAA TTCCAAGATC ATTTTGTATA 7101 TACACTGCCT TGTCAAGACA AAATTGTGGG TTA,AA.ATCCC ACGAACCTTA ATGTGACGGA ACAGTTCTGT TTTAACACCC AATTTTAGGG TGCTTGGAAT 7151 ACTTATAATG GCACACCCCT CACAATTAGG ATTCCAAGAC GCAGCCTCCC 242

TGAATATTAC CGTGTGGGGA GTGTTAATCC TAAGGTTCTG CGTCGGAGGG 7201 CCGTTATAGA AGAACTTATC CATTTTCACG ACCACACACT AATAATTGTA GGCAATATCT TCTTGAATAG GTA►AAAGTGC TGGTGTGTGA TTATTAACAT 7251 TTTCTAATCA GCACCCTAGT TCTTTACATT ATTACAGCAA TAGTATCAAC AA.AGATTAGT CGTGGGATCA AGA.AATGTAA TAATGTCGTT ATCATAGTTG 7301 AAA.ACTTACA AACAAATATA TTCTTGATTC TCAAGAAATT GAAATTGTCT TTTTGAATGT TTGTTTATAT AAGAACTAAG AGTTCTTTAA CTTTAACAGA 7351 GAACTATTCT CCCTGCCATC ATCCTTATCA TAATTGCCCT CCCATCCCTA CTTGATAAGA GGGACGGTAG TAGGAATAGT ATTAACGGGA GGGTAGGGAT 7401 CGAATTTTAT ATCTTATAGA CGAAATTAAT GATCCCCATT TAACCATTAA GC TTP.~PsAATA TAGAATATCT GCTTTAATTA CTAGGGGTAA ATTGGTAATT 7451 AGCTATAGGT CATCAATGAT ACTGAAGTTA TGAGTATACA GATTATGAGA TCGATATCCA GTAGTTACTA TGACTTCAAT ACTCATATGT CTAATACTCT 7501 ACCTAGGCTT CGATTCTTAT ATAATCCAAA CCCAAGACTT GACCCCCGGC TGGATCCGAA GCTAAGAATA TATTAGGTTT GGGTTCTGAA CTGGGGGCCG 7551 CAATTCCGTT TATTAGA.AAC AGATCACCGA ATAGTAGTAC CCATAGAGTC GTTAAGGCAA ATAATCTTTG TCTAGTGGCT TATCATCATG GGTATCTCAG 7601 ACCTGTTCGC ATGCTAGTAT CTGCAGAAGA CGTCTTACAT TCATGAGCTG TGGACAAGCG TACGATCATA GACGTCTTCT GCAGAATGTA AGTACTCGAC 7651 TACCAACCTT AGGAATTAA.A ATAGATGCTG TTCCAGGCCG CTTAAATCAA ATGGTTGGAA TCCTTAATTT TATCTACGAC AAGGTCCGGC GAATTTAGTT 7701 ACTGCTTTCA TTATTTCCCG ACCAGGTGTC TATTATGGTC AATGCTCAGA TGACGAAAGT AATAAAGGGC TGGTCCACAG ATAATACCAG TTACGAGTCT 7751 AATTTGTGGT GCTAACCACA GTTTCATGCC TATCGTAGTA GAAACAGTCC TTAAACACCA CGATTGGTGT CA.AAGTACGG ATAGCATCAT CTTTGTCAGG 7801 CCCTAAAACA CTTCGAAGCC TGATCTTCAT TAATACTAGA AGAAACCTCA GGGATTTTGT GAAGCTTCGG ACTAGAAGTA ATTATGATCT TCTTTGGAGT 7851 CTAAGAAGCT AAATTGGGCC TAGCATTAGC CTTTTAAGCT P~AAA.ATTGGT GATTCTTCGA TTTAACCCGG ATCGTAATCG GAA.AATTC GA TTTTTAACCA 7901 GACTCCCTAC CACCTTTAGT GACATGCCTC AGTTA.AATCC CCACCCTTGA CTGAGGGATG GTGGAAATCA CTGTACGGAG TCAATTTAGG GGTGGGAACT 7951 TTCATTATCC TCCTATTTTC ATGAATGATT TTCCTTATTA TTTTACCA.AA AAGTAATAGG AGGATAAAAG TACTTACTAA AAGGAATAAT AAAATGGTTT 8001 AAAAGTAATA ACYCACATAT TCAACAATAA CCCAATATTA P~~AAATATC G TTTTCATTAT TGYGTGTATA 
AGTTGTTATT GGGTTATAAT TTTTTATAGC 8051 AGAAGCCTAA ACCCGAGCCC TGAAACTGAC CATGATCATA AGCTTTTTTG TCTTCGGATT TGGGCTCGGG ACTTTGACTG GTACTAGTAT TCG C 8101 ACCAATTCCT AAGTCCCTCC CTTCTTGGAA TTCCACTAAT CGCCCTAGCA TGGTTAAGGA TTCAGGGAGG GAAGAACCTT AAGGTGATTA GCGGGATCGT 8151 ATTGTGTTAC CATGATTAAC CTTCCCAACT CCAACCAACC GATGACTTAA TAACACAATG GTACTAATTG GAAGGGTTGA GGTTGGTTGG CTACTGAATT 8201 TAATCGACTA ATAAACCTCC AAAATTGGTT TATTAATCGA TTTATCTATC ATTAGCTGAT TATTTGGAGG TTTTAACCAA ATAATTAGCT AAATAGATAG 8251 AACTTCTACA CCCCATTAAC TTTACCGGCC ATAAATGAGC TGTACTGTTT TTGAAGATGT GGGGTAATTG AAATGGCCGG TATTTACTCG ACATGACAAA 8301 ACAGCACTAA TATTATTCTT AATTACCAGC AACCTATTAG GACTTCTCCC TGTCGTGATT ATAATAAGAA TTAATGGTCG TTGGATAATC CTGAAGAGGG 8351 TTATACCTTT ACACCTACAA CTCAACTCTC CCTTAATATA GCACTTGCTC AATATGGAAA TGTGGATGTT GAGTTGAGAG GGAATTATAT CGTGAACGAG 8401 TACCCTTATG ACTCATAACC GTACTAATCG GAATACTTAA TAAACCAACA ATGGGAATAC TGAGTATTGG CATGATTAGC CTTATGAATT ATTTGGTTGT 8451 ATTGCACTAG GACATTTCCT ACCAGAAGGT ACCCCCACCC CCCTAGTACC TAACGTGATC CTGTAAAGGA TGGTCTTCCA TGGGGGTGGG GGGATCATGG 8501 CATCCTAATT ATTATCGA.AA CTATTAGTCT ATTTATTCGA CCATTAGCAT GTAGGATTAA TAATAGCTTT GATAATCAGA TAA.ATAAGC T GGTAATCGTA 243

8551 TAGGAGTTCG ACTAACCGCT AATTTAACAG CTGGCCACCT ATTAATACAA ATCCTCAAGC TGATTGGCGA TTAAATTGTC GACCGGTGGA TAATTATGTT 8601 TTAATTGCAA CTGCAACCTT TGTCCTCATT ACTATTATGC CAACCGTGGC AATTAACGTT GACGTTGGAA ACAGGAGTAA TGATAATACG GTTGGCACCG 8651 ACTACTCACA TCAATTATCC TATTTCTACT AACAATTTTA GAA.ATC GC TG TGATGAGTGT AGTTAATAGG ATAAAGATGA TTGTTAAAAT CTTTAGCGAC 8701 TAGCAATAAT TCAAGCATAC GTATTTGTTC TCCTACTAAG CCTTTACTTA ATCGTTATTA AGTTCGTATG CATAAACAAG AGGATGATTC GGAAATGAAT 8751 CAAGAAAACG TCTAATGGCT CACCAAGCAC ATGCATATCA TATAGTTGAC GTTCTTTTGC AGATTACCGA GTGGTTCGTG TACGTATAGT ATATCAACTG 8801 CCCAGCCCAT GACCATTAAC TGGAGCTACA GCCGCCCTCC TAATAACATC GGGTCGGGTA CTGGTA.ATTG ACCTCGATGT CGGCGGGAGG ATTATTGTAG 8851 CGGATTAGCC ATCTGATTTC ATTACCACTC ATTATCTCTC CTCTATCTAG GCCTAATCGG TAGAC TA.AAG TAATGGTGAG TAATAGAGAG GAGATAGATC 8901 GATTAATCCT CCTACTTTTA ACTATAATCC AATGATGACG TGACATTATC CTAATTAGGA GGATGAAA.AT TGATATTAGG TTACTACTGC ACTGTAATAG 8951 CGAGAAGGAA CATTTCAAGG CCATCACACA CCTCCCGTCC AAAA.AGGTC T GCTCTTCCTT GTAAAGTTCC GGTAGTGTGT GGAGGGCAGG TTTTTCCAGA 9001 TCGCTATGGA ATAATCTTAT TTATCACATC AGAAGTATTC TTCTTTTTAG AGCGATACCT TATTAGAATA AATAGTGTAG TCTTCATAAG AAGp~~A.AATC 9051 GTTTCTTCTG AGCCTTTTAC CATTCAAGTC TAGCACCAAC CCCCGAGCTA CAAAGAAGAC TC GGA.A.AATG GTAAGTTCAG ATCGTGGTTG GGGGCTCGAT 9101 GGAGGGTGTT GACCACCAAC AGGAATCAAC CCATTAGACC CATTTGAAGT CCTCCCACAA CTGGTGGTTG TCCTTAGTTG GGTAATCTGG GTAAAC TTCA 9151 ACCACTTCTG AACACCGCCG TACTTTTGGC CTCTGGCGTA ACAGTAACCT TGGTGAAGAC TTGTGGCGGC ATGAAAACCG GAGACCGCAT TGTCATTGGA 9201 GAAC TCAC CA CAGCTTAATA GAAGGTAACC GAAAAGAAGC TATCCAAGCC CTTGAGTGGT GTCGAATTAT CTTCCATTGG CTTTTCTTCG ATAGGTTCGG 9251 CTTGCTATCA CTATCATTTT AGGTTTTTAC TTCACAGCCC TCCAAGCTAT GAACGATAGT GATAGTAAA.A TC CP.►AAAATG AAGTGTCGGG AGGTTCGATA 9301 AGAATATTAC GAAGCACCCT TCACAATCGC CGATGGAATT TATGGAACAA TCTTATAATG CTTCGTGGGA AGTGTTAGCG GCTACCTTAA ATACCTTGTT 9351 CATTCTTCGT TGCCACAGGA TTCCACGGCC TCCATGTTAT TATCGGCTCA GTAAGAAGCA ACGGTGTCCT AAGGTGCCGG AGGTACAATA ATAGCCGAGT 9401 ACATTTTTAG 
CAATTTGTCT ACTACGACAA ATTCAATATC ACTTTACATC TGTP►A.AAATC GTTA.AACAGA TGATGCTGTT TAAGTTATAG TGAAATGTAG 9451 AGAACATCAC TTTGGTTTCG AAGCTGCTGC ATGATATTGA CATTTCGTTG TCTTGTAGTG AAACCAAAGC TTCGACGACG TACTATAACT GTAAAGCAAC 9501 ACGTAGTATG ACTATTCCTT TATGTATCCA TTTATTGATG AGGCTCATAA TGCATCATAC TGATAAGGAA ATACATAGGT AA.ATAAC TAC TCCGAGTATT 9551 TTACTTTTCT AGTATAAACT AGTACAAATG ATTTCCAATC ATTTAATCTT AATGA.AAAGA TCATATTTGA TCATGTTTAC TAAAGGTTAG TAAATTAGAA 9601 GGCTAAAATC CAAGGAAAAG TAATGAGCCT CATCATGTCT TCCATCGCAG CCGATTTTAG GTTCCTTTTC ATTACTCGGA GTAGTACAGA AGGTAGCGTC 9651 CTACGGCCCT GGTTTCCCTA ATCCTTGTAT TTATTGCATT CTGACTCCCA GATGCCGGGA CCAAAGGGAT TAGGAACATA AATAACGTAA GACTGAGGGT 9701 TCACTAAACC CAGATAATGA A.A.AAC TATC C CCATACGAAT GCGGCTTTGA AGTGATTTGG GTCTATTACT TTTTGATAGG GGTATGCTTA CGCCGAAACT 9751 CCCCCTAGGA AATGCACGCC TCCCATTTTC TCTACGCTTT TTCCTTGTAG GGGGGATCCT TTACGTGCGG AGGGTA.AAAG AGATGCGAAA A.AGGAACATC 9801 CTATCCTATT CCTTCTATTT GACCTAGAAA TTGCCCTCCT TCTCCCTTTA GATAGGATAA GGAAGATAAA CTGGATCTTT AACGGGAGGA AGAGGGAAAT 9851 CCATGAGGTA ATCAATTATT ATCACCACTC CTCGCATTAT TCTGAGCAAC GGTACTCCAT TAGTTAATAA TAGTGGTGAG GAGCGTAATA AGACTCGTTG 9901 AATTATCCTA ATCTTACTAA CCTCAGGCCT CATCTATGAA TGATTTCAAG 244

TTAATAGGAT TAGAATGATT GGAGTCCGGA GTAGATACTT ACTAAAGTTC 9951 GGGGCCTAGA ATGAGCAGAA TGAATATTTA GTCTAAATAA AGACCACTAA CCCCGGATCT TACTCGTCTT ACTTATAA.AT CAGATTTATT TCTGGTGATT 10001 TTTCGGCTTA GTAAATTATG GTGAAAATCC ATAAATATCC TATGTCTCCC AAAGCCGAAT CATTTAATAC CACTTTTAGG TATTTATAGG ATACAGAGGG 10051 ATACATTTTA TCCTTAATTC AACATTCATC CTAGGACTCA TAGGTCTCGC TATGTAAAAT AGGAATTAAG TTGTAAGTAG GATCCTGAGT ATCCAGAGCG 10101 ACTCAACCGC TATCACCTTT TATCCGCACT CTTATGTTTA GAAAGTATAC TGAGTTGGCG ATAGTGGA.AA ATAGGCGTGA GAATACAAAT CTTTCATATG 10151 TACTAACCCT ATTTATTTCC ATCGCTATTT GAACTCTAAC ACTA.AAC T C C ATGATTGGGA TAA.ATAAAGG TAGCGATA.AA CTTGAGATTG TGATTTGAGG 10201 ACCTCATGCG CAATTACTCC CATAATTCTC CTTACATTCT CAGCTTGCGA TGGAGTACGC GTTAATGAGG GTATTAAGAG GAATGTAAGA GTCGAACGCT 10251 AGCTAGTACA GGCCTAGCCA TCCTAGTAGC CACCTCACGA TCCCATGGCT TCGATCATGT CCGGATCGGT AGGATCATCG GTGGAGTGCT AGGGTACCGA 10301 CTGATAACCT ACAAA.ACCTA AACCTCCTTC AATGCTAAAA ATCCTAGTTC GACTATTGGA TGTTTTGGAT TTGGAGGAAG TTACGATTTT TAGGATCAAG 10351 CAACAATCAT ACTCTTCCCA ACCACGTGAA TTACTAATAA A►AA,ATGACTA GTTGTTAGTA TGAGAAGGGT TGGTGCACTT AATGATTATT TTTTACTGAT 10401 TGACCTGTAA TCACTACCCA TAGTCTCCTG ATTGCACTAC TAAGCCTTCT ACTGGACATT AGTGATGGGT ATCAGAGGAC TAACGTGATG ATTCGGAAGA 10451 CTGATTTAAA TGAAATACAG ATACTGGCTG AGACTTTTCC AACCAATTTA GACTAAATTT ACTTTATGTC TATGACCGAC TCTGAAAAGG TTGGTTAAAT 10501 TAGCCATTGA CCCCTTATCC GCCCCCTTAC TCATCCTTAC ATGCTGACTT ATCGGTAACT GGGGAATAGG CGGGGGAATG AGTAGGAATG TACGACTGAA 10551 CTTCCACTAA TAATCTTAGC TAGCCA.AAAT CACATTTCCC CAGAACCAAT GAAGGTGATT ATTAGAATCG ATCGGTTTTA GTGTAAAGGG GTCTTGGTTA 10601 CATTCGACAA CGAACATACA TTACACTCCT AATTTTCCTC CAAACTTTTC GTAAGCTGTT GCTTGTATGT AATGTGAGGA TTAAA.AGGAG GTTTGAAAAG 10651 TTATCCTAGC ATTTTCCGCA ACC GA.AATAA TCATATTCTA CATTATATTT AATAGGATCG TA.AA.AGGCGT TGGCTTTATT AGTATAAGAT GTAATATAAA 10701 GAAGCTACAC TTATTCCTAC ACTTATTATT ATTACACGAT GAGGA.AACCA CTTCGATGTG AATAAGGATG TGAATAATAA TAATGTGCTA CTCCTTTGGT 10751 AACAGAACGC CTAAACGCAG GTACCTACTT TTTATTCTAT ACTTTAATTG TTGTCTTGCG 
GATTTGCGTC CATGGATGAA AA.ATAAGATA TGA.AATTAAC 10801 GTTCCCTCCC CCTTCTTATT GCCCTCCTAC TTATACAAAA CAACCTAGGC CAAGGGAGGG GGAAGAATAA CGGGAGGATG AATATGTTTT GTTGGATCCG 10851 ACCTTATCTA TAACTATTAT ACAACATTCA CA.A.AACCCAA ACCTAACTTT TGGAATAGAT ATTGATAATA TGTTGTAAGT GTTTTGGGTT TGGATTGAAA 10901 ATGAATGGAT AAACTATGAT GAGTAGCATG CCTCATCGCC TTCCTTGTCA TACTTACCTA TTTGATACTA CTCATCGTAC GGAGTAGCGG AAGGAACAGT 10951 AAATACCTTT ATATGGAGTC CACCTCTGAC TACCCAAAGC CCACGTAGAA TTTATGGAAA TATACCTCAG GTGGAGACTG ATGGGTTTCG GGTGCATCTT 11001 GCCCCAATTG CCGGCTCAAT AATTTTAGCT GCAGTACTAC ATAA.ACTAGG CGGGGTTAAC GGCCGAGTTA TTAAA.ATCGA CGTCATGATG TATTTGATCC 11051 AGGTTATGGA ATAATACGAA TTATTGTAAT ATTAGACCCA CTGACCAAAG TCCAATACCT TATTATGCTT AATAACATTA TAATCTGGGT GACTGGTTTC 11101 AAATAGCTTA CCCCTTCTTA ATCCTAGCTA TCTGAGGAAT TATCATAACC TTTATCGAAT GGGGAAGAAT TAGGATCGAT AGACTCCTTA ATAGTATTGG 11151 AGCTCTATTT GTCTACGGCA AACAGACCTT AAATCTCTAA TTGCTTACTC TCGAGATAAA CAGATGCCGT TTGTCTGGAA TTTAGAGATT AACGAATGAG 11201 ATCAGTAAGT CATATAGGAC TAGTCACTGG AGCAATCCTC ATTCAAACAC TAGTCATTCA GTATATCCTG ATCAGTGACC TCGTTAGGAG TAAGTTTGTG 11251 CATGAAGTTT TGCAGGAGCA ATTACACTAA TAATCGCCCA TGGCCTAATC GTACTTCAAA ACGTCCTCGT TAATGTGATT ATTAGCGGGT ACCGGATTAG 245

11301 TCATCAGCCC TATTCTGCTT AGCCAACACC AACTACGAAC GAATTCACAG AGTAGTCGGG ATAAGACGAA TCGGTTGTGG TTGATGCTTG CTTAAGTGTC 11351 CCGAACTATA CTTCTAGCTC GAGGAATACA AATCATCTTC CCCCTCACAG GGCTTGATAT GAAGATCGAG CTCCTTATGT TTAGTAGAAG GGGGAGTGTC 11401 CAACCTGATG ATTCTTTGCT ACCCTAGCTA ACCTTGCTCT TCCACCATCC GTTGGACTAC TAAGA.AAC GA TGGGATCGAT TGGAACGAGA AGGTGGTAGG 11451 CCTAATCTTA TAGGAGAACT CCTTATCATC ACTTCATTAT TCAATTGATC GGATTAGAAT ATCCTCTTGA GGAATAGTAG TGAAGTAATA AGTTAACTAG 11501 CAATTGAACT ATAATCCTAT CAGGCCTCGG AGTATTAATC ACAGCCTCTT GTTAACTTGA TATTAGGATA GTCCGGAGCC TCATAATTAG TGTCGGAGAA 11551 ACTCCCTCTA TATATTCCTA ACAACCCAAC GCGGTCCAAC CCCTCACCAC TGAGGGAGAT ATATAAGGAT TGTTGGGTTG CGCCAGGTTG GGGAGTGGTG 11601 ATCTTATCAT TAAACCCAAA CCATACACGA GAACACCTCC TCCTAAGCCT TAGAATAGTA ATTTGGGTTT GGTATGTGCT CTTGTGGAGG AGGATTCGGA 11651 CCACCTCCTG CCCATTCTAC TACTAATACT TAAGCCAGAA CTTATCTGAG GGTGGAGGAC GGGTAAGATG ATGATTATGA ATTCGGTCTT GAATAGACTC 11701 GCTGAACACT TTGTATTTAT AGTTTAAACA AAACATTAGA TTGTGGTTCT CGACTTGTGA AACATA.AATA TCAAATTTGT TTTGTAATCT AACACCAAGA 11751 P.~~AAATAAAA GTTAAAACCT TTTTAATTAC CGAGAGAGGT CCGGGACACG TTTTTATTTT CAATTTTGGA AAAATTAATG GCTCTCTCCA GGCCCTGTGC 11801 AAGAACTGCT AATTCTTCCT ATCATGGTTC AAATCCATGA CTCACTCAGC TTCTTGACGA TTAAGAAGGA TAGTACCAAG TTTAGGTACT GAGTGAGTCG 11851 TTCTGAAAGA TAATAGTAAT CTATTGGTCT TAGGA.AC CAA AAACCCTTGG AAGACTTTCT ATTATCATTA GATAACCAGA ATCCTTGGTT TTTGGGAACC 11901 TGCAACTCCA AGCAAAAGCT ATGAACACTA TCTTTAATTC ATCATTTCTC ACGTTGAGGT TCGTTTTCGA TACTTGTGAT AGA.AATTAAG TAGTA.AAGAG 11951 CTAATCTTTT TTATCCTCAC TTTTCCATTA ATGACCTCAT TAATCCCCAA GATTAGAAA.A AATAGGAGTG ~~AAAGGTAAT TACTGGAGTA ATTAGGGGTT 12001 AAAACTTAAC CCTAACTGAT CATCATCCCA TGCP~~AAACA GC TGTp~~,AAA TTTTGAATTG GGATTGACTA GTAGTAGGGT ACGTTTTTGT CGACATTTTT 12051 CCTCCTTCTT CATCAGCCTC CTCCCCTTAT TTATTTTTCT AGACCAAGGC GGAGGAAGAA GTAGTCGGAG GAGGGGAATA AATP.~AAAAGA TCTGGTTCCG 12101 CTAGAATCAA TTATAACCAA CCATAACTGA GTAA.ACATTG GCCCATTTGA GATCTTAGTT AATATTGGTT, GGTATTGACT CATTTGTAAC C GGGTAA.AC T 12151 
CATTAACATG AGCTTCAAAT TTGATATATA CTCAATTATG TTCACCCCAG GTAATTGTAC TCGAAGTTTA AACTATATAT GAGTTAATAC AAGTGGGGTC 12201 TAGCCCTTTA CGTCACTTGA TCCATCCTTG AATTTGCCCT ATGATACATA ATCGGGA.AAT GCAGTGAACT AGGTAGGAAC TTAAACGGGA TACTATGTAT 12251 CACTCTGACC CA.AATATTAA CCGCTTCTTC AAATACTTAC TACTCTTCTT GTGAGACTGG GTTTATAATT GGCGAAGAAG TTTATGAATG ATGAGAAGAA 12301 AATCTCAATA ATTATCCTAG TTACTGCTAA CAACATATTC CAACTGTTCA TTAGAGTTAT TAATAGGATC AATGACGATT GTTGTATAAG GTTGACAAGT 12351 TCGGATGAGA AGGGGTTGGA ATCATATCAT TTCTCCTAAT CGGTTGGTGA AGCCTACTCT TCCCCAACCT TAGTATAGTA AAGAGGATTA GCCAACCACT 12401 TATAGTCGAA CAGATGCTAA TACCGCCGCC CTCCAAGCTG TAATTTACAA ATATCAGCTT GTCTACGATT ATGGCGGCGG GAGGTTCGAC ATTAAATGTT 12451 CCGAGTAGGA GATATTGGAC TAATCCTCAG CATAGCTTGA TTAGCCATAA GGCTCATCCT CTATAACCTG ATTAGGAGTC GTATCGAACT AATCGGTATT 12501 ACCTTAACTC ATGAGAAATT CAACAATTAT TCATCTTATC CAAAAACATA TGGAATTGAG TACTCTTTAA GTTGTTAATA AGTAGAATAG GTTTTTGTAT 12551 GACTTAACCT TACCCCTCCT CGGCCTCATC CTAGCCGCAG C TGGA.A.AATC CTGAATTGGA ATGGGGAGGA GCCGGAGTAG GATCGGCGTC GACCTTTTAG 12601 CGCACAATTT GGCCTTCATC CCTGACTTCC CTCTGCCATA GAAGGACCCA GCGTGTTA.AA CCGGAAGTAG GGACTGAAGG GAGACGGTAT CTTCCTGGGT 12651 CCCCGGTCTC CGCCCTACTC CACTCTAGCA CAATAGTTGT TGCTGGCATC 246

GGGGCCAGAG GCGGGATGAG GTGAGATCGT GTTATCAACA ACGACCGTAG 12701 TTTCTACTAA TCCGTCTTCA CCCCCTAATT CAAGACAACC AACTAATCTT A.AAGATGATT AGGCAGAAGT GGGGGATTAA GTTCTGTTGG TTGATTAGAA 12751 AACAACATGC CTATGCCTAG GAGCACTAAC CACCCTTTTC ACCGCAACAT TTGTTGTACG GATACGGATC CTCGTGATTG GTGGGA►AAAG TGGCGTTGTA 12801 GCGCACTCAC TCAAAACGAT ATC TTATTGCCTT CTCAACATCA CGCGTGAGTG AGTTTTGCTA TAGTTTTTTT AATAACGGAA GAGTTGTAGT 12851 AGCCAACTCG GACTAATAAT AGTAACAATT GGTCTCAATC AACCCCAACT TCGGTTGAGC CTGATTATTA TCATTGTTAA CCAGAGTTAG TTGGGGTTGA 12901 TGCTTTCCTC CATATCTGTA CCCACGCCTT CTTCAAAGCC ATGCTTTTCC ACGAAAGGAG GTATAGACAT GGGTGCGGAA GAAGTTTCGG TACGAAAAGG 12951 TCTGCTCAGG GTCCATTATC CATAATCTTA ACGATGAACA AGACATCCGC AGACGAGTCC CAGGTAATAG GTATTAGAAT TGCTACTTGT TCTGTAGGCG 13001 AAAATAGGAG GACTCCACAA ACTTCTGCCT CTTACCTCAT CTTCCCTAAC TTTTATCCTC CTGAGGTGTT TGAAGACGGA GAATGGAGTA GAAGGGATTG 13051 TATCGGTAGC CTAGCTCTCA CAGGCATGCC CTTCTTGTCA GGCTTCTTCT ATAGCCATCG GATCGAGAGT GTCCGTACGG GAAGAACAGT CCGAAGAAGA 13101 CAAAAGACGC TATCATTGAA TC CATA.AACA CTTCATACCT A.AAC GC~C TGA GTTTTCTGCG ATAGTAACTT AGGTATTTGT GAAGTATGGA TTTGCGGACT 13151 GCCCTAACCC TCACCCTCAT CGCAACATCA TTCACAGCTA TCTATAGCCT CGGGATTGGG AGTGGGAGTA GCGTTGTAGT AAGTGTCGAT AGATATCGGA 13201 CCGCCTTATT TTCTTCACAC TAATAAATTT TCCACGATTC AATCCACTCT GGCGGAATAA AAGAAGTGTG ATTATTTAAA AGGTGCTAAG TTAGGTGAGA 13251 CCCCTATTAA TGAAA.ATAAT CCAATACTAA CCAACCCAAT TAAACGCTTA GGGGATAATT ACTTTTATTA GGTTATGATT GGTTGGGTTA ATTTGCGAAT 13301 ACCTACGGAA GTATCCTAGC CGGCCTTATC ATCACATCAA ACCTAACCCC TGGATGCCTT CATAGGATCG GCCGGAATAG TAGTGTAGTT TGGATTGGGG 13351 TACP.~~AA.AC C CAAATCATAA CAATACCCCC TCTATTAAAA CTTTCCGCCC ATGTTTTTGG GTTTAGTATT GTTATGGGGG AGATAATTTT GAAAGGCGGG 13401 TACTACTAAC AATCATTGGT CTTCTACTAG CCCTAGAACT AGCTAACTTA ATGATGATTG TTAGTAACCA GAAGATGATC GGGATCTTGA TCGATTGAAT 13451 ACTAACTCCC AAC TCA.AAAC AACCCCTACC CTTTATACCC ACCACTTCTC TGATTGAGGG TTGAGTTTTG TTGGGGATGG GAAATATGGG TGGTGAAGAG 13501 CAATATACTT GGATACTTTC CACAGATTAT TCACCGCCTC C TAC C p~~A.AA GTTATATGAA C C 
TAT GA.AAG GTGTCTAATA AGTGGCGGAG GATGGTTTTT 13551 TTAACCTAAC TTGAGCCCAA CATATTTCAA CCCACCTAAT TGACCAAACA AATTGGATTG AACTCGGGTT GTATAAAGTT GGGTGGATTA ACTGGTTTGT 13601 TGAAATGAAA AAATTGGACC p~~AAAGTGTT CTTATCCAAC AAATCCCATT ACTTTACTTT TTTAACCTGG TTTTTCACAA GAATAGGTTG TTTAGGGTAA 13651 AATTAA.ATTA TCAACCCGAC CCCAACAAGG TTATATTAAA ACCTACCTTA TTAATTTAAT AGTTGGGCTG GGGTTGTTCC AATATAATTT TGGATGGAAT 13701 TACTACTCTT TCTCACATTA ATCCTAGCCC TACTCACCAC ACTAGCCTAA ATGATGAGAA AGAGTGTAAT TAGGATCGGG ATGAGTGGTG TGATCGGATT 13751 TCACACGCAA AGTTCCCCAA GAC A.AAC C C C GAGTTAACTC TAACACCACA AGTGTGCGTT TCAAGGGGTT CTGTTTGGGG CTCAATTGAG ATTGTGGTGT 13801 AACAAAGTTA ATAACAATAC TCATCCACTC AAAACCAATA ACCACCCGCC TTGTTTCAAT TATTGTTATG AGTAGGTGAG TTTTGGTTAT TGGTGGGCGG 13851 ATTAGCATAT AACAA.AGCCA CTCCCATAAA ATCTCCACGA ACCATCTCCA TAATCGTATA TTGTTTCGGT GAGGGTATTT TAGAGGTGCT TGGTAGAGGT 13901 AACCATTCAT TTCCTCCACC CCTACTCAAC TTAACTCAAA CCACTCAACC TTGGTAAGTA AAGGAGGTGG GGATGAGTTG AATTGAGTTT GGTGAGTTGG 13951 ATAAAATATT TACCAACAAA AACCA.AACCA ACCAAATAAA AACCTACATA TATTTTATAA ATGGTTGTTT TTGGTTTGGT TGGTTTATTT TTGGATGTAT 14001 CAACAACACA GACCAGTTAC CTCATGACTC AGGATAAGGC TCAGCAGCCA GTTGTTGTGT CTGGTCAATG GAGTACTGAG TCCTATTCCG AGTCGTCGGT 247

14051 GCGCTGCCGT ATAAGCAAAC ACTACCAACA TCCCCCCTAA ATAAATTAAA CGCGACGGCA TATTCGTTTG TGATGGTTGT AGGGGGGATT TATTTAATTT 14101 AATA.A.AACCA AC GACP~~AAA AGACCCACCA TGCCCCACTA ATAACCCACA TTATTTTGGT TGCTGTTTTT TCTGGGTGGT ACGGGGTGAT TATTGGGTGT 14151 CCCTACCCCA GCAGCTATAA CTAATCCCAA CGCAGCATAA TAAGGAGAAG GGGATGGGGT CGTCGATATT GATTAGGGTT GCGTCGTATT ATTCCTCTTC TATTATTA.AA 14201 GATTAGATGC TACCCCTACC AAACCTAAAA CTAAACAAAC CTAATCTACG ATGGGGATGG TTTGGATTTT GATTTGTTTG ATAATAATTT 14251 AATACAAAAT ATACCATTAT TCCTACCTGG ACTTTAACCA AGACCAATAA TTATGTTTTA TATGGTAATA AGGATGGACC TGAAATTGGT TCTGGTTATT 14301 C TTGA.P►,AAAC TATCGTTGTT TATTCAACTA TAAGAATCCA TGGCCATTAA GAACTTTTTG ATAGCAACAA ATAAGTTGAT ATTCTTAGGT ACCGGTAATT 14351 TATC C GAAAA ACTCACCCAC TAC TP.~AAAAT CGTAAATCAA ATCCTAATTG ATAGGCTTTT TGAGTGGGTG ATGATTTTTA GCATTTAGTT TAGGATTAAC 14401 ACCTCCCAGC TCCATCAAAC ATTTCCATCT GATGAAATTT TGGCTCACTC TGGAGGGTCG AGGTAGTTTG TA.AAGGTAGA CTACTTTA.AA ACCGAGTGAG 14451 CTAGGACTAT GCCTGATCAT CCA.A.ATCCTC ACAGGACTCT TCTTATCCAT GATCCTGATA CGGACTAGTA GGTTTAGGAG TGTCCTGAGA AGAATAGGTA 14501 ACATTACACC GCAGATATTT CCACAGCCTT CTCCTCAGTA ATTCACATTT TGTAATGTGG CGTCTATAAA GGTGTCGGAA GAGGAGTCAT TAAGTGTAAA 14551 GTCGCGATGT CAATTATGGC TGACTTATTC GCAACATTCA CGCTAACGGA CAGCGCTACA GTTAATACCG ACTGAATAAG CGTTGTAAGT GCGATTGCCT 14601 GCCTCCCTAT TCTTCGTCTG TGTATATTTT CACATCGCCC GAGGGCTCTA CGGAGGGATA AGAAGCAGAC ACATATAAAA GTGTAGCGGG CTCCCGAGAT 14651 CTATGGCTCT TACCTCAACA AAGAAACATG AA.ACATC GGA GTAATTTTGC GATACCGAGA ATGGAGTTGT TTCTTTGTAC TTTGTAGCCT CATTAAAACG 14701 TATTCCTACT AATAGCTACA GCCTTTGTAG GATATGTTTT ACCTTGAGGA ATAAGGATGA TTATCGATGT CGGAA.ACATC CTATACAAAA TGGAACTCCT 14751 CAAATATCCT TCTGAGGGGC TACAGTTATC ACCAACCTCC TCTCTGCCTT GTTTATAGGA AGACTCCCCG ATGTCAATAG TGGTTGGAGG AGAGACGGAA 14801 CCCTTACATC GGAGACACAA TAGTCCAATG AATCTGAGGC GGCTTTTCAG GGGAATGTAG CCTCTGTGTT ATCAGGTTAC TTAGACTCCG CCGAAAAGTC 14851 TAGATAATGC CACCCTAACA CGATTCTTCG CATTCCACTT CCTCCTCCCT ATCTATTACG GTGGGATTGT GCTAAGAAGC GTAAGGTGAA GGAGGAGGGA 14901 
TTCCTAATTG CCGCACTAAC AATTATTCAC ATCCTCTTCT TACATGAAAC AAGGATTAAC GGCGTGATTG TTAATAAGTG TAGGAGAAGA ATGTACTTTG 14951 AGGCTCAAAC AATCCTATAG GCCTTAATTC TGACATAGAC A.AAATC TCC T TCCGAGTTTG TTAGGATATC CGGAATTAAG ACTGTATCTG TTTTAGAGGA 15001 TCCACCCCTA TTTCTCCTAC AAAGACCTAC TCGGCTTCTT CACCTTATTT AGGTGGGGAT AAAGAGGATG TTTCTGGATG AGCCGAAGAA GTGGAATAAA 15051 ATCTTCCTAG GAATCCTAGC CCTATTCCTC CCCAACCTCC TAGGAGATGC TAGAAGGATC CTTAGGATCG GGATAAGGAG GGGTTGGAGG ATCCTCTACG 15101 TGp~AAATTTC ATCCCCGCTA ATCCTCTCGT TACCCCTCCA CATATTAAAC ACTTTTAAAG TAGGGGCGAT TAGGAGAGCA ATGGGGAGGT GTATAATTTG 15151 CCGAATGATA CTTTCTATTC GCCTATGCTA TCCTCCGCTC CATCCCCAAC GGCTTACTAT GAAAGATAAG CGGATACGAT AGGAGGCGAG GTAGGGGTTG 15201 AAACTAGGAG GAGTCTTAGC TCTCATATTC TCTATTTTCA TCCTCATACT TTTGATCCTC CTCAGAATCG AGAGTATAAG AGATAAAAGT AGGAGTATGA 15251 AGTACCCCTC CTCCACACCT C TA.AACAAC G AAGCAGCACC TTCCGCCCAC TCATGGGGAG GAGGTGTGGA GATTTGTTGC TTCGTCGTGG A.AGGCGGGTG 15301 TCACACAAAT TTTCTTCTGA ACCCTCGTAA CCAACATGCT AATCTTAACC AGTGTGTTTA AAAGAAGACT TGGGAGCATT GGTTGTACGA TTAGAATTGG 15351 TGAATTGGAG GACAACCAGT TGAACAACCA TTTATCCTAA TTGGACAAGT ACTTAACCTC CTGTTGGTCA ACTTGTTGGT AAATAGGATT AACCTGTTCA 15401 TGCATCTATC ATCTACTTTT CCCTATTTCT TATTGTAATT CCCCTCACAG 248

ACGTAGATAG TAGATGAAAA GGGATAAAGA ATAACATTAA GGGGAGTGTC 15451 GCTGATGAGA AA.ACAAAATT CTCAGCCTAA ACTAGTTTTG GTAGCTTAAC CGACTACTCT TTTGTTTTAA GAGTCGGATT TGATCA,AAAC CATCGAATTG 15501 CTAAAGCGTT GGCCTTGTAA GC CP►AAAAC C GGAGGTTTAA ACCCTCCCCA GATTTCGCAA CCGGAACATT CGGTTTTTGG CCTCCAAATT TGGGAGGGGT 15551 AAACATATCA GAGGAAGGAG GTTTAACCCC CTAAAATATG TCAGAGGAAG TTTGTATAGT CTCCTTCCTC CAAATTGGGG GATTTTATAC AGTCTCCTTC 15601 GAGAGTCAAA CTCCTGCCCT TGGCCCCCAA AACCAAGATT CTGCCTAAAC CTCTCAGTTT GAGGACGGGA ACCGGGGGTT TTGGTTCTAA GACGGATTTG 15651 TACCCCCTGA TGCCCACAAA TCATGAA.AAC CAGTGTACAT TGGTTTTCAA ATGGGGGACT ACGGGTGTTT AGTACTTTTG GTCACATGTA ACCAAAAGTT 15701 AA.AAGTAAGT CAGAGTGACA TATTAATGAT ATGGCCCACA TACCTTAATA TTTTCATTCA GTCTCACTGT ATAATTACTA TACCGGGTGT ATGGAATTAT 15751 ATGGTACATT ACCCAACTCG ACTACACTAC ATTAATTGAT TATCCCCTAC TACCATGTAA TGGGTTGAGC TGATGTGATG TAATTAACTA ATAGGGGATG 15801 TGGTATCGCA CTCTATGTAT AATCCCCATT AATTTATATT CCACTATATC ACCATAGCGT GAGATACATA TTAGGGGTAA TTAA.ATATAA GGTGATATAG 15851 ATAACATACT ATGCTTAATA CTCATTAATA TACTGTCCAC TATTTCATTA TATTGTATGA TACGAATTAT GAGTAATTAT ATGACAGGTG ATAAAGTAAT AATCP.~PLAATT 15901 CATTATATTC CTTAGCCCCC ATAA.AATTAA TTCATATCAT GTAATATAAG GAATCGGGGG TATTTTAATT TTAGTTTTAA AAGTATAGTA 15951 AAATTTATTC ATTTAACCCT TAAAAATCTA AGTAAATATC ATGCGGGTTG TTTA.AATAAG TAA.ATTGGGA ATTTTTAGAT TCATTTATAG TACGCCCAAC 16001 GTAAGAACAT CACATCCCGC TATTGTAAGG AP►AAAATTGC TCTATTTGTG CATTCTTGTA GTGTAGGGCG ATAACATTCC TTTTTTAACG AGATAAACAC 16051 GCACTGTACT CGATTTATCC CCATCAATTG ACCAGAACTG GCATCTGATT CGTGACATGA GCTAAATAGG GGTAGTTAAC TGGTCTTGAC CGTAGACTAA 16101 AATGCTTTCA ATACTTCAAT CCTTGATCGC GTCAAGAATG CCAGCCCCCT TTACGAAAGT TATGAAGTTA GGAACTAGCG CAGTTCTTAC GGTCGGGGGA 16151 AGTTCCCTTT AATGGCATTT TCGTCCTTGA TCGTCTCAAG ATTTATCGTC TCAAGGGAAA TTAC C GTA.AA AGCAGGAACT AGCAGAGTTC TAAATAGCAG 16201 CTCCCTGAAT TTTTTTTTGG GGATGAAGCC ATCGCTATTC CCCGGAAGGG GAGGGACTTA P~~AAAAAAC C CCTACTTCGG TAGCGATAAG GGGCCTTCCC 16251 CTGAACTGGG ACACTGAGAT AAACCTGTAT CATCCTCGAC ATCTATCTAA 
GACTTGACCC TGTGACTCTA TTTGGACATA GTAGGAGCTG TAGATAGATT TTGAC CA,AAA 16301 CATACTCATT ACTCATCATT CATGAGTGAT AATTGTCAAG GTATGAGTAA TGAGTAGTAA GTACTCACTA TTAACAGTTC AACTGGTTTT 16351 C TGAAAGGGA TAGAGAAATT GACGCCATAG TCGGCAAGTT TCGATTTTTT GACTTTCCCT ATCTCTTTAA CTGCGGTATC AGCCGTTCAA ACCT 16401 TGATTAATGA AGCTATGGTT T TAC ACTCTCTTAA CCCCCATCCG ACTAATTACT TCGATACCAA ATTTTTTATG TGAGAGAATT GGGGGTAGGC 16451 GGACAAATTC GCAATAAACG TTAGTGTAAA ATACATTACA TTATTCTAAT CCTGTTTAAG CGTTATTTGC AATCACATTT TATGTAATGT AATAAGATTA 16501 ACATTCTTCA CTTTATCTGG CATAAATTTA TTATTATTAA GTTTCCCCCT TGTAAGAAGT GA.AATAGAC C GTATTTAAAT AATAATAATT CAAAGGGGGA 16551 GAGTTGTAAA AAA.ATTTTTG AGGCCGCTTA CA TTTTTTGGTA AAP~AAACCAT CTCAACATTT TTTTP~APsAAC TCCGGCGAAT TTTTTTTTGT AC TA.AAA 16601 AAAACCCCCC TCCCCCTAAT ATACACGGAC TCCTCGAAAA C C C TTTTGGGGGG AGGGGGATTA TATGTGCCTG AGGAGCTTTT TGGGGATTTT 16651 CGAGGGCCGG ACATATATTT TTGAATTAGC ATGCGAAATA TTTTCTATAT GCTCCCGGCC TGTATATAAA AACTTAATCG TACGCTTTAT ~GATATA 16701 ATATTGTTAC ACTATGAT TATAACAATG TGATACTA

tRNA      1..70
          product = tRNA-Phe
rRNA      69..1019
          product = 12S ribosomal RNA
tRNA      1020..1091
          product = tRNA-Val
rRNA      1092..2762
          product = 16S ribosomal RNA
tRNA      2763..2837
          product = tRNA-Leu
gene      2838..3812
          gene = ND1
          product = NADH dehydrogenase subunit 1
tRNA      3815..3883
          product = tRNA-Ile
tRNA      3882..3953
          product = tRNA-Gln
tRNA      3954..4022
          product = tRNA-Met
gene      4023..5066
          gene = ND2
          product = NADH dehydrogenase subunit 2
tRNA      5066..5134
          product = tRNA-Trp
tRNA      complement (5136..5204)
          product = tRNA-Ala
tRNA      complement (5205..5277)
          product = tRNA-Asn
tRNA      complement (5311..5377)
          product = tRNA-Cys
tRNA      complement (5379..5448)
          product = tRNA-Tyr
gene      7158..7848
          gene = CO1
          product = cytochrome c oxidase subunit 1
tRNA      complement (7006..7076)
          product = tRNA-Ser
tRNA      7081..7150
          product = tRNA-Asp
gene      7158..7848
          gene = CO2
          product = cytochrome c oxidase subunit 2
tRNA      7849..7922
          product = tRNA-Lys
gene      7924..8091
          gene = ATP8
          product = ATP synthase F0 subunit 8
gene      8082..8765
          gene = ATP6
          product = ATP synthase F0 subunit 6
gene      8765..9550
          gene = CO3
          product = cytochrome c oxidase subunit 3
tRNA      9553..9622
          product = tRNA-Gly
gene      9623..9973
          gene = ND3
          product = NADH dehydrogenase subunit 3
tRNA      9972..10041
          product = tRNA-Arg
gene      10042..10338
          gene = ND4L
          product = NADH dehydrogenase subunit 4L
gene      10332..11712
          gene = ND4
          product = NADH dehydrogenase subunit 4
tRNA      11713..11781
          product = tRNA-His
tRNA      11782..11848
          product = tRNA-Ser
tRNA      11849..11920
          product = tRNA-Leu
gene      11921..13750
          gene = ND5
          product = NADH dehydrogenase subunit 5
gene      complement (13746..14267)
          gene = ND6
          product = NADH dehydrogenase subunit 6
tRNA      complement (14268..14337)
          product = tRNA-Glu
gene      14340..15485
          gene = CYTB
          product = cytochrome b
tRNA      15485..15555
          product = tRNA-Thr
insert    15556..15590
tRNA      complement (15591..15659)
          product = tRNA-Pro
D-Loop    15650..16718
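The coordinates in the feature table can be read programmatically. The following is a minimal Python sketch, not part of the original analysis. It assumes that locations follow the GenBank convention (1-based, inclusive start and end, with "complement" marking features encoded on the opposite strand). The names `parse_location`, `feature_length` and `overlap` are hypothetical helpers introduced only for illustration.

```python
import re

# Matches locations as printed in the table, e.g. "2838..3812" or
# "complement (5136..5204)". The pattern is an assumption based on the
# entries in this appendix, not a full GenBank location grammar.
LOCATION = re.compile(r"^(complement)?\s*\(?\s*(\d+)\.\.(\d+)\s*\)?$")


def parse_location(text):
    """Return (start, end, strand) for a simple location string."""
    match = LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location: {text!r}")
    start, end = int(match.group(2)), int(match.group(3))
    strand = "-" if match.group(1) else "+"
    return start, end, strand


def feature_length(text):
    """Length in bases, treating both coordinates as 1-based and inclusive."""
    start, end, _ = parse_location(text)
    return end - start + 1


def overlap(a, b):
    """Number of bases shared by two locations (0 if they do not overlap)."""
    a_start, a_end, _ = parse_location(a)
    b_start, b_end, _ = parse_location(b)
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)
```

Under these assumptions, `feature_length("2838..3812")` gives the length of the ND1 entry, and `overlap("7924..8091", "8082..8765")` gives the number of bases shared by the ATP8 and ATP6 entries as listed.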

Cetorhinus maximus mitochondrion, complete genome

1 GCTAGTGTAG CTTAATGTAA AGTATGGCAC TGAAGATGCT AAGATGAAAA CGATCACATC GAATTACATT TCATACCGTG ACTTCTACGA TTCTACTTTT 51 ATAGAAATTT TCCACAGGCA TATAGGTTTG GTCCTGGCCT CAGTATTAAT TATCTTTAAA AGGTGTCCGT ATATCCAAAC CAGGACCGGA GTCATAATTA 101 TGTAACCAAA ATTATACATG CAAGTTTCAG CATCCCCGTG AGAATGCCCT ACATTGGTTT TAATATGTAC GTTCAAAGTC GTAGGGGCAC TCTTACGGGA 151 AATTACTCTA TCAATTAATT AGGAGCGGGT ATCAGGCACA CACACGTAGC TTAATGAGAT AGTTAATTAA TCCTCGCCCA TAGTCCGTGT GTGTGCATCG 201 CCAAGACACC TTGCTAAGCC ACACCCCCAA GGGATTTCAG CAGTAATAAA GGTTCTGTGG AACGATTCGG TGTGGGGGTT CCCTAAAGTC GTCATTATTT 251 TATTGACACA TAAGCGTAAG CTTGAGTCAG TTAAAGTTAA CAGAGTTGGT ATAACTGTGT ATTCGCATTC GAACTCAGTC AATTTCAATT GTCTCAACCA 301 AAATCTCGTG CCAGCCACCG CGGTTATACG AGTAACTCAT ATTAATACTT TTTAGAGCAC GGTCGGTGGC GCCAATATGC TCATTGAGTA TAATTATGAA 351 CCCGGCGTAA AGAGTGATTT AAGGAATATC TACAATAATT AAAGTTAAGA GGGCCGCATT TCTCACTAAA TTCCTTATAG ATGTTATTAA TTTCAATTCT 401 CCTCATCAAA CTGTTATACG CACCCATAAA CGGAAATATC AACAACGAAA

GGAGTAGTTT GACAATATGC GTGGGTATTT GCCTTTATAG TTGTTGCTTT 451 GTGACTTTAT ACCACTAGAA ATCTTGATGT CACGACAGTT AGACCCCAAA C AC T GA.AATA TGGTGATCTT TAGAACTACA GTGCTGTCAA TCTGGGGTTT 501 CTAGGATTAG ATACCCTACT ATGTCTAACC ATAAACTTAA ACAATAATTC GATCCTAATC TATGGGATGA TACAGATTGG TATTTGAATT TGTTATTAAG 551 ACTATATTGT TCGCCAGAGA ACTACAAGCG CTAGCTTGAA ACCCAAAGGA TGATATAACA AGCGGTCTCT TGATGTTCGC GATCGAACTT TGGGTTTCCT 601 CTTGGCGGTG TCCCAAACCC ACCTAGAGGA GCCTGTTCTG TAACCGATAA GAACCGCCAC AGGGTTTGGG TGGATCTCCT CGGACAAGAC ATTGGCTATT 651 TCCCCGTTAA ACCTCACCAC TTCTAGCCAT CCCCGTCTAT ATACCGCCGT AGGGGCAATT TGGAGTGGTG AAGATCGGTA GGGGCAGATA TATGGCGGCA 701 CGTCAGCTCA CCCTGTGAAG GTCTA~AAAGT AAGCP~~AAAG AACTAACTTC GCAGTCGAGT GGGACACTTC CAGATTTTCA TTCGTTTTTC TTGATTGAAG 751 CATACGTCAG GTCGAGGTGT AGCAAATGAA GTGGATAGAA ATGGGCTACA GTATGCAGTC CAGCTCCACA TCGTTTACTT CACCTATCTT TACCCGATGT 801 TTTTCTATAA AGAAAACACG GATGGTAAAC TGp~~AAATTA CCTAAAGGTG AA.AAGATATT TCTTTTGTGC CTACCATTTG ACTTTTTAAT GGATTTCCAC 851 GATTTAGCAG TAAGAAAAGA TTAGAGAGCT TTTC TGP►AAC TGGCTCTGGG CTAAATCGTC ATTCTTTTCT AATCTCTCGA AAAGACTTTG ACCGAGACCC 901 ACGCGCACAC ACCGCCCGTC ACTCTCCTCG TCT ACTTATTTTT TGCGCGTGTG TGGCGGGCAG TGAGAGGAGC TTTTTTTAGA TGAATAAAAA 951 AATTA.A.AAGA ATATCACCAA GAGGAGGCAA GTCGTAACAT GGTAAGTGTA TTAATTTTCT TATAGTGGTT CTCCTCCGTT CAGCATTGTA CCATTCACAT 1001 CTGGAAAGTG CACTTGGAAT CAA.AATGTGG CTAAACTAGC AA.AGCAC C TC GACCTTTCAC GTGAACCTTA GTTTTACACC GATTTGATCG TTTCGTGGAG 1051 CCTTACACCG AGGAGAAACC CGTGCAATTC GAGTCATTTT GAACATTAAA GGAATGTGGC TCCTCTTTGG GCACGTTAAG CTCAGTAAAA CTTGTAATTT 1101 GCTAGCCTGT ACATCACCTT AAACCTAACC TTATTAATTA CCTTATACAC CGATCGGACA TGTAGTGGAA TTTGGATTGG AATAATTAAT GGAATATGTG 1151 TATTTCCTAA CTAAAACATT TTTACTTTTT AGTATGGGTG ACAGAACAAA ATAAAGGATT GATTTTGTAA AAATGP~.AAAA TCATACCCAC TGTCTTGTTT 1201 AAATCAGCGC AATAGACTAT GTACCGCAAG GGAAAGCTGA AAGAGA.AATG TTTAGTCGCG TTATCTGATA CATGGCGTTC CCTTTCGACT TTCTCTTTAC 1251 AAACAA.ATAA TTAAAGTAAT P►AAAAGCAGA GATTCCACCT CGTACCTTTT TTTGTTTATT AATTTCATTA TTTTTCGTCT 
CTAAGGTGGA GCATGGAA.AA 1301 GCATCATGAT TTAGCTAGAA A.AACTAGACA AAGAGATCTT AAGCCTATCC CGTAGTACTA AATCGATCTT TTTGATCTGT TTCTCTAGAA TTCGGATAGG 1351 TCCCGAAACT A.AAC GAGC TA CTCCGAAGCA GCACAACTGA GCGAACCCGT AGGGCTTTGA TTTGCTCGAT GAGGCTTCGT CGTGTTGACT CGCTTGGGCA 1401 CTCTGTGGCA AA.AGAGTGGG AAGACTTCCG AGTAGCGGTG ACAAGCCTAT GAGACACCGT TTTCTCACCC TTCTGAAGGC TCATCGCCAC TGTTCGGATA 1451 CGAGTTTAGT GATAGCTGGT TGTCCAAGAA AAGAACTTAA ATTCTGCATT GCTCAAATCA CTATCGACCA ACAGGTTCTT TTCTTGAATT TAAGACGTAA 1501 AATTTTTCAT CACCAATAAG TCCACCTTAT TAAGGTCAAA TATP.►~~AAATT TTP.►~~AA.AGTA GTGGTTATTC AGGTGGAATA ATTCCAGTTT ATATTTTTAA 1551 AATAGTTATT CAGAAGAGGT ACAGCCCTTC TGAATTAAGA TACAACTTTC TTATCAATAA GTCTTCTCCA TGTCGGGAAG ACTTAATTCT ATGTTGAAAG 1601 AAAGGAGGGA AATGATCATA TTTATTAAGG TTTTCACCTT AGTGGGCCCA TTTCCTCCCT TTACTAGTAT AAATAATTCC AAAAGTGGAA TCACCCGGGT 1651 AAAGCAGCCA CCTGAAGAGT AAGCGTCACA GCTCCAGTTT AAC C TTTCGTCGGT GGACTTCTCA TTCGCAGTGT C GAGGTC A.AA TTGTTTTTTG 1701 CTATAATACG GATAACTCCT CATAACCCCC TTAATCATAT TGGACCATTT GATATTATGC CTATTGAGGA GTATTGGGGG AATTAGTATA ACCTGGTAAA 1751 TAT~?~AAATTA TA.AA.AGAAC T TATGC TA.AAA TGAGTAATAA GAGGACA.AAC ATATTTTAAT ATTTTCTTGA ATACGATTTT ACTCATTATT CTCCTGTTTG 252

1801 CTCTCCAGAC ACAAGTGTAT GTCAGAA.AGA ATTAAATCAC TGACAATTAA GAGAGGTCTG TGTTCACATA CAGTCTTTCT TAATTTAGTG ACTGTTAATT 1851 ACGAACCCAG ACTGAGGATA TTATACTGAT ATGACCTTAA C TAGA~AAAC C TGCTTGGGTC TGACTCCTAT AATATGACTA TACTGGAATT GATCTTTTGG 1901 CTATTACAAT GCTCGTTAAC CCTACACAGG AGTGTCTTAA GGAAAGATTA GATAATGTTA CGAGCAATTG GGATGTGTCC TCACAGAATT CCTTTCTAAT 1951 AAAGAAAATA AAGGAACTCG GCA.AACACAA ACTCCGCCTG TTTACCAA.AA TTTCTTTTAT TTCCTTGAGC CGTTTGTGTT TGAGGCGGAC AAATGGTTTT 2001 ACATCGCCTC TTGCATTCCC ATAAGAGGTC CCGCCTGCCC TGTGACAATG TGTAGCGGAG AACGTAAGGG TATTCTCCAG GGCGGACGGG ACACTGTTAC 2051 TTTAACGGCC GCGGTATTTT GACCGTGCAA AGGTAGCGTA ATCACTTGTC AAATTGCCGG CGCCATA,AAA CTGGCACGTT TCCATCGCAT TAGTGAACAG 2101 TTTTAAATGA AGACCCGTAT GAAAGGCATC ACGAGAGTTT AACTGTCTCT ~~ATTTAC T TCTGGGCATA CTTTCCGTAG TGCTCTCAAA TTGACAGAGA 2151 ATTTTCTAAT CAATGAAATT GATCTACTCG TGCAGAAGCG AGTATA.AACA TAAAAGATTA GTTACTTTAA CTAGATGAGC ACGTCTTCGC TCATATTTGT 2201 CATTAGACGA GAAGACCCTA TGGAGCTTCA AACACATA.AA TTAACTATGT GTAATCTGCT CTTCTGGGAT ACCTCGAAGT TTGTGTATTT AATTGATACA 2251 A.AATTAATTA TTCCACGGAT ATAAACAATA ATACAATATC TTTAATTTAA TTTAATTAAT AAGGTGCCTA TATTTGTTAT TATGTTATAG AAATTAAATT 2301 CTGTTTTTGG TTGGGGTGAC CAAGGGGA.A.A AACAAATCCG CCTTATCGAC GAC A~P~3AC C AACCCCACTG GTTCCCCTTT TTGTTTAGGG GGAATAGCTG 2351 TGAGTACTCA AGTACTTA.AA AATTAGAATT ACAATTCTAA TTGATAA.AAC ACTCATGAGT TCATGAATTT TTAATCTTAA TGTTAAGATT AACTATTTTG 2401 ATTTATCGAA AAATGACCCA GGATTTCCTG ATCAATGAAC CAAGTTACCC TAAATAGCTT TTTACTGGGT CCTAAAGGAC TAGTTACTTG GTTCAATGGG 2451 TAGGGATAAC AGCGCAATCC TTTCCCAGAG TCCCTATCGC CGAAAGGGTT ATCCCTATTG TCGCGTTAGG A.A.AGGGTCTC AGGGATAGCG GCTTTCCCAA 2501 TACGACCTCG ATGTTGGATC AGGACATCCT AATGATGCAA CCGTTATTAA ATGCTGGAGC TACAACCTAG TCCTGTAGGA TTACTACGTT GGCAATAATT 2551 GGGTTCGTTT GTTCAACGAT TAATAGTCCT ACGTGATCTG AGTTCAGACC CCCAAGCAA.A CAAGTTGCTA ATTATCAGGA TGCACTAGAC TCAAGTCTGG 2601 GGAGAAATCC AGGTCAGTTT CTATCTATGA ATTTATTTCT CCTAGTACGA CCTCTTTAGG TCCAGTCAAA GATAGATACT TAA.ATAAAGA GGATCATGCT 2651 AAGGACCGGA 
A►AAATGGAGC CAATACCCAA GGCACGCTCC ATTTTCACCT TTCCTGGCCT TTTTACCTCG GTTATGGGTT CCGTGCGAGG TAAAAGTGGA 2701 ATTGAA.ATAA AC TAAA.ATAG ATAAGp~~AAA AATCA.AACAT TGCCCAAGAA TAACTTTATT TGATTTTATC TATTCTTTTT TTAGTTTGTA ACGGGTTCTT 2751 AAGGGCTGTT GAGGTGGCAG AGCCTGGCAA ATGCGAAAGA CCTAAGCTCT TTCCCGACAA CTCCACCGTC TCGGACCGTT TACGCTTTCT GGATTCGAGA 2801 TTAATCCAGA GGTTCAA.ATC CTCTCCCCAA CTATGCTTGA AGCCCTCCTA AATTAGGTCT CCAAGTTTAG GAGAGGGGTT GATACGAACT TCGGGAGGAT 2851 CTTTACTTGA TTTGCCCACT TACCTATATT ATTCCTATTT TATTAGCCAC GA.AATGAAC T AAACGGGTGA ATGGATATAA TAAGGATAAA ATAATCGGTG 2901 AGCCTTCCTC ACCCTAGTCG AAC GA.AAAGT CCTTGGTTAT ATACAACTCC TCGGAAGGAG TGGGATCAGC TTGCTTTTCA GGA.ACCAATA TATGTTGAGG 2951 GCAAAGGCCC CAACATTGTA GGACCATGCG GACTCCTCCA ACCTATCGCA CGTTTCCGGG GTTGTAACAT CCTGGTACGC CTGAGGAGGT TGGATAGCGT 3001 GACGGCCTAA AACTATTTAC C A.AAGAAC C T ATTTACCCCT CCACATCCTC CTGCCGGATT TTGATA.AP~TG GTTTCTTGGA TAA.ATGGGGA GGTGTAGGAG 3051 CCCATTCCTA TTTTTAGCTA CCCCCACAAT AGCCCTAACA TTGGCCCTCC GGGTAAGGAT P,~~AAATC GAT GGGGGTGTTA TCGGGATTGT AACCGGGAGG 3101 TAATATGAAT ACCCCTCCCC CTCCCTTACT CTATCATTAA CCTCAATCTA ATTATACTTA TGGGGAGGGG GAGGGAATGA GATAGTAATT GGAGTTAGAT 3151 GGCTTATTAT TTATTCTAGC AATTTCAAGT TTAACCGTCT ACACCATTTT 253

CCGAATAATA AATAAGATCG TTA.AAGTTCA AATTGGCAGA TGTGGTAAAA 3201 AGGATCTGGA TGAGCATCAA ATTCAAAATA TGCCCTAATA GGGGCCCTAC TCCTAGACCT ACTCGTAGTT TAAGTTTTAT ACGGGATTAT CCCCGGGATG 3251 GAGCTGTAGC ACAGACAATT TCTTACGAAG TGAGCCTTGG ATTAATCCTC CTCGACATCG TGTCTGTTAA AGAATGCTTC ACTCGGAACC TAATTAGGAG 3301 CTATCAATGA TCATTTTTGC AGGGGGCTTT ACTCTCCATA CCTTCAACCT GATAGTTACT AGTp~~,AAACG TCCCCCGA.AA TGAGAGGTAT GGAAGTTGGA 3351 GACACAAGAG ACAATCTGAC TAATTATCCC AGGATGACCC CTAGCCTTAA CTGTGTTCTC TGTTAGACTG ATTAATAGGG TCCTACTGGG GATCGGAATT 3401 TATGATATGT ATCAACCCTA GCAGAGACTA ATCGAGTTCC ATTTGACCTA ATACTATACA TAGTTGGGAT CGTCTCTGAT TAGCTCAAGG TAAACTGGAT 3451 ACGGAGGGAG AATCAGAACT AGTCTCAGGC TTTAACATCG AATACGCAGG TGCCTCCCTC TTAGTCTTGA TCAGAGTCCG AAATTGTAGC TTATGCGTCC 3501 AGGCTCATTC GCCCTATTTT TCCTAGCCGA ATATACTAAC ATTTTATTAA TCCGAGTAAG CGGGATAAAA AGGATCGGCT TATATGATTG T~~AAATAATT 3551 TAAACACCCT TTCGGTTATT TTGTTCATAG GCTCCTCCTA TAACCCCCTT ATTTGTGGGA AAGCCAATAA AACAAGTATC CGAGGAGGAT ATTGGGGGAA 3601 TCCCCAGAGA TTTCAACACT CAGCCTAATA AT~?~AAAGCAA CCCTACTAAC AGGGGTCTCT AAAGTTGTGA GTCGGATTAT TATTTTCGTT GGGATGATTG 3651 CCTGTTCTTC TTATGAATCC GAGCATCATA TCCTCGCTTC CGTTACGATC GGACAAGAAG AATACTTAGG CTCGTAGTAT AGGAGCGAAG GCAATGCTAG 3701 AACTTATACA CTTAGTATGA p~~A.AATTTTC TGCCCTTGAC CTTAGCAATT TTGAATATGT GAATCATACT TTTTTAAAAG ACGGGAACTG GAATCGTTAA 3751 ATACTATGAC ACATCGCCCT CCCCACAGCT ACAGCAGGCC TGCCTCCTTT TATGATACTG TGTAGCGGGA GGGGTGTCGA TGTCGTCCGG ACGGAGGAAA 3801 AACCTAACGG AAGCGTGCCT GAACAAAGGA CCACTTTGAT AGAGTGGGTA TTGGATTGCC TTCGCACGGA CTTGTTTCCT GGTGAAACTA TCTCACCCAT 3851 ATGAAAGTTA A.AACCTTTCC TCTTCCTAGA AAAACAGGAC TTGAACCTGC TACTTTCAAT TTTGGAAAGG AGAAGGATCT TTTTGTCCTG AACTTGGACG 3901 ACCTAAGAGA TCAAAACTCT TCATGCTTCC TATTATACTA TTTTCTAAGT TGGATTCTCT AGTTTTGAGA AGTACGAAGG ATAATATGAT AAA.AGATTCA 3951 AAAGTCAGCT AAATAAGCTT TCGGGCCCAT ACCCCAACCA TGTCGGTTAA TTTCAGTCGA TTTATTCGAA AGCCCGGGTA TGGGGTTGGT ACAGCCAATT 4001 AATCCTTCCT TTACTAATGA GCCCAATCGT ACTAACCATT ATTATCTCTA TTAGGAAGGA AATGATTACT CGGGTTAGCA 
TGATTGGTAA TAATAGAGAT 4051 GCCTAGGCCT AGGAACTATC CTAACATTCA TCAGTTCACA CTGACTCCTA CGGATCCGGA TCCTTGATAG GATTGTAAGT AGTCAAGTGT GACTGAGGAT 4101 GTTTGAATAG GCCTCGAAAT CAATACTCTA GGGATGATTG CACTAATAAT CAA.ACTTATC CGGAGCTTTA GTTATGAGAT CGGTAGTAAG GTGATTATTA 4151 TCGCCAGCAC CACCCCCGAG CAGTAGAAGC CTCCACAAAA TACTTTATCA AGCGGTCGTG GTGGGGGCTC GTCATCTTCG GAGGTGTTTT ATGAAATAGT 4201 CACAAGCCAC TGCCTCAGCC CTACTCTTAT TCGCTAGCAC CACAAACGCT GTGTTCGGTG ACGGAGTCGG GATGAGAATA AGCGATCGTG GTGTTTGCGA 4251 TGGACTTCAG GTGAATGAAG TCTAATCGAA ATTTCCAATC CAGGCTCTGC ACCTGAAGTC CACTTACTTC AGATTAGCTT TAAAGGTTAG GTCCGAGACG 4301 CACACTAGCC ACAATCGCAT TAGCCTTAAA AATTGGCTTG GCCCCCCTTC GTGTGATCGG TGTTAGCGTA ATCGGAATTT TTAACCGAAC CGGGGGGAAG 4351 ACTTTTGACT ACCCGAAGTC CTCCAAGGCC TAGACCTCAC CACAGGACTC TGA.A.AAC TGA TGGGCTTCAG GAGGTTCCGG ATCTGGAGTG GTGTCCTGAG 4401 ATCCTATCTA CCTGACAAAA ACTGGCCCCA TTCGCTATTC TCCTACAACT TAGGATAGAT GGACTGTTTT TGACCGGGGT AAGCGATAAG AGGATGTTGA 4451 TTACCCCTCA CTAAACCCCA ATTTACTACT ACTCCTTGGA GTCCTCTCAA AATGGGGAGT GATTTGGGGT TAAATGATGA TGAGGAACCT CAGGAGAGTT 4501 CCATGGTAGG AGCCTGAGGA GGATTAAATC AAACTCAATT ACGAAAA.ATC GGTACCATCC TCGGACTCCT CCTAATTTAG TTTGAGTTAA TGCTTTTTAG 254

4551 ATAGCCTATT CCTCTATTGC ACATCTTGGT TGAATAATTA CAGTCTTATA TATCGGATAA GGAGATAACG TGTAGAACCA ACTTATTAAT GTCAGAATAT 4601 TTACTCACAC AATTTAACCC AACTCAACCT AATTCTTTAC ATCACTATAA AATGAGTGTG TTA.AATTGGG TTGAGTTGGA TTAAGAAATG TAGTGATATT 4651 CATCAACGAC CTTCCTCTTA TTT~`~AAACAT TTAACTCCAC CAAAATCAAC GTAGTTGCTG GAAGGAGAAT AAATTTTGTA AATTGAGGTG GTTTTAGTTG 4701 TCTATCTCCT CTTCATCATC P.,.A.AA.ACC C C C CTACTATCTA CCATTGCTCT AGATAGAGGA GAAGTAGTAG TTTTTGGGGG GATGATAGAT GGTAACGAGA 4751 TATAACCCTC CTTTCCCTCG GAGGATTACC TCCACTCTCA GGTTTTATAC ATATTGGGAG GAAAGGGAGC CTCCTAATGG AGGTGAGAGT CCAAAATATG 4801 CAAAATGATT AATTTTACAA GAGCTAACAA AACAAAACCT AATTGTCCCA GTTTTACTAA TTA.A.AATGTT CTCGATTGTT TTGTTTTGGA TTAACAGGGT 4851 GCCACTATTA TAGCTATAAT AACCCTCCTC AGTCTATTCT TTTACCTACG CGGTGATAAT ATCGATATTA TTGGGAGGAG TCAGATAAGA A.AATGGATGC 4901 CCTATGCTAT GCCACAACAT TAACCATAAC CCCAAATCCA ATCAACATAC GGATACGATA CGGTGTTGTA ATTGGTATTG GGGTTTAGGT TAGTTGTATG 4951 CAACATCATG AC GAAC TAA.A CTACCACATA ACCTCATCCT GACAACAACT GTTGTAGTAC TGCTTGATTT GATGGTGTAT TGGAGTAGGA CTGTTGTTGA 5001 GCCTCACTAT CTATTTTCCT CCTCCCAGTC ACCCCAGCCA TCCTCATATT CGGAGTGATA GATA►.P~AAGGA GGAGGGTCAG TGGGGTCGGT AGGAGTATAA 5051 AATATCCTAA GA.AATTTAGG TTAACAACAG ACCAAGAGCC TTCAAAGCTT TTATAGGATT CTTTAAATCC AATTGTTGTC TGGTTCTCGG AAGTTTCGAA 5101 TAAGTAGAAG TGA.AAATCTC CTAATTTCTG CTAAGATTTG CAAGATTTTA ATTCATCTTC ACTTTTAGAG GATTA.AAGAC GATTCTAA.AC GTTC T~~AA.AT 5151 TCTCACATCC TCTGATTGCA ACCCAGACAC TTTAATTAAG CTA.A.AACCCT AGAGTGTAGG AGACTAACGT TGGGTCTGTG AAATTAATTC GATTTTGGGA 5201 CTAGACAAAT AGGCCTCGAT CCTACAAAAT CTTAGTTAAC AGCTAAGTGT GATCTGTTTA TCCGGAGCTA GGATGTTTTA GAATCAATTG TCGATTCACA 5251 TCAATCCAGC GAACTTTTAT CTACTTTCTC CCGCCGTAAG AACAAAGGCG AGTTAGGTCG CTTG~TA GATGAAAGAG GGCGGCATTC TTGTTTCCGC 5301 GGAGA.AAGTC C C GGGAGA.AA CTTAACCTCC ATTTTTGGGT TTGCAACCCA CCTCTTTCAG GGCCCTCTTT GAATTGGAGG TP.~~AAAC C C A AACGTTGGGT 5351 ACGTACACAT TTACTGCGGG ACTTTGGTAA GAAGAGGAAT TTAGCCTCTG TGCATGTGTA AATGACGCCC TGAAACCATT CTTCTCCTTA AATCGGAGAC 5401 
TCCGCGGAGC TACAACCCGC TACTTAATTC TCAGACACCT TACCTGTGGC AGGCGCCTCG ATGTTGGGCG ATGAATTAAG AGTCTGTGGA ATGGACACCG 5451 AATTAATCGT TGACTATTTT CTACAAACCA TAAAGACATC GGCACCCTGT TTAATTAGCA ACTGAT~ GATGTTTGGT ATTTCTGTAG CCGTGGGACA 5501 ATTTAATCTT TGGTGCATGA GCAGGAATAG TAGGGACAGC CCTAAGCCTC TAAATTAGAA ACCACGTACT CGTCCTTATC ATCCCTGTCG GGATTCGGAG 5551 CTAATTCGAG CCGAATTAGG CCAACCCGGA TCACTTCTTG GTGATGATCA GATTAAGCTC GGCTTAATCC GGTTGGGCCT AGTGAAGAAC CACTACTAGT 5601 AATTTATAAT GTTATTGTGA CAGCTCATGC ATTTGTAATA ATCTTCTTCA TTAAATATTA CAATAACACT GTCGAGTACG TAAACATTAT TAGAAGAAGT 5651 TGGTTATACC CGTAATAATT GGGGGTTTTG GGAACTGATT AGTACCATTA ACCAATATGG GCATTATTAA CCCCCAAAAC CCTTGACTAA TCATGGTAAT 5701 ATAATTGGTG CGCCAGACAT AGCCTTCCCA C GAATAA.ATA ATATAAGCTT TATTAACCAC GCGGTCTGTA TCGGAAGGGT GCTTATTTAT TATATTCGAA 5751 TTGACTCCTC CCTCCTTCTT TTCTCTTGCT CCTGGCCTCA GCCGGAGTTG AACTGAGGAG GGAGGAAGAA AAGAGAACGA GGACCGGAGT CGGCCTCAAC 5801 AAGCTGGAGC CGGAACTGGC TGAACAGTAT ACCCTCCCCT AGCTGGCAAT TTCGACCTCG GCCTTGACCG ACTTGTCATA TGGGAGGGGA TCGACCGTTA 5851 CTAGCACACG CTGGAGCATC CGTTGATTTA GCCATCTTTT CTCTCCATTT GATCGTGTGC GACCTCGTAG GCAACTAAAT CGGTAGAAAA GAGAGGTAAA 5901 AGCAGGCATC TCATCAATTC TAGCTTCAAT TAACTTTATT ACAACCATTA 255

TCGTCCGTAG AGTAGTTAAG ATCGAAGTTA ATTGAAATAA TGTTGGTAAT 5951 TTAATATGAA GCCACCAGCC ATCTCCCAAT ATCAAACACC ATTATTCGTG AATTATACTT CGGTGGTCGG TAGAGGGTTA TAGTTTGTGG TAATAAGCAC 6001 TGATCAATTC TAGTCACAAC CATCCTTCTT CTTTTAGCCC TCCCAGTACT ACTAGTTAAG ATCAGTGTTG GTAGGAAGAA GA.A.AATC GGG AGGGTCATGA AACACAACAT 6051 TGCAGCCGGC ATCACAATAT TGCTTACTGA TCGGAACCTA ACGTCGGCCG TAGTGTTATA ACGAATGACT AGCCTTGGAT TTGTGTTGTA 6101 TCTTTGACCC AGCAGGGGGA GGGGACCCTA TTCTCTACCA ACACCTGTTC AGAAACTGGG TCGTCCCCCT CCCCTGGGAT AAGAGATGGT TGTGGACAAG 6151 TGATTCTTCG GTCACCCAGA AGTTTACATT TTAATCCTTC CCGGTTTTGG GGCCP.►AAACC ACTAAGAAGC CAGTGGGTCT TCAAATGTAA AATTAGGAAG 6201 AATAATTTCC CATGTAGTAG CCTATTATTC TGGG GAACCATTCG TTATTAAAGG GTACATCATC GGATAATAAG ACCCTTTTTT CTTGGTAAGC 6251 GCTACATAGG AATAGTTTGA GCAATAATAG CAATTGGTCT ATTAGGCTTT CGATGTATCC TTATCAAACT CGTTATTATC GTTAACCAGA TAATCCGAAA 6301 ATTGTCTGAG CCCACCATAT ATTTACAGTA GGAATGGATG TTGATACACG TAACAGACTC GGGTGGTATA TAAATGTCAT CCTTACCTAC AACTATGTGC 6351 AGCCTACTTT ACTTCAGCAA CAATAATTAT TGCCATCCCC ACGGGTGTAA TC GGATGAA.A TGAAGTCGTT GTTATTAATA ACGGTAGGGG TGCCCACATT 6401 AAGTCTTTAG CTGATTAGCA ACCCTTCATG GAGGCTCCGT TAAATGAGAA TTCAGAAATC GACTAATCGT TGGGAAGTAC CTCCGAGGCA ATTTACTCTT 6451 ACCCCCCTAC TATGAGC TC T TGGGTTCATC TTCTTATTTA CGGTAGGAGG TGGGGGGATG ATACTCGAGA ACCCAAGTAG AAGAATAAAT GCCATCCTCC 6501 ACTAACCGGA ATTGTCCTAG CTAATTCCTC CTTAGATATC GTTCTCCACG TGATTGGCCT TAACAGGATC GATTAAGGAG GAATCTATAG CAAGAGGTGC 6551 ATACTTATTA TGTAGTAGCC CACTTCCACT ATGTCCTTTC AATAGGAGCA TATGAATAAT ACATCATCGG GTGAAGGTGA TACAGGAAAG TTATCCTCGT 6601 GTATTCGCTA TTATAGCAGG TTTTATCCAC TGATTGGCCT TAATATCTGG CATAAGCGAT AATATCGTCC AAAATAGGTG ACTAAGGGGA ATTATAGACC 6651 CTACACCCTT CATTCAACAT GAACP~PlAAAT CCAATTCGCG GTTATATTCA GATGTGGGAA GTAAGTTGTA CTTGTTTTTA GGTTAAGCGC CAATATAAGT 6701 TCGGAGTTAA CCTAACATTC TTCCCACAAC ATTTCCTAGG TCTTGCTGGG AGCCTCAATT GGATTGTAAG AAGGGTGTTG TAAAGGATCC AGAACGACCC 6751 ATAC C AC GAC GCTACTCAGA TTACCCAGAT GCTTACACCT TATGAAATAC TATGGTGCTG CGATGAGTCT AATGGGTCTA 
CGAATGTGGA ATACTTTATG 6801 GGTCTCCTCT ATCGGCTCTC TAATTTCACT TGTAGCAGTA ATTATACTCC CCAGAGGAGA TAGCCGAGAG ATTAAAGTGA ACATCGTCAT TAATATGAGG 6851 TATTCATCCT CTGAGAAGCA TTTGCCTCAA AACGAGAAGT ATTATCCATT ATAAGTAGGA GACTCTTCGT AAACGGAGTT TTGCTCTTCA TAATAGGTAA 6901 GACCTACCCC ATACAAATGT CGAATGACTC CACGGCTGCC CCCCACCCTA CTGGATGGGG TATGTTTACA GCTTACTGAG GTGCCGACGG GGGGTGGGAT 6951 CCACACATAT GAAGAACCAG CATTCGTTCA GGTTCAACGA ACTTTTTAAA GGTGTGTATA CTTCTTGGTC GTAAGCAAGT CCAAGTTGCT TGAA.A.AATTT 7001 TCAAGAA.AGG AAGGAATTGA ACCCCCATAA ATTAGTTTCA AGCCAACCAC AGTTCTTTCC TTCCTTAACT TGGGGGTATT TAATCA.AAGT TCGGTTGGTG 7051 ATCACCACTC TGTCACTTTC TTTATTAAGA TTCTAGTAAA ACACATTACA TAGTGGTGAG ACAGTGAAAG A.AATAATTC T AAGATCATTT TGTGTAATGT 7101 CTACCTTGTC AAGATAAAAT TGTGGGTTAA AATCCCACGA ATCTTAATTT GATGGAACAG TTCTATTTTA ACACCCAATT TTAGGGTGCT TAGAATTAAA 7151 GTAATGGCAC ACCCCTCACA ATTAGGATTC CAAGACGCAG CCTCCCCAGT CATTACCGTG TGGGGAGTGT TAATCCTAAG GTTCTGCGTC GGAGGGGTCA 7201 TATGGAAGAA CTTATTCATT TTCACGACCA CACACTAATA ATTGTATTCC ATACCTTCTT GAATAAGTAA AAGTGCTGGT GTGTGATTAT TAACATAAGG 7251 TAATTAGTGC ACTAGTCCTT TATGTTATTA CAGCAATAGT ATCAACA.A.AA ATTAATCACG TGATCAGGAA ATACAATAAT GTCGTTATCA TAGTTGTTTT 256

7301 CTCACA.AACA AATATATTCT TGACTCCCAA GAAATTGAAA TTGTCTGAAC GAGTGTTTGT TTATATAAGA ACTGAGGGTT CTTTAACTTT AACAGACTTG 7351 TATTCTCCCC GCCATTATTC TCATTATAAT TGCTTTACCA TCCCTACGAA ATAAGAGGGG CGGTAATAAG AGTAATATTA ACGA.A,ATGGT AGGGATGCTT 7401 TTTTATACCT CATAGACGAA ATTAATGACC CTCACCTAAC CATTA.AAGC T A.AA.ATATGGA GTATCTGCTT TAATTACTGG GAGTGGATTG GTAATTTCGA 7451 ATAGGCCATC AGTGATACTG AAGTTATGAA TACACAGATT ATGAAGACTT TATCCGGTAG TCACTATGAC TTCAATACTT ATGTGTCTAA TACTTCTGAA 7501 AGGATTCGAC TCCTACATAA TCCAAACTCA AGACTTAACC CCAGGCCAAT TCCTAAGCTG AGGATGTATT AGGTTTGAGT TCTGAATTGG GGTCCGGTTA 7551 TTCGTTTATT AGAAACGGAT CATCGAATAG TTGTTCCCAT GGAGTCGCCT AAGCAAATAA TCTTTGCCTA GTAGCTTATC AACAAGGGTA CCTCAGCGGA 7601 GTTCGAGTAT TAGTGTCAGC AGAAGATGTT CTACATTCAT GAGCTGTACC CAAGCTCATA ATCACAGTCG TCTTCTACAA GATGTAAGTA CTCGACATGG 7651 GGCCCTAGGA GTTA.AAATAG ATGCCGTCCC AGGACGTTTA AACCAGACCG CCGGGATCCT CAATTTTATC TACGGCAGGG TCCTGCAAAT TTGGTCTGGC 7701 CCTTTATTAT CTCCCGACCA GGTGTTTATT ACGGCCAATG TTCAGAAATT GGAA.ATAATA GAGGGCTGGT CCACAAATAA TGCCGGTTAC AAGTCTTTAA 7751 TGTGGTGCCA ACCACAGTTT TATACCCATT GTCGTAGAAG CAGTTCCTTT ACACCACGGT TGGTGTCA.AA ATATGGGTAA CAGCATCTTC GTCAAGGA.AA 7801 AGAACACTTT GAGGCCTGAT CTTCATCAAT ACTAGAAGAA GCCTCACTAA TCTTGTGAAA CTCCGGACTA GAAGTAGTTA TGATCTTCTT CGGAGTGATT 7851 GAAGCTAAAT TGGGCCTAGC GTTAGCCTTT TAAGCTAA.AA ACTGGTGATT CTTCGATTTA ACCCGGATCG CAATCGGAAA ATTCGATTTT TGACCACTAA 7901 CCCTACCACC CTTAGTGATA TGCCTCAACT TAATCCGCAC CCTTGATTCA GGGATGGTGG GAATCACTAT ACGGAGTTGA ATTAGGCGTG GGAACTAAGT 7951 TTATTCTCCT ATTCTCATGG ATAATTTTCC TTGTTATCTT GCC AATAAGAGGA TAAGAGTACC TATTAA.AAGG AACAATAGAA CGGTTTTTTT 8001 GTAATAAATC ATACATTCAA CAATGACCCT ACATTP~~AAA GCACTG~.AAA CATTATTTAG TATGTAAGTT GTTACTGGGA TGTAATTTTT CGTGACTTTT 8051 ATC TA.AAC C T GAGCCCTGAA ACTGACCATG ATCATAAGCT TTTTCGACCA TAGATTTGGA CTCGGGACTT TGACTGGTAC TAGTATTCGA A,AAAGCTGGT 8101 ATTCTTAAGC CCCTCTTTCC TTGGAATCCC ATTAATTGCT CTAGCAATTA TAAGAATTCG GGGAGAAAGG AACCTTAGGG TAATTAACGA GATCGTTAAT 8151 CATTACCATG ACTAACCTTC 
CCAACTCCAA CTAATCGATG GCTCAATAAC GTAATGGTAC TGATTGGAAG GGTTGAGGTT GATTAGCTAC CGAGTTATTG 8201 CGATTAATAG CCCTCCAAGG TTGATTTATC AACCGATTCA TTTACCAACT GCTAATTATC GGGAGGTTCC AAC TAA.ATAG TTGGCTAAGT A.AATGGTTGA 8251 CATACAACCC ATTAATTTCG CAGGCCACAA GTGAGCCATG TTATTTACAG GTATGTTGGG TAATTAAAGC GTCCGGTGTT CACTCGGTAC AATA.AATGTC 8301 CATTAATACT ATTCCTAATC ACTATTAATT TGCTAGGACT CCTACCTTAT GTAATTATGA TAAGGATTAG TGATAATTAA ACGATCCTGA GGATGGAATA 8351 ACCTTTACGC CCACAACCCA ACTTTCCCTT AATATAGCTC TTGCCCTCCC TGGA.A.ATGC G GGTGTTGGGT TGAA.AGGGAA TTATATCGAG AACGGGAGGG 8401 CCTATGACTC ACTACCGTGT TAATTGGTGT ACTAAACCGA CCAACAATTG GGATACTGAG TGATGGCACA ATTAACCACA TGATTTGGCT GGTTGTTAAC 8451 CACTAAGCCA CTTCCTACCA GAAGGAACCC CCACCCCTCT AGTACCCATC GTGATTCGGT GAAGGATGGT CTTCCTTGGG GGTGGGGAGA TCATGGGTAG 8501 TTAATTATTA TTGAAACTAT TAGCCTATTT ATTCGACCAT TAGCATTAGG AATTAATAAT AACTTTGATA ATCGGATAAA TAAGCTGGTA ATCGTAATCC 8551 AGTCCGATTG ACTGCTAATT TAACAGCTGG TCACTTACTA ATACAATTAA TCAGGCTAAC TGACGATTAA ATTGTCGACC AGTGAATGAT TATGTTAATT 8601 CTGCAACTGC AGCCTTCGCC CTTATCACCA TCATGCCAAC CGTGGCATTA GACGTTGACG TCGGAAGCGG GAATAGTGGT AGTACGGTTG GCACCGTAAT 8651 CTTACATCAA CTATTCTATT TTTATTAACA ACCCTAGAAG TTGCCGTAGC 257

GAATGTAGTT GATAAGATAA AAATAATTGT TGGGATCTTC AACGGCATCG 8701 AATAATTCAA GCATATGTAT TTGTCCTCCT ACTAAGCCTT TATTTACAAG ATA.AATGTTC TTATTAAGTT CGTATACATA AACAGGAGGA TGATTCGGAA 8751 AAAATGTCTA ATGGCTCACC AAGCACACGC ATATCATATA GTTGACCCCA TTTTACAGAT TACCGAGTGG TTCGTGTGCG TATAGTATAT CAACTGGGGT 8801 GCCCATGACC ATTAACCGGA GCTACGGCCG CCCTTCTAAT AACATCCGGG CGGGTACTGG TAATTGGCCT CGATGCCGGC GGGAAGATTA TTGTAGGCCC 8851 TTGGCCATCT GATTTCACTT CCACTCATTA TCCCTTCTCT ATTTAGGATT AACCGGTAGA CTAAAGTGAA GGTGAGTAAT AGGGAAGAGA TAAATCCTAA 8901 AACCCTTCTT CTACTAACCA TAATCCAATG ATGACGCGAT ATTATCCGAG TTGGGAAGAA GATGATTGGT ATTAGGTTAC TACTGCGCTA TAATAGGCTC CCGTGCA.AAA 8951 AAGGGACATT TCAAGGTCAT CACACACCCC AGGCCTTCGC TTCCCTGTAA AGTTCCAGTA GTGTGTGGGG GGCACGTTTT TCCGGAAGCG 9001 TATGGCATAA TCTTATTCAT CACATCAGAA GTATTCTTCT TCTTAGGCTT ATACCGTATT AGAATAAGTA GTGTAGTCTT CATAAGAAGA AGAATCCGAA 9051 TTTCTGAGCC TTTTACCACT CAAGCCTTGC CCCAACCCCA GAACTAGGAG AAAGACTCGG AAAATGGTGA GTTCGGAACG GGGTTGGGGT CTTGATCCTC 9101 GATGTTGACC ACCTACAGGA ATTAACCCCC TAGACCCATT TGAAGTCCCA CTACAACTGG TGGATGTCCT TAATTGGGGG ATCTGGGTAA ACTTCAGGGT 9151 CTTCTAAATA CCGCAGTACT CTTAGCTTCT GGTGTGACAG TAACCTGAGC GAAGATTTAT GGCGTCATGA GAATCGAAGA CCACACTGTC ATTGGACTCG 9201 CCATCATAGC TTAATAGAAG GTAACCGAAA AGAAGCTATT CAAGCCCTTA GGTAGTATCG AATTATCTTC CATTGGCTTT TCTTCGATAA GTTCGGGAAT 9251 CTCTTACTAT TATTTTAGGA TTTTATTTCA CAGCCCTCCA AGCCATAGAA GAGAATGATA ATP.~AAATC C T AAAATAAAGT GTCGGGAGGT TCGGTATCTT 9301 TATTATGAAG CACCTTTCAC AATTGCTGAC GGGGTCTATG GCACAACATT ATAATACTTC GTGGAAAGTG TTAACGACTG CCCCAGATAC CGTGTTGTAA 9351 CTTCGTTGCC ACAGGGTTCC ACGGCCTTCA TGTCATCATT GGCTCAACAT GAAGCAACGG TGTCCCAAGG TGCCGGA.AGT ACAGTAGTAA CCGAGTTGTA 9401 TCTTAACAAT CTGCCTACTA CGACAGATTC AATACCACTT TACATCCGAA AGAATTGTTA GACGGATGAT GCTGTCTAAG TTATGGTGAA ATGTAGGCTT 9451 CACCACTTTG GCTTCGAAGC TGCTGCATGA TACTGACATT TTGTAGACGT GTGGTGAAAC CGAAGCTTCG ACGACGTACT ATGACTGTAA AACATCTGCA 9501 AGTATGATTA TTTCTTTATG TATCCATCTA TTGATGAGGC TCATAATTAC TCATACTAAT AAAGAAATAC ATAGGTAGAT 
AACTACTCCG AGTATTAATG 9551 TTTTCTAGTA TAGACTAGTA CA.AATGATTT CCAATCATTT AATCTTGGTT A.AAAGATCAT ATCTGATCAT GTTTACTAA.A GGTTAGTAAA TTAGAACCAA 9601 AA.AATCCAAG GAAAAGTAAT GAGCCTCATT ACGTCTTCTG TCGCGGCTAC TTTTAGGTTC CTTTTCATTA CTCGGAGTAA TGCAGAAGAC AGCGCCGATG 9651 GGCCCTGATT TCTCTAATCC TTGTATTTAT TACATTCTGA CTTCCATCAC CCGGGACTAA AGAGATTAGG AACATAAATA ATGTAAGACT GAAGGTAGTG 9701 TTAGCCCAGA TAATGp~~AAA CTATCCCCCT ATGAATGTGG CTTCGACCCC AATCGGGTCT ATTACTTTTT GATAGGGGGA TACTTACACC GAAGCTGGGG 9751 TTAGGAAGTG CACGTCTCCC ATTTTCCCTA CGCTTCTTCC TTGTAGCTAT AATCCTTCAC GTGCAGAGGG TAAAAGGGAT GCGAAGAAGG AACATCGATA 9801 CTTATTCCTG TTATTTGACT TAGAAATCGC TCTTCTTCTC CCTTTACCGT GAATAAGGAC AATAAACTGA ATCTTTAGCG AGAAGAAGAG GGAAATGGCA 9851 GGGGCGATCA ACTACTATCA CCACTCTCCA CATTACTCTG AGCAGCAACC CCCCGCTAGT TGATGATAGT GGTGAGAGGT GTAATGAGAC TCGTCGTTGG 9901 ATCCTAATTC TATTAACCCT AGGCCTTATC TATGAATGAC TTCAAGGAGG TAGGATTAAG ATAATTGGGA TCCGGAATAG ATACTTACTG AAGTTCCTCC 9951 ATTAGAATGA GCAGAATGGG TGTTTAGTCT AAACAAGACC ACTAATTTCG TAATCTTACT CGTCTTACCC ACAA.ATCAGA TTTGTTCTGG TGATTA.AAGC 10001 ACTTAGTAGA TTATAGTGAA AATC C ATA.AA CACCTTATGT CTTCTATATA TGAATCATCT AATATCACTT TTAGGTATTT GTGGAATACA GAAGATATAT 258

10051 TTTCAGCCTT AACTCAGCAT TTATTTTAGG CCTCATGGGT CTTGCACTTA AAAGTCGGAA TTGAGTCGTA AATAAAATCC GGAGTACCCA GAACGTGAAT 10101 ACCGCCACCA CCTTTTATCT GCACTCTTAT GTTTAGAAAG TATATTATTA TGGCGGTGGT GGAA.AATAGA CGTGAGAATA CAAATCTTTC ATATAATAAT 10151 ACCCTATTTA TTACCATCGC CATCTGAACT TTAACATTAA ACTCTGCCTC TGGGATA.AAT AATGGTAGCG GTAGACTTGA AATTGTAATT TGAGACGGAG 10201 ATCCTCAATT ATTCCCATAA TTCTCCTCAC ATTTTCAGCC TGTGAGGCCA TAGGAGTTAA TAAGGGTATT AAGAGGAGTG TAAAAGTCGG ACACTCCGGT 10251 GCGCTGGCCT AGCCATCCTT GTAGCCACTT CACGTTCTCA CGGGTCTGAC CGCGACCGGA TCGGTAGGAA CATCGGTGAA GTGCAAGAGT GCCCAGACTG 10301 AATTTACAA.A ACCTAAACCT CCTTCAATGC TAA.A.AATC C T AATTCCAACA TTAAATGTTT TGGATTTGGA GGAAGTTACG ATTTTTAGGA TTAAGGTTGT 10351- ATTATGTTGT TTCCAACCAC ATGAGTCATT AAC T GACTGTGACC TAATACAACA AAGGTTGGTG TACTCAGTAA TTGTTTTTTA CTGACACTGG 10401 CATAACTACT ACCTACAGCC TTATAATCGC ACTACTAAGT TTACTTTGAT GTATTGATGA TGGATGTCGG AATATTAGCG TGATGATTCA AATGAAACTA 10451 TTAAATGAAA CATAGATATT GGCTGAGATT TTTCCAATCA ATTTATAGCT AATTTACTTT GTATCTATAA CCGACTCTAA AAAGGTTAGT TAAATATCGA 10501 ATTGACCCTT TATCATCCCC ATTACTTATT CTCACATGCT GACTCCTCCC TAACTGGGAA ATAGTAGGGG TAATGAATAA GAGTGTACGA CTGAGGAGGG 10551 ATTAATAATT CTAGCCAGCC AGAACCATAT CTCCCCCGAG CCCATCATCC TAATTATTAA GATCGGTCGG TCTTGGTATA GAGGGGGCTC GGGTAGTAGG 10601 GACAACGTAC ATATATTACA CTTCTAATTT CTCTCCAAGC CTTCCTTATT CTGTTGCATG TATATAATGT GAAGATTAAA GAGAGGTTCG GAAGGAATAA 10651 ATGGCATTCT CCGCAACCGA AATGATTATA TTTTACATTA TATTTGAAGC TACCGTAAGA GGCGTTGGCT TTACTAATAT A.A.AATGTAAT ATAAACTTCG 10701 CACGCTTATC CCCACACTCA TTATTATCAC ACGATGAGGA AACCA.AACAG GTGCGAATAG GGGTGTGAGT AATAATAGTG TGCTACTCCT TTGGTTTGTC 10751 AAC GC TTA.AA CGCGGGCACC TATTTTCTAT TTTATACTTT AATTGGCTCC TTGCGAATTT GCGCCCGTGG ATA.AA.AGATA A.AATATGA.AA TTAACCGAGG 10801 CTCCCCCTTC TCATCGCTCT TTTACTTATA CAAA.ATAATC TGGGCACCTT GAGGGGGAAG AGTAGCGAGA AAATGAATAT GTTTTATTAG ACCCGTGGAA 10851 GTCCATAATT ATCATACAAC ACTCGCAACT TCCAA.ACCTA ACATCATGAG CAGGTATTAA TAGTATGTTG TGAGCGTTGA AGGTTTGGAT TGTAGTACTC 10901 
CGGACAATCT ATGATGACTA GCCTGCCTCC TCGCTTTCCT TGTTP.~AA,ATA GCCTGTTAGA TACTACTGAT CGGACGGAGG AGCGAAAGGA ACAATTTTAT 10951 CCCCTATATG GAATCCACCT CTGATTACCC AAAGCCCACG TTGAAGCTCC GGGGATATAC CTTAGGTGGA GACTAATGGG TTTCGGGTGC AACTTCGAGG 11001 AATCGCTGGT TCAATAATTT TAGCCGCTGT ACTACTCA.AA CTAGGGGGTT TTAGCGACCA AGTTATTAAA ATCGGCGACA TGATGAGTTT GATCCCCCAA 11051 ATGGAATAAT ACGAATCATC GTAATGCTAA ACCCACTAAC CAAAGAAATA TACCTTATTA TGCTTAGTAG CATTACGATT TGGGTGATTG GTTTCTTTAT 11101 GCCTACCCAT TCCTAATCCT AGCTATTTGA GGAATTATCA TAACCAGCTC CGGATGGGTA AGGATTAGGA TCGATAAACT CCTTAATAGT ATTGGTCGAG 11151 TATTTGCCTT CGACAGACAG ATCTTAAATC CCTCATTGCC TACTCATCAG ATAAACGGAA GCTGTCTGTC TAGAATTTAG GGAGTAACGG ATGAGTAGTC 11201 TAAGCCACAT AGGCCTAGTC GCAGGGGCGA TCCTTATCCA AACGCCATGA ATTCGGTGTA TCCGGATCAG CGTCCCCGCT AGGAATAGGT TTGCGGTACT 11251 AGCTTCGCAG GAGCAATCAC ATTAATAATT GCCCATGGCT TAATTTCATC TCGAAGCGTC CTCGTTAGTG TAATTATTAA CGGGTACCGA ATTAAAGTAG 11301 AGCCCTATTC TGCTTAGCCA ACACTAACTA CGAGCGAATC CACAGTCGAA TCGGGATAAG ACGAATCGGT TGTGATTGAT GCTCGCTTAG GTGTCAGCTT 11351 CTATACTCCT AGCCCGAGGC ATACAAATTG TCCTTCCACT TATGGCGACC GATATGAGGA TCGGGCTCCG TATGTTTAAC AGGAAGGTGA ATACCGCTGG 11401 TGATGATTCT TTGCTAGCCT AGCTAACCTC GCCTTACCAC CAACCCCTAA 259

ACTACTAAGA AACGATCGGA TCGATTGGAG CGGAATGGTG GTTGGGGATT 11451 TCTTATAGGG GAACTCCTTA TTGTCACCTC ACTATTCAAC TGATCCAACT AGAATATCCC CTTGAGGAAT AACAGTGGAG TGATAAGTTG ACTAGGTTGA 11501 GAACCATAAT CCTATCAGGC CTTGGAATAT TAATCACAGC CTCCTACTCC CTTGGTATTA GGATAGTCCG GAACCTTATA ATTAGTGTCG GAGGATGAGG 11551 CTTTATATAT TCTTAATAAC CCAACATGGC CCAACCCCCC ACCATATTTT GAAATATATA AGAATTATTG GGTTGTACCG GGTTGGGGGG TGGTATAAAA 11601 ATCATT.A.AAC CCTAACCACA CACGAGAACA TCTTCTCCTA AGCCTCCACC TAGTAATTTG GGATTGGTGT GTGCTCTTGT AGAAGAGGAT TCGGAGGTGG 11651 TTATTCCAGT CCTACTCCTA ATATTCAAGC CAGAACTCAT CTGAGGATGG AATAAGGTCA GGATGAGGAT TATAAGTTCG GTCTTGAGTA GACTCCTACC TP~~AAA 11701 ACACTTTGTA TTTATAGTTT AACCAAAACA TTAGATTGTG GTTC TGTGAAACAT AAATATCAAA TTGGTTTTGT AATCTAACAC CAAGATTTTT 11751 CAAA.AGTTAA AACCTTTTTA ATTACCGAGA GAGGTCCAGG ACACGAAGAA GTTTTCAATT TTGGP.~~A.AAT TAATGGCTCT CTCCAGGTCC TGTGCTTCTT 11801 CTGCTAATTC TTCCTATCAT GGCTCAAATC CATGGCTCAC TCAGCCTCTG GACGATTAAG AAGGATAGTA CCGAGTTTAG GTACCGAGTG AGTCGGAGAC CP.►~~AAATT 11851 AAAGATAATA GCTATCTATT GGTCTTAGGA AC CTTGATGCAA TTTCTATTAT CGATAGATAA CCAGAATCCT TGGTTTTTAA GAACTACGTT 11901 CTCCAAGCAG AAGCTATGAA CACCATTTTT AATTCATCAT TCCTCCTAAT GAGGTTCGTC TTCGATACTT GTGGTP.►~~AAA TTAAGTAGTA AGGAGGATTA 11951 CTTTATTATC CTTATTTTTC CACTAATAAC CTCACTAAGC CCTAAAGAAC GAAATAATAG GAATAAAAAG GTGATTATTG GAGTGATTCG GGATTTCTTG CA,AAACCTCC 12001 TTAACCCTAA TTGATCCTCA TCCCATGTAA AAATAGCCAT AATTGGGATT AACTAGGAGT AGGGTACATT TTTATCGGTA GTTTTGGAGG 12051 TTCTTTATTA GCCTTGTCCC CCTATTCATT TTCCTAGACC AAGGCTTAGA AAGAAATAAT CGGAACAGGG GGATAAGTAA AAGGATCTGG TTCCGAATCT 12101 GTCAACTATA ACA.AATTTTA ACTGAATGAA CATTGGACCA TTTGACATTA AA.ACTGTAAT CAGTTGATAT TGTTTP~A.AAT TGACTTACTT GTAACCTGGT 12151 ACATAAGCTT CA.AATTTGAT CTATACTCAA CTGTATTTAC CCCAGTAGCC TGTATTCGAA GTTTAAACTA GATATGAGTT GACATAAATG GGGTCATCGG 12201 CTTTATGTCA CTTGATCTAT CCTTGAATTC GCCCTATGAT ACATACACTC GAAATACAGT GAACTAGATA GGAACTTAAG CGGGATACTA TGTATGTGAG 12251 TGACCCA.AAC ATCAACCGCT TCTTTAAGTA TCTGTTACTT TTCCTAATCT 
ACTGGGTTTG TAGTTGGCGA AGAAATTCAT AGACAATGAA AAGGATTAGA 12301 CCATAATTAT CCTAGTCACT GCCAACAACA TATTTCAATT ATTTATTGGA T GGTATTAATA GGATCAGTGA CGGTTGTTGT ATAAAGTTAA TA.AATAAC C 12351 TGGGAGGGGG TCGGAATTAT ATCCTTCCTC CTCATTGGCT GATGACACAG ACCCTCCCCC AGCCTTAATA TAGGAAGGAG GAGTAACCGA CTACTGTGTC 12401 CCGAACAGAC GCCAACACAG CCGCTCTCCA GGCTGTAATT TATAATCGAG GGCTTGTCTG CGGTTGTGTC GGCGAGAGGT CCGACATTAA ATATTAGCTC 12451 TAGGAGATAT TGGACTAATT CTGAGGATGG CCTGATTAGC CATA.AACTTG ATCCTCTATA ACCTGATTAA GAGTCGTACC GGACTAATCG GTATTTGAAC 12501 AATTCGTGAG AAATTCAACA ATTATTTACC CTATCCAAAA ACACAGACTT TTAAGCACTC TTTAAGTTGT TAATA.AATGG GATAGGTTTT TGTGTCTGAA 12551 AACTCTACCT CTCCTTGGCC TTGTCCTAGC AGCAGCTGGA AAATCTGCAC TTGAGATGGA GAGGAACCGG AACAGGATCG TCGTCGACCT TTTAGACGTG 12601 AATTCGGCCT CCACCCCTGA CTTCCTTCTG CTATAGAAGG ACCAACGCCA TTAAGCCGGA GGTGGGGACT GAAGGAAGAC GATATCTTCC TGGTTGCGGT 12651 GTTTCCGCCC TACTCCATTC TAGTACAATA GTCGTTGCCG GCGTCTTCCT CAA.AGGCGGG ATGAGGTAAG ATCATGTTAT CAGCAACGGC CGCAGAAGGA 12701 GCTAATTCGC CTCCACCCAT TAATTCAAGA CAATCAACTT ATCCTAACAA CGATTAAGCG GAGGTGGGTA ATTAAGTTCT GTTAGTTGAA TAGGATTGTT 12751 CATGCCTATG TCTAGGAGCA ATAACTACCC TCTTTACCGC AGCATGCGCA GTACGGATAC AGATCCTCGT TATTGATGGG AGAAATGGCG TCGTACGCGT 260

12801 CTTACCCAAA ACGATATTAA P►~~AAATCATC GCCTTCTCAA CATCAAGCCA GAATGGGTTT TGCTATAATT TTTTTAGTAG CGGAAGAGTT GTAGTTCGGT 12$51 ACTCGGATTA ATAATAGTAA CAATCGGCCT TAACCAGCCT CAACTTGCCT TGAGCCTAAT TATTATCATT GTTAGCCGGA ATTGGTCGGA GTTGAACGGA 12901 TCCTCCATAT CTGCACCCAC GCCTTTTTTA AGGCTATGCT TTTTCTCTGC AGGAGGTATA GACGTGGGTG CGG T TCCGATACGA AAAAGAGACG GCAA.AAT 12951 TCAGGGTCTA TTATTCACAG CCTCAACGAT GAACAGGATA TC C AGTCCCAGAT AATAAGTGTC GGAGTTGCTA CTTGTCCTAT AGGCGTTTTA 13001 AGGGGGACTC CACAAACTCT TACCATTTAC CTCATCCTCC CTAACTATTG TCCCCCTGAG GTGTTTGAGA ATGGTAAATG GAGTAGGAGG GATTGATAAC TCGAA.A 13051 GAAGCCTAGC TCTTACAGGC ATGCCATTCC TATCAGGTTT C TTC CTTCGGATCG AGAATGTCCG TACGGTAAGG ATAGTCCAAA GAAGAGCTTT 13101 GACGCCATCA TTGAATCCAT A.AACAC TTC T CACCTAAACG CCTGAGCCCT CTGCGGTAGT AACTTAGGTA TTTGTGAAGA GTGGATTTGC GGACTCGGGA 13151 A.ATC C TCAC C CTTGTTGCAA CATCATTTAC AGCTATTTAC AGCCTCCGCC TTAGGAGTGG GAACAACGTT GTAGTA.AATG TCGATAAATG TCGGAGGCGG ACTCTCACCT 13201 TTATTTTCCT CACACTAATA AACTTTCCCC GATTCAATTC AATAAAAGGA GTATGATTAT TTGAAAGGGG CTAAGTTAAG TGAGAGTGGA 13251 ATTAATGA.AA ATAACCCAAT AGTGATCAAC CCAATCAAAC GTCTAGCTTA TAATTACTTT TATTGGGTTA TCACTAGTTG GGTTAGTTTG CAGATCGAAT 13301 CGGAAGCATC CTAGCTGGCC TCATCATTAC ATTAAACCTA ACTCCTACAA GCCTTCGTAG GATCGACCGG AGTAGTAATG TAATTTGGAT TGAGGATGTT 13351 AA.AC C CAAAT CATAACAATA CCTCCCCTAT TAAAACTCTC CGCCCTATTA TTTGGGTTTA GTATTGTTAT GGAGGGGATA ATTTTGAGAG GCGGGATAAT 13401 GTAACAATTG CTGGTCTCCT GCTGGCCTTA GAACTAGCCA GCTTAACCAA CATTGTTAAC GACCAGAGGA CGACCGGAAT CTTGATCGGT CGAATTGGTT 13451 TACTCAACTT A.A.AACAAAC C CCATCCTTTA TACCCATCAC TTCTCTAACA ATGAGTTGAA TTTTGTTTGG GGTAGGAAAT ATGGGTAGTG AAGAGATTGT P~~AAATCAAT 13501 TACTTGGATA TTTCCCACAA ATTATTCACC GACTCCTACC ATGAACCTAT AAAGGGTGTT TAATAAGTGG CTGAGGATGG TTTTTAGTTA 13551 TTAACCTGAG CCCAGCATAC CTCAACCCAC CTGATTGACC A.AACATGAAA AATTGGACTC GGGTCGTATG GAGTTGGGTG GACTAACTGG TTTGTACTTT 13601 TG TT GGGCCP►AAAA GTATTGTTAT CCAACAA.ACC CCCCTAATCA ACTTTTTTAA CCCGGTTTTT CATAAGAATA GGTTGTTTGG GGGGATTAGT 13651 
AATTATCTAC TCAACCTCAA CAAGGCTACA TTA.AAGTTTA CCTAATACTG TTAATAGATG AGTTGGAGTT GTTCCGATGT AATTTCAAAT GGATTATGAC 13701 CTATTTCTTA CATTAACTCT AGCCCTGCTC ACTACATTAA CCTAACCACA GATA~AGAAT GTAATTGAGA TCGGGACGAG TGATGTAATT GGATTGGTGT 13751 CGCAAAGTCC CCCAAGATAG ACCCCGAGTT AATTCCAGCA CCACAAACAA GCGTTTCAGG GGGTTCTATC TGGGGCTCAA TTAAGGTCGT GGTGTTTGTT 13801 AGTTAACAAT AATATCCACC CACTTAAAAC CAACAATCAT CCACCATCAG TCAATTGTTA TTATAGGTGG GTGAATTTTG GTTGTTAGTA GGTGGTAGTC 13851 CATATAATAA AGCCACCCCA ACAA.AATC TC CACGAACTAT CTCCAAGCCA GTATATTATT TCGGTGGGGT TGTTTTAGAG GTGCTTGATA GAGGTTCGGT 13901 CCTAACTCCT CTACCCCAGC TCAACTTAAC TCATACCACT TAACCATAAA GGATTGAGGA GATGGGGTCG AGTTGAATTG AGTATGGTGA ATTGGTATTT 13951 ATACTTACCA GCAAGAACTA AAGTCACTAA GTP.►AAAACCA ACGTACAATA TATGAATGGT CGTTCTTGAT TTCAGTGATT CATTTTTGGT TGCATGTTAT 14001 ACACAGACCA ACTACCTCAT GACTCAGGAT AAGGCTCAGC AGCAAGCGCC TGTGTCTGGT TGATGGAGTA CTGAGTCCTA TTCCGAGTCG TCGTTCGCGG 14051 GCCGTATAAG CAAATACTAC CAACATTCCC CCCAAATAAA TTAGAAATAA CGGCATATTC GTTTATGATG GTTGTAAGGG GGGTTTATTT AATCTTTATT 14101 GACCAAGGAT GACC CACCGTGCCC CACCAATAAC CCACACCCGA CTGGTTCCTA TTTTTTCTGG GTGGCACGGG GTGGTTATTG GGTGTGGGCT 14151 CCCCAGCAGC CACGACCAA.A CCCAATGCAG CATAATAAGG AGAAGGATTA 261

GGGGTCGTCG GTGCTGGTTT GGGTTACGTC GTATTATTCC TCTTCCTAAT 14201 GATGCTACTC CCATCAAACC TA.A.AAC CA.AA CAAACTATTA TTP.~~AA.ACAT CTACGATGAG GGTAGTTTGG ATTTTGGTTT GTTTGATAAT AATTTTTGTA 14251 A.AAATATAC C ATCATTCCTA CCTGGACTTT AACCAAGACC AATAACTTGA TTTTATATGG TAGTAAGGAT GGACCTGAAA TTGGTTCTGG TTATTGAACT 14301 AAA.AC TATC G TTGTTAATTC AACTATAAGA ATTTATGGCC ACAAACATCC TTTTGATAGC AACAATTAAG TTGATATTCT TAAATACCGG TGTTTGTAGG 14351 GA.P.~AA.A000A CCCACTACTA A.AAATTATCA ACCAAGCCCT AATTGACCTC CTTTTTGGGT GGGTGATGAT TTTTAATAGT TGGTTCGGGA TTAACTGGAG 14401 CCAACTCCAT CAAACATTTC CATCTGATGA AACTTCGGCT CACTTTTAGG GGTTGAGGTA GTTTGTAAAG GTAGACTACT TTGAAGCCGA GTGAA.AATC C 14451 ATTATGTTTA CTTATCCAAA TCATTACAGG ACTTTTCTTA GCAATACATT TAATACAAAT GAATAGGTTT AGTAATGTCC TGAAAAGAAT CGTTATGTAA 14501 ACACCGCAGA CGTCTCCCTA GCCTTCTCCT CAGTAGTCCA TATTTGTCGT TGTGGCGTCT GCAGAGGGAT CGGAAGAGGA GTCATCAGGT ATA.AACAGCA 14551 GACGTTAACT ACGGCTGGCT TATTCGCAAT ATCCACGCCA ATGGGGCCTC CTGCAATTGA TGCCGACCGA ATAAGCGTTA TAGGTGCGGT TACCCCGGAG 14601 ATTATTTTTC GTCTGCATTT ACTTTCACAT CGCCCGTGGA CTATACTACG TAATP►~~AAAG CAGACGTAA.A TGAAAGTGTA GCGGGCACCT GATATGATGC 14651 GCTCCTACCT CTACAAAGAA AC AT GA.AATA TTGGAGTAAT TCTATTATTC CGAGGATGGA GATGTTTCTT TGTACTTTAT AACCTCATTA AGATAATAAG 14701 CTACTTATGG CCACAGCCTT CGTAGGCTAT GTTTTGCCAT GAGGACAAAT GATGAATACC GGTGTCGGAA GCATCCGATA CAAAACGGTA CTCCTGTTTA 14751 ATCCTTCTGA GCTGCTACAG TCATCACCAA CCTTCTCTCC GCCTTTCCTT TAGGAAGACT CGACGATGTC AGTAGTGGTT GGAAGAGAGG CGGAAAGGAA 14801 ACATTGGAGA TACATTAGTC CAATGAATCT GAGGCGGCTT CTCAATCGAC TGTAACCTCT ATGTAATCAG GTTACTTAGA CTCCGCCGAA GAGTTAGCTG 14851 AACGCCACCC TAACACGATT CTTCACACTC CACTTCCTTC TCCCCTTCCT TTGCGGTGGG ATTGTGCTAA GAAGTGTGAG GTGAAGGAAG AGGGGAAGGA 14901 AATCACTGCA TTAATAATCA TCCATGTTCT CTTCCTACAT GAAACAGGCT TTAGTGACGT AATTATTAGT AGGTACAAGA GAAGGATGTA CTTTGTCCGA 14951 CAAATAACCC CATAGGTCTA AATTCTGACA TAGATP~AAAT CTCCTTCCAC GTTTATTGGG GTATCCAGAT TTAAGACTGT ATCTATTTTA GAGGAAGGTG 15001 CCCTACTTTT C C TAC A.AAGA CATACTTGGC TTCTTCACCT 
TAATCATCCT GGGATGA►AAA GGATGTTTCT GTATGAACCG AAGAAGTGGA ATTAGTAGGA 15051 TCTAGGCATC CTAACCCTAC TCCTCCCCAA CCTCCTAGGA GACACCGAAA AGATCCGTAG GATTGGGATG AGGAGGGGTT GGAGGATCCT CTGTGGCTTT 15101 ACTTCATCCC CGCTAACCCT CTCGTCACCC CTCCCCACAT CAAACCCGAA TGAAGTAGGG GCGATTGGGA GAGCAGTGGG GAGGGGTGTA GTTTGGGCTT 15151 TGGTACTTCC TGTTCGCTTA TGCCATCCTC CGGTCCATCC CCAATAAGCT ACCATGAAGG ACAAGCGAAT ACGGTAGGAG GCCAGGTAGG GGTTATTCGA 15201 AGGAGGGGTC TTAGCCCTCC TGTTCTCCAT TCTCATCCTC ATACTAGTTC TCCTCCCCAG AATCGGGAGG ACAAGAGGTA AGAGTAGGAG TATGATCAAG 15251 CCCTCCTCCA CACTTCTAAA CAACGAAGCA GTACCTTTCG TCCACTCACA GGGAGGAGGT GTGAAGATTT GTTGCTTCGT CATGGAAAGC AGGTGAGTGT 15301 CA.AATTTTCT TCTGAGTCCT AATAGCCGAT ATACTAATCC TAACCTGAAT GTTT~GA AGACTCAGGA TTATCGGCTA TATGATTAGG ATTGGACTTA 15351 CGGGGGACAA CCAGTCGAAC AACCGTTCAT CTTAATCGGA CA.AATTGCAT GCCCCCTGTT GGTCAGCTTG TTGGCAAGTA GAATTAGCCT GTTTAACGTA 15401 CTATTACCTA CTTCTCTCTA TTCCTCATTG TGATCCCACT CACAGGCTGA GATAATGGAT GAAGAGAGAT AAGGAGTAAC ACTAGGGTGA GTGTCCGACT 15451 TGAGA,,AA,ACA AAATCCTCAA C C TA.AAC TAG TTCTGGTAGC TTAACTTAAA ACTCTTTTGT TTTAGGAGTT GGATTTGATC AAGACCATCG AATTGAATTT 15501 GCGTCGGCCT TGTAAGCCGA AGACCGGAGG TTTAAACCCT C C C CA~P.~AACA CGCAGCCGGA ACATTCGGCT TCTGGCCTCC AAATTTGGGA GGGGTTTTGT 262

15551 CATCAGGGGA AGGAGAGTTA AACTCCTGCC CTTGGCTCCC AAAGCCAAGA GTAGTCCCCT TCCTCTCAAT TTGAGGACGG GAACCGAGGG TTTCGGTTCT 15601 TTCTGCCCAA ACTGCCCCCT GAATGCCATA AAAGCATGAA AACAAATGTC AAGACGGGTT TGACGGGGGA CTTACGGTAT TTTCGTACTT TTGTTTACAG 15651 CATTTGGTTT CP.~~AA.AGTTA GTCAGTCTGA CATATTAATG ACATGGCCCA GTAAAC C AA.A GTTTTTCAAT CAGTCAGACT GTATAATTAC TGTACCGGGT 15701 CATACATTAA TATCAAGCAC ATTACTCATC TCGACTACAT CACATTAATT GTATGTAATT ATAGTTCGTG TAATGAGTAG AGCTGATGTA GTGTAATTAA 15751 GCTAGTCCCC TACTGATATC ACACTCTATG TATAATCCCC ATTAATTTAT CGATCAGGGG ATGACTATAG TGTGAGATAC ATATTAGGGG TAATTAAATA 15801 ATTCCCCTAT ACTATAACAT ACTATGCTTA ATACTCATTA ATATACTATC TAAGGGGATA TGATATTGTA TGATACGAAT TATGAGTAAT TATATGATAG 15851 CACTATTTCA TTACATTCTA TCCTTTATTC CTCACTATAT TAAAATCA.AA GTGATAAAGT AATGTAAGAT AGGAAATAAG GAGTGATATA ATTTTAGTTT 15901 ATCTTCATAT CATATTATTA TAATTTCGCC C TTA.AAGAC T TAAGTATGGA TAGAAGTATA GTATAATAAT ATTAAAGCGG GAATTTCTGA ATTCATACCT 15951 ATATGCGGGC TGGTAAGAAC ATCACATCCC GCTATTGTAA G TT TATACGCCCG ACCATTCTTG TAGTGTAGGG CGATAACATT CTTTTTTTAA 16001 GCTCTATTTG TGGCGCTGTG ATCGGTTGAT CCCCACCAAT TGACCAGACC CGAGATAAAC ACCGCGACAC TAGCCAACTA GGGGTGGTTA ACTGGTCTGG 16051 TGGCATCTGA TTACTGCTCG AGATTCTTTA ATCCTTGATC GCGTCAAGAA ACCGTAGACT AATGACGAGC TC TAAGAA.AT TAGGAACTAG CGCAGTTCTT 16101 TGCCAGCACC CTAGCTCCCT TTAATGGCAC CTTCGTCCTT GACCGTCTCA ACGGTCGTGG GATCGAGGGA AATTACCGTG GAAGCAGGAA CTGGCAGAGT 16151 AGATTTATTT TCCTCCCTAA ATTTTTTGGG GGGGATGAAG CAATCGCTAT TC TA.AATAAA AGGAGGGATT T CCC CCCCTACTTC GTTAGCGATA 16201 CAATC GAA.AA AGTTCATTAA AAT~AATCTG TACTGACCTC GACATCTGTC GTTAGCTTTT TCAAGTAATT TTATTTAGAC ATGACTGGAG CTGTAGACAG 16251 TAAACTCCCA TTACTTTTCA TTCATGAGTA ACAATTGTCA AGTAGACCAA ATTTGAGGGT AATGAAAAGT AAGTACTCAT TGTTAACAGT TCATCTGGTT 16301 CACTGAGAGG GATGGAGAGA TTGACGCCAT AGTCGGCAAG TTTCGATTTT GTGACTCTCC CTACCTCTCT AACTGCGGTA TCAGCCGTTC A.AAGC TAAA.A 16351 TTTGATTAAT GAAGCTATGG TTT G TCATTTTCTT AATcccccCG AAACTAATTA CTTCGATACC AAATTTTTTC AGTAAAAGAA TTAGGGGGGC 16401 GGGACAAATT 
CGCAATAAAC GTTAATGTAG AGTGCATTAC ATTATTCTAA CCCTGTTTAA GCGTTATTTG CAATTACATC TCACGTAATG TAATAAGATT 16451 CACATTCTTC ACTTTATCGG GCATAAATTT GTTGTTATTA GGTTACCCCC GTGTAAGAAG TGAAATAGCC CGTATTTAAA CAACAATAAT CCAATGGGGG 16501 TGGGTTGTAA AAATTGAAAG C C GC TTA.AA.A P►~~A.AATAAAC ATTTTTTGGT ACCCAACATT TTTAACTTTC GGCGAATTTT TTTTTATTTG Tp~~AAAAC CA 16551 AAA.AAC C C C C CTCCCCCTAA TATACACGGA CTCCTCGAAA AACCCCTAAA TTTTTGGGGG GAGGGGGATT ATATGTGCCT GAGGAGCTTT TTGGGGATTT 16601 ACGAGGGCCG GACATATATT TTTGAATTAG CATGCGAA.AT A.AAC TC TGTA TGCTCCCGGC CTGTATATAA AAACTTAATC GTACGCTTTA TTTGAGACAT 16651 TATATTGTTA CACTACCAC ATATAACAAT GTGATGGTG

tRNA       1..70
           product = tRNA-Phe
rRNA       69..1020
           product = 12S ribosomal RNA
tRNA       1021..1092
           product = tRNA-Val
rRNA       1093..2757
           product = 16S ribosomal RNA
tRNA       2758..2832
           product = tRNA-Leu
gene       2833..3807
           gene = ND1
           product = NADH dehydrogenase subunit 1
tRNA       3809..3877
           product = tRNA-Ile
tRNA       3876..3947
           product = tRNA-Gln
tRNA       3948..4016
           product = tRNA-Met
gene       4017..5060
           gene = ND2
           product = NADH dehydrogenase subunit 2
tRNA       5060..5160
           product = tRNA-Trp
tRNA       complement (5132..5200)
           product = tRNA-Ala
tRNA       complement (5201..5273)
           product = tRNA-Asn
tRNA       complement (5306..5373)
           product = tRNA-Cys
tRNA       complement (5375..5444)
           product = tRNA-Tyr
gene       5446..6999
           gene = CO1
           product = cytochrome c oxidase subunit 1
tRNA       complement (7002..7072)
           product = tRNA-Ser
tRNA       7077..7146
           product = tRNA-Asp
gene       7154..7844
           gene = CO2
           product = cytochrome c oxidase subunit 2
tRNA       7845..7918
           product = tRNA-Lys
gene       7920..8087
           gene = ATP8
           product = ATP synthase F0 subunit 8
gene       8078..8761
           gene = ATP6
           product = ATP synthase F0 subunit 6
gene       8761..9546
           gene = CO3
           product = cytochrome c oxidase subunit 3
tRNA       9549..9618
           product = tRNA-Gly
gene       9619..9969
           gene = ND3
           product = NADH dehydrogenase subunit 3
tRNA       9968..10036
           product = tRNA-Arg
gene       10037..10333
           gene = ND4L
           product = NADH dehydrogenase subunit 4L
gene       10327..11707
           gene = ND4
           product = NADH dehydrogenase subunit 4
tRNA       11708..11776
           product = tRNA-His
tRNA       11777..11843
           product = tRNA-Ser
tRNA       11844..11915
           product = tRNA-Leu
gene       11916..13745
           gene = ND5
           product = NADH dehydrogenase subunit 5
gene       complement (13741..14262)
           gene = ND6
           product = NADH dehydrogenase subunit 6
tRNA       complement (14263..14332)
           product = tRNA-Glu
gene       14335..15480
           gene = CYTB
           product = cytochrome b
tRNA       15480..15550
           product = tRNA-Thr
tRNA       complement (15553..15621)
           product = tRNA-Pro
D-Loop     15622..16669
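The coordinates in the feature table above are 1-based and inclusive, and features encoded on the opposite strand are marked with `complement (...)`. As an illustration only, the following Python sketch shows one way such location strings could be read to obtain feature lengths and the overlap between neighbouring genes. The names `Location`, `parse_location`, `feature_length` and `overlap` are hypothetical helpers introduced here; they are not part of the thesis or of any annotation software.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """A feature location (hypothetical helper type for this sketch)."""
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" for the listed strand, "-" for "complement (...)"


# Accepts "7920..8087" and "complement (13741..14262)".
_LOCATION = re.compile(r"^(complement\s*\()?\s*(\d+)\.\.(\d+)\s*\)?$")


def parse_location(text: str) -> Location:
    """Parse a location string written as in the feature table."""
    match = _LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location: {text!r}")
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start exceeds end in location: {text!r}")
    strand = "-" if match.group(1) else "+"
    return Location(start, end, strand)


def feature_length(loc: Location) -> int:
    """Number of bases covered; both end coordinates are included."""
    return loc.end - loc.start + 1


def overlap(a: Location, b: Location) -> int:
    """Number of positions shared by two features (0 if they are disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


if __name__ == "__main__":
    atp8 = parse_location("7920..8087")
    atp6 = parse_location("8078..8761")
    nd6 = parse_location("complement (13741..14262)")
    print(feature_length(atp8), overlap(atp8, atp6), nd6.strand)
```

Applied to the table above, this sketch gives a length of 168 bases for ATP8 and a 10-base overlap between ATP8 and ATP6. The regular expression is deliberately permissive about spacing and is an assumption about how the printed table is formatted, not a full parser for sequence database location syntax.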

Isurus oxyrinchus mitochondrion, complete genome

1 GCTAGTGTAG CTTAATTTAA AGCATGGCAC TGAAGATGCT AAGATGAAAA CGATCACATC GAATTAAATT TCGTACCGTG ACTTCTACGA TTCTACTTTT 51 ATGACAATTT TCCGCAGGCA TGAAGGTTTG GTCCTGGCCT TAGTATTAAT TACTGTTAAA AGGCGTCCGT ACTTCCAAAC CAGGACCGGA ATCATAATTA 101 TGTAACCAGA ATTATACATG CAAGTTTCAG CATCCCTGTG AAAATGCCCT ACATTGGTCT TAATATGTAC GTTCAAAGTC GTAGGGACAC TTTTACGGGA 151 AGCTACTCTG TCAATTAGTT AGGAGCGGGT ATCAGGCACA CATACATGTA TCGATGAGAC AGTTAATCAA TCCTCGCCCA TAGTCCGTGT GTATGTACAT 201 GCCCAAGACA CCTTGCTAAG CCACACCCCC AAGGGATTTC AGCAGTAATA CGGGTTCTGT GGAACGATTC GGTGTGGGGG TTCCCTAAAG TCGTCATTAT 251 AATATTGATT ATATGAGCGC AAGCTCGAAT CAGTTAAAGT TAACAGAGTT T TATAAC TA.A TATACTCGCG TTCGAGCTTA GTCAATTTCA ATTGTCTCAA 301 GGTCAATCTC GTGCCAGCCA CCGCGGTTAT ACGAGTAACT CACATTAATA CCAGTTAGAG CACGGTCGGT GGCGCCAATA TGCTCATTGA GTGTAATTAT 351 CTCCCCGGCG TAA.AGAGTGA TTTAAGGAAC ATCTAACAAC AACTAAAGTT GAGGGGCCGC ATTTCTCACT A.AATTCCTTG TAGATTGTTG TTGATTTCAA 401 CAGACCTTAT CAAGCTGTCA CACGCGCCCA CAAGCGGAAT TATCAACAAC GTCTGGAATA GTTCGACAGT GTGCGCGGGT GTTCGCCTTA ATAGTTGTTG 451 GAAAGTGACT TTACCCCACT AGAA.ATC TTG ATGTCACGAC AGTTAGACCC CTTTCACTGA AATGGGGTGA TCTTTAGAAC TACAGTGCTG TCAATCTGGG 501 CAAACTAGGA TTAGATACCC TACTATGTCT AACCACAAAC TTAAACAATA GTTTGATCCT AATCTATGGG ATGATACAGA TTGGTGTTTG AATTTGTTAT 551 ATTCACTATA TTGTTCGCCA GAGTACTACA AGCGCTAGCT TA,~A,ACCCAA TAAGTGATAT AACAAGCGGT CTCATGATGT TCGCGATCGA ATTTTGGGTT 601 AGGACTTGGC GGTGTCCCAA ACCCACCTAG AGGAGCCTGT TCTGTAACCG 265

TCCTGAACCG CCACAGGGTT TGGGTGGATC TCCTCGGACA AGACATTGGC 651 ATAATCCCCG TTA.AACCTCA CCACTTCTAG CCATTCCCGT CTATATACCG TATTAGGGGC AATTTGGAGT GGTGAAGATC GGTAAGGGCA GATATATGGC 701 CCGTCGTCAG CTCACCCTGT GAAGGCTTAA AAGTAAGCAA A.AAGAAC CAA GGCAGCAGTC GAGTGGGACA CTTCCGAATT TTCATTCGTT TTTCTTGGTT 751 CTTCCACACG TCAGGTCGAG GTGTAGCGAA TGAAGTGGAT AGAA.ATGGGC GAAGGTGTGC AGTCCAGCTC CACATCGCTT ACTTCACCTA TCTTTACCCG 801 TACATTTTCT ATA.AAGAAAA CACGAATGGT AA.AC TGAAAA ATTACCTAAA ATGTAAAAGA TATTTCTTTT GTGCTTACCA TTTGACTTTT TAATGGATTT 851 GGTGGATTTA GCAGTAAGAA AAGACTAGAG AGCTTCTCTG A.A.ACCGGCTC CCACCTAAAT CGTCATTCTT TTCTGATCTC TCGAAGAGAC TTTGGCCGAG 901 TGGGACGCGC ACACACCGCC CGTCACTCTC CTCG ATCTACTTAT ACCCTGCGCG TGTGTGGCGG GCAGTGAGAG GAGCTTTTTT TAGATGAATA 951 TTTTAATTAA AGp~~AAATGC CAAGAGGAGG CAAGTCGTAA CATGGTAAGT AAA,ATTAATT TCTTTTTACG GTTCTCCTCC GTTCAGCATT GTACCATTCA 1001 GTACTGGA.AA GTGCACTTGG AATCAAAATG TGGCTA.AACC AGTA.AAGCAC CATGACCTTT CACGTGAACC TTAGTTTTAC ACCGATTTGG TCATTTCGTG 1051 CTCCCTTACA CCGAGGAGAT ACCCGTGCAA TTCGGGTCAT TCTGAACATT GAGGGAATGT GGCTCCTCTA TGGGCACGTT AAGCCCAGTA AGACTTGTAA 1101 AAAGCTAGCC TGTCCACCTA C C TCA.AATTC AACATTATTA ACTACCTTGC TTTCGATCGG ACAGGTGGAT GGAGTTTAAG TTGTAATAAT TGATGGAACG 1151 CCACTAATTC C TAAC TP~AAA CATTTTATCA TTTTAGTATG GGCGACAGAA GGTGATTAAG GATTGATTTT GTAAA.ATAGT A.AAATCATAC CCGCTGTCTT 1201 CP►~~AAATTCA GCGCAATAGA CTATGTACCG CAAGGGAGAG CTGAAAGAGA GTTTTTAAGT CGCGTTATCT GATACATGGC GTTCCCTCTC GACTTTCTCT 1251 AATGA.AACAA ATAATTAAAG TAGTP.~~AA.AG CAGAGATTAT ATCTCGTACC TTACTTTGTT TATTAATTTC ATCATTTTTC GTCTCTAATA TAGAGCATGG 1301 TTTTGCATCA TGATTTAGCT AGAAAA.AC TA GACAAAGAGA TCTTTAGCCT AA.AAC GTAGT AC TAA.ATC GA TCTTTTTGAT CTGTTTCTCT AGA.A.ATC GGA 1351 ATCTTCCCGA AACTAAACGA GCTACTCCGA AGCAGCACAA TTTAGAGCCA TAGAAGGGCT TTGATTTGCT CGATGAGGCT TCGTCGTGTT A.AATC TC GGT 1401 ACCCGTCTCT GTGGCA~AAAG AGTGGGAAGA CTTCCGAGTA GCGGTGACAA TGGGCAGAGA CACCGTTTTC TCACCCTTCT GAAGGCTCAT CGCCACTGTT 1451 GCCTATCGAG TTTAGTGATA GCTGGTTGTC CAAGAAAAGA ACTTCAGTTC CGGATAGCTC 
AAATCACTAT CGACCAACAG GTTCTTTTCT TGAAGTCAAG 1501 TGCATTAATT CTTTCATCAC CAAGAAGTCT ACCATATTAA GGCCA.AATAT ACGTAATTAA GA.AAGTAGTG GTTCTTCAGA TGGTATAATT CCGGTTTATA 1551 AAGAATTAAT AGTTATTCAG AAGAGGTACA GCCCTTCTGA ACCAAGATAC TTCTTAATTA TCAATAAGTC TTCTCCATGT CGGGAAGACT TGGTTCTATG 1601 AAC TTTTA.AA GGAGGGA.AAT GATCATATTT ATTAAGGTTC TCACCTCAGT TTGA.A.AATTT CCTCCCTTTA C TAGTATA.AA TAATTCCAAG AGTGGAGTCA 1651 GGGCCCAAAA GCAGCCACCT GTAAAGTAAG CGTCACAGCT CCAGTTTAAC CCCGGGTTTT CGTCGGTGGA CATTTCATTC GCAGTGTCGA GGTCAAATTG 1701 P.~~AAA.0 C TAT AATCTAGATA TTCTTCTCAA CGCCCCCTTA TTAATATTGG TTTTTGGATA TTAGATCTAT AAGAAGAGTT GCGGGGGAAT AATTATAACC 1751 ACTATTTTAT AAAGTTATAA AAGAACTTAT GC TA.AA.ATGA GTAATAGGAG TGATAAAATA TTTCAATATT TTCTTGAATA CGATTTTACT CATTATCCTC 1801 GATAAACCTC TCCCGACATA AGTGTACGTC AGAAAGAATT AATTCCCTGA CTATTTGGAG AGGGCTGTAT TCACATGCAG TCTTTCTTAA TTAAGGGACT 1851 CAATTAAACG AACCCAGACT GAGGTTATTA TACCCTATTT TACCTTAACT GTTAATTTGC TTGGGTCTGA CTCCAATAAT ATGGGATAAA ATGGAATTGA 1901 AGA.A.AAC C C T ATTACAAACA TTCGTTAACC CTACACAGGC ATGTCTTAAG TCTTTTGGGA TAATGTTTGT AAGCAATTGG GATGTGTCCG TACAGAATTC 1951 GAAAGATTAA AAGAAAGTAA AGGAACTCGG CAAACACGAA CTCCGCCTGT CTTTCTAATT TTCTTTCATT TCCTTGAGCC GTTTGTGCTT GAGGCGGACA 266

2001 TTAC Cp~3AAA CATCGCCTCT TGAAAGCCCA TAAGAGGTCC CGCCTGCCCT AATGGTTTTT GTAGCGGAGA ACTTTCGGGT ATTCTCCAGG GCGGACGGGA 2051 GTGACAATGT TTAACGGCCG CGGTATTCTG ACCGTGCA.AA GGTAGCGTAA CACTGTTACA AATTGCCGGC GCCATAAGAC TGGCACGTTT CCATCGCATT 2101 TCACTTGTCT TTTAAATGAA GACCCGTATG AAAGGCATCA CGAGAGTTCA AGTGAACAGA AA.ATTTAC TT CTGGGCATAC TTTCCGTAGT GCTCTCAAGT 2151 ACTGTCTCTA CTTTCCAATC AATGAA.ATTG ATCCACCCGT GCAGAAGCGG TGACAGAGAT GA.AAGGTTAG TTACTTTAAC TAGGTGGGCA CGTCTTCGCC 2201 GTATA.AACAC ATCAGACGAG AAGACCCTAT GGAGCTTCAA ACACATGAAT CATATTTGTG TAGTCTGCTC TTCTGGGATA CCTCGAAGTT TGTGTACTTA 2251 TAATTATGTA GACTAACTAC TCCACGGACA TAA.ACP►~~A.AA TACAACACTT ATTAATACAT CTGATTGATG AGGTGCCTGT ATTTGTTTTT ATGTTGTGAA 2301 TTAATTTAAC TGTTTTGGTT GGGGTGACCG AGGGGP►~~,~AA TCAATCCCCC AATTAAATTG ACAA.AACCAA CCCCACTGGC TCCCCTTTTT AGTTAGGGGG 2351 TTATCGACCG AGTGTTCTCA AGCAC TTAA.A AATTAGAATT ACAATTCTAA AATAGCTGGC TCACAAGAGT TCGTGAATTT TTAATCTTAA TGTTAAGATT 2401 TTAGTP~AAAT ATTTACCGAA AAATGACCCA GAATTTTCTG ATCAATGAAC AATCATTTTA TAAATGGCTT TTTACTGGGT CTTAAAAGAC TAGTTACTTG 2451 CAAGTTACCC TAGGGATAAC AGCGCAATCC TTTCCTAGAG TCCCTATCGA GTTCAATGGG ATCCCTATTG TCGCGTTAGG AAAGGATCTC AGGGATAGCT 2501 CGAAAGGGTT TACGACCTCG ATGTTGGATC AGGACATCCT AATGATGTAG GCTTTCCCAA ATGCTGGAGC TACAACCTAG TCCTGTAGGA TTACTACATC 2551 CCGTTATTAA GGGTTCGTTT GTTCAACGAT TAATAGTCCT ACGTGATCTG GGCAATAATT CCCAAGCAAA CAAGTTGCTA ATTATCAGGA TGCACTAGAC 2601 AGTTCAGACC GGAGAAATCC AGGTCAGTTT CTATCTATGA ATTTATCTTT TCAAGTCTGG CCTCTTTAGG TCCAGTCAA.A GATAGATACT TAAATAGA.AA 2651 CCTAGTACGA AAGGACCGGA AAAATGAAGC CAATACCCCA GGCACGCTTC GGATCATGCT TTCCTGGCCT TTTTACTTCG GTTATGGGGT CCGTGCGAAG 2701 ACTTTCATCT ATTGAAGTAA ACTAAAATAG ATAAGA~~AAA ATTAACTATT TGAAAGTAGA TAACTTCATT TGATTTTATC TATTCTTTTT TAATTGATAA 2751 GCCCAAGAAA AGGGCTGTTG GGGTGGCAGA GCCTGGTAAT TGCAAAAGAC CGGGTTCTTT TCCCGACAAC CCCACCGTCT CGGACCATTA ACGTTTTCTG 2801 CTAAGCTCTT TATTCCAGAG GTTCAAATCC TCTCCTCAAC TATGCTAGAA GATTCGAGAA ATAAGGTCTC CAAGTTTAGG AGAGGAGTTG ATACGATCTT 2851 GCCCTACTCC 
TTTACTTTAT TTGCCCCCTT ACCTATATTG TCCCTATCTT CGGGATGAGG A.AATGAAATA AACGGGGGAA TGGATATAAC AGGGATAGAA 2901 GTTAGCCACA GCATTCCTTA CCCTGGTTGA ACGAAAGATC CTCGGCTACA CAATCGGTGT CGTAAGGAAT GGGACCAACT TGCTTTCTAG GAGCCGATGT 2951 TGCAGCTTCG TAAAGGCCCC AACATCGTAG GCCCCTACGG TCTTCTTCAA ACGTCGAAGC ATTTCCGGGG TTGTAGCATC CGGGGATGCC AGAAGAAGTT 3001 CCTATTGCAG ACGGCCTAAA ACTATTTACC A.A.AGAAC C CA TTTACCCGTC GGATAACGTC TGCCGGATTT TGATA.AATGG TTTCTTGGGT AAATGGGCAG 3051 AGCATCCTCC CCATTTCTGT TCTTGATCGC CCCCACAATA GCCCTTACTC TCGTAGGAGG GGTAAAGACA AGAACTAGCG GGGGTGTTAT CGGGAATGAG 3101 TGGCCCTCCT CATATGAATG CCCCTTCCCC TCCCCCACTC TGTCATCAAT ACCGGGAGGA GTATACTTAC GGGGAAGGGG AGGGGGTGAG ACAGTAGTTA 3151 CTTAACTTAG GTTTATTGTT TATTCTAGCA ATCTCAAGTC TAACCGTCTA GAATTGAATC CA.AATAACAA ATAAGATCGT TAGAGTTCAG ATTGGCAGAT 3201 TACCATCTTG GGCTCCGGAT GAGCATCAAA TTCA.A.AATAC GCCTTGATAG ATGGTAGAAC CCGAGGCCTA CTCGTAGTTT AAGTTTTATG CGGAACTATC 3251 GAGCTTTACG AGCTGTGGCA CA.AACAATC T CCTACGAAGT GAGCCTCGGA CTCGAAATGC TCGACACCGT GTTTGTTAGA GGATGCTTCA CTCGGAGCCT 3301 CTGATCCTCC TATCAATGAT TATTTTCACA GGGGGATTTA CCCTCCATAC GACTAGGAGG ATAGTTACTA AT~~.A.A~GTGT CCCCCTAAAT GGGAGGTATG 3351 CTTTAACTTA GCACAGGAAA CAGTCTGATT AATTATCCCT GGATGACCCC 267

GAAATTGAAT CGTGTCCTTT GTCAGACTAA TTAATAGGGA CCTACTGGGG TAA 3401 TAGCCCTAAT ATGATATGTA TCAACCCTAG C AGAA.AC CCGAGTACCA ATCGGGATTA TACTATACAT AGTTGGGATC GTCTTTGATT GGCTCATGGT 3451 TTTGATCTAA CAGAGGGAGA ATCAGAACTA GTCTCTGGTT TCAACATCGA A.AAC TAGATT GTCTCCCTCT TAGTCTTGAT CAGAGACCAA AGTTGTAGCT TACACAA.ATA 3501 ATATGCGGGG GGCTCATTCG CCCTATTCTT CCTTGCTGAA TATACGCCCC CCGAGTAAGC GGGATAAGAA GGAACGACTT ATGTGTTTAT 3551 TTCTACTAAT A.AACAC C CTC TCAGTCATCC TCTTCATAGG CTCCTCCTAT AAGATGATTA TTTGTGGGAG AGTCAGTAGG AGAAGTATCC GAGGAGGATA TAAA.AGCAAC 3601 GATCCCCTCT TTCCAGAAAT CTCAACCCTC AGCCTCATAA CTAGGGGAGA AAGGTCTTTA GAGTTGGGAG TCGGAGTATT ATTTTCGTTG 3651 CCTGCTCACC CTACTTTTTT TATGAATTCG AGCATCATAC CCTCGCTTTC GGACGAGTGG GATG ATACTTAAGC TCGTAGTATG GGAGCGAAAG 3701 GCTATGACCA ACTCATACAC TTAGTGTGAA A.AA.ACTTC TT ACCCCTAACC CGATACTGGT TGAGTATGTG AATCACACTT TTTTGAAGAA TGGGGATTGG 3751 TTAGCAATTA TACTATGACA TATTGCCCTT CCCGTGGCTT CAGCAAGCTT AATCGTTAAT ATGATACTGT ATAACGGGAA GGGCACCGAA GTCGTTCGAA 3801 GCCCCCTCTA AC C TAAA.AGG AAGCGTGCCT GAACA.A.AAGG ACCACTTTGA T CGGGGGAGAT TGGATTTTCC TTCGCACGGA CTTGTTTTCC TGGTGA.AAC P,~~AAATAGGA 3851 TAGAGTGGAT AATGAAAGTT AAAACCTCTC CTCTTCCTAG ATCTCACCTA TTACTTTCAA TTTTGGAGAG GAGAAGGATC TTTTTATCCT 3901 TTTGAACCTA TACCTAAGAG ATCAAAACTC TCCGTACTTC CAGTTATACC AAACTTGGAT ATGGATTCTC TAGTTTTGAG AGGCATGAAG GTCAATATGG 3951 ATTTTCTAAG TAAGGTCAGC TAATGAAGCT TTTGGGCCCA TACCCCAACC TAAA.AGATTC ATTCCAGTCG ATTACTTCGA AAACCCGGGT ATGGGGTTGG 4001 ATGTCGGTTA AAATCCTTCC TTTACTAATG AATCCCCTTG TATTAACCAT TACAGCCAAT TTTAGGAAGG AA.ATGATTAC TTAGGGGAAC ATAATTGGTA 4051 CGTCATCTCA AGCCTAGGCC TAGGAACCAT CCTCACATTC ATTGGCTCAC GCAGTAGAGT TCGGATCCGG ATCCTTGGTA GGAGTGTAAG TAACCGAGTG 4101 ACTGACTATT AGTCTGAATA GGTCTTGAAA TCAACACTTT AGCTATCCTT TGACTGATAA TCAGACTTAT CCAGAACTTT AGTTGTGAAA TCGATAGGAA 4151 CCCCTAATAA TCCGCCAGCA CCACCCCCGA GCGGTAGAAG CCTCCACA.AA GGGGATTATT AGGCGGTCGT GGTGGGGGCT CGCCATCTTC GGAGGTGTTT 4201 ATACTTTATC ACACAAGCCA CTGCCTCAGC CCTTCTCTTA TTTGCTAGTG TATGAAATAG TGTGTTCGGT 
GACGGAGTCG GGAAGAGAAT AAACGATCAC 4251 TTACAAACGC CTGAACCTCA GGCGAATGAA ACCTAGTCGA A.ATAGTTAGC AATGTTTGCG GACTTGGAGT CCGCTTACTT TGGATCAGCT TTATCAATCG 4301 CCAGGCTCTG CCACACTGGC CACAATCGCA TTGGCCCTAA AAATCGGCTT GGTCCGAGAC GGTGTGACCG GTGTTAGCGT AACCGGGATT TTTAGCCGAA 4351 AGCCCCCCTT CACTTCTGAC TCCCCGAAGT CCTTCAAGGC TTAGACCTAA TCGGGGGGAA GTGAAGACTG AGGGGCTTCA GGAAGTTCCG AATCTGGATT 4401 CCACGGGCCT TATCCTCTCC AC C TGACAA.A AACTTGCCCC ATTCGCCATT GGTGCCCGGA ATAGGAGAGG TGGACTGTTT TTGAACGGGG TAAGCGGTAA 4451 CTTCTGCAAC TCTACCCCTC ACTAAACCCC AATCTACTAA TTCTTCTTGG GAAGACGTTG AGATGGGGAG TGATTTGGGG TTAGATGATT AAGAAGA.AC C 4501 AGTCCTGTCA ACTATAGTGG GGGGC'I`GAGG GGGATTAAAC CAGACCCAAC TCAGGACAGT TGATATCACC CCCCGACTCC CCCTAATTTG GTCTGGGTTG 4551 TAC GP.~~~AT CCTAGCATAC TCCTCAATCG CTCACCTTGG TTGAATAATT ATGCTTTTTA GGATCGTATG AGGAGTTAGC GAGTGGAACC AACTTATTAA 4601 TCCATTCTCC ACTACTCCCA CAACCTAACC CAACTTAATC TAATTCTTTA AGGTAAGAGG TGATGAGGGT GTTGGATTGG GTTGAATTAG ATTAAGAAAT 4651 TATTATTATA ACCTCAACAA CCTTCCTTCT GTTTAAGACA TTTAACTCAA ATAATAATAT TGGAGTTGTT GGAAGGAAGA CAAATTCTGT AA.ATTGAGTT 4701 CP.~~AAATCAA CTCTGTCTCC TCCTCTTCAT CAAAGTCCCC CTTGCTCTCC GTTTTTAGTT GAGACAGAGG AGGAGAAGTA GTTTCAGGGG GAACGAGAGG 268

4751 ATTATTGCCC TCCTAACTCT CCTCTCTCTC GGAGGCCTGC CCCCTCTTTC TAATAACGGG AGGATTGAGA GGAGAGAGAG CCTCCGGACG GGGGAGA.AAG 4801 AGGCTTTATA CCA,AAATGGC TCATCTTACA AGAATTGACT AAACAAGACT TCCGAAATAT GGTTTTACCG AGTAGAATGT TCTTAACTGA TTTGTTCTGA 4851 TAATTGTCCC CGCCGTTATT ATAGCTATAA TGGCCCTCCT TAGTTTATTC ATTAACAGGG GCGGCAATAA TATCGATATT ACCGGGAGGA ATCAAATAAG 4901 TTCTACCTAC GCCTGTGCTA CGCTACAGCA CTAACCATAA CCCCCGCCCC AAGATGGATG CGGACACGAT GCGATGTCGT GATTGGTATT GGGGGCGGGG 4951 AATTAATATA CTGACATCAT GACGCACCAA ATTATCCCAC AACCTGGCCC TTAATTATAT GACTGTAGTA CTGCGTGGTT TAATAGGGTG TTGGACCGGG 5001 TGACAACCAC TGCCTCATTG TCCATCTTCC TCCTCCCAAT CACCCCTGCC ACTGTTGGTG ACGGAGTAAC AGGTAGAAGG AGGAGGGTTA GTGGGGACGG 5051 ATCCTCATAC TAATATCCTA AGAAATTTAG GTTAACAACA GACCP~AAAGC TAGGAGTATG ATTATAGGAT TCTTTAAATC CAATTGTTGT CTGGTTTTCG 5101 CTTCAAAGCT TTAAGTAGAA GTGA.A.AATCT CCTAATTTCT GTTAAGATCT GAAGTTTCGA AATTCATCTT CACTTTTAGA GGATTAAAGA CAATTCTAGA 5151 GCAAGACTTT ATCTCACATC TTCTGAATGC AACCCAGATA CTTTCATTAA CGTTCTGAAA TAGAGTGTAG AAGACTTACG TTGGGTCTAT GAA.AGTAATT 5201 GC TAA,AAC C T TCTCCTAAAT AAGTAGGCCT TGATCCTACA AAATCTTAGT CGATTTTGGA AGAGGATTTA TTCATCCGGA ACTAGGATGT TTTAGAATCA 5251 TAACAGCTAA GCGTTCAATC CAGCGAACTT TTATCTACTT TCTCCCGCCG ATTGTCGATT CGCAAGTTAG GTCGCTTGAA AATAGATGAA AGAGGGCGGC 5301 TC AGGCGGGAGA AAGTCCCGGG AGAAACTAAT CTCCATCTTT AGTTTTTTTT TCCGCCCTCT TTCAGGGCCC TCTTTGATTA GAGGTAGAAA 5351 GGATTTGCAA TCCAACATAA ACAGCTACTG CAGGACTATG GTAAGAAGAG CCTAAACGTT AGGTTGTATT TGTCGATGAC GTCCTGATAC CATTCTTCTC 5401 GAATTGGACC TCTGTTCATG GGGTTACAAT CCATCACTTA GTTCTCAGTC CTTAACCTGG AGACAAGTAC CCCAATGTTA GGTAGTGAAT CAAGAGTCAG 5451 ACCTTACCTG TGGCAATTAA TCGATGACTA TTTTCTACAA ACCACAAAGA TGGAATGGAC ACCGTTAATT AGCTACTGAT AAAAGATGTT TGGTGTTTCT 5501 CATTGGCACC TTGTATTTAA TCTTTGGTGC ATGAGCAGGA ATGGTAGGGA GTAACCGTGG AACATAAATT AGAAACCACG TACTCGTCCT TACCATCCCT 5551 CAGCCCTAAG CCTTTTAATT CGTGCCGAAC TGGGTCAGCC TGGTTCCCTC GTCGGGATTC GGAAAATTAA GCACGGCTTG ACCCAGTCGG ACCAAGGGAG 5601 CTAGGGGATG ATCAGATTTA TAATGTTATT 
GTAACCGCCC ATGCATTTGT GATCCCCTAC TAGTC TAA.AT ATTACAATAA CATTGGCGGG TACGTAAACA 5651 AATAATTTTC TTTATGGTCA TGCCCGTAAT AATTGGAGGC TTTGGAAATT TTATTAAA.AG AAATACCAGT ACGGGCATTA TTAACCTCCG A.AAC C TTTAA 5701 GACTAGTCCC TTTAATGATC GGAGCACCAG ACATAGCCTT CCCCCGAATA CTGATCAGGG AAATTACTAG CCTCGTGGTC TGTATCGGAA GGGGGCTTAT 5751 AATAACATAA GTTTCTGGCT CCTCCCCCCT TCTTTCCTTC TACTCTTGGC TTATTGTATT CAAAGACCGA GGAGGGGGGA AGAA.AGGAAG ATGAGAACCG 5801 CTCAGCCGGA GTTGAGTCAG GAGCCGGCAC TGGCTGAACA GTCTACCCTC GAGTCGGCCT CAACTCAGTC CTCGGCCGTG ACCGACTTGT CAGATGGGAG 5851 CCCTAGCTGG CAACTTAGCA CACGCCGGAG CATCTGTTGA TCTAGCCATT GGGATCGACC GTTGAATCGT GTGCGGCCTC GTAGACAACT AGATCGGTAA 5901 TTCTCCCTCC ACCTGGCTGG TATCTCGTCC ATCCTAGCTT CCATTAACTT AAGAGGGAGG TGGACCGACC ATAGAGCAGG TAGGATCGAA GGTAATTGAA 5951 CATTACAACC ATCATCAACA TA.A.AAC C C C C GGCAATCTCC CAATACCA.AA GTAATGTTGG TAGTAGTTGT ATTTTGGGGG CCGTTAGAGG GTTATGGTTT 6001 CACCCCTGTT TGTCTGGTCC ATTCTAGTGA CAACCATCCT CCTTCTTTTA GTGGGGACAA ACAGACCAGG TAAGATCACT GTTGGTAGGA GGAAGA,.AA.AT 6051 GCACTCCCAG TGCTCGCCGC TGGCATTACA ATACTACTTA CGGACCGAAA CGTGAGGGTC ACGAGCGGCG ACCGTAATGT TATGATGAAT GCCTGGCTTT 6101 CCTA.AACACA ACATTCTTTG ATCCGGCCGG AGGAGGTGAT CCTATCCTCT 2b9

GGATTTGTGT TGTAAGAAAC TAGGCCGGCC TCCTCCACTA GGATAGGAGA 6151 ACCAGCATCT GTTCTGATTT TTTGGCCATC CAGAGGTCTA CATTCTTATC TGGTCGTAGA C AAGAC TA.AA A.AACCGGTAG GTCTCCAGAT GTAAGAATAG 6201 CTTCCTGGCT TTGGGATAAT CTCCCATGTT GTAGCCTACT ACTCTGGCAA GAAGGACCGA AACCCTATTA GAGGGTACAA CATCGGATGA TGAGACCGTT 6251 A1~A.AGAAC C C TTTGGCTACA TGGGAATAGT TTGAGCAATA ATAGCAATTG TTTTCTTGGG AAACCGATGT ACCCTTATCA AACTCGTTAT TATCGTTAAC 6301 GCCTGCTAGG CTTCATCGTC TGGGCCCATC ATATGTTTAC CGTAGGAATG CGGACGATCC GAAGTAGCAG ACCCGGGTAG TATACAAATG GCATCCTTAC 6351 GATGTTGACA CGCGAGCCTA CTTCACCTCA GCAACGATAA TTATCGCCAT CTACAACTGT GCGCTCGGAT GAAGTGGAGT CGTTGCTATT AATAGCGGTA 6401 CCCTACAGGT GTAAAAGTCT TCAGTTGACT AGCGACCCTT CATGGAGGCT GGGATGTCCA CATTTTCAGA AGTCAACTGA TCGCTGGGAA GTACCTCCGA 6451 CTGTCAAATG AGAGACCCCC TTACTATGGG CTCTTGGGTT TATCTTCCTA GACAGTTTAC TCTCTGGGGG AATGATACCC GAGAACCCAA ATAGAAGGAT 6501 TTCACAGTAG GAGGCCTGAC AGGGATTGTA CTAGCCAACT CTTCTCTAGA AAGTGTCATC CTCCGGACTG TCCCTAACAT GATCGGTTGA GAAGAGATCT 6551 CATCGTCCTC CACGACACTT ACTATGTAGT AGCCCATTTC CACTATGTCC GTAGCAGGAG GTGCTGTGAA TGATACATCA TCGGGTAAAG GTGATACAGG 6601 TCTCGATAGG GGCCGTATTC GCTATTATGG CGGGCTTTAT CCACTGATTC AGAGCTATCC CCGGCATAAG CGATAATACC GCCCGAAATA GGTGACTAAG 6651 CCTTTAATAA CCGGCTACAC CCTCCACTCG ACTTGAACAA AAATCCAATT GGAAATTATT GGCCGATGTG GGAGGTGAGC TGAACTTGTT TTTAGGTTAA 6701 CGCAGTTATA TTTATTGGAG TAAATCTGAC ATTCTTCCCA CAACACTTCC GCGTCAATAT AA.ATAAC C T C ATTTAGACTG TAAGAAGGGT GTTGTGA.AGG 6751 TAGGCCTCGC TGGAATGCCA CGACGTTACT CAGACTACCC AGACGCTTAC ATCCGGAGCG ACCTTACGGT GCTGCAATGA GTCTGATGGG TCTGCGAATG 6801 ACCTTATGAA ACACAGTCTC CTCTATCGGC TCCTTAATCT CACTCGTGGC TGGAATACTT TGTGTCAGAG GAGATAGCCG AGGAATTAGA GTGAGCACCG 6851 TGTAATCATG TTCTTATTTA TTATTTGAGA AGCATTTGCC TCAAAACGAG ACATTAGTAC AAGAATAAAT AATA.AAC TC T TCGTAAACGG AGTTTTGCTC 6901 AAGTCCTATC CGTTGAACTA CCGCACACAA ATGTCGAATG ACTACACGGT TTCAGGATAG GCAACTTGAT GGCGTGTGTT TACAGCTTAC TGATGTGCCA 6951 TGCCCTCCAC CCTATCACAC ATATGAAGAG CCAGCCTTTG TTCAAGTTCA ACGGGAGGTG GGATAGTGTG 
TATACTTCTC GGTCGGAAAC AAGTTCAAGT 7001 AC GAA.AC T TA TAGGACAAGA AAGGAAGGAA TTGAACCCCC ATATGTTAGT TGCTTTGAAT ATCCTGTTCT TTCCTTCCTT AACTTGGGGG TATACAATCA 7051 TTCAAGCTAA CCACATTACC ACTCTGCCAC TTTCTTCATA GAGGCCCTAG AAGTTCGATT GGTGTAATGG TGAGACGGTG AAAGAAGTAT CTCCGGGATC 7101 TP►~~AAACATA TTACACTACC TTGTCAAGGC ATAATTGCAG GTTAGAATCC ATTTTTGTAT AATGTGATGG AACAGTTCCG TATTAACGTC CAATCTTAGG 7151 TGCGGGTCTT AGAGCTAATG GCACACCCCT CACAATTAGG ATTTCAAGAT ACGCCCAGAA TCTCGATTAC CGTGTGGGGA GTGTTAATCC TAAAGTTCTA 7201 GCAGCCTCCC CAGTCATAGA AGAACTTATT CACTTTCACG ACCACACACT CGTCGGAGGG GTCAGTATCT TCTTGAATAA GTGA.AAGTGC TGGTGTGTGA 7251 AATAATTGTA TTTCTAATTA GCGCCCTGGT TCTTTATATT ATTACAGCGA TTATTAACAT AAAGATTAAT CGCGGGACCA AGA.AATATAA TAATGTCGCT 7301 TAGTATCAAC AAAACTTACA AACAA.ATACA TCCTCGATTC CCAAGAGATT ATCATAGTTG TTTTGAATGT TTGTTTATGT AGGAGCTAAG GGTTCTCTAA 7351 GA.AATCGTTT GGACTATCCT CCCCGCCATC ATCCTCATTA TAATCGCCCT CTTTAGCAAA CCTGATAGGA GGGGCGGTAG TAGGAGTAAT ATTAGCGGGA 7401 ACCATCCTTA CGAATTTTAT ACCTAATAGA CGAGATTAAT GACCCCCACT TGGTAGGAAT GCTTAAAATA TGGATTATCT GCTCTAATTA CTGGGGGTGA 7451 TGACCATTAA AGCCATAGGC CATCAATGGT ACTGAAGCTA CGAATACACA ACTGGTAATT TCGGTATCCG GTAGTTACCA TGACTTCGAT GCTTATGTGT 270

7501 GACTACGAAG ATCTAGGCTT TGACTCTTAC ATGATTCAAA CCCAAGACTT CTGATGCTTC TAGATCCGAA ACTGAGAATG TACTAAGTTT GGGTTCTGAA 7551 AGCCCCCGGC CAGTTTCGCT TATTAGAGAC AGACCATCGA ATAGTAGTTC TCGGGGGCCG GTCAAAGCGA ATAATCTCTG TCTGGTAGCT TATCATCAAG 7601 CCATAGAATC CCCCGTACGT GTACTAGTGT CCGCAGAAGA TGTCTTACAC GGTATCTTAG GGGGCATGCA CATGATCACA GGCGTCTTCT ACAGAATGTG 7651 TCATGGGCCG TACCAGCCTT AGGGGTTAA.A ATAGACGCTG TCCCAGGGCG AGTACCCGGC ATGGTCGGAA TCCCCAATTT TATCTGCGAC AGGGTCCCGC 7701 TTTAAATCAA ACTGCCTTCA TCATCTCCCG GCCCGGTGTC TATTATGGTC AAATTTAGTT TGACGGAAGT AGTAGAGGGC CGGGCCACAG ATAATACCAG 7751 AGTGTTCAGA AATCTGTGGG GCCAACCACA GCTTTATACC TATTGTAGTA TCACAAGTCT TTAGACACCC CGGTTGGTGT CGAAATATGG ATAACATCAT 7801 GAAGCAGTTC CTCTTGAACA CTTCGAAGCC TGATCTTCAT TAATACTAGA CTTCGTCAAG GAGAACTTGT GAAGCTTCGG ACTAGAAGTA ATTATGATCT 7851 AGAAGCCTCA CTAAGAAGCT AAACTGGGCC TAGCGTTAGC CTTTTAAGCT TCTTCGGAGT GATTCTTCGA TTTGACCCGG ATCGCAATCG GAAAATTC GA 7901 p►~~AAAC TGGT GACTCCCTAC CACCCTTAGT GATATGCCTC AATTAAACCC TTTTTGACCA CTGAGGGATG GTGGGAATCA CTATACGGAG TTAATTTGGG 7951 TCACCCTTGA CTAATTATCC TCCTGTTTTC ATGAATAATT TTCCTCATTG AGTGGGAACT GATTAATAGG AGGACAAAAG TACTTATTAA AAGGAGTAAC 8001 TCTTACCA.AA GAAAGTGATA AATCACCTAT TCACCAACCA CCCAACATTA AGAATGGTTT CTTTCACTAT TTAGTGGATA AGTGGTTGGT GGGTTGTAAT 8051 AAAAGTGCAG P~3AAATCTAA ACCAGCACCC TGAAACTGAC CATGGTCCTA TTTTCACGTC TTTTTAGATT TGGTCGTGGG ACTTTGACTG GTACCAGGAT 8101 AACTTTTTTG ACCAATTCCT AAGCCCCTCC CTCCTTGGAG TCCCATTAAT TTGAA,AAAAC TGGTTAAGGA TTCGGGGAGG GAGGAACCTC AGGGTAATTA 8151 TGCCCTCGCA ATCACCCTAC CATGATTAAT TTTTCCAACC CCAACTGGCC ACGGGAGCGT TAGTGGGATG GTACTAATTA AAAAGGTTGG GGTTGACCGG 8201 GATGACTCAG TAATCGACTC ATAACACTCC A.AAGC TGATT CATTAACCGA CTACTGAGTC ATTAGCTGAG TATTGTGAGG TTTCGACTAA GTAATTGGCT 8251 TTTGTTTACC AACTCATACA GCCCATTAAC TTCGCTGGCC ATAAATGAGC AAACA.AATGG TTGAGTATGT CGGGTAATTG AAGCGACCGG TATTTACTCG 8301 AATACTATTT ACAGCTCTAA TACTATTCCT AATTTCTATC AACCTACTGG TTATGATAAA TGTCGAGATT ATGATAAGGA TTAAAGATAG TTGGATGACC 8351 GCCTTCTTCC CTACACCTTT 
ACACCCACAA CACAACTCTC CCTCAACATA CGGAAGAAGG GATGTGGAAA TGTGGGTGTT GTGTTGAGAG GGAGTTGTAT 8401 GCATTCGCCC TTCCCTTATG ATTTACTACC GTCCTAGTCG GAATGCTCAA CGTAAGCGGG AAGGGAATAC TAAATGATGG CAGGATCAGC CTTACGAGTT 8451 TCAGCCCACC ATTGCACTAG GACACTTTCT GCCCGAAGGC ACGCCCACCC AGTCGGGTGG TAACGTGATC CTGTGAAAGA CGGGCTTCCG TGCGGGTGGG 8501 CTTTAGTCCC CGTCTTAATT GTCATTGAAA CCATTAGTTT ATTTATCCGA GAAATCAGGG GCAGAATTAA CAGTAACTTT GGTAATCAAA TAAATAGGCT 8551 CCACTAGCGC TAGGGGTCCG ACTAACTGCT AATTTGACAG CAGGTCACCT GGTGATCGCG ATCCCCAGGC TGATTGACGA TTAAACTGTC GTCCAGTGGA 8601 ACTTATACAA CTAATTGCAA CCGCAGCCTT TGTGCTTATT ACTATCATAC TGAATATGTT GATTAACGTT GGCGTCGGAA ACACGAATAA TGATAGTATG 8651 CCGCCGTAGC ATTACTCACA TCAATTGTTT TATTTCTACT TACAATCTTA GGCGGCATCG TAATGAGTGT AGTTAACAAA ATA.AAGATGA ATGTTAGAAT 8701 GAAGTGGCTG TAGCAATAAT TCAAGCATAC GTATTCGTCC TCTTATTAAG CTTCACCGAC ATCGTTATTA AGTTCGTATG CATAAGCAGG AGAATAATTC 8751 CCTCTACCTA CAAGAAAATG TTTAATGGCT CACCAAGCAC ACGCATATCA GGAGATGGAT GTTCTTTTAC A.AATTACCGA GTGGTTCGTG TGCGTATAGT 8801 TATAGTTGAC CCCAGCCCAT GACCGCTGAC CGGGGCTACA GCCGCCCTTT ATATCAACTG GGGTCGGGTA CTGGCGACTG GCCCCGATGT CGGCGGGAA.A 8851 TAATGACATC CGGCCTAGCC ATCTGGTTTC ACTTCCACTC TCTAATTCTC 271

ATTACTGTAG GCCGGATCGG TAGAC CA.AAG TGAAGGTGAG AGATTAAGAG 8901 CTCTACCTAG GACTGACCCT TCTCCTACTA ACTATAATCC AATGATGGCG GAGATGGATC CTGACTGGGA AGAGGATGAT TGATATTAGG TTACTACCGC 8951 CGATATTATC CGAGAAGGAA CATTTCAAGG CCACCACACA CCTCCCGTTC GCTATAATAG GCTCTTCCTT GTAA.AGTTCC GGTGGTGTGT GGAGGGCAAG 9001 A~~A,AAGGC C T CCGCTACGGA ATAATTCTAT TCATCACATC AGAAGTGTTC TTTTTCCGGA GGCGATGCCT TATTAAGATA AGTAGTGTAG TCTTCACAAG 9051 TTTTTTCTAG GCTTTTTCTG AGCCTTTTAC CATTCAAGTT TAGCCCCCAC GATC C GP~~AAAGAC TC GGAAA.ATG GTAAGTTCAA ATCGGGGGTG 9101 CCCTGAGCTA GGAGGATGCT GGCCCCCAAC AGGAATTAGT CCTATAGACC GGGACTCGAT CCTCCTACGA CCGGGGGTTG TCCTTAATCA GGATATCTGG 9151 CATTTGAAGT GCCACTCCTA AATACTGCAG TTCTGCTGGC CTCCGGCGTA GTA.AACTTCA CGGTGAGGAT TTATGACGTC AAGACGACCG GAGGCCGCAT 9201 ACAGTAACCT GAGCTCACCA TAGCCTTATA GAAGGCAATC GA~A.AAGAAAC TGTCATTGGA CTCGAGTGGT ATCGGAATAT CTTCCGTTAG CTTTTCTTTG 9251 TATTCAAGCC CTCACTCTCA CTATCCTTCT AGGTATTTAC TTCACAGCCC ATAAGTTCGG GAGTGAGAGT GATAGGAAGA TCCATAAATG AAGTGTCGGG 9301 TACAAGCCAT AGAGTACTAT GAAGCCCCTT TTACTATCGC TGATGGGGTC ATGTTCGGTA TCTCATGATA CTTCGGGGAA AATGATAGCG ACTACCCCAG 9351 TATGGAACTA CATTCTTCGT AGCCACAGGA TTTCACGGCC TCCATGTTAT ATACCTTGAT GTAAGAAGCA TCGGTGTCCT AAAGTGCCGG AGGTACAATA 9401 TATTGGCTCA ACATTCCTAA TAATCTGCCT ATTACGACAG ATTCAATACC ATAACCGAGT TGTAAGGATT ATTAGACGGA TAATGCTGTC TAAGTTATGG 9451 ACTTCACATC CCAACACCAC TTTGGATTTG AAGCTGCTGC ATGATACTGA TGAAGTGTAG GGTTGTGGTG A.AACC TAA.AC TTCGACGACG TACTATGACT 9501 CACTTTGTGG ACGTAGTGTG ATTATTCCTC TATGTTTCCA TTTATTGATG GTGAAACACC TGCATCACAC TAATAAGGAG ATACAAAGGT AAATAACTAC 9551 AGGCTCATAA CTGCTTTTCT AGTATAGACT AGTACAAATG ATTTCCAATC TCCGAGTATT GACGAAAAGA TCATATCTGA TCATGTTTAC TAAAGGTTAG 9601 ATTTAATCTT GGTTAAAATC CAAGGAAAAG TAATGAACCT CATCATGTCT TAAATTAGAA CCAATTTTAG GTTCCTTTTC ATTACTTGGA GTAGTACAGA 9651 TCTGTTGCGG CTACGGCCCT GGTTTCCCTA ATCCTTGTAT TCATCACATT AGACAACGCC GATGCCGGGA CCAAAGGGAT TAGGAACATA AGTAGTGTAA 9701 CTGGCTTCCA TCTCTCAGCC CAGACAACGA AAA.AC TC TC C CCATATGAAT GACCGAAGGT AGAGAGTCGG 
GTCTGTTGCT TTTTGAGAGG GGTATACTTA 9751 GTGGCTTCGA CCCTCTTGGA AGTGCACGTC TTCCATTTTC CCTACGTTTC CACCGAAGCT GGGAGAACCT TCACGTGCAG AAGGTAAAAG GGATGCAA.AG 9801 TTTCTCGTAG CCATCCTATT TCTACTATTT GAC C TAGA.AA TTGCTCTTCT AAAGAGCATC GGTAGGATAA AGATGATAAA CTGGATCTTT AACGAGAAGA 9851 TCTCCCCCTC CCCTGAGGGG ATCAACTTCT ATCGGGGGTG TACACACTGC AGAGGGGGAG GGGACTCCCC TAGTTGAAGA TAGGGGCGAG ATGTGTGACG 9901 TTTGAGCAGC AATCATCTTA ATTCTACTTA CCCTCGGTCT CGTCTATGAA A.AAC TC GTC G TTAGTAGAAT TAAGATGAAT GGGAGCCAGA GCAGATACTT 9951 TGACTCCAAG GGGGATTAGA ATGGGCAGAA TGGATATTTA GTCTAAACAA ACTGAGGTTC CCCCTAATCT TACCCGTCTT ACCTATAAAT CAGATTTGTT 10001 AGACCACTAA TTTCGGCTTA GTAGACTATG GTGA,AA,ATC C ATAAATATCT TCTGGTGATT AAAGCCGAAT CATCTGATAC CACTTTTAGG TATTTATAGA 10051 TATGTCCCCC CTATATTTTA GCCTCAACTC AGCATTTATA CTAGGCCTGA ATAGAGGGGG GATATAAAAT CGGAGTTGAG TCGTAAATAT GATCCGGACT 10101 TGGGTCTCGC ACTCAACCGT TATCATCTCT TATCCGCACT TTTATGCCTG ACCCAGAGCG TGAGTTGGCA ATAGTAGAGA ATAGGCGTGA AAATACGGAC 10151 GAAAGCATAC TACTAACTCT ATTCATTACC ACTGCTATCT GGAC TC TA.AC CTTTCGTATG ATGATTGAGA TAAGTAATGG TGACGATAGA CCTGAGATTG 10201 ACTGAACTCT GTCTCATCCT CAGTCTTCCC TATGATTCTC CTTACATTCT TGACTTGAGA CAGAGTAGGA GTCAGAAGGG ATACTAAGAG GAATGTAAGA 272

10251 CGGCCTGCGA AGCCAGCGCA GGCCTGGCTA TTCTAGTAGC CACCTCCCGC GCCGGACGCT TCGGTCGCGT CCGGACCGAT AAGATCATCG GTGGAGGGCG 10301 TCCCACGGTT CTGATAACCT AC A~AAAC C TA AATCTTCTCC AATGCTAAAA AGGGTGCCAA GACTATTGGA TGTTTTGGAT TTAGAAGAGG TTACGATTTT 10351 GTTCTTATCC CAACTATCAT ACTCTTCCCA ACCACATGAG TTATTAACAA CAAGAATAGG GTTGATAGTA TGAGAAGGGT TGGTGTACTC AATAATTGTT 10401 A.AAGTGGC TA TGGCCCATAA CCACCTCCTA CAGCCTTCTA ATTGCACTGT TTTCACCGAT ACCGGGTATT GGTGGAGGAT GTCGGAAGAT TAACGTGACA 10451 CAAGCCTAAT CTGATTCA.AA TGGAACATCG ACATCGGCTG AGACTTCTCC GTTCGGATTA GACTAAGTTT ACCTTGTAGC TGTAGCCGAC TCTGAAGAGG 10501 AACCAGTTTA TGGCCATTGA CCCTTTATCC TCCCCTCTAC TAATTCTCAC TTGGTCAAAT ACCGGTAACT GGGAAATAGG AGGGGAGATG ATTAAGAGTG 10551 ATGCTGACTT CTTCCGCTGA TGATTTTGGC CAGC CA.AAAC CATATCTCCC TACGACTGAA GAAGGCGACT AC TA,AA.AC C G GTCGGTTTTG GTATAGAGGG 10601 CAGAACCAAT TATTCGACAA CGAACATATA TCTCACTCCT AATCTCCCTC GTCTTGGTTA ATAAGCTGTT GCTTGTATAT AGAGTGAGGA TTAGAGGGAG 10651 CAGACTTTTC TTATTCTAGC CTTCTCCGCA AC C GA.AATAA TTATATTTTA GTC TGA.A.AAG AATAAGATCG GAAGAGGCGT TGGCTTTATT AATATA.AAAT 10701 CATTATATTT GAAGCCACAC TTATCCCCAC TCTCATTATT ATTACGCGAT GTAATATAA.A CTTCGGTGTG AATAGGGGTG AGAGTAATAA TAATGCGCTA 10751 GAGGTAACCA GACAGAACGC CTTAATGCAG GGACCTACTT CCTATTTTAC CTCCATTGGT CTGTCTTGCG GAATTACGTC CCTGGATGAA GGATAA.AATG 10801 ACCTTAATTG GCTCCCTTCC TCTTCTCATT GCCCTTCTAC TTATACP~AAA TGGAATTAAC CGAGGGAAGG AGAAGAGTAA CGGGAAGATG AATATGTTTT 10851 TAACCTCGGC ACCCTGTCCA TAATTATTAT ACAACACTCA CAACCCCTCA ATTGGAGCCG TGGGACAGGT ATTAATAATA TGTTGTGAGT GTTGGGGAGT 10901 CTCTGACTTC ATGGGCTGAC AAACTGTGAT GGGTGGCCTG CCTTCTCGCT GAGACTGAAG TACCCGACTG TTTGACACTA CCCACCGGAC GGAAGAGCGA 10951 TTTCTCGTTA AAATGCCCTT ATATGGAATC CACCTCTGAC TTCCTAAAGC AAAGAGCAAT TTTACGGGAA TATACCTTAG GTGGAGACTG AAGGATTTCG 11001 CCACGTTGAA GCCCCAATTG CCGGCTCTAT AATCCTAGCT GCCGTCTTAC GGTGCAACTT CGGGGTTAAC GGCCGAGATA TTAGGATCGA CGGCAGAATG 11051 TTAAACTAGG GGGATACGGC ATAATACGAA TTATTGTAAT ACTAGACCCT AATTTGATCC CCCTATGCCG TATTATGCTT AATAACATTA TGATCTGGGA 
11101 C TTAC CA.AAG A.AATAGC C TA CCCATTCCTA ATTTTGGCCA TCTGAGGGAT GAATGGTTTC TTTATCGGAT GGGTAAGGAT TA.A.AACCGGT AGACTCCCTA 11151 TATTATAACC AGCTCTATCT GCCTGCGACA AACTGACCTC AA.ATC TC TTA ATAATATTGG TCGAGATAGA CGGACGCTGT TTGACTGGAG TTTAGAGAAT 11201 TTGCCTACTC ATCAGTTAGC CACATGGGCC TAGTCGCAGG GGCAATTCTT AACGGATGAG TAGTCAATCG GTGTACCCGG ATCAGCGTCC CCGTTAAGAA 11251 ATCCAA.ACCC CATGAAGCTT TGCAGGGGCA ATTACGCTAA TGATTGCTCA TAGGTTTGGG GTACTTCGAA ACGTCCCCGT TAATGCGATT ACTAACGAGT 113 01 TGGCCTAATC TCCTCCGCCC TATTCTGCTT AGCTAACACA AACTACGAGC ACCGGATTAG AGGAGGCGGG ATAAGACGAA TCGATTGTGT TTGATGCTCG 11351 GAATTCATAG CCGAACAATA CTTCTGGCCC GAGGCCTGCA AGTCATTCTC CTTAAGTATC GGCTTGTTAT GAAGACCGGG CTCCGGACGT TCAGTAAGAG 11401 CCACTAATGG CAACCTGATG ATTCCTCGCT AGCCTTGCCA ACCTTGCCCT GGTGATTACC GTTGGACTAC TAAGGAGCGA TCGGAACGGT TGGAACGGGA 11451 TCCCCCCTCC CCCAATCTCA TAGGAGAACT CCTCATTATC ACCTCATTGT AGGGGGGAGG GGGTTAGAGT ATCCTCTTGA GGAGTAATAG TGGAGTAACA 11501 TTAACTGATC TAACTGAACC ATCACCCTCT CAGGTCTTGG AGTACTAATC AATTGACTAG ATTGACTTGG TAGTGGGAGA GTCCAGAACC TCATGATTAG 11551 ACAGCCTCCT ACTCCCTCTA TATATTCCTA ATAACCCAAC GCGGTCCTAC TGTCGGAGGA TGAGGGAGAT ATATAAGGAT TATTGGGTTG CGCCAGGATG 11601 CCCCCTCCAT ATCCTATCAC TAAGCCCGAC CTATACACGA GAACATCTCC 273

GGGGGAGGTA TAGGATAGTG ATTCGGGCTG GATATGTGCT CTTGTAGAGG 11651 TCCTTAGCCT CCACCTTATG CCTGTCCTGC TTCTAATATT TAAACCTGAA AGGAATCGGA GGTGGAATAC GGACAGGACG AAGATTATAA ATTTGGACTT 11701 CTTATCTGAG GCTGGACACT CTGTATTTAT AGTTTAACCA AAACATTAGA GAATAGACTC CGACCTGTGA GACATAAATA TCAA.ATTGGT TTTGTAATCT 11751 TTGTGGTTCT P.►AAAATAAA.A GTTA.AAATC T TTTTAATTAC CGAGAGAGGT AACACCAAGA TTTTTATTTT CAATTTTAGA AAAATTAATG GCTCTCTCCA 11801 CCGGGACACG AAGAACTGCT AATTCTTCTC ATCATGGCTC GAATCCATGA GGCCCTGTGC TTCTTGACGA TTAAGAAGAG TAGTACCGAG CTTAGGTACT 11851 CTCACTCGGC TTCTGAAAGA TATTAGTAAT CTATTGGTCT TAGGAACCAA GAGTGAGCCG AAGACTTTCT ATAATCATTA GATAACCAGA ATCCTTGGTT 11901 AAACTCTTGG TGCAACTCCA AGCAA.AAGC T ATGAACACCA TCTTTAACTC TTTGAGAACC ACGTTGAGGT TCGTTTTCGA TACTTGTGGT AGAA.ATTGAG 11951 ATCCTTCCTC CTAATCTTTA TTATCCTCAT CCTTCCATTA ATAACCTCAC TAGGAAGGAG GATTAGAA.AT AATAGGAGTA GGAAGGTAAT TATTGGAGTG 12001 TAAGCCCCAA AGAACTAAAC ATTAACTGAG CCTCCTCCCA TGTGAAGACA ATTCGGGGTT TCTTGATTTG TAATTGACTC GGAGGAGGGT ACACTTCTGT 12051 GCTGTAAAGA CCTCTTTCTT CATTAGTCTT ATCCCCCTGT CTATTTTCCT CGACATTTCT GGAGAAAGAA GTAATCAGAA TAGGGGGACA GATA.AAAGGA 12101 AGATCAAGGC TTAGAGTCAA TTATAACTAA CTTCAACTGA ATAAACATTG TCTAGTTCCG AATCTCAGTT AATATTGATT GAAGTTGACT TATTTGTAAC 12151 GGCCTTTTGA CATTAACATG AGTTTTAAAT TTGACTCATA CTCAGTCGTA CCGGA,AAACT GTAATTGTAC TC~TTTA AACTGAGTAT GAGTCAGCAT 12201 TTTACCCCCG TAGCCCTCTA CGTCACCTGA TCCATCCTTG AATTTGCCCT AAATGGGGGC ATCGGGAGAT GCAGTGGACT AGGTAGGAAC TTAAACGGGA 12251 CTGATATATA CACTCTGACC CAAACATTAA CCGTTTCTTC AA.ATATCTTC GACTATATAT GTGAGACTGG GTTTGTAATT GGCAA.AGAAG TTTATAGAAG 12301 TGCTCTTCCT AGTCTCAATA ATCATCCTAG TCACCGCCAA CAACATGTTC ACGAGAAGGA TCAGAGTTAT TAGTAGGATC AGTGGCGGTT GTTGTACAAG 12351 CAGCTATTCA TCGGCTGAGA AGGAGTGGGA ATCATATCTT TCCTCCTCAT GTCGATAAGT AGCCGACTCT TCCTCACCCT TAGTATAGAA AGGAGGAGTA 12401 TGGTTGGTGA TACAGCCGAA CAGACGCCAA CACAGCCGCC CTGCAAGCTG ACCAACCACT ATGTCGGCTT GTCTGCGGTT GTGTCGGCGG GACGTTCGAC 12451 TAATTTATAA CCGAGTAGGG GATATCGGAC TAATTCTCAG CATAACCTGA ATTAAATATT 
GGCTCATCCC CTATAGCCTG ATTAAGAGTC GTATTGGACT 12501 CTAGCCATAA AC C TA.AAC TC C TGAGA.AATA CAACAATTGT TTATCCTATC GATCGGTATT TGGATTTGAG GACTCTTTAT GTTGTTAACA AATAGGATAG 12551 CAAAGATATA AATCTAACCT TCCCCCTCCT CGGCCTTGTC CTAGCCGCAG GTTTCTATAT TTAGATTGGA AGGGGGAGGA GCCGGAACAG GATCGGCGTC 12601 CCGGAAAATC CGCACAATTC GGTCTCCACC CTTGGCTCCC CTCAGCTATA GGCCTTTTAG GCGTGTTAAG CCAGAGGTGG GAACCGAGGG GAGTCGATAT 12651 GAAGGCCCCA CACCAGTCTC CGCCCTACTC CACTCCAGCA CAATAGTTGT CTTCCGGGGT GTGGTCAGAG GCGGGATGAG GTGAGGTCGT GTTATCAACA 12701 CGCCGGCATT TTCCTCCTAA TCCGTCTTCA CCCGCTAATC CAAGACAACC GCGGCCGTAA AAGGAGGATT AGGCAGAAGT GGGCGATTAG GTTCTGTTGG 12751 AATTAGTCCT TACAGTATGC CTATGCCTGG GGGCATTAAC CACCCTTTTC TTAATCAGGA ATGTCATACG GATACGGACC CCCGTAATTG GTGGGA.AA.AG 12801 ACCGCAGTGT GTGCTCTCAC C C AA.AAC GAC ATC TCATTGCCTT TGGCGTCACA CACGAGAGTG GGTTTTGCTG TAGTTTTTTT AGTAACGGAA 12851 CTCAACATCC AGCCAACTCG GACTAATGAT AGTAACAATT GGCCTCAACC GAGTTGTAGG TCGGTTGAGC CTGATTACTA TCATTGTTAA CCGGAGTTGG 12901 AACCCCAACT AGCCTTCCTC CACATCTGTA CCCACGCCTT CTTCAAAGCT TTGGGGTTGA TCGGAAGGAG GTGTAGACAT GGGTGCGGAA GAAGTTTCGA 12951 ATGCTTTTCC TTTGTTCTGG GTCTATCATC CACAGTCTCA ACGATGAACA TAC GAAAAGG A.AACAAGACC CAGATAGTAG GTGTCAGAGT TGCTACTTGT 274

13001 AGACATCCGT AAGATAGGGG GCCTCCATAA ACTCTTACCC CTCACCTCAT TCTGTAGGCA TTCTATCCCC CGGAGGTATT TGAGAATGGG GAGTGGAGTA 13051 CCTCCCTGAC TGTTGGAAGC TTAGCCCTCA CAGGCATACC TTTCTTATCA GGAGGGACTG ACAACCTTCG AATCGGGAGT GTCCGTATGG AAAGAATAGT 13101 GGCTTTTTTT CA,AA,AGATGC CATCATTGAA TCCATAAACA CTTCTCACCT CCG GTTTTCTACG GTAGTAACTT AGGTATTTGT GAAGAGTGGA 13151 CAACGCCTGA GCCCTTATCC TAACCCTAAT CGCAACATCA TTCACAGCTA GTTGCGGACT CGGGAATAGG ATTGGGATTA GCGTTGTAGT AAGTGTCGAT 13201 TTTACAGCCT TCGCCTCATC TTCTTCGCAT TAATAAATTT TCCCCGATTC AA.ATGTCGGA AGCGGAGTAG AAGAAGCGTA ATTATTTAAA AGGGGCTAAG 13251 AACCCACTCT CCCCTATTAA TGAAA.ATAAC CCCATAGTTA TTAATCCCAT TTGGGTGAGA GGGGATAATT ACTTTTATTG GGGTATCAAT AATTAGGGTA 13301 CAAACGCCTA GCTTACGGAA GCATCCTAGC CGGTCTCATT ATTACATCCA GTTTGCGGAT CGAATGCCTT CGTAGGATCG GCCAGAGTAA TAATGTAGGT 13351 ACCTAACTCC CACAAAGACT CAAATCATAA CCATGCCCCC TC TATTP~AAA TGGATTGAGG GTGTTTCTGA GTTTAGTATT GGTACGGGGG AGATAATTTT 13401 CTCTCCGCCC TACTAGTGAC TATCATTGGC CTTCTGCTAG CCCTAGAGCT GAGAGGCGGG ATGATCACTG ATAGTAACCG GAAGACGATC GGGATCTCGA 13451 AGCTAATCTA ACCAACACTC AGCTCAAAAC AACTCCTACT CTCTTCCCCC TCGATTAGAT TGGTTGTGAG TCGAGTTTTG TTGAGGATGA GAGAAGGGGG 13501 ATCACTTCTC AAATATACTA GGATTCTTCC CACA.AATTAT CCACCGCTTT TAGTGAAGAG TTTATATGAT CCTAAGAAGG GTGTTTAATA GGTGGCGAAA 13551 C TGC C TA►AAA TCAGTCTAAC CTGATCCCAA CATGTCTCCA CTCACCTAGT GACGGATTTT AGTCAGATTG GACTAGGGTT GTACAGAGGT GAGTGGATCA 13601 TGACCAGTCA TGGTATGAAA A.AATTGGAC C P.~AAAAGC C C T CTCATCCAAC ACTGGTCAGT ACCATACTTT TTTAACCTGG TTTTTCGGGA GAGTAGGTTG 13651 AAATTCCATT AATTAAAATA TCCACTCAAC CTCAACAAGG TTACATTAA.A TTTAAGGTAA TTAATTTTAT AGGTGAGTTG GAGTTGTTCC AATGTAATTT 13701 GTTTATCTTA TGTTACTCCT CCTTACCCTA ACCCTAGCCT TACTCACTGC CAAATAGAAT ACAATGAGGA GGAATGGGAT TGGGATCGGA ATGAGTGACG 13751 CCTAACCTAA CCACTCGTAG GGTTCCCCAT GATAACCCCC GAGTTAACTC GGATTGGATT GGTGAGCATC CCAAGGGGTA CTATTGGGGG CTCAATTGAG 13801 CAACACCACA AACAATGTCA ATAATAACAC CCACCCACTT AAAACTAATA GTTGTGGTGT TTGTTACAGT TATTATTGTG GGTGGGTGAA TTTTGATTAT 13851 
ATCACCCACC ATCACCATAA AGCA.AAGCCA CCCCCACAAA ATCCCCCCGA TAGTGGGTGG TAGTGGTATT TCGTTTCGGT GGGGGTGTTT TAGGGGGGCT 13901 GTTATCTCTA TACTGTTCAT CTCCTCCACC CCTGATCAAC TTAACTCAAG CAATAGAGAT ATGACAAGTA GAGGAGGTGG GGACTAGTTG A.ATTGAGTTC 13951 TCACTCCACC ATGAAATATT TACCCGCAAG A.AATAAC GTC ACTAAATAAA AGTGAGGTGG TACTTTATAA ATGGGCGTTC TTTATTGCAG TGATTTATTT 14001 AACCGACATA CAACAAAACA GACCAATTAC CCCACGACTC GGGGTAAGGC TTGGCTGTAT GTTGTTTTGT CTGGTTAATG GGGTGCTGAG CCCCATTCCG 14051 TCAGCGGCAA GTGCCGCCGT ATAAGCGAAT ACCACTAATA TGCCTCCCAA AGTCGCCGTT CACGGCGGCA TATTCGCTTA TGGTGATTAT ACGGAGGGTT GTAA.ATCAAA 14101 AACA.A.AACTA AT GAC P.~A.AAA AGACCCACCA TGTCCCACCA CATTTAGTTT TTGTTTTGAT TACTGTTTTT TCTGGGTGGT ACAGGGTGGT ATAA.ACCGCA 14151 CCCTACCCCT GCAGCTATAA C CA.ATC C CAG TGCAGCATAA TATTTGGCGT GGGATGGGGA CGTCGATATT GGTTAGGGTC ACGTCGTATT 14201 TAAGGGGAAG GATTAGATGC CACTCCTATT AAACCCAGTA C CA.AACAA.AC ATTCCCCTTC CTAATCTACG GTGAGGATAA TTTGGGTCAT GGTTTGTTTG 14251 CGTTATCAAA AACATAATAT ATACCATTAT TCCTACCTGG ACTCTAACCA GCAATAGTTT TTGTATTATA TATGGTAATA AGGATGGACC TGAGATTGGT 14301 AGACCAACAA C TTGP~~AAAC TGTCGTTGTT TATTCAACTA TAAGAATTCA TCTGGTTGTT GAACTTTTTG ACAGCAACAA ATAAGTTGAT ATTCTTAAGT TGGCCCTA.A.A 14351 TATCCGAAAA ACCCACCCTC TAC TA►~~AAAT CGTCAACCAA 275

ACCGGGATTT ATAGGCTTTT TGGGTGGGAG ATGATTTTTA GCAGTTGGTT 14401 ACTCTAATTG ATCTTCCCGC CCCCTCAAAC ATCTCCGTCT GATGA.AAC TT TGAGATTAAC TAGAAGGGCG GGGGAGTTTG TAGAGGCAGA CTACTTTGAA 14451 TGGCTCACTT CTAGGACTAT GTCTAATTAT TCAAATCGTT ACAGGACTCT ACCGAGTGAA GATCCTGATA CAGATTAATA AGTTTAGCAA TGTCCTGAGA 14501 TCTTAGCCAT ACATTATACC GCAGACATCT CCCTAGCTTT CTCCTCTGTT AGAATCGGTA TGTAATATGG CGTCTGTAGA GGGATCGAAA GAGGAGACAA 14551 GTTCATATCT GCCGCGACGT TAACTATGGG TGACTTATCC GTAACATCCA CAAGTATAGA CGGCGCTGCA ATTGATACCC ACTGAATAGG CATTGTAGGT 14601 CGCCAACGGA GCCTCCCTCT TCTTTGTTTG TATCTACTTT CACATCGCCC GCGGTTGCCT CGGAGGGAGA AGAAACAAAC ATAGATGAAA GTGTAGCGGG 14651 GAGGCCTTTA CTATGGCTCC TACCTCTACA AAGAGACTTG A.AACATC GGA CTCCGGAAAT GATACCGAGG ATGGAGATGT TTCTCTGAAC TTTGTAGCCT 14701 GTAATCTTAC TATTCCTTCT CATAGCCACA GCCTTCGTGG GCTACGTCCT CATTAGAATG ATAAGGAAGA GTATCGGTGT CGGAAGCACC CGATGCAGGA 14751 ACCTTGAGGT CAAATATCCT TCTGGGGCGC AACAGTCATT ACCAACCTCC TGGAACTCCA GTTTATAGGA AGACCCCGCG TTGTCAGTAA TGGTTGGAGG 14801 TCTCCGCTTT CCCCTATGTT GGTGATGTAC TAGTACAATG AATCTGAGGC AGAGGCGAAA GGGGATACAA CCACTACATG ATCATGTTAC TTAGACTCCG 14851 GGCTTCTCAG TAGATAACGC CACCCTAACA CGATTTTTCG CATTTCACTT CCGAAGAGTC ATCTATTGCG GTGGGATTGT GC TP~AAAAGC GTAAAGTGAA 14901 CCTCCTCCCC TTCCTAATCA CCGCATTGAT AATTATCCAC GTCCTCTTTT GGAGGAGGGG AAGGATTAGT GGCGTAACTA TTAATAGGTG CAGGAGAAAA TACACGAA.AC 14951 AGGC TCA.AAC AACCCTATGG GTCTCAATTC TGACATAGAC ATGTGCTTTG TCCGAGTTTG TTGGGATACC CAGAGTTAAG ACTGTATCTG C T 15001 ~,A.A,ATC TC TTCACCCCTA CTTCTCCTAT A.AAGAC GCAC TCGGATTCTT TTTTAGAGGA AAGTGGGGAT GAAGAGGATA TTTCTGCGTG AGCCTAAGAA 15051 AACCCTTCTT ATCCTCCTAG GAGTCCTAGC CCTATTCCTA CCTAACCTCT TTGGGAAGAA TAGGAGGATC CTCAGGATCG GGATAAGGAT GGATTGGAGA 15101 TAGGTGACGC CGAAAACTAT ATCCCTGCCA ATCCTCTCGT CACCCCTCCC ATCCACTGCG GCTTTTGATA TAGGGACGGT TAGGAGAGCA GTGGGGAGGG 15151 CACATTAAAC CCGAGTGGTA CTTCCTATTT GCCTACGCCA TCCTCCGATC GTGTAATTTG GGCTCACCAT GAAGGATAA.A CGGATGCGGT AGGAGGCTAG 15201 CATCCCTAAT AA.AC TAGGGG GTGTCCTAGC CCTTCTATTC TCCATCCTCA 
GTAGGGATTA TTTGATCCCC CACAGGATCG GGAAGATAAG AGGTAGGAGT 15251 TCCTTATACT AGTTCCCTTC CTCCATACCT CTAAACAACG AAGTAGCACC AGGAATATGA TCAAGGGAAG GAGGTATGGA GATTTGTTGC TTCATCGTGG 15301 TTTCGCCCAC TTACACAAAT TTTCTTCTGA ACTCTTGTCA CCAATATACT AAAGCGGGTG AATGTGTTTA AA.AGAAGAC T TGAGAACAGT GGTTATATGA 15351 CATTCTAACT TGAATTGGGG GACAACCAGT TGAACAACCA TTCATTCTCA GTAAGATTGA ACTTAACCCC CTGTTGGTCA ACTTGTTGGT AAGTAAGAGT TTGGACAA.AT 15401 TGCATCTATC TCCTACTTTT CTCTATTCCT CATTGCATTG AACCTGTTTA ACGTAGATAG AGGATGAAA.A GAGATAAGGA GTAACGTAAC 15451 CCCCTTGCCG GCTGATGAGA A.AACA.A.AATC CTCAACCTTA ACTAATTTTG GGGGAACGGC CGACTACTCT TTTGTTTTAG GAGTTGGAAT TGATT~~AAAC 15501 ATAGCTTAGC CTAAAGCGTC GACCTTGTAA GTCGAAGACC GGAGGTTTGA TATCGAATCG GATTTCGCAG CTGGAACATT CAGCTTCTGG CCTCCAA.ACT 15551 ACCCTCCTCA AGATACATCA GGGGAAGGAG GGTTA.AACTC CTGCCCTCGG TGGGAGGAGT TCTATGTAGT CCCCTTCCTC CCAATTTGAG GACGGGAGCC 15601 CTCCCAAAGC CAAGATTCTG CCCAAACTGC CCCCTGAGTG CTGTCAGAGC GAGGGTTTCG GTTCTAAGAC GGGTTTGACG GGGGACTCAC GACAGTCTCG ATGAA.AGC CA 15651 AATACCCTTT TGGTTTTCAA AAAATGAGTC GGTTTAACAT TACTTTCGGT TTATGGGAAA ACCA,AAAGTT TTTTACTCAG CCAAATTGTA 15701 ATTAATGACA TGGCCCACAT ACCTTAATAC AAGGACATAT CTCATCTCGA TAATTACTGT ACCGGGTGTA TGGAATTATG TTCCTGTATA GAGTAGAGCT 276

15751 CTACATCACC CTATTTGACC TTCACCTATT GGTTTCACAC TCTATGTATA GATGTAGTGG GATAAACTGG AAGTGGATAA CCAAAGTGTG AGATACATAT 15801 ATACTCATTG ATTTACATTC CACTATTTCA TTACATTTCA TGCGTTATCC GTGATAA.AGT TATGAGTAAC TAAATGTAAG AATGTAAAGT ACGCAATAGG 15851 CCATTACTGT ACTAATCACT ATTTCATTAC ACTTTACTCT TAATCCTCAT GGTAATGACA TGATTAGTGA TAAAGTAATG TGAAATGAGA ATTAGGAGTA 15901 TAACCTATAA TC~TTTT CATATCATCA AATTACTCCT TCCACCCTCA ATTGGATATT AGTTTTI?~AAA GTATAGTAGT TTAATGAGGA AGGTGGGAGT 15951 AATATCTCTG TATATCTTAT GCGGGCTGGT AAGAACATCA CATCCCGCTA TTATAGAGAC ATATAGAATA CGCCCGACCA TTCTTGTAGT GTAGGGCGAT 16001. TTGTAAGGAA A~AAATTGCTC TATTTGTGGC GCTGTACTCG ATTAATCCCT AACATTCCTT TTTTAACGAG ATAAACACCG CGACATGAGC TAATTAGGGA 16051 ATCAATTGAC CAGAACTGGC ATCTGATTAA TGCTCGAGCT ACTTCAGTCC TAGTTAACTG GTCTTGACCG TAGACTAATT ACGAGCTCGA TGAAGTCAGG 16101 TTGATCGCGT CAAGAATGCC AGCCCGCTAG TTCCCTTTAA TGGCACCTTC AACTAGCGCA GTTCTTACGG TCGGGCGATC AAGGGAAATT ACCGTGGAAG 16151 GTCCTTGATC GCGTCAAGAT TTATTTTCCA CCCTGTTTTT TTGGGGGGGG CAGGAACTAG CGCAGTTCTA AATAAAAGGT GGGACP.►~~AAA AACCCCCCCC 162 01 ATGAAGCCAT CGCTATTCCC CGGAGGGGCT GAACTGGGAC ACTGAGATAA TACTTCGGTA GCGATAAGGG GCCTCCCCGA CTTGACCCTG TGACTCTATT 16251 ACCTGTAATC CCCTCGACAC TCTTCTGTAA TACTCATTAC TTATCATTCA TGGACATTAG GGGAGCTGTG AGAAGACATT ATGAGTAATG AATAGTAAGT 163 01 TGAATTAAGA TTGTCAAGTT GACCATAACT GAAAGGGATG GAGGAAGTCA ACTTAATTCT AACAGTTCAA CTGGTATTGA CTTTCCCTAC CTCCTTCAGT 16351 CGCCATAGCG GATACGTTTC GATTTTTTTG ATTA.AAGAA.A CTATGGTTTA GCGGTATCGC CTATGCAAAG CT C TAATTTCTTT GATACCAAAT 16401 P►~~AAAGAC AT TTTCTTAACC CCCATCCAGA TTGATCCTAT CACTGTACGT TTTTTCTGTA AAAGAATTGG GGGTAGGTCT AACTAGGATA GTGACATGCA 16451 TAGTGTA.A.AA TACATTTCAC TGTTTGAATA CATTCATTAC TTAATCGGAC ATCACATTTT ATGTAA.AGTG ACAAACTTAT GTAAGTAATG AATTAGCCTG 16501 ATAAATTCAT CATTATTAAG TTTCCCCCTG GGTTGTGAAA AATTGGAGCC TATTTAAGTA GTAATAATTC AAAGGGGGAC CCAACACTTT TTAACCTCGG 16551 GCTTAAP.~AAA AGATAAACAT TTTTTTGGTA A.AAAC000CC TCCCCCTAAT CGAATTTTTT TCTATTTGTA CCAT TTTTGGGGGG AGGGGGATTA 16601 ATACACGGAC TC C 
TCGAAAA ACCCCTAAAA CGAGGGCCGG ACATATATTT TATGTGCCTG AGGAGCTTTT TGGGGATTTT GCTCCCGGCC TGTATATAAA 16651 TGAAATTAGC ATGCGAAATC TATTCTGTAT TTATATTGTC A ACTTTAATCG TACGCTTTAG ATAAGACATA AATATAACAG T
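Each row of the listing above prints up to 50 bases of one strand in blocks of ten, followed by the complementary strand in the same block layout. The following is a minimal sketch, not part of the original analysis, of how a transcribed row could be checked against that layout. The function names `complement` and `row_is_consistent` are illustrative assumptions, and the check assumes only the four unambiguous bases A, C, G and T.

```python
# Complement lookup for the four unambiguous DNA bases (assumption: no IUPAC
# ambiguity codes occur in the row being checked).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Return the base-by-base complement of seq (the order is not reversed)."""
    return seq.translate(COMPLEMENT)


def row_is_consistent(blocks):
    """Check one printed row.

    `blocks` is the list of whitespace-separated sequence blocks of a row,
    without the leading position number. The first half of the blocks is
    taken as the top strand and the second half as its complement.
    """
    if len(blocks) % 2 != 0:
        return False
    half = len(blocks) // 2
    top = "".join(blocks[:half])
    bottom = "".join(blocks[half:])
    return complement(top) == bottom


# Final row of the listing (positions 16651..16691), copied as printed.
last_row = (
    "TGAAATTAGC ATGCGAAATC TATTCTGTAT TTATATTGTC A "
    "ACTTTAATCG TACGCTTTAG ATAAGACATA AATATAACAG T"
).split()

print(row_is_consistent(last_row))
```

A row that fails this check contains at least one transcription error in one of the two strands, although the check alone cannot say which strand is wrong.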

tRNA    1..70                       product = tRNA-Phe
rRNA    69..1023                    product = 12S ribosomal RNA
tRNA    1024..1095                  product = tRNA-Val
rRNA    1096..2766                  product = 16S ribosomal RNA
tRNA    2767..2841                  product = tRNA-Leu
gene    2845..3819                  gene = ND1   product = NADH dehydrogenase subunit 1
tRNA    3823..3891                  product = tRNA-Ile
tRNA    3887..3958                  product = tRNA-Gln
tRNA    3959..4027                  product = tRNA-Met
gene    4028..5071                  gene = ND2   product = NADH dehydrogenase subunit 2
tRNA    5071..5141                  product = tRNA-Trp
tRNA    complement (5143..5211)     product = tRNA-Ala
tRNA    complement (5215..5287)     product = tRNA-Asn
tRNA    complement (5321..5390)     product = tRNA-Cys
tRNA    complement (5389..5458)     product = tRNA-Tyr
gene    5460..7017                  gene = CO1   product = cytochrome c oxidase subunit 1
tRNA    complement (7016..7086)     product = tRNA-Ser
tRNA    7091..7161                  product = tRNA-Asp
gene    7168..7858                  gene = CO2   product = cytochrome c oxidase subunit 2
tRNA    7861..7934                  product = tRNA-Lys
gene    7936..8103                  gene = ATP8  product = ATP synthase F0 subunit 8
gene    8094..8777                  gene = ATP6  product = ATP synthase F0 subunit 6
gene    8777..9562                  gene = CO3   product = cytochrome c oxidase subunit 3
tRNA    9565..9634                  product = tRNA-Gly
gene    9635..9985                  gene = ND3   product = NADH dehydrogenase subunit 3
tRNA    9982..10051                 product = tRNA-Arg
gene    10052..10348                gene = ND4L  product = NADH dehydrogenase subunit 4L
gene    10342..11722                gene = ND4   product = NADH dehydrogenase subunit 4
tRNA    11723..11791                product = tRNA-His
tRNA    11792..11859                product = tRNA-Ser
tRNA    11859..11930                product = tRNA-Leu
gene    11931..13760                gene = ND5   product = NADH dehydrogenase subunit 5
gene    complement (13756..14277)   gene = ND6   product = NADH dehydrogenase subunit 6
tRNA    complement (14278..14347)   product = tRNA-Glu
gene    14350..15495                gene = CYTB  product = cytochrome b
tRNA    15495..15565                product = tRNA-Thr
tRNA    complement (15568..15636)   product = tRNA-Pro
D-Loop  15637..16691
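The feature table above gives each location as a 1-based, inclusive `start..end` range, with `complement (start..end)` marking features encoded on the opposite strand. The following is a minimal sketch, assuming exactly those two forms, of how such location strings could be parsed to obtain feature lengths. The helper names `parse_location` and `feature_length` are hypothetical and are not taken from the thesis or from any library.

```python
import re

# Matches "start..end", optionally wrapped as "complement (start..end)".
# Assumption: only these two location forms occur in the table.
LOCATION = re.compile(r"^(complement\s*\()?\s*(\d+)\.\.(\d+)\s*\)?$")


def parse_location(text):
    """Return (start, end, strand) for a 1-based inclusive location string."""
    match = LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location: {text!r}")
    strand = "-" if match.group(1) else "+"
    return int(match.group(2)), int(match.group(3)), strand


def feature_length(text):
    """Length in bases of the feature, counting both endpoints."""
    start, end, _ = parse_location(text)
    return end - start + 1


# Three locations copied from the table above.
examples = {
    "tRNA-Phe": "1..70",
    "ND6": "complement (13756..14277)",
    "D-Loop": "15637..16691",
}

for name, location in examples.items():
    print(name, parse_location(location), feature_length(location))
```

Because the ranges are inclusive, the length is `end - start + 1`. Adjacent features that share a coordinate, such as ND2 (4028..5071) and tRNA-Trp (5071..5141), overlap by one base under this convention.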

Isurus paucus mitochondrion, complete genome

1 GCTAGTGTAG CTTAATTTAA AGCATGGCAC TGAAGATGCT AAGATGAAAA CGATCACATC GAATTAAATT TCGTACCGTG ACTTCTACGA TTCTACTTTT 51 ATGACAATTT TCCGCAGGCA TGAAGGTTTG GTCCTGGCCT TAGTATTAAT TACTGTTAAA AGGCGTCCGT ACTTCCA.A.AC CAGGACCGGA ATCATAATTA 101 TGTAACCAAA ATTATACATG CAAGTTTCAG CATCCCTGTG AAAATGCCCT ACATTGGTTT TAATATGTAC GTTCAAAGTC GTAGGGACAC TTTTACGGGA 151 AACTACTCTG TCAATTAGTT AGGAGCGGGT ATCAGGCACA CATGCACGTA TTGATGAGAC AGTTAATCAA TCCTCGCCCA TAGTCCGTGT GTACGTGCAT 201 GCCCAAGACA CCTTGCTAAG CCACACCCCC AAGGGACTTC AGCAGTAATA CGGGTTCTGT GGAACGATTC GGTGTGGGGG TTCCCTGAAG TCGTCATTAT 251 AATATTGATC ACATGAGCGC AAGCTCGAAT CAGTTAAAGT TAACAGAGTT TTATAACTAG TGTACTCGCG TTCGAGCTTA GTCAATTTCA ATTGTCTCAA 301 GGTCAATCTC GTGCCAGCCA CCGCGGTTAT ACGAGTAACT CACATTAATA CCAGTTAGAG CACGGTCGGT GGCGCCAATA TGCTCATTGA GTGTAATTAT 351 CTTCCCGGCG TA.AAGAGTGA TTTAAGGAAT ATCTGACAAT AACTAAAGTT GAAGGGCCGC ATTTCTCACT A.AATTCC TTA TAGACTGTTA TTGATTTCAA 401 AAGACCTTAC CAAGCTGTCA CACGCACCCA TAAGCGGAAC CATCAACAAC TTCTGGAATG GTTCGACAGT GTGCGTGGGT ATTCGCCTTG GTAGTTGTTG 451 GA.AAGTGACT TTACCCTACT AGAAATCTTG ATGTCACGAC AGTTAGACCC CTTTCACTGA AATGGGATGA TCTTTAGAAC TACAGTGCTG TCAATCTGGG 501 CA.AACTAGGA TTAGATACCC TACTATGTCT AACCACA.A.AC TTA.AACAATA GTTTGATCCT AATCTATGGG ATGATACAGA TTGGTGTTTG AATTTGTTAT 551 ACTCACTATA TTGTTCGCCA GAGTACTACA AGCGCTAGCT TAAA,ACCCAA TGAGTGATAT AACAAGCGGT CTCATGATGT TCGCGATCGA ATTTTGGGTT 601 AGGACTTGGC GGTGTCCCAA ACCCACCTAG AGGAGCCTGT TCTGTAACCG TCCTGAACCG CCACAGGGTT TGGGTGGATC TCCTCGGACA AGACATTGGC 651 ATAATCCCCG TTA.AACCTCA CCACTTCTAG CCATCCCCGT CTATATACCG TATTAGGGGC AATTTGGAGT GGTGAAGATC GGTAGGGGCA GATATATGGC 701 CCGTCGTCAG CTCACCCTGT GAAGGCTCAA AAGTAAGCAA AAAGAACCAA GGCAGCAGTC GAGTGGGACA CTTCCGAGTT TTCATTCGTT TTTCTTGGTT 751 CTCCCACACG TCAGGTCGAG GTGTAGCGAA TGAAGTGGGT AGA.AATGGGC GAGGGTGTGC AGTCCAGCTC CACATCGCTT ACTTCACCCA TCTTTACCCG 801 TACATTTTCT ATA.AAGAAAA CACGAATGGT AAACTGAAAA ATTATCTGAA 279

ATGTAAAAGA TATTTCTTTT GTGCTTACCA TTTGACTTTT TAATAGACTT 851 GGTGGATTTA GCAGTAAGAA AAGACTAGAG AGCTTCTCTG A.AACCGGCTC CCACCTAAAT CGTCATTCTT TTCTGATCTC TCGAAGAGAC TTTGGCCGAG 901 TGGGACGCGC ACACACCGCC CGTCACTCTC C TC GA.A.AGAA GTCTACTTAT ACCCTGCGCG TGTGTGGCGG GCAGTGAGAG GAGCTTTCTT CAGATGAATA 951 TTTTAATTAA AGp~~AAATAC CAAGAGGAGG CAAGTCGTAA CATGGTAAGT AAAATTAATT TCTTTTTATG GTTCTCCTCC GTTCAGCATT GTACCATTCA 1001 GTACTGGAAA GTGCACTTGG AATCA.A.AATG TGGCTGAACC TAGTAA.AGCA CATGACCTTT CACGTGAACC TTAGTTTTAC ACCGACTTGG ATCATTTCGT 1051 CCTCCCTTAC AC C GAG GA.AA TACCCGTGCA ATTCGAGTCA TTTTGAACAT GGAGGGAATG TGGCTCCTTT ATGGGCACGT TAAGCTCAGT AAAACTTGTA 1101 TAAAGCTAGC CTGTCCACCT ACCTTAAACC CAACATTATT ACTCTACCTA ATTTCGATCG GACAGGTGGA TGGAATTTGG GTTGTAATAA TGAGATGGAT 1151 ACTCGTAAGC TCCTAACTAA AACATTTTAT CTCTTTAGTA TGGGCGACAG TGAGCATTCG AGGATTGATT TTGTAAA.ATA GAGAAATCAT ACCCGCTGTC 12 01 AAC P,~~A.AAC T CAGCGCAATA GACTATGTAC CGCAAGGGAG AGCTGAAAGA TTGTTTTTGA GTCGCGTTAT CTGATACATG GCGTTCCCTC TCGACTTTCT 1251 GAAATGAAAC AAATAATTAA AGTAAGAAAA AGCAGAGATT CTACCTCGTA CTTTACTTTG TTTATTAATT TCATTCTTTT TCGTCTCTAA GATGGAGCAT 13 01 CCTTTTGCAT CATGATTTAG CTAGAAAAAC TAGACAAAGA GATCTTTAGC GGAA.AAC GTA GTAC TA.AATC GATCTTTTTG ATCTGTTTCT CTAGAAATCG 1351 CTACCTTCCC GA.AAC TAA.AC GAGCTACTCC GAAGCAGCAC AACTTAGAGC GATGGAAGGG CTTTGATTTG CTCGATGAGG CTTCGTCGTG TTGAATCTCG 1401 CAACCCGTCT CTGTGGCAAA AGAGTGGGAA GACTTCCGAG TAGCGGTGAC GTTGGGCAGA GACACCGTTT TCTCACCCTT CTGAAGGCTC ATCGCCACTG 1451 AAGCCTATCG AGTTTAGTGA TAGCTGGTTG TCCAAGAAAA GAACTTCAAT TTCGGATAGC TCAAATCACT ATCGACCAAC AGGTTCTTTT CTTGAAGTTA 1501 TCTGCACTAA TTCTTTCATC ACCP~AAAAGT CTACCATACC AGGGCTAAAA AGACGTGATT AAGAAAGTAG TGGTTTTTCA GATGGTATGG TCCCGATTTT 1551 ATATAAGAAT TAGTAGTTAT TCAGAAGAGG TACAGCCCTT CTGAATCAAG TATATTCTTA ATCATCAATA AGTCTTCTCC ATGTCGGGAA GACTTAGTTC 1601 ATACAACTTT CAAAGGAGGG AAATGATCAT ATTTACCAAG GTTCTCACCT TATGTTGAAA GTTTCCTCCC TTTACTAGTA TAAATGGTTC CAAGAGTGGA 1651 CAGTGGGCCC AAAAGCAGCC ACCTGTAAAG TAAGCGTCAC AGCTCCAGTC GTCACCCGGG 
TTTTCGTCGG TGGACATTTC ATTCGCAGTG TCGAGGTCAG 17 01 TCACA.AAAAC CTATAATTTA GATAATCTTC TCAGCACCCC CTTAACTTTA AGTGTTTTTG GATATTAAAT CTATTAGAAG AGTCGTGGGG GAATTGAA.AT 1751 TTGGACTATT TTATAAACTT ATAAAAGAAA TTATGCTAAA ATGAGTAATA AACCTGATAA AATATTTGAA TATTTTCTTT AATACGATTT TACTCATTAT 1801 GGAGGACAA.A CCTCTCCCCG ACACAAGTGT ATGTCAGA.AA GAATTAATTC CCTCCTGTTT GGAGAGGGGC TGTGTTCACA TACAGTCTTT CTTAATTAAG 1851 ACTGACAATT ATACGAACCC AGACTGAGGT TATTATACCT TATTTTACCT TGACTGTTAA TATGCTTGGG TCTGACTCCA ATAATATGGA ATpsAAATGGA 1901 TAACTAGAAA ATCTTATTAT AACATTCGTT AACCCTACAC AGGAGTGTCT ATTGATCTTT TAGAATAATA TTGTAAGCAA TTGGGATGTG TCCTCACAGA 1951 TAAGGAAAGA TTA.A.AAGA.AA GTAAAGGAAC TC GGCAA.ACA CGAACTCCGC ATTCCTTTCT AATTTTCTTT CATTTCCTTG AGCCGTTTGT GCTTGAGGCG 2001 CTGTTTACCA AAAACATCGC CTCTTGGAAC TCATAAGAGG TCCCGCCTGC GACAAATGGT TTTTGTAGCG GAGAACCTTG AGTATTCTCC AGGGCGGACG 2051 CCTGTGACAA TGTTTAACGG CCGCGGTATT CTGACCGTGC AAAGGTAGCG GGACACTGTT ACAAATTGCC GGCGCCATAA GACTGGCACG TTTCCATCGC 2101 TAATCACTTG TC TTTTAA.AT GAAGACCCGT ATGAAAGGCA TCACGAGAGT ATTAGTGAAC AGAAAATTTA CTTCTGGGCA TACTTTCCGT AGTGCTCTCA 2151 TCAACTGTCT CTATTTTCTA ATCAATGAAA TTGATCCACC CGTGCAGAAG AGTTGACAGA GATA.AA.AGAT TAGTTACTTT AACTAGGTGG GCACGTCTTC 280

2201 CGGGTATAAT CACATTAGAC GAGAAGACCC TATGGAGCTT CAAACACATG GCCCATATTA GTGTAATCTG CTCTTCTGGG ATACCTCGAA GTTTGTGTAC 2251 AATTAATTAT GTAGACTAAC TACCCCACGG GTATAAACAA A.AATACAACA TTAATTAATA CATCTGATTG ATGGGGTGCC CATATTTGTT TTTATGTTGT 2301 CTTTTAATTT AACTGTTTTG GTTGGGGTGA CCAAGGGGAA AAACAA.ATCC GA.AAATTAAA TTGACAAA.AC CAACCCCACT GGTTCCCCTT TTTGTTTAGG 2351 CCCTTATCGA CCGAGTGCTC TCA.AACACTT P.~~AAATTAGA ATTACAATTC GGGAATAGCT GGCTCACGAG AGTTTGTGAA TTTTTAATCT TAATGTTAAG 2401 TAATTAATAA AACATTTACC GP,~~AAATGAC CCAGGATTTC CTGATCAATG ATTAATTATT TTGTA.AATGG CTTTTTACTG GGTCCTAAAG GACTAGTTAC 2451 AACCAAGTTA CCCTAGGGAT AACAGCGCAA TCCTTTCTCA GAGTCCCTAT TTGGTTCAAT GGGATCCCTA TTGTCGCGTT AGGAAAGAGT CTCAGGGATA 2501 CGACGAAAGG GTTTACGACC TCGATGTTGG ATCAGGACAT CCTAATGATG GCTGCTTTCC CAAATGCTGG AGCTACAACC TAGTCCTGTA GGATTACTAC 2551 CAGCCGTTAT TAAGGGTTCG TTTGTTCAAC GATTAACAGT CCTACGTGAT GTCGGCAATA ATTCCCAAGC AAACAAGTTG CTAATTGTCA GGATGCACTA 2601 CTGAGTTCAG ACCGGAGAAA TCCAGGTCAG TTTCTATCTA TGAATTTATT GACTCAAGTC TGGCCTCTTT AGGTCCAGTC AAAGATAGAT AC TTA.AATAA 2651 TTTCCTAGTA CGAAAGGACC GGAAAA.ATGA AGCCAATGCC CCAGGCACGC AAAGGATCAT GCTTTCCTGG CCTTTTTACT TCGGTTACGG GGTCCGTGCG 2701 TTCATTTCCA TCTATTGAAA TAAAC TA.AA.A TAGATAAGAA A.AAGC CAAC C AAGTAAAGGT AGATAACTTT ATTTGATTTT ATCTATTCTT TTTCGGTTGG 2751 ACCACCCAAG A.A.AAGGGTTG TTGGGGTGGC AGAGCCTGGT AAATGCAAA.A TGGTGGGTTC TTTTCCCAAC AACCCCACCG TCTCGGACCA TTTACGTTTT 2801 GACCTAAGTT CTTTATTCCA GAGGTTCAAA TCCTCTCCTC AACTATGCTT CTGGATTCAA GAAATAAGGT CTCCAAGTTT AGGAGAGGAG TTGATACGAA 2851 GAAGCCCTCC TCCTCTACTT AATTTGTCCA CTTACCTACA TTATCCCCAT CTTCGGGAGG AGGAGATGAA TTAAACAGGT GAATGGATGT AATAGGGGTA 2901 CCTACTAGCT ACAGCCTTCC TCACCCTAGT TGAACGGAAG GTCCTTGGCT GGATGATCGA TGTCGGAAGG AGTGGGATCA ACTTGCCTTC CAGGAACCGA 2951 ATATACAACT CCGTAAGGGC CCCAACATTG TAGGCCCGTA CGGTCTCCTT TATATGTTGA GGCATTCCCG GGGTTGTAAC ATCCGGGCAT GCCAGAGGAA 3001 CAACCCATTG CAGACGGCCT AAAACTATTT ACCAAAGAAC CCATTTATCC GTTGGGTAAC GTCTGCCGGA TTTTGATA.AA TGGTTTCTTG GGTAAATAGG 3051 ATCGGCATCC 
TCCCCATTTC TATTCTTAAT CGCCCCCACA ATAGCCCTCA TAGCCGTAGG AGGGGTAAAG ATAAGAATTA GCGGGGGTGT TATCGGGAGT 3101 CACTAGCTCT CCTCATATGA ATACCTCTCC CCCTCCCCCA TTCCGTCATC GTGATCGAGA GGAGTATACT TATGGAGAGG GGGAGGGGGT AAGGCAGTAG 3151 AACATCAACT TAGGCTTACT ATTTATTTTA GCAATCTCGA GTCTGACCGT TTGTAGTTGA ATCCGAATGA TAAATAA.AAT CGTTAGAGCT CAGACTGGCA 3201 CTACACCATC TTAGGCTCCG GATGGGCATC AAATTCAAAA TACGCCCTGA GATGTGGTAG AATCCGAGGC CTACCCGTAG TTTAAGTTTT ATGCGGGACT 3251 TAGGAGCCTT ACGAGCTGTA GCACAA.ACAA TTTCCTATGA AGTGAGCCTC ATCCTCGGAA TGCTCGACAT CGTGTTTGTT AAAGGATACT TCACTCGGAG 3301 GGACTTATCC TGCTCTCAAT AGTCATTTTT ACAGGGGGCT TCACCCTCCA CCTGAATAGG ACGAGAGTTA TCAGTP.~~AAA TGTCCCCCGA AGTGGGAGGT 3351 TACCTTTAAC TTAGCACAAG AGACAGTCTG ACTAATTATT CCAGGGTGAC ATGGA.AATTG AATCGTGTTC TCTGTCAGAC TGATTAATAA GGTCCCACTG 3401 CATTGGCCCT AATATGGTAC GTATCGACCC TAGCAGAAAC CAACCGAGTA GTAACCGGGA TTATACCATG CATAGCTGGG ATCGTCTTTG GTTGGCTCAT 3451 CCGTTTGATT TAACAGAAGG GGAATCAGAA CTAGTCTCAG GTTTTAACAT GGCA.AAC TAA ATTGTCTTCC CCTTAGTCTT GATCAGAGTC CAAAATTGTA 3501 CGAATATGCA GGGGGCTCGT TCGCCCTCTT CTTTCTTGCG GAGTATACCA GCTTATACGT CCCCCGAGCA AGCGGGAGAA GA.AAGAAC GC CTCATATGGT 3551 ATATTCTACT AATAAACACC CTTTCAGTCA TCCTTTTTAT AGGCTCCTCC 281

TATAAGATGA TTATTTGTGG GAAAGTCAGT AGGP.~~AAATA TCCGAGGAGG 3601 TATAACCCCC TCTTTCCAGA AATCTCCACC CTCAGCCTAA TGATA,A.AAGC ATATTGGGGG AGAAAGGTCT TTAGAGGTGG GAGTCGGATT ACTATTTTCG 3651 AACCCTACTT ACCCTACTTT TTTTATGAAT TCGAGCATCA TATCCTCGCT TTGGGATGAA TGGGATGAAA A.AAATAC T TA AGCTCGTAGT ATAGGAGCGA 3701 TCCGCTATGA CCAGCTCATA CACCTGGTGT G CTT CTTGCCCTTA AGGCGATACT GGTCGAGTAT GTGGACCACA CTTTTTTGAA GAACGGGAAT 3751 ACCCTAGCAA TTATATTATG ACATATCGCC CTCCCCGTAG CTTCAGCAAG TGGGATCGTT AATATAATAC TGTATAGCGG GAGGGGCATC GAAGTCGTTC 3801 CTTACCCCCC TTAACCTAGG AGGAAGAGTG CCTGAACAAA AGGACCACTT GAATGGGGGG AATTGGATCC TCCTTCTCAC GGACTTGTTT TCCTGGTGAA 3851 TGATAGAGTG GATAATGAA.A GTTA.A.AACCT TTCCTTTTCC TAGP~~AAATA ACTATCTCAC CTATTACTTT CAATTTTGGA AAGGP.,AAAGG ATCTTTTTAT 3901 GGACTTGAAC CTACACCTAA GAGATCAAAA CTCTCCGTAC TTCCAATTAT CCTGAACTTG GATGTGGATT CTCTAGTTTT GAGAGGCATG AAGGTTAATA 3951 ACCATTTCCT AGCCAAGTAA GGTCAGCTAA TCAAGCTTTT GGGCCCATAC TGGTA.AAGGA TCGGTTCATT CCAGTCGATT AGTTCGAAA.A CCCGGGTATG 4001 CCCAACCACG TCGGTTAA.AA TCCTTCCCTT ACTAATGAGC CCCCTTGTAT GGGTTGGTGC AGCCAATTTT AGGAAGGGAA TGATTACTCG GGGGAACATA 4051 TAACCATTAT TATCTCAAGC CTAGGCCTAG GAACTATTCT CACATTCATC ATTGGTAATA ATAGAGTTCG GATCCGGATC CTTGATAAGA GTGTAAGTAG 4101 GGCTCCCACT GACTGCTAGT CTGAATAGGT CTTGAAATTA ACACCTTAGC CCGAGGGTGA CTGACGATCA GACTTATCCA GAACTTTAAT TGTGGAATCG 4151 CATCCTTCCT CTAATAATCC GCCAACACCA CCCCCGAGCA GTAGAAGCCT GTAGGAAGGA GATTATTAGG CGGTTGTGGT GGGGGCTCGT CATCTTCGGA 4201 CCACP.~AAATA CTTTATTACA CAAGCCACCG CCTCAGCCCT ACTCTTATTT GGTGTTTTAT GAAATAATGT GTTCGGTGGC GGAGTCGGGA TGAGAATAAA 4251 GCTAGTGTTA CAA.AC GC C TG AACCTCAGGC GAATGGAACC TAGTTGAAAT CGATCACAAT GTTTGCGGAC TTGGAGTCCG CTTACCTTGG ATCAACTTTA 4301 GGTCAACCCA GGCTCTGCCA CACTAGCCAC AATCGCACTA GC C C TP.,~~AA.A CCAGTTGGGT CCGAGACGGT GTGATCGGTG TTAGCGTGAT CGGGATTTTT 4351 TCGGCTTGGC CCCCCTTCAC TTCTGACTCC CTGAAGTCCT CCAAGGTCTA AGCCGAACCG GGGGGAAGTG AAGACTGAGG GACTTCAGGA GGTTCCAGAT 4401 GACCTGACTA CAGGCCTTAT CCTCTCGACC TGACAAAAGC TTGCCCCATT CTGGACTGAT 
GTCCGGAATA GGAGAGCTGG ACTGTTTTCG AACGGGGTAA 4451 CGCTATCCTC CTACAACTCT ACCCCTCATT AAACCCTAAC CTACTAATTT GCGATAGGAG GATGTTGAGA TGGGGAGTAA TTTGGGATTG GATGATTAAA 4501 TTCTTGGAGT CCTCTCAACT ATAGTGGGGG GCTGAGGCGG ATTA.AAC CAA AAGAACCTCA GGAGAGTTGA TATCACCCCC CGACTCCGCC TAATTTGGTT 4551 ACCCAACTAC GA.AAGAT T C T AGCCTACTCC TCAATCGCAC ACCTTGGTTG TGGGTTGATG CTTTCTAAGA TCGGATGAGG AGTTAGCGTG TGGAACCAAC 4601 AATAATTTCC ATCCTCCACT ACTCCCATAA TTTAACCCAA CTTAATCTAA TTATTAAAGG TAGGAGGTGA TGAGGGTATT AAATTGGGTT GAATTAGATT 4651 TCCTTTACAT CATTATGACC TCAACAACCT TCCTCCTATT TAAGACATTT AGGA.AATGTA GTAATACTGG AGTTGTTGGA AGGAGGATAA ATTCTGTAAA 4701 AACTCAACCA AAATCAATTC TATCTCCTCC TCTTCCTCAA AGTCCCCCTT TTGAGTTGGT TTTAGTTAAG ATAGAGGAGG AGAAGGAGTT TCAGGGGGAA 4751 ACTTTCCATT ATTGCCCTCC TAACCCTCCT CTCTCTCGGA GGCCTTCCCC TGAAAGGTAA TAACGGGAGG ATTGGGAGGA GAGAGAGCCT CCGGAAGGGG 4801 CTCTCTCAGG CTTCATACCA AAGTGACTTA TCTTGCAAGA ACTAACTAAA GAGAGAGTCC GAAGTATGGT TTCACTGAAT AGAACGTTCT TGATTGATTT 4851 CAAAACTTAA TTATCCCCGC CATTATTATG GCTATAATAG CTCTCCTTAG GTTTTGAATT AATAGGGGCG GTAATAATAC CGATATTATC GAGAGGAATC 4901 TCTATTTTTC TATCTACGCT TATGCTACGC CACAGCACTA ACCATAACCC AGATP.►~~AAAG ATAGATGCGA ATACGATGCG GTGTCGTGAT TGGTATTGGG 282

4951 CAGCCCCAAT TAACATATTA ACATCATGAC GAACCAACTT ACCCCCCAAC GTCGGGGTTA ATTGTATAAT TGTAGTACTG CTTGGTTGAA TGGGGGGTTG 5001 CTGGCCCTAA CAGCCACTGC CTCATTGTCC ATTTTCCTTC TCCCAATCAC GACCGGGATT GTCGGTGACG GAGTAACAGG TA►AAAGGAAG AGGGTTAGTG 5051 CCCTGCCATC CTTATATTAA CGTCCTAAGA AATTTAGGTT AACAACAGAC GGGACGGTAG GAATATAATT GCAGGATTCT TTAAATCCAA TTGTTGTCTG 5101 CAAAAGCCTT CAA.AGC TTTA AGTAGAAGTG AAA.ATCTC C T AATTTCTGTT GTTTTCGGAA GTTTC GA.A.~T TCATCTTCAC TTTTAGAGGA TTA.AAGACAA 5151 AAGATTTGCA AGACTCTACC TCACATCTTC TGAATGCAAC TCAGATACTT TTCTAAACGT TCTGAGATGG AGTGTAGAAG ACTTACGTTG AGTCTATGAA 5201 TCATTAAGCT AAAACCTTCT AGATAAATAG GCCTTGATCC TACA.AAATC T AGTAATTCGA TTTTGGAAGA TCTATTTATC CGGAACTAGG ATGTTTTAGA 5251 CAGTTAACAG CTAAGCGTTC AATCCAGCGA ACTTTTATCT ACTTTCTCCC GTCAATTGTC GATTCGCAAG TTAGGTCGCT TGAAA.ATAGA TGAAAGAGGG 5301 GCCGTCAGAA AAA.A.AGGC GG GAGAAAGCCC CGGGAGAAAC TAATCTCCAT CGGCAGTCTT TTTTTCCGCC CTCTTTCGGG GCCCTCTTTG ATTAGAGGTA 5351 CTTTGGATTT GCAATCCAAC ATA.AACATC T ACTGCAGGAC TATGGTAAGA GAAACCTAAA CGTTAGGTTG TATTTGTAGA TGACGTCCTG ATACCATTCT 5401 AGAGGAATTG GACCTCTGTT CATGGGGCTA CAATCCATCA CTTAGTTCTC TCTCCTTAAC CTGGAGACAA GTACCCCGAT GTTAGGTAGT GAATCAAGAG 5451 AGTCACCTTA CCTGTGGCAA TTAATCGATG ACTATTTTCT ACA.AAC CACA TCAGTGGAAT GGACACCGTT AATTAGCTAC TGAT~~AAAGA TGTTTGGTGT 5501 AAGATATCGG CACCCTGTAT TTAATCTTTG GTGCATGAGC AGGGATAGTG TTCTATAGCC GTGGGACATA AATTAGAAAC CACGTACTCG TCCCTATCAC 5551 GGAACAGCCC TAAGCCTTCT AATTCGCGCC GAACTGGGTC AGCCAGGTTC CCTTGTCGGG ATTCGGAAGA TTAAGCGCGG CTTGACCCAG TCGGTCCAAG 5601 TCTTCTAGGG GACGATCAGA TTTATAATGT TATTGTAACC GCCCATGCAT AGAAGATCCC CTGCTAGTCT AAATATTACA ATAACATTGG CGGGTACGTA 5651 TTGTAATGAT TTTCTTCATG GTAATGCCCG TGATAATTGG GGGCTTTGGG AACATTACTA AAAGAAGTAC CATTACGGGC ACTATTAACC CCCGA.AACCC 5701 AACTGACTGG TGCCTTTAAT GATCGGTGCA CCCGATATGG CCTTCCCCCG TTGACTGACC ACGGAAATTA CTAGCCACGT GGGCTATACC GGAAGGGGGC 5751 AATA.AACAAC ATGAGCTTCT GACTCCTCCC CCCTTCTTTT CTCTTACTGC TTATTTGTTG TACTCGAAGA CTGAGGAGGG GGGAAGAAAA GAGAATGACG 5801 TAGCCTCAGC 
CGGGGTTGAA TCAGGGGCTG GAACTGGCTG GACAGTTTAC ATCGGAGTCG GCCCCAACTT AGTCCCCGAC CTTGACCGAC CTGTCAAATG 5851 CCTCCCCTAG CTGGTAACTT AGCACATGCT GGGGCATCTG TTGACTTAGC GGAGGGGATC GACCATTGAA TCGTGTACGA CCCCGTAGAC AACTGAATCG 5901 TATCTTCTCC CTTCACCTAG CAGGTATCTC GTCAATTCTG GCCTCTATTA ATAGAAGAGG GAAGTGGATC GTCCATAGAG CAGTTAAGAC CGGAGATAAT 5951 ACTTCATCAC GACAATCATC AACAT.A.AA.AC CACCAGCAAT TTCTCAGTAC TGAAGTAGTG CTGTTAGTAG TTGTATTTTG GTGGTCGTTA AAGAGTCATG 6001 CAA.ACAC C C C TATTTGTGTG ATCCATCCTA GTAACAAC TA TCCTGCTCCT GTTTGTGGGG ATAAACACAC TAGGTAGGAT CATTGTTGAT AGGACGAGGA 6051 TCTAGCCCTC CCAGTACTCG CCGCCGGCAT TACAATACTA CTTACGGACC AGATCGGGAG GGTCATGAGC GGCGGCCGTA ATGTTATGAT GAATGCCTGG 6101 GAAACCTGAA CACAACATTC TTTGACCCGG CGGGAGGGGG AGATCCTATC CTTTGGACTT GTGTTGTAAG AAACTGGGCC GCCCTCCCCC TCTAGGATAG 6151 CTCTACCAAC ATCTATTCTG ATTTTTTGGT CACCCGGAAG TCTACATTCT GAGATGGTTG TAGATAAGAC T CCA GTGGGCCTTC AGATGTAAGA 6201 TATTCTCCCT GGCTTTGGGA TAATTTCCCA TATTGTAGCC TACTATTCCG ATAAGAGGGA CCGA.AACCCT ATTAAAGGGT ATAACATCGG ATGATAAGGC 6251 GTAAGA.AAGA GCCATTTGGC TACATGGGGA TAGTCTGAGC AATAATAGCA CATTCTTTCT CGGTAAACCG ATGTACCCCT ATCAGACTCG TTATTATCGT 6301 ATCGGCCTAC TAGGTTTTAT TGTCTGAGCC CATCACATGT TCACCGTAGG 283

TAGCCGGATG ATCCp TA ACAGACTCGG GTAGTGTACA AGTGGCATCC 6351 AATGGATGTT GACACACGGG CCTACTTCAC CTCAGCAACG ATGATTATTG TTACCTACAA CTGTGTGCCC GGATGAAGTG GAGTCGTTGC TACTAATAAC 6401 CCATCCCTAC GGGTGTA,AAA GTTTTCAGCT GACTGGCAAC CCTCCACGGA GGTAGGGATG CCCACATTTT CAAAAGTCGA CTGACCGTTG GGAGGTGCCT 6451 GGCTCTGTCA AATGAGACAC CCCCTTGCTA TGAGCTCTTG GATTTATTTT CCGAGACAGT TTACTCTGTG GGGGAACGAT ACTCGAGAAC CTAAATAAAA 6501 TCTATTCACA GTGGGGGGCC TAACGGGAAT TGTCCTAGCC AACTCTTCTC AGATAAGTGT CACCCCCCGG ATTGCCCTTA ACAGGATCGG TTGAGAAGAG 6551 TAGATATTGT CCTCCACGAC ACTTATTATG TCGTAGCCCA CTTCCACTAC ATCTATAACA GGAGGTGCTG TGAATAATAC AGCATCGGGT GAAGGTGATG 6601 GTCCTTTCAA TAGGAGCAGT GTTCGCTATC ATAGCAGGTT TTATTCACTG CAGGAAAGTT ATCCTCGTCA CAAGCGATAG TATCGTCCAA AATAAGTGAC 6651 ATTCCCTTTA ATAACCGGTT ACACCCTCCA TTCAACTTGA ACA,A.AAATCC TAAGGGAAAT TATTGGCCAA TGTGGGAGGT AAGTTGAACT TGTTTTTAGG 6701 AGTTCGCAGT TATATTTATC GGAGTAAATC TGACATTCTT CCCCCAACAT TCAAGCGTCA ATATA.A.ATAG CCTCATTTAG ACTGTAAGAA GGGGGTTGTA 6751 TTCCTAGGTC TCGCTGGTAT GCC AC GAC GC TACTCAGACT ACCCAGACGC AAGGATCCAG AGCGACCATA CGGTGCTGCG ATGAGTCTGA TGGGTCTGCG 6801 TTACACTTTA TGAAACACAG TCTCCTCTAT CGGCTCCCTA ATCTCACTTG AATGTGAA.AT ACTTTGTGTC AGAGGAGATA GCCGAGGGAT TAGAGTGAAC 6851 TAGCAGTGAT TATGTTCCTG TTCATTATTT GAGAAGCATT TGCCTCAAAA ATCGTCACTA ATACAAGGAC AAGTAATAA.A CTCTTCGTAA ACGGAGTTTT 6901 CGAGAAGTCC TATCCGTCGA ACTACCCCAT ACAAATGTCG AATGACTACA GCTCTTCAGG ATAGGCAGCT TGATGGGGTA TGTTTACAGC TTACTGATGT 6951 CGGCTGCCCG CCACCCTACC ACACATATGA AGAACCAGCA TTCGTTCAAG GCCGACGGGC GGTGGGATGG TGTGTATACT TCTTGGTCGT AAGCAAGTTC 7001 TTCAACGAAC CCTTTA.AAAC AAGAAAGGAA GGAATTGACC CCCATATGTT AAGTTGCTTG GGAAATTTTG TTCTTTCCTT CCTTAACTGG GGGTATACAA 7051 AGTTTCAAGC TAACCACATC ACCACTCTGT CACTTTCTTT ATAAAGACCC TCA.AAGTTCG ATTGGTGTAG TGGTGAGACA GTGAAAGAAA TATTTCTGGG 7101 TAGTAAAACA CATTACATTA CCTTGTCAAG GCACAATTGT GGGTTTGAGC ATCATTTTGT GTAATGTAAT GGAACAGTTC CGTGTTAACA CCCAAACTCG 7151 CCCGCGGGTC TTATAACAAA TGGCACACCC CTCACAATTA GGATTTCAAG GGGCGCCCAG AATATTGTTT ACCGTGTGGG 
GAGTGTTAAT CCTAAAGTTC 7201 ATGCAGCCTC CCCAGTTATG GAAGAACTTA TCCACTTTCA CGATCATACA TACGTCGGAG GGGTCAATAC CTTCTTGAAT AGGTGAA.AGT GCTAGTATGT 7251 CTGATAATCG TGTTTCTAAT TAGCGCTCTA GTCTTATATA TTATTACAGC GACTATTAGC ACAAAGATTA ATCGCGAGAT CAGAATATAT AATAATGTCG 7301 AATAGTATCA ACAAAACTCA CAAACA.AATA CATTCTTGAC TCCCAAGAGA TTATCATAGT TGTTTTGAGT GTTTGTTTAT GTAAGAACTG AGGGTTCTCT 7351 TTGAAATCGT TTGAACTATC CTCCCCGCCA TCATCCTCAT TATAATTGCC AACTTTAGCA AACTTGATAG GAGGGGCGGT AGTAGGAGTA ATATTAACGG 7401 CTGCCGTCCT TACGAATCCT TTACCTCATG GACGA.A.ATCA ATGATCCTCA GACGGCAGGA ATGCTTAGGA AATGGAGTAC CTGCTTTAGT TACTAGGAGT 7451 CTTAACCATT AAAGCCATRG GTCATCAATG ATACTGAAGC TACGAATATA GAATTGGTAA TTTCGGTARC CAGTAGTTAC TATGACTTCG ATGCTTATAT 7501 CAGACTACGA AGATCTGGGT TTTGACTCTT ATATGATCCA AACCCAAGAC GTCTGATGCT TCTAGACCCA AAACTGAGAA TATACTAGGT TTGGGTTCTG 7551 TTGGCCCCAG GCCAATTTCG TTTATTAGAG ACAGATCATC GAATAGTAGT AACCGGGGTC CGGTTA.AAGC AAATAATCTC TGTCTAGTAG CTTATCATCA 7601 CCCCATAGAA TCACCCGTTC GTGTCCTAGT ATCCGCAGAA GATGTCTTAC GGGGTATCTT AGTGGGCAAG CACAGGATCA TAGGCGTCTT CTACAGAATG 7651 ACTCATGGGC TGTACCGGCC CTGGGAGTTA A.AATAGATGC CGTCCCAGGA TGAGTACCCG ACATGGCCGG GACCCTCAAT TTTATCTACG GCAGGGTCCT 284

7701 CGTTTAAACC AGACCGCCTT CATCATCTCC CGACCAGGCG TCTATTATGG GCAAATTTGG TCTGGCGGAA GTAGTAGAGG GCTGGTCCGC AGATAATACC 7751 TCAATGCTCA GAAATCTGTG GAGCTAACCA TAGTTTTATA CCCATTGTAG AGTTACGAGT CTTTAGACAC CTCGATTGGT ATCAAA.ATAT GGGTAACATC 7801 TAGAAGCAGT TCCTCTAGAA CACTTCGAAG CCTGATCTTC ATCAATGCTA ATCTTCGTCA AGGAGATCTT GTGAAGCTTC GGACTAGAAG TAGTTACGAT 7851 GAAGAAGCCT CATTAAGAAG CTAAACCGGG ACTAGCGTTA GCCTTTTAAG CTTCTTCGGA GTAATTCTTC GATTTGGCCC TGATCGCAAT CGGAAAATTC 7 9 0 1 C TAAAAAC TG GTGACTCCCT ACCACCCTTA ATGACATGCC TCAATTAA.AC GATTTTTGAC CACTGAGGGA TGGTGGGAAT TACTGTACGG AGTTAATTTG 7951 CCTCACCCAT GATTAATCAT CCTCTTGTTT TCATGAATAG TCTTCCTCAT GGAGTGGGTA CTAATTAGTA GGAGAACAAA AGTACTTATC AGAAGGAGTA 8001 TATCCTACCA GTGA TAA.ATCAC C T ATTTAACAAC AACCCAACAT ATAGGATGGT TTTTTTCACT ATTTAGTGGA TAAATTGTTG TTGGGTTGTA 8 0 51 TP.~AA.AAGCAC GGAGAAATCT AAGCCCGAAC CC TGAA.AC TG ACCATGATCC ATTTTTCGTG CCTCTTTAGA TTCGGGCTTG GGACTTTGAC TGGTACTAGG 8101 TAAGCTTTTT CGACCAATTC CTAAGCCCCT CCCTCCTTGG TATCCCACTA ATTC GP.~AA.AA GCTGGTTAAG GATTCGGGGA GGGAGGAACC ATAGGGTGAT 8151 ATTGCCCTCG CAATCGCTCT CCCATGACTA ATTTTTCCAA CCCCAACTAG TAACGGGAGC GTTAGCGAGA GGGTACTGAT TP~AAAAGGTT GGGGTTGATC 8201 CCGATGACTC AACAATCGAC TAATAACGCT C CAA.AGTTGA TTTATTAATC GGCTACTGAG TTGTTAGCTG ATTATTGCGA GGTTTCAACT AA.ATAATTAG 8251 GATTCGTCTA TCAGCTCATA CAACCCATTA ATTTCACCGG CCATAAATGA CTAAGCAGAT AGTCGAGTAT GTTGGGTAAT TAAAGTGGCC GGTATTTACT 8301 GCCATACTAT TTACAGCATT AATACTGTTT CTAATTACCA TTAACCTACT CGGTATGATA AATGTCGTAA TTATGACA.AA GATTAATGGT AATTGGATGA 8351 GGGCCTCCTT CCCTACACCT TTACACCCAC GACACAACTT TCCCTCAATA CCCGGAGGAA GGGATGTGGA AATGTGGGTG CTGTGTTGAA AGGGAGTTAT 8401 TAGCGTTTGC CCTGCCCTTA TGGTTTACCA CCGTCTTGAT CGGAATACTT ATCGCAAACG GGACGGGAAT ACCAAATGGT GGCAGAACTA GCCTTATGAA 8451 AACCAGCCTA CTATTGCACT AGGACACTTC TTGCCCGAAG GTACTCCCAC TTGGTCGGAT GATAACGTGA TCCTGTGAAG A.ACGGGCTTC CATGAGGGTG 8501 CCCCCTAGTA CCCGTCCTAA TTATCATCGA A.ACCATTAGC TTGTTTATCC GGGGGATCAT GGGCAGGATT AATAGTAGCT TTGGTAATCG AACAAATAGG 8551 GACCACTAGC 
ATTAGGGGTT CGACTAACCG CTAACTTGAC AGCCGGACAC CTGGTGATCG TAATCCCCAA GCTGATTGGC GATTGAACTG TCGGCCTGTG 8601 CTACTAATAC A.AC TAATTGC AACTGCAGCC TTTGTGCTCA TCACCATTAT GATGATTATG TTGATTAACG TTGACGTCGG AAACACGAGT AGTGGTAATA 8651 ACCAACCGTG GCGTTACTTA CATCAACCAT CCTATTCCTA CTAACAATCT TGGTTGGCAC CGCAATGAAT GTAGTTGGTA GGATAAGGAT GATTGTTAGA 8701 TAGAAGTGGC TGTAGCAATA ATCCAGGCAT ATGTTTTTGT TCTCCTGCTA ATCTTCACCG ACATCGTTAT TAGGTCCGTA TAC P.~AAAAC A AGAGGACGAT 8751 AGCCTCTACC TACAAGAAAA TGTCTAATGG CTCACCAAGC ACACGCATAC TCGGAGATGG ATGTTCTTTT ACAGATTACC GAGTGGTTCG TGTGCGTATG 8801 CACATAGTTG ACCCAAGCCC ATGACCACTA ACCGGAGCTA CAGCCGCCCT GTGTATCAAC TGGGTTCGGG TACTGGTGAT TGGCCTCGAT GTCGGCGGGA 8851 TTTAATAACA TCTGGACTAG CCATCTGGTT CCATTACCAT TCGCTCATCC AA.ATTATTGT AGACCTGATC GGTAGACCAA GGTAATGGTA AGCGAGTAGG 8901 TTCTTTACCT AGGACTAACC CTTCTCCTAC TAACTATGGT TCAATGATGA AAGAAATGGA TCCTGATTGG GAAGAGGATG ATTGATACCA AGTTACTACT 8951 CGTGATATTA TCCGAGAAGG AACATTCCAA GGCCACCACA CACCCCCCGT GCACTATAAT AGGCTCTTCC TTGTAAGGTT CCGGTGGTGT GTGGGGGGCA 9001 CCAAAAAGGC CTCCGCTACG GAATGATTCT ATTTATCACA TCAGAGGTGT GGTTTTTCCG GAGGCGATGC CTTACTAAGA TA.AATAGTGT AGTCTCCACA 9051 TCTTTTTCCT GGGCTTTTTC TGAGCCTTTT ACCACTCAAG TCTCGCCCCC 285

AGP~~,AAAGGA C C C GP.~3AAAG AC TC GGAA.A.A TGGTGAGTTC AGAGCGGGGG 9101 ACCCCCGAAC TGGGGGGATG CTGACCCCCA ACAGGAATTA GCCCCATGGA TGGGGGCTTG ACCCCCCTAC GACTGGGGGT TGTCCTTAAT CGGGGTACCT 9151 CCCGTTCGAA GTACCACTCC TAAACACCGC CGTACTCTTA GCCTCCGGCG GGGCAAGCTT CATGGTGAGG ATTTGTGGCG GCATGAGAAT CGGAGGCCGC 9201 TAACAGTAAC TTGAGCCCAC CACAGCCTTA TGGAAGGCAA C C GAAAAGAA ATTGTCATTG AACTCGGGTG GTGTCGGAAT ACCTTCCGTT GGCTTTTCTT 9251 ACTATTCAAG CCCTCTCTCT CACCATCCTC TTAGGTATCT ACTTCACAGT TGATAAGTTC GGGAGAGAGA GTGGTAGGAG AATCCATAGA TGAAGTGTCA 9301 CCTCCAGGCT ATGGAATATT ATGAAGCGCC TTTTACAATC GCTGATGGGG GGAGGTCCGA TACCTTATAA TACTTCGCGG AP~AATGTTAG CGACTACCCC 9351 TCTATGGAAC CACATTCTTC GTTGCCACAG GATTCCACGG CCTCCATGTT AGATACCTTG GTGTAAGAAG CAACGGTGTC CTAAGGTGCC GGAGGTACAA 9401 ATTATTGGCT CAACATTTTT AATGATTTGC CTACTCCGGC AGCTTCAATA TAATAACCGA GTTGTP.~~,AAA TTACTAAACG GATGAGGCCG TCGAAGTTAT 9451 CCACTTCACA TCCCAACACC ACTTTGGATT TGAAGCTGCT GCATGATACT GGTGAAGTGT AGGGTTGTGG TGAA.ACCTAA ACTTCGACGA CGTACTATGA 9501 GACACTTTGT AGACGTAGTG TGACTATTCC TCTATGTTTC CATCTATTGA CTGTGAAACA TCTGCATCAC ACTGATAAGG AGATACAAAG GTAGATAACT 9551 TGAGGCTCAT AATTACTTTT CTAGTATAGA CTAGTACAAA TGATTTCCAA ACTCCGAGTA TTAATGP~AAA GATCATATCT GATCATGTTT ACTAAAGGTT 9601 TCATTTAATC TTGGTTAAAA TCCAAGGAAA AGTAATGAAC CTCATCATGT AGTAAATTAG AACCAATTTT AGGTTCCTTT TCATTACTTG GAGTAGTACA 9651 CTTCTGTCGC GGCTACGGCC CTGGTTTCCC TAATCCTTGT ATTTATCGCA GAAGACAGCG CCGATGCCGG GAC CAA.AGGG ATTAGGAACA TAAATAGCGT 9701 TTCTGGCTCC CATCGCTTAA CCCAGACAAC GAGAA.AC TAT CCCCATATGA AAGACCGAGG GTAGCGAATT GGGTCTGTTG CTCTTTGATA GGGGTATACT 9751 ATGTGGCTTT GACCCCCTCG GAAGCGCACG CCTTCCATTC TCCCTACGCT TACACCGAAA CTGGGGGAGC CTTCGCGTGC GGAAGGTAAG AGGGATGCGA 9801 TCTTCCTCGT AGCTATCCTG TTCCTACTAT TTGACCTAGA AATCGCCCTC AGAAGGAGCA TCGATAGGAC AAGGATGATA AACTGGATCT TTAGCGGGAG 9851 CTTCTCCCCC TGCCCTGAGG GGATCAATCT CTATCACCCC TCTATACATT GAAGAGGGGG ACGGGACTCC CCTAGTTAGA GATAGTGGGG AGATATGTAA 9901 ATTCTGAGCG GCAGTTATCT TAATTCTACT TACCCTAGGC CTCATTTATG TAAGACTCGC CGTCAATAGA 
ATTAAGATGA ATGGGATCCG GAGTAAATAC 9951 AGTGACTTCA AGGGGGATTA GAGTGGGCAG AGTAGGTATT TAGTCTAAAA TCACTGAAGT TCCCCCTAAT CTCACCCGTC TCATCCATAA ATCAGATTTT 10001 ACAAGACCAC TAATTTCGGC TTAGTAAATT ATGGTGA,AAA TCCATAAATA TGTTCTGGTG ATTAAAGCCG AATCATTTAA TACCACTTTT AGGTATTTAT 10051 CCTTATGTCC CCCATGTATT TTAGCCTAAA CTCAGCATTC ATACTTGGCC GGAATACAGG GGGTACATAA AATCGGATTT GAGTCGTAAG TATGAACCGG 10101 TGATGGGTCT CGCACTTAAC CGCTATCACC TTCTATCTGC ACTCTTATGC ACTACCCAGA GCGTGAATTG GCGATAGTGG AAGATAGACG TGAGAATACG 10151 CTGGAAAGTA TACTACTAAC TCTATTTATT ACCATTGCTA TCTGAACCCT GACCTTTCAT ATGATGATTG AGATAA.ATAA TGGTAACGAT AGACTTGGGA 10201 TACACTAAAC TCTGTCTCCT CCTCAATCAT CCCTATAATC CTCCTCACAT ATGTGATTTG AGACAGAGGA GGAGTTAGTA GGGATATTAG GAGGAGTGTA 10251 TCTCGGCCTG CGAAGCCAGC GCAGGCCTAG CTATTCTAGT AGCCACCTCC AGAGCCGGAC GCTTCGGTCG CGTCCGGATC GATAAGATCA TCGGTGGAGG 10301 CGCTCCCACG GCTCTGATAA CCTCC~►AAAC CTAAATCTTC TCCAATGCTA GCGAGGGTGC CGAGACTATT GGAGGTTTTG GATTTAGAAG AGGTTACGAT 10351 AAAATTCTTA TCCCAACAGT TATACTCCTC CCGACCACAT GGACTATTAA TTTTAAGAAT AGGGTTGTCA ATATGAGGAG GGCTGGTGTA CCTGATAATT 10401 C TGA CTATGGCCCA TAACCACCTC CTATAGTCTC CTAATCGCAT GTTTTTTACT GATACCGGGT ATTGGTGGAG GATATCAGAG GATTAGCGTA 286

10451 TATCAAGCTT AGTCTGATTC A.AATGAGACA TAGACATTGG CTGAGACTCT ATAGTTCGAA TCAGACTAAG TTTACTCTGT ATCTGTAACC GACTCTGAGA 10501 TCCAACCA.AT TCATGGCTGT TGACCCCCTA TCCTCCCCCC TACTCATTCT AGGTTGGTTA AGTACCGACA ACTGGGGGAT AGGAGGGGGG ATGAGTAAGA 10551 CACATGCTGA CTCCTTCCAC TAATAATCTT GGCCAGCCAA AACCACATCT GTGTACGACT GAGGAAGGTG ATTATTAGAA CCGGTCGGTT TTGGTGTAGA 10601 CCCCAGAACC AGTTATTCGA CAACGAACAT ACATTTCACT CCTGATTTCT GGGGTCTTGG TCAATAAGCT GTTGCTTGTA TGTAAAGTGA GGACTAAAGA 10651 CTTCAAACTT TTCTTATTAT AGCCTTCTCC GCAACCGAAA TAATTATATT GAAGTTTGAA AAGAATAATA TCGGAAGAGG CGTTGGCTTT ATTAATATAA 10701 CTACATTATA TTTGAGGCCA CACTCATCCC CACCCTCATT ATTATCACAC GATGTAATAT AAACTCCGGT GTGAGTAGGG GTGGGAGTAA TAATAGTGTG 10751 GATGGGGAAA CCAGACAGAA CGCCTTAATG CAGGCACCTA CTTCCTATTC CTACCCCTTT GGTCTGTCTT GCGGAATTAC GTCCGTGGAT GAAGGATAAG 10801 TACACTCTAA TCGGCTCCCT CCCTCTCCTC ATCGCCCTCC TGCTCATACA ATGTGAGATT AGCCGAGGGA GGGAGAGGAG TAGCGGGAGG ACGAGTATGT 10851 GAACAACCTC GGCACCTTGT CCATAATTAT CATACAGTAC ACACAGCCCT CTTGTTGGAG CCGTGGAACA GGTATTAATA GTATGTCATG TGTGTCGGGA 10901 TAACCCTGAC CTCATGGGCC GATAAACTAT GATGAGTGGC CTGTCTCGCC ATTGGGACTG GAGTACCCGG CTATTTGATA CTACTCACCG GACAGAGCGG 10951 GCCTTTCTTG TTP~AA,ATAC C CCTATACGGA ATCCACCTCT GACTTCCTAA CGGAAAGAAC AATTTTATGG GGATATGCCT TAGGTGGAGA CTGAAGGATT 11001 AGCTCACGTC GAAGCCCCCA TTGCCGGCTC AATAATCCTA GCTGCTGTGT TCGAGTGCAG CTTCGGGGGT AACGGCCGAG TTATTAGGAT CGACGACACA 11051 TACTCAA.ACT AGGGGGATAT GGCATGATAC GAATTATTGT GATACTGGAC ATGAGTTTGA TCCCCCTATA CCGTACTATG CTTAATAACA CTATGACCTG 11101 CCTCTTACCA AAGAA.ATGGC CTACCCTTTC TTAATTTTAG CCATCTGAGG GGAGAATGGT TTCTTTACCG GATGGGAAAG AATTAAAATC GGTAGACTCC 11151 GATTATTATA ACCAGCTCCA TCTGCCTGCG GCAGACCGAC CTTAAATCTC CTAATAATAT TGGTCGAGGT AGACGGACGC CGTCTGGCTG GAATTTAGAG 11201 TCATTGCCTA CTCATCAGTA AGCCACATAG GC TTGGTC GC AGGAGCAATC AGTAACGGAT GAGTAGTCAT TCGGTGTATC CGAACCAGCG TCCTCGTTAG 11251 C TTATC CA.AA CACCATGAAG CTTTGCAGGA GCAATTACAC TAATAATCGC GAATAGGTTT GTGGTACTTC GAAACGTCCT CGTTAATGTG ATTATTAGCG 11301 
CCACGGCCTA ATCTCATCCG CCCTATTCTG CTTAGCCAAC ACTAACTACG GGTGCCGGAT TAGAGTAGGC GGGATAAGAC GAATCGGTTG TGATTGATGC 11351 AGCGAATCCA TAGCCGAACA ATACTTCTGG CCCGAGGCAT ACAAATTATT TCGCTTAGGT ATCGGCTTGT TATGAAGACC GGGCTCCGTA TGTTTAATAA 11401 TTCCCACTAA TAGCAACCTG ATGATTCTTT GCCAGCCTAG CTAATCTTGC AAGGGTGATT ATCGTTGGAC TACTAAGAAA CGGTCGGATC GATTAGAACG 11451 ACTTCCACCC TCCCCCAATC TCATAGGAGA ACTCCTCATC ATCACCTCAC TGAAGGTGGG AGGGGGTTAG AGTATCCTCT TGAGGAGTAG TAGTGGAGTG 11501 TATTCAACTG GTCTAACTGA ACCATTATCC TTTCAGGCCT TGGAGTATTA ATAAGTTGAC CAGATTGACT TGGTAATAGG A.AAGTC C GGA ACCTCATAAT 11551 GTTACAGCCT CCTACTCCCT CTACATGCTT CTAATAACCC AACGTGGCCC CAATGTCGGA GGATGAGGGA GATGTACGAA GATTATTGGG TTGCACCGGG 11601 TACCCCCCTC CACATCCTAT CACTAACTCC AAGCCACACA CGAGAACATC ATGGGGGGAG GTGTAGGATA GTGATTGAGG TTCGGTGTGT GCTCTTGTAG 11651 TCCTCCTAAG CCTCCACCTT ATGCCCATCC TACTCCTGAT TCTTAAGCCA AGGAGGATTC GGAGGTGGAA TACGGGTAGG ATGAGGACTA AGAATTCGGT 11701 GAACTCATCT GAGGCTGAAC ACTCTGTACT TATAGTTTAA C CA.AA.ACAC T CTTGAGTAGA CTCCGACTTG TGAGACATGA ATATCAAATT GGTTTTGTGA 11751 AGATTGTGGT TCTAGAAATA AGAGTTAAAA CCTCTTTAAG TATCGAGAGA TCTAACACCA AGATCTTTAT TCTCAATTTT GGAGAAATTC ATAGCTCTCT 11801 GGTCCGGGAC ACGAAGAGCT GCTAACTCTT CTTATCATGG CTCAATTCCA 287

CCAGGCCCTG TGCTTCTCGA CGATTGAGAA GAATAGTACC GAGTTAAGGT 11851 TGACTCACTC AGCTTCTGAA AGATATTAGT AATCTATTGG TCTTAGGAAC ACTGAGTGAG TCGAAGACTT TCTATAATCA TTAGATAACC AGAATCCTTG 11901 CP~~AAAC TC T TGGTGCAACT CCAAGCAAAA GCTATGAATG CCATCTTTAA GGTAGAA.ATT GTTTTTGAGA ACCACGTTGA GGTTCGTTTT CGATACTTAC CACCATCCCA 11951 CTCATCATTT CTTCTAATCT TTGCTATCCT TTAATAATCT GAGTAGTAAA GAAGATTAGA AACGATAGGA GTGGTAGGGT AATTATTAGA 12001 CACTAAGCCC CAAAGAACTA AGTCTTAACT GAGCCTCCTC CCATGTAAAA GTGATTCGGG GTTTCTTGAT TCAGAATTGA CTCGGAGGAG GGTACATTTT 12051 ACAGCCGTAA AGACCTCTTT TTTCATCAGC CTCATTCCCC TATTTATTTT TGTCGGCATT TCTGGAGAAA AAAGTAGTCG GAGTAAGGGG ATAA.ATA.A.AA 12101 CCTAGACCAG GGACTAGAAT CAATCATAAC TAACTACAAC TGAATAAACA GGATCTGGTC CCTGATCTTA GTTAGTATTG ATTGATGTTG ACTTATTTGT 12151 TTGGACCCTT CGACATTAAC ATGAGCTTCA AATTTGATAC ATACTCAATT AACCTGGGAA GCTGTAATTG TACTCGAAGT TTAAACTATG TATGAGTTAA 12201 ATATTCACCC CAGTGGCCCT CTACGTCACC TGGTCCATCC TTGAGTTTGC TATAAGTGGG GTCACCGGGA GATGCAGTGG ACCAGGTAGG AACTCAAACG 12251 CCTCTGATAC ATGCACTCTG ATCCAAACAT TAACCGCTTC TTCAAGTACC GGAGACTATG TACGTGAGAC TAGGTTTGTA ATTGGCGAAG AAGTTCATGG 12301 TCCTACTCTT TCTAATCTCA ATAATCATCC TAGTTACCGC CAACAACATA AGGATGAGAA AGATTAGAGT TATTAGTAGG ATCAATGGCG GTTGTTGTAT 12351 TTCCAGCTAT TCATCGGCTG GGAGGGGGTA GGAATCATAT CCTTTCTCCT AAGGTCGATA AGTAGCCGAC CCTCCCCCAT CCTTAGTATA GGAAAGAGGA 12401 CATTGGCTGA TGATATGCCC GAACAGATGC TAACACAGCT GCCTTGCAAG GTAACCGACT ACTATACGGG CTTGTCTACG ATTGTGTCGA CGGAACGTTC 12451 CTGTAATTTA CAATCGAGTG GGCGACATTG GACTAATTCT TAGCATAGCC GACATTA.AAT GTTAGCTCAC CCGCTGTAAC CTGATTAAGA ATCGTATCGG 12501 TGACTAGCCA TAAACCTGAA CTCCTGAGAA ATTCAACAAT TATTCATATT ACTGATCGGT ATTTGGACTT GAGGACTCTT TAAGTTGTTA ATAAGTATAA 12551 ATCCAAAGAC ATGGACTTAA CCCTACCTCT CCTCGGCCTC GTCCTAGCCG TAGGTTTCTG TACCTGAATT GGGATGGAGA GGAGCCGGAG CAGGATCGGC 12601 CCGCCGGA.AA GTCCGCACAA TTTGGCCTTC ACCCCTGACT CCCCTCAGCC GGCGGCCTTT CAGGCGTGTT AAACCGGAAG TGGGGACTGA GGGGAGTCGG 12651 ATAGAAGGAC CCACACCAGT CTCCGCCTTA CTCCACTCCA GCACAATAGT TATCTTCCTG 
GGTGTGGTCA GAGGCGGAAT GAGGTGAGGT CGTGTTATCA 12701 TGTTGCCGGC ATCTTCCTTC TAATCCGCCT TCACCCACTA ATCCAAGACA ACAACGGCCG TAGAAGGAAG ATTAGGCGGA AGTGGGTGAT TAGGTTCTGT 12751 ATCAGTTAAT TCTAACAACG TGCTTGTGCC TGGGAGCATT GACCACCCTT TAGTCAATTA AGATTGTTGC ACGAACACGG ACCCTCGTAA CTGGTGGGAA 12801 TTCACTGCAG CATGCGCACT CACCCAAAAT GACATCAAAA AGATTATTGC AAGTGACGTC GTACGCGTGA GTGGGTTTTA CTGTAGTTTT TCTAATAACG 12851 CTTCTCAACA TCCAGCCAAC TCGGACTAAT AATAGTAACA ATCGGCCTCA GAAGAGTTGT AGGTCGGTTG AGCCTGATTA TTATCATTGT TAGCCGGAGT 12901 ACCAACCCCA ACTAGCCTTC CTCCATATCT GCACCCACGC TTTCTTTAAA TGGTTGGGGT TGATCGGAAG GAGGTATAGA CGTGGGTGCG AAAGAAATTT 12951 GCCATGCTCT TCCTCTGCTC CGGATCTATC ATTCACAGCC TCAATGACGA CGGTACGAGA AGGAGACGAG GCCTAGATAG TAAGTGTCGG AGTTACTGCT 13001 ACAAGACATC C GC AA.AATAG GGGGCCTTCA TA.AAC TTTTA CCGTTTACCT TGTTCTGTAG GCGTTTTATC CCCCGGAAGT ATTTGA.AA.AT GGCAAATGGA 13051 CATCCTCCCT GACCATCGGA AGCCTGGCTC TCACAGGTAT ACCTTTCTTA GTAGGAGGGA GTGGTAGGGT TCGGACCGAG AGTGTCCATA TGGA.AAGAAT 13101 TCAGGTTTCT TCTCCA.AAGA TGCTATTATT GAATCCCTAA ACACTTCTCA AGTCCAAAGA AGAGGTTTCT ACGATAATAA CTTAGGGATT TGTGAAGAGT 13151 CCTCAACGCC TGAGCCCTTA TccTAAccCT AATCGCGACA TCATTCACAG GGAGTTGCGG ACTCGGGAAT AGGATTGGGA TTAGCGCTGT AGTAAGTGTC 288

13201 CTATCTATAG CCTCCGCCTC ATTTTCTTCG CACTAATAAA TTTCCCGCGG GATAGATATC GGAGGCGGAG TAA.AAGAAG C GTGATTATTT AAAGGGCGCC 13251 TTCAACCCAC TTTCCCCTAT CAAC GA.AAAC AACCCCATAA TCATCAACCC AAGTTGGGTG AA.AGGGGATA GTTGCTTTTG TTGGGGTATT AGTAGTTGGG 13301 AATCA.AACGC CTGGCTTACG GAAGCATCCT AGCCGGTCTC ATCATTACAT TTAGTTTGCG GACCGAATGC CTTCGTAGGA TCGGCCAGAG TAGTAATGTA 13351 CCAACCTAAC CCCCACGAA.A ACCCA.AATTA TAACCATACC CCCTCTACTA GGTTGGATTG GGGGTGCTTT TGGGTTTAAT ATTGGTATGG GGGAGATGAT 13401 AAACTCTCCG CCCTACTAGT AACTATCATC GGCCTCCTAC TGGCTCTAGA TTTGAGAGGC GGGATGATCA TTGATAGTAG CCGGAGGATG ACCGAGATCT 13451 ACTAGCCAAC CTATCCAATA CTCAACTCAA AACAACTCCT ACCCTCCTCC TGATCGGTTG GATAGGTTAT GAGTTGAGTT TTGTTGAGGA TGGGAGGAGG 13501 CCCACCACTT C TCAA.ATATA CTAGGCTACT TCCCACA.AAT CATCCACCGC GGGTGGTGAA GAGTTTATAT GATCCGATGA AGGGTGTTTA GTAGGTGGCG 13551 TTCCTACCTA A.AATTAGTC T AACCTGAGCT CA.ACACATC T CCACCCACCT AAGGATGGAT TTTAATCAGA TTGGACTCGA GTTGTGTAGA GGTGGGTGGA 13601 AATTGACCAG TCATGATATG TTGG AC CP.~~,~A.AGC CCCCTCATCC TTAACTGGTC AGTACTATAC TTTTTTAACC TGGTTTTTCG GGGGAGTAGG 13651 AACAAATCCC TCTAATCAAA TTATCAACCC AACCTCAACA AGGTTTCATT TTGTTTAGGG AGATTAGTTT AATAGTTGGG TTGGAGTTGT TCCAAAGTAA 13701 AA.AGTC TAC C TCATACTACT CTTCCTTACA CTGATTTTAG CCCTACTCAC TTTCAGATGG AGTATGATGA GAAGGAATGT GACTA.AA.ATC GGGATGAGTG 13751 CACATTAACC TAACCACACG CAA.AGTCCCT CATGACAGCC CCCGAGTTAA GTGTAATTGG ATTGGTGTGC GTTTCAGGGA GTACTGTCGG GGGCTCAATT 13801 CTCTAATACC ACAAACAAGG TCAGTAGTAA CACCCACCCG CTTAAAACCA GAGATTATGG TGTTTGTTCC AGTCATCATT GTGGGTGGGC GAATTTTGGT 13851 GTAACCAACC ACCATCACCA TA.AAGCAGAG CCACCCCCAC AAAATCCCCA CATTGGTTGG TGGTAGTGGT ATTTCGTCTC GGTGGGGGTG TTTTAGGGGT 13901 CGAGTTATCT CCATACTACT CAGCTCCTCC ACCCCCGACC AACTTAACTC GCTCAATAGA GGTATGATGA GTCGAGGAGG TGGGGGCTGG TTGAATTGAG 13951 A.AATCAC TC T ACCATAAAAT ACTTACCCGC P.~AAAAGTAAT GTTATTAAAT TTTAGTGAGA TGGTATTTTA TGAATGGGCG TTTTTCATTA CAATAATTTA 14001 P►~~AAACCAAC ATATAATAAA ACAGATCAAT TACCCCACGA CTCAGGATAA TTTTTGGTTG TATATTATTT TGTCTAGTTA ATGGGGTGCT GAGTCCTATT 
14051 GGCTCAGCAG CAAGCGCTGC TGTATAAGCA AATACTACCA ACATCCCTCC CCGAGTCGTC GTTCGCGACG ACATATTCGT TTATGATGGT TGTAGGGAGG 14101 TAAGTAAATC P~~AAACA,AA,A CCAATGACAA AAAAGACCCG CCATGCCCAA ATTCATTTAG TTTTTGTTTT GGTTACTGTT TTTTCTGGGC GGTACGGGTT 14151 CTAATAACCC ACATCCCACC CCAGCAGCTA CAACCAATCC CAATGCAGCA GATTATTGGG TGTAGGGTGG GGTCGTCGAT GTTGGTTAGG GTTACGTCGT 14201 TAATAGGGCG AAGGATTAGA TGCCACTCCT ATTA.AACCCA AAACCAAACA ATTATCCCGC TTCCTAATCT ACGGTGAGGA TAATTTGGGT TTTGGTTTGT 14251 AATCGTTATC AAAAACATAA AATATACCAT TATTCCTACC TGGACTTTAA TTAGCAATAG TTTTTGTATT TTATATGGTA ATAAGGATGG ACCTGAAATT 14301 CCAAGACTAA TAACTTGAAA AACTATCGTT GTTGATTCAA CTATAAGAAT GGTTCTGATT ATTGAACTTT TTGATAGCAA CAACTAAGTT GATATTCTTA 14351 TCATGGCCCT AAACATCCGA AAAACCCACC CTCTACTGAA AATTGTAAAC AGTACCGGGA TTTGTAGGCT TTTTGGGTGG GAGATGACTT TTAACATTTG 14401 CAGACCCTAA TCGACCTCCC AGCCCCATCG AACATCTCCA TTTGATGAAA GTCTGGGATT AGCTGGAGGG TCGGGGTAGC TTGTAGAGGT AAACTACTTT 14451 CTTTGGATCA CTCCTAGGAC TATGTTTAAT CATCCAAATC GTTACAGGAC GAAACCTAGT GAGGATCCTG ATACA.AATTA GTAGGTTTAG CAATGTCCTG 14501 TCTTCCTAGC AATACACTAC ACCGCAGACA TCTCCCTAGC TTTCTCCTCA AGAAGGATCG TTATGTGATG TGGCGTCTGT AGAGGGATCG AAAGAGGAGT 14551 GTAATTCACA TTTGCCGCGA CGTCAACTAT GGCTGACTTA TCCGCAACAT 289

CATTAAGTGT A.A.AC GGC GC T GCAGTTGATA CCGACTGAAT AGGCGTTGTA 14601 CCATGCCAAC GGAGCCTCTC TATTCTTCGT CTGTGCCTAT ATTCACATTG GGTACGGTTG CCTCGGAGAG ATAAGAAGCA GACACGGATA TAAGTGTAAC 14651 CCCGCGGTCT TTACTATGGC TCCTACCTCT ACAAGGAAAC TTGA.AACAC T GGGCGCCAGA AATGATACCG AGGATGGAGA TGTTCCTTTG AACTTTGTGA 14701 GGAGTAATCC TACTGTTCCT CCTCATAGCT ACAGCTTTCG TGGGCTACGT CCTCATTAGG ATGACAAGGA GGAGTATCGA TGTCGAAAGC ACCCGATGCA 14751 CCTACCCTGA GGCCAGATAT CCTTCTGAGG CGCAACAGTC ATCACCAACC GGATGGGACT CCGGTCTATA GGAAGACTCC GCGTTGTCAG TAGTGGTTGG 14801 TTCTCTCCGC TTTCCCTTAT ATTGGAGACA CATTAGTCCA ATGGATCTGA AAGAGAGGCG A.AAGGGAATA TAACCTCTGT GTAATCAGGT TACCTAGACT 14851 GGAGGCTTCT CAGTAGACAA TGCCACCCTA ACACGATTTT TCGCATTTCA CCTCCGAAGA GTCATCTGTT ACGGTGGGAT TGTGCTAAAA AGC GTA.AAGT 14901 CTTCCTCCTC CCCTTCCTAA TTACCGCATT AATAGTCATC CACGTTCTCT GAAGGAGGAG GGGAAGGATT AATGGCGTAA TTATCAGTAG GTGCAAGAGA 14951 TTTTACACGA AACAGGCTCA AACAACCCTA TAGGCCTTAA TTCTGACATA AAAATGTGCT TTGTCCGAGT TTGTTGGGAT ATCCGGAATT AAGACTGTAT 15001 GACAAAATCT CCTTTCACCC CTACTTCTCT TACAAAGACG CACTTGGATT CTGTTTTAGA GGAAAGTGGG GATGAAGAGA ATGTTTCTGC GTGAACCTAA 15051 CTTGACCCTC CTTATCCTTT TAGGGGCCCT AGCTCTATTT CTACCCAACC GAACTGGGAG GAATAGGA.AA ATCCCCGGGA TCGAGATAAA GATGGGTTGG 15101 TCTTAAGTGA C GC T GAAAAC TTCATCCCCG CCAACCCTCT CGTCACCCCT AGAATTCACT GCGACTTTTG AAGTAGGGGC GGTTGGGAGA GCAGTGGGGA 15151 CCCCACATTA AACCCGAATG GTACTTCCTA TTTGCCTACG CCATCCTCCG GGGGTGTAAT TTGGGCTTAC CATGAAGGAT AAACGGATGC GGTAGGAGGC 15201 ATCTATCCCT AATAAACTGG GCGGAGTCCT GGCTCTCCTA TTCTCCATCC TAGATAGGGA TTATTTGACC CGCCTCAGGA CCGAGAGGAT AAGAGGTAGG 15251 TCATCCTCCT ACTAGTACCC CTCCTCCATA CTTCTAAACA ACGAAGCAGC AGTAGGAGGA TGATCATGGG GAGGAGGTAT GAAGATTTGT TGCTTCGTCG 15301 ACCTTCCGCC CACTTACACA AGTCTTCTTC TGAATCCTCG TCACCAATAT TGGAAGGCGG GTGAATGTGT TCAGAAGAAG ACTTAGGAGC AGTGGTTATA 15351 ACTAGTTTTA ACCTGAATCG GAGGACAGCC CGTTGAACAG CCATTCATTC TGATCAAAAT TGGACTTAGC CTCCTGTCGG GCAACTTGTC GGTAAGTAAG 15401 TCATTGGACA AATTGCATCT ATCTCCTACT TTTCACTGTT CCTCATCGCA AGTAACCTGT 
TTAACGTAGA TAGAGGATGA AAAGTGACAA GGAGTAGCGT 15451 ATGCCACTCG CCGGTTGGTG AGAAAACA.AA ATCCTCAGCC TTAACTAATT TACGGTGAGC GGCCAACCAC TCTTTTGTTT TAGGAGTCGG AATTGATTAA 15501 TTGATAGCTT AAC C TA.A.AAG CGTCGACCTT GTAAGTCGAA GATCGGAGGT AACTATCGAA TTGGATTTTC GCAGCTGGAA CATTCAGCTT CTAGCCTCCA 15551 TTGAACCCTC CTCAAAATAT ATCAGGGGAA GGAGGGTTAA ACTCCTGCCC AACTTGGGAG GAGTTTTATA TAGTCCCCTT CCTCCCAATT TGAGGACGGG 15601 TTGGCTCCCA AAGCCAAGAT TCTGCCCAAA CTGCCCCCTG AATGCTGTCA AACCGAGGGT TTCGGTTCTA AGACGGGTTT GACGGGGGAC TTACGACAGT 15651 AAGCATGAAG GCCAGACACC CGTTTGGCCT TCP.~A,AAAGTA AGTCAGTTTA TTCGTACTTC CGGTCTGTGG GCA.AACCGGA AGTTTTTCAT TCAGTCAAAT 15701 ACATATTAAT GACATGGCCC ACATACCTTA ATATAGAGAC ATATCTTATC TGTATAATTA CTGTACCGGG TGTATGGAAT TATATCTCTG TATAGAATAG 15751 TCGACTACAT TACTACAATT GACTTTCACC TAATGGTATC ACACTCTATG AGCTGATGTA ATGATGTTAA C TGA.AAGTGG ATTACCATAG TGTGAGATAC 15801 TATAATACTC ATTAATTTAT ATTCCCCTAT ATCATTACAT ATTATGCTTT ATATTATGAG TAATTAAATA TAAGGGGATA TAGTAATGTA TAATACGAAA 15851 ATCCCCATTA TTCTACTATC CACTATTTCA TTACACTATA CTCTTCGTCC TAGGGGTAAT AAGATGATAG GTGATAA.AGT AATGTGATAT GAGAAGCAGG 15901 CCATTAACCT ~~AAATCAGAA TTTTCATATC ATCAATTTAC TCCTTCCACC GGTAATTGGA TTTTAGTCTT AAAAGTATAG TAGTTAAATG AGGAAGGTGG 290

15951 C TCA.AATAC T TAAGTATATC TTATGCGGGC TGGTAAGAAC ATCACATCCC GAGTTTATGA ATTCATATAG AATACGCCCG ACCATTCTTG TAGTGTAGGG 16001 GCTATTGTAA GG TT GCTCTATTTG TGGCGCTGTA CTCGATTAAT CGATAACATT CCTTTTTTAA CGAGATAAAC ACCGCGACAT GAGCTAATTA 16051 CCCTATCAAT TGCCCATACC TGGCATCTGA TTAATGCTCG AGCTACTTCA GGGATAGTTA ACGGGTATGG ACCGTAGACT AATTACGAGC TCGATGAAGT 16101 GTCCTTGATC GCGTCAAGAA TGCCAGCCCG CTAGTTCCCT TTAATGGCAC CAGGAACTAG CGCAGTTCTT ACGGTCGGGC GATCAAGGGA AATTACCGTG 16151 CTTCGTCCTT GATCGCGTCA AGATTTATTT TCCACCCTGT TTTTTTGGGG GAAGCAGGAA CTAGCGCAGT TC TAA.ATAA.A AGGTGGGACA CCCC 16201 GGGGATGAAG CCATCGCTAT TCCCCGGAGG GGCTGAACTG GGACTCTGAG CCCCTACTTC GGTAGCGATA AGGGGCCTCC CCGACTTGAC CCTGAGACTC 16251 ATAGACTTGA GACCTCCTCG ACACTCTTCT GTAATACTCA TTACTCATCA TATCTGAACT CTGGAGGAGC TGTGAGAAGA CATTATGAGT AATGAGTAGT 16301 TTCATGAATT AAGATTGTCA AGTTGACCAA AACTGAAAGG GATGGAGAGA AAGTACTTAA TTCTAACAGT TCAACTGGTT TTGACTTTCC CTACCTCTCT 16351 TTGACGCCAT AGTGGGTACG TTTCGATTTT TTTGATTAA.A GAAACTATGG AACTGCGGTA TCACCCATGC AAAGCTAAAA A.AAC TAAT T T CTTTGATACC 16401 TTT ACATTTTCTT AACCCCCATC CAAACTGATC CTAGCAGTGT AAATTTTTTT TGTAAAAGAA TTGGGGGTAG GTTTGACTAG GATCGTCACA 16451 ACGTTAGTGT AAAATGCATT TCACTATTTG AATACATTCA TTACTTATTC TGCAATCACA TTTTACGTAA AGTGATAAAC TTATGTAAGT AATGAATAAG 16501 GGGCATAAAT TCATCATTAT TAAGATCCCC CCTGCGTTGC TCGG CCCGTATTTA AGTAGTAATA ATTCTAGGGG GGACGCAACG TTTTTTAGCC 16551 AGCCGCTTAA P►.AAAAGATAA ACATTTTTTG GTP.►,AAAAC C C CCCTCCCCCT TCGGCGAATT TTTTTCTATT TGT C CATTTTTGGG GGGAGGGGGA 16601 AATATACACG GACTCCTCGA A~A.AAC C C C TA AAACGAGGGC CGGACGTATA TTATATGTGC CTGAGGAGCT TTTTGGGGAT TTTGCTCCCG GCCTGCATAT 16651 TTTTGAAATT AGCATGCGAA ATATATTCTG TATTTATATT GTAACACTAT ~CTTTAA TCGTACGCTT TATATAAGAC ATAAATATAA CATTGTGATA 16701 GAT CTA

Feature  Location                      Qualifiers
tRNA     1..70                         product = tRNA-Phe
rRNA     69..1023                      product = 12S ribosomal RNA
tRNA     1024..1096                    product = tRNA-Val
rRNA     1097..2769                    product = 16S ribosomal RNA
tRNA     2770..2844                    product = tRNA-Leu
gene     2845..3819                    gene = ND1   product = NADH dehydrogenase subunit 1
tRNA     3823..3891                    product = tRNA-Ile
tRNA     3890..3961                    product = tRNA-Gln
tRNA     3966..4034                    product = tRNA-Met
gene     4035..5078                    gene = ND2   product = NADH dehydrogenase subunit 2
tRNA     5078..5148                    product = tRNA-Trp
tRNA     complement (5150..5218)       product = tRNA-Ala
tRNA     complement (5219..5291)       product = tRNA-Asn
tRNA     complement (5325..5391)       product = tRNA-Cys
tRNA     complement (5393..5462)       product = tRNA-Tyr
gene     5464..7021                    gene = CO1   product = cytochrome c oxidase subunit 1
tRNA     complement (7019..7089)       product = tRNA-Ser
tRNA     7094..7163                    product = tRNA-Asp
gene     7170..7860                    gene = CO2   product = cytochrome c oxidase subunit 2
tRNA     7861..7934                    product = tRNA-Lys
gene     7936..8103                    gene = ATP8  product = ATP synthase F0 subunit 8
gene     8094..8777                    gene = ATP6  product = ATP synthase F0 subunit 6
gene     8777..9562                    gene = CO3   product = cytochrome c oxidase subunit 3
tRNA     9565..9634                    product = tRNA-Gly
gene     9635..9985                    gene = ND3   product = NADH dehydrogenase subunit 3
tRNA     9984..10054                   product = tRNA-Arg
gene     10055..10351                  gene = ND4L  product = NADH dehydrogenase subunit 4L
gene     10345..11725                  gene = ND4   product = NADH dehydrogenase subunit 4
tRNA     11726..11794                  product = tRNA-His
tRNA     11795..11862                  product = tRNA-Ser
tRNA     11862..11933                  product = tRNA-Leu
gene     11934..13763                  gene = ND5   product = NADH dehydrogenase subunit 5
gene     complement (13759..14280)     gene = ND6   product = NADH dehydrogenase subunit 6
tRNA     complement (14281..14350)     product = tRNA-Glu
gene     14353..15498                  gene = CYTB  product = cytochrome b
tRNA     15496..15569                  product = tRNA-Thr
tRNA     complement (15572..15640)     product = tRNA-Pro
D-Loop   15641..16703
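
The feature locations above use 1-based, inclusive coordinates, and entries marked "complement" lie on the strand opposite to the one numbered in the sequence listing. As a minimal sketch of how such a location could be turned into a subsequence, the following Python fragment parses a location string and extracts the corresponding bases. The function names (parse_location, extract_feature) and the short toy sequence are illustrative assumptions and are not part of the thesis; the complement table assumes the standard IUPAC nucleotide codes, since the listings occasionally contain ambiguity symbols such as R and W.

```python
import re

# Complement table for the standard IUPAC nucleotide codes (assumed here;
# W, S and N are their own complements).
COMPLEMENT = str.maketrans("ACGTRYWSKMN", "TGCAYRWSMKN")

# Matches "1..70" as well as "complement (14281..14350)".
LOCATION = re.compile(r"(complement)?\s*\(?\s*(\d+)\s*\.\.\s*(\d+)\s*\)?")


def parse_location(text):
    """Return (start, end, on_complement) for a feature-table location."""
    match = LOCATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location: {text!r}")
    return int(match.group(2)), int(match.group(3)), match.group(1) is not None


def extract_feature(genome, start, end, on_complement=False):
    """Slice a feature out of the numbered strand.

    Coordinates are 1-based and inclusive. For a complement feature the
    slice is complemented and reversed so that it reads 5' to 3'.
    """
    segment = genome[start - 1:end]
    if on_complement:
        segment = segment.translate(COMPLEMENT)[::-1]
    return segment


# Toy sequence for illustration only; it is not taken from the genome above.
toy = "ACGTTGCAAC"
forward = extract_feature(toy, 1, 4)
reverse = extract_feature(toy, 5, 10, on_complement=True)

start, end, on_complement = parse_location("complement (14281..14350)")
feature_length = end - start + 1
```

Applied to the table above, the same calls would take the full genome string in place of the toy sequence; the length computed from a location (end - start + 1) gives a quick consistency check against the expected size of a tRNA or protein-coding gene.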

Lamna ditropis mitochondrion, complete genome

1 GCTAGTGTAG CTTAATTTAA AGTATGGCAC TGAAGATGCT AAGATGA.AAA CGATCACATC GAATTAAATT TCATACCGTG ACTTCTACGA TTCTACTTTT 51 ATGAGAATTT TCCGCAGGCA TAAAGGTTTG GTCCTGGCCT CAGTATTAAT TACTCTTAAA AGGCGTCCGT ATTTCCA.AAC CAGGACCGGA GTCATAATTA 101 TGTAACCAAA ATTATACATG CAAGTTTCAG CATCCCTGTG AGAATGCCCT ACATTGGTTT TAATATGTAC GTTCAAAGTC GTAGGGACAC TCTTACGGGA 151 AACTACTCTG TCAATTAATT AGGAGCAGGT ATCAGGCACA CACCCACGTA TTGATGAGAC AGTTAATTAA TCCTCGTCCA TAGTCCGTGT GTGGGTGCAT 201 GCCCAAGACA CCTTGCTAAG CCACACCCCC AAGGGATCTC AGCAGTAATA CGGGTTCTGT GGAACGATTC GGTGTGGGGG TTCCCTAGAG TCGTCATTAT 251 AATATTGATC ATATGAGCGT AAGCTCGAAT CAGTTA.AAGT TAACAGAGTT TTATAACTAG TATACTCGCA TTCGAGCTTA GTCAATTTCA ATTGTCTCAA 301 GGTTAATCTC GTGCCAGCCA CCGCGGTTAT ACGAGTAACT CATATTAATA CCAATTAGAG CACGGTCGGT GGCGCCAATA TGCTCATTGA GTATAATTAT 351 CTTCCCGGCG TAAAGAGTGA TTTAAGGAAT ACCTGCCAAT AACTAAAGTT GAAGGGCCGC ATTTCTCACT AAATTCCTTA TGGACGGTTA TTGATTTCAA 401 AAGACCTTAT CAGGCTGTCA CACGCACCCA CAAGCGGAAT TGTCAACAAC TTCTGGAATA GTCCGACAGT GTGCGTGGGT GTTCGCCTTA ACAGTTGTTG 451 GAAAGTGACT TTATTACCCC TAGA.AATC T T GATGTCACGA CAGTTAGACC CTTTCACTGA AATAATGGGG ATCTTTAGAA CTACAGTGCT GTCAATCTGG 501 C CA.AAC TAGG ATTAGATACC CTACTATGTC TAAC CACA.AA C TTA.AACAAT GGTTTGATCC TAATCTATGG GATGATACAG ATTGGTGTTT GAATTTGTTA 551 GACCTACTAT ATTGTTCGCC AGAGTACTAC AAGCGCTAGC TTA.A.AAC C CA CTGGATGATA TAACAAGCGG TCTCATGATG TTCGCGATCG AATTTTGGGT 601 AAGGACTTGG CGGTGTCCCA AACCCACCTA GAGGAGCCTG TTCTGTAACC TTCCTGAACC GCCACAGGGT TTGGGTGGAT CTCCTCGGAC AAGACATTGG 651 GATAATCCCC GTTA.AAC C TC ACCACTTCTA GCCATCCCCG TCTATATACC CTATTAGGGG CAATTTGGAG TGGTGAAGAT CGGTAGGGGC AGATATATGG 701 GCCGTCGTCA GCTCACCCTG TGAAGGCCTA AAAGTAAGCA AAAAGAACTA CGGCAGCAGT CGAGTGGGAC ACTTCCGGAT TTTCATTCGT TTTTCTTGAT 751 ACTTCCATAC GTCAGGTCGA GGTGTAGCGA ATGAAGTGGA TAGAAATGGG TGAAGGTATG CAGTCCAGCT CCACATCGCT TACTTCACCT ATCTTTACCC 801 CTACATTTTC TATAAAGAAA ACACGAATGG CA.AACTGAAA AATTGCCTAA GATGTAP~AAG ATATTTCTTT TGTGCTTACC GTTTGACTTT TTAACGGATT 851 AGGTGGATTT AGCAGTAAGA AAAGACTAGA 
GAGCTTCTCT GAAACCGGCT TC CAC C TAAA TCGTCATTCT TTTCTGATCT CTCGAAGAGA CTTTGGCCGA 901 CTGGGACGCG CACACACCGC CCGTCACTCT CCTCTACAAA AAAATCTACT GACCCTGCGC GTGTGTGGCG GGCAGTGAGA GGAGATGTTT TTTTAGATGA 951 TATTTTTAAT TAAAGAAAAT ACATCAAGAG GAGGCAAGTC GTAACATCGG 293

ATP►~~?~AATTA ATTTCTTTTA TGTAGTTCTC CTCCGTTCAG CATTGTAGCC 1001 TAAGTGTACT GGAAAGTGCA CTTGGAATCA A.AATGTGGCT AAACTAGCAA ATTCACATGA CCTTTCACGT GAACCTTAGT TTTACACCGA TTTGATCGTT 1051 AGCACCTCCC TTACACCGAG GAAATACTCG TGCAATTCGA GTCATTTTGA TCGTGGAGGG AATGTGGCTC CTTTATGAGC ACGTTAAGCT CAGT~~AA.AC T 1101 ACATTAAAGC TAGCCTGTCC ATCTACCTCA AACCCAACAT TATTAACTAC TGTAATTTCG ATCGGACAGG TAGATGGAGT TTGGGTTGTA ATAATTGATG 1151 CTCACGTATT TATTCCTAAC T~CATTT TATTATTTTA GTATGGGCGA GAGTGCATAA ATAAGGATTG ATTTTGTAAA ATAATAAAAT CATACCCGCT 1201 CAGAACAAAA ATTCAGCGCA ATAGACCATG TACCGCAAGG GAAAGCTGAA GTCTTGTTTT TAAGTCGCGT TATCTGGTAC ATGGCGTTCC CTTTCGACTT 1251 AGAGAAATGA AATAAATAAT TAAAGTAGAA AAAAGCAGAG ATTTCACCTC TCTCTTTACT TTATTTATTA ATTTCATCTT TTTTCGTCTC TAAAGTGGAG 1301 GTACCTTTTG CATCATGATT TAGCTAGAAA AACTAGACAA AGAGATCTTT CATGGAAAAC GTAGTACTAA ATCGATCTTT TTGATCTGTT TCTCTAGAAA 1351 AGCCTATCCT CCCGAAACTA AACGAGCTAC TCCGAAGCAG CACAATTTAG TCGGATAGGA GGGCTTTGAT TTGCTCGATG AGGCTTCGTC GTGTTAAATC 1401 AGCCAACCCG TCTCTGTGGC AAAAGAGTGG GAAGACTTCC GAGTAGCGGT TCGGTTGGGC AGAGACACCG TTTTCTCACC CTTCTGAAGG CTCATCGCCA 1451 GACAAGCCTA TCGAGTTTAG TGATAGCTGG TTGTCCAAGA A.AAGAAC TTC CTGTTCGGAT AGCTCAAATC ACTATCGACC AACAGGTTCT TTTCTTGAAG 1501 AATTCTGCAT TAATTCTTTC ATCACCAAAA AGTTTATCAT ACCAAGGTCA TTAAGACGTA ATTAAGAAAG TAGTGGTTTT TCAAATAGTA TGGTTCCAGT 1551 CACATAAGA.A TTAGTAGTTA TTCAGAAGAG GTACAGCCCT TCTGAACCAA GTGTATTCTT AATCATCAAT AAGTCTTCTC CATGTCGGGA AGACTTGGTT 1601 GACACAACTT TCWAAGGAGG GAAATGATCA CATTTATCAA GGTTCTCACC CTGTGTTGAA AGWTTCCTCC CTTTACTAGT GTA.AATAGTT CCAAGAGTGG 1651 CCAGTGGGCC TAAAAGCAGC CACCTGTA.AA GTAAGCGTCA CAGCTCCAGT GGTCACCCGG ATTTTCGTCG GTGGACATTT CATTCGCAGT GTCGAGGTCA 1701 CTCACP.~AAAA CCTATAATTC AGATATTCTT CTCAGGACCC CCTTAACCAT GAGTGTTTTT GGATATTAAG TCTATAAGAA GAGTCCTGGG GGAATTGGTA 1751 ATTGGACTAT TTTATAAAAT TATAAA.AGAA CTTGATGCTA AAATGAGTAA TAACCTGATA AAATATTTTA ATATTTTCTT GAACTACGAT TTTACTCATT 1801 TAAGAGGTTA AACCTCTCCC GACACAAGTG TATATCAGAA AGAATTAATT ATTCTCCAAT TTGGAGAGGG CTGTGTTCAC 
ATATAGTCTT TCTTAATTAA 1851 CACTGATAAT TAAACGAACC CAAACTGAGG TCATTATATT CATATTTTAC GTGACTATTA ATTTGCTTGG GTTTGACTCC AGTAATATAA GTATAAAATG 1901 CCAACTAGAA AATCTTATTA TAACATTCGT TAACCCTACA CAGGAGTGTC GGTTGATCTT TTAGAATAAT ATTGTAAGCA ATTGGGATGT GTCCTCACAG 1951 CTAAGGAAAG AT TA.A.AAGAA AATAA.AGGAA CTCGGCA.AAC ACGAACTCCG GATTCCTTTC TAATTTTCTT TTATTTCCTT GAGCCGTTTG TGCTTGAGGC 2001 CCTGTTTACC p~~A.AACATC G CCTCTTGGAA GCCCCATAAG AGGTCCCGCC GGACAAATGG TTTTTGTAGC GGAGAACCTT CGGGGTATTC TCCAGGGCGG 2051 TGCCCTGTGA CAATGTTTAA CGGCCGCGGT ATTCTGACCG TGCAAAGGTA ACGGGACACT GTTACAAATT GCCGGCGCCA TAAGACTGGC ACGTTTCCAT 2101 GCGTAATCAC TTGTCTTTTA AATGAAGACC CGTATGAAAG GCATCACGAG CGCATTAGTG AACAGAAAAT TTACTTCTGG GCATACTTTC CGTAGTGCTC 2151 AGTTCAACTG TCTCTATTTT CTAATCAATG A.AATTGATC T ACCCGTGCAG TCAAGTTGAC AGAGATAAAA GATTAGTTAC TTTAACTAGA TGGGCACGTC 2201 AAGCGGGTAT AACTACATTA GACGAGAAGA CCCTATGGAG CTTCAAACAC TTCGCCCATA TTGATGTAAT CTGCTCTTCT GGGATACCTC GAAGTTTGTG 2251 ATAGATTAAT TATGTAGATT AATTATTCTA CGGATATAAA TP~~AAATATA TATCTAATTA ATACATCTAA TTAATAAGAT GCCTATATTT ATTTTTATAT 2301 ATACTTTTAA TTTAACTGTC TTTGGTTGGG GTGACCAAGG GG TT TATGAAAATT AAATTGACAG AAACCAACCC CACTGGTTCC CCTTTTTTAA 294

2351 ATCCCCCTTA TCGACCGAGT GTTCTCAAGC AC T C TP.~A,AAA TGTAGAATTA TAGGGGGAAT AGCTGGCTCA CAAGAGTTCG TGAGATTTTT ACATCTTAAT 2401 CAATTCTAAT TAAT~?~AAATA TTTAC C GA.AA AATGACCCAG GATTTCCTGA GTTAAGATTA ATTATTTTAT AAATGGCTTT TTACTGGGTC C TA.A.AGGAC T 2451 TCAATGAACC AAGTTACCCT AGGGATAACA GCGCAATCCT TTCTCAGAGT AGTTACTTGG TTCAATGGGA TCCCTATTGT CGCGTTAGGA AAGAGTCTCA 2501 CCCTATCGAC GAAAGGGTTT ACGACCTCGA TGTTGGATCA GGACATCCTA GGGATAGCTG CTTTCCCAAA TGCTGGAGCT ACAACCTAGT CCTGTAGGAT 2551 ATGATGCAGC CGTCATTAAG GGTTCGTTTG TTCAACGATT AATAGTCCTA TAC TAC GTC G GCAGTAATTC CCAAGCAAAC AAGTTGCTAA TTATCAGGAT 2601 CGTGATCTGA GTTCAGACCG GAGAAATCCA GGTCAGTTTC TATCTATGAT GCACTAGACT CAAGTCTGGC CTCTTTAGGT CCAGTCA.AAG ATAGATACTA 2651 TTTATTTTTC CTAGTACGAA AGGAGCCGGA A,AAATGAAGC CAATACCCTA AAATp~~AAAG GATCATGCTT TCCTCGGCCT TTTTACTTCG GTTATGGGAT 2701 GGCACGCTTC ATTTTCATCT ATTGAAATAA ACTAAAATAG ATAAG CCGTGCGAAG TAAAAGTAGA TAACTTTATT TGATTTTATC TATTCTTTTT 2751 ACCAACTACC ACCCAAGAAA AGGGTTGTTG GGGTGGCAGA GCCTGGTAAT TGGTTGATGG TGGGTTCTTT TCCCAACAAC CCCACCGTCT CGGACCATTA 2801 TGCAAAAGAC CTAAGCTCTT TATTCCAGAG GTTCAAATCC TCTCCTCAAC ACGTTTTCTG GATTCGAGAA ATAAGGTCTC CAAGTTTAGG AGAGGAGTTG 2851 TATGCTTGAA GCCCTTCTTC TTTACTTAAT CTGCCCGCTA ACCTATATTG ATACGAACTT CGGGAAGAAG AAATGAATTA GACGGGCGAT TGGATATAAC 2901 TTCCCATCCT ACTGGCCACA GCCTTCCTTA CCTTAGTTGA ACGAAAAGTC AAGGGTAGGA TGACCGGTGT CGGAAGGAAT GGAATCAACT TGCTTTTCAG 2951 CTCGGTTATA TACAGCTCCG TAAAGGCCCC AACATTGTGG GCCCATATGG GAGCCAATAT ATGTCGAGGC ATTTCCGGGG TTGTAACACC CGGGTATACC 3001 CCTACTTCAA CCCATTGCAG ACGGCTTAAA ATTATTTACC AAAGAACCTA GGATGAAGTT GGGTAACGTC TGCCGAATTT TAATAAATGG TTTCTTGGAT 3051 TCTACCCATC AGCATCTTCC CCTTTCCTAT TTTTGGTTGC CCCCACAATG AGATGGGTAG TCGTAGAAGG GGAA.AGGATA A.A.AAC CAAC G GGGGTGTTAC 3101 GCTCTTACAT TAGCCCTCCT CATATGAATG CCCCTCCCTC TCCCCCACTC CGAGAATGTA ATCGGGAGGA GTATACTTAC GGGGAGGGAG AGGGGGTGAG 3151 CGTTATTAAT CTTAATCTAG GCTTACTATT CATTCTAGCA ATCTCAAGTC GCAATAATTA GAATTAGATC CGAATGATAA GTAAGATCGT TAGAGTTCAG 3201 TGACCGTCTA 
CACCATCTTA GGTTCCGGAT GAGCATCAAA TTCAAAATAT ACTGGCAGAT GTGGTAGAAT CCAAGGCCTA CTCGTAGTTT AAGTTTTATA 3251 GCCCTGATAG GGGCCTTACG AGCTGTAGCA CAAACAATCT CCTACGAAGT CGGGACTATC CCCGGAATGC TCGACATCGT GTTTGTTAGA GGATGCTTCA 3301 AAGTCTTGGC TTAATCCTCC TATCAATAAT TATTTTTACA GGTGGCTTCA TTCAGAACCG AATTAGGAGG ATAGTTATTA ATP►~~AAATGT CCACCGAAGT 3351 CCCTCCATAC TTTCAACTTA TCCCAAGAAA CAGTTTGACT AATCATCCCG GGGAGGTATG AAAGTTGAAT AGGGTTCTTT GTCAAACTGA TTAGTAGGGC 3401 GGGTGACCCT TAGCCCTAAT ATGATATGTA TCAACCCTGG CAGAAACTAA CCCACTGGGA ATCGGGATTA TACTATACAT AGTTGGGACC GTCTTTGATT 3451 CCGGGTACCA TTTGATTTAA CAGAAGGAGA ATCAGAACTA GTCTCAGGCT GGCCCATGGT AAACTAAATT GTCTTCCTCT TAGTCTTGAT CAGAGTCCGA 3501 TTAACATCGA ATATGCAGGT GGCTCATTCG CTCTATTCTT CCTCGCTGAG AATTGTAGCT TATACGTCCA CCGAGTAAGC GAGATAAGAA GGAGCGACTC 3551 TACACAAATA TCCTACTAAT AAATACCCTC TCAGTCATCC TCTTCATAGG ATGTGTTTAT AGGATGATTA TTTATGGGAG AGTCAGTAGG AGAAGTATCC 3601 CTCCTCCTAC AACCCCTTCT TTCCAGAAAT CTCAACTCTT AGTTTAATAA GAGGAGGATG TTGGGGAAGA AAGGTCTTTA GAGTTGAGAA TCAAATTATT 3651 TA.AAAGCAAC CTCACTCACC CTACTTTTTT TATGAATTCG AGCATCATAC ATTTTCGTTG GAGTGAGTGG GATG ATACTTAAGC TCGTAGTATG 3701 CCTCGCTTTC GCTATGATCA ACTCATACAC TTAGTATGAA AA.PsATTTCCT 295

GGAGCGAAAG CGATACTAGT TGAGTATGTG AATCATACTT TTTTAAAGGA 3751 ACCTTTAACC CTAGCAATTA TATTATGACA TATTGCCCTC CCTATCGCCA TGGAA.ATTGG GATCGTTAAT ATAATACTGT ATAACGGGAG GGATAGCGGT 3801 CAGCAAGCTT GCCCCCTTTA ACCTA.AAAGG AAGCGTGCCT GAACA.AAGGA GTCGTTCGAA CGGGGGAAAT TGGATTTTCC TTCGCACGGA CTTGTTTCCT 3851 CCACTTTGAT AGAGTGGATA ATGAAAGTTA AAATCTCTCC TCTTCCTAGA GGTGAAACTA TCTCACCTAT TACTTTCAAT TTTAGAGAGG AGAAGGATCT 3901 A,AAATAGGGT TTGAACCTAT GCCTAACACA TCAAAACTCT TCGTGCTTCC TTTTATCCCA AACTTGGATA CGGATTGTGT AGTTTTGAGA AGCACGAAGG 3951 AATTATACTA CTTTCTAAGT AAGGTCAGCT AACAA.AGC TT TTGGGCCCAT TTAATATGAT GAAAGATTCA TTCCAGTCGA TTGTTTCGAA AACCCGGGTA 4001 ACCCCAACCA TGTCGGTTAA AATCCCTCCC CTACTAATGA ACCCTGTCGT TGGGGTTGGT ACAGCCAATT TTAGGGAGGG GATGATTACT TGGGACAGCA 4051 ATTAACCATC ATTATTTCAA GCCTGGGCCT GGGGACTATC CTTACATTCA TAATTGGTAG TAATAAAGTT CGGACCCGGA CCCCTGATAG GAATGTAAGT 4101 TTGGTTCCCA CTGACTTCTA GTCTGAATGG GCCTTGAAAT CAACACCCTG AACCAAGGGT GACTGAAGAT CAGACTTACC CGGAACTTTA GTTGTGGGAC 4151 GCCATCCTCC CTCTTATAAT TCGCCAACAC CACCCTCGAG CTGTAGAAGC CGGTAGGAGG GAGAATATTA AGCGGTTGTG GTGGGAGCTC GACATCTTCG 4201 C TC CACP~AAA TACTTCATTA CACAAGCCAC CGCCTCAGCC TTACTCCTAT GAGGTGTTTT ATGAAGTAAT GTGTTCGGTG GCGGAGTCGG AATGAGGATA 4251 TTGCTAGCGT TACAAACGCC TGAACCTCAG GC GAATGAA.A CCTGGTTGAA AACGATCGCA ATGTTTGCGG ACTTGGAGTC CGCTTACTTT GGACCAACTT 4301 ATAGTCAACC CAGGCTCTGC CACACTGGCC ACAATCGCAT TAGCATTAAA TATCAGTTGG GTCCGAGACG GTGTGACCGG TGTTAGCGTA ATCGTAATTT 4351 AATTGGTCTA GCCCCCCTTC ACTTCTGACT CCCTGAAGTC CTTCCAGGTC TTAACCAGAT CGGGGGGAAG TGAAGACTGA GGGACTTCAG GAAGGTCCAG 4401 TAGACCTGAC CACAGGCCTC ATCCTCTCCA CCTGACAAAA ACTTGCCCCC ATCTGGACTG GTGTCCGGAG TAGGAGAGGT GGACTGTTTT TGAACGGGGG 4451 TTCGCCATTC TCCTACAACT TTACCCCTCA TTAAATCCCA ACCTCCTAGT AAGCGGTAAG AGGATGTTGA AATGGGGAGT AATTTAGGGT TGGAGGATCA 4501 CTTCCTTGGT GTACTCTCAA CTATAGTGGG GGGCTGAGGA GGATTAAATC GAAGGAACCA CATGAGAGTT GATATCACCC CCCGACTCCT CCTAATTTAG 4551 AAACTCAACT ACG~~~AAATC CTAGCCTACT CCTCAATTGC ACACCTCGGA TTTGAGTTGA TGCTTTTTAG GATCGGATGA 
GGAGTTAACG TGTGGAGCCT 4601 TGAATAATTT CCATCCTCCA CTACTCCCAT AATTTAACTC AACTTAACCT ACTTATTAA.A GGTAGGAGGT GATGAGGGTA TTAAATTGAG TTGAATTGGA 4651 AATTCTCTAC ATCATCATAA CCTCAACAAC CTTCCTCCTA TTCAAGACAT TTAAGAGATG TAGTAGTATT GGAGTTGTTG GAAGGAGGAT AAGTTCTGTA 47 01 TTAAC TCAAC CP~AAATCAAT TCCATCTCCT CTTCTTCAGC AAAATCCCCC AATTGAGTTG GTTTTAGTTA AGGTAGAGGA GAAGAAGTCG TTTTAGGGGG 4751 TTACTTTCCA TTATTGCCCT CATAACCCTT CTCTCCCTTG GAGGCCTACC AATGAA.AGGT AATAACGGGA GTATTGGGAA GAGAGGGAAC CTCCGGATGG 4801 CCCTCTATCA GGCTTCATAC CAAAATGACT TATCTTACAA GAATTAACCA GGGAGATAGT CCGAAGTATG GTTTTACTGA ATAGAATGTT CTTAATTGGT 4851 AACP~A.AACTT GGCCATCCCA GCCATTATCA TGGCTATAAT GGCTCTCCTC TTGTTTTGAA CCGGTAGGGT CGGTAATAGT ACCGATATTA CCGAGAGGAG 4901 AGTCTATTTT TTTACCTACG CCTATGTTAC GCTACAACAC TAACCATAAC TCAGATAAAA AAATGGATGC GGATACAATG CGATGTTGTG ATTGGTATTG 4951 CCCAAACCCA GTCAACATAC TAACATCATG AC GAAC CAA.A TTATCCCACA GGGTTTGGGT CAGTTGTATG ATTGTAGTAC TGCTTGGTTT AATAGGGTGT 5001 ATTTAACCCT AACAACTACT GCCTCATTGT CCATTTTCCT CCTTCCCATT TA.AATTGGGA TTGTTGATGA CGGAGTAACA GGTAAAAGGA GGAAGGGTAA 5051 ACCCCTGCCA TCCTCATACT AGTGTCCTAA GAAATTTAGG TTAATAATAG TGGGGACGGT AGGAGTATGA TCACAGGATT CTTTAAATCC AATTATTATC 296

5101 AC CA~AAAGC C TTCAAAGCTT TAAGTAGAAG TGAAAATCTC CTAATTTCTG TGGTTTTCGG AAGTTTCGAA ATTCATCTTC ACTTTTAGAG GATTAAAGAC 5151 TTAAGATTTG CAAGACTTTA CCTCACATCT TCTGAATGCA ACCCAGATAC AATTCTAAAC GTTCTGAAAT GGAGTGTAGA AGACTTACGT TGGGTCTATG 5201 TTTCATTAAG CTAAAACCTT CTAGATAAAT AGGCCTTGAT CCTACA,.AAAT AAAGTAATTC GATTTTGGAA GATCTATTTA TCCGGAACTA GGATGTTTTA 5251 CTTAGTTAAC AGCTAAGCGT TCAATCCAGC GAACTTCTAT CTACTTTCTC GAATCAATTG TCGATTCGCA AGTTAGGTCG CTTGAAGATA GATGAAAGAG 5301 CCGCCGTAAG GGC GGGAGAAAGC CCCGGGAGAA ACTAATCTCC GGCGGCATTC TTTTTTTCCG CCCTCTTTCG GGGCCCTCTT TGATTAGAGG 5351 ATCTTTGGAT TTGCAATCCA ACATAAACAT CTACTGCAGG ACTATGGCAA TAGAAACCTA AACGTTAGGT TGTATTTGTA GATGACGTCC TGATACCGTT 5401 GAAGAGGAAT TGGACCTCTG TACATGGAGC TACAACCCAT TACTTAGTTC CTTCTCCTTA ACCTGGAGAC ATGTACCTCG ATGTTGGGTA ATGAATCAAG 5451 TCAGTCACCT TACCTGTGGC AATTAATCGA TGACTATTTT CTACAAACCA AGTCAGTGGA ATGGACACCG TTAATTAGCT ACTGATAAAA GATGTTTGGT 5501 CA.AAGATATC GGCACCCTTT ATTTAATCTT TGGTGCATGA GCAGGAATAG GTTTCTATAG CCGTGGGAAA TAAATTAGAA ACCACGTACT CGTCCTTATC 5551 TGGGAACAGC CCTAAGCCTT TTAATTCGCG CTGAGCTAGG CCAGCCTGGT ACCCTTGTCG GGATTCGGAA AATTAAGCGC GACTCGATCC GGTCGGACCA 5601 TCCCTCCTAG GTGATGATCA GATTTATAAT GTTATTGTAA CCGCCCATGC AGGGAGGATC CACTACTAGT CTAAATATTA CAATAACATT GGCGGGTACG 5651 ATTTGTAATA ATTTTCTTTA TAGTAATGCC TGTAATAATT GGGGGATTCG TAAACATTAT T~~.AAGAAAT ATCATTACGG ACATTATTAA CCCCCTAAGC 5701 GGAACTGACT TGTACCATTA ATAATTGGTG CACCAGATAT AGCCTTCCCT CCTTGACTGA ACATGGTAAT TATTAACCAC GTGGTCTATA TCGGA.AGGGA 5751 CGAATAAATA ATATAAGTTT CTGACTCCTC CCTCCTTCTT TTCTCCTACT GCTTATTTAT TATATTCAAA GACTGAGGAG GGAGGAAGAA AAGAGGATGA 5801 CCTGGCTTCG GCCGGAGTCG AAGCAGGGGC TGGTACTGGC TGAACGGTTT GGACCGAAGC CGGCCTCAGC TTCGTCCCCG ACCATGACCG ACTTGCCAAA 5851 ATCCTCCCCT AGCTGGCAAC TTAGCACACG CCGGGGCCTC TGTTGATCTG TAGGAGGGGA TCGACCGTTG AATCGTGTGC GGCCCCGGAG ACAACTAGAC 5901 GCTATCTTTT CACTTCACCT AGCAGGTATT TCTTCAATCC TAGCTTCAAT CGATAGAAAA GTGAAGTGGA TCGTCCATAA AGAAGTTAGG ATCGAAGTTA 5951 TAACTTCATC ACAACTATCA TTA.ACATAAA 
ACCACCAGCA ATTTCCCAAT ATTGAAGTAG TGTTGATAGT AATTGTATTT TGGTGGTCGT T.AAAGGGTTA 6001 ACCAAACACC CCTATTTGTG TGATCCATTC TAGTAACGAC TATTCTCCTT TGGTTTGTGG GGATA.AACAC ACTAGGTAAG ATCATTGCTG ATAAGAGGAA 6051 CTTTTAGCCC TCCCAGTACT TGCAGCCGGC ATCACAATAT TACTCACTGA GAAAATCGGG AGGGTCATGA ACGTCGGCCG TAGTGTTATA ATGAGTGACT 6101 CCGAAACCTA AACACAACTT TCTTTGACCC AGCAGGAGGT GGAGACCCCA GGCTTTGGAT TTGTGTTGAA AGAAACTGGG TCGTCCTCCA CCTCTGGGGT 6151 TCCTCTACCA GCATTTGTTC TGGTTTTTTG GTCACCCAGA GGTCTACATT AGGAGATGGT CGTAAACAAG ACC C CAGTGGGTCT CCAGATGTAA 6201 CTTATCCTTC CTGGTTTTGG TATAATTTCC CACATTGTAG CTTACTACTC GAATAGGAAG GACCAA.AACC ATATTA.AAGG GTGTAACATC GAATGATGAG 6251 CGGTAAGAAG GAACCATTCG GCTACATGGG TATGGTCTGA GCAATAATAG GCCATTCTTC CTTGGTAAGC CGATGTACCC ATACCAGACT CGTTATTATC 6301 CAATTGGCCT ACTAGGATTT ATTGTCTGAG CCCATCACAT ATTCACCGTC GTTAACCGGA TGATCCTAAA TAACAGACTC GGGTAGTGTA TAAGTGGCAG 6351 GGAATAGACG TTGACACACG AGCCTACTTC ACCTCAGCAA CAATA.ATTAT CCTTATCTGC AACTGTGTGC TCGGATGAAG TGGAGTCGTT GTTATTAATA 6401 TGCCATCCCT ACAGGTGTAA AAGTTTTTAG CTGATTAGCA ACCCTTCATG ACGGTAGGGA TGTCCACATT TTCp~~AAATC GACTAATCGT TGGGAAGTAC 6451 GCGGCTCTGT CAAATGAGAA ACCCCCTTAC TATGAGCCCT CGGGTTTATT 297

CGCCGAGACA GTTTACTCTT TGGGGGAATG ATACTCGGGA GCCCAAATAA 6501 TTCCTGTTTA CAGTAGGAGG TCTGACAGGA ATTGTTCTAG CCAACTCCTC AAGGACAAAT GTCATCCTCC AGACTGTCCT TAACAAGATC GGTTGAGGAG 6551 CCTAGACATT GTTCTCCATG ATACTTACTA TGTAGTAGCC CACTTCCACT GGATCTGTAA CAAGAGGTAC TATGAATGAT ACATCATCGG GTGAAGGTGA 6601 ACGTCCTCTC AATAGGAGCA GTGTTTGCTA TCATGGCAGG CTTCATCCAC TGCAGGAGAG TTATCCTCGT CACA,A.ACGAT AGTACCGTCC GAAGTAGGTG 6651 TGATTCCCTT TAATAACCGG TTATACTCTT CATTCAACTT GAACP►~~A.AAT ACTAAGGGAA ATTATTGGCC AATATGAGAA GTAAGTTGAA CTTGTTTTTA 6701 CCAATTCGCA GTCATGTTTA TTGGAGTAAA CCTCACATTC TTCCCACAAC GGTTAAGCGT CAGTACAAAT AACCTCATTT GGAGTGTAAG AAGGGTGTTG 6751 ATTTTCTAGG TCTCGCCGGC ATGCCGCGAC GTTACTCAGA TTACCCAGAC TA.AAAGATC C AGAGC GGC C G TACGGCGCTG CAATGAGTCT AATGGGTCTG 6801 GCTTATACCC TGTGA.AATAC AGTCTCCTCT ATCGGCTCTT TAATTTCACT CGAATATGGG ACACTTTATG TCAGAGGAGA TAGCCGAGAA ATTAAAGTGA 6851 TGTAGCAGTG ATTATGCTCC TCTTCATTAT TTGAGAAGCA TTCGCCTCAA ACATCGTCAC TAATACGAGG AGAAGTAATA AACTCTTCGT AAGCGGAGTT 6901 AACGAGAAGT CCTATCTGTC GAGCTACCTC ATACAAATGT GGAATGACTT TTGCTCTTCA GGATAGACAG CTCGATGGAG TATGTTTACA CCTTACTGAA 6951 CATGGCTGCC CTCCTCCCTA CCACACATAT GAAGAGCCAG CATTTGTTCA GTACCGACGG GAGGAGGGAT GGTGTGTATA CTTCTCGGTC GTAAACAAGT 7001 AGTACAACGA ACCCTTTAAG ACAAGA.AAGG AAGGAATTGA ACCCCCATAT TCATGTTGCT TGGGAAATTC TGTTCTTTCC TTCCTTAACT TGGGGGTATA 7051 GTTAGTTTCA AGCTAACCAC ATCACCACTC TGTCACTTTC TTTATAGAGA CAATCAA.AGT TCGATTGGTG TAGTGGTGAG ACAGTGAA.AG .A.AATATC TC T 7101 TCCTAGTAAA ATGTATTACA TTATCTTGTC AAGGCAAAAT TGTGAGTTTA AGGATCATTT TACATAATGT AATAGAACAG TTCCGTTTTA ACACTCAAAT 7151 AATCCCACGG ATCTTAATTA ATGGCACACC CCTCACAATT AGGATTTCAA TTAGGGTGCC TAGAATTAAT TACCGTGTGG GGAGTGTTAA TCCTAAAGTT 7201 GATGCAGCCT CCCCAGTTAT GGAAGAACTT ATTCACTTTC ACGACCACAC CTACGTCGGA GGGGTCAATA CCTTCTTGAA TAAGTGAAAG TGCTGGTGTG 7251 ACTAATAATT GTATTTCTAA TTAGCGCTCT GGTCCTTTAT ATTATTACGG TGATTATTAA CATAAAGATT AATCGCGAGA CCAGGAAATA TAATAATGCC 7301 CAATAGTATC AACA.AAGC TT AC AAATA.AAT ATATTCTTGA TTCCCAAGAA GTTATCATAG TTGTTTCGAA 
TGTTTATTTA TATAAGAACT AAGGGTTCTT 7351 ATTGAAATCG TTTGGACTAT CCTGCCCGCC ATCATCCTTA TTATAATCGC TAACTTTAGC AAACCTGATA GGACGGGCGG TAGTAGGAAT AATATTAGCG 7401 CCTTCCATCT CTACGAATTC TGTATCTCAT AGACGAAATT AATGACCCCC GGAAGGTAGA GATGCTTAAG ACATAGAGTA TCTGCTTTAA TTACTGGGGG 7451 ACCTAACTAT TAAAGCCATG GGTCACCAAT GATACTGAAG TTATGAATAT TGGATTGATA ATTTCGGTAC CCAGTGGTTA CTATGACTTC AATACTTATA 7501 ACAGACTATG A,AAATCTAGC TTTTGACTCC TACATAGTCC AGACCCAAGA TGTCTGATAC TTTTAGATCG AAAACTGAGG ATGTATCAGG TCTGGGTTCT 7551 CTTAACCCCA GGCCAATTTC GTTTACTGGA GACAGATCAT CGAATAGTGG GAATTGGGGT C C GGTTA.AAG CAA.ATGACCT CTGTCTAGTA GCTTATCACC 7601 TTCCTATGGA ATCCCCTGTC CGCGTCCTGG TGTCCGCAGA AGATGTCCTA AAGGATACCT TAGGGGACAG GCGCAGGACC ACAGGCGTCT TCTACAGGAT 7651 CACTCATGGG CTGTACCAGC CTTAGGGGTT AAAATAGAC G CTGTCCCGGG GTGAGTACCC GACATGGTCG GAATCCCCAA TTTTATCTGC GACAGGGCCC 7701 ACGTTTAA.AC CAAACTGCCT TCATCATCTC CCGACCAGGT GTCTACTACG TGCA.AATTTG GTTTGACGGA AGTAGTAGAG GGCTGGTCCA CAGATGATGC 7751 GCCAGTGTTC AGAAATTTGT GGGGCTAACC ACAGCTTCAT GCCTATTGTA CGGTCACAAG TCTTTAAACA CCCCGATTGG TGTCGAAGTA CGGATAACAT 7801 GTAGAAGCAG TCCCTCTAGA ACACTTCGAA GCCTGATCTT CATTAATGCT CATCTTCGTC AGGGAGATCT TGTGAAGCTT CGGACTAGAA GTAATTACGA 298

7851 AGAAGAAGCC TCACCAAGAA GCTA.AATCGG GACTAGCGTT AGCCTTTTAA TCTTCTTCGG AGTGGTTCTT CGATTTAGCC CTGATCGCAA TCGGAA.AATT 7901 GC TP~~C T GGTGACTCCC TACCACCCTT GGTGATATGC CCCAATTAAA CGATTTTTGA CCACTGAGGG ATGGTGGGAA CCACTATACG GGGTTAATTT 7951 CCCCCACCCT TGATTAATTA TCCTTTTGTT CTCATGAATA ATTTTCCTCA GGGGGTGGGA ACTAATTAAT AGGA.AAACAA GAGTACTTAT TAAAAGGAGT 8001 TTATTTTACC GTG ATAAATCACC TATTTAGCAA TAATCCAACA AATAA.AATGG TTTTTTTCAC TATTTAGTGG ATAAATCGTT ATTAGGTTGT 8051 TTAAAAAGCA CAGAAATATC TAAACCCGAG CCCTGAAACT GACCATGATT AATTTTTCGT GTCTTTATAG ATTTGGGCTC GGGACTTTGA CTGGTACTAA 8101 CTAAGCTTCT TCGACCAATT CCTAAGCCCC TCCCTCCTTG GAATCCCATT GATTCGAAGA AGCTGGTTAA GGATTCGGGG AGGGAGGAAC CTTAGGGTAA 8151 AATTGCCCTA GCAATTGCCC TACCATGGTT AATCTTCCCA ACCCCCACTA TTAACGGGAT CGTTAACGGG ATGGTACCAA TTAGAAGGGT TGGGGGTGAT 8201 GTCGGTGACT TAATAACCGA CTAATAACGC TCCAAAGCTG ATTTATTAAC CAGCCACTGA ATTATTGGCT GATTATTGCG AGGTTTCGAC T.A.AATAATTG 8251 CGATTTATTT ACCAACTGAT ACAGCCCATC AACTTTGCCG GCCATAAATG GCTAAATAAA TGGTTGACTA TGTCGGGTAG TTGAAACGGC CGGTATTTAC 8301 AGCCGTGTTA TTTACAGCAT TAATACTATT CTTGATTACC ATCAACCTAT TCGGCACAAT AA.ATGTC GTA ATTATGATAA GAACTAATGG TAGTTGGATA 8351 TAGGCCTTCT CCCCTACACC TTCACACCCA CAACACAACT TTCCCTCAAT ATCCGGAAGA GGGGATGTGG AAGTGTGGGT GTTGTGTTGA AAGGGAGTTA 8401 ATGGCATTTG CTCTACCTTT ATGATTCACC ACCGTTTTAA TCGGTATACT TACCGTAAAC GAGATGGAAA TACTAAGTGG TGGCP~AAATT AGCCATATGA 8451 GAATCAACCC ACAATTGCCC TGGGTCACTT CCTGCCAGAA GGCACCCCCA CTTAGTTGGG TGTTAACGGG ACCCAGTGAA GGACGGTCTT CCGTGGGGGT 8501 CCCTTTTAGT ACCCGTCCTA ATTGTCATCG AGACCATTAG CTTATTTATT GGGAAA.ATCA TGGGCAGGAT TAACAGTAGC TCTGGTAATC GAATAAATAA 8551 CGACCACTAG CGCTAGGAGT CCGATTAACT GC TAATTTA.A CAGCTGGCCA GCTGGTGATC GCGATCCTCA GGCTAATTGA CGATTAAATT GTCGACCGGT 8601 CCTAGTAATA CAACTAATTG CAACCGCAGC CTTCGTTCTT ATTACCATTA GGATGATTAT GTTGATTAAC GTTGGCGTCG GAAGCAAGAA TAATGGTAAT 8651 TGCCAGCCGT GGCATTACTC ACATCAGTAA TTTTATTTTT ACTAACAGTC ACGGTCGGCA CCGTAATGAG TGTAGTCATT AAAATP►~~AA.A TGATTGTCAG 8701 CTAGAAGTAG CTGTAGCAAT 
AATTCAAGCA TATGTCTTCG TCCTCCTACT GATCTTCATC GACATCGTTA TTAAGTTCGT ATACAGAAGC AGGAGGATGA 8751 AAGCCTCTAC C TACAAGAA.A ACGTCTAATG GCTCACCAAG CACACGCATA TTCGGAGATG GATGTTCTTT TGCAGATTAC CGAGTGGTTC GTGTGCGTAT 8801 TCATATAGTT GACCCTAGCC CATGACCACT AACCGGAGCC ACAGCCGCCC AGTATATCAA CTGGGATCGG GTACTGGTGA TTGGCCTCGG TGTCGGCGGG 8851 TTTTAATAAC ATCTGGCCTA GCCATCTGGT TTCACTTCCA CTCATTAATT AAAATTATTG TAGACCGGAT CGGTAGACCA AAGTGAAGGT GAGTAATTAA s9o1 CTTCTTTACC TAGGACTAAC CCTTCTTCTA TTAACTATAA TTCAATGATG GAAG.AAATGG ATCCTGATTG GGAAGAAGAT AATTGATATT AAGTTACTAC 8951 ACGTGATATT ATCCGAGAAG GAACATTCCA AGGTCACCAC ACACCCCCTG TGCACTATAA TAGGCTCTTC CTTGTAAGGT TCCAGTGGTG TGTGGGGGAC 9001 TTCp~~AA.AGG CCTCCGCTAC GGAATAATCT TATTTATCAC ATCAGAAGTG AAGTTTTTCC GGAGGCGATG CCTTATTAGA ATAAATAGTG TAGTCTTCAC 9051 TTCTTCTTTC TAGGCTTTTT CTGAGCCTTT TACCACTCAA GTCTCGCCCC AAGAAGAAAG ATCCGA~ GACTCGGAAA ATGGTGAGTT CAGAGCGGGG 9101 CACCCCTGAG CTGGGGGGAT GTTGGCCACC AACAGGAATT AGTCCTATTG GTGGGGACTC GACCCCCCTA CAACCGGTGG TTGTCCTTAA TCAGGATAAC 9151 ATCCATTCGA AGTACCACTT TTAAATACCG CAGTACTCCT AGCCTCCGGC TAGGTAAGCT TCATGGTGAA AATTTATGGC GTCATGAGGA TCGGAGGCCG 9201 GTAACAGTAA CCTGAGCCCA CCACGGTCTC ATGGAAGGTA ACCGAAAAGA 299

CATTGTCATT GGACTCGGGT GGTGCCAGAG TACCTTCCAT TGGCTTTTCT 9251 AACTATTCAA GCCCTCACTC TCACCATCAT CCTAGGTGTC TACTTTACAG TTGATAAGTT CGGGAGTGAG AGTGGTAGTA GGATCCACAG ATGAAATGTC 9301 CCCTCCAAGC TATAGAATAT TATGAAGCAC CTTTTACAAT TGCTGATGGG GGGAGGTTCG ATATCTTATA ATACTTCGTG GP~AAATGTTA ACGACTACCC 9351 GTCTATGGAA CAACATTCTT CGTCGCCACA GGGTTCCACG GCCTCCATGT CAGATACCTT GTTGTAAGAA GCAGCGGTGT CCCAAGGTGC CGGAGGTACA 9401 TATTATTGGC TCAACATTTT TAATAATTTG CCTACTACGA CAA.ATTCAAT ATAATAACCG AGTTGTAAAA ATTATTAA.AC GGATGATGCT GTTTAAGTTA 9451 ATCACTTTAC ATCCCAACAC CACTTTGGAT TTGAAGCCGC CGCATGATAC TAGTGAAATG TAGGGTTGTG GTGAAACCTA AACTTCGGCG GCGTACTATG 9501 TGACACTTTG TAGACKTAGT GTGACTATTC CTTTATGTTT CCATCTATTG AC TGTGA.AAC ATCTGKATCA CACTGATAAG GAAATACAA.A GGTAGATAAC 9551 ATGAGGCTCA TAACTGCTTT TCTAGTATAT ACTAGTACAA ATGATTTCCA TACTCCGAGT ATTGACGAAA AGATCATATA TGATCATGTT TAC TAA.AGGT 9601 ATCATTTAAT CTTGGTTAAA GTCCAAGGAA AAGCAATGAA CCTCATCATG TAGTAAATTA GAACCAATTT CAGGTTCCTT TTCGTTACTT GGAGTAGTAC 9651 TCTTCTGTCG CGGCTACGGC CCTGGTTTCC CTAATCCTTG TATTTATTGC AGAAGACAGC GCCGATGCCG GGACCAAAGG GATTAGGAAC ATA.AATAAC G 9701 ATTTTGACTT CCATCACTTA ACCCAGACAA CGAGAAGCTA TCCCCGTATG TAAAACTGAA GGTAGTGAAT TGGGTCTGTT GCTCTTCGAT AGGGGCATAC 9751 AATGCGGCTT TGATCCTCTT GGCAGTGCAC GTCTCCCATT CTCCCTACGC TTACGCCGAA ACTAGGAGAA CCGTCACGTG CAGAGGGTAA GAGGGATGCG 9801 TTCTTCCTCG TAGCTATCTT ATTCCTACTA TTTGACCTAG AGATTGCCCT AAGAAGGAGC ATCGATAGAA TAAGGATGAT AAACTGGATC TCTAACGGGA 9851 CCTCCTCCCC TTACCCTGGG GTGATCAGTT AATATCACCT CTCTATTCTC GGAGGAGGGG AATGGGACCC CACTAGTCAA TTATAGTGGA GAGATAAGAG 9901 TACTCTGGGC AACAATTATC CTAATTTTAC TTACCCTGGG TCTTATTTAT ATGAGACCCG TTGTTAATAG GATTAAAATG AATGGGACCC AGAATAAATA 9951 GAATGACTCC AAGGAGGATT AGAATGAGCA GAGTAGATAT TTAGTC T.A.AA CTTACTGAGG TTCCTCCTAA TCTTACTCGT CTCATCTATA AATCAGATTT 10001 CAAAGACCAC TAATTTCGGC TTAGTA.AAC T ATGGTGAA.AA TCCATAAATA GTTTCTGGTG ATTAAAGCCG AATCATTTGA TACCACTTTT AGGTATTTAT 10051 TCTTATGTCT CCTATGTATT TTAGCCTTAA CTCAGCATTT ATACTAGGCC AGAATACAGA GGATACATAA 
AATCGGAATT GAGTCGTAAA TATGATCCGG 10101 TGATGGGTCT TGCACTTAAC CGTTATCACC TCTTATCTGC ACTTTTATGT ACTACCCAGA ACGTGAATTG GCAATAGTGG AGAATAGACG TGAAAATACA 10151 CTAGAAAGCA TACTACTAAC CCTATTCATT ACCATTGCTA TCTGAACCCT GATCTTTCGT ATGATGATTG GGATAAGTAA TGGTAACGAT AGACTTGGGA 10201 TACACTA.AAT TCTATCTCCT CTTCAATTAT TCCCATGATC CTCCTCACAT ATGTGATTTA AGATAGAGGA GAAGTTAATA AGGGTACTAG GAGGAGTGTA 10251 TTTCAGCTTG TGAAGCTAGT GCAGGCCTAG CTATTCTAGT GGCCACCTCA AA.AGTC GAAC ACTTCGATCA CGTCCGGATC GATAAGATCA CCGGTGGAGT 10301 CGCTCCCACG GATCTGACAA CTTACA.AAAT CTAAATCTCC TCCAATGCTA GCGAGGGTGC CTAGACTGTT GAATGTTTTA GATTTAGAGG AGGTTACGAT 10351 AP~AATTCTCA TCCCAACAAT TATACTCTTT CCAACCACAT GGATTATTAA TTTTAAGAGT AGGGTTGTTA ATATGAGAAA GGTTGGTGTA CCTAATAATT 10401 C TGA CTATGACCCA CAACCACCTC CTTTAGTCTT CTAATCGCAT GTTTTTTACT GATACTGGGT GTTGGTGGAG GAAATCAGAA GATTAGCGTA 10451 TATCAAGCCT AATCTGGTTT AAATGAAATA TAGATATTGG CTGAGACTTC ATAGTTCGGA TTAGACCAAA TTTACTTTAT ATCTATAACC GACTCTGAAG 10501 TCCAATCAAT TCATAGCTGT TGACCCCCTA TCAGCCCCCT TGCTTATTCT AGGTTAGTTA AGTATCGACA ACTGGGGGAT AGTCGGGGGA ACGAATAAGA 10551 TACATGTTGA CTTCTACCAC TAATAATCTT AGCCAGCCAG AACCACATCT ATGTACAACT GAAGATGGTG ATTATTAGAA TCGGTCGGTC TTGGTGTAGA 300

10601 CCCCAGAACC AATCATTCGA CAACGGACAT ACATCTCACT CCTAATCTCC GGGGTCTTGG TTAGTAAGCT GTTGCCTGTA TGTAGAGTGA GGATTAGAGG 10651 CTTCAGCCTT CCTTATTATA GCATTTTCCG CAAC C GAAAT GATTATATTT GAAGTCGGAA GGAATAATAT CGTAAAAGGC GTTGGCTTTA CTAATATAAA 10701 TACATCATAT TTGAAGCTAC ACTTATCCCC ACTCTTATCA TTATTACACG ATGTAGTATA AACTTCGATG TGAATAGGGG TGAGAATAGT AATAATGTGC 10751 ATGGGGTAAT CAAACAGAGC GCCTAAATGC AGGCACCTAC TTCCTATTTT TACCCCATTA GTTTGTCTCG CGGATTTACG TCCGTGGATG AAGGATAAAA 10801 ATACCTTAAT TGGTTCTCTC CCCCTACTCA TTGCCCTTTT ACTTATACAA TATGGAATTA ACCAAGAGAG GGGGATGAGT AAC GGGA,,AAA TGAATATGTT 10851 AATAACCTCG GCACCCTATC CATAATTATT ATACAACACT CACAGTCCCT TTATTGGAGC CGTGGGATAG GTATTAATAA TATGTTGTGA GTGTCAGGGA 10901 AAGCCTAACC TCATGAACAG ACAAATTATG ATGAGCAGCC TGTCTCCTCG TTCGGATTGG AGTACTTGTC TGTTTAATAC TACTCGTCGG ACAGAGGAGC 10951 CCTTTCTTGT C A,AAATAC C C CTGTACGGAA TTCACCTTTG ACTCCCTAAA GGAAAGAACA GTTTTATGGG GACATGCCTT AAGTGGA.AAC TGAGGGATTT 11001 GCCCATGTTG AAGCCCCAAT TGCCGGCTCA ATAATCCTAG CCGCCGTACT CGGGTACAAC TTCGGGGTTA ACGGCCGAGT TATTAGGATC GGCGGCATGA 11051 ACTCAAACTG GGGGGCTATG GCATAATACG AATTATCGTA ATACTAAATC TGAGTTTGAC CCCCCGATAC CGTATTATGC TTAATAGCAT TATGATTTAG 11101 CCCTCACCAA AGAAATGGCC TACCCATTCC TAATCCTGGC CATTTGAGGT GGGAGTGGTT TCTTTACCGG ATGGGTAAGG ATTAGGACCG GTAAACTCCA 11151 ATTATTATAA CTAGCTCCAT CTGCCTACGA C AA.AC C GAC C TCAA.ATCTCT TAATAATATT GATCGAGGTA GACGGATGCT GTTTGGCTGG AGTTTAGAGA 11201 GATTGCCTAT TCATCAGTAA GCCATATGGG CTTAGTTGCA GCAGCAATTC CTAACGGATA AGTAGTCATT CGGTATACCC GAATCAACGT CGTCGTTAAG 11251 TTATCCAAAC ACCATGAAGC TTCGCAGGAG CAATCACGTT AATAATTGCC AATAGGTTTG TGGTACTTCG AAGCGTCCTC GTTAGTGCAA TTATTAACGG 11301 CACGGTCTAA TTTCATCCGC CCTATTCTGC CTAGCCAACA CTAACTACGA GTGCCAGATT AAAGTAGGCG GGATAAGACG GATCGGTTGT GATTGATGCT 11351 ACGGATCCAT AGCCGA.ACAA TACTCCTAGC CCGGGGCATG CAAATTATTC TGCCTAGGTA TCGGCTTGTT ATGAGGATCG GGCCCCGTAC GTTTAATAAG 11401 TTCCACTAAC AGCAACCTGA TGGTTCTTTG CTAGTTTGGC TAATCTTGCT AAGGTGATTG TCGTTGGACT AC C AAGA.A.AC GATCAAACCG ATTAGAACGA 11451 
CTCCCACCTT CCCCTAATCT CATAGGAGAA CTCCTTATCA TTACCTCAAT GAGGGTGGAA GGGGATTAGA GTATCCTCTT GAGGAATAGT AATGGAGTTA 11501 GTTTAATTGA TCCAACTGAA CTATTATCCT CTCAGGCCTT GGGGTATTAA CAAATTAACT AGGTTGACTT GATAATAGGA GAGTCCGGAA CCCCATAATT 11551 TTACAGCCTC CTACTCCCTT TATATATTCC TAATGACCCA ACGAGGTCCT AATGTCGGAG GATGAGGGAA ATATATAAGG ATTACTGGGT TGCTCCAGGA 11601 ACCCCCCACC ATATTTTATC ACTAAACCCG ACCTACACAC GAGAACACCT TGGGGGGTGG TATAAAATAG TGATTTGGGC TGGATGTGTG CTCTTGTGGA 11651 CCTTCTAAGC CTTCACCTCA TGCCTGTCCT ACTTCTAATG CTTAAGCCAG GGAAGATTCG GAAGTGGAGT ACGGACAGGA TGAAGATTAC GAATTCGGTC 11701 AACTTATCTG AGGCTGAACA CTTTGTATTT ATAGTTTAAC CAPsAACATTA TTGAATAGAC TCCGACTTGT GA.AACATAAA TATCAAATTG GTTTTGTAAT 11751 GATTGTGGTT CTAAAGATAA AAGTT~C CTTTTTAATT ACCGAGAGAG CTAACACCAA GATTTCTATT TTCAATTTTG GP~~AAATTAA TGGCTCTCTC 11801 GTCAGGGACA CGACAGAACT GCTAATTCTT CTTACCATGG CTCAAATCCA CAGTCCCTGT GCTGTCTTGA CGATTAAGAA GAATGGlACC GAGTTTAGGT 11851 TGGCTCACTC AGCTTCTGAA AGATATTAGT AATCTATTGG TCTTAGGAAC ACCGAGTGAG TCGAAGACTT TCTATAATCA TTAGATAACC AGAATCCTTG 11901 CA,P.~A.AATTCT TGGTGCAATT CCAAGCAAGA GCTATGAATA CCATTTTCAA GTTTTTAAGA ACCACGTTAA GGTTCGTTCT CGATACTTAT GGTAAAAGTT 11951 CTCATCATTA CTTCTAATCT TCGCCATCCT CATCTTTCCA CTAATAACCT 301

GAGTAGTAAT GAAGATTAGA AGCGGTAGGA GTAGA.AAGGT GATTATTGGA 12001 CACTGAGCCC TAA.AGAAC TT AATCTCAACT GGGCCTCATC CCACGTAAAA GTGACTCGGG ATTTCTTGAA TTAGAGTTGA CCCGGAGTAG GGTGCATTTT 12051 ACAGCTGTAA AGACCTCTTT CTTTATTAGC CTTATTCCCC TGTCCATTTT TGTCGACATT TCTGGAGAA.A GAA.ATAATC G GAATAAGGGG ACAGGTAAAA TGAATA.A.ACA 12101 CCTAGACCAG GGTTTAGAGT CCATCATGAC CAACTATAAC GGATCTGGTC CCAAATCTCA GGTAGTACTG GTTGATATTG ACTTATTTGT 12151 TCGGACCATT CGATATTAAT ATAAGCTTCA AATTTGATAT ATACTCAATT AGCCTGGTAA GCTATAATTA TATTCGAAGT TTAAACTATA TATGAGTTAA 12201 GTATTTACCC CAGTGGCCCT CTATGTTACT TGATCTATCC TTGAATTCGC CATAA:ATGGG GTCACCGGGA GATACAATGA ACTAGATAGG AACTTAAGCG 12251 CCTATGGTAT ATACATTCCG ATCCCAACAT CAACCGTTTC TTCAAGTACC GGATACCATA TATGTAAGGC TAGGGTTGTA GTTGGCAAAG AAGTTCATGG 12301 TCTTACTTTT CCTAATCTCA ATAATTATTC TAGTGACCGC TAATAATATA AGAATGAAAA GGATTAGAGT TATTAATAAG ATCACTGGCG ATTATTATAT 12351 TTTCAACTAT TCATTGGCTG AGAGGGCGTA GGCATTATAT CCTTCCTTCT A.AAGTTGATA AGTAACCGAC TCTCCCGCAT CCGTAATATA GGAAGGAAGA 12401 CATTGGTTGA TGATATAGCC GAACAGATGC CAACACAGCT GCCCTCCAAG GTAACCAACT ACTATATCGG CTTGTCTACG GTTGTGTCGA CGGGAGGTTC 12451 CTGTAATCTA CAACCGAGTA GGCGACATCG GTCTAATCCT CAGCATAGCT GACATTAGAT GTTGGCTCAT CCGCTGTAGC CAGATTAGGA GTCGTATCGA 12501 TGACTAGCCA TA.AAC C TAA.A CTCCTGAGAA ATCCAACAAC TATTTATTCT ACTGATCGGT ATTTGGATTT GAGGACTCTT TAGGTTGTTG ATAAATAAGA 12551 ATCTAAGGAC ATAAACTTAA CCTTACCCCT TCTTGGCCTT GTCCTGGCCG TAGATTCCTG TATTTGAATT GGAATGGGGA AGAACCGGAA CAGGACCGGC 12601 CAGC TGGAA.A ATCCGCACAA TTTGGTCTTC ACCCATGACT TCCCTCAGCC GTCGACCTTT TAGGCGTGTT AAACCAGAAG TGGGTACTGA AGGGAGTCGG 12651 ATAGAAGGAC CAACGCCAGT CTCCGCCTTA CTCCATTCCA GCACAATAGT TATCTTCCTG GTTGCGGTCA GAGGCGGAAT GAGGTAAGGT CGTGTTATCA 12701 TGTTGCCGGT ATCTTCCTTC TAATTCGCTT CCACCCATTA ATACAAGATA ACAACGGCCA TAGAAGGAAG ATTAAGCGAA GGTGGGTAAT TATGTTCTAT 12751 ATCAGCTGAT CCTAACAACA TGCCTATGCC TGGGAGCACT GAGGACTCTT TAGTCGACTA GGATTGTTGT ACGGATACGG ACCCTCGTGA CTGGTGAGAA 12801 TTTACTGCAG CATGCGCACT CACCCAAAAC GACATCAAAA AAATTGTTGC AAATGACGTC 
GTACGCGTGA GTGGGTTTTG CTGTAGTTTT TTTAACAACG 12851 ATTCTCAACA TCCAGCCAAC TTGGATTGAT AATAGTGACA ATTGGCCTCA TAAGAGTTGT AGGTCGGTTG AACCTAACTA TTATCACTGT TAACCGGAGT 12901 ACCAACCTCA ACTAGCTTTT CTCCACATCT GCACCCACGC CTTCTTTAAA TGGTTGGAGT TGATCGAAAA GAGGTGTAGA CGTGGGTGCG GAAGAAATTT 12951 GCCATGCTCT TCCTTTGTTC CGGGTCTATT ATTCATAGCC TCAATGACGA CGGTACGAGA AGGA.A.ACAAG GCCCAGATAA TAAGTATCGG AGTTACTGCT 13001 ACAAGACATC CGCAAGATAG GAGGCCTCCA TAAACTTCTA CCATTCACCT TGTTCTGTAG GCGTTCTATC CTCCGGAGGT ATTTGAAGAT GGTAAGTGGA 13051 CATCTTCCTT AACCATCGGT AGTCTAGCTC TTGCAGGCAT ACCTTTTTTA GTAGAAGGAA TTGGTAGCCA TCAGATCGAG AACGTCCGTA TGG ~ T 13101 TCAGGCTTCT TCTCAA.AAGA TGCTATTATT GAGTCTATAA ACACTTCTCA AGTCCGAAGA AGAGTTTTCT ACGATAATAA CTCAGATATT TGTGAAGAGT 13151 CCTCAACGCC TGAGCCCTTA TCCTTACCTT AATCGCAACA TCATTCACAG GGAGTTGCGG ACTCGGGAAT AGGAATGGAA TTAGCGTTGT AGTAAGTGTC 13201 CTATCTACAG CCTCCGCCTA ATTTTCTTCG CAC TAATA.AA TTTTCCACGA GATAGATGTC GGAGGCGGAT TA.A.AAGAAGC GTGATTATTT A.A.AAGGTGC T 13251 TTCAATTCAC TCTCCCCTAT CAAC GA,AAAT AATCCCATAG TCATCAACCC AAGTTAAGTG AGAGGGGATA GTTGCTTTTA TTAGGGTATC AGTAGTTGGG 13301 GATCAAACGT CTAGCTTACG GAAGTATCCT GGCCGGCCTC ATCATTACAT CTAGTTTGCA GATCGAATGC CTTCATAGGA CCGGCCGGAG TAGTAATGTA 302

13351 CTAACCTAAC AC C CACAA.AA AC C CA.AATCA TAACTATACC TCCTCTACTG GATTGGATTG TGGGTGTTTT TGGGTTTAGT ATTGATATGG AGGAGATGAC 13401 AAACTCTCCG CCCTACTAGT AACCATTATT GGCCTTCTAC TGGCCTTAGA TTTGAGAGGC GGGATGATCA TTGGTAATAA CCGGAAGATG ACCGGAATCT 13451 ACTAGCTAAC CTAACCAATA CCCAATTCAA AACAACCCCC ACCCTCTACC TGATCGATTG GATTGGTTAT GGGTTAAGTT TTGTTGGGGG TGGGAGATGG 13501 CCCACCACTT CTCA.AATATA CTGGGATACT TTCCACA.A.A.T TACTCACCGC GGGTGGTGAA GAGTTTATAT GACCCTATGA AAGGTGTTTA ATGAGTGGCG 13551 CTCCTACCCA AAATTAACTT AACCTGAGCC CAATACATCT CTACCCACTT GAGGATGGGT TTTAATTGAA TTGGACTCGG GTTATGTAGA GATGGGTGAA 13601 GATTGATCAA ACATGATATG TTGG AC CP►~~AAAGT ACCCTTATTC CTAACTAGTT TGTACTATAC TTTTTTAACC TGGTTTTTCA TGGGAATAAG 13651 AACAAATTCC CCTAGTYAAA CTAACCACTC AGCCCCAGCA AGGTTACATT TTGTTTAAGG GGATCAYTTT GATTGGTGAG TCGGGGTCGT TCCAATGTAA 13701 AAAGTCTACC TTATGTTACT CTTCCTTACT CTAGCCTTAG CCTTACTCAC TTTCAGATGG AATACAATGA GAAGGAATGA GATCGGAATC GGAATGAGTG 13751 TACATTAACC TAACCACACG CAA.AGTGCCC CATGACAGCC CCCGAGTTAA ATGTAATTGG ATTGGTGTGC GTTTCACGGG GTACTGTCGG GGGCTCAATT 13801 TTCCAACACC ACAAATAAAG TCAATAATAA CACTCATCCA CTTAAAACTA AAGGTTGTGG TGTTTATTTC AGTTATTATT GTGAGTAGGT GAATTTTGAT 13851 ACAGCCATCC CCCATCACCA TAAAGTA.AAG ACACTCCCAC AAAATCTCCA TGTCGGTAGG GGGTAGTGGT ATTTCATTTC TGTGAGGGTG TTTTAGAGGT 13901 CGAGTCATTT CCAAGTTACT TGCCTCCTCA GCCCCGGATC AATCTAACCC GCTCAGTAAA GGTTCAATGA ACGGAGGAGT CGGGGCCTAG TTAGATTGGG 13951 AA.ATCATTCC AC C GC GA.A.AT ATTTACCAGC AAGAACTAAT ACTGCCAA.AT TTTAGTAAGG TGGCGCTTTA TAAATGGTCG TTCTTGATTA TGACGGTTTA 14001 P~A.AAACCAAC ATATAACAGA ACAGACCAAT TACCTCACGA CTCAGGATAC TTTTTGGTTG TATATTGTCT TGTCTGGTTA ATGGAGTGCT GAGTCCTATG 14051 GGCTCAGCAG CAAGCGCTGC TGTATAAGCA AACACTACCA ACATTCCCCC CCGAGTCGTC GTTCGCGACG ACATATTCGT TTGTGATGGT TGTAAGGGGG 14101 CAGATAAATT P.~~AAAT~ CTAATGACAA A.AA.AGAC C C C CCATGCCCCA GTCTATTTAA TTTTTATTTT GATTACTGTT TTTTCTGGGG GGTACGGGGT 14151 CTAACAACCC ACATCCCACC CCCGCAGCTA TAACTAATCC TAATGCGGCA GATTGTTGGG TGTAGGGTGG GGGCGTCGAT ATTGATTAGG ATTACGCCGT 14201 
TAATAAGGCG AAGGATTAGA CGCCACTCCC ATCAAACCCA GAACTAAACA ATTATTCCGC TTCCTAATCT GCGGTGAGGG TAGTTTGGGT CTTGATTTGT 14251 AACTATTATT p►~~AAACATAA A.ATATACCAT TATTCCTACC TGGACTTTAA TTGATAATAA TTTTTGTATT TTATATGGTA ATAAGGATGG ACCTGA.AATT 14301 CCAAGACCAA CAACTTGAAA AACTGTCGTT GTTCATTCAA CTATAAGAAT GGTTCTGGTT GTTGAACTTT TTGACAGCAA CAAGTAAGTT GATATTCTTA 14351 TTATGGCCAT A.AATAT C C GA AAAACCCACC CTCTACTAAA AATTGTAAAC AATACCGGTA TTTATAGGCT TTTTGGGTGG GAGATGATTT TTAACATTTG 14401 CAAGTCCTAA TTGACCTCCC AGCCCCATCA AACATCTCCA TCTGATGAAA GTTCAGGATT AACTGGAGGG TCGGGGTAGT TTGTAGAGGT AGACTACTTT 14451 CTTTGGTTCA CTTCTAGGAC TGTGTTTAGT AATCCAAATC ATTACAGGCC GAAACCAAGT GAAGATCCTG ACACAAATCA TTAGGTTTAG TAATGTCCGG 14501 TCTTCTTGGC AATACACTAC ACCGCAGACA TCTCTATAGC CTTCTCTTCA AGAAGAACCG TTATGTGATG TGGCGTCTGT AGAGATATCG GAAGAGAAGT 14551 GTGATTCACA TTTGTCGTGA CGTCAACTAC GGCTGACTCA TCCGTAATAT CACTAAGTGT AAACAGCACT GCAGTTGATG CCGACTGAGT AGGCATTATA 14601 TCATGCCAAC GGAGCCTCCC TCTTCTTTGT CTGTATTTAC TTTCAGATGG AGTACGGTTG CCTCGGAGGG AGAAGA.AACA GACATAAATG AAAGTGTAGC 14651 CCCGAGGACT CTACTATGGC TCCTACCTCT ATAAAGAAAC GTGAA.ATATT GGGCTCCTGA GATGATACCG AGGATGGAGA TATTTCTTTG CACTTTATAA 147 01 GGAGTAATCC TACTATTCCT GCTCATAGCT ACGGCCTTCG TGGGCTATGT 303

CCTCATTAGG ATGATAAGGA CGAGTATCGA TGCCGGAAGC ACCCGATACA 14751 CTTACCCTGA GGTCAA.ATAT CCTTCTGAGG TGCTACAGTC ATCACTAACC GAATGGGACT CCAGTTTATA GGAAGACTCC ACGATGTCAG TAGTGATTGG 14801 TTCTCTCCGC CTTCCCCTAT GTTGGAGACA TATTAGTCCA ATGAATTTGA AAGAGAGGCG GAAGGGGATA CAACCTCTGT ATAATCAGGT TACTTAAACT 14851 GGGGGCTTCT CAGTAGATAA CGCCACCCTC ACACGATTTT TTGCATTTCA CCCCCGAAGA GTCATCTATT GCGGTGGGAG TGTGCTAAA.A AACGTAAAGT 14901 CTTTCTCCTA CCCTTCCTTA TTACTGCACT AATAATTATT CACGTCCTCT GAAAGAGGAT GGGAAGGAAT AATGACGTGA TTATTAATAA GTGCAGGAGA 14951 TCCTACACGA AACAGGCTCG AACAATCCTA TAGGCCTTAA TTCCGACATA AGGATGTGCT TTGTCCGAGC TTGTTAGGAT ATCCGGAATT AAGGCTGTAT 15001 GACAAA.ATCT CCTTCCACCC CTATTTCTCC TACAAAGACG CTCTTGGGTT CTGTTTTAGA GGAAGGTGGG GATA.AAGAGG ATGTTTCTGC GAGAACCCAA 15051 CTTCATCCTC CTCGTCCTCC TAGGTATTCT AGCCCTATTC CTACCA.AACC GAAGTAGGAG GAGCAGGAGG ATCCATAAGA TCGGGATAAG GATGGTTTGG 15101 TCCTAGGAGA CGCTGAAAAC TTCATCCCCG CCAATCCTCT CGTTACCCCT AGGATCCTCT GCGACTTTTG AAGTAGGGGC GGTTAGGAGA GCAATGGGGA 15151 CCCCATATTA A.ACC C GAATG ATACTTCCTA TTTGCCTATG CCATCCTCCG GGGGTATAAT TTGGGCTTAC TATGAAGGAT AA.ACGGATAC GGTAGGAGGC 15201 ATCCATCCCT AATAAACTAG GAGGAGTCCT AGCCCTACTA TTCTCCATCC TAGGTAGGGA TTATTTGATC CTCCTCAGGA TCGGGATGAT AAGAGGTAGG 15251 TCATCCTCAT ATTAGTCCCC CTCCTCCACA CTTCCAAACA ACGAAGCAGC AGTAGGAGTA TAATCAGGGG GAGGAGGTGT GAAGGTTTGT TGCTTCGTCG 15301 ACCTTCCGCC CACTTACACA AATCTTCTTC TGGACCCTCG TTACCAATAT TGGAAGGCGG GTGAATGTGT TTAGAAGAAG ACCTGGGAGC AATGGTTATA 15351 ATTAATTTTA ACCTGAATTG GAGGCCAACC GGTTGAACAA CCATTTATTC TAATTAA.AAT TGGACTTAAC CTCCGGTTGG CCAACTTGTT GGTAAATAAG 15401 TTATTGGACA AATTGCATCT ATCTCATACT TCTCTTTATT CCTTATCGTA AATAACCTGT TTAACGTAGA TAGAGTATGA AGAGAAATAA GGAATAGCAT 15451 ATTCCACTCA CAGGCTGATG AGAA.AACAA.A ATCCTCAGCC TTAACTAGTT TAAGGTGAGT GTCCGACTAC TCTTTTGTTT TAGGAGTCGG AATTGATCAA 15501 TTGATAGCTT AGCCTAAAAG CGTCGACCTT GTAAGTCGAA GACCGGAGGT AACTATCGAA TCGGATTTTC GCAGCTGGAA CATTCAGCTT CTGGCCTCCA 15551 TTAAATCCTC CTCAAGATAT ATCAGGGGAA GGAGGGTTAA ACTCCTGCCC AATTTAGGAG 
GAGTTCTATA TAGTCCCCTT CCTCCCAATT TGAGGACGGG 15601 TTGGCTCCCA AAGCCAAGAT TCTGCCCAAA CTGCCCCCTG AGTACTGTTA AACCGAGGGT TTCGGTTCTA AGACGGGTTT GACGGGGGAC TCATGACAAT 15651 AACGTGAAAG CCAAATGTCC ATTTGGTTTT CAAAAAGTTG GTCGG`I'TTAA TTGCACTTTC GGTTTACAGG TAA.ACC AAAA GTTTTTCAAC CAGCCAAATT 15701 CATATTAATG ACATGGCCCA CATACCTTAA TACAAGGACA TATCTCATCT GTATAATTAC TGTACCGGGT GTATGGAATT ATGTTCCTGT ATAGAGTAGA 15751 CGACTACATT ACCATGTTTG ACCTTCACCT AATGATTTCA CACTCTATGT GCTGATGTAA TGGTACAA.AC TGGAAGTGGA TTACTAAAGT GTGAGATACA 15801 ATAATACTCA TTAATTTATA TTCCCCTATA TCATAGCATA CTATGTTTAG TATTATGAGT AATTA.AATAT AAGGGGATAT AGTATCGTAT GATACAA.ATC 15851 TCCTCATTAA TCTATTAATC ACAATTTCAT CCCATTTCTA ATCTCCATAC AGGAGTAATT AGATAATTAG TGTTAA.AGTA GGGTA.AAGAT TAGAGGTATG 15901 TCATTAACCT ATA.ATCAAAT TGTCCATTCC ATAGATTTAC TCCTTCCACC AGTAATTGGA TATTAGTTTA ACAGGTAAGG TATCTAAATG AGGAAGGTGG 15951 CACAGAGACT TCTGTATTTA TTATGCGGGC TGGTAAGAAC ATCACATCCC GTGTCTCTGA AGACATAA.AT AATACGCCCG ACCATTCTTG TAGTGTAGGG 16001 GCTATTGTAA G TT GCTCTATTTG TGGCGCTGTA CTCGATTTAT CGATAACATT CTTTTTTTAA CGAGATA.AAC ACCGCGACAT GAGCTAAATA 16051 CCCTATCAAT TGCCCATACC TGGCATCTGA TTAATGCTTG TGCTACTTCA GGGATAGTTA ACGGGTATGG ACCGTAGACT AATTACGAAC ACGATGAAGT 304

16101 GTCCTTGATC GCGTCAAGAA TGCCAGCCCG CTAGTTCCCT TTAATGGCAC
      CAGGAACTAG CGCAGTTCTT ACGGTCGGGC GATCAAGGGA AATTACCGTG
16151 CTTCGTCCTT GATCGCGTCA AGATTTATTT TCCACCCTGT TTTTTTTGGG
      GAAGCAGGAA CTAGCGCAGT TCTAAATAAA AGGTGGGACA AAAAAAACCC
16201 GGGGATGAAG CCATCGCTAT TCCCCGGAGG GGCTCAACTG GGACTCTGAG
      CCCCTACTTC GGTAGCGATA AGGGGCCTCC CCGAGTTGAC CCTGAGACTC
16251 ATAGATCTGT GATATCCTCG ACACTCTTGT TTAATACTCA GTACTCATCA
      TATCTAGACA CTATAGGAGC TGTGAGAACA AATTATGAGT CATGAGTAGT
16301 TTCGTGAATT AAGATTGTCA AGTCGTTCAA AACTGAAAGG GATAGAGATG
      AAGCACTTAA TTCTAACAGT TCAGCAAGTT TTGACTTTCC CTATCTCTAC
16351 TAGACGCCAT AACGGGTACG TTTCGATTTT TTKGATTAAA GAAGCTATGG
      ATCTGCGGTA TTGCCCATGC AAAGCTAAAA AAKCTAATTT CTTCGATACC
16401 TTTAAAAAAG ATATTTTCTT AACCCGCGTC CGAGTCTATC TCTGGCAGTA
      AAATTTTTTC TATAAAAGAA TTGGGCGCAG GCTCAGATAG AGACCGTCAT
16451 TACGTGAGTG TAAAATGCAT TTCATTGTTT CAGTACATTA ATCACTTAAT
      ATGCACTCAC ATTTTACGTA AAGTAACAAA GTCATGTAAT TAGTGAATTA
16501 CGGGCATAAA TTCATTATTA TTAGACTTCC CCCTGCTTTG TAAAATTTTG
      GCCCGTATTT AAGTAATAAT AATCTGAAGG GGGACGAAAC ATTTTAAAAC
16551 GAGCCGCCTA AAAAAAGATA AAACATATTT TGGTAAAAAC CCCCCTCCCC
      CTCGGCGGAT TTTTTTCTAT TTTGTATAAA ACCATTTTTG GGGGGAGGGG
16601 CTAATATACA CGGACTCCTC GAAAAACCCC TAAAACGAGG GCCGGACATA
      GATTATATGT GCCTGAGGAG CTTTTTGGGG ATTTTGCTCC CGGCCTGTAT
16651 TATTTTAAAA TTAGCATGCG AAATGTATTC TGTATTTATA TTGTTACACT
      ATAAAATTTT AATCGTACGC TTTACATAAG ACATAAATAT AACAATGTGA
16701 ATGAT
      TACTA

tRNA 1..70 product = tRNA-Phe
rRNA 69..1028 product = 12S ribosomal RNA
tRNA 1029..1100 product = tRNA-Val
rRNA 1101..2776 product = 16S ribosomal RNA
tRNA 2777..2851 product = tRNA-Leu
gene 2852..3826 gene = ND1 product = NADH dehydrogenase subunit 1
tRNA 3829..3897 product = tRNA-Ile
tRNA 3896..3967 product = tRNA-Gln
tRNA 3968..4036 product = tRNA-Met
gene 4037..5080 gene = ND2 product = NADH dehydrogenase subunit 2
tRNA 5080..5150 product = tRNA-Trp
tRNA complement (5152..5220) product = tRNA-Ala
tRNA complement (5221..5293) product = tRNA-Asn
tRNA complement (5327..5393) product = tRNA-Cys
tRNA complement (5395..5464) product = tRNA-Tyr
gene 5466..7023 gene = CO1 product = cytochrome c oxidase subunit 1
tRNA complement (7022..7092) product = tRNA-Ser
tRNA 7097..7166 product = tRNA-Asp
gene 7171..7861 gene = CO2 product = cytochrome c oxidase subunit 2
tRNA 7862..7935 product = tRNA-Lys
gene 7937..8104 gene = ATP8 product = ATP synthase F0 subunit 8
gene 8095..8778 gene = ATP6 product = ATP synthase F0 subunit 6
gene 8778..9563 gene = CO3 product = cytochrome c oxidase subunit 3
tRNA 9566..9635 product = tRNA-Gly
gene 9636..9986 gene = ND3 product = NADH dehydrogenase subunit 3
tRNA 9985..10054 product = tRNA-Arg
gene 10055..10351 gene = ND4L product = NADH dehydrogenase subunit 4L
gene 10345..11725 gene = ND4 product = NADH dehydrogenase subunit 4
tRNA 11726..11794 product = tRNA-His
tRNA 11795..11862 product = tRNA-Ser
tRNA 11863..11934 product = tRNA-Leu
gene 11935..13764 gene = ND5 product = NADH dehydrogenase subunit 5
gene complement (13760..14281) gene = ND6 product = NADH dehydrogenase subunit 6
tRNA complement (14282..14351) product = tRNA-Glu
gene 14354..15499 gene = CYTB product = cytochrome b
tRNA 15497..15570 product = tRNA-Thr
tRNA complement (15573..15641) product = tRNA-Pro
D-Loop 15642..16705
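The coordinates in the feature table are 1-based and inclusive, and several adjacent features share bases (for example, ATP8 and ATP6). The following is a minimal sketch, not part of the original analysis, of how such lines could be parsed to obtain feature lengths and overlaps. The function names (`parse_features`, `feature_length`, `overlap`) and the regular expression are illustrative assumptions; the four sample lines are copied from the table above.

```python
import re

# Sample feature lines copied from the table above.
# "start..end" is 1-based and inclusive; "complement (...)" marks the reverse strand.
FEATURES = """\
gene 7937..8104 gene = ATP8
gene 8095..8778 gene = ATP6
gene 8778..9563 gene = CO3
gene complement (13760..14281) gene = ND6
"""

# Hypothetical pattern for one flattened feature line (an assumption, not a GenBank parser).
PATTERN = re.compile(r"^(\w[\w-]*)\s+(complement\s*\()?(\d+)\.\.(\d+)\)?\s*(.*)$")


def parse_features(text):
    """Return (kind, strand, start, end, qualifiers) tuples, one per matching line."""
    rows = []
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            kind, comp, start, end, rest = match.groups()
            strand = "-" if comp else "+"
            rows.append((kind, strand, int(start), int(end), rest))
    return rows


def feature_length(start, end):
    """Length in bases of an inclusive interval."""
    return end - start + 1


def overlap(a, b):
    """Number of bases shared by two inclusive (start, end) intervals; 0 if disjoint."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


if __name__ == "__main__":
    rows = parse_features(FEATURES)
    for kind, strand, start, end, rest in rows:
        print(kind, strand, start, end, feature_length(start, end), rest)
    atp8, atp6, co3 = [(r[2], r[3]) for r in rows[:3]]
    print("ATP8/ATP6 overlap:", overlap(atp8, atp6))
    print("ATP6/CO3 overlap:", overlap(atp6, co3))
```

Under these assumptions, the ATP8 interval 7937..8104 spans 168 bases and shares 10 bases (8095..8104) with ATP6, while ATP6 and CO3 share the single base at position 8778.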

Lamna nasus mitochondrion, complete genome

1 GCTAGTGTAG CTTAATTTAA AGTATGGCAC TGAAGATGCT AAGATGAAA.A CGATCACATC GAATTAAATT TCATACCGTG ACTTCTACGA TTCTACTTTT 51 ATGAGAATTT TCCGCAGGCA TAAAGGTTTG GTCCTGGCCT CAGTATTAAT TACTCTTAAA AGGCGTCCGT ATTTCCAAAC CAGGACCGGA GTCATAATTA 101 TGTAACCAAA ATTATACATG CAAGTTTCAG CATCCCTGTG AGAATGCCCT ACATTGGTTT TAATATGTAC GTTCAAAGTC GTAGGGACAC TCTTACGGGA 151 AACTACTCTA TCAATTGGTT AGGGGCGGGT ATCAGGCACA CATGCACGTA TTGATGAGAT AGTTAACCAA TCCCCGCCCA TAGTCCGTGT GTACGTGCAT 201 GCCCAAGACA CCTTGCTAAG CCACACCCCC AAGGGATTTC AGCAGTAATA CGGGTTCTGT GGAACGATTC GGTGTGGGGG TTCCCTAAAG TCGTCATTAT 251 AATATTGATC ATATGAGCGC AAGCTCGAAT CAGTTAAAGT TAACAGAGTT TTATAACTAG TATACTCGCG TTCGAGCTTA GTCAATTTCA ATTGTCTCAA 301 GGTTAATCTC GTGCCAGCCG CCGCGGTTAT ACGAGTAACT CATATTAATA CCAATTAGAG CACGGTCGGC GGCGCCAATA TGCTCATTGA GTATAATTAT 351 TTTCCCGGCG TA.AAGAGTGA TTTAAGGAAT ATCTACCAAT AACTAAAGTT AAAGGGCCGC ATTTCTCACT A.AATTC C TTA TAGATGGTTA TTGATTTCAA 401 AAGACCTTAT CAAGCTGTCA CACGCACCCA CAAGCGGAAT TGTCAACAAC TTCTGGAATA GTTCGACAGT GTGCGTGGGT GTTCGCCTTA ACAGTTGTTG 451 GAAAGTGACT TTATTATCCC TAGAAATCTT GATGTCACGA CAGTTAGACC CTTTCACTGA AATAATAGGG ATCTTTAGAA CTACAGTGCT GTCAATCTGG 501 CCAAACTAGG ATTAGATACC CTACTATGTC TAACCACAAA C T TA.AAC AAT GGTTTGATCC TAATCTATGG GATGATACAG ATTGGTGTTT GAATTTGTTA 551 AACTCACTAC ATTGTTCGCC AGAGTACTAC AAGCGCTAGC TTAAA.AC C CA TTGAGTGATG TAACAAGCGG TCTCATGATG TTCGCGATCG AATTTTGGGT 601 AAGGACTTGG CGGTGTCCCA AACCCACCTA GAGGAGCCTG TTCTGTAACC TTCCTGAACC GCCACAGGGT TTGGGTGGAT CTCCTCGGAC AAGACATTGG 651 GATAATCCCC GTTAAACCTC ACCACTTCTA GCCATCCCCG TCTATATACC CTATTAGGGG CAATTTGGAG TGGTGAAGAT CGGTAGGGGC AGATATATGG 701 GCCGTCGTCA GCTCACCCTG TGAAGGCCTA AA.AGTAAGCA AAAAGAACTA CGGCAGCAGT CGAGTGGGAC ACTTCCGGAT TTTCATTCGT TTTTCTTGAT 751 ACTTCCATAC GTCAGGTCGA GGTGTAGCGA ATGAAGTGGG TAGAAATGGG TGAAGGTATG CAGTCCAGCT CCACATCGCT TACTTCACCC ATCTTTACCC 801 CTACATTTTC TATAAAGAAA ACACGAATGG CA.AAC TGAAA AATTGCCTAA GATGTAAAAG ATATTTCTTT TGTGCTTACC GTTTGACTTT TTAACGGATT 851 AGGTGGATTT AGCAGTAAGA AAAGACTAGA GAGCTTCTCT 
GAAACCGGCT TCCACCTAAA TCGTCATTCT TTTCTGATCT CTCGAAGAGA CTTTGGCCGA
901 CTGGGACGCG CACACACCGC CCGTCACTCT CCTCAAAAAA ATCTACTTAT GACCCTGCGC GTGTGTGGCG GGCAGTGAGA GGAGTTTTTT TAGATGAATA
951 TTTTAATTAA AGAAAATACA TCAAGAGGAG GCAAGTCGTA ACATGGTAAG AAAATTAATT TCTTTTATGT AGTTCTCCTC CGTTCAGCAT TGTACCATTC
1001 TGTACTGGAA AGTGCACTTG GAATCAAAAT GTGGCTAAAC TAGCAAAGCA ACATGACCTT TCACGTGAAC CTTAGTTTTA CACCGATTTG ATCGTTTCGT
1051 CCTCCCTTAC ACCGAGGAAA TACTCGTGCA ATTCGAGTCA TTTTGAACAT GGAGGGAATG TGGCTCCTTT ATGAGCACGT TAAGCTCAGT AAAACTTGTA
1101 TAAAGCTAGC CTATCCACCT ACCTCAAACC CAACATTATT AATTACCTTA

ATTTCGATCG GATAGGTGGA TGGAGTTTGG GTTGTAATAA TTAATGGAAT
1151 CGTATTTACA CCTAACTAAA ACATTTTATC ATTTTAGTAT GGGCGACAGA GCATAAATGT GGATTGATTT TGTAAAATAG TAAAATCATA CCCGCTGTCT
1201 ACAAAAACTC AGCGCAATAG ACCATGTACC GCAAGGGAAA GCTGAAAGAG TGTTTTTGAG TCGCGTTATC TGGTACATGG CGTTCCCTTT CGACTTTCTC
1251 AAATGAAACA AATAATTAAA GTAATAAAAA GCAGAGATTC CACCTCGTAC TTTACTTTGT TTATTAATTT CATTATTTTT CGTCTCTAAG GTGGAGCATG
1301 CTTTTGCATC ATGATTTAGC TAGAAAAACT AGACAAAGAG ATCTTTAGCC GAAAACGTAG TACTAAATCG ATCTTTTTGA TCTGTTTCTC TAGAAATCGG
1351 TATCTTCCCG AAACTAAACG AGCTACTCCG AAGCAGCACA ACTTAGAGCC ATAGAAGGGC TTTGATTTGC TCGATGAGGC TTCGTCGTGT TGAATCTCGG
1401 AACCCGTCTC TGTGGCAAAA GAGTGGGAAG ACTTCCGAGT AGCGGTGACA TTGGGCAGAG ACACCGTTTT CTCACCCTTC TGAAGGCTCA TCGCCACTGT
1451 AGCCTATCGA GTTTAGTGAT AGCTGGTTGT CCAAGAAAAG AACTTCAATT TCGGATAGCT CAAATCACTA TCGACCAACA GGTTCTTTTC TTGAAGTTAA
1501 CTGCATTAAT TCTTTCATCA CCAAAAAGTC TATCATACTA AGGTCAAACA GACGTAATTA AGAAAGTAGT GGTTTTTCAG ATAGTATGAT TCCAGTTTGT
1551 TAAGAATTAA TAGTTATTCA GAAGAGGTAC AGCCCTTCTG AACCAAGACA ATTCTTAATT ATCAATAAGT CTTCTCCATG TCGGGAAGAC TTGGTTCTGT
1601 CAACTTTCAA AGGAGGGAAA TGATCACATT TATCAAGGTT CTTACCCCAG GTTGAAAGTT TCCTCCCTTT ACTAGTGTAA ATAGTTCCAA GAATGGGGTC
1651 TGGGCCCAAA AGCAGCCACC TGTGAAGTAA GCGTCACAGC TCCAGTCTCA ACCCGGGTTT TCGTCGGTGG ACACTTCATT CGCAGTGTCG AGGTCAGAGT
1701 CAAAAACCTA TAATTCAGAT ATTCTTCTCA GCACCCCCTT GACTATATTG GTTTTTGGAT ATTAAGTCTA TAAGAAGAGT CGTGGGGGAA CTGATATAAC
1751 GACTATTTTA TAAAATTATA AAAGAACTTA TGCTAAAATG AGTAATAAGA CTGATAAAAT ATTTTAGTAT TTTCTTGAAT ACGATTTTAC TCATTATTCT
1801 GGTTAAACCT CTCCCGACAC AAGTGTACAT CAGAAAGAAT TAATTCACTG CCAATTTGGA GAGGGCTGTG TTCACATGTA GTCTTTCTTA ATTAAGTGAC
1851 ATAATTAAAC GAACCCAAAC TGAGGTCATT ATATTAATAT TTTACCTTAA TATTAATTTG CTTGGGTTTG ACTCCAGTAA TATAAGTATA AAATGGAATT
1901 CTAGAAAATC TTATTATAAC ATTCGTTAAT CCTACACAGG AGTGTCTTAA GATCTTTTAG AATAATATTG TAAGCAATTA GGATGTGTCC TCACAGAATT
1951 GGAAAGATTA AAAGAAAATA AAGGAACTCG GCAAACACGA ACTCCGCCTG CCTTTCTAAT
TTTCTTTTAT TTCCTTGAGC CGTTTGTGCT TGAGGCGGAC
2001 TTTACCAAAA ACATCGCCTC TTGGAAACCC TATAAGAGGT CCCGCCTGCC AAATGGTTTT TGTAGCGGAG AACCTTTGGG ATATTCTCCA GGGCGGACGG
2051 CTGTGACAAT GTTTAACGGC CGCGGTATTC TGACCGTGCA AAGGTAGCGT GACACTGTTA CAAATTGCCG GCGCCATAAG ACTGGCACGT TTCCATCGCA
2101 AATCACTTGT CTTTTAAATG AAGACCCGTA TGAAAGGCAT CACGAGAGTT TTAGTGAACA GAAAATTTAC TTCTGGGCAT ACTTTCCGTA GTGCTCTCAA
2151 CAACTGTCTC TATTTTCTAA TCAATGAAAT TGATCTACCC GTGCAGAAGC GTTGACAGAG ATAAAAGATT AGTTACTTTA ACTAGATGGG CACGTCTTCG
2201 GGGTATAACT ACATTAGACG AGAAGACCCT ATGGAGCTTC AAACACATAA CCCATATTGA TGTAATCTGC TCTTCTGGGA TACCTCGAAG TTTGTGTATT
2251 ATTAATTATG TAGGTTAACC ATTCTACGGA TATAAACAAA AATACAATAC TAATTAATAC ATCCAATTGG TAAGATGCCT ATATTTGTTT TTATGTTATG
2301 TTTTAATTTA ACTGTCTTTG GTTGGGGTGA CCAAGGGGAA AAAACTATCC AAAATTAAAT TGACAGAAAC CAACCCCACT GGTTCCCCTT TTTTGATAGG
2351 CCCTTATCGA CCGAGTGCTC TCAAGTACTT AAAAATTAGA ATTACAATTC GGGAATAGCT GGCTCACGAG AGTTCATGAA TTTTTAATCT TAATGTTAAG
2401 TAATTAATAA AATATTTACC GAAAAATGAC CCAGGATTTC CTGATCAATG ATTAATTATT TTATAAATGG CTTTTTACTG GGTCCTAAAG GACTAGTTAC
2451 AACCAAGTTA CCCTAGGGAT AACAGCGCAA TCCTTTCTCA GAGTCCCTAT TTGGTTCAAT GGGATCCCTA TTGTCGCGTT AGGAAAGAGT CTCAGGGATA

2501 CGACGAAAGG GTTTACGACC TCGATGTTGG ATCAGGACAT CCTAATGATG GCTGCTTTCC CAAATGCTGG AGCTACAACC TAGTCCTGTA GGATTACTAC
2551 CAGCCGTCAT TAAGGGTTCG TTTGTTCAAC GATTAACAGT CCTACGTGAT GTCGGCAGTA ATTCCCAAGC AAACAAGTTG CTAATTGTCA GGATGCACTA
2601 CTGAGTTCAG ACCGGAGAAA TCCAGGTCAG TTTCTATCTA TGAATTTATT GACTCAAGTC TGGCCTCTTT AGGTCCAGTC AAAGATAGAT ACTTAAATAA
2651 TTTCCTAGTA CGAAAGGACC GGAAAAATGA AGCCAATACC CTAGGCACGC AAAGGATCAT GCTTTCCTGG CCTTTTTACT TCGGTTATGG GATCCGTGCG
2701 TTCATTTTCA TCTATTGAAA TAAACTAAAA TAGATAAGAA AAAGCCAACT AAGTAAAAGT AGATAACTTT ATTTGATTTT ATCTATTCTT TTTCGGTTGA
2751 ACTACCCAAG AAAAGGGTTG TTGGGGTGGC AGAGCCTGGT AATTGCAAAA TGATGGGTTC TTTTCCCAAC AACCCCACCG TCTCGGACCA TTAACGTTTT
2801 GACCTAAGCT CTTTATTCCA GAGGTTCAAA TCCTCTCCTC AACCATGCTT CTGGATTCGA GAAATAAGGT CTCCAAGTTT AGGAGAGGAG TTGGTACGAA
2851 GAAGCCCTTC TTCTTTACTT AATCTGTCCA CTAACCTATA TTGTCCCTAT CTTCGGGAAG AAGAAATGAA TTAGACAGGT GATTGGATAT AACAGGGATA
2901 CCTACTAGCT ACGGCCTTCC TTACCTTAGT TGAACGAAAA ATCCTCGGTT GGATGATCGA TGCCGGAAGG AATGGAATCA ACTTGCTTTT TAGGAGCCAA
2951 ATATACAGCT CCGCAAAGGT CCCAACATTG TAGGCCCATA TGGCTTACTC TATATGTCGA GGCGTTTCCA GGGTTGTAAC ATCCGGGTAT ACCGAATGAG
3001 CAACCTATTG CAGACGGCCT AAAACTATTT ACCAAAGAAC CTATCTACCC GTTGGATAAC GTCTGCCGGA TTTTGATAAA TGGTTTCTTG GATAGATGGG
3051 ATCAGCATCT TCCCCTTTCC TATTCTTGGT TGCCCCCACA ATGGCTCTTA TAGTCGTAGA AGGGGAAAGG ATAAGAACCA ACGGGGGTGT TACCGAGAAT
3101 CACTGGCCCT CCTCATATGA ATGCCCCTCC CTCTTCCCCA CTCCGTTATT GTGACCGGGA GGAGTATACT TACGGGGAGG GAGAAGGGGT GAGGCAATAA
3151 AATCTCAATC TGGGCTTACT TTTCATTCTA GCAATCTCAA GTCTGACCGT TTAGAGTTAG ACCCGAATGA AAAGTAAGAT CGTTAGAGTT CAGACTGGCA
3201 CTACACTATC TTGGGCTCCG GATGAGCATC AAATTCAAAA TACGCCCTAA GATGTGATAG AACCCGAGGC CTACTCGTAG TTTAAGTTTT ATGCGGGATT
3251 TAGGGGCCTT ACGAGCTGTA GCACAAACAA TCTCCTACGA AGTAAGTCTT ATCCCCGGAA TGCTCGACAT CGTGTTTGTT AGAGGATGCT TCATTCAGAA
3301 GGGTTAATCC TCCTATCAAT AATTATTTTT ACAGGCGGCT TCACCCTTCA CCCAATTAGG AGGATAGTTA TTAATAAAAA TGTCCGCCGA AGTGGGAAGT
3351 TACATTCAAC TTAACCCAAG
AAACAATTTG ACTAATCGTC CCAGGCTGAC ATGTAAGTTG AATTGGGTTC TTTGTTAAAC TGATTAGCAG GGTCCGACTG
3401 CCTTAGCCCT AATATGATAC GTATCAACCC TAGCAGAAAC CAACCGGGTA GGAATCGGGA TTATACTATG CATAGTTGGG ATCGTCTTTG GTTGGCCCAT
3451 CCATTTGATC TAACAGAAGG AGAATCAGAA CTAGTCTCAG GCTTTAACAT GGTAAACTAG ATTGTCTTCC TCTTAGTCTT GATCAGAGTC CGAAATTGTA
3501 CGAATATGCA GGAGGCTCAT TCGCTCTATT TTTCCTCGCT GAGTATACAA GCTTATACGT CCTCCGAGTA AGCGAGATAA AAAGGAGCGA CTCATATGTT
3551 ACATTTTACT AATAAACACC CTCTCAGTCA TCCTCTTCAT GGGCTCCTCC TGTAAAATGA TTATTTGTGG GAGAGTCAGT AGGAGAAGTA CCCGAGGAGG
3601 TATGACCCCT TCTTTCCAGA AATCTCAACC CTCAGCTTGA TAATAAAAGC ATACTGGGGA AGAAAGGTCT TTAGAGTTGG GAGTCGAACT ATTATTTTCG
3651 AACCTCACTT ACCCTACTTT TTTTATGAAT TCGAGCATCA TATCCTCGCT TTGGAGTGAA TGGGATGAAA AAAATACTTA AGCTCGTAGT ATAGGAGCGA
3701 TTCGCTACGA TCAGCTTATA CATTTAGTAT GAAAAAATTT CCTACCCTTA AAGCGATGCT AGTCGAATAT GTAAATCATA CTTTTTTAAA GGATGGGAAT
3751 ACCCTAGCAA TTATATTATG ACATATTGCC CTCCCCATCG CCACAGCAAG TGGGATCGTT AATATAATAC TGTATAACGG GAGGGGTAGC GGTGTCGTTC
3801 CCTACCCCCT CTAACCTAAA AGGAAGCGTG CCTGAACAAA GGACCACTTT GGATGGGGGA GATTGGATTT TCCTTCGCAC GGACTTGTTT CCTGGTGAAA
3851 GATAGAGTGG ATAATGAAAG TTAAAATCTT TCCTCTTCCT AGAAAAATAG

CTATCTCACC TATTACTTTC AATTTTAGAA AGGAGAAGGA TCTTTTTATC
3901 GGTTTGAACC TATGCCTAAG AGATCAAAAC TCTTCGTGCT TCCAATTATA CCAAACTTGG ATACGGATTC TCTAGTTTTG AGAAGCACGA AGGTTAATAT
3951 CTACTTTCTA AGTAAGGTCA GCTAACAAAG CTTTTGGGCC CATACCCCAA GATGAAAGAT TCATTCCAGT CGATTGTTTC GAAAACCCGG GTATGGGGTT
4001 CCATGTCGGT TAAAATCCCT CCCCTACTAA TGAACCCTAT CGTATTAACC GGTACAGCCA ATTTTAGGGA GGGGATGATT ACTTGGGATA GCATAATTGG
4051 ATCATCATTT CAAGCCTGGG CCTGGGAACT ATCCTCACAT TCATCGGCTC TAGTAGTAAA GTTCGGACCC GGACCCTTGA TAGGAGTGTA AGTAGCCGAG
4101 CCACTGACTT CTAGTCTGAA TGGGCCTTGA AATCAACACC TTAGCCATCC GGTGACTGAA GATCAGACTT ACCCGGAACT TTAGTTGTGG AATCGGTAGG
4151 TCCCTCTAAT AATTCGCCAG CACCACCCTC GAGCTGTAGA AGCCTCTACA AGGGAGATTA TTAAGCGGTC GTGGTGGGAG CTCGACATCT TCGGAGATGT
4201 AAATACTTCA TCACACAAGC CACCGCCTCA GCCCTACTTC TATTTGCTAG TTTATGAAGT AGTGTGTTCG GTGGCGGAGT CGGGATGAAG ATAAACGATC
4251 CGTTACAAAC GCCTGAACCT CAGGCGAATG AAACCTGGTT GAAATAGTCA GCAATGTTTG CGGACTTGGA GTCCGCTTAC TTTGGACCAA CTTTATCAGT
4301 ATCCAGGCTC TGCCACACTG GCCACAATCG CATTAGCATT AAAAATCGGC TAGGTCCGAG ACGGTGTGAC CGGTGTTAGC GTAATCGTAA TTTTTAGCCG
4351 CTAGCCCCCC TTCACTTCTG ACTCCCCGAA GTCCTTCAAG GCCTAGACCT GATCGGGGGG AAGTGAAGAC TGAGGGGCTT CAGGAAGTTC CGGATCTGGA
4401 AACCACCGGC CTCATCCTCT CCACCTGACA AAAACTCGCC CCTTTTGCCA TTGGTGGCCG GAGTAGGAGA GGTGGACTGT TTTTGAGCGG GGAAAACGGT
4451 TTCTTCTACA ACTTTACCCC TCATTAAATC CCAACCTTCT TATCTTTCTT AAGAAGATGT TGAAATGGGG AGTAATTTAG GGTTGGAAGA ATAGAAAGAA
4501 GGTGTACTCT CGACCATAGT AGGGGGCTGA GGAGGATTAA ATCAAACCCA CCACATGAGA GCTGGTATCA TCCCCCGACT CCTCCTAATT TAGTTTGGGT
4551 ACTACGAAAA ATCCTAGCCT ACTCCTCAAT CGCACACCTC GGATGAATAA TGATGCTTTT TAGGATCGGA TGAGGAGTTA GCGTGTGGAG CCTACTTATT
4601 TCTCCATCCT CCACTACTCC CATAACTTAA CCCAACTTAA CTTAATCCTC AGAGGTAGGA GGTGATGAGG GTATTGAATT GGGTTGAATT GAATTAGGAG
4651 TATATCATCA TAACCTCGAC AACCTTCCTC CTATTCAAGA CATTTAACTC ATATAGTAGT ATTGGAGCTG TTGGAAGGAG GATAAGTTCT GTAAATTGAG
4701 AACCAAAATC AATTCCATCT CCTCTTCATC AGCAAAATCC CCCTTACTTT TTGGTTTTAG
TTAAGGTAGA GGAGAAGTAG TCGTTTTAGG GGGAATGAAA
4751 CCGTTATTGC CCTCATAACC CTTCTCTCCC TTGGGGGCCT ACCCCCTCTA GGCAATAACG GGAGTATTGG GAAGAGAGGG AACCCCCGGA TGGGGGAGAT
4801 TCAGGCTTCA TACCAAAATG ACTTATTTTA CAAGAATTAA CTAAACAGAA AGTCCGAAGT ATGGTTTTAC TGAATAAAAT GTTCTTAATT GATTTGTCTT
4851 CTTAGTTATC CCAGCCATTA TCATGGCTAT AATGGCTCTC CTCAGTCTAT GAATCAATAG GGTCGGTAAT AGTACCGATA TTACCGAGAG GAGTCAGATA
4901 TTTTTTACCT ACGCCTATGC TACGCCACAA CACTAACCAT AACCCCCAAC AAAAAATGGA TGCGGATACG ATGCGGTGTT GTGATTGGTA TTGGGGGTTG
4951 CCAATCAACA TACTAACATC ATGACGAACC AAATTACCTC ACAACTTGAC GGTTAGTTGT ATGATTGTAG TACTGCTTGG TTTAATGGAG TGTTGAACTG
5001 CCTAACAACC ACCGCCTCAT TGTCTATTTT CCTCCTTCCA ATCACCCCTG GGATTGTTGG TGGCGGAGTA ACAGATAAAA GGAGGAAGGT TAGTGGGGAC
5051 CTATCCTCAT ACTAGTATCC TAAGAAATTT AGGTTAACAA CAGACCAAAA GATAGGAGTA TGATCATAGG ATTCTTTAAA TCCAATTGTT GTCTGGTTTT
5101 GCCTTCAAAG CTTTAAGTAG AAGTGAAAAT CTCCTAATTT CTGTTAAGAT CGGAAGTTTC GAAATTCATC TTCACTTTTA GAGGATTAAA GACAATTCTA
5151 TTGCAAGACT TTACCTCACA TCTTCTGAAT GCAACCCAGA TACTTTCATT AACGTTCTGA AATGGAGTGT AGAAGACTTA CGTTGGGTCT ATGAAAGTAA
5201 AAGCTAAAAC CTTCTAGATA AATAGGCCTT GATCCTACAA AATCTTAGTT TTCGATTTTG GAAGATCTAT TTATCCGGAA CTAGGATGTT TTAGAATCAA

5251 AACAGCTAAG CGTTCAATCC AGCGAACTTC TATCTACTTT CTCCCGCCGT TTGTCGATTC GCAAGTTAGG TCGCTTGAAG ATAGATGAAA GAGGGCGGCA
5301 AGAGGAAAAA GGCGGGAGAA AGCCCCGGGA GAAACTAATC TCCATCTTTG TCTCCTTTTT CCGCCCTCTT TCGGGGCCCT CTTTGATTAG AGGTAGAAAC
5351 GATTTGCAAT CCAACATAAA CATCTACTGC AGGACTATGG CAAGAAGAGG CTAAACGTTA GGTTGTATTT GTAGATGACG TCCTGATACC GTTCTTCTCC
5401 AATTGGACCT CTGTACATGG AGCTACAACC CATTACTTAG TTCTCAGTCA TTAACCTGGA GACATGTACC TCGATGTTGG GTAATGAATC AAGAGTCAGT
5451 CCTTACCTGT GGCAATTAAT CGATGACTAT TTTCTACAAA CCACAAAGAT GGAATGGACA CCGTTAATTA GCTACTGATA AAAGATGTTT GGTGTTTCTA
5501 ATCGGCACCC TCTATTTAAT CTTTGGTGCA TGGGCAGGAA TAGTGGGAAC TAGCCGTGGG AGATAAATTA GAAACCACGT ACCCGTCCTT ATCACCCTTG
5551 AGCCCTAAGC CTTTTAATTC GCGCTGAGCT GGGTCAGCCT GGTTCCCTCC TCGGGATTCG GAAAATTAAG CGCGACTCGA CCCAGTCGGA CCAAGGGAGG
5601 TAGGCGACGA TCAGATTTAT AATGTTATTG TAACCGCCCA TGCATTTGTA ATCCGCTGCT AGTCTAAATA TTACAATAAC ATTGGCGGGT ACGTAAACAT
5651 ATGATTTTCT TTATAGTAAT GCCTGTGATA ATTGGGGGCT TTGGAAACTG TACTAAAAGA AATATCATTA CGGACACTAT TAACCCCCGA AACCTTTGAC
5701 ACTAGTACCA TTAATAATTG GTGCACCAGA TATGGCCTTC CCTCGAATAA TGATCATGGT AATTATTAAC CACGTGGTCT ATACCGGAAG GGAGCTTATT
5751 ATAACATAAG TTTCTGACTC CTCCCTCCTT CTTTTCTCCT ACTCCTAGCT TATTGTATTC AAAGACTGAG GAGGGAGGAA GAAAAGAGGA TGAGGATCGA
5801 TCGGCCGGAG TCGAAGCAGG GGCTGGTACT GGCTGAACGG TTTACCCTCC AGCCGGCCTC AGCTTCGTCC CCGACCATGA CCGACTTGCC AAATGGGAGG
5851 CCTAGCTGGC AACTTAGCAC ATGCCGGGGC CTCCGTTGAT CTGGCTATCT GGATCGACCG TTGAATCGTG TACGGCCCCG GAGGCAACTA GACCGATAGA
5901 TTTCCCTTCA CCTAGCGGGT ATCTCTTCAA TCTTAGCTTC AATTAACTTC AAAGGGAAGT GGATCGCCCA TAGAGAAGTT AGAATCGAAG TTAATTGAAG
5951 ATTACAACTA TTATTAATAT AAAACCACCA GCAATTTCCC AGTACCAGAC TAATGTTGAT AATAATTATA TTTTGGTGGT CGTTAAAGGG TCATGGTCTG
6001 GCCCCTATTT GTGTGATCCA TTCTAGTAAC AACTATTCTC CTTCTTTTAG CGGGGATAAA CACACTAGGT AAGATCATTG TTGATAAGAG GAAGAAAATC
6051 CCCTCCCAGT ACTTGCAGCC GGCATCACAA TACTACTCAC TGACCGAAAT GGGAGGGTCA TGAACGTCGG CCGTAGTGTT ATGATGAGTG ACTGGCTTTA
6101 CTAAATACAA
CTTTCTTTGA CCCAGCAGGA GGGGGAGACC CTATTCTCTA GATTTATGTT GAAAGAAACT GGGTCGTCCT CCCCCTCTGG GATAAGAGAT
6151 CCAACACTTG TTCTGATTCT TTGGTCACCC AGAGGTCTAC ATTCTTATCC GGTTGTGAAC AAGACTAAGA AACCAGTGGG TCTCCAGATG TAAGAATAGG
6201 TTCCTGGTTT TGGCATAATT TCCCATATTG TAGCTTACTA CTCCGGTAAA AAGGACCAAA ACCGTATTAA AGGGTATAAC ATCGAATGAT GAGGCCATTT
6251 AAGGAACCAT TTGGCTACAT GGGCATGGTC TGAGCAATAA TAGCAATTGG TTCCTTGGTA AACCGATGTA CCCGTACCAG ACTCGTTATT ATCGTTAACC
6301 CCTACTAGGG TTCATTGTCT GAGCCCATCA CATATTCACC GTCGGAATGG GGATGATCCC AAGTAACAGA CTCGGGTAGT GTATAAGTGG CAGCCTTACC
6351 ACGTTGACAC ACGAGCCTAC TTCACCTCAG CAACAATAAT TATCGCCATC TGCAACTGTG TGCTCGGATG AAGTGGAGTC GTTGTTATTA ATAGCGGTAG
6401 CCCACAGGTG TAAAAGTTTT CAGCTGATTA GCAACCCTTC ATGGAGGTTC GGGTGTCCAC ATTTTCAAAA GTCGACTAAT CGTTGGGAAG TACCTCCAAG
6451 TGTCAAATGA GAAACCCCCT TACTATGGGC CCTCGGGTTT ATTTTCCTAT ACAGTTTACT CTTTGGGGGA ATGATACCCG GGAGCCCAAA TAAAAGGATA
6501 TCACAGTAGG GGGTCTAACA GGAATCGTTC TAGCCAACTC CTCCCTAGAC AGTGTCATCC CCCAGATTGT CCTTAGCAAG ATCGGTTGAG GAGGGATCTG
6551 ATTGTTCTCC ACGATACTTA TTATGTAGTA GCCCACTTCC ACTACGTCCT TAACAAGAGG TGCTATGAAT AATACATCAT CGGGTGAAGG TGATGCAGGA
6601 CTCAATAGGA GCAGTCTTCG CTATCATGGC AGGCTTCATT CACTGATTCC

GAGTTATCCT CGTCAGAAGC GATAGTACCG TCCGAAGTAA GTGACTAAGG
6651 CTTTAATAAC CGGCTTCACC CTTCATTCAA CTTGAACAAA AATCCAATTC GAAATTATTG GCCGAAGTGG GAAGTAAGTT GAACTTGTTT TTAGGTTAAG
6701 GCAGTCATGT TTATCGGAGT AAATCTCACA TTCTTCCCAC AACATTTTCT CGTCAGTACA AATAGCCTCA TTTAGAGTGT AAGAAGGGTG TTGTAAAAGA
6751 AGGTCTCGCC GGTATACCGC GACGTTACTC AGACTACCCG GACGCCTACA TCCAGAGCGG CCATATGGCG CTGCAATGAG TCTGATGGGC CTGCGGATGT
6801 CCCTATGAAA TACAGTCTCC TCTATCGGCT CTTTAATTTC ACTTGTAGCA GGGATACTTT ATGTCAGAGG AGATAGCCGA GAAATTAAAG TGAACATCGT
6851 GTGATTATAC TCCTCTTCAT TATTTGAGAA GCATTCGCCT CAAAACGAGA CACTAATATG AGGAGAAGTA ATAAACTCTT CGTAAGCGGA GTTTTGCTCT
6901 AGTATTATCC GTCGAGCTGC CTCATACAAA TGTAGAATGA CTCCATGGCT TCATAATAGG CAGCTCGACG GAGTATGTTT ACATCTTACT GAGGTACCGA
6951 GCCCTCCTCC CTACCACACA TATGAAGAGC CAGCATTTGT TCAAGTACAA CGGGAGGAGG GATGGTGTGT ATACTTCTCG GTCGTAAACA AGTTCATGTT
7001 CGAACCCTTT AAGACAAGAA AGGAAGGAAT TGAACCCCCA TATGTTAGTT GCTTGGGAAA TTCTGTTCTT TCCTTCCTTA ACTTGGGGGT ATACAATCAA
7051 TCAAGCTAAC CACATCACCA CTCTGTCACT TTCTTTATAG AGATCCTAGT AGTTCGATTG GTGTAGTGGT GAGACAGTGA AAGAAATATC TCTAGGATCA
7101 AAAATGTATT ACATTGCCTT GTCAAGGCAA AATTGTGAGT TTAAATCCCA TTTTACATAA TGTAACGGAA CAGTTCCGTT TTAACACTCA AATTTAGGGT
7151 CGGATCTTAA TTAATGGCAC ACCCCTCACA ATTAGGATTT CAAGATGCAG GCCTAGAATT AATTACCGTG TGGGGAGTGT TAATCCTAAA GTTCTACGTC
7201 CCTCCCCAGT TATGGAAGAA CTTATTCACT TTCACGACCA CACACTAATA GGAGGGGTCA ATACCTTCTT GAATAAGTGA AAGTGCTGGT GTGTGATTAT
7251 ATTGTATTTC TAATTAGCGC TCTGGTTCTT TATATTATTA CGGCGATAGT TAACATAAAG ATTAATCGCG AGACCAAGAA ATATAATAAT GCCGCTATCA
7301 GTCAACAAAA CTTACAAATA AATATATTCT TGACTCCCAA GAAATTGAAA CAGTTGTTTT GAATGTTTAT TTATATAAGA ACTGAGGGTT CTTTAACTTT
7351 TCGTTTGAAC TATCCTGCCC GCCATCATCC TTATTATAAT TGCCCTACCA AGCAAACTTG ATAGGACGGG CGGTAGTAGG AATAATATTA ACGGGATGGT
7401 TCTCTACGAA TTCTGTATCT TATGGACGAA ATTAATGATC CCCACCTAAC AGAGATGCTT AAGACATAGA ATACCTGCTT TAATTACTAG GGGTGGATTG
7451 TATTAAAGCC ATGGGTCACC AGTGATACTG AAGCTATGAG TATACAGATT ATAATTTCGG TACCCAGTGG
TCACTATGAC TTCGATACTC ATATGTCTAA
7501 ATGAAGATCT AGCTTTTGAC TCTTACATAG TTCAAACCCA AGACTTAACC TACTTCTAGA TCGAAAACTG AGAATGTATC AAGTTTGGGT TCTGAATTGG
7551 CCCGGCCAAT TTCGTCTACT GGAGACAGAC CATCGAATAG TAGTCCCCAT GGGCCGGTTA AAGCAGATGA CCTCTGTCTG GTAGCTTATC ATCAGGGGTA
7601 GGAGTCCCCT GTTCGCGTCC TGGTGTCCGC AGAAGATGTC CTACACTCAT CCTCAGGGGA CAAGCGCAGG ACCACAGGCG TCTTCTACAG GATGTGAGTA
7651 GAGCTGTACC AGCCTTAGGG GTTAAAATAG ACGCTGTCCC AGGACGTTTA CTCGACATGG TCGGAATCCC CAATTTTATC TGCGACAGGG TCCTGCAAAT
7701 AACCAAACTG CCTTCATCAT CTCCCGACCA GGTGTCTACT ATGGCCAGTG TTGGTTTGAC GGAAGTAGTA GAGGGCTGGT CCACAGATGA TACCGGTCAC
7751 TTCAGAAATT TGTGGGGCTA ACCACAGCTT CATGCCTATT GTAGTAGAAG AAGTCTTTAA ACACCCCGAT TGGTGTCGAA GTACGGATAA CATCATCTTC
7801 CAGTCCCTCT AGAACACTTC GAAGCCTGAT CTTCATTAAT GCTAGAAGAA GTCAGGGAGA TCTTGTGAAG CTTCGGACTA GAAGTAATTA CGATCTTCTT
7851 GCCTCACTAA GAAGCTAAAT CGGGACTAGC GTTAGCCTTT TAAGCTAAAA CGGAGTGATT CTTCGATTTA GCCCTGATCG CAATCGGAAA ATTCGATTTT
7901 ACTGGTGACT CCCTACCACC CTTAGTGACA TGCCCCAATT AAACCCTCAC TGACCACTGA GGGATGGTGG GAATCACTGT ACGGGGTTAA TTTGGGAGTG
7951 CCTTGATTAA TTATCCTTTT GTTCTCATGA ATAATTTTCC TCATTATTTT GGAACTAATT AATAGGAAAA CAAGAGTACT TATTAAAAGG AGTAATAAAA

8001 ACCAAAAAAA GTGATAAATC ACCTATTTAG CAACAACCCA ACATTAAAAA TGGTTTTTTT CACTATTTAG TGGATAAATC GTTGTTGGGT TGTAATTTTT
8051 GCACAGAAAT ATCTAAACCC GAACCCTGAA ATTGACCATG ATCATAAGCT CGTGTCTTTA TAGATTTGGG CTTGGGACTT TAACTGGTAC TAGTATTCGA
8101 TCTTCGACCA ATTCCTAAGC CCCTCCCTCC TTGGAGTCCC ACTAATTGCC AGAAGCTGGT TAAGGATTCG GGGAGGGAGG AACCTCAGGG TGATTAACGG
8151 CTAGCAATTG CCCTACCATG ACTAATTTTC CCAACCCCCA CTAGCCGGTG GATCGTTAAC GGGATGGTAC TGATTAAAAG GGTTGGGGGT GATCGGCCAC
8201 ACTTAATAAT CGACTGATAA CGCTCCAAAA CTGATTTATT AACCGATTTG TGAATTATTA GCTGACTATT GCGAGGTTTT GACTAAATAA TTGGCTAAAC
8251 TTTACCAACT TATACAACCC ATCAACTTTG CCGGCCATAA ATGAGCCGTG AAATGGTTGA ATATGTTGGG TAGTTGAAAC GGCCGGTATT TACTCGGCAC
8301 CTATTTACAG CATTAATGTT ATTCTTAATT ACCATCAACT TATTAGGCCT GATAAATGTC GTAATTACAA TAAGAATTAA TGGTAGTTGA ATAATCCGGA
8351 TCTCCCTTAC ACCTTCACAC CCACAACACA ACTTTCCCTC AACATGGCAT AGAGGGAATG TGGAAGTGTG GGTGTTGTGT TGAAAGGGAG TTGTACCGTA
8401 TTGCTCTACC TTTATGACTC ACCACCGTCT TAATCGGAAT ATTGAATCAA AACGAGATGG AAATACTGAG TGGTGGCAGA ATTAGCCTTA TAACTTAGTT
8451 CCCACAATTG CCCTAGGCCA CTTCCTGCCG GAAGGTACTC CCACCCTTCT GGGTGTTAAC GGGATCCGGT GAAGGACGGC CTTCCATGAG GGTGGGAAGA
8501 AGTGCCTGTC CTAATTGTCA TCGAGACCAT TAGCTTATTT ATTCGACCAC TCACGGACAG GATTAACAGT AGCTCTGGTA ATCGAATAAA TAAGCTGGTG
8551 TAGCGCTAGG GGTCCGATTA ACTGCTAATT TAACAGCCGG TCATCTACTA ATCGCGATCC CCAGGCTAAT TGACGATTAA ATTGTCGGCC AGTAGATGAT
8601 ATACAATTAA TTGCAACCGC AGCCTTCGTC CTTATCACCA TTATGCCAAC TATGTTAATT AACGTTGGCG TCGGAAGCAG GAATAGTGGT AATACGGTTG
8651 CGTAGCATTA CTCACATCAG TGATTCTATT TTTACTAACG GTCCTAGAAG GCATCGTAAT GAGTGTAGTC ACTAAGATAA AAATGATTGC CAGGATCTTC
8701 TAGCCGTGGC AATAATTCAA GCATACGTCT TCGTCCTCCT ATTAAGCCTC ATCGGCACCG TTATTAAGTT CGTATGCAGA AGCAGGAGGA TAATTCGGAG
8751 TACCTACAAG AAAACGTCTA ATGGCTCACC AAGCACACGC ATATCATATA ATGGATGTTC TTTTGCAGAT TACCGAGTGG TTCGTGTGCG TATAGTATAT
8801 GTTGACCCTA GCCCATGACC ACTAACCGGA GCCACAGCCG CCCTTTTAAT CAACTGGGAT CGGGTACTGG TGATTGGCCT CGGTGTCGGC GGGAAAATTA
8851 AACATCTGGC CTAGCCATCT
GGTTTCACTT TCACTCATTA ATCCTTCTCT TTGTAGACCG GATCGGTAGA CCAAAGTGAA AGTGAGTAAT TAGGAAGAGA
8901 ACCTTGGACT AACCCTTCTT CTATTAACTA TAATTCAATG ATGACGTGAT TGGAACCTGA TTGGGAAGAA GATAATTGAT ATTAAGTTAC TACTGCACTA
8951 ATTATCCGAG AAGGAACATA CCAAGGTCAC CATACACCCC CTGTTCAAAA TAATAGGCTC TTCCTTGTAT GGTTCCAGTG GTATGTGGGG GACAAGTTTT
9001 AGGCCTCCGC TACGGAATAA TCCTGTTTAT CACATCAGAA GTATTCTTCT TCCGGAGGCG ATGCCTTATT AGGACAAATA GTGTAGTCTT CATAAGAAGA
9051 TCCTAGGCTT TTTCTGAGCC TTCTACCACT CAAGTCTCGC CCCCACCCCT AGGATCCGAA AAAGACTCGG AAGATGGTGA GTTCAGAGCG GGGGTGGGGA
9101 GAACTAGGCG GATGCTGACC ACCAACAGGA ATTAGTCCTA TTGACCCATT CTTGATCCGC CTACGACTGG TGGTTGTCCT TAATCAGGAT AACTGGGTAA
9151 CGAAGTACCA CTTTTAAATA CTGCAGTACT TCTAGCCTCC GGTGTAACAG GCTTCATGGT GAAAATTTAT GACGTCATGA AGATCGGAGG CCACATTGTC
9201 TAACTTGAGC CCACCACGGC CTCATAGAAG GTAACCGAAA AGAGACTATT ATTGAACTCG GGTGGTGCCG GAGTATCTTC CATTGGCTTT TCTCTGATAA
9251 CAAGCCCTCA CTCTCACCAT CATCCTAGGC GTTTACTTTA CAGCCCTCCA GTTCGGGAGT GAGAGTGGTA GTAGGATCCG CAAATGAAAT GTCGGGAGGT
9301 AGCTATAGAA TATTATGAAG CACCTTTTAC AATTGCTGAT GGAGTCTATG TCGATATCTT ATAATACTTC GTGGAAAATG TTAACGACTA CCTCAGATAC
9351 GGACAACATT CTTCGTCGCC ACAGGATTCC ATGGCCTCCA TGTTATTATT

CCTGTTGTAA GAAGCAGCGG TGTCCTAAGG TACCGGAGGT ACAATAATAA
9401 GGCTCAACAT TTTTAATAAT CTGTCTACTA CGACAAATTC AATATCACTT CCGAGTTGTA AAAATTATTA GACAGATGAT GCTGTTTAAG TTATAGTGAA
9451 TACATCCCAA CACCACTTTG GATTTGAAGC TGCCGCATGA TACTGACACT ATGTAGGGTT GTGGTGAAAC CTAAACTTCG ACGGCGTACT ATGACTGTGA
9501 TCGTAGACGT AGTGTGACTA TTCCTTTATG TTTCCATCTA TTGATGAGGC AGCATCTGCA TCACACTGAT AAGGAAATAC AAAGGTAGAT AACTACTCCG
9551 TCATAATTAC TTTTCTAGTA TAGACTAGTA CAAATGATTT CCAATCATTT AGTATTAATG AAAAGATCAT ATCTGATCAT GTTTACTAAA GGTTAGTAAA
9601 AATCTTGGTT AAAATCCAAG GAAAAGTAAT GAACCTCATC ATGTCTTCTG TTAGAACCAA TTTTAGGTTC CTTTTCATTA CTTGGAGTAG TACAGAAGAC
9651 TCGCGGCTAC GGCCCTGGTT TCCCTAATCC TCGTGTTTAT CGCATTCTGA AGCGCCGATG CCGGGACCAA AGGGATTAGG AGCACAAATA GCGTAAGACT
9701 CTTCCGTCAC TTAACCCAGA CAACGAAAAA CTATCCCCAT ACGAATGCGG GAAGGCAGTG AATTGGGTCT GTTGCTTTTT GATAGGGGTA TGCTTACGCC
9751 CTTTGACCCT CTTGGCAGCG CACGTCTCCC ATTTTCCTTA CGCTTCTTCC GAAACTGGGA GAACCGTCGC GTGCAGAGGG TAAAAGGAAT GCGAAGAAGG
9801 TCGTAGCTAT CTTATTCCTA CTGTTTGATC TAGAAATCGC CCTCCTCCTC AGCATCGATA GAATAAGGAT GACAAACTAG ATCTTTAGCG GGAGGAGGAG
9851 CCCCTACCCT GAGGCGATCA ATTACTATCA CCGCTCTATA CACTACTCTG GGGGATGGGA CTCCGCTAGT TAATGATAGT GGCGAGATAT GTGATGAGAC
9901 AGCAACAATT ATCCTAGTTC TGCTCACCCT AGGTCTTATT TATGAATGAC TCGTTGTTAA TAGGATCAAG ACGAGTGGGA TCCAGAATAA ATACTTACTG
9951 TTCAAGGAGG GTTAGAATGA GCAGAGTAGA TATTTAGTCT AAACAAAGAC AAGTTCCTCC CAATCTTACT CGTCTCATCT ATAAATCAGA TTTGTTTCTG
10001 CACTAATTTC GGCTTAGTAA ATTATGGTGA AAACCCATAA ATATCTTATG GTGATTAAAG CCGAATCATT TAATACCACT TTTGGGTATT TATAGAATAC
10051 TCCCCTATGT ATTTTAGTCT TAACTCAGCA TTTATACTAG GCCTAATGGG AGGGGATACA TAAAATCAGA ATTGAGTCGT AAATATGATC CGGATTACCC
10101 CCTTGCACTT AACCGTTATC ACCTCTTATC CGCACTTTTA TGCCTAGAGA GGAACGTGAA TTGGCAATAG TGGAGAATAG GCGTGAAAAT ACGGATCTCT
10151 GTATACTACT AACCCTATTC ATTACCATTG CTATCTGAAC CCTTACACTA CATATGATGA TTGGGATAAG TAATGGTAAC GATAGACTTG GGAATGTGAT
10201 AATTCTGCCT CCTCTTCAAT TATCCCCATG ATCCTCCTCA CATTCTCAGC TTAAGACGGA
GGAGAAGTTA ATAGGGGTAC TAGGAGGAGT GTAAGAGTCG
10251 TTGTGAAGCC AGTGCGGGCC TGGCTATTCT AGTGGCCACC TCACGCTCCC AACACTTCGG TCACGCCCGG ACCGATAAGA TCACCGGTGG AGTGCGAGGG
10301 ACGGCTCTGA TAACTTACAA AGCCTAAACC TCCTCCAATG CTAAAAATTC TGCCGAGACT ATTGAATGTT TCGGATTTGG AGGAGGTTAC GATTTTTAAG
10351 TTATCCCAAC AATCATACTC TTTCCAACCA CATGAATTAT TAACAAAAAA AATAGGGTTG TTAGTATGAG AAAGGTTGGT GTACTTAATA ATTGTTTTTT
10401 TGGCTATGAC CCATAACTAC TTCCTATAGT CTTCTAATCG CACTATCAAG ACCGATACTG GGTATTGATG AAGGATATCA GAAGATTAGC GTGATAGTTC
10451 TTTAACCTGA TTTAAATGAA ACATAGATAT TGGCTGGGAC TTTTCCAATC AAATTGGACT AAATTTACTT TGTATCTATA ACCGACCCTG AAAAGGTTAG
10501 AATTCATAGC TGTTGACCCT CTATCAGCCC CCTTGCTTAT TCTTACATGC TTAAGTATCG ACAACTGGGA GATAGTCGGG GGAACGAATA AGAATGTACG
10551 TGACTTCTTC CACTAATAAT CTTAGCCAGC CAGAACCACA TCTCCCCAGA ACTGAAGAAG GTGATTATTA GAATCGGTCG GTCTTGGTGT AGAGGGGTCT
10601 ACCAATTATT CGACAACGGA CATACATCTC ACTCCTAATC TCCCTCCAAA TGGTTAATAA GCTGTTGCCT GTATGTAGAG TGAGGATTAG AGGGAGGTTT
10651 CTTTCCTTAT TATAGCATTT TCCGCAACCG AAATAATCAT ATTTTACATC GAAAGGAATA ATATCGTAAA AGGCGTTGGC TTTATTAGTA TAAAATGTAG
10701 ATATTTGAAG CTACACTCAT CCCCACTCTT ATTATTATTA CACGATGAGG TATAAACTTC GATGTGAGTA GGGGTGAGAA TAATAATAAT GTGCTACTCC

10751 TAATCAGACA GAACGCCTAA ATGCAGGTAC TTACTTCCTA TTTTATACCT ATTAGTCTGT CTTGCGGATT TACGTCCATG AATGAAGGAT AAAATATGGA
10801 TAATTGGCTC CCTCCCCCTC CTCATTGCCC TTTTACTCAT ACAAAACAAC ATTAACCGAG GGAGGGGGAG GAGTAACGGG AAAATGAGTA TGTTTTGTTG
10851 CTCGGCACCC TCTCTATAAT TATTATACAG CACTCACAGT CCTTAAGCCT GAGCCGTGGG AGAGATATTA ATAATATGTC GTGAGTGTCA GGAATTCGGA
10901 AACCTCATGA ACAGACAAAT TATGATGAGT AGCTTGCCTC CTCGCCTTTC TTGGAGTACT TGTCTGTTTA ATACTACTCA TCGAACGGAG GAGCGGAAAG
10951 TTGTCAAAAT ACCCCTGTAC GGAATTCACC TCTGACTTCC TAAAGCCCAT AACAGTTTTA TGGGGACATG CCTTAAGTGG AGACTGAAGG ATTTCGGGTA
11001 GTTGAAGCCC CAATTGCCGG CTCAATAATT CTAGCCGCCG TACTACTCAA CAACTTCGGG GTTAACGGCC GAGTTATTAA GATCGGCGGC ATGATGAGTT
11051 ACTAGGGGGC TATGGCATAA TACGAATCAT TGTAATACTA AATCCCCTCA TGATCCCCCG ATACCGTATT ATGCTTAGTA ACATTATGAT TTAGGGGAGT
11101 CCAAAGAAAT GGCCTACCCA TTCCTAATCC TAGCCATTTG AGGTATTATT GGTTTCTTTA CCGGATGGGT AAGGATTAGG ATCGGTAAAC TCCATAATAA
11151 ATAACTAGCT CCATCTGCCT ACGACAAACT GACCTCAAAT CTCTGATTGC TATTGATCGA GGTAGACGGA TGCTGTTTGA CTGGAGTTTA GAGACTAACG
11201 CTACTCATCA GTGAGTCACA TGGGCTTAGT TGCAGGAGCA ATTCTTATCC GATGAGTAGT CACTCAGTGT ACCCGAATCA ACGTCCTCGT TAAGAATAGG
11251 AAACACCATG AAGCTTTGCA GGAGCAATCA CGTTAATGAT TGCCCATGGC TTTGTGGTAC TTCGAAACGT CCTCGTTAGT GCAATTACTA ACGGGTACCG
11301 CTAATCTCAT CTGCCCTATT CTGCCTAGCC AACACTAACT ACGAACGAAT GATTAGAGTA GACGGGATAA GACGGATCGG TTGTGATTGA TGCTTGCTTA
11351 TCATACCCGA ACAATACTCC TAGCTCGGGG CATACAAATT ATTCTTCCAT AGTATGGGCT TGTTATGAGG ATCGAGCCCC GTATGTTTAA TAAGAAGGTA
11401 TAATAACAAC CTGATGATTC TTTGCTAGTT TGGCCAATCT TGCTCTCCCA ATTATTGTTG GACTACTAAG AAACGATCAA ACCGGTTAGA ACGAGAGGGT
11451 CCTTCCCCTA ATCTCATGGG AGAACTCCTT ATCATTACCT CAATGTTTAA GGAAGGGGAT TAGAGTACCC TCTTGAGGAA TAGTAATGGA GTTACAAATT
11501 TTGATCCAAC TGAACTATTC TCCTCTCAGG CCTTGGGGTA TTGATTACAG AACTAGGTTG ACTTGATAAG AGGAGAGTCC GGAACCCCAT AACTAATGTC
11551 CCTCCTACTC CCTCTATATA TTCCTAATGA CTCAACGAGG TCCCACCCCC GGAGGATGAG GGAGATATAT AAGGATTACT GAGTTGCTCC AGGGTGGGGG
11601
CACCATATCT TATCACTAAA CCCAACCTAC ACACGAGAAC ATCTCCTCCT GTGGTATAGA ATAGTGATTT GGGTTGGATG TGTGCTCTTG TAGAGGAGGA
11651 AGCCCTTCAC CTCATGCCTG TTCTACTTCT AATATTTAAG CCAGAACTTA TCGGGAAGTG GAGTACGGAC AAGATGAAGA TTATAAATTC GGTCTTGAAT
11701 TCTGAGGCTG AACACTTTGT ACTTATAGTT TAACTAAAAC ATTAGATTGT AGACTCCGAC TTGTGAAACA TGAATATCAA ATTGATTTTG TAATCTAACA
11751 GGTTCTAAAG ACAAAAGTTA AAACCTTTTT AATTACCGAG AGAGGTCAGG CCAAGATTTC TGTTTTCAAT TTTGGAAAAA TTAATGGCTC TCTCCAGTCC
11801 GACACGATAG AACTGCTAAT TCTTCTTACC ATGGCTCAAA TCCATGGCTC CTGTGCTATC TTGACGATTA AGAAGAATGG TACCGAGTTT AGGTACCGAG
11851 ACTCAGCTTC TGAAAGATAT TAGTAATCTA TTGGTCTTAG GAACCAAAAA TGAGTCGAAG ACTTTCTATA ATCATTAGAT AACCAGAATC CTTGGTTTTT
11901 TTCTTGGTGC AATTCCAAGC AAGAGCTATG AATACCATTT TTAACTCATC AAGAACCACG TTAAGGTTCG TTCTCGATAC TTATGGTAAA AATTGAGTAG
11951 ATTACTCTTA ATCTTTACCA TCCTTATCTT TCCACTAATA ACCTCACTAA TAATGAGAAT TAGAAATGGT AGGAATAGAA AGGTGATTAT TGGAGTGATT
12001 GCCCTAAAGA ACTTAATCTC AACTGAGCCT CATCCCACGT AAAAACGGCT CGGGATTTCT TGAATTAGAG TTGACTCGGA GTAGGGTGCA TTTTTGCCGA
12051 GTAAAAACCT CTTTCTTTAT TAGCCTAATT CCCCTATCCA TTTTCCTAGA CATTTTTGGA GAAAGAAATA ATCGGATTAA GGGGATAGGT AAAAGGATCT
12101 CCAGGGTTTA GAATCCATCA TGACTAACTA CAACTGAATA AACATTGGAC

GGTCCCAAAT CTTAGGTAGT ACTGATTGAT GTTGACTTAT TTGTAACCTG
12151 CATTCGATAT TAATATAAGC TTCAAATTTG ATATATATTC AATTGTATTT GTAAGCTATA ATTATATTCG AAGTTTAAAC TATATATAAG TTAACATAAA
12201 ACCCCAGTAG CTCTCTACGT TACTTGATCC ATCCTTGAAT TCGCCCTCTG TGGGGTCATC GAGAGATGCA ATGAACTAGG TAGGAACTTA AGCGGGAGAC
12251 ATATATACAT TCCGACCCCA ACATCAACCG CTTCTTCAAG TACCTCCTAC TATATATGTA AGGCTGGGGT TGTAGTTGGC GAAGAAGTTC ATGGAGGATG
12301 TTTTCCTAAT TTCAATAATT ATTCTAGTGA CCGCTAACAA CATATTTCAA AAAAGGATTA AAGTTATTAA TAAGATCACT GGCGATTGTT GTATAAAGTT
12351 CTGTTCATTG GCTGAGAAGG TGTAGGCATT ATATCCTTTC TTCTCATCGG GACAAGTAAC CGACTCTTCC ACATCCGTAA TATAGGAAAG AAGAGTAGCC
12401 TTGATGATAT AGCCGAACAG ACGCCAATAC AGCCGCCCTA CAAGCTGTAA AACTACTATA TCGGCTTGTC TGCGGTTATG TCGGCGGGAT GTTCGACATT
12451 TCTACAACCG AGTAGGTGAC ATCGGTCTAA TCCTCAGCAT AGCTTGACTA AGATGTTGGC TCATCCACTG TAGCCAGATT AGGAGTCGTA TCGAACTGAT
12501 GCCATAAACT TAAACTCCTG AGAAATTCAA CAATTATTTA TCCTATCTAA CGGTATTTGA ATTTGAGGAC TCTTTAAGTT GTTAATAAAT AGGATAGATT
12551 AGAGATAGAC TTAACCTTGC CCCTTCTTGG CCTTGTCCTG GCCGCAGCTG TCTCTATCTG AATTGGAACG GGGAAGAACC GGAACAGGAC CGGCGTCGAC
12601 GAAAATCCGC ACAATTTGGC CTTCACCCAT GACTCCCCTC AGCCATAGAA CTTTTAGGCG TGTTAAACCG GAAGTGGGTA CTGAGGGGAG TCGGTATCTT
12651 GGGCCCACGC CGGTCTCTGC CTTACTCCAC TCCAGCACAA TAGTTGTTGC CCCGGGTGCG GCCAGAGACG GAATGAGGTG AGGTCGTGTT ATCAACAACG
12701 CGGCATCTTC CTTCTAATTC GCCTCCACCC ATTAATACAA GACAATCAAT GCCGTAGAAG GAAGATTAAG CGGAGGTGGG TAATTATGTT CTGTTAGTTA
12751 TAATCCTAAC AACATGCTTA TGCCTGGGAG CACTGACCAC TCTTTTTACT ATTAGGATTG TTGTACGAAT ACGGACCCTC GTGACTGGTG AGAAAAATGA
12801 GCGGCATGCG CACTTACCCA AAACGATATC AAAAAAATCA TTGCATTCTC CGCCGTACGC GTGAATGGGT TTTGCTATAG TTTTTTTAGT AACGTAAGAG
12851 AACATCCAGC CAGCTCGGAC TGATAATAGT GACAATTGGC CTCAACCAAC TTGTAGGTCG GTCGAGCCTG ACTATTATCA CTGTTAACCG GAGTTGGTTG
12901 CCCAACTAGC TTTTCTCCAT ATCTGCACCC ACGCCTTTTT TAAAGCTATA GGGTTGATCG AAAAGAGGTA TAGACGTGGG TGCGGAAAAA ATTTCGATAT
12951 CTCTTCCTCT GTTCTGGGTC TATTATTCAC AGCCTCAATG ATGAACAAGA GAGAAGGAGA CAAGACCCAG
ATAATAAGTG TCGGAGTTAC TACTTGTTCT
13001 CATCCGCAAA ATAGGAGGCC TCCATAAACT TCTACCATTC ACCTCATCTT GTAGGCGTTT TATCCTCCGG AGGTATTTGA AGATGGTAAG TGGAGTAGAA
13051 CCTTAACCAT TGGAAGTCTA GCTCTTACAG GTATACCTTT CCTATCAGGC GGAATTGGTA ACCTTCAGAT CGAGAATGTC CATATGGAAA GGATAGTCCG
13101 TTCTTCTCAA AAGACGCCAT CATTGAATCC ATAAACACTT CTCACCTCAA AAGAAGAGTT TTCTGCGGTA GTAACTTAGG TATTTGTGAA GAGTGGAGTT
13151 CGCCTGAGCC CTTATCCTTA CCCTAATCGC AACCTCATTC ACAGCTATTT GCGGACTCGG GAATAGGAAT GGGATTAGCG TTGGAGTAAG TGTCGATAAA
13201 ACAGCCTCCG CCTGATTTTC TTCGCATTAA TAAATTTTCC ACGATTTAAT TGTCGGAGGC GGACTAAAAG AAGCGTAATT ATTTAAAAGG TGCTAAATTA
13251 TCACTCTCCC CTATCAACGA AAACAACCCC ATAGTCATCA ACCCAATCAA AGTGAGAGGG GATAGTTGCT TTTGTTGGGG TATCAGTAGT TGGGTTAGTT
13301 ACGTCTAGCT TATGGAAGTA TCCTAGCCGG CCTCATCATT ACATCTAACC TGCAGATCGA ATACCTTCAT AGGATCGGCC GGAGTAGTAA TGTAGATTGG
13351 TAACACCCAC AAAAACCCAA ATCATAACTA TACCCCCTCT ACTGAAACTC ATTGTGGGTG TTTTTGGGTT TAGTATTGAT ATGGGGGAGA TGACTTTGAG
13401 TCCGCCCTAC TAGTAACCAT TATTGGCCTC CTACTAGCCT TAGAGCTAGC AGGCGGGATG ATCATTGGTA ATAACCGGAG GATGATCGGA ATCTCGATCG
13451 TAATCTAACC AGCACCCAAC TCAAAACAAC CCCCACCCTT TATCCTCACC ATTAGATTGG TCGTGGGTTG AGTTTTGTTG GGGGTGGGAA ATAGGAGTGG

13501 ACTTCTCAAA TATGCTGGGA TACTTTCCAC AAATTATCCA TCGCTTCCTG TGAAGAGTTT ATACGACCCT ATGAAAGGTG TTTAATAGGT AGCGAAGGAC
13551 CCCAAAATTA ACCTAACCTG AGCCCAACAC ATTTCCACCC ACCTGATTGA GGGTTTTAAT TGGATTGGAC TCGGGTTGTG TAAAGGTGGG TGGACTAACT
13601 CCAAACATGA TATGAAAAAA TTGGACCAAA AAGTACCCTT ATTCGACAAA GGTTTGTACT ATACTTTTTT AACCTGGTTT TTCATGGGAA TAAGCTGTTT
13651 TTCCACTAAT CAAACTATCC ACTCAGCCCC AACAAGGTTA CATTAAAGTC AAGGTGATTA GTTTGATAGG TGAGTCGGGG TTGTTCCAAT GTAATTTCAG
13701 TACCTTATGT TACTCTTCCT CACTTTAACC TTGGCCCTAC TCACTACATT ATGGAATACA ATGAGAAGGA GTGAAATTGG AACCGGGATG AGTGATGTAA
13751 AACCTAACCA CACGCAAAGT ACCCCACGAC AACCCCCGAG TTAACTCTAA TTGGATTGGT GTGCGTTTCA TGGGGTGCTG TTGGGGGCTC AATTGAGATT
13801 TACCACAAAC AAAGTCAATA GCAGTACTCA CCCACTCAGA ACTAACAGCC ATGGTGTTTG TTTCAGTTAT CGTCATGAGT GGGTGAGTCT TGATTGTCGG
13851 ACCCCCCATC ACCATAAAGT AAAGACACCC CCACAAAATC CCCACGAGTC TGGGGGGTAG TGGTATTTCA TTTCTGTGGG GGTGTTTTAG GGGTGCTCAG
13901 ATTTCTAAAT TACTTATCTC TTCCACCCCT GACCAACTTA ACTCAAATAA TAAAGATTTA ATGAATAGAG AAGGTGGGGA CTGGTTGAAT TGAGTTTATT
13951 TTCTACCATA AAATATTTAC CAGCAAGAAC TAATACTGCT AAATAAAAAC AAGATGGTAT TTTATAAATG GTCGTTCTTG ATTATGACGA TTTATTTTTG
14001 CGATATACAA CAGAACAGAT CAATTACCCC ACGACTCAGG ATAGGGCTCA GCTATATGTT GTCTTGTCTA GTTAATGGGG TGCTGAGTCC TATCCCGAGT
14051 GCAGCAAGCG CTGCTGTATA AGCAAATACT ACCAATATCC CCCCCAAATA CGTCGTTCGC GACGACATAT TCGTTTATGA TGGTTATAGG GGGGGTTTAT
14101 AATTAAAAAC AAAACTAATG ACAAAAAGGA CCCCCCGTGT CCTACCAACA TTAATTTTTG TTTTGATTAC TGTTTTTCCT GGGGGGCACA GGATGGTTGT
14151 ATCCACACCC TACCCCAGCA GCCATAACTA ACCCTAATGC AGCATAATAA TAGGTGTGGG ATGGGGTCGT CGGTATTGAT TGGGATTACG TCGTATTATT
14201 GGTGAAGGAT TAGACGCTAC TCCTACCAGA CCCAGAACTA AACAAACTAT CCACTTCCTA ATCTGCGATG AGGATGGTCT GGGTCTTGAT TTGTTTGATA
14251 TATTAAAAAC ATAAAATATA CCATTATTCC TACCTGGACT TTAACCAAGA ATAATTTTTG TATTTTATAT GGTAATAAGG ATGGACCTGA AATTGGTTCT
14301 CCAACAACTT GAAAAACTAT CGTTGTTTAT TCAACTATAA GAATTTATGG GGTTGTTGAA CTTTTTGATA GCAACAAATA AGTTGATATT CTTAAATACC
14351 CCATAAATAT C
CGAAAAACC CACCCTCTAC TAAAAATTGT AAACCAGGTC GGTATTTATA GGCTTTTTGG GTGGGAGATG ATTTTTAACA TTTGGTCCAG
14401 CTAATTGACC TTCCAGCCCC ATCAAACATC TCCATCTGAT GAAACTTTGG GATTAACTGG AAGGTCGGGG TAGTTTGTAG AGGTAGACTA CTTTGAAACC
14451 TTCACTTCTA GGACTGTGTT TAGTAATCCA AATCATTACA GGTCTCTTCT AAGTGAAGAT CCTGACACAA ATCATTAGGT TTAGTAATGT CCAGAGAAGA
14501 TGGCAATACA CTACACCGCA GATATCTCTA TAGCTTTCTC TTCAGTAGTC ACCGTTATGT GATGTGGCGT CTATAGAGAT ATCGAAAGAG AAGTCATCAG
14551 CATATTTGTC GTGACGTCAA CTACGGCTGA CTCATCCGTA ATATTCACGC GTATAAACAG CACTGCAGTT GATGCCGACT GAGTAGGCAT TATAAGTGCG
14601 CAACGGAGCC TCTCTATTCT TTGTCTGTGT TTACTTTCAC ATCGCCCGAG GTTGCCTCGG AGAGATAAGA AACAGACACA AATGAAAGTG TAGCGGGCTC
14651 GACTCTACTA TGGCTCCTAC TTATATAAAG AAACATGAAA TATCGGGGTA CTGAGATGAT ACCGAGGATG AATATATTTC TTTGTACTTT ATAGCCCCAT
14701 ATCCTACTAT TTCTGCTCAT AGCCACGGCC TTCGTAGGCT ATGTCTTACC TAGGATGATA AAGACGAGTA TCGGTGCCGG AAGCATCCGA TACAGAATGG
14751 ATGAGGTCAA ATATCCTTCT GAGGCGCTAC AGTCATCACT AACCTTCTTT TACTCCAGTT TATAGGAAGA CTCCGCGATG TCAGTAGTGA TTGGAAGAAA
14801 CCGCCTTCCC CTATATTGGA GACACACTAG TCCAATGAAT CTGAGGGGGC GGCGGAAGGG GATATAACCT CTGTGTGATC AGGTTACTTA GACTCCCCCG
14851 TTCTCAGTAG ATAACGCCAC CCTCACGCGA TTTTTCGCAT TTCACTTTCT

GTA AAGTGA.AAGA AAGAGTCATC TATTGCGGTG GGAGTGCGCT AAA,AAGC 14901 CTTACCCTTC CTTATTATTG CACTAATAAT TATTCATATC CTCTTCTTAC GAATGGGAAG GAATAATAAC GTGATTATTA ATAAGTATAG GAGAAGAATG 14951 ACGAAACAGG CTCA.AACAAT CCTATAGGCC TTAACTCCGA CATAGATAAA TGCTTTGTCC GAGTTTGTTA GGATATCCGG AATTGAGGCT GTATCTATTT 15001 ATCTCCTTCC ACCCCTACTT C TC CTACAA.A GACGCACTTG GATTCTTCAT TAGAGGAAGG TGGGGATGAA GAGGATGTTT CTGCGTGAAC CTAAGAAGTA 15051 CCTCCTCCTC CTCCTAGGAA TCCTAGCCCT ATTACTCCCC AACCTTCTAG GGAGGAGGAG GAGGATCCTT AGGATCGGGA TAATGAGGGG TTGGAAGATC 15101 GAGACGCTGA AAACTTCATC CCCGCCAATC CTCTCGTCAC CCCTCCTCAT TTTGAAGTAG GGGCGGTTAG GAGAGCAGTG GGGAGGAGTA 15151 ATTA.AAC C C G AATGATACTT CCTATTTGCC TATGCCATCC TCCGATCCAT TAATTTGGGC TTACTATGAA GGATAAACGG ATACGGTAGG AGGCTAGGTA 15201 CCCCAATAAA CTAGGAGGAG TCCTAGCCCT TCTATTCTCT ATCCTCATCC GGGGTTATTT GATCCTCCTC AGGATCGGGA AGATAAGAGA TAGGAGTAGG 15251 TCATGTTAGT CCCCCTTCTC CACACTTCCA AACAACGAAG CAGCACCTTC AGTACAATCA GGGGGAAGAG GTGTGAAGGT TTGTTGCTTC GTCGTGGAAG 15301 CGCCCACTTA CACA.AATTTT CTTCTGGACC CTTGTTGCCA ATATATTAAT GCGGGTGAAT GTGTTTA.AAA GAAGACCTGG GAACAACGGT TATATAATTA 15351 TTTAACCTGA ATTGGGGGTC AACCAGTTGA ACAACCATTT ATTCTTATTG A.AATTGGAC T TAACCCCCAG TTGGTCAACT TGTTGGTAAA TAAGAATAAC 15401 GACAAATTGC ATCTATTTCA TACTTCTCTT TATTCCTTGT CGTAATCCCA CTGTTTAACG TAGATAAAGT ATGAAGAGAA ATAAGGAACA GCATTAGGGT 15451 CTCACAGGCT GATGAGAAAA CAAAATCCTC AGCCTTAACT AGTTTTGATA GAGTGTCCGA CTACTCTTTT GTTTTAGGAG TCGGAATTGA TCAAAACTAT 15501 GCTTAACCTA AAAGCGTCGA CCTTGTAAGT CGAAGACCGG AGGTTTAAAC CGAATTGGAT TTTCGCAGCT GGAACATTCA GCTTCTGGCC TCCAAATTTG 15551 CCTCCTCAAG ATATATCAGG GGAAGGAGGG TTAAACTCCT GCCCTTGGCT GGAGGAGTTC TATATAGTCC CCTTCCTCCC AATTTGAGGA CGGGAACCGA 15601 CCCA.AAGCCA AGATTCTGCC CAA.ACTG000 CCTGAATGCT GTTAAACGTG GGGTTTCGGT TCTAAGACGG GTTTGACGGG GGACTTACGA CAATTTGCAC 15651 A.AAGC CARAT GTCCATTTGG TTTTCAAAAA GTTGGTCGGT TTAACATATT TTTCGGTTTA CAGGTA.A.ACC P~A,AAGTTTTT CAACCAGCCA AATTGTATAA 15701 AATGACATGG CCCACATACC TTAATACAAG GGCATATCTC ATCTCGACTA TTACTGTACC 
GGGTGTATGG AATTATGTTC CCGTATAGAG TAGAGCTGAT 15751 CATTACAATA TTTGACCTTC ACCTAATAGT TTCACACTCT ATGTATAATA GTAATGTTAT AAACTGGAAG TGGATTATCA AAGTGTGAGA TACATATTAT 15801 CTCATTAATT TATGTTCCCC TATATCATAG CATACTATGC TTTACCCTCA GAGTAATTAA ATACAAGGGG ATATAGTATC GTATGATACG AAATGGGAGT 15851 ATAAGTCACT AATCACAATT TCATCCCATT CTGTCCTCAA TCCCCATTAA TATTCAGTGA TTAGTGTTAA AGTAGGGTAA GACAGGAGTT AGGGGTAATT 15901 CCTATAATCA AGATGTCCAT TCCATAGATT TACTTCTTCC GCCCATAAAG GGATATTAGT TCTACAGGTA AGGTATCTAA ATGAAGAAGG CGGGTATTTC 15951 ACTTCTGTAT TTATTATGCG GGCTGGTAAG AACATCGCAT CCCGCTATTG TGAAGACATA AATAATACGC CCGACCATTC TTGTAGCGTA GGGCGATAAC 16001 TAAG ATTGCTCTAT TTGTGGCGCT GTACTCGATT TATCCCTATC ATTCTTTTTT TAACGAGATA AACACCGCGA CATGAGCTAA ATAGGGATAG 16051 AATTGACCAG ACCTGGCATC TGATTAATGC TTGTGCTACT TCAGTCCTTG TTAACTGGTC TGGACCGTAG ACTAATTACG AACACGATGA AGTCAGGAAC 16101 ATCGCGTCAA GAATGCCAGC CCGCTAGTTC CCTTTAATGG CACCTTCGTC TAGCGCAGTT CTTACGGTCG GGCGATCAAG GGA.AATTAC C GTGGAAGCAG 16151 CTTGATCGCG TCAAGATTTA TTTTCCACCC TGCTTTTTTC GGGGGGGATG GAACTAGCGC AGTTCTAAAT AAA.AGGTGGG AC G G CCCCCCCTAC 16201 AAGCCATCGC TATTCCCCGG AGGGGCTGAA CTGGGACTCT GAGATAGATC TTCGGTAGCG ATAAGGGGCC TCCCCGACTT GACCCTGAGA CTCTATCTAG 318

16251 TGTAATATCC TCGACACTCT TCTTTAATAC TCAGTACTCA TCATTCGCGA
      ACATTATAGG AGCTGTGAGA AGAAATTATG AGTCATGAGT AGTAAGCGCT
16301 GTTAAGATTG TCAAGTCGAC CAAAACTGAA AGGGATGGAG AGGTTCACGT
      CAATTCTAAC AGTTCAGCTG GTTTTGACTT TCCCTACCTC TCCAAGTGCA
16351 CATAACGGGT ACGTTTCGAT TTTTTTGATT AAAGAAGCTA TGGTTTAAAA
      GTATTGCCCA TGCAAAGCTA AAAAAACTAA TTTCTTCGAT ACCAAATTTT
16401 AAGATATTTT CTTAACCCCC GTCCGAGTCC ATACCAGCAA TATACGTGAG
      TTCTATAAAA GAATTGGGGG CAGGCTCAGG TATGGTCGTT ATATGCACTC
16451 TGTAAAATGC ATTTCATTGT TTGAGTACAT TAATCACTTA ATCGGGCATA
      ACATTTTACG TAAAGTAACA AACTCATGTA ATTAGTGAAT TAGCCCGTAT
16501 AATTTACTGT TATTAGACTT CCCCCTGCTT TGTAAAAATT TAGAGCCGCC
      TTAAATGACA ATAATCTGAA GGGGGACGAA ACATTTTTAA ATCTCGGCGG
16551 TAAAAAAAGA TAAAACATAT TTTGGTAAAA ACCCCCCTCC CCCTAATATA
      ATTTTTTTCT ATTTTGTATA AAACCATTTT TGGGGGGAGG GGGATTATAT
16601 CACGGACTCC TCGAAAAACC CCTAAAACGA GGGCCGGACA TATATTTTGA
      GTGCCTGAGG AGCTTTTTGG GGATTTTGCT CCCGGCCTGT ATATAAAACT
16651 AATTAGCATG CGAAATGTAT TCTGTATTTA TATTGTTACA CTATCAT
      TTAATCGTAC GCTTTACATA AGACATAAAT ATAACAATGT GATAGTA

tRNA    1..70                        product = tRNA-Phe
rRNA    69..1024                     product = 12S ribosomal RNA
tRNA    1025..1096                   product = tRNA-Val
rRNA    1097..2769                   product = 16S ribosomal RNA
tRNA    2770..2844                   product = tRNA-Leu
gene    2845..3819                   gene = ND1    product = NADH dehydrogenase subunit 1
tRNA    3822..3890                   product = tRNA-Ile
tRNA    3889..3960                   product = tRNA-Gln
tRNA    3961..4029                   product = tRNA-Met
gene    4030..5073                   gene = ND2    product = NADH dehydrogenase subunit 2
tRNA    5073..5143                   product = tRNA-Trp
tRNA    complement (5145..5213)      product = tRNA-Ala
tRNA    complement (5214..5286)      product = tRNA-Asn
tRNA    complement (5320..5386)      product = tRNA-Cys
tRNA    complement (5388..5457)      product = tRNA-Tyr
gene    5459..7016                   gene = CO1    product = cytochrome c oxidase subunit 1
tRNA    complement (7015..7085)      product = tRNA-Ser
tRNA    7090..7159                   product = tRNA-Asp
gene    7164..7854                   gene = CO2    product = cytochrome c oxidase subunit 2
tRNA    7855..7928                   product = tRNA-Lys
gene    7930..8097                   gene = ATP8   product = ATP synthase FO subunit 8
gene    8088..8771                   gene = ATP6   product = ATP synthase FO subunit 6
gene    8771..9556                   gene = CO3    product = cytochrome c oxidase subunit 3
tRNA    9559..9628                   product = tRNA-Gly
gene    9629..9979                   gene = ND3    product = NADH dehydrogenase subunit 3
tRNA    9978..10047                  product = tRNA-Arg
gene    10048..10344                 gene = ND4L   product = NADH dehydrogenase subunit 4L
gene    10338..11718                 gene = ND4    product = NADH dehydrogenase subunit 4
tRNA    11719..11787                 product = tRNA-His
tRNA    11788..11855                 product = tRNA-Ser
tRNA    11856..11927                 product = tRNA-Leu
gene    11928..13757                 gene = ND5    product = NADH dehydrogenase subunit 5
gene    complement (13753..14274)    gene = ND6    product = NADH dehydrogenase subunit 6
tRNA    complement (14275..14344)    product = tRNA-Glu
gene    14347..15492                 gene = CYTB   product = cytochrome b
tRNA    15492..15563                 product = tRNA-Thr
tRNA    complement (15566..15634)    product = tRNA-Pro
D-Loop  15635..16697
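The feature coordinates listed above can be read programmatically. The following is a minimal sketch, assuming the table follows the usual GenBank conventions (1-based, inclusive coordinates, with `complement (...)` marking a feature encoded on the opposite strand). The helper names `parse_location`, `feature_length`, and `reverse_complement` are hypothetical and are not part of the thesis or of any library.

```python
import re

# Hypothetical pattern for the two location forms used in the table:
#   "1..70"  and  "complement (5145..5213)"
_LOCATION = re.compile(r"^(complement\s*\()?\s*(\d+)\.\.(\d+)\s*\)?$")

# Translation table for the four unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_location(text):
    """Return (start, end, strand) for a location string from the table."""
    match = _LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location: {text!r}")
    strand = "-" if match.group(1) else "+"
    return int(match.group(2)), int(match.group(3)), strand


def feature_length(start, end):
    """Length in base pairs, assuming 1-based inclusive coordinates."""
    return end - start + 1


def reverse_complement(seq):
    """Reverse complement of a DNA string made of A, C, G and T."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    start, end, strand = parse_location("complement (5145..5213)")
    print(start, end, strand, feature_length(start, end))
```

For a feature on the complement strand, the coding sequence would be obtained by slicing the listed strand at `start - 1:end` and passing the slice to `reverse_complement`.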

Megachasma pelagios mitochondrion, complete genome

T 1 GCTAGTGTAG CTTAATTTAG AGTATGGCAC TGA.A.AATGC AAGATGAAAA CGATCACATC GAATTAAATC TCATACCGTG ACTTTTACGA TTCTACTTTT 51 ATP.►~~AAATTT TCCACAAGCA TGAAGGTTTG GTCCTGGCCT CAGTATTAAT TATTTTTAAA AGGTGTTCGT ACTTCCAAAC CAGGACCGGA GTCATAATTA 101 TGCAACCAAG ATTATACATG CAAGTTTCAG CATCCCTGTG AGAATGCCCT ACGTTGGTTC TA.ATATGTAC GTTCAAAGTC GTAGGGACAC TCTTACGGGA 151 AATTATTCTA TTAATTAATT AGGAGCAGGT ATCAGGCACA CATATGTAGC TTAATAAGAT AATTAATTAA TCCTCGTCCA TAGTCCGTGT GTATACATCG 201 CCAAGACACC TTGCTAAGCC ACACCCCCAA GGGATCTCAG CAGTAACAAA GGTTCTGTGG AACGATTCGG TGTGGGGGTT CCCTAGAGTC GTCATTGTTT 251 CATTGATTCT TATAAGCGCA AGCTTGAATC AGTTAAAGTT AATAGAGTTG GTAACTAAGA ATATTCGCGT TCGAACTTAG TCAATTTCAA TTATCTCAAC 301 GTAAACCTCG TGCCAGCCAC CGCGGTTATA CGAGTGACTC ACATTAATAT CATTTGGAGC ACGGTCGGTG GCGCCAATAT GCTCACTGAG TGTAATTATA 351 TTCCTCGGCG TAAAGAGTGA TTTAAGAAAT ATCTATAGTG ATTAAAGTTA AAGGAGCCGC ATTTCTCACT AA.ATTC TTTA TAGATATCAC TAATTTCAAT 401 AGACCTCGTC AA.AC TGTTAT ACGTACCCAC GAGTGGAACC ATCAACAACG TCTGGAGCAG TTTGACAATA TGCATGGGTG CTCACCTTGG TAGTTGTTGC 451 AAAGTGACTT TATCCATCTA G~AAAC TTGA CGTCACGACA GTTAGACCTC TTTCACTGAA ATAGGTAGAT CTTTTGAACT GCAGTGCTGT CAATCTGGAG TAA.ATAATAA 501 AAACTAGGAT TAGATACCCT ACTATGTCCA AC C ATA.AAC T TTTGATCCTA ATCTATGGGA TGATACAGGT TGGTATTTGA ATTTATTATT P~AAATTCAAA 551 TTTACTATAT TATTCGCCAG AGTACTACAA GCGCTAGCTT AAATGATATA ATAAGCGGTC TCATGATGTT CGCGATCGAA TTTTAAGTTT 601 GGACTTGGCG GTGTCCCAAA CCCACTTAGA GGAGCCTGTT CTGTAACCGA CCTGAACCGC CACAGGGTTT GGGTGAATCT CCTCGGACAA GACATTGGCT 651 TAATCCCCGT TAAACCTCAC CACTCCTGGC TATCCCCGTC TATATACCGC ATTAGGGGCA ATTTGGAGTG GTGAGGACCG ATAGGGGCAG ATATATGGCG 701 CGTCGTCAGC TCACCCTGTG AAGATTAAAA AGTAAGCAAA AAGAATTATC GCAGCAGTCG AGTGGGACAC TTCTAATTTT TCATTCGTTT TTCTTAATAG 751 TCCCACACGT CAGGTCGAGG TGTAGCAGAT GGAGTGGATA GAAATGGGCT AGGGTGTGCA GTCCAGCTCC ACATCGTCTA CCTCACCTAT CTTTACCCGA 801 ACATTTTCTT TAAAGP~AAAT ACGAATGGTA AC C TGA~AA,AT TTACCTAAAG TGTAAAAGAA ATTTCTTTTA TGCTTACCAT TGGACTTTTA AATGGATTTC 851 GTGGATCTAA TAGTAAGAAA AGATTAGAGA 
GCTTTTCTGA AACTGGCTCT CACCTAGATT ATCATTCTTT TCTAATCTCT CGAAAAGACT TTGACCGAGA 901 GGGACGCGCA CATACCGCCC GTCACTCTCC TCAAAATATT CTATTTATTT CCCTGCGCGT GTATGGCGGG CAGTGAGAGG AGTTTTATAA GATAAATAAA 951 TTAATTAAAA GAAAATTATT AAGAGGAGGC AAGTCGTAAC ATGGTAAGTG AATTAATTTT CTTTTAATAA TTCTCCTCCG TTCAGCATTG TACCATTCAC 1001 TAC TGGAA.AG TGCACTTGGA ATCAAAATGT GGCTAAATCA GCAAAGCACC ATGACCTTTC ACGTGAACCT TAGTTTTACA CCGATTTAGT CGTTTCGTGG 1051 TCCCTTACAC C GAGGA.A.ATA CCCGTGCA.AA TCGGATCATT TTGAACACTA AGGGAATGTG GCTCCTTTAT GGGCACGTTT AGCCTAGTAA AACTTGTGAT 1101 AAGCTAGCCT GTATATCTAC CCTTAAATTT AACCTTATTA ATTACCACCC TTCGATCGGA CATATAGATG GGAATTTAA.A TTGGAATAAT TAATGGTGGG 1151 CATATTAATG CCTAACTAAA ACATTTTATC TTTTTAGTAT GGGTGACAGA GTATAATTAC GGATTGATTT TGTA.AA,ATAG P.~AAA.ATCATA CCCACTGTCT 1201 ACAA.AA.ATTC AGCGCAATAG ACTATGTACC GTAAGGGAA.A GC TGP~~~AAG TGTTTTTAAG TCGCGTTATC TGATACATGG CATTCCCTTT CGACTTTTTC 1251 AAATGAAATA AATAATTAAA GTAATP~'~A.AA GCAGAGATTT AGCCTCGTAC TTTACTTTAT TTATTAATTT CATTATTTTT CGTCTCTAAA TCGGAGCATG 1301 CTTTTGCATC ATGATTTAGC TAGP.~~AAAC T AGACAAAGAG ATCTTAAGTC 321

GAAAACGTAG TACTAAATCG ATCTTTTTGA TCTGTTTCTC TAGAATTCAG 13 51 TATCCTCCCG AA.AC TAAAC G AGCTACTCCG AAGCAGCACA ATTAGAGCCA ATAGGAGGGC TTTGATTTGC TCGATGAGGC TTCGTCGTGT TAATCTCGGT 1401 ACCCGTCTCT GTAGCAA.AAG AGTGGGAAGA CTTCCGAGTA GCGGTGATAA TGGGCAGAGA CATCGTTTTC TCACCCTTCT GAAGGCTCAT CGCCACTATT 1451 GCCTATCGAG TTTAGTGATA GCTGGTTACC CAAGAAAAGA ACTTTAATTC CGGATAGCTC AAATCACTAT CGACCAATGG GTTCTTTTCT TGAAATTAAG 1501 TGCATTAATT TTTTATTACC P.►AAAAGC C TA TCCTATTAAG GTTAA.AATAT ACGTAATTAA AAAATAATGG TTTTTCGGAT AGGATAATTC CAATTTTATA 1551 P►AAA.ATTAAT AGTTATTCAG AAGAGGTACA GCCCTTCTGA ACTAAGATAC TTTTTAATTA TCAATAAGTC TTCTCCATGT CGGGAAGACT TGATTCTATG 1601 AACTTTCCAA GGC GGA,,AAAT GATCATATTA ATCAAGGTTT TTACCTCAGC TTGAAAGGTT CCGCCTTTTA CTAGTATAAT TAGTTCCAAA AATGGAGTCG 1651 GGGCCCAA.A.A GCAGTCACCT GTAAAGTAAG CGTCACAGCT CCAGTCTCAC CCCGGGTTTT CGTCAGTGGA CATTTCATTC GCAGTGTCGA GGTCAGAGTG 17 01 P.~~AA.AC C TAT AATTTGGATA TTCACCTCAT AATCCCCTTA ACTATATTGG TTTTTGGATA TTAAACCTAT AAGTGGAGTA TTAGGGGAAT TGATATAACC 1751 GTTATTTTAT AAAATTATAA AAGAACTTAT GCTAAAATGA GTAATAAGAG CAAT~TA TTTTAATATT TTCTTGAATA CGATTTTACT CATTATTCTC 1801 AATA.AAC C TC TCCCGACATA AGTGTATGTT AGA.AAGAATT AAATCACTAA TTATTTGGAG AGGGCTGTAT TCACATACAA TCTTTCTTAA TTTAGTGATT 1851 CAATTAAACG AACCCAAACT GAGGTTATTA TATTAATATT ACCTTAACTA GTTAATTTGC TTGGGTTTGA CTCCAATAAT ATAATTATAA TGGAATTGAT 1901 GP►~~AAAC TTA TCATAACATT CGTTAACTCT ACACAGAAAT GTCTCAGGGA CTTTTTGAAT AGTATTGTAA GCAATTGAGA TGTGTCTTTA CAGAGTCCCT 1951 AAGATTTAAA GP~AAATAAAG GAACTCGGCA AACACAAACT CCGCCTGTTT TTCTA.AATTT CTTTTATTTC CTTGAGCCGT TTGTGTTTGA GGCGGACAAA 2001 AC Cp~~AAACA TCGCCTCTTG AATACTATAA GAGGTCCCGC CTGCCCTGTG TGGTTTTTGT AGCGGAGAAC TTATGATATT CTCCAGGGCG GACGGGACAC 2051 ACAATGTTTA ACGGCCGCGG TATTCTGACC GTGCAA.AGGT AGCGTAATCA TGTTACAAAT TGCCGGCGCC ATAAGACTGG CACGTTTCCA TCGCATTAGT 2101 CTTGTCTTTT AAATGAAGAC CCGTATGA.AA GGCATCACGA GAGTTTCACT GAACAGAAAA TTTACTTCTG GGCATACTTT CCGTAGTGCT C TCAA.AGTGA 2151 GTCTCTATTT TCTAATCAAT GAA.ATTGATC TACTCGTGCA GAAGCGAGTA 
CAGAGATA.AA AGATTAGTTA CTTTAACTAG ATGAGCACGT CTTCGCTCAT 2201 TAATTACATT AGACGAGAAG ACCCTATGGA GCTTCAAACA CATA.AATTAA ATTAATGTAA TCTGCTCTTC TGGGATACCT CGAAGTTTGT GTATTTAATT 2251 CTATGTAAAT TATTAACCCC ACGGGTATAA ATP~AAAATAA TATTTTTAAT GATACATTTA ATAATTGGGG TGCCCATATT TATTTTTATT ATP~~AAAT TA 2301 TTAACTGTTT TTGGTTGGGG TGACCAAGGG GP.~~AAATAA.A TCCCCCTTAT AATTGACA.AA AACCAACCCC ACTGGTTCCC CTTTTTATTT AGGGGGAATA 2351 CGACCAAGTA CTCAAGTACT TAAGAATTAA AACTACAATT TTAATTAATA GCTGGTTCAT GAGTTCATGA ATTCTTAATT TTGATGTTAA AATTAATTAT 2401 AAATATTTAT C Gp~TGA CCCAGGATTT CCTGATCAAT GAACCAAGTT TTTATAAATA GCTTTTTACT GGGTC C TAA.A GGACTAGTTA CTTGGTTCAA 2451 ACCCTAGGGA TAACAGCGCA ATCCTTTCTC AGAGTCCCTA TCGCCGAAAG TGGGATCCCT ATTGTCGCGT TAGGA.AAGAG TCTCAGGGAT AGCGGCTTTC 2501 GGTTTACGAC CTCGATGTTG GATCAGGACA TCCTAATGAT GTAACCGTTA CCA.AATGCTG GAGCTACAAC CTAGTCCTGT AGGATTACTA CATTGGCAAT 2551 TTAAGGGTTC GTTTGTTCAA CGATTAATAG TCCTACGTGA TCTGAGTTCA AATTCCCAAG CA.AACAAGTT GCTAATTATC AGGATGCACT AGACTCAAGT 2601 GACCGGAGAA ATCCAGGTCA GTTTCTATCT ATGAATTAAT TTTTCCTAGT CTGGCCTCTT TAGGTCCAGT CAAAGATAGA TACTTAATTA AAAAGGATCA 2651 ACGA.AAGGAC C GGP~AAAATA GAGCCAATAC CCCAGGCACG CTCTATTTTC TGCTTTCCTG GCCTTTTTAT CTCGGTTATG GGGTCCGTGC GAGATA~AAAG 322

2701 ATCTATTGAA ACAAACTAAA ATAGATAAGA A~A.AAATTATC TAATACCCAA TAGATAACTT TGTTTGATTT TATCTATTCT TTTTTAATAG ATTATGGGTT 2751 Gp~AAA.AGGGT TGTTGGGGTG GCAGAGCCTG GTAAATGCAA AAGACCTAAG CTTTTTCCCA ACAACCCCAC CGTCTCGGAC CATTTACGTT TTCTGGATTC 2s01 CTCTTTAATC CAGAGGTTCA AATCCTCTCC TCAATCATGC TTGAAGCCCT GAGAAATTAG GTCTCCAAGT TTAGGAGAGG AGTTAGTACG AACTTCGGGA 2851 CCTACTTTAT TTAATTAACC CACTTGCCTA TATTATTCCA ATCCTACTAG GGATGAAATA AATTAATTGG GTGAACGGAT ATAATAAGGT TAGGATGATC 2901 CTACAGCCTT CCTTACTCTA GTTGAACGAA AAATTCTTGG TCATATACAA GATGTCGGAA GGAATGAGAT CAACTTGCTT TTTAAGAACC AGTATATGTT 2951 ATCCGTAAAG GTCCCAACAT TGTAGGCCCT TACGGACTCC TCCAACCAAT TAGGCATTTC CAGGGTTGTA ACATCCGGGA ATGCCTGAGG AGGTTGGTTA 3001 TGCAGATGGT CTAAAATTAT TTATTAAGGA ACCCATCTAC CCATCAACAT ACGTCTACCA GATTTTA.ATA AATAATTCCT TGGGTAGATG GGTAGTTGTA 3051 CCTCCCCATT TCTATTCCTA ATCACCCCCA CAATAGCCTT AGCACTAGCC GGAGGGGTAA AGATAAGGAT TAGTGGGGGT GTTATCGGAA TCGTGATCGG 3101 CTCCTCATAT GAATACCCCT TCCTCTCCCA CATTCCATTA TCAACCTTAA GAGGAGTATA CTTATGGGGA AGGAGAGGGT GTAAGGTAAT AGTTGGAATT 3151 TTTAGGGTTA TTATTTATTT TAGCAATTTC AAGCTTAACC GTTTATACTA AAATCCCAAT AATAAATAAA ATCGTTAAAG TTCGAATTGG CA.AATATGAT 3201 TTTTAGGCTC TGGATGAGCA TCCAATTCAA AATACGCCCT AATAGGAGCC P~AAATC C GAG ACCTACTCGT AGGTTAAGTT TTATGCGGGA TTATCCTCGG 3251 CTACGAGCCG TAGCACAAAC AATCTCATAT GAGGTAAGTC TTGGATTAAT GATGCTCGGC ATCGTGTTTG TTAGAGTATA CTCCATTCAG AACCTAATTA 3301 CCTCTTATCA ATAGTTATAT TTGCAGGAGG CTTTACCCTC CATACCTTCA GGAGAATAGT TATCAATATA AACGTCCTCC GA.AATGGGAG GTATGGAAGT 3351 ACCTAGCACA AGAAACAGTC TGATTAATTA TTCCAGGATG ACCATTAGCC TGGATCGTGT TCTTTGTCAG ACTAATTAAT AAGGTCCTAC TGGTAATCGG 3401 CTAATATGAT ATATCTCAAC CCTAGCAGAA ACTAACCGAG TACCATTTGA GATTATACTA TATAGAGTTG GGATCGTCTT TGATTGGCTC ATGGTAA.ACT 3451 CTTAACAGAA GGGGAATCAG AACTAGTTTC AGGTTTTAAT ATTGAATATG GAATTGTCTT CCCCTTAGTC TTGATCAAAG TC CA.A.AATTA TAACTTATAC 3501 CAGGAGGCTC ATTTGCCCTA TTTTTTCTTG CCGAATATAC AAATATTTTA GTCCTCCGAG TAAACGGGAT GAAC GGCTTATATG TTTAT~?~AAAT 3551 TTAATAAATA CCCTCTCAGT 
TATTTTATTC CTAGGTTCTT CTTATAATCC AATTATTTAT GGGAGAGTCA ATAAAATAAG GATCCAAGAA GAATATTAGG 3601 ACTTTTCCCA GAAATCTCTA CACTAAGCTT GATAACA.A.AA GCAACCCTAC TGAAAAGGGT CTTTAGAGAT GTGATTCGAA CTATTGTTTT CGTTGGGATG 3651 TAACCCTACT TTTCTTATGA ATTCGAGCAT CCTACCCCCG CTTCCGTTAT ATTGGGATGA AAAGAATACT TAAGCTCGTA GGATGGGGGC GAAGGCAATA 3701 GACCAACTTA TACACCTAGT ATGAP.►AA.AAT TTCCTCCCTC TAACCTTAGC CTGGTTGAAT ATGTGGATCA TACTTTTTTA AAGGAGGGAG ATTGGAATCG 3751 AATAATATTA TGACATATTG CCTTCCCCCT AGCCACAGCA AGTCTCCCTC TTATTATAAT ACTGTATAAC GGAAGGGGGA TCGGTGTCGT TCAGAGGGAG 3801 CCCTAACCTA AACGGAAATG TGCCTGAATA AAGGACCACT TTGATAGAGT GGGATTGGAT TTGCCTTTAC ACGGACTTAT TTCCTGGTGA AACTATCTCA 3851 GGATAATGAG AGTTAAAATC CCTCCTCTTC CTAGAAAAAT AGGATTTGAA CCTATTACTC TCAATTTTAG GGAGGAGAAG GATCTTTTTA TCCTAAACTT 3901 CCTATAATTA AGAGATCAAA ACTCCTTGTA TTCCCAACTA TACTATCTTC GGATATTAAT TCTCTAGTTT TGAGGAACAT AAGGGTTGAT ATGATAGAAG 3951 TAAGTAA.AGT CAGCTAATAA AGCTTTTGGG CCCATACCCC AACCACGTTG ATTCATTTCA GTCGATTATT TCGAAAACCC GGGTATGGGG TTGGTGCAAC 4001 GTTAAAATCC TTCCTTTACT AATGAACCCA ATTGTACTTA CCATTATTAT CAATTTTAGG AAGGAAATGA TTACTTGGGT TAACATGAAT GGTAATAATA 4051 TTCAAGCCTA GGTTTAGGAA CTATTCTCAC ATTTATTGGT TCACATTGAT 323

AAGTTCGGAT CCA.AATCCTT GATAAGAGTG TAAATAACCA AGTGTAACTA 4101 TCCTAATTTG GATAGGACTT GAAATTAATA CTCTAGCCAT TATCCCCTTA AGGATTAAAC CTATCCTGAA CTTTAATTAT GAGATCGGTA ATAGGGGAAT 4151 ATAATTCGCC AACATCACCC CCGAGCAGTG GAAGCTTCCA CAAAATATTT TATTAAGCGG TTGTAGTGGG GGCTCGTCAC CTTCGAAGGT GTTTTATAAA 4201 TATTACACAA GCAACCGCCT CAGCCTTACT TTTATTTGCT AGTATTACAA ATAATGTGTT CGTTGGCGGA GTCGGAATGA AAATAAACGA TCATAATGTT 4251 ATGCTTGAAC TTCAGGTGAA TGAAGTTTAA TC GA.AATAAA TAATCCAACC TACGAACTTG AAGTCCACTT ACTTCAAATT AGCTTTATTT ATTAGGTTGG 4301 TCTGCCACAC TGGCCACAAT C GCAC TAAC G TTP.~~A.AATTG GCCTAGCTCC AGACGGTGTG ACCGGTGTTA GCGTGATTGC AATTTTTAAC CGGATCGAGG 4351 CCTTCACTTC TGATTACCCG AAGTGCTCCA AGGTTTAGAC CTTACCACAG GGAAGTGAAG ACTAATGGGC TTCACGAGGT TCCA.AATCTG GAATGGTGTC 4401 GCCTTATCCT TTCTACATGA CP~AA.AAC TC G CCCCATTCGC TATTCTACTA CGGAATAGGA AAGATGTACT GTTTTTGAGC GGGGTAAGCG ATAAGATGAT 4451 CAACTTTACC CCTCATTAA.A TTCCAATTTA CTTGTATTCC TTGGAGTCCT GTTGAAATGG GGAGTAATTT AAGGTTAAAT GAACATAAGG AACCTCAGGA 4501 CTCTATTATG GTAGGAGCCT GAGGTGGATT AA.ACCAAACC CAGCTACGAA GAGATAATAC CATCCTCGGA CTCCACCTAA TTTGGTTTGG GTCGATGCTT 4551 AAATCCTAGC CTATTCCTCA ATTGCACACC TTGGCTGAAT AATCACAATT TTTAGGATCG GATAAGGAGT TAACGTGTGG AACCGACTTA TTAGTGTTAA 4601 CTACACTACT CCCATAATTT AACCCAACTA AATTTATTCC TCTATATTAT GATGTGATGA GGGTATTAAA TTGGGTTGAT TTAAATAAGG AGATATAATA 4651 TATAACCTCA ACAACCTTCC TATTATTTAA AATATTCAAT TCAAC TAPsA,A ATATTGGAGT TGTTGGAAGG ATAATAA.ATT TTATAAGTTA AGTTGATTTT 4701 TTAATTCTAT TTCATCCTCC TCATCAAAGT CCCCTTTATT ATCTACTATT AATTAAGATA AAGTAGGAGG AGTAGTTTCA GGGGA.AATAA TAGATGATAA 4751 ACCCTCATAA CCCTCCTTTC TCTAGGTGGC TTACCCCCAC TCTCAGGCTT TGGGAGTATT GGGAGGAAAG AGATCCACCG AATGGGGGTG AGAGTCCGAA 4801 TATACCAAAA TGATTAATTT TACAGGAATT AACA.AAACAA AACCTAATTA ATATGGTTTT ACTAATTAAA ATGTCCTTAA TTGTTTTGTT TTGGATTAAT 4851 TCCCAGCCAT TATTATATCC ATAATAACCC TCCTTAATCT ATTCTTCTAT AGGGTCGGTA ATAATATAGG TATTATTGGG AGGAATTAGA TAAGAAGATA 4901 TTACGCCTAT GTTATGCTAC AACATTAACT ATAACTCCAA ACTCTGTTAA AATGCGGATA CAATACGATG 
TTGTAATTGA TATTGAGGTT TGAGACAATT 4951 TATAATAACA TCATGACGAA CCAAACTACC CCACAACCTA ACCCTAACAA ATATTATTGT AGTACTGCTT GGTTTGATGG GGTGTTGGAT TGGGATTGTT 5001 CAGCCGCCTC ATTATCTATC CTTATACTCC CGATCACCCC CGCCATTCTC GTCGGCGGAG TAATAGATAG GAATATGAGG GCTAGTGGGG GCGGTAAGAG 5051 ATACTGATAT CCTAAGAAAT TTAGGTTAAT AATAA.A.0 C AA AAACCTTCAA TATGACTATA GGATTCTTTA AATCCAATTA TTATTTGGTT TTTGGAAGTT 5101 AGTTTTAAAT AGAAGTGAAA ATCTCCTAAT TTCTGCTAAG ATTTGTAAGA TCAAAATTTA TCTTCACTTT TAGAGGATTA AAGACGATTC TAAACATTCT 5151 CTTTACCTCA CATCTTCTGA ATGCAACCCA GATACTTTCA TTAAGCTAAA GAA.ATGGAGT GTAGAAGACT TACGTTGGGT C TATGA.AAGT AATTCGATTT 5201 ACCTCCTAGA TA.AATAGGCT TTGATCCTAC AAA~TCTTAG TTAACAGCTA TGGAGGATCT ATTTATCCGA AACTAGGATG TTTTAGAATC AATTGTCGAT 5251 AGCGTTCAAT CCAACGAACT TTTATCTACT TTCTCCCGCC GTAAGAATAA TCGCAAGTTA GGTTGCTTGA AAATAGATGA AAGAGGGCGG CATTCTTATT 5301 AAGGCGGGAG AAAGCCCCGG GAGGAGATAA CCTCCGGTTT TGGATTTGCA TTCCGCCCTC TTTCGGGGCC CTCCTCTATT GGAGGCCAAA ACCTAAACGT 5351 ATCCAACGTA ATCATTTACT GCGGAACTAT GGTAAAAAGA GGAATTTGAC TAGGTTGCAT TAGTAAATGA CGCCTTGATA CCATTTTTCT CCTTAAACTG 5401 CTCTGTTAAC GAAGCTACAA TCCGCCACTT AGTTCTCAGT CATTTTACCT GAGACAATTG CTTCGATGTT AGGCGGTGAA TCAAGAGTCA GTA,AAATGGA 324

5451 GTGGCAATTA ATCGCTGACT ATTTTCTACA AACCACAA.AG ATATCGGCAC CACCGTTAAT TAGCGACTGA TAAAAGATGT TTGGTGTTTC TATAGCCGTG 5501 CCTTTATTTG ATCTTTGGTG CATGAGCAGG AATAGTGGGA ACAGCCCTAA GGAAATAAAC TAGAAACCAC GTACTCGTCC TTATCACCCT TGTCGGGATT 5551 GTCTTCTAAT TCGAGCTGAA TTGGGACAAC CTGGGTCTCT TCTAGGAGAT CAGAAGATTA AGCTCGACTT AACCCTGTTG GACCCAGAGA AGATCCTCTA 5601 GATCAGATTT ATAATGTCAT TGTGACCGCC CACGCATTTG TAATAATTTT CTAGTCTAAA TATTACAGTA ACACTGGCGG GTGCGTAAAC ATTATTAAAA 5651 CTTCATGGTT ATACCCGTAA TAATCGGGGG GTTTGGAAAC TGATTAGTAC GAAGTACCAA TATGGGCATT ATTAGCCCCC CAAACCTTTG ACTAATCATG 5701 CATTAATAAT TGGTGCACCA GATATGGCCT TTCCACGAAT AAATAATATA GTAATTATTA ACCACGTGGT CTATACCGGA AAGGTGCTTA TTTATTATAT 5751 AGCTTTTGAT TACTCCCTCC TTCATTTCTT TTACTTTTAG CTTCAGCCGG TC GAAA.AC TA ATGAGGGAGG AAGTAAAGAA AATGAAAATC GAAGTCGGCC 5801 AGTTGAAGCC GGAGCCGGTA CTGGTTGAAC AGTTTATCCT CCTTTAGCTG TCAACTTCGG CCTCGGCCAT GACCAACTTG TCAA.A.TAGGA GGAAATCGAC 5851 GTAATTTAGC ACATGCTGGA GCATCCGTTG ATTTAGCCAT CTTCTCTCTT CATTAAATCG TGTACGACCT CGTAGGCAAC TA.AATCGGTA GAAGAGAGAA 5901 CATTTAGCAG GCATTTCATC AATTCTAGCT TCAATTAACT TTATTACAAC GTAAATCGTC CGTAAAGTAG TTAAGATCGA AGTTAATTGA AATAATGTTG 5951 TATTATTAAC ATA.AA.AC CAC CAGCCATTTC CCAATATCAA ACACCATTAT ATAATAATTG TATTTTGGTG GTCGGTA.AAG GGTTATAGTT TGTGGTAATA 6001 TTGTGTGATC AATCTTAGTA ACAACTATTC TACTTCTTTT AGCACTCCCA AACACACTAG TTAGAATCAT TGTTGATAAG ATGAAG~ TCGTGAGGGT 6051 GTACTTGCAG CCGGCATTAC TATACTTCTT ACTGATCGAA ACCTAAACAC CATGAACGTC GGCCGTA.ATG ATATGAAGAA TGACTAGCTT TGGATTTGTG 6101 AACATTCTTT GACCCAGCAG GAGGAGGAGA CCCTATTCTT TACCAACACC TTGTAAGAAA CTGGGTCGTC CTCCTCCTCT GGGATAAGAA ATGGTTGTGG 6151 TATTTTGATT TTTTGGACAT CCAGAAGTTT ACATTTTAAT TCTCCCCGGC ATAAAACTAA ~~AAAC CTGTA GGTCTTCAA.A TGTAAA.ATTA AGAGGGGCCG 6201 TTTGGAATAA TTTCCCATGT AGTAGCTTAT TATTCTGGTA ~?~~AAAGAAC C A.AAC C TTATT AAAGGGTACA TCATCGAATA ATAAGACCAT TTTTTCTTGG 6251 ATTTGGTTAC ATAGGTATAG TTTGAGCAAT AATAGCAATC GGATTACTGG TA.AACCAATG TATCCATATC AA.AC TC GTTA TTATCGTTAG CCTAATGACC 6301 GTTTTATTGT 
TTGAGCCCAT CATATATTTA CAGTAGGCAT AGACGTTGAC CAAAATAACA AACTCGGGTA GTATATAAAT GTCATCCGTA TCTGCAACTG 6351 ACACGAGCCT ATTTTACTTC AGCGACAATA ATTATTGCCA TCCCCACAGG TGTGCTCGGA TAA.AATGAAG TCGCTGTTAT TAATAACGGT AGGGGTGTCC 6401 CGTGAAAGTG TTTAGCTGAC TAGCAACCCT TCACGGAGGC TCCATCAAAT GCACTTTCAC AAATCGACTG ATCGTTGGGA AGTGCCTCCG AGGTAGTTTA 6451 GAGAA.ACC C C ATTATTATGG GCTCTCGGGT TTATCTTTTT ATTCACAGTA CTCTTTGGGG TAATAATACC CGAGAGCCCA AATAGP►~~.AA.A TAAGTGTCAT 6501 GGAGGGTTAA CAGGTATTGT CTTAGCCAAC TCCTCCTTAG ATATTGTCCT CCTCCCAATT GTCCATAACA GAATCGGTTG AGGAGGAATC TATAACAGGA 6551 CCATGATACC TACTATGTAG TAGCTCACTT CCATTATGTT CTCTCAATAG GGTACTATGG ATGATACATC ATCGAGTGAA GGTAATACAA GAGAGTTATC 6601 GAGCAGTATT CGCTATCATA GCAGGCTTTA TCCACTGATT TCCTCTTATG CTCGTCATAA GCGATAGTAT CGTCCGA.A.AT AGGTGACTAA AGGAGAATAC 6651 TCTGGCTACA CTCTCCATTC AGCATGAACT AAAATCCAAT TTATAGTAAT AGACCGATGT GAGAGGTAAG TCGTACTTGA TTTTAGGTTA AATATCATTA 6701 ATTTATTGGA GTAAACTTAA CATTCTTCCC ACAACACTTC CTAGGACTTG TAAATAACCT CATTTGAATT GTAAGAAGGG TGTTGTGAAG GATCCTGAAC 6751 CAGGCATACC ACGACGTTAT TCAGATTACC CAGATGCCTA TACTTTATGA GTCCGTATGG TGCTGCAATA AGTCTAATGG GTCTACGGAT ATGAAATACT 6801 AATACAGTTT CCTCAATTGG CTCTTTAATC TCACTTGTAG CAGTAATTAT 325

TTATGTCAAA GGAGTTAACC GAGAAATTAG AGTGAACATC GTCATTAATA 6851 ACTCCTATTT ATTATCTGAG AAGCATTTGC TTCAA.AACGA GAAGTTTTAT TGAGGATAAA TAATAGACTC TTCGTA.AACG AAGTTTTGCT C TTCP~AAATA 6901 CCATTGAACT ACCTCATACA AATGTTGAGT GATTACATGG CTGCCCTCCA GGTAACTTGA TGGAGTATGT TTACAACTCA CTAATGTACC GACGGGAGGT 6951 CCATATCACA CATATGAAGA ACCAGCATTT GTTCAAATTC AACGGACTTT GGTATAGTGT GTATACTTCT TGGTCGTAAA CAAGTTTAAG TTGCCTGAAA 7001 TTA.AACAAGA AAGGAAGGAA TTGAACCCCC ATATGTTAGT TTCAAGCCAA AATTTGTTCT TTCCTTCCTT AACTTGGGGG TATACAATCA AAGTTCGGTT 7051 CTACATCACC ACTCTGTCAC TTTCTTTATT AAGATTCTAG TAAAATATAT GATGTAGTGG TGAGACAGTG A.AAGAAATAA TTCTAAGATC ATTTTATATA 7101 TACACTGTCT TGTCAAGACA AAATTGTGAG TTTAAACCCC ACGAATTTTA ATGTGACAGA ACAGTTCTGT TTTAACACTC AAATTTGGGG TGCTTAAAAT 7151 ATTTATAATG GCACACCCTT CACAATTAGG ATTTCAAGAT GCAGCCTCCC TAAATATTAC CGTGTGGGAA GTGTTAATCC TAAAGTTCTA CGTCGGAGGG 7201 CAGTTATGGA AGAACTTATT CATTTTCACG ACCACACATT AATAATTGTA GTCAATACCT TCTTGAATAA GTAAAAGTGC TGGTGTGTAA TTATTAACAT 7251 TTTCTAATTA GTACCCTAGT TCTTTATATT ATTACAGCAA TAGTATCAAC AAAGATTAAT CATGGGATCA AGA.AATATAA TAATGTCGTT ATCATAGTTG 7301 A.AAAC T TAC A AACAAATATA TTCTAGATTC TCAAGAAATT GA.AATC GTAT TTTTGAATGT TTGTTTATAT AAGATCTAAG AGTTCTTTAA CTTTAGCATA 7351 GAACTATTCT CCCCGCCATT ATCCTCATTT TAATTGCACT TCCATCCTTA CTTGATAAGA GGGGCGGTAA TAGGAGTA.AA ATTAACGTGA AGGTAGGAAT 7401 CGAATTTTAT ACCTCATAGA TGAGATTAAT GATCCTCACC TTACTATTAA GCTTAAAATA TGGAGTATGT ACTCTAATTA CTAGGAGTGG AATGATAATT 7451 AGCAATAGGT CACCAATGAT ATTGAACTTA TGAATACACA GATTATGAAG TCGTTATCCA GTGGTTACTA TAACTTGAAT ACTTATGTGT CTAATACTTC 7501 ATCTAGGATT TGACTCTTAC ATGATTCAAA CTGAAGATTT AACTCCAGGC TAGATCCTAA ACTGAGAATG TACTAAGTTT GACTTCTAAA TTGAGGTCCG 7551 CAATTCCGTT TAC TAGAA.AC AGACCACCGT ATAGTTGTAC CCATAGAATC GTTAAGGCAA ATGATCTTTG TCTGGTGGCA TATCAACATG GGTATCTTAG 7601 ACCCATCCGT GTACTAGTTT CTGCAGAAGA TGTCTTACAC TCATGAACTA TGGGTAGGCA CATGATCA.AA GACGTCTTCT ACAGAATGTG AGTACTTGAT 7651 TCCCAGCCCT AGGTGTTAAA ATGGACGCTG TACCAGGACG CCTAA.ACCAA AGGGTCGGGA TCCACAATTT 
TACCTGCGAC ATGGTCCTGC GGATTTGGTT 7701 ACTGCTTTTA TCATCTCCCG ACCAGGTATT TATTATGGTC AATGTTCAGA TGAC GP~AAAT AGTAGAGGGC TGGTCCATAA ATAATACCAG TTACAAGTCT 7751 AATTTGTGGT GCCAACCACA GCTTTATGCC TATCGTAGTA GAGGCAGTTC TTA.AACAC CA CGGTTGGTGT CGAAATACGG ATAGCATCAT CTCCGTCAAG 7801 CCCTAGAACA CTTCGAAGCC TGATCTTCAT TAATATTAGA AGAAGTCTCA GGGATCTTGT GAAGCTTCGG ACTAGAAGTA ATTATAATCT TCTTCAGAGT 7851 TTAAGAAGCT A.AATTGGGTA TAGCATTAGC CTTTTAAGCT P~~A.AACTGGT AATTCTTCGA TTTAACCCAT ATCGTAATCG GAA.AATTC GA TTTTTGACCA 7901 GATTCCCTAC CACCCTTAAT GATATGCCTC AATTAAACCC CCACCCTTGA CTAAGGGATG GTGGGAATTA CTATACGGAG TTAATTTGGG GGTGGGAACT 7951 TTTATTATTT TCCTATTTTC ATGAATAATT CTTCTAACTA TTATACCTAA AAATAATAA.A AGGATAAAAG TACTTATTAA GAAGATTGAT AATATGGATT 8001 A.A.AAGTAATA AATCATATAT TTAGTAATAA CCCAACATTA AAAAGTACTG TTTTCATTAT TTAGTATATA AATCATTATT GGGTTGTAAT TTTTCATGAC 8051 AGAAATCCAA ACCTAAGTCC TGAAACTGAC CATGATCCTA AACTTTTTTG TCTTTAGGTT TGGATTCAGG ACTTTGACTG GTACTAGGAT TTG C 8101 ACCAATTTCT AAGTCCCTCC CTCCTTGGAA TTCCTTTAAT TGCTATAGCA TGGTTA.AAGA TTCAGGGAGG GAGGAACCTT AAGGAAATTA ACGATATCGT 8151 ATTACATTAC CATGACTAAT TTTCCCAACC CCAACTAACC GC TGATTA.AA TAATGTAATG GTACTGATTA A.A.AGGGTTGG GGTTGATTGG CGACTAATTT 326

8201 TAATCGATTG ATAACCCTCC A.AA.ACTGGTT CATTAATCGA TTTATTTATC ATTAGCTAAC TATTGGGAGG TTTTGACCAA GTAATTAGCT AAATAAATAG 8251 AACTTATACA ACCTATTAAC TTTACTGGCC ATAAATGAAC CATATTATTT TTGAATATGT TGGATAATTG AAATGACCGG TATTTACTTG GTATAATAAA 8301 ATAGCACTAA TACTGTTCCT AATTACCATC AACCTTCTAG GACTTCTTCC TATCGTGATT ATGACAAGGA TTAATGGTAG TTGGAAGATC CTGAAGAAGG 8351 CTACACCTTC ACCCCCACAA CCCAACTCTC CCTTAATATA GCATTTGCCC GATGTGGAAG TGGGGGTGTT GGGTTGAGAG GGAATTATAT CGTAAACGGG 8401 TACCCTTATG ACTTATAACC GTATTAATCG GAGTAATTAA TCAACCAACG ATGGGAATAC TGAATATTGG CATAATTAGC CTCATTAATT AGTTGGTTGC 8451 ATTGCACTAG GACATTTCTT ACCAGAAGGT ACTCCTACTC CTCTAGTACC TAACGTGATC CTGTAAAGAA TGGTCTTCCA TGAGGATGAG GAGATCATGG 8501 CGTACTAATT ATTATTGAAA CTATTAGCCT ATTTATTCGA CCACTAGCAT GCATGATTAA TAATAACTTT GATAATCGGA TAAATAAGCT GGTGATCGTA 8551 TAGGGGTTCG ACTAACTGCC AATTTAACAG CTGGTCATCT ATTAATACAA ATCCCCAAGC TGATTGACGG TTA.AATTGTC GACCAGTAGA TAATTATGTT 8601 TTAATTGCAA CCGCAGCCTT TATTCTCATC ACCATTATGC CAACCATAGC AATTAACGTT GGCGTCGGAA ATAAGAGTAG TGGTAATACG GTTGGTATCG 8651 ATTATTAACA TCCATTATCT TATTTCTATT AACAATTCTA GAAGTAGCTG TAATAATTGT AGGTAATAGA ATAAAGATAA TTGTTAAGAT CTTCATCGAC 8701 TAGCAATAAT TCAAGCTTAT GTATTTGTAC TCCTATTAAG CCTGTACCTA ATCGTTATTA AGTTCGAATA CATAAACATG AGGATAATTC GGACATGGAT 8751 CAAGA,AAATG TTTAATGGCT CACCAAGCAC ACGCATATCA CATAGTTGAT GTTCTTTTAC AAATTACCGA GTGGTTCGTG TGCGTATAGT GTATCAACTA 8801 CCCAGTCCAT GACCACTAAC TGGAGCTACA GCTGCCCTTC TAATAACATC GGGTCAGGTA CTGGTGATTG ACCTCGATGT CGACGGGAAG ATTATTGTAG 8851 CGGACTGGCT GTCTGATTTC ACTTCAACTC GTTATCACTT CTCTATTTAG GCCTGACCGA C AGAC TAA.AG TGAAGTTGAG CAATAGTGAA GAGATAAATC 8901 GTTTAATTCT CCTACTATTA ACTATAATTC AATGATGACG CGATATTATT CA.AATTAAGA GGATGATAAT TGATATTAAG TTACTACTGC GCTATAATAA 8951 CGAGAAGGAA CATTTCAAGG TCATCACACA CCCCCCGTTC P.,AAAAGGTC T GCTCTTCCTT GTAAAGTTCC AGTAGTGTGT GGGGGGCAAG TTTTTCCAGA 9001 CCGCTATGGT ATAATCTTAT TTATTATATC AGAAGTATTT TTCTTTCTAG GGCGATACCA TATTAGAATA AATAATATAG TCTTCATAAA AAGA.AAGATC 9051 GCTTTTTCTG AGCCTTTTAC 
CACTCAAGTC TTTCCCCAAC CCCAGAACTA CGAAAAAGAC TC GGA~AAATG GTGAGTTCAG AA.AGGGGTTG GGGTCTTGAT 9101 GGGGGGTGCT GACCCCCAAC GGGAATTAAC CCATTAGACC CATTTGAAGT CCCCCCACGA CTGGGGGTTG CCCTTAATTG GGTAATCTGG GTAAACTTCA 9151 GCCACTTTTA AATACGGCAG TACTATTAGC TTCTGGCGTC ACAGTAACCT CGGTGAAAAT TTATGCCGTC ATGATAATCG AAGACCGCAG TGTCATTGGA 9201 GAACCCACCA CAGTCTAATA GAAGGTAATC GA.A.AAGAAGC TATCCAAGCC CTTGGGTGGT GTCAGATTAT CTTCCATTAG CTTTTCTTCG ATAGGTTCGG 9251 CTTACCCTCA CTATCATTCT AGGATTTTAC TTTACAACCC TTCAAGCTAT GAATGGGAGT GATAGTAAGA TC C TA.A.AATG AA.ATGTTGGG AAGTTCGATA 9301 GGAATATTAC GAAGCATCAT TCACAATCGC TGATGGAGTT TATGGAACTA CCTTATAATG CTTCGTAGTA AGTGTTAGCG ACTACCTCAA ATACCTTGAT 9351 CATTTTTCGT CGCCACAGGA TTCCACGGTC TACATGTTAT TATTGGCTCA GTP~~A.AAGCA GCGGTGTCCT AAGGTGCCAG ATGTACAATA ATAACCGAGT 9401 ACATTTTTAA CAATCTGTTT ACTACGACAA ATTCAATATC ACTTTACATC TGTP.►~~~AATT GTTAGACAAA TGATGCTGTT TAAGTTATAG TGAAATGTAG 9451 AGAGCACCAT TTTGGCTTTG AAGCTGCAGC ATGATATTGA CATTTTGTAG TCTCGTGGTA AA.AC C GAAAC TTCGACGTCG TACTATAACT GTAAAACATC 9501 ATGTAGTATG ATTATTCCTT TATGTATCCA TCTATTGATG AGGCTCATAA TACATCATAC TAATAAGGAA ATACATAGGT AGATAACTAC TCCGAGTATT 9551 TTACTTTTCT AGTATAAACT AGTACAA.ATG ATTTCCAATC ATTTAATCTT 327

AATGAA.AAGA TCATATTTGA TCATGTTTAC TAAAGGTTAG TAAATTAGAA 9601 GGCTAAAATC CAAGGAAAAG TAATGAGCCT CATTACATCT TCTATCGCGG CCGATTTTAG GTTCCTTTTC ATTACTCGGA GTAATGTAGA AGATAGCGCC 9651 CTACGGCCCT GGTTTCCCTA ATCCTCGTTC TAATTACATT TTGACTTCCA GATGCCGGGA CCAAAGGGAT TAGGAGCAAG ATTAATGTAA AACTGAAGGT 9701 TTATTAAACC CAGATAATGA AAAATTATC C CCATATGAAT GTGGATTCGA AATAATTTGG GTCTATTACT TTTTAATAGG GGTATACTTA CACCTAAGCT 9751 CCCCCTAGGA AATGCACGCC TCCCCTTTTC CTTACGTTTT TTTCTTGTAG GGGGGATCCT TTACGTGCGG AGGGGAAAAG GAATGCAAAA AAAGAACATC 9801 CTATCTTATT TTTATTATTT GAC C TAGAA.A TTGCCCTCCT TCTTCCCCTG GATAGAATAA AA.ATAATA.AA CTGGATCTTT AACGGGAGGA AGAAGGGGAC 9851 CCATGAGGAA ATCAACTATT ATCACCACTT TCCACACTAT TCTGGGCAAC GGTACTCCTT TAGTTGATAA TAGTGGTGAA AGGTGTGATA AGACCCGTTG 9901 AATCATCCTA ATTTTATTAA CTTTAGGCCT TATTTATGAA TGATTTCAAG TTAGTAGGAT TAAAATAATT GAAATCCGGA ATAAATACTT ACTA.AAGTTC 9951 GAGGATTAGA ATGAGCAGAA TGGATATTTA GTC TA.A.ATAA AGACCACTAA CTCCTAATCT TACTCGTCTT ACCTATAAAT CAGATTTATT TCTGGTGATT 10001 TTTCGACTTA GTAA.ATTATG GTAAAAATCC ATA.AATAT C C TATGTCTCCC AAAGCTGAAT CATTTAATAC CATTTTTAGG TATTTATAGG ATACAGAGGG 10051 ATATATTTCA GTCTTAATTC AGCATTTATT TTAGGTCTTA CGGGCCTCGC TATATA.AAGT CAGAATTAAG TCGTAAATAA AATCCAGAAT GCCCGGAGCG 10101 ACTCAATCGT TATCACCTTC TATCTGCACT TTTATGTTTA GAAGGTATAC TGAGTTAGCA ATAGTGGAAG ATAGACGTGA AAATAC.A.AAT CTTCCATATG 10151 TATTAACTTT ATTCATTACT ATTTCCATCT GAGCCTTAAC ATTAA.AC TCA ATAATTGAAA TAAGTAATGA TAAAGGTAGA CTCGGAATTG TAATTTGAGT 10201 ACCTCATGCT CAATTATTCC CATAATTCTT CTTACATTTT CAGCCTGTGA TGGAGTACGA GTTAATAAGG GTATTAAGAA GAATGTAAAA GTCGGACACT 10251 AGCTAGTACA GGTTTAGCCA TTCTAGTAGC CACCTCCCGT TCTCATGGTT TCGATCATGT CCAAATCGGT AAGATCATCG GTGGAGGGCA AGAGTACCAA 10301 CCGACAACTT ACA~AAAC C TA AACCTTCTTC AATGCT~ ATCCTTATCC GGCTGTTGAA TGTTTTGGAT TTGGAAGAAG TTACGATTTT TAGGAATAGG 10351 CAACAATTAT ACTCCTCCCA ACCACATGAA TAATTAATAA ~~AATGATTA GTTGTTAATA TGAGGAGGGT TGGTGTACTT ATTAATTATT TTTTACTAAT 10401 TGATCCACAA TCACCACCTA CAGCCTTCTA ATTGCACTAC TGAGTCTACT ACTAGGTGTT 
AGTGGTGGAT GTCGGAAGAT TAACGTGATG ACTCAGATGA 10451 TTTATTTAAA TGAAATATAG ATATTGGCTG GGATTTCTCC AACCAATTTA AAATAAATTT ACTTTATATC TATAACCGAC CCTAAAGAGG TTGGTTAA.AT 10501 TAGCTACCGA CCCTTTATCC ACTCCCCTAT TGATCCTTAC ATGTTGACTT ATCGATGGCT GGGAA.ATAGG TGAGGGGATA ACTAGGAATG TACAACTGAA 10551 CTACCATTAA TAACCTTAGC TAGTCA.A.AAT CATATTTCTC CAGAACCAAT GATGGTAATT ATTGGAATCG ATCAGTTTTA GTATAAAGAG GTCTTGGTTA 10601 TGTCCGACAA CGAACATATA TTATACTTCT AATTTTCCTC CAAGCCTTTC ACAGGCTGTT GCTTGTATAT AATATGAAGA TTA.AAAGGAG GTTC GGA.AAG 10651 TCATTATAGC ATTCTCTGCA ACCGAAATAA TTTTATTTTA TATTATATTT AGTAATATCG TAAGAGACGT TGGCTTTATT AAA.ATA.AA.AT ATAATATAAA 10701 GAAGCCACAC TCATCCCCAC TCTCATTATT ATTACACGAT GAGGA.AATCA CTTCGGTGTG AGTAGGGGTG AGAGTAATAA TAATGTGCTA CTCCTTTAGT 10751 AACAGAACGC TTAA.ATGCAG GTACCTATTT CTTATTTTAC ACCTTAATTG TTGTCTTGCG AATTTACGTC CATGGATAA.A GAATP~AAATG TGGAATTAAC 10801 GTTCTCTTCC CCTTCTCATT GCCCTCCTAC TTATACAA.AA TAATTTAGGT CAAGAGAAGG GGAAGAGTAA CGGGAGGATG AATATGTTTT ATTAAATCCA 10851 ACCCTATCAA TAATCATTAT ACAACACTCA CAACTCCCAA ATCTATTCTC TGGGATAGTT ATTAGTAATA TGTTGTGAGT GTTGAGGGTT TAGATAAGAG 10901 ATGGGCAGAT A.AATTATGAT GAGTAGCCTG TCTCATTGCT TTCCTTGTTA TACCCGTCTA TTTAATACTA CTCATCGGAC AGAGTAACGA AAGGAACAAT 328

10951 A.AATAC C C C T ATATGGAATT CACCTTTGAC TCCCCAAAGC CCATGTTGAA TTTATGGGGA TATACCTTAA GTGGAAACTG AGGGGTTTCG GGTACAACTT TA.A.AAC TAGG 11001 GCCCCAATTG CTGGATCAAT AATTCTAGCA GCAGTATTAC CGGGGTTAAC GACCTAGTTA TTAAGATCGT CGTCATAATG ATTTTGATCC 11051 GGGTTATGGA ATAATACGAA TTATTGTGAT ACTAAATCCT TTAACCAAAG CCCAATACCT TATTATGCTT AATAACACTA TGATTTAGGA AATTGGTTTC 11101 AAATAATTTA TCCATTCTTA ATTTTAGCTA TCTGAGGAAT TATTATAACC TTTATTAAAT AGGTAAGAAT TAAA.ATC GAT AGACTCCTTA ATAATATTGG A.AATCACTCA 11151 AGTTCCATCT GCTTACGGCA AACAGATCTA TTGCTTATTC TCAAGGTAGA CGAATGCCGT TTGTCTAGAT TTTAGTGAGT AACGAATAAG ATCCA.AACAC 11201 ATCAGTAAGT CACATAGGAC TAGTTGCTGG AGCTATTCTT TAGTCATTCA GTGTATCCTG ATCAACGACC TCGATAAGAA TAGGTTTGTG 11251 CATGAAGTTT TGCAGGAGCA ATTACACTCA TAATTGCCCA TGGCTTAATT GTACTTCAAA ACGTCCTCGT TAATGTGAGT ATTAACGGGT ACCGAATTAA 11301 TCATCAGCCT TATTCTGTCT AGCTAACACC AACTATGAAC GAATTCACAG AGTAGTCGGA ATAAGACAGA TCGATTGTGG TTGATACTTG CTTAAGTGTC 11351 CCGAACTATG CTCCTAGCTC GAGGTTTACA AATCATCCTT CCATTAACAG GGCTTGATAC GAGGATCGAG CTCCAAATGT TTAGTAGGAA GGTAATTGTC 11401 CAACCTGATG ACTCCTTACT AGTTTAGCTA ACCTTGCCCT ACCTCCCTCA GTTGGACTAC TGAGGAATGA TCA.AATCGAT TGGAACGGGA TGGAGGGAGT 11451 CCCAACCTCA TAGGAGAACT CCTTATTATT ACTTCACTAT TTAACTGATC GGGTTGGAGT ATCCTCTTGA GGAATAATAA TGAAGTGATA AATTGACTAG 11501 TAACTGAACC CTAATCTTAT CAGGCCTTGG AGTATTAATC ACAGCCTCCT ATTGACTTGG GATTAGAATA GTCCGGAACC TCATAATTAG TGTCGGAGGA 11551 ATTCACTTTA CATATTCTTA TTAACTCAAC GAGGTCCAAC TCCCCTTCAC TAAGTGA.AAT GTATAAGAAT AATTGAGTTG CTCCAGGTTG AGGGGAAGTG 11601 ATTTTATCCT TAAATCCAAA TTATACACGA GAACATCTTC TCATAACCCT TAAAATAGGA ATTTAGGTTT AATATGTGCT CTTGTAGAAG AGTATTGGGA 11651 CCACCTTATA CCCATTTTAT TACTAATGTT TAAACCAGAA CTTATCTGAG GGTGGAATAT GGGTAAAATA ATGATTACAA ATTTGGTCTT GAATAGACTC A.AACATTAGA 11701 GTTGAACATT TTGTATTTAT AGTTTAACCA TTGTGGTTCT CAACTTGTAA AACATAAATA TC.AAATTGGT TTTGTAATCT AACACCAAGA 11751 P.~~AAATAAAA GCTAAAACCT TTTTAATTAC CGAGAGAGGT CAGGGATACG TTTTTATTTT CGATTTTGGA AAA.ATTAATG GCTCTCTCCA GTCCCTATGC 11801 
AAGAACTGCT AATTCTTCTC ACCATGGCTC AAATCCATGG CTCACTCAGC TTCTTGACGA TTAAGAAGAG TGGTACCGAG TTTAGGTACC GAGTGAGTCG 11851 TTATGAAAGA TAATAGA.AAT CTATTGGTCT TAGGAATCAA AAACTCTTGG AATACTTTCT ATTATCTTTA GATAACCAGA ATCCTTAGTT TTTGAGAACC 11901 TGCAA.ATCCA AGCAAAAGCT ATGAATACCA TCTTCAACTC ATCATTTCTC ACGTTTAGGT TCGTTTTCGA TACTTATGGT AGAAGTTGAG TAGTAAAGAG 11951 TTAATTTTTA TTATCCTTAC CTTCCCACTA ATAACCTCAC T.A.A.AACC TAA AATTP.~~AAA.T AATAGGAATG GAAGGGTGAT TATTGGAGTG ATTTTGGATT TGTA~~AAACA GCCGTP.~AAAA 12001 ACAACCCAAT CCCAATTGAT CATCATCTCA TGTTGGGTTA GGGTTAACTA GTAGTAGAGT ACATTTTTGT CGGCATTTTT A.AATCAAGGC 12051 CCTCCTTCTT TATTAGCCTT ATCCCACTAT TCATTTTCCT GGAGGAAGAA ATAATCGGAA TAGGGTGATA AGTAAAAGGA TTTAGTTCCG 12101 CTAGAATCAA TCATAATCAA CTATAACTGA ATAAATATTG GACCATTTGA GATCTTAGTT AGTATTAGTT GATATTGACT TATTTATAAC CTGGTAAACT 12151 TATTAACATA AGTTTCA.AAT TTGATATGTA CTCAATTATA TTTACCCCCG ATAATTGTAT TCAAAGTTTA AACTATACAT GAGTTAATAT AAATGGGGGC 12201 TAGCTCTCTA TGTTACCTGA TCTATTCTCG AGTTCGCCTT ATGATATATA ATCGAGAGAT ACAATGGACT AGATAAGAGC TCAAGCGGAA TACTATATAT 12251 CATTCTGATC CCAACATTAA CCGCTTTTTT AAATATTTAT TACTCTTCCT GTAAGACTAG GGTTGTAATT GGC G TTTATAAATA ATGAGAAGGA 12301 AATTTCAATA ATTATCTTAG TAACAGCTAA CAACATCTTT CAATTATTTA 329

TTAAAGTTAT TAATAGAATC ATTGTCGATT GTTGTAGAAA GTTAATAAAT 12351 TTGGATGAGA AGGGGTCGGA ATTATATCAT TCCTCCTAAT TGGTTGATGA AACCTACTCT TCCCCAGCCT TAATATAGTA AGGAGGATTA ACCAACTACT 12401 TATAGCCGAA CAGATGCTAA CACCGCCGCC CTCCAAGCTG TAATTTACAA ATATCGGCTT GTCTACGATT GTGGCGGCGG GAGGTTCGAC ATTA.AATGTT 12451 TCGAGTAGGG GATATTGGAT TAATCCTCAG CATAGCCTGA TTAGCTATAA AGCTCATCCC CTATAACCTA ATTAGGAGTC GTATCGGACT AATCGATATT 12501 ATTTAAACTC ATGAGAAATT CAACAATTAT TTATTCTAGC CP.~A.AAATATA TA.AATTTGAG TACTCTTTAA GTTGTTAATA AATAAGATCG GTTTTTATAT 12551 AATATAACAC TACCTCTCTT CGGTCTCGTC CTAGCTGCAG CTGGAA,AATC TTATATTGTG ATGGAGAGAA GCCAGAGCAG GATCGACGTC GACCTTTTAG 12601 CGCACAATTT GGCCTCCACC CTTGACTCCC TTCCGCCATA GAAGGCCCAA GCGTGTTAA.A CCGGAGGTGG GAACTGAGGG AAGGCGGTAT CTTCCGGGTT 12651 CACCAGTATC TGCCTTACTT CACTCCAGCA CAATAGTTGT TGCCGGTATT GTGGTCATAG ACGGAATGAA GTGAGGTCGT GTTATCAACA ACGGCCATAA 12701 TTCCTATTAA TCCGCCTTCA CCCCTTAATC CAAGATAATC AACTAATCTT AAGGATAATT AGGCGGAAGT GGGGAATTAG GTTCTATTAG TTGATTAGAA 12751 AACAATGTGC CTTTGTTTAG GAGCATTAAC TACCCTTTTT ACCGCAGCTT TTGTTACACG GAAACAAATC CTCGTAATTG ATGGGAAAAA TGGCGTCGAA 12801 GCGCACTAAC CCAAAATGAT ATC TTATTGCCTT CTCAACATCA CGCGTGATTG GGTTTTACTA TAGTTTTTTT AATAACGGAA GAGTTGTAGT 12851 AGTCAACTTG GATTAATAAT AGTAACAATT GGACTCAATC AACCCCAACT TCAGTTGAAC CTAATTATTA TCATTGTTAA CCTGAGTTAG TTGGGGTTGA 12901 CGCCTTTCTC CACATTTGTA CCCATGCCTT CTTCAAAGCC ATACTCTTCC GC GGA.A.AGAG GTGTAAACAT GGGTACGGAA GAAGTTTCGG TATGAGAAGG 12951 TTTGTTCAGG ATCTATTATC CACAGTCTTA ATGATGAACA AGACATCCGT A.AACAAGTCC TAGATAATAG GTGTCAGAAT TACTACTTGT TCTGTAGGCA 13001 AAA.ATAGGAG GGCTCCATAA ACTCATACCA TTTACCTCAT CTTCTTTAAT TTTTATCCTC CCGAGGTATT TGAGTATGGT AAATGGAGTA GAAGAAATTA 13051 TATTGGAAGC TTAGCCCTTA CAGGCATACC TTTTTTATCA GGTTTCTTCT ATAACCTTCG AATCGGGAAT GTCCGTATGG TAGT CCAAAGAAGA 13101 CAAAAGACAT TATTATTGAA ACCATAA.ACA CTTCTCACCT AAACGCCTGA GTTTTCTG'I`A ATAATAACTT TGGTATTTGT GAAGAGTGGA TTTGCGGACT 13151 GCCCTAATCC TCACCCTTAT CGCAACATCA TTCACCTCCA TCTATAGCCT CGGGATTAGG AGTGGGAATA 
GCGTTGTAGT AAGTGGAGGT AGATATCGGA 13201 ACGCCTTATA TTCTTCACAT TAATAAACTT CCCACGATTC AACTCACTTT TGC GGA.ATAT AAGAAGTGTA ATTATTTGAA GGGTGCTAAG TTGAGTGAAA 13251 CCCCCATTAA TGAAAATAAT CCTATAATAA TTAACCCAAT CAAACGATTG GGGGGTAATT ACTTTTATTA GGATATTATT AATTGGGTTA GTTTGCTAAC 13301 GCTTACGGAA GTATCCTAGC TGGCCTTATT ATTACATCAA ATTTAACCCC CGAATGCCTT CATAGGATCG ACCGGAATAA TAATGTAGTT TAA.A.TTGGGG 13351 CACP~~AAACA CAAGTCATAA CTATATCCCC TTTATTAAAA TTCTCCACCC GTGTTTTTGT GTTCAGTATT GATATAGGGG AAATAATTTT AAGAGGTGGG 13401 TTTTAATCAC AATCACTGGC TTATTACTAG CCCTAGAATT GGTTAACTTA AA.A.ATTAGTG TTAGTGACCG AATAATGATC GGGATCTTAA CCAATTGAAT 13451 ACTAATACTC AATTTAAA,AT AAACCCCACC CTCTTTACCC ACCATTTCTC TGATTATGAG TTAAATTTTA TTTGGGGTGG GAGA.AATGGG TGGTA.AAGAG 13501 CAATATACTC GGATATTTTC CACA.AATTAT TCACCGCCTC C TAC CP~~~1AA GTTATATGAG CCTATAAAAG GTGTTTAATA AGTGGCGGAG GATGGTTTTT 13551 TCAACCTAAA TTGAGCCCAA AACACCTCAA CCCACCTAGT TGATCAAACA AGTTGGATTT AACTCGGGTT TTGTGGAGTT GGGTGGATCA ACTAGTTTGT 13601 TGAAATGAAA AAATTGGACC p~~A.AAGTAC C CTCATCCAAC AAATTCCTTT ACTTTACTTT TTTAACCTGG TTTTTCATGG GAGTAGGTTG TTTAAGGAAA 13651 AAC TA.AAC TA TCTACTCAAC CACAACAGGG TTATATTAAA ACTTATCTTA TTGATTTGAT AGATGAGTTG GTGTTGTCCC AATATAATTT TGAATAGAAT 330

13701 TATTACTTTT CCTTACATTA ACCCTAGCCC TATTAACTTC ACTAATTAAC ATAATGAAAA GGAATGTAAT TGGGATCGGG ATAATTGAAG TGATTAATTG 13751 TGCACGTA.AA GCCCCCCAAG ATAGCCCTCG AGTTAACTCC AATACCACAA ACGTGCATTT CGGGGGGTTC TATCGGGAGC TCAATTGAGG TTATGGTGTT 13801 ATAAAGTTAA CAATAGTACC CACCCACTTA A.AAAC AATAA CCATCCACCA TATTTCAATT GTTATCATGG GTGGGTGAAT TTTTGTTATT GGTAGGTGGT 13851 CTAGCATACA ACAAAGCTAC C C CTGCA.A.AA TCTCCACGAA CCATCTCCAT GATCGTATGT TGTTTCGATG GGGACGTTTT AGAGGTGCTT GGTAGAGGTA 13901 ACTACTAATC TCCTCTACTC CCACCCAACC TAGCTCAGAT CACTCAACTA TGATGATTAG AGGAGATGAG GGTGGGTTGG ATCGAGTCTA GTGAGTTGAT 13951 TAAAATACTT GCCAACAAAG AC TA.AAGC TA C TAAATAAAA CCCAACATAC ATTTTATGAA CGGTTGTTTC TGATTTCGAT GATTTATTTT GGGTTGTATG 14001 AATAATACCG ATCAACTACC CCACCACTCA GGATAAGGCT CAGCAGCAAG TTATTATGGC TAGTTGATGG GGTGGTGAGT CCTATTCCGA GTCGTCGTTC 14051 AGCTGCCGTA TAAGCAAACA CTACTAACAT TCCCCCTAAA TA.AATTAAAA TCGACGGCAT ATTCGTTTGT GATGATTGTA AGGGGGATTT ATTTAATTTT 14101 ATAGAACTAA TGAT GATCCCCCAT GGCCCACTAA TAATCCACAC TATCTTGATT ACTATTTTTT CTAGGGGGTA CCGGGTGATT ATTAGGTGTG 14151 CCCACCCCAG CAGCTATAAC TAACCCTAAT GCAGCATAAT AAGGAGAAGG GGGTGGGGTC GTCGATATTG ATTGGGATTA CGTCGTATTA TTCCTCTTCC 14201 GTTAGACGCT ACTCCTATTA ATCCTAAGAC TAAACAAATT ATTATC~'~AAA CAATCTGCGA TGAGGATAAT TAGGATTCTG ATTTGTTTAA TAATAGTTTT 14251 ATATAAAATA TACCATTATT CCTACCTGGA TTTTAACCAA GACCAATAAC TATATTTTAT ATGGTAATAA GGATGGACCT AAAATTGGTT CTGGTTATTG 14301 TTGP.~~AA.AC T ATCGTTGTTT ATTCAACTAT AAGAATTTAT GGCCATAAAT AACTTTTTGA TAGCAACAAA TAAGTTGATA TTC TTA.AATA CCGGTATTTA 14351 ACCCGP.~~.AAA CCCACCCACT AC TP►AAAATT GTTAATCAAA CCCTAATTGA TGGGCTTTTT GGGTGGGTGA TGATTTTTAA CAATTAGTTT GGGATTAACT 14401 TCTCCCAACT CCCTCAAATA TTTCAATCTG ATGAAACTTT GGCTCACTTC AGAGGGTTGA GGGAGTTTAT AAAGTTAGAC TACTTTGAAA CCGAGTGAAG 14451 TAGGACTATG CCTAATTATC CAAATTCTCA CAGGACTTTT TCTAGCAATA ATCCTGATAC GGATTAATAG GTTTAAGAGT GTCCTGAA.AA AGATCGTTAT 14501 CACTATACCC CCGATATCTC CATAGCCTTC TCCTCAGTAA TTCACATTTG GTGATATGGG GGCTATAGAG GTATCGGAAG AGGAGTCATT AAGTGTAAAC 14551 
CCGCGATGTT AACTATGGCT GACTTATCCG CAACATCCAC GCCAACGGAG GGCGCTACAA TTGATACCGA CTGAATAGGC GTTGTAGGTG CGGTTGCCTC 14601 CCTCATTATT CTTCATTTGC ACATATTTAC ATATTGCTCG AGGACTTTAT GGAGTAATAA GAAGTA.AAC G TGTATA.AATG TATAACGAGC TCCTGAAATA 14651 TATGGCTCCT ACCTTTATAA AGAAACATGG AATATTGGGG TAATCTTATT ATACCGAGGA TGGA.AATATT TCTTTGTACC TTATAACCCC ATTAGAATAA 147 01 ATTTCTGCTA ATAGCCACAG CCTTCGTAGG TTATGTATTA CCATGGGGGC TAAAGACGAT TATCGGTGTC GGAAGCATCC AATACATAAT GGTACCCCCG 14751 AAATATCCTT CTGAGGTGCT ACAGTTATTA CCAACCTCTT ATCTGCCTTC TTTATAGGAA GACTCCACGA TGTCAATAAT GGTTGGAGAA TAGACGGAAG 14801 CCCTATATTG GA.AATATGTT AGTTCAGTGA ATTTGAGGTG GTTTCTCAGT GGGATATAAC CTTTATACAA TCAAGTCACT TAAACTCCAC CAA.AGAGTCA 14851 AGATAACGCC ACTTTGACAC GATTCTTCGC ATTTCACTTC CTTCTACCTT TCTATTGCGG TGA.AACTGTG CTAAGAAGCG TAAAGTGAAG GAAGATGGAA 14901 TCCTAATTAC AGCATTAATA CTTATTCATA TTCTCTTTTT ACATGAAACA AGGATTAATG TCGTAATTAT GAATAAGTAT AAGAGP►~~AA.A TGTACTTTGT 14951 GGTTCAAATA ACCCCATAGG ACTTAATTCT GATATAGATA AAATTTCCTT CCAAGTTTAT TGGGGTATCC TGAATTAAGA CTATATCTAT TTTAAAGGAA 15001 CCATCCCTAC TTCTCCTATA AAGATACACT TGGTTTTTTT ACCTTAATCA GGTAGGGATG AAGAGGATAT TTCTATGTGA ACC TGGAATTAGT 15051 TATTCCTGGG AATCCTGACC CTATTTCTCC CTAACCTTCT AGGTGATGCT 331

ATAAGGACCC TTAGGACTGG GATA.AAGAGG GATTGGAAGA TCCACTACGA 15101 G~,,A.PsACTTCA TCCCTGCTAA CCCCCTTGTT ACCCCTCCCC ATATTAAACC CTTTTGAAGT AGGGACGATT GGGGGAACAA TGGGGAGGGG TATAATTTGG 15151 CGAATGATAT TTCCTATTTG CCTATGCTAT TCTCCGATCA ATTCCTAATA GCTTACTATA AAGGATAAAC GGATACGATA AGAGGCTAGT TAAGGATTAT 15201 AACTAGGAGG AGTCCTAGCC CTTCTATTCT CTATTTTTAT CCTTATATTA TTGATCCTCC TCAGGATCGG GAAGATAAGA GATp~~AAATA GGAATATAAT 15251 ATTCCCTTAT TACACACCTC TA.AACAAC GA ACCAGCATCT TCCGCCCACT TAAGGGAATA ATGTGTGGAG ATTTGTTGCT TGGTCGTAGA AGGCGGGTGA 15301 TACACAAATT TTCTTCTGAA TCCTTGTAAC CAACATATTA ATCTTAACTT ATGTGTTTAA AAGAAGACTT AGGAACATTG GTTGTATAAT TAGAATTGAA 15351 GAATTGGAGG ACAACCAGTT GAACAACCAT TTATTATTAT TGGACA.AATC CTTAACCTCC TGTTGGTCAA CTTGTTGGTA AATAATAATA ACCTGTTTAG 15401 GCATCCATTA TATATTTCTC CTTATTTCTT ATTGTAATTC CACTCACAGG CGTAGGTAAT ATATAAAGAG GAATAA.AGAA TAACATTAAG GTGAGTGTCC 15451 CTGAYGAGAA AACAAAATCC TCAGCCTAAA CTGTTTTGGT AGCTTAATTT GACTYCTCTT TTGTTTTAGG AGTCGGATTT GACAAAACCA T C GAATTA.AA 15501 AATAAAGCAT CGACCTTGTA AGTCGAAGAC CGGAGGTTTA AAACCCCCCC TTATTTCGTA GCTGGAACAT TCAGCTTCTG GCCTCCAAAT TTTGGGGGGG 15551 AAA.ACATATC AGGGAA.AGGA GGGTTAAACT CCTGCCTTTG GCTCCCAAAG TTTTGTATAG TCCCTTTCCT CCCAATTTGA GGACGGAA.AC CGAGGGTTTC 15601 CCAAGATTCT GCCCAAACTG CCCCCTGTAA TGCTATTAAA GCATGA.AAAT GGTTCTAAGA CGGGTTTGAC GGGGGACATT ACGATAATTT CGTACTTTTA CAATCp~3AAA 15651 TTGATTTTCA AAAAGTAAGT CAGAGTGACA TATTAATGAC GTTAGTTTTT AACTAAAAGT TTTTCATTCA GTCTCACTGT ATAATTACTG 15701 ATAGCCCACA TACCTTAATA TAGTACATTA CTTAACTCGA CTAATCAACA TATCGGGTGT ATGGAATTAT ATCATGTAAT GAATTGAGCT GATTAGTTGT 15751 TTAATAGATT ATTCCCTACT ATCATTATTA TCTATGTATA ATCCTCATTA AATTATCTAA TAAGGGATGA TAGTAATAAT AGATACATAT TAGGAGTAAT 15801 ATCTATATTC CACTATATCA TAACATACTA TGCTTAATAC TCATTAATAT TAGATATAAG GTGATATAGT ATTGTATGAT ACGAATTATG AGTAATTATA 15851 ACTATCCACT ATTTCATTAC ATTATATTCT TTAGCCCCCA TAAATATATA TGATAGGTGA TAAAGTAATG TAATATAAGA AATCGGGGGT ATTTATATAT 15901 ATCAATATTT TCATATCATA TAATTATTTA TTCAACCCTT AATTACTTAA 
TAGTTATAAA AGTATAGTAT ATTAATA.AAT AAGTTGGGAA TTAATGAATT 15951 ATTATATATT ATGCGGGTTG GTAAGAACAT CACAACCCGC TATTGTAAGA TAATATATAA TACGCCCAAC CATTCTTGTA GTGTTGGGCG ATAACATTCT 16001 TAGC TCTATTTGTG GCACTATACT CGATGAATCC CCATTAATTG TTTTTTATCG AGATAAACAC CGTGATATGA GCTACTTAGG GGTAATTAAC ATCA~A,AACTG 16051 GCATCTGATT AATGCTTGAA ATACCATAAT CCTTAATCGC TAGTTTTGAC CGTAGACTAA TTACGAACTT TATGGTATTA GGAATTAGCG 16101 GTCAAGAATG CCAGATCCGC TAGTTCCCTT TAATGGCACC TTCGTCCTTG CAGTTCTTAC GGTCTAGGCG ATCAAGGGAA ATTACCGTGG AAGCAGGAAC 16151 ATCGTCTCAA GATTTATCGT CCGCCCTACA ATTTTTTTCG GGGATGAAGC TAGCAGAGTT CTAAATAGCA GGCGGGATGT T GC CCCTACTTCG 16201 AATTACTCAG CCCGGGAGGG CTGATCTGGA ACACCGAAAT AA.ATTTGAAT TTAATGAGTC GGGCCCTCCC GACTAGACCT TGTGGCTTTA TTTAAACTTA 16251 CCACCTCGAC ATCTACTTAA TATACCCATT ACTTATCATT CATGGATCGT GGTGGAGCTG TAGATGAATT ATATGGGTAA TGAATAGTAA GTACCTAGCA 16301 AATTGTCAAG TTGACCAATA CTGAGAGGGA TAGAGAAACT GACGCCATAG TTAACAGTTC AACTGGTTAT GACTCTCCCT ATCTCTTTGA CTGCGGTATC 16351 GCGACAAGTT TCGATTTTTT TGATTAATGA AACTATGGTT T TAC AGC T CGCTGTTCAA ACTAATTACT TTGATACCAA ATTTTTTATG 16401 ATTCTCTTAA TCCTCATCAA A.A.AAGCAATT CGTAATAAAT ATTAATGTAA TAAGAGAATT AGGAGTAGTT TTTTCGTTAA GCATTATTTA TAATTACATT 332

16451 GGCGCATTGA ATAATCCTAG TACATCATTC ACTTTACTGG GCATAAATTT CCGCGTAACT TATTAGGATC ATGTAGTAAG TGAAATGACC CGTATTTAAA 16501 ATTATTATTA GGTTTCCCCC TAGGTTGTGA AAAATTTTCA GCCGCCTAAA TAATAATAAT CCAAAGGGGG ATCCAACACT TTTTAAAAGT CGGCGGATTT 16551 AAAAAAAACA TTTTTTTGGT AAAAACCCCC CTCCCCCTAA TATACACGGA TTTTTTTTGT AAAAAAACCA TTTTTGGGGG GAGGGGGATT ATATGTGCCT 16601 CTCCTCGAAA AACCCCTAAA ACGAAGGCCG GACATATATT TTTAAATTAG GAGGAGCTTT TTGGGGATTT TGCTTCCGGC CTGTATATAA AAATTTAATC 16651 CATGCGGAAT ATATTTTGTA TATATAT GTACGCCTTA TATAAAACAT ATATATA

tRNA 1..70
    product = tRNA-Phe
rRNA 69..1022
    product = 12S ribosomal RNA
tRNA 1023..1094
    product = tRNA-Val
rRNA 1095..2761
    product = 16S ribosomal RNA
tRNA 2762..2836
    product = tRNA-Leu
gene 2837..3811
    gene = ND1
    product = NADH dehydrogenase subunit 1
tRNA 3814..3882
    product = tRNA-Ile
tRNA 3881..3952
    product = tRNA-Gln
tRNA 3953..4021
    product = tRNA-Met
gene 4022..5065
    gene = ND2
    product = NADH dehydrogenase subunit 2
tRNA 5065..5135
    product = tRNA-Trp
tRNA complement (5137..5205)
    product = tRNA-Ala
tRNA complement (5206..5278)
    product = tRNA-Asn
tRNA complement (5312..5378)
    product = tRNA-Cys
tRNA complement (5380..5449)
    product = tRNA-Tyr
gene 5461..7004
    gene = COI
    product = cytochrome c oxidase subunit 1
tRNA complement (7006..7076)
    product = tRNA-Ser
tRNA 7081..7150
    product = tRNA-Asp
gene 7158..7848
    gene = CO2
    product = cytochrome c oxidase subunit 2
tRNA 7849..7922
    product = tRNA-Lys
gene 7924..8091
    gene = ATP8
    product = ATP synthase F0 subunit 8
gene 8082..8765
    gene = ATP6
    product = ATP synthase F0 subunit 6
gene 8765..9550
    gene = CO3
    product = cytochrome c oxidase subunit 3
tRNA 9553..9622
    product = tRNA-Gly
gene 9623..9973
    gene = ND3
    product = NADH dehydrogenase subunit 3
tRNA 9972..10041
    product = tRNA-Arg
gene 10042..10338
    gene = ND4L
    product = NADH dehydrogenase subunit 4L
gene 10332..11712
    gene = ND4
    product = NADH dehydrogenase subunit 4
tRNA 11713..11781
    product = tRNA-His
tRNA 11782..11848
    product = tRNA-Ser
tRNA 11849..11920
    product = tRNA-Leu
gene 11921..13759
    gene = ND5
    product = NADH dehydrogenase subunit 5
gene complement (13745..14266)
    gene = ND6
    product = NADH dehydrogenase subunit 6
tRNA complement (14267..14336)
    product = tRNA-Glu
gene 14339..15484
    gene = CYTB
    product = cytochrome b
tRNA 15483..15556
    product = tRNA-Thr
tRNA complement (15559..15627)
    product = tRNA-Pro
D-Loop 15629..16677

Mitsukurina owstoni mitochondrion, complete genome

GCTAGTGTAG CTTAATTTAA AGCATAGCAC TGAAAATGCT AATATGAAAA CGATCACATC GAATTAAATT TCGTATCGTG ACTTTTACGA TTATACTTTT ATGAAAATTT TCCCCAAGCA TGAAGATTTG GTCCTGGCCT CAGTATTAAT TACTTTTAAA AGGGGTTCGT ACTTCTAAAC CAGGACCGGA GTCATAATTA TGCAACTAAA ATTATACATG CAAGTTTCAG CATCCCTGTG AGAATGCCCT

ACGTTGATTT TAATATGTAC GTTCA.AAGTC GTAGGGACAC TCTTACGGGA 151 AATTACTCTA TCAATTAATT AGGAGCGGGT ATCAGGCACA CACACGTAGC TTAATGAGAT AGTTAATTAA TCCTCGCCCA TAGTCCGTGT GTGTGCATCG 201 CCAAGACATC TTGCTAAGCC ACACCCCCAA GGGATTTCAG CAGTAATAGA GGTTCTGTAG AACGATTCGG TGTGGGGGTT CCCTAAAGTC GTCATTATCT 251 TATTGACACA TAAGCGAAAG CTTGAGTCAG TTA.A.AGTTAA CAGAGTTGGT ATAACTGTGT ATTCGCTTTC GAACTCAGTC AATTTCAATT GTCTCAACCA 3 01 A.AATCTCGTG CCAGCCACCG CGGTTATACG AGTAACTTAC ATTAATATTT TTTAGAGCAC GGTCGGTGGC GCCAATATGC TCATTGAATG TAATTATAAA 351 CCCCGGCGTA AAGAGTGATT TAAGGAATAT CTACATAACT AAAGTGAAGA GGGGCCGCAT TTCTCACTAA ATTCCTTATA GATGTATTGA TTTCACTTCT 401 CCTAATGAAG CTGTTACACG TACCCATAAA TGGA.AACATC CACAACGAAA GGATTACTTC GACAATGTGC ATGGGTATTT ACCTTTGTAG GTGTTGCTTT 451 GTGACTTTAC ATAACCAGAA ATCTTGATGT CACGACAGTT AGACCCCAAA CAC TGAA.ATG TATTGGTCTT TAGAACTACA GTGCTGTCAA TCTGGGGTTT 501 CTAGGATTAG ATACCCTACT ATGTCTAACC ACAA.ACTTAA ACAATAACTT GATCCTAATC TATGGGATGA TACAGATTGG TGTTTGAATT TGTTATTGAA 551 ACCTATATTG TTCGCCAGAG TACTACAAGC GCTAGCTTAA AACCCAAAGG TGGATATAAC AAGCGGTCTC ATGATGTTCG CGATCGAATT TTGGGTTTCC 601 ACTTGGCGGT GTCCCAA.ACC CACCTAGAGG AGCCTGTTCT ATAACCGATA TGAACCGCCA CAGGGTTTGG GTGGATCTCC TCGGACAAGA TATTGGCTAT 651 ATCCCCGTTA AACCTCACCA CTTCTGGCCA TCCCCGTCTA TATACCGCCG TAGGGGCAAT TTGGAGTGGT GAAGACCGGT AGGGGCAGAT ATATGGCGGC 701 TCGTCAGCTC ACCCTATGAA GGTTp~~G CAAGCAAAAA GAACTAACTC AGCAGTCGAG TGGGATACTT CCAATTTTTC GTTCGTTTTT CTTGATTGAG 751 CTATACGTCA GGTCGAGGTG TAGCA.AATGA AGTGGGAAGA AATGGGCTAC GATATGCAGT CCAGCTCCAC ATCGTTTACT TCACCCTTCT TTACCCGATG 801 ATTTTCTATA AAGAAAACAC GAATGGAAAA C TGP.►AAAAC T ACTTAAAGGT TAAAAGATAT TTCTTTTGTG CTTACCTTTT GACTTTTTGA TGAATTTCCA 851 GGATTTAGCA GTAAGAGGAG ACCAGAGAGC TTCTCTGAAA TCGGCTCTGG C GT C TAA.ATC CATTCTCCTC TGGTCTCTCG AAGAGACTTT AGCCGAGACC 901 GACGCGCACA CACCGCCCGT CACTCTCCTC TCT ATTCATTTTT CTGCGCGTGT GTGGCGGGCA GTGAGAGGAG TTTTTTTAGA TAAGTP.~AAAA 951 AATTAAAAGA GAATCCCCAA GAGGAGGCAA GTCGTAACAT GGTAAGTGTA TTAATTTTCT CTTAGGGGTT CTCCTCCGTT CAGCATTGTA 
CCATTCACAT 1001 CTGGAAAGTG CACTTGGAAT CAAAATGTAG CTAAATTAAC AAAGTACCTC GACCTTTCAC GTGAACCTTA GTTTTACATC GATTTAATTG TTTCATGGAG 1051 ACTTACACCG AGGAGATATC CGTACAATTC GGGTCATTTT GAACATTAAA TGAATGTGGC TCCTCTATAG GCATGTTAAG C C CAGTA.A.AA CTTGTAATTT 1101 ATTAGCCTGA CCACCCACCT GAAC TA.AAC C ATATTAACTA CCTCACATAT TAATCGGACT GGTGGGTGGA CTTGATTTGG TATAATTGAT GGAGTGTATA 1151 TAATTCCTAA CTAAAACATT TTTAATTTTT AGTATGGGTG ACAGAACA.AA ATTAAGGATT GATTTTGTAA AAATTP►~~AAA TCATACCCAC TGTCTTGTTT 1201 AACTCAGCGC AATA.AACATG TACCGCAAGG GAAAACTGAA AAAGAA.ATGA TTGAGTCGCG TTATTTGTAC ATGGCGTTCC CTTTTGACTT TTTCTTTACT 12 51 p~~AAATAATT AAAGTAATAA AAAGCAGAGA CTCAACCTCG TACCTTTTGC TTTTTATTAA TTTCATTATT TTTCGTCTCT GAGTTGGAGC ATGGAAAACG AGCTAGAA.AA 1301 ATCATGATTT ACTAGACAAA GAGATCTTAA GCCTACCTTC TAGTACTAA.A TCGATCTTTT TGATCTGTTT CTCTAGAATT CGGATGGAAG 1351 CCGAAACTAA ACGAGCTACT CCGAAGCAGC ACAACTTAGA GCCAACCCAT GGCTTTGATT TGCTCGATGA GGCTTCGTCG TGTTGAATCT CGGTTGGGTA 1401 CTCTGTGGCA AAAGAGTGGG AAGACTTCCG AGTAGCGGTG ATAAGCCTAT GAGACACCGT TTTCTCACCC TTCTGAAGGC TCATCGCCAC TATTCGGATA 1451 CGAGTTTAGT GATAGCTGGT TACCCAAGAA AAGAACTTTA ATTCTGCATT GCTCA.AATCA CTATCGACCA ATGGGTTCTT TTCTTGAAAT TAAGACGTAA 335

ACATP~T 1501 AATTTATTTT ATAC CP~~AAA GTCTACCTTA TTAAGGTTAA TTA.AATA.A.AA TATGGTTTTT CAGATGGAAT AATTCCAATT TGTATTTTTA 1551 TAATAGTTAT TCAAAAGAGG GACAGCCCTT CTGAACTAAG ATACAACTTT ATTATCAATA AGTTTTCTCC CTGTCGGGAA GACTTGATTC TATGTTGAAA 1601 TTAAGGTGGA TAATGATCAT ATTTATTAAG GTTATTACCC CAGTGGGCCT AATTCCACCT ATTACTAGTA TAAATAATTC CAATAATGGG GTCACCCGGA TCACA~AAACC 1651 AAAAGCAGCC ATCTGTAA~G TAAGCGTCAC AGCTCCAATC TTTTCGTCGG TAGACATTTC ATTCGCAGTG TCGAGGTTAG AGTGTTTTGG 1701 CTATAATTTA GATATTCCTT CACAACCCCC TTAATTATAT TGGGTTATTT AATAA.A GATATTAAAT CTATAAGGAA GTGTTGGGGG AATTAATATA AC C C 1751 TATAAAACTA TA.AAAGAAC T TATGCTAAAA TGAGTAATAA GAGGP►~~AAAC ATATTTTGAT ATTTTCTTGA ATACGATTTT ACTCATTATT CTCCTTTTTG 1801 CTCTCCAGAC ACAAGTGTAT GTCAGAAAGA ATTAAATCAC TGATAACTAA GAGAGGTCTG TGTTCACATA CAGTCTTTCT TAATTTAGTG ACTATTGATT 1851 ACGAACCCAA ATTGAGGCCA TTATAATAAT ATTTCCTTAA CTAGAAAATC TGCTTGGGTT TAACTCCGGT AATATTATTA TAAAGGAATT GATCTTTTAG 1901 CTATTATAAT ATTCGTTAAC CCTACACAGG AGCGTCCCAA GGAAAGATTA GATAATATTA TAAGCAATTG GGATGTGTCC TCGCAGGGTT CCTTTCTAAT 1951 AAAGP~AAATA AAGGAACTCG GCAAACATAA ACTCCGCCTG TTTAC CAAA.A TTTCTTTTAT TTCCTTGAGC CGTTTGTATT TGAGGCGGAC A.AATGGTTTT 2001 ACATCGCCTC TTGCAAAACC ATAAGAGGTC CCGCCTGCCC TGTGACAATG TGTAGCGGAG AACGTTTTGG TATTCTCCAG GGCGGACGGG ACACTGTTAC 2051 TTTAACGGCC GCGGTATTTT GACCGTGCAA AGGTAGCGTA ATCACTTGTC AAATTGCCGG C GC CATAA.AA CTGGCACGTT TCCATCGCAT TAGTGAACAG 2101 TTTTAAATGA AGACCCGTAT GA.AAGGCATC ACGAGAGTTT AACTGTCTCT AAAATTTACT TCTGGGCATA CTTTCCGTAG TGCTCTCAAA TTGACAGAGA 2151 ATTTTCTAAT CAATGA.AATT GATCTACTCG TGCAGAAGCG AGTATAACAA TAA.AAGATTA GTTACTTTAA CTAGATGAGC ACGTCTTCGC TCATATTGTT 2201 CATTAGACGA GAAGACCCTA TGGAGCTTCA AACACTTAAA TTAACTATGT GTAATCTGCT CTTCTGGGAT ACCTCGAAGT TTGTGAATTT AATTGATACA 2251 AAATTAACTA TTCCACGGAA ATA.AACA.AA.A ATATAATATT TTTAATTTAA TTTAATTGAT AAGGTGCCTT TATTTGTTTT TATATTATAA AAATTAA.ATT 2301 CTGTTTTTGG TTGGGGTGAC CAAGGGGAA.A AACAAATCCC CCTTATCGAC GAC P►~~AAAC C AACCCCACTG GTTCCCCTTT TTGTTTAGGG GGAATAGCTG 2351 
TGAGTACTCT TAGTGCTTAA AAATTAGAAT CACAATTCTA ATTAATAAA.A ACTCATGAGA ATCACGAATT TTTAATCTTA GTGTTAAGAT TAATTATTTT 2401 TATTTATCGA A.A.AAT GAC C C AGGATTTCCT GATCAATGAA CCAAGTTACC ATAAATAGCT TTTTACTGGG TC C TAA.AGGA CTAGTTACTT GGTTCAATGG 2451 CTAGGGATAA CAGCGCAATC CTTTCTCAGA GTCCTTATCG CCGAAAGGGT GATCCCTATT GTCGCGTTAG GA.AAGAGTC T CAGGAATAGC GGCTTTCCCA 2501 TTACGACCTC GATGTTGGAT CAGGACATCC TAATGATGCA ACCGTTATTA AATGCTGGAG CTACAACCTA GTCCTGTAGG ATTACTACGT TGGCAATAAT 2551 AGGGTTCGTT TGTTCAACGA TTAACAGTCC TACGTGATCT GAGTTCAGAC TCCCAAGCAA ACAAGTTGCT AATTGTCAGG ATGCACTAGA CTCAAGTCTG 2601 C GGAGAA.ATC CAGGTCAGTT TCTATCTATG AATTAATTTT TCCCAGTACG GCCTCTTTAG GTCCAGTCAA AGATAGATAC TTAATTAAAA AGGGTCATGC 2651 AAAGGACCGG P►~~?~AATGGAG CCAATACCTC AGGCACGCTC CATTTTCATC TTTCCTGGCC TTTTTACCTC GGTTATGGAG TCCGTGCGAG GTAAAAGTAG 2701 TATTGAAACA AAC TAAA.ATA GATAAGAAAA ATCTATCTAT ATACCCAAGA ATAACTTTGT TTGATTTTAT CTATTCTTTT TAGATAGATA TATGGGTTCT 2751 AAAGGGTTGT TGGTGTGGCA GAGCCTGGTA AGTGCAAGAG ACCTAAACTC TTTCCCAACA ACCACACCGT CTCGGACCAT TCACGTTCTC TGGATTTGAG 2801 TTTAATCCAG AGGTTCAAAT CCTCTCTCCA ACTATGCTTG AA.A000TCTT AAATTAGGTC TCCAAGTTTA GGAGAGAGGT TGATACGAAC TTTGGGAGAA 2851 ACTTTACTTA ATTAATCCAC TTACTTACAT TATCCCCATC TTACTAGCTA 336

TGAAATGAAT TAATTAGGTG AATGAATGTA ATAGGGGTAG AATGATCGAT 2901 CAGCTTTCCT CACCTTAATT GAACGP.►AAA.A TTCTTGGCCA TATACAACTT GTC GAA.AGGA GTGGAATTAA CTTGCTTTTT AAGAACCGGT ATATGTTGAA 2951 CGCA.A.AGGCC CTAATATTGT AGGTCCACAT GGACTCCTAC AACCAATTGC GCGTTTCCGG GATTATAACA TCCAGGTGTA CCTGAGGATG TTGGTTAACG 3001 AGACGGCCTA AAACTATTTA TTA.AAGAAC C CATTCATCCA TTAACATCTT TCTGCCGGAT TTTGATAAAT AATTTCTTGG GTAAGTAGGT AATTGTAGAA 3051 CTCCATTCCT ATTCCTAGCT ACCCCCACAA TAGCCCTAAT ACTAGCCCTC GAGGTAAGGA TAAGGATCGA TGGGGGTGTT ATCGGGATTA TGATCGGGAG 3101 CTTATATGAA TACCCCTCCC TCTTCCCCAC GCAATTATTA ACCTAAATTT GAATATACTT ATGGGGAGGG AGAAGGGGTG CGTTAATAAT TGGATTTAAA 3151 AGGCTTACTA TTCATTCTAG CAGTTTCAAG TTTAACCGTC TATACTATTT TCCGAATGAT AAGTAAGATC GTCAAAGTTC AAATTGGCAG ATATGATAAA 3201 TAGGCTCTGG ATGAGCATCC AATTCAAAAT ACGCCCTCAT AGGGGCCCTA ATCCGAGACC TACTCGTAGG TTAAGTTTTA TGCGGGAGTA TCCCCGGGAT 3251 CGAGCCGTAG CACAA.ACAAT CTCCTACGAA ATCAGCCTCG GACTAATCCT GCTCGGCATC GTGTTTGTTA GAGGATGCTT TAGTCGGAGC CTGATTAGGA 3301 TTTATCAATA ATCATCTTTA CAGGAGGATT CACCCTCCAT ACCTTCAACC AAATAGTTAT TAGTAGAAAT GTCCTCCTAA GTGGGAGGTA TGGAAGTTGG 3351 TTGCACAAGA AACAATTTGA CTTCTTATTC CAGGATGACC ACTAGCCCTA AACGTGTTCT TTGTTA.AACT GAAGAATAAG GTCCTACTGG TGATCGGGAT 3401 ATATGATATG TTTCAACCCT AGCAGAA.ACT AACCGAGTAC CATTTGACTT TATACTATAC AAAGTTGGGA TCGTCTTTGA TTGGCTCATG GTAA.AC TGAA 3451 AACAGAGGGG GAATCAGAAT TAGTTTCAGG ATTTAATACC GAATACGCAG TTGTCTCCCC CTTAGTCTTA ATCAA.AGTC C TAAATTATGG CTTATGCGTC 3501 GAGGGTCATT TGCCCTATTT TTCCTTGCTG AATACACAAA TATCTTACTA CTCCCAGTAA ACGGGATA.AA AAGGAACGAC TTATGTGTTT ATAGAATGAT 3551 ATAAATGCCC TCTCAGTCAT CCTATTTATA GGCTCCTCCT ACAACCCACT TATTTACGGG AGAGTCAGTA GGATA.AATAT CCGAGGAGGA TGTTGGGTGA 3601 CCTCCCCCAA ATCTCAACAT TTAACTTAAT AATP~~AAACA ACCCTATTAA GGAGGGGGTT TAGAGTTGTA AATTGAATTA TTATTTTTGT TGGGATAATT 3651 CCTTACTTTT CCTATGAATC CGAGCATCAT ACCCTCGCTT CCGCTACGAC GGAATGA,AAA GGATACTTAG GCTCGTAGTA TGGGAGCGAA GGCGATGCTG 3701 CAACTCATAC ACTTAGTATG CTTT TTACCCCTAA CTTTAGCAAT GTTGAGTATG TGAATCATAC 
TTTTTTGA.AA AATGGGGATT GAAATCGTTA 3751 TATACTATGA CATATCACCC TACCCATAAC CACAGCAAGC CTACCCCCAC ATATGATACT GTATAGTGGG ATGGGTATTG GTGTCGTTCG GATGGGGGTG 3801 TAAC C TP~AAA ACGGAAGCGT GCCTGAATAA AGGACCACTT TGATAGGGTG ATTGGATTTT TGCCTTCGCA CGGACTTATT TCCTGGTGAA ACTATCCCAC 3851 GATAATGAAA GTTACAACCT TTCCTCTTCC TAGP.~A.AA,ATA GGATTTGAAC CTATTACTTT CAATGTTGGA AAGGAGAAGG ATCTTTTTAT CCTAAACTTG 3901 CTATACCTAA GAGATCAAAA CTCTTTATGC TTCCAATTAT ACTACTTCCT GATATGGATT CTCTAGTTTT GAGAA.ATAC G AAGGTTAATA TGATGAAGGA 3951 AAGTAAAGTC AGCTAACAAA GCTTTTGGGC CCATACCCCA ACCATGTTGA TTCATTTCAG TCGATTGTTT C GAA.AAC C C G GGTATGGGGT TGGTACAACT 4001 TTA.AA.ATC C T TCCTTTACTA ATGAACCCAA TTGTATTAAC CATTATCATT AATTTTAGGA AGGA.AATGAT TACTTGGGTT AACATAATTG GTAATAGTAA 4051 TCAAGCCTAG GCCTAGGAAC TATCCTTACA TTTACTGGTT CACACTGACT AGTTCGGATC CGGATCCTTG ATAGGAATGT AAATGACCAA GTGTGACTGA 4101 TCTAGTATGA ATAGGTCTCG AAATTAATAC TCTAGCCATC ATCCCCCTAA AGATCATACT TATCCAGAGC TTTAATTATG AGATCGGTAG TAGGGGGATT 4151 TAATTCGTCA ACACCATCCC CGAGCAGTAG AAGCCTCCAC AAAATACTTC ATTAAGCAGT TGTGGTAGGG GCTCGTCATC TTCGGAGGTG TTTTATGAAG 4201 ATCACACAAG CAACTGCCTC AGCCTTACTT TTATTTGCTA GC GTAACA.AA TAGTGTGTTC GTTGACGGAG TCGGAATGAA AATAA.AC GAT CGCATTGTTT 337

4251 CGCTTGGACT TCAGGTGAAT GAAGTTTAAT TGAAATAACT AATCCAGGCT GCGAACCTGA AGTCCACTTA CTTCA.AATTA ACTTTATTGA TTAGGTCCGA 4301 CTGCCACACT TGTCACAATT GCACTAGCAC TP.~~AAATTGG CCTAGCCCCC GACGGTGTGA ACAGTGTTAA CGTGATCGTG ATTTTTAACC GGATCGGGGG 4351 CTCCATTTCT GACTCCCTGA TGTCCTTCAA GGCCTAGACC TCACCACCGG GAGGTAAAGA CTGAGGGACT ACAGGAAGTT CCGGATCTGG AGTGGTGGCC 4401 CCTCATCCTT TCCACATGGC P.~AAAAC TC GC CCCATTCGCC ATCCTTTTAC GGAGTAGGAA AGGTGTACCG TTTTTGAGCG GGGTAAGCGG TAGGAAAATG 4451 AACTTTACCC TTCACTTAAT TCCAATTTAC TAGTCTTTCT CGGAGTCCTC TTGAAATGGG AAGTGAATTA AGGTTAAATG ATCAGAAAGA GCCTCAGGAG 4501 TCAACCATAA TCGGGGGATG AGGTGGACTA AACCAAACCC AAC TAC GA.AA AGTTGGTATT AGCCCCCTAC TCCACCTGAT TTGGTTTGGG TTGATGCTTT 4551 AATCCTAGCC TACTCCTCAA TCGCCCATCT TGGTTGAATA ATCACAATCC TTAGGATCGG ATGAGGAGTT AGCGGGTAGA ACCAACTTAT TAGTGTTAGG 4601 TACACTTCTC CCCCAATTTA ACCCAACTAA ATTTAATCCT TTACATTATT ATGTGAAGAG GGGGTTA.AAT TGGGTTGATT TAAATTAGGA AATGTAATAA 4651 ATAACATCAA CAACCTTTCT CCTGTTTAAA ATATTTAACT CAAC TAAA.AT TATTGTAGTT GTTGGAA.AGA GGACAAATTT TATAAATTGA GTTGATTTTA 4701 CAATTCTATC TCCTCCTCTT CATCTAAATC TCCCCTACTA TCTACCATTG GTTAAGATAG AGGAGGAGAA GTAGATTTAG AGGGGATGAT AGATGGTAAC 4751 CCCTAATGAC CCTCCTCTCC CTAGGGGGGT TACCTCCACT TACAGGCTTT GGGATTACTG GGAGGAGAGG GATCCCCCCA ATGGAGGTGA ATGTCCGAAA 4801 ATAC C GA,A.AT GACTAATTTT GCAAGA.AATA ACAAAACA.A.A ACCTAACCAC TATGGCTTTA CTGATTAAAA CGTTCTTTAT TGTTTTGTTT TGGATTGGTG 4851 CCCAGCCATT ATTATAGCTA TAATAACTCT CCTCAGCCTA TTCTTTTACC GGGTCGGTAA TAATATCGAT ATTATTGAGA GGAGTCGGAT AAGAAAATGG 4901 TACGCCTATG CTATTCCACA ACACTAACCA TAGCCCCTAA CCCAATTAAC ATGCGGATAC GATAAGGTGT TGTGATTGGT ATCGGGGATT GGGTTAATTG 4951 ATAATAACAT CATGACGAAC TAAACTACCC CCCAACCTCA CCATAACAAC TATTATTGTA GTACTGCTTG ATTTGATGGG GGGTTGGAGT GGTATTGTTG 5001 AACTACCTCA TTATCAATTT TACTTCTACC AATTACCCCA GCTATCCTCA TTGATGGAGT AATAGTTAAA ATGAAGATGG TTAATGGGGT CGATAGGAGT 5051 TATTACTTTC TTAAGA.AATT TAGGTTAACA ACAGACCA.A.A AGCCTTCAA.A ATAATGAAAG AATTCTTTAA ATCCAATTGT TGTCTGGTTT TCGGAAGTTT 5101 GCCTTAAGTA 
GAAGAGAAAA TCTCCTAATT TCTGTTAAGA TCTGCAAGAT CGGAATTCAT CTTCTCTTTT AGAGGATTAA AGACAATTCT AGACGTTCTA 5151 TTTATCCCAC ATCTTCTGAA TGCAACCCAG ATGCTTTAAT TAAGC TP~AAA AA.ATAGGGTG TAGAAGACTT ACGTTGGGTC TACGAAATTA ATTCGATTTT 5201 CCTTCTAGAT AAATAGGCCT TGATCCTACA AGATCTTAGT CAACAGCTAA GGAAGATCTA TTTATCCGGA ACTAGGATGT TCTAGAATCA GTTGTCGATT 5251 GCGTTCAATC CAGCGAACTT TTACCTAAAC TTTCTCCCGC C GAA.A.AGAAC CGCAAGTTAG GTCGCTTGAA AATGGATTTG A.AAGAGGGC G GCTTTTCTTG 5301 AA.AGGCGGGA GAAAGCCCCA GGAGGAACTA ATCTCCGGTT TTGGGTTTGC TTTCCGCCCT CTTTCGGGGT CCTCCTTGAT TAGAGGCCAA AACCCAAACG 5351 AACCCAACGT AACTGTCTAC TGCAGGGCTA TGGCAAGAAG AGGAATTTGA TTGGGTTGCA TTGACAGATG ACGTCCCGAT ACCGTTCTTC TCCTTA.AACT 5401 CCTCCGTACA CGGAGCTACA ATCCGCCACT TAGTTCTCAG TCACCTTACC GGAGGCATGT GCCTCGATGT TAGGCGGTGA ATCAAGAGTC AGTGGAATGG 5451 TGTGGCAATT AATCGTTGAC TTTTTTCTAC AA.ACCACAAA GATATTGGCA ACACCGTTAA TTAGCAACTG GATG TTTGGTGTTT CTATAACCGT 5501 CCCTGTATTT AATCTTTGGT GCATGAGCAG GAATAGTGGG AACAGCCCTA GGGACATA.A.A TTAGAAACCA CGTACTCGTC CTTATCACCC TTGTCGGGAT 5551 AGCCTACTAA TTCGAGCTGA ACTAGGGCAG CCTGGGTCTC TCCTAGGAGA TCGGATGATT AAGCTCGACT TGATCCCGTC GGACCCAGAG AGGATCCTCT 5601 TGATCACATC TATAATGTTA TTGTTACCGC CCATGCATTT GTAATAATTT 338

ACTAGTGTAG ATATTACAAT AACAATGGCG GGTACGTAA.A CATTATTAAA 5651 TCTTCATAGT AATACCCGTA ATAATTGGTG GCTTTGGA.AA TTGACTAGTA AGAAGTATCA TTATGGGCAT TATTAACCAC C GA.AAC C TTT AACTGATCAT 5701 CCATTAATAA TTGGTGCACC AGATATAGCC TTCCCACGAA TAAACAACAT GGTAATTATT AACCACGTGG TCTATATCGG AAGGGTGCTT ATTTGTTGTA 5751 AAGCTTTTGA CTTCTTCCCC CCTCTTTCCT TTTACTCCTA GCTTCAGCC.G TTCGA.AA.ACT GAAGAAGGGG GGAGAA.AGGA AAATGAGGAT CGAAGTCGGC 5801 GAGTTGAAGC TGGGGCCGGT ACTGGCTGAA CAGTTTATCC ACCCTTAGCT CTCAACTTCG ACCCCGGCCA TGACCGACTT GTCAAATAGG TGGGAATCGA 5851 GGTAATTTAG CACACGCTGG GGCATCCGTA GACTTAACTA TTTTTTCTTT CCATTAAATC GTGTGCGACC CCGTAGGCAT CTGAATTGAT GAAA 5901 ACATTTAGCA GGTATTTCAT CAATTTTAGC CTCAATTAAT TTTATTACAA TGTAAATCGT CCATAAAGTA GTTA.A.AATC G GAGTTAATTA AAATAATGTT 5951 CTATTATTAA TATP~AAAC CA CCAGCTATCT CCCAATACCA AACACCATTA GATAATAATT ATATTTTGGT GGTCGATAGA GGGTTATGGT TTGTGGTAAT 6001 TTTGTATGAT CAATCTTAGT AACAACCGTC CTCCTTCTAT TAGCACTCCC AAACATACTA GTTAGAATCA TTGTTGGCAG GAGGAAGATA ATCGTGAGGG 6051 AGTCCTTGCA GCCGGTATTA CAATATTACT CACTGACCGA AACCTAAACA TCAGGAACGT CGGCCATAAT GTTATAATGA GTGACTGGCT TTGGATTTGT 6101 CAACATTCTT TGACCCAGCA GGAGGAGGGG ATCCAATTTT ATACCAACAT GTTGTAAGAA ACTGGGTCGT CCTCCTCCCC TAGGTTAAAA TATGGTTGTA 6151 CTATTTTGAT TTTTTGGTCA CCCAGAAGTT TATATTTTAA TTCTCCCAGG GATAAAACTA P►~~;?~AAC CAGT GGGTCTTCAA ATAT~TT AAGAGGGTCC 6201 CTTTGGAATA ATCTCCCATG TAGTAGCTTA TTACTCCGGT GAAC GAAACCTTAT TAGAGGGTAC ATCATCGAAT AATGAGGCCA TTTTTTCTTG 6251 CATTCGGTTA TATAGGCATA GTCTGAGCAA TAATAGCAAT CGGACTATTA GTAAGCCAAT ATATCCGTAT CAGACTCGTT ATTATCGTTA GCCTGATAAT 6301 GGTTTTATTG TATGAGCCCA CCATATATTT ACAGTAGGAA TGGACGTTGA CCAAAATAAC ATACTCGGGT GGTATATA.AA TGTCATCCTT ACCTGCAACT 6351 TACACGAGCT TATTTTACCT CAGCAACAAT AATTATTGCT ATCCCCACAG ATGTGCTCGA ATP~AAATGGA GTCGTTGTTA TTAATAACGA TAGGGGTGTC 6401 GC GTAAAAGT ATTTAGCTGA CTAGCAACTC TTCACGGAGG CTCTATCAAA CGCATTTTCA TA.AATC GAC T GATCGTTGAG AAGTGCCTCC GAGATAGTTT 6451 TGAGAAGCCC CATTATTATG AGCCCTTGGG TTCATCTTTT TATTTACAGT ACTCTTCGGG GTAATAATAC TCGGGAACCC 
AAGTAGp~AAA ATAAATGTCA 6501 AGGGGGATTA ACAGGTATTG TTCTAGCTAA CTCCTCTTTA GACATCGTAC TCCCCCTAAT TGTCCATAAC AAGATCGATT GAGGAGA.AAT CTGTAGCATG 6551 TTCATGATAC TTATTACGTA GTAGCTCACT TCCATTATGT CCTTTCAATA AAGTACTATG AATAATGCAT CATCGAGTGA AGGTAATACA GGAA.AGTTAT 6601 GGAGCAGTAT TTGCCATCAT AGCAGGATTT ATCCACTGAT TTCCTCTTAT CCTCGTCATA AACGGTAGTA TCGTCCTAAA TAGGTGACTA AAGGAGAATA 6651 CTCTGGCTTC ACCTTACATT CAACATGAAC P~A,AAAC C CAA TTTGTAGTTA GAGACCGAAG TGGAATGTAA GTTGTACTTG TTTTTGGGTT AAACATCAAT 6701 TGTTTATTGG AGTAAATTTA ACATTCTTCC CACAACATTT CCTAGGTCTT ACAAATAACC TCATTTA.AAT TGTAAGAAGG GTGTTGTAAA GGATCCAGAA 6751 GCTGGTATAC CACGACGTTA CTCAGATTAC CCAGATGCAT ACACCTTATG CGACCATATG GTGCTGCAAT GAGTCTAATG GGTCTACGTA TGTGGAATAC 6801 GAATACAGTC TCCTCTATCG GCTCTACAAT CTCACTTGTA GCAGTAATTA CTTATGTCAG AGGAGATAGC CGAGATGTTA GAGTGAACAT CGTCATTAAT 6851 TATTTTTATT TATTATCTGA GAAGCATTTG CC TCA.AAAC G AGAAGTGTTA AT~~~AAATAA ATAATAGACT CTTCGTAAAC GGAGTTTTGC TCTTCACAAT 6901 TCCATCGAAT TACCTCACAC AAACGTTGAA TGATTACATG GCTGCCCTCC AGGTAGCTTA ATGGAGTGTG TTTGCAACTT ACTAATGTAC CGACGGGAGG 6951 ACCACATCAC ACATACGAAG AACCAGCATT TGTTCAAGTC CAACGAACTT TGGTGTAGTG TGTATGCTTC TTGGTCGTAA ACAAGTTCAG GTTGCTTGAA 339

7001 TTTAAACAAG A.AAGGAAGGA ATCGAACCCC CTTATGTTAG TTTCAAGCCA AAATTTGTTC TTTCCTTCCT TAGCTTGGGG GAATACAATC AAAGTTCGGT 7051 ACCACATCAC CATTCTGTCA CTCTCTTACT AAGACTCTAG TAAAATACAT TGGTGTAGTG GTAAGACAGT GAGAGAATGA TTCTGAGATC ATTTTATGTA 7101 TACACTGCTT TGTCGA.AGCA GAATTGTGAG TTAAAATCCC ACGAATCTTA ATGTGACGAA ACAGCTTCGT CTTAACACTC AATTTTAGGG TGCTTAGAAT 7151 ATTTATAATG GCACACCCCT CACAATTAGG ATTTCAAGAC GCAGCCTCCC TAA.ATATTAC CGTGTGGGGA GTGTTAATCC TA.AAGTTCTG CGTCGGAGGG 7201 CAGTTATGGA AGAACTTATT CATTTTCACG ACCACACACT AATAATTATA GTCAATACCT TCTTGAATAA GTAA.AAGTGC TGGTGTGTGA TTATTAATAT 7251 TTTCTAATTA GCACTTTAAT TCTCTACATT ATCACAGCAA TAGTATCAAC AA.AGATTAAT CGTGAAATTA AGAGATGTAA TAGTGTCGTT ATCATAGTTG 7301 A.A.AACTTACA AACAAATACA TTCTTGATTC TCAAGAAATT GAGATTGTCT TTTTGAATGT TTGTTTATGT AAGAACTAAG AGTTCTTTAA CTCTAACAGA` 7351 GAACTATTCT CCCCGCCATC ATTCTCATTA TAATTGCCCT TCCATCCCTA CTTGATAAGA GGGGCGGTAG TAAGAGTAAT ATTAACGGGA AGGTAGGGAT 7401 CGTATCTTAT ATCTCATAGA CGAAATTAAT GAGCCCCATT TAACCATTAA GCATAGAATA TAGAGTATCT GCTTTAATTA CTCGGGGTAA ATTGGTAATT 7451 AGCTATAGGC CATCAATGAT ACTGAAGCTA TGAATACACA GATTATGAAG TCGATATCCG GTAGTTACTA TGACTTCGAT ACTTATGTGT CTAATACTTC 7501 ACCTAGGATT TGACTCTTAT ATAATCCAGA CCCAAGACTT AACCCCAGGC TGGATCCTAA ACTGAGAATA TATTAGGTCT GGGTTCTGAA TTGGGGTCCG 7551 CAATTTCGTT TATTAGAA.AC AGACAGCCGA ATAGTTGTAC CCATAGAATC GTTAAAGCAA ATAATCTTTG TCTGTCGGCT TATCAACATG GGTATCTTAG 7601 ACCTATCCGC GTATTAGTAT CAGCAGAAGA TGTATTACAT TCATGAGCTA TGGATAGGCG CATAATCATA GTCGTCTTCT ACATAATGTA AGTACTCGAT 7651 TCCCAGCCCT TGGTATTAAA ATAGATGCCG TACCAGGACG CCTAAATCAA AGGGTCGGGA ACCATAATTT TATCTACGGC ATGGTCCTGC GGATTTAGTT 7701 ACTGCCTTCA TCATCCCCCG ACCAGGTATT TATTATGGTC AATGTTCAGA TGACGGAAGT AGTAGGGGGC TGGTCCATAA ATAATACCAG TTACAAGTCT 7751 AATCTGCGGA GCTAACCATA GTTTTATGCC TATCGTAGTA GAAGCAGTCC TTAGACGCCT CGATTGGTAT CAAAATACGG ATAGCATCAT CTTCGTCAGG 7801 CATTAGAACA CTTCGAAGCC TGATCTTCAT TAATACTAGA AGAAGCCTCA GTAATCTTGT GAAGCTTCGG ACTAGAAGTA ATTATGATCT TCTTCGGAGT 7851 CTAAGAAGCT AAACTGGAAC 
AGCATTAGTC TTTTGAACTA AATATTGGTG GATTCTTCGA TTTGACCTTG TCGTAATCAG AAAACTTGAT TTATAACCAC 7901 ACTACCATCC ACCCTTAGTG AATATGCCCC AATTA.AATC C TCACCCTTGA TGATGGTAGG TGGGAATCAC TTATACGGGG TTAATTTAGG AGTGGGAACT 7951 TTTATTATTT TCCTATTTTC ATGAATAATT TTTCTTGTTA TCTTACCAAA AA.ATAATA.AA AGGATA,AA.AG TACTTATTAA AAAGAACAAT AGAATGGTTT 8001 AAAAGTAATA AACCATGTAT TTAGCAATAA CCCTACATTA AAAAGTACTA TTTTCATTAT TTGGTACATA AATCGTTATT GGGATGTAAT TTTTCATGAT 8051 P,~~PsAAC C TAA ACCCAAATCC TGAAACTGAC CATGATCATA AGCTTTTTTG TTTTTGGATT TGGGTTTAGG ACTTTGACTG GTACTAGTAT TCG C 8101 ACCAATTCCT AAGCCCATCA CTTCTTGGAA TTCCATTAAT TGCTCTAGCA TGGTTAAGGA TTCGGGTAGT GAAGAACCTT AAGGTAATTA ACGAGATCGT 8151 ATTACATTAC CATGATTAAT CTTCCCAACC CCAACGAATC GCTGACTAAA TAATGTAATG GTACTAATTA GAAGGGTTGG GGTTGCTTAG CGACTGATTT 8201 TAATCGATTA ATAACCCTCC A.AAATTGATT TATTAACCGA TTCATTTATC ATTAGCTAAT TATTGGGAGG TTTTAACTAA ATAATTGGCT AAGTA.AATAG 8251 AACTCATACA ACCCATTAAT TTTACTGGTC ATAAATGAGC TATTATATTT TTGAGTATGT TGGGTAATTA AA.ATGAC CAG TATTTACTCG ATAATATA.AA 8301 ACAACATTAA TACTGTTTCT AATCACTATT AACCTTCTAG GACTTCTCCC TGTTGTAATT ATGACAAAGA TTAGTGATAA TTGGAAGATC CTGAAGAGGG 8351 TTACACCTTT ACACCTACAA CCCAACTCTC CCTTAACATA GCATTTGCCC 340

AATGTGGAAA TGTGGATGTT GGGTTGAGAG GGAATTGTAT CGTAAACGGG 8401 TCCCCCTATG ACTCACAACT GTATTAATTG GAATGCTCAA TCAACCAACA AGGGGGATAC TGAGTGTTGA CATAATTAAC CTTACGAGTT AGTTGGTTGT 8451 ATTGCACTAG GACATTTCCT ACCAGAAGGC ACCCCCACCC CCCTAGCACC TAACGTGATC CTGTAAAGGA TGGTCTTCCG TGGGGGTGGG GGGATCGTGG 8501 CATCCTAATT ATTATTGAAA CTATTAGTTT ATTTATTCGA CCATTAGCAC GTAGGATTAA TAATAACTTT GATAATCA.AA TAAATAAGCT GGTAATCGTG 8551 TAGGAGTCCG ACTAACTGCC AATTTAACAG CTGGTCACCT ACTAATACAA ATCCTCAGGC TGATTGACGG TTAAATTGTC GACCAGTGGA TGATTATGTT 8601 TTAATCGCAA CTGCAGCCTT TGTCCTAATT ACTATTATAC CAACCGTAGC AATTAGCGTT GACGTCGGAA ACAGGATTAA TGATAATATG GTTGGCATCG 8651 ATTATTAACA TCTATCATTC TATTCCTACT AATAATCCTG GAAGTAGCTG TAATAATTGT AGATAGTAAG ATAAGGATGA TTATTAGGAC CTTCATCGAC 8701 TGGCAATAAT TCAAGCATAC GTATTTGTCC TCTTACTAAG TCTATACTTA ACCGTTATTA AGTTCGTATG CATAAACAGG AGAATGATTC AGATATGAAT 8751 CAAGAAAATA CCTAATGGCT CATCAAGCAC ACGCATATCA CATAGTCGAC GTTCTTTTAT GGATTACCGA GTAGTTCGTG TGCGTATAGT GTATCAGCTG 8801 CCTAGTCCAT GACCATTAAC CGGAGCTACC GCCGCCCTTC TAATGACATC GGATCAGGTA CTGGTAATTG GCCTCGATGG CGGCGGGAAG ATTACTGTAG 8851 CGGGTTGGCC ATCTGATTTC ATTTTCACTC ATTATTACTT CTCTACTTAG GCCCAACCGG TAGACTAAAG TAAAAGTGAG TAATAATGAA GAGATGAATC 8901 GATTGATCCT CCTACTATTA ACTATAATTC AATGATGACG AGATATTATC CTAACTAGGA GGATGATAAT TGATATTAAG TTACTACTGC TCTATAATAG 8951 CGAGAAGGAA CATTCCAAGG CCATCATACA CCCCCTGTCC P~AA.A.AGGC C T GCTCTTCCTT GTAAGGTTCC GGTAGTATGT GGGGGACAGG TTTTTCCGGA 9001 CCGTTACGGA ATAATCTTAT TCATCACATC AGAAGTATTC TTCTTTCTAG GGCAATGCCT TATTAGAATA AGTAGTGTAG TCTTCATAAG AAGAAAGATC 9051 GCTTTTTCTG AGCCTTTTAC CATTCAAGTC TCGCCCCAAC CCCAGAACTA C GP~~A.AAGAC TCGGAAAATG GTAAGTTCAG AGCGGGGTTG GGGTCTTGAT 9101 GGAGGATGCT GACCACCAAC AGGAATTAAC CCATTAGACC CATTTGAAGT CCTCCTACGA CTGGTGGTTG TCCTTAATTG GGTAATCTGG GTAAACTTCA 9151 ACCACTTCTG AATACCGCAG TACTTTTAGC TTCTGGCGTA ACAGTAACCT TGGTGAAGAC TTATGGCGTC ATGA,AA.ATCG AAGACCGCAT TGTCATTGGA 9201 GAACCCATCA TAGTCTAATA GAAGGTAATC GA,AAAGAAAC TATTCAAGCC CTTGGGTAGT ATCAGATTAT CTTCCATTAG 
CTTTTCTTTG ATAAGTTCGG 9251 CTAACCCTTA CTATTATCCT AGGAATTTAT TTTACATCCC TCCAAGCCGT GATTGGGAAT GATAATAGGA TCCTTAAATA AAATGTAGGG AGGTTCGGCA 9301 AGAATATTAC GAAGCACCAT TTACAATTGC TGACGGAGTC TACGGAACAA TCTTATAATG CTTCGTGGTA AATGTTAACG ACTGCCTCAG ATGCCTTGTT 9351 CGTTCTATGT CGCCACAGGA TTCCATGGTC TACATGTTAT TATTGGCTCA GCAAGATACA GCGGTGTCCT AAGGTACCAG ATGTACAATA ATAACCGAGT 9401 ACATTTTTAG CAATTTGTCT ATTACGACAA ATCCAATACC ATTTCACATC TGTp►AA.AATC GTTA.AACAGA TAATGCTGTT TAGGTTATGG TAAAGTGTAG 9451 AGAACATCAC TTTGGCTTCG AAGCTGCCGC ATGATATTGA CACTTCGTAG TCTTGTAGTG AAACCGAAGC TTCGACGGCG TACTATAACT GTGAAGCATC 9501 ACGTAGTATG ATTATTCCTT TATGTATCCA TCTATTGATG AGGCTCATAA TGCATCATAC TAATAAGGAA ATACATAGGT AGATAACTAC TCCGAGTATT 9551 TTACTTTTCT AGTATAAACT AGTACAAATG ATTTCCAATC ATTTAATCTT AATGAAAAGA TCATATTTGA TCATGTTTAC TA.AAGGTTAG TA.AATTAGAA 9601 GGCTA,AAACC CAAGGAAAAG TAATGAACCT CATCGCGTCT TCTGTCGCAG CCGATTTTGG GTTCCTTTTC ATTACTTGGA GTAGCGCAGA AGACAGCGTC 9651 CTACGGCCCT CATTTCCCTA ATCCTTGTTT TAGTTGCATT TTGACTTCCA GATGCCGGGA GTAA.AGGGAT TAGGAACAAA ATCAACGTAA AACTGAAGGT 9701 TCACTAA.ATC CAGATAATGA AAAATTATCT CCCTATGAGT GTGGCTTTGA AGTGATTTAG GTCTATTACT TTTTAATAGA GGGATACTCA CACCGAAACT 341

9751 CCCCCTAGGA AGCGCGCGTC TTCCATTCTC CCTACGCTTC TTCCTTGTAG GGGGGATCCT TCGCGCGCAG AAGGTAAGAG GGATGCGAAG AAGGAACATC 9801 CCATTCTATT CCTCCTATTT GAC C TAGA.AA TTGCTCTCCT CCTCCCCCTA GGTAAGATAA GGAGGATAAA CTGGATCTTT AACGAGAGGA GGAGGGGGAT 9851 CCATGGGGTA ACCAACTATT CACACCATTA TCCACATTAT TCTGAGCAGC GGTACCCCAT TGGTTGATAA GTGTGGTAAT AGGTGTAATA AGACTCGTCG 9901 AATTATCCTA ATTTTATTAA CTCTAGGCCT TATTTATGAA TGATTTCAAG TTAATAGGAT Tp~.AAATAATT GAGATCCGGA ATAAATACTT AC TAA.AGTTC 9951 GAGGACTAGA ATGAGCAGAG TAGATGTTTA GTCCA.AATAA AGACCACTAA CTCCTGATCT TACTCGTCTC ATCTACAAAT CAGGTTTATT TCTGGTGATT 10001 CTTCGACTTA GTAAACTATG GTGAAAACCC ATA.AACATCT TATGTCTCCT GAAGCTGAAT CATTTGATAC CACTTTTGGG TATTTGTAGA ATACAGAGGA 10051 ATACATTTTA GTCTTAACTC AGCATTCATC TTGGCCCTTA TAGGTCTTGC TATGTAAAAT CAGAATTGAG TCGTAAGTAG AACCGGGAAT ATCCAGAACG 10101 ACTCAATCGC TCCCACCTCC TATCTGCACT CCTCTGTTTA GAAGGTATAC TGAGTTAGCG AGGGTGGAGG ATAGACGTGA GGAGACAAAT CTTCCATATG 10151 TACTAACCCT ATTTATCACT ATCACTATCT GGACTTTAAT ACTGAACTCC ATGATTGGGA TAAATAGTGA TAGTGATAGA CCTGAAATTA TGACTTGAGG 10201 ACCTCATGCT CAATTACCCC TCTAATTATC CTTACATTTT CAGCTTGCGA TGGAGTACGA GTTAATGGGG AGATTAATAG GAATGTAAAA GTC GAAC GC T 10251 AACTAGTGCA GGCCTAGCCA TCCTAGTGGC TACCTCCCGC TCTCACGGCT TTGATCACGT CCGGATCGGT AGGATCACCG ATGGAGGGCG AGAGTGCCGA 10301 CTGATAACTT AC P.~AAAC C TA AACCTTCTCC AATGCTAAAA ATCCTAATTC GACTATTGAA TGTTTTGGAT TTGGAAGAGG TTACGATTTT TAGGATTAAG 10351 CAACAATCAT ACTTTTCCCA ACCACATGAA TAATTAATAA AAAATGACTG GTTGTTAGTA TGAAAAGGGT TGGTGTACTT ATTA.ATTATT TTTTACTGAC 10401 TGACCTATAA CCACTACCTA TAGCCTTCTA ATCGCATTAC TAAGCCTACT ACTGGATATT GGTGATGGAT ATCGGAAGAT TAGCGTAATG ATTCGGATGA 10451 CTGATTTA.AA TGAAGTATGG ATATTGGCTG AGACTTTTCT AACCAATATA GACTAAATTT ACTTCATACC TATAACCGAC TCTGAAAAGA TTGGTTATAT 10501 TAGCTATTGA CCCCTTATCA GCCCCTTTGC TAATTCTTAC ATGCTGACTT ATCGATAACT GGGGAATAGT CGGGGAAACG ATTAAGAATG TACGACTGAA 10551 CTCCCATTAA CAATCTTAGC TAGC CA.AAAC CATATTACCC CAGAACCAAT GAGGGTAATT GTTAGAATCG ATCGGTTTTG GTATAATGGG GTCTTGGTTA 10601 
TATTCGACAA CGAACATACA TTACACTCCT TATTTTTCTC CAA.ACATTCC ATAAGCTGTT GCTTGTATGT AATGTGAGGA ATP►~~AAAGAG GTTTGTAAGG 10651 TCATTATAAC ATTTTCTGCA ACCGAAATAA TTATATTTTA CATTATATTT AGTAATATTG TA,AAAGACGT TGGCTTTATT AATATAAAAT GTAATATA.A.A 10701 GAAGCCACAC TTATCCCAAC ACTTATTATT ATTACACGAT GAGGAAACCA CTTCGGTGTG AATAGGGTTG TGAATAATAA TAATGTGCTA CTCCTTTGGT 10751 AACAGAACGT TTAAATGCAG GAACATATTT TTTATTTTAT ACCCTAATTG TTGTCTTGCA AATTTACGTC CTTGTATAAA AA.ATAAAATA TGGGATTAAC 10801 GTTCTCTTCC CCTTCTTATT GCTCTTTTAC TTATACP~AAA TAGTTTAGGG CAAGAGAAGG GGAAGAATAA CGAGAAAATG AATATGTTTT ATCAAATCCC 10851 ACATTATCCA TAATCATTAT ACAACACTCA CAACTTCTAA ACCTATTTTC TGTAATAGGT ATTAGTAATA TGTTGTGAGT GTTGAAGATT TGGATAA.AAG 10901 ATGAACA.AAT AAATTATGAT GAATAGCTTG CCTCATCGCC TTCCTTGTCA TACTTGTTTA TTTAATACTA CTTATCGAAC GGAGTAGCGG AAGGAACAGT 10951 AAATACCTTT ATATGGTATT CACCTTTGAC TTCCCAA.AGC TCACGTTGAA TTTATGGAAA TATACCATAA GTGGA.A.ACTG AAGGGTTTCG AGTGCAACTT 11001 GCCCCAATCG CTGGGTCAAT AATTCTAGCC GCAGTACTAC TTAAGCTAGG CGGGGTTAGC GACCCAGTTA TTAAGATCGG CGTCATGATG AATTCGATCC 11051 GGGATACGGA ATAATACGAA TTATTGTTAT ATTAAATCCA TTAACCAAAG CCCTATGCCT TATTATGCTT AATAACAATA TAATTTAGGT AATTGGTTTC 11101 AAATAGCTTA CCCATTCTTA ATCTTAGCTA TCTGAGGAAT TATCATAACC 342

TTTATCGAAT GGGTAAGAAT TAGAATCGAT AGACTCCTTA ATAGTATTGG 11151 AGCTCCATCT GTCTACGACA GACAGACCTA AAATCTCTAA TCGCCTATTC TCGAGGTAGA CAGATGCTGT CTGTCTGGAT TTTAGAGATT AGCGGATAAG 11201 ATCAGTAAGT CATATAGGCC TAGTCGCTGC AGCAATTCTC ATCCAAACAC TAGTCATTCA GTATATCCGG ATCAGCGACG TCGTTAAGAG TAGGTTTGTG 11251 CATGAAGTTT CGCAGGAGCA GTTACACTAA TAATTGCCCA TGGCTTAATT GTACTTCAAA GCGTCCTCGT CAATGTGATT ATTAACGGGT ACCGAATTAA 11301 TCATCAGCCC TATTCTGCTT AGCCAACACT AACTATGAAC GAATCCACAG AGTAGTCGGG ATAAGACGAA TCGGTTGTGA TTGATACTTG CTTAGGTGTC 11351 CCGAACTATA CTTCTAGCCC GAGGCACACA AATCATTCTC CCTTTAATAG GGCTTGATAT GAAGATCGGG CTCCGTGTGT TTAGTAAGAG GGA.AATTATC 11401 CAACCTGATG ATTCCTTACC AGCCTCGCCA ATCTTGCTCT ACCCCCATCC GTTGGACTAC TAAGGAATGG TCGGAGCGGT TAGAACGAGA TGGGGGTAGG 11451 CCCAACCTAA TAGGGGAACT CCTCATTATT ACCTCATTAT TCAACTGATC GGGTTGGATT ATCCCCTTGA GGAGTAATAA TGGAGTAATA AGTTGACTAG 11501 CAACTGGACT ATTATTTTAT TGGGCCCCGG AGTATTAATC ACAGCCTCTT GTTGACCTGA TAATAAAATA ACCCGGGGCC TCATAATTAG TGTCGGAGAA 11551 ACTCACTCTA TATATTTTTA ATAACCCAAC GAGGCCCGAC CCCCCATCAC TGAGTGAGAT ATATP►~~PsAAT TATTGGGTTG CTCCGGGCTG GGGGGTAGTG 11601 ATCCTATCAT TAAATCCTAA TTATACACGA GAACATCTTC TACTCAACCT TAGGATAGTA ATTTAGGATT AATATGTGCT CTTGTAGAAG ATGAGTTGGA 11651 CCATCTAATC CCTATTCTTC TACTAATACT TAAACCAGAA CTTATTTGAG GGTAGATTAG GGATAAGAAG ATGATTATGA ATTTGGTCTT GAATAAACTC 11701 GCTGGACACT TTGTATTTAT AGTTTAACCA AAACATTAGA TTGTGGTTCT CGACCTGTGA AACATA.AATA TCAAATTGGT TTTGTAATCT AACACCAAGA 11751 P.►~~AA.ATAAGA GTTAAAACCT TTTTAACTAC CGAGGGAGGT CAAGGACACA TTTTTATTCT CAATTTTGGA ~TTGATG GCTCCCTCCA GTTCCTGTGT 11801 AAGAACTGCT AATTCTTTCT ATCATGGCTC AAATCCATGA CTCACTCGGC TTCTTGACGA TTAAGAAAGA TAGTACCGAG TTTAGGTACT GAGTGAGCCG 11851 TTCTGAAAGA TAATAGCAAT CTATTGGTCT TAGGAACCAA AAACTCTTGG AAGACTTTCT ATTATCGTTA GATAACCAGA ATCCTTGGTT TTTGAGAACC 11901 TGCAACTCCG AGCAAAAGCT ATGACCACCA TTTTCAATTC ATCACTCCTC ACGTTGAGGC TCGTTTTCGA TACTGGTGGT AA.AAGTTAAG TAGTGAGGAG 11951 ATAATTTTTA CCGTCCTCAT TTTTCCATTA ATAACCTCAT TAAACCCTAA TATTP►~~AAAT 
GGCAGGAGTA A.AA.AGGTAAT TATTGGAGTA ATTTGGGATT 12001 AGAACTTAAT CCTAATTGAT CTTCATCCTA TGCP~~AAATA GCTGTGAAAA TCTTGAATTA GGATTAACTA GAAGTAGGAT ACGTTTTTAT CGACACTTTT 12051 TTTCCTTCTT CATCAGCCTT ATTCCTCTAT TTATTTTCCT GGACCAAGGT AAAGGAAGAA GTAGTCGGAA TAAGGAGATA AATAAAAGGA CCTGGTTCCA 12101 TTAGAATCAA TCATAACTAA TTATAATTGA ATAAATATTG GACCATTCGA AATCTTAGTT AGTATTGATT AATATTAACT TATTTATAAC CTGGTAAGCT 12151 CATCAACATA AGCTTCAAAT TCGATATATA CTCAATTATA TTTATCCCAG GTAGTTGTAT TCGAAGTTTA AGCTATATAT GAGTTAATAT AAATAGGGTC 12201 TAGCCCTTTA CGTAACCTGA TCTATCCTTG AATTTGCCCT ATGATACATA ATCGGGAAAT GCATTGGACT AGATAGGAAC TTAAACGGGA TACTATGTAT 12251 CACTCTGACC CA.AATATTAA CCGCTTTTTC AAATACTTAT TACTCTTCCT GTGAGACTGG GTTTATAATT GGC GP~~A.A.AG TTTATGAATA ATGAGAAGGA 12301 AATCTCAATA ATTATTCTAG TCACCGCTAA CAACATATTT CAACTGTTTA TTAGAGTTAT TAATAAGATC AGTGGCGATT GTTGTATAAA GTTGACAAAT 12351 TTGGTTGGGA AGGAGTTGGA ATTATGTCCT TCCTCCTAAT TGGCTGATGA AACCAACCCT TCCTCAACCT TAATACAGGA AGGAGGATTA ACCGACTACT 12401 CACAGCCGAA CAGATGCTAA CACGGCTGCC CTCCAAGCTG TAATTTATAA GTGTCGGCTT GTCTACGATT GTGCCGACGG GAGGTTCGAC ATTAAATATT 12451 CCGAATAGGA GACATCGGAC TAATCCTCAG CATAACCTGA TTAGCCATAA GGCTTATCCT CTGTAGCCTG ATTAGGAGTC GTATTGGACT AATCGGTATT 343

12501 ACTTAAATTC ATGAGAAATC CAACAACTCT TTATTTTATC TP~~AA.ATATA TGAATTTAAG TACTCTTTAG GTTGTTGAGA AATAAAATAG ATTTTTATAT 12551 GACTTAACCT TACCACTCTT TGGCCTCGTC CTAGCTGCAG CTGGAAAATC CTGAATTGGA ATGGTGAGAA ACCGGAGCAG GATCGACGTC GACCTTTTAG 12601 CGCACAATTT GGACTTCACC CTTGACTTCC ATCCGCTATA GAAGGACCCA GCGTGTTAAA CCTGAAGTGG GAACTGAAGG TAGGCGATAT CTTCCTGGGT 12651 CACCAGTCTC TGCCCTACTC CATTCTAGTA CA.ATAGTCAT TGCCGGTATC GTGGTCAGAG ACGGGATGAG GTAAGATCAT GTTATCAGTA ACGGCCATAG 12701 TTCCTACTAA TCCGTCTTCA CCCCCTAATT CA.A.AACAATC AACTAATCCT AAGGATGATT AGGCAGAAGT GGGGGATTAA GTTTTGTTAG TTGATTAGGA 12751 AACATTATGC CTATGTCTAG GGGCCCTAAC TACCCTCTTT ACTGCAGCCT TTGTAATACG GATACAGATC CCCGGGATTG ATGGGAGAAA TGACGTCGGA 12801 GTGCACTCAC C CP.LA.AATGAT ATC TTATCGCCTT CTCAACATCA CACGTGAGTG GGTTTTACTA TAGTTTTTTT AATAGCGGAA GAGTTGTAGT 12851 AGTCAACTAG GACTAATAAT AGTAACAATT GGCCTAAACC AACCCCAACT TCAGTTGATC CTGATTATTA TCATTGTTAA CCGGATTTGG TTGGGGTTGA 12901 TGCCTTCCTC CATATTTGTA CTCACGCCTT CTTCAAAGCC ATACTCTTCT ACGGAAGGAG GTATAAACAT GAGTGCGGAA GAAGTTTCGG TATGAGAAGA 12951 TATGCTCAGG GTCTATTATC CACAACCTAA GCGGCGAACA AGACATTCGT ATACGAGTCC CAGATAATAG GTGTTGGATT CGCCGCTTGT TCTGTAAGCA 13001 P►.AA.ATAGGTG GCCTCCACAA ACTTCTACCA TTTACCTCAT CTTCCTTAAC TTTTATCCAC CGGAGGTGTT TGAAGATGGT AAATGGAGTA GAAGGAATTG 13051 TGTTGGAAGC CTAGCCCTTA CAGGTATGCC ATTCTTATCA GGTTTCTTCT ACAACCTTCG GATCGGGAAT GTCCATACGG TAAGAATAGT CCAAAGAAGA 13101 CA~A.AAGACAC TATCATTGAA TCCATAAACA CTTCACACCT AAACGCCTGA GTTTTCTGTG ATAGTAACTT AGGTATTTGT GAAGTGTGGA TTTGCGGACT 13151 GCCCTAACCC TCACCCTTAT CGCAACATCA TTCACAGCTA TCTACAATCT CGGGATTGGG AGTGGGAATA GCGTTGTAGT AAGTGTCGAT AGATGTTAGA 13201 CCGACTCATT TTCCTTGTAT TAATA.AATTA TCCACGTTTC AATCCTCTCC GGCTGAGTAA AAGGAACATA ATTATTTAAT AGGTGCAAAG TTAGGAGAGG 13251 CTTCAATTAA C GAAAATCAC CCAACAATGA TTAACCCAAT CAAACGTCTT GAAGTTAATT GCTTTTAGTG GGTTGTTACT AATTGGGTTA GTTTGCAGAA 13301 GCTTACGGGA GTATCTTAAT AGGTCTCATT ATTATACTAA ATTTAAACCC CGAATGCCCT CATAGAATTA TCCAGAGTAA TAATATGATT TAAATTTGGG 13351 AACp►~~CC 
CAAACTATAA C AATAAC TC C TCTACTAAAA ATATCCGCCT TTGTTTTTGG GTTTGATATT GTTATTGAGG AGATGATTTT TATAGGCGGA 13401 TATTAATCAC AATCATCGGC CTCCTACTAG CCCTAGAATT AGCCAACTTA ATAATTAGTG TTAGTAGCCG GAGGATGATC GGGATCTTAA TCGGTTGAAT 13451 ACCAACACCC AATTTAAAA.0 TGTTCCCATT CTCCACACTC ATCACTTTTC TGGTTGTGGG TTAAATTTTG ACAAGGGTAA GAGGTGTGAG TAGTGAAAAG 13501 CA.ACATACTC GGATACTTCC CACAAATCAT CCATCGCCTA TTGCCP~~AAA GTTGTATGAG CCTATGAAGG GTGTTTAGTA GGTAGCGGAT AACGGTTTTT 13551 TTAACTTAA.A TTGAGCCCAA CACATTCCAA CCCACTTGAT TGACCA.AACA AATTGAATTT AACTCGGGTT GTGTAAGGTT GGGTGAACTA ACTGGTTTGT 13601 TGAAATGA.AA AAATCGGACC P.~~AAAGTGC T CTTATCCAAC AA.AC TC TAC T ACTTTACTTT TTTAGCCTGG TTTTTCACGA GAATAGGTTG TTTGAGATGA 13651 AATTAAATTA TCCACCCAAC CACAACAAGG CTATATTAAA ACCTACCTAA TTAATTTAAT AGGTGGGTTG GTGTTGTTCC GATATAATTT TGGATGGATT 13701 TACTACTTTT CCTTACATTA ACCTTAGCCC TACTAACTAC ATTAATCTAA ATGATGA.AA.A GGAATGTAAT TGGAATCGGG ATGATTGATG TAATTAGATT 13751 CTACACGCAA GGCTCCCCAA GATAATCCAC GAGTCAACTC CAACACAACA GATGTGCGTT CCGAGGGGTT CTATTAGGTG CTCAGTTGAG GTTGTGTTGT 13801 AATAAAGTCA ATAATAATAC TCACCCACTC A,AAAC TAATA ATCATCCCCC TTATTTCAGT TATTATTATG AGTGGGTGAG TTTTGATTAT TAGTAGGGGG 13851 ATCAGCATAC AATAATGCTA CCCCCACAAA ATCTCCTCGA ACTATCTCCA 344

TAGTCGTATG TTATTACGAT GGGGGTGTTT TAGAGGAGCT TGATAGAGGT 13901 AATTGCTTAT CTCCTCTACC CCGCCTCAAC TTAATTCAAA TCACTTAACC TTAACGAATA GAGGAGATGG GGCGGAGTTG AATTAAGTTT AGTGAATTGG 13951 ATAAA.ATATT TACTAATAAA AATTAACCCT ATAA,AATAAA ATCCAACATA TATTTTATAA ATGATTATTT TTAATTGGGA TATTTTATTT TAGGTTGTAT 14001 CAATAACACA GACCAATTAC CCCACGACTC AGGATAGGGC TCAGCAGCAA GTTATTGTGT CTGGTTAATG GGGTGCTGAG TCCTATCCCG AGTCGTCGTT 14051 GCGCTGCCGT ATAAGCA.AAT ACTACTAACA TCCCCCCTAA ATAAATTAAA CGCGACGGCA TATTCGTTTA TGATGATTGT AGGGGGGATT TATTTAATTT 14101 AATAAAATTA ATGATATAA.A AGATCCACCA TGTCCCACTA ATAACCCACA TTATTTTAAT TACTATATTT TCTAGGTGGT ACAGGGTGAT TATTGGGTGT 14151 CCCCACCCCA GCAGCTATAA CTAAACCCAA TGCAGCATAA TAAGGAGATG GGGGTGGGGT CGTCGATATT GATTTGGGTT ACGTCGTATT ATTCCTCTAC 14201 GATTAGATGC CACCCCAATT AAGCCTAAAA TTAAACAAAT TATTATTAAA CTAATCTACG GTGGGGTTAA TTCGGATTTT AATTTGTTTA ATAATAATTT 14251 AACATAAAAT AAACCATTAT TCTCACCTGG ACTCCAACCA AGACCAATAA TTGTATTTTA TTTGGTAATA AGAGTGGACC TGAGGTTGGT TCTGGTTATT 143 01 C TTGP►~~P~AAC TATCGTTGTT AATTCAACTA TAAGAATTTA TGGCCATAAA GAACTTTTTG ATAGCAACAA TTAAGTTGAT ATTCTTAAAT ACCGGTATTT 14351 TATCCGAAAA ACTCACCCAC TAC Tp►AAAAT CGTAAACCAC GTCTTAATTG ATAGGCTTTT TGAGTGGGTG ATGATTTTTA GCATTTGGTG CAGAATTAAC 14401 ATCTCCCAAC CCCCTCCAAC ATTTCAATTT GATGA.AAC TT TGGCTCACTT TAGAGGGTTG GGGGAGGTTG TA.AAGTTAA.A CTACTTTGAA ACCGAGTGAA 14451 CTAGGACTAT GTTTAATTAT CCA.AATCCTT ACAGGACTTT TTCTAGCTAT GATCCTGATA CAAATTAATA GGTTTAGGAA TGTCCTGAAA AAGATCGATA 14501 ACATTATACC GCAGACATCT CCATAGCCTT CTCCTCAGTA ATCCACATTT TGTAATATGG CGTCTGTAGA GGTATCGGAA GAGGAGTCAT TAGGTGTAAA 14551 GCCGTGATGT CAACTATGGC TGACTTATCC ATAATATCCA TGCCAACGGA CGGCACTACA GTTGATACCG ACTGAATAGG TATTATAGGT ACGGTTGCCT 14601 GCCTCATTAT TCTTCGTCTG TGCATACCTA CACATTGCCC GAGGACTTTA CGGAGTAATA AGAAGCAGAC ACGTATGGAT GTGTAACGGG C TC C TGAAAT 14651 CTATGGCTCC TACCTCCACA AAGAAACATG AAATGTTGGA GTAATCCTAT GATACCGAGG ATGGAGGTGT TTCTTTGTAC TTTACAACCT CATTAGGATA 14701 TCTTTCTATT AATAACTACA GCCTTCGTAG GATATGTATT ACCCTGAGGA 
AGAAAGATAA TTATTGATGT CGGAAGCATC CTATACATAA TGGGACTCCT 14751 CAAATATCCT TCTGAGGCGC CACAGTCATT ACCAACCTCC TATCTGCCTT GTTTATAGGA AGACTCCGCG GTGTCAGTAA TGGTTGGAGG ATAGACGGAA 14801 CCCATACATC GGGAATATAC TAGTCCAATG AATTTGAGGA GGCTTTTCAG GGGTATGTAG CCCTTATATG ATCAGGTTAC TTAAACTCCT CCGAAAAGTC 14851 TGGGTAACGC CACCCTGACA CGATTCTTCA CATTTCACTT TCTCCTCCCT ACCCATTGCG GTGGGACTGT GCTAAGAAGT GTAAAGTGAA AGAGGAGGGA 14901 TTCTTAATTA CCGCATTAAT AATAATTCAC ATCCTCTTCC TACATGA.AAC AAGAATTAAT GGCGTAATTA TTATTAAGTG TAGGAGAAGG ATGTACTTTG 14951 AGGCTCAAGC AACCCTATAG GACTTAATTC CAACACAGAC A.AA.ATCC CC T TCCGAGTTCG TTGGGATATC CTGAATTAAG GTTGTGTCTG TTTTAGGGGA 15001 TTCACCCTTA TTTCTCCTAC AAAGACACAC TCGGTTTTTT TATTATAATT AAGTGGGAAT A.AAGAGGATG TTTCTGTGTG AGC CAA.AAA.A ATAATATTAA 15051 GTTTCCCTAA TATCCCTAGC CCTACTTTTC CCTTATTCAT TAGGAGACGC CAAAGGGATT ATAGGGATCG GGATG~AAAG GGAATAAGTA ATCCTCTGCG 15101 TGAAAACTTT ATCCCTGCAA ACCCTCTTAT CACCCCTCTC CATATTAAAC ACTTTTGAAA TAGGGACGTT TGGGAGAATA GTGGGGAGAG GTATAATTTG 15151 CCGAATGATA CTTCCTATTC GCCTACGCTA TCCTCCGATC TATCCCAAAC GGCTTACTAT~ GAAGGATAAG CGGATGCGAT AGGAGGCTAG ATAGGGTTTG 15201 AAACTAGGAG GTGTCCTAGC CCTCCTATTC TCCATCTTTA TCCTCATACT TTTGATCCTC CACAGGATCG GGAGGATAAG AGGTAGAAAT AGGAGTATGA 345

15251 AGTTCCGCTT CTCCACACCT C TA.AACAAC G AAATAACACC TTTCGTCCAT TCAAGGCGAA GAGGTGTGGA GATTTGTTGC TTTATTGTGG AAAGCAGGTA 15301 TCACACAA.AT TTTCTTCTGA CTTCTTGTAA TTAACATACT TATCTTAACC AGTGTGTTTA A.AAGAAGAC T GAAGAACATT AATTGTATGA ATAGAATTGG 15351 TGAATTGGGG GACAACCAGC TGAACAACCA TTTATTCTCA TCGGACAAGT ACTTAACCCC CTGTTGGTCG ACTTGTTGGT A.AATAAGAGT AGCCTGTTCA 15401 CGCATCCACC ACCTACTTTT CCTTATTCCT TATTATAATC CCCCTCACTG GCGTAGGTGG TGGATGA.A.AA GGAATAAGGA ATAATATTAG GGGGAGTGAC 15451 GCTGATGTGA AA.ATAAAATC CTCAACCTAA ACTAATCCTG GTAGCTTAAC CGACTACACT TTTATTTTAG GAGTTGGATT TGATTAGGAC CATCGAATTG 15501 TTAAAAGCGT CGGCCTTGTA AGCCGGAGAC TGGAGATTTA ATTCTCCCTA AATTTTCGCA GCCGGAACAT TCGGCCTCTG AC C TC TA.AAT TAAGAGGGAT 15551 AGATACATTA GGp~~AAAGGG GTTAAACTCT TTCCCTTGGC CCCAAGGCCA TCTATGTAAT CCTTTTTCCC CAATTTGAGA AAGGGAACCG GGGTTCCGGT 15601 GGGCACCCTC CGAGTCCGCC CCCTAAGCGC TATTA.A.AACA TAGCCCTAAA CCCGTGGGAG GCTCAGGCGG GGGATTCGCG ATAATTTTGT ATCGGGATTT 15651 GP►~~AAATAAC TAATCCTGGT AGCTTTACTT AAAAGC GTC G GCCTTATAGG CTTTTTATTG ATTAGGACCA TCGA.AATGAA TTTTCGCAGC CGGAATATCC 15701 C TGGA.AAC TG GGAATTTTAA TTCCCCTAAA TACATTAGGA AAAGAAGGAT GACCTTTGAC C C TTAA.AATT AAGGGGATTT ATGTAATCCT TTTCTTCCTA 15751 TAAACTCTTT CCCTTGACCC CAAGGCTGGG GCACCCTCCG AGCCGCCCCC ATTTGAGA.AA GGGAACTGGG GTTCCGACCC CGTGGGAGGC TCGGCGGGGG 15801 TAA.AC GC TAT TAA.AATATAG CCCTAAAGAA AAATAACTAA CCCTGGTAGC ATTTGCGATA ATTTTATATC GGGATTTCTT TTTATTGATT GGGACGATCG 15851 TTTACTTAAA AGCGTCGGCC TTATAGGCCG GAAACTGGGA ATTTTTATTC A.AATGAATTT TCGCAGCCGG AATATCCGGC CTTTGACCCT Tp~~AAATAAG 15901 CCCTAAATAC ATTAGGAAAA GAAGGGTTAA ACTCTTTCCC TTGGCCCCAA GGGATTTATG TAATCCTTTT CTTCCCAATT TGAGAAAGGG AACCGGGGTT 15951 GGCTGGGACA CCCTCCGAGC CGCCCCCTGA ACGCTATTAA AACATAGCCC CCGACCCTGT GGGAGGCTCG GCGGGGGACT TGCGATAATT TTGTATCGGG 16001 TAAAGP.►~~AAA TAACTAATCC TGGTAGCTTT AC T TA.A.AAGC GCCGGCCTTA ATTTCTTTTT ATTGATTAGG ACCATCGAAA TGAATTTTCG CGGCCGGAAT 16051 TAGGCTGGAA ACTGGGAATT TTAATTCCCC TAAATACATT AGGAAAAGAA ATCCGACCTT TGACCCTTAA AATTAAGGGG 
ATTTATGTAA TCCTTTTCTT 16101 GGATTAAACT CTTTCCCTTG ACCCCAAGGC TGGGACACCC TCCGAGCCGC CCTAATTTGA GAAAGGGAAC TGGGGTTCCG ACCCTGTGGG AGGCTCGGCG 16151 CCCCTA.AACG C TATTAAA.AT ATAGCCCTAA AGP.,~~AAATAA CTAACCCTGG GGGGATTTGC GATAATTTTA TATCGGGATT TCTTTTTATT GATTGGGACC 16201 TAGCTTTACT TA.AAAGCGTC GGCCTTATAG GCCGAAAACT GGGAATTTTT ATCGAAATGA ATTTTCGCAG CCGGAATATC CGGCTTTTGA C C C TTP~~A.AA 16251 ATTCCCCTAA ATACATTAAG AAAAGAAGGG TTAAACTCTT TCCCTTGGCC TAAGGGGATT TATGTAATTC TTTTCTTCCC AATTTGAGAA AGGGAACCGG 16301 CCAAGGCCAG GGGAGGCTCG GAGTCCGCCC CCTAAGCGCT ATTAAAACAT GGTTCCGGTC CCGTGGGAGG CTCAGGCGGG GGATTCGCGA TAATTTTGTA 16351 AGCCCTA.AAG AAAAATAACT AATCCTGGTA GCTTTACTTA AAAGCGTCGG TCGGGATTTC TTTTTATTGA TTAGGACCAT CGAAATGAAT TTTCGCAGCC 16401 CCTTATAGGC CGGAAACTGG GAATTTTTAT TCCCCTAAAT ACATTAGGAA GGAATATCCG GCCTTTGACC C TTp►~3AA,ATA AGGGGATTTA TGTAATCCTT 16451 AAGAAGGGTT AAACTCTTTC CCTTGGCCCC AAGGCTGGGA CACCCTCCGA TTCTTCCCAA TTTGAGAAAG GGAACCGGGG TTCCGACCCT GTGGGAGGCT 16501 GCCGCCCCCT GAACGCTATT AAAACATAGC C C TAA.AGAAA AATAACTAAT CGGCGGGGGA CTTGCGATAA TTTTGTATCG GGATTTCTTT TTATTGATTA 16551 CCTGGTAGCT TTAC TTP.~AAA GCGCCGGCCT TATAGGCTGG AGACTTAATT GGACCATCGA AATGAATTTT CGCGGCCGGA ATATCCGACC TCTGAATTAA 16601 CTCCCCTAGA TATATCAGGG GAAGGAGGGT TAAACTCCCG CCTTTGGCCC 346

GAGGGGATCT ATATAGTCCC CTTCCTCCCA ATTTGAGGGC GGA.AACCGGG 16651 C CAA.AGC CAA GATTCTGCCC AAACTGCCCC CTGGACACTA TTP,~~A.AATAT GGTTTCGGTT CTAAGACGGG TTTGACGGGG GACCTGTGAT AATTTTTATA 16701 GAAAACCTAA AGP~AA.ATTTT TTACP►AAAAG TTAGTCAGAT TAACATATTA CTTTTGGATT TCTTTTAAAA AATGTTTTTC AATCAGTCTA ATTGTATAAT 16751 ATGACATAGC CCACATATCC TAATATAGTA CATTACTTAT CTCGACTAAT TACTGTATCG GGTGTATAGG ATTATATCAT GTAATGAATA GAGCTGATTA 16801 CTAACATTAA TAGACTATCC CCTACTGGTA TCATATATCT ATGCTTAATC GATTGTAATT ATCTGATAGG GGATGACCAT AGTATATAGA TACGAATTAG 16851 CTCATTAATC TATATTCCAC TATATCATTA CATACTATGC TTAATACTCA GAGTAATTAG ATATAAGGTG ATATAGTAAT GTATGATACG AATTATGAGT 16901 TTAATCTATA TCCCACTATT TCATTTCATA CTATTCTTTA GTCCCCAATA AATTAGATAT AGGGTGATAA AGTAAAGTAT GATAAGAAAT CAGGGGTTAT 16951 TTCTAATATC P~AAATTTTCA TTACATAACA ATCAATTATT TAACCCTTAA AAGATTATAG TTTTAAAAGT AATGTATTGT TAGTTAATAA ATTGGGAATT 17001 TTATCTAATT TATATATTAT GCGGGTTGGT AAGAACATCA CATCCCGCTA AATAGATTAA ATATATAATA CGCCCAACCA TTCTTGTAGT GTAGGGCGAT 17051 CTGTAAGAAA AAAATAGCTC TATTTGTGGC ACTGTACTCG ATTTATCCCT GACATTCTTT TTTTATCGAG ATA.AAC AC C G TGACATGAGC TAAATAGGGA 17101 ATCAATTGAT CAAAATTGGC ATCTGATTAA TGCTTGTAAT CCTTTAATCC TAGTTAACTA GTTTTAACCG TAGACTAATT ACGAACATTA GGAAATTAGG 17151 TTAATCGCGT CAAGAATGTT AGTACGCTAG CTCCCTTTAA TGGCAACTCC AATTAGCGCA GTTCTTACAA TCATGCGATC GAGGGAAATT ACCGTTGAGG 17201 GTCCTTGATC GTCTCAAGAT TTATTTTCCT CCCTAATTTT TTTGGGGGGG CAGGAACTAG CAGAGTTCTA AATAAA.AGGA GGGATTA~AAA AAACCCCCCC 17251 ATGAAGCAAT AATTACTCCC CGGAAGGGCT CATCTGGAAC ACTAAGATAA TACTTCGTTA TTAATGAGGG GCCTTCCCGA GTAGACCTTG TGATTCTATT 17301 ATCTGAATCC ACCTCGACAC TTATATAAGA TACTCATTAC TATCATTCAT TAGACTTAGG TGGAGCTGTG AATATATTCT ATGAGTAATG ATAGTAAGTA 17351 GAATTATAAT TGTCAAGTTG ACCATACCTG AGAGGGATAG AGAAA.ATGAC CTTAATATTA ACAGTTCAAC TGGTATGGAC TCTCCCTATC TCTTTTACTG 17401 GTCATAGACG TCACGTTTCG ATTTTTTTGA TTAATGAAGC TATGGTTTAA CAGTATCTGC AGTGCAAAGC T CT AATTACTTCG ATACCAAATT 17451 AA.AACTCATT CTCTTAACCC CCATCTAGAC A.AAATTTATA ATAATTATTA TTTTGAGTAA 
GAGAATTGGG GGTAGATCTG TTTTAAATAT TATTAATAAT 17501 GTGTAAAATA CATTTCATTA TTCTAATACA TTCTTCACTT TATTAGGCAT CACATTTTAT GTA.AAGTAAT AAGATTATGT AAGAAGTGAA ATAATCCGTA 17551 AAA.ATTTATT ATATTAGGTT TTCCCCTAGG TTGTGTAA.AA ATGGGGCCGC TTTTAAATAA TATAATCCAA AAGGGGATCC AACACATTTT TACCCCGGCG 17601 C GAAAAGA.AA AAACAATTTT TGATp~~AAAC CCCCCTCCCC CTAATATACA GCTTTTCTTT TTTGTTAA.AA ACTATTTTTG GGGGGAGGGG GATTATATGT 17651 CGGACTCCTC GP.►~~AA.AC C C C P.►~~AAAC GAGG GCCGGACATA TATTTTTGAT GCCTGAGGAG CTTTTTGGGG TTTTTGCTCC CGGCCTGTAT ATP.►~~AAAC TA 17701 TAGCATGCGA AATAGATTCT GTATTACATT GTTACACTGT GAT ATCGTACGCT TTATCTAAGA CATAATGTAA CAATGTGACA CTA

tRNA     1..70
         product = tRNA-Phe
rRNA     69..952
         product = 12S ribosomal RNA
tRNA     1021..1092
         product = tRNA-Val
rRNA     1093..2758
         product = 16S ribosomal RNA
tRNA     2759..2833
         product = tRNA-Leu
gene     2834..3808
         gene = ND1
         product = NADH dehydrogenase subunit 1
tRNA     3813..3881
         product = tRNA-Ile
tRNA     3880..3951
         product = tRNA-Gln
tRNA     3952..4020
         product = tRNA-Met
gene     4021..5064
         gene = ND2
         product = NADH dehydrogenase subunit 2
tRNA     5064..5134
         product = tRNA-Trp
tRNA     complement (5136..5204)
         product = tRNA-Ala
tRNA     complement (5205..5277)
         product = tRNA-Asn
tRNA     complement (5313..5379)
         product = tRNA-Cys
tRNA     complement (5381..5450)
         product = tRNA-Tyr
gene     5452..7005
         gene = CO1
         product = cytochrome c oxidase subunit 1
tRNA     complement (7007..7077)
         product = tRNA-Ser
tRNA     7081..7150
         product = tRNA-Asp
gene     7158..7848
         gene = CO2
         product = cytochrome c oxidase subunit 2
tRNA     7849..7921
         product = tRNA-Lys
gene     7924..8091
         gene = ATP8
         product = ATP synthase F0 subunit 8
gene     8082..8765
         gene = ATP6
         product = ATP synthase F0 subunit 6
gene     8765..9550
         gene = CO3
         product = cytochrome c oxidase subunit 3
tRNA     9553..9622
         product = tRNA-Gly
gene     9623..9973
         gene = ND3
         product = NADH dehydrogenase subunit 3
tRNA     9972..10041
         product = tRNA-Arg
gene     10042..10338
         gene = ND4L
         product = NADH dehydrogenase subunit 4L
gene     10332..11712
         gene = ND4
         product = NADH dehydrogenase subunit 4
tRNA     11713..11781
         product = tRNA-His
tRNA     11782..11848
         product = tRNA-Ser
tRNA     11849..11920
         product = tRNA-Leu
gene     11921..13750
         gene = ND5
         product = NADH dehydrogenase subunit 5
gene     complement (13746..14267)
         gene = ND6
         product = NADH dehydrogenase subunit 6
tRNA     complement (14268..14337)
         product = tRNA-Glu
gene     14340..15485
         gene = CYTB
         product = cytochrome b
tRNA     15485..15555
         product = tRNA-Thr
insert   15556..16614
         alternating tRNA-Pro (x 6) + tRNA-Thr (x 6) pseudogenes
tRNA     complement (16615..16683)
         product = tRNA-Pro
D-Loop   16685..17743
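
The location notation in the feature table above (start..end, with "complement (...)" marking a feature on the opposite strand) can be read programmatically. The following is a minimal Python sketch, not part of the original analysis: the names Location, parse_location and overlap are hypothetical helpers introduced here for illustration, and coordinates are assumed to be 1-based and inclusive, as the table's numbering suggests.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """A feature location as written in the table (hypothetical helper type)."""
    start: int   # assumed 1-based, inclusive
    end: int     # assumed 1-based, inclusive
    strand: str  # "+" as printed, "-" for "complement (...)"

    @property
    def length(self) -> int:
        # Inclusive coordinates, so both end points count.
        return self.end - self.start + 1


# Accepts "7924..8091" and "complement (5136..5204)".
_LOCATION = re.compile(r"^(complement)?\s*\(?\s*(\d+)\s*\.\.\s*(\d+)\s*\)?$")


def parse_location(text: str) -> Location:
    """Parse one location string from the feature table."""
    match = _LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location: {text!r}")
    strand = "-" if match.group(1) else "+"
    return Location(int(match.group(2)), int(match.group(3)), strand)


def overlap(a: Location, b: Location) -> int:
    """Number of positions shared by two features (0 if they are disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


atp8 = parse_location("7924..8091")
atp6 = parse_location("8082..8765")
trna_ala = parse_location("complement (5136..5204)")
print(atp8.length, overlap(atp8, atp6), trna_ala.strand, trna_ala.length)
# prints: 168 10 - 69
```

Under these assumptions, the ATP8 and ATP6 entries share 10 positions, and tRNA-Ala is a 69-position feature on the complementary strand.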

Odontaspis ferox mitochondrion, complete genome

1 GCTAGTGTAA CTTAATTTAA AGTATGGCAC TGAAGATGCT AATATGAAAA CGATCACATT GAATTAA.ATT TCATACCGTG ACTTCTACGA TTATACTTTT 51 ATP►~~A.AATTT TCCACAAGCA TAAAGGTTTA GTCCTGGCCT CAGTGTTAAT TATTTTTAAA AGGTGTTCGT ATTTCCAAAT CAGGACCGGA GTCACAATTA 101 TGTAACTAAA ATTATACATG CAAGTTTCAG CATCCCTGTG AGAATGCCCT ACATTGATTT TAATATGTAC GTTCA.AAGTC GTAGGGACAC TCTTACGGGA 151 AATTATTCTA TTAATTAATT AGGGGCAGGT ATCAGGCACA CACACGTAGC TTAATAAGAT AATTAATTAA TCCCCGTCCA TAGTCCGTGT GTGTGCATCG 201 CCAAGACACC TTGCTAAGCC ACACCCCCAA GGGATTCCAG CAGTAATAAA GGTTCTGTGG AACGATTCGG TGTGGGGGTT CCCTAAGGTC GTCATTATTT 251 TATTGATTTT ATAAGCACAA GCTTGAATCA GTTAAAGTTA ATAGAGTTGG ATAAC TAA,AA TATTCGTGTT CGAACTTAGT CAATTTCAAT TATCTCA.ACC 301 TCAATCTCGT GCCAGCCACC GCGGTTATAC GAGTAACTCA CATTAATACT AGTTAGAGCA CGGTCGGTGG CGCCAATATG CTCATTGAGT GTAATTATGA 351 TTCCCGGCGT AAAGAGTGAT TTAAGGAATA TCTATAATAA CTGAAGTTAA AAGGGCCGCA TTTCTCACTA AATTCCTTAT AGATATTATT GACTTCAATT 401 GACCTTATCA AGCTGTTACA CGCACCCATA AATGGAATCA TCAACAACGA CTGGAATAGT TCGACAATGT GCGTGGGTAT TTACCTTAGT AGTTGTTGCT 451 AAGTGACTTT ACTTTACTAG AAATCTTGAC GTCACGACAG TTAGACCCCA TTCACTGAAA TGAAATGATC TTTAGAACTG CAGTGCTGTC AATCTGGGGT 501 AACTAGGATT AGATACCCTA CTATGTCTAA CCACAAACTT AAACAATAAT TTGATCCTAA TCTATGGGAT GATACAGATT GGTGTTTGAA TTTGTTATTA 551 TCACTATATT GTTCGCCAGA GTACTACAAG CGCTAGCTTG A.AACCCAAAG AGTGATATAA CAAGCGGTCT CATGATGTTC GCGATCGAAC TTTGGGTTTC 349

601 GACTTGACGG TGTCCCAAAC CCACCTAGAG GAGCCTGTTC TGTAACCGAT CTGAACTGCC ACAGGGTTTG GGTGGATCTC CTCGGACAAG ACATTGGCTA 651 AATCCCCGTT AAACCTCACC ACTTCTAGCC ATCCCCGTCT ATATACCGCC TTAGGGGCAA TTTGGAGTGG TGAAGATCGG TAGGGGCAGA TATATGGCGG 701 GTCGTCAGCT TACACTGTGA AGGTp~~AAAG TAAGCP~~AA.A GAACTAACTC CAGCAGTCGA ATGTGACACT TCCATTTTTC ATTCGTTTTT CTTGATTGAG 751 CCACACGTCA GGTCGAGGTG TAGCAAATGA AGTGGATAGA AATGGGCTAC GGTGTGCAGT CCAGCTCCAC ATCGTTTACT TCACCTATCT TTACCCGATG 801 ATTTTCTATA AAGA,AAAC AC GAATGGTAAA C TG1?~~AAATT ACCTAAAGGT TP~AAAGATAT TTCTTTTGTG CTTACCATTT GACTTTTTAA TGGATTTCCA 851 GGATTTAGCA GTAAGAAAAG ATTAGAGAAC TTATCTGAAA TTGGCTCTGG C C TAA.ATC GT CATTCTTTTC TAATCTCTTG AATAGACTTT AACCGAGACC 901 GACGCGTACA TACCGCCCGT CACTCTCCTC TCC ACTTATTCCT CTGCGCATGT ATGGCGGGCA GTGAGAGGAG TTTTTTTAGG TGAATAAGGA 951 AATTAA.AAGA AAATTATTAA GAGGAGGCAA GTCGTAACAT GGTAAGTGTA TTAATTTTCT TTTAATAATT CTCCTCCGTT CAGCATTGTA CCATTCACAT 1001 CTGGAAAGTG CACTTGGAAT CAAAATGTGG CTAAACTAGC AA.AGCACCTC GACCTTTCAC GTGAACCTTA GTTTTACACC GATTTGATCG TTTCGTGGAG 1051 CCTTACACCG AGGAAATACC CGTGCAATTC GGATCATTTT GAACATTAAA GGAATGTGGC TCCTTTATGG GCACGTTAAG C C TAGTAA.AA CTTGTAATTT 1101 GCTAGCCTGT ATAACTACCC AAACCCAACC TTATTAACTA CCTTATATAT CGATCGGACA TATTGATGGG TTTGGGTTGG AATAATTGAT GGAATATATA 1151 TAATTCCTAA TTAAAACATT TTATCCTTCT AGTATGGGTG ACAGAACAA.A ATTAAGGATT AATTTTGTAA AATAGGAAGA TCATACCCAC TGTCTTGTTT 1201 AACTCAGCGC AATAGATTAT GTACCGCAAG GGAAAGCTGA AAAAGAA.ATG TTGAGTCGCG TTATCTAATA CATGGCGTTC CCTTTCGACT TTTTCTTTAC 1251 AAATAAATAA TTAAA.GTAAT A,P~AA,AGCAGA GATCTAACCT CGTACCTTTT TTTATTTATT AATTTCATTA TTTTTCGTCT CTAGATTGGA GCATGGA►AAA 1301 GCATCATGAT TTAGCTAGAA AA.AC TAGAC A AAGAGATCTT AAGCCTATCC CGTAGTACTA AATCGATCTT TTTGATCTGT TTCTCTAGAA TTCGGATAGG 1351 TC C C GAA.AC T A.A.AC GAGC TA CTCCGAAGCA GCACAATTAG AGCCAACCCG AGGGCTTTGA TTTGCTCGAT GAGGCTTCGT CGTGTTAATC TCGGTTGGGC 1401 TCTCTGTGGC A►AAAGAGTGG GAAGACTTCC GAGTAGCGGT GACAAACCTA AGAGACACCG TTTTCTCACC CTTCTGAAGG CTCATCGCCA CTGTTTGGAT 1451 TCGAGTTTAG 
TGATAGCTGG TTGCCCAAGA AAAGAACTTT AATTCTGCAT AGCTCA.AATC ACTATCGACC AACGGGTTCT TTTCTTGA.AA TTAAGACGTA 1501 TAATTCCTTT ACCACCAA.AG AATTTATCTT ACTAAGGTTA AATATP~~AAA ATTAAGGAAA TGGTGGTTTC TTAAATAGAA TGATTCCAAT TTATATTTTT 1551 TTAATAGTTA TTCAGAAGAG GTACAGCCCT TCTGAACCAA GATACAACTT AATTATCAAT AAGTCTTCTC CATGTCGGGA AGACTTGGTT CTATGTTGAA 1601 TTTAAGATGG AAAATGATCA CATTTATCAA GGTTTTTACC CCAGTGGACC AAATTCTACC TTTTACTAGT GTAAATAGTT C CP.~~,A.AATGG GGTCACCTGG 1651 CA,.AAAGCAGC CATCTGTAAA GTAAGCGTCA CAGCTCCAGT CTCACP~~AAA GTTTTCGTCG GTAGACATTT CATTCGCAGT GTCGAGGTCA GAGTGTTTTT 17 01 CCTATAATTT AGATATTCCT CTCATAATCC CCTTAACTAT ATTGAGCTAT GGATATTAAA TCTATAAGGA GAGTATTAGG GGAATTGATA TAACTCGATA 1751 TTTATAA.AAT ATAAAAGAAC TTATGCTAAA ATGAGTAATA AGAGAATAAA AAATATTTTA TATTTTCTTG AATACGATTT TACTCATTAT TCTCTTATTT 1801 CCTCTCCAGA CATAAGTGTA TGTCAGAAAG AATTAAATCA CTGACAATTA GGAGAGGTCT GTATTCACAT ACAGTCTTTC TTAATTTAGT GACTGTTAAT 1851 AACGAACCCA AATTGAGGCC ATTATATTAA TATTTACTTA ACTAGAAAAT TTGCTTGGGT TTAACTCCGG TAATATAATT ATAAATGAAT TGATCTTTTA 1901 CTTATTATAA TATTCGTTGA TCCTACACAG GAATGTCTTT AAGGA.AAGAT GAATAATATT ATAAGCAACT AGGATGTGTC CTTACAGAAA TTCCTTTCTA 1951 TTAAAGAAAA TAAAGGAACT CGGCA.AACAC AAACTCCGCC TGTTTACCAA 350

AATTTCTTTT ATTTCCTTGA GCCGTTTGTG TTTGAGGC GCU ACAA:ATGG~i~~i~ 2001 AAACATCGCC TCTTGAGTAT TATAAGAGGT CCCGCCTGCC CTGTGACAAT TTTGTAGCGG AGAACTCATA ATATTCTCCA GGGCGGACGG GACACTGTTA 2051 GTTTAACGGC CGCGGTATTT TGACCGTGCA AAGGTAGCGT AATCACTTGT CAAATTGCCG GCGCCATAAA ACTGGCACGT TTCCATCGCA TTAGTGAACA 2101 CTTTTAAATG AAGACCCGTA TGAA.AGGCAT CACGAGAGTT TAACTGTCTC GAAAATTTAC TTCTGGGCAT ACTTTCCGTA GTGCTCTCAA ATTGACAGAG 2151 TATTTTCTAA TCAATGAAAT TGATCTACTC GTGCAGAAGC GAGTATAATT ATA.A.AAGATT AGTTACTTTA ACTAGATGAG CACGTCTTCG CTCATATTAA 2201 ACATTAGACG AGAAGACCCT ATGGAGCTTC AAATACA.AAT TAATTATGTA TGTAATCTGC TCTTCTGGGA TACCTCGAAG TTTATGTTTA ATTAATACAT 2251 AATCAATTAT TCCACGGATA TAAATP.►AAAA ATATAATATT TTTAATTTAA TTAGTTAATA AGGTGCCTAT ATTTATTTTT TATATTATAA A.AATTA.AATT 2301 CTGTTTTTGG TTGGGGTGAC CAAGGGGAAA AACAAATCCC CCTTATCGAC GACAAAAACC AACCCCACTG GTTCCCCTTT TTGTTTAGGG GGAATAGCTG 2351 TGAGTACTCA AGTACTTAAA AATTAGAATT ACAATTCTAA TTAATp~~AAA ACTCATGAGT TCATGA.ATTT TTAATCTTAA TGTTAAGATT AATTATTTTT 2401 ATTTATCGAA AAATGACCCA GGAATACCTG ATCAATGAAC CAAGTTACCC TAAATAGCTT TTTACTGGGT CCTTATGGAC TAGTTACTTG GTTCAATGGG 2451 TAGGGATAAC AGCGCAATCC TTTCTCAGAG TCCCTATCGC CGAA.AGGGTT ATCCCTATTG TCGCGTTAGG AAAGAGTCTC AGGGATAGCG GCTTTCCCAA 2501 TACGACCTCG ATGTTGGATC AGGACATCCT AATGATGTAA CCGTTATTAA ATGCTGGAGC TACAACCTAG TCCTGTAGGA TTACTACATT GGCAATAATT 2551 GGGTTCGTTT GTTCAACGAT TAATAGTCCT ACGTGATCTG AGTTCAGACC CCCAAGCA.AA CAAGTTGCTA ATTATCAGGA TGCACTAGAC TCAAGTCTGG 2601 GGAGAAATCC AGGTCAGTTT CTATCTATGA ATTAATTTTT CCTAGTACGA CCTCTTTAGG TCCAGTCAAA GATAGATACT TAATTP.~AAAA GGATCATGCT 2651 AAGGACCGGA A~A.A.ATGGAGC CAATACCCTA GGCACGCTCC ATTTTCATCT TTCCTGGCCT TTTTACCTCG GTTATGGGAT CCGTGCGAGG T~GTAGA 2701 ATTGAAATAA ACTAAAATAG ATAAGP.~AAAA ATTATCTACT ACCCAAGAAA TAACTTTATT TGATTTTATC TATTCTTTTT TAATAGATGA TGGGTTCTTT 2751 AGGGTTGTTG AGGTGGCAGA GCCTGGTAAG T GC A,AA.AGAC CTAAACTCTT TCCCAACAAC TCCACCGTCT CGGACCATTC ACGTTTTCTG GATTTGAGAA 2801 TAATTCAGAG GTTCAAATCC TCTCCCCAAT CATGCTTGAA ACTCTCCTAC ATTAAGTCTC 
CAAGTTTAGG AGAGGGGTTA GTACGAACTT TGAGAGGATG 2851 TTTATTTAAT TAATCCACTT ACCTACATTA TTCCTATCTT ATTGGCTACA AAATA.AATTA ATTAGGTGAA TGGATGTAAT AAGGATAGAA TAACCGATGT 2901 GCTTTTCTCA CCCTAGTTGA AC GP.~~AAATC CTCGGCCATA TACAACTCCG CGA~AA.AGAGT GGGATCAACT TGCTTTTTAG GAGCCGGTAT ATGTTGAGGC 2951 T.A.AAGGC C C T AACATCGTAG GCTTATACGG ACTCCTTCAA CCAATTGCAG ATTTCCGGGA TTGTAGCATC CGAATATGCC TGAGGAAGTT GGTTAACGTC 3001 ATGGCCTAAA ATTATTTATT A.A.AGAAC C CA TCCATCCATC AACATCCTCC TACCGGATTT TAATAAATAA TTTCTTGGGT AGGTAGGTAG TTGTAGGAGG 3051 CCCTTCCTAT TTCTAATCAC TCCCACAACA GCCCTAACAT TAGCCCTCCT GGGAAGGATA AAGATTAGTG AGGGTGTTGT CGGGATTGTA ATCGGGAGGA 3101 TATATGAATA CCTCTTCCCC TCCCCCACTC TATTATTAAT CTTAATTTAG ATATACTTAT GGAGAAGGGG AGGGGGTGAG ATAATAATTA GAATTA.AATC 3151 GTTTATTATT TATTTTAGCA ATCTCAAGCT TAACCGTCTA TACTATTTTA CA.AATAATAA ATP TC GT TAGAGTTCGA ATTGGCAGAT ATGATAAA.AT 3201 GGTTCCGGAT GAGCATCCAA TTCA.A.AATAT GCCCTAATAG GAGCTTTACG CCAAGGCCTA CTCGTAGGTT AAGTTTTATA CGGGATTATC CTCGAAATGC 3251 AGCCGTAGCA CAAACAATTT CTTATGAAGT AAGTTTAGGA CTAATCCTTT TCGGCATCGT GTTTGTTAAA GAATACTTCA TTCAAATCCT GATTAGGAAA 3301 TATCTATGAT TATATTTACA GGAGGTTTTA CCCTCCACAC CTTTAATCTA ATAGATACTA ATATAAATGT CCTCCA~AA.AT GGGAGGTGTG GA.AATTAGAT 351

3351 GCACAAGAAA CAATCTGACT AATTATTCCA GGATGACCAT TGGCCCTAAT CGTGTTCTTT GTTAGACTGA TTAATAAGGT CCTACTGGTA ACCGGGATTA 3401 ATGATATGTC TCAACCCTAG CAGAAACTAA CCGAGTACCA TTTGACTTAA TACTATACAG AGTTGGGATC GTCTTTGATT GGCTCATGGT AAACTGAATT 3451 CAGAAGGGGA ATCAGAACTA GTTTCAGGAT TTAATATTGA ATATGCAGGA GTCTTCCCCT TAGTCTTGAT CAAAGTCCTA AATTATAACT TATACGTCCT 3501 GGTTCATTTG CCCTATTCTT CCTTGCTGAA TATACAAATA TTTTATTAAT CCAAGTAAAC GGGATAAGAA GGAACGACTT ATATGTTTAT ~TAATTA 3551 AA.ATAC C C T T TCAGTCATTT TATTTATAGG TTCCTCTTAC AATCCACTTC TTTATGGGAA AGTCAGTAAA ATAAATATCC AAGGAGAATG TTAGGTGAAG 3601 TCCCAGAAAT CTCAACACTC AGCTTAATAA TAAAAGCAAC CCTACTAACT AGGGTCTTTA GAGTTGTGAG TCGAATTATT ATTTTCGTTG GGATGATTGA 3651 CTACTTTTCC TATGAATCCG AGCATCCTAT CCCCGCTTTC GTTATGATCA GATGA~AAAGG ATACTTAGGC TCGTAGGATA GGGGCGAAAG CAATACTAGT 3701 ACTCATACAC TTAGTATGAA AAAATTTCCT CCCTTTAACC TTAGCAATTA TGAGTATGTG AATCATACTT TTTTA.AAGGA GGGA.AATTGG AATCGTTAAT CAGCA.AACCT 3751 TATTATGACA TATTGCCCTC CCCATAGCCA ACCTCCCCTA ATAATACTGT ATAACGGGAG GGGTATCGGT GTCGTTTGGA TGGAGGGGAT 3801 ACTTAACGGA AGCGTGCCTG AATAAAGGAC CACTTTGATA GAGTGGATAA TGAATTGCCT TCGCACGGAC TTATTTCCTG GTGAAACTAT CTCACCTATT 3851 TGAAAGTTAA AACCTTTCCT CTTCCTAGAA AAATAGGACT TGAACCTATA ACTTTCAATT TTGGAAAGGA GAAGGATCTT TTTATCCTGA ACTTGGATAT 3901 ATTAAGAGAT CAAAACTCCT TGTATTTCCA ACTATACTAT TTCCTAAGTA TAATTCTCTA GTTTTGAGGA ACATA.AAGGT TGATATGATA AAGGATTCAT 3951 AAGTCAGCTA ATA.AAGCTTT TGGGCCCATA CCCCAACCAT GTTGGTTAAA TTCAGTCGAT TATTTCGA.AA ACCCGGGTAT GGGGTTGGTA CAACCAATTT 4001 ATCCCTCCTT TACTAATGAA CCCAATTGTA TTAACCATTA TCATTTCAAG TAGGGAGGAA ATGATTACTT GGGTTAACAT AATTGGTAAT AGTAAAGTTC 4051 CCTAAGCCTA GGAACTATCT TAACATTTAT TGGCTCACAT TGATTCCTAA GGATTCGGAT CCTTGATAGA ATTGTAAATA ACCGAGTGTA ACTAAGGATT 4101 TTTGAATAGG CCTCGAAATT AATACTCTAG CTATTATCCC CTTAATAATT AAACTTATCC GGAGCTTTAA TTATGAGATC GATAATAGGG GAATTATTAA 4151 CGCCAGCACC ACCCCCGGGC AGTAGAAGCT TC CACAAAAT ATTTTATTAC GCGGTCGTGG TGGGGGCCCG TCATCTTCGA AGGTGTTTTA TAAAATAATG 4201 ACAAGCAACT GCCTCAGCCT 
TACTTTTATT CGCTAGCGTC ATAAACGCTT TGTTCGTTGA CGGAGTCGGA ATGAAAATAA GCGATCGCAG TATTTGCGAA 4251 GGACTTCAGG CGAATGAAGT CTAATTGAAA TAATTAATCC AACTCCTGCC CCTGAAGTCC GCTTACTTCA GATTAACTTT ATTAATTAGG TTGAGGACGG 4301 ACACTGGCCA CAATCGCACT AGCATT~ ATTGGCTTAG CCCCCCTTCA TGTGACCGGT GTTAGCGTGA TCGTAATTTT TAACCGAATC GGGGGGAAGT 4351 TTTCTGATTA CCCGAAGTTC TCCAAGGTTT AGACCTTACT ACAGGCCTCA AAAGACTAAT GGGCTTCAAG AGGTTCCAAA TCTGGAATGA TGTCCGGAGT 4401 TTCTTTCTAC ATGACP►AAAA CTCGCCCCAT TCGCTATCCT CTTACAACTT AAGAAAGATG TACTGTTTTT GAGCGGGGTA AGCGATAGGA GAATGTTGAA 4451 TACCCCTCAT TAAACTCCAA CTTACTCGTA TTCCTTGGAA TCCTCTCAAC ATGGGGAGTA ATTTGAGGTT GAATGAGCAT AAGGAACCTT AGGAGAGTTG 4501 TATAGTAGGA GCTTGAGGAG GCTTAAATCA GACCCAATTA C Gp~~AAATC C ATATCATCCT CGAACTCCTC CGAATTTAGT CTGGGTTAAT GCTTTTTAGG 4551 TAGCCTACTC CTCAATTGCA CATCTTGGTT GAATAATTAC AATCCTACAT ATCGGATGAG GAGTTAACGT GTAGAACCAA CTTATTAATG TTAGGATGTA 4601 TATTCCCATA ACCTAACCCA ACTAAATTTA ATTCTTTACA TTATCATAAC ATAAGGGTAT TGGATTGGGT TGATTTA.AAT TAAGAAATGT AATAGTATTG 4651 ATTAACAACC TTCCTATTAT TTA.AA.ATATT TAATTCAACC AAAATTAATT TAATTGTTGG AAGGATAATA AATTTTATAA ATTAAGTTGG TTTTAATTAA 4701 CTATTTCCTC TTCCTCATCA AAATCTCCCT TACTATCCAT TATTGCTCTT 352

GATAAAGGAG AAGGAGTAGT TTTAGAGGGA ATGATAGGTA ATAACGAGAA 4751 ATAACTCTTC TTTCTCTCGG AGGCTTACCT CCACTTTCAG GCTTTATACC TATTGAGAAG A.AAGAGAGC C TCCGAATGGA GGTGAAAGTC CGAAATATGG 4801 A.A.AATGATTA ATTTTACAAG AATTAACAAA ACAGAACCTA ATTATTACAG TTTTACTAAT TAAAATGTTC TTAATTGTTT TGTCTTGGAT TAATAATGTC 4851 CCACTATTAT AGCCATAATA ACCCTCCTCA GTCTATTCTT CTATCTACGC GGTGATAATA TCGGTATTAT TGGGAGGAGT CAGATAAGAA GATAGATGCG 4901 CTCTGCTATG CTACAACACT AACCATAATT CCAAATTCAA TCAACATATT GAGACGATAC GATGTTGTGA TTGGTATTAA GGTTTAAGTT AGTTGTATAA 4951 ATCATCATGA C GAATTA.AAT CATCCTATAA CCTAACCTTA ACAACAACTG TAGTAGTACT GCTTAATTTA GTAGGATATT GGATTGGAAT TGTTGTTGAC 5001 CCTCACTATC CATTCTACTC CTTCCAATCA CCCCCTCCAT TCTCATACTA GGAGTGATAG GTAAGATGAG GAAGGTTAGT GGGGGAGGTA AGAGTATGAT 5051 TTATCTTAAG AAATTTAGGT TAACAATAGA CCAAAAGCCT TCAAAGCTTT AATAGAATTC TTTAAATCCA ATTGTTATCT GGTTTTCGGA AGTTTCGA.AA 5101 AAGCAGAAGT GAAAATCTCC TAATTTCTGC T~TTTGT AAGACTTTAT TTCGTCTTCA CTTTTAGAGG ATTAAAGACG ATTTTAAACA TTCTGAAATA 5151 CTCACATCTT CTGAATGCAA CCCAGATGCT TTAATTAAGC TAPsAATC TC C GAGTGTAGAA GACTTACGTT GGGTCTACGA AATTAATTCG ATTTTAGAGG 52 01 TAGATAAATA GGCCTTGATC CTACA.A.AATC TTAGTTAACA GCTAAGCGTT ATCTATTTAT CCGGAACTAG GATGTTTTAG AATCAATTGT CGATTCGCAA 5251 CAATCCAGCG AACTTTTATC TACTTTCTCC CGCCGTAAAA ATAAAAGGCG GTTAGGTCGC TTGAAA.ATAG ATGA.AAGAGG GCGGCATTTT TATTTTCCGC 5301 GGAGAAAGCC CCGGGAGAAA CAAACCTCCG GTTTTGGATT TGCAATCCAA CCTCTTTCGG GGCCCTCTTT GTTTGGAGGC CA~AAAC C TAA ACGTTAGGTT 5351 CGTAATTATC TACTGCAGGA CTATGATAAG GAGAGGAATT TGACCTCTGT GCATTAATAG ATGACGTCCT GATACTATTC CTCTCCTTA.A ACTGGAGACA 5401 TTACGGAGCT ACAACCCGCC ACTTAGTTCT CAGTCACCTT ACCTGTGGCA AATGCCTCGA TGTTGGGCGG TGAATCAAGA GTCAGTGGAA TGGACACCGT 5451 ATTAATCGTT GACTATTTTC TACAAACCAC AAAGATATTG GCACCCTTTA TAATTAGCAA CTGATAAAAG ATGTTTGGTG TTTCTATAAC C GTGGGAA.AT 5501 TTTAATCTTC GGTGCATGAG CAGGAATAGT GGGAATAGCT TTAAGCCTTC AAATTAGAAG CCACGTACTC GTCCTTATCA CCCTTATCGA AATTCGGAAG 5551 TAATTCGAGC CGAATTAGGC CAACCTGGGT CACTTCTAGG AGATGATCAG ATTAAGCTCG GCTTAATCCG 
GTTGGACCCA GTGAAGATCC TCTACTAGTC 5601 ATTTATAATG TTATTGTAAC CGCCCATGCA TTCGTAATAA TCTTCTTCAT TAAATATTAC AATAACATTG GCGGGTACGT AAGCATTATT AGAAGAAGTA 5651 GGTTATACCC GTAATAATTG GTGGGTTTGG AA.AC TGATTA GTACCATTAA CCAATATGGG CATTATTAAC CACCCAAACC TTTGACTAAT CATGGTAATT 5701 TAATTGGTGC ACCAGATATA GCCTTCCCAC GAATAA.ATAA CATAAGCTTT ATTAACCACG TGGTCTATAT CGGAAGGGTG CTTATTTATT GTATTCGAAA 5751 TGACTTCTAC CTCCTTCTTT TCTTTTACTT CTGGCTTCAG CTGGAGTTGA ACTGAAGATG GAGGAAGAA.A AGAAAATGAA GACCGAAGTC GACCTCAACT 5801 AGCCGGAGCC GGTACTGGTT GAACAGTTTA TCCCCCTTTA GCTGGTAACT TCGGCCTCGG CCATGACCAA CTTGTCA.AAT AGGGGGAAAT CGACCATTGA 5851 TAGCACATGC TGGAGCATCC GTTGACTTAG CCATCTTCTC TCTCCATTTA ATCGTGTACG ACCTCGTAGG CAACTGAATC GGTAGAAGAG AGAGGTAAAT 5901 GCAGGCATCT CATCAATTTT AGCTTCAATT AACTTTATTA CAACCATTAT CGTCCGTAGA GTAGTTA.A.AA TCGAAGTTAA TTGAAATAAT GTTGGTAATA 5951 TAATATA,AAA CCACCAGCCA TCTCTCAATA TCAA.ACAC CA TTATTTGTGT ATTATATTTT GGTGGTCGGT AGAGAGTTAT AGTTTGTGGT AATAAACACA 6001 GATCAATTCT AGTAACAACT ATCCTCCTCC TATTATCCCT TCCAGTACTC CTAGTTAAGA TCATTGTTGA TAGGAGGAGG ATAATAGGGA AGGTCATGAG 6051 GCAGCAGGTA TTACAATATT ACTTACTGAT CGCAATCTAA ATACAACATT CGTCGTCCAT AATGTTATAA TGAATGACTA GCGTTAGATT TATGTTGTAA 353

6101 CTTTGATCCA GCAGGAGGAG GAGATCCAAT TCTTTATCAA CATCTATTTT GAAACTAGGT CGTCCTCCTC CTCTAGGTTA AGAAATAGTT GTAGAT~ 6151 GATTTTTTGG CCACCCAGAA GTTTATATTT TAATTCTTCC TGGCTTTGGA CT CC GGTGGGTCTT CAAATATAAA ATTAAGAAGG ACCGAAACCT 6201 ATAATTTCTC ATGTAGTAGC TTACTATTCT GGT G AACCATTTGG TATTAAAGAG TACATCATCG AATGATAAGA CCATTTTTTC TTGGTAAACC 6251 CTATATAGGT ATAGTTTGAG CAATAATAGC AATTGGATTA CTAGGTTTTA GATATATCCA TATCAAACTC GTTATTATCG TTAACCTAAT GATCCAA.AI~T 6301 TTGTTTGAGC CCACCATATA TTTACAGTAG GTATAGACGT TGATACACGA AACAAACTCG GGTGGTATAT A.AATGTCATC CATATCTGCA ACTATGTGCT 6351 GCCTATTTTA CCTCAGCAAC AATAATTATT GCCATTCCCA CAGGTGTAAA C GGATA.A.AAT GGAGTCGTTG TTATTAATAA CGGTAAGGGT GTCCACATTT 6401 AGTATTTAGC TGGTTAGCAA CTCTTCACGG AGGCTCTATT AAATGAGA.AA TCATAAATCG ACCAATCGTT GAGAAGTGCC TCCGAGATAA TTTACTCTTT 6451 CCCCATTACT ATGAGCCCTT GGATTTATCT TCTTATTCAC AGTAGGAGGA GGGGTAATGA TACTCGGGAA CCTAAATAGA AGAATAAGTG TCATCCTCCT 6501 CTAACAGGCA TCGTCTTAGC CAATTCCTCC TTAGATATTG TTCTCCACGA GATTGTCCGT AGCAGAATCG GTTAAGGAGG AATCTATAAC AAGAGGTGCT 6551 TACTTATTAT GTAGTAGCTC ATTTTCATTA TGTCCTATCA ATAGGAGCAG ATGAATAATA CATCATCGAG TA.AA.AGTAAT ACAGGATAGT TATCCTCGTC 6601 TGTTCGCTAT TATAGCAGGT TTTATCCACT GATTTCCTCT TATCTCTGGC ACAAGCGATA ATATCGTCCA AAATAGGTGA C TA.AAGGAGA ATAGAGACCG 6651 TATACCCTCC ATTCAACATG AACP.~AAA.ATT CAATTTGTAG TAATATTTAT ATATGGGAGG TAAGTTGTAC TTGTTTTTAA GTTAA.ACATC ATTATAAATA 6701 TGGAGTAAAT TTAACATTCT TCCCACAACA CTTCCTAGGT CTTGCTGGCA ACCTCATTTA AATTGTAAGA AGGGTGTTGT GAAGGATCCA GAACGACCGT 6751 TACCACGACG TTACTCAGAT TACCCAGATG CATATACTTT ATGAAATATA ATGGTGCTGC AATGAGTCTA ATGGGTCTAC GTATATGAAA TACTTTATAT 6801 ATCTCCTCTA TTGGCTCTTT AATTTCACTT GTAGCAGTAA TTATATTTCT TAGAGGAGAT AAC C GAGAA.A TTAAAGTGAA CATCGTCATT AATATAAAGA 6851 ATTTATTATC TGAGAAACAT TCGCCTCAAA ACGAGAAGTA CTATCCATTG TAA.ATAATAG ACTCTTTGTA AGCGGAGTTT TGCTCTTCAT GATAGGTAAC 6901 AATTACCCCA TACAAATGTT GAATGACTAC ACGGTTGTCC TCCACCATAT TTAATGGGGT ATGTTTACAA CTTACTGATG TGCCAACAGG AGGTGGTATA 6951 CATACATATG AAGAACCAGC ATTCGTTCAA 
GTTCAACGAA CTTTCTP~AA,A GTATGTATAC TTCTTGGTCG TAAGCAAGTT CAAGTTGCTT GAAAGATTTT 7001 CAAGAAAGGA AGGAATTGAA CCCCCATATG TTAGTTTCAA GCCAACCACA GTTCTTTCCT TCCTTAACTT GGGGGTATAC AATCAA.AGTT CGGTTGGTGT 7051 TCACCACTCT GTCACTTTCT TTATTAAGAT TCTAGTAA.AA TATATTACAC AGTGGTGAGA CAGTGA.AAGA AATAATTCTA AGATCATTTT ATATAATGTG 7101 TGCTTTGTCA AGGCAAAATC GTGAGTTTAA ATCCCACGAA TCTTAACTTA ACGAAACAGT TCCGTTTTAG CACTCAAATT TAGGGTGCTT AGAATTGAAT 7151 TAATGGCACA CCCCTCACAA TTAGGATTTC AAGACGCAGC CTCCCCAGTT ATTACCGTGT GGGGAGTGTT AATCCTAAAG TTCTGCGTCG GAGGGGTCAA 7201 ATAGAAGAAC TTATTCATTT TCACGACCAC ACATTAATAA TTGTATTTAT TATCTTCTTG AATAAGTAAA AGTGCTGGTG TGTAATTATT AACATA.AATA 7251 AATTAGCACT CTGATTCTTT ATATTATTAC AGCAATAGTA TCAACA.AA.AC TTAATCGTGA GAC TAAGA.AA TATAATAATG TCGTTATCAT AGTTGTTTTG 7301 TCACAA.ACAA ATATATTCTT GACTCCCAAG AAATTGA.AAT TGTTTGAACT AGTGTTTGTT TATATAAGAA CTGAGGGTTC TTTAACTTTA ACAAACTTGA 7351 ATTCTCCCCG CCATTATCCT TATCATAATT GCCCTACCAT CCCTACGAAT TAAGAGGGGC GGTAATAGGA ATAGTATTAA CGGGATGGTA GGGATGCTTA 7401 TTTATATCTT ATAGACGAAA TTAATGACCC CCATCTAACC ATCA.AAGCTA AAATATAGAA TATCTGCTTT AATTACTGGG GGTAGATTGG TAGTTTCGAT 7451 TAGGTCATCA ATGATACTGA AGTTATGAAT ACACAGATTA TGAGAATTTA 354

ATCCAGTAGT TACTATGACT TCAATACTTA TGTGTCTAAT AC TC TTAA.AT 7501 GGATTTGATT CTTACATAGT TCACACTCAA GACTTAACCC CAGGCCAATT C C TA.AAC TAA GAATGTATCA AGTGTGAGTT CTGAATTGGG GTCCGGTTAA 7551 TCGTTTATTA GA.AACAGATC ACCGAATAGT TGTACCCATA GAATCACCTA AGCAAATAAT CTTTGTCTAG TGGCTTATCA ACATGGGTAT CTTAGTGGAT 7601 TTCGTGTGTT AGTATCTGCA GAAGACGTCT TACATTCATG AGCTATCCCA AAGCACACAA TCATAGACGT CTTCTGCAGA ATGTAAGTAC TCGATAGGGT 7651 GCCTTAGGAA TTAA.AATAGA CGCCGTACCA GGACGTCTAA ATCAAACTGC CGGAATCCTT AATTTTATCT GCGGCATGGT CCTGCAGATT TAGTTTGACG 7701 CTTTATTATT TCCCGACCAG GCGTCTATTA TGGTCAATGT TCAGAAATTT GA.AATAATAA AGGGCTGGTC CGCAGATAAT ACCAGTTACA AGTCTTTAAA 7751 GTGGTGCTAA CCACAGTTTT ATACCCATTA TGGTAGAAGC AATTCCCCTA CACCACGATT GGTGTCAAAA TATGGGTAAT ACCATCTTCG TTAAGGGGAT 7801 GAGCATTTCG AAGCCTGATC TTCATTAATA TTAGAAGAAG CCTCACTAAG CTCGTAAAGC TTCGGACTAG AAGTAATTAT AATCTTCTTC GGAGTGATTC 7851 AAGCTAATTG GATATAGCAT TAGCCTTTTA AGC TF~~AAAT TGGTGACTCC TTCGATTAAC CTATATCGTA ATC GGA.AA.AT TCGATTTTTA ACCACTGAGG 7901 CTATCACCCT TAGTGATATG CCTCAATTAA ACCCCCACCC TTGATTTATT GATAGTGGGA ATCACTATAC GGAGTTAATT TGGGGGTGGG AACTAAATAA 7951 ATATTCCTAT TTTCATGAAT AATTTTTCTT ACTATTTTAC CT GT TATAAGGATA AAAGTACTTA TTAA.A,AAGAA TGATAAAATG GATTTTTTCA 8001 AATAAATTAT ACATTCAACA ATAATCCAAC ATTAAA.A.AAT ATCGAAAAAT TTATTTAATA TGTAAGTTGT TATTAGGTTG TAATTTTTTA TAGCTTTTTA 8051 C TA,AAC C TAA ACCCTGAAAT TGACCATGAT CATAA.ACTTC TTTGACCAAT GATTTGGATT TGGGACTTTA ACTGGTACTA GTATTTGAAG AAACTGGTTA 8101 TCCTAAGTCC TTCCCTCCTT GGAATCCCGT TAATTGCTTT AGCAATTATA AGGATTCAGG AAGGGAGGAA CCTTAGGGCA ATTAACGAAA TCGTTAATAT 8151 TTACCATGAT TAACTTTCCC AACCCCTACC AATCGCTGAT TAAATAATCG AATGGTACTA ATTGAAAGGG TTGGGGATGG TTAGCGACTA ATTTATTAGC 8201 ATTAATAACC C TC CAA.AGC T GATTTATTAA TCGATTTATT TATCAACTTA TAATTATTGG GAGGTTTCGA CTA.AATAATT AGCTAAATAA ATAGTTGAAT 8251 TACAACCCAT TAATTTTACT GGTCATAAAT GAGCTATATT ATTTACAGCA ATGTTGGGTA ATT~TGA CCAGTATTTA CTCGATATAA TAAATGTCGT 8301 CTCATATTAT TTTTAATTAC CACTAACCTT TTAGGACTTC TCCCCTACAC GAGTATAATA AAAATTAATG 
GTGATTGGAA AATCCTGAAG AGGGGATGTG 8351 CTTCACGCCC ACAACCCAAC TCTCCCTTAA TATAGCATTT GCTTTACCCT GAAGTGCGGG TGTTGGGTTG AGAGGGAATT ATATC GTA.AA CGAAATGGGA 8401 TATGATCCAT AACCGTATTA ATTGGCATAC TTAATCAACC AACAATTGCA ATACTAGGTA TTGGCATAAT TAACCGTATG AATTAGTTGG TTGTTAACGT 8451 CTAGGCCATT TCCTACCAGA AGGCACCCCC ACCCCTCTAG TACCCGTCCT GATCCGGTAA AGGATGGTCT TCCGTGGGGG TGGGGAGATC ATGGGCAGGA 8501 AATTATTATC GA.AAC TATTA GTCTATTTAT TCGACCATTA GCATTAGGGG TTAATAATAG CTTTGATAAT CAGATA.AATA AGCTGGTAAT CGTAATCCCC 8551 TTCGACTAAC CGCTAACTTA ACAGCTGGCC ACCTATTAAT ACAATTAATC AAGCTGATTG GCGATTGAAT TGTCGACCGG TGGATAATTA TGTTAATTAG 8601 GCAACCGCAG CTTTTGTCCT TACCACTACT TTACCAACCG TAACATTATT CGTTGGCGTC GAAA.ACAGGA ATGGTGATGA AATGGTTGGC ATTGTAATAA 8651 AGCATCAATT ACCCTATTCC TATTAACAAT TCTAGAAGTA GCCGTAGCAA TCGTAGTTAA TGGGATAAGG ATAATTGTTA AGATCTTCAT CGGCATCGTT 8701 TAATTCAAGC ATATGTATTT GTACTTCTAC TAAGTTTATA TCTACAAGAA ATTAAGTTCG TATACATAAA CATGAAGATG ATTCAAATAT AGATGTTCTT 8751 AACGTCTAAT GGCTCATCAA GCACACGCAT ATCATATAGT TGACCCCAGC TTGCAGATTA CCGAGTAGTT CGTGTGCGTA TAGTATATCA ACTGGGGTCG 8801 CCATGACCAC TAACCGGAGC TACAGCCGCC CTTCTAATAA CATCCGGGTT GGTACTGGTG ATTGGCCTCG ATGTCGGCGG GAAGATTATT GTAGGCCCAA 355

8851 GGCCATCTGA TTTCACTTCC ACTCATTATT ACTCCTCTAC TTAGGATTAA CCGGTAGACT A.AAGTGAAGG TGAGTAATAA TGAGGAGATG AATCCTAATT 8901 CCCTTCTACT ATTAACCATA ATTCAATGAT GACGTGATAT TATCCGAGAA GGGAAGATGA TAATTGGTAT TAAGTTACTA CTGCACTATA ATAGGCTCTT 8951 GGAACATTTC AAGGTCATCA TACACCTCCC GTCCP►~~A,AAG GTCTCCGTTA CCTTGTA.AAG TTCCAGTAGT ATGTGGAGGG CAGGTTTTTC CAGAGGCAAT 9001 TGGAATAATC TTATTTATTA CATCAGAAGT ATTCTTCTTT TTAGGCTTTT ACCTTATTAG AATAAATAAT GTAGTCTTCA TAAGAAGAAA AATCCGA.A.AA 9051 TCTGAGCCTT TTACCACTCA AGTCTTGCCC CAACCCCAGA ATTAGGAGGA AGACTCGGAA AATGGTGAGT TCAGAACGGG GTTGGGGTCT TAATCCTCCT 9101 TGTTGACCAC CAATAGGAAT TAATCCATTA GATCCATTTG AAGTACCACT ACAACTGGTG GTTATCCTTA ATTAGGTAAT C TAGGTA.AAC TTCATGGTGA 9151 TCTAAATACT GCAGTACTTT TAGCTTCTGG CGTAACAGTA ACCTGAACCC AGATTTATGA CGTCATGAAA ATCGAAGACC GCATTGTCAT TGGACTTGGG 9201 ATCATAGTTT AATAGAAGGA AAC C GA.A.AAG AAGCTATCCA AGCCCTCACC TAGTATCAAA TTATCTTCCT TTGGCTTTTC TTCGATAGGT TCGGGAGTGG 9251 CTTACTATTA TTTTAGGATT TTACTTTACA GCTCTCCA.AA TTATAGAATA GAATGATAAT A.A.AATC CTAA AATGAAATGT CGAGAGGTTT AATATCTTAT 9301 TTACGAAGCA CCATTTACAA TTGCCGATGG AATTTATGGA ACAACATTTT AATGCTTCGT GGTAAATGTT AACGGCTACC TTA.AATAC C T TGTTGTA.A.AA 9351 TCGTTGCTAC AGGATTTCAC GGTCTCCATG TTATTATTGG TTCAACATTT AGCAACGATG TC C TA.AAGTG CCAGAGGTAC AATAATAACC AAGTTGTAA.A 9401 TTAGCAATCT GTTTACTACG ACAAATTCAA TATCACTTTA CATCAGAACA AATCGTTAGA CAAATGATGC TGTTTAAGTT ATAGTGAAAT GTAGTCTTGT 9451 TCACTTTGGT TTCGAGGCTG CTGCATGATA TTGACACTTT GTGGATGTAG AGTGAAACCA AAGCTCCGAC GACGTACTAT AACTGTGAAA CACCTACATC 9501 TATGATTATT CCTTTATGTA TCCATCTATT GATGAGGCTC ATAATTACTT ATACTAATAA GGAAATACAT AGGTAGATAA CTACTCCGAG TATTAATGAA 9551 TTCTAGTATA AACTAGTACA AATGATTTCC AATCATTTAA TCTTGGTTAT AAGATCATAT TTGATCATGT TTACTAAAGG TTAGTA.AATT AGAACCAATA 9601 AATCCAAGGA A.AAGTAATGA ACCTCATCAC GTCTTCTATC GCAGCTACGG TTAGGTTCCT TTTCATTACT TGGAGTAGTG CAGAAGATAG CGTCGATGCC 9651 CCCTGATTTC CCTAATCCTT GTATTAATTG CATTTTGACT CCCATCACTA GGGACTAAAG GGATTAGGAA CATAATTAAC GTAAAACTGA GGGTAGTGAT 9701 AATCCAGATA 
ATGP.~~AAAC T ATCCCCATAT GAATGCGGCT TTGATCCCCT TTAGGTCTAT TACTTTTTGA TAGGGGTATA CTTACGCCGA AACTAGGGGA 9751 AGGAAATGCA CGCCTTCCAT TCTCCTTACG CTTCTTCCTT GTAGCTATTT TCCTTTACGT GCGGAAGGTA AGAGGAATGC GAAGAAGGAA CATCGATA.AA 9801 TATTTTTATT ATTCGACTTA GAA.ATC GC C C TCCTCCTTCC TTTACCATGA ATP►~~AAATAA TAAGCTGAAT CTTTAGCGGG AGGAGGAAGG AAATGGTACT 9851 GGCAATCAAT TATTATCACC ACTTTCCACA TTACTCTGAG CAACAATTAT CCGTTAGTTA ATAATAGTGG TGAAAGGTGT AATGAGACTC GTTGTTAATA 9901 TTTAACTTTA TTAACTTTAG GCCTTATCTA TGAATGACTT CAAGGAGGAT AAATTGAAAT AATTGAAATC CGGAATAGAT ACTTACTGAA GTTCCTCCTA 9951 TAGAATGAGC AGAATGGATA TTTAATCTAA ATA.AAGAC TA CTAATTTCGA ATCTTACTCG TCTTACCTAT AAATTAGATT TATTTCTGAT GATTAAAGCT 10001 CTTAGTAAAT TATGGTAAAA ATC CATA.AAT ATCCTATGTC TCTCATACAT GAATCATTTA ATACCATTTT TAGGTATTTA TAGGATACAG AGAGTATGTA 10051 TTTAGTCTAA ATTCAGCATT CATTTTAAGT CTTATGGGCC TCGCACTAAA AAATCAGATT TAAGTCGTAA GTAA.AATTCA GAATACCCGG AGCGTGATTT 10101 TCGTTATCAC CTTTTATCCG CACTCCTATG TTTAGAAAGT ATACTACTAA AGCAATAGTG GAAAATAGGC GTGAGGATAC AAATCTTTCA TATGATGATT 10151 CTCTATTTAT TACCATTACC ATCTGAACTC TAATAC TA.AA CTCCACTTCA GAGATAAATA ATGGTAATGG TAGACTTGAG ATTATGATTT GAGGTGAAGT 10201 TGTTCAATTA TCCCCATAAT TCTTCTCACA TTCTCAGCCT GTGAAGCTAG 356

ACAAGTTAAT AGGGGTATTA AGAAGAGTGT AAGAGTCGGA CACTTCGATC 10251 TACAGGCCTA GCCATTCTAG TAGCAACCTC ACGCTCTCAC GGTTCTGATA ATGTCCGGAT CGGTAAGATC ATCGTTGGAG TGCGAGAGTG CCAAGACTAT 10301 ACTTACAAAA CCTGAACCTT CTCCAATGCT P.►AAAATTC TA ATTCCAACAA TGA.ATGTTTT GGACTTGGAA GAGGTTACGA TTTTTAAGAT TAAGGTTGTT 10351 TCATACTCTT CCCAACCACA TGACTTACTA AC TG AATATGACCT AGTATGAGAA GGGTTGGTGT ACTGAATGAT TGTTTTTTAC TTATACTGGA 10401 GTAATTACCA CCCACAGTCT TCTAATTGCA CTACTAAGCC TACTCTTATT CATTAATGGT GGGTGTCAGA AGATTAACGT GATGATTCGG ATGAGAATAA 10451 CAAGTGAA.AT ATAGATATCG CCTGAGATTT TTCTAATCAA CTTATAGCCA GTTCACTTTA TATCTATAGC GGACTCTAAA AAGATTAGTT GAATATCGGT 10501 TCGATCCTTT ATCA.AT000C TTACTAATTC TTACATGTTG ACTTCTTCCA AGCTAGGAAA TAGTTAGGGG AATGATTAAG AATGTACAAC TGAAGAAGGT 10551 TTAATAATTT TAGCTAGCCA AAACCATATT TCCCTAGAAC CAATTATCCG AATTATTAAA ATCGATCGGT TTTGGTATAA AGGGATCTTG GTTAATAGGC 10601 ACAACGAACA TATATTACGC TTCTAATTTC CCTCCAAGCC TTCCTCATTA TGTTGCTTGT ATATAATGCG AAGATTAAAG GGAGGTTCGG AAGGAGTAAT 10651 TAGCATTCTC TGCAACCGAA ATAATTATAT TTTATATTAT ATTTGAAGCC ATCGTAAGAG ACGTTGGCTT TATTAATATA AAATATAATA TA.AAC TTC GG 10701 ACACTTATCC CAACTCTTAT TATTATTACA CGATGAGGTA ATCAAACAGA TGTGAATAGG GTTGAGAATA ATAATAATGT GCTACTCCAT TAGTTTGTCT 10751 ACGTTTAAAT GCAGGCACTT ACTTTCTATT TTATACCTTA ATTGGCTCAC TGCAAATTTA CGTCCGTGAA T GAA.AGATAA AATATGGAAT TAACCGAGTG 10801 TCCCCCTTCT TATTGCCCTC CTACTTATAC ~TAATTT AGGTACTTTA AGGGGGAAGA ATAACGGGAG GATGAATATG TTTTATTA.AA TCCATGAA.AT 10851 TCTATAATTA TTATACAATA TTCACAGCTC CCAAATCTAC TTTCATGAGC AGATATTAAT AATATGTTAT AAGTGTCGAG GGTTTAGATG AAAGTACTCG 10901 AGACAAACTA TGATGAATAG CCTGTCTCAT CGCCTTCCTT GTCA~AA,ATAC TCTGTTTGAT ACTACTTATC GGACAGAGTA GCGGAAGGAA CAGTTTTATG 10951 CTTTATATGG AATCCATCTT TGACTCCCCA AAGCCCATGT TGAAGCCCCA GAAATATACC TTAGGTAGAA ACTGAGGGGT TTCGGGTACA ACTTCGGGGT 11001 ATTGCTGGCT CAATAATCCT AGCAGCAGTA TTACTCAAAT TAGGGGGTTA TAACGACCGA GTTATTAGGA TCGTCGTCAT AATGAGTTTA ATCCCCCAAT 11051 CGGAATAATA CGAATTATTG TAATGCTAAA CCCATTAACT AA.AGAAATAG GCCTTATTAT 
GCTTAATAAC ATTACGATTT GGGTAATTGA TTTCTTTATC 11101 CCTATCCATT CTTAATTTTA GCTATTTGAG GAATTATTAT AACCAGCTCT GGATAGGTAA GAATTAAAAT CGATAAACTC CTTAATAATA TTGGTCGAGA 11151 ATCTGCTTAC GACAAACAGA CCTTAAATCT CTAATCGCTT ACTCGTCAGT TAGACGAATG CTGTTTGTCT GGAATTTAGA GATTAGCGAA TGAGCAGTCA 11201 AAGTCATATA GGCCTAGTTG CTGGAGCAAT TCTTATCCAG ACACCATGAA TTCAGTATAT CCGGATCAAC GACCTCGTTA AGAATAGGTC TGTGGTACTT 11251 GTTTTGCAGG AGCAATTACA CTTATAATCG CTCATGGTTT AATTTCATCA CA,~AACGTCC TCGTTAATGT GAATATTAGC GAGTACCA.AA TTAAAGTAGT 11301 GCCTTATTTT GTTTAGCTAA TACCAACTAT GAGCGAATTC ACAGTCGAAC CGGAATAAAA CAAATCGATT ATGGTTGATA CTCGCTTAAG TGTCAGCTTG 11351 CATACTCCTA GCCCGAGGTA TACAAATTAT CCTTCCATTA ATGGCAACCT GTATGAGGAT CGGGCTCCAT ATGTTTAATA GGAAGGTAAT TACCGTTGGA 11401 GATGATTATT TACTAGTCTA GCTAATCTTG CCCTACCCCC ATCTCCCAAC CTACTAATAA ATGATCAGAT CGATTAGAAC GGGATGGGGG TAGAGGGTTG 11451 CTTATAGGAG AACTTCTCAT CATCACATCA TTATTTAATT GATCTAACTG GAATATCCTC TTGAAGAGTA GTAGTGTAGT AATAA.ATTAA CTAGATTGAC 11501 AACCATAATC TTATCAGGTC TTGGAGTATT AATTACAGCC TCTTACTCAC TTGGTATTAG AATAGTCCAG AACCTCATAA TTAATGTCGG AGAATGAGTG 11551 TCTACATATT CTTAATAACC CAACGAGGTC CAACCCCCCA CCATATTTTA AGATGTATAA GAATTATTGG GTTGCTCCAG GTTGGGGGGT GGTATAA.AAT 357

11601 TCATTAAACC CAAATTACAC ACGAGAACAT CTCCTCATAA GTCTTCACCT AGTAATTTGG GTTTAATGTG TGCTCTTGTA GAGGAGTATT CAGAAGTGGA 11651 TATACCCGTT TTATTATTAA TATTTAAACC AGAACTTATT TGAGGATGAA ATATGGGCAA AATAATAATT ATA.AATTTGG TCTTGAATAA ACTCCTACTT 11701 CACTTTGTAC TTATAGTTTA AC CA.A.AACAT TAGATTGTGG TTC TP.►~~AAAT GTGAAACATG AATATCAAAT TGGTTTTGTA ATCTAACACC AAGATTTTTA 11751 A.AAAGTTAAA ACCTTTTTAA TTACCGAGAG AGGTCAAGGA CACGAAGAAC TTTTCAATTT TGGP.~~AAATT AATGGCTCTC TCCAGTTCCT GTGCTTCTTG 11801 TGCTAACTCT TCCTATCATG GCTCAAATCC ATGACTCACT CAGCTTCTGA ACGATTGAGA AGGATAGTAC CGAGTTTAGG TACTGAGTGA GTCGAAGACT 11851 AAGATAATAG TAATCTATTG GTCTTAGGAA CCP.~AAAACTC TTGGTGCAAC TTCTATTATC ATTAGATAAC CAGAATCCTT GGTTTTTGAG AACCACGTTG 11901 TC CAAGCA.AA AGCTATGAAT ACCATCTTCA ATTCATCATT TCTCCTAATC AGGTTCGTTT TCGATACTTA TGGTAGAAGT TAAGTAGTAA AGAGGATTAG 11951 TTCATTATCC TTATCTTTCC ACTAATAACC TCACTAAATC CT TT AAGTAATAGG AATAGAA.AGG TGATTATTGG AGTGATTTAG GATTTTTTAA 12001 TAATCCAAAT TGATTATCAT TTTACACAAA AACAGCCGTA AA.AATTTC C T ATTAGGTTTA ACTAATAGTA AAATGTGTTT TTGTCGGCAT TTTTAAAGGA 12051 TCTTTATTAG TCTTATCCCT TTATTTATTT TTCTAGATCA AGGCCTAGAA AGAAATAATC AGAATAGGGA AATA.AATAAA AAGATCTAGT TCCGGATCTT 12101 TCAATTATAA CTAATTATAA TTGAATAAAT ATTGGACCAT TTGATGTTAA AGTTAATATT GATTAATATT AACTTATTTA TAACCTGGTA AACTACAATT 12151 TATAAGCTTC AAATTTGATA TATACTCAAT TATATTTACC CCCGTAGCCC ATATTCGAAG TTTAA.AC TAT ATATGAGTTA ATATAAATGG GGGCATCGGG 12201 TTTATGTCAC CTGATCTATC CTCGAATTTG CCTTATGATA CATACACTCT A.AATACAGTG GACTAGATAG GAGCTTAAAC GGAATACTAT GTATGTGAGA 12251 GATCCTAATA TTAATCGCTT CTTCAAATAT TTACTACTAT TTTTAATCTC CTAGGATTAT AATTAGCGAA GAAGTTTATA AATGATGATA AAA.ATTAGAG 12301 AATAATTATT CTAGTTACAG CTAATAATAT ATTTCAACTA TTTATTGGAT TTATTAATAA GATCAATGTC GATTATTATA TAAAGTTGAT AA.ATAAC C TA 12351 GAGAAGGGGT TGGAATCATA TCATTCCTCC TAATTGGCTG ATGATATAGC CTCTTCCCCA ACCTTAGTAT AGTAAGGAGG ATTAACCGAC TACTATATCG 12401 CGAACAGATG CTAATACCGC TGCTCTTCAA GCTGTAATTT ATAATCGAGT GCTTGTCTAC GATTATGGCG ACGAGAAGTT CGACATTAAA TATTAGCTCA 12451 
AGGAGATATT GGATTAATCC TTAACATAAC CTGAATGGCT ATAAATTTAA TCCTCTATAA CCTAATTAGG AATTGTATTG GACTTACCGA TATTTA.AATT 12501 ATTCATGAGA AATTCAACAA CTATTTATTT TATCCP.~ TATAGACCTA TAAGTACTCT TTAAGTTGTT GATAAATAAA ATAGGTTTTT ATATCTGGAT 12551 ACATTACCTT TATTTGGTCT TATTCTAGCT GCAGCTGGAA AATCCGCACA TGTAATGGAA ATAAACCAGA ATAAGATCGA CGTCGACCTT TTAGGCGTGT 12601 ATTTGGTCTT CACCCCTGAC TTCCCTCTGC TATAGAAGGT CCCACACCAG TAAACCAGAA GTGGGGACTG A.AGGGAGACG ATATCTTCCA GGGTGTGGTC 12651 TTTCTGCCCT ACTCCACTCC AGCACAATAG TTGTTGCCGG TATCTTTCTT GGGA AA.AGAC TGAGGTGAGG TCGTGTTATC AACAACGGCC ATAGAAAGAA 12701 CTAATCCGCC TTCACCCTTT AATCCAAGAT AATCAATTAA TTTTAACAAT GATTAGGCGG AAGTGGGAA.A TTAGGTTCTA TTAGTTAATT A.A.AATTGTTA 12751 ATGCCTATGT TTAGGAGCAC TAACTACTCT TTTCACCGCA ACATGTGCAC TACGGATACA AATCCTCGTG ATTGATGAGA AA.AGTGGCGT TGTACACGTG 12801 TTACCCAAAA TGATATTA.AA AAAATTATTG CCTTCTCAAC ATCAAGCCAA AATGGGTTTT ACTATAATTT TTTTAATAAC GGAAGAGTTG TAGTTCGGTT 12851 CTTGGATTAA TAATAGTAAC AATTGGCCTT AATCAACCCC AACTTGCTTT GAACCTAATT ATTATCATTG TTAACCGGAA TTAGTTGGGG TTGAACGAAA 12901 CCTCCACATT TGTACTCATG CCTTCTTCAA AGCCATACTC TTTCTCTGCT GGAGGTGTAA ACATGAGTAC GGAAGAAGTT TCGGTATGAG A.AAGAGAC GA 12951 CAGGTTCTAT TATTCACAGC CTTAATGATG AACAAGACAT TCGCA.AA.ATA 358

GTCCAAGATA ATAAGTGTCG GAATTACTAC TTGTTCTGTA AGCGTTTTAT 13001 GGAGGACTTC ACA.AACTTCT ACCATTCACC TCATCTTCCT TAATCATCGG CCTCCTGAAG TGTTTGAAGA TGGTAAGTGG AGTAGAAGGA ATTAGTAGCC 13051 AAGTTTAGCC CTTACAGGTA TACCTTTCTT ATCAGGTTTC TTCTCA,AAAG TTCAAATCGG GAATGTCCAT ATGGAAAGAA TAGTCCAA.AG AAGAGTTTTC 13101 ACACTATCAT TGAATCCTTA AACACTTCAC ACCTAAACGC CTGAGCCCTA TGTGATAGTA ACTTAGGAAT TTGTGAAGTG TGGATTTGCG GACTCGGGAT 13151 ATCCTTACCC TTATCGCAAC ATCATTCACA GCTATTTATA GCCTACGCCT TAGGAATGGG AATAGCGTTG TAGTAAGTGT C GAT.AAATAT CGGATGCGGA 13201 TATATTCTTC ACATTAATAA ATTTTCCACG ATTTAATTCA CTACCCCTAA ATATAAGAAG TGTAATTATT TAAAAGGTGC TAAATTAAGT GATGGGGATT 13251 TCAATGAA.AA TAACCCAATA ATAATTAATC CAATCAAACG CCTAGCTTAT AGTTACTTTT ATTGGGTTAT TATTAATTAG GTTAGTTTGC GGATCGAATA 13301 GGAAGTATCC TAGCTGGCCT TATTATTACA TCAAATTTAA CTCCAACAA.A CCTTCATAGG ATCGACCGGA ATAATAATGT AGTTTAAATT GAGGTTGTTT 13351 AACCCAAATC ATAACAATAT CTCCTTTACT AA~ACTCTCC ACCCTATTAC TTGGGTTTAG TATTGTTATA GAGGAAATGA TTTTGAGAGG TGGGATAATG 13401 TTACAATTAT TGGTCTCTTA CTAGCCCTAG AATTAGCTAA TTTAACTAAC AATGTTAATA ACCAGAGAAT GATCGGGATC TTAATCGATT A.AATTGATTG 13451 ACTCAATTCA AAATA.AATC C ~TACCCCTTAT ACTCACCATT TCTCCAACAT TGAGTTAAGT TTTATTTAGG ATGGGGAATA TGAGTGGTAA AGAGGTTGTA 13501 ACTCGGCTAC TTTCCACAAG TTATCCATCG CCTTATACCA AAAATTAACT TGAGCCGATG A.A.AGGTGTTC AATAGGTAGC GGAATATGGT TTTTAATTGA 13551 TAA.ACTGAGC CCAACACATC TCCACCCATC TAATTGATCA AACATGA.AAT ATTTGACTCG GGTTGTGTAG AGGTGGGTAG ATTAACTAGT TTGTACTTTA 13601 G TTG GACCP.~~AAAG CACTTTTATT CAACAA~TTC CACTGATTAA CTTTTTTAAC CTGGTTTTTC GTGAAAATAA GTTGTTTAAG GTGACTAATT 13651 ACTATCTACC CAAC CACAAC AAGGTTATAT TAAAGTTTAT CTTATATTAC TGATAGATGG GTTGGTGTTG TTCCAATATA ATTTCA.AATA GAATATAATG 13701 TTTTCCTTAC ATTAACCTTA GCTCTACTAA CTTCATTAAC CTAATCACAC AAAAGGAATG TAATTGGAAT CGAGATGATT GAAGTAATTG GATTAGTGTG 13751 GCAA.AGTTCC CCAAGATAAT CCTCGAGTTA ACTCTAACAC CACAAATA.A~ CGTTTCAAGG GGTTCTATTA GGAGCTCAAT TGAGATTGTG GTGTTTATTT 13801 GTTAATAATA ACACTCACCC AC TTA.A.AAC T AATAATCACC CACCATCAGC 
CAATTATTAT TGTGAGTGGG TGAATTTTGA TTATTAGTGG GTGGTAGTCG 13851 ATATAATA.AA GCTACCCCTA CAAAATCCCC ACGAACTATC TCCATATCAT TATATTATTT CGATGGGGAT GTTTTAGGGG TGCTTGATAG AGGTATAGTA 13901 TCATCTCCTC TACCCCTGCC CAATTTAACT CAAATCATTC AACTAAAAAA AGTAGAGGAG ATGGGGACGG GTTAAATTGA GTTTAGTAAG TTGATTTTTT 13951 TATTTACCAA CP►~~AAAC TAA AATTAC TA.AA TAAAGTATAA TATACAGTAA ATAA.ATGGTT GTTTTTGATT TTAATGATTT ATTTCATATT ATATGTCATT 14001 TACAGATCAA TTACCCCATG ATTCAGGATA AGGCTCAGCA GCAAGCGCTG ATGTCTAGTT AATGGGGTAC TAAGTCCTAT TCCGAGTCGT CGTTCGCGAC 14051 CTGTATAAGC AAATACTACC AATATTCCCC C CAGATAA.AT Tp~~AAATAAA GACATATTCG TTTATGATGG TTATAAGGGG GGTCTATTTA ATTTTTATTT 14101 ACCAGTGATA P.~3A.AAGATC C TCCATGCCCC ACTAATAATC CACACCCCAC TGGTCACTAT TTTTTCTAGG AGGTACGGGG TGATTATTAG GTGTGGGGTG 14151 CCCAGCAGCC ATAACCAACC CTAACGCAGC ATAATAAGGA GAAGGATTAG GGGTCGTCGG TATTGGTTGG GATTGCGTCG TATTATTCCT CTTCCTAATC 14201 AAGCTACCCC TATTAAACCT AAAACTAAAC AAATTATTAT TP.~~AAACATA TTCGATGGGG ATAATTTGGA TTTTGATTTG TTTAATAATA ATTTTTGTAT 14251 AAATATACCA TTATTCCTAC CTGGACTCTA ACCAAGACCA ATAACTTGAA TTTATATGGT AATAAGGATG GACCTGAGAT TGGTTCTGGT TATTGAACTT 14301 AAACTATCGT TGTTCATTCA ACTATAAGAA TTTATGGCCA CAAATATCCG TTTGATAGCA ACAAGTAAGT TGATATTCTT AAATACCGGT GTTTATAGGC 359

14351 p~~AAAC C CAC CCACTACTAA AAATTGTTAA C CA.AAC C TTA ATTGACCTCC TTTTTGGGTG GGTGATGATT TTTAACAATT GGTTTGGAAT TAACTGGAGG 14401 CAACTCCGTC TAATATTTCA GTTTGATGAA ACTTCGGCTC ACTACTAGGA GTTGAGGCAG ATTATAAAGT CAAACTACTT TGAAGCCGAG TGATGATCCT 14451 TTATGCCTAA TTATCCAAAT CCTTACAGGA CTTTTCCTAG CAATACATTA AATACGGATT AATAGGTTTA GGAATGTCCT GA.AA.AGGATC GTTATGTAAT 14501 TATCGCAGAC GTCTCCATAG CCTTTTCCTC AGTAATCCAT ATCTGCCGCG ATAGCGTCTG CAGAGGTATC GGAAAAGGAG TCATTAGGTA TAGACGGCGC 14551 ATGTTAATTA TGGCTGACTT ATCCATAATA TTCATGCTAA TGGAGCCTCA TACAATTAAT ACCGACTGAA TAGGTATTAT AAGTACGATT ACCTCGGAGT 14601 TTATTTTTTA TCTGCGTATA TTTACATATT GCCCGAGGAC TTTACTACGG AAT T AGACGCATAT A.AATGTATAA CGGGCTCCTG A.AATGATGCC 14651 CTCCTACCTT TATAAAGA.AA CATGAAATAT TGGAGTAATT CTATTATTCC GAGGATGGAA ATATTTCTTT GTACTTTATA ACCTCATTAA GATAATAAGG 14701 TTTTAATAGC CACAGCCTTC GTAGGCTATG TTTTACCATG AGGACAAATA A.AAATTATC G GTGTCGGAAG CATCCGATAC ~,AAATGGTAC TCCTGTTTAT 14751 TCCTTCTGAG GCGCCACAGT TATTACCAAT CTTTTATCCG CCTTTCCTTA AGGAAGACTC CGCGGTGTCA ATAATGGTTA GAAAATAGGC GGAAAGGAAT 14801 TGTTGGAAAT ATATTAGTCC AATGAATTTG AGGTGGTTTT TCAGTAGATA ACAACCTTTA TATAATCAGG TTAC TTAA.AC TCCACCAAAA AGTCATCTAT 14851 ATGCTACCCT AACACGATTC TTCGCATTCC ATTTCCTCCT ACCTTTTCTA TACGATGGGA TTGTGCTAAG AAGCGTAAGG TAAAGGAGGA TGGAAAAGAT 14901 ATTACAGCGC TAATAATAAT CCATATCCTC TTCCTACATG AAACAGGCTC TAATGTCGCG ATTATTATTA GGTATAGGAG AAGGATGTAC TTTGTCCGAG 14951 A.AAC AATC C T ATAGGACTTA ATTCTGATAT AGAT~~AAATT TCTTTCCACC TTTGTTAGGA TATCCTGAAT TAAGACTATA TCTATTTTAA AGAAAGGTGG 15001 CCTACTTTTC CTATAAAGAT ACACTCGGCT TTTTCACCAT AATTATAATC GGATGAA.AAG GATATTTCTA TGTGAGCCGA AAA.AGTGGTA TTAATATTAG 15051 CTAGGAGTCC TAACCTTACT CCTTCCTAAT CTATTAGGAG ATGCTGAAAA GATCCTCAGG ATTGGAATGA GGAAGGATTA GATAATCCTC TACGACTTTT 15101 CTTTATCCCT GCTAACCCTC TTGTCACCCC TCCCCATATT A.AAC C TGAAT GAAATAGGGA CGATTGGGAG AACAGTGGGG AGGGGTATAA TTTGGACTTA 15151 GATATTTCCT ATTCGCCTAT GCCATTCTCC GATCAATCCC TAATAAATTA CTATAAAGGA TAAGCGGATA CGGTAAGAGG CTAGTTAGGG ATTATTTAAT 15201 
GGAGGAGTAC TAGCCCTTCT ATTCTCCATC CTTATCCTTA TATTGATTCC CCTCCTCATG ATCGGGAAGA TAAGAGGTAG GAATAGGAAT ATAACTAAGG 15251 ACTACTACAT AC C TC TA.AAC AACGAAGTAG CACCTTTCGC CCACTTACAC TGATGATGTA TGGAGATTTG TTGCTTCATC GTGGAAAGCG GGTGAATGTG 15301 AAATTTTCTT TTGAATCCTT GTAACCAATA TACTAATCTT AACC TGA.ATT TTTAA.AAGAA AACTTAGGAA CATTGGTTAT ATGATTAGAA TTGGACTTAA 15351 GGAGGACAAC CAGTTGAACA ACCATTTATT CTTATCGGAC AAATTGCATC CCTCCTGTTG GTCAACTTGT TGGTAAATAA GAATAGCCTG TTTAACGTAG 15401 TATTACCTAT TTTTCCTTAT TTCTTATTGT AATCCCACTC ACAGGCTGGT ATAATGGATA ~GGAATA AAGAATAACA TTAGGGTGAG TGTCCGACCA 15451 GAGAAAATAA AATCCTCAGC CTA.AATTGTT TTGGTAGCTT AACTTAATAA CTCTTTTATT TTAGGAGTCG GATTTAACAA AACCATCGAA TTGAATTATT 15501 AGCATCGACC TTGTAAGTCG AAGACCGGAG GTTCATGCCC TC C C CAAA.AC TCGTAGCTGG AACATTCAGC TTCTGGCCTC CAAGTACGGG AGGGGTTTTG 15551 AAATCAGGGG AAGGAGGGTT A.AAC TC C TGC CCTTGGCTCC CAAAGCCAAG TTTAGTCCCC TTCCTCCCAA TTTGAGGACG GGAACCGAGG GTTTCGGTTC 15601 ATTCTGCCCA AACTGCCCCC TGATATGTCA TTAAATCACG AAAATAAATG TAAGACGGGT TTGACGGGGG ACTATACAGT AATTTAGTGC TTTTATTTAC 15651 AAAATTTTAC P►AAAAGTAAG TCAGAGTGAC ATATTAATGA CATAGCTCAC TTTTAAAATG TTTTTCATTC AGTCTCACTG TATAATTACT GTATCGAGTG 15701 ATTCCTTAAT ATAATACATT ACTTAACTCG ACTAATCAAC ATTAATAGAC 360

TAAGGAATTA TATTATGTAA TGAATTGAGC TGATTAGTTG TAATTATCTG 15751 TATTCCCTAC TACTATTATT ATCTATGCTT AATCCCCATT AATCTATATT ATAAGGGATG ATGATAATAA TAGATACGAA TTAGGGGTAA TTAGATATAA 15801 CCACTATATC ATAACATACT ATGCTTAATC CTCATTAATA TACTAACCAC GGTGATATAG TATTGTATGA TACGAATTAG GAGTAATTAT ATGATTGGTG 15851 TATTTCATTA CATTTTATTC TTTAGCCCCT ATA.AAGTTAA AATCAATATT ATA.AAGTAAT GTAAAATAAG AAATCGGGGA TATTTCAATT TTAGTTATAA AAGTATA.AAT 15901 TCCATATCAT GA.AATTATTT ATTTAACCCT CAGTTACTTA AGGTATAGTA C TTTAATA.AA TAAATTGGGA GTCAATGAAT TTCATATTTA 15951 CATGCGGGTT GGTAAGAACA TCACATCCCG CTATTGTAAG TTG GTACGCCCAA CCATTCTTGT AGTGTAGGGC GATAACATTC TTTTTTTAAC 16001 CTCTATTTGT GGCACTGTAC TCGATTAATC CCTATCAATT GATC~TT GAGATAAACA CCGTGACATG AGCTAATTAG GGATAGTTAA CTAGTTTTAA 16051 GGCATCTGAT TAATGCTTGA AATCCTTTAA TCCTTAATCG CGTCAAGAAT CCGTAGACTA ATTACGAACT TTAGGAAATT AGGAATTAGC GCAGTTCTTA 16101 GCCAGATCCC CTAGCTCCCT TTAATGGCAA TTTCGTCCTT GATCGTCTCA CGGTCTAGGG GATCGAGGGA AATTACCGTT AAAGCAGGAA CTAGCAGAGT 16151 AGATTTATCG TCCGCCCTGT TTTTTTTTTT GGGGATGAAG CAGTTACTAA TCTAAATAGC AGGCGGGACA CCCCTACTTC GTCAATGATT 16201 GCCCGGGAGG GCTGATCTAG GACAC TGA.AA TAAACTTGAA TCCACCTCGA CGGGCCCTCC CGACTAGATC CTGTGACTTT ATTTGAACTT AGGTGGAGCT 16251 CATCTACTTA AAATACTCAT TACTTTCATT CATGAATTGT AATTGTCAAG GTAGATGAAT TTTATGAGTA ATGAA.AGTAA GTACTTAACA TTAACAGTTC 16301 TTGACCATAA CTGAGAGGGA TAGAGA.AATT GACGCCATAG GCGACAAGTT AACTGGTATT GACTCTCCCT ATCTCTTTAA CTGCGGTATC CGCTGTTCAA 16351 TCGATTTTTT TGATTAATGA AGCTATGGTT T TAC ATTCTCTTAA AGCT ACTAATTACT TCGATACCAA ATTTTTTATG TAAGAGAATT 16401 TCCCCATCAA AA.AGC GATTC GT TA TTAATGTGAA GCGCATTGAA AGGGGTAGTT TTTCGCTAAG CATTTTTTAT AATTACACTT CGCGTAACTT 16451 TAATCCTAAT ACATTCTTCA CTTTACTTGG CATAAATTTA TTATTATTAA ATTAGGATTA TGTAAGAAGT GAAATGAACC GTATTTAAAT AATAATAATT 16501 GATACCCCCT AGGTTGTAAA AATTTGAAGC C GC TTP.~~AAA ~TAA.ACA CTATGGGGGA TCCAACATTT TTA.AAC TTC G GCGAATTTTT TTTTATTTGT 16551 TTTTTTGGTA AAA.ACCCCCC TCCCCCTAA.A TATACACGGA CTCCTCGAAA CCAT TTTTGGGGGG AGGGGGATTT ATATGTGCCT 
GAGGAGCTTT
16601 AACCCCTAAA ACGAGGGCCG GACATATATC TTTGAATTAG CATGCGAAAT
      TTGGGGATTT TGCTCCCGGC CTGTATATAG AAACTTAATC GTACGCTTTA
16651 ATATTCTATA TATATAGTGT TACACTATGA T
      TATAAGATAT ATATATCACA ATGTGATACT A

tRNA    1..70
        product = tRNA-Phe
rRNA    71..1020
        product = 12S ribosomal RNA
tRNA    1021..1092
        product = tRNA-Val
rRNA    1093..2756
        product = 16S ribosomal RNA
tRNA    2757..2831
        product = tRNA-Leu
gene    2832..3806
        gene = ND1
        product = NADH dehydrogenase subunit 1
tRNA    3808..3876
        product = tRNA-Ile
tRNA    3875..3946
        product = tRNA-Gln
tRNA    3947..4015
        product = tRNA-Met
gene    4016..5059
        gene = ND2
        product = NADH dehydrogenase subunit 2
tRNA    5059..5129
        product = tRNA-Trp
tRNA    complement (5131..5199)
        product = tRNA-Ala
tRNA    complement (5201..5272)
        product = tRNA-Asn
tRNA    complement (5306..5372)
        product = tRNA-Cys
tRNA    complement (5374..5443)
        product = tRNA-Tyr
gene    5445..6998
        gene = CO1
        product = cytochrome c oxidase subunit 1
tRNA    complement (7001..7071)
        product = tRNA-Ser
tRNA    7076..7145
        product = tRNA-Asp
gene    7153..7843
        gene = CO2
        product = cytochrome c oxidase subunit 2
tRNA    7844..7916
        product = tRNA-Lys
gene    7918..8085
        gene = ATP8
        product = ATP synthase F0 subunit 8
gene    8076..8759
        gene = ATP6
        product = ATP synthase F0 subunit 6
gene    8759..9544
        gene = CO3
        product = cytochrome c oxidase subunit 3
tRNA    9547..9616
        product = tRNA-Gly
gene    9617..9967
        gene = ND3
        product = NADH dehydrogenase subunit 3
tRNA    9966..10035
        product = tRNA-Arg
gene    10036..10332
        gene = ND4L
        product = NADH dehydrogenase subunit 4L
gene    10326..11706
        gene = ND4
        product = NADH dehydrogenase subunit 4
tRNA    11707..11775
        product = tRNA-His
tRNA    11776..11842

        product = tRNA-Ser
tRNA    11843..11914
        product = tRNA-Leu
gene    11915..13744
        gene = ND5
        product = NADH dehydrogenase subunit 5
gene    complement (13740..14261)
        gene = ND6
        product = NADH dehydrogenase subunit 6
tRNA    complement (14262..14331)
        product = tRNA-Glu
gene    14334..15479
        gene = CYTB
        product = cytochrome b
tRNA    15478..15551
        product = tRNA-Thr
tRNA    complement (15554..15622)
        product = tRNA-Pro
D-Loop  15616..16681
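
The coordinates in the feature table above are 1-based and inclusive, so a feature spanning start..end covers end - start + 1 positions, and adjacent features can share positions (for example ATP8 and ATP6). The following is a minimal sketch, not part of the original analysis, of how such a table might be read and checked. The function names (parse_features, length, overlaps) and the SAMPLE text are illustrative assumptions; the sample coordinates are copied from the table above.

```python
import re

# Matches a feature key followed by its coordinates, e.g.
# "gene 8076..8759" or "tRNA complement (5131..5199)".
# Qualifiers such as "gene = ATP8" or "product = tRNA-Gly" are skipped
# because no coordinate pair follows the key.
FEATURE_RE = re.compile(
    r"\b(tRNA|rRNA|gene|D-Loop)\s+(complement\s*\(\s*)?(\d+)\.\.(\d+)"
)


def parse_features(text):
    """Return (key, start, end, strand) tuples in order of appearance.

    Coordinates are 1-based and inclusive; strand is "-" for features
    annotated as complement and "+" otherwise.
    """
    features = []
    for key, comp, start, end in FEATURE_RE.findall(text):
        strand = "-" if comp else "+"
        features.append((key, int(start), int(end), strand))
    return features


def length(feature):
    """Number of positions covered by a feature (inclusive coordinates)."""
    _, start, end, _ = feature
    return end - start + 1


def overlaps(features):
    """Return (a, b, n) for each pair of consecutive features sharing n positions."""
    shared = []
    for a, b in zip(features, features[1:]):
        n = a[2] - b[1] + 1  # end of a minus start of b, inclusive
        if n > 0:
            shared.append((a, b, n))
    return shared


# Illustrative excerpt; the coordinates come from the feature table above.
SAMPLE = """
gene 7918..8085 gene = ATP8
gene 8076..8759 gene = ATP6
gene 8759..9544 gene = CO3
tRNA 9547..9616 product = tRNA-Gly
tRNA complement (14262..14331) product = tRNA-Glu
"""

if __name__ == "__main__":
    feats = parse_features(SAMPLE)
    for f in feats:
        print(f, length(f))
    for a, b, n in overlaps(feats):
        print("overlap of", n, "positions between", a[:3], "and", b[:3])
```

Because the pattern keys on a feature name followed directly by coordinates, it works on either the flattened or the line-per-feature layout of the table. It only compares consecutive entries, which is sufficient here because the table is listed in coordinate order.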

Odontaspis noronhai mitochondrion, complete genome

1 GCTAGTGTAG CTTAATTCAA AGCATGGCAC TGAAGATGCT AATATGTAAA CGATCACATC GAATTAAGTT TCGTACCGTG ACTTCTACGA TTATACATTT 51 ATAAAAATTT TCCACAAGCA TTAAGGTTTG GTCCTGGCCT CAGTATTAAT TATTTTTAAA AGGTGTTCGT AATTCCAAAC CAGGACCGGA GTCATAATTA 101 TGTAGCCAAA ATTATACATG CAAGTTTCAG CATCCCCGTG AGAATGCCCT ACATCGGTTT TAATATGTAC GTTCAAAGTC GTAGGGGCAC TCTTACGGGA 151 AATTATTCTA TTAATTAATT AGGAGCAGGT ATCAGGCACA CATACGTTAG TTAATAAGAT AATTAATTAA TCCTCGTCCA TAGTCCGTGT GTATGCAATC 201 CCCAAGACAC CTTGCTAAGC CACACCCCCA AGGGATTTCA GCAGTAATAA GGGTTCTGTG GAACGATTCG GTGTGGGGGT TC C C TA.AAGT CGTCATTATT 251 ATATTGATTT TATAAGCACA AGCTTGAATC AGTTAAAGTT GATAGAGTTG TATAACTAAA ATATTCGTGT TCGAACTTAG TCAATTTCAA CTATCTCAAC 301 GTAAATCTCG TGCCAGCCAC CGCGGTTATA CGAGTAACTC ATATTAATAC CATTTAGAGC ACGGTCGGTG GCGCCAATAT GCTCATTGAG TATAATTATG 351 TTCCCGGCGT AAAGAGTGAT TTAAGGAATA TCTATAATAA CTAAAGTTAA AAGGGCCGCA TTTCTCACTA AATTCCTTAT AGATATTATT GATTTCAATT 401 GACCTCATCA AGCTGTTACA CGCACCCATG AACAGAACTA TCAATAACGA CTGGAGTAGT TCGACAATGT GCGTGGGTAC TTGTCTTGAT AGTTATTGCT 451 AAGTGACTTT ATTCCACTAG AA.ATCTTGAC GTCACGACAG TTAGACCCCA TTCACTGAAA TAAGGTGATC TTTAGAACTG CAGTGCTGTC AATCTGGGGT 501 AACTAGGATT AGATACCCTA CTATGTCTAA CCACAAACTT A.AACAATAAT TTGATCCTAA TCTATGGGAT GATACAGATT GGTGTTTGAA TTTGTTATTA 551 TCACTATATT GTTCGCCAGA GTACTACAAG CGCTAGCTTA AAACCCAAAG AGTGATATAA CAAGCGGTCT CATGATGTTC GCGATCGAAT TTTGGGTTTC 601 GACTTGGCGG TGTCCCAAAC CCACCTAGAG GAGCCTGTTC TGTAACCGAT CTGAACCGCC ACAGGGTTTG GGTGGATCTC CTCGGACAAG ACATTGGCTA 651 AATCCCCGTT A.AAC C TCAC C ACTTCTAGCC ATCCCCGTCT ATATACCGCC TTAGGGGCAA TTTGGAGTGG TGAAGATCGG TAGGGGCAGA TATATGGCGG 701 GTCGTCAGCT CACCCTGTGA AAGC TP.,~~AAA GTAAGCAAAA AGAATCAACT CAGCAGTCGA GTGGGACACT TTCGATTTTT CATTCGTTTT TCTTAGTTGA 751 CCCACACGTC AGGTCGAGGT GTAGCAAATG AAGTGGATAG AAATGGGCTA GGGTGTGCAG TCCAGCTCCA CATCGTTTAC TTCACCTATC TTTACCCGAT 363

  801 CATTTTCTAT AAAGAAAATA CGAATGATAA ACTGAAAAAA ATTACCTAAA
      GTAAAAGATA TTTCTTTTAT GCTTACTATT TGACTTTTTT TAATGGATTT
  851 GGTGGATTTA GTAGTAAGAA AAGATTAGAA AACTTCTCTG AAACCGGCTC
      CCACCTAAAT CATCATTCTT TTCTAATCTT TTGAAGAGAC TTTGGCCGAG
  901 TGGGACGCGC ACACACCGCC CGTCACTCTC CTTTAAAAAA CCTACTTATT
      ACCCTGCGCG TGTGTGGCGG GCAGTGAGAG GAAATTTTTT GGATGAATAA
  951 TCTAATTAAA AGAAAACTAT TAAGAGGAGG CAAGTCGTAA CATGGTAAGT
      AGATTAATTT TCTTTTGATA ATTCTCCTCC GTTCAGCATT GTACCATTCA
 1001 GTACTGGAAA GTGCACTTGG AATCAAAATG TGGCTAAATT AGTAAAGCAC
      CATGACCTTT CACGTGAACC TTAGTTTTAC ACCGATTTAA TCATTTCGTG
 1051 CTCCCTTACA CTGAGAAGAC ACCCGTGCAA TTCGGATCAT TTTGAACATT
      GAGGGAATGT GACTCTTCTG TGGGCACGTT AAGCCTAGTA AAACTTGTAA
 1101 AAAGCTAGCC TGTACACCTA CCCCAAACCC AATCTTATCA ATTACCTTAC
      TTTCGATCGG ACATGTGGAT GGGGTTTGGG TTAGAATAGT TAATGGAATG
 1151 ACACTAATCC CTAACTAAAA CATTTTACCT TTTTAGTATG GGTGACAGAA
      TGTGATTAGG GATTGATTTT GTAAAATGGA AAAATCATAC CCACTGTCTT
 1201 CAAAAACTTC AGCGCAATAG ACTATGTACC GCAAGGGAAA GCTGAAAAAG
      GTTTTTGAAG TCGCGTTATC TGATACATGG CGTTCCCTTT CGACTTTTTC
 1251 AAATGAAATA AATAATTAAA GTAATAAAAA GCAGAGATTT AACCTCGTAC
      TTTACTTTAT TTATTAATTT CATTATTTTT CGTCTCTAAA TTGGAGCATG
 1301 CTTTTGCATC ATGATTTAGC TAGAAAAACT AGACAAAGAG ATCTTAAGCC
      GAAAACGTAG TACTAAATCG ATCTTTTTGA TCTGTTTCTC TAGAATTCGG
 1351 TATCCTCCCG AAACTAAACG AGCTACTCCG AAGCAGCACA ATTTAGAGCC
      ATAGGAGGGC TTTGATTTGC TCGATGAGGC TTCGTCGTGT TAAATCTCGG
 1401 AACCCGTCTC TGTAGCAAAA GAGTGGGAAG ACTTCCGAGT AGCGGTGACA
      TTGGGCAGAG ACATCGTTTT CTCACCCTTC TGAAGGCTCA TCGCCACTGT
 1451 AGCCTATCGA GTTTAGTGAT AGCTGGTTGC CCAAGAAAAG AACTTCAATT
      TCGGATAGCT CAAATCACTA TCGACCAACG GGTTCTTTTC TTGAAGTTAA
 1501 CTGCATTAAT TCCTTCACCA CCAAAAATTC ATTTTATTAA GGTTAAACAT
      GACGTAATTA AGGAAGTGGT GGTTTTTAAG TAAAATAATT CCAATTTGTA
 1551 AAAAATTAAT AGTTATTCAG AAGAGGTACA GCCCTTCTGA ACCAAGATAC
      TTTTTAATTA TCAATAAGTC TTCTCCATGT CGGGAAGACT TGGTTCTATG
 1601 AACTTTTCAA GGCGGAAAAA TGATCATATT TACCAAGGTT TTTACCCCAG
      TTGAAAAGTT CCGCCTTTTT ACTAGTATAA ATGGTTCCAA AAATGGGGTC
 1651 TGGGCCCAAA AGCAGCCACC TGTAAAGTAA GCGTCACAGC TCCAGTCTTA
      ACCCGGGTTT TCGTCGGTGG ACATTTCATT CGCAGTGTCG AGGTCAGAAT
 1701 CAATAACCTA TAATTTAGAT ATTCTTCTCA TATTCCCTTA ACTATATTGG
      GTTATTGGAT ATTAAATCTA TAAGAAGAGT ATAAGGGAAT TGATATAACC
 1751 GCTATTTTAT AAAATTATAA AAGAACTTAT GCTAAAATGA GTAATAAGAG
      CGATAAAATA TTTTAATATT TTCTTGAATA CGATTTTACT CATTATTCTC
 1801 AACAAACCTC TCCAGACACA AGTGTATGTC AGAAAGAATT AAATCACTAA
      TTGTTTGGAG AGGTCTGTGT TCACATACAG TCTTTCTTAA TTTAGTGATT
 1851 CAATTAAACG AACCCAAATT GAGGCCATTA TATTAATATT ACCTTAACCA
      GTTAATTTGC TTGGGTTTAA CTCCGGTAAT ATAATTATAA TGGAATTGGT
 1901 GAAAATCTTA TTATAATATT CGTTAACCCT ACACAGGAGC GTCTTAAGGA
      CTTTTAGAAT AATATTATAA GCAATTGGGA TGTGTCCTCG CAGAATTCCT
 1951 AAGATTTAAA GAAAATAAAG GAACTCGGCA AACACAAACT CCGCCTGTTT
      TTCTAAATTT CTTTTATTTC CTTGAGCCGT TTGTGTTTGA GGCGGACAAA
 2001 ACCAAAAACA TCGCCTCTTG AATATTATAA GAGGTCCCGC CTGCCCTGTG
      TGGTTTTTGT AGCGGAGAAC TTATAATATT CTCCAGGGCG GACGGGACAC
 2051 ACAATGTTTT AACGGCCGCG GTATTTTGAC CGTGCAAAGG TAGCGTAATC
      TGTTACAAAA TTGCCGGCGC CATAAAACTG GCACGTTTCC ATCGCATTAG
 2101 ACTTGTCTTT TAAATGAAGA CCCGTATGAA AGGCATCACG AGAGTTTAAC
      TGAACAGAAA ATTTACTTCT GGGCATACTT TCCGTAGTGC TCTCAAATTG
 2151 TGTCTCTATT TTCTAATCAA TGAAATTGAT CTACTCGTGC AGAAGCGAGT

      ACAGAGATAA AAGATTAGTT ACTTTAACTA GATGAGCACG TCTTCGCTCA
 2201 ATAATTACAT TAGACGAGAA GACCCTATGG AGCTTCAAAC ACATAAATTA
      TATTAATGTA ATCTGCTCTT CTGGGATACC TCGAAGTTTG TGTATTTAAT
 2251 ATTATGTAAA CTAATTACTC CACGGATATA AACAAAAATA TAATATTTTT
      TAATACATTT GATTAATGAG GTGCCTATAT TTGTTTTTAT ATTATAAAAA
 2301 AATTTAACTG TTTTTGGTTG GGGTGACCAA GGGGAAAAAC AAATCCCCCT
      TTAAATTGAC AAAAACCAAC CCCACTGGTT CCCCTTTTTG TTTAGGGGGA
 2351 TATCGACTAA GTACTCAAGT ACTTAAAAAT TAGAGTTACA ACTCTAATTA
      ATAGCTGATT CATGAGTTCA TGAATTTTTA ATCTCAATGT TGAGATTAAT
 2401 ATAAAACATT TATCGAAAAA TGACCCAGGA TTTCCTGATC AATGAACCAA
      TATTTTGTAA ATAGCTTTTT ACTGGGTCCT AAAGGACTAG TTACTTGGTT
 2451 GTTACCCTAG GGATAACAGC GCAATCCTTT CCCAGAGTCC TTATCGCCGA
      CAATGGGATC CCTATTGTCG CGTTAGGAAA GGGTCTCAGG AATAGCGGCT
 2501 AAGGGTTTAC GACCTCGATG TTGGATCAGG ACATCCTAAT AATGCAACCG
      TTCCCAAATG CTGGAGCTAC AACCTAGTCC TGTAGGATTA TTACGTTGGC
 2551 TTATTAAGGG TTCGTTTGTT CAACGATTAA TAGTCCTACG TGATCTGAGT
      AATAATTCCC AAGCAAACAA GTTGCTAATT ATCAGGATGC ACTAGACTCA
 2601 TCAGACCGGA GAAATCCAGG TCAGTTTCTA TCTATGAATT AATTTTTCCT
      AGTCTGGCCT CTTTAGGTCC AGTCAAAGAT AGATACTTAA TTAAAAAGGA
 2651 AGTACGAAAG GACCGGAAAA ATGGAGCCAA TACCCAAGGC ACGCTCCATT
      TCATGCTTTC CTGGCCTTTT TACCTCGGTT ATGGGTTCCG TGCGAGGTAA
 2701 TTCATCTATT GAAATAAACT AAAATAGATA AGAAAAAATT ATCTATCTGC
      AAGTAGATAA CTTTATTTGA TTTTATCTAT TCTTTTTTAA TAGATAGACG
 2751 CCAAGAAAAG GGTTGTTGAG GTGGCAGAGC CTGGTAAGTG CAAAAGACCT
      GGTTCTTTTC CCAACAACTC CACCGTCTCG GACCATTCAC GTTTTCTGGA
 2801 AAGCTCTTTA ATTCAGAGGT TCAAATCCTC TCCTTAACTA TGCTTGAAAC
      TTCGAGAAAT TAAGTCTCCA AGTTTAGGAG AGGAATTGAT ACGAACTTTG
 2851 CCTCCTACTT TACTTAATTA ATCCACTTAC CTATATTATT CCCATCTTAC
      GGAGGATGAA ATGAATTAAT TAGGTGAATG GATATAATAA GGGTAGAATG
 2901 TAGCTACAGC TTTCCTTACC TTAGTTGAAC GAAAAATCCT CGGCCATATA
      ATCGATGTCG AAAGGAATGG AATCAACTTG CTTTTTAGGA GCCGGTATAT
 2951 CAACTCCGCA AAGGCCCTAA CATCGTAGGC CCATACGGGC TCCTTCAACC
      GTTGAGGCGT TTCCGGGATT GTAGCATCCG GGTATGCCCG AGGAAGTTGG
 3001 AATCGCAGAT GGTCTAAAAC TATTTATCAA AGAACCCATC CACCCATCAA
      TTAGCGTCTA CCAGATTTTG ATAAATAGTT TCTTGGGTAG GTGGGTAGTT
 3051 CATCCTCCCC CCTTCTATTT TTAGCTACTC CCACAATAGC TCTAACACTA
      GTAGGAGGGG GGAAGATAAA AATCGATGAG GGTGTTATCG AGATTGTGAT
 3101 GCCCTCCTTA TATGAATACC TCTTCCACTC CCTCACTCCT TTATTAATCT
      CGGGAGGAAT ATACTTATGG AGAAGGTGAG GGAGTGAGGA AATAATTAGA
 3151 TAATCTAGGT CTACTATTTA TTTTAGCAAT TTCAAGTTTA ACCGTCTACA
      ATTAGATCCA GATGATAAAT AAAATCGTTA AAGTTCAAAT TGGCAGATGT
 3201 CTATTCTAGG TTCCGGATGA GCATCCAACT CAAAATACGC CCTAATAGGG
      GATAAGATCC AAGGCCTACT CGTAGGTTGA GTTTTATGCG GGATTATCCC
 3251 GCCCTACGAG CTGTAGCACA GACAATTTCT TATGAAGTAA GTCTCGGATT
      CGGGATGCTC GACATCGTGT CTGTTAAAGA ATACTTCATT CAGAGCCTAA
 3301 AATCCTCTTA TCAATGATCA TATTTACAGG AGGTTTCACC CTCCATACCT
      TTAGGAGAAT AGTTACTAGT ATAAATGTCC TCCAAAGTGG GAGGTATGGA
 3351 TCAACCTAGC ACAAGAAACA ATCTGACTAA TTATTCCAGG ATGACCATTA
      AGTTGGATCG TGTTCTTTGT TAGACTGATT AATAAGGTCC TACTGGTAAT
 3401 GCCCTGATAT GATATGTTTC AACCTTAGCA GAAACTAACC GAGTACCATT
      CGGGACTATA CTATACAAAG TTGGAATCGT CTTTGATTGG CTCATGGTAA
 3451 TGATTTAACA GAAGGAGAAT CAGAACTAGT CTCAGGATTT AATATTGAAT
      ACTAAATTGT CTTCCTCTTA GTCTTGATCA GAGTCCTAAA TTATAACTTA
 3501 ATGCAGGAGG CTCATTTGCC TTATTCTTCC TCGCTGAATA TACGAACATT
      TACGTCCTCC GAGTAAACGG AATAAGAAGG AGCGACTTAT ATGCTTGTAA

 3551 TTATTAATAA ATACCCTCTC AGTTATCCTA TTTATAGGCT CCTCCTATAA
      AATAATTATT TATGGGAGAG TCAATAGGAT AAATATCCGA GGAGGATATT
 3601 TCCACTTTTC CCAGAAATCT CAACACTCAG CTTAATAATA AAAGCAACCC
      AGGTGAAAAG GGTCTTTAGA GTTGTGAGTC GAATTATTAT TTTCGTTGGG
 3651 TACTAACTCT ATTTTTTTTA TGAATTCGAG CATCTTATCC CCGCTTCCGT
      ATGATTGAGA TAAAAAAAAT ACTTAAGCTC GTAGAATAGG GGCGAAGGCA
 3701 TATGATCAAC TAATACATCT AGTATGAAAA AACTTCCTTC CCTTAACCTT
      ATACTAGTTG ATTATGTAGA TCATACTTTT TTGAAGGAAG GGAATTGGAA
 3751 AGCAATTATA TTATGACATA TTGCCCTCCC CATAGCTCTA GCAAGCCTAC
      TCGTTAATAT AATACTGTAT AACGGGAGGG GTATCGAGAT CGTTCGGATG
 3801 CTCCCCTAAC CTAACGGAAG CGTGCCTGAA CAAAGGACCA CTTTGATAGA
      GAGGGGATTG GATTGCCTTC GCACGGACTT GTTTCCTGGT GAAACTATCT
 3851 GTGGACAATG AAAGTTAAAA TCTTTCCTCT TCCTAGAAAA ATAGGACTTG
      CACCTGTTAC TTTCAATTTT AGAAAGGAGA AGGATCTTTT TATCCTGAAC
 3901 AACCTATAAT TAAGAGATCA AAACTCCTTG TACTTCCAAT TATACTATCT
      TTGGATATTA ATTCTCTAGT TTTGAGGAAC ATGAAGGTTA ATATGATAGA
 3951 CCTAAGTAAA GTCAGCTAAC AAAGCTTTTG GGCCCATACC CCAACCACGT
      GGATTCATTT CAGTCGATTG TTTCGAAAAC CCGGGTATGG GGTTGGTGCA
 4001 TGGTTAAAAT CCTTCCTCTA CTAATGAACC CAATTGTATT AACCATTATC
      ACCAATTTTA GGAAGGAGAT GATTACTTGG GTTAACATAA TTGGTAATAG
 4051 ATTTCAAGCC TAGGCCTAGG AACTATCTTA ACATTTACCA GCTCACACTG
      TAAAGTTCGG ATCCGGATCC TTGATAGAAT TGTAAATGGT CGAGTGTGAC
 4101 ACTCCTAGTT TGAATAGGCC TCGAAATCAA TACTCTAGCC ATCATTCCCT
      TGAGGATCAA ACTTATCCGG AGCTTTAGTT ATGAGATCGG TAGTAAGGGA
 4151 TAATAATCCG CCAGCACCAT CCACGGGCAG TAGAAGCTTC TACAAAATAT
      ATTATTAGGC GGTCGTGGTA GGTGCCCGTC ATCTTCGAAG ATGTTTTATA
 4201 TTCATCACAC AAGCAACCGC CTCAGCCTTA CTTTTATTTG CTAGCGTCAT
      AAGTAGTGTG TTCGTTGGCG GAGTCGGAAT GAAAATAAAC GATCGCAGTA
 4251 AAACGCTTGG ACTTCAGGAG AATGAAGTCT AATTGAAATA ACTAATCCAA
      TTTGCGAACC TGAAGTCCTC TTACTTCAGA TTAACTTTAT TGATTAGGTT
 4301 CTCCTGCCAC ACTGGCCACA ATCGCATTAG CATTAAAAAT TGGCTTAGCC
      GAGGACGGTG TGACCGGTGT TAGCGTAATC GTAATTTTTA ACCGAATCGG
 4351 CCTCTTCATT TTTGATTACC CGAAGTCCTA CAAGGCTTAA ACCTTACTAC
      GGAGAAGTAA AAACTAATGG GCTTCAGGAT GTTCCGAATT TGGAATGATG
 4401 AGGTCTCATT CTTTCTACAT GACAAAAACT CGCCCCATTC GCCATCCTCT
      TCCAGAGTAA GAAAGATGTA CTGTTTTTGA GCGGGGTAAG CGGTAGGAGA
 4451 TACAACTTTA CCCCTCATTA AATTCTAACC TACTCGTATT CCTTGGAGTC
      ATGTTGAAAT GGGGAGTAAT TTAAGATTGG ATGAGCATAA GGAACCTCAG
 4501 CTCTCAACCA TAGTTGGAGG TTGAGGAGGA TTAAACCAAA CCCAATTACG
      GAGAGTTGGT ATCAACCTCC AACTCCTCCT AATTTGGTTT GGGTTAATGC
 4551 AAAAATTCTA GCCTATTCCT CAATCGCACA CCTCGGTTGA ATAATTACAA
      TTTTTAAGAT CGGATAAGGA GTTAGCGTGT GGAGCCAACT TATTAATGTT
 4601 TCCTACATTA CTCCCATAAC CTAACCCAAC TAAACCTACT TCTTTACATC
      AGGATGTAAT GAGGGTATTG GATTGGGTTG ATTTGGATGA AGAAATGTAG
 4651 ATTATAACAT CAACAACCTT TCTATTATTC AAAACATTTA ATTCAACCAA
      TAATATTGTA GTTGTTGGAA AGATAATAAG TTTTGTAAAT TAAGTTGGTT
 4701 AATTAATTCT ATTTCCTCTT CTTCATCAAA ATCCCCCCTA CTATCTATTA
      TTAATTAAGA TAAAGGAGAA GAAGTAGTTT TAGGGGGGAT GATAGATAAT
 4751 TTGCTCTTAT AACTCTCCTT TCTCTTGGAG GTCTACCTCC ACTTTCAGGC
      AACGAGAATA TTGAGAGGAA AGAGAACCTC CAGATGGAGG TGAAAGTCCG
 4801 TTCATACCAA AATGATTAAT TTTACAAGAA TTAGCAAAAC AAAATCTAAA
      AAGTATGGTT TTACTAATTA AAATGTTCTT AATCGTTTTG TTTTAGATTT
 4851 TACTCCAGCT ATTATTATAG CTATAATAAC CCTCCTCAGT CTATTCTTTT
      ATGAGGTCGA TAATAATATC GATATTATTG GGAGGAGTCA GATAAGAAAA
 4901 ATCTACGCCT ATGTTATGCT ACAACATTAA CCATAACCCC AAATTCAATT

      TAGATGCGGA TACAATACGA TGTTGTAATT GGTATTGGGG TTTAAGTTAA
 4951 AACATATTAA CATCATGACG AACTAAATTA TCCCATAACC TAACCCTAAC
      TTGTATAATT GTAGTACTGC TTGATTTAAT AGGGTATTGG ATTGGGATTG
 5001 AACAACCGCC TCATTATCCA TCCTACTCCT TCCAATCACC CCCGCCATTC
      TTGTTGGCGG AGTAATAGGT AGGATGAGGA AGGTTAGTGG GGGCGGTAAG
 5051 TCATATTGAT ACCTTAAGAA ATTTAGGTTA ACAATAGACC AAAAGCCTTC
      AGTATAACTA TGGAATTCTT TAAATCCAAT TGTTATCTGG TTTTCGGAAG
 5101 AAAGCTTTAA GCAGAAGTGA AAATCTCCTA ATTTCTGCTA AGACTTGCAA
      TTTCGAAATT CGTCTTCACT TTTAGAGGAT TAAAGACGAT TCTGAACGTT
 5151 GACTTTATCT CACATCTTCT GAACGCAACC CAGATGCTTT AATTAAGCTA
      CTGAAATAGA GTGTAGAAGA CTTGCGTTGG GTCTACGAAA TTAATTCGAT
 5201 AAACCTCCTA GATAAATAGG CCTTGATCCT ACAAAATCTT AATTAACAGC
      TTTGGAGGAT CTATTTATCC GGAACTAGGA TGTTTTAGAA TTAATTGTCG
 5251 TAAGCGTTCA ATCCAGCGAA CTTTTATCTA CTTTCTCCCG CCGTAAAAAA
      ATTCGCAAGT TAGGTCGCTT GAAAATAGAT GAAAGAGGGC GGCATTTTTT
 5301 AAAGGCGGGA GAAAGTCCCG GGAGAATCAA CCTCCGGTTT TGGATTTGCA
      TTTCCGCCCT CTTTCAGGGC CCTCTTAGTT GGAGGCCAAA ACCTAAACGT
 5351 ATCCAACGTA ACCATTTACT GCAGGACTAT GGTAAGAAGA GGAATTTGAC
      TAGGTTGCAT TGGTAAATGA CGTCCTGATA CCATTCTTCT CCTTAAACTG
 5401 CTCTGTTTAC GAAGCTACAA TCCGCTACTT AGTTCTCAGT CACCTTACCT
      GAGACAAATG CTTCGATGTT AGGCGATGAA TCAAGAGTCA GTGGAATGGA
 5451 GTGGCAATTA ATCGTTGACT ATTTTCTACA AACCACAAAG ATATTGGCAC
      CACCGTTAAT TAGCAACTGA TAAAAGATGT TTGGTGTTTC TATAACCGTG
 5501 CCTATACTTG ATTTTTGGTG CATGAGCAGG AATAGTGGGA ACAGCCCTAA
      GGATATGAAC TAAAAACCAC GTACTCGTCC TTATCACCCT TGTCGGGATT
 5551 GCCTTTTAAT TCGAGCCGAA CTAGGACAGC CCGGATCACT TCTAGGAGAT
      CGGAAAATTA AGCTCGGCTT GATCCTGTCG GGCCTAGTGA AGATCCTCTA
 5601 GACCAAGTTT ATAATGTTAT TGTAACCGCC CATGCATTCG TAATAATCTT
      CTGGTTCAAA TATTACAATA ACATTGGCGG GTACGTAAGC ATTATTAGAA
 5651 CTTCATGGTT ATACCCGTAA TAATTGGTGG GTTTGGAAAT TGACTAGTAC
      GAAGTACCAA TATGGGCATT ATTAACCACC CAAACCTTTA ACTGATCATG
 5701 CATTAATGAT TGGTGCACCA GATATAGCCT TCCCACGAAT AAATAATATA
      GTAATTACTA ACCACGTGGT CTATATCGGA AGGGTGCTTA TTTATTATAT
 5751 AGCTTTTGAC TCCTTCCCCC TTCTTTTCTT TTACTTCTGG CCTCAGCTGG
      TCGAAAACTG AGGAAGGGGG AAGAAAAGAA AATGAAGACC GGAGTCGACC
 5801 AATTGAAGCC GGAGCTGGCA CTGGTTGAAC AGTTTACCCT CCTTTAGCTG
      TTAACTTCGG CCTCGACCGT GACCAACTTG TCAAATGGGA GGAAATCGAC
 5851 GTAACTTAGC ACATGCTGGA GCATCCGTTG ACTTAGCCAT CTTCTCTCTT
      CATTGAATCG TGTACGACCT CGTAGGCAAC TGAATCGGTA GAAGAGAGAA
 5901 CATTTAGCAG GTATCTCATC AATTTTAGCT TCAATTAACT TTATCACAAC
      GTAAATCGTC CATAGAGTAG TTAAAATCGA AGTTAATTGA AATAGTGTTG
 5951 CATTATTAAT ATAAAACCAC CAGCCATCTC TCAATATCAA ACACCATTAT
      GTAATAATTA TATTTTGGTG GTCGGTAGAG AGTTATAGTT TGTGGTAATA
 6001 TTGTATGATC AATTCTAGTA ACAACCATCC TTCTCCTCTT ATCCCTTCCA
      AACATACTAG TTAAGATCAT TGTTGGTAGG AAGAGGAGAA TAGGGAAGGT
 6051 GTACTCGCAG CCGGCATTAC AATGTTACTT ACTGATCGAA ATCTAAACAC
      CATGAGCGTC GGCCGTAATG TTACAATGAA TGACTAGCTT TAGATTTGTG
 6101 AACATTCTTT GACCCAGCAG GAGGAGGAGA TCCAATTCTT TATCAACACC
      TTGTAAGAAA CTGGGTCGTC CTCCTCCTCT AGGTTAAGAA ATAGTTGTGG
 6151 TATTTTGATT TTTCGGTCAC CCAGAAGTTT ATATCCTAAT TCTCCCCGGC
      ATAAAACTAA AAAGCCAGTG GGTCTTCAAA TATAGGATTA AGAGGGGCCG
 6201 TTCGGAATAA TTTCCCATGT AGTAGCTTAC TACTCCGGCA AAAAAGAACC
      AAGCCTTATT AAAGGGTACA TCATCGAATG ATGAGGCCGT TTTTTCTTGG
 6251 GTTCGGTTAT ATAGGTATAG TTTGAGCAAT AATAGCAATT GGACTATTAG
      CAAGCCAATA TATCCATATC AAACTCGTTA TTATCGTTAA CCTGATAATC

 6301 GTTTTATTGT CTGAGCCCAT CATATATTTA CAGTAGGGAT AGACGTTGAT
      CAAAATAACA GACTCGGGTA GTATATAAAT GTCATCCCTA TCTGCAACTA
 6351 ACACGAGCCT ATTTTACCTC AGCAACAATA ATTATTGCTA TTCCTACAGG
      TGTGCTCGGA TAAAATGGAG TCGTTGTTAT TAATAACGAT AAGGATGTCC
 6401 TGTAAAAGTA TTCAGCTGAT TAGCAACTCT TCACGGAGGC TCTATTAAAT
      ACATTTTCAT AAGTCGACTA ATCGTTGAGA AGTGCCTCCG AGATAATTTA
 6451 GAGAAACCCC ATTACTATGA GCTCTCGGAT TCATCTTCTT ATTTACAGTA
      CTCTTTGGGG TAATGATACT CGAGAGCCTA AGTAGAAGAA TAAATGTCAT
 6501 GGGGGACTAA CAGGTATTGT ATTAGCCAAC TCCTCCTTAG ATATTGTTCT
      CCCCCTGATT GTCCATAACA TAATCGGTTG AGGAGGAATC TATAACAAGA
 6551 CCATGATACC TATTATGTAG TAGCTCATTT CCATTATGTC CTTTCAATAG
      GGTACTATGG ATAATACATC ATCGAGTAAA GGTAATACAG GAAAGTTATC
 6601 GAGCAGTATT CGCCATTATA GCAGGTTTTA TCCACTGATT TCCTCTTATC
      CTCGTCATAA GCGGTAATAT CGTCCAAAAT AGGTGACTAA AGGAGAATAG
 6651 TCTGGCTACA CCCTCCACTC AACATGAACA AAAATCCAAT TTGCAGTAAT
      AGACCGATGT GGGAGGTGAG TTGTACTTGT TTTTAGGTTA AACGTCATTA
 6701 ATTTATTGGA GTAAACTTGA CATTCTTCCC ACAACATTTC CTAGGCCTTG
      TAAATAACCT CATTTGAACT GTAAGAAGGG TGTTGTAAAG GATCCGGAAC
 6751 CCGGTATACC ACGACGTTAT TCAGATTACC CAGATGCATA TACTTTATGA
      GGCCATATGG TGCTGCAATA AGTCTAATGG GTCTACGTAT ATGAAATACT
 6801 AACATAATTT CCTCTATCGG CTCTTTAATC TCACTTGTAG CAGTAATTAT
      TTGTATTAAA GGAGATAGCC GAGAAATTAG AGTGAACATC GTCATTAATA
 6851 ACTCCTATTC ATTATCTGAG AAGCATTTGC CTCAAAACGA GAAGTACTAT
      TGAGGATAAG TAATAGACTC TTCGTAAACG GAGTTTTGCT CTTCATGATA
 6901 CTATTGAATT ACCTAACACA AATGTTGAAT GATTACACGG TTGCCCTCCA
      GATAACTTAA TGGATTGTGT TTACAACTTA CTAATGTGCC AACGGGAGGT
 6951 CCATACCACA CATACGAAGA ACCAGCATTT GTTCAAGTTC AACGAATTTT
      GGTATGGTGT GTATGCTTCT TGGTCGTAAA CAAGTTCAAG TTGCTTAAAA
 7001 TTAAAACAAG AAAGGAAGGA ATTGAACCCC CATATGTTAG TTTCAAGCCA
      AATTTTGTTC TTTCCTTCCT TAACTTGGGG GTATACAATC AAAGTTCGGT
 7051 ACCACATCGC CACTCTGTCA CTTTCTTTAT TAAGATTCTA GTAAAATGTA
      TGGTGTAGCG GTGAGACAGT GAAAGAAATA ATTCTAAGAT CATTTTACAT
 7101 TTACACTGCC TTGTCAAGAC AAAATTGTGT GTTTAAATCC CACGAATCTT
      AATGTGACGG AACAGTTCTG TTTTAACACA CAAATTTAGG GTGCTTAGAA
 7151 AACTTATAAT GGCACACCCC TCACAATTAG GATTTCAAGA CGCAGCCTCC
      TTGAATATTA CCGTGTGGGG AGTGTTAATC CTAAAGTTCT GCGTCGGAGG
 7201 CCAGTTATGG AAGAACTTAT TCATTTTCAC GACCACACAT TAATAATCGT
      GGTCAATACC TTCTTGAATA AGTAAAAGTG CTGGTGTGTA ATTATTAGCA
 7251 ATTTTTAATT AGCACTTTAG TTCTTTATAT TATTACAGCA ATAGTATCAA
      TAAAAATTAA TCGTGAAATC AAGAAATATA ATAATGTCGT TATCATAGTT
 7301 CAAAACTTAC AAACAAATAT ATTCTTGATT CTCAAGAAAT TGAAATTGTC
      GTTTTGAATG TTTGTTTATA TAAGAACTAA GAGTTCTTTA ACTTTAACAG
 7351 TGAACTATTC TTCCCGCCAT CATCCTCATT ATAATTGCCC TACCATCCCT
      ACTTGATAAG AAGGGCGGTA GTAGGAGTAA TATTAACGGG ATGGTAGGGA
 7401 ACGAATTTTA TATCTTATAG ACGAAATCAA TGACCCCCAC CTAACCATTA
      TGCTTAAAAT ATAGAATATC TGCTTTAGTT ACTGGGGGTG GATTGGTAAT
 7451 AAGCTATAGG TCATCAATGA TATTGAACTT ATGAATATAC AGATTACGAA
      TTCGATATCC AGTAGTTACT ATAACTTGAA TACTTATATG TCTAATGCTT
 7501 GACTTAGAAT TTGACTCCTA CATAATTCAA ACCCAAGACT TAACCCCAGG
      CTGAATCTTA AACTGAGGAT GTATTAAGTT TGGGTTCTGA ATTGGGGTCC
 7551 CCAATTTCGT TTATTAGAGA CAGATCACCG AATAGTTGTA CCTATAGAAT
      GGTTAAAGCA AATAATCTCT GTCTAGTGGC TTATCAACAT GGATATCTTA
 7601 CACCTGTTCG TGTATTAGTA TCTGCAGAAG ACGTCTTACA CTCATGAGCT
      GTGGACAAGC ACATAATCAT AGACGTCTTC TGCAGAATGT GAGTACTCGA
 7651 GTTCCAGCCT TAGGAATTAA AATAGACGCT GTACCAGGAC GCCTAAACCA

      CAAGGTCGGA ATCCTTAATT TTATCTGCGA CATGGTCCTG CGGATTTGGT
 7701 AACTGCCTTT ATCATTTCCC GACCAGGTAT TTACTATGGC CAATGTTCAG
      TTGACGGAAA TAGTAAAGGG CTGGTCCATA AATGATACCG GTTACAAGTC
 7751 AAATTTGTGG TGCCAACCAC AGCTTTATAC CTATCATAGT AGAAGCAATT
      TTTAAACACC ACGGTTGGTG TCGAAATATG GATAGTATCA TCTTCGTTAA
 7801 CCCCTAGAAC ACTTCGAAGC CTGATCTTCA TTAATACTAG AAGAAGTCTC
      GGGGATCTTG TGAAGCTTCG GACTAGAAGT AATTATGATC TTCTTCAGAG
 7851 ACTAAGAAGC TAATTGGGTC TAGCATTAGC CTTTTAAGCT AAAAATTGGT
      TGATTCTTCG ATTAACCCAG ATCGTAATCG GAAAATTCGA TTTTTAACCA
 7901 GATTCCCTAC CACCCTTAGT GATATGCCTC AATTAAATCC ACACCCTTGG
      CTAAGGGATG GTGGGAATCA CTATACGGAG TTAATTTAGG TGTGGGAACC
 7951 TTCATTATTC TCTTATTTTC ATGAATAGTT TTTCTTATTA TTTTACCTAA
      AAGTAATAAG AGAATAAAAG TACTTATCAA AAAGAATAAT AAAATGGATT
 8001 AAAAGTAATA AATTATATAT TTAATAATAA CCCAACACTA AAAAATACCG
      TTTTCATTAT TTAATATATA AATTATTATT GGGTTGTGAT TTTTTATGGC
 8051 AAAAACCTAA ACAGGAACCC TGAAATTGAC CATGATCATA AGCTTCTTTG
      TTTTTGGATT TGTCCTTGGG ACTTTAACTG GTACTAGTAT TCGAAGAAAC
 8101 ATCAATTCCT AAGCCCCTCC CTTCTCGGAA TCCCGTTAAT TGCTTTAGCA
      TAGTTAAGGA TTCGGGGAGG GAAGAGCCTT AGGGCAATTA ACGAAATCGT
 8151 ATTATATTAC CATGATTAAC TTTCCCAACC CCAACTAATC GCTGATTAAA
      TAATATAATG GTACTAATTG AAAGGGTTGG GGTTGATTAG CGACTAATTT
 8201 TAATCGATTA ATAACCCTCC AAAACTGATT TATTAATCGA TTTATTTGTC
      ATTAGCTAAT TATTGGGAGG TTTTGACTAA ATAATTAGCT AAATAAACAG
 8251 AACTCATACA ACCCATCAAC TTCACTGGCC ATAAATGAGC CATATTATTT
      TTGAGTATGT TGGGTAGTTG AAGTGACCGG TATTTACTCG GTATAATAAA
 8301 ACAGCACTAA TATTATTCCT TATTACCATC AATCTCTTAG GACTTCTCCC
      TGTCGTGATT ATAATAAGGA ATAATGGTAG TTAGAGAATC CTGAAGAGGG
 8351 CTACACCTTC ACACCCACAA CCCAACTTTC CCTTAATATA GCATTTGCTC
      GATGTGGAAG TGTGGGTGTT GGGTTGAAAG GGAATTATAT CGTAAACGAG
 8401 TACCCTTATG GTTTACAACC GTATTAATTG GAATACTTAA CCAACCAACA
      ATGGGAATAC CAAATGTTGG CATAATTAAC CTTATGAATT GGTTGGTTGT
 8451 ATTGCACTAG GCCATTTTCT ACCAGAAGGC ACCCCAACCC CTCTAGTACC
      TAACGTGATC CGGTAAAAGA TGGTCTTCCG TGGGGTTGGG GAGATCATGG
 8501 CGTCCTCATT ATTATCGAAA CTATTAGCTT GTTTATTCGA CCATTAGCAC
      GCAGGAGTAA TAATAGCTTT GATAATCGAA CAAATAAGCT GGTAATCGTG
 8551 TAGGAGTTCG ACTAACTGCT AATCTAACAG CCGGTCACTT ATTAATACAA
      ATCCTCAAGC TGATTGACGA TTAGATTGTC GGCCAGTGAA TAATTATGTT
 8601 TTAATCGCAA CCGCAGTGTT TGCCCTTATT ACTATTATAC CAACCGTAGC
      AATTAGCGTT GGCGTCACAA ACGGGAATAA TGATAATATG GTTGGCATCG
 8651 GTTATTAACA TCAATTATCC TATTTTTACT AACAATTCTT GAAGTAGCTG
      CAATAATTGT AGTTAATAGG ATAAAAATGA TTGTTAAGAA CTTCATCGAC
 8701 TAGCAATAAT TCAAGCATAT GTATTTGTTC TCCTACTAAG CTTATATCTA
      ATCGTTATTA AGTTCGTATA CATAAACAAG AGGATGATTC GAATATAGAT
 8751 CAAGAAAATG TTTAATGACT CACCAAGCAC ACGCATATCA TATAGTTGAC
      GTTCTTTTAC AAATTACTGA GTGGTTCGTG TGCGTATAGT ATATCAACTG
 8801 CCCAGTCCAT GACCACTAAC CGGAGCTACA GCCGCCCTTC TAATAACATC
      GGGTCAGGTA CTGGTGATTG GCCTCGATGT CGGCGGGAAG ATTATTGTAG
 8851 CGGATTAGCC ATCTGATTTC ACTTCCACTC ATTACTTCTT CTCTACTTAG
      GCCTAATCGG TAGACTAAAG TGAAGGTGAG TAATGAAGAA GAGATGAATC
 8901 GATTAATCCT TTTATTATTA ACTATAATCC AATGATGACG TGATATTATC
      CTAATTAGGA AAATAATAAT TGATATTAGG TTACTACTGC ACTATAATAG
 8951 CGAGAAGGAA CATTCCAAGG TCATCATACA CCTCCCGTCC AAAAAGGTCT
      GCTCTTCCTT GTAAGGTTCC AGTAGTATGT GGAGGGCAGG TTTTTCCAGA
 9001 CCGTTATGGA ATAATCTTAT TCATTACATC AGAAGTATTC TTCTTTTTAG
      GGCAATACCT TATTAGAATA AGTAATGTAG TCTTCATAAG AAGAAAAATC

 9051 GCTTTTTCTG AGCCTTTTAC CATTCAAGTC TTGCCCCCAC CCCAGAACTA
      CGAAAAAGAC TCGGAAAATG GTAAGTTCAG AACGGGGGTG GGGTCTTGAT
 9101 GGAGGATGTT GACCACCAAC AGGAATTTAT CCATTAGACC CATTTGAAGT
      CCTCCTACAA CTGGTGGTTG TCCTTAAATA GGTAATCTGG GTAAACTTCA
 9151 ACCACTTCTA AATACTGCAG TACTTTTAGC TTCTGGTGTA ACAGTAACCT
      TGGTGAAGAT TTATGACGTC ATGAAAATCG AAGACCACAT TGTCATTGGA
 9201 GAACCCACCA TAGTTTAATA GAAGGTAACC GAAAAGAAGC TATTCAAGCC
      CTTGGGTGGT ATCAAATTAT CTTCCATTGG CTTTTCTTCG ATAAGTTCGG
 9251 CTCACCCTTA CTATTATTTT AGGATTCTAC TTTACAACCC TTCAAGCCAT
      GAGTGGGAAT GATAATAAAA TCCTAAGATG AAATGTTGGG AAGTTCGGTA
 9301 AGAATATTAC GAAGCACCCT TTACAATTGC CGATGGAGTT TATGGAACAA
      TCTTATAATG CTTCGTGGGA AATGTTAACG GCTACCTCAA ATACCTTGTT
 9351 CATTTTTCGT TGCCACAGGA TTCCACGGCC TCCATGTTAT TATTGGTTCA
      GTAAAAAGCA ACGGTGTCCT AAGGTGCCGG AGGTACAATA ATAACCAAGT
 9401 ACATTTCTAG CAATCTGTTT ACTACGACAA ATTCAATATC ATTTTACATC
      TGTAAAGATC GTTAGACAAA TGATGCTGTT TAAGTTATAG TAAAATGTAG
 9451 AAAACATCAC TTTGGTTTTG AAGCTGCCGC ATGATACTGA CATTTTGTAG
      TTTTGTAGTG AAACCAAAAC TTCGACGGCG TACTATGACT GTAAAACATC
 9501 ATGTAGTATG ATTATTTCTT TATGTATCCA TCTATTGATG AGGCTCATAA
      TACATCATAC TAATAAAGAA ATACATAGGT AGATAACTAC TCCGAGTATT
 9551 TTACTTTTCT AGTATAAACT AATACAAATG ATTTCCAATC ATTTAATCTT
      AATGAAAAGA TCATATTTGA TTATGTTTAC TAAAGGTTAG TAAATTAGAA
 9601 GGTTTAAACC CAGGGAAAAG TAATGAGCCT CATCACGTCC TCTATCGCGG
      CCAAATTTGG GTCCCTTTTC ATTACTCGGA GTAGTGCAGG AGATAGCGCC
 9651 CTACGGCCCT GGTTTCCCTA ATCCTCGTAA TAATTGCATT TTGGCTTCCA
      GATGCCGGGA CCAAAGGGAT TAGGAGCATT ATTAACGTAA AACCGAAGGT
 9701 TCACTAAGTC CAGATAATGA AAAATTATCC CCATATGAAT GTGGCTTTGA
      AGTGATTCAG GTCTATTACT TTTTAATAGG GGTATACTTA CACCGAAACT
 9751 CCCCCTAGGA AGTGCACGCC TTCCATTCTC CCTACGCTTC TTTCTTGTAG
      GGGGGATCCT TCACGTGCGG AAGGTAAGAG GGATGCGAAG AAAGAACATC
 9801 CCATTCTATT CTTACTATTT GACTTAGAAA TCGCTCTCCT TCTTCCTTTA
      GGTAAGATAA GAATGATAAA CTGAATCTTT AGCGAGAGGA AGAAGGAAAT
 9851 CCATGAGGTA ATCAATTATT ACTACCACTT TCCACATTAA TCTGAGCAGC
      GGTACTCCAT TAGTTAATAA TGATGGTGAA AGGTGTAATT AGACTCGTCG
 9901 AATTATCTTA ATTCTATTAA CTCTAGGTCT TATTTATGAA TGACTTCAAG
      TTAATAGAAT TAAGATAATT GAGATCCAGA ATAAATACTT ACTGAAGTTC
 9951 GAGGATTAGA ATGAGCAGAA TGGATATTTA GTTTAAACAA AGACCACTAA
      CTCCTAATCT TACTCGTCTT ACCTATAAAT CAAATTTGTT TCTGGTGATT
10001 TTTCGACTTA GTAAATTATG GTGAAAATCC ATACATATCC TATGTCTCCT
      AAAGCTGAAT CATTTAATAC CACTTTTAGG TATGTATAGG ATACAGAGGA
10051 ATACATTTCA GCCTCAACTC AGCATTCATT TTAGGCCTCA TAGGGCTAGC
      TATGTAAAGT CGGAGTTGAG TCGTAAGTAA AATCCGGAGT ATCCCGATCG
10101 ACTTAATCGT TATCACCTTT TATCCGCACT CCTATGTTTA GAAAGTATAT
      TGAATTAGCA ATAGTGGAAA ATAGGCGTGA GGATACAAAT CTTTCATATA
10151 TATTAACTCT ATTTATCACT ATTGCCATCT GAACCCTAAC ACTAAACTCC
      ATAATTGAGA TAAATAGTGA TAACGGTAGA CTTGGGATTG TGATTTGAGG
10201 ACTTCATGTT CAATTATCCC CATAATTCTC CTTACATTTT CAGCCTGTGA
      TGAAGTACAA GTTAATAGGG GTATTAAGAG GAATGTAAAA GTCGGACACT
10251 AGCTAGCGTA GGTCTAGCCA TTCTAGTAGC TACCTCACGC TCTCACGGTT
      TCGATCGCAT CCAGATCGGT AAGATCATCG ATGGAGTGCG AGAGTGCCAA
10301 CTGATAATTT ACAAAACCTG AACCTTCTCC AATGCTAAAA ATTTTAATTC
      GACTATTAAA TGTTTTGGAC TTGGAAGAGG TTACGATTTT TAAAATTAAG
10351 CAACAATTAT ACTCTTTCCA ACCACATGAA CTATTAACAA AAAATGACTA
      GTTGTTAATA TGAGAAAGGT TGGTGTACTT GATAATTGTT TTTTACTGAT
10401 TGATCCATAA CCACCACCCA TAGCCTTCTA ATCGCATTAC TAAGCTTACT

      ACTAGGTATT GGTGGTGGGT ATCGGAAGAT TAGCGTAATG ATTCGAATGA
10451 TTTATTCAAA TGAAATATAG ATATTGGTTG AGATTTTTCT AACCAATTCA
      AAATAAGTTT ACTTTATATC TATAACCAAC TCTAAAAAGA TTGGTTAAGT
10501 TAGCTATTGA TCCTTTATCA ACCCCTTTAC TAATCCTTAC ATGTTGACTT
      ATCGATAACT AGGAAATAGT TGGGGAAATG ATTAGGAATG TACAACTGAA
10551 CTTCCATTAA TAATCTTAGC CAGCCAAAAT CACATTTCTC CAGAACCTAT
      GAAGGTAATT ATTAGAATCG GTCGGTTTTA GTGTAAAGAG GTCTTGGATA
10601 TATCCGACAA CGAACATACA TTACACTTCT AATCTCCCTC CAAGCTTTCC
      ATAGGCTGTT GCTTGTATGT AATGTGAAGA TTAGAGGGAG GTTCGAAAGG
10651 TCGTCATAGC ATTCTCCGCA ACCGAAATAA TTATATTTTA TATTATATTT
      AGCAGTATCG TAAGAGGCGT TGGCTTTATT AATATAAAAT ATAATATAAA
10701 GAAGCCACAC TAATCCCAAC TCTCATTATT ATCACACGAT GAGGAAATCA
      CTTCGGTGTG ATTAGGGTTG AGAGTAATAA TAGTGTGCTA CTCCTTTAGT
10751 AACAGAACGT TTAAATGCAG GAACCTACTT CTTATTTTAT ACCTTGATTG
      TTGTCTTGCA AATTTACGTC CTTGGATGAA GAATAAAATA TGGAACTAAC
10801 GCTCACTCCC TCTTCTTATT GCTCTTCTAC TCATACAAAA TAATTTAGGT
      CGAGTGAGGG AGAAGAATAA CGAGAAGATG AGTATGTTTT ATTAAATCCA
10851 ACCCTATCTA TAATTATTAT ACAACACTCG CAACTTCCAA ATCTGCTAAC
      TGGGATAGAT ATTAATAATA TGTTGTGAGC GTTGAAGGTT TAGACGATTG
10901 ATGAACAGAT AAACTATGAT GAGTAGCCTG TCTAATCGCC TTCCTTGTCA
      TACTTGTCTA TTTGATACTA CTCATCGGAC AGATTAGCGG AAGGAACAGT
10951 AAATACCTTT ATATGGAATT CACCTTTGAC TTCCCAAAGC CCATGTTGAA
      TTTATGGAAA TATACCTTAA GTGGAAACTG AAGGGTTTCG GGTACAACTT
11001 GCCCCAATTG CAGGCTCAAT AATCCTAGCA GCAGTATTAC TTAAATTAGG
      CGGGGTTAAC GTCCGAGTTA TTAGGATCGT CGTCATAATG AATTTAATCC
11051 AGGTTATGGA ATAATACGAA TTATTGTAAT ACTAAACCCA TTAACCAAAG
      TCCAATACCT TATTATGCTT AATAACATTA TGATTTGGGT AATTGGTTTC
11101 AAATAGCCTA CCCATTCTTA ATTTTAGCTA TTTGAGGAAT TATCATAACC
      TTTATCGGAT GGGTAAGAAT TAAAATCGAT AAACTCCTTA ATAGTATTGG
11151 AGTTCCATTT GCTTACGACA AACAGACCTA AAATCTCTAA TTGCTTACTC
      TCAAGGTAAA CGAATGCTGT TTGTCTGGAT TTTAGAGATT AACGAATGAG
11201 ATCAGTAAGT CATATAGGAC TAGTTGCCGG AGCAATTTTT ATCCAAACAC
      TAGTCATTCA GTATATCCTG ATCAACGGCC TCGTTAAAAA TAGGTTTGTG
11251 CATGAAGTTT CGCAGGAGCA ATTACACTTA TAATTGCCCA TGGGTTAATT
      GTACTTCAAA GCGTCCTCGT TAATGTGAAT ATTAACGGGT ACCCAATTAA
11301 TCATCAACCC TATTTTGCTT AGCTAACACT AATTATGAAC GAATTCATAG
      AGTAGTTGGG ATAAAACGAA TCGATTGTGA TTAATACTTG CTTAAGTATC
11351 CCGAACTATA CTCCTAGCCC GAGGCATACA AATCATTCTT CCACTAATGG
      GGCTTGATAT GAGGATCGGG CTCCGTATGT TTAGTAAGAA GGTGATTACC
11401 CAACCTGATG ATTCTTTGCT AGCCTAGCTA ATCTTGCCCT ACCTCCATCT
      GTTGGACTAC TAAGAAACGA TCGGATCGAT TAGAACGGGA TGGAGGTAGA
11451 CCCAATCTTA TAGGAGCACT TCTCATCATC ACCTCATTAT TTAACAGATC
      GGGTTAGAAT ATCCTCGTGA AGAGTAGTAG TGGAGTAATA AATTGTCTAG
11501 TAACTGAACT ATAATCCTAT CAGGTCTTGG AATATTAATT ACAGCCTCCT
      ATTGACTTGA TATTAGGATA GTCCAGAACC TTATAATTAA TGTCGGAGGA
11551 ATTCACTTAA TATATTCTTA ATAACCCAAC GAGGTCCAAC CCCCCATCAT
      TAAGTGAATT ATATAAGAAT TATTGGGTTG CTCCAGGTTG GGGGGTAGTA
11601 ATTCTATCAT TGAACCCAAA TTACACACGA GAACATCTTC TTCTTAGTCT
      TAAGATAGTA ACTTGGGTTT AATGTGTGCT CTTGTAGAAG AAGAATCAGA
11651 TCACCTTATA CCCGTTCTAC TACTAATACT TAAACCAGAA CTTATTTGAG
      AGTGGAATAT GGGCAAGATG ATGATTATGA ATTTGGTCTT GAATAAACTC
11701 GATGAACATT TTGTACTTAT AGTTTAACCA AAACATTAGA TTGTGGTTCT
      CTACTTGTAA AACATGAATA TCAAATTGGT TTTGTAATCT AACACCAAGA
11751 AAAAATAAAA GTTAAAACCT TTTTAATTAC CGAGAGAGGT CAGGGACACG
      TTTTTATTTT CAATTTTGGA AAAATTAATG GCTCTCTCCA GTCCCTGTGC

11801 AAGAACTGCT AATTCTTCCT ATCATGGCTC AAATCCATGG CTCACTCAGC
      TTCTTGACGA TTAAGAAGGA TAGTACCGAG TTTAGGTACC GAGTGAGTCG
11851 TTCTGAAAGA TAACAGTAAT CTATTGGTCT TAGGAACCAA AAACTCTTGG
      AAGACTTTCT ATTGTCATTA GATAACCAGA ATCCTTGGTT TTTGAGAACC
11901 TGCAACTCCA AGCAAAAGTT ATGAGCACCA TCTTCAATTC ATCTTTCCTC
      ACGTTGAGGT TCGTTTTCAA TACTCGTGGT AGAAGTTAAG TAGAAAGGAG
11951 CTAATCCTCA TTATCCTTAT CTTTCCACTA ACAACCTCAT TATACCCTAA
      GATTAGGAGT AATAGGAATA GAAAGGTGAT TGTTGGAGTA ATATGGGATT
12001 AGAATTTAAC CCCAATTGGT TATCATCCTA TGTAAAAACA GCCGTAAAAA
      TCTTAAATTG GGGTTAACCA ATAGTAGGAT ACATTTTTGT CGGCATTTTT
12051 CTTCCTTCTT TATTAGCCTT ATCCCTTTAT TTATTTTCCT AGATCAAGGC
      GAAGGAAGAA ATAATCGGAA TAGGGAAATA AATAAAAGGA TCTAGTTCCG
12101 CTAGAATCAA TTATAACCAA TTATAATTGA ATAAACATTG GACCATTTGA
      GATCTTAGTT AATATTGGTT AATATTAACT TATTTGTAAC CTGGTAAACT
12151 TATTAACATA AGCTTCAAAT TTGATATATA CTCAATTATA TTCACCCCTG
      ATAATTGTAT TCGAAGTTTA AACTATATAT GAGTTAATAT AAGTGGGGAC
12201 TAGCTCTCTA CATCACCTGA TCTATCCTCG AATTTGCCTT ATGATATATG
      ATCGAGAGAT GTAGTGGACT AGATAGGAGC TTAAACGGAA TACTATATAC
12251 TATTTTGACC CTAATATTAA CCGCTTTTTC AAATATCTAT TACTTTTCCT
      ATAAAACTGG GATTATAATT GGCGAAAAAG TTTATAGATA ATGAAAAGGA
12301 AATCTCAATA ATTATACTAG TTACAGCTAA CAATATATTT CAACTATTTA
      TTAGAGTTAT TAATATGATC AATGTCGATT GTTATATAAA GTTGATAAAT
12351 TTGGATGAGA AGGAGTAGGA ATTATATCAT TCCTACTAAT TGGTTGATGA
      AACCTACTCT TCCTCATCCT TAATATAGTA AGGATGATTA ACCAACTACT
12401 TATGGCCGAA CAGATGCTAA CACCGCTGCC CTCCAAGCTG TAATCTACAA
      ATACCGGCTT GTCTACGATT GTGGCGACGG GAGGTTCGAC ATTAGATGTT
12451 TCGAATAGGA GATATTGGAC TAATCCTAAC CATAGCCTGA TTAGCTATAA
      AGCTTATCCT CTATAACCTG ATTAGGATTG GTATCGGACT AATCGATATT
12501 ATTTAAACTC ATGAGAAATT CAACAATTAT TTATTCTATC CAAAAATACA
      TAAATTTGAG TACTCTTTAA GTTGTTAATA AATAAGATAG GTTTTTATGT
12551 GACCTAACAT TACCTCTCTT CGGCCTCATT TTAGCTGCAG CTGGAAAATC
      CTGGATTGTA ATGGAGAGAA GCCGGAGTAA AATCGACGTC GACCTTTTAG
12601 CGCACAATTT GGCCTTCATC CCTGACTTCC TTCTGCTATA GAAGGACCAA
      GCGTGTTAAA CCGGAAGTAG GGACTGAAGG AAGACGATAT CTTCCTGGTT
12651 CACCAGTATC TGCCCTACTC CACTCTAGCA CAATAGTGGT TGCCGGCATC
      GTGGTCATAG ACGGGATGAG GTGAGATCGT GTTATCACCA ACGGCCGTAG
12701 TTTCTATTAA TCCGCCTCCA CCCCTTAATC CAAAATAACC AATTAATCCT
      AAAGATAATT AGGCGGAGGT GGGGAATTAG GTTTTATTGG TTAATTAGGA
12751 AACAATATGC CTTTGTCTAG GAGCACTAAC CACTCTTTTT ACCGCAACAT
      TTGTTATACG GAAACAGATC CTCGTGATTG GTGAGAAAAA TGGCGTTGTA
12801 GCGCACTCAC CCAAAATGAT ATCAAAAAAA TCATTGCCTT CTCAACATCA
      CGCGTGAGTG GGTTTTACTA TAGTTTTTTT AGTAACGGAA GAGTTGTAGT
12851 AGTCAACTTG GGCTAATAAT AGTAACAATC GGCCTCAATC AACCCCAACT
      TCAGTTGAAC CCGATTATTA TCATTGTTAG CCGGAGTTAG TTGGGGTTGA
12901 TACTTTCCTC CATATCTGTA CTCACGCCTT CTTTAAAGCT ATACTCTTCC
      ATGAAAGGAG GTATAGACAT GAGTGCGGAA GAAATTTCGA TATGAGAAGG
12951 TCTGCTCAGG ATCTATTATT CACAGCCTTA ATAATGAACA AGATATTCGC
      AGACGAGTCC TAGATAATAA GTGTCGGAAT TATTACTTGT TCTATAAGCG
13001 AAAATGGGAG GACTCCACAA ACTTCTACCA TTCACCTCAT CTTCCTTAAC
      TTTTACCCTC CTGAGGTGTT TGAAGATGGT AAGTGGAGTA GAAGGAATTG
13051 CATTGGAAGT TTAGCCCTCA CAGGCATGCC CTTTTTATCA GGTTTCTTCT
      GTAACCTTCA AATCGGGAGT GTCCGTACGG GAAAAATAGT CCAAAGAAGA
13101 CAAAAGATTC TATCATTGAA GCAATAAACA CTTCACACCT AAACGCCTGA
      GTTTTCTAAG ATAGTAACTT CGTTATTTGT GAAGTGTGGA TTTGCGGACT
13151 GCCCTAATCC TTACCCTTAT CGCAACATCA TTCACAGCCA TCTATAGCCT

      CGGGATTAGG AATGGGAATA GCGTTGTAGT AAGTGTCGGT AGATATCGGA
13201 ACGCCTTATC TACTTCACAT TAATAAACTT CCCACGATTT AATTCACTTT
      TGCGGAATAG ATGAAGTGTA ATTATTTGAA GGGTGCTAAA TTAAGTGAAA
13251 CCCCAATTAA TGAAAATAAC CCAATAATAA TTAACCCAAT CAAACGTCTA
      GGGGTTAATT ACTTTTATTG GGTTATTATT AATTGGGTTA GTTTGCAGAT
13301 GCTTATGGAA GTATTCTAGC TGGCCTCATT ATTACATCAA ATTTAACTCC
      CGAATACCTT CATAAGATCG ACCGGAGTAA TAATGTAGTT TAAATTGAGG
13351 AACAAAAATT CAAATCATAA CAATATCCCC CCTACTGAAA CTCTCCGCCC
      TTGTTTTTAA GTTTAGTATT GTTATAGGGG GGATGACTTT GAGAGGCGGG
13401 TATTAGTTTC AATTATTGGC CTCTTACTAG CCTTAGAATT AGCTAACTTA
      ATAATCAAAG TTAATAACCG GAGAATGATC GGAATCTTAA TCGATTGAAT
13451 ACTAATACCC AATTCAAAAT TAACCCTACC CTTTATACTC ACCACTTCTC
      TGATTATGGG TTAAGTTTTA ATTGGGATGG GAAATATGAG TGGTGAAGAG
13501 CAATATACTC GGTTACTTCC CACAAATTAT CCATCGTCTC CTACCAAAAA
      GTTATATGAG CCAATGAAGG GTGTTTAATA GGTAGCAGAG GATGGTTTTT
13551 TCAACTTAAG CTGAGCTCAA CATATCTCAA CTCATCTAAT TGACCAAACA
      AGTTGAATTC GACTCGAGTT GTATAGAGTT GAGTAGATTA ACTGGTTTGT
13601 TGAAATGAAA AAATTGGACC AAAAAGTACT CTTATTCAAC AAACCCCATT
      ACTTTACTTT TTTAACCTGG TTTTTCATGA GAATAAGTTG TTTGGGGTAA
13651 AATTAAATTA TCTACTCAAC CACAACAAGG TTATATTAAA ATTTATCTCA
      TTAATTTAAT AGATGAGTTG GTGTTGTTCC AATATAATTT TAAATAGAGT
13701 TACTACTTTT CCTTACATTA ACCCTAACTT TATTAACTTC ATTAACCTAA
      ATGATGAAAA GGAATGTAAT TGGGATTGAA ATAATTGAAG TAATTGGATT
13751 CCACACGTAA AGTTCCCCAA GACAATCCTC GAGTTAACTC CAATACCACA
      GGTGTGCATT TCAAGGGGTT CTGTTAGGAG CTCAATTGAG GTTATGGTGT
13801 AACAAAGTTA ACAACAATAC TCATCCACTT AAAACTAATA ATCACCCACC
      TTGTTTCAAT TGTTGTTATG AGTAGGTGAA TTTTGATTAT TAGTGGGTGG
13851 ATTAGCATAT AACAAAGCTA CCCCTACAAG ATCCCCACGA ACTATCTCCA
      TAATCGTATA TTGTTTCGAT GGGGATGTTC TAGGGGTGCT TGATAGAGGT
13901 TACTACTCAT CTCCTCTACC CCTACTCAAC TTAATTCAAA TCACTCAACT
      ATGATGAGTA GAGGAGATGG GGATGAGTTG AATTAAGTTT AGTGAGTTGA
13951 ATAAAATATT CACCAACAAA AACTAAAACT ACTAAATAAA ACCCAACATA
      TATTTTATAA GTGGTTGTTT TTGATTTTGA TGATTTATTT TGGGTTGTAT
14001 TAGTAATACA GATCAACTAC CTCACGACTC AGGATAAGGC TCAGCAGCAA
      ATCATTATGT CTAGTTGATG GAGTGCTGAG TCCTATTCCG AGTCGTCGTT
14051 GCGCTGCCGT ATAAGCAAAT ACTACCAACA TCCCCCCTAA ATAAATTAAA
      CGCGACGGCA TATTCGTTTA TGATGGTTGT AGGGGGGATT TATTTAATTT
14101 AACAAAACCA ATGATAAAAA AGACCCTCCA TGACCTACTA ATAATCCACA
      TTGTTTTGGT TACTATTTTT TCTGGGAGGT ACTGGATGAT TATTAGGTGT
14151 CCCCACTCCA GCAGCCATAA CCAACCCTAA CGCAGCATAA TAAGGAGAAG
      GGGGTGAGGT CGTCGGTATT GGTTGGGATT GCGTCGTATT ATTCCTCTTC
14201 GATTAGACGC TACACCTATT AAACCTAAAA TTAAACAAAC CATTATTAAA
      CTAATCTGCG ATGTGGATAA TTTGGATTTT AATTTGTTTG GTAATAATTT
14251 AACATAAAAT ATACCATTAC TCCCACCTGG ATTTTAACCA AGACTAATAA
      TTGTATTTTA TATGGTAATG AGGGTGGACC TAAAATTGGT TCTGATTATT
14301 CTTGAAAAAC TATCGTTGTC CATTCAACTA TAAGAATTTA TGGCCATAAA
      GAACTTTTTG ATAGCAACAG GTAAGTTGAT ATTCTTAAAT ACCGGTATTT
14351 TATCCGAAAA ACCCACCCAC TCCTAAAAAT TATTAACCAA ACCCTAATTG
      ATAGGCTTTT TGGGTGGGTG AGGATTTTTA ATAATTGGTT TGGGATTAAC
14401 ACCTCCCAAC CCCATCTAAC ATTTCAATTT GATGAAACTT CGGTTCACTT
      TGGAGGGTTG GGGTAGATTG TAAAGTTAAA CTACTTTGAA GCCAAGTGAA
14451 CTAGGACTAT GTTTAATTAT TCAAATCCTT ACAGGACTTT TCCTAGCAAT
      GATCCTGATA CAAATTAATA AGTTTAGGAA TGTCCTGAAA AGGATCGTTA
14501 ACATTATACC GCAGACATTT CCATAGCCTT CTCCTCAGTA ATCCATATTT
      TGTAATATGG CGTCTGTAAA GGTATCGGAA GAGGAGTCAT TAGGTATAAA

14551 GCCGCGATGT TAACTATGGC TGACTTATTC ATAACATACA CGCCAATGGA CGGCGCTACA ATTGATACCG ACTGAATAAG TATTGTATGT GCGGTTACCT 14601 GCCTCATTGT TCTTTGTTTG CGTATATTTA CATATTGCCC GAGGACTTTA CGGAGTAACA AGA.A.AC AAAC GCATATAAAT GTATAACGGG CTCCTGAAAT 14651 CTATGGCTCC TACCTTTATA AAGA.AACATG AA.ATATTGGA GTAATTCTAT GATACCGAGG ATGGAAATAT TTCTTTGTAC TTTATAACCT CATTAAGATA 14701 TATTTCTACT AATAGCCACA GCCTTCGTAG GTTATGTACT ACCATGAGGA ATAAAGATGA TTATCGGTGT CGGAAGCATC CAATACATGA TGGTACTCCT 14751 CAGATATCCT TCTGAGGCGC CACAGTCATT ACCAACCTCC TATCCGCCTT GTCTATAGGA AGACTCCGCG GTGTCAGTAA TGGTTGGAGG ATAGGCGGAA 14801 CCCCTACATT GGA.AATATAT TAGTCCAATG AATTTGAGGT GGCTTTTCAG GGGGATGTAA CCTTTATATA ATCAGGTTAC TTAAACTCCA C C GA~AAAGTC 14851 TAGATAGCGC CACCCTAACA CGATTCTTCG CATTCCACTT CCTCCTACCT ATCTATCGCG GTGGGATTGT GCTAAGAAGC GTAAGGTGAA GGAGGATGGA 14901 TTTCTAATCA CAGCATTAAT AATAATTCAC ATCCTTTTCC TACACGAGAC AAAGATTAGT GTCGTAATTA TTATTAAGTG TAGGA~A,AAGG ATGTGCTCTG 14951 AGGCTCAAGC AACCCCATAG GACTTAACTC TGACATAGAC AAAATTCCCT TCCGAGTTCG TTGGGGTATC CTGAATTGAG ACTGTATCTG TTTTAAGGGA 15001 TTCACCCCTA CTTCTCCTAT AAAGATACAC TTGGTTTCTT TATTATAATT AAGTGGGGAT GAAGAGGATA TTTCTATGTG AACCAAAGAA ATAATATTAA 15051 ATTCTCCTAG GAGTTCTAGC CTTATTCCTC CCTAACCTGT TAGG~GACGC TAAGAGGATC CTCAAGATCG GAATAAGGAG GGATTGGACA ATCCTCTGCG 15101 TGAAAACTTT ATCCCTGCTA ACCCTCTTAT CACCCCTCCC CATATTAA.AC AC TTTTGA.AA TAGGGACGAT TGGGAGAATA GTGGGGAGGG GTATAATTTG 15151 CTGAATGATA CTTCCTATTC GCTTACGCCA TCCTCCGATC TATCCCTAAT GACTTACTAT GAAGGATAAG CGAATGCGGT AGGAGGCTAG ATAGGGATTA 15201 AAATTAGGAG GAGTCCTAGC TCTTCTATTC TCCATCTTCA TCCTCATGTT TTTAATCCTC CTCAGGATCG AGAAGATAAG AGGTAGAAGT AGGAGTACAA 15251 AATTCCTCTA CTACACACCT CTAAACAACG AAGCAGTATC TTCCGCCCAC TTAAGGAGAT GATGTGTGGA GATTTGTTGC TTCGTCATAG AAGGCGGGTG 153 01 TTATACAAGT TTTCTTCTGA ATTCTTATAA CCGATATATT AATCTTAACC AATATGTTCA AAAGAAGACT TAAGAATATT GGCTATATAA TTAGAATTGG 15351 TGAATCGGAG GACAGCCAGT TGAACAACCG TTTATTCTTA TTGGACAAAT ACTTAGCCTC CTGTCGGTCA ACTTGTTGGC AAATAAGAAT AACCTGTTTA 15401 
CGCATCCATT ACCTACTTTT CCTTATTCCT TATTGTAATC CCACTTACAG GCGTAGGTAA TGGATGA,.AAA GGAATAAGGA ATAACATTAG GGTGAATGTC 15451 GCTGATGAGA AAACAAAATC CTCAGCCTAA ACTAGTTTTG GTAGCTTAAC CGACTACTCT TTTGTTTTAG GAGTCGGATT TGATCAAAAC CATCGAATTG 15501 TTAATAAAGC ATCGACCTTG TAAGCCGAAG ACCGGAGGTT TGAACCCTCC AATTATTTCG TAGCTGGAAC ATTCGGCTTC TGGCCTCCAA ACTTGGGAGG 15551 C C A.A.AAC ATA TCAGGGGGAG GAGGGTTAAA CTCCTGCCCT TGGCTCCCAA GGTTTTGTAT AGTCCCCCTC CTCCCAATTT GAGGACGGGA ACCGAGGGTT 15601 AGCCAAGATT CTGCCCAAAC TGCCCCCTGT AATGTCATTA AAGCATGAAA TCGGTTCTAA GACGGGTTTG ACGGGGGACA TTACAGTAAT TTCGTACTTT 15651 ATCA.AATGAA AATTTGATTT TCCAAAAGTA AGTCAGAGTG ACATATTAAT TAGTTTACTT TTAAACTAAA AGGTTTTCAT TCAGTCTCAC TGTATAATTA 15701 GACATAGCCC ACATACCTTA ATATAGTACA TTACTTAACT CGACTAATCA CTGTATCGGG TGTATGGAAT TATATCATGT AATGAATTGA GCTGATTAGT 15751 ACATTAATAG ACTATTCCCT ACTACTATTA TTATCTATGC TTAATCCTCA TGTAATTATC TGATAAGGGA TGATGATAAT AATAGATACG AATTAGGAGT 15801 TTAATCTATA TTCCACTATA TCATAACATA CTATGCTTAA TACTCATTAA AATTAGATAT AAGGTGATAT AGTATTGTAT GATACGAATT ATGAGTAATT 15851 TATACTATCC ACTATTTCAT TACATTATAT TCTTTAGCCC TTP~P~AAACTT ATATGATAGG TGATA.AAGTA ATGTAATATA AGAA.ATC GGG AATTTTTGAA 15901 AAGATCA.AAA TTTCCATGAC AT TA TTATTTAACC CTAAGATACT 374

TTCTAGTTTT AAAGGTACTG TATTTTTTAT AATAAATTGG GATTCTATGA 15951 TA.AATTACAA ATTATGTGGG CTGGTAAGAA CATCACATCC CGCTATTGTA ATTTAATGTT TAATACACCC GACCATTCTT GTAGTGTAGG GCGATAACAT 16001 AGAATAAAAT AGCTCTATTT GTGGCGCTGT ACTCGATTTA TCCCTATCAA TCTTATTTTA TCGAGATAAA CACCGCGACA TGAGCTAAAT AGGGATAGTT 16051 TTGATCAAAA TTGGCATCTG ATTAATGCTC GAAATACTTT AATCCTTAAT AACTAGTTTT AACCGTAGAC TAATTACGAG CTTTATGAAA TTAGGAATTA 16101 CGCGTCAAGA ATGCCAGATC CGCTAGCTCC CTTTAATGGT ATTTTCGTCC GCGCAGTTCT TACGGTCTAG GCGATCGAGG GAAATTACCA TAAAAGCAGG 16151 TTGACTGTCT CAAGATTTAC TGTCCTCCCT GTTTTTTTTT TTGGGGATGA AACTGACAGA GTTCTAAATG ACAGGAGGGA C AACCCCTACT 16201 AGCAGTTACT AAGCCCGGGA GGGCTGATCT AGAACACTGA AATAAATTTG TCGTCAATGA TTCGGGCCCT CCCGACTAGA TCTTGTGACT TTATTTAAAC 16251 AATCCACCTC GACATTCATA TTTAATACTC ATTACTCACC ATTCATGAAT TTAGGTGGAG CTGTAAGTAT AAATTATGAG TAATGAGTGG TAAGTACTTA 16301 TATAATTGTC AAGTTGACCA TACCTCAGAG GGATAGAGAA ACTGACGCCA ATATTAACAG TTCAACTGGT ATGGAGTCTC CCTATCTCTT TGACTGCGGT 16351 TAGGCGACAA GTTTCGATTT TTTTGATTAA TGAAGCTATG GTTT ATCCGCTGTT CAAAGCTAAA AAA.AC TAATT ACTTCGATAC CAAATTTTTT 16401 AGCATTCTTT TAACCCTCAT GP►~~AAAGC GA TTCGTAATAA ATATTAATGT TCGTAAGAAA ATTGGGAGTA CTTTTTCGCT AAGCATTATT TATAATTACA 16451 AAGGCGCATA GAATAATCCT AGTACATCCT TCACTTTATT AGGCATAAAT TTCCGCGTAT CTTATTAGGA TCATGTAGGA AGTGA.AATAA TCCGTATTTA 16501 TTATTTTTAT TAAGGTTTCC CCTAGGTCTT P►AAAATTCAA GGCCGCCTTA AATP.,~~AAATA ATTCCAAAGG GGATCCAGAA TTTTTAAGTT CCGGCGGAAT 16551 CTTTTTTTGA TP►~~A,AAC C C C CCTCCCCCTA ATATACACGG TTTTTTTTTT G CT ATTTTTGGGG GGAGGGGGAT TATATGTGCC 16601 ATTCCTCGAA AAACCCCTAA AACGAAGACC GGACATATAT TTTTGAATTA APsAAC TTAAT TAAGGAGCTT TTTGGGGATT TTGCTTCTGG CCTGTATATA 16651 GCATGCGAAA TATATTCTGT ATATATATAG TGTTACACTA TGAT CGTACGCTTT ATATAAGACA TATATATATC ACAATGTGAT ACTA

tRNA 1..70
    product = tRNA-Phe
rRNA 69..1023
    product = 12S ribosomal RNA
tRNA 1024..1095
    product = tRNA-Val
rRNA 1096..2764
    product = 16S ribosomal RNA
tRNA 2765..2839
    product = tRNA-Leu
gene 2840..3814
    gene = ND1
    product = NADH dehydrogenase subunit 1
tRNA 3816..3884
    product = tRNA-Ile
tRNA 3883..3954
    product = tRNA-Gln
tRNA 3955..4023
    product = tRNA-Met
gene 4024..5067
    gene = ND2
    product = NADH dehydrogenase subunit 2
tRNA 5067..5137
    product = tRNA-Trp
tRNA complement (5139..5207)
    product = tRNA-Ala
tRNA complement (5208..5280)
    product = tRNA-Asn
tRNA complement (5313..5378)
    product = tRNA-Cys
tRNA complement (5380..5449)
    product = tRNA-Tyr
gene 5451..7005
    gene = CO1
    product = cytochrome c oxidase subunit 1
tRNA complement (7007..7077)
    product = tRNA-Ser
tRNA 7082..7151
    product = tRNA-Asp
gene 7159..7849
    gene = CO2
    product = cytochrome c oxidase subunit 2
tRNA 7850..7923
    product = tRNA-Lys
gene 7924..8091
    gene = ATP8
    product = ATP synthase F0 subunit 8
gene 8083..8765
    gene = ATP6
    product = ATP synthase F0 subunit 6
gene 8765..9550
    gene = CO3
    product = cytochrome c oxidase subunit 3
tRNA 9553..9622
    product = tRNA-Gly
gene 9623..9973
    gene = ND3
    product = NADH dehydrogenase subunit 3
tRNA 9973..10041
    product = tRNA-Arg
gene 10042..10338
    gene = ND4L
    product = NADH dehydrogenase subunit 4L
gene 10332..11712
    gene = ND4
    product = NADH dehydrogenase subunit 4
tRNA 11713..11781
    product = tRNA-His
tRNA 11782..11848
    product = tRNA-Ser
tRNA 11849..11920
    product = tRNA-Leu
gene 11921..13750
    gene = ND5
    product = NADH dehydrogenase subunit 5
gene complement (13746..14267)
    gene = ND6
    product = NADH dehydrogenase subunit 6
tRNA complement (14268..14337)
    product = tRNA-Glu
gene 14340..15485
    gene = CYTB
    product = cytochrome b
tRNA 15485..15558
    product = tRNA-Thr
tRNA complement (15561..15629)
    product = tRNA-Pro
D-Loop 15631..16694
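The coordinates in the feature table above are 1-based and inclusive, and features marked "complement" lie on the strand opposite the numbered (top) strand of the listing. As a minimal sketch, and not part of the original analysis, the following Python shows one way such a location string could be parsed and the matching subsequence pulled from a top-strand sequence. The function names `parse_location` and `extract_feature` are hypothetical, and the ten-base toy sequence is invented purely for illustration.

```python
import re

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_location(text):
    """Parse 'M..N' or 'complement (M..N)' into (start, end, strand).

    start and end are 1-based inclusive; strand is +1 or -1.
    """
    match = re.fullmatch(
        r"\s*(complement)?\s*\(?\s*(\d+)\.\.(\d+)\s*\)?\s*", text
    )
    if match is None:
        raise ValueError(f"unrecognised location: {text!r}")
    strand = -1 if match.group(1) else 1
    return int(match.group(2)), int(match.group(3)), strand


def extract_feature(sequence, location):
    """Return the feature's bases, reverse-complemented if on the minus strand."""
    start, end, strand = parse_location(location)
    piece = sequence[start - 1:end]  # convert 1-based inclusive to a Python slice
    if strand == -1:
        piece = piece.translate(_COMPLEMENT)[::-1]
    return piece


# Toy example (invented sequence, not taken from the appendix).
toy = "ATGCCGTAAG"
print(extract_feature(toy, "2..5"))
print(extract_feature(toy, "complement (2..5)"))
```

With a full top-strand sequence read into a single string (spaces and position numbers removed), a call such as `extract_feature(genome, "complement (13746..14267)")` would return the ND6 coding strand under these assumptions.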

Pseudocarcharias kamoharai mitochondrion, complete genome

1 GCTAGTGTAG CTTAATTTAG AGTATGGCAC TGAAAATGCT AAGATGAAAA CGATCACATC GAATTAAATC TCATACCGTG ACTTTTACGA TTCTACTTTT 51 ATP►~~AAATTT TCCGCAAGCA CAAAGGTTTG GTCCTGGCCT CAGTATTAAT TATTTTTAAA AGGCGTTCGT GTTTCCAAAC CAGGACCGGA GTCATAATTA 101 TGTAACCAAA ATTATACATG CAAGTTTCAG CATCCCTGTG AGAATGCCCT ACATTGGTTT TAATATGTAC GTTCAAAGTC GTAGGGACAC TCTTACGGGA 151 AATTATTCTA TTAACTAATT AGGAGCAGGT ATCAGGCACA CGCACGTAGC TTAATAAGAT AATTGATTAA TCCTCGTCCA TAGTCCGTGT GCGTGCATCG 201 CCAAGACACC TTGCTAAGCC ACACCCCCAA GGGACTTCAG CAGTAATAAA GGTTCTGTGG AACGATTCGG TGTGGGGGTT CCCTGA.AGTC GTCATTATTT 251 TATTGATTAT CATAAGCGCA AGCTTGAATC AGTTAAAGTT AACAGAGTTG ATAACTAATA GTATTCGCGT TCGAACTTAG TCAATTTCAA TTGTCTCAAC 301 GTAA.ATCTCG TGCCAGCCAC CGCGGTTATA CGAGTAACTC ATATTAATAC CATTTAGAGC ACGGTCGGTG GCGCCAATAT GCTCATTGAG TATAATTATG 351 TTTCCCGGCG TA.AAGAGTGA TTTAAGGAAT ATCCACAATA AC TA.AAGC TA AAAGGGCCGC ATTTCTCACT AAATTCCTTA TAGGTGTTAT TGATTTCGAT 401 AGACCTCATC AAGCTGTTAC ACGCACCCAT GAGCGGAATC ACCAACAACG TCTGGAGTAG TTCGACAATG TGCGTGGGTA CTCGCCTTAG TGGTTGTTGC 451 A.AAGTGACTT TATATACTAG AAATCTTGAT GTCACGACAG TTAGACCCCA TTTCACTGAA ATATATGATC TTTAGAACTA CAGTGCTGTC AATCTGGGGT 501 AACTAGGATT AGATACCCTA CTATGTCTAA CCACAA.ACCT AAACAATAAA TTGATCCTAA TCTATGGGAT GATACAGATT GGTGTTTGGA TTTGTTATTT 551 TTACTTTATT GTTCGCCAGA GTACTACAAG CGCTAGCTTA A.AAC C C AA.AG AATGAAATAA CAAGCGGTCT CATGATGTTC GCGATCGAAT TTTGGGTTTC 601 GACTTGGCGG TGTCCCAA.AC CCACCTAGAG GAGCCTGTTC TGTAACCGAT CTGAACCGCC ACAGGGTTTG GGTGGATCTC CTCGGACAAG ACATTGGCTA 651 AATCCTCGTT A.AAC C TC AC C ACTTCTGGCC ATCCCCGTCT ATATACCGCC TTAGGAGCAA TTTGGAGTGG TGAAGACCGG TAGGGGCAGA TATATGGCGG 701 GTCGTCAGCT AACCCTATGA AGGTTp~~?~AA GTAAGCA,AAA AGAATTAACT CAGCAGTCGA TTGGGATACT TCCAATTTTT CATTCGTTTT TCTTAATTGA 751 TCCAAACGTC AGGTCGAGGT GTAGCAAATG AAGTGGATAG AAATGGGCTA AGGTTTGCAG TCCAGCTCCA CATCGTTTAC TTCACCTATC TTTACCCGAT 801 CATTTTCTAT AAAGAAGACA CGAATGGTAA AC TGP~~AAAT TAC TTA.AAGG GTAA.AAGATA TTTCTTCTGT GCTTACCATT TGACTTTTTA ATGAATTTCC 851 TGGATTTAGC AGTAAGAAA.A 
GATTAGAGAG CTTTTCTGAA ACTGGCTCTG ACCTAAATCG TCATTCTTTT CTAATCTCTC GAAAAGACTT TGACCGAGAC 901 GGACGCGCAC ACACCGCCCG TCACTCTCCT CAAAAAAATC TATTTATTTT CCTGCGCGTG TGTGGCGGGC AGTGAGAGGA GTTTTTTTAG ATAAATAAAA 951 TAATTAAAAG AAAATTATCA AGAGGAGGCA AGTCGTAACA TGGTAAGTGT ATTAATTTTC TTTTAATAGT TCTCCTCCGT TCAGCATTGT ACCATTCACA

1001 ACTGGAAAGT GCACTTGGAT TCAA.AATGTG GCTAAACCAG CAAAGCACCT TGACCTTTCA CGTGAACCTA AGTTTTACAC CGATTTGGTC GTTTCGTGGA 1051 CCCTTACACC GAGGAGATAC CCGTGCAACT CGGATCATTT TGAACATTAA GGGAATGTGG CTCCTCTATG GGCACGTTGA GCCTAGTAAA ACTTGTAATT 1101 AGCTAGCCTG TATATCTACC TAAATTCAAC CTTATCATTT ACCCTGTATA TCGATCGGAC ATATAGATGG ATTTAAGTTG GAATAGTAAA TGGGACATAT 1151 TTAACTCCTA ACTAAAACAT TTTACCTTTT TAGTATGGGC GACAGAACAA AATTGAGGAT TGATTTTGTA AAATGGAAA.A ATCATACCCG CTGTCTTGTT 1201 A.AAC TCAGC G CAATAGACTA TGTACCGTAA GGGA.AAGCTG AA.A,AAGAAAT TTTGAGTCGC GTTATCTGAT ACATGGCATT CCCTTTCGAC TTTTTCTTTA 1251 GAAATAAATA ATTAAAGTAA TP~~A,AAGCAG AGATTTAACC TCGTACCTTT CTTTATTTAT TAATTTCATT ATTTTTCGTC TCTAAATTGG AGCATGGA.AA 1301 TGCATCATGA TTTAGCTAGA A.A.AACTAGAC A.A.AGAGATC T TAAGCCTATC ACGTAGTACT AAATCGATCT TTTTGATCTG TTTCTCTAGA ATTCGGATAG 1351 CTCCCGA.AAC TAAACGAGCT ACTCCGAAGC AGCACAATTA GAGCCAACCC GAGGGCTTTG ATTTGCTCGA TGAGGCTTCG TCGTGTTAAT CTCGGTTGGG 1401 GTCTCTGTGG CAAAAGAGTG GGAAGACTTT CGAGTAGCGG TGACAAGCCT CAGAGACACC GTTTTCTCAC CCTTCTGAAA GCTCATCGCC ACTGTTCGGA 1451 ATCGAGTTTA GTGATAGCTG GTTGCCCAAG AAAAGAACCT TAATTCTGCA TAGCTCAAAT CACTATCGAC CAACGGGTTC TTTTCTTGGA ATTAAGACGT 1501 TTAATTCTTT CATCAC CA.AA AAGTCTATCT TAACAAGGTT AAACATAATA AATTAAGAAA GTAGTGGTTT TTCAGATAGA ATTGTTCCAA TTTGTATTAT 1551 ATTAATAGTT ATTTAGAAGA GGTACAGCCC TTCTGAACTA AGATACAACT TAATTATCAA TAAATCTTCT CCATGTCGGG AAGACTTGAT TCTATGTTGA 1601 TTTTAAGGCG GA.AAATGATC ATATTAACTA AGGTTTTTAC CTCAGTGGAC AAAATTCCGC CTTTTACTAG TATAATTGAT TC CAAAA.ATG GAGTCACCTG 1651 CCAAAAGCAG CCACCTGAAA AGTAAGCGTC ACAGCTCCAG TATCACAAAA GGTTTTCGTC GGTGGACTTT TCATTCGCAG TGTCGAGGTC ATAGTGTTTT 1701 ACCTATAATT TAGATATTCT TCTCATAATC CCCTTAACTT TATTGGGTTA TGGATATTAA ATCTATAAGA AGAGTATTAG GGGAATTGAA AT.AAC C C AAT 1751 TTTTATAAAA TTATAAAAGA ACTTATGCTA AAATGAGTAA TAAGAGAATA AAAATATTTT AATATTTTCT TGAATACGAT TTTACTCATT ATTCTCTTAT 1801 AATCTCTCCG GACATAAGTG TACGTCAGAT AGAATTAATT CACTGACAAT TTAGAGAGGC CTGTATTCAC ATGCAGTCTA TCTTAATTAA GTGACTGTTA 1851 TAAACGAACC 
CAAACTGAGG CTATTATATT AATATTACCT TAACTAGAAA ATTTGCTTGG GTTTGACTCC GATAATATAA TTATAATGGA ATTGATCTTT 1901 ACCTTATTAT AACATTCGTT AACCCTACAC AGGAACGTCT TAAGGAAAGA TGGAATAATA TTGTAAGCAA TTGGGATGTG TCCTTGCAGA ATTCCTTTCT 1951 TTTAAAGAAA ATAAAGGAAC TCGGCAAACA TAAACTCCGC CTGTTTACCA AAATTTCTTT TATTTCCTTG AGCCGTTTGT ATTTGAGGCG GACAAATGGT 2001 A.A.AACATCGC CTCTTGAATA TTATAAGAGG TCCCGCCTGC CCTGTGACAA TTTTGTAGCG GAGAACTTAT AATATTCTCC AGGGCGGACG GGACACTGTT 2051 TGTTTAACGG CCGCGGTATT TTGACCGTGC AAAGGTAGCG TAATCACTTG ACAAATTGCC GGCGCCATAA AACTGGCACG TTTCCATCGC ATTAGTGAAC 2101 TC TTTTA.AAT GAAGACCCGT ATGAAAGGCA TCACGAGAGT TTAACTGTCT AG~TTTA CTTCTGGGCA TACTTTCCGT AGTGCTCTCA AATTGACAGA 2151 CTATTTTCTA ATCAATGAAA TTGATCTGCT CGTGCAGAAG CGAGCATAAC GAT~~AAAGAT TAGTTACTTT AACTAGACGA GCACGTCTTC GCTCGTATTG 2201 TACATTAGAC GAGAAGACCC TATGGAGCTT CAAACACATA AATTAATTAT ATGTAATCTG CTCTTCTGGG ATACCTCGAA GTTTGTGTAT TTAATTAATA 2251 GTAAATTAAT TATTCCACGG ATATAAATAA AAATATAATA TTCTTAATTT CATTTAATTA ATAAGGTGCC TATATTTATT TTTATATTAT AAGAATTAAA 2301 AACTGTTTTT GGTTGGGGTG ACCAAGGGGA A.A.AATAAATC CCCCTTATCG TTGACP~~A.AA CCAACCCCAC TGGTTCCCCT TTTTATTTAG GGGGAATAGC 2351 ACTGAGTACT TAAATACTTA AA.AATTAAAG TTACAACTTT AATTAATAAA ~~s

TGACTCATGA ATTTATGAAT TTTTAATTTC AATGTTGAAA TTAATTATTT 2401 ATATTTATCG P~~TGACC CAGAACTTTC TGATCAATGA ACCAAGTTAC TATAAATAGC TTTTTACTGG GTCTTGAAAG ACTAGTTACT TGGTTCAATG 2451 CCTAGGGATA ACAGCGCAAT CCTTTCCCAG AGTCCCTATC GAC GA.AAGGG GGATCCCTAT TGTCGCGTTA GGAAAGGGTC TCAGGGATAG CTGCTTTCCC 2501 TTTACGACCT CGATGTTGGA TCAGGACATC CTAATGATGC AACCGTTATT AAATGCTGGA GCTACAACCT AGTCCTGTAG GATTACTACG TTGGCAATAA 2551 AAGGGTTCGT TTGTTCAACG ATTAATAGTC CTACGTGATC TGAGTTCAGA TTCCCAAGCA AACAAGTTGC TAATTATCAG GATGCACTAG ACTCAAGTCT 2601 CCGGAGAAAT CCAGGTCAGT TTCTATCTAT GAATTAATTT TTCCTAGTAC GGCCTCTTTA GGTCCAGTCA AAGATAGATA CTTAATTAAA AAGGATCATG 2651 GAAAGGACCG GAAAAATGAA GCCAATACCC TAGGCACGCT TCATTTTCAT CTTTCCTGGC CTTTTTACTT CGGTTATGGG ATCCGTGCGA AGTAAAAGTA 2701 CTATTGAAAC AAACTA.AAAT AGATAAGAAA ATATTATCTA CTACCCAAGA GATAACTTTG TTTGATTTTA TCTATTCTTT TATAATAGAT GATGGGTTCT 2751 A.AAGGGTTGT TGGGGTGGCA GAGCCTGGCA AGTGCAAAAG ACCTAAGCTC TTTCCCAACA ACCCCACCGT CTCGGACCGT TCACGTTTTC TGGATTCGAG 2801 TTTAATTCAG AGGTTCA.AAT CCTCTCCCCA ACTATGCTTG AAGCCCTCCT AAATTAAGTC TCCAAGTTTA GGAGAGGGGT TGATACGAAC TTCGGGAGGA 2851 ACTTTACTTA ATTAATCCAC TTACCTATAT TATCCCTATT TTATTAGCTA TGAAATGAAT TAATTAGGTG AATGGATATA ATAGGGATAA AATAATCGAT 2901 CAGCTTTCCT CACCCTAGTT GA.ACGP.~~AAA TTCTTGGCTA TATACAACTT GTCGAAAGGA GTGGGATCAA CTTGCTTTTT AAGAACCGAT ATATGTTGAA 2951 CGTAAAGGAC CCAACATTGT GGGTCCCTAT GGTCTTCTTC AACCAATTGC GCATTTCCTG GGTTGTAACA CCCAGGGATA CCAGAAGAAG TTGGTTAACG 3001 AGATGGCCTA A.AACTATTTA TTAAAGAACC CATCCACCCA TCAACATCCT TCTACCGGAT TTTGATAAAT AATTTCTTGG GTAGGTGGGT AGTTGTAGGA 3051 CTCCATTTTT ATTTCTGGCC ACCCCCACAA TAGCCCTAAC ACTAGCCCTC GAGGTP,~~AAA TAAAGACCGG TGGGGGTGTT ATCGGGATTG TGATCGGGAG 3101 CTTTTATGAA TACCTCTTCC TCTTCCTCAC TCTATTATTA ACCTCAATCT GP.~AAATACTT ATGGAGAAGG AGAAGGAGTG AGATAATAAT TGGAGTTAGA 3151 AGGCCTACTA TTTATCCTAG CAATCTCAAG CCTAACCGTT TATACTATCC TCCGGATGAT AAATAGGATC GTTAGAGTTC GGATTGGCAA ATATGATAGG 3201 TAGGCTCCGG ATGAGCATCT AATTCAAAAT ACGCCCTAAT GGGAGCCCTT ATCCGAGGCC TACTCGTAGA TTAAGTTTTA 
TGCGGGATTA CCCTCGGGAA 3251 CGAGCCGTAG CACAAACAAT CTCTTATGAA GTAAGTCTCG GATTAATTCT GCTCGGCATC GTGTTTGTTA GAGAATACTT CATTCAGAGC CTAATTAAGA 3301 CCTATCAATA ATTATATTTA CAGGAGGCTT CACCCTCCAT ACCTTTAACT GGATAGTTAT TAATATAA.A.T GTCCTCCGAA GTGGGAGGTA TGGAAATTGA 3351 TAGCACAAGA AACAATCTGA TTAATTATTC CAGGCTGACC ATTAGCCCTA ATCGTGTTCT TTGTTAGACT AATTAATAAG GTCCGACTGG TAATCGGGAT 3401 ATATGATATG TTTCAACCCT AGCAGAAACT AACCGAGTAC CTTTTGATTT TATACTATAC AAAGTTGGGA TCGTCTTTGA TTGGCTCATG GAAAACTAAA 3451 AACAGAAGGA GAATCAGAAC TAGTCTCAGG ATTTAACATC GAATATGCAG TTGTCTTCCT CTTAGTCTTG ATCAGAGTCC TAAATTGTAG CTTATACGTC 3501 GAGGCTCATT TGCCCTATTT TTCCTCGCCG AATATACAAA CATTTTATTA CTCCGAGTAA ACGGGATA.AA AAGGAGCGGC TTATATGTTT GTAA.AATAAT 3551 ATAA.ATAC C C TCTCAGTTAT CTTATTTATA GGTTCCTCCT ACAACCCATT TATTTATGGG AGAGTCAATA GAATAAATAT CCAAGGAGGA TGTTGGGTAA 3601 TTTTCCAGAA ATTTCAACAC TTAGCTTAAT AATAAA.AGCA ACTTTATTAA ~GGTCTT TA.AAGTTGTG AATCGAATTA TTATTTTCGT TGA.AATAATT 3651 CCCTATTTTT CTTATGAATT CGAGCATCCT ACCCACGTTT TCGTTATGAC GGGATP~~AAA GAATACTTAA GCTCGTAGGA TGGGTGCA.A.A AGCAATACTG 3701 CAACTTATAC ACTTAGTATG TTTT CTACCCTTAA CCCTAGCAAT GTTGAATATG TGAATCATAC TTTTTTA.AA.A GATGGGAATT GGGATCGTTA 379

3751 TATACTATGA CATATCGCCC TCCCCCTAGC TACAGCAAGC CTACCTCCTC ATATGATACT GTATAGCGGG AGGGGGATCG ATGTCGTTCG GATGGAGGAG 3801 TAACCTAATG GAAGTGTGCC TGAACAAAGG ACCACTTTGA TAGAGTGGAT ATTGGATTAC CTTCACACGG ACTTGTTTCC TGGTGAAACT ATCTCACCTA 3851 AATGAAAGTT ~,AA,ATC TC TC CTCTTCCTAG A►~~AAATAGGA CTTGAACCTA TTACTTTCAA TTTTAGAGAG GAGAAGGATC TTTTTATCCT GA.AC TTGGAT 3901 TAATTAAGAG ATCA►AAACTC TTTGTATTTC CAATTATACT ATTTCCTAAG ATTAATTCTC TAGTTTTGAG AAACATAAAG GTTAATATGA TAAAGGATTC 3951 TAA.AGTCAGC TAATAAAGCT TTTGGGCCCA TACCCCAACC ATGTTGGTTA ATTTCAGTCG ATTATTTCGA AAACCCGGGT ATGGGGTTGG TACAACCAAT 4001 AAATCCTTCC TCTACTAATG AACCCCATTG TATTATCCAT TATAATTTCA TTTAGGAAGG AGATGATTAC TTGGGGTAAC ATAATAGGTA ATATTA.AAGT 4051 AGCCTAGGCC TAGGAACCAT TCTAACATTT ATTGGCTCAC ATTGACTCCT TCGGATCCGG ATCCTTGGTA AGATTGTAAA TAACCGAGTG TAACTGAGGA 4101 AGTTTGAATA GGC C TC GA.AA TTAATACTCT AGCCATTATC CCTTTAATAA TCAAACTTAT CCGGAGCTTT AATTATGAGA TCGGTAATAG GGA.AATTATT 4151 TCCGGCAACA TCACCCTCGA GCCGTAGAAG CTTCAACAAA ATATTTTATT AGGCCGTTGT AGTGGGAGCT CGGCATCTTC GAAGTTGTTT TATP~AAATAA 4201 ACACAAGCAA CCGCCTCAGC TTTACTTTTA TTCGCTAGCA TCATAAACGC TGTGTTCGTT GGCGGAGTCG AAATGAAAAT AAGCGATCGT AGTATTTGCG 4251 CTGAACTTCA GGTGAATGAG GCTTAACTGA AATAATTAAT CCAACCTCCG GACTTGAAGT CCACTTACTC CGAATTGACT TTATTAATTA GGTTGGAGGC 4301 CCACACTAGC TACAATCGCA TTAGCACTAA AAATTGGCTT AGCCCCTCTC GGTGTGATCG ATGTTAGCGT AATCGTGATT TTTAACCGAA TCGGGGAGAG 4351 CACTTCTGAT TACCCGAAGT TCTCCAAGGC TTAGACCTTA CTACAGGCCT GTGAAGACTA ATGGGCTTCA AGAGGTTCCG AATCTGGAAT GATGTCCGGA 4401 TATCCTCTCC ACATGACAAA AACTCGCTCC ATTTGCTATC CTCTTACAAC ATAGGAGAGG TGTACTGTTT TTGAGCGAGG TAAACGATAG GAGAATGTTG 4451 TTTACCCCTC ATTAAATTCT AATCTACTTA TTTTTCTTGG AATCCTCTCA AAATGGGGAG TAATTTAAGA TTAGATGAAT P~~AAAGAAC C TTAGGAGAGT 4501 ACTATAGTAG GAGGCTGAGG AGGTTTAAAC CAAACCCAAT TACG T TGATATCATC CTCCGACTCC TCCA.AATTTG GTTTGGGTTA ATGCTTTTTA 4551 CCTAGCCTAC TCCTCAATCG CACACCTTGG TTGAATAATT TCAATCCTAC GGATCGGATG AGGAGTTAGC GTGTGGAACC AACTTATTAA AGTTAGGATG 4601 ATTATTCTCA TAACCTCACT 
CAATTAA.ATC TTATCTTATA TATTATTATA TAATAAGAGT ATTGGAGTGA GTTAATTTAG AATAGAATAT ATAATAATAT 4651 ACATCAACAA CCTTCCTACT TTTTAAAACA TTTAATTCAA CCAAAATTAA TGTAGTTGTT GGAAGGATGA AAAATTTTGT A.AATTAAGTT GGTTTTAATT 4701 CTCTATTTCT TCTTCCTCAT CAAAATCTCC CCTTCTATCT ATTATTGCCC GAGATAAAGA AGAAGGAGTA GTTTTAGAGG GGAAGATAGA TAATAACGGG 4751 TCCTAACTTT ATTATCTCTA GGAGGACTAC CTCCACTCTC AGGCTTTATA AGGATTGAAA TAATAGAGAT CCTCCTGATG GAGGTGAGAG TC C GA.AATAT 4801 CCAAAATGAT TAATTTTACA AGAATTAACA A.AACAAAATC TTACCATCCC GGTTTTACTA ATTAAAATGT TCTTAATTGT TTTGTTTTAG AATGGTAGGG 4851 AGCTACTATT ATAGCCATAA TAGCCCTCCT CAGCCTATTC TTCTATTTAC TCGATGATAA TATCGGTATT ATCGGGAGGA GTCGGATAAG AAGATA.AATG 4901 GGTTATGTTA TGCTACAACA CTAACAATAA CTCCAAATTC AATTAATATA CCAATACAAT ACGATGTTGT GATTGTTATT GAGGTTTAAG TTAATTATAT 4951 TCAACATCAT GACGAACTAA ATTATCACAC AAC C TA.AC C C TAACAACAGC AGTTGTAGTA CTGCTTGATT TAATAGTGTG TTGGATTGGG ATTGTTGTCG 5001 CGCCTCATTA TCCATTTTGC TTCTCCCCAT TACTCCCGCC ATCCTCATAT GCGGAGTAAT AGGTAAAACG AAGAGGGGTA ATGAGGGCGG TAGGAGTATA 5051 TAATATCTTA AGAAATTTAG GTTAACAATA GAC CP,~~AAAC C TTCA.AAGTT ATTATAGAAT TCTTTA.AATC CAATTGTTAT CTGGTTTTTG GAAGTTTCAA 5101 TTAAGTAGAA GTGA.AA.ATC T CCTAATTTCT GCTAAGATTT GTAAGACTTT 380

AATTCATCTT CACTTTTAGA GGATTAAAGA C GATTC TA.AA CATTCTGAAA 5151 ATCTCACATC TTCTGAATGC AACCCAGATG CTTTCATTAA GC T~~A,ACCT TAGAGTGTAG AAGACTTACG TTGGGTCTAC GAAAGTAATT CGATTTTGGA 5201 TCTAGATAA.A TAGGCCTTGA TCCTACA.AAA TCTTAGTTAA CAGCTAAGCG AGATCTATTT ATCCGGAACT AGGATGTTTT AGAATCAATT GTCGATTCGC 5251 TTCAATCCAG CGAACTTTTA TC TAC TTTC T CCCGCCGTAA GC TA,AAAGGC AAGTTAGGTC GCTTGAA.AAT AGATGAAAGA GGGCGGCATT CGATTTTCCG 5301 GGGAGAAAGC CCCGGGAGAA AC.A.AAC C TC C GGTTTTGGAT TTGCAATCCA CCCTCTTTCG GGGCCCTCTT TGTTTGGAGG CCAAAACCTA AACGTTAGGT 5351 ACGTAATCAT TTACTGCAGG GCTATGGTAA GAAGAGGAAT TTAACCTCTG TGCATTAGTA AATGACGTCC CGATACCATT CTTCTCCTTA AATTGGAGAC 5401 TTTGCGGAGC TACAA.ACCGC CACTTAGTTC TCAGTCACCT TACCTGTGGC AAACGCCTCG ATGTTTGGCG GTGAATCAAG AGTCAGTGGA ATGGACACCG 5451 AATTAATCGT TGACTATTTT CTACAAACCA CAA.AGATATC GGCACCCTTT TTAATTAGCA ACTGATAAAA GATGTTTGGT GTTTCTATAG CCGTGGGAAA 5501 ATTTAATCTT TGGTGCATGA GCAGGAATAG TGGGAACAGC CCTAAGCCTT TA.AATTAGAA ACCACGTACT CGTCCTTATC ACCCTTGTCG GGATTCGGAA 5551 TTAATTCGAG CTGAACTGGG ACAACCTGGA TCACTTTTAG GAGATGACCA AATTAAGCTC GACTTGACCC TGTTGGACCT AGTG.~►A.AATC CTCTACTGGT 5601 AATCTATAAT GTAATTGTAA CCGCCCATGC ATTCGTAATA ATCTTCTTCA TTAGATATTA CATTAACATT GGCGGGTACG TAAGCATTAT TAGAAGAAGT 5651 TGGTTATACC AGTAATAATC GGTGGATTTG GTAACTGACT AGTACCATTA ACCAATATGG TCATTATTAG CCACCTAAAC CATTGACTGA TCATGGTAAT 5701 ATAATTGGTG CACCAGACAT GGCCTTTCCA C GAATA.AATA ATATAAGCTT TATTAACCAC GTGGTCTGTA CCGGA.A.AGGT GCTTATTTAT TATATTCGAA 5751 CTGACTTCTC CCTCCTTCTT TTCTCCTACT CTTAGCTTCA GCTGGAGTTG GACTGAAGAG GGAGGAAGAA AAGAGGATGA GAATCGAAGT CGACCTCAAC 5801 AAGCCGGAGC CGGTACTGGC TGAACAGTAT ATCCCCCCTT AGCTGGTAAC TTCGGCCTCG GCCATGACCG ACTTGTCATA TAGGGGGGAA TCGACCATTG 5851 CTAGCCCATG CCGGAGCATC CGTTGACTTA GCCATTTTCT CCCTTCATTT GATCGGGTAC GGCCTCGTAG GCAACTGAAT C GGTA.AAAGA GGGAAGTAAA 5901 AGCAGGTATT TCATCTATTT TAGCTTCAAT TAATTTTATC ACAACCATTA TCGTCCATAA AGTAGATAAA ATCGAAGTTA ATTAA.AATAG TGTTGGTAAT 5951 TTAATATAAA ACCACCAGCT ATTTCCCAGT AC CA.AACAC C ATTATTCGTG AATTATATTT 
TGGTGGTCGA TA.AAGGGTCA TGGTTTGTGG TAATAAGCAC 6001 TGATCAATTC TAGTAACAAC CATTCTTCTT CTCTTATCTC TACCAGTACT ACTAGTTAAG ATCATTGTTG GTAAGAAGAA GAGAATAGAG ATGGTCATGA 6051 CGCAGCCGGT ATTACAATAT TATTAACTGA CCGAAATCTA AACACAACAT GCGTCGGCCA TAATGTTATA ATAATTGACT GGCTTTAGAT TTGTGTTGTA 6101 TTTTTGATCC AGCCGGTGGA GGAGATCCAA TTCTTTATCA GCACTTATTT P,~3AAAC TAGG TCGGCCACCT CCTCTAGGTT AAGA.A~TAGT CGTGAATAAA 6151 TGATTTTTTG GCCATCCAGA AGTTTATATT TTAATTCTTC CTGGCTTTGG ACT C CGGTAGGTCT TCAAATATAA AATTAAGAAG GACCGA.AACC 6201 AATAATTTCT CATGTAGTAG CTTATTATTC TGGT GAACCATTTG TTATTAAAGA GTACATCATC GAATAATAAG ACCATTTTTT CTTGGTAAAC 6251 GCTATATAGG TATAGTTTGA GCAATAATAG CAATTGGACT ATTAGGTTTC CGATATATCC ATATCAAACT CGTTATTATC GTTAACCTGA TAATC CA.AAG 6301 ATCGTCTGAG CCCACCACAT ATTTACAGTA GGTATGGATG TTGATACACG TAGCAGACTC GGGTGGTGTA TA.AATGTCAT CCATACCTAC AACTATGTGC 6351 AGCCTATTTT ACTTCAGCAA CAATAATTAT TGCTATTCCT ACAGGTGTAA TCGGATp~AAA TGAAGTCGTT GTTATTAATA ACGATAAGGA TGTCCACATT 6401 AAGTATTTAG CTGATTAGCA ACCCTTCACG GAGGTTCCAT TAAATGAGAA TTCATAAATC GACTAATCGT TGGGAAGTGC CTCCAAGGTA ATTTACTCTT 6451 ACCCCATTAC TATGAGCCCT TGGGTTTATC TTCTTATTTA CATTAGGAGG TGGGGTAATG ATACTCGGGA ACCCAAATAG AAGAATAAAT GTAATCCTCC 381

6501 ATTAACAGGC ATCGTCTTAG CTAATTCCTC CTTAGATATT GTTCTTCATG TAATTGTCCG TAGCAGAATC GATTAAGGAG GAATCTATAA CAAGAAGTAC 6551 ACACCTACTA TGTAGTAGCT CACTTCCATT ATGTCCTTTC AATAGGAGCG TGTGGATGAT ACATCATCGA GTGAAGGTAA TACAGGAAAG TTATCCTCGC 6601 GTATTCGCCA TTATAGCAGG CTTTATCCAC TGATTTCCTC TTATCTCTGG CATAAGCGGT AATATCGTCC GAAATAGGTG ACTA.AAGGAG AATAGAGACC 6651 CTATACCCTC CATTCAACAT GAACP~~,~A.AT ACAATTTGCA GTAATATTTA GATATGGGAG GTAAGTTGTA CTTGTTTTTA TGTTAAACGT CATTATAAAT 6701 TTGGAGTA.AA CTTAACATTC TTCCCACAAC ATTTCCTAGG CCTTGCCGGT AACCTCATTT GAATTGTAAG AAGGGTGTTG TAAAGGATCC GGAACGGCCA 6751 ATACCACGAC GTTATTCAGA TTACCCAGAC GCATATACTC TATGAAATAT TATGGTGCTG CAATAAGTCT AATGGGTCTG CGTATATGAG ATACTTTATA 6801 GGTCTCTTCT ATCGGTTCTT TAATTTCACT TGTAGCAGTA ATTATACTTT CCAGAGAAGA TAGCCAAGAA ATTAAAGTGA ACATCGTCAT TAATATGAAA 6851 TATTTATTAT CTGAGAAGCA TTTGCCTCAA AACGAGAAGT ATTATCCGTT ATAAATAATA GACTCTTCGT AAACGGAGTT TTGCTCTTCA TAATAGGCAA 6901 GAATTACCTC ACACAAATGT TGAATGATTA CACGGTTGTC CCCCACCTTA CTTAATGGAG TGTGTTTACA ACTTACTAAT GTGCCAACAG GGGGTGGAAT 6951 TCACACATAC GAAGAACCAG CATTTGTTCA AGTTCAACGA ACTTTTTAAA AGTGTGTATG CTTCTTGGTC GTA.AACAAGT TCAAGTTGCT TGP~~AAATTT 7001 CCAAGAAAGG AAGGAATCGA ACCCCCATAT GTTAGTTTCA AGCCAACCAC GGTTCTTTCC TTCCTTAGCT TGGGGGTATA CAATCAAAGT TCGGTTGGTG 7051 ATTACCACTC TGCCACTTTC TTTATTAAGA TTCTAGTAAA ATATATTACA TAATGGTGAG ACGGTGAA.AG A.AATAATTCT AAGATCATTT TATATAATGT 7101 CTGCCTCGTC AAGGCA~AAAC TGTGAGTTTA AATCCCACGA ATCTTAATCT GACGGAGCAG TTCCGTTTTG ACACTCAAAT TTAGGGTGCT TAGAATTAGA 7151 ATAATGGCAC ACCCATCACA ATTAGGATTT CAAGACGCAG CCTCCCCAGT TATTACCGTG TGGGTAGTGT TAATCCTAAA GTTCTGCGTC GGAGGGGTCA 7201 TATAGAAGAA CTTATTCATT TTCACGACCA CACATTAATA ATTGTGTTTC ATATCTTCTT GAATAAGTAA AAGTGCTGGT GTGTAATTAT TAACACAA.AG 7251 TGATTAGCAC CCTAGTCCTT TATATTATCA CAGCAATAGT ATCAACAAAA ACTAATCGTG GGATCAGGAA ATATAATAGT GTCGTTATCA TAGTTGTTTT 7301 CTTACAAACA AATATATTCT CGATTCCCAA GAAATTGA~A.A TTGTCTGAAC GAATGTTTGT TTATATAAGA GCTAAGGGTT CTTTAACTTT AACAGACTTG 7351 TATTCTCCCC GCCATTATCC 
TCATCATAAT TGCTTTACCA TCCCTACGAA ATAAGAGGGG CGGTAATAGG AGTAGTATTA ACGAAATGGT AGGGATGCTT 7401 TTTTATATCT TATAGACGAA ATTAATGATC CTCACCTGAC CATTAAAGCT ~TATAGA ATATCTGCTT TAATTACTAG GAGTGGACTG GTAATTTCGA 7451 ATAGGCCATC AATGATACTG AAGTTATGAA TATACAGATT ATGAGGACTT TATCCGGTAG TTACTATGAC TTCAATACTT ATATGTCTAA TACTCCTGAA 7501 AGGATTTGAC TCTTATATAA TTCAAACCCA AGATTTAACA CCAGGCCAAT TCCTAAACTG AGAATATATT AAGTTTGGGT TCTAAATTGT GGTCCGGTTA 7551 TTCGTTTATT AGA.AACAGAT CATCGAATAG TTGTACCCAT AGAATCACCT AAGCAAATAA TCTTTGTCTA GTAGCTTATC AACATGGGTA TCTTAGTGGA 7601 ATTCGTGTTT TAGTATCTGC AGAAGACGTT TTACATTCAT GAGCTGTCCC TAAGCACA.AA, ATCATAGACG TCTTCTGCAA AATGTAAGTA CTCGACAGGG 7651 AGCCTTAGGA ATTA.AA.ATAG ATGCCGTACC AGGACGCCTA AACCAAACTG TCGGAATCCT TAATTTTATC TACGGCATGG TCCTGCGGAT TTGGTTTGAC 7701 CTTTTATCAC TTCCCGACCA GGCGTCTATT ATGGTCAATG TTCAGAGATT GA►A~IATAGTG AAGGGCTGGT CCGCAGATAA TACCAGTTAC AAGTCTCTAA 7751 TGTGGCGCTA ATCACAGCTT TATACCCATT GTAGTAGAAG CAGTTCCCTT ACACCGCGAT TAGTGTCGAA ATATGGGTAA CATCATCTTC GTCAAGGGAA 7801 AGAATACTTC GAAGCCTGAT CTTCATTAAT ATTAGAAGAA GCCTCATTAA TCTTATGAAG CTTCGGACTA GAAGTAATTA TAATCTTCTT CGGAGTAATT 7851 GAAGCTAAAT TGGGCCTAGC ATTAGCCTTT TAAGCT~ ATTGGTGATT 382

CTTCGATTTA ACCCGGATCG TAATCGGAAA ATTCGATTTT TAACCACTAA 7901 CCCTACCACC CTTAGTGATA TGCCTCAATT AA.ATC C C CAC CCCTGATTTA GGGATGGTGG GAATCACTAT ACGGAGTTAA TTTAGGGGTG GGGACTAAAT 7951 TTTTCCTTTT ATTTTCATGA ATTATTTTTC TTACTATTTT ACCT ' AAA.AGGAAAA TAAAAGTACT TAAT~~3AAAG AATGATAAAA TGGATTTTTT 8001 GTAATAAACC ATATATTTAA TAATAACCCA ACATTP►~~A.AA GTACTGAAAA CATTATTTGG TATATAAATT ATTATTGGGT TGTAATTTTT CATGACTTTT 8051 ACCTAAGCCA GAGCCCTGAA ATTGACCATG ATCATAAGCT TTTTTGACCA TGGATTCGGT CTCGGGACTT TAACTGGTAC TAGTATTCGA AAAAACTGGT 8101 ATTCCTAAGC CCTTCTCTCC TCGGAATCCC ATTAATTGCT TTAGCAATTA TAAGGATTCG GGAAGAGAGG AGCCTTAGGG TAATTAACGA AATCGTTAAT 8151 TGCTGCCATG ATTAACCTTT CCAACCCCAA CTAACCGGTG GTTAAATAAT ACGACGGTAC TAATTGGAA.A GGTTGGGGTT GATTGGCCAC CAATTTATTA 8201 CGATTAATAA CCCTCCAAAG TTGATTTATT AATCGATTTA TTTATCAACT GCTAATTATT GGGAGGTTTC AACTAAATAA TTAGCTAAAT AAATAGTTGA 8251 CATACA.AC C C ATCAATTTTG CCGGCCATAA ATGAGCTATA TTATTTACAG GTATGTTGGG TAGTTP~~AAC GGCCGGTATT TACTCGATAT AATA.AATGTC 8301 CACTTATACT ATTCCTAATC ACCAGTAATC TCCTGGGACT TCTCCCCTAC GTGAATATGA TAAGGATTAG TGGTCATTAG AGGACCCTGA AGAGGGGATG 8351 ACCTTCACAC CCACAACTCA ACTATCCCTT AATATAGCAT TCGCCCTACC TGGAAGTGTG GGTGTTGAGT TGATAGGGAA TTATATCGTA AGCGGGATGG 8401 TTTATGATTC ATAACCGTAT TAATTGGAAT ACTTAATCAA CCAACA.ATTG AAATACTAAG TATTGGCATA ATTAACCTTA TGAATTAGTT GGTTGTTAAC 8451 CACTAGGCCA TTTCCTGCCA GAAGGTACAC CTACCCCTCT AGTTCCTATC. 
GTGATCCGGT A.AAGGACGGT CTTCCATGTG GATGGGGAGA TCAAGGATAG 8501 CTAATTATTA TCGAAACTAT TAGTTTATTT ATTCGACCAT TAGCATTGGG GATTAATAAT AGCTTTGATA ATCAAATA.AA TAAGCTGGTA ATCGTAACCC 8551 GGTTCGACTA ACTGCTAATT TAACAGCTGG TCATCTATTA ATACAATTAA CCAAGCTGAT TGACGATTAA ATTGTCGACC AGTAGATAAT TATGTTAATT 8601 TCGCAACCGC AACTTTCGTC CTTATTACCA TCATACCAAC TGTGGCATTA AGCGTTGGCG TTGAAAGCAG GAATAATGGT AGTATGGTTG ACACCGTAAT 8651 ATAACATCAA TTATCCTATT TTTATTAACA ATTCTGGAAG TAGCTGTAGC TATTGTAGTT AATAGGATAA AAATAATTGT TAAGACCTTC ATCGACATCG 8701 AATAATCCAG GCGTATGTAT TCGTACTTTT ATTAAGCCTA TACTTACAAG TTATTAGGTC CGCATACATA AGCATGAA.AA TAATTCGGAT ATGAATGTTC 8751 AAAATGTTTA ATGGCTCACC AAGCACACGC ATATCACATA GTTGACCCTA TTTTAC.AAAT TACCGAGTGG TTCGTGTGCG TATAGTGTAT CAACTGGGAT 8801 GCC,CATGACC ATTAACCGGA GCTACGGCCG CCCTTCTAAT AACATCCGGG CGGGTACTGG TAATTGGCCT CGATGCCGGC GGGAAGATTA TTGTAGGCCC 8851 TTGGCCATCT GATTTCATTT CCACTCATTA CTTCTTCTTT ACTTAGGATT AACCGGTAGA CTAAAGTAAA GGTGAGTAAT GAAGAAGAAA TGAATCCTAA 8901 AACTCTTTTA CTACTAACTA TAATTCAATG ATGACGTGAT ATTATTCGAG TTGAGAAAAT GATGATTGAT ATTAAGTTAC TACTGCACTA TAATAAGCTC 8951 AAGGAACATT CCAAGGTCAT CACACACCCC CCGTCCA~AAA AGGACTTCGC TTCCTTGTAA GGTTCCAGTA GTGTGTGGGG GGCAGGTTTT TCCTGAAGCG 9001 TATGGAATAA TCTTATTCAT TACATCAGAA GTATTTTTCT TCTTAGGTTT ATACCTTATT AGAATAAGTA ATGTAGTCTT CATA►~~AAAGA AGAATCCAAA 9051 TTTCTGAGCC TTTTACCATT CAAGTCTCGC CCCAACCCCA GAACTAGGAG AAAGAC TC GG AAAATGGTAA GTTCAGAGCG GGGTTGGGGT CTTGATCCTC 9101 GATGTTGGCC ACCTACAGGA ATTAATCCAT TAGATCCCTT TGAAGTCCCA CTACAACCGG TGGATGTCCT TA.ATTAGGTA ATCTAGGGAA ACTTCAGGGT 9151 CTTCTAA.ATA CCGCAGTACT TTTAGCTTCT GGTGTTACAG TAACCTGAAC GAAGATTTAT GGCGTCATGA AAATCGAAGA CCACAATGTC ATTGGACTTG 9201 CCACCACAGT TTAATAGAAG GGAATCGAAA AGAAGCTATC CAAGCCCTCG GGTGGTGTCA AATTATCTTC CCTTAGCTTT TCTTCGATAG GTTCGGGAGC 383

9251 CCCTTACCAT TATTTTAGGA TTTTACTTCA CAGCTCTTCA AGCTATAGAA GGGAATGGTA ATAAAATCCT AAAATGAAGT GTCGAGAAGT TCGATATCTT 9301 TATTATGAAG CACCCTTCAC AATTGCCGAT GGTGTTTACG GAACAACATT ATAATACTTC GTGGGAAGTG TTAACGGCTA CCACAAATGC CTTGTTGTAA 9351 TTTTGTTGCC ACAGGATTCC ATGGCCTCCA CGTTATTATT GGTTCAACAT A~A,AACAAC GG TGTCCTAAGG TACCGGAGGT GCAATAATAA CCAAGTTGTA 9401 TTTTAGCAGT TTGCTTGTTA CGACAAATTC AATACCATTT TACATCAGAA AAAATCGTCA AACGAACAAT GCTGTTTAAG TTATGGTA.AA ATGTAGTCTT 9451 CATCACTTCG GTTTCGAAGC TGCCGCATGA TACTGGCACT TTGTAGACGT GTAGTGAAGC CAA.AGCTTCG ACGGCGTACT ATGACCGTGA AACATCTGCA 9501 AGTATGATTA TTCCTTTATG TATCTATCTA TTGATGAGGC TCATAATTAC TCATACTAAT AAGGAAATAC ATAGATAGAT AACTACTCCG AGTATTAATG 9551 TTTTCTAGTA TAAACTAGTA CAAATGATTT CCAATCATTT GATCTTGGTT AA.AAGATCAT ATTTGATCAT GTTTACTAAA GGTTAGTAAA CTAGAACCAA 9601 AGAATCCAAG GAA.AAGTAAT GAACCTCATC ACGTCTTCTA TCGCAGCTAC TCTTAGGTTC CTTTTCATTA CTTGGAGTAG TGCAGAAGAT AGCGTCGATG 9651 GGCCCTGATT TCCCTAATCC TTGTATTAAT TGCTTTTTGA CTTCCATTAT CCGGGACTAA AGGGATTAGG AACATAATTA AC GP.►AAAAC T GAAGGTAATA 9701 TAAATCCAGA TAATGP.,~~AAA CTATCCCCAT ATGAATGTGG CTTTGATCCT ATTTAGGTCT ATTACTTTTT GATAGGGGTA TACTTACACC GAAACTAGGA 9751 CTAGGAAGTG CACGCCTCCC ATTTTCCCTT CGCTTTTTCC TTGTAGCTAT GATCCTTCAC GTGCGGAGGG TAAAAGGGAA GC GP~~AAAGG AACATCGATA 9801 TTTATTTTTA TTGTTTGATC TAGAAATTGC CCTTCTTCTT CCTTTACCAT AAATP~~P~-SAT AACAAACTAG ATCTTTAACG GGAAGAAGAA GGAAATGGTA 9851 GAGGCGACCA ACTACCATCA CCACTCTCCA CATTAATTTG AGCAACAATT CTCCGCTGGT TGATGGTAGT GGTGAGAGGT GTAATTAAAC TCGTTGTTAA 9901 ATCCTTCTTC TATTAACTCT AGGCCTAATT TATGAATGAC TTCAAGGAGG TAGGAAGAAG ATAATTGAGA TCCGGATTAA ATACTTACTG AAGTTCCTCC 9951 ACTAGAATGA GCAGAATAGA TATTTAGTCC A.AATAA.AGAC CACTAATTTC TGATCTTACT CGTCTTATCT ATAAATCAGG TTTATTTCTG GTGATTAA.AG 10001 GACTTAGTAA ATTATGGTGA A.AATCCATAA ATATCTTATG TCTCCCATAC CTGAATCATT TAATACCACT TTTAGGTATT TATAGAATAC AGAGGGTATG 10051 ATTTTAGTCT TTACTCAGCA TTCTTCTTGG GTCTCACGGG CCTTGCACTT TP~AAATCAGA AATGAGTCGT AAGAAGAACC CAGAGTGCCC GGAACGTGAA 10101 AATCGTTACC 
ACCTTCTATC TGCGCTTCTA TGTTTAGAAA GTATACTACT TTAGCAATGG TGGAAGATAG ACGCGAAGAT ACAAATCTTT CATATGATGA 10151 AACTTTATTT ATTAGTATTG CTATCTGAAC CCTAACATTA AATTCCACCT TTGAAATAAA TAATCATAAC GATAGACTTG GGATTGTAAT TTAAGGTGGA 10201 CATGTTCTAT TATTCCTATA ATTCTCCTTA CATTTTCAGC CTGTGAAGCT GTACAAGATA ATAAGGATAT TAAGAGGAAT GTAAAAGTCG GACACTTCGA 10251 AGCGCAGGCC TAGCCATTCT AGTAGCTACC TCACGCTCCC ACGGCTCTGA TCGCGTCCGG ATCGGTAAGA TCATCGATGG AGTGCGAGGG TGCCGAGACT 10301 CAACCTGCAA AATCTAAATC TTCTACAATG C TAAAA.ATTC TAATTCCAAC GTTGGACGTT TTAGATTTAG AAGATGTTAC GATTTTTAAG ATTAAGGTTG 10351 AATTATACTC TTCCCAACTA CATGAATTAT TAATAAA.A.AA TGATTATGAT TTAATATGAG AAGGGTTGAT GTACTTAATA ATTATTTTTT ACTAATACTA 10401 CCACAACCAC TACTTACAGC TTCCTAATTG CGTTACTAAG CTTACTTCTA GGTGTTGGTG ATGAATGTCG AAGGATTA.AC GCAATGATTC GAATGAAGAT 10451 TTTAAATGAA ATATAGATAT TGGCTGAGAT TTTTCTAACC AATTTATAGC AAATTTACTT TATATCTATA ACCGACTCTA A.AA.AGATTGG TTAA.ATAT C G 10501 CATTGATCCT TTATCAACCC CTCTACTAGT TCTTACATGT TGACTTCTTC GTAACTAGGA AATAGTTGGG GAGATGATCA AGAATGTACA ACTGAAGAAG 10551 CATTAATAAT TTTAGCTAGT CAGAACCACA TCTCTCCAGA ACCAATTATT GTAATTATTA AAATCGATCA GTCTTGGTGT AGAGAGGTCT TGGTTAATAA 10601 CGACAACGAA CATACATCAC ACTTTTAATT TCCCTCCAAA CTTTCCTTAT 384

GCTGTTGCTT GTATGTAGTG TGAA.AATTA.A AGGGAGGTTT GA.AAGGAATA 10651 TATAGCATTC TCTGCAACCG AAATAATTAT ATTTTACATT ATATTTGAAG ATATCGTAAG AGACGTTGGC TTTATTAATA TAA.AATGTAA TATAAACTTC 10701 CCACACTTAT CCCGACTCTT ATTATTATCA CACGATGAGG AAATCAAACA GGTGTGAATA GGGCTGAGAA TAATAATAGT GTGCTACTCC TTTAGTTTGT 10751 GAACGCCTAA ATGCAGGTAC CTATTTTTTA TTTTATACCT TAATCGGCTC CTTGCGGATT TACGTCCATG GAT T AAAATATGGA ATTAGCCGAG 10801 ACTTCCCCTT CTCATTGCCC TTTTACTAAT ACAAAATAAT TTAGGTACTT TGAAGGGGAA GAGTAACGGG AAAATGATTA TGTTTTATTA AATCCATGAA 10851 TATCCATGAT CATTATACAG CACTCACAAT TTCCAAACTT ACTCTCATGA ATAGGTACTA GTAATATGTC GTGAGTGTTA AAGGTTTGAA TGAGAGTACT 10901 GCAGATA.AAT TATGATGAAT AGCCTGTCTT ATAGCTTTCC TTGTCP~A.AAT CGTCTATTTA ATACTACTTA TCGGACAGAA TATC GA.AAGG AACAGTTTTA 10951 ACCTTTATAT GGAATCCATC TCTGACTTCC CAAAGCCCAT GTTGAAGCCC TGGAAATATA CCTTAGGTAG AGACTGAAGG GTTTCGGGTA CAACTTCGGG 11001 CAATTGCCGG TTCAATAATC CTAGCAGCAG TATTACTTAA ATTAGGAGGT GTTAACGGCC AAGTTATTAG GATCGTCGTC ATAATGAATT TAATCCTCCA 11051 TATGGAATAA TACGAATTAT TATTATATTA AATCCATTAA C CA.AAGAA.AT ATACCTTATT ATGCTTAATA ATAATATAAT TTAGGTAATT GGTTTCTTTA 11101 AGCCTACCCA TTCTTAATCT TAGCCATTTG AGGCATTATT ATAACTAGCT TCGGATGGGT AAGAATTAGA ATCGGTAAAC TCCGTAATAA TATTGATCGA 11151 CTATCTGTCT ACGACAA.ACA GACCTAAAAT CCCTAATTGC CTACTCATCA GATAGACAGA TGCTGTTTGT CTGGATTTTA GGGATTAACG GATGAGTAGT 11201 GTAAGCCATA TAGGATTAGT TGCTGCAGCA ATTCTTATTC AAACGCCATG CATTCGGTAT ATCCTAATCA ACGACGTCGT TAAGAATAAG TTTGCGGTAC 112 51 AAGTTTCGCA GGAGCAACCG CACTCATAAT TGCCCATGGC TTAATCTCAT TTCAAAGCGT CCTCGTTGGC GTGAGTATTA ACGGGTACCG AATTAGAGTA 11301 CAGCCTTATT CTGTTTAGCT AATACTAACT ATGAACGAAT TCATAGCCGA GTCGGAATAA GACA.AATC GA TTATGATTGA TACTTGCTTA AGTATCGGCT 11351 ACCATACTCC TAGCCCGAGG TATACAAATC ATTTTTCCAC TAATAGCAAC TGGTATGAGG ATCGGGCTCC ATATGTTTAG TP~~AAAGGTG ATTATCGTTG 11401 CTGATGATTC CTCACCAGTC TAGCTAACCT AGCCCTACCC CCATCCCCTA GACTACTAAG GAGTGGTCAG ATCGATTGGA TCGGGATGGG GGTAGGGGAT 11451 ACCTTATAGG AGAAC TTC TT ATTATTACCT CTTTATTCAA TTGATCCAAC TGGAATATCC 
TCTTGAAGAA TAATAATGGA GAAATAAGTT AACTAGGTTG 11501 TGAACTATAA TCTTATCAGG GTTTGGAGTA TTA.A.TTACAG CCTCCTATTC ACTTGATATT AGAATAGTCC CAAACCTCAT AATTAATGTC GGAGGATAAG 11551 ACTCTACATA TTCTTAATAA CCCAACGGGG TCCAACTCCC CACCACATCT TGAGATGTAT AAGAATTATT GGGTTGCCCC AGGTTGAGGG GTGGTGTAGA 11601 TATCATTA.AA CCCAAACTAC AC AC GAGA.AC ACCTTCTCTT AA.AC C TTCAC ATAGTAATTT GGGTTTGATG TGTGCTCTTG TGGAAGAGAA TTTGGAAGTG 11651 CTCATACCCG TTCTCTTACT AATATTTAAA CCAGAACTCA TTTGAGGGTG GAGTATGGGC AAGAGAATGA TTATAAATTT GGTCTTGAGT A.AACTCCCAC 11701 GACACTTTGT ATTTATAGTT TAATCA.AAAC ATTAGATTGT GGTTC TAA.AA CTGTGAAACA TAAATATCAA ATTAGTTTTG TAATCTAACA CCAAGATTTT 11751 ATAAAAGTTA A.AACCTTTTT AATTACCGAG AGAGGTCAGG GACACGAAGA TATTTTCAAT TTTGGP~~A.AA TTAATGGCTC TCTCCAGTCC CTGTGCTTCT 11801 ATTGCTAACT CTTCCTATCA TGGTTCAAAT CCATGACTCA CTCAGCTTCT TAACGATTGA GAAGGATAGT ACCAAGTTTA GGTACTGAGT GAGTCGAAGA 11851 GAAAGATAAT AGTAATCTAT TGGTCTTAGG AACCAAAAAT TCTTGGTGCA CTTTCTATTA TCATTAGATA ACCAGAATCC TTGGTTTTTA AGAACCACGT 11901 ACTCCAAGCA AAAGCCATGA ATACTATTTT TAATTCATCA TTTCTTCTAA TGAGGTTCGT TTTCGGTACT TATGATAA.AA ATTAAGTAGT AAAGAAGATT 11951 TTTTTATAAT CCTTACCTTT CCATTAATAA CCTCATTAAA TAC CAAAA.AA P.►~~AAATATTA GGAATGGAAA GGTAATTATT GGAGTAATTT ATGGTTTTTT 385

12001 CTTAACCACG ATTGATCATC ATCCCATGTA AAAATAGCCG TP~~AAATTTC GAATTGGTGC TAACTAGTAG TAGGGTACAT TTTTATCGGC ATTTTTAAAG 12051 CTTCTTTATT AGCCTAATCC CCTTATTTAT CTTTTTAGAT CAAGGCCTAG GAAGAAATAA TCGGATTAGG GGAATA.AATA GP~~A.AATC TA GTTCCGGATC 12101 AATCAATTAT AACCAACTAT AGCTGAATAA ATATTGGCCC CTTCGATATT TTAGTTAATA TTGGTTGATA TCGACTTATT TATAACCGGG GAAGCTATAA 12151 AATATAAGCT TTAAATTTGA CATATACTCA ATTATATTTA CCCCTGTAGC TTATATTCGA AATTTAAACT GTATATGAGT TAATATAAAT GGGGACATCG 12201 TTTATATGTT ACTTGATCCA TCCTCGAATT TGCTTTATGA TATATACACT AAATATACAA TGAACTAGGT AGGAGCTTAA ACGAAATACT ATATATGTGA 12251 CTGATCCTAA TATTAATCGC TTTTTTA.AAT ATTTATTACT CTTTTTAATC GACTAGGATT ATAATTAGCG TTTA TAAATAATGA GP.~~AAATTAG 12301 TCAATAATTA TTCTAGTGAC AGCCAACAAC ATATTTCAAT TATTTATTGG AGTTATTAAT AAGATCACTG TCGGTTGTTG TATAAAGTTA ATA.AATAAC C 12351 ATGGGAAGGG GTAGGAATTA TATCTTTCCT CTTAATTGGC TGATGATATA TACCCTTCCC CATCCTTAAT ATAGA.AAGGA GAATTAACCG ACTACTATAT 12401 GCCGGACAGA CGCTAATACC GCTGCTCTTC AAGCTGTAAT TTATAACCGA CGGCCTGTCT GCGATTATGG CGACGAGAAG TTCGACATTA AATATTGGCT 12451 ATAGGAGATA TCGGACTAAT TCTTAGCATA GCCTGATTAG CCATA.AATTT TATCCTCTAT AGCCTGATTA AGAATCGTAT CGGACTAATC GGTATTTAAA 12501 AAATTCATGA GAAATTCAAC AACTATTTAT CTTATCTGAA AATGTAAACC TTTAAGTACT CTTTAAGTTG TTGATAAATA GAATAGACTT TTACATTTGG 12551 TAACATTACC CCTTTTAGGT CTTGTCCTAG CCGCAGCTGG AAA.ATC C GCA ATTGTAATGG GGA~AAATCCA GAACAGGATC GGCGTCGACC TTTTAGGCGT 12601 CAATTCGGCC TTCACCCTTG ACTTCCCTCT GCCATAGAAG GACCAACACC GTTAAGCCGG AAGTGGGAAC TGAAGGGAGA CGGTATCTTC CTGGTTGTGG 12651 AGTCTCCGCC CTACTCCACT CTAGCACAAT AGTTGTTGCC GGCATTTTTC TCAGAGGCGG GATGAGGTGA GATCGTGTTA TCAACAACGG C C GTAA.AAAG 12701 TACTAATCCG CCTCCACCCC TTAATTCAAA ATAATCAATT AATCTTAACA ATGATTAGGC GGAGGTGGGG AATTAAGTTT TATTAGTTAA TTAGAATTGT 12751 ACATGCTTGT GTTTAGGAGC ACTAACTACC CTTTTTACTG CAGCATGCGC TGTACGAACA CAAATCCTCG TGATTGATGG GP~~A.AAT GAC GTCGTACGCG 12801 ACTCACCCAA AACGATATTA TTAT TGCCTTTTCA ACATCAAGTC TGAGTGGGTT TTGCTATAAT TTTTTTAATA ACGGAAAAGT TGTAGTTCAG 12851 AACTAGGATT 
AATAATAGTA ACAATTGGCC TTAACCAACC TCAACTTGCT TTGATCCTAA TTATTATCAT TGTTAACCGG AATTGGTTGG AGTTGAACGA 12901 TTTCTCCATA TCTGTACTCA TGCCTTTTTC A.AAGC C ATAC TTTTTCTCTG AAAGAGGTAT AGACATGAGT ACGGAAAAAG TTTCGGTATG AAA,AAGAGAC 12951 TTCAGGATCT ATTATCCATA GTCTTAATGA TGAACAAGAT ATC C GCA~AAA AAGTCCTAGA TAATAGGTAT CAGAATTACT ACTTGTTCTA TAGGCGTTTT 13001 TAGGGGGACT CCATAAACTT CTACCATTTA CCTCATCTTC CTTAACTATT ATCCCCCTGA GGTATTTGAA GATGGTAAAT GGAGTAGAAG GAATTGATAA 13051 GGAAGCTTAG CCCTTACAGG TATACCTTTT CTATCAGGCT TCTTCTCAAA CCTTCGAATC GGGAATGTCC ATATGGAAAA GATAGTCCGA AGAAGAGTTT 13101 AGACGCCATT ATTGAATCCA TA.AACACTTC T TAC C TAA.AC GCCTGAGCCC TCTGCGGTAA TAACTTAGGT ATTTGTGAAG AATGGATTTG CGGACTCGGG 13151 TAGTTCTTAC CCTTATCGCA ACATCATTTA CAGCTATTTA TAGCTTACGC ATCAAGAATG GGAATAGCGT TGTAGTAAAT GTC GATA.AAT ATCGAATGCG 13201 CTTGTGTATT TTGCATCAAT AAATTTCCCA CGATTTAATT CATTCTCCCC GAACACATAA AACGTAGTTA TTTAAAGGGT GC TA.AATTAA GTAAGAGGGG 13251 TATTAACGAG AATCATCCAA CAATAATTAA TCCAATTAAA CGCTTAGCCT ATAATTGCTC TTAGTAGGTT GTTATTAATT AGGTTAATTT GCGAATCGGA 13301 ATGGAAGTAT TTTGGCCGGC CTCATCATTA CATCAAACTT ATCCCCAACA TACCTTCATA AA.ACCGGCCG GAGTAGTAAT GTAGTTTGAA TAGGGGTTGT 13351 A,,AAACCCAA.A TCATAACAAT ACCCCCTCTA C TAA.AAC TTT CCGCCCTTTT 386

TTTTGGGTTT AGTATTGTTA TGGGGGAGAT GATTTTGA.AA GGC GGGAA.AA 13401 AGTGACAATC ATTGGCCTTC TATTAGCCTT AGAATTAACC AACCTAACTA TCACTGTTAG TAACCGGAAG ATAATCGGAA TCTTAATTGG TTGGATTGAT 13451 ATACCCAACT TAAA~TCAAC CCTACTCTTT ATACCCACCA TTTTTCTAAT TATGGGTTGA ATTTTAGTTG GGATGAGAA.A TATGGGTGGT P~~AAAGATTA 13501 ATGCTTGGAT ATTTTCCACA AATCATTCAT CGCCTATTAC CP~~;AAATTAA TACGAACCTA TA~AAAGGTGT TTAGTAAGTA GCGGATAATG GTTTTTAATT 13551 CTTAAGCTGA GCACAACACG TCTCAACACA TCTGATTGAT CAAACATGAA GAATTCGACT CGTGTTGTGC AGAGTTGTGT AGACTAACTA GTTTGTACTT 13601 ATG T TGGACCAAAA AGCAACCTTA TCCAACAAAC TCCACTAATT TAGTTTTTTA ACCTGGTTTT TCGTTGGAAT AGGTTGTTTG AGGTGATTAA 13651 AAATTATCCA CCCAACCACA ACAAGGCTAT ATCAAAATTT ATCTTATACT TTTAATAGGT GGGTTGGTGT TGTTCCGATA TAGTTTTAAA TAGAATATGA 13701 ACTCTTTCTC ACACTAACCT TAGCCCTACT AACTTCATTA ACCTAATTAC TGAGAAAGAG TGTGATTGGA ATCGGGATGA TTGAAGTAAT TGGATTAATG 13751 AC GCAA.AGTT CCCCAAGATA ATCCTCGAGT TAATTCCAGT AC CACAA.ACA TGCGTTTCAA GGGGTTCTAT TAGGAGCTCA ATTAAGGTCA TGGTGTTTGT 13801 AAGTCAATAA TAACATTCAC CCACTCAAAA CTAATATTCA ACCACCATCA TTCAGTTATT ATTGTAAGTG GGTGAGTTTT GATTATAAGT TGGTGGTAGT 13851 GCATATAATA AAGCTACCCC TATAAGATCC CCACGAACCA TCTCCATACT CGTATATTAT TTCGATGGGG ATATTCTAGG GGTGCTTGGT AGAGGTATGA 13901 ACTTATCTCC TCTACTCCTA CCCAACTTAA TTCAAATCAC TCAACCATAA TGAATAGAGG AGATGAGGAT GGGTTGAATT AAGTTTAGTG AGTTGGTATT 13951 AATATTTACC AATAAATACT AAAATTACTA AATA.AAATCC AATGTACAAT TTATAAATGG TTATTTATGA TTTTAATGAT TTATTTTAGG TTACATGTTA 14001 AATACAGACC AATTACCCCA TGATTCAGGA TAAGGCTCAG CAGCAAGCGC TTATGTCTGG TTAATGGGGT ACTAAGTCCT ATTCCGAGTC GTCGTTCGCG 14051 TGCCGTATAA GCAAATACTA CCAACATCCC C C C TA.AATAA ATCP~~AAI~CA ACGGCATATT CGTTTATGAT GGTTGTAGGG GGGATTTATT TAGTTTTTGT 14101 AAACTAATGA TAA~~AAAGAA CCCCCATGAC CCACTAATAA CCCACACCCA TTTGATTACT ATTTTTTCTT GGGGGTACTG GGTGATTATT GGGTGTGGGT 14151 ACCCCAGCAG CCACAACTAA CCCCAACGCA GCATAATAAG GAGAAGGATT TGGGGTCGTC GGTGTTGATT GGGGTTGCGT CGTATTATTC CTCTTCCTAA 14201 AGATGCCACC CCTATTAAAC CTAAAATTAA ACAGATTATT ATTAAAAACA TCTACGGTGG 
GGATAATTTG GATTTTAATT TGTCTAATAA TAGTTTTTGT 14251 TAAAATATAC CATTACTCCT ACCTGGACTT TAACCAAGAC CAATAACTTG ATTTTATATG GTAATGAGGA TGGACCTGAA ATTGGTTCTG GTTATTGAAC 14301 P►~~~AAC TATC GTTGTTCATT CAACTATAAG AATTTATGGC CCTAAATATC TTTTTGATAG CAACAAGTAA GTTGATATTC TTA.AATAC C G GGATTTATAG 14351 C GP.►~~AAAC C C ATCCATTACT TA.AA.ATTATT AACCAAACCT TAATTGATCT GCTTTTTGGG TAGGTAATGA ATTTTAATAA TTGGTTTGGA ATTAACTAGA 14401 CCCAGCCCCA TCTAATATTT CAATTTGATG AA.ACTTCGGC TCACTTCTAA GGGTCGGGGT AGATTATAAA GTTAA.AC TAC TTTGAAGCCG AGTGAAGATT 14451 GCCTATGTCT AATCATTCAA ATCCTTACAG GACTTTTCCT AGCAATACAT CGGATACAGA TTAGTAAGTT TAGGAATGTC CTGAAAAGGA TCGTTATGTA 14501 TATACCGCAG ACATCTCCAT AGCCTTCTCC TCAGTAATCC ATATTTGCCG ATATGGCGTC TGTAGAGGTA TCGGAAGAGG AGTCATTAGG TATAAACGGC 14551 CGACGTAAAT TATGGTTGAC TCATCCGTAA TATTCATGCC AACGGAGCCT GCTGCATTTA ATACCAACTG AGTAGGCATT ATAAGTACGG TTGCCTCGGA 14601 CACTTTTCTT CATTTGTGTA TACTTACATA TTGCCCGAGG ATTATATTAT GTGAAAAGAA GTAAACACAT ATGAATGTAT AACGGGCTCC TAATATAATA 14651 GGCTCCTATC TTTATAAAGA AACATGAAAT ATTGGAGTAA TCTTATTATT CCGAGGATAG AAATATTTCT TTGTACTTTA TAACCTCATT AGAATAATAA 14701 CCTATTAATA GCCACAGCCT TCGTAGGCTA TGTATTACCA TGAGGACAAA GGATAATTAT CGGTGTCGGA AGCATCCGAT ACATAATGGT ACTCCTGTTT 387

14751 TATCCTTCTG AGGTGCCACA GTTATTACCA ATCTCCTATC CGCCTTCCCC ATAGGAAGAC TCCACGGTGT CAATAATGGT TAGAGGATAG GCGGAAGGGG 14801 TATATCGGAA ACATACTAGT CCAATGAATC TGAGGCGGTT TTTCAGTAGA ATATAGCCTT TGTATGATCA GGTTACTTAG ACTCCGCCAA AAAGTCATCT 14851 TAATGCTACC CTGACACGAT TCTTCGCATT TCATTTCCTA CTACCTTTTC ATTACGATGG GACTGTGCTA AGAAGCGTAA AGTAAAGGAT GATGG~G 14901 TAATTTCAGC ACTAGCAATA ATTCACATTC TCTTTCTTCA TGAAACAGGT ATTAAAGTCG TGATCGTTAT TAAGTGTAAG AGAAAGAAGT ACTTTGTCCA 14951 TCAAATAACC CTATAGGACT TAATTCTGAT ATAGACAA,AA TTTCCTTCCA AGTTTATTGG GATATCCTGA ATTAAGACTA TATCTGTTTT AAAGGAAGGT 15001 CCCCTACTTC TCCTACA.AAG ACGCACTCGG CTTCTTTATT ATAATTATAA GGGGATGAAG AGGATGTTTC TGCGTGAGCC GAAGAAATAA TATTAATATT 15051 TTTTAGGAAT CTTAGCCTTA TTCCTTCCTA ACCTTCTAGG AGATGCTGAA AAAATCCTTA GAATCGGAAT AAGGAAGGAT TGGAAGATCC TCTACGACTT 15101 AACTTCATCC CCGCCAATCC TCTCGTTACC CCTCCCCATA TTAAACCCGA TTGAAGTAGG GGCGGTTAGG AGAGCAATGG GGAGGGGTAT AATTTGGGCT 15151 ATGATACTTC CTATTTGCCT ACGCCATTCT CCGATCCATC C C CAACAA.AC TACTATGAAG GATAAACGGA TGCGGTAAGA GGCTAGGTAG GGGTTGTTTG 15201 TAGGAGGAGT CCTAGCCCTC TTATTCTCCA TCTTCATTCT TATATTAGTC ATCCTCCTCA GGATCGGGAG AATAAGAGGT AGAAGTAAGA ATATAATCAG 15251 CCATTACTTC ATACCTCTAA ACAACGAAGC AGTACCTTTC GCCCACTCAC GGTAATGAAG TATGGAGATT TGTTGCTTCG TCATGGAAAG CGGGTGAGTG 15301 TCAAATTTTC TTTTGAATCC TCGTAGCCAA CACATTAATT TTAACCTGAA AGTTTAAAAG P,AAAC TTAGG AGCATCGGTT GTGTAATTAA AATTGGACTT 15351 TTGGAGGCCA ACCAGTTGAA CAACCATTTA TCCTCATTGG ACA.AATTACA AACCTCCGGT TGGTCAACTT GTTGGTAAAT AGGAGTAACC TGTTTAATGT 15401 TCTATTACTT ACTTCTCCTT ATTTCTTATT GTAATTCCAC TCACAGGCTG AGATAATGAA TGAAGAGGAA TAAAGAATAA CATTAAGGTG AGTGTCCGAC 15451 ATGAGP~AAAT A►AA.ATC C TCA ACTTAAACTA GTTTTGGTAG CTTAACTTAA TACTCTTTTA TTTTAGGAGT TGAATTTGAT CA~AAAC CATC GAATTGAATT 15501 TAAAGCATCG ATCTTGTAAA TCGAAAACCG GAGGTTTAAA TCCTCCCCAA ATTTCGTAGC TAGAACATTT AGCTTTTGGC CTCCAAATTT AGGAGGGGTT 15551 AACATATCAG GGGAAGGAGG GTTA.AAC TC C CGCCCTTGGC TCCCAAAGCC TTGTATAGTC CCCTTCCTCC CAATTTGAGG GCGGGAACCG AGGGTTTCGG 15601 
AAGATTCTGC CCAAACTGCC CCCTGCCATG TCATTAAAGC ATGP~AAACCA TTCTAAGACG GGTTTGACGG GGGACGGTAC AGTAATTTCG TACTTTTGGT 15651 AATGP~AAATT TGGTTTTCCA AAAGTAAGTC AGAGTGACAT ATTAATGACA TTACTTTTAA ACCAAA.AGGT TTTCATTCAG TCTCACTGTA TAATTACTGT 15701 TAGCCCACAT ~TCCTAATAT AGTACATTAC TTAACTCGAC TAATCAACAT ATCGGGTGTA TAGGATTATA TCATGTAATG A.ATTGAGCTG ATTAGTTGTA 15751 TAATTGATTA TTCCCTACTA CCATTACTAC TATGTATAAT CCTCATTAAT ATTAACTAAT AAGGGATGAT GGTAATGATG ATACATATTA GGAGTAATTA 15801 CTATATTCCA CTATATCATA ACATACTATG CTTAATACTC ATTAATATAC GATATAAGGT GATATAGTAT TGTATGATAC GAATTATGAG TAATTATATG 15851 TATCCACTAT TTCATAACAT TCTGTTCTTT AACCCTCATT AATCTAATAT ATAGGTGATA AAGTATTGTA AGACAAGAAA TTGGGAGTAA TTAGATTATA 15901 CAAAATTTTC ATTTCATAAA ATTTCTTTAT CCGCTCTCAA TTACCTAAGT GTTTTAAA.AG TAAAGTATTT TAAAGA.AATA GGCGAGAGTT AATGGATTCA 15951 ATTGATCATG CGGGTTGGTA AGAACATCAC ATCCCGCTAT TGTAAGAAAA TAACTAGTAC GCCCAACCAT TCTTGTAGTG TAGGGCGATA ACATTCTTTT 16001 AAATAGCTCT ATTTGTGGCG CTGTACTCGA TTTATCCCCA CCAATTGATC TTTATCGAGA TA.AACAC C GC GACATGAGCT A.AATAGGGGT GGTTAACTAG 16051 A,.AAATTGGCA TCTGATTAAT GCTTGTGCTA CTTTAATCCT TGATCGCGTC TTTTAACCGT AGACTAATTA CGAACACGAT GAA.ATTAGGA ACTAGCGCAG 16101 AAGAATGCCA GATCCCCTAG TTCCCTTTAA TGGCACTTTC GTCCTTGACT 388

TTCTTACGGT CTAGGGGATC AAGGGA.AATT ACCGTGAAAG CAGGAACTGA 16151 GCATCAAGAT TTACTGTCCT CCCAGTTTTT TTTTTTGGGG ATGAAGCAAT CGTAGTTCTA AATGACAGGA GGGTCP~~,~A,A CCCC TACTTCGTTA 16201 TACTAAGCCC GGGAGGGCTG ATCTAGGACA CTGAGATAAA CCTGAATCCT ATGATTCGGG CCCTCCCGAC TAGATCCTGT GACTCTATTT GGACTTAGGA 16251 CCTCGACATT TAC TTp~AAAT ACTCATTACT CACCATTCAT GAATTATAAT GGAGCTGTAA ATGAATTTTA TGAGTAATGA GTGGTAAGTA CTTAATATTA 16301 TGTCAAGTTG ACCATTACTG AGAGGGATAG AGA.AAC TGAC GCCATAGGCG ACAGTTCAAC TGGTAATGAC TCTCCCTATC TCTTTGACTG CGGTATCCGC 16351 ACAAGTTTCG ATTTTTTTGA TTAATGA.AAC TATGGTTTAA ~GACATT TGTTCAAAGC TAAAAAAACT AATTACTTTG ATACCAAATT TTTTCTGTAA 16401 CTCTTAACCC TCATTPLAAAC CGACAAGCGA TAAATGTGAA TGTAAAGCGC GAGAATTGGG AGTAATTTTG GCTGTTCGCT ATTTACACTT ACATTTCGCG 16451 ACTCGTGATC TTAGTACATG CTTCACTTTA CTAGGCATAG ATATATTATT TGAGCACTAG AATCATGTAC GAAGTGAAAT GATCCGTATC TATATAATAA 16501 ATTAGGTTTC CCCCTGGATT GTp~~A.AATTT TTGGGGCCGC TT TAATCCAAAG GGGGACCTAA CATTTTTAAA AACCCCGGCG AATTTTTTTT 16551 AAACATTTTT TTGGTP.,~~AAA CCCCCCTCCC CCTAATATAC ACGGATTCCT TTTGTP.~A,AAA AACCATTTTT GGGGGGAGGG GGATTATATG TGCCTAAGGA 16601 C GP.~~AAAC CC CTAAAACGAA GGCCGGACAT ATATTTTTGA ATTAGCATAC GCTTTTTGGG GATTTTGCTT CCGGCCTGTA TATP~3AAAC T TAATCGTATG 16651 GA.AATTTGTC TTGTATATAT ATAGTGTTAC ACTATGAT CTTTAAACAG AACATATATA TATCACAATG TGATACTA

tRNA 1..70
    product = tRNA-Phe
rRNA 69..1021
    product = 12S ribosomal RNA
tRNA 1022..1093
    product = tRNA-Val
rRNA 1094..2758
    product = 16S ribosomal RNA
tRNA 2759..2833
    product = tRNA-Leu
gene 2834..3808
    gene = ND1
    product = NADH dehydrogenase subunit 1
tRNA 3810..3878
    product = tRNA-Ile
tRNA 3877..3948
    product = tRNA-Gln
tRNA 3949..4017
    product = tRNA-Met
gene 4018..5061
    gene = ND2
    product = NADH dehydrogenase subunit 2
tRNA 5061..5131
    product = tRNA-Trp
tRNA complement (5133..5201)
    product = tRNA-Ala
tRNA complement (5202..5274)
    product = tRNA-Asn
tRNA complement (5307..5373)
    product = tRNA-Cys
tRNA complement (5375..5444)
    product = tRNA-Tyr
gene 5446..6999
    gene = COI
    product = cytochrome c oxidase subunit 1
tRNA complement (7002..7072)
    product = tRNA-Ser
tRNA 7077..7146
    product = tRNA-Asp
gene 7154..7844
    gene = CO2
    product = cytochrome c oxidase subunit 2
tRNA 7845..7918
    product = tRNA-Lys
gene 7920..8087
    gene = ATP8
    product = ATP synthase F0 subunit 8
gene 8078..8761
    gene = ATP6
    product = ATP synthase F0 subunit 6
gene 8761..9546
    gene = CO3
    product = cytochrome c oxidase subunit 3
tRNA 9549..9618
    product = tRNA-Gly
gene 9619..9969
    gene = ND3
    product = NADH dehydrogenase subunit 3
tRNA 9968..10037
    product = tRNA-Arg
gene 10038..10334
    gene = ND4L
    product = NADH dehydrogenase subunit 4L
gene 10328..11708
    gene = ND4
    product = NADH dehydrogenase subunit 4
tRNA 11709..11777
    product = tRNA-His
tRNA 11778..11844
    product = tRNA-Ser
tRNA 11845..11916
    product = tRNA-Leu
gene 11917..13746
    gene = ND5
    product = NADH dehydrogenase subunit 5
gene complement (13742..14263)
    gene = ND6
    product = NADH dehydrogenase subunit 6
tRNA complement (14264..14333)
    product = tRNA-Glu
gene 14336..15481
    gene = CYTB
    product = cytochrome b
tRNA 15481..15554
    product = tRNA-Thr
tRNA complement (15557..15625)
    product = tRNA-Pro
D-Loop 15628..16688
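The feature coordinates in this annotation can be used to pull individual genes out of the sequence listing. The following is a minimal sketch, not part of the original thesis. It assumes the coordinates are 1-based and inclusive (the usual GenBank feature-table convention) and that "complement" marks a feature encoded on the strand opposite the numbered one. The function names `parse_location` and `extract_feature` are hypothetical helpers introduced only for illustration.

```python
import re

# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_location(text):
    """Parse 'START..END' or 'complement (START..END)'.

    Returns (start, end, is_complement), with start and end kept as
    1-based inclusive coordinates (assumed convention).
    """
    match = re.fullmatch(
        r"\s*(complement)?\s*\(?\s*(\d+)\.\.(\d+)\s*\)?\s*", text
    )
    if match is None:
        raise ValueError(f"unrecognized location: {text!r}")
    start, end = int(match.group(2)), int(match.group(3))
    if start < 1 or start > end:
        raise ValueError(f"invalid coordinate range: {text!r}")
    return start, end, match.group(1) is not None


def extract_feature(genome, location):
    """Return the bases of one feature, read 5' to 3' on its own strand."""
    start, end, is_complement = parse_location(location)
    if end > len(genome):
        raise ValueError("feature extends past the end of the sequence")
    # Convert 1-based inclusive coordinates to a Python slice.
    segment = genome[start - 1:end]
    if is_complement:
        # Reverse-complement features annotated on the opposite strand.
        segment = segment.translate(_COMPLEMENT)[::-1]
    return segment
```

With the full numbered strand loaded into a string `genome` (16,688 bases, going by the end of the D-Loop entry), a call such as `extract_feature(genome, "complement (13742..14263)")` would return the ND6 region read in its coding orientation, provided the assumptions above hold.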