
CHARACTERIZATION OF GROUP I INTRONS IN THE RIBOSOMAL RNA INTERNAL TRANSCRIBED SPACERS OF EIGHT ORDERS OF SHARKS

Veena Patil

A Thesis

Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

December 2011

Committee:

Dr. Scott O. Rogers, Advisor

Dr. Paul Morris

Dr. Jeffrey Miner


ABSTRACT

Dr. Scott O. Rogers, Advisor

Group I introns are self-splicing, ancient, and abundant, and are distributed sporadically among many phylogenetic lineages. This study is the first to report group I intron elements in the rDNA spacer regions (ITS1 and ITS2) of any organism and is the first report in any metazoan spacer region. Putative group I introns have been identified in the ITS1 and ITS2 regions of all eight extant orders of sharks included in this study. Secondary structure prediction and analysis of these putative group I introns reveal that they have maintained all of the regions found in functional introns and thus are probably group I introns capable of splicing. Phylogenetic analysis indicates that these introns were horizontally transferred prior to the evolution of all orders of sharks, appearing first in the ITS1 region. Since their first appearance, they have been evolving in parallel with the species.

ACKNOWLEDGEMENTS

I would like to express my deepest gratitude to my advisor Dr Scott O. Rogers for his guidance, inspiration, support and cooperation. He has been a wonderful person and a scientist to work with. He was always accessible and directed me to resources useful for my research. He is a very calm minded person who has always encouraged me to learn and progress in science. I am ever grateful to him for the invaluable knowledge I received from him.

I am sincerely thankful to my committee members Dr. Paul Morris and Dr. Jeffrey Miner for their valuable feedback, guidance and support during my study.

I profusely thank our collaborators Dr. Mahmood Shivji, who sent us DNA samples of sharks for our analyses, and fellow collaborator Michael J. Stanhope. I am very thankful to our previous lab members Armeria Vicol and Nancy Walker, who laid the initial groundwork for this project.

I cannot forget the valuable help and cooperation of my lab members Caitlin Knowlton, Sammy Jumma, Farida Sidiq, Amal Abu Almakarem, and Katie Heilman. My special thanks to Yury Mikailovich Shtarkman for his consistent moral and technical support throughout my studies, and also for his timely help and suggestions.

My sincere thanks to all my family members, especially my husband, without whose consistent support and love it would not have been possible for me to come to BGSU and finish my studies. He has been very cooperative and helpful throughout my studies. I extend my deepest gratitude to my parents, who gave me a very good education from my childhood that has enabled me to do this today. I am also thankful to my in-laws for their help and cooperation.

I am thankful to Dr. Vipaporn Phuntumart and Dr. Kamau Mbuthia for all their help and cooperation during my study at BGSU. I am immensely thankful to the Department of Biological Sciences, BGSU, for giving me this opportunity and supporting me throughout my studies. I am extremely thankful to Dee Dee Wentland, Deb McLean, Louise Small, Chris Hess, and Linda Treeger for their help and cooperation during my study.

I profusely thank my friends Chaitanya Kota, Vaishali, Uksha, Shivani, Rohit, Abilash, Prasad, Raja, and any other friends whose names I may not have recalled, for their help and cooperation during my study here in BG.

Veena Patil


TABLE OF CONTENTS

INTRODUCTION

Sharks

Ribosomal DNA (rDNA = rRNA gene locus)

Phylogenetic Studies

Introns and Evolution

Group I Introns

MATERIALS AND METHODS

DNA extraction

Amplification of ITS1 and ITS2 sequences from sharks

Constructing the plasmids for ITS1 and ITS2 sequencing and in vitro transcription

BLAST searches

Sequence alignment using ClustalW

Phylogenetic Analysis and Secondary Structure Prediction

RESULTS

BLAST search results

Phylogenetic analysis

Secondary structure comparison

DISCUSSION

REFERENCES


LIST OF FIGURES

1 Amplification of ITS1 and ITS2 sequences

2 Plasmid with PCR product insertion site for sequencing and transcription

3 Secondary structure of the ITS2 intron from Lamna nasus

4 Secondary structure of the ITS2 intron from Hexanchus griseus

5 Secondary structure of the ITS2 intron from Heterodontus francisci

6 Secondary structure of the ITS2 intron from Squalus acanthias

7 Secondary structure of the ITS2 intron from Carcharhinus brevipinna

8 Secondary structure of the ITS2 intron from Rhincodon typus

9 Secondary structure of the ITS1 intron from Ginglymostoma cirratum

10 Secondary structure of the ITS1 intron from Sphyrna lewini

11 Secondary structure of the ITS1 intron from Lamna nasus

12 Secondary structure of the ITS1 intron from Isurus oxyrinchus

13 Secondary structure of the ITS1 intron from Heterodontus francisci

14 Secondary structure of the ITS1 intron from Hexanchus griseus

15 Unrooted most parsimonious tree of putative group I introns from both ITS1 and ITS2 regions

16 Rooted phylogram (most parsimonious) of putative introns from both ITS1 and ITS2 regions

17 Rooted phylogram (most parsimonious) of putative introns from both ITS1 and ITS2 regions

18 Most parsimonious tree of putative introns from both ITS1 and ITS2 regions showing representative structures for each clade

LIST OF TABLES

1. List of all the taxa examined from the extant orders of sharks

2. Primers used for the amplification of ITS1 and ITS2 from sharks

3. Primers used for sequencing ITS1 and ITS2 regions of sharks

4. GenBank (NCBI) accession numbers for ITS1 and ITS2 intron sequences determined in this study

5. List of putative group I introns found in shark sequences previously deposited in GenBank (NCBI) from other studies

6. Mean and standard deviation of length of P1 to P10 from the shark ITS1 and ITS2 putative introns

INTRODUCTION

Sharks

Sharks belong to the Class Chondrichthyes, Superorder Selachimorpha (Berman and Rotman, 1995), whose members have cartilaginous skeletons and highly streamlined bodies. They first appeared in the fossil record about 420 million years ago. About 440 species of sharks are known to date. The class has two subclasses: Elasmobranchii, to which sharks belong along with skates and rays, and Holocephali (the chimaeras). Sharks are diverse in their physical appearance, distribution, diet, and habits. They range in size from the smallest known species, lantern sharks (Etmopterus pusillus), with an average length of 17 cm, to the largest known species, whale sharks (Rhincodon typus), with an average length of 12 m. Sharks are found in all seas around the world. They are mostly restricted to marine habitats and are rarely found in fresh water, except for bull and river sharks, which are found in both marine and freshwater environments (Allen 1999).

Order Squatiniformes consists of one family and one genus with 16 species, including Squatina californica (Carrier et al., 2004). They are mostly benthic and are typically found in temperate coastal and tropical upper bathyal habitats (Compagno, 1984). Squatina californica (commonly known as the Pacific angel shark, angel shark, California angel shark, or monkfish) is found in the eastern Pacific Ocean from southeastern Alaska to the Gulf of California and from Costa Rica to southern Chile.

Order Squaliformes consists of six families composed of small or medium-sized sharks that are found in cold water (Compagno, 1999). The Genus Squalus first appeared in the Upper Cretaceous (Cappetta, 1987). Squalus acanthias (common names: spiny dogfish, piked dogfish, spotted spiny dogfish, and spur dogfish) is the most prominent member of the Family Squalidae. It is used as a source of food for humans and pets, and for liver oil, fish meal, fertilizer, and leather products.

Order Pristiophoriformes includes one family with two genera. The Genus Pristiophorus includes five species, of which two were used in this study. The Japanese sawshark (Pristiophorus japonicus) is found primarily in the northwest Pacific near Japan, Korea, and northern China. The shortnose sawshark (Pristiophorus nudipinnis) is found along the east coast of Australia. Pristiophorus fossils appear in the Upper Cretaceous and are very similar to the extant sharks (Cappetta, 1987).

The Order Carcharhiniformes includes eight families (Compagno 2001). The Family Carcharhinidae includes the blue shark (Prionace glauca), bull shark (Carcharhinus leucas), blacktip shark (C. limbatus), spinner shark (C. brevipinna), sandbar shark (C. plumbeus), dusky shark (C. obscurus), and silky shark (C. falciformis). Carcharhinus leucas and C. brevipinna are found in warm temperate and tropical waters in the Indo-West Pacific and Atlantic. Carcharhinus plumbeus and C. obscurus are circumglobal in warm temperate and tropical waters. Carcharhinus falciformis is found in all tropical waters and occasionally in warm temperate waters (Carrier et al., 2010). The blacktip shark (C. limbatus) is most commonly found in tropical and subtropical waters around the world. It is a common shore species, usually found around river mouth estuaries and coral reefs (Compagno 2001). Prionace glauca is a wide-ranging pelagic, oceanic species found in both temperate and tropical seas, typically at shallower depths away from the equator and at greater depths near the equator. Sphyrna lewini (scalloped hammerhead) in the Family Sphyrnidae is circumglobal in warm temperate and tropical waters. It feeds mostly on bony fishes and invertebrates (including crabs and shrimp). Fisheries utilize its meat for food, the fins for soup stock, the skin for leather, and the liver oil as a source of vitamins (Carrier et al., 2010). Sphyrna tiburo (bonnethead shark) is found in warm temperate and tropical waters in the eastern Pacific and western Atlantic (Carrier et al., 2010).

Members of the Order Lamniformes are commonly known as mackerel sharks. This order consists of seven extant families. The Family Lamnidae includes three genera: Carcharodon, Isurus, and Lamna. Carcharodon carcharias (great white shark) is found in coastal surface waters in all major oceans. The largest individual recorded was approximately 6 m in length. The earliest known fossils of Carcharodon are about 16 million years old (Gottfried and Fordyce, 2001). The shortfin mako shark (Isurus oxyrinchus) is found in temperate and tropical seas worldwide. It is a close relative of the longfin mako shark (Isurus paucus), which is found in the Gulf Stream or warmer offshore waters (Carrier et al., 2010). Collectively, I. oxyrinchus and I. paucus encompass all mako sharks.

Order Orectolobiformes includes a diverse group of sharks and is composed of seven families. Because their ornate patterns resemble those of a carpet, they are called carpet sharks. Most of the species are tropical, coastal, and benthic. Ginglymostoma cirratum (nurse shark) is one of the main members of the Family Ginglymostomatidae. The genus Ginglymostoma first appeared in the Lower Cretaceous, and 12 fossil species have been identified (Cappetta, 1987). This is a widely distributed shark in the tropical parts of the Atlantic and eastern Pacific oceans, normally inhabiting reefs (Compagno 2001). Chiloscyllium plagiosum is a small shark (maximum length = 1 m) in the Family Hemiscylliidae living in shallow waters of the Indian and west Pacific oceans. It is commonly called the whitespotted bamboo shark and is used for its meat and in home aquariums (Compagno 2001). The whale shark (Rhincodon typus; Family Rhincodontidae) is the largest living shark. It is circumglobal and is found in tropical and warm temperate seas, usually near the surface. This species is known to have originated about 60 million years ago (Compagno, 2001).

The Order Heterodontiformes has one extant family with one genus, consisting of nine species (Compagno, 2001). The sharks belonging to this order are called bullhead sharks. It is also one of the more ancient groups, with fossils appearing in the Early Jurassic, 180 mya (Cappetta, 1987; Cappetta et al., 1993). California horn sharks (Heterodontus francisci) are relatively small sharks found in warm temperate and subtropical waters of the eastern Pacific (Compagno, 2001).

The Order Hexanchiformes is considered to comprise the most primitive sharks. Fossils indicate the presence of these sharks from the earliest parts of the Jurassic, 200 mya (Cappetta et al., 1987). The order consists of the Family Chlamydoselachidae and the Family Hexanchidae. The Hexanchidae includes three genera: Heptranchias, Hexanchus, and Notorynchus. The bluntnose sixgill shark (Hexanchus griseus) is the largest member of the Order Hexanchiformes (5.4 m in length). The Genus Hexanchus is ancient, and its fossils are known from the Early Jurassic (Cappetta, 1987; Cappetta et al., 1993). Notorynchus includes one species, the broadnose sevengill shark (Notorynchus cepedianus), which is the only known living member of this genus. This genus first appeared during the Lower Cretaceous (Carrier, 2004).


Ribosomal DNA (rDNA = rRNA gene locus)

Ribosomal DNA (rDNA) encodes the ribosomal RNAs (rRNAs) that form the primary functional components of ribosomes. Ribosomal RNA is an essential structural component of the ribosome and plays the central role in protein synthesis. The rDNA locus consists of tandem repeats of an IGS (intergenic spacer), an ETS (external transcribed spacer), an SSU (small subunit) gene, ITS1 (internal transcribed spacer 1), a 5.8S gene, ITS2, and an LSU (large subunit) gene (Tropp 2008). The spacer regions vary in length and sequence among species. The vital role of rRNA molecules in protein synthesis leads to strong selection pressure to maintain functional rRNA molecules. Hence, ribosomal genes (especially the SSU gene) are some of the most conserved genes in all cellular organisms, with sequence similarity evident even among distant phylogenetic taxa. In contrast, the internal transcribed spacer regions (ITS1 and ITS2) are not incorporated into the structure of the ribosome and are more variable, with less sequence similarity among taxa. The IGS often contains more sequence variation than the ITS regions.

Phylogenetic Studies

High conservation of the rRNA genes makes this region suitable for studying the phylogeny of distantly related organisms. Eukaryotic rDNA consists of hundreds of tandemly repeated copies of the transcription unit, which includes (in order) the ETS, SSU gene, ITS1, 5.8S gene, ITS2, and the LSU gene (Harris and Crandall, 2000). The ITS regions, which are variable due to insertions, deletions, and point mutations, have been used to gain insight into DNA sequence evolution at lower taxonomic levels (usually at the family, genus, species, or varietal levels). They have been widely utilized in molecular ecology. They are frequently used in plant systematics (Baldwin, 1992) for characterization across interspecific and intergeneric level divergences (Baldwin, 1995). They have also been used to study phylogenetic interrelationships in crayfish (Harris and Crandall, 2000), grasshoppers (Kuperus and Chapco, 1994), mites (McLain et al., 1995), snails (Schilthuizen et al., 1999), corals (Odorico and Miller, 1997), and fungi (Yan et al., 1995; Shinohara 1996; Zhou and Stanosz 2001). Similarly, ITS sequences have been used to achieve rapid species identification of carcharhinid sharks (Pank et al., 2001). Shivji et al. (1996a, 1996b) used the nucleotide differences in ribosomal ITS regions for rapid identification of shark tissues using multiplex PCR-based protocols.

Introns and Evolution

Introns are common in the rDNA locus of fungi and have been described in the same region in some plants and algae. Introns are non-coding segments that are typically found within genes, are located between the expressed portions of the RNAs (the exons), and are removed from precursor RNAs as part of RNA maturation. There are four classes of introns: spliceosomal, group I, group II, and archaeal introns. Spliceosomal introns are spliced out of RNA by a cellular apparatus called the spliceosome. They are present most often in mRNAs, although some fungi have spliceosomal introns within their rRNA genes (Rogers et al. 1993; Bhattacharya et al. 2000). Group I and group II introns are self-splicing introns (with different mechanisms) found in rRNA, tRNA, and organellar RNAs that excise themselves from nascent RNA transcripts via ribozyme activity. Archaeal introns are found in the tRNA genes of euryarchaeotes (extremophiles) and in the tRNA and rRNA genes of crenarchaeotes (Archaea in marine environments) (Andersen et al. 1997). Archaeal tRNA introns often (but not always) interrupt the anticodon loop one base 3' to the anticodon, similar to eukaryotic tRNA introns (Tocchini-Valentini et al. 2011).

The origin of introns is intriguing and controversial. The ‘intron-early’ hypothesis assumes that introns were present in the earliest versions of genes and played a crucial role in the origin of proteins by facilitating recombination (Blake 1979). Subsequent evolution of the genes mostly involved the loss of introns, partially in eukaryotes and completely in prokaryotes, in order to streamline their genomes (Koonin 2006). On the other hand, the ‘intron-late’ hypothesis assumes that introns have been acquired and that this gain has been a continuous process during the evolution of eukaryotes, while most prokaryotes never possessed them. Some researchers believe that both hypotheses are correct. Although introns are more common in eukaryotes than in prokaryotes, group I introns are phylogenetically the most widely distributed, having been found in the tRNA and rRNA genes of bacteria, as well as within some organellar and bacteriophage genes. Some of them are also known to possess homing endonucleases that function in their mobility. Because group I introns have been found in a few bacteriophages, horizontal transfer of these introns via viruses is a possibility. Further strengthening this possibility, ORFs (open reading frames) that promote intron transfer to intron-less copies of the same gene are also found in introns of bacteriophage T4 (e.g., td and sunY), and they have been demonstrated to promote transfer by a process similar to that employed by other group I intron-encoded ORFs (Shub 1991). The ORF of a Saccharomyces cerevisiae group I intron (r1) encodes the omega transposase, a specific restriction endonuclease, which makes the intron reminiscent of a transposable element encoding its own transposase (Colleaux 1988).

Group I Introns

Group I introns are among the most ancient (Harris and Rogers 2008, Takizawa 2011) and most abundant (Vicens and Cech 2006) of the self-splicing ribozymes. They were first reported in the ciliated protozoan Tetrahymena thermophila (Cech 1981). Group I introns are widely distributed in nuclear ribosomal DNA (rDNA). Their sporadic presence among phylogenetic lineages is consistent with frequent insertions and deletions during evolution (Dujon et al. 1989). This study is the first to report group I intron-like elements in the rDNA spacer regions (ITS1 and ITS2) of any organism, and is the first to find introns in shark genomes. Group I introns were found in the ITS1 and ITS2 regions of all eight extant orders of sharks (Shivji et al. 1995, 1996a, 1996b, 1997, 1999).

Group I introns vary significantly in their sizes, catalytic properties, and self-splicing abilities (Golden, 2008; David and Eckstein, 2008). They range in size from 56 nt (Grube et al., 1996) to 3000 nt (Lambowitz & Belfort 1993). In vivo and in vitro mutation studies, along with phylogenetic comparisons and crystallographic determinations, have led to the deduction of the group I intron secondary structure (Michel and Westhof 1990; Lambowitz 1993; Adams et al. 2004), which is largely conserved among all group I introns. The secondary structure consists of a series of paired regions, termed P1 through P10, which assemble to form a catalytically active ribozyme (Vicens and Cech 2006). The helices are organized into three domains: P1-P2-P10, P4-P5-P6, and P3-P7-P8-P9. They are connected by non-Watson-Crick base pairs or single-stranded junction (J) segments (Golden, 2008; David and Eckstein, 2008). Regions P1 and P10 contain the 5' and 3' splice sites, respectively (Lambowitz 1993, Vicens and Cech 2006), and are formed by base pairing to the internal guide sequence (also abbreviated IGS, not to be confused with the intergenic spacer), often forming base triples involving P1, the IGS, and P10 concurrently. The sequences can be variable, except for a few crucial nucleotides at the active site, although the core secondary and tertiary structures are conserved (Michel and Westhof 1990). The conserved features of the group I introns include:


a) Conserved regions P, Q, R and S. The P and Q regions are complementary and pair with each other to form the P4 region of the secondary structure, which is found in the P4-P5-P6 domain.

b) Regions R and S are complementary to each other and form the P7 region found in the P3-P7-P8-P9 domain.

c) The final exon U (which always forms a wobble base pair with a G residue in the IGS) at the 3' end of the 5' exon and the last intron G (termed the omega G, or ωG) at the 3' end of the intron (preceding the 3' splice site) are conserved (although a few introns have been described that have a terminal ωU instead of a terminal ωG).

d) The exogenous G binding site is found in P7 (Adams et al. 2004) and corresponds to a universally conserved GC pair in P7. The exogenous G initiates the first reaction of splicing in group I introns.

The domains P4-P5-P6 and P3-P7-P8-P9 constitute the catalytic core that contains the active site (Kim and Cech 1987, Michel and Westhof 1990). The receptor for the conserved U·G wobble pair in P1 (at the 5' splice site) is provided by the P4-P5-P6 domain (Wang et al. 1993), which is also reported to structurally support the P3-P7-P8-P9 domain (Woodson and Chauhan 2008). The P3-P7-P8-P9 domain is reported to contain many active site residues and to retain some catalytic activity on its own, independent of the other domains (Ikawa et al. 2000). Group I introns are classified into five major groups, IA through IE (Zhang and Li 2005), based on their structural and sequence features (Michel and Westhof 1990). They are further divided into 11 minor subgroups (IA1, IA2, IA3, IB1, IB2, IB3, IB4, IC1, IC2, IC3, ID and IE).


The splicing of a group I intron proceeds through two transesterification reactions, which take place in the presence of biologically relevant concentrations of Mg2+ and a guanosine nucleoside (Cech et al. 1994). The first step of intron splicing is mediated by a free guanosine (exogenous G, which can be in the form of guanosine, GMP, GDP, or GTP) that docks into the G binding site in P7 (Golden 2008, Vicens and Cech 2006). The 3'-OH of the exogenous G bound to the G binding site in P7 is aligned to attack the phosphodiester bond at the 5' splice site located in P1. The attack, which is mediated by magnesium ions, results in the formation of a free 3'-OH group on the U at the end of the 5' exon, and the exogenous G becomes covalently bound to the 5' end of the intron (Lambowitz 1993, Golden 2008). Next, the terminal conserved G (ωG) of the intron at the 3' splice site within the P10 region replaces the exogenous G in the G binding site of P7 to initiate the second transesterification reaction. The 3' splice site is then attacked by the free 3'-OH of the uridine at the end of the 5' exon in P1, leading to ligation of the exons and removal of the intron (Lambowitz 1993, Adams 2004, Vicens and Cech 2006, Golden 2008).
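As a rough illustration of the two transesterification steps described above, the following Python sketch treats the precursor RNA as a plain string. The function name, the toy sequence, and the index convention are assumptions made for this example only. It models the bookkeeping of the reaction (which fragments are joined and which are released), not the chemistry or the folded structure of the ribozyme.

```python
from dataclasses import dataclass


@dataclass
class SpliceProducts:
    ligated_exons: str   # 5' exon joined to 3' exon
    excised_intron: str  # intron carrying the exogenous G at its 5' end


def splice_group_i(precursor: str, intron_start: int, intron_end: int) -> SpliceProducts:
    """Toy model of group I splicing on an RNA string (hypothetical helper).

    intron_start: index of the first intron nucleotide.
    intron_end:   index one past the last intron nucleotide (the omega G).
    """
    five_exon = precursor[:intron_start]
    intron = precursor[intron_start:intron_end]
    three_exon = precursor[intron_end:]

    # Conserved features from the text: the 5' exon ends in U and the intron
    # ends in the omega G. (The rare omega-U introns are ignored here.)
    if not five_exon.endswith("U"):
        raise ValueError("5' exon is expected to end in U")
    if not intron.endswith("G"):
        raise ValueError("intron is expected to end in the omega G")

    # Step 1: the exogenous G attacks the 5' splice site. The 5' exon is
    # released with a free 3'-OH, and the G is added to the intron's 5' end.
    g_intron_three_exon = "G" + intron + three_exon

    # Step 2: the 3'-OH of the 5' exon attacks the 3' splice site, ligating
    # the exons and releasing the intron.
    cut = 1 + len(intron)
    excised_intron = g_intron_three_exon[:cut]
    ligated_exons = five_exon + g_intron_three_exon[cut:]
    return SpliceProducts(ligated_exons, excised_intron)


# Made-up precursor: 5' exon "AAGCU", intron "AAUCCG", 3' exon "GGAC".
products = splice_group_i("AAGCUAAUCCGGGAC", intron_start=5, intron_end=11)
print(products.ligated_exons, products.excised_intron)
```

In this toy case the excised intron is one nucleotide longer than the encoded intron, which reflects the covalently attached exogenous G.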

In an earlier effort to identify a broad range of sharks from extant orders, PCR amplification using sets of primers for the rDNA ITS regions (ITS1 and ITS2) was performed, and the alignment of the amplified products indicated the presence of a large insertion (Shivji et al. 1995, 1996a, 1996b, 1997, 1999). This was first noted because the ITS2 region was longer than those of most other eukaryotes, and the ITS2 of bull shark (C. leucas), as well as the ITS2 of other shark species, was approximately 400 bp longer than the ITS2 region in bonnethead shark (S. tiburo). Close examination of additional ITS1 and ITS2 sequences from sharks led to the discovery of group I introns that apparently inserted into at least two sites (one in each of the ITS regions) early during the evolution of sharks (Shivji et al. 1995, 1996a, 1996b, 1997, 1999). In the current study, ITS1 and ITS2 sequences from species representing all extant orders of sharks were analyzed for the presence of group I introns. Their secondary structures were predicted, and their possible origin and diversification were studied using the sixgill shark (Hexanchus griseus) as the outgroup, because it is considered to be one of the most ancient lineages of sharks used for this study.

MATERIALS AND METHODS

DNA extraction

Tissue samples (muscle or fin) from the sharks used in this study (Table 1) were obtained from Dr. Mahmood Shivji (Nova Southeastern University, Oceanographic Center, Dania, FL). They were collected during Dr. Shivji’s conservation studies of wild sharks from several locations worldwide. All samples were stored in ethanol at -20°C. Genomic DNA was isolated from each sample according to a CTAB DNA extraction method (Rogers and Bendich 1985, 1994).

Amplification of ITS1 and ITS2 sequences from sharks

Polymerase chain reaction (PCR) amplification of the rDNA ITS region (Fig. 1) was performed in 50 µl volumes containing Taq DNA polymerase buffer (20 mM Tris-HCl, 10 mM (NH4)2SO4, 10 mM KCl, 2 mM MgSO4, 0.1% Triton X-100, pH 8.8), 0.25 µM of primers (Table 2), 250 mM dNTPs, 2 units of Taq DNA polymerase (New England Biolabs, Ipswich, MA), and 50 ng of genomic DNA. The DNA was denatured at 94°C for 5 minutes, followed by 30 amplification cycles (melting: 94°C for 35 seconds; annealing: 55°C for 1 minute; extension: 72°C for 2 minutes) and a final extension at 72°C for 15 minutes. The primers used for ITS1 amplification were primer ITS1F and primer 2 (reverse). The primers used for ITS2 amplification were primers Fish 5.8S F and Fish 28S R.

Table 1. List of all the taxa examined from the extant orders of sharks.

Order                 Taxa included

Carcharhiniformes     Carcharhinus acronotus (blacknose shark), Carcharhinus brevipinna (spinner shark), Carcharhinus falciformis (silky shark), Carcharhinus leucas (bull shark), Carcharhinus limbatus (blacktip shark), Carcharhinus obscurus (dusky shark), Carcharhinus plumbeus (sandbar shark), Prionace glauca (blue shark), Cephaloscyllium ventriosum (swell shark), Sphyrna lewini (scalloped hammerhead), Sphyrna tiburo (bonnethead shark or shovelhead)

Lamniformes           Carcharodon carcharias (great white shark), Isurus oxyrinchus (shortfin mako), Isurus paucus (longfin mako), Lamna nasus (porbeagle)

Heterodontiformes     Heterodontus francisci (horn shark)

Hexanchiformes        Hexanchus griseus (bluntnose sixgill shark), Notorynchus cepedianus (broadnose sevengill shark)

Orectolobiformes      Ginglymostoma cirratum (nurse shark), Chiloscyllium plagiosum (whitespotted bamboo shark), Rhincodon typus (whale shark)

Pristiophoriformes    Pristiophorus japonicus (Japanese sawshark), Pristiophorus nudipinnis (shortnose sawshark)

Squaliformes          Squalus acanthias (spiny dogfish)

Squatiniformes        Squatina californica (Pacific angel shark)

Table 2. Primers used for the amplification of ITS1 and ITS2 from sharks.

Region    Primer name    Sequence (5' to 3')

ITS1      ITS1F          GTACACACCGCCCGTCGCTACTA

ITS1      Primer 2 R     GCTGCGTTCTTCATCGATGC

ITS2      Fish 5.8S F    TTAGCGGTGGATCACTCGGCTCGT

ITS2      Fish 28S R     TCCTCCGCTTAGTAATATGCTTAAATTCAGC
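The amplification primers in Table 2 can be checked for basic properties before use. The following Python sketch computes length, GC fraction, and reverse complement for each primer. The helper names are hypothetical and the check is only an illustration; it is not part of the protocol described here. The Fish 28S R sequence is written as a single string on the assumption that the trailing "GC" in the table belongs to that primer.

```python
# Primer sequences as listed in Table 2 (5' to 3').
PRIMERS = {
    "ITS1F": "GTACACACCGCCCGTCGCTACTA",
    "Primer 2 R": "GCTGCGTTCTTCATCGATGC",
    "Fish 5.8S F": "TTAGCGGTGGATCACTCGGCTCGT",
    # Assumption: the final "GC" is the wrapped end of this table cell.
    "Fish 28S R": "TCCTCCGCTTAGTAATATGCTTAAATTCAGC",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.2f}, "
          f"reverse complement = {reverse_complement(seq)}")
```

The reverse complement of a reverse primer gives the sequence as it reads on the sense strand, which is convenient when locating the primer site in an alignment.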

Constructing the plasmids for ITS1 and ITS2 sequencing and in vitro transcription

Following amplification, the amplicons were ligated into plasmid vectors (pCR2.1-TOPO; Invitrogen Corp., Carlsbad, CA) and transformed into host E. coli bacteria. Recombinant clones were identified by growth on selective medium with 100 µg/ml ampicillin (Fig. 2). The recombinant plasmids were extracted from the bacterial host cells and used in sequencing reactions (Gene Gateway Inc., Hayward, CA). Sequencing primers were the T7 promoter primer (GTAATACGACTCACTATAGGG) and the M13 reverse primer (GGAAACAGCTATGACCATG), both of which anneal to the vector portion of the recombinant plasmids. Additional primers were synthesized and used for sequencing because most of the fragments were large (due to the presence of introns). The additional primers used for sequencing are presented in Table 3.

BLAST searches

The NCBI (National Center for Biotechnology Information; www.ncbi.nlm.nih.gov) website was used to perform BLASTn analyses (nucleotide searches) with the ITS1 and ITS2 sequences. Initially, a search for group I introns was undertaken to determine whether any of the sharks studied had been reported to contain group I introns. The sequences from both ITS1 and ITS2 were entered in the query sequence field in FASTA format. All options were set to the standard default settings in BLASTn.
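Since the query sequences were submitted in FASTA format, a minimal Python sketch of preparing such input is shown here. The function name, the record names, and the 60-character line width are assumptions for illustration; the sequences in the usage example are placeholders, not data from this study.

```python
def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text.

    Each record gets a '>' header line followed by the sequence,
    wrapped to `width` characters per line.
    """
    lines = []
    for name, seq in records:
        seq = "".join(seq.split()).upper()  # drop whitespace, normalize case
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Placeholder records (hypothetical names and sequences).
example = [
    ("shark_ITS1_example", "acgt" * 20),
    ("shark_ITS2_example", "GGCATTA"),
]
print(to_fasta(example), end="")
```

The resulting text can be pasted into a query box or saved to a file for batch submission.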


Table 3. Primers used for sequencing ITS1 and ITS2 regions of sharks

ITS1 Primer (forward) a Primer (reverse) a (species) CGAAGCCGGTGATGCAGC C. plagiosum GCGG G. cirratum GGACGAGCCGGCACTCAG GCGGCTCGCCCGAGG H. francisci GGTGCGCACAGCGCGA GCGCTGTGTTGTGCTGGC H. griseus TTGATG CAGGGCTCGTCTGGCGGA GGCCATGCCGGACCGATGAC I. oxyrinchus GGT AG CCAGCCCACGGGAGCCGG CAGGTACATTCTCTCGCACGC L. nasus AG TG CACGTCGCACCCAACCGG CAGTCGGTTCGGCAGCGTCAC P. japonicus CTC CG AACGGTCCCACCGAAGAT CTGCTGCTGTGGCTGCTTAC S. acanthias GG TGGGTGTGAGCAAATCAG S. californica AGAGGGCGGA TTCGCTGGATGCCGAGAC S. lewini CTGG ITS2 Primer (forward) Primer (reverse) (species) C. leucas GCCTCCTCGAGGTCGCC CAGGACGGCTGTCAGTGG GCTCCAGGCCGGCCGGTATAT H.francisci GTTGC CT I. oxyrinchus TTCCCGGCACGGCTGTC GGGCACATTGAGTGCATC L.nasus GGTAGGTGCTCGCCAGACAG GG TAAGTGCAGACCCGGAGT CGCGGAAGCGCTGCTGGAGT N. cepedianus CTCCGCCTC CC GCTCACACCGTTCTGGAG CTCGGCTCACACCGTGCCGTA P. japonicus TCCCGA CGTCGGTCTGAGCTGCGG R. typus CACGCACGCCACCGGACACG TCG CATTGTCTTGTTCGGCGTC S. acanthias GAGCAGTGGCTGGGCAG CC S. californica TGTGTGGAAGGACAGCGAGG aAll are shown in the 5’ to 3’ direction.


Sequence alignment using ClustalW

A multiple sequence alignment program, ClustalW, was used to align the sequences for the phylogenetic studies. The ClustalW service at www.ebi.ac.uk was used, with the ‘DNA’ option chosen for the input sequences, which were entered in FASTA format. All other options were set to the standard default settings (input sequence – DNA; pairwise alignment option – slow; DNA weight matrix – IUB; gap open – 10; gap extension – 0.20; gap distance – 5; end gaps – none; iteration – none; numiter – 1; clustering – NJ; output options: format – Aln w/numbers; order – aligned). After alignment, the ‘result summary’ option was used to obtain the alignment in the ‘CLUSTAL’ format. The sequences in this format were copied and pasted into a text editor (Notepad) to manually find and align the P, Q, R and S regions.
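An alignment saved in the ‘CLUSTAL’ format is split into interleaved blocks, which makes manual inspection of long regions awkward. The following Python sketch joins the blocks back into one aligned string per sequence. It assumes the common layout of the format (a header line starting with "CLUSTAL", sequence lines of the form "name residues [count]", and a conservation line that begins with whitespace); the function name and the example text are hypothetical.

```python
def parse_clustal(text: str) -> dict:
    """Return {sequence name: full aligned sequence} from CLUSTAL-format text."""
    sequences = {}
    for line in text.splitlines():
        # Skip blank lines, the header, and conservation lines
        # (the latter start with whitespace).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, chunk = fields[0], fields[1]  # an optional third field is a running count
        sequences[name] = sequences.get(name, "") + chunk
    return sequences


example = "\n".join([
    "CLUSTAL W (1.83) multiple sequence alignment",
    "",
    "seqA      ACGT-ACGT 8",
    "seqB      ACGTTACGT 9",
    "          **** ****",
    "",
    "seqA      GG 10",
    "seqB      GG 11",
    "          **",
])
for name, aligned in parse_clustal(example).items():
    print(name, aligned)
```

With each sequence on a single line, candidate P, Q, R and S segments can be compared column by column across taxa.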

Phylogenetic Analysis and Secondary Structure Prediction

Phylogenetic analyses were performed using PAUP* 4.0b10 (Swofford, 2001). Maximum parsimony with a heuristic search strategy was employed. Bootstrapping with 1000 replicates was used to assess support for the branches of the trees. Two equally parsimonious trees were generated for both ITS1 and ITS2 introns using Hexanchus griseus as the outgroup. An unrooted phylogram was also generated. The phylogenetic tree was divided into thirteen clades and a representative structure was predicted for each clade. The differences in the sequences of the other members of the clade were marked on the representative structure.
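The resampling step that underlies bootstrap support can be sketched as follows. This is only an illustration of how one bootstrap replicate is drawn from an alignment (columns sampled with replacement); the tree search itself, which PAUP performs on every replicate, is not shown. The function name and the toy sequences are hypothetical and are not data from this study.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: alignment columns resampled with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment has columns.
    cols = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][c] for c in cols) for n in names}


# Toy alignment (hypothetical sequences, for illustration only).
toy = {
    "taxonA": "ACGT-ACGGA",
    "taxonB": "ACGTTACGAA",
    "taxonC": "ACCT-ACGGA",
}
rng = random.Random(42)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), "replicates drawn")
```

Each replicate keeps the taxa and the alignment length unchanged, so the same tree-building method can be applied to it; the fraction of replicate trees containing a given branch is the bootstrap support for that branch.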

Secondary structures were predicted for introns in each clade of the phylogenetic tree for both the ITS1 and ITS2 regions. Initially, one sequence representing each clade was used for the prediction of the secondary structure. Then, the remaining sequences from the clade were depicted on the representative structure for that clade. This was achieved by performing pairwise alignments using the representative sequence of the clade. First, the R and S regions, which comprise the P7 region of group I introns, were located manually based on their complementarity. Similarly, the P and Q regions, which pair to form the P4 region, were located based on their complementarity. Region P3 was located by searching downstream from R and upstream from P (within P4). Secondary structures for each of these pairing regions were initially predicted using Mfold (Zuker, 1989, 2003), followed by manual adjustments. Once the P4/P5 region was obtained, the region between Q and R, which forms the loop P6, was predicted, again using Mfold and manual adjustments. Next, loop P8 was located, followed by P1 and P2 (where present), P9 and P10. Borders were predicted using both the deduced region of insertion of the intron and previous models of group I introns (Adams et al 2004; Michel and Westhof 1990; Neuveglise and Brygoo 1994). Manual adjustments were performed to obtain maximum base pairing.
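The manual search for complementary segments (P with Q, R with S) can be approximated by a simple scan. The sketch below is an assumption about how such a search might be automated, not the procedure used in this study: it treats Watson-Crick pairs and G-U wobble pairs as allowed, requires an ungapped match, and uses hypothetical function names and a made-up test sequence. DNA sequences would first need T converted to U.

```python
# Allowed RNA base pairs: Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(seg5, seg3):
    """True if seg5 pairs antiparallel with seg3 (both written 5' to 3')."""
    if len(seg5) != len(seg3):
        return False
    return all((a, b) in PAIRS for a, b in zip(seg5, reversed(seg3)))


def find_partners(seq, seg_start, seg_len, min_gap=3):
    """Start positions downstream of a segment that can pair with it.

    min_gap is the minimum number of unpaired bases left between the strands.
    """
    seg = seq[seg_start:seg_start + seg_len]
    first = seg_start + seg_len + min_gap
    return [
        i
        for i in range(first, len(seq) - seg_len + 1)
        if can_pair(seg, seq[i:i + seg_len])
    ]


# Hypothetical hairpin: a 6-nt strand, a 4-nt loop, and its pairing partner.
toy_rna = "GGACUC" + "AAAA" + "GAGUCC"
print(find_partners(toy_rna, 0, 6))
```

A scan like this only proposes candidate pairings; choosing among them still depends on the structural context, which is why the predictions in this study were adjusted manually.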

RESULTS

Sequencing of the rDNA internal transcribed spacers (ITS1 and ITS2) for shark identification and conservation led to the discovery of large single inserts in both spacer regions (Shivji et al 1995, 1996a, 1996b, 1997) for most species, with the following exceptions: the ITS1 of R. typus and some L. nasus samples were intron-less, and the ITS2 regions of S. lewini and S. tiburo were intron-less (Table 4). A close examination revealed the presence of group I intron-like elements in one or both of these spacers. These inserts contained features that are known to be conserved in group I introns (Burke 1988; Cech 1988; Vicens and Cech 2006), including pairing regions P1-P10, a G•U pair at the 5' exon-intron border, a U as the last base of the 5' exon and a G as the last base of the intron (Figures 3 to 14). The conserved tertiary structure needed for efficient splicing is known to be maintained by the base pairing of


Table 4. GenBank (NCBI) accession numbers for ITS1 and ITS2 intron sequences determined in this study.

Species                      ITS1 intron accession     ITS2 intron accession
                             number (length in bp)     number (length in bp)
Carcharhinus brevipinna      absent                    JN039367 (522)
Carcharhinus leucas          JN003436 (337)            JN003437 (546)
Carcharhinus limbatus        absent                    JN003438 (534)
Cephaloscyllium ventriosum   absent                    JN039367 (432)
Chiloscyllium plagiosum      JN003434 (496)            absent
Ginglymostoma cirratum       JN243354 (485)            JN003433 (485)
Heterodontus francisci       JN003427 (703)            JN003444 (438)
Hexanchus griseus            JN003428 (586)            JN003446 (325)
Isurus oxyrinchus            JN003432 (517)            JN003441 (496)
Lamna nasus                  JN003431 (704) (+/-)      JN003440 (490)
Notorynchus cepedianus       JN003429 (600)            JN003447 (336)
Prionace glauca              absent                    JN003439 (497)
Pristiophorus japonicus      JN003431 (625)            JN003448 (361)
Pristiophorus nudipinnis     absent                    JN003449 (365)
Rhincodon typus              absent                    JN003442 (497)
Squalus acanthias            JN003426 (677)            JN003445 (380)
Squatina californica         JF906176 (675)            JN003443 (376)
Sphyrna lewini               JN003435 (495)            absent
Sphyrna tiburo               not determined (+/-)      absent

(+/-) indicates that some sequences from this species had the introns and some did not.

The species with accession numbers are the ones that we sequenced, submitted to GenBank and predicted to have putative introns.

conserved sequence regions P, Q, R and S (Booton et al 2004), which form part of the central core of group I introns. These regions (Cech 1990) were present in all putative group I introns examined (Figures 3 to 14). Regions P and Q are complementary to each other and pair to form the P4 region, while regions R and S pair to form the P7 region.
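Two of the border features listed above (a U as the last base of the 5' exon and a G as the last base of the intron) can be checked directly from sequence. The following is a minimal sketch with a hypothetical function name and made-up input sequences; it does not test the G•U pair in P1 or any of the pairing regions, which require a structure prediction.

```python
def check_border_features(exon5, intron):
    """Check two sequence-level hallmarks at group I intron borders.

    exon5  -- sequence of the 5' flanking region, ending at the splice site
    intron -- the putative intron sequence, 5' to 3'
    """
    # Accept DNA or RNA input by converting T to U.
    exon5 = exon5.upper().replace("T", "U")
    intron = intron.upper().replace("T", "U")
    return {
        "exon5_ends_with_U": exon5.endswith("U"),
        "intron_ends_with_G": intron.endswith("G"),
    }


# Made-up sequences for illustration only.
print(check_border_features("ggcaT", "AAUCGG"))
```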

BLAST search results

Comparison of the conserved regions P, Q, R and S with other intron regions (Shinohara et al., 1996) suggests that the introns in sharks are closest to subgroup IC1. This subgroup is common in the nuclear rRNA gene locus of a variety of eukaryotes, including fungi and algae (Gargas et al., 1995), but has not previously been reported in any metazoan, with the exception of sea anemones. Relatively high sequence conservation was seen among the species from Order Carcharhiniformes, which allowed a precise alignment of the sequences from this order. While multiple sequence alignments using ClustalW of all the ITS1 and ITS2 sequences were somewhat ambiguous, homology was evident in the P and Q regions. Similarities were highest among the Carcharhinus species in the P, Q, R and S regions.

To determine whether these insertions are present in the ITS regions of other shark species, a BLAST search was performed. The BLAST hits were mostly sequences that included partial sequences from 5.8S rRNA and SSU rRNA and partial to complete ITS1 and ITS2 regions, but none of the matching sequences in the database had previously been listed or reported as group I introns. However, some sequences from the BLAST search contained large insertions in ITS1 and ITS2 (based on comparisons with the sequences determined in this study), including Carcharhinus acronotus (GU385345.1), Lamna nasus (AF515444.1), Isurus paucus (AF515443.1) and Carcharodon carcharias (AY198335.1) (Table 5). These were


Table 5. List of putative group I introns found in shark sequences previously deposited in GenBank (NCBI) from other studies.

Putative group I intron      Insertion    Nucleotide    Description of sequence from GenBank (NCBI) BLAST hit
element (this study)         position     difference
Lamna nasus                  488-1197     17 bp         ITS1 complete, 5.8S rRNA
JN003431 (704 bp)            508-1214     29 bp         18S rRNA, 5.8S rRNA and ITS1 complete
Isurus oxyrinchus            508-1214     28 nt         18S rRNA, 5.8S rRNA and ITS1 complete
JN003432 (517 bp)
Heterodontus francisi        792-1083     18 nt         5.8S rRNA partial, ITS1 complete
JN003427 (703 bp)
Carcharodon carcharias       523-1004     0             5.8S rRNA gene, partial sequence; ITS2, complete sequence; and 28S rRNA gene, partial sequence
AY198335 (482 bp)
Isurus paucas                500-998      33 nt         5.8S rRNA gene, partial sequence; ITS2, complete sequence; and 28S rRNA gene, partial sequence
AF515443 (500 bp)            495-503      30 nt         5.8S rRNA gene, partial sequence; ITS2, complete sequence; and 28S rRNA gene, partial sequence
Hexanchus griseus            350-668      0             5.8S rRNA gene, partial sequence; ITS2, complete sequence; and 28S rRNA gene, partial sequence
JN003446 (325 bp)
Carcharhinus brevipinna      394-919      39 bp         ITS2 and 28S rRNA gene, partial sequence
JN039367 (520 bp)            394-919      40 bp         5.8S rRNA gene, partial sequence; ITS2, complete sequence; and 28S rRNA gene, partial sequence
                             376-909      33 bp         5.8S rRNA gene, partial sequence; ITS2, complete sequence; and 28S rRNA gene, partial sequence
                             389-914      35 bp         5.8S rRNA gene, partial sequence; ITS2, complete sequence; and 28S rRNA gene, partial sequence
                             390-862      33 bp         5.8S rRNA gene, partial sequence; ITS2, complete sequence; and 28S rRNA gene, partial sequence

Nucleotide difference: difference between the putative intron sequence and the sequence from GenBank.

subjected to phylogenetic and secondary structure analysis in this study. Each had regions within the inserted sequence that could form the group I intron secondary structure.

Phylogenetic analysis

Both unrooted and rooted phylograms were constructed for the putative intron sequences from both ITS1 and ITS2 regions. The unrooted phylogram (Figure 15) shows that the ITS1 introns form one clade, while those in ITS2 form another clade, with the exception of a small clade that includes ITS1 introns from Ginglymostoma cirratum (JN243354) and Chiloscyllium plagiosum (JN003434), as well as ITS2 introns from Rhincodon typus (JN003442) and Ginglymostoma cirratum (JN003433). The most parsimonious rooted phylogram with the intron from ITS2 of Hexanchus griseus as the outgroup (Figure 16) formed a tree in which the putative intron sequences from ITS1 and ITS2 were separated, except for the clade mentioned above. The most parsimonious rooted phylogram with the intron sequence from ITS1 of Hexanchus griseus as the outgroup also clustered the ITS1 introns into one clade and the ITS2 introns into another. As in the other phylograms, the one mixed clade persisted. Unlike those of other orders, species belonging to the Order Carcharhiniformes were closer to each other than to other species, as is evident in both the unrooted and rooted phylograms.

This is consistent with their more recent evolution. The clustering of the putative intron sequences in the unrooted phylogram (Figure 15) is similar to that in the rooted phylogram of putative intron sequences from both ITS1 and ITS2 regions with Hexanchus griseus as the outgroup (Figure 17). Examination of ITS1 and ITS2 in rays (class Chondrichthyes, subclass Elasmobranchii) did not reveal the presence of putative introns in the spacer regions, while among the skates examined, the deep-sea skate Bathyraja abyssicola (class Chondrichthyes, subclass Elasmobranchii) indicated the presence of an intron in the ITS1 region (data not shown). The spotted rat fish Hydrolagus colliei (subclass Holocephali), a member of the immediate sister group of Elasmobranchii, did not indicate the presence of a possible intron in the ITS1 region. The ITS1 and ITS2 regions of the jawless fish Petromyzon marinus (sea lamprey; jawless fishes are the proposed ancestors of cartilaginous fishes) also failed to show any signs of introns.

Secondary structure comparison

Figures 3 through 14 show the predicted secondary structures for group I introns in several shark species from both the ITS1 and ITS2 regions. The mean lengths and standard deviations of the paired regions P1-P10 of putative intron sequences from both ITS1 and ITS2 regions are given in Table 6. The species listed in this table are the representatives for each clade from the tree (Figure 18).

Intron sequences from the ITS1 region of S. lewini (JN003435) and H. francisci (JN003427) have all of the paired regions (P1 to P10), whereas P2 is missing from the ITS1 introns of G. cirratum (JN243354) and I. oxyrinchus (JN003432) and from all of the ITS2 introns, with the exception of R. typus (JN003442) (Table 6). Similarly, the stem-loop P9.1 is missing from the ITS1 introns of L. nasus (JN003431), I. oxyrinchus (JN003432) and H. griseus (JN003428), and is absent from all of the ITS2 introns, except C. brevipinna (JN039367). In ITS1, region P4 (paired regions P and Q) exhibited the least variation in sequence length (SD of 3.4), followed by P1, P3 and P7, while P5 had the highest variation in sequence length, ranging from 40 nt (I. oxyrinchus, JN003432) to 219 nt (H. francisci, JN003427) (Table 6). The P5 region was the longest stem-loop in all of the introns in ITS1, except in I. oxyrinchus (JN003432), which had a P6 longer than its P5 region. Unlike introns from ITS1, none of the introns from ITS2 contain all of the paired regions P1 to P10 (Table 6). The regions common to


Table 6. Mean and standard deviation of the lengths of P1 to P10 from the shark ITS1 and ITS2 putative introns

Region  Species (accession)         P1    P2    P3    P4    P5    P6    P7    P8    P9     P9.1  Total
ITS1    G. cirratum (JN243354)      73    -     39    32    128   73    38    37    15     13    483
ITS1    S. lewini (JN003435)        55    19    42    34    157   83    42    11    27     14    495
ITS1    L. nasus (JN003431)         70    42    58    32    180   100   58    48    83     -     704
ITS1    I. oxyrinchus (JN003432)    70    -     58    25    40    100   58    48    83     -     496
ITS1    H. francisci (JN003427)     81    39    48    27    219   100   70    43    17     28    703
ITS1    H. griseus (JN003428)       71    57    41    29    145   65    52    40    54     -     586
ITS1    Mean                        70    40    48    30    144   87.0  53    32    37     16    578
ITS1    S.D.                        8.4   16    9.0   3.4   60    15.5  12    12    24     8     104
ITS2    L. nasus (JN003440)         71    -     41    42    150   75    25    22    30     -     490
ITS2    H. griseus (JN003446)       27    -     29    34    79    44    28    23    17     -     305
ITS2    H. francisci (JN003444)     55    -     53    27    119   76    30    30    22     -     438
ITS2    S. acanthias (JN003445)     39    -     43    28    116   50    28    25    28     -     380
ITS2    C. brevipinna (JN039367)    86    -     52    31    137   75    39    31    38     10    522
ITS2    R. typus (JN003442)         56    37    44    39    120   77    47    45    23     -     510
ITS2    Mean                        56    -     44    33    120   66    33    29    26.34  -     440
ITS2    S.D.                        21    -     8.7   6.0   24    15    8.4   8.5   7.4    -     84.7
        Mean (ITS1+ITS2)            63    39    46    32    133   77    43    36    36     16
        S.D. (ITS1+ITS2)            17    14    8.4   5.0   46    18    14    12    24     8


all introns from ITS2 are P1, P3, P4, P5, P6, P7, P8, P9 and P10. In ITS2, P4 had the least sequence length variation (SD of 6.0), followed by P9, P7 and P3, while P1 and P5 had the highest (SD of 21 and 24, respectively) (Table 6). The P5 region was the longest stem-loop region in ITS2, ranging from 79 to 150 nt.

A GNRA loop is found in the P9 stem-loop of G. cirratum (JN243354), L. nasus (JN003431), I. oxyrinchus (JN003432) and H. griseus (JN003428) from ITS1 (Figures 9, 11, 12 and 14, respectively). However, in H. francisci (JN003427) both the P9 and P9.1 stem-loops have GNRA loops (Figure 13). None of the ITS2 sequences had any GNRA loops in the P9 or P9.1 regions (Figures 3 to 8).
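A GNRA tetraloop is a four-base loop with the pattern G, any nucleotide (N), a purine (R = A or G), then A. Screening a terminal loop sequence for this motif can be sketched as below; the function name is hypothetical, and the loop sequences tested are generic examples rather than loops taken from the shark introns.

```python
import re

# G, any base, a purine (A or G), then A.
GNRA = re.compile(r"G[ACGU][AG]A")


def is_gnra(loop):
    """True if a 4-nt terminal loop matches the GNRA tetraloop motif."""
    loop = loop.upper().replace("T", "U")  # accept DNA or RNA input
    return GNRA.fullmatch(loop) is not None


for candidate in ["GAAA", "GCGA", "UUCG", "GACA"]:
    print(candidate, is_gnra(candidate))
```

The check applies only to the unpaired loop bases, so the closing stem has to be identified first from the predicted secondary structure.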

When ITS1 and ITS2 introns were compared, more variation was seen in ITS2 only for P1 and P4 (with SD values of 21 and 6.0, respectively), whereas P3, P5, P6, P7, P8 and P9 were more variable in ITS1 (with SD values of 9.0, 60, 15.5, 12, 12 and 24, respectively) (Table 6).

A comparison of the overall lengths of the putative group I introns from the ITS1 and ITS2 regions indicated that there was more variation in sequence and in length within the ITS1 introns than within the ITS2 introns (with SD 104 and 84.7, respectively) (Table 6). This is consistent with a more ancient insertion for the ITS1 introns.

When the pairing regions P1 to P9 were compared across both ITS1 and ITS2, region P4 showed the least variation (SD 5.0), followed by P3 (8.4), P8 (12), P2 and P7 (14 each), P1 (17), P6 (18) and P9 (24), with P5 showing the highest variation (SD 46) (Table 6).
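The length statistics in Table 6 can be reproduced from the tabulated values. The sketch below uses the P4 lengths listed in Table 6 and assumes that the reported S.D. is the sample standard deviation (n - 1 in the denominator); that assumption is not stated in the text, but it is the one that reproduces the tabulated P4 values for ITS1 and ITS2.

```python
from statistics import mean, stdev

# P4 lengths (nt) of the representative introns, taken from Table 6.
p4_lengths = {
    "ITS1": [32, 34, 32, 25, 27, 29],
    "ITS2": [42, 34, 27, 28, 31, 39],
}

# stdev() is the sample standard deviation (divides by n - 1).
summary = {
    region: (round(mean(values), 1), round(stdev(values), 1))
    for region, values in p4_lengths.items()
}

for region, (avg, sd) in summary.items():
    print(f"{region}: mean P4 length {avg} nt, SD {sd}")
```

The same calculation can be applied to any other column of Table 6 to compare the variability of the pairing regions.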

DISCUSSION

This study is the first to report the presence of group I introns in the ITS regions in any organism, and the first report of group I introns in any higher group (in all the eight extant orders of shark species). The results also indicate that the initial insertion of the introns into this region occurred shortly before the first sharks appeared on Earth (prior to about 420 mya), because they are found in all shark Orders, as well as in skates, but are absent from species outside of the Class Chondrichthyes. The self-splicing group I introns are considered to be the most ancient and probably date to the common ancestor of the cyanobacteria (2.7-3.5 billion years ago) and might have entered the eukaryotic domain through plastid endosymbiosis (Bhattacharya 1996) or other types of horizontal transfer. Their occurrence in organisms such as the hyperthermophilic bacterium Thermotoga, which belongs to a basal lineage in the tree of life, signifies the possible presence of these catalytic RNAs (ribozymes) in the last universal common ancestor (LUCA) of contemporary life. Ribozymes are considered legacies of a primordial RNA world, where both information encoding and catalytic properties were primarily the domain of RNAs (Raghavan and Minnick 2009). They are known to have played significant roles in the evolution of life on the planet (Vicens and Cech 2009, Raghavan and Minnick 2009).

Group I introns are embedded within essential genes (Golden 2008) and are found in a wide variety of organisms, such as in the rDNA locus in fungi (Mavridou et al 2000, Booton et al 2004, Rogers et al 1996, Bhattacharya et al 1996, Takizawa et al 2010), as well as in the genomes of sea anemones (Beagley et al 1996), protists (Gray et al 1997), green algae (Bhattacharya 1996), red algae (Ragan et al 1993, Sheath et al 2001), lichens (Simon et al 2004), eubacteria (Xu et al 1990, Kuhsel 1990) and bacteriophage (Shub et al 1988; Sandegren and Sjoberg et al., 2006). In addition to their presence in many rRNA genes, they have been found in protein-encoding genes, as well as in tRNA genes (Lambowitz 1993). Fungi, plants, red and green algae are known to contain 90% of all group I introns among the lineages represented in the tree of life (Haugen et al., 2005). The sporadic distribution of these mobile elements indicates

their ability to transfer horizontally into different species and genes (Shub et al., 1997). Some group I introns contain homing endonuclease genes (HEGs) within their primary structure (Golden 2008), but the vast majority of them lack HEGs (Bhattacharya et al., 2005). The ability of group I introns to spread by horizontal transfer as well as by vertical inheritance complicates the reconstruction of their evolutionary histories (Booton et al., 2004).

Recurrent gains and losses are likely to be a general feature of HEG-associated group I introns (Shub et al., 1997). Some of the species examined (e.g., the Sphyrna tiburo ITS2 intron and the Lamna nasus ITS1 intron) lack these spacer introns in some rDNA copies, indicating that the introns have been lost during evolution from some, or in some cases from all, of the repeated rDNA copies. Therefore, these can be considered optional elements. Similar cases are reported in fungi, in which some isolates have the introns and some do not (Dujon 1989, Harris and Rogers 2008, Yan et al 1995).

Group I introns may have played roles in the evolution of major groups of organisms, including fungi (Chen 2011) and some metazoans. They may have inserted successfully into the rDNA locus because it offers a larger target area, given that the genes are arranged in long tandem arrays consisting of hundreds to thousands of copies (Rogers and Bendich 1987). However, insertion sites within the locus are somewhat limited because functional rRNAs are essential for survival of the organisms. Even within the ITS regions, there are limited sites, because of selection pressure for accurate processing of the precursor rRNAs. Similarly, group I intron deletion from essential genes requires highly precise excision from the host gene; imprecise excision would have deleterious impacts on the ability of the host cell to produce rRNAs. This would minimize the chances of these molecular parasites being purged from the host genome (Raghavan and Minnick 2009). In this context, the presence of these putative introns in spacer regions of rDNA in sharks provides a means to investigate their origin and maintenance and possible impacts on the host. The fact that they are in the spacer regions also means that their insertion and excision might be less restricted than similar processes in more conserved regions (i.e., gene regions). The introns are each located in loop regions of ITS1 and ITS2 that are variable in sequence, indicating low selection pressures at those sites (Rogers, unpublished). The excision of the ITS regions is necessary for the production of mature rRNA molecules (Les and Tippery 2008). ITS regions have a functionally conserved secondary structure that enables their precise excision from the precursor RNA transcript (Van Nues et al., 1994, 1995; Joseph et al., 1999), with the aid of small nucleolar RNAs (snoRNAs). The unusual presence of introns in the ITS regions means that the introns probably need to be spliced out of the ITS regions prior to processing of the precursor rRNAs to produce the mature SSU, 5.8S and LSU rRNAs destined for the ribosomes. The results presented indicate that the introns have maintained all of the necessary regions, and thus are probably functional group I introns capable of splicing. The fact that the structures have been maintained for at least 420 million years indicates that the maintenance of their splicing activities is important to the proper processing of rRNAs, specifically as it relates to the ITS1 and ITS2 regions.

The most parsimonious explanation for the presence of introns in both the ITS1 and ITS2 regions, as found in this study, is that the introns were inserted into one of the ITS regions first, followed by insertion into the other. This was followed by independent evolution at the two sites, as is evident from the phylogenetic analysis (Figure 16). All of the introns in the ITS1 regions are monophyletic, regardless of shark taxon. A similar pattern is found for the introns in the ITS2 region (Figure 16). Furthermore, because the group I introns are found in ITS1 and ITS2 of all shark lineages, the initial insertions at both sites predated the origin and diversification of sharks.


Additionally, because of their widespread distribution and persistence in shark rDNA ITS regions there is a possibility that these introns are providing some benefits to hosts.

In the divergent shark lineages examined, conservation of characteristic features of group I introns was observed among the regions P, Q, R and S. Typically, the catalytic cores of group I introns consist of two major domains, P4-P5-P6 and P3-P7-P8-P9. The P4-P5-P6 domain is thought to structurally support the P3-P7-P8-P9 domain (Kim and Cech 1987, Michel and Westhof 1990, Wang et al 1993). However, the P4-P5-P6 domain appears to be dispensable for a functionally active intron, as seen in the PaSSU introns in the rDNA of Phialophora americana (Harris and Rogers 2008). The P3-P7-P8-P9 domain appears to play a major role in catalysis (Ikawa et al 2000, Kim and Cech 1987, Michel and Westhof 1990). The P4 region, consisting of the paired P and Q regions, is known to interact with J6/7 (the segment connecting P6 and P7) and to participate in a base triple with the J8/7 region, which is known to be phylogenetically well conserved (Tanner and Cech 1997, Yasushi 2002). Among all the stem-loop regions examined from the introns in both ITS1 and ITS2 regions, P4 was found to be the least variable (Table 6), suggesting that these putative introns are likely to be functional.

The P5 region, which lies between P4 and P6, is highly variable in introns from both ITS1 and ITS2 regions (Table 6). The variability seen in P5 is not surprising because it does not appear to be critical for catalysis, but acts primarily as a scaffold to hold other parts of the molecule. By contrast, it is surprising to see relatively high variability in P6 despite its proposed roles in the functionality of group I introns. Specifically, P6, along with P4 and P1, plays a crucial role in the tertiary contacts within the core that facilitate self-splicing (Vicens 2008). However, it remains to be tested whether all the putative introns having variable P6 regions are functional.


For the introns that possess a canonical P2 stem, P3 is needed to connect on the opposite side of the P1-P2 extended helix in certain introns (e.g., the Saccharomyces cerevisiae LSU intron; Michel and Westhof 1990). Tertiary contacts between P3 and J6/6a (along with tertiary contacts between P4 and P5) allow the P3-P7-P8-P9 domain to wrap around the P4-P5-P6 domain, thus defining the gross architecture of the active site (Golden 2008). Among introns from both ITS1 and ITS2 regions examined, P3 was the next most conserved region after P4 (Table 6). However, its role in catalysis may not be crucial based on the findings in Phialophora americana, where the functional PaSSU intron lacks a P3 region (Harris and Rogers 2008). The exogenous guanosine (which is aligned to attack the phosphodiester bond at the 5' splice site in P1) is bound within the P7 helix (Vicens and Cech 2008, Golden 2008). The P7 region is crucial in the catalysis of group I introns. The smallest functional intron known to date, which consists of 67 nt in Phialophora americana, contains only three regions: P1, P7 and P10. This underlines the vital role of the P7 region. The P7 regions were the next most conserved after P4, P3 and P8 in introns from both ITS1 and ITS2 regions (Table 6). A significant specificity for a guanosine base is provided by a G-C pair near the top of the P7 helix (Golden 2008). One of the conserved features of the P7 region in group I introns is the single nucleotide bulge, which appears to be an important facet of the guanosine binding site (Golden 2008). Although relatively higher variability was observed for the lengths of P7 in introns from both ITS1 and ITS2 (Table 6), a G-C pair at the top of the P7 region and a single nucleotide bulge were found as common features in all the putative introns (Figures 3-14). A partial reason for the lack of high sequence conservation in P7 is that, other than the requirement for a G-C pair next to an unpaired base, the remainder of the stem and loop can change. Also, for most regions of a ribozyme, the structure is more important than the sequence.


The P9 stem and loop structure with a GNRA terminal loop is known to interact with P5 (Michel and Westhof 1990). Although a P9 loop was found in all the introns from ITS1 and ITS2 (Table 6), a GNRA loop was found only in introns from ITS1 (Figures 9-14). This suggests that a GNRA loop may be dispensable for catalysis and therefore might have been lost during evolution of the introns from ITS2, possibly before their insertion into this region. The P9 region turned out to be the most variable after P5 (Table 6). This may not be surprising because P9 has not been shown to be critical for catalysis. In Azoarcus (a proteobacterium), P8 has been shown to interact with P2, which helps to clamp the P5 and P9 regions (Vicens 2006). Although P8 was found in all the putative introns examined and is fairly well conserved (Table 6), the observation that P2, P5, P9 and P8 are dispensable (Pichler and Schroeder 2002, Harris and Rogers 2008) may undermine the importance of P8 for catalysis.

The P10 region plays a vital role in the second step of splicing and acts to align the 5' and 3' splice sites (Adams et al. 2004, Harris and Rogers 2008). The conserved terminal G of the intron (ωG) in P10 (at the 3' splice site) replaces the exogenous G in the G binding site of P7 to initiate the second transesterification reaction, in which the 3' splice site is attacked by the free 3'-OH of the U at the 3' end of the 5' exon, thus splicing the exons. A G·U wobble pair near the 5' splice site in P1 and an ωG in P10 are well conserved in all of the putative introns examined (Figures 3-14). The internal guide sequence (IGS) at the 3' end of P1, which is capable of base pairing with parts of the 5' and 3' exons, facilitates splicing by bringing the 5' and 3' splice sites into close proximity. This region was well defined in all of the introns examined (Figures 3-14), except for S. lewini (JN003435) and H. griseus (JN003428) from ITS1, because of the lack of sequence information (Figures 10 and 14, respectively).


Because the putative introns from all of the divergent shark lineages examined have features that are characteristic of group I introns, it is probable that they are functional and have the ability to excise accurately from the spacer regions. They would thus remain genetically neutral, because improper processing of rRNA might cause a lethal failure of ribosome assembly and thus of protein synthesis. In vitro splicing experiments on these putative introns and demonstration of their splicing activity may confirm the first ever presence of group I introns within spacer regions. Even if they fail to splice, these insertional elements still may represent remnants of a very ancient insertional event by a group I ancestor.

Introns are highly divergent sequences and thus their conservation is unlikely across the millions and billions of years that separate taxa such as prokaryotes and eukaryotes (Bhattacharya 2005). Their sporadic distribution across the tree of life might be explained by their ability to transfer horizontally (Shub et al., 1997, Bhattacharya 2005). An investigation of the ITS regions for the presence of group I introns in the sister groups of sharks, namely the skates (also elasmobranchs) and the rat fishes (Subclass Holocephali), revealed the presence of introns in their ITS1 and ITS2 regions as well. However, they are absent in jawless fishes, which are considered the immediate ancestors of cartilaginous fishes (Figure 15). This suggests that the introns were acquired horizontally more than 420 million years ago. The presence of these introns across all Orders of sharks (including the most ancient extant shark species, Hexanchus griseus) and in the skates suggests their presence in these species before diversification of the Class Chondrichthyes. The putative intron from Hexanchus griseus, which represents the most ancient extant shark lineage tested (paleontological studies indicate its presence 150 million years ago; Cappetta 1987), was used as an outgroup (from both ITS1 and ITS2 regions) with the aim of determining whether the introns first appeared in ITS1 or in ITS2. The phylogenetic analyses of

group I introns from both ITS1 and ITS2 regions (Figures 16 and 17) within the representative sharks from the eight orders suggest that these introns appeared first in the ITS1 region, but moved very early into the ITS2 location. The higher degree of variation within most of the paired regions in group I introns from the ITS1 region as compared to those in the ITS2 region (Table 6) supports this conclusion. Also, the variation in overall length of putative group I introns is greater for those in ITS1 than for those in ITS2 (Table 6). Together, these observations indicate that these introns probably first appeared in ITS1 of a shark ancestor, and later appeared in ITS2.

The persistence of similar relationship patterns among species and the persistence of one mixed clade consisting of both ITS1 introns (Ginglymostoma cirratum, JN243354 and Chiloscyllium plagiosum, JN003434) and ITS2 introns (Rhincodon typus, JN003442 and Ginglymostoma cirratum, JN003433) in the rooted trees (introns from ITS1 and ITS2 of Hexanchus griseus as outgroup) and the unrooted tree (Figures 16, 17 and 15, respectively) suggest that the introns might have been introduced into one of the members of this mixed clade at a later time and might have diverged separately.

Sharks are known to have a long evolutionary history, and some are of commercial importance (Velez-Zuazo and Agnarsson 2011). Many are listed as either endangered or threatened. However, the phylogenetic relationships within sharks are poorly understood. Information about the evolutionary history of species and their relationships with their relatives can influence the planning of conservation strategies, especially for those taxa that are under some level of threat (Velez-Zuazo and Agnarsson 2011). The phylogenetic results for these spacer introns are in good agreement with the conclusions on shark evolution reached by others (Velez-Zuazo and Agnarsson 2011, Heinicke 2009, and Human et al., 2006). The tree topology determined from the intron sequences (Figures 15, 16 and 17) is similar to those of Velez-Zuazo and Agnarsson (2011; which used four mitochondrial genes and one nuclear gene with 229 species of sharks from all eight orders and 31 families), Heinicke (2009; which used the nuclear protein-coding RAG1 gene and the mitochondrial 12S and 16S rRNA genes), and Human et al. (2006; which used three mitochondrial genes: cytb, NADH2 and NADH4). The tree topologies of Human et al. (2006) and Velez-Zuazo and Agnarsson (2011), with skates and rays as the root, indicate Hexanchiformes as the sister taxon to the remaining orders. Similarities between the phylogenetic trees based on intron sequences and those based on other genes indicate that these introns have been evolving in parallel with the shark species and probably have existed since before or during the inception of the Class Chondrichthyes. Apart from their biological and evolutionary significance, the variation in these sequences may potentially be exploited as species-specific markers for shark identification (Walker et al., 1999).

The likely sequence of evolutionary events for these ITS1 and ITS2 introns in sharks is:

1. A mobile group I intron (with functional HEG) inserted into the ITS1 region of a shark ancestor prior to the diversification of the Class Chondrichthyes (420 million years ago).

2. Some time later, a second insertion event occurred, this time into the ITS2 region. This may have been via horizontal transfer or via transposition of an ITS1 intron.

3. The introns lost their abilities to move, but retained their splicing functions.

4. The introns in ITS1 evolved independently of those in ITS2 as the sharks diverged.

5. In some lineages, the introns were lost (either partially or completely) from either the ITS1 or ITS2 regions. Recombination and gene conversion, which are active in the rDNA locus, are possible mechanisms for the losses of these introns.

[Diagram of the rDNA region in the order SSU, ITS1, 5.8S, ITS2, LSU. Primers ITS-1F and 2R flank ITS1; primers Fish 5.8F and Fish 28S-R flank ITS2.]

Figure 1 Amplification of ITS1 and ITS2 sequences

The primers used for ITS1 amplification are primer ITS 1F and primer 2 (reverse). The primers used for ITS2 amplification are primers Fish 5.8S F and Fish 28S R.

[Plasmid map: pCR 2.1-TOPO, 3.9 kb, showing the PCR product insertion site flanked by two EcoRI sites, the M13 reverse primer binding site, the T7 promoter, and the ampicillin resistance gene.]

Figure 2 Plasmid with PCR product insertion site for sequencing and transcription

The plasmid is supplied by Invitrogen Corp., Carlsbad, CA. It allows sequencing with the M13 reverse primer and the T7 promoter primer, which are supplied by many sequencing companies. It also allows the presence of the insert to be checked by restriction digestion with EcoRI, and the inserted product to be transcribed by T7 polymerase.
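As an illustration of the EcoRI check described above, the following minimal Python sketch predicts the fragment sizes expected from a complete EcoRI digest of a circular plasmid. It assumes only that EcoRI recognizes GAATTC and cuts after the first G. The function names are hypothetical, and any sequence passed to them here is a toy example, not the pCR 2.1-TOPO sequence. Recognition sites that span the point at which the circular sequence is linearized are not detected in this sketch.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence; the enzyme cuts after the first G


def ecori_cut_positions(seq):
    """Return the 0-based index of the first base after each EcoRI cut."""
    seq = seq.upper()
    positions = []
    start = seq.find(ECORI_SITE)
    while start != -1:
        positions.append(start + 1)  # cut lies between G and AATTC
        start = seq.find(ECORI_SITE, start + 1)
    return positions


def circular_fragment_lengths(seq):
    """Fragment lengths (largest first) from a complete digest of a circular molecule."""
    cuts = ecori_cut_positions(seq)
    n = len(seq)
    if not cuts:
        return [n]  # no site: the circle stays uncut
    lengths = []
    for i, cut in enumerate(cuts):
        nxt = cuts[(i + 1) % len(cuts)]
        # Distance to the next cut, wrapping around the circle;
        # a single cut linearizes the molecule and gives the full length.
        lengths.append((nxt - cut) % n or n)
    return sorted(lengths, reverse=True)
```

For a circular sequence in which two EcoRI sites flank an insert, the digest yields one fragment for the vector backbone and one spanning the insert; the appearance of this second band is what the restriction check looks for.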

Figure 3 Secondary structure of the ITS2 intron from Lamna nasus

Secondary structure of the ITS2 intron from Lamna nasus (total length 490 nt; JN003440), Isurus oxyrinchus (I.o) (JN003441) and Isurus paucus (I.p) (AF515443.1). The main structure is for L. nasus, while differences are indicated by arrows and by nucleotide insertions, deletions (del) and substitutions. The structure shows the characteristic group I intron form, with P4-P5-P6 (left side), P9-P7-P3-P8 (right side), and the P1 and P10 regions (middle).

Intron sequences are shown in upper case and exon sequences in lower case. Dashed arrows indicate joined regions, with arrowheads indicating the 5' to 3' direction.
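The paired regions (P1 to P10) labeled in these structures are helices in which two stretches of sequence pair in antiparallel orientation. As a minimal sketch, assuming that only Watson-Crick pairs (G-C, A-U) and G-U wobble pairs count as paired, the hypothetical helper below counts how many positions of two equal-length strands can pair. It illustrates the pairing rule only and is not the method used to predict the structures in this study.

```python
# Pairs allowed in an RNA helix: Watson-Crick pairs plus the G-U wobble pair.
ALLOWED_PAIRS = {
    ("G", "C"), ("C", "G"),
    ("A", "U"), ("U", "A"),
    ("G", "U"), ("U", "G"),
}


def count_pairs(strand_a, strand_b):
    """Count allowed pairs between two strands, both written 5' to 3'.

    strand_b is reversed so that the strands are aligned antiparallel.
    Returns (number of paired positions, helix length).
    """
    a = strand_a.upper()
    b = strand_b.upper()[::-1]
    if len(a) != len(b):
        raise ValueError("strands must have the same length")
    paired = sum((x, y) in ALLOWED_PAIRS for x, y in zip(a, b))
    return paired, len(a)
```

For example, the strands GGU and GCC (both written 5' to 3') pair at all three positions, the last of which is a U-G wobble pair.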

Figure 4 Secondary structure of the ITS2 intron from Hexanchus griseus

Secondary structure of the ITS2 intron from Hexanchus griseus (total length 325 nt; JN003446) and Notorynchus cepedianus (N.c) (JN003447). The main structure is for Hexanchus griseus, while differences are indicated by arrows and by nucleotide insertions, deletions (del) and substitutions. The structure shows the characteristic group I intron form, with P4-P5-P6 (left side), P9-P7-P3-P8 (right side), and the P1 and P10 regions (middle). Intron sequences are shown in upper case and exon sequences in lower case. Dashed arrows indicate joined regions, with arrowheads indicating the 5' to 3' direction.

Figure 5 Secondary structure of the ITS2 intron from Heterodontus francisci

Secondary structure of the ITS2 intron from Heterodontus francisci (total length 438 nt; JN003444). The structure shows the characteristic group I intron form, with P4-P5-P6 (left side), P9-P7-P3-P8 (right side), and the P1 and P10 regions (middle).

Intron sequences are shown in upper case and exon sequences in lower case. Dashed arrows indicate joined regions, with arrowheads indicating the 5' to 3' direction.

Figure 6 Secondary structure of the ITS2 intron from Squalus acanthias

Secondary structure of the ITS2 intron from Squalus acanthias (total length 380 nt; JN003426), Squatina californica (S.c) (JN003443) and Pristiophorus nudipinnis (P.n) (JN003449). The main structure is for Squalus acanthias, while differences are indicated by arrows and by nucleotide insertions, deletions (del) and substitutions. The structure shows the characteristic group I intron form, with P4-P5-P6 (left side), P9-P7-P3-P8 (right side), and the P1 and P10 regions (middle). Intron sequences are shown in upper case and exon sequences in lower case.

Dashed arrows indicate joined regions, with arrowheads indicating the 5' to 3' direction.

Figure 7 Secondary structure of the ITS2 intron from Carcharhinus brevipinna

Secondary structure of the ITS2 intron from Carcharhinus brevipinna (total length 522 nt; JN039365), Blacknose (BN) (GU385345.1), Bull (Bul) (JN003437), Dusky (Dus) (AY033819.1), Blacktip (BT) (JN003438), Sandbar (San) (AY033820.1) and Blue (Blu) (JN003439). The main structure is for Carcharhinus brevipinna, while differences are indicated by arrows and by nucleotide insertions, deletions (del) and substitutions. The structure shows the characteristic group I intron form, with P4-P5-P6 (left side), P9-P7-P3-P8 (right side), and the P1 and P10 regions (middle). Intron sequences are shown in upper case and exon sequences in lower case. Dashed arrows indicate joined regions, with arrowheads indicating the 5' to 3' direction.

Figure 8 Secondary structure of the ITS2 intron from Rhincodon typus

Secondary structure of the ITS2 intron from Rhincodon typus (total length 497 nt; JN003442). The structure shows the characteristic group I intron form, with P4-P5-P6 (left side), P9-P7-P3-P8 (right side), and the P1-P2 and P10 regions (middle). Intron sequences are shown in upper case and exon sequences in lower case. Dashed arrows indicate joined regions, with arrowheads indicating the 5' to 3' direction.

Figure 9 Secondary structure of the ITS1 intron from Ginglymostoma cirratum

Secondary structure of the ITS1 intron from Ginglymostoma cirratum (total length 483 nt; JN003433) and Chiloscyllium plagiosum (C.p) (JN003434). The main structure is for Ginglymostoma cirratum, while differences are indicated by arrows and by nucleotide insertions, deletions (del) and substitutions. The structure shows the characteristic group I intron form, with P4-P5-P6 (left side), P9-P7-P3-P8 (right side), and the P1 and P10 regions (middle). Intron sequences are shown in upper case and exon sequences in lower case.

Dashed arrows indicate joined regions, with arrowheads indicating the 5' to 3' direction.

Figure 10 Secondary structure of the ITS1 intron from Sphyrna lewini

Secondary structure of the ITS1 intron from Sphyrna lewini (total length 495 nt; JN003435) and Carcharhinus leucas (C.l) (JN003436). The main structure is for Sphyrna lewini, while differences are indicated by arrows and by nucleotide insertions, deletions (del) and substitutions. The structure shows the characteristic group I intron form, with P4-P5-P6 (left side), P9-P7-P3-P8 (right side), and the P1-P2 and P10 regions (middle). Intron sequences are shown in upper case and exon sequences in lower case. Dashed arrows indicate joined regions, with arrowheads indicating the 5' to 3' direction. The sequences in the black box are undetermined.

Figure 11 Secondary structure of the ITS1 intron from Lamna nasus

Secondary structure of the ITS1 intron from Lamna nasus (total length 704 nt; JN003431). The structure shows the characteristic group I intron form, with P4-P5-P6 (left side), P9-P7-P3-P8 (right side), and the P1-P2 and P10 regions (middle). Intron sequences are shown in upper case and exon sequences in lower case. Dashed arrows indicate joined regions, with arrowheads indicating the 5' to 3' direction.

Figure 12 Secondary structure of the ITS1 intron from Isurus oxyrinchus

Secondary structure of the ITS1 intron from Isurus oxyrinchus (total length 496 nt; JN003432). The structure shows the characteristic group I intron form, with P4-P5-P6 (left side), P9-P7-P3-P8 (right side), and the P1 and P10 regions (middle). Intron sequences are shown in upper case and exon sequences in lower case. Dashed arrows indicate joined regions, with arrowheads indicating the 5' to 3' direction.

Figure 13 Secondary structure of the ITS1 intron from Heterodontus francisci

Secondary structure of the ITS1 intron from Heterodontus francisci (total length 703 nt; JN003427). The structure shows the characteristic group I intron form, with P4-P5-P6 (left side), P9-P7-P3-P8 (right side), and the P1-P2 and P10 regions (middle). Intron sequences are shown in upper case and exon sequences in lower case. Dashed arrows indicate joined regions, with arrowheads indicating the 5' to 3' direction.

Figure 14 [Secondary structure diagram with the paired regions P1 to P10 and the IGS labeled; differences are annotated for a second sequence abbreviated N.c.]


Figure 15 Unrooted most parsimonious tree of putative group I introns from both ITS1 and ITS2 regions, showing the two major clades consisting of ITS1 and ITS2 introns, respectively. In the upper part of the phylogram are all putative intron sequences from ITS2, while in the lower part of the phylogram are all putative intron sequences from ITS1. The numbers on the branches are the bootstrap values (for branches supported by more than 50%).

[Phylogram with groups labeled ITS 2 and ITS 1; scale bar: 50 changes. Tip labels: Lamna nasus JN003440, Isurus oxyrinchus JN003441, Lamna nasus AF515444, Carcharodon carcharias AY198335, Isurus paucus AF515443, Heterodontus francisci JN003444, Pristiophorus nudipinnis JN003449, Squatina californica JN003443, Hexanchus griseus JN003446, Pristiophorus japonicus JN003448, Notorynchus cepedianus JN003447, Squalus acanthias JN003445, Carcharhinus leucas JN003437, Carcharhinus leucas JN039366, Carcharhinus limbatus JN003438, Carcharhinus brevipinna JN039367, Carcharhinus plumbeus AY033820.1, Carcharhinus acronotus GU385345.1, Carcharhinus obscurus AY033819.1, Prionace glauca JN003439, Ginglymostoma cirratum JN003433, Prionace glauca AF515441.1, Carcharhinus falciformis AF513986.1, Rhincodon typus JN003442, Cephaloscyllium ventriosum JN039367, Sphyrna lewini JN003435, Chiloscyllium plagiosum JN003434, Carcharhinus leucas JN003436, Ginglymostoma cirratum JN243354, Pristiophorus japonicus JN003431, Lamna nasus JN003431, Squalus acanthias JN003426, Isurus oxyrinchus JN003432, Squatina californica JF906176, Notorynchus cepedianus JN003429, Heterodontus francisci JN003427, Hexanchus griseus JN003428.]
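The bootstrap values on the branches can be read as the percentage of bootstrap replicate trees in which a given clade was recovered. The following Python sketch shows that calculation together with the 50% reporting threshold. The input format (each replicate given as a list of clades, each clade a group of taxon names) and the function names are assumptions made for illustration; the sketch does not reproduce the phylogenetic software used to build these trees.

```python
from collections import Counter


def clade_support(replicate_trees):
    """Percentage of bootstrap replicates in which each clade appears.

    Each replicate is an iterable of clades, and each clade is an
    iterable of taxon names (a hypothetical input format).
    """
    counts = Counter()
    for clades in replicate_trees:
        # A set ensures that a clade is counted once per replicate.
        counts.update({frozenset(clade) for clade in clades})
    n = len(replicate_trees)
    return {clade: 100.0 * k / n for clade, k in counts.items()}


def supported(support, threshold=50.0):
    """Keep only the clades whose support exceeds the threshold."""
    return {clade: pct for clade, pct in support.items() if pct > threshold}
```

A clade recovered in three of four replicates would therefore be labeled 75, while a clade recovered in only one of the four (25%) would be left unlabeled.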

Figure 16 Rooted phylogram (most parsimonious) of putative introns from both ITS1 and ITS2 regions, showing the separate grouping of ITS1 introns (lower part of the phylogram) and ITS2 introns (upper part of the phylogram). The tree is rooted using the ITS1 intron sequence from Hexanchus griseus as the outgroup. The numbers on the branches indicate the bootstrap values (for those supported by more than 50%).

[Phylogram with groups labeled ITS 2 and ITS 1; scale bar: 50 changes. Tip labels: Lamna nasus JN003440, Lamna nasus AF515444, Isurus oxyrinchus JN003441, Isurus paucus AF515443, Carcharodon carcharias AY198335, Hexanchus griseus JN003446, Notorynchus cepedianus JN003447, Heterodontus francisci JN003444, Pristiophorus japonicus JN003448, Pristiophorus nudipinnis JN003449, Squalus acanthias JN003445, Squatina californica JN003443, Carcharhinus leucas JN003437, Carcharhinus leucas JN039366, Carcharhinus limbatus JN003438, Carcharhinus brevipinna JN039367, Carcharhinus plumbeus AY033820.1, Carcharhinus acronotus GU385345.1, Prionace glauca JN003439, Prionace glauca AF515441.1, Carcharhinus falciformis AF513986.1, Carcharhinus obscurus AY033819.1, Cephaloscyllium ventriosum JN039367, Rhincodon typus JN003442, Ginglymostoma cirratum JN003433, Ginglymostoma cirratum JN003433, Chiloscyllium plagiosum JN003434, Sphyrna lewini JN003435, Carcharhinus leucas JN003436, Lamna nasus JN003431, Isurus oxyrinchus JN003432, Squatina californica JF906176, Squalus acanthias JN003426, Pristiophorus japonicus JN003431, Heterodontus francisci JN003427, Hexanchus griseus JN003428, Notorynchus cepedianus JN003429.]

Figure 17 Rooted phylogram (most parsimonious) of putative intron sequences from both ITS1 and ITS2 regions, showing the separate grouping of ITS1 and ITS2 introns. The tree is rooted using the ITS2 intron sequence from Hexanchus griseus as the outgroup. The numbers on the branches indicate the bootstrap values (for those supported by more than 50%).

[Phylogram with groups labeled, from top to bottom, ITS 2, ITS 1, ITS 2, ITS 1 and ITS2; scale bar: 50 changes. Tip labels: Lamna nasus JN003440, Lamna nasus AF515444, Isurus oxyrinchus JN003441, Isurus paucus AF515443, Carcharodon carcharias AY198335, Pristiophorus japonicus JN003448, Pristiophorus nudipinnis JN003449, Squalus acanthias JN003445, Squatina californica JN003443, Carcharhinus leucas JN003437, Carcharhinus leucas JN039366, Carcharhinus limbatus JN003438, Carcharhinus brevipinna JN039367, Carcharhinus plumbeus AY033820.1, Carcharhinus acronotus GU385345.1, Prionace glauca JN003439, Prionace glauca AF515441.1, Carcharhinus falciformis AF513986.1, Carcharhinus obscurus AY033819.1, Cephaloscyllium ventriosum JN039367, Hexanchus griseus JN003428, Notorynchus cepedianus JN003429, Heterodontus francisci JN003427, Squatina californica JF906176, Squalus acanthias JN003426, Pristiophorus japonicus JN003431, Lamna nasus JN003431, Isurus oxyrinchus JN003432, Sphyrna lewini JN003435, Carcharhinus leucas JN003436, Rhincodon typus JN003442, Ginglymostoma cirratum JN003433, Ginglymostoma cirratum JN243354, Chiloscyllium plagiosum JN003434, Heterodontus francisci JN003444, Hexanchus griseus JN003446, Notorynchus cepedianus JN003447.]

Figure 18 Most parsimonious tree of putative introns from both ITS1 and ITS2 regions, showing the separate grouping of ITS1 introns (lower part of the phylogram) and ITS2 introns (upper part of the phylogram). The tree was divided into 13 clades (numbers indicated on the tree), and a representative secondary structure was made for each clade; arrows point from each clade to its structure.

[Phylogram with clades numbered 1 to 13 and two positions marked "undetermined"; scale bar: 50 changes. Tip labels: Lamna nasus JN003440, Lamna nasus AF515444, Isurus oxyrinchus JN003441, Isurus paucus AF515443, Carcharodon carcharias AY198335, Hexanchus griseus JN003446, Notorynchus cepedianus JN003447, Heterodontus francisci JN003444, Pristiophorus japonicus JN003448, Pristiophorus nudipinnis JN003449, Squalus acanthias JN003445, Squatina californica JN003443, Carcharhinus leucas JN003437, Carcharhinus leucas JN039366, Carcharhinus limbatus JN003438, Carcharhinus brevipinna JN039367, Carcharhinus plumbeus AY033820.1, Carcharhinus acronotus GU385345.1, Prionace glauca JN003439, Prionace glauca AF515441.1, Carcharhinus falciformis AF513986.1, Carcharhinus obscurus AY033819.1, Cephaloscyllium ventriosum JN039367, Rhincodon typus JN003442, Ginglymostoma cirratum JN003433, Ginglymostoma cirratum JN003433, Chiloscyllium plagiosum JN003434, Sphyrna lewini JN003435, Carcharhinus leucas JN003436, Lamna nasus JN003431, Isurus oxyrinchus JN003432, Squatina californica JF906176, Squalus acanthias JN003426, Pristiophorus japonicus JN003431, Heterodontus francisci JN003427, Isurus oxyrinchus JN003432, Hexanchus griseus JN003428, Notorynchus cepedianus JN003429.]

References

Adams PL, Stahley MR, Gill ML, Kosek AB, Wang J, Strobel SA. 2004. Crystal structure of a

group I intron splicing intermediate. RNA 10:1867-1887.

Allen TB. 1999. The Shark Almanac. Lyons Press, New York, NY.

Baldwin BG. 1992. Phylogenetic utility of the internal transcribed spacers of nuclear ribosomal

DNA in plants: an example from the Compositae. Mol. Phylogenet. Evol. 1:3-16.

Beagley CT, Okada N, Wolstenholme DR. 1996. Two mitochondrial group I introns in a

metazoan, the sea anemone Metridium senile: one intron contains genes for subunits 1

and 3 of NADH dehydrogenase. Proc. Natl. Acad. Sci. 93: 5619–5623.

Campbell CS, Donoghue MJ, Baldwin BG, and Wojciechowski ME. 1995. Phylogenetic

relationships in Maloideae (Rosaceae): evidence from sequences of the internal

transcribed spacers of nuclear ribosomal DNA and its congruence with morphology.

Amer. J. Bot. 82: 903-918.

Bhattacharya D, Damberger S, Surek B, Melkonian M. 1996. Primary and secondary structure

analyses of the rDNA group I introns of the Zygnematales (Charophyta). Curr. Genet. 29:

282-286.

Bhattacharya D, Friedl T, Damberger S. 1996. Nuclear-encoded rDNA group I introns: origin

and phylogenetic relationships of insertion site lineages in the green algae. Mol. Biol.

Evol. 13:978-989.

Bhattacharya D, Lutzoni F, Reeb V, Simon D, Nason J, Fernandez F. 2000. Widespread

occurrence of spliceosomal introns in the rDNA genes of ascomycetes. Mol. Biol. Evol.

17:1971-84.

Blake CC. 1979. Exons encode protein functional units. Nature 277: 598.

Booton GC, Floyd GL, Fuerst PA. 2004. Multiple group I introns detected in the nuclear small

subunit rDNA of the autosporic green alga Selenastrum capricornutum. Curr. Genet.

46:228-34.

Burger G. 1998. Genome structure and gene content in protist mitochondrial DNAs. Nucleic

Acids Res. 26:865–878.

Burke JM. 1988. Molecular genetics of group I introns: RNA structures and protein factors

required for splicing. Gene 73:273-294.

Cappetta H. 1980. The selachians from the Upper Cretaceous of Mount Lebanon. Sharks.

Palaeontographica (A) 168:69-148.

Cappetta H. 1987. Chondrichthyes II, Mesozoic and Cenozoic Elasmobranchii. In: HP Schultze,

ed., Handbook of Paleoichthyology vol. 3B, Gustav Fischer Verlag, Stuttgart, pp. 1-193.

Carrier JC, Musick JA, Heithaus MR. 2004. Biology of Sharks and Their Relatives. CRC Press,

Boca Raton, FL.

Carrier JC, Musick JA, Heithaus MR. 2010. Sharks and Their Relatives II. CRC Press, Boca

Raton, FL.

Cech TR. 1988. Conserved sequences and structures of group I introns: building an active site

for RNA catalysis. Gene 73:259-271.

Cech TR. 1990. Self-splicing of group I introns. Annu. Rev. Biochem. 59:543-568.

Cech TR, Damberger SH, Gutell RR. 1994. Representation of the secondary and tertiary

structure of group I introns. Nat. Struct. Biol. 1:273-280.

Cech TR, Zaug AJ, Grabowski PJ. 1981. In vitro splicing of the ribosomal RNA precursor of

Tetrahymena: involvement of a guanosine nucleotide in the excision of the intervening

sequence. Cell 27:487-496.

Chauhan S, Woodson SA. 2008. Tertiary interactions determine the accuracy of RNA

folding. J. Amer. Chem. Soc. 130:1296-1303.

Cho Y, Qiu YL, Kuhlman P, Palmer JD. 1998. Explosive invasion of plant mitochondria by a

group I intron. Proc. Natl. Acad. Sci. USA 95:14244-14249.

Chen X. 2010. Evolution of group I introns in the nuclear ribosomal RNA genes of

Dothideomycetes. M.S. thesis, Bowling Green State University, Bowling Green, OH.

Colleaux L, D'Auriol L, Galibert F, Dujon B. 1988. Recognition and cleavage site of the intron-encoded omega transposase. Proc. Natl. Acad. Sci. USA 85:6022–6026.

Compagno LJV. 2001. Sharks of the world. An annotated and illustrated catalogue of shark

species known to date. Bullhead, mackerel and carpet sharks (Heterodontiformes,

Lamniformes and Orectolobiformes). FAO Species Catalogue for Fishery Purposes.

Rome, FAO. 2: 1-269.

Compagno LJV. 1984. FAO Species Catalogue. Sharks of the world. An annotated and

illustrated catalogue of shark species known to date. Part 1 - Hexanchiformes to

Lamniformes. FAO Fish. Synop. 4:1-249.

Lilley DMJ, Eckstein F. 2008. Ribozymes and RNA Catalysis. RSC Biomolec. Sci. Royal Soc.

Chem. pp. 178-198.

Dujon B. 1989. Group I introns as mobile genetic elements: facts and mechanistic speculations.

Gene 82:91-114.

Duncan CD, Weeks KM. 2010. The Mrs1 splicing factor binds the bI3 group I intron at each of

two tetraloop-receptor motifs. PLoS ONE 5:e8983.

Gottfried MD, Fordyce RE. 2001. An associated specimen of Carcharodon angustidens

(Chondrichthyes, Lamnidae) from the Late Oligocene of New Zealand, with comments

on lamnid interrelationships. J. Vert. Paleontol. 21:730-739.

Gargas A, DePriest PT, Taylor JW. 1995. Positions of multiple insertions in SSU rDNA of

lichen-forming fungi. Mol. Biol. Evol. 12:208-218.

Gray MW, Lang BF, Cedergren R, Golding GB, Lemieux C, Sankoff D, Turmel M, Brossard N,

Delage E, Littlejohn TG, Plante I, Rioux P, Saint-Louis D, Zhu Y, Burger G. 1998.

Genome structure and gene content in protist mitochondrial DNAs. Nucleic Acids Res.

26:865-78.

Grube M, Gargas A, DePriest PT. 1996. A small insertion in the SSU rDNA of the lichen fungus

Arthonia lapidicola is a degenerate group-I intron. Curr Genet. 29:582-6.

Golden BL. 2008. Group I Introns: Biochemical and Crystallographic Characterization of the

Active Site Structure. In: Ribozymes and RNA Catalysis, RSC Publishing, pp. 178-201.

Harris DJ, Crandall KA. 2000. Intragenomic variation within ITS1 and ITS2 of freshwater

crayfishes (Decapoda: Cambaridae): implications for phylogenetic and microsatellite

studies. Mol Biol Evol. 17: 284-91.

Harris L, Rogers SO. 2008. Splicing and evolution of an unusually small group I intron. Curr

Genet. 54:213-22.

Heinicke MP, Naylor GJP, Hedges SB. 2009. Cartilaginous fishes (Chondrichthyes). In: Hedges

SB, Kumar S (Eds.), The Timetree of Life. Oxford University Press, New York, p. 320.

Haugen P, Simon DM, Bhattacharya D. 2005. The natural history of group I introns. Trends

Genet. 21:111-9.

Human BA, Owen EP, Compagno LJV, Harley EH. 2006. Testing morphologically based

phylogenetic theories within the cartilaginous fishes with molecular data, with special

reference to the catshark family (Chondrichthyes: Scyliorhinidae) and the

interrelationships within them. Mol. Phylogenet. Evol. 39:384–391.

Ikawa Y, Shiraishi H, Inoue T. 2000. Minimal catalytic domain of a group I self-splicing intron

RNA. Nat. Struct. Biol. 7:1032-5.

Joseph N, Krauskopf E, Vera MI, Michot B. 1999. Ribosomal internal transcribed spacer 2

(ITS2) exhibits a common core of secondary structure in vertebrates and yeast. Nucleic

Acids Res. 27:4533–4540.

Joyce GF, Van der Horst G, Inoue T. 1989. Catalytic activity is retained in the Tetrahymena

group I intron despite removal of the large extension of element P5. Nucleic Acids Res.

17:7879-7889.

Takizawa K, Hashizume T, Kamei K. 2011. Occurrence and characteristics of group 1 introns

found at three different positions within the 28S ribosomal RNA gene of the

dematiaceous Phialophora verrucosa: phylogenetic and secondary structural implications.

BMC Microbiol. 11:94.

Kim SH, Cech TR. 1987. Three-dimensional model of the active site of the self-splicing rRNA

precursor of Tetrahymena. Proc. Natl. Acad. Sci. USA 84:8788-92.

Koonin EV. 2006. The origin of introns and their role in eukaryogenesis: a compromise solution

to the introns-early versus introns-late debate. Biol. Direct 1:22.

Kuperus WR, Chapco W. 1994. Usefulness of internal transcribed spacer regions of ribosomal

DNA in melanopline (Orthoptera, Acrididae) systematics. Ann. Entomol. Soc. Amer.

87:751-754.

Kuhsel MG, Strickland R, Palmer JD. 1990. An ancient group I intron shared by eubacteria and

chloroplasts. Science 250:1570-1573.

Lambowitz AM, Belfort M. 1993. Introns as mobile genetic elements. Annu. Rev. Biochem.

62:587–622.

Li Z, Zhang Y. 2005. Predicting the secondary structures and tertiary interactions of 211 group I

introns in IE subgroup. Nucleic Acids Res. 33:2118-2128.

Lykke-Andersen J, Aagaard C, Semionenkov M, Garrett RA. 1997. Archaeal introns: splicing,

intercellular mobility and evolution. Trends Biochem. Sci. 22: 326-331.

Martin RA. 2003. “Biology of Sharks and Rays” online course.

www.elasmo-research.org/copyright.htm

Mavridou A, Cannone J, Typas MA. 2000. Identification of group-I introns at three different

positions within the 28S rDNA gene of the entomopathogenic fungus Metarhizium

anisopliae var. anisopliae. Fungal Genet. Biol. 31:79-90.

McLain DK, Wesson DM, Oliver JH Jr, Collins FH. 1995. Variation in ribosomal DNA internal

transcribed spacers 1 among Eastern populations of Ixodes scapularis (Acari: Ixodidae).

J. Med. Entomol. 32:353-360.

Michel F, Westhof E. 1990. Modelling of the three-dimensional architecture of group I catalytic

introns based on comparative sequence analysis. J. Mol. Biol. 216:585-610.

Muller KM, Sheath RG, Vis ML, Crease TJ, Cole KM. 1998. Biogeography and systematics of

Bangia (Bangiales, Rhodophyta) based on the Rubisco spacer, rbcL gene and 18S rRNA

gene sequences and morphometric analyses. 1. North America. Phycologia 37:195-207.

Neuvéglise C, Brygoo Y, Riba G. 1997. 28S rDNA group-I introns: a powerful tool for

identifying strains of Beauveria brongniartii. Mol. Ecol. 6:373-81.

Odorico DM, Miller DJ. 1997. Variation in the ribosomal internal transcribed spacers and

5.8S rDNA among five species of Acropora (Cnidaria; Scleractinia): patterns of variation

consistent with reticulate evolution. Mol. Biol. Evol. 14:465-473.

Oliveira MC, Kurniawan J, Bird CJ, Rice EL, Murphy CA, Singh RK, Gutell RR, Ragan MA.

1995. A preliminary investigation of the order Bangiales (Bangiophycidae, Rhodophyta)

based on sequences of nuclear small subunit ribosomal RNA genes. Phycol. Res. 43:71-79.

Oliveira MC, Ragan MA. 1994. Variant forms of a group-I intron in nuclear small-subunit rRNA

genes of the marine red alga Porphyra spiralis var. amplifolia. Mol. Biol. Evol. 11:195-207.

Pank M, Stanhope M, Natanson L, Kohler N, Shivji M. 2001. Rapid and simultaneous

identification of body parts from the morphologically similar sharks Carcharhinus

obscurus and Carcharhinus plumbeus (Carcharhinidae) using multiplex PCR. Mar.

Biotechnol. 3:231-240.

Pichler A, Schroeder R. 2002. Folding problems of the 5' splice site containing the P1 stem of

the group I thymidylate synthase intron: substrate binding inhibition in vitro and mis-

splicing in vivo. J. Biol. Chem. 277:17987-17993.

Rogers SO, Yan ZH, Shinohara M, LoBuglio KF, Wang CJK. 1993. Messenger RNA intron in

the nuclear 18S ribosomal RNA gene of deuteromycetes. Curr. Genet. 23:338-342.

Rogers SO, Bendich AJ. 1985. Extraction of DNA from milligram amounts of fresh, herbarium

and mummified plant tissues. Plant Mol. Biol. 5:69-76.

Rogers SO, Bendich AJ. 1994. Extraction of total cellular DNA from plants, algae and fungi. In:

SB Gelvin, RA Schilperoort (eds) Plant Molecular Biology Manual, 2nd Edition, Kluwer

Academic Press, Dordrecht, The Netherlands, pp. D1:1-8.

Rumfelt LL, Lohr RL, Dooley H, Flajnik MF. 2004. Diversity and repertoire of IgW and IgM VH

families in the newborn nurse shark.

BMC Immunol 5:8.

Raghavan R, Minnick MF. 2009. Group I introns and inteins: disparate origins but convergent

parasitic strategies. J. Bacteriol. 191:6193-202.

Schilthuizen M, Street GT, Coull BC, Chandler GT, Quattro JM. 1999. Molecular population

structure of the marine benthic copepod Microarthridion littorale along the southeastern

and Gulf coasts of the USA. Mar. Biol. 135:399-405.

Shivji MC, Tagliaro L, Natanson N, Kohler, Rogers SO, Stanhope M. 1996. Utility of

ribosomal DNA ITS2 for deriving shark species-diagnostic identification markers.

Aquaculture Biotechnology Symposium Proceedings, Physiological Section, American

Fisheries Society, pp. 87-93.

Shivji M, Rogers SO and Stanhope M. 1996. Molecular studies on sharks. Shark Tagger 1995

Summary, NOAA. p. 15.

Shivji MS, Stanhope MJ and Rogers SO. 1997. Group I introns ("spintrons") are present in

shark nuclear ribosomal DNA internal transcribed spacers. American Elasmobranch

Society Annual Meetings. Seattle, WA.

Simon D, Moline J, Helms G, Friedl T, Bhattacharya D. 2005. Divergent histories of rDNA

group I introns in the lichen family Physciaceae. J. Mol. Evol. 60:434-46.

Shub DA, Gott JM, Xu MQ, Lang BF, Michel F, Tomaschewski J, Pedersen-Lane J, Belfort M.

1988. Structural conservation among three homologous introns of bacteriophage T4 and

the group I introns of eukaryotes. Proc. Natl. Acad. Sci. 85:1151-5.

Sandegren L, Sjöberg BM. 2007. Self-splicing of the bacteriophage T4 group I introns requires

efficient translation of the pre-mRNA in vivo and correlates with the growth state of the

infected bacterium. J. Bacteriol. 189:980-90.

Swofford DL. 2001. PAUP: a phylogenetic analysis using parsimony (and other methods).

Version 4.0b10. Sinauer Associates, Sunderland, Mass.

Swofford DL, Olsen GJ, Waddell PJ, Hillis DM. 1996. Phylogenetic inference. In: Molecular

systematics (D.M. Hillis, C. Moritz and B.K. Mable, eds.). Sinauer, Sunderland,

Massachusetts. 14: 407-5

Strauss-Soukup JK, Strobel SA. 2000. A chemical phylogeny of group I introns based upon

interference mapping of a bacterial ribozyme. J. Mol. Biol. 302:339-58.

Shinohara ML, LoBuglio KF, Rogers SO. 1996. Group-I intron family in the nuclear ribosomal

RNA small subunit gene of Cenococcum geophilum isolates. Curr. Genet. 29:377-387.

Tanner MA, Cech TR. 1997. Joining the two domains of a group I ribozyme to form the catalytic

core. Science 275:847-9.

Tropp BE. 2008. Molecular biology: genes to proteins. 3rd edition. Jones and Bartlett

Publishers, Sudbury, Massachusetts, USA.

Tippery NP, Les DH. 2008. Phylogenetic analysis of the internal transcribed spacer (ITS) region

in Menyanthaceae using predicted secondary structure. Mol. Phylogenet. Evol. 49:526-537.

Tocchini-Valentini GD, Fruscoloni P, Tocchini-Valentini GP. 2011. Evolution of introns in the

archaeal world. Proc. Natl. Acad. Sci. USA 108:4782-7.

Vicens Q, Paukstelis PJ, Westhof E, Lambowitz AM, Cech TR. 2008. Toward predicting self-

splicing and protein-facilitated splicing of group I introns. RNA 14:2013-2029.

Wang JF, Downs WD, Cech TR. 1993. Movement of the guide sequence during RNA catalysis

by a group I ribozyme. Science 260:504-508.

Woodson SA. 2005. Metal ions and RNA folding: a highly charged topic with a dynamic future.

Curr. Opin. Chem. Biol. 9:104-109.

Van Nues RW, Rientjes JMJ, Van der Sande CAFM, Zerp SF, Sluiter C, Venema J, Planta RJ,

Raué HA. 1994. Separate structural elements within internal transcribed spacer 1 of

Saccharomyces cerevisiae precursor ribosomal RNA direct the formation of 17S and 26S

rRNA. Nucleic Acids Res. 22:912–919.

Van Nues RW, Rientjes JMJ, Morré SA, Mollee E, Planta RJ, Venema J, Raué HA. 1995.

Evolutionarily conserved structural elements are critical for processing of internal

transcribed spacer 2 from Saccharomyces cerevisiae precursor ribosomal RNA. J. Mol.

Biol. 250: 24–36

Vélez-Zuazo X, Agnarsson I. 2011. Shark tales: a molecular species-level phylogeny of sharks

(Selachimorpha, Chondrichthyes). Mol. Phylogenet. Evol. 58:207-217.

Walker NB, Shivji MS, Stanhope MJ, Rogers SO. 1997. Characterization

of group I intron-like insertion elements in shark ribosomal DNA spacers. American

Elasmobranch Soc. Ann. Meetings. Seattle, WA.

Waldsich C, Masquida B, Westhof E, Schroeder R. 2002. Monitoring intermediate folding states

of the td group I intron in vivo. EMBO J. 21:5281-91.

Xiao M, Li T, Yuan X, Shang Y, Wang F, Chen S, Zhang Y. 2005. A peripheral element

assembles the compact core structure essential for group I intron self-splicing. Nucleic

Acids Res. 33:4602-4611.

Xu MQ, Kathe SD, Goodrich-Blair H, Nierzwicki-Bauer SA, Shub DA. 1990. Bacterial origin of

a chloroplast intron: conserved self-splicing group I introns in cyanobacteria. Science

250:1566-1570.

Zhou Y, Lu C, Wu QJ, Wang Y, Sun ZT, Deng JC, Zhang Y. 2008. GISSD: Group I Intron

Sequence and Structure Database. Nucleic Acids Res. 36(Database issue):D31-7.
