
PATTERN AND DISTRIBUTION OF RNA EDITING IN LAND PLANT RBCL

AND NAD5 TRANSCRIPTS

A Thesis

Presented to

The Graduate Faculty of The University of Akron

In Partial Fulfillment

of the Requirements for the Degree

Master of Science

Traci L. Branch

December, 2006

PATTERN AND DISTRIBUTION OF RNA EDITING IN LAND PLANT RBCL

AND NAD5 TRANSCRIPTS

Traci L. Branch

Thesis

Approved:

Advisor: Dr. Robert Joel Duff
Committee Member: Dr. Richard Londraville
Committee Member: Dr. Francisco B. Moore
Committee Member: Dr. Amy Milsted
Department Chair: Dr. Bruce Cushing

Accepted:

Dean of the College: Dr. Ronald F. Levant
Dean of the Graduate School: Dr. George R. Newkome
Date: ______

ABSTRACT

RNA editing is a process that occurs in both the chloroplast and mitochondrial genomes of land plants. However, little is known about the patterns and distribution of RNA editing among land plant lineages. To date, investigations comparing multiple taxonomic groups have examined only a small number of samples, typically within three lineages. More importantly, the data collected in previous studies were generally restricted to sequences available in GenBank, which is problematic because of the small number of plants sequenced. Therefore, to resolve the questions left unanswered thus far, it was crucial to perform a more extensive study including samples representing all five land plant lineages. To this end, one chloroplast gene, rbcL, and one mitochondrial gene, nad5, were studied in detail.

Fourteen DNA and 22 cDNA sequences of rbcL were generated for a diverse group of land plants. In addition, 1 DNA and 8 cDNA sequences of nad5 were generated. RNA editing sites were identified through direct comparison of cDNA and DNA sequences and through prediction methods using an alignment of 126 samples (newly generated and obtained from GenBank) representing all plant lineages. A total of 122 editing sites were predicted within 1335 nucleotides of rbcL. A total of 123 editing sites (out of 1107 nt) were detected for the mitochondrial gene, nad5. The majority of the editing sites initially predicted by this method were actually observed. The 143 amino acid changes predicted resulted in 39 different types of amino acid conversions. S-L, S-F, and F-L were, in that order, the most frequent amino acid changes noted for rbcL. A definite T_A context bias was detected at the -1/+1 nucleotide positions of editing sites that is absent in non-edited sites.

The rate and distribution of RNA editing sites varied greatly among the taxa sampled. In several cases, individual taxa showed higher or lower rates of editing than the rest of their group within one or even both organelles. The T_A context bias supports the suggestion that sequence recognition factors are necessary for editing to take place. The absence of shared editing sites among selected taxa for rbcL raises a number of interesting evolutionary questions. To resolve these questions, future research needs to increase both the number of genes examined (or, better, whole organelle genomes) and the number of plant groups examined.

DEDICATION

For my mother, Peg: I am left speechless at how much her support and unconditional love has taught me. And, without whom I would not have strived to reach and achieve my dreams…

For Nate: there are not enough words to express my gratitude for the amount of love, encouragement, and support you have shown me, your patience is inexhaustible…

ACKNOWLEDGMENTS

I want to extend my sincerest appreciation to my advisor, Dr. Robert Joel Duff, who has patiently guided and supported me in my scientific endeavors. I also thank the members of my committee, Dr. Richard Londraville, Dr. Francisco “Paco” Moore, and Dr. Amy Milsted, each of whom has not only shared knowledge and support but has also inspired me, with their own excitement, to explore the scientific process.

In addition, I would like to thank the other people who have helped me along the way: Laurie Kay and Laura Peterson for easing my transition into our laboratory; Lauren Smith for burning the midnight oil alongside me; Chiara Benvenuto and Sadie Reed for their never-ending guidance and encouragement in and out of the lab; Hope Ball for her ability to keep me laughing through the obstacles I faced; Sue Robinson for all of her help and incredible “paper-pushing” ability; and Nate Manning for all the hours of editing and brainstorming. For their company during the hours spent with our noses deep in our biochemistry and molecular texts, I would like to thank Hope Ball and Patty Taylor. To my beloved group of Algal Troopers, I would like to thank you for the most beautiful week in San Salvador, Bahamas. Who better to spend a week snorkeling and eating conch fritters with? Finally, and with much gratitude, I would like to thank the faculty in the Department of Biology at the University of Akron for all of their support, guidance, and their individual awe-inspiring approaches to science.

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER

I. INTRODUCTION
    Phylogenetic distribution of RNA editing
    Frequency of editing in subcellular organelles of plants
    RNA editing among hornworts
    Research questions
    Specific Problem

II. MATERIALS AND METHODS
    Materials
    Isolation of Nucleic Acids and Synthesis of cDNA
    Amplification of DNA and cDNA
    Purification of PCR Products
    Cloning of cDNA and DNA PCR products
    Sequencing Reactions
    Determination of RNA Editing Sites

III. RESULTS
    Results of Sequence Acquisition for rbcL
    Total Number of RNA Editing Sites for rbcL: all groups
    Results of Group Comparisons of Shared Editing Sites
    Results of the Effect of RNA Editing on Amino Acid Composition
    RNA Editing in the Mitochondrial Gene, nad5

IV. DISCUSSION
    Describing Editing Sites: a conservative approach
    Predicting RNA Edited Sites: problems encountered
    Distribution of RNA Editing Sites: predicted and observed, in and between groups
    RNA Editing Rate Across Land Plants and between Chloroplast and Mitochondrial
    Selection of RNA Editing Sites: what drives targeting of editing sites?
    Upstream and Downstream Base Preferences and RNA Editing in Land Plants
    Second Base Sites and RNA Editing
    The evolution and maintenance of RNA editing in plant organelles
    Conclusions and Future Research

REFERENCES

APPENDICES
    APPENDIX A. ALIGNMENT OF SELECTED TAXA DNA AND CDNA SEQUENCES FOR RBCL
    APPENDIX B. POSITION, AMINO ACID CHANGE AND TYPE OF EDITING SITES FOUND IN RBCL FOR SELECTED LAND PLANTS

LIST OF TABLES

Table

1 A comparison of the known rates of RNA editing in plant organelles.

2 Collection information for taxa included in this study. GenBank numbers for DNA sequences collected as part of this study are bolded. GenBank numbers represent samples for which prior DNA sequences had been generated but are being used in this study.

3 PCR primers used for PCR amplifications and sequencing reactions. In some cases, multiple combinations of external or internal primers were used to obtain completed sequences.

4 Sequence reaction parameters used for each of the sequences that were generated during this study.

5 Characteristics of RNA editing sites in rbcL sequences for selected land plant groups. Data include taxa for which cDNAs were obtained and for which RNA edited sites were predicted from DNA sequence comparisons only.

6 Numbers of edited sites shared between selected plant groups compared to the total number of edited sites between the two groups. Abbreviations of plant names are as follows: hornworts (HW), lycophytes (LY), ferns (FN), mosses (MS), and Takakia (TK).

7 Characteristics of RNA editing sites in nad5 sequences for selected land plant groups and data compiled from literature. * indicates data obtained from Steinhauser et al. (1999); the remaining data were collected for the purpose of this study. Bold print indicates that cDNAs were available for direct comparison and that edited sites reported for those samples are observed rather than predicted. A (?) indicates samples for which the T-C sites are only predicted.

8 Comparison of RNA editing rate between the chloroplast gene, rbcL, and the mitochondrial gene, nad5, for selected taxa as observed and predicted in this study. RNA editing rates for both single taxa and plant groups are listed. * indicates data obtained from Steinhauser et al. (1999).

LIST OF FIGURES

Figure

1 Illustration of the features of the rbcL and nad5 genes with approximate positions of primers used in this study. (Figure provided by Laura Peterson)

2 Maximum parsimony tree showing the phylogenetic effects of RNA editing on the placement of genomic sequences when compared to corresponding (multiple) cDNA sequences. This is one of 187 shortest trees with a length of 2369 characters (597 of which were informative). GenBank numbers follow the sample name (or code) for those samples previously generated. Samples without GenBank numbers were those generated for use in this study.

3 Distribution of RNA editing sites among selected taxa in rbcL. Bold taxa represent taxa for which DNA and cDNA sequences are available and thus edited sites have been confirmed.

4 Distribution of 143 amino acid changes resulting from 122 edited sites in the chloroplast gene, rbcL. There were 39 different amino acid conversion types.

5 Regression analysis examining the covariation of editing across organellar genomes for the clade.

6 Regression analysis examining the covariation of editing across organellar genomes for all taxonomic groups.

7 Graphical representation of bases positioned upstream and downstream of all the reported editing sites, both observed and predicted, for all samples included in this study within the hornwort clade.

8 Graphical representation of bases positioned upstream and downstream of all the reported editing sites, both observed and predicted, for all samples included in this study within the clade.

9 Graphical representation of bases positioned upstream and downstream of all the reported editing sites, both observed and predicted, for Takakia.

10 Base composition of the -1/+1 nucleotide in relation to the edited and non-edited second base position sites where C-T editing was known to occur. Illustration of results found for the hornwort sample PBUEU.

11 Base composition of the -1/+1 nucleotide in relation to the edited and non-edited second base position sites where C-T editing was known to occur. This graph illustrates results within the hornwort sample PPECA.

12 Base composition of the -1/+1 nucleotide in relation to the edited and non-edited second base position sites where C-T editing was known to occur. This graph illustrates results within the hornwort sample ANTH.

CHAPTER I

INTRODUCTION

RNA editing, identified in the mid-1980s, revolutionized the field of molecular biology and challenged conventional thought (Gray, 1996). Before the discovery of RNA editing, the prevailing dogma outlined the flow of genetic information as beginning with DNA being transcribed into RNA and ending with translation of that RNA into the corresponding proteins (Bock, 2000). Classically, this ‘central dogma’ presumes that the mRNA to be translated into protein faithfully represents its DNA sequence. However, the process of RNA editing alters this formula. RNA editing is defined here as any process that changes nucleotides within a transcript and so produces recognizable differences between the RNA sequence and its corresponding DNA sequence (Bock, 2000). One consequence of these post-transcriptional changes is that evolutionarily conserved amino acids are restored in the encoded protein, negating errors in the DNA sequence (for example, by repairing early termination codons) (Hiesel et al., 1989).

Two general forms of RNA editing are recognized. Insertion/deletion editing is characterized by site-specific deletions or insertions of one or more nucleotides in the transcript (Bock, 2000; Hanson et al., 1996); in this form, the editing process deletes or inserts nucleotides post-transcriptionally. The second form is conversion, or replacement, editing, which is characterized by specific compositional changes to individual nucleotides (Bock, 2000). In this case, the nucleotides being edited are “converted,” probably by site-specific deamination of cytosine to uracil (Maas & Rich, 2000). These changes generally result in the presence of a uracil (U) in the mature mRNA transcript where the original DNA sequence codes for a cytosine (C) (Hanson et al., 1996). Although most attention in RNA editing has focused on cytosine (C) to uracil (U) alterations, in mammals editing also takes the form of adenosine-to-inosine conversions (Maas & Rich, 2000).

Two types of replacement conversion occur: C to U and U to C. Although C to U conversions are generally believed to occur by deamination, the mechanism responsible for U to C conversions is unknown. For simplicity’s sake, C to U conversions are referred to as “forward” editing events throughout this text, as they were discovered first and are the most common, while U to C conversions are referred to as “reverse” events.
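The working definition above, in which an edit is any recognizable difference between a transcript and its DNA template, can be illustrated with a short sketch. The Python below is an illustration only and is not the procedure used in this study: the function name find_editing_sites and the two toy sequences are hypothetical, and the sequences are assumed to be already aligned and of equal length, with the cDNA written in the DNA alphabet (T in place of U).

```python
def find_editing_sites(dna, cdna):
    """Classify C/T mismatches between an aligned DNA and cDNA sequence.

    Returns a list of (position, kind) tuples, with 1-based positions.
    'forward' means C in the DNA and T (U) in the transcript (C to U);
    'reverse' means T in the DNA and C in the transcript (U to C).
    ASSUMPTION: both sequences are pre-aligned and gap-free.
    """
    if len(dna) != len(cdna):
        raise ValueError("sequences must be aligned to the same length")
    dna = dna.upper()
    # Treat any U in the transcript as T so the two alphabets match.
    cdna = cdna.upper().replace("U", "T")
    sites = []
    for position, (genomic, transcript) in enumerate(zip(dna, cdna), start=1):
        if genomic == "C" and transcript == "T":
            sites.append((position, "forward"))
        elif genomic == "T" and transcript == "C":
            sites.append((position, "reverse"))
    return sites


# Toy sequences, invented for illustration (not data from this study).
toy_dna = "ATGTCACCATTT"
toy_cdna = "ATGTTACCACTT"
print(find_editing_sites(toy_dna, toy_cdna))
```

Other kinds of mismatch are ignored here on purpose, since only the forward and reverse conversion types defined above are of interest; a real comparison would also need to rule out sequencing error and polymorphism.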

Phylogenetic distribution of RNA editing

RNA editing is widely distributed among organisms. It was first identified within the mitochondria of trypanosomes, a protozoan taxon (Bock, 2000). Subsequently, RNA editing has been identified in a number of animals, plants, fungi, bacteria, and protists (Börner & Pääbo, 1996). In 1989, RNA conversion editing was first found in plants, in the mitochondria of angiosperms (Hiesel et al., 1989), and later in plant chloroplast genomes (Hoch et al., 1991). Yoshinaga et al. (1996) characterized RNA editing in all lineages of land plants except for the bryophytes. However, more recent evidence for RNA editing has been reported across all land plant taxa, in both chloroplast and mitochondrial genomes, with the exception of the complex thalloid liverworts (Marchantiidae), which do not exhibit editing in either organelle (Steinhauser et al., 1999). To date, no evidence of RNA editing has been found in the green algal ancestors of plants (Steinhauser et al., 1999). Steinhauser et al. (1999) also examined nad5 sequences of bryophytes, including 15 liverworts, 2 hornworts, and 30 mosses. They detected mitochondrial RNA editing in all 30 of the mosses, both of the hornworts, and 7 of the leafy and simple thalloid liverworts (Jungermanniidae). Within another mitochondrial gene, coxIII, it had previously been observed that there was a much higher editing frequency found within compared to angiosperms (Hiesel et al., 1994). To date, all studies have found the hornworts, a group of nonvascular plants, to have the highest frequency of RNA editing in both chloroplast and mitochondrial genes.

In addition to differences in the frequency of editing among plant groups, the C to U and U to C types of RNA editing are not equally distributed in plants (Bock, 2000; Kugita et al., 2003). C to U conversions are the most common in all plant groups and occur in both organelles for all groups. U to C editing, however, is most common in hornwort chloroplast and mitochondrial genomes, with some reports of this form of editing in mosses and lycophytes (Bock, 2000; Kugita et al., 2003; Steinhauser et al., 1999). In addition, U to C type edits have also been shown to occur, though with less frequency, within mitochondria (Hiesel et al., 1994; Kugita et al., 2003; Malek et al., 1997). Thus far, reverse conversions have been detected very rarely in angiosperms (Schuster et al., 1990) and are not yet known in gymnosperms (Gray, 1996).

Table 1. A comparison of the known rates of RNA editing in plant organelle genomes.

Location                             Chloroplast       Mitochondria
Rate                                 Low               High

Tobacco/Arabidopsis                  34/19*            -/456
(flowering plants)                   C-U               C-U
(Kugita et al., 2003; Tsudzuki et al., 2001/Giege et al., 1999)

Adiantum capillus-veneris            350 total         -
(fern)                               315 C-U
                                     35 U-C
(Wolf et al., 2004)

Anthoceros punctatus                 942 total         Approx. 1500*
(hornwort)                           509 C-U
                                     433 U-C
(Yoshinaga et al., 1996)

Marchantiae                          0                 0
(Steinhauser et al., 1999)

Algae                                0                 0
(Steinhauser et al., 1999)

* predicted by Duff (unpub.)

Frequency of editing in subcellular organelles of plants

The frequency of RNA editing within the chloroplast has been reported to be much lower than in the mitochondria (Hanson et al., 1996; Kugita et al., 2003). The differences between total editing rates and forms of editing in plants are shown in Table 1. For example, RNA editing in the mitochondrial genome of the angiosperm (flowering plant) maize has been estimated to affect between 3 and 15% of the inferred amino acid sequences in a number of genes (Hanson et al., 1996; Mulligan et al., 1999). In contrast, the rate of chloroplast editing in the same organism is usually less than 1% and has been reported as low as 0.13% of the inferred amino acid sequence (Hanson et al., 1996). In total, Giege and Brennicke (1999) found 456 C to U edits within the mitochondria of Arabidopsis. While the number of plastid genome editing sites in Arabidopsis is still unknown, it is likely to be similar to the numbers found in maize and tobacco, which have only 25 and 31 sites, respectively (Kugita et al., 2003).

Among the seedless vascular plants, Wolf et al. (2004) identified 350 editing sites within the chloroplast genome of the fern Adiantum capillus-veneris, representing approximately 0.38% of the investigated sites. Furthermore, the lycopsid Selaginella uncinata may have as many as 112 chloroplast editing sites, resulting in a change in amino acid sequence of 7.7% (Kugita et al., 2003). The trend among vascular plants therefore appears to run from high rates of editing among the basal lineages to a general loss of editing proficiency in more advanced groups. This trend is also seen in the rates of reverse edited sites across land plant lineages.

In bryophytes the frequency of RNA editing in the chloroplast and mitochondrial genomes appears to vary widely, though few taxa have been examined. The liverworts display a strange dichotomy in editing frequency between the two major clades, the complex thalloid liverworts and the simple/leafy liverworts (see figure 1). The former exhibit no known editing sites in either the mitochondrial or chloroplast genomes (Gray, 1996; Groth-Malonek et al., 2005; Steinhauser et al., 1999). The latter have been shown to have editing in both genomes, though at relatively low frequencies (Gray, 1996; Groth-Malonek et al., 2005; Steinhauser et al., 1999). One exception to this general observation is the potentially ancestral liverwort Haplomitrium, which was recently identified as having very high editing rates in the nad5 mitochondrial gene, with a total of 67 C-T editing sites (Groth-Malonek et al., 2005). At present it is unknown whether the high rate of RNA editing in this gene reflects the rate of editing in the mitochondrial genome in general. RNA editing has been shown to occur, although at low frequencies, within both the chloroplast and mitochondrion of the mosses (Freyer et al., 1997; Steinhauser et al., 1999). As with the liverwort Haplomitrium, there is also an example of elevated editing within the chloroplast of the mosses. Using prediction methods as described in Duff & Moore (2005), the basal moss Takakia is estimated to have up to 25 edited sites within the chloroplast gene rbcL (Duff, 2006). In contrast to the lower frequencies (0-3 sites) among the other tested members of the moss lineage, editing events within the hornworts are drastically elevated (Steinhauser et al., 1999). However, recent investigations have revealed a single lineage of hornworts, represented by Leiosporoceros, that has RNA editing rates similar to those found among most mosses (Duff & Moore, 2005; Kugita et al., 2003; Yoshinaga et al., 1997).

RNA editing among hornworts

Hornworts express the highest levels of editing among the land plants in both the chloroplast and mitochondrial genomes. In a complete examination of the A. formosae chloroplast genome, 942 edited sites were found, establishing hornworts as having the greatest propensity for plastid editing among land plants (refer to Table 1) (Kugita et al., 2003; Yoshinaga et al., 1997). The number of editing sites within the mitochondrial genome is even greater, with approximately 1500 sites estimated (Duff, unpub. data).

Recent analysis of 20 hornwort samples, using the chloroplast gene rbcL, identified 72 editing sites (Duff & Moore, 2005). Although the frequency of editing is elevated in the hornwort clade, with the exception of Leiosporoceros, individual editing sites are not conserved across these taxa (Duff, 2006; Duff & Moore, 2005; Kugita et al., 2003). The editing sites per sample vary in both base position and number, ranging from zero in Leiosporoceros to 34 edited sites in three different samples, Megaceros giganteus, Phaeoceros coriaceus, and Phaeoceros fimbriatus (Duff & Moore, 2005). Twenty-eight of the identified editing sites represent U-to-C conversions, or “reverse” editing, which account for 39% of the total number of edited positions in the rbcL gene.

To better understand the levels of editing within the mitochondrial genome of hornworts, Duff (2006) examined genomic DNA and cDNA sequences of the gene nad5. Within the 1107 nucleotides examined, 113 total RNA editing sites were found. As in the plastid genome of hornworts, the number of “reverse” conversions in the mitochondria is elevated in comparison with other land plant lineages; here they account for approximately 32% of the identified edited sites. Although editing was found within nad5 of Leiosporoceros, it exhibited a significantly lower number of sites than the rest of the hornwort clade (8 sites compared to 20-45).
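The percentages quoted in the two preceding paragraphs follow from simple ratios, and a brief sketch makes the arithmetic explicit. The helper name percent is hypothetical; the counts are those given in the text (28 reverse sites among 72 rbcL sites, and 113 edited sites within 1107 nad5 nucleotides). The second ratio, edited sites per nucleotide examined, is not stated in the text and is shown only as one possible way of expressing editing density.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of hornwort rbcL editing sites that are U-to-C ("reverse") edits.
rbcl_reverse_share = percent(28, 72)

# Edited sites per 100 nucleotides examined in hornwort nad5.
nad5_site_density = percent(113, 1107)

print(f"rbcL reverse share: {rbcl_reverse_share:.1f}%")
print(f"nad5 sites per 100 nt: {nad5_site_density:.1f}")
```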

As we have seen, current data suggest that editing rates vary greatly among land plant lineages (Groth-Malonek et al., 2005). Thus far it has been shown that the complex thalloid liverworts, one of the two main lineages of liverworts, lack editing in both organelles. Similarly, the hornworts show elevated editing rates in both the chloroplast and mitochondrial genomes. Few other land plant groups have been examined for both organelles to determine whether these taxa represent the norm with respect to the relationship between editing in the chloroplast and mitochondrial genomes.

Research questions

Based on these studies of RNA editing in plants, it is clear that RNA editing is a process that affects both the mitochondrial and chloroplast genomes of nearly all plants.

However, many questions remain to be answered including:

1. What is the total variation in RNA editing rates among plants? A few plant lineages have been examined in detail (e.g., hornworts and a few angiosperms), but a large portion of land plant diversity has not been surveyed to determine how much variation there is in the total frequency of RNA editing and whether editing sites are conserved across diverse phylogenetic groups.

2. How and when did RNA editing evolve in land plant history? As discussed earlier, algae lack the ability to perform RNA editing, as do the complex thalloid liverworts (Marchantiidae). However, the sister group to the Marchantiidae, the simple/leafy liverworts, has been found to contain RNA edited sites within both organelles. In addition, editing is present and maintained throughout all other land plant lineages. This could suggest that RNA editing evolved at least twice, once in the leafy liverworts and again in the lineage leading to all the other land plants. Alternatively, it could have evolved in an ancestor of all land plants and later been lost in the ancestor of the complex thalloid liverworts. How the mechanism could be lost or gained multiple times, and how likely each of these hypotheses is, remains unknown.

3. Did the ability to edit in chloroplast and mitochondrial genomes evolve independently, or do they have a common origin? Irrespective of when editing evolved, did it evolve separately in the mitochondrion and chloroplast, or do both have a common origin? There is evidence that the editing process itself exhibits similarities between the chloroplast and the mitochondrion (Fiebig et al., 2004), supporting a common origin. However, the lack of data from both organelles and the lack of a known mechanism still leave the relationship of editing in the two organelles open.

4. Do lineages that exhibit high rates of RNA editing in one gene have higher overall rates of editing across the entire organellar genome, or even both genomes? Wolf et al. (2004) found 9 edited sites within the plastid gene atpB but none in the gene rbcL. In the fern Psilotum, no editing sites were found within the gene rbcL, while one site was found within the gene ndhB (Freyer et al., 1997). Beyond these data, no comparative studies have been done explicitly to look at this issue. Similarly comparative studies have been initiated in the mitochondrial genome to examine this question.

5. What is the selective pressure to maintain RNA editing? This is one of the most mystifying questions concerning the occurrence of editing: why not simply eliminate editing sites altogether by corrections in the genomic DNA, specifically by back mutations? Few hypotheses have been put forward to explain the maintenance of RNA editing. One is that it is most likely maintained because of the evolutionary need for phenotypic plasticity (Maas & Rich, 2000). However, since RNA editing is not found within the plastid genome of many lower plants and does not appear to be specifically conserved across lineages, there may not be an active selective force driving its maintenance (Bock, 2000). Likewise, it has been suggested that it is simply a byproduct of another, unrelated process and hence not the result of direct selection (Duff & Moore, 2005).

Specific Problem

This study provides new data to address some portions of the questions presented above. In particular, the data gathered address the first, third, and fourth questions. To do so, the distribution of editing sites within the plastid genomes of a wide selection of land plant lineages was surveyed by sequencing DNA and cDNA, in order to determine the pattern of RNA editing in a chloroplast gene and, to a lesser extent, in the mitochondrial gene nad5.

These data were used to address several specific hypotheses related to the broad questions conveyed above:

1) Given that RNA editing exists in most land plants, it is expected that a wider survey of land plants will reveal additional taxa that are edited and that a greater diversity of editing rates may be found. Are there groups that have not been examined that have higher or lower RNA editing frequencies than those previously reported?

2) When RNA editing rates from two organelles are compared, a correlation of editing rate between organelles is expected. A consistent pattern of correlation between RNA editing rates in both organelles will provide support for the hypothesis that there is a common editing mechanism governing editing in these two organelles.

3) If RNA editing sites are evolutionarily conserved, then we would expect to see the same edited sites shared across diverse land plant lineages. Are particular RNA editing sites maintained over long periods of time? If they are shared across taxa, are they simply being regenerated within distinct lineages?
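The second hypothesis predicts that per-taxon editing rates in the two organelles covary. One way such a prediction could be examined is sketched below using a Pearson correlation and an ordinary least-squares line. This is a generic illustration, not the analysis performed in this study: the function names are hypothetical, and the numbers in toy_chloroplast and toy_mitochondrial are invented placeholders chosen only so the script runs, not measurements.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares(xs, ys):
    """Slope and intercept of the ordinary least-squares line y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# PLACEHOLDER values (one pair per taxon), invented for illustration only.
toy_chloroplast = [1.0, 2.0, 3.0, 4.0]
toy_mitochondrial = [3.0, 5.0, 7.0, 9.0]

r = pearson_r(toy_chloroplast, toy_mitochondrial)
slope, intercept = least_squares(toy_chloroplast, toy_mitochondrial)
print(r, slope, intercept)
```

A correlation near 1 across many taxa would be consistent with a shared mechanism, though it would not by itself rule out other explanations such as shared ancestry of the sampled taxa.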

CHAPTER II

MATERIALS AND METHODS

Materials

Specimens representing major lineages of land plants were collected fresh in the field, donated by a number of individuals, or purchased (Table 2).

Isolation of Nucleic Acids and Synthesis of cDNA

Total DNA and RNA were obtained from either fresh plant material or material preserved at -80°C prior to extraction. DNA extractions were performed using the DNeasy® Plant Mini Kit (Qiagen), following the protocols specified by the manufacturer. RNA extractions were performed using the RNeasy® Plant Mini Kit (Qiagen), also following the manufacturer’s protocols. Following extraction, RNA samples were further treated with either RNase-free DNase I® (Invitrogen), incubated at room temperature for 15 minutes, or TURBO DNase™ (RNase-Free), incubated at 37°C for 30 minutes. In both cases, the DNase was inactivated by the addition of 1 µl of 25 mM EDTA, to prevent chemical scission of the RNA, followed by incubation for 10 minutes at 65°C or 75°C, respectively. This step prevented contamination of the RNA samples by genomic DNA. In most cases, DNAs were stored at -20°C and RNAs at -80°C prior to amplification or cDNA preparation.

12 Table 2. Collection information for taxa included in this study. Genbank numbers for DNA sequences collected as part of this study are bolded. Genbank numbers represent samples for which prior DNA sequences had been generated but are being used in this study. Sequences generated in this study are noted with an X.

Sample Code Name Genbank Origin/ Collection Information (Genus species) Number Liverworts Haplomitrium spp. HAPLO804A X Giant City State Park, IL; TLBranch, 2005 Hornworts Phaeoceros pearsonii PPECA AY860203 California, USA; Doyle, 2004 Phaeoceros laevis PLAEU DQ845682 Europe; JC Villarreal Phaeoceros bulbiculosus PBUEU AY860201 Europe; JC Villarreal Megaceros feugiensis MFUCH699A DQ845643 Chile; Ducket, 2005 Megaceros spp. MSP1B X Hawaii; Mashito, Megaceros spp. MSPCH701A DQ845645 Chile; Ducket, 2005

Dendroceros crispus DCRJC742 DQ845662 Unknown; JC Villarreal 742 Leiosporoceros dussii LDU05 AY463053 Unknown; JC Villarreal Phaeoceros pearsonii PPECAB AY860203 Unknown; Doyle 11325 Phaeoceros fimbriatus PFIJC799 DQ845655 Panama; JC Villarreal, 2005 Phaeoceros spp. PSPDO4495A DQ845652 LaCarbozay Merida; Ducket, 2004 Phaeomegaceros skottsbergii PSPCH2747A DQ845656 Chile; Ducket Phaeomegaceros skottsbergii PSPCHMALE X Chile; Ducket Phaeoceros carolinianus PCADCC726 AY860202 Australia; Cargill, 2004 Megaceros vincentianus JC791M X Unknown; JC Villarreal Phaeomegaceros fimbriatus PHAMEGVEN AY860200 Venezuela; M. Price, 2004 Nothoceros spp. NOTHCR AY860199 Costa Rica; JC Villarreal Phaeoceros spp. PSIU3001 AY463057 Carbodale IL, K. Renzaglia Anthoceros APUEU DQ845681 Portugal; JC Villarreal, Mosses Sphagnum SPAG1033A X Kent Bog, Kent OH; TL Branch 2005 Thuidium fissidens THUID916A X Furnace Run Metro Park, Brecksville, OH; TL subbasilaris Branch, 2005 Ferns Cyanthea australis TRFRN1082A X UA Greenhouse; TL Branch, 2004 Botrychium BOTRY1026A X North America Polypodium PAUSIU798A X Giant City State Park, IL; TL Branch, 2005 Psilotum nudum PSILO1081A X UA Greenhouse; TL Branch, 2004 Lycophytes Selaginella kraussiana SELAG11160A X UA Greenhouse; TL Branch, 2004 Selaginella kraussiana aurea SELAG31021A X North America; Carolina Biological Supply, 2005 Selaginella kraussiana SELAG41161A X North America; Carolina Biological Supply, 2005 Lycopodium LYCO1025A X North America; Carolina Biological Supply, 2005 Seed Plants Conopholis CONOP915A X Furnace Run Metro Park, Brecksville OH; TL Branch, 2005 Gnetnum gnemon GNET1162A X UA Greenhouse; RJ Duff, 2006 Zamia pumila ZAM1109A X UA Greenhouse; RJ Duff, 2006


Amplification of DNA and cDNA

First-strand cDNA transcripts were obtained from RNA samples using the Sensiscript® cDNA kit (Qiagen) in combination with the primer rbclRH (Duff et al. 2004), specific for the 3’ end of the rbcL transcript, and nad5L Reverse (Steinhauser et al. 1999) for the nad5 transcript. After first-strand cDNAs were obtained, the polymerase chain reaction (PCR) was performed using the same 3’ primers for rbcL and nad5 along with the appropriate 5’ end primer (Table 3). The same portion of the rbcL and nad5 genomic sequence was amplified with the same suite of primers. In several cases, DNA sequences of hornworts had been previously obtained and were available through GenBank. All amplifications were performed using the LA-PCR® Long PCR Kit, ver. 2.1 (Panvera Corporation), following the manufacturer’s protocol. Reactions were assembled by combining genomic DNA or cDNA with gene-specific primers (see Table 3) and the polymerase/buffer mixture. rbcL PCRs were carried out in an Eppendorf thermal cycler with a denaturation period of 3:50 minutes at 94 ºC, an annealing time of 0:50 minutes at 44 ºC, and an extension time of 1:30 minutes at 68 ºC, repeated for 30 cycles. nad5 PCR conditions cycled from 56 ºC down to 42 ºC during 30 rounds of amplification, with a denaturation time of 2:40 minutes, an annealing time of 0:40 minutes, and an extension time of 1:20 minutes. As an experimental control, PCR was performed on the RNA samples after the DNase step, prior to synthesis of the cDNA. This step was necessary to verify that the final sequences were derived from cDNAs and were not the result of genomic DNA contamination. In some instances an additional control amplification of the PCR mix minus template was also performed. All PCR amplifications were carried out in an Eppendorf thermal cycler under conditions shown in Tables 4 and 5.

Table 3. PCR primers used for PCR amplifications and sequencing reactions. In some cases, multiple combinations of external or internal primers were used to obtain completed sequences.

Name        Sequence 5’ – 3’                    Length   Stock Conc.   Specificity
RBCLF       GTC ACC ACA AAC GGA RAC TAA AGC     24       1250 µg/ml    RbcL gene
RBCLR       CTT TCC ATA CTT CRC AAG CAG C       22       1250 µg/ml    RbcL gene
RBCL471F    CAA GGT CCA CCT CAT GGT A           19       1250 µg/ml    RbcL gene
RBCL320F    GGT AAY GWT TYG GAT T               16       1250 µg/ml    RbcL gene
RBCL888R    TAC ACG GAA RTG CAT ACC AT          20       1250 µg/ml    RbcL gene
NAD5 K      ATA TGT CTG AGG ATC CGC ATA G       22       1250 µg/ml    Nad5 gene
NAD5 L      AAC TTT GGC CAA GGA TCC TAC AAA     30       1250 µg/ml    Nad5 gene
NAD5 100    AAG ATG GA GGG AGT MGG TC           20       1250 µg/ml    Nad5 gene
NAD5 900    ACA CTT CCC AAC CAG AAA GCA         21       1250 µg/ml    Nad5 gene
PPECA946R   GAC AGA GAC ACT TCC CGG             18       1250 µg/ml    Nad5 gene
PPECA107F   CCT GCT YTA TAT TGT CAA CC          20       1250 µg/ml    Nad5 gene
M13R        CAG GAA ACA GCT ATG ACC             18       1250 µg/ml    TOPO vector
M13F        TGT AAA ACG ACG GCC AGT             18       1250 µg/ml    TOPO vector

Figure 1. Illustration of the features of the rbcL and nad5 genes with approximate positions of primers used in this study. (Figure provided by Laura Peterson)

Purification of PCR Products

All amplified products, along with a 1 kb size standard (Hyperladder I, Bioline Inc.), were electrophoresed on 0.9% agarose gels and visualized under UV light to assess the length of the DNA fragments. DNA/cDNA fragments isolated from a gel slice were either directly cloned or purified using GenElute® (Sigma) or Ultrafree-DA® Amicon Bioseparation (Millipore Corporation) agarose spin columns. Alternatively, some PCR products (1.4 kb for rbcL and 1.1 kb for nad5) were purified using the MinElute™ PCR Purification Kit (Qiagen) and stored at -20°C until direct sequencing was performed.

Cloning of cDNA and DNA PCR products

Amplification products were cloned by insertion into the TA TOPO® vector (Invitrogen). Prior to the plating of transformed bacteria, 10 µl of X-Gal (20 µg/ml) was applied to the plates as an added negative selection step, which helped to eliminate potential selection of false positives. Transformed bacteria were plated on LB/Kanamycin selective medium plates following the manufacturer’s protocols. Subsequently, plates were incubated at 37°C for approximately 30-36 hours, or until the blue color indicator appeared, showing that X-Gal had been cleaved by β-galactosidase in the non-positive clones. Five to fifteen isolated white colonies were selected and grown in LB/Kanamycin broth culture overnight (approximately 12-15 hours). Prior to extraction of plasmid DNA, PCR screening using universal primers was performed to confirm the insert size of the respective positive colonies. For each initial cloning reaction of amplified cDNAs, multiple positive cultures (5-10) were chosen and plasmid minipreps were prepared (QIAprep® Spin Miniprep Kit, Qiagen). For amplifications of DNA products, only one to two positive colonies were chosen and minipreps prepared as above.

Sequencing Reactions

Sequencing reactions were performed on the purified plasmids using the Big Dye cycle sequencing kit (version 2, ABI). Sequencing primers included universal plasmid primers T3/T3 or M13F/M13R and a number of specific primers for internal portions of the gene sequence (see Table 3). Following cycle sequencing, the reactions were purified and loaded onto an automated ABI 310 DNA sequencer (Applied Biosystems). Resultant chromatograms were then analyzed using the ABI Prism version 3.3 Sequencing Analysis program (Applied Biosystems).

Once obtained, chromatograms were read and manually edited in the Sequencing Analysis software. The corrected cDNA and DNA sequences were then aligned using the software SeqApp (Gilbert, 1993).

Table 4. Sequence reaction parameters used for each of the sequences that were generated during this study.

Sequencing Parameters
1. T = 96.0˚   Time = 00:20   R = 1.1˚/s
2. T = 44.0˚   Time = 00:10   R = 1.1˚/s
3. T = 60.0˚   Time = 4:00    R = 1.1˚/s
4. GoTo 1, repeat for 29 cycles


Determination of RNA Editing Sites

Initially, RNA editing sites were determined using a prediction model previously described by Duff & Moore (2005). Briefly, an extensive alignment of 126 sequences was generated, comprising newly acquired sequences, previously generated sequences, and sequences downloaded from Genbank, representing all major groups of land plants. Using the algal sequences as a reference, any thymine or cytosine that was conserved in all algal sequences was initially considered a prospective editing site. These predetermined sites were then examined across the entire alignment, and differences in the form of either cytosine-to-thymine or thymine-to-cytosine were flagged as potential edited sites in each sample. Finally, these sites were cross-referenced against the amino acid sequence; a site was determined to be an edited site only if the amino acid present differed from the conserved state found in all other taxa.
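The prediction steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of Duff & Moore (2005): the function name predict_edit_sites is hypothetical, sequences are assumed to be aligned, in frame, and gap-free, third codon positions are skipped, and the conserved amino acid is taken from the first algal reference rather than from all other taxa.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def predict_edit_sites(algal_refs, genomic):
    """Flag candidate editing sites in one aligned, in-frame genomic sequence.

    algal_refs: aligned algal sequences (assumed to lack editing).
    genomic:    aligned land-plant genomic sequence.
    Returns (position, direction, amino acid change) tuples, 1-based.
    """
    ref = algal_refs[0]
    candidates = []
    for i, base in enumerate(genomic):
        if i % 3 == 2:
            continue  # third codon positions are not considered
        if len({seq[i] for seq in algal_refs}) != 1 or ref[i] not in "CT":
            continue  # keep only a T or C conserved in every algal sequence
        if {ref[i], base} != {"C", "T"}:
            continue  # require a C/T mismatch against the conserved state
        start = i - i % 3
        aa_conserved = CODONS.get(ref[start:start + 3])
        aa_genomic = CODONS.get(genomic[start:start + 3])
        if aa_conserved != aa_genomic:
            # A genomic C where T is conserved implies C-to-U editing.
            direction = "C-to-U" if base == "C" else "U-to-C"
            candidates.append((i + 1, direction, f"{aa_genomic}-{aa_conserved}"))
    return candidates


# Toy data: position 5 changes serine to leucine; position 7 is silent (L to L).
algae = ["ATGTTACTAGGT", "ATGTTACTAGGC"]
predicted = predict_edit_sites(algae, "ATGTCATTAGGT")
```

In this toy alignment only position 5 is flagged; the T/C difference at position 7 is dropped because it does not change the encoded amino acid.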

For some sites, a more definitive identification was accomplished via direct comparison of newly generated and previously described pairs of cDNA and genomic sequences obtained from selected taxa (see Table 2). By definition, an edited site was identified where a cytosine in the genomic sequence appeared as a thymine in the cDNA, or vice versa. Because sequencing and PCR artifacts can introduce point mutations in either the DNA or cDNA sequences, these sites were confirmed by assessing the amino acid sequence and looking for the expected amino acid changes.
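The direct comparison of a genomic sequence with its cDNA can be illustrated with a short Python sketch. It assumes the two sequences are already aligned, equal in length, and in frame; the function name find_edits and the toy sequences are hypothetical and are not data from this study.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def find_edits(dna, cdna):
    """List C/T differences between an aligned genomic sequence and its cDNA."""
    dna, cdna = dna.upper(), cdna.upper()
    edits = []
    for i, (g, c) in enumerate(zip(dna, cdna)):
        if (g, c) not in {("C", "T"), ("T", "C")}:
            continue  # only pyrimidine exchanges count as candidate edits
        start = i - i % 3
        aa_before = CODON_TABLE.get(dna[start:start + 3], "X")
        aa_after = CODON_TABLE.get(cdna[start:start + 3], "X")
        edits.append({
            "position": i + 1,                      # 1-based nucleotide position
            "type": "C-to-U" if g == "C" else "U-to-C",
            "codon_position": i % 3 + 1,
            "aa_change": f"{aa_before}-{aa_after}",
            "changes_aa": aa_before != aa_after,    # amino acid cross-check
        })
    return edits


# Toy pair: two C-to-U edits and one U-to-C edit that removes a stop codon.
edits = find_edits("ATGTCACCATAAGGT", "ATGTTACTACAAGGT")
```

The third edit in the toy pair converts a genomic TAA stop codon to CAA (glutamine), the kind of U-to-C correction of an early termination codon described for hornworts.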

CHAPTER III

RESULTS

Results of Sequence Acquisition for rbcL

A total of 14 genomic sequences and 22 first-strand cDNA sequences were obtained for the chloroplast rbcL gene in taxa representing major groups of land plants. These sequences were combined with 14 previously generated pairs of DNA and cDNA sequences from all genera of hornworts. Samples for which both genomic DNA and one or more cDNAs (number of cDNAs in parentheses) were generated included Gnetum (2), Lycopodium (2), Psilotum (2), Botrychium (1), Thuidium (2), Tree Fern (3), Polypodium (1), Selaginella spp. (2), Sphagnum (6), and Haplomitrium (1). For many taxa on the original targeted list, genomic and cDNA sequences were not successfully obtained because either the DNA or the cDNA could not be amplified or sequenced.

In order to determine the location of the RNA editing sites, genomic sequences were compared to the corresponding cDNA sequences, and identification of these sites was based on the occurrence of C-T or T-C differences between the genomic DNA sequence and its corresponding cDNA transcript. In many cases multiple cDNA transcripts were sequenced, and the total number of edited sites included all of those found in any single transcript. When possible, multiple cDNAs were cloned and sequenced for each sample; however, many cDNAs could not be fully sequenced. Only fully sequenced transcripts are reported here. In each case the number of edited sites reported should be considered a minimum for each sequence, since additional edited sites might have been found had more cDNAs been sequenced. An alignment was created using all known and newly generated cDNA and genomic sequences for rbcL (Figure 2). This figure shows the phylogenetic effects of RNA editing within the selected taxa. The sequence alignment can be viewed in Appendix A.

Using the large rbcL alignment of 126 sequences, including those for which both DNA and cDNA sequences were available, the number of edited sites for each taxon was determined either by direct observation of DNA and cDNA or by prediction as described above. In many cases, prediction of edited sites at third base positions proved too problematic using these methods, so all third base positions were excluded from consideration. For the remaining first and second base positions, Appendix B shows all the predicted and known edited sites for these genomic sequences, including their location, nucleotide base position, amino acid position, and predicted amino acid change if applicable.

[Figure 2 image not reproduced: tree in which each sample's genomic (DNA) sequence appears alongside its cDNA sequence(s), with Marchantia and Chara vulgaris also included; scale bar = 10 changes.]

Figure 2. Maximum parsimony tree showing the phylogenetic effects of RNA editing on the placement of genomic sequences when compared to corresponding (multiple) cDNA sequences. This is one of 187 shortest trees with a length of 2369 characters (597 of which were informative). Genbank numbers follow the sample name (or code) for those samples previously generated. Samples without Genbank numbers were those generated for use in this study. Numbers above branches represent bootstrap values.

Total Number of RNA Editing Sites for rbcL: all groups

In total, 1335 nucleotides were examined spanning five land plant lineages. Here, 122 total RNA editing sites are reported. Each of these editing events causes the amino acid sequence encoded by the cDNA to depart from that of its corresponding genomic sequence. As a result, these conversions revert the amino acid back to the ancestral (conserved) state. In some cases in the hornworts, these conversions corrected early termination codons; without this editing the necessary proteins would not be produced.

Of the 122 total edited sites among all land plants, 79.5% occur within the hornworts. The frequency of editing in the hornwort clade, with the exception of Leiosporoceros, is extremely elevated compared to the rest of land plants. The total number of edited sites among land plant groups is shown in Table 5. This table shows that there are differences in the total numbers of edited sites among taxonomic groups (see Appendix B for the number of sites per sample).

Examining these data reveals that within some groups, such as the hornworts and mosses, there are clearly divergent rates of RNA editing in individual taxa. For example, the moss Takakia was predicted to have 20 edited sites, compared to between 0 and 4 edited sites in other mosses (see Appendix B for edited sites amongst the other mosses). The hornwort Leiosporoceros lacks any edited sites for this gene, while all other hornworts had from 15 to 33 edited sites and a total of 97 edited sites collectively.

Interestingly, while certain groups were observed to have a large number of edited sites, most of the sites present in a particular group did not appear to be shared between groups. The number of sites unique to each group and to selected taxa is recorded in Table 5. This table shows that many edited sites are found exclusively in one group, or even one taxon, and are rarely distributed widely across the phylogenetic spectrum of plants. For example, 8 of the 29 moss editing sites are unique to Takakia. As in all previous reports of RNA editing in plants, second base positions are preferentially edited over first base positions (Liu and Bundschuh, 2005). Although third base positions were excluded from our study, editing at third base positions appears to be exceedingly rare, as expected, since very few such sites could be predicted or were observed in cDNAs.

Of the 122 total sites found, 47 were identified as U-C editing sites. This accounts for 39% of the sites found within the chloroplast gene, rbcL. Interestingly, hornwort sequences contain 44 of these 47 U-C sites, or 94% of the total U-C sites reported here. Furthermore, in hornworts, 46.4% of all edited sites found were U-C conversions. This is more than in any other group examined, which is consistent with previous studies (Yoshinaga et al., 1996). Lycophytes and mosses (including Takakia) have the next most U-C conversions, but they account for only 6% of the total U-C editing sites. For example, Takakia has only one U-C edited site among the 20 reported. Excluding Takakia, the mosses have 2 U-C editing sites out of 11.

Table 5: Characteristics of RNA editing sites in rbcL sequences for selected land plant groups. Data include taxa for which cDNAs were obtained and for which RNA edited sites were predicted from DNA sequence comparisons only.

Group                    Total Edited Sites:     Type of Editing:   % U – C   Codon Position:   Unique Sites
(Number of               Predicted (Observed)    C – U (U – C)                1st (2nd)         per Group
samples/Taxa)
Seed Plants (5)          1 (0)                   1 (0)              0         1 (0)             0
Ferns (30)               10 (0)                  9 (1)              10.0      4 (6)             2
Lycophytes (12)          13 (3)                  10 (3)             23.1      5 (8)             3
Hornworts (40)           97 (35)                 52 (45)            46.4      38 (59)           64
Leiosporoceros (2)       0 (0)                   0 (0)              0         0 (0)             0
Takakia (1)              20 (0)                  19 (1)             5.0       4 (16)            8
Mosses + Takakia (16)    29 (0)                  26 (3)             10.3      8 (21)            11
Mosses – Takakia (15)    11 (0)                  9 (2)              18.2      4 (7)             3
Liverworts (12)          10 (3)                  9 (1)              30.0      5 (5)             3

Results of Group Comparisons of Shared Editing Sites

Table 6 shows a comparison of editing site positions among land plant groups. When combined, mosses and hornworts had the highest number of shared edited sites: out of a total of 106 edited sites between the two groups, 16 sites (15.1%) were shared. The moss Takakia along with the hornworts had 101 total edited sites, of which 9 (8.9%) were shared. Hornworts and lycophytes had a combined total of 94 editing sites, of which only 9 (9.6%) were shared. Hornworts, when grouped with ferns, had a total of 95 edited sites, of which only 6 (6.3%) were shared. However, across the 113 edited sites found in hornworts, mosses, lycophytes, and ferns, there was not a single edited position that all four groups shared. Figure 3 shows the pattern of RNA editing sites across the gene for selected taxa. An examination of this figure clearly shows that edited sites are not conspicuously conserved at particular positions across groups.
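The shared-site percentages quoted above follow from dividing the number of shared sites by the combined total for each pair of groups. The short Python sketch below recomputes them from the counts given in the text and also shows how shared positions could be extracted from two lists of site positions. The function names and the example position lists are hypothetical, and how the combined totals themselves were tallied is not restated here.

```python
def percent_shared(shared, combined_total):
    """Percentage of the combined edited sites that both groups share."""
    return round(100 * shared / combined_total, 1)


# (shared sites, combined total) for each comparison, as given in the text.
reported = {
    "MS/HW": (16, 106),
    "TK/HW": (9, 101),
    "HW/LY": (9, 94),
    "HW/FN": (6, 95),
}
percentages = {pair: percent_shared(*counts) for pair, counts in reported.items()}


def shared_sites(sites_a, sites_b):
    """Edited nucleotide positions present in both groups."""
    return sorted(set(sites_a) & set(sites_b))


# Hypothetical position lists, for illustration only.
example_shared = shared_sites([86, 410, 557], [410, 644])
```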

Table 6. Numbers of edited sites shared between selected plant groups compared to the total number of edited sites between the two groups. Abbreviations of plant names are as follows: hornworts (HW), lycophytes (LY), ferns (FN), mosses (MS), and Takakia (TK).

Comparison Group    Shared Sites:        Total Sites of
                    Predicted (Real)     Combined Grouping
HW/LY               10 (3)               94
HW/FN               6                    95
MS/LY               4                    39
MS/HW               16                   106
TK/HW               9                    101
MS/HW/LY/FN         0                    113

[Figure 3 image not reproduced: edited nucleotide positions in rbcL (columns) plotted against selected taxa (rows, from Gnetum to Marchantia plus outgroups), with U to C and C to U conversions distinguished.]

Figure 3. Distribution of RNA editing sites among selected taxa in rbcL. Bold taxa represent taxa for which DNA and cDNA sequences are available and thus edited sites have been confirmed.

Results of the Effect of RNA Editing on Amino Acid Composition

In total, the 122 identified RNA editing sites resulted in 143 predicted amino acid changes, comprising 39 different types of amino acid conversion. Clearly, the number of identified edited sites and the number of amino acid conversions are not congruent. This is because there were multiple editing sites at which the amino acid being edited, or the amino acid produced, differed between taxonomic groups. Site 410, for example, had three different amino acids encoded in the genomic sequences of the selected taxa (valine, leucine, and isoleucine) that were all converted into serine after editing. Figure 4 shows that the different types of amino acid conversions occurred with very different frequencies. The three most frequent conversions were serine to leucine (23), serine to phenylalanine (19), and phenylalanine to leucine (11). The other 36 types of amino acid conversion each accounted for 8 or fewer changes within the amino acid sequence, and 20 of these were found at only one position.
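The reason the count of amino acid changes can exceed the count of edited sites can be shown with a small tally in Python. The records below are hypothetical apart from the three genomic amino acids reported for site 410; the group labels and the second site are invented for illustration.

```python
from collections import Counter

# (site, taxon group, genomic amino acid, amino acid after editing)
records = [
    (410, "group A", "V", "S"),
    (410, "group B", "L", "S"),
    (410, "group C", "I", "S"),
    (5, "group A", "S", "L"),   # hypothetical site shared by two groups
    (5, "group B", "S", "L"),
]

# One entry per edited nucleotide position.
sites = {site for site, _, _, _ in records}

# One entry per distinct amino acid change at a position.
changes = {(site, before, after) for site, _, before, after in records}

# How often each conversion type occurs among the distinct changes.
conversion_types = Counter(f"{before}-{after}" for _, before, after in changes)
```

Here two edited sites give four distinct amino acid changes, because site 410 is counted once as a site but three times as a change.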

There is a definite bias with respect to which amino acids are affected by RNA editing events. Consistent with the data presented here (Figure 4), the literature reports a preference for serine to leucine, serine to phenylalanine, and proline to leucine changes (Bock 2000; Tillich et al., 2006). Here, 23 serine codons are converted to leucine, which accounts for approximately 16% of all the changes incurred through RNA editing. The next two most common changes, serine to phenylalanine and proline to leucine, together represent about 21% of the changes.

[Figure 4 image not reproduced: bar chart titled "Changes in Amino Acid Sequence due to RNA Editing"; x-axis, Amino Acid Changes; y-axis, number of changes per type (0 to 25).]

Figure 4. Distribution of 143 amino acid changes resulting from 122 edited sites in the chloroplast gene, rbcL. There were 39 different amino acid conversion types.

RNA Editing in the Mitochondrial Gene, nad5

Excluding the hornwort clade, a total of 8 cDNA sequences were generated for the mitochondrial gene, nad5, from selected taxa. cDNA sequences were generated from the following: Lycopodium (sp.) (1), Botrychium (sp.) (1), Cyathea (sp.) (2), Psilotum (sp.) (2), and Sphagnum (sp.) (2). A newly generated Botrychium cDNA/DNA pair was added to the 7 pre-existing hornwort cDNA/DNA pairs. For samples in which the genomic sequences were not generated, Genbank sequences of the same species were used for direct comparison. The Genbank sequences used were as follows: Sphagnum fallax (AJ622817), Lycopodium (Huperzia selago) (AJ012795), and Psilotum nudum (AJ012794).

Acquiring both the cDNA and genomic sequences for the mitochondrial gene, nad5, proved difficult. Attempts to amplify both the cDNA and genomic material using multiple primer combinations (see Table 3) were made with varying results. Because nad5 has several introns, whose locations vary depending on the plant group in question, it was difficult to design a universal primer for the internal region of the gene for all selected taxa. In several cases, as mentioned above, the genomic sequence was not obtainable even though the cDNA sequences were fully generated. Intron variability, in both size and location across land plants, is a likely explanation for the poor amplification success for internal gene regions across plant groups.

Excluding the hornwort clade, a total of 37 editing sites were found in the four samples examined. Sixteen of these 37 editing sites were shared with at least one of the hornwort samples. Unique sites existed for each of the 4 non-hornwort samples: P. nudum (5), Botrychium (sp.) (6), Lycopodium (sp.) (2), and Sphagnum (sp.) (4). Table 7 shows the characteristics of editing sites found in nad5 for the selected land plant groups examined in this study and for previously recorded data, including total number of sites, number of C-T and T-C editing sites, and the percent of sites that are T-C.

A comparison of editing rates between the chloroplast gene, rbcL, and the mitochondrial gene, nad5, for selected taxa is shown in Table 8. Only the individual samples for which data were obtainable for both genes are listed. The data presented here reflect the trends found in previous studies (Gray & Covello, 1993), in which the mitochondria were reported to have a higher frequency of editing compared to the chloroplast.

Table 7. Characteristics of RNA editing sites in nad5 sequences for selected land plant groups and data compiled from literature. * indicates data has been obtained from Steinhauser et al. (1999), while the remaining data has been collected for the purpose of this study. Bold print indicates that cDNAs were available for direct comparison and edited sites reported for those samples are observed rather than predicted. A (?) indicates samples for which the T-C sites are only predicted.

Sample                              Total Number of Sites   C-T   T-C   % T-C

Ferns
Psilotum                            17                      16    1     5.9
Botrychium                          13                      13    0     0

Lycophytes
Lycopodium                          9                       9     0     0

Hornworts
Nothoceros                          40                      31    9     22.5
Megaceros fuegiensis                45                      35    10    22.2
Megaceros flagellaris               43                      34    9     20.9
Phaeoceros fimbriatus               45                      37    8     17.7
Phaeoceros bulbiculosus             54                      41    13    24.1
Phaeoceros pearsonii                35                      25    10    28.6
Phaeoceros spp.                     40                      17    23    57.5
Phaeoceros spp.2                    46                      20    26    56.5
Phaeoceros carolinianus             42                      17    25    59.5
Anthoceros punctatus                52                      34    18    34.6
*Anthoceros husnotii-AJ000697       37                      32    18    48.6
Leiosporoceros dussii               8                       5     3     37.5

Mosses
*Polytrichum formosum-AJ001228      2                       2     0     0
*Thuidium tamariscinum-AJ004809     3/4                     3     1?    25.0?
*Rhodobryum roseum-Z98964           6                       6     0     0
*Tetraphis pellucida-AJ224855       6                       6     0     0
Sphagnum                            7                       7     0     0
*Sphagnum fallax-AJ001225           7/9                     7     2?    22.2?

Liverworts
*Metzgeria conjugata-AJ000703       11                      11    0     0
*Fossombronia pusilla-AJ000699      13                      13    0     0
*Marchantia polymorpha-M68929       0                       0     0     0

Table 8. Comparison of RNA editing rate between the chloroplast gene, rbcL, and the mitochondrial gene, nad5, for selected taxa as observed and predicted in this study. RNA editing rates for both single taxa and plant groups are listed. * indicates data has been obtained from Steinhauser et al. (1999).

Sample                       rbcL   nad5

Ferns                        10     27
Psilotum                     0      17
Botrychium                   2      13

Lycophytes                   13     9
Lycopodium                   5      9

Hornworts                    97     125
Nothoceros                   20     40
Megaceros fuegiensis         27     45
Megaceros flagellaris        25     43
Phaeoceros fimbriatus        35     45
Phaeoceros bulbiculosus      33     54
Phaeoceros pearsonii         28     35
Anthoceros punctatus         29     52
Leiosporoceros dussii        0      8

Mosses                       29     7
Sphagnum                     1      7
Thuidium                     2      3/4*

Liverworts                   10     11
Metzgeria conjugata          2      11*
Marchantia polymorpha        0      0*

CHAPTER IV

DISCUSSION

Describing Editing Sites: a conservative approach

A conservative approach was used in this study to determine the total number of RNA editing sites, and therefore the numbers reported here are likely underestimates. For example, changes at third base positions were not included in the analysis, so any edited sites at these positions are absent from the results. According to Gray and Covello (1993), editing sites are not typically observed at the 3rd position in plants. This is likely because changes at the third position within the codon often do not affect the amino acid sequence. Therefore, the potential for silent edited sites at third base positions is low. While this seems to be the case for RNA editing in plants, RNA editing in other groups does not behave similarly. For example, Liu and Bundschuh (2005) found that a mold preferentially edited bases at the 3rd codon position.

Predicting RNA Edited Sites: problems encountered

The prediction method was extremely useful in targeting RNA editing sites in samples for which cDNAs were not available. However, it was not possible to apply this method to all sites. In fact, it is possible that some sites that appear to be editing sites may simply represent point mutations that are not “repaired” by editing. Therefore, when an amino acid change was found at a potential editing position in a sample without the corresponding cDNA available, pairs of cDNA/genomic sequences from the same plant lineage were examined whenever possible. Any site in question was first examined to verify that the conversion appeared only in the genomic sequence and not in the cDNA sequence. When a point mutation was found in both the genomic and cDNA sequences, the site was discarded as a predicted editing site. For example, site 752 was originally predicted, based on the large sequence alignment, as a C-to-T editing site for Gnetum gnemon and several of the ferns, with the resulting amino acid change alanine to valine. However, further examination of the cDNAs available for G. gnemon and Psilotum nudum revealed no editing of the transcripts, and thus the amino acid differences likely represent real substitutions. In addition, the alga Tolypella also exhibited a C-to-T change at site 752, which raises a further red flag for this site given the presumed lack of editing among the algal ancestors. Therefore, since it appeared to have alternative functional states, site 752 was not included in the final analysis.

Other factors potentially affecting the detection of edited sites were errors introduced through the molecular methods used in acquiring both the genomic and cDNA sequences. It has been shown that PCR, aside from introducing an occasional point mutation, may bias the number of edited sites detected by copying cDNA transcripts that are not edited to completion (Mower, 2005). If this occurs, the number of edited sites detected would be lower than that predicted using genomic sequences. Errors can also be introduced into transcript sequences through first-strand cDNA synthesis and subsequently through cloning and sequencing. In order to mitigate the potential for including methodological artifacts, multiple cDNAs were produced for each sample whenever possible. Since the likelihood of obtaining the same artifact in multiple sequences at the same location is extremely low, it was possible to spot these errors easily when all sequences were aligned.
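The idea of using multiple cDNA clones to spot methodological artifacts can be sketched as follows. This Python example is an assumption-laden illustration rather than the procedure used in this study: the function name classify_differences and the min_support threshold are hypothetical, the sequences are toy data, and a difference seen in only one clone is merely set aside for manual review, since a partially edited transcript could also produce a genuine edit in a single clone.

```python
from collections import Counter


def classify_differences(dna, cdnas, min_support=2):
    """Split cDNA/DNA differences by the number of clones that show them.

    dna:   aligned genomic sequence.
    cdnas: aligned cDNA clone sequences of the same length.
    Returns (supported, singletons); each entry is
    (1-based position, genomic base, cDNA base, number of clones).
    """
    supported, singletons = [], []
    for i, g in enumerate(dna):
        # Count the alternative bases observed at this position across clones.
        alternatives = Counter(c[i] for c in cdnas if c[i] != g)
        for base, n in alternatives.items():
            target = supported if n >= min_support else singletons
            target.append((i + 1, g, base, n))
    return supported, singletons


# Toy data: a C-to-T change in all three clones and a lone C-to-G change.
dna = "ATGTCACCA"
clones = ["ATGTTACCA", "ATGTTACCA", "ATGTTACGA"]
supported, singletons = classify_differences(dna, clones)
```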

Due to the progressive nature of RNA editing, many cDNAs are not fully edited (Duff, Branch, and Moore, unpublished data), and thus it is likely that some editing sites will have gone unnoticed when only limited numbers of cDNAs were examined. However, it is likely that all, or at least the majority, of the edited sites have been identified in the cDNAs generated in this study. This is evidenced by the fact that all of the sites predicted by comparison of the genomic DNA and amino acid sequences match the observed sites reported here. Additionally, the cDNA sequences revealed RNA editing positions not previously found either by prediction or by limited sequencing of cDNAs (Duff & Moore, 2005). Moreover, a study of large numbers of transcripts from a single specimen (Duff, Branch, and Moore, unpublished data) suggests that cDNA transcripts are quickly edited to completion, such that a survey of cDNAs shows they exhibit either only a few of the expected edited sites or nearly all of them.

Distribution of RNA Editing Sites: predicted and observed, in and between groups

It is evident that the RNA editing sites observed are not evenly distributed among the taxa examined. As has been reported previously, most of the edited sites are found in hornworts, which edit at 97 of the 122 sites identified here. However, the distribution of edited sites is irregular even amongst the hornworts (refer to Figure 2).

There does not seem to be a set pattern dictating the distribution of editing sites between the five groups examined. As shown in Table 6, not a single edited site is shared across hornworts, mosses, lycophytes, and ferns. This is in opposition to the findings of Gray and Covello (1993), who suggested that most RNA editing sites are conserved among plants. Another example of the limited number of chloroplast editing sites shared among plant groups was presented by Wolf et al. (2004). They showed that only 51 editing sites are shared between the hornwort, A. formosae, and the fern, A. capillus-veneris, while even fewer are shared among A. capillus-veneris and seed plants (5). The evidence presented here suggests that even sites unique to one lineage are not necessarily found in each sample examined within that group, or vice versa. For example, Takakia was found to have multiple edited sites which are unique to its group. In addition, editing has been well documented within the simple thalloid and leafy liverworts, though it is absent within the complex thalloid liverworts (Steinhauser et al., 1999). Compared to the mosses, Takakia stands out as having an abnormally high rate of editing within the chloroplast gene, rbcL. This is also true of the liverwort, Haplomitrium, within the mitochondrial gene, nad5, where it has been documented to have 65 C-T editing sites (Groth-Malonek et al., 2005).

RNA Editing Rate Across Land Plants and between Chloroplast and Mitochondrial

The literature suggests that the mechanism responsible for RNA editing within the mitochondria and chloroplast may be the same or shared (Gray & Covello, 1993). As many attempts at finding direct evidence of the mechanism have been unsuccessful, any correlation drawn between the mechanisms is merely speculation. However, groups which lack the ability to edit, such as the algae and the complex thalloid liverworts, lack editing in both of the organelles. In addition, groups that are found to edit in one organelle have consistently been shown to do so in both organelles. One of the original hypotheses presented here was that if the same mechanism were responsible for editing in both organelles, an increase in editing frequency in one organelle should be mirrored in the other organelle as well. Where data were available for both rbcL and nad5, the editing frequencies were comparable. For example, the rate of editing in rbcL of Phaeoceros bulbiculosus is elevated (33 sites), and this is reflected by an elevated rate of editing in nad5 (54 sites) (refer to Table 8).

Regression analysis was used to look for a relationship between the editing rate in rbcL and the editing rate in nad5. Within the hornwort clade, variation in the chloroplast gene, rbcL, explained a significant (P = 0.0016, ANCOVA) proportion (83%) of the variation in editing rate of the mitochondrial gene, nad5 (Figure 5). When all taxonomic groups were examined, approximately 91% of the variation in nad5 editing rate was explained by rbcL (P = 3.95 x 10^-8) (Figure 6), and the non-hornwort lineages clumped with Leiosporoceros.
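As a rough illustration of the regression described here, an ordinary least-squares line and its R2 can be computed as follows. This is a minimal sketch, not the analysis software used in this study: the helper names fit_line and predict are hypothetical, and the only values taken from this thesis are the hornwort coefficients reported with Figure 5 and the 33 rbcL sites of Phaeoceros bulbiculosus.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def predict(x, slope, intercept):
    """Value of the fitted line at x."""
    return slope * x + intercept


# Reported hornwort fit (y = 1.1947x + 10.832) evaluated at the
# 33 rbcL sites of Phaeoceros bulbiculosus.
print(round(predict(33, 1.1947, 10.832), 1))  # 50.3
```

The fitted hornwort line therefore predicts about 50 nad5 sites for this specimen, against the 54 reported in the text. Calling fit_line on the per-taxon counts in Table 8 would be the way to recompute the coefficients themselves.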

[Figure 5 plot: "Covariation of Editing Across Organelles: Hornworts"; x-axis, rbcL edited sites; y-axis, nad5 edited sites; fitted line y = 1.1947x + 10.832, R2 = 0.8301, P = 0.0016.]

Figure 5. Regression analysis examining the covariation of editing across organellar genomes for the hornwort clade.

[Figure 6 plot: "Covariation of Editing Across Organelles: All Taxa"; x-axis, rbcL edited sites; y-axis, nad5 edited sites; fitted line y = 1.3191x + 7.0871, R2 = 0.9087, P = 3.95 x 10^-8.]

Figure 6. Regression analysis examining the covariation of editing across organellar genomes for all taxonomic groups.


One outlier to the relationship between the editing rate in the chloroplast and mitochondria is Haplomitrium. Unlike in the mitochondrial gene, nad5, where Haplomitrium was reported to have 65 edited sites (Groth-Malonek et al., 2005), this study found only 3 editing sites in the chloroplast gene, rbcL. Surprisingly, with the exception of the complex thalloid liverworts, this was lower than the editing rate found in the other liverworts examined. Why might Haplomitrium be different? In order to answer this question, another must be asked: is it practical to use one gene as a representative for each organelle or should it only be used as a sample? One explanation for the presence of occasional outliers is that selection may occasionally cause an organelle with a reduced number of edits to sweep through the population. Originally, it was hoped that enough data would be generated to establish whether one gene can serve as a representative from which editing frequencies across entire genomes can be predicted. Unfortunately, there is not enough data to support or refute this assumption at this time. More studies examining the entire organellar genome of both the chloroplast and the mitochondria need to be conducted in order to gain a better picture of how RNA editing is distributed within plants.

Selection of RNA Editing Sites: what drives targeting of editing sites?

Why are some sites selected for RNA editing events while others are not? What are the determining factors flagging a prospective editing site? These are two intriguing questions concerning RNA editing in plants. Because the mechanism controlling RNA editing in plants has yet to be concretely described, it has been difficult to ascertain the exact factors associated with controlling and maintaining RNA editing sites. RNA editing is proposed to use the same mechanism in both the chloroplast and mitochondria (Bock, 2000). Therefore, one possible step toward understanding the mechanism would be to provide a better grasp of the distribution of edited sites within organellar genomes.

This mechanistic aspect of RNA editing becomes important when considering the numerous first and second base positions at which a T or a C exists in any land plant in the entire sequence alignment used in this study (refer to Appendix A). There are several first and second positions containing T’s and C’s that are not targeted for editing. There must be distinguishing characteristics associated with the editing sites in the genomic sequence to set them apart from the rest. Determination of why some sites are edited and others are not may reveal how editing occurs at a mechanistic level.

A number of studies have shown that sequence recognition factors neighboring the RNA editing sites must be present, in both the chloroplast and the mitochondria, in order for the sites to be properly recognized (Bock et al., 1997; Mizuki et al., 2004; Mulligan et al., 1999). The sequence length necessary for the recognition of an editing site in the chloroplast has been shown to be approximately 30 nucleotides upstream and 10 nucleotides downstream of the editing site (Hegeman et al., 2005a; Hirose et al., 2001; Miyamoto et al., 2004). Yet, there are no universally conserved features found among flowering plants, in either the chloroplast or mitochondrial genomes, in this sequence region (Hegeman et al., 2005b; Hermann & Bock, 1999; Mulligan et al., 1999).

Several transgenic and in vitro studies have examined the effects of altering the length and composition of this sequence region, at both the 5’ and 3’ ends of editing sites, to evaluate the efficiency of RNA editing at selected sites in the chloroplast and mitochondria of flowering plants (Hegeman et al., 2005b; Hermann & Bock, 1999; Mulligan et al., 1999). Experiments introducing sequence mutations upstream and downstream of the editing site in pea mitochondria have shown that the sequence region between -5 and -15 nucleotides upstream of the editing site is essential for the occurrence of RNA editing, while alterations upstream of -35 nucleotides decrease the efficiency of RNA editing at putative sites (Takenaka et al., 2004). To date, it is thought that the majority of the recognition factors lie in the sequence 5’ of the editing site in both the mitochondria and chloroplast (Mulligan et al., 1999; Takenaka et al., 2004).

Taking this one step further, is the nucleotide make-up of the sequence, or the length of the sequence before, after, or between editing sites, more important in locating editing sites? Hermann and Bock (1999) showed that deletion of only 1 nucleotide in the spacer region between two editing sites caused a loss of editing at the second site. They also showed that if a “new” C was positioned, through insertion or deletion of 5’ sequence downstream of the cis-element, at the proper distance it would be targeted for editing. So, it seems that the distance of the editing site from its cis-element is more important than the actual nucleotide sequence itself.

Another idea put forth, based on findings in trypanosomes (Kable et al., 1996), has inspired investigators to search for “guide” RNAs which could be directing the editing apparatus to potential editing sites in plant organellar editing (Mulligan et al., 1999). Antisense RNAs are thought to act as a “guide” for editing machinery in the targeting of RNA editing sites by complementary binding to the RNA transcript itself (Gualberto et al. 1990; Mulligan et al. 1999). Gualberto et al. (1990) used hybridization techniques in order to identify any potential antisense RNA templates, but none were found. To date, there have not been any “guide” RNAs identified in association with RNA editing sites. Consequently, if there are additional elements, such as antisense RNAs, that associate with the RNA transcript to aid in the targeting of editing sites, they still remain a mystery.

Upstream and Downstream Base Preferences and RNA Editing in Land Plants

As discussed above, there has been an interest in looking at nearest neighbor effects to possibly predict the likely location of editable sites (Mulligan et al., 1999). In order to determine if there was a nucleotide preference at the -1 and +1 positions, relative to the editing site, an evaluation of each editing site was conducted. Due to the limited number of editing sites in the other groups, only the hornworts, the mosses, and Takakia were used in this evaluation. If the nucleotides within the -1 to -15 region are most critical for proper recognition of editing sites and the identity of those nucleotides has an effect on editing, the -1 position within these three groups should reflect a similarity in preference. The results supported this hypothesis, as all three groups showed a definite preference at the -1 nucleotide position for thymine (see Figures 7, 8, and 9). However, as it has been well documented that the sequence 3’ of the editing site may have little effect on the editing efficiency of the site, the +1 position was not expected to be completely conserved across the groups examined. Though both the hornworts and the mosses showed a definite preference for adenine, Takakia showed a slight preference for guanine (refer to Figures 7, 8, and 9).
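The tally of -1 and +1 neighbors described above can be sketched as follows. This is an illustrative sketch only: the function name neighbor_counts, the example sequence, and the site positions are hypothetical and are not data from this study.

```python
from collections import Counter


def neighbor_counts(seq, sites):
    """Count the bases immediately 5' (-1) and 3' (+1) of each editing site.

    seq   -- nucleotide sequence as a string
    sites -- 0-based indices of editing sites within seq
    Returns (upstream_counts, downstream_counts) as Counter objects.
    """
    seq = seq.upper()
    upstream, downstream = Counter(), Counter()
    for i in sites:
        if i > 0:                      # a site at the very start has no -1 base
            upstream[seq[i - 1]] += 1
        if i < len(seq) - 1:           # a site at the very end has no +1 base
            downstream[seq[i + 1]] += 1
    return upstream, downstream


# Hypothetical sequence with three hypothetical C-to-U editing sites.
example_seq = "ATCAGTCATTCG"
example_sites = [2, 6, 10]
up, down = neighbor_counts(example_seq, example_sites)
print(dict(up))    # {'T': 3}
print(dict(down))  # {'A': 2, 'G': 1}
```

Summing such counts over every sample in a group would give the kind of totals plotted in the upstream and downstream bar charts for each clade.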

[Figure 7 chart: "Upstream and Downstream Base Frequency for the Hornwort Clade"; number of G, C, T, and A bases at the upstream (HW Upstream) and downstream (HW Downstream) positions, edited sites only.]

Figure 7. Graphical representation of bases positioned upstream and downstream of all the reported editing sites, both observed and predicted, for all samples included in this study within the hornwort clade.

[Figure 8 chart: "Upstream and Downstream Base Frequency for the Moss Clade"; number of G, C, T, and A bases at the upstream (MS Upstream) and downstream (MS Downstream) positions, edited sites only.]

Figure 8. Graphical representation of bases positioned upstream and downstream of all the reported editing sites, both observed and predicted, for all samples included in this study within the moss clade.

[Figure 9 chart: "Upstream and Downstream Base Frequency for the Moss sample Takakia"; number of G, C, T, and A bases at the upstream (TK Upstream) and downstream (TK Downstream) positions, edited sites only.]

Figure 9. Graphical representation of bases positioned upstream and downstream of all the reported editing sites, both observed and predicted, for Takakia.

Second Base Sites and RNA Editing

After reviewing the upstream and downstream base preferences among editing sites and the prevalence of RNA editing occurring at second base positions, the question can be posed: is there an apparent force driving the conservation of the -1 and +1 nucleotides in edited vs. non-edited 2nd base positions? Does the T_A context preference hold true for those 2nd base positions that are not affected by RNA editing? Tillich et al. (2006) looked for such a signature in C-T editing sites in the chloroplast of several seed plants, a fern, and a hornwort. They found that approximately 70% of C-T editing sites in seed plants occur in a T_A context, but to a lesser extent in both the fern and hornwort. To address these questions, three hornwort samples were examined at each 2nd base position where thymine was conserved, or in other words at any site that could potentially be a C-T editing site (refer to Appendix A for applicable sites). The -1 and +1 nucleotide at each of these sites was documented for each of the three samples along with the occurrence of an editing event, if one occurred.

In all three samples examined (PBUEU, PPECA, and ANTH), edited 2nd base sites showed a preference for thymine at the -1 nucleotide position (Figures 10, 11, and 12). This is consistent with the overall findings for all editing sites examined for the data presented here. This trend was not observed at the -1 nucleotide position in non-edited sites. Rather, the likelihood that a T occurred was only slightly higher than the 25% that would be expected if no selection pressure were acting on the site (Figures 10, 11, and 12). As was the case above, there is no detectable 3’ preference found at the editing sites in the sample PBUEU (Figure 10). However, there were detectable preferences found in both PPECA (T) and ANTH (A), but they were not shared (Figures 11 and 12). Again, there was a slight preference for thymine at the +1 position (Figures 10, 11, and 12).
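The comparison with the 25% expectation can be made explicit with an exact binomial tail probability. The sketch below is an assumption about how such a check could be done, not a test that was run in this study; the function name binom_upper_tail and the example counts are hypothetical.

```python
from math import comb


def binom_upper_tail(k, n, p=0.25):
    """Probability of observing k or more successes in n trials.

    With p = 0.25 this is the chance of seeing at least k thymines at the
    -1 position among n sites if all four bases were equally likely.
    """
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts: 20 of 30 edited sites preceded by T,
# versus 30 of 100 non-edited sites preceded by T.
print(binom_upper_tail(20, 30))   # very small: strong departure from 25%
print(binom_upper_tail(30, 100))  # much larger: close to the 25% expectation
```

A small tail probability for the edited sites together with a large one for the non-edited sites would correspond to the pattern described in the text, although the equal-frequency null model ignores the base composition of the gene.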

[Figure 10 chart: "-1 and +1 Nucleotide Position of Edited vs. Non-edited Sites"; number of T, G, C, and A bases at the -1 and +1 nucleotides of edited and unedited sites in PBUEU.]

Figure 10. Base composition of the -1/+1 nucleotide in relation to the edited and non-edited second base position sites where C-T editing was known to occur. This graph illustrates results within the hornwort sample PBUEU.

[Figure 11 chart: "-1 and +1 Nucleotide Position of Edited vs. Non-edited Sites"; number of T, G, C, and A bases at the -1 and +1 nucleotides of edited and unedited sites in PPECA.]

Figure 11. Base composition of the -1/+1 nucleotide in relation to the edited and non-edited second base position sites where C-T editing was known to occur. This graph illustrates results within the hornwort sample PPECA.

[Figure 12 chart: "-1 and +1 Nucleotide Position of Edited vs. Non-edited Sites"; number of T, G, C, and A bases at the -1 and +1 nucleotides of edited and unedited sites in ANTH.]

Figure 12. Base composition of the -1/+1 nucleotide in relation to the edited and non-edited second base position sites where C-T editing was known to occur. This graph illustrates results within the hornwort sample ANTH.


The evolution and maintenance of RNA editing in plant organelles

Given the observed frequency of RNA editing among hornwort plastid and mitochondrial genes, and even editing among other land plants, one might ask what the selective advantage of editing may be. There are a few hypotheses for the origin and maintenance of RNA editing. One thing is apparent, though: once RNA editing was established in these lineages, they must continue to edit these sites to maintain functional proteins. For example, while working with the hornwort Anthoceros formosae, Yoshinaga et al. (1996) noticed that RNA editing was “correcting” early termination (stop) codons (Figure 2), resulting in complete translation of the encoded sequence. This phenomenon has also been discovered recently across members of the entire hornwort clade, with the exception of Leiosporoceros (Duff & Moore 2005). Although less obvious, an additional benefit of RNA editing is its ability to maintain a conserved state within the amino acid sequence. Twenty editing sites were found, accounting for an alteration of approximately 4.0% of the amino acid sequence examined by Yoshinaga et al. (1996). In a second chloroplast gene, atpB, 29 editing sites were found within the mRNA sequence of A. formosae (Yoshinaga et al. 1997). Of the 29 sites, “reverse” conversions accounted for 14 of the alterations (Yoshinaga et al. 1997). These changes resulted in a 5.7% change in the amino acid constitution of this gene (Yoshinaga et al. 1997).

Using a mathematical model to illustrate the evolution of codon preference, Liu and Bundschuh (2005) suggest that plants are not able to randomly acquire editing sites due to the expense incurred by the editing machinery. Therefore, they hypothesize that evolutionary selection must be directed towards conservation at the protein level. My data supported this hypothesis. The sequence alignment identifies random point mutations at both the 5’ and 3’ ends of highly conserved editing sites; this reinforces the lack of conservation of nucleotide sequence across selected taxa (unreported data). In addition, at several positions in the alignment (Appendix A) RNA editing “fixes” early termination codons, as reported in the literature, so that the proper reading frame is restored (refer to Figure 2 for identification of early termination codons). Furthermore, the majority of the editing sites are transformed so that the improper amino acid coded for in the genomic sequence is converted to the evolutionarily conserved amino acid. These findings are consistent with the literature for both the chloroplast and the mitochondrial genomes (Duff, 2006; Duff & Moore 2005). However, regardless of whether discussion is directed at the point mutations or the targeted codon, the amino acid produced was that of the conserved state. This is consistent with the hypothesis that selection is working at the protein level.
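The way a single "reverse" (U-to-C) edit can remove an early termination codon can be shown with a short translation sketch. It assumes the standard genetic code and uses a hypothetical nine-nucleotide sequence; the helper names translate and apply_edits are illustrative and nothing here is taken from the rbcL or nad5 alignments.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate an in-frame sequence; '*' marks a stop codon, 'X' an unknown codon."""
    seq = seq.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def apply_edits(seq, edits):
    """Return seq with edits applied; edits maps 0-based index -> new base."""
    bases = list(seq.upper())
    for index, new_base in edits.items():
        bases[index] = new_base
    return "".join(bases)


# Hypothetical genomic sequence with an internal TAA stop codon.
genomic = "ATGTAAGGA"
edited = apply_edits(genomic, {3: "C"})  # T-to-C (U-to-C) edit at codon position 1

print(translate(genomic))  # M*G
print(translate(edited))   # MQG
```

Comparing the translated genomic and edited sequences against the conserved amino acid at each position is one way to score whether an edit restores the conserved state.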

Even if selection is acting on the protein to maintain the need for RNA editing sites, there is still the question as to whether RNA editing sites are maintained for long periods of time or if individual sites are lost and gained multiple times. Duff and Moore (2005) for rbcL and Duff (2006) for nad5 clearly demonstrated that RNA editing sites in hornworts are not maintained across all hornworts but rather appear to have been lost and gained multiple times. The data presented show the same pattern at larger time scales (Figure 2). So despite the ability to edit and the need to maintain specific sites to keep proper protein structure, back mutations that recreate the conserved DNA sequence abolish editing sites. In this sense, at the level of the DNA sequence the editing sites appear to evolve in a neutral manner (Duff and Moore, 2005; Duff 2006; Fiebig et al. 2004).

Conclusions and Future Research

The data presented here show that RNA editing is present in the chloroplast and mitochondrial genomes of nearly all land plants. A key to this study was the number of cDNA sequences generated, which lent a more precise account of RNA editing in the chloroplast gene, rbcL. By increasing the sample pool examined, additional editing sites were found that were not previously reported in hornworts for the chloroplast gene, rbcL.

Consistent with literature, the majority of RNA editing sites in most land plant lineages occur in a C-T fashion while few T-C editing sites were found with the exception of the hornworts.

For the most part, the rate and distribution of edited sites varied greatly among taxa examined in this study. As in the hornwort clade (Duff & Moore, 2005; Duff, 2006), the lack of any shared editing sites among the selected plant groups could suggest that the ability to gain and lose editing sites is a common phenomenon within land plants.

While the mechanism responsible for RNA editing in plant organellar genomes still remains a mystery, certain preferences at or surrounding editing sites have been well documented. This study, consistent with literature, detected a definite T_A context bias at the -1/+1 nucleotide positions of editing sites that is absent in non-edited sites.

In order to gain a better understanding of the impacts associated with RNA editing within plant organellar genomes, it will be vital that future investigations sample a broader range of land plant lineages and increase the number of specimens examined per group. While this study greatly increased the number of taxa for which RNA editing has been examined, many more taxa will need to be examined to produce a more complete picture of the distribution and to be able to postulate about the origin and maintenance of RNA editing across land plant lineages. In addition to increasing sample diversity, there needs to be a greater variety of genes examined from each of the organelles. Ideally, the sequencing of whole genomes would aid in a more accurate prediction of editing sites while allowing for greater statistical sampling.

REFERENCES

Bock, Ralph. 2000. Sense from nonsense: How the genetic information of chloroplasts is altered by RNA editing. Biochimie. 82:549-557.

Bock, Ralph. Marita Hermann and Melanie Fuchs. 1997. Identification of critical nucleotide positions for plastid RNA editing site recognition. RNA. 3:1194-1200.

Börner, Valentin. Svante Pääbo. 1996. Evolutionary fixation of RNA editing. Nature. 383:225.

Duff, R. Joel. Francisco B.-G. Moore. 2005. Pervasive RNA editing among hornwort rbcL transcripts except Leiosporoceros. Journal of Molecular Evolution. 61:571-578.

Duff, R. Joel. 2006. Divergent RNA editing frequencies in hornwort mitochondrial nad5 sequences. Gene. 366:285-291.

Fiebig, Andreas. Sandra Stegemann, and Ralph Bock. 2004. Rapid evolution of RNA editing sites in a small non-essential plastid gene. Nucleic Acids Research. 32(12):3615-3622.

Freyer, Regina. Marie-Christine Kiefer-Meyer and Hans Kössel. 1997. Occurrence of plastid RNA editing in all major lineages of land plants. Proceedings of the National Academy of Sciences of the United States of America. 94(12):6285-6290.

Giege, Philippe. Axel Brennicke. 1999. RNA editing in Arabidopsis mitochondria effects 441 C to U changes in ORFs. Proceedings of the National Academy of Sciences of the United States of America. 96(26):15324-15329.

Gilbert, D. 1993. SeqApp. Version 1.9a157. Biocomputing Office, Biology Department, Indiana State University, Bloomington.

Gray, Michael. 1996. RNA editing in plant organelles: a fertile field. Proc. Natl. Acad. Sci. USA. 93:8157-8159.


Gray, Michael W. Patrick S. Covello. 1993. RNA editing in plant mitochondria and chloroplasts. FASEB Journal. 7:64-71.

Gualberto, Jose M. Jacques-Henry Weil, and Jean-Michel Grienenberger. 1990. Editing of the wheat coxIII transcript – evidence for 12 C to U and one U to C conversions and for sequence similarities around editing sites. Nucleic Acids Research. 18(13):3771-3776.

Groth-Malonek, Milena. Dagmar Pruchner, Felix Grewe, and Volker Knoop. 2005. Ancestors of trans-splicing mitochondrial introns support serial sister group relationships of hornworts and mosses with vascular plants. Molecular Biology and Evolution. 22(1):117-125.

Hanson, Maureen R. Claudia A. Sutton and Bingwei Lu. 1996. Plant organelle gene expression: altered by RNA editing. Trends in Plant Science, Reviews. 1(2):57-64.

Hegeman, Carla E. Michael L. Hayes, and Maureen R. Hanson. 2005a. Substrate and cofactor requirements for RNA editing of chloroplast transcripts in Arabidopsis in vitro. The Plant Journal. 42:124-132.

Hegeman, Carla E. Christine P. Halter, Thomas G. Owens, and Maureen R. Hanson. 2005b. Expression of complementary RNA from chloroplast transgenes affects editing efficiency of transgene and endogenous chloroplast transcripts. Nucleic Acids Research. 33(5):1454-1464.

Hermann, Marita. Ralph Bock. 1999. Transfer of plastid RNA-editing activity to novel sites suggests a critical role for spacing in editing-site recognition. Proceedings of the National Academy of Sciences of the United States of America. 96(9):

Hiesel, Rudolf. Bernd Wissinger, Wolfgang Schuster, and Axel Brennicke. 1989. RNA editing in plant mitochondria. Science. 246(4937):1632-1634.

Hiesel, Rudolf. Bruno Combettes, and Axel Brennicke. 1994. Evidence for RNA editing in mitochondria of all major groups of land plants except the Bryophyta. Proc. Natl. Acad. Sci. USA. 91:629-633.

Hirose, Tetsuro. Masahiro Sugiura. 2001. Involvement of a site-specific trans-acting factor and a common RNA-binding protein in the editing of chloroplast mRNAs: development of a chloroplast in vitro RNA editing system. EMBO Journal. 20(5):1144-1152.

Hoch, Brigitte. Rainer M. Maier, Kurt Appel, Gabor L. Igloi, and Hans Kössel. 1991. Editing of a chloroplast mRNA by creation of an initiation codon. Nature. 353:178-180.

Kable, Moffett L. Scott D. Seiwert, Stefan Heidmann, and Kenneth Stuart. 1996. RNA editing: a mechanism for gRNA-specified uridylate insertion into precursor mRNA. Science. 274(5284):1189-1195.

Kugita, Masanori. Yuhei Yamamoto, Takeshi Fujikawa, Tohoru Matsumoto, and Koichi Yoshinaga. 2003. RNA editing in hornwort chloroplasts makes more than half the genes functional. Nucleic Acids Research. 31(9):2417-2423.

Liu, Tsunglin. Ralf Bundschuh. 2005. Model for codon bias in RNA editing. Physical Review Letters. 95:088101(1-4).

Malek, O. K. Lattig, R. Hiesel, A. Brennicke, and V. Knoop. 1996. RNA editing in bryophytes and a molecular phylogeny of land plants. EMBO J. 15(6):1403-1411.

Maas, Stefan. Alexander Rich. 2000. Changing genetic information through RNA editing. BioEssays. 22:790-802.

Mower, Jeffrey P. 2005. PREP-mt: predictor RNA editor for plant mitochondrial genes. BMC Bioinformatics. doi:10.1186/1471-2105-6-96

Miyamoto, T. Obokata, J. and Sugiura, M. 2004. A site-specific factor interacts differently with its cognate RNA editing site in chloroplast transcripts. Proceedings of National Academy of Science USA. 101:48-52.

Mulligan, R. M. M. A. Williams, and M. T. Shanahan. 1999. RNA editing site recognition in higher plant mitochondria. The Journal of Heredity. 90(3):338-344.

Neuwirt, Julia. Mizuki Takenaka, Johannes A. Van Der Merwe, and Axel Brennicke. 2005. An in vitro RNA editing system from cauliflower mitochondria: editing site recognition parameters can vary in different plant species. RNA. 11:1563-1570.

Peeters, Nemo M. Maureen R. Hanson. 2002. Transcript abundance supercedes editing efficiency as a factor in development variation of chloroplast gene expression. RNA. 8:497-511.

Reed, Martha L. Sangbom M. Lyi, and Maureen R. Hanson. 2001. Edited transcripts compete with unedited mRNAs for trans-acting editing factors in higher plant chloroplasts. Gene. 272:165-171.

Schuster, Wolfgang. Rudolf Hiesel, Bernd Wissinger, and Axel Brennicke. 1990. RNA editing in the cytochrome b locus of the higher plant Oenothera berteriana includes a U-to-C transition. Molecular and Cellular Biology. 10(5):2428-2431.

Schuster, Wolfgang. Rainer Ternes, Volker Knoop, Rudolf Hiesel, Bernd Wissinger, and Axel Brennicke. 1991. Distribution of RNA editing sites in Oenothera mitochondrial mRNAs and rRNAs. Current Genetics. 20:397-404.

Shields, Denis C. Kenneth H. Wolfe. 1997. Accelerated evolution of sites undergoing mRNA editing in plant mitochondria and chloroplasts. Molecular Biology and Evolution. 14(3):344-349.

Steinhauser, Siegfried. Susanne Beckert, Ingrid Capesius, Olaf Malek, and Volker Knoop. 1999. Plant mitochondrial RNA editing. Journal of Molecular Evolution. 48:303-312.

Szmidt, Alfred E. Meng-Zhe Lu and Xiao-Ru Wang. 2001. Effects of RNA editing on the coxI evolution and phylogeny reconstruction. Euphytica. 118:9-18.

Takenaka, Mizuki. Julia Neuwirt, and Axel Brennicke. 2004. Complex cis-elements determine an RNA editing site in pea mitochondria. Nucleic Acids Research. 32(14):4137-4144.

Tillich, Michael. Pascal Lehwark, Brian R. Morton, and Uwe G. Maier. 2006. The evolution of chloroplast RNA editing. Molecular Biology and Evolution. 23(10):1912-1921.

Tsudzuki, Takahiko. Tatsuya Wakasugi and Masahiro Sugiura. 2001. Comparative analysis of RNA editing sites in higher plant chloroplasts. Journal of Molecular Evolution. 53:327-332.

Wolf, Paul G. Carol A. Rowe, and Mitsuyasu Hasebe. 2004. High levels of RNA editing in a chloroplast genome: analysis of transcripts from the fern Adiantum capillus-veneris. Gene. 339:89-97.

Yoshinaga, Koichi. Hiroe Iinuma, Takehiro Masuzawa, and Kunihiko Ueda. 1996. Extensive RNA editing of U to C in addition to C to U substitution in the rbcL transcripts of hornwort chloroplasts and the origin of RNA editing in green plants. Nucleic Acids Research. 24(6):1008-1014.

APPENDICES

APPENDIX A

ALIGNMENT OF SELECTED TAXA DNA AND CDNA SEQUENCES FOR RBCL

10 30 50 70 90 GNETdna11 GGATTCCAAGCTGGTGTTAA AGATTATAGAttaacttATT ATACTCCGGAGTATCAGACT AAGGATACTGATATCTTGGC AGCATTCCGAGTAACACCGC GNETcdna1 gGATTCCAAGCTGGTGTTAA AGATTATAGATTaaCTTATT ATACTCCGGAGTATCAGACT AAGGATACTGATATCTTGGC AGCATTCCGAGTAACACCGC GNETcdna1 GGATTcCAAGCTGGTGTTaA AGATtATAGATTaACTTATt ATACTCCGGAGTATCAGACT AAGGATACTGATATCTTGGC AGCATTCCGAGTAACACCGC LYCOdna10 GGATTCAAAGCTGGCGTTAG GGATTACAGATTAACTCATT ACACTCCTGATTATAAGACC AAAGAAACCGATATTCTGGC AGCATTTCGAATGACTCCTC LYCOcdna1 GGATTCAAAGCTGGCGTTAG GGATTACAGATTAACTTATT ACACTCCTGATTATAAGACC AAAGAAACCGATATTCTGGC AGCATTTCGAATGACTCCTC LYCOcdna1 GGATTCAAAGCTGGCGTTAG GGATTACAGATTAACTTATT ACACTCCTGATTATAAGACC AAAGAAACCGATATTCTGGC AGCATTTCGAATGACTCCTC PSILOdna1 ------GGTGTTAA GGATTATCGATTGACCTATT ATACCCCTGATTATAAAGTC AGTGATACTGATATCCTAGC AGCATTTCGAATGACTCCTC PSILOcdna GGATTCAAAGCTGGTGTTAA GGATTATCGATTGACCTATT ATACCCCTGATTATAAAGTC AGTGATACTGATATCCTAGC AGCATTTCGAGTGACTCCTC PSILOcdna GGATTCAAAGCTGGTGTTAA GGATTATCGATTGACCTATT ATACCCCTGATTATAAAGTC AGTGATACTGATATCCTAGC AGCATTTCGAATGACTCCTC PSILOcdna GGATTCaaaGCTGGtGTTaA GGATTATCGATTGACCTATT ATACCCCTGATTATAAAGTC AGTGATACTGATATCCTAGC AGCATTTCGAATGACTCCTC BOTRYdna GGATTCAAAGCTGGTGTTAA AGATTATCGATTGACCTATT ACACTCCCGACTACAAAACC AAGGATACTGATATCCTAGC AGCTTTTCGAATGACTCCCC BOTRYcdna GGATTcAAAGCTGGTGTTaA AGATtATCGATTGACCtaTT aCACTCCCGACTaCAAAACC AAGGATACTGATATCCTAGC AACTTCTCGAATGACTCCCC THUIDdna GGATTTAAAGCTGGTGTTAA AGATTACAGATTAACTTATT ACACTCCAAATTATCAAACT TTAGAAACTGATATTTTGGC AGCATTTCGAATGACTCCTC THUIDcdna GGATTTAAAGCTGGTGTTAA AGATTACAGATTAACTTATT ACACTCCAAATTATCAGACT TTAGAAACTGATATTTTGGC AGCATTTCGAATGACTCCTC THUIDcdna GGATTTaAAGCTGGTGTTaA AGATTACAGATTAACTTATT ACACTCCAAATTATCAGACT TTAGAAACTGATATTTTGGC AGCATTTCGAATGACTCCTC TRFR1082d GGATTCAAAGCTGGTGTTAA AGATTATCGATTGACCTATT ACACTCCCAAGTATGAGACC AAAGACACCGATATCTTGGC AGCCTTTCGAATGACCCCGC TRFRcdna9 GGATTCAAAGCTGGTGTTAA AGATTATCGATTGACCTATT ACACTCCCAAGTATGAGACC AAAGACACCGATATCTTGGC AGCCTTTCGAATGACCCCGC TRFRcdna9 GGATTCAAAGCTGGTGTTAA 
AGATTATCGATTGACCTATT ACACTCCCAAGTATGAGACC AAAGACACCGATATCTTGGC AGCCTTTCGAATGACCCCGC TRFRcdna9 GGATTcAAAGCTGGTGTTaA AGATtATCGATTGACCTATT aCACTCCCAAGTATGAGACC AAAGACACCGATATCTTGGC AGCCTTCCGAATGACCCCGC PAUSIUdna GGATTCAAAGCTGGTGTCAA AGATTATCGATTGACTTATT ACACCCCCGAATACAAAACC AAAGATACCGATATCTTAGC AGCATTTCGAATGACCCCAC PAUSIUcdn GGATTCAAAGCTGGTGTCAA AGATTATCGATTGACTTATT ACACCCCCGAATACAAAACC AAAGATACCGATATCTTAGC AGCATTTCGAATGACCCCAC SELAG1dna GGATTCAAGGCTGGCGTTAA AGATTACAGATTAAcTTACT ACACCCCCGACTACGAAACC AAGGATACCGATATATTGGC AGCATTCCGAATGACCCCGC SELAG1cdn GGATTCAAGGCTGGCGTTAA AGATTACAGATTAACTTACT ACACCCCCGACTACGAAACC AAGGATACCGATATATTGGC AGCATTCCGAATGACCCCGC SELAG1cnd GGATTCAAGGCTGGCGTTAA AGATTACAGATTAACTTACT ACACCCCCGACtaCgAaACC AaGgAtaCCgATaTaTtGGc AGcATTcCgAATgACCCcGC SELAG3dna GGATTCAAGGCTGGCGCTAA AGATTACAGATTAACTTACT ACACCCCCGACTACGAAACC AAGGATACCGATATATTGGC AGCATTCCGAATGACCCCGC SELAG4dna GGATTCAAGGCTGGCGTTAA AGATTACAGATtaacTTaCT ACACCCCCGACTACGAAACC AAGGATACCGATATATTGGC AGCATTCCGAATGACCCCGC L4LDU05dn GGATTTAAAGCTGGTGTTAA AGATTATTGATTAACCTATT ATACTCCTGACTATGAGACC AAGGATACTGATATTTTAGC AGCGTTTCGAATGACTCCTC L4LDU05cD GGATTTAAAGCTGGTGTTAA AGATTATAGATTAACCTATT ATACTCCTGACTATGAGACC AAGGATACTGATATTTTAGC AGCGTTTCGAATGACTCCTC AformDNA GGATTTAAAGCTGGTGTTAA AGATTATAGATTAACCCATT ATACCCCTGATTACGAGACC AAGGATACTGATATTTTGGC AGCGTCTTGAATGACTCCTT AformCDNA GGATTTAAAGCTGGTGTTAA AGATTATAGATTAACCTATT ATACCCCTGATTACGAGACC AAGGATACTGATATTTTGGC AGCGTTTCGAATGACTCCTC NOTHCRdna GGATCTAAAGCTGGtGTTAA GGATCATAGATTAACCTACT ATACTCTTGAttATGAAACC AAGGATACTGATACTTTGGC AGCGTcTTGAATGAcTCCtc NOTHCRcdn GGATTTAAAGCTGGTGTTAA GGATTATAGATTAACCTAcT ATaCtCCTGATTATGAAaCC AAGGATaCTGATATTTTGGC AGCGTTTCGAATGACTCCtC MSPCHdna GgaTCTAAAGCTGGTGTTAA GGaTcAtaGAtTaACCtaCt ataCtcTTGAtTaTGAAACC AAGGATaCTGATaCTTTGGC AGCGTCTTGAATGACtcCtC MSPCHcdna GGATTTAAAGCTGGTGTTAA GGATTATAGATTAACCTACT ATACTCCTGATTATGAAACC AAGGATACTGATATTTTGGC AGCGTTTCGAATGACTCCTC MFUCHdnaR GGATCTAAAGCTGGTGTTAA GGATCATAGATTAACCTACT ATACTCTTGATTATGAAACC 
AAGGATACTGATACTTTGGC AGCGTCTTGAATGACTCCTC MFUCHcdna GGATTTAAAGCTGGTGTTAA GGATTATAGATTAACCTACT ATACTCTTGATTATGAAACC AAGGATACTGATACTTTGGC AGCGTTTTGAATGACTCCTC MFLAGAY86 GGATTTAAAGCTGGTGTTAA GGATCATAGATTAACCTACT ATACTCTTGATTATGAAACC AAGGATACTGATATTTTGGC AGCGTCTTGAATGACTCCTC MFLAGcdna GGATTTAAAGCTGGTGTTAA GGATTATAGATTAACCTACT ATACTCCTGATTATGAAACC AAGGATACTGATACTTTGGC AGCGTTTCGAATGACTCCTC MFLAGcdna GGATTTAAAGCTGGTGTTAA GGATTATAGATTAACCTACT ATACTCCTGATTATGAAACC AAGGATACTGATACTTTGGC AGCGTTTCGAATGACTCCTC PFIMBAY86 GGATCTAAAGCTGGTGTTAA GGATCATAGATTAACCTACT ATACTCTTGATTATGAAACC AAGGATACTGATACTTTGGC AGCGTCTTGAATGACTCCTC PFIMBcdna GGATTTAAAGCTGGTGTTAA GGATTATAGATTAACCTACT ATACTCCTGATTATGAAACC AAGGATACTGATATTTTGGC AGCGTTTCGAATGACTCCTC PFIMBcdna GGATTTAAAGCTGGTGTTAA GGATTATAGATTAACCTACT ATACTCCTGATTATGAAACC AAGGATACTGATATTTTGGC AGCGTTTCGAATGACTCCTC PCADCCdna GGATCTAAAGCTGGTGTTAA GGATCATAGATTAACCTACC ATACTCTTGATTATGAAACC AAGGATACTGATACTTTGGC AGCGTCTTGAATGACTCCTC PCADCCcdn GGATTTAAAGCTGGTGTTAA GGATTATAGATTAACCTACT ATACTCCTAATTATGAAACC AAGGATACTGATATTTTGGC AGCGTTTCGAATGACTCCTC PSPCH2dna GGATCTaAAGcTGGTGTTAA GGATCaTAGATTAACCTACT ATACTCTTGATTATGAAACC AAGGATACTGATACTTTGGC AGCGTCTTGAATGACTCCTC PSPCH2cdn GGATCTAAAGCTGGTGTTAA GGATCATAGATTAACCTACC ATACTCTTGATTATGAAACC AAGGATACTGATACTTTGGC AGCGTCTTGAATGACTCCTC

PSPCHMdna GGATCTAAAGCTGGTGTTAA GGATCATAGATTAACCTACT ATACTCTTGATTATGAAACC AAGGATACTGATACTTTGGC AGCGTCTTGAATGACTCCTC PSPCHMcdn GGATCTAAAGCTGGTGTTAA GGATCATAGATTAACCTACT ATACTCTTGATTATGAAACC AAGGATACTGATACTTTGGC AGCGTCTTGAATGACTCCTC PSPDO4dna GGAtCTAAAGCTGGTGTTAA GGAtCATAGATTAaCCTACC ATACTCTTGATTATGAAaCC AAGGATaCTGATaCTTTGGC AGCGTCTTGAATGACTCCTC PSPD04cdn GGATCTAAAGCTGGTGTTAA GGATCATAGATTAACCTACT ACACTCTTGATTATGAAACC AAGGATGCTGATACTTTGGC AGCGTCTTGAATGACTCCTC PPECAdnaJ GGATCTAAAGCTGGTGTTAA GGATCATAGATTAACCTACC ATACTCTTGATTATGAAACC AAGGATACTGATACTTTGGC AGCGTCTTGAATGACTCCTC PPECAcdna GGATTTAAAGCTGGTGTTAA GGATCATAGATTAACCTACC ATACTCTTGATTATGAAACC AAGGATACTGATACTTTGGC AGCGTCTTGAATGACTCCTC PPECAcdna GGATTTAAAGCTGGTGTTAA GGATTATAGATTAACCTACT ATACTCCTGATTATGAAACC AAGGATACTGATATTTTGGC AGCGTTTCGAATGACTCCTC PPECAcdna GGATTTAAAGCTGGTGTTAA GGATTATAGATTAACCTACT ATACTCCTGATTATGAAACC AAGGATACTGATATTTTGGC AGCGTTTTGAATGACTCCTC TAKAKIADN GGATTCAAAGCTGGTGTTAA AGATTACAGGTTAACTCATT ACACTCCGAATTATGGGACC AAGGATACCGATATTTCGGC AGCATTTCGGATGACTCCTC TAKAKpred GGATTCAAAGCTGGTGTTAA AGATTACAGGTTAACTCATT ACACTCCGAATTATGGGACC AAGGATACCGATATTTCGGC AGCATTTCGGATGACTCCTC SPAGdna10 GGATTTAAAGCTGGTGTTAA AGATTACAGGTTGACTTATT ACACCCCGGAGTATGTTGTC AAAGATACCGACATTTTGGC AGCATTTCGAATGACTCCTC SPAGcdna1 GGATTTaAAGCTGGTGTTaA GGATTaCAGGTTGACTtATT aCACCCCGGAGTATGTTGTc AAAGATACCGACATTTTGGC AGCATTTCGAATGACTCCTC SPAGcdna1 GGATTTaAAGCTGGTGTTaA AGATTaCAGGTTGACTtATT aCACCCCGGAGTATGTTGTc AAAGATACCGACATTTTGGC AGCATTTCGAATGACTCCTC SPAGcdna1 GGATTTAAAGCTGGTGTTAA AGATTACAGGTTGACTTATT ACACCCCGGAGTATGTTGTC AAAGATACCGACATTTTGGC AGCATTTCGAATGACTCCTC SPAGcdna1 GGATTTAAAGCTGGTGTTAA AGATTACAGGTTGACTTATT ACACCCCGGAGTATGTTGTC AAAGATACCGACATTTTGGC AGCATTTCGAATGACTCCTC SPAGcdna1 GGATTTaAAGCTGGTGTTaA AGATTaCAGGTTGACTtATT aCACCCCGGAGTATGTTGTc AAAGATACCGACATTTTGGC AGCATTTCGAATGACTCCTC HAPLOdna8 GGATTTAAAGCTGGTGTTAA AGATCACAGATTGACCCATT ACACCCCCGATTACAAGACC GAAGATACCGATATCTTAGC AGCGTTCCGAATGACTCCTC HAPLOCDNA GGATTTAAAGCTGGTGTTAA 
AGATTACAGATTGACCTATT ACACCCCCGATTACAAGACC GAAGATACCGATATCTTAGC AGCGTTCCGAATGACTCCTC Marchanti GGATTCAAAGCTGGTGTTAA AGATTATCGATTAACTTATT ACACTCCGGATTATGAGACC AAGGATACGGATATTTTAGC AGCATTTAGAATGACTCCTC Charavulg GGATTTAAAGCAGGGGTAAA AGATTACAGATTAACTTACT ATACTCCTGAGTATAAAACT AAAGATACTGACATTTTAGC TGCATTTCGTGTAACTCCAC

110 130 150 170 190 GNETdna11 AGCCCGGAGTACCGCCTGAG GAAGCAGGAGCAGCTGTAGC TGCTGAATCCTCCACCGGTA CATGGACCACTGTGTGGACT GATGGACTTACCAGTCTTGA GNETcdna1 AGCCCGGAGTACCGCCTGAG GAAGCAGGAGCAGCTGTAGC TGCTGAATCCTCCACCGGTA CATGGACCACTGTGTGGACT GATGGACTTACCAGTCTTGA GNETcdna1 AGCCCGGAGTACCGCCTGAG GAAGCAGGAGCAGCTGTAGC TGCTGAATCCTCCACCGGTA CATGGACCACTGTGTGGACT GATGGACTTACCAGTCTTGA LYCOdna10 AACTTGGAGTACCACCCGAG GAAGCGGGAGCCGCAGTAGC TGCTGAATCTTCCACTGGTA CATGGACTACTGTTTGGACC GATGGACTTACCAGTCTCGA LYCOcdna1 AACTTGGAGTACCACCCGAG GAAGCGGGAGCCGCAGTAGC TGCTGAATCTTCCACTGGTA CATGGACTACTGTTTGGACC GATGGACTTACCAGTCTCGA LYCOcdna1 AACTTGGAGTACCACCCGAG GAAGCGGGAGCCGCAGTAGC TGCTGAATCTTCCACTGGTA CATGGACTACTGTTTGGACC GATAGACTTACCAGTCTCGA PSILOdna1 AACCTGGAGTACCACCTGAG GAAGCAGGAGCTGCAGTAGC TGCTGAGTCTTCCACCGGTA CGTGGACCACTGTATGGACT GATGGACTTACCAGCCTGGA PSILOcdna AACCTGGAGTACCACCTGAG GAAGCAGGAGCTGCAGTAGC TGCTGAGTCTTCCACCGGTA CGTGGACCACTGTATGGACT GATGGACTTACCAGCCTGGA PSILOcdna AACCTGGAGTACCACCTGAG GAAGCAGGAGCTGCAGTAGC TGCTGAGTCTTCCACCGGTA CGTGGACCACTGTATGGACT GATGGACTTACCAGCCTGGA PSILOcdna AACCTGGAGTACCACCTGAG GAAGCAGGAGCTGCAGTAGC TGCTGAGTCTTCCACCGGTA CGTGGACCACTGTATGGACT GATGGACTTACCAGCCTGGA BOTRYdna AACCTGGAGTGCCAGCTGAG GAAGCGGGAGCCGCGGTAGC TGCTGAATCTTCCACCGGTA CGTGGACCACCGTATGGACC GATGGACTTACCAGCCTTGA BOTRYcdna AACCTGGAGTGCCAGCTGAG GAAGCGGGAGCCGCGGTAGC TGCTGAATCTTCCACCGGTA CGTGGACCACCGTATGGACC GATGGACTTACCAGCCTTGA THUIDdna AACCAGGAGTCCCACCTGAA GAGGCAGGAGCTGCGGTAGC TGCGGAATCTTCCACTGGTA CATGGACCACTGTTTGGACT GATGGACTTACTAGTCTTGA THUIDcdna AACCAGGAGTCCCACCTGAA GAGGCAGGAGCTGCGGTAGC TGCGGAATCTTCCACTGGTA CATGGACCACTGTTTGGACT GATGGACTTACTAGTCTTGA THUIDcdna AACCAGGAGTCCCACCTGAA GAGGCAGGAGCTGCGGTAGC TGCGGAATCTTCCACTGGTA CATGGACCACTGTTTGGACT GATGGACTTACTAGTCTTGA TRFR1082d AACCCGGAGTACCGCCTGAG GAAGCTGGAGCCGCAGTAGC TGCGGAATCTTCCACAGGTA CATGGACCACTGTATGGACG GATGGACTTACTAGTCTCGA TRFRcdna9 AACCCGGAGTACCGCCTGAG GAAGCTGGAGCTGCAGTAGC TGCGGAATCTTCCACAGGTA CATGGACCACTGTATGGACG GATGGACTTACTAGTCTCGA TRFRcdna9 
AACCCGGAGTACCGCCTGAG GAAGCTGGAGCTGCAGTAGC TGCGGAATCTTCCACAGGTA CATGGACCACTGTATGGACG GATGGACTTACTAGTCTCGA TRFRcdna9 AACCCGGAGTACCGCCTGAG GAAGCTGGAGCTGCAGTAGC TGCGGAATCTTCCACAGGTA CATGGACCACTGTATGGACG GATGGACTTACTAGTCTTGA PAUSIUdna AACCTGGAGTACCGGCTGAG GAAGCTGGAGCTGCGGTAGC CGCAGAATCCTCCACAGGTA CGTGGACAACTGTATGGACA GATGGGTTGACTAGCCTTGA PAUSIUcdn AACCTGGAGTACCGGCTGAG GAAGCTGGAGCTGCGGTAGC TGCAGAATCCTCCACAGGTA CGTGGACAACTGTATGGACA GATGGGTTGACTAGCCTTGA SELAG1dna AACCCGGCGTTCCCGCCGAG GAAGCAGGGGCCGCGGTAGC CGCGGAGTCCTCCACCGGTA CATGGACTACGGTCTGGACC GACGGGCTTACTAATCTTGA SELAG1cdn AACCCGGCGTTCCCGCCGAG GAAGCAGGGGCCGCGGTAGC CGCGGAGTCCTCCACCGGTA CATGGACTACGGTCTGGACC GACGGGCTTACTAATCTTGA SELAG1cnd AaCCCGGcGTTcCCGCCgAG gAAGcaGGGGCCGCGGtAGC CGCGgAGTCCTCCaCCGGtA CATGgActACGGTCTGgACC GACGGGCTtACtaATCTTgA SELAG3dna AACCCGGCGTTCCCGCCGAG GAAGCAGGGGCCGCGGTAGC CGCGGAGTCCTCCACCGGTA CATGGACTACGGTCTGGGCC GACGGGCTTACTAATCTTGA SELAG4dna AACCCGGCGTTCCCGCCGAG GAAGCAGGGGCCGCGGTAGC CGCGGAGTCCTCCACCGGTA CATGGACTACGGTCTGGACC GACGGGCTTACTAATCTTGA L4LDU05dn AACCAGGGGTGCCACCTGAA GAAGCAGGAGCTGCAGTGGC CGCAGAATCTTCAACAGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA L4LDU05cD AACCAGGGGTGCCACCTGAA GAAGCAGGAGCTGCAGTGGC CGCAGAATCTTCAACAGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA AformDNA AACCAGGGGTGCCACCTGAA GAAGCAGGAGCCGCAGTAGC TGCTGAATCTTCAACTGGTA CATGGACTACTGTTTGGACC GATGGACTTACCAGCCTTGA AformCDNA AACCAGGGGTGCCACCTGAA GAAGCAGGAGCCGCAGTAGC TGCTGAATCTTCAACTGGTA CATGGACTACTGTTTGGACC GATGGACTTACCAGCCTTGA NOTHCRdna AACCAGGGGTGCCAGCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAAtcTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGAcTTACCAGtCTTGA NOTHCRcdn AACCAGGGGTGCCAGCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGgacTTaCCAGTCTTGA MSPCHdna AACCAGGGGTGCCACCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAAtCTTcAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTaCCAGTCTTGA MSPCHcdna AACCAGGGGTGCCACCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA MFUCHdnaR AACCAGGGGTGCCAGCTGAA GAAGCAGGAGCCGCTGTAGC 
CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA MFUCHcdna AACCAGGGGTGCCAGCTGAA GAAGCAGGAGCCGCTGTAGC CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA MFLAGAY86 AACCAGGGGTGCCAGCTGAA GAAGCAGGAGCCGCTGTAGC CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA MFLAGcdna AACCAGGGGTGCCAGCTGAA GAAGCAGGAGCCGCTGTAGC CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA MFLAGcdna AACCAGGGGTGCCAGCTGAA GAAGCAGGAGCCGCTGTAGC CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA PFIMBAY86 AACCAGGGGTGCCACCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA 57 PFIMBcdna AACCAGGGGTGCCACCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA PFIMBcdna AACCAGGGGTGCCACCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA PCADCCdna AACCAGGGGTGCCAGCTGAA GAAGCAGGAGCCGCAGTAGC TGCTGAATCTTCAACAGGTA CATGGACCACTGTTTGGACT GATGGACTTACTAGTCTTGA PCADCCcdn AACCAGGGGTGCCAGTTGAA GAAGCAGGAGCCGCAGTAGC TGCTGAATCTTCAACAGGTA CATGGACCACTGTTTGGACT GATGGACTTACTAGTCTTGA PSPCH2dna AACCAGGGGTGCCACCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA PSPCH2cdn AACCAGGGGCGCCAGCTGAA GAAGCAGGAGCCGCAGTAGC TGCTGAATCTTCAACAGGTA CATGGACTACTGTTTGGACT GATGGACTTACTAGTCTTGA PSPCHMdna AACCAGGGGTGCCACCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA PSPCHMcdn AACCAGGGGTGCCACCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA PSPDO4dna AACCAGGGGCGCCAGCTGAA GAAGCAGGAGCCGCAGTAGC TGCTGAATCTTCAACAGGTA CATGGACTACTGTTTGGACT GATGGACTTACTAGTCTTGA PSPD04cdn AACCAGGGGTGCCACCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAATCTTCAACGGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA PPECAdnaJ AACCAGGAGTGCCAGCTGAA GAAGCAGGAGCCGCAGTAGC TGCTGAATCTTCAACAGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA PPECAcdna AACCAGGAGTGCCAGCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAATCTTCAACAGGTA CATGGACCACTGTTTGGACT 
GATGGACTTACCAGTCTTGA PPECAcdna AACCAGGAGTGCCAGCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAATCTTCAACAGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA PPECAcdna AACCAGGAGTGCCAGCTGAA GAAGCAGGAGCCGCAGTAGC CGCTGAATCTTCAACAGGTA CATGGACCACTGTTTGGACT GATGGACTTACCAGTCTTGA TAKAKIADN AACCTGGAGTGCCACCCGAG GAAGCGGGAGCTGCACTAGC TGCAGAATCTTCCACTGGTA CATGGACTACTGTTCGGACT GATGGACTTACCAGTCTCGA TAKAKpred AACCTGGAGTGCCACCCGAG GAAGCGGGAGCTGCACTAGC TGCAGAATCTTCCACTGGTA CATGGACTACTGTTTGGACT GATGGACTTACCAGTCTCGA SPAGdna10 AACCTGGAGTACCACCCGAG GAAGCAGGAGCTGCAGTAGC TGCGGAATCTTCCACCGGTA CATGGACTACCGTTTGGACT GATGGACTTACCAGTCTTGA SPAGcdna1 AACCTGGAGTACCACCCGAG GAAGCAGGAGCTGCAGTAGC TGCGGAATCTTCCACCGGTA CATGGACTACCGTTTGGACT GATGGACTTACCAGTCTTGA SPAGcdna1 AACCTGGAGTACCACCCGAG GAAGCAGGAGCTGCAGTAGC TGCGGAATCTTCCACCGGTA CATGGACTACCGTTTGGACT GATGGACTTACCAGTCTTGA SPAGcdna1 AACCTGGAGTACCACCCGAA GAAGCAGGAGCTGCAGTAGC TGCGGAATCTTCCACCGGTA CATGGACTACCGTTTGGACT GATGGACTTACCAGTCTTGA SPAGcdna1 AACCTGGAGTACCACCCGAG GAAGCAGGAGCTGCAGTAGC TGCGGAATCTTCCACCGGTA CATGGAATACCGTTTGGACT GATGGACTTACCAGTCTTGA SPAGcdna1 AACCTGGAGTACCACCCGAG GAAGCAGGAGCTGCAGTAGC TGCGGAACCTTCCACCGGTA CATGGACTACCGTTTGGACT GATGGACTTACCAGTCCTGA HAPLOdna8 AACCTGGGGTGCCGCCAGAG GAAGCAGGAGCAGCAGTAGC TGCTGAATCTTCCACCGGTA CATGGACCACCGTTTGGACT GATGGACTTACCAGTCTCGA HAPLOCDNA AACCTGGGGTGCCGCCAGAG GAAGCAGGAGCAGCAGTAGC TGCTGAATCTTCCACCGGTA CATGGACCACCGTTTGGACT GATGGACTTACCAGTCTCGA Marchanti AGCCTGGAGTTCCAGCGGAA GAAGCAGGCAACGCAGTTGC TGCTGAATCTTCAACTGGTA CATGGACTACAGTTTGGACT GATGGTCTTACTAACCTTGA Charavulg AACCTGGCGTTCCACCTGAA GAAGCAGGTGCTGCAGTAGC TGCAGAATCTTCTACTGGTA CATGGACTACTGTTTGGACT GACGGATTAACTAGTTTAGA

210 230 250 270 290 GNETdna11 TCGGTACAAAGGACGATGTT ACGACCTTGAGCCTGTTCCT GGAGAAGACAATCAATTTAT TGCTTATGTGGCGTATCCTT TGGACCTTTTCGAAGAGGGT GNETcdna1 TTGGTACAAAGGACGATGTT ACGACCTTGAGCCTGTTCCT GGAgAAGACAATCAATTTAT TGCTTATGTGGCGTATCCTT TGGACCTTTTCgAAGAGGGT GNETcdna1 TCGGTACAAAGGACGATGTT ACGACCTTGAGCCTGTTCCT GGAGAAGACAATCAATTTAT TGCTTATGTGGCGTATCCTT TGGACCTTTTCGAAGAGGGT LYCOdna10 TCGCTACAAAGGTCGGTGCT ATGAAATTGAACCTGTTGCT GGAGAAAAAAATCAATATAT TGCTTATGTAGCTTATCCTC TGGATCTGTTTGAGGAAGGT LYCOcdna1 TCGCTACAAAGGTCGGTGCT ATGAAATTGAACCTGTTGCT GGAGGAAAAAATCAATATAT TGCTTATGTAGCTTATCCTC TGGATCTGTTTGAGGAAGGT LYCOcdna1 TCGCTACAAAGGTCGGTGCT ATGAAATTGAACCTGTTGCT GGAGAAAAAAATCAATATAT TGCTTATGTAGCTTATCCTC TGGATCTGTTTGAGGAAGGT PSILOdna1 TCGTTACAAGGGTCGATGTT ATGGTATCGAACCCGTTGCT GGGGAAGAAAATCAATATAT TGCCTATGTAGCATATCCAT TGGATCTATTTGAGGAAGGT PSILOcdna TCGTTACAAGGGTCGATGTT ATGGTATCGAACCTGTTGCT GGGGAAGAAAATCAATATAT TGCCTATGTAGCATATCCAT TGGATCTATTTGAGGAAGGT PSILOcdna TCGTTACAAGGGTCGATGTT ATGGTATCGAACCTGTTGCT GGGGAAGAAAATCAATATAT TGCCTATGTAGCATATCCAT TGGATCTATTTGAGGAAGGT PSILOcdna TCGTTACAAGGGTCGATGTT ATGGTATCGAACCTGTTGCT GGGGAAGAAAATCAATATAT TGCCTATGTAGCATATCCAT TGGATCTATTTGAGGAAGGT BOTRYdna TcgTTACAAGGGTCGATGCT ATGAAATCGAACCTGTTCCC GGGGAGGAGAATCAATTCAT TGCTTATGTGGCGTATCCTC TAGATCTTTTTGAGGAAGGT BOTRYcdna TCGTTACAAGGGTCGATGCT ATGAAATCGAACCTGTTCCC GGGGAGGAGAATCAATTCAT TGCTTATGTGGCGTATCCTC TAGATCTTTTTGAGGAAGGT THUIDdna TCGTTATAAAGGTCGATGCT ATGATATTGAAGCAGTTCCT GGAGAAGAGAATCAATATAT CGCTTATGTTGCTTGCCCAT TAGATTTATTTGAAGAAGGT THUIDcdna TCGTTATAAAGGTCGATGCT ATGATATTGAAGCAGTTCCT GGAGAAGAGAATCAATATAT CGCTTATGTTGCTTACCCAT TAGATTTATTTGAAGAAGGT THUIDcdna TCGTTATAAAGGTCGATGCT ATGATATTGAAGCAGTTCCT GGAGAAGAGAATCAATATAT CGCTTATGTTGCTTACCCAT TAGATTTATTTGAAGAAGGT TRFR1082d TCGCTACAAAGGCCGATGCT ATGATATCGAACCTGTTGCT GGGGAGGATAATCAGTATAT TGCATATGTAGCTTATCCTT TGGATTTATTTGAAGAAGGT TRFRcdna9 TCGCTACAAAGGCTGATGCT ATGATaTCGAACCTGTTGCT GGGGAGGATAATCAGTATAT TGCATATGTAGCTTATCCTT TGGATTTATTTGAAGAAGGT TRFRcdna9 
TCGCTACAAAGGCCGATGCT ATGATATCGAACCTGTTGCT GGGGAGGATAATCAGTATAT TGCATATGTAGCTTATCCTT TGGATTTATTTGAAGAAGGT TRFRcdna9 TCGCTACAAAGGCCGATGCT ATGATATCGAACCTGTTGCT GGGGAGGATAATCAGTATAT TGCATATGTAGCTTATCCTT TGGATTTATTTGAAGAAGGT PAUSIUdna CCGTTACAAGGGCCGATGCT ACGACATCGAACCCGTCGCT GGAGAAGAAAACCAGTATAT CGCGTATGTAGCTTATCCTT TGGATCTATTCGAAGAAGGT PAUSIUcdn CCGTTACAAGGGCCGATGCT ACGACATCGAACCCGTCGCT GGAGAAGAAAACCAGTATAT CGCGTATGTAGCTTATCCTT TGGATCTATTCGAAGAAGGT SELAG1dna TCGTTACAAGGGGCGATGCT ATGACATCGAACCCGTGGCT GGAGAAAAGGACCAATATAT AGCCTACGTGGCCTACCCCC TGGATCTCTTCGAGGAGGGT SELAG1cdn TCGTTACAAGGGGCGATGCT ATGACATCGAACCCGTGGCT GGAGAAAAGGACCAATATAT AGCCTACGTGGCCTACCCCC TGGATCTCTTCGAAGAGGGT SELAG1cnd TCGtTACaAGGGGCAgTGCT aTgACaTCgAACCCGtGGCT GgAgAAAAGgACCaAtAtaT aGCCtACGTGGCCtACCCCC TGtATCTCTTCtAAtAGGGT SELAG3dna TCGTTACAAGGGGCGATGCT ATGACATCGAACCCGTGGCT GGAGAAAAGGACCAATATAT AGCCTACGTGGCCTACCCCC TGGATCTCTTCGAAGAGGGT SELAG4dna TCGTTACAAGGGGCGATGCT ATGACATCGAACCCGTGGCT GGAGAAAAGGACCAATATAT AGCCTACGTGGCCTACCCCC TGGATCTCTTCgAAGAGGGT L4LDU05dn TCGTTATAAAGGTCGATGCT ATGACATCGAGCCTGTTGCT GGAGAGGAAAATCAATATAT TGCTTATGTTGCTTATCCAT TAGATCTTTTTGAGGAGGGT L4LDU05cD TCGTTATAAAGGTCGATGCT ATGACATCGAGCCTGTTGCT GGAGAGGAAAATCAATATAT TGCTTATGTTGCTTATCCAT TAGATCTTTTTGAGGAGGGT AformDNA CCGTTACAAAGGTCGATGCT ATGACATTGAACCTGTTGCT GGAGAGGAAAATCAATATAT TGCTTATGCTGCTTATTCTT TAGATTTGTTTGAGGAGGGT AformCDNA CCGTTACAAAGGTCGATGCT ATGACATTGAACCTGTTGCT GGAGAGGAAAATCAATATAT TGCTTATGTTGCTTATCCTT TAGATTTGTTTGAGGAGGGT NOTHCRdna tCGTTATAAAGGTTGATGCT ATGACATCGAACCTGTTGCT GGGGAAGATAATCAATACaT TGCTTATGTTGCTTAtCCTT TAgATCTTCCTGAGGAAGGT NOTHCRcdn TCGTTATAAAGGTCGATGCT ATGACATcGAACCTgTTGCT GGGGAAGATAATCAATACAT TGCTTATGTTGCTTATCCTT TAGATCTTTTTGAGGAAGGT MSPCHdna tCGTTATAAAGGTTGATGCT ATGACAtcGAACCTGTTGCT GGGGAAGATAATCAAtaCAT TGCTTATGTTGCTCATCCTT TAGATCTTCCTGAGGAAGGT MSPCHcdna TCGTTATAAAGGTCGATGCT ATGACATCGAACCTGTTGCT GGGGAAGATAATCAATACAT TGCTTATGTTGCTTATCCTT TAGATCTTTTTGAGGAAGGT 58 MFUCHdnaR TCGTTATAAAGGTTGATGCT ATGACATCGAACCTGTTGCT 
GGGGAAGATAATCAATACAT TGCTTATGTTGCTCATCCTT TAGATCTTCCTGAGGAGGGT MFUCHcdna TCGTTATAAAGGTTGATGCT ATGACATCGAACCTGTTGCT GGGGAAGATAATCAATACAT TGCTTATGTTGCTTATCCTT TAGATCTTTTTGAGGAAGGT MFLAGAY86 TCGTTATAAAGGTTGATGCT ATGACATCGAACCTGTAGCT GGGGAAGATAATCAATACAT TGCTTATGTTGCTCATCCTT TAGATCTTTCTGAGGAAGGT MFLAGcdna TCGTTATAAAGGTCGATGCT ATGACATCGAACCTGTAGCT GGGGAAGATAATCAATACAT TGCTTATGTTGCTTATCCTT TAGATCTTTTTGAGGAAGGT MFLAGcdna TCGTTATAAAGGTCGATGCT ATGACATCGAACCTGTAGCT GGGGAAGATAATCAATACAT TGCTTATGTTGCTTATCCTT TAGATCTTTTTGAGGAAGGT PFIMBAY86 TCGTTATAAAGGTTGATGCT ATGACATCGAACCTGTTGCT GGGGAAGATAATCAATACAT TGCTTATGTTGCTCATCCTT TAGATCTTCCTGAGGAAGGT PFIMBcdna TCGTTATAAAGGTCGATGCT ATGACATCGAACCTGTTGCT GGGGAAGATAATCAATACAT TGCTTATGTTGCTTATCCTT TAGATCTTTTTGAGGAAGGT PFIMBcdna TCGTTATAAAGGTCGATGCT ATGACATCGAACCTGTTGCT GGGGAAGATAATCAATACAT TGCTTATGTTGCTTATCCTT TAGATCTTTTTGAGGAAGGT PCADCCdna TCGTTATAAAGGTTGATGCT ATGACATCGAACCTGTTGCT GGAGAAGAGAATCAATACAT TGCTTATGTTGCTTATCCTT TAGATCTCTCTGAGGAAGGT PCADCCcdn TCGTTATAAAGGTCGATGCT ATGACATCGAACCTGTTGCT GGAGAAGAGAATCAATACAT TGCTTATGTTGCTTATCCTT TAGATCTTTTTGAGGAAGGT PSPCH2dna TCGTTATAAAGGTTGATGCT ATGACATCGAACCTGTTGCT GGGGAAGATAATCAATACAT TGCTTATGTTGCTCATCCTT TAGATCTTCCTGAGGAAGGT PSPCH2cdn TCGTTATAAAGGTTGATGCT ATGACATCGAACCTGTTGCT GGAGAAGAGAATCAATACCT tGCTCATGTTGCTTATCCTT TAGATCTCTCTGAGGAAGGT PSPCHMdna TCGTTATAAAGGTTGATGCT ATGACATCGAACCTGTTGCT GGGGAAGATAATCAATACAT TGCTTATGTTGCTCATCCTT TAGATCTTCCTGAGGAAGGT PSPCHMcdn TCGTTATAAAGGTTGATGCT ATGACATCGAACCTGTTGCT AGGGAAGATAATCAATACaT TGCTTATGTTGCTCATCCTT TAGATCTTCCTGAGGAAGGT PSPDO4dna TCGTTATAAAGGTTGATGCT ATGACATCGAACCTGTTGCT GGAGAAGAGAATCAATACAT TGCTTATGTTGCTTATCCTT TAGATCTCTCTGAGGAAGGT PSPD04cdn TCGTTATAAAGGTTGATGCT ATGACATCgAACCTGTTGCT GGGGAAGATAATCAATACaT TGCTTATGTTGCTCATCCTT TAGATCTTCCTGAGGAAGGT PPECAdnaJ TCGTTATAAAGGTTGATGCT ATGACATCGAACCTGTTGCT GGAGAAGAGAATCAATACAT TGCTTATGTTGCTTATTCTT TAGATCTCTCTGAGGAAGGT PPECAcdna TCGTTATAAAGGTTGATGCT ATGACATCGAACCTGTTGCT GGAGAAGAGAATCAATACAT TGCTTATGTTGCTTATTCTT 
TAGATCTCTCTGAGGAAGGT PPECAcdna TCGTTATAAAGGTCGATGCT ATGACATCGAACCTGTTGCT GGAGAAGAGAATCAATACAT TGCTTATGTTGCTTATCCTT TAGATCTTTTTGAGGAAGGT PPECAcdna TCGTTATAAAGGTCGATGCT ATGACATCGAACCTGTTGCT GGAGAAGAGAATCAATACAT TGCTTATGTTGCTTATCCTT TAGATCTCTTTGAGGAAGGT TAKAKIADN TCGTTACAAGGGTCGATGCT ATGACATAGAGTCTGTTGCT GGAGAGAGAGATCAATATAT TGCCCATGTCGCTATTCCCT CAGATCCATTCGAAGAAGGT TAKAKpred TCGTTACAAGGGTCGATGCT ATGACATAGAGTCTGTTGCT GGAGAGAGAGATCAATATAT TGCCCATGTCGCTATTCCCT TAGATCCATTCGAAGAAGGT SPAGdna10 TCGTTATAAGGGTCGATGCT ACGATATCGAGCCTGTAGCT GGAGAAGAAAATCAATATAT TGCTTATGTTGCTTATCCTT CAGATTTATTCGAAGAAGGT SPAGcdna1 TCGTTATAAGGGTCGATGCT ACGATATCGAGCCTGTAGCT GGAGAAGAAAATCAATATAT TGCTTATGTTGCTTATCCTT TAGATTTATTCGAAGAaGgT SPAGcdna1 TCGTTATAAGGGTCGATGCT ACGATATCGAGCCTGTAGCT GGAGAAGAAAaTCAATATAT TGCTTATGTTGCTTATCCTT tAGATTTATTCGAAGAAGGT SPAGcdna1 TCGTTATAAGGGTCGATGCT ACGATATCGAGCCTGTAGCT GGAGAAGAAAATCAATATAT TGCTTATGTTGCTTATCCTT TAGATTTATTCGAAGAAGGT SPAGcdna1 TCGTTATAAGGGTCGATGCT ACGATATCGAGCCTGTAGCT GGAGAAGAAAATCAATATAT TGCTTATGTTGCTTATCCTT TAGATTTATTCGAAGAAGGT SPAGcdna1 TCGTTATAAGGGTCGATGCT ACGATATCGAGCCTGTAGCT GGGGAAGAAAATCAATATAT TGCTTATGTTGCTTATCCTT TAGATTTATTCGAAGAAGGT HAPLOdna8 TCGTTACAAAGGTCGATGCT ATGGTATTGAACCTGTCGTG GGAGAGGAGAATCAATACAT TGCTTATGTAGCTTATCCTC TAGATCTGTTCGAGGAAGGT HAPLOCDNA TCGTTACAAAGGTCGATGCT ATGGTATTGAACCTGTCGTG GGAGAGGAGAATCAATACAT TGCTTATGTAGCTTATCCTC TAGATCTGTTCGAGGAAGGT Marchanti TCGTTATAAAGGTCGATGCT ATGATATTGACCCTGTTCCT GGAGAAGAAAATCAATATAT TGCTTATGTAGCTTATCCTT TAGATTTATTTGAAGAAGGG Charavulg CCGATACAAAGGAAGATGCT ACGATATTGAACCTGTTGCT GGTGAAGAGAATCAGTTTAT TGCATATGTTGCTTATCCTC TTGATTTATTTGAAGAAGGA

310 330 350 370 390 GNETdna11 TCCGTGACTAACATGTTTAC TTCCATTGTAGGAGACGTTT TTGGATTCAAAGCTCTACGG GCTTTGCGCCTGGAAGACTT ACGGATTCCTACTTCTTATA GNETcdna1 TCCGTGACTAACATGTTTAC TTCCATTGTAGGAAACGTTT TTGGATTCAAAGCTCCACGG GCTTTGCGCCTGGAAGACTT ACGGATTCCTACTTCTTATA GNETcdna1 TCCGTGACTAACATGTTTAC TTCCATTGTAGGAAACGTTT TTGGATTCAAAGCTCTACGG GTTTTGCGCCTGGAAGACTT ACGGATTCCTACTTCTTATA LYCOdna10 TCCGTTACTAACTTGTTCAC TTCCATTGTAGGTAATGTAT TTGGATTCAAAGCCTTACGA GCTTTACGTTTAGAAGATTT GCGAATTCCTCCTGCTTATT LYCOcdna1 TCCGTTACTAACTTGTTCAC TTCCATTGTAGGTAATGTAT TTGGATTCAAAGCCTTACGA GCTTTACGTTTAGAAGATTT GCGAATTCCTCCTGCCTATT LYCOcdna1 TCCGTTACTAACTTGTTCAC TTCCATTGTAGGTAATGTAT TTGGATTCAAAGCCTTACGA GCTTTACGTTTAGAAGATTT GCGAATTCCTCCTGCTTATT PSILOdna1 TCTGTTACCAACATGTTCAC TTCCATTGTAGGTAATGTAT TCGGATTCAAAGCATTGAGA GCTCTACGTCTGGAAGATTT AAGAATTCCCCCTGCTTATT PSILOcdna TCTGTTACCAACATGTTCAC TTCCATTGTAGGTAATGTAT TCGGATTCAAAGCATTGAGA GCTCTACGTCTGGAAGATTT AAGAATTCCCCCTGCTTATT PSILOcdna TCTGTTACCAACATGTTCAC TTCCATTGTAGGTAATGTAT TCGGATTCAAAGCATTGAGA GCTCTACGTCTGGAAGATTt AAGAATTCCCCcTGCTTATT PSILOcdna TCTGTTACCAACATGTTCAC TTCCATTGTAGGTAATGTAT TCGGATTCAAAGCATTGAGA GCTCTACGTCTGGAAGATTT AAGAATTCCCCCTGCTTATT BOTRYdna TCTGTCACCAACATGTTCAC TTCCATTGTAGGTAACGTAT TCGGATTCAAGGCATTGAGA GCTTTACGGTTGGAAGATTC AAGAATCCCCCCTGCTTATT BOTRYcdna TCTGTCACCAACATGTTCAC TTCCATTGTAGGTAACGTAT TTGGATTCAAGGCATTGAGA GCTTTACGGTtGGAAGATTT AAGAATCCCCCCTGCTTATT THUIDdna TCTGTTACCAATTTATTTAC TTCTATCGTTGGTAATGTTT TTGGATTTAAAGCTTTACGA GCTTtGCGTCTAGAAGATTT GCGTATTCCTCCAGCTTATT THUIDcdna TCTGTTACCAATTTATTTAC TTCTATCGTTGGTAATGTTT TTGGATTTAAAGCTTTACGA GCTTTGCGTCTAGAAGATTT GCGTATTCCTCCAGCTTATT THUIDcdna TCTGTTACCAATTTATTTAC TTCTATCGTTGGTAATGTTT TTGGATTTAAAGCTTTACGA GCTTTGCGTCTAGAAGATTT GCGTATTCCTCCAGCTTATT TRFR1082d TCCGTTACCAATATGTTCAC TTCCATTGTAGGTAACGTTT TTGGATTCAAGGCCTTACGC GCTCTCCGCTTAGAAGATCT TCGAATTCCTCCTGCTTATT TRFRcdna9 TCCGTTACCAATATGTTCAC TTCCATTGTAGGTAACGTTT TTGGATTCAAGGCCTTACGC GCTCTCCGCTTAGAAGATCT TCGAATTCCTCCTGCTTATT TRFRcdna9 
TCCGTTACCAATATGTTCAC TTCCATTGTAGGTAACGTTT TTGGATTCAAGGCCTTACGC GCTCTCCGCTTAGAAGATCT TCGAATTCCTCCTGCTTATT TRFRcdna9 TCCGTTACCAATATGTTCAC TTCCATTGTAGGTAACGTTT TTGGATTCAAGGCCTTACGC GCTCTCCGCTTAGAAGATCT TCGAATTCCTCCTGCTTATT PAUSIUdna TCTGTTACTAATTTGTTCAC CTCCATAGTAGGTAATGTCT TTGGATTTAAGACTCTACGC GCTCTACGCTTGGAAGACCT TCGAATTCCTCCCGCTTATT PAUSIUcdn TCTGTTACTAATTTGTTCAC CTCCATAGTAGGTAATGTCT TTGGATTTAAGGCTCTACGC GCTCTACGCTTGGAAGACCT TCGAATTCCTCCCGCTTATT SELAG1dna TCCGTTACTAACATGTTCAC ATCTATCGTGGGTAACGTTT TCGGATTCAAGGCCTTACGG GCATTGCGATTGGAGGATTT GCGGATTCCCCCCGCTTATT SELAG1cdn TCCGTTACTAACATGTTCAC ATCTATCGTGGGTAACGTTT TCGGATTCAAGGCCTTACGG GCATTGCGATTGGAGGATTT GCGGATTCCCCCCGCTTATT SELAG1cnd TCCGTTaCTaACaTGTTaAC aTCTaTCGTGGGtAACGTTT TCGtATTCAAGGCCTtACGG GcATTGCtATTGtAGtATTT GCGtATTCCCCCCGCTtATT SELAG3dna TCCGTTACTAACATGTTCAC ATCTATCGTGGGTAACGTTT TCGGATTCAAGGCCTTACGG GCATTGCGATTGGAGGATTT GCGGATTCCCCCCGCTTATT SELAG4dna TCCGTTACTAACATGTTCAC ATCTATCGTGGGTAACGTTT TCGGATTCAAGGCCTTACGG GCATTGCGATTGGAGGATTT GCGGATTCCCCCCGCTTATT L4LDU05dn TCTGTTACTAACCTATTTAC TTCCATTGTAGGTAATGTTT TTGGATTTAAAGCTTTACGA GCTTTACGTTTAGAAGATTT ACGAATTCCACCTGCTTATT L4LDU05cD TCTGTTACTAACCTATTTAC TTCCATTGTAGGTAATGTTT TTGGATTTAAAGCTTTACGA GCTTTACGTTTAGAAGATTT ACGAATTCCACCTGCTTATT 59 AformDNA TCTGTTACCAACATGTTCAC TTCCATTGTAGGTAATGTTT TTGGATTTAAAGCTTTACGA GCTTCACGTCTAGAAGATTC ACGAATTCCACCTGCTTATT AformCDNA TCTGTTACCAACATGTTCAC TTCCATTGTAGGTAATGTTT TTGGATTTAAAGCTTTACGA GCTTTACGTCTAGAAGATTT ACGAATTCCACCTGCTTATT NOTHCRdna TCTGTTaCCAACATGTTCAC TTCCATCGTAGGtAATGTTT CTGGAtCTAAAGCTTCACGA GCCTTGCGTCTGGAAGATTC ACGAATTCCACCTGCTTATT NOTHCRcdn TCTGTTACCAACATGTTCAC TTCCATCGtAGGTAATGTCT TTGGATTTAAAGCTTTACGA GCTTTGCGTCTGGAAGATTT ACGAaTTCCACCTGCtTATT MSPCHdna tcTGTTaCCAACATGTTCAC TTcCAtCGTAGGTAATGTTT cTGGAtCTAAAGCTTcACGA GCCTcGCGtCTGGAAGAATT CACGAATTCCACCTGCTTAT MSPCHcdna TCTGTTACCAACATGTTCAC TTCCATCGTAGGTAATGTTT TTGGATTTAAAGCTTTACGA GCCTTGCGTCTGGAAGATTT ACGAATTCCACCTGCTTATT MFUCHdnaR TCTGTTACCAACATGTTCAC TTCCATCGTAGGTAATGTTT 
CTGGATCTAAAGCTTCACGA GCCTCGCGTCTGGAAGATTC ACGAATTCCACCTGCTTATT MFUCHcdna TCTGTTACCAACATGTTCAC TTCCATCGTAGGTAATGTTT TTGNNTTTAAAGCTTTACGA GCCTTGCGTCTGGAAGATTT ACGAATTCCACCTGCTTATT MFLAGAY86 TCTGTTACCAACATGTTCAC TTCCATCGTAGGTAATGTTT TTGGATCTAAAGCTTCACGA GCCCTGCGTCTGGAAGATTC ACGAATTCCGCCTGCTTATT MFLAGcdna TCTGTTACCAACATGTTCAC TTCCATCGTAGGTAATGTTT TTGGATTTAAAGCTTTACGA GCCCTGCGTCTGGAAGATTT ACGAATTCCGCCTGCTTATT MFLAGcdna TCTGTTACCAACATGTTCAC TTCCATCGTAGGTAATGTTT TTGGATTTAAAGCTTTACGA GCCCTGCGTCTGGAAGATTT ACGAATTCCGCCTGCTTATT PFIMBAY86 TCTGTTACTAACATGTTCAC TTCCATCGTAGGTAATGTTT CTGGATCTAAAGCTTCACGA GCCTCGCGTCTGGAAGATTC ACGAATTCCACCTGCTTATT PFIMBcdna TCTGTTACTAACATGTTCAC TTCCATCGTAGGTAATGTTT TTGGATT-AAAGCTTT-CGA GCTTTGCGTCTGGAAGATTT -CGAATTCCACCTGCTTATT PFIMBcdna TCTGTTACTAACATGTTCAC TTCCATCGTAGGTAATGTTT TTGGATTTAAAGCTTTACGA GCTTTGCGTCTGGAAGATTT ACGAATTCCACCTGCTTATT PCADCCdna TCTGTTACCAACATGTCCAC ATCCATCGCAGGTAATGTTT TTGGATTTAAAGCTTCACGA GCCTCGCGTCTGGAAGATTC ACGAATCCCACCTGCTTATT PCADCCcdn TCTGTTACCAACATGTTCAC ATCCATCGTAGGTAATGTTT TTGGATTTAAAGCTTTACGA GCTTTGCGTCTGGAAGATTT ACGAATCCCACCTGCTTATT PSPCH2dna TCTGTTACTAACATGTTCAC TTCCATCGTAGGTAATGTTT CTGGATCTAAAGCTTCACGA GCCTCGCGTCTGGAAGATTC ACGAATTCCACCTGCTTATT PSPCH2cdn TCTGTTACCAACATGTCCAC ATCCATCGCAGGTAATGTTT TTGGAT-TAAaGCTTCACGA GCCTCGCGTCTGGAAGATTC ACGAATCC-ACCTGCTTATT PSPCHMdna TCTGTTACCAACATGTTCAC TTCCATCGTAGGTAATGTTT CTGGATCTAAAGCTTCACGA GCCTCGCGTCTGGAAGATTC ACGAATTCCACCTGCTTATT PSPCHMcdn TCTGTTACCAACATGTTCAC TTCCATCGTAGGTAATGTTT CTGGATCTAAAGCTTCACGA GCCTCGCGTCTGGAagATTC ACGAATTCCACCTGCTTATT PSPDO4dna TCTGTTACCAACATGTCCAC ATCCATCGCAGGTAATGTTT TTGGATTTAAAGCTTCACGA GCCTCGCGTCTGGAAGATTC ACGAATCCCACCTGCTTATT PSPD04cdn TCTGTTACCAACATGTTCAC TTCCATCGTAGGTAATGTTT CTGGATCTAA-GCTTCACGA GCCTCGCGTCTGGAAGATTC ACGAATTCCACCTGCTTATT PPECAdnaJ TCTGTTACCAACATGTCCAC ATCCATCGCAGGTAATGTTT TTGGATTTAAAGCTTCACGA GCCTCGCGTCTGGAAGATTT ACGAATCCCACCTGCTTATT PPECAcdna TCTGTTACCAACATGTCCAC ATCCATCGCAGGTAATGTTT TTGGATTTAAAGCTTTACGA GCTTTGCGTCTGGAAGATTT 
ACGAATCCCACCTGCTTATT PPECAcdna TCTGTTACCAACATGTTCAC ATCCATCGTAGGTAATGTTT TTGGATTTAAAGCTTTACGA GCTTTGCGTCTGGAAGATTT ACGAATCCCACCTGCTTATT PPECAcdna TCTGTTACCAACATGTTCAC ATCCATCGTAGGTAATGTTT TTGGATTTAAAGCTTTACGA GCTTTGCGTCTGGAAGATTT ACGAATCCCACCTGCTTATT TAKAKIADN TCTGTTACCAACCTATTCAC TTCTATTGTAGGTAATGTAT TTGGTTACAAAGCTTTACGA GCCCCACGTCCAGAAGATTT GCGAATTCCTCCTGCTTATG TAKAKpred TCTGTTACCAACCTATTCAC TTCTATTGTAGGTAATGTAT TTGGTTACAAAGCTTTACGA GCCCTACGTCTAGAAGATTT GCGAATTCCTCCTGCTTATG SPAGdna10 TCTGTTACCAACTTATTTAC TTCTATTGTAGGTAATGTTT TTGGATTCAAAGCTCTACGA GCTCTACGTCTAGAGGATTT ACGGATTCCCCCATCTTATT SPAGcdna1 TCTGTTACCAACTTATTTAC TTCTATTGTAGGTAATGTTT TTGGATTCAAAGCTCTACGA GCTCT-CGTCtAGAGGATTt ACGGATtCCCCCATCTTATT SPAGcdna1 TCTGTTACCAACTTATTTAC TTCTATTGTAGGTAATGTTT TTGGATTCAAAGCTCTACGA GCTCTACGTCTAGAGGATTT ACGGATTCCCCCATCTTATT SPAGcdna1 TCTGTTACCAACTTATTTAC TTCTATTGTAGGTAATGTTT TTGGATTCAAAGCTCTACGA GCTCTACGTCTAGAGGATTT ACGGATTCCCCCATCTTATT SPAGcdna1 TCTGTTACCAACTTATTTAC TTCTATTGTAGGTAATGTTT TTGGATTCAAAGCTCTACGA GCTCTACGTCTAGAGGATTT ACGGATTCCCCCATCTTATT SPAGcdna1 TCTGTTGCCAACTTATTTAC TTCTATTGTAGGTAATGTTT TTGGATTCAAAGCTCTACGA GCTCTACGTCTTGAGGATTT ACGGATTCCCCCATCTTATT HAPLOdna8 TCCGTCACCAATCTATTCAC TTCCATAGTAGGTAATGTAT TCGGCTTCAAAGCGTTGCGA GCTTTACGTCTCGAAGATCT GAGAATCCCTCCAGCCTATT HAPLOCDNA TCCGTCACCAATCTATTCAC TTCCATAGTAGGTAATGTAT TCGGCTTCAAAGCGTTGCGA GCTTTACGTCTCGAAGATCT GAGAATCCCTCCAGCCTATt

410 430 450 470 490 Marchanti TCTGTTACAAATATGTTTAC TTCAATTGTAGGTAATGTAT TTGGGTTTAAAGCTTTAAGA GCGTTACGTCTTGAAGATTT AAGAATTCCTCCAGCTTACA Charavulg TCAGTAACAAATTTATTCAC ATCGATTGTAGGTAATGTAT TTGGATTTAAAGCATTAAGG GCTTTACGTTTGGAAGATTT ACGTATTCCTCCTGCTTATA GNETdna11 TAAAAACATTTCAAGGGCCT CCTCACGGTATCCAAGTTGA AAGAGATAAATTAAACAAAT ATGGACGCCCTTTATTGGGG TGTACTATCAAGCCTAAGTT GNETcdna1 TAAAAACATTTCAAGGGCCT CCTCACGGTATCCAAGTTGA AAGAGATAAATTAAACAAAT ATGGACGCCCTTTATTGGGG TGTACTATCAAGCCTAAGTT GNETcdna1 TAAAAACATTTCAAGGGCCC CCTCACGGTATCCAAGTTGA AAGAGATAAATTAAACAAAT ATGGACGCCCTTTATTGGGG TGTACTATCAAGCCTAAGTT LYCOdna10 CCAAAACTTTCATGGGTCCG CCCCATGGTATCCAAGTCGA AAGAGACAAATTGAACAAAT ATGGCCGTCCTTTATTAGGA TGTACTATTAAACCAAAATT LYCOcdna1 CCAAAACTTTCATGGGTCCG CCCCATGGTATCCAAGTCGA AAGAGACAAATTGAACAAAT ATGGCCGTCCTTTATTAGGA TGTACTATTAAACCAAAATT LYCOcdna1 CCAAAACTTTCATGGGTCCG CCCCATGGTATCCAAGTCGA AAGAGACAAATTGAACAAAT ATGGCCGTCCCTTATTAGGA TGTACTATTAAACCAAAATT PSILOdna1 CAAAGACCTTCATGGGTCCA CCTCACGGTATCCAAGTTGA AAGAGATAAATTGAACAAAT ATGGACGTCCTTTATTGGGA TGTACCATCAAACCTAAATT PSILOcdna CAAAGACCTTCATGGGTCCA CCTCACGGTATCCAGGTTGA AAGAGATAAATTGAACAAAT ATGGACGTCCTTTATTGGGA TGTACCATCAAACCTAAATT PSILOcdna CAAAGACCTTCATGGGTCCA CCTCACGGTATCCAAGTTGA AAGAGATAAATTGAACAAAT ATGGACGTCCTTTATTGGGA TGTACCATCAAACCTAAATT PSILOcdna CAAAGACCTTCATGGGTCCA CCTCACGGTATCCAAGTTGA AAGAGATAAATTGAACAAAT ATGGACGTCCTTTATTGGGA TGTACCATCAAACCTAAATT BOTRYdna CTAAGACTTTCACGGGTCCG CCTCACGGTATTCAAGTCGA GAGGGATAAATTAAACAAAT ACGGTCGCCCCTTACTAGGA TGTACCATCAAaCCCAAATT BOTRYcdna CTAAGACTTTCATGGGTCCG CCTCACGGTATTCAAGTCGA GAGGGATAAATTAAACAAAT ACGGTCGCCCCTTACTAGGA TGTACCATCAAACCCAAATT THUIDdna CCAAAACTTTCCAGGgCCCA CCTCATGGTATTCAaGTCgA AAgAGATAAaTTAAACAaAT ATGgtCGTCCATTATTAGGA TGCACTATCAAACCAAAATT THUIDcdna CCAAAACTTTCCAGGGCCCA CCTCATGGTATTCAAGTCGA AAGAGATAAATTAAACAAAT ATGGTCGTCCATTATTAGGA TGCACTATCAAACCAAAATT THUIDcdna CCAAAACTTTCCAGGGCCCA CCTCATGGTATTCAAGTCGA AAGAGATAAATTAAACAAAT ATGGTCGTCCATTATTAGGA TGCACTATCAAACCAAAATT TRFR1082d 
CTAAAACTTTCATTGGACCG CCCCATGGTATCCAGGTTGA AAGGgATAAGCTGAACAAAT ATGGGCGTCCCTTATTAGGA TGTACAATCAAGCCAAAATT TRFRcdna9 CTAAAACTTTCATTGGACCG CCCCATGGTATCCAGGTTGA AAGGGATAAGCTGAACAAAT ATGGGCGTCCCTTATTGGGA TGTACA?TCAAGCCAAAATT TRFRcdna9 CTAAAACTTTCATTGGACCG CCCCATGGTATCCAGGTTGA AAGGGATAAGCTGAACAAAT ATGGGCGTCCCTTATTAGGA TGTACAATCAAGCCAAAaTT TRFRcdna9 CTAAAACTTTCATTGGACCG CCCCATGGTATCCAGGTTGA AAGGGATAAGCTGAACAAAT ATGGGCGTCCCTTATTAGGA TGTACAATCAAGCCAAAATT PAUSIUdna CTAAAACTTTCATGGGACCG CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAACAAAT ATGGACGTCCTTTATTGGGA TGTACAATCAAGCCAAAATT PAUSIUcdn CTAAAACTTTCATGGGACCG CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAACAAAT ATGGACGTCCTTTATTGGGA TGTACAATCAAGCCAAAATT SELAG1dna CCAAGACCTTTCAGGGCCCG CCTCACGGTATCCAGGTTGA AAGGGATAAATTGAACAAAT ACGGTCGACCCCTGCTGGGA TGCACCATAAAACATAAACT 60 SELAG1cdn CCAAGACCTTTCAGGGCCCG CCTCACGGTATCCAGGTTGA AAGGGATAAATTGAACAAAT ACGGTCGACCCCTGCTGGGA TGCACCATAAAACCTAAACT SELAG1cnd CcaAGACCTTTcAGGGCCCG CCTcACGGtATCcAGGTTgA AAGGgATaAATTGaACaAAT aCGGTCgACCCCTGCTGGgA TGcACcATaAAACCTaAACT SELAG3dna CCAAGACCTTTCAGGGCCCG CCTCACGGTATCCAGGTTGA AAGGGATAAATTGAACAAAT ACGGTCGACCCCTGCTGGGA TGCACCATAAAACCTAAACT SELAG4dna CCAAGACCTTTCAGGGCCCG CCTCACGGTATCCAGGTTGA AAGGGATAAATTGAACAAAT ACGGTCGACCCCTGCTGGGA CGCACCATAAAACCTAAACT L4LDU05dn CCAAAACTTTTCAAGGTCCG CCTCATGGTATTCAAGTTGA GAGAGATAAATTGAATAAAT ATGGTCGTCCCTTGTTAGGA TGTACCATTAAACCAAAATT L4LDU05cD CCAAAACTTTTCAAGGTCCG CCTCATGGTATTCAAGTTGA GAGAGATAAACTGAATaAAT ATCGTCGTCCCTTGTTAGGA TGTACCATTAAACCAAAATT AformDNA CCAAAACCTTTCAAGGTCCA CCTCATGGTATTCAAGTTGA AAGAGATAAATCGAATAAAT ATGGTCGTCCTTTATTAGGA TGTACCATTAAACCAAAGTT AformCDNA CCAAAACCTTTCAAGGTCCA CCTCATGGTATTCAAGTTGA AAGAGATAAATTGAATAAAT ATGGTCGTCCTTTATTAGGA TGTACCATTAAACCAAAGTT NOTHCRdna CCAAAACTTCTCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTGGCA TGTaCTATTaAACCAAAACT NOTHCRcdn CCAAAACTTTTCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAt ATGGTCGTCCTTTATTGGGA TGTACTATCAAACCAAAACT MSPCHdna TCCAAACTTCTCAAGGTCCA CCTCATGGtaTTCAGGTTGA 
AAGGGAATAATTGAATAAAT ATGGtCGTCCTTTATTGGGA TGtaCtaCCAAACCAAAACT MSPCHcdna CCAAAACTTTTCA-GGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGtCGTCCTTTATTGGGA TGTACTATCAAACCAAAACT MFUCHdnaR CCAAAACTTCTCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTGGGA TGTACTACCAAACCAAAACT MFUCHcdna CCAAA-CTTTTCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGCCGtCCTTTATTGGGA TGTACTATCAAACCAAAACT MFLAGAY86 CCAAAACTTCTCAAGGCCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTGGGA TGTACTATCAAACCAAAACT MFLAGcdna CCAAAACTTTTCAAGGCCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTGGGA TGTACTATCAAACCAAAACT MFLAGcdna CCAAAACTTTTCAAGGCCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTGGGA TGTACTATCAAACCAAAACT PFIMBAY86 CCAAAACTTCTCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTGGGA TGTACTATTAAACCAAAACT PFIMBcdna CCAAAACTTTTCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTGGGA TGTACTATTAAACCAAAACT PFIMBcdna CCAAAACTTTTCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTGGGA TGTACTATTAAACCAAAACT PCADCCdna TAAAAACTTTCCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTAGGA TGTACCATTAAACCAAAATT PCADCCcdn CAAAAACTTTCCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTAGGA TGTACCATTAAACCAAAATT PSPCH2dna CCAAAACTTCTCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTGGGA TGTACTATTAAACCAAAATT PSPCH2cdn TAAAAACTTTCCAAGGTCCa CCTCATGGTATTCAGGTTGA AGGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTAGGA TGTACCatTAAACCAAAATT PSPCHMdna CCAAAACTTCTCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTGGGA TGTACTATTAAACCAAAACT PSPCHMcdn CCAAAaCTTCTCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGgATAAaTTGAATAAAT ATGGTCGTCCTTtATTGGgA TGTACTATTAA-CCAAAACT PSPDO4dna TAAAAACTTTCCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTAGGA TGTaCCATTAAACCAAAATT PSPD04cdn CCAAAACTTCNCA-GGTCCA CCTCATGGTATTCAGGTTGA A-GG-ATAA-TTGAATAAaT ATGGtCgtCCTTaAT-GGGA 
TGTACTATTAAACCAAAACT PPECAdnaJ CCAAAACTTCCCAAGGTCCA CCTCATGGTATTTAGGTTGA AAGGGATAAATTGAATAAAT ATGGCCGTCCTTTATTAGGA TGTACCATTAAACCAAAATT PPECAcdna CCAAAACTTTCCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTAGGA TGTACCATTAAACCAAAATT PPECAcdna CCAAAACTTTCCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGATCGTCCTTTATTAGGA TGTACCATTAAACCAAAATT PPECAcdna CCAAAACTTTCCAAGGTCCA CCTCATGGTATTCAGGTTGA AAGGGATAAATTGAATAAAT ATGGTCGTCCTTTATTAGGA TGTACCATTAAACCAAAATT TAKAKIADN TCAAAACTTCTCAGGGTCCA CCTCACGGTATCCAAGTTGA AAGAGATAAATTGAACAAAT ATGGTCGTCCTTTATTGGGA TGTACTATTAAACCAAAACT TAKAKpred CCAAAACTTTTCAGGGTCCA CCTCACGGTATCCAAGTTGA AAGAGATAAATTGAACAAAT ATGGTCGTCCTTTATTGGGA TGTACTATTAAACCAAAACT SPAGdna10 CTAAAACTTTCCAAGGTCCA CCTCATGGTATCCAAGTTGA GAGAGATAAATTAAACAAAT ATGGTCGTCCCCTATTGGGA TGTACTATTAAACCAAAATT SPAGcdna1 CTAAAaCTTTCCAAGGTcCA CCTCATGGTATCCAAGTTGA GAGAGATAAATTAAACAAAT ATGGTCGTCCCCTATTGGGA TGTACTATTAAACCAAAATT SPAGcdna1 CTAAAACTTTCCAAGGTCCA CCTCATGGTATCCAAGTTGA GAGAGATAAATTaaAcaaat aTGGTCGTCCCCTATTGGGA tGTACTATTAAACCAAAATT SPAGcdna1 CTAAAACTTTCCAAGGTCCA CCTCATGGTATCCAAGTTGA GAGAGATAAATTAAACAAAT ATGGTCGtCCCTTATTGGGA TGTACTATTAAACCAAAATT SPAGcdna1 CTAAAACTTTCCAAGGTCCA CCTCATGGTATCCAAGTTGA GAGAGATAAATTAAACCAAT aTGGtCGTCCCCTATTGGGA TGTACTATTAAACCAAAATT SPAGcdna1 CTAAAACTTTCCAAGGTCCA CCTCATGGTATCCAAGTTGA GAGAGATAAATTAAACAAAT ATGGTCGTCCCCTATTGGGA TGTACTATTAAACCAAAATT HAPLOdna8 CAAAAACTTTCCAGGGTCCA CCCCATGGTATCCAAGTCGA GAGAGATAAATTAAACAAAT ATGGTCGTCCTTTATTGGGA TGTACCATTAAACCGAAACT HAPLOCDNA CAAAAACTTTCCAGGGTCCA CCCCATGGTATCCAaGTCGA GAGAGATAAaTTAAACAAAT ATGGTCGTCCTTTATTGGGA TGTACCATTAAACCGAAACT

510 530 550 570 590 Marchanti CAAAAACTTTCCAAGGTCCT CCTCATGGTATTCAAGTTGA GAGAGATAAATTAAACAAAT ATGGTCGTCCTTTATTAGGA TGTACTATTAAACCAAAATT Charavulg CAAAGACTTTTCAAGGGCCA CCTCATGGTATTCAAGTTGA AAGAGATAAACTAAATAAAT ATGGACGACCATTATTGGGA TGTACTATTAAACCCAAATT GNETdna11 AGGTCTATCGGcCAAAAACT ACGGTAGAGCCGtTTATGAA TGTCTTCGTGGTGGGCTTGG TTTTACTAAAGATGATGAAA ACGTCAATTCTCAACCATTC GNETcdna1 AGGTCTATCGGCCAAAAACT ACGGtAGAGCCGTTTATGAA TGTCTTCGTGGTGGGCTTGA TTTTACTAAAGATGATGAAA ACGTCAATTCTCAACCATTC GNETcdna1 AGGTCTATCGGCCAAAAACT ACGGTAGAGCCGTTTATGAA TGTCTTCGTGGTGGGCTTGA TTTTACTAAAGATGGTGAAA ACGTCAATTCTCAACCATTC LYCOdna10 AGGTTTATCCGCTAAAAACT ATGGTAGAGCTGTTTATGAA TGTCTTCGTGGTGGACTTGA TTTCACTAAGGATGATGAAA ACGTGAATTCTCAACCGTTT LYCOcdna1 AGGTTTATCCGCTAAAAACT ATGGTAGAGCTGTTTATGAA TGTCTTCGTGGTGGACTTGA TTTCACTAAGGATGATGAAA ACGTGAATTCTCAACCGTTT LYCOcdna1 AGGTTTATCCGCTAAAAACT ATGGTAGAGCTGTTTATGAa TGTCTtCGTGgTGgACTtGA TTTCACTAAGGATGATGAAA ACGTGAATTCTCAACCGTTT PSILOdna1 AGGTTTATCTGCTAAAAACT ACGGTAGAGCGGTTTATGAA TGTCTTCGTGGTGGACTTGA TTTTACTAa-gATGATGAGA -tGTCAATTCTCAACCTTTT PSILOcdna AGGTTTATCTGCTAAAAACT ACCGTAGAGCGGTTTATGAA TGTCTTCGTGGTGGACTTGA TTTTACTAAAGATGATGAGA ATGTCAATTCTCAACCTTTT PSILOcdna AGGTTTATCTGCTAAAAACT ACGGTAGAGCGGTTTATGAA TGTCTTCGTGGTGGACTTGA TTTTACTAAAGATGATGAGA ATGCCAATTCTCAACCTTTT PSILOcdna AGGTTTATCTGCTAAAAACT ACGGTAGAGCGGTTTATGAA TGTCTTCGTGGTGGACTTGA TTTTACTAAAGATGATGAGA ATGTCAATTCTCAACCTTTT BOTRYdna GGGATTATCTGCCAAAAATT ATGGTAGAGCTGTTTATGAA TGTTTACGTGGTGGaCTCGA TTTCCCcAAGGATGATGAAA ACGTAAATTCCCAACCGTTT BOTRYcdna GGGATTATCTGCCAAAAATT ATGGTAGAGCTGTTTATGAA TGTTTACGTGGTGGGCTCGA TTTCACCAAGGATGATGAAA ACGTAAATTCCCAACCGTTT THUIDdna GGGTTTATCGGCTAAAAATT ACGGTAGAGCTGTGTATGAA TGTCTTCGTGGTGGACTTGA TTTCACAAAAGATGATGAAA ACGTAAATTCTCAACCGTTT THUIDcdna GGGTTTATCGGCTAAAAaTT ACGGTAGAGCTGTGTATGAA TGTCTTCGTGGTGGACTTGA TTTCACAAAAGATGATGAAA ACGTAAATTCTCAACCGTTT THUIDcdna GGGTT-ATCGGCTaAAAATT -CGGTAGAGCTGTGTATGAA TGTCTTCGTGGTGGACTTGA TTTCACAAAAGATGATGAAA ACG-AAATTCTCAACCGTTT TRFR1082d 
GGGCTTATCTGCTAAAAATT ATGGGAGAGCCGTTTATGAA TGTCTCCGCGGTGGACTTGA CTTCACCAAGGATGATGAGA ACGTAAATTCCCAACCATTC TRFRcdna9 GGACTTATCTGcTTAAAATT ATGGGAGAGCCGTTTATGaa TgTCTCCGCGGTGGACTTGA CTTCACCAAGGATGATGAGA ACGTAAATTCCCAACCATTC TRFRcdna9 GGGCTTATCTGCTAAAAATT ATGGGAGAGCCGTTTATGAA TGTCTCCGCGGTGGACTTGA CTTCACCAAGGATGATGAGA ACGTAAATTCCCAACCATTC TRFRcdna9 GGGCTTATCTGCTAAAAATT ATGGGAGAGCCGTTTATGAA TGTCTCCGCGGTGGACTTGA CTTCACCAAGGATGATGAGA ACGTAAATTCCCAACCATTC PAUSIUdna GGGTCTGTCTGCTAAAAATT ACGGTAGAGCCGTCTACGAA TGCCTTCGTGGTGGACTTGA TTTCACAAAAGATGATGAAA ATGTGAATTCCCAGCCATTC PAUSIUcdn GGGTCTGTCTGCTAAAAATT ACGGTAGAGCCGTCTACGAA TGCCTTCGTGGTGGATTTGA TTTCACAAAAGATGATGAAA ATGTGAATTCCCAGCCATTC SELAG1dna GGGTCTATCTGCTAAGAACT ATGGTAGGGCAGTCTACGAA TGCCTCCGTGGCGGACTCGA TTTCACTAAGGATGATGAGA ACGTAAATTCTCAGCCATTC SELAG1cdn GGGTCTATCTGCTAAgAACT ATGGTGGAGCAGTCTACGAA TGCCTCCGTGGCGGACTCGA TTTCACTAAGGATGATGAGA ACGTAAATTCTCAGCCATTC SELAG1cnd GGGTCtATCTGCTaAgAACT aTGGtAgAGcaGTCTaCGAA TGCCTCCGTGGCGGACTCgA TTTcACTaAGAgTgATgAgA ACGtAAATTCTcAGCCaTTC SELAG3dna GGGTCTATCTGCTAAgAACT ATGGTAGAGCAGTCTACGAA TGCCTCCGTGGCGGACTCGA TTTCACTAAGGATGATGAGA ACGTAAATTCTCAGCCATTC SELAG4dna GGGTCTATCTGCTAAGAACT ATGGTAGAGCAGTCTACGAA TGCCTCCGTGGCGGACTCGA TTTCACTAAGGATGATGAGA ACGTAAATTCTCAGCCATTC L4LDU05dn AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGTGGTGGACTTGA TTTCACTAAAGATGATGAAA ATGTGAATTCTCAACCTTTT L4LDU05cD AGGTTTATCTGCCAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGTGGTGGACTTGA TTTCACTAAAGATGATGAAA ATGTGAATTCTCAACCTTTT AformDNA AGGTTTATCTGCCAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGTGGTGGACTTGA TTTCACCAAAGATGACGAAA ATGTTAATTCTCAACCTTTT AformCDNA AGGTTTATCTGCCAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGTGGTGGACTTGA TTTCACCAAAGATGACGAAA ATGTTAATTCTCAACCTTTT NOTHCRdna AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGCATATgAA TGTCTTCGCGGTGGACCTGA TTTCACTAAAGATGATGAGA ATGTTAATTCTCAACCTTTC NOTHCRcdn AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGCGGTGGACTTGA TTTCACTAAAGATGATGAGA ATGTTAATTCTCAACCTTTC MSPCHdna AGGTTTAtCTGCTAAAAACT ATGGTAGAGCTGTATATGAA 
TGTCTTCGCGGTGGACCTGA TTTCACTAAAGATGATGAGA ATGTTAATTCTCAACCTTTC MSPCHcdna AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGCGGTGGACTTGA TTTCACTAAAGATGATGAGA ATGTTAATTCTCAACCTTTC MFUCHdnaR AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGCGGTGGACCTGA TTTCACTAAAGATGATGAGA ATGTTAATTCTCAACCTTTC MFUCHcdna AGGTTTATCTGCTAAAAACT ATGGtAGAGCTGTataTGAA TGtCTTCGCGGtGGacTTGA TTTCACTAAAGATGATGAGA ATGTTAATTCTCAACCTTTC MFLAGAY86 AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGCGGTGGACCTGA TTTCACTAAAGATGATGAGA ATGTTAATTCTCAACCTTTC MFLAGcdna AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGCGGTGGACTTGA TTTCACTAAAGATGATGGGA ATGTTAATTCTCAACCTTTC MFLAGcdna AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGCGGTGGACTTGA TTTCACTAAAGATGATGGGA ATGTTAATTCTCAACCTTTC PFIMBAY86 AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGCGGTGGACCTGA TTTCACTAAAGATGATGAGA ATGTTAATTCTCAACCTTTC PFIMBcdna AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATGTGAA TGTCTTCGCGGTGGACTTGA TTTCACTAAAGATGAATGAG AATGTTAATTCTCACCTTTC PFIMBcdna AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATGTGAA TGTCTTCGCGGTGGACTTGA TTTCACTAAAGATGATGAGA ATGTTAATTCTCAACCTTTC PCADCCdna AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGTGGCGGACCTGA TTTCACTAAGGATGATGAGA ATGTTAATTCTCAACCTTTC PCADCCcdn AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGTGGCGGACTTGA TTTCACTAAGGATGATGAGA ATGTTAATTCTCAACCTTTC PSPCH2dna AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGCGGTGGACCTGA TTTCACTAAAGATGATGAGA ATGTTAATTCTCAACCTTTC PSPCH2cdn AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGTGGCGGACCTGA TTTCACTAAGGATGATGAGA ATGTTAATTCTCAACCTTTC PSPCHMdna AGGTTTATCTGCTAAAAACT ATGGTaGAGCTGTATATGAA TGTCTTCGCGGTGGACCTGA TTTCACTAAAGATGATGAGA ATGTTAATTCTCAACCTTTC PSPCHMcdn AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGCGGTGGACCTGA TTTCACTAAAGATGATGAGA ATGTTAATTCTCAACCTTTC PSPDO4dna AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGTGGCGGACCTGA TTTCACTAAGGATGATGAGA ATGTTAATTCTCAACCTTTC PSPD04cdn AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGCGGTGGACCTGA TTTCACTAAAGATGATGAGA 
ATGTTAATTCTCAACCTTTC PPECAdnaJ AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCCTCGTGGCGGACCTGA TTTCACTAAGGATGATGAGA ATGTGAATTCTCAACCTTTC PPECAcdna AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGTGGCGGACTTGA TTTCACTAAGGATGATGAGA ATGTGAATTCTCAACCTTTC PPECAcdna AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGTGGCGGACTTGA TTTCACTAAGGATGATGAGA ATGTGAATTCTCAACCTTTC PPECAcdna AGGTTTATCTGCTAAAAACT ATGGTAGAGCTGTATATGAA TGTCTTCGTGGCGgACTTGA TTTCACTAAGGATGATGAGA ATGTGAATTCTCAaCCTTTC TAKAKIADN AGGTTCATCTGCCAAAAACC ATGGTAGAGCAGTATATGAA TGTCCTCGTGGTGGACTTGA TTTTACTAAAGATGATGAAA ACGTAAATTCCCAATCTTTC TAKAKpred AGGTTTATCTGCCAAAAACT ATGGTAGAGCAGTATATGAA TGTCTTCGTGGTGGACTTGA TTTTACTAAAGATGATGAAA ACGTAAATTCCCAATCTTTC SPAGdna10 AGGTCTATCTGCTAAAAACT ATGGCAGAGCAGTATATGAG TGTCTTCGTGGTGGACTTGA TTTTACGAAAGATGATGAAA ACGTCAATTCTCAaCCTTTC SPAGcdna1 AGGTCTATCTGCTAAAAACT ATGGCAGAGCAGTATATGAG TGTCTTCGTGGTGGACTTGA TTTTACGAAAGATGATGAAA ACgTGAATTCTCAACCTTT- SPAGcdna1 AGGTCTAtctgctaAAAACT ATGGCAGAGCAGTATATGAG TGTCTtctTgGTGgACTTGA TTTTACGAAAGATGATGAAA ACGTCAATTCTCAACCTTTC SPAGcdna1 AGGTCTATCTGCTAAAAACT ATGGCAGAGCAGTATATGAG TGTCTTCGTGGTGGACTTGA TTTTACGAAAGATGATGAAA ACGTCAATTCTCAACCTTTC SPAGcdna1 AGGTCTATCTGCTAAAAACT ATGGCAGAGCAGTATATGAG TGTCTTCGTGGTGGACTTGA TTTTACGAAAGATGATGAAA ACGTCAATTCTCAACCTTTC SPAGcdna1 AGGTCTATCTGCTAAAAACT ATGGCAGAGCAGTATATGAG TGTCTTCGTGGTGGACTTGA TTTTACGAAAGATGATGAAA ACGTCAATTCTCAACCTTTC HAPLOdna8 GGGTCTATCCGCAAAAAACT ATGGCAGAGCCGTATACGAG TGTCTCCGCGGAGGACTTGA TTTTACAAAGGACGATGAGA ACGTGAATTCCCAACCGTTT HAPLOCDNA GGGTCTATCCGCAAAAAACT ATGGCAGAGCCGTATACGAg TGTCTCCGCGGAGGACTTGA TTTTACAAAGGACGATgAGA ACGTGAATTCCCAACCGTTT

610 630 650 670 690 Marchanti AGGTTTATCTGCTAAAAATT ATGGTCGAGCTGTATATGAA TGTCTTCGTGGTGGACTTGA TTTTACTAAAGATGATGAAA ACGTAAATTCTCAACCATTT Charavulg AGGTTTATCTGCTAAAAATT ATGGTAGAGCTGTATATGAA TGTCTTCGCGGTGGACTTGA CTTTACAAAAGATGACGAAA ACGTGAATTCTCAGCCTTTT GNETdna11 ATGCGCTGGAGAGACCGCTT TGTTTTTTGCGCGGAAGCTC TTTATAAAGCTCAAGCTGAA ACAGGTGAAATTAAAGGgCA TTATTTAAATGCTACTGCAG GNETcdna1 ATGCGCTGGAGAGACCGCTT TGTTTTTTGCGCGGAAGCTC TTtATAAAGCTCAAGCTGAA ACAgGTGAGATTAAAGGGCA CTATTTAAATGCTACTGCAG GNETcdna1 ATGCGCTGGAGAGACCGCTT TGTTTTTTGCGCGGAAGCTC TTTATAAAGCTCAAGCTGAA ACAGGTGAAATTAAAGGGCA TTATTTAAATGCTACTGCAG LYCOdna10 ATGCGTTGGAGAGACCGTTT CTTATTCGTAGCAGAAGCTC TTTATAAAGCTCAAGCCGAA ACAGGCGAAATTAAGGGTCA TTACTTGAATGCTACCGCGG LYCOcdna1 ATGCGTTGGAGAGACCGTTT CTTATTCGTAGCAGAAGCTC TTTATAAAGCTCAAGCCGAA ACAGGCGAAATTAAGGGTCA TTACTTGAATGCTACCGCGG LYCOcdna1 ATGCGTTGGAGAGACCGTTT CTTATTCGTAGCAGAAGCTC TTTATAAAGCTCAAGCCGAA ACAGGCGAAATTAAGGGTCA TTACTTGAATGCTACCGCGG PSILOdna1 ATGCGTTGGAGAGATCGTTT CTTATTTGTAGCAGAAGCTC TTTTCAAATCCCAAGCTGAA ACAGGTGAAATTAAGGGACA TTATTTGAACGCCACTGCAG PSILOcdna ATGCGTTGGAGAGATCGTTT CTTATTTGTAGCAGAAGCTC TTTTCAAATCCCAAGCTGAA ACAGGTGAAATTAAGGGACA TTATTTGAACGCCACTGCAG PSILOcdna ATGCGTTGGAGAGATCGTTT CTTATTTGTAGCAGAAGCTC TTTTCAAATCCCAAGCTGAA ACAGGTGAAATTAAGGGACA TTATTTGAACGCCACTGCAG PSILOcdna ATGCGTTGGAGAGATCGTTT CTTATTTGTAGCAGAAGCTC TTTTCAAATCCCAAGCTGAA ACAGGTGAAATTAAGGGACA TTATTTGAACGCCACTGCAG BOTRYdna ATGCGTTGGAGAGATCGTTT CTTATTCGTGGCAGAAGCTC TTTTCAAATCTCAAGCTGAA ACGGGTGAGATTAAGGGGCA TTACTTAAACGCTACCGNGG BOTRYcdna ATGCGTTGGAGAGATCGTTT CTTATTCGTGGCAGAAGCTC TTTTCAAATCTCAAGCTGAA ACGGGTGAGATTAAGGGGCA TTACTTAAACGCGACCGCGG THUIDdna ATGCGTTGGAGAGACCGTTT CTTATTTGTAGCGGAAGCTA TTTACAAATCTCAAGCTGAA ACGGGTGAAATTAAAGGACA TTATCTAAATGCTACCGCAG THUIDcdna ATGCGTTGGAGAGACCGTTT CTTATTTGTAGCGGAAGCTA TTTACAAATCTCAAGCTGAA ACGGGTGAAATTAAAGGACA TTATCTAAATGCTACCGCAG THUIDcdna ATGCGTTGGAGAGACCGTTT CTTATTTGtAGCGgAaGCTA TTTACGAATCTCAAGCTGAA ACGGGTGAAATTAAAGGACA TTATCTAAATGCTACCGCAG TRFR1082d 
ATGCGTTGGAGAGATCGTTT CTTATTCGTAGCAGAAGCTC TCTTCAAATCTCAGGCCGAA ACAGGCGAAATTAAGGGACA TTACTTAAACGCTACTGCGG TRFRcdna9 ATGCGTTGGAGAGATCGTTT CTTATTCGTAgCAGAAGCTC TCTTCAAATCTCAGGCCGAA ACAGGCGAAATTAAGGGACA TTACTTAAACGCTACTGCGG TRFRcdna9 ATGCGTTGGAGAGATCGTTT CTTATTCGTAGCAGAAGCTC TCTTCAAATCTCAGGCCGAA ACAGGCGAAATTAAGGGACA TTACTTAAACGCTACTGCGG TRFRcdna9 ATGCGTTGGAGAGATCGTTT CTTATTCGTAGCAGAAGCTC TCTTCAAATCTCAgGCCGAA ACAGGCGAAATTAAGGGACA TTACTTAAACGCTACTGCGG PAUSIUdna ATGCGTTGGAGAGATCGTTT CCTATTTGTGGCAGAAGCTC TTTTCAAATCCCAAGCTGAA ACAGGGGAAATCAAAGGGCA TTACTTAAATGCTACTGCAG PAUSIUcdn ATGCGTTGGAGAGATCGTTT CCTATTTGTGGCAGAAGCTC TTTTCAAaTCCCAaGCTGAA ACAggggaaatcaaagggCA TTACTTAAATGCTACTGCAG SELAG1dna ATGCGTTGGCGAGATCGTTT CGTATTTGTAGCGGAAGCTC TTAATAAGGCTCAgGcCGAA ACGGGCGAgatTAAAGGCCA CTACCTGAATGCTACTGCaG SELAG1cdn ATGCGTTGGCGAGATCGTTT CGTATTTGTAGCGGAAGCTC TTAATAAGGCTCAGGCCGAA ACGGGCGAGATTAAAGGCCA CTACCTGAATGCTACTGCAG SELAG1cnd aTGCGTTGGCGAGATCGTTT CGTATTTGTAGCGGAAGCTC TTAATAAGGCTCAGGCCGAA ACGGGCGAGATTAAAGGCCA CTACCTGAATGCTACTGCAG SELAG3dna ATGCGTTGGCGAGATCGTTT CGTATTTGTAGCGGAAGCTC TTAATAAGGCTCAGGCCGAA ACGGGCGAGATTAAAGGCCA CTACCTGAATGCTACTGCAG SELAG4dna ATGCGTTGGCGAGATCGTTT CGTATTTGTAGCGGAAGCTC TTAATAAGGCTCAGGcCGAA ACGGGCGAGATTAAAGGCCA CTACCTGAATGCTACTGCAg L4LDU05dn ATGCGTTGGAGAGATCGTTT TCTATTCGTAGCAGAAGCTA TTTTTAAATCTCAAGCTGAA ACAGGCGAGATTAAAGGACA TTACTTAAATGCTACTGCAG L4LDU05cD ATGCGTTGGAGAGATCGTTT TCTATTCGTAGCAGAAGCTA TTTTTAAATCTCAAGCTGAA ACAGGCGAGATTAAAGGACA TTACTTAAATGCTACTGCAG AformDNA ATGCGTTGGAGAGATCGTTT TCTATTCGTAGCAGAAGCTA TTTCTAAATCTCAAGCTGAA ACAGGGGAGATTAAAGGGCA TTACTTAAATGCCACTGCAG AformCDNA ATGCGTTGGAGAGATCGTTT TCTATTCGTAGCAGAAGCTA TTTTTAAATCTCAAGCTGAA ACAGGGGAGATTAAAGGGCA TTACTTAAATGCCACTGCAG NOTHCRdna ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTCAAATCTCAAGCTGAG ACAGGCGAGATTAAAGGGCA TTACTTGAATGCCACTGCAG NOTHCRcdn ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTCAAATCTCAAGCTGAG ACAGGCGAGATTAAAGGGCA TTACTTGAATGCCACTGCAG MSPCHdna ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC 
TTTTTAAATCTCAAGCTGAG ACAGGCGAGATTAAAGGGCA TTACTTGAATGCCACTGCAG MSPCHcdna ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTTAAATCTCAAGCTGAG ACAGGCGAGATTAAAGGGCA TTGCTTGAATGCCACTGCAG MFUCHdnaR ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTTAAATCTCAAGCTGAG ACAGGCGAGATTAAAGGGCA TTACTTGAATGCCACTGCAG MFUCHcdna ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTTAAATCTCAAGCTGAG ACAGGCGAGATTAAAGGGCA TTACTTGAATGCCACTGCAG MFLAGAY86 ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTCAAATCTCAGGCTGAG ACAGGCGAGATTAAAGGGCA TTACTTGAATGCCACTGCAG MFLAGcdna ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTCAAATCTCAGGCTGAG ACAGGCGAGATTAAAGGGCA TTACTTGAATGCCACTGCAG MFLAGcdna ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTCAAATCTCAGGCTGAG ACAGGCGAGATTAAAGGGCA TTACTTGAATGCCACTGCAG PFIMBAY86 ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTTAAATCTCAAGCTGAG ACAGGCGAGATTAAAGGGCA TTACTTGAATGCCACTGCAG PFIMBcdna ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTTAAATCTCAAGCTGCG ACAGGCGAGATTAAAGGGCA TTACTTGAATGCCACTGCAG PFIMBcdna ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTTAAATCTCAAGCTGCG ACAGGCGAGATTAAAGGGCA TTACTTGAATGCCACTGCAG PCADCCdna ATGCGTTGGAAAGATCGTTT CTTGTTCGTAGCAGAAGCTA TTTTTAAACCTCAAGCTGAA ACAGGCGAGATTAAGGGGCA TTACTTGAATGCGACTGCAG PCADCCcdn aTGCGTTGGAGAGATCGTTT CTTGTTCGTAGCAGAAGCTA TTTTTAAATCTCAAGCTGAA ACAGGCGAGATTAAGGGGCA TTACTTGAATGCGACTGCAG PSPCH2dna ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTTAAATCTCAAGCTGAG ACAGGCGAGATTAAAGGGCA TTACTTGAATGCCACTGCAG PSPCH2cdn ATGCGTTGGAGAGATCGTTT CTTGTTCGTAGCAGAAGCTA TTTTTAAACCTCAAGCTGAA ACAGGCGAGATTAAGGGGCA TTACTTGAATGCGACTGCAG PSPCHMdna ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTTAAATCTCAAGCTGAG ACAGGCGAGATTAAAGGGCA TTACTTGAATGCCACTGCAG PSPCHMcdn ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTTAAATCTCAAGCTGAG ACAGGCGAGATTAAAGGGCG TTACTTGAATGCCACTGCAG PSPDO4dna ATGCGTTGGAGAGGTCGTTT CTTGTTCGTAGCAGAAGCTA TTTTTAAACCTCAAGCTGAA ACAGGCGAGATTAAGGGGCA TTACTTGAATGCGACTGCAG PSPD04cdn ATGCGTTGGAGAGATCGTTT CTTGTTTGTAGCAGAAGCCC TTTTTAAATCTCAAGCTGAG ACAGGCGAGATTAAAGGGCA 
TTACTTGAATGCCACTGCAG PPECAdnaJ ATGCGTTGGAGAGATCGTTT CTTGTTCGTAGCAGAAGCTA TTTTTAAACCTCAAGCTGAA ACAGGCGAGATTAAGGGGCA TTACTTGAATGCAACTGCTG PPECAcdna ATGCGTTGGAGAGATCGTTT CTTGTTCGTAGCAGAAGCTA TTTTTAAATCTCAAGCTGAA ACAGGCGAGATTAAGGGGCA TTACTTGAATGCAACTGCTG PPECAcdna ATGCGTTGGAGAGATCGTTT CTtGTTCGTAGCAGAAGCTA TTTTTAAATCTCAAGCTGAA ACAGGCGAGATTAAGGGGCA TTACTTGAATGCAACTGCTG PPECAcdna ATGCGTTGGAGAGATCGTCT CTTGTTCGTAGCAGAAGCTA TTTTTAAATCTCAAGCTGAA ACAGGCGAGATTAAGGGGCA TTACTTGAATGCAACTGCTG TAKAKIADN ACGCGATGGAGAGATCGTTC CTTATTCATAGCAGAAGCTA TTTATAAATCTCAAGCCGAA ACAGGCGAGATCAAGGGGCA TTACTCGAATGCTACTGCAG TAKAKpred ACGCGATGGAGAGATCGTTC CTTATTCATAGCAGAAGCTA TTTATAAATCTCAAGCCGAA ACAGGCGAGATCAAGGGGCA TTACTTGAATGCTACTGCAG SPAGdna10 ATGCGTTGGAGAGATCGCTT CCTATTTGTGGCAGAAGCTA TTTATAAATCCCAGGCTGAA ACGGGTGAAATCAAGGGTCA TTACTTGAATGCTACTGCAG SPAGcdna1 ATGCGTTGGAGAGACCGTTT CTTATTCGTAgCAGAaGCTC TTTATaaAGCTCAGGCTGAA ACGGGTGAAATCAAGGGTCA TTACTTGAATGCTACTGCAG SPAGcdna1 ATGCGTTGGAGAGATCGCTt CCTATTTGTGGCAGAAGCTA TTTATAAATCCCAGGCTGAA ACGGGTGAAATCAAGGGTCA TTACTTGAATGCTACTGCAG SPAGcdna1 ATGCGTTGGAGAGATCGCTT CCTATTTGTGGCAGAAGCTA TTTATNAATCCCAGGCTGAA ACGGGTGAAATCAAGGGTCA TTACTTGAATGCTACTGCAG SPAGcdna1 ATGCGTTGGAGAGATCGCTT CCTATTTGTGGCAGAAGCTA TTTATAAATCCCAGGCTGAA ACGGGTGAAATCAAGGGTCA TTACTTGAATGCTACTGCAG SPAGcdna1 ATGCGTTGGAGAGATCGCTT CCTATTTGTGGCAGAAGCTA TTTATAAATCCCAGGCTGAA ACGGGTGAAATCAAGGGTCA TTACTTGAATGCTACCGCAG HAPLOdna8 ATGCGTTGGAGGgATCGTTT CGTATTCGTGGCAGAAGCTA TTTTCAAATCCCAAGCAGAG ACCGGTGAGATTAAGGGACA TTATCTAAATGCTACTGCCG HAPLOCDNA ATGCGTTGGAGGGATCGTTT CGTATTCGTGGCAGAAGCTA TTTTCAAATCCCAAACAGAG ACCGGTGAGATTAAGGGACA TTATCTAAATGCTACTGCCG

710 730 750 770 790 Marchanti ATGCGTTGGAGAGATCGTTT CTTATTTGTAGCAGAAGCTA TTTATAAATCTCAAGCAGAA ACTGGAGAAATCAAAGGACA TTATTTAAATGCTACTGCAG Charavulg ATGCGATGGAGAGATAGATT CTTATTTGTAGCAGAAGCAA TTTATAAATCTCAAGCAGAA ACTGGAGAAATTAAAGGTCA CTATTTAAATGCTACTGCAG GNETdna11 GTACATGCGAGGAAATGATC AAAAGGGCAGTATTTGCAAG AGAATTAGGAGCTCCTATTG TCATGCATGACTATCTGACA GGAGGTTTTACCGCGAATAC GNETcdna1 GTACATGCGAGGAAATGATC AAAAGGGCAGTATTTGCAAG AGAATTAGGAGCTCCTATTG TCATGCATGACTATCTGACA GGAGGTTTTACCGCGAATAC GNETcdna1 GTACATGCGAGGAAATGATC AAAAGGGCAGTATTTGCAAG AGAATTAGGAGCTCCTATTG TCATGCATGACTATCTGACA GGAGGTTTTACCGCGAATAC LYCOdna10 GTACATATGAAGAAATGCTT AAAAGGGCACATTGTGCTAA AGAATTAGGAGTGCCTATCG TAATGCATGACTATTTGACG GGAGGTTTCACCGCAAATAC LYCOcdna1 GTACATATGAAGAAATGCTT AaaAGGGCACATTGTGCTAA AGAATTAGGAGTGCCTATCG TAATGCATGACTATTTGACG GGAGGTTTCACCGCAAATAC LYCOcdna1 GTACATATGAAGGAATGCtT AAAAGGgCACATtGTGCTAA AGAaTtAGGAGTGCCTATCG TAATGCATGACTATTtGACG GGAGGTTTCACCGCAAATAC PSILOdna1 GTACTTGCGAAGAAATGATG AAAAGAGCAGTCTTCGCTCG AGAATTAGGAGCCCCCATTG TTATGCACGACTATTTGACA GGAGGATTTACTGCAAATAC PSILOcdna GTACTTGCGAAGAAATGATG AAAAGAGCAGTCTTCGCTCG AGAATTAGGAGCCCCCATTG TTATGCACGACTATTTGACa GGAGGATTTACTGCAAATAC PSILOcdna GTACTTGCGAAGAGATGATG AAAAGAGCAGTCTTCGCTCG AGAATTAGGAGCCCCCATTG TTATGCACGACTATTTGACA GGAGGATTTACTGCAAATAC PSILOcdna GTACTTGCGAAGAAATGATG AAAAGAGCAGTCTTCGCTCG AGAATTAGGAGCCCCCATTG TTATGCACGACTATTTGACA GGAGGATTTACTGCAAATAC BOTRYdna GTACTTGTGAGGAAATGCTA AAAAgagcagTATTTGCTCG AGGATTGGGAGtACCCATTG TCATGCACGATTATTTGACa GgAGgATTCaCTGCAAATAC BOTRYcdna GTACTTGTGAGGAAATGCTA AAAAGAGCAGTATTTGCTCG AGAATTGGGAGTACCCATTG TCATGCACGATTATTTGACA GGAGGATTCACTGCAAATAC THUIDdna GTACTTGTGAAGAGATGTTA AAAAGAGCTCAATTCGCTAG AGAATTGGGAGTTCCTATTG TTATGCATGACTATTTAACA GGTGGTTTTACTGCAAATAC THUIDcdna GTACTTGTGAAGAGATGTTA AAAAGAGCCCAATTCGCTAG AGAATTGGGAGTTCCTATTG TTATGCATGACTATTTAACA GGTGGTTTTACTGCAAATAC THUIDcdna GTACTTGTGAAGAGATGTTA AAAAGAGCTCAATTCGCTAG AGAATTGGGAGTTCCTATTG TTATGCATGACTATTTAACA GGTGGTTTTACTGCAAATAC TRFR1082d 
GTACGTGTGAAGAAATGTTG AAAGGAGCCCAATTTGCTAt AGGATTGGGGGCACCAATTG TAATGcaTgAcTATCTGACC GGAGGGTTTACCGCAAACAC TRFRcdna9 GTACGTGTGAAGAAATGTTG AAAAGAGCCCAATTTGCTAG AGAATTGGGGGCACCAATTG TAATGcatGacTATCTGACC GGAGGGTTTACCGCAAACAC TRFRcdna9 GTACGTGTGAAGAAATGTTG AAAAGAGCCCAATTTGCTAG AGAATTGGGGG---CAATTG TAATGCATGACTATCTGACC GGAGGGTTTACCGCAAACAC TRFRcdna9 GTACGTGTGAAGAAATGTTG AAAAGAGCCCAATTTGCTAG AGAATTGGGGGCACCAATTG TAATGCATGACTATCTGACC GGAGGGTTTACCGCAAACAC PAUSIUdna GTACATGTGAAGAGATGATG AAAAGAGCTGCTTTTGCTAG GGAATTGGGTGCACCAATTG TCATGCATGACTACCTAACC GGAGGGTTTACTGCAAATAC PAUSIUcdn GTACATGTGAAGAGATGATG AAAAGAGCTGCtTTTGCTAG GGAATTGGGTGCACCAATTG TCATGCATGACTACCTAACC GGAGGGTTTACTGCAAATAC SELAG1dna GGaCATGCGAGGAAATGATG AAAAGGGCAGAATTCGCCAG AGAACTGGGAGTGCCCATCG TTATGCATGACTATTTGACA GGAGGGTTCACTGCGAATAC SELAG1cdn GGACATGCGAGGAAATGATG AAAAGGGCAGAATTCGCCAG AGAACTGGGAGTGCCCATCG TTATgcatgaCTATTTGACA GGAGGGTTCACTGCGAATAC SELAG1cnd GgACATgCGAGGAAATgATg AaAAGGGCAGAATTcGCCAG AGAACTgGGAGTgCCCATcG TtATgCATgACTaTTTgACA GGAGGGTtCaCTgCGAATaC SELAG3dna GGACATGCGAGGAAATGATG AAAAGGGCAGAATTCGCCAG AgAaCTGG-AGTGCCCATCG TTATGCATGACTATTTGACA gGAGGGTTCACTGCGAATAC SELAG4dna GGacATGCGAGGAAATGATG AAaagGgCGGAATTCGcCAG AGAACTGGGAGTgCCCATCG TTATGCATGACTATTTGACA GGAGGGTTCACTGCGAATaC L4LDU05dn GTACATGTGAGGAAATGATG AAAAGGGCACAATTTGCTAG AGAATTGGGAGTGCCTATCA TAATGCACGACTATTTGACA GGAGGTTTTACTGCAAATAC L4LDU05cD GTACATGTGAGGAAATGATG AAAAGGGCACAATTTGCTAG AGAATTGGGAGTGCCTATCA TAATGCACGACTATTTGACA GGAGGTTTTACTGCAAATAC AformDNA GTACATGTGAGGAAATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGAATGCCTATCG TAATGCATGATTATTTGACA GGAGGTTTTATTGCAAATAC AformCDNA GTACATGTGAGGAAATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGAATGCCTATCG TAATGCATGATTATTTGACA GGAGGTTTTACTGCAAATAC NOTHCRdna GTACATGTGACGAGATGATG AAAAGGGCATATTTTGCTAG AgAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGTGGTTTTACTGCAAATAC NOTHCRcdn GTACATGTGACGAGATGATG GAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGTGGTTTTACTGCAAATAC MSPCHdna GTACATGTGAAGAGATGATG AAAAGGGCATATTTTGCTAG 
AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGTGGTTCTACTGCAAATAC MSPCHcdna GTACATGTGAAGAGATGATG AAAAGGGCATATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGTGGTTTTACTGCAAATAC MFUCHdnaR GTACATGTGAAGAGATGATG AAAAGGGcaTATTTTGCtaG AGAATTAGGGGTgCCTATTG tAaTgCACGACTaTTTgACA GGTGGTTCtACTGCAAATaC MFUCHcdna GTACATGTGAAGAGATGATG AAAAGGGCATATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGTGGTTTTACTGCAAATAC MFLAGAY86 GTACATGTGAAGAGATGATG AAAAGGGCATATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGTGGTTCTACTGCAAATAC MFLAGcdna GTACATGTGAAGAGATGATG AAAAGGGCATATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGTGGTTTTACTGCAAATAC MFLAGcdna GTACATGTGAAGAGATGATG AAAAGGGCATATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGTGGTTTTACTGCAAATAC PFIMBAY86 GTACATGTGAAGAGATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGTGGTTCTACTGCAAATAC PFIMBcdna GTACATGTGAGGAGATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCCATTG TAATGCACGACTATTTGACA GGTGGTTTTACTGCAAATAC PFIMBcdna GTACATGTGAGGAGATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCCATTG TAATGCACGACTATTTGACA GGTGGTTTTACTGCAAATAC PCADCCdna GTACGTCTGAGGAAATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGAGGTTCTACTGCAAATAC PCADCCcdn GTACGTCTGAGGAAATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGAGGTTTTACTGCAAATAC PSPCH2dna GTACATGTGAAGAGATGATG AAAAGGGCACAGTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGTGGTTCTACTGCAAATAC PSPCH2cdn GTACGTCTGAGGAAATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTgACA GGAGGTTCTACTGCAAATAC PSPCHMdna GTACATGTGAAGAGATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGTGGTTCTACTGCAAATAC PSPCHMcdn GTACATGTGAAGAGATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTgACA GGTGGTTCTACTGCAAATAC PSPDO4dna GTACGTCTGAGGAAATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA GGAGGTTCTACTGCAAATAC PSPD04cdn GTACATGTGAAGAGATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCACGACTATTTGACA 
GGTGGTCCTACTGCTAATAC PPECAdnaJ GTACGTGTGAGGAAATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCATGACTATTTGACA GGAGGTTTTACTGCAAATAC PPECAcdna GTACGTGTGAGGAAATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGCGCCTATTG TAATGCATGACTATTTGACA GGAGGTTTTACTGCAAATAC PPECAcdna GTACGTGTGAGGAAATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCATGACTATTTGACA GGAGGTTTTACTGCAAATAC PPECAcdna GTACGTGTGAGGAAATGATG AAAAGGGCACATTTTGCTAG AGAATTAGGGGTGCCTATTG TAATGCATGACTATTTGaCA GGAGGTTTTACTGCAAATAC TAKAKIADN GTACATGTGAAGACACGATG AA---AGGCGGATTTGCTAG AGAATTGGGAGTGCCTATCG CCACGCATGACTATTCGACA GGAGGTTTTACTGCAAATAC TAKAKpred GTACATGTGAAGACACGATG AA---AGGCGGATTTGCTAG AGAATTGGGAGTGCCTATCG TCACGCATGACTATTTGACA GGAGGTTTTACTGCAAATAC SPAGdna10 GTACATGCGAAGAAATGATG AAgAgAGCGGCATTTGCTAG AGAGTTAGGAGCaCCTATCA TCATGCATGACTATCTGACA GGTGGtTTCACTGCAAATAC SPAGcdna1 GTACATGCGAAGAAATGATG AAGAGAGCGGCATTTGCTAG AGAGTTAGGAGCACCTATCA TCATGCATGACTATCTGACA GGTGATTTCACTGCAAATAC SPAGcdna1 GTACATGCGAAGAAATGATG AAGAGAGCGGCATTTGCTAG AGAGTTAGGAGCACCTATCA TCATGCATGACTATCTGACA GGTGGTTTCACTGCAAATAC SPAGcdna1 GTACATGCGAAGAAATGATG AAGAGAGCGGCATTTGCTAG AGAGTTAGGAGCACCTATCA TCATGCATGACTAcTCGACA GGTGGTTTCACTGCAAATAC SPAGcdna1 GTACATGCGAAGAAATGATG AAGAGAGCGGCATTTGCTAG AGAGTTAGGAGCACCTATCA TCATGCATGACTACCTGACA GGTGGTTTCACTGCAAATAC SPAGcdna1 GTACATGCGAAGAAATGATG AAGAGAGCGGCATTTGCTAG AGAGTTAGGAGCACCTATCA TCATGCATGACTATCTGACA TGTGGTTTCACTGCAAATAC HAPLOdna8 GTACGTGTGAAGAAATGATG AAAAGGGcAGAATATGCTAG AGAATTGGGAGTACCTATTG TCATGCACGATTATTTGACG GGAGGtTTCACGGCGAACAC HAPLOCDNA GTACGTGTGAAGAAATGATG AAAAGGGTAGAATTTGCTAG AGAATTGGGAGTACCTATTG TCATGCACGATTATTTGACG GGAGGTTTCACGGCGAACAC

810 830 850 870 890 Marchanti GTACATGTGAAGAAATGCTA AAAAGAGCAGCATGTGCTAG AGAGTTAGGTGTACCAATTG TTATGCACGATTACTTAACT GGTGGTTTCACTGCAAATAC Charavulg GAACTTGTGAAGAAATGCTT AAGAGAGCACAATGTGCAAG AGAATTAGGTATGCCTATTG TTATGCATGATTATCTTACA GGTGGATTTACAGCTAATAC GNETdna11 CACCTTGGCTCATTATTGCC GAGATAACGGCTTACTTCTT CACATTCACCGCGCAATGCA TGCAGTTATTGATAGACAAA AAAATCATGGTATGCATTTC GNETcdna1 CACCTTGGCTCATTATTGCC GAGATAACGGCTTACTTCTT CACATTCACCGCGCAATGCA TGCAGTTATTGATAGACAAA AAAATCATGGTATGCATTTC GNETcdna1 CACCTTGGCTCATTATTGCC GAGATAACGGCTTACTTCTT CACATTcacCgCGcAATGCA TGCAGTTATTGATAGACAAA AAAATCATGGTATGCATTTC LYCOdna10 TAGTTTAGCCCATTATTGTC GAgacaaTGGTCCACTGCTT CACATTCACCGCGCGATGCA CGCAGTTATTGACAGACAAA AGAATCATGGTATTCACTTC LYCOcdna1 TAGTTTAGCCCATTATTGTC GAGACAATGGTCTACTGCTT CACATTCACCGCGCGATGCA TGCAGTTATTGACAGACAAA AGAATCATGGTATTCACTTC LYCOcdna1 TAGTTTAGCCCATTATTGTC GAGACAATGGTCTACTGCTT CACATTCACCGCGCGATGCA TGCAGTTATTGACAGACAAA AGAATCATGGTATTCACTTC PSILOdna1 TAGTCTGGCTTTCTATTGTC GAGATAATGGTCTACTTCTT CATATTCACCGTGCCATGCA TGCTGTTATTGATAGACAAA GAAATCATGGTATGCACTTC PSILOcdna TAGTCTGGCTTTCTATTGTC GAGATAATGGTCTACTTCTT CATATTCACCGTGCCATGCA TGCTGTTATTGATAGACAAA GAAATCATGGTATGCACTTC PSILOcdna TAGTCTGGCTTTCTATTGTC GAGATAATGGTCTACTTCTT CATATTCACCGTGCCATGCA TGCTGTTATTGATAGACAAA GAAATCATGGTATGCACTTC PSILOcdna TAGTCTGGCTTTCTATTGTC GAGATAATGGTCTACTTCTT CATATTCACCGTGCCATGCA TGCTGTTATTGATAGACAAA GAAATCATGGTATGCACTTC BOTRYdna TAGCTCagCTTTCTACTGCC GGGATAATGGTCTACTCCTT CACATTCATCGTGCAATGCA TGCTGTTATTGATAGGCAAA GAA-TCATGGTATCCACTTC BOTRYcdna CAGCTTAGCTTTCTACTGCC GGGATAATGGTCTACTCCTT CACATTCATCGTGCAATGCA TGCTGTTATTGATAGGCAAA AGAATCATGGTATCCACTTC THUIDdna TACTTTGGCTCACTATTGTC GTGATAATGGTTTACTCCTT CATATTCACCGTGCAATGCA TGCAGTTATTGACCGACAAA AAAACCATGGTATGCATTTC THUIDcdna TACTTTGGCTCACTATTGTC GTGATAATGGTTTACTCCTT CATATTCACCGTGCAATGCA TGCAGTTATTGACCGACAAA AAAACCATGGTATGCATTTC THUIDcdna TACTTTGGCTCACTATTGTC GTGATAATGGTTTACTCCTT CATATTCACCGTGCAATGCA TGCAGTTATTGACCGACAAA AAAACCATGGTATGCATTTC TRFR1082d 
TAGCTTGGCCTTCTATTGCC GAGATAATGGGCTGCTTCTT CACATTCACCGTGCAATGCA TGCTGTCATCGATAGACAGA AAAATCACGGTATACATTTT TRFRcdna9 TAGCTTGGCCTTCTATTGCC GAGATAATGGGCTGCTTCTT CACATTCACCGTGCAATGCA TGCTGTCATCGATAGACAGA AAAATCACGGTATACATTTT TRFRcdna9 TAGCTTGgCCTTCTATTGCC GAGATAATGGGCTGCTTCTT CACATTCACCGTGCAATGCA TGCTGTCATCGATAGACAGA AAAATCACGGTATACATTTT TRFRcdna9 TAGCTTGGCCTTCTATTGCC GAGATAATGGGATGCTTCTT CACATTCACCGTGCAATGCA TGCTGTCATCGATAGACAGA AAAATCACGGTATACATTCT PAUSIUdna CAGCTTAGCTTTTTATTGCA GAGATAATGGACTGCTTCTT CATATTCACCGCGCAATGCA TGCTGTGATCGATAGGCAAC GAAATCACGGTATGCATTTT PAUSIUcdn CAGCTTAGCTTTTTATTGCA GGGATAATGGACTGCTTCTT CATATTCACCGCGCAATGCA TGCTGTGATCGATAGGCAAC GAAATCACGGTATGCATTTT SELAG1dna AACCCTGGCCTCCTACTGCC GGGATAATGGGCTACTACTA CATATACATCGCGCAATGCA TGCTGTTATTGACAGGCAGA AGAATCATGGTATTCATTTC SELAG1cdn AACCCTGGCCTCCTACTGCC GGGATAATGGGCTACTACTA CATATACATCGCGCAATGCA TGCTGTTATTGACAGGCAGA AGAATCATGGTATTCATTTC SELAG1cnd AACCCtGgCCTCCTaCTgCC GGGATAaTggGCTaCTaCTa CATaTaCATcGCGCAATgCA TgCtGTtATtGACAGGCAGA AGAATCATgGTaTTcATTTc SELAG3dna AACCCTGGCCTCCTACTGCC GGGATAATGGGCTACTACTA CATATACATCGCGCAATGCA TGCTGTTATTGACAGGCAGA AGAATCATGGTATTCATTTC SELAG4dna AACCCTGGCCTCCTACTGCC GGGATAATGGGCtaCTACTA CATATACATCGCGCAATGCA TGCTGTTATTGACAGGCAGA AGAATCATGGTATTCATTTC L4LDU05dn GAGTTTGGCTCATTATTGTC GAGACAATGGTTTACTTCTT CATATTCACCGTGCTATGCA TGCAGTTATCGATAGACAAA AAAATCATGGTATTCACTTT L4LDU05cD GAGTTTGGCTCATTATTGTC GAGACAATGGTTTACTTCTT CATATTCACCGTGCTATGCA TGCAGTTATCGATAGACAAA AAAATCATGGTATTCACTTT AformDNA GACCCTGGCTCGTTATTGTA GAGACAATGGTTTACTTCTT CATATTCATCGTGCCATGCA TGCAGTTACCGACAGACAAA GAAATCATGGTATTCACTTT AformCDNA GACCCTGGCTCGTTATTGTA GAGACAATGGTTTACTTCTT CATATTCATCGTGCCATGCA TGCAGTTATTGACAGACAAA GAAATCATGGTATTCACTTT NOTHCRdna AACTTTGGCTCACTACTGTC GAGACAACGGTTTACTTCCC CATATTCATCGTGCTATGCA TGCAGTTATCGACAGACAAA AAAATCATGGTATGCACTTT NOTHCRcdn AACTTTGGCTCACTACTGTC GAGACAACGGTTTACTTCTC CATATTCATCGTGCTATGCA TGCAGTTATTGACAGACAAA AAAATCATGGTATGCACTTT MSPCHdna AAGTTTGGCTCACTACTGTC GAGACAACGGTTCACTTCCC 
CATATTCATCGTGCTATGCA TGCAGTTACCGACAGACAAA GAAATCATGGTATTCACTTT MSPCHcdna AAGTTTGGCTCACTACTGTC GAGACAACGGTTTACTTCTT CATATTCATCGTGCTATGCA TGCAGTTATCGACAGACAAA GAAATCATGGTATTCACTTT MFUCHdnaR AAGTTTGGCTCACTatTGTC GAGACAACGGTTtACTTCCC CATaTTCATCGTGCTATGCA TGCGGTTACCGACAGACAAA GAAATCATGGTATTCACTTT MFUCHcdna AAGTTTGGCTCACTATTGTC GAGACAACGGTTTACTTCTC CATATTCATCGTGCTATGCA TGCGGTTATCGACAGACAAA GAAATCATGGTATTCACTTT MFLAGAY86 GAGTTTGGCTCACTACTGTC GAGACAACGGTTTACTTCCC CATATTCATTGTGCTATGCA TGCAGTTATCGACAGACAAA GAAATCATGGTATTCACTTT MFLAGcdna GAGTTTGACTCACTACTGTC GAGACAACGGTTTACTTCTC CATATTCATCGTGCTATGCA TGCAGTTATCGACAGACAAA GAAATCATGGTATTCACTTT MFLAGcdna GAGTTTGACTCACTACTGTC GAGACAACGGTTTACTTCTC CATATTCATCGTGCTATGCA TGCAGTTATCGACAGACAAA GAAATCATGGTATTCACTTT PFIMBAY86 GAGTTTGGCTCACTACTGTC GAGACAACGGTTTACTTCCC CATATTCATCGTGCTATGCA TGCGGTTACCGACAGACAAA GAAATCATGGTATTCACTTT PFIMBcdna GAGTTTGGCTCACTACTGTC AAGACAACGGTTTACTTCTC CATATTCATCGTGCTATGCA TGCGGTTATCGACAGACAAA GAAATCATGGTATTCACTTT PFIMBcdna GAGTTTGGCTCACTACTGTC AAGACAACGGTTTACTTCTC CATATTCATCGTGCTATGCA TGCGGTTATCGACAGACAAA GAAATCATGGTATTCACTTT PCADCCdna GAGTTTGGCTCACTACTGTC GAGATAATGGTTTACTTTCC TACATTCATCGTGCTATGCA TGCAGTTATCGACAGACAAA GAAATCATGGTATTCACTTT PCADCCcdn GAGTTTGGCTCACTACTGTC GAGATAATGGTTTACTTCTC CACATTCATCGTGCTATGCA TGCAGTTATCGACAGACAAA GAAATCATGGTATTCACTTT PSPCH2dna TAGTTTAGCTCACTACTGTC GAGACAACGGTTTACTTCCC CATATTCATCGTGCTATGCA TGCGGTTACCGACAGACAAA GAAATCATGGTATTCACTTT PSPCH2cdn GAGTTTGGCTCACTACTGTC GAGATAATGgTTTACTTtCC CACATTCATCGTGCTATGCA TGCAGTTATCGACAGGCAAA GAAATCATGGTATTCACTTT PSPCHMdna GACTTTGGCTCACTACTGTC GAGACAACGGTTCACTTCCC CATATTCATCGTGCTATGCA TGCGGTTACCGACAGACAAA GAAATCATGGTATTCACTTT PSPCHMcdn GACTTTGGCTCACTACTGTC GAGACAACGGTTCACTTCCC CAtATTCATCGTGCTATGCA TGCGgTtaCCGACAGACAAA GAAATCATGGTATTCACTTT PSPDO4dna GAGTTTGGCTCACTACTGTC GAGATAATGGTTTACTTTCC CACATTCATCGTGCTATGCA TGCAGTTATCGACAGACAAA GAAATCATGGTATTCACTTT PSPD04cdn GACTTTGGCTCACTACTGTC GAGACAAcGGTTCACTTCCC CATATTCATCGTGCTATGCA TGCGGTTACCGACAGACAAA 
GAAATCATGGTATTCACCTT PPECAdnaJ GAGTTTGGCTCACTACTGTC GAGACAATGGTTTACTTCTC CACATTCATCGTGCTATGTA TGCAGTTATCGACAGACAAA GAAATCATGGTATTCACTTT PPECAcdna GAGTTTGGCTCACTACTGTC GAGACAATGGTTTACTTCTC CACATTCATCGTGCTATGCA TGCAGTTATCGACAGACAAA GAAATCATGGTATTCACTTT PPECAcdna GAGTTTGGCTCACTACTGTC GAGACAATGGTTTACTTCTC CACATTCATCGTGCTATGCA TGCAGTTATCGACAGACAAA GAAATCATGGTATTCACTTT PPECAcdna GAGTTTGGCTCaCTACtGTC GAgACAATGGtTtACTTCTC CACATTCATCGTGCTATGCA TGCAGTTATCGaCAGACAAA GAAATCATGGTATTCACTTT TAKAKIADN TAGTTTAGCTCATTATCGCC GAGACAATGGTCCACTCCTT CACATTCACCGCGCAACGCA CGCAGTTATCGACCGACAAA AAAATCATGGTATGCACTTC TAKAKpred TAGTTTAGCTCATTATTGCC GAGACAATGGTCTACTCCTT CACATTCACCGCGCAATGCA CGCAGTTATCGACCGACAAA AAAATCATGGTATGCACTTC SPAGdna10 TACTTTGGCTCATTATTGCC GAGATAATGGTCTACTTCTT CATATTCATCGCGCAATGCA TGCAGTTATTGACCGACAAA AAAACCATGGTATGCACTTC SPAGcdna1 TACTTTGGCTCATTATTGCC GAGATAATGGTCTACTTCTT CATATTCATCGCGCAATGCA TGCAGTTATTGACCGACAAA AAAACCATGGTATGCACTTC SPAGcdna1 TACTTTGGCTCATTATTGCC GAGATAATGGTCTACTTcTT cATATTCATCGCGCAATGCA TGCAGTTATTGACCGACAAA AAAACCATGGTATGCACTTC SPAGcdna1 TACTTTGGCTCATTATTGCC GAGATAATGGTCTACTTCTT CATATTCATCGCGCAATGCA TGCAGTTATTGACCGACAAA AAAACCATGGTATGCACTTC SPAGcdna1 TACTTTGGCTCATTATTGCC GAGATAATGGTCTACTTCTT CATATTCATCGCGCAATGCA TGCAGTTATTGACCGACAAA AAAACCATGGTATGCACTTC SPAGcdna1 TACTTTGGCTCATTATTGCC GAGATAATGGTCTACTTCTT CATATTCATCGCGCAATGCA TGCAGTTATTGACCGACAAA AAAACCATGGTATGCACTTC HAPLOdna8 AACCtTGGCTCATTACTGTC GAGATAATGGTCTACTTCTT CACATCCACCGCGCGATGCA TGCAGTTATTGACAGACAGC AAAATCATGGCATGCATTTT HAPLOCDNA AACCTTGGCTCATTACTGTC GAGATAATGGTCTACTTCTT CACATCCACCGCGCGATGCA TGCAGTTATTGACAGACAGA AAAaTCATGgCATGCATTTT

910 930 950 970 990 Marchanti TAGTCTGGCTTTTTATTGCC GTGACAATGGTTTACTTCTT CATATTCACCGTGCAATGCA TGCAGTTATTGATAGACAAA AAAATCATGGTATACATTTC Charavulg TACTCTAGCTCATTATTGTC GAGATAATGGTTTATTGCTT CATATCCACCGTGCTATGCA CGCTGTACTTGACCGTCAAA AAAATCATGGTATGCATTTT GNETdna11 CGTGTACTGGCTAAAGCGTT GCGCTTGTCCGGTGGGGATC ACATTCATGCTGGTACTGTA GTTGGTAAACTTGAAGGAGA ACGAGAAATCACTTTAGGTT GNETcdna1 CGTGTACTGGCTAAAGCGTT GCGCTTGTCCGGTGGGGATC ACATTCATGCTGGTACTGTA GTTGGTAAACTTGAAGGAGA ACGAGAAATCACTTTAGGTT GNETcdna1 CGTGTACTGGCTAAAGCGTT GCGCTTGTCCGGTGGGGATC ACATTCATGCTGGTACTGTA GTTGGTAAACTTGAAGGAga acgagAAATCACTTTAGGTT LYCOdna10 CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ACATACACTCCGGTACTGTA GTAGGTAAACTCGAAGGAGA ACGCCAAATAACTTTAGGTT LYCOcdna1 CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ACATACACTCCGGTACTGTA GTAGGTAAACTCGAAGGAGA ACGCCAAATAACTTTAGGTT LYCOcdna1 CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ACATACACTCCGGTACTGTA GTAGGTAAACTCGAAGGAGA ACGCCAAATAACTTTAGGTT PSILOdna1 CGTGTATTGGCTAAAGCATT GCGTATGTCGGGTGGAGATC ATGTCCATGCTGGTACTGTA GTAGGTAAGCTTGAAGGGGA ACGAGACGTCACTCTGGGAT PSILOcdna CGTGTATTGGCTAAAGCATT GCGTATGTCGGGTGGAGATC ATGTCCATGTTGGTACTGTA GTAGGtAAGCTTGAAGGGGA ACGAGaCGTCACTCTGGGAT PSILOcdna CGTGTATTGGCTAAAGCATT GCGTATGTCGGGTGGAGATC ATGTCCATGCTGGTACTGTA GTAGGTAAGCTTGAAGGGGA ACGAGACGTCACTCTGGGAT PSILOcdna CGTGTATTGGCTAAAGCATT GCGTATGTCGGGTGGAGATC ATGTCCATGCTGGTACTGTA GTAGGTAAGCTTGAAGGGGA ACGAGACGTCACTCTGGGAT BOTRYdna CGTGTATTGGCCAAAGCATT GCGTATGTCCGGCGGGGATC ATATTCATTCCGGTACTGTA GTGGGTAAACTTGAGGGAGA ACGCGAAATAACTCTAGGCT BOTRYcdna CGTGTATTGGCCAAAGCATT GCGTATGTCCGGCGGGGATC ATATTCATTCCGGTACTGTA GTGGGTAAACTTGAGGGAGA ACGCGAAATAACTCTAGGCT THUIDdna CGTGTATTAGCTAA-GCATT ACGTCTATCAGGTGGAGATC ATATTCACGCTGGTACTGTA GTAGGTAAACTTGAAGGAGA ACGTCAAGTAACTTTAGGGT THUIDcdna CGTGTATTAGCTAAAGCATT ACGTCTATCAGGTGGAGATC ATATTCACGCTGGTACTGTA GTAGGTAAACTTGAAGGAGA ACGTCAAGTAACTTTAGGGT THUIDcdna CGTGTATTAGCTAAAGCATT ACGTCTATCAGGTGGAGATC ATATTCACGcTGGTACTGTA GTAGGTAAACTTGAAGGAGA ACGTCAAGTAACTTTAGGGT TRFR1082d 
CGTGTATTAGCAAAAGCATT ACGTATGTCCGGTGGGGATC ATGTTCACTCTGGGACTGTA GTAGGCAAACTAGAGGGAGA ACGTGAAGTTACCTTGGGTT TRFRcdna9 CGTGTATTAGCAAAAGCATT ACGTATGTCCGGTGGGGATC ATGTTCACTCTGGGACTGTA GTAGGCAAACTAGAGGGAGA ACGTGAAGTTACCTTGGGTT TRFRcdna9 CGTGTATTAGCAAAAGCATT ACGTATGTCCGGTGGGGATC ATGTTCACTCTGGGACTGTA GTAGGCAAACTAGAGGGAGA ACGTGAAGTTACCTTGGGTT TRFRcdna9 CGTGTATTAGCAAAAGCATT ACGTATGTCCGGTGGGGATC ATGTTCACTCTGGGACTGTA GTAGGCAAACTAGAGGGAGA ACGTGAAGTTACCTTGGGTT PAUSIUdna CGTGTATTGGCCAAAGCGTT ACGCATGTCCGGCGGAGATC ATATACATGCTGgAACTGTA GTAaGCAAACTAGAAGGGgA ACGAGAAGTTaCCCTTGGTT PAUSIUcdn CGTGTATTGGCCAAAGCGTT ACGCATGTCCGGCGGAGATC ATATACATGCTGGAACTGTA GTAGGCAAACTAGAAGGGGA ACGAGAAGTTACCCTTGGTT SELAG1dna CGCGTCTTGGCCAAAGCATT ACGTATGTCCGGCGGGGACC ACATCCACGCTGGCACCGTG GTGGGTAAGCTCGAAGGGGA GCGTCAAGTAACTCTAGGGT SELAG1cdn CGCGTCTTGGCCAAAGCATT ACGTATGTCCGGCGGGGACC ACATCCACGCTGGCACCGTG GTGGGTAAGCTCGAAGGGGA GCGTCAAGTAACTCTAGGGT SELAG1cnd CGCGTaTtGGCCaAAGCATT aCGtATgTcCGGCGGGGACC aCATcCaCGCtGGCaCCGtG GtGGGTaAGCTCGAAGGGGA GCGTcAAGTaACTCTaGGGT SELAG3dna CGCGTCTTGGCCAAAGCATT ACGTATGTCCGGCGGGGACC ACATCCACGCTGGCACCGTG GTGGGTAAGCTCGAAGGGGA GCGTCAAGTAACTCTAGGGT SELAG4dna CGCGTCTTGGCCAAAGCATT ACGTATGTCCGGCGGGGACC ACATCCACGCTGGCACCGTG GTGGGTAAGCTCGAAGGGGA GCGTCAAGTAACTCTAGGGT L4LDU05dn CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGAGC ATATTCATTCAGGTACTGTA GTCGGTAAATTGGAGGGGGA ACGTCAAGTAACTTTAGGCT L4LDU05cD CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ATATTCATTCAGGTACTGTA GTCGGTAAATTGGAGGGGGA ACGTCAAGTAACTTTAGGCT AformDNA CGTGTATCAGCTAAAGCATC ACGTATGTCCGGGGGAGATC ATATTCATCCAGGTACAGTA GTAGGTAAATTAGAGGGAGA ATGTGAAGTAACTTTAGGTT AformCDNA CGTGTATTAGCTAAAGCATT ACGTATGTCCGGGGGAGATC ATATTCATTCAGGTACAGTA GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGTT NOTHCRdna CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ATATTCATGCAGGTACTGCC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT NOTHCRcdn CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ATATTCATGCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT MSPCHdna CGTGTATCAGCTAAAGCATC ACGTATGTCTGGTGGAGATC 
ATATTCATGCAGGTACTGCC GTAGGTAAATCAGAGGGAGA ACGTGAAGTAACTTTAGGCT MSPCHcdna CGTGCATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ATATTCATGCAGGTACTGTC GTAGGTAAATTAGAGGGAGG ACGTGAAGTAACTTTAGGCT MFUCHdnaR CGTGtATtAGCTAAAGCATC ACGTATGTCTGGTGGAGATC ATATTCATGCAGGtACTGCC GTAGGtAAATTaGAGGGAGA ACGTGaAGtAACTTTAGGCT MFUCHcdna CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ATATTCATGCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT MFLAGAY86 CGTGTATCAGCTAAAGCATC ACGTATGTCTGGTGGAGATC ATATTCATGCAGGTACTGCC GTAGGTAAATCAGAGGGAGA ACGTGAAGTAACTTTAGGCT MFLAGcdna CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ATATTCATGCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT MFLAGcdna CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ATATTCATGCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT PFIMBAY86 CGTGTATCAGCTAAAGCATC ACGTATGTCTGGTGGAGATC ATATTCATCCAGGTACTGCC GTAGGTAAATCAGAGGGAGA ACGTGAAGTAACTTTAGGCT PFIMBcdna CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ATATTCATTCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT PFIMBcdna CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ATATTCATTCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT PCADCCdna CGTGTATCAGCTAAAGCATC ACGTATGTCTGGTGGAGATC ATATTCATCCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT PCADCCcdn CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ATATTCATTCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT PSPCH2dna CGTGTATCAGCTAAAGCATC ACGTATGTCTGGTGGAGATC ATATTCATCCAGGTACTGCC GTAGGTAAATCAGAGGGAGA ACGTGAAGTAACTTTAGGCT PSPCH2cdn CGtGTATCAGCTAAAGCATC ACGTATGTCTGGTGGAGATC ATATTCATcCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTtTAGGCT PSPCHMdna CGTGTATCAGCTAAAGCATC ACGTATGTCTGGTGGAGATC ATATTCATCCAGGTACTGCC GTAGGTAAATCAGAGGGAGA ACGTGAAGTAACTTTAGGCT PSPCHMcdn CGTGTATCAGCTAAAGCATC ACGTATGTCTGGTGGAGATC ATATTCATcCAGGTACTGCC GTAGGTAAATCAGAGGGAGA ACGTGAAGTAACTTTAGGCT PSPDO4dna CGTGTATCAGCTAAAGCATC ACGTATGTCTGGTGGAGATC ATATTCATCCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT PSPD04cdn CGTGTATCAGCTAAAGCATC ACGTATGTCTGGTGGAGATC ATATTCATCCAGGTACTGCC GTAGGTAAATCAGAGGGAGA 
ACGTGAAGTAACTTTAGGCT PPECAdnaJ CGTGTATCAGCTAAAGCATC ACGTATGTCTGGTGGAGATC ATATTCATCCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT PPECAcdna CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ATATTCATTCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT PPECAcdna CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ATATTCATTCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT PPECAcdna CGTGTATTAGCTAAAGCATT ACGTATGTCTGGTGGAGATC ATATTCATTCAGGTACTGTC GTAGGTAAATTAGAGGGAGA ACGTGAAGTAACTTTAGGCT 66 TAKAKIADN CGTGTATTAGCAAAGGCATT ACGTCTGTCCGGTGGAGATC ATATCCACTCTGGTACCGTG GTAGGTAAACTTGAAGGGGA ACGTCAAGTAACTCCAGGGT TAKAKpred CGTGTATTAGCAAAGGCATT ACGTCTGTCCGGTGGAGATC ATATCCACTCTGGTACCGTG GTAGGTAAACTTGAAGGGGA ACGTCAAGTAACTCTAGGGT SPAGdna10 CGTGTATTAGCTAAAGCATT ACGTTTATCTGGTGGGGATC ATATTCACGCCGGTACTGTA GTGGGTAAACTTGAAGGGGA ACGTCAAGTAACTCTGGGAT SPAGcdna1 CGTGTATTAGCTAAAGCATT ACGTTTATCTGGTGGGGATC ATATTCACGCCGGTACTGTA GTGGGTAAACTTGAAGGGGA ACGTCAAGTAACTCTGGGAT SPAGcdna1 CGTGTATTAGCTAAAGCATT ACGTTTATCTGGTGGGGATC ATATTCACGCCGGTACTGTA GTGGGtAAACCTGAAGGGGA ACGTCAAGTAACTCTGGGAT SPAGcdna1 CGTGTATTAGCTAAAGCAtT ACGTTTATCTGGTGGGGATC ATATTCACGCCGGTACTGTA GTGGGTAAACTTGAAGGGGA ACGTCAAGTAACTCTGGGAT SPAGcdna1 CGTGTATTAGCTAAAGCATT ACGTTTATCTGGTGGGGATC ATATTCACNCCGGTACTGTA GTGGGTAAACTTGAAGGGGA ACGTCAAGTAACTCTGGGAT SPAGcdna1 CGTGTATTAGCTAAAGCATT ACGTTTATCTGGTGGGGATC ATATTCACGCCGGTACTGTA GTGGGTAAACTTGAAGGGGA ACGTCAAGTAACTCTGGGAT HAPLOdna8 CGTGTATTAGCCAAAGCGTT GCGCCTATCCGGTGGGGATC ATATCCATGCCGGGACGGTC GTGGGTAAGCTCGAGGGAGA ACGACAAGTCACTCTAGGAT HAPLOCDNA CGTGTATTAGCCAAaGCGTT GCGCCTATCCGGtGGGGATC ATATCCATGCCGGGACGGTC GtGGgTAAGCTCGAGGGAgA ACGACAaGTCaCTCTAGgAT

1010 1030 1050 1070 1090 Marchanti CGTGTATTAGCAAAAGCTTT ACGTATGTCTGGTGGAGATC ATATTCACGCTGGTACTGTT GTAGGTAAACTTGAAGGAGA CCGTCAAGTAACTTTAGGTT Charavulg CGTGTTTTAGCTAAAGCTCT TCGTCTTTCTGGTGGTGATC ATGTGCATTCTGGTACTGTG GTAGGTAAATTGGAAGGCGA ACGTGAAGTTACTTTAGGTT GNETdna11 TTGTGGATTTACTTCGCGAT GATTTTGTTGAAAAAGACCG AAGTCGTGGTATTTATTTCA CCCAAGATTGGGTATCTACG CCGGGCGTCCTGCCTGTAGC GNETcdna1 TTGTGGATTTACTTCGCGAT GATTTTGTTGAAAAAGACCG AAGTCGTGGTATTTATTTCA CCCAAGATTGGGTATCTACG CCGGGCGTCCTGCCTGTAGC GNETcdna1 TTGTGGATTTACTTCGCGAT GATTTTGTTGAAAAAGACCG AAGTCGTGGTATTTATTTCA CCCAAGATTGGGTATCTACG CCGGGCGTCCTGCCTGTAGC LYCOdna10 TTGTTGATCTACTTCGTGAT GACTATATCGAGAAGGACCG AAGCCGTGGTATTTATTTCA CTCAAGATTGGGTATCTATG CCTGGTGCTTCGCCCGTAGC LYCOcdna1 TTGTTGATCTACTTCGTGAT GACTATATCGAGAAGgACCG AAGCCGTGGTATTTATTTCA CTCAAGATTGGGTATCTATG CCTGGTGTTTTGCCCGTAGC LYCOcdna1 TTGTTGATCTACTTCGTGAT GACTATATCGAGAAGGACCG AAGCCGTGGTATTTATTTCA CTCAAGATTGGGTATCTATG CCTGGTGTTTTGCCCGTAGC PSILOdna1 TTGTTGATTTACTTCGTGAT GATTATATTGAGAAGGACCG GAGCCGCGGCGTCTATTTCA CTCAAGATTGGGTTTCTATG CCAGGTGTACTGCCTGTTGC PSILOcdna TTGTTGATTTaCTTCGTGAT GATTATATTGaGAAGGACCG GAGCCGCGGCGTCTATTTCA CTCAAGATTGGGTTTCTATG CCAGGTGTACTGCCTGTTGC PSILOcdna TTGTTGATTTACTTCGTGAT GATTATATTGAGAAGGACCG GAGCCGCGGCGTCTATTTCA CTCAAGATTGGGTTTCTATG CCAGGTGTACTGCCTGTTGC PSILOcdna TTGTTGATTTACTTCGTGAT GATTATATTGAGAAGGACCG GAGCCGCGGCGTCTATTTCA CTCAAGATTGGGTTTCTATG CCAGGTGTACTGCCTGTTGC BOTRYdna TCGTTGATCTACTCCGTGAC GATTATATCGAGAAAGATCG AAGCCGCGGCATCTATTTCA CCCAAGATTGGGTATCTATG CCGGGTGTACTGCCTGTAGC BOTRYcdna TCGTTGATCTACTCCGTGAC GATTATATCGAGAAAGATCG AGGCCGCGGCATCTATTTCA CCCAAGATTGGGTATCTATG CCGGGTGTACTGCCTGTAGC THUIDdna TTGTAGATTTGCTTCGCGAT GACTATATCGAAAAAGATAG AAGTCGTGGTATTTATTTCA CCCAAGACTGGGTTTCTTTA CCAGGTATTTTATCCGAAGC THUIDcdna TTGTAGATTTGCTTCGCGAT GACTATATCGAAAAAGATAG AAGTCGTGGTATTTATTTCA CCCAAGACTGGGTTTCTTTg CCAGGTATTTTACCCGTAGC THUIDcdna TTGTAGATTTGCTTCGCGAT GACTATATCGAAAAAGATAG AAGTCGTGGTATTTATTTCA CCCAAGACTGGGTTTCCTTA CCAGGTATTTTACCCGTAGC TRFR1082d 
TTGTCGATTTGCTTCGCGAC GATTATATTGAAAAAGACCG TAGCCGCGGCATCTATTTCA CCCAAGATTGGGTATCTATG CCGGGCGTACTTCCCGTAGC TRFRcdna9 TTGTCGATTTGCTTCGCGAC GATTATATTGAAAAAGACCG TAGCCGCGGCATCTATTTCA CCCAAGATTGGGTATCTATG CCGGGCGTACTTCCCGTAGC TRFRcdna9 TTGTCGGTTTGCTTCGCGAC GATTATATTGAAAAAGACCG TAGCCGCGGCATCTATTTCA CCCAAGATTGGGTATCTATG CCGGGCGTACTTCCCGTAGC TRFRcdna9 TTGTCGATTTGCTTCGCGAC GATTATATTGAAAAAGACCG TAGCCGCGGCATCTATTTCA CCCAAGATTGGGTATCTATG CCGGGCGTACTTCCCGTAGC PAUSIUdna TCGTcGATTTACTTCGCG-C GATTATATTGAAAAAGATCG TAGCCGCGGTATTTATTTCA CCCAAGATTGGGTATCTATG CCGGGTGTATTCCCTGTAGC PAUSIUcdn TCGTCGATTTACTTCGCGAC GATTATATTGAAAAAGATCG TAGCCGCGGTATTTATTTCA CCCAAGATTGGGTATCTATG CCGGGTGTATTCCCTGTAGC SELAG1dna TCGTGGATTTGCTCCGGGAT GACTATATTGAAAAGGATCG GAGCCGAGGTATTTACTTCA CCCAGGATTGGGTATCTATG CCTGGTGTACTGCCCGTAGC SELAG1cdn TCGTGGATTTGCTCCGGGAT GACTATATTGAAAAGGATCG GAGCCGAGGTATTTACTTCA CCCAGGATTGGGTATCTATG CCTGGTGTACTGCCCGTAGC SELAG1cnd TCGtGGATTtgCTCCGGGAT gAC-AtatTGAAAAGGATcG GAGCCGAGGtATTTaCTTcA CCCAGGAtTGGGtATcTaTg CCtGGtGTaCTgCCCGtAGC SELAG3dna TCGTGGATTTGCTCCGGGAT GACTATATTGAAAAGGATCG GAGCCGAGGTATTTACTTCA CCCAGGATTGGGTATCTATG CCTGGTGTACTGCCCGTAGC SELAG4dna TCGTGGATTTGCTCCGGGAT GACTATATTGAAAAGGATCG GAGCCGAGGTATTTACTTCA CCCAGGATTGGGTATCTATG CCTGGTGTACTGCCCGTAGC L4LDU05dn TTGTTGATTTACTTCGCGAT GATTATATTGAAAAAGATCG AAGTCGTGGTATCTATTTCT CTCAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTTGC L4LDU05cD TTGTTGATTTACTTCGCGAT GATtATATTGAAAAAGATCG AAGTCGTGGTATCTATTTCT CTCAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTTGC AformDNA TTGTGGATTTACCTTGTGAC GATTATATTGAAAAAGATAG AAGTCGTGGGATATATTTTA CTCAAGACTGGGTCTCTATG CCTGGTGTTTTGCTCGTGGC AformCDNA TTGTGGATTTACTTCGTGAC GATTATATTGAAAAAGATAG AAGTCGTGGGATATATTTTA CTCAAGACTGGGTCTCTATG CCTGGTGTTTTGCCCGTGGC NOTHCRdna TTGTTGATTCACCTCGCGAC GATTATATCGAAAAGGATCG AAGTCGTGGTATATATTTCA CTCAAGACTGGGTATCTATG CCTGGtGTTTTCCCTGtCGC NOTHCRcdn TTGTTGATTTACTTCGCGac GATTATATCGAAAAGGaT-G AAGTCGTGGtATATATtTCA CTCAaGACTGGGTATCTaTG CCTGGTGTTTTCCCTGTCGC MSPCHdna TTGTTGATTCACTTCGCGAC GATTATATCGAAAAGGATCG 
AAGTCGTGGTATCTATTTCA CTCAAGACTGGGTATCTATG CCTGGTGTTTCGCCTGTCGC MSPCHcdna TTGTTGATTTACTTCGCGAC GATATATCGAAAAGGATCGA AGTCCGTGGTATCTATTTCA CTCAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTCGC MFUCHdnaR TTGTTGATTCACTTCGCGAC GATTATATCgaAAAGGATCG aAGTCGTGGtATCTATTTCA CTCAAGACTGGGtATCTATG CCTGGTGTTTCGCCTGTCGC MFUCHcdna TTGTTGATTTACTTCGCGAC GATTATATCGAAAAGGATCG AAGTCGTGGTATCTATTTCA CTCAGGACTGGGTATCTATG CCTGGGGTTTTGCCTGTCGC MFLAGAY86 TTGTTGATTCACCTTGTGAC GATTATATCGAAAAGGATCG AAGTCGTGGTATCTATTTCA CTCAAGACTGGGTATCTATG CCGGGTGTTTCGCCTGTCGC MFLAGcdna TTGTTGATTTACTTCGTGAC GATTATATCGAAAAGGATCG AAGTCGTGGTATCTATTTCA CTCAAGACTGGGTATCTATG CCGGGTGTTTTGCCTGTCGC MFLAGcdna TTGTTGATTTACTTCGTGAC GATTATATCGAAAAGGATCG AAGTCGTGGTATCTATTTCA CTCAAGACTGGGTATCTATG CCGGGTGTTTTGCCTGTCGC PFIMBAY86 TTGTTGATTCACCTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATCTATTCCA CTCAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTCGC PFIMBcdna TTGTTGATTTACTTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATCTATTTTA CTCAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTCGC PFIMBcdna TTGTTGATTTACTTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATCTATTTTA CTCAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTCGC PCADCCdna TTGTTGATTCACTTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATCTATTCCA CTTAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTCGC PCADCCcdn TTGTTGATTTACTTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATCTATTTTA CTCAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTCGC PSPCH2dna TTGTTGATTCACCTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATCTATTCCA CTCAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTCGC PSPCH2cdn TTGTTGATTCACTTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATCTATTCCA CTTAAGACTGGGTATCTATG CCTGGCGTTTTGCCTGTCGC PSPCHMdna TTGTTGATTCACCTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATCTATTCCA CTCAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTCGC PSPCHMcdn TTGTTGATTCACCTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATCTATTCCA CTCAAGACTGGGTATCTATG CcTGGTGTTTTGCCTGTCGC 67 PSPDO4dna TTGTTGATTCACTTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATCTATTCCA CTTAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTCGC PSPD04cdn TTGTTGATTCACCTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATCTATTCCA CTCAAGACTGGGTATCTATG 
CCTGGTGTTTTGCCTGTCGC PPECAdnaJ TTGTTGATTCACTTCGCGAC GATTACATCGAAAAAGaTCG AaGTCGTGGtaTTTATTCCA CTTAAgACTGGGtATCTATG CTTGGTGTTTTGCCTGTCGC PPECAcdna TTGTTGATTTACTTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATTTATTTCA CTCAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTCGC PPECAcdna TTGTTGATTTACTTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATTTATTTCA CTCAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTCGC PPECAcdna TTGTTGATTTACTTCGCGAC GATTACATCGAAAAAGATCG AAGTCGTGGTATTTATTTCA CTCAAGACTGGGTATCTATG CCTGGTGTTTTGCCTGTCGC TAKAKIADN TCGGTTTGGACCATCGTGAT GATTACATCGAAAAAGATCG AAGTCGTGGTATTTATTTCA CTCAGGACTGGGTCTCTTCG CCAGGTGTTTTACCTGTAGC TAKAKpred TCGGTTTGGACCATCGTGAT GATTACATCGAAAAAGATCG AAGTCGTGGTATTTATTTCA CTCAGGACTGGGTCTCTTTG CCAGGTGTTTTACCTGTAGC SPAGdna10 TTGTTGATCTACTCCGTGAT GACTATATCGAAAAAGACCG AAGCCGTGGTATTTATTTCA CCCAAGACTGGGTTTCTTTA CCAGGTGTTTTACCTGTAGC SPAGcdna1 TTGTTGATCTACTTCGTGAT GACTATATCGAAAAAGACCG AAGCCGTGGTATTTATTACA CCCAAGACTGGGTTTCTTTA CCAGGTGTTTTACCTGTAGC SPAGcdna1 TTGGTGATCTACTTCGTAAT GACTATATCGAAAAAGACCG AGGCCGTGGTATTTATTTCA CCCAAGACTGGGTTTCTTTA CCAGGTGTTTTACCTGTAGC SPAGcdna1 TTGTTGATCTACTTCGTGAT GACTATATCGAAAAAGACCG AAGCCGTGGTATTTATTTCA CCCAAGACTGGGTTTCTTTA CCAGGTGTTTTACCTGTAGC SPAGcdna1 TTGTTGATCTACTTCGTGAT GACTATATCGAAAAAGACCG AAGCCGTGGTATTTATTTCA CCCAAGACTGGGTTTCTTTA CCAGGTGTTTTACCTGTAGC SPAGcdna1 TTGTTGATCTACTTCGTGAT GACTATATCGAAAAAGACCG AAGCCGTGGTATTTATTTCA CCCAAGACTGGGTTTCTTTA CCAGGGGtTTTACCTGGAGC HAPLOdna8 TTGTTGATTTGCTCCGCGAT GATTATATAGAAAAAGATCG AAGCCGTGGTATTTATTTCA CCCAGGATTGGGTTTCCTTA CCTGGTGTGTTCCCCGTAGC HAPLOCDNA TtGTtGATTTGCTCCGCGAT GATTATATAGAAAAAGATCG AAGCCGTGGTATTTATTTTA CCCAGGATTGGGTTTCCTTA CCTGGTGTGTTCCCCGTAGC

1110 1130 1150 1170 1190 Marchanti TCGTAGATTTACTTCGTGAT GACTATATTGAAAAAGATAG AAGTCGTGGTATTTATTTCA CACAAGATTGGGTTTCTTTA CCTGGTGTTTTCCCTGTAGC Charavulg TTGTGGATTTACTTCGTGAT GATTATATTGAAAAAGATAG AAGTCGCGGTGTTTACTTTA CCCAAGATTGGGTTTCTTTA CCAGGTGTATTACCTGTAGC GNETdna11 TTCGGGAGGTATTCATGTTT GGCATATGCCAGCTCTAACT GAAATTTTTGGAGATGATGC AGTACTCCAATTTGGTGGAG GAACCTTGGGACATCCTTGG GNETcdna1 TTCGGGAGGTATTCATGTTT GGCATATGCCAGCTCTAACT GAAATTTTTGGAGATGATGC AGTACTCCAATTTGGTGGAG GAACCTTGGGACATCCTTGG GNETcdna1 TTCGGGAGGtATTCCTGTTT GGCATATcCCAGCTCTAACT GAAATTTTTGGAGATGATGC AGTACTCCAATTTGGTgGAG GAACCTTGGGACATCCTTGG LYCOdna10 TTCGGgAGGTATTCATGTCT GGCACATGCCTGCTCTGACT GAAATCTTTGGAGATGATTC TGTATCACAATTCGGTGGGG GAACTTTGGGACACCCTTGG LYCOcdna1 TTCGGGAGGTATTCATGTCT GGCACATGCCTGCTCTGACT GAAATCTTTGGAGATGATTC TGTATTACAATTCGGTGGGG GAACTTTGGGACACCCTTGG LYCOcdna1 TTCGGGAGGTATTCATGTCT GGCACATGCCTGCTCTGACT GAAATCTTTGGAGATGATTC TGTATTACAATTCGGTGGGG GAACTTTGGGACACCCTTGG PSILOdna1 TTCGGGGGGTATTCATGTTT GGCACATGCCAGCTTTAACT GAGATCTTTGGGGATGATTC TGTATTACAGTTCGGTGGAG GAACCTTGGGACATCCTTGG PSILOcdna TTCGGGGGGTATTCATGTTT GGCACATGCCAGCTTTAACT GAGATCTTTGGGGATGATTC TGTATTACAGTTCGGTGGAG GAACCTTGGGACATCCTTGG PSILOcdna TTCGGGGGGTATTCATGTTT GGCACATGCCAGCTTTAACT GAGATCTTTGGGGATGATTC TGTATTACAGTTCGGTGGAG GAACCTTGGGACATCCTTGG PSILOcdna TTCGGGGGgTAtTCATGTTT GGCACATGCCAGCTTTAACT GAGATCTTTGGGGATGATTC TGTATTACAGTTCGGTGGAG GAACCTTGGGACATCCTTGG BOTRYdna TTCGGGAGGCATCCATGTTT GGCATATGCCCGCTTTGACT GAAATCTTCGGGGACGATTC CGTACTCCAATTTGGTGGAG GAACTTTGGGACACCCCTGG BOTRYcdna TTCGGGAGGCATCCATGTTT GGCATATGCCCGCTTTGACT GAAATCTTCGGGGACGATTC CGTACTCCAATTTGGTGGAG GAACTTTGGGACACCCCTGG THUIDdna TTCTGGTGGTATTCATGTTT GGCATACGCCAGCATTAACT GAAATCTTTGGAGATGACTC TGTATTACAATTTGGTGGAG GAACTTTAGGTCACCCTTGG THUIDcdna TTCTGGTGGTATTCATGTTT GGCATATGCCAgCATTAACT GAAATCTTTGGAGATGACTC TGTATTACAATTTGGTGGAG GAACTTTAGGTCACCCTTGG THUIDcdna TTCTGGTGGTATTCATGTTT GGCATATGCCAGCATTAACT GAAATCTTTGGAGATGACTC TGTATTACAATTTGGTGGAG GAACTTTAGGTCACCCTTGG TRFR1082d 
TTCGGGGGGTATCCACGTAT GGCATATGCCTGCTCTAACC GAAATCTTCGGAGACGATTC TGTCTTACAGTTCGGCGGAG GAACCTTGGGACATCCTTGG TRFRcdna9 TTCGGGGGGTATCCACGTAT GGCATATGCCTGCTCTAACC GAAATCTTCGGAGACGATTC TGTCTTACAGTTCGGCGGAG GAACCTTGGGACATCCTTGG TRFRcdna9 TTCGGGGGGTATCCACGTAT GGCATATGCCTGCTCTAACC GAAATCTTCGGAGACGATTC TGTCTTACAGTTCGGCGGAG GAACCTTGGGACATCCTTGG TRFRcdna9 TTCGGGGGGTATCCACGTAT GGCATATGCCTGCTCTAACC GAAATCTTCGGAGACGATTC TGTCTTACAGTTCGGCGGAG GAACCTTGGGACATCCTTGG PAUSIUdna TTCAGGAGGCATCCACGTCT GGCACATGCCTGCTCTAACC GAAATCTTTGGGGACGATTC CGTATTACAGTTCGGCGGAG GAACTTTAGGACATCCTTGG PAUSIUcdn TTCAGGAGGCATCCACGTCT GGCACATGCCTGCTCTAACC GAAATCTTTGGGGACGATTC CGTATTACAGTTCGGCGGAG GAGCTTTAGGACATCCTTGG SELAG1dna TTCCGGGGGCATTCACGTTT GGCATATGCCCGCCCTGACT GAAATATTTGGAGATGATTC CGTGCTCCAATTCGGAGGCG GTACTTTGGGTCACCCTTGG SELAG1cdn TTCCGGGGGCATTCACGTTT GGCATATGCCCGCCCTGACT GAAATATTTGGAGATGATTC CGTGCTCCAATTCGGAGGCG GAACTTTGGGTCACCCTTGG SELAG1cnd TTCGGGGG-CATT-ACGTTt GGCAtATgCCCGCCCTgaCT gAAAtATT-GGAGAtGATTc CGTgCTcCAATTcGGAGGCG GAACTTtGGGTcACCCTtGG SELAG3dna TTCCGGGGGCATTCACGTTT GGCGTATGCCCGCCCTGACT GAAATATTTGGAGATGATTC CGTGCTCCAATTCGGAGGCG GAACTTTGGGTCACCCTTGG SELAG4dna TTCCGGGGGCATTCACGTTT GGCATATGCCCGCCCTGACT GAAATATTTGGAGATGATTC CGTGCTCCAATTCGGAGGCG GAACTTTGGGTCACCCTTGG L4LDU05dn TTCAGGAGGTATTCATGTTT GGCATATGCCTGCTTTAACC GAGATTTTTGGAGATGATTC CGTATTGCAATTTGGTGGAG GAACTCTAGGACACCCTTGG L4LDU05cD TTCAGGAGGTATTCATGTTT GGCATATGCCTGCTTTAACC GAGATTTTTGGAGATGATTC CGTATTGCAATTTGGTGGAG GAACTCTAGGACACCCTTGG AformDNA TTCAGGTGGTATTCATGTTT GGCATATGCCAGCTTTAACT GAGATTTTCGGAGATGATTC CGTACTACAGTTTGGCGGGG GAACTCTAGGACACCCTTGG AformCDNA TTCAGGTGGTATTCATGTTT GGCATATGCCAGCTTTAACT GAGATTTTCGGAGATGATTC CGTACTACAGTTTGGCGGGG GAACTCTAGGACACCCTTGG NOTHCRdna TtCAGGAGGTATtCATGTCT GGCATATgCCTGCTTtaACT GAgATTtCCGGAGATgaTTC tGtaTcacaAtttggTGGgG GAaCTTtAGGgCACCctTGg NOTHCRcdn TTCAGGAGGtATTCATGTCT GGCATATGCCTGCttTAACT GAgATTTTCGGAgaTGaTtC TGTATTACAATTTGGTGGGG GAaCTTTAGGGCACCCTTGG MSPCHdna TTCAGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT 
GAGATTTCCGGAGATGATTC TGCATCACAATTTGGTGGAG GGACTTTAGGGCACCCTTGG MSPCHcdna TTCAGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTTCGGAGATGATTC TGTATTACAATTTGGTGGAG GGACTTTAGGGCACCCTTGG MFUCHdnaR TTCAGgaGGtATTCATGTCT GGCATATGCCTGCTTtAACT gAgATTTCCGGGgaTgaTTC TGCATCACAATTTGGTGGGG GgaCTTTAGGGCACCCTTGG MFUCHcdna TTCAGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTTCGGAGATGATTC TGTATTACAATTTGGTGGGG GGACTTTAGGGCACCCTTGG MFLAGAY86 TTCAGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTCCGGAGATGATTC TGTATCACAGTTTGGTGGGG GAACTTTAGGGCACCCTTGG MFLAGcdna TTCAGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTTCGGAGATGATTC TGTATTACAGTTTGGTGGGG GAACTTTAGGGCACCCTTGG MFLAGcdna TTCAGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTTCGGAGATGATTC TGTATTACAGTTTGGTGGGG GAACTTTAGGGCACCCTTGG PFIMBAY86 TTCGGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTCCGGAGATGATTC TGCATCACAGTTTGGTGGGG GAACTTTAGGGCACCCTTGG PFIMBcdna TTCGGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTTCGGAGATGATTC TGTATTACAGTTTGGTGGGG GAACTTTAGGGCACCCTTGG PFIMBcdna TTCGGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTTCGGAGATGATTC TGTATTACAGTTTGGTGGGG GAACTTTAGGGCACCCTTGG 68 PCADCCdna TTCAGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTCCGGAGATGATTC TGTATCACAATTTGGTGGGG GAACTCTAGGGCACCCTTGG PCADCCcdn TTCAGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTTCGGAGATGATTC TGTATTACAATTTGGTGGGG GAACTCTAGGGCACCCTTGG PSPCH2dna TTCGGGAGGTATTCATGTCT GGCATATGCCTGCTTTGACT GAGATTTCCGGAGATGATTC TGTATCACAGTTTGGTGGGG GAACTTTAGGGCACCCTTGG PSPCH2cdn TTCAGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTCCGGAGATGATTC TGTATCACAATTTGGTGGGG GAACTCTAGGGCACCCTTGG PSPCHMdna TTCGGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTCCGGAGATGATTC TGCATCACAGTTTGGTGGGG GAACTTTAGGGCACCCTTGG PSPCHMcdn TTCGGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTCCGGAGATGATTC TGCATCACAGTTTGGTGGGG GAACTTTAGGGCACCCTTGG PSPDO4dna TTCAGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTCCGGAGATGATTC TGTATCACAATTTGGTGGGG GAACTCTAGGGCACCCTTGG PSPD04cdn TTCGGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTCCGGAGATGATTC TGCATCACAGTTTGGTGGGG 
GAACTTTAGGGCACCCTTGG PPECAdnaJ TTCAGgAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAgATTTCCGgagATgATTC TGtATCACAATTTGGTGGGG gaACTTTAGGGCACCCTCGG PPECAcdna TTCAGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTTCGGAGATGATTC TGTATTACAATTTGGTGGGG GAACTTTAGGGCACCCTTGG PPECAcdna TTCAGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTTCGGAGATGATTC TGTATTACAATTTGGTGGGG GAACTTTAGGGCACCCTTGG PPECAcdna TTCAGGAGGTATTCATGTCT GGCATATGCCTGCTTTAACT GAGATTTTCGGAGATGATTC TGTATTACAATTTGGTGGGG GAACTTTAGGGCACCCTTGG TAKAKIADN TTCGGGAGGTATTCATGTTC GGCATATGCCCGCTTCGACC GAAATTTATGGAGATGATTC TGTATTACAGTTCGGTGGAG AAACTTTAGGACACCCT--- TAKAKpred TTCGGGAGGTATTCATGTTT GGCATATGCCCGCTTTGACC GAAATTTATGGAGATGATTC TGTATTACAGTTCGGTGGAG AAACTTTAGGACACCCT--- SPAGdna10 TTCGGGAGGTATTCATGTTT GGCATATGCCTGCTCTAACT GAAATTTTCGGGGATGATTC CGTATTACAGTTTGGTGGAG GAACTCTGGGACATCCTTGG SPAGcdna1 TTCGGGAGGTATTCATGTTT GGCATATGCCTGCTCTAACT GAAATTTTTGGGGATGATTC CGTATTACAGTTTGGTGGAG GAACTCTGGGACATCCTTGG SPAGcdna1 TTCGGGAGGTATTCATGTTT GGCATATGCCTGCTCTAACT GAAATTTTTGGGGATGATTC CGTATTACAGTTTGGTGGAG GAACTCTGGGACATCCTTGG SPAGcdna1 TTCGGGAGGTATTCATGTTT GGCATATGCCTGCTCTAACT GAAATTTTTGGGGATGATTC CGTATTACAGTTTGGTGGAG GAACTCTGGGACATCCTTGG SPAGcdna1 TTCGGGAGGTATTCATGTTT GGCATATGCCTGCTCTAACT GAAATCTTTGGGGATGATTC CGTATTACAGTTTGGTGGAG GAACTCTGGGACATCCTTGG SPAGcdna1 TTCgGGAG-TATTCATGTTT GgCATATGCCTGCTCTAACT GAAATTTTTgGGGATGATTC CGTATTACAGTTTGGTGGAG GAACTCTGGGACATCCTTGG HAPLOdna8 TTCGGGGGGTATCCATGTTT GGCATATGCCAGCTCTGACC GAAATTTTTGGAGATGATCC CGTACTACAGTTCGGTGGTG GAACCCTAGGCCACCCGTGG HAPLOCDNA TTCGGGGGGTATCCATGTTT GGCATATGCCAGCTCTGACC GAAATTTTTGGAGATGATTC CGTACTACAGTTCGGTGGTG GAACCCTAGGCCACCCGTGG

1210 1230 1250 1270 1290 Marchanti ATCTGGTGGGATCCATGTTT GGCATATGCCTGCTTTAACT GAAATTTTTGGAGATGACTC TGTTTTACAATTTGGTGGTG GAACTTTAGGTCATCCTTGG Charavulg TTCAGGTGGAATTCATGTTT GGCATATGCCTGCATTAACT GAAATTTTTGGTGACGATTC AGTATTGCAATTTGGAGGAG GCACTTTAGGTCATCCTTGG GNETdna11 GGGAATGCACCCGGTGCTGT AGCTAATCGAGTTGCTTTAG AGGCTTGTGTACAGGCTCGT AATGAAGGGCGTGATCTTGT TCGTGAAGGTAATGAAGTGA GNETcdna1 GGGAATGCACCCGGTGCTGT AGCTAATCGAGTTGCTTTAG AGGCTTGTGTACAGGCTCGT AATGAAGGGCGTGATCTTGT TCGTGAAGGTAATGAAGTGA GNETcdna1 GGGAATGCACCCGGTGCTGT AGCTAATCGAGTTGCTTTAG AGGCTTGTG-ACAGGCTCG- AATgAAGGGCGTgATCTTGT TCGTGAAGGtAATgAAGTgA LYCOdna10 GGGAATGCACCCGGTGCAGT AGCAAACCGAGTTGCTTTAG AAGCTTGTGTACAGGCTCGT AATGAAGGACGCGATCTTGC TCGTGAGGGTAATGAGATTA LYCOcdna1 GGGAATGCACCCGGTGCAGT AGCAAACCGAGTTGCTTTAG AAGCTTGTGTACAGGCTCGT AATGAAGGACGCGATCTTGC TCGTGAGGGTAATGAGATTA LYCOcdna1 GGGAATGCACCCGGTGCAGT AGCAAACCGAGTTGCTTTAG AAGCTTGTGTACAGGCTCGT AATGAAGGACGCGATCTTGC TCGTGAGGGTAATGAGATTA PSILOdna1 GGAAATGCACCAGGTGCTGT GGCTAATCGAGTTGCGTTGG AAGCCTGTGTACAAGCTCGT AATGAGGGACGTGATCTTGC TCGTGAAGGGAATGACATTA PSILOcdna GGAAATGCACCAGGTGCTGT GGCTAATCGAGTTGCGTTGG GAGCCTGTGTACAAGCTCGT AATGAGGGACGTGATCTTGC TCGTGAAGGGAATGACATTA PSILOcdna GGAAATGCACCAGGTGCTGT GGCTAATCGAGTTGCGTTGG AAGCCTGTGTACAAGCTCGT AATGAGGGACGTGATCTTGC TCGTGAAGGGAATGACATTA PSILOcdna GGAAATGCACCAGGTGCTGT GGCTAATCGAGTTGCGTTGG AAGCCTGTGTACAAGCTCGT AATGAGGGACGTGATCTTGC TCGTGAAGGGAATGACATTA BOTRYdna GGTAATGCACCTGGTGCAGT GGCCAATCGAGTTGCTCTGG AAGCTTGTGTACAGGCTCGT AATGAAGGGCGTGATCTTGC TCGTGAAGGTAATGAAATTA BOTRYcdna GGTAATGCACCTGGTGCAGT GGCCAATCGAGTTGCTCTGg AAGCTTGTG-ACAGGCTCGt AATgAAGGGCGTgATCTTGC TCGTgAAGGtAATgAAATtA THUIDdna GGTAATGCACCTGGTGCAGT TGCTAACAGAGTTGCTTTAG AAGCTTGTGTGCAAGCTCGT AATGAAGGACGTGATCTTGC TCGCGAAGGTAATGAAATTA THUIDcdna GGTAATGCACCTGGTGCAGT TGCTAATAGAGTTGCTTTAG AAGCTTGTGTGCAAGCTCGT AATGAAGGACGTGATCTTGC TCGCGAAGGTAATGAAATTA THUIDcdna GGTAATGCACCTGGTGCAGT TGCTAACaGaGTTGCTTTAG AAGCTTGTGTGCAAGCTCGT AATGAAGGACGTGATCTTGC TCGCGAAGgTaATgaaatTA TRFR1082d 
GGAAATGCGCCCGGTGCCGT AGCTAATCGAGTCGCGTTAG AGGCTTGTGTACAAGCTCGT AATGAAGGCCGTGACCTTGC TCGTGAAGGTAATGAGATTA TRFRcdna9 GGAAATGCGCCCGGTGCCGT AGCTAATCGAGTCGCGTTAG AGGCTTGTGTACAAGCTCGT AATGAAGGCCGTGACCTTGC TCGTGAAGGTAATGAGATTA TRFRcdna9 GGAAATGCGCCCGGTGCCGT AGCTAATCGAGTCGCGTTAG AGGCTTGTGTACAAGCTCGT AATGAAGGCCGTGACCTTGC TCGTGAAGGTAATGAGATTA TRFRcdna9 GGAAATGCGCCCGGTGCCGT AGCTAATCGAGTCGCGTtAG AGGCTTGTGtACAAGCTCGt AATgAAGGCCGTgACCTTGC TCGTgAAGGtAATgAGATtA PAUSIUdna GGAAATGCGCCAGGTGCAGT AGCCAACCGAGTCGCATTGG AAGCTTGCGTACAGGCTCGT AATGAGGGTCGCGACCTCGC CCGTGAAGGTAATGAGATTA PAUSIUcdn GGAAATGCGCCAGGTGCAGT AGCCAACCGAGTCGCATTGG AAGCTTGCGTACAGGCTCGT AATGAGGGTCGCGACCTCGC CCGTGAAGGTAATGAGATTA SELAG1dna GGGAATGCGCCGGGCGCCGT GGCCAATCGAGTTGCTTTGG AAGCCTGTGTGCAAGCCCGC AACGAGGGTCGTGACCTCGC CACTGCGGGTAACGAGGTTA SELAG1cdn GGGAACGCGCCGGGCGCCGT GGCCAATCGAGTTGCTTTGG AAGCCTGTGTGCAAGCCCGC AACGAGGGTCGTGACCTCGC CACTGCGGGTAACGAGGTTA SELAG1cnd GGGAATgC-CCGGGCGCCG- GGCCAAT-GAGTT-CTT-GG AAGCC------SELAG3dna GGGAATGCGCCGGGCGCCGT GGCCAATCGAGTTGCTTTGG AAGCCTGTGTGCAAGCCCGC AACGAGGGTCGTGACCTCGC CACTGCGGGTAACGAGGTTA SELAG4dna GGGAATGCGCCGGGCGCCGT GGCCAATCGAGTTGCTTTGG AAGCCTGTGTGCAAGCCCGC AACGAGGGTCGTGaCCTCGC CACTGCGGGTAACGAGGTTA L4LDU05dn GGAAATGCACCTGGTGCAGT AGCTAACCGAGTTGCGTTAG AAGCTTGTGTACAAGCTCGT AATGAAGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA L4LDU05cD GGAAATGCACCTGGTGCAGT AGCTAATCGAGtTGCGTtAG AAGC------AformDNA GGAAATGCACCTGGTGCAGT AGCTAATCGAGTTGCTTTAG AAGCTTGTGTGCAAGCTCGT AATGAAGGACGTGATCTTGC TCGCGAGGGTAATGATATTA AformCDNA GGAAATGCACCTGGTGCAGT AGCTAATCGAGTTGCTTTAG AAGCTTGTGTGCAAGCTCGT AATGAAGGACGTGATCTTGC TCGCGAGGGTAATGATATTA NOTHCRdna GgaAATGCACCTGgtGCTGt AGCTAATCGAGTGGCCTtAG aAGCTTGTGTCCAAGCTCGt AATGAGGgACGTGATCTTGC TCGCCAAGGtAATGATATTA NOTHCRcdn GGAAATGCACCTGGTGCTGT AGCTAATCGAGTGGCCTTAG AAGCTTGTGTCCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGATATTA MSPCHdna GGAAATGCACCTGGTGCTGT AGCTAATCGAGTCGCCTTAG AAGCTTGTGTCCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA MSPCHcdna GGAAATGCACCTGGTGCTGT 
AGCTAATCGAGTCGCCTTAG AAGCTTGTGTCCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA MFUCHdnaR GcAAATGCACCTGGTGCTGt aGCtAATCgaGTCGCCTtag AaGCTTGTGTCCAAGCTCGT AATGAGGGACgtgATCTTGC TCGCCAAGGTAATGAAATTA MFUCHcdna GGAAATGCACCTGGTGCTGT AGCTAATCGAGTCGCCTTGG AAGCTTGTGTCCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA 69 MFLAGAY86 GGAAATGCACCTGGTGCTGT AGCTAATCGAGTTGCCTCAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA MFLAGcdna GGAAATGCACCTGGTGCTGT AGCTAATCGAATTGCCTCAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA MFLAGcdna GGAAATGCACCTGGTGCTGT AGCTAATCGAATTGCCTCAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA PFIMBAY86 GGAAATGCACCTGGTGCTGT AGCTAATCGAGTCGCCTTAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA PFIMBcdna GGAAATGCACCTGGTGCTGT AGCTAATCGAGTCGCCTTAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA PFIMBcdna GGAAATGCACCTGGTGCTGT AGCTAATCGAGTCGCCTTAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA PCADCCdna GGAAATGCACCTGGTGCAGt AGCTAATCGAGTCGCCTTAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA PCADCCcdn GGAAATGCACCTGGTGCAGT AGCTAATCGAGTCGCCTTAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA PSPCH2dna GAAAATGCACCTGGTGCTGT AGCTAATCGAGTCGCCTTAG AAGCTTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA PSPCH2cdn GGAAATGCACCTGGTGCAGT AGCTAATCGAGTTGCCTTAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA PSPCHMdna GGAAATGCACCTGGTGCTGT AGCTAATCGAGTCGCCTTAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA PSPCHMcdn GGAAATGCACCTGGTGCTGT AGCTAATCGAGTCGCCTTAG AAGCCTGTGTGCAAGCTCGT AATGATGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA PSPDO4dna GGAAATGCACCTGGTGCAGT AGCTAATCGAGTTGCCTTAG AGGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA PSPD04cdn GGAAATGCACCTGGTGCTGT AGCTAATCGAGTCGCCTTAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCCAAGGTAATGAAATTA PPECAdnaJ GgAAATGCACCTGGTGCAGt AGCTAATCgAGTCGCCTTAg AAGCCTGTGTGCAAGCTCGT 
AATgAGGgACGTGATCTTGC TCGCGAAGGTAATgAAATTA PPECAcdna GGAAATGCACCTGGTGCAGT AGCTAATCGAGTCGCCTTAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCGAAGGTAATGAAATTA PPECAcdna GGAAATGCACCTGGTGCAGT AGCTAATCGAGTCGCCTTAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCGAAGGTAATGAAATTA PPECAcdna GGAAATGCACCTGGTGCAGT AGCTAATCGAGTCGCCTTAG AAGCCTGTGTGCAAGCTCGT AATGAGGGACGTGATCTTGC TCGCGAAGGTAATGAAATTA TAKAKIADN ------TAKAKpred ------SPAGdna10 GGGAACGCACCTGGTGCAGT TGCTAATCGAGTTGCTCTGG AAGCCTGTGTACAAGCTCGT AATGAGGGACGTGATCTTGC TCGTCAAGGTAATGATATTA SPAGcdna1 GGGAACGCACCTGGTGCAGT TGCTAATCGAGTTGCTCTGG AAGCCTGTGTACAAGCTCGT AATGAGGGACGTgATCTTGC TCGTCAAGGtAATGATATtA SPAGcdna1 GGGAACGCACCTGGTGCAGT TGCTAATCGAGTTGCTCTGG AAGCCTGTGTACAAGCTCGT AATGAGGGACGTgATCTTGC TCGTCAAGGtAATGATATtA SPAGcdna1 GGGAACGCACCTGGTGCAGT TGCTAATCGAGTTGCTCTGG AAGCCTGTGTACAAGCTCGT AATGAGGGACGTGATCTTGC TCGTCAAGGTAATGATATTA SPAGcdna1 GGGAACGCACCTGGTGCAGT TGCTAATCGAGTTGCTCTGG AAGCCTGTGTACAAGCTCGT AATGAGGGACGTGATCTTGC TCGTCAAGGTAATGATATTA SPAGcdna1 GGGAACGCACCTGGTGCAGT TGCTAATCGAGTTGCTCTGG AAGCCTGTGTACAAGCTCGT AATgAGGGACGTGATCTTGC TCGTCAAGGtAATGATATtA HAPLOdna8 GGGAACGCACCCGGTGCTGT CGCCAACCGAGTAGCTCTGG AAGCTTGTGTACAGGCTCGT AACGAGGGACGTGATCTCGC TCGTGAAGGTAATGAAGTGA HAPLOCDNA GGGAACGCACCCGGTGCTGT CGCCAACCGAGTAGCTCTGG AAGCTTGTGTACAGGCTCGT AACGAGGGACGTGATCTCGC TCGTGAAGGTAATGAAGTGA Marchanti GGTAACGCACCTGGTGCAGT TGCTAACCGAGTTTCGTTAG AAGCTTGCGTACAAGCACGT AATGAAGGTCGTGATCTTGC TCGTGAAGGAAATGAAATTA Charavulg GGAAATGCTCCAGGAGCTGT AGCTAACCGAGTAGCTTTGG AAGCTTGTGTACAAGCACGC AATGAGGGAAGAGATTTAGC TCGTGAAGGTAATGAAATTA

                1310                 1330
GNETdna11 TCCGcGaAGCTAGTAAATGG AGTCCTGAACTAGCT
GNETcdna1 TCCGCgAAGCTAGTAAATGG AGTCCTGAACTAGCT
GNETcdna1 TCCGCgAAGCtAGtAAATGg AGTCCTgAACtAGCT
LYCOdna10 TTCGTGAAGCTTGTAAGTGG AGTGCTGAATTAGCT
LYCOcdna1 TTCGTGAAGCTTGTAAGTGG AGTGCTGAATTAGCT
LYCOcdna1 TTCGTGAAGCTTGTAAGTGG AGTTCTGAATTAGCT
PSILOdna1 TTCGTGAAGCTAGTAAGTGG AGTCCCGAATTGGTT
PSILOcdna TTCGTGAAGCTAGTAAGTGG AGTCCCGAATTGGCT
PSILOcdna TTCGTGAAGCTAGTAGGTGG AGTCCCGAATTGGCT
PSILOcdna TTCGTGAAGCTAGTAAGTGG AGTCCCGAATTGGCT
BOTRYdna  TTCGTGAAGCTGCTAAGTGG AGTCCCGACTTAGCC
BOTRYcdna TTCGTgAAGCTGCtAAGTGg AGTCCCgACTtAGCC
THUIDdna  TTCGTGAAGCTGCTAAGTGG AGTCCTGAATTAGCT
THUIDcdna TTCGTGAAGCTGCTAAGTGG AGTCCTGAATTAGC-
THUIDcdna TTCGTGAAGCTGCTAAGTGG AGTCC------
TRFR1082d TTCGTGAAGCTAGTCAGTGG AGTCCTGAATTGGCT
TRFRcdna9 TTCGTGAAGCTAGTCAGTGG AGTCCTGAATTgGCT
TRFRcdna9 TTCGTGAAGCTAGTCAGTGG AGTCCTGAATTGGCT
TRFRcdna9 TTCGTgAAGCtAGTCAGTGg AGTCCTGtATTGGCT
PAUSIUdna TCCGTGAAGCTTGTAAGTGG AGTCCGGAATTGGCT
PAUSIUcdn TTCGTGAAGCTTGTAAGTGG AGTCCGGAATTGGCT
SELAG1dna TTcgtgaAGCTTGTAAGTGG AGTCCCGAGCTAGCT
SELAG1cdn TTCGTGAAGCTTGTAAGTGG AGTCCCGAGCTAGCT
SELAG1cnd ------
SELAG3dna TTCGTGAAGCTTGTAAGTGG AGTCCCGAGCTAGCT
SELAG4dna TTCgtGAAGCTTGTAAGTGG AGTCCCGAGCTAGCT
L4LDU05dn TTCGTGAAGCTAGTAAATGG AGTCCTGAATTAGCA
L4LDU05cD ------
AformDNA  TCCGTGAAGCTAGTAAATGG AGTCCTGAATTAGCA
AformCDNA TCCGTGAAGCTAGTAAATGG AGTCCTGAATTAGCA
NOTHCRdna TTCGAGAGGCTTCTAAgTGg AGTCCAGAATTAGCA
NOTHCRcdn TTCGAGAGGCTTCtAAGtGG agtCCAGAATTAGCA
MSPCHdna  TTCGGGAGGCTTGTAAGTGG AGTCCAGAATTAGCA
MSPCHcdna TTCGGGAGGCTTGTAAGTGG AGTCCAGAATTAGCA
MFUCHdnaR TTCGAGAGGCTTGTAAG??? ???????????????
MFUCHcdna TTCGGGAGGCTTGTAAGTGG AGTCCAGAATTAGCA
MFLAGAY86 TTCGAGAGGCTTGTAAATGG AGTCCAGAATTAGCA
MFLAGcdna TTCGAGAGGCTTGTAAATGG AGTCCAGAATTAGCA
MFLAGcdna TTCGAGAGGCTTGTAAATGG AGTCCAGAATTAGCA
PFIMBAY86 TTCGGGAGGCTTCTAAGTGG AGTCCAG------
PFIMBcdna TTCGGGAGGCTTCTAAGTGG AGTCCAGAATTAGCA
PFIMBcdna TTCGGGAGGCTTCTAAGTGG AGTCCAGAATTAGCA
PCADCCdna TTCGCGAAGCTTGTAAGTGG AGTCCAGAATTAGCA
PCADCCcdn TTCGCGAAGCTTGTAAGTGG AGTCCAGAATTAGCA
PSPCH2dna TTCGGGAGGCTTCTAAGTGG AGTCCAGAATTAGCA
PSPCH2cdn TTCGCGAAGCTTGTAAGTGG AGTCCAGAATTAGCA
PSPCHMdna TTCGGGAGGCTTCTAAGTGG AGTCCAGAATTAGCA
PSPCHMcdn TTCGGGAGGCTTCTAAGTGG AGTCCAGAATTAGCA
PSPDO4dna TTCGCGAAGCTTGTAAGTGG AGTCCAGAATTAGCA
PSPD04cdn TTCGGGAGGCTTCTAAGTGG AGTCCAGAATTAGCA
PPECAdnaJ TTCGCgAAGCTTGtAAgTGg aGTCCAGaATTAG--
PPECAcdna TTCGCGAAGCTTGTAAGTGG AGTCCGGAATTAGCA
PPECAcdna TTCGCGAAGCTTGTAAGTGG AGTCCGGAATTAGCA
PPECAcdna TTCGCGAAGCTTGTAAGTGG AGTCCGGAATTAGCA
TAKAKIADN ------
TAKAKpred ------
SPAGdna10 TTCGCGAAGCTGCCAAGTGG AGTCCCGAACTAGCC
SPAGcdna1 TTCGCGAaGCTGCCAAGTGg AGTCCCGAaCtAGCC
SPAGcdna1 TTCGCGAaGCTGCCAAGTGg AGTCCCGaACtAGCC
SPAGcdna1 TTCGCGAAGCTGCCAAGTGG AGTCCCGAACTAGCC
SPAGcdna1 TTCGCGAAGCTGCCAAGTGG AGTCCCGAACTAGCC
SPAGcdna1 TTCGCGAaGCTGCCAAGTGg AGTCCCGAaCtAGCC
HAPLOdna8 TTCGCGAAGCTTGTAAATGG AGTCCTGAACTGGCT
HAPLOCDNA TTCGCGAAGCTTGTAAATGG AGTCCTGAACTGGCT
Marchanti TTCGCGAAGCTTGTAAGTGG AGTCCTGAGTTATCT
Charavulg TTCGAGAGGCTGCTAAGTGG AGTCCTGAATTAGCT
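The abstract states that editing sites were identified by direct comparison of cDNA and DNA sequences. A minimal sketch of that comparison for one DNA/cDNA pair from the alignment above is given here. The function name `candidate_edits` and the block start coordinate of 1301 are assumptions made for illustration (the start is inferred from the 1310 and 1330 ruler labels); this is not the procedure used to generate the data. A mismatch flagged this way may also reflect a sequencing error or a polymorphism, so each hit is only a candidate site.

```python
def candidate_edits(dna, cdna, offset=1):
    """List C->T and T->C mismatches between a DNA row and its cDNA row.

    cDNA is written with T where the transcript has U, so a C->T mismatch
    corresponds to a candidate C-to-U edit and T->C to a candidate U-to-C edit.
    Gaps (-), unknowns (?, N) and other mismatches are ignored.
    Lowercase base calls are compared case-insensitively.
    """
    sites = []
    for i, (d, c) in enumerate(zip(dna.upper(), cdna.upper())):
        if (d, c) in {("C", "T"), ("T", "C")}:
            sites.append((offset + i, d, c))
    return sites


# PAUSIUdna and PAUSIUcdn rows from the final alignment block, spaces removed.
pausiu_dna = "TCCGTGAAGCTTGTAAGTGGAGTCCGGAATTGGCT"
pausiu_cdna = "TTCGTGAAGCTTGTAAGTGGAGTCCGGAATTGGCT"

# offset=1301 is an assumed coordinate for the first column of this block.
print(candidate_edits(pausiu_dna, pausiu_cdna, offset=1301))
# -> [(1302, 'C', 'T')]
```

Restricting the comparison to C/T pairs keeps other substitutions and low-quality positions out of the candidate list; they would need to be inspected separately.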

Base Position 5 25 37 40 47 74 77
Codon Position 2 1 1 1 2 2 2
Type C C C C T C c
Observed/Real O O O O O O P
AA Position 2 9 13 14 16 25 26
AA Change S-F H-Y,N-Y H-Y H-Y L-P T-I S-L
Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107
Liverworts
Treubiala - AY057428
Haplomitrium - AY507402
Haplomitrium - U87072
Haplomitrium 804A 11
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465
Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565 1
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059

Base Position 5 25 37 40 47 74 77
Codon Position 2 1 1 1 2 2 2
Type C C C C T C c
Observed/Real O O O O O O P
AA Position 2 9 13 14 16 25 26
AA Change S-F H-Y,N-Y H-Y H-Y L-P T-I S-L
Phaeoceros sp. - PAF2445 11 111
Phaeoceros carolinianus - AY463057 11 111
Nothoceros breutellii - AY463054 11 111
Notothylas orbicularis - AY463055 1 111
Notothylas breutellii - AY463054 1
Dendroceros granulosus - AY463049 1
Phymatoceros bulbiculosus - AY860201 11 111
Phymatoceros phymatodes - DQ845660 11 111
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199 11 11
Megaceros fuegensis - DQ845643 11 11
Dendroceros crispatus - AY63048 11
Dendroceros crispatus - DCRJC742 1
Nothoceros sp - AY463048 11 1
Megaceros aenigmaticus - L13481 11 11
Megaceros vincentianus - DQ845647 11 11
Megaceros vincentianus - MEGVINC55 11 11
Megaceros canaliculatus - AY463047 11 11
Megaceros vincentianus - AY463045 11 11
Phaeoceros sp. - DQ845666 11 111
Phaeomegaceros hirticalyx - AY463043 11 11
Phaeomegaceros coriaceus - AY463042 11 11
Megaceros pellucidus - AY463041 111
Megaceros flagellaris - AY860198 11
Megaceros gracilis - AY463037 11
Megaceros denticulatus - AY463038 11
Megaceros flagellaris - AY463040 11
Megaceros flagellaris - AY463036 11
Megaceros gracilis - AY463039 11
Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261 1
Phylloglossum - Y07939
Lycopodium 1025 1
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253 A
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A
Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Base Position 5 25 37 40 47 74 77
Codon Position 2 1 1 1 2 2 2
Type C C C C T C c
Observed/Real O O O O O O P
AA Position 2 9 13 14 16 25 26
AA Change S-F H-Y,N-Y H-Y H-Y L-P T-I S-L
Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604 1
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612
Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position 86 88* 97 100* 104 115 175
Codon Position 2 1 1 1 2 1 1
Type C T T T T t C
Observed/Real O O P O O P P
AA Position 29 30 33 34 35 39 59
AA Change S-F *-R S-P *-Q L-P S-P R-W
Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107
Liverworts
Treubiala - AY057428
Haplomitrium - AY507402
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465
Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565 1
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059 1
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082 g
Tetraphis - TPU87091
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243 1
Rhizomnium - AF478237 g
Plagiobryum - AY163045 g
Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543 11 1
Anthoceros alaminiferus - AY463053 11 1
Anthoceros punctatus - U87063 11 1
Anthoceros punctatus - UAB013669 11 1
Folioceros fuciformis - AY463050 11 1
Folioceros fuciformis - AY463501 11 1
Phaeoceros laevis - PLAEU 11 g
Phaeoceros pearsonii - AY860203 11 g

Base Position 86 88* 97 100* 104 115 175
Codon Position 2 1 1 1 2 1 1
Type C T T T T t C
Observed/Real O O P O O P P
AA Position 29 30 33 34 35 39 59
AA Change S-F *-R S-P *-Q L-P S-P R-W
Phaeoceros sp. - PAF2445 11 g
Phaeoceros carolinianus - AY463057 11 g
Nothoceros breutellii - AY463054 111
Notothylas orbicularis - AY463055 111
Notothylas breutellii - AY463054 111
Dendroceros granulosus - AY463049 11
Phymatoceros bulbiculosus - AY860201 11 1
Phymatoceros phymatodes - DQ845660 1
Phaeoceros carolinianus - AY860202 g
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199 g
Megaceros fuegensis - DQ845643 11 g
Dendroceros crispatus - AY63048 11
Dendroceros crispatus - DCRJC742 11
Nothoceros sp - AY463048 11
Megaceros aenigmaticus - L13481 11 g
Megaceros vincentianus - DQ845647 11 g
Megaceros vincentianus - MEGVINC55 11
Megaceros canaliculatus - AY463047 11 g
Megaceros vincentianus - AY463045 11 g
Phaeoceros sp. - DQ845666 11 g
Phaeomegaceros hirticalyx - AY463043 11
Phaeomegaceros coriaceus - AY463042 11
Megaceros pellucidus - AY463041 11 g
Megaceros flagellaris - AY860198 11 g
Megaceros gracilis - AY463037 11 g
Megaceros denticulatus - AY463038 11
Megaceros flagellaris - AY463040 11 g
Megaceros flagellaris - AY463036 11 g
Megaceros gracilis - AY463039 11 g
Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939
Lycopodium 1025
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254 g
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A
Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A g
Botrychium - L13474
Ophioglossum - AB024960 g
Equisetum - AF313580

Base Position 86 88* 97 100* 104 115 175
Codon Position 2 1 1 1 2 1 1
Type C T T T T t C
Observed/Real O O P O O P P
AA Position 29 30 33 34 35 39 59
AA Change S-F *-R S-P *-Q L-P S-P R-W
Onoclea - U62034 g
Grammitis - AY460647 g
PAUSIU798A g
Woodsia - AB021726 g
Polypodium - AY362611 g
Polypodium 798A
Bommeria - U19497 g
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636 g
Belvisia - AY362562 1 g
Thelypteris - D43919 g
Vittaria - U20937 g
Lygodium - AJ303419
Asplenium - AY641804 g
Woodwardia - AB040604 g
Dicksonia - AM177345 g
Marattia - AY138399
Dryopteris - AY268887 g
Dicranopteris - U18626
Adiantum - AY178864 g
Polypodium - AY362612 g
Seed Plants
Pinus pumila - AB161042 g
Gnetum 1162A
Zamia 1109A g
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position 197 214* 220 253* 269 272 274
Codon Position 2 1 1 1 2 2 1
Type c T C T C T C
Observed/Real O O P P O P P
AA Position 66 72 74 85 90 91 92
AA Change P-L *-R H-Y *-Q A-V V-A H-Y
Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477 A
Coleochaete - AY051142 A
Spirogyra - L11057 A
Nitella - AF097745
Tolypella - U27531
Chara - L13476 A
Chara - AY17043 A
Chara - DQ229107 A
Liverworts
Treubiala - AY057428
Haplomitrium - AY507402
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465
Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565 A
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243
Rhizomnium - AF478237
Plagiobryum - AY163045
Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543 1
Anthoceros alaminiferus - AY463053 11
Anthoceros punctatus - U87063 11
Anthoceros punctatus - UAB013669 11
Folioceros fuciformis - AY463050 11
Folioceros fuciformis - AY463501 11
Phaeoceros laevis - PLAEU 1
Phaeoceros pearsonii - AY860203 1

Base Position 197 214* 220 253* 269 272 274
Codon Position 2 1 1 1 2 2 1
Type c T C T C T C
Observed/Real O O P P O P P
AA Position 66 72 74 85 90 91 92
AA Change P-L *-R H-Y *-Q A-V V-A H-Y
Phaeoceros sp. - PAF2445 1
Phaeoceros carolinianus - AY463057 1
Nothoceros breutellii - AY463054 11
Notothylas orbicularis - AY463055 11
Notothylas breutellii - AY463054 11
Dendroceros granulosus - AY463049 11
Phymatoceros bulbiculosus - AY860201 11 1
Phymatoceros phymatodes - DQ845660 11 1
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199 1
Megaceros fuegensis - DQ845643 11
Dendroceros crispatus - AY63048 11
Dendroceros crispatus - DCRJC742 11
Nothoceros sp - AY463048 11
Megaceros aenigmaticus - L13481 11
Megaceros vincentianus - DQ845647 1
Megaceros vincentianus - MEGVINC55 11
Megaceros canaliculatus - AY463047 1
Megaceros vincentianus - AY463045 11
Phaeoceros sp. - DQ845666 1
Phaeomegaceros hirticalyx - AY463043 11
Phaeomegaceros coriaceus - AY463042 11
Megaceros pellucidus - AY463041 11
Megaceros flagellaris - AY860198 11
Megaceros gracilis - AY463037 11
Megaceros denticulatus - AY463038 11
Megaceros flagellaris - AY463040 11
Megaceros flagellaris - AY463036 11
Megaceros gracilis - AY463039 11
Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939
Lycopodium 1025
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A
Ferns
Psilotum - AP004638 A
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Base Position 197 214* 220 253* 269 272 274
Codon Position 2 1 1 1 2 2 1
Type c T C T C T C
Observed/Real O O P P O P P
AA Position 66 72 74 85 90 91 92
AA Change P-L *-R H-Y *-Q A-V V-A H-Y
Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626 1
Adiantum - AY178864
Polypodium - AY362612
Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position 277 278 281 289 290 308 314
Codon Position 1 2 2 1 2 2 2
Type T T C C C T c
Observed/Real O O O P O P P
AA Position 93 93 94 97 97 103 105
AA Change S-P L-P, F-P S-L, P-L P-F S-F I-T T-M
Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107
Liverworts
Treubiala - AY057428
Haplomitrium - AY507402
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465
Mosses
Sphagnum - AF231887 1
Sphagnum 1033A 1
Takakia - AF244565 1
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243
Rhizomnium - AF478237
Plagiobryum - AY163045
Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543 1
Anthoceros alaminiferus - AY463053 1
Anthoceros punctatus - U87063 1
Anthoceros punctatus - UAB013669 1
Folioceros fuciformis - AY463050 1
Folioceros fuciformis - AY463501
Phaeoceros laevis - PLAEU 1
Phaeoceros pearsonii - AY860203 11

Base Position 277 278 281 289 290 308 314
Codon Position 1 2 2 1 2 2 2
Type T T C C C T c
Observed/Real O O O P O P P
AA Position 93 93 94 97 97 103 105
AA Change S-P L-P, F-P S-L, P-L P-F S-F I-T T-M
Phaeoceros sp. - PAF2445 1
Phaeoceros carolinianus - AY463057 1
Nothoceros breutellii - AY463054 11
Notothylas orbicularis - AY463055 11
Notothylas breutellii - AY463054 111
Dendroceros granulosus - AY463049 1
Phymatoceros bulbiculosus - AY860201 111
Phymatoceros phymatodes - DQ845660 111
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199 1
Megaceros fuegensis - DQ845643 11
Dendroceros crispatus - AY63048 11
Dendroceros crispatus - DCRJC742 1
Nothoceros sp - AY463048 1
Megaceros aenigmaticus - L13481 11
Megaceros vincentianus - DQ845647 11
Megaceros vincentianus - MEGVINC55 11
Megaceros canaliculatus - AY463047 11
Megaceros vincentianus - AY463045 11
Phaeoceros sp. - DQ845666 11
Phaeomegaceros hirticalyx - AY463043 1
Phaeomegaceros coriaceus - AY463042 11
Megaceros pellucidus - AY463041 11
Megaceros flagellaris - AY860198 1
Megaceros gracilis - AY463037 1
Megaceros denticulatus - AY463038 1
Megaceros flagellaris - AY463040 1
Megaceros flagellaris - AY463036 1
Megaceros gracilis - AY463039 1
Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939
Lycopodium 1025
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253 1
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A
Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Base Position 277 278 281 289 290 308 314
Codon Position 1 2 2 1 2 2 2
Type T T C C C T c
Observed/Real O O O P O P P
AA Position 93 93 94 97 97 103 105
AA Change S-P L-P, F-P S-L, P-L P-F S-F I-T T-M
Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612
Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position 317 329 341 347 356 358* 365
Codon Position 2 2 2 2 2 1 2
Type C C C C C T C
Observed/Real O O O O O P O
AA Position 106 110 114 116 119 120 122
AA Change S-F A-V S-F S-F S-L *-R P-L, S-L
Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107
Liverworts
Treubiala - AY057428
Haplomitrium - AY507402 1
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465
Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565 A1
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243
Rhizomnium - AF478237
Plagiobryum - AY163045
Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543 1
Anthoceros alaminiferus - AY463053 1
Anthoceros punctatus - U87063 1
Anthoceros punctatus - UAB013669 1
Folioceros fuciformis - AY463050 1
Folioceros fuciformis - AY463501 1
Phaeoceros laevis - PLAEU 11 1 1
Phaeoceros pearsonii - AY860203 11 1 1

Base Position 317 329 341 347 356 358* 365
Codon Position 2 2 2 2 2 1 2
Type C C C C C T C
Observed/Real O O O O O P O
AA Position 106 110 114 116 119 120 122
AA Change S-F A-V S-F S-F S-L *-R P-L, S-L
Phaeoceros sp. - PAF2445 11 1 1
Phaeoceros carolinianus - AY463057 11 1 1
Nothoceros breutellii - AY463054 11 111
Notothylas orbicularis - AY463055 11 111
Notothylas breutellii - AY463054 11 111
Dendroceros granulosus - AY463049 11 1
Phymatoceros bulbiculosus - AY860201 11
Phymatoceros phymatodes - DQ845660 11
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199 111
Megaceros fuegensis - DQ845643 111 1
Dendroceros crispatus - AY63048 11 1
Dendroceros crispatus - DCRJC742 11
Nothoceros sp - AY463048 111 1
Megaceros aenigmaticus - L13481 111 1
Megaceros vincentianus - DQ845647 111 1
Megaceros vincentianus - MEGVINC55 111 1
Megaceros canaliculatus - AY463047 111 1
Megaceros vincentianus - AY463045 111 1
Phaeoceros sp. - DQ845666 11 1 1
Phaeomegaceros hirticalyx - AY463043 111 1
Phaeomegaceros coriaceus - AY463042 111 1
Megaceros pellucidus - AY463041 11
Megaceros flagellaris - AY860198 11
Megaceros gracilis - AY463037 11
Megaceros denticulatus - AY463038 11
Megaceros flagellaris - AY463040 11
Megaceros flagellaris - AY463036 11
Megaceros gracilis - AY463039 11
Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939
Lycopodium 1025
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A
Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Base Position 317 329 341 347 356 358* 365
Codon Position 2 2 2 2 2 1 2
Type C C C C C T C
Observed/Real O O O O O P O
AA Position 106 110 114 116 119 120 122
AA Change S-F A-V S-F S-F S-L *-R P-L, S-L
Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612
Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A 1
Arabidopsis thaliana - AP000423

Base Position 367 371 380 382* 392 401 407
Codon Position 1 2 2 1 2 2 2
Type T C C T T T T
Observed/Real P P O P P P P
AA Position 123 124 127 128 131 134 136
AA Change C-R P-L S-L *-R L-P V-S, L-S, I-S I-T
Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107
Liverworts
Treubiala - AY057428
Haplomitrium - AY507402
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074 1
Metzgeria - U87081 11
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465
Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565 11
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243
Rhizomnium - AF478237
Plagiobryum - AY163045
Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543 11
Anthoceros alaminiferus - AY463053 111
Anthoceros punctatus - U87063 111
Anthoceros punctatus - UAB013669 11
Folioceros fuciformis - AY463050 1
Folioceros fuciformis - AY463501 1
Phaeoceros laevis - PLAEU 11
Phaeoceros pearsonii - AY860203

Base Position 367 371 380 382* 392 401 407
Codon Position 1 2 2 1 2 2 2
Type T C C T T T T
Observed/Real P P O P P P P
AA Position 123 124 127 128 131 134 136
AA Change C-R P-L S-L *-R L-P V-S, L-S, I-S I-T
Phaeoceros sp. - PAF2445 11
Phaeoceros carolinianus - AY463057 11
Nothoceros breutellii - AY463054 111
Notothylas orbicularis - AY463055 111
Notothylas breutellii - AY463054 111
Dendroceros granulosus - AY463049 1
Phymatoceros bulbiculosus - AY860201 1
Phymatoceros phymatodes - DQ845660 1
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199 1
Megaceros fuegensis - DQ845643 1
Dendroceros crispatus - AY63048 1
Dendroceros crispatus - DCRJC742 1
Nothoceros sp - AY463048 1
Megaceros aenigmaticus - L13481 1
Megaceros vincentianus - DQ845647 1
Megaceros vincentianus - MEGVINC55 1
Megaceros canaliculatus - AY463047 1
Megaceros vincentianus - AY463045 1
Phaeoceros sp. - DQ845666 11
Phaeomegaceros hirticalyx - AY463043 1
Phaeomegaceros coriaceus - AY463042 1
Megaceros pellucidus - AY463041 1
Megaceros flagellaris - AY860198 1
Megaceros gracilis - AY463037 1
Megaceros denticulatus - AY463038 1
Megaceros flagellaris - AY463040 1
Megaceros flagellaris - AY463036 1
Megaceros gracilis - AY463039 1
Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939
Lycopodium 1025
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A
Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A 1
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Base Position 367 371 380 382* 392 401 407
Codon Position 1 2 2 1 2 2 2
Type T C C T T T T
Observed/Real P P O P P P P
AA Position 123 124 127 128 131 134 136
AA Change C-R P-L S-L *-R L-P V-S, L-S, I-S I-T
Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562 G
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612
Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position 410 433* 452 460 471 481 485
Codon Position 2 1 2 1 1 1 2
Type C T C C C C T
Observed/Real O O O P P O P
AA Position 137 145 151 154 161 161 162
AA Change S-F *-Q S-L H-Y R-C R-C I-T
Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107
Liverworts
Treubiala - AY057428
Haplomitrium - AY507402
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465
Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565 1
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243
Rhizomnium - AF478237
Plagiobryum - AY163045
Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543 1
Anthoceros alaminiferus - AY463053 1
Anthoceros punctatus - U87063 1
Anthoceros punctatus - UAB013669 1
Folioceros fuciformis - AY463050 1
Folioceros fuciformis - AY463501 1
Phaeoceros laevis - PLAEU
Phaeoceros pearsonii - AY860203 11

Base Position 410 433* 452 460 471 481 485
Codon Position 2 1 2 1 1 1 2
Type C T C C C C T
Observed/Real O O O P P O P
AA Position 137 145 151 154 161 161 162
AA Change S-F *-Q S-L H-Y R-C R-C I-T
Phaeoceros sp. - PAF2445
Phaeoceros carolinianus - AY463057 1
Nothoceros breutellii - AY463054 1
Notothylas orbicularis - AY463055 1
Notothylas breutellii - AY463054 1
Dendroceros granulosus - AY463049 11
Phymatoceros bulbiculosus - AY860201 1
Phymatoceros phymatodes - DQ845660 1
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199 1
Megaceros fuegensis - DQ845643 1
Dendroceros crispatus - AY63048
Dendroceros crispatus - DCRJC742
Nothoceros sp - AY463048 1
Megaceros aenigmaticus - L13481 1
Megaceros vincentianus - DQ845647 1
Megaceros vincentianus - MEGVINC55 1
Megaceros canaliculatus - AY463047
Megaceros vincentianus - AY463045 1
Phaeoceros sp. - DQ845666
Phaeomegaceros hirticalyx - AY463043 1
Phaeomegaceros coriaceus - AY463042 1
Megaceros pellucidus - AY463041 1
Megaceros flagellaris - AY860198 1
Megaceros gracilis - AY463037 1
Megaceros denticulatus - AY463038 1
Megaceros flagellaris - AY463040 1
Megaceros flagellaris - AY463036 1
Megaceros gracilis - AY463039 1
Lycophytes
Huperzia - AJ133897 1
Huperzia - Y07934 1
Lycopodiella - AJ133261
Phylloglossum - Y07939 1
Lycopodium 1025
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253 1
Selaginella 1160A
Selaginella 1161A 1
Selaginella 1021A
Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Base Position 410 433* 452 460 471 481 485
Codon Position 2 1 2 1 1 1 2
Type C T C C C C T
Observed/Real O O O P P O P
AA Position 137 145 151 154 161 161 162
AA Change S-F *-Q S-L H-Y R-C R-C I-T
Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612
Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position 488 506 520 533 545 557 589
Codon Position 2 2 1 2 2 2 1
Type C C C C C C C
Observed/Real P P P O O O P
AA Position 163 169 174 178 182 186 197
AA Change T-I S-L H-Y A-V P-L P-L P-S
Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107
Liverworts
Treubiala - AY057428
Haplomitrium - AY507402 1
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465
Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565 11 1
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243
Rhizomnium - AF478237
Plagiobryum - AY163045
Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543
Anthoceros alaminiferus - AY463053
Anthoceros punctatus - U87063
Anthoceros punctatus - UAB013669
Folioceros fuciformis - AY463050
Folioceros fuciformis - AY463501
Phaeoceros laevis - PLAEU 1
Phaeoceros pearsonii - AY860203 11

Base Position 488 506 520 533 545 557 589
Codon Position 2 2 1 2 2 2 1
Type C C C C C C C
Observed/Real P P P O O O P
AA Position 163 169 174 178 182 186 197
AA Change T-I S-L H-Y A-V P-L P-L P-S
Phaeoceros sp. - PAF2445 1
Phaeoceros carolinianus - AY463057 1
Nothoceros breutellii - AY463054
Notothylas orbicularis - AY463055
Notothylas breutellii - AY463054
Dendroceros granulosus - AY463049 1
Phymatoceros bulbiculosus - AY860201 1
Phymatoceros phymatodes - DQ845660 1
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199 11
Megaceros fuegensis - DQ845643 11
Dendroceros crispatus - AY63048 1
Dendroceros crispatus - DCRJC742 1
Nothoceros sp - AY463048 1
Megaceros aenigmaticus - L13481 11
Megaceros vincentianus - DQ845647 11
Megaceros vincentianus - MEGVINC55 1
Megaceros canaliculatus - AY463047 11
Megaceros vincentianus - AY463045 1
Phaeoceros sp. - DQ845666 1
Phaeomegaceros hirticalyx - AY463043 1
Phaeomegaceros coriaceus - AY463042 1
Megaceros pellucidus - AY463041 1
Megaceros flagellaris - AY860198 1
Megaceros gracilis - AY463037 1
Megaceros denticulatus - AY463038 1
Megaceros flagellaris - AY463040 1
Megaceros flagellaris - AY463036 1
Megaceros gracilis - AY463039 1
Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939
Lycopodium 1025
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A
Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Base Position 488 506 520 533 545 557 589
Codon Position 2 2 1 2 2 2 1
Type C C C C C C C
Observed/Real P P P O O O P
AA Position 163 169 174 178 182 186 197
AA Change T-I S-L H-Y A-V P-L P-L P-S
Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612
Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position 623 644 649 679 682 686 692
Codon Position 2 2 1 1 1 2 2
Type c C C T C C T
Observed/Real P O O P P P P
AA Position 208 215 217 227 228 229 231
AA Change S-L S-F/Y S-P Y-H H-Y S-L V-A
Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107
Liverworts
Treubiala - AY057428
Haplomitrium - AY507402
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465
Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565 1
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243
Rhizomnium - AF478237
Plagiobryum - AY163045
Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543 1
Anthoceros alaminiferus - AY463053 1
Anthoceros punctatus - U87063 1
Anthoceros punctatus - UAB013669 1
Folioceros fuciformis - AY463050 1
Folioceros fuciformis - AY463501 1
Phaeoceros laevis - PLAEU 1
Phaeoceros pearsonii - AY860203 1

Base Position 623 644 649 679 682 686 692
Codon Position 2 2 1 1 1 2 2
Type c C C T C C T
Observed/Real P O O P P P P
AA Position 208 215 217 227 228 229 231
AA Change S-L S-F/Y S-P Y-H H-Y S-L V-A
Phaeoceros sp. - PAF2445 1
Phaeoceros carolinianus - AY463057 1
Nothoceros breutellii - AY463054
Notothylas orbicularis - AY463055
Notothylas breutellii - AY463054
Dendroceros granulosus - AY463049
Phymatoceros bulbiculosus - AY860201 11
Phymatoceros phymatodes - DQ845660 111
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199
Megaceros fuegensis - DQ845643
Dendroceros crispatus - AY63048
Dendroceros crispatus - DCRJC742
Nothoceros sp - AY463048
Megaceros aenigmaticus - L13481
Megaceros vincentianus - DQ845647
Megaceros vincentianus - MEGVINC55
Megaceros canaliculatus - AY463047
Megaceros vincentianus - AY463045
Phaeoceros sp. - DQ845666 1
Phaeomegaceros hirticalyx - AY463043
Phaeomegaceros coriaceus - AY463042
Megaceros pellucidus - AY463041
Megaceros flagellaris - AY860198
Megaceros gracilis - AY463037
Megaceros denticulatus - AY463038
Megaceros flagellaris - AY463040
Megaceros flagellaris - AY463036
Megaceros gracilis - AY463039
Lycophytes
Huperzia - AJ133897 1
Huperzia - Y07934 1
Lycopodiella - AJ133261 1
Phylloglossum - Y07939 1
Lycopodium 1025
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A
Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Base Position 623 644 649 679 682 686 692
Codon Position 2 2 1 1 1 2 2
Type c C C T C C T
Observed/Real P O O P P P P
AA Position 208 215 217 227 228 229 231
AA Change S-L S-F/Y S-P Y-H H-Y S-L V-A
Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612
Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position     730*         758        764    776    788    791    794
Codon Position    1            2          2      2      2      2      2
Type              T            C          c      C      C      T      T
Observed/Real     O            P          P      P      O      O      P
AA Position       244          253        255    259    263    264    265
AA Change         OR *-Q/H/E   S-I, T-I   T-M    S-L    S-F    I-T    A-V

Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107

Liverworts
Treubiala - AY057428
Haplomitrium - AY507402
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465

Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565    11
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082    1
Tetraphis - TPU87091    1
Funaria - AF226818
Buxbaumia - AF478211    1
Rhodobryum - AF478243
Rhizomnium - AF478237    1
Plagiobryum - AY163045

Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543    1
Anthoceros alaminiferus - AY463053    11
Anthoceros punctatus - U87063    11
Anthoceros punctatus - UAB013669    111
Folioceros fuciformis - AY463050    1
Folioceros fuciformis - AY463501    1
Phaeoceros laevis - PLAEU    1
Phaeoceros pearsonii - AY860203

Phaeoceros sp. - PAF2445    1
Phaeoceros carolinianus - AY463057    1
Nothoceros breutellii - AY463054
Notothylas orbicularis - AY463055
Notothylas breutellii - AY463054
Dendroceros granulosus - AY463049    1
Phymatoceros bulbiculosus - AY860201
Phymatoceros phymatodes - DQ845660
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199    1
Megaceros fuegensis - DQ845643    11
Dendroceros crispatus - AY63048    1
Dendroceros crispatus - DCRJC742    1
Nothoceros sp - AY463048    11
Megaceros aenigmaticus - L13481    1
Megaceros vincentianus - DQ845647    1
Megaceros vincentianus - MEGVINC55    1
Megaceros canaliculatus - AY463047    1
Megaceros vincentianus - AY463045    1
Phaeoceros sp. - DQ845666    1
Phaeomegaceros hirticalyx - AY463043    1
Phaeomegaceros coriaceus - AY463042    1
Megaceros pellucidus - AY463041    11 1
Megaceros flagellaris - AY860198    11
Megaceros gracilis - AY463037    11
Megaceros denticulatus - AY463038    11
Megaceros flagellaris - AY463040    11
Megaceros flagellaris - AY463036    11
Megaceros gracilis - AY463039    11

Lycophytes
Huperzia - AJ133897    1
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939
Lycopodium 1025
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A

Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612

Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position     800    803    806    814         817    833         836
Codon Position    2      2      2      1           1      2           2
Type              T      T      C      C           C      C           C
Observed/Real     PPOPPP?P
AA Position       267    268    269    272         273    278         279
AA Change         M-T    I-S    S-L    F-Y, H-Y    R-C    P-L, S-L    P-L

Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107

Liverworts
Treubiala - AY057428
Haplomitrium - AY507402
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465

Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565    11
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243
Rhizomnium - AF478237
Plagiobryum - AY163045

Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543
Anthoceros alaminiferus - AY463053    11
Anthoceros punctatus - U87063    11
Anthoceros punctatus - UAB013669    11
Folioceros fuciformis - AY463050    1
Folioceros fuciformis - AY463501    1
Phaeoceros laevis - PLAEU
Phaeoceros pearsonii - AY860203

Phaeoceros sp. - PAF2445
Phaeoceros carolinianus - AY463057
Nothoceros breutellii - AY463054
Notothylas orbicularis - AY463055
Notothylas breutellii - AY463054
Dendroceros granulosus - AY463049
Phymatoceros bulbiculosus - AY860201    11
Phymatoceros phymatodes - DQ845660    11
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199
Megaceros fuegensis - DQ845643
Dendroceros crispatus - AY63048
Dendroceros crispatus - DCRJC742
Nothoceros sp - AY463048    1
Megaceros aenigmaticus - L13481
Megaceros vincentianus - DQ845647
Megaceros vincentianus - MEGVINC55
Megaceros canaliculatus - AY463047
Megaceros vincentianus - AY463045
Phaeoceros sp. - DQ845666
Phaeomegaceros hirticalyx - AY463043
Phaeomegaceros coriaceus - AY463042
Megaceros pellucidus - AY463041
Megaceros flagellaris - AY860198
Megaceros gracilis - AY463037
Megaceros denticulatus - AY463038
Megaceros flagellaris - AY463040
Megaceros flagellaris - AY463036
Megaceros gracilis - AY463039

Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939
Lycopodium 1025    1
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A

Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A    1
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726    11
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604    1
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612

Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position     838    839    841    857    859    863    869
Codon Position    1      2      1      2      1      2      2
Type              T      C      T      C      T      T      C
Observed/Real     P      O      P      P      P      P      O
AA Position       280    280    281    286    287    288    290
AA Change         S-L    P-L    Y-H    T-M    Y-H    V-A    T-I

Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107

Liverworts
Treubiala - AY057428
Haplomitrium - AY507402    1
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465

Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565    1
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243
Rhizomnium - AF478237
Plagiobryum - AY163045

Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543    1
Anthoceros alaminiferus - AY463053    111
Anthoceros punctatus - U87063    111
Anthoceros punctatus - UAB013669    111
Folioceros fuciformis - AY463050    1
Folioceros fuciformis - AY463501    1
Phaeoceros laevis - PLAEU    111
Phaeoceros pearsonii - AY860203    1

Phaeoceros sp. - PAF2445    1
Phaeoceros carolinianus - AY463057    1
Nothoceros breutellii - AY463054
Notothylas orbicularis - AY463055
Notothylas breutellii - AY463054
Dendroceros granulosus - AY463049    1
Phymatoceros bulbiculosus - AY860201    1
Phymatoceros phymatodes - DQ845660    1
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199    1
Megaceros fuegensis - DQ845643    11
Dendroceros crispatus - AY63048    1
Dendroceros crispatus - DCRJC742    1
Nothoceros sp - AY463048    11
Megaceros aenigmaticus - L13481    11
Megaceros vincentianus - DQ845647    11
Megaceros vincentianus - MEGVINC55    11
Megaceros canaliculatus - AY463047    11
Megaceros vincentianus - AY463045    11
Phaeoceros sp. - DQ845666    11
Phaeomegaceros hirticalyx - AY463043    11
Phaeomegaceros coriaceus - AY463042    11
Megaceros pellucidus - AY463041    1
Megaceros flagellaris - AY860198    1
Megaceros gracilis - AY463037    1
Megaceros denticulatus - AY463038    1
Megaceros flagellaris - AY463040    1
Megaceros flagellaris - AY463036    1
Megaceros gracilis - AY463039    1

Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939
Lycopodium 1025
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A

Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419    1
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612

Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position     877*   908    922    946    949         956    959
Codon Position    1      2      1      1      1           2      2
Type              T      C      T      T      C           T      C
Observed/Real     P      O      P      P      O           P      O
AA Position       293    303    308    316    317         319    320
AA Change         *-Q    S-L    C-R    Y-H    P-A, S-A    I-T    A-V

Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107

Liverworts
Treubiala - AY057428
Haplomitrium - AY507402
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465

Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091    A
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243
Rhizomnium - AF478237
Plagiobryum - AY163045

Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543    11
Anthoceros alaminiferus - AY463053    11 11
Anthoceros punctatus - U87063    111
Anthoceros punctatus - UAB013669    11
Folioceros fuciformis - AY463050    11
Folioceros fuciformis - AY463501    11
Phaeoceros laevis - PLAEU    11
Phaeoceros pearsonii - AY860203    11

Phaeoceros sp. - PAF2445    11 11
Phaeoceros carolinianus - AY463057    11 11
Nothoceros breutellii - AY463054
Notothylas orbicularis - AY463055    1
Notothylas breutellii - AY463054    1
Dendroceros granulosus - AY463049
Phymatoceros bulbiculosus - AY860201    11
Phymatoceros phymatodes - DQ845660    11
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199    1
Megaceros fuegensis - DQ845643    1
Dendroceros crispatus - AY63048
Dendroceros crispatus - DCRJC742    1
Nothoceros sp - AY463048    11
Megaceros aenigmaticus - L13481    1
Megaceros vincentianus - DQ845647    1
Megaceros vincentianus - MEGVINC55    111
Megaceros canaliculatus - AY463047    1
Megaceros vincentianus - AY463045    1
Phaeoceros sp. - DQ845666    11
Phaeomegaceros hirticalyx - AY463043    111
Phaeomegaceros coriaceus - AY463042    11
Megaceros pellucidus - AY463041    11
Megaceros flagellaris - AY860198    11
Megaceros gracilis - AY463037    11
Megaceros denticulatus - AY463038    11
Megaceros flagellaris - AY463040    11
Megaceros flagellaris - AY463036    11
Megaceros gracilis - AY463039    11

Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939
Lycopodium 1025
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A

Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612

Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position     971    982    995    1010   1013   1015   1045
Codon Position    2      1      2      2      2      1      1
Type              C      T      C      C      C      T      T
Observed/Real     O?OPOOOP
AA Position       324    328    332    337    338    339    349
AA Change         S-L    C-R    P-L    S-L    P-L    C-R    C-R

Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107

Liverworts
Treubiala - AY057428
Haplomitrium - AY507402
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465

Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565    1
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243    1
Rhizomnium - AF478237    1
Plagiobryum - AY163045

Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543    111
Anthoceros alaminiferus - AY463053    1 111
Anthoceros punctatus - U87063    1 111
Anthoceros punctatus - UAB013669    1 111
Folioceros fuciformis - AY463050    11
Folioceros fuciformis - AY463501    11
Phaeoceros laevis - PLAEU    1
Phaeoceros pearsonii - AY860203    1

Phaeoceros sp. - PAF2445    1
Phaeoceros carolinianus - AY463057    1
Nothoceros breutellii - AY463054    1
Notothylas orbicularis - AY463055    11
Notothylas breutellii - AY463054    11
Dendroceros granulosus - AY463049    11
Phymatoceros bulbiculosus - AY860201    11
Phymatoceros phymatodes - DQ845660    11
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199    11
Megaceros fuegensis - DQ845643    1
Dendroceros crispatus - AY63048    11
Dendroceros crispatus - DCRJC742    1
Nothoceros sp - AY463048    11
Megaceros aenigmaticus - L13481    11
Megaceros vincentianus - DQ845647    1
Megaceros vincentianus - MEGVINC55    111
Megaceros canaliculatus - AY463047    11
Megaceros vincentianus - AY463045    11
Phaeoceros sp. - DQ845666    1
Phaeomegaceros hirticalyx - AY463043    111
Phaeomegaceros coriaceus - AY463042    111
Megaceros pellucidus - AY463041    1 111
Megaceros flagellaris - AY860198    1 111
Megaceros gracilis - AY463037    1 111
Megaceros denticulatus - AY463038    1 111
Megaceros flagellaris - AY463040    1 111
Megaceros flagellaris - AY463036    1 111
Megaceros gracilis - AY463039    1 111

Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939
Lycopodium 1025
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A

Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345
Marattia - AY138399    1
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612

Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position     1058   1063*  1076   1079   1082   1088   1091
Codon Position    2      1      2      2      2      2      2
Type              C      T      t      C      T      C      C
Observed/Real     O      O      P      P      O      O      P
AA Position       353    355    359    360    361    363    364
AA Change         S-F    *-Q    F-S    S-L    L-P    A-V    S-L

Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476    A
Chara - AY17043
Chara - DQ229107

Liverworts
Treubiala - AY057428
Haplomitrium - AY507402
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465

Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565    1
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091    1
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243
Rhizomnium - AF478237
Plagiobryum - AY163045

Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543
Anthoceros alaminiferus - AY463053
Anthoceros punctatus - U87063
Anthoceros punctatus - UAB013669
Folioceros fuciformis - AY463050
Folioceros fuciformis - AY463501
Phaeoceros laevis - PLAEU    11
Phaeoceros pearsonii - AY860203    11 1

Phaeoceros sp. - PAF2445    11
Phaeoceros carolinianus - AY463057    11
Nothoceros breutellii - AY463054    11
Notothylas orbicularis - AY463055    11 1
Notothylas breutellii - AY463054    11 1
Dendroceros granulosus - AY463049
Phymatoceros bulbiculosus - AY860201    11
Phymatoceros phymatodes - DQ845660    11
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199
Megaceros fuegensis - DQ845643    1
Dendroceros crispatus - AY63048
Dendroceros crispatus - DCRJC742
Nothoceros sp - AY463048    1
Megaceros aenigmaticus - L13481
Megaceros vincentianus - DQ845647
Megaceros vincentianus - MEGVINC55    1
Megaceros canaliculatus - AY463047
Megaceros vincentianus - AY463045
Phaeoceros sp. - DQ845666    11
Phaeomegaceros hirticalyx - AY463043    1
Phaeomegaceros coriaceus - AY463042    1
Megaceros pellucidus - AY463041    1
Megaceros flagellaris - AY860198    1
Megaceros gracilis - AY463037    1
Megaceros denticulatus - AY463038    11
Megaceros flagellaris - AY463040    1
Megaceros flagellaris - AY463036    1
Megaceros gracilis - AY463039    1

Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939
Lycopodium 1025    11
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A

Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612

Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position     1093   1094   1120   1127   1129   1136   1147
Codon Position    1      2      1      2      1      2      1
Type              C      T      c      C      T      C      C
Observed/Real     O      O      P      O      P      P      P
AA Position       365    365    374    376    377    379    383
AA Change         S-P    L-P    R-W    T-M    ??-P   S-L    L-F

Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107

Liverworts
Treubiala - AY057428
Haplomitrium - AY507402    1
Haplomitrium - U87072
Haplomitrium 804A
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465

Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565    11
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A    11
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243
Rhizomnium - AF478237
Plagiobryum - AY163045

Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543    1
Anthoceros alaminiferus - AY463053    1
Anthoceros punctatus - U87063    1
Anthoceros punctatus - UAB013669    1
Folioceros fuciformis - AY463050    1
Folioceros fuciformis - AY463501    1
Phaeoceros laevis - PLAEU
Phaeoceros pearsonii - AY860203

Phaeoceros sp. - PAF2445
Phaeoceros carolinianus - AY463057
Nothoceros breutellii - AY463054
Notothylas orbicularis - AY463055
Notothylas breutellii - AY463054
Dendroceros granulosus - AY463049
Phymatoceros bulbiculosus - AY860201    1
Phymatoceros phymatodes - DQ845660    1
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199
Megaceros fuegensis - DQ845643
Dendroceros crispatus - AY63048
Dendroceros crispatus - DCRJC742
Nothoceros sp - AY463048
Megaceros aenigmaticus - L13481
Megaceros vincentianus - DQ845647
Megaceros vincentianus - MEGVINC55
Megaceros canaliculatus - AY463047
Megaceros vincentianus - AY463045
Phaeoceros sp. - DQ845666
Phaeomegaceros hirticalyx - AY463043
Phaeomegaceros coriaceus - AY463042
Megaceros pellucidus - AY463041
Megaceros flagellaris - AY860198
Megaceros gracilis - AY463037
Megaceros denticulatus - AY463038
Megaceros flagellaris - AY463040
Megaceros flagellaris - AY463036
Megaceros gracilis - AY463039

Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939
Lycopodium 1025
Isoetes - L11054
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A

Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345
Marattia - AY138399
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612

Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position     1148   1159        1163   1166   1195   1228*  1238
Codon Position    2      1           2      2      1      1      2
Type              C      C           C      C      T      T      C
Observed/Real     OOO?OPPP
AA Position       383    387         388    389    399    410    413
AA Change         S-F    P-S, A-S    A-V    S-L    S-P    *-R    S-L

Algae
Chaetosphaeridium - AY072816
Coleochaete - L13477
Coleochaete - AY051142
Spirogyra - L11057
Nitella - AF097745
Tolypella - U27531
Chara - L13476
Chara - AY17043
Chara - DQ229107

Liverworts
Treubiala - AY057428
Haplomitrium - AY507402
Haplomitrium - U87072
Haplomitrium 804A    1
Haplomitrium - U87072?
Fossombronia - U87069
Sphaerocarpos - U87090    1
Jubula - U87074
Metzgeria - U87081
Monoclea - U87083
Ricciocarpos - U87089
Marchantia - X04465

Mosses
Sphagnum - AF231887
Sphagnum 1033A
Takakia - AF244565
Andreaea - AF478199
Andreaea - AF478200
Andreaeobryum - AF231059
Thuidium 916A
Polytrichum - U87087
Atrichum - AF478209
Mnium - U87082
Tetraphis - TPU87091    1
Funaria - AF226818
Buxbaumia - AF478211
Rhodobryum - AF478243
Rhizomnium - AF478237
Plagiobryum - AY163045

Hornworts
Leiosporoceros dussii - AY463052
Leiosporoceros dussii - L2LDUMX
Anthoceros formosae - NC_004543
Anthoceros alaminiferus - AY463053    1
Anthoceros punctatus - U87063    1
Anthoceros punctatus - UAB013669    1
Folioceros fuciformis - AY463050    1
Folioceros fuciformis - AY463501    11
Phaeoceros laevis - PLAEU    11
Phaeoceros pearsonii - AY860203    11

Phaeoceros sp. - PAF2445    11
Phaeoceros carolinianus - AY463057    11
Nothoceros breutellii - AY463054
Notothylas orbicularis - AY463055
Notothylas breutellii - AY463054
Dendroceros granulosus - AY463049
Phymatoceros bulbiculosus - AY860201    1
Phymatoceros phymatodes - DQ845660    1
Phaeoceros carolinianus - AY860202
Phaeomegaceros fimbriatus - DQ845655
Nothoceros sp - AY860199    11
Megaceros fuegensis - DQ845643    11
Dendroceros crispatus - AY63048
Dendroceros crispatus - DCRJC742    1
Nothoceros sp - AY463048    111
Megaceros aenigmaticus - L13481    111
Megaceros vincentianus - DQ845647    1
Megaceros vincentianus - MEGVINC55    111
Megaceros canaliculatus - AY463047    11
Megaceros vincentianus - AY463045    111
Phaeoceros sp. - DQ845666    11
Phaeomegaceros hirticalyx - AY463043    111
Phaeomegaceros coriaceus - AY463042    11
Megaceros pellucidus - AY463041    111
Megaceros flagellaris - AY860198    111
Megaceros gracilis - AY463037    11
Megaceros denticulatus - AY463038    11
Megaceros flagellaris - AY463040    11
Megaceros flagellaris - AY463036    11
Megaceros gracilis - AY463039    11

Lycophytes
Huperzia - AJ133897
Huperzia - Y07934
Lycopodiella - AJ133261
Phylloglossum - Y07939    1
Lycopodium 1025    1
Isoetes - L11054    1
Isoetes - AF404499
Selaginella - AF093254
Selaginella - AF093253
Selaginella 1160A
Selaginella 1161A
Selaginella 1021A

Ferns
Psilotum - AP004638
PSIOLOCDNA925B
Psilotum 1081A
Psilotum - L11059
Botrychium 1026A
Botrychium - L13474
Ophioglossum - AB024960
Equisetum - AF313580

Onoclea - U62034
Grammitis - AY460647
PAUSIU798A
Woodsia - AB021726
Polypodium - AY362611
Polypodium 798A
Bommeria - U19497
Cyathea - AM177335
Cyathea 1082A
Dennstaedtia - U18636
Belvisia - AY362562
Thelypteris - D43919
Vittaria - U20937
Lygodium - AJ303419
Asplenium - AY641804
Woodwardia - AB040604
Dicksonia - AM177345    1
Marattia - AY138399    1
Dryopteris - AY268887
Dicranopteris - U18626
Adiantum - AY178864
Polypodium - AY362612

Seed Plants
Pinus pumila - AB161042
Gnetum 1162A
Zamia 1109A
Welwitschia 1110A
Arabidopsis thaliana - AP000423

Base Position     1282             1325             1331        EDITING SUM
Codon Position    1                2                2
Type              T                T                C
Observed/Real     P                P                P
AA Position       428              442              444
AA Change         S-R, C-R, T-R    S-P, A-P, L-P    F-L, S-L

Algae
Chaetosphaeridium - AY072816    0
Coleochaete - L13477    0
Coleochaete - AY051142    0
Spirogyra - L11057    0
Nitella - AF097745    0
Tolypella - U27531    A    0
Chara - L13476    0
Chara - AY17043    0
Chara - DQ229107    0

Liverworts    0
Treubiala - AY057428    0
Haplomitrium - AY507402    4
Haplomitrium - U87072    0
Haplomitrium 804A    3
Haplomitrium - U87072?    0
Fossombronia - U87069    0
Sphaerocarpos - U87090    1
Jubula - U87074    1
Metzgeria - U87081    2
Monoclea - U87083    0
Ricciocarpos - U87089    0
Marchantia - X04465    0

Mosses    0
Sphagnum - AF231887    1
Sphagnum 1033A    1
Takakia - AF244565    20
Andreaea - AF478199    0
Andreaea - AF478200    0
Andreaeobryum - AF231059    1
Thuidium 916A    2
Polytrichum - U87087    0
Atrichum - AF478209    0
Mnium - U87082    1
Tetraphis - TPU87091    3
Funaria - AF226818    0
Buxbaumia - AF478211    1
Rhodobryum - AF478243    2
Rhizomnium - AF478237    3
Plagiobryum - AY163045    0

Hornworts    0
Leiosporoceros dussii - AY463052    0
Leiosporoceros dussii - L2LDUMX    0
Anthoceros formosae - NC_004543    19
Anthoceros alaminiferus - AY463053    1    31
Anthoceros punctatus - U87063    1    30
Anthoceros punctatus - UAB013669    1    29
Folioceros fuciformis - AY463050    19
Folioceros fuciformis - AY463501    19
Phaeoceros laevis - PLAEU    28
Phaeoceros pearsonii - AY860203    28

Phaeoceros sp. - PAF2445    28
Phaeoceros carolinianus - AY463057    29
Nothoceros breutellii - AY463054    24
Notothylas orbicularis - AY463055    1    27
Notothylas breutellii - AY463054    1    25
Dendroceros granulosus - AY463049    17
Phymatoceros bulbiculosus - AY860201    1    33
Phymatoceros phymatodes - DQ845660    1    32
Phaeoceros carolinianus - AY860202    0
Phaeomegaceros fimbriatus - DQ845655    0
Nothoceros sp - AY860199    20
Megaceros fuegensis - DQ845643    27
Dendroceros crispatus - AY63048    17
Dendroceros crispatus - DCRJC742    15
Nothoceros sp - AY463048    28
Megaceros aenigmaticus - L13481    27
Megaceros vincentianus - DQ845647    23
Megaceros vincentianus - MEGVINC55    30
Megaceros canaliculatus - AY463047    24
Megaceros vincentianus - AY463045    26
Phaeoceros sp. - DQ845666    28
Phaeomegaceros hirticalyx - AY463043    29
Phaeomegaceros coriaceus - AY463042    28
Megaceros pellucidus - AY463041    28
Megaceros flagellaris - AY860198    25
Megaceros gracilis - AY463037    24
Megaceros denticulatus - AY463038    25
Megaceros flagellaris - AY463040    24
Megaceros flagellaris - AY463036    24
Megaceros gracilis - AY463039    24

Lycophytes    0
Huperzia - AJ133897    3
Huperzia - Y07934    2
Lycopodiella - AJ133261    2
Phylloglossum - Y07939    3
Lycopodium 1025    5
Isoetes - L11054    1
Isoetes - AF404499    0
Selaginella - AF093254    A    0
Selaginella - AF093253    2
Selaginella 1160A    0
Selaginella 1161A    1
Selaginella 1021A    0

Ferns    0
Psilotum - AP004638    0
PSIOLOCDNA925B    0
Psilotum 1081A    0
Psilotum - L11059    0
Botrychium 1026A    2
Botrychium - L13474    0
Ophioglossum - AB024960    0
Equisetum - AF313580    0

Onoclea - U62034    0
Grammitis - AY460647    0
PAUSIU798A    0
Woodsia - AB021726    2
Polypodium - AY362611    0
Polypodium 798A    0
Bommeria - U19497    0
Cyathea - AM177335    0
Cyathea 1082A    0
Dennstaedtia - U18636    0
Belvisia - AY362562    1
Thelypteris - D43919    0
Vittaria - U20937    0
Lygodium - AJ303419    1
Asplenium - AY641804    0
Woodwardia - AB040604    2
Dicksonia - AM177345    1
Marattia - AY138399    2
Dryopteris - AY268887    0
Dicranopteris - U18626    1
Adiantum - AY178864    0
Polypodium - AY362612    0

Seed Plants    0
Pinus pumila - AB161042    0
Gnetum 1162A    0
Zamia 1109A    0
Welwitschia 1110A    1
Arabidopsis thaliana - AP000423    0
