Sequencing a Gene under Strong Selection Aspartate Aminotransferase in North Atlantic

Felix Mittermayer

Degree project for Master of Science (Two Years) in Biology

Degree course in Marine Ecology 45 hec Spring and Autumn 2013

Department of Biological and Environmental Sciences University of Gothenburg

Examiner: Kerstin Johannesson Department of Biological and Environmental Sciences University of Gothenburg

Supervisor: Marina Panova and Mårten Duvetorp Department of Biological and Environmental Sciences University of Gothenburg

ABSTRACT
INTRODUCTION
NATURAL SELECTION, LOCAL ADAPTATION AND ALLOZYME VARIATION
SPECIES OF LITTORINA IN NORTH ATLANTIC
ASPARTATE AMINOTRANSFERASE ALLOZYME VARIATIONS IN L. SAXATILIS
AIM
MATERIAL AND METHODS
SAMPLING, SAMPLE PREPARATION AND PHENOTYPING
DNA AMPLIFICATION AND SEQUENCING
DATA ANALYSIS
RESULTS
DISCUSSION
FUTURE WORK
REFERENCES
FIGURES AND TABLES

Abstract

Natural selection is one of the driving forces of evolution; to understand it we must gain insight into the molecular mechanisms that create genetic variation. Allozymes are variants of an enzyme coded for by different alleles and are generally considered to be under neutral or weak selection. Aspartate aminotransferase (Aat, EC 2.6.1.1) in the rough periwinkle, Littorina saxatilis (Olivi, 1792), has however been shown to be under a very stringent selection regime. Aat is an essential part of anaerobic energy production in molluscs. The two variants of Aat are distributed along a vertical gradient: Aat100 predominates (frequency 0.7-0.8) in the surf zone, while Aat120 is mainly found (frequency 0.8-0.9) in the splash zone. After an extinction event of L. saxatilis in the surf zone, the vacated habitat was recolonized by individuals from the splash zone that mainly carried Aat120, but within a few generations Aat100 was again the dominant allele in the surf zone. A gene under this kind of selection can provide further insight into speciation and local adaptation once its coding sequence is available.

Sampling of Littorina species, genotyping by acrylamide gel electrophoresis and RNA extraction were carried out prior to the start of the project. All initial primers were designed using the coding sequence from Crassostrea gigas, the most closely related species available, and were later complemented using Aplysia californica genome data as well as assembly data from the IMAGO genome project and transcriptome data from A. Sa Pinto, CIBIO, Portugal. Subsequent primers were based on partial sequences obtained from the partially sequenced samples. The final primers allowed us to characterize the coding sequence for 348 of the 409 amino acids that comprise the enzyme, based on 118 RNA sequences from 19 individuals (L. saxatilis, Littorina arcana, Littorina compressa). A total of 4 synonymous and 2 non-synonymous mutations were identified that distinguish Aat100 from Aat120. Several more synonymous and non-synonymous mutations were found in the coding sequences of Aat in Littorina fabalis and Littorina littorea. Further, we calculated the isoelectric point for this part of the enzyme, 8.74 for Aat100 and 8.44 for Aat120, which explains the different migration speeds through the acrylamide gels. We additionally retrieved the available protein and nucleotide sequences for Aat in other molluscs and incorporated them into a phylogenetic tree, revealing that Aat in molluscs is indeed a gene under strong selection.


Introduction

Natural selection, local adaptation and allozyme variation

Local adaptation is a process in which a population or a group of individuals has changed genetically in one or more characters in a way that improves fitness in the local environment relative to populations living in other environments. This adaptation to local conditions, whether the selective agents are biotic or abiotic, improves the survival and reproductive success of those individuals that are favoured by the prevailing selection regime (Futuyma 2009). When local adaptation results in reproductive isolation, such as assortative mating or intrinsic incompatibilities, a speciation event might occur (Butlin et al. 2008). Allozymes are variants of an enzyme coded for by different alleles of a gene and distinguished by gel electrophoresis. Variation in allozyme frequencies is common among different populations of a species (Johannesson & Johannesson 1989, Armbruster 2001). This frequency variation is often considered to be the result of random genetic drift (Thorpe & Solé-Cava 1994), but some allozyme variation has been shown to be maintained by strong divergent selection (Theisen 1978, Koehn et al. 1983) that is in some cases connected to the habitat (Johannesson & Johannesson 1989, Johannesson et al. 1995, Armbruster 2001). Examples of allozymes under selection are cytosolic malate dehydrogenase in limpets (Lottia spp.) (Dong & Somero 2009), phosphoglucose isomerase in Colias butterflies (Wheat et al. 2006) and lactate dehydrogenase in the fish Fundulus heteroclitus (Powers & Schulte 1998). Another example of adaptive variation in proteins is the haemoglobin of cod (Gadus morhua) (Andersen et al. 2009), which is under selection by temperature differences over many hundreds of kilometres.
One of the most striking examples of allozyme variation is the case of aspartate aminotransferase in the intertidal snail Littorina saxatilis, where strong selection acts over a gradient of only a few metres (see below).

Species of Littorina in North Atlantic

In the North Atlantic, there are six species of Littorina: the rough periwinkle L. saxatilis (Olivi, 1792), its sister species L. compressa (Jeffreys, 1865) and L. arcana (Hannaford Ellis, 1978), two species of flat periwinkles, L. obtusata (L.) and L. fabalis (Turton, 1825), and the more distantly related species L. littorea (L.). The main focus of this study is on L. saxatilis.

The ovoviviparous marine snail L. saxatilis inhabits the littoral and sublittoral shorelines of the North Atlantic. It can be found in the Arctic and on the North American east coast as well as on most northern and western European shores (Reid 1996), where it can occur in very high densities, with several hundred individuals per m². Littorina saxatilis is the only species within the genus Littorina that broods its young, meaning that the females release fully developed juveniles after carrying them in a brood pouch (Reid 1996). The highly variable intertidal habitat can cause very different stress levels in the various microenvironments with respect to abiotic factors such as temperature, desiccation (Sokolova et al. 2000) or wave exposure, and biotic factors such as predation (Johannesson 2003). As a consequence of the mode of reproduction and the low mobility of adult snails, migration and gene flow among distant populations are weak. Weak gene flow in combination with strong differential selection has in this species resulted in the formation of wave-exposed, moderate, sheltered and barnacle ecotypes with different morphologies (Reid 1996), in addition to genetic differences as a consequence of genetic drift and selection (Johannesson & Johannesson 1990). The most common ecotypes are the large, thick-shelled one with a high spire, known as the crab type, and the small, thin-shelled one with a large aperture, the wave type. The location of the different ecotypes varies with geographical position. In the UK, for instance, the crab type is generally found on shores with boulders that allow for the presence of crabs, while the wave type is found on cliffs, with the crab type occurring above the wave type along the shore gradient. In Sweden, however, the ecotypes show a horizontal distribution pattern, as they are found in different habitats along the coastline: the crab type in sheltered bays, often among boulders and pebbles, and the wave type on wave-exposed cliffs.
Thus, the wave ecotype in Sweden is exposed to further microenvironmental variation between low and high shore. Littorina compressa and L. arcana are considered to be sister species of L. saxatilis (Reid et al. 1996). Their distributions are limited to the British Isles, Brittany and northern Norway. Where all three species occur in sympatry, they have different vertical distributions, with L. compressa in the lower, L. saxatilis in the middle and L. arcana in the upper part of the littoral zone; the segregation is not complete, however, as there is considerable overlap between vertically neighbouring species. Unlike L. saxatilis, these two species are oviparous, i.e. they lay their eggs in gelatinous masses in damp crevices, from which metamorphosed juveniles hatch. Further, L. compressa and L. arcana have the same ecotypes as L. saxatilis, except for a brackish water type, since they are not found in that kind of environment (Reid 1996).

The flat periwinkle, L. obtusata, is found all around the North Atlantic, from Russia (White Sea) to Portugal in the east and from Greenland to New England in the west. This species inhabits the eulittoral zone and is closely associated with macrophytes, mostly fucoids. Its sister species L. fabalis has a similar distribution range but is not found in Atlantic Canada or New England. Like L. saxatilis and its sister species, L. obtusata and L. fabalis lack a planktonic stage; they lay gelatinous egg masses on the thalli of their host fucoids, from which fully developed juveniles hatch. Both species form ecotypes depending on the exposure of their habitat to wave action: exposed, sheltered and moderate ecotypes exist. They are also known to enter brackish environments and can be found in the western Baltic Sea (Reid 1996). The only pelagic spawning species included in this study is L. littorea, the common periwinkle, whose larvae are planktotrophic until metamorphosis and settlement (Johannesson 1988, Reid 1996). The species is common all along the north-eastern Atlantic coast from Russia to Portugal, including the western Baltic Sea, and on the western North Atlantic seaboard it is found from Atlantic Canada to Delaware Bay (Reid 1996). Further, L. littorea lacks ecotypes, presumably due to the high potential for larval dispersal, which causes genetic uniformity over large geographic areas (Johannesson 1992).

Aspartate aminotransferase allozyme variations in L. saxatilis

There are multiple lines of evidence that aspartate aminotransferase (Aat, EC 2.6.1.1), also known as glutamate oxaloacetate transaminase (Got), is under very strong selection in L. saxatilis (Johannesson et al. 1995). In L. saxatilis, aspartate aminotransferase has two allozyme variants, Aat100 and Aat120 (Johannesson & Johannesson 1989, Panova & Johannesson 2004). The former is found at higher frequencies in the surf zone, i.e. the lowest part of the species' distribution, which is constantly or at least very regularly wet. Aat120 dominates the splash zone, the area that is only wet when breaking waves create splash and spray. The gradient is very strong, with Aat120 having a frequency of 0.8-0.9 on the upper shore and Aat100 having a frequency of 0.7-0.8 on the lower shore (Johannesson & Johannesson 1989). Parallel patterns of phenotype distribution can be found in Norway, Iceland and the United Kingdom (Johannesson & Johannesson 1989), while the pattern in Spain is the exact opposite, with Aat120 being found at higher frequencies in the surf zone and Aat100 being more frequent in the splash zone (Johannesson et al. 1993). The designations Aat100 and Aat120 are based on the different migration speeds of the enzyme products in gel electrophoresis, as Aat120 moves roughly 20% faster through the gel than Aat100. Aspartate aminotransferase is an essential part of anaerobic energy production in marine molluscs (Hochachka 1980). Earlier studies have shown that Aat120 has a lower enzyme activity than Aat100, and that the enzyme activity is negatively correlated with body mass (Panova & Johannesson 2004). This was suggested to be an adaptation that reduces the accumulation of metabolic waste products and saves energy resources during long periods of desiccation, which are more common in the splash zone. At the same time, the higher enzymatic activity of Aat100 would allow the periwinkles to sustain their metabolic rates and remain active during short periods of desiccation on the low shore. The DNA sequence of the Aat gene in Littorina, as well as the variation behind the allozyme alleles in L. saxatilis, is not known. The aspartate aminotransferase coding sequence has been characterized for the bivalve Crassostrea gigas (Boutet et al. 2005). In addition, it could be found in the recently published genomes of two mollusc species, Aplysia californica (Broad Institute) and the limpet Lottia gigantea (JGI). Since this project started, an unpublished partial genome assembly (CeMEB) and transcriptome data (A. Sa Pinto, CIBIO, Portugal) have become available for L. saxatilis. Using the available Aat protein sequences in molluscs, we were able to find partial Aat sequences in the Littorina genome and transcriptome assemblies and to use them for primer design.

Aim

The principal aim of this study is to characterize the aspartate aminotransferase coding sequences in six North Atlantic Littorina species, to identify the mutations at DNA level that create the two variants of the enzyme in L. saxatilis, L. arcana and L. compressa, and to sequence the enzyme in the remaining species. As the gradient of Aat alleles has been found in several geographic populations of L. saxatilis, we investigate whether the Aat100 and Aat120 alleles in all regions are characterised by the same mutations; this is especially interesting in the case of the reversed pattern in Spain. Furthermore, we will compare the Aat alleles in L. saxatilis to those of its sister species inhabiting predominantly the low shore (L. compressa) or the high shore (L. arcana), whose alleles appear identical in native gel electrophoresis. Finally, we will obtain the Aat coding sequences for the rest of the North Atlantic Littorina species (L. littorea, L. obtusata and L. fabalis) to compare the Aat gene genealogy with the phylogeny of this group. The resulting information on differences and mutations between the sequences and allozymes is important, since the molecular architecture of allozyme polymorphism has only been addressed in very few studies. This is a first step towards understanding the functional effects of the mutations and the role of Aat in local adaptation, and will more generally improve our knowledge of adaptation and speciation in the genus Littorina.

Material and Methods

Sampling, sample preparation and phenotyping

Samples of L. saxatilis, L. arcana, L. compressa, L. obtusata, L. fabalis, L. littorea and Melaraphe neritoides (Montagu, 1806) were collected from Sweden, Norway, Iceland and Portugal (Tab. 1). The snails were either kept alive until dissection or stored at -80 °C. Tissue collected during dissection for RNA extraction, mainly head or foot tissue, was stored in RNAlater® for later use. The remaining tissue was homogenized and used to score genotypes by electrophoresis on acrylamide gels; all gels were scored independently by two persons. Only homozygotes were selected for RNA extraction, followed by single-stranded cDNA synthesis using a Promega reverse transcriptase, resulting in a total of 28 samples representing different allozyme genotypes, localities and species (Tab. 2). The initial work (allozyme genotyping, RNA extraction, cDNA synthesis, initial primer construction and RACE analysis) was carried out by Johansson (Johansson 2012).

DNA amplification and sequencing

Primer designs were based on L. saxatilis genome contigs showing high similarity to the Aat-1 (cytosolic Aat) protein sequence of the Pacific oyster (Crassostrea gigas) (Boutet et al. 2005) and, in addition, on two partial L. saxatilis transcriptome sequences kindly provided by Alexandra Sa Pinto, CIBIO, Portugal (Fig. 1). These sequences were aligned with the multiple alignment tool in Geneious® 6.1.4 by Gateway®. This program was also used to design the primers, which were manufactured by Cybergene AB (Stockholm, Sweden). Actin was used as a positive control in both ss-cDNA synthesis and PCR amplification. All products were visualized using either 1% agarose gels with Tris-boric EDTA buffer or 1.5% agarose gels with Tris-acetate EDTA buffer; a 100-bp ladder was used to estimate the length of the PCR products on the gels. All products of good quality and correct length were purified using ExoSAP-IT (USB, Wycombe, UK) or the E.Z.N.A.® Cycle-Pure kit and sent for sequencing (Macrogen Europe, Amsterdam, The Netherlands). All resulting data were aligned with the CDS predicted from the genome and transcriptome sequences.

The GeneRacer™ kit was used to amplify and sequence the UTRs of the transcript. Primers were adapted for use with the GeneRacer™ kit, as they need to have higher melting and annealing temperatures (Tm and Ta). PCRs on the GeneRacer™ template were conducted using Platinum® Pfx DNA polymerase from Invitrogen™ and the products were examined on a 1.5% agarose gel with Tris-acetate EDTA buffer. Products that contained a single fragment of the correct length were used as template in a nested PCR. In products with multiple fragments, the band containing the fragment of the correct length was cut from the gel using a sterile scalpel under UV light. A gel extraction kit (QIAquick®) was used to retrieve the DNA from the gel; the extract was purified using ExoSAP-IT and used as template in a nested PCR. Products from this nested PCR were individually cloned into E. coli (One Shot® TOP10) using a Zero Blunt® TOPO® PCR cloning kit containing a pCR®-Blunt II-TOPO® vector. Clones were picked after the colonies reached a diameter of 1-3 mm and the inserts were amplified using M13 PCR protocols. After screening by gel electrophoresis, all samples containing the product were purified using the Cycle-Pure® kit and sent for sequencing (Macrogen Europe, Amsterdam, The Netherlands). The obtained sequences were aligned to the expected coding sequence and used for the design of new primers in Geneious.

In total, 41 different primers (Tab. 4) were tested during the project. Many of them failed to produce a PCR product of the expected size, or amplified only a faint band or several products of different lengths in all or some individuals. Possible explanations are ambiguity in the genome assembly contigs used to design the primers (especially at the 3’ end of the CDS), variation in DNA sequence between the species and suboptimal primer characteristics. In addition, mapping of reads to the L. saxatilis transcriptome assembly, which became available at the end of this project, showed that Aat is expressed at low levels in the muscle tissue used for the RNA extractions (these data were kindly provided by Magnus Alm Rosenblad, CMB, University of Gothenburg, Sweden). All primers were tested in different combinations, with different polymerases (Phusion® High-Fidelity DNA Polymerase, Platinum® Pfx DNA Polymerase, 5 PRIME™ HotMaster™ Taq Polymerase and RBC Taq DNA Polymerase) and accordingly modified PCR programs. In the end, Aat_sax_F2 as forward and Aat_R9 as reverse primer were identified as the most efficient primer combination, producing a product of 1046 bp when using 5 PRIME™ HotMaster™ Taq Polymerase and touch-down cycling parameters (calculated for Ta = 55 °C). The products were individually cloned into E. coli (One Shot® TOP10) with a TOPO® TA Cloning® kit using a pCR®2.1-TOPO® TA vector. Colonies were picked and screened using the M13 PCR protocol and gel electrophoresis; as many positive clones as possible, but not more than 12 per individual, were sent for sequencing (Macrogen Europe, Amsterdam, The Netherlands).
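The requirement that primers for the RACE reactions have a higher melting temperature can be illustrated with a rough calculation. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a coarse approximation and is normally applied to short oligonucleotides; it is not the method used in Geneious, and the two primer sequences are hypothetical and are not primers from Tab. 4.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule.

    Tm = 2 * (number of A and T) + 4 * (number of G and C).
    This is an approximation for short primers, not the Geneious estimate.
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be non-empty and contain only A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical primers for illustration only (not taken from this study).
short_primer = "ATGGCTTCAGTTCTGAAC"          # 18 nt
extended_primer = "ATGGCTTCAGTTCTGAACGGTCCAG"  # same primer extended to 25 nt

for name, seq in [("short", short_primer), ("extended", extended_primer)]:
    print(name, len(seq), "nt, approx. Tm:", wallace_tm(seq))
```

Extending a primer, or shifting it to a more GC-rich region, raises the estimated Tm, which is the kind of adaptation described above.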

Data analysis

Forward and reverse sequences were assembled, inspected for quality, manually corrected and aligned using Geneious. Additional alignments were done in Seaview v4.4.2 and the web-based Praline multiple sequence alignment tool. Phylogenetic analyses were performed with a Bayesian approach on the remote Albiorix server (University of Gothenburg, Sweden) with the program MrBayes (Huelsenbeck & Ronquist 2001). All resulting trees were edited using FigTree v1.4.0 and haplotype networks were created in TCS 1.21 (Clement et al. 2000). Consensus alignments were created for each species and the L. saxatilis sequence was used as query in an NCBI BLAST search (Altschul et al. 1990) to retrieve mRNA and protein sequences for aspartate aminotransferase from other mollusc taxa. The resulting matches were aligned and analysed phylogenetically as above.
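The idea behind building a consensus from several aligned clone sequences can be sketched as a column-wise majority vote. This is a minimal illustration under stated assumptions: the function name, the tie rule (ties become N) and the toy sequences are hypothetical, and the consensus settings actually used in Geneious are not reproduced here.

```python
from collections import Counter


def majority_consensus(aligned, ambiguous="N"):
    """Column-wise majority consensus of equally long, aligned sequences.

    Gap characters are treated like any other symbol. If the two most
    common symbols in a column are tied, the ambiguity symbol is written.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")

    consensus = []
    for column in zip(*aligned):
        counts = Counter(symbol.upper() for symbol in column)
        (top, top_count), *rest = counts.most_common()
        if rest and rest[0][1] == top_count:
            consensus.append(ambiguous)  # no clear majority in this column
        else:
            consensus.append(top)
    return "".join(consensus)


# Toy clone sequences from one hypothetical individual.
clones = ["ATGGCTAAA", "ATGGCTAAA", "ATGGCCAAA"]
print(majority_consensus(clones))
```

A majority vote of this kind also shows why individuals with very few clone sequences are hard to score: with two clones, any single disagreement produces a tie.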

Results

In the allozyme analysis, both aspartate aminotransferase alleles, Aat100 and Aat120, were found in L. saxatilis and L. arcana, while only Aat100 could be found in L. compressa. Littorina fabalis and L. obtusata had the same electrophoretic pattern as L. saxatilis Aat100, while the L. littorea variant moved slightly faster through the gels. Initially, a fragment of the Aat coding sequence with a length of 1046 bp, out of a total expected length of 1230 bp, was characterized in 19 individuals. Of these, four individuals had to be excluded from the analysis due to a low number of sequences and/or low sequence quality. For L. saxatilis, L. arcana and L. compressa, a total of five haplotypes for Aat100 and five for Aat120 were identified. A total of 16 mutations were identified among the haplotypes of L. saxatilis, L. arcana and L. compressa; all but 4 of these mutations were synonymous. The non-synonymous mutations between Aat100 and Aat120 in Scandinavia for L. saxatilis, L. arcana and L. compressa are situated in codons 29 and 103, where a glutamine is replaced by an isoleucine and an isoleucine by a valine, respectively. In Portugal, by contrast, the differences between Aat100 and Aat120 are found in codons 212 and 215, where alanine is replaced by threonine and lysine by glutamine (Fig. 1 & 2). Littorina fabalis differed from Aat100 in eight nucleotides and L. littorea in 44, of which three and 14, respectively, were non-synonymous mutations. A phylogenetic tree based on nucleotides was created using a Bayesian approach (Fig. 3), with L. littorea set as an outgroup, while a tree based on the amino acid sequences was produced using the Neighbor-Joining method (Fig. 4). In both trees, L. saxatilis, L. arcana and L. compressa do not cluster by species or by origin but only by allozyme relative mobility on acrylamide gels. The only exceptions are the Portuguese Aat120 haplotypes, which do not group with the other Aat120 haplotypes but with all Aat100 haplotypes. Littorina fabalis is placed between the Aat100 and Aat120 groups. The BLAST search, using blastp and tblastn in the EST (Expressed Sequence Tags) and TSA (Transcriptome Shotgun Assembly) NCBI databases, yielded usable mRNA and protein sequences for eight other species (Tab. 3). When these sequences are converted into amino acids and aligned, the resulting alignment shows that the enzyme is not very well conserved (Fig. 5). A tree produced with MrBayes (Huelsenbeck & Ronquist 2001) groups the gastropods together, while the bivalves do not form a monophyletic group for this gene (Fig. 6). The isoelectric point (pI or IEP) is the pH at which a molecule has no net electric charge; as a result, enzymes with different isoelectric points move at different speeds during electrophoresis in a gel of a given pH. The isoelectric point for the partial protein sequence was calculated using Geneious. The non-synonymous mutations between the two allozymes Aat100 (pI = 8.74) and Scandinavian Aat120 (pI = 8.44) correspond to a difference of 0.3 in the isoelectric point. Further, the Portuguese Aat120 has the same pI as the Scandinavian Aat120 (pI = 8.44), explaining why they could not be distinguished by acrylamide gel electrophoresis. Littorina littorea and L. fabalis Aat have a pI of 8.34 and 8.74, respectively.
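How an isoelectric point follows from an amino acid sequence can be sketched by summing the Henderson-Hasselbalch charges of the ionisable groups and searching for the pH at which the net charge is zero. The sketch below is an illustration only: the pKa values are one illustrative set among several published scales (absolute pI values shift with the scale chosen), the algorithm used by Geneious may differ, and the two peptides are hypothetical, not Aat sequences.

```python
# Illustrative side-chain pKa values (an assumption; published scales differ).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(seq: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    seq = seq.upper()
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))   # free amino terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))  # free carboxyl terminus
    for aa, pka in POSITIVE_PKA.items():
        charge += seq.count(aa) / (1.0 + 10 ** (ph - pka))
    for aa, pka in NEGATIVE_PKA.items():
        charge -= seq.count(aa) / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(seq: str, tol: float = 1e-4) -> float:
    """Find the pH with zero net charge by bisection on the interval 0-14."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(seq, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical peptides differing by one lysine-to-glutamine replacement.
with_lysine = "MKTAYIAKQR"
with_glutamine = "MQTAYIAKQR"
print(round(isoelectric_point(with_lysine), 2))
print(round(isoelectric_point(with_glutamine), 2))
```

Replacing a lysine with a glutamine removes one positively charged side chain, so the calculated pI of the variant is lower; a replacement between two uncharged residues leaves the result of this calculation unchanged.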

Discussion

After sequencing most of the aspartate aminotransferase coding sequence (341/410 aa) (Fig. 7), we can now explain the different migration speeds of Aat100 and Aat120 by two non-synonymous mutations. These two mutations can be linked to the functional differences examined by Panova & Johannesson (2004); however, we cannot rule out metabolic differences caused by variation in expression levels. The observed higher number of synonymous mutations compared to non-synonymous ones is expected, as the former do not lead to amino acid replacements and hence do not affect the function of the protein. Consequently, synonymous mutations are not removed by selection, while non-synonymous mutations with lethal effects are (Brown et al. 1982). The sequenced alleles for L. saxatilis, L. arcana and L. compressa show very little to no variation within each of the two allozymes (99.2-100% similarity at nucleotide level), which might be explained by the close relatedness of these three taxa and agrees with the similarity of allozymes described in them (Zaslavskaya et al. 1992). The extremely low frequency of Aat120 in L. compressa (Knight & Ward 1991) might be explained by the absence of this species from the splash zone, where it would be exposed to desiccation; it therefore does not need to adapt to this environmental stress. When examining the position of the individuals of Portuguese origin in the phylogenetic trees (Fig. 3, 4 & 5), it is notable that they cluster differently from all other L. saxatilis. This may explain the fact that L. saxatilis from the Iberian Peninsula has a reversed distribution pattern compared to other European populations, i.e. Aat120 near the low tide mark and Aat100 at the high tidal level (Johannesson et al. 1993). The Iberian and Scandinavian Aat120 alleles have different non-synonymous mutations compared to Aat100, but still have the same isoelectric point. This suggests that there is also a difference in activity or another enzymatic characteristic between the Iberian and Scandinavian Aat120 alleles that may explain their reversed vertical distributions on the shores, which is presumably also a consequence of microhabitat-linked natural selection. At this point we can speculate that the non-synonymous mutations may explain the differences in enzyme activity measured on the upper and lower shore (Panova & Johannesson 2004) and in heat stability (Hull et al. 1999). It has been suggested that gastropod species living in the littoral fringe, i.e. the splash zone, have a different strategy to cope with desiccation than species from the eulittoral zone. While the eulittoral species are submerged regularly by the tides, the species from the splash zone are only wetted by wave action, which occurs on a much more irregular basis, and have developed an allozyme that reduces anaerobic metabolic activity in order to reduce waste accumulation and retain energy (McMahon 1990, McMahon et al. 1995). Sokolova and Pörtner (2001a, b) show that this is the case for L. saxatilis populations in the White Sea.
Another reason for this adaptation could be related to temperature. Panova and Johannesson (2004) argue, however, that this is not the case, since L. saxatilis inhabiting the rock pools above the splash zone predominantly have Aat100 (Johannesson & Johannesson 1989), as does the Swedish crab ecotype of L. saxatilis dwelling on sheltered boulder shores (Janson & Ward 1984). Another example of allozymes under natural selection is mitochondrial isocitrate dehydrogenase in the lugworm (Arenicola marina), whose allozyme frequencies are correlated with the annual mean temperature of their native substrate along the European coast (Hummel et al. 1997). Yet another commonly used example is the frequency of aminopeptidase (Lap) alleles in Mytilus edulis (L.) in the Long Island Sound population, where natural selection occurs over a salinity gradient. In this particular case the allozyme with the higher catalytic activity is mainly found in fully saline conditions, while the allozyme with the lower catalytic activity is found in more brackish estuarine conditions (Koehn et al. 1983); the case is similar in the Baltic Sea (Theisen 1978). When comparing the amino acid and mRNA sequences for Aat from 8 other mollusc species, namely gastropods, bivalves and one cephalopod, one bivalve is placed within the Heterobranchia instead of in the bivalve clade, where it would be expected when using neutral markers (Passamaneck et al. 2004). This is another piece of evidence suggesting that aspartate aminotransferase is under selection even at a higher taxonomic level. When comparing the amino acid sequences from the species examined in this study to the sequences obtained from databases, a difference of 4.9% can be found within the genus Littorina, 16.7% to other Littorinimorpha, e.g. Bithynia siamensis, 35.8-37.3% to other gastropods outside the Littorinimorpha clade, 38.2-53.1% to bivalves and 34.3% to cephalopods, e.g. Octopus vulgaris. Since we only have a partial sequence with 348 of the expected 410 amino acids for the gene, all conclusions from the present study should be taken with some caution. Once the whole gene is sequenced, it may well be that the number of mutations increases, including non-synonymous replacements, which could further change the protein function and the isoelectric point. This is likely to happen, since a lot of the variation in mollusc Aat appears to be at the end of the coding sequence. Unfortunately, it was not possible to include any other gastropods that inhabit the littoral and eulittoral zones in this analysis, due to the lack of protein and sequence data. Such data would be interesting, because such species may experience desiccation and heat similar to the included Littorina species and thus have a more similar CDS, allowing for a better resolution in the phylogenetic analysis. We also tried to include samples from the small or black periwinkle Melaraphe neritoides, which can sporadically be found on the Swedish west coast.
This species belongs to the family Littorinidae and thus would have been interesting from a phylogenetic point of view. Further, M. neritoides has been shown to be polymorphic for the Aat locus (Johannesson 1992). However, at present we lack a good-quality RNA sample for this species. Further, we have not yet been able to amplify any Aat coding sequence in L. obtusata.
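The distinction between synonymous and non-synonymous mutations used in the Results and Discussion can be sketched as a codon-by-codon comparison of two aligned coding sequences, translated with the standard genetic code. This is a minimal illustration, not the procedure used in Geneious: the function name and the two short sequences are hypothetical, differences are counted per codon rather than per nucleotide, and codons containing gaps or ambiguity codes are not handled.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_differences(ref: str, alt: str):
    """Compare two aligned, in-frame coding sequences codon by codon.

    Returns a list of tuples:
    (codon number, ref codon, alt codon, ref amino acid, alt amino acid, kind)
    where kind is "synonymous" or "non-synonymous".
    """
    if len(ref) != len(alt) or len(ref) % 3 != 0:
        raise ValueError("sequences must be equally long and a multiple of 3")
    differences = []
    for start in range(0, len(ref), 3):
        codon_ref = ref[start:start + 3].upper()
        codon_alt = alt[start:start + 3].upper()
        if codon_ref == codon_alt:
            continue
        aa_ref, aa_alt = CODON_TABLE[codon_ref], CODON_TABLE[codon_alt]
        kind = "synonymous" if aa_ref == aa_alt else "non-synonymous"
        differences.append(
            (start // 3 + 1, codon_ref, codon_alt, aa_ref, aa_alt, kind)
        )
    return differences


# Hypothetical 4-codon fragments (Met-Lys-Ala-Ile versus Met-Gln-Ala-Val).
reference = "ATGAAAGCTATT"
variant = "ATGCAAGCCGTT"
for diff in classify_codon_differences(reference, variant):
    print(diff)
```

In this toy example the third-position change in the alanine codon is synonymous, while the lysine-to-glutamine and isoleucine-to-valine changes are non-synonymous, mirroring the kinds of replacement reported above.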

Future work

In order to arrive at more conclusive results, we need the complete coding sequence for both aspartate aminotransferase allozymes. This requires sequencing the remaining 60-odd amino acids, i.e. designing new primers, cloning and sequencing. I further suggest that we try to include L. obtusata in the analysis, complete the data set with the missing L. saxatilis, L. arcana and L. compressa individuals, and add a number of L. saxatilis of different origin, especially from the Swedish and Iberian populations, to investigate the case of the reversed distribution pattern mentioned above. If possible, M. neritoides will be added as an outgroup to improve phylogenetic resolution.

Another interesting study would be to look into the genetic diversity of Aat100 in Sweden compared to other locations, to see whether the toxic algal bloom that wiped out the lower shore population (Johannesson et al. 1995) created a bottleneck for this allozyme.

References

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403–410

Andersen O, Wetten OF, Rosa MC De, Andre C, Carelli Alinovi C, Colafranceschi M, Brix O, Colosimo A (2009) Haemoglobin polymorphisms affect the oxygen-binding properties in Atlantic cod populations. Proc Biol Sci 276:833–41

Armbruster GFJ (2001) Selection and habitat-specific allozyme variation in the self-fertilizing land snail Cochlicopa lubrica (O. F. Müller). J Nat Hist 35:185–199

Boutet I, Meistertzheim A-L, Tanguy A, Thébault M-T, Moraga D (2005) Molecular characterization and expression of the gene encoding aspartate aminotransferase from the Pacific oyster Crassostrea gigas exposed to environmental stressors. Comp Biochem Physiol Part C Toxicol Pharmacol 140:69–78

Brown WM, Prager EM, Wang A, Wilson AC (1982) Mitochondrial DNA Sequences of Primates: Tempo and Mode of Evolution. J Mol Evol 18:225–239

Butlin RK, Galindo J, Grahame JW (2008) Sympatric, parapatric or allopatric: the most important way to classify speciation? Philos Trans R Soc B Biol Sci 363:2997–3007

Clement M, Posada D, Crandall KA (2000) TCS: a computer program to estimate gene genealogies. Mol Ecol 9:1657–9

Dong Y, Somero GN (2009) Temperature adaptation of cytosolic malate dehydrogenases of limpets (genus Lottia): differences in stability and function due to minor changes in sequence correlate with biogeographic and vertical distributions. J Exp Biol 212:169–77

Futuyma D (2009) Evolution, 2nd edn. Sinauer, Sunderland, Mass. USA

Hochachka PW (1980) Living without oxygen. Harvard University Press, Cambridge, Mass., USA

Huelsenbeck JP, Ronquist F (2001) MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17:754–5

Hull SL, Grahame J, Mill PJ (1999) Heat stability and activity levels of aspartate aminotransferase and alanine aminotransferase in British Littorinidae. J Exp Mar Bio Ecol 237:255–270

Hummel H, Sommer A, Bogaards RH, Pörtner HO (1997) Variation in genetic traits of the lugworm Arenicola marina: temperature related expression of mitochondrial allozymes. Mar Ecol Prog Ser 159:189–195

Janson K, Ward RD (1984) Microgeographic variation in allozyme and shell characters in Littorina saxatilis Olivi (Prosobranchia: Littorinidae). Biol J Linn Soc 22:289–307

Johannesson K (1988) The paradox of Rockall: why is a brooding gastropod (Littorina saxatilis) more widespread than one having a planktonic larval dispersal stage (L. littorea)? Mar Biol 99:507–513

Johannesson K (1992) Genetic variability and large-scale differentiation in two species of littorinid gastropods with planktotrophic development, Littorina littorea (L.) and Melarhaphe (Littorina) neritoides (L.) (Prosobranchia: Littorinacea), with notes on a mass occurrence. Biol J Linn Soc 47:285–299

Johannesson K (2003) Evolution in Littorina: ecology matters. J Sea Res 49:107–117

Johannesson K, Johannesson B (1989) Differences in allele frequencies of Aat between high and mid rocky shore populations of Littorina saxatilis (Olivi) suggest selection in this enzyme locus. Genet Res 54:7–11

Johannesson K, Johannesson B (1990) Genetic variation within Littorina saxatilis (Olivi) and Littorina neglecta Bean: Is L. neglecta a good species? Hydrobiologia 193:89–97

Johannesson K, Johannesson B, Lundgren U (1995) Strong natural selection causes microscale allozyme variation in a marine snail. Proc Natl Acad Sci 92:2602–2606

Johannesson K, Johannesson B, Rolan-Alvarez E (1993) Morphological Differentiation and Genetic Cohesiveness Over a Microenvironmental Gradient in the Marine Snail Littorina saxatilis. Evolution (N Y) 47:1770–1787

Johansson E (2012) A genetic study of the allozyme Aat (aspartate aminotransferase) in the marine periwinkle Littorina saxatilis (Olivi, 1792). University of Gothenburg

Knight AJ, Ward RD (1991) The Genetic Relationships of three Taxa in the Littorina saxatilis Species Complex (Prosobranchia: Littorinidae). J Molluscan Stud 57:81–91

Koehn RK, Zera AJ, Hall JG (1983) Enzyme polymorphism and natural selection. In: Nei M, Koehn RK (eds) Evolution of Genes and Proteins. Sinauer, Sunderland, Mass. USA, p 115–136

McMahon RF (1990) Thermal tolerance, evaporative water loss, air-water oxygen consumption and zonation of intertidal prosobranchs: a new synthesis. Hydrobiologia 193:241–260

McMahon RF, Russell-Hunter WD, Aldridge DW (1995) Lack of metabolic temperature compensation in the intertidal gastropods, Littorina saxatilis (Olivi) and L. obtusata (L.). Hydrobiologia 309:89–100

Panova M, Hollander J, Johannesson K (2006) Site-specific genetic divergence in parallel hybrid zones suggests nonallopatric evolution of reproductive barriers. Mol Ecol 15:4021–4031

Panova M, Johannesson K (2004) Microscale variation in Aat (aspartate aminotransferase) is supported by activity differences between upper and lower shore allozymes of Littorina saxatilis. Mar Biol 144:1157–1164

Passamaneck YJ, Schander C, Halanych KM (2004) Investigation of molluscan phylogeny using large-subunit and small-subunit nuclear rRNA sequences. Mol Phylogenet Evol 32:25–38

Powers DA, Schulte PM (1998) Evolutionary adaptations of gene structure and expression in natural populations in relation to a changing environment: A multidisciplinary approach to address the million-year saga of a small fish. J Exp Zool 282:71–94

Reid DG (1996) Systematics and Evolution of Littorina (TR Society, Ed.). The Dorset Press, Dorchester, Dorset

Reid DG, Rumbak E, Thomas RH (1996) DNA, Morphology and Fossils: Phylogeny and Evolutionary Rates of the Gastropod Genus Littorina. Proc Biol Sci 351:877–895

Sokolova IM, Granovitch AI, Berger VJ, Johannesson K (2000) Intraspecific physiological variability of the gastropod Littorina saxatilis related to the vertical shore gradient in the White and North Seas. Mar Biol 137:297–308

Sokolova I, Pörtner H (2001a) Physiological adaptations to high intertidal life involve improved water conservation abilities and metabolic rate depression in Littorina saxatilis. Mar Ecol Prog Ser 224:171–186

Sokolova I, Pörtner H (2001b) Temperature effects on key metabolic enzymes in Littorina saxatilis and L. obtusata from different latitudes and shore levels. Mar Biol 139:113–126

Theisen BF (1978) Allozyme clines and evidence of strong selection in three loci in Mytilus edulis L. (Bivalvia) from Danish waters. Ophelia 17:135–142

Thorpe JP, Solé-Cava AM (1994) The use of allozyme electrophoresis in invertebrate systematics. Zool Scr 23:3–18

Wheat CW, Watt WB, Pollock DD, Schulte PM (2006) From DNA to fitness differences: sequences and structures of adaptive variants of Colias phosphoglucose isomerase (PGI). Mol Biol Evol 23:499–512

Zaslavskaya NI, Sergievsky SO, Tatarenkov AN (1992) Allozyme similarity of Atlantic and Pacific species of Littorina (Gastropoda: Littorinidae). J Molluscan Stud 58:377–384

Figures and Tables

Table 1: List of sampled species and sample origin

Origin    Species
Sweden    Littorina saxatilis
          Littorina fabalis
          Littorina obtusata
          Littorina littorea
          Melarhaphe neritoides
Iceland   Littorina saxatilis
Norway    Littorina saxatilis
          Littorina arcana
          Littorina compressa
          Littorina fabalis
          Littorina obtusata
Portugal  Littorina saxatilis

Table 2: List of RNA samples

Origin    Species              High/Low  Allozyme  Number of extractions
Sweden    Littorina saxatilis  High      Aat120    2
                               Low       Aat100    2
          Littorina fabalis    n/a       other     1
          Littorina obtusata   n/a       other     1
          Littorina littorea   n/a       other     2
Iceland   Littorina saxatilis  High      Aat120    2
                               Low       Aat100    2
Norway    Littorina saxatilis  High      Aat120    2
                               Low       Aat100    2
          Littorina fabalis    n/a       other     1
          Littorina obtusata   n/a       other     1
          Littorina arcana     High      Aat120    2
                               Low       Aat100    2
          Littorina compressa  Low       Aat100    2
Portugal  Littorina saxatilis  High      Aat100    2
                               Low       Aat120    2


Table 3: List of other species with RNA and/or protein sequences available

Species
Crassostrea gigas
Villosa lienosa
Octopus vulgaris
Mytilus californianus
Lottia gigantea
Aplysia californica
Lymnaea stagnalis
Bithynia siamensis

Table 4: List of primers and primer sequences

Primer Name     Sequence
Aat_F10         catttgcgcagacttcc
Aat_F11         tttttgagacggccccta
Aat_F12         agacgaggggaagccatgggt
Aat_F13         caaatctcagcagacatgac
Aat_F14         cttgaaccacgaatatctgc
Aat_F6          tacaacgagagaattggcaa
Aat_F7          gcctactttgaggaatggaa
Aat_F8          gacagaagaaggtgccag
Aat_F9          tattcatcggcaacatggcg
Aat_Lsax_R1     acgttcttggtggtcagtcc
Aat_Lsax_R5     gccaatctgctggaggatgt
Aat_R10         gctcctccattcctcaaagt
Aat_R6          cccaacatctctcattcaga
Aat_R7          ccaatcaaataactgcagct
Aat_R8          cttcagttcatagcgcaatg
Aat_R9          cctggcaccttcttctgtc
Aat_Race_F      gatttgctactggcgatctggaagc
Aat_Race_F1     agacgcacaagttgtacgctacttt
Aat_Race_FN     gagaggctttgagctccttatcgc
Aat_Race_FN1    gtcaagagaggctttgagctcttta
Aat_Race_R      ttctccacagtcttgaccattgggag
Aat_Race_R1     ggttcaaggtcatgtctgctgaga
Aat_Race_RN     gccatgttgccgatgaatatccctg
Aat_Race_RN1    ggcccatcttcacgttactgaatat
Aat_sax_F1      cctgtgtccttgacctcaca
Aat_sax_F2      acagggatattcatcggcaac
Aat_sax_F2A     acagggattattcatcggcaac
Aat_sax_F5      ttcagcttcaccggactcac
Aat_sax_R2      gcctcaacgctgtacatgta
Aat_sax_R3      cgactgggcgataaagagct
Aat_sax_R4      cagtccaacagggggaacag
Aat-Race-F2     caacaagctgcgacagaagaaggtg
Aat-Race-F3     atggccaaccgtatcaaggagatgc
Aat-Race-F4     cccggcctactttgaggaatgggag
GeneRacer_3'    gctgtcaccgatacgctacgtaacg
GeneRacer_3'N   cgctacgtaacggcatgacagtg
GeneRacer_5'    cgactggagcacgaggacactga
GeneRacer_5'N   ggacactgacatggactgaaggagta
LSactine-F      atgcccccagggccgtcttc
LSactine-R      cagtgaggtcacggccagcg
Race3Seq        cgtaacggcatgacagtg
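When working with a primer list like Table 4, it is often useful to check how forward and reverse primers relate to each other and to get a rough idea of their melting temperatures. The following Python sketch is offered under stated assumptions and is not a procedure described in this thesis: the helper names are hypothetical, and the melting temperature uses the simple Wallace rule, Tm = 2(A+T) + 4(G+C) °C, which is only a coarse estimate for short oligonucleotides. The two sequences are Aat_F8 and Aat_R9 from the table.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    s = seq.lower()
    return 100.0 * (s.count("g") + s.count("c")) / len(s)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule.

    Assumption: this rule is only a first approximation for short primers.
    """
    s = seq.lower()
    at = s.count("a") + s.count("t")
    gc = s.count("g") + s.count("c")
    return 2 * at + 4 * gc


# Primer sequences copied from Table 4.
aat_f8 = "gacagaagaaggtgccag"
aat_r9 = "cctggcaccttcttctgtc"

print("Aat_F8 GC%:", round(gc_percent(aat_f8), 1))
print("Aat_F8 Wallace Tm:", wallace_tm(aat_f8))
print("revcomp(Aat_R9):", reverse_complement(aat_r9))
print("Aat_R9 overlaps Aat_F8:", reverse_complement(aat_r9).startswith(aat_f8))
```

Run on these two entries, the reverse complement of Aat_R9 begins with the Aat_F8 sequence, which suggests that the two primers bind the same stretch of template on opposite strands.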


Figure 1: Haplotype network based on aspartate aminotransferase nucleotide sequences for 19 Littorina spp. samples. Yellow: Aat120; turquoise: Aat120; red: Aat100; empty dots: synonymous mutations; green dots: non-synonymous mutations.
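The distinction between synonymous and non-synonymous mutations in Figure 1 can be made concrete with a short sketch. The Python code below is an illustration only, built on the standard genetic code; the function name classify_substitution is hypothetical, and the example codons are assumptions chosen to be consistent with the E/Q and V/I replacements seen in the amino acid alignment. They are not taken from the sequenced data.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_before: str, codon_after: str) -> str:
    """Label a codon change as synonymous or non-synonymous."""
    before = CODON_TABLE[codon_before.upper()]
    after = CODON_TABLE[codon_after.upper()]
    if before == after:
        return "synonymous"
    return f"non-synonymous ({before}->{after})"


# Hypothetical codon changes, for illustration only.
print(classify_substitution("GTT", "GTC"))  # third-position change within valine
print(classify_substitution("GAA", "CAA"))  # glutamate to glutamine
print(classify_substitution("GTT", "ATT"))  # valine to isoleucine
```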


Results colour-coded for amino acid conservation

The conservation scoring is performed by PRALINE. The scoring scheme works from 0 for the least conserved alignment position, up to 10 for the most conserved alignment position.

                ........10........20........30........40........50
6_LsHtro120     TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNENKVNLGVGAYRTDE
5_LsHtro120     TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNENKVNLGVGAYRTDE
2-1_LsHswe120   TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNENKVNLGVGAYRTDE
2-2_LsHswe120   TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNENKVNLGVGAYRTDE
22_LsLice120    TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNENKVNLGVGAYRTDE
9_LsHice120     TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNENKVNLGVGAYRTDE
8-1_LaHtro120   TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNENKVNLGVGAYRTDE
8-2_LaHtro120   TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNENKVNLGVGAYRTDE
16_LaHtro120    TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNENKVNLGVGAYRTDE
27_LsLpor120    TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNQNKVNLGVGAYRTDE
15-1_LsHpor100  TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNQNKVNLGVGAYRTDE
29_LsLpor100    TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNQNKVNLGVGAYRTDE
4_Lc_100        TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNQNKVNLGVGAYRTDE
7_LsLtro100     TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNQNKVNLGVGAYRTDE
17_LaLtro100    TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNQNKVNLGVGAYRTDE
26-1_LsLtro100  TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNQNKVNLGVGAYRTDE
26-2_LsLtro100  TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNQNKVNLGVGAYRTDE
15-2_LsHpor120  TGIFIGNMASIFSNVKMGPTIEVFALTKRFSEDTNQNKVNLGVGAYRTDE
Consistency     ***********************************8**************

                ........60........70........80........90.......100
6_LsHtro120     GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
5_LsHtro120     GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
2-1_LsHswe120   GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
2-2_LsHswe120   GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
22_LsLice120    GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
9_LsHice120     GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
8-1_LaHtro120   GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
8-2_LaHtro120   GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
16_LaHtro120    GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
27_LsLpor120    GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
15-1_LsHpor100  GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
29_LsLpor100    GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
4_Lc_100        GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
7_LsLtro100     GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
17_LaLtro100    GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
26-1_LsLtro100  GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
26-2_LsLtro100  GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
15-2_LsHpor120  GKPWVLPMVKTVENQISADMTLNHEYLPVAGLPAFRTAASALILGKDNPA
Consistency     **************************************************

                .......110.......120.......130.......140.......150
6_LsHtro120     ILENRVEGVQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
5_LsHtro120     ILENRVEGVQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
2-1_LsHswe120   ILENRVEGVQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
2-2_LsHswe120   ILENRVEGVQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
22_LsLice120    ILENRVEGVQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
9_LsHice120     ILENRVEGVQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
8-1_LaHtro120   ILENRVEGVQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
8-2_LaHtro120   ILENRVEGVQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
16_LaHtro120    ILENRVEGVQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
27_LsLpor120    ILENRVEGIQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
15-1_LsHpor100  ILENRVEGIQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
29_LsLpor100    ILENRVEGIQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
4_Lc_100        ILENRVEGIQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
7_LsLtro100     ILENRVEGIQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
17_LaLtro100    ILENRVEGIQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
26-1_LsLtro100  ILENRVEGIQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
26-2_LsLtro100  ILENRVEGIQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
15-2_LsHpor120  ILENRVEGIQVQGGTGGIRLTAEFLKRNFGSDVVYVSKPTWGNHKTVFSH
Consistency     ********9*****************************************

                .......160.......170.......180.......190.......200
6_LsHtro120     AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
5_LsHtro120     AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
2-1_LsHswe120   AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
2-2_LsHswe120   AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
22_LsLice120    AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
9_LsHice120     AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
8-1_LaHtro120   AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
8-2_LaHtro120   AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
16_LaHtro120    AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
27_LsLpor120    AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
15-1_LsHpor100  AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
29_LsLpor100    AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
4_Lc_100        AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
7_LsLtro100     AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
17_LaLtro100    AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
26-1_LsLtro100  AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
26-2_LsLtro100  AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
15-2_LsHpor120  AGFSQIKEYRYWDAKNLGVDFEGLKEDLQNAPEGACVILHAVAHNPTGND
Consistency     **************************************************

                .......210.......220.......230.......240.......250
6_LsHtro120     PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
5_LsHtro120     PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
2-1_LsHswe120   PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
2-2_LsHswe120   PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
22_LsLice120    PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
9_LsHice120     PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
8-1_LaHtro120   PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
8-2_LaHtro120   PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
16_LaHtro120    PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
27_LsLpor120    PTMDQWAQIADIIETRQLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
15-1_LsHpor100  PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
29_LsLpor100    PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
4_Lc_100        PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
7_LsLtro100     PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
17_LaLtro100    PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
26-1_LsLtro100  PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
26-2_LsLtro100  PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
15-2_LsHpor120  PTMDQWAQIADIIETRQLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE
Consistency     **************8*9*********************************

                .......260.......270.......280.......290.......300
6_LsHtro120     LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
5_LsHtro120     LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
2-1_LsHswe120   LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
2-2_LsHswe120   LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
22_LsLice120    LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
9_LsHice120     LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
8-1_LaHtro120   LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
8-2_LaHtro120   LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
16_LaHtro120    LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
27_LsLpor120    LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
15-1_LsHpor100  LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
29_LsLpor100    LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
4_Lc_100        LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
7_LsLtro100     LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
17_LaLtro100    LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
26-1_LsLtro100  LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
26-2_LsLtro100  LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
15-2_LsHpor120  LFIAQSFSKNFGLYNERIGNLCIVTKTPDVIPQIRSQMEIIARTMWSNPS
Consistency     **************************************************

                .......310.......320.......330.......340
6_LsHtro120     HHGARIVAMVLANPAYFEEWKSQVATMANRIKEMRQMMFNKLRQKKVP
5_LsHtro120     HHGARIVAMVLANPAYFEEWKSQVATMANRIKEMRQMMFNKLRQKKVP
2-1_LsHswe120   HHGARIVAMVLANPAYFEEWKSQVATMANRIKEMRQMMFNKLRQKKVP
2-2_LsHswe120   HHGARIVAMVLANPAYFEEWKSQVATMANRIKEMRQMMFNKLRQKKVP
22_LsLice120    HHGARIVAMVLANPAYFEEWKSQVATMANRIKEMRQMMFNKLRQKKVP
9_LsHice120     HHGARIVAMVLANPAYFEEWKSQVATMANRIKEMRQMMFNKLRQKKVP
8-1_LaHtro120   HHGARIVAMVLANPAYFEEWKSQVATMANRIKEMRQMMFNKLRQKKVP
8-2_LaHtro120   HHGARIVAMVLANPAYFEEWKSQVATMANRIKEMRQMMFNKLRQKKVP
16_LaHtro120    HHGARIVAMVLANPAYFEEWKSQVATMANRIKEMRQMMFNKLRQKKVP
27_LsLpor120    HHGARIVAMVLANPAYFEEWKSQVATMANRIKEMRQMMFNKLRQKKVP
15-1_LsHpor100  HHGARIVAMVLANPAYFEEWKSQVATMANRIKEMRQMMFNKLRQKKVP
29_LsLpor100    HHGARIVAMVLANPAYFEEWKSQVATMANRIKEMRQMMFNKLRQKKVP
4_Lc_100        HHGARIVAMVLANPAYFEEWKSQVATMANRIKEMRQMMFNKLRQKKVP
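The variable positions in an alignment like the one above can be listed programmatically. The Python sketch below is illustrative only and is not part of the analysis described in this thesis; the function name variable_sites and its start parameter are hypothetical. The input is the block covering residues 201-250 for three of the samples, copied from the alignment.

```python
def variable_sites(alignment: dict[str, str], start: int = 1) -> list[tuple[int, str]]:
    """List alignment columns that contain more than one residue.

    alignment maps sample names to equally long aligned sequences;
    start is the alignment position of the first column.
    Returns (position, residues) pairs, residues sorted alphabetically.
    """
    seqs = list(alignment.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    sites = []
    for offset, column in enumerate(zip(*seqs)):
        residues = "".join(sorted(set(column)))
        if len(residues) > 1:
            sites.append((start + offset, residues))
    return sites


# Residues 201-250 for three samples, copied from the alignment.
block_201_250 = {
    "6_LsHtro120": "PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE",
    "27_LsLpor120": "PTMDQWAQIADIIETRQLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE",
    "15-1_LsHpor100": "PTMDQWAQIADIIEARKLFPLLDCAYQGFATGDLEADAQVVRYFVKRGFE",
}

for position, residues in variable_sites(block_201_250, start=201):
    print(position, residues)
```

For this block the function reports positions 215 (A/T) and 217 (K/Q), the two columns that score below the maximum in the Consistency row.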


                . . . . . . . . 210 . . . . . . . . 220 . . . . . . . . 230 . . . . . . . . 240 . . . . . . . . 250
6_LsHtro120     P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
5_LsHtro120     P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
2-1_LsHswe120   P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
2-2_LsHswe120   P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
22_LsLice120    P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
9_LsHice120     P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
8-1_LaHtro120   P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
8-2_LaHtro120   P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
16_LaHtro120    P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
27_LsLpor120    P T M D Q W A Q I A D I I E T R Q L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
15-1_LsHpor100  P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
29_LsLpor100    P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
4_Lc_100        P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
7_LsLtro100     P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
17_LaLtro100    P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
26-1_LsLtro100  P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
26-2_LsLtro100  P T M D Q W A Q I A D I I E A R K L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
15-2_LsHpor120  P T M D Q W A Q I A D I I E T R Q L F P L L D C A Y Q G F A T G D L E A D A Q V V R Y F V K R G F E
Consistency     * * * * * * * * * * * * * * 8 * 9 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

                . . . . . . . . 260 . . . . . . . . 270 . . . . . . . . 280 . . . . . . . . 290 . . . . . . . . 300
6_LsHtro120     L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
5_LsHtro120     L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
2-1_LsHswe120   L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
2-2_LsHswe120   L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
22_LsLice120    L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
9_LsHice120     L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
8-1_LaHtro120   L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
8-2_LaHtro120   L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
16_LaHtro120    L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
27_LsLpor120    L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
15-1_LsHpor100  L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
29_LsLpor100    L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
4_Lc_100        L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
7_LsLtro100     L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
17_LaLtro100    L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
26-1_LsLtro100  L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
26-2_LsLtro100  L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
15-2_LsHpor120  L F I A Q S F S K N F G L Y N E R I G N L C I V T K T P D V I P Q I R S Q M E I I A R T M W S N P S
Consistency     * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

                . . . . . . . . 310 . . . . . . . . 320 . . . . . . . . 330 . . . . . . . . 340 . . . . . . . .
6_LsHtro120     H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
5_LsHtro120     H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
2-1_LsHswe120   H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
2-2_LsHswe120   H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
22_LsLice120    H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
9_LsHice120     H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
8-1_LaHtro120   H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
8-2_LaHtro120   H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
16_LaHtro120    H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
27_LsLpor120    H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
15-1_LsHpor100  H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
29_LsLpor100    H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
4_Lc_100        H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
7_LsLtro100     H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
17_LaLtro100    H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
26-1_LsLtro100  H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
26-2_LsLtro100  H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
15-2_LsHpor120  H H G A R I V A M V L A N P A Y F E E W K S Q V A T M A N R I K E M R Q M M F N K L R Q K K V P
Consistency     * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

The current colour scheme of the alignment is for amino acid conservation. The conservation scoring is performed by PRALINE. The scoring scheme works from 0 for the least conserved alignment position, up to 10 for the most conserved alignment position. The colour assignments are:

Unconserved 0 1 2 3 4 5 6 7 8 9 10 Conserved

Figure 2: Alignment of aspartate aminotransferase amino acid sequences of 19 Littorina samples. “Conserved” refers to the properties of the amino acid, e.g. polarity, hydrophilicity, etc.

                . . . . . . . .  10 . . . . . . . .  20 . . . . . . . .  30 . . . . . . . .  40 . . . . . . . .  50
6_LsHtro120     T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N E N K V N L G V G A Y R T D E
5_LsHtro120     T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N E N K V N L G V G A Y R T D E
2-1_LsHswe120   T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N E N K V N L G V G A Y R T D E
2-2_LsHswe120   T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N E N K V N L G V G A Y R T D E
22_LsLice120    T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N E N K V N L G V G A Y R T D E
9_LsHice120     T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N E N K V N L G V G A Y R T D E
8-1_LaHtro120   T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N E N K V N L G V G A Y R T D E
8-2_LaHtro120   T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N E N K V N L G V G A Y R T D E
16_LaHtro120    T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N E N K V N L G V G A Y R T D E
27_LsLpor120    T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N Q N K V N L G V G A Y R T D E
15-1_LsHpor100  T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N Q N K V N L G V G A Y R T D E
29_LsLpor100    T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N Q N K V N L G V G A Y R T D E
4_Lc_100        T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N Q N K V N L G V G A Y R T D E
7_LsLtro100     T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N Q N K V N L G V G A Y R T D E
17_LaLtro100    T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N Q N K V N L G V G A Y R T D E
26-1_LsLtro100  T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N Q N K V N L G V G A Y R T D E
26-2_LsLtro100  T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N Q N K V N L G V G A Y R T D E
15-2_LsHpor120  T G I F I G N M A S I F S N V K M G P T I E V F A L T K R F S E D T N Q N K V N L G V G A Y R T D E
Consistency     * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * 8 * * * * * * * * * * * * * *

                . . . . . . . .  60 . . . . . . . .  70 . . . . . . . .  80 . . . . . . . .  90 . . . . . . . . 100
6_LsHtro120     G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
5_LsHtro120     G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
2-1_LsHswe120   G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
2-2_LsHswe120   G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
22_LsLice120    G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
9_LsHice120     G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
8-1_LaHtro120   G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
8-2_LaHtro120   G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
16_LaHtro120    G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
27_LsLpor120    G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
15-1_LsHpor100  G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
29_LsLpor100    G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
4_Lc_100        G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
7_LsLtro100     G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
17_LaLtro100    G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
26-1_LsLtro100  G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
26-2_LsLtro100  G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
15-2_LsHpor120  G K P W V L P M V K T V E N Q I S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A
Consistency     * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

                . . . . . . . . 110 . . . . . . . . 120 . . . . . . . . 130 . . . . . . . . 140 . . . . . . . . 150
6_LsHtro120     I L E N R V E G V Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
5_LsHtro120     I L E N R V E G V Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
2-1_LsHswe120   I L E N R V E G V Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
2-2_LsHswe120   I L E N R V E G V Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
22_LsLice120    I L E N R V E G V Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
9_LsHice120     I L E N R V E G V Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
8-1_LaHtro120   I L E N R V E G V Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
8-2_LaHtro120   I L E N R V E G V Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
16_LaHtro120    I L E N R V E G V Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
27_LsLpor120    I L E N R V E G I Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
15-1_LsHpor100  I L E N R V E G I Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
29_LsLpor100    I L E N R V E G I Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
4_Lc_100        I L E N R V E G I Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
7_LsLtro100     I L E N R V E G I Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
17_LaLtro100    I L E N R V E G I Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
26-1_LsLtro100  I L E N R V E G I Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
26-2_LsLtro100  I L E N R V E G I Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
15-2_LsHpor120  I L E N R V E G I Q V Q G G T G G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H
Consistency     * * * * * * * * 9 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

                . . . . . . . . 160 . . . . . . . . 170 . . . . . . . . 180 . . . . . . . . 190 . . . . . . . . 200
6_LsHtro120     A G F S Q I K E Y R Y W D A K N L G V D F E G L K E D L Q N A P E G A C V I L H A V A H N P T G N D

Figure 3: Phylogenetic tree based on aspartate aminotransferase nucleotide sequences for 19 Littorina spp. samples. Yellow Aat120, Turquoise Aat120 and Red Aat100

Figure 4: Phylogenetic tree based on aspartate aminotransferase amino acid sequences for 19 Littorina spp. samples. Yellow Aat120, Turquoise Aat120 and Red Aat100


The current colour scheme of the alignment is for amino acid conservation.

The conservation scoring is performed by PRALINE. The scoring scheme works from 0 for the least conserved alignment position, up to 10 for the most conserved alignment position. The colour assignments are:

Unconserved 0 1 2 3 4 5 6 7 8 9 10 Conserved

                . . . . . . . .  10 . . . . . . . .  20 . . . . . . . .  30 . . . . . . . .  40 . . . . . . . .  50
Littorina_litto A A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A I M E N R V E G V Q V Q G G T G
Bythnia_siamens A A D A T L N H E Y L P V A G L P V F R T L A N A L I L G K D N P A I L E N R V E G I Q V Q G G T G
Littorina_saxat S A D M T L N H E Y L P V A G L P A F R T A A S A L I L G K D N P A I L E N R V E G V Q V Q G G T G
Aplysia_califor A A D L T L N H E Y L P V A G M P D F R T S A T K I L L G T N H A A I S E N R T E S F Q A L G G T G
Lymnaea_stagnal A N D L T L N H E Y L P V A G L P D F R I A A T Q L L L G S N H T V I A E N K V E S F Q A L G G T G
Octopus_vulgari A S D P T L N H E Y L P I L G F P D F R S A A I R L L L G E G N P A I V E N R V V G V Q S L G G T G
Mytilus_califor R G D A G L N H E Y L P I S G L P D Y R T A C Q K L L L G S E S P A I Q S S R V E S F Q A C G G T G
Lottia_gigantea A N D A T L N H E Y L P V A G L P T Y R E A A A R L L L G E G H K A L V E N R V E G V Q T L G G T G
Villosa_lienosa A T D P T L N H E Y L P V A G L P D F R A A A V R L L L G E D N P A L V E N R V E G V Q A L G G T G
Crassostrea_gig A T D V T L N H E Y L P V A G M P D F R L A A L R L L L G E D S P A I V E N R V E G V Q A I G G R G
Consistency     8 5 * 4 8 * * * * * * * 9 8 * 7 * 5 8 * 5 8 8 4 5 9 8 * * 5 5 6 6 9 9 5 8 9 9 8 8 7 6 * 6 5 * * 8 *

                . . . . . . . .  60 . . . . . . . .  70 . . . . . . . .  80 . . . . . . . .  90 . . . . . . . . 100
Littorina_litto G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F M H A G F S Q I K E Y R Y W D A K N
Bythnia_siamens G I R L A A E F L K R N L Q S D V V Y V S N P T W E N H K T I F S H A G F S Q I R E Y R Y W D A K N
Littorina_saxat G I R L T A E F L K R N F G S D V V Y V S K P T W G N H K T V F S H A G F S Q I K E Y R Y W D A K N
Aplysia_califor A L R L A A A F L G N V L G F K V V Y V S N P T W G N H K G I F K S A G F S X I K E Y R Y W D N A N
Lymnaea_stagnal A L R L G A A F L K N V M N F N V V Y V S D P T W E N H K G V F Q A A G F S D V R L Y R Y W D S E S
Octopus_vulgari A L R L A A D F Y R T I N K A D T V Y V S S P T W G N H K G V F M A A G V T N I K E Y R Y W D A E N
Mytilus_califor A I R L A A D F L K R L M N Y D C V Y V S K P T W G N H K T I F K R S G F S D V R E Y R Y W H P E T
Lottia_gigantea A I R L A A E F L K Q F L N K S V V Y V S K P T W G N H K G I F K A A G Y A K I E E Y R Y W D D A N
Villosa_lienosa A L R L A A D F L K I V K K C D N V I M S N P T W E N H R M I F K N C G Y G N I K Q H R Y W N A E M
Crassostrea_gig G I R L C A D F C K K M L G D D S M Y T S S P T W G N H L G I F K S C G Y S N V K Q Y R Y W D A Q N
Consistency     7 8 * * 6 * 6 * 7 7 4 4 4 4 3 6 5 9 8 8 * 5 * * * 6 * * 7 4 9 * 5 4 7 * 7 7 - 2 1 4 7 4 8 3 6 4 8 9 7 7 9 * * * 7 5 5 6

                . . . . . . . . 110 . . . . . . . . 120 . . . . . . . . 130 . . . . . . . . 140 . . . . . . . . 150
Littorina_litto L G V D F E G L K E D L Q N A P E G A C V I L H A V A H N P T G N D P T M E Q W K A I A D I I E D R
Bythnia_siamens R G V D I T G F L E D L Q N A P E G A I V V L H A V A H N P T G N D P T P A D W E K I A D V M E A K
Littorina_saxat L G V D F E G L K E D L Q N A P E G A C V I L H A V A H N P T G N D P T M D Q W A Q I A D I I E A R
Aplysia_califor R S L D F S G M M E D L S N A P E N A V V I L H G V A H N P T G V D P T P D Q W K T I C Q T C I D K
Lymnaea_stagnal R G L N F R G M C E D L S N A P E N A V I I L H G V A H N P T G I D P T K E Q W E A I A N I C K E K
Octopus_vulgari R S L D I N G M V E D L R A A K E N S V V I L H A V A H N P T G V D P T Q E Q W K Q I A D V C E E R
Mytilus_califor K S L D F Q G M L E D L N N A P E R S V V I L H G V A H N P T G V D P T K D Q W V E I I K T C Q A K
Lottia_gigantea R C L N L S G M L E D L K G A P E N A V V I L H T C A H N P T G T D P T Q D Q W K E I I Q V C Q D K
Villosa_lienosa K G L D L S G M L E D I K A M P D E T V V V L H A C A H N P T G V D P T Q E Q W K K I A D M C E Q K
Crassostrea_gig R T I D F N G M M E D L N A A P E K T V V I L H G C C H N P T G V N P T E E Q L K E I G D L V E R K
Consistency     6 5 8 8 6 5 * 7 5 * * 9 5 5 8 8 9 4 7 7 9 9 * * 6 6 8 * * * * * 5 9 * * 4 6 8 8 6 5 * 5 6 6 5 6 4 8

                . . . . . . . . 160 . . . . . . . . 170 . . . . . . . . 180 . . . . . . . . 190 . . . . . . . . 200
Littorina_litto K L F P L L D C A Y Q G F A T G D L E A D A Q V V R - Y F V N R X G I E L F I A Q S F S K N F G L Y
Bythnia_siamens R L F P L L D C A Y Q G F A S G D L E A D A Q T C R - Y F V K R X G F E V F I S Q S F S K N F G L Y
Littorina_saxat K L F P L L D C A Y Q G F A T G D L E A D A Q V V R - Y F V K R X G F E L F I A Q S F S K N F G L Y
Aplysia_califor K L F V L I D C A Y Q G F A S G D L D Q D A F A S R - Y F A D Q X R M D M F V A Q S F S K N F G L Y
Lymnaea_stagnal K L F V F L D C A Y Q G F A S G N L D E D G W A S R - Y F A D Q X D L D L F V A Q S F S K N F G L Y
Octopus_vulgari K A L V L M D I A Y Q G F A T G D L V A D A W A P R - Y F A S R X G F E F L T A Q S F S K N F G L Y
Mytilus_califor N I F I L V D I A Y Q G F A T G D L D A D A H L P R - Y L X D R N N I E F M A A Q S F S K N F G L Y
Lottia_gigantea K I F M L M D C A Y Q G F A S G D L D K D A F P V R - Y M A E Q X G L E F F V A Q S F S K N F G L Y
Villosa_lienosa K A Y V L M D I A Y Q G F T S G D I E R D S W A V R - Y F V N R X G F E L F A S Q S F S K N F G L Y
Crassostrea_gig K F M L L V D E A Y Q G S L L V T W S R M E T L F D I S S K R X - G F E F F V S Q S F S K N F G L Y
Consistency     8 6 7 5 8 7 * 5 * * * * 8 7 6 8 7 7 6 5 8 7 3 5 3 8 0 8 6 - 2 1 4 7 4 8 3 6 4 8 5 - 2 1 4 7 4 8 3 6 4 8 - 2 1 4 7 4 8 3 6 4 8 6 6 8 6 7 6 8 * * * * * * * * * *

                . . . . . .
Littorina_litto N E R V G N
Bythnia_siamens N E R I G N
Littorina_saxat N E R I G N
Aplysia_califor N E R T G N
Lymnaea_stagnal N E R T G N
Octopus_vulgari N E R I G N
Mytilus_califor N E R T G N
Lottia_gigantea N E R I G N
Villosa_lienosa N E R V G N
Crassostrea_gig N E R T G N
Consistency     * * * 6 * *


Figure 5: Alignment of aspartate aminotransferase amino acid sequences of 10 mollusc species. “Conserved” refers to the properties of the amino acid, e.g. polarity, hydrophilicity, etc.
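The alignments above are read mainly for the columns where the sequences differ. A minimal sketch of that reading step follows, assuming the aligned rows are available as equal-length strings. The function name `variable_columns`, the `start` offset and the three short fragments are illustrative choices and not part of the original analysis; the fragments are the first ten residues of the block labelled 110 to 150 in Figure 2, taken here to begin at position 101.

```python
from collections import defaultdict


def variable_columns(alignment, start=1):
    """Return {position: {residue: [sample names]}} for every column
    in which the aligned sequences are not all identical.

    `alignment` maps sample name -> aligned sequence (equal lengths);
    `start` is the position number assigned to the first column.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("expected a non-empty alignment of equal-length sequences")

    sites = {}
    # zip(*...) walks the alignment column by column.
    for offset, column in enumerate(zip(*alignment.values())):
        if len(set(column)) > 1:
            groups = defaultdict(list)
            for name, residue in zip(alignment, column):
                groups[residue].append(name)
            sites[start + offset] = dict(groups)
    return sites


# Hypothetical input: three rows copied from Figure 2, first ten columns only.
example = {
    "6_LsHtro120": "ILENRVEGVQ",
    "27_LsLpor120": "ILENRVEGIQ",
    "7_LsLtro100": "ILENRVEGIQ",
}
sites = variable_columns(example, start=101)
for position, groups in sites.items():
    print(position, groups)
```

This only reports identity differences between rows; it does not reproduce the property-based conservation scores that PRALINE assigns.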

Figure 6: Phylogenetic tree based on aspartate aminotransferase amino acid sequences for 10 mollusc species


Figure 7: Alignment of predicted coding sequence with all primers and amplified and analysed region (black bars)
