CHARACTERISATION OF DUPLICATED HAEMOGLOBIN GENES IN BIVALVES

Mathilde Klein

Bachelor of Medical Laboratory Science, QUT

Submitted in fulfilment of the requirements for the degree of

Master of Applied Science (Research)

School of Biomedical Sciences, IHBI

Faculty of Health

Queensland University of Technology

2017

Keywords

Arcoida, Bivalves, Gene duplication, Genomics, Haemoglobin, Limoida, Molluscs, Transcriptome

Abstract

Haemoglobins (Hbs) are found in virtually all phyla and are some of the most investigated proteins in biomedical sciences. These proteins exhibit an extraordinary diversity of form and function in invertebrate lineages. This provides a unique opportunity to explore the origin and evolution of Hbs, yet little is known about their distribution, function and evolution in invertebrates. To further explore the functions and evolution of these Hbs, recent transcriptome data for the arcid bivalve Anadara trapezia are investigated here. This species possesses duplicated Hb encoding genes, suggesting that gene duplication may have been more extensive in bivalves than previously thought. This study tests the hypothesis that these duplicated genes show patterns of tissue specific expression and evidence of neofunctionalisation. This is shown here for at least three Hb encoding genes present in A. trapezia, which show strong tissue specific expression in haemolymph compared to other tissues. Furthermore, the expression of these genes remains unaffected by prolonged air exposure, suggesting that neofunctionalisation may confer an evolutionary advantage to this bivalve. As well as the unique Hbs found in the bivalve order Arcoida, Hbs are also found in three other bivalve orders: Carditoida, Solemyoida and Veneroida. These four orders provide compelling evidence for the independent evolution of these proteins in multiple bivalve lineages. To expand data on the distribution of Hbs in bivalves, a transcriptome sequence for Ctenoides ales in the order Limoida was generated in this project. Interrogation of the transcriptome shows the presence of at least three globin-like encoding genes, including two Hb-like encoding genes, providing preliminary evidence for another independent origin of Hb in a bivalve lineage. Overall, this study provides novel insights into the function, evolution and distribution of Hbs in bivalves by investigating two distantly related species.

Results of this study are consistent with current theories that Hb diversity in bivalves is a result of repeated rounds of gene duplication providing the raw material for evolution. Investigation of hypoxic resistance also reinforces that greater expression of Hbs in haemolymph confers a physiological advantage, suggesting that Hb would evolve more often in some lineages during adaptation to unfavourable environmental conditions, particularly hypoxia and prolonged air exposure. The finding of Hb-like encoding genes in another bivalve lineage also supports the evolution of this gene family through independent evolution and gene duplication, and gives insight into the distribution of globin genes in bivalves, which is still poorly understood. The investigation of Hb genes in these bivalves also contributes to a better understanding of the role of Hbs and provides potential novel insights into resistance to hypoxic environments, disease control and resistance to pollution in aquaculture.

Table of Contents

Keywords
Abstract
List of Figures
List of Tables
List of Abbreviations
Statement of Original Authorship
Acknowledgements

Chapter 1: Introduction
1.1 Functional and structural diversity within the globin superfamily
1.1.1 Recently discovered globin genes
1.1.2 Myoglobins and haemoglobins
1.2 Evolution of haemoglobins
1.2.1 Gene duplication
1.2.2 Divergent evolution of duplicated genes
1.2.3 Convergent evolution
1.2.4 Invertebrate haemoglobins
1.3 Bivalve haemoglobins
1.3.1 Haemoglobins in the family Arcidae, Pteriomorphia subclass
1.4 Aims
1.5 Thesis Outline

Chapter 2: Tissue specificity and neofunctionalisation of haemoglobin genes in Anadara trapezia
2.1 Background
2.2 Material and Methods
2.2.1 Sample acquisition and tissue dissection
2.2.2 RNA extraction and cDNA synthesis
2.2.3 RT–PCR (Real Time PCR)
2.2.4 Candidate gene validation
2.2.5 RT-qPCR (Real Time quantitative PCR) for quantification of gene expression
2.2.6 RT-qPCR data analysis
2.3 Results
2.3.1 Haemolymph analysis
2.3.2 RT-PCR
2.3.3 Candidate gene validation
2.3.4 RT-qPCR for quantification of gene expression
2.4 Discussion
2.4.1 Haemolymph characteristics
2.4.2 Tissue specific expression and neofunctionalisation
2.5 Conclusion

Chapter 3: Functional annotation of the Ctenoides ales transcriptome
3.1 Background
3.2 Materials and Methods
3.2.1 Sample collection
3.2.2 RNA extraction, library preparation and sequencing
3.2.3 Transcriptome assembly and validation
3.2.4 Functional annotation of transcripts and mapping
3.2.5 Comparative transcriptomics
3.2.6 Candidate genes identification
3.2.7 Primer design and candidate gene validation
3.2.8 Phylogenetic analysis of sequences
3.3 Results
3.3.1 RNA extraction, library preparation and sequencing
3.3.2 Transcriptome assembly and validation
3.3.3 Functional annotation of transcripts and mapping
3.3.4 Comparative transcriptomics
3.3.5 Candidate genes
3.3.6 Primer design and candidate gene validation
3.3.7 Phylogeny of globins in bivalves
3.4 Discussion
3.5 Conclusion

Chapter 4: General Discussion
4.1 Role of gene duplication in the current diversity of bivalve Hbs
4.2 Globin gene evolution in hypoxic environments
4.3 The importance of globin gene evolution to aquaculture
4.4 Limitations of the study
4.5 Future research
4.6 Conclusions

References
Appendices
Appendix A: Poster presented at the annual Lorne Genome conference 2015

List of Figures

Figure 1.1. Maximum likelihood tree highlighting relationships between Hbs of jawed (gnathostomes) and jawless (agnathans) vertebrates. In this phylogeny, vertebrate-specific globins are grouped into two distinct clades: (i) Cyclostome Hbs + Cygb + Mb + GbE + GbY, (ii) β-Hbs + α-Hbs (Hoffmann et al., 2010a).

Figure 1.2. Comparison of chromosomal organization of α and β globin gene clusters in avian and mammalian taxa. The α and β globin genes represented here encode the α and β subunits of a tetrameric haemoglobin (α2β2) (Zhang et al., 2014).

Figure 1.3. Simplified phylogeny representing evolutionary relationships between major metazoan taxa; some lesser known phyla are not included for simplicity or due to unclear relationships. Taxa in which Hbs have been found are boxed in red. This phylogeny illustrates the independent evolution of Hbs through their presence in the 11 major phyla shown here, including Nematodes, Nemertines, Mollusks, Echiurans, Pogonophorans, Phoronids, Platyhelminthes, Echinoderms and Chordates. Adapted from Halanych & Passamaneck (2001).

Figure 1.4. Gene structure of two Hbs found in worms (Branchipolynoe spp.). This diagram illustrates the exon-intron structure and domain architecture for the single-domain (top) and tetra-domain (bottom) globins found in these species (Projecto-Garcia et al., 2010).

Figure 1.5. Phylogenetic classification of the class Bivalvia of molluscs. The main bivalve subclasses are represented here in different colours: Protobranchia, Pteriomorphia, Palaeoheterodonta, Archiheterodonta, Anomalodesmata and Imparidentia. This basic phylogeny demonstrates the distribution of Hb (indicated as Hb following species and order name) and Hc (indicated as Hc) among bivalve taxa. The order of the species in which those respiratory pigments are found is indicated in grey brackets: Solemyoida, Nuculanoida, Arcoida, Carditoida and Veneroida. Adapted from Gonzáles et al. (2015).

Figure 1.6. Quaternary assembly of Homo sapiens and Scapharca inaequivalvis Hbs. (a) HbA refers to adult Hb in H. sapiens; (b) HbII refers to hetero-tetrameric arcid Hb in S. inaequivalvis and (c) HbI refers to homo-dimeric arcid Hb in S. inaequivalvis. For each Hb structure represented here, haeme groups are shown in red, α (alpha) subunits are shown in dark grey, β (beta) subunits are shown in light grey and E and F helices are shown in blue. This illustrates the diversity of structures found in arcid Hbs with a heterotetramer (b) and a homodimer (c), both found in S. inaequivalvis. It also shows the back to front assembly of the arcid Hbs, with E and F helices arranged on the outside of the molecule compared to human Hb (Ronda et al., 2013).

Figure 2.1. Photograph of an A. trapezia specimen illustrating the anatomical position of all five tissues used to assess tissue specific expression of Hb genes in this study.

Figure 2.2. Summary of haemolymph analysis results. These data were obtained by testing a few microliters of fresh haemolymph using an Abbott i-STAT analyser. This was done for all haemolymph samples across two experimental conditions and two timepoints: water 6 h, air 6 h, water 12 h and air 12 h. A) pH measurements, B) percentage of O2 saturation, C) partial pressure of CO2. Significant differences were assessed through ANOVA testing using the program SPSS with pH values, percentage of O2 saturation values and pCO2 values as dependent variables respectively. Condition was used as a factor for each test and results were considered statistically significant at p < 0.05. Statistically different groups are represented by the symbols (*) and (•) in each graph and error bars represent 2 standard errors around the mean.

Figure 2.3. PCR products amplified from two mantle samples to validate candidate Hb genes. Amplified PCR products obtained here were purified using the Bioline isolate PCR purification kit followed by cloning using the Promega pGEM-T and pGEM-T easy vector systems quick protocol. Samples were then sequenced on the ABI Genetic Analyzer 3500 in duplicate. Each gel of this figure represents one mantle sample. For both gels A (sample 1) and B (sample 2): lane 1 contains 100 bp Hyperladder, lanes 2 through 7 contain amplified products for 2D, AG, BG, HB, HD and 18S genes respectively.

Figure 2.4. Relative expression ratios of candidate globin genes amplified using RT-qPCR. Candidate genes amplified in each tissue from the bivalve A. trapezia are represented as follows: 2D, AG, BG and HB. Relative quantification analysis was performed using the ∆∆CT method. Relative expression is expressed as a ratio of levels of target sequences to levels of reference sequences with 18S used here as a housekeeping gene. Significant differences were assessed through ANOVA testing using the program SPSS and differences were considered statistically significant at p < 0.05. Significantly different groups are shown here with an asterisk (*). Ratio values were used as a dependent variable against tissue type for each gene. Error bars represent 2 standard errors around the mean.

Figure 2.5. Examples of amplification curves obtained for target genes (2D, AG, BG, HB) and housekeeping gene 18S in each tissue type. Target amplification curves are represented in orange (left), negative controls in green (left) and reference amplification curves in blue (right). RT-qPCR was performed using a LightCycler® measuring specific fluorescence at each cycle. All quantitative PCR analyses were repeated in three technical replicates along with negative controls and 18S as a housekeeping gene.

Figure 2.6. Relative expression ratios of candidate globin genes amplified using RT-qPCR. Candidate genes amplified in each tissue from the bivalve A. trapezia are represented as follows: haemolymph, foot, gills, mantle and muscle. Relative quantification analysis was performed using the ∆∆CT method. Relative expression is expressed as a ratio of levels of target sequences to levels of reference sequences with 18S used here as a housekeeping gene. Significant differences were assessed through ANOVA testing using the program SPSS with ratio values as a dependent variable and condition as a factor in each tissue. Differences were considered statistically significant at p < 0.05 and no statistical differences were found here. Error bars represent 2 standard errors around the mean.

Figure 3.1. Workflow overview of functional annotation of the C. ales transcriptome using the Trinotate annotation pipeline. This is a comprehensive annotation suite based on homology searches to known sequence data. Contigs were first used as BLASTx queries against the TrEMBL and Swiss-Prot databases (stringency E-value of 1 × 10⁻⁶). TransDecoder was used to generate a predicted proteome, which was then used as a BLASTp query against the TrEMBL and Swiss-Prot databases. SignalP was used to predict the presence and location of signal peptides and Pfam to determine the presence and position of domains. All results were uploaded to the SQLite database and an annotation report was generated. Adapted from van der Burg et al. (2016).

Figure 3.2. Whole tissue from one C. ales individual was sampled and used for RNA extraction. This 1.5 % agarose gel electrophoresis shows RNA extracted from four samples (extraction was performed in quadruplicate) from this individual as follows: lane 1 contains 100 bp Hyperladder, lanes 2-5 contain samples 1-4 respectively. The RNA is visible as strong bands in each sample around the 700 bp mark. RNA obtained was then assessed for quantity and integrity and sequencing libraries were prepared.

Figure 3.3. Bioanalyzer results for total RNA quality and quantity. Total RNA samples obtained from whole tissue of one C. ales individual were assessed for quantity and integrity on a Bioanalyzer 2100 RNA nano chip. Results shown here are for two RNA samples labelled C10 and C11. RNA concentrations are given for each sample.

Figure 3.4. WEGO output for the newly generated transcriptome for C. ales. Web Gene Ontology Annotation Plot (WEGO) was used to characterise the transcriptomic data obtained for C. ales. This figure represents the proportion and number of transcripts assigned Gene Ontology (GO) terms in three different gene ontology categories developed to represent common and basic biological information: cellular component (CC), molecular function (MF) and biological process (BP).

Figure 3.5. Venn diagram illustrating the number of gene clusters shared and unique between the four mollusc species C. gigas, L. gigantea, A. trapezia and C. ales. Orthologous gene clusters for C. ales, A. trapezia and the two model species L. gigantea and C. gigas are annotated and compared here using the program OrthoVenn. For L. gigantea and C. gigas, the predicted proteomes obtained from the genomes are used and the predicted proteomes from whole organism transcriptomes are used for A. trapezia and C. ales. Ortholog groups shown here were identified by an all-against-all reciprocal BLASTp alignment.

Figure 3.6. Globin domains identified for three candidate globin genes. The annotated transcriptome obtained for C. ales was interrogated for candidate globin genes, here defined as contigs that possess a characteristic globin fold sequence and globin domain. The three contigs found to possess these characteristics are shown as follows: CalesGl1 (top), CalesGl2 (middle) and CalesGl3 (bottom). For candidate gene 2, the purple circle represents a signal peptide and for candidate gene 3, the blue rectangle represents a transmembrane domain. Both were identified using SMART searches for homologous Pfam domains, signal peptides and internal repeats.

Figure 3.7. Candidate gene products from the C. ales transcriptome. Each candidate was amplified using primers as shown in Table 3.2. Lane 1 in all gels displayed here is Hyperladder 100 bp. Candidate gene 1 is shown in lane 2 of gel (A); candidate gene 2 is shown in lane 3 of gel (B); candidate gene 3 is shown in lane 3 of gel (C). All gels are 1.5 % agarose stained with GelRed™ (Biotium). Products of candidate genes seen here were purified using an Ethanol/EDTA precipitation protocol and samples were sequenced on an ABI Genetic Analyzer 3500 (ThermoFisher).

Figure 3.8. Molecular phylogenetic analyses by the Maximum Likelihood method. Phylogenetic relationships of globin genes were resolved here for the following species: bivalves A. trapezia and C. ales, two mollusc model species L. gigantea and C. gigas, and three vertebrate model species H. sapiens, G. gallus and D. rerio. The percentage of trees in which the associated taxa clustered together is shown next to the branches. A discrete Gamma distribution was used to model evolutionary rate differences among sites (3 categories (+G, parameter = 5.3023)). The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 3.8405 % sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site.

Figure 3.9. Multiple alignments of globin protein sequences. All mollusc globin protein sequences used in the phylogeny represented in Figure 3.8 were aligned for sequence comparison and to identify residue conservation using MUSCLE, a multiple sequence alignment method with high accuracy and high throughput, in MEGA. Sequences are grouped by species and include A. trapezia, C. ales, C. gigas and L. gigantea. Residues conserved across all sequences and all species are indicated above by an asterisk (*).

List of Tables

Table 2.1. List of primers designed to amplify products of candidate globin genes identified in the bivalve species A. trapezia using RT-PCR. All primers were designed to amplify the entire ORF of each gene with product sizes between 350 bp and 619 bp.

Table 2.2. List of primers designed to amplify products of globin genes identified in the bivalve species A. trapezia using RT–qPCR and determine their expression levels in different tissues. All primers are designed to amplify products with sizes between 109 bp and 148 bp (Prentis & Pavasovic, 2014).

Table 2.3. RT-PCR results summary for tissue specific expression. All five genes (2D, AG, BG, HB and HD) have been amplified in six biological replicates for each tissue and each condition. To summarise the results in this table, a score is given out of 6 (total number of biological replicates) for each candidate gene amplified in each tissue and in each condition. An asterisk (*) represents weak bands obtained on gel electrophoresis for at least 2 biological replicates out of 6 in that group.

Table 3.1. Summary of sequencing and assembly data for the bivalve C. ales. Sequencing libraries were prepared using an Illumina TruSeq® stranded RNA library prep kit and the final cDNA library was sequenced on an Illumina NextSeq500 using 150 bp paired-end chemistry. Libraries obtained were then assembled and the table summarises the results obtained.

Table 3.2. Primers designed for validation of three candidate genes identified from the newly generated C. ales transcriptome. Primers were designed using Primer3 software to amplify the entire ORFs and validate the candidate genes identified. Forward and reverse primers were designed for each candidate as shown in the table.

List of Abbreviations

AA- Amino acid

Angb- Androglobin

BLAST- Basic local alignment search tool

CO- Carbon monoxide

Cygb- Cytoglobin

GbE- Globin E

GbX- Globin X

GbY- Globin Y

Hb- Haemoglobin

LUCA- Last universal common ancestor

NO- Nitric oxide

Mb- Myoglobin

NCBI- National Center for Biotechnology Information

Ngb- Neuroglobin

ORF- Open reading frame

PCR- Polymerase chain reaction

ROS- Reactive Oxygen Species

Statement of Original Authorship

The work contained in this thesis has not been previously submitted to meet requirements for an award at this or any other higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made.

Signature: QUT Verified Signature

Date: June 2017

Acknowledgements

I would firstly like to express my gratitude to my supervisors Dr. Ana Pavasovic (School of Biomedical Sciences, Queensland University of Technology), Dr. Peter Prentis (School of Earth, Environmental and Biological Sciences, QUT) and Professor Louise Hafner (School of Biomedical Sciences, QUT) for their supervision and guidance throughout my research and writing of this thesis. I would also like to thank them for their continuous support, patience and encouragement throughout these challenging two years.

A special thank you to my principal supervisors Dr. Ana Pavasovic and Dr. Peter Prentis for their flexibility, genuine caring and faith in me along the way.

I would like to thank all my colleagues in the evolutionary and physiological genomics lab (ePGL), especially Shorash Amin, Hayden Smith, Joachim Surm and Chloe van der Burg for their ongoing help and support. I would also like to thank Associate Professor Christopher Collet (School of Biomedical Sciences, QUT) and Dr. David Hurwood (School of Earth, Environmental and Biological Sciences, QUT) for reviewing my thesis draft and providing valuable feedback during my final seminar.

I would like to thank all the staff at the QUT marine lab facility for their helpful advice regarding care of the marine animals and tanks. I would also like to thank QUT HPC and QUT MGRF for use of their facilities.

I would like to thank my family for giving me the opportunity to study in Australia and for their ongoing loving support and encouragement.

Lastly, I would like to thank my partner Tom for his emotional support, endless patience and encouragement. He has always been a source of strength and has kept me determined throughout this challenging journey.

Chapter 1: Introduction

1.1 Functional and structural diversity within the globin superfamily

The globins are an ancient gene superfamily, thought to have been present in the last universal common ancestor (LUCA) of the three domains of life (Hoffmann et al., 2010a). Globin genes encode small iron-containing proteins, typically ~150 amino acids in length. Globin domains contain a proximal histidine residue in the F helix for iron binding and a distal histidine residue on the opposite side of this iron for oxygen binding (Bashford et al., 1987). Structurally, globin domains consist of eight alpha (α) helical segments which form the characteristic globin fold in a three-on-three α helical structure with a central haeme group comprising a proto-porphyrin ring (Perutz, 1979; Blank & Burmester, 2012). It is this haeme prosthetic group that allows globin proteins to reversibly bind oxygen (O2) and other gaseous ligands (Pesce et al., 2002). For a long time, the metazoan (animal) globin superfamily was thought to consist of only two globin types, haemoglobin (Hb) and myoglobin (Mb), but recent research has shown that this superfamily is functionally and structurally far more diverse than originally thought (Götting & Nikinmaa, 2015).

The globin gene superfamily includes genes encoding Hbs, neuroglobins (Ngbs), cytoglobins (Cygbs), globin X (GbX), androglobins (Angbs), Mbs, globin E (GbE) and globin Y (GbY) among others (Hoffmann et al., 2010a; Storz et al., 2013; Opazo et al., 2015). This diverse group of genes is thought to have evolved from a common ancestral gene through repeated rounds of gene and genome duplication, some 1.8 billion years ago (Efstratiadis et al., 1980; Wajcman et al., 2009). This coincides with the accumulation of O2 in the atmosphere, suggesting that those globin genes arose as a mechanism to scavenge toxic O2, carbon monoxide (CO) and nitric oxide (NO) gases (Hardison, 1996; Koch & Britton, 2008). The repeated rounds of duplication and functional divergence of duplicated genes have resulted in a large and diverse group of structurally similar proteins. In this superfamily, GbX and Ngb are some of the more recently described proteins (Pesce et al., 2002; Burmester & Hankeln, 2004; Roesner et al., 2005) but are thought to be the most ancestral globin genes found in metazoans. In fact, GbX and Ngb are the only globin genes found in early divergent phyla such as Cnidaria and Porifera, and they predate the split of deuterostomes and protostomes (Schwarze et al., 2014).

1.1.1 Recently discovered globin genes

Neuroglobin and GbX are both expressed in the nervous system of vertebrates, but unlike Ngb, which is typically found in the cellular cytoplasm, GbX is a membrane bound protein (Pesce et al., 2002; Burmester & Hankeln, 2014). The functions of both of these proteins are still uncertain and they have been the focus of numerous studies to elucidate their physiological function (Hankeln et al., 2005; Burmester & Hankeln, 2009; Wawrowski et al., 2011). Nonetheless, initial studies speculate that Ngb and GbX largely play a protective role in the cell (Brunori & Vallone, 2007; Burmester & Hankeln, 2009; Corti et al., 2016) as well as being involved in cellular signaling processes (Burmester & Hankeln, 2009; Su et al., 2014). Globin E, another described vertebrate globin (Kugelstadt et al., 2004), is typically expressed in eye tissue. Its role is thought to be related to the O2 supply of retinal cells, which are metabolically highly active (Blank et al., 2011). Androglobin is an ancient chimeric gene with high levels of expression in the testes of vertebrates (Hoogewijs et al., 2011); however, its function remains unresolved (Burmester & Hankeln, 2014). Yet another recently discovered globin, Cygb, is expressed in a diverse range of tissues and cell types including epithelium, fibroblast cell lineages, macrophages, neurons and muscle fibers (Oleksiewicz et al., 2011; Motoyama et al., 2014). Functions of Cygb appear similarly diverse, with roles in protection from reactive oxygen species and in nitric oxide metabolism (Pesce et al., 2002; Singh et al., 2014).

1.1.2 Myoglobins and haemoglobins

Stemming from their early discovery, Mbs and Hbs are the most extensively studied globins. Most metazoan species possess only a single copy of the Mb gene, except in rare cases of some fish lineages where an increase in copy number (up to seven) is seen (Koch & Burmester, 2016). Mbs are small monomeric proteins whose expression is principally restricted to striated muscle tissue, where they play a key role in supplying the mitochondria of myocytes with O2 during periods of hypoxia, defined as oxygen deficiency in a biotic environment (Wittenberg & Wittenberg, 2003). This protein is also reported to play an important role in the decomposition of nitric oxide during high cellular metabolic activity (Flögel et al., 2001). Unlike the other globin proteins, Hb principally transports O2 from respiratory surfaces to the working tissue within an organism.

Haemoglobins have been reported in multiple animal phyla as key oxygen-transport proteins (Weber, 1980; Mangum, 1992; Hardison, 1996; Hourdez et al., 2000). Most frequently, Hbs are found in the circulatory system of metazoan species, but they can sometimes be confined to specific tissues, such as in the bivalve Yoldia eightsii where Hb is expressed in the gill tissue (Dewilde et al., 2003). Outside of vertebrates, Hbs exhibit a remarkable diversity in structure (Terwilliger, 1998; Weber & Vinogradov, 2001). For example, Hb proteins can occur as monomers, dimers, tetramers or even in polymeric forms (Weber, 1980). Circulating Hbs may also be found within erythrocytes (intracellular) or freely dissolved in fluid tissue such as blood or haemolymph (extracellular) (Ching Ming Chung & Ellerton, 1980). Extracellular Hbs are extremely diverse in subunit size and quaternary structure; however, they all share the same globin fold as intracellular Hbs (Negrisolo et al., 2001). Notable structural variations can often be found in invertebrate species such as the bivalve mollusc Barbatia reeveana, where a very large polymeric Hb of 430 kDa composed of 34 kDa di-domain subunits occurs, each subunit containing two covalently linked functional units for O2 binding (Grinich & Terwilliger, 1980). Other examples of structural variation include annelid species, such as Riftia pachyptila and Lumbricus terrestris, which possess Hbs consisting of 24 and 144 subunits respectively (Royer et al., 2000; Strand et al., 2004; Flores et al., 2005).

Structural variations in Hbs can be observed even among vertebrate lineages, where jawless vertebrates (agnathans) have weakly cooperative dimers compared to the canonical tetrameric Hbs of jawed vertebrates (gnathostomes) (Hoffmann et al., 2010a). By far the best-studied Hb is the tetrameric form found in humans. Structurally, human Hb consists of four globin subunits (two alpha (α) and two beta (β)), each with its own haeme group (Pesce et al., 2002; Storz et al., 2013). The α and β chains consist of 141 and 146 amino acids, respectively, and are held together by noncovalent interactions to form a heterotetramer (Antonini & Chiancone, 1977; Hardison et al., 1997).
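As a quick check on the subunit arithmetic above, the following minimal Python sketch totals the residues and haeme groups of an α2β2 tetramer from the chain lengths quoted in the text. The function and constant names are illustrative assumptions and are not drawn from any cited source.

```python
# Chain lengths quoted in the text (Antonini & Chiancone, 1977; Hardison et al., 1997).
ALPHA_CHAIN_LENGTH = 141  # amino acids per alpha chain
BETA_CHAIN_LENGTH = 146   # amino acids per beta chain


def tetramer_summary(n_alpha=2, n_beta=2):
    """Return (total residues, haeme groups) for a haemoglobin assembled
    from n_alpha alpha chains and n_beta beta chains.

    Assumes one haeme group per globin chain, as stated in the text.
    """
    total_residues = n_alpha * ALPHA_CHAIN_LENGTH + n_beta * BETA_CHAIN_LENGTH
    haeme_groups = n_alpha + n_beta
    return total_residues, haeme_groups


residues, haemes = tetramer_summary()
print(residues, haemes)  # 574 4
```

Under these assumptions the human heterotetramer comprises 574 residues and four haeme groups.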

The genes that encode the different subunits of Hb proteins are found as clusters in the human genome. For example, the α globin gene cluster is located on chromosome 16 and includes seven loci (NC_000016.10; 5’ – zeta – pseudozeta – mu – pseudoalpha1 – alpha2 – alpha1 – theta – 3’) (Vernimmen, 2014), while the β gene cluster is found on chromosome 11 and is comprised of five loci (NC_000011.10; 5’ – epsilon – gammaG – gammaA – delta – beta – 3’) (Moleirinho et al., 2013). The order in which the genes are found within these clusters determines the timeline at which they are expressed during development (Hardison et al., 1997), with those closest to the locus control region at the 5’ end expressed in early development and those further away expressed at a later stage (Hanscombe et al., 1991). This developmental expression has been associated with varying oxygen levels throughout embryonic and foetal development (Efstratiadis et al., 1980) and demonstrates that oxygen availability can influence the expression and evolution of Hbs.
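The relationship between locus order and developmental timing described above can be sketched in a few lines of Python. This is only an illustration of the simplified rule stated in the text (loci nearer the 5’ end are expressed earlier); the locus order is copied from the text, while the variable and function names are hypothetical and the model ignores pseudogenes and any further regulatory detail.

```python
# Loci listed 5' -> 3', as given in the text for each cluster.
ALPHA_CLUSTER = ["zeta", "pseudozeta", "mu", "pseudoalpha1",
                 "alpha2", "alpha1", "theta"]          # NC_000016.10
BETA_CLUSTER = ["epsilon", "gammaG", "gammaA", "delta", "beta"]  # NC_000011.10


def rank_from_5prime(cluster, locus):
    """Return the 0-based position of a locus counted from the 5' end."""
    return cluster.index(locus)


def expressed_earlier(cluster, locus_a, locus_b):
    """Simplified rule from the text: the locus nearer the 5' end
    (and hence the locus control region) is expressed earlier."""
    return rank_from_5prime(cluster, locus_a) < rank_from_5prime(cluster, locus_b)


print(expressed_earlier(BETA_CLUSTER, "epsilon", "beta"))  # True
print(expressed_earlier(BETA_CLUSTER, "beta", "gammaG"))   # False
```

Representing each cluster as an ordered list keeps the 5’ to 3’ order explicit, so the positional rule reduces to an index comparison.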

1.2 Evolution of haemoglobins

Haemoglobins are among the most extensively studied proteins, yet our understanding of their evolutionary history is becoming increasingly unclear. A recent phylogenetic analysis of the vertebrate Hbs by Hoffmann et al. (2010a) suggests that Hbs in gnathostomes and agnathans have evolved in each lineage independently. It is also likely that these Hb genes have evolved from different ancestral genes, based on the placement of gnathostome and agnathan Hbs in two distinct clades (Figure 1.1) (Hoffmann et al., 2010a). Findings such as these contrast with the prevailing paradigm that the current vertebrate Hbs have evolved from an ancestral Mb gene. In invertebrates, the evolution of Hb genes remains even less clear than in vertebrates, largely due to insufficient genomic resources for the groups that possess Hbs.

The current paradigm to explain the diversity of vertebrate globin genes largely involves gene and whole genome duplication events, mutation and natural selection acting to promote evolutionary innovation in this gene family. Both the two rounds of whole genome duplication that preceded the evolution of vertebrates (2R hypothesis; Hokamp et al., 2003) and further tandem duplications provided the raw substrate for the evolution and diversification of the vertebrate globin genes seen today. Divergent and convergent evolution have played major roles in the evolution of new functions observed in duplicated vertebrate globin genes.

Figure 1.1. Maximum likelihood tree highlighting relationships between Hbs of jawed (gnathostomes) and jawless (agnathans) vertebrates. In this phylogeny, vertebrate-specific globins are grouped into two distinct clades: (i) Cyclostome Hbs + Cygb + Mb + GbE + GbY, (ii) β-Hbs + α-Hbs (Hoffmann et al., 2010a).

1.2.1 Gene duplication

Both gene duplication and whole-genome duplication are now widely accepted as key drivers of evolution, as duplication is the most frequent mechanism responsible for the generation of genes with new functions (Ohno et al., 1968; Cañestro et al., 2013). It is through the accumulation of mutations and repeated rounds of duplication of existing genes that new functions are acquired (Zhang, 2003). Both whole-genome and gene duplication have promoted evolutionary innovation and led to the current diversity within the Hb gene family (Storz et al., 2013). Gene duplication can occur through five mechanisms: (i) unequal crossing-over between two sister chromatids of one chromosome or (ii) between two homologous chromosomes during replication, (iii) regional redundant duplication of DNA molecules (segmental duplication), (iv) polyploidization or (v) retrotransposition (Ohno et al., 1968; Zhang, 2003). Unequal crossing-over can also be described as recombination between DNA sequences at different sites on sister chromatids or homologous chromosomes. This results in a decrease in gene copy number on one chromosome or chromatid and an increase on the other (Roeder, 1983). Variation in globin gene copy number has often been associated with this type of recombination (Goossens et al., 1980; Higgs et al., 1980; Trent et al., 1981; Roeder, 1983; Hurles, 2004). Vertebrate Hb gene clusters, described previously, provide classic examples of evolution through gene duplication. In 1961, Ingram suggested that polyploidization initiated two duplication events from Mb to the primordial Hb α gene, which then produced the Hb β-like gene. While this is an overly simplistic explanation for the diversity found within the globin gene family, it provided an early plausible hypothesis that gene duplication was a major driver of the evolution of the vertebrate globin gene family.
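The change in copy number produced by unequal crossing-over can be illustrated with a minimal Python sketch. It is a toy model only: chromatids are treated as lists of gene copies, and the function name, gene labels and break points are hypothetical and are not drawn from any of the cited studies.

```python
def unequal_crossover(chromatid_a, chromatid_b, break_a, break_b):
    """Recombine two chromatids at (possibly different) break points.

    Each chromatid is a list of gene-copy labels. The first product joins
    the left part of A to the right part of B; the second product joins
    the left part of B to the right part of A.
    """
    product_1 = chromatid_a[:break_a] + chromatid_b[break_b:]
    product_2 = chromatid_b[:break_b] + chromatid_a[break_a:]
    return product_1, product_2


# Hypothetical two-copy globin array carried by each sister chromatid.
sister_a = ["globin_1", "globin_2"]
sister_b = ["globin_1", "globin_2"]

# Aligned break points: copy number is unchanged on both products.
equal = unequal_crossover(sister_a, sister_b, 1, 1)

# Misaligned break points: one product gains a copy, the other loses one.
gain, loss = unequal_crossover(sister_a, sister_b, 2, 1)

print(len(gain), len(loss))
```

In this sketch the total number of copies across the two products is conserved, so an increase on one chromatid is always matched by a decrease on the other, as described by Roeder (1983).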

Gene duplication is known to have four possible outcomes (Zhang, 2003; Cañestro et al., 2013). One possible outcome is dosage repetition, where limited sequence evolution is seen and gene function is conserved (Zhang, 2003; Cañestro et al., 2013). Another outcome following a duplication event may be pseudogenisation of one duplicated copy, in which the accumulation of mutations creates a non-functional gene (pseudogene) (Force et al., 1999; Zhang, 2003; Bischof et al., 2006). This may result in a duplicate that is transcribed into RNA but not translated. Neofunctionalisation represents another possible outcome of gene duplication, whereby accumulation of mutations in one of the copies may lead to new functions (Innan & Kondrashov, 2010). Lastly, subfunctionalisation is a sub-category of neofunctionalisation where the function remains the same but changes in the regulatory elements that control expression of the gene may lead to different copies of the duplicate being expressed in different tissues (Lynch & Force, 2000; Huerta-Cepas et al., 2011). Neofunctionalisation and subfunctionalisation are largely the result of divergent selection acting on newly formed duplicates. Mutations in DNA coding regions can change the amino acid sequence and affect protein-protein interactions, which are the foundation of cellular molecular function, thereby giving rise to new functions for those proteins (Wray et al., 2003).

In vertebrates, repeated rounds of gene and whole genome duplication events have led to copy number variation across the eight different classes of globin genes, as listed in section 1.1. For example, in fish, Mb copy number ranges from none (ice fishes: Chaenocephalus aceratus, Pseudochaenichthys georgianus and stickleback: Gasterosteus aculeatus) (Sidell & O’Brien, 2006; Hoffmann et al., 2011) to seven copies (lungfish: Protopterus annectens) (Koch et al., 2016). Pseudogenisation has led to attrition in some of these globin classes in certain vertebrate lineages. Specifically, GbY is absent in birds, marsupials and placental mammals, while it is present in monotremes and most other vertebrate lineages, with the exception of lampreys and ray-finned fishes (Burmester & Hankeln, 2014). Similarly, this process of gene loss is responsible for the absence of Mb in a number of amphibian species (e.g., anurans (frogs)) (Maeda & Fitch, 1982; Fuchs et al., 2006). In the vertebrates, the formation of the α globin gene cluster is a consequence of tandem duplication leading to functional copy number variation, such as two copies in green anole (Anolis carolinensis) (Hoffmann et al., 2010b), three in chicken (Gallus gallus) (Reitman et al., 1993) and four in platypus (Ornithorhynchus anatinus) (Opazo et al., 2008). A recent systematic analysis of 22 avian and 22 mammalian genomes revealed significant conservation in copy number within the avian α globin gene cluster (2-3 copies) but substantial variation among the surveyed mammalian lineages (2-8 copies) (Figure 1.2) (Zhang et al., 2014). In addition to this duplication of α globin genes within the mammalian lineages, there is strong evidence to suggest that mutations which have accumulated in some duplicated copies may have led to the evolution of novel functions (He & Zhang, 2005). This process of neofunctionalisation is often a result of divergent natural selection acting on new genetic variation in duplicated gene copies.

Figure 1.2. Comparison of chromosomal organization of α and β globin gene clusters in avian and mammalian taxa. The α and β globin genes represented here encode the α and β subunits of a tetrameric haemoglobin (α2β2) (Zhang et al., 2014).

1.2.2 Divergent evolution of duplicated genes

Divergent evolution of duplicated genes is the process in which genes of a similar function diverge from each other following duplication from a common ancestral gene (Bikard et al., 2009). An example of divergent evolution in the globin superfamily is seen in a number of vertebrate Hb proteins that undertake similar but slightly different functions. These Hb proteins are encoded by genes which have evolved from the same ancestral gene but exhibit differences in their ontogenetic timing of expression and functional properties, such as affinity for oxygen (Gribaldo et al., 2003). In fact, the differences in O2 scavenging and O2 transport roles of embryonic Hb versus adult Hb are attributable to amino acid replacements (non-synonymous mutations) in the zeta/epsilon and beta/alpha genes expressed in the embryonic and adult individual, respectively (Goodman et al., 1987). Another example of divergent evolution driving evolutionary innovation in Hb proteins can be seen in the avian lineage. In this instance, adult birds express two functionally distinct Hb isoforms, HbA and HbD (Grispo et al., 2012). While the β globin chains in both isoforms are identical, two functionally distinct α chains, one encoded by the alphaA globin gene and the other by the alphaD globin gene, are found in the HbA and HbD isoforms, respectively. The alphaA and alphaD globin genes share a common ancestral gene, but possess amino acid replacements that substantially alter O2 affinity in the presence of allosteric modulators (Storz et al., 2015). Other examples of divergent evolution include the expression differences found in GbX paralogs (homologous genes that have diversified through duplication within a species) in the elephant shark (Callorhinchus milii) (Opazo et al., 2015). In this example, GbX1 is expressed across a wide range of tissues, while the expression of GbX2 is primarily restricted to the gonads. Taken together, these examples highlight the process of divergent evolution operating on non-synonymous mutations in duplicated globin genes, but they do not explain the independent evolution of Hb in a number of metazoan lineages such as vertebrates, bivalves and annelids (Wray et al., 1996).

1.2.3 Convergent evolution

Convergent evolution is the independent evolution of the same function or phenotype in different lineages from different ancestral genes (Hoffmann et al., 2010b). As a consequence, it can lead to analogous molecules with similar functions in unrelated lineages (Hoffmann et al., 2010b; Burmester & Hankeln, 2014; Opazo et al., 2015). Such adaptive phenotypic convergence appearing in unrelated taxa is partly attributed to similar selection pressures operating on duplicated genes, and this process is thought to be widespread in nature (Parker et al., 2013). An important example of convergent evolution is observed in the independent evolution of electric organs in numerous fish species, where the myogenic electric organ produces electrical currents for the purposes of communication and is believed to have evolved multiple times (Gallant et al., 2014). Similarly, echolocation in Cetacea (whales and dolphins) and Chiroptera (bats) has also been attributed to phenotypic convergence (Liu et al., 2010). Genome-wide analysis of cetaceans and two different bat lineages (Yinpterochiroptera and Yangochiroptera) has shown extensive convergent changes in nearly 200 loci containing coding sequences involved in echolocation (Parker et al., 2013). Functionally different Hb proteins occurring within erythrocytes of agnathan (jawless) and gnathostome (jawed) vertebrates present another example of convergence (Hoffmann et al., 2010a). In these two disparate lineages, phylogenetic analyses have determined that the Hbs have not evolved from orthologous genes and are structurally distinct, reflected in a weakly cooperative dimeric Hb form produced in agnathans and a cooperative tetrameric Hb form found in gnathostomes (Figure 1.1) (Hoffmann et al., 2010a; Schwarze et al., 2014). In addition to evidence of convergent evolution of Hb among vertebrates, evidence of convergence between vertebrate and invertebrate Hb proteins is also observed. A specific example is the repeated evolution of intracellular tetrameric Hb in some bivalve mollusc species and vertebrates (O’Gower & Nicol, 1968). In invertebrates, a variety of Hbs and Mbs have also been observed (Weber & Vinogradov, 2001) and are believed to be the result of globin proteins emerging several times convergently from different ancestral globin genes (Blank & Burmester, 2012; Schwarze et al., 2014).

1.2.4 Invertebrate haemoglobins

From primary sequence to secondary, tertiary and quaternary structures, a large diversity exists among invertebrate Hbs. This variability is thought to represent specialization of Hb molecules to the wide range of environments that invertebrates inhabit (Terwilliger, 1998; Weber & Vinogradov, 2001; Alyakrinskaya, 2002; Gow et al., 2005). Structurally, invertebrate Hbs also retain the globin fold (Gow et al., 2005) and exhibit a crystal structure that consists of at least six α-helices (Weber & Vinogradov, 2001). Haemoglobin in invertebrates has evolved independently at least 11 times (Figure 1.3) (Weber & Vinogradov, 2001); however, it is likely that this number will increase as more data become available for the many understudied invertebrate lineages.

Figure 1.3. This simplified phylogeny represents evolutionary relationships between major metazoan taxa; some lesser known phyla are not included for simplicity or due to unclear relationships. Taxa in which Hbs have been found are boxed in red. This phylogeny illustrates the independent evolution of Hbs through their presence in the 11 major phyla shown here: Arthropods, Nematodes, Nemertines, Mollusks, Annelids, Echiurans, Pogonophorans, Phoronids, Platyhelminthes, Echinoderms and Chordates. Adapted from Halanych & Passamaneck (2001).

The diversity in Hb proteins is a result of the sequence diversity seen in genes encoding invertebrate Hbs. Within invertebrates, nematodes, annelids, arthropods and molluscs all possess genes encoding Hb proteins with more than a single functional globin domain, which has resulted in a diversity of functional proteins (Natarajan et al., 2015). The brine shrimp (Artemia), for instance, expresses two functional Hb genes (an α and a β subunit type), each of which consists of nine tandem globin domains. The association of the polypeptide chains encoded by these two genes results in the expression of three Hb proteins: a hetero-dimer (HbII) and two homo-dimers (HbI and HbIII) (Manning et al., 1990). Structurally, HbII is composed of both α and β subunit types, while HbI and HbIII are composed of α and β subunit types, respectively (Manning et al., 1990). Interestingly, in addition to possessing multi-domain Hbs, production of multiple types of Hbs appears common among invertebrates. For example, in the annelid worms (Branchipolynoe symmytilida and B. seepensis) a single-domain globin gene (SD) encodes a 137 amino acid Hb protein, while a tetra-domain gene (TD) encodes a Hb protein of 552 amino acids (Projecto-Garcia et al., 2010). Figure 1.4 illustrates the exon-intron structure and domain architecture of the two Hbs from Branchipolynoe spp. Similarly, in nematodes, there is a diversity of globins including single-domain and di-domain Hbs (Projecto-Garcia et al., 2015). It is notable that, despite this diversity, the di-domain globin genes appear to have a restricted distribution in only two parasitic species (Ascaris suum and Pseudoterranova decipiens) (Darawshe et al., 1987; Dixon et al., 1991). These di-domain globin genes from nematodes encode a large polymeric multi-domain Hb (an octamer consisting of eight two-domain subunits). Overall, based on the observed diversity, invertebrate Hbs have been classified into four distinct groups according to their gene and protein subunit structures. These include single-domain, single-subunit Hbs; two-domain, multi-subunit Hbs; multi-domain, multi-subunit Hbs; and single-domain, multi-subunit Hbs (Vinogradov, 1985).
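The four groups listed above can be expressed as a simple lookup on two counts. The following Python sketch is a literal reading of the group names only, assuming that "multi-domain" means more than two globin domains per chain and "multi-subunit" means more than one chain per molecule. The function name is hypothetical and the sketch is not taken from Vinogradov (1985).

```python
def classify_invertebrate_hb(domains_per_chain, subunits):
    """Assign an Hb to one of the four groups named in the text.

    Returns None for combinations that fall outside the four groups
    (a multi-domain Hb made of a single chain).
    """
    if domains_per_chain < 1 or subunits < 1:
        raise ValueError("domain and subunit counts must be positive")
    if domains_per_chain == 1:
        if subunits == 1:
            return "single-domain, single-subunit"
        return "single-domain, multi-subunit"
    if subunits == 1:
        return None
    if domains_per_chain == 2:
        return "two-domain, multi-subunit"
    return "multi-domain, multi-subunit"


# Values taken from the text: Artemia dimers of nine-domain chains and
# the nematode octamer of two-domain subunits.
artemia_group = classify_invertebrate_hb(domains_per_chain=9, subunits=2)
nematode_group = classify_invertebrate_hb(domains_per_chain=2, subunits=8)
print(artemia_group)
print(nematode_group)
```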

Figure 1.4. Gene structure of two Hbs found in annelid worms (Branchipolynoe spp.). This diagram illustrates the exon-intron structure and domain architecture for the single-domain (top) and tetra-domain (bottom) globins found in these species (Projecto-Garcia et al., 2010).

1.3 Bivalve haemoglobins

Similar to the diversification patterns seen in other invertebrate lineages, bivalve molluscs show strikingly diverse patterns of Hb distribution and evolution. This is of interest as bivalves typically do not possess a functional respiratory pigment and those that do were previously thought to utilise a copper-based respiratory pigment known as haemocyanin (Hc), with Hb expression considered rare (O’Gower & Nicol, 1968; Terwilliger, 1998). Species that contain Hc are restricted to the earliest branching lineage of bivalves and it has been suggested that Hc is the ancestral oxygen-transport protein in class Bivalvia. It has been shown, however, that Hb is more widely distributed in bivalve molluscs than previously thought (Manwell, 1963; Smith, 1967; Terwilliger, 1998; Alyakrinskaya, 2002; Wajcman et al., 2009).

Currently, four independent origins of Hbs have been hypothesised in bivalves (Figure 1.5). The first instance is thought to have occurred in the primitive bivalve lineage containing Solemya velum, which possesses several different types of Hbs expressed predominantly in gill tissue (Dando, 1985; Doeller et al., 1988; Torres-Mercado et al., 2003). In Lucina pectinata, also found in this lineage, three Hbs are expressed: HbI, which is a sulphide-reactive protein, and HbII and HbIII, which are O2-reactive globins (Montes-Rodriguez et al., 2016). A second instance has occurred in the bivalve family Arcidae, represented in Figure 1.5 by Arca noae, where species contain multiple forms of Hb in circulating erythrocytes (Mangum, 1997). Haemoglobin diversity in the Arcidae will be discussed in more detail in the following section as it constitutes one of the questions addressed by this thesis. The third independent origin of bivalve Hb is found in the deep-sea clam genus Calyptogena. Species within this genus generally present with two homo-dimeric Hb proteins (HbI and HbII) in erythrocytes, but C. nautilei possesses two monomeric Hbs (HbIII and HbIV) which show low sequence identity to HbI and HbII (Kawano et al., 2003). The fourth instance is an extracellular Hb found in the heterodont clams Cardita borealis and C. floridana (Manwell, 1963; Terwilliger et al., 1978). Figure 1.5 depicts the basic bivalve phylogeny with the distribution of Hc and Hb among its taxa. While the four cases listed above provide compelling evidence for the independent evolution of Hb in multiple bivalve lineages, it is not known whether Hb has evolved in other lineages where O2 transport proteins have not been examined.

One lineage in particular that has not been extensively examined for the presence of Hb proteins is the family Limidae. This is significant as there is some evidence to indicate that some species in this family may also possess Hb proteins. This evidence, however, is principally restricted to a single report of this observation (Rawat, 2010), with no molecular or biochemical studies reported to date. It is plausible to hypothesise that Limidae do possess Hb proteins, based on their high metabolic rate associated with the ability to swim and the red pigmentation of their tissues (Baldwin & Lee, 1978; Harper & Skelton, 1993; Mikkelsen & Bieler, 2003). If present, Hb could represent an important physiological advantage by providing a more efficient oxygen delivery system to respiring tissues during swimming. Consequently, if Hb were found in Limidae, this would constitute a fifth independent origin of Hb in bivalve molluscs. This represents a key knowledge gap which will be addressed in this thesis.

Figure 1.5. Phylogenetic classification of the molluscan class Bivalvia. The main bivalve subclasses are represented here in different colours: Protobranchia, Pteriomorphia, Palaeoheterodonta, Archiheterodonta, Anomalodesmata and Imparidenta. This basic phylogeny shows the distribution of Hb (indicated as Hb following species and order name) and Hc (indicated as Hc) among bivalve taxa. The order of each species in which these respiratory pigments are found is indicated in grey brackets: Solemyoida, Nuculanoida, Arcoida, Carditoida and Veneroida. Adapted from Gonzáles et al. (2015).

1.3.1 Haemoglobins in the family Arcidae, Pteriomorphia subclass

The bivalve family Arcidae, also referred to as the blood clams, is unusual in comparison to most bivalve families as its members possess haemolymph with a deep red coloration attributed to its circulating erythrocytes (Mangum, 1998). These erythrocytes contain multiple Hbs, which are often dimeric (~ 32 kDa) and tetrameric (~ 65 kDa) (Como & Thompson, 1980b; Furuta & Kajita, 1983; Suzuki et al., 2000), with the exception of a 430 kDa polymeric Hb found in some members of the Barbatia genus, such as B. reeveana and B. lima (Grinich & Terwilliger, 1980; Suzuki & Arita, 1995). Dimeric Hbs in the Arcidae exhibit further structural complexity and can be homo-dimeric or hetero-dimeric (Furuta & Kajita, 1983; Mann et al., 1986; Suzuki et al., 1992). In some species, such as blood cockles (Anadara trapezia and Scapharca inaequivalvis), multiple homo-dimeric Hbs have been found circulating in erythrocytes at the same time (O’Gower & Nicol, 1968; Fisher et al., 1984; Ronda et al., 2013).

The tetrameric Hbs in Arcidae are formed from two different subunits in an alpha2beta2 (α2β2) arrangement, analogous to the vertebrate Hb (Ronda et al., 2013). The α and β globin chains in these species range from ~ 145 to 165 amino acids in length (Mann et al., 1986; Suzuki et al., 1996). Both the dimeric and tetrameric Hbs show different pairing of their helices to the canonical arrangement seen in vertebrate Hbs (Ronda et al., 2013). Specifically, the E and F helices show subunit pairing, with the G and H helices on the outside of the quaternary structure, creating a back-to-front formation when compared to vertebrate Hb (Vinogradov, 1985). Figure 1.6 illustrates the comparison of the E and F helical arrangement of the human and S. inaequivalvis Hb. This back-to-front structural arrangement is common to all invertebrate Hbs for which crystal structures have been resolved (Royer et al., 2005). This structural variation is also considered to provide evidence that arcid Hb proteins are a result of convergent evolution with vertebrate Hbs (Royer et al., 2005).

Figure 1.6. Quaternary assembly of Homo sapiens and Scapharca inaequivalvis Hbs. (a) HbA refers to adult Hb in H. sapiens; (b) HbII refers to hetero-tetrameric arcid Hb in S. inaequivalvis and (c) HbI refers to homo-dimeric arcid Hb in S. inaequivalvis. For each Hb structure represented here, haeme groups are shown in red, α (alpha) subunits are shown in dark grey, β (beta) subunits are shown in light grey and E and F helices are shown in blue. This illustrates the diversity of structures found in arcid Hbs, with a hetero-tetramer (b) and a homo-dimer (c) both found in S. inaequivalvis. It also shows the back-to-front assembly of the arcid Hbs, in which the E and F helices pair between subunits rather than lying on the outside of the molecule as in human Hb (Ronda et al., 2013).

Based on the current literature, protein sequences have been more extensively studied than the underlying genetic variation. Currently, most of the information regarding the genes encoding Hb proteins in Arcidae has been generated in studies which utilised small-scale cDNA sequencing (Suzuki & Arita, 1995; Piro et al., 1996; Suzuki et al., 2000). This sequencing has demonstrated that most species examined have two to three Hb encoding genes. For example, Tegillarca granosa contains three different Hb genes: the α and β encoding genes of the tetrameric Hb and a minor globin gene encoding the homo-dimeric Hb (Bao et al., 2013a). S. inaequivalvis also expresses three homologous Hb genes (α, β and minor) (Piro et al., 1996; Piro et al., 1998). Interestingly, these three genes have the same exon/intron structure seen in vertebrate Hb genes, consisting of three exons and two introns (Piro et al., 1996). Unlike the previous two examples, B. lima expresses four distinct Hb genes which encode a minor homo-dimeric globin (delta (δ)), a hetero-tetramer (α and β) and a polymeric globin (2D) (Suzuki et al., 1996). The 2D gene is a di-domain globin and is thought to have resulted from a duplication of the delta gene, producing two consecutive domains within a gene due to the loss of a stop codon (Suzuki et al., 1996). While these studies capture some of the diversity of Arcidae Hb genes, they are principally based on previous protein evidence. There is a need for a systematic evaluation of diversity at the sequence level of all expressed Hb genes in this group of organisms.

In arcid bivalves, there is preliminary evidence to suggest that some duplicated Hb encoding genes may have taken on new functions. For example, in T. granosa, molecular analysis has revealed that some Hb encoding genes may play a role in the immune response to bacterial pathogens (Bao et al., 2013b). In this instance, single nucleotide polymorphisms identified in the α and β Hb encoding genes (HbIIA and HbIIB) showed an association with resistance to the pathogenic bacterium Vibrio parahaemolyticus (Bao et al., 2013b). In addition to this example of neofunctionalisation, there is also some evidence that duplicated Hb genes in some bivalves show tissue specific expression (Burmester et al., 2000), although this has only been reported for species outside of the Arcidae. Some authors speculate that the diversity of globin genes in Arcidae may have contributed to the capacity of these organisms to persist in anoxic or intertidal environments (Terwilliger et al., 1978; Alyakrinskaya, 2002). As these studies are few and speculative, further investigation is needed to determine the extent of gene duplication within a single species and whether the duplicated genes have undergone neofunctionalisation. Assessment of expression levels of Hb genes across different tissues under experimental perturbations (e.g., hypoxic versus normoxic abiotic stress) would allow identification and quantification of potentially new functions in these genes.

A. trapezia was the first arcid species to have its Hb proteins isolated and characterised. Early work by Nicol & O’Gower (1967) identified the presence of multiple circulating Hbs, including a tetrameric Hb (HbI) with an α2β2 configuration and two homo-dimeric Hbs (HbIIa and HbIIb). The homo-dimeric forms appear to be allelic variants of the same gene as they are in Hardy-Weinberg equilibrium within populations (O’Gower & Nicol, 1968; Como & Thompson, 1980b; Mann et al., 1986). One of the genes that encodes the β subunit of the tetrameric Hb (HbI) has been fully characterised (At & Eo, 1984); however, the complete gene sequences for the two homo-dimeric variants and the α subunit of the tetramer have not. Interestingly, another Hb gene encoding a minor Hb, possibly seen as a ghost band by O’Gower & Nicol (1968), has been characterised fully in A. trapezia (Titchen et al., 1991). Based on this protein and DNA evidence, it appears that at least four different Hb genes are present in this species. Despite this significant early work on Hb in A. trapezia, there have been numerous inconsistencies regarding the nomenclature of the Hb proteins and the corresponding genes (where these have been characterised at all). In addition to this limitation, it remains uncertain whether the DNA and protein sequences reported for A. trapezia represent the complete repertoire of Hb encoding genes in this species. This makes inferences about the evolution of Hb in bivalves difficult and highlights the need for a systematic evaluation of the Hb genes in A. trapezia and arcids in general. Significantly, even less is known about the functional diversification of this gene family, demonstrating the requirement for further functional genomic interrogation of Hb genes in the Arcidae. This paucity of information represents a key knowledge gap which will be addressed in this thesis.
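The Hardy-Weinberg reasoning used above to treat HbIIa and HbIIb as allelic variants of one gene can be sketched in Python. For two alleles with frequencies p and q = 1 - p, the expected genotype frequencies are p², 2pq and q², and a chi-square statistic compares these expectations with observed genotype counts. The genotype counts below are hypothetical and are not data from the cited studies; the function name is likewise an assumption.

```python
def hardy_weinberg_chi_square(n_aa, n_ab, n_bb):
    """Return allele frequency p, expected genotype counts and chi-square.

    n_aa, n_ab and n_bb are observed counts of the two homozygotes and
    the heterozygote at a two-allele locus.
    """
    total = n_aa + n_ab + n_bb
    if total == 0:
        raise ValueError("at least one individual is required")
    p = (2 * n_aa + n_ab) / (2 * total)
    q = 1 - p
    expected = (p * p * total, 2 * p * q * total, q * q * total)
    observed = (n_aa, n_ab, n_bb)
    chi_square = sum(
        (obs - exp) ** 2 / exp for obs, exp in zip(observed, expected) if exp > 0
    )
    return p, expected, chi_square


# Hypothetical counts of HbIIa/HbIIa, HbIIa/HbIIb and HbIIb/HbIIb individuals.
p, expected, chi_square = hardy_weinberg_chi_square(36, 48, 16)
print(round(p, 3), [round(e, 1) for e in expected], round(chi_square, 3))
```

With one degree of freedom, a chi-square value below 3.84 would not reject Hardy-Weinberg proportions at the 5% level, which is the pattern expected if the two homo-dimeric forms are alleles at a single locus.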

1.4 Aims

The overarching aim of this study is to expand our understanding of the evolution of haemoglobin (Hb) encoding genes, one of the most important gene families in the metazoan tree of life. Specifically, this thesis aims to address the lack of information regarding the distribution of Hb genes in bivalves and to investigate potential evidence of neofunctionalisation in duplicated Hb genes of Arcidae bivalves. To address these aims, the thesis has two main objectives, each with its own hypothesis. These are:

• Objective 1

o H1 – Duplicated Hb genes in Anadara trapezia, a member of the Arcidae family, show tissue specific expression patterns and therefore evidence of neofunctionalisation.

This hypothesis will be tested by interrogating the recently published whole organism transcriptome of A. trapezia for all expressed copies of Hb genes and measuring their expression across multiple tissues using quantitative polymerase chain reaction (qPCR) techniques.

• Objective 2

o H1 – Bivalve Ctenoides ales, a member of the Limidae family, expresses Hb encoding genes and therefore provides evidence of a fifth independent origin of Hb in bivalve molluscs.

The hypothesis in Objective 2 will be tested by sequencing the expressed portion of the C. ales genome and examining it for the presence of Hb encoding genes. A whole organism transcriptome will be generated, sequenced using high-throughput sequencing techniques and bioinformatically interrogated for evidence of Hb expression.

1.5 Thesis Outline

The thesis presented here includes four main chapters, consisting of an introductory literature review (chapter 1), followed by two studies, each described within its own chapter (chapters 2 and 3). The final section of the thesis is the general discussion (chapter 4).

• Chapter 1 The first chapter is a literature review which highlights the current understanding of vertebrate and invertebrate haemoglobins (Hb), focusing on their vast structural diversity and evolution. This section also explores evolution by gene duplication and discusses divergent and convergent evolution. In addition, it focuses on invertebrate Hbs and more specifically bivalves in light of the topic for this study.

• Chapter 2 This is the first data chapter, which addresses Objective 1. It provides an overview of duplicated Hb genes in Arcidae bivalves and highlights the knowledge gap in our understanding of evolutionary innovation through neofunctionalisation of these genes. The chapter provides detailed methodological information on the measurement of tissue specific expression of candidate Hb genes in A. trapezia and the interpretation of the results. The findings of this study are interpreted in detail in the discussion.

• Chapter 3 This is the second data chapter, which addresses Objective 2. It describes the interrogation of the transcriptome of C. ales for the presence of Hb encoding genes. The methodology section in this chapter concerns the isolation, sequencing and bioinformatic interrogation of the C. ales transcriptome. The results present the detailed outcomes of this interrogation and describe the transcriptional profile of this species as well as the relevant candidate genes, followed by a detailed discussion.

• Chapter 4 In the fourth chapter, the overall outcomes of the project are discussed in the context of the current body of work relating to the evolution of bivalve Hbs. Interrogation of the results is extended to encompass a discussion of the limitations of the study and potential applications of the outcomes reported in this thesis.

Chapter 2: Tissue specificity and neofunctionalisation of haemoglobin genes in Anadara trapezia

2.1 Background

Haemoglobins (Hbs) are among the best studied proteins in vertebrates but little is known about their distribution, function and evolution in invertebrate lineages (Alyakrinskaya, 2002; Bao et al., 2013a). A number of invertebrate lineages have independently evolved circulating Hbs as a mechanism to transport O2 for cellular respiration (Mangum et al., 1975). Invertebrate lineages that have independently evolved circulating Hbs include some arthropods, some annelids and the Arcid bivalves (Terwilliger, 1998; Weber & Vinogradov, 2001; Wajcman et al., 2009).

The Arcid bivalves are an interesting group of intertidal mollusc species known as blood clams. This group displays a diversity of Hb proteins with monomeric, dimeric, tetrameric and polymeric protein forms (Terwilliger, 1980; Royer et al., 1985; Weber & Vinogradov, 2001). In fact, the presence of multiple Hb proteins in circulating erythrocytes was first observed in the Australian blood clam, Anadara trapezia by Nicol & O’Gower (1967). This study revealed the presence of at least three distinct proteins in this species, two homodimers and one tetramer, while protein sequencing indicated that at least two duplication events produced these proteins (Suzuki et al., 1996). Having such a variety of Hb genes may allow these organisms to survive in the intertidal zone and endure long periods submerged or exposed to air. A clinal pattern of allele frequencies has also been observed for the homo-dimer Hb gene in A. trapezia, which the authors hypothesised was associated with changes in environmental variables including temperature and salinity (O’Gower & Nicol, 1968).

The large number and variation in blood clam Hb proteins is thought to be the result of repeated rounds of gene duplication, but the exact function of these diverse Hb proteins remains unclear. Current data suggest that blood clam Hbs evolved from an ancestral mollusc globin gene which, following the divergence of the blood clam ancestor, underwent repeated rounds of lineage-specific gene duplication in different blood clam lineages (Riggs, 1991). More recent transcriptome data for A. trapezia (Prentis & Pavasovic, 2014) have identified seven full-length and genetically distinct Hb genes in this single species and indicate that gene duplication may have been more extensive than previously reported.

The Hbs used in this study are HbI, a tetramer made up of α and β subunits encoded by the genes AG (alpha globin) and BG (beta globin); HbII, a homo-dimer encoded by the gene HB; a hetero-dimer Hb encoded by the gene HD; and a putative dimer Hb encoded by the gene 2D. These genes were used for this study as they encode structurally distinct proteins, including homo-dimeric and tetrameric Hb proteins. It is postulated that these different Hbs may show distinct patterns of tissue-specific or environment-specific gene expression. Consequently, differences in their patterns of expression were tested across five tissues that could be dissected distinctly and where Hbs were most likely to play a role (haemolymph, foot, gills, mantle and muscle) in four different experimental treatments to determine whether they showed tissue-specific or environment-specific gene expression.

2.2 Material and Methods

2.2.1 Sample acquisition and tissue dissection

Anadara trapezia specimens were collected from the intertidal zone in Wynnum, Queensland, Australia (27°26'08.7"S 153°10'25.0"E) and transferred to holding tanks at the QUT Marine Lab facility until required for experimentation. All animals were collected under the general fisheries permit number 166312. Ethics approval was not required since the specimens do not qualify as animals as described by the Queensland Animal Care and Protection Act 2001 (ACPA). Water conditions during the acclimation period were as follows: salinity 30-35 ppt, pH 7.9-8.1, temperature 20-25 °C and ammonia < 0.1 mg/L. A 12 h light/dark cycle was also maintained in the tanks and the animals were kept unfed for one week prior to the start of the experiment.

To best capture the variation in Hb expression, tissues were extracted from individual animals submerged in seawater, as well as from animals undergoing aerial exposure in a moist tank (as experienced during periods of regular and prolonged low tide). Consequently, 12 individuals were dissected at six hours: six control animals submerged in water and six animals from a moist tank (n = 6 per treatment). This was repeated at 12 hours (n = 6 for both treatments). In total, 24 A. trapezia individuals had the following five tissues removed and used for analysis: haemolymph, mantle, muscle, gills and foot (Figure 2.1). A few microlitres of fresh haemolymph from each animal were used to measure pH, pCO2 and sO2 % using Abbott i-STAT blood analysers. Haemolymph samples were spun at 10,000 rpm for 3 min, the supernatant was removed and RNA was extracted immediately. Tissue samples were snap frozen in liquid N2.
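The sampling design described above can be tallied with a short sketch. The labels used for conditions, time points and tissues are illustrative names chosen here, not identifiers from the thesis data files.

```python
from itertools import product

# Factors of the design as described in the text (labels are illustrative).
conditions = ["water", "air"]          # submerged control vs. moist tank
timepoints_h = [6, 12]                 # hours of exposure
replicates = range(1, 7)               # n = 6 animals per condition and time point
tissues = ["haemolymph", "mantle", "muscle", "gills", "foot"]

# One entry per animal, then one entry per dissected tissue sample.
animals = list(product(conditions, timepoints_h, replicates))
samples = [(animal, tissue) for animal in animals for tissue in tissues]

print(len(animals), "animals,", len(samples), "tissue samples")
```

Under these assumptions the design yields 24 animals and 120 tissue samples, before any samples are lost to low RNA yield.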

Figure 2.1 Photograph of an A. trapezia specimen illustrating the anatomical position of all five tissues used to assess tissue specific expression of Hb genes in this study.

2.2.2 RNA extraction and cDNA synthesis

RNA was extracted from each tissue separately using the Mollusc RNA extraction kit from Omega Biotek, following its longest protocol. Specifically, tissues were ground in liquid N2 and the fine powder obtained was transferred into 1.5 mL microcentrifuge tubes. These were rapidly brought to a fume hood, where 350 µL of the MRL buffer provided in the kit was added and the tubes were vortexed. Three hundred and fifty µL of a freshly made 24:1 solution of chloroform:isopropanol was then added to each tube, and the tubes were mixed by vortexing and centrifuged for 2 min at 10,000 x g. The upper aqueous phase of each tube was carefully removed and transferred to a clean 1.5 mL microcentrifuge tube. One volume of isopropanol was added to the aqueous phase in each tube, which was vortexed and immediately centrifuged for 2 min at 10,000 x g. The supernatant was carefully aspirated and the tubes were briefly inverted over a paper towel to remove any residual liquid. RB buffer (100 µL) provided in the kit was then added to each tube, which was vortexed to re-suspend the pellet and briefly incubated in a water bath at 65 °C to elute RNA. Another 350 µL of RB buffer and 100 µL of 100 % EtOH were added to each tube and vortexed. Each sample was then transferred in its entirety into a HiBind® RNA mini column (Omega Biotek) inside a 2 mL collection tube for washing and elution. Samples were first centrifuged for 1 min at 10,000 x g and the filtrates were discarded. Five hundred µL of RNA Wash Buffer I provided in the kit was added to each mini column and the tubes were centrifuged for 1 min at 10,000 x g. Filtrates were discarded, and 500 µL of Wash Buffer II provided in the kit was added to each mini column and centrifuged for 1 min at 10,000 x g. Filtrates were discarded and the wash step with Wash Buffer II was repeated. All samples were then centrifuged for 2 min at maximum speed to dry the columns. These were then transferred to clean 1.5 mL microcentrifuge tubes for elution. Fifty µL of DEPC (diethylpyrocarbonate) water provided in the kit was added to each tube, samples were centrifuged for 2 min at maximum speed and the eluted RNA samples were stored at -70 °C.

cDNA synthesis was performed using the SensiFAST™ cDNA Synthesis Kit (Bioline) following the manufacturer’s protocol. cDNA synthesis was performed in a total volume of 50 µL (20 µL water, 20 µL total RNA, 8 µL 5x TransAmp buffer and 2 µL reverse transcriptase) with the following cycling conditions: 1 cycle at 25 °C for 10 min, 1 cycle at 42 °C for 60 min, 1 cycle at 85 °C for 5 min and a hold step at 4 °C. The cDNA was used for all PCR reactions in downstream analysis.

2.2.3 RT-PCR (Real Time PCR)

Five previously identified candidate globin genes, putative dimer (2D), α chain (AG), homo-dimer (HB), hetero-dimer (HD) and β chain (BG), were amplified in each sample using primers designed to amplify the entire open reading frame (ORF) based on transcriptomic data from A. trapezia (Prentis & Pavasovic, 2014) (Table 2.1). Polymerase chain reaction amplification was then performed to determine the presence or absence of each of the five genes in the different tissue samples using the following conditions: initial denaturation for 3 min at 94 °C and 30 cycles of 30 sec at 94 °C, 30 sec at 52 °C and 1 min at 72 °C, followed by one cycle of 5 min at 72 °C and a hold step at 4 °C. Amplicons were separated by electrophoresis on a 1.5 % agarose gel and stained with GelRed™ (Biotium). The intensity of PCR products was determined using an image analysis program assisted by a gel documentation system (Chemidoc XRS, Bio-Rad). Some of these PCR products were also used for candidate gene validation through sequencing.
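As a rough check on the cycling programme above, the nominal run time can be added up from the stated step durations. This is only a sketch: the function name is hypothetical, and ramping between temperatures and the final 4 °C hold are ignored, so a real run would take longer.

```python
def programme_minutes(initial_s, cycles, step_seconds, final_s):
    """Nominal duration of a PCR programme in minutes (ramp times ignored)."""
    return (initial_s + cycles * sum(step_seconds) + final_s) / 60

# 3 min at 94 °C; 30 cycles of 30 s, 30 s and 60 s; 5 min at 72 °C.
rt_pcr_minutes = programme_minutes(180, 30, (30, 30, 60), 300)
print(rt_pcr_minutes)
```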

Table 2.1 List of primers designed to amplify products of candidate globin genes identified in the bivalve species A. trapezia using RT-PCR. All primers were designed to amplify the entire ORF of each gene with product sizes between 350 bp and 619 bp (Prentis & Pavasovic, 2014).

Gene                  Primer name   Primer sequence         Product size (bp)
AG (Alpha chain)      AGORF         CGAGTCCGATTTATTGCTGA    545
                      AGORR         TGTCAAACGAGACAGGTCCA
BG (Beta chain)       BGORF         TAATGCAGCCTGGACAACAG    501
                      BGORR         TTTGTTGTAGACGCCCTTTG
HB (Homo-dimer)       HBORF         TTGTCACCTCCAGTCTGTCG    618
                      HBORR         ACGCTACCCTGGTGATTGTC
2D (Putative dimer)   2DORF         CGAAACCCAAGTCCATCAT     506
                      2DORR         ATCCCTCACAGAGTGCTGCT
HD (Hetero-dimer)     HDORF         ATCTGACGGAAGCAGACG      515
                      HDORR         CGCGAGGTAGTGATATCGAA
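Basic properties of the primers in Table 2.1 can be checked with a few lines of code. The sketch below computes GC content and a melting temperature estimate from the Wallace rule (2 °C per A/T and 4 °C per G/C). The Wallace rule is a rough approximation chosen here for illustration; the thesis does not state which method or software was used for primer design.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)

def wallace_tm(primer):
    """Rough melting temperature (°C) by the Wallace rule: 2(A+T) + 4(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc

agorf = "CGAGTCCGATTTATTGCTGA"  # AGORF from Table 2.1
print(len(agorf), gc_fraction(agorf), wallace_tm(agorf))
```

For AGORF this gives a 20-mer with 45 % GC and an estimated melting temperature of 58 °C.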

2.2.4 Candidate gene validation

Amplicons from two samples were purified using the Bioline isolate PCR purification kit, followed by cloning using the pGEM®-T and pGEM®-T Easy (Promega) vector systems quick protocol. Duplicate samples for each gene in each biological replicate were selected and sequenced on the ABI Genetic Analyzer 3500 (ThermoFisher). Sequences obtained were imported into the Geneious® software version 8.1.6 for visualization and comparison to the de novo assembled contigs from the A. trapezia annotation report using a pairwise global alignment. Sequences were aligned to the contig they were designed from to determine percentage nucleotide similarity.
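The idea of scoring percentage similarity from a pairwise global alignment can be illustrated with a minimal Needleman-Wunsch sketch. This is not the Geneious implementation; the scoring values (match 1, mismatch -1, linear gap -1) and the function name are assumptions made for the example.

```python
def global_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Percent identity over one optimal Needleman-Wunsch global alignment."""
    n, m = len(a), len(b)
    if n == 0 and m == 0:
        return 100.0
    # Fill the dynamic-programming score matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back, counting identical columns and total alignment columns.
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        columns += 1
    return 100.0 * matches / columns

print(global_identity("ACGT", "ACGT"), global_identity("ACGT", "ACGA"))
```

A sequenced clone that matches its source contig at every aligned position scores 100 %, which is the criterion applied to the candidate genes here.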

2.2.5 RT-qPCR (Real Time quantitative PCR) for quantification of gene expression

RT-qPCR was performed on each sample using the One-Step Real Time PCR kit (Roche) and short primers designed based on past analysis of transcriptomic data from A. trapezia (Prentis & Pavasovic, 2014) (Table 2.2). These reactions were performed using the Lightcycler®480 real-time PCR system (Roche), measuring specific fluorescence at each cycle and quantifying the initial levels of mRNA for each Hb gene in each tissue. The PCR reactions were performed using the following conditions: 1 cycle of 5 min at 95 °C and 45 cycles of 10 sec at 95 °C, 10 sec at 60 °C and 10 sec at 72 °C. All quantitative PCR analyses were repeated in three technical replicates to determine the validity of results, along with negative controls for each gene in each sample and 18S as a housekeeping control gene.

Table 2.2 List of primers designed to amplify products of globin genes identified in the bivalve species A. trapezia using RT–qPCR and determine their expression levels in different tissues. All primers are designed to amplify products with sizes between 109 bp and 148 bp. (Prentis & Pavasovic, 2014).

Gene                  Primer name   Primer sequence         Product size (bp)
AG (Alpha chain)      AGF           TGATGACCCATCCAGATTGA    117
                      AGR           TTCAGGGTCTCCTTCATTGG
BG (Beta chain)       BGF           TGTCGAGAGCATCGATGAAG    109
                      BGR           AAATTCGCTCGTCTTGGAGA
HB (Homo-dimer)       HBF           GCTGTCAACCACATCACCAG    139
                      HBR           CCTGGACAACGCCTACAAGT
2D (Putative dimer)   2DF           ATCCGACCCATGGAATAACA    114
                      2DR           CGTGCAACATCTTCCAAGTC
HD (Hetero-dimer)     HDF           AAGGGACATGCCACAACATT    148
                      HDR           CTCCGAGTGCCTGAAATTCT
18S                   18F           CGGCGACGTATCTTTCAAAT    136
                      18R           CTTGGATGTGGTAGCCGTTT

2.2.6 RT-qPCR data analysis

RT-qPCR data were exported and viewed in the Lightcycler96 software associated with the instrument. Relative quantification analysis was performed using the analysis function of the Lightcycler96 software based on the ∆∆CT method. In this method, the housekeeping gene provides a basis for comparing levels of target sequences to levels of reference sequences, and the final result is expressed as a relative ratio. These relative ratio values were exported into Microsoft Excel version 14.0 to be converted to a format compatible with the program SPSS (Statistical Package for the Social Sciences) used for statistical analysis. In this program, significance of results was assessed through ANOVA testing followed by Tukey’s post-hoc test, and differences were considered significant at p < 0.05. Firstly, to assess levels of target genes in each tissue, ratio values were used as a dependent variable against tissue type for each gene. Secondly, significant differences in expression among conditions (water or air, 6 hours and 12 hours) were assessed in the same way, with ratio values as a dependent variable and condition as factor. Due to very low relative ratio values, the following formula was applied to each value before plotting in SPSS: (-1/log (relative ratio)).
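The relative ratio and the plotting transformation described above can be sketched as follows. This assumes perfect amplification efficiency (a doubling per cycle) and a base-10 logarithm; neither is stated in the text, and the instrument software may apply efficiency corrections that this sketch omits. The function names and the CT values in the example are hypothetical.

```python
import math

def relative_ratio(ct_target, ct_reference, calibrator_delta_ct=0.0):
    """Target/reference ratio as 2^-(delta CT), optionally relative to a calibrator."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** -(delta_ct - calibrator_delta_ct)

def plot_transform(ratio):
    """-1 / log10(ratio): spreads very small ratios over a readable axis (ratio != 1)."""
    return -1.0 / math.log10(ratio)

ratio = relative_ratio(ct_target=20.0, ct_reference=10.0)  # hypothetical CT values
print(ratio, plot_transform(ratio))
```

For ratios below 1 the logarithm is negative, so the transformed value is positive and increases as the ratio approaches 1, which preserves the ordering of expression levels on the plotted axis.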

2.3 Results

2.3.1 Haemolymph analysis

Haemolymph analysis results obtained from the i-STAT are summarised in Figure 2.2. Haemolymph pH measurements ranged from 6.7 to 7.2 across the treatment groups, with significant differences between samples exposed to air for 12 h and samples in water for 6 h and 12 h. Oxygen saturation measurements ranged from 24 to 89 %, with significant differences (p < 0.05) between the two time points of aerial exposure at 6 h and 12 h. Figure 2.2 shows a drop of nearly 30 % in oxygen bound to Hb molecules between samples kept in water for 6 h and samples exposed to air for 6 h. Measurements for pCO2 ranged from 8.9 to 20.9. Values in water 6 h, air 6 h and water 12 h were relatively consistent across treatments; however, measurements for samples exposed to air for 12 h fluctuated widely, with values between 10.4 and 20.9. Similar values were observed for samples in water for 6 h and in water for 12 h, but there was a slight increase in pCO2 in samples exposed to air for 6 h and a large increase in samples exposed to air for 12 h, showing a gradual build-up of CO2 in the haemolymph of those animals.

Figure 2.2 Summary of haemolymph analysis results. These data were obtained by testing a few microlitres of fresh haemolymph using an Abbott i-STAT blood analyser. This was done for all haemolymph samples across two experimental conditions and two time points: water 6 h, air 6 h, water 12 h and air 12 h. A) Variation of mean pH across conditions, B) variation of mean sO2 % across conditions, C) variation of mean partial pressure of CO2 across conditions. Significant differences were assessed through ANOVA testing using the program SPSS with pH values, percentage of O2 saturation values and pCO2 values as dependent variables respectively. Condition was used as a factor for each test and results were considered statistically significant at p < 0.05. Statistically different groups are represented by the symbols (*) and (•) in each graph and error bars represent 2 standard errors around the mean.
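The error bars described in the caption (2 standard errors around the mean) can be computed as in the sketch below. The pH values are made up for illustration and are not measurements from this study.

```python
import math
import statistics

def mean_and_error_bar(values):
    """Return the mean and the half-width of a 2-standard-error bar."""
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return mean, 2 * standard_error

example_ph = [7.0, 7.1, 6.9, 7.0, 7.2, 6.8]  # hypothetical values, n = 6
mean_ph, half_width = mean_and_error_bar(example_ph)
print(round(mean_ph, 2), round(half_width, 3))
```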

2.3.2 RT-PCR

RT-PCR results are summarised in Table 2.3. Results for the HD gene have been removed here as it could not be validated. In haemolymph, all genes were amplified in all samples. For foot samples, BG and HB were amplified in all samples, AG was amplified in 23 out of 24 samples and 2D was amplified in 14 out of 24 samples. Low amplification results for foot samples overall may be due to the low yields of RNA obtained for this tissue. In gill samples, 2D, AG and BG were amplified in at least 21 out of 24 samples and HB was amplified in 20 samples. In mantle, all four genes were amplified in at least 22 out of 24 samples. In muscle samples, AG, BG and HB were amplified in all 24 samples and 2D in 21 samples. Overall, all four genes tested here are found in all tissues, but their expression was significantly higher in the haemolymph, while in the other tissues tested their expression was minor. Inconsistencies with RT-qPCR results in section 2.3.4 may be due to RNA degradation in some of those samples or failed amplifications during RT-PCR.

Table 2.3 RT-PCR results summary for tissue specific expression. All five genes (2D, AG, BG, HB and HD) have been amplified in six biological replicates for each tissue and each condition. To summarise the results, in this table, a score is given out of 6 (total number of biological replicates) for each candidate gene amplified in each tissue and in each condition. An asterisk (*) represents weak bands obtained on gel electrophoresis for at least 2 biological replicates out of 6 in that group.

Tissue       Candidate globin gene   Water 6 h   Air 6 h   Water 12 h   Air 12 h
Haemolymph   2D                      6           6         6            6
             AG                      6           6         6            6
             BG                      6           6         6            6
             HB                      6           6         6            6
Foot         2D                      4           4         5            1
             AG                      6           5         6*           6
             BG                      6           6         6            6
             HB                      6           6         6            6
Gills        2D                      6           6         6            5
             AG                      5           6         5            5
             BG                      5           6         5            5
             HB                      5*          6*        4*           5*
Mantle       2D                      5           6*        5            6
             AG                      6           6         5            6
             BG                      6           6         5            6
             HB                      6           6         5            6
Muscle       2D                      6           6         6            4
             AG                      6           6         6            6
             BG                      6           6         6            6
             HB                      6           6         6            6
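The per-gene totals out of 24 quoted in the text follow from summing the four condition scores in each row of Table 2.3. The sketch below does this for the foot and gill rows only, for brevity; the dictionary layout is an illustrative choice.

```python
# Scores out of 6 per condition, in the order: water 6 h, air 6 h, water 12 h, air 12 h.
scores = {
    ("foot", "2D"): [4, 4, 5, 1],
    ("foot", "AG"): [6, 5, 6, 6],
    ("foot", "BG"): [6, 6, 6, 6],
    ("foot", "HB"): [6, 6, 6, 6],
    ("gills", "2D"): [6, 6, 6, 5],
    ("gills", "AG"): [5, 6, 5, 5],
    ("gills", "BG"): [5, 6, 5, 5],
    ("gills", "HB"): [5, 6, 4, 5],
}

# Total number of samples (out of 24) in which each gene was amplified.
totals = {key: sum(values) for key, values in scores.items()}
for (tissue, gene), total in totals.items():
    print(f"{tissue:6s} {gene}: {total}/24")
```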

2.3.3 Candidate gene validation

Figure 2.3 shows PCR products for all candidate genes used in sequencing for validation. Following Sanger sequencing, alignments of amplified sequences showed 100 % similarity with the original candidate ORFs for 2D, AG, BG and HB, while HD did not show the correct sequence and therefore was not validated.

Figure 2.3. PCR products amplified from two mantle samples to validate candidate Hb genes. Amplified PCR products obtained here were purified using the Bioline isolate PCR purification kit followed by cloning using the Promega pGEM-T and pGEM-T easy vector systems quick protocol. Samples were then sequenced on the ABI Genetic Analyzer 3500 in duplicate. Each gel of this figure represents one mantle sample. For both gels A (sample 1) and B (sample 2): lane 1 contains 100 bp Hyperladder, lane 2 through to 7 contain amplified products for 2D, AG, BG, HB, HD and 18S genes respectively.

2.3.4 RT-qPCR for quantification of gene expression

RT-qPCR results revealed that 2D, AG and BG had significantly higher expression levels (p < 0.05) in haemolymph compared to other tissues (Figure 2.4). In the case of HB, there were no significant differences (p > 0.05) in expression levels across tissues, with overall lower expression than the other genes tested. These results are consistent with amplification curves for target genes and for the housekeeping gene 18S as shown in Figure 2.5. In haemolymph, fluorescence is first detected in cycle 12 and this first cluster represents amplification curves for 2D, AG and BG. The second cluster of amplification curves begins at cycle 32 and represents HB expression. This difference is also observed across all tissues in relative expression of 2D, AG and BG compared to HB as seen in Figure 2.4. Amplification curves for foot, mantle and muscle follow a similar pattern, with fluorescence detected at cycle 20 for 2D, AG, BG (first cluster) and at cycle 26 for HB (second cluster). The housekeeping gene 18S used in all PCR runs was consistently detected at cycle 8 across all tissues. Relative expression levels showed no significant differences between air or water exposed samples (Figure 2.6).
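The gap between the two clusters of amplification curves can be turned into an approximate fold difference in starting template. The sketch below treats the first-detection cycles quoted above as stand-ins for CT values and assumes a perfect doubling per cycle, so the results indicate an order of magnitude only. The function name is hypothetical.

```python
def fold_difference(cycle_early, cycle_late):
    """Approximate fold excess of the earlier-detected template, assuming doubling per cycle."""
    return 2.0 ** (cycle_late - cycle_early)

# Haemolymph: 2D/AG/BG cluster at cycle 12, HB cluster at cycle 32.
haemolymph_fold = fold_difference(12, 32)
# Foot, mantle and muscle: 2D/AG/BG at cycle 20, HB at cycle 26.
other_tissue_fold = fold_difference(20, 26)
print(haemolymph_fold, other_tissue_fold)
```

Under these assumptions a 20-cycle gap in haemolymph corresponds to roughly a million-fold difference between the first cluster and HB, against about 64-fold in foot, mantle and muscle.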

Figure 2.4. Relative expression ratios of candidate globin genes amplified using RT-qPCR. Candidate genes amplified in each tissue from the bivalve A. trapezia are shown in four panels: relative expression of 2D, AG, BG and HB across tissues. Relative quantification analysis was performed using the ∆∆CT method. Relative expression is expressed as a ratio of levels of target sequences to levels of reference sequences, with 18S used here as a housekeeping gene. Significant differences were assessed through ANOVA testing using the program SPSS and differences were considered statistically significant at p < 0.05. Significantly different groups are shown here with an asterisk (*). Ratio values were used as a dependent variable against tissue type for each gene. Error bars represent 2 standard errors around the mean.
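The F statistic underlying a one-way ANOVA, as used for the comparisons in this figure, can be computed as in the sketch below. The analysis in this study was run in SPSS; this sketch covers only the F ratio, leaving out the p-value and Tukey's post-hoc test, and the example groups are made-up numbers rather than expression data.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA: between-group over within-group mean square."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

example_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical values
f_statistic = one_way_anova_f(example_groups)
print(f_statistic)
```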

Figure 2.5. Examples of amplification curves obtained for target genes (2D, AG, BG, HB) and housekeeping gene 18S in each tissue type (haemolymph, foot, gills, mantle and muscle), plotted as fluorescence against cycle number. Target amplification curves are represented in orange (left), negative controls in green (left) and reference amplification curves in blue (right). RT-qPCR was performed using a Lightcycler® measuring specific fluorescence at each cycle. All quantitative PCR analyses were repeated in three technical replicates along with negative controls and 18S as a housekeeping gene.

Figure 2.6. Relative expression of candidate globin genes in A. trapezia amplified using RT-qPCR. The five panels show relative expression of all genes (2D, AG, BG and HB) across conditions (water 6 h, air 6 h, water 12 h and air 12 h) in haemolymph, foot, gills, mantle and muscle. Relative quantification analysis was performed using the ∆∆CT method. Relative expression is shown as a ratio of levels of target sequences to 18S sequences, used here as a housekeeping gene. Significant differences (p < 0.05) were assessed through ANOVA testing using SPSS with ratio values as dependent variable and condition as factor in each tissue. No statistical differences were found here. Error bars represent 2 standard errors around the mean.

2.4 Discussion

In this study, the expression of four Hb encoding genes (2D, AG, BG and HB) was examined across different tissues and environmental conditions in A. trapezia, in order to determine whether these genes show patterns of expression consistent with neofunctionalisation. Based on these findings, three Hb encoding genes appear to be predominantly expressed in the haemolymph. This pattern was consistent across the two environmental conditions tested (submersion and aerial exposure). Physico-chemical properties of the haemolymph, however, indicated that major physiological changes occur in the haemolymph without major changes in expression of Hb genes.

2.4.1 Haemolymph characteristics

Haemolymph analysis was performed in order to gain insights into the state of Hb molecules and gas exchange throughout the experiment. pH values remained constant in animals kept in water for 6 h and 12 h. However, animals exposed to air for 6 h (pH = 7.2) showed a slight drop in pH and a significant drop was observed in animals exposed to air for 12 h (pH = 6.9). This finding is consistent with observations in another bivalve species (Crassostrea gigas) (Michaelidis et al., 2005). This is an interesting physiological response as C. gigas does not have a circulating respiratory pigment, which suggests that these changes may be independent of Hb in the haemolymph. The drop in pH and increase in pCO2 in animals exposed to air for 6 h were not statistically significant (p > 0.05); this may be because A. trapezia is typically exposed to six-hour (high to low) tidal cycles. Consequently, the animals may still be consuming their oxygen stores and relying on aerobic metabolism (Booth et al., 1984).

The observed significant decline in pH is consistent with the increase in pCO2 in animals under prolonged aerial exposure (12 h), and may be explained by a shift of the organism from aerobic metabolism to anaerobic metabolism, an adaptation to hypoxia widespread among bivalves (Brooks et al., 1991; De Zwann et al., 1993). This is well supported in the literature on marine bivalves, where a large number of studies have shown that hypoxia causes pCO2 to rise in the haemolymph due to the accumulation of CO2 as an anaerobic by-product, leading to acidification of body fluids and a drop in pH levels (Crenshaw & Neff, 1969; Jokumsen & Fyhn, 1982; De Zwann et al., 1983; Booth et al., 1984; Michaelidis et al., 2005; Shumway & Parsons, 2011). This gradual switch from aerobic to partial or complete anaerobic metabolism upon prolonged air exposure is documented in bivalves such as the deep-sea clam (Calyptogena magnifica) (Hourdez & Lallier, 2006) and the sea mussel (Mytilus edulis). Another study on the intertidal lugworm Arenicola marina (Booth et al., 1984; Toulmond & Tchernigovtzeff, 1984; Wang & Widdows, 1991), which is known for its ability to rapidly acclimate to restricted O2 availability in its habitat, reinforces this finding and reports that acidosis was also coupled with a rise in pCO2 as a result of decreased gas exchange during low tide (Toulmond & Tchernigovtzeff, 1984). Moreover, from the results of this study, globins present in the haemolymph of A. trapezia do not seem to show a Bohr effect, unlike Hbs found in teleost fish species (Souza & Bonilla-Rodiguez, 2007; Witeska, 2013). The saturation of oxygen appears independent of pH, as a significant drop in pH is coupled with a significant increase in sO2 % for animals exposed to air for 12 h. This is consistent with previous studies on Hb containing bivalves such as Barbatia reeveeana (Grinich & Terwilliger, 1980), Cardita floridana (Manwell, 1963) and Anadara broughtonii (Furuta & Kajita, 1983), for which no Bohr effect was observed.

2.4.2 Tissue specific expression and neofunctionalisation

It is now widely accepted that invertebrate Hbs are understudied, particularly in reference to their evolution, distribution and functions (Alyakrinskaya, 2002; Bao et al., 2013a). In an attempt to address this knowledge gap, a quantitative investigation of Hb encoding genes was performed across five different tissues in A. trapezia individuals undergoing aerial exposure. A. trapezia is particularly suited to examine neofunctionalisation of Hb genes as it expresses multiple different Hb genes (Como & Thompson, 1980b) and is commonly under abiotic stress due to aerial exposure during tidal cycles (Davenport & Wong, 1986). A. trapezia, like most bivalves, possesses an open circulatory system where surface areas for O2 uptake are composed of a pair of gills and mantle tissue anchored to the shell (Herreid, 1980).

While strong patterns of differential expression across tissues were observed, there was no evidence of changes in expression across environmental conditions. This may indicate that these genes have similar functions under environmental stress. Alternatively, because the aerial exposure in this experiment mimics a tidal cycle at 6 h and a prolonged period at 12 h, the absence of significant differences (p > 0.05) in expression levels of Hb encoding genes during air exposure may indicate that multiple Hbs maximise O2 binding and supply during times of hypoxia (Projecto-Garcia et al., 2015).

All Hb encoding genes were expressed across all tissues, but the genes AG, BG and 2D had significantly higher expression in haemolymph than in foot, gills, mantle and muscle (p < 0.05). Other authors have used patterns of tissue specific expression of duplicated genes to infer that they have undergone neofunctionalisation (e.g., Na/K-ATPase in the electric organs of fish) (Norman et al., 2011; Gallant et al., 2014). Consequently, results in this study would indicate that at least three of the candidate genes investigated (AG, BG and 2D) have potentially undergone neofunctionalisation based on their expression patterns. Although it may be expected that Hb genes have a higher expression in haemolymph, most bivalve lineages closely related to Arcidae do not possess circulating erythrocytes (Sullivan, 1961; Read, 1966). This indicates that the expression patterns of these genes were altered following duplication from an ancestral globin gene. This may have been influenced by the challenging intertidal environment of these bivalves, where they are regularly exposed to changing oxygen availability.

While Hb genes were expressed across multiple tissues, the expression levels may have been somewhat influenced by the permeability of tissues to haemolymph. For example, the gills are highly permeable to haemolymph because they are made up of folded and ciliated epithelium to maximise surface area for gas exchange (Widdows et al., 1979). Similarly, mantle tissue is rich in capillaries where oxygenated haemolymph is distributed to the rest of the tissues for O2 delivery (Weber, 1980). Muscle and foot tissue also depend on haemolymph for O2 supply and consequently the lower expression patterns seen across all tissues apart from haemolymph may be the result of the residual haemolymph content within them.

2.5 Conclusion

Overall, air exposure for 6 hours, mimicking the duration of a low tide, does not change the haemolymph pH in A. trapezia, whereas prolonged air exposure causes acidosis in the haemolymph. This drop in pH on prolonged air exposure coincides with a rise in pCO2 and therefore indicates that the organism may be switching to anaerobic metabolism in order to deal with this stress. In terms of expression of Hb encoding genes, the multiple Hbs expressed in this species are present in all tissues but their expression is significantly higher in haemolymph than in foot, gills, mantle and muscle. For at least three of these Hb encoding genes, this is consistent with the theory that they have undergone neofunctionalisation following gene duplication. In this case, the unfavourable O2 conditions of the intertidal environment these bivalves live in may have been a driver for this selective adaptation.

Chapter 3: Functional annotation of the Ctenoides ales transcriptome

3.1 Background

Respiratory pigments, or oxygen-transport proteins, within the bivalve lineage show a patchy distribution, as well as extensive variation in both form and function across the species in which they are expressed (Terwilliger, 1980; Booth et al., 1984; Morse et al., 1986; Mangum et al., 1987; Weber & Vinogradov, 2001). The majority of bivalves do not have a functional respiratory pigment and rely on filtration of highly oxygenated water over the gills to meet their respiratory demands (Angelini et al., 1998). Those bivalve species that do possess a respiratory pigment usually express haemocyanin (Hc) (Terwilliger et al., 1988), a complex copper-based protein that reversibly binds O2. In rare cases, however, bivalves utilise a type of haemoglobin (Hb), as seen in the orders Arcoida, Carditoida, Solemyoida and Veneroida (Manwell, 1963; Terwilliger et al., 1978; Dando et al., 1985; Doeller et al., 1988; Suzuki et al., 2000). Based on the current phylogenetic reconstruction of the bivalve species by González et al. (2015), it appears that the earliest lineages possess Hc as the primary respiratory pigment. Based on this observation, and the fact that almost all gastropods (sister lineage to bivalves) possess Hc (Linzen et al., 1985), it is likely that Hc is the ancestral oxygen-transport protein in Bivalvia. Unlike Hc, Hb is thought to have evolved from ancestral globin genes on multiple occasions within bivalves.

Currently, four independent origins of Hbs have been hypothesised in bivalve molluscs. The first instance is thought to have occurred in the primitive bivalve lineage containing Solemya velum in the order Solemyoida. In addition to circulating , this species contains several tissue Hbs, predominantly localised in the gills (Dando et al., 1985; Doeller et al., 1988). The second independent origin of Hbs was reported in the bivalve family Arcidae in the order Arcoida, where evidence suggests that all species investigated contain multiple Hbs in circulating red blood cells (Mangum, 1997). The diversity of Hbs in this group consists of monomeric, homo-dimeric, hetero-dimeric, tetrameric and polymeric proteins (Terwilliger, 1980). The third reported origin of bivalve Hb is found in the deep-sea clam genus Calyptogena in the order Veneroida. In this genus, species such as C. kaikoi contain two types of homo-dimeric Hb proteins, where one is found in erythrocytes and the other is restricted to adductor muscle tissue (Suzuki et al., 2000). The fourth instance is an extracellular Hb found in the heterodont clams Cardita borealis and Cardita floridana in the order Carditoida (Manwell, 1963; Terwilliger et al., 1978). While these four cases provide compelling evidence for the independent evolution of Hb in multiple bivalve lineages, it is not known whether Hb has evolved in other lineages where oxygen-transport proteins have not been examined.

Interestingly, there is some sparse evidence to suggest that members of the family Limidae may also possess Hb proteins. This is largely restricted to one textbook publication (Rawat, 2010) with no molecular or biochemical studies having been reported. It is plausible to hypothesise that Limidae indeed possess Hb proteins based on their high metabolic rate associated with the ability to swim and the red pigmentation of their tissues (Harper & Skelton, 1993; Mikkelsen & Bieler, 2003). If present, Hb could represent an important physiological advantage for a more efficient O2 delivery system to respiring tissues during swimming. Consequently, if Hb were found in Limidae, then this would constitute a fifth independent origin of Hb in bivalve molluscs.

An efficient approach to determine if Hbs are present in a given species is to sequence the entire expressed portion of its genome, as has been done successfully in the blood clams Tegillarca granosa (Bao & Lin, 2010) and A. trapezia (Prentis & Pavasovic, 2014). Research into the presence of Hb in the family Limidae has lagged principally due to a lack of genomic and proteomic resources developed for species from this bivalve family. Here, the transcriptome sequencing, assembly and annotation are described for C. ales, a non-model bivalve mollusc from the family Limidae for which no transcriptome data currently exist. The aim of the present study is to test the hypothesis that Hb genes that encode a functional Hb protein are present in this species. Their presence would likely constitute a fifth independent origin of Hb in bivalves, as these proteins have only been found in the bivalve orders Arcoida, Carditoida, Solemyoida and Veneroida. The newly generated transcriptome will also greatly increase the genomic resources for C. ales and provide an initial candidate gene list for further genetic research.


3.2 Materials and Methods

3.2.1 Sample collection

C. ales specimens were obtained from Cairns Marine Pty Ltd, Australia and kept in holding tanks at the Marine Facility (QUT) until required for experimentation. This experiment required no ethics approval as the specimens do not qualify as animals as described by the Queensland Animal Care and Protection Act 2001 (ACPA). Water conditions were salinity at 30 – 35 ppt, pH 7.9 – 8.1, temperature 20 – 25 °C and ammonia levels < 0.1 mg/L. A 12 h light/dark cycle was maintained. Animals were fed every three days with phyto-blast containing a wide range of aquaculture marine phytoplankton (Continuum Aquatics).

3.2.2 RNA extraction, library preparation and sequencing

One individual tissue sample was used and divided into four for RNA extraction, as per a previously optimised TRIzol/chloroform RNA extraction protocol for mollusc species (Prentis & Pavasovic, 2014). Specifically, the protocol consisted of the following steps: 1.0 mL of TRIzol reagent was added to the 1.5 mL microcentrifuge tubes containing tissue and each tube was vortexed to mix for 5 min. The samples were then incubated at room temperature for 2 min. Chloroform (0.2 mL) was added to each tube and vortexed for 20 – 30 sec, followed by a 5 min incubation at room temperature. Samples were spun at 12,000 g for 15 min and the clear aqueous phase in each tube was collected carefully without disturbing the other layers, and placed in four new 1.5 mL microcentrifuge tubes. Following this, isopropanol (0.5 mL) was added to each tube of newly collected solution and vortexed for approximately 10 sec. Sodium chloride (1.2 M, 100 µL) was added to each tube, followed by a 10 min incubation at room temperature. Samples were then spun at 12,000 g for 15 min and supernatant removed from each tube without disturbing the pellets. Ethanol (70 %, 200 µL) was then added to each tube to wash the pellets, followed by centrifugation at 12,000 g for 10 min and removal of supernatant. The pellets were then allowed to air dry for 10 min before being eluted in 50 µL of RNase-free water. The RNA was then visualised on a 1.5 % agarose gel. The quantity and integrity of total RNA were validated on a Bioanalyzer 2100 RNA Nano chip (Agilent Technologies).

Sequencing libraries were prepared from 20 µg of total RNA using a TruSeq® stranded RNA library prep kit (Illumina), as per manufacturer’s low sample (LS) protocol. This consisted of selective purification of mRNA using oligo(dT) magnetic beads and subsequent selection of 200 – 700 bp fragments. Specifically, total RNA was diluted with nuclease-free ultra-pure water to a final volume of 50 µL in a 96-well 0.3 mL PCR plate provided in the kit and labelled with an RNA Purification Beads (RPB) code. After being brought to room temperature, the tube containing the RNA Purification Beads was vortexed to resuspend the oligo-dT beads. 50 µL of RNA Purification Beads was added to each well of the RBP plate to bind the polyA RNA to the oligo-dT magnetic beads. Each well was then mixed thoroughly by pipetting the entire volume up and down six times, followed by sealing of the plate with the Microseal adhesive seal provided. The RBP plate was then placed in a thermal cycler on a cycle of 65 °C for 5 min and 4 °C hold to denature the RNA and facilitate binding of the polyA RNA to the beads. After being taken out of the cycler, the RBP plate was incubated at room temperature for 5 min to allow the RNA to bind to the beads. The adhesive seal was then removed and the plate was placed on the magnetic stand at room temperature for 5 min to separate the polyA RNA bound beads from the solution. Supernatant from each well was then removed and discarded, and the plate was taken off the magnetic stand. Beads were washed by adding 200 µL of the Bead Washing Buffer provided to each well to remove unbound RNA. Wells were mixed thoroughly by pipetting up and down six times. The RBP plate was again placed on the magnetic stand for 5 min at room temperature. After being thawed, the Elution Buffer solution provided was centrifuged at 6,000 g for 5 sec. Supernatant from each well was removed after incubation and discarded as it contained most of the ribosomal RNA and other non-messenger RNA. The RBP plate was removed from the magnetic stand and 50 µL of Elution Buffer was added to each well and mixed thoroughly by pipetting up and down six times.
The RBP plate was sealed and placed in the thermal cycler on a cycle of 80 °C for 2 min and 25 °C hold to elute the mRNA from the beads. This step releases both the mRNA and any contaminant rRNA that may have bound to the beads non-specifically. The plate was removed from the thermal cycler and unsealed. After thawing, the Bead Binding Buffer provided was centrifuged at 600 g for 5 sec. Bead Binding Buffer (50 µL) was added to each well of the RBP plate to allow mRNA to specifically rebind the beads while reducing the amount of rRNA bound non-specifically. Each well was mixed. The RBP plate was then incubated at room temperature for 5 min and then placed on the magnetic stand, where supernatant was removed and discarded. The plate was removed from the stand and the beads were washed by adding 200 µL of Bead Washing Buffer to each well and mixing. The plate was placed on the magnetic stand at room temperature for 5 min and supernatant was removed and discarded to avoid any residual rRNA and other contaminants. The plate was removed from the magnetic stand and 19.5 µL of Fragment,
Prime, Finish Mix provided were added and mixed thoroughly with the RNA to serve as a first strand cDNA synthesis reaction buffer as they contain random hexamers. The plate was sealed and placed in the thermal cycler on a cycle of 80 °C for 8 min and 4 °C hold. The RBP plate was then removed from the thermal cycler and centrifuged briefly. Complementary DNA synthesis was then carried out on enriched mRNA samples as follows: the RBP plate was placed on the magnetic stand at room temperature for 5 min and 17 µL of supernatant in each well was transferred to the corresponding well on a new 0.3 mL PCR plate labelled with a cDNA Plate (CDP) barcode. After thawing, the First Strand Synthesis Act D mix was centrifuged at 6,000 g for 5 sec. SuperScript II (50 µL) was added to the First Strand Synthesis Act D tube. Eight µL of this mix was added to each well of the CDP plate and mixed thoroughly. The CDP plate was sealed and centrifuged briefly before being transferred to the thermal cycler on a cycle of 25 °C for 10 min, 42 °C for 15 min, 70 °C for 15 min and hold at 4 °C. The CDP plate was then taken out of the thermal cycler and unsealed. Five µL of Resuspension Buffer was added to each well followed by 20 µL of the thawed and centrifuged Second Strand Marking Master Mix provided. Each well was thoroughly mixed and the plate sealed. The CDP plate was placed in the thermal cycler and incubated at 16 °C for one hour. It was then removed from the cycler and unsealed. Well-mixed AMPure XP beads (90 µL) were added to each well, now containing 50 µL of double stranded cDNA, and mixed by pipetting the entire volume up and down ten times. The CDP plate was then incubated at room temperature for 15 min before being placed on the magnetic stand for 5 min. 135 µL of supernatant was removed and discarded from each well.
Freshly made 80 % EtOH (200 µL) was then added to each well without disturbing the beads and the plate was incubated at room temperature for 30 sec before removing and discarding all of the supernatant from each well. This EtOH wash step was repeated one more time before leaving the CDP plate to dry at room temperature for 15 min. The CDP plate was then removed from the stand and 17.5 µL of thawed and centrifuged Resuspension Buffer provided was added to each well and mixed. The plate was incubated at room temperature for 2 min and placed on the magnetic stand for 5 min. Fifteen µL of supernatant from each well was then transferred from the CDP plate to a new 96 well 0.3 mL PCR plate labelled with an Adapter Ligation Plate (ALP) barcode.

Purified fragments were poly-A tailed, ligated to Illumina paired-end adapters, and size selected by gel purification. Finally, PCR was used to enrich the DNA fragments with the following conditions: one cycle at 98 °C for 30 sec, 15 cycles of 98 °C for 10 sec, 60 °C for 30 sec, 72 °C for 30 sec, one cycle at 72 °C for 5 min and hold at 4 °C. The final cDNA library was sequenced on an Illumina NextSeq500 using 150 bp paired-end chemistry.

3.2.3 Transcriptome assembly and validation

Following sequencing, the libraries were concatenated into two separate files based on read direction (--left all_R1_reads.fastq and --right all_R2_reads.fastq). Quality control was then performed on both files to validate the quality of the raw reads prior to assembly. This was done using the FastQC program, which provides summary statistics for read quality, length and GC content (Andrews, 2010). Low quality reads (sequences with > 1 % N bases and/or > 1 % of bases below Q20) were discarded, and only the remaining reads were used for downstream analysis. These reads were assembled into contigs ≥ 200 bp using the Trinity short read de novo assembler (version 2.0.6) (Grabherr et al., 2011). Assembled contiguous sequences (contigs) were filtered for redundancy and chimeric transcripts using the program CD-HIT (Cluster Database at High Identity with Tolerance), version 4.6.1 (Huang et al., 2010), and sequences with > 95 % similarity were clustered into a single contig. At this point, the C. ales transcriptome assembly was evaluated for completeness using CEGMA (Core Eukaryotic Genes Mapping Approach) to determine the presence of a group of 248 highly conserved core eukaryotic genes.
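The read-level thresholds described above can be illustrated with a minimal Python sketch. The text does not state which tool applied the filter, so the function names, the Phred+33 quality encoding and the choice to treat exactly 1 % as acceptable are all assumptions made for illustration only.

```python
def phred33_to_scores(quality_string):
    """Decode a FASTQ quality string, assuming the Phred+33 offset."""
    return [ord(ch) - 33 for ch in quality_string]


def passes_quality_filter(sequence, qualities, min_q=20, max_fraction=0.01):
    """Return True if a read satisfies both thresholds (hypothetical helper).

    A read is rejected when more than `max_fraction` of its bases are N,
    or when more than `max_fraction` of its bases score below `min_q`.
    """
    if not sequence or len(sequence) != len(qualities):
        return False
    n = len(sequence)
    n_fraction = sequence.upper().count("N") / n
    low_q_fraction = sum(q < min_q for q in qualities) / n
    return n_fraction <= max_fraction and low_q_fraction <= max_fraction


# Usage: a 200-base read with uniformly high quality is retained.
read = "ACGT" * 50
print(passes_quality_filter(read, phred33_to_scores("I" * 200)))
```

In practice a dedicated read-trimming tool would normally do this work; the sketch only makes the two thresholds explicit.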

3.2.4 Functional annotation of transcripts and mapping

The transcriptome assembly was functionally annotated using sequence based searches implemented in the Trinotate annotation pipeline (Haas et al., 2013), which is a comprehensive annotation suite designed specifically for de novo assembled transcriptomes of model or non-model organisms. Trinotate generates functional annotation based on homology searches to known sequence data, protein domain identification and protein signal peptide/transmembrane domain predictions before integrating the analysis of transcripts into a SQLite database to obtain an annotation report as described below and as summarised in Figure 3.1. Firstly, contigs were used as BLASTx queries against the TrEMBL and Swiss-Prot databases with a stringency E-value of 1 × 10⁻⁶. Both of these databases are collections of core data on proteins such as the amino acid sequence, the protein name or description, the taxonomic data, the citation information and as much annotation information as possible. They are used concurrently as TrEMBL is computationally analysed while Swiss-Prot is manually annotated based on literature and curated computational analysis. Gene ontology terms were assigned to contigs that received BLAST hits that also contained functional information. In order to characterise the transcriptomic data obtained, these terms were analysed using WEGO (Web Gene Ontology Annotation Plot) (Ye et al., 2006). TransDecoder v.2.0.1 (Haas et al., 2013) was used to generate a predicted proteome for the transcriptome assembly based on the longest open reading frame (ORF) in each transcript. The predicted proteome was then used as a BLASTp query against the TrEMBL and Swiss-Prot databases. SignalP 4.1 was used to predict the presence and location of signal peptide cleavage sites in coding sequences. Pfam domains were assigned using the Hidden Markov Model-based sequence alignment tool (HMMER) to determine the presence and position of domains within each predicted protein. The analyses obtained for all transcripts were then uploaded to the SQLite database and an annotation report was generated. Trinity output consisted of sequence clusters identified as genes through BLASTx searches against the TrEMBL and Swiss-Prot databases.


Figure 3.1. Workflow overview of functional annotation of the C. ales transcriptome using the Trinotate annotation pipeline. This is a comprehensive annotation suite based on homology searches to known sequence data. Contigs were first used as BLASTx queries against the TrEMBL and Swiss-Prot databases (stringency E-value of 1 × 10⁻⁶). TransDecoder was used to generate a predicted proteome, which was then used as a BLASTp query against the TrEMBL and Swiss-Prot databases. SignalP was used to predict the presence and location of signal peptides and Pfam to determine the presence and position of protein domains. All results were uploaded to the SQLite database and an annotation report was generated. Adapted from van der Burg et al. (2016).

3.2.5 Comparative transcriptomics

In order to further validate the newly generated transcriptome, orthologous gene clusters for C. ales, A. trapezia and the two model species Lottia gigantea and Crassostrea gigas were compared and annotated using the program OrthoVenn (Wang et al., 2015). The predicted proteomes from the genomes of L. gigantea and C. gigas were used and compared to the predicted proteomes from the whole organism transcriptomes of A. trapezia and C. ales. Ortholog groups were identified by an all-against-all or reciprocal BLASTp alignment.
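The idea behind a reciprocal BLASTp comparison can be sketched in Python for a single pair of species. This is a simplified pairwise illustration only: OrthoVenn's own clustering across four proteomes is more involved, and the tuple layout (query, subject, bit score), the function names and the sequence identifiers below are hypothetical.

```python
def best_hits(blast_rows):
    """Map each query to its highest-scoring subject.

    `blast_rows` is an iterable of (query, subject, bitscore) tuples,
    a hypothetical reduction of tabular BLASTp output.
    """
    best = {}
    for query, subject, bitscore in blast_rows:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Usage with made-up identifiers: a2 hits b2, but b2 prefers a1, so only
# the a1/b1 and a3/b3 pairs are reciprocal.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0), ("a2", "b2", 90.0), ("a3", "b3", 50.0)]
b_vs_a = [("b1", "a1", 198.0), ("b2", "a1", 140.0), ("b3", "a3", 55.0)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```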


3.2.6 Candidate genes identification

The annotated transcriptome was interrogated for candidate globin genes, here defined as contigs that possess a characteristic globin fold sequence and globin domain as identified in the Trinotate pipeline through BLASTp and Pfam. Candidate contigs identified were analysed through ORF finder (Stothard, 2000) to determine open reading frames. Coding regions were translated into protein sequences using the ExPASy translation tool (Gasteiger et al., 2003) and used as BLASTp queries against the NR database at NCBI. Relevant conserved domains were identified using SMART searches for homologous Pfam domains (Finn et al., 2014), signal peptides and internal repeats.
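The two steps above (locating an ORF, then translating it) can be sketched in a few lines of Python. This is not the ORF finder or ExPASy implementation: it assumes the standard genetic code, ATG start codons only, and a six-frame search, and the function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_orf(seq):
    """Return the longest ATG-to-stop ORF (stop included) over six frames."""
    best = ""
    seq = seq.upper()
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and CODON_TABLE.get(codon) == "*":
                    if i + 3 - start > len(best):
                        best = strand[start:i + 3]
                    start = None
    return best


def translate(orf):
    """Translate an ORF to protein, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        amino_acid = CODON_TABLE.get(orf[i:i + 3], "X")  # X for ambiguous codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Usage on a toy sequence (not a C. ales contig).
orf = longest_orf("CCATGAAATTTGGGTAACC")
print(orf, translate(orf))
```

Because the stop codon is counted in the ORF length but not translated, an ORF of n bp encodes n/3 − 1 residues under these assumptions.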

3.2.7 Primer design and candidate gene validation

Primers for the candidate sequences were designed using Primer3 software to amplify the entire ORFs (open reading frames) of the candidate genes: contigs c97022_g1_i3 (CalesGl1), c89016_g1_i4 (CalesGl2) and c97010_g2_i1 (CalesGl3), and used for validation. This was done with the following PCR protocol: 12.5 µL of MyFi 2x, 8.5 µL of H2O, 1 µL of forward primer, 1 µL of reverse primer, 2 µL of MgCl2 and 1 µL of template to a total volume of 26 µL. Different conventional PCR amplification conditions were used for each candidate gene. For contig c97022_g1_i3: one cycle at 94 °C for 3 min, 30 cycles at 94 °C 30 sec, 54 °C 30 sec, 72 °C 1 min, one cycle at 72 °C for 5 min and a 4 °C hold. For contig c89016_g1_i4: one cycle at 94 °C for 3 min, 30 cycles at 94 °C 30 sec, 50 °C 45 sec, 72 °C 1 min, one cycle at 72 °C for 5 min and a 4 °C hold. For contig c97010_g2_i1: one cycle at 94 °C for 3 min, 30 cycles at 94 °C 30 sec, 48 °C 1 min, 72 °C 1 min, one cycle at 72 °C for 5 min and a 4 °C hold. Amplified products were size-separated by electrophoresis on 1.5 % agarose gels stained with GelRed™ (Biotium) alongside a 100 bp Hyperladder (Bioline). The intensity of PCR products was determined using an image analysis program assisted by a gel documentation system (Chemidoc XRS, Bio-Rad). PCR products were then purified using an Ethanol/EDTA precipitation protocol as follows: 5 µL of 125 mM EDTA disodium salt (pH 8.0) was added to each PCR tube, vortexed and spun briefly. Ethanol (100 %, 60 µL) was added and tubes were left to incubate at room temperature for 15 min. Tubes were then spun in a centrifuge at 13,000 g for 20 min and the supernatant was carefully aspirated and discarded. Freshly made 80 % ethanol (350 µL) was then added and tubes were spun at 13,000 g for 5 min before the supernatant was aspirated and discarded. Pellets in the tubes were then left to dry at room temperature for 1 hr covered with aluminium foil. Samples were sequenced on the ABI Genetic Analyzer 3500 (ThermoFisher). Sequence chromatograms were visualised using Geneious® software version 8.1.6 and sequences were re-aligned to the candidate ORFs from which the primers were designed.

3.2.8 Phylogenetic analysis of sequences

Phylogenetic relationships of globin genes in this study and model mollusc species were resolved using maximum likelihood analysis in MEGA version 6.0 (Tamura et al., 2013). This software conducts manual and automatic sequence alignments to infer molecular relationships and build phylogenetic trees (Tamura et al., 2013). Firstly, the globin gene sequences were isolated from the genomes of the two mollusc model species C. gigas and L. gigantea by using a custom BLAST against haemoglobins, neuroglobins, globin-X, cytoglobins and myoglobins in the three vertebrate model species: Homo sapiens (human), Gallus gallus (chicken) and Danio rerio (zebrafish). For C. gigas, 12 globin genes were found and 10 for L. gigantea. Globin gene sequences were isolated from the transcriptome of A. trapezia. All sequences were further validated using Pfam searches to ensure the presence of a globin domain. Alignments were performed using the MUSCLE (Edgar, 2004) protein plug-in with default settings in MEGA and a best-fit model test was performed, resulting in LG+G+I (Tamura et al., 2013). This LG model with gamma distributed rates (G) and invariant sites (I) was therefore used with three discrete gamma categories to construct a maximum-likelihood tree using the bootstrap method with 500 bootstrap replications. All bivalve globin protein sequences used in the phylogeny were also aligned for comparison and to identify residue conservation using MUSCLE (multiple sequence alignment with high accuracy and high throughput) in MEGA version 6.0.

3.3 Results

3.3.1 RNA extraction, library preparation and sequencing

Total RNA extraction, performed in quadruplicate, for C. ales is visualised in Figure 3.2, where all four samples showed strong bands around 700 bp, which represent ribosomal RNA. Bioanalyzer results (Figure 3.3) confirm these findings with high concentration and yields for both samples submitted. Transcriptome sequencing of mRNA from C. ales on the Illumina NextSeq 500 resulted in 55,592,966 150 bp paired-end reads and a GC content of 36.53 %. This slightly low GC content may be due to shortness of sequences and instability of mRNA.

Figure 3.2. Whole tissue from one C. ales individual was sampled and used for RNA extraction. This 1.5 % agarose gel electrophoresis shows RNA extracted from four samples (extraction was performed in quadruplicate) from this individual as follows: lane 1 contains 100 bp Hyperladder, lanes 2-5 contain samples 1-4 respectively. The RNA is visible as strong bands in each sample around the 700 bp mark. RNA obtained was then assessed for quantity and integrity and sequencing libraries were prepared.


Figure 3.3. Bioanalyzer results for total RNA quality and quantity. Total RNA samples obtained from whole tissue of one C. ales individual were assessed for quantity and integrity on a Bioanalyzer 2100 RNA Nano chip. Results shown here are for two RNA samples, labelled C10 and C11. RNA concentrations are given for each sample.

3.3.2 Transcriptome assembly and validation

Assembly of this data resulted in 182,908 contigs (≥ 200 bp). The mean contig length was 357 bp, with an N50 of 1,167 bp. Assembly statistics are presented in Table 3.1.
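For readers unfamiliar with the N50 statistic reported above, a minimal Python sketch of one common definition follows: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name and the toy contig lengths are illustrative and are not taken from the C. ales assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs supplied")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Usage on toy data: the total is 1,000 bp, and the running sum
# 400 + 300 = 700 is the first to reach 500 bp.
print(n50([100, 200, 300, 400]))
```

Because long contigs dominate the running total, N50 can sit well above the typical contig length when an assembly contains many short contigs.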


Table 3.1 Summary of sequencing and assembly data for the bivalve C. ales. Sequencing libraries were prepared using an Illumina TruSeq® stranded RNA library prep kit and the final cDNA library was sequenced on an Illumina NextSeq500 using 150 bp paired-end chemistry. Libraries obtained were then assembled and the table below summarises results obtained.

Assembly statistic           Value
Total assembled reads        127,196,743
Total Trinity ‘genes’        160,227
Total contigs                182,908
Mean contig length (bp)      357
Average contig               695.41
N50                          1,167

3.3.3 Functional annotation of transcripts and mapping

Overall, 22,903 (12.5 %) transcripts received significant BLASTx hits and 19,788 (10.8 %) BLASTp hits against the Swiss-Prot database. Against the TrEMBL database of predicted proteins, 34,098 (18.6 %) transcripts received significant BLASTx hits and 26,344 (14.4 %) BLASTp hits. Gene ontology (GO) terms were assigned to 30,315 transcripts (Figure 3.4). The most frequently assigned GO terms were cellular process (20,887), metabolic process (15,717) and biological regulation (11,020) for the broad biological process category. For the cellular component category, most GO terms were assigned to cell and cell part (23,916), followed by organelle (16,087). In the molecular function category, GO terms were most commonly assigned to binding (19,803) and catalytic activity (12,179).
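The percentages quoted above can be reproduced with a short Python check. It assumes that each percentage is expressed relative to the 182,908 contigs in the assembly, which the text does not state explicitly; the dictionary layout and function name are illustrative.

```python
TOTAL_CONTIGS = 182_908  # total contigs reported for the assembly

# Hit counts as quoted in the text, keyed by (database, search type).
HIT_COUNTS = {
    ("Swiss-Prot", "BLASTx"): 22_903,
    ("Swiss-Prot", "BLASTp"): 19_788,
    ("TrEMBL", "BLASTx"): 34_098,
    ("TrEMBL", "BLASTp"): 26_344,
}


def percent_of_contigs(count, total=TOTAL_CONTIGS):
    """Express a hit count as a percentage of all contigs, to one decimal."""
    return round(100 * count / total, 1)


for (database, search), count in HIT_COUNTS.items():
    print(f"{database} {search}: {count:,} hits = {percent_of_contigs(count)} % of contigs")
```

Under that assumption the computed values agree with the reported 12.5 %, 10.8 %, 18.6 % and 14.4 %.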


[Figure 3.4: WEGO bar chart. Axes: number of genes, percent of genes. Categories: cellular component, molecular function, biological process.]

Figure 3.4. WEGO output for the newly generated C. ales transcriptome. WEGO (Web Gene Ontology Annotation Plot) was used to characterise the transcriptomic data obtained for C. ales. This figure represents the proportion and number of transcripts assigned Gene Ontology (GO) terms in three different gene ontology categories developed to represent common and basic biological information: cellular component, molecular function and biological process.

3.3.4 Comparative transcriptomics

A comparative analysis was conducted across the four mollusc species, including the two model species C. gigas and L. gigantea, as well as the bivalves A. trapezia and C. ales. Orthologous clusters conserved among the four molluscs and those unique to each species are represented in the Venn diagram in Figure 3.5. Overall, 6,743 clusters were shared by all four species, which represents 55.5 % of the total number of clusters for C. gigas, 58.1 % for L. gigantea, 53.2 % for A. trapezia and 38.8 % for C. ales. The transcriptome with the lowest number of unique clusters was L. gigantea with 918, while C. ales had the highest number with 4,355. A. trapezia shared the most orthologous clusters with C. ales (1,607).


Figure 3.5. Venn diagram illustrating the number of gene clusters shared and unique between the four mollusc species C. gigas, L. gigantea, A. trapezia and C. ales. For L. gigantea and C. gigas, the predicted proteomes obtained from the genomes are used, and the predicted proteomes from whole organism transcriptomes are used for A. trapezia and C. ales. Ortholog groups shown here were identified by an all-against-all reciprocal BLASTp alignment.

3.3.5 Candidate genes

Contigs identified as globins were extracted from the Trinotate annotation report (c97022_g1_i3; c89016_g1_i4; c97010_g2_i1). Contig c97022_g1_i3 (CalesGl1) was found to be 1,930 bp in length, consisting of an ORF of 498 bp encoding a polypeptide of 165 amino acids. Analysis of the predicted protein sequence identified a globin domain (Figure 3.6) with 40 % sequence identity to a Ngb-like gene in the bivalve C. gigas. Contig c89016_g1_i4 (CalesGl2) was 715 bp in length, consisting of an ORF of 585 bp encoding a polypeptide of 194 amino acids. The predicted amino acid sequence of this contig consisted of a signal peptide and a globin domain (Figure 3.6). It had 36 % protein identity to a Hb gene found in the crab Carcinus maenas. Contig c97010_g2_i1 (CalesGl3) was found to be 1,835 bp in length, consisting of an ORF of 1,107 bp encoding a polypeptide chain of 368 amino acids. The predicted amino acid sequence of this contig was found to encode a transmembrane domain and two globin domains (Figure 3.6), with 38 % sequence identity to the HbI gene found in the bivalve T. granosa.

[Figure 3.6: domain diagrams for CalesGl1, CalesGl2 and CalesGl3; horizontal axis in bp.]

Figure 3.6. Globin domains identified for three candidate globin genes. The annotated transcriptome obtained for C. ales was interrogated for candidate globin genes, here defined as contigs that possess a characteristic globin fold sequence and globin domain. The three contigs found to possess these characteristics are shown above as follows: CalesGl1 (top), CalesGl2 (middle) and CalesGl3 (bottom). For candidate gene 2, the purple circle represents a signal peptide and for candidate gene 3, the blue rectangle represents a transmembrane domain. Both were identified using SMART searches for homologous Pfam domains, signal peptides and internal repeats.


3.3.6 Primer design and candidate gene validation

Forward and reverse primers were designed based on the longest ORF for each of the three candidate genes for validation (Table 3.2). PCR reactions were optimised for each primer pair from each candidate gene and sequences were successfully amplified (Figure 3.7). Alignments of chromatograms obtained from Sanger sequencing and original candidate ORFs were successful and all three genes were validated at > 97 %.

Table 3.2 Primers designed for validation of three candidate genes identified from the newly generated C. ales transcriptome. Primers were designed using Primer3 software to amplify the entire ORFs and validate the candidate genes identified. Forward and reverse primers were designed for each candidate as shown in the table below.

Primer name   Primer sequence 5’-3’            Contig          Annealing T° (°C)   Product size (bp)
CalesGl1_F    AGA AGC GCA GGC AGA AGA AA       c97022_g1_i3    55                  670
CalesGl1_R    TGT GAA TCA ACG CAT TGC ACA
CalesGl2_F    ATC AGT CGA CTG GTG CAT AG       c89016_g1_i4    55                  670
CalesGl2_R    TGC ATG TAC AGA ATA AAG GCA
CalesGl3_F    GCA CAC AGC TAC ACT CTT GT       c97010_g2_i1    55                  1400
CalesGl3_R    TGC GTA GAT CGG AGT AGA GA

[Figure 3.7: three agarose gel images, panels A, B and C.]

Figure 3.7. Candidate gene products from the C. ales transcriptome. Each candidate was amplified using the primers shown in Table 3.2 above. Lane 1 in each gel shown here is Hyperladder 100 bp. Candidate gene 1 is shown in lane 2 of gel (A); candidate gene 2 is shown in lane 3 of gel (B); candidate gene 3 is shown in lane 3 of gel (C). All gels are 1.5 % agarose stained with GelRed™ (Biotium). Products of candidate genes seen here were purified using an Ethanol/EDTA precipitation protocol and samples were sequenced on the ABI Genetic Analyzer 3500 (ThermoFisher). Lanes 2 and 4 of gel (B) and lane 2 of gel (C) are candidate gene products prior to PCR reactions being optimised.


3.3.7 Phylogeny of globins in bivalves

The phylogenetic relationships of globins in the four mollusc species (L. gigantea, C. gigas, A. trapezia and C. ales) and three model vertebrate species (D. rerio, G. gallus and H. sapiens) are represented in Figure 3.8. In this maximum-likelihood tree, there are three weakly supported major clades (A, B and C). Clade C contained Hb and other related globin genes (all globins apart from Ngb and GbX) from the three vertebrate model species. Clades A and B both contained globin genes from L. gigantea, C. gigas, C. ales and A. trapezia. Clade A, although poorly supported, contained globin genes from L. gigantea, A. trapezia, C. gigas and C. ales, together with vertebrate GbX, but no genes encoding molluscan Hb proteins. Clade B is strongly supported and contains all known Hb genes in A. trapezia. All bivalve globin protein sequences used for phylogeny were also aligned using MUSCLE in MEGA. Results are shown from position 260-331, where residues are conserved across all 30 bivalve globins. A phenylalanine residue at position 271 and two histidine residues at positions 295 and 328 are conserved across all 30 globins from the four mollusc species (Figure 3.9).


[Figure 3.8 tree image. Tip labels as they appear in the figure: L. gigantea globin-like 6; C. gigas neuroglobin-like; A. trapezia neuroglobin NG; L. gigantea globin-like 5; C. gigas haemoglobin-1-like; C. gigas cytoglobin-2-like X3; L. gigantea globin-like 1; D. rerio GbX; L. gigantea globin-like 8; C. gigas cytoglobin-2-like X1; L. gigantea globin-like 2; C. ales neuroglobin-like CalesGl1; C. gigas neuroglobin; D. rerio neuroglobin; G. gallus neuroglobin; H. sapiens neuroglobin; C. gigas globin-like; L. gigantea globin-like 7; C. gigas cytoglobin-1; L. gigantea globin-like 10; C. gigas cytoglobin-1-like; C. gigas neuroglobin-1; C. gigas neuroglobin (2); L. gigantea globin-like 3; L. gigantea globin-like 4; L. gigantea globin-like 9; C. gigas globin-like X1; C. ales haemoglobin-like CalesGl2; C. ales haemoglobin 1-like CalesGl3; A. trapezia heterodimer HB; A. trapezia dimer 2D; A. trapezia homodimer HD; A. trapezia beta globin BG; A. trapezia alpha globin AG; H. sapiens cytoglobin; D. rerio cytoglobin 2; G. gallus cytoglobin; D. rerio cytoglobin 1; G. gallus myoglobin; H. sapiens myoglobin; D. rerio myoglobin; G. gallus globin E; G. gallus HbA; H. sapiens HbA; D. rerio HbA; D. rerio HbB; G. gallus HbB; H. sapiens HbB]

Figure 3.8. Molecular phylogenetic analysis by the maximum-likelihood method. Phylogenetic relationships of globin genes were resolved for the following species: the bivalves A. trapezia and C. ales, the two mollusc model species L. gigantea and C. gigas, and the three vertebrate model species H. sapiens, G. gallus and D. rerio. The percentage of trees in which the associated taxa clustered together is shown next to the branches. A discrete Gamma distribution was used to model evolutionary rate differences among sites (3 categories; +G, parameter = 5.3023). The rate variation model allowed for some sites to be evolutionarily invariable (+I, 3.8405 % sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site.

Chapter 3: Functional annotation of the Ctenoides ales transcriptome 57

Figure 3.9. Multiple alignment of globin protein sequences. All mollusc globin protein sequences used in the phylogeny represented in Figure 3.8 were aligned with the multiple sequence alignment tool MUSCLE in MEGA to compare sequences and identify conserved residues. Sequences are grouped by species and include A. trapezia, C. ales, C. gigas and L. gigantea. Residues conserved across all sequences and all species are indicated above by an asterisk (*).

3.4 Discussion

In this study, three full-length globin genes were identified in the transcriptome of C. ales, a bivalve mollusc. All predicted proteins of the candidate genes showed the two histidine residues and the phenylalanine residue characteristic of globin proteins (Bashford et al., 1987). This indicates that the proteins encoded by the globin genes identified here are capable of forming the hydrophobic pocket with a haeme group (Royer et al., 2001). The proteins, however, were highly divergent in sequence, with candidate gene one being most similar to vertebrate Ngb genes and a Ngb-like gene from C. gigas, while candidate genes two and three were most similar to Hb encoding genes from A. trapezia.

Of most interest here is that two globin genes from C. ales are sister to the known Hb encoding genes from A. trapezia. The finding that candidate genes CalesGl2 and CalesGl3 were most similar to genes encoding Hb proteins may indicate that these genes in C. ales represent a fifth independent origin of Hb in Bivalvia. Currently, Hb has been identified in species from the orders Arcoida, Veneroida, Carditoida and Solemyoida in Bivalvia. It remains to be determined whether Hb encoding genes show patterns of molecular convergence, as seen in genes underlying convergently evolved phenotypic traits in echolocating animals (Parker et al., 2013) and marine mammals (Foote et al., 2015). The repeated evolution of Hb proteins in bivalves suggests that when the underlying mutations arise in ancestral globin genes they are strongly selected for. Fully supporting a fifth independent origin of Hb in Bivalvia will require functional protein work, which was outside the scope of the current project.

Candidate gene CalesGl3 is of particular interest because it contains multiple globin domains in a single gene. Previously, di-domain Hbs in bivalve molluscs had only been found in Barbatia reeveana and Barbatia lima from the order Arcoida (Naito et al., 1991; Suzuki et al., 1996). The sporadic occurrence of a multi-domain globin in another bivalve order suggests that the di-domain globin gene identified in C. ales has evolved independently of those in the order Arcoida. Multi-domain globin genes have been identified in other animal phyla, including Annelida (Branchipolynoe symmytilida and Branchipolynoe seepensis; Projecto-Garcia et al., 2010) and Arthropoda (Artemia; Jellie et al., 1996). This indicates that multi-domain globin genes may arise relatively frequently, but this idea remains to be tested. In B. reeveana the di-domain gene was generated through incomplete gene duplication resulting from unequal crossing over during meiosis (Naito et al., 1991). It would not be surprising if a similar mechanism generated the di-domain globin gene found in C. ales. In fact, both gene duplication and adaptive evolution have been implicated in the origin and diversity of many multi-domain proteins (Vogel et al., 2005), reinforcing the hypothesis of evolution by incomplete duplication in this case.

Candidate gene CalesGl1 was identified as a Ngb-like gene and was found in a weakly supported clade with vertebrate genes encoding Ngb proteins. It was not surprising to find Ngb-like genes in bivalves, as these genes were present in the common ancestor of Eumetazoans (Cnidaria + Bilateria) (Roesner et al., 2005). The presence of only a single copy of a Ngb-like gene in both C. ales and A. trapezia was unexpected, as most invertebrate lineages have more than one Ngb-like gene (e.g., C. gigas has six copies; UniProt accession numbers: K1QVD6, K1Q9R1, K1QT48, K1QF07, K1RX51, K1R7G1). The low number observed in C. ales and A. trapezia could be associated with low expression levels of Ngb-like genes (Schindelmeiser et al., 1979; Burmester & Hankeln, 2004) combined with the use of transcriptome sequencing to identify these genes in both species investigated. Vertebrate Ngbs are monomeric proteins involved in various physiological functions including oxygen supply, storage, and interactions with mitochondria in nerve cells (Roesner et al., 2005; Watanabe et al., 2012), but a number of functions of these proteins remain uncharacterised. The predicted proteins identified here, when fully characterised, may help to better understand the function of neuroglobin-like proteins outside of vertebrate species.

3.5 Conclusion

These data provide the most comprehensive transcriptomic resource currently available for the bivalve mollusc C. ales. In this transcriptome, there are at least three genes encoding globin-like proteins. Candidate genes CalesGl2 and CalesGl3 are likely to encode Hb proteins based on their sequence characteristics and similarities with existing sequences. Candidate gene CalesGl3 in particular contains two globin domains in a single globin gene, which has only been found in two other bivalve species in a different lineage. In addition, the sequence similarities of the candidate globin genes identified here to Hb encoding genes in other bivalves further contribute to the theory of a common bivalve ancestor and independent evolution of Hbs in bivalve lineages. These results provide preliminary evidence for a possible fifth independent origin of Hbs in Bivalvia.


Chapter 4: General Discussion

Haemoglobin genes are highly diverse and have evolved independently in multiple metazoan lineages (Hardison, 1996; Mangum, 1998; Weber & Vinogradov, 2001; Hofmann et al., 2010a; Hoffmann et al., 2012). This diversity is particularly evident in invertebrates, where Hbs show exceptional variation in their form, function and structural arrangement (Weber, 1980; Terwilliger, 1998; Weber & Vinogradov, 2001; Alyakrinskaya, 2002; Hoffmann et al., 2012). Such variability is reinforced by the patchy distribution of Hbs across invertebrate groups. Bivalve molluscs, for example, are one such group in which Hbs have evolved independently across multiple lineages (Manwell, 1963; Terwilliger et al., 1978; Dando et al., 1985; Doeller et al., 1988; Suzuki et al., 2000). This has been shown to be the result of gene duplication and can be correlated with species living in environments with low or fluctuating oxygen availability. Knowledge of the sequence, role and evolution of Hbs is not only important for phylogenomics but is also crucial for the bivalve farming industry. Despite this, only a limited number of studies have examined the full range of Hb genes, or functionally characterised their expression, in bivalves (Terwilliger et al., 1983; Angelini et al., 1998; Hourdez & Weber, 2005; Decker et al., 2014). To better understand the evolution and expression of Hbs in bivalves, Hb genes were investigated here in two distantly related bivalve species. First, patterns of tissue-specific Hb gene expression were quantified in A. trapezia under submerged and aerially exposed treatments, which provided insights into the function and role of Hb encoding genes in this species. Second, a new high-quality transcriptome resource was generated for C. ales, the globin gene repertoire of this species was examined, and a potential fifth independent origin of Hb in bivalve molluscs was identified.

4.1 Role of gene duplication in the current diversity of bivalve Hbs

The current diversity of Hb genes in bivalves has been hypothesised to be a result of repeated rounds of gene duplication. One of the mechanisms responsible for this duplication has been demonstrated by some authors to be unequal crossing over during meiosis (Naito et al., 1991; Dewilde et al., 1999; Kato et al., 2001). For example, the two-domain Hb (2D) from the blood clam B. lima is thought to have evolved from an unequal crossing over event between two ancestral globin genes and the loss of a stop codon (Suzuki et al., 1996). In the datasets of this study, duplication through unequal crossing over is best illustrated by the di-domain globin gene found in C. ales. This di-domain gene has most likely evolved through unequal crossing over that led to the deletion of a stop codon in a single-domain gene, generating an extended open reading frame with two globin domains. Interestingly, most Arcidae species studied to date do not possess this di-domain gene. Therefore, as C. ales, a member of the family Limidae, and B. lima, a member of the Arcidae, both have a di-domain globin gene, this gene has likely arisen independently in the two species.
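The consequence of losing a stop codon between two tandem coding regions can be illustrated with a toy example. The sketch below does not model unequal crossing over itself; it only shows that deleting the intervening stop codon turns two short reading frames into one extended open reading frame. The function name `first_orf` and the short DNA strings are hypothetical and are not sequences from C. ales or B. lima.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(dna: str) -> str:
    """Return the reading frame from the first ATG up to and including the
    first in-frame stop codon (or to the last complete codon if none)."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return ""
    codons = []
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        codons.append(codon)
        if codon in STOP_CODONS:
            break
    return "".join(codons)


# Two invented "domains", each starting with ATG and free of in-frame stops.
domain_1 = "ATGGCTCACTTC"
domain_2 = "ATGGGTCATTTT"

# Tandem arrangement: a stop codon separates the two coding regions.
tandem = domain_1 + "TAA" + domain_2 + "TGA"
# Same arrangement after loss of the first stop codon.
fused = domain_1 + domain_2 + "TGA"

if __name__ == "__main__":
    print(len(first_orf(tandem)), "nt reading frame before stop codon loss")
    print(len(first_orf(fused)), "nt reading frame after stop codon loss")
```

In the tandem arrangement translation would end after the first domain, whereas in the fused sequence the reading frame continues through both domains, which is the outcome proposed for di-domain globin genes.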

The presence of at least five globin genes that encode Hb proteins in A. trapezia, which form a single clade in the globin phylogeny for bivalves, supports the hypothesis that extensive duplication has played an important role in the evolution of this gene family. Most arcid bivalve species have more than a single Hb gene (Como & Thompson, 1980b; Mangum, 1997); however, A. trapezia is the first species examined to have more than four Hb genes. For example, three Hb genes have been identified in T. granosa and S. inaequivalvis, while B. reeveana has four distinct Hb genes (Ikeda-Saito et al., 1983; Petruzelli et al., 1985; Terwilliger, 1998; Royer et al., 2001; Bao & Lin, 2010). This indicates either that extensive lineage-specific duplication has occurred independently in the species of this family, or that duplication events occurred in the common ancestor of the family Arcidae.

Investigation of expression patterns of globin genes in A. trapezia highlighted strong tissue-specific expression, with some genes being predominantly expressed in erythrocytes. Tissue-specific or developmental stage-specific expression of duplicated genes is often associated with changes in the regulatory elements of these genes (Sankaran et al., 2008), leading to higher expression in certain tissues or at certain developmental times. This has been demonstrated in mammals, where recently duplicated genes that shared regulatory sequences were more likely to be co-expressed than duplicated genes that did not (Lan & Pritchard, 2016). It remains to be determined whether divergence in regulatory sequences is responsible for the tissue-specific expression observed in A. trapezia globin genes encoding Hb proteins, but this study is the first to show tissue-specific expression of these genes. This indicates that globin genes may have undergone neofunctionalisation following duplication in A. trapezia to be dominantly expressed in haemocytes. These data, together with data from C. ales, show that duplication and subsequent divergence have played a dominant role in globin gene evolution in bivalve molluscs.


Neofunctionalisation is the evolution of new functions in duplicated genes and has been well demonstrated in vertebrate Hb genes (Aguileta et al., 2004; Hoffmann & Storz, 2007; Opazo et al., 2008). The pattern observed in the results from Chapter 2 of this study is consistent with neofunctionalisation of clade B globin genes in A. trapezia, as two of these genes (AG and BG) encode a tetrameric Hb (Como & Thompson, 1980a), while another gene (HB) in this clade encodes a dimeric Hb (Petruzelli et al., 1985). Two of these three genes (AG and BG) are dominantly expressed in haemocytes and, as these cells are a novel phenotypic trait in Arcidae, this expression pattern is indicative of neofunctionalisation. Multiple examples support the idea that neofunctionalisation of gene duplicates has played an important role in the generation of novel phenotypic traits (Birchler & Veitia, 2010; Kaessmann, 2010; Osborn et al., 2003). A well-known example is a neofunctionalised copy of an elastin gene that contributed to the evolution of the bulbus arteriosus, a novel organ found in the heart of teleost fish (Moriyama et al., 2016). While it is tempting to speculate that neofunctionalised Hb genes may contribute to evolutionary novelty in arcid bivalve erythrocytes, the extensive duplication of globin genes encoding Hb proteins may instead be associated with escape from adaptive conflict (Des Marais & Rausher, 2008; Storz et al., 2008) if Hb proteins undertake multiple roles within this cell type.

Globin and Hb genes from many species have been demonstrated to encode multifunctional proteins that serve not only in O2 transport but also play a role in immune function and contribute to disease phenotypes, with haemocytes recognised as the immune effectors (Bao et al., 2013b; Vinogradov & Moens, 2008; Weatherall, 2001; Donaghy, 2009). For example, a duplicated gene encoding a dimeric Hb from the arcid bivalve T. granosa was upregulated following an immune challenge with V. parahaemolyticus (Bao et al., 2013b). That study demonstrates that the duplicated Hb genes of arcid bivalves may have multifunctional roles in both O2 transport and the innate immune response. Such acquisition of new roles by pre-existing genes requires changes in the regulatory elements of these genes as well as in their coding regions. Such changes are also necessary as these genes adapt to different living conditions such as hypoxic environments. For example, globin diversity in vesicomyid clams is believed to result from the modulation of monomers to accommodate oxygen levels in the surrounding environment. This is hypothesised to be the result of structural genetic diversity in populations or of changes in globin gene transcription in response to environmental changes (Carney et al., 2007). Based on the findings of this study for the bivalve A. trapezia, it can be argued that this diversity is most likely not a result of changes in the transcription of globin genes, at least for animals exposed to air for up to 12 hours, as expression levels of globins did not change significantly. Consequently, the extensive duplication of Hb encoding globin genes in A. trapezia may allow natural selection to drive specialisation of different members of this multifunctional gene family. Further functional studies will be required to determine if this idea is correct.

4.2 Globin gene evolution in hypoxic environments

The variability in functional properties of bivalve Hbs is thought to be associated with the wide range of environmental conditions that these organisms encounter in their habitat (O’Gower & Nicol, 1968; Terwilliger et al., 1978; Alyakrinskaya, 2002). Among these environmental conditions, one of the most common stresses affecting bivalves is hypoxia (Widdows et al., 1979; Weber, 1980; Booth et al., 1984; Burnett, 1997; Gobler et al., 2014). Hypoxia develops where the depletion of oxygen through respiration is faster than its replenishment, and some bivalve species show better tolerance and survival rates than others when subjected to extended periods of hypoxia (Officer et al., 1984; Rabalais et al., 2010). One of the reasons for this is their ability to close their shells, thereby avoiding oxygen depletion, and to switch from aerobic to anaerobic metabolism (Brooks et al., 1991; De Zwann et al., 1993). The present study showed that this may also be an adaptive feature in the blood clam A. trapezia.

Furthermore, the evolution of circulating Hbs in bivalve lineages has been hypothesised to allow for maximised O2 binding and transport during times of hypoxia (Projecto-Garcia et al., 2015), since both Hbs and Hcs occur in bivalve species that experience periodic hypoxia (Morse et al., 1986; Mangum et al., 1987; Terwilliger et al., 1988; Riggs, 1991; Weber & Vinogradov, 2001; Projecto-Garcia et al., 2015). The presence of multiple Hb proteins in bivalve species is also commonly observed and is consistent with the many organisms that synthesise multiple oxygen carriers with different oxygen affinities to meet physiological demands (Mangum, 1997; Weber & Vinogradov, 2001; Projecto-Garcia et al., 2015). For example, the clam C. magnifica is found lodged in rock fissures outside hydrothermal vents and is exposed to both deep-sea water and vent fluid, meaning that it is frequently subject to chronic hypoxia (Berg, 1980). This species possesses an intracellular Hb with the high O2 affinity for carrying and transport that it requires to persist in such a challenging environment (Terwilliger et al., 1983; Scott & Fisher, 1995; Hourdez & Weber, 2005). The deep-sea clams C. kaikoi, Calyptogena soyoae and Calyptogena tsubasa all possess two homo-dimeric Hbs with more than 90 % identity, of which HbI and HbII in C. kaikoi have been found to be involved in O2 storage under low O2 conditions in the deep sea rather than in O2 transport (Kawano et al., 2003; Suzuki et al., 2000). Arcid bivalves are also frequently found in habitats with low levels of O2, such as intertidal zones or deep-sea waters, and often experience prolonged periods of hypoxia (Arp et al., 1984; Abele-Oeschger & Oeschger, 1995; Terwilliger, 1998; Weber & Vinogradov, 2001). This is the case for the intertidal bivalve A. trapezia used in this study but also for other arcid bivalves such as Anadara kagoshimensis. As with A. trapezia, this species is found in sandy-muddy areas of the Indo-Pacific region and possesses haemoglobin-containing erythrocytes (Golovina et al., 2016). Compared to other bivalve species in the same habitat, A. kagoshimensis has also shown better tolerance to hypoxia and this has been attributed to the presence of Hbs (Holden et al., 1994). The multiple Hbs in these bivalves may be produced simultaneously and have different functions, such as O2 carrying or storage, or may be produced sequentially to follow changing conditions (Terwilliger, 1998). Different Hb structures confer these different functions and thus the potential resistance to hypoxia in the habitats where the bivalves settle (Decker et al., 2016). Overall, the presence of Hbs seems to play a role in the adaptation of bivalves to hypoxia. Therefore, mutations that lead to the evolution of complex respiratory pigments such as Hb are likely selected for.

Since all blood clams seem to have originated from a common ancestor, globin structural variations and levels of expression are thought to have been influenced directly by the environmental conditions these clam species are subjected to, specifically oxygen concentration and availability (Kawano et al., 2003). The evidence from this study reinforces the idea that greater expression of Hbs in haemolymph confers a physiological advantage in tolerating hypoxia, as shown here by the expression levels obtained for the blood clam A. trapezia. Given the hypoxic nature of the intertidal environment where this bivalve is found, it is almost certain that a duplication event was driven by the need for oxygen availability during low tides. This is also consistent with the finding of Hb-like genes that may encode Hb proteins in C. ales, as this species lives in the tropical waters of the Indo-Pacific area at depths of 30–35 m (and sometimes up to 50 m), where both depth and temperature reduce the solubility of O2, creating a hypoxic environment (Garcia et al., 2005; Karstensen et al., 2008). Nonetheless, the idea that Hb has evolved more often in lineages that live in hypoxic environments requires further analysis before it can be supported.


4.3 The importance of globin gene evolution to aquaculture

Aquaculture is one of the fastest growing food-producing sectors and accounts for nearly 50 % of world consumption. The major groups currently produced in aquaculture include finfish, crustaceans and molluscs. Culture of molluscs contributed approximately 20 % of total aquaculture production in 2014 and this amount has been steadily increasing in recent years. Bivalve molluscs of the family Arcidae include a number of major fishery and aquaculture species such as T. granosa, S. inaequivalvis, S. broughtonii and A. trapezia (Donaghy et al., 2009; Bao et al., 2013b). Besides their importance as food sources, bivalve molluscs are frequently used as indicators of pollution and of the overall health of ecosystems. Some of the major issues encountered in clam farming are disease outbreaks, low survival rates due to environmental pollution and slow growth rates (Alkarkhi et al., 2008; Vuddhakul et al., 2006). Haemocytes found in bivalve molluscs have been shown to be involved in various biological functions (Donaghy et al., 2009) such as immune defence against bacteria and viruses (Bao et al., 2013b) but also detoxification. In fact, the role of haemoglobin in haemocytes goes beyond supporting aerobic metabolism, and in some species SNPs in specific Hb genes have been observed to correlate with disease resistance. Hbs in bivalves have been shown to be a source of antibacterial activity but are also responsible for eliminating harmful reactive oxygen species (ROS) and nitric oxide (NO), which may be present in polluted environments. Investigating the haemoglobin genes in these bivalves, as well as their expression under environmental stress as done in this study for A. trapezia, contributes to our understanding of the functions of Hbs and haemocytes, which may provide new perspectives for disease control and resistance to pollution in cultures.

Aquaculture production systems are often classified into three general types: extensive ponds, intensive ponds, and intensive recirculating tank and raceway systems (Ebeling, 2006). Many aquaculture enterprises also tend to intensify production using super-intensive systems, for example the super-intensive culture of Atlantic salmon and rainbow trout. Currently, dissolved oxygen is the most important factor limiting increased production in intensive systems. The study of haemoglobin gene expression under stress in A. trapezia and the transcriptome generated for C. ales may be used in future studies addressing this major issue. In particular, examining haemoglobin gene expression under different oxygen concentrations will provide much needed data about how molluscs cope with low dissolved oxygen in culture.


Other bivalves such as scallops are also farmed. For example, in the scallop family Pectinidae, only about 10 of the 400 living species in this family are currently cultured (Shumway & Parsons, 2011). New sequence data for a species is the ultimate resource for the introduction of new species into culture or for the addition of genetic material to existing species to increase their diversity and their performance in culture (Guo, 2009). The transcriptome generated in this study for the scallop C. ales and the haemoglobin-like genes described can therefore be used as potential molecular markers for bivalve selection and breeding to improve aquaculture production. Overall, research based on transcriptomics and proteomics proves valuable as it increases the genetic data available to study molecular traits of interest in cultured species and especially to evaluate their disease susceptibility and resistance to environmental stresses.

4.4 Limitations of the study

Despite some limitations in the two studies conducted as part of this Masters thesis, the findings about the evolution and expression of Hb genes in bivalve molluscs are valid. One major limitation of the study is the use of transcriptome sequences, instead of full genome sequences, for the identification of Hb genes in both A. trapezia and C. ales. Consequently, the presence of only six and three full-length globin transcripts in A. trapezia and C. ales, respectively, may be an underestimate of the actual number of Hb genes in these species and should be viewed with some caution. Many functional genes are not captured by transcriptome sequencing because they are only expressed at specific developmental stages, in certain tissues or at very low levels (Yagil et al., 2005). This limits the use of transcriptome sequencing in isolation to identify the number of different genes within a gene family in a species. Complete genome sequencing was not feasible in the current study due to time and financial constraints.

4.5 Future research

Data presented here for A. trapezia provide insight into the multiple functions of Hbs. Possible future studies could further validate the Hb genes, for example by determining whether they are translated into proteins or whether some might be pseudogenes. Data presented here for C. ales provide the most comprehensive transcriptomic resource currently available for this species and therefore allow numerous gene families to be examined in detail. Overall, this resource should lay an important foundation for future genetic or genomic studies in this species. In future studies, complete genome sequences could also improve the detection of the entire complement of globin genes for these two bivalve species.

4.6 Conclusions

Overall, the evolution of Hbs in bivalves is intriguing due to the great diversity, in both structure and function, of the proteins found across lineages. This study looked at the blood clam A. trapezia, which possesses five duplicated Hb encoding genes, and determined that expression levels of those five genes are much higher in haemolymph than in foot, gills, mantle or muscle, therefore validating the first hypothesis of this project that they have undergone neofunctionalisation through gene duplication. Furthermore, the expression of those genes was not affected by short-term or long-term hypoxia; therefore it can be concluded that the neofunctionalisation acquired through duplication of existing Hb encoding genes may provide some evolutionary advantage for this bivalve species. Findings on the sensitivity and reaction of A. trapezia to hypoxia may also be a valuable indicator of the potential for bivalve populations to survive in challenging environments.

Additionally, next generation sequencing was used to obtain the expressed portion of the genome and present the first transcriptome for the species C. ales. Three genes encoding globin-like proteins were found in the newly generated transcriptome for this bivalve species. The di-domain globin in particular is the first found in bivalves since the di-domain globins of B. lima and B. reeveana in the order Arcoida. This suggests that multi-domain globin genes may arise repeatedly through incomplete gene duplication. Although more investigation is required on the three candidate genes identified here, preliminary findings in this study support a fifth independent origin of Hb in bivalves and contribute to validating the second hypothesis of this project.


References

Abele-Oeschger, D., & Oeschger, R. (1995). Hypoxia-induced autoxidation of haemoglobin in the benthic invertebrates Arenicola marina (Polychaeta) and Astarte borealis (Bivalvia) and the possible effects of sulphide. Journal of Experimental Marine Biology and Ecology, 187(1), 63–80. https://doi.org/10.1016/0022-0981(94)00172-A

Aguileta, G., Bielawski, J. P., & Yang, Z. (2004). Gene conversion and functional divergence in the β-globin gene family. Journal of Molecular Evolution, 59(2), 177–189. https://doi.org/10.1007/s00239-004-2612-0

Alkarkhi, F. A., Ismail, N., & Easa, A. M. (2008). Assessment of arsenic and heavy metal contents in cockles (Anadara granosa) using multivariate statistical techniques. Journal of Hazardous Materials, 150(3), 783–789. https://doi.org/10.1016/j.jhazmat.2007.05.035

Alyakrinskaya, I. O. (2002). Physiological and biochemical adaptations to respiration of haemoglobin-containing hydrobionts. Biology Bulletin of the Russian Academy of Sciences, 29(3), 268–283. https://doi.org/10.1023/A:1015438615417

Andrews, S. (2010). FastQC: A quality control tool for high throughput sequence data.

Angelini, E., Salvato, B., Muro, P. D., & Beltramini, M. (1998). Respiratory pigments of Yoldia eightsi, an Antarctic bivalve. Marine Biology, 131(1), 15–23. https://doi.org/10.1007/s002270050291

Antonini, E., & Chiancone, E. (1977). Assembly of multisubunit respiratory proteins. Annual Review of Biophysics and Bioengineering, 6(1), 239–271. https://doi.org/10.1146/annurev.bb.06.060177.001323

Arp, A. J., Childress, J. J., & Fisher, C. R. (1984). Metabolic and blood gas transport characteristics of the hydrothermal vent bivalve Calyptogena magnifica. Physiological Zoology, 57(6), 648–662.


At, G., & Eo, T. (1984). Amino acid sequence of the beta-chain of the tetrameric haemoglobin of the bivalve mollusc, Anadara trapezia. Australian Journal of Biological Sciences, 38(3), 221–236. https://doi.org/10.1071/BI9800653

Baldwin, J., & Lee, A. K. (1978). Contributions of aerobic and anaerobic energy production during swimming in the bivalve mollusc Limaria fragilis (family Limidae). Journal of Comparative Physiology, 129(4), 361–364. https://doi.org/10.1007/BF00686994

Bao, Y. B., Wang, Q., Guo, X. M., & Lin, Z. H. (2013a). Structure and immune expression analysis of haemoglobin genes from the blood clam Tegillarca granosa. Genetics and Molecular Research, 12(3), 3110–3123. http://dx.doi.org/10.4238/2013.February.28.5

Bao, Y., Li, P., Dong, Y., Xiang, R., Gu, L., Yao, H., … & Lin, Z. (2013b). Polymorphism of the multiple haemoglobins in blood clam Tegillarca granosa and its association with disease resistance to Vibrio parahaemolyticus. Fish & Shellfish Immunology, 34(5), 1320–1324. https://doi.org/10.1016/j.fsi.2013.02.022

Bao, Y., & Lin, Z. (2010). Generation, annotation, and analysis of ESTs from hemocyte of the bloody clam, Tegillarca granosa. Fish & Shellfish Immunology, 29(5), 740–746. https://doi.org/10.1016/j.fsi.2010.07.009

Bashford, D., Chothia, C., & Lesk, A. M. (1987). Determinants of a protein fold. Journal of Molecular Biology, 196(1), 199–216. https://doi.org/10.1016/0022-2836(87)90521-3

Berg Jr, C. J. (1980). Description of living specimens of Calyptogena magnifica Boss and Turner with notes on their distribution and ecology. Appendix 1. The giant white clam from the Galapagos Rift, Calyptogena magnifica species novum. Malacologia, 20, 183–185.

Bikard, D., Patel, D., Metté, C. L., Giorgi, V., Camilleri, C., Bennett, M. J., & Loudet, O.

(2009). Divergent Evolution of Duplicate Genes Leads to Genetic Incompatibilities Within

Arabidopsis thaliana. Science, 323(5914), 623–626.

https://doi.org/10.1126/science.1165917

Birchler, J. A., & Veitia, R. A. (2010). The gene balance hypothesis: implications for gene

regulation, quantitative traits and evolution. New Phytologist, 186(1), 54–62.

https://doi.org/10.1111/j.1469-8137.2009.03087.x

Bischof, J. M., Chiang, A. P., Scheetz, T. E., Stone, E. M., Casavant, T. L., Sheffield, V. C.,

& Braun, T. A. (2006). Genome-wide identification of pseudogenes capable of disease-

causing gene conversion. Human Mutation, 27(6), 545–552.

https://doi.org/10.1002/humu.20335

Blank, M., & Burmester, T. (2012). Widespread occurrence of N-terminal acylation in animal

globins and possible origin of respiratory globins from a membrane-bound ancestor.

Molecular Biology and Evolution, 29(11), 3553–3561.

https://doi.org/10.1093/molbev/mss164

Blank, M., Kiger, L., Thielebein, A., Gerlach, F., Hankeln, T., Marden, M. C., & Burmester,

T. (2011). Oxygen supply from the bird’s eye perspective: globin E is a respiratory protein

in the chicken retina. Journal of Biological Chemistry, 286(30), 26507–26515.

https://doi.org/10.1074/jbc.M111.224634

Booth, C. E., McDonald, D. G., & Walsh, P. J. (1984). Acid-base balance in the sea mussel,

Mytilus edulis. I. Effects of hypoxia and air-exposure on haemolymph acid-base status.

Marine Biology Letters, 5, 347–358.

Brooks, S. P. J., Zwaan, A. de, Thillart, G. van den, Cattani, O., Cortesi, P., & Storey, K. B.

(1991). Differential survival of Venus gallina and Scapharca inaequivalvis during anoxic

stress: Covalent modification of phosphofructokinase and glycogen phosphorylase during

anoxia. Journal of Comparative Physiology B, 161(2), 207–212.

https://doi.org/10.1007/BF00262885

Brunori, M., & Vallone, B. (2007). Neuroglobin, seven years after. Cellular and Molecular

Life Sciences, 64(10), 1259. https://doi.org/10.1007/s00018-007-7090-2

Burmester, T., & Hankeln, T. (2014). Function and evolution of vertebrate globins. Acta

Physiologica, 211(3), 501–514. https://doi.org/10.1111/apha.12312

Burmester, T., & Hankeln, T. (2009). What is the function of neuroglobin? Journal of

Experimental Biology, 212(10), 1423–1428. https://doi.org/10.1242/jeb.000729

Burmester, T., & Hankeln, T. (2004). Neuroglobin: A Respiratory Protein of the Nervous

System. Physiology, 19(3), 110–113. https://doi.org/10.1152/nips.01513.2003

Burmester, T., Weich, B., Reinhardt, S., & Hankeln, T. (2000). A vertebrate globin expressed

in the brain. Nature, 407(6803), 520–523. https://doi.org/10.1038/35035093

Burnett, L. E. (1997). The challenges of living in hypoxic and hypercapnic aquatic

environments. American Zoologist, 37(6), 633–640. https://doi.org/10.1093/icb/37.6.633

Cañestro, C., Albalat, R., Irimia, M., & Garcia-Fernàndez, J. (2013). Impact of gene gains,

losses and duplication modes on the origin and diversification of vertebrates. Seminars in

Cell & Developmental Biology, 24(2), 83–94.

https://doi.org/10.1016/j.semcdb.2012.12.008

Carney, S. L., Flores, J. F., Orobona, K. M., Butterfield, D. A., Fisher, C. R., & Schaeffer, S.

W. (2007). Environmental differences in haemoglobin gene expression in the

hydrothermal vent tubeworm, Ridgeia piscesae. Comparative Biochemistry and

Physiology Part B: Biochemistry and Molecular Biology, 146(3), 326–337.

https://doi.org/10.1016/j.cbpb.2006.11.002

Ching Ming Chung, M., & Ellerton, H. D. (1980). The physico-chemical and functional

properties of extracellular respiratory haemoglobins and chlorocruorins. Progress in

Biophysics and Molecular Biology, 35, 53–102. https://doi.org/10.1016/0079-6107(80)90003-6

Como, P. F., & Thompson, E. O. P. (1980a). Amino acid sequence of the α-chain of the

tetrameric haemoglobin of the bivalve mollusc Anadara trapezia. Australian Journal of

Biological Sciences, 33(6), 653–664. https://doi.org/10.1071/BI9800653

Como, P. F., & Thompson, E. O. P. (1980b). Multiple haemoglobins of the bivalve mollusc

Anadara trapezia. Australian Journal of Biological Sciences, 33(6), 643–652.

https://doi.org/10.1071/BI9800643

Corti, P., Xue, J., Tejero, J., Wajih, N., Sun, M., Stolz, D. B., … & Gladwin, M. T. (2016).

Globin X is a six-coordinate globin that reduces nitrite to nitric oxide in fish red blood

cells. Proceedings of the National Academy of Sciences, 113(30), 8538–8543.

https://doi.org/10.1073/pnas.1522670113

Crenshaw, M. A., & Neff, J. M. (1969). Decalcification at the mantle-shell interface in

molluscs. American Zoologist, 9(3), 881–885. https://doi.org/10.1093/icb/9.3.881

Dando, P. R., Southward, A. J., Southward, E. C., Terwilliger, N. B., & Terwilliger, R. C.

(1985). Sulphur-oxidising bacteria and haemoglobin in gills of the bivalve mollusc Myrtea

spinifera. Retrieved from http://agris.fao.org/agris-search/search.do?recordID=AV20120126232

Darawshe, S., Tsafadyah, Y., & Daniel, E. (1987). Quaternary structure of erythrocruorin

from the nematode Ascaris suum. Evidence for unsaturated haem-binding sites.

Biochemical Journal, 242(3), 689–694. https://doi.org/10.1042/bj2420689

Davenport, J., & Wong, T. M. (1986). Responses of the blood cockle Anadara granosa (L.)

(Bivalvia: Arcidae) to salinity, hypoxia and aerial exposure. Aquaculture, 56(2), 151–162.

https://doi.org/10.1016/0044-8486(86)90024-4

De Zwaan, A., Cattani, O., & Putzer, V. M. (1993). Sulfide and cyanide induced mortality and

anaerobic metabolism in the arcid blood clam Scapharca inaequivalvis. Comparative

Biochemistry and Physiology Part C: Comparative Pharmacology, 105(1), 49–54.

https://doi.org/10.1016/0742-8413(93)90056-Q

Decker, C., Zorn, N., Le Bruchec, J., Caprais, J. C., Potier, N., Leize-Wagner, E., … &

Andersen, A. C. (2016). Can the haemoglobin characteristics of vesicomyid clam species

influence their distribution in deep-sea sulfide-rich sediments? A case study in the Angola

Basin. Deep Sea Research Part II: Topical Studies in Oceanography.

http://dx.doi.org/10.1016/j.dsr2.2016.11.009

Decker, C., Zorn, N., Potier, N., Leize-Wagner, E., Lallier, F. H., Olu, K., & Andersen, A. C.

(2014). Globin’s structure and function in vesicomyid bivalves from the Gulf of Guinea

cold seeps as an adaptation to life in reduced sediments. Physiological and Biochemical

Zoology: Ecological and Evolutionary Approaches, 87(6), 855–869.

https://doi.org/10.1086/678131

Des Marais, D. L., & Rausher, M. D. (2008). Escape from adaptive conflict after duplication

in an anthocyanin pathway gene. Nature, 454(7205), 762–765.

https://doi.org/10.1038/nature07092

Dewilde, S., Angelini, E., Kiger, L., Marden, M., Beltramini, M., Salvato, B., & Moens, L.

(2003). Structure and function of the globin and globin gene from the Antarctic mollusc

Yoldia eightsi. Biochemical Journal, 370, 245–253. https://doi.org/10.1042/bj20020727

Dewilde, S., Hauwaert, M.-L., Peeters, K., Vanfleteren, J., & Moens, L. (1999). Daphnia

pulex didomain haemoglobin: structure and evolution of polymeric haemoglobins and

their coding genes. Molecular Biology and Evolution, 16.

https://doi.org/10.1093/oxfordjournals.molbev.a026211

Dixon, B., Walker, B., Kimmins, W., & Pohajdak, B. (1991). Isolation and sequencing of a

cDNA for an unusual haemoglobin from the parasitic nematode Pseudoterranova

decipiens. Proceedings of the National Academy of Sciences, 88(13), 5655–5659.

https://doi.org/10.1073/pnas.88.13.5655

Doeller, J. E., Kraus, D. W., Colacino, J. M., & Wittenberg, J. B. (1988). Gill Haemoglobin

May Deliver Sulfide to Bacterial Symbionts of Solemya velum (Bivalvia, Mollusca). The

Biological Bulletin, 175(3), 388–396.

Donaghy, L., Lambert, C., Choi, K. S., & Soudant, P. (2009). Hemocytes of the carpet shell

clam (Ruditapes decussatus) and the Manila clam (Ruditapes philippinarum): current

knowledge and future prospects. Aquaculture, 297(1), 10–24.

Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high

throughput. Nucleic Acids Research, 32(5), 1792–1797.

https://doi.org/10.1093/nar/gkh340

Efstratiadis, A., Posakony, J. W., Maniatis, T., Lawn, R. M., O’Connell, C., Spritz, R. A., …

& Blechl, A. E. (1980). The structure and evolution of the human β-globin gene family.

Cell, 21(3), 653–668. https://doi.org/10.1016/0092-8674(80)90429-8

Ekblom, R., & Galindo, J. (2011). Applications of next generation sequencing in molecular

ecology of non-model organisms. Heredity, 107(1), 1–15.

https://doi.org/10.1038/hdy.2010.152

FAO. (2009). Fisheries & Aquaculture - Fishery statistical collections - global aquaculture

production. Retrieved August 23, 2016, from http://www.fao.org/fishery/statistics/global-aquaculture-production/en

Finn, R. D., Miller, B. L., Clements, J., & Bateman, A. (2014). iPfam: a database of protein

family and domain interactions found in the Protein Data Bank. Nucleic Acids Research,

42(D1), D364–D373. https://doi.org/10.1093/nar/gkt1210

Fisher, A., Comly, M., Do, R., Temarkin, L., Ghazanfari, A. F., & Mukherjee, A. B. (1984).

Two pools of β-endorphin-like immunoreactivity in blood: plasma and erythrocytes. Life

Sciences, 34(19), 1839–1846. https://doi.org/10.1016/0024-3205(84)90677-5

Flögel, U., Merx, M. W., Gödecke, A., Decking, U. K. M., & Schrader, J. (2001).

Myoglobin: A scavenger of bioactive NO. Proceedings of the National Academy of

Sciences, 98(2), 735–740. https://doi.org/10.1073/pnas.98.2.735

Flores, J. F., Fisher, C. R., Carney, S. L., Green, B. N., Freytag, J. K., Schaeffer, S. W., &

Royer, W. E. (2005). Sulfide binding is mediated by zinc ions discovered in the crystal

structure of a hydrothermal vent tubeworm haemoglobin. Proceedings of the National

Academy of Sciences of the United States of America, 102(8), 2713–2718.

https://doi.org/10.1073/pnas.0407455102

Foote, A. D., Liu, Y., Thomas, G. W. C., Vinař, T., Alföldi, J., Deng, J., … & Gibbs, R. A.

(2015). Convergent evolution of the genomes of marine mammals. Nature Genetics, 47(3),

272–275. https://doi.org/10.1038/ng.3198

Force, A., Lynch, M., Pickett, F. B., Amores, A., Yan, Y., & Postlethwait, J. (1999).

Preservation of Duplicate Genes by Complementary, Degenerative Mutations. Genetics,

151(4), 1531–1545.

Fuchs, C., Burmester, T., & Hankeln, T. (2006). The amphibian globin gene repertoire as

revealed by the Xenopus genome. Cytogenetic and Genome Research, 112(3–4), 296–306.

Furuta, H., & Kajita, A. (1983). Dimeric haemoglobin of the bivalve mollusc Anadara

broughtonii: complete amino acid sequence of the globin chain. Biochemistry, 22(4), 917–

922. https://doi.org/10.1021/bi00273a032

Gallant, J. R., Traeger, L. L., Volkening, J. D., Moffett, H., Chen, P.-H., Novina, C. D., … &

Sussman, M. R. (2014). Genomic basis for the convergent evolution of electric organs.

Science, 344(6191), 1522–1525. https://doi.org/10.1126/science.1254432

Garcia, H. E., Boyer, T. P., Levitus, S., Locarnini, R. A., & Antonov, J. (2005). On the

variability of dissolved oxygen and apparent oxygen utilization content for the upper

world ocean: 1955 to 1998. Geophysical Research Letters, 32(9).

https://doi.org/10.1029/2004GL022286

Gasteiger, E., Gattiker, A., Hoogland, C., Ivanyi, I., Appel, R. D., & Bairoch, A. (2003).

ExPASy: The proteomics server for in-depth protein knowledge and analysis. Nucleic

Acids Research, 31(13), 3784–3788.

Gobler, C. J., DePasquale, E. L., Griffith, A. W., & Baumann, H. (2014). Hypoxia and

acidification have additive and synergistic negative effects on the growth, survival, and

metamorphosis of early life stage bivalves. PLoS ONE, 9(1), e83648.

https://doi.org/10.1371/journal.pone.0083648

Golovina, I. V., Gostyukhina, O. L., & Andreyenko, T. I. (2016). Specific metabolic features

in tissues of the Anadara kagoshimensis. Russian Journal of Biological

Invasions, 7(2), 137–145.

González, V. L., Andrade, S. C. S., Bieler, R., Collins, T. M., Dunn, C. W., Mikkelsen, P. M.,

Taylor, J. D., & Giribet, G. (2015). A phylogenetic backbone for Bivalvia: an RNA-seq

approach. Proceedings of the Royal Society B, 282(1801), 20142332.

https://doi.org/10.1098/rspb.2014.2332

Goodman, M., Czelusniak, J., Koop, B. F., Tagle, D. A., & Slightom, J. L. (1987). Globins: A

case study in molecular phylogeny. Cold Spring Harbor Symposia on Quantitative

Biology, 52, 875–890. https://doi.org/10.1101/SQB.1987.052.01.096

Goossens, M., Dozy, A. M., Embury, S. H., Zachariades, Z., Hadjiminas, M. G.,

Stamatoyannopoulos, G., & Kan, Y. W. (1980). Triplicated alpha-globin loci in humans.

Proceedings of the National Academy of Sciences, 77(1), 518–521.

Götting, M., & Nikinmaa, M. (2015). More than haemoglobin – the unexpected diversity of

globins in vertebrate red blood cells. Physiological Reports, 3(2), e12284.

https://doi.org/10.14814/phy2.12284

Gow, A. J., Payson, A. P., & Bonaventura, J. (2005). Invertebrate haemoglobins and nitric

oxide: How pocket structure controls reactivity. Journal of Inorganic Biochemistry,

99(4), 903–911. https://doi.org/10.1016/j.jinorgbio.2004.12.001

Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. A., Amit, I., … &

Regev, A. (2011). Full-length transcriptome assembly from RNA-Seq data without a

reference genome. Nature Biotechnology, 29(7), 644–652.

https://doi.org/10.1038/nbt.1883

Gribaldo, S., Casane, D., Lopez, P., & Philippe, H. (2003). Functional divergence prediction

from evolutionary analysis: A case study of vertebrate haemoglobin. Molecular Biology

and Evolution, 20(11), 1754–1759. https://doi.org/10.1093/molbev/msg171

Grinich, N. P., & Terwilliger, R. C. (1980). The quaternary structure of an unusual high-

molecular-weight intracellular haemoglobin from the bivalve mollusc Barbatia reeveana.

Biochemical Journal, 189(1), 1–8. https://doi.org/10.1042/bj1890001

Grispo, M. T., Natarajan, C., Projecto-Garcia, J., Moriyama, H., Weber, R. E., & Storz, J. F.

(2012). Gene duplication and the evolution of haemoglobin isoform differentiation in

birds. Journal of Biological Chemistry, 287(45), 37647–37658.

https://doi.org/10.1074/jbc.M112.375600

Guo, X. (2009). Use and exchange of genetic resources in molluscan aquaculture. Reviews in

Aquaculture, 1(3–4), 251–259. https://doi.org/10.1111/j.1753-5131.2009.01014.x

Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., Bowden, J., … &

Regev, A. (2013). De novo transcript sequence reconstruction from RNA-seq using the

Trinity platform for reference generation and analysis. Nature Protocols, 8(8), 1494–1512.

https://doi.org/10.1038/nprot.2013.084

Halanych, K. M., & Passamaneck, Y. (2001). A brief review of metazoan phylogeny and

future prospects in hox-research. American Zoologist, 41(3), 629–639.

https://doi.org/10.1093/icb/41.3.629

Hankeln, T., Ebner, B., Fuchs, C., Gerlach, F., Haberkamp, M., Laufs, T. L., … & Burmester,

T. (2005). Neuroglobin and cytoglobin in search of their role in the vertebrate globin

family. Journal of Inorganic Biochemistry, 99(1), 110–119.

https://doi.org/10.1016/j.jinorgbio.2004.11.009

Hanscombe, O., Whyatt, D., Fraser, P., Yannoutsos, N., Greaves, D., Dillon, N., & Grosveld,

F. (1991). Importance of globin gene order for correct developmental expression. Genes &

Development, 5(8), 1387–1394. https://doi.org/10.1101/gad.5.8.1387

Hardison, R., Slightom, J. L., Gumucio, D. L., Goodman, M., Stojanovic, N., & Miller, W.

(1997). Locus control regions of mammalian β-globin gene clusters: combining

phylogenetic analyses and experimental results to gain functional insights. Gene, 205(1–

2), 73–94. https://doi.org/10.1016/S0378-1119(97)00474-5

Hardison, R. (1996). A brief history of haemoglobins: plant, animal, protist, and bacteria.

Proceedings of the National Academy of Sciences of the United States of America, 93(12),

5675.

Harper, E. M., & Skelton, P. W. (1993). The Mesozoic marine revolution and epifaunal

bivalves. Scripta Geologica, Special, (2), 127–153.

He, X., & Zhang, J. (2005). Rapid subfunctionalization accompanied by prolonged and

substantial neofunctionalization in duplicate gene evolution. Genetics, 169(2), 1157–1164.

https://doi.org/10.1534/genetics.104.037051

Herreid, C. F. (1980). Hypoxia in invertebrates. Comparative Biochemistry and Physiology

Part A: Physiology, 67(3), 311–320. https://doi.org/10.1016/S0300-9629(80)80002-8

Higgs, D. R., Old, J. M., Pressley, L., Clegg, J. B., & Weatherall, D. J. (1980). A novel

α-globin gene arrangement in man. Nature, 284(5757), 632–635.

https://doi.org/10.1038/284632a0

Hoffmann, F. G., Opazo, J. C., Hoogewijs, D., Hankeln, T., Ebner, B., Vinogradov, S. N., …

& Storz, J. F. (2012). Evolution of the globin gene family in deuterostomes: lineage-

specific patterns of diversification and attrition. Molecular Biology and Evolution, 29(7),

1735–1745. https://doi.org/10.1093/molbev/mss018

Hoffmann, F. G., Opazo, J. C., & Storz, J. F. (2011). Differential loss and retention of

cytoglobin, myoglobin, and globin-E during the radiation of vertebrates. Genome Biology

and Evolution, 3, 588–600. https://doi.org/10.1093/gbe/evr055

Hoffmann, F. G., Opazo, J. C., & Storz, J. F. (2010a). Gene cooption and convergent

evolution of oxygen transport haemoglobins in jawed and jawless vertebrates. Proceedings

of the National Academy of Sciences, 107(32), 14274–14279.

https://doi.org/10.1073/pnas.1006756107

Hoffmann, F. G., Storz, J. F., Gorr, T. A., & Opazo, J. C. (2010b). Lineage-specific patterns

of functional diversification in the α- and β-globin gene families of tetrapod vertebrates.

Molecular Biology and Evolution, 27(5), 1126–1138.

https://doi.org/10.1093/molbev/msp325

Hoffmann, F. G., & Storz, J. F. (2007). The αD-globin gene originated via duplication of an

embryonic α-like globin gene in the ancestor of tetrapod vertebrates. Molecular Biology

and Evolution, 24(9), 1982–1990. https://doi.org/10.1093/molbev/msm127

Hokamp, K., McLysaght, A., & Wolfe, K. H. (2003). The 2R hypothesis and the human

genome sequence. In A. Meyer & Y. Van de Peer (Eds.), Genome Evolution (pp. 95–110).

Springer Netherlands. Retrieved from http://link.springer.com/chapter/10.1007/978-94-010-0263-9_10

Holden, J. A., Pipe, R. K., Quaglia, A., & Ciani, G. (1994). Blood cells of the arcid clam,

Scapharca inaequivalvis. Journal of the Marine Biological Association of the United

Kingdom, 74(2), 287–299.

Hoogewijs, D., Ebner, B., Germani, F., Hoffmann, F. G., Fabrizius, A., Moens, L., … &

Hankeln, T. (2011). Androglobin: A chimeric globin in metazoans that is preferentially

expressed in mammalian testes. Molecular Biology and Evolution, 29(4), 1105–1114.

https://doi.org/10.1093/molbev/msr246

Hourdez, S., & Lallier, F. H. (2006). Adaptations to hypoxia in hydrothermal-vent and cold-

seep invertebrates. Reviews in Environmental Science and Bio/Technology, 6(1–3), 143–

159. https://doi.org/10.1007/s11157-006-9110-3

Hourdez, S., & Weber, R. E. (2005). Molecular and functional adaptations in deep-sea

haemoglobins. Journal of Inorganic Biochemistry, 99(1), 130–141.

https://doi.org/10.1016/j.jinorgbio.2004.09.017

Hourdez, S., Lamontagne, J., Peterson, P., Weber, R. E., & Fisher, C. R. (2000).

Haemoglobin from a deep-sea hydrothermal-vent copepod. The Biological Bulletin,

199(2), 95–99.

Huang, Y., Niu, B., Gao, Y., Fu, L., & Li, W. (2010). CD-HIT Suite: a web server for

clustering and comparing biological sequences. Bioinformatics, 26(5), 680–682.

https://doi.org/10.1093/bioinformatics/btq003

Huerta-Cepas, J., Dopazo, J., Huynen, M. A., & Gabaldon, T. (2011). Evidence for short-time

divergence and long-time conservation of tissue-specific expression after gene duplication.

Briefings in Bioinformatics, 12(5), 442–448. https://doi.org/10.1093/bib/bbr022

Hurles, M. (2004). Gene duplication: the genomic trade in spare parts. PLoS Biology, 2(7),

e206. https://doi.org/10.1371/journal.pbio.0020206

Ikeda-Saito, M., Yonetani, T., Chiancone, E., Ascoli, F., Verzili, D., & Antonini, E. (1983).

Thermodynamic properties of oxygen equilibria of dimeric and tetrameric haemoglobins

from Scapharca inaequivalvis. Journal of Molecular Biology, 170(4), 1009–1018.

https://doi.org/10.1016/S0022-2836(83)80200-9

Ingram, V. M. (1961). Gene evolution and the haemoglobins. Nature, 189(4766), 704–708.

Innan, H., & Kondrashov, F. (2010). The evolution of gene duplications: classifying and

distinguishing between models. Nature Reviews Genetics, 11(2), 97–108.

https://doi.org/10.1038/nrg2689

Jellie, A. M., Tate, W. P., & Trotman, C. N. A. (1996). Evolutionary history of introns in a

multidomain globin gene. Journal of Molecular Evolution, 42(6), 641–647.

https://doi.org/10.1007/BF02338797

Jokumsen, A., & Fyhn, H. J. (1982). The influence of aerial exposure upon respiratory and

osmotic properties of haemolymph from two intertidal mussels, Mytilus edulis L. and

Modiolus modiolus L. Journal of Experimental Marine Biology and Ecology, 61(2), 189–

203. https://doi.org/10.1016/0022-0981(82)90008-9

Kaessmann, H. (2010). Origins, evolution, and phenotypic impact of new genes. Genome

Research, 20(10), 1313–1326. https://doi.org/10.1101/gr.101386.109

Karstensen, J., Stramma, L., & Visbeck, M. (2008). Oxygen minimum zones in the eastern

tropical Atlantic and Pacific oceans. Progress in Oceanography, 77(4), 331–350.

https://doi.org/10.1016/j.pocean.2007.05.009

Kato, K., Tokishita, S., Mandokoro, Y., Kimura, S., Ohta, T., Kobayashi, M., & Yamagata,

H. (2001). Two-domain haemoglobin gene of the water flea Moina macrocopa:

duplication in the ancestral Cladocera, diversification, and loss of a bridge intron. Gene,

273(1), 41–50. https://doi.org/10.1016/S0378-1119(01)00569-8

Kawano, K., Iwasaki, N., & Suzuki, T. (2003). Notable diversity in haemoglobin expression

patterns among species of the deep-sea clam, Calyptogena. Cellular and Molecular Life

Sciences CMLS, 60(9), 1952–1956. https://doi.org/10.1007/s00018-003-3184-7

Koch, J., & Burmester, T. (2016). Membrane-bound globin X protects the cell from reactive

oxygen species. Biochemical and Biophysical Research Communications, 469(2), 275–

280. https://doi.org/10.1016/j.bbrc.2015.11.105

Koch, J., Lüdemann, J., Spies, R., Last, M., Amemiya, C. T., & Burmester, T. (2016).

Unusual diversity of myoglobin genes in the lungfish. Molecular Biology and Evolution,

msw159. https://doi.org/10.1093/molbev/msw159

Koch, L. G., & Britton, S. L. (2008). Aerobic metabolism underlies complexity and capacity.

The Journal of Physiology, 586(1), 83–95. https://doi.org/10.1113/jphysiol.2007.144709

Kugelstadt, D., Haberkamp, M., Hankeln, T., & Burmester, T. (2004). Neuroglobin,

cytoglobin, and a novel, eye-specific globin from chicken. Biochemical and Biophysical

Research Communications, 325(3), 719–725. https://doi.org/10.1016/j.bbrc.2004.10.080

Lan, X., & Pritchard, J. K. (2016). Coregulation of tandem duplicate genes slows evolution of

subfunctionalization in mammals. Science, 352(6288), 1009–1013.

https://doi.org/10.1126/science.aad8411

Linzen, B., Soeter, N. M., Riggs, A. F., Schneider, H. J., Schartau, W., & Moore, M. D.

(1985). The structure of hemocyanins. Science, 229(4713), 519–524.

https://doi.org/10.1126/science.4023698

Liu, Y., Cotton, J. A., Shen, B., Han, X., Rossiter, S. J., & Zhang, S. (2010). Convergent

sequence evolution between echolocating bats and dolphins. Current Biology, 20(2), R53–

R54. https://doi.org/10.1016/j.cub.2009.11.058

Lynch, M., & Force, A. (2000). The probability of duplicate gene preservation by

subfunctionalization. Genetics, 154(1), 459–473.

Maeda, N., & Fitch, W. M. (1982). Frog heart monomeric haemoglobin. In Methods in

Protein Sequence Analysis (pp. 569–570). Humana Press. Retrieved from

http://link.springer.com/chapter/10.1007/978-1-4612-5832-2_63

Mangum, C. P. (1998). Major Events in the Evolution of the Oxygen Carriers. American

Zoologist, 38(1), 1–13.

Mangum, C. P. (1997). Introduction The Red Blood Cell Haemoglobins Distribution and

localization Molecular structure. (Vol. 2).

Mangum, C. P. (1992). Respiratory function of the red blood cell haemoglobins of six animal

phyla. In C. P. Mangum (Ed.), Blood and Tissue Oxygen Carriers (pp. 117–149). Berlin,

Heidelberg: Springer Berlin Heidelberg. Retrieved from http://dx.doi.org/10.1007/978-3-642-76418-9_5

Mangum, C. P., Scott, J. L., Miller, K. I., Van Holde, K. E., & Morse, M. P. (1987). Bivalve

hemocyanin: structural, functional, and phylogenetic relationships. The Biological

Bulletin, 173(1), 205–221.

Mangum, C. P., Woodin, B. R., Bonaventura, C., Sullivan, B., & Bonaventura, J. (1975). The

role of coelomic and vascular haemoglobin in the annelid family Terebellidae.

Comparative Biochemistry and Physiology Part A: Physiology, 51(2), 281–294.

https://doi.org/10.1016/0300-9629(75)90372-2

Mann, R. G., Fisher, W. K., Gilbert, A. T., & Thompson, E. O. P. (1986). Genetic variation

of the dimeric haemoglobin of the bivalve mollusc Anadara trapezia. Australian Journal

of Biological Sciences, 39(2), 109–116.

Manning, A. M., Trotman, C. N. A., & Tate, W. P. (1990). Evolution of a polymeric globin in

the brine shrimp Artemia. Nature, 348(6302), 653–656. https://doi.org/10.1038/348653a0

Manwell, C. (1963). The chemistry and biology of haemoglobin in some marine clams—I.

Distribution of the pigment and properties of the oxygen equilibrium. Comparative

Biochemistry and Physiology, 8(3), 209–218. https://doi.org/10.1016/0010-406X(63)90125-7

Michaelidis, B., Haas, D., & Grieshaber, M. K. (2005). Extracellular and intracellular acid‐

base status with regard to the energy metabolism in the oyster Crassostrea gigas during

exposure to air. Physiological and Biochemical Zoology: Ecological and Evolutionary

Approaches, 78(3), 373–383. https://doi.org/10.1086/430223

Mikkelsen, P. M., & Bieler, R. (2003). Systematic revision of the western Atlantic file clams,

Lima and Ctenoides (Bivalvia : Limoida : Limidae). Invertebrate Systematics, 17(5), 667–

710.

Moleirinho, A., Seixas, S., Lopes, A. M., Bento, C., Prata, M. J., & Amorim, A. (2013).

Evolutionary constraints in the β-globin cluster: The signature of purifying selection at the

δ-globin (HBD) locus and its role in developmental gene regulation. Genome Biology and

Evolution, 5(3), 559–571. https://doi.org/10.1093/gbe/evt029

Montes-Rodríguez, I. M., Rivera, L. E., López-Garriga, J., & Cadilla, C. L. (2016).

Characterization and expression of the Lucina pectinata oxygen and sulfide binding

haemoglobin genes. PLoS ONE, 11(1), e0147977.

https://doi.org/10.1371/journal.pone.0147977

Moriyama, Y., Ito, F., Takeda, H., Yano, T., Okabe, M., Kuraku, S., ... & Koshiba-Takeuchi,

K. (2016). Evolution of the fish heart by sub/neofunctionalization of an elastin

gene. Nature Communications, 7.

Morse, M. P., Meyhofer, E., Otto, J. J., & Kuzirian, A. M. (1986). Hemocyanin respiratory

pigment in bivalve mollusks. Science, 231(4743), 1302–1304.

https://doi.org/10.1126/science.3945826

Motoyama, H., Komiya, T., Thuy, L. T. T., Tamori, A., Enomoto, M., Morikawa, H., … &

Kawada, N. (2014). Cytoglobin is expressed in hepatic stellate cells, but not in

myofibroblasts, in normal and fibrotic human liver. Laboratory Investigation, 94(2), 192–

207. https://doi.org/10.1038/labinvest.2013.135

Naito, Y., Riggs, C. K., Vandergon, T. L., & Riggs, A. F. (1991). Origin of a “bridge” intron

in the gene for two domain globin. Proceedings of the National Academy of Sciences of

the USA, 88(15). https://doi.org/10.1073/pnas.88.15.6672

Natarajan, C., Projecto-Garcia, J., Moriyama, H., Weber, R. E., Muñoz-Fuentes, V., Green,

A. J., … & Storz, J. F. (2015). Convergent Evolution of Haemoglobin Function in High-

Altitude Andean Waterfowl Involves Limited Parallelism at the Molecular Sequence

Level. PLoS Genetics, 11(12), e1005681. https://doi.org/10.1371/journal.pgen.1005681

Negrisolo, E., Pallavicini, A., Barbato, R., Dewilde, S., Ghiretti-Magaldi, A., Moens, L., &

Lanfranchi, G. (2001). The evolution of extracellular haemoglobins of annelids,

vestimentiferans, and pogonophorans. Journal of Biological Chemistry, 276(28), 26391–

26397. https://doi.org/10.1074/jbc.M100557200

Nicol, P. I., & O’Gower, A. K. (1967). Haemoglobin variation in Anadara trapezia. Nature,

216, 684. https://doi.org/10.1038/216684a0

Norman, J. D., Danzmann, R. G., Glebe, B., & Ferguson, M. M. (2011). The genetic basis of

salinity tolerance traits in Arctic charr (Salvelinus alpinus). BMC Genetics, 12(1), 81.

https://doi.org/10.1186/1471-2156-12-81

Officer, C. B., Biggs, R. B., Taft, J. L., Cronin, L. E., Tyler, M. A., & Boynton, W. R. (1984).

Chesapeake Bay anoxia: origin, development, and significance. Science, 223(6).

O’Gower, A., & Nicol, P. I. (1968). A latitudinal cline of haemoglobins in a bivalve mollusc.

Heredity, 23(4), 485–491.

Ohno, S., Wolf, U., & Atkin, N. B. (1968). Evolution from Fish to Mammals by Gene

Duplication. Hereditas, 59(1), 169–187. https://doi.org/10.1111/j.1601-5223.1968.tb02169.x

Oleksiewicz, U., Liloglou, T., Field, J. K., & Xinarianos, G. (2011). Cytoglobin: biochemical,

functional and clinical perspective of the newest member of the globin family. Cellular

and Molecular Life Sciences, 68(23), 3869–3883. https://doi.org/10.1007/s00018-011-0764-9

Opazo, J. C., Lee, A. P., Hoffmann, F. G., Toloza-Villalobos, J., Burmester, T., Venkatesh,

B., & Storz, J. F. (2015). Ancient duplications and expression divergence in the globin

gene superfamily of vertebrates: Insights from the elephant shark genome and

transcriptome. Molecular Biology and Evolution, 32(7), 1684–1694.

https://doi.org/10.1093/molbev/msv054

Opazo, J. C., Hoffmann, F. G., & Storz, J. F. (2008). Genomic evidence for independent

origins of β-like globin genes in monotremes and therian mammals. Proceedings of the

National Academy of Sciences, 105(5), 1590–1595.

https://doi.org/10.1073/pnas.0710531105

Osborn, T. C., Pires, J. C., Birchler, J. A., Auger, D. L., Chen, Z. J., Lee, H.-S., … &

Martienssen, R. A. (2003). Understanding mechanisms of novel gene expression in

polyploids. Trends in Genetics, 19(3), 141–147. https://doi.org/10.1016/S0168-9525(03)00015-5

Parker, J., Tsagkogeorga, G., Cotton, J. A., Liu, Y., Provero, P., Stupka, E., & Rossiter, S. J.

(2013). Genome-wide signatures of convergent evolution in echolocating mammals.

Nature, 502(7470), 228–231. https://doi.org/10.1038/nature12511

Perutz, M. F. (1979). Regulation of oxygen affinity of haemoglobin: influence of structure of

the globin on the heme iron. Annual Review of Biochemistry, 48(1), 327–386.

Pesce, A., Bolognesi, M., Bocedi, A., Ascenzi, P., Dewilde, S., Moens, L., … & Burmester,

T. (2002). Neuroglobin and cytoglobin. EMBO reports, 3(12), 1146–1151.

Petruzzelli, R., Goffredo, B. M., Barra, D., Bossa, F., Boffi, A., Verzili, D., … & Chiancone,

E. (1985). Amino acid sequence of the cooperative homodimeric haemoglobin from the

mollusc Scapharca inaequivalvis and topology of the intersubunit contacts. FEBS Letters,

184(2), 328–332. https://doi.org/10.1016/0014-5793(85)80632-3

Piro, M. C., Gambacurta, A., Basili, P., & Ascoli, F. (1998). The exon/intron organization of

the globin gene of Scapharca inaequivalvis homodimeric haemoglobin: unusual intron

homology with other bivalve mollusc globin genes. Gene, 221(1), 45–49.

https://doi.org/10.1016/S0378-1119(98)00442-9

Piro, M. C., Gambacurta, A., & Ascoli, F. (1996). Scapharca inaequivalvis tetrameric

haemoglobin α and β chains: cDNA sequencing and genomic organization. Journal of

Molecular Evolution, 43(6), 594–601. https://doi.org/10.1007/BF02202107

Prentis, P. J., & Pavasovic, A. (2014). The Anadara trapezia transcriptome: A resource for

molluscan physiological genomics. Marine Genomics, 18, 113–115.

https://doi.org/10.1016/j.margen.2014.08.004

Projecto-Garcia, J., Jollivet, D., Mary, J., Lallier, F. H., Schaeffer, S. W., & Hourdez, S.

(2015). Selective forces acting during multi-domain protein evolution: the case of multi-

domain globins. SpringerPlus, 4(1), 1–14. https://doi.org/10.1186/s40064-015-1124-2

Projecto-Garcia, J., Zorn, N., Jollivet, D., Schaeffer, S. W., Lallier, F. H., & Hourdez, S.

(2010). Origin and evolution of the unique tetra-domain haemoglobin from the

hydrothermal vent scale worm Branchipolynoe. Molecular Biology and Evolution, 27(1),

143–152. https://doi.org/10.1093/molbev/msp218

Rabalais, N. N., Diaz, R. J., Levin, L. A., Turner, R. E., Gilbert, D., & Zhang, J. (2010).

Dynamics and distribution of natural and human-caused hypoxia. Biogeosciences, 7(2),

585–619.

Rawat, R. (2010). Anatomy of Mollusca. Mittal Publications.

Read, K. R. (1966). Molluscan haemoglobin and myoglobin. Academic Press, New York

(Vol. 2).

Reitman, M., Grasso, J. A., Blumenthal, R., & Lewit, P. (1993). Primary sequence, evolution,

and repetitive elements of the Gallus gallus (chicken) β-globin cluster. Genomics, 18(3),

616–626. https://doi.org/10.1016/S0888-7543(05)80364-7

Riggs, A. F. (1991). Aspects of the origin and evolution of non-vertebrate haemoglobins.

American Zoologist, 31(3), 535–545. https://doi.org/10.1093/icb/31.3.535

Roeder, G. S. (1983). Unequal crossing-over between yeast transposable elements. Molecular

and General Genetics MGG, 190(1), 117–121.

Roesner, A., Fuchs, C., Hankeln, T., & Burmester, T. (2005). A globin gene of ancient

evolutionary origin in lower vertebrates: evidence for two distinct globin families in

animals. Molecular Biology and Evolution, 22(1), 12–20.

https://doi.org/10.1093/molbev/msh258

Ronda, L., Bettati, S., Henry, E. R., Kashav, T., Sanders, J. M., Royer, W. E., & Mozzarelli,

A. (2013). Tertiary and quaternary allostery in tetrameric haemoglobin from Scapharca

inaequivalvis. Biochemistry, 52(12), 2108–2117. https://doi.org/10.1021/bi301620x

Royer, W. E., Zhu, H., Gorr, T. A., Flores, J. F., & Knapp, J. E. (2005). Allosteric

haemoglobin assembly: diversity and similarity. Journal of Biological Chemistry, 280(30),

27477–27480. https://doi.org/10.1074/jbc.R500006200

Royer Jr, W. E., Knapp, J. E., Strand, K., & Heaslet, H. A. (2001). Cooperative

haemoglobins: conserved fold, diverse quaternary assemblies and allosteric mechanisms.

Trends in Biochemical Sciences, 26(5), 297–304.

https://doi.org/10.1016/S0968-0004(01)01811-4

Royer, W. E., Strand, K., van Heel, M., & Hendrickson, W. A. (2000). Structural hierarchy in

erythrocruorin, the giant respiratory assemblage of annelids. Proceedings of the National

Academy of Sciences, 97(13), 7107–7111. https://doi.org/10.1073/pnas.97.13.7107

Royer, W. E., Love, W. E., & Fenderson, F. F. (1985). Cooperative dimeric and tetrameric

clam haemoglobins are novel assemblages of myoglobin folds. Nature, 316(6025),

277–280. https://doi.org/10.1038/316277a0

Sankaran, V. G., Menne, T. F., Xu, J., Akie, T. E., Lettre, G., Handel, B. V., … & Orkin, S.

H. (2008). Human fetal haemoglobin expression is regulated by the developmental stage-

specific repressor BCL11A. Science, 322(5909), 1839–1842.

https://doi.org/10.1126/science.1165409

Schindelmeiser, I., Kuhlmann, D., & Nolte, A. (1979). Localization and characterization of

hemoproteins in the central nervous tissue of some gastropods. Comparative Biochemistry

and Physiology Part B: Comparative Biochemistry, 64(2), 149–154.

https://doi.org/10.1016/0305-0491(79)90153-6

Schwarze, K., Campbell, K. L., Hankeln, T., Storz, J. F., Hoffmann, F. G., & Burmester, T.

(2014). The globin gene repertoire of lampreys: convergent evolution of haemoglobin and

myoglobin in jawed and jawless vertebrates. Molecular Biology and Evolution, 31(10),

2708–2721. https://doi.org/10.1093/molbev/msu216

Scott, K. M., & Fisher, C. R. (1995). Physiological ecology of sulfide metabolism in

hydrothermal vent and cold seep vesicomyid clams and vestimentiferan tube worms.

American Zoologist, 35(2), 102–111. https://doi.org/10.1093/icb/35.2.102

Shumway, S. E., & Parsons, G. J. (2011). Scallops: Biology, Ecology and Aquaculture.

Elsevier.

Sidell, B. D., & O’Brien, K. M. (2006). When bad things happen to good fish: the loss of

haemoglobin and myoglobin expression in Antarctic icefishes. Journal of Experimental

Biology, 209(10), 1791–1802. https://doi.org/10.1242/jeb.02091

Singh, S., Canseco, D. C., Manda, S. M., Shelton, J. M., Chirumamilla, R. R., Goetsch, S. C.,

… & Mammen, P. P. A. (2014). Cytoglobin modulates myogenic progenitor cell viability

and muscle regeneration. Proceedings of the National Academy of Sciences, 111(1),

E129–E138. https://doi.org/10.1073/pnas.1314962111

Smith, M. H. (1967). Occurrence of haemoglobin in some molluscs. Comparative

Biochemistry and Physiology, 20(1), 361–364.

https://doi.org/10.1016/0010-406X(67)90755-4

Souza, P. C. de, & Bonilla-Rodriguez, G. O. (2007). Fish haemoglobins. Brazilian Journal of

Medical and Biological Research, 40(6), 769–778.

https://doi.org/10.1590/S0100-879X2007000600004

Stapley, J., Reger, J., Feulner, P. G. D., Smadja, C., Galindo, J., Ekblom, R., … Slate, J.

(2010). Adaptation genomics: the next generation. Trends in Ecology & Evolution, 25(12),

705–712. https://doi.org/10.1016/j.tree.2010.09.002

Storz, J. F., Bridgham, J. T., Kelly, S. A., & Garland, T. (2015). Genetic approaches in

comparative and evolutionary physiology. American Journal of Physiology - Regulatory,

Integrative and Comparative Physiology, 309(3), R197–R214.

https://doi.org/10.1152/ajpregu.00100.2015

Storz, J. F., Hoffmann, F. G., Opazo, J. C., & Moriyama, H. (2008). Adaptive functional

divergence among triplicated α-globin genes in rodents. Genetics, 178(3), 1623–1638.

https://doi.org/10.1534/genetics.107.080903

Storz, J. F., Opazo, J. C., & Hoffmann, F. G. (2013). Gene duplication, genome duplication,

and the functional diversification of vertebrate globins. Molecular Phylogenetics and

Evolution, 66(2), 469–478. https://doi.org/10.1016/j.ympev.2012.07.013

Stothard, P. (2000). The sequence manipulation suite: JavaScript programs for analyzing and

formatting protein and DNA sequences. BioTechniques, 28(6), 1102, 1104.

Strand, K., Knapp, J. E., Bhyravbhatla, B., & Royer Jr, W. E. (2004). Crystal structure of the

haemoglobin dodecamer from Lumbricus erythrocruorin: allosteric core of giant annelid

respiratory complexes. Journal of Molecular Biology, 344(1), 119–134.

https://doi.org/10.1016/j.jmb.2004.08.094

Su, C.-Y., Kemp, H. A., & Moens, C. B. (2014). Cerebellar development in the absence of

Gbx function in zebrafish. Developmental Biology, 386(1), 181–190.

https://doi.org/10.1016/j.ydbio.2013.10.026

Sullivan, G. (1961). Functional morphology, micro-anatomy, and histology of the “Sydney

cockle” Anadara trapezia (Deshayes) (Lamellibranchia: Arcidae). Australian Journal of

Zoology, 9(2), 219–257.

Surm, J. M., Prentis, P. J., & Pavasovic, A. (2015). Comparative analysis and distribution of

omega-3 lcPUFA biosynthesis genes in marine molluscs. PLoS ONE, 10(8), e0136301.

https://doi.org/10.1371/journal.pone.0136301

Suzuki, T., Kawamichi, H., Ohtsuki, R., Iwai, M., & Fujikura, K. (2000). Isolation and

cDNA-derived amino acid sequences of haemoglobin and myoglobin from the deep-sea

clam Calyptogena kaikoi. Biochimica et Biophysica Acta (BBA)-Protein Structure and

Molecular Enzymology, 1478(1), 152–158.

Suzuki, T., Kawasaki, Y., Arita, T., & Nakamura, A. (1996). Two-domain haemoglobin of

the blood clam Barbatia lima resulted from the recent gene duplication of the single-

domain delta chain. Biochemical Journal, 313, 561–566.

Suzuki, T., & Arita, T. (1995). Two-domain haemoglobin from the blood clam, Barbatia

lima. The cDNA-derived amino acid sequence. Journal of Protein Chemistry, 14(7),

499–502.

Suzuki, T., Nakamura, A., Satoh, Y., Inai, C., Furukohri, T., & Arita, T. (1992). Primary

structure of chain I of the heterodimeric haemoglobin from the blood clam Barbatia

virescens. Journal of Protein Chemistry, 11(6), 629–633.

https://doi.org/10.1007/BF01024963

Tamura, K., Stecher, G., Peterson, D., Filipski, A., & Kumar, S. (2013). MEGA6: Molecular

Evolutionary Genetics Analysis version 6.0. Molecular Biology and Evolution, 30(12),

2725–2729. https://doi.org/10.1093/molbev/mst197

Terwilliger, N. B. (1998). Functional adaptations of oxygen-transport proteins. Journal of

Experimental Biology, 201(8), 1085–1098.

Terwilliger, N. B., Terwilliger, R. C., Meyhöfer, E., & Morse, M. P. (1988). Bivalve

hemocyanins—a comparison with other molluscan hemocyanins. Comparative

Biochemistry and Physiology Part B: Comparative Biochemistry, 89(1), 189–195.

https://doi.org/10.1016/0305-0491(88)90282-9

Terwilliger, R. C., Terwilliger, N. B., & Arp, A. (1983). Thermal vent clam (Calyptogena

magnifica) haemoglobin. Science, 219(4587), 981–983.

https://doi.org/10.1126/science.219.4587.981

Terwilliger, R. C. (1980). Structures of invertebrate haemoglobins. American Zoologist,

20(1), 53–67. https://doi.org/10.1093/icb/20.1.53

Terwilliger, R. C., Terwilliger, N. B., & Schabtach, E. (1978). Extracellular haemoglobin of

the clam, Cardita borealis (Conrad): An unusual polymeric haemoglobin. Comparative

Biochemistry and Physiology Part B: Comparative Biochemistry, 59(1), 9–14.

https://doi.org/10.1016/0305-0491(78)90262-6

Titchen, D. A., Glenn, W. K., Nassif, N., Thompson, A. R., & Thompson, E. O. P. (1991). A

minor globin gene of the bivalve mollusc Anadara trapezia. Biochimica et Biophysica

Acta (BBA) - Gene Structure and Expression, 1089(1), 61–67.

https://doi.org/10.1016/0167-4781(91)90085-Z

Torres-Mercado, E., Renta, J. Y., Rodríguez, Y., López-Garriga, J., & Cadilla, C. L. (2003).

The cDNA-derived amino acid sequence of haemoglobin II from Lucina pectinata.

Journal of Protein Chemistry, 22(7–8), 683–690.

https://doi.org/10.1023/B:JOPC.0000008734.44356.b7

Toulmond, A., & Tchernigovtzeff, C. (1984). Ventilation and respiratory gas exchanges of

the lugworm Arenicola marina (L.) as functions of ambient PO2 (20–700 torr).

Respiration Physiology, 57(3), 349–363. https://doi.org/10.1016/0034-5687(84)90083-5

Trent, R. J., Bowden, D. K., Old, J. M., Wainscoat, J. S., Clegg, J. B., & Weatherall, D. J.

(1981). A novel rearrangement of the human β-like globin gene cluster. Nucleic Acids

Research, 9(24), 6723–6734. https://doi.org/10.1093/nar/9.24.6723

van der Burg, C. A., Prentis, P. J., Surm, J. M., & Pavasovic, A. (2016). Insights into the

innate immunome of actiniarians using a comparative genomic approach. BMC Genomics,

17, 850. https://doi.org/10.1186/s12864-016-3204-2

Vernimmen, D. (2014). Uncovering enhancer functions using the α-globin locus. PLoS

Genetics, 10(10), e1004668. https://doi.org/10.1371/journal.pgen.1004668

Vinogradov, S. N. (1985). The structure of invertebrate extracellular haemoglobins

(erythrocruorins and chlorocruorins). Comparative Biochemistry and Physiology Part B:

Comparative Biochemistry, 82(1), 1–15. https://doi.org/10.1016/0305-0491(85)90120-8

Vinogradov, S. N., & Moens, L. (2008). Diversity of globin function: enzymatic, transport,

storage, and sensing. Journal of Biological Chemistry, 283(14), 8773–8777.

https://doi.org/10.1074/jbc.R700029200

Vogel, C., Teichmann, S. A., & Pereira-Leal, J. (2005). The relationship between domain

duplication and recombination. Journal of Molecular Biology, 346(1), 355–365.

https://doi.org/10.1016/j.jmb.2004.11.050

Vuddhakul, V., Soboon, S., Sunghiran, W., Kaewpiboon, S., Chowdhury, A., Ishibashi, M.,

… & Nishibuchi, M. (2006). Distribution of virulent and pandemic strains of Vibrio

parahaemolyticus in three molluscan shellfish species (Meretrix meretrix, Perna viridis,

and Anadara granosa) and their association with foodborne disease in southern

Thailand. Journal of Food Protection, 69(11), 2615–2620.

Wajcman, H., Kiger, L., & Marden, M. C. (2009). Structure and function evolution in the

superfamily of globins. Comptes Rendus Biologies, 332(2–3), 273–282.

https://doi.org/10.1016/j.crvi.2008.07.026

Wang, Y., Coleman-Derr, D., Chen, G., & Gu, Y. Q. (2015). OrthoVenn: a web server for

genome wide comparison and annotation of orthologous clusters across multiple species.

Nucleic Acids Research, 43(W1), W78–W84. https://doi.org/10.1093/nar/gkv487

Wang, W. X., & Widdows, J. (1991). Physiological responses of mussel larvae Mytilus edulis

to environmental hypoxia and anoxia. Marine Ecology Progress Series, 70, 223–236.

Watanabe, S., Takahashi, N., Uchida, H., & Wakasugi, K. (2012). Human neuroglobin

functions as an oxidative stress-responsive sensor for neuroprotection. Journal of

Biological Chemistry, 287(36), 30128–30138. https://doi.org/10.1074/jbc.M112.373381

Wawrowski, A., Gerlach, F., Hankeln, T., & Burmester, T. (2011). Changes of globin

expression in the Japanese medaka (Oryzias latipes) in response to acute and chronic

hypoxia. Journal of Comparative Physiology. B, Biochemical, Systemic, and

Environmental Physiology, 181(2), 199–208. https://doi.org/10.1007/s00360-010-0518-2

Weatherall, D. J. (2001). Phenotype—genotype relationships in monogenic disease: lessons

from the thalassaemias. Nature Reviews Genetics, 2(4), 245–255.

https://doi.org/10.1038/35066048

Weber, R. E., & Vinogradov, S. N. (2001). Nonvertebrate haemoglobins: functions and

molecular adaptations. Physiological Reviews, 81(2), 569–628.

Weber, R. E. (1980). Functions of invertebrate haemoglobins with special reference to

adaptations to environmental hypoxia. American Zoologist, 20(1), 79–101.

https://doi.org/10.1093/icb/20.1.79

Widdows, J., Bayne, B. L., Livingstone, D. R., Newell, R. I. E., & Donkin, P. (1979).

Physiological and biochemical responses of bivalve molluscs to exposure to air.

Comparative Biochemistry and Physiology Part A: Physiology, 62(2), 301–308.

https://doi.org/10.1016/0300-9629(79)90060-4

Witeska, M. (2013). Erythrocytes in teleost fishes: a review. Zoology and Ecology, 23(4),

275–281. https://doi.org/10.1080/21658005.2013.846963

Wittenberg, J. B., & Wittenberg, B. A. (2003). Myoglobin function reassessed. Journal of

Experimental Biology, 206(12), 2011–2020. https://doi.org/10.1242/jeb.00243

Wray, G. A., Hahn, M. W., Abouheif, E., Balhoff, J. P., Pizer, M., Rockman, M. V., &

Romano, L. A. (2003). The evolution of transcriptional regulation in eukaryotes.

Molecular Biology and Evolution, 20(9), 1377–1419.

https://doi.org/10.1093/molbev/msg140

Wray, G. A., Levinton, J. S., & Shapiro, L. H. (1996). Molecular evidence for deep

Precambrian divergences among metazoan phyla. Science, 274(5287), 568–573.

Yagil, C., Hubner, N., Monti, J., Schulz, H., Sapojnikov, M., Luft, F. C., … & Yagil, Y.

(2005). Identification of hypertension-related genes through an integrated genomic-

transcriptomic approach. Circulation Research, 96(6), 617–625.

https://doi.org/10.1161/01.RES.0000160556.52369.61

Ye, J., Fang, L., Zheng, H., Zhang, Y., Chen, J., Zhang, Z., … Wang, J. (2006). WEGO: a

web tool for plotting GO annotations. Nucleic Acids Research, 34(Web Server issue),

W293–W297. https://doi.org/10.1093/nar/gkl031

Zhang, G., Li, C., Li, Q., Li, B., Larkin, D. M., Lee, C., … & Wang, J. (2014). Comparative

genomics reveals insights into avian genome evolution and adaptation. Science,

346(6215), 1311–1320. https://doi.org/10.1126/science.1251385

Zhang, J. (2003). Evolution by gene duplication: an update. Trends in Ecology & Evolution,

18(6), 292–298. https://doi.org/10.1016/S0169-5347(03)00033-8

Appendices

Appendix A: Poster presented at the annual Lorne Genome conference 2015
