CHARACTERISATION OF DUPLICATED HAEMOGLOBIN GENES IN BIVALVES
Mathilde Klein
Bachelor of Medical Laboratory Science, QUT
Submitted in fulfilment of the requirements for the degree of
Master of Applied Science (Research)
School of Biomedical Sciences, IHBI
Faculty of Health
Queensland University of Technology
2017
Keywords
Arcoida, Bivalves, Gene duplication, Genomics, Globins, Haemoglobin,
Limoida, Molluscs, Transcriptome
Abstract
Haemoglobins (Hbs) are found in virtually all phyla and are some of the most investigated proteins in biomedical sciences. These proteins exhibit an extraordinary diversity of form and function in invertebrate lineages. This provides a unique opportunity to explore the origin and evolution of Hbs, yet little is known about their distribution, function and evolution in invertebrate lineages. To further explore the functions and evolution of these Hbs, recent transcriptome data for the arcid bivalve Anadara trapezia are investigated here. This species shows the presence of duplicated Hb-encoding genes, suggesting that gene duplication may have been more extensive than previously thought in bivalves. This study tests the hypothesis that these duplicated genes show patterns of tissue-specific expression and evidence of neofunctionalisation. This is shown here for at least three Hb-encoding genes present in A. trapezia, with strong tissue-specific expression in haemolymph compared to other tissues. Furthermore, the expression of these genes remains unaffected by prolonged air exposure, suggesting that neofunctionalisation may confer an evolutionary advantage to this bivalve. As well as the unique Hbs found in the bivalve order Arcoida, Hbs are also found in three other bivalve orders: Carditoida, Solemyoida and Veneroida. These four orders that possess Hbs provide compelling evidence for the independent evolution of these proteins in multiple bivalve lineages. To expand data on the distribution of Hbs in bivalves, a transcriptome sequence for Ctenoides ales in the order Limoida was generated in this project. Interrogation of the transcriptome shows the presence of at least three globin-like encoding genes, including two Hb-like encoding genes, providing preliminary evidence for another independent origin of Hb in a bivalve lineage.
Overall, this study provides novel insights into the function, evolution and distribution of Hbs in bivalves by investigating two distantly related species. Results of this study are consistent with current theories that Hb diversity in bivalves is a result of repeated rounds of gene duplication providing the raw material for evolution. Investigation of hypoxic resistance also reinforces that greater expression of Hbs in haemolymph confers a physiological advantage, suggesting that Hb would evolve more often in some lineages during adaptation to unfavourable environmental conditions, particularly hypoxia and prolonged air exposure. The finding of Hb-like encoding genes in another bivalve lineage also supports the evolution of this gene family through independent evolution and gene duplication, and gives insight into the distribution of globin genes in bivalves, which is still poorly understood. The investigation of Hb genes in these bivalves also contributes to a further understanding of the role of Hbs and provides potential novel insights for resistance in hypoxic environments, disease control and resistance to pollution in aquaculture.
Table of Contents
Keywords
Abstract
List of Figures
List of Tables
List of Abbreviations
Statement of Original Authorship
Acknowledgements
Chapter 1: Introduction
1.1 Functional and structural diversity within the globin superfamily
1.1.1 Recently discovered globin genes
1.1.2 Myoglobins and haemoglobins
1.2 Evolution of haemoglobins
1.2.1 Gene duplication
1.2.2 Divergent evolution of duplicated genes
1.2.3 Convergent evolution
1.2.4 Invertebrate haemoglobins
1.3 Bivalve haemoglobins
1.3.1 Haemoglobins in the family Arcidae, Pteriomorphia subclass
1.4 Aims
1.5 Thesis Outline
Chapter 2: Tissue specificity and neofunctionalisation of haemoglobin genes in Anadara trapezia
2.1 Background
2.2 Material and Methods
2.2.1 Sample acquisition and tissue dissection
2.2.2 RNA extraction and cDNA synthesis
2.2.3 RT–PCR (Real Time PCR)
2.2.4 Candidate gene validation
2.2.5 RT-qPCR (Real Time quantitative PCR) for quantification of gene expression
2.2.6 RT-qPCR data analysis
2.3 Results
2.3.1 Haemolymph analysis
2.3.2 RT-PCR
2.3.3 Candidate gene validation
2.3.4 RT-qPCR for quantification of gene expression
2.4 Discussion
2.4.1 Haemolymph characteristics
2.4.2 Tissue specific expression and neofunctionalisation
2.5 Conclusion
Chapter 3: Functional annotation of the Ctenoides ales transcriptome
3.1 Background
3.2 Materials and Methods
3.2.1 Sample collection
3.2.2 RNA extraction, library preparation and sequencing
3.2.3 Transcriptome assembly and validation
3.2.4 Functional annotation of transcripts and mapping
3.2.5 Comparative transcriptomics
3.2.6 Candidate genes identification
3.2.7 Primer design and candidate gene validation
3.2.8 Phylogenetic analysis of sequences
3.3 Results
3.3.1 RNA extraction, library preparation and sequencing
3.3.2 Transcriptome assembly and validation
3.3.3 Functional annotation of transcripts and mapping
3.3.4 Comparative transcriptomics
3.3.5 Candidate genes
3.3.6 Primer design and candidate gene validation
3.3.7 Phylogeny of globins in bivalves
3.4 Discussion
3.5 Conclusion
Chapter 4: General Discussion
4.1 Role of gene duplication in the current diversity of bivalve Hbs
4.2 Globin gene evolution in hypoxic environments
4.3 The importance of globin gene evolution to aquaculture
4.4 Limitations of the study
4.5 Future research
4.6 Conclusions
References
Appendices
Appendix A: Poster presented at the annual Lorne Genome conference 2015
List of Figures
Figure 1.1. Maximum likelihood tree highlighting relationships between Hbs of jawed (gnathostomes) and jawless (agnathans) vertebrates. In this phylogeny vertebrate-specific globins are grouped into two distinct clades: (i) Cyclostome Hbs + Cygb + Mb + GbE + GbY, (ii) β-Hbs + α-Hbs (Hoffmann et al., 2010a).
Figure 1.2. Comparison of chromosomal organization of α and β globin gene clusters in avian and mammalian taxa. The α and β globin genes represented here encode the α and β subunits of a tetrameric haemoglobin (α2β2) (Zhang et al., 2014).
Figure 1.3. This simplified phylogeny represents evolutionary relationships between major metazoan taxa; some lesser known phyla are not included for simplicity or due to unclear relationships. Taxa in which Hbs have been found are boxed in red. This phylogeny illustrates the independent evolution of Hbs through their presence in 11 major phyla shown here: Arthropods, Nematodes, Nemertines, Mollusks, Annelids, Echiurans, Pogonophorans, Phoronids, Platyhelminthes, Echinoderms and Chordates. Adapted from (Halanych & Passamaneck, 2001).
Figure 1.4. Gene structure from two Hbs found in annelid worms (Branchipolynoe spp.). This diagram illustrates the exon-intron structure and domain architecture for the single-domain (top) and tetra-domain (bottom) globins found in these species (Projecto-Garcia et al., 2010).
Figure 1.5. Phylogenetic classification of class Bivalvia of molluscs. The main bivalve subclasses are represented here in different colours: Protobranchia, Pteriomorphia, Palaeoheterodonta, Archiheterodonta, Anomalodesmata and Imparidentia. This basic phylogeny demonstrates the distribution of Hb (indicated as Hb following species and order name) and Hc (indicated as Hc) among bivalve taxa, and the order of the species in which those respiratory pigments are found is indicated in grey brackets: Solemyoida, Nuculanoida, Arcoida, Carditoida and Veneroida. Adapted from (Gonzáles et al., 2015).
Figure 1.6. Quaternary assembly of Homo sapiens and Scapharca inaequivalvis Hbs. (a) HbA refers to adult Hb in H. sapiens; (b) HbII refers to hetero-tetrameric arcid Hb in S. inaequivalvis and (c) HbI refers to homo-dimeric arcid Hb in S. inaequivalvis. For each Hb structure represented here, haeme groups are shown in red, α (alpha) subunits are shown in dark grey, β (beta) subunits are shown in light grey and E and F helices are shown in blue. This illustrates the diversity of structures found in arcid Hbs with a heterotetramer (b) and a homodimer (c), both found in S. inaequivalvis. It also shows the back to front assembly of the arcid Hbs with E and F helices arranging on the outside of the molecule compared to human Hb (Ronda et al., 2013).
Figure 2.1. Photograph of an A. trapezia specimen illustrating the anatomical position of all five tissues used to assess tissue specific expression of Hb genes in this study.
Figure 2.2. Summary of haemolymph analysis results. This data was obtained by testing a few microliters of fresh haemolymph using an Abbott i-STAT blood analyser. This was done for all haemolymph samples across two conditions of experiment and two timepoints: water 6 h, air 6 h, water 12 h and air 12 h. A) pH measurements, B) percentage of O2 saturation, C) partial pressure of CO2. Significant differences were assessed through ANOVA testing using the program SPSS with pH values, percentage of O2 saturation values and pCO2 values as dependent variables respectively. Condition was used as a factor for each test and results were considered statistically significant at p < 0.05. Statistically different groups are represented by the symbols (*) and (•) in each graph and error bars represent 2 standard errors around the mean.
Figure 2.3. PCR products amplified from two mantle samples to validate candidate Hb genes. Amplified PCR products obtained here were purified using the Bioline isolate PCR purification kit followed by cloning using the Promega pGEM-T and pGEM-T easy vector systems quick protocol. Samples were then sequenced on the ABI Genetic Analyzer 3500 in duplicate. Each gel of this figure represents one mantle sample. For both gels A (sample 1) and B (sample 2): lane 1 contains 100 bp Hyperladder, lanes 2 to 7 contain amplified products for 2D, AG, BG, HB, HD and 18S genes respectively.
Figure 2.4. Relative expression ratios of candidate globin genes amplified using RT-qPCR. Candidate genes amplified in each tissue from the bivalve A. trapezia are represented as follows: 2D, AG, BG and HB. Relative quantification analysis was performed using the ∆∆CT method. Relative expression is expressed as a ratio of levels of target sequences to levels of reference sequences with 18S used here as a housekeeping gene. Significant differences were assessed through ANOVA testing using the program SPSS and differences were considered statistically significant at p < 0.05. Significantly different groups are shown here with an asterisk (*). Ratio values were used as a dependent variable against tissue type for each gene. Error bars represent 2 standard errors around the mean.
Figure 2.5. Examples of amplification curves obtained for target genes (2D, AG, BG, HB) and housekeeping gene 18S in each tissue type. Target amplification curves are represented in orange (left), negative controls in green (left) and reference amplification curves in blue (right). RT-qPCR was performed using a Lightcycler® measuring specific fluorescence at each cycle. All quantitative PCR analyses were repeated in three technical replicates along with negative controls and 18S as a housekeeping gene.
Figure 2.6. Relative expression ratios of candidate globin genes amplified using RT-qPCR. Candidate genes amplified in each tissue from the bivalve A. trapezia are represented as follows: haemolymph, foot, gills, mantle and muscle. Relative quantification analysis was performed using the ∆∆CT method. Relative expression is expressed as a ratio of levels of target sequences to levels of reference sequences with 18S used here as a housekeeping gene. Significant differences were assessed through ANOVA testing using the program SPSS with ratio values as a dependent variable and condition as a factor in each tissue. Differences were considered statistically significant at p < 0.05 and no statistical differences were found here. Error bars represent 2 standard errors around the mean.
Figure 3.1. Workflow overview of functional annotation of the C. ales transcriptome using the Trinotate annotation pipeline. This is a comprehensive annotation suite based on homology searches to known sequence data. Contigs were first used as BLASTx queries against the TrEMBL and Swiss-Prot databases (stringency E-value of 1 x 10^-6). TransDecoder was used to generate a predicted proteome then used as a BLASTp query against the TrEMBL and Swiss-Prot databases. SignalP was used to predict the presence and location of signal peptides and Pfam to determine the presence and position of protein domains. All results were uploaded to the SQLite database and an annotation report was generated. Adapted from van der Burg et al. (2016).
Figure 3.2. Whole tissue from one C. ales individual was sampled and used for RNA extraction. This 1.5 % agarose gel electrophoresis shows RNA extracted from four samples (extraction was performed in quadruplicate) from this individual as follows: lane 1 contains 100 bp Hyperladder, lanes 2-5 contain samples 1-4 respectively. The RNA is visible as strong bands in each sample around the 700 bp mark. RNA obtained was then assessed for quantity and integrity and sequencing libraries were prepared.
Figure 3.3. Bioanalyser results for total RNA quality and quantity. Total RNA samples obtained from whole tissue of one C. ales individual were assessed for quantity and integrity on a Bioanalyzer 2100 RNA nano chip. Results shown here are for two RNA samples labelled here C10 and C11. RNA concentrations are given for each sample.
Figure 3.4. WEGO output for the newly generated transcriptome for C. ales. Web Gene Ontology Annotation Plot (WEGO) was used to characterise the transcriptomic data obtained for C. ales. This figure represents the proportion and number of transcripts assigned Gene Ontology (GO) terms in three different gene ontology categories developed to represent common and basic biological information: cellular component (CC), molecular function (MF) and biological process (BP).
Figure 3.5. Venn diagram illustrating the number of gene clusters shared and unique between the four mollusc species C. gigas, L. gigantea, A. trapezia and C. ales. Orthologous gene clusters for C. ales, A. trapezia and the two model species L. gigantea and C. gigas are annotated and compared here using the program OrthoVenn. For L. gigantea and C. gigas, the predicted proteomes obtained from the genomes are used and the predicted proteomes from whole organism transcriptomes are used for A. trapezia and C. ales. Ortholog groups shown here were identified by an all-against-all reciprocal BLASTp alignment.
Figure 3.6. Globin domains identified for three candidate globin genes. The annotated transcriptome obtained for C. ales was interrogated for candidate globin genes, here defined as contigs that possess a characteristic globin fold sequence and globin domain. The three contigs found to possess these characteristics are shown above as follows: CalesGl1 (top), CalesGl2 (middle) and CalesGl3 (bottom). For candidate gene 2, the purple circle represents a signal peptide and for candidate gene 3, the blue rectangle represents a transmembrane domain. Both were identified using SMART searches for homologous Pfam domains, signal peptides and internal repeats.
Figure 3.7. Candidate gene products from C. ales transcriptome. Each candidate was amplified using primers as shown in Table 3.2 above. Lane 1 in all gels displayed here is Hyperladder 100 bp. Candidate gene 1 is shown in lane 2 of gel (A); candidate gene 2 is shown in lane 3 of gel (B); candidate gene 3 is shown in lane 3 of gel (C). All gels are 1.5 % agarose stained with GelRed™ (Biotium). Products of candidate genes seen here were purified using an Ethanol/EDTA precipitation protocol and samples were sequenced on an ABI Genetic Analyzer 3500 (ThermoFisher).
Figure 3.8. Molecular Phylogenetic analyses by Maximum Likelihood method. Phylogenetic relationships of globin genes were here resolved for the following species: bivalves A. trapezia and C. ales, two mollusc model species L. gigantea and C. gigas, three vertebrate model species H. sapiens, G. gallus and D. rerio. The percentage of trees in which the associated taxa clustered together is shown next to the branches. A discrete Gamma distribution was used to model evolutionary rate differences among sites (3 categories (+G, parameter = 5.3023)). The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 3.8405 % sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site.
Figure 3.9. Multiple alignments of globin protein sequences. All mollusc globin protein sequences used in the phylogeny represented in Figure 3.8 were aligned for sequence comparison and to identify residue conservation using the multiple sequence alignment with high accuracy and high throughput MUSCLE in MEGA. Sequences are grouped by species and include A. trapezia, C. ales, C. gigas and L. gigantea. Residues conserved across all sequences and all species are indicated above by an asterisk (*).
List of Tables
Table 2.1 List of primers designed to amplify products of candidate globin genes identified in the bivalve species A. trapezia using RT-PCR. All primers were designed to amplify the entire ORF of each gene with product sizes between 350 bp and 619 bp.
Table 2.2 List of primers designed to amplify products of globin genes identified in the bivalve species A. trapezia using RT–qPCR and determine their expression levels in different tissues. All primers are designed to amplify products with sizes between 109 bp and 148 bp (Prentis & Pavasovic, 2014).
Table 2.3 RT-PCR results summary for tissue specific expression. All five genes (2D, AG, BG, HB and HD) have been amplified in six biological replicates for each tissue and each condition. To summarise the results, in this table, a score is given out of 6 (total number of biological replicates) for each candidate gene amplified in each tissue and in each condition. An asterisk (*) represents weak bands obtained on gel electrophoresis for at least 2 biological replicates out of 6 in that group.
Table 3.1 Summary of sequencing and assembly data for the bivalve C. ales. Sequencing libraries were prepared using an Illumina TruSeq® stranded RNA library prep kit and the final cDNA library was sequenced on an Illumina NextSeq500 using 150 bp paired-end chemistry. Libraries obtained were then assembled and the table below summarises results obtained.
Table 3.2 Primers designed for validation of three candidate genes identified from the newly generated C. ales transcriptome. Primers were designed using Primer3 software to amplify the entire ORFs and validate the candidate genes identified. Forward and reverse primers were designed for each candidate as shown in the table below.
List of Abbreviations
AA- Amino acid
Angb- Androglobin
BLAST- Basic local alignment search tool
CO- Carbon monoxide
Cygb- Cytoglobin
GbE- Globin E
GbX- Globin X
GbY- Globin Y
Hb- Haemoglobin
LUCA- Last universal common ancestor
NO- Nitric oxide
Mb- Myoglobin
NCBI- National Center for Biotechnology Information
Ngb- Neuroglobin
ORF- Open reading frame
PCR- Polymerase chain reaction
ROS- Reactive Oxygen Species
Statement of Original Authorship
The work contained in this thesis has not been previously submitted to meet requirements for an award at this or any other higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made.
QUT Verified Signature
Signature:
Date: _____June 2017______
Acknowledgements
I would firstly like to express my gratitude to my supervisors Dr. Ana Pavasovic (School of Biomedical Sciences, Queensland University of Technology), Dr. Peter Prentis (School of Earth, Environmental and Biological Sciences, QUT) and Professor Louise Hafner (School of Biomedical Sciences, QUT) for their supervision and guidance throughout my research and writing of this thesis. I would also like to thank them for their continuous support, patience and encouragement throughout these challenging two years.
A special thank you to my principal supervisors Dr. Ana Pavasovic and Dr. Peter Prentis for their flexibility, genuine caring and faith in me along the way.
I would like to thank all my colleagues in the evolutionary and physiological genomics lab (ePGL), especially Shorash Amin, Hayden Smith, Joachim Surm and Chloe van de Burg for their ongoing help and support. I would also like to thank Associate Professor Christopher Collet (School of Biomedical Sciences, QUT) and Dr. David Hurwood (School of Earth, Environmental and Biological Sciences, QUT) for reviewing my thesis draft and providing valuable feedback during my final seminar.
I would like to thank all the staff at the QUT marine lab facility for their helpful advice regarding care of the marine animals and tanks. I would also like to thank QUT HPC and QUT MGRF for use of their facilities.
I would like to thank my family for giving me the opportunity to study in Australia and for their ongoing loving support and encouragement.
Lastly, I would like to thank my partner Tom for his emotional support, endless patience and encouragement. He has always been a source of strength and has kept me determined throughout this challenging journey.
Chapter 1: Introduction
1.1 Functional and structural diversity within the globin superfamily
The globins are an ancient gene superfamily, thought to have been present in the last universal common ancestor (LUCA) of the three domains of life (Hoffmann et al., 2010a). Globin genes encode small iron metalloproteins, typically ~150 amino acids in length. Globin domains contain a proximal histidine residue in the F helix for iron binding and a distal histidine residue on the opposite side of this iron for oxygen binding (Bashford et al., 1987). Structurally, globin domains consist of eight alpha (α) helical segments which form the characteristic globin fold in a three-on-three α helical structure with a central haeme group comprising a proto-porphyrin ring (Perutz, 1979; Blank & Burmester, 2012). It is this haeme prosthetic group that allows globin proteins to reversibly bind oxygen (O2) and other gaseous ligands (Pesce et al., 2002). For a long time, the metazoan (animal) globin superfamily was thought to consist of only two globin types, haemoglobin (Hb) and myoglobin (Mb), but recent research has shown that this superfamily is functionally and structurally far more diverse than originally thought (Götting & Nikinmaa, 2015).
The globin gene superfamily includes genes encoding Hbs, neuroglobins (Ngbs), cytoglobins (Cygbs), globin X (GbX), androglobins (Angbs), Mbs, globin E (GbE) and globin Y (GbY) among others (Hoffmann et al., 2010a; Storz et al., 2013; Opazo et al., 2015). This diverse group of genes is thought to have evolved from a common ancestral gene through repeated rounds of gene and genome duplication, some 1.8 billion years ago (Efstratiadis et al., 1980; Wajcman et al., 2009). This coincides with the accumulation of O2 in the atmosphere, suggesting that globin genes arose as a mechanism to scavenge toxic O2, carbon monoxide (CO) and nitric oxide (NO) gases (Hardison, 1996; Koch & Britton, 2008). The repeated rounds of duplication and functional divergence of duplicated genes have resulted in a large and diverse group of structurally similar proteins. In this superfamily, GbX and Ngb are some of the more recently described proteins (Pesce et al., 2002; Burmester & Hankeln, 2004; Roesner et al., 2005) but are thought to be the most ancestral globin genes found in metazoans. In fact, GbX and Ngb are the only globin genes found in early divergent phyla such as Cnidaria and Porifera, and predate the split of deuterostomes and protostomes (Schwarze et al., 2014).
1.1.1 Recently discovered globin genes
Neuroglobin and GbX are both expressed in the nervous system of vertebrates, but unlike Ngb, which is typically found in the cellular cytoplasm, GbX is a membrane-bound protein (Pesce et al., 2002; Burmester & Hankeln, 2014). The functions of both of these proteins are still uncertain and they have been the focus of numerous studies to elucidate their physiological function (Hankeln et al., 2005; Burmester & Hankeln, 2009; Wawrowski et al., 2011). Nonetheless, initial studies speculate that Ngb and GbX largely play a protective role in the cell (Brunori & Vallone, 2007; Burmester & Hankeln, 2009; Corti et al., 2016) as well as being involved in cellular signalling processes (Burmester & Hankeln, 2009; Su et al., 2014). Globin E, another described vertebrate globin (Kugelstadt et al., 2004), is typically expressed in eye tissue. Its role is thought to be related to the O2 supply of retinal cells, which are metabolically highly active (Blank et al., 2011). Androglobin is an ancient chimeric gene with high levels of expression in the testes of vertebrates (Hoogewijs et al., 2011); however, its function remains unresolved (Burmester & Hankeln, 2014). Yet another recently discovered globin, Cygb, is expressed in a diverse range of tissues and cell types including epithelium, fibroblast cell lineages, macrophages, neurons and muscle fibres (Oleksiewicz et al., 2011; Motoyama et al., 2014). Functions of Cygb appear similarly diverse, with roles in protection from reactive oxygen species and in nitric oxide metabolism (Pesce et al., 2002; Singh et al., 2014).
1.1.2 Myoglobins and haemoglobins
Stemming from their early discovery, Mbs and Hbs are the most extensively studied globins. Most metazoan species possess only a single copy of the Mb gene, except in rare cases of some fish lineages where an increase in copy number (up to seven) is seen (Koch & Burmester, 2016). Mbs are small monomeric proteins whose expression is principally restricted to striated muscle tissue, where they play a key role in supplying the mitochondria of myocytes with O2 during periods of hypoxia, also defined as oxygen deficiency in a biotic environment (Wittenberg & Wittenberg, 2003). This protein is also reported to play an important role in the decomposition of nitric oxide during high cellular metabolic activity (Flögel et al., 2001). Unlike the other globin proteins, Hb principally transports O2 from respiratory surfaces to the working tissue within an organism.
Haemoglobins in metazoans have been reported in multiple animal phyla as key oxygen-transport proteins (Weber, 1980; Mangum, 1992; Hardison, 1996; Hourdez et al., 2000). Most frequently Hbs are found in the circulatory system of metazoan species, but can sometimes be confined to specific tissues, such as in the bivalve Yoldia eightsii where Hb is expressed in the gill tissue (Dewilde et al., 2003). Outside of vertebrates, Hbs exhibit a remarkable diversity in structure (Terwilliger, 1998; Weber & Vinogradov, 2001). For example, Hb proteins can occur as monomers, dimers, tetramers or even in polymeric forms (Weber, 1980). Circulating Hbs may also be found within erythrocytes (intracellular) or freely dissolved in fluid tissue such as blood or haemolymph (extracellular) (Ching Ming Chung & Ellerton, 1980). Extracellular Hbs are extremely diverse in subunit size and quaternary structure; however, they all share the same globin fold as intracellular Hbs (Negrisolo et al., 2001). Notable structural variations can often be found in invertebrate species such as the bivalve mollusc Barbatia reeveana, where a very large polymeric Hb of 430 kDa composed of 34 kDa di-domain subunits occurs, each containing two covalently linked functional units for O2 binding (Grinich & Terwilliger, 1980). Other examples of structural variation include annelid species, such as Riftia pachyptila and Lumbricus terrestris, which possess Hbs consisting of 24 and 144 subunits respectively (Royer et al., 2000; Strand et al., 2004; Flores et al., 2005).
Structural variations in Hbs can be observed even among vertebrate lineages, where jawless vertebrates (agnathans) have weakly cooperative dimers compared to the canonical tetrameric Hbs of jawed vertebrates (gnathostomes) (Hoffmann et al., 2010a). By far the best-studied Hb is the tetrameric form found in humans. Structurally, the human Hb consists of four globin subunits (two alpha (α) and two beta (β)), each with its own haeme group (Pesce et al., 2002; Storz et al., 2013). The α and β chains consist of 141 and 146 amino acids, respectively, and are held together by noncovalent interactions to form a heterotetramer (Antonini & Chiancone, 1977; Hardison et al., 1997).
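As a quick arithmetic check on the subunit composition described above, the following minimal sketch tallies the residues and haeme groups in one α2β2 tetramer using only the chain lengths given in the text. The names `CHAIN_LENGTH`, `TETRAMER`, `total_residues` and `haeme_groups` are illustrative and do not come from the thesis.

```python
# Chain lengths (amino acids) as stated in the text for human Hb.
CHAIN_LENGTH = {"alpha": 141, "beta": 146}

# Subunit composition of the adult tetramer (alpha2 beta2).
TETRAMER = {"alpha": 2, "beta": 2}


def total_residues(composition, chain_length):
    """Sum the residues over all subunits in an oligomer."""
    return sum(count * chain_length[name] for name, count in composition.items())


def haeme_groups(composition):
    """Count haeme groups, assuming one per globin subunit as in the text."""
    return sum(composition.values())


if __name__ == "__main__":
    print(total_residues(TETRAMER, CHAIN_LENGTH))  # 2*141 + 2*146 = 574
    print(haeme_groups(TETRAMER))                  # 4
```

The same two dictionaries could be adjusted to describe other oligomeric states mentioned in this chapter, provided the chain lengths are known.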
The genes that encode the different subunits of Hb proteins are found as clusters in the human genome. For example, the α globin gene cluster is located on chromosome 16 and includes seven loci (NC_000016.10; 5’ – zeta – pseudozeta – mu – pseudoalpha1 – alpha2 – alpha1 – theta – 3’) (Vernimmen, 2014) while the β gene cluster is found on chromosome 11 and comprises five loci (NC_000011.10; 5’ – epsilon – gammaG – gammaA – delta – beta – 3’) (Moleirinho et al., 2013). The order in which the genes are found within these clusters determines the timeline at which they are expressed during development (Hardison et al., 1997), with those closest to the locus control region at the 5’ end expressed in early development and those further away expressed at a later stage (Hanscombe et al., 1991). This developmental expression has been associated with varying oxygen levels throughout embryonic and foetal development (Efstratiadis et al., 1980) and demonstrates that oxygen availability can influence the expression and evolution of Hbs.
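The relationship between locus order and developmental timing described above can be sketched as an ordered list per cluster. This is only an illustration: the locus names are taken from the text, while the function names are hypothetical and the rule "closer to the 5’ end means earlier expression" is a deliberate simplification of the regulation described.

```python
# Locus order (5' to 3') as listed in the text for the two human clusters.
ALPHA_CLUSTER = ["zeta", "pseudozeta", "mu", "pseudoalpha1", "alpha2", "alpha1", "theta"]
BETA_CLUSTER = ["epsilon", "gammaG", "gammaA", "delta", "beta"]


def rank_from_5prime(cluster, locus):
    """Return the 1-based position of a locus counted from the 5' end."""
    return cluster.index(locus) + 1


def expressed_earlier(cluster, locus_a, locus_b):
    """Return True if locus_a lies closer to the 5' end than locus_b.

    Assumption: gene order alone is used as a proxy for developmental
    timing; pseudogenes and other regulatory effects are ignored.
    """
    return rank_from_5prime(cluster, locus_a) < rank_from_5prime(cluster, locus_b)


if __name__ == "__main__":
    print(len(ALPHA_CLUSTER), len(BETA_CLUSTER))
    print(expressed_earlier(BETA_CLUSTER, "epsilon", "beta"))
```

Under this simplification, epsilon (nearest the 5’ end of the β cluster) ranks before beta, in line with the early-to-late ordering described in the paragraph.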
1.2 Evolution of haemoglobins
Haemoglobins are among the most extensively studied proteins, yet our understanding of their evolutionary history is becoming increasingly unclear. A recent phylogenetic analysis of the vertebrate Hbs by Hoffmann et al. (2010a) suggests that Hbs in gnathostomes and agnathans have evolved in each lineage independently. It is also likely that these Hb genes have evolved from different ancestral genes based on the placement of gnathostome and agnathan Hbs in two distinct clades (Figure 1.1) (Hoffmann et al., 2010a). Findings such as these are in contrast to the prevailing paradigm that the current vertebrate Hbs have evolved from an ancestral Mb gene. In invertebrates, the evolution of Hb genes remains even less clear than in vertebrates, largely due to insufficient genomic resources for the groups that possess Hbs.
The current paradigm to explain the diversity of vertebrate globin genes largely involves gene and whole genome duplication events, mutation and natural selection acting to promote evolutionary innovation in this gene family. Both the two rounds of whole genome duplication (2R hypothesis; (Hokamp et al., 2003)) that preceded the evolution of vertebrates and further tandem duplications provided the raw substrate for the evolution and diversification of the vertebrate globin genes seen today. Divergent and convergent evolution have played major roles in the evolution of new functions observed in duplicated vertebrate globin genes.
Figure 1.1. Maximum likelihood tree highlighting relationships between Hbs of jawed (gnathostomes) and jawless (agnathans) vertebrates. In this phylogeny vertebrate-specific globins are grouped into two distinct clades: (i) Cyclostome Hbs + Cygb + Mb + GbE + GbY, (ii) β-Hbs + α-Hbs (Hoffmann et al., 2010a).
1.2.1 Gene duplication
Both gene duplication and whole-genome duplication are now widely accepted as key drivers of evolution, as they are the most frequent mechanisms responsible for the generation of genes with new functions (Ohno et al., 1968; Cañestro et al., 2013). It is through the accumulation of mutations and repeated rounds of duplication of existing genes that new functions are acquired (Zhang, 2003). Both whole-genome and gene duplication have promoted evolutionary innovation and led to the current diversity within the Hb gene family (Storz et al., 2013). Gene duplication can occur through five mechanisms: (i) the
unequal crossing-over between two sister chromatids of one chromosome or (ii) between two homologous chromosomes during replication, (iii) regional redundant duplication of DNA molecules (segmental duplication), (iv) polyploidization or (v) retrotransposition (Ohno et al., 1968; Zhang, 2003). Unequal crossing-over events can also be explained as recombination between DNA sequences at different sites on sister chromatids or homologous chromosomes. This results in a decrease in gene copy number on one chromosome or one chromatid and an increase on the other (Roeder, 1983). The variations in globin gene copy numbers have often been associated with this type of recombination (Goossens et al., 1980; Higgs et al., 1980; Trent et al., 1981; Roeder, 1983; Hurles, 2004). Vertebrate Hb gene clusters, described previously, provide classic examples of evolution through gene duplication. In 1961, Ingram suggested that polyploidization initiated two duplication events from Mb to the primordial Hb α gene, which then produced the Hb β-like gene. While this is an overly simplistic explanation for the diversity found within the globin gene family, it provided an early plausible hypothesis that gene duplication was a major driver of the evolution of the vertebrate globin gene family.
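The copy number consequence of unequal crossing-over described above can be illustrated with a minimal sketch. The region layout, the labels `flank5`, `globin` and `flank3`, and the breakpoint positions are hypothetical and chosen only for illustration; the function simply swaps the segments downstream of two misaligned breakpoints.

```python
def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange the segments downstream of each breakpoint.

    When break_a != break_b the exchange is misaligned (unequal), so one
    product gains elements and the other loses them.
    """
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


# Two copies of a hypothetical region carrying two tandem globin genes.
chromatid_1 = ["flank5", "globin", "globin", "flank3"]
chromatid_2 = ["flank5", "globin", "globin", "flank3"]

# Misaligned exchange: the break falls after the second gene on one strand
# and after the first gene on the other.
gain, loss = unequal_crossover(chromatid_1, chromatid_2, break_a=3, break_b=2)

print("copies on product 1:", gain.count("globin"))
print("copies on product 2:", loss.count("globin"))
```

In this toy case the total number of gene copies across the two products is unchanged; only their distribution between the two chromatids differs, which is the pattern described by Roeder (1983).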
Gene duplication is known to have four possible outcomes (Zhang, 2003; Cañestro et al., 2013). One possible outcome is dosage repetition, where limited sequence evolution is seen, and gene function is conserved (Zhang, 2003; Cañestro et al., 2013). Another outcome following a duplication event may be pseudogenisation of one duplicated copy through the accumulation of mutations creating a non-functional gene (pseudogene) (Force et al., 1999; Zhang, 2003; Bischof et al., 2006). This may result in a duplicate transcribed into RNA but not translated. Neofunctionalisation represents another possible outcome of gene duplication whereby accumulation of mutations in one of the copies may lead to new functions (Innan & Kondrashov, 2010). Lastly, subfunctionalisation is a sub-category of neofunctionalisation where the function remains the same but changes in regulatory elements that control expression of the gene may lead to different copies of the duplicate being expressed in different tissues (Lynch & Force, 2000; Huerta-Cepas et al., 2011). Neofunctionalisation and subfunctionalisation are largely the result of divergent selection acting on newly formed duplicates. Mutations in DNA coding regions will lead to changes in the amino acid sequence and affect protein-protein interactions which are the foundation of cellular molecular function therefore giving rise to new functions for those proteins (Wray et al., 2003).
In vertebrates, repeated rounds of gene and whole genome duplication events have led to copy number variation across the eight different classes of globin genes, as listed in section 1.1. For example, in fish, Mb copy number can range from none (ice fishes: Chaenocephalus aceratus, Pseudochaenichthys georgianus and stickleback: Gasterosteus aculeatus) (Sidell & O’Brien, 2006; Hoffmann et al., 2011) to seven copies (lungfish: Protopterus annectens) (Koch et al., 2016). Pseudogenisation has led to attrition in some of these globin classes in certain vertebrate lineages. Specifically, GbY is absent in birds, marsupials and placental mammals, while it is present in monotremes and most other vertebrate lineages with the exception of lampreys and ray-finned fishes (Burmester & Hankeln, 2014). Similarly, this process of gene loss is responsible for the absence of Mb in a number of amphibian species (e.g., anurans (frogs)) (Maeda & Fitch, 1982; Fuchs et al., 2006). In vertebrates, the formation of the α globin gene cluster is a consequence of tandem duplication leading to functional copy number variation, such as two in green anole (Anolis carolinensis) (Hoffmann et al., 2010b), three in chicken (Gallus gallus) (Reitman et al., 1993) and four in platypus (Ornithorhynchus anatinus) (Opazo et al., 2008). A recent systematic analysis of 22 avian and 22 mammalian genomes revealed significant conservation in copy number within the avian α globin gene cluster (2-3 copies) but substantial variation among the surveyed mammalian lineages (2-8 copies) (Figure 1.2) (Zhang et al., 2014). In addition to this duplication of α globin genes within the mammalian lineages, there is strong evidence to suggest that mutations which have accumulated in some duplicated copies may have led to the evolution of novel functions (He & Zhang, 2005).
This process of neofunctionalisation is often a result of divergent natural selection acting on new genetic variation in duplicated gene copies.
Figure 1.2. Comparison of chromosomal organization of α and β globin gene clusters in avian and mammalian taxa. The α and β globin genes represented here encode the α and β subunits of a tetrameric haemoglobin (α2β2) (Zhang et al., 2014).
1.2.2 Divergent evolution of duplicated genes
Divergent evolution of duplicated genes is the process in which genes of a similar function diverge from each other following duplication from a common ancestral gene (Bikard et al., 2009). An example of divergent evolution in the globin superfamily is seen in a number of vertebrate Hb proteins that undertake slightly different but similar functions. These Hb proteins are encoded by genes which have evolved from the same ancestral gene but exhibit differences in their ontogenetic timing of expression and functional properties, such as affinity for oxygen (Gribaldo et al., 2003). In fact, the differences in O2 scavenging and O2 transport roles of embryonic Hb versus adult Hb are attributable to amino acid replacements (non-synonymous mutations) in the zeta/epsilon and beta/alpha genes expressed in the embryonic and adult individual, respectively (Goodman et al., 1987). Another example of divergent evolution driving evolutionary innovation in Hb proteins can be seen in the avian lineage. In this instance, adult birds express two functionally distinct Hb isoforms, HbA and HbD (Grispo et al., 2012). While the β globin chains in both isoforms are identical, two functionally distinct α chains, one encoded by the alphaA globin gene and the other by the alphaD globin gene are found in the HbA and HbD
isoforms, respectively. The alphaA and alphaD globin genes share a common ancestral gene, but possess amino acid replacements that substantially alter O2 affinity in the presence of allosteric modulators (Storz et al., 2015). Other examples of divergent evolution include the expression differences found in GbX paralogs (homologous genes that have diversified through duplication within a species) in the elephant shark (Callorhinchus milii) (Opazo et al., 2015). In this example, GbX1 is expressed across a wide range of tissues, while the expression of GbX2 is primarily restricted to the gonads. Taken together, these examples highlight the process of divergent evolution operating on non-synonymous mutations in duplicated globin genes, but they do not explain the independent evolution of Hb in a number of metazoan lineages such as vertebrates, bivalves and annelids (Wray et al., 1996).
1.2.3 Convergent evolution
Convergent evolution is the independent evolution of the same function or phenotype in different lineages from different ancestral genes (Hoffmann et al., 2010b). As a consequence, it can lead to analogous molecules with similar functions in unrelated lineages (Hoffmann et al., 2010b; Burmester & Hankeln, 2014; Opazo et al., 2015). Such adaptive phenotypic convergence appearing in unrelated taxa is partly attributed to similar selection pressures operating on duplicated genes and this process is thought to be widespread in nature (Parker et al., 2013). An important example of convergent evolution is observed in the independent evolution of electric organs in numerous fish species, where the myogenic electric organ produces electrical currents for the purposes of communication and is believed to have evolved multiple times (Gallant et al., 2014). Similarly, echolocation in Cetacea (whales and dolphins) and Chiroptera (bats) has also been attributed to phenotypic convergence (Liu et al., 2010). Genome-wide analyses of cetaceans and two different bat lineages (Yinpterochiroptera and Yangochiroptera) have shown extensive convergent changes in nearly 200 loci containing coding sequences involved in echolocation (Parker et al., 2013). Functionally different Hb proteins occurring within erythrocytes of agnathan (jawless) and gnathostome (jawed) vertebrates present another example of convergence (Hoffmann et al., 2010a). In these two disparate lineages, phylogenetic analyses have determined that the Hbs have not evolved from orthologous genes and are structurally distinct, reflected in a weakly cooperative dimeric Hb form produced in agnathans and a cooperative tetrameric Hb form found in gnathostomes (Figure 1.1) (Hoffmann et al., 2010a; Schwarze et al., 2014). In addition to evidence of convergent
evolution of Hb among vertebrates, evidence of convergence between vertebrate and invertebrate Hb proteins is also observed. A specific example is the repeated evolution of intracellular tetrameric Hb in some bivalve mollusc species and vertebrates (O’Gower & Nicol, 1968). In invertebrates, a variety of Hbs and Mbs have also been observed (Weber & Vinogradov, 2001) and are believed to be the result of globin proteins emerging several times convergently from different ancestral globin genes (Blank & Burmester, 2012; Schwarze et al., 2014).
1.2.4 Invertebrate haemoglobins
From primary sequence to secondary, tertiary and quaternary structures, a large diversity exists among invertebrate Hbs. This variability is thought to represent specialization of Hb molecules to the wide range of environments that invertebrates inhabit (Terwilliger, 1998; Weber & Vinogradov, 2001; Alyakrinskaya, 2002; Gow et al., 2005). Structurally, invertebrate Hbs also retain the globin fold (Gow et al., 2005) and exhibit a crystal structure that consists of at least six α-helices (Weber & Vinogradov, 2001). Haemoglobin in invertebrates has evolved independently at least 11 times (Figure 1.3) (Weber & Vinogradov, 2001); however, this number is likely to increase as more data become available for the many understudied invertebrate lineages.
Figure 1.3. This simplified phylogeny represents evolutionary relationships between major metazoan taxa; some lesser known phyla are not included for simplicity or due to unclear relationships. Taxa in which Hbs have been found are boxed in red. This phylogeny illustrates the independent evolution of Hbs through their presence in the 11 major phyla shown here: Arthropods, Nematodes, Nemertines, Mollusks, Annelids, Echiurans, Pogonophorans, Phoronids, Platyhelminthes, Echinoderms and Chordates. Adapted from (Halanych & Passamaneck, 2001).
The diversity in Hb proteins is a result of sequence diversity seen in genes encoding invertebrate Hbs. Within invertebrates, nematodes, annelids, arthropods and molluscs all possess genes encoding Hb proteins with more than a single functional globin domain, which has resulted in a diversity of functional proteins (Natarajan et al., 2015). The brine shrimp (Artemia), for instance, expresses two functional Hb genes (an α and a β subunit type), each of which consists of nine tandem globin domains. The association of the polypeptide chains encoded by these two genes results in the expression of three Hb proteins: a hetero-dimer (HbII) and two homo-dimers (HbI and HbIII) (Manning et al., 1990). Structurally, HbII is composed of both α and β subunit types, while HbI and HbIII are composed of α and β subunit types, respectively (Manning et al., 1990). Interestingly, in addition to possessing multi-domain Hbs, production of multiple types of Hbs appears common among invertebrates. For example, in the annelid worms (Branchipolynoe symmytilida and B. seepensis) a single domain globin gene (SD) encodes a 137 amino acid Hb protein, while a tetradomain gene (TD) encodes a Hb protein of 552 amino acids (Projecto-Garcia et al., 2010). Figure 1.4
illustrates the exon-intron structure and domain architecture of the two Hbs from Branchipolynoe spp. Similarly, in nematodes, there is a diversity of globins including single-domain and di-domain Hbs (Projecto-Garcia et al., 2015). It is notable that, despite this diversity, the di-domain globin genes appear to have a restricted distribution, occurring only in two parasitic species (Ascaris suum and Pseudoterranova decipiens) (Darawshe et al., 1987; Dixon et al., 1991). These di-domain globin genes from nematodes encode a large polymeric multi-domain Hb (an octamer consisting of eight two-domain subunits). Overall, based on the observed diversity, invertebrate Hbs have been classified into four distinct groups according to their gene and protein subunit structures. These include single-domain, single-subunit Hbs; two-domain, multi-subunit Hbs; multi-domain, multi-subunit Hbs and single-domain, multi-subunit Hbs (Vinogradov, 1985).
Figure 1.4. Gene structure of two Hbs found in annelid worms (Branchipolynoe spp.). This diagram illustrates the exon-intron structure and domain architecture for the single-domain (top) and tetradomain (bottom) globins found in these species (Projecto-Garcia et al., 2010).
1.3 Bivalve haemoglobins
Similar to the diversification patterns seen in other invertebrate lineages, bivalve molluscs show strikingly diverse patterns of Hb distribution and evolution. This is of interest as bivalves typically do not possess a functional respiratory pigment and those that do were previously thought to utilise a copper-based respiratory pigment known as haemocyanin (Hc), with Hb expression considered rare (O’Gower & Nicol, 1968; Terwilliger, 1998). Species that contain Hc are restricted to the earliest branching lineage of bivalves and it has been suggested that Hc is the ancestral oxygen-transport protein in class Bivalvia. It has been shown, however, that Hb is more widely distributed in bivalve molluscs than previously
thought (Manwell, 1963; Smith, 1967; Terwilliger, 1998; Alyakrinskaya, 2002; Wajcman et al., 2009).
Currently, four independent origins of Hbs have been hypothesised in bivalves (Figure 1.5). The first instance is thought to have occurred in the primitive bivalve lineage containing Solemya velum, which possesses several different types of Hbs expressed predominantly in gill tissue (Dando, 1985; Doeller et al., 1988; Torres-Mercado et al., 2003). In Lucina pectinata, also found in this lineage, three Hbs are expressed: HbI, which is a sulphide reactive protein, and HbII and HbIII, which are O2 reactive globins (Montes-Rodriguez et al., 2016). A second instance has occurred in the bivalve family Arcidae, where species contain multiple forms of Hb in circulating erythrocytes (Mangum, 1997), represented in Figure 1.5 by Arca noae. Haemoglobin diversity of Arcidae will be discussed in more detail in the following section as it constitutes one of the questions addressed by this thesis. The third independent origin of bivalve Hb is found in the deep-sea clam genus Calyptogena. Species within this genus generally present with two homo-dimeric Hb proteins (HbI and HbII) in erythrocytes, but C. nautilei possesses two monomeric Hbs (HbIII and HbIV) which show low sequence identity to HbI and HbII (Kawano et al., 2003). The fourth instance is an extracellular Hb found in the heterodont clams Cardita borealis and C. floridana (Manwell, 1963; Terwilliger et al., 1978). Figure 1.5 depicts the basic bivalve phylogeny with the distribution of Hc and Hb among its taxa. While the four cases listed above provide compelling evidence for the independent evolution of Hb in multiple bivalve lineages, it is not known whether Hb has evolved in other lineages where O2 transport proteins have not been examined.
One lineage in particular that has not been extensively examined for the presence of Hb proteins is the family Limidae. This is significant as there is some evidence to indicate that some species in this family may also possess Hb proteins. This evidence, however, is restricted to a single report (Rawat, 2010), with no molecular or biochemical studies reported to date. It is plausible to hypothesise that Limidae do possess Hb proteins based on their high metabolic rate associated with the ability to swim and the red pigmentation of their tissues (Baldwin & Lee, 1978; Harper & Skelton, 1993; Mikkelsen & Bieler, 2003). If present, Hb could confer an important physiological advantage by providing a more efficient oxygen delivery system to respiring tissues during swimming. Consequently, if Hb were found in Limidae, then this would
constitute a fifth independent origin of Hb in bivalve molluscs. This represents a key knowledge gap which will be addressed in this thesis.
Figure 1.5. Phylogenetic classification of the molluscan class Bivalvia. The main bivalve subclasses are represented here in different colours: Protobranchia, Pteriomorphia, Palaeoheterodonta, Archiheterodonta, Anomalodesmata and Imparidentia. This basic phylogeny demonstrates the distribution of Hb (indicated as Hb following species and order name) and Hc (indicated as Hc) among bivalve taxa. The order of each species in which those respiratory pigments are found is indicated in grey brackets: Solemyoida, Nuculanoida, Arcoida, Carditoida and Veneroida. Adapted from (Gonzáles et al., 2015).
1.3.1 Haemoglobins in the family Arcidae, Pteriomorphia subclass
The bivalve family Arcidae, also referred to as the blood clams, is unusual in comparison to most bivalve species as its members possess haemolymph with a deep red coloration attributed to circulating erythrocytes (Mangum, 1998). These erythrocytes contain multiple Hbs, which are often dimeric (~ 32 kDa) and tetrameric (~ 65 kDa) (Como & Thompson, 1980b; Furuta & Kajita, 1983; Suzuki et al., 2000), with the exception of a 430 kDa polymeric Hb found in some members of the Barbatia genus, such as B. reeveana and B. lima (Grinich & Terwilliger, 1980; Suzuki & Arita, 1995). Dimeric Hbs in the Arcidae exhibit further structural complexity and can be homo-dimeric or hetero-dimeric (Furuta & Kajita, 1983; Mann et al., 1986; Suzuki et al., 1992). In some species, such as blood cockles (Anadara trapezia and Scapharca inaequivalvis), multiple homo-dimeric Hbs have been found circulating in erythrocytes at the same time (O’Gower & Nicol, 1968; Fisher et al., 1984; Ronda et al., 2013).
The tetrameric Hbs in Arcidae are formed from two different subunits in an alpha2beta2 (α2β2) arrangement, analogous to the vertebrate Hb (Ronda et al., 2013). The α and β globin chains in these species range from ~ 145 to 165 amino acids in length (Mann et al., 1986; Suzuki et al., 1996). Both the dimeric and tetrameric Hbs show different pairing of their helices to the canonical arrangement seen in vertebrate Hbs (Ronda et al., 2013). Specifically, the E and F helices show subunit pairing, with the G and H helices on the outside of the quaternary structure, creating a back-to-front formation when compared to vertebrate Hb (Vinogradov, 1985). Figure 1.6 illustrates the comparison of the E and F helical arrangement of the human and S. inaequivalvis Hbs. This back-to-front structural arrangement is common to all invertebrate Hbs for which crystal structures have been resolved (Royer et al., 2005). This structural variation is also considered to provide evidence that arcid Hb proteins are a result of convergent evolution with vertebrate Hbs (Royer et al., 2005).
Figure 1.6. Quaternary assembly of Homo sapiens and Scapharca inaequivalvis Hbs. (a) HbA refers to adult Hb in H. sapiens; (b) HbII refers to the hetero-tetrameric arcid Hb in S. inaequivalvis and (c) HbI refers to the homo-dimeric arcid Hb in S. inaequivalvis. For each Hb structure represented here, haeme groups are shown in red, α (alpha) subunits are shown in dark grey, β (beta) subunits are shown in light grey and E and F helices are shown in blue. This illustrates the diversity of structures found in arcid Hbs, with a heterotetramer (b) and a homodimer (c) both found in S. inaequivalvis. It also shows the back-to-front assembly of the arcid Hbs, in which the E and F helices pair at the subunit interface rather than lying on the outside of the molecule as in human Hb (Ronda et al., 2013).
Based on the current literature, protein sequences have been more extensively studied than the underlying genetic variation. Currently, most of the information regarding the genes encoding Hb proteins in Arcidae has been generated in studies which utilised small scale cDNA sequencing (Suzuki & Arita, 1995; Piro et al., 1996; Suzuki et al., 2000). This sequencing has demonstrated that the majority of species examined have two to three Hb encoding genes. For example, Tegillarca granosa contains three different Hb genes: α and β encoding genes of the tetrameric Hb and a minor globin gene encoding the homo-dimeric Hb (Bao et al., 2013a). S. inaequivalvis also expresses three homologous Hb genes (α, β and minor) (Piro et al., 1996; Piro et al., 1998). Interestingly, these three genes have the same exon/intron structure seen in vertebrate Hb genes, consisting of three exons and two introns (Piro et al., 1996). Unlike the previous two examples, B. lima expresses four distinct Hb genes which encode a minor homo-dimeric globin (delta (δ)), a hetero-tetramer (α and β) and a polymeric globin (2D) (Suzuki et al., 1996). The 2D gene is a di-domain globin and is thought to have resulted from a duplication of the δ gene, producing two consecutive domains within a gene due to the loss of a stop codon (Suzuki et al., 1996). While these studies capture some of the diversity of Arcidae Hb genes, they are principally based on previous protein evidence. There
is a need for a systematic evaluation of diversity at the sequence level of all expressed Hb genes in this group of organisms.
In arcid bivalves, there is preliminary evidence to suggest that some duplicated Hb encoding genes may have taken on new functions. For example, in T. granosa molecular analysis has revealed that some Hb encoding genes may play a role in the immune response to bacterial pathogens (Bao et al., 2013b). In this instance, single nucleotide polymorphisms identified in α and β Hb encoding genes (HbIIA and HbIIB) showed an association with resistance to the pathogenic bacterium Vibrio parahaemolyticus (Bao et al., 2013b). In addition to this example of neofunctionalisation, there is also some evidence that duplicated Hb genes in some bivalves show tissue specific expression (Burmester et al., 2000), although this has only been reported for species outside of the Arcidae. Some authors speculate that the diversity of globin genes in Arcidae may have contributed to the capacity of these organisms to persist in anoxic or intertidal environments (Terwilliger et al., 1978; Alyakrinskaya, 2002). As these studies are few and speculative, further investigation is needed to determine the extent of gene duplication within a single species and whether the duplicated genes have undergone neofunctionalisation. Assessment of expression levels of Hb genes across different tissues under experimental perturbations (e.g., hypoxic versus normoxic abiotic stress) would allow identification and quantification of potentially new functions in these genes.
A. trapezia was the first arcid species to have its Hb proteins isolated and characterised. Early work by Nicol & O’Gower (1967) identified the presence of multiple circulating Hbs, including a tetrameric Hb (HbI) with an α2β2 configuration and two homo-dimeric Hbs (HbIIa and HbIIb). The homo-dimeric forms appear to be allelic variants of the same gene as they are in Hardy-Weinberg equilibrium within populations (O’Gower & Nicol, 1968; Como & Thompson, 1980b; Mann et al., 1986). One of the genes that encodes the β subunit of the tetrameric Hb (HbI) has been fully characterised (At & Eo, 1984); however, the complete gene sequences for the two homo-dimeric variants and the α subunit of the tetramer have not. Interestingly, another Hb gene encoding a minor Hb, possibly seen as a ghost band by O’Gower & Nicol (1968), has been characterised fully in A. trapezia (Titchen et al., 1991). Based on this protein and DNA evidence, it appears that at least four different Hb genes are present in this species. Despite this significant early work on Hb in A. trapezia, there have been numerous inconsistencies regarding the nomenclature of the Hb proteins and the corresponding genes (where these have been characterised at all). In addition to this limitation, it remains uncertain
whether the DNA and protein sequences reported for A. trapezia represent the complete repertoire of Hb encoding genes in this species. This makes inferences about the evolution of Hb in bivalves difficult and highlights the need for a systematic evaluation of the Hb genes in A. trapezia and arcids in general. Significantly, even less is known about the functional diversification of this gene family demonstrating the requirement for further functional genomic interrogation of Hb genes in the Arcidae. This paucity of information represents a key knowledge gap which will be addressed in this thesis.
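The Hardy-Weinberg reasoning used above to treat the two homo-dimeric forms as allelic variants can be sketched as a simple goodness-of-fit calculation. This is an illustration only: the genotype counts are invented, the function name is hypothetical, and the cited studies may have used a different test. It assumes a single locus with two alleles, both of which are present in the sample.

```python
def hardy_weinberg_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for observed genotype counts at a biallelic locus.

    Returns the frequency of allele a, the expected genotype counts under
    Hardy-Weinberg equilibrium, and the chi-square statistic (1 degree of
    freedom). Assumes both alleles occur in the sample.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele a
    q = 1 - p                         # frequency of allele b
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, expected, chi2


# Hypothetical counts of HbIIa/HbIIa, HbIIa/HbIIb and HbIIb/HbIIb phenotypes.
p_fit, expected_fit, chi2_fit = hardy_weinberg_chi_square(36, 48, 16)
p_dev, expected_dev, chi2_dev = hardy_weinberg_chi_square(50, 20, 30)

CRITICAL_VALUE = 3.84  # chi-square critical value, 1 df, alpha = 0.05
print("sample 1 consistent with HWE:", chi2_fit < CRITICAL_VALUE)
print("sample 2 consistent with HWE:", chi2_dev < CRITICAL_VALUE)
```

Genotype proportions that match the expectation, as in the first invented sample, are what would be expected of two alleles segregating at one locus, whereas a heterozygote deficit such as the second sample would argue against that interpretation.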
1.4 Aims
The overarching aim of this study is to expand our understanding of the evolution of haemoglobin (Hb) encoding genes, one of the most important gene families in the metazoan tree of life. Specifically, this thesis aims to address the lack of information regarding the distribution of Hb genes in bivalves and investigate potential evidence of neofunctionalisation in duplicated Hb genes of Arcidae bivalves. In order to address these aims, the thesis will consist of two main objectives each with its own hypothesis. These include:
• Objective 1
o H1 – Duplicated Hb genes in Anadara trapezia, a member of the Arcidae family, show tissue specific expression patterns and therefore evidence of neofunctionalisation.
This hypothesis will be tested by interrogating the recently published whole organism transcriptome of A. trapezia for all expressed copies of Hb genes and measuring their expression across multiple tissues using quantitative polymerase chain reaction (qPCR) techniques.
• Objective 2
o H1 – Bivalve Ctenoides ales, a member of the Limidae family, expresses Hb encoding genes and therefore provides evidence of a fifth independent origin of Hb in bivalve molluscs.
The hypothesis in Objective 2 will be tested by sequencing the expressed portion of the C. ales genome for the presence of Hb encoding genes. A whole organism transcriptome will
be generated, sequenced utilizing high throughput sequencing techniques and bioinformatically interrogated for evidence of Hb expression.
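As an illustration of how the tissue comparison proposed under Objective 1 might be quantified, the sketch below applies the widely used comparative Ct (2^-ΔΔCt) calculation. This is an assumption made for illustration, not necessarily the analysis used in this thesis: the Ct values, the choice of mantle as calibrator tissue and the function name are all hypothetical, and the calculation assumes close to 100% amplification efficiency for both the target and reference genes.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target gene relative to a calibrator tissue.

    Each Ct is first normalised to a reference gene (delta Ct); the sample
    is then compared with the calibrator (delta delta Ct).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2 ** -(delta_sample - delta_calibrator)


# Hypothetical mean Ct values as (target Hb gene, reference gene) per tissue.
ct_values = {
    "haemolymph": (18.0, 20.0),
    "gills": (23.0, 20.0),
    "mantle": (24.0, 20.0),   # used here as the calibrator tissue
}

calibrator_target, calibrator_ref = ct_values["mantle"]
fold_changes = {
    tissue: relative_expression(target, ref, calibrator_target, calibrator_ref)
    for tissue, (target, ref) in ct_values.items()
}

for tissue, fold in fold_changes.items():
    print(f"{tissue}: {fold:.1f}-fold relative to mantle")
```

A fold change well above one in a single tissue, as for haemolymph in these invented values, is the kind of pattern that would be read as tissue specific expression.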
1.5 Thesis Outline
The thesis presented here includes four main chapters, consisting of an introductory literature review (chapter 1), followed by two studies each described within their individual chapter (chapters 2 and 3). The final section of the thesis is the general discussion (chapter 4).
• Chapter 1 The first chapter is a literature review which highlights the current understanding of vertebrate and invertebrate haemoglobins (Hb), focusing on their vast structural diversity and evolution. This section also explores evolution by gene duplication and discusses divergent and convergent evolution. In addition, it focuses on invertebrate Hbs and more specifically bivalves in light of the topic for this study.
• Chapter 2 This is the first data chapter, which addresses Objective 1. It provides an overview of duplicated Hb genes in Arcidae bivalves and highlights the knowledge gap in our understanding of evolutionary innovation through neofunctionalisation of these genes. The chapter provides detailed methodological information on the measurement of tissue specific expression of candidate Hb genes in A. trapezia. The findings of this study are interpreted in detail in the discussion.
• Chapter 3 The second data chapter, which addresses Objective 2, describes the interrogation of the transcriptome of C. ales for the presence of Hb encoding genes. The methodology section in this chapter concerns the isolation, sequencing and bioinformatic interrogation of the C. ales transcriptome. The results present the detailed outcomes of this interrogation and describe the transcriptional profile of this species as well as the relevant candidate genes, followed by a detailed discussion.
• Chapter 4 In the fourth chapter, the overall outcomes of the project are discussed in context of the current body of work relating to evolution of bivalve Hbs. Interrogation of the results is extended to encompass a discussion on the limitations of the study and potential applications of the outcomes reported in this thesis.
Chapter 2: Tissue specificity and neofunctionalisation of haemoglobin genes in Anadara trapezia
2.1 Background
Haemoglobins (Hbs) are among the best studied proteins in vertebrates but little is known about their distribution, function and evolution in invertebrate lineages (Alyakrinskaya, 2002; Bao et al., 2013a). A number of invertebrate lineages have independently evolved circulating Hbs as a mechanism to transport O2 for cellular respiration (Mangum et al., 1975). Invertebrate lineages that have independently evolved circulating Hbs include some arthropods, some annelids and the Arcid bivalves (Terwilliger, 1998; Weber & Vinogradov, 2001; Wajcman et al., 2009).
The Arcid bivalves are an interesting group of intertidal mollusc species known as blood clams. This group displays a diversity of Hb proteins with monomeric, dimeric, tetrameric and polymeric protein forms (Terwilliger, 1980; Royer et al., 1985; Weber & Vinogradov, 2001). In fact, the presence of multiple Hb proteins in circulating erythrocytes was first observed in the Australian blood clam, Anadara trapezia by Nicol & O’Gower (1967). This study revealed the presence of at least three distinct proteins in this species, two homodimers and one tetramer, while protein sequencing indicated that at least two duplication events produced these proteins (Suzuki et al., 1996). Having such a variety of Hb genes may allow these organisms to survive in the intertidal zone and endure long periods submerged or exposed to air. A clinal pattern of allele frequencies has also been observed for the homo-dimer Hb gene in A. trapezia, which the authors hypothesised was associated with changes in environmental variables including temperature and salinity (O’Gower & Nicol, 1968).
The large number and variation in blood clam Hb proteins is thought to be the result of repeated rounds of gene duplication, but the exact function of these diverse Hb proteins remains unclear. Current data suggest that blood clam Hbs evolved from an ancestral mollusc globin gene which, following the divergence of the blood clam ancestor, underwent repeated rounds of lineage specific gene duplication in different blood clam lineages (Riggs, 1991). More recent transcriptome data for A. trapezia (Prentis & Pavasovic, 2014) has
identified seven full length and genetically distinct Hb genes in this single species and indicates that gene duplication may have been more extensive than previously reported.
The Hbs used in this study are HbI, a tetramer made up of α and β subunits encoded by the genes AG (alpha globin) and BG (beta globin); HbII, a homo-dimer encoded by the HB gene; a hetero-dimer Hb encoded by the HD gene; and a putative dimer Hb encoded by the gene 2D. These genes were used for this study as they encode structurally distinct proteins, including homo-dimeric and tetrameric Hb proteins. It is postulated that these different Hbs may show distinct patterns of tissue specific or environment specific gene expression. Consequently, differences in their patterns of expression were tested across five tissues that could be dissected distinctly and where Hbs were most likely to play a role (haemolymph, foot, gills, mantle and muscle) in four different experimental treatments to determine if they showed tissue specific or environment specific gene expression.
2.2 Material and Methods
2.2.1 Sample acquisition and tissue dissection
Anadara trapezia specimens were collected from the intertidal zone in Wynnum, Queensland, Australia (27°26'08.7"S 153°10'25.0"E) and transferred to holding tanks at the QUT Marine Lab facility until required for experimentation. All animals were collected under the general fisheries permit number 166312. Ethics approval was not required since the specimens do not qualify as animals as described by the Queensland Animal Care and Protection Act 2001 (ACPA). Water conditions during the acclimation period were as follows: salinity 30-35 ppt, pH 7.9-8.1, temperature 20-25 °C and ammonia < 0.1 mg/L. A 12 h light/dark cycle was also maintained in the tanks and the animals were kept unfed for one week prior to the start of the experiment.
To best capture the variation in Hb expression, tissues were extracted from animals submerged in seawater and from animals undergoing aerial exposure in a moist tank (as experienced during periods of regular and prolonged low tide). At six hours, 12 individuals were sampled: six control animals submerged in water and six animals from the moist tank. This was repeated at 12 hours (n = 6 for both treatments). In total, 24 A. trapezia individuals had the following five tissues removed and used for analysis: haemolymph, mantle, muscle, gills and foot (Figure 2.1). A few microlitres of fresh haemolymph from each animal was used to measure pH, pCO2 and sO2 % using Abbott i-STAT blood analysers. Haemolymph samples were spun at 10,000 rpm for 3 min, the supernatant was removed and RNA was extracted immediately. Tissue samples were snap frozen in liquid N2.
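The sampling design described above (two treatments, two time points, six animals per group, five tissues per animal) can be enumerated as a quick consistency check on the sample counts. The following is a minimal sketch only; the variable and function names are illustrative and are not taken from the original study.

```python
from itertools import product

# Factors as described in the text; identifiers are illustrative only.
TREATMENTS = ("water", "air")
TIME_POINTS_H = (6, 12)
REPLICATES = 6
TISSUES = ("haemolymph", "mantle", "muscle", "gills", "foot")


def build_design():
    """Return one record per animal and one record per tissue sample."""
    animals = [
        {"treatment": t, "hours": h, "replicate": r}
        for t, h, r in product(TREATMENTS, TIME_POINTS_H, range(1, REPLICATES + 1))
    ]
    # Every animal contributes one sample of each tissue.
    samples = [dict(a, tissue=tissue) for a in animals for tissue in TISSUES]
    return animals, samples


animals, samples = build_design()
print(len(animals), "animals;", len(samples), "tissue samples")
```

Under these assumptions the design yields 24 animals and 120 tissue samples, which matches the total of 24 individuals stated in the text.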
Figure 2.1 Photograph of an A. trapezia specimen illustrating the anatomical position of all five tissues used to assess tissue specific expression of Hb genes in this study.
2.2.2 RNA extraction and cDNA synthesis
RNA was extracted from each tissue separately using the Mollusc RNA extraction kit (Omega Biotek), following its longest protocol. Specifically, tissues were ground in liquid N2 and the fine powder obtained was transferred into 1.5 mL microcentrifuge tubes. These were rapidly brought to a fume hood and 350 µL of MRL buffer provided in the kit was added and vortexed. Three hundred and fifty µL of a freshly made 24:1 solution of chloroform:isopropanol was then added to each tube, vortexed to mix and centrifuged for 2 min at 10,000 x g. The upper aqueous phase of each tube was carefully removed and transferred to clean 1.5 mL microcentrifuge tubes. One volume of isopropanol was added to each tube containing the upper aqueous phase, vortexed and immediately centrifuged for 2 min at 10,000 x g. Supernatant was carefully aspirated and tubes were briefly inverted over a paper towel to remove any residual liquid. RB buffer (100 µL) provided in the kit was then added to each tube, vortexed to re-suspend the pellets and briefly incubated in water baths at 65 °C to elute RNA. Another 350 µL of RB buffer and 100 µL of 100 % EtOH were
added to each tube and vortexed. Each entire sample was then transferred into a HiBind® RNA (Omega Biotek) mini column inside a 2 mL collection tube for washing and elution. Samples were first centrifuged for 1 min at 10,000 x g and filtrates were discarded. Five hundred µL of RNA Wash Buffer I provided in the kit was added to the mini column and tubes were centrifuged for 1 min at 10,000 x g. Filtrates were discarded and 500 µL of Wash Buffer II provided in the kit was added to the mini columns and centrifuged for 1 min at 10,000 x g. Filtrates were discarded and another wash step was done with Wash Buffer II. All samples were then centrifuged for 2 min at maximum speed to dry the columns. These were then transferred to clean 1.5 mL microcentrifuge tubes for elution. Fifty µL of DEPC (diethylpyrocarbonate) water provided in the kit was added to each tube, samples were centrifuged for 2 min at maximum speed and eluted RNA samples were stored at -70 °C. cDNA synthesis was performed using the SensiFAST™ cDNA Synthesis Kit (Bioline) following the manufacturer’s protocol. Each reaction was performed in a total volume of 50 µL (20 µL water, 20 µL total RNA, 8 µL 5xTransAmp buffer and 2 µL reverse transcriptase) with the following cycling conditions: 1 cycle at 25 °C for 10 min, 1 cycle at 42 °C for 60 min, 1 cycle at 85 °C for 5 min and a hold step at 4 °C. The cDNA was used for all PCR reactions in downstream analysis.
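The cDNA reaction composition given above can be checked and scaled to a batch of reactions with a short calculation. This is a sketch under stated assumptions: the per-reaction volumes are those listed in the text, whereas the batch size of 24, the 10 % pipetting overage and the function name are hypothetical choices made for illustration and are not reported in the study.

```python
# Per-reaction volumes (µL) as listed in the text.
PER_REACTION_UL = {
    "water": 20.0,
    "total RNA": 20.0,
    "5x TransAmp buffer": 8.0,
    "reverse transcriptase": 2.0,
}


def master_mix(n_reactions, overage=0.10, template="total RNA"):
    """Volumes (µL) of the shared reagents for n reactions.

    The template is excluded because it is added to each tube individually.
    The overage fraction is an assumed allowance for pipetting loss.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 1)
        for name, volume in PER_REACTION_UL.items()
        if name != template
    }


total_per_reaction = sum(PER_REACTION_UL.values())  # should equal the stated 50 µL
mix = master_mix(24)  # e.g. one tissue type across all 24 animals (assumed batch)
print(total_per_reaction, mix)
```

The listed components sum to the stated 50 µL, and the batch volumes follow by simple proportion.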
2.2.3 RT–PCR (Real Time PCR)
Five previously identified candidate globin genes, putative dimer (2D), α chain (AG), homo-dimer (HB), hetero-dimer (HD) and β chain (BG), were amplified in each sample using primers designed to amplify the entire open reading frame (ORF) based on transcriptomic data from A. trapezia (Prentis & Pavasovic, 2014) (Table 2.1). Polymerase chain reaction amplification was then performed to determine the presence or absence of each of the five genes in the different tissue samples using the following conditions: initial denaturation for 3 min at 94 °C and 30 cycles of 30 sec at 94 °C, 30 sec at 52 °C and 1 min at 72 °C, followed by one cycle of 5 min at 72 °C and a hold step at 4 °C. Amplicons were separated by electrophoresis on a 1.5 % agarose gel and stained with GelRed™ (Biotium). The intensity of PCR products was determined using an image analysis program assisted by a gel documentation system (Chemidoc XRS, Bio-Rad). Some of these PCR products were also used for candidate gene validation through sequencing.
Table 2.1 List of primers designed to amplify products of candidate globin genes identified in the bivalve species A. trapezia using RT-PCR. All primers were designed to amplify the entire ORF of each gene with product sizes between 350 bp and 619 bp (Prentis & Pavasovic, 2014).
| Gene | Primer Name | Primer Sequence | Product size (bp) |
|---|---|---|---|
| AG (Alpha chain) | AGORF | CGAGTCCGATTTATTGCTGA | 545 |
| | AGORR | TGTCAAACGAGACAGGTCCA | |
| BG (Beta chain) | BGORF | TAATGCAGCCTGGACAACAG | 501 |
| | BGORR | TTTGTTGTAGACGCCCTTTG | |
| HB (Homo-dimer) | HBORF | TTGTCACCTCCAGTCTGTCG | 618 |
| | HBORR | ACGCTACCCTGGTGATTGTC | |
| 2D (Putative dimer) | 2DORF | CGAAACCCAAGTCCATCAT | 506 |
| | 2DORR | ATCCCTCACAGAGTGCTGCT | |
| HD (Hetero-dimer) | HDORF | ATCTGACGGAAGCAGACG | 515 |
| | HDORR | CGCGAGGTAGTGATATCGAA | |
2.2.4 Candidate gene validation
Amplicons from two samples were purified using the Bioline isolate PCR purification kit followed by cloning using the pGEM®-T and pGEM®-T Easy (Promega) vector systems quick protocol. Duplicate samples for each gene in each biological replicate were selected and sequenced on the ABI Genetic Analyzer 3500 (ThermoFisher). Sequences obtained were imported into the Geneious® software version 8.1.6 for visualization and comparison to the de novo assembled contigs from the A. trapezia annotation report using a pairwise global alignment. Sequences were aligned to the contig they were designed from to determine percentage nucleotide similarity.
2.2.5 RT-qPCR (Real Time quantitative PCR) for quantification of gene expression
RT-qPCR was performed on each sample using the One-Step Real Time PCR kit (Roche) and short primers designed based on past analysis of transcriptomic data from A. trapezia (Prentis & Pavasovic, 2014) (Table 2.2). These reactions were performed using the Lightcycler®480 real-time PCR system (Roche), measuring specific fluorescence at each cycle and quantifying the initial levels of mRNA for each Hb gene in each tissue. The PCR
reactions were performed using the following conditions: 1 cycle of 5 min at 95 °C and 45 cycles of 10 sec at 95 °C, 10 sec at 60 °C and 10 sec at 72 °C. All quantitative PCR analyses were repeated in three technical replicates to determine the validity of results, along with negative controls for each gene in each sample and 18S as a housekeeping control gene.
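The two thermal programmes described in sections 2.2.3 and 2.2.5 can be compared by summing their hold times. The sketch below uses only the step durations stated in the text; it ignores ramping between temperatures, the final 4 °C hold and any steps not listed, and the function name is illustrative.

```python
def programme_seconds(initial_s, cycle_steps_s, n_cycles, final_s=0):
    """Total hold time of a PCR programme in seconds (ramping not included)."""
    return initial_s + n_cycles * sum(cycle_steps_s) + final_s


# End-point PCR (section 2.2.3): 3 min at 94 °C; 30 x (30 s, 30 s, 60 s); 5 min at 72 °C.
endpoint_s = programme_seconds(3 * 60, (30, 30, 60), 30, final_s=5 * 60)

# RT-qPCR (section 2.2.5): 5 min at 95 °C; 45 x (10 s, 10 s, 10 s).
qpcr_s = programme_seconds(5 * 60, (10, 10, 10), 45)

print(endpoint_s / 60, "min;", qpcr_s / 60, "min")
```

On these figures the end-point programme holds for 68 min and the RT-qPCR programme for 27.5 min; actual run times would be longer once ramping is included.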
Table 2.2 List of primers designed to amplify products of globin genes identified in the bivalve species A. trapezia using RT–qPCR and determine their expression levels in different tissues. All primers are designed to amplify products with sizes between 109 bp and 148 bp. (Prentis & Pavasovic, 2014).
| Gene | Primer Name | Primer Sequence | Product size (bp) |
|---|---|---|---|
| AG (Alpha chain) | AGF | TGATGACCCATCCAGATTGA | 117 |
| | AGR | TTCAGGGTCTCCTTCATTGG | |
| BG (Beta chain) | BGF | TGTCGAGAGCATCGATGAAG | 109 |
| | BGR | AAATTCGCTCGTCTTGGAGA | |
| HB (Homo-dimer) | HBF | GCTGTCAACCACATCACCAG | 139 |
| | HBR | CCTGGACAACGCCTACAAGT | |
| 2D (Putative dimer) | 2DF | ATCCGACCCATGGAATAACA | 114 |
| | 2DR | CGTGCAACATCTTCCAAGTC | |
| HD (Hetero-dimer) | HDF | AAGGGACATGCCACAACATT | 148 |
| | HDR | CTCCGAGTGCCTGAAATTCT | |
| 18S | 18F | CGGCGACGTATCTTTCAAAT | 136 |
| | 18R | CTTGGATGTGGTAGCCGTTT | |
2.2.6 RT-qPCR data analysis
RT-qPCR data was exported and viewed in the Lightcycler96 software associated with the instrument. Relative quantification analysis was performed using the analysis function of the Lightcycler96 software based on the ∆∆CT method. In this method, the housekeeping gene provides a basis for comparing levels of target sequences to levels of reference sequences and the final result is expressed as a relative ratio. These relative ratio values were exported into Microsoft Excel version 14.0 to be converted to a format compatible with the program SPSS (Statistical Package for the Social Sciences) used for statistical analysis. In this program, significance of results was assessed through ANOVA testing followed by Tukey’s post-hoc test and differences were considered significant at p < 0.05. Firstly, to assess levels of target genes in each tissue, ratio values were used as a dependent variable against tissue type for each gene. Secondly, significant differences in expression among conditions (water or air 6 hours and 12 hours) were assessed in the same way with ratio values as a dependent
variable and condition as factor. Due to very low relative ratio values, the following formula was applied to each value and plotted in SPSS: (-1/log (relative ratio)).
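The relative quantification and the plotting transform described above can be sketched as follows. This is not the Lightcycler software's implementation: it assumes perfect doubling per cycle (efficiency of 2), assumes that the logarithm in the transform is base 10 (the text does not state the base), and uses hypothetical function names and illustrative CT values rather than measurements from this study.

```python
import math


def relative_ratio(ct_target, ct_reference, efficiency=2.0):
    """Target:reference ratio, efficiency ** -(CT_target - CT_reference)."""
    return efficiency ** -(ct_target - ct_reference)


def delta_delta_ct(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of a sample relative to a calibrator sample, 2 ** -ddCT."""
    d_sample = ct_target - ct_reference          # dCT of the sample
    d_calibrator = ct_target_cal - ct_reference_cal  # dCT of the calibrator
    return 2.0 ** -(d_sample - d_calibrator)


def plot_transform(ratio):
    """-1 / log10(ratio), the rescaling applied before plotting (base 10 assumed)."""
    if ratio <= 0 or ratio == 1:
        raise ValueError("transform is undefined for ratio <= 0 or ratio == 1")
    return -1.0 / math.log10(ratio)


# Illustrative CT values only.
ratio = relative_ratio(ct_target=12, ct_reference=8)
print(ratio, plot_transform(ratio))
```

For ratios between 0 and 1 the transform is positive and preserves rank order: a ratio of 0.01 maps to 0.5 and a ratio of 0.0001 maps to 0.25, which spreads very small ratios over a plottable range.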
2.3 Results
2.3.1 Haemolymph analysis
Haemolymph analysis results obtained from the i-STAT are summarised in Figure 2.2. Haemolymph pH measurements ranged from 6.7 to 7.2 across the treatment groups, with significant differences between samples exposed to air for 12 h and samples in water for 6 h and 12 h. Oxygen saturation measurements ranged from 24 to 89 % with significant differences (p < 0.05) between the two time points of aerial exposure at 6 h and 12 h. Figure 2.2 shows a drop of nearly 30 % in oxygen bound to Hb molecules between samples kept in water for 6 h and samples exposed to air for 6 h. Measurements for pCO2 ranged from 8.9 to 20.9. Values in water 6 h, air 6 h and water 12 h were relatively consistent across treatments; however, measurements for samples exposed to air for 12 h fluctuated significantly, with values between 10.4 and 20.9. Similar values were observed for samples in water for 6 h and in water for 12 h, but there is a slight increase in pCO2 in samples exposed to air for 6 h and a large increase in samples exposed to air for 12 h, showing a gradual build-up of CO2 in the haemolymph of those animals.
[Figure 2.2 image. Panel A: Variation of pH across conditions (y-axis: mean pH). Panel B: Variation of sO2 % across conditions (y-axis: mean sO2 %). Panel C: Variation of partial pressure of CO2 across conditions (y-axis: mean pCO2).]
Figure 2.2 Summary of haemolymph analysis results. These data were obtained by testing a few microlitres of fresh haemolymph using an Abbott i-STAT blood analyser. This was done for all haemolymph samples across two experimental conditions and two time points: water 6 h, air 6 h, water 12 h and air 12 h. A) pH measurements, B) percentage of O2 saturation, C) partial pressure of CO2. Significant differences were assessed through ANOVA testing using the program SPSS with pH values, percentage of O2 saturation values and pCO2 values as dependent variables respectively. Condition was used as a factor for each test and results were considered statistically significant at p < 0.05. Statistically different groups are represented by the symbols (*) and (•) in each graph and error bars represent 2 standard errors around the mean.
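The two statistical quantities named in the caption above, error bars of 2 standard errors and a one-way ANOVA with condition as the factor, can be illustrated with a small standard-library sketch. The study itself used SPSS; the function names here are hypothetical, the numbers are made-up toy values and not measurements from this study, and Tukey's post-hoc test is not reproduced.

```python
from math import sqrt
from statistics import mean, stdev


def mean_with_2se(values):
    """Return (mean, half-width), where the half-width is 2 standard errors."""
    standard_error = stdev(values) / sqrt(len(values))
    return mean(values), 2.0 * standard_error


def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy values for three hypothetical conditions (not data from this study).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova_f(toy_groups)
centre, half_width = mean_with_2se(toy_groups[0])
print(f_stat, df_between, df_within, centre, half_width)
```

The F statistic would then be compared against the F distribution with the two degrees of freedom to obtain the p-value that is judged against the 0.05 threshold.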
2.3.2 RT-PCR
RT-PCR results are summarised in Table 2.3. Results for the HD gene have been removed here as it could not be validated. In haemolymph, all genes were amplified in all samples. For foot samples, BG and HB were amplified in all samples, AG was amplified in 23 out of 24 samples and 2D was amplified in 14 out of 24 samples. Low amplification results for foot samples overall may be due to low yields of RNA obtained for this tissue. In gill samples, 2D, AG and BG were amplified in at least 21 out of 24 samples and HB was amplified in 20 samples. In mantle, all four genes were amplified in at least 22 out of 24 samples. In muscle samples, AG, BG and HB were amplified in all 24 samples and 2D in 21 samples. Overall, all four genes tested here are found in all tissues but their expression was significantly higher in the haemolymph, while in the other tissues tested their expression was minor. Inconsistencies with RT-qPCR results in section 2.3.4 may be due to RNA degradation in some of those samples or failed amplifications during RT-PCR.
Table 2.3 RT-PCR results summary for tissue specific expression. All five genes (2D, AG, BG, HB and HD) were amplified in six biological replicates for each tissue and each condition. To summarise the results, a score out of 6 (the total number of biological replicates) is given for each candidate gene amplified in each tissue and in each condition. An asterisk (*) represents weak bands obtained on gel electrophoresis for at least 2 biological replicates out of 6 in that group.

| Tissue | Candidate globin gene | Water 6 h | Air 6 h | Water 12 h | Air 12 h |
|---|---|---|---|---|---|
| Haemolymph | 2D | 6 | 6 | 6 | 6 |
| | AG | 6 | 6 | 6 | 6 |
| | BG | 6 | 6 | 6 | 6 |
| | HB | 6 | 6 | 6 | 6 |
| Foot | 2D | 4 | 4 | 5 | 1 |
| | AG | 6 | 5 | 6* | 6 |
| | BG | 6 | 6 | 6 | 6 |
| | HB | 6 | 6 | 6 | 6 |
| Gills | 2D | 6 | 6 | 6 | 5 |
| | AG | 5 | 6 | 5 | 5 |
| | BG | 5 | 6 | 5 | 5 |
| | HB | 5* | 6* | 4* | 5* |
| Mantle | 2D | 5 | 6* | 5 | 6 |
| | AG | 6 | 6 | 5 | 6 |
| | BG | 6 | 6 | 5 | 6 |
| | HB | 6 | 6 | 5 | 6 |
| Muscle | 2D | 6 | 6 | 6 | 4 |
| | AG | 6 | 6 | 6 | 6 |
| | BG | 6 | 6 | 6 | 6 |
| | HB | 6 | 6 | 6 | 6 |
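The per-condition scores in Table 2.3 can be summed to give the counts out of 24 quoted in section 2.3.2. The sketch below transcribes the scores from the table; the data structure and function name are illustrative, and the asterisk (weak bands) is ignored when tallying.

```python
# Scores out of 6 per condition (Water 6 h, Air 6 h, Water 12 h, Air 12 h),
# transcribed from Table 2.3. '*' marks weak bands and is stripped before summing.
SCORES = {
    "Haemolymph": {"2D": "6 6 6 6", "AG": "6 6 6 6", "BG": "6 6 6 6", "HB": "6 6 6 6"},
    "Foot": {"2D": "4 4 5 1", "AG": "6 5 6* 6", "BG": "6 6 6 6", "HB": "6 6 6 6"},
    "Gills": {"2D": "6 6 6 5", "AG": "5 6 5 5", "BG": "5 6 5 5", "HB": "5* 6* 4* 5*"},
    "Mantle": {"2D": "5 6* 5 6", "AG": "6 6 5 6", "BG": "6 6 5 6", "HB": "6 6 5 6"},
    "Muscle": {"2D": "6 6 6 4", "AG": "6 6 6 6", "BG": "6 6 6 6", "HB": "6 6 6 6"},
}


def total_out_of_24(row):
    """Sum the four per-condition scores for one gene in one tissue."""
    return sum(int(cell.rstrip("*")) for cell in row.split())


totals = {
    tissue: {gene: total_out_of_24(row) for gene, row in genes.items()}
    for tissue, genes in SCORES.items()
}

for tissue, genes in totals.items():
    print(tissue, genes)
```

A tally of this kind offers a simple way to cross-check the counts in the text against the table.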
2.3.3 Candidate gene validation
Figure 2.3 shows PCR products for all candidate genes used in sequencing for validation. Following Sanger sequencing, alignments of amplified sequences showed 100 % similarity with the original candidate ORFs for 2D, AG, BG and HB, while HD did not show the correct sequence and therefore was not validated.
Figure 2.3. PCR products amplified from two mantle samples to validate candidate Hb genes. Amplified PCR products obtained here were purified using the Bioline isolate PCR purification kit followed by cloning using the Promega pGEM-T and pGEM-T easy vector systems quick protocol. Samples were then sequenced on the ABI Genetic Analyzer 3500 in duplicate. Each gel of this figure represents one mantle sample. For both gels A (sample 1) and B (sample 2): lane 1 contains 100 bp Hyperladder, lane 2 through to 7 contain amplified products for 2D, AG, BG, HB, HD and 18S genes respectively.
2.3.4 RT-qPCR for quantification of gene expression
RT-qPCR results revealed that 2D, AG and BG had significantly higher expression levels (p < 0.05) in haemolymph compared to other tissues (Figure 2.4). In the case of HB, there were no significant differences (p > 0.05) in expression levels across tissues with overall lower expression than the other genes tested. These results are consistent with amplification curves for target genes and for the housekeeping gene 18S as shown in Figure 2.5. In haemolymph, fluorescence is first detected in cycle 12 and this first cluster represents amplification curves for 2D, AG and BG. The second cluster of amplification curves begins at cycle 32 and represents HB expression. This difference is also observed across all tissues in relative expression of 2D, AG and BG compared to HB as seen in Figure 2.4. Amplification curves for foot, mantle and muscle follow a similar pattern with fluorescence detected at cycle 20 for 2D, AG, BG (first cluster) and at cycle 26 for HB (second cluster). The housekeeping gene 18S used in all PCR runs was consistently detected at cycle 8, across
all tissues. Relative expression levels showed no significant differences between air or water exposed samples (Figure 2.6).
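The cycle offsets between the two clusters of amplification curves reported above can be translated into an approximate fold difference in starting template. This is a rough sketch only: it treats the cycle of first detectable fluorescence as a proxy for the quantification cycle, assumes perfect doubling per cycle, and is not normalised to 18S, so the values are orders of magnitude rather than measured expression ratios. The function name is illustrative.

```python
def fold_difference(cycle_early, cycle_late, efficiency=2.0):
    """Approximate fold excess of the earlier-detected template over the later one."""
    return efficiency ** (cycle_late - cycle_early)


# Haemolymph: 2D/AG/BG cluster first detected at cycle 12, HB at cycle 32.
haemolymph_fold = fold_difference(12, 32)

# Foot, mantle and muscle: 2D/AG/BG at cycle 20, HB at cycle 26.
other_tissue_fold = fold_difference(20, 26)

print(haemolymph_fold, other_tissue_fold)
```

Under these assumptions a 20-cycle offset corresponds to roughly a million-fold difference (2^20), whereas a 6-cycle offset corresponds to a 64-fold difference, which illustrates how much wider the gap between HB and the other three genes is in haemolymph than in the other tissues.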
[Figure 2.4 image. Panels: Relative expression of 2D across tissues; Relative expression of AG across tissues; Relative expression of BG across tissues; Relative expression of HB across tissues. The annotation p < 0.05 appears on three of the panels.]
Figure 2.4. Relative expression ratios of candidate globin genes amplified using RT-qPCR. Candidate genes amplified in each tissue from the bivalve A. trapezia are represented as follows: 2D, AG, BG and HB. Relative quantification analysis was performed using the ∆∆CT method. Relative expression is expressed as a ratio of levels of target sequences to levels of reference sequences with 18S used here as a housekeeping gene. Significant differences were assessed through ANOVA testing using the program SPSS and differences were considered statistically significant at p < 0.05. Significantly different groups are shown here with an asterisk (*). Ratio values were used as a dependent variable against tissue type for each gene. Error bars represent 2 standard errors around the mean.
[Figure 2.5 image. Paired amplification plots for each tissue (haemolymph, foot, gills, mantle and muscle); x-axis: cycle (5 to 45); y-axis: fluorescence.]
Figure 2.5. Examples of amplification curves obtained for target genes (2D, AG, BG, HB) and housekeeping gene 18S in each tissue type. Target amplification curves are represented in orange (left), negative controls in green (left) and reference amplification curves in blue (right). RT-qPCR was performed using a Lightcycler® measuring specific fluorescence at each cycle. All quantitative PCR analyses were repeated in three technical replicates along with negative controls and 18S as a housekeeping gene.
[Figure 2.6 image. Panels: Relative expression of all genes across conditions in haemolymph, foot, gills, mantle and muscle. x-axis: condition (Water 6h, Air 6h, Water 12h, Air 12h); y-axis: relative expression (0.00 to 2.50); legend: gene (2D, AG, BG, HB).]
Figure 2.6. Relative expression of candidate globin genes in A. trapezia amplified using RT-qPCR. Relative
quantification analysis was performed using the ∆∆CT method. Relative expression is shown as a ratio of levels of target sequences to 18S sequences, used here as a housekeeping gene. Significant differences (p < 0.05) were assessed through ANOVA testing using SPSS with ratio values as dependent variable, condition as factor in each tissue. No statistical differences were found here. Error bars represent 2 standard errors around the mean.
2.4 Discussion
In this study, the expression of four Hb encoding genes (2D, AG, BG and HB) was examined across different tissues and environmental conditions in A. trapezia to determine whether these genes show patterns of expression consistent with neofunctionalisation. Based on these findings, three Hb encoding genes appear to be predominantly expressed in the haemolymph. This pattern was consistent across the two environmental conditions tested (submersion and aerial exposure). Physico-chemical properties of the haemolymph, however, indicated that major physiological changes occur in the haemolymph without major changes in expression of Hb genes.
2.4.1 Haemolymph characteristics
Haemolymph analysis was performed in order to gain insights into the state of Hb molecules and gas exchange throughout the experiment. pH values remained constant in animals kept in water for 6 h and 12 h. However, animals exposed to air for 6 h (pH = 7.2) showed a slight drop in pH and a significant drop was observed in animals exposed to air for 12 h (pH = 6.9). This finding is consistent with observations in another bivalve species (Crassostrea gigas) (Michaelidis et al., 2005). This is an interesting physiological response as C. gigas does not have a circulating respiratory pigment, which suggests that these changes may be independent of Hb in the haemolymph. The drop in pH and increase in pCO2 in animals exposed to air for 6 h were not statistically significant (p > 0.05); however, this may be because A. trapezia is typically exposed to six-hour (high to low) tide cycles. Consequently, the animals may still be consuming their oxygen stores and relying on aerobic metabolism (Booth et al., 1984).
The observed significant decline in pH is consistent with the increase in pCO2 in animals under prolonged aerial exposure (12 h), and may be explained by a shift of the organism from aerobic metabolism to anaerobic metabolism, an adaptation to hypoxia widespread among bivalves (Brooks et al., 1991; De Zwann et al., 1993). This is well supported in literature on marine bivalves, where a large number of studies have shown that
hypoxia causes pCO2 to rise in the haemolymph due to the accumulation of CO2 as an anaerobic by-product, leading to acidification of body fluids and a drop in pH levels (Crenshaw & Neff, 1969; Jokumsen & Fyhn, 1982; De Zwann et al., 1983; Booth et al.,
1984; Michaelidis et al., 2005; Shumway & Parsons, 2011). This gradual switch from aerobic to partial or complete anaerobic metabolism upon prolonged air exposure is documented in bivalves such as the deep-sea clam (Calyptogena magnifica) (Hourdez & Lallier, 2006) and the sea mussel (Mytilus edulis). Another study on the intertidal lugworm Arenicola marina (Booth et al., 1984; Toulmond & Tchernigovtzeff, 1984; Wang &
Widdows, 1991), which is known for its ability to rapidly acclimate to restricted O2 availability in its habitat, reinforces this finding and reports that acidosis was also found to be
coupled with a rise in pCO2 as a result of decreased gas exchange during low tide (Toulmond & Tchernigovtzeff, 1984). Moreover, from results in this study, globins present in the haemolymph of A. trapezia do not seem to show a Bohr effect unlike Hbs found in teleost fish species (Souza & Bonilla-Rodiguez, 2007; Witeska, 2013). The saturation of oxygen appears independent of pH as a significant drop in pH is coupled with a significant increase
in sO2 % for animals exposed to air for 12 h. This is consistent with previous studies on Hb containing bivalves such as Barbatia reeveeana (Grinich & Terwilliger, 1980), Cardita floridana (Manwell, 1963) and Anadara broughtonii (Furuta & Kajita, 1983) for which no Bohr effect was observed.
2.4.2 Tissue specific expression and neofunctionalisation
It is now widely accepted that invertebrate Hbs are understudied, particularly in reference to their evolution, distribution and functions (Alyakrinskaya, 2002; Bao et al., 2013a). In an attempt to address this knowledge gap, a quantitative investigation of Hb encoding genes was performed across five different tissues in A. trapezia individuals undergoing aerial exposure. A. trapezia is particularly suited to examine neofunctionalisation of Hb genes as it expresses multiple different Hb genes (Como & Thompson, 1980b) and is commonly under abiotic stress due to aerial exposure during tidal cycles (Davenport & Wong, 1986). A. trapezia, like most bivalves, possesses an open circulatory system where surface areas for O2 uptake are composed of a pair of gills and mantle tissue anchored to the shell (Herreid, 1980).
While strong patterns of differential expression across tissues were observed, there was no evidence of expression differences across environmental conditions. This may indicate that these genes have similar functions under environmental stress. Alternatively, because the aerial exposure in this experiment mimics a tidal cycle at 6 h and a prolonged period at 12 h, the absence of significant differences (p > 0.05) in expression levels of Hb encoding genes during air exposure may indicate that multiple Hbs maximise O2 binding and supply during times of hypoxia (Projecto-Garcia et al., 2015).
All Hb encoding genes were expressed across all tissues but the genes encoding AG, BG and 2D had significantly higher expression in haemolymph than in foot, gills, mantle and muscle (p < 0.05). Other authors have used patterns of tissue specific expression of duplicated genes to infer that they have undergone neofunctionalisation (e.g., Na/K-ATPase in the electric organs of fish) (Norman et al., 2011; Gallant et al., 2014). Consequently, results in this study would indicate that at least three of the candidate genes investigated (AG, BG and 2D) have potentially undergone neofunctionalisation based on their expression patterns. Although it may be expected that Hb genes have a higher expression in haemolymph, most bivalve lineages closely related to Arcidae do not possess circulating erythrocytes (Sullivan, 1961; Read, 1966). This indicates that these genes have undergone altered expression patterns following duplication from an ancestral globin gene. This may have been influenced by their challenging living environment in intertidal zones, where they are regularly exposed to changing oxygen availability.
While Hb genes were expressed across multiple tissues, the expression levels may have been somewhat influenced by the permeability of tissues to haemolymph. For example, the gills are highly permeable to haemolymph because they are made up of folded and ciliated epithelium to maximise surface area for gas exchange (Widdows et al., 1979). Similarly, mantle tissue is rich in capillaries where oxygenated haemolymph is distributed to the rest of the tissues for O2 delivery (Weber, 1980). Muscle and foot tissue also depend on haemolymph for O2 supply and consequently the lower expression patterns seen across all tissues apart from haemolymph may be the result of the residual haemolymph content within them.
2.5 Conclusion
Overall, air exposure for 6 hours mimicking the duration of a low tide does not change the haemolymph pH in A. trapezia whereas prolonged air exposure causes acidosis in
the haemolymph. This drop in pH on prolonged air exposure coincides with a rise in pCO2 and therefore indicates that the organism may be switching to anaerobic metabolism in order to deal with this stress. In terms of expression of Hb encoding genes, the multiple Hbs
expressed in this species are present in all tissues but their expression is significantly higher in haemolymph than in foot, gills, mantle and muscle. For at least three of these Hb encoding genes, this contributes to the theory that they have undergone neofunctionalisation through
gene duplication. In this case, the unfavourable O2 conditions of the intertidal environment these bivalves live in may have been a driver for this selective adaptation.
Chapter 3: Functional annotation of the Ctenoides ales transcriptome
3.1 Background
Respiratory pigments, or oxygen-transport proteins, within the bivalve lineage show a patchy distribution, as well as extensive variation in both form and function across species in which they are expressed (Terwilliger, 1980; Booth et al., 1984; Morse et al., 1986; Mangum et al., 1987; Weber & Vinogradov, 2001). The majority of bivalves do not have a functional respiratory pigment and rely on filtration of highly oxygenated water over the gills to meet their respiratory demands (Angelini et al., 1998). Those bivalve species that do possess a respiratory pigment usually express haemocyanin (Hc) (Terwilliger et al., 1988), a complex copper-based protein that reversibly binds O2. In rare cases, however, bivalves utilise a type of haemoglobin (Hb), as seen in the orders Arcoida, Carditoida, Solemyoida and Veneroida (Manwell, 1963; Terwilliger et al., 1978; Dando et al., 1985; Doeller et al., 1988; Suzuki et al., 2000). Based on the current phylogenetic reconstruction of the bivalve species by González et al. (2015), it appears that the earliest lineages possess Hc as the primary respiratory pigment. Based on this observation, and the fact that almost all gastropods (sister lineage to bivalves) possess Hc (Linzen et al., 1985), it is likely that Hc is the ancestral oxygen-transport protein in Bivalvia. Unlike Hc, Hb is thought to have evolved from ancestral globin genes on multiple occasions within bivalves.
Currently, four independent origins of Hbs have been hypothesised in bivalve molluscs. The first instance is thought to have occurred in the primitive bivalve lineage containing Solemya velum in the order Solemyoida. In addition to circulating haemocyanins, this species contains several tissue Hbs, predominantly localised in the gills (Dando et al., 1985; Doeller et al., 1988). The second independent origin of Hbs was reported in the bivalve family Arcidae in the order Arcoida, where evidence suggests that all species investigated contain multiple Hbs in circulating red blood cells (Mangum, 1997). The diversity of Hbs in this group consists of monomeric, homo-dimeric, hetero-dimeric, tetrameric and polymeric proteins (Terwilliger, 1980). The third reported origin of bivalve Hb is found in the deep-sea clam genus Calyptogena in the order Veneroida. In this genus, species such as C. kaikoi contain two types of homo-dimeric Hb proteins, where one is found in erythrocytes and the other is restricted to adductor muscle tissue (Suzuki et al., 2000). The fourth instance is an extracellular Hb found in the heterodont clams Cardita borealis and Cardita floridana in the
order Carditoida (Manwell, 1963; Terwilliger et al., 1978). While these four cases provide compelling evidence for the independent evolution of Hb in multiple bivalve lineages, it is not known whether Hb has evolved in other lineages where oxygen-transport proteins have not been examined.
Interestingly, there is some sparse evidence to suggest that members of the family Limidae may also possess Hb proteins. This is largely restricted to one textbook publication (Rawat, 2010) with no molecular or biochemical studies having been reported. It is plausible to hypothesise that Limidae indeed possess Hb proteins based on their high metabolic rate associated with the ability to swim and the red pigmentation of their tissues (Harper & Skelton, 1993; Mikkelsen & Bieler, 2003). If present, Hb could represent an important physiological advantage for a more efficient O2 delivery system to respiring tissues during swimming. Consequently, if Hb were found in Limidae, then this would constitute a fifth independent origin of Hb in bivalve molluscs.
An efficient approach to determine whether Hbs are present in a given species is to sequence the entire expressed portion of its genome, as has been done successfully for the blood clams Tegillarca granosa (Bao & Lin, 2010) and A. trapezia (Prentis & Pavasovic, 2014). Research into the presence of Hb in the family Limidae has lagged, principally because of a lack of genomic and proteomic resources for species from this bivalve family. Here, transcriptome sequencing, assembly and annotation are described for C. ales, a non-model bivalve mollusc from the family Limidae for which no transcriptome data currently exist. The aim of the present study is to test the hypothesis that Hb genes encoding a functional Hb protein are present in this species. Their presence would likely constitute a fifth independent origin of Hb in bivalves, as these proteins have so far been found only in the bivalve orders Arcoida, Carditoida, Solemyoida and Veneroida. The newly generated transcriptome will also greatly increase the genomic resources for C. ales and provide an initial candidate gene list for further genetic research.
3.2 Materials and Methods
3.2.1 Sample collection
C. ales specimens were obtained from Cairns Marine Pty Ltd, Australia and kept in holding tanks at the Marine Facility (QUT) until required for experimentation. This experiment required no ethics approval as the specimens do not qualify as animals as described by the Queensland Animal Care and Protection Act 2001 (ACPA). Water conditions were maintained at a salinity of 30 – 35 ppt, pH 7.9 – 8.1, a temperature of 20 – 25 °C and ammonia levels < 0.1 mg/L. A 12 h light/dark cycle was maintained. Animals were fed every three days with phyto-blast containing a wide range of aquaculture marine phytoplankton (Continuum Aquatics).
3.2.2 RNA extraction, library preparation and sequencing
One individual tissue sample was used and divided into four for RNA extraction, as per a previously optimised TRIzol/chloroform RNA extraction protocol for mollusc species (Prentis & Pavasovic, 2014). Specifically, the protocol consisted of the following steps: 1.0 mL of TRIzol reagent was added to the 1.5 mL microcentrifuge tubes containing tissue and each tube was vortexed to mix for 5 min. The samples were then incubated at room temperature for 2 min. Chloroform (0.2 mL) was added to each tube and vortexed for 20 – 30 sec, followed by a 5 min incubation at room temperature. Samples were spun at 12,000 g for 15 min and the clear aqueous phase in each tube was collected carefully without disturbing the other layers, and placed in four new 1.5 mL microcentrifuge tubes. Following this, isopropanol (0.5 mL) was added to each tube of newly collected solution and vortexed for approximately 10 sec. Sodium chloride (1.2 M, 100 µL) was added to each tube, followed by a 10 min incubation at room temperature. Samples were then spun at 12,000 g for 15 min and the supernatant was removed from each tube without disturbing the pellets. Ethanol (70 %, 200 µL) was then added to each tube to wash the pellets, followed by centrifugation at 12,000 g for 10 min and removal of the supernatant. The pellets were then allowed to air dry for 10 min before being eluted in 50 µL of RNase free water. The RNA was then visualised on a 1.5 % agarose gel. The quantity and integrity of total RNA were assessed on a Bioanalyzer 2100 RNA Nano chip (Agilent Technologies).
Sequencing libraries were prepared from 20 µg of total RNA using a TruSeq® stranded RNA library prep kit (Illumina), as per manufacturer’s low sample (LS) protocol. This consisted of selective purification of mRNA using oligo(dT) magnetic beads and
subsequent selection of 200 – 700 bp fragments. Specifically, total RNA was diluted with nuclease-free ultra-pure water to a final volume of 50 µL in a 96-well 0.3 mL PCR plate provided in the kit and labelled with an RNA Bead Plate (RBP) barcode. After being brought to room temperature, the tube containing the RNA Purification Beads was vortexed to resuspend the oligo-dT beads. 50 µL of RNA Purification Beads was added to each well of the RBP plate to bind the polyA RNA to the oligo-dT magnetic beads. Each well was then mixed thoroughly by pipetting the entire volume up and down six times, followed by sealing of the plate with the Microseal adhesive seal provided. The RBP plate was then placed in a thermal cycler on a cycle of 65 °C for 5 min and 4 °C hold to denature the RNA and facilitate binding of the polyA RNA to the beads. After being taken out of the cycler, the RBP plate was incubated at room temperature for 5 min to allow the RNA to bind to the beads. The adhesive seal was then removed and the plate was placed on the magnetic stand at room temperature for 5 min to separate the polyA RNA bound beads from the solution. Supernatant from each well was then removed and discarded, and the plate was taken off the magnetic stand. Beads were washed by adding 200 µL of the Bead Washing Buffer provided to each well to remove unbound RNA. Wells were mixed thoroughly by pipetting up and down six times. The RBP plate was again placed on the magnetic stand for 5 min at room temperature. After being thawed, the Elution Buffer solution provided was centrifuged at 6,000 g for 5 sec. Supernatant from each well was removed after incubation and discarded as it contained most of the ribosomal RNA and other non-messenger RNA. The RBP plate was removed from the magnetic stand and 50 µL of Elution Buffer was added to each well and mixed thoroughly by pipetting up and down six times.
The RBP plate was sealed and placed in the thermal cycler on a cycle of 80 °C for 2 min and 25 °C hold to elute the mRNA from the beads. This step releases both the mRNA and any contaminant rRNA that may have bound to the beads non-specifically. The plate was removed from the thermal cycler and unsealed. After thawing, the Bead Binding Buffer provided was centrifuged at 600 g for 5 sec. Bead Binding Buffer (50 µL) was added to each well of the RBP plate to allow the mRNA to rebind the beads specifically and to reduce the amount of rRNA bound non-specifically. Each well was mixed. The RBP plate was then incubated at room temperature for 5 min and then placed on the magnetic stand, where the supernatant was removed and discarded. The plate was removed from the stand and the beads were washed by adding 200 µL of Bead Washing Buffer to each well and mixing. The plate was placed on the magnetic stand at room temperature for 5 min and the supernatant was removed and discarded to avoid any residual rRNA and other contaminants. The plate was removed from the magnetic stand and 19.5 µL of Fragment,
Prime, Finish Mix provided was added and mixed thoroughly with the RNA to serve as a first strand cDNA synthesis reaction buffer, as it contains random hexamers. The plate was sealed and placed in the thermal cycler on a cycle of 80 °C for 8 min and 4 °C hold. The RBP plate was then removed from the thermal cycler and centrifuged briefly. Complementary DNA synthesis was then carried out on enriched mRNA samples as follows: the RBP plate was placed on the magnetic stand at room temperature for 5 min and 17 µL of supernatant in each well was transferred to the corresponding well on a new 0.3 mL PCR plate labelled with a cDNA Plate (CDP) barcode. After thawing, the First Strand Synthesis Act D mix was centrifuged at 6,000 g for 5 sec. SuperScript II (50 µL) was added to the First Strand Synthesis Act D tube. Eight µL of this mix was added to each well of the CDP plate and mixed thoroughly. The CDP plate was sealed and centrifuged briefly before being transferred to the thermal cycler on a cycle of 25 °C for 10 min, 42 °C for 15 min, 70 °C for 15 min and hold at 4 °C. The CDP plate was then taken out of the thermal cycler and unsealed. Five µL of Resuspension Buffer was added to each well, followed by 20 µL of the thawed and centrifuged Second Strand Marking Master Mix provided. Each well was thoroughly mixed and the plate sealed. The CDP plate was placed in the thermal cycler and incubated at 16 °C for one hour. It was then removed from the cycler and unsealed. Well-mixed AMPure XP beads (90 µL) provided were added to each well now containing 50 µL of double stranded cDNA and this was mixed by pipetting the entire volume up and down ten times. The CDP plate was then incubated at room temperature for 15 min before being placed on the magnetic stand for 5 min. 135 µL of supernatant was removed and discarded from each well.
Freshly made 80 % EtOH (200 µL) was then added to each well without disturbing the beads and the plate was incubated at room temperature for 30 sec before removing and discarding all of the supernatant from each well. This EtOH wash step was repeated one more time before leaving the CDP plate to dry at room temperature for 15 min. The CDP plate was then removed from the stand and 17.5 µL of thawed and centrifuged Resuspension Buffer provided was added to each well and mixed. The plate was incubated at room temperature for 2 min and placed on the magnetic stand for 5 min. Fifteen µL of supernatant from each well was then transferred from the CDP plate to a new 96 well 0.3 mL PCR plate labelled with an Adapter Ligation Plate (ALP) barcode.
Purified fragments were A-tailed, ligated to Illumina paired-end adapters, and size selected by gel purification. Finally, PCR was used to enrich the DNA fragments with the following conditions: one cycle at 98 °C for 30 sec, 15 cycles of 98 °C for 10 sec, 60 °C
for 30 sec, 72 °C for 30 sec, one cycle at 72 °C for 5 min and hold at 4 °C. The final cDNA library was sequenced on an Illumina NextSeq500 using 150 bp paired-end chemistry.
3.2.3 Transcriptome assembly and validation
Following sequencing, the libraries were concatenated into two separate files based on read direction (--left all_R1_reads.fastq and --right all_R2_reads.fastq). Quality control was then performed on both files to validate the quality of the raw reads prior to assembly. This was done using the FastQC program, which provides summary statistics for read quality, length and GC content (Andrews, 2010). Low quality reads (sequences with > 1 % N bases and/or > 1 % of bases below Q20) were discarded, and only reads with quality scores above Q20 and less than 1 % ambiguity were used for downstream analysis. These remaining reads were assembled into contigs ≥ 200 bp using the Trinity short read de novo assembler (version 2.0.6) (Grabherr et al., 2011). Assembled contiguous sequences (contigs) were filtered for redundancy and chimeric transcripts using the program CD-HIT (Cluster Database at High Identity with Tolerance), version 4.6.1 (Huang et al., 2010), and sequences with > 95 % similarity were clustered into a single contig. At this point, the C. ales transcriptome assembly was evaluated for completeness using CEGMA (Core Eukaryotic Genes Mapping Approach) to determine the presence of a group of 248 highly conserved core eukaryotic genes.
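The read-level thresholds described above can be illustrated with a short Python sketch. This is not the filtering tool used in this study; it is a minimal example that assumes Phred+33 encoded FASTQ quality strings, and the function name `passes_filter` and its parameter names are hypothetical.

```python
def passes_filter(seq, qual, max_n_frac=0.01, max_lowq_frac=0.01, min_q=20):
    """Return True if a read satisfies both thresholds described in the text.

    seq  -- nucleotide string of the read
    qual -- FASTQ quality string (ASSUMPTION: Phred+33 encoding)
    """
    if not seq or len(seq) != len(qual):
        return False
    # Fraction of ambiguous (N) bases in the read
    n_frac = seq.upper().count("N") / len(seq)
    # Under Phred+33, quality score = ASCII code - 33
    low_q = sum(1 for ch in qual if ord(ch) - 33 < min_q)
    low_q_frac = low_q / len(qual)
    # A read is discarded if either fraction exceeds 1 %
    return n_frac <= max_n_frac and low_q_frac <= max_lowq_frac
```

A read is kept only when both fractions are at or below 1 %, which mirrors the "and/or" wording of the discard rule.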
3.2.4 Functional annotation of transcripts and mapping
The transcriptome assembly was functionally annotated using sequence based searches implemented in the Trinotate annotation pipeline (Haas et al., 2013), which is a comprehensive annotation suite designed specifically for de novo assembled transcriptomes of model or non-model organisms. Trinotate generates functional annotation based on homology searches to known sequence data, protein domain identification and protein signal peptide/transmembrane domain predictions before integrating the analysis of transcripts into a SQLite database to obtain an annotation report, as described below and as summarised in Figure 3.1. Firstly, contigs were used as BLASTx queries against the TrEMBL and Swiss-Prot databases with a stringency E-value of 1 × 10⁻⁶. Both databases are collections of core data on proteins such as the amino acid sequence, the protein name or description, the taxonomic data, the citation information and as much annotation information as possible.
They are used concurrently as TrEMBL is computationally analysed while Swiss-Prot is manually annotated based on literature and curated computational analysis. Gene ontology terms were assigned to contigs that received BLAST hits that also contained functional information. In order to characterise the transcriptomic data obtained, these terms were analysed using WEGO (Web Gene Ontology Annotation Plot) (Ye et al., 2006). TransDecoder v.2.0.1 (Haas et al., 2013) was used to generate a predicted proteome for the transcriptome assembly based on the longest open reading frame (ORF) in each transcript. The predicted proteome was then used as a BLASTp query against the TrEMBL and Swiss-Prot databases. SignalP 4.1 was used to predict the presence and location of signal peptide cleavage sites in coding sequences. Pfam domains were assigned using the Hidden Markov Model-based sequence alignment tool (HMMER) to determine the presence and position of domains within each predicted protein. The analyses obtained for all transcripts were then uploaded to the SQLite database and an annotation report was generated. Trinity output consisted of sequence clusters identified as genes through BLASTx searches against the TrEMBL and Swiss-Prot databases.
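The effect of the E-value cut-off applied to the homology searches can be sketched as follows. This is an illustration only, not part of the Trinotate pipeline: it assumes the BLAST results were written in the standard 12-column tabular layout (query ID in column 1, subject ID in column 2, E-value in column 11), which the text does not state, and the function name `best_hits` is hypothetical.

```python
def best_hits(lines, max_evalue=1e-6):
    """Keep, for each query, the hit with the lowest E-value at or below the cut-off.

    lines -- iterable of tab-separated BLAST rows
             (ASSUMPTION: standard 12-column tabular output)
    Returns a dict mapping query ID -> (subject ID, E-value).
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > max_evalue:
            continue  # hit is weaker than the stringency threshold
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best
```

Queries with no hit at or below the threshold are simply absent from the result, which corresponds to contigs that remain unannotated.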
Figure 3.1. Workflow overview of functional annotation of the C. ales transcriptome using the Trinotate annotation pipeline. This is a comprehensive annotation suite based on homology searches to known sequence data. Contigs were first used as BLASTx queries against the TrEMBL and Swiss-Prot databases (stringency E-value of 1 × 10⁻⁶). TransDecoder was used to generate a predicted proteome, which was then used as a BLASTp query against the TrEMBL and Swiss-Prot databases. SignalP was used to predict the presence and location of signal peptides and Pfam to determine the presence and position of protein domains. All results were uploaded to the SQLite database and an annotation report was generated. Adapted from van der Burg et al. (2016).
3.2.5 Comparative transcriptomics
In order to further validate the newly generated transcriptome, orthologous gene clusters for C. ales, A. trapezia and the two model species Lottia gigantea and Crassostrea gigas were compared and annotated using the program OrthoVenn (Wang et al., 2015). The predicted proteomes from the genomes of L. gigantea and C. gigas were compared to the predicted proteomes from the whole organism transcriptomes of A. trapezia and C. ales. Ortholog groups were identified by an all-against-all or reciprocal BLASTp alignment.
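The idea behind a reciprocal BLASTp comparison can be illustrated with a reciprocal-best-hit pairing between two species. This is a deliberately simplified sketch: it does not reproduce the clustering performed by OrthoVenn, and the function name and the input dictionaries (best hit per protein in each direction) are hypothetical.

```python
def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit.

    a_vs_b -- dict: protein in species A -> its best hit in species B
    b_vs_a -- dict: protein in species B -> its best hit in species A
    """
    return sorted(
        (a, b) for a, b in a_vs_b.items() if b_vs_a.get(b) == a
    )
```

A pair is retained only when the best hit in one direction is confirmed by the best hit in the opposite direction; one-directional hits are dropped.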
3.2.6 Candidate genes identification
The annotated transcriptome was interrogated for candidate globin genes, here defined as contigs that possess a characteristic globin fold sequence and globin domain as identified in the Trinotate pipeline through BLASTp and Pfam. Candidate contigs identified were analysed through ORF finder (Stothard, 2000) to determine open reading frames. Coding regions were translated into protein sequences using the ExPASy translation tool (Gasteiger et al., 2003) and used as BLASTp queries against the NR database at NCBI. Relevant conserved domains were identified using SMART searches for homologous Pfam domains (Finn et al., 2014), signal peptides and internal repeats.
3.2.7 Primer design and candidate gene validation
Primers for the candidate sequences were designed using Primer3 software to amplify the entire ORFs (open reading frames) of the candidate genes: contigs c97022_g1_i3 (CalesGl1), c89016_g1_i4 (CalesGl2) and c97010_g2_i1 (CalesGl3), and used for validation. This was done with the following PCR protocol: 12.5 µL of MyFi 2x, 8.5 µL of H2O, 1 µL of forward primer, 1 µL of reverse primer, 2 µL of MgCl2 and 1 µL of template to a total volume of 26 µL. Different conventional PCR amplification conditions were used for each candidate gene. For contig c97022_g1_i3: one cycle at 94 °C for 3 min, 30 cycles at 94 °C 30 sec, 54 °C 30 sec, 72 °C 1 min, one cycle at 72 °C for 5 min and a 4 °C hold. For contig c89016_g1_i4: one cycle at 94 °C for 3 min, 30 cycles at 94 °C 30 sec, 50 °C 45 sec, 72 °C 1 min, one cycle at 72 °C for 5 min and a 4 °C hold. For contig c97010_g2_i1: one cycle at 94 °C for 3 min, 30 cycles at 94 °C 30 sec, 48 °C 1 min, 72 °C 1 min, one cycle at 72 °C for 5 min and a 4 °C hold. Amplified products were size-separated by electrophoresis on 1.5 % agarose gels stained with GelRed™ (Biotium) alongside a 100 bp Hyperladder (Bioline). The intensity of PCR products was determined using an image analysis program assisted by a gel documentation system (Chemidoc XRS, Bio-Rad). PCR products were then purified using an Ethanol/EDTA precipitation protocol as follows: 5 µL of 125 mM EDTA disodium salt (pH 8.0) was added to each PCR tube, vortexed and spun briefly. Ethanol (100 %, 60 µL) was added and tubes were left to incubate at room temperature for 15 min. Tubes were then spun in a centrifuge at 13,000 g for 20 min and the supernatant was carefully aspirated and discarded. Freshly made 80 % ethanol (350 µL) was then added and tubes were spun at 13,000 g for 5 min before the supernatant was aspirated and discarded. Pellets in the tubes
were then left to dry at room temperature for 1 hr covered with aluminium foil. Samples were sequenced on the ABI Genetic Analyzer 3500 (ThermoFisher). Sequence chromatograms were visualised using Geneious® software version 8.1.6 and sequences were re-aligned to the candidate ORFs from which the primers were designed.
3.2.8 Phylogenetic analysis of sequences
Phylogenetic relationships of globin genes in this study and model mollusc species were resolved using maximum likelihood analysis in MEGA version 6.0 (Tamura et al., 2013). This software conducts manual and automatic sequence alignments to infer molecular relationships and build phylogenetic trees (Tamura et al., 2013). Firstly, the globin gene sequences were isolated from the genomes of the two mollusc model species C. gigas and L. gigantea by using a custom BLAST against haemoglobins, neuroglobins, globin-X, cytoglobins and myoglobins in the three vertebrate model species: Homo sapiens (human), Gallus gallus (chicken) and Danio rerio (zebrafish). For C. gigas, 12 globin genes were found and 10 for L. gigantea. Globin gene sequences were isolated from the transcriptome of A. trapezia. All sequences were further validated using Pfam searches to ensure the presence of a globin domain. Alignments were performed using the MUSCLE (Edgar, 2004) protein plug-in with default settings in MEGA, and a best-fit model test was performed, which selected LG+G+I (Tamura et al., 2013). This LG model with gamma distributed rates (G) and invariant sites (I) was therefore used with three discrete gamma categories to construct a maximum-likelihood tree using the bootstrap method with 500 bootstrap replications. All mollusc globin protein sequences used in the phylogeny were also aligned for comparison and to identify residue conservation using MUSCLE (multiple sequence alignment with high accuracy and high throughput) in MEGA version 6.0.
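The bootstrap method rests on resampling alignment columns with replacement. The sketch below shows a single resampling step in Python; it is not MEGA's implementation, the tree inference that would follow each replicate is not shown, and the function name `bootstrap_alignment` is hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by sampling columns with replacement.

    alignment -- dict: sequence name -> aligned sequence (equal lengths)
    rng       -- a random.Random instance, so that runs can be reproduced
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    n_cols = lengths.pop()
    # Draw as many column indices as the alignment has columns
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    # Apply the same column choice to every sequence to keep columns intact
    return {
        name: "".join(seq[i] for i in cols)
        for name, seq in alignment.items()
    }
```

Repeating this step 500 times and inferring a tree from each replicate gives, for every clade, the percentage of replicate trees in which it appears, which is the support value shown next to the branches of a bootstrap tree.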
3.3 Results
3.3.1 RNA extraction, library preparation and sequencing
Total RNA extraction, performed in quadruplicate, for C. ales is visualised in Figure 3.2, where all four samples showed strong bands around 700 bp, which represent ribosomal RNA. Bioanalyzer results (Figure 3.3) confirm these findings with high concentration and
yields for both samples submitted. Transcriptome sequencing of mRNA from C. ales on Illumina NextSeq 500 resulted in 55,592,966 150 bp paired-end reads and a GC content of 36.53 %. This slightly low GC content may be due to shortness of sequences and instability of mRNA.
Figure 3.2. Whole tissue from one C. ales individual was sampled and used for RNA extraction. This 1.5 % agarose gel electrophoresis shows RNA extracted from four samples (extraction was performed in quadruplicate) from this individual as follows: lane 1 contains 100 bp Hyperladder, lanes 2-5 contain samples 1-4 respectively. The RNA is visible as strong bands in each sample around the 700 bp mark. RNA obtained was then assessed for quantity and integrity and sequencing libraries were prepared.
Figure 3.3. Bioanalyzer results for total RNA quality and quantity. Total RNA samples obtained from whole tissue of one C. ales individual were assessed for quantity and integrity on a Bioanalyzer 2100 RNA Nano chip. Results shown here are for two RNA samples, labelled C10 and C11. RNA concentrations are given for each sample.
3.3.2 Transcriptome assembly and validation
Assembly of this data resulted in 182,908 contigs (≥ 200 bp). The mean contig length was 357 bp, with an N50 of 1,167 bp. Assembly statistics are presented in Table 3.1.
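For readers unfamiliar with the N50 statistic, the following Python sketch shows how it is conventionally derived from a list of contig lengths: the N50 is the length of the contig at which the cumulative length, counted from the longest contig downwards, first reaches half of the total assembly length. The function name `n50` is hypothetical and the code is not the script used to generate the values reported here.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # First contig at which the cumulative length covers half the assembly
        if running >= half_total:
            return length
    raise ValueError("no contig lengths supplied")
```

Because long contigs dominate the cumulative sum, the N50 of an assembly containing many short contigs is typically well above its average contig length.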
Table 3.1 Summary of sequencing and assembly data for the bivalve C. ales. Sequencing libraries were prepared using an Illumina TruSeq® stranded RNA library prep kit and the final cDNA library was sequenced on an Illumina NextSeq500 using 150 bp paired-end chemistry. Libraries obtained were then assembled and the table below summarises results obtained.
Assembly statistic           Value
Total assembled reads        127,196,743
Total trinity ‘genes’        160,227
Total contigs                182,908
Mean contig length (bp)      357
Average contig (bp)          695.41
N50 (bp)                     1,167
3.3.3 Functional annotation of transcripts and mapping
Overall, 22,903 (12.5 %) transcripts received significant BLASTx hits and 19,788 (10.8 %) BLASTp hits against the Swiss-Prot database. Against the TrEMBL database of predicted proteins, 34,098 (18.6 %) transcripts received significant BLASTx hits and 26,344 (14.4 %) BLASTp hits. Gene ontology (GO) terms were assigned to 30,315 transcripts (Figure 3.4). In the broad biological process category, the most frequently assigned GO terms were cellular process (20,887), metabolic process (15,717) and biological regulation (11,020). For the cellular component category, most GO terms were assigned to cell and cell part (23,916), followed by organelle (16,087). In the molecular function category, GO terms were most commonly assigned to binding (19,803) and catalytic activity (12,179).
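The annotation percentages can be checked with a short calculation. The sketch below assumes that the denominator is the total number of assembled contigs (182,908); the text does not state this explicitly, but the assumption reproduces all four reported values. The names `TOTAL_CONTIGS` and `percent_annotated` are hypothetical.

```python
# ASSUMPTION: percentages are expressed relative to the total contig count
TOTAL_CONTIGS = 182_908


def percent_annotated(n_hits, total=TOTAL_CONTIGS):
    """Percentage of transcripts with a significant hit, to one decimal place."""
    return round(100 * n_hits / total, 1)


# Hit counts as reported in the text
reported = {
    "Swiss-Prot BLASTx": 22_903,
    "Swiss-Prot BLASTp": 19_788,
    "TrEMBL BLASTx": 34_098,
    "TrEMBL BLASTp": 26_344,
}
percentages = {name: percent_annotated(n) for name, n in reported.items()}
```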
[Figure 3.4: bar chart; axes: Number of genes and Percent of genes; panels: Cellular component, Molecular function, Biological process]
Figure 3.4. WEGO output for the newly generated C. ales transcriptome. Web Gene Ontology Annotation Plot (WEGO) was used to characterise the transcriptomic data obtained for C. ales. This figure represents the proportion and number of transcripts assigned Gene Ontology (GO) terms in three gene ontology categories developed to represent common and basic biological information: cellular component, molecular function and biological process.
3.3.4 Comparative transcriptomics
A comparative analysis was conducted across the four mollusc datasets, comprising the two model species C. gigas and L. gigantea, as well as the bivalves A. trapezia and C. ales. Orthologous clusters conserved among the four molluscs and those unique to each species are represented in the Venn diagram in Figure 3.5. Overall, 6,743 clusters were shared by all four species, which represents 55.5 % of the total number of clusters for C. gigas, 58.1 % for L. gigantea, 53.2 % for A. trapezia and 38.8 % for C. ales. The species with the lowest number of unique clusters was L. gigantea with 918, while C. ales had the highest with 4,355. A. trapezia shared the most orthologous clusters with C. ales (1,607).
Figure 3.5. Venn diagram illustrating the number of gene clusters shared among and unique to the four mollusc species C. gigas, L. gigantea, A. trapezia and C. ales. For L. gigantea and C. gigas, the predicted proteomes obtained from the genomes are used, and the predicted proteomes from whole organism transcriptomes are used for A. trapezia and C. ales. Ortholog groups shown here were identified by an all-against-all reciprocal BLASTp alignment.
3.3.5 Candidate genes
Contigs identified as globins were extracted from the Trinotate annotation report (c97022_g1_i3; c89016_g1_i4; c97010_g2_i1). Contig c97022_g1_i3 (CalesGl1) was found to be 1,930 bp in length, consisting of an ORF of 498 bp encoding a polypeptide of 165 amino acids. Analysis of the predicted protein sequence identified a globin domain (Figure 3.6) with 40 % sequence identity to an Ngb-like gene in the bivalve C. gigas. Contig c89016_g1_i4
(CalesGl2) was 715 bp in length, consisting of an ORF of 585 bp encoding a polypeptide of 194 amino acids. The predicted amino acid sequence of this contig consisted of a signal peptide and a globin domain (Figure 3.6). It had 36 % protein identity to a Hb gene found in the crab Carcinus maenas. Contig c97010_g2_i1 (CalesGl3) was found to be 1,835 bp in length consisting of an ORF of 1,107 bp encoding a polypeptide chain of 368 amino acids. The predicted amino acid sequence of this contig was found to encode a transmembrane domain and two globin domains (Figure 3.6), with 38 % sequence identity to the HbI gene found in the bivalve T. granosa.
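The relationship between the reported ORF lengths and polypeptide lengths can be verified with a small calculation. The sketch assumes that each reported ORF length includes the terminal stop codon, which is not stated in the text but is consistent with all three candidates; the function name `orf_to_residues` is hypothetical.

```python
def orf_to_residues(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length in bp.

    ASSUMPTION: the ORF length includes the stop codon, which encodes
    no residue, unless includes_stop is set to False.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# ORF lengths (bp) as reported for the three candidate genes
candidates = {"CalesGl1": 498, "CalesGl2": 585, "CalesGl3": 1107}
residues = {name: orf_to_residues(bp) for name, bp in candidates.items()}
```

Under this assumption the three ORFs give 165, 194 and 368 residues, matching the polypeptide lengths reported for CalesGl1, CalesGl2 and CalesGl3.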
[Figure 3.6: domain diagrams for CalesGl1, CalesGl2 and CalesGl3; horizontal axis in bp]
Figure 3.6. Globin domains identified for three candidate globin genes. The annotated transcriptome obtained for C. ales was interrogated for candidate globin genes, here defined as contigs that possess a characteristic globin fold sequence and globin domain. The three contigs found to possess these characteristics are shown above as follows: CalesGl1 (top), CalesGl2 (middle) and CalesGl3 (bottom). For candidate gene 2, the purple circle represents a signal peptide and for candidate gene 3, the blue rectangle represents a transmembrane domain. Both were identified using SMART searches for homologous Pfam domains, signal peptides and internal repeats.
3.3.6 Primer design and candidate gene validation
Forward and reverse primers were designed based on the longest ORF for each of the three candidate genes for validation (Table 3.2). PCR reactions were optimised for each primer pair and the sequences were successfully amplified (Figure 3.7). Alignments of the chromatograms obtained from Sanger sequencing with the original candidate ORFs were successful, and all three genes were validated with > 97 % identity.
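One common way to express agreement between a Sanger read and a candidate ORF is percent identity over aligned positions. The sketch below illustrates that definition only; how identity was computed in Geneious for this study is not stated, columns containing a gap in either sequence are excluded here as an assumption, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two aligned sequences of equal length.

    ASSUMPTION: columns with a gap ('-') in either sequence are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```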
Table 3.2 Primers designed for validation of three candidate genes identified from the newly generated C. ales transcriptome. Primers were designed using Primer3 software to amplify the entire ORFs and validate the candidate genes identified. Forward and reverse primers were designed for each candidate as shown in the table below.
Primer name   Primer sequence 5’-3’          Contig          Annealing T°   Product size (bp)
CalesGl1_F    AGA AGC GCA GGC AGA AGA AA     c97022_g1_i3    55°            670
CalesGl1_R    TGT GAA TCA ACG CAT TGC ACA
CalesGl2_F    ATC AGT CGA CTG GTG CAT AG     c89016_g1_i4    55°            670
CalesGl2_R    TGC ATG TAC AGA ATA AAG GCA
CalesGl3_F    GCA CAC AGC TAC ACT CTT GT     c97010_g2_i1    55°            1400
CalesGl3_R    TGC GTA GAT CGG AGT AGA GA
[Figure 3.7: agarose gel images, panels A, B and C]
Figure 3.7. Candidate gene products from the C. ales transcriptome. Each candidate was amplified using the primers shown in Table 3.2 above. Lane 1 in all gels shown here contains Hyperladder 100 bp. Candidate gene 1 is shown in lane 2 of gel (A); candidate gene 2 is shown in lane 3 of gel (B); candidate gene 3 is shown in lane 3 of gel (C). All gels are 1.5 % agarose stained with GelRed™ (Biotium). Products of candidate genes seen here were purified using an Ethanol/EDTA precipitation protocol and samples were sequenced on the ABI Genetic Analyzer 3500 (ThermoFisher). Lanes 2 and 4 of gel (B) and lane 2 of gel (C) are candidate gene products prior to PCR reactions being optimised.
3.3.7 Phylogeny of globins in bivalves
The phylogenetic relationships of globins in the four mollusc species (L. gigantea, C. gigas, A. trapezia and C. ales) and three model vertebrate species (D. rerio, G. gallus and H. sapiens) are represented in Figure 3.8. In this maximum-likelihood tree, there are three weakly supported major clades (A, B and C). Clade C contained Hb and other related globin genes (all globins apart from Ngb and GbX) from the three vertebrate model species. Clades A and B both contained globin genes from L. gigantea, C. gigas, C. ales and A. trapezia. Clade A, although poorly supported, contained globin genes from L. gigantea, A. trapezia, C. gigas and C. ales, together with vertebrate GbX, but no genes encoding molluscan Hb proteins. Clade B is strongly supported and contains all known Hb genes in A. trapezia. All mollusc globin protein sequences used for phylogeny were also aligned using MUSCLE in MEGA. Results are shown for alignment positions 260-331, the region in which residues are conserved across all 30 mollusc globins. A phenylalanine residue at position 271 and two histidine residues at positions 295 and 328 are conserved across all 30 globins from the four mollusc species (Figure 3.9).
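Identifying fully conserved residues in a multiple sequence alignment amounts to scanning each column for a single shared residue. The following Python sketch illustrates the idea on toy data; it is not the procedure used in MEGA, the function name `conserved_columns` is hypothetical, and the sequences in the usage line are invented for illustration.

```python
def conserved_columns(aligned_seqs):
    """Return (1-based position, residue) for every fully conserved column.

    aligned_seqs -- list of aligned sequences of equal length
    Columns containing a gap ('-') are never reported as conserved.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    conserved = []
    for i in range(length):
        column = {seq[i] for seq in aligned_seqs}
        if len(column) == 1 and "-" not in column:
            conserved.append((i + 1, aligned_seqs[0][i]))
    return conserved


# Invented toy alignment: only the F and H columns are shared by all sequences
toy = conserved_columns(["AFKH", "GFRH", "-FQH"])
```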
[Figure 3.8 tree tip labels, in plotted order: L. gigantea globin-like 6; C. gigas neuroglobin-like; A. trapezia neuroglobin NG; L. gigantea globin-like 5; C. gigas haemoglobin-1-like; C. gigas cytoglobin-2-like X3; L. gigantea globin-like 1; D. rerio GbX; L. gigantea globin-like 8; C. gigas cytoglobin-2-like X1; L. gigantea globin-like 2; C. ales neuroglobin-like CalesGl1; C. gigas neuroglobin; D. rerio neuroglobin; G. gallus neuroglobin; H. sapiens neuroglobin; C. gigas globin-like; L. gigantea globin-like 7; C. gigas cytoglobin-1; L. gigantea globin-like 10; C. gigas cytoglobin-1-like; C. gigas neuroglobin-1; C. gigas neuroglobin (2); L. gigantea globin-like 3; L. gigantea globin-like 4; L. gigantea globin-like 9; C. gigas globin-like X1; C. ales haemoglobin-like CalesGl2; C. ales haemoglobin 1-like CalesGl3; A. trapezia heterodimer HB; A. trapezia dimer 2D; A. trapezia homodimer HD; A. trapezia beta globin BG; A. trapezia alpha globin AG; H. sapiens cytoglobin; D. rerio cytoglobin 2; G. gallus cytoglobin; D. rerio cytoglobin 1; G. gallus myoglobin; H. sapiens myoglobin; D. rerio myoglobin; G. gallus globin E; G. gallus HbA; H. sapiens HbA; D. rerio HbA; D. rerio HbB; G. gallus HbB; H. sapiens HbB]
Figure 3.8. Molecular Phylogenetic analyses by Maximum Likelihood method. Phylogenetic relationships of globin genes were here resolved for the following species: bivalves A. trapezia and C. ales, two mollusc model species L. gigantea and C. gigas, three vertebrate model species H. sapiens, G. gallus and D. rerio. The percentage of trees in which the associated taxa clustered together is shown next to the branches. A discrete Gamma distribution was used to model evolutionary rate differences among sites (3 categories (+G, parameter = 5.3023)). The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 3.8405 % sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site.
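To make the rate-variation settings in the caption concrete, the relative rates of the three discrete Gamma categories can be approximated from the reported shape parameter. The sketch below assumes the common mean-of-category convention for a Gamma distribution with mean 1 (shape equal to rate); whether the software used here applies exactly this convention is an assumption. The function names are hypothetical and only the shape value (5.3023), the number of categories (3) and the invariant-site percentage (3.8405 %) come from the caption.

```python
import math


def reg_lower_gamma(a, x):
    """Regularised lower incomplete gamma function P(a, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= x / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def gamma_quantile(a, p):
    """Quantile of Gamma(shape=a, scale=1) by bisection.

    Assumption: the upper bracket 10*a + 50 is ample for small shape values.
    """
    lo, hi = 0.0, 10.0 * a + 50.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if reg_lower_gamma(a, mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def discrete_gamma_rates(alpha, k):
    """Mean relative rate of each of k equal-probability Gamma categories."""
    # Category boundaries, expressed as x = alpha * rate.
    cuts = [0.0] + [gamma_quantile(alpha, i / k) for i in range(1, k)]
    cum = [reg_lower_gamma(alpha + 1.0, c) for c in cuts] + [1.0]
    return [k * (cum[i + 1] - cum[i]) for i in range(k)]


if __name__ == "__main__":
    ALPHA, K, P_INV = 5.3023, 3, 0.038405  # values from the Figure 3.8 caption
    for i, rate in enumerate(discrete_gamma_rates(ALPHA, K), start=1):
        weight = (1.0 - P_INV) / K  # share of all sites in this category
        print(f"category {i}: relative rate {rate:.3f}, site weight {weight:.4f}")
```

By construction the category rates average to 1, so a shape parameter well above 1, as reported here, corresponds to fairly modest rate differences among variable sites.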
Figure 3.9. Multiple alignments of globin protein sequences. All mollusc globin protein sequences used in the phylogeny represented in Figure 3.8 were aligned for sequence comparison and to identify residue conservation using MUSCLE (multiple sequence alignment with high accuracy and high throughput) in MEGA. Sequences are grouped by species and include A. trapezia, C. ales, C. gigas and L. gigantea. Residues conserved across all sequences and all species are indicated above by an asterisk (*).
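The asterisk annotation described for Figure 3.9 corresponds to a simple column-wise check of the alignment. The following minimal sketch shows one way such fully conserved positions could be identified. The three aligned sequences are invented toy data, not the globin sequences analysed here, and the function name is hypothetical.

```python
def conserved_columns(aligned):
    """Return (1-based position, residue) for columns identical in every row.

    Columns containing a gap character ('-') are never reported as conserved.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    hits = []
    for i in range(length):
        column = {seq[i] for seq in aligned}
        if len(column) == 1 and "-" not in column:
            hits.append((i + 1, aligned[0][i]))
    return hits


def marker_line(aligned):
    """Build the '*' annotation line shown above an alignment."""
    conserved = {pos for pos, _ in conserved_columns(aligned)}
    return "".join("*" if i + 1 in conserved else " "
                   for i in range(len(aligned[0])))


if __name__ == "__main__":
    # Toy alignment (hypothetical sequences for illustration only).
    toy = ["MF-HKAV", "LFAHRGL", "IFGH-SI"]
    print(marker_line(toy))
    for seq in toy:
        print(seq)
    print(conserved_columns(toy))
```

Applied to the real alignment, the same check would be expected to flag the phenylalanine and the two histidines reported in Section 3.3.7.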
3.4 Discussion
In this study, three full-length globin genes were identified in the transcriptome of C. ales, a bivalve mollusc. All predicted proteins of the candidate genes showed the two histidine residues and the phenylalanine residue characteristic of globin proteins (Bashford et al., 1987). This indicates that the proteins encoded by the globin genes identified here are capable of forming the hydrophobic pocket with a haeme group (Royer et al., 2001). The protein sequences, however, were highly divergent, with candidate gene one being most similar to vertebrate Ngb genes and a Ngb-like gene from C. gigas, while candidate genes two and three were most similar to Hb encoding genes from A. trapezia.
Of most interest here is that two globin genes from C. ales are sister to the known Hb encoding genes from A. trapezia. The finding that candidate genes CalesGl2 and CalesGl3 were most similar to genes encoding Hb proteins may indicate that these genes in C. ales represent a fifth independent origin of Hb in Bivalvia. Currently, Hb has been identified in species from the orders Arcoida, Veneroida, Carditoida and Solemyoida in Bivalvia. It remains to be determined if Hb encoding genes show patterns of molecular convergence as seen in
genes underlying convergently evolved phenotypic traits in echolocating animals (Parker et al., 2013) and marine mammals (Foote et al., 2015). The repeated evolution of Hb proteins in bivalves suggests that when the underlying mutations arise in ancestral globin genes they are strongly selected for. Fully supporting a fifth independent origin of Hb in Bivalvia will require functional protein work, which was outside the scope of the current project.
Candidate gene CalesGl3 is of particular interest as it contains multiple globin domains in a single globin gene. Previously, di-domain Hbs in bivalve molluscs have only been found in Barbatia reeveana and Barbatia lima from the order Arcoida (Naito et al., 1991; Suzuki et al., 1996). The sporadic occurrence of a multi-domain globin in another bivalve order suggests that the di-domain globin gene identified in C. ales has evolved independently of those in the order Arcoida. Multi-domain globin genes have been identified in other animal phyla including Annelida (Branchipolynoe symmytilida and Branchipolynoe seepensis; Projecto-Garcia et al., 2010) and Arthropoda (Artemia; Jellie et al., 1996). This indicates that multi-domain globin genes may arise relatively frequently, but this idea remains to be tested. In B. reeveana the di-domain gene was generated through incomplete gene duplication which resulted from unequal crossing over during meiosis (Naito et al., 1991). It would not be surprising if a similar mechanism generated the di-domain globin gene found in C. ales. In fact, both gene duplication and adaptive evolution have been implicated in the origin and diversity of many multi-domain proteins (Vogel et al., 2005), reinforcing the hypothesis of evolution by incomplete duplication in this case.
Candidate gene CalesGl1 was identified as a Ngb-like gene and was found in a weakly supported clade with vertebrate genes encoding Ngb proteins. It was not surprising to find Ngb-like genes in bivalves as these genes were present in the common ancestor of Eumetazoa (Cnidaria + Bilateria) (Roesner et al., 2005). The presence of only a single copy of a Ngb-like gene in both C. ales and A. trapezia was unexpected, as most invertebrate lineages have more than one Ngb-like gene (e.g., C. gigas has six copies; UniProt accession numbers: K1QVD6, K1Q9R1, K1QT48, K1QF07, K1RX51, K1R7G1). The low number observed in C. ales and A. trapezia could be associated with low levels of expression of Ngb-like genes (Schindelmeiser et al., 1979; Burmester & Hankeln, 2004) combined with the use of transcriptome sequencing to identify these genes in both species investigated. Vertebrate Ngbs are monomeric proteins involved in various physiological functions including oxygen supply, storage, and interactions with mitochondria in nerve cells (Roesner et al., 2005; Watanabe et al., 2012), but a number of functions of these proteins remain
uncharacterised. The predicted proteins identified here, when fully characterised, may help to better understand the function of neuroglobin-like proteins outside of vertebrate species.
3.5 Conclusion
These data provide the most comprehensive transcriptomic resource currently available for the bivalve mollusc C. ales. In this transcriptome, there are at least three genes encoding globin-like proteins. Candidate genes CalesGl2 and CalesGl3 are likely to encode Hb proteins based on sequence characteristics and similarities with existing sequences. Candidate gene CalesGl3 in particular contains two globin domains in a single globin gene, which has only been found in two other bivalve species in a different lineage. In addition to this, sequence similarities of the candidate globin genes identified here to Hb encoding genes in other bivalves further contribute to the theory of a common bivalve ancestor and independent evolution of Hbs in bivalve lineages. These results provide preliminary evidence for a possible fifth independent origin of Hbs in Bivalvia.
Chapter 4: General Discussion
Haemoglobin genes are highly diverse and have evolved independently in multiple metazoan lineages (Hardison, 1996; Mangum, 1998; Weber & Vinogradov, 2001; Hofmann et al., 2010a; Hoffmann et al., 2012). This diversity is particularly evident in invertebrates where Hbs show exceptional variation in their form, function and structural arrangement (Weber, 1980; Terwilliger, 1998; Weber & Vinogradov, 2001; Alyakrinskaya, 2002; Hoffmann et al., 2012). Such variability is reinforced by the patchy distribution of Hbs across invertebrate groups. Bivalve molluscs, for example, represent one such group, where Hbs have evolved independently across multiple lineages (Manwell, 1963; Terwilliger et al., 1978; Dando et al., 1985; Doeller et al., 1988; Suzuki et al., 2000). This has been shown to be the result of gene duplication and can be correlated to species living in environments with low or fluctuating oxygen availability. Knowledge of the sequence, role and evolution of Hbs is not only important for phylogenomics but is also crucial for the bivalve farming industry. Despite this, only a limited number of studies have examined the full range of Hb genes, or functionally characterised their expression in bivalves (Terwilliger et al., 1983; Angelini et al., 1998; Hourdez & Weber, 2005; Decker et al., 2014). To better understand the evolution and expression of Hbs in bivalves, Hb genes were investigated here in two distantly related species of bivalves. Firstly, patterns of tissue specific Hb gene expression were quantified in A. trapezia under submerged and aerially exposed treatments, which provided insights into the function and role of Hb encoding genes in this species. Secondly, by generating a new high-quality transcriptome resource for C. ales, the globin gene repertoire in this species was examined and a fifth independent origin of Hb in bivalve molluscs potentially identified.
4.1 Role of gene duplication in the current diversity of bivalve Hbs
The current diversity of Hb genes in bivalves has been hypothesised to be a result of repeated rounds of gene duplication driving evolution. One of the mechanisms responsible for this duplication has been demonstrated, by some authors, to be unequal crossing-over during meiosis (Naito et al., 1991; Dewilde et al., 1999; Kato et al., 2001). For example, the two-domain Hb (2D) from the blood clam B. lima is thought to have evolved from an unequal crossing over event between two ancestral globin genes and the loss of a stop codon (Suzuki
et al., 1996). Duplication through unequal crossing over is best illustrated in the datasets of this study by the di-domain globin gene found in C. ales. This di-domain gene has most likely evolved as a result of unequal crossing over that led to the deletion of a stop codon in a single domain gene, generating an extended open reading frame with two globin domains. Interestingly, most Arcidae species studied to date do not possess this di-domain gene. Therefore, as C. ales, a member of the family Limidae, and B. lima, a member of the Arcidae, both have a di-domain globin gene, it is likely that this gene arose independently in both species.
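The outcome proposed above, an extended open reading frame produced by loss of the stop codon between two tandem single-domain genes, can be illustrated with a toy sequence. The sketch below is a simplified model under stated assumptions: the sequences are invented, the two copies are assumed to be adjacent and in frame, and only the result of the event is modelled rather than the crossing-over process itself. The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_codons(dna):
    """Count codons read from position 0 up to the first in-frame stop codon."""
    count = 0
    for i in range(0, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return count
        count += 1
    return count


# Toy "single-domain" coding sequences (hypothetical, five codons each).
domain_1 = "ATG" + "GCT" * 4
domain_2 = "ATG" + "CAT" * 4

# Ancestral arrangement: two tandem genes, each ending in its own stop codon.
ancestral = domain_1 + "TAA" + domain_2 + "TGA"

# Derived arrangement: the first stop codon is lost, fusing the two frames.
fused = domain_1 + domain_2 + "TGA"

if __name__ == "__main__":
    print("ancestral first ORF:", orf_codons(ancestral), "codons")
    print("fused ORF:", orf_codons(fused), "codons")
```

In this toy case the reading frame doubles in length once the internal stop codon is removed, which is the pattern expected for a di-domain transcript such as CalesGl3.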
The presence of at least five globin genes that encode Hb proteins in A. trapezia, which form a single clade in the globin phylogeny for bivalves, supports the hypothesis that extensive duplication has played an important role in the evolution of this gene family. In arcid bivalves most species have more than a single Hb gene (Como & Thompson, 1980b; Mangum, 1997); however, A. trapezia is the first species examined to have more than four Hb genes. For example, three Hb genes have been identified in T. granosa and S. inaequivalvis, while B. reeveana has four distinct Hb genes (Ikeda-Saito et al., 1983; Petruzelli et al., 1985; Terwilliger, 1998; Royer et al., 2001; Bao & Lin, 2010). This indicates that either extensive lineage specific duplication has occurred independently in the species of this family, or that duplication events occurred in the common ancestor of the family Arcidae.
Investigation of expression patterns of globin genes in A. trapezia highlighted strong tissue specific expression, with some genes being predominantly expressed in erythrocytes. Often tissue or developmental specific expression in duplicated genes is associated with changes in the regulatory elements of these genes (Sankaran et al., 2008), which lead to higher expression in certain tissues or at certain developmental times. This has been demonstrated in mammals, as recently duplicated genes that share regulatory sequences were more likely to be co-expressed than duplicated genes that do not share regulatory sequences (Lan & Pritchard, 2016). It remains to be determined whether divergence in regulatory sequences is responsible for the tissue specific expression observed in A. trapezia globin genes encoding Hb proteins, but this study is the first to show tissue specific expression. This indicates that globin genes may have undergone neofunctionalisation following duplication in A. trapezia to be dominantly expressed in haemocytes. These data, together with data from C. ales, show that duplication and subsequent divergence have played a dominant role in globin gene evolution in bivalve molluscs.
Neofunctionalisation is the evolution of new functions in duplicated genes and has been well demonstrated in vertebrate Hb genes (Aguileta et al., 2004; Hoffmann & Storz, 2007; Opazo et al., 2008). The pattern observed in the results from Chapter 2 of this study is consistent with neofunctionalisation of clade B globin genes in A. trapezia, as two of these genes (AG and BG) encode a tetrameric Hb (Como & Thompson, 1980a), while another gene (HB) in this clade encodes a dimeric Hb (Petruzelli et al., 1985). Two of these three genes (AG and BG) are dominantly expressed in haemocytes and, as these cells are a novel phenotypic trait in Arcidae, this expression pattern is indicative of neofunctionalisation. Multiple examples exist that support the idea that neofunctionalisation of gene duplicates has played an important role in the generation of novel phenotypic traits (Osborn et al., 2003; Birchler & Veitia, 2010; Kaessmann, 2010). Of these, a well-known example is that of a neofunctionalised copy of an elastin gene which contributed to the evolution of the bulbus arteriosus, a novel organ found in the heart of teleost fish (Moriyama et al., 2016). While it is tempting to speculate that neofunctionalised Hb genes may contribute to evolutionary novelty in arcid bivalve erythrocytes, the extensive duplication of globin genes encoding Hb proteins may instead be associated with escape from adaptive conflict (Des Marais & Rausher, 2008; Storz et al., 2008) if Hb proteins undertake multiple roles within this cell type.
Globin and Hb genes from many species have been demonstrated to encode multifunctional proteins that serve not only in O2 transport, but also play a role in immune function and contribute to disease phenotypes, with haemocytes recognised as the immune effectors (Weatherall, 2001; Vinogradov & Moens, 2008; Donaghy, 2009; Bao et al., 2013b). For example, a duplicated gene encoding a dimeric Hb from the arcid bivalve T. granosa was upregulated following an immune challenge with V. parahaemolyticus (Bao et al., 2013b). This study demonstrates that the duplicated Hb genes of arcid bivalves may have multifunctional roles in both O2 transport and the innate immune system response. Such acquisition of new roles in pre-existing genes requires changes in the regulatory elements of these genes as well as in their coding regions. This is also necessary as these genes adapt to different living conditions such as hypoxic environments. For example, globin diversity in vesicomyid clams is believed to be a result of the modulation of monomers to accommodate oxygen levels in their surrounding environment. This is hypothesised to be the result of structural genetic diversity in populations or changes in globin gene transcription according to environmental changes (Carney et al., 2007). Based on findings in this study for the bivalve A. trapezia, we can argue that it is most likely not a result of
changes in the transcription of globin genes, at least for animals exposed to air for up to 12 hours, as expression levels of globins did not significantly change. Consequently, the extensive duplication of Hb encoding globin genes in A. trapezia may allow natural selection to drive specialisation of different members of this multifunctional gene family. Further functional studies will be required to determine if this idea is correct.
4.2 Globin gene evolution in hypoxic environments
The variability in functional properties of bivalve Hbs is thought to be associated with the wide range of environmental conditions that these organisms encounter in their habitat (O’Gower & Nicol, 1968; Terwilliger et al., 1978; Alyakrinskaya, 2002). Among these environmental conditions, one of the most common stresses affecting bivalves is hypoxia (Widdows et al., 1979; Weber, 1980; Booth et al., 1984; Burnett, 1997; Gobler et al., 2014). It develops in organisms when the depletion of oxygen through respiration is faster than its replenishment, and some bivalve species show better tolerance and survival rates than others when subject to extended periods of hypoxia (Officer et al., 1984; Rabalais et al., 2010). One of the reasons for this is their ability to close their shells, thereby avoiding oxygen depletion, and to switch from aerobic to anaerobic metabolism (Brooks et al., 1991; De Zwann et al., 1993). The present study showed that this may also be an adaptive feature in the blood clam A. trapezia.
Furthermore, the evolution of circulating Hbs in bivalve lineages has been hypothesised to allow for maximised O2 binding and transport during times of hypoxia (Projecto-Garcia et al., 2015), since both Hbs and Hcs occur in bivalve species that experience periodic hypoxia (Morse et al., 1986; Mangum et al., 1987; Terwilliger et al., 1988; Riggs, 1991; Weber & Vinogradov, 2001; Projecto-Garcia et al., 2015). The presence of multiple Hb proteins in bivalve species is also a common observation, in line with the many organisms that synthesise multiple oxygen carriers with different oxygen affinities to meet physiological demands (Mangum, 1997; Weber & Vinogradov, 2001; Projecto-Garcia et al., 2015). For example, the clam C. magnifica is found lodged in rock fissures outside hydrothermal vents and is exposed to both deep-sea water and vent fluid, meaning that it is frequently subject to chronic hypoxia (Berg, 1980). This species possesses an intracellular Hb with high O2 affinity, providing the O2 carrying and transport required to survive in such a challenging environment (Terwilliger et al., 1983; Scott & Fisher, 1995; Hourdez & Weber, 2005). The
deep-sea clams C. kaikoi, Calyptogena soyoae and Calyptogena tsubasa all possess two homo-dimeric Hbs with more than 90 % identity, of which HbI and HbII in C. kaikoi have been found to be involved in O2 storage under low O2 conditions in the deep sea rather than O2 transport (Suzuki et al., 2000; Kawano et al., 2003). Arcid bivalves are also frequently found in habitats with low levels of O2 such as intertidal zones or deep-sea waters and often experience prolonged periods of hypoxia (Arp et al., 1984; Abele-Oeschger & Oeschger, 1995; Terwilliger, 1998; Weber & Vinogradov, 2001). This is the case with the intertidal bivalve A. trapezia used in this study but also with other arcid bivalves such as Anadara kagoshimensis. As with A. trapezia, this species is found in sandy-muddy areas of the Indo-Pacific region and possesses haemoglobin-containing erythrocytes (Golovina et al., 2016). Compared to other bivalve species in the same habitat, A. kagoshimensis has also shown better tolerance to hypoxia and this has been attributed to the presence of Hbs (Holden et al., 1994). The multiple Hbs in these bivalves may be produced simultaneously and may have different functions such as O2 carrying or storage, or can be produced sequentially to follow changing conditions (Terwilliger, 1998). Different Hb structures confer these different functions and thus potential resistance to hypoxia in the habitats where the bivalves settle (Decker et al., 2016). Overall, the presence of Hbs seems to play a role in the adaptation of bivalves to hypoxia. Therefore, mutations that lead to the evolution of complex respiratory pigments such as Hb are likely selected for.
Since all blood clams seem to have originated from a common ancestor, globin structural variations and levels of expression are thought to have been influenced directly by the environmental conditions these clam species are subjected to, specifically oxygen concentration and availability (Kawano et al., 2003). The evidence from this study reinforces the idea that greater expression of Hbs in haemolymph confers a physiological advantage in tolerating hypoxia, as shown here by the expression levels obtained for the blood clam A. trapezia. Given the hypoxic nature of the intertidal environment where this bivalve is found, it is almost certain that a duplication event was driven by the need for oxygen availability during low tides. This is also consistent with the finding of Hb-like genes that may encode Hb proteins in C. ales, as this species lives in the tropical waters of the Indo-Pacific area at depths of 30–35 m (and sometimes up to 50 m) where both depth and temperature reduce the solubility of O2, creating a hypoxic environment (Garcia et al., 2005; Karstensen et al., 2008). Nonetheless, the idea that Hb has evolved more often in lineages that live in hypoxic environments requires further analysis before it can be supported.
4.3 The importance of globin gene evolution to aquaculture
Aquaculture is one of the fastest growing food-producing sectors and accounts for nearly 50 % of world consumption. The major groups currently produced in aquaculture include finfish, crustaceans and molluscs. Culture of molluscs contributed approximately 20 % of total aquaculture production in 2014 and this amount has been steadily increasing in recent years. Bivalve molluscs of the family Arcidae include a number of major fishery and aquaculture species such as T. granosa, S. inaequivalvis, S. broughtonii and A. trapezia (Donaghy et al., 2009; Bao et al., 2013b). Besides their importance as food sources, bivalve molluscs are frequently used as indicators of pollution and of the overall health of ecosystems. Some of the major issues encountered in clam farming are disease outbreaks, low survival rates due to environmental pollution and slow growth rates (Vuddhakul et al., 2006; Alkarkhi et al., 2008). Haemocytes found in bivalve molluscs have been shown to be involved in various biological functions (Donaghy et al., 2009) such as immune defence against bacteria and viruses (Bao et al., 2013b) but also detoxification. In fact, the role of haemoglobin in haemocytes goes beyond supporting aerobic metabolism, and in some species SNPs in specific Hb genes have been observed to correlate with disease resistance. Hbs in bivalves have been shown to be a source of antibacterial activity but are also responsible for eliminating harmful reactive oxygen species (ROS) and nitric oxide (NO) which may be present in polluted environments. Investigation of the haemoglobin genes in these bivalves, as well as their expression under environmental stress as done in this study for A. trapezia, contributes to our understanding of the functions of Hbs and haemocytes, which may provide new perspectives for disease control and resistance to pollution in cultures.
Aquaculture production systems are often classified into three general types: extensive ponds, intensive ponds and intensive recirculating tank and raceway systems (Ebeling, 2006). There is also a tendency for many aquaculture enterprises to intensify production using super-intensive systems, for example, super-intensive culture of Atlantic salmon and rainbow trout. Currently, dissolved oxygen is the most important factor limiting increased production in intensive systems. The study of haemoglobin gene expression under stress in A. trapezia and the transcriptome generated for C. ales may be used in future studies addressing this major issue. In particular, examining haemoglobin gene expression under different oxygen concentrations will provide much needed data about how molluscs cope with low dissolved oxygen in culture.
Other bivalves such as scallops are also used for farming. For example, in the scallop family Pectinidae, only about 10 species are currently cultured from the 400 living species present in this family (Shumway & Parsons, 2011). New sequence data for species is the ultimate resource for the introduction of new species in cultures or the addition of genetic material to existing species to increase their diversity and their performance in cultures (Guo, 2009). The transcriptome generated in this study for the scallop C. ales and the haemoglobin-like genes described can therefore be used as potential molecular markers for bivalve selection and breeding to improve aquaculture production. Overall, research based on transcriptomics and proteomics proves valuable as it increases the genetic data available to study molecular traits of interest for cultured species and especially to evaluate their disease susceptibility and resistance to environmental stresses.
4.4 Limitations of the study
Despite some limitations in the two studies conducted as part of this Master's thesis, the findings about the evolution and expression of Hb genes in bivalve molluscs are valid. One major limitation of the study is the use of transcriptome sequences for the identification of Hb genes in both A. trapezia and C. ales instead of full genome sequences. Consequently, the presence of only six and three full-length globin transcripts in A. trapezia and C. ales, respectively, may be an underestimation of the actual number of Hb genes in these species and should be viewed with some caution. Many functional genes are not captured by transcriptome sequencing because they are only expressed at specific developmental stages, in certain tissues or at very low levels (Yagil et al., 2005). This is therefore a limitation of using transcriptome sequencing in isolation to identify the number of different genes within gene families in a species. Complete genome sequencing was not feasible in the current study due to time and financial constraints.
4.5 Future research
Data presented here for A. trapezia provide insight into the multiple functions of Hbs. Possible future studies could consist of further validating Hb genes, including determining whether Hbs are translated into proteins or whether some might be pseudogenes. Data presented here for C. ales provide the most comprehensive transcriptomic resource currently available for this species and therefore allow numerous gene families to be examined in detail. Overall this resource should lay an important foundation for future genetic or genomic studies in this species. In future studies, complete genome sequences could also improve the detection of the entire complement of globin genes for these two bivalve species.
4.6 Conclusions
Overall, the evolution of Hbs in bivalves is intriguing due to the great diversity of proteins found across lineages in both structure and function. This study looked at the blood clam A. trapezia, which possesses five duplicated Hb encoding genes, and determined that expression levels of those five genes are much higher in haemolymph than in foot, gills, mantle or muscle. This validates the first hypothesis of this project, that these genes have undergone neofunctionalisation through gene duplication. Furthermore, the expression of those genes was not affected by short-term or long-term hypoxia; therefore it can be concluded that the neofunctionalisation acquired through gene duplication of existing Hb encoding genes may provide some evolutionary advantage for this bivalve species. Findings on the sensitivity and reaction of A. trapezia to hypoxia may also be a valuable indicator of the potential for bivalve populations to survive in challenging environments.
Additionally, next generation sequencing was used to obtain the expressed portion of the genome and present the first transcriptome for the species C. ales. Three globin-like encoding genes were found in the newly generated transcriptome for this bivalve species. The presence of a di-domain globin in particular represents the first in bivalves since the findings of a di-domain globin in B. lima and B. reeveana in the order Arcoida. This suggests that multi-domain globin genes may arise repeatedly through incomplete gene duplication. Although more investigation is required on the three candidate genes identified here, preliminary findings in this study support a fifth independent origin of Hb in bivalves and contribute to validating the second hypothesis of this project.
References
Abele-Oeschger, D., & Oeschger, R. (1995). Hypoxia-induced autoxidation of haemoglobin
in the benthic invertebrates Arenicola marina (Polychaeta) and Astarte borealis (Bivalvia)
and the possible effects of sulphide. Journal of Experimental Marine Biology and
Ecology, 187(1), 63–80. https://doi.org/10.1016/0022-0981(94)00172-A
Aguileta, G., Bielawski, J. P., & Yang, Z. (2004). Gene conversion and functional divergence
in the β-globin gene family. Journal of Molecular Evolution, 59(2), 177–189.
https://doi.org/10.1007/s00239-004-2612-0
Alkarkhi, F. A., Ismail, N., & Easa, A. M. (2008). Assessment of arsenic and heavy metal
contents in cockles (Anadara granosa) using multivariate statistical techniques. Journal
of hazardous materials, 150(3), 783-789. https://doi.org/10.1016/j.jhazmat.2007.05.035
Alyakrinskaya, I. O. (2002). Physiological and biochemical adaptations to respiration of
haemoglobin-containing hydrobionts. Biology Bulletin of the Russian Academy of
Sciences, 29(3), 268–283. https://doi.org/10.1023/A:1015438615417
Andrews, S. (2010). FastQC: A quality control tool for high throughput sequence data.
Angelini, E., Salvato, B., Muro, P. D., & Beltramini, M. (1998). Respiratory pigments of
Yoldia eightsi, an Antarctic bivalve. Marine Biology, 131(1), 15–23.
https://doi.org/10.1007/s002270050291
Antonini, E., & Chiancone, E. (1977). Assembly of multisubunit respiratory proteins. Annual
Review of Biophysics and Bioengineering, 6(1), 239–271.
https://doi.org/10.1146/annurev.bb.06.060177.001323
Arp, A. J., Childress, J. J., & Fisher, C. R. (1984). Metabolic and blood gas transport
characteristics of the hydrothermal vent bivalve Calyptogena magnifica. Physiological
Zoology, 57(6), 648–662.
Gilbert, A. T., & Thompson, E. O. (1984). Amino acid sequence of the beta-chain of the tetrameric
haemoglobin of the bivalve mollusc, Anadara trapezia. Australian Journal of Biological
Sciences, 38(3), 221–236. https://doi.org/10.1071/BI9800653
Baldwin, J., & Lee, A. K. (1978). Contributions of aerobic and anaerobic energy production
during swimming in the bivalve mollusc Limaria fragilis (family Limidae). Journal of
Comparative Physiology, 129(4), 361–364. https://doi.org/10.1007/BF00686994
Bao, Y. B., Wang, Q., Guo, X. M., & Lin, Z. H. (2013a). Structure and immune expression
analysis of haemoglobin genes from the blood clam Tegillarca granosa. Genetics and
Molecular Research, 12(3), 3110–3123. http://dx.doi.org/10.4238/2013.February.28.5
Bao, Y., Li, P., Dong, Y., Xiang, R., Gu, L., Yao, H., … & Lin, Z. (2013b). Polymorphism of
the multiple haemoglobins in blood clam Tegillarca granosa and its association with
disease resistance to Vibrio parahaemolyticus. Fish & Shellfish Immunology, 34(5), 1320–
1324. https://doi.org/10.1016/j.fsi.2013.02.022
Bao, Y., & Lin, Z. (2010). Generation, annotation, and analysis of ESTs from hemocyte of
the bloody clam, Tegillarca granosa. Fish & Shellfish Immunology, 29(5), 740–746.
https://doi.org/10.1016/j.fsi.2010.07.009
Bashford, D., Chothia, C., & Lesk, A. M. (1987). Determinants of a protein fold. Journal of
Molecular Biology, 196(1), 199–216. https://doi.org/10.1016/0022-2836(87)90521-3
Berg Jr, C. J. (1980). Description of living specimens of Calyptogena magnifica Boss and
Turner with notes on their distribution and ecology. Appendix 1. The giant white clam
from the Galapagos Rift, Calyptogena magnifica species novum. Malacologia, 20, 183–
185.
Bikard, D., Patel, D., Metté, C. L., Giorgi, V., Camilleri, C., Bennett, M. J., & Loudet, O.
(2009). Divergent Evolution of Duplicate Genes Leads to Genetic Incompatibilities Within
Arabidopsis thaliana. Science, 323(5914), 623–626.
https://doi.org/10.1126/science.1165917
Birchler, J. A., & Veitia, R. A. (2010). The gene balance hypothesis: implications for gene
regulation, quantitative traits and evolution. New Phytologist, 186(1), 54–62.
https://doi.org/10.1111/j.1469-8137.2009.03087.x
Bischof, J. M., Chiang, A. P., Scheetz, T. E., Stone, E. M., Casavant, T. L., Sheffield, V. C.,
& Braun, T. A. (2006). Genome-wide identification of pseudogenes capable of disease-
causing gene conversion. Human Mutation, 27(6), 545–552.
https://doi.org/10.1002/humu.20335
Blank, M., & Burmester, T. (2012). Widespread occurrence of N-terminal acylation in animal
globins and possible origin of respiratory globins from a membrane-bound ancestor.
Molecular Biology and Evolution, 29(11), 3553–3561.
https://doi.org/10.1093/molbev/mss164
Blank, M., Kiger, L., Thielebein, A., Gerlach, F., Hankeln, T., Marden, M. C., & Burmester,
T. (2011). Oxygen supply from the bird’s eye perspective: globin E is a respiratory protein
in the chicken retina. Journal of Biological Chemistry, 286(30), 26507–26515.
https://doi.org/10.1074/jbc.M111.224634
Booth, C. E., McDonald, D. G., & Walsh, P. J. (1984). Acid-base balance in the sea mussel,
Mytilus edulis. I. Effects of hypoxia and air-exposure on haemolymph acid-base status.
Marine Biology Letters, 5, 347–358.
Brooks, S. P. J., Zwaan, A. de, Thillart, G. van den, Cattani, O., Cortesi, P., & Storey, K. B.
(1991). Differential survival of Venus gallina and Scapharca inaequivalvis during anoxic
stress: Covalent modification of phosphofructokinase and glycogen phosphorylase during
anoxia. Journal of Comparative Physiology B, 161(2), 207–212.
https://doi.org/10.1007/BF00262885
Brunori, M., & Vallone, B. (2007). Neuroglobin, seven years after. Cellular and Molecular
Life Sciences, 64(10), 1259. https://doi.org/10.1007/s00018-007-7090-2
Burmester, T., & Hankeln, T. (2014). Function and evolution of vertebrate globins. Acta
Physiologica, 211(3), 501–514. https://doi.org/10.1111/apha.12312
Burmester, T., & Hankeln, T. (2009). What is the function of neuroglobin? Journal of
Experimental Biology, 212(10), 1423–1428. https://doi.org/10.1242/jeb.000729
Burmester, T., & Hankeln, T. (2004). Neuroglobin: A Respiratory Protein of the Nervous
System. Physiology, 19(3), 110–113. https://doi.org/10.1152/nips.01513.2003
Burmester, T., Weich, B., Reinhardt, S., & Hankeln, T. (2000). A vertebrate globin expressed
in the brain. Nature, 407(6803), 520–523. https://doi.org/10.1038/35035093
Burnett, L. E. (1997). The challenges of living in hypoxic and hypercapnic aquatic
environments. American Zoologist, 37(6), 633–640. https://doi.org/10.1093/icb/37.6.633
Cañestro, C., Albalat, R., Irimia, M., & Garcia-Fernàndez, J. (2013). Impact of gene gains,
losses and duplication modes on the origin and diversification of vertebrates. Seminars in
Cell & Developmental Biology, 24(2), 83–94.
https://doi.org/10.1016/j.semcdb.2012.12.008
Carney, S. L., Flores, J. F., Orobona, K. M., Butterfield, D. A., Fisher, C. R., & Schaeffer, S.
W. (2007). Environmental differences in haemoglobin gene expression in the
hydrothermal vent tubeworm, Ridgeia piscesae. Comparative Biochemistry and
Physiology Part B: Biochemistry and Molecular Biology, 146(3), 326–337.
https://doi.org/10.1016/j.cbpb.2006.11.002
Chung, M. C. M., & Ellerton, H. D. (1980). The physico-chemical and functional
properties of extracellular respiratory haemoglobins and chlorocruorins. Progress in
Biophysics and Molecular Biology, 35, 53–102.
https://doi.org/10.1016/0079-6107(80)90003-6
Como, P. F., & Thompson, E. O. P. (1980a). Amino acid sequence of the α-chain of the
tetrameric haemoglobin of the bivalve mollusc Anadara trapezia. Australian Journal of
Biological Sciences, 33(6), 653–664. https://doi.org/10.1071/BI9800653
Como, P. F., & Thompson, E. O. P. (1980b). Multiple haemoglobins of the bivalve mollusc
Anadara trapezia. Australian Journal of Biological Sciences, 33(6), 643–652.
https://doi.org/10.1071/BI9800643
Corti, P., Xue, J., Tejero, J., Wajih, N., Sun, M., Stolz, D. B., … & Gladwin, M. T. (2016).
Globin X is a six-coordinate globin that reduces nitrite to nitric oxide in fish red blood
cells. Proceedings of the National Academy of Sciences, 113(30), 8538–8543.
https://doi.org/10.1073/pnas.1522670113
Crenshaw, M. A., & Neff, J. M. (1969). Decalcification at the mantle-shell interface in
molluscs. American Zoologist, 9(3), 881–885. https://doi.org/10.1093/icb/9.3.881
Dando, P. R., Southward, A. J., Southward, E. C., Terwilliger, N. B., & Terwilliger, R. C.
(1985). Sulphur-oxidising bacteria and haemoglobin in gills of the bivalve mollusc Myrtea
spinifera. Retrieved from
http://agris.fao.org/agris-search/search.do?recordID=AV20120126232
Darawshe, S., Tsafadyah, Y., & Daniel, E. (1987). Quaternary structure of erythrocruorin
from the nematode Ascaris suum. Evidence for unsaturated haem-binding sites.
Biochemical Journal, 242(3), 689–694. https://doi.org/10.1042/bj2420689
Davenport, J., & Wong, T. M. (1986). Responses of the blood cockle Anadara granosa (L.)
(Bivalvia: Arcidae) to salinity, hypoxia and aerial exposure. Aquaculture, 56(2), 151–162.
https://doi.org/10.1016/0044-8486(86)90024-4
De Zwaan, A., Cattani, O., & Putzer, V. M. (1993). Sulfide and cyanide induced mortality and
anaerobic metabolism in the arcid blood clam Scapharca inaequivalvis. Comparative
Biochemistry and Physiology Part C: Comparative Pharmacology, 105(1), 49–54.
https://doi.org/10.1016/0742-8413(93)90056-Q
Decker, C., Zorn, N., Le Bruchec, J., Caprais, J. C., Potier, N., Leize-Wagner, E., … &
Andersen, A. C. (2016). Can the haemoglobin characteristics of vesicomyid clam species
influence their distribution in deep-sea sulfide-rich sediments? A case study in the Angola
Basin. Deep Sea Research Part II: Topical Studies in Oceanography.
http://dx.doi.org/10.1016/j.dsr2.2016.11.009
Decker, C., Zorn, N., Potier, N., Leize-Wagner, E., Lallier, F. H., Olu, K., & Andersen, A. C.
(2014). Globin’s structure and function in vesicomyid bivalves from the Gulf of Guinea
cold seeps as an adaptation to life in reduced sediments. Physiological and Biochemical
Zoology: Ecological and Evolutionary Approaches, 87(6), 855–869.
https://doi.org/10.1086/678131
Des Marais, D. L., & Rausher, M. D. (2008). Escape from adaptive conflict after duplication
in an anthocyanin pathway gene. Nature, 454(7205), 762–765.
https://doi.org/10.1038/nature07092
Dewilde, S., Angelini, E., Kiger, L., Marden, M., Beltramini, M., Salvato, B., & Moens, L.
(2003). Structure and function of the globin and globin gene from the Antarctic mollusc
Yoldia eightsi. Biochemical Journal, 370, 245–253. https://doi.org/10.1042/bj20020727
Dewilde, S., Van Hauwaert, M.-L., Peeters, K., Vanfleteren, J., & Moens, L. (1999). Daphnia
pulex didomain haemoglobin: structure and evolution of polymeric haemoglobins and
their coding genes. Molecular Biology and Evolution, 16.
https://doi.org/10.1093/oxfordjournals.molbev.a026211
Dixon, B., Walker, B., Kimmins, W., & Pohajdak, B. (1991). Isolation and sequencing of a
cDNA for an unusual haemoglobin from the parasitic nematode Pseudoterranova
decipiens. Proceedings of the National Academy of Sciences, 88(13), 5655–5659.
https://doi.org/10.1073/pnas.88.13.5655
Doeller, J. E., Kraus, D. W., Colacino, J. M., & Wittenberg, J. B. (1988). Gill Haemoglobin
May Deliver Sulfide to Bacterial Symbionts of Solemya velum (Bivalvia, Mollusca). The
Biological Bulletin, 175(3), 388–396.
Donaghy, L., Lambert, C., Choi, K. S., & Soudant, P. (2009). Hemocytes of the carpet shell
clam (Ruditapes decussatus) and the Manila clam (Ruditapes philippinarum): current
knowledge and future prospects. Aquaculture, 297(1), 10–24.
Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high
throughput. Nucleic Acids Research, 32(5), 1792–1797.
https://doi.org/10.1093/nar/gkh340
Efstratiadis, A., Posakony, J. W., Maniatis, T., Lawn, R. M., O’Connell, C., Spritz, R. A., …
& Blechl, A. E. (1980). The structure and evolution of the human β-globin gene family.
Cell, 21(3), 653–668. https://doi.org/10.1016/0092-8674(80)90429-8
Ekblom, R., & Galindo, J. (2011). Applications of next generation sequencing in molecular
ecology of non-model organisms. Heredity, 107(1), 1–15.
https://doi.org/10.1038/hdy.2010.152
FAO. (2009). Fisheries & Aquaculture - Fishery statistical collections - global aquaculture
production. Retrieved August 23, 2016, from
http://www.fao.org/fishery/statistics/global-aquaculture-production/en
Finn, R. D., Miller, B. L., Clements, J., & Bateman, A. (2014). iPfam: a database of protein
family and domain interactions found in the Protein Data Bank. Nucleic Acids Research,
42(D1), D364–D373. https://doi.org/10.1093/nar/gkt1210
Fisher, A., Comly, M., Do, R., Temarkin, L., Ghazanfari, A. F., & Mukherjee, A. B. (1984).
Two pools of β-endorphin-like immunoreactivity in blood: plasma and erythrocytes. Life
Sciences, 34(19), 1839–1846. https://doi.org/10.1016/0024-3205(84)90677-5
Flögel, U., Merx, M. W., Gödecke, A., Decking, U. K. M., & Schrader, J. (2001).
Myoglobin: A scavenger of bioactive NO. Proceedings of the National Academy of
Sciences, 98(2), 735–740. https://doi.org/10.1073/pnas.98.2.735
Flores, J. F., Fisher, C. R., Carney, S. L., Green, B. N., Freytag, J. K., Schaeffer, S. W., &
Royer, W. E. (2005). Sulfide binding is mediated by zinc ions discovered in the crystal
structure of a hydrothermal vent tubeworm haemoglobin. Proceedings of the National
Academy of Sciences of the United States of America, 102(8), 2713–2718.
https://doi.org/10.1073/pnas.0407455102
Foote, A. D., Liu, Y., Thomas, G. W. C., Vinař, T., Alföldi, J., Deng, J., … & Gibbs, R. A.
(2015). Convergent evolution of the genomes of marine mammals. Nature Genetics, 47(3),
272–275. https://doi.org/10.1038/ng.3198
Force, A., Lynch, M., Pickett, F. B., Amores, A., Yan, Y., & Postlethwait, J. (1999).
Preservation of Duplicate Genes by Complementary, Degenerative Mutations. Genetics,
151(4), 1531–1545.
Fuchs, C., Burmester, T., & Hankeln, T. (2006). The amphibian globin gene repertoire as
revealed by the Xenopus genome. Cytogenetic and Genome Research, 112(3–4), 296–306.
Furuta, H., & Kajita, A. (1983). Dimeric haemoglobin of the bivalve mollusc Anadara
broughtonii: complete amino acid sequence of the globin chain. Biochemistry, 22(4), 917–
922. https://doi.org/10.1021/bi00273a032
Gallant, J. R., Traeger, L. L., Volkening, J. D., Moffett, H., Chen, P.-H., Novina, C. D., … &
Sussman, M. R. (2014). Genomic basis for the convergent evolution of electric organs.
Science, 344(6191), 1522–1525. https://doi.org/10.1126/science.1254432
Garcia, H. E., Boyer, T. P., Levitus, S., Locarnini, R. A., & Antonov, J. (2005). On the
variability of dissolved oxygen and apparent oxygen utilization content for the upper
world ocean: 1955 to 1998. Geophysical Research Letters, 32(9).
https://doi.org/10.1029/2004GL022286
Gasteiger, E., Gattiker, A., Hoogland, C., Ivanyi, I., Appel, R. D., & Bairoch, A. (2003).
ExPASy: The proteomics server for in-depth protein knowledge and analysis. Nucleic
Acids Research, 31(13), 3784–3788.
Gobler, C. J., DePasquale, E. L., Griffith, A. W., & Baumann, H. (2014). Hypoxia and
acidification have additive and synergistic negative effects on the growth, survival, and
metamorphosis of early life stage bivalves. PLoS ONE, 9(1), e83648.
https://doi.org/10.1371/journal.pone.0083648
Golovina, I. V., Gostyukhina, O. L., & Andreyenko, T. I. (2016). Specific metabolic features
in tissues of the ark clam Anadara kagoshimensis. Russian Journal of Biological
Invasions, 7(2), 137–145.
González, V. L., Andrade, S. C. S., Bieler, R., Collins, T. M., Dunn, C. W., Mikkelsen, P. M.,
Taylor, J. D., & Giribet, G. (2015). A phylogenetic backbone for Bivalvia: an RNA-seq
approach. Proceedings of the Royal Society B, 282(1801), 20142332.
https://doi.org/10.1098/rspb.2014.2332
Goodman, M., Czelusniak, J., Koop, B. F., Tagle, D. A., & Slightom, J. L. (1987). Globins: A
case study in molecular phylogeny. Cold Spring Harbor Symposia on Quantitative
Biology, 52, 875–890. https://doi.org/10.1101/SQB.1987.052.01.096
Goossens, M., Dozy, A. M., Embury, S. H., Zachariades, Z., Hadjiminas, M. G.,
Stamatoyannopoulos, G., & Kan, Y. W. (1980). Triplicated alpha-globin loci in humans.
Proceedings of the National Academy of Sciences, 77(1), 518–521.
Götting, M., & Nikinmaa, M. (2015). More than haemoglobin – the unexpected diversity of
globins in vertebrate red blood cells. Physiological Reports, 3(2), e12284.
https://doi.org/10.14814/phy2.12284
Gow, A. J., Payson, A. P., & Bonaventura, J. (2005). Invertebrate haemoglobins and nitric
oxide: How heme pocket structure controls reactivity. Journal of Inorganic Biochemistry,
99(4), 903–911. https://doi.org/10.1016/j.jinorgbio.2004.12.001
Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. A., Amit, I., … &
Regev, A. (2011). Full-length transcriptome assembly from RNA-Seq data without a
reference genome. Nature Biotechnology, 29(7), 644–652.
https://doi.org/10.1038/nbt.1883
Gribaldo, S., Casane, D., Lopez, P., & Philippe, H. (2003). Functional divergence prediction
from evolutionary analysis: A case study of vertebrate haemoglobin. Molecular Biology
and Evolution, 20(11), 1754–1759. https://doi.org/10.1093/molbev/msg171
Grinich, N. P., & Terwilliger, R. C. (1980). The quaternary structure of an unusual high-
molecular-weight intracellular haemoglobin from the bivalve mollusc Barbatia reeveana.
Biochemical Journal, 189(1), 1–8. https://doi.org/10.1042/bj1890001
Grispo, M. T., Natarajan, C., Projecto-Garcia, J., Moriyama, H., Weber, R. E., & Storz, J. F.
(2012). Gene duplication and the evolution of haemoglobin isoform differentiation in
birds. Journal of Biological Chemistry, 287(45), 37647–37658.
https://doi.org/10.1074/jbc.M112.375600
Guo, X. (2009). Use and exchange of genetic resources in molluscan aquaculture. Reviews in
Aquaculture, 1(3–4), 251–259. https://doi.org/10.1111/j.1753-5131.2009.01014.x
Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., Bowden, J., … &
Regev, A. (2013). De novo transcript sequence reconstruction from RNA-seq using the
Trinity platform for reference generation and analysis. Nature Protocols, 8(8), 1494–1512.
https://doi.org/10.1038/nprot.2013.084
Halanych, K. M., & Passamaneck, Y. (2001). A brief review of metazoan phylogeny and
future prospects in hox-research. American Zoologist, 41(3), 629–639.
https://doi.org/10.1093/icb/41.3.629
Hankeln, T., Ebner, B., Fuchs, C., Gerlach, F., Haberkamp, M., Laufs, T. L., … & Burmester,
T. (2005). Neuroglobin and cytoglobin in search of their role in the vertebrate globin
family. Journal of Inorganic Biochemistry, 99(1), 110–119.
https://doi.org/10.1016/j.jinorgbio.2004.11.009
Hanscombe, O., Whyatt, D., Fraser, P., Yannoutsos, N., Greaves, D., Dillon, N., & Grosveld,
F. (1991). Importance of globin gene order for correct developmental expression. Genes &
Development, 5(8), 1387–1394. https://doi.org/10.1101/gad.5.8.1387
Hardison, R., Slightom, J. L., Gumucio, D. L., Goodman, M., Stojanovic, N., & Miller, W.
(1997). Locus control regions of mammalian β-globin gene clusters: combining
phylogenetic analyses and experimental results to gain functional insights. Gene, 205(1–
2), 73–94. https://doi.org/10.1016/S0378-1119(97)00474-5
Hardison, R. (1996). A brief history of haemoglobins: plant, animal, protist, and bacteria.
Proceedings of the National Academy of Sciences of the United States of America, 93(12),
5675.
Harper, E. M., & Skelton, P. W. (1993). The Mesozoic marine revolution and epifaunal
bivalves. Scripta Geologica, Special Issue 2, 127–153.
He, X., & Zhang, J. (2005). Rapid subfunctionalization accompanied by prolonged and
substantial neofunctionalization in duplicate gene evolution. Genetics, 169(2), 1157–1164.
https://doi.org/10.1534/genetics.104.037051
Herreid, C. F. (1980). Hypoxia in invertebrates. Comparative Biochemistry and Physiology
Part A: Physiology, 67(3), 311–320. https://doi.org/10.1016/S0300-9629(80)80002-8
Higgs, D. R., Old, J. M., Pressley, L., Clegg, J. B., & Weatherall, D. J. (1980). A novel
α-globin gene arrangement in man. Nature, 284(5757), 632–635.
https://doi.org/10.1038/284632a0
Hoffmann, F. G., Opazo, J. C., Hoogewijs, D., Hankeln, T., Ebner, B., Vinogradov, S. N., …
& Storz, J. F. (2012). Evolution of the globin gene family in deuterostomes: lineage-
specific patterns of diversification and attrition. Molecular Biology and Evolution, 29(7),
1735–1745. https://doi.org/10.1093/molbev/mss018
Hoffmann, F. G., Opazo, J. C., & Storz, J. F. (2011). Differential loss and retention of
cytoglobin, myoglobin, and globin-E during the radiation of vertebrates. Genome Biology
and Evolution, 3, 588–600. https://doi.org/10.1093/gbe/evr055
Hoffmann, F. G., Opazo, J. C., & Storz, J. F. (2010a). Gene cooption and convergent
evolution of oxygen transport haemoglobins in jawed and jawless vertebrates. Proceedings
of the National Academy of Sciences, 107(32), 14274–14279.
https://doi.org/10.1073/pnas.1006756107
Hoffmann, F. G., Storz, J. F., Gorr, T. A., & Opazo, J. C. (2010b). Lineage-specific patterns
of functional diversification in the α- and β-globin gene families of tetrapod vertebrates.
Molecular Biology and Evolution, 27(5), 1126–1138.
https://doi.org/10.1093/molbev/msp325
Hoffmann, F. G., & Storz, J. F. (2007). The αD-globin gene originated via duplication of an
embryonic α-like globin gene in the ancestor of tetrapod vertebrates. Molecular Biology
and Evolution, 24(9), 1982–1990. https://doi.org/10.1093/molbev/msm127
Hokamp, K., McLysaght, A., & Wolfe, K. H. (2003). The 2R hypothesis and the human
genome sequence. In A. Meyer & Y. Van de Peer (Eds.), Genome Evolution (pp. 95–110).
Springer Netherlands. Retrieved from
http://link.springer.com/chapter/10.1007/978-94-010-0263-9_10
Holden, J. A., Pipe, R. K., Quaglia, A., & Ciani, G. (1994). Blood cells of the arcid clam,
Scapharca inaequivalvis. Journal of the Marine Biological Association of the United
Kingdom, 74(2), 287–299.
Hoogewijs, D., Ebner, B., Germani, F., Hoffmann, F. G., Fabrizius, A., Moens, L., … &
Hankeln, T. (2011). Androglobin: A chimeric globin in metazoans that is preferentially
expressed in mammalian testes. Molecular Biology and Evolution, 29(4), 1105–1114.
https://doi.org/10.1093/molbev/msr246
Hourdez, S., & Lallier, F. H. (2006). Adaptations to hypoxia in hydrothermal-vent and cold-
seep invertebrates. Reviews in Environmental Science and Bio/Technology, 6(1–3), 143–
159. https://doi.org/10.1007/s11157-006-9110-3
Hourdez, S., & Weber, R. E. (2005). Molecular and functional adaptations in deep-sea
haemoglobins. Journal of Inorganic Biochemistry, 99(1), 130–141.
https://doi.org/10.1016/j.jinorgbio.2004.09.017
Hourdez, S., Lamontagne, J., Peterson, P., Weber, R. E., & Fisher, C. R. (2000).
Haemoglobin from a deep-sea hydrothermal-vent copepod. The Biological Bulletin,
199(2), 95–99.
Huang, Y., Niu, B., Gao, Y., Fu, L., & Li, W. (2010). CD-HIT Suite: a web server for
clustering and comparing biological sequences. Bioinformatics, 26(5), 680–682.
https://doi.org/10.1093/bioinformatics/btq003
Huerta-Cepas, J., Dopazo, J., Huynen, M. A., & Gabaldon, T. (2011). Evidence for short-time
divergence and long-time conservation of tissue-specific expression after gene duplication.
Briefings in Bioinformatics, 12(5), 442–448. https://doi.org/10.1093/bib/bbr022
Hurles, M. (2004). Gene duplication: the genomic trade in spare parts. PLoS Biology, 2(7),
e206. https://doi.org/10.1371/journal.pbio.0020206
Ikeda-Saito, M., Yonetani, T., Chiancone, E., Ascoli, F., Verzili, D., & Antonini, E. (1983).
Thermodynamic properties of oxygen equilibria of dimeric and tetrameric haemoglobins
from Scapharca inaequivalvis. Journal of Molecular Biology, 170(4), 1009–1018.
https://doi.org/10.1016/S0022-2836(83)80200-9
Ingram, V. M. (1961). Gene evolution and the haemoglobins. Nature, 189(4766), 704–708.
Innan, H., & Kondrashov, F. (2010). The evolution of gene duplications: classifying and
distinguishing between models. Nature Reviews Genetics, 11(2), 97–108.
https://doi.org/10.1038/nrg2689
Jellie, A. M., Tate, W. P., & Trotman, C. N. A. (1996). Evolutionary history of introns in a
multidomain globin gene. Journal of Molecular Evolution, 42(6), 641–647.
https://doi.org/10.1007/BF02338797
Jokumsen, A., & Fyhn, H. J. (1982). The influence of aerial exposure upon respiratory and
osmotic properties of haemolymph from two intertidal mussels, Mytilus edulis L. and
Modiolus modiolus L. Journal of Experimental Marine Biology and Ecology, 61(2), 189–
203. https://doi.org/10.1016/0022-0981(82)90008-9
Kaessmann, H. (2010). Origins, evolution, and phenotypic impact of new genes. Genome
Research, 20(10), 1313–1326. https://doi.org/10.1101/gr.101386.109
Karstensen, J., Stramma, L., & Visbeck, M. (2008). Oxygen minimum zones in the eastern
tropical Atlantic and Pacific oceans. Progress in Oceanography, 77(4), 331–350.
https://doi.org/10.1016/j.pocean.2007.05.009
Kato, K., Tokishita, S., Mandokoro, Y., Kimura, S., Ohta, T., Kobayashi, M., & Yamagata,
H. (2001). Two-domain haemoglobin gene of the water flea Moina macrocopa:
duplication in the ancestral Cladocera, diversification, and loss of a bridge intron. Gene,
273(1), 41–50. https://doi.org/10.1016/S0378-1119(01)00569-8
Kawano, K., Iwasaki, N., & Suzuki, T. (2003). Notable diversity in haemoglobin expression
patterns among species of the deep-sea clam, Calyptogena. Cellular and Molecular Life
Sciences CMLS, 60(9), 1952–1956. https://doi.org/10.1007/s00018-003-3184-7
Koch, J., & Burmester, T. (2016). Membrane-bound globin X protects the cell from reactive
oxygen species. Biochemical and Biophysical Research Communications, 469(2), 275–
280. https://doi.org/10.1016/j.bbrc.2015.11.105
Koch, J., Lüdemann, J., Spies, R., Last, M., Amemiya, C. T., & Burmester, T. (2016).
Unusual diversity of myoglobin genes in the lungfish. Molecular Biology and Evolution,
msw159. https://doi.org/10.1093/molbev/msw159
Koch, L. G., & Britton, S. L. (2008). Aerobic metabolism underlies complexity and capacity.
The Journal of Physiology, 586(1), 83–95. https://doi.org/10.1113/jphysiol.2007.144709
Kugelstadt, D., Haberkamp, M., Hankeln, T., & Burmester, T. (2004). Neuroglobin,
cytoglobin, and a novel, eye-specific globin from chicken. Biochemical and Biophysical
Research Communications, 325(3), 719–725. https://doi.org/10.1016/j.bbrc.2004.10.080
Lan, X., & Pritchard, J. K. (2016). Coregulation of tandem duplicate genes slows evolution of
subfunctionalization in mammals. Science, 352(6288), 1009–1013.
https://doi.org/10.1126/science.aad8411
Linzen, B., Soeter, N. M., Riggs, A. F., Schneider, H. J., Schartau, W., & Moore, M. D.
(1985). The structure of arthropod hemocyanins. Science, 229(4713), 519–524.
https://doi.org/10.1126/science.4023698
Liu, Y., Cotton, J. A., Shen, B., Han, X., Rossiter, S. J., & Zhang, S. (2010). Convergent
sequence evolution between echolocating bats and dolphins. Current Biology, 20(2), R53–
R54. https://doi.org/10.1016/j.cub.2009.11.058
Lynch, M., & Force, A. (2000). The probability of duplicate gene preservation by
subfunctionalization. Genetics, 154(1), 459–473.
Maeda, N., & Fitch, W. M. (1982). Frog heart monomeric haemoglobin. In Methods in
Protein Sequence Analysis (pp. 569–570). Humana Press. Retrieved from
http://link.springer.com/chapter/10.1007/978-1-4612-5832-2_63
Mangum, C. P. (1998). Major Events in the Evolution of the Oxygen Carriers. American
Zoologist, 38(1), 1–13.
Mangum, C. P. (1997). Introduction The Red Blood Cell Haemoglobins Distribution and
localization Molecular structure. (Vol. 2).
Mangum, C. P. (1992). Respiratory function of the red blood cell haemoglobins of six animal
phyla. In C. P. Mangum (Ed.), Blood and Tissue Oxygen Carriers (pp. 117–149). Berlin,
Heidelberg: Springer Berlin Heidelberg. Retrieved from
http://dx.doi.org/10.1007/978-3-642-76418-9_5
Mangum, C. P., Scott, J. L., Miller, K. I., Van Holde, K. E., & Morse, M. P. (1987). Bivalve
hemocyanin: structural, functional, and phylogenetic relationships. The Biological
Bulletin, 173(1), 205–221.
Mangum, C. P., Woodin, B. R., Bonaventura, C., Sullivan, B., & Bonaventura, J. (1975). The
role of coelomic and vascular haemoglobin in the annelid family Terebellidae.
Comparative Biochemistry and Physiology Part A: Physiology, 51(2), 281–294.
https://doi.org/10.1016/0300-9629(75)90372-2
Mann, R. G., Fisher, W. K., Gilbert, A. T., & Thompson, E. O. P. (1986). Genetic variation
of the dimeric haemoglobin of the bivalve mollusc Anadara trapezia. Australian Journal
of Biological Sciences, 39(2), 109–116.
Manning, A. M., Trotman, C. N. A., & Tate, W. P. (1990). Evolution of a polymeric globin in
the brine shrimp Artemia. Nature, 348(6302), 653–656. https://doi.org/10.1038/348653a0
Manwell, C. (1963). The chemistry and biology of haemoglobin in some marine clams—I.
Distribution of the pigment and properties of the oxygen equilibrium. Comparative
Biochemistry and Physiology, 8(3), 209–218.
https://doi.org/10.1016/0010-406X(63)90125-7
Michaelidis, B., Haas, D., & Grieshaber, M. K. (2005). Extracellular and intracellular acid‐
base status with regard to the energy metabolism in the oyster Crassostrea gigas during
exposure to air. Physiological and Biochemical Zoology: Ecological and Evolutionary
Approaches, 78(3), 373–383. https://doi.org/10.1086/430223
Mikkelsen, P. M., & Bieler, R. (2003). Systematic revision of the western Atlantic file clams,
Lima and Ctenoides (Bivalvia: Limoida: Limidae). Invertebrate Systematics, 17(5), 667–
710.
Moleirinho, A., Seixas, S., Lopes, A. M., Bento, C., Prata, M. J., & Amorim, A. (2013).
Evolutionary constraints in the β-globin cluster: The signature of purifying selection at the
δ-globin (HBD) locus and its role in developmental gene regulation. Genome Biology and
Evolution, 5(3), 559–571. https://doi.org/10.1093/gbe/evt029
Montes-Rodríguez, I. M., Rivera, L. E., López-Garriga, J., & Cadilla, C. L. (2016).
Characterization and expression of the Lucina pectinata oxygen and sulfide binding
haemoglobin genes. PLoS ONE, 11(1), e0147977.
https://doi.org/10.1371/journal.pone.0147977
Moriyama, Y., Ito, F., Takeda, H., Yano, T., Okabe, M., Kuraku, S., … & Koshiba-Takeuchi,
K. (2016). Evolution of the fish heart by sub/neofunctionalization of an elastin
gene. Nature Communications, 7.
Morse, M. P., Meyhofer, E., Otto, J. J., & Kuzirian, A. M. (1986). Hemocyanin respiratory
pigment in bivalve mollusks. Science, 231(4743), 1302–1304.
https://doi.org/10.1126/science.3945826
Motoyama, H., Komiya, T., Thuy, L. T. T., Tamori, A., Enomoto, M., Morikawa, H., … &
Kawada, N. (2014). Cytoglobin is expressed in hepatic stellate cells, but not in
myofibroblasts, in normal and fibrotic human liver. Laboratory Investigation, 94(2), 192–
207. https://doi.org/10.1038/labinvest.2013.135
Naito, Y., Riggs, C. K., Vandergon, T. L., & Riggs, A. F. (1991). Origin of a “bridge” intron
in the gene for two domain globin. Proceedings of the National Academy of Sciences of
the USA, 88(15). https://doi.org/10.1073/pnas.88.15.6672
Natarajan, C., Projecto-Garcia, J., Moriyama, H., Weber, R. E., Muñoz-Fuentes, V., Green,
A. J., … & Storz, J. F. (2015). Convergent Evolution of Haemoglobin Function in High-
Altitude Andean Waterfowl Involves Limited Parallelism at the Molecular Sequence
Level. PLoS Genetics, 11(12), e1005681. https://doi.org/10.1371/journal.pgen.1005681
Negrisolo, E., Pallavicini, A., Barbato, R., Dewilde, S., Ghiretti-Magaldi, A., Moens, L., &
Lanfranchi, G. (2001). The evolution of extracellular haemoglobins of annelids,
vestimentiferans, and pogonophorans. Journal of Biological Chemistry, 276(28), 26391–
26397. https://doi.org/10.1074/jbc.M100557200
Nicol, P. I., & O’Gower, A. K. (1967). Haemoglobin variation in Anadara trapezia. Nature,
216, 684. https://doi.org/10.1038/216684a0
Norman, J. D., Danzmann, R. G., Glebe, B., & Ferguson, M. M. (2011). The genetic basis of
salinity tolerance traits in Arctic charr (Salvelinus alpinus). BMC Genetics, 12(1), 81.
https://doi.org/10.1186/1471-2156-12-81
Officer, C. B., Biggs, R. B., Taft, J. L., Cronin, L. E., Tyler, M. A., & Boynton, W. R. (1984).
Chesapeake Bay anoxia: origin, development, and significance. Science, 223(6).
O’Gower, A., & Nicol, P. I. (1968). A latitudinal cline of haemoglobins in a bivalve mollusc.
Heredity, 23(4), 485–491.
Ohno, S., Wolf, U., & Atkin, N. B. (1968). Evolution from Fish to Mammals by Gene
Duplication. Hereditas, 59(1), 169–187.
https://doi.org/10.1111/j.1601-5223.1968.tb02169.x
Oleksiewicz, U., Liloglou, T., Field, J. K., & Xinarianos, G. (2011). Cytoglobin: biochemical,
functional and clinical perspective of the newest member of the globin family. Cellular
and Molecular Life Sciences, 68(23), 3869–3883.
https://doi.org/10.1007/s00018-011-0764-9
Opazo, J. C., Lee, A. P., Hoffmann, F. G., Toloza-Villalobos, J., Burmester, T., Venkatesh,
B., & Storz, J. F. (2015). Ancient duplications and expression divergence in the globin
gene superfamily of vertebrates: Insights from the elephant shark genome and
transcriptome. Molecular Biology and Evolution, 32(7), 1684–1694.
https://doi.org/10.1093/molbev/msv054
Opazo, J. C., Hoffmann, F. G., & Storz, J. F. (2008). Genomic evidence for independent
origins of β-like globin genes in monotremes and therian mammals. Proceedings of the
National Academy of Sciences, 105(5), 1590–1595.
https://doi.org/10.1073/pnas.0710531105
Osborn, T. C., Pires, J. C., Birchler, J. A., Auger, D. L., Chen, Z. J., Lee, H.-S., … &
Martienssen, R. A. (2003). Understanding mechanisms of novel gene expression in
polyploids. Trends in Genetics, 19(3), 141–147.
https://doi.org/10.1016/S0168-9525(03)00015-5
Parker, J., Tsagkogeorga, G., Cotton, J. A., Liu, Y., Provero, P., Stupka, E., & Rossiter, S. J.
(2013). Genome-wide signatures of convergent evolution in echolocating mammals.
Nature, 502(7470), 228–231. https://doi.org/10.1038/nature12511
Perutz, M. F. (1979). Regulation of oxygen affinity of haemoglobin: influence of structure of
the globin on the heme iron. Annual Review of Biochemistry, 48(1), 327–386.
Pesce, A., Bolognesi, M., Bocedi, A., Ascenzi, P., Dewilde, S., Moens, L., … & Burmester,
T. (2002). Neuroglobin and cytoglobin. EMBO Reports, 3(12), 1146–1151.
Petruzzelli, R., Goffredo, B. M., Barra, D., Bossa, F., Boffi, A., Verzili, D., … & Chiancone,
E. (1985). Amino acid sequence of the cooperative homodimeric haemoglobin from the
mollusc Scapharca inaequivalvis and topology of the intersubunit contacts. FEBS Letters,
184(2), 328–332. https://doi.org/10.1016/0014-5793(85)80632-3
Piro, M. C., Gambacurta, A., Basili, P., & Ascoli, F. (1998). The exon/intron organization of
the globin gene of Scapharca inaequivalvis homodimeric haemoglobin: unusual intron
homology with other bivalve mollusc globin genes. Gene, 221(1), 45–49.
https://doi.org/10.1016/S0378-1119(98)00442-9
Piro, M. C., Gambacurta, A., & Ascoli, F. (1996). Scapharca inaequivalvis tetrameric
haemoglobin α and β chains: cDNA sequencing and genomic organization. Journal of
Molecular Evolution, 43(6), 594–601. https://doi.org/10.1007/BF02202107
Prentis, P. J., & Pavasovic, A. (2014). The Anadara trapezia transcriptome: A resource for
molluscan physiological genomics. Marine Genomics, 18, 113–115.
https://doi.org/10.1016/j.margen.2014.08.004
Projecto-Garcia, J., Jollivet, D., Mary, J., Lallier, F. H., Schaeffer, S. W., & Hourdez, S.
(2015). Selective forces acting during multi-domain protein evolution: the case of
multi-domain globins. SpringerPlus, 4(1), 1–14. https://doi.org/10.1186/s40064-015-1124-2
Projecto-Garcia, J., Zorn, N., Jollivet, D., Schaeffer, S. W., Lallier, F. H., & Hourdez, S.
(2010). Origin and evolution of the unique tetra-domain haemoglobin from the
hydrothermal vent scale worm Branchipolynoe. Molecular Biology and Evolution, 27(1),
143–152. https://doi.org/10.1093/molbev/msp218
Rabalais, N. N., Diaz, R. J., Levin, L. A., Turner, R. E., Gilbert, D., & Zhang, J. (2010).
Dynamics and distribution of natural and human-caused hypoxia. Biogeosciences, 7(2),
585–619.
Rawat, R. (2010). Anatomy of Mollusca. Mittal Publications.
Read, K. R. (1966). Molluscan haemoglobin and myoglobin (Vol. 2). New York: Academic
Press.
Reitman, M., Grasso, J. A., Blumenthal, R., & Lewit, P. (1993). Primary sequence, evolution,
and repetitive elements of the Gallus gallus (chicken) β-globin cluster. Genomics, 18(3),
616–626. https://doi.org/10.1016/S0888-7543(05)80364-7
Riggs, A. F. (1991). Aspects of the origin and evolution of non-vertebrate haemoglobins.
American Zoologist, 31(3), 535–545. https://doi.org/10.1093/icb/31.3.535
Roeder, G. S. (1983). Unequal crossing-over between yeast transposable elements. Molecular
and General Genetics MGG, 190(1), 117–121.
Roesner, A., Fuchs, C., Hankeln, T., & Burmester, T. (2005). A globin gene of ancient
evolutionary origin in lower vertebrates: evidence for two distinct globin families in
animals. Molecular Biology and Evolution, 22(1), 12–20.
https://doi.org/10.1093/molbev/msh258
Ronda, L., Bettati, S., Henry, E. R., Kashav, T., Sanders, J. M., Royer, W. E., & Mozzarelli,
A. (2013). Tertiary and quaternary allostery in tetrameric haemoglobin from Scapharca
inaequivalvis. Biochemistry, 52(12), 2108–2117. https://doi.org/10.1021/bi301620x
Royer, W. E., Zhu, H., Gorr, T. A., Flores, J. F., & Knapp, J. E. (2005). Allosteric
haemoglobin assembly: diversity and similarity. Journal of Biological Chemistry, 280(30),
27477–27480. https://doi.org/10.1074/jbc.R500006200
Royer Jr, W. E., Knapp, J. E., Strand, K., & Heaslet, H. A. (2001). Cooperative
haemoglobins: conserved fold, diverse quaternary assemblies and allosteric mechanisms.
Trends in Biochemical Sciences, 26(5), 297–304.
https://doi.org/10.1016/S0968-0004(01)01811-4
Royer, W. E., Strand, K., Heel, M. van, & Hendrickson, W. A. (2000). Structural hierarchy in
erythrocruorin, the giant respiratory assemblage of annelids. Proceedings of the National
Academy of Sciences, 97(13), 7107–7111. https://doi.org/10.1073/pnas.97.13.7107
Royer, W. E., Love, W. E., & Fenderson, F. F. (1985). Cooperative dimeric and tetrameric
clam haemoglobins are novel assemblages of myoglobin folds. Nature, 316(6025),
277–280. https://doi.org/10.1038/316277a0
Sankaran, V. G., Menne, T. F., Xu, J., Akie, T. E., Lettre, G., Van Handel, B., … & Orkin, S.
H. (2008). Human fetal haemoglobin expression is regulated by the developmental
stage-specific repressor BCL11A. Science, 322(5909), 1839–1842.
https://doi.org/10.1126/science.1165409
Schindelmeiser, I., Kuhlmann, D., & Nolte, A. (1979). Localization and characterization of
hemoproteins in the central nervous tissue of some gastropods. Comparative Biochemistry
and Physiology Part B: Comparative Biochemistry, 64(2), 149–154.
https://doi.org/10.1016/0305-0491(79)90153-6
Schwarze, K., Campbell, K. L., Hankeln, T., Storz, J. F., Hoffmann, F. G., & Burmester, T.
(2014). The globin gene repertoire of lampreys: convergent evolution of haemoglobin and
myoglobin in jawed and jawless vertebrates. Molecular Biology and Evolution, 31(10),
2708–2721. https://doi.org/10.1093/molbev/msu216
Scott, K. M., & Fisher, C. R. (1995). Physiological ecology of sulfide metabolism in
hydrothermal vent and cold seep vesicomyid clams and vestimentiferan tube worms.
American Zoologist, 35(2), 102–111. https://doi.org/10.1093/icb/35.2.102
Shumway, S. E., & Parsons, G. J. (2011). Scallops: Biology, Ecology and Aquaculture.
Elsevier.
Sidell, B. D., & O’Brien, K. M. (2006). When bad things happen to good fish: the loss of
haemoglobin and myoglobin expression in Antarctic icefishes. Journal of Experimental
Biology, 209(10), 1791–1802. https://doi.org/10.1242/jeb.02091
Singh, S., Canseco, D. C., Manda, S. M., Shelton, J. M., Chirumamilla, R. R., Goetsch, S. C.,
… & Mammen, P. P. A. (2014). Cytoglobin modulates myogenic progenitor cell viability
and muscle regeneration. Proceedings of the National Academy of Sciences, 111(1),
E129–E138. https://doi.org/10.1073/pnas.1314962111
Smith, M. H. (1967). Occurrence of haemoglobin in some molluscs. Comparative
Biochemistry and Physiology, 20(1), 361–364.
https://doi.org/10.1016/0010-406X(67)90755-4
Souza, P. C. de, & Bonilla-Rodriguez, G. O. (2007). Fish haemoglobins. Brazilian Journal of
Medical and Biological Research, 40(6), 769–778.
https://doi.org/10.1590/S0100-879X2007000600004
Stapley, J., Reger, J., Feulner, P. G. D., Smadja, C., Galindo, J., Ekblom, R., … Slate, J.
(2010). Adaptation genomics: the next generation. Trends in Ecology & Evolution, 25(12),
705–712. https://doi.org/10.1016/j.tree.2010.09.002
Storz, J. F., Bridgham, J. T., Kelly, S. A., & Garland, T. (2015). Genetic approaches in
comparative and evolutionary physiology. American Journal of Physiology - Regulatory,
Integrative and Comparative Physiology, 309(3), R197–R214.
https://doi.org/10.1152/ajpregu.00100.2015
Storz, J. F., Hoffmann, F. G., Opazo, J. C., & Moriyama, H. (2008). Adaptive functional
divergence among triplicated α-globin genes in rodents. Genetics, 178(3), 1623–1638.
https://doi.org/10.1534/genetics.107.080903
Storz, J. F., Opazo, J. C., & Hoffmann, F. G. (2013). Gene duplication, genome duplication,
and the functional diversification of vertebrate globins. Molecular Phylogenetics and
Evolution, 66(2), 469–478. https://doi.org/10.1016/j.ympev.2012.07.013
Stothard, P. (2000). The sequence manipulation suite: JavaScript programs for analyzing and
formatting protein and DNA sequences. BioTechniques, 28(6), 1102, 1104.
Strand, K., Knapp, J. E., Bhyravbhatla, B., & Royer Jr, W. E. (2004). Crystal structure of the
haemoglobin dodecamer from Lumbricus erythrocruorin: allosteric core of giant annelid
respiratory complexes. Journal of Molecular Biology, 344(1), 119–134.
https://doi.org/10.1016/j.jmb.2004.08.094
Su, C.-Y., Kemp, H. A., & Moens, C. B. (2014). Cerebellar development in the absence of
Gbx function in zebrafish. Developmental Biology, 386(1), 181–190.
https://doi.org/10.1016/j.ydbio.2013.10.026
Sullivan, G. (1961). Functional morphology, micro-anatomy, and histology of the “Sydney
cockle” Anadara trapezia (Deshayes) (Lamellibranchia: Arcidae). Australian Journal of
Zoology, 9(2), 219–257.
Surm, J. M., Prentis, P. J., & Pavasovic, A. (2015). Comparative analysis and distribution of
omega-3 lcPUFA biosynthesis genes in marine molluscs. PLoS ONE, 10(8), e0136301.
https://doi.org/10.1371/journal.pone.0136301
Suzuki, T., Kawamichi, H., Ohtsuki, R., Iwai, M., & Fujikura, K. (2000). Isolation and
cDNA-derived amino acid sequences of haemoglobin and myoglobin from the deep-sea
clam Calyptogena kaikoi. Biochimica et Biophysica Acta (BBA)-Protein Structure and
Molecular Enzymology, 1478(1), 152–158.
Suzuki, T., Kawasaki, Y., Arita, T., & Nakamura, A. (1996). Two-domain haemoglobin of
the blood clam Barbatia lima resulted from the recent gene duplication of the
single-domain delta chain. Biochemical Journal, 313, 561–566.
Suzuki, T., & Arita, T. (1995). Two-domain haemoglobin from the blood clam, Barbatia
lima. The cDNA-derived amino acid sequence. Journal of Protein Chemistry, 14(7),
499–502.
Suzuki, T., Nakamura, A., Satoh, Y., Inai, C., Furukohri, T., & Arita, T. (1992). Primary
structure of chain I of the heterodimeric haemoglobin from the blood clam Barbatia
virescens. Journal of Protein Chemistry, 11(6), 629–633.
https://doi.org/10.1007/BF01024963
Tamura, K., Stecher, G., Peterson, D., Filipski, A., & Kumar, S. (2013). MEGA6: Molecular
Evolutionary Genetics Analysis version 6.0. Molecular Biology and Evolution, 30(12),
2725–2729. https://doi.org/10.1093/molbev/mst197
Terwilliger, N. B. (1998). Functional adaptations of oxygen-transport proteins. Journal of
Experimental Biology, 201(8), 1085–1098.
Terwilliger, N. B., Terwilliger, R. C., Meyhöfer, E., & Morse, M. P. (1988). Bivalve
hemocyanins—a comparison with other molluscan hemocyanins. Comparative
Biochemistry and Physiology Part B: Comparative Biochemistry, 89(1), 189–195.
https://doi.org/10.1016/0305-0491(88)90282-9
Terwilliger, R. C., Terwilliger, N. B., & Arp, A. (1983). Thermal vent clam (Calyptogena
magnifica) haemoglobin. Science, 219(4587), 981–983.
https://doi.org/10.1126/science.219.4587.981
Terwilliger, R. C. (1980). Structures of invertebrate haemoglobins. American Zoologist,
20(1), 53–67. https://doi.org/10.1093/icb/20.1.53
Terwilliger, R. C., Terwilliger, N. B., & Schabtach, E. (1978). Extracellular haemoglobin of
the clam, Cardita borealis (Conrad): An unusual polymeric haemoglobin. Comparative
Biochemistry and Physiology Part B: Comparative Biochemistry, 59(1), 9–14.
https://doi.org/10.1016/0305-0491(78)90262-6
Titchen, D. A., Glenn, W. K., Nassif, N., Thompson, A. R., & Thompson, E. O. P. (1991). A
minor globin gene of the bivalve mollusc Anadara trapezia. Biochimica et Biophysica
Acta (BBA) - Gene Structure and Expression, 1089(1), 61–67.
https://doi.org/10.1016/0167-4781(91)90085-Z
Torres-Mercado, E., Renta, J. Y., Rodríguez, Y., López-Garriga, J., & Cadilla, C. L. (2003).
The cDNA-derived amino acid sequence of haemoglobin II from Lucina pectinata.
Journal of Protein Chemistry, 22(7–8), 683–690.
https://doi.org/10.1023/B:JOPC.0000008734.44356.b7
Toulmond, A., & Tchernigovtzeff, C. (1984). Ventilation and respiratory gas exchanges of
the lugworm Arenicola marina (L.) as functions of ambient PO2 (20–700 torr).
Respiration Physiology, 57(3), 349–363. https://doi.org/10.1016/0034-5687(84)90083-5
Trent, R. J., Bowden, D. K., Old, J. M., Wainscoat, J. S., Clegg, J. B., & Weatherall, D. J.
(1981). A novel rearrangement of the human β-like globin gene cluster. Nucleic Acids
Research, 9(24), 6723–6734. https://doi.org/10.1093/nar/9.24.6723
van der Burg, C. A., Prentis, P. J., Surm, J. M., & Pavasovic, A. (2016). Insights into the
innate immunome of actiniarians using a comparative genomic approach. BMC Genomics,
17, 850. https://doi.org/10.1186/s12864-016-3204-2
Vernimmen, D. (2014). Uncovering enhancer functions using the α-globin locus. PLoS
Genetics, 10(10), e1004668. https://doi.org/10.1371/journal.pgen.1004668
Vinogradov, S. N. (1985). The structure of invertebrate extracellular haemoglobins
(erythrocruorins and chlorocruorins). Comparative Biochemistry and Physiology Part B:
Comparative Biochemistry, 82(1), 1–15. https://doi.org/10.1016/0305-0491(85)90120-8
Vinogradov, S. N., & Moens, L. (2008). Diversity of globin function: enzymatic, transport,
storage, and sensing. Journal of Biological Chemistry, 283(14), 8773–8777.
https://doi.org/10.1074/jbc.R700029200
Vogel, C., Teichmann, S. A., & Pereira-Leal, J. (2005). The relationship between domain
duplication and recombination. Journal of Molecular Biology, 346(1), 355–365.
https://doi.org/10.1016/j.jmb.2004.11.050
Vuddhakul, V., Soboon, S., Sunghiran, W., Kaewpiboon, S., Chowdhury, A., Ishibashi, M.,
… & Nishibuchi, M. (2006). Distribution of virulent and pandemic strains of Vibrio
parahaemolyticus in three molluscan shellfish species (Meretrix meretrix, Perna viridis,
and Anadara granosa) and their association with foodborne disease in southern
Thailand. Journal of Food Protection, 69(11), 2615–2620.
Wajcman, H., Kiger, L., & Marden, M. C. (2009). Structure and function evolution in the
superfamily of globins. Comptes Rendus Biologies, 332(2–3), 273–282.
https://doi.org/10.1016/j.crvi.2008.07.026
Wang, Y., Coleman-Derr, D., Chen, G., & Gu, Y. Q. (2015). OrthoVenn: a web server for
genome wide comparison and annotation of orthologous clusters across multiple species.
Nucleic Acids Research, 43(W1), W78–W84. https://doi.org/10.1093/nar/gkv487
Wang, W. X., & Widdows, J. (1991). Physiological responses of mussel larvae Mytilus edulis
to environmental hypoxia and anoxia. Marine Ecology Progress Series, 70, 223–236.
Watanabe, S., Takahashi, N., Uchida, H., & Wakasugi, K. (2012). Human neuroglobin
functions as an oxidative stress-responsive sensor for neuroprotection. Journal of
Biological Chemistry, 287(36), 30128–30138. https://doi.org/10.1074/jbc.M112.373381
Wawrowski, A., Gerlach, F., Hankeln, T., & Burmester, T. (2011). Changes of globin
expression in the Japanese medaka (Oryzias latipes) in response to acute and chronic
hypoxia. Journal of Comparative Physiology. B, Biochemical, Systemic, and
Environmental Physiology, 181(2), 199–208. https://doi.org/10.1007/s00360-010-0518-2
Weatherall, D. J. (2001). Phenotype—genotype relationships in monogenic disease: lessons
from the thalassaemias. Nature Reviews Genetics, 2(4), 245–255.
https://doi.org/10.1038/35066048
Weber, R. E., & Vinogradov, S. N. (2001). Nonvertebrate haemoglobins: functions and
molecular adaptations. Physiological Reviews, 81(2), 569–628.
Weber, R. E. (1980). Functions of invertebrate haemoglobins with special reference to
adaptations to environmental hypoxia. American Zoologist, 20(1), 79–101.
https://doi.org/10.1093/icb/20.1.79
Widdows, J., Bayne, B. L., Livingstone, D. R., Newell, R. I. E., & Donkin, P. (1979).
Physiological and biochemical responses of bivalve molluscs to exposure to air.
Comparative Biochemistry and Physiology Part A: Physiology, 62(2), 301–308.
https://doi.org/10.1016/0300-9629(79)90060-4
Witeska, M. (2013). Erythrocytes in teleost fishes: a review. Zoology and Ecology, 23(4),
275–281. https://doi.org/10.1080/21658005.2013.846963
Wittenberg, J. B., & Wittenberg, B. A. (2003). Myoglobin function reassessed. Journal of
Experimental Biology, 206(12), 2011–2020. https://doi.org/10.1242/jeb.00243
Wray, G. A., Hahn, M. W., Abouheif, E., Balhoff, J. P., Pizer, M., Rockman, M. V., &
Romano, L. A. (2003). The evolution of transcriptional regulation in eukaryotes.
Molecular Biology and Evolution, 20(9), 1377–1419.
https://doi.org/10.1093/molbev/msg140
Wray, G. A., Levinton, J. S., & Shapiro, L. H. (1996). Molecular evidence for deep
Precambrian divergences among metazoan phyla. Science, 274(5287), 568–573.
Yagil, C., Hubner, N., Monti, J., Schulz, H., Sapojnikov, M., Luft, F. C., … & Yagil, Y.
(2005). Identification of hypertension-related genes through an integrated
genomic-transcriptomic approach. Circulation Research, 96(6), 617–625.
https://doi.org/10.1161/01.RES.0000160556.52369.61
Ye, J., Fang, L., Zheng, H., Zhang, Y., Chen, J., Zhang, Z., … Wang, J. (2006). WEGO: a
web tool for plotting GO annotations. Nucleic Acids Research, 34(Web Server issue),
W293–W297. https://doi.org/10.1093/nar/gkl031
Zhang, G., Li, C., Li, Q., Li, B., Larkin, D. M., Lee, C., … & Wang, J. (2014). Comparative
genomics reveals insights into avian genome evolution and adaptation. Science,
346(6215), 1311–1320. https://doi.org/10.1126/science.1251385
Zhang, J. (2003). Evolution by gene duplication: an update. Trends in Ecology & Evolution,
18(6), 292–298. https://doi.org/10.1016/S0169-5347(03)00033-8
Appendices
Appendix A: Poster presented at the annual Lorne Genome conference 2015