From genes to nutrition: lipid metabolism in the Brazilian teleost tambaqui (Colossoma macropomum)

Renato Barbosa Ferraz

Instituto de Ciências Biomédicas Abel Salazar, 2020

Thesis submitted to the Instituto de Ciências Biomédicas Abel Salazar, Universidade do Porto, in candidacy for the degree of Doctor in Animal Science, specialty in Genetics and Breeding.

Supervisor: Professor Luís Filipe Castro, Professor at the Faculdade de Ciências, Universidade do Porto.

Co-supervisor: Dr. Rodrigo Ozório, Researcher at CIIMAR.

Co-supervisor: Professor Ana Lúcia Salaro, Professor at the Universidade Federal de Viçosa (Brazil).

Financial support

This study was supported by CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil) through a doctoral scholarship awarded to Renato Barbosa Ferraz (process number 201864/2015-0).

It was also supported by COMPETE 2020, Portugal 2020 and the European Union through the ERDF (grant number 031342), and by FCT through national funds (PTDC/CTA-AMB/31342/2017) and the strategic project UID/Multi/04423/2019 granted to CIIMAR.

The research reported in this thesis was conducted at:

1) Instituto de Ciências Biomédicas Abel Salazar (ICBAS), Universidade do Porto (Portugal), 2) Centro Interdisciplinar de Investigação Marinha e Ambiental (CIIMAR), Universidade do Porto (Portugal) and 3) Departamento de Biologia Animal, Universidade Federal de Viçosa (Brazil).

Acknowledgements

The opportunity to learn is fundamental to any professional career. I believe that, in the past, Brazil worked to make such opportunities more equal by opening up its universities and providing scholarships, mainly at doctoral level. This is not a policy without return, but an opportunity to develop scientists. The path I have followed over the last few years has been one of great professional growth, extending my knowledge beyond Animal Science, and it was only possible because I was awarded a doctoral scholarship by the now-extinct Ciências sem Fronteiras programme (CNPq, Brazil). As a product of free, high-quality Brazilian public education, my first thanks in this thesis go to the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) for the scholarship to study abroad during my doctorate.

To my supervisor, Filipe Castro, for helping me build a work plan that gradually took shape and that concludes with this thesis. Thank you very much for all the support over these last years, beyond academic needs, and for all the advice, suggestions and help. One word I will keep with me, beyond everything I have learned in these years, is what he always says: “Seguimos!” (“We carry on!”).

To my co-supervisor, Dr. Rodrigo Ozório, for agreeing to supervise this thesis and for welcoming us at the Instituto de Ciências Biomédicas Abel Salazar. Thank you for all the support and encouragement, especially with the bureaucratic processes.

To my co-supervisor, Professor Ana Lúcia Salaro, for agreeing to supervise this thesis. She has been guiding me not only during the doctorate but for a long time. Thank you for making the Laboratório de Nutrição e Produção de Peixes of the Universidade Federal de Viçosa available for the growth trial of this thesis.

To Dr. Óscar Monroig, for hosting me during the last year of my doctorate at the Instituto de Acuicultura de Torre de la Sal, CSIC, in Spain. Thank you for contributing to a large part of this work, and for all the comments and constructive criticism on the manuscripts submitted for publication.

I also thank my friends at AGE, who helped me so much in learning molecular biology techniques. In particular, I thank Mónica Lopes-Marques for her great contribution to the construction and analysis of the synteny maps, André Machado for contributing the bioinformatics techniques used in this work, and André Modesto for carrying out the growth trial. I also leave a special thank-you to all my friends at LANUCE, which was my home at CIIMAR. I thank all my friends in Porto, all my friends at the Residência Alberto Amaral for their company throughout these years, and all the family and friends in Brazil who, in one way or another, helped me complete this thesis.

Last but not least, I thank the institutions that took part in this project: the Instituto de Ciências Biomédicas Abel Salazar (ICBAS), Universidade do Porto, for hosting the doctorate; the Centro Interdisciplinar de Investigação Marinha e Ambiental (CIIMAR), for being my host institution; the Laboratório de Nutrição e Produção de Peixes, Universidade Federal de Viçosa (UFV, Brazil), where the fish growth trial was carried out under the supervision of Prof. Ana Lúcia Salaro; and the Instituto de Acuicultura de Torre de la Sal, CSIC, for hosting my exchange in the final year under the supervision of Dr. Óscar Monroig. And finally, once again, I thank the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), the funding institution of this project.

To God, and to everyone, my sincere thanks for helping me through this stage!

Index

Financial support
Acknowledgements
Index
List of figures
List of tables
Publication list
List of acronyms
Abstract
Summary

Chapter 1. General Introduction
1.1. Aquaculture in Brazil
1.2. Tambaqui: an omnivore freshwater fish
1.3. Long chain polyunsaturated fatty acids (LC-PUFA) in aquaculture
1.4. Alternative source of oils in aquaculture
1.5. Genomics in nutrition and aquaculture
1.6. Molecular Resources in Tambaqui
1.7. Objectives
1.8. References

Chapter 2. A complete enzymatic capacity for long-chain polyunsaturated fatty acid biosynthesis is present in the Amazonian teleost tambaqui, Colossoma macropomum
2.1. Abstract
2.2. Introduction
2.3. Materials and methods
2.4. Results
2.5. Discussion
2.6. References

Chapter 3. From the Amazon: a comprehensive transcriptome of the tambaqui, Colossoma macropomum
3.1. Abstract
3.2. Introduction
3.3. Material and methods
3.4. Results and discussion
3.5. References

Chapter 4. Regulation of gene expression associated to LC-PUFA metabolism in juvenile tambaqui (Colossoma macropomum) fed different dietary oil sources
4.1. Abstract
4.2. Introduction
4.3. Materials and methods
4.4. Results
4.5. Discussion
4.6. References

Chapter 5. The fatty acid elongation genes, elovl4a and elovl4b, are present and functional in the genome of tambaqui (Colossoma macropomum)
5.1. Abstract
5.2. Introduction
5.3. Materials and methods
5.4. Results
5.5. Discussion
5.6. References

Chapter 6. The repertoire of the elongases of very long-chain fatty acids gene family is conserved in the tambaqui (Colossoma macropomum) genome, and includes members of the novel Elovl8
6.1. Abstract
6.2. Introduction
6.3. Materials and methods
6.4. Results
6.5. Discussion
6.6. References

Chapter 7. General discussion
7.1. LC- and VLC-PUFA biosynthesis in tambaqui
7.2. Expression of LC-PUFA metabolism genes in juvenile tambaqui
7.3. Genomic resources in tambaqui
7.4. Final conclusions and future perspectives
7.5. References

Supplementary material
Supplementary Tables
Supplementary Figures


List of figures

Figure 1.1. Aquaculture contribution to total fish production (excluding aquatic plants) (FAO 2018).

Figure 1.2. Aquaculture production of major producing regions and major producers of main species groups, 2001–2016 (FAO 2018).

Figure 1.3. Quantity (thousand tons) and percentage distribution of fish production in Brazil, according to species or groups of fish, in increasing order of production - Brazil - 2016. Adapted from IBGE 2017.

Figure 1.4. Tambaqui (Colossoma macropomum).

Figure 1.5. Biosynthetic pathways of LC-PUFA from the precursors linoleic acid (18:2n-6) and α-linolenic acid (18:3n-3). Desaturation reactions are catalyzed by fatty acyl desaturases (Fads) and elongation reactions, denoted with “Elo,” are catalyzed by elongation of very long-chain fatty acid (Elovl) proteins. β-ox, partial β-oxidation.

Figure 2.1. Maximum likelihood phylogenetic analysis of Fads amino acid sequences rooted with the invertebrate clade. Numbers at nodes indicate branch support in posterior probabilities calculated using aBayes. C. macropomum Fads2 studied herein is highlighted. Accession numbers for all Fads sequences are available in Supplementary Table 2.2.

Figure 2.2. Maximum likelihood phylogenetic analysis of Elovl amino acid sequences rooted with the Elovl4 sequences. Numbers at nodes indicate branch support in posterior probabilities calculated using aBayes. C. macropomum sequences (Elovl2 and Elovl5) studied herein are highlighted. Accession numbers for all Elovl sequences are available in Supplementary Table 2.3.

Figure 2.3. The biosynthetic pathways of long-chain (C20–24) polyunsaturated fatty acids in C. macropomum predicted from the activities of Fads2, Elovl2 and Elovl5 measured in yeast. Desaturation reactions are indicated as “Δx”, whereas elongation reactions are indicated as “Elo”. LA, linoleic acid (18:2n-6); ALA, α-linolenic acid (18:3n-3); ARA, arachidonic acid; EPA, eicosapentaenoic acid; DHA, docosahexaenoic acid; β-ox, beta-oxidation.

Figure 3.1. Quality assessment and BLASTx analysis of the gold standard transcriptome of C. macropomum. a) Transcript length distribution; b) Number of transcripts (isoforms) per gene; c) Homologous gene-species distribution; d) E-value and similarity distribution.

Figure 3.2. Histogram of the clusters of orthologous groups (COG) of the C. macropomum transcriptome.

Figure 3.3. Functional classification of the C. macropomum transcriptome in three Gene Ontology (GO) categories: biological process (blue), molecular function (grey) and cellular component (red).

Figure 3.4. PPAR signaling pathway with unigenes detected in the C. macropomum transcriptome. Unigenes identified in our transcriptome are shown in green boxes; genes not detected are shown in white.

Figure 3.5. Maximum likelihood phylogenetic analysis of the PPAR (PPARαa, PPARαb, PPARβa, PPARβb and PPARγ) amino acid sequences. Values at nodes represent posterior probabilities. The PPAR tree was rooted with the sequence of the invertebrate Ciona intestinalis.

Fig. 4.1. Experimental set up of the nutrition assay: fish and diets, procedure, sample collection, RNA extraction and cDNA synthesis and real-time PCR experiments.

Fig. 4.2. Functional annotation of the tambaqui brain transcriptome. (a) Venn diagram showing the overlap of hits against five databases: NCBI-nr, NCBI-nt, SwissProt, UniRef90 and Pfam. (b) Percentage and number of transcripts annotated per source database.

Fig. 4.3. Expression of genes involved in long-chain polyunsaturated fatty acid (LC-PUFA) biosynthesis. Expression of fads2 in liver (A) and brain (B), elovl2 in liver (C) and brain (D), elovl5 in liver (E) and brain (F). Relative gene expression is shown as averages of ΔΔCt values normalized to FO5%, determined by real-time quantitative PCR. Different letters indicate significant differences between diets (ANOVA, P<0.05), in the following order: oil source (FO or VO) and level (5% or 10%). A significant interaction between oil source and lipid level is indicated by “*”.

Fig. 4.4. Expression of genes involved in the regulation of fatty acid metabolism: peroxisome proliferator-activated receptors (PPAR). Expression of pparαa in liver (A) and brain (B), pparαb in liver (C) and brain (D), pparβa in liver (E) and brain (F), pparβb in liver (G) and brain (H) and pparγ in liver (I) and brain (J). Relative gene expression is shown as averages of ΔΔCt values normalized to FO5%, determined by real-time quantitative PCR. Different letters indicate significant differences between diets (ANOVA, P<0.05), in the following order: oil source (FO or VO) and level (5% or 10%). A significant interaction between oil source and lipid level is indicated by “*”.

Fig. 4.5. Expression of genes involved in the regulation of lipid biosynthesis and catabolism. Expression of liver X receptor (lxrα) in liver (A) and brain (B), sterol regulatory element binding protein 1 (srebp-1) in liver (C) and brain (D), lipoprotein lipase (lpl) in liver (E) and brain (F) and fatty acid synthase (fas) in liver (G) and brain (H). Relative gene expression is shown as averages of ΔΔCt values normalized to FO5%, determined by real-time quantitative PCR. Different letters indicate significant differences between diets (ANOVA, P<0.05), in the following order: oil source (FO or VO) and level (5% or 10%).

Figure 5.1. A. Genomic loci and exon organization of the C. macropomum elovl4a and elovl4b genes. B. Maximum likelihood phylogenetic tree comparing C. macropomum Elovl4a and Elovl4b with elongase proteins from other species, rooted with Lepisosteus oculatus. Numbers at nodes indicate branch support in posterior probabilities. Accession numbers for all Elovl4 sequences are available in Supplementary Table 2.

Figure 5.2. Comparison of the deduced C. macropomum amino acid (AA) sequences of Elovl4a and Elovl4b with those of other species. A - For Elovl4a, D. rerio (NP_957090.1), P. nattereri (XP_017565830.1), O. niloticus (XP_003443720.1) and C. macropomum (this work) were used. B - For Elovl4b, D. rerio (NP_956266.1), P. nattereri (XP_017574113.1), O. niloticus (XP_003440669.1) and C. macropomum (this work) were used. The AA sequences were aligned using BioEdit. Identical residues are shaded black and similar residues are shaded grey. Indicated are the conserved HXXHH histidine box motif and the ER retrieval signal at the carboxyl terminus, RXKXX (Elovl4a) and KXKXX (Elovl4b); “*” represents “Q” (glutamine) in position −5 from the HXXHH motif.

Figure 6.1. Phylogenetic tree comparing the deduced amino acid sequences of Colossoma macropomum elongases (Elovl1a, Elovl1b, Elovl2, Elovl3, Elovl4a, Elovl4b, Elovl5, Elovl6, Elovl7a, Elovl7b, Elovl8a and Elovl8b) with sequences from other teleost and mammalian species. The Ciona intestinalis Elovl6 was included in the analysis as an outgroup sequence to construct the rooted tree. “*” represents the proteins identified in this work.

Figure 6.2. Comparison of the deduced amino acid (AA) sequences of Elovl8a and Elovl8b from C. macropomum with Elovl8a D. rerio (NP_001070061.1) and Elovl8b D. rerio (NP_001019609.2). The AA sequences were aligned using BioEdit. Identical residues are shaded black and similar residues are shaded grey. Indicated are the conserved HXXHH histidine box motif, where “*” represents “Q” (glutamine) in position −5 from the HXXHH, and the ER retrieval signal.

Figure 6.3. Gene annotation of Colossoma macropomum genomic scaffold and comparative synteny maps. Panels A-G contain synteny maps of Elovl1a, Elovl1b, Elovl2, Elovl4a, Elovl7a, Elovl7b, Elovl8a and Elovl8b respectively. Target elongase is represented in red, neighbouring cross-species conserved genes are represented in colour. Genes related by teleost genome duplication are underlined.

List of tables

Table 2.1. Functional characterization of the tambaqui (C. macropomum) Fads2 in yeast S. cerevisiae. Yeast were transformed with the C. macropomum fads2 ORF and grown in the presence of Δ6 (18:2n-6 and 18:3n-3), Δ8 (20:2n-6 and 20:3n-3), Δ5 (20:3n-6 and 20:4n-3), and Δ4 (22:4n-6 and 22:5n-3) fatty acid (FA) substrates. Desaturation of 24:5n-3 was tested by co-expressing both D. rerio elovl2 and C. macropomum fads2 in S. cerevisiae (Oboh et al., 2017). Conversions in both expression systems were calculated according to the formula [all product areas/(all product areas + substrate area)] × 100.

Table 2.2. Functional characterization of the tambaqui (C. macropomum) Elovl5 and Elovl2 elongases in yeast S. cerevisiae. Yeast were transformed with the C. macropomum elovl5 and elovl2 ORF sequences and grown in the presence of exogenously added fatty acid (FA) substrates. Conversions by Elovl5 and Elovl2 were calculated according to the formula [all product areas/(all product areas + substrate area)] × 100.

Table 3.1. Transrate and Trinity statistics of the original, filtered and gold standard transcriptome assembly of liver transcriptome of C. macropomum.

Table 3.2. KEGG pathways of endocrine system with the number of unigenes mapped by C. macropomum gold standard transcriptome.

Table 4.1. Composition and analyses of the experimental diets.

Table 4.2. Fatty acid composition of the experimental diets.

Table 4.3. Nucleotide sequences of Pygocentrus nattereri used to search for Colossoma macropomum sequences in the RNA-Seq data, and primers used for real-time PCR gene expression analysis.

Table 4.4. Statistics of brain raw and protein coding transcriptome assembly in C. macropomum.

Table 4.5. Growth performance of Colossoma macropomum juveniles fed diets containing different oil sources (FO and VO) and levels for 9 weeks.

Table 5.1. General stats of Colossoma macropomum draft genome assembly.

Table 5.2. Functional characterisation of Elovl4b elongase of Colossoma macropomum by heterologous expression in the yeast Saccharomyces cerevisiae. Data are presented as the percentage conversions of polyunsaturated fatty acid (FA) substrates calculated according to the formula [areas of first product and longer chain products / (areas of all products with longer chain than substrate + substrate area)] × 100.

Table 5.3. Selected saturated fatty acids (percentage of total fatty acids) of yeast Saccharomyces cerevisiae transformed with either the empty pYES2 vector (Control) or the Colossoma macropomum elovl4 ORF sequences (pYES2-elovl4a or pYES2-elovl4b). Results are means ± SD (n=3). Significant statistical differences between transgenic yeast expressing each elovl4 gene and its corresponding control are indicated with an asterisk (Student t-test, P < 0.05).
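The percentage-conversion formula quoted in the captions of Tables 2.1 and 2.2, [all product areas/(all product areas + substrate area)] × 100, can be illustrated with a minimal Python sketch. The function name `conversion_percent` and the peak areas below are hypothetical and chosen only for illustration; they are not data from this thesis.

```python
def conversion_percent(product_areas, substrate_area):
    """Percentage conversion of a fatty acid substrate.

    Implements [all product areas / (all product areas + substrate area)] x 100.
    Inputs are chromatogram peak areas in the same (arbitrary) units.
    """
    total_products = sum(product_areas)
    denominator = total_products + substrate_area
    if denominator <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * total_products / denominator


# Hypothetical peak areas (arbitrary units), not measured values.
example = conversion_percent([30.0, 10.0], 60.0)
print(f"{example:.1f}% of the substrate was converted")
```

Summing every product peak in the numerator means that products of successive reactions on the same substrate all count towards the conversion of that substrate.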

Publication list

During the PhD work, the candidate actively participated in other research projects, which entailed additional publications not included in this thesis but included in this list:

Ferraz, Renato B.; Kabeya, Naoki; Lopes-Marques, Mónica; Machado, André M.; Ribeiro, Ricardo A.; Salaro, Ana L.; Ozório, Rodrigo; Castro, L. Filipe C.; Monroig, Óscar. A complete enzymatic capacity for long-chain polyunsaturated fatty acid biosynthesis is present in the Amazonian teleost tambaqui, Colossoma macropomum. Comparative Biochemistry and Physiology B-Biochemistry & Molecular Biology, v. 227, p. 90-97, 2018.

Machado, André; Torresen, Ole; Kabeya, Naoki; Couto, Alvarina; Petersen, Bent; Felício, Mónica; Campos, Paula; Fonseca, Elza; Bandarra, Narcisa; Lopes-Marques, Mónica; Ferraz, Renato; Ruivo, Raquel; Fonseca, Miguel; Jentoft, Sissel; Monroig, Óscar; da Fonseca, Rute; Castro, L. Filipe C. “Out of the Can”: A Draft Genome Assembly, Liver Transcriptome, and Nutrigenomics of the European Sardine, Sardina pilchardus. Genes, v. 9, p. 485, 2018.

Machado, André M*.; Ferraz, Renato*; Ribeiro, Ricardo do Amaral; Ozório, Rodrigo; Castro, L. Filipe C. From the Amazon: A comprehensive liver transcriptome dataset of the teleost fish tambaqui, Colossoma macropomum. Data in Brief, v. 23, p. 103751, 2019. * joint first author.

Peixoto, Maria J.; Ferraz, Renato; Magnoni, Leonardo J.; Pereira, Rui; Gonçalves, José F.; Calduch-Giner, Josep; Pérez-Sánchez, Jaume; Ozório, Rodrigo O. A. Protective effects of seaweed supplemented diet on antioxidant and immune responses in European seabass (Dicentrarchus labrax) subjected to bacterial infection. Scientific Reports, v. 9, p. 16134, 2019.

List of acronyms

ARA Arachidonic acid (20:4n-6)

DHA Docosahexaenoic acid (22:6n-3)

EPA Eicosapentaenoic acid (20:5n-3)

EFA Essential fatty acids

FADS Fatty acid desaturases

FAS Fatty acid synthase

FA Fatty acids

FO Fish oil

LA Linoleic acid (18:2n-6)

LPL Lipoprotein lipase

LXR Liver X receptor

LC-PUFA Long-chain polyunsaturated fatty acids

VLC-PUFA Very long-chain polyunsaturated fatty acids

MUFA Monounsaturated fatty acids

OA Oleic acid (18:1 n-9)

PPARs Peroxisome proliferator-activated receptors

PCR Polymerase chain reaction

PUFA Polyunsaturated fatty acids

SFA Saturated fatty acids

SREBPs Sterol regulatory element binding proteins

TFs Transcription factors

VO Vegetable oil

ELOVL Elongation of very long-chain fatty acids

VLCFA Very long-chain fatty acids

VLC-SFA Very long-chain saturated fatty acids

ALA α-linolenic acid (18:3n-3)

Abstract

Teleost fish are not only an important source of protein for human populations but are also rich and popular sources of healthy long-chain polyunsaturated fatty acids (LC-PUFAs) such as arachidonic acid (ARA), eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA). Importantly, fish, like all vertebrates, cannot biosynthesize LC-PUFAs de novo. Yet some species can convert essential dietary PUFAs such as linoleic acid (LA; 18:2n-6) and α-linolenic acid (ALA; 18:3n-3) into LC-PUFAs. These conversions are performed by fatty acyl desaturase (Fads) and elongation of very long-chain fatty acid (Elovl) enzymes. Investigating LC-PUFA biosynthesis in teleost fish is therefore crucial to identify dietary requirements for essential fatty acids (EFA) and to understand the ability of a species to utilise commonly used raw materials such as vegetable oils in diets. The present study aimed to investigate this biosynthetic capacity in tambaqui (Colossoma macropomum), the most commercially important endemic farmed fish in Brazil. One desaturase (fads2) and four elongase (elovl2, elovl5, elovl4a and elovl4b) cDNAs were identified, cloned and functionally characterised by heterologous expression in yeast. Fads2 showed Δ6, Δ5 and Δ8 desaturase capacities within the same enzyme. Elovl2 elongated C20 and C22 PUFA, complementing Elovl5, which elongated C18 and C20 PUFA. Together, these enzymes catalyse all the desaturation and elongation reactions required for ARA, EPA and DHA biosynthesis from the C18 precursor fatty acids LA and ALA. Finally, the Elovl4a and Elovl4b enzymes from tambaqui were shown to enable the biosynthesis of very long-chain (>C24) saturated fatty acids (VLC-SFA) and very long-chain polyunsaturated fatty acids (VLC-PUFA): Elovl4a participates in the biosynthesis of VLC-PUFA of up to 36 carbons, and Elovl4b of up to 34 carbons. To further dissect the ability of the species to biosynthesize LC-PUFA from C18 precursors, an in vivo feeding trial was performed. Juvenile tambaqui were fed for 9 weeks with four diets varying in oil source (fish oil, FO, or vegetable oil, VO) and level (5% or 10%). This study confirmed that the physiological EFA requirements of tambaqui can be satisfied by dietary provision of the C18 PUFA (LA and ALA) present in vegetable oils: no differences in survival, body weight gain (BWG) or feed conversion ratio were observed between treatments. Moreover, the main LC-PUFA biosynthesis genes were up-regulated in liver and brain. Specifically, real-time PCR showed that fads2 and elovl5 were up-regulated in liver, and fads2 and elovl2 in brain, when fish were fed VO. The expression of transcription factors was also measured, with pparβb and pparγ up-regulated in the brain by the VO diet compared with the FO diet. Overall, the expression analysis showed that lipid metabolism genes were up-regulated by VO lipid sources, contributing to the biosynthesis of LC-PUFAs and, in particular, of DHA in the brain. In the course of this thesis, and to attain its main objectives, various genomic resources for tambaqui were generated. These will be increasingly necessary for a more efficient and productive tambaqui production chain. We produced and assembled two transcriptome datasets (liver and brain) and a first draft genome, which can be used in future studies of the species. Finally, as a proof of concept for this initial genome assembly, we investigated the gene repertoire of fatty acid elongases present in tambaqui. We identified the full repertoire of elovl genes (elovl1-7) and the newest gene of the Elovl family, elovl8, which, in principle, contributes to the biosynthesis of LC-PUFA in fish. Focusing mainly on lipid metabolism, the results of this thesis contribute to a better understanding of LC-PUFA and VLC-PUFA metabolism and its regulation in tambaqui. This knowledge, together with the data released to the scientific community (genomic and transcriptome sequences), will support novel research beyond EFA, contributing to the development of tambaqui farming and, consequently, of Brazilian aquaculture.

Summary

Teleost fish are not only a source of protein for human populations but also an important source of long-chain polyunsaturated fatty acids (LC-PUFA) such as arachidonic acid (ARA), eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA). It is important to note that fish, like all vertebrates, cannot biosynthesize LC-PUFAs de novo; however, some species are able to convert essential dietary PUFAs, such as linoleic acid (LA; 18:2n-6) and α-linolenic acid (ALA; 18:3n-3), into LC-PUFA. These conversions are carried out by desaturase (Fads) and elongase (Elovl) enzymes. Investigating LC-PUFA biosynthesis in teleost fish is therefore crucial to identify dietary requirements for essential fatty acids (EFA) and to understand the capacity of a species to use raw materials commonly included in diets, such as vegetable oils. The present study aimed to investigate this biosynthetic capacity in tambaqui (Colossoma macropomum), the most commercially important endemic fish in Brazil. cDNAs of one desaturase (fads2) and four elongases (elovl2, elovl5, elovl4a and elovl4b) were identified, cloned and functionally characterised by heterologous expression in yeast. Fads2 showed desaturation capacity at the Δ6, Δ5 and Δ8 positions within the same enzyme. Elovl2 was able to elongate C20 and C22 PUFA, complementing Elovl5, which is able to elongate C18 and C20 PUFA. Together, these enzymes catalyse all the desaturation and elongation reactions required for the biosynthesis of ARA, EPA and DHA from the C18 precursor fatty acids (LA and ALA). Finally, the Elovl4a and Elovl4b enzymes were shown to be capable of biosynthesizing very long-chain (>C24) saturated fatty acids (VLC-SFA) and very long-chain polyunsaturated fatty acids (VLC-PUFA), with Elovl4a participating in the biosynthesis of VLC-PUFA of up to 36 carbons and Elovl4b of up to 34 carbons. To better understand the capacity of tambaqui to biosynthesize LC-PUFA from C18 precursors, an in vivo experimental trial was carried out. Juvenile tambaqui were fed for 9 weeks with four diets varying in oil source (fish oil, FO, or vegetable oil, VO) and level (5% or 10%). This study confirmed that the physiological EFA requirements of tambaqui can be met by the dietary supply of C18 PUFA (LA and ALA) present in vegetable oils. During the growth trial, no differences in survival, body weight gain (BWG) or feed conversion ratio were observed between treatments. In addition, the main LC-PUFA biosynthesis genes showed increased expression in liver and brain. Specifically, the real-time PCR results show that fads2 and elovl5 expression increased in liver, as did fads2 and elovl2 expression in brain, when fish were fed VO. The expression of transcription factors was also measured: pparβb and pparγ showed higher expression in the brain in the VO treatments than in the FO diet. Overall, the expression analysis showed that the expression of lipid metabolism genes was increased by VO lipid sources, contributing to the biosynthesis of LC-PUFAs and, in particular, of DHA in the brain. In the course of this thesis, and to achieve its objectives, additional genomic resources were generated for tambaqui. These will be increasingly necessary for a more efficient and productive tambaqui farming production chain. We produced two transcriptome datasets (liver and brain) and a first draft of the tambaqui genome, which can be used in future studies of the species. Finally, to demonstrate the quality of this initial genome assembly, we investigated the gene repertoire of fatty acid elongases present in tambaqui. We identified the complete repertoire of elovl genes in tambaqui (elovl1-7) and the most recent gene of the Elovl family, elovl8, which, in principle, contributes to LC-PUFA biosynthesis in fish. Focusing mainly on lipid metabolism, these results contribute to a better understanding of LC-PUFA and VLC-PUFA metabolism and its regulation in tambaqui. The knowledge generated, together with the data made available to the scientific community (genomic and transcriptome sequences), can give rise to new studies beyond lipid essentiality, contributing to the development of tambaqui farming and, consequently, of Brazilian aquaculture.

Chapter 1

General Introduction

1.1. Aquaculture in Brazil

Brazilian aquaculture production has grown at an average rate of around 9% per year over the past decade (OCDE–FAO 2015). This growth has been paralleled by rising domestic fish consumption: per capita consumption grew from 6.0 kg in 2005 to 9.9 kg in 2014, following the world trend, in which per capita consumption grew from 9.0 kg in 1961 to 20.2 kg in 2015 (FAO 2018). In the Americas, aquaculture contributes approximately 18% of total fish production (FAO 2018) (Figure 1.1), still far below the world rate of 47% of a total output of 171 million tonnes in 2016 (capture fisheries and aquaculture combined) (FAO 2018) (Figure 1.1). Even so, Brazil ranks thirteenth among the major world aquaculture producers (more than 500 000 tonnes in 2016, excluding aquatic plants) (FAO 2018) and is the second largest aquaculture producer on the American continent, after Chile (Figure 1.2).

Figure 1.1. Aquaculture contribution to total fish production (excluding aquatic plants) (FAO 2018).

Figure 1.2. Aquaculture production of major producing regions and major producers of main species groups, 2001–2016 (FAO 2018).

In the past ten years, the total number of commercially farmed species registered by FAO increased by 27%, from 472 in 2006 to 598 species in 2016 (FAO 2018). Despite the great diversity in the species raised, aquaculture production by volume is dominated by a small number of “staple” species or species groups at regional, national and global levels (FAO 2018). Remarkably, 27 species contribute to over 90% of world production. The 20 most produced species items accounted for 84% of the total production, which demonstrates a dominance of few species in world aquaculture. Brazil does not differ from this global scenario, with almost half of the national aquaculture production based on just one species, tilapia (Oreochromis niloticus). The production of this exotic species was approximately 239.09 thousand tons in 2016, representing 47% of national production (OCDE–FAO 2015) (Figure 1.3). According to Brazilian legislation regarding continental aquaculture parks, the creation of exotic species in water bodies, is limited, except when there is evidence that the invasive species has already been detected in the river basin in accordance with IBAMA Order No. 145 / N, of October 29, 1998. This brings a lot of conflict between the most conservative researchers, who claim that special attention has to be given to this species, since they are not endemic to the Brazilian rivers, and once not created with due care; they can be considered invasive and of great risk to national biodiversity (Vitule, Freire, and Simberloff 2009). One of the possibilities to better balance this reality would be a greater focus on the use of native species in Brazilian aquaculture. Fish faunas of South America are the most diverse on Earth, with current species richness estimates standing above 9100 species, of which, about 5160 are freshwater (Reis et al. 2016). 
Among these, a great diversity of native species, such as jundiá (Rhamdia sp.), pacu (Piaractus mesopotamicus), pirarucu (Arapaima gigas), tambaqui (Colossoma macropomum) and matrinxã (Brycon amazonicus), has potential for aquaculture (Barthem and Fabré 2004; Baldisserotto and Gomes 2005). These species already contribute to national aquaculture (Figure 1.3), with tambaqui standing out at about one third of production.

Figure 1.3. Quantity (thousand tons) and percentage distribution of fish production in Brazil, according to species or groups of fish, in increasing order of production - Brazil - 2016. Adapted from (IBGE 2017).

1.2. Tambaqui: an omnivore

Tambaqui, Colossoma macropomum (Figure 1.4), was first described by Georges Cuvier in 1816. The species belongs to the class Actinopterygii, order Characiformes, family Characidae. Tambaqui is native to the Amazon and Orinoco river basins and is widely distributed in South America; it is the second largest scaled fish in the Amazon basin, after Arapaima gigas (Osteoglossidae). Adults are characterized by irregular dark spots on the ventral and caudal regions and reach at least one meter in total length and 30 kg in weight (Goulding and Carvalho 1982). From a nutritional standpoint, tambaqui is an omnivorous freshwater fish. In the natural environment, its feeding follows the rainfall regime, and its morphophysiological adaptations allow it to exploit a wide variety of food items (Rodrigues 2014). It obtains a large amount of fibre and energy from fruits and seeds, which are available during the flood season, and shifts to zooplankton in the dry season (Goulding and Carvalho 1982). Therefore, considering its feeding habits, the structure of its digestive tract and the distribution of its digestive enzymes (Bezerra et al. 2000), tambaqui is expected to have a digestive capacity able to hydrolyse a wide variety of carbohydrates and lipids (De Almeida, Lundstedt, and Moraes 2006).

Figure 1.4. Tambaqui (Colossoma macropomum). Photo by William Chaves, 2020.

Tambaqui was intensively fished in the Amazon region in the 1980s, which led to a decrease in fish stocks. Consequently, the Brazilian regulatory agency (IBAMA) acted to curb fishing and protect the species. Fishing during the reproductive migration was prohibited, and year-round restrictions were established through ordinances (Ordinances 65 and 67 of 10/30/2003) that limited capture and marketing to individuals over 55 cm in total length (Souza et al. 2004). The population then gradually recovered. However, the high demand for tambaqui, mainly from the population of Northern Brazil, together with the need to preserve natural stocks, drove the intensification of captive production. Since then, several production systems have been proposed for tambaqui culture, from the most extensive to the most intensive, which can be practiced in several types of facilities, such as dams, tanks and recirculating aquaculture systems. In Amazonas, the activity is mainly carried out in dam-type nurseries (Melo, Izel, and Rodrigues 2001). Because tambaqui shows good zootechnical growth indexes when supplied with artificial feeds (Dairiki and Silva 2011), and because of consumer demand, it has become the second most produced species in Brazil, accounting for 27.0% of total fish production in 2016 (Figure 1.3), with a production of 136.99 thousand tons.

1.3. Long chain polyunsaturated fatty acids (LC-PUFA) in aquaculture

Fish are the most important source of long-chain (C20-24) polyunsaturated fatty acids (LC-PUFA), supplying physiologically essential fatty acids (EFA) in the human diet. Thus, they are key contributors to human health and general homeostasis. Functionally, the most important n-3 LC-PUFA are eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA), which have a wide range of physiological roles linked to health or clinical benefits (Calder 2014). The LC-PUFA composition of cell membranes depends on the diet (Simopoulos 2000), and intakes of EPA and DHA are therefore recommended to ensure normal growth, development and overall health maintenance (FAO 2010). EPA and DHA have different effects on several functions of, for example, leukocytes, insulin-secreting cells and endothelial cells; these differences are associated with their effects on membrane physicochemical properties, intracellular signalling pathways and gene expression (Gorjão et al. 2009). There is evidence that EPA and DHA have specific and potentially independent effects on chronic disease, being associated with a lower risk of fatal cardiac events (Mozaffarian and Wu 2012). Additionally, DHA is quantitatively the most important n-3 LC-PUFA in the brain, having unique and indispensable roles in neuronal membranes, with levels preserved by multiple mechanisms (Stillwell and Wassall 2003; Stillwell et al. 2005). Whereas capture fisheries provided the bulk of fish supplied for human consumption in the past, aquaculture has increasingly contributed to global fish supply in the last few decades. Indeed, as noted above, world aquaculture now contributes almost half of the fish produced. Freshwater species are among the farmed fish driving global production and are projected to make up about 60% of total aquaculture production by 2025 (FAO/OECD 2017). 
This is one of the reasons why the EFA requirements of freshwater fish species have been extensively studied over the years and are known to vary both qualitatively and quantitatively (Sargent and Tacon 1999; Sargent et al. 1999). Fish, like all vertebrates, cannot synthesize LC-PUFA such as EPA, DHA and arachidonic acid (ARA) de novo. However, the dietary essential PUFA linoleic acid (LA; 18:2n-6) and α-linolenic acid (ALA; 18:3n-3), found in vegetable oils, can be converted to LC-PUFA in some species to satisfy this requirement (Tocher 2010). The capacity of fish to convert LA and ALA to n-6 and n-3 LC-PUFA varies among species and depends mainly on the activity of two types of enzymes, namely fatty acyl desaturases (Fads) and elongation of very long-chain fatty acid (Elovl) proteins (Monroig, Tocher, and Castro 2018) (Figure 1.5). Fads introduce a double bond between a pre-existing double bond and the carboxyl group and are therefore termed “front-end” desaturases. Elovl proteins, in turn, are responsible for the initial condensation reaction of the elongation pathway, which results in the addition of two carbons to a pre-existing fatty acid (FA) substrate (Cook 1996).

Figure 1.5. Biosynthetic pathways of LC-PUFA from the precursors linoleic acid (18:2n-6) and α-linolenic acid (18:3n-3) in teleosts. Desaturation reactions are catalyzed by fatty acyl desaturases (Fads) and elongation reactions, denoted with “Elo,” are catalyzed by elongation of very long-chain fatty acid (Elovl) proteins. β-ox, partial β-oxidation.
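
The alternating desaturation and elongation steps described above can be illustrated with a minimal Python sketch. It assumes one commonly described route (Δ6 desaturation, elongation, Δ5 desaturation, two elongations, a second Δ6 desaturation and partial β-oxidation); alternative routes, such as those involving Δ8 or Δ4 desaturation, are not modelled. The FattyAcid class, the function names and the STEPS list are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FattyAcid:
    """Shorthand notation carbons:double_bonds n-series, e.g. 18:3n-3."""
    carbons: int
    double_bonds: int
    series: int  # 3 for n-3, 6 for n-6

    def __str__(self):
        return f"{self.carbons}:{self.double_bonds}n-{self.series}"


def desaturate(fa):
    """Fads step: one additional double bond, chain length unchanged."""
    return FattyAcid(fa.carbons, fa.double_bonds + 1, fa.series)


def elongate(fa):
    """Elovl step: two carbons added, double bonds unchanged."""
    return FattyAcid(fa.carbons + 2, fa.double_bonds, fa.series)


def partial_beta_oxidation(fa):
    """Chain shortening by two carbons, double bonds unchanged."""
    return FattyAcid(fa.carbons - 2, fa.double_bonds, fa.series)


# Assumed step order (illustrative only, not taken from Figure 1.5 itself).
STEPS = [desaturate, elongate, desaturate, elongate, elongate,
         desaturate, partial_beta_oxidation]


def run_pathway(precursor, steps=STEPS):
    """Return the precursor followed by every intermediate and the final product."""
    products = [precursor]
    for step in steps:
        products.append(step(products[-1]))
    return products


ALA = FattyAcid(18, 3, 3)  # alpha-linolenic acid
LA = FattyAcid(18, 2, 6)   # linoleic acid

for precursor in (ALA, LA):
    print(" -> ".join(str(fa) for fa in run_pathway(precursor)))
```

Under these assumptions, 18:3n-3 yields 20:5n-3 (EPA) after three steps and 22:6n-3 (DHA) at the end of the sequence, while 18:2n-6 yields 20:4n-6 (ARA) after three steps.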

In contrast to other vertebrates, which have two types of Fads desaturases, namely Fads1 (Δ5 desaturase) and Fads2 (Δ6 desaturase) (Guillou et al. 2010), teleost fish appear to have lost fads1 during evolution and have a variable number of fads2 genes (e.g. Castro, Tocher, and Monroig 2016). To date, the only exception in the teleost lineage has been reported in the European eel (Anguilla anguilla), where a fads1 desaturase was isolated and exhibited Δ5 desaturase activity (Lopes-Marques et al. 2018). More commonly, though, Fads2 diversified functionally during the radiation of teleosts, and Fads2 with Δ6 (Zheng et al. 2004; Zheng et al. 2009; González-Rovira et al. 2009; Mohd-Yusof et al. 2010), dual Δ6 and Δ5 (Hastings et al. 2001; Tanomman et al. 2013; Fonseca-Madrigal et al. 2014; Kuah, Jaya-Ram, and Shu-Chien 2016; Oboh et al. 2016; Machado et al. 2019; Ferraz et al. 2019), Δ5 (Abdul Hamid et al. 2016) and Δ4 (Li et al. 2010; Morais et al. 2012; Fonseca-Madrigal et al. 2014) desaturase activities have been identified. In addition, like mammalian FADS2, teleost Fads2 typically show Δ8 desaturase activity (Monroig, Li, and Tocher 2011). The Elovl family of fatty acid elongases has also been of interest in fish because of its role in LC-PUFA biosynthesis, catalysing the rate-limiting step in the two-carbon elongation of pre-existing fatty acyl chains (Guillou et al. 2010; Jakobsson, Westerberg, and Jacobsson 2006). Seven members (Elovl1–7) with similar and shared amino acid motifs constitute the Elovl protein family in vertebrates. Elovl1, Elovl3, Elovl6 and Elovl7 prefer saturated (SFA) and monounsaturated fatty acid (MUFA) substrates, whereas Elovl2, Elovl4 and Elovl5 are selective for PUFA (Guillou et al. 2010). Elovl2, Elovl4 and Elovl5 have received considerable attention in aquatic species due to their involvement in the conversion of C18 PUFA to LC-PUFA. Of these, only Elovl4 is capable of catalysing the production of very long-chain saturated fatty acids (VLC-SFA) and very long-chain polyunsaturated fatty acids (VLC-PUFA) with chain lengths ≥28 carbons (Sherry et al. 2019). Elovl2, Elovl4 and Elovl5 have been identified and functionally characterized in several species of teleost fish (Monroig et al. 2016; Castro, Tocher, and Monroig 2016). For example, Elovl5 enzymes can efficiently elongate C18 and C20 PUFA, with markedly lower elongase capacity towards C22 substrates (Li, Monroig, Wang, et al. 2017; Morais et al. 2009b; Gregory and James 2014; Ferraz et al. 2019). Elovl2, in contrast, has C20 and C22 PUFA as preferred elongation substrates (Morais et al. 2009a; Monroig et al. 2009; Oboh et al. 2016; Ferraz et al. 2019). However, Elovl2 was probably lost during the diversification of teleosts and is absent in most teleost lineages (Castro, Tocher, and Monroig 2016). Elovl4 may compensate for the loss of Elovl2 in some teleost species. Elovl4 is probably involved in the biosynthesis of very long-chain (>C24) PUFA, since it can elongate C22 PUFA (Carmona-Antonanzas et al. 2011; Li, Monroig, Navarro, et al. 2017; Oboh et al. 2017a; Ran et al. 2019). The elovl4 gene appears to be duplicated in teleosts (elovl4a and elovl4b) (Monroig et al. 2010). Recently, a novel elovl gene class, named elovl8, has been described in fish (Li et al. 2020). It was identified in the herbivorous marine teleost rabbitfish (Siganus canaliculatus), which has two novel fish-specific elovl genes (elovl8a and elovl8b). These two genes apparently have distinct functions, as only Elovl8b can elongate PUFA to LC-PUFA (Li et al. 2020).
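
The substrate preferences summarised above can be captured in a small lookup, sketched here in Python. The chain lengths follow the text (Elovl5: C18 and C20; Elovl2: C20 and C22; Elovl4: C22 and longer); the upper bound of C36 for Elovl4, the dictionary and the function name are hypothetical choices made only for illustration, and the sketch ignores the lower residual activities reported for some enzymes.

```python
# Preferred PUFA substrate chain lengths per elongase, as summarised in the text.
PREFERRED_SUBSTRATES = {
    "Elovl5": {18, 20},
    "Elovl2": {20, 22},
    # Assumption: an arbitrary cap of C36, used only to make the set finite.
    "Elovl4": set(range(22, 37, 2)),
}


def candidate_elongases(chain_length):
    """Return the elongases whose preferred substrates include this chain length."""
    return sorted(
        name
        for name, lengths in PREFERRED_SUBSTRATES.items()
        if chain_length in lengths
    )


for carbons in (18, 20, 22, 24):
    print(f"C{carbons}: {', '.join(candidate_elongases(carbons))}")
```

Read this way, a species lacking Elovl2 still has a candidate elongase for C22 substrates in Elovl4, which is the compensation scenario raised in the paragraph above.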

As already discussed, the endogenous capacity of fish to convert LA and ALA, present in alternative oil sources, to n-6 and n-3 LC-PUFA is due to an alternating sequence of desaturations and elongations catalyzed by Fads and Elovl enzymes (Cook 1996; Castro, Tocher, and Monroig 2016). In addition to these two major enzyme families involved in LC-PUFA biosynthesis, many other key enzymes and transcription factors (TFs) are involved in the overall regulation of lipid metabolism. Some TFs, such as the liver X receptor (LXR), peroxisome proliferator-activated receptors (PPARs) and sterol regulatory element binding proteins (SREBPs), have received greater attention because they play a crucial role in regulating the expression of genes involved in lipid metabolism (Fievet and Staels 2009; Daemen, Kutmon, and Evelo 2013). Briefly, LXRs are best known for their control of lipid and cholesterol metabolism (Jakobsson et al. 2012), increasing the expression of genes of fatty acid biosynthesis (lipogenesis) and raising plasma triglyceride levels (hypertriglyceridemia) (Li et al. 2016; Zhu et al. 2018; Fievet and Staels 2009). In teleosts, in contrast to mammals, only a single LXR gene, LXRα, has been identified (Fonseca et al. 2017). PPARs are also nuclear receptors, activated by fatty acids and their derivatives. In vertebrates, three PPAR nuclear receptor gene paralogues have been recognized: PPARα, PPARβ and PPARγ (Madureira et al. 2017; Urbatzka et al. 2013; Robinson-Rechavi et al. 2001; Leaver et al. 2005). PPARα is expressed predominantly in tissues with a high level of fatty acid catabolism, such as liver, heart and muscle, where it regulates the expression of several genes critical for lipid and lipoprotein metabolism, including those encoding enzymes involved in the peroxisomal and mitochondrial β-oxidation of fatty acids (Yoon 2009). Similarly, PPARβ activates lipid utilization by regulating the expression of target genes encoding enzymes involved in β-oxidation and energy uncoupling in various tissues (Dressel et al. 2003). In contrast, PPARγ is a central regulator of adipose tissue development and an important modulator of gene expression in a number of specialized adipocyte cell types (Walczak and Tontonoz 2002). SREBPs are transcription factors that play an important role in regulating intracellular cholesterol concentration and overall lipid homeostasis (Daemen, Kutmon, and Evelo 2013): SREBP1 plays a crucial role in the regulation of many lipogenic genes, whereas SREBP2 primarily regulates the transcription of cholesterol-related enzymes (Jeon and Osborne 2012).

1.4. Alternative source of oils in aquaculture

Fish meal (FM) and fish oil (FO) are the most widely used dietary components of commercially produced high-quality fish and shrimp feeds throughout the world. FM and FO are preferred for commercial feed production because of their unique balance of protein (amino acids) and lipids (n-3 LC-PUFA) in a highly digestible, energy-dense form (Tacon and Metian 2008). High FM and FO levels have traditionally been included in aquafeed formulations. However, supplies of FO for aquaculture are expected to become critical, since data indicate that the aquaculture industry uses 40% and 75% of the global production of FM and FO, respectively (Tacon and Metian 2008). Within the next decade FO production may not meet the quantities required by aquaculture, meaning that the food grade fisheries that provide FO and FM have reached their limit of sustainability (Tacon and Metian 2008); their continued exploitation is ecologically harmful, and a collapse of all wild seafood species has been projected by 2050 (Worm et al. 2006; Berkes et al. 2006). For these reasons, several studies have investigated vegetable oils (VO) as possible sustainable total or partial substitutes for FO in compounded fish feeds. Plants are capable of de novo synthesis of n-3 fatty acids of up to 18 carbons (Leonard et al. 2004), which makes them an interesting option for FO replacement in some species. The vegetable oils most commonly used in fish feed production have been soybean, linseed, rapeseed, sunflower, palm and olive oils. However, most VO are relatively poor sources of n-3 fatty acids in comparison to marine FO (Turchini, Torstensen, and Ng 2009). VO are rich sources of n-6 and n-9 fatty acids, mainly LA and oleic acid (OA; 18:1n-9), with the exception of linseed oil, which is rich in ALA (Turchini, Torstensen, and Ng 2009; Ayisi, Zhao, and Apraku 2019). VO have been evaluated as FO substitutes either singly or as blends formulated to replicate the fatty acid composition of FO in terms of the relative proportions of total saturated fatty acids (SFA), monounsaturated fatty acids (MUFA) and PUFA (Turchini, Torstensen, and Ng 2009; Richard et al. 2006). Studies evaluating the effects of dietary FO replacement by VO on growth performance, fatty acid composition, plasma biochemical parameters, lipid metabolism, hepatic antioxidant capacity and morphology are essential to establish the viability of this replacement.

1.5. Genomics in nutrition and aquaculture

The entire genomic information of a species, from protein-coding regions to functional non-coding regions, is contained in its genome (Shen 2019). DNA sequencing, the process of determining the nucleotide sequence of a DNA fragment, has been revolutionized in recent years (Ansorge 2009). With the rapid development of next-generation sequencing technologies, it is now straightforward to determine and characterize a large number of genomes in species of interest. Omic resources such as whole-genome reference sequences, transcriptomes, high-density SNP genotyping arrays and genotyping-by-sequencing are in development for several aquaculture species (Gui and Zhu 2012), such as common carp (Cyprinus carpio) (Xu et al. 2014), Atlantic salmon (Salmo salar) (Lien et al. 2016), rainbow trout (Oncorhynchus mykiss) (Palti et al. 2012), European sea bass (Dicentrarchus labrax) (Tine et al. 2014) and tilapia (Oreochromis niloticus) (Conte et al. 2017). In Brazil, sequencing projects for neotropical species are also officially underway. Within the Animal Genomic Network, EMBRAPA is leading the complete sequencing of two species of importance for national aquaculture, tambaqui and cachara (Pseudoplatystoma reticulatum), whose genomes are not yet publicly available. Brazilian species with genome data already available include, for example, red-bellied piranha (Pygocentrus nattereri) (Schartl et al. 2019) and arapaima (Arapaima gigas) (Du et al. 2019). The increasing availability of genomic resources, such as genomes and transcriptomes, will be highly useful not only for a better understanding of basic biology, but also for providing the information needed to establish nutrition and aquaculture conditions and to support conservation. For example, understanding the gene regulation and control networks related to reproduction, growth, disease resistance, cold tolerance and hypoxia tolerance can be used to develop new technologies for optimizing aquaculture productivity (Gui and Zhu 2012).

1.6. Molecular Resources in Tambaqui

The development of easy-to-use DNA sequencing technologies has had a revolutionary impact on animal genetics. These techniques can provide clear insights into genome structure and function, allowing not only an understanding of the basic biology of the animal, but also the application of other biological technologies (Liu and Cordes 2004). In the case of tambaqui, previous studies have examined single-nucleotide polymorphisms (SNPs) (Nunes et al. 2017). SNPs were used to construct a high-density genetic linkage map, and syntenic relationships and functional annotation analyses were established by aligning tambaqui against the zebrafish (Danio rerio) genome, thus contributing to the construction of the tambaqui physical map (Nunes et al. 2017). SNPs are also useful for understanding population dynamics and delimiting conservation units. In this sense, SNPs have been used in tambaqui for genetic studies of animals from both the Amazon and Orinoco basins, to better understand population structure and support conservation (Martínez et al. 2016). MicroRNAs (miRNAs) are a highly conserved group of small non-coding RNA molecules, 19-25 nucleotides in size, that play a key role in post-transcriptional gene regulation in many organisms (Ranganathan and Sivasankar 2014). miRNAs have been studied in different organisms, including important fish species such as Salmo salar (Barozai 2012b), Danio rerio (Wienholds et al. 2003), Ictalurus punctatus (Barozai 2012a), Cyprinus carpio (Zhu et al. 2012) and Oncorhynchus mykiss (Juanchich et al. 2016). The first report on the identification and characterization of miRNAs in tambaqui examined liver and skin (Gomes et al. 2017): 279 conserved miRNAs were identified, 257 in liver and 272 in skin, with many miRNAs shared between tissues but differing in the number of reads. Microsatellite markers were among the earlier techniques used to study evolutionary dynamics and the distribution of genetic diversity. Microsatellites are tracts of repetitive DNA in which certain motifs (ranging in length from one to six or more base pairs) are repeated (Vieira et al. 2016). In tambaqui, microsatellite markers have been used to estimate the genetic diversity of cultivated and natural populations and to evaluate the viability of the species (Santos, Hrbek, and Farias 2009; Hamoy et al. 2010; Santos et al. 2016). Consistent with its semi-migratory behaviour, high genetic variability was observed in molecular analyses, with tambaqui forming a panmictic population along the Solimões-Amazon River channel. Overall, these results agree with the species-typical behaviour of semi-migratory movements driven by dispersal for feeding and reproduction during its life cycle (Santos, Ruffino, and Farias 2007). Another tool currently used in omic approaches is transcriptome profiling by deep sequencing, commonly known as RNA-Seq. It has become increasingly popular as it has become more affordable, it provides information on expressed genes, and it has been widely used in aquaculture (Chana-Munoz et al. 2017; Lei et al. 2017; Mu et al. 2014; Pereiro et al. 2012). In tambaqui, the muscle transcriptome was previously characterized under specific future climate scenarios (mild, intermediate and extreme) (Prado-Lima and Val 2016).
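
The definition of microsatellites given above, motifs of one to six base pairs repeated in tandem, can be made concrete with a short Python sketch that scans a DNA string with a regular expression. The minimum of four repeats is an arbitrary threshold chosen for this illustration, the function name is hypothetical, and the example sequence is invented rather than taken from tambaqui; marker development in practice relies on dedicated tools and stricter criteria.

```python
import re


def find_microsatellites(sequence, min_repeats=4, max_motif=6):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    The motif is matched lazily, so the shortest repeating unit is reported
    (for example "AC" rather than "ACAC").
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Invented example sequence containing a dinucleotide repeat.
example = "TTGACACACACACAGGT"
for start, motif, repeats in find_microsatellites(example):
    print(f"position {start}: ({motif}) x {repeats}")
```

For the invented sequence, the sketch reports a single (AC) repeat of five units starting at position 3 (zero-based).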

1.7. Objectives

The overall objective of this thesis is to elucidate the functionality of the LC-PUFA biosynthetic pathway in tambaqui. This involved the cloning and functional characterisation of all fads and elovl genes with putative roles in this pathway. After describing the LC-PUFA biosynthetic pathway in vitro, we aimed to address the expression of these genes and other lipid gene modules in vivo. For this, an experimental trial was conducted in which tambaqui juveniles were fed diets with different oil sources and inclusion levels, to investigate their effect on the expression of key genes involved in lipid metabolism in the liver and brain. Concomitantly with these analyses, we generated a collection of omic resources for tambaqui, namely two transcriptomes (liver and brain) and a draft genome assembly. We hypothesise that characterising the full set of Fads and Elovl enzymes, and investigating the molecular landscapes regulating different metabolic situations, will allow us to identify the effect of vegetable oils at two inclusion levels on the regulation of fads, elovl and transcription factor genes.

The specific aims of this project include:

1. Molecular cloning of genes encoding Fads and Elovl involved in the LC-PUFA and VLC-PUFA biosynthetic pathways of tambaqui;

2. Functional characterisation of Fads and Elovl by heterologous expression in yeast;

3. Establishing the effect of replacing FO with VO on the expression of genes that code for LC-PUFA metabolism enzymes (fads2, elovl5, elovl2), regulatory modules (pparαa, pparαb, pparβa, pparβb, pparγ, lxrα and srebp1) and enzymes involved in the control of lipid biosynthesis or oxidation (fas and lpl) in vivo, in tambaqui juveniles fed different diets;

4. Providing tambaqui with omic resources, such as transcriptomes and a draft genome assembly.

This thesis consists of a General introduction (Chapter 1), data chapters (Chapters 2 - 6) and a General discussion (Chapter 7), which provides a concise synthesis of all the outcomes and conclusions of the data chapters.

The data chapters include:

Chapter 2: A complete enzymatic capacity for long-chain polyunsaturated fatty acid biosynthesis is present in the Amazonian teleost tambaqui, Colossoma macropomum. This chapter covers the molecular cloning and functional characterisation of fads2, elovl2 and elovl5 genes from C. macropomum. Results from this chapter have been published in Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 227 (2019): 90-97.

Chapter 3: From the Amazon: a comprehensive liver transcriptome of the tambaqui, Colossoma macropomum. In this chapter we produced the first comprehensive liver transcriptome of C. macropomum. Results from this chapter have been published in Data in Brief 23 (2019): 103751.

Chapter 4: Regulation of gene expression associated to LC-PUFA metabolism in juvenile tambaqui (Colossoma macropomum) fed different dietary oil sources. This chapter covers the expression of genes analysed to assess the effect of replacing FO with VO in juveniles of C. macropomum. Additionally, a brain transcriptome of C. macropomum was generated. Results from this chapter are being prepared for submission to Aquaculture.

Chapter 5: The fatty acid elongation genes, elovl4a and elovl4b, are present and functional in the genome of tambaqui (Colossoma macropomum). This chapter covers the molecular cloning and functional characterisation of two elovl4 genes from C. macropomum. Additionally, we generated a draft genome assembly for this species. Results from this chapter are under revision in Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology.

Chapter 6: The repertoire of the elongases of very long-chain fatty acids gene family is conserved in the tambaqui (Colossoma macropomum) genome, and includes members of the novel Elovl8. This chapter covers the first identification of all elovl genes present in C. macropomum, including a new elongase, Elovl8, present in teleosts.

1.8. References

Abdul Hamid, N. K., G. Carmona-Antonanzas, O. Monroig, D. R. Tocher, G. M. Turchini, and J. A. Donald. 2016. 'Isolation and Functional Characterisation of a fads2 in Rainbow Trout (Oncorhynchus mykiss) with Delta5 Desaturase Activity', PloS one, 11: e0150770. Ansorge, Wilhelm J. 2009. 'Next-generation DNA sequencing techniques', New biotechnology, 25: 195-203. Ayisi, CL, Jl Zhao, and A Apraku. 2019. 'Consequences of Replacing Fish Oil with Vegetable Oils in Fish', Journal of Animal Research and Nutrition. Baldisserotto, Bernardo, and Levy de Carvalho Gomes. 2005. Espécies Nativas para a Piscicultura no Brasil. Barozai, Muhammad Younas Khan. 2012a. 'The MicroRNAs and their targets in the channel catfish (Ictaluruspunctatus)', Molecular biology reports, 39: 8867-72. Barozai, Muhammad Younas Khan. 2012b. 'Identification and characterization of the microRNAs and their targets in Salmo salar', Gene, 499: 163-68. Barthem, Ronaldo Borges , and Nidia Noemi Fabré. 2004. 'Biologia e diversidade dos recursos pesqueiros da Amazônia'. Berkes, Fikret, Terry P Hughes, Robert S Steneck, James A Wilson, David R Bellwood, Beatrice Crona, Carl Folke, LH Gunderson, HM Leslie, and J Norberg. 2006. 'Globalization, roving bandits, and marine resources', Science, 311: 1557-58. Bezerra, Ranilson DE Souza, Juliana Ferreira dos Santos, Mércia Andrea da Silva Lino, Vera Lúcia Almeida Vieira, and Luiz Bezerra Carvalho. 2000. 'Characterization of stomach and pyloric caeca proteinases of tambaqui (Colossoma macropomum)', Journal of food biochemistry, 24: 189-99. Calder, Philip C. 2014. 'Very long chain omega‐3 (n‐3) fatty acids and human health', European journal of lipid science and technology, 116: 1280-300. Carmona-Antonanzas, G., O. Monroig, J. R. Dick, A. Davie, and D. R. Tocher. 2011. 
'Biosynthesis of very long-chain fatty acids (C>24) in Atlantic salmon: cloning, functional characterisation, and tissue distribution of an Elovl4 elongase', Comp Biochem Physiol B Biochem Mol Biol, 159: 122-9. Carolsfeld, Joachim, Brian Harvey, Carmen Ross, and Anton Baer. 2004. 'Migratory of South America: biology, fisheries and conservation status'. Castro, L. F., D. R. Tocher, and O. Monroig. 2016. 'Long-chain polyunsaturated fatty acid biosynthesis in : Insights into the evolution of Fads and Elovl gene repertoire', Prog Lipid Res, 62: 25-40. Conte, Matthew A, William J Gammerdinger, Kerry L Bartie, David J Penman, and Thomas D Kocher. 2017. 'A high quality assembly of the Nile Tilapia (Oreochromis niloticus) genome reveals the structure of two sex determination regions', BMC genomics, 18: 341. Cook, Harold W. 1996. 'Fatty acid desaturation and chain elongation in eukaryotes.' in, Biochemistry of Lipids, Lipoproteins and Membranes (Elsevier Science ). Daemen, Sabine, Martina Kutmon, and Chris T Evelo. 2013. 'A pathway approach to investigate the function and regulation of SREBPs', Genes & nutrition, 8: 289. Dairiki, J. J., and T. B. A. Silva. 2011. 'Revisão de literatura: Exigências nutricionais do tambaqui – compilação de trabalhos, formulação de ração adequada e desafios futuros', Empresa Brasileira de Pesquisa Agropecuária. De almeida, LC, LM Lundstedt, and G Moraes. 2006. 'Digestive enzyme responses of tambaqui (Colossoma macropomum) fed on different levels of protein and lipid', Aquaculture nutrition, 12: 443-50. Dressel, Uwe, Tamara L Allen, Jyotsna B Pippal, Paul R Rohde, Patrick Lau, and George EO Muscat. 2003. 'The peroxisome proliferator-activated receptor β/δ agonist,

17

GW501516, regulates the expression of genes involved in lipid catabolism and energy uncoupling in skeletal muscle cells', Molecular endocrinology, 17: 2477-93. Du, Kang, Sven Wuertz, Mateus Adolfi, Susanne Kneitz, Matthias Stöck, Marcos Oliveira, Rafael Nóbrega, Jenny Ormanns, Werner Kloas, and Romain Feron. 2019. 'The genome of the arapaima (Arapaima gigas) provides insights into gigantism, fast growth and chromosomal sex determination system', Scientific reports, 9: 5293. FAO. 2018. "The State of World Fisheries and Aquaculture 2018‐Meeting the sustainable development goals." In.: FAO Rome, Italy. FAO, Food and Agriculture Organization of the United Nations -. 2010. Fats and fatty acids in human nutrition: report of an expert consultation (Rome). FAO/OECD. 2017. OECD-FAO Agricultural Outlook 2017-2026. Special Focus: Southeast Asia. Ferraz, Renato B, Naoki Kabeya, Mónica Lopes-Marques, André M Machado, Ricardo A Ribeiro, Ana L Salaro, Rodrigo Ozório, L Filipe C Castro, and Óscar Monroig. 2019. 'A complete enzymatic capacity for long-chain polyunsaturated fatty acid biosynthesis is present in the Amazonian teleost tambaqui, Colossoma macropomum', Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology, 227: 90-97. Fievet, C, and B Staels. 2009. 'Liver X receptor modulators: effects on lipid metabolism and potential use in the treatment of atherosclerosis', Biochemical pharmacology, 77: 1316-27. Fonseca-Madrigal, J., J. C. Navarro, F. Hontoria, D. R. Tocher, C. A. Martinez-Palacios, and O. Monroig. 2014. 'Diversification of substrate specificities in teleostei Fads2: characterization of Delta4 and Delta6 Delta5 desaturases of Chirostoma estor', J Lipid Res, 55: 1408-19. Fonseca, Elza, Raquel Ruivo, Mónica Lopes-Marques, Huixian Zhang, Miguel M Santos, Byrappa Venkatesh, and L Filipe C Castro. 2017. 'LXRα and LXRβ nuclear receptors evolved in the common ancestor of gnathostomes', Genome biology and evolution, 9: 222-30. 
Garrido, Diego, Naoki Kabeya, Mónica B Betancor, José A Pérez, N Guadalupe Acosta, Douglas R Tocher, Covadonga Rodríguez, and Óscar Monroig. 2019. 'Functional diversification of teleost Fads2 fatty acyl desaturases occurs independently of the trophic level', Scientific reports, 9: 11199.
Gomes, Fátima, Luciana Watanabe, Sérgio Nozawa, Layanna Oliveira, Jedson Cardoso, João Vianez, Márcio Nunes, Horacio Schneider, and Iracilda Sampaio. 2017. 'Identification and characterization of the expression profile of the microRNAs in the Amazon species Colossoma macropomum by next generation sequencing', Genomics, 109: 67-74.
González-Rovira, Almudena, Gabriel Mourente, Xiaozhong Zheng, Douglas R. Tocher, and Carlos Pendón. 2009. 'Molecular and functional characterization and expression analysis of a Δ6 fatty acyl desaturase cDNA of European Sea Bass (Dicentrarchus labrax L.)', Aquaculture, 298: 90-100.
Gorjão, Renata, Anna Karenina Azevedo-Martins, Hosana Gomes Rodrigues, Fernando Abdulkader, Manoel Arcisio-Miranda, Joaquim Procopio, and Rui Curi. 2009. 'Comparative effects of DHA and EPA on cell function', Pharmacology & therapeutics, 122: 56-64.
Goulding, M., and M. L. Carvalho. 1982. 'Life history and management of the tambaqui (Colossoma macropomum, Characidae); an important Amazonian food fish', Revista Brasileira de Zoologia, 1: 107-33.
Gregory, Melissa K, and Michael J James. 2014. 'Rainbow trout (Oncorhynchus mykiss) Elovl5 and Elovl2 differ in selectivity for elongation of omega-3 docosapentaenoic acid', Biochimica et Biophysica Acta (BBA)-Molecular and Cell Biology of Lipids, 1841: 1656-60.
Gui, JianFang, and ZuoYan Zhu. 2012. 'Molecular basis and genetic improvement of economically important traits in aquaculture animals', Chinese Science Bulletin, 57: 1751-60.
Guillou, H., D. Zadravec, P. G. Martin, and A. Jacobsson. 2010. 'The key roles of elongases and desaturases in mammalian fatty acid metabolism: Insights from transgenic mice', Prog Lipid Res, 49: 186-99.
Hamoy, Igor Guerreiro, Fernanda Witt Cidade, Maria Silvanira Barbosa, Evonnildo Costa Gonçalves, and Sidney Santos. 2010. 'Isolation and characterization of tri and tetranucleotide microsatellite markers for the tambaqui (Colossoma macropomum, Serrasalmidae, Characiformes)', Conservation Genetics Resources, 3: 33-36.
Hastings, N., M. Agaba, D. R. Tocher, M. J. Leaver, J. R. Dick, J. R. Sargent, and A. J. Teale. 2001. 'A vertebrate fatty acid desaturase with Delta 5 and Delta 6 activities', Proc Natl Acad Sci U S A, 98: 14304-9.
Holdt, Susan Løvstad, and Stefan Kraan. 2011. 'Bioactive compounds in seaweed: functional food applications and legislation', Journal of applied phycology, 23: 543-97.
IBGE, Instituto Brasileiro de Geografia e Estatística. 2017. "Produção da pecuária municipal." Edited by IBGE. Av. Franklin Roosevelt, 166 - Centro - 20021-120 - Rio de Janeiro, RJ - Brasil.
Jakobsson, A., R. Westerberg, and A. Jacobsson. 2006. 'Fatty acid elongases in mammals: their regulation and roles in metabolism', Prog Lipid Res, 45: 237-49.
Jakobsson, Tomas, Eckardt Treuter, Jan-Åke Gustafsson, and Knut R Steffensen. 2012. 'Liver X receptor biology and pharmacology: new pathways, challenges and opportunities', Trends in pharmacological sciences, 33: 394-404.
Janaranjani, M, Min-Qian Mah, Meng-Kiat Kuah, Nor Fadhilah, Sher-Ryn Hing, Wan-Yin Han, and Alexander Chong Shu-Chien. 2018. 'Capacity for eicosapentaenoic acid and arachidonic acid biosynthesis in silver barb (Barbonymus gonionotus): Functional characterisation of a Δ6/Δ8/Δ5 Fads2 desaturase and Elovl5 elongase', Aquaculture, 497: 469-86.
Jeon, Tae-Il, and Timothy F Osborne. 2012. 'SREBPs: metabolic integrators in physiology and metabolism', Trends in Endocrinology & Metabolism, 23: 65-72.
Juanchich, Amelie, Philippe Bardou, Olivier Rué, Jean-Charles Gabillard, Christine Gaspin, Julien Bobe, and Yann Guiguen. 2016. 'Characterization of an extensive rainbow trout miRNA transcriptome by next generation sequencing', BMC genomics, 17: 164.
Bell, J. Gordon, Douglas R. Tocher, R. James Henderson, James R. Dick, and Vivian O. Crampton. 2003. 'Altered fatty acid compositions in Atlantic salmon (Salmo salar) fed diets containing linseed and rapeseed oils can be partially restored by a subsequent fish oil finishing diet', American Society for Nutritional Sciences.
Kabeya, Naoki, Miguel M Fonseca, David EK Ferrier, Juan C Navarro, Line K Bay, David S Francis, Douglas R Tocher, L Filipe C Castro, and Óscar Monroig. 2018. 'Genes for de novo biosynthesis of omega-3 polyunsaturated fatty acids are widespread in animals', Science advances, 4: eaar6849.
Kuah, M. K., A. Jaya-Ram, and A. C. Shu-Chien. 2016. 'A fatty acyl desaturase (fads2) with dual Delta6 and Delta5 activities from the freshwater carnivorous striped snakehead Channa striata', Comp Biochem Physiol A Mol Integr Physiol, 201: 146-55.
Leaver, Michael J, Evridiki Boukouvala, Efthimia Antonopoulou, Amalia Diez, Laurence Favre-Krey, M Tariq Ezaz, José M Bautista, Douglas R Tocher, and Grigorios Krey. 2005. 'Three peroxisome proliferator-activated receptor isotypes from each of two species of marine fish', Endocrinology, 146: 3150-62.
Leonard, Amanda E., Suzette L. Pereira, Howard Sprecher, and Yung-Sheng Huang. 2004. 'Elongation of long-chain fatty acids', Progress in Lipid Research, 43: 36-54.

Li, Songlin, Óscar Monroig, Juan Carlos Navarro, Yuhui Yuan, Wei Xu, Kangsen Mai, Douglas R Tocher, and Qinghui Ai. 2017. 'Molecular cloning and functional characterization of a putative Elovl4 gene and its expression in response to dietary fatty acid profiles in orange-spotted grouper Epinephelus coioides', Aquaculture Research, 48: 537-52.
Li, Songlin, Óscar Monroig, Tianjiao Wang, Yuhui Yuan, Juan Carlos Navarro, Francisco Hontoria, Kai Liao, Douglas R Tocher, Kangsen Mai, and Wei Xu. 2017. 'Functional characterization and differential nutritional regulation of putative Elovl5 and Elovl4 elongases in large yellow croaker (Larimichthys crocea)', Scientific reports, 7: 2303.
Li, Y., O. Monroig, L. Zhang, S. Wang, X. Zheng, J. R. Dick, C. You, and D. R. Tocher. 2010. 'Vertebrate fatty acyl desaturase with Δ4 activity', Proceedings of the National Academy of Sciences, 107: 16840-45.
Li, Yang, Zhengyong Wen, Cuihong You, Zhiyong Xie, Douglas R Tocher, Yueling Zhang, Shuqi Wang, and Yuanyou Li. 2020. 'Genome wide identification and functional characterization of two LC-PUFA biosynthesis elongase (elovl8) genes in rabbitfish (Siganus canaliculatus)', Aquaculture: 735127.
Li, Yang, Xiao Liang, Yin Zhang, and Jian Gao. 2016. 'Effects of different dietary soybean oil levels on growth, lipid deposition, tissues fatty acid composition and hepatic lipid metabolism related gene expressions in blunt snout bream (Megalobrama amblycephala) juvenile', Aquaculture, 451: 16-23.
Lien, Sigbjørn, Ben F Koop, Simen R Sandve, Jason R Miller, Matthew P Kent, Torfinn Nome, Torgeir R Hvidsten, Jong S Leong, David R Minkley, and Aleksey Zimin. 2016. 'The Atlantic salmon genome provides insights into rediploidization', Nature, 533: 200.
Liu, ZJ, and JF Cordes. 2004. 'DNA marker technologies and their applications in aquaculture genetics', Aquaculture, 238: 1-37.
Lopes-Marques, Mónica, Naoki Kabeya, Yu Qian, Raquel Ruivo, Miguel M Santos, Byrappa Venkatesh, Douglas R Tocher, L Filipe C Castro, and Óscar Monroig. 2018. 'Retention of fatty acyl desaturase 1 (fads1) in Elopomorpha and Cyclostomata provides novel insights into the evolution of long-chain polyunsaturated fatty acid biosynthesis in vertebrates', BMC evolutionary biology, 18: 157.
Louro, Bruno, Gianluca De Moro, Carlos Garcia, Cymon J Cox, Ana Veríssimo, Stephen J Sabatino, António M Santos, and Adelino VM Canário. 2019. 'A haplotype-resolved draft genome of the European sardine (Sardina pilchardus)', GigaScience, 8: giz059.
Machado, André M, Renato Ferraz, Ricardo do Amaral Ribeiro, Rodrigo Ozório, and L Filipe C Castro. 2019. 'From the Amazon: A comprehensive liver transcriptome dataset of the teleost fish tambaqui, Colossoma macropomum', Data in Brief, 23: 103751.
Machado, André M, Ole K Tørresen, Naoki Kabeya, Alvarina Couto, Bent Petersen, Mónica Felício, Paula F Campos, Elza Fonseca, Narcisa Bandarra, and Mónica Lopes-Marques. 2018. '“Out of the Can”: A Draft Genome Assembly, Liver Transcriptome and Nutrigenomics of the European Sardine, Sardina pilchardus'.
Madureira, T. V., I. Pinheiro, R. de Paula Freire, E. Rocha, L. F. Castro, and R. Urbatzka. 2017. 'Genome specific PPARalphaB duplicates in salmonids and insights into estrogenic regulation in brown trout', Comp Biochem Physiol B Biochem Mol Biol, 208-209: 94-101.
Martínez, José Gregorio, Valéria Nogueira Machado, Susana J. Caballero-Gaitán, Maria da C. Freitas Santos, Rodrigo Maciel Alencar, Maria Doris Escobar L, Tomas Hrbek, and Izeni Pires Farias. 2016. 'SNPs markers for the heavily overfished tambaqui Colossoma macropomum, a Neotropical fish, using next-generation sequencing-based de novo genotyping', Conservation Genetics Resources, 9: 29-33.
Melo, Luiz Antelmo Silva, Antônio Cláudio Uchôa Izel, and Francisco Mendes Rodrigues. 2001. "Criação de Tambaqui (Colossoma macropomum) em Viveiros de Argila/Barragens no Estado do Amazonas." In Embrapa Amazônia Ocidental.

Mohd-Yusof, N. Y., O. Monroig, A. Mohd-Adnan, K. L. Wan, and D. R. Tocher. 2010. 'Investigation of highly unsaturated fatty acid metabolism in the Asian sea bass, Lates calcarifer', Fish Physiol Biochem, 36: 827-43.
Monroig, O., Y. Li, and D. R. Tocher. 2011. 'Delta-8 desaturation activity varies among fatty acyl desaturases of teleost fish: high activity in delta-6 desaturases of marine species', Comp Biochem Physiol B Biochem Mol Biol, 159: 206-13.
Monroig, O., M. Lopes-Marques, J. C. Navarro, F. Hontoria, R. Ruivo, M. M. Santos, B. Venkatesh, D. R. Tocher, and L. F. Castro. 2016. 'Evolutionary functional elaboration of the Elovl2/5 gene family in chordates', Sci Rep, 6: 20510.
Monroig, O., J. Rotllant, J. M. Cerda-Reverter, J. R. Dick, A. Figueras, and D. R. Tocher. 2010. 'Expression and role of Elovl4 elongases in biosynthesis of very long-chain fatty acids during zebrafish Danio rerio early embryonic development', Biochim Biophys Acta, 1801: 1145-54.
Monroig, O., J. Rotllant, E. Sanchez, J. M. Cerda-Reverter, and D. R. Tocher. 2009. 'Expression of long-chain polyunsaturated fatty acid (LC-PUFA) biosynthesis genes during zebrafish Danio rerio early embryogenesis', Biochim Biophys Acta, 1791: 1093-101.
Monroig, Oscar, Douglas R Tocher, and Luís Filipe C Castro. 2018. 'Polyunsaturated fatty acid biosynthesis and metabolism in fish.' in, Polyunsaturated Fatty Acid Metabolism (Elsevier).
Monteiro, Marta, Elisabete Matos, Rafael Ramos, Inês Campos, and Luisa M. P. Valente. 2018. 'A blend of land animal fats can replace up to 75% fish oil without affecting growth and nutrient utilization of European seabass', Aquaculture, 487: 22-31.
Morais, S., F. Castanheira, L. Martinez-Rubio, L. E. Conceicao, and D. R. Tocher. 2012. 'Long chain polyunsaturated fatty acid synthesis in a marine vertebrate: ontogenetic and nutritional regulation of a fatty acyl desaturase with Delta4 activity', Biochim Biophys Acta, 1821: 660-71.
Morais, S., O. Monroig, X. Zheng, M. J. Leaver, and D. R. Tocher. 2009a. 'Highly unsaturated fatty acid synthesis in Atlantic salmon: characterization of Elovl5- and Elovl2-like elongases', Mar Biotechnol (NY), 11: 627-39.
Morais, Sofia, Oscar Monroig, Xiaozhong Zheng, Michael J Leaver, and Douglas R Tocher. 2009b. 'Highly unsaturated fatty acid synthesis in Atlantic salmon: characterization of ELOVL5- and ELOVL2-like elongases', Marine Biotechnology, 11: 627-39.
Mozaffarian, Dariush, and Jason HY Wu. 2012. '(n-3) fatty acids and cardiovascular health: are effects of EPA and DHA shared or complementary?', The Journal of nutrition, 142: 614S-25S.
Mussatto, Solange I, Giuliano Dragone, and Inês Conceicao Roberto. 2006. 'Brewers' spent grain: generation, characteristics and potential applications', Journal of cereal science, 43: 1-14.
Nunes, J. R., S. Liu, F. Pertille, C. A. Perazza, P. M. Villela, V. M. de Almeida-Val, A. W. Hilsdorf, Z. Liu, and L. L. Coutinho. 2017. 'Large-scale SNP discovery and construction of a high-density genetic map of Colossoma macropomum through genotyping-by-sequencing', Sci Rep, 7: 46112.
Oboh, Angela, Mónica B. Betancor, Douglas R. Tocher, and Oscar Monroig. 2016. 'Biosynthesis of long-chain polyunsaturated fatty acids in the African catfish Clarias gariepinus: Molecular cloning and functional characterisation of fatty acyl desaturase (fads2) and elongase (elovl2) cDNAs', Aquaculture, 462: 70-79.
Oboh, Angela, Juan C Navarro, Douglas R Tocher, and Oscar Monroig. 2017a. 'Elongation of very long-chain (> C24) fatty acids in Clarias gariepinus: Cloning, functional characterization and tissue expression of elovl4 elongases', Lipids, 52: 837-48.
OCDE–FAO. 2015. 'Perspectivas agrícolas no Brasil: Desafios da agricultura brasileira 2015-2024', Capítulo 2. Agricultura Brasileira: Perspectivas e Desafios.

Palti, Yniv, Carine Genet, Guangtu Gao, Yuqin Hu, Frank M You, Mekki Boussaha, Caird E Rexroad, and Ming-Cheng Luo. 2012. 'A second generation integrated map of the rainbow trout (Oncorhynchus mykiss) genome: analysis of conserved synteny with model fish genomes', Marine Biotechnology, 14: 343-57.
Peixoto, Maria J, Renato Ferraz, Leonardo J Magnoni, Rui Pereira, José F Gonçalves, Josep Calduch-Giner, Jaume Pérez-Sánchez, and Rodrigo OA Ozório. 2019. 'Protective effects of seaweed supplemented diet on antioxidant and immune responses in European seabass (Dicentrarchus labrax) subjected to bacterial infection', Scientific reports, 9: 1-12.
Ran, Zhaoshou, Jilin Xu, Kai Liao, Óscar Monroig, Juan Carlos Navarro, Angela Oboh, Min Jin, Qicun Zhou, Chengxu Zhou, and Douglas R Tocher. 2019. 'Biosynthesis of long-chain polyunsaturated fatty acids in the razor clam Sinonovacula constricta: Characterization of four fatty acyl elongases and a novel desaturase capacity', Biochimica et Biophysica Acta (BBA)-Molecular and Cell Biology of Lipids, 1864: 1083-90.
Ranganathan, Kannan, and Vaishnavi Sivasankar. 2014. 'MicroRNAs-Biology and clinical applications', Journal of oral and maxillofacial pathology: JOMFP, 18: 229.
Reis, R. E., J. S. Albert, F. Di Dario, M. M. Mincarone, P. Petry, and L. A. Rocha. 2016. 'Fish biodiversity and conservation in South America', J Fish Biol, 89: 12-47.
Richard, Nadege, Sadasivam Kaushik, Laurence Larroquet, Stéphane Panserat, and Genevieve Corraze. 2006. 'Replacing dietary fish oil by vegetable oils has little effect on lipogenesis, lipid transport and tissue lipid uptake in rainbow trout (Oncorhynchus mykiss)', British journal of Nutrition, 96: 299-309.
Robinson-Rechavi, Marc, Oriane Marchand, Héctor Escriva, Pierre-Luc Bardet, Dominique Zelus, Sandrine Hughes, and Vincent Laudet. 2001. 'Euteleost fish genomes are characterized by expansion of gene families', Genome research, 11: 781-88.
Rodrigues, Ana Paula Oeda. 2014. 'Nutrição e alimentação do tambaqui (Colossoma macropomum)', Bol. Inst. Pesca, 40(1): 135-45.
Santos, C. H., G. X. Santana, C. S. Sa Leitao, M. N. Paula-Silva, and V. M. Almeida-Val. 2016. 'Loss of genetic diversity in farmed populations of Colossoma macropomum estimated by microsatellites', Anim Genet, 47: 373-6.
Santos, M. C. F., M. L. Ruffino, and I. P. Farias. 2007. 'High levels of genetic variability and panmixia of the tambaqui Colossoma macropomum (Cuvier, 1816) in the main channel of the Amazon River', Journal of Fish Biology, 71: 33-44.
Santos, Maria Da Conceicao F, Tomas Hrbek, and Izeni P. Farias. 2009. 'Microsatellite markers for the tambaqui (Colossoma macropomum, Serrasalmidae, Characiformes), an economically important keystone species of the Amazon River floodplain', Molecular ecology resources: 874-76.
Sargent, John, Gordon Bell, Lesley McEvoy, Douglas Tocher, and Alicia Estevez. 1999. 'Recent developments in the nutrition of fish', Aquaculture, 177: 191-99.
Sargent, JR, and AGJ Tacon. 1999. 'Development of farmed fish: a nutritionally necessary alternative to meat', Proceedings of the Nutrition Society, 58: 377-83.
Schartl, Manfred, Susanne Kneitz, Helene Volkoff, Mateus Adolfi, Cornelia Schmidt, Petra Fischer, Patrick Minx, Chad Tomlinson, Axel Meyer, and Wesley C Warren. 2019. 'The Piranha Genome Provides Molecular Insight Associated to Its Unique Feeding Behavior', Genome biology and evolution, 11: 2099-106.
Shen, Chang-Hui. 2019. Diagnostic Molecular Biology (Academic Press).
Sherry, David M, Ferenc Deak, Robert E Anderson, and Jennifer L Fessler. 2019. 'Novel Cellular Functions of Very Long Chain-Fatty Acids: Insight from ELOVL4 Mutations', Frontiers in cellular neuroscience, 13: 428.

Simopoulos, A. P. 2000. 'Role of poultry products in enriching the human diet with n-3 PUFA', Poultry Science, 79: 961-70.
Souza, Mario J. F. Thomé de, Marcelo B. Raseira, Mauro Luis Ruffino, Claudemir Oliveira da Silva, Vandick da Silva Batista, Ronaldo Borges Barthem, and Ellen Silva Ramos Amaral. 2004. "Estatística pesqueira do Amazonas e do Pará." In Ministra do Meio Ambiente. Brasília-DF, Brasil: Instituto Brasileiro do Meio Ambiente e dos Recursos Naturais Renováveis.
Stillwell, William, Saame Raza Shaikh, Mustafa Zerouga, Rafat Siddiqui, and Stephen R Wassall. 2005. 'Docosahexaenoic acid affects cell signaling by altering lipid rafts', Reproduction Nutrition Development, 45: 559-79.
Stillwell, William, and Stephen R Wassall. 2003. 'Docosahexaenoic acid: membrane properties of a unique fatty acid', Chemistry and physics of lipids, 126: 1-27.
Tacon, Albert G. J., and Marc Metian. 2008. 'Global overview on the use of fish meal and fish oil in industrially compounded aquafeeds: Trends and future prospects', Aquaculture, 285: 146-58.
Tanomman, S., M. Ketudat-Cairns, A. Jangprai, and S. Boonanuntanasarn. 2013. 'Characterization of fatty acid delta-6 desaturase gene in Nile tilapia and heterogenous expression in Saccharomyces cerevisiae', Comp Biochem Physiol B Biochem Mol Biol, 166: 148-56.
Tine, Mbaye, Heiner Kuhl, Pierre-Alexandre Gagnaire, Bruno Louro, Erick Desmarais, Rute ST Martins, Jochen Hecht, Florian Knaust, Khalid Belkhir, and Sven Klages. 2014. 'European sea bass genome and its variation provide insights into adaptation to euryhalinity and speciation', Nature communications, 5: 5770.
Tocher, Douglas R. 2010. 'Fatty acid requirements in ontogeny of marine and freshwater fish', Aquaculture Research, 41: 717-32.
Turchini, Giovanni M, Bente E Torstensen, and Wing-Keong Ng. 2009. 'Fish oil replacement in finfish nutrition', Reviews in Aquaculture, 1: 10-57.
Urbatzka, R, S Galante-Oliveira, E Rocha, LFC Castro, and I Cunha. 2013. 'Tissue expression of PPAR-alpha isoforms in Scophthalmus maximus and transcriptional response of target genes in the heart after exposure to WY-14643', Fish physiology and biochemistry, 39: 1043-55.
Vicente, Igor ST, Fabiana Elias, and Carlos E Fonseca-Alves. 2014. 'Perspectivas da produção de tilápia do Nilo (Oreochromis niloticus) no Brasil', Revista de Ciências Agrárias, 37: 392-98.
Vieira, Maria Lucia Carneiro, Luciane Santini, Augusto Lima Diniz, and Carla de Freitas Munhoz. 2016. 'Microsatellite markers: what they mean and why they are so useful', Genetics and molecular biology, 39: 312-28.
Vitule, Jean Ricardo Simões, Carolina Arruda Freire, and Daniel Simberloff. 2009. 'Introduction of non-native freshwater fish can certainly be bad', Fish and Fisheries, 10: 98-108.
Walczak, Robert, and Peter Tontonoz. 2002. 'PPARadigms and PPARadoxes: expanding roles for PPARγ in the control of lipid metabolism', Journal of lipid research, 43: 177-86.
Wienholds, Erno, Marco J Koudijs, Freek JM van Eeden, Edwin Cuppen, and Ronald HA Plasterk. 2003. 'The microRNA-producing enzyme Dicer1 is essential for zebrafish development', Nature genetics, 35: 217-18.
Worm, Boris, Edward B Barbier, Nicola Beaumont, J Emmett Duffy, Carl Folke, Benjamin S Halpern, Jeremy BC Jackson, Heike K Lotze, Fiorenza Micheli, and Stephen R Palumbi. 2006. 'Impacts of biodiversity loss on ocean ecosystem services', Science, 314: 787-90.

Xu, Peng, Xiaofeng Zhang, Xumin Wang, Jiongtang Li, Guiming Liu, Youyi Kuang, Jian Xu, Xianhu Zheng, Lufeng Ren, and Guoliang Wang. 2014. 'Genome sequence and genetic diversity of the common carp, Cyprinus carpio', Nature genetics, 46: 1212.
Yoon, Michung. 2009. 'The role of PPARα in lipid metabolism and obesity: focusing on the effects of estrogen on PPARα actions', Pharmacological Research, 60: 151-59.
Zheng, X., I. Seiliez, N. Hastings, D. R. Tocher, S. Panserat, C. A. Dickson, P. Bergot, and A. J. Teale. 2004. 'Characterization and comparison of fatty acyl Delta6 desaturase cDNAs from freshwater and marine teleost fish species', Comp Biochem Physiol B Biochem Mol Biol, 139: 269-79.
Zheng, Xiaozhong, Zhaokun Ding, Youqing Xu, Oscar Monroig, Sofia Morais, and Douglas R. Tocher. 2009. 'Physiological roles of fatty acyl desaturases and elongases in marine fish: Characterisation of cDNAs of fatty acyl Δ6 desaturase and elovl5 elongase of cobia (Rachycentron canadum)', Aquaculture, 290: 122-31.
Zhu, Kecheng, ChaoPing Zhao, DianChang Zhang, Huayang Guo, Nan Zhang, Liang Guo, Baosuo Liu, and Shigui Jiang. 2018. 'The Transcriptional Factor PPARαb Positively Regulates Elovl5 Elongase in Golden Pompano Trachinotus ovatus (Linnaeus 1758)', Frontiers in Physiology, 9: 1340.
Zhu, Ya-Ping, Wei Xue, Jin-Tu Wang, Yu-Mei Wan, Shao-Lin Wang, Peng Xu, Yan Zhang, Jiong-Tang Li, and Xiao-Wen Sun. 2012. 'Identification of common carp (Cyprinus carpio) microRNAs and microRNA-related SNPs', BMC genomics, 13: 413.

Chapter 2

A complete enzymatic capacity for long-chain polyunsaturated fatty acid biosynthesis is present in the Amazonian teleost tambaqui, Colossoma macropomum

2.1. Abstract

In vertebrates, the essential fatty acids (FA) that satisfy the dietary requirements of a given species depend upon its desaturation and elongation capabilities to convert the C18 polyunsaturated fatty acids (PUFA), namely linoleic acid (LA, 18:2n-6) and α-linolenic acid (ALA, 18:3n-3), into the biologically active long-chain (C20–24) polyunsaturated fatty acids (LC-PUFA), including arachidonic acid (ARA, 20:4n-6), eicosapentaenoic acid (EPA, 20:5n-3) and docosahexaenoic acid (DHA, 22:6n-3). Recent studies have established that tambaqui (Colossoma macropomum), an important aquaculture-produced species in Brazil, is a herbivorous fish that can fulfil its essential FA requirements with dietary provision of the C18 PUFA LA and ALA, although the molecular mechanisms underpinning this ability remained unclear. The present study aimed at cloning and functionally characterizing genes encoding key desaturase and elongase enzymes, namely fads2, elovl5 and elovl2, involved in the LC-PUFA biosynthetic pathways in tambaqui. First, a fads2-like desaturase was isolated from tambaqui. When expressed in yeast, the tambaqui Fads2 showed Δ6, Δ5 and Δ8 desaturase capacities within the same enzyme, enabling all desaturation reactions required for ARA, EPA and DHA biosynthesis. Moreover, tambaqui possesses two elongases that are bona fide orthologs of elovl5 and elovl2. Their functional characterization confirmed that they can operate towards a variety of PUFA substrates with chain lengths ranging from 18 to 22 carbons. Overall, our results provide compelling evidence that all the desaturase and elongase activities required to convert LA and ALA into ARA, EPA and DHA are present in tambaqui within the three genes studied herein, i.e. fads2, elovl5 and elovl2.

2.2. Introduction

Long-chain (C20–24) polyunsaturated fatty acids (LC-PUFA) such as arachidonic acid (ARA, 20:4n–6), eicosapentaenoic acid (EPA, 20:5n–3) and docosahexaenoic acid (DHA, 22:6n–3) play many essential roles in growth, development and reproduction of vertebrates (Cook, 1996; Jaya-Ram et al., 2011; Innis, 2008; Vagner and Santigosa, 2011; Calder, 2014). In vertebrates, LC-PUFA can be obtained through the diet or endogenously synthesized from C18 polyunsaturated fatty acid (PUFA) precursors, namely linoleic acid (LA, 18:2n–6) and α-linolenic acid (ALA, 18:3n–3), which are themselves dietary essential compounds since they cannot be biosynthesized de novo by vertebrates (Castro et al., 2016). Biosynthesis of the physiologically important LC-PUFA from the dietary essential C18 PUFA precursors requires the coordinated action of both fatty acid desaturases (Fads) and elongation of very long chain fatty acids (Elovl) proteins, membrane-bound enzymes localized in the endoplasmic reticulum (Guillou et al., 2010).

The characterization of Fads and Elovl involved in LC-PUFA biosynthesis in fish has been extensively investigated (Castro et al., 2016; Kabeya et al., 2018; Monroig et al., 2018). This research has been mainly driven by the need to understand the endogenous ability of farmed species to utilize commonly used ingredients such as vegetable oils (VO), which are devoid of LC-PUFA but often rich in their biosynthetic precursors LA and ALA (Turchini et al., 2009; Tocher, 2010). These studies revealed that the LC-PUFA biosynthetic capability varies remarkably among teleosts, depending upon the species-specific fads and elovl gene complement and function (Castro et al., 2016; Monroig et al., 2016, 2018).

Fads enzymes introduce a double bond (unsaturation) at a specific position of the fatty acid (FA) chain and are termed “∆x desaturase”, with “x” denoting the position of the carbon, counted from the carboxylic group, at which the new double bond is introduced (Leonard et al., 2004). Vertebrate Fads are characterized by a cytochrome b5-like domain containing a heme-binding motif (HPGG), three histidine boxes consisting of HXXXH, HXXHH and QXXHH, and several membrane-spanning domains (Sperling et al., 2003). Unlike most vertebrates, which have Fads1 and Fads2 with ∆5 and ∆6 desaturase activities, respectively (Guillou et al., 2010), teleost fish appear to have lost the fads1 gene, presenting exclusively fads2 genes (Castro et al., 2012, 2016), with the exception of the Japanese eel Anguilla japonica, a basal teleost possessing a Fads1 with ∆5 activity (Lopes-Marques et al., 2018). Interestingly, teleost Fads2 are functionally more diverse than their non-teleost counterparts (Fonseca-Madrigal et al., 2014; Castro et al., 2016).
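The conserved Fads motifs named above (HPGG and the HXXXH, HXXHH and QXXHH histidine boxes) can be located in a protein sequence with simple pattern matching. The following is a minimal sketch only, assuming that "X" stands for any amino acid; the function name `find_fads_motifs` and the toy sequence are hypothetical illustrations and do not correspond to any tambaqui sequence or to the procedure used in this study.

```python
import re

# Motif patterns taken from the text; "." plays the role of X (any residue).
FADS_MOTIFS = {
    "heme-binding (HPGG)": r"HPGG",
    "histidine box 1 (HXXXH)": r"H.{3}H",
    "histidine box 2 (HXXHH)": r"H.{2}HH",
    "histidine box 3 (QXXHH)": r"Q.{2}HH",
}


def find_fads_motifs(protein: str) -> dict:
    """Return the 1-based start positions of every occurrence of each motif."""
    seq = protein.upper()
    hits = {}
    for name, pattern in FADS_MOTIFS.items():
        # A lookahead is used so that overlapping occurrences are all reported.
        hits[name] = [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]
    return hits


# Hypothetical toy sequence, NOT a real Fads2 protein.
toy_sequence = "MAAHPGGKLLHDAGHAAAAHRKHHGGGQIEHH"
for motif, positions in find_fads_motifs(toy_sequence).items():
    print(motif, positions)
```

Because the histidine-box patterns are short, chance matches are expected in real proteins; such a scan can only flag candidate positions, which would then need to be checked against an alignment with characterized desaturases.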
While the majority of teleost Fads2 enzymes functionally characterized to date present Δ6 desaturase activity (Zheng et al., 2004, 2009; González-Rovira et al., 2009; Mohd-Yusof et al., 2010; Monroig et al., 2013; Monroig et al., 2010a; Kabeya et al., 2018), an increasing number of teleost Fads2 have alternative desaturase substrate specificities, including both Δ6 and Δ5 activities within the same enzyme (Hastings et al., 2001; Tanomman et al., 2013; Kuah et al., 2016; Oboh et al., 2016), Δ5 (Abdul Hamid et al., 2016) and Δ4, the latter often exhibiting some residual activity as Δ5 desaturase (Li et al., 2010; Morais et al., 2012; Fonseca-Madrigal et al., 2014; Oboh et al., 2017). Consistent with the mammalian FADS2 (Park et al., 2009), fish Fads2 typically possess Δ8 desaturation capacity (Monroig et al., 2011a), enabling the “Δ8 pathway”, an alternative route to the more prominent “Δ6 pathway” leading to the biosynthesis of 20:3n–6 and 20:4n–3, immediate precursors of ARA and EPA, respectively (Castro et al., 2016).

Regarding Elovl, these enzymes catalyze the condensation of malonyl-CoA with an activated FA, a rate-limiting reaction in the elongation pathway that extends the pre-existing FA by two carbons (Jakobsson et al., 2006). Several distinct Elovl (termed Elovl 1–7) have been described in vertebrates (Jakobsson et al., 2006; Guillou et al., 2010), each one having different substrate preferences. Interestingly, Elovl2, Elovl4 and Elovl5 can elongate PUFA substrates and thus play major roles in LC-PUFA biosynthesis (Jakobsson et al., 2006; Guillou et al., 2010). In fish, the majority of Elovl5 orthologs investigated to date showed the ability to efficiently elongate C18 and C20 PUFA substrates, with remarkably lower efficiency generally exhibited towards C22 PUFA (Castro et al., 2016). Vertebrate Elovl2 can elongate C20 and C22 PUFA, but it has little ability to elongate C18 PUFA. Loss of Elovl2 in recently emerged teleost lineages has been postulated to be partly compensated by Elovl4, an enzyme that, in addition to its role in the biosynthesis of very long-chain (>C24) PUFA (Monroig et al., 2010b), can operate towards C22 PUFA like Elovl2, thus playing a role in LC-PUFA biosynthesis (Monroig et al., 2011b; Kabeya et al., 2015).

Overall, it has become clear that the endogenous capacity for LC-PUFA production in fish species cannot be anticipated from their genetic repertoire of fads and elovl. Rather, functional analyses of the gene products (i.e. enzymes) are required to fully understand the physiological capacities that farmed fish species have to biosynthesize ARA, EPA and DHA from C18 precursors contained in VO.

Colossoma macropomum, popularly known as “tambaqui”, is a freshwater fish and is currently the main native species farmed in Brazil, representing 28% of total national production. Importantly, tambaqui displays high growth rates, adaptation to intensive culture systems and good quality (Guimarães and Martins, 2015). The adult fish are predominantly herbivorous, feeding abundantly on fruits and seeds during flooding episodes (Almeida et al., 2008).
Such natural adaptation to low LC-PUFA diets suggests that tambaqui possesses an active LC-PUFA biosynthetic capacity that enables it to thrive with little input of these essential nutrients. To test this hypothesis, the present study aimed at cloning and functionally characterizing the key desaturase and elongase genes, namely fads2, elovl5 and elovl2, involved in the LC-PUFA biosynthetic pathways in this species.
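The shorthand used throughout this chapter (e.g. 18:2n–6 for 18 carbons, 2 double bonds, n–6 series) lends itself to a small worked example of the “Δ6 pathway” and “Δ8 pathway” described in this Introduction. The sketch below is illustrative only: the class `FattyAcid`, the step tuples and `run_pathway` are hypothetical names, and the model tracks only carbon and double-bond counts, not the position at which each desaturase acts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FattyAcid:
    carbons: int
    double_bonds: int
    series: int  # 3 for n-3, 6 for n-6

    @classmethod
    def parse(cls, text: str) -> "FattyAcid":
        # Accept a hyphen, an en dash or a minus sign in "n-6".
        normalized = text.replace("–", "-").replace("−", "-")
        chain, series = normalized.split("n-")
        carbons, double_bonds = chain.split(":")
        return cls(int(carbons), int(double_bonds), int(series))

    def desaturate(self) -> "FattyAcid":
        # A desaturase adds one double bond; chain length is unchanged.
        return FattyAcid(self.carbons, self.double_bonds + 1, self.series)

    def elongate(self) -> "FattyAcid":
        # An elongase extends the chain by two carbons.
        return FattyAcid(self.carbons + 2, self.double_bonds, self.series)

    def __str__(self) -> str:
        return f"{self.carbons}:{self.double_bonds}n-{self.series}"


# Δ6 pathway: Δ6 desaturation, elongation, Δ5 desaturation.
DELTA6_PATHWAY = ("desaturate", "elongate", "desaturate")
# Δ8 pathway: elongation, Δ8 desaturation, Δ5 desaturation.
DELTA8_PATHWAY = ("elongate", "desaturate", "desaturate")


def run_pathway(substrate: str, steps) -> list:
    """Apply each step in order and return all intermediates as shorthand."""
    fatty_acid = FattyAcid.parse(substrate)
    trail = [str(fatty_acid)]
    for step in steps:
        fatty_acid = getattr(fatty_acid, step)()
        trail.append(str(fatty_acid))
    return trail


print(" -> ".join(run_pathway("18:2n-6", DELTA6_PATHWAY)))  # LA to ARA
print(" -> ".join(run_pathway("18:3n-3", DELTA8_PATHWAY)))  # ALA to EPA
```

Both routes converge on the same Δ5 substrates (20:3n–6 and 20:4n–3), which is why the Δ8 pathway is regarded as an alternative route to the immediate precursors of ARA and EPA.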

2.3. Materials and methods

RNA sampling and cDNA synthesis

Liver samples from an adult tambaqui were preserved in an RNA stabilization buffer (3.6 M ammonium sulphate, 18 mM sodium citrate, 15 mM EDTA, pH 5.2) and stored at −80 °C prior to RNA extraction. RNA extraction was performed using the Illustra RNA spin kit (GE Healthcare, Little Chalfont, UK) according to the manufacturer's guidelines, including an on-column DNase treatment. Final RNA samples were run on a 1% agarose TAE gel stained with GelRed™ nucleic acid stain (Biotium, Hayward, CA, USA) to evaluate integrity, while RNA quantification was performed using a BioTek® microplate reader to determine sample absorbance. Subsequently, cDNA synthesis was performed from 1 μg of total RNA using the iScript cDNA Synthesis Kit (Bio-Rad, Hercules, CA, USA) according to the manufacturer's recommendations. Additionally, 5′ and 3′ Rapid Amplification of cDNA Ends (RACE) cDNA was prepared using the SMARTer® RACE 5′/3′ Kit (Clontech, Mountain View, CA, USA).
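The arithmetic behind absorbance-based quantification and the 1 μg RNA input can be sketched as follows. This is a generic illustration, not the instrument's own calculation: it assumes the commonly used conversion of one A260 unit to roughly 40 ng/μl of RNA at a 1 cm path length, and the function names, dilution factor and path-length argument are hypothetical. Microplate readers usually have a shorter path length, so the reading must be corrected accordingly.

```python
# Assumption: 1 A260 unit corresponds to about 40 ng/ul RNA at 1 cm path length.
RNA_NG_PER_UL_PER_A260 = 40.0


def rna_concentration_ng_per_ul(a260: float,
                                dilution_factor: float = 1.0,
                                path_length_cm: float = 1.0) -> float:
    """Estimate RNA concentration from absorbance at 260 nm."""
    if path_length_cm <= 0:
        raise ValueError("path length must be positive")
    # Normalise the reading to a 1 cm path before applying the conversion.
    return (a260 / path_length_cm) * RNA_NG_PER_UL_PER_A260 * dilution_factor


def volume_for_mass_ul(mass_ng: float, concentration_ng_per_ul: float) -> float:
    """Volume of sample (ul) that contains the requested mass of RNA."""
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return mass_ng / concentration_ng_per_ul


# Hypothetical reading: A260 = 0.5 on a 10-fold diluted sample.
concentration = rna_concentration_ng_per_ul(0.5, dilution_factor=10)
volume = volume_for_mass_ul(1000, concentration)  # 1 ug = 1000 ng
print(f"{concentration:.0f} ng/ul; {volume:.1f} ul needed for 1 ug of RNA")
```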

Isolation of ORF fads and elovl genes from tambaqui

For amplification of the first fragments of the target genes fads2, elovl5 and elovl2, polymerase chain reactions (PCR) were conducted using degenerate primers designed on conserved regions of teleost orthologs of fads2, elovl5 and elovl2 using CODEHOP (http://blocks.fhcrc.org/codehop.html) (Rose et al., 2003). Amplifications of partial fragments of the target genes were carried out using 1 μl of liver cDNA, 500 nM of sense and antisense primers, and Flash High-Fidelity PCR Master Mix (Thermo Fisher Scientific, Waltham, MA, USA), in a final volume of 20 μl (see Supplementary Table 2.1 for primers and PCR conditions). Resulting PCR products were analyzed on a 1% agarose gel, and fragments of the expected size were purified with NZYGelpure (NZYTech, Lisbon, Portugal) and confirmed by DNA sequencing (GATC Biotech, Constance, Germany). To obtain full-length cDNA sequences, RACE PCRs were performed. Gene-specific primers for RACE PCR were designed using the previously isolated first fragments, with adapter-specific primers being provided with the kit (SMARTer® RACE 5′/3′ Kit, Clontech, Mountain View, CA, USA). The RACE PCRs were performed with Flash High-Fidelity PCR Master Mix (Thermo Fisher Scientific, Waltham, MA, USA) using as template 5′ and 3′ RACE cDNA prepared from liver RNA. Each RACE PCR contained 1 μl of gene-specific primer combined with 2 μl of Universal primer mix (Clontech, Mountain View, CA, USA) and the corresponding RACE cDNA template (see Supplementary Table 2.1 for primers and PCR conditions). Resulting 5′ and 3′ RACE PCR products were confirmed by sequencing (GATC Biotech, Constance, Germany) and assembled with the first fragments to obtain open reading frame (ORF) sequences for each target gene (Geneious V7.1.9).
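To illustrate what a degenerate primer represents, the sketch below expands a primer written with IUPAC ambiguity codes into the pool of concrete oligonucleotides it stands for and counts its degeneracy. It is not a reimplementation of CODEHOP, which designs hybrid primers with a degenerate 3′ core and a consensus 5′ clamp; the primer shown is a hypothetical example and is not taken from Supplementary Table 2.1.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def expand(primer: str) -> list:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combination) for combination in product(*choices)]


# Hypothetical degenerate primer (Y = C/T, N = any base).
example_primer = "GAYCAYGTNAC"
print(degeneracy(example_primer))   # size of the oligonucleotide pool
print(expand(example_primer)[:4])   # first few members of the pool
```

Keeping the degeneracy low matters in practice because each individual oligonucleotide in the pool is present at only a fraction of the nominal primer concentration.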

Sequence collection and phylogenetic analysis

Blastp searches of the public NCBI database were performed using human Fads1, Fads2, Elovl2, Elovl5 and Elovl4 as queries. Fads and Elovl amino acid sequences were collected from the major vertebrate lineages, namely Sarcopterygii (Mammalia, Reptilia, Aves, Amphibia, and Coelacanthiformes), Chondrichthyes (Elasmobranchii and Chimaera) and Actinopterygii (Elopomorpha, Osteoglossomorpha, Otomorpha, and Euteleosteomorpha), as well as from invertebrate groups including Hemichordata (Saccoglossus kowalevskii), Cephalochordata (Branchiostoma lanceolatum), Urochordata (Ciona intestinalis) and Cephalopoda (Octopus vulgaris) (see Supplementary Table 2.2 for accession numbers). Next, two phylogenetic trees were constructed, one containing Fads and one containing Elovl sequences, with alignments for Fads and Elovl performed independently in MAFFT (Katoh and Toh, 2008) with the L-INS-i method. The resulting sequence alignments were inspected and stripped of columns containing 90% gaps, leaving a total of 59 sequences with 467 positions for phylogenetic analysis for Fads, and 53 sequences with 318 positions for Elovl. Final sequence alignments were submitted for phylogenetic analysis to PhyML V3.0 (Guindon et al., 2010). For each analysis, the evolutionary model was determined using the smart model selection (SMS) option, resulting in JTT+G+I for both Fads and Elovl, and branch support in both runs was calculated using aBayes. The resulting trees were visualized and analysed in FigTree V1.3.1, available at http://tree.bio.ed.ac.uk/software/figtree/.
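The gap-stripping step applied to the alignments can be sketched as a column filter. This is a minimal illustration under stated assumptions: the text does not name the tool used, the interpretation of the threshold (columns in which the fraction of gaps reaches the cut-off are removed) is an assumption, and the function name `strip_gappy_columns` and the toy alignment are hypothetical.

```python
def strip_gappy_columns(alignment: dict, max_gap_fraction: float = 0.9) -> dict:
    """Drop alignment columns whose gap fraction is >= max_gap_fraction.

    `alignment` maps sequence names to aligned sequences of equal length,
    with "-" marking gaps.
    """
    names = list(alignment)
    rows = [alignment[name] for name in names]
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned sequences must have the same length")

    kept_columns = [
        i for i in range(length)
        if sum(row[i] == "-" for row in rows) / len(rows) < max_gap_fraction
    ]
    return {
        name: "".join(row[i] for i in kept_columns)
        for name, row in zip(names, rows)
    }


# Hypothetical toy alignment; a 0.75 cut-off is used so the effect is
# visible with only four sequences.
toy_alignment = {
    "seqA": "MK-LH",
    "seqB": "MR-LH",
    "seqC": "M--LH",
    "seqD": "MKALH",
}
for name, sequence in strip_gappy_columns(toy_alignment, 0.75).items():
    print(name, sequence)
```

In the toy example only the third column (gaps in three of four sequences) is removed, while the second column, with a single gap, is retained.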

Functional characterization of tambaqui fads2, elovl5 and elovl2

The functions of the tambaqui Fads2, Elovl5 and Elovl2 enzymes were characterized individually by heterologous expression in the yeast Saccharomyces cerevisiae. First, the ORF sequences of the tambaqui fads2, elovl5 and elovl2 were isolated and cloned into the yeast expression vector pYES2 (Thermo Fisher Scientific, Waltham, MA, USA) (Lopes-Marques et al., 2017). The ORFs of the target genes were isolated by PCR (Flash High-Fidelity PCR Master Mix, Thermo Fisher Scientific, Waltham, MA, USA) using C. macropomum liver cDNA as template and primers containing appropriate restriction sites for further cloning into pYES2 (see Supplementary Table 2.1 for primers and PCR conditions). PCR products and pYES2 were subsequently digested with the appropriate restriction enzymes (Promega, Madison, WI, USA) and ligated to produce the plasmid constructs pYES2-fads2, pYES2-elovl5 and pYES2-elovl2. The insert sequence of each plasmid construct was confirmed by sequencing (GATC Biotech, Constance, Germany) before the constructs were used to transform competent yeast cells.

One colony of transgenic yeast carrying either pYES2-fads2, pYES2-elovl5 or pYES2-elovl2 was grown in S. cerevisiae minimal medium lacking uracil (SCMM-uracil) to produce a bulk culture, which was subsequently diluted to an OD600 of 0.4 in each of the Erlenmeyer flasks used to test the potential FA substrates. For Fads2, transgenic yeast were grown in the presence of Δ6 (18:2n–6 and 18:3n–3), Δ8 (20:2n–6 and 20:3n–3), Δ5 (20:3n–6 and 20:4n–3) and Δ4 (22:4n–6 and 22:5n–3) desaturase substrates. In addition, to determine the desaturase activity towards 24:5n–3, transgenic yeast co-expressing zebrafish elovl2 and the tambaqui fads2 were grown in the presence of 22:5n–3, as described previously (Oboh et al., 2016). The conversion of 18:3n–3 in the co-expression yeast system was also determined as a positive control. The tambaqui elongases Elovl5 and Elovl2 were functionally characterized by growing transgenic yeast in medium supplemented with one of the following PUFA substrates: 18:2n–6, 18:3n–3, 18:3n–6, 18:4n–3, 20:4n–6, 20:5n–3, 22:4n–6 and 22:5n–3. Given that PUFA uptake by yeast decreases with increasing chain length, PUFA substrates were added to the yeast cultures in both desaturase and elongase assays at final concentrations of 0.5 mM (C18), 0.75 mM (C20) and 1.0 mM (C22) (Lopes-Marques et al., 2017). Additionally, yeast transformed with empty pYES2 were grown in the presence of PUFA substrates as control treatments. After 2 days of culture at 30 °C, yeast were harvested, washed, and homogenized in chloroform/methanol (2:1, v/v) containing 0.01% (w/v) butylated hydroxytoluene (BHT), and kept at −20 °C until further analysis. All PUFA substrates, except stearidonic acid (18:4n–3), were from Nu-Chek Prep, Inc. (Elysian, MN, USA). Stearidonic acid and the chemicals used to prepare SCMM-uracil were from Sigma-Aldrich (Darmstadt, Germany), except for the bacteriological agar, which was obtained from Oxoid Ltd. (Hants, UK).
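The chain-length rule for substrate dosing can be written as a small lookup. This is only an illustrative sketch: the concentrations are those stated above, while the function names and the parsing of the fatty acid shorthand are assumptions introduced for the example.

```python
# Final substrate concentrations (mM) by carbon chain length, as stated in the text.
FINAL_CONCENTRATION_MM = {18: 0.5, 20: 0.75, 22: 1.0}


def chain_length(fatty_acid):
    """Return the carbon number from shorthand such as '20:5n-3'."""
    return int(fatty_acid.split(":")[0])


def final_concentration_mm(fatty_acid):
    """Look up the final concentration used for a given PUFA substrate."""
    carbons = chain_length(fatty_acid)
    if carbons not in FINAL_CONCENTRATION_MM:
        raise ValueError(f"no concentration defined for C{carbons} substrates")
    return FINAL_CONCENTRATION_MM[carbons]


# Substrates used in the elongase assays (ASCII hyphens used for convenience).
elongase_substrates = ["18:2n-6", "18:3n-3", "18:3n-6", "18:4n-3",
                       "20:4n-6", "20:5n-3", "22:4n-6", "22:5n-3"]
plan = {fa: final_concentration_mm(fa) for fa in elongase_substrates}
```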

Fatty acid analysis of yeast

Total lipid extracted from yeast (Folch et al., 1957) was used to prepare fatty acyl methyl esters (FAME), which were analyzed by gas chromatography as previously described (Hastings et al., 2001; Li et al., 2010). FAME were identified based on retention times using a Fisons GC-8160 gas chromatograph (Thermo Fisher Scientific, Waltham, MA, USA) equipped with a 60 m × 0.32 mm i.d. × 0.25 μm ZB-wax column (Phenomenex, Macclesfield, UK) and a flame ionization detector. The desaturation or elongation conversion efficiencies from exogenously added PUFA substrates were calculated as the proportion of substrate FA converted to desaturated or elongated products: [all product areas/(all product areas + substrate area)] × 100. For the tambaqui elongases, “all product areas” include those of the initial elongation products as well as those of subsequent stepwise elongations (Monroig et al., 2012). Similarly, the tambaqui Fads2 exhibited multifunctional abilities, and thus the conversions of Δ8 substrates (20:2n-6 and 20:3n-3) include stepwise Δ5 desaturase reactions (Fonseca-Madrigal et al., 2014).
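As an illustration of the conversion formula, the following Python sketch computes the percentage conversion from chromatogram peak areas. The function name and the peak areas are hypothetical; only the formula itself is taken from the text.

```python
def conversion_percent(product_areas, substrate_area):
    """Percentage of substrate converted, from GC peak areas.

    `product_areas` should list every product peak, including products of
    subsequent stepwise reactions, as described in the text.
    """
    total_products = sum(product_areas)
    total = total_products + substrate_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * total_products / total


# Hypothetical peak areas in arbitrary units, not measured data.
single_step = conversion_percent([25.0], 75.0)       # one product peak
stepwise = conversion_percent([30.0, 10.0], 60.0)    # initial + stepwise product
```

With these made-up areas, the single-step case gives 25% conversion and the stepwise case gives 40%, because both product peaks count towards the numerator.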

2.4. Results

Phylogenetic analysis of the tambaqui fads and elovl sequences

To determine the orthology of the sequences isolated from tambaqui, two independent phylogenetic trees, one for Fads (Figure 2.1) and one for Elovl (Figure 2.2), were constructed. For Fads, the resulting phylogenetic tree showed two well-supported vertebrate clades containing either Fads1 or Fads2 sequences, both outgrouped by invertebrate Fads (Figure 2.1). The Fads isolated from tambaqui was placed within the Fads2 clade, together with orthologs from other teleost species such as Danio rerio and Ictalurus punctatus. This position indicates that the newly cloned tambaqui fads is a bona fide Fads2 desaturase, orthologous to previously characterized Fads2 from other vertebrate species (Figure 2.1). In the Elovl analysis, the phylogenetic tree revealed that vertebrate Elovl4 sequences, together with the invertebrate elongase from C. intestinalis, formed an independent monophyletic group. The remaining vertebrate Elovl2 and Elovl5 sequences constituted two well-supported independent clades, with the invertebrate B. lanceolatum Elovl2/5 sequence as an outgroup (Monroig et al., 2016). Of the two elongases isolated from tambaqui, one clustered within the Elovl2 clade and the other within the Elovl5 clade, in both cases alongside characterized Elovl sequences from other teleosts and vertebrate species. These results indicate that the isolated tambaqui elongases are orthologs of elovl2 and elovl5 (Figure 2.2).

Figure 2.1. Maximum likelihood phylogenetic analysis of Fads amino acid sequences rooted with the invertebrate clade. Numbers at nodes indicate branch support in posterior probabilities calculated using aBayes. C. macropomum Fads2 studied herein is highlighted. Accession numbers for all Fads sequences are available in Supplementary Table 2.2.

Figure 2.2. Maximum likelihood phylogenetic analysis of Elovl amino acid sequences rooted with the Elovl4 sequences. Numbers at nodes indicate branch support in posterior probabilities calculated using aBayes. C. macropomum sequences (Elovl2 and Elovl5) studied herein are highlighted. Accession numbers for all Elovl sequences are available in Supplementary Table 2.3.

Functional characterization of C. macropomum fads2, elovl5 and elovl2 in S. cerevisiae

Control yeast transformed with the empty vector pYES2 did not show activity towards any of the substrates tested (data not shown). Transgenic yeast expressing the tambaqui fads2 exhibited Δ6 desaturase activity, since 18:2n–6 and 18:3n–3 were converted to 18:3n–6 (28.3% conversion) and 18:4n–3 (63.5% conversion), respectively (Table 2.1). The tambaqui Fads2 also showed Δ5 desaturase activity, since transgenic yeast converted 20:3n–6 and 20:4n–3 to 20:4n–6 (14.8% conversion) and 20:5n–3 (17.8% conversion), respectively (Table 2.1). These results indicate that the tambaqui Fads2 is a dual Δ6Δ5 desaturase. Moreover, this enzyme exhibited Δ8 desaturase activity, with exogenously added 20:2n–6 and 20:3n–3 being desaturated to 20:3n–6 and 20:4n–3, respectively (Table 2.1). In addition, the tambaqui Fads2 showed Δ6 desaturation capacity towards 24:5n–3, which was desaturated to 24:6n–3. No Δ4 desaturation of 22:4n–6 or 22:5n–3 was detected (Table 2.1).

Table 2.1. Functional characterization of the tambaqui (C. macropomum) Fads2 in yeast S. cerevisiae. Yeast were transformed with the C. macropomum fads2 ORF and grown in the presence of Δ6 (18:2n-6 and 18:3n-3), Δ8 (20:2n-6 and 20:3n-3), Δ5 (20:3n-6 and 20:4n-3), and Δ4 (22:4n-6 and 22:5n-3) fatty acid (FA) substrates. Desaturation of 24:5n-3 was tested by co-expressing both D. rerio elovl2 and C. macropomum fads2 in S. cerevisiae (Oboh et al., 2017). Conversions in both expression systems were calculated according to the formula [all product areas/(all product areas + substrate area)] × 100.

FA substrate    FA product    Conversion (%)    Activity
18:2n–6         18:3n–6       28.3              Δ6
18:3n–3         18:4n–3       63.5              Δ6
20:2n–6         20:3n–6       2.8a              Δ8
20:3n–3         20:4n–3       4.5a              Δ8
20:3n–6         20:4n–6       14.8              Δ5
20:4n–3         20:5n–3       17.8              Δ5
22:4n–6         22:5n–6       Nd                Δ4
22:5n–3         22:6n–3       Nd                Δ4
Expressed with D. rerio elovl2
18:3n–3         18:4n–3       20.5              Δ6
24:5n–3         24:6n–3       21.9              Δ6

a Conversions of Δ8 substrates (20:2n-6 and 20:3n-3) include stepwise reactions due to multifunctional desaturation abilities. Thus, the conversions of the C. macropomum Fads2 on 20:2n-6 and 20:3n-3 include the Δ8 desaturation toward 20:3n-6 and 20:4n-3, respectively, and their subsequent Δ5 desaturations to 20:4n-6 and 20:5n-3, respectively.
Nd, not detected.

Functional assays of the tambaqui elongases showed that yeast expressing Elovl5 were able to elongate all C18 (18:2n-6, 18:3n-3, 18:3n-6 and 18:4n-3) and C20 (20:4n-6 and 20:5n-3) PUFA substrates, yet no elongation was detected for the C22 substrates (22:4n-6 and 22:5n-3) (Table 2.2). In contrast, yeast expressing the tambaqui Elovl2 elongated all PUFA substrates tested, including C18, C20 and C22, with C20 substrates being preferred (Table 2.2).

Table 2.2. Functional characterization of the tambaqui (C. macropomum) Elovl5 and Elovl2 elongases in yeast S. cerevisiae. Yeast were transformed with the C. macropomum elovl5 and elovl2 ORF sequences and grown in the presence of exogenously added fatty acid (FA) substrates. Conversions by Elovl5 and Elovl2 were calculated according to the formula [all product areas/(all product areas + substrate area)] × 100.

                              Conversion (%)
FA substrate    FA product    Elovl5    Elovl2    Activity
18:2n–6         20:2n–6       29.8      1.5       C18→C20
18:3n–3         20:3n–3       47.7      7.6       C18→C20
18:3n–6         20:3n–6       79.8      13.5      C18→C20
18:4n–3         20:4n–3       79.0      26.3      C18→C20
20:4n–6         22:4n–6       33.4      30.9      C20→C22
20:5n–3         22:5n–3       53.1      66.5      C20→C22
22:4n–6         24:4n–6       Nd        16.4      C22→C24
22:5n–3         24:5n–3       Nd        21.8      C22→C24

Nd, not detected.
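The chain-length preferences in Table 2.2 can be summarised with a short Python sketch. Averaging conversions within each chain-length class is a crude summary introduced here purely for illustration; it is not an analysis performed in this study, and the function name is hypothetical. The values are copied from Table 2.2, with `None` standing for "Nd".

```python
from statistics import mean

# (substrate, Elovl5 %, Elovl2 %) copied from Table 2.2; None stands for "Nd".
TABLE_2_2 = [
    ("18:2n-6", 29.8, 1.5),
    ("18:3n-3", 47.7, 7.6),
    ("18:3n-6", 79.8, 13.5),
    ("18:4n-3", 79.0, 26.3),
    ("20:4n-6", 33.4, 30.9),
    ("20:5n-3", 53.1, 66.5),
    ("22:4n-6", None, 16.4),
    ("22:5n-3", None, 21.8),
]


def mean_by_chain_length(rows, column):
    """Average the detected conversions per substrate chain length.

    `column` is 1 for Elovl5 and 2 for Elovl2; undetected values are skipped,
    so a chain length with no detected conversion is absent from the result.
    """
    grouped = {}
    for row in rows:
        carbons = int(row[0].split(":")[0])
        value = row[column]
        if value is not None:
            grouped.setdefault(carbons, []).append(value)
    return {carbons: mean(values) for carbons, values in grouped.items()}


elovl5_means = mean_by_chain_length(TABLE_2_2, 1)
elovl2_means = mean_by_chain_length(TABLE_2_2, 2)
preferred_elovl2 = max(elovl2_means, key=elovl2_means.get)
```

Under this simple averaging, Elovl2 shows its highest mean conversion on C20 substrates, whereas Elovl5 has no entry for C22 because no elongation was detected.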

2.5. Discussion

The essential FA for a given species depend on its desaturation and elongation capacity to endogenously convert the dietary C18 PUFA LA and ALA into the biologically active LC-PUFA, namely ARA, EPA and DHA (Tocher, 2015). Recent investigations have shown that tambaqui juveniles fed diets containing mixtures of VO (corn and linseed oils) exhibited no difference in growth performance compared with fish fed a fish oil (control) diet (Pereira et al., 2017; Paulino et al., 2018). Consequently, it was established that tambaqui can fulfil its essential FA requirements with a dietary provision of C18 PUFA (Paulino et al., 2018). The present study provides molecular evidence that supports these previous findings and demonstrates that all the desaturase and elongase activities required to convert C18 PUFA into ARA, EPA and DHA exist in tambaqui within the three genes studied herein, i.e. fads2, elovl5 and elovl2 (Figure 2.3).

The phylogenetic analysis confirmed that the fads-like desaturase isolated from tambaqui is an ortholog of fads2. Additionally, sequence alignment of the tambaqui desaturase with Fads2 from other teleost species showed a high degree of conservation of the classic desaturase motifs, namely the histidine boxes and the heme-binding region (Supplementary Figure 2.1). Recently, Lopes-Marques et al. (2018) demonstrated that, along with the previously characterized Fads2 (Wang et al., 2014), Anguilla japonica possesses a Fads1 with Δ5 activity similarly to and mammals (Marquardt et al., 2000; Castro et al., 2012), this being the first report of a non-fads2 desaturase in teleosts. Nevertheless, the presence of fads1 among teleosts appears to be anecdotal and restricted to certain post-3R lineages such as Elopomorpha (Lopes-Marques et al., 2018), since the vast majority of teleosts possess fads2 as the sole fads desaturase in their genomes (Castro et al., 2016). Indeed, loss of fads1 has often been postulated as a major evolutionary driver partly explaining the unique functional plasticity of teleost Fads2 (Fonseca-Madrigal et al., 2014; Castro et al., 2016). In agreement, the functional characterization of the tambaqui Fads2 demonstrated that it is a multifunctional desaturase that, in addition to the expected Δ6 activity, also exhibits Δ5 desaturase capacity. Therefore, the tambaqui Fads2 can be categorized as a bifunctional or dual Δ6Δ5 desaturase, an enzyme type previously described in D. rerio (Hastings et al., 2001), Siganus canaliculatus (Li et al., 2010), Oreochromis niloticus (Tanomman et al., 2013), Channa striata (Kuah et al., 2016) and Clarias gariepinus (Oboh et al., 2016). Moreover, the tambaqui Fads2 also showed Δ8 desaturase activity, as described in a wide range of teleosts (Monroig et al., 2011a; Oboh et al., 2016; Lopes-Marques et al., 2017; Kabeya et al., 2018). With the exception of the marine herbivore S. canaliculatus, all teleost species reported to have a dual Δ6Δ5 desaturase inhabit freshwater ecosystems, where LC-PUFA are less abundant than in marine environments (Colombo et al., 2016). Indeed, such dietary restriction of LC-PUFA has been hypothesized to account for the enhanced LC-PUFA biosynthesizing capacity of freshwater teleosts in comparison to their marine counterparts (Tocher et al., 2003; Leaver et al., 2008). Interestingly, the range of desaturase activities contained within the tambaqui Fads2 implies that this enzyme enables all the desaturation reactions required to biosynthesize ARA and EPA from the C18 PUFA precursors LA and ALA, respectively (Figure 2.3). Also, the analysis of critical residues involved in determining desaturation position preference in mammalian FADS desaturases (residues marked with “*” in Supplementary Figure 2.1) (Watanabe et al., 2016) revealed that the tambaqui Fads2 has a profile similar to that of the bifunctional D. rerio Fads2 (red boxes in Supplementary Figure 2.1), presenting no additional distinctive residue replacements.

Similarly, the elongation capacities involved in the biosynthesis of ARA and EPA were demonstrated within the two newly cloned elovl from tambaqui. Phylogenetic analysis of the two tambaqui elongases showed that they are orthologs of Elovl2 and Elovl5, confirming the presence of both types of elongases, as anticipated from the phylogenetic position of tambaqui (Characiformes) (Ravi and Venkatesh, 2018). Indeed, whereas Elovl5 is present in virtually all teleosts (Castro et al., 2016), the distribution of Elovl2 is more restricted, being absent in recently emerged teleosts (Morais et al., 2009; Monroig et al., 2016) and reported only in relatively ancient lineages such as Cypriniformes (Monroig et al., 2009), Siluriformes (Oboh et al., 2016) and Salmoniformes (Morais et al., 2009; Gregory and James, 2014). Importantly, functional characterization of the tambaqui elongases further demonstrated their role in LC-PUFA biosynthesis, with both enabling the elongation capacity required to produce ARA and EPA from their C18 precursors (Figure 2.3). The tambaqui Elovl5 showed elongation capacity towards C18 and C20 PUFA, while no activity was detected towards C22 substrates. Such elongation capacities are generally consistent with those of Elovl5 from other teleost species (Castro et al., 2016), with some interesting particularities. In particular, the tambaqui Elovl5 showed no activity towards the C22 PUFA substrates 22:4n-6 and 22:5n-3, in contrast to Elovl5 from other species such as S. canaliculatus and Thunnus thynnus, which show remarkably high C22 elongation capacity when expressed in yeast (Morais et al., 2011; Monroig et al., 2012). On the other hand, the tambaqui Elovl2 showed activity towards all C18, C20 and C22 PUFA substrates tested. Yet, certain C18 substrates, namely LA (18:2n-6) and ALA (18:3n-3), were elongated to a comparatively lesser extent, suggesting that Elovl2 has a minor role in the Δ6 desaturase – elongase – Δ5 desaturase pathway leading to ARA and EPA biosynthesis from LA and ALA, respectively (Figure 2.3). Unlike Elovl5, though, the tambaqui Elovl2 was active towards C22 substrates. For instance, 22:5n–3 was elongated to 24:5n–3 (21.8% conversion), a key intermediate of the so-called “Sprecher pathway”, a metabolic route that accounts for DHA biosynthesis in vertebrates (Buzzi et al., 1996, 1997; Sprecher, 2000) and was recently demonstrated to be widespread among teleosts (Oboh et al., 2017). Our results show that tambaqui can also biosynthesize DHA through the Sprecher pathway since, in addition to the capacity of Elovl2 to produce 24:5n-3 mentioned above, its Fads2 has the ability to desaturate 24:5n-3 to 24:6n-3, a key desaturation reaction preceding the final β-oxidation step required to produce DHA. As with the enzymatic reactions required for ARA and EPA, our results confirm that both the elongation and desaturation capacities required to biosynthesize DHA from EPA (elongase – elongase – Δ6 desaturase) are present in tambaqui.

In conclusion, the combination of the enzymatic capacities demonstrated by the tambaqui Fads2, Elovl2 and Elovl5 allows this species to convert the dietary essential C18 PUFA LA (18:2n–6) and ALA (18:3n–3) into the biologically active ARA, EPA, and DHA. These results confirm that tambaqui can fulfil its essential FA requirements with an adequate dietary provision of C18 PUFA. Therefore, tambaqui emerges as a valuable species for the sustainable development of Brazilian aquaculture, given its ability to efficiently utilize alternative sources of essential FA such as VO, an ability supported at the molecular level by the enzymatic capabilities shown in this study.

Figure 2.3. Biosynthetic pathways of long-chain (C20–24) polyunsaturated fatty acids in C. macropomum, predicted from the activities of Fads2, Elovl2 and Elovl5 measured in yeast. Desaturation reactions are indicated as “Δx”, whereas elongation reactions are indicated as “Elo”. LA, linoleic acid (18:2n-6); ALA, α-linolenic acid (18:3n-3); ARA, arachidonic acid; EPA, eicosapentaenoic acid; DHA, docosahexaenoic acid. β-ox, beta-oxidation.
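The pathway logic summarised in Figure 2.3 can also be expressed as a small reaction graph. The Python sketch below is only an illustration assembled from the substrate and product pairs reported in Tables 2.1 and 2.2; the function name and the data layout are hypothetical, and the final β-oxidation step is included as an assumed endogenous reaction because it was not assayed in yeast.

```python
from collections import deque

# (substrate, product, step) for reactions reported in Tables 2.1 and 2.2.
# The final beta-oxidation step was not assayed in yeast; it is included
# here as an assumed endogenous step of the Sprecher pathway.
REACTIONS = [
    ("18:2n-6", "18:3n-6", "Fads2 Δ6"),
    ("18:3n-6", "20:3n-6", "Elovl5/Elovl2"),
    ("20:3n-6", "20:4n-6", "Fads2 Δ5"),
    ("18:2n-6", "20:2n-6", "Elovl5/Elovl2"),
    ("20:2n-6", "20:3n-6", "Fads2 Δ8"),
    ("18:3n-3", "18:4n-3", "Fads2 Δ6"),
    ("18:4n-3", "20:4n-3", "Elovl5/Elovl2"),
    ("20:4n-3", "20:5n-3", "Fads2 Δ5"),
    ("18:3n-3", "20:3n-3", "Elovl5/Elovl2"),
    ("20:3n-3", "20:4n-3", "Fads2 Δ8"),
    ("20:5n-3", "22:5n-3", "Elovl5/Elovl2"),
    ("22:5n-3", "24:5n-3", "Elovl2"),
    ("24:5n-3", "24:6n-3", "Fads2 Δ6"),
    ("24:6n-3", "22:6n-3", "β-oxidation (assumed)"),
]


def shortest_route(start, target, reactions=REACTIONS):
    """Breadth-first search for the shortest chain of reactions, or None."""
    graph = {}
    for substrate, product, step in reactions:
        graph.setdefault(substrate, []).append((product, step))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for product, step in graph.get(node, []):
            if product not in seen:
                seen.add(product)
                queue.append((product, path + [(node, product, step)]))
    return None


ara_route = shortest_route("18:2n-6", "20:4n-6")  # LA to ARA
epa_route = shortest_route("18:3n-3", "20:5n-3")  # ALA to EPA
dha_route = shortest_route("18:3n-3", "22:6n-3")  # ALA to DHA
```

Each route is returned as a list of (substrate, product, step) tuples, so the enzymes involved at every step can be read off directly; a query with no connecting reactions, such as LA to DHA, returns `None`.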

2.6. References

Abdul Hamid, N.K., Carmona-Antonanzas, G., Monroig, O., Tocher, D.R., Turchini, G.M., Donald, J.A., 2016. Isolation and functional characterisation of a fads2 in rainbow trout (Oncorhynchus mykiss) with Δ5 desaturase activity. PLoS One 11, e0150770.
Almeida, N.M., Visentainer, J.V., Franco, M.R.B., 2008. Composition of total, neutral and phospholipids in wild and farmed tambaqui (Colossoma macropomum) in the Brazilian Amazon area. J. Sci. Food Agric. 88, 1739-1747.
Buzzi, M., Henderson, R., Sargent, J., 1996. The desaturation and elongation of linolenic acid and eicosapentaenoic acid by hepatocytes and liver microsomes from rainbow trout (Oncorhynchus mykiss) fed diets containing fish oil or olive oil. Biochim. Biophys. Acta 1299, 235-244.
Buzzi, M., Henderson, R., Sargent, J., 1997. Biosynthesis of docosahexaenoic acid in trout hepatocytes proceeds via 24-carbon intermediates. Comp. Biochem. Physiol. B 116, 263-267.
Calder, P.C., 2014. Very long chain omega-3 (n-3) fatty acids and human health. Eur. J. Lipid Sci. Technol. 116, 1280-1300.
Castro, L.F., Monroig, O., Leaver, M.J., Wilson, J., Cunha, I., Tocher, D.R., 2012. Functional desaturase Fads1 (Δ5) and Fads2 (Δ6) orthologues evolved before the origin of jawed vertebrates. PLoS One 7, e31950.
Castro, L.F., Tocher, D.R., Monroig, O., 2016. Long-chain polyunsaturated fatty acid biosynthesis in chordates: Insights into the evolution of Fads and Elovl gene repertoire. Prog. Lipid Res. 62, 25-40.
Colombo, S.M., Wacker, A., Parrish, C.C., Kainz, M.J., Arts, M.T., 2016. A fundamental dichotomy in long-chain polyunsaturated fatty acid abundance between and within marine and terrestrial ecosystems. Environ. Rev. 25, 163-174.
Cook, H.W., 1996. Fatty acid desaturation and chain elongation in eukaryotes. In: Vance, D.E., Vance, J.E. (Eds.), Biochemistry of Lipids, Lipoproteins and Membranes. Elsevier Science, Alberta Canada, pp. 129-152.
Folch, J., Lees, M., Sloane Stanley, G., 1957. A simple method for the isolation and purification of total lipids from animal tissues. J. Biol. Chem. 226, 497-509.
Fonseca-Madrigal, J., Navarro, J.C., Hontoria, F., Tocher, D.R., Martinez-Palacios, C.A., Monroig, O., 2014. Diversification of substrate specificities in teleostei Fads2: characterization of Δ4 and Δ6Δ5 desaturases of Chirostoma estor. J. Lipid Res. 55, 1408-1419.
González-Rovira, A., Mourente, G., Zheng, X., Tocher, D.R., Pendón, C., 2009. Molecular and functional characterization and expression analysis of a Δ6 fatty acyl desaturase cDNA of European sea bass (Dicentrarchus labrax L.). Aquaculture 298, 90-100.
Gregory, M.K., James, M.J., 2014. Rainbow trout (Oncorhynchus mykiss) Elovl5 and Elovl2 differ in selectivity for elongation of omega-3 docosapentaenoic acid. Biochim. Biophys. Acta 1841, 1656-1660.
Guillou, H., Zadravec, D., Martin, P.G., Jacobsson, A., 2010. The key roles of elongases and desaturases in mammalian fatty acid metabolism: Insights from transgenic mice. Prog. Lipid Res. 49, 186-199.
Guimarães, I.G., Martins, G.P., 2015. Nutritional requirement of two Amazonian aquacultured fish species, Colossoma macropomum (Cuvier, 1816) and Piaractus brachypomus (Cuvier, 1818): a mini review. J. Appl. Ichthyol. 31, 57-66.
Guindon, S., Dufayard, J.F., Lefort, V., Anisimova, M., Hordijk, W., Gascuel, O., 2010. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307-321.

Hastings, N., Agaba, M., Tocher, D.R., Leaver, M.J., Dick, J.R., Sargent, J.R., Teale, A.J., 2001. A vertebrate fatty acid desaturase with Delta5 and Delta6 activities. Proc. Natl. Acad. Sci. U.S.A. 98, 14304-14309.
Innis, S.M., 2008. Dietary omega 3 fatty acids and the developing brain. Brain Res. 1237, 35-43.
Jakobsson, A., Westerberg, R., Jacobsson, A., 2006. Fatty acid elongases in mammals: their regulation and roles in metabolism. Prog. Lipid Res. 45, 237-249.
Jaya-Ram, A., Ishak, S.D., Enyu, Y.L., Kuah, M.K., Wong, K.L., Shu-Chien, A.C., 2011. Molecular cloning and ontogenic mRNA expression of fatty acid desaturase in the carnivorous striped snakehead fish (Channa striata). Comp. Biochem. Physiol. A 158, 415-422.
Kabeya, N., Yamamoto, Y., Cummins, S.F., Elizur, A., Yazawa, R., Takeuchi, Y., Haga, Y., Satoh, S., Yoshizaki, G., 2015. Polyunsaturated fatty acid metabolism in a marine teleost, Nibe croaker Nibea mitsukurii: Functional characterization of Fads2 desaturase and Elovl5 and Elovl4 elongases. Comp. Biochem. Physiol. B 188, 37-45.
Kabeya, N., Yevzelman, S., Oboh, A., Tocher, D.R., Monroig, O., 2018. Essential fatty acid metabolism and requirements of the cleaner fish, ballan wrasse Labrus bergylta: Defining pathways of long-chain polyunsaturated fatty acid biosynthesis. Aquaculture 488, 199-206.
Katoh, K., Toh, H., 2008. Recent developments in the MAFFT multiple sequence alignment program. Brief Bioinform. 9, 286-298.
Kuah, M.K., Jaya-Ram, A., Shu-Chien, A.C., 2016. A fatty acyl desaturase (fads2) with dual Δ6 and Δ5 activities from the freshwater carnivorous striped snakehead Channa striata. Comp. Biochem. Physiol. A 201, 146-155.
Leaver, M.J., Bautista, J.M., Björnsson, B.T., Jönsson, E., Krey, G., Tocher, D.R., Torstensen, B.E., 2008. Towards fish lipid nutrigenomics: current state and prospects for fin-fish aquaculture. Rev. Fish. Sci. 16, 73-94.
Leonard, A.E., Pereira, S.L., Sprecher, H., Huang, Y.-S., 2004. Elongation of long-chain fatty acids. Prog. Lipid Res. 43, 36-54.
Li, Y., Monroig, O., Zhang, L., Wang, S., Zheng, X., Dick, J.R., You, C., Tocher, D.R., 2010. Vertebrate fatty acyl desaturase with Δ4 activity. Proc. Natl. Acad. Sci. U.S.A. 107, 16840-16845.
Lopes-Marques, M., Ozorio, R., Amaral, R., Tocher, D.R., Monroig, O., Castro, L.F., 2017. Molecular and functional characterization of a fads2 orthologue in the Amazonian teleost, Arapaima gigas. Comp. Biochem. Physiol. B 203, 84-91.
Lopes-Marques, M., Kabeya, N., Qian, Y., Ruivo, R., Santos, M.M., Venkatesh, B., Tocher, D.R., Castro, L.F., Monroig, Ó., 2018. Retention of fatty acyl desaturase 1 (fads1) in Elopomorpha and Cyclostomata provides novel insights into the evolution of long-chain polyunsaturated fatty acid biosynthesis in vertebrates. BMC Evol. Biol. 18, 157.
Marquardt, A., Stöhr, H., White, K., Weber, B.H., 2000. cDNA cloning, genomic structure, and chromosomal localization of three members of the human fatty acid desaturase family. Genomics 66, 175-183.
Mohd-Yusof, N.Y., Monroig, O., Mohd-Adnan, A., Wan, K.L., Tocher, D.R., 2010. Investigation of highly unsaturated fatty acid metabolism in the Asian sea bass, Lates calcarifer. Fish Physiol. Biochem. 36, 827-843.
Monroig, O., Rotllant, J., Sanchez, E., Cerda-Reverter, J.M., Tocher, D.R., 2009. Expression of long-chain polyunsaturated fatty acid (LC-PUFA) biosynthesis genes during zebrafish Danio rerio early embryogenesis. Biochim. Biophys. Acta 1791, 1093-1101.
Monroig, O., Zheng, X., Morais, S., Leaver, M.J., Taggart, J.B., Tocher, D.R., 2010a. Multiple genes for functional Δ6 fatty acyl desaturases (Fad) in Atlantic salmon (Salmo salar L.): gene and cDNA characterization, functional expression, tissue distribution and nutritional regulation. Biochim. Biophys. Acta 1801, 1072-1081.
Monroig, O., Rotllant, J., Cerda-Reverter, J.M., Dick, J.R., Figueras, A., Tocher, D.R., 2010b. Expression and role of Elovl4 elongases in biosynthesis of very long-chain fatty acids during zebrafish Danio rerio early embryonic development. Biochim. Biophys. Acta 1801, 1145-1154.
Monroig, O., Li, Y., Tocher, D.R., 2011a. Delta-8 desaturation activity varies among fatty acyl desaturases of teleost fish: high activity in delta-6 desaturases of marine species. Comp. Biochem. Physiol. B 159, 206-213.
Monroig, Ó., Webb, K., Ibarra-Castro, L., Holt, G.J., Tocher, D.R., 2011b. Biosynthesis of long-chain polyunsaturated fatty acids in marine fish: Characterization of an Elovl4-like elongase from cobia Rachycentron canadum and activation of the pathway during early life stages. Aquaculture 312, 145-153.
Monroig, Ó., Wang, S., Zhang, L., You, C., Tocher, D.R., Li, Y., 2012. Elongation of long-chain fatty acids in rabbitfish Siganus canaliculatus: Cloning, functional characterisation and tissue distribution of Elovl5- and Elovl4-like elongases. Aquaculture 350-353, 63-70.
Monroig, Ó., Tocher, D.R., Hontoria, F., Navarro, J.C., 2013. Functional characterisation of a Fads2 fatty acyl desaturase with Δ6/Δ8 activity and an Elovl5 with C16, C18 and C20 elongase activity in the anadromous teleost meagre (Argyrosomus regius). Aquaculture 412-413, 14-22.
Monroig, O., Lopes-Marques, M., Navarro, J.C., Hontoria, F., Ruivo, R., Santos, M.M., Venkatesh, B., Tocher, D.R., Castro, L.F., 2016. Evolutionary functional elaboration of the Elovl2/5 gene family in chordates. Sci. Rep. 6, 20510.
Monroig, O., Tocher, D.R., Castro, L.F.C., 2018. Polyunsaturated fatty acid biosynthesis and metabolism in fish. In: Burdge, G.C. (Ed.), Polyunsaturated Fatty Acid Metabolism. Academic Press, London, pp. 31-60.
Morais, S., Monroig, O., Zheng, X., Leaver, M.J., Tocher, D.R., 2009. Highly unsaturated fatty acid synthesis in Atlantic salmon: characterization of Elovl5- and Elovl2-like elongases. Mar. Biotechnol. 11, 627-639.
Morais, S., Mourente, G., Ortega, A., Tocher, J.A., Tocher, D.R., 2011. Expression of fatty acyl desaturase and elongase genes, and evolution of DHA:EPA ratio during development of unfed larvae of Atlantic bluefin tuna (Thunnus thynnus L.). Aquaculture 313, 129-139.
Morais, S., Castanheira, F., Martinez-Rubio, L., Conceicao, L.E., Tocher, D.R., 2012. Long chain polyunsaturated fatty acid synthesis in a marine vertebrate: ontogenetic and nutritional regulation of a fatty acyl desaturase with Δ4 activity. Biochim. Biophys. Acta 1821, 660-671.
Oboh, A., Betancor, M.B., Tocher, D.R., Monroig, O., 2016. Biosynthesis of long-chain polyunsaturated fatty acids in the African catfish Clarias gariepinus: Molecular cloning and functional characterisation of fatty acyl desaturase (fads2) and elongase (elovl2) cDNAs. Aquaculture 462, 70-79.
Oboh, A., Kabeya, N., Carmona-Antoñanzas, G., Castro, L.F.C., Dick, J.R., Tocher, D.R., Monroig, O., 2017. Two alternative pathways for docosahexaenoic acid (DHA, 22:6n-3) biosynthesis are widespread among teleost fish. Sci. Rep. 7, 3889.
Park, W.J., Kothapalli, K.S., Lawrence, P., Tyburczy, C., Brenna, J.T., 2009. An alternate pathway to long-chain polyunsaturates: the FADS2 gene product Δ8-desaturates 20:2n-6 and 20:3n-3. J. Lipid Res. 50, 1195-1202.
Paulino, R.R., Pereira, R.T., Fontes, T.V., Oliva-Teles, A., Peres, H., Carneiro, D.J., Rosa, P.V., 2018. Optimal dietary linoleic acid to linolenic acid ratio improved fatty acid profile of the juvenile tambaqui (Colossoma macropomum). Aquaculture 488, 9-16.

Pereira, R.T., Paulino, R.R., de Almeida, C.A.L., Rosa, P.V., Orlando, T.M., Fortes-Silva, R., 2017. Oil sources administered to tambaqui (Colossoma macropomum): growth, body composition and effect of masking organoleptic properties and fasting on diet preference. Appl. Anim. Behav. Sci. 199, 103-110.
Ravi, V., Venkatesh, B., 2018. The divergent genomes of teleosts. Annu. Rev. Anim. Biosci. 6, 47-68.
Rose, T.M., Henikoff, J.G., Henikoff, S., 2003. CODEHOP (COnsensus-DEgenerate hybrid oligonucleotide primer) PCR primer design. Nucleic Acids Res. 31, 3763-3766.
Sperling, P., Ternes, P., Zank, T.K., Heinz, E., 2003. The evolution of desaturases. Prostaglandins Leukot. Essent. Fat. Acids 68, 73-95.
Sprecher, H., 2000. Metabolism of highly unsaturated n-3 and n-6 fatty acids. Biochim. Biophys. Acta 1486, 219-231.
Tanomman, S., Ketudat-Cairns, M., Jangprai, A., Boonanuntanasarn, S., 2013. Characterization of fatty acid delta-6 desaturase gene in Nile tilapia and heterogenous expression in Saccharomyces cerevisiae. Comp. Biochem. Physiol. B 166, 148-156.
Tocher, D.R., Agaba, M., Hastings, N., Teale, A.J., 2003. Metabolism and functions of lipids and fatty acids in teleost fish. Rev. Fish. Sci., 107-184.
Tocher, D.R., 2010. Fatty acid requirements in ontogeny of marine and freshwater fish. Aquac. Res. 41, 717-732.
Tocher, D.R., 2015. Omega-3 long-chain polyunsaturated fatty acids and aquaculture in perspective. Aquaculture 449, 94-107.
Turchini, G.M., Torstensen, B.E., Ng, W.K., 2009. Fish oil replacement in finfish nutrition. Rev. Aquacult. 1, 10-57.
Vagner, M., Santigosa, E., 2011. Characterization and modulation of gene expression and enzymatic activity of delta-6 desaturase in teleosts: A review. Aquaculture 315, 131-143.
Wang, S., Monroig, Ó., Tang, G., Zhang, L., You, C., Tocher, D.R., Li, Y., 2014. Investigating long-chain polyunsaturated fatty acid biosynthesis in teleost fish: Functional characterization of fatty acyl desaturase (Fads2) and Elovl5 elongase in the catadromous species, Japanese eel Anguilla japonica. Aquaculture 434, 57-65.
Watanabe, K., Ohno, M., Taguchi, M., Kawamoto, S., Ono, K., Aki, T., 2016. Identification of amino acid residues that determine the substrate specificity of mammalian membrane-bound front-end fatty acid desaturases. J. Lipid Res. 57, 89-99.
Zheng, X., Seiliez, I., Hastings, N., Tocher, D.R., Panserat, S., Dickson, C.A., Bergot, P., Teale, A.J., 2004. Characterization and comparison of fatty acyl Δ6 desaturase cDNAs from freshwater and marine teleost fish species. Comp. Biochem. Physiol. B 139, 269-279.
Zheng, X., Ding, Z., Xu, Y., Monroig, O., Morais, S., Tocher, D.R., 2009. Physiological roles of fatty acyl desaturases and elongases in marine fish: Characterisation of cDNAs of fatty acyl Δ6 desaturase and elovl5 elongase of cobia (Rachycentron canadum). Aquaculture 290, 122-131.

Chapter 3

From the Amazon: a comprehensive liver transcriptome of the tambaqui, Colossoma macropomum

3.1. Abstract

The teleost fish tambaqui (Colossoma macropomum) is a valuable resource for the Brazilian aquaculture sector, representing more than one-quarter of the total production. In this context, the development of molecular tools is critical to address and improve productivity, nutrition, and genetic breeding programs. In this study, we applied RNA-seq technology to produce the first comprehensive liver transcriptome for this species. Our analysis generated a gold standard transcriptome with a total of 43,098 transcripts, an N50 of 1,855 bp and an average length of 1,312 bp. To functionally annotate the transcripts, we used the Trinotate pipeline together with several public databases. The blast-x analysis revealed between 34,426 and 42,295 transcripts with homologous hits, depending on the database (NCBI-NR, Uniref90, Swissprot, Trembl), while the KAAS web server allowed the mapping of our transcripts to 380 KEGG pathways. Additionally, we show the biological significance of the assembled tambaqui liver transcriptome by identifying critical regulators of lipid metabolism, the gene orthologues of the peroxisome proliferator-activated receptors (PPARs). The dataset provided in this study constitutes a comprehensive molecular resource, which will be instrumental to further develop tambaqui aquaculture, specifically in the field of nutrigenomics.

3.2. Introduction

The Amazon river basin, the largest in the world, exhibits the highest diversity of freshwater fishes, with approximately 3,000 described species (Junk et al., 2007; Reis, 2013). Of this total, approximately 200 species are used for human consumption, yet only 6-12 species account for more than 80% of the landings in the larger cities along the Amazon river (Reis, 2013). One of the native species that stands out in captured volume is tambaqui (Colossoma macropomum, Cuvier, 1818). Tambaqui is the largest Characidae freshwater species native to South America, reaching at least one meter in total length and 30 kg in weight (Goulding and Carvalho, 1982). This species has become one of the most relevant in Brazilian aquaculture, with production in 2016 reaching 136,991.478 tons (28% of the total production) (Ministério da Agricultura, 2016). From an ecological perspective, tambaqui displays omnivorous feeding habits (e.g. zooplankton, insects, snails and decaying plants), a valuable feature for adaptation to artificial feed in captivity. Overall, characteristics such as rapid growth and acceptance by the consumer market indicate that this species is well suited for aquaculture farming (Lima et al., 2016).

High-throughput sequencing technologies are valuable tools to improve the biological efficiency of aquaculture fish production, providing key insights into numerous physiological modules including nutrition, disease models or genetic improvement programs (e.g. Yáñez et al., 2015). Yet, full genome sequences of teleosts are still relatively scarce (e.g. Ravi and Venkatesh, 2018), although examples exist for aquaculture species such as the sea bass (Tine et al., 2014). In addition, transcriptome profiling using deep-sequencing technology, commonly known as RNA-seq, has become increasingly popular since it rapidly provides holistic views of molecular pathway activation and supports the identification of key genes, including in the field of aquaculture (Pereiro et al., 2012; Chana-Munoz et al., 2017). In the case of tambaqui, few genomic resources, which are vital to improve the nutritional management of this species in culture, are currently available. Some studies have addressed the genetic diversity of wild populations (e.g. Santos et al., 2016), muscle transcriptome analysis under specific conditions (e.g. climate scenarios) (Prado-Lima and Val, 2016), SNP markers (e.g. Da Silva Nunes et al., 2017), microRNA expression (Gomes et al., 2017) and mitochondrial genome sequencing (Wu et al., 2016). Therefore, approaches that increase the genomic information available for tambaqui are key to further develop the aquaculture of this species. Here, we provide an extended molecular resource for the tambaqui with the assembly and annotation of the liver transcriptome, with a specific focus on the peroxisome proliferator-activated receptor (PPAR) signalling pathway.

3.3. Material and methods

Ethics statement

The tissue sampling procedure was approved by the ethical committee of the Federal University of Acre, Brazil, with the project number 23107-009564/2014-29 and protocol number 08/201, under the responsibility of Prof. Anselmo Fortunato Ruiz Rodriguez. The samples were exported from Brazil to Portugal with the approval of the Brazilian Environmental and Renewable Resource Institute (IBAMA), under permit number 16BR021098/DF.

Sampling, RNA isolation and RNA-Seq library preparation

Juvenile tambaqui were obtained from a commercial farm in Rio Branco, Acre, Brazil (10°05'10.6"S 67°32'15.8"W) and transported to Universidade Federal do Acre, Brazil, where they were anaesthetized; fresh liver was quickly collected, immediately preserved in an RNA stabilization buffer (RNAlater) and stored at −80 °C prior to RNA extraction. RNA was extracted from the liver sample with the illustra RNAspin Mini RNA Isolation Kit (GE Healthcare, UK), including the on-column DNase I treatment provided in the kit. RNA integrity was assessed on a 1% agarose TAE gel stained with GelRed™ nucleic acid stain (Biotium, Hayward, CA, USA). The resulting high-quality liver total RNA sample was sequenced on the Illumina HiSeq 4000 platform, using 150 bp paired-end reads, by STABVIDA, Lda (Caparica, Portugal).

Cleaning, de novo assembly, and optimization of the liver transcriptome

The raw sequencing reads were quality-checked with FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Trimmomatic (Bolger et al., 2014) was used to trim the first 15 bases of each read and to remove bases with a quality score below 15 at the leading and trailing ends. Reads were then scanned with a 4-base sliding window and cut when the average quality per base dropped below 20. Only high-quality reads longer than 50 bases were retained for further analysis. After this first technical quality control, the dataset was de novo assembled with Trinity v2.5.0 (Grabherr et al., 2011), following the protocol of Haas et al. (2013) with parameters specific to our case, namely strand-specific data and a minimum contig length (SS_lib_type RF; min_contig_length 300). The liver transcriptome was then optimized in three steps. Firstly, TransDecoder (https://transdecoder.github.io/) was used to predict open reading frames (ORFs) with a minimum cut-off of 100 amino acids, supported by homology searches (blast-p against the Swissprot database (Bateman et al., 2017) and a Pfam search (Punta et al., 2012)). Secondly, all transcripts were searched against two independent databases, the NCBI Non-Redundant (NR) database and Uniref90 of UniProt (Bateman et al., 2017), using the blast-x tool of DIAMOND v0.8.36 (Buchfink et al., 2014); all hits matching the Actinopterygii taxon with an E-value cut-off of 1e-5 were retained for further analysis. The transcriptome was then filtered by keeping only the contigs that both contained an ORF and had a hit against the Actinopterygii taxon. Thirdly, the tr2aacds pipeline from the EvidentialGene package (http://arthropods.eugenes.org/EvidentialGene/) was used to handle redundancy and the number of isoforms per ‘gene’ in this filtered transcriptome. All transcripts classified by the tr2aacds pipeline as ‘primary’ or ‘alternate’ were retained for the annotation step (gold standard transcriptome assembly). All filtering and optimization steps were monitored with Trinity and TransRate (Smith-Unna et al., 2016) statistics. The final transcriptome assembly was evaluated with the Benchmarking Universal Single-Copy Orthologs tool (BUSCO) (Simão et al., 2015).

Transcriptome annotation

To perform the transcriptome annotation, the final nucleotide and amino acid sequences were retrieved from the initial Trinity.fasta and transdecoder.pep files, respectively, using the sequence headers of the gold standard transcriptome assembly. Subsequently, the sequences were searched locally against several databases (NR, Uniref90, Trembl and Swissprot) using the blast-p and blast-x tools of DIAMOND v0.8.36 (Buchfink et al., 2014), applying an E-value cut-off of 1e-5. PFAM (Punta et al., 2012) and HMMER (Finn et al., 2011) were used to identify protein domains, TMHMM (Krogh et al., 2001) to predict transmembrane regions, GOseq (Ashburner et al., 2000) to determine gene ontology and eggNOG v3.0 (Powell et al., 2012) to identify clusters of orthologous groups of genes. All results were integrated into the Trinotate v3.0.1 (http://trinotate.github.io) annotation pipeline and reported with an E-value cut-off of 1e-5. The remaining analyses of GO terms and COGs were done using the longest ORF per ‘gene’ in the Trinotate report. In parallel, the longest protein sequences were submitted to the KAAS web server for KEGG annotation, using 400,502 teleost fish sequences as reference.
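As an illustration of the "longest ORF per gene" selection, the following minimal Python sketch keeps one protein per gene. It assumes Trinity-style transcript identifiers in which the isoform suffix (_i1, _i2, ...) follows the gene identifier; the function name and the toy identifiers are hypothetical, and the identifiers in the actual Trinotate report may be structured differently.

```python
def longest_orf_per_gene(orfs):
    """Map each gene ID to the (transcript ID, protein) pair with the longest protein.

    orfs is a dict of transcript ID -> predicted protein sequence.
    """
    best = {}
    for transcript_id, protein in orfs.items():
        # Assumption: the gene ID is everything before the final "_i<number>" suffix.
        gene_id = transcript_id.rsplit("_i", 1)[0]
        if gene_id not in best or len(protein) > len(best[gene_id][1]):
            best[gene_id] = (transcript_id, protein)
    return best


example = {
    "TRINITY_DN1_c0_g1_i1": "MKT",
    "TRINITY_DN1_c0_g1_i2": "MKTAYL",
    "TRINITY_DN2_c0_g1_i1": "MA",
}
unigenes = longest_orf_per_gene(example)
```

Selecting a single representative per gene in this way avoids counting the same gene several times in the downstream GO, COG and KEGG summaries.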

Identification and curation of C. macropomum PPARs gene sequences

The transcriptome annotation step allowed the identification of several PPAR-like gene sequences. To validate these sequences, the predicted amino acid sequences were aligned against NCBI-NR database entries of closely related species (e.g. Pygocentrus nattereri, Astyanax mexicanus). As expected, given the automatic nature of the Trinity assembler, some minor incongruences at the nucleotide level were found. To deal with sequence misassembly and chimeras, a manual curation was performed. Using as reference P. nattereri for PPAR (XM_017716837.1), PPAR-like (XM_017701190.1), PPAR (XM_017722410.1) and PPAR (XM_017704775.1), and A. mexicanus for PPAR-like (XM_022682487.1), blast-n searches were performed against the raw read dataset to retrieve the corresponding reads for each PPAR. Collected reads were loaded into Geneious v7.1.9, manually curated and assembled using the map to reference tool with the corresponding reference sequence. The final assembled tambaqui PPAR consensus sequences were used for phylogenetics to determine sequence orthology.

Phylogenetic analysis

Amino acid sequences from vertebrates and an invertebrate deuterostome (Branchiostoma floridae) were retrieved through BLASTp searches in publicly available genome databases, using as reference the PPARs previously isolated from tambaqui (PPARαA, PPARαB, PPARβA, PPARβB and PPARγ). The retrieved sequences and corresponding accession numbers are listed in Supplementary Table 3.1. All sequences were aligned with the MAFFT alignment software (Katoh and Toh, 2008) using the L-INS-i method. The final alignment, containing 56 sequences, was manually inspected and all columns containing gaps were removed, leaving a total of 347 positions for phylogenetic analysis. The final sequence alignment was submitted to PhyML v3.0 (Guindon et al., 2010) for maximum likelihood phylogenetic analysis; the evolutionary model (JTT+G+F) was determined automatically using the smart model selection option. Finally, branch support was calculated using aBayes and the tree was visualized in FigTree v1.3.1, available at http://tree.bio.ed.ac.uk/software/figtree/.
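The removal of gap-containing alignment columns can be illustrated with a short Python sketch. It assumes the alignment is held as a dictionary of equal-length strings with "-" as the gap character; the function name and the toy sequences are hypothetical and do not reproduce the tool actually used for this step.

```python
def strip_gap_columns(alignment):
    """Remove every alignment column in which at least one sequence has a gap.

    alignment is a dict of sequence name -> aligned sequence (equal lengths).
    """
    sequences = list(alignment.values())
    length = len(sequences[0])
    # Keep only the column indices that are gap-free in all sequences.
    keep = [i for i in range(length) if all(seq[i] != "-" for seq in sequences)]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


toy = {"seqA": "MK-LV", "seqB": "MKTL-", "seqC": "MRTLV"}
trimmed = strip_gap_columns(toy)
```

Applied to the real alignment of 56 sequences, a filter of this kind would leave only the positions shared by every sequence, which is how the 347 positions used for the phylogeny are defined in the text.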

3.4. Results and discussion

Transcriptome de novo assembly and optimization

To comprehensively characterize the liver transcriptome of C. macropomum, a total of 52,753,287 raw reads were generated from a single tissue, using 150 bp paired-end (PE) sequencing on the Illumina HiSeq 4000 platform. To remove low-quality reads, Illumina adapters and reads containing unknown nucleotides, we applied a quality filter (FastQC and Trimmomatic), which retained 82.48% of the initial dataset as clean reads (Table 3.1). The conservative trimming parameters allowed the retention of high-quality reads and made the sequence output reliable before proceeding to assembly and annotation. The genome sequence of C. macropomum has not yet been reported. Thus, we chose a de novo assembly approach, using the Trinity software package (Grabherr et al., 2011), to assemble our transcriptome. This methodology generated 659,372 transcripts with an N50 of 1,000 bp and an average length of 776 bp (Table 3.1). To increase the confidence in the assembly and to deal with the high number of false positive or artificial sequences, we used three independent approaches for assembly optimization (see Material and methods). First, all transcripts encoding an ORF, whether complete, internal or 5’/3’ partial, were retained. This resulted in 165,769 transcripts with an ORF of at least 100 amino acids. The second approach combined the blast-x results of the transcriptome against the NR and Uniref90 databases and identified 128,768 transcripts with a significant hit (E-value ≤ 1e-5) in the Actinopterygii taxon. The overlap between the two methodologies resulted in a filtered transcriptome of 103,895 transcripts, which, after being subjected to the tr2aacds pipeline, yielded a gold standard transcriptome of 28,638 ‘genes’ and 43,098 transcripts, with an N50 of 1,855 bp and an average length of 1,312 bp. Importantly, this three-step methodology ensured that transcripts matching known Actinopterygii sequences were collected, and that non-coding sequences, as well as the high number of isoforms per ‘gene’, were reduced. All generated data are available in public databases. The raw sequencing reads have been submitted to the NCBI SRA database (SRR6741523) under the project number PRJNA434368; the filtered de novo transcriptome assembly was deposited at DDBJ/ENA/GenBank under the accession GFYY00000000 in the Transcriptome Shotgun Assembly Sequence Database (https://www.ncbi.nlm.nih.gov/nuccore/GGHL00000000.1), and the gold standard transcriptome was deposited in the figshare digital repository (https://figshare.com/s/d52c6698e6196bb6c8a5). The Trinity and TransRate statistics of each filtering step are shown in Table 3.1. To assess the quality of the gold standard transcriptome, the distribution of sequence lengths and the number of isoforms per gene are shown in Figure 3.1a and b and in Supplementary Tables 3.2 and 3.3. Sequence lengths ranged from 300 bp to a maximum transcript length of 18,047 bp. Most of the assembled sequences were in the ranges of 300-599 and 600-999 bp (43%), while the number of longer transcripts generally decreased (Figure 3.1a). In terms of isoforms per ‘gene’, almost 70% of the ‘genes’ have just one isoform, 20% have two, and 10% have three or more isoforms (Figure 3.1b). Furthermore, to assess the completeness of our transcriptome in terms of gene content, the Benchmarking Universal Single-Copy Orthologs tool (BUSCO) was used (Simão et al., 2015). We found 82.2% complete BUSCO hits against the eukaryota and 84.7% against the metazoa lineage-specific profile libraries (Supplementary Table 3.4). These libraries represent well-annotated and conserved single-copy orthologs, so the values obtained indicate a high representativeness of the expected gene content and a low fragmentation of our assembly.
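The overlap between the two filtering approaches amounts to a set intersection, as in the following minimal Python sketch. The record layout (transcript ID, taxonomic lineage, E-value), the function names and the toy data are assumptions made for illustration and do not reproduce the actual DIAMOND output format or the scripts used in this work.

```python
def actinopterygii_hits(hits, max_evalue=1e-5, taxon="Actinopterygii"):
    """Return the IDs of transcripts with a significant hit in the given taxon.

    hits is an iterable of (transcript_id, lineage, evalue) tuples.
    """
    return {tid for tid, lineage, evalue in hits
            if taxon in lineage and evalue <= max_evalue}


def filter_transcripts(orf_ids, hit_ids):
    """Keep only transcripts that have an ORF and a significant taxon hit."""
    return set(orf_ids) & set(hit_ids)


orf_ids = {"t1", "t2", "t3"}
hits = [
    ("t1", "Eukaryota;Chordata;Actinopterygii", 1e-30),
    ("t2", "Eukaryota;Chordata;Mammalia", 1e-50),
    ("t3", "Eukaryota;Chordata;Actinopterygii", 1e-3),   # fails the E-value cut-off
    ("t4", "Eukaryota;Chordata;Actinopterygii", 0.0),    # no predicted ORF
]
filtered = filter_transcripts(orf_ids, actinopterygii_hits(hits))
```

In the toy data only t1 satisfies both criteria, mirroring how the 165,769 ORF-bearing transcripts and the 128,768 transcripts with Actinopterygii hits were reduced to their common subset.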

Table 3.1. TransRate and Trinity statistics of the original, filtered and gold standard transcriptome assembly of the liver transcriptome of C. macropomum.

| C. macropomum tissue | Liver |
|---|---|
| Number of raw sequencing reads | 52753287 |
| Number of cleaned reads used in the assembly | 43509171 |
| Percentage of reads submitted to assembly (%) | 82.48 |

| Assembly versions | Raw transcriptome assembly | Filtered de novo transcriptome assembly | Gold Standard transcriptome |
|---|---|---|---|
| Number of “genes” (1) | 471503 | 30762 | 28638 |
| Number of transcripts | 659372 | 103895 | 43098 |
| N50 transcript length (bp) (2) | 1000 | 2517 | 1855 |
| Median transcript length (bp) | 466 | 1376 | 930 |
| Mean transcript length (bp) | 776 | 1811 | 1312 |
| Length of the smallest transcript | 301 | 301 | 301 |
| Length of the largest transcript | 19846 | 19846 | 18047 |
| Number of transcripts with length > 1k nt (3) | 122174 | 66468 | 20222 |
| Number of transcripts with length > 10k nt (4) | 285 | 282 | 36 |
| Total assembled bases | 511615092 | 188186859 | 56527072 |

(1) Number of groups of transcripts clustered on the basis of shared sequence content. (2) N50 is the sequence length of the shortest contig at 50% of the total transcriptome length. (3) Number of transcripts longer than 1,000 nucleotides. (4) Number of transcripts longer than 10,000 nucleotides.
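The N50 definition given in the table footnote can be made concrete with a small Python function. This is a generic sketch of the statistic, not the code used by Trinity or TransRate; the function name n50 is chosen here for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of transcript lengths.

    Lengths are sorted from longest to shortest and accumulated; the N50 is
    the length of the transcript at which the running total first reaches
    half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
```

For example, for the lengths 2, 2, 2, 3, 3, 4, 8 and 8 (32 bases in total), the two longest transcripts already cover 16 bases, so the N50 is 8.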

Transcriptome annotation

To functionally annotate the gold standard transcriptome, the previously predicted nucleotide and amino acid sequences (see Material and methods) were searched with the blast-x/p tools against four databases (NR, Uniref90, Trembl and Swissprot) with an E-value cut-off of 1e-5. In total, 42,246, 42,295, 40,981 and 34,426 transcripts, respectively, showed a significant blast-x hit against each database. The top 20 species distribution of blast-x hits against the NR database is plotted in Figure 3.1c. The species with the highest number of best hits against our transcriptome is Pygocentrus nattereri (red-bellied piranha), with more than 78% of the total matches (Supplementary Table 3.5). This result is expected and can be partially explained by the abundant sequence resources of P. nattereri and its close taxonomic relationship with tambaqui, as both belong to the Serrasalmidae family. To complement the species distribution, the similarity and E-value distributions are also shown in Figure 3.1d. The E-value distribution shows that about 43% of the transcripts with blast-x hits have E-values between 0 and 1e-100. In turn, the similarity distribution reveals that 57% of the best hits have between 95% and 100% similarity (Supplementary Tables 3.6 and 3.7).

Figure 3.1. Quality assessment and blast-x analysis of the gold standard transcriptome of C. macropomum. a) Transcript length distribution; b) Number of transcripts (isoforms) per gene; c) Homologous gene-species distribution; d) E-value and similarity distribution.

To reinforce the previous homology-based functional annotation, other functional annotation resources were used and integrated into the Trinotate pipeline. From the total number of transcripts, 28,301 contained protein domains (PFAM), 32,334 had one or more associated GO terms, and 28,483 had at least one eggNOG or COG annotation term. To prevent statistical gene overrepresentation, only the longest ORF per ‘gene’ (28,638 unigenes) was used in the further analysis of GO terms, KEGG pathways and COGs. The annotation for all transcripts and for the unigenes can be consulted in Supplementary Table 3.8.

Figure 3.2. Histogram of the clusters of orthologous groups of C. macropomum transcriptome (COG).

Clusters of Orthologous Groups (COG) Classification

The COG database is used to phylogenetically classify proteins encoded by completely sequenced genomes. Essentially, the COG database contains 25 categories and is based on the principle that conserved genes should be classified according to their homologous relationship. In our case, the search for COGs was done through the eggNOG database integrated within the Trinotate pipeline. From a total of 18,750 unigenes with eggNOG or COG annotations, we found 6,218 unigenes assigned to one of the 25 COG categories (Figure 3.2 and Supplementary Table 3.9). The COG distribution showed that the largest group is the cluster for general function prediction (1,509 unigenes, 24.26%), followed by posttranslational modification, protein turnover, chaperones (650 unigenes, 10.45%) and signal transduction mechanisms (386 unigenes, 6.21%). On the other hand, groups such as cell wall/membrane/envelope biogenesis, RNA processing and modification, nuclear structure, cell motility and extracellular structures are underrepresented or absent (Supplementary Table 3.9).

Figure 3.3. Functional classification of the C. macropomum transcriptome in three Gene Ontology (GO) categories: biological process (blue), molecular function (grey) and cellular component (red).

Gene Ontology (GO)

Gene ontology is an international gene standardization system with the objective of describing the properties of genes and their products within an organism, using a dynamically updated controlled vocabulary (Blake et al., 2015). Based on the SwissProt blast hits, we could assign 258,132 GO terms to 21,075 unigenes across the three main categories: molecular function (MF - 58,511, 23%), cellular component (CC - 74,592, 29%) and biological process (BP - 125,029, 48%). Although the CC category does not contain the highest number of GO terms, its number of unigenes per GO term is much higher than in the other two categories. In contrast, the BP category displays the lowest number of unigenes per GO term, reflecting a wide range of gene ontology categories. To evaluate each category, the 15 GO terms with the highest number of mapped unigenes were plotted (Figure 3.3). Among these, nucleus (26%) in cellular component, metal ion binding (14%) in molecular function and transcription, DNA-templated (10%) in biological process are the GO terms with the most mappings per category. At the other end, the GO terms with the fewest mapped unigenes are innate immune response and GTPase activator activity, with 2% in BP and MF, respectively, and endoplasmic reticulum, with 4% in the CC category (Figure 3.3).
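The category shares quoted for the GO assignment follow directly from the reported counts, as the short Python check below shows. The counts are those given in the text; the function name category_shares is hypothetical.

```python
def category_shares(counts):
    """Return the total count and the rounded percentage share of each category."""
    total = sum(counts.values())
    shares = {name: round(100 * value / total) for name, value in counts.items()}
    return total, shares


go_counts = {
    "molecular function": 58511,
    "cellular component": 74592,
    "biological process": 125029,
}
go_total, go_shares = category_shares(go_counts)
```

The three counts sum to the 258,132 GO terms reported, and the rounded shares correspond to the 23%, 29% and 48% stated for MF, CC and BP.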

Table 3.2. KEGG pathways of endocrine system with the number of unigenes mapped by C. macropomum gold standard transcriptome.

| KEGG Pathway | Pathway ID (1) | Mapped Unigenes (2) |
|---|---|---|
| PPAR signaling pathway | 3320 | 40 |
| Insulin secretion | 4911 | 33 |
| Insulin signaling pathway | 4910 | 72 |
| Glucagon signaling pathway | 4922 | 50 |
| Regulation of lipolysis in adipocyte | 4923 | 28 |
| Adipocytokine signaling pathway | 4920 | 39 |
| GnRH signaling pathway | 4912 | 45 |
| Ovarian Steroidogenesis | 4913 | 18 |
| Estrogen signaling pathway | 4915 | 46 |
| Progesterone-mediated oocyte maturation | 4914 | 39 |
| Prolactin signaling pathway | 4917 | 39 |
| Oxytocin signaling pathway | 4921 | 69 |
| Thyroid hormone signaling pathway | 4919 | 72 |
| Thyroid hormone synthesis | 4918 | 32 |
| Relaxin signaling pathway | 4926 | 50 |
| Melanogenesis | 4916 | 36 |
| Renin secretion | 4924 | 30 |
| Renin-angiotensin system | 4614 | 9 |
| Aldosterone synthesis and secretion | 4925 | 41 |

(1) Pathway identification number in the https://www.genome.jp/kegg/pathway.html database. (2) Number of unigenes mapped against a specific metabolic pathway.

KEGG pathway analysis and PPAR pathway genes

To improve the comprehension of functional and metabolic interactions in our transcriptome, the protein sequences of the unigenes were submitted to the KAAS web server, and about 10,695 C. macropomum unigenes were mapped to a total of 380 pathways. We next focused our attention on a critical nutritional component, lipid metabolism, which entails the biosynthesis of long-chain polyunsaturated fatty acids (LC-PUFAs). The molecular capacity of tambaqui to biosynthesize LC-PUFAs was recently described (Ferraz et al., 2019). LC-PUFAs are particularly relevant in the context of farmed fish (Jump et al., 2013; Carmona-Antoñanzas et al., 2014). The homeostasis of lipid metabolic molecular modules is mostly under the control of the PPAR signaling pathway (Figure 3.4). PPARs act as key transcription factors and regulate multiple endocrine aspects, including hepatic lipid metabolism, thus contributing to nutritional homeostasis (Jump et al., 2013). Moreover, teleost PPARs have been shown to regulate endogenous LC-PUFA biosynthesis in relation to dietary inputs (You et al., 2017). In this study, we identified 40 genes included in the PPAR signaling pathway (Figure 3.4; Table 3.2). These include the single-copy Liver X receptor, a nuclear receptor found in teleosts and involved in cholesterol metabolism (Fonseca et al., 2017), as well as other pivotal metabolic players critical in fatty acid oxidation or lipogenesis (e.g. Scd1 and Cpt1) (Figure 3.4). Additionally, to further validate the tambaqui transcriptome, we identified the full ORFs of the five PPAR gene orthologues and provide a formal demonstration of their orthology (Figure 3.5). In vertebrates, three PPAR nuclear receptor gene paralogues have been recognized: PPARα, PPARβ and PPARγ. In teleosts, specific gene duplicates have been recognized (Madureira et al., 2017). Based on the RNA-seq predicted transcripts, we show that this is also the case in tambaqui.

Figure 3.4. PPAR signaling pathway with the unigenes detected in the C. macropomum transcriptome. Green boxes mark the unigenes identified in our transcriptome, while white boxes mark genes that were not detected.

Figure 3.5. Maximum likelihood phylogenetic analysis of the PPAR (PPARαA, PPARαB, PPARβA, PPARβB and PPARγ) amino acid sequences. Values at nodes represent posterior probabilities. The PPAR phylogeny was rooted with the invertebrate sequence from Ciona intestinalis.

3.5. References

Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM and Sherlock G 2000. Gene Ontology: tool for the unification of biology. Nature genetics 25, 25. Bateman A, Martin MJ, O’Donovan C, Magrane M, Alpi E, Antunes R, Bely B, Bingley M, Bonilla C, Britto R, Bursteinas B, Bye-AJee H, Cowley A, Da Silva A, De Giorgi M, Dogan T, Fazzini F, Castro LG, Figueira L, Garmiri P, Georghiou G, Gonzalez D, Hatton-Ellis E, Li W, Liu W, Lopez R, Luo J, Lussi Y, MacDougall A, Nightingale A, Palka B, Pichler K, Poggioli D, Pundir S, Pureza L, Qi G, Rosanoff S, Saidi R, Sawford T, Shypitsyna A, Speretta E, Turner E, Tyagi N, Volynkin V, Wardell T, Warner K, Watkins X, Zaru R, Zellner H, Xenarios I, Bougueleret L, Bridge A, Poux S, Redaschi N, Aimo L, ArgoudPuy G, Auchincloss A, Axelsen K, Bansal P, Baratin D, Blatter MC, Boeckmann B, Bolleman J, Boutet E, Breuza L, Casal-Casas C, De Castro E, Coudert E, Cuche B, Doche M, Dornevil D, Duvaud S, Estreicher A, Famiglietti L, Feuermann M, Gasteiger E, Gehant S, Gerritsen V, Gos A, Gruaz-Gumowski N, Hinz U, Hulo C, Jungo F, Keller G, Lara V, Lemercier P, Lieberherr D, Lombardot T, Martin X, Masson P, Morgat A, Neto T, Nouspikel N, Paesano S, Pedruzzi I, Pilbout S, Pozzato M, Pruess M, Rivoire C, Roechert B, Schneider M, Sigrist C, Sonesson K, Staehli S, Stutz A, Sundaram S, Tognolli M, Verbregue L, Veuthey AL, Wu CH, Arighi CN, Arminski L, Chen C, Chen Y, Garavelli JS, Huang H, Laiho K, McGarvey P, Natale DA, Ross K, Vinayaka CR, Wang Q, Wang Y, Yeh LS and Zhang J 2017. UniProt: The universal protein knowledgebase. Nucleic Acids Research 45, D158–D169. 
Blake JA, Christie KR, Dolan ME, Drabkin HJ, Hill DP, Ni L, Sitnikov D, Burgess S, Buza T, Gresham C, McCarthy F, Pillai L, Wang H, Carbon S, Dietze H, Lewis SE, Mungall CJ, Munoz-Torres MC, Feuermann M, Gaudet P, Basu S, Chisholm RL, Dodson RJ, Fey P, Mi H, Thomas PD, Muruganujan A, Poudel S, Hu JC, Aleksander SA, McIntosh BK, Renfro DP, Siegele DA, Attrill H, Brown NH, Tweedie S, Lomax J, Osumi-Sutherland D, Parkinson H, Roncaglia P, Lovering RC, Talmud PJ, Humphries SE, Denny P, Campbell NH, Foulger RE, Chibucos MC, Giglio MG, Chang HY, Finn R, Fraser M, Mitchell A, Nuka G, Pesseat S, Sangrador A, Scheremetjew M, Young SY, Stephan R, Harris MA, Oliver SG, Rutherford K, Wood V, Bahler J, Lock A, Kersey PJ, McDowall MD, Staines DM, Dwinell M, Shimoyama M, Laulederkind S, Hayman GT, Wang SJ, Petri V, D’Eustachio P, Matthews L, Balakrishnan R, Binkley G, Cherry JM, Costanzo MC, Demeter J, Dwight SS, Engel SR, Hitz BC, Inglis DO, Lloyd P, Miyasato SR, Paskov K, G, Simison M, Nash RS, Skrzypek MS, Weng S, Wong ED, Berardini TZ, Li D, Huala E, Argasinska J, Arighi C, Auchincloss A, Axelsen K, Argoud-Puy G, Bateman A, Bely B, Blatter MC, Bonilla C, Bougueleret L, Boutet E, Breuza L, Bridge A, Britto R, Casals C, Cibrian-Uhalte E, Coudert E, Cusin I, Duek-Roggli P, Estreicher A, Famiglietti L, Gane P, Garmiri P, Gos A, Gruaz-Gumowski N, Hatton-Ellis E, Hinz U, Hulo C, Huntley R, Jungo F, Keller G, Laiho K, Lemercier P, Lieberherr D, Macdougall A, Magrane M, Martin M, Masson P, Mutowo P, O’Donovan C, Pedruzzi I, Pichler K, Poggioli D, Poux S, Rivoire C, Roechert B, Sawford T, Schneider M, Shypitsyna A, Stutz A, Sundaram S, Tognolli M, Wu C, Xenarios I, Chan J, Kishore R, Sternberg PW, Van Auken K, Muller HM, Done J, Li Y, Howe D and Westerfeld M 2015. Gene ontology consortium: Going forward. Nucleic Acids Research 43, D1049–D1056. Bolger AM, Lohse M and Usadel B 2014. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120. 
Buchfink B, Xie C and Huson DH 2014. Fast and sensitive protein alignment using DIAMOND. Nature Methods 12, 59–60.

Carmona-Antoñanzas G, Tocher DR, Martinez-Rubio L and Leaver MJ 2014. Conservation of lipid metabolic gene transcriptional regulatory networks in fish and mammals. Gene 534, 1–9.

Chana-Munoz A, Jendroszek A, Sønnichsen M, Kristiansen R, Jensen JK, Andreasen PA, Bendixen C and Panitz F 2017. Multi-tissue RNA-seq and transcriptome characterisation of the spiny dogfish (Squalus acanthias) provides a molecular tool for biological research and reveals new genes involved in osmoregulation. PLoS ONE 12, e0182756.

Ferraz RB, Kabeya N, Lopes-Marques M, Machado AM, Ribeiro RA, Salaro AL, Ozório R, Castro LFC and Monroig Ó 2018. A complete enzymatic capacity for long-chain polyunsaturated fatty acid biosynthesis is present in the Amazonian teleost tambaqui, Colossoma macropomum. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology.

Finn RD, Clements J and Eddy SR 2011. HMMER web server: Interactive sequence similarity searching. Nucleic Acids Research 39, W29–W37.

Fonseca E, Ruivo R, Lopes-Marques M, Zhang H, Santos MM, Venkatesh B and Castro LFC 2017. LXRα and LXRβ nuclear receptors evolved in the common ancestor of gnathostomes. Genome Biology and Evolution 9, 222–230.

Gomes F, Watanabe L, Nozawa S, Oliveira L, Cardoso J, Vianez J, Nunes M, Schneider H and Sampaio I 2017. Identification and characterization of the expression profile of the microRNAs in the Amazon species Colossoma macropomum by next generation sequencing. Genomics 109, 67–74.

Goulding M and Carvalho ML 1982. Life history and management of the tambaqui (Colossoma macropomum, Characidae): an important Amazonian food fish. Revista Brasileira de Zoologia 1, 107–133.

Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N and Regev A 2011. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature Biotechnology 29, 644–652.

Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W and Gascuel O 2010. New algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the performance of PhyML 3.0. Systematic Biology 59, 307–321.

Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B, Lieber M, Macmanes MD, Ott M, Orvis J, Pochet N, Strozzi F, Weeks N, Westerman R, William T, Dewey CN, Henschel R, Leduc RD, Friedman N and Regev A 2013. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nature Protocols 8, 1494–1512.

Jump DB, Tripathy S and Depner CM 2013. Fatty acid–regulated transcription factors in the liver. Annual Review of Nutrition 33, 249–269.

Junk WJ, Soares MGM and Bayley PB 2007. Freshwater fishes of the Amazon River basin: Their biodiversity, fisheries, and habitats. Aquatic Ecosystem Health and Management 10, 153–173.

Katoh K and Toh H 2008. Recent developments in the MAFFT multiple sequence alignment program. Briefings in Bioinformatics 9, 286–298.

Krogh A, Larsson BB, von Heijne G and Sonnhammer ELL 2001. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. Journal of Molecular Biology 305, 567–580.

Lima CSD, Bomfim MAD, de Siqueira JC, Ribeiro FB and Lanna EAT 2016. Crude protein levels in the diets of tambaqui, Colossoma macropomum (Cuvier, 1818), fingerlings. Revista Caatinga 29, 183–190.

Madureira TV, Pinheiro I, de Paula Freire R, Rocha E, Castro LF and Urbatzka R 2017. Genome specific PPARαB duplicates in salmonids and insights into estrogenic regulation in brown trout. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 208–209, 94–101.

Ministério da Agricultura, B 2016. Produção da pecuária municipal / IBGE. Retrieved in April 2018 from https://biblioteca.ibge.gov.br/visualizacao/periodicos/84/ppm_2015_v43_br.pdf

Pereiro P, Balseiro P, Romero A, Dios S, Forn-Cuni G, Fuste B, Planas JV, Beltran S, Novoa B and Figueras A 2012. High-throughput sequence analysis of turbot (Scophthalmus maximus) transcriptome using 454-pyrosequencing for the discovery of antiviral immune genes. PLoS ONE 7, e35369.

Powell S, Szklarczyk D, Trachana K, Roth A, Kuhn M, Muller J, Arnold R, Rattei T, Letunic I, Doerks T, Jensen LJ, von Mering C and Bork P 2012. eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges. Nucleic Acids Research 40, D284–D289.

Prado-Lima M and Val AL 2016. Transcriptomic characterization of tambaqui (Colossoma macropomum, Cuvier, 1818) exposed to three climate change scenarios. PLoS ONE 11, e0152366.

Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, Boursnell C, Pang N, Forslund K, Ceric G, Clements J, Heger A, Holm L, Sonnhammer ELL, Eddy SR, Bateman A and Finn RD 2012. The Pfam protein families database. Nucleic Acids Research 40, D290–D301.

Ravi V and Venkatesh B 2018. The divergent genomes of teleosts. Annual Review of Animal Biosciences 6, 47–68.

Reis RE 2013. Conserving the freshwater fishes of South America. International Zoo Yearbook 47, 65–70.

Santos CHA, Santana GX, Sá Leitão CS, Paula-Silva MN and Almeida-Val VMF 2016. Loss of genetic diversity in farmed populations of Colossoma macropomum estimated by microsatellites. Animal Genetics 47, 373–376.

Da Silva Nunes JDR, Liu S, Pértille F, Perazza CA, Villela PMS, De Almeida-Val VMF, Hilsdorf AWS, Liu Z and Coutinho LL 2017. Large-scale SNP discovery and construction of a high-density genetic map of Colossoma macropomum through genotyping-by-sequencing. Scientific Reports 7, 46112.

Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV and Zdobnov EM 2015. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212.

Smith-Unna R, Boursnell C, Patro R, Hibberd JM and Kelly S 2016. TransRate: reference-free quality assessment of de novo transcriptome assemblies. Genome Research 26, 1134–1144.

Tine M, Kuhl H, Gagnaire P-A, Louro B, Desmarais E, Martins RST, Hecht J, Knaust F, Belkhir K, Klages S, Dieterich R, Stueber K, Piferrer F, Guinand B, Bierne N, Volckaert FAM, Bargelloni L, Power DM, Bonhomme F, Canario AVM and Reinhardt R 2014. European sea bass genome and its variation provide insights into adaptation to euryhalinity and speciation. Nature Communications 5, 5770.

Wu Y-P, Xie J-F, He Q-S and Xie J-L 2016. The complete mitochondrial genome sequence of Colossoma macropomum (Characiformes: Serrasalmidae). Mitochondrial DNA Part A 27, 4080–4081.

Yáñez JM, Newman S and Houston RD 2015. Genomics in aquaculture to better understand species biology and accelerate genetic progress. Frontiers in Genetics 6, 128.

You C, Jiang D, Zhang Q, Xie D, Wang S, Dong Y and Li Y 2017. Cloning and expression characterization of peroxisome proliferator-activated receptors (PPARs) with their agonists, dietary lipids, and ambient salinity in rabbitfish Siganus canaliculatus. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 206, 54–64.

Chapter 4

Regulation of gene expression associated with LC-PUFA metabolism in juvenile tambaqui (Colossoma macropomum) fed different dietary oil sources

4.1. Abstract

Dietary long-chain (≥C20) polyunsaturated fatty acids (LC-PUFA) from marine fish oil (FO) and fishmeal play a key physiological and developmental role in fish, since the endogenous capacity to biosynthesize LC-PUFA varies considerably with biotic and abiotic factors. Given the ecological unsustainability of relying on marine ingredients, the search for alternative dietary LC-PUFA sources is crucial for aquaculture. Tambaqui (Colossoma macropomum) is an economically valuable aquaculture species in Brazil that can endogenously elongate and desaturate linoleic acid (LA; 18:2n-6) and α-linolenic acid (ALA; 18:3n-3) into longer, physiologically vital LC-PUFA. However, it is not yet clear how this pathway is regulated by different dietary oil sources. In this context, RNAseq resources are essential to improve nutrition and, consequently, productivity. In this study, we applied RNAseq technology to produce the first comprehensive brain transcriptome of tambaqui, to better understand its LC-PUFA metabolism. Our analysis generated a high-quality reference transcriptome with a total of 95,239 transcripts, an N50 of 2,483 bp and an average length of 1,720.2 bp. Using the brain RNAseq data generated here together with the previously published liver RNAseq, important genes in lipid metabolism were identified. Juvenile tambaqui were then fed for 9 weeks with four diets (FO5, FO10, VO5 and VO10) varying in oil source (fish oil, FO, or vegetable oils, VO) and level (5% or 10%), and the mRNA expression of genes encoding enzymes involved in LC-PUFA biosynthesis was measured by real-time PCR. In the animal experiment, no differences in survival, body weight gain (BWG) or feed conversion ratio were observed between treatments. The real-time PCR results show that fads2 and elovl5 were up-regulated in liver, while fads2 and elovl2 were up-regulated in brain, in tambaqui fed VO. This suggests that tambaqui can biosynthesize LC-PUFA when fed the precursors LA and ALA. The transcription factors pparβb and pparγ were also up-regulated in the brain by the VO diet compared with the FO diet. The VO diet also contributed to the biosynthesis of LC-PUFA in liver and specifically of DHA in brain. Overall, our approach demonstrated that genes relevant to lipid metabolism were regulated by the dietary lipid source. These results contribute to a better understanding of LC-PUFA metabolism and its regulation in tambaqui.

4.2. Introduction

Finfish aquaculture consumes a large share (~75%) of the globally produced fishmeal (FM) and fish oil (FO), which mostly derive from capture fisheries and, to a lesser extent, from by-products of aquaculture and capture fisheries. Both FM and FO are considered highly nutritious and digestible ingredients, and they constitute key components of current aquafeed formulations (Tacon and Metian 2008). With the expansion of aquaculture in recent decades, alternatives to FM and FO have been actively sought, and plant-derived ingredients have now become common in aquafeeds (Turchini, Torstensen, and Ng 2009). Plant raw materials differ remarkably from marine ingredients in their nutritional profiles. In terms of lipids in particular, vegetable oils (VO) are devoid of the physiologically important long-chain (≥C20) polyunsaturated fatty acids (LC-PUFA) that are readily available in FO and, to a lesser extent, FM (Nasopoulou and Zabetakis 2012). Rather, VO are typically rich in oleic acid (OA, 18:1n-9) and C18 polyunsaturated fatty acids (PUFA), namely linoleic acid (LA, 18:2n-6) and α-linolenic acid (ALA, 18:3n-3) (Turchini, Torstensen, and Ng 2009). Generally, VO are highly digestible and good energy sources, but their use in feed may result in lower growth than FO (Chen et al. 2020; Teoh and Ng 2016; Fountoulaki et al. 2009). This detrimental effect of VO-rich feeds can be partly counteracted in fish species able to convert the C18 PUFA contained in VO into LC-PUFA.

The endogenous capacity of fish to convert the C18 PUFA LA and ALA into LC-PUFA varies among species as a function of the gene complement and function of fatty acyl desaturases (Fads) and elongation of very long-chain fatty acid (Elovl) proteins (Castro, Tocher, and Monroig 2016; Garrido et al. 2019; Monroig, Tocher, and Castro 2018). Certain Fads and Elovl enzymes catalyse reactions through which C18 PUFA are desaturated (new double bonds are inserted) and elongated (acyl chains are extended by two carbons) to produce LC-PUFA (see figure 3.1 in Monroig, Tocher, and Castro 2018) (Cook 1996; Castro, Tocher, and Monroig 2016). In addition to these two major enzyme families involved in LC-PUFA biosynthesis, many key enzymes take part in lipid transport, lipid absorption and FA bioconversion (Torstensen, Lie, and Frøyland 2000; Castro et al. 2015; Sarameh et al. 2019; Chen et al. 2020), and several transcription factors (TF) are involved in the overall regulation of lipid metabolism. TFs such as the liver X receptor (LXR), peroxisome proliferator-activated receptors (PPARs) and sterol regulatory element binding proteins (SREBP) have received great attention because they play crucial roles in regulating the expression of genes involved in lipid metabolism in general, and LC-PUFA biosynthesis in particular, in both mammals and teleost fish (Fievet and Staels 2009; Daemen, Kutmon, and Evelo 2013; Monroig, Tocher, and Castro 2018). In fish, srebps, lxr and ppars have been cloned and characterized from various species. Japanese seabass (Lateolabrax japonicus) (Dong et al. 2015), rabbitfish (Siganus canaliculatus) (You et al. 2017), Atlantic salmon (Salmo salar L.) (Betancor et al. 2014) and yellow croaker (Larimichthys crocea) (Du et al. 2017) are examples in which gene expression has been shown to be regulated by TFs under dietary modulation, suggesting that TFs might be involved in the upregulation of LC-PUFA biosynthesis with dietary VO. To understand the effects of the different dietary lipid sources currently used in the finfish industry, it is essential to assess the regulation of relevant genes involved in key lipid metabolic pathways, and of various enzymes involved in lipid biosynthesis and lipid catabolism.

Tambaqui (Colossoma macropomum) is a freshwater teleost species endemic to the Amazon and Orinoco basins. It is of great importance to aquaculture in Brazil owing to its good production performance, including high growth rate, ease of reproduction in captivity, ease of fingerling and juvenile production, and good acceptance by the consumer (Silva, Gomes, and Brandão 2007). With regard to tambaqui nutrition, wild adult specimens are predominantly omnivorous, feeding abundantly on fruits and seeds during flooding episodes (Almeida, Visentainer, and Franco 2008). Moreover, it has been shown that tambaqui digestive enzymes are regulated in response to changes in the lipid composition of the feed (De Almeida, Lundstedt, and Moraes 2006; De Almeida et al. 2011; Pereira et al. 2017). The effect of dietary VO containing varying LA and ALA ratios on growth performance and feed utilization has been evaluated in tambaqui (Paulino et al. 2018). In general, tambaqui fed VO-based diets had eicosapentaenoic acid (EPA, 20:5n-3), docosahexaenoic acid (DHA, 22:6n-3) and arachidonic acid (ARA, 20:4n-6) contents that, although lower than those of fish fed the FO-based diet, were sufficient to improve the nutritional quality of tambaqui fillets. This suggests an ability to efficiently use dietary C18 PUFA from a variety of sources and convert them into LC-PUFA including EPA, DHA and ARA. Consistently, the orthologs of fads2, elovl5 and elovl2 have been isolated from tambaqui and their putative proteins functionally characterized in yeast, indicating that all enzymatic capacities required for converting C18 PUFA into LC-PUFA may be present in tambaqui (Ferraz et al. 2019). More specifically, the tambaqui Fads2 has Δ6, Δ5 and Δ8 desaturase capacities, thus enabling all desaturation reactions required for ARA, EPA and DHA biosynthesis from C18 PUFA precursors. In addition, tambaqui Elovl5 and Elovl2 can together elongate PUFA substrates with chain lengths ranging from 18 to 22 carbons (Ferraz et al. 2019).

Transcriptome profiling using deep-sequencing technology, commonly known as RNAseq, has become increasingly popular because it rapidly provides a holistic view of molecular pathway activation and helps identify key genes, including in the field of aquaculture (Pereiro et al. 2012; Chana-Munoz et al. 2017). In tambaqui, some studies have used this tool, for example for muscle transcriptome analysis under specific conditions (e.g. climate scenarios) (Prado-Lima and Val 2016), microRNA expression (Gomes et al. 2017) and, more recently, the liver transcriptome (Machado et al. 2019). Nevertheless, additional genomic information is essential to improve the nutritional management of tambaqui and develop the aquaculture of this species. Here, we provide an extended molecular resource for tambaqui, the assembled and annotated brain transcriptome, and use it to investigate the effects of different dietary oil sources on the regulation of genes relevant to lipid metabolism pathways. Key genes involved in lipid metabolism were identified from the brain RNAseq data generated here together with the previously published liver RNAseq (Machado et al. 2019). Juvenile tambaqui were then fed for 9 weeks with four diets (FO5, FO10, VO5 and VO10) varying in oil source (fish oil, FO, or vegetable oil, VO) and level (5% or 10%), and real-time PCR was used to measure the mRNA expression of genes encoding enzymes involved in LC-PUFA biosynthesis (fads2, elovl5 and elovl2), TFs (pparαa, pparαb, pparβa, pparβb, pparγ, lxrα and srebp1), and enzymes involved in fatty acid biosynthesis (fas) and lipid catabolism (lpl). Our overarching hypothesis is that understanding how different dietary lipid sources regulate relevant genes in key lipid metabolic pathways will provide insight to optimize diet formulations and make effective use of sustainable dietary lipid sources in tambaqui aquaculture.

4.3. Materials and methods

Ethics

The animal experiment was approved by the Ethical Committee on the Use of Production Animals of the Federal University of Viçosa, protocol number 044/2017.

RNA extraction and sequencing

Tambaqui juveniles were obtained from a commercial farm (Aquaculture Vale Dos Lagos, CNPJ/CPF: 08.764.701/0001-09. Address: OTR Matutina, S/N, Itarana-ES, CEP 29.620-000). Fish were acclimated to the experimental conditions for 30 days while fed a commercial diet. After that, one fish was collected for brain RNAseq and the remaining animals went on to the feeding trial (more details below). This fish was sacrificed (overdose of clove oil, 400 mg L-1), the brain was dissected, and the sample was preserved in RNA stabilization buffer (3.6 M ammonium sulphate, 18 mM sodium citrate, 15 mM EDTA, pH 5.2) and stored at −80 °C prior to RNA extraction. Total RNA was extracted using the Illustra RNAspin Mini Kit (GE Healthcare, Chicago, USA), following the manufacturer’s protocol, with on-column DNase digestion during the extraction procedure. RNA quality was verified in 1% (w/v) agarose gels. Total RNA concentration was determined using the Quant-iT™ RiboGreen® RNA Assay Kit (Invitrogen, California, USA). The brain RNA sample was sequenced by STABVIDA Lda (Caparica, Portugal) on an Illumina HiSeq 4000 platform with 150 bp paired-end reads.

Raw dataset clean up, assembly and ORF prediction of brain transcriptome

The raw sequencing reads were inspected, quality-filtered and assembled following the protocol applied in Machado et al. (2018). Briefly, FastQC (v.0.11.8) (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) was used for quality control of the raw dataset; Trimmomatic (v.0.38) (Bolger, Lohse, and Usadel 2014), with parameters “Leading:15 Trailing:15 Sliding window:4:20 Minlen:50”, was used to trim and remove low-quality reads; and Trinity (v2.4.0) (Grabherr et al. 2011; Haas et al. 2013), with parameters “SS_lib_type RF; min_contig_length 300”, was used to perform the de novo assembly of the C. macropomum brain transcriptome. To remove sequences exogenous to C. macropomum, taxonomic and blast searches were conducted against two databases, NCBI nucleotide (NCBI-nt) (downloaded on 30/03/2019) and UniVec (downloaded on 02/04/2019). The blast searches followed this procedure: 1) blast-n against the NCBI-nt database, using conservative parameters (-evalue 1e-5; -max_target_seqs 1; -perc_identity 90; -max_hsps 1; and a minimum alignment length of 100 bp); 2) blast-n hits obtained from the NCBI-nt database that were not classified as Actinopterygii were considered contamination and removed; 3) blast-n against the UniVec database (parameters -reward 1; -penalty -5; -gapopen 3; -gapextend 3; -dust yes; -soft_masking true; -evalue 700; -searchsp 1750000000000); 4) all transcripts with a match against the UniVec database were removed; 5) in the end, only transcripts without any match, or with matches taxonomically classified as Actinopterygii, were kept in the transcriptome and used in further analyses.

To assess the quality of the raw transcriptome assembly, several strategies and software tools were applied. First, Transrate (v1.0.3) (Smith-Unna et al. 2016) was used to calculate basic statistics, such as transcript number, mean length and N50 length, among others. Next, Benchmarking Universal Single-Copy Orthologs (BUSCO v3.0.2) (Simão et al. 2015), with four lineage-specific databases (Metazoa, Eukaryota, Vertebrata and Actinopterygii), was used to estimate gene content completeness. Finally, the rate of reads mapping back to the transcriptome (RBMT) was assessed with Bowtie2 (v2.3.5) (Langmead and Salzberg 2012).

TransDecoder (v5.3.0) (https://transdecoder.github.io/) was used to predict the open reading frames (ORF) of the raw transcriptome assembly. During ORF prediction we used the TransDecoder homology options: blast-p (v.2.9.0) (Altschul et al. 1990), with an e-value cut-off of 1e-5, against the UniProtKB/Swiss-Prot database (downloaded on 12/04/2019) (UniProt Consortium 2019), and hmmscan (v.2.4i) (Finn, Clements, and Eddy 2011) against the PFAM database (downloaded on 12/04/2019) (Bateman et al. 2004). The remaining software settings were kept at their defaults (including the minimum protein length of 100 amino acids). In addition, redundant proteins were clustered and filtered out with CD-HIT (v.4.7) (Li and Godzik 2006) (parameters -c 0.95 -g 1 -M 40000 -T 30). The subsequent analyses focused on protein-coding genes, and therefore only transcripts containing an ORF (i.e., the protein coding transcriptome) were considered and scrutinized. In other words, further analyses used the non-redundant dataset of proteins and the corresponding transcripts (extracted from the raw transcriptome assembly). For comparative purposes, the “raw transcriptome assembly” of the liver from a previous work (Machado et al. 2019) was also collected and subjected to the procedures described above.
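The retention rule of the decontamination procedure and the N50 statistic reported by Transrate can be illustrated with a short sketch. It is a minimal illustration, not the pipeline used in this work: it assumes the blast-n outputs have already been parsed and taxonomically classified, and the function names (`keep_transcript`, `n50`) and the toy records are hypothetical.

```python
from typing import Iterable, Optional


def keep_transcript(nt_hit_class: Optional[str], has_univec_hit: bool) -> bool:
    """Apply the retention rule to one transcript.

    nt_hit_class: taxonomic class of the NCBI-nt blast-n hit, or None if no hit.
    has_univec_hit: True if the transcript matched any UniVec sequence.
    """
    if has_univec_hit:
        return False  # any UniVec match is treated as contamination
    # keep transcripts with no hit at all, or with an Actinopterygii hit
    return nt_hit_class is None or nt_hit_class == "Actinopterygii"


def n50(lengths: Iterable[int]) -> int:
    """Length L such that transcripts of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no transcript lengths given")


# Hypothetical toy records: (id, NCBI-nt hit class, UniVec hit, length in bp)
toy = [
    ("t1", "Actinopterygii", False, 2400),
    ("t2", None, False, 900),
    ("t3", "Mammalia", False, 1500),      # non-fish hit: removed
    ("t4", "Actinopterygii", True, 3100),  # vector match: removed
    ("t5", None, False, 600),
]
kept = [rec for rec in toy if keep_transcript(rec[1], rec[2])]
kept_ids = [rec[0] for rec in kept]
kept_n50 = n50(rec[3] for rec in kept)
print(kept_ids, kept_n50)
```

In this toy set the vector-matching and non-Actinopterygii transcripts are dropped before any statistic is computed, which mirrors the order of operations described above.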

Transcriptome annotation, KEGG pathways exploratory analyses and cross-species comparisons

The protein coding transcriptome assembly of the brain was annotated using the Trinotate pipeline (Bryant et al. 2017). This pipeline provides several strategies to functionally annotate large transcriptomic datasets, e.g., Blast searches against different databases, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (Kanehisa et al. 2011), protein domains (PFAM database) (Punta et al. 2011), orthologous groups of genes (eggNOG) (Powell et al. 2011) and Gene Ontology (GO) (Ashburner et al. 2000). The blast-p and blast-x searches were done against two UniProt databases (UniProt Consortium 2019), namely UniRef90 (downloaded on 12/04/2019) and Swiss-Prot (downloaded on 12/04/2019), using the DIAMOND software (v.0.9.24) (Buchfink, Xie, and Huson 2015). The remaining analyses (in the abovementioned databases) were performed with the default settings and directly integrated into the Trinotate-provided SQLite database template. The Trinotate annotation report was generated with an e-value cut-off of 1e-5. To strengthen the functional annotation of the protein coding transcriptome assembly, we conducted additional blast-x, blast-p and blast-n searches against NCBI databases, namely the non-redundant protein database (NCBI-nr, downloaded on 12/04/2019) and the nucleotide database (NCBI-nt, downloaded on 30/03/2019). DIAMOND was used to build and search a local NCBI-nr database (e-value cut-off of 1e-5), while the blast-n tool (v.2.9.0) (Altschul et al. 1990) was used to search the local NCBI-nt database (e-value cut-off of 1e-5). Finally, all additional blast outputs were integrated into the Trinotate report using in-house shell scripts.

The KEGG pathway exploratory analyses were done using the KEGG Automatic Annotation Server (KAAS) (Moriya et al. 2007). The analyses were performed for the brain and liver transcriptomes at the protein level (TransDecoder proteins), using the single-directional best hit (SBH) method and a reference set of 1,062,583 sequences from 40 organisms, of which 33 are teleost fish (for the detailed list of species, consult the Figshare repository). OrthoFinder (v.2.3.3) (Emms and Kelly 2015) was used, with default settings, to perform the cross-species comparisons and identify orthologous genes in three species. For these analyses, we used the TransDecoder proteins of the liver and brain C. macropomum transcriptomes as well as the proteomes of Pygocentrus nattereri (NCBI accession number: GCA_001682695.1; downloaded on 16/09/2019) and Astyanax mexicanus (NCBI accession number: GCA_000372685.2; downloaded on 16/09/2019). Both proteomes were selected because of their close taxonomic relationship with tambaqui, as both species belong to the order Characiformes.

The raw sequencing reads have been submitted to the NCBI SRA database under BioProject number PRJNA605748. The raw de novo transcriptome assemblies (brain and liver), the protein coding transcriptomes (brain and liver), the annotation report (brain), the OrthoFinder files and the KEGG files reporting the transcript-KO correspondence were deposited in the Figshare digital repository.

Feeding trial and experimental conditions

Fish from the acclimation period described above were submitted to a feeding trial in which 160 fish of similar weight (8.61 ± 1.38 g) were stocked in 20 tanks (60 L), 8 fish per tank. Fish were hand-fed the corresponding experimental diets (FO5, FO10, VO5 and VO10; see details below) to apparent satiation three times per day (at 08:00, 12:00 and 17:00 h) for 9 weeks. A freshwater recirculation system containing a physical filter (to retain particles), a biological filter and UV light was used to maintain water quality. During the experimental period, the water temperature was 27.0 ± 0.5 ºC (mercury thermometer), dissolved oxygen was approximately 5.7 ± 0.5 mg L-1 (automatic oxygen analyzer YSI-550A, USA) and total ammonia was 0.006 ± 0.002 mg L-1 (Kit Labcon Test, Industry and Commerce of Dehydrated Foods Alcon Ltda, Brazil). At the end of the feeding trial, all fish were fasted for 24 h, sacrificed (overdose of clove oil, 400 mg L-1), counted, weighed and measured for length. Liver and brain were dissected (n=5 liver/brain per treatment). Samples were preserved in RNA stabilization buffer (3.6 M ammonium sulphate, 18 mM sodium citrate, 15 mM EDTA, pH 5.2) and stored at −80 °C prior to RNA extraction, which followed the same protocol described above.

Experimental diets

Four isoproteic (327.05 g kg-1), isolipidic (10%) and isoenergetic (4404.25 MJ kg-1) diets were formulated with two oil sources, fish oil (FO) and a vegetable oil mix (VO) containing 60% soybean oil and 40% linseed oil, at two oil levels (5% and 10%) (Table 4.1), in a 2 × 2 factorial design (FO5, FO10, VO5 and VO10) (Figure 4.1). All ingredients except the oils were finely ground, sieved and then thoroughly mixed with the corresponding oil source and level in a feed mixer (Filizola, P-22, São Paulo, SP, Brazil); an appropriate amount of water was then added to produce a stiff dough. The dough was passed through a meat mincer to produce threads that were cut into pellets. Experimental diets were stored at −20 °C until use. The fatty acid composition of the experimental diets is given in Table 4.2.

Figure 4.1. Experimental set up of the nutrition assay: fish and diets, procedure, sample collection, RNA extraction and cDNA synthesis and real-time PCR experiments.

Dietary lipid analysis

Samples of the experimental diets (~5 g) were lyophilized (Labconco, Kansas City, MO, USA) and ground in a ball mill (Marconi MA923, Piracicaba, SP, Brazil) for the proximate analyses. Crude protein content (%N × 6.25) was determined by the Kjeldahl method (Quimis, Diadema, SP, Brazil). Total lipid content was determined by ether extraction in a Soxtherm 2000 extractor (Gerhardt, New Orleans, USA). Mineral matter content was determined by combusting dry samples in a muffle furnace (TE-1100-1P, Tecnal, Piracicaba, SP, Brazil) at 550 °C for 6 h, according to the Association of Official Analytical Chemists (AOAC, 2016). Gross energy content was determined using an adiabatic bomb calorimeter (Parr 1266, Parr Instruments Co., Moline, IL, USA). All analyses were performed in triplicate. For the fatty acid analyses, total lipids were extracted from the diets following Folch, Lees, and Sloane Stanley (1957), using chloroform:methanol (2:1, v/v) as the extraction solvent, and quantified gravimetrically. Polar and neutral lipid fractions were separated by adsorption chromatography on silica cartridges; subsequently, fatty acid methyl esters (FAME) were prepared from subsamples of each lipid fraction by transesterification with methanol in sulfuric acid (DILS 1974). FAME were analysed on a gas chromatograph.

Table 4.1. Composition and analyses of the experimental diets.

Ingredients (g kg-1)        FO 5%     FO 10%    VO 5%     VO 10%
Soybean meal                520.00    536.30    520.00    536.30
Corn gluten                 100.00    100.00    100.00    100.00
Wheat bran                  100.00    100.00    100.00    100.00
Rice bran                   162.30     45.60    162.30     45.60
Inert (kaolin)               10.00     50.00     10.00     50.00
Cellulose                     0.00     10.00      0.00     10.00
L-Lysine                      4.00      4.40      4.00      4.40
DL-Methionine                 4.40      4.40      4.40      4.40
Fish oil                     50.00    100.00      0.00      0.00
Soybean oil                   0.00      0.00     30.00     60.00
Linseed oil                   0.00      0.00     20.00     40.00
Dicalcium phosphate          39.00     39.00     39.00     39.00
C                             0.10      0.10      0.10      0.10
Common salt                   5.00      5.00      5.00      5.00
Mineral and vitamin mix3      5.00      5.00      5.00      5.00
Antioxidant BHT               0.20      0.20      0.20      0.20

Proximate composition* (g kg-1)
Dry matter                  938.90    951.30    945.80    937.30
Crude protein               321.50    325.10    330.90    330.70
Crude lipid                  93.70    114.80    104.60     97.50
Ash                         109.00    140.40    109.00    142.50
Energy (MJ/Kg)                4439      4359      4423      4396

1VO – soybean and linseed oils in the proportion of 60% and 40%, respectively. 2FO – Fish oil. 3Levels of guarantee per kilogram of product: Vit. A, 1,200,000 IU; Vit. D3, 200,000 IU; Vit. E, 12,000mg; Vit. K3, 2,400mg; Vit. B1, 4,800mg; Vit. B2, 4,800mg; Vit. B6, 4,000mg; Vit. B12, 4,800mg; B.C. Folic acid, 1,200mg; Pantothenate Ca, 12,000mg; Vit. C, 48,000mg; Biotin, 48,000mg; choline, 65,000mg; Niacin, 24,000mg; Iron, 10,000mg; Copper, 6,000mg; Manganese, 4,000mg; Zinc, 6,000mg; Iodine, 20mg; Cobalt, 2mg; Selenium, 20mg. *Analyses carried out in the Laboratory of Food Analysis of the Department of Animal Science of the Federal University of Viçosa.
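As a consistency check on Table 4.1, the sketch below recomputes three quantities from the tabulated values: the ingredient total of each diet (expected to be 1000 g kg-1), the 60:40 soybean:linseed split of the vegetable oil mix, and the mean crude protein and energy values quoted in the text. The data structure and function names are illustrative assumptions; the numbers are transcribed from the table, including the row labelled "C" exactly as it appears there.

```python
# Ingredient inclusion (g kg-1), transcribed from Table 4.1.
DIETS = ("FO5", "FO10", "VO5", "VO10")
INGREDIENTS = {
    "Soybean meal":            (520.0, 536.3, 520.0, 536.3),
    "Corn gluten":             (100.0, 100.0, 100.0, 100.0),
    "Wheat bran":              (100.0, 100.0, 100.0, 100.0),
    "Rice bran":               (162.3, 45.6, 162.3, 45.6),
    "Inert (kaolin)":          (10.0, 50.0, 10.0, 50.0),
    "Cellulose":               (0.0, 10.0, 0.0, 10.0),
    "L-Lysine":                (4.0, 4.4, 4.0, 4.4),
    "DL-Methionine":           (4.4, 4.4, 4.4, 4.4),
    "Fish oil":                (50.0, 100.0, 0.0, 0.0),
    "Soybean oil":             (0.0, 0.0, 30.0, 60.0),
    "Linseed oil":             (0.0, 0.0, 20.0, 40.0),
    "Dicalcium phosphate":     (39.0, 39.0, 39.0, 39.0),
    "C":                       (0.1, 0.1, 0.1, 0.1),  # label as printed in the table
    "Common salt":             (5.0, 5.0, 5.0, 5.0),
    "Mineral and vitamin mix": (5.0, 5.0, 5.0, 5.0),
    "Antioxidant BHT":         (0.2, 0.2, 0.2, 0.2),
}
CRUDE_PROTEIN = (321.5, 325.1, 330.9, 330.7)  # g kg-1
ENERGY = (4439, 4359, 4423, 4396)             # units as printed in the table


def ingredient_totals():
    """Sum of all ingredient inclusions for each diet (g kg-1)."""
    return {d: sum(row[i] for row in INGREDIENTS.values())
            for i, d in enumerate(DIETS)}


def vo_split(total_oil):
    """Split a vegetable oil inclusion into 60% soybean and 40% linseed oil."""
    return (60 * total_oil / 100, 40 * total_oil / 100)


def mean(values):
    return sum(values) / len(values)


totals = ingredient_totals()
print(totals)
print(vo_split(50.0), vo_split(100.0))
print(round(mean(CRUDE_PROTEIN), 2), mean(ENERGY))
```

Under these assumptions the recomputed means agree with the isoproteic and isoenergetic values given in the text (327.05 and 4404.25), and the oil rows of the VO diets follow from the 60:40 mix.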

Table 4.2. Fatty acid composition of the experimental diets.

FA (%)                 FO 5%     FO 10%    VO 5%     VO 10%
14:0                    1.3       3.1       0.2       0.2
16:0                   16.6      20.1      14.6      12.6
16:1n-7                 2.4       5.1       0.6       0.5
18:0                    7.0       4.7       3.4       4.1
18:2n-6 (Linoleic)     33.7      16.7      45.3      47.8
18:3n-3 (Linolenic)     3.2       1.9       6.0       7.6
20:1n-11                0.0       0.2       0.4       0.0
20:1n-9                 0.5       1.0       0.0       0.4
20:2n-6                 0.2       0.3       0.0       0.0
20:4n-6 (ARA)           0.5       1.0       0.1       0.0
20:5n-3 (EPA)           0.7       4.6       0.6       0.6
22:1n-11                0.0       0.3       0.0       0.0
22:5n-3                 0.3       1.3       0.0       0.0
22:6n-3 (DHA)           0.7       6.9       0.2       0.2
Σ SAFA                 25.66     29.51     18.41     17.18
Σ MUFA                 33.47     32.61     29.15     26.37
Σ PUFA                 40.88     37.88     52.44     56.45
Σ n-9                  30.31     19.76     27.81     25.80
Σ n-6                  34.27     18.40     45.29     47.83
Σ n-3                   5.30     16.30      6.84      8.48
n-3 HUFA                1.71     13.25      0.81      0.83
n-6/n-3                 6.47      1.13      6.62      5.64
LA/ALA                 10.53      8.78      7.55      6.28
DHA/EPA                 1.02      1.52      0.31      0.26

FA – Fatty acids. VO – soybean and linseed oils in the proportion of 60% and 40%, respectively. FO – Fish oil. SAFA – Saturated fatty acids. MUFA – Monounsaturated fatty acids. PUFA – Polyunsaturated fatty acids.
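The derived rows of Table 4.2 can be recomputed from the class sums. The sketch below, which is only an illustration with hypothetical names, recalculates the n-6/n-3 ratio from Σ n-6 and Σ n-3 and checks that the saturated, monounsaturated and polyunsaturated classes add up to about 100% in each diet. Ratios recomputed from the one-decimal individual fatty acid rows (for example LA/ALA) can differ in the last digit from the tabulated values, presumably because the table was derived from unrounded data.

```python
# Class sums (% of total fatty acids), transcribed from Table 4.2.
SUMS = {
    "FO5":  {"SAFA": 25.66, "MUFA": 33.47, "PUFA": 40.88, "n6": 34.27, "n3": 5.30},
    "FO10": {"SAFA": 29.51, "MUFA": 32.61, "PUFA": 37.88, "n6": 18.40, "n3": 16.30},
    "VO5":  {"SAFA": 18.41, "MUFA": 29.15, "PUFA": 52.44, "n6": 45.29, "n3": 6.84},
    "VO10": {"SAFA": 17.18, "MUFA": 26.37, "PUFA": 56.45, "n6": 47.83, "n3": 8.48},
}


def n6_n3_ratio(diet):
    """n-6/n-3 ratio of a diet, rounded to two decimals as in the table."""
    row = SUMS[diet]
    return round(row["n6"] / row["n3"], 2)


def class_total(diet):
    """Sum of the saturated, monounsaturated and polyunsaturated classes (%)."""
    row = SUMS[diet]
    return row["SAFA"] + row["MUFA"] + row["PUFA"]


ratios = {diet: n6_n3_ratio(diet) for diet in SUMS}
class_totals = {diet: class_total(diet) for diet in SUMS}
print(ratios)
print(class_totals)
```

The recomputed ratios show the contrast that motivates the trial: the FO10 diet has an n-6/n-3 ratio close to 1, whereas the other three diets are between about 5.6 and 6.6.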

Identification of C. macropomum gene sequences and primers design

The ORFs of the key genes involved in LC-PUFA biosynthesis (fads2, elovl5 and elovl2), TFs (pparαa, pparαb, pparβa, pparβb, pparγ, lxrα and srebp1), enzymes involved in fatty acid biosynthesis (fas) and lipid catabolism (lpl), and real-time PCR normalization genes (actb, tuba and b2m) were identified and retrieved from the tambaqui liver (Machado et al. 2019) and brain (this study) transcriptomes.

Initially, the reference target gene sequences were manually identified via blast-n searches on the NCBI-nt database and gathered from the P. nattereri genome (accession numbers of each gene are available in Table 4.3). To collect the orthologous gene candidates from the tambaqui transcriptomes, we used two strategies. First, using the P. nattereri protein accession numbers, we located the orthogroups of our target genes in the cross-species comparison analyses (see above) (Suppl. Table 4.7). We then conducted blast-n searches in each transcriptome to confirm the orthogroup hits, resolve unclear cases and retrieve the corresponding amino acid sequence of each gene.

Table 4.3. Nucleotide sequences of Pygocentrus nattereri used to search for Colossoma macropomum sequences in the RNAseq data, and primers used for real-time PCR gene expression analysis.

Target gene          Accession number   Forward primer (5′-3′)   Reverse primer (5′-3′)
fads2                XM_017706719.1     AGTTGGCTCCTTCTGAACCC     TCTGCTTCAAACTGGTCCCG
elovl5               XM_017694149.1     ATATGAAGGACAGGCAGCCG     TATCCACCTTGCCACACAGC
elovl2               XM_017696189.1     CACTCAGTGCAGCGGTTTTC     GCTTTGTCATCTCTTGCTGGC
fas                  XM_017710987.1     TTCAGCCAATGCTACGTCTG     AGGGTCCCACTGAACAAATG
lpl                  XM_017693341.1     TGTACGAGAGCTGGGTTCCT     GACGAACTTGGCCACATCTT
lxrα                 XM_017703618.1     CTCGTGAGGACCAGATAGCC     TCGATCTGTAGCCCTGCTTT
srebp-1              XM_017684875.1     TACAGCCATGTTTGCTGGAA     GTCACCTGGATTTGCGAACT
pparα                XM_017716837.1     CTCCCTCAACGATGACACCT     ACCGCTCCAAACACTGTAGA
pparαb               XM_017701190.1     TCCAGCGTCAAGTCCATCAT     CGTGGACCCCGTAGTGATAA
pparβ                XM_017722410.1     GAGAAGGCAGTGAGGTCTCG     AGAGAGGAGGTCTCCGATGG
pparβb               XM_017688879.1     TGCAGGATTGGAGGAGAACA     AGCTTCATCTCGACCCAACA
pparγ                XM_017704775.1     GTCTACGACCACTGCGACCT     CGAATGGCATTGTGTGACAT
βactin               XM_017703173.1     TACCCTGGCATTGCAGACAG     TATTTACGCTCAGGTGGGGC
β-2-microglobulin    XM_017696347.1     GAACTTCCAACCCTGCTCAA     GCTACAGCCCTGGTGAGTTC
αtubulin             XM_017683409.1     CCGTAAACTGGCTGACCAAT     GGAGCTGGATAAATGGCAAA

fads2, Fatty acid desaturase 2; elovl5, Elongase of very long chain fatty acids 5; elovl2, Fatty acid elongase 2; fas, Fatty acid synthase; lpl, Lipoprotein lipase; lxrα, Nuclear receptor subfamily 1; srebp-1, Sterol regulatory element binding protein 1; pparα, Proliferator activated receptor alpha; pparαb, Peroxisome proliferator-activated receptor alpha-like; pparβ, Proliferator activated receptor beta; pparβb, Peroxisome proliferator-activated receptor beta-like; pparγ, Proliferator activated receptor gamma; βactin, Beta actin; β-2-microglobulin, Beta-2-microglobulin; αtubulin, Alpha tubulin.

To validate the orthology of the predicted tambaqui sequences, seven independent phylogenetic trees were constructed, one each for Lxrα (Supplementary Figure 4.1), Srebp-1 (Supplementary Figure 4.2), Lpl (Supplementary Figure 4.3), Fas (Supplementary Figure 4.4), Actb (Supplementary Figure 4.5), B2m (Supplementary Figure 4.6) and Tuba (Supplementary Figure 4.7). For each phylogenetic tree, we used the predicted tambaqui proteins, protein sequences of other teleost fish species and proteins of Latimeria chalumnae (outgroup) obtained from ENSEMBL (Supplementary Table 4.1). The amino acid sequences were aligned with MAFFT (Katoh and Toh 2008) using the L-INS-i method. The resulting sequence alignments were inspected and stripped of columns containing gaps. Final sequence alignments were submitted to PhyML v3.0 (Guindon et al. 2010) for phylogenetic analysis. After confirmation and validation, the sequences obtained by RNA-Seq were used to design specific primers with Primer3 (v. 0.4.0) software (http://bioinfo.ut.ee/primer3-0.4.0/), in a region flanking a conserved intron (primer length 18–22 nucleotides and GC content 40–60%). Primers were synthesized by STABVIDA (Portugal). The primer sequences are included in Table 4.3.
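The primer design window described above (length and GC content) can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the thresholds are the ones stated in the text, and the check is not part of the Primer3 workflow itself.

```python
def gc_content(primer: str) -> float:
    """Return the GC content of a primer sequence as a percentage."""
    seq = primer.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def passes_design_window(primer, min_len=18, max_len=22, min_gc=40.0, max_gc=60.0):
    """Check primer length and GC content against the design window in the text."""
    return (min_len <= len(primer) <= max_len
            and min_gc <= gc_content(primer) <= max_gc)


# fads2 forward primer as listed in Table 4.3
fads2_forward = "AGTTGGCTCCTTCTGAACCC"
print(len(fads2_forward), gc_content(fads2_forward), passes_design_window(fads2_forward))
```

The same two functions can be looped over every primer in Table 4.3 to flag any sequence outside the stated window.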

cDNA synthesis and real-time PCR assays

cDNA was synthesized from 1000 and 500 ng of total RNA for liver and brain, respectively, using the NZY First-Strand cDNA Synthesis Kit (NzyTech, Portugal) according to the manufacturer's instructions. Fluorescence-based real-time PCR was performed with the NZYSpeedy qPCR Green Master Mix (2x) (NzyTech). Each sample was analyzed in duplicate using 96-well optical plates in a 10 μL reaction volume consisting of 2 μL of cDNA (1/10 dilution), 5 μL of 2x NZYSpeedy qPCR Green Master Mix (NzyTech), 0.4 μL of each of the appropriate forward and reverse primers (200 nM), and 2.2 μL of ultra-pure water. On each plate, a "no template control" was included. The real-time PCR program included an initial polymerase activation at 95 °C for 2 min, and 40 cycles with denaturation at 95 °C for 5 s, annealing at 62 °C for 27 s and extension at 58 °C to 62 °C depending on the gene. A melting curve was generated in every run to confirm the specificity of the assays. The PCR products were also analyzed by agarose gel electrophoresis to confirm the presence of single bands, and one PCR product of each gene was sequenced to confirm the identity of the amplicon (data not shown). To determine the reaction efficiency, a standard curve consisting of five-fold serial dilutions of cDNA from a mix of all samples was also run on each plate (reaction efficiency ranged from 87% to 103% for all reactions). Target gene expression was normalized using the geNorm software (Vandesompele et al. 2002), resulting in a multiple reference gene approach for normalization (actb, tuba and b2m). The mRNA levels of the target and reference genes were calculated with an R² greater than 0.98. Relative gene expression was calculated with the ΔΔCt method (Pfaffl 2001).
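As an illustration of how a reaction efficiency can be derived from a five-fold dilution standard curve, the sketch below fits Ct against log10 of the relative template quantity and converts the slope with the usual relation E = 10^(-1/slope) - 1. The Ct values are synthetic, and the function names are hypothetical; the thesis does not state which software performed this calculation.

```python
import math


def fit_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def amplification_efficiency(relative_quantities, ct_values):
    """Efficiency (%) from a standard curve: E = 10^(-1/slope) - 1."""
    log_quantity = [math.log10(q) for q in relative_quantities]
    slope = fit_slope(log_quantity, ct_values)
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


# Synthetic five-point, five-fold dilution series (1, 1/5, 1/25, ...).
dilutions = [1 / 5 ** i for i in range(5)]
# Ideal assay: each five-fold dilution delays Ct by log2(5) cycles.
ideal_ct = [20.0 + i * math.log2(5) for i in range(5)]

efficiency = amplification_efficiency(dilutions, ideal_ct)
print(round(efficiency, 1))
```

With real plate data, the same fit also yields the R² of the standard curve, which is the quantity the text reports as greater than 0.98.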


Statistical analysis

Outliers were removed from each treatment group with GraphPad Prism (https://www.graphpad.com/), which uses the extreme studentized deviate (ESD) method. Two-way ANOVA was used to evaluate the effects of the two factors, oil source and lipid content (% of lipids), on the studied variables, as well as their interaction. A significance level of P < 0.05 was set for the two-way ANOVA. Prior to the two-way ANOVA, homogeneity of variances was assessed with Levene's test. If variances were homogeneous, data were analyzed without transformation. If variances were heterogeneous (P < 0.05), data were transformed with a mathematical function: in the liver, fads2 was transformed with sqrt(x), and pparγ, lxr and fas with 1/x; in the brain, fads2, pparαa, pparαb, pparγ and lpl were transformed with sqrt(x). If variances were still heterogeneous after transformation, a nonparametric test for independent samples was used. This was the case for the gene expression data of pparαa and pparαb in liver (Supplementary Table 4.2). All statistical analyses were performed using SPSS version 11.5 (SPSS, Chicago, IL, USA).
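To make the two-factor design concrete, the following sketch computes the F statistics of a balanced two-way ANOVA (oil source × lipid level, with interaction) from first principles. It is not the SPSS procedure used in this study; the function name and the toy data are hypothetical, and P values would still need to be read from an F distribution with the returned degrees of freedom.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way ANOVA.

    cells maps (source, level) -> list of replicate values.
    Returns F statistics for both main effects and the interaction.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if any(len(values) != n for values in cells.values()):
        raise ValueError("this sketch requires a balanced design")

    grand = mean(x for values in cells.values() for x in values)
    a_mean = {a: mean(x for b in b_levels for x in cells[(a, b)]) for a in a_levels}
    b_mean = {b: mean(x for a in a_levels for x in cells[(a, b)]) for b in b_levels}
    cell_mean = {key: mean(values) for key, values in cells.items()}

    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                    for a in a_levels for b in b_levels)
    ss_error = sum((x - cell_mean[key]) ** 2
                   for key, values in cells.items() for x in values)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_ab, df_error = df_a * df_b, len(cells) * (n - 1)
    ms_error = ss_error / df_error
    return {
        "F_source": (ss_a / df_a) / ms_error,
        "F_level": (ss_b / df_b) / ms_error,
        "F_interaction": (ss_ab / df_ab) / ms_error,
        "df_error": df_error,
    }


# Toy data only (two replicates per diet), not values from this study.
toy = {("FO", 5): [1, 3], ("FO", 10): [2, 4], ("VO", 5): [5, 7], ("VO", 10): [6, 8]}
result = two_way_anova(toy)
print(result)
```

In the actual design each cell would hold the five tank replicates of one diet, giving 16 error degrees of freedom.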

4.4. Results

De novo transcriptome assembly and ORF prediction

To comprehensively characterize the brain transcriptome of C. macropomum, a total of 52,455,842 raw read pairs (104,911,684 reads; Table 4.4) were generated from a single tissue using 150-bp paired-end (PE) sequencing on the Illumina HiSeq4000 platform. After removal of low-quality reads, Illumina adapters and reads containing unknown nucleotides (FastQC and Trimmomatic), about 82.63% of the initial dataset was retained for further analyses (Table 4.4). We chose a de novo approach, using the Trinity software package, to assemble the C. macropomum brain transcriptome. After decontamination, the raw transcriptome assembly had 711,990 transcripts with an N50 of 1331 bp and a mean length of 972.6 bp (Table 4.4). In the next step, we performed ORF prediction using the TransDecoder software. This methodology allowed the identification of 123,050 transcripts encoding an ORF (complete, internal or 5′/3′ partial) of more than 100 amino acids in length. To remove small and overlapping protein fragments, we applied the CD-HIT software with conservative parameters. As a result, the dataset was reduced to 95,239 predicted proteins (protein coding transcriptome assembly). In addition, we also collected the raw transcriptome assembly of our previous work (Machado et al. 2019). Results of the statistical analysis of the protein coding transcriptome assembly of liver are available in Supplementary Table 4.3. In general, the global statistics were very similar between the two assemblies, with values of the same order of magnitude. The small variations can be attributed to methodological differences. The general Trinity and Transrate (Smith-Unna et al., 2016) statistics of the brain raw transcriptome and protein coding transcriptome assemblies can be consulted in Table 4.4. To assess the completeness of the tambaqui brain transcriptome in terms of gene content, the BUSCO tool and several lineage-specific profile libraries were used. In the raw transcriptome assembly, we found 99.4, 98.9, 92.7 and 83.1% of BUSCO hits against the Metazoa, Eukaryota, Vertebrates and Actinopterygii databases, respectively (Table 4.4). Notably, even with ORF filtering, the protein coding transcriptome assembly kept high levels of gene content (Metazoa - 98.7%, Eukaryota - 97.2%, Vertebrates - 92.1% and Actinopterygii - 82.8%), showing that a high percentage of transcripts encode an ORF.
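For readers unfamiliar with the N50 statistic reported above, the sketch below shows one common way to compute it from a list of transcript lengths, together with the read-retention percentage derived from the counts in Table 4.4. The function name and the toy lengths are hypothetical; the values reported in this chapter come from Trinity and Transrate, not from this code.

```python
def n50(lengths):
    """Length L such that transcripts of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no transcript lengths supplied")


# Toy transcript lengths, only to show the calculation.
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))

# Read retention after trimming, using the read counts of Table 4.4.
raw_reads = 104911684
reads_in_assembly = 86692576
retained_percent = 100 * reads_in_assembly / raw_reads
print(round(retained_percent, 2))
```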

Transcriptome annotation

To functionally annotate the protein coding transcriptome of brain, the nucleotide and amino acid sequences were searched with the blast-x/p tools against three databases, Ncbi-nr, Ncbi-nt and UniProt (Uniref90 and SwissProt), with an e-value cut-off of 1e-5. In total, 78,311, 69,581 and 69,648 transcripts showed a significant blast-x/p hit against each database (Figure 4.2a). Additionally, we used other functional annotation sources to produce the final annotation report. Thus, of the 95,239 protein coding transcripts, 48,069 were classified as containing protein domains (PFAM), 55,076 (blast source) and 30,130 (PFAM source) had one or more associated GO terms, 49,929 had one eggNOG or COG annotation term, and 50,187 had a KEGG pathway annotation. Overall, 80,830 transcripts were identified in at least one of the searched databases. The detailed numbers of transcripts annotated per source database are available in Figure 4.2b, Table 4.4 and the Annotation report (available in the Figshare repository).
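The e-value cut-off described above can be illustrated with a small filter over BLAST tabular output. This sketch assumes the standard 12-column tabular format (`-outfmt 6`), in which the e-value is the eleventh field; the thesis does not state which output format was used, and the query and subject identifiers below are invented for the example.

```python
def significant_queries(blast_lines, max_evalue=1e-5):
    """Return the set of query IDs with at least one hit at or below max_evalue.

    Expects BLAST tabular lines (12 tab-separated columns, e-value in column 11).
    """
    hits = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            hits.add(fields[0])
    return hits


# Hypothetical records: qseqid, sseqid, pident, length, mismatch, gapopen,
# qstart, qend, sstart, send, evalue, bitscore
example = [
    "query_1\tsubject_A\t95.0\t200\t10\t0\t1\t600\t1\t200\t1e-50\t400",
    "query_2\tsubject_B\t40.0\t50\t30\t2\t1\t150\t1\t50\t0.02\t30",
]
annotated = significant_queries(example)
print(sorted(annotated))
```

Counting the distinct query IDs returned for each database gives per-database totals of the kind summarized in Figure 4.2 and Table 4.4.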


Table 4.4. Statistics of brain raw and protein coding transcriptome assembly in C. macropomum.

Trimming                                       Brain
Raw sequencing reads (PE)                      104911684
Reads used in assembly                         86692576

Assembly versions                              Raw transcriptome assembly     Protein coding transcriptome assembly
Transcript number                              711990                         95239
N50 transcript length (bp)                     1331                           2483
Mean transcript length (bp)                    972.61                         1720.6
Largest transcript                             31835                          30487
Number of transcripts over 1k nt               198929                         55664
Number of transcripts over 10k nt              744                            366
Total assembled bases                          692488313                      163867816
Back mapping reads (%)                         90.52                          -

BUSCO statistics (Euk;Met;Ver;Act)*
Complete                                       92.8 \ 93.7 \ 74.2 \ 69.3      92.1 \ 92.3 \ 74.2 \ 69.4
Single                                         46.9 \ 49.0 \ 37.5 \ 35.9      61.1 \ 65.7 \ 54.6 \ 50.2
Multi                                          45.9 \ 44.7 \ 36.7 \ 33.4      31.0 \ 26.6 \ 19.6 \ 19.2
Fragment                                       6.6 \ 5.2 \ 18.5 \ 13.8        6.6 \ 4.9 \ 17.9 \ 13.4
Missing                                        0.6 \ 1.1 \ 7.3 \ 16.9         1.3 \ 2.8 \ 7.9 \ 17.2
Total BUSCO groups                             99.4 \ 98.9 \ 92.7 \ 83.1      98.7 \ 97.2 \ 92.1 \ 82.8

Functional annotation
Nº of transcripts with ORF                     -                              95239
Blast-x hits in Ncbi-nt                        -                              78311
Blast-x hits in Ncbi-nr                        -                              69539
Blast-p hits in Ncbi-nr                        -                              64830
Blast-x hits in Uniref90                       -                              69499
Blast-p hits in Uniref90                       -                              64840
Blast-x hits in SwissProt                      -                              59071
Blast-p hits in SwissProt                      -                              56302
Nº of transcripts with GO terms (blast)        -                              55076
Nº of transcripts with GO terms (PFAM)         -                              30130
Nº of transcripts with KEGG pathways           -                              50187
Nº of transcripts with eggNOG/COG              -                              49929
Nº of transcripts with PFAM                    -                              48069
Nº of transcripts in > 1 database              -                              80830
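The "Total BUSCO groups" row of Table 4.4 is the sum of the complete and fragmented percentages, and the missing percentage is its complement. The short check below reproduces that arithmetic for the raw transcriptome assembly; the dictionary keys follow the abbreviations of the table header, and the variable names are illustrative only.

```python
# Percentages for the raw transcriptome assembly, copied from Table 4.4.
complete = {"Euk": 92.8, "Met": 93.7, "Ver": 74.2, "Act": 69.3}
fragmented = {"Euk": 6.6, "Met": 5.2, "Ver": 18.5, "Act": 13.8}

# Total hits = complete + fragmented; missing = 100 - total.
total_hits = {db: round(complete[db] + fragmented[db], 1) for db in complete}
missing = {db: round(100 - total_hits[db], 1) for db in complete}

print(total_hits)
print(missing)
```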


Figure 4.2. Functional annotation of tambaqui brain transcriptome. (a) Venn diagram showing the overlap of match hits against five databases, Ncbi-nr, Ncbi-nt, SwissProt, Uniref90 and Pfam. (b) Percentages and number of transcripts annotated per source database.

Exploratory KEGG pathway analyses

To complement the functional annotation, we performed a broad, exploratory KEGG pathway analysis on the KAAS web server. Importantly, to obtain a first overview of the presence or absence of some genes crucial to this study, we searched the protein datasets of the brain and liver transcriptomes against the sequences of 40 organisms. In total, 19,890 (brain) and 18,601 (liver) proteins were mapped to 400 pathways (Supplementary Table 4.4). Next, we focused on two specific KEGG pathways: the PPAR signalling pathway and a sub-pathway of the Fatty acid elongation map (the endoplasmic reticulum section, n≥16) (Supplementary Figures 4.8, 4.9, 4.10 and 4.11). In the PPAR signalling pathway, we identified 44 (brain; Supplementary Figure 4.8) and 43 (liver; Supplementary Figure 4.9) of 63 genes (Supplementary Table 4.5). As expected, most of the identified genes were common to both transcriptomes, while some genes were identified in only one of the two tissues (e.g. brain: cd36 (K06259), fabp4 (K08753), pltp (K08761); liver: fabp1 (K08750), aqp7 (K08771)) (Supplementary Table 4.5; Supplementary Figures 4.8 and 4.9). In the sub-pathway of the Fatty acid elongation map we identified 10 (brain) and 9 (liver) of 13 genes (Supplementary Table 4.5; Supplementary Figures 4.10 and 4.11). With the exception of elovl7 (K10250), found only in the brain transcriptome, all the remaining genes are present in both transcriptomes (Supplementary Table 4.5). Moreover, eight of the target genes (lpl (K01059), fads2 (K10226), elovl5 (K10244), elovl2 (K10205), pparα (K07294), pparβ (K04504), pparγ (K08530), lxrα (K08536)) can also be found in these two KEGG pathways, in both transcriptomes. On the other hand, genes such as fas (K00665), srebp1 (K07197), actb (K05692), b2m (K08055) and tuba (K07374) were also found, but associated with other pathways (e.g. Insulin signalling pathway, AMPK signalling pathway, Fatty acid biosynthesis, Fatty acid metabolism, Rap1 signalling pathway, Gap junction, Antigen processing and presentation). All the transcript-KO mappings for the liver and brain transcriptomes can be found in the Figshare repository.
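The tissue comparison above amounts to set operations on KEGG Orthology (KO) identifiers. The sketch below uses only a handful of the KO identifiers quoted in the text, so the sets are deliberately incomplete and serve purely to show the shared/tissue-specific split; the full mappings are the ones deposited in Figshare.

```python
# Partial KO sets built from identifiers quoted in the text (not the full mapping).
brain_kos = {"K06259", "K08753", "K08761", "K10250", "K01059", "K10226"}
liver_kos = {"K08750", "K08771", "K01059", "K10226"}

shared = brain_kos & liver_kos        # found in both transcriptomes
brain_only = brain_kos - liver_kos    # e.g. cd36, fabp4, pltp, elovl7
liver_only = liver_kos - brain_kos    # e.g. fabp1, aqp7

print(sorted(shared))
print(sorted(brain_only))
print(sorted(liver_only))
```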

Cross-species comparisons

In the cross-species comparisons, we analyzed the protein coding transcriptomes of brain and liver of C. macropomum, as well as the proteomes of two taxonomically related species, P. nattereri and A. mexicanus. The protein datasets derived from the transcriptomic approach contained more than double the number of proteins of the genome-based protein datasets (C. macropomum - brain transcriptome (95,239), C. macropomum - liver transcriptome (92,497), P. nattereri - proteome (44,364), A. mexicanus - proteome (42,649)). Despite the initial differences in the number of sequences analyzed per dataset, 184,673 genes (67.2% of the total) were assigned to 36,439 orthogroups (11,593 are shared across all datasets and 1822 are 1:1 single-copy orthogroups). Of these genes, 53,256 and 51,046 are from the C. macropomum brain and liver transcriptomes, respectively, 41,299 are from the P. nattereri proteome and 39,072 belong to the A. mexicanus proteome. In contrast, species-specific orthogroups and their genes represent a small part of these values. While the C. macropomum liver and brain datasets showed 31 (109 genes) and 60 (248 genes) species-specific orthogroups, respectively, the P. nattereri and A. mexicanus proteomes revealed 8 (45 genes) and 34 (270 genes). Importantly, and as expected, only small fractions of the reference genes (P. nattereri, A. mexicanus) were left unassigned to orthogroups, 3065 (6.9%) and 3577 (8.4%), respectively. These values serve as a proof of concept of the high gene content completeness of both transcriptomes analyzed here. From a general point of view, we also observed that fifty percent of all assigned genes are in orthogroups with seven or more genes (G50) and are contained in the largest 7581 orthogroups (O50). Regarding the target genes, we were able to identify most of the orthogroups of interest. More challenging was discriminating the orthogroups of the two pairs of gene paralogs, Pparαa and Pparαb, and Pparβa and Pparβb, which were automatically grouped together.
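The assignment figures in this paragraph follow from simple arithmetic on the dataset sizes and the numbers of genes placed in orthogroups. The sketch below recomputes them from the counts quoted in the text; the dictionary layout and variable names are illustrative and are not part of the orthogroup inference itself.

```python
# (total proteins, proteins assigned to orthogroups), as quoted in the text.
datasets = {
    "C. macropomum brain": (95239, 53256),
    "C. macropomum liver": (92497, 51046),
    "P. nattereri": (44364, 41299),
    "A. mexicanus": (42649, 39072),
}

total_genes = sum(total for total, _ in datasets.values())
assigned_genes = sum(assigned for _, assigned in datasets.values())
assigned_percent = 100 * assigned_genes / total_genes

# Unassigned genes per dataset, as a count and as a percentage of that dataset.
unassigned = {
    name: (total - assigned, 100 * (total - assigned) / total)
    for name, (total, assigned) in datasets.items()
}

print(assigned_genes, round(assigned_percent, 1))
for name, (count, percent) in unassigned.items():
    print(name, count, round(percent, 1))
```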


Phylogenetic analysis of the tambaqui sequences

The orthology of the tambaqui sequences isolated for the transcription factors (TF) that regulate fatty acid synthesis (Lxrα and Srebp1) was confirmed against those of other teleost species, including P. nattereri, in phylogenetic trees: Lxrα (Supplementary Figure 4.1) and Srebp1 (Supplementary Figure 4.2). The same was true for the lipid metabolism enzymes Lpl and Fas (Supplementary Figures 4.3 and 4.4) and the housekeeping genes Actb (Supplementary Figure 4.5), B2m (Supplementary Figure 4.6) and Tuba (Supplementary Figure 4.7). The other sequences used, Fads2, Elovl5 and Elovl2, are the same genes previously isolated (Ferraz et al. 2019), and the Ppar gene orthologs were identified in Chapter 3.

Growth performance

The effects of the different dietary oil sources and lipid contents on growth performance and feed utilization are presented in Table 4.5. Initial weight averaged 8.6 g and did not differ among treatments. All diets were readily accepted by the fish throughout the 9-week feeding trial, and no significant differences in survival, body weight gain (BWG) or feed conversion ratio (FCR) were detected (Table 4.5). Fish body weight gain averaged from 610 to 767% and FCR ranged from 1.61 to 1.79.

Table 4.5. Growth performances of Colossoma macropomum juvenile fed diets containing different levels of oil (FO and VO) for 9 weeks.

                           Experimental diets                                                       P value
                           Fish oil                           Vegetable oil                         Factor              Interaction
Variable                   5%                10%              5%                10%                 Source    Level     S x L
Initial body weight (g)    8.55 ± 1.45       8.62 ± 1.60      8.62 ± 1.42       8.55 ± 1.44         0.942     0.942     0.721
Final body weight (g)      65.80 ± 13.99     55.91 ± 32.33    73.61 ± 13.79     62.63 ± 25.10       0.442     0.274     0.954
BWG (%)                    667.94 ± 106.93   554.03 ± 269.80  759.54 ± 111.24   607.31 ± 186.67     0.365     0.104     0.809
FCR                        1.62 ± 0.13       1.65 ± 0.38      1.61 ± 0.15       1.79 ± 0.41         0.624     0.421     0.531
Survival (%)               100               100              100               100

Values are means ± SE of five replicates (8 fish/tank). Means in the same row with different superscripts are significantly different (P < 0.05).
1 BWG, body weight gain = (final body weight − initial body weight) × 100/initial body weight.
2 FCR, feed conversion ratio = total diet fed (g)/total wet weight gain (g).
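The two footnote formulas of Table 4.5 translate directly into code. The sketch below is a minimal illustration with made-up input values; the function names are hypothetical, and the numbers are not data from the feeding trial.

```python
def body_weight_gain(initial_weight_g, final_weight_g):
    """BWG (%) = (final body weight - initial body weight) x 100 / initial body weight."""
    return (final_weight_g - initial_weight_g) * 100 / initial_weight_g


def feed_conversion_ratio(total_diet_fed_g, total_wet_weight_gain_g):
    """FCR = total diet fed (g) / total wet weight gain (g)."""
    return total_diet_fed_g / total_wet_weight_gain_g


# Illustrative values only.
print(body_weight_gain(10.0, 60.0))
print(feed_conversion_ratio(100.0, 50.0))
```

Because the table reports means of tank-level values, applying the BWG formula to the mean initial and final weights will not necessarily reproduce the tabulated BWG exactly.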


Liver gene expression

The oil source significantly influenced the expression of some of the genes analyzed in liver. Both fads2 and elovl5 were up-regulated in the liver when tambaqui were fed the VO diets as compared to the FO diets (Figure 4.3, A and E). With regard to ppar expression in liver (Figure 4.4), only pparβa showed differences in expression among dietary treatments. More specifically, expression was higher with 10% than with 5% oil supplementation in both FO and VO diets, and the VO diets also resulted in up-regulation as compared to the FO diets (Figure 4.4, E). The other ppar genes did not differ significantly between treatments in liver. For the regulators of FA synthesis (lxrα and srebp1) (Figure 4.5), no significant differences in expression were observed in liver (Figure 4.5, A and C). However, lpl was up-regulated in the liver when tambaqui were fed the VO diets as compared to the FO diets (Figure 4.5, E), whereas no significant differences in expression were observed for fas (Figure 4.5, G).

Brain gene expression

The expression of the LC-PUFA biosynthetic pathway genes fads2 and elovl2 in brain was significantly up-regulated by VO as compared to FO, while lipid level (%) had no significant effect on the expression of any gene (Figure 4.3). There was an interaction between oil source and lipid level for fads2 and elovl2 expression (Figure 4.3, B and D). Expression of pparβb and pparγ in the brain (Figure 4.4) was up-regulated by dietary VO (VO5%, VO10%) in comparison to FO (FO5%, FO10%) (Figure 4.4, H and J). Expression of pparβb was significantly affected by the interaction of oil source and oil level. For the regulators of FA synthesis (lxrα and srebp1), lipid catabolism (lpl) and fatty acid biosynthesis (fas) (Figure 4.5, B, D, F and H), no significant differences in expression were observed.


Figure 4.3. Expression of genes involved in long-chain polyunsaturated fatty acid (LC-PUFA) biosynthesis. Expression of fads2 in liver (A) and brain (B), elovl2 in liver (C) and brain (D), and elovl5 in liver (E) and brain (F). Graphical representation of mean relative gene expression, calculated as ΔΔCt values normalized to the FO5% group and determined by real-time quantitative PCR. Different letters indicate significant differences between diets (ANOVA, P<0.05), in the following order: oil source (FO or VO) and level (5% or 10%). An interaction between the oil source and lipid percentage factors is represented by "*".
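The relative expression values plotted in Figures 4.3 to 4.5 are expressed relative to the FO5% group. The sketch below shows the basic ΔΔCt calculation with several reference genes, under the simplifying assumption of 100% amplification efficiency for every assay; the Ct values and names are hypothetical, and the geNorm-based normalization used in the study is more elaborate than this.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Ct of the target gene minus the mean Ct of the reference genes."""
    return target_ct - mean(reference_cts)


def relative_expression(sample, calibrator):
    """Fold change of sample versus calibrator, 2^(-ddCt), assuming 100% efficiency."""
    ddct = (delta_ct(sample["target"], sample["references"])
            - delta_ct(calibrator["target"], calibrator["references"]))
    return 2.0 ** (-ddct)


# Hypothetical Ct values; references stand for actb, tuba and b2m.
fo5 = {"target": 25.0, "references": [20.0, 21.0, 19.0]}   # calibrator group
vo5 = {"target": 24.0, "references": [20.0, 21.0, 19.0]}

fold_change = relative_expression(vo5, fo5)
print(fold_change)
```

Averaging the reference Ct values is equivalent to taking the geometric mean of the reference quantities when efficiency is 100%, which is why the calibrator group itself always yields a value of 1.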


Figure 4.4. Expression of genes involved in the regulation of fatty acid metabolism: peroxisome proliferator-activated receptors (PPAR). Expression of pparαa in liver (A) and brain (B), pparαb in liver (C) and brain (D), pparβa in liver (E) and brain (F), pparβb in liver (G) and brain (H), and pparγ in liver (I) and brain (J). Graphical representation of mean relative gene expression, calculated as ΔΔCt values normalized to the FO5% group and determined by real-time quantitative PCR. Different letters indicate significant differences between diets (ANOVA, P<0.05), in the following order: oil source (FO or VO) and level (5% or 10%). An interaction between the oil source and lipid percentage factors is represented by "*".


Figure 4.5. Expression of genes involved in the regulation of lipid biosynthesis and catabolism. Expression of liver X receptor (lxrα) in liver (A) and brain (B), sterol regulatory element binding protein 1 (srebp-1) in liver (C) and brain (D), lipoprotein lipase (lpl) in liver (E) and brain (F), and fatty acid synthase (fas) in liver (G) and brain (H). Graphical representation of mean relative gene expression, calculated as ΔΔCt values normalized to the FO5% group and determined by real-time quantitative PCR. Different letters indicate significant differences between diets (ANOVA, P<0.05), in the following order: oil source (FO or VO) and level (5% or 10%).


4.5. Discussion

In this study, we set out to investigate the molecular machinery that allows tambaqui to biosynthesize LC-PUFA from C18 precursors in vivo. Specifically, the expression of LC-PUFA metabolism genes and their regulators was assayed under different nutritional conditions. Juvenile tambaqui were fed for 9 weeks with four diets varying in oil source (fish oil, FO, or vegetable oil, VO) and level (5% or 10%): FO5, FO10, VO5 and VO10. In the process, we generated a brain-specific RNA-seq project for tambaqui, which was used to isolate genes of LC-PUFA biosynthesis (fads2, elovl5 and elovl2), transcription factors (TF; pparαa, pparαb, pparβa, pparβb, pparγ, lxrα and srebp1), and enzymes involved in fatty acid biosynthesis (fas) and lipid catabolism (lpl). Under the reported conditions, substituting VO for FO in the diet of tambaqui juveniles did not affect growth over the 9-week period. Our results are in line with previous studies in tambaqui, in which dietary lipid levels (5 to 20%) did not affect growth performance (De Almeida et al., 2006; De Almeida et al., 2011; Paulino et al., 2018; Pereira et al., 2017). Thus, the EFA requirements of tambaqui can be satisfied by C18 PUFA (Paulino et al., 2018). Moreover, the dietary LA/ALA ratio for tambaqui should be around 3.9 to 5.6 to provide the highest EPA, DHA and ARA content in the fillet (De Almeida et al. 2011; Paulino et al. 2018). These results suggest that tambaqui has some ability to compensate for the dietary deprivation of physiologically essential nutrients such as LC-PUFA. This agrees with our functional genomics results, since tambaqui has been shown to possess all the genes, and the associated enzymatic activities, required for bioconverting the C18 PUFA found in VO into LC-PUFA (Ferraz et al. 2019; Chapter 5).
The present study further showed that, as long as C18 PUFA precursors are present in the diet, tambaqui lipid metabolism genes were up-regulated, contributing to the biosynthesis of LC-PUFA, in particular DHA in the brain. This genetic regulation allows tambaqui to maintain their LC-PUFA levels, and thus to develop normally, even when fed a VO source. Similarly, other fish species in which dietary FO has been totally or partially replaced with VO were also able to metabolize and deposit LC-PUFA, such as Maccullochella peelii peelii, Perca fluviatilis, Megalobrama amblycephala, Micropterus salmoides, Dicentrarchus labrax, Oreochromis sp. and Solea senegalensis (Turchini, Francis, and De Silva 2006; Blanchard, Makombu, and Kestemont 2008; Li et al. 2016; Turchini et al. 2011; Chen et al. 2020; Castro et al. 2015; Teoh and Ng 2016; Conde-Sieira et al. 2018; Corrêa et al. 2018). With regard to the level of lipid supplementation, no differences in survival, BWG or FCR were detected in our feeding trial. Other studies suggested that high dietary lipid content might result in reduced digestion and absorption efficiency and eventually in reduced growth and survival of fish (e.g. Morais et al. 2007). However, the total lipid level in this experiment was not raised to a level that would cause such dysfunction; juvenile tambaqui have shown reduced growth when fed diets with lipid levels above 10% (De Almeida et al. 2011).

We further examined the mRNA expression of selected genes from key pathways, including LC-PUFA biosynthesis. Here, the expression of fads2, elovl2 and elovl5 was analyzed in liver and brain. The results indicated that VO supplementation at 5 and 10% up-regulated the mRNA levels of fads2 and elovl5 in liver, as compared to FO (5 and 10%). These results suggest that tambaqui is able to activate the LC-PUFA biosynthesis routes, probably to compensate for the low supply of these compounds in VO-based diets. Such a compensatory response in liver to diets low in LC-PUFA (i.e. rich in VO) has been previously reported in a plethora of studies (Vagner and Santigosa 2011; Silva‐Brito et al. 2016; Xue et al. 2014). Despite the more conservative lipid composition of the brain, nutritional regulation of both fads2 and elovl2, but not elovl5, was observed in the present study, with increased expression in the brain of fish fed the VO diets. These results indicate that tambaqui is to some extent able to meet the high DHA demand of the brain when fed diets containing suboptimal levels of this essential nutrient (Tocher et al. 2019). Consistently, our functional characterization of LC-PUFA biosynthesizing enzymes from tambaqui demonstrated that both Elovl2 and Fads2 may play key roles in the biosynthesis of DHA (Ferraz et al. 2019).
First, tambaqui Elovl2 was able to elongate 22:5n-3 to 24:5n-3; subsequently, 24:5n-3 is desaturated to 24:6n-3 by the tambaqui Fads2, thus completing two rate-limiting reactions required for the Sprecher pathway (Sprecher 2000). Such capacity for DHA biosynthesis is present in multiple teleost species that, like tambaqui, have been able to adapt and thrive in freshwater habitats with naturally low DHA occurrence (Ishikawa et al. 2019).

Three ppar nuclear receptor paralogs have been recognized in vertebrate genomes: pparα, pparβ and pparγ. In teleosts, specific gene duplicates, a and b, have been found for pparα and pparβ (pparαa and pparαb, and pparβa and pparβb) (Madureira et al. 2017; Urbatzka et al. 2013; Leaver et al. 2005; Robinson-Rechavi et al. 2001). In this work, we were able to deduce the complete ORF of five ppar gene sequences (pparαa, pparαb, pparβa, pparβb and pparγ) through the investigation of the brain and liver transcriptomes (Machado et al. 2019; this work). Importantly, in teleosts some Ppar have been shown to regulate the endogenous production of LC-PUFA and have been associated with the modulation of fads expression (Li, Gao, and Huang 2015; Morais et al. 2012; Vagner et al. 2009; You et al. 2017; Li et al. 2019). Studies with VO in freshwater species indicate that an optimum level of C18 LA increased the mRNA expression of pparα and pparβ, while deficient or excess intake may suppress pparα expression in fish (Li et al. 2016; Li, Gao, and Huang 2015; You et al. 2017). However, the correlation between Ppars and LC-PUFA biosynthesis in teleosts has not been sufficiently elucidated. In the present study, pparβ and pparγ expression can be understood as a potential regulatory mechanism for endogenous LC-PUFA synthesis in tambaqui. Here, the highest expression of pparβa was found in the liver of fish fed 10% lipids and in response to dietary VO. This agrees with the higher hepatic pparβ expression observed in fish fed diets with vegetable oils than in those fed fish oil. Increased hepatic expression in fish fed VO has been reported for pparα, pparβ and pparγ in blunt snout bream Megalobrama amblycephala (Li et al. 2015), pparα2 in Japanese seabass (Dong et al. 2015), and pparα, pparβ and pparγ in rabbitfish Siganus canaliculatus (You et al. 2017). This may indicate a stimulation of lipid utilization in the liver, since Pparβ activates lipid utilization by regulating the expression of target genes encoding enzymes involved in β-oxidation and energy uncoupling in various tissues (Dressel et al. 2003). The up-regulation of pparβa was accompanied by up-regulation of lpl under the same conditions. Lpl is a key enzyme that regulates the delivery of lipid fuels in the body (Fielding and Frayn 1998); fatty acids delivered to the cells are subsequently taken up for oxidation and/or storage (Richard et al. 2006), and some studies suggest that the lpl mRNA level of fish can be regulated by feeding conditions and dietary lipid levels (Lu et al. 2013). Although the level of VO inclusion affected pparβa expression, it did not affect lpl, probably because all diets have the same amount of total lipid.
PPARγ controls lipid storage and adipogenesis in adipose tissue (Hummasti and Tontonoz 2006; Kersten 2014). The regulation of LC-PUFA in the tambaqui brain follows the same lines as in the liver, with pparβb and pparγ up-regulated by VO. Previous results indicate that pparγ is involved in the depressed expression of fads2 in the liver of rabbitfish (Li et al. 2019). In the tambaqui brain we did not observe the same trend: the increased expression of pparγ was accompanied by increased expression of fads2 and elovl2. As previously discussed, these genes are important in the brain for maintaining the levels of LC-PUFA, mainly DHA. These results suggest that Pparγ may have different regulatory mechanisms in different species. Interestingly, in the brain of tambaqui there was no feedback regulation of the expression of the TF genes srebp-1 and lxr, although Ppars might regulate Fads and Elovl expression either indirectly, by stimulating srebp-1 and lxr, or directly, by modulating transcription (You et al. 2017; Li et al. 2019; Carmona-Antoñanzas et al. 2014). Another difference is pparα, whose expression was not significantly affected by the VO diets in either the liver or the brain. We suggest that the diversification of the ppar genes upon duplication may have modified gene function. It is plausible that differences in the amino acid sequence and structure of PPARs between fish and mammals affect their activity (Leaver et al. 2005; Zheng et al. 2015). Alternatively, each paralogous gene copy may have evolved different functions, as in Lateolabrax japonicus (Dong et al. 2015). In this species, the paralogues pparαa and pparαb show functional divergence in response to dietary fatty acids; the molecular mechanisms involved in their activation can be modified by a new structure or by new ligand-independent or ligand-dependent transactivation activity (Kondo et al. 2007; Zheng et al. 2015). However, more investigations are needed to test this hypothesis and to better understand the activity of the pparα genes under different diets.

In conclusion, the present study demonstrated that feeding juvenile tambaqui with VO diets for 9 weeks did not have a significant effect on growth performance, since tambaqui biosynthesize LC-PUFA when fed the precursors LA and ALA. Overall, the expression of genes relevant to lipid metabolism was significantly regulated by VO. Tambaqui fed VO diets displayed higher mRNA levels of hepatic pparβa and of brain pparβb and pparγ, which coincided with the expression of fads2, elovl2 and elovl5. Therefore, PPARs might play a crucial role in regulating LC-PUFA synthesis in the liver and brain of tambaqui. However, future studies are needed to investigate the effects of dietary VO levels on the hepatic and brain regulatory mechanisms of the TF ppars, lxrα and srebp-1, to better understand LC-PUFA regulation in tambaqui.

4.6. References

Almeida, Neiva Maria, Jesuí Vergílio Visentainer, and Maria Regina Bueno Franco. 2008. 'Composition of total, neutral and phospholipids in wild and farmed tambaqui (Colossoma macropomum) in the Brazilian Amazon area', Journal of the Science of Food and Agriculture, 88: 1739-47.

Altschul, Stephen F, Warren Gish, Webb Miller, Eugene W Myers, and David J Lipman. 1990. 'Basic local alignment search tool', Journal of molecular biology, 215: 403-10.

Ashburner, Michael, Catherine A Ball, Judith A Blake, David Botstein, Heather Butler, J Michael Cherry, Allan P Davis, Kara Dolinski, Selina S Dwight, and Janan T Eppig. 2000. 'Gene ontology: tool for the unification of biology', Nature genetics, 25: 25.

Bateman, Alex, Lachlan Coin, Richard Durbin, Robert D Finn, Volker Hollich, Sam Griffiths‐Jones, Ajay Khanna, Mhairi Marshall, Simon Moxon, and Erik LL Sonnhammer. 2004. 'The Pfam protein families database', Nucleic acids research, 32: D138-D41.


Betancor, M. B., E. McStay, M. Minghetti, H. Migaud, D. R. Tocher, and A. Davie. 2014. 'Daily rhythms in expression of genes of hepatic lipid metabolism in Atlantic salmon (Salmo salar L.)', PloS one, 9: e106739.

Blanchard, Gersande, Judith G Makombu, and Patrick Kestemont. 2008. 'Influence of different dietary 18:3n-3/18:2n-6 ratio on growth performance, fatty acid composition and hepatic ultrastructure in Eurasian perch, Perca fluviatilis', Aquaculture, 284: 144-50.

Bolger, Anthony M, Marc Lohse, and Bjoern Usadel. 2014. 'Trimmomatic: a flexible trimmer for Illumina sequence data', Bioinformatics, 30: 2114-20.

Bryant, Donald M, Kimberly Johnson, Tia DiTommaso, Timothy Tickle, Matthew Brian Couger, Duygu Payzin-Dogru, Tae J Lee, Nicholas D Leigh, Tzu-Hsing Kuo, and Francis G Davis. 2017. 'A tissue-mapped axolotl de novo transcriptome enables identification of limb regeneration factors', Cell reports, 18: 762-76.

Buchfink, Benjamin, Chao Xie, and Daniel H Huson. 2015. 'Fast and sensitive protein alignment using DIAMOND', Nature methods, 12: 59.

Carmona-Antoñanzas, Greta, Douglas R Tocher, Laura Martinez-Rubio, and Michael J Leaver. 2014. 'Conservation of lipid metabolic gene transcriptional regulatory networks in fish and mammals', Gene, 534: 1-9.

Castro, Carolina, Geneviève Corraze, Stephane Panserat, and A Oliva‐Teles. 2015. 'Effects of fish oil replacement by a vegetable oil blend on digestibility, postprandial serum metabolite profile, lipid and glucose metabolism of European sea bass (Dicentrarchus labrax) juveniles', Aquaculture nutrition, 21: 592-603.

Castro, L. F., D. R. Tocher, and O. Monroig. 2016. 'Long-chain polyunsaturated fatty acid biosynthesis in chordates: Insights into the evolution of Fads and Elovl gene repertoire', Prog Lipid Res, 62: 25-40.

Chaves, William, Érica C Almeida, Cristiana Carneiro, Larisa Magnone, Nilton Martins, Martin Bessonart, Jener Zuanon, and Ana Salaro. 2019. 'Growth performance of Astyanax altiparanae fed with plant and/or animal lipid sources', Revista De Ciencias Agrícolas, 36.

Chen, Yifang, Zhenzhu Sun, Zuman Liang, Yongdong Xie, Jiliang Su, Qiulan Luo, Junyan Zhu, Qingying Liu, Tao Han, and Anli Wang. 2020. 'Effects of dietary fish oil replacement by soybean oil and l-carnitine supplementation on growth performance, fatty acid composition, lipid metabolism and liver health of juvenile largemouth bass, Micropterus salmoides', Aquaculture, 516: 734596.

Conde-Sieira, Marta, Manuel Gesto, Sónia Batista, Fátima Linares, José LR Villanueva, Jesús M Míguez, José L Soengas, and Luisa MP Valente. 2018. 'Influence of vegetable diets on physiological and immune responses to thermal stress in Senegalese sole (Solea senegalensis)', PloS one, 13.

Consortium, UniProt. 2019. 'UniProt: a worldwide hub of protein knowledge', Nucleic acids research, 47: D506-D15.

Cook, Harold W. 1996. 'Fatty acid desaturation and chain elongation in eukaryotes.' in, Biochemistry of Lipids, Lipoproteins and Membranes (Elsevier Science).

Corrêa, Camila Fernandes, Renata Oselame Nobrega, Jane Mara Block, and Débora Machado Fracalossi. 2018. 'Mixes of plant oils as fish oil substitutes for Nile tilapia at optimal and cold suboptimal temperature', Aquaculture, 497: 82-90.

Daemen, Sabine, Martina Kutmon, and Chris T Evelo. 2013. 'A pathway approach to investigate the function and regulation of SREBPs', Genes & nutrition, 8: 289.

De Almeida, L. C., I. M. Avilez, C. A. Honorato, T. S. F. Hori, and G. Moraes. 2011. 'Growth and metabolic responses of tambaqui (Colossoma macropomum) fed different levels of protein and lipid', Aquaculture nutrition, 17: e253-e62.


De Almeida, LC, LM Lundstedt, and G Moraes. 2006. 'Digestive enzyme responses of tambaqui (Colossoma macropomum) fed on different levels of protein and lipid', Aquaculture nutrition, 12: 443-50. DILS, R. 1974. "Lipid Analysis: Isolation, Separation, Identification and Structural Analysis of Lipids." In.: Portland Press Limited. Dong, Xiaojing, Houguo Xu, Kangsen Mai, Wei Xu, Yanjiao Zhang, and Qinghui Ai. 2015. 'Cloning and characterization of SREBP-1 and PPAR-α in Japanese seabass Lateolabrax japonicus, and their gene expressions in response to different dietary fatty acid profiles', Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology, 180: 48-56. Dressel, Uwe, Tamara L Allen, Jyotsna B Pippal, Paul R Rohde, Patrick Lau, and George EO Muscat. 2003. 'The peroxisome proliferator-activated receptor β/δ agonist, GW501516, regulates the expression of genes involved in lipid catabolism and energy uncoupling in skeletal muscle cells', Molecular endocrinology, 17: 2477-93. Du, Jianlong, Hanlin Xu, Songlin Li, Zuonan Cai, Kangsen Mai, and Qinghui Ai. 2017. 'Effects of dietary chenodeoxycholic acid on growth performance, body composition and related gene expression in large yellow croaker (Larimichthys crocea) fed diets with high replacement of fish oil with soybean oil', Aquaculture, 479: 584-90. Emms, David M, and Steven Kelly. 2015. 'OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy', Genome biology, 16: 157. Ferraz, Renato B, Naoki Kabeya, Mónica Lopes-Marques, André M Machado, Ricardo A Ribeiro, Ana L Salaro, Rodrigo Ozório, L Filipe C Castro, and Óscar Monroig. 2019. 'A complete enzymatic capacity for long-chain polyunsaturated fatty acid biosynthesis is present in the Amazonian teleost tambaqui, Colossoma macropomum', Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology, 227: 90-97. Fielding, Barbara A, and Keith N Frayn. 1998. 
'Lipoprotein lipase and the disposition of dietary fatty acids', British journal of nutrition, 80: 495-502. Fievet, C, and B Staels. 2009. 'Liver X receptor modulators: effects on lipid metabolism and potential use in the treatment of atherosclerosis', Biochemical pharmacology, 77: 1316-27. Finn, Robert D, Jody Clements, and Sean R Eddy. 2011. 'HMMER web server: interactive sequence similarity searching', Nucleic acids research, 39: W29-W37. Folch, Jordi, M_ Lees, and GH Sloane Stanley. 1957. 'A simple method for the isolation and purification of total lipids from animal tissues', J biol Chem, 226: 497-509. Fountoulaki, E, A Vasilaki, R Hurtado, K Grigorakis, I Karacostas, I Nengas, G Rigos, Y Kotzamanis, B Venou, and MN Alexis. 2009. 'Fish oil substitution by vegetable oils in commercial diets for gilthead sea bream (Sparus aurata L.); effects on growth performance, flesh quality and fillet fatty acid profile: recovery of fatty acid profiles by a fish oil finishing diet under fluctuating water temperatures', Aquaculture, 289: 317-26. Garrido, Diego, Naoki Kabeya, Mónica B Betancor, José A Pérez, N Guadalupe Acosta, Douglas R Tocher, Covadonga Rodríguez, and Óscar Monroig. 2019. 'Functional diversification of teleost Fads2 fatty acyl desaturases occurs independently of the trophic level', Scientific reports, 9: 11199. Grabherr, Manfred G, Brian J Haas, Moran Yassour, Joshua Z Levin, Dawn A Thompson, Ido Amit, Xian Adiconis, Lin Fan, Raktima Raychowdhury, and Qiandong Zeng. 2011. 'Full-length transcriptome assembly from RNA-Seq data without a reference genome', Nature biotechnology, 29: 644.


Guindon, S., J. F. Dufayard, V. Lefort, M. Anisimova, W. Hordijk, and O. Gascuel. 2010. 'New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0', Syst Biol, 59: 307-21. Haas, Brian J, Alexie Papanicolaou, Moran Yassour, Manfred Grabherr, Philip D Blood, Joshua Bowden, Matthew Brian Couger, David Eccles, Bo Li, and Matthias Lieber. 2013. 'De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis', Nature protocols, 8: 1494. Houston, Sam JS, Vasileios Karalazos, John Tinsley, Mónica B Betancor, Samuel AM Martin, Douglas R Tocher, and Oscar Monroig. 2017. 'The compositional and metabolic responses of gilthead seabream (Sparus aurata) to a gradient of dietary fish oil and associated n-3 long-chain PUFA content', British journal of nutrition, 118: 1010-22. Hummasti, S., and P. Tontonoz. 2006. 'The peroxisome proliferator-activated receptor N- terminal domain controls isotype-selective gene expression and adipogenesis', Mol Endocrinol, 20: 1261-75. Ishikawa, Asano, Naoki Kabeya, Koki Ikeya, Ryo Kakioka, Jennifer N Cech, Naoki Osada, Miguel C Leal, Jun Inoue, Manabu Kume, and Atsushi Toyoda. 2019. 'A key metabolic gene for recurrent freshwater colonization and radiation in fishes', Science, 364: 886-89. Kanehisa, Minoru, Susumu Goto, Yoko Sato, Miho Furumichi, and Mao Tanabe. 2011. 'KEGG for integration and interpretation of large-scale molecular data sets', Nucleic acids research, 40: D109-D14. Katoh, K., and H. Toh. 2008. 'Recent developments in the MAFFT multiple sequence alignment program', Brief Bioinform, 9: 286-98. Kersten, Sander. 2014. 'Integrated physiology and systems biology of PPARα', Molecular metabolism, 3: 354-71. Kondo, Hidehiro, Ryohei Misaki, Laurent Gelman, and Shugo Watabe. 2007. 
'Ligand- dependent transcriptional activities of four torafugu pufferfish Takifugu rubripes peroxisome proliferator-activated receptors', General and comparative endocrinology, 154: 120-27. Langmead, Ben, and Steven L Salzberg. 2012. 'Fast gapped-read alignment with Bowtie 2', Nature methods, 9: 357. Leaver, M. J., E. Boukouvala, E. Antonopoulou, A. Diez, L. Favre-Krey, M. T. Ezaz, J. M. Bautista, D. R. Tocher, and G. Krey. 2005. 'Three peroxisome proliferator-activated receptor isotypes from each of two species of marine fish', Endocrinology, 146: 3150-62. Li, Weizhong, and Adam Godzik. 2006. 'Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences', Bioinformatics, 22: 1658-59. Li, Yang, Jian Gao, and Songqian Huang. 2015. 'Effects of different dietary phospholipid levels on growth performance, fatty acid composition, PPAR gene expressions and antioxidant responses of blunt snout bream Megalobrama amblycephala fingerlings', Fish physiology and biochemistry, 41: 423-36. Li, Yang, Xiao Liang, Yin Zhang, and Jian Gao. 2016. 'Effects of different dietary soybean oil levels on growth, lipid deposition, tissues fatty acid composition and hepatic lipid metabolism related gene expressions in blunt snout bream (Megalobrama amblycephala) juvenile', Aquaculture, 451: 16-23. Li, Yang, Yuntong Zhao, Yongkang Zhang, Xiao Liang, Yin Zhang, and Jian Gao. 2015. 'Growth performance, fatty acid composition, peroxisome proliferator‐activated receptors gene expressions, and antioxidant abilities of blunt snout bream, Megalobrama amblycephala, fingerlings fed different dietary oil sources', Journal of the World Aquaculture Society, 46: 395-408. Li, Yuanyou, Ziyan Yin, Yewei Dong, Shuqi Wang, Óscar Monroig, Douglas R Tocher, and Cuihong You. 2019. 'Pparγ Is Involved in the Transcriptional Regulation of Liver LC-


PUFA Biosynthesis by Targeting the Δ6Δ5 Fatty Acyl Desaturase Gene in the Marine Teleost Siganus canaliculatus', Marine Biotechnology, 21: 19-29. Lu, Kang-Le, Wei-Na Xu, Xiang-Fei Li, Wen-Bin Liu, Li-Na Wang, and Chun-Nuan Zhang. 2013. 'Hepatic triacylglycerol secretion, lipid transport and tissue lipid uptake in blunt snout bream (Megalobrama amblycephala) fed high-fat diet', Aquaculture, 408: 160- 68. Machado, André M, Mónica Felício, Elza Fonseca, Rute R da Fonseca, and L Filipe C Castro. 2018. 'A resource for sustainable management: De novo assembly and annotation of the liver transcriptome of the Atlantic chub , Scomber colias', Data in brief, 18: 276-84. Machado, André M, Renato Ferraz, Ricardo do Amaral Ribeiro, Rodrigo Ozório, and L Filipe C Castro. 2019. 'From the Amazon: A comprehensive liver transcriptome dataset of the teleost fish tambaqui, Colossoma macropomum', Data in Brief, 23: 103751. Madureira, T. V., I. Pinheiro, R. de Paula Freire, E. Rocha, L. F. Castro, and R. Urbatzka. 2017. 'Genome specific PPARalphaB duplicates in salmonids and insights into estrogenic regulation in brown trout', Comp Biochem Physiol B Biochem Mol Biol, 208-209: 94-101. Monroig, Oscar, Douglas R Tocher, and Luís Filipe C Castro. 2018. 'Polyunsaturated fatty acid biosynthesis and metabolism in fish.' in, Polyunsaturated Fatty Acid Metabolism (Elsevier). Morais, S, LEC Conceição, I Rønnestad, W Koven, C Cahu, JL Zambonino Infante, and MT Dinis. 2007. 'Dietary neutral lipid level and source in marine fish larvae: effects on digestive physiology and food intake', Aquaculture, 268: 106-22. Morais, Sofia, Tomé Silva, Odete Cordeiro, Pedro Rodrigues, Derrick R Guy, James E Bron, John B Taggart, J Gordon Bell, and Douglas R Tocher. 2012. 'Effects of genotype and dietary fish oil replacement with vegetable oil on the intestinal transcriptome and proteome of Atlantic salmon (Salmo salar)', BMC genomics, 13: 448. 
Moriya, Yuki, Masumi Itoh, Shujiro Okuda, Akiyasu C Yoshizawa, and Minoru Kanehisa. 2007. 'KAAS: an automatic genome annotation and pathway reconstruction server', Nucleic acids research, 35: W182-W85. Nasopoulou, C., and I. Zabetakis. 2012. 'Benefits of fish oil replacement by plant originated oils in compounded fish feeds. A review', LWT - Food Science and Technology, 47: 217-24. Paulino, Renan Rosa, Raquel Tatiane Pereira, Táfanie Valácio Fontes, Aires Oliva-Teles, Helena Peres, Dalton José Carneiro, and Priscila Vieira Rosa. 2018. 'Optimal dietary linoleic acid to linolenic acid ratio improved fatty acid profile of the juvenile tambaqui ( Colossoma macropomum )', Aquaculture, 488: 9-16. Pereira, Raquel Tatiane, Renan Rosa Paulino, Charlle Anderson Lima de Almeida, Priscila Vieira Rosa, Tamira Maria Orlando, and Rodrigo Fortes-Silva. 2017. 'Oil sources administered to tambaqui (Colossoma macropomum): growth, body composition and effect of masking organoleptic properties and fasting on diet preference', Applied Animal Behaviour Science. Pfaffl, Michael W. 2001. 'A new mathematical model for relative quantification in real-time RT–PCR', Nucleic acids research, 29: e45-e45. Powell, Sean, Damian Szklarczyk, Kalliopi Trachana, Alexander Roth, Michael Kuhn, Jean Muller, Roland Arnold, Thomas Rattei, Ivica Letunic, and Tobias Doerks. 2011. 'eggNOG v3. 0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges', Nucleic acids research, 40: D284-D89. Punta, Marco, Penny C. Coggill, Ruth Y. Eberhardt, Jaina Mistry, John Tate, Chris Boursnell, Ningze Pang, Kristoffer Forslund, Goran Ceric, Jody Clements, Andreas Heger, Liisa Holm, Erik L. L. Sonnhammer, Sean R. Eddy, Alex Bateman, and Robert D. Finn. 2011. 'The Pfam protein families database', Nucleic acids research, 40: D290-D301.


Richard, Nadege, Sadasivam Kaushik, Laurence Larroquet, Stéphane Panserat, and Genevieve Corraze. 2006. 'Replacing dietary fish oil by vegetable oils has little effect on lipogenesis, lipid transport and tissue lipid uptake in rainbow trout (Oncorhynchus mykiss)', British journal of nutrition, 96: 299-309. Robinson-Rechavi, Marc, Oriane Marchand, Héctor Escriva, Pierre-Luc Bardet, Dominique Zelus, Sandrine Hughes, and Vincent Laudet. 2001. 'Euteleost fish genomes are characterized by expansion of gene families', Genome research, 11: 781-88. Sarameh, Sara Pourhosein, Amir Houshang Bahri, Alireza Salarzadeh, and Bahram Falahatkar. 2019. 'Effects of fish oil replacement with vegetable oil in diet of sterlet sturgeon (Acipenser ruthenus) broodstock on expression of lipid metabolism related genes in eggs', Aquaculture, 505: 441-49. Silva‐Brito, Francisca, Leonardo J Magnoni, Sthelio Braga Fonseca, Maria João Peixoto, L Filipe C Castro, Isabel Cunha, Rodrigo Otávio de Almeida Ozório, Fernando Antunes Magalhães, and José Fernando Magalhães Gonçalves. 2016. 'Dietary oil source and selenium supplementation modulate Fads2 and Elovl5 transcriptional levels in liver and brain of meagre (Argyrosomus regius)', Lipids, 51: 729-41. Silva, Clichenner Rodrigues, Levy Carvalho Gomes, and Franmir Rodrigues Brandão. 2007. 'Effect of feeding rate and frequency on tambaqui (Colossoma macropomum) growth, production and feeding costs during the first growth phase in cages', Aquaculture, 264: 135-39. Simão, Felipe A, Robert M Waterhouse, Panagiotis Ioannidis, Evgenia V Kriventseva, and Evgeny M Zdobnov. 2015. 'BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs', Bioinformatics, 31: 3210-12. Smith-Unna, Richard, Chris Boursnell, Rob Patro, Julian M Hibberd, and Steven Kelly. 2016. 'TransRate: reference-free quality assessment of de novo transcriptome assemblies', Genome research, 26: 1134-44. Tacon, Albert G. J., and Marc Metian. 2008. 
'Global overview on the use of fish meal and fish oil in industrially compounded aquafeeds: Trends and future prospects', Aquaculture, 285: 146-58. Teoh, Chaiw-Yee, and Wing-Keong Ng. 2016. 'The implications of substituting dietary fish oil with vegetable oils on the growth performance, fillet fatty acid profile and modulation of the fatty acid elongase, desaturase and oxidation activities of red hybrid tilapia, Oreochromis sp', Aquaculture, 465: 311-22. Torstensen, Bente E, Øyvind Lie, and Livar Frøyland. 2000. 'Lipid metabolism and tissue composition in Atlantic salmon (Salmo salar L.)—effects of capelin oil, palm oil, and oleic acid-enriched sunflower oil as dietary lipid sources', Lipids, 35: 653-64. Turchini, Giovanni M, David S Francis, and Sena S De Silva. 2006. 'Fatty acid metabolism in the freshwater fish Murray (Maccullochella peelii peelii) deduced by the whole-body fatty acid balance method', Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology, 144: 110-18. Turchini, Giovanni M, Bente E Torstensen, and Wing‐Keong Ng. 2009. 'Fish oil replacement in finfish nutrition', Reviews in Aquaculture, 1: 10-57. Turchini, GM, DS Francis, SPSD Senadheera, T Thanuthong, and SS De Silva. 2011. 'Fish oil replacement with different vegetable oils in Murray cod: evidence of an “omega- 3 sparing effect” by other dietary fatty acids', Aquaculture, 315: 250-59. Urbatzka, R, S Galante-Oliveira, E Rocha, LFC Castro, and I Cunha. 2013. 'Tissue expression of PPAR-alpha isoforms in Scophthalmus maximus and transcriptional response of target genes in the heart after exposure to WY-14643', Fish physiology and biochemistry, 39: 1043-55. Vagner, M., and E. Santigosa. 2011. 'Characterization and modulation of gene expression and enzymatic activity of delta-6 desaturase in teleosts: A review', Aquaculture, 315: 131-43.


Vagner, Marie, Jean H Robin, José L Zambonino-Infante, Douglas R Tocher, and Jeannine Person-Le Ruyet. 2009. 'Ontogenic effects of early feeding of sea bass (Dicentrarchus labrax) larvae with a range of dietary n-3 highly unsaturated fatty acid levels on the functioning of polyunsaturated fatty acid desaturation pathways', British journal of nutrition, 101: 1452-62. Vandesompele, Jo, Katleen De Preter, Filip Pattyn, Bruce Poppe, Nadine Van Roy, Anne De Paepe, and Frank Speleman. 2002. 'Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes', Genome biology, 3: research0034. 1. Xue, Xi, Charles Y Feng, Stefanie M Hixson, Kim Johnstone, Derek M Anderson, Christopher C Parrish, and Matthew L Rise. 2014. 'Characterization of the fatty acyl elongase (elovl) gene family, and hepatic elovl and delta-6 fatty acyl desaturase transcript expression and fatty acid responses to diets containing camelina oil in Atlantic cod (Gadus morhua)', Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology, 175: 9-22. You, C., D. Jiang, Q. Zhang, D. Xie, S. Wang, Y. Dong, and Y. Li. 2017. 'Cloning and expression characterization of peroxisome proliferator-activated receptors (PPARs) with their agonists, dietary lipids, and ambient salinity in rabbitfish Siganus canaliculatus', Comp Biochem Physiol B Biochem Mol Biol, 206: 54-64. Zheng, Jia-Lang, Mei-Qin Zhuo, Zhi Luo, Yu-Feng Song, Ya-Xiong Pan, Chao Huang, Wei Hu, and Qi-Liang Chen. 2015. 'Peroxisome proliferator-activated receptor alpha1 in yellow catfish Pelteobagrus fulvidraco: Molecular characterization, mRNA tissue expression and transcriptional regulation by insulin in vivo and in vitro', Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology, 183: 58- 66.


Chapter 5

The fatty acid elongation genes, elovl4a and elovl4b, are present and functional in the genome of tambaqui (Colossoma macropomum)


5.1. Abstract

Long-chain (C20-24) polyunsaturated fatty acids (LC-PUFA) are physiologically important nutrients for vertebrates, including fish. Previous studies have addressed the metabolism of LC-PUFA in the Amazonian teleost tambaqui (Colossoma macropomum), an emerging species in Brazilian aquaculture, showing that all the desaturase and elongase activities required to convert C18 polyunsaturated fatty acids (PUFA) into LC-PUFA are present in tambaqui. Yet, elongation of very long-chain fatty acid 4 (Elovl4) proteins, which participate in the biosynthesis of very long-chain (>C24) saturated fatty acids (VLC-SFA) and very long-chain polyunsaturated fatty acids (VLC-PUFA), had not been characterized in this species. Here, we investigate the repertoire and function of Elovl4 in tambaqui. Furthermore, we present the first draft genome assembly of tambaqui and demonstrate the usefulness of this resource in nutritional physiology studies by isolating one of the tambaqui elovl4 genes. Our results show that, as in other teleost species, two elovl4 paralogs, termed elovl4a and elovl4b, are present in tambaqui. Tambaqui elovl4a and elovl4b have open reading frames (ORF) of 948 and 912 base pairs, encoding putative proteins of 315 and 303 amino acids, respectively. Functional characterization in yeast showed that both Elovl4 enzymes have activity towards all the PUFA substrates assayed (18:3n-3, 18:2n-6, 18:4n-3, 18:3n-6, 20:5n-3, 20:4n-6, 22:5n-3, 22:4n-6 and 22:6n-3), producing elongated products of up to C36. Moreover, both Elovl4 were able to elongate 22:5n-3 to 24:5n-3, a key elongation step required for the synthesis of docosahexaenoic acid via the Sprecher pathway.

5.2. Introduction

Long-chain (C20-24) polyunsaturated fatty acids (LC-PUFA) are physiologically important nutrients for vertebrates, and the so-called “omega-3” (or n-3) LC-PUFA in particular play important roles in human health (Calder, 2014; Campoy et al., 2012; Delgado-Lista et al., 2012; Innis, 2007). Oily fish and seafood in general are primary sources of n-3 LC-PUFA in the human diet. LC-PUFA are commonly regarded as essential nutrients for many vertebrates since they cannot be biosynthesized at physiologically relevant rates and, consequently, must be supplied in the diet to prevent deficiency symptoms (Castro et al., 2016; Tocher, 2015). In some teleosts, particularly freshwater species (Ishikawa et al., 2019), dietary shorter-chain C18 polyunsaturated fatty acids (PUFA), such as linoleic (LA; 18:2n-6) and α-linolenic (ALA; 18:3n-3) acids, can satisfy the essential fatty acid requirements since these species can convert them into LC-PUFA (Castro et al., 2016; Monroig et al., 2018).

Conversions of C18 PUFA into C20-24 LC-PUFA are mediated by members of two families of enzymes named fatty acyl desaturases (Fads) and elongation of very long-chain fatty acid (Elovl) proteins (Castro et al., 2016; Monroig and Kabeya, 2018; Tocher, 2015). Elovl are rate-limiting enzymes that add two carbon atoms to pre-existing fatty acid (FA) substrates (Guillou et al., 2010; Jakobsson et al., 2006). While a recent study has reported putative roles of a novel Elovl, Elovl8, in PUFA elongation in teleosts (Li et al., 2020), robust evidence indicates that mainly three members of the Elovl family, namely Elovl2, Elovl4 and Elovl5, participate in the biosynthesis of LC-PUFA in vertebrates (Jakobsson et al., 2006; Guillou et al., 2010; Castro et al., 2016). More specifically, Elovl5 exhibits high activity toward C16–20 PUFA substrates, while Elovl2 shows high activity toward C20–22 PUFA (Jakobsson et al., 2006; Monroig et al., 2013). Elovl4 is involved in the biosynthesis of very long-chain (>C24) FA, including saturated (VLC-SFA) and polyunsaturated (VLC-PUFA) compounds (Deak et al., 2019). Teleost fish possess two elovl4 genes termed elovl4a and elovl4b (Monroig et al., 2010). Teleost elovl4 genes have so far been isolated and characterized in zebrafish (Danio rerio) (Monroig et al., 2010), Atlantic salmon (Salmo salar) (Carmona-Antoñanzas et al., 2011), cobia (Rachycentron canadum) (Monroig et al., 2011), rabbitfish (Siganus canaliculatus) (Monroig et al., 2012), Nibe croaker (Nibea mitsukurii) (Kabeya et al., 2015), orange-spotted grouper (Epinephelus coioides) (Li et al., 2017a), loach (Misgurnus anguillicaudatus) (Yan et al., 2017), African catfish (Clarias gariepinus) (Oboh et al., 2017a), large yellow croaker (Larimichthys crocea) (Li et al., 2017b), black seabream (Acanthopagrus schlegelii) (Jin et al., 2017), rainbow trout (Oncorhynchus mykiss) (Zhao et al., 2019), and Atlantic bluefin tuna (Thunnus thynnus) (Betancor et al., 2020).
Tissue distribution of elovl4 mRNA suggests that these genes have widespread expression patterns, with elovl4a mostly expressed in brain tissues (brain and pituitary) (Monroig et al., 2010; Oboh et al., 2017a), and elovl4b in retina and gonads (Monroig et al., 2010; Oboh et al., 2017a). Teleost Elovl4 functions were first characterized in D. rerio, confirming that Elovl4a had the ability to elongate saturated fatty acids to produce VLC-SFA, whereas only Elovl4b appeared to have a role in the biosynthesis of VLC-PUFA (Monroig et al., 2010). More recently, however, it has been shown that both Elovl4a and Elovl4b from Clarias gariepinus and Acanthopagrus schlegelii have the ability to biosynthesize VLC-PUFA (Jin et al., 2017; Oboh et al., 2017a). Along with their role in VLC-PUFA biosynthesis, certain teleost Elovl4 characterized to date have also shown the ability to elongate 22:5n-3 to 24:5n-3 for biosynthesis of DHA through the so-called “Sprecher pathway” (Sprecher, 2000), a metabolic route that operates in teleosts (Buzzi et al., 1996; Buzzi et al., 1997; Oboh et al., 2017b; Tocher et al., 2003).

Genome information is fundamental to decipher multiple aspects of a species' biology. The genomes of various teleost species have already been assembled and annotated, in many cases for evolutionary studies, mainly because a genome duplication early in teleost evolution (the teleost genome duplication, TGD) greatly increased the number of genes (Berthelot et al., 2014). Furthermore, aquaculture has also taken advantage of genomes available from commercially important species such as carp (Cyprinus carpio) (Xu et al., 2014), Atlantic salmon (S. salar) (Lien et al., 2016), rainbow trout (O. mykiss) (Palti et al., 2012), European sea bass (Dicentrarchus labrax) (Tine et al., 2014), European sardine (Sardina pilchardus) (Louro et al., 2019; Machado et al., 2018) and Nile tilapia (Oreochromis niloticus) (Conte et al., 2017). Similarly, in Brazil, sequencing projects of neotropical species are officially underway. Within the Animal Genomic Network, EMBRAPA is leading the complete sequencing of two species of importance for Brazilian aquaculture, tambaqui (Colossoma macropomum) and cachara (Pseudoplatystoma reticulatum), which will join neotropical species already sequenced, such as red-bellied piranha (Pygocentrus nattereri) (Schartl et al., 2019) and arapaima (Arapaima gigas) (Du et al., 2019).

Tambaqui is a tropical teleost fish native to the Amazon and Orinoco river basins, representing over a quarter of the total aquaculture production in Brazil (Ministério da Agricultura, 2016). Previous studies have characterized the LC-PUFA metabolic pathway in tambaqui, showing that the genes encoding all the desaturase and elongase activities required to convert C18 PUFA into LC-PUFA, namely fads2, elovl5 and elovl2, are present (Ferraz et al., 2019). However, the last step of VLC-PUFA elongation, mediated by Elovl4, has not yet been established. Here, we aim to characterize the repertoire of elovl4 genes in tambaqui and investigate the functions of the putative enzymes by heterologous expression in yeast. This study provides new insights into VLC-PUFA metabolism in teleosts. Although tambaqui is widely cultured in Brazil, its genomic resources are still limited. As part of this study, we also provide the first draft genome assembly of C. macropomum, which was an essential tool to isolate elovl4b and should be useful for further functional and physiological studies.


5.3. Materials and methods

Sample collection, DNA and RNA extraction

Juvenile tambaqui (∼65 g) were obtained from a commercial farm in Laranja da Terra, Espírito Santo, Brazil (19°55'45.9"S 40°57'42.1"W). Sections of liver and brain were collected from each fish, preserved in RNA stabilization buffer (3.6 M ammonium sulphate, 18 mM sodium citrate, 15 mM EDTA, pH 5.2), and stored at −80 °C prior to DNA and RNA extraction. Total genomic DNA was isolated from ∼25 mg of liver using the QIAamp tissue kit (QIAGEN, Netherlands) and resuspended in water. DNA quality was verified in a 2% agarose gel. Total DNA concentration was estimated to be 607 ng/µL using Quant-it™ RiboGreen® (Invitrogen, California, USA). A BioTek® microplate reader was used to determine the quality of the DNA. Total RNA was isolated from ∼25 mg of brain using the Illustra RNAspin Mini Kit (GE Healthcare, Chicago, USA) following the manufacturer’s instructions, including on-column DNase digestion during the extraction procedure. RNA quality was verified in 1% agarose gels. Total RNA concentration was estimated using the Quant-it™ RiboGreen® RNA Assay Kit (Invitrogen) and a BioTek® microplate reader to determine sample absorbance.

Draft genome assembly

The liver DNA sample was sequenced on the Illumina HiSeq X Ten platform (Macrogen, Seoul, Korea), yielding a total of 90 Gbp of raw data (2 × 150 bp paired-end reads). An initial data assessment was performed using FastQC (v.0.11.8) (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Then, Trimmomatic (v.0.38) (Bolger et al., 2014) was used to remove low-quality reads and adaptors (parameters: LEADING:5 TRAILING:5 SLIDINGWINDOW:4:20 MINLEN:50). Lighter (v.1.1.1) (Song et al., 2014) was then applied, with parameters (-k 17 1500000000 0.1), to correct sequencing errors. As a final validation of the raw dataset, GenomeScope (v.1.0.0) (Vurture et al., 2017) was run on the clean short reads (k-mer sizes 27, 29 and 31) to estimate the size, heterozygosity rate and repeat content of the C. macropomum genome, giving a first overview of the general properties of the species' genome.
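To make the Trimmomatic settings above concrete, the following is a minimal Python sketch of how the four steps (LEADING:5, TRAILING:5, SLIDINGWINDOW:4:20, MINLEN:50) act on the Phred quality scores of a single read. It is a simplified illustration rather than Trimmomatic's actual implementation: the function name `trim_read` and the example quality values are hypothetical, and adaptor clipping and paired-end handling are not modelled.

```python
def trim_read(quals, leading=5, trailing=5, window=4, window_q=20, minlen=50):
    """Return the (start, end) slice of a read to keep, or None if it is dropped.

    quals is a list of per-base Phred scores. This is a simplified model of the
    trimming steps named in the text, not Trimmomatic's own code.
    """
    start, end = 0, len(quals)
    # LEADING: remove 5' bases while their quality is below the threshold
    while start < end and quals[start] < leading:
        start += 1
    # TRAILING: remove 3' bases while their quality is below the threshold
    while end > start and quals[end - 1] < trailing:
        end -= 1
    # SLIDINGWINDOW: scan 5'->3' and cut at the first window whose mean
    # quality falls below window_q
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < window_q:
            end = i
            break
    # MINLEN: discard reads that end up shorter than the minimum length
    if end - start < minlen:
        return None
    return start, end


# Hypothetical read: 2 poor leading bases, 60 good bases, a low-quality tail
example = [2, 2] + [30] * 60 + [10] * 6 + [3]
kept = trim_read(example)
```

In this sketch the steps run in the order they are listed in the parameter string, so the sliding window only ever sees bases that survived the leading and trailing cuts.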


The de novo draft genome assembly was performed using the W2RAP pipeline (v.0.1) (Clavijo et al., 2017), under the default settings and following the authors’ protocol, available online (https://www.biorxiv.org/content/10.1101/110999v1). Briefly, KAT (v.2.4.1) (Mapleson et al., 2016) was used to evaluate the k-mer coverage distribution and determine the cut-off frequency suitable for the W2RAP pipeline. The W2RAP pipeline was then run with the specifications (-t 30 -k 200 -m 400 --min_freq 5 -d 32 --dump_all 1). At the end of the assembly process, the draft genome assembly was assessed using QUAST (v.5.0.2) (Gurevich et al., 2013) to provide the general metrics, and Benchmarking Universal Single-Copy Orthologs (BUSCO, v.3.0.2) (Simão et al., 2015), using several databases (Eukaryota, Metazoa, Vertebrata and Actinopterygii), to verify the gene content completeness.
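Among the general assembly metrics that QUAST reports is the N50, the contig or scaffold length at which half of the total assembly length is contained in sequences of that length or longer. The sketch below shows one way to compute it in Python; the function name `n50` and the example lengths are hypothetical and are not values from the tambaqui assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig/scaffold lengths (0 if empty)."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the assembly is covered
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


# Hypothetical scaffold lengths (bp), for illustration only
example_n50 = n50([100, 80, 60, 40, 20])
```

Because N50 depends only on the length distribution, it is usually read together with a completeness measure such as BUSCO, as done in the text.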

Identification of tambaqui elovl4a and elovl4b gene sequences

A combination of approaches was used to obtain the elovl4a and elovl4b gene sequences. Initially, the Pygocentrus nattereri elovl4a (XM_017710341.1) and elovl4b (XM_017722369.1) sequences were used as queries in BLAST searches of tambaqui liver and brain RNA-Seq datasets. For elovl4a, the full open reading frame (ORF) was deduced from the brain RNA-Seq dataset (Ferraz et al., 2020; unpublished). This BLAST search produced a single hit (TRINITY_DN10862), which was later used to design primers for PCR amplification of the full-length ORF. All primers were designed with Primer3 (v.0.4.0) (http://bioinfo.ut.ee/primer3-0.4.0/) and manufactured by STABVIDA (Portugal). For PCR, C. macropomum brain cDNA was synthesized from total RNA (∼30 μg) using the NZY First-Strand cDNA Synthesis Kit (NzyTech, Portugal) following the manufacturer’s protocol, and Flash High-Fidelity PCR Master Mix (Thermo Fisher Scientific, Waltham, MA, USA) was subsequently used to amplify the cDNA templates. Primer sequences and PCR conditions used in the present study are described in Supplementary Table 5.1. PCR products were run on 1% agarose gels, and those with the expected sizes were excised, purified (Thermo Fisher Scientific) and sequenced (GATC Biotech, Constance, Germany). Unlike for elovl4a, BLAST searches of the RNA-Seq datasets were unsuccessful for the tambaqui elovl4b, probably because the gene is not expressed at relevant levels in the examined tissues (brain and liver). Instead, the nucleotide sequence of elovl4b was obtained by BLAST searches against the draft genome assembly of C. macropomum, which is also presented in this work.


The predicted ORF of the tambaqui elovl4b, identified in the genome scaffold 79611, was synthesized by Integrated DNA Technologies, BVBA (Belgium), as an insert into the pUC19 plasmid.
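As a quick consistency check on a predicted ORF, its length in base pairs should be a multiple of three, and the encoded protein should have one residue fewer than the number of codons because the stop codon is not translated. The Python sketch below applies this to the ORF lengths reported in the abstract (948 and 912 bp); the function name `protein_length` is hypothetical.

```python
def protein_length(orf_bp):
    """Number of amino acids encoded by an ORF that includes its stop codon."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    # One codon per amino acid, minus the terminal stop codon
    return orf_bp // 3 - 1


elovl4a_aa = protein_length(948)  # ORF length reported for elovl4a
elovl4b_aa = protein_length(912)  # ORF length reported for elovl4b
```

Both values agree with the putative protein lengths of 315 and 303 amino acids given in the abstract.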

Phylogenetic analysis

To confirm the orthology of the isolated genes, the deduced amino acid sequences of tambaqui Elovl4 (see Supplementary Table 2 for accession numbers) were used to construct a phylogenetic tree. The evolutionary history was inferred by using the Maximum Likelihood method and JTT matrix-based model (Jones et al., 1992). The tree with the highest log likelihood (-3692.85) is shown. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with superior log likelihood value. The tree was drawn to scale, with branch lengths measured in the number of substitutions per site. The proportion of sites where at least one unambiguous base was present in at least one sequence for each descendent clade is shown next to each internal node in the tree. This analysis involved 20 amino acid sequences. A total of 380 positions were considered in the final dataset. Evolutionary analyses were conducted in MEGA X (Kumar et al., 2018).

Functional characterization of Elovl4a and Elovl4b

The tambaqui Elovl4a and Elovl4b were functionally characterized in yeast. Briefly, the ORF of elovl4a was amplified from brain cDNA with primers containing restriction sites for XbaI (forward) and BamHI (reverse) (Supplementary Table 5.1), and then digested. In turn, the elovl4b ORF was isolated by direct digestion of pUC19-elovl4b with the enzymes HindIII (forward) and XbaI (reverse). Both elovl4 ORF fragments were then purified and ligated into the similarly restricted yeast expression vector pYES2 (Thermo Fisher Scientific) to produce the constructs pYES2-elovl4a and pYES2-elovl4b. Finally, all constructs were confirmed by Sanger sequencing (GATC Biotech). Yeast competent cells InvSc1 (Invitrogen) were transformed with pYES2-elovl4a and pYES2-elovl4b using the S.c. EasyComp™ Transformation Kit (Invitrogen). Selection of yeast containing the pYES2 constructs was done on S. cerevisiae minimal medium minus uracil (SCMM-ura) plates.

In order to determine the role of the tambaqui Elovl4 in the biosynthesis of VLC-PUFA, a single colony of transgenic yeast expressing either elovl4a or elovl4b was grown at 30 °C in SCMM-ura broth with one of the following PUFA substrates: 18:3n-3, 18:2n-6, 18:4n-3, 18:3n-6, 20:5n-3, 20:4n-6, 22:5n-3, 22:4n-6 and 22:6n-3. In order to compensate for differences in substrate uptake efficiency by yeast, final PUFA substrate concentrations varied according to their fatty acyl chain lengths as 0.5 mM (C18), 0.75 mM (C20) and 1.0 mM (C22) (Lopes-Marques et al., 2017). Yeast were grown in the presence of PUFA substrates for 2 days at 30 °C, and then harvested, washed twice with 5 mL ice-cold HBSS (Invitrogen) containing 1 % (w/v) fatty acid free bovine serum albumin (Sigma-Aldrich), and freeze dried for 24 h for further analyses. Total lipids extracted from yeast (Folch et al., 1957) were used to prepare fatty acyl methyl esters (FAME) that were further analyzed by gas chromatography as previously described (Li et al., 2017b). FAME were identified based on retention times using a Fisons GC-8160 (Thermo Fisher Scientific) gas chromatograph equipped with a 60 m × 0.32 mm i.d. × 0.25 μm ZB-wax column (Phenomenex, Macclesfield, UK) and flame ionization detector. The elongation conversion efficiencies of exogenously added PUFA substrates were calculated as [areas of first product and longer chain products/(areas of all products with longer chain than substrate + substrate area)] × 100.

In order to test the role of the tambaqui Elovl4 in VLC-SFA biosynthesis, yeast transformed with pYES2-elovl4a, pYES2-elovl4b, or empty pYES2 (control) were incubated in triplicate in the absence of exogenously added substrates. Yeast culture, harvest and collection were carried out as described for yeast supplemented with PUFA substrates. The elongation efficiency of the tambaqui Elovl4 towards saturated FA was estimated by comparing the saturated FA profiles of the control yeast (i.e. transformed with empty pYES2) with those of yeast transformed with pYES2-elovl4a and pYES2-elovl4b.
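The conversion formula given above can be illustrated with a minimal Python sketch. The peak areas and the function name `stepwise_conversions` are hypothetical and are not taken from the present study. The sketch assumes that each elongation product acts as the substrate for the following elongation step.

```python
def stepwise_conversions(areas):
    """Return the percent conversion for each elongation step.

    `areas` lists chromatogram peak areas ordered by chain length: the first
    entry is the exogenously added substrate, followed by its successive
    elongation products.
    """
    conversions = []
    for i in range(1, len(areas)):
        products = sum(areas[i:])   # first product of this step and all longer products
        substrate = areas[i - 1]    # substrate remaining for this step
        total = products + substrate
        conversions.append(100.0 * products / total if total else 0.0)
    return conversions


# Hypothetical peak areas (arbitrary units) for 18:3n-3, 20:3n-3, 22:3n-3, 24:3n-3
areas = [900.0, 60.0, 30.0, 10.0]
result = [round(c, 1) for c in stepwise_conversions(areas)]
print(result)
```

With these illustrative areas, the first step (18:3n-3 to 20:3n-3) has 100 units of products against 900 units of remaining substrate, so the conversion is 10 %.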

Statistical analysis

For the Elovl4 functional characterization, the percentages of selected saturated fatty acids of yeast expressing the tambaqui elovl4a and elovl4b were compared with those of control yeast transformed with the empty pYES2 vector, by a Student’s t-test (P < 0.05) (SPSS version 11.5, SPSS, Chicago, IL, USA). FA contents in yeast were presented as mean ± SD (n=3).
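For readers unfamiliar with the test, the comparison between two triplicate groups can be sketched in plain Python as follows. The analysis in this study was run in SPSS; the sketch below is only an illustration with hypothetical triplicate values, and it assumes equal variances (the classical Student's t-test). The critical value 2.776 is the two-tailed threshold for P < 0.05 with 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical t value for alpha = 0.05 and 4 degrees of freedom (n = 3 + 3)
T_CRIT_DF4 = 2.776


def student_t(sample_a, sample_b):
    """Return Student's t statistic (equal variances assumed) and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical percentages of one saturated FA in triplicate cultures
control = [0.10, 0.12, 0.14]
elovl4 = [1.8, 2.0, 2.2]

t_stat, df = student_t(elovl4, control)
significant = abs(t_stat) > T_CRIT_DF4
print(round(t_stat, 2), df, significant)
```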

5.3. Results

A draft genome assembly of tambaqui

As part of a larger study addressing the nutritional physiology of tambaqui, we generated a novel genomic resource: a cost-effective, low-coverage (22x) draft assembly of the tambaqui genome. About 491 million clean, low error rate (< 1%) paired-end (PE) short reads were generated (Table 5.1). These reads allowed the estimation of a genome size of 2.3 Gb, higher than previous estimates (Hardie and Hebert 2004, 2003; Lobo 2015), a heterozygosity rate of 0.08-0.09 % and a repeat content of 24-28 % (Supplementary Table 5.4). The present draft assembly has a total length of 2.31 Gb, a contig N50 length of 18 kb and a scaffold N50 length of 25 kb. In addition, although relatively fragmented, this draft assembly showed a high level of gene content in BUSCO analyses (e.g. the total genes found (complete + fragmented) in the Metazoa and Vertebrata databases were 95.6 % and 96.1 %, respectively) (Table 5.1). All the data resources generated in this work were submitted to GenBank (details can be checked in Supplementary Table 5.3).
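For reference, the N50 and L50 statistics quoted above follow a simple definition: sequences are sorted from longest to shortest, and N50 is the length of the sequence at which the running total first reaches half of the assembly length, while L50 is the number of sequences needed to get there. The minimal Python sketch below illustrates this with a toy list of lengths. The function name `n50_l50` and the toy values are hypothetical, and the assembly statistics of this study were not produced with this code.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths in bp."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # `length` is the N50; `count` sequences were needed (the L50)
            return length, count
    raise ValueError("empty list of lengths")


# Toy assembly of seven contigs (total length 300 bp)
toy = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy)
print(n50, l50)
```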

Elovl4 sequences and phylogenetic analyses

Using various experimental strategies, we were able to deduce the putative full ORF of tambaqui elovl4a and elovl4b. In the case of elovl4a, the inspection of two RNA-Seq projects allowed a clear identification of a single sequence with similarity to this gene. In the case of elovl4b, several attempts returned no matching hits. Indeed, elovl4b has a far more restricted tissue expression pattern (e.g. eye) and is probably absent from the analysed RNA-Seq projects (liver and brain). To explore the possibility that elovl4b was independently lost in this species, we next investigated the draft genome assembly produced here. By applying various BLAST strategies, we were able to identify two positive hits (Figure 5.1A). A careful examination of both scaffolds allowed the identification of the C. macropomum elovl4b. The elovl4a and elovl4b ORFs are 948 and 912 bp long, respectively, encoding putative proteins of 315 and 303 amino acids. The elovl4a and elovl4b sequences isolated herein have been deposited in GenBank under accession numbers NM936179 and NM947651, respectively. Both genes (elovl4a and elovl4b) are organized in six exons, with the first five having the same number of base pairs in both genes, and the last exon containing 312 and 276 bp in elovl4a and elovl4b, respectively (Figure 5.1A). A phylogenetic tree was constructed based on the amino acid sequence alignment of the tambaqui Elovl4a and Elovl4b deduced proteins with other teleost fish Elovl4 (Figure 5.1B). The phylogenetic analysis showed that the tambaqui Elovl4a and Elovl4b grouped together with their corresponding teleost orthologs. Moreover, sequence analysis shows that both tambaqui Elovl4 possess the conserved diagnostic histidine box HXXHH motif present in all elongases and the endoplasmic reticulum (ER) retrieval signal, RXKXX (Elovl4a) and KXKXX (Elovl4b), at the carboxyl terminus, which could be observed in the alignment made with D. rerio and P. nattereri (Figure 5.2).
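The reported sequence lengths can be cross-checked with simple arithmetic, as in the minimal Python sketch below. It assumes that the ORF lengths given above include the stop codon and that the last-exon sizes refer to coding sequence only; the helper name `protein_length` is hypothetical.

```python
def protein_length(orf_bp):
    """Number of amino acids encoded by an ORF that includes its stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return orf_bp // 3 - 1  # the final codon is the stop codon


orf = {"elovl4a": 948, "elovl4b": 912}        # ORF lengths in bp, as reported
last_exon = {"elovl4a": 312, "elovl4b": 276}  # last exon lengths in bp, as reported

proteins = {gene: protein_length(bp) for gene, bp in orf.items()}

# If exons 1-5 have identical lengths in both genes, the difference between
# the two ORFs must equal the difference between the two last exons.
orf_diff = orf["elovl4a"] - orf["elovl4b"]
exon_diff = last_exon["elovl4a"] - last_exon["elovl4b"]
print(proteins, orf_diff, exon_diff)
```

Under these assumptions, the ORF lengths give 315 and 303 amino acids, and the 36 bp difference between the two ORFs matches the difference between the two last exons.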

Table 5.1: General stats of Colossoma macropomum draft genome assembly.

Raw Data Stats
  Raw sequencing reads (PE)               491416754
  Clean reads                             526280086

Assembly Stats                            Contig *       Scaffold *
  Total number of sequences               255110         229835
  Total number of sequences (>= 1000)     228808         203533
  Total number of sequences (>= 10000)    71024          62135
  Total number of sequences (>= 25000)    20829          24453
  Total number of sequences (>= 50000)    4287           7519
  Total length, bp                        2311578259     2314112053
  Total length (>= 1000), bp              2292711909     2295245703
  Total length (>= 10000), bp             1638446955     1760715179
  Total length (>= 25000), bp             856096545      1158738469
  Total length (>= 50000), bp             294364635      572298158
  N50 length, bp                          18011          25040
  L50 length, bp                          35010          24386
  Maximum length, bp                      228241         330952
  GC content, %                           39.36          39.36
  Genome coverage, ×                      -              22x
  Back mapping reads, %                   -              98%

BUSCO **                                  Contig *       Scaffold *
  Eukaryota                               -              C:76.9%[S:31.7%,D:45.2%],F:11.2%,M:11.9%,n:303
  Metazoa                                 -              C:86.7%[S:35.3%,D:51.4%],F:8.9%,M:4.4%,n:978
  Vertebrata                              -              C:80.1%[S:34.1%,D:46.0%],F:16.0%,M:3.9%,n:2586
  Actinopterygii                          -              C:77.9%[S:34.3%,D:43.6%],F:11.5%,M:10.6%,n:4584

* All statistics are based on contigs/scaffolds of size ≥500 bp.
** C: complete BUSCOs, % [S: complete and single copy, %; D: complete and duplicated, %]; F: fragmented, %; M: missing, %; n: number of sequences in the database.
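The "total genes found" figures quoted in the text correspond to the sum of the complete (C) and fragmented (F) BUSCO fractions. A minimal Python sketch that parses the BUSCO summary strings of Table 5.1 and recomputes this sum is given below. The function name `parse_busco` and the regular expression are illustrative assumptions, not part of the BUSCO software.

```python
import re

# Pattern for the one-line BUSCO summary notation used in Table 5.1
BUSCO_PATTERN = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco(summary):
    """Parse a BUSCO summary string and add the complete + fragmented total."""
    match = BUSCO_PATTERN.fullmatch(summary)
    if match is None:
        raise ValueError(f"unrecognised BUSCO summary: {summary!r}")
    fields = {key: float(value) for key, value in match.groupdict().items()}
    fields["n"] = int(fields["n"])
    fields["found"] = round(fields["C"] + fields["F"], 1)
    return fields


metazoa = parse_busco("C:86.7%[S:35.3%,D:51.4%],F:8.9%,M:4.4%,n:978")
vertebrata = parse_busco("C:80.1%[S:34.1%,D:46.0%],F:16.0%,M:3.9%,n:2586")
print(metazoa["found"], vertebrata["found"])
```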

Regarding the diagnostic “Q” (glutamine) in position −5 from the H**HH, characteristic of PUFA elongases, this is only present in Elovl4b sequence. Elovl4a does not display this feature, similarly to P. nattereri Elovl4 (Figure 5.2).
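The sequence features discussed here (the HXXHH histidine box, the glutamine at position −5 from it, and the C-terminal ER retrieval signal) can be located programmatically. The minimal Python sketch below illustrates the idea on two short toy peptides. These peptides are hypothetical and are not the tambaqui sequences, and the sketch assumes that position −5 is counted from the first histidine of the box.

```python
import re

HIS_BOX = re.compile(r"H..HH")      # histidine box HXXHH
ER_SIGNAL = re.compile(r"[KR].K..$")  # RXKXX or KXKXX at the carboxyl terminus


def describe_elongase(protein):
    """Report the histidine box, the Q at position -5 and the ER retrieval signal."""
    box = HIS_BOX.search(protein)
    has_q = False
    if box and box.start() >= 5:
        has_q = protein[box.start() - 5] == "Q"
    return {
        "histidine_box": box.group() if box else None,
        "q_at_minus5": has_q,
        "er_signal": bool(ER_SIGNAL.search(protein)),
    }


# Hypothetical toy peptides, for illustration only
toy_without_q = "MAAEVTFLHVYHHSTMLLAARLKSE"  # no Q at -5, RXKXX terminus
toy_with_q = "MAAQVTFLHVYHHSTMLLAAKAKTD"     # Q at -5, KXKXX terminus

print(describe_elongase(toy_without_q))
print(describe_elongase(toy_with_q))
```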

Figure 5.1: A. Genomic loci and exon organization of the C. macropomum elovl4a and elovl4b genes. B. Maximum likelihood phylogenetic tree comparing C. macropomum Elovl4a and Elovl4b with elongase proteins from other Osteichthyes, rooted with Lepisosteus oculatus. Numbers at nodes indicate branch support in posterior probabilities. Accession numbers for all Elovl4 sequences are available in Supplementary Table 2.

Figure 5.2. Comparison of the deduced C. macropomum amino acid (AA) sequences of Elovl4a and Elovl4b with those of other species. A - For Elovl4a, D. rerio (NP_957090.1), P. nattereri (XP_017565830.1), O. niloticus (XP_003443720.1) and C. macropomum (this work) were used. B - For Elovl4b, D. rerio (NP_956266.1), P. nattereri (XP_017574113.1), O. niloticus (XP_003440669.1) and C. macropomum (this work) were used. AA sequences were aligned using BioEdit. Identical residues are shaded black and similar residues are shaded grey. Indicated are the conserved HXXHH histidine box motif and the ER retrieval signal, RXKXX (Elovl4a) and KXKXX (Elovl4b), at the carboxyl terminus; “*” represents “Q” (glutamine) in position −5 from the H**HH.

Roles of the tambaqui Elovl4a and Elovl4b in the biosynthesis of VLC-PUFA

The activity of the tambaqui Elovl4 enzymes was investigated by growing yeast transformed with pYES2-elovl4a and pYES2-elovl4b in the presence of potential PUFA substrates with acyl chains of C18 (18:3n-3, 18:2n-6, 18:4n-3 and 18:3n-6), C20 (20:5n-3 and 20:4n-6) and C22 (22:5n-3, 22:4n-6 and 22:6n-3). The results showed that both tambaqui Elovl4 enzymes can contribute to VLC-PUFA biosynthesis, since yeast expressing either elovl4a or elovl4b were able to elongate a large variety of PUFA substrates (Table 5.2). The step-wise elongation reactions led to the production of a series of polyenes that, in the case of Elovl4a, reached up to C36 for all PUFA substrates tested except for 22:4n-6 and DHA (Table 5.2). Polyenes of up to C34 were detected in yeast expressing the tambaqui elovl4b (Table 5.2). It is noteworthy that Elovl4a and Elovl4b were able to elongate both 20:5n-3 and 22:5n-3 up to 24:5n-3, the substrate for DHA biosynthesis via the Sprecher pathway (Table 5.2).

Table 5.2. Functional characterisation of the Elovl4a and Elovl4b elongases of Colossoma macropomum by heterologous expression in the yeast Saccharomyces cerevisiae. Data are presented as the percentage conversions of polyunsaturated fatty acid (FA) substrates calculated according to the formula [areas of first product and longer chain products / (areas of all products with longer chain than substrate + substrate area)] × 100.

                               % Conversion
FA substrate    FA product     Elovl4a    Elovl4b
18:3n-3         20:3n-3        10.9       16.6
                22:3n-3        10.4       9.7
                24:3n-3        35.9       16.9
                26:3n-3        86.1       87.3
                28:3n-3        95.7       100.0
                30:3n-3        98.8       93.6
                32:3n-3        97.0       61.9
                34:3n-3        85.3       N.D.
                36:3n-3        21.2       N.D.
18:2n-6         20:2n-6        6.7        12.1
                22:2n-6        19.9       17.4
                24:2n-6        30.7       25.1
                26:2n-6        83.9       86.1
                28:2n-6        94.3       90.6
                30:2n-6        97.9       86.9
                32:2n-6        90.7       9.5
                34:2n-6        54.0       N.D.
                36:2n-6        12.4       N.D.
18:4n-3         20:4n-3        5.7        8.1
                22:4n-3        15.4       14.4
                24:4n-3        43.5       10.2
                26:4n-3        70.1       100.0
                28:4n-3        94.3       100.0
                30:4n-3        98.7       76.8
                32:4n-3        93.3       77.8
                34:4n-3        87.2       N.D.
                36:4n-3        17.5       N.D.
18:3n-6         20:3n-6        6.7        13.0
                22:3n-6        29.8       26.9
                24:3n-6        57.9       22.2
                26:3n-6        75.9       85.2
                28:3n-6        95.3       93.7
                30:3n-6        99.4       93.5
                32:3n-6        97.4       25.6
                34:3n-6        74.3       N.D.
                36:3n-6        11.2       N.D.
20:5n-3         22:5n-3        17.2       23.8
                24:5n-3        27.1       26.2
                26:5n-3        31.4       19.6
                28:5n-3        71.7       100.0
                30:5n-3        99.0       91.6
                32:5n-3        97.9       88.7
                34:5n-3        92.6       24.5
                36:5n-3        44.9       N.D.
20:4n-6         22:4n-6        16.1       30.8
                24:4n-6        31.4       37.9
                26:4n-6        44.5       55.7
                28:4n-6        75.0       91.6
                30:4n-6        99.0       97.5
                32:4n-6        96.2       67.9
                34:4n-6        88.2       4.8
                36:4n-6        29.4       N.D.
22:5n-3         24:5n-3        20.0       10.0
                26:5n-3        33.4       44.1
                28:5n-3        70.6       94.9
                30:5n-3        98.9       99.5
                32:5n-3        97.0       92.4
                34:5n-3        91.6       19.7
                36:5n-3        40.6       N.D.
22:4n-6         24:4n-6        1.9        6.4
                26:4n-6        N.D.       49.2
                28:4n-6        N.D.       93.0
                30:4n-6        N.D.       98.2
                32:4n-6        N.D.       74.0
                34:4n-6        N.D.       6.7
22:6n-3         24:6n-3        0.1        0.2

N.D., not detected

Roles of the tambaqui Elovl4a and Elovl4b in the biosynthesis of VLC-SFA

The role of the putative Elovl4 enzymes of tambaqui in VLC-SFA biosynthesis was assessed by comparing the FA profiles of control S. cerevisiae (i.e. transformed with empty pYES2) with those of yeast expressing either tambaqui elovl4a or elovl4b. Yeast transformed with pYES2-elovl4a or pYES2-elovl4b generally contained higher proportions of saturated FA from 20:0 to 32:0 than their corresponding controls (Table 5.3). However, only the increases in 28:0 and 30:0 in yeast expressing elovl4b were statistically significant in comparison to control yeast (Table 5.3). These results indicate that Elovl4 is involved in the biosynthesis of VLC-SFA.

Table 5.3. Selected saturated fatty acids (percentage of total fatty acids) of yeast Saccharomyces cerevisiae transformed with either the empty pYES2 vector (Control) or the Colossoma macropomum elovl4 ORF sequences (pYES2-elovl4a or pYES2-elovl4b). Results are means ± SD (n=3). Significant statistical differences between transgenic yeast expressing each elovl4 gene and its corresponding control are indicated with an asterisk (Student t-test, P < 0.05).

SFA     Control          pYES2-elovl4a     P value    Control          pYES2-elovl4b     P value
16:0    61.15 ± 6.05     49.21 ± 9.25      0.30       55.41 ± 3.46     39.68 ± 2.33      0.37
18:0    35.77 ± 4.12     39.35 ± 10.71     0.26       42.22 ± 3.13     51.32 ± 3.13      1.00
20:0    0.46 ± 0.02      0.75 ± 0.03       0.46       0.39 ± 0.03      0.87 ± 0.09       0.31
22:0    0.57 ± 0.14      1.13 ± 0.13       0.10       0.60 ± 0.07      1.03 ± 0.16       0.12
24:0    0.20 ± 0.01      0.52 ± 0.01       0.13       0.26 ± 0.05      0.53 ± 0.02       0.31
26:0    1.63 ± 0.12      5.15 ± 0.22       0.22       0.98 ± 0.25      4.36 ± 0.96       0.06
28:0    0.12 ± 0.01      2.51 ± 0.24       0.06       0.09 ± 0.00      2.00 ± 0.40       0.04*
30:0    0.05 ± 0.01      1.13 ± 0.16       0.20       0.01 ± 0.00      0.15 ± 0.01       0.05*
32:0    0.01 ± 0.00      0.19 ± 0.02       0.16       0.01 ± 0.00      0.01 ± 0.00       0.24

5.4. Discussion

The importance of LC-PUFA molecules is commonly known for the role in homeostasis and (Tocher, 2010). More recently, several studies on commercially important fish species have emphasized the importance of very long-chain saturated fatty acids (VLC-SFA) and PUFA (VLC-PUFA) in aquaculture (Carmona-Antonanzas et al., 2011; Jin et al., 2017; Oboh et al., 2017a; Torres et al., 2020). The LC-PUFA biosynthesis pathway in tambaqui was recently unveiled (Ferraz et al., 2019), showing that all the desaturase and elongase activities required to convert C18 PUFA into LC-PUFA are present. However, the repertoire and function of the tambaqui elovl4 had not been investigated.

Genomic knowledge has advanced massively in the last decade. With the rapid development of next-generation sequencing technologies, it is now easier to sequence and characterize the genomes of a large number of species of interest. Here, as part of a larger study addressing tambaqui nutritional physiology, a novel genomic resource was generated, and the key enzymes of VLC-FA biosynthesis were investigated in the tambaqui draft genome. Due to the high completeness and the long scaffold N50 of the tambaqui draft genome assembly, it was possible to identify the elovl4b gene. This had not been possible through searches of the presently available transcriptomes (liver and brain). Furthermore, these results helped us to clarify that elovl4b has not been independently lost in tambaqui.

The repertoire and function of genes encoding Elovl4 proteins were investigated, and the phylogenetic analysis showed that C. macropomum has two elovl4-like elongases with high homology to previously characterized Elovl4 from teleosts. They were accordingly termed Elovl4a and Elovl4b. This finding is in agreement with in silico studies suggesting that virtually all teleosts possess both types of elovl4 genes (Castro et al., 2016). Both C. macropomum elovl4a and elovl4b encode putative proteins that contain a diagnostic histidine box, indicating that Elovl enzymes have functional regions that are highly conserved during evolution (Hashimoto et al., 2008). Notably, the Elovl4b sequence contains the diagnostic “Q” (glutamine) in position −5 from the H**HH motif, characteristic of PUFA elongases (Hashimoto et al., 2008).
Interestingly, although Elovl4a does not contain the diagnostic “Q”, our functional characterization data demonstrated that this enzyme can operate toward PUFA substrates, and so participates in the biosynthesis of VLC-PUFA. Both C. macropomum Elovl4 elongases characterized in the present study were demonstrated to elongate PUFA substrates including compounds of different chain lengths (C18–22) and series (n-3 and n-6). These results suggest that both Elovl4a and Elovl4b from tambaqui participate in the biosynthesis of VLC-PUFA of up to 34-36 carbons, through consecutive elongations from a varied range of shorter-chain PUFA precursors. These results are in agreement with those reported in C. gariepinus and A. schlegelii, in which both Elovl4 proteins (Elovl4a and Elovl4b) contribute to VLC-PUFA biosynthesis (Jin et al., 2017; Oboh et al., 2017a). In contrast, the pioneering study that first reported on the functions of Elovl4 from teleosts showed that in D. rerio only Elovl4b, but not Elovl4a, plays important roles in the biosynthesis of VLC-PUFA (Monroig et al., 2010). That conclusion was drawn from functional analysis data derived from yeast assays similar to those used herein, along with gene expression analyses demonstrating that the D. rerio elovl4b is highly expressed in retina and pineal gland (Monroig et al., 2010), tissues containing photoreceptors in which VLC-PUFA accumulate (Agbaga et al., 2010; Deak et al., 2019).

Vertebrate Elovl4 have demonstrated roles in the biosynthesis of VLC-SFA (Deak et al., 2019). Here, yeast expressing either elovl4a or elovl4b had increased levels of VLC-SFA of C26–30 as compared to control yeast, in agreement with elongation abilities of some teleost Elovl4 reported previously (Carmona-Antonanzas et al., 2011; Li et al., 2017a; Li et al., 2017b; Monroig et al., 2010; Monroig et al., 2012; Monroig et al., 2011; Oboh et al., 2017a; Ran et al., 2019). However, such increased production of VLC-SFA was only statistically significant for Elovl4b, more specifically for 28:0 and 30:0, but not for Elovl4a. These results differ from those reported in D. rerio, since its Elovl4a, but not Elovl4b, was found to play major roles in VLC-SFA biosynthesis (Monroig et al., 2010).

Both VLC-PUFA and VLC-SFA are recognized to be biosynthesized by Elovl4 in a tissue-specific manner (Deak et al., 2019). Given that elovl4a and elovl4b are strongly expressed in eye and brain from teleosts (Monroig et al., 2010; Oboh et al., 2017a; Torres et al., 2020), it is reasonable to speculate that Elovl4 are responsible for the endogenous supply of these compounds in neural tissues. This hypothesis is partly supported by the identification of certain VLC-PUFA including 24:6n-3, 26:6n-3, 28:6n-3, 30:6n-3, 32:6n-3 and 34:6n-3 in phosphatidylcholine (PC) from teleost retina (Garlito et al., 2019). For instance, Atlantic salmon retinal PC contained detectable levels of 24:6n-3 and 32:6n-3 (Garlito et al., 2019), in agreement with the salmon Elovl4b being able to elongate DHA up to 34:6n-3 (Carmona-Antonanzas et al., 2011).
Our functional characterization data, however, indicated that both tambaqui Elovl4 isoforms could only elongate DHA to 24:6n-3, with longer elongation products not being detected. Unfortunately, retinal samples from tambaqui were not available in the present study and hence we were unable to confirm whether, along with 24:6n-3, other VLC-PUFA can be endogenously synthesized by tambaqui. More clearly though, our data strongly suggest that tambaqui Elovl4 can actively contribute to the biosynthesis of DHA, a compound that is highly abundant in neural tissues as described above for VLC-PUFA (Bell and Tocher, 1989; Dyall, 2015). DHA is quantitatively the most important n-3 LC-PUFA in the brain, and has consistently been shown to have unique and indispensable roles in neuronal function, being an important constituent of brain phospholipids and playing a role in maintaining the structural and functional integrity of membranes in the brain (Bell and Tocher, 1989; Dyall, 2015).

In addition to dietary origin, DHA can be synthesized endogenously. Arguably, the most common route for DHA biosynthesis is the Sprecher pathway, which involves elongation of 22:5n-3 to 24:5n-3, before ∆6 desaturation and a peroxisomal β-oxidation reaction occur (Castro et al., 2016; Monroig and Kabeya, 2018; Monroig et al., 2016). Our results demonstrate that the Elovl4 characterized herein from tambaqui can contribute to DHA biosynthesis through this pathway, as both isoforms were able to elongate 22:5n-3 to 24:5n-3. Such a characteristic, widespread among teleost Elovl4 enzymes, e.g. African catfish (C. gariepinus) (Oboh et al., 2017a) and rainbow trout (O. mykiss) (Zhao et al., 2019), had been postulated to compensate for the lack of Elovl2 in the genomes of many teleosts (Castro et al., 2016). Recently, a study on the Atlantic bluefin tuna (T. thynnus) Elovl4b has shown that this enzyme is indeed able to produce 24:5n-3 before this metabolite can be further desaturated by the Fads2 existing in this species (Betancor et al., 2020). Tambaqui ∆6 Fads2 was also demonstrated to desaturate 24:5n-3 to 24:6n-3, suggesting that this species is capable of operating the Sprecher pathway (Ferraz et al., 2019). Importantly, the present study shows that, along with Elovl2 (Ferraz et al., 2019), the key elongation of 22:5n-3 to 24:5n-3 can also be mediated by Elovl4 enzymes, as evidenced in our in vitro assays. These results indicate that Elovl2 (Ferraz et al., 2019) and both Elovl4 enzymes characterized in the present study have partly overlapping functions, as previously described for Elovl2 and Elovl5 in tambaqui itself (Ferraz et al., 2019) or other teleosts (Castro et al., 2016).

In conclusion, a draft genome assembly of tambaqui has been presented. This data collection can provide insight into the nutritional physiology and metabolism of essential FA including VLC-PUFA and VLC-SFA. To illustrate the quality of the genome and demonstrate its usefulness, we have isolated the elovl4b gene from the draft genome. The present study confirmed that both C. macropomum Elovl4 enzymes play critical roles in the biosynthesis of VLC-PUFA and VLC-SFA. Elovl4a showed the ability to biosynthesize VLC-PUFA of up to 36 carbons, whereas Elovl4b enabled the synthesis of polyenes of up to 34 carbons. Moreover, both Elovl4 from tambaqui can contribute to the biosynthesis of DHA via the Sprecher pathway, since they produced the metabolic intermediate 24:5n-3 from 22:5n-3.

5.6. References

Agbaga, M.-P., Mandal, M.N.A., Anderson, R.E., 2010. Retinal very long-chain PUFAs: new insights from studies on ELOVL4 protein. J. Lipid Res. 51, 1624-1642.
Bell, M., Tocher, D.R., 1989. Molecular species composition of the major phospholipids in brain and retina from rainbow trout (Salmo gairdneri). Occurrence of high levels of di-(n-3) polyunsaturated fatty acid species. Biochemical J. 264, 909-915.
Berthelot, C., Brunet, F., Chalopin, D., Juanchich, A., Bernard, M., Noël, B., Bento, P., Da Silva, C., Labadie, K., Alberti, A., 2014. The rainbow trout genome provides novel insights into evolution after whole-genome duplication in vertebrates. Nat. Commun. 5, 3657.
Betancor, M.B., Oboh, A., Ortega, A., Mourente, G., Navarro, J.C., de la Gandara, F., Tocher, D.R., Monroig, O., 2020. Molecular and functional characterisation of a putative elovl4 gene and its expression in response to dietary fatty acid profile in Atlantic bluefin tuna (Thunnus thynnus). Comp. Biochem. Physiol. 240B, 110372.
Bolger, A.M., Lohse, M., Usadel, B., 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 30, 2114-2120.
Buzzi, M., Henderson, R., Sargent, J., 1996. The desaturation and elongation of linolenic acid and eicosapentaenoic acid by hepatocytes and liver microsomes from rainbow trout (Oncorhynchus mykiss) fed diets containing fish oil or olive oil. Biochim Biophys Acta. 1299, 235-244.
Buzzi, M., Henderson, R.J., Sargent, J.R., 1997. Biosynthesis of docosahexaenoic acid in trout hepatocytes proceeds via 24-carbon intermediates. Comp. Biochem. Physiol. 116B, 263-267.
Calder, P.C., 2014. Very long chain omega‐3 (n‐3) fatty acids and human health. Eur. J. Lipid Sci. Technol. 116, 1280-1300.
Campoy, C., Escolano-Margarit, M.V., Anjos, T., Szajewska, H., Uauy, R., 2012. Omega 3 fatty acids on child growth, visual acuity and neurodevelopment. Br. J. Nutr. 107, S85-S106.
Carmona-Antonanzas, G., Monroig, O., Dick, J.R., Davie, A., Tocher, D.R., 2011. Biosynthesis of very long-chain fatty acids (C>24) in Atlantic salmon: cloning, functional characterisation, and tissue distribution of an Elovl4 elongase. Comp. Biochem. Physiol. 159B, 122-129.
Castro, L.F., Tocher, D.R., Monroig, O., 2016. Long-chain polyunsaturated fatty acid biosynthesis in chordates: Insights into the evolution of Fads and Elovl gene repertoire. Prog. Lipid Res. 62, 25-40.
Clavijo, B., Accinelli, G.G., Wright, J., Heavens, D., Barr, K., Yanes, L., Di Palma, F., 2017. W2RAP: a pipeline for high quality, robust assemblies of large complex genomes from short read data. BioRxiv. 110999.
Conte, M.A., Gammerdinger, W.J., Bartie, K.L., Penman, D.J., Kocher, T.D., 2017. A high quality assembly of the Nile Tilapia (Oreochromis niloticus) genome reveals the structure of two sex determination regions. BMC genomics. 18, 341.
Deak, F., Anderson, R.E., Fessler, J.L., Sherry, D.M., 2019. Novel Cellular Functions of Very Long Chain-Fatty Acids: Insight From ELOVL4 Mutations. Front. Cell. Neurosci. 13, 428.
Delgado-Lista, J., Perez-Martinez, P., Lopez-Miranda, J., Perez-Jimenez, F., 2012. Long chain omega-3 fatty acids and : a systematic review. Br. J. Nutr. 107, S201-S213.
Du, K., Wuertz, S., Adolfi, M., Kneitz, S., Stöck, M., Oliveira, M., Nóbrega, R., Ormanns, J., Kloas, W., Feron, R., 2019. The genome of the arapaima (Arapaima gigas) provides insights into gigantism, fast growth and chromosomal sex determination system. Sci. Rep. 9, 5293.
Dyall, S.C., 2015. Long-chain omega-3 fatty acids and the brain: a review of the independent and shared effects of EPA, DPA and DHA. Front Aging Neurosci. 7, 52.
Ferraz, R.B., Kabeya, N., Lopes-Marques, M., Machado, A.M., Ribeiro, R.A., Salaro, A.L., Ozório, R., Castro, L.F.C., Monroig, Ó., 2019. A complete enzymatic capacity for long-chain polyunsaturated fatty acid biosynthesis is present in the Amazonian teleost tambaqui, Colossoma macropomum. Comp. Biochem. Physiol. 227B, 90-97.
Folch, J., Lees, M., Sloane Stanley, G., 1957. A simple method for the isolation and purification of total lipids from animal tissues. J. biol Chem. 226, 497-509.
Garlito, B., Portolés, T., Niessen, W., Navarro, J., Hontoria, F., Monroig, Ó., Varó, I., Serrano, R., 2019. Identification of very long-chain (> C24) fatty acid methyl esters using gas chromatography coupled to quadrupole/time-of-flight mass spectrometry with atmospheric pressure chemical ionization source. Anal. Chim. Acta. 1051, 103-109.
Guillou, H., Zadravec, D., Martin, P.G., Jacobsson, A., 2010. The key roles of elongases and desaturases in mammalian fatty acid metabolism: Insights from transgenic mice. Prog. Lipid Res. 49, 186-199.
Gurevich, A., Saveliev, V., Vyahhi, N., Tesler, G., 2013. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 29, 1072-1075.
Hashimoto, K., Yoshizawa, A.C., Okuda, S., Kuma, K., Goto, S., Kanehisa, M., 2008. The repertoire of desaturases and elongases reveals fatty acid variations in 56 eukaryotic genomes. J. Lipid Res. 49, 183-191.
Innis, S.M., 2007. Dietary (n-3) fatty acids and brain development. J. Nutr. 137, 855-859.
Ishikawa, A., Kabeya, N., Ikeya, K., Kakioka, R., Cech, J.N., Osada, N., Leal, M.C., Inoue, J., Kume, M., Toyoda, A., 2019. A key metabolic gene for recurrent freshwater colonization and radiation in fishes. Science. 364, 886-889.
Jakobsson, A., Westerberg, R., Jacobsson, A., 2006. Fatty acid elongases in mammals: their regulation and roles in metabolism. Prog. Lipid Res. 45, 237-249.
Jin, M., Monroig, O., Navarro, J.C., Tocher, D.R., Zhou, Q.-C., 2017. Molecular and functional characterisation of two elovl4 elongases involved in the biosynthesis of very long-chain (> C24) polyunsaturated fatty acids in black seabream Acanthopagrus schlegelii. Comp. Biochem. Physiol. 212B, 41-50.
Jones, D.T., Taylor, W.R., Thornton, J.M., 1992. The rapid generation of mutation data matrices from protein sequences. Bioinformatics. 8, 275-282.
Kabeya, N., Yamamoto, Y., Cummins, S.F., Elizur, A., Yazawa, R., Takeuchi, Y., Haga, Y., Satoh, S., Yoshizaki, G., 2015. Polyunsaturated fatty acid metabolism in a marine teleost, Nibe croaker Nibea mitsukurii: Functional characterization of Fads2 desaturase and Elovl5 and Elovl4 elongases. Comp. Biochem. Physiol. 188B, 37-45.
Kumar, S., Stecher, G., Li, M., Knyaz, C., Tamura, K., 2018. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol. 35, 1547-1549.
Li, S., Monroig, Ó., Navarro, J.C., Yuan, Y., Xu, W., Mai, K., Tocher, D.R., Ai, Q., 2017a. Molecular cloning and functional characterization of a putative Elovl4 gene and its expression in response to dietary fatty acid profiles in orange‐spotted grouper Epinephelus coioides. Aquaculture Res. 48, 537-552.
Li, S., Monroig, Ó., Wang, T., Yuan, Y., Navarro, J.C., Hontoria, F., Liao, K., Tocher, D.R., Mai, K., Xu, W., 2017b. Functional characterization and differential nutritional regulation of putative Elovl5 and Elovl4 elongases in large yellow croaker (Larimichthys crocea). Sci. reports. 7, 2303.
Li, Y., Wen, Z., You, C., Xie, Z., Tocher, D.R., Zhang, Y., Wang, S., Li, Y., 2020. Genome wide identification and functional characterization of two LC-PUFA biosynthesis elongase (elovl8) genes in rabbitfish (Siganus canaliculatus). Aquaculture. 735127.

Lien, S., Koop, B.F., Sandve, S.R., Miller, J.R., Kent, M.P., Nome, T., Hvidsten, T.R., Leong, J.S., Minkley, D.R., Zimin, A., 2016. The Atlantic salmon genome provides insights into rediploidization. Nature. 533, 200.
Lopes-Marques, M., Ozorio, R., Amaral, R., Tocher, D.R., Monroig, O., Castro, L.F., 2017. Molecular and functional characterization of a fads2 orthologue in the Amazonian teleost, Arapaima gigas. Comp. Biochem. Physiol. 203B, 84-91.
Louro, B., De Moro, G., Garcia, C., Cox, C.J., Veríssimo, A., Sabatino, S.J., Santos, A.M., Canário, A.V., 2019. A haplotype-resolved draft genome of the European sardine (Sardina pilchardus). GigaScience. 8, giz059.
Machado, A.M., Tørresen, O.K., Kabeya, N., Couto, A., Petersen, B., Felício, M., Campos, P.F., Fonseca, E., Bandarra, N., Lopes-Marques, M., 2018. “Out of the Can”: A Draft Genome Assembly, Liver Transcriptome and Nutrigenomics of the European Sardine, Sardina pilchardus. Genes. 9.10, 485.
Mapleson, D., Garcia Accinelli, G., Kettleborough, G., Wright, J., Clavijo, B.J., 2016. KAT: a K-mer analysis toolkit to quality control NGS datasets and genome assemblies. Bioinformatics. 33, 574-576.
Ministério da Agricultura, Brasil., 2016. Produção da pecuária municipal / IBGE. Accessed 2017/10. http://biblioteca.ibge.gov.br/index.php/biblioteca-catalogo?view=detalhes&id=784.
Monroig, Ó., Kabeya, N., 2018. Desaturases and elongases involved in polyunsaturated fatty acid biosynthesis in aquatic invertebrates: a comprehensive review. Fisheries Sci. 1-18.
Monroig, O., Lopes-Marques, M., Navarro, J.C., Hontoria, F., Ruivo, R., Santos, M.M., Venkatesh, B., Tocher, D.R., Castro, L.F., 2016. Evolutionary functional elaboration of the Elovl2/5 gene family in chordates. Sci. Rep. 6, 20510.
Monroig, O., Rotllant, J., Cerda-Reverter, J.M., Dick, J.R., Figueras, A., Tocher, D.R., 2010. Expression and role of Elovl4 elongases in biosynthesis of very long-chain fatty acids during zebrafish Danio rerio early embryonic development. Biochim. Biophys. Acta. 1801, 1145-1154.
Monroig, Ó., Tocher, D.R., Hontoria, F., Navarro, J.C., 2013. Functional characterisation of a Fads2 fatty acyl desaturase with Δ6/Δ8 activity and an Elovl5 with C16, C18 and C20 elongase activity in the anadromous teleost meagre (Argyrosomus regius). Aquaculture. 412-413, 14-22.
Monroig, Ó., Wang, S., Zhang, L., You, C., Tocher, D.R., Li, Y., 2012. Elongation of long-chain fatty acids in rabbitfish Siganus canaliculatus: Cloning, functional characterisation and tissue distribution of Elovl5- and Elovl4-like elongases. Aquaculture. 350-353, 63-70.
Monroig, Ó., Webb, K., Ibarra-Castro, L., Holt, G.J., Tocher, D.R., 2011. Biosynthesis of long-chain polyunsaturated fatty acids in marine fish: Characterization of an Elovl4-like elongase from cobia Rachycentron canadum and activation of the pathway during early life stages. Aquaculture. 312, 145-153.
Oboh, A., Kabeya, N., Carmona-Antoñanzas, G., Castro, L.F.C., Dick, J.R., Tocher, D.R., Monroig, O., 2017b. Two alternative pathways for docosahexaenoic acid (DHA, 22:6n-3) biosynthesis are widespread among teleost fish. Sci. reports. 7, 3889.
Oboh, A., Navarro, J.C., Tocher, D.R., Monroig, O., 2017a. Elongation of very long-chain (> C24) fatty acids in Clarias gariepinus: Cloning, functional characterization and tissue expression of elovl4 elongases. Lipids. 52, 837-848.
Palti, Y., Genet, C., Gao, G., Hu, Y., You, F.M., Boussaha, M., Rexroad, C.E., Luo, M.-C., 2012. A second generation integrated map of the rainbow trout (Oncorhynchus mykiss) genome: analysis of conserved synteny with model fish genomes. Mar. Biotechnol. 14, 343-357.


Ran, Z., Xu, J., Liao, K., Monroig, Ó., Navarro, J.C., Oboh, A., Jin, M., Zhou, Q., Zhou, C., Tocher, D.R., 2019. Biosynthesis of long-chain polyunsaturated fatty acids in the razor clam Sinonovacula constricta: Characterization of four fatty acyl elongases and a novel desaturase capacity. Biochim. Biophys. Acta Mol. Cell. Biol. Lipids. 1864, 1083-1090. Schartl, M., Kneitz, S., Volkoff, H., Adolfi, M., Schmidt, C., Fischer, P., Minx, P., Tomlinson, C., Meyer, A., Warren, W.C., 2019. The Piranha Genome Provides Molecular Insight Associated to Its Unique Feeding Behavior. Genome Biol. Evol. 11, 2099-2106. Simão, F.A., Waterhouse, R.M., Ioannidis, P., Kriventseva, E.V., Zdobnov, E.M., 2015. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 31, 3210-3212. Song, L., Florea, L., Langmead, B., 2014. Lighter: fast and memory-efficient sequencing error correction without counting. Genome Biol. 15, 509. Sprecher, H., 2000. Metabolism of highly unsaturated n-3 and n-6 fatty acids. Biochim. Biophys. Acta Mol. Cell. Biol. Lipids. 1486, 219-231. Tine, M., Kuhl, H., Gagnaire, P.-A., Louro, B., Desmarais, E., Martins, R.S., Hecht, J., Knaust, F., Belkhir, K., Klages, S., 2014. European sea bass genome and its variation provide insights into adaptation to euryhalinity and speciation. Nat. Commun. 5, 5770. Tocher, D.R., 2010. Fatty acid requirements in ontogeny of marine and freshwater fish. Aquaculture Res. 41, 717-732. Tocher, D.R., 2015. Omega-3 long-chain polyunsaturated fatty acids and aquaculture in perspective. Aquaculture. 449, 94-107. Tocher, D.R., Agaba, M.K., Hastings, N., Teale, A.J., 2003. Biochemical and molecular studies of the polyunsaturated fatty acid desaturation pathway in fish. In: Browman H I, Skiftesvik A B (ed.). The Big Fish Bang: Proceedings of the 26th Annual Larval Fish Conference, Bergen, Norway: Institute of Marine Research (IMR) / Fishlarvae.com, pp. 211-228.
Torres, M., Navarro, J., Varó, I., Agulleiro, M., Morais, S., Monroig, Ó., Hontoria, F., 2020. Expression of genes related to long-chain (C18–22) and very long-chain (> C24) fatty acid biosynthesis in gilthead seabream (Sparus aurata) and Senegalese sole (Solea senegalensis) larvae: Investigating early ontogeny and nutritional regulation. Aquaculture. 734949. Vurture, G.W., Sedlazeck, F.J., Nattestad, M., Underwood, C.J., Fang, H., Gurtowski, J., Schatz, M.C., 2017. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics. 33, 2202-2204. Xu, P., Zhang, X., Wang, X., Li, J., Liu, G., Kuang, Y., Xu, J., Zheng, X., Ren, L., Wang, G., 2014. Genome sequence and genetic diversity of the common carp, Cyprinus carpio. Nat. genetics. 46, 1212. Yan, J., Liang, X., Cui, Y., Cao, X., Gao, J., 2017. Elovl4 can effectively elongate C18 polyunsaturated fatty acids in loach Misgurnus anguillicaudatus. Biochem. Biophys. Res. Commun. 495.4, 2637-2642. Zhao, N., Monroig, Ó., Navarro, J.C., Xiang, X., Li, Y., Du, J., Li, J., Xu, W., Mai, K., Ai, Q., 2019. Molecular cloning, functional characterization and nutritional regulation of two elovl4b elongases from rainbow trout (Oncorhynchus mykiss). Aquaculture. 511, 734221.


Chapter 6

The repertoire of the elongases of very long-chain fatty acids gene family is conserved in the tambaqui (Colossoma macropomum) genome, and includes members of the novel Elovl8


6.1. Abstract

Elongases of very long-chain fatty acids (Elovl) are critical players in the regulation of fatty acid chain length. At present, eight Elovls, designated Elovl1 to Elovl8, each with a characteristic fatty acid substrate specificity, have been identified in most vertebrates, including fish. In general, Elovl1, Elovl3, Elovl6 and Elovl7 display a substrate preference for saturated and monounsaturated fatty acids (SFA and MUFA), while Elovl2, Elovl4, Elovl5 and Elovl8 use polyunsaturated fatty acids (PUFA) as substrates. Elovl2, Elovl4 and Elovl5 have received considerable attention in aquatic animals because of their involvement in the conversion of C18 PUFA to long-chain polyunsaturated fatty acids (LC-PUFA). Of these, however, only Elovl4 is involved in the elongation of very long-chain polyunsaturated fatty acids (VLC-PUFA). The recently discovered Elovl8 is, in principle, thought to contribute to the biosynthesis of LC-PUFA in fish. Here, we identified the full repertoire of Elovl genes in tambaqui and carried out a detailed phylogenetic and synteny analysis, which showed that these genes are conserved relative to other species. Lastly, we identified in tambaqui the newest member of the Elovl family, elovl8. Future gene function analyses combined with CRISPR/Cas9 assays should provide valuable clues to the biology of this omnivorous freshwater fish, which is of increasing value for Latin American aquaculture.

6.2. Introduction

Elongases of very long-chain fatty acids (Elovl) are widely present in the genomes of animals, plants and microorganisms, where they regulate the length of fatty acids (FA) (Guillou et al. 2010). All Elovl proteins contain a characteristic Elovl domain and a highly conserved HXXHH motif, which is found in yeast, mouse, rat and human (Jakobsson, Westerberg, and Jacobsson 2006). The number of Elovls differs markedly among species. In teleost fish, seven Elovls were previously described: Elovl1, Elovl3, Elovl6 and Elovl7 prefer saturated and monounsaturated fatty acid (SFA and MUFA) substrates, while Elovl2, Elovl4 and Elovl5 are selective for polyunsaturated fatty acids (PUFA) (Guillou et al. 2010). Elovl2, Elovl4 and Elovl5 are able to convert C18 PUFA to long-chain (C20–24) polyunsaturated fatty acids (LC-PUFA). These enzymes have received significant attention because of the health relevance of LC-PUFA such as arachidonic acid (ARA, 20:4n–6), eicosapentaenoic acid (EPA, 20:5n–3) and docosahexaenoic acid (DHA, 22:6n–3), which are essential and play many biological roles in the growth, development and reproduction of vertebrates (Monroig et al., 2018; Tocher, 2010; Tocher and Dick 2000). Strikingly, a new member of the Elovl family, Elovl8, has been unveiled in fish genomes, with two gene paralogues designated elovl8a and elovl8b. Studies have shown that at least Elovl8b is able to elongate PUFA to LC-PUFA (Oboh 2018; Li et al. 2020). Below, we briefly discuss each of the Elovl classes.

The elongation of SFA and MUFA can be catalyzed by four elongases, namely Elovl1, Elovl3, Elovl6 and Elovl7, whose functions often overlap with some redundancy (Guillou et al. 2010; Kihara 2012; Sherry et al. 2019). Briefly, Elovl6 is involved in the elongation of C12–16 SFA and MUFA up to C18 (Sherry et al. 2019). Elovl1 is involved in the production of SFA up to C26 in length (Ofman et al. 2010). Elovl3 is suggested to control the synthesis of SFA and MUFA of up to C24 (Zadravec et al. 2010), while Elovl7 is reported to elongate SFA and MUFA of 18–22 carbons (Sherry et al. 2019; Naganuma et al. 2011). Although these FA are essential barrier components of the plasma membranes, where they play important roles in several aspects of cellular growth (Guillou et al. 2010), few studies have addressed SFA and MUFA elongases in fish. In particular, no functional characterization has been conducted with Elovl3 and Elovl7 orthologues in fish.

The Elovls involved in the biosynthesis of LC-PUFA such as ARA, EPA and DHA have been widely studied in fish over the past decades, and homologues of the mammalian genes have been identified and functionally characterized in several teleost species (Castro, Tocher, and Monroig 2016; Monroig et al. 2016a; Ferraz et al., 2019; Janaranjani et al., 2018; Li et al., 2017; Monroig et al., 2013; Oboh et al., 2016; Oboh et al., 2017b). Elovl2 and Elovl5 are highly conserved across vertebrates (Monroig et al., 2016). In teleosts, most species studied have a single elovl5 gene, whose product can elongate C18 and C20 PUFA substrates, with lower efficiency towards C22 PUFA (Castro et al., 2016).
The Atlantic salmon (Salmo salar) is an exception, with two characterized copies of elovl5 (Morais et al. 2009b). Elovl2 in fish, on the other hand, has much greater affinity towards C20 and C22 PUFA (Morais et al. 2009a; Carmona-Antoñanzas et al. 2013). Teleost Elovl2 has demonstrated the capacity to elongate docosapentaenoic acid (DPA), thus contributing to DHA synthesis through the so-called “Sprecher pathway” (Sprecher 2000; Oboh et al. 2017b). Another significant functional player is Elovl4. This enzyme plays a crucial role in the biosynthesis of VLC-PUFA, having the unique ability to synthesise FA with chain lengths of up to C36 (Carmona-Antonanzas et al., 2011; Kabeya et al., 2015; Monroig et al., 2010; Monroig et al., 2012; Monroig et al., 2011; Oboh et al., 2017a; Zhao et al., 2019).

In several cases, an elongase family is represented by two gene copies in teleost fish. One example is elovl4, with the copies elovl4a and elovl4b (Monroig et al. 2010; Oboh et al. 2017a; Yan et al. 2017). The SFA and MUFA pathways also offer several examples of duplicated Elovl genes in fish. In zebrafish, two elovl1 genes (elovl1a and elovl1b) are highly expressed in the swim bladder; elovl1b is also expressed in the kidney, and the genes are key determinants of the development of these organs (Bhandari et al. 2016). In addition, elovl7 in fish also has two copies (elovl7a and elovl7b) (Xue et al., 2014). However, a detailed analysis across the teleost lineage is still missing. These duplicated genes most likely originated in the teleost-specific whole-genome duplication (WGD) (Pasquier et al., 2016). In vertebrates, two rounds of WGD occurred at the root of the lineage (WGD1 and WGD2) (Dehal and Boore, 2005). Teleost fish experienced a third round, known as the teleost-specific round of WGD (TGD) (Hoegg et al. 2004; Meyer and Van de Peer 2005; Pasquier et al. 2016). Genome duplications have clear impacts on the evolution of species, since they provide extra genes that can develop novel functions. Importantly, while many of the duplicated genes have been lost, a substantial percentage of the duplicates has been retained (Zhou et al. 2008). As a result, fish often have two co-orthologues where humans and other mammals have a single-copy gene.

Tambaqui (Colossoma macropomum) is a tropical teleost fish native to the Amazon and Orinoco river basins, representing over a quarter of the total aquaculture production in Brazil (Ministério da Agricultura 2016). It is well suited to farming since it accepts artificial feed, grows rapidly, and is appreciated by consumers (Valladão, Gallani, and Pilarski 2018). It is the most important native cultured species in Brazil, and in some countries the production of native fish is currently overtaking the production of exotic species. This is a milestone for South American aquaculture, where tambaqui has been pointed out as a new species option (Shen and Yue 2018).
Recently, studies on the biosynthesis of LC-PUFA in tambaqui have advanced: fads2, elovl5 and elovl2 (Ferraz et al. 2019) and elovl4a and elovl4b (Chapter 5) have been isolated and characterized, alongside gene expression studies of these isolated genes (Chapter 4). This string of studies demonstrates that all the desaturase and elongase activities required to convert the C18 PUFA linoleic acid (LA, 18:2n–6) and α-linolenic acid (ALA, 18:3n–3) into ARA, EPA and DHA are present in tambaqui.

In this study, to identify the full repertoire of the Elovl family present in the tambaqui genome, we report, for the first time, all elovl genes from 1 to 7, plus the new elongase class, elovl8. We additionally carried out a detailed phylogenetic and synteny analysis, which showed that these genes are conserved relative to other species. This approach demonstrates the research value of the initial genome assembly.


6.3. Materials and methods

Sequence and Phylogenetic Analysis of Elovls

Pygocentrus nattereri elovl gene sequences were used as queries in BLAST searches against a C. macropomum draft genome (Chapter 5) to obtain the tambaqui elovl genes (elovl1a, elovl1b, elovl2, elovl3, elovl4a, elovl4b, elovl5, elovl6, elovl7a, elovl7b, elovl8a and elovl8b). For each gene, the contiguous matching sequence segments were collected manually and the introns removed, so that the exons alone formed the open reading frame (ORF). The deduced amino acid (aa) sequences were compared to the corresponding orthologues from other species. A phylogenetic analysis of the deduced aa sequences of C. macropomum and those from D. rerio, P. nattereri, M. musculus and H. sapiens was carried out, with phylogenetic trees constructed using the Maximum Likelihood method and the JTT matrix-based model (Jones, Taylor, and Thornton 1992). Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with the superior log likelihood value. The proportion of sites where at least 1 unambiguous base is present in at least 1 sequence for each descendent clade is shown next to each internal node in the tree. This analysis involved 51 amino acid sequences. There was a total of 388 positions in the final dataset. Evolutionary analyses were conducted in MEGA X (Kumar et al. 2018).
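The exon-joining step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in this work: the function names `build_orf` and `translate` and the toy exon sequences are hypothetical, the exons are assumed to be given 5'→3' on the coding strand, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def build_orf(exons):
    """Join exon sequences (introns already removed) into one ORF string."""
    return "".join(exon.upper() for exon in exons)


def translate(orf):
    """Translate an ORF codon by codon, stopping at the first stop codon."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    protein = []
    for i in range(0, len(orf), 3):
        aa = CODON_TABLE[orf[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Toy exons (hypothetical, not tambaqui sequence data).
toy_exons = ["ATGCAG", "CTGACC", "TTCTAA"]
toy_orf = build_orf(toy_exons)
toy_protein = translate(toy_orf)
print(toy_orf, toy_protein)
```

A length check of this kind is a cheap way to notice an exon boundary that was trimmed in the wrong frame before the deduced aa sequence is passed on to alignment.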

Gene annotation and Synteny maps

Assembled genome scaffolds were searched to identify those containing elovl-like genes. The identified scaffolds were uploaded individually to FGENESH (http://www.softberry.com), and HMM-based gene structure prediction was performed using A. mexicanus (Characidae) as reference. For each scaffold, the predicted protein sequences were identified using blastp in NCBI and mapped. The remaining synteny maps were created using the annotated genomes available in NCBI, specifically A. mexicanus (GCF_000372685.2), P. nattereri (GCA_001682695.1) and D. rerio (GCF_000002035.6). All synteny maps are centred on the target gene, and four neighbouring genes were collected upstream and downstream when possible.
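The windowing rule used for the synteny maps (target gene in the centre, up to four neighbours on each side) can be illustrated with a short Python sketch. It assumes that each scaffold or chromosome region is already available as an ordered list of gene names; the helper names `flanking_genes` and `shared_neighbours` and the placeholder gene names are hypothetical and do not correspond to any annotation file used here.

```python
def flanking_genes(ordered_genes, target, flank=4):
    """Return (upstream, downstream) neighbours of `target`.

    Up to `flank` genes are taken on each side; fewer are returned when
    the scaffold ends before the window is full.
    """
    idx = ordered_genes.index(target)
    upstream = ordered_genes[max(0, idx - flank):idx]
    downstream = ordered_genes[idx + 1:idx + 1 + flank]
    return upstream, downstream


def shared_neighbours(window_a, window_b):
    """Gene names present in both windows, as a sorted list."""
    return sorted(set(window_a) & set(window_b))


# Placeholder gene orders (hypothetical, for illustration only).
scaffold = ["g1", "g2", "elovlX", "g3", "g4", "g5", "g6", "g7"]
reference = ["g0", "g1", "g2", "elovlX", "g3", "g9"]

up_s, down_s = flanking_genes(scaffold, "elovlX")
up_r, down_r = flanking_genes(reference, "elovlX")
conserved = shared_neighbours(up_s + down_s, up_r + down_r)
print(up_s, down_s, conserved)
```

Comparing gene names alone is a simplification: it ignores gene orientation and relies on consistent naming between annotations, which is why the blastp identification step above precedes the mapping.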


6.4. Results

Sequence and Phylogenetic Analysis of Elovls

The sequences of elovl1a, elovl1b, elovl2, elovl3, elovl4a, elovl4b, elovl5, elovl6, elovl7a, elovl7b, elovl8a and elovl8b were identified and retrieved from the tambaqui genome (Ferraz et al. 2020, unpublished). The phylogenetic analysis of the C. macropomum Elovl sequences produced a phylogenetic tree with the highest log likelihood (−13978.46) (Figure 6.1). The tambaqui Elovl proteins cluster with Elovl-like proteins from fish and mammals. The topology of the tree showed two clades: one consisting of proteins involved in the elongation of LC-PUFA, and the other of proteins involved in the elongation of saturated and monounsaturated fatty acids. The LC-PUFA elongation clade was itself separated into two clusters, one consisting of Elovl2 and Elovl5, and the other of Elovl4 and Elovl8. More distantly related, two main clusters could be distinguished: Elovl3/Elovl6 and Elovl1/Elovl7. Additionally, the deduced aa sequences of tambaqui Elovl8a and Elovl8b show high identity with the D. rerio Elovl8 proteins in alignment. They also retain the conserved HXXHH histidine box motif, where “*” marks the glutamine (Q) at position −5 from the HXXHH, and the ER retrieval signal (Figure 6.2).
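The histidine box check can be expressed as a simple pattern search. The sketch below is illustrative only: it scans a protein string for HXXHH and reports the residues at positions −5 and −1 relative to the first histidine, using the Q/L pattern that the Discussion of this chapter attributes to PUFA elongases (Hashimoto et al. 2008). The function names are hypothetical, and the ten-residue fragment QLTFLHVYHH is the one quoted in the Discussion; the second fragment is an invented negative example.

```python
import re


def histidine_boxes(protein):
    """Return (start, residue at -5, residue at -1) for each HXXHH match.

    Positions are counted from the first H of the box; a residue is
    reported as None when the box sits too close to the N terminus.
    """
    hits = []
    # Lookahead so that overlapping boxes are also reported.
    for match in re.finditer(r"(?=H..HH)", protein):
        start = match.start()
        minus5 = protein[start - 5] if start >= 5 else None
        minus1 = protein[start - 1] if start >= 1 else None
        hits.append((start, minus5, minus1))
    return hits


def has_pufa_elongase_pattern(protein):
    """True if any HXXHH box has Q at position -5 and L at position -1."""
    return any(q == "Q" and l == "L" for _, q, l in histidine_boxes(protein))


fragment = "QLTFLHVYHH"          # fragment quoted in the Discussion
negative = "ALTFLHVYHH"          # hypothetical variant without Q at -5
print(histidine_boxes(fragment), has_pufa_elongase_pattern(fragment))
print(histidine_boxes(negative), has_pufa_elongase_pattern(negative))
```

Such a scan only flags sequence features; it says nothing about enzymatic activity, which still requires functional characterization.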

Gene annotation and Synteny maps

To support the phylogenetic analysis in assigning orthology to the tambaqui elongase genes, we performed a comparative synteny analysis. For this, the tambaqui genomic scaffolds containing Elovl-like genes were submitted for gene annotation. Initial analysis showed that five elongase genes, namely Elovl5, Elovl4b, Elovl8b, Elovl3 and Elovl6, were located in scaffolds that contained only the target gene (Supplementary material, Figure 1). The remaining elongases, Elovl2, Elovl4a, Elovl1a, Elovl1b, Elovl7a, Elovl7b and Elovl8a, were located in scaffolds that included at least one neighbouring gene besides the target elongase. For comparative purposes, synteny maps were created for tambaqui, for two additional Characidae species (A. mexicanus and P. nattereri), and for the model organism D. rerio. Gene annotation of scaffolds 5642 and 6198 identified tambaqui Elovl1a and Elovl1b, respectively. In scaffold 5642, a neighbouring gene, Cdc20, was identified besides tambaqui Elovl1a; analysis of this locus in other teleost species showed conservation of the Elovl1a and Cdc20 pair (Figure 6.3A).


Figure 6.1. Phylogenetic tree comparing the deduced amino acid sequences of Colossoma macropomum (Elovl1a, Elovl1b, Elovl2, Elovl3, Elovl4a, Elovl4b, Elovl5, Elovl6, Elovl7a, Elovl7b, Elovl8a and Elovl8b) with sequences from other teleost species and mammals. The Ciona intestinalis Elovl6 was included in the analysis as an outgroup sequence to root the tree. “*” marks the proteins identified in this work.


Figure 6.2. Comparison of the deduced amino acid (AA) sequences of Elovl8a and Elovl8b from C. macropomum with Elovl8a D. rerio (NP_001070061.1) and Elovl8b D. rerio (NP_001019609.2). The AA sequences were aligned using BioEdit. Identical residues are shaded black and similar residues are shaded grey. Indicated are the conserved HXXHH histidine box motif, where “*” represents “Q” (glutamine) in position −5 from the HXXHH, and the ER retrieval signal.
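As a numerical complement to the shading in an alignment such as Figure 6.2, pairwise identity can be computed directly from two aligned sequences. The sketch below is a minimal illustration, not the BioEdit calculation: it assumes the two sequences are already aligned to the same length with "-" as the gap character, and it counts identity over columns where neither sequence has a gap, which is only one of several conventions in use. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns with no gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, not the sequences in the figure).
identity_point = percent_identity("MKLV", "MKIV")
identity_gapped = percent_identity("MK-V", "MKLV")
print(identity_point, identity_gapped)
```

Because gapped columns are skipped, two sequences that differ only by an insertion score 100% here; a denominator based on alignment length would give a lower value, so the convention should be stated whenever identity scores are reported.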

Regarding scaffold 6198, in addition to Elovl1b, gene annotation identified Mknk1, Mob3c and Zfyve9; again, comparative analysis shows that this locus is conserved with A. mexicanus, P. nattereri and D. rerio (Figure 6.3B). In addition to Elovl1a and Elovl1b, we identified other teleost-specific genome duplicates in this region, such as Zfyve9-like neighbouring Elovl1a and Zfyve9 next to Elovl1b. Regarding tambaqui scaffolds 21403 and 31131, gene annotation identified tambaqui Elovl2 and an additional neighbouring gene, Sycp2l. Comparative analysis with other teleost species showed that Sycp2l is also located in the vicinity of Elovl2, and that this locus is conserved between species, with multiple conserved genes such as Gcm2 and Gnal (Figure 6.3C). In scaffold 960, gene annotation identified Elovl4a and four additional neighbouring genes, namely Cdkal1, Echdc1, Soga3 and Bckdhb, all of which are also found in the same genomic locus as Elovl4a in the other teleost species analysed (Figure 6.3D). Tambaqui Elovl7a and Elovl7b were identified in scaffolds 34678 and 142, respectively; again, gene annotation identified additional neighbouring genes in both cases, namely Ercc8 for Elovl7a, and Pde4d-like and Dep1b for Elovl7b. Comparative analysis with other teleosts showed that these genes are conserved in the corresponding locus and also identified other genes related by the teleost-specific genome duplication (Figure 6.3E and 6.3F). Tambaqui Elovl8a was identified in scaffold 6820 together with Tesk2, Toe1, Thap2-like and Sepp1b. The comparison of the tambaqui elovl8a locus with the corresponding locus in P. nattereri and D. rerio shows conservation of all genes identified in tambaqui scaffold 6820 (Figure 6.3G).

Figure 6.3. Gene annotation of Colossoma macropomum genomic scaffold and comparative synteny maps. Panels A-G contain synteny maps of Elovl1a, Elovl1b, Elovl2, Elovl4a, Elovl7a, Elovl7b, Elovl8a and Elovl8b respectively. Target elongase is represented in red, neighbouring cross-species conserved genes are represented in colour. Genes related by teleost genome duplication are underlined.


6.5. Discussion

Previous comparative genomic studies have addressed the repertoire of the Elovl gene family, mainly the members that participate in LC-PUFA biosynthesis in chordates (reviewed in Castro, Tocher, and Monroig 2016). Overall, various approaches have highlighted the role of whole-genome duplications in expanding the Elovl gene catalogue in the vertebrate ancestor (Kabeya et al., 2015; Oboh et al., 2017a). This process has been a fundamental contributor to the diversification of the Elovl family in teleosts. For example, studies on the biosynthesis of LC-PUFA indicate that Elovl2 and Elovl5 emerged from genome duplications in vertebrate ancestry. A single Elovl2/5, predating this duplication, is found in the invertebrate amphioxus (Monroig et al. 2016b). Within the expanded vertebrate repertoire, seven members of the Elovl protein family (Elovl1–7) have been previously described in fish and mammals. They can be broadly subdivided into elongases of SFA and MUFA, namely Elovl1, Elovl3, Elovl6 and Elovl7, and elongases of PUFA, namely Elovl2, Elovl4 and Elovl5 (Monroig et al. 2016a; Castro, Tocher, and Monroig 2016). In addition, a new member of the Elovl family has recently been unveiled in fish, designated as a putative Elovl8, which apparently elongates PUFA substrates (Oboh 2018; Li et al. 2020).

Here, we identified all Elovl1–8 gene orthologues in the tambaqui genome. Phylogenetic analysis of the encoded Elovl proteins revealed extensive homology to other Elovl proteins, and the phylogenetic and comparative synteny analyses confirmed that all the Elovls identified in the tambaqui genome are orthologous to the corresponding Elovls of fish and mammals, with the exception of the elovl8 genes, which are unique to teleosts (Figure 6.1). In fact, a recent study suggested that elovl8 genes may be unique to teleosts, and that they might have arisen from a common ancestral gene, the elovl4-like gene of Sarcopterygii (Li et al. 2020).
Additionally, sequence alignment of tambaqui Elovl8 with Elovl8 from D. rerio showed a high degree of conservation, confirming that these genes are true elovl8 orthologues and distinct from those previously isolated in tambaqui (Figure 6.2). Some of the Elovl genes have already been studied in tambaqui, including the functional characterization of elovl2 and elovl5 (Ferraz et al. 2019). That work indicates that these enzymes have functions in LC-PUFA synthesis homologous to those in other teleosts, where elongase activities are required to convert C18 PUFA into LC-PUFA such as DHA, EPA and ARA. Similarly, both elovl4 gene paralogues (a and b) (Chapter 5) display elongase activities consistent with their participation in the biosynthesis of VLC-SFA and VLC-PUFA (Kabeya et al. 2015; Li et al. 2017; Yan et al. 2017; Oboh, Navarro, et al. 2017; Jin et al. 2017; Zhao et al. 2019; Betancor et al. 2020).

In this study, we have shown that elovl4 is duplicated in tambaqui, as in other teleost fish. As with elovl4a and elovl4b, we identified three other duplicated genes in tambaqui: elovl1, elovl7 and elovl8. Phylogenetic analyses grouped the newly identified genes into the same clades as those of other teleosts. The strongest evidence came from the conserved synteny analysis, in which elovl1a, elovl1b, elovl7a, elovl7b and elovl8a are conserved in the same positions as in other teleosts. As shown in the phylogenetic analysis, unlike the other genes duplicated in teleosts, elovl8 is not present in mammals, probably because of independent gene loss (Li et al. 2020). Teleost fish underwent an extra whole-genome duplication event, which explains the emergence of several new genes (Toloza-Villalobos, Arroyo, and Opazo 2015; Pasquier et al. 2016). However, some of these new genes have been deleted or have acquired a new activity during evolution (Demuth et al. 2006; Eckhart et al. 2008; Kuraku and Kuratani 2011). Such secondary loss may best explain the absence of elovl8 in mammals. A recent study points in the same direction: elovl8 genes were present in some teleosts but not in amphibians, reptiles, birds or mammals (Li et al. 2020). In addition, the same work suggested that elovl8 genes may be unique to teleosts, and that they might have arisen from a common ancestral gene, the elovl4-like gene of Sarcopterygii (Li et al. 2020).

Regarding the activity of Elovl8 in fish, in vitro studies in yeast have functionally characterized this gene from African catfish (C. gariepinus) (Oboh 2018) and rabbitfish (S. canaliculatus) (Li et al. 2020). Substrates such as C18:2n6, C18:3n3, C18:3n6, C18:4n3, C20:4n6, C20:5n3, C22:4n6 and C22:5n3 were tested, and the results showed that the C.
gariepinus Elovl8a and Elovl8b are capable of elongating some of the FA substrates assayed (18:3n-3, 18:2n-6, 18:4n-3, 18:3n-6 and 20:4n-6), but not efficiently (Oboh 2018). More recently, however, it has been shown that at least one of these proteins, Elovl8b, can efficiently elongate PUFA to LC-PUFA in S. canaliculatus fed vegetable oil (VO) (Li et al. 2020). Here, the deduced aa sequences of the Elovl8 isoforms from tambaqui contain all the features of vertebrate Elovl protein family members, including an ER retrieval signal at the C terminus containing lysine (K) residues, and the diagnostic histidine box (HXXHH) (Jakobsson, Westerberg, and Jacobsson 2006). Moreover, the histidine (H) box and its N-terminal side (QLTFLHVYHH) show the aa pattern typical of the PUFA elongase subfamily of eukaryotic elongases, with a glutamine (Q) at position −5 and a leucine (L) at position −1 from the first H (Hashimoto et al. 2008) (Figure 6.2). This similarity suggests that, like Elovl8 isolated in S. canaliculatus, Elovl8 from tambaqui could have a function in the PUFA cascade, thus contributing to the biosynthesis of VLC-PUFA when the fish are fed VO.

As discussed above, tambaqui displays a full capacity to biosynthesize LC-PUFA from the C18 PUFA LA and ALA, having all the necessary elongation and desaturation capacities. Interestingly, these results suggest that this endogenous molecular capacity fits the trophic level of the species. Yet defining the trophic level of tambaqui is not straightforward, since in its natural habitat tambaqui has a variable diet that depends on life stage and time of year (Oliveira et al. 2006). The Amazon river is a complex system, with natural floods and droughts. Tambaqui therefore exploits diverse resources and exhibits varied feeding habits, using both animal and plant food sources. The diet is dominated by zooplankton and insects during the low-water period and, during the rest of the year, by fruits and seeds from the flooded forest (Silva, Pereira-Filho, and M. Oliveira Pereira 2000). With this varied feeding spectrum, tambaqui frequently moves among different trophic levels (Oliveira et al. 2006). This variation in trophic level would parallel the endogenous capacity to biosynthesize LC-PUFA. Since the diet varies, its FA profile can also vary, and with it the need to biosynthesize LC-PUFA. That is, when the animal feeds as a primary consumer, its diet supplies little LC-PUFA, and it must biosynthesize these FA to maintain adequate levels in the body. It is therefore very likely that the retention of all elovl genes, and the maintenance of their activities in the LC-PUFA biosynthesis pathway during evolution, allows tambaqui to exploit all these ecological niches.
Thus, being able to biosynthesize LC-PUFA from the C18 PUFA LA and ALA when necessary allows tambaqui to occupy a broader habitat rather than depend on one specific type.

In conclusion, we have shown that tambaqui has all the elongases described to date in fish, namely elovl1–8, which are homologous to those of other species. A detailed phylogenetic and synteny analysis confirmed the conservation of these genes relative to other species. We confirm that four genes of the Elovl family are duplicated in tambaqui, giving the paralogue pairs elovl1a and elovl1b, elovl4a and elovl4b, elovl7a and elovl7b, and elovl8a and elovl8b, probably as a result of the teleost-specific genome duplication. Lastly, we identified in tambaqui the newest member of the Elovl family, elovl8, which, in principle, contributes to the biosynthesis of LC-PUFA in fish. The presence of all Elovls in tambaqui, and consequently the ability to biosynthesize a wide range of FA, including LC-PUFA and VLC-PUFA, is probably explained by its history of diversified, intermittent feeding habits, which likely favoured the evolutionary conservation of all these genes. The ability to biosynthesize LC-PUFA from the C18 PUFA LA and ALA is of great value to current tambaqui aquaculture. Importantly, world aquaculture encourages the farming of species of lower trophic level, such as the omnivorous tambaqui, because such species may accept substitutes for fish-derived feed ingredients in the diet, such as vegetable oil (VO). This would reduce production costs and the environmental pressure of fishing for fish oil (FO) and fishmeal (FM) production, and may increase productivity, thus contributing to the sustainable and more economically viable development of national aquaculture.

6.6. References

Bhandari, S., Lee, J.N., Kim, Y.-I., Nam, I.-K., Kim, S.-J., Kim, S.-J., Kwak, S., Oh, G.-S., Kim, H.-J., Yoo, H.J., 2016. The fatty acid chain elongase, Elovl1, is required for kidney and swim bladder development during zebrafish embryogenesis. Organogenesis 12, 78-93. Carmona-Antonanzas, G., Monroig, O., Dick, J.R., Davie, A., Tocher, D.R., 2011. Biosynthesis of very long-chain fatty acids (C>24) in Atlantic salmon: cloning, functional characterisation, and tissue distribution of an Elovl4 elongase. Comp Biochem Physiol B Biochem Mol Biol 159, 122-129. Carmona-Antoñanzas, G., Tocher, D.R., Taggart, J.B., Leaver, M.J., 2013. An evolutionary perspective on Elovl5 fatty acid elongase: comparison of Northern pike and duplicated paralogs from Atlantic salmon. BMC evolutionary biology 13, 85. Castro, L.F., Tocher, D.R., Monroig, O., 2016. Long-chain polyunsaturated fatty acid biosynthesis in chordates: Insights into the evolution of Fads and Elovl gene repertoire. Prog Lipid Res 62, 25-40. Dehal, P., Boore, J.L., 2005. Two rounds of whole genome duplication in the ancestral vertebrate. PLoS biology 3. Demuth, J.P., De Bie, T., Stajich, J.E., Cristianini, N., Hahn, M.W., 2006. The evolution of mammalian gene families. PloS one 1. Eckhart, L., Ballaun, C., Hermann, M., VandeBerg, J.L., Sipos, W., Uthman, A., Fischer, H., Tschachler, E., 2008. Identification of novel mammalian caspases reveals an important role of gene loss in shaping the human caspase repertoire. Molecular biology and evolution 25, 831-841. Ferraz, R.B., Kabeya, N., Lopes-Marques, M., Machado, A.M., Ribeiro, R.A., Salaro, A.L., Ozório, R., Castro, L.F.C., Monroig, Ó., 2019. A complete enzymatic capacity for long-chain polyunsaturated fatty acid biosynthesis is present in the Amazonian teleost tambaqui, Colossoma macropomum. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 227, 90-97. Guillou, H., Zadravec, D., Martin, P.G., Jacobsson, A., 2010. 
The key roles of elongases and desaturases in mammalian fatty acid metabolism: Insights from transgenic mice. Prog Lipid Res 49, 186-199. Hashimoto, K., Yoshizawa, A.C., Okuda, S., Kuma, K., Goto, S., Kanehisa, M., 2008. The repertoire of desaturases and elongases reveals fatty acid variations in 56 eukaryotic genomes. Journal of lipid research 49, 183-191. Hoegg, S., Brinkmann, H., Taylor, J.S., Meyer, A., 2004. Phylogenetic timing of the fish- specific genome duplication correlates with the diversification of teleost fish. Journal of molecular evolution 59, 190-203.


Jakobsson, A., Westerberg, R., Jacobsson, A., 2006. Fatty acid elongases in mammals: their regulation and roles in metabolism. Prog Lipid Res 45, 237-249. Jones, D.T., Taylor, W.R., Thornton, J.M., 1992. The rapid generation of mutation data matrices from protein sequences. Bioinformatics 8, 275-282. Kabeya, N., Yamamoto, Y., Cummins, S.F., Elizur, A., Yazawa, R., Takeuchi, Y., Haga, Y., Satoh, S., Yoshizaki, G., 2015. Polyunsaturated fatty acid metabolism in a marine teleost, Nibe croaker Nibea mitsukurii: Functional characterization of Fads2 desaturase and Elovl5 and Elovl4 elongases. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 188, 37-45. Kihara, A., 2012. Very long-chain fatty acids: elongation, physiology and related disorders. The journal of biochemistry 152, 387-395. Kumar, S., Stecher, G., Li, M., Knyaz, C., Tamura, K., 2018. MEGA X: molecular evolutionary genetics analysis across computing platforms. Molecular biology and evolution 35, 1547-1549. Li, Y., Wen, Z., You, C., Xie, Z., Tocher, D.R., Zhang, Y., Wang, S., Li, Y., 2020. Genome wide identification and functional characterization of two LC-PUFA biosynthesis elongase (elovl8) genes in rabbitfish (Siganus canaliculatus). Aquaculture, 735127. Meyer, A., Van de Peer, Y., 2005. From 2R to 3R: evidence for a fish‐specific genome duplication (FSGD). Bioessays 27, 937-945. Ministério da Agricultura, B., 2016. Produção da pecuária municipal / IBGE. , in: G.d.B.e.A. Especiais (Ed.), RJ-IBGE/85-29(rev. 2016). Monroig, O., Lopes-Marques, M., Navarro, J.C., Hontoria, F., Ruivo, R., Santos, M.M., Venkatesh, B., Tocher, D.R., Castro, L.F., 2016a. Evolutionary functional elaboration of the elovl2/5 gene family in chordates. Sci Rep 6, 20510. Monroig, Ó., Lopes-Marques, M., Navarro, J.C., Hontoria, F., Ruivo, R., Santos, M.M., Venkatesh, B., Tocher, D.R., Castro, L.F.C., 2016b. Evolutionary functional elaboration of the elovl2/5 gene family in chordates. 
Scientific reports 6, 20510. Monroig, O., Rotllant, J., Cerda-Reverter, J.M., Dick, J.R., Figueras, A., Tocher, D.R., 2010. Expression and role of Elovl4 elongases in biosynthesis of very long-chain fatty acids during zebrafish Danio rerio early embryonic development. Biochim Biophys Acta 1801, 1145-1154. Monroig, Ó., Wang, S., Zhang, L., You, C., Tocher, D.R., Li, Y., 2012. Elongation of long- chain fatty acids in rabbitfish Siganus canaliculatus: Cloning, functional characterisation and tissue distribution of Elovl5- and Elovl4-like elongases. Aquaculture 350-353, 63-70. Monroig, Ó., Webb, K., Ibarra-Castro, L., Holt, G.J., Tocher, D.R., 2011. Biosynthesis of long- chain polyunsaturated fatty acids in marine fish: Characterization of an Elovl4-like elongase from cobia Rachycentron canadum and activation of the pathway during early life stages. Aquaculture 312, 145-153. Morais, S., Monroig, O., Zheng, X., Leaver, M.J., Tocher, D.R., 2009a. Highly unsaturated fatty acid synthesis in Atlantic salmon: characterization of Elovl5- and Elovl2-like elongases. Mar Biotechnol (NY) 11, 627-639. Morais, S., Monroig, O., Zheng, X., Leaver, M.J., Tocher, D.R., 2009b. Highly unsaturated fatty acid synthesis in Atlantic salmon: characterization of ELOVL5-and ELOVL2-like elongases. Marine Biotechnology 11, 627-639. Naganuma, T., Sato, Y., Sassa, T., Ohno, Y., Kihara, A., 2011. Biochemical characterization of the very long-chain fatty acid elongase ELOVL7. FEBS Lett 585, 3337-3341. Oboh, A., 2018. Investigating the long-chain polyunsaturated fatty acid biosynthesis of the African catfish Clarias gariepinus (Burchell, 1822). Oboh, A., Kabeya, N., Carmona-Antoñanzas, G., Castro, L.F.C., Dick, J.R., Tocher, D.R., Monroig, O., 2017b. Two alternative pathways for docosahexaenoic acid (DHA, 22: 6n-3) biosynthesis are widespread among teleost fish. Scientific reports 7, 3889.


Oboh, A., Navarro, J.C., Tocher, D.R., Monroig, O., 2017a. Elongation of very long-chain (> C24) fatty acids in Clarias gariepinus: Cloning, functional characterization and tissue expression of elovl4 elongases. Lipids 52, 837-848. Ofman, R., Dijkstra, I.M., van Roermund, C.W., Burger, N., Turkenburg, M., van Cruchten, A., van Engen, C.E., Wanders, R.J., Kemp, S., 2010. The role of ELOVL1 in very long‐ chain fatty acid homeostasis and X‐linked adrenoleukodystrophy. EMBO molecular medicine 2, 90-97. Oliveira, A., Martinelli, L.A., Moreira, M.Z., Soares, M., Cyrino, J.E.P., 2006. Seasonality of energy sources of Colossoma macropomum in a floodplain lake in the Amazon–lake Camaleão, Amazonas, Brazil. Fisheries Management and Ecology 13, 135-142. Pasquier, J., Cabau, C., Nguyen, T., Jouanno, E., Severac, D., Braasch, I., Journot, L., Pontarotti, P., Klopp, C., Postlethwait, J.H., 2016. Gene evolution and gene expression after whole genome duplication in fish: the PhyloFish database. BMC genomics 17, 368. Shen, Y., Yue, G., 2018. Current status of research on aquaculture genetics and genomics- information from ISGA 2018. Aquaculture and Fisheries. Sherry, D.M., Deak, F., Anderson, R.E., Fessler, J.L., 2019. Novel Cellular Functions of Very Long Chain-Fatty Acids: Insight from ELOVL4 Mutations. Frontiers in Cellular Neuroscience 13, 428. Silva, J.A.M., Pereira-Filho, M., M. Oliveira Pereira, M.I., 2000. Seasonal variation of nutrients and energy in tambaqui’s (colossoma macropomum cuvier, 1818) natural food. Rev. Brasil. Biol. 60(4), 599-605. Sprecher, H., 2000. Metabolism of highly unsaturated n-3 and n-6 fatty acids. Biochimica et Biophysica Acta (BBA)-Molecular and Cell Biology of Lipids 1486, 219-231. Tocher, D.R., Dick, J.R., 2000. Essential fatty acid deficiency in freshwater fish: the effects of linoleic, α-linolenic, γ-linolenic and stearidonic acids on the metabolism of [1-14C] 18: 3n-3 in a carp cell culture model. Fish Physiology and Biochemistry 22, 67-75. 
Toloza-Villalobos, J., Arroyo, J.I., Opazo, J.C., 2015. The circadian clock of teleost fish: a comparative analysis reveals distinct fates for duplicated genes. Journal of molecular evolution 80, 57-64. Valladão, G.M.R., Gallani, S.U., Pilarski, F., 2018. South American fish for continental aquaculture. Reviews in Aquaculture 10, 351-369. Xue, X., Feng, C.Y., Hixson, S.M., Johnstone, K., Anderson, D.M., Parrish, C.C., Rise, M.L., 2014. Characterization of the fatty acyl elongase (elovl) gene family, and hepatic elovl and delta-6 fatty acyl desaturase transcript expression and fatty acid responses to diets containing camelina oil in Atlantic cod (Gadus morhua). Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 175, 9-22. Yan, J., Liang, X., Cui, Y., Cao, X., Gao, J., 2017. Elovl4 can effectively elongate C18 polyunsaturated fatty acids in loach Misgurnus anguillicaudatus. Biochemical and biophysical research communications. Zadravec, D., Brolinson, A., Fisher, R.M., Carneheim, C., Csikasz, R.I., Bertrand-Michel, J., Borén, J., Guillou, H., Rudling, M., Jacobsson, A., 2010. Ablation of the very-long- chain fatty acid elongase ELOVL3 in mice leads to constrained lipid storage and resistance to diet-induced obesity. The FASEB Journal 24, 4366-4377. Zhao, N., Monroig, Ó., Navarro, J.C., Xiang, X., Li, Y., Du, J., Li, J., Xu, W., Mai, K., Ai, Q., 2019. Molecular cloning, functional characterization and nutritional regulation of two elovl4b elongases from rainbow trout (Oncorhynchus mykiss). Aquaculture 511, 734221. Zhou, J., Li, W., Kamei, H., Duan, C., 2008. Duplication of the IGFBP-2 gene in teleost fish: protein structure and functionality conservation and gene expression divergence. PloS one 3.


Chapter 7

General discussion


The importance of fish for human health is not restricted to their value as a protein source. Their content of essential fatty acids (EFA) and long-chain polyunsaturated fatty acids (LC-PUFA), such as docosahexaenoic and eicosapentaenoic acids (DHA and EPA), is also of the utmost importance (Calder 2014; Simopoulos 2000). These nutrients play vital roles in cardiovascular function and neuronal development, and their inclusion in human diets reduces the risk of disease and improves overall outcomes (Lands 2014). For this reason, many global and national health agencies, associations and government bodies have issued recommendations for DHA/EPA intake in human populations (EPA, 2019). Farmed fish contribute increasingly to the human food basket (FAO/OECD 2017), and consumers are increasingly concerned about the quality and origin of the fish they eat. Traditionally, diets for farmed fish are formulated with fishmeal (FM) and fish oil (FO), which are rich in n-3 LC-PUFA (Tocher 2015). However, economic and environmental factors have driven the replacement of dietary FM and FO by alternative ingredients, particularly plant meals and vegetable oils (VO), over the last 15-20 years (Naylor et al. 2009; Gatlin III et al. 2007; Hardy 2010; Turchini, Ng, and Tocher 2010). A high dietary VO content, in turn, causes a significant deficiency in essential fatty acids such as n-3 LC-PUFA (Tocher 2015). Studying LC-PUFA metabolism in farmed teleosts has therefore become essential to guarantee adequate levels of these fatty acids in aquaculture species. Some fish species can convert essential dietary PUFA such as linoleic acid (LA; 18:2n-6) and α-linolenic acid (ALA; 18:3n-3) to LC-PUFA through the action of fatty acyl desaturase (Fads) and elongation of very long-chain fatty acid (Elovl) enzymes (Monroig et al., 2018). This biosynthetic capacity varies with the presence and activity of these enzymes in each species. In general, freshwater fish have a greater ability to convert C18 PUFA to LC-PUFA than marine species (Tocher, 2010). However, this pattern cannot be generalized, since teleosts display great variability in these genes as a result of duplication and loss events during evolution (Castro et al., 2016). The ability to biosynthesize LC-PUFA, and the evolutionary processes that shaped it in each species, seems to be intrinsically related to one or more factors, including trophic level, habitat (freshwater vs. marine) and trophic ecology (Garrido et al., 2019; Trushenski and Rombenso, 2019). Hence the importance of studying aquaculture species to determine their repertoire of elongases and desaturases and thus their ability to biosynthesize LC-PUFA.


7.1. LC- and VLC-PUFA biosynthesis in tambaqui

Before this thesis, various observations supported the hypothesis that tambaqui endogenously elongates and desaturates LA and ALA to their corresponding end products (ARA, EPA and DHA) (De Almeida et al., 2011; De Almeida, 2006; Paulino et al., 2018; Pereira et al., 2017), as observed in some freshwater fish (Senadheera et al., 2011; Tan et al., 2009; Tian et al., 2016; Zeng et al., 2016). These observations derived from growth trials with different dietary formulations that analysed growth performance, feed utilization, body composition, lipid profile, enzyme activities and plasma metabolite profiles. However, the precise molecular landscape, i.e. the gene networks supporting the conversion of EFA such as LA and ALA into physiologically active LC-PUFA such as EPA, ARA and DHA, was unknown in tambaqui. We therefore began this research program by establishing the repertoire and function of the desaturases (Fads) and elongases (Elovl) with pivotal roles in LC-PUFA biosynthesis (Chapter 2). Using a combination of strategies, we demonstrated that tambaqui has all the desaturase and elongase capacities required to convert C18 PUFA into the LC-PUFA ARA, EPA and DHA. Among fads, our data show that tambaqui possesses a single fads-like gene, which phylogenetic analysis established as an orthologue of fads2. Having established the orthology of Fads2, we next examined its desaturation capacity in a heterologous system. We found that the single-copy tambaqui fads2 orthologue encodes an enzyme capable of Δ6, Δ5 and Δ8 desaturation. In detail, tambaqui Fads2 converts 18:3n-3 and 18:2n-6 to 18:4n-3 and 18:3n-6, respectively (Δ6 desaturase activity); 20:4n-3 and 20:3n-6 to 20:5n-3 and 20:4n-6, respectively (Δ5 desaturase activity); and 20:2n-6 and 20:3n-3 to 20:3n-6 and 20:4n-3, respectively (Δ8 desaturase activity).
Interestingly, the latter activity (Δ8) is apparently very low, which implies that tambaqui Fads2 can be considered a classic dual Δ6Δ5 desaturase. This functional archetype was originally described in D. rerio (Hastings et al., 2001) and later found in various species, including Siganus canaliculatus (Li et al., 2010), Oreochromis niloticus (Tanomman et al., 2013), Channa striata (Kuah et al., 2016) and Clarias gariepinus (Oboh et al., 2016). Among Elovl, we identified four elovl genes (elovl2, elovl5, elovl4a and elovl4b) with well-known roles in PUFA biosynthesis. In Chapter 2, tambaqui Elovl5 showed elongation capacity towards C18 and C20 PUFA, an activity complemented by Elovl2, which elongated C20 and C22 PUFA substrates. Although Elovl5 is found in most teleosts (Castro et al., 2016), Elovl2 displays a more restricted phylogenetic distribution. In fact, it has most likely been lost from some teleost lineages but retained in others, such as Cypriniformes (Monroig et al., 2009), Siluriformes (Oboh et al., 2016) and Salmoniformes (Gregory and James, 2014). It is hypothesized that the retention of elovl2 preserved the capacity to biosynthesize LC-PUFA and thus favoured the successful colonization of freshwater habitats (Morais et al., 2009). The retention of elovl2 in tambaqui therefore parallels the species' success in inhabiting freshwater environments with respect to its nutritional requirements. This scenario is consistent with its endogenous ability to biosynthesize LC-PUFA and to adapt to an omnivorous feeding habit. Taken together, the presence of these genes shows that tambaqui has the complete molecular machinery and activities required to elongate and desaturate C18 fatty acid precursors (LA and ALA) and thus produce ARA, EPA and DHA. To complement the elongase activities studied previously, in Chapter 5 we expanded our analysis to the biosynthesis of very long-chain fatty acids (VLC-FA). The latter are found in most animals and constitute a group of fatty acids with chain lengths normally ranging from C26 to C40 that can be saturated, monounsaturated or polyunsaturated (Poulos 1995). Among the elongases, Elovl4 has been shown to be a critical enzyme in the biosynthesis of both very long-chain saturated fatty acids (VLC-SFA) and very long-chain polyunsaturated fatty acids (VLC-PUFA) (Agbaga, Mandal, and Anderson 2010; Agbaga et al. 2010). Until the present PhD study, elongation of very long-chain fatty acid 4 (elovl4) gene(s) had not been described in tambaqui. In Chapter 5, we identified, isolated and characterized Elovl4 in tambaqui. We showed that, as in other teleost species, two elovl4 paralogues, termed elovl4a and elovl4b, are present in tambaqui. In most teleosts examined to date, only Elovl4b showed activity towards PUFA (Monroig et al., 2010; Kabeya et al.
2015; Li et al. 2017; Yan et al. 2017; Zhao et al. 2019; Betancor et al. 2020); in tambaqui, the two Elovl4 enzymes together cover the biosynthesis of VLC-SFA and VLC-PUFA. Elovl4a participates in the biosynthesis of VLC-PUFA of up to 36 carbons, and Elovl4b of up to 34 carbons. Similar results have been reported for other teleosts such as C. gariepinus and A. schlegelii, in which both Elovl4 enzymes (Elovl4a and Elovl4b) contribute to VLC-PUFA biosynthesis (Jin et al., 2017; Oboh et al., 2017a). Furthermore, Elovl4 can contribute to the synthesis of 24:5n-3, a precursor for DHA biosynthesis via the Sprecher pathway (Oboh et al., 2017b; Sprecher, 2000), thus complementing the Elovl2 activity described previously. In contrast to VLC-PUFA, activity towards VLC-SFA was detected in tambaqui only for Elovl4b, specifically for 28:0 and 30:0, and not for Elovl4a. These results differ from those reported in D. rerio, in which Elovl4a, but not Elovl4b, was found to play the major role in VLC-SFA biosynthesis (Monroig et al., 2010).
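
As an illustration only, the Fads2 and Elovl activities summarised in this section can be written as a small substrate-to-product map and searched for a route from a C18 precursor to an LC-PUFA. The sketch below is not part of the thesis analyses. It is hypothetical Python whose function names (`elongate`, `steps_from`, `shortest_route`) are invented here, and it assumes that each elongation step adds two carbons without changing the number of double bonds or the n-series, and that chain length alone decides whether Elovl5 (C18, C20) or Elovl2 (C20, C22) accepts a substrate.

```python
from collections import deque

# Desaturation steps reported in the text for tambaqui Fads2.
DESATURATIONS = {
    "18:3n-3": ("18:4n-3", "Fads2 (Δ6)"),
    "18:2n-6": ("18:3n-6", "Fads2 (Δ6)"),
    "20:4n-3": ("20:5n-3", "Fads2 (Δ5)"),
    "20:3n-6": ("20:4n-6", "Fads2 (Δ5)"),
    "20:2n-6": ("20:3n-6", "Fads2 (Δ8)"),
    "20:3n-3": ("20:4n-3", "Fads2 (Δ8)"),
}

# Substrate chain lengths per elongase, as summarised in the text.
# Assumption: chain length is the only criterion for acceptance.
ELONGASE_SUBSTRATES = {"Elovl5": (18, 20), "Elovl2": (20, 22)}


def elongate(fatty_acid):
    """Assumed elongation rule: add two carbons, keep bonds and series."""
    carbons, rest = fatty_acid.split(":", 1)
    return f"{int(carbons) + 2}:{rest}"


def steps_from(fatty_acid):
    """Yield (product, enzyme) pairs reachable in one step."""
    if fatty_acid in DESATURATIONS:
        yield DESATURATIONS[fatty_acid]
    carbons = int(fatty_acid.split(":", 1)[0])
    for enzyme, lengths in ELONGASE_SUBSTRATES.items():
        if carbons in lengths:
            yield elongate(fatty_acid), enzyme


def shortest_route(start, target):
    """Breadth-first search for the shortest enzymatic route, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        fatty_acid, route = queue.popleft()
        if fatty_acid == target:
            return route
        for product, enzyme in steps_from(fatty_acid):
            if product not in seen:
                seen.add(product)
                queue.append((product, route + [(fatty_acid, enzyme, product)]))
    return None


if __name__ == "__main__":
    for substrate, enzyme, product in shortest_route("18:3n-3", "20:5n-3"):
        print(f"{substrate} --{enzyme}--> {product}")
```

Under these assumptions, the search from 18:3n-3 (ALA) to 20:5n-3 (EPA) passes through a Δ6 desaturation, an Elovl5 elongation and a Δ5 desaturation, and the same search from 18:2n-6 (LA) reaches 20:4n-6 (ARA). DHA is not modelled because the text does not list the remaining steps of the Sprecher pathway.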

7.2. Expression of LC-PUFA metabolism genes in juvenile tambaqui

The analysis of elongation and desaturation capacities at the molecular level demonstrates that the earlier empirical observations are indeed grounded in the gene repertoire of tambaqui. Another important question is how dietary modifications in animal production settings alter these pathways at the molecular level. In particular, this entails dissecting the genetic cascades that regulate lipid metabolism and fatty acid biosynthesis. We therefore examined global lipid-related gene pathways in tambaqui under different dietary scenarios (Chapter 4). The study explored the effect of different dietary levels of vegetable oil (VO) and fish oil (FO) on the expression of lipid metabolism genes in two tissues, liver and brain. Liver was selected because it plays a central role in lipid metabolism, contributing to the biosynthesis of EPA, DHA and ARA (Dyall, 2015). LC-PUFA are also essential for brain function, since DHA plays a pivotal role in the growth and function of nervous tissue (Innis, 2008). To guarantee a sufficient supply of DHA to neural tissues, particularly during critical early developmental stages, fish Fads2, Elovl2 and Elovl4 can be activated to maintain minimum levels of these LC-PUFA (Monroig et al., 2018). Typically, quantitative PCR (real-time PCR) analyses across tissue panels show that genes involved in the LC-PUFA biosynthetic pathway are highly expressed in liver and brain (Abdul Hamid et al., 2016; Geay et al., 2016; Morais et al., 2009). Accordingly, juvenile tambaqui were assayed after 9 weeks of feeding on four experimental diets that varied in oil source (fish oil, FO, or vegetable oil, VO) and inclusion level (5% or 10%). Growth and survival rates did not differ between treatments. Importantly, our results showed that when tambaqui were fed VO, fads2 and elovl5 were up-regulated in the liver, while fads2 and elovl2 were up-regulated in the brain.
This molecular scenario is in line with the observation that tambaqui can feed efficiently on VO sources (rich in LA and ALA), since genes related to LC-PUFA biosynthesis are up-regulated to guarantee adequate physiological levels of essential LC-PUFA (e.g. EPA, DHA and ARA). In addition to LC-PUFA biosynthesis genes, the role of transcription factors (TF) was also investigated, since PUFA act on TF and thereby affect the transcription of many genes involved in lipid metabolism, including desaturases and elongases (Leaver et al. 2007). For example, the presumed natural ligands of Ppars include various fatty acids, with a general preference for PUFA (You et al., 2017), and Ppars have been associated with the modulation of elovl expression and of fads2, which encodes the first rate-limiting enzyme of LC-PUFA biosynthesis in fish (You et al., 2017). Ppars have therefore been recognized as general fatty acid sensors and master regulators of lipid metabolism. Ppar isotypes have been identified and cloned from several fish, including zebrafish (Danio rerio) (Ibabe et al., 2002), Atlantic salmon (Salmo salar) (Sundvold et al., 2010), brown trout (Salmo trutta) (Batista-Pinto et al., 2005), rabbitfish (Siganus canaliculatus) (Li et al., 2019; You et al., 2017), plaice (Pleuronectes platessa), gilthead sea bream (Sparus aurata) (Leaver et al., 2005), European sea bass (Dicentrarchus labrax) (Boukouvala et al., 2004), Japanese seabass (Lateolabrax japonicus) (Dong et al., 2015), cobia (Rachycentron canadum) (Tsai et al., 2008) and red sea bream (Pagrus major) (Oku and Umino, 2008). These studies demonstrated that hepatic pparα, pparβ and pparγ are more highly expressed in fish fed VO diets than in fish fed FO diets, and an accompanying increase in the expression of fads and elovl genes has been reported. These results suggest that Ppars are involved in the up-regulation of LC-PUFA biosynthesis, since VO is poor in LC-PUFA but supplies the C18 PUFA precursors ALA and LA (Dong et al., 2015), which some fish species can elongate and desaturate, as noted above. This probably accounts for the observation that, in tambaqui, pparβb and pparγ were up-regulated in the brain under the VO diet compared with the FO diet. Similarly, increased liver expression of ppars has been observed in rainbow trout (Oncorhynchus mykiss), blunt snout bream (Megalobrama amblycephala) and Japanese seabass (Lateolabrax japonicus) (Dong et al., 2015; Li et al., 2015; Vestergren et al., 2013; You et al., 2017).
Each Ppar isotype has a distinct biological function: Pparα and Pparβ are known for their role in fatty acid catabolism, whereas Pparγ controls lipid storage and adipogenesis. Nevertheless, in tambaqui, lipid metabolism genes were in general up-regulated by VO lipid sources, contributing to the biosynthesis of LC-PUFA and, in particular, of DHA in the brain. Ppars might therefore play an important role in regulating LC-PUFA synthesis. Overall, tambaqui is able to regulate the genes of the lipid metabolism pathway when fed VO, so dietary FO could probably be replaced with unconventional VO sources. Dietary fatty acids have been reported to affect fatty acid metabolism (Tocher et al., 2001; Turchini et al., 2009). As described above, we detailed how dietary lipid source modulates, in vivo, the molecular pathways involved in fatty acid metabolism in tambaqui juveniles. Other fish species respond similarly when dietary FO is totally or partially replaced by VO. Growth performance was not negatively affected by replacing FO with VO in several fish species, such as S. canaliculatus (Xu et al., 2012); darkbarbel catfish, Pelteobagrus vachelli (Jiang et al., 2013); hybrid tilapia, Oreochromis niloticus × O. aureus (Han et al., 2013); olive flounder, Paralichthys olivaceus (Qiao et al., 2014); rainbow trout, Oncorhynchus mykiss (Yildiz et al., 2015); largemouth bass, Micropterus salmoides (Chen et al., 2020); and Nile tilapia, O. niloticus (Ayisi et al., 2018). The conversion of PUFA to LC-PUFA is more pronounced when fish have previously been fed vegetable sources. This is the case in Maccullochella peelii peelii, Perca fluviatilis, M. amblycephala, M. salmoides, D. labrax, Oreochromis sp. and Solea senegalensis (Turchini, Francis, and De Silva 2006; Blanchard, Makombu, and Kestemont 2008; Li et al. 2016; Turchini et al. 2011; Chen et al. 2020; Castro et al. 2015; Teoh and Ng 2016; Conde-Sieira et al. 2018; Corrêa et al. 2018). To our knowledge, no information was previously available on the expression of genes involved in fatty acid metabolism in tambaqui fed VO diets. Our results showed that dietary VO influenced the fatty acid metabolism of tambaqui. Studies of this type are needed by the aquaculture sector to improve the formulation of fish feeds that use alternative oil sources yet maintain not only growth performance but also appropriate levels of LC-PUFA, thus supporting the sustainable and economically viable growth of aquaculture and the quality of food fish.
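
The feeding-trial design and the tissue-specific responses described in this section can be laid out compactly, as in the following illustrative sketch. It assumes that the four diets are the full crossing of the two oil sources with the two inclusion levels. The diet labels (`FO5`, `VO10`, and so on) and the function names are hypothetical and do not come from the thesis, and the gene sets only restate the up-regulation under VO reported in the text.

```python
from itertools import product

OIL_SOURCES = ("FO", "VO")   # fish oil, vegetable oil
INCLUSION_LEVELS = (5, 10)   # percent of the diet


def build_diets():
    """Assumed 2 x 2 factorial design: every source at every level."""
    return [f"{source}{level}" for source, level in product(OIL_SOURCES, INCLUSION_LEVELS)]


# Genes reported in the text as up-regulated when tambaqui were fed VO.
UP_REGULATED_ON_VO = {
    "liver": {"fads2", "elovl5"},
    "brain": {"fads2", "elovl2", "pparβb", "pparγ"},
}


def shared_genes(by_tissue):
    """Genes up-regulated in every tissue."""
    return set.intersection(*by_tissue.values())


def tissue_specific(by_tissue):
    """Genes up-regulated in one tissue but not in all of them."""
    shared = shared_genes(by_tissue)
    return {tissue: genes - shared for tissue, genes in by_tissue.items()}


if __name__ == "__main__":
    print(build_diets())
    print(shared_genes(UP_REGULATED_ON_VO))
    print(tissue_specific(UP_REGULATED_ON_VO))
```

Laid out this way, fads2 is the only gene in the reported sets that responds to VO in both tissues, while elovl5 is specific to the liver set and elovl2 and the two ppar genes to the brain set.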

7.3. Genomic resources in tambaqui

In recent years, functional genomics, proteomics and metabolomics have been increasingly used to characterize the reproduction, development, nutrition, immunity and toxicology of fish, with the aim of improving aquaculture species (e.g. Cerda and Manchado 2013). With significant advances in sequencing technology, rapid progress has been made in whole-genome sequencing of aquaculture species (Maduna et al., 2020). These knowledge-intensive (omics) approaches have made production more efficient and sustainable and have improved the quality of the final product for consumers. RNA-seq sequences transcriptomes using next-generation sequencing technologies and is widely used for gene expression profiling and the identification of differentially expressed genes. In aquaculture, transcriptome profiling has been widely used to identify candidate genes involved in growth, reproduction, development, immunity, disease, stress and toxicology and to analyse their expression (Chandhini and Rejish Kumar, 2019). Transcriptome resources, including de novo transcriptome analyses, are now available for many important aquaculture species, including Salmo salar (Rise et al., 2004), Sparus aurata (Sarropoulou et al., 2005), Lates calcarifer (Xia and Yue, 2010), Oreochromis niloticus (Zhang et al., 2013), Epinephelus coioides (Huang et al., 2011), Ctenopharyngodon idella (Chen et al., 2012), Dicentrarchus labrax (Geay et al., 2011), Megalobrama amblycephala (Gao et al., 2012), Oncorhynchus mykiss (Salem et al., 2010), Gymnocypris przewalskii (Tong et al., 2015) and Carassius auratus (Liao et al., 2013). To add to the existing omics data for tambaqui, we produced two transcriptomes, from liver (Chapter 3) and brain (Chapter 4). These resources will provide powerful tools for the research community and will aid in determining the genetic factors involved in the regulation of complex traits. For example, the RNA-seq data generated here were essential for isolating the genes studied in this thesis. Omics datasets can offer considerable insight into the biological responses of aquaculture animals to diet and nutrition (Rise et al., 2019). For example, Illumina-based RNA-seq and qPCR (with liver templates) were used to investigate the impact of a 12-week feeding trial with high-ALA versus high-LA diets on the economically important Japanese seabass (Lateolabrax japonicus) (Xu et al., 2019). Similarly, the effects of dietary arachidonic acid level (ARA-free versus moderate ARA) on juvenile grass carp (Ctenopharyngodon idella) were studied with these sequencing techniques (Tian et al., 2019). In farmed Atlantic salmon (Salmo salar), studies of the impact of terrestrial animal and plant products on nutrition and health produced new findings that could help formulate superior feeds for the aquaculture industry (Caballero-Solares et al., 2017; Caballero-Solares et al., 2018). Also in salmon, integrating transcriptome and fatty acid composition data revealed how diet influences the synthesis of LC-PUFA (Katan et al., 2019).
These studies provided valuable information on the influence of different dietary 18-carbon polyunsaturated fatty acids (PUFA) on hepatic transcription related to lipid metabolism and other biological processes in important aquaculture species, to which our work adds knowledge of tambaqui. Beyond the transcriptome, genome information and knowledge of the interaction between genotype and phenotype must also be developed for applications in aquaculture. This includes, for example, the development of genetic and physical maps, annotated genome sequences and platforms for high-throughput genotyping to identify genomic variation such as insertions/deletions, single nucleotide polymorphisms (SNPs), copy number variations and differentially methylated regions. In this context, Brazilian researchers led by Embrapa have a project to sequence the genomes of the two most produced native fish in the country, the tambaqui and the Amazon cachara (Pseudoplatystoma punctifer), also known as surubim (Embrapa, 2020). The results of that project are not yet available, and genomic information for tambaqui is currently limited. Thus, a first version of the tambaqui draft genome was made available in the current thesis (Chapter 5). In a second phase, these genomic resources can be used to develop a functional understanding of animal systems. To this end, a more complete characterization of these resources was provided in Chapter 6, improving fundamental knowledge of the structure and biology of the genome and highlighting the repertoire of elovl genes in tambaqui. Chapter 6 showed that, with all members of the elovl family (elovl1-8) present and their activities in the LC-PUFA biosynthesis pathway maintained during evolution, tambaqui was able to exploit different habitats with different food sources. With the draft genome and the liver and brain transcriptomes of tambaqui, we further demonstrated the power of this dataset by exploring the endogenous capacity of tambaqui to biosynthesize LC-PUFA. This information will also contribute to future comparative studies, notably regarding life history strategies among omnivorous teleosts that inhabit freshwater ecosystems. To establish well-managed aquaculture conditions, it is important to know the animal's nutrition and physiology and, more deeply, its genetic characteristics. Overall, this knowledge informs the definition of commercial diets that fit the metabolic profile of each species. Consequently, this approach contributes to better growth and development of farmed species in an environmentally responsible manner. In the case of tambaqui, the omics research program developed in this thesis, comprising the transcriptomes and the draft genome, was essential to understand the gene catalogue of tambaqui and how the LC/VLC-PUFA biosynthesis and regulation pathways work in this species.

7.4. Final conclusions and future perspectives

Because of its excellent adaptation to captive production, including rapid growth, low stress levels when handled and excellent acceptance by the consumer market, tambaqui is considered a new and valuable aquaculture species. Its high productivity in captivity is mainly due to the fact that it is omnivorous and readily accepts plant-based commercial diets (Shen and Yue 2018). The aquaculture industry will diversify the species it produces (that is, include new species), so adapting and developing cultivation methods is increasingly necessary. Major progress has been made in aquaculture genomics, including the development of genetic linkage maps, physical maps, microarrays, single nucleotide polymorphism (SNP) arrays, transcriptome databases and genome reference sequences at various stages of completion (Rise et al., 2019). This thesis has contributed lipid metabolism and molecular data for tambaqui. With these resources, we show that tambaqui has all the enzymes necessary to biosynthesize LC/VLC-PUFA and can therefore use the C18 PUFA present in VO to satisfy its LC-PUFA requirements. In addition, the juvenile feeding trial clarified lipid metabolism in tambaqui, showing that juveniles regulate this pathway to maintain essential levels of these EFA in vivo. A highlight of these studies is that, to our knowledge, they are the first to examine LC-PUFA metabolism genes in tambaqui in vivo. However, additional studies exploring other metabolic pathways, beyond LC-PUFA, are a prerequisite for developing an optimal approach to tambaqui production in captivity. With the advent of nutrigenomics, transcriptomics can be used effectively to analyse changes in gene expression in tambaqui fed various plant-based diets, supporting the continued growth and stability of tambaqui aquaculture production. We suggest below some possible future investigations that would help to improve tambaqui production:

1. Gene expression studies covering a wider range of genes, to understand the animal's metabolism in response to alternative plant-based diets.
2. Characterization of the tambaqui microbiota, which is known to be important for digestion as well as for protection against possible pathogens.
3. RNA-Seq studies under various experimental conditions, to better understand the physiology of the animal when fed or challenged under different circumstances.

7.5. References

Abdul Hamid, N.K., Carmona-Antonanzas, G., Monroig, O., Tocher, D.R., Turchini, G.M., Donald, J.A., 2016. Isolation and Functional Characterisation of a fads2 in Rainbow Trout (Oncorhynchus mykiss) with Delta5 Desaturase Activity. PLoS One 11, e0150770.
Agbaga, M.-P., Brush, R.S., Mandal, M.N.A., Elliott, M.H., Al-Ubaidi, M.R., Anderson, R.E., 2010a. Role of Elovl4 protein in the biosynthesis of docosahexaenoic acid, Retinal Degenerative Diseases. Springer, 233-242.
Agbaga, M.-P., Mandal, M.N.A., Anderson, R.E., 2010b. Retinal very long-chain PUFAs: new insights from studies on ELOVL4 protein. Journal of lipid research 51, 1624-1642.
Ayisi, C.L., Zhao, J., Wu, J.-W., 2018. Replacement of fish oil with palm oil: Effects on growth performance, innate immune response, antioxidant capacity and disease resistance in Nile tilapia (Oreochromis niloticus). PloS one 13.
Batista-Pinto, C., Rodrigues, P., Rocha, E., Lobo-da-Cunha, A., 2005. Identification and organ expression of peroxisome proliferator activated receptors in brown trout (Salmo trutta f. fario). Biochimica et Biophysica Acta (BBA)-Gene Structure and Expression 1731, 88-94.
Blanchard, G., Makombu, J.G., Kestemont, P., 2008. Influence of different dietary 18:3n-3/18:2n-6 ratio on growth performance, fatty acid composition and hepatic ultrastructure in Eurasian perch, Perca fluviatilis. Aquaculture 284, 144-150.
Boukouvala, E., Antonopoulou, E., Favre-Krey, L., Diez, A., Bautista, J.M., Leaver, M.J., Tocher, D.R., Krey, G., 2004. Molecular Characterization of Three Peroxisome Proliferator-Activated Receptors from the Sea Bass (Dicentrarchus labrax). Lipids 39.
Caballero-Solares, A., Hall, J.R., Xue, X., Eslamloo, K., Taylor, R.G., Parrish, C.C., Rise, M.L., 2017. The dietary replacement of marine ingredients by terrestrial animal and plant alternatives modulates the antiviral immune response of Atlantic salmon (Salmo salar). Fish & shellfish immunology 64, 24-38.
Caballero-Solares, A., Xue, X., Parrish, C.C., Foroutani, M.B., Taylor, R.G., Rise, M.L., 2018. Changes in the liver transcriptome of farmed Atlantic salmon (Salmo salar) fed experimental diets based on terrestrial alternatives to fish meal and fish oil. BMC genomics 19, 796.
Calder, P.C., 2014. Very long chain omega-3 (n-3) fatty acids and human health. European journal of lipid science and technology 116, 1280-1300.
Castro, C., Corraze, G., Panserat, S., Oliva-Teles, A., 2015. Effects of fish oil replacement by a vegetable oil blend on digestibility, postprandial serum metabolite profile, lipid and glucose metabolism of European sea bass (Dicentrarchus labrax) juveniles. Aquaculture nutrition 21, 592-603.
Castro, L.F., Tocher, D.R., Monroig, O., 2016. Long-chain polyunsaturated fatty acid biosynthesis in chordates: Insights into the evolution of Fads and Elovl gene repertoire. Prog Lipid Res 62, 25-40.
Cerda, J., Manchado, M., 2013. Advances in genomics for aquaculture. Genes & nutrition 8, 5-17.
Chandhini, S., Rejish Kumar, V.J., 2019. Transcriptomics in aquaculture: current status and applications. Reviews in Aquaculture 11, 1379-1397.
Chen, J., Li, C., Huang, R., Du, F., Liao, L., Zhu, Z., Wang, Y., 2012. Transcriptome analysis of head kidney in grass carp and discovery of immune-related genes. BMC veterinary research 8, 108.
Chen, Y., Sun, Z., Liang, Z., Xie, Y., Su, J., Luo, Q., Zhu, J., Liu, Q., Han, T., Wang, A., 2020. Effects of dietary fish oil replacement by soybean oil and l-carnitine supplementation on growth performance, fatty acid composition, lipid metabolism and liver health of juvenile largemouth bass, Micropterus salmoides. Aquaculture 516, 734596.
Conde-Sieira, M., Gesto, M., Batista, S., Linares, F., Villanueva, J.L., Míguez, J.M., Soengas, J.L., Valente, L.M., 2018. Influence of vegetable diets on physiological and immune responses to thermal stress in Senegalese sole (Solea senegalensis). PloS one 13.
Corrêa, C.F., Nobrega, R.O., Block, J.M., Fracalossi, D.M., 2018. Mixes of plant oils as fish oil substitutes for Nile tilapia at optimal and cold suboptimal temperature. Aquaculture 497, 82-90.
De Almeida, L.C., Avilez, I.M., Honorato, C.A., Hori, T.S.F., Moraes, G., 2011. Growth and metabolic responses of tambaqui (Colossoma macropomum) fed different levels of protein and lipid. Aquaculture Nutrition 17, e253-e262.
De Almeida, L.C., Lundstedt, L.M., Moraes, G., 2006. Digestive enzyme responses of tambaqui (Colossoma macropomum) fed on different levels of protein and lipid. Aquaculture Nutrition 12, 443-450.
Dong, X., Xu, H., Mai, K., Xu, W., Zhang, Y., Ai, Q., 2015. Cloning and characterization of SREBP-1 and PPAR-α in Japanese seabass Lateolabrax japonicus, and their gene expressions in response to different dietary fatty acid profiles. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 180, 48-56.
Dyall, S.C., 2015. Long-chain omega-3 fatty acids and the brain: a review of the independent and shared effects of EPA, DPA and DHA. Frontiers in aging neuroscience 7, 52.
Embrapa, E.B.d.P.A., 2020. Cientistas sequenciam genoma de peixes brasileiros. https://www.embrapa.br/busca-de-noticias/-/noticia/21279591/cientistas-sequenciam-genoma-de-peixes-brasileiros.
EPA, U.S.E.P.A., 2019. Advice About Eating Fish: For Women Who Are or Might Become Pregnant, Breastfeeding Mothers, and Young Children.
FAO/OECD, 2017. OECD-FAO Agricultural Outlook 2017-2026. Special Focus: Southeast Asia.
Gao, Z., Luo, W., Liu, H., Zeng, C., Liu, X., Yi, S., Wang, W., 2012. Transcriptome analysis and SSR/SNP markers information of the blunt snout bream (Megalobrama amblycephala). PloS one 7.
Garrido, D., Kabeya, N., Betancor, M.B., Pérez, J.A., Acosta, N.G., Tocher, D.R., Rodríguez, C., Monroig, Ó., 2019. Functional diversification of teleost Fads2 fatty acyl desaturases occurs independently of the trophic level. Scientific reports 9, 11199.
Gatlin III, D.M., Barrows, F.T., Brown, P., Dabrowski, K., Gaylord, T.G., Hardy, R.W., Herman, E., Hu, G., Krogdahl, Å., Nelson, R., 2007. Expanding the utilization of sustainable plant products in aquafeeds: a review. Aquaculture research 38, 551-579.
Geay, F., Ferraresso, S., Zambonino-Infante, J.L., Bargelloni, L., Quentel, C., Vandeputte, M., Kaushik, S., Cahu, C.L., Mazurais, D., 2011. Effects of the total replacement of fish-based diet with plant-based diet on the hepatic transcriptome of two European sea bass (Dicentrarchus labrax) half-sibfamilies showing different growth rates with the plant-based diet. BMC genomics 12, 522.
Geay, F., Tinti, E., Mellery, J., Michaux, C., Larondelle, Y., Perpète, E., Kestemont, P., 2016. Cloning and functional characterization of Δ6 fatty acid desaturase (FADS2) in Eurasian perch (Perca fluviatilis). Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 191, 112-125.
Gregory, M.K., James, M.J., 2014. Rainbow trout (Oncorhynchus mykiss) Elovl5 and Elovl2 differ in selectivity for elongation of omega-3 docosapentaenoic acid. Biochim Biophys Acta 1841, 1656-1660.
Han, C.-Y., Zheng, Q.-M., Feng, L.-N., 2013. Effects of total replacement of dietary fish oil on growth performance and fatty acid compositions of hybrid tilapia (Oreochromis niloticus × O. aureus). Aquaculture international 21, 1209-1217.
Hardy, R.W., 2010. Utilization of plant proteins in fish diets: effects of global demand and supplies of fishmeal. Aquaculture Research 41, 770-776.
Hastings, N., Agaba, M., Tocher, D.R., Leaver, M.J., Dick, J.R., Sargent, J.R., Teale, A.J., 2001. A vertebrate fatty acid desaturase with Delta 5 and Delta 6 activities. Proc Natl Acad Sci U S A 98, 14304-14309.
Huang, Y., Huang, X., Yan, Y., Cai, J., Ouyang, Z., Cui, H., Wang, P., Qin, Q., 2011. Transcriptome analysis of orange-spotted grouper (Epinephelus coioides) spleen in response to Singapore grouper iridovirus. BMC genomics 12, 556.
Ibabe, A., Grabenbauer, M., Baumgart, E., Fahimi, D.H., Cajaraville, M.P., 2002. Expression of peroxisome proliferator-activated receptors in zebrafish (Danio rerio). Histochemistry and Cell Biology 118, 231-239.
Innis, S.M., 2008. Dietary omega 3 fatty acids and the developing brain. Brain Res 1237, 35-43.
Jiang, X., Chen, L., Qin, J., Qin, C., Jiang, H., Li, E., 2013. Effects of dietary soybean oil inclusion to replace fish oil on growth, muscle fatty acid composition, and immune responses of juvenile darkbarbel catfish, Pelteobagrus vachelli. Afr. J. Agric. Res 8, 1492-1499.
Jin, M., Monroig, O., Navarro, J.C., Tocher, D.R., Zhou, Q.-C., 2017. Molecular and functional characterisation of two elovl4 elongases involved in the biosynthesis of very long-chain (>C24) polyunsaturated fatty acids in black seabream Acanthopagrus schlegelii. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 212, 41-50.
Katan, T., Caballero-Solares, A., Taylor, R.G., Rise, M.L., Parrish, C.C., 2019. Effect of plant-based diets with varying ratios of ω6 to ω3 fatty acids on growth performance, tissue composition, fatty acid biosynthesis and lipid-related gene expression in Atlantic salmon (Salmo salar). Comparative Biochemistry and Physiology Part D: Genomics and Proteomics 30, 290-304.
Lands, B., 2014. Historical perspectives on the impact of n-3 and n-6 nutrients on health. Progress in Lipid Research 55, 17-29.
Leaver, M.J., Boukouvala, E., Antonopoulou, E., Diez, A., Favre-Krey, L., Ezaz, M.T., Bautista, J.M., Tocher, D.R., Krey, G., 2005. Three peroxisome proliferator-activated receptor isotypes from each of two species of marine fish. Endocrinology 146, 3150-3162.
Leaver, M.J., Ezaz, M.T., Fontagne, S., Tocher, D.R., Boukouvala, E., Krey, G., 2007. Multiple peroxisome proliferator-activated receptor beta subtypes from Atlantic salmon (Salmo salar). J Mol Endocrinol 38, 391-400.
Li, Y., Gao, J., Huang, S., 2015. Effects of different dietary phospholipid levels on growth performance, fatty acid composition, PPAR gene expressions and antioxidant responses of blunt snout bream Megalobrama amblycephala fingerlings. Fish physiology and biochemistry 41, 423-436.
Li, Y., Liang, X., Zhang, Y., Gao, J., 2016. Effects of different dietary soybean oil levels on growth, lipid deposition, tissues fatty acid composition and hepatic lipid metabolism related gene expressions in blunt snout bream (Megalobrama amblycephala) juvenile. Aquaculture 451, 16-23.
Li, Y., Monroig, O., Zhang, L., Wang, S., Zheng, X., Dick, J.R., You, C., Tocher, D.R., 2010. Vertebrate fatty acyl desaturase with Δ4 activity. Proceedings of the National Academy of Sciences 107, 16840-16845.
Li, Y., Yin, Z., Dong, Y., Wang, S., Monroig, Ó., Tocher, D.R., You, C., 2019. Pparγ Is Involved in the Transcriptional Regulation of Liver LC-PUFA Biosynthesis by Targeting the Δ6Δ5 Fatty Acyl Desaturase Gene in the Marine Teleost Siganus canaliculatus. Marine biotechnology 21, 19-29.
Liao, X., Cheng, L., Xu, P., Lu, G., Wachholtz, M., Sun, X., Chen, S., 2013. Transcriptome analysis of crucian carp (Carassius auratus), an important aquaculture and hypoxia-tolerant species. PloS one 8, e62308.
Maduna, S.N., Vivian-Smith, A., Jónsdóttir, Ó.D.B., Imsland, A.K., Klütsch, C.F., Nyman, T., Eiken, H.G., Hagen, S.B., 2020. Genome-and transcriptome-derived microsatellite loci in lumpfish Cyclopterus lumpus: molecular tools for aquaculture, conservation and fisheries management. Scientific Reports 10, 1-11.
Monroig, O., Rotllant, J., Cerda-Reverter, J.M., Dick, J.R., Figueras, A., Tocher, D.R., 2010. Expression and role of Elovl4 elongases in biosynthesis of very long-chain fatty acids during zebrafish Danio rerio early embryonic development. Biochim Biophys Acta 1801, 1145-1154.
Monroig, O., Rotllant, J., Sanchez, E., Cerda-Reverter, J.M., Tocher, D.R., 2009. Expression of long-chain polyunsaturated fatty acid (LC-PUFA) biosynthesis genes during zebrafish Danio rerio early embryogenesis. Biochim Biophys Acta 1791, 1093-1101.
Monroig, O., Tocher, D.R., Castro, L.F.C., 2018. Polyunsaturated fatty acid biosynthesis and metabolism in fish, Polyunsaturated Fatty Acid Metabolism. Elsevier, 31-60.
Morais, S., Monroig, O., Zheng, X., Leaver, M.J., Tocher, D.R., 2009. Highly unsaturated fatty acid synthesis in Atlantic salmon: characterization of Elovl5- and Elovl2-like elongases. Mar Biotechnol (NY) 11, 627-639.
Naylor, R.L., Hardy, R.W., Bureau, D.P., Chiu, A., Elliott, M., Farrell, A.P., Forster, I., Gatlin, D.M., Goldburg, R.J., Hua, K., Nichols, P.D., 2009. Feeding aquaculture in an era of finite resources. Proceedings of the National Academy of Sciences 106, 18040-18040.
Oboh, A., Betancor, M.B., Tocher, D.R., Monroig, O., 2016. Biosynthesis of long-chain polyunsaturated fatty acids in the African catfish Clarias gariepinus: Molecular cloning and functional characterisation of fatty acyl desaturase (fads2) and elongase (elovl2) cDNAs. Aquaculture 462, 70-79.
Oboh, A., Kabeya, N., Carmona-Antoñanzas, G., Castro, L.F.C., Dick, J.R., Tocher, D.R., Monroig, O., 2017b. Two alternative pathways for docosahexaenoic acid (DHA, 22:6n-3) biosynthesis are widespread among teleost fish. Scientific reports 7, 3889.
Oboh, A., Navarro, J.C., Tocher, D.R., Monroig, O., 2017a. Elongation of very long-chain (>C24) fatty acids in Clarias gariepinus: Cloning, functional characterization and tissue expression of elovl4 elongases. Lipids 52, 837-848.
Oku, H., Umino, T., 2008. Molecular characterization of peroxisome proliferator-activated receptors (PPARs) and their gene expression in the differentiating adipocytes of red sea bream Pagrus major. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 151, 268-277.
Paulino, R.R., Pereira, R.T., Fontes, T.V., Oliva-Teles, A., Peres, H., Carneiro, D.J., Rosa, P.V., 2018. Optimal dietary linoleic acid to linolenic acid ratio improved fatty acid profile of the juvenile tambaqui (Colossoma macropomum). Aquaculture 488, 9-16.
Pereira, R.T., Paulino, R.R., de Almeida, C.A.L., Rosa, P.V., Orlando, T.M., Fortes-Silva, R., 2017. Oil sources administered to tambaqui (Colossoma macropomum): growth, body composition and effect of masking organoleptic properties and fasting on diet preference. Applied Animal Behaviour Science.
Poulos, A., 1995. Very long chain fatty acids in higher animals—a review. Lipids 30, 1-14.
Qiao, H., Wang, H., Song, Z., Ma, J., Li, B., Liu, X., Zhang, S., Wang, J., Zhang, L., 2014. Effects of dietary fish oil replacement by microalgae raw materials on growth performance, body composition and fatty acid profile of juvenile olive flounder, Paralichthys olivaceus. Aquaculture nutrition 20, 646-653.
Rise, M.L., Jones, S.R., Brown, G.D., von Schalburg, K.R., Davidson, W.S., Koop, B.F., 2004. Microarray analyses identify molecular biomarkers of Atlantic salmon macrophage and hematopoietic kidney response to Piscirickettsia salmonis infection. Physiological genomics 20, 21-35.
Rise, M.L., Martyniuk, C.J., Chen, M., 2019. Comparative physiology and aquaculture: Toward Omics-enabled improvement of aquatic animal health and sustainable production. Elsevier.
Salem, M., Rexroad, C.E., Wang, J., Thorgaard, G.H., Yao, J., 2010. Characterization of the rainbow trout transcriptome using Sanger and 454-pyrosequencing approaches. BMC genomics 11, 564.
Sarropoulou, E., Kotoulas, G., Power, D.M., Geisler, R., 2005. Gene expression profiling of gilthead sea bream during early development and detection of stress-related genes by the application of cDNA microarray technology. Physiological genomics 23, 182-191.
Senadheera, S.D., Turchini, G.M., Thanuthong, T., Francis, D.S., 2011. Effects of dietary α-linolenic acid (18:3n-3)/linoleic acid (18:2n-6) ratio on fatty acid metabolism in Murray cod (Maccullochella peelii peelii). Journal of agricultural and food chemistry 59, 1020-1030.
Shen, Y., Yue, G., 2018. Current status of research on aquaculture genetics and genomics-information from ISGA 2018. Aquaculture and Fisheries.
Simopoulos, A.P., 2000. Role of poultry products in enriching the human diet with n-3 PUFA. Poultry Science 79, 961-970.
Sprecher, H., 2000. Metabolism of highly unsaturated n-3 and n-6 fatty acids. Biochimica et Biophysica Acta (BBA)-Molecular and Cell Biology of Lipids 1486, 219-231.
Sundvold, H., Ruyter, B., Østbye, T.-K., Moen, T., 2010. Identification of a novel allele of peroxisome proliferator-activated receptor gamma (PPARG) and its association with resistance to Aeromonas salmonicida in Atlantic salmon (Salmo salar). Fish & shellfish immunology 28, 394-400.
Tan, X.-y., Luo, Z., Xie, P., Liu, X.-j., 2009. Effect of dietary linolenic acid/linoleic acid ratio on growth performance, hepatic fatty acid profiles and intermediary metabolism of juvenile yellow catfish Pelteobagrus fulvidraco. Aquaculture 296, 96-101.
Tanomman, S., Ketudat-Cairns, M., Jangprai, A., Boonanuntanasarn, S., 2013. Characterization of fatty acid delta-6 desaturase gene in Nile tilapia and heterogenous expression in Saccharomyces cerevisiae. Comp Biochem Physiol B Biochem Mol Biol 166, 148-156.
Teoh, C.-Y., Ng, W.-K., 2016. The implications of substituting dietary fish oil with vegetable oils on the growth performance, fillet fatty acid profile and modulation of the fatty acid elongase, desaturase and oxidation activities of red hybrid tilapia, Oreochromis sp. Aquaculture 465, 311-322.
Tian, J.-j., Lei, C.-x., Ji, H., Zhou, J.-s., Yu, H.-b., Li, Y., Yu, E.-m., Xie, J., 2019. Dietary arachidonic acid decreases the expression of transcripts related to adipocyte development and chronic inflammation in the adipose tissue of juvenile grass carp, Ctenopharyngodon idella. Comparative Biochemistry and Physiology Part D: Genomics and Proteomics 30, 122-132.
Tian, J.J., Lei, C.X., Ji, H., 2016. Influence of dietary linoleic acid (18:2n-6) and α-linolenic acid (18:3n-3) ratio on fatty acid composition of different tissues in freshwater fish Songpu mirror carp, Cyprinus carpio. Aquaculture Research 47, 3811-3825.
Tocher, D.R., 2010. Fatty acid requirements in ontogeny of marine and freshwater fish. Aquaculture Research 41, 717-732.
Tocher, D.R., 2015. Omega-3 long-chain polyunsaturated fatty acids and aquaculture in perspective. Aquaculture 449, 94-107.
Tocher, D.R., Bell, J.G., MacGlaughlin, P., McGhee, F., Dick, J.R., 2001. Hepatocyte fatty acid desaturation and polyunsaturated fatty acid composition of liver in salmonids: effects of dietary vegetable oil. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 130, 257-270.
Tong, C., Zhang, C., Zhang, R., Zhao, K., 2015. Transcriptome profiling analysis of naked carp (Gymnocypris przewalskii) provides insights into the immune-related genes in highland fish. Fish & shellfish immunology 46, 366-377.
Trushenski, J.T., Rombenso, A.N., 2019. Trophic levels predict the nutritional essentiality of polyunsaturated fatty acids in fish—introduction to a special section and a brief synthesis. North American Journal of Aquaculture.
Tsai, M.-L., Chen, H.-Y., Tseng, M.-C., Chang, R.-C., 2008. Cloning of peroxisome proliferators activated receptors in the cobia (Rachycentron canadum) and their expression at different life-cycle stages under cage aquaculture. Gene 425, 69-78.
Turchini, G., Francis, D., Senadheera, S., Thanuthong, T., De Silva, S., 2011. Fish oil replacement with different vegetable oils in Murray cod: evidence of an “omega-3 sparing effect” by other dietary fatty acids. Aquaculture 315, 250-259.
Turchini, G.M., Francis, D.S., De Silva, S.S., 2006. Fatty acid metabolism in the freshwater fish Murray cod (Maccullochella peelii peelii) deduced by the whole-body fatty acid balance method. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 144, 110-118.
Turchini, G.M., Ng, W.-K., Tocher, D.R., 2010. Fish oil replacement and alternative lipid sources in aquaculture feeds. CRC Press.
Turchini, G.M., Torstensen, B.E., Ng, W.K., 2009. Fish oil replacement in finfish nutrition. Reviews in Aquaculture 1, 10-57.
Vestergren, A.S., Trattner, S., Pan, J., Johnsson, P., Kamal-Eldin, A., Brännäs, E., Moazzami, A.A., Pickova, J., 2013. The effect of combining linseed oil and sesamin on the fatty acid composition in white muscle and on expression of lipid-related genes in white muscle and liver of rainbow trout (Oncorhynchus mykiss). Aquaculture international 21, 843-859.
Xia, J.H., Yue, G.H., 2010. Identification and analysis of immune-related transcriptome in Asian seabass Lates calcarifer. BMC genomics 11, 356.
Xu, H., Liao, Z., Wang, C., Wei, Y., Liang, M., 2019. Hepatic transcriptome of the teleost Japanese seabass (Lateolabrax japonicus) fed diets characterized by α-linolenic acid or linoleic acid. Comparative Biochemistry and Physiology Part D: Genomics and Proteomics 29, 106-116.
Xu, S., Wang, S., Zhang, L., You, C., Li, Y., 2012. Effects of replacement of dietary fish oil with soybean oil on growth performance and tissue fatty acid composition in marine herbivorous teleost Siganus canaliculatus. Aquaculture Research 43, 1276-1286.
Yildiz, M., Köse, İ., Issa, G., Kahraman, T., 2015. Effect of different plant oils on growth performance, fatty acid composition and flesh quality of rainbow trout (Oncorhynchus mykiss). Aquaculture research 46, 2885-2896.
You, C., Jiang, D., Zhang, Q., Xie, D., Wang, S., Dong, Y., Li, Y., 2017. Cloning and expression characterization of peroxisome proliferator-activated receptors (PPARs) with their agonists, dietary lipids, and ambient salinity in rabbitfish Siganus canaliculatus. Comp Biochem Physiol B Biochem Mol Biol 206, 54-64.
Zeng, Y.Y., Jiang, W.D., Liu, Y., Wu, P., Zhao, J., Jiang, J., Kuang, S.Y., Tang, L., Tang, W.N., Zhang, Y.A., 2016. Optimal dietary alpha-linolenic acid/linoleic acid ratio improved digestive and absorptive capacities and target of rapamycin gene expression of juvenile grass carp (Ctenopharyngodon idellus). Aquaculture Nutrition 22, 1251-1266.
Zhang, R., Zhang, L.-l., Ye, X., Tian, Y.-y., Sun, C.-f., Lu, M.-x., Bai, J.-j., 2013. Transcriptome profiling and digital gene expression analysis of Nile tilapia (Oreochromis niloticus) infected by Streptococcus agalactiae. Molecular biology reports 40, 5657-5668.

Supplementary material

Supplementary Tables

Supplementary Table 2.1. Primer sets and corresponding PCR conditions used in the cloning of the C. macropomum fads2, elovl5 and elovl2 ORF sequences.

Gene    Primer set function          Primer name              Primer sequence                    Cycles  TM     Extension (size bp)
FADS2   Degenerate primers           FADS2_degen_F            GCGCCTCCGCCAAytggtggaayc           40      54 °C  72 °C/10 s
FADS2   Degenerate primers           FADS2_degen_R            TGGCCGGAGAACcartcrttraa
FADS2   Gene specific race           Cma_FADS2_3RACE_F        GGAGTCTTCGGATCATTTGCGCTTC          45      65 °C  72 °C/15 s
FADS2   Gene specific race           Cma_FADS2_5RACE_R        GCAGAGGAGGACCAATCAGGAAGAA
FADS2   Full ORF                     Cma_FADS2_ORF_F          TCATCAGAGAGAGCAGCGAG               35      59 °C  72 °C/24 s
FADS2   Full ORF                     CmA_FADS2_ORF_R          CCAGCATAGATGGCAGAGGA
FADS2   Restriction site             Cma_FADS2_Pyes_KPNI_F    CCCGGTACCATAATGGGTGGGGGCACTCAT     35      68 °C  72 °C/21 s
FADS2   Restriction site             Cma_FADS2_Pyes_XBAL_R    CCCTCTAGATTATTTGTGGAGGTATGCGTCC
ELOVL5  Degenerate primers           ELOVEL5_degen_F          TGAACGTCCTGTgtggtaytayt            45      52 °C  72 °C/5 s
ELOVL5  Degenerate primers           ELOVEL5_degen_R          GCTGCACCTGGGTGATGTACykyttccacca
ELOVL5  Gene specific race           Cma_ELOVEL5_3RACE_F      TGGACACGTTCTTCTTCATCCTGCG          20      65 °C  72 °C/8 s
ELOVL5  Gene specific race (nested)  Cma_ELOVEL5_3RACE_F2     GTCCTGTGCTGTAGTCTGGCCCTGC          25      65 °C  72 °C/6 s
ELOVL5  Gene specific race           Cma_ELOVEL5_5RC_R        CGCAGGATGAAGAAGAACGTGTCCA          45      65 °C  72 °C/5 s
ELOVL5  Restriction site             Cma_ELOVEL5_Pyes_KPNI_F  CCCGGTACCAAGATGGAGGCCTTTAATCACA    35      58 °C  72 °C/15 s
ELOVL5  Restriction site             Cma_ELOVEL5_Pyes_XBAI_R  CCCTCTAGATCAATCTGCCCGCGGCTT
ELOVL2  Degenerate primers           ELOVL2_degen_F           TACTTGGGACCAAAGTACATGA             45      60 °C  72 °C/7 s
ELOVL2  Degenerate primers           ELOVL2_degen_R           AGATAGCGTTTCCACCACAG
ELOVL2  Gene specific race           Cma_ELOVL2_3RACE_F       ACCACGCCTCCATGTTCAATATCTGGTG       45      72 °C  72 °C/7 s
ELOVL2  Gene specific race           Cma_ELOVL2_5RACE_R       CCCGCTGACCAGGTTGCTGAAATTA          45      69 °C  73 °C/7 s
ELOVL2  Full ORF                     Cma_ELOVL2_ORF_F         GAGAGGCGCGGCGAGGAAACAAC            35      68 °C  73 °C/19 s
ELOVL2  Full ORF                     Cma_ELOVL2_ORF_R         GGCTGTGGTTGTGCATATGTCTCAG
ELOVL2  Restriction site             Cma_ELOVL2_Pyes_KPNI_F   CCCGGTACCACCATGGAGCTCTTTAGCATGAA   35      62 °C  73 °C/14 s
ELOVL2  Restriction site             Cma_ELOVL2_Pyes_XBAL_R   CCCTCTAGATTACTGTAGCTTATGTTTGGCTCC
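
The lowercase letters in the degenerate primers of Supplementary Table 2.1 include the ambiguity codes y, r and k. As an illustration only, the following minimal Python sketch assumes these follow the standard IUPAC nucleotide code (y = C/T, r = A/G, k = G/T) and counts how many concrete sequences each degenerate primer represents. The function names `degeneracy` and `expand` are hypothetical helpers introduced here, not part of the thesis methods.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (assumed to apply to Table 2.1).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Degenerate primer sequences copied from Supplementary Table 2.1.
degenerate_primers = {
    "FADS2_degen_F": "GCGCCTCCGCCAAytggtggaayc",
    "FADS2_degen_R": "TGGCCGGAGAACcartcrttraa",
    "ELOVEL5_degen_F": "TGAACGTCCTGTgtggtaytayt",
    "ELOVEL5_degen_R": "GCTGCACCTGGGTGATGTACykyttccacca",
}

for name, seq in degenerate_primers.items():
    print(name, "length:", len(seq), "degeneracy:", degeneracy(seq))
```

Under this assumption, a primer with two y positions represents four sequences and one with three ambiguous two-base positions represents eight, which is one way to gauge how permissive each degenerate primer is.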

Supplementary Table 2.2. Accession numbers of all sequences used in phylogenetic analysis of Fads amino acid sequences.

Accession numbers

Fads 1

Homo sapiens NP_037534.3

Sus scrofa NP_001106512.1

Mus musculus AAH26848.1

Rattus norvegicus NP_445897.2

Monodelphis domestica H9H609

Ornithorhynchus anatinus XP_016084018.1

Latimeria chalumnae XP_005988035.1

Scyliorhinus canicula AEY94454.1

Callorhinchus milii XP_007885635.1

Xenopus tropicalis XP_002943012.2

Xenopus laevis XP_018112981.1

Xenopus laevis XP_018115603.1

Anolis carolinensis a XP_003224167.1

Anolis carolinensis c XP_003224188.1

Anolis carolinensis d XP_003224187.1

Gallus gallus a XP_421052.4

Gallus gallus b XP_426408.2

Gallus gallus c XP_421051.3

Alligator mississippiensis XP_006274989.1

Fads 2

Homo sapiens AAG23121.1

Mus musculus NP_062673.1

Rattus norvegicus NP_112634.1

Sus scrofa NP_001165221.1

Anolis carolinensis XP_003224168.1

Gallus gallus NP_001153900.1

Alligator mississippiensis XP_006274951.1

Xenopus tropicalis NP_001120262.1

Salmo salar (Fads2Δ6 _c) NP_001165752.1

Salmo salar (Fads2Δ6 _b) NP_001165251.1

Salmo salar (Fads2Δ6 _a) NP_001117047.1

Salmo salar (Fads2Δ5 ) NP_001117014.1

Oncorhynchus masou ABU87822.1

Oncorhynchus mykiss (Fads2Δ6) NP_001117759.1

Oncorhynchus mykiss (Fads2Δ5) AFM77867.1

Gadus morhua (Fads2Δ6) AAY46796.1

Scatophagus argus (Fads2Δ6) AHA62794.1

Oreochromis niloticus (Fads2Δ6) AGV52807.1

Dicentrarchus labrax (Fads2Δ6) ACD10793.1

Argyrosomus regius (Fads2Δ6/ Δ8) AGG69480.1

Sparus aurata (Fads2Δ6) AAL17639.1

Thunnus maccoyii (Fads2Δ6) ADG62353.1

Scophthalmus maximus (Fads2Δ6) AAS49163.1

Solea senegalensis (Fads2Δ6) AEQ92868.1

Siganus canaliculatus (Fads2Δ4) ADJ29913.1

Lates calcarifer (Fads2Δ6) ACS91458.1

Rachycentron canadum (Fads2Δ6) ACJ65149.1

Anguilla japonica (Fads2Δ6/ Δ8) AHY22375.1

Arapaima gigas (Fads2Δ6/ Δ8) AOO19789.1

Scleropages formosus XP_018585703.1

Danio rerio (Fads2Δ5/ Δ6) NP_571720.2

Colossoma macropomum MH734335

Astyanax mexicanus XP_007235183.1

Ictalurus punctatus XP_017341187.1

Clarias gariepinus (Fads2Δ5/ Δ6) AMR43366.1

Latimeria chalumnae XP_005988034.2

Scyliorhinus canicula AEY94455.1

Callorhinchus milii XP_007885636.1

Invertebrates Fads

Saccoglossus kowalevskii XP_006822674.1

Octopus vulgaris AEK20864.1

Supplementary Table 2.3. Accession numbers of all sequences used in phylogenetic analysis of Elovl amino acid sequences.

Accession numbers Elovl

Elovl2

Homo sapiens NP_060240.3

Bos taurus NP_001076986.1

Mus musculus NP_062296.1

Rattus norvegicus NP_001102588.1

Gallus gallus NP_001184237.1

Xenopus tropicalis NP_001016159.1

Xenopus laevis NP_001087564.1

Callorhinchus milii XP_007900820.1

Latimeria chalumnae XP_006006450.1

Esox lucius XP_010884057.1

Salmo salar NP_001130025.1

Oncorhynchus mykiss AIT56593.1

Clupea harengus XP_012671565.1

Danio rerio NP_001035452.1

Colossoma macropomum MH734337

Astyanax mexicanus XP_007260136.1

Clarias gariepinus AOY10780.1

Elovl5

Homo sapiens NP_068586.1

Mus musculus NP_599016.2

Bos taurus NP_001040062.1

Gallus gallus NP_001186126.1

Xenopus laevis NP_001089883.1

Sparus aurata AAT81404.1

Dicentrarchus labrax CBX53576.1

Rachycentron canadum ACJ65150.1

Salmo salar (Elovl5b) NP_001130024.1

Salmo salar (Elovl5a) NP_001117039.1

Larimichthys crocea AFB81415.1

Nibea mitsukurii ACR47973.1

Argyrosomus regius AGG69479.1

Siganus canaliculatus ADE34561.1

Clupea harengus XP_012695835.1

Clarias gariepinus AAT81405.1

Ictalurus punctatus NP_001188041.1

Pangasius larnaudii AGR45586.1

Danio rerio NP_956747.1

Colossoma macropomum MH734336

Branchiostoma lanceolatum ALZ50284.1

Elovl4

Homo sapiens NP_073563.1

Mus musculus NP_683743.2

Bos taurus NP_001092520.1

Gallus gallus NP_001184238.1

Danio rerio a NP_957090.1

Danio rerio b NP_956266.1

Takifugu rubripes a XP_003966009.1

Takifugu rubripes b XP_003971605.1

Salmo salar NP_001182481.1

Rachycentron canadum ADG59898.1

Ciona intestinalis NP_0010290

Supplementary Table 3.1. Accession numbers of all sequences used in the phylogenetic analysis of the PPARs (PPARα, PPARαa, PPARαb, PPARβ, PPARβa, PPARβb and PPARγ).

Species Pparα

Lepisosteus oculatus W5NK92

Gallus gallus NP_001001464.1

Taeniopygia guttata XP_012424764.1

Anolis carolinensis XP_003221452.1

Mus musculus NP_035274.2

Cavia porcellus NP_001166475.1

Homo sapiens NP_001001928.1

Pparαa

Astyanax mexicanus W5K935

Oreochromis niloticus I3JQG1

Xiphophorus maculatus M4A5E5

Oryzias latipes H2LAT3

Danio rerio NP_001154805.1

Colossoma macropomum *Isolated in the present work

Pparαb

Astyanax mexicanus W5L742

Danio rerio NP_001096037.1

Colossoma macropomum *Isolated in the present work

Oreochromis niloticus NP_001276995.1

Xiphophorus maculatus M3ZQ58

Oryzias latipes NP_001158347.1

Pparβ

Gallus gallus NP_990059.1

Taeniopygia guttata H0YU22

Anolis carolinensis XP_003220387.1

Mus musculus NP_035275.1

Homo sapiens NP_001165289.1

Lepisosteus oculatus W5N0Y1

Pparβa

Danio rerio F1QJT0

Colossoma macropomum *Isolated in the present work

Pparβb

Oreochromis niloticus NP_001276565.1

Oryzias latipes NP_001265836.1

Takifugu rubripes NP_001091097.1

Astyanax mexicanus W5KJX4

Xiphophorus maculatus M4A7P7

Tetraodon nigroviridis H3DCC4

Gasterosteus aculeatus G3P057

Poecilia formosa A0A087Y6K3

Colossoma macropomum *Isolated in the present work

Danio rerio NP_571543.1

Pparγ

Astyanax mexicanus W5KED5

Oreochromis niloticus NP_001277129.1

Xiphophorus maculatus M4ASD3

Lepisosteus oculatus W5N992

Oryzias latipes H2LHU7

Tetraodon nigroviridis H3BZ42

Takifugu rubripes NP_001091096.1

Gasterosteus aculeatus G3NA33

Poecilia formosa A0A087Y577

Danio rerio NP_571542.1

Colossoma macropomum *Isolated in the present work

Gallus gallus NP_001001460.1

Taeniopygia guttata H0Z0W7

Anolis carolinensis XP_016847098.1

Mus musculus NP_035276.2

Homo sapiens NP_619725.2

Invertebrates

Branchiostoma floridae XP_002598634.1

Branchiostoma belcheri XP_019618227.1

Ciona intestinalis NP_001071801.1

Supplementary Table 3.2. Length distribution of contigs in gold standard transcriptome assembly.

Contig Length    Number of Hits    Percentage of Hits (%)
300-600          13,237            30.71
600-800          5,467             12.68
800-1000         4,157             9.64
1000-1200        3,271             7.59
1200-1400        2,798             6.49
1400-1600        2,365             5.49
1600-1800        2,003             4.65
1800-2000        1,650             3.83
2000-2500        2,969             6.89
2500-3000        1,911             4.43
3000-4000        1,830             4.25
>4000            1,443             3.35
Total            43,101            100.00

Supplementary Table 3.3. Distribution of isoforms per ‘gene’ in the gold standard transcriptome assembly.

Number of Isoforms    Number of Hits    Percentage of Hits (%)
1                     20,285            70.83
2                     5,254             18.35
3                     1,696             5.92
4                     736               2.57
5                     299               1.04
6                     161               0.56
7                     74                0.26
8                     60                0.21
9                     22                0.08
10                    17                0.06
11                    8                 0.03
>12                   27                0.09
Total                 28,639            100.00
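
The percentage column of Supplementary Table 3.3 can be reproduced from the hit counts. The short Python sketch below is illustrative only: it copies the counts from the table and recomputes each share of the total. The helper name `percentages` is hypothetical.

```python
# Hit counts per number of isoforms, copied from Supplementary Table 3.3.
isoform_counts = {
    "1": 20285, "2": 5254, "3": 1696, "4": 736, "5": 299, "6": 161,
    "7": 74, "8": 60, "9": 22, "10": 17, "11": 8, ">12": 27,
}


def percentages(counts: dict) -> dict:
    """Return each count as a percentage of the column total, to 2 decimals."""
    total = sum(counts.values())
    return {key: round(100 * value / total, 2) for key, value in counts.items()}


total_genes = sum(isoform_counts.values())
shares = percentages(isoform_counts)

print("Total 'genes':", total_genes)
for key, share in shares.items():
    print(f"{key:>4} isoform(s): {share:.2f} %")
```

The same pattern applies to the other count-and-percentage tables in this supplement, since each percentage is a count divided by the table total.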

Supplementary Table 3.4. BUSCO evaluations of completeness of the C. macropomum liver gold standard transcriptome against the metazoa and eukaryota gene sets.

BUSCO Statistics      Metazoa Database (%)    Eukaryota Database (%)
Complete              84.7                    82.2
Single                69.6                    69.3
Multi                 15.1                    12.9
Fragment              8.5                     10.9
Missing               6.8                     6.9
Total Busco Groups    978                     303
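
The BUSCO categories in Supplementary Table 3.4 are related by simple sums: complete is the sum of single-copy and multi-copy, and complete, fragmented and missing together account for all groups. The following Python sketch, offered as an illustration rather than as part of the original analysis, checks both relations on the reported percentages. The function name `is_consistent` and the rounding tolerance are assumptions made here.

```python
# Percentages copied from Supplementary Table 3.4.
busco = {
    "Metazoa": {"complete": 84.7, "single": 69.6, "multi": 15.1,
                "fragment": 8.5, "missing": 6.8},
    "Eukaryota": {"complete": 82.2, "single": 69.3, "multi": 12.9,
                  "fragment": 10.9, "missing": 6.9},
}


def is_consistent(stats: dict, tol: float = 0.05) -> bool:
    """Check that the BUSCO percentages add up within a rounding tolerance."""
    complete_ok = abs(stats["single"] + stats["multi"] - stats["complete"]) <= tol
    total_ok = abs(stats["complete"] + stats["fragment"]
                   + stats["missing"] - 100.0) <= tol
    return complete_ok and total_ok


for database, stats in busco.items():
    print(database, "consistent:", is_consistent(stats))
```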

Supplementary Table 3.5. The top 20 species by number of top Blast-x hits. Results are from querying the gold standard transcriptome against the NCBI non-redundant (NR) database.

Species                         Taxon ID    Number of Blast-x Hits    Percentage of Blast-x Hits (%)
Pygocentrus nattereri           42514       33,232                    78.66
Astyanax mexicanus              7994        2,521                     5.97
Cyprinus carpio                 7962        778                       1.84
Ictalurus punctatus             7998        744                       1.76
Oncorhynchus mykiss             8022        669                       1.58
Danio rerio                     7955        562                       1.33
Larimichthys crocea             215358      498                       1.18
Sinocyclocheilus anshuiensis    1608454     270                       0.64
Sinocyclocheilus grahami        75366       205                       0.49
Sinocyclocheilus rhinocerous    307959      192                       0.45
Scleropages formosus            113540      177                       0.42
Oreochromis niloticus           8128        171                       0.40
Salmo salar                     8030        154                       0.36
Austrofundulus limnaeus         52670       141                       0.33
Clupea harengus                 7950        138                       0.33
Oryzias latipes                 8090        133                       0.31
Esox lucius                     8010        78                        0.18
Notothenia coriiceps            8208        78                        0.18
Lepisosteus oculatus            7918        77                        0.18
Maylandia zebra                 106582      72                        0.17
Others                          -           1,356                     3.21
Total                           -           42,246                    100.00


Supplementary Table 3.6. E-value distribution of Blast-x hits of gold standard transcriptome against NR database.

E-value Ranges     Number of Blast-x Hits    Percentage of Blast-x Hits (%)
0 ~ 1e-100         18,213                    43.1
1e-100 ~ 1e-60     8,954                     21.2
1e-60 ~ 1e-45      5,256                     12.4
1e-45 ~ 1e-30      4,400                     10.4
1e-30 ~ 1e-15      3,414                     8.1
1e-15 ~ 1e-5       2,009                     4.8
Total              42,246                    100.00

Supplementary Table 3.7. Similarity distribution of Blast-x hits of gold standard transcriptome against NR database.

Similarity Ranges (%)    Number of Blast-x Hits    Percentage of Blast-x Hits (%)
20 ~ 40                  34                        0.1
40 ~ 60                  1,109                     2.6
60 ~ 80                  4,271                     10.1
80 ~ 90                  5,812                     13.8
90 ~ 95                  6,842                     16.2
95 ~ 100                 24,178                    57.2
Total                    42,246                    100.00


Supplementary Table 3.8. Functional annotation categories and statistics for gold standard transcriptome and for unigenes.

Trinotate Annotation Statistics            Final transcriptome assembly    Final Transcriptome Subset
Number of “genes” with ORF                 28,639                          -
Number of “Unigenes” with ORF              -                               28,639
Number of transcripts with ORF             43,101                          28,639
Transcripts with Blastx match NR           42,246                          28,067
Transcripts with Blastp match NR           40,046                          26,555
Transcripts with Blastx match Uniref90     42,295                          28,099
Transcripts with Blastp match Uniref90     40,187                          26,648
Transcripts with Blastx match Trembl       40,981                          27,296
Transcripts with Blastp match Trembl       39,256                          25,958
Transcripts with Blastx match SwissProt    34,426                          22,423
Transcripts with Blastp match SwissProt    33,135                          21,642
Transcripts with GO terms                  32,334                          21,075
Transcripts with eggNOG/COG                28,483                          18,750
Transcripts with PFAM                      28,301                          18,297


Supplementary Table 3.9. Clusters of orthologous groups distribution.

Class    Number of Unigenes    Percentage of Unigenes (%)    Class Description
A        50                    0.80                          RNA processing and modification
B        159                   2.56                          Chromatin structure and dynamics
C        171                   2.75                          Energy production and conversion
D        92                    1.48                          Cell cycle control, cell division, chromosome partitioning
E        267                   4.29                          Amino acid transport and metabolism
F        135                   2.17                          Nucleotide transport and metabolism
G        218                   3.51                          Carbohydrate transport and metabolism
H        113                   1.82                          Coenzyme transport and metabolism
I        236                   3.79                          Lipid transport and metabolism
J        339                   5.45                          Translation, ribosomal structure and biogenesis
K        281                   4.52                          Transcription
L        284                   4.57                          Replication, recombination and repair
M        64                    1.03                          Cell wall/membrane/envelope biogenesis
N        4                     0.06                          Cell motility
O        650                   10.45                         Posttranslational modification, protein turnover, chaperones
P        184                   2.96                          Inorganic ion transport and metabolism
Q        180                   2.89                          Secondary metabolites biosynthesis, transport and catabolism
R        1,509                 24.26                         General function prediction only
S        268                   4.31                          Function unknown
T        386                   6.21                          Signal transduction mechanisms
U        184                   2.96                          Intracellular trafficking, secretion, and vesicular transport
V        97                    1.56                          Defence mechanisms
W        0                     0.00                          Extracellular structures
Y        10                    0.16                          Nuclear structure
Z        338                   5.43                          Cytoskeleton
Total    6,219                 100.00


Supplementary Table 4.1. Accession numbers of all sequences used in phylogenetic analysis.

Accession numbers

Lxrα
Denticeps clupeoides            XP_028812805.1
Tachysurus fulvidraco           XP_026992819.1
Electrophorus electricus        XP_026860594.1
Pygocentrus nattereri           XP_017559106.1
Colossoma macropomum            *Isolated in this present work
Danio rerio                     NP_001017545.2
Anabarilius grahami             ROL52734.1
Carassius auratus               XP_026123325.1
Cyprinus carpio                 ACQ99544.1
Lepisosteus oculatus            AQR58545.1
Latimeria chalumnae             XP_005987002.1

Srebp-1
Tachysurus vachellii            ALB38958.1
Ictalurus punctatus             XP_017308085.1
Pangasianodon hypophthalmus     XP_026772081.1
Astyanax mexicanus              XP_022529045.1
Pygocentrus nattereri           XP_017540364.1
Colossoma macropomum            *Isolated in this present work
Onychostoma macrolepis          AVR48442.1
Anabarilius grahami             ROL49807.1
Ctenopharyngodon idella         AIZ50708.1
Megalobrama amblycephala        AYA57965.1
Lepisosteus oculatus            XP_015215650.1
Latimeria chalumnae             XP_014344252.1

Lpl
Oreochromis niloticus           I3J210
Electrophorus electricus        XP_026861273.1
Astyanax mexicanus              XP_007240188.2
Colossoma macropomum            *Isolated in this present work
Pygocentrus nattereri           A0A3B4CZ75
Ctenopharyngodon idella         ACN66300.1
Culter alburnus                 AGT80484.1
Hypophthalmichthys molitrix     ACT22640.1
Danio rerio                     F1QHK8
Sinocyclocheilus grahami        XP_016131571.1
Cyprinus carpio                 AIU47020.1
Carassius auratus               ACN37860.1
Lepisosteus oculatus            W5M751
Latimeria chalumnae             H3BF25

Fas
Tachysurus fulvidraco           AXN57017.1
Danio rerio                     XP_009305081.1
Ctenopharyngodon idella         AYW03307.1
Carassius auratus               XP_026078157.1
Cyprinus carpio                 ARO92273.1
Colossoma macropomum            *Isolated in this present work
Pygocentrus nattereri           XP_017566476.1
Lepisosteus oculatus            XP_006635198.2
Latimeria chalumnae             XP_014339995.1

β-Actin
Pygocentrus nattereri           XP_017558662.1
Colossoma macropomum            *Isolated in this present work
Labeo calbasu                   AAL57317.1
Tigriopus japonicus             AAQ05016.1
Danio rerio                     NP_853632.3
Elopichthys bambusa             AEK69350.1
Lepisosteus oculatus            XP_006637184.1
Latimeria chalumnae             XP_005992382.1

Microglobulin
Astyanax mexicanus              XP_007235551.2
Electrophorus electricus        XP_026867047.1
Pygocentrus nattereri           XP_017551836.1
Colossoma macropomum            *Isolated in this present work
Carassius auratus               XP_026053496.1
Sinocyclocheilus grahami        XP_016090776.1
Salmo trutta                    XP_029599575.1
Lepisosteus oculatus            XP_015201692.1
Latimeria chalumnae             H3A1N2

Tubulin
Amphiprion ocellaris            XP_023147458.1
Perca flavescens                XP_028447189.1
Xiphophorus maculatus           XP_005800766.1
Labeo rohita                    RXN28036.1
Danio rerio                     NP_001003558.1
Pangasianodon hypophthalmus     XP_026768801.1
Denticeps clupeoides            XP_028846748.1
Astyanax mexicanus              XP_007259997.2
Pygocentrus nattereri           XP_017547163.1
Colossoma macropomum            *Isolated in this present work
Lepisosteus oculatus            XP_006636851.2
Latimeria chalumnae             XP_005986289.1

Supplementary Table 4.2. Values, determined by real-time quantitative PCR, are means ± SE of five replicates (8 fish/tank). Data were analysed by ANOVA (P<0.05), with oil source and percentage of oil as factors, together with their interaction. Significant differences between diets (ANOVA, P<0.05) are indicated by “*”. OS: oil source; %: percentage of lipids; OS x %: interaction of the two factors.

LIVER Experimental diets P valor Gen FO 5% FO 10% VO 5% VO 5% OS x % % OS 1,01 ± 1,34 ± 1,57 ± fads2 0,90 ± 0,26 0,301 0,715 0.006* 0,42 0,22 0,43 0,67 ± 0,71 ± 0,77 ± elovl2 0,78 ± 0.27 0,897 0,581 0,914 0.38 0.23 0,40 1,17 ± 1,91 ± 1,56 ± elovl5 0.93 ± 0.46 0,873 0,403 0.059* 0,97 0.67 0.83 0,89 ± 1,87 ± 1,29 ± pparαa 1,43 ± 0,30 0,536 0,07 0,68 0,22 1.12 ± 1.06 ± 1.04 ± pparαb 0.70 ± 0.04 0,754 0.38 0.16 0.23 0.89 ± 1.16 ± 1,68 ± pparβa 1.20 ± 0.16 0,173 0.001* 0.001* 0.10 0.21 0,07 0.68 ± 1,51 ± 1.06 ± pparβb 1,19 ± 0,74 0,157 0,934 0,298 0.17 1,11 0.49 1,25 ± 1,65 ± 1,58 ± pparγ 1,56 ± 0,23 0,377 0,559 0,324 0,60 0,46 0,35 1,13 ± 1,08 ± 1,01 ± lxr 1,24 ± 0,59 0,642 0,942 0,489 0,16 0,49 0,29 0.94 ± 0.90 ± 0.89 ± srebp 0.41 ± 0.23 0,148 0,127 0,208 0.36 0.50 0.35 0,93 ± 1.37 ± 1,23 ± lpl 0,85 ± 0,32 0,876 0,569 0,051* 0,37 0.26 0,65 1,75 ± 0,73 ± fas 3,06 ± 1,85 2,11 ± ,56 0,752 0,062 0,133 1,75 0,55


BRAIN Experimental diets P valor Gen FO 5% FO 10% VO 5% VO 5% OS x % % OS 1,00 ± 2,00 ± 1,76 ± fads2 1,76 ± 0,37 0,047* 0,498 0.001* 00,29 0,11 0,37 0,087 ± 0,260 ± 0,45 ± 0,21 ± elovl2 0.009* 0,589 0.035* 0.09 0.17 0,65 0.13 0.84 2,65 ± 1,55 ± elovl5 1.30 ± 0.53 0,146 0,543 0,063 ±0.36 1,60 0.63 0,63 ± 1,44 ± 0,98 ± pparαa 1,32 ± 0,55 0.007* 0,539 0,21 0,11 0,19 0,24 1,34 ± 1,06 ± 1,05 ± pparαb 1,21 ± 0,38 0,825 0,798 0,444 0,87 0,12 0,43 0.56 ± 1,95 ± 2,06 ± 2,04 ± pparβa 0,239 0,253 0,191 0.44 01,46 0,72 1,25 0,11 ± 0,41 ± 0,32 ± pparβb 0,30 ± 0,10 0,023* 0,36 0.013* 0,39 0,10 0,11 0,44 ± 0,78 ± 0.54 ± pparγ 0,48 ± 0,09 0,105 0,247 0.031* 0,10 0.21 0,15 0,31 ± 0,89 ± 0,99 ± lxr 0,94 ± 0,15 0,191 0,073 0.127 0.14 0,45 0,47 2,46 ± 5,65 ± 2,89 ± srebp 3,65 ± 3,09 0,151 0,554 0,363 1,93 2,52 0,33 0,92 ± 0.97 ± 1,25 ± lpl 1,16 ± 0,26 0,919 0,272 0,757 0,71 0.30 0.03 32,72 ± 27,21 ± 23,00 ± 32,06 ± fas 0,187 0,93 0,996 16,0 12,73 15,86 12,31


Supplementary Table 4.3. Statistics of Liver raw and protein coding transcriptome assembly in C. macropomum.

Assembly Versions                        Protein coding transcriptome assembly
Transcript Number                        92497
n50 transcript length (bp)               1932
Mean transcript length (bp)              1293.83
Largest transcript                       18492
Number of transcripts over 1k nn         40584
Number of transcripts over 10k nn        131
Total Assembled bases                    119675557
Back Mapping Reads %                     -

BUSCO Statistics (Euk;Met;Ver;Act)*
Complete                                 83.1 \ 85.1 \ 58.4 \ 51.6
Single                                   59.7 \ 56.2 \ 39.5 \ 35.4
Multi                                    23.4 \ 28.9 \ 18.9 \ 16.2
Fragment                                 10.9 \ 8.5 \ 21.5 \ 14.7
Missing                                  6.0 \ 6.4 \ 20.1 \ 33.7
Total BUSCO Groups                       94.0 \ 93.6 \ 79.9 \ 66.3

*Euk: Database with 303 Eukaryota genes.
*Met: Database with 978 Metazoa genes.
*Ver: Database with 2586 Vertebrate genes.
*Act: Database with 4584 Actinopterygii genes.
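
For readers unfamiliar with the N50 statistic reported in Supplementary Table 4.3: it is the length L such that transcripts of length L or longer together contain at least half of all assembled bases. The following is a minimal Python sketch of that definition. The function name n50 and the toy lengths are hypothetical, and this is an illustration of the metric only, not the tool used to produce the table.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Lengths are visited from longest to shortest; the N50 is the length
    at which the running total first reaches half of the assembled bases.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must not be empty")


# Toy example (hypothetical lengths, not thesis data).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

Because N50 is weighted by length, it is normally larger than the mean transcript length, as is the case in the table (1932 bp against 1293.83 bp).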


Supplementary Table 5.1. The primers and the conditions of Elovl4a PCR.

           Primer set function    Primer name                 Primer sequence                         Cycles    TM       Extension (size bp)
Elovl4a    Full ORF               Cma_ELOVL4A_ORF_F           CCCAGAACGGACAATTAGGA                    35        60 °C    72 °C/24 s
                                  Cma_ELOVL4A_ORF_R           GAAGCCTGCCTGTTTCTCTG
           Restriction site       Cma_ELOVL4A_Pyes_XBAL_F     CCCGGATCCAAGATGGATATTGTAACACATCTCATC    35        62 °C    72 °C/21 s
                                  Cma_ELOVL4A_Pyes_BAMHI_R    CCCTCTAGACTAGTCACGCTTGGCCCTG

Supplementary Table 5.2. Phylogenetic tree Elovl4 amino acid accession numbers.

elovl4b
Pygocentrus nattereri           XP_017574113.1
Colossoma macropomum            this work
Astyanax mexicanus              XP_022533251.1
Danio rerio                     XP_022533251.1
Misgurnus anguillicaudatus      AUO15656.1
Clarias gariepinus              ASY01351.1
Siganus canaliculatus           ADZ73580.1
Larimichthys crocea             ALB35198.1
Rachycentron canadum            ADG59898.1
Oreochromis niloticus           XP_003440669.1
Epinephelus coioides            AHI17192.1

elovl4a
Danio rerio                     NP_957090.1
Misgurnus anguillicaudatus      AUO15655.1
Clarias gariepinus              ASY01350.1
Astyanax mexicanus              XP_007257326.2
Oreochromis niloticus           XP_003443720.1
Pygocentrus nattereri           XP_017565830.1
Colossoma macropomum            this work

elovl4
Erpetoichthys calabaricus       XP_028653499.1
Lepisosteus oculatus            XP_006626327.1


Supplementary Table 5.3. MIxS descriptors and accession numbers of the tissue sample, raw data and draft genome assembly of Colossoma macropomum.

Item                  Description
investigation_type    Eukaryote
project_name          Colossoma macropomum Genome assembly
lat_lon               19°55'45.9"S 40°57'42.1"W
geo_loc_name          Laranja da Terra, Espirito Santo, Brazil
tissues               Liver
collection_date       September/2017
seq_meth              Illumina HiSeq X Ten
Collector             Ana Lúcia Salaro
Maturity              Juvenile

Datasets Generated    Accession Numbers of NCBI
Genome raw data       PRJNA596735
Genome Assembly       JAAFGN000000000


Supplementary Table 5.4. GenomeScope (k-mer 27, 29 and 31), Kmergenie and Sga Preqc statistics of Colossoma macropomum raw reads.

GenomeScope

Property                      k = 27 min       k = 27 max       k = 29 min       k = 29 max       k = 31 min       k = 31 max
Heterozygosity, %             0.0851359        0.0891992        0.0818162        0.0854054        0.0786521        0.0818496
Genome Haploid Length, bp     2,284,669,586    2,286,668,997    2,304,304,586    2,306,199,480    2,320,642,059    2,322,444,124
Genome Repeat Length, bp      635,420,013      635,976,096      592,124,547      592,611,467      554,061,679      554,491,929
Genome Unique Length, bp      1,649,249,573    1,650,692,901    1,712,180,039    1,713,588,013    1,766,580,380    1,767,952,195
Model Fit, %                  96.3584          98.5872          96.6005          98.799           96.8041          98.8033
Read Error Rate, %            0.224321         0.224321         0.219684         0.219684         0.215323         0.215323

Kmergenie

Property
Genome Haploid Length (bp)    2,318,146,338
Predicted Best Kmer           93-95

SGA Preqc

Property
Genome Haploid Length (bp)    2,479,000,000
PCR Duplication Proportion    0.06
Mean quality score            34,25-36
Fragment Size                 400
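
In the GenomeScope rows of Supplementary Table 5.4, the haploid genome length is the sum of the repeat and unique lengths, so the repetitive share of the genome can be read directly from the table. The following is a minimal Python sketch using the tabulated values. The dictionary layout and the function name repeat_fraction are illustrative assumptions, not GenomeScope output or API.

```python
# (min, max) GenomeScope estimates in bp, copied from Supplementary Table 5.4.
GENOMESCOPE = {
    27: {"haploid": (2_284_669_586, 2_286_668_997),
         "repeat": (635_420_013, 635_976_096),
         "unique": (1_649_249_573, 1_650_692_901)},
    29: {"haploid": (2_304_304_586, 2_306_199_480),
         "repeat": (592_124_547, 592_611_467),
         "unique": (1_712_180_039, 1_713_588_013)},
    31: {"haploid": (2_320_642_059, 2_322_444_124),
         "repeat": (554_061_679, 554_491_929),
         "unique": (1_766_580_380, 1_767_952_195)},
}


def repeat_fraction(k, bound=0):
    """Repeat length divided by haploid length (bound: 0 = min, 1 = max)."""
    est = GENOMESCOPE[k]
    return est["repeat"][bound] / est["haploid"][bound]


for k, est in GENOMESCOPE.items():
    for bound in (0, 1):
        # Repeat + unique should reproduce the haploid length exactly.
        assert est["repeat"][bound] + est["unique"][bound] == est["haploid"][bound]
    print(f"k={k}: repeats are about {100 * repeat_fraction(k):.1f}% of the haploid length")
```

The estimated repeat share falls as k increases, which is expected because longer k-mers resolve more repeats as unique sequence.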


A complete enzymatic capacity for long-chain polyunsaturated fatty acid biosynthesis is present in the Amazonian teleost tambaqui, Colossoma macropomum

Renato B. Ferraz1,2, Naoki Kabeya3,4, Mónica Lopes-Marques1, André M. Machado1, Ricardo A. Ribeiro5, Ana L. Salaro6, Rodrigo Ozório1, L. Filipe C. Castro1,7† and Óscar Monroig3†

Supplementary Figures

Supplementary Figure 2.1. Sequence alignment and conservation analysis of desaturase sequence motifs of C. macropomum Fads2. * indicates residues involved in determining desaturation activity in mammalian desaturases (Watanabe et al., 2016). Red boxes highlight conserved residue replacements with Danio rerio Δ5Δ6 Fads2.

Supplementary Figure 4.1. Maximum likelihood phylogenetic analysis of Liver X receptor (Lxrα) amino acid sequences rooted with Latimeria chalumnae. Numbers at nodes indicate branch support in posterior probabilities calculated using aBayes. C. macropomum Lxrα studied herein is highlighted. Accession numbers for all Lxrα sequences are available in Supplementary Table 4.1.


Supplementary Figure 4.2. Maximum likelihood phylogenetic analysis of sterol regulatory element binding proteins (Srebp-1) amino acid sequences rooted with Latimeria chalumnae. Numbers at nodes indicate branch support in posterior probabilities calculated using aBayes. C. macropomum Srebp-1 studied herein is highlighted. Accession numbers for all Srebp-1 sequences are available in Supplementary Table 4.1.


Supplementary Figure 4.3. Maximum likelihood phylogenetic analysis of lipoprotein lipase (Lpl) amino acid sequences rooted with Latimeria chalumnae. Numbers at nodes indicate branch support in posterior probabilities calculated using aBayes. C. macropomum Lpl studied herein is highlighted. Accession numbers for all Lpl sequences are available in Supplementary Table 4.1.


Supplementary Figure 4.4. Maximum likelihood phylogenetic analysis of fatty acid synthase (Fas) amino acid sequences rooted with Latimeria chalumnae. Numbers at nodes indicate branch support in posterior probabilities calculated using aBayes. C. macropomum Fas studied herein is highlighted. Accession numbers for all Fas sequences are available in Supplementary Table 4.1.


Supplementary Figure 4.5. Maximum likelihood phylogenetic analysis of β-Actin (Actb) amino acid sequences rooted with Latimeria chalumnae. Numbers at nodes indicate branch support in posterior probabilities calculated using aBayes. C. macropomum Actb studied herein is highlighted. Accession numbers for all Actb sequences are available in Supplementary Table 4.1.


Supplementary Figure 4.6. Maximum likelihood phylogenetic analysis of Microglobulin (B2m) amino acid sequences rooted with Latimeria chalumnae. Numbers at nodes indicate branch support in posterior probabilities calculated using aBayes. C. macropomum B2m studied herein is highlighted. Accession numbers for all B2m sequences are available in Supplementary Table 4.1.


Supplementary Figure 4.7. Maximum likelihood phylogenetic analysis of Tubulin (Tuba) amino acid sequences rooted with Latimeria chalumnae. Numbers at nodes indicate branch support in posterior probabilities calculated using aBayes. C. macropomum Tuba studied herein is highlighted. Accession numbers for all Tuba sequences are available in Supplementary Table 4.1.


Supplementary Figure 4.8. PPAR Signaling Pathway - Brain Transcriptome. Green boxes indicate unigenes identified in our transcriptome; white boxes indicate genes that were not detected.


Supplementary Figure 4.9. PPAR Signaling Pathway - Liver Transcriptome. Green boxes indicate unigenes identified in our transcriptome; white boxes indicate genes that were not detected.


Supplementary Figure 4.10. Fatty Acid Elongation - Brain Transcriptome. Green boxes indicate unigenes identified in our transcriptome; white boxes indicate genes that were not detected.


Supplementary Figure 4.11. Fatty Acid Elongation - Liver Transcriptome. Green boxes indicate unigenes identified in our transcriptome; white boxes indicate genes that were not detected.


Supplementary Figure 6.1. Comparative synteny analysis of all C. macropomum genome scaffolds. A- Elovl2 and B- Elovl5.


Supplementary Figure 6.2. Comparative synteny analysis of all C. macropomum genome scaffolds. A- Elovl4a, B- Elovl4b, C- Elovl8a and D- Elovl8b.


Supplementary Figure 6.3. Comparative synteny analysis of all C. macropomum genome scaffolds. A- Elovl7a, B- Elovl7b, C- Elovl1a and D- Elovl1b.


Supplementary Figure 6.4. Comparative synteny analysis of all C. macropomum genome scaffolds. A- Elovl3 and B- Elovl6.


“The things we do outlive us”

Extraordinário.
