Modeling the evolutionary loss of erythroid genes by icefishes: analysis of the hemogen gene using transgenic and mutant zebrafish

by Michael J. Peters

B.S. in Biology, University of New Hampshire

A dissertation submitted to

The Faculty of the College of Science of Northeastern University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

June 4, 2018

Dissertation directed by

H. William Detrich, III

Professor of Marine Molecular Biology and Biochemistry


Dedication

For my Oma Oswald, who started this journey with me.


Acknowledgments

I would like to thank my advisor, Dr. H. William Detrich, III, for encouraging me to be innovative and to pursue cutting-edge research. I thank the members of my committee, Drs. Erin Cram, Rebeca Rosengaus, Steven Vollmer, and Leonard Zon, for their many helpful suggestions. I enjoyed working alongside Sandra Parker and Carmen Elenberger and appreciate their support. I also enjoyed working with many students, including Caroline Benavides, Carolyn Dubnik, Carmen Elenberger, Laura Goetz, Urjeet Khanwalkar, Ben Moran, Alessia Santilli, Eileen Sheehan, Margaret Streeter, Kathleen Shusdock, and Sierra Smith. I especially thank Jonah Levin, who joined the lab as a high school student and has since continued working with me. I thank Corey Allard and Drs. Donald Yergeau and Jeffrey Grim for collecting samples that I used in my studies. I owe thanks to the staff of the Marine Science Center for their support, including Roberto Valdez, Sonya Simpson, Heather Sears, David Dawson, and Ryan Hill. I thank Drs. Joseph Ayers and Justin Ries for use of their facilities and Drs. Isaac Westfield and Ryan Myers for help with scanning electron microscopy. I thank Drs. Camille Berthelot, Melody Clark, James Monaghan, and Leonard Zon for valuable discussions and for providing important datasets and materials. I thank Dr. Jill de Jong and her lab for inspiring my interest in zebrafish research. I was pushed to my best by my fellow graduate students at Northeastern University. I am especially grateful to my Mom and Dad for their unending support and for reading every word I write. I am grateful to my siblings Sarah, Katie, Ryan, and Zachary for inspiring me with their passions and positive energy.


Abstract of Dissertation

The Antarctic icefishes (Channichthyidae) are the only vertebrate taxon whose members do not produce red blood cells, thereby providing a natural mutant model to study the regulators of blood development and disease. To identify novel regulators of erythropoiesis, I compared RNA-Seq transcriptomes from red- and white-blooded notothenioids. I find that both icefishes and their sister taxon, the dragonfishes (Bathydraconidae), model beta-spectrin mutated spherocytic anemia. Icefishes appear to have evolved morph-biased changes in expression of hematopoietic regulatory genes, including down-regulation of the histone acetyltransferase p300 and overexpression of histone deacetylase 1b. In icefishes, I characterize a frameshift mutation that truncates the P300-binding domain of Hemogen, an important erythroid transcription factor. Tol2 and CRISPR/Cas9-generated transgenic zebrafish lines reveal that hemogen is expressed in hematopoietic, renal, neural, and reproductive tissues. I find that two conserved non-coding elements differentially contribute to hemogen expression in primitive and definitive waves of hematopoiesis. CRISPR-generated mutant zebrafish lines, which replicate the C-terminal mutation in icefish hemogen, show severe anemia and growth defects. Furthermore, I show that the function of zebrafish Hemogen is dependent on acidic residues within the transactivation domain (TAD). Thus, Antarctic icefishes evolved an intricate system for repression of erythropoiesis that is caused in part by the loss of Hemogen function.


Table of Contents

Dedication 2

Acknowledgments 3

Abstract of Dissertation 4

Table of Contents 5

List of Figures 6

List of Tables 9

List of Symbols 10

List of Genes 14

Chapter 1: Morph-biased gene expression and sequence divergence typifies disease-like traits of Antarctic icefishes 25

Chapter 2: Divergent Hemogen genes of teleosts and mammals share conserved roles in erythropoiesis: Analysis using transgenic and mutant zebrafish 87

Chapter 3: Erythroid gene discovery using the erythrocyte-null Antarctic icefishes 157

Conclusion 196

References 200


List of Figures

Introduction

Figure 1 Erythropoiesis in the zebrafish 19

Figure 2 Comparison of red- and white-blooded notothenioids 24

Chapter 1

Figure 1 Peripheral blood smears from Antarctic notothenioid fishes 32

Figure 2 Expression profile heatmap of differentially expressed genes 36

Figure 3 Gene ontology enrichment for differentially expressed genes 40

Figure 4 Association networks for DE genes in the icefish head kidney 44

Figure 5 Three tissue-specific clusters of hematopoietic genes are differentially expressed in the head kidneys of Ps. georgianus and P. charcoti 48

Figure 6 Differential expression of hematopoietic regulators 52

Figure 7 Deleterious substitutions in erythroid genes from icefishes 55

Figure 8 The dragonfish P. charcoti is a natural mutant model for beta-spectrin mutated spherocytic anemia 57

Figure 9 Functional mutations occur in the interaction domains of Hemogen, Gata1, and P300 64

Figure 10 Whole protein acetylation in the head kidneys of red- and white-blooded notothenioids 68

Chapter 2

Figure 1 Zebrafish si:dkey-25o16.2 and human hemogen are orthologous and encode related proteins that differ in size 93

Figure 2 hemogen expression in zebrafish embryos 99

Figure 3 Alternative promoters drive hemogen expression in hematopoietic and nonhematopoietic tissues in zebrafish 103

Figure 4 Conserved elements in the zebrafish hemogen promoter are predicted targets for transcription factors 108

Figure 5 Gata1 binds distal and proximal promoter elements to regulate hemogen expression in zebrafish 111

Figure 6 Promoter elements have distinct roles in driving hematopoietic, renal, and testicular expression of hemogen in transgenic Tg(hemgn:mCherry) zebrafish 115

Figure 7 Morpholino targeting of hemogen inhibits erythropoiesis in embryonic zebrafish 118

Figure 8 CRISPR/Cas9 mutagenesis of the third exon of zebrafish hemogen impairs primitive and definitive erythropoiesis 124

Chapter 3

Figure 1 The erythroid gene hemogen is mutated in Antarctic icefishes 161

Figure 2 hemogen is expressed in hematopoietic, renal, and neural tissues in red-blooded notothenioids 164

Figure 3 A truncated isoform of hemogen is highly expressed in icefishes and is translated 169

Figure 4 Overexpression of icefish hemogen in zebrafish blocks primitive erythropoiesis 172

Figure 5 A novel MABP-containing protein (mabpcp) is an RBC-specific gene that was lost in icefishes 174

Figure 6 Modeling a truncated cd33-related Siglec (cd33rSig) from icefishes in mutant zebrafish 180

List of Tables

Chapter 1

Table 1 Hematopoietic genes are differentially expressed in the icefish head kidney 84

Table 2 GO enrichment of genes under different selective pressures in two red- and two white-blooded notothenioids 84

Table 3 GO enrichment for genes with deleterious substitutions found in two white-blooded icefishes but not in two red-blooded notothenioids 85

Table 4 Table of primers 85

Table 5 Icefishes have predicted deleterious substitutions in targets of human diseases 86

Chapter 2

Table S1 Sequences of primer and oligonucleotides used in experiments 156

Chapter 3

Table S1 Primer Sequences 194

Table S2 Oligos for CRISPR gRNA template 195


List of Symbols

AGM Aorta gonad mesonephros

ALL Acute lymphocytic leukemia

AML Acute myeloid leukemia

ATP Adenosine triphosphate

B-ALL B-cell acute lymphoblastic leukemia

BWS Beckwith-Wiedemann syndrome

bZIP Basic leucine zipper domain

CBF-AML Core binding factor acute myeloid leukemia

CC Coiled coil domain

CHT Caudal hematopoietic tissue

Ce Corpus cerebelli

CL-XPosure Clear-blue X-ray film

CLL Chronic lymphocytic leukemia

CML Chronic myeloid leukemia

CMP Common myeloid progenitor

COFS Cerebro oculo facio skeletal syndrome

CRISPR Clustered Regularly Interspaced Short Palindromic Repeats

CT domain C-terminal cystine knot-like domain

CT-ZF C-terminal zinc finger

CV Caudal vein

Cyto Cytoplasmic domain

C2H2 Cys2-His2 zinc finger


DA Dorsal aorta

DBA Diamond-Blackfan anemia

DED1 Death effector domain

DLBCL Diffuse large B-cell lymphoma

DS-AMKL Acute megakaryoblastic leukemia in Down syndrome

ECL Enhanced chemiluminescence

EGFP Enhanced green fluorescent protein

G Glomerulus

HCP Hereditary Coproporphyria

HDR Homology directed repair

HK Head kidney

HNSCC Head and neck squamous cell carcinoma

HRP Horseradish peroxidase

HS Hereditary spherocytic anemia

ICM Intermediate cell mass

Ig Immunoglobulin

IgG Immunoglobulin G

ITIM Immunoreceptor tyrosine-based inhibition motif

LDS Lithium dodecyl sulfate

MAE Myoclonic astatic epilepsy

MABP MVB12-associated beta prism domain

MHB Midbrain-hindbrain-boundary

MO Medulla oblongata


MPN Myeloproliferative neoplasms

NH2-terminus Amino-terminus

NHEJ Non-homologous end joining

NLS Nuclear localization signal

PBI Peripheral blood island

PD Pronephric ducts

PHD Plant homeodomain

ProE Proerythroblast

PTK Protein tyrosine kinase

PVDF Polyvinylidene fluoride

RING-finger Really interesting new gene finger domain

SCN Severe congenital neutropenia

SDS-PAGE Sodium dodecyl sulfate-polyacrylamide gel electrophoresis

Se Sertoli cells

Siglec Sialic acid-binding immunoglobulin-type lectin

SNF Sucrose non-fermentable

ST Seminiferous tubules

TAD Transactivation domain

TALEN Transcription activator-like effector nuclease

T-ALL T-cell acute lymphoblastic leukemia

T-CLL T-cell chronic lymphoblastic leukemia

TBST Tris-buffered saline and Tween-20

TFBS Transcription factor binding site


TK Trunk kidney

Tr Transmembrane domain

UDP Uridine diphosphate

WISH Whole-mount in situ hybridization

Zn Zinc


List of Genes

Add1 Adducin-1

Add2 Adducin-2

AKT2 RAC-beta serine/threonine protein kinase 2

Ank1 Ankyrin-1

Anxa2 Annexin A2

Asxl2 Additional sex combs like 2, transcriptional regulator

Band3/Slc4a1 Band 3 anion transport protein

Bcl11a B-cell lymphoma/leukemia 11a

Bcl2l1/BclxL BCL2-like 1 gene/B-cell lymphoma-extra large 1 gene

Blvrb Biliverdin reductase B

Brca1 Breast cancer 1

Card11 Caspase recruitment domain family member 11

Casp8 Caspase 8

Cdkn1c Cyclin dependent kinase inhibitor 1c

Cd33rSig CD33 related siglec

Chd2 Chromodomain helicase DNA binding protein 2

Cox15 Cytochrome C oxidase assembly homolog

Cpox Coproporphyrinogen oxidase

Edag Erythroid differentiation associated gene

Eml1 Echinoderm microtubule associated protein like 1

Epor Erythropoietin receptor

Ercc1 Excision repair 1


Ero1lα Endoplasmic reticulum oxidoreductase 1 alpha

Fam161al Family with sequence similarity 161, member A-like

Fech Ferrochelatase

Fes Feline sarcoma oncogene

Fgl1 Fibrinogen-like protein 1

Flt1 Fms related tyrosine kinase 1

Foxo1 Forkhead box protein O1

Foxp1 Forkhead box protein P1

Gata1 GATA-binding factor 1

Gapdh Glyceraldehyde 3-phosphate dehydrogenase

Gfi1 Growth factor independent 1 transcriptional repressor

Gfi1b Growth factor independent 1 transcriptional repressor B

G6PD Glucose-6-phosphate dehydrogenase

Hbba Hemoglobin, beta adult major chain

Hbbe1 Hemoglobin beta embryonic 1.1

Hdac1b Histone deacetylase 1b

Hemgn Hemogen

Hpx Hemopexin

Ikzf1 Ikaros family zinc finger 1

Ikzf2 Ikaros family zinc finger 2

Il7r Interleukin-7 receptor

Klf1 Krüppel-like factor 1

Ldb1 LIM domain-binding protein 1


Lmo2 LIM domain only 2

Mabpcp1/Dkey:30j10.5 MABP-containing protein 1

Mate1 Multidrug and toxin extrusion protein

Nfe2 Nuclear factor, erythroid derived 2

Nfe2l1 Nuclear factor, erythroid derived 2 like 1

Ntrk1 Neurotrophic receptor tyrosine kinase 1

Numa1 Nuclear mitotic apparatus protein 1

Pphln1 Periphilin 1

Pu.1/Spi1 Spi-1 proto-oncogene

P300 Histone acetyltransferase p300

Rela NFκB p65 subunit/RELA proto-oncogene

Runx1 Runt related transcription factor 1

Sgk1 Serum and glucocorticoid-regulated kinase 1

Slc25a39 Solute carrier family 25 member 39

Sptb Spectrin beta, erythrocytic

Stat5b Signal transducer and activator of transcription 5B

Tal1 T-cell acute lymphocytic leukemia protein 1

Tf Transferrin

Tfrc Transferrin receptor C

Tgfβ Transforming growth factor beta 1

Trim16l Tripartite motif containing 16 like

Ugt1a1 UDP glucuronosyltransferase 1 family, polypeptide A1

Zfp64 Zinc finger protein 64


Introduction

1. Ontogeny of hematopoiesis in vertebrates

Hematopoiesis, the production of all blood lineages from pluripotent hematopoietic stem cells, is a complex developmental process essential to vertebrate life. Comparison of hematopoiesis in different models (e.g. humans, chicken, mice, zebrafish) has revealed fundamental features and key molecular regulators of blood development and disease (Detrich, 1999; Paw and Zon, 2000; Zon, 1995).

Several hematopoietic processes and genes originated early in evolution and are conserved in invertebrates, including both chordates (Pascual-Anaya et al., 2013) and non-chordates (Evans et al., 2003). In all vertebrates, blood production occurs in two distinct waves, termed primitive and definitive hematopoiesis (Maximow, 1909).

Primitive hematopoiesis occurs in the yolk sac blood islands in embryos of all vertebrates except fishes (Zon, 1995). In this first wave, committed progenitors differentiate into nucleated erythrocytes that supply the early embryo with oxygen as they complete maturation in circulation (Palis, 2014). Definitive hematopoiesis is defined as the production of blood lineages from pluripotent hematopoietic stem cells (HSCs), which originate in the aorta gonad mesonephros (AGM). In humans and other mammals, HSCs seed and develop in the fetal liver for a short time before colonizing the bone marrow and thymus (Palis, 2014).

Even though many aspects of blood development are highly conserved across vertebrate taxa, the evolution of diverse forms of was coincident with alterations to the ontogeny of hematopoiesis. In all vertebrates, primitive hematopoiesis initiates in

lateral plate mesoderm (LPM) from cells called “hemangioblasts,” which are capable of forming both blood and vasculature (Sabin, 2002). In sharks and in some teleosts, primitive hematopoiesis is extraembryonic and commences directly on the yolk sac (Zon, 1995). However, in most teleost fishes, the first wave of hematopoiesis is intraembryonic (Oellacher, 1872). The zebrafish has provided an excellent model to study teleost blood development (Fig. 1A). Primitive erythroid progenitors are produced in the intermediate cell mass (ICM) (Detrich et al., 1995). Subsequently, a committed population of definitive erythroid-myeloid progenitors (EMPs) is found in the peripheral blood island (PBI), intermixed with primitive erythrocytes (Bertrand et al., 2007a; Detrich et al., 1995). Concurrently, primitive myelopoiesis initiates from anterior lateral plate mesoderm (ALM) (Bennett et al., 2001). Primitive erythrocytes migrate from the ICM and are pulled into circulation by the heart through the ducts of Cuvier (Detrich et al., 1995). At 30 hpf, the first definitive HSCs are produced from hemogenic endothelial cells from the ventral wall of the dorsal aorta in a region called the aorta gonad mesonephros (AGM) (Thompson et al., 1998). The HSCs migrate via circulation to colonize a temporary site of definitive hematopoiesis in the caudal hematopoietic tissue (CHT) and to seed the thymus, the major site of adult T-cell production (Murayama et al., 2006). From the CHT, HSCs “crawl” along the pronephric ducts to colonize the pronephric head kidney, the site analogous to human bone marrow (Bertrand et al., 2008). Here, lymphoid and myeloid cells are produced within the hematopoietic stem cell niche between renal tubules. The primary juvenile and adult hematopoietic organs (kidney, spleen, liver, bone marrow) vary between different species of amphibians, reptiles, and fishes (Zon, 1995).



2. Regulators of erythroid differentiation and function

The path to becoming a red blood cell is determined by signaling molecules, transcription factors, structural proteins, and other factors. Stages of erythroid differentiation (progenitors in Fig. 1B) are generally characterized by condensation of the nucleus, a decrease in cell size, and an accumulation of hemoglobin during terminal differentiation. Each stage is also defined by a unique gene expression profile. As stem cells mature and lose their pluripotency, factors involved in self-renewal (e.g. myb) are down-regulated while factors that are critical for heme synthesis and erythroid differentiation (e.g. gata1) are up-regulated (Hattangadi et al., 2011). Extrinsic factors within erythroblast islands play an early role in the activation of erythroid differentiation; these include cytokines and interactions with receptors on neighboring cells or with the extracellular matrix. The major activator of erythropoiesis is the hormone erythropoietin (Epo), which binds to the erythropoietin receptor (EPOR) on BFU-E and CFU-E progenitors (D'Andrea et al., 1989; Krantz, 1991; Lin et al., 1996) and thereby signals up-regulation of erythroid genes and anti-apoptotic factors (Dolznig et al., 2002).

Erythroid transcription factors are also expressed and interact within complexes to regulate cell differentiation. The Gata1 transcription factor, a master regulator of erythropoiesis, colocalizes with other nuclear proteins (e.g. Scl/Tal1, Ldb1, Lmo2, Klf1) at promoters and enhancers to activate erythroid gene transcription via long-range chromatin interactions (Love et al., 2014; Tijssen et al., 2011). Gata1 also functions as a transcriptional activator or repressor by recruiting the Hdac1-containing NuRD (Nucleosome Remodeling Deacetylase) complex (Miccio et al., 2010). During terminal differentiation, these cofactors up-regulate expression of erythroid genes including anti-apoptotic molecules like Bcl-xL, which protects erythroid cells from BAX-induced cell death (Dolznig et al., 2002; Rhodes et al., 2005).

3. Gene editing and zebrafish mutant models

The zebrafish has provided an excellent model to study the genetic regulators of hematopoiesis (Kafina and Paw, 2018). Zebrafish spawn 100-200 translucent eggs whose development may be easily studied following fertilization. The generation of transgenic zebrafish using the Tol2 system has allowed researchers to track gene expression and tissue development using fluorescent reporters (Kawakami, 2016).

Forward mutant screens have used chemical mutagens like N-ethyl-N-nitrosourea (ENU) or retroviral insertion of DNA to generate zebrafish lines with developmental abnormalities (Detrich, 1999; Frame, 2017). A series of zebrafish blood mutants were found to have defective erythropoiesis (e.g. vlad tepes m651, riesling, zinfandel) (Ransom et al., 1996; Weinstein et al., 1996) and impaired HSC specification (e.g. hi1618 and hi2335) (Amsterdam et al., 2004). These methods have generally not been site-specific.

The advent of CRISPR technologies (Clustered Regularly Interspaced Short Palindromic Repeats) provides a highly efficient method for targeted gene disruption using a small guide RNA (sgRNA) and the Cas9 endonuclease (Ata, 2016; Jinek et al., 2012). Indeed, this technology permits precise modification of genes of interest to generate mutant zebrafish lines that model human diseases (Frame, 2017). Further improvements of these gene editing technologies have enhanced our ability to manipulate and study the genome (Carroll, 2017; Sertori et al., 2016).


4. Defective erythropoiesis in notothenioid fishes

The only vertebrates that do not produce erythrocytes are the family of Antarctic icefishes (Channichthyidae), a monophyletic clade of the suborder Notothenioidei (Cocca et al., 1995b; di Prisco et al., 2002; Near et al., 2006b; Zhao et al., 1998b). Notothenioid fishes adaptively radiated in the Southern Ocean ~34 million years ago as it began cooling to the freezing point of seawater (–1.86 °C) (Colombo et al., 2015; Matschiner et al., 2011; Rutschmann et al., 2011) and other taxa became locally extinct (Eastman, 1993). The decline in temperature was likely caused by the separation of the Antarctic continent and formation of the Antarctic Circumpolar Current

(Kennett, 1977). Today, notothenioids are the most speciose of , comprising ~130 species grouped into 8 families (Eastman and Eakin, 2000; Near et al.,

2003).

Notothenioids share several synapomorphic traits that allowed for their evolution in a harsh environment. Almost all members of the clade possess antifreeze glycoproteins (AFGPs) that keep their blood from freezing (Chen et al., 1997; Deng et al., 2010). Cold adaptation may also explain the convergent evolution of different XY sex-chromosome systems in several Antarctic notothenioids, which may avoid skewed sex ratios otherwise induced by temperature-dependent sex determination (Ghigliotti et al., 2016). All notothenioids lack a gas bladder, the organ through which most teleosts regulate their buoyancy (DeVries and Eastman, 1978). Nevertheless, the evolution of decreased bone mineralization and accumulation of lipids to enhance buoyancy facilitated the radiation of notothenioid species to occupy diverse niches in the water column (Albertson et al., 2010). All notothenioids display hematopoietic phenotypes with reduced hematocrits and hemoglobin concentrations compared to temperate fishes (Wells et al., 1980). The complete loss of red blood cells by Antarctic icefishes (Fig. 2) provides a unique opportunity to study the genetic regulators of erythropoiesis.



Chapter 1: Morph-biased gene expression and sequence divergence typifies disease-like traits of Antarctic icefishes

Key words: icefish, Antarctic, erythropoiesis, anemia, mutant, gene expression


Abstract

The family of white-blooded Antarctic icefishes (Channichthyidae) displays several unusual traits that are reminiscent of human diseases, most notably their severe anemia. Because most molecular-genetic pathways are shared among vertebrates, the mutations that cause the icefish traits may affect the same pathways that are targeted in human diseases. Here I performed a comparative transcriptomics study to identify changes in gene expression and coding mutations that are morph-biased for red- and white-blooded notothenioid fishes. I show that both the profoundly anemic icefishes and their red-blooded sister clade, the dragonfishes, possess predicted deleterious mutations in genes that are associated with hemolytic anemias in humans (e.g. g6pd, sptb). However, erythropoiesis in icefishes appears to be stalled early in erythroid differentiation prior to potential hemolysis. I hypothesize that this may be an adaptation to abrogate the production of erythrocytes in an environment, the cold, oxygen-rich Southern Ocean, in which their utility is marginal. Moreover, I propose that the icefishes have shut down terminal erythroid differentiation through decreased expression of positive regulators of erythroid differentiation and increased expression of pluripotency markers typically expressed in leukemias. I show that mutations in the transcription factor hemogen, combined with overexpression of hdac1b and decreased expression of p300b, are likely to cause an imbalance in regulatory acetylation of Gata1 that downregulates its activity in icefish hematopoietic tissues. Together, these changes suggest that Antarctic notothenioids have evolved an intricate repression of erythropoiesis.


Introduction

Fish are proven models for studying blood development as they share the same set of hematopoietic cell lineages as mammals. One exception is the family of Antarctic icefishes (Channichthyidae), which is the only vertebrate clade that has lost the ability to produce red blood cells (Cocca et al., 1995a; di Prisco et al., 2002; Near et al., 2006b; Zhao et al., 1998b). Thus, icefishes provide a natural mutant model for human anemias (Albertson et al., 2009).

The icefish clade belongs to the Antarctic notothenioid suborder, which radiated adaptively as the Southern Ocean cooled, beginning ~34 million years ago (Mya), to the freezing point of seawater (–1.86 °C) that it has today; other fish groups became locally extinct (Eastman, 1993). The notothenioid radiation produced ~136 species belonging to eight recognized families, among which there are highly divergent phenotypes (Colombo et al., 2015; Matschiner et al., 2015).

The 16 species of Antarctic icefishes are unique among vertebrates in that they neither produce the oxygen-carrying pigment hemoglobin nor do they produce typical mature erythrocytes (Cocca et al., 1995b; di Prisco et al., 2002; Near et al., 2006b; Zhao et al., 1998b). The high oxygen content in the Southern Ocean facilitated the loss of erythrocytes in icefishes, but their severe anemia was clearly disaptive (Montgomery and Clements, 2000), as the icefish co-evolved expanded hearts and highly vascularized tissues, possibly as a consequence of elevated systemic nitric oxide (NO) levels (Sidell and O'Brien, 2006). Red-blooded Antarctic notothenioids also have decreased hematocrits and reduced hemoglobin concentrations compared to temperate teleost species (Eastman, 1993; Wells et al., 1980). There is evidence for temperature-sensitive phenotypic plasticity of erythropoiesis in temperate teleosts; erythropoiesis is repressed by cold exposure (Kulkeaw et al., 2010; Maekawa et al., 2012). Thus, genetic fixation of cold-induced anemia in ancestral notothenioids may be an example of West-Eberhard’s theory of “Increased genetic divergence due to phenotype fixation” (West-Eberhard, 1989, 2005).

Significant mutations in coding sequences, including the deletion of globin genes, have probably made permanent the anemia of icefishes (Cocca et al., 1995b; Near et al., 2006b). However, one cannot rule out mutations of erythroid gene regulatory elements as causes of icefish anemia; indeed, such changes might have initiated red cell loss. The relaxed selection pressure accompanying regulatory mutations may explain the high mutation rates observed in morph-biased genes when they become expressed below a functional level (Helantera and Uller, 2014; Leichty et al., 2012). Thus, reduction of hemoglobin levels, as seen in more derived notothenioid clades [i.e., the family Bathydraconidae (dragonfishes)] due to deletion of globin gene regulatory elements, may have led ultimately to the evolutionary loss of the globin locus in Antarctic icefishes (Lau et al., 2012; Near et al., 2006b; Zhao et al., 1998b). However, the loss of β-globin may be just one of many erythroid-specific gene mutations that occurred in icefishes.

In this study, I compare multi-tissue, RNA-Seq transcriptomes (Berthelot et al., 2018. Manuscript in preparation) to interrogate morph-biased gene expression and coding sequence divergence between the derived sister lineages of dragonfishes (Bathydraconidae) and icefishes (Channichthyidae). The goal was to discover the genetic determinants of icefish traits, specifically changes to hematopoietic genes that may contribute to their anemia. The analysis revealed tissue-specific, differential expression for genetic pathways that regulate development of blood, brain, muscle, gonad and kidney. In hematopoietic tissues, decreased expression was observed for both well-known and previously uncharacterized erythroid genes; several of these genes contained predicted deleterious substitutions or frameshift mutations. I found that the dragonfish, Parachaenichthys charcoti, was a natural mutant model for hereditary spherocytic anemia. The icefish, Chaenocephalus aceratus, has been shown to be blocked in erythroid differentiation (Yergeau et al., 2005; Yergeau et al., 2006. Manuscript in preparation). I suggest that the block to differentiation may be caused by increased expression of pluripotency factors and decreased expression of erythroid differentiators. Specifically, the block in erythroid differentiation may be caused by increased expression of hdac1b and by deleterious mutations in the interaction domains of Hemogen, P300b, and Gata1, which would together promote deacetylation and deactivation of Gata1. Together, these changes show that notothenioid evolution led ultimately to an intricate repression of the erythropoietic pathway.

Results

Erythrocyte morphology in notothenioid fishes

All Antarctic notothenioid fishes are anemic, with reduced hematocrits and low hemoglobin concentrations, compared to temperate fishes (Wells et al., 1980). This is most apparent in dragonfishes (Bathydraconidae), the sister lineage to icefishes, which display a more severe anemia (Kunzmann et al., 1991) than other red-blooded notothenioids. Yet, very little is known about the process of erythroid differentiation in notothenioids. Two icefishes, C. aceratus and Dacodraco hunteri, were proposed to have very rare, senile erythrocytes (Barber et al., 1981). Subsequently, it was discovered that C. aceratus produces erythroid progenitors but shows a clear block to terminal erythroid differentiation (Yergeau et al., 2005; Yergeau et al., 2006. Manuscript in preparation). Nonetheless, the blood cell phenotypes of most notothenioid species have not been characterized.

I examined the morphology and frequency of hematopoietic cell types in peripheral blood smears from six species from three families of Antarctic notothenioid fishes. The red blood cells of two nototheniids, Notothenia coriiceps and Gobionotothen gibberifrons, were oval shaped, measuring 11.7±0.30 µm (n = 12) and 12.1±0.33 µm (n = 12) on the longest axis, and were morphologically similar to erythrocytes seen in temperate teleosts (Fig. 1A,B). Strikingly, erythrocytes of the dragonfish, P. charcoti, were spherical in shape and significantly smaller (mean diameter 8.7±0.19 µm; n = 14; Student’s t test, P = 6.1E-06) than those of N. coriiceps. They also had a reduced surface area of 48.7±3.1 µm² compared to the oval erythrocytes from N. coriiceps at 77.6±4.1 µm² (Student’s t test, P = 0.003, n = 6) (Fig. 1C). Interestingly, the spherocytic erythrocytes of P. charcoti were morphologically similar to the erythrocyte-like cells that have been described in the icefish Channichthys rhinoceratus (Hureau, 1966; Spillman and Hureau, 1967). Spherocytic morphology of erythrocytes is typically caused by defects in erythroid membrane cytoskeletal proteins.
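The size comparisons above rest on two-sample Student's t tests. As a minimal sketch of that calculation, assuming the classical pooled-variance form of the test, the Python below computes the t statistic and degrees of freedom from per-cell measurements. The function name `students_t` and the measurement values are hypothetical placeholders for illustration; they are not data from this study, and the software actually used for the reported tests is not specified here.

```python
from math import sqrt
from statistics import mean, variance


def students_t(sample_a, sample_b):
    """Return (t, df) for a two-sample Student's t test with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two measurements")
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical long-axis diameters in µm (illustrative values only).
oval_cells = [11.2, 12.4, 11.9, 11.5, 12.0]
spherical_cells = [8.4, 9.1, 8.8, 8.5, 8.9]

t_stat, df = students_t(oval_cells, spherical_cells)
print(f"t = {t_stat:.2f}, df = {df}")
```

The P value then follows from the t distribution with `df` degrees of freedom, which a statistics package would normally supply.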

Among the icefishes that I analyzed, blood cell morphologies and frequencies varied between species (Fig. 1D-F). Mature erythrocytes were not apparent in the peripheral blood of Pseudochaenichthys georgianus (Ps. georgianus) or C. aceratus, but there were a number of circulating erythroblasts in the blood of C. aceratus (5.7% of peripheral blood cells, n = 72, Fig. 1E). By comparison, the blood of N. coriiceps did not contain circulating erythroblasts, and reticulocytes were present at a low frequency (1.8%, n = 111, Fig. 1A). The peripheral blood of Champsocephalus gunnari and Chionodraco rastrospinosus contained abundant myeloid cell-types (56.3%, n = 98, and 19.7%, n = 65, respectively; Fig. 1). Some myeloid cell-types in C. rastrospinosus had clear cytoplasm and condensed nuclei in contrast to the polymorphonuclear leukocytes of C. aceratus (Fig. 1D,E) and were similar to the putative erythrocytes of the icefish, C. rhinoceratus (Hureau, 1966; Spillman and Hureau, 1967). Thus, blood cell morphologies are highly variable among notothenioid fishes, even between species within the icefish clade.



Figure 1. Peripheral blood smears from Antarctic notothenioid fishes. Light micrographs of Giemsa stained blood cells from N. coriiceps (A), G. gibberifrons (B), P. charcoti (C), C. rastrospinosus (D), C. aceratus (E), and C. gunnari (F). Note the oval erythrocytes in two nototheniids (A,B) and spherocytic erythrocytes of the dragonfish (C,D). Abbreviations: E, eosinophil; Er, erythrocyte; L, lymphocyte; Mon, monocyte; N, neutrophil; O, orthochromatophilic normoblast; ProE, proerythroblast; T, thrombocyte. Scale bars = 10 µm (A-F).


Comparative transcriptomics reveals tissue-specific, differentially expressed genes between an icefish and dragonfish

To identify genes that are differentially expressed in icefish tissues, I performed an in silico comparison of RNA-Seq expression profiles between the transcriptomes of the icefish, Ps. georgianus, and the red-blooded dragonfish, P. charcoti (Berthelot et al., 2018. Manuscript in preparation). More than 18,700 orthologous genes were identified in the multi-tissue transcriptomes of these species using OMA stand-alone and a pipeline that has been described previously (Altenhoff et al., 2013; Altenhoff et al., 2011; Sharma et al., 2014). For each tissue, the TPM (transcripts per million) normalized expression values were strongly correlated between tissue replicates within each species (0.78 < R < 0.98). Expression was correlated for most interspecies comparisons (0.51 < R < 0.92) except at sites of hematopoiesis and in the liver (Fig. S1-2). Differential expression (DE) analysis was conducted on the orthologs using edgeR v3.10.2 to normalize and detect significant differences between read counts in each tissue from Ps. georgianus and P. charcoti with 2-4 biological replicates per sample. From the list of 18,781 orthologs, differentially expressed transcripts (significance criterion P ≤ 1E-05, FDR ≤ 0.001) were identified in brain (2,005), head kidney (2,317), liver (1,960), ovary (784), spleen (3,108), trunk kidney (2,688), pectoral muscle (2,129), white muscle (1,447), and heart ventricle (2,005). Significant differential expression of 295 genes (39 up-regulated, 256 down-regulated) was observed in every tissue comparison. Tissue-wide down-regulation of a gene may hint at a significant genomic alteration or it may occur due to an incorrect orthology call. Therefore, the orthologies for specific genes of interest were confirmed by genomic synteny and/or by mapping to established zebrafish and stickleback orthologs. Subsequently, I performed hierarchical clustering of TPM-normalized expression values for DE genes across tissues from both species (Fig. 2). This facilitated the isolation of specific clusters of DE genes (cut at 56% height of the dendrogram) with the highest expression in each tissue. Tissue-specific genes were differentially expressed in brain (432), head kidney (255), liver (102), ovary (138), spleen (323), trunk kidney (402), pectoral muscle (139), white muscle (56), and heart ventricle (171). In the head kidney, 152 DE genes were determined to be blood-specific genes.
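The two-part significance criterion used above (P ≤ 1E-05, FDR ≤ 0.001) can be illustrated with a minimal sketch. It assumes that the FDR values are Benjamini-Hochberg adjusted P-values, which is the default adjustment reported by edgeR; the function names `benjamini_hochberg` and `call_de` and the example P-values are hypothetical and are not taken from the analysis itself.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that each adjusted value is
    # the minimum of p * m / rank over all equal-or-larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_de(p_values, p_cut=1e-5, fdr_cut=1e-3):
    """Flag genes that pass both the raw P-value and the FDR threshold."""
    fdr = benjamini_hochberg(p_values)
    return [p <= p_cut and q <= fdr_cut for p, q in zip(p_values, fdr)]


if __name__ == "__main__":
    # Hypothetical exact-test P-values for four orthologs in one tissue.
    example = [1e-8, 0.5, 2e-6, 1e-4]
    print(call_de(example))
```

In this sketch a gene is called differentially expressed only when both thresholds are met, so a gene with a small adjusted value but a raw P-value above 1E-05 is still excluded.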


Figure 2. Multi-tissue expression profile heatmap of genes that are differentially expressed in head kidney from Ps. georgianus and P. charcoti. Differentially expressed genes were identified in the head kidney RNA-Seq transcriptomes from Ps. georgianus and P. charcoti (Fisher’s exact test, P < 1E-05, FDR < 0.001). For genes that are differentially expressed in the icefish head kidney, the relative transcript abundances in each tissue from both species are shown as the log2 transformed values of transcripts per million (TPM). Hierarchical clustering identified groups of genes with similar expression profiles. Tissue-specific clusters of genes with the highest expression in each tissue were isolated by cutting the dendrogram at a height of 56% (dashed line).

Abbreviations: H. Kidney, head kidney; T. Kidney, trunk kidney; Pectoral, pectoral muscle; W. muscle, white muscle.


Gene ontology enrichment of DE genes highlights tissue-specific molecular processes that underlie icefish phenotypes

The differential expression of molecular processes in each tissue between P. charcoti and Ps. georgianus may represent lineage-specific adaptations to temperature, oxidative stress, and/or functional loss of red blood cells. To evaluate these possibilities, gene ontology (GO) enrichment was used to assess the tissue-specific biological functions of each DE gene cluster and to identify specific genetic pathways that may be unique to icefish development (Fig. 3).
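The enrichment test behind this kind of GO analysis can be sketched as a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. This is only an illustration of the statistic: the function name `enrichment_p` and the gene counts in the example are hypothetical, and STRING applies its own background set and multiple-testing correction.

```python
from math import comb


def enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    k: genes in the cluster that carry the GO term
    n: genes in the cluster
    K: genes in the background that carry the GO term
    N: genes in the background
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of observing k or more annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 12 of 200 cluster genes carry a term that
    # annotates 150 of 18,000 background genes.
    print(enrichment_p(12, 200, 150, 18000))
```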

At sites of hematopoiesis, in the head kidney (n = 234, Fig. 3A) and spleen (n = 296, Fig. 3B), many of the genetic pathways that control cell survival, proliferation and differentiation were enriched. More genes involved in the immune system were differentially expressed in the spleen (n = 68; sum of up- and downregulated genes) compared to head kidney (n = 27), particularly for genes involved in lymphopoiesis. In both hematopoietic tissues, the widespread decrease in expression for erythroid genes highlights the loss of mature erythrocytes in icefishes (Fig. S3).

With the highest number of tissue-specific DE genes (n = 398), the icefish brain primarily exhibited down-regulation of regulators of nervous system development and function (Fig. 3C). Decreased expression was observed for several factors involved in glutamate receptor signaling (e.g. GRM8, GRIK5, GRIA3, GRIA1), which is consistent with the contraction of this gene family observed in notothenioids (Shin et al., 2014). Reduced glutamate signaling might inhibit neuronal cell function but may represent an adaptation to prevent excessive generation of reactive oxygen species (ROS) (Reynolds and Hastings, 1995; Willard and Koochekpour, 2013).

In the trunk kidney (n = 372), gene expression changes were observed for several metabolic processes (Fig. 3D). For example, the differential expression of many genes involved in lipid metabolism (n = 37, e.g. Fabp1) is consistent with the elevated polyunsaturated fatty acid (PUFA) levels in icefish mitochondrial membranes (O'Brien and Mueller, 2010).

Tissue-specific DE genes in pectoral red muscle (n = 128, Fig. 3E) and in trunk white muscle (n = 50, Fig. 3F) were involved in striated muscle development and function. In most cases, the same processes were strictly up-regulated in pectoral muscle of the icefish but down-regulated in white muscle compared to the dragonfish. This may highlight the increased hypertrophy and loss of hyperplasia in icefish pectoral muscle (Archer and Johnston, 1987). Icefishes generally use their pectoral muscles for labriform swimming (Archer and Johnston, 1987), whereas Parachaenichthys species use sub-carangiform swimming and have a heavily muscled trunk (Kuhn et al., 2010). More genes encoding mitochondrial proteins were differentially expressed in icefish pectoral muscle, reflecting its higher concentration of mitochondria (Archer and Johnston, 1991; Lin et al., 1974).


Figure 3. Gene ontology (GO) enrichment for tissue-specific, differentially expressed genes between Ps. georgianus and P. charcoti. Enriched GO groups were identified using STRING (Fisher’s exact test, P < 0.05, FDR < 0.01). Graphs show numbers of up-regulated (red) and down-regulated (blue) genes in Ps. georgianus head kidney (A), spleen (B), brain (C), trunk kidney (D), pectoral muscle (E), and white muscle (F) for different biological processes.


Interaction network for differentially expressed, tissue-specific hematopoietic regulators in the icefish head kidney

To identify the pathways that control blood development in icefishes, I created a gene association network for tissue-specific, differentially expressed genes in the icefish head kidney, the major site of adult definitive hematopoiesis in teleosts (Fig. 4). The network was created with STRING (Jensen et al., 2009) using annotations of the human orthologs (see Methods). For genes of interest, orthology was verified by comparative synteny of the sequenced genomes for N. coriiceps and H. sapiens. In the association network, K-means clustering (n = 11) revealed sets of genes that grouped consistently with ten GO biological functions (Fig. 4). The network highlights the loss of expression for groups of genes involved in the erythroid membrane skeleton, heme synthesis, erythroid transcriptional regulation, chromatin regulation, apoptosis, lipid metabolism, and in the regulation of adenylate cyclase activity. It also shows a cluster of signaling molecules, including many with increased expression in the icefish head kidney (Fig. 4).

Central nodes linking these clusters included tspo (25), hdac1 (18), akt (16), rela (15), foxo1 (13), gata1 (13), tk2 (13), tfrc (12), fech (10), and ntrk1 (10), all of which were down-regulated with the exceptions of hdac1 and ntrk1 (Fig. 4). The cluster centering on gata1 highlights the interactions between down-regulated Gata1 transcriptional targets and Gata1 co-factors that cooperate to drive erythropoiesis (Ferreira et al., 2005). Previous studies have emphasized the correlation between node centrality and essential function (Batada et al., 2006; He and Zhang, 2006; Jeong et al., 2001; Raman et al., 2014; Song and Singh, 2013). Thus, differential expression of these central nodes should highlight the major genetic changes that contribute to the hematopoietic defects of icefishes.
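Node centrality of the kind discussed above can be illustrated with a short sketch. It assumes that centrality is measured as node degree (the number of distinct interaction partners), which is one plausible reading of the values in parentheses; the helper names `node_degrees` and `top_hubs` and the edge list are hypothetical and are not the STRING network itself.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count the distinct interaction partners of every node."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(partners) for node, partners in neighbors.items()}


def top_hubs(edges, n=3):
    """Return the n highest-degree nodes, ties broken alphabetically."""
    degrees = node_degrees(edges)
    return sorted(degrees.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


if __name__ == "__main__":
    # Hypothetical undirected associations between erythroid regulators.
    edges = [
        ("gata1", "klf1"),
        ("gata1", "tal1"),
        ("gata1", "hemogen"),
        ("klf1", "tal1"),
        ("klf1", "gata1"),  # duplicate edge, counted once
    ]
    print(top_hubs(edges, n=2))
```

Storing neighbors in sets means that duplicated or reversed edges do not inflate a node's degree.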

Figure 4. Association networks between tissue-specific DE genes in the icefish head kidney showing both up-regulated and down-regulated genes. Gene association networks were created for tissue-specific DE genes in the head kidney with STRING (Jensen et al., 2009) using annotations from the human orthologs. Colors represent K-means clusters of gene nodes (n = 11). Genes were generally clustered by their biological functions.


Decreased expression of genes involved in erythroid differentiation in icefishes

The most obvious phenotype that differentiates the icefishes from other notothenioids is their lack of red blood cells (RBCs, erythrocytes). To distinguish the loss of erythroid-specific genes from early-acting regulators of the hematopoietic stem cell niche, I examined two clusters of down-regulated genes (C1, C2) in the icefish that had strong tissue-specific expression in P. charcoti head kidney or peripheral blood (Fig. 5A,B). While many erythroid genes function in erythroid progenitors of the head kidney (Orkin and Zon, 1997), the high concentration of RBCs in the peripheral blood of P. charcoti allowed me to detect both early and late erythroid markers.

Down-regulated genes that were specific to the head kidney included the hematopoietic stem cell (HSC) markers myb and runx1, both of which are required for definitive, but not primitive, hematopoiesis (Sood et al., 2010; Soza-Ried et al., 2010).

Decreased expression was also observed for several other genes that play critical roles in HSC maintenance and differentiation, including relA/p65 (Stein and Baldwin, 2013) and caspase 3 (Janzen et al., 2006).

Of the down-regulated genes in the icefish head kidney, 149 were blood-specific markers (Fig. 5A,B, Table 1). The decreased expression of many RBC-specific genes reflects the loss of erythrocytes in icefishes and included genes encoding globins, heme biosynthetic enzymes and erythrocyte membrane proteins (n = 27; e.g. hb1, blvrb, band3, band4.1, alas2, fech, add2, ank1, sptb). The list also included erythroid factors that are more highly expressed in immature erythroblasts (Kingsley et al., 2013). These genes regulate erythroid lineage commitment and/or terminal differentiation in the head kidney (n = 51; e.g. gata1, hemogen, klf1, scl/tal1, gfi1b, ldb1, epor, tfrc, tgm2).

Additionally, I identified 31 novel blood-specific genes that have not been previously associated with erythropoiesis (data not shown).


Figure 5. Three tissue-specific clusters of hematopoietic genes are differentially expressed (DE) in the head kidneys of Ps. georgianus and P. charcoti. (A) Heat map of relative gene expression for tissue-specific DE genes in the icefish head kidney. Expression was normalized to transcripts per million (TPM). Three clusters show tissue-specific genes in (C1) dragonfish blood, (C2) dragonfish head kidney, and (C3) icefish head kidney. (B) Expression profiles for genes in each tissue-specific cluster.

Abbreviations: HK, head kidney; TK, trunk kidney; Liv, liver; Pec, pectoral; WM, white muscle.


Increased expression of pluripotency markers highlights mechanisms of erythroid inhibition in icefishes

The decreased expression of many erythroid genes in icefish head kidney is in part caused by the loss of mature erythrocytes in this group. Thus, the genes with increased expression may portray the cell lineages and developmental processes that predominate in the icefish head kidney. My results show that icefish kidney expressed at elevated levels a number of hematopoietic regulatory genes (Figs. 4-5), including cbfb, ntrk1/trka, gas6, and dock1. Several of the overexpressed genes in the icefish head kidney (Table 1) are proto-oncogenes (e.g. flt1/vegfr1, bcr, ntrk1/trka) or leukemia markers (e.g. hdac1b, dock1, gas6) that are frequently associated with leukemogenesis (Bradbury et al., 2005; Collins et al., 1987; Dirks et al., 1999; Lee et al., 2017; Wang et al., 2003).

Signaling pathways that drive hematopoietic proliferation (Van Etten, 2007) showed altered expression in icefish kidney (Fig. 4). Specifically, the interaction network of hematopoietic DE genes was enriched for the PI3K-Akt-mTOR signaling pathway that promotes cell survival and proliferation (Ghosh and Kapur, 2017). This included up-regulated oncogenes like sgk1 (Orlacchio et al., 2017) and down-regulated genes, such as the tumor suppressor foxo1 and others (akt2, casp3, catalase) (Fig. 4).

Aberrant cell signaling promotes carcinogenesis (Martin, 2003) and mutations in regulators of PI3K-Akt-mTOR signaling frequently activate this pathway in leukemias (Fransecky et al., 2015; Park et al., 2010). In agreement with previous findings, I found increased expression of TGF-β signaling molecules (Xu et al., 2015), which may be due to extensive duplications of genes in this pathway in Antarctic notothenioids (Chen et al., 2008). TGF-β signaling has been shown to activate AKT signaling in many normal and leukemic cell types (Drabsch and ten Dijke, 2012; Naka et al., 2010). Thus, erythroid differentiation may stall in icefish due to increased expression of signaling molecules and pluripotency genes that may mark a proliferative cell-type.


Figure 6. Differential expression of hematopoietic regulators is represented by red- and white-blooded notothenioids. Relative expression of hdac1b, p300b, gata1, and spi1b determined by qPCR in head kidneys from two red-blooded (N. coriiceps, P. charcoti) and two white-blooded (C. aceratus, Ps. georgianus) notothenioids. Target gene expression was normalized to beta-actin and error bars represent standard deviation (n.s., not significant; Student’s t test, P > 0.05).


Confirmation of differential expression of hematopoietic regulatory genes across notothenioid clades

Genes with morph-biased expression may show high variation even between individuals within a species (Helantera and Uller, 2014). To assess whether differential expression of hematopoietic genes was a consistent feature of the red- and white-blooded notothenioids, I employed qRT-PCR to examine kidney expression of hematopoietic regulatory genes across four representative notothenioid species: the icefishes Ps. georgianus and C. aceratus, the nototheniid N. coriiceps, and the dragonfish P. charcoti. The head kidneys of both icefishes showed significant down-regulation of gata1a and p300b (Student’s t test, P < 0.05; Fig. 6). By contrast, expression of hdac1b was found to be significantly up-regulated in the head kidneys of both icefishes (Student’s t test, P < 0.05). Expression of the myeloid marker, pu.1/spi1b, did not differ significantly between the four species (Fig. 6), consistent with the comparable numbers of myeloid cells in the head kidneys of red- and white-blooded notothenioids. In the RNA-Seq transcriptome, I showed that p300b was down-regulated in all tissues of Ps. georgianus compared to P. charcoti. In contrast, I found that hdac1b was specifically up-regulated in the icefish head kidney and spleen but not in non-hematopoietic tissues (significance criterion P ≤ 1E-05, FDR ≤ 0.001). These findings suggest that erythropoietic regulatory proteins (e.g., Gata1) may be differentially acetylated, and hence differentially active, in icefish head kidney.
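Relative expression from qRT-PCR, normalized to a reference gene such as beta-actin, is commonly computed with the comparative Ct (2^-ΔΔCt) method. The sketch below assumes that method and equal amplification efficiency for both primer pairs; the text does not state which calculation was used, and the function name `relative_expression` and the Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Fold change of a target gene by the comparative Ct method.

    Each sample is first normalized to the reference gene (delta Ct),
    then expressed relative to the calibrator sample (delta-delta Ct).
    Assumes roughly 100% amplification efficiency for both assays.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


if __name__ == "__main__":
    # Hypothetical Ct values: target gene in a white-blooded sample,
    # relative to a red-blooded calibrator, both normalized to beta-actin.
    fold = relative_expression(26.0, 18.0, 24.0, 18.0)
    print(fold)  # a value below 1 indicates lower expression than the calibrator
```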


Figure 7. Predicted deleterious substitutions and frameshifts in blood genes from icefishes. Provean was used to predict deleterious substitutions (Provean score < -3) (Choi and Chan, 2015) that were shared by three white-blooded icefishes but which did not occur in red-blooded notothenioids. (A) The CD33-related Siglec contained a F753* frameshift mutation that truncated the transmembrane and cytoplasmic regions in the icefishes N. ionah, C. aceratus and Ps. georgianus. Numbers indicate length in amino acids. (B) Predicted deleterious substitutions in icefish Glucose-6-phosphate dehydrogenase (G6pd). Lines represent the NADP binding sites. White boxes indicate the dimer interface. Capital letters represent beta-turns and lowercase letters are alpha helices. The icefish residue highlighted in yellow is mutated in human G6PD deficiency. (C) Predicted deleterious substitutions in the Transferrin receptor (Tfrc). (D) Predicted deleterious substitutions in Hemopexin (Hpx). The residue highlighted in yellow is involved in binding heme. Abbreviations: Cyto, cytoplasmic; C2-set, immunoglobulin c2-set (constant) domain; Ig, immunoglobulin-like domain; ITIM, immunoreceptor tyrosine-based inhibitory motif; v-set, immunoglobulin v-set (variable) domain; PA, protease-associated domain; R, Hpx repeat; S, signal peptide; Tr, transmembrane.


Figure 8. Erythroid beta-spectrin is mutated in the dragonfish P. charcoti. Scanning electron micrographs of erythrocytes from (A) N. coriiceps, a nototheniid, and from (B-C) P. charcoti, a dragonfish. (D) Light micrographs of Giemsa-stained triton-insoluble erythrocyte membrane skeletons from N. coriiceps (Nc) and P. charcoti (Pc). Flash frozen blood samples were treated with 1% Triton-X, spread on glass coverslips, and fixed with 4% PFA. (E) Predicted deleterious mutations in dragonfish (bold italic) and icefish (Roman case) Erythroid beta-spectrins. Residues highlighted by yellow boxes are also mutated in hereditary spherocytic anemia in humans. Scale bars = 10 µm (A,B), 5 µm (C,D).


Icefish-specific deleterious substitutions and frameshift mutations occur in common targets of disease

Deleterious point mutations or frameshifts in erythroid-specific functional domains are likely to contribute to the profound anemia of icefishes. I identified frameshift mutations in 16 genes (Table 5) from three icefish species (Neopagetopsis ionah, C. aceratus, Ps. georgianus) after alignment to the orthologs from red-blooded notothenioids (P. charcoti, N. coriiceps, H. antarcticus). One blood-specific gene, a Cd33-related Siglec (Sialic-acid-binding immunoglobulin-like lectin), contained a C-terminal frameshift that removed its cytoplasmic immunoreceptor tyrosine-based inhibition motif (Fig. 7A). Loss of CD33 causes slight erythropoietic defects in mutant mice (Brinkman-Van der Linden et al., 2003).

I identified all nonsynonymous substitutions in 7,049 orthologs that differed between red- and white-blooded notothenioid lineages and then used Provean to predict whether these were potentially deleterious mutations. Genes with potentially deleterious substitutions (Provean score < -3.0) were significantly enriched for hematopoietic factors (n = 11; P < 5.380e-2), including regulators of heme metabolism (n = 6; P < 2.890e-5) and myeloerythroid differentiation (n = 9; P < 1.940e-2) (Table 3, Table 5, Fig. 7). The mutations were found in important functional domains that have been associated with human diseases (Table 5). Mutated residues in G6PD (glucose-6-phosphate dehydrogenase, Fig. 7) and Erythroid beta-spectrin (Fig. 8) are also mutated in hemolytic anemias in humans (Barisic et al., 2005; Landrum et al., 2016).


The dragonfish P. charcoti is a natural mutant model for spherocytic anemia

The red blood cells of the dragonfish, P. charcoti, were morphologically different from erythrocytes seen in other red-blooded notothenioids. I employed scanning electron microscopy to compare erythrocytes from P. charcoti and N. coriiceps. A nuclear bulge was apparent in erythrocytes from N. coriiceps but not from P. charcoti (Fig. 8A-C). This indicated a spherocytic morphology for the dragonfish RBCs, which may result from loss of incorporation of cytoskeletal proteins. Loss of membrane proteins was evidenced by the size difference between triton-insoluble erythroid membrane skeletons from P. charcoti (5.1±0.18 µm, n = 10) and N. coriiceps (12.4±0.72 µm, n = 9) (Student’s t test, P = 1.8E-05; Fig. 8D). In the dragonfish, these features may result from seven deleterious substitutions found in erythroid β-spectrin, including an R1037S mutation in the 7th spectrin repeat, which corresponds to an R1035W substitution (rs143827332) that has been associated with hereditary spherocytic anemia (HS) in humans (Landrum et al., 2016). Accumulation of deleterious substitutions in erythroid β-spectrin from dragonfishes and icefishes may initially have been caused by membrane instability at cold temperatures (Lomako et al., 2015) or may have been caused by relaxed selection on erythrocyte markers as a result of anemia.


Regulators of heme metabolism are under different selection in red- and white-blooded notothenioids

Morph-biased gene expression is associated with increased rates of mutation, due to relaxed selection upon genes that are expressed by few individuals of the population or due to loss of function as a result of neutral selection (Helantera and Uller, 2014; Leichty et al., 2012). To determine the selective pressures on icefish coding sequences, I compared the ratio of nonsynonymous to synonymous substitution rates (ω, dN/dS) between orthologous genes from two red-blooded species (P. charcoti and Harpagifer antarcticus) and two white-blooded species (N. ionah and Ps. georgianus).

To search for genes with variable dN/dS ratios between the notothenioid lineages, I employed a likelihood ratio test (P < 0.05) to compare a 2-ratio and 1-ratio (null) model for each set of gene alignments. Most hematopoietic regulators (e.g. gata1, spi1b/pu.1) were under equally strong purifying selection in red- and white-blooded notothenioids and must be functional in some cell lineages in icefishes (data not shown). However, several erythroid genes had significantly higher dN/dS ratios in icefishes, including three genes (tfrc, hpx, slc25a39) involved in heme metabolism (Likelihood ratio test, P < 0.05; Table 2). The increased nonsynonymous substitution rate of the icefish transferrin receptor illustrates continued genetic drift in a gene that is highly polymorphic in notothenioids (Trinchella et al., 2008). Adaptive changes to hematopoietic pathways may serve to combat the negative side-effects of anemia in icefishes. It has been suggested that stable serotransferrin expression may scavenge free ferric iron (Fe³⁺) in icefish tissues (Kuhn et al., 2016). Likewise, the strong up-regulation of hemopexin (hpx) in the icefish, the significant positive selection acting on its sequence, and putative functional mutations that remove (H36P, H37G, H79R, H333Q, H364A) or introduce (Q132H, Q175H, Y246H, D362H, D395H, N434H) histidine residues that may bind heme indicate that Hemopexin function may be adapted to enhance heme scavenging in the plasma in response to the loss of hemoglobin formation.
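The likelihood ratio test used to compare the 1-ratio (null) and 2-ratio models can be sketched as follows. The sketch assumes that the two models differ by a single free ω parameter, so that the test statistic is compared with a chi-square distribution with one degree of freedom; the function name `lrt_p_value` and the log-likelihood values are hypothetical and do not come from the alignments analyzed here.

```python
from math import erfc, sqrt


def lrt_p_value(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    lnl_null: log-likelihood of the 1-ratio (null) model
    lnl_alt:  log-likelihood of the 2-ratio (alternative) model
    Returns (statistic, P-value) using the chi-square survival function
    with 1 degree of freedom, which equals erfc(sqrt(x / 2)).
    """
    # Clamp at zero in case optimization noise makes the statistic negative.
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return statistic, erfc(sqrt(statistic / 2.0))


if __name__ == "__main__":
    # Hypothetical log-likelihoods for one gene alignment.
    statistic, p = lrt_p_value(-1002.0, -1000.0)
    print(statistic, p, p < 0.05)
```

A gene would be reported as having a lineage-specific dN/dS ratio when the returned P-value falls below 0.05, the threshold stated above.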


Figure 9. Functional mutations occur in the interaction domains of Gata1, Hemogen, and P300. (A) Deleterious mutations were discovered in the interaction domains (brackets) of Gata1, P300, and Hemogen from white-blooded icefishes (N. ionah, Ps. georgianus) but not in red-blooded notothenioids (P. charcoti, H. antarcticus). Deleterious mutations were predicted with Provean (Choi and Chan, 2015). (A) Icefish Gata1 contains a deleterious N319S substitution in the C-terminal zinc finger (C-ZF), which binds Hemogen and P300 (Zheng et al., 2014). (B) Icefish P300 contains three deleterious substitutions in the Gata1 binding region (Blobel et al., 1998), which overlaps with the acetyl transferase domain (spans Br, P, KAT, Z, and CH) (Bordoli et al., 2001). (C) Icefish Hemogen contains a P174fs frameshift mutation that truncates the C-terminal domain that is responsible for binding of P300 (Zheng et al., 2014). (D) Model for molecular repression of Gata1 function and erythroid gene transcription caused by mutations (marked with X) and by dysregulation of gene expression (arrowheads).


Icefish-specific deleterious mutations in the interaction domains of Gata1, Hemogen, and P300b

The top candidate pathways for the block of erythroid differentiation in icefishes involve Gata1, which is considered the master regulator of erythropoiesis in vertebrates (Ferreira et al., 2005; Suzuki et al., 2011). Icefish Gata1 and several of its co-factors (P300, Hemogen, Runx1, Spi1, Gfi1b, Klf1) contained deleterious mutations that may affect Gata1 activity. In both N. ionah and Ps. georgianus, Gata1 contained a single deleterious N319S substitution in the C-terminal zinc finger (C-ZF) (Fig. 9A), a domain that is required for DNA-binding and for promoting erythroid differentiation (Omichinski et al., 1993). The Gata1 C-ZF domain is bound by the erythroid transcription factor, Hemogen (Zheng et al., 2014), and by the histone acetyl-transferases CBP (Creb-binding protein) and P300 (Boyes et al., 1998). Hemogen can recruit P300 to promote acetylation of Gata1 in an immediately adjacent basic domain, leading to enhanced erythroid gene transcription (Zheng et al., 2014).

Previously, I characterized a C-terminal deletion in icefish Hemogen (see Chapter 4), a defect that introduces a frameshift and premature stop codon in some icefish species (Fig. 9A). This frameshift removes a C-terminal transactivation domain motif that may be required for binding of P300. Icefish P300b also contained seven deleterious substitutions, including three in the TAZ2/CH3 domain (Transcription Adaptor putative Zinc Finger/cysteine-histidine), the domain that binds Gata1 (Blobel et al., 1998). Thus, all of the interaction domains of Gata1, Hemogen, and P300b contain predicted deleterious mutations. In contrast, Hdac1b was highly conserved between red- and white-blooded species and did not contain any predicted deleterious mutations. The mutations in Gata1, Hemogen, and P300 may contribute to the differential expression of all identifiable transcriptional targets of Gata1 (n = 161) and Hemogen (n = 367) in the icefish, Ps. georgianus.

The up-regulation of hdac1b and down-regulation of p300b may contribute to a homeostatic imbalance in erythroid-specific acetylation in icefishes. Thus, I employed Western blotting to assess whole-protein acetylation status in the head kidneys of red- and white-blooded notothenioid fishes (Fig. 10). Acetylation of most proteins was comparable in C. aceratus and N. coriiceps. Normal acetylation of most proteins may be compensated by p300 paralogs (e.g. p300a, cbp, cbp-like) in notothenioids. However, changes in acetylation for specific targets of Hdac1b and p300b could not be ruled out. In mice, mutations in the KIX domain of P300 cause severe anemia and erythroid cell defects (Kasper et al., 2002), whereas Hdac1 is inactivated during terminal differentiation by P300 (Yang et al., 2012). Thus, the differential expression of Gata1 targets in icefishes may result from (1) decreased expression of Hemogen, P300b, and Gata1, (2) deleterious mutations in the interaction domains of all three proteins, and (3) overexpression of Hdac1b (Fig. 9B).


Figure 10. Protein acetylation in the head kidneys of red- and white-blooded notothenioids. (A) Protein acetylation was detected in purified protein from head kidneys of N. coriiceps and C. aceratus. Separated proteins were probed with anti-acetylated lysine antibody (Santa Cruz Biotechnology, AKL5C1). Signals were normalized to Ponceau stained bands and calculated as a fold change relative to N. coriiceps. Arrows mark proteins with increased acetylation in the icefish (> 1.5 fold change).


Discussion

Molecular repression of erythropoiesis in Antarctic icefishes

Hemolysis of mature erythrocytes may cause the reduced hematocrits observed in red-blooded Antarctic notothenioids and may have instigated the evolutionary loss of red blood cells in icefishes. The dragonfish, P. charcoti, possesses spherocytic erythrocytes, a feature that may be caused by deleterious mutations that occur in the functional domains of erythroid β-spectrin, including specific residues that are mutated in hereditary spherocytic anemia (HS) in humans. In both dragonfishes and icefishes, the accumulation of deleterious substitutions in β-spectrin may have been caused by cold-induced membrane instability (Lomako et al., 2015) or by relaxed selection due to the loss of red blood cell function.

Icefishes may have evolved molecular repression of erythroid differentiation to avoid the consequences of hemolytic anemias. I identified changes in expression of hematopoietic regulators in icefishes, including overexpression of pluripotency genes and decreased expression of genes that promote erythroid differentiation. Icefish hematopoiesis may be disrupted by an acetylation imbalance caused by decreased expression of the p300 acetyltransferase in all tissues and hematopoietic-specific overexpression of hdac1b. Histone acetyltransferases (HATs) and histone deacetylases (HDACs) control gene expression through acetylation and deacetylation of histones and transcription factors (De Ruijter et al., 2003; Eberharter and Becker, 2002; Vo and Goodman, 2001). Loss of P300 is embryonic lethal and mutations in the KIX domain of P300 cause severe anemia and erythroid cell defects in mice (Kasper et al., 2002). Hdac1 activity also plays a critical role in early erythroid proliferation (Heideman et al., 2014) but is inactivated by P300 during terminal differentiation (Yang et al., 2012).

Furthermore, Antarctic icefishes contain predicted deleterious mutations in the interaction domains of Hemogen, P300, and Gata1. Truncation of the C-terminal transactivation domain in icefish Hemogen may prevent it from recruiting the P300 acetyltransferase to Gata1 (Zheng et al., 2014). Hdac1 facilitates Gata1-mediated transcriptional repression by the NuRD complex (Hong et al., 2005; Snow and Orkin, 2009). During terminal differentiation, P300 acetylates and inactivates Hdac1 and converts this complex to an activator (Yang et al., 2012). Thus, in icefishes, Gata1 may function solely as a transcriptional repressor due to Hdac1b overexpression. Loss of Gata1 expression and function in icefishes may prevent formation of active chromatin hubs (ACH), which are thought to play a global role in erythroid gene transcription (Schoenfelder et al., 2010). Specifically, the loss of chromatin looping by the LCR (locus control region) (Krivega and Dean, 2016) may have contributed to the deletion of globin promoter elements in dragonfishes (Lau et al., 2012) and the complete loss of globin genes in icefishes (Cocca et al., 1995a).


Methods

Transcriptome assembly and orthology assignment

Transcript sequences from multi-tissue transcriptomes were previously generated for two red-blooded (H. antarcticus, P. charcoti) and two white-blooded (Ps. georgianus, N. ionah) notothenioid species (Berthelot et al., 2018. Manuscript in preparation). Whole-tissue transcriptomes were assembled with Trinity using default parameters (Haas et al., 2013). For comparisons between transcriptomes, orthologous relationships were determined as previously described (Sharma et al., 2014). Briefly,

CD-HIT was used to eliminate gene duplicates (95% similarity) and TransDecoder was used to identify putative open reading frames (Fu et al., 2012; Haas et al., 2013). The longest ORF was chosen for each Trinity subcomponent to produce unique proteins by use of usegalaxy.org (Blankenberg et al., 2010; Giardine et al., 2005; Goecks et al.,

2010). OMA stand-alone v.0.99t (Altenhoff et al., 2013; Altenhoff et al., 2011) identified

7,297 orthologs that were shared in the transcriptomes of all four species (Ps. georgianus, P. charcoti, N. ionah, H. antarcticus) and 18,781 orthologous groups that were shared between Ps. georgianus and P. charcoti. Orthologous pairs were confirmed with Blast v2.2.30+ (Altschul et al., 1990; Altschul et al., 1997). A list of

52,959 shared genes was identified in Ps. georgianus and P. charcoti assemblies by reciprocal blast hit (E value < 10⁻⁸⁰) criteria. From this list, 19,665 genes from Ps. georgianus mapped (E value < 10⁻⁴⁰) to the published genome (60.9%) of Notothenia coriiceps (Shin et al., 2014). Transcriptomes were also mapped (E value < 10⁻¹⁰) to the

Swiss-Prot database for human proteins Release 2015_9 (UniProt, 2015). Reciprocal

Blast confirmed 15,606 of the 18,781 orthologous genes from the OMA analysis.

Mapping (E value < 10⁻⁴⁰) to known zebrafish and stickleback orthologs in ENSEMBL v74 (Herrero et al., 2016) confirmed 9,735 genes. Gene association networks and gene ontology enrichment were analyzed by STRING v10 (Jensen et al., 2009) based on the

Swiss-Prot annotations (The UniProt, 2017). Association networks were edited using

Inkscape (www.inkscape.org).
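
The reciprocal-hit criterion described in this section can be illustrated with a minimal Python sketch. It assumes Blast tabular output in the default 12-column layout (query ID, subject ID, ..., E value, bit score). The function names and the use of bit score to rank hits are illustrative assumptions and do not reproduce the scripts actually used in this study.

```python
def best_hits(rows, max_evalue):
    """Map each query to its highest-bit-score subject below the E-value cutoff.

    `rows` is an iterable of tab-separated lines in the default 12-column
    tabular layout (column 11 = E value, column 12 = bit score).
    """
    best = {}
    for line in rows:
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= max_evalue:
            continue  # hit does not pass the E-value threshold
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, max_evalue=1e-80):
    """Return sorted (a, b) pairs in which a and b are each other's best hit."""
    forward = best_hits(a_vs_b, max_evalue)
    reverse = best_hits(b_vs_a, max_evalue)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)
```

In this sketch a pair is retained only when the best hit in each search direction points back to the same partner, which is the usual meaning of a reciprocal best hit.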

Expression Analysis

Expression of pairwise-orthologs shared by Ps. georgianus and P. charcoti was directly compared between tissues from each species using the Trinity pipeline (Haas et al., 2013). Briefly, reads from each tissue were aligned to the respective assembly using bowtie v1.1.1, and abundance estimation was carried out with RSEM v1.2.21

(Langmead et al., 2009; Li and Dewey, 2011). Differential expression (DE) analysis was performed on Trinity components (gene level) with edgeR v3.10.2 to normalize and detect significant differences between Ps. georgianus and P. charcoti read counts in each tissue with 2-4 biological replicates per sample (Robinson et al., 2010). Differential expression was considered statistically significant at an exact test P-value < 1E-05 and an FDR < 0.05. Expression profiles of DE genes were normalized across samples to transcripts per million (TPM) (Haas et al., 2013). Hierarchical clustering was performed with Gene-E to group similarly expressed genes and generate expression profile heat maps (http://www.broadinstitute.org). Clusters were cut at a dendrogram height of 56%.
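
As a simplified illustration of the TPM normalization mentioned above, the following Python sketch converts read counts to TPM and computes a log2 fold change between two samples. RSEM derives TPM from expected counts and effective transcript lengths; the version below takes plain counts and lengths in kilobases, and the function names and pseudocount are assumptions made for illustration only.

```python
import math


def tpm(counts, lengths_kb):
    """Convert per-gene read counts to transcripts per million (TPM).

    counts and lengths_kb are dicts keyed by gene ID; lengths are in kb.
    """
    # Length-normalize first, then scale so that all values sum to one million.
    rates = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    total = sum(rates.values())
    return {gene: 1e6 * rate / total for gene, rate in rates.items()}


def log2_fold_change(value_a, value_b, pseudocount=1.0):
    """log2 ratio of two expression values; the pseudocount avoids log(0)."""
    return math.log2((value_a + pseudocount) / (value_b + pseudocount))
```

Because TPM values sum to one million within each sample, they describe relative transcript abundance and remain sensitive to large shifts in a few highly expressed genes.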

Quantitative RT-PCR

Total RNA was purified from flash frozen tissues in TRIzol using the Ribopure kit

(Ambion). RNA was reverse transcribed with a polyT(23) primer using Protoscript II RT-

PCR kit (Invitrogen). Target genes were amplified from cDNA in triplicate by quantitative

PCR (Table 4). Standard curves were generated in QuantStudio v3 (Thermo Fisher

Sci) to confirm the efficiency of all primers. One or two biological replicates were used per notothenioid species. Beta-actin expression was used to normalize expression of target genes for comparison by the ΔΔCt method. Statistical comparisons were carried out between red- and white-blooded lineages using a Student’s t test (P < 0.05).
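
The ΔΔCt comparison can be written out as a short Python sketch. It assumes approximately 100% amplification efficiency (a doubling per cycle), which is what the standard curves are meant to verify; the function and argument names are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the delta-delta-Ct method.

    Each argument is a (mean) Ct value; the reference gene here would be
    beta-actin. Assumes the template doubles with every PCR cycle.
    """
    delta_sample = ct_target_sample - ct_ref_sample      # ΔCt in the sample
    delta_control = ct_target_control - ct_ref_control   # ΔCt in the control
    delta_delta = delta_sample - delta_control           # ΔΔCt
    return 2.0 ** (-delta_delta)
```

For example, a target that crosses threshold two cycles later in the sample than in the control, with an unchanged reference gene, gives a fold change of 0.25.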

Determination and comparison of dN/dS ratios

Sequence analysis was conducted on the set of 7,297 orthologs that were shared by two red-blooded (P. charcoti, H. antarcticus) and two white-blooded (Ps. georgianus, N. ionah) notothenioids. First, ratios of nonsynonymous to synonymous substitution rates (dN/dS) were determined to identify genes that are under different selection in each ecotype. Coding and peptide sequences were aligned with T-Coffee v11.00.8 and back-translated with ParaAT (Notredame et al., 2000; Zhang et al.,

2012b). Substitution rates were determined in PAML v4.8 with codeml (Yang, 2007).

This method does not consider gaps in alignments when estimating substitution rates.

Extremely high dN/dS ratios (>10) may be due to high sequence similarity or short sequence length and were discarded (Mugal et al., 2014). To test for genes with variable dN/dS ratios between notothenioid lineages, a likelihood ratio test (P < 0.05) was carried out to compare a 2-ratio and 1-ratio (null) model for each set of alignments.
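
The likelihood ratio test can be sketched in Python as follows. The sketch assumes that the 2-ratio model has exactly one more free parameter than the 1-ratio model, so that the test statistic is compared with a chi-square distribution with one degree of freedom; the function name and the example log-likelihoods are hypothetical.

```python
import math


def lrt_pvalue_df1(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    lnl_null and lnl_alt are the maximized log-likelihoods of the 1-ratio
    (null) and 2-ratio (alternative) models. Returns (statistic, p_value).
    """
    # Negative values can arise from optimization noise; clamp them to zero.
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value
```

With log-likelihoods of -1000.0 (null) and -998.0 (alternative), the statistic is 4.0 and the P-value is approximately 0.046, which would pass the P < 0.05 threshold.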

Prediction of deleterious missense mutations

Sequence alignments were scanned for mutations that may have altered the function of the encoded protein. PAML was used to identify all nonsynonymous substitutions that supported the division between the red- and white-blooded lineage branches. The program Provean was used to predict whether these missense mutations have a neutral or deleterious effect on protein function (Choi and Chan, 2015). The program was run on all sequence alignments using a custom shell script. Provean works under the assumption that substitutions in evolutionarily conserved protein domains are likely to have deleterious effects. A Provean score < -3 was used as a threshold to predict deleterious mutations because it had a higher specificity than the default score (< -2.5) and was shown to accurately predict ~84% of deleterious mutations. Protein domain diagrams were created with Geneious version R10

(http://www.geneious.com) (Kearse et al., 2012).

Identification of frameshift mutations

Frameshift mutations were identified by a Blastx search of the icefish coding sequences (Ps. georgianus, N. ionah) against the translated protein databases for two red-blooded species (P. charcoti, N. coriiceps) (Shin et al., 2014). As a requisite, the same mutation(s) in both icefishes could not occur in either red-blooded species. Mutations were also checked manually by aligning the genes to the reference genome for N. coriiceps (Shin et al., 2014). Mutated genes were isolated and sequenced from genomic

DNA purified from N. coriiceps and another icefish, C. aceratus.

Cloning of genes and cDNAs from Antarctic fish tissues

Genomic DNA (gDNA) was isolated from flash frozen tissues using the HotShot protocol (Truett et al., 2000). Target genes were amplified from gDNA and cDNA by

PCR with 1 µM primers (Table 4) – the amplification program was 35 cycles of 98°C for

10 s, 57°C for 10 s, and 72°C for 30 s. PCR products were cloned into the pGEM-T

Easy vector (Promega, A1360), plasmids were transformed into 5-α competent cells

(New England Biolabs, C2987H), recombinant plasmids were identified by blue/white screening and purified with the Wizard Plus SV Miniprep Kit (Promega, A1330), and inserts were sequenced by GeneWiz.

Imaging

Peripheral blood smears were prepared from N. coriiceps and P. charcoti on glass slides and fixed in 4% paraformaldehyde (PFA) (Yergeau et al., 2005). Cells were stained with Giemsa according to the manufacturer’s instructions (Sigma Aldrich).

Triton-insoluble erythrocyte membrane skeletons were prepared from flash frozen peripheral blood samples from N. coriiceps and P. charcoti. Briefly, cells were resuspended in 1% Triton-X, spread on glass coverslips and fixed with 4% PFA. Blood smears and Triton-shell spreads were imaged with a Nikon Eclipse E800 microscope using a Photometrics Scientific CoolSNAP EZ camera. Morphological measurements of cells were made using Nikon NIS-Elements AR 4.20 software. For scanning electron microscopy, peripheral blood smears were sputter-coated for 5 s with gold and imaged directly at the Marine Science Center of Northeastern

University.

Western blotting for acetylated lysine

Total protein was prepared for sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) from flash frozen notothenioid tissues by homogenization in lithium dodecyl sulfate (LDS) Bolt buffer (Life Technologies, B007) and NuPAGE reducing agent (Life Technologies, NP0009) using a pestle and microcentrifuge tube

(USA Scientific, 1415-5390). Samples were boiled for 3 min and centrifuged at top speed in an Eppendorf 5417R centrifuge for 2 min. Aliquots (15 µg) were electrophoresed on a 4-12% SDS polyacrylamide gel, and the separated proteins were transferred to a polyvinylidene difluoride (PVDF) membrane by use of the iBlot system

(Life Technologies, IB21001). Membranes were blocked in maleic acid blocking buffer

(2% Roche blocking reagent, 2% BSA, 0.2% heat treated goat serum, 0.1% Tween-20) for 1 h at room temperature and then incubated overnight at 4°C with 1:1000 mouse anti-acetylated lysine (Santa Cruz Biotechnology, AKL5C1). Membranes were washed in TBST (0.1 M Tris, 0.1 M NaCl, 0.1% Tween-20) and incubated for 2 h with horseradish peroxidase (HRP)-conjugated goat anti-mouse IgG (H&L) (Aviva,

OARA04973). Bound antibodies were detected with the Amersham ECL Western

Blotting Analysis System (GE Healthcare, RPN2106) on CL-XPosure film (Thermo

Scientific, 34091).

Figure S1. Box plot of RNA-Seq expression profiles in notothenioid tissues.

Values are normalized to transcripts per million (TPM) and log2 transformed.

Abbreviations: HK, head kidney; T. kidney, trunk kidney; Pectoral, pectoral muscle; W. muscle, white muscle; Gonads, ovary; Ventricle, heart ventricle.

Figure S2. Heatmap of Pearson’s correlation coefficients after comparison of

RNA-Seq gene expression in P. charcoti and Ps. georgianus tissues. Heat map color represents the Pearson’s correlation coefficient for total gene expression from each tissue comparison. (A) Gene expression correlation coefficients cluster by tissue type between Ps. georgianus and P. charcoti when all genes are assessed. (B) Gene expression correlation coefficients do not cluster between Ps. georgianus and P. charcoti for differentially expressed genes in head kidney.

Figure S3. Down-regulation of genes in the icefish head kidney for gene ontology

(GO) groups related to erythropoiesis. GO enrichment was determined using

STRING (Fisher’s exact test, P < 0.05, FDR < 0.01).

Table 1. Hematopoietic genes are differentially expressed in the icefish head kidney

GO:0030097 | hemopoiesis | 20 | 1.00E+00 | 1.76E-02

Up in Icefish
GeneID | logFC | Pvalue | Human homolog
GAS6 | 4.30 | 6.55E-14 | growth arrest-specific protein 6
NTRK1 | 3.53 | 3.63E-06 | high affinity nerve growth factor receptor
BAX | 2.89 | 9.88E-08 | apoptosis regulator BAX
DOCK1 | 2.66 | 1.80E-07 | dedicator of cytokinesis protein 1
CBFB | 2.02 | 5.88E-05 | core-binding factor subunit beta

Down in Icefish
GeneID | logFC | Pvalue | Human homolog
MYB | -2.03 | 4.24E-05 | transcriptional activator Myb
CBFA2T3 | -2.50 | 1.31E-07 | protein CBFA2T3
CASP3 | -2.55 | 9.78E-10 | caspase-3
GATA1 | -2.60 | 2.53E-07 | GATA-1
CD28 | -3.33 | 3.45E-05 | T-cell-specific surface glycoprotein CD28
FECH | -3.37 | 6.30E-09 | ferrochelatase
ALAS2 | -4.53 | 6.72E-14 | 5-aminolevulinate synthase
KLF2 | -4.59 | 1.05E-20 | Krueppel-like factor 2
KLF1 | -4.70 | 4.05E-16 | Krueppel-like factor 1
TFRC | -4.88 | 1.93E-15 | transferrin receptor protein 1
GFI1B | -2.02 | 1.01E-04 | zinc finger protein Gfi-1b

Table 2. GO enrichment of genes under different selective pressures in two red- and two white-blooded notothenioids

GO Enrichment for genes under Different Selection
GO:0030097 | hemopoiesis | 11 | 5.66E-02
GO:0034101 | Erythrocyte homeostasis | 7 | 6.49E-05
GO:0042168 | Heme metabolic process | 4 | 4.10E-04
GO:0033572 | Transferrin transport | 3 | 8.50E-03

Table 3. GO enrichment for genes with deleterious substitutions found in two white-blooded icefishes but not in two red-blooded notothenioids

GO Enrichment for Deleterious Mutations
GO:0030099 | myeloid cell differentiation | 31 | 0.000286
GO:0030097 | hemopoiesis | 56 | 0.00165
GO:0006915 | apoptotic process | 94 | 0.0074
GO:0055114 | oxidation-reduction process | 85 | 0.0231
GO:0007010 | cytoskeleton organization | 72 | 0.0284
5220 | Chronic myeloid leukemia | 11 | 0.0473
4380 | Osteoclast differentiation | 19 | 0.008

Table 4. Table of Primers

Oligo | Sequence (5’ – 3’) | Gene | Method
Sptb_F | CCAGGCCTTCATGGCTGAG | sptb | PCR
Sptb_R | CGCACCTGGTTCTCCGTC | (Notothenioid) | gDNA
Sptb_R_exon19 | GATGCTTCTTGAGCAAGATG | |
Hdac1b_F | GAGGAGGCCTTCTACACCAC | hdac1b | qPCR
Hdac1b_R | CGACTCGTCGTCAATACCGT | (Notothenioid) |
Spi1_F | GGATCCAAACCTTGGGGCAC | spi1b | qPCR
Spi1_R | GTGGATACACAGGCCGAGG | (Notothenioid) |
Gata1a_F | CCACAGCCGAGCGCCTCC | gata1a | qPCR
Gata1a_R | GCCCCGTCCAGCAGCTGC | (Notothenioid) |
SGK1_F | CTGAAGCCTGAGAACATCC | sgk1 | qPCR
SGK1_R | CCATAGAGCATCTCGTAGAG | (Notothenioid) |
PML-L_F | TGACCTGGAGGCCACTGG | pml-like | qPCR
PML-L_R | CCTGCAGGTCAGACCCG | (Notothenioid) |
P300b_F | CCCGAGAAACGGAAGCTGAT | p300b | qPCR
P300b_R | TTTTTCAGCGGCAGGCAAAC | (Notothenioid) |
ZFP64_F | GCCTTACACTGTGAGGAGG | zfp64 | qPCR
ZFP64_R | AACTCCTCATTGTGGGAGG | (Notothenioid) |
ERO1α_F | GCAGGTGCTTCTGTCAG | ero1α | qPCR
ERO1α_R | GTTTGGAGAAGAGCTGGTTG | (Notothenioid) |
Fam161a_F | TTTAAGGCGAGACCCATG | fam161a | qPCR
Fam161a_R | CACCATCTCAATGGAAACC | (Notothenioid) |
CD33rSig_F | CTGCTCATTAGAGATTGATGA | cd33rSig | qPCR
CD33rSig_R | GAAGGTTATTGTGGAGGTC | (Notothenioid) |
Bact_F | CAGATCATGTTCGAGACCTTCAAC | beta-actin | qPCR
Bact_R | TCACCRGARTCCATGACGATA | (Notothenioid) |

Table 5. Icefishes have predicted deleterious substitutions in targets of human diseases

Gene | GO | Mutation | Provean | Domain | Associated Diseases | Reference
ADD1 | MH | A246T | -3.692 | Aldolase II, NH-head domain | HS | Robledo et al. 2008, Anong et al. 2009
ANXA2 | MH | Y113F | -3.105 | Annexin domain | AML, B-ALL | Olwill et al. 2005
ASXL2 | M | G1082R | -3.637 | Proximate to PHD | CBF-AML | Jean-Baptiste et al. 2014
BCL11A | H | E14G | -3.353 |  | AML, CML | Yin et al. 2016
CARD11 | H | E258G | -6.062 | GBP_C “guanylate-binding protein” | DLBCL | Lenz et al. 2008
CASP8 | MH | C22R | -3.965 | DED1 domain | HNSCC | Ando et al. 2013
CDKN1C | MH | G202E | -4.246 |  | BWS | Yatsuki et al. 2013
CHD2 | H | N502I | -5.68 | SNF2 domain | CLL, MAE | Rodriguez et al. 2015, Trivisano et al. 2015
EML1 | H | Y36D | -3.204 |  | T-ALL | De Keersmaecker et al. 2005
ERCC1 | H | R414Y | -3.021 |  | COFS, ALL | Jaspers et al. 2007, Wang et al. 2006
FOXP1 | MH | V77F | -3.126 |  | B-ALL | Put et al. 2011
G6PD | MH | P495N | -4.957 |  | G6PD deficiency, hemolytic anemia | Beutler et al.
GATA1 | MHN | N319S | -4.756 | CT-Zn finger | Thrombocytopenia, DS-AMKL, DBA | Nichols et al. 2000, Crispino 2005
GFI1 | MH | C217Y | -6.278 | 3rd C2H2 Zn finger | AML, CLL, bleeding disorder, SCN | Moroy et al. 2015
GFI1B | MH | E136A | -3.387 | 1st C2H2 Zn finger | Macrothrombocytopenia | Kitamura et al. 2016
IKZF1 | M | P145A | -4.715 | Proximate to Zn finger 1 | B-ALL, ALL | Kastner et al. 2013
NFE2 | MH | A380V | -3.633 | Coiled-coil, bZIP DNA binding | MPN | Jutzi et al. 2013, Shyu et al. 2006
RUNX1 | MN | L342H | -5.353 | TAD domain | AML, B-ALL, CML | Gaidzik et al. 2011
SPI1 | MH | V56P | -3.414 | Acidic TAD domain | AML | Mueller et al. 2002, Lamandin et al. 2002
STAT5B | HN | P198A | -3.563 | All-alpha domain | Lymphomas | Kucuk et al. 2015
TF | MH | E382A | -4.57 | C-lobe, disulfide bond | Atransferrinemia, Alzheimer's | Lee et al. 2001, Giambattistelli et al. 2012
TFRC | MH | E140V | -3.759 | ZN-peptidase transferrin receptor | Iron deficiency anemia | Roetto et al. 2001, Jabara et al. 2016
BLVRB | P | G145R | -3.626 | BVR/FR (Flavin reductase) domain | Thrombopoiesis | Wu et al. 2016, O'Brien et al. 2015
COX15 | P | L235F | -3.623 | Transmembrane region | Leigh syndrome, cardiomyopathy | Antonicka et al. 2003, Bugiani et al. 2005
CPOX | P | P364H | -8.838 | Coproporphyrinogen III oxidase | HCP, harderoporphyria | Martasek et al. 1994, Schmitt et al. 2005
HPX | P | G52E | -7.765 | Hpx repeat 1 | Diabetic macular edema | Mehta et al. 2015, Hernandez et al. 2013
NFE2L1 | P | S301F | -4.455 | - | Cancer, neurodegenerative disease | Han et al. 2012, Taniguchi et al. 2017
SLC25A39 | P | L230F | -3.055 | 1st solcar repeat | Anemia, epilepsy | Nilsson et al. 2009, Slabbaert et al. 2016
UGT1A1 | P | R332G | -4.98 | UDP-glucuronosyltransferase 1-1 | Crigler-Najjar (CN), Gilbert (GILBS) | Servedio 2005
IKAROS2 |  | G108D | -6.144 | Zn finger 1 | ALL | Zhang et al. 2007, Chen et al. 2013
FES |  | R585C | -5.823 | Catalytic domain of PTK, ATP binding | AML | Cheng et al. 2001, Sangrar et al. 2005
FLT1 |  | C186R | -4.337 | CT domain | AML, CML | Choi et al. 2005, Fragoso et al. 2006

Gene | logFC | GO | Mutation | Domain | Associated Diseases | Reference
PPHLN1 | 0.57 |  | L265fs |  | Intrahepatic cholangiocarcinoma | Sia et al. 2014
PAPPS2 | na |  | A43fs |  | Brachyolmia, Prostate cancer | Miyake et al. 2012, Ibeawuchi et al. 2015
MATE1 | -1.18 |  | F467fs |  | Environmental toxin clearance, CML | Chen et al. 2009
ZFP64 | 0.46 |  | L615fs |  | Amyotrophic lateral sclerosis | Schymick et al. 2007
ERO1la | -5.06 |  | T37fs |  | Adenocarcinoma | Endoh et al. 2004
FAM161Al | 3.07 |  | H291fs | C-terminus | Retinitis pigmentosa 28 | Karlstetter et al. 2014, Van Schil et al. 2015
CD33l | -1.46 |  | F753fs | ITIM domain | Alzheimer's, AML (expression) | Stefania De Propris et al. 2011
NUMA1 | -0.09 |  | E895fs |  | Osteosarcoma, AML | Kovac et al. 2015, Strehl et al. 2012

Chapter 2: Divergent Hemogen genes of teleosts and mammals share conserved roles in erythropoiesis: Analysis using transgenic and mutant zebrafish

Michael J. Peters1, Sandra K. Parker1, Jeffrey Grim1,2, Corey A. H. Allard1,3, Jonah

Levin1,4, H. William Detrich, III1*

1Department of Marine and Environmental Sciences, Northeastern University, Nahant,

MA 01908, USA

2Present address: Department of Biology, The University of Tampa, Tampa, FL 33606,

USA

3Present address: Department of Biochemistry and Cell Biology, Geisel School of

Medicine at Dartmouth College, Hanover, NH 03755, USA

4Present address: Department of Biochemistry, McGill University, Montreal, Quebec

H3G1Y6, CA

Published:

Peters MJ, Parker SK, Grim J, Allard CAH, Levin J, Detrich HW III. 2018. Divergent

Hemogen genes of teleosts and mammals share conserved roles in erythropoiesis:

Analysis using transgenic and mutant zebrafish. Biology Open bio.035576 doi:

10.1242/bio.035576

Summary Statement

Transgenic and mutant zebrafish lines were created to characterize the expression and functions of Hemogen, a transcription factor involved in the formation of red blood cells and other processes.

ABSTRACT

Hemogen is a vertebrate transcription factor that performs important functions in erythropoiesis and testicular development and may contribute to neoplasia. Here we identify zebrafish Hemogen and show that it is considerably smaller (~22 kDa) than its human ortholog (~55 kDa), a striking difference that is explained by an underlying modular structure. We demonstrate that Hemogens are largely composed of 21-25 amino acid repeats, some of which may function as transactivation domains (TADs).

Hemogen expression in embryonic and adult zebrafish is detected in hematopoietic, renal, neural, and gonadal tissues. Using Tol2- and CRISPR/Cas9-generated transgenic zebrafish, we show that Hemogen expression is controlled by two Gata1-dependent regulatory sequences that act alone and together to control spatial and temporal expression during development. Partial depletion of Hemogen in embryos by morpholino knock-down reduces the number of erythrocytes in circulation.

CRISPR/Cas9-generated zebrafish lines containing either a frameshift mutation or an in-frame deletion in a putative, C-terminal TAD display anemia and embryonic tail defects. This work expands our understanding of Hemogen and provides mutant zebrafish lines for future study of the mechanism of this important transcription factor.

INTRODUCTION

Hemogen (Hemgn) is a vertebrate transcription factor that is expressed in mammalian hematopoietic progenitors (Lu et al., 2001; Yang et al., 2001) and has been implicated in erythroid differentiation and survival (Li et al., 2004). Originally identified in mice and subsequently described in humans as EDAG (Erythrocyte Differentiation

Associated Gene), Hemogen has also been implicated in testis development in mammals and chickens (Nakata et al., 2013; Yang et al., 2003), and in osteogenesis in rats (Kruger et al., 2002; Kruger et al., 2005). Here we analyze the developmental roles of teleost Hemogen using the zebrafish model system and its powerful suite of reverse-genetic technologies.

Teleost Hemogen was discovered using a subtractive hybridization screen designed to isolate novel erythropoietic genes from fish belonging to the largely

Antarctic suborder Notothenioidei (Detrich and Yergeau, 2004; Yergeau et al., 2005).

Sixteen species belonging to the icefish family (Channichthyidae) are unique among vertebrates because they are white-blooded ‒ they fail to execute the erythroid genetic program or produce hemoglobin (Cocca et al., 1995a; Near et al., 2006a; Zhao et al.,

1998a). Forty-five candidate erythropoietic cDNAs were recovered using representational difference analysis (Hubank and Schatz, 1999) applied to kidney marrow transcriptomes of two notothenioid species, one red-blooded and the other white-blooded (Detrich and Yergeau, 2004; Yergeau et al., 2005). One of the unknown genes, clone Rda130, was similar to mammalian Hemogen and was expressed only by the red-blooded notothenioid.

Although Hemogen is clearly involved in hematopoiesis, its mechanism remains incompletely understood. In human cell lines, Hemogen activates erythroid gene transcription in part by recruiting the histone acetyltransferase P300 to acetylate Gata1

(Zheng et al., 2014). Like Gata1, Hemogen protects erythroid cells from apoptosis by upregulating anti-apoptotic factors (e.g., Nf-κB, Bcl-xL) that are critical for terminal differentiation (Li et al., 2004; Rhodes et al., 2005; Zhang et al., 2012a).

The regulation of Hemogen expression is of interest because it is overexpressed frequently in patients with a variety of cancers and leukemias (An et al., 2005; Forbes et al., 2017; Li et al., 2004). This putative oncogene, which is located in a human chromosomal region (9q22) of leukemia-associated breakpoints, has been linked to proliferation and survival of leukemic cells and to induction of tumor formation in mice

(Chen et al., 2016; Lu et al., 2002). Thus, somatic mutations in Hemogen or its regulators may contribute to neoplasia.

The zebrafish is a well-established model organism for studying hematopoiesis in vertebrates because it produces the same blood lineages as mammals (de Jong and

Zon, 2005; Paffett-Lugassy and Zon, 2005). In zebrafish, erythropoiesis occurs in sequential waves at unique anatomical locations in embryos and adults that correspond to analogous sites in mammals (Galloway and Zon, 2003). Many of the molecular players that orchestrate the erythroid program appear to be conserved between zebrafish and mammals, but relatively few have been functionally characterized in zebrafish. Nevertheless, mutant zebrafish models accurately phenocopy human blood diseases caused by mutations in major erythroid factors, such as Gata1 (Lyons et al.,

2002) and Erythroid beta-spectrin (Liao et al., 2000).

The purpose of this study is to characterize the regulation of Hemogen expression and the function of the Hemogen protein in zebrafish. We identify the zebrafish Hemogen ortholog, which despite being only 40% as large as the human protein, contains similarly arranged functional motifs. Hemogen is expressed in blood, testis, ovaries, kidney, and the central nervous system in zebrafish. Two tissue-specific, alternative Hemogen promoters are associated with conserved noncoding elements

(CNEs) and have distinct regulatory functions in primitive and definitive hematopoiesis and other processes. By analysis of morphant and mutant zebrafish, we show that

Hemogen is required for normal erythropoiesis and that this role depends in part on a cluster of acidic residues within a putative, C-terminal transactivation domain (TAD).

Figure 1. Zebrafish Si:dkey-25o16.2 and human Hemogen are orthologous and encode related proteins that differ in size. (A) Structure of the zebrafish Hemogen- like gene, Si:dkey-25o16.2. Two conserved noncoding elements (C1 and C2; black boxes) were identified in a 2-kb segment proximal to the start codon (see Results, Figs.

4-6). Coding exons, white boxes; noncoding exons, gray boxes. Numbers indicate length in bp. (B) Synteny of loci for zebrafish Si:dkey-25o16.2 on chromosome 1 and

Hemogen on human chromosome 9 (region q22). Transcriptional orientations indicated by arrows. (C) Alternative splicing of zebrafish Hemogen-like transcripts showing sequenced regions. Introns are shown as chevrons. Transcripts 1 and 2 differ by retention of 12 bp of intron (red). (D) Modular structures of zebrafish and human

Hemogen proteins, each encoded by four exons (numbered boxes). Locations of truncating mutations found in some human cancers (Forbes et al., 2017) are indicated by asterisks. Predicted regions and motifs: green, coiled coil; blue, nuclear localization signal; red, four residues introduced by alternative splicing; yellow, tandem peptide repeats; brown, acidic repeat with transactivation domain (TAD) motif; gray, no prediction. (E) Three-dimensional ab initio models of Hemogens. The ribbon diagram of the zebrafish protein, color-coded as in panel D, is superimposed on the gray, space-filling model for the human protein (see Materials and Methods).

RESULTS

Teleosts contain a single Hemogen-like gene that is syntenic with human Hemogen

Chromosomal synteny is an important criterion when assigning gene relationships across divergent taxa. Despite the whole-genome duplication (WGD) that coincided with the separation of teleosts from more basal ray-finned fishes and tetrapods (Postlethwait et al., 2000), the sequenced genomes of nearly all fishes retain a single Hemogen-like gene. We cloned zebrafish Hemogen-like cDNAs and found that they corresponded to the predicted gene Si:dkey-25o16.2 on chromosome 1 of the zebrafish genome (Howe et al., 2013). When we compared the synteny of the putative teleost and mammalian orthologs, represented in Figure 1B by zebrafish Si:dkey-

25o16.2 (chromosome Dr1) and human Hemogen (chromosome Hs9), we found that the flanking genes and their transcriptional orientations were conserved, which strongly supported Si:dkey-25o16.2 as the zebrafish Hemogen ortholog.

Structure of the zebrafish Hemogen gene

The basic structure of the Hemogen gene of teleosts and mammals was also found to be highly conserved – four coding exons were separated by three introns (Fig.

1A), and two introns were found in the 5’-UTR. Two transcription start sites were predicted to occur within 2-kb upstream of the Hemogen start codon in zebrafish (Fig.

1A) ‒ these appear to correspond to the hematopoietic- and testis-specific Hemogen promoters (noncoding exons 1H and 1T, respectively) described for mammals (Yang et al., 2003). Alignment of Hemogen genes from 10 teleost species (Yates et al., 2016) revealed two conserved non-coding elements, CNE1 and CNE2, that overlapped with

zebrafish exons 1T and 1H, respectively (Fig. 1A). We hypothesized that these elements function individually or together to regulate transcription of Hemogen.

Transcription of the zebrafish Hemogen gene yields multiple mRNA isoforms

We confirmed transcription from both promoters in zebrafish by isolating and sequencing four splicing variants (Fig. 1C). Three isoforms were transcribed from the proximal promoter (exon 1H, Fig. 1A,C), each containing the same 5’-untranslated region (5’-UTR). Alternative splicing of the second coding exon produced transcripts 1 and 2, which differ by four additional codons in the latter (Fig. 1C, red); the shorter version has not been described in mammals. Transcript 3 retained the entire third intron

(156 bp), which introduced a premature translation-termination codon. A fourth isoform was transcribed from the distal promoter (1T) located ~1.65 kb upstream of the translation start codon (Fig. 1A,C). Splicing of exons 1T and 1H to form the 5’-UTR of transcript 4 made use of canonical donor (AT-GT) and acceptor (AG-TT) splice sites.

Teleost and mammalian Hemogen proteins differ markedly in size but share structural motifs

Teleost Hemogen-like genes encoded shorter proteins (194-289 amino acids) than the annotated Hemogen genes of mammals (417-827 amino acids), and the overall amino acid sequence similarity between teleost and mammalian orthologs was modest (18%-38%). Despite this heterogeneity in length and sequence, Hemogens of teleost fish and mammals shared predicted structural motifs, as shown in Figure 1D,E for zebrafish (198 aa, 22 kDa) and human (484 aa, 55 kDa) orthologs, respectively.

Their N-termini (zebrafish residues 1-74, human 1-78) were substantially conserved

(51% sequence similarity; Fig. S1) and contained two predicted coiled-coil (CC) forming alpha-helices, the second of which was a putative nuclear localization signal (NLS)

(Yang et al., 2001) (Fig. 1D; Fig. S1). By contrast, their C-termini (zebrafish residues 75-

198, human 79-484) were weakly conserved in sequence (13% similarity), but both were rich in Pro and Glu residues (Figs. S1-S2), consistent with intrinsic disorder of these regions (Dyson and Wright, 2005). Furthermore, the C-termini shared modular structures – each was built of several 21-25 amino acid motifs, three in zebrafish and nine in humans, with distinct but related consensus sequences

(PEXXXIAEXXXXXQEVXPQXXLVP and YSXEXYQEXAEPEDXSPETYQEIPX, respectively) (Fig. 1D,E, Figs. S1-S2). Thus, the size heterogeneity between zebrafish and human Hemogens was largely attributable to the number of repetitive segments contained within each.

Within the C-termini of teleost Hemogens, we identified a conserved acidic region

(zebrafish residues 119-169, 35-49% similarity across 10 species) that was similar to an acidic region of the mouse protein (Yang et al., 2001). Given the transactivation functions of Hemogen in humans (Zheng et al., 2014), we investigated whether the zebrafish and human proteins possessed TAD motifs based on the consensus sequences ϕϕxxϕ or ϕxxϕϕ, where ϕ is a bulky hydrophobic residue (Dyson and Wright,

2016). The acidic C-termini of both Hemogens contained one TAD motif. Four additional

TAD motifs were distributed in other regions of the human protein (Fig. S1).
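
A scan for the two consensus patterns can be sketched in Python with a regular expression. The set of residues treated as bulky hydrophobic (F, I, L, M, V, W, Y) is an assumption made for this illustration and may differ from the definition applied in the analysis; the function name and the example peptides are hypothetical.

```python
import re

# Assumption: residues counted as "bulky hydrophobic" (ϕ) for illustration only.
BULKY = "FILMVWY"

# A lookahead lets overlapping windows be reported; "." stands for any residue (x).
TAD_PATTERN = re.compile(
    rf"(?=([{BULKY}]{{2}}..[{BULKY}]|[{BULKY}]..[{BULKY}]{{2}}))"
)


def find_tad_motifs(protein_sequence):
    """Return (1-based start position, 5-residue window) for each ϕϕxxϕ or ϕxxϕϕ match."""
    sequence = protein_sequence.upper()
    return [(match.start() + 1, match.group(1))
            for match in TAD_PATTERN.finditer(sequence)]
```

Each five-residue window is reported once even if it satisfies both patterns. Because the motif is short and degenerate, matches of this kind are best read as candidates that require structural or functional support.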

To assess the three-dimensional conformations of zebrafish and human

Hemogens, although in a static context, we generated ab initio tertiary structural models with I-Tasser (Yang et al., 2015) using the best of ten predicted templates (Fig. 1E, see

Materials and Methods). The structures for zebrafish and human Hemogens had template modeling scores (TM-scores) of 0.45 and 0.55, respectively, where a TM-score > 0.3 indicates a significant difference (P < 0.001) from random structures (Xu and

Zhang, 2010). When the two models were superimposed, amino acid sequences shared by human and zebrafish Hemogens showed 98% coincidence and a TM-score of 0.71.

The N-termini of the zebrafish and human Hemogens presented exposed CC domains that may serve as binding sites for Gata1 (Zheng et al., 2014). The “disordered” C-termini of Hemogens from zebrafish and humans comprised two distinct elements: proline-rich repeats (yellow) and an acidic, C-terminal repeat containing the

TAD motif (maroon) (Fig. 1E, Fig. S1). The former may coalesce as rigid linkers to extend the TAD motif to binding partners. These features are common to transcription factors, as epitomized by the structure of p53 (Wells et al., 2008).

Figure 2. Hemogen expression in zebrafish embryos. (A-H) Wild-type embryos,

WISH. (A) Epiboly at 9 hpf. Hemogen expression was not detected. (B) 10-somite stage. Hemogen transcripts along the lateral plate mesoderm (LPM). (C) 20 hpf.

Hemogen staining in the intermediate cell mass (ICM) and posterior blood island (PBI).

The inset shows a sense probe control. (D) 33 hpf. Hemogen-positive primitive erythrocytes of the peripheral blood (PB) exited the Ducts of Cuvier (DC) onto the yolk.

Staining at the midbrain-hindbrain boundary (MHB) was observed. (E) 144 hpf.

Hemogen expression in the caudal hematopoietic tissue (CHT) and pronephric kidney

(PK), and in erythrocytes in the heart (H). The asterisk indicates the plane of the cross section in panel F. (F) 144 hpf. Cross section of embryo in panel E showing heavily stained pronephric ducts. (G) 48 hpf. Lateral aspect of tail. Hemogen transcripts in the

CHT and pronephric tubule duct (PD). (H) Kidney touch print from adult fish. Hemogen expression was observed in proerythroblasts (ProE) and normoblasts (N) but not in erythrocytes (E). (I) 48 hpf. View of circulating EGFP+ erythrocytes in the dorsal aorta

(DA) of Tg(Lcr:EGFP)cz3325Tg zebrafish after staining for Hemogen protein by indirect immunofluorescence. Hemogen (red signal) accumulated in nuclei (Nu) of erythrocytes whereas the cytoplasm (C) was marked by EGFP. Other abbreviations: CV, caudal vein; DA, dorsal aorta; G, gut; M, myotomes; NC, notochord; PK, pronephric kidney; SB, swim bladder; SC, spinal cord. Scale bars: (A-F) 250 µm; (G) 1 mm; (H) 100 µm; (I) 50

µm.


Hemogen expression tracks the ontogenetic progression of hematopoiesis in zebrafish

The spatial and temporal patterns of Hemogen expression were evaluated in zebrafish between 2 and 144 hours post fertilization (hpf) by whole-mount in situ hybridization (WISH) (Fig. 2A-H). Hemogen transcripts were not apparent prior to somitogenesis (Fig. 2A) but first appeared at the 10-somite stage in punctate, intersomitic foci in the lateral plate mesoderm (LPM; Fig. 2B). By 20 hpf, Hemogen was expressed throughout the intermediate cell mass (ICM) and posterior blood island (PBI) (Fig. 2C), the sites of primitive hematopoiesis (Bertrand et al., 2007b; Davidson and Zon, 2004). Primitive erythrocytes expressed Hemogen as they entered circulation at 33 hpf (Fig. 2D).

Definitive hematopoiesis in zebrafish embryos commences in the aorta-gonad-mesonephros (AGM) region at 30 hpf with the emergence of hematopoietic stem and progenitor cells (HSPCs) that subsequently seed the caudal hematopoietic tissue (CHT) and the thymus (Murayama et al., 2006). By 144 hpf, HSPCs migrate from the CHT to establish a niche associated with the pronephric glomeruli (Bertrand et al., 2008). Although we did not detect Hemogen mRNA in the AGM (Fig. 2D), we observed strong expression in cells of the CHT at 48 and 144 hpf (Fig. 2E,G) and in the region of the pronephric glomeruli at 144 hpf (Fig. 2E,F). In the adult zebrafish kidney, Hemogen was strongly expressed in progenitor cells in the interstitial hematopoietic stem cell niche between pronephric tubules (Fig. 2H). Hemogen expression was robust in progenitors but absent in mature erythrocytes (Fig. 2H), whereas an anti-sense riboprobe for βe1-globin hybridized to mature erythrocytes but not to progenitor cells (data not shown).


Hemogen has been shown to function as a nuclear transcription factor in mammals (Zheng et al., 2014). To determine whether or not Hemogen is likely to play the same role in zebrafish, we examined Tg(Lcr:EGFP)cz3325Tg embryos at 48 hpf by indirect immunofluorescence microscopy using an antibody specific for Hemogen.

Tg(Lcr:EGFP)cz3325Tg zebrafish have been used to track both primitive and definitive erythrocytes (Ganis et al., 2012). Figure 2I shows that Hemogen accumulated in the nuclei (red signal) of GFP-labeled circulating erythrocytes in the dorsal aorta; thus, its role in transcription is likely to be conserved in zebrafish.

Alternative promoters regulate Hemogen expression in zebrafish hematopoietic and reproductive tissues

We also detected Hemogen expression in the hindbrain and in the pronephric tubules of embryonic zebrafish between 30 and 48 hpf (Fig. 3A,B) and in adult zebrafish reproductive tissues (Fig. 3G-H). The alternative Hemogen promoters found in zebrafish probably correspond to the hematopoietic and testis-specific Hemogen promoters of mammals (Yang et al., 2003). To quantify relative levels of transcription from each promoter in zebrafish (Fig. 3I), we performed qRT-PCR on total RNA from adult peripheral blood, testis, and ovaries (Fig. 3J) using primer pairs specific for exons 1H and 1T. Because all of exon 1H was included in transcripts initiated from exon 1T, the 1H primers amplified transcripts from both promoters, and transcription from the proximal promoter had to be inferred by difference. Transcription from the proximal promoter was greatest in peripheral blood; the presence of transcripts from this promoter in testis and ovarian tissue may be due to contaminating blood RNA. The distal promoter was highly active in peripheral blood and testes but not in ovaries.
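The inference by difference can be made concrete with a short sketch. It assumes a 2^-ΔCt calculation against β-actin and roughly 100% amplification efficiency for both primer pairs; neither assumption is stated in this section, and the function names and Ct values are hypothetical rather than data from this study.

```python
def normalized_abundance(ct_target, ct_reference):
    """Target abundance relative to the reference gene, as 2^-(delta Ct).

    Assumes ~100% amplification efficiency for both primer pairs.
    """
    return 2.0 ** -(ct_target - ct_reference)


def proximal_by_difference(ct_1h_amplicon, ct_1t_amplicon, ct_actin):
    """Estimate proximal-promoter (1H-initiated) transcripts by difference.

    The exon 1H amplicon reports transcripts from both promoters, whereas the
    exon 1T amplicon reports distal-promoter transcripts only.
    """
    total = normalized_abundance(ct_1h_amplicon, ct_actin)
    distal = normalized_abundance(ct_1t_amplicon, ct_actin)
    # Clamp at zero: measurement noise can make the difference slightly negative.
    return max(total - distal, 0.0)


# Hypothetical Ct values for one blood sample (not data from this study).
blood_proximal = proximal_by_difference(
    ct_1h_amplicon=20.0, ct_1t_amplicon=22.0, ct_actin=18.0
)
```

The subtraction is done on reference-normalized abundances rather than on fold changes, because fold changes calibrated separately for each amplicon are not on a common scale.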


Figure 3. Alternative promoters drive Hemogen expression in hematopoietic and nonhematopoietic tissues in zebrafish. WISH of wild-type embryos (A-B) and adult tissues (C-H). (A) 48 hpf. Hemogen expression in the pronephric kidney glomeruli (PG), pronephric tubule duct (PD), caudal hematopoietic tissue (CHT), and brain (Br). (B) 48 hpf. Section showing strong Hemogen expression in the hindbrain (HB) but at low levels in the midbrain (MB). (C) Dorsal and (D) ventral views of the adult zebrafish brain after staining for Hemogen transcripts. (E) Schematic drawing of the dorsal view. Hemogen was highly expressed at the midbrain-hindbrain boundary within the eminentia granularis (EG), in the crista cerebellaris (CC), and in the hypothalamus (Hy). The asterisk indicates the plane of the cross section in panel F. (F) Section of the hindbrain showing Hemogen expression in the periventricular gray zone (PGZ). (G) Hemogen was expressed by Sertoli cells (Se) between the seminiferous tubules (ST) of the testes.

(H) Hemogen was expressed in early (I-III) but not late (IV) stage oocytes. Transcripts accumulated around the germinal vesicle (GV). (I) Schematic of the Hemogen noncoding exons 1T and 1H (gray) upstream of the first coding exon (white); bent arrows, transcription initiation sites. Arrowheads mark primer binding sites for qPCR amplification of transcripts initiated from exons 1T or 1H. (J) Expression of transcripts from alternative promoters determined by qRT-PCR using RNA from blood, testes, and ovaries of adult TU zebrafish. Expression in three biological replicates was normalized to β-actin and calculated relative to ovaries. Error bars represent the standard deviation. Transcription initiated from 1H must be inferred by difference [1H – 1T] because the 1H primers also amplified 1T transcripts. Other abbreviations: Ce, corpus cerebelli; MO, medulla oblongata; OB, olfactory bulb; OT, optic tectum; SR, superior raphe; Te, telencephalon; TS, torus semicircularis. Scale bars = 250 µm (A, B, F-H); 1 mm (C,D).


Hemogen CNEs are predicted targets for transcription factors that regulate erythropoiesis and spermatogenesis

In teleosts, we identified two evolutionarily conserved non-coding elements, CNE1 and CNE2, that were tightly associated with exons 1T and 1H, respectively (Fig. 1A, Fig. 4A). These elements may function as core promoters and/or enhancers to regulate transcription of the different Hemogen isoforms in zebrafish. To identify potential regulators of Hemogen transcription, we used ConTra v2 (Broos et al., 2011) to predict transcription factor binding motifs in the aligned Hemogen CNEs from two mammals and nine teleosts (Yates et al., 2016) (Fig. 4B,C). Each CNE contained binding motifs for transcription factors involved in erythropoiesis and/or spermatogenesis.

In zebrafish CNE2, two Gata1 binding sites, located at +59 and +127 bp relative to the transcription start site, aligned with Gata1 sites known to be active in the mammalian Hemogen promoter (Fig. 4C) (Yang et al., 2006). Each Gata motif was paired with a predicted E-box; this motif in Hemogen CNE2 is a known target of the Ldb1-erythroid-complex recruited by Scl (Soler et al., 2010). CNE2 also contained binding sites for Klf4, a driver of zebrafish primitive erythropoiesis (Gardiner et al., 2007), for Myb, a regulator of zebrafish definitive hematopoiesis (Soza-Ried et al., 2010), and for HoxB4, a regulator of Hemogen expression in mammalian hematopoietic stem cells (Jiang et al., 2010).
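The idea of a Gata motif paired with a nearby E-box can be sketched with simple consensus matching. This is a deliberate simplification and not how ConTra works, since ConTra scores position weight matrices across aligned sequences. The consensus strings (WGATAR for Gata sites, CANNTG for E-boxes), the 15-bp pairing window, and the toy sequence are all assumptions made for illustration.

```python
import re

# Consensus patterns (IUPAC codes expanded). Gata sites are scanned on both
# strands; the E-box pattern CANNTG is its own reverse complement, so a single
# scan of the forward strand is sufficient.
GATA_FORWARD = re.compile(r"(?=([AT]GATA[AG]))")
GATA_REVERSE = re.compile(r"(?=([CT]TATC[AT]))")
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_sites(pattern, sequence):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    return [m.start() for m in pattern.finditer(sequence.upper())]


def paired_gata_ebox(sequence, max_gap=15):
    """List (gata_start, ebox_start) pairs whose starts lie within max_gap bp."""
    gata = sorted(
        find_sites(GATA_FORWARD, sequence) + find_sites(GATA_REVERSE, sequence)
    )
    ebox = find_sites(EBOX, sequence)
    return [(g, e) for g in gata for e in ebox if abs(g - e) <= max_gap]


# Hypothetical 20-bp sequence, not a real CNE.
toy_cne = "CCAGATAAGGTTCAGCTGTT"
pairs = paired_gata_ebox(toy_cne)
```

In the toy sequence the Gata site starts at position 2 and the E-box at position 12, so the two are reported as a pair.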

The distal CNE1 of teleosts possessed a similar suite of transcription factor binding motifs in roughly the same arrangement as the proximal CNE but with the notable addition of binding sites for Sox9 and the Androgen receptor (Fig. 4B), both of which play roles in zebrafish spermatogenesis (Hossain et al., 2008; Rodriguez-Mari et al., 2005). CNE1, like CNE2, contained pairs of E-box and Gata motifs downstream of the zebrafish transcription start site (+15 and +48 bp, respectively). CNE1 may function as an enhancer for the Hemogen gene and/or act as the core promoter for exon 1T.


Figure 4. Conserved elements in the zebrafish Hemogen promoter are predicted targets for transcription factors. (A) Schematic of the zebrafish Hemogen gene. CNEs, black; coding exons, white; noncoding exons, gray; transcription initiation sites, bent arrows. Numbers indicate length in bp. (B, C) Sequence alignments of CNE1 and CNE2, respectively, from 9 teleost species, mice, and humans. ConTra software (Broos et al., 2011) predicted transcription factor binding sites for the Androgen receptor (light green), Brca1 (cyan), Foxl2 (pink), Gata1 (dark blue), Gfi1 (orange), HoxB4 (sky blue), Hnf1a (dark green), Klf4 (yellow), Myb (dark gray), P300 (red), Sox9 (purple), and the Scl/Lmo2/Ldb1 complex (light gray). Splice donor sites are highlighted black. Species abbreviations: Dr, Danio rerio; Ca, Cynoglossus semilaevis; Gm, Gadus morhua; Ga, Gasterosteus aculeatus; Ol, Oryzias latipes; Xm, Xiphophorus maculatus; On, Oreochromis niloticus; Tr, Takifugu rubripes; Tn, Tetraodon nigroviridis; Mm, Mus musculus; Hs, Homo sapiens.


Hematopoietic and neural expression of Hemogen in zebrafish is dependent on Gata1 binding to the promoter CNEs

In mammals, transcription of Hemogen from the proximal promoter is tightly regulated by Gata1 in hematopoietic cells (Yang et al., 2006). To investigate whether Gata1 regulates Hemogen in zebrafish, we analyzed a Gata1 ChIP-seq dataset that was generated to assess Gata1 activity in adult zebrafish erythrocytes (Yang et al., 2016). Figure 5A shows that Gata1 bound to CNE1 and CNE2 at sites overlapping their Gata motifs (red lines), which strongly indicates that Gata1 is required for transcription of Hemogen in zebrafish. Corroboration that CNE1 and CNE2 were active chromatin regions was provided by ATAC-seq and DNase I hypersensitive site analysis (Yang et al., 2016) (Fig. 5A). Our data reveal that Gata motifs in CNE1, like those in CNE2, are important regulators of Hemogen expression in zebrafish erythrocytes.

We performed WISH to compare the expression of Hemogen and Embryonic beta-globin (βe1-globin) in embryos produced by the Gata1-null mutant, vlad tepes (vltm651) (Lyons et al., 2002). At 33 hpf, Hemogen was expressed normally in circulating blood cells and in the hindbrain of wild-type siblings (Fig. 5B), and βe1-globin was abundant in the blood (Fig. 5B, inset). Homozygous vltm651 mutant siblings, by contrast, failed to express Hemogen in the blood and brain (Fig. 5C). This result mimicked the loss of βe1-globin in vltm651 mutants, with the exception that βe1-globin expression persisted in the PBI (Fig. 5C, inset), as has been demonstrated for α1-globin, Scl and Gata1 (Jin et al., 2009).
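As a worked check, the sibling counts given in the Figure 5 legend (16 of 21 embryos with the wild-type pattern and 5 of 21 with the mutant pattern) can be compared with the 3:1 ratio expected from a heterozygous in-cross of a recessive allele. The sketch below is an illustration added here, not an analysis reported in the text; the function names are hypothetical, and the 3:1 expectation assumes full penetrance.

```python
import math


def chi_square_gof(observed, expected_ratio):
    """Pearson chi-square goodness-of-fit statistic for observed counts."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_1df(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Counts from the Figure 5 legend: 16 wild-type-like and 5 mutant-like siblings.
gof_chi2 = chi_square_gof([16, 5], [3, 1])
gof_p = p_value_1df(gof_chi2)
```

With expected counts of 15.75 and 5.25, the statistic is close to zero, so these counts give no indication of a departure from the 3:1 expectation. The small expected count in the mutant class means the approximation should be read with caution.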


Figure 5. Gata1 binds distal and proximal promoter elements to regulate Hemogen expression in zebrafish. (A) Gata1 ChIP-sequencing showing enriched binding of Gata1 at CNE1 and CNE2 (C1 and C2, red lines) in the Hemogen promoter in adult zebrafish red blood cells (Yang et al., 2016). DNase-sequencing and ATAC-sequencing showing colocalization of the active chromatin regions (Yang et al., 2016). (B) Hemogen expression by WISH of wild-type (n = 16/21) and (C) homozygous mutant (n = 5/21) siblings (33 hpf) from in-crossed Gata1+/- vltm651 mutants. Insets show βe1-globin expression in mutant (n = 4/10) and wild-type (n = 6/10) siblings. Scale bar = 250 µm (C).


Tg(Hemgn:mCherry) zebrafish reveal the functions of the two Hemogen promoters

To determine the tissue-specific regulatory profiles of the two Hemogen promoters, we generated transgenic zebrafish embryos [Tg(Hemgn:mCherry,myl7:EGFP)] in which the mCherry reporter was controlled by the putative promoter elements (Fig. 6). The dual promoter, P1 (2,248 bp), spanned the upstream, non-coding region to the Hemogen start codon and contained both CNEs. Transgenic fish were outcrossed to wild-type TU zebrafish, and offspring with the strongest mCherry expression were selected as founders. In the early embryo, the P1 transgene drove expression of mCherry in primitive blood cells of the ICM and the PBI (20 hpf; Fig. 6B) and in primitive erythrocytes in circulation (Movie 1). Between 2 and 8 dpf, mCherry was expressed strongly throughout the pronephric ducts (Fig. 6C) and was present in the proximal convoluted tubule at 72 hpf (Fig. 6D). In adult transgenic fish, the head and trunk kidneys were positive for the reporter (Fig. 6H), as were Sertoli cells surrounding the seminiferous tubules of the testes (Fig. 6I). Therefore, the ~2.2 kb P1 transgene contained all of the regulatory elements necessary to recapitulate Hemogen expression in hematopoietic, renal, and testicular tissues (Fig. 6B-I). We note that the dual promoter did not confer detectable ovarian or neural expression, which may require more distal sequences.

We found that the same expression profile was driven by the endogenous Hemogen promoter in embryonic zebrafish by using CRISPR/Cas9 technology to insert the mCherry gene (containing a polyadenylation motif) two codons downstream of, and in frame with, the Hemogen start codon (see Methods; Fig. S3A,C). Homology-directed integration of the transgene, confirmed by sequencing of the locus, produced mCherry+ cells in the CHT and in the kidney in 10% (n = 15/150) of embryos at 3 dpf (Fig. S3B) and at a lower frequency in circulating RBC (n = 3/150, data not shown).

To characterize hematopoietic cell lineages that express Hemogen, the P1 reporter plasmid was injected into embryos of Tg(CD41:EGFP)Ia2Tg or Tg(Lcr:EGFP)cz3325Tg zebrafish, which have been used to track hematopoietic progenitors (Lin et al., 2005) and primitive and definitive erythrocytes (Ganis et al., 2012), respectively. We did not observe mCherry expression in the AGM, in the thymus, or in CD41+ HSPCs colonizing the thymus or pronephros (Bertrand et al., 2008). However, the reporter was strongly expressed in a subset of LCR+ erythroid and CD41+ myeloid-biased progenitors in the CHT (Fig. 6E,F), a tissue that supports myelopoiesis (Gekas and Graf, 2013; Medvinsky et al., 2011). This lends support to previous findings that Hemogen is a marker and promoter of myeloerythroid, but not lymphoid, lineages (Li et al., 2007; Lu et al., 2001). Maturing mCherry+ primitive progenitors peaked in brightness just prior to leaving the caudal plexus and entering circulation at 72 hpf (observed by time-lapse imaging; data not shown). However, mature definitive erythrocytes expressed little mCherry in adult transgenics (Fig. 6G), which supports prior observations that Hemogen expression is limited to primitive erythrocytes and immature definitive progenitors (Lu et al., 2001).


Figure 6. Promoter elements have distinct roles in driving hematopoietic, renal, and testicular expression of Hemogen in transgenic Tg(Hemgn:mCherry) zebrafish. (A) Schematic of the zebrafish Hemogen gene. CNEs, black; coding exons, white; transcription initiation sites, bent arrows. Three Tg(Hemgn:mCherry,myl7:EGFP) transgenes driven by portions of the Hemogen promoter were transfected into one-cell TU embryos by Tol2 transposase-mediated insertion. Numbers indicate length of promoter elements and arrows show gene direction. (B) 20 hpf. P1 transgene expression in the posterior blood island (PBI). (C) 72 hpf. P1 transgene expression in the pronephric ducts (PD). (D) 5 dpf. P1 transgene expression in the proximal convoluted tubule (PCT). (E,F) 72 hpf. Colocalization of mCherry and EGFP in progenitors in the CHT of Tg(Hemgn-P1:mCherry,Lcr:EGFP) or Tg(Hemgn-P1:mCherry,CD41:EGFP) zebrafish. (G) Transgene expression in mature erythrocytes from adult zebrafish. (H) Transgene expression in adult head kidney (HK), trunk kidney (TK), and tail kidney (T) near the EGFP+ heart (H). (I) Transgene expression in adult Sertoli cells (Se) that surround the seminiferous tubules (ST). (J) Proportion of embryos expressing transgenes P1, P2, or P3 in ICM, kidney, CHT, and circulating primitive erythrocytes (RBC). Scale bars = 100 µm (B,D-F,I); 500 µm (C,H); 25 µm (G).


Hemogen promoters have different functions in primitive and definitive erythropoiesis in zebrafish

We evaluated the separate and combined contributions of the two Hemogen promoters, including CNE1 or CNE2, to the observed tissue-expression profiles by injecting wild-type embryos with one of three Tg(Hemgn:mCherry,myl7:EGFP) reporter constructs in which mCherry expression was driven: 1) by the dual promoter (P1); 2) by a 2-kb fragment (P2) containing the distal promoter including CNE1; or 3) by a 188-bp fragment (P3) containing the proximal promoter including CNE2 (Fig. 6A). Transgenic embryos were screened for EGFP+ hearts, and mCherry transcription was confirmed by RT-PCR and sequencing.

mCherry fluorescence was examined in four cell types: 1) erythroid progenitors in the ICM at 1 dpf; 2) primitive erythrocytes in the peripheral blood at 3 dpf; 3) erythroid progenitors in the CHT at 3 dpf; and 4) renal cells of the kidney tubules at 3 dpf. Fig. 6J shows that the dual promoter (P1) supported strong expression of the mCherry reporter in erythroid cells of the ICM and peripheral blood (RBC), in the CHT, and in renal cells of the kidney. By contrast, the distal promoter (P2 construct) containing CNE1 failed to drive reporter expression in these tissues. Finally, the proximal promoter (P3 construct) containing CNE2 alone produced strong expression of the reporter in the CHT and in kidney cells but was not active in cells of the ICM and peripheral blood. Together, these results indicate that the proximal promoter containing CNE2 is necessary and sufficient to drive expression in definitive hematopoiesis and in the kidney, whereas the full 2.2-kb sequence including both promoters and CNEs is required in primitive erythropoiesis.


Figure 7. Morpholino targeting of Hemogen inhibits erythropoiesis in embryonic zebrafish. Embryos were injected with 2-4 ng antisense MO targeted to the first 25 coding nucleotides of Hemogen. (A-B) O-dianisidine staining of erythrocytes was decreased in morphants (MO) relative to wild-type embryos (WT) or embryos rescued with 500 pg synthetic Hemgn mRNA (zHem) at 24 hpf (ANOVA, Tukey post hoc test, P < 0.001). Live wild-type (C), Hem1 MO-injected (D), and Hem1mm mismatch MO-injected (E) Tg(Lcr:EGFP)cz3325Tg embryos at 20 hpf. Morphants showed decreased EGFP expression in the ICM compared to the wild-type and mismatch MO controls. Live wild-type (F), Hem1 MO-injected (G), and Hem1mm MO-injected (H) embryos at 72 hpf. Morphant embryos had fewer EGFP+ cells in circulation compared to the two controls. The dorsal aortas of embryos (insets above F-H) were magnified 20x to permit quantitation of EGFP+ erythrocytes. Background red (D,G) and green (E,H) fluorescence was generated by the fluorescent labels on the MOs. (I) In vivo flow quantitation of EGFP+ erythrocyte concentrations between 3-6 dpf in Hem1-injected (n = 9,7,7,7), Hem1mm-injected (n = 13,14,11,11), and uninjected (n = 5,10,10,9) embryos. Data shown as means ± s.e.m. (* P ≤ 0.05, ** P ≤ 0.001, ANOVA, Tukey-Kramer post hoc test). Scale bars = 500 µm (A-F); 100 µm (inset).


Morpholino knock-down of Hemogen protein expression partially disrupts erythropoiesis in zebrafish

To perturb Hemogen function in zebrafish, we first injected wild-type zebrafish embryos at the one-cell stage with an antisense morpholino oligonucleotide (MO), Hem1, targeted to the translation start codon of the Hemogen transcript. MO treatment significantly reduced Hemogen protein levels by 19% at 33 hpf (Student’s t-test, P < 0.05; Fig. S4A,B) and steady-state levels of βe1-globin mRNA at 3 dpf (Student’s t-test, P < 0.05; Fig. S4C). At 24 hpf, 61% of morphants were anemic compared to 35% of uninjected zebrafish (Fig. 7A,B). Red cell levels were restored to wild-type by co-injection of the MO with 500 pg of synthetic zebrafish Hemogen mRNA containing silent mutations in the MO target site. Both the uninjected and rescue treatments differed significantly from the MO treatment (ANOVA, Tukey post hoc test, P < 0.001; Fig. 7A).

We used Tg(Lcr:EGFP)cz3325Tg zebrafish to visualize the red blood cell population in Hem1-treated morphants from 0 to 6 dpf. Control embryos were injected with a 5-bp mismatch MO (Hem1mm) or were uninjected. At 20 hpf, EGFP+ erythrocytes appeared to be reduced in the ICM/PBI of 75% of Hem1 morphants (n = 56) but not in mismatch or uninjected control embryos (n = 14 and 63, respectively) (Fig. 7C-E). At 2 dpf, morphant embryos had few erythrocytes in circulation compared to controls (Fig. 7F-H, Movie 2). Using quantitative in vivo flow analysis (Fig. 7I), we found that morphant embryos at 3-5 dpf had fewer than 50% as many circulating EGFP+ erythrocytes as the uninjected and Hem1mm-injected controls, whereas the controls did not differ statistically from each other (ANOVA, Tukey-Kramer post hoc test, P < 0.05).


A conserved C-terminal domain in Hemogen is required for hematopoiesis and prevents apoptosis in embryonic tissues

The function of the putative C-terminal transactivation domain of zebrafish Hemogen was investigated using CRISPR/Cas9 mutagenesis. We generated zebrafish lines with mutations in the conserved region near the end of the third coding exon of Hemogen, immediately downstream of the TAD motif (Fig. 8A-D, Fig. S1). Founders (F0) were out-crossed to wild-type TU zebrafish and mutant alleles were genotyped in the F1 generation by high resolution melting analysis and by sequencing the locus (Fig. 8E, Fig. S1). One line, Hemgnnuz2, had a 5-bp deletion (Δ5) that produced a frameshift mutation, thereby introducing a premature stop codon (Fig. 8E, Fig. S1). PolyA-tailed transcripts of the Δ5 allele were detected at equivalent steady-state levels relative to the wild-type allele in peripheral blood from individual adult heterozygotes (Fig. 8F). Western blot analysis revealed, however, that truncated Hemogen protein was almost undetectable in peripheral blood from single heterozygous adults (data not shown) and in pooled 33-hpf embryos from a heterozygous in-cross (Fig. 8G). Therefore, if the truncated Hemgnnuz2 transcripts were translated, then the protein must have been rapidly degraded. The second line, Hemgnnuz4, contained an in-frame 12-bp deletion (Δ12), which deleted an acidic cluster (EEED) in the last repeat that is conserved in teleost species (Fig. S1). Hemogen protein was detected in the blood of homozygous Δ12 adults by Western blot (data not shown).
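The contrast between the two alleles follows from simple arithmetic: a deletion whose length is not a multiple of three shifts the reading frame, whereas a 12-bp deletion removes exactly four codons. The sketch below illustrates this with a hypothetical toy coding sequence that encodes an EEED cluster; it is not the Hemogen sequence, and the deletion coordinates are chosen only for illustration.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence from its first base to the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def delete(cds, start, length):
    """Remove `length` bases beginning at 0-based index `start`."""
    return cds[:start] + cds[start + length:]


# Hypothetical toy sequence (M P E E E D L T K G S stop); not the Hemogen gene.
wild_type = "ATGCCTGAAGAGGAAGATCTGACTAAAGGATCCTAA"
delta5 = delete(wild_type, 6, 5)    # 5 is not a multiple of 3: frameshift
delta12 = delete(wild_type, 6, 12)  # 12 is a multiple of 3: in-frame deletion
```

In this toy case the 5-bp deletion changes every downstream residue and brings a stop codon into frame early, while the 12-bp deletion removes only the EEED cluster and leaves the downstream residues intact.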

To evaluate the effects of the mutant Hemogen alleles on erythropoiesis during development, we examined embryos from mutant crosses by microscopy and genotyped them between 20 and 48 hpf (Fig. 8A-C). Mutant genotypes were recovered near the expected Mendelian ratios (Fig. S5A), but homozygous Δ5 Hemgnnuz2 mutants could not be raised to adulthood. To classify the mutants, we assessed the relative numbers of blood cells and relative concentrations of hemoglobin beginning at 2 dpf (Ransom et al., 1996). Embryos from a heterozygous in-cross were scored for hypochromic blood (paler blood) and decreased numbers of circulating cells on the yolk sac and in the vasculature. Erythrocyte levels were reduced to about 25-75% of normal levels in frameshift Hemgnnuz2/+ mutants (n = 8) at 24 hpf compared to wild-type siblings (n = 7) (Fig. 8C). At 48 hpf, 59% of heterozygous (n = 49) and 50% of homozygous (n = 12) Hemgnnuz2 mutants had reduced numbers of circulating erythrocytes (Fig. 8H, Movie 3), and homozygotes could be distinguished by their more severe anemia. Comparable proportions of anemic individuals were observed for heterozygotes and homozygotes of the Δ12 Hemgnnuz4 allele: 64% (n = 25) and 60% (n = 10), respectively (Fig. 8H). In all cases, the proportion of anemic mutant embryos was significantly different from that for wild-type (* P ≤ 0.05, ** P ≤ 0.005, Chi square).
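The kind of comparison behind these Chi square tests can be sketched as a 2 × 2 contingency test. The heterozygote counts below are reconstructed from the reported 59% of 49 embryos (about 29 anemic); the wild-type counts are HYPOTHETICAL placeholders, because the wild-type group size is not given in this section. Whether the original analysis applied a continuity correction is also not stated, so the uncorrected Pearson statistic is used here as an assumption.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def upper_tail_p_1df(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Heterozygotes: about 29 anemic and 20 normal (59% of n = 49, from the text).
# Wild-type group: HYPOTHETICAL counts (8 anemic, 32 normal) for illustration only.
anemia_chi2 = chi_square_2x2(29, 20, 8, 32)
anemia_p = upper_tail_p_1df(anemia_chi2)
```

With these illustrative counts the statistic is about 13.9, well above the 3.84 threshold for P = 0.05 at one degree of freedom. The result depends entirely on the assumed wild-type counts and should not be read as a reanalysis of Fig. 8H.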

Erythrocyte levels were partially suppressed in adult heterozygotes. Hemgnnuz2/+ and Hemgnnuz4/+ adults gave average erythrocyte counts of 2.2 ± 1.0 × 10⁶ cells µl⁻¹ and 2.1 ± 0.8 × 10⁶ cells µl⁻¹, respectively, whereas wild-type zebrafish had 4.3 ± 1.0 × 10⁶ cells µl⁻¹ (Fig. 8J,K). Homozygous Δ12 Hemgnnuz4 adults gave average erythrocyte counts of 2.2 ± 1.2 × 10⁶ cells µl⁻¹ (Fig. 8K). Taken together, the erythroid defects of embryonic and adult zebrafish carrying the CRISPR-generated mutant alleles support the conclusion that the conserved C-terminus of Hemogen functions as a TAD, but the mechanism of action of these mutations remains to be determined.


Both the Δ5 and Δ12 mutant Hemogen alleles also caused mild to severe developmental defects in the notochord and the trunk of heterozygotes and homozygotes (Fig. 8A-B, Fig. S5B). Embryos had kinked notochords and exhibited increased cellular refractility consistent with apoptotic cell death. Elevated apoptotic cell death was apparent in Hemgnnuz2/+ mutants as detected by staining with acridine orange (Fig. S5C). Apoptosis occurred throughout the embryo, including sites of embryonic hematopoiesis. Nevertheless, viable heterozygotes for both alleles could be raised to adulthood; they were slightly smaller than wild-type siblings (Fig. 8I). Impaired growth was significant in homozygous Δ12 Hemgnnuz4 adult mutants (Student’s t-test, P = 0.04, n = 3; Fig. S5D,E).


Figure 8. CRISPR/Cas9 mutagenesis of the third exon of zebrafish Hemogen reduces primitive and definitive erythropoiesis. Embryos were injected with Cas9 mRNA and a guide RNA to establish lines with mutations in exon three of zebrafish Hemogen. (A) 20 hpf. Representative wild-type and mutant siblings with notochord defects (arrow). (B) 48 hpf. Mutant Δ12 embryos with an in-frame deletion showing kinked notochords (arrow). (C) 24 hpf. Wild-type and Δ5/+ mutant embryos stained with diaminofluorene. Production of erythrocytes was reduced in heterozygotes. (D) Schematic of CRISPR/Cas9 target in the third exon (red arrowhead) of zebrafish Hemogen. (E) Sequences of founder mutations aligned at the CRISPR target site: Δ5 (Hemgnnuz2); Δ12 (Hemgnnuz4). The sequence traces show the Δ5 and Δ12 mutant alleles. PAM, blue and underlined; Δ, deletions (highlighted in red). (F) Relative expression of wild-type and Δ5 transcripts in blood from single adult, heterozygous Hemgnnuz2/+ mutants determined by qRT-PCR with allele-specific primers. Three biological replicates were normalized to β-actin. Error bars represent the standard deviation. (G) Western blot of Hemogen in pooled 33 hpf wild-type embryos or pooled embryos from a Δ5 Hemgnnuz2/+ heterozygous in-cross. We calculated that the protein would run 6.5 kDa above its molecular weight at 28.5 kDa because of its high acidic composition (Guan et al., 2015). Arrows show the calculated sizes of wild-type and truncated alleles. (H) Proportion of genotyped mutants and wild-type sibling embryos at 2 dpf that were anemic (black) or phenotypically normal (white) (* P ≤ 0.05, ** P ≤ 0.005, Chi square). (I) Wild-type and mutant zebrafish heterozygous for the Δ5 and Δ12 alleles. (J) Red blood cells from adult Hemgnnuz2/+ mutant zebrafish and wild-type siblings. (K) Erythrocyte counts in adult heterozygous Hemgnnuz2 (Δ5, n = 12), heterozygous Hemgnnuz4 (Δ12, n = 4) mutants, homozygous Hemgnnuz4 (Δ12, n = 2) mutants, and wild-type (n = 9) siblings (* P ≤ 0.05, ANOVA, Tukey post hoc test). Scale bars = 500 µm (A-C); 50 mm (I); 20 µm (J).


DISCUSSION

The zebrafish is a compelling model for understanding the pleiotropic functions of Hemogen in the context of vertebrate development. Our results show that zebrafish Hemogen is considerably smaller than its human ortholog, a distinction true for teleost and mammalian Hemogens in general. Hemogen is expressed in multiple zebrafish tissues from the early embryo to the adult under the control of at least two promoters. Both primitive and definitive erythropoiesis are affected by depletion of Hemogen and by targeted mutation of a putative, C-terminal TAD. The transgenic and mutant zebrafish lines that we have generated will contribute to a mechanistic understanding of this important transcription factor.

Hemogen ‒ small or large, it’s built of related modules and has a conserved role in erythropoiesis

We show that the divergent Hemogens of zebrafish and human are largely, but not entirely, built of 21-25 residue repeats; the number of repeats largely determines protein size. The repeat consensus sequences are distinct, but they appear to have evolved from an 8-10 amino acid core motif (Fig. S2). Although all repeats are acidic (Fig. S2), the terminal repeat of each Hemogen is particularly so (> 38% Asp and Glu for zebrafish, > 29% for human), and these repeats contain TAD motifs. Together, these features suggest that Hemogens possess flexible, intrinsically disordered TADs, as is true of many transcription factors (e.g., p53, HIF-1α, and NF-κB). The multivalent structure of Hemogen provides opportunities for cooperative binding to single or multiple protein partners, including P300 (Zheng et al., 2014).
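The acidic content quoted for the terminal repeats is the fraction of residues that are Asp or Glu. A minimal sketch of that calculation follows; the function name and the 21-residue example peptide are hypothetical and do not correspond to an actual Hemogen repeat.

```python
def acidic_fraction(peptide):
    """Fraction of residues that are Asp (D) or Glu (E) in a one-letter sequence."""
    residues = peptide.upper()
    if not residues:
        raise ValueError("peptide must not be empty")
    return sum(residues.count(aa) for aa in "DE") / len(residues)


# Hypothetical 21-residue repeat, not an actual Hemogen repeat.
toy_repeat = "PEEDLPSEEEDQTDEPKAEEV"
toy_fraction = acidic_fraction(toy_repeat)
```

Applied to each repeat of an alignment such as the one in Fig. S2, a function of this kind would give the per-repeat percentages that are compared in the text.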


Hemogen interacts with a variety of proteins to stimulate the transcription of genes involved in terminal erythroid differentiation and other processes. In humans, Hemogen contributes to transcription of erythroid genes in part by recruiting P300 to acetylate and activate Gata1 (Zheng et al., 2014). Our results show that nonsense (Δ5) and deletion (Δ12) alleles of Hemogen vicinal to the zebrafish TAD motif cause significant reductions of erythrocyte levels in embryos and adults. The Δ12 allele may be hypomorphic, but we have not determined whether the protein that is expressed has reduced activity.

Hemogen – targeted mutation of the acidic C-terminus impairs erythropoiesis, but not completely

Our CRISPR-generated zebrafish mutant lines show that nonsense (Δ5) and deletion (Δ12) alleles of Hemogen caused a decrease in erythrocyte levels in embryos and adults. However, these phenotypes were incompletely penetrant ‒ in both heterozygous and homozygous Hemogen mutants the proportion of anemic embryos was 50-65%, compared to 20% for wild-types. If Hemogen were essential for erythropoiesis, one would anticipate an erythroid-null phenotype for homozygous mutants, as observed for the Gata1 mutant, vlad tepesm651 (Lyons et al., 2002). Rather, the Hemogen phenotype resembles the variable reduction of red cells in zebrafish zinfandel (zinte207) mutants that harbor a mutation in a regulatory region at the globin locus (Ransom et al., 1996), a known target of both Hemogen and Gata1 transcription factors (Zheng et al., 2014). Loss of Hemogen in zebrafish contributes to decreased expression of Embryonic beta-globin (Fig. S4), which may explain the hypochromic state of Hemogen mutants.


The most plausible explanation for the incomplete penetrance of anemia in Hemogen mutants is the phenomenon of genetic compensation, which may occur when genes are knocked out as opposed to knocked down (El-Brolosy and Stainier, 2017; Rossi et al., 2015). Although the mechanisms are poorly understood, genetic compensation entails changes in gene expression (e.g., upregulation of paralogous genes or functionally related genes) that at least partially offset the phenotype caused by the mutant protein. Compensation through elevated expression of other erythroid co-activators is an attractive possibility that might maintain erythrocyte production in Hemogen mutants. The functional loss of Hemogen could be mitigated by Gata1 homodimerization and/or by direct recruitment of CBP/P300, both of which enhance Gata1 activity (Ferreira et al., 2005; Nishikawa et al., 2003).


Similar design and regulation of Hemogen and Gata1 genes

Comparison of the expression of Hemogen and of Gata1 throughout zebrafish development reveals a remarkable degree of overlap in tissue and cellular specificity. For example, Gata1 mRNA appears in cells of the LPM at the two-somite stage (Detrich et al., 1995), immediately prior to the onset of Hemogen expression at ten somites. Furthermore, Hemogen and Gata1 are co-expressed in primitive erythrocytes and definitive hematopoietic progenitors (Ferreira et al., 2005; Lu et al., 2001), in Sertoli cells (Nakata et al., 2013; Wakabayashi et al., 2003), and at the midbrain-hindbrain boundary (Volkmann et al., 2008). Interestingly, both Hemogen and Gata1 genes possess hematopoietic- and testis-specific promoters (Wakabayashi et al., 2003). The temporal and spatial coincidence of Hemogen and Gata1 expression almost certainly results from their similar regulatory architectures and from regulatory crosstalk. Our results and studies conducted by others (Ding et al., 2010; Yang et al., 2006; Zheng et al., 2014) indicate that reciprocal transcriptional activation of Hemogen and Gata1 may form a positive feedback loop that drives erythropoiesis.

Strikingly, the two CNEs of Hemogen are organized like, and have the same functions as, the distal and proximal enhancers of the Gata1 gene (McDevitt et al., 1997; Onodera et al., 1997; Suzuki et al., 2009). The proximal Gata1 promoter functions exclusively in definitive erythropoiesis (McDevitt et al., 1997), as does CNE2 of zebrafish Hemogen. In contrast, transcription of Gata1 in primitive erythrocytes requires both the proximal promoter and a distal enhancer comparable to Hemogen CNE1 (McDevitt et al., 1997). Fig. S6 presents a model for the transition from primitive to definitive hematopoiesis based on chromatin looping at the Hemogen locus. We propose that the transition from primitive to definitive erythropoiesis involves a switch from a loop conformation to a linear conformation, mediated by the Gata1/Ldb1-complex at erythroid transcription factories (Osborne et al., 2004; Schoenfelder et al., 2010). This model may also apply to the Gata1 enhancer, which is another known target of the Ldb1-complex (Love et al., 2014). The zebrafish lines produced in this study may help clarify the cell-specific Hemogen expression profile driven by different Gata1-containing complexes and the functions of Hemogen in different cell types.

MATERIALS AND METHODS

Fish husbandry

Wild-type (SAT, AB, TU) zebrafish (Danio rerio), the transgenic lines Tg(Lcr:EGFP)cz3325Tg (Ganis et al., 2012) and Tg(CD41:EGFP)Ia2Tg (Traver et al., 2003), and the mutant vlad tepesm651 (Lyons et al., 2002) were all generously provided by Dr. Leonard I. Zon (Howard Hughes Medical Institute and Harvard Medical School, Boston). Animal procedures were carried out in full accordance with established standards set forth in the Guide for the Care and Use of Laboratory Animals (8th Edition). The animal care and use protocol for live zebrafish embryos was reviewed and approved by Northeastern University’s Institutional Animal Care and Use Committee (Protocol No. 15-0207R). The animal care and use program at Northeastern University has been continuously accredited by AAALAC Int. since July 22, 1987, and maintains the Public Health Service Policy Assurance number A3155-01.

Cloning and sequence analysis of zebrafish Hemogen cDNAs

Total RNA was isolated from wild-type AB zebrafish embryos and adult tissues (kidney, blood, brain, ovary, intestine) using TRI reagent (Sigma, T9424) and the Ribopure Kit (Ambion, AM1924). Total cDNA was produced from mRNA using M-MuLV reverse transcriptase (NEB, M0253S) and an oligo(dT)23 primer. Hemogen cDNA was amplified by PCR from total cDNA with 1 µM primers (Table S1); the amplification program was 35 cycles of 98°C for 10 s, 57°C for 10 s, and 72°C for 30 s. PCR products were cloned into the pGEM-T Easy vector (Promega, A1360), plasmids were transformed into 5-α competent cells (New England Biolabs, C2987H), recombinant plasmids were identified by blue/white screening and purified with the Wizard Plus SV Miniprep Kit (Promega, A1330), and inserts were sequenced by GeneWiz.

Bioinformatic comparison of vertebrate Hemogen genes and Hemogen proteins

We utilized the murine gene nomenclature for comparing orthologs from different vertebrate species. We used Blast+ (Altschul et al., 1990) to identify Hemogen in the zebrafish genome (assembly GRCz11) (Howe et al., 2013). Chromosomal synteny comparisons were performed using the Synteny Database with a sliding window of 200 genes (Catchen et al., 2009) and Ensembl Genomes v74 (Kersey et al., 2016). Hemogen promoter alignments were obtained from whole genome alignments for 10 teleost species (Ensembl v74) (Yates et al., 2016). Transcription factor binding motifs were predicted using the program ConTra with the default similarity matrix of 0.75 (Broos et al., 2011). Transcription start sites were predicted using NNPP v2.2 with a score cutoff of 0.98 (Reese, 2001).

Protein domains in zebrafish were identified using annotated human Hemogen (Yang et al., 2001), or they were predicted using HHpred (Soding et al., 2005) and the Conserved Domain Database (CDD) (Marchler-Bauer et al., 2015). Peptide repeats were predicted with RADAR (Heger and Holm, 2000). The 9aaTAD Prediction Tool was first used to predict transactivation domain (TAD) motifs, starting with low stringency DFx repeats (Piskacek et al., 2016). These were then culled by ϕϕxxϕ or ϕxxϕϕ criteria, where ϕ is a bulky hydrophobic residue (Dyson and Wright, 2016). We refer to the latter five amino acid consensus sequences as “TAD motifs,” in contrast to larger, functionally defined “transactivation domains” (TADs). Ab initio tertiary structure models were created for zebrafish and human Hemogen proteins with I-Tasser (Yang et al., 2015) based on the X-ray structure for the secretory component of Immunoglobulin A (PDB:3chnS), which was the best of ten predicted structural templates determined by LOMETS (Wu and Zhang, 2007). The 3D models were superimposed using TM-align (Zhang and Skolnick, 2005) and Geneious version R10 (Kearse et al., 2012).
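The ϕϕxxϕ / ϕxxϕϕ culling step can be illustrated with a simple pattern search. The following is a minimal sketch and not the procedure used in this study: the set of residues counted as bulky hydrophobic (L, I, V, M, F, W, Y), the function name, and the example peptide are all assumptions made for illustration.

```python
import re

# Assumption: residues counted as "bulky hydrophobic" (phi); the text does not list them.
BULKY_HYDROPHOBIC = "LIVMFWY"
PHI = f"[{BULKY_HYDROPHOBIC}]"

# phi-phi-x-x-phi or phi-x-x-phi-phi, where x is any residue.
# The lookahead reports every start position, so overlapping candidates are kept.
TAD_MOTIF = re.compile(f"(?=({PHI}{PHI}..{PHI}|{PHI}..{PHI}{PHI}))")


def find_tad_motifs(protein):
    """Return (1-based start, five-residue window) for each candidate TAD motif."""
    return [(m.start() + 1, m.group(1)) for m in TAD_MOTIF.finditer(protein.upper())]


# Hypothetical peptide, for illustration only (not a Hemogen sequence).
print(find_tad_motifs("SEDFLPELLDSG"))
```

A search of this kind only nominates candidate windows; whether a candidate functions as a TAD still has to be established experimentally.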

MO knock-down of Hemogen in zebrafish and rescue of the morphant phenotype

The antisense MO Hem1 (5’-TCTCTTTCTCCAACGGGTCTTCCAT-3’), which targets the first 25 base pairs of the zebrafish Hemogen open reading frame, was designed according to the manufacturer’s instructions (Gene Tools, LLC). The control MO (Hem1mm; 5’-TCTgTTTgTCCAtCGGcTCTTCgAT-3’) targeted the same sequence but contains five mismatched bases to prevent efficient binding to Hemogen mRNA. MOs were labeled with lissamine or fluorescein so that the quality of injections could be monitored by fluorescence microscopy. MOs were injected (2-8 ng) into embryos at the single-cell stage using a PLI-100 Picoinjector (Medical Systems Corporation, 65-0001) and a micromanipulator (Narishige, MN-151). Injected embryos were sampled from 0-6 dpf for subsequent analyses.
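The relationship between the two morpholinos and their target can be checked computationally. The sketch below uses only the two MO sequences given in the text; the helper names are illustrative. It derives the sense-strand sequence bound by Hem1 and lists the positions at which the mismatch control differs.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (input may be mixed case)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def mismatch_positions(a, b):
    """1-based positions at which two equal-length oligos differ."""
    if len(a) != len(b):
        raise ValueError("oligos must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a.upper(), b.upper())) if x != y]


HEM1 = "TCTCTTTCTCCAACGGGTCTTCCAT"     # antisense MO, from the text
HEM1_MM = "TCTgTTTgTCCAtCGGcTCTTCgAT"  # mismatch control, from the text

target = reverse_complement(HEM1)      # sense-strand sequence bound by Hem1
print(target)                          # begins with the ATG start codon
print(mismatch_positions(HEM1, HEM1_MM))
```

The derived target begins with ATG, consistent with an MO directed at the first 25 bases of the open reading frame, and the control differs from Hem1 at five positions.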

Rescue of the morphant phenotype was tested by co-injection of the Hem1 MO with 500 pg synthetic zebrafish Hemogen mRNA transcribed from a zebrafish Hemogen cDNA cloned into pGEM-T Easy (Promega). Primers (Table S1) introduced five silent mutations within the MO target site. The clone was digested with SpeI, and mRNA was transcribed, capped, and polyadenylated in vitro using the mMessage T7 kit (Ambion, AM1340) and the Poly(A) Tailing Kit (Ambion, AM1350). mRNA was purified with the MEGAclear kit (Ambion).

In-situ hybridization

The spatial and temporal patterns of expression of selected genes were analyzed by whole-mount in situ hybridization (WISH) of zebrafish embryos following standard protocols (Jacobs et al., 2011). These methods were adapted to evaluate Hemogen expression in tissues, peripheral blood smears, and pronephric kidney prints prepared from euthanized adult fish [200 mg L-1 tricaine methane sulfonate (MS222; Sigma-Aldrich, 886862)] (Detrich and Yergeau, 2004; Gupta and Mullins, 2010). For sectioning, embryos and tissues were embedded in a solution containing 0.25 g gelatin, 30 g albumin, 22 g sucrose, 2.5% glutaraldehyde (v/v) per 100 ml phosphate buffered saline (PBS). Sections were cut with a vibrating blade microtome (Leica, VT1000S). Digoxigenin-labeled antisense and sense RNA probes were transcribed from zebrafish cDNA clones using the DIG RNA Labeling Kit (Roche Diagnostics, 11175025910).

Indirect Immunofluorescence

Zebrafish embryos were fixed in 4% paraformaldehyde (PFA) at 48 hpf. Embryos were incubated with 1:1000 rabbit anti-Hemogen primary antibody (Aviva, ARP57794_P050) followed by 1:1000 goat anti-rabbit IgG Alexa Fluor 488 secondary antibody (Life Technologies, A11034) as previously described (Westerfield, 2000). The specificity of the Hemogen antibody was validated both by Clontech and by our laboratory by Western blotting of zebrafish protein extracts.

Hemoglobin staining

To detect red blood cells in circulation, embryos were stained with o-dianisidine (Iuchi and Yamamoto, 1983) or diaminofluorene (McGuckin et al., 2003).

Western blotting

Total embryonic protein was prepared for sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) from dechorionated, 33-hpf embryos (n = 80) by homogenization in lithium dodecyl sulfate (LDS) Bolt buffer (Life Technologies, B007) and NuPAGE reducing agent (Life Technologies, NP0009) using a pestle and microcentrifuge tube (USA Scientific, 1415-5390). Samples were boiled for 3 min and centrifuged at top speed for 2 min. Aliquots (15 µg) were electrophoresed on a 4-12% SDS polyacrylamide gel, and the separated proteins were transferred to a polyvinylidene difluoride (PVDF) membrane with the iBlot system (Life Technologies, IB21001). Membranes were blocked in maleic acid blocking buffer (2% Roche blocking reagent, 2% BSA, 0.2% heat treated goat serum, 0.1% Tween-20) for 1 h at room temperature and then incubated overnight at 4°C with 1:1000 rabbit anti-Hemogen (Aviva, ARP57794_P050) or with 1:1000 mouse anti-GAPDH (Aviva, OAE00006) antibodies. Membranes were washed in tris-buffered saline and Tween 20 (TBST) and incubated for 2 h with horseradish peroxidase (HRP)-conjugated goat anti-rabbit IgG (H&L) (Aviva, ASP00001) or HRP-conjugated goat anti-mouse IgG (H&L) (Aviva, OARA04973), respectively. Bound antibodies were detected with the Amersham ECL Western Blotting Analysis System (GE Healthcare, RPN2106) on CL-XPosure film (Thermo Scientific, 34091).

Tol2 generation of Tg(Hemgn:mCherry) zebrafish

To identify the regulatory elements that drive Hemogen expression in zebrafish, three different Tg(Hemgn:mCherry) reporter plasmids were created using Gateway Cloning Technology (Invitrogen, 11791020) (Hartley et al., 2000). First, the proximal Hemogen promoter (~2.2 kb) was amplified from wild-type SAT zebrafish using 1 µM primers (Table S1). The promoter sequence spanned the upstream, non-coding region before, but not including, the Hemogen translation start codon. The promoter was cloned between KpnI/SpeI restriction sites in the p5e-MCS vector (Tol2kit, #228) using the Tol2kit vector system (Kwan et al., 2007) to generate the entry clone, p5e-Hemgn-1. The resulting plasmid was digested with NaeI/KpnI or NaeI/SpeI to remove each of two conserved non-coding elements (CNE1 or CNE2) from the promoter. Each new construct was blunt-ended with Q5 Hot Start High-Fidelity 2x Master Mix (NEB) and religated with T4 DNA Ligase (NEB) to create p5e-Hemgn-2 and p5e-Hemgn-3. Each of the three entry clones was cloned in front of the mCherry gene within the pDestTol2CG2 destination vector (Tol2kit, #395). The pCS2FA-transposase clone (Tol2kit, #396) was digested with PmeI, and Tol2 transposase mRNA was transcribed, capped, and polyadenylated in vitro using the mMessage SP6 kit (Ambion, AM1340) and the Poly(A) Tailing Kit (Ambion, AM1350). mRNA was purified by precipitation using 2.5 M LiCl. Transposase mRNA (37 ng µL-1) and each of the Tg(Hemgn:mCherry,myl7:EGFP) expression clones (25 ng µL-1) were co-injected into one-cell wild-type zebrafish embryos. Founders were raised and out-crossed to wild-type TU zebrafish for two generations.

CRISPR/Cas9 generation of transgenic and mutant zebrafish

Optimal targets for CRISPR-Cas9 mutagenesis were identified within the first and third exons of zebrafish Hemogen using the program CHOPCHOP (Labun et al., 2016; Montague et al., 2014). The templates for multiple small guide RNAs were produced by a cloning-free method as previously described (Table S1) (Hruscha et al., 2013; Talbot and Amacher, 2014). Guide RNAs were transcribed with the T7 MaxiScript Kit (Ambion, AM1312) and purified by LiCl precipitation.
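The layout of the cloning-free guide RNA templates can be sketched from the oligonucleotides listed in Table S1. The sketch rests on an inference: the three gene-specific oligos in Table S1 share a 5' segment containing the T7 promoter and a 3' segment that is complementary to the end of the constant oligo, so that the two oligos can anneal and be extended into a full template. The constant names and function names below are illustrative, not taken from the cited protocols.

```python
# Segments shared by the three gene-specific sgRNA oligos in Table S1 (inferred by comparison).
T7_PREFIX = "GAAATTAATACGACTCACTATAGG"
SCAFFOLD_OVERLAP = "GTTTTAGAGCTAGAAATAGC"

# Constant (scaffold) oligo from Table S1, written 5' to 3'.
CONSTANT_OLIGO = (
    "AAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACG"
    "GACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAAC"
)


def reverse_complement(seq):
    """Reverse complement of an uppercase or mixed-case DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def gene_specific_oligo(variable_segment):
    """Assemble a gene-specific oligo: shared 5' segment + variable segment + shared 3' overlap."""
    return T7_PREFIX + variable_segment.upper() + SCAFFOLD_OVERLAP


def overlap_anneals():
    """True if the 3' overlap is complementary to the 3' end of the constant oligo."""
    return CONSTANT_OLIGO.endswith(reverse_complement(SCAFFOLD_OVERLAP))


# Variable segment of the exon 3 oligo, read from Table S1.
print(gene_specific_oligo("ATCTGGGGCCAGATGAGG"))
print(overlap_anneals())
```

Assembling the exon 3 variable segment in this way reproduces the exon 3 oligo listed in Table S1, which supports the inferred layout.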

A donor construct for homology directed repair was created containing the mCherry gene and polyadenylation signal flanked by 199 bp and 253 bp homology arms that were PCR amplified from the sequence surrounding exon 1 of Hemogen from wild-type AB zebrafish (Table S1). The homology arms and mCherry gene were PCR amplified with primers that added AvrII and ClaI restriction sites, ligated, and cloned into the pGEM-T Easy vector (Promega). Tg(Lcr:EGFP)cz3325Tg embryos were co-injected at the single-cell stage with EcoRI linearized donor plasmid (25 ng µl-1), two exon-1 targeting guide RNAs (150 ng µl-1), and Cas9 mRNA (300 ng µl-1) (Trilink). Embryos were checked for fluorescence between 1 and 3 dpf. To confirm integration, the locus was PCR amplified with internal and external primers (Table S1) and cloned into the pGEM-T Easy vector for sequencing.

Wild-type (TU) embryos were co-injected with a guide RNA (150 ng µl-1) targeting exon 3, Cas9 mRNA (300 ng µl-1), and mCherry mRNA (30 ng µl-1) to identify successful injections. Embryos were raised and adults were tail-clipped for haplotyping by high-resolution melting analysis (HRMA) as previously described (Talbot and Amacher, 2014). PCR amplification was run using 1 µM primers (Table S1) with PowerUp SYBR MasterMix (Applied Biosystems, A25742) on a QuantStudio 3 Real-time PCR system (ThermoFisher, A28137). Founder mutants were outcrossed to wild-type (TU) fish. The offspring were raised and mutations were characterized by HRMA and sequencing of the locus.

Imaging of zebrafish embryos

Fixed embryos were mounted in 80% glycerol and imaged with a dissecting microscope (Nikon, SMZ-U) and a CCD digital camera (Diagnostic Instruments, SPOT32). Live embryos were embedded in 0.1% agarose in embryo medium (EB) with 0.01% tricaine and imaged with an epifluorescence-equipped microscope (Nikon, Eclipse E800). Movies (0.01 s interval) and time-lapse images (1 min interval) were obtained using a Photometrics Scientific CoolSNAP EZ camera and Nikon NIS-Elements AR 4.20 software. Methods for in vivo flow analyses were adapted to quantify fluorescently labeled red blood cells in MO-injected Tg(Lcr:EGFP)cz3325Tg zebrafish (Schwerte et al., 2003; Zeng et al., 2012). Briefly, 100-frame videos were acquired with a 500 μs exposure time and no delay. The field of view (20x) was centered on the dorsal aorta adjacent to the cloaca. The summed maximum intensity images of all frames were used to create “casts” of the dorsal aorta, and the average volume was calculated assuming cylindrical vasculature. EGFP+ cells were converted to binary objects (6.66 µm diameter, contrast 180) and counted within the region of interest.
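The cylindrical approximation used for the vessel volume can be written out explicitly. The following sketch shows only the arithmetic; the vessel dimensions and cell count are hypothetical placeholders, and the function names are illustrative rather than part of the imaging software.

```python
import math


def cylinder_volume_nl(diameter_um, length_um):
    """Volume of a cylindrical vessel segment in nanolitres (1 nL = 1e6 cubic micrometres)."""
    radius_um = diameter_um / 2.0
    return math.pi * radius_um ** 2 * length_um / 1e6


def cells_per_nl(cell_count, diameter_um, length_um):
    """Cell density within the segment, in cells per nanolitre."""
    return cell_count / cylinder_volume_nl(diameter_um, length_um)


# Hypothetical measurements: a 20 µm wide, 400 µm long aorta segment containing 12 EGFP+ cells.
volume = cylinder_volume_nl(20.0, 400.0)
density = cells_per_nl(12, 20.0, 400.0)
print(f"volume = {volume:.4f} nL, density = {density:.1f} cells per nL")
```

Expressing counts per unit volume allows embryos with different vessel diameters to be compared on the same scale.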

qRT-PCR

RNA was purified from adult zebrafish tissues or 10-30 pooled embryos at 3 or 4 dpf in TriZol (Sigma-Aldrich, T9424) using the PureLink RNA purification Kit (Ambion). DNase treated RNA was reverse transcribed with a polyT(23) primer using Protoscript II RT-PCR kit (New England Biolabs, M0368S). Target genes were amplified in triplicate from cDNA by qRT-PCR with 1 µM primers (Table S1). Standard curves were generated to confirm primer efficiencies. Target gene expression was normalized to beta-actin for comparison by the ΔΔCt method. Three or four biological replicates were used for each treatment for statistical comparisons.
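The ΔΔCt calculation can be summarized in a few lines. This is a minimal sketch that assumes primer efficiencies close to 100% (one doubling per cycle); the Ct values are invented for illustration and the function names are not from any analysis package used in this study.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean Ct of the target gene minus mean Ct of the reference gene (technical replicates)."""
    return mean(target_cts) - mean(reference_cts)


def fold_change(treated_target, treated_ref, control_target, control_ref):
    """Relative expression by the delta-delta-Ct method, assuming doubling every cycle."""
    ddct = delta_ct(treated_target, treated_ref) - delta_ct(control_target, control_ref)
    return 2.0 ** (-ddct)


# Invented triplicate Ct values: target gene and beta-actin in morphant and wild-type samples.
morphant = fold_change([25.1, 24.9, 25.0], [18.0, 18.1, 17.9],
                       [24.0, 24.1, 23.9], [18.0, 17.9, 18.1])
print(f"expression relative to wild-type: {morphant:.2f}")
```

A one-cycle increase in the normalized Ct of the treated sample corresponds to a halving of relative expression.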

Statistical analyses

Data were analyzed as means ± s.e.m. or means ± s.d. as noted. Statistical tests applied to the results are provided with each experiment. Differences with a p-value ≤ 0.05 were considered significant.

GenBank accession numbers

Zebrafish Hemgn isoform 1, JZ970258; zebrafish Hemgn isoform 2, JZ970260; zebrafish Hemgn isoform 3, JZ970259; and zebrafish Hemgn isoform 4, JZ970257.

Zebrafish ZFIN IDs

Transgenic construct Tg(hemgn:mCherry,myl7:EGFP), ZDB-TGCONSTRCT-170726-1; zebrafish line nuz1Tg, ZDB-ALT-170726-1; zebrafish line hemgnnuz2, ZDB-ALT-170726-2; zebrafish line hemgnnuz3, ZDB-ALT-170726-3; zebrafish line hemgnnuz4, ZDB-ALT-170726-4.

Acknowledgements

We thank Dr. Leonard Zon and Christian Lawrence at Children’s Hospital in Boston for providing zebrafish and plasmids. We thank Dr. John Postlethwait, Dr. Leonard Zon, and Christopher Wells for helpful discussion. We thank Dr. Johanna Farkas and Carly Ching for their technical contributions. We thank Dr. Leonard Zon, Dr. Yi Zhou, and colleagues at Boston Children's Hospital, Stem Cell and Regenerative Biology Department, Harvard Medical School and Harvard University for providing ATAC-seq, ChIP-seq, and DNase I-seq datasets.

Funding

This research was supported by a Graduate Research Grant from the College of Sciences and the Office of the Vice Provost of Graduate Studies at Northeastern University awarded to MJP and by NSF grants PLR-1247510 and PLR-1444167 awarded to HWD. This is contribution number 380 from the Northeastern University Marine Science Center.

Authors’ Contributions

MJP designed and carried out experiments, created expression plasmids, did in vivo flow analysis, generated zebrafish lines, and analyzed and interpreted results. SKP created expression plasmids and contributed to MO-knockdown and WISH experiments and analyses. JL constructed expression plasmids, helped create transgenic zebrafish, and did immunofluorescence microscopy. JG and CAHA designed and carried out MO experiments and rescues. MJP and CAHA isolated transcripts. HWD conceived the study and participated in its design and interpretation. MJP and HWD drafted and revised the manuscript. All authors reviewed the manuscript.

Author Disclosure Statement

No competing financial interests exist.

Supplemental Figure 1. Alignment of the amino acid sequences of wild-type and mutant Hemogens in zebrafish with the orthologous proteins from other vertebrate species. Transactivation domain (TAD) motifs are boxed and were identified in human and zebrafish Hemogens by ϕϕxxϕ or ϕxxϕϕ, where ϕ is a bulky hydrophobic residue. Alleles are shown for Hemgnnuz2 (Δ5) and Hemgnnuz4 (Δ12) mutant zebrafish lines. Predicted motifs: green, coiled coil; blue, nuclear localization signal; maroon, four residues introduced by alternative splicing; yellow, tandem peptide repeats; box, TAD motif; purple, TAD motif conserved in teleosts; bold italic, acidic region; red, frameshifted residues; red dashes, deletion. Species abbreviations: H. sapiens, Homo sapiens; M. musculus, Mus musculus; G. aculeatus, Gasterosteus aculeatus; G. gallus, Gallus gallus; C. milii, Callorhinchus milii; D. rerio, Danio rerio.

Supplemental Figure 2. Analysis of peptide repeats. (A) Alignment of predicted tandem peptide repeats from zebrafish and human Hemogens. Conserved residues are shaded black. Each predicted repeat in human Hemogen can be divided into two more repeats. Conserved regions of peptide repeats between zebrafish and human Hemogens are boxed. Repeats are most similar within species but repeats 1 and 3 are similar between human and zebrafish Hemogens (marked with asterisks). (B) Amino acid composition of zebrafish Hemogen. The repeat region is enriched for glutamic acid and proline. The acidic C-terminal repeat is enriched for glutamic acid and aspartic acid.

Supplemental Figure 3. CRISPR/Cas9-mediated replacement of zebrafish Hemogen with the mCherry transgene recapitulates endogenous Hemogen expression in zebrafish. (A) Schematic showing insertion of the mCherry transgene at the CRISPR target site within exon 1 of zebrafish Hemogen. Integration of the mCherry transgene was confirmed by sequencing the locus with internal and external primers (arrows; Table S1). (B) 3 dpf. Representative image of the tail segment. mCherry+ mutant cells were present in the CHT and the pronephric duct (PD) and at a low frequency in circulation in the dorsal aorta (DA, dashed outline) (n = 15 embryos). (C) Sequence across the insertion, showing part of the Hemogen promoter (blue), the first 7 codons of the mCherry transgene (red), and a linker sequence (black). Scale bar = 100 µm (B).

Supplemental Figure 4. (A) Representative Western blot of Hemogen from pooled morphants (MO) or wild-type (WT) embryos at 33 hpf. (B) Average Hemogen protein expression from three experiments. GAPDH served as the internal control (*, P ≤ 0.05, Student’s t test). (C) Relative βe1-globin expression in pooled morphant or wild-type embryos at 3 dpf as determined by qRT-PCR. Four samples of 10 pooled embryos were amplified per treatment. Signals were normalized to β-actin and shown relative to wild-type. Error bars represent the standard deviation (*, P ≤ 0.05, Student’s t test).

Supplemental Figure 5. Hemogen mutant zebrafish have increased cell death during embryonic development. (A) Genotypic ratios of 2 dpf embryos produced from heterozygous incrosses of Hemgnnuz3 (Δ5) or Hemgnnuz4 (Δ12) mutants. (B) Proportion of genotyped mutants and wild-type sibling embryos at 2 dpf that were apoptotic (black) or phenotypically normal (white) (* P ≤ 0.05, ** P ≤ 0.005, Chi square). (C) Acridine orange staining for apoptotic cells is increased in the bodies and in the peripheral blood island (outlined) in 20 hpf heterozygous Hemgnnuz2 mutant zebrafish (n = 3) compared to wild-type siblings (n = 3). (D) Comparison of adult wild-type and homozygous Hemgnnuz4 (Δ12) mutant. (E) Average body length of adult wild-type and Hemgnnuz4 (Δ12) mutants. Error bars represent standard error (*, P ≤ 0.05, Student’s t test). Scale bars = 100 µm (E), 50 mm (D).

Supplemental Figure 6. Proposed models for regulation of Hemogen expression by promoter elements. (A) Linear, two-promoter model. (B) Chromatin looping model.

Movie 1. Circulating erythrocytes in Tol2-generated transgenic Tg(Hemgn-1:mCherry,myl7:EGFP) zebrafish at 2 dpf. 4x magnification.

Movie 2. Comparison of circulating EGFP+ erythrocytes in the dorsal aorta of Hemogen morphant and wild-type Tg(Lcr:EGFP)cz3325Tg zebrafish embryos at 3 dpf. The dorsal aorta is highlighted, and EGFP+ erythrocytes are marked with a dot. 20x magnification.

Movie 3. Comparison of circulating erythrocytes in Hemgnnuz2/+ zebrafish embryos (Δ5 frameshift), Hemgnnuz4/+ embryos (Δ12 deletion), and wild-type siblings at 24 hpf. 10x magnification.

Table S1. Sequences of primer and guide oligonucleotides used in experiments

Columns: Gene Oligo, Sequence (5’ – 3’). Rows are grouped by Method.

Method: RT-PCR of Hemgn
hemgn F1            CTTTCTTCTGTGAGTATTGTGC
hemgn F2            GAGAAAGAGATCCCACCAACTG
hemgn F3            GACATGATTGTGAACACGCCC
hemgn R1            TTGTTTCCATAGTAAGGAGGTG
hemgn R2            TCTGAGTCGCCGCCGAATTCC

Method: qRT-PCR of Hemgn
hemgn F1            GACATGATTGTGAACACGCCC
hemgn F2            CTGTGAGTATTGTGCCAAGTCC
hemgn R             TCTGAGTCGCCGCCGAATTCC

Method: PCR Promoter
hemgn-Kpn1F         ATCATGGGTACCCACATCCAGAAATGAGACAT
hemgn-Spe1-R        ATCATGACTAGTTTTGTAGTCCTGTCACATGA

Method: RT-PCR transgene
hemgn F             CTGTGAGTATTGTGCCAAGTCC
mCherry R           GAACTCCTTGATGATGGCC

Method: zHemgn rescue cDNA
hemgnMM F4          ACCATGGAGGATCCGCTGGAGAAAGAGA
hemgn R1            TTGTTTCCATAGTAAGGAGGTG

Method: qRT-PCR of Morphants
βe1-globin F        TCGCCAAGGCTGACTACGA
βe1-globin R        CGGCATTGTAGGTTTCCAA
β-actin F           CGAGCAGGAGATGGGAACC
β-actin R           CAACGGAAACGCTCATTGC

Method: Constructing Donor Plasmid
LarmF               CTGTGAGTATTGTGCCAAGTCC
LarmR-AvrII-R       ATCATGCCTAGGGTCTTCCATTTTGTAGTCC
Rarm-ClaI-F         ATGTACATCGATCCTTAGCATTAAACATCAATCAC
RarmR               CCATGCCTAGTGTCAGGATC
mCherry-AvrII-F     ATCATGCCTAGGATGGTGAGCAAGGGCG
PolyA-ClaI-R        ATGTACATCGATCTTGTTTATTGCAGCTTATAATGGTTAC

Method: PCR Knock-in
hemgn F             GCTCGCTTGTTGTTTACTCT
mCherry R           GAACTCCTTGATGATGGCC

Method: CRISPR template
sgRNA               AAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAAC
sgRNA Hemgn ex1a    GAAATTAATACGACTCACTATAGGTGGGATCTCTTTCTCCAAGTTTTAGAGCTAGAAATAGC
sgRNA Hemgn ex1b    GAAATTAATACGACTCACTATAGGAATAAAAGATTCAGATGAGTTTTAGAGCTAGAAATAGC
sgRNA Hemgn ex3     GAAATTAATACGACTCACTATAGGATCTGGGGCCAGATGAGGGTTTTAGAGCTAGAAATAGC

Method: HRMA
hemgn ex3 F         GGTGCCTGAAGAAGCAATAAGTG
hemgn ex3 R         CATTCATGAACAAGACGTTTCAGC

Method: HRMA
hemgn ex1 F         GCATGAATGTAAGCGGGC
hemgn ex1 R         GTGATTGATGTTTAATGCTAAGG

Method: qRT-PCR of hemgn mutant alleles
hemgn WT F          TGAGGATCTGGGGCCAGATG
hemgn Δ5 F          GGATCTGGGGCCAGGAG
hemgn Δ12 F         AGGATCTGGGGCCAGATATGC
hemgn Both F        GATTGAGGATCTGGGGCCAG
hemgn Both R        GGTGCTGGAGCAAACATTGG

Chapter 3: Erythroid gene discovery using the erythrocyte-null Antarctic icefishes

Abstract

The molecular regulators of erythropoiesis have been carefully studied in different animal models. However, only the most important erythroid genes or the most highly expressed markers have been well characterized. The complete loss of functional red blood cells in Antarctic icefishes provides an opportunity to discover and characterize new erythroid genes. I previously identified 31 novel erythroid-specific genes by transcriptomic comparison of hematopoietic tissues from red- and white-blooded notothenioids. Here, I characterize the loss of the erythroid gene hemogen and two novel blood genes, mabcp1 and cd33rSig, from icefishes.

My studies reveal a truncating frameshift mutation in hemogen from icefishes, which may alter its function in hematopoietic cells, kidney, brain, and testis where it is expressed. This defect may have resulted from overexpression of a short hemogen isoform (hemgn-s) that is specific to icefishes. I show that overexpression of icefish hemogen in zebrafish embryos impairs primitive erythropoiesis. Finally, I characterize the loss of two more erythroid genes, a novel mabp (MVB12-associated β-prism)-containing protein with the second highest expression in notothenioid red blood cells and the teleost ortholog of cd33, which is truncated in icefishes.

Introduction

The “white-blooded” Antarctic icefishes (Channichthyidae) are the only vertebrates that produce neither hemoglobin nor typical mature erythrocytes (Cocca et al., 1995b; Near et al., 2006b; Zhao et al., 1998b). Loss of the globin genes in icefishes (Cocca et al., 1995b) may be one of several erythropoietic defects that contributed to their anemia. The defective molecular pathways in icefishes may reveal novel regulators of erythroid development and disease (Albertson et al., 2009).

The study of blood development has helped uncover fundamental aspects of cell differentiation and function including transcriptional regulation, chromatin regulation, heme synthesis, the cytoskeleton, and cell survival. The process of hematopoiesis provided the first model of cell differentiation from pluripotent stem cells (Till and McCulloch, 1980). Erythrocytes also provided the first model to study the cytoskeleton and important membrane-associated proteins (e.g. Spectrin, Actin, Band3, Band 4.1, Band7) (Steck, 1974).

The severe anemia of Antarctic icefishes makes these animals the ideal “mutant model” for erythroid gene discovery (Detrich and Yergeau, 2004). I identified 31 novel erythroid genes by transcriptomic comparisons of red- and white-blooded notothenioid fishes (See Chapter 1). In the current study, I characterize three blood-specific genes that were mutated in Antarctic icefishes. I evaluate the functions of each gene using the zebrafish model. This study provides a first analysis of three novel erythroid genes that were lost by Antarctic icefishes.

Results

hemogen (hemgn)

Isolation of the mutated hemogen gene from Antarctic icefishes

Teleost hemogen was originally discovered as a candidate erythroid gene that was strongly down-regulated in Antarctic icefishes (Detrich and Yergeau, 2004; Yergeau et al., 2005). In all vertebrates, hemogen is found as a single-copy, four exon gene (Fig. 1A) at a locus that is highly syntenic in fish and mammals (Fig. S1).

To characterize hemogen in notothenioids, I isolated and sequenced the genomic locus from N. coriiceps and C. aceratus. Alignments of the two orthologs revealed a 90-bp deletion and 1-bp insertion in exon 3 of icefish hemogen that introduced a frameshift and premature stop codon (Fig. 1B-D). The same frameshift mutation was found in hemogen in the RNA-Seq transcriptome for N. ionah. However, the ortholog of the icefish, Ps. georgianus, only contained a 90-bp in-frame deletion, which removed 30 amino acids (176P_206P). Thus, hemogen from icefishes first evolved a 90-bp deletion which was followed by a 1-bp insertion in some species.
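The reading-frame consequences of these indels follow from simple arithmetic on the net change in length. The sketch below is illustrative only: the function name is an assumption, and it treats an indel as in-frame whenever the net change is a multiple of three, without modeling the codon formed at the deletion junction.

```python
def classify_indel(deleted_bp, inserted_bp=0):
    """Classify an indel by its net length change.

    Returns ("in-frame", net codon change) when the net change is a multiple
    of three, and ("frameshift", None) otherwise.
    """
    net_bp = inserted_bp - deleted_bp
    if net_bp % 3 == 0:
        return "in-frame", net_bp // 3
    return "frameshift", None


# Indel sizes reported in this chapter for exon 3 of hemogen.
print(classify_indel(90, 1))  # 90-bp deletion plus 1-bp insertion
print(classify_indel(90))     # 90-bp deletion alone
print(classify_indel(6))      # 6-bp deletion
```

Under this rule, the 90-bp deletion alone removes 30 codons and preserves the frame, whereas the additional 1-bp insertion shifts the frame and exposes a premature stop codon.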

Strikingly, the hemogen gene from the dragonfish, P. charcoti, contained a 6-bp deletion at the same site as the icefish mutation which removes two amino acids (191V_192Pdel). Thus, mutations accumulated in the third exon of hemogen in a region that may constitute a functional domain with an erythropoietic role. Furthermore, these data show that loss of residues in this domain began prior to the divergence of dragonfishes and icefishes.

Figure 1. The erythroid gene hemogen is mutated in Antarctic icefishes. (A) Gene structure of zebrafish hemogen on chromosome 1 from assembly GRCz11 (O'Leary et al., 2016). Coding exons, green boxes; introns, green lines. CRISPR/Cas9 targets are highlighted red. RNA-Sequencing shows strong expression of hemogen in zebrafish blood and other tissues. Values are Log2 transformed RPKM (reads per kilobase of transcript per million mapped reads). (B) Protein domains of Hemogen from Notothenia coriiceps. In the icefish, C. aceratus, a frameshift mutation occurs at a putative transactivation domain (TAD). Numbers indicate length in amino acids. Abbreviations: CC, coiled-coil domain; NLS, nuclear localization signal; R, peptide repeats; 4AA, four amino acids introduced by alternative splicing. (C) 3D model of Hemogen from N. coriiceps (color coded as in panel B) designed with I-Tasser (Yang et al., 2015) using human secretory component of immunoglobulin A as a template (PDB:3CHN:S). Red and yellow regions mark the deleted and frameshifted sequences in the icefish, respectively. (D) Exon structures of hemogen genes from N. coriiceps and C. aceratus. The frameshift mutation occurs in exon 3 (red arrow) in icefish hemogen.

The deleted domain in icefish Hemogen contains a transactivation domain motif

Teleost Hemogens possess conserved functional domains at the N- and C-termini separated by a linker formed by proline-rich peptide repeats (Peters et al., 2018). The N-terminus contains predicted coiled-coil forming nuclear localization signals. In most teleosts, the C-terminus contains a conserved transactivation domain (TAD) motif with the consensus sequence φφxφ, where φ is a strong hydrophobic residue. Residues within this conserved region of Hemogen were found to be critical for normal erythropoiesis in zebrafish (Peters et al., 2018). However, notothenioid Hemogens lack this specific TAD motif. Instead, notothenioid Hemogens possess a TAD motif within the peptide repeat that was deleted in icefishes (Fig. 1B).

To predict the tertiary structure of Hemogen from N. coriiceps, ab initio 3D models were created with I-TASSER using the X-ray structure for the secretory component of immunoglobulin G (PDB 3CHN:S) as a template (see Methods, Fig. 1C). The structures had TM-scores of 0.48±0.15 (a TM-score > 0.3 is considered significant). Mutations in Hemogens from icefishes remove the proline-rich linker (highlighted red, Fig. 1C), including the TAD motif. I predict that loss of the proline-rich linker and the C-terminal globular domain, including the putative TAD, may disrupt Hemogen activation of erythropoiesis in C. aceratus.
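For readers unfamiliar with the TM-score, the sketch below evaluates the standard TM-score expression for one fixed superposition, given the distances between aligned residue pairs. It is not the I-TASSER or TM-align implementation: those programs also search for the superposition that maximizes the score, which is omitted here. The function names and input values are hypothetical, and this sketch rejects chains of 21 residues or fewer rather than guessing how very short chains should be handled.

```python
def d0_scale(target_length):
    """Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8 (in angstroms)."""
    if target_length <= 21:
        raise ValueError("this sketch only handles chains longer than 21 residues")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM-score for one superposition.

    distances: distances (angstroms) between aligned residue pairs.
    target_length: number of residues in the target protein (normalization length).
    """
    d0 = d0_scale(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Hypothetical case: 100-residue target, 80 aligned pairs that are each 2 angstroms apart.
print(round(tm_score([2.0] * 80, 100), 3))
```

Because the score is normalized by the target length, unaligned residues contribute nothing, so a model that covers only part of the target cannot score near 1.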


Figure 2. hemogen is expressed in hematopoietic, renal, and neural tissues in embryonic and adult notothenioids. (A) 1 month post fertilization. Whole-mount in situ hybridization (WISH) of hemogen in embryos of the red-blooded nototheniid, N. coriiceps. Transcripts were detected with an anti-sense riboprobe synthesized from C. aceratus hemogen cDNA. (B) 1 week. WISH of hemogen transcripts in the brain of embryos from the icefish, C. aceratus. (C) Northern blot detection of hemogen transcripts in tissues from N. coriiceps using an antisense riboprobe for C. aceratus hemogen. Four alternative transcripts were detected in different tissues. (D) Western blot detection of Hemogen protein in spleen from adult individuals of N. coriiceps (Ncor) and C. aceratus (Cace). Specific bands were detected in both species at the molecular weight for full-length Hemogen (~36 kDa). (E) In situ hybridization of hemogen transcripts in spleen prints from C. aceratus with a digoxigenin-labeled antisense riboprobe for C. aceratus hemogen. Cytoplasmic staining was seen in different hematopoietic cell types in the icefish.


In situ hybridization of hemogen in notothenioid embryos and in spleen from icefishes

To determine the tissues that express hemogen in Antarctic notothenioid fishes and that may be affected by the mutation in icefishes, I employed whole-mount in situ hybridization to detect hemogen transcripts in embryos from N. coriiceps and C. aceratus. In N. coriiceps embryos, at the onset of blood circulation (~1 month post fertilization), hemogen was detected in the pronephric tubules, in the brain, and in circulating blood cells in the vasculature and on the yolk sac (Fig. 2A). The same expression profile is found in zebrafish embryos at 48 hpf (Peters et al., 2018). In C. aceratus embryos, prior to the onset of circulation (~1 week post fertilization), hemogen expression was detected in the brain but not in blood cells of the intermediate cell mass (ICM) (Fig. 2B). In adult spleen prints from C. aceratus, hemogen transcripts were detected in different hematopoietic cell types (Fig. 2E).

Tissue-specific isoforms of hemogen are expressed in red- and white-blooded notothenioids

To characterize the hemogen transcripts that were expressed in notothenioids, I performed Northern blotting of hemogen transcripts in tissues from N. coriiceps (see Methods). Multiple alternative isoforms of hemogen were detected in different tissues from N. coriiceps. The first transcript (~1.35 kb) was highly expressed in blood and head kidney (Fig. 2C) and corresponded to the expected size (1,331 bp) of N. coriiceps hemogen (XM_010775526.1). A second transcript was found at ~1.8 kb and was expressed in all tissues. This isoform may correspond to the 1,777-bp hemogen transcript that retains intron 2. A third transcript was detected at ~1.25 kb in testis and ovary (Fig. 2C), and this may correspond to the testis-specific isoform found in zebrafish (Peters et al., 2018) and in mammals (Yang et al., 2003). A fourth transcript was detected at ~900 bp in brain from N. coriiceps (Fig. 2C). Our laboratory isolated a fifth, short isoform (hemgn-s) by RT-PCR that occurred in icefishes but not in red-blooded notothenioids (data not shown). This transcript splices around the deleted region in icefish hemogen but produces the same frameshift and premature stop codon.

Normal and short Hemogen protein variants are expressed in icefishes

To identify Hemogen protein in notothenioids, I ran Western blots on different tissues from N. coriiceps and C. aceratus using an antibody directed against the conserved N-terminal region of human Hemogen (see Methods). Hemogen was specifically detected at ~36 kDa in both spleen and brain from N. coriiceps and in spleen from C. aceratus (Fig. 2D). The absence of a size difference between Hemogens from N. coriiceps and C. aceratus could not be explained. I also detected a short Hemogen isoform at ~12 kDa (Fig. 3A) that was expressed only by icefishes and likely corresponded to the short isoform (hemgn-s). I employed quantitative PCR to measure expression of the normal and short isoforms of hemogen in head kidneys from red- and white-blooded notothenioids (Fig. 3B). Expression of normal hemogen was significantly down-regulated in icefish head kidney compared to that of red-blooded species (~35-fold change, Fig. 3B). By contrast, expression of the truncated isoform was significantly higher in icefishes (~2,634-fold change, Fig. 3B). Therefore, down-regulation of normal hemogen in icefishes was associated with up-regulation of the truncated isoform (hemgn-s).


Figure 3. The short isoform of hemogen is overexpressed in icefishes and is translated into a truncated protein. (A) Western blot detection of a short variant of Hemogen (Hemgn-s, 12 kDa). The Hemgn-s protein was expressed in head kidney (HK) and spleen of the icefish, C. aceratus (Ca), but not in the red-blooded nototheniid, N. coriiceps, nor in the peripheral blood (PB) of either species. (B) Relative expression of full-length hemogen (hemgn) and short hemogen (hemgn-s) isoforms in head kidney from notothenioid fishes detected by quantitative PCR. Target gene expression was normalized to beta-actin and error bars represent the standard deviation of one or two biological replicates. Significant differences were seen between red- and white-blooded phenotypes (*P < 0.05, Student’s t-test).


Overexpression of icefish hemogen disrupts erythropoiesis in zebrafish embryos

To characterize the developmental abnormalities caused by icefish Hemogen, zebrafish embryos were injected with synthetic icefish hemogen mRNA (200 pg) carrying the endogenous Kozak sequence. Blood production was assessed by o-dianisidine staining at 48 hpf, and injected embryos were compared with wild-type siblings or embryos injected with mCherry mRNA (Fig. 4). Red blood cell production was reduced in ~40% of embryos injected with hemogen mRNA from the icefish (Ca-Hemgn) compared to 10% of embryos injected with mCherry mRNA and 2% of uninjected zebrafish (Fig. 4A,B). Thus, icefish Hemogen inhibits primitive erythropoiesis and may function as a dominant-negative allele.
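A comparison of phenotype proportions such as this one can be tested with a chi-square test on a 2 × 2 table. The following Python sketch computes the Pearson statistic (without continuity correction) and its P value for one degree of freedom. The embryo counts in the example are hypothetical, because group sizes are not stated in the text, and the function name is introduced only for illustration.

```python
import math


def chi_square_2x2(affected_1, unaffected_1, affected_2, unaffected_2):
    """Pearson chi-square test (no continuity correction) for two proportions.

    Returns (chi-square statistic, P value with 1 degree of freedom).
    """
    a, b, c, d = affected_1, unaffected_1, affected_2, unaffected_2
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 40 of 100 injected embryos affected vs. 10 of 100 controls.
chi2, p = chi_square_2x2(40, 60, 10, 90)
print(f"chi2 = {chi2:.1f}, P = {p:.2e}")
```

With small groups or expected cell counts below about five, an exact test would usually be preferred over this approximation.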


Figure 4. Overexpression of icefish hemogen in zebrafish blocks primitive erythropoiesis. (A) 48 hpf. TU embryos were injected with a mix of synthetic, polyadenylated hemogen mRNA from C. aceratus (Ca-Hemgn) and mCherry mRNA. Controls were uninjected TU wild-type (WT) siblings or embryos injected with mCherry mRNA alone. O-dianisidine staining of erythrocytes was reduced in Ca-Hemgn injected embryos. (B) Graph showing the proportion of Ca-Hemgn injected or control embryos that had decreased erythrocyte production (*P < 0.01, **P < 0.001; chi-square test of proportions).


Figure 5. A novel MABP-containing protein (mabpcp) is an RBC-specific gene in notothenioid fishes. (A) Gene structure of the zebrafish mabpcp ortholog (si:dkey-30j10.5) on chromosome 3 from assembly GRCz11 (O'Leary et al., 2016). CRISPR/Cas9 targets in the gene are highlighted red. RNA-Sequencing shows strong, specific expression of si:dkey-30j10.5 in zebrafish blood. Values are the log2-transformed RPKM. (B) Protein domains of Mabpcp from Notothenia coriiceps. Numbers indicate length in amino acids. Two gaps (marked X) were found in the assembled transcript from both icefishes, C. aceratus and Ps. georgianus. Abbreviations: non-cyto, non-cytoplasmic domain; S, signal peptide; MABP, MVB12-Associated β-prism domain. (C) Color-coded 3D model of MABP-containing protein from Notothenia coriiceps (LOC104952319) designed using I-TASSER (Yang et al., 2015). The model is superimposed on the MABP domain from human MVB12B (PDB:3TOW:A). Yellow arrows, beta sheet; red, gaps in icefish sequence; blue, lipid-binding residues; white, no prediction. (D) 24 hpf. Whole-mount in situ hybridization of dkey-30j10.5 in wild-type zebrafish. Sense probe control shown as inset. (E) 20 hpf. Wild-type TU and CRISPR-injected sibling embryos. Mutants were injected with Cas9 and a gRNA targeting dkey-30j10.5 as in Panel A. (F) CRISPR target sequences in zebrafish dkey-30j10.5. Protospacer-adjacent motif (PAM, underlined and highlighted blue). Abbreviations: MB, midbrain; HB, hindbrain; Vent, brain ventricle.


MABP-CONTAINING PROTEIN (MABPCP)

I identified a novel RBC-specific gene in the RNA-Seq transcriptomes of notothenioid fishes, which had the second highest expression level in P. charcoti peripheral blood (63,503 TPM) and head kidney (36,283 TPM), second only to beta-globin (267,053 TPM and 76,609 TPM) and 11 times higher than all other blood-specific genes. The two-exon gene mostly encodes an MVB12-Associated β-prism domain (MABP, InterPro IPR023341) that is related to the MABP domain of the DENND4c protein (DENN domain-containing protein 4C) found in all vertebrates. Thus, we named the notothenioid protein MABP-containing protein (MABPCP). The icefish ortholog was fragmented and not strongly expressed (< 0.39 TPM, Fig. 5B) in the transcriptomes of both Ps. georgianus and N. ionah. In the icefish ortholog, two gaps were present in the assembled contig at residues that correspond to W115 and Y176 (Fig. 5B). The gaps in the assembly may have been caused by repetitive sequences and/or significant genomic alterations at the gap loci.
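Expression values in this chapter are reported as TPM (transcripts per million) or RPKM. As a reminder of how the two units differ, the Python sketch below computes both from raw read counts and transcript lengths. The counts and lengths are hypothetical, and the sketch is not the pipeline used to generate the values reported here.

```python
def rpkm(read_counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(read_counts)
    return [count * 1e9 / (total_reads * length)
            for count, length in zip(read_counts, lengths_bp)]


def tpm(read_counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rates = [count / length for count, length in zip(read_counts, lengths_bp)]
    scale = sum(rates)
    return [rate * 1e6 / scale for rate in rates]


# Hypothetical three-transcript library (read counts and lengths in bp).
counts = [100, 200, 700]
lengths = [1000, 2000, 1000]
print(rpkm(counts, lengths))
print(tpm(counts, lengths))
```

TPM values always sum to one million within a sample, which makes the relative ranking of genes within a tissue straightforward to read; RPKM values do not share that property.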

The MABP domain is a lipid-binding structure that localizes proteins to membranes and which has been implicated in endocytic transport (de Souza and Aravind, 2010). A variety of MABP-containing proteins are found in eukaryotes and also in bacteria (de Souza and Aravind, 2010), but no direct ortholog of notothenioid MABPCP occurs in mammals. In teleosts, including notothenioid fishes, multiple duplicated paralogs are found as one- or two-exon genes. The nototheniid N. coriiceps contains the RBC-specific protein and two more paralogs, LOC104952319 and LOC104956446; the latter was specifically expressed in the trunk kidneys of P. charcoti and Ps. georgianus.

dkey:30j10.5 (Acc. XM_001335220.7) was identified as the MABPCP ortholog in zebrafish, and was also found to have strong, specific expression in peripheral blood (Fig. 5A). dkey:30j10.5 occurs as a single-copy, two-exon gene located on chromosome 3 at a locus that corresponds to a well-known erythroid gene cluster on chromosome 17 in humans (Fig. S1). Thus, the loss of red blood cells in icefishes was coincident with the loss of a teleost-specific erythroid gene.

Protein domain and structure of MABPCP from notothenioids

Ab initio tertiary structure models were created for notothenioid MABPCP (Fig. 5C) using I-TASSER based on the solved X-ray structure for the MABP domain of human Multivesicular body subunit 12B (MVB12B, PDB 3TOW:A) (Boura and Hurley, 2012). The model and its template had a template modeling score (TM-score) of 0.34±0.11 (a TM-score > 0.3 corresponds to P < 0.001). In the structure, the MABP domain from notothenioid MABPCP contains the same hydrophobic β2-β3 loop seen in MVB12B that has been predicted to insert itself within lipid membranes (Boura and Hurley, 2012) (Fig. 5C). The gaps in icefish MABPCP occur at electropositive residues that anchor this domain to membranes (Boura and Hurley, 2012) (Fig. 5B,C). Thus, changes to this domain in icefishes may disrupt MABPCP binding to cell membranes.


Expression profile of mabp-containing protein in zebrafish embryos

The spatiotemporal expression profile of zebrafish mabpcp (dkey:30j10.5) was evaluated in wild-type TU embryos by whole-mount in situ hybridization (WISH) at 20 hpf (Fig. 5D). Sibling embryos were fixed and hybridized with sense or anti-sense digoxigenin-labeled riboprobes targeting a portion of the dkey:30j10.5 gene (see Methods). Faint expression was specifically detected in the central nervous system with an anti-sense probe (Fig. 5D) but not with a sense probe (Fig. 5D inset). The highest expression occurred in the midbrain and hindbrain, specifically in cells associated with the brain ventricles (Fig. 5D). The intermediate cell mass (ICM) and peripheral blood island (PBI) did not show strong expression, which indicates that dkey:30j10.5 is not associated with primitive erythropoiesis in embryos.

CRISPR/Cas9 targeting of mabp-containing protein in zebrafish

To analyze the developmental role of mabpcp, we employed CRISPR/Cas9 gene editing to target zebrafish dkey-30j10.5 (Fig. 5A,E,F). Wild-type TU zebrafish embryos were co-injected with a guide RNA (100 ng µl⁻¹), Cas9 mRNA (1000 ng µl⁻¹), and mCherry mRNA (100 ng µl⁻¹) (see Methods and Table 2). At 20 hpf, 88% of injected mutants had shortened tails and severe deformities compared to uninjected, wild-type siblings (n = 25, Fig. 5E); such deformities are common phenotypes produced by off-target effects. Furthermore, the deformities (Fig. 5E) were not restricted to the sites of dkey-30j10.5 expression in embryonic zebrafish (Fig. 5D), which also suggested that they were non-specific phenotypes.

Several MABP-domain containing proteins have been found in eukaryotes and bacteria and were shown to bind cell membranes (Allaire et al., 2010; Boura and Hurley, 2012; Denef et al., 2008; Rosado et al., 2007). In the dragonfish, P. charcoti, mabpcp is the second highest expressed gene in red blood cells. This protein may have a function in the ESCRT machinery (endosomal sorting complexes required for transport) as do other MABP-containing proteins (Boura and Hurley, 2012). The ESCRT pathway plays an important role in mitochondrial removal during erythroid maturation and defects in this process can cause anemia (Mortensen et al., 2010).


Figure 6. Modeling a truncated CD33-related Siglec (CD33rSig) from icefishes in mutant zebrafish. (A) Gene structures of the zebrafish cd33rSig paralogs, dkey-238d18.10 and LOC101884840, on chromosome 15 from assembly GRCz11 (O'Leary et al., 2016). CRISPR targets are highlighted red. RNA-Sequencing shows specific expression of dkey-238d18.10 in blood and LOC101884840 in brain. Values are the log2-transformed RPKM. (B) Protein domains of the CD33rSig ortholog from N. coriiceps. The F753* mutation truncates CD33rSig in both icefishes, C. aceratus and Ps. georgianus. Numbers indicate length in amino acids. (C) 20 hpf. Wild-type TU embryo. (D) 20 hpf. CRISPR-injected embryo targeting dkey-238d18.10. Note the enlarged peripheral blood island (PBI, outlined) in the mutant. (E) 20 hpf. CRISPR-injected embryo targeting LOC101884840. (F) 3D model of CD33rSig from N. coriiceps created with I-TASSER (Yang et al., 2015) using human CD33 as a template (PDB:5IHB:A). Red marks the truncated region in icefish CD33rSig. (G) High-resolution melting curve showing decreased melting of the mutant (red) dkey-238d18.10 alleles compared to wild-type (blue). Inset shows the difference curves for several mutants. (H) Sequence trace of dkey-238d18.10 from CRISPR-injected TU zebrafish. Frameshifts occur at the protospacer-adjacent motif (PAM, underlined). Abbreviations: S, signal peptide; Ig, immunoglobulin-like domain; C2-set, immunoglobulin C2-set (constant) domain; V-set, immunoglobulin V-set (variable) domain; Tr, transmembrane; cyto, cytoplasmic; ITIM?, putative immunoreceptor tyrosine-based inhibitory motif.


CD33-RELATED SIGLEC (CD33rSig)

One of the blood-specific genes was identified as a member of the multi-gene family of CD33-related sialic-acid-binding immunoglobulin-like lectins (CD33rSiglec) and had strong, specific expression in peripheral blood cells from P. charcoti (42.04 TPM). This CD33rSiglec was found as a single-copy gene and is orthologous to LOC104953882 from the genome of the red-blooded nototheniid, N. coriiceps (Shin et al., 2014). In the transcriptomes of three icefishes (N. ionah, Ps. georgianus, and C. aceratus), the corresponding orthologs contain a 1-bp deletion in exon 9, which introduces a C-terminal frameshift and an immediate premature translation termination codon (Fig. 6B).
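To make the consequence of a single-base deletion concrete, the following Python sketch translates a short coding sequence before and after removing one nucleotide. The sequence is hypothetical and was constructed so that the deletion creates an immediate in-frame stop codon; it is not the icefish CD33rSiglec sequence. The standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_base(seq, index):
    """Return seq with the nucleotide at the 0-based index removed."""
    return seq[:index] + seq[index + 1:]


# Hypothetical wild-type open reading frame: ATG GCA GTA ACC AAG GGC
wild_type = "ATGGCAGTAACCAAGGGC"
mutant = delete_base(wild_type, 6)  # removes the G of GTA, leaving TAA in frame

print(translate(wild_type))
print(translate(mutant))
```

In this toy case the reading frame shifts by one base, the next codon becomes TAA, and translation ends immediately, so everything encoded downstream of the deletion is lost.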

The family of Siglecs is made up of diverse cell surface receptors that are expressed on the membranes of different hematopoietic lineages (Crocker et al., 2007). In humans, CD33 (Siglec-3) is restricted to myeloid lineages and is over-expressed in acute myeloid leukemias (De Propris et al., 2011). No obvious cd33 ortholog is found in teleosts, but tandem duplication of an ancestral CD33-like gene produced numerous CD33rSiglec-extended (CD33e) genes in fishes that are conserved with mammalian CD33rSiglecs, Siglec-4/MAG (myelin-associated glycoprotein), and Siglec-2/CD22 (Cao et al., 2009). A recent study identified the same CD33rSig in another teleost (rock bream, Oplegnathus fasciatus), and this gene was designated as the functional ortholog of mammalian CD33 (Jeswin et al., 2018). As in notothenioid fishes, rock bream CD33 was specifically expressed in leukocytes of the peripheral blood and was found to be up-regulated during the immune response (Jeswin et al., 2018). Thus, Antarctic icefishes appear to have lost the functional ortholog of CD33.

In the zebrafish genome assembly GRCz10 (Howe et al., 2013), a tandem pair of paralogs, si:dkey-238d18.10 and LOC101884840, are both related to N. coriiceps LOC104953882 and are located at the CD33rSiglec cluster on chromosome 15 (Fig. 6A, S1). This locus shares synteny with the genomic loci of both MAG and CD33 on human chromosome 19 (Fig. S1). High, specific expression in blood is seen for zebrafish si:dkey-238d18.10 but not for LOC101884840 (Fig. 6A). Together, these data indicate that notothenioid LOC104953882 and zebrafish si:dkey-238d18.10 are candidates for the functional orthologs of the myeloid Siglec, CD33.

Protein structures and domains of CD33rSig from notothenioids

Ab initio tertiary structure models were created for the notothenioid CD33rSiglec (LOC104953882) with I-TASSER using the X-ray structure for human CD33 (PDB 5IHB:A) as a template (Dodd, 2016, to be published). The structures for the zebrafish and human proteins had template modeling scores (TM-scores) of 0.726 (a TM-score > 0.3 corresponds to P < 0.001) (Xu and Zhang, 2010). The frameshift mutation and premature stop codon in icefish CD33rSiglec occur in the most C-terminal C2-set immunoglobulin domain, which removes the transmembrane and cytoplasmic domains from this cell surface protein (Fig. 6B,F). In red-blooded species, the cytoplasmic domain contains two tyrosine residues that likely function as immunoreceptor tyrosine-based inhibitory (ITIM) or activator (ITAM) motifs (Fig. 6B,F), which carry out the function of most CD33rSiglecs (Paul et al., 2000). The loss of this inhibitory domain in icefishes may affect immunity, hematopoietic proliferation, and cell survival (Nguyen et al., 2006; Varki and Angata, 2006; Vitale et al., 2001).

CRISPR/Cas9 targeting of CD33rSig in zebrafish

To determine the developmental role of cd33rSiglec in fishes, I employed CRISPR/Cas9 gene editing to target the zebrafish paralogs, si:dkey-238d18.10 and LOC101884840 (Fig. 6A). Wild-type TU zebrafish embryos were co-injected with a guide RNA (100 ng µl⁻¹), Cas9 mRNA (1000 ng µl⁻¹), and mCherry mRNA (100 ng µl⁻¹) (see Methods and Table 2). Mutant and wild-type siblings were imaged and genotyped by high-resolution melting analysis (HRMA) (Fig. 6G) and by sequencing of the locus (Fig. 6H). At 20 hpf, enlarged peripheral blood islands (PBI) were seen in 33% of injected mutant si:dkey-238d18.10 zebrafish (n = 12) (Fig. 6D). Additionally, 33% of mutants had shortened tails with a reduced PBI, and 25% were deformed due to cell death (Fig. 6D). While some of these phenotypes may result from off-target effects, the uncontrolled proliferation of PBI blood cells in si:dkey-238d18.10 mutant zebrafish is an uncommon trait and may indicate a role for this CD33rSiglec in primitive hematopoiesis. The phenotype of si:dkey-238d18.10 mutants contrasted sharply with that of LOC101884840 mutants (Fig. 6D,E). Shortened tails occurred in 71% of LOC101884840 mutants and 57% were developmentally delayed (n = 7, Fig. 6E).


In mammals, truncation of CD33 and loss of the cytoplasmic ITIM has been shown to prevent internalization of the receptor, thereby increasing CD33 expression on the cell surface (Walter et al., 2008). CD33 is highly expressed in normal and malignant myeloid lineages and is a common target for immunotherapy; internalization of a drug-linked CD33 antibody (Gemtuzumab ozogamicin) promotes cell death (Geiger and Rubnitz, 2015). The functions of CD33 in myeloid cells are not well understood (Ulyanova et al., 1999). Many Siglecs are involved in erythroblast island formation (Rhodes et al., 2008) and CD33 knockout mice display a slight erythroid defect (Brinkman-Van der Linden et al., 2003). The loss of cd33 may affect the survival of myeloerythroid progenitors in icefishes.


Methods

Fish husbandry

Wild-type (SAT, AB, TU) and transgenic Tg(lcr:egfp) zebrafish (Danio rerio) were generously provided by Dr. Leonard I. Zon (Howard Hughes Medical Institute and Harvard Medical School, Boston). Animal procedures were carried out in full accordance with established standards set forth in the Guide for the Care and Use of Laboratory Animals (8th Edition). The animal care and use protocol for live zebrafish embryos was reviewed and approved by Northeastern University’s Institutional Animal Care and Use Committee (Protocol No. 15-0207R). The animal care and use program at Northeastern University has been continuously accredited by AAALAC Int. since July 22, 1987, and maintains the Public Health Service Policy Assurance number A3155-01.

Cloning and sequence analysis of zebrafish and notothenioid cDNAs

Total RNA was isolated from wild-type AB zebrafish embryos or flash-frozen tissues from Antarctic notothenioid species using TRI reagent (Sigma, T9424) and the Ribopure Kit (Ambion, AM1924). Total cDNA was produced from mRNA using M-MuLV reverse transcriptase (NEB, M0253S) and an oligo(dT)23 primer. cDNAs were amplified by PCR from total cDNA with 1 µM primers (Table S1); the amplification program was 35 cycles of 98°C for 10 s, 60°C for 10 s, and 72°C for 30 s. PCR products were cloned into the pGEM-T Easy vector (Promega, A1360), plasmids were transformed into 5-α competent cells (New England Biolabs, C2987H), recombinant plasmids were identified by blue/white screening and purified with the Wizard Plus SV Miniprep Kit (Promega, A1330), and inserts were sequenced by GeneWiz.


Comparison of vertebrate genes

We used BLAST+ (Altschul et al., 1990) to identify genes from Antarctic notothenioid fishes in the zebrafish genome (assembly GRCz11) (Howe et al., 2013). Chromosomal synteny comparisons were performed using the Synteny Database with a sliding window of 200 genes (Catchen et al., 2009). Protein domains were predicted using InterProScan (Jones et al., 2014). Ab initio tertiary structure models were created for proteins from N. coriiceps including Hemogen, LOC104953882, and LOC104952319. 3D models were created with I-TASSER (Yang et al., 2015) based on the X-ray structures for the secretory component of immunoglobulin G (PDB:3CHN:S), CD33 (PDB:5IHB:A), and the MABP domain of MVB12B (PDB:3TOW:A), respectively. The 3D models were superimposed using TM-align (Zhang and Skolnick, 2005) or Geneious version R10 (Kearse et al., 2012).

In-situ hybridization

The spatial and temporal patterns of expression of selected genes were analyzed by whole-mount in situ hybridization (WISH) of zebrafish and notothenioid embryos following standard protocols (Jacobs et al., 2011). These methods were adapted to evaluate Hemogen expression in spleen prints prepared from adult notothenioid fishes after they were euthanized in 200 mg L⁻¹ tricaine methane sulfonate (MS222; Sigma-Aldrich, 886862) (Detrich and Yergeau, 2004; Gupta and Mullins, 2010). Digoxigenin-labeled antisense and sense RNA probes were transcribed from cDNA clones using the DIG RNA Labeling Kit (Roche Diagnostics, 11175025910).

Northern Blotting

Total mRNA was purified from flash-frozen tissues using TRI reagent (Sigma, T9424) and the Ribopure Kit (Ambion, AM1924). mRNA (5 µg) was electrophoresed in a denaturing gel. Replicate lanes were cut out and stained with ethidium bromide to assess RNA quality. Separated mRNAs were transferred overnight to nylon paper by upward capillary transfer. Blots were hybridized with digoxigenin-labeled antisense or sense RNA probes following a standard protocol (Alwine et al., 1977).

Western blotting

Total protein was prepared for sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) from flash-frozen notothenioid tissues or fresh zebrafish tissues by homogenization in lithium dodecyl sulfate (LDS) Bolt buffer (Life Technologies, B007) and NuPAGE reducing agent (Life Technologies, NP0009) using a pestle and microcentrifuge tube (USA Scientific, 1415-5390). Samples were boiled for 3 min and centrifuged at top speed in an Eppendorf 5417R centrifuge for 2 min. Aliquots (15 µg) were electrophoresed on a 4-12% SDS polyacrylamide gel, and the separated proteins were transferred to a polyvinylidene difluoride (PVDF) membrane with the iBlot system (Life Technologies, IB21001). Membranes were blocked in maleic acid blocking buffer (2% Roche blocking reagent, 2% BSA, 0.2% heat-treated goat serum, 0.1% Tween-20) for 1 h at room temperature and then incubated overnight at 4°C with 1:1000 rabbit anti-Hemogen (Aviva, ARP57794_P050) or with 1:1000 mouse anti-GAPDH (Aviva, OAE00006) antibodies. Membranes were washed in TBST (0.1 M Tris, 0.1 M NaCl, 0.1% Tween-20) and incubated for 2 h with horseradish peroxidase (HRP)-conjugated goat anti-rabbit IgG (H&L) (Aviva, ASP00001) or HRP-conjugated goat anti-mouse IgG (H&L) (Aviva, OARA04973), respectively. Bound antibodies were detected with the Amersham ECL Western Blotting Analysis System (GE Healthcare, RPN2106) on CL-XPosure film (Thermo Scientific, 34091).

Overexpression of icefish hemogen in zebrafish

The natural hemogen Kozak sequence in notothenioids was added to hemogen cDNA clones from N. coriiceps and C. aceratus by PCR using 1 µM primers (Table S1); the amplification program was 35 cycles of 98°C for 10 s, 60°C for 10 s, and 72°C for 30 s. PCR products were cloned into the pGEM-T Easy vector (Promega, A1360), plasmids were transformed into 5-α competent cells (New England Biolabs, C2987H), recombinant plasmids were identified by blue/white screening and purified with the Wizard Plus SV Miniprep Kit (Promega, A1330), and inserts were sequenced by GeneWiz. Sense mRNAs were transcribed, capped, and polyadenylated in vitro using the mMessage SP6 kit (Ambion, AM1340) and the Poly(A) Tailing Kit (Ambion, AM1350). mRNA was purified by precipitation using 2.5 M LiCl. Hemogen mRNAs (30 ng µl⁻¹) and mCherry mRNA (30 ng µl⁻¹) were co-injected into one-cell, wild-type SAT zebrafish embryos. Treated and control embryos were stained with o-dianisidine using previously established methods (Yergeau et al., 2005) and imaged between 20 and 48 hpf.

CRISPR/Cas9 generation of mutant zebrafish

Optimal targets for CRISPR-Cas9 mutagenesis were identified in zebrafish si:dkey-30j10.5, si:dkey-238d18.10/CD33, LOC101884840, and tyrosinase using the program CHOPCHOP (Labun et al., 2016; Montague et al., 2014). The templates for multiple small guide RNAs (Table S2) were produced by a cloning-free method as previously described (Hruscha et al., 2013; Talbot and Amacher, 2014). Guide RNAs were transcribed with the T7 MaxiScript Kit (Ambion, AM1312) and purified by LiCl precipitation.
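The gene-specific oligos listed in Table S2 share a common layout: a 22-nt 5' prefix containing the T7 promoter, a 20-nt target sequence beginning with GG, and a 20-nt 3' end that anneals to the common scaffold oligo. The Python sketch below reproduces that layout as read from Table S2. The function and constant names are hypothetical and introduced only for illustration, and the length and GG checks reflect what is observed in the table rather than a general rule for guide design.

```python
# Sequence elements as they appear in the Table S2 oligos.
T7_PREFIX = "GAAATTAATACGACTCACTATA"        # 5' prefix containing the T7 promoter
SCAFFOLD_OVERLAP = "GTTTTAGAGCTAGAAATAGC"   # 3' end shared by gene-specific oligos
COMMON_OLIGO = ("AAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCC"
                "TTATTTTAACTTGCTATTTCTAGCTCTAAAAC")  # gRNA_common, Table S2

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def grna_template_oligo(target):
    """Assemble a gene-specific template oligo from a 20-nt target sequence."""
    target = target.upper()
    if len(target) != 20 or not target.startswith("GG"):
        raise ValueError("expected a 20-nt target beginning with GG, as in Table S2")
    return T7_PREFIX + target + SCAFFOLD_OVERLAP


# Target region of 30j10_gRNA1 from Table S2.
oligo = grna_template_oligo("GGAAAGATCGCGTCTTCCTC")
print(oligo)

# The 3' end of the gene-specific oligo must anneal to the 3' end of the common oligo.
print(COMMON_OLIGO.endswith(reverse_complement(SCAFFOLD_OVERLAP)))
```

Checking the 20-nt overlap in this way is a quick guard against transcription errors when oligo sequences are copied between tables and order forms.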

Wild-type (TU) embryos were co-injected with a guide RNA (150 ng µl⁻¹), Cas9 mRNA (300 ng µl⁻¹), and mCherry mRNA (ng µl⁻¹) to identify successful injections. Embryos were raised and adults were tail-clipped and genotyped by high-resolution melting analysis (HRMA) as previously described (Talbot and Amacher, 2014). PCR amplification was run using 1 µM primers (Table S1) with PowerUp SYBR MasterMix (Applied Biosystems, A25742) on a QuantStudio 3 Real-time PCR system (ThermoFisher, A28137). PCR amplicons were sequenced by GeneWiz.

Imaging

Fixed zebrafish or notothenioid embryos were mounted in 80% glycerol and imaged with a dissecting microscope (Nikon, SMZ-U) and a CCD digital camera (Diagnostic Instruments, SPOT32). Live zebrafish embryos were embedded in 0.1% agarose in embryo medium (EB) with 0.01% tricaine and imaged with an epifluorescence-equipped microscope (Nikon, Eclipse E800) using a Photometrics Scientific CoolSNAP EZ camera and Nikon NIS-Elements AR 4.20 software.

Quantitative PCR

RNA was purified from flash-frozen notothenioid tissues or fresh zebrafish tissues in TRI reagent (Sigma-Aldrich, T9424) using the PureLink RNA purification Kit (Ambion). DNase-treated RNA was reverse transcribed with a polyT(23) primer using the Protoscript II RT-PCR kit (New England Biolabs, M0368S). Target genes were amplified in triplicate from cDNA by qRT-PCR with 1 µM primers (Table S1). Standard curves were generated to confirm primer efficiencies. Target gene expression was normalized to beta-actin for comparison by the ΔΔCt method. Three or four biological replicates were used for each treatment for statistical comparisons.
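The ΔΔCt calculation can be summarized in a few lines. The Python sketch below assumes an amplification efficiency of 100% (a doubling per cycle) for both target and reference genes, which is why primer efficiencies need to be confirmed beforehand. The Ct values and function names are hypothetical and are not data from this study.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Delta Ct: mean target Ct minus mean reference (e.g. beta-actin) Ct."""
    return mean(target_cts) - mean(reference_cts)


def fold_change(sample_delta_ct, control_delta_ct):
    """Relative expression 2^(-ddCt) of the sample versus the control group."""
    return 2.0 ** -(sample_delta_ct - control_delta_ct)


# Hypothetical technical triplicates (Ct values).
icefish_dct = delta_ct([25.1, 24.9, 25.0], [18.0, 18.1, 17.9])
red_blooded_dct = delta_ct([20.0, 20.1, 19.9], [18.0, 17.9, 18.1])

# Values below 1 indicate lower expression in the sample than in the control.
print(fold_change(icefish_dct, red_blooded_dct))
```

A ΔΔCt of about 5 cycles, as in this toy example, corresponds to a roughly 32-fold difference in expression.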

Statistical analyses

Data are displayed as means±s.e.m. or means±s.d. or as noted. Differences with a p-value ≤ 0.05 were considered significant for all statistical tests.


Figure S1. Synteny maps comparing the chromosomal loci of novel RBC-specific genes in zebrafish and humans. (A) Syntenic Hemogen loci on zebrafish chromosome 1 and human chromosomal region 9q22.33. (B) Synteny of loci for zebrafish dkey-30j10.5 on chromosome 3 and the corresponding region on human chromosome 17. No direct ortholog was identified in humans. (C) Synteny of loci for zebrafish dkey-238d18.10 and LOC101884840 paralogs on chromosome 15 and human CD33 on chromosome 19.


Table S1. Primer Sequences

Oligo             Sequence (5’ to 3’)                      Gene                       Method
Ncor130For        GGAGGAGACATTTCAAC                        hemgn (Notothenioid)       PCR gDNA
NcHemgn_R2        CTAACAGGATGCACACTAACC                    hemgn (Notothenioid)       PCR gDNA
NcHemgnR1050      AGATACCCGTCATTCAGGA                      hemgn (Notothenioid)       PCR gDNA
NcHemgnR2.2       CCTCAGAAGATCCCTGTCAC                     hemgn (Notothenioid)       PCR gDNA
NcHemgnR2.1       CACGTAACCGGCGACGGATC                     hemgn (Notothenioid)       PCR gDNA
NcHemgn5utrF      ATGCCCTCACACAACTTGAC                     hemgn (Notothenioid)       PCR gDNA
Icefish_For5utr   GTGTCCCCGAGGTTATAATAC                    hemgn (Notothenioid)       PCR gDNA
30j10_F           CCAGCACTGCGGTTCAG                        dkey:30j10.5 (Zebrafish)   RT-PCR
30j10_R           GAGATATGGAAAAAGGTCTGGAGG                 dkey:30j10.5 (Zebrafish)   RT-PCR
30j10_F3          GACCAGGATCAGTTTTCATTC                    dkey:30j10.5 (Zebrafish)   RT-PCR
30j10_R_mp1       AGATTCTTCTTGACCTGCTCGT                   dkey:30j10.5 (Zebrafish)   RT-PCR
30j10_F           CCACCACTAAAGATGAGGAGGA                   dkey:30j10.5 (Zebrafish)   HRMA
30j10_R           CCACAGATTGATTTTGTCTCCA                   dkey:30j10.5 (Zebrafish)   HRMA
Dkey238F          GTGCACTATTATTTGCACGCTC                   dkey:238 (Zebrafish)       HRMA
Dkey238R          CCCGATTTAAACCAGAAAGTGT                   dkey:238 (Zebrafish)       HRMA
LOC101884840F     CCACAGCTGCAATTTACAGAAC                   LOC101884840 (Zebrafish)   HRMA
LOC101884840R     CTGATACCACACAACTCTGCGT                   LOC101884840 (Zebrafish)   HRMA
Hemgn_F_kozak     AATTCATAGCAGGACTCAGAATGGAGGAGACATTTCAAC  hemgn (Notothenioid)       PCR gDNA
NcHemgn_R2        CTAACAGGATGCACACTAACC                    hemgn (Notothenioid)       PCR gDNA
CD33rSig_F        CTGCTCATTAGAGATTGATGA                    cd33rSig (Notothenioid)    PCR gDNA
CD33rSig_R        GAAGGTTATTGTGGAGGTC                      cd33rSig (Notothenioid)    PCR gDNA
Drhemex3Fb        CCTCAAGAGGAGTTTTTGATTGAGG                hemgn (Zebrafish)          PCR
β-globin_F        TCGCCAAGGCTGACTACGA                      beta-globin (Zebrafish)    qPCR
β-globin_R        CGGCATTGTAGGTTTCCAA                      beta-globin (Zebrafish)    qPCR
β-actin_F         CGAGCAGGAGATGGGAACC                      beta-actin (Zebrafish)     qPCR
β-actin_R         CAACGGAAACGCTCATTGC                      beta-actin (Zebrafish)     qPCR
β-act_F           CAGATCATGTTCGAGACCTTCAAC                 beta-actin (Notothenioid)  qPCR
β-act_R           TCACCRGARTCCATGACGATA                    beta-actin (Notothenioid)  qPCR
Hemgn_short_F     GACTAACCAGTGGGTTTTAAGCC                  hemgn (Notothenioid)       qPCR
NcHemgn_R2        CTAACAGGATGCACACTAACC                    hemgn (Notothenioid)       qPCR


Table S2. Oligos for CRISPR gRNA template

Oligo              Sequence (5’ to 3’)                                                               Gene                         Method
30j10_gRNA1        GAAATTAATACGACTCACTATAGGAAAGATCGCGTCTTCCTCGTTTTAGAGCTAGAAATAGC                    dkey:30j10.5 (Zebrafish)     CRISPR
30j10_gRNA2        GAAATTAATACGACTCACTATAGGAGGCAGCTGGGTACGAGCGTTTTAGAGCTAGAAATAGC                    dkey:30j10.5 (Zebrafish)     CRISPR
LOC101884840_gRNA  GAAATTAATACGACTCACTATAGGAACCTTGGAGGCCGTGAAGTTTTAGAGCTAGAAATAGC                    LOC101884840 (Zebrafish)     CRISPR
Dkey238_gRNA       GAAATTAATACGACTCACTATAGGTTGGACTCTCTTTCTGACGTTTTAGAGCTAGAAATAGC                    dkey:238d18.10 (Zebrafish)   CRISPR
gRNA_common        AAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAAC                               CRISPR template

Conclusion

The evolution of Antarctic notothenioid fishes appears to have involved a gradual reduction in the activity of the erythropoietic pathway (Eastman, 1993; Lau et al., 2012; Wells et al., 1980), which culminated in the complete loss of production of typical mature red blood cells in the derived monophyletic clade of icefishes (Channichthyidae).

Mutations in key erythroid genes or genetic pathways may have instigated the evolutionary loss of red blood cells. Genes that are targets in human anemias are prime candidates, but there may also be unknown genes whose mutation caused or contributed to icefish anemia. As an example of the former, I found that the Antarctic dragonfish P. charcoti (dragonfishes are the sister clade to the icefishes) produces abnormal spherocytic erythrocytes and has mutations in erythroid beta-spectrin that are identical to several that cause human hereditary spherocytic anemia. Moreover, some mutations in icefishes, like the deletions of the adult alpha and beta globin genes (Cocca et al., 1995b; di Prisco et al., 2002; Zhao et al., 1998b), may be a consequence of relaxed selection due to the loss of globin transcriptional regulation (Lau et al., 2012) or due to the loss of globin-expressing erythrocytes. Alternatively, one may speculate that some regulator(s) of globin transcription (e.g., Lcr, Gata1, Hemgn) became functionally compromised, and this led to the loss of globin expression and the subsequent deletion of the locus.

Chapter 2

Teleost hemogen was first discovered as a marker that was strongly expressed in the hematopoietic tissues of red-blooded notothenioids but not in the derived lineage of white-blooded Antarctic icefishes (Detrich and Yergeau, 2004; Yergeau et al., 2005); this finding implicated teleost hemogen in erythropoiesis. In my preliminary research, I identified a frameshift mutation in the icefish hemogen gene, a defect that truncates the putative transactivation domain (TAD) of the encoded transcription factor. The hemogen gene is present in the genomes of all vertebrates except the superclass of jawless fishes (Agnatha) and is found as a highly conserved, single-copy, four-exon gene.

Except in a few species, the hemogen gene has been preserved in the same state for over 450 million years and is likely to be a gene that is crucial for vertebrate development. In support of this, I found that the expression pattern of hemogen was conserved between fishes and mammals. In zebrafish, hemogen expression is driven by Gata1 in differentiating erythrocytes, in Sertoli cells of the testis, in the brain, and in renal cells of the kidney. Two conserved non-coding elements function individually and together to regulate hemogen expression in primitive and definitive waves of blood development in zebrafish.

In icefish Hemogen, deletion of the putative TAD is likely to impair the recruitment of P300 to the erythroid transcription factor, Gata1 (Zheng et al., 2014). To determine the effects of the icefish hemogen mutation on erythroid development, I used the CRISPR/Cas9 gene editing system to generate hemogen mutant zebrafish that recapitulate the icefish mutation. I showed that the frameshift mutation in zebrafish hemogen was a dominant-negative allele, which caused partial anemia in embryos and adults. Therefore, intact Hemogen appears to be required for erythropoiesis, and its truncation may contribute to the anemia of Antarctic icefishes.

The transgenic zebrafish lines produced in this study provide the first in vivo animal models to analyze the function of Hemogen during embryonic and adult development. These zebrafish models may be useful in identifying causes and treatments of human blood diseases that have been associated with hemogen overexpression.

Chapter 1

To discern the mutational events that led to the deletion of the globin genes and the loss of red blood cells in icefishes, I performed comparative transcriptomics of red- and white-blooded notothenioid species. I showed that the mutation in icefish hemogen was one of several genetic defects in a shared molecular pathway that may contribute to an intricate repression of erythropoiesis. Notably, icefish erythropoiesis may be blocked by an acetylation imbalance caused by down-regulation of P300 and overexpression of Hdac1b, two proteins that are known to regulate the activity of Gata1 (Boyes et al., 1998). Furthermore, both P300 and Gata1 contain predicted deleterious substitutions in the domains that bind Hemogen, which suggests that this activating complex was lost by icefishes. These mutations and the truncation of the Hemogen TAD in icefishes may hint at the loss of a multi-protein complex formed by Hemogen, Gata1, and other cofactors that bind the globin locus control region (Lcr).

Chapter 3

In red-blooded Antarctic notothenioid fishes, hemogen is expressed in the same tissues at sites of embryonic and adult hematopoiesis, in the brain, and in renal cells of the kidney. Despite their severe anemia, icefish embryos produce hemgn+ primitive erythroid cells in lateral plate mesoderm and express the truncated allele at very low levels compared to red-blooded species. Overexpression of icefish hemogen severely impairs erythropoiesis in a zebrafish model, which indicates that this truncated hemogen is a dominant-negative allele. Interestingly, icefishes produce another truncated isoform of hemogen (hemgn-s) through alternative splicing, which is not seen in red-blooded species. It is likely that this truncated Hemogen isoform also operates as a dominant-negative protein to disrupt erythropoiesis in icefishes. This unique mechanism of evolution is strikingly similar to splicing mutations that cause some human blood diseases (Conboy, 2017). One might speculate that overexpression of the hemgn-s isoform pre-empted the permanent deletion and truncation of the Hemogen TAD in icefishes.

The evolutionary loss of red blood cells in Antarctic icefishes facilitated the discovery of 31 novel erythroid genes. In icefishes, three blood-specific genes (hemgn, cd33rsig, and mabp-like) contain nonsense mutations that disrupt important functional domains. The hemgnuz2 and hemgnnuz4 mutant zebrafish lines produced in this study demonstrate a critical role for the Hemogen C-terminal TAD in erythropoiesis.

Generation of stable mutant zebrafish lines for the cd33rsig and mabp-like alleles from icefishes may also reveal novel roles for these genes in erythropoiesis.

References

Albertson, R.C., Cresko, W., Detrich, H.W., 3rd, Postlethwait, J.H., 2009. Evolutionary mutant models for human disease. Trends Genet 25, 74-81. Albertson, R.C., Yan, Y.L., Titus, T.A., Pisano, E., Vacchi, M., Yelick, P.C., Detrich, H.W., 3rd, Postlethwait, J.H., 2010. Molecular pedomorphism underlies craniofacial skeletal evolution in Antarctic notothenioid fishes. BMC evolutionary biology 10, 4. Allaire, P.D., Marat, A.L., Dall'Armi, C., Di Paolo, G., McPherson, P.S., Ritter, B., 2010. The Connecdenn DENN domain: a GEF for Rab35 mediating cargo-specific exit from early endosomes. Mol Cell 37, 370- 382. Altenhoff, A.M., Gil, M., Gonnet, G.H., Dessimoz, C., 2013. Inferring hierarchical orthologous groups from orthologous gene pairs. PLoS One 8, e53786. Altenhoff, A.M., Schneider, A., Gonnet, G.H., Dessimoz, C., 2011. OMA 2011: orthology inference among 1000 complete genomes. Nucleic Acids Res 39, D289-294. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. Basic local alignment search tool. Journal of molecular biology 215, 403-410. Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J., 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389-3402. Alwine, J.C., Kemp, D.J., Stark, G.R., 1977. Method for detection of specific RNAs in agarose gels by transfer to diazobenzyloxymethyl-paper and hybridization with DNA probes. Proc Natl Acad Sci U S A 74, 5350-5354. Amsterdam, A., Nissen, R.M., Sun, Z., Swindell, E.C., Farrington, S., Hopkins, N., 2004. Identification of 315 genes essential for early zebrafish development. Proc Natl Acad Sci U S A 101, 12792-12797. An, L.L., Li, G., Wu, K.F., Ma, X.T., Zheng, G.G., Qiu, L.G., Song, Y.H., 2005. High expression of EDAG and its significance in AML. Leukemia 19, 1499-1502. Archer, S.D., Johnston, I.A., 1987. Kinematics of labriform and subcarangiform swimming in the Antarctic fish Notothenia neglecta. J. Exp. 
Biol. 143, 195-210. Archer, S.D., Johnston, I.A., 1991. Density of cristae and distribution of mitochondria in the slow muscle fibres of Antarctic fish. Physiol. Zool. 64, 242-258. Ata, H., Clark, K.J., Ekker, S.C., 2016. The zebrafish genome editing toolkit. Barber, D.L., Westerman, J.E.M., White, M.G., 1981. The blood cells of the Antarctic icefish Chaenocephalus aceratus Lönnberg: light and electron microscopic observations. J Fish Biol 19, 11- 28. Barisic, M., Korac, J., Pavlinac, I., Krzelj, V., Marusic, E., Vulliamy, T., Terzic, J., 2005. Characterization of G6PD deficiency in southern Croatia: description of a new variant, G6PD Split. J Hum Genet 50, 547- 549. Batada, N.N., Hurst, L.D., Tyers, M., 2006. Evolutionary and physiological importance of hub proteins. PLoS Comput Biol 2, e88. Bennett, C.M., Kanki, J.P., Rhodes, J., Liu, T.X., Paw, B.H., Kieran, M.W., Langenau, D.M., Delahaye- Brown, A., Zon, L.I., Fleming, M.D., Look, A.T., 2001. Myelopoiesis in the zebrafish, Danio rerio. Blood 98, 643-651. Berthelot, C., J., C., Desvignes, T., Detrich III, H.W., Flicek, P., Peck, L.S., Peters, M., Postlethwait, J.H., Clark, M.S., 2018. Manuscript in preparation. Adaptation of proteins to the cold in Antarctic fish: A role for Methionine?

Bertrand, J.Y., Kim, A.D., Teng, S., Traver, D., 2008. CD41+ cmyb+ precursors colonize the zebrafish pronephros by a novel migration route to initiate adult hematopoiesis. Development 135, 1853-1862. Bertrand, J.Y., Kim, A.D., Violette, E.P., Stachura, D.L., Cisson, J.L., Traver, D., 2007a. Definitive hematopoiesis initiates through a committed erythromyeloid progenitor in the zebrafish embryo. Development 134, 4147-4156. Bertrand, J.Y., Kim, A.D., Violette, E.P., Stachura, D.L., Cisson, J.L., Traver, D., 2007b. Definitive hematopoiesis initiates through a committed erythromyeloid progenitor in the zebrafish embryo. Development. Blankenberg, D., Von Kuster, G., Coraor, N., Ananda, G., Lazarus, R., Mangan, M., Nekrutenko, A., Taylor, J., 2010. Galaxy: a web-based genome analysis tool for experimentalists. Current protocols in molecular biology Chapter 19, Unit 19 10 11-21. Blobel, G.A., Nakajima, T., Eckner, R., Montminy, M., Orkin, S.H., 1998. CREB-binding protein cooperates with transcription factor GATA-1 and is required for erythroid differentiation. Proc Natl Acad Sci U S A 95, 2061-2066. Bordoli, L., Husser, S., Luthi, U., Netsch, M., Osmani, H., Eckner, R., 2001. Functional analysis of the p300 acetyltransferase domain: the PHD finger of p300 but not of CBP is dispensable for enzymatic activity. Nucleic Acids Res 29, 4462-4471. Boura, E., Hurley, J.H., 2012. Structural basis for membrane targeting by the MVB12-associated beta- prism domain of the human ESCRT-I MVB12 subunit. Proc Natl Acad Sci U S A 109, 1901-1906. Boyes, J., Byfield, P., Nakatani, Y., Ogryzko, V., 1998. Regulation of activity of the transcription factor GATA-1 by acetylation. Nature 396, 594-598. Bradbury, C.A., Khanim, F.L., Hayden, R., Bunce, C.M., White, D.A., Drayson, M.T., Craddock, C., Turner, B.M., 2005. Histone deacetylases in acute myeloid leukaemia show a distinctive pattern of expression that changes selectively in response to deacetylase inhibitors. Leukemia 19, 1751-1759. 
Brinkman-Van der Linden, E.C., Angata, T., Reynolds, S.A., Powell, L.D., Hedrick, S.M., Varki, A., 2003. CD33/Siglec-3 binding specificity, expression pattern, and consequences of gene deletion in mice. Molecular and cellular biology 23, 4199-4206. Broos, S., Hulpiau, P., Galle, J., Hooghe, B., Van Roy, F., De Bleser, P., 2011. ConTra v2: a tool to identify transcription factor binding sites across species, update 2011. Nucleic Acids Res 39, W74-78. Cao, H., de Bono, B., Belov, K., Wong, E.S., Trowsdale, J., Barrow, A.D., 2009. Comparative genomics indicates the mammalian CD33rSiglec locus evolved by an ancient large-scale inverse duplication and suggests all Siglecs share a common ancestral region. Immunogenetics 61, 401-417. Carroll, D., 2017. Genome Editing: Past, Present, and Future. Yale J Biol Med 90, 653-659. Catchen, J.M., Conery, J.S., Postlethwait, J.H., 2009. Automated identification of conserved synteny after whole-genome duplication. Genome research 19, 1497-1505. Chen, D.L., Hu, Z.Q., Zheng, X.F., Wang, X.Y., Xu, Y.Z., Li, W.Q., Fang, H.S., Kan, L., Wang, S.Y., 2016. EDAG-1 promotes proliferation and invasion of human thyroid cancer cells by activating MAPK/Erk and AKT signal pathways. Cancer biology & therapy 17, 414-421. Chen, L., DeVries, A.L., Cheng, C.H., 1997. Evolution of antifreeze glycoprotein gene from a trypsinogen gene in Antarctic notothenioid fish. Proc Natl Acad Sci U S A 94, 3811-3816. Chen, Z., Cheng, C.H., Zhang, J., Cao, L., Chen, L., Zhou, L., Jin, Y., Ye, H., Deng, C., Dai, Z., Xu, Q., Hu, P., Sun, S., Shen, Y., Chen, L., 2008. Transcriptomic and genomic evolution under constant cold in Antarctic notothenioid fish. Proc Natl Acad Sci U S A 105, 12944-12949. Choi, Y., Chan, A.P., 2015. PROVEAN web server: a tool to predict the functional effect of amino acid substitutions and indels. Bioinformatics 31, 2745-2747. Cocca, E., Ratnayake-Lecamwasam, M., Parker, S.K., Camardella, L., Ciaramella, M., di Prisco, G., Detrich, H.W., 1995a. 
Genomic remnants of -globin genes in the hemoglobinless antarctic icefishes. Proc. Natl. Acad. Sci. U. S. A 92, 1817-1821.

Cocca, E., Ratnayake-Lecamwasam, M., Parker, S.K., Camardella, L., Ciaramella, M., di Prisco, G., Detrich, H.W., 3rd, 1995b. Genomic remnants of alpha-globin genes in the hemoglobinless antarctic icefishes. Proc Natl Acad Sci U S A 92, 1817-1821. Collins, S., Coleman, H., Groudine, M., 1987. Expression of bcr and bcr-abl fusion transcripts in normal and leukemic cells. Molecular and cellular biology 7, 2870-2876. Colombo, M., Damerau, M., Hanel, R., Salzburger, W., Matschiner, M., 2015. Diversity and disparity through time in the adaptive radiation of Antarctic notothenioid fishes. J Evol Biol 28, 376-394. Conboy, J.G., 2017. RNA splicing during terminal erythropoiesis. Curr Opin Hematol 24, 215-221. Crocker, P.R., Paulson, J.C., Varki, A., 2007. Siglecs and their roles in the immune system. Nat Rev Immunol 7, 255-266. D'Andrea, A.D., Lodish, H.F., Wong, G.G., 1989. Expression cloning of the murine erythropoietin receptor. Cell 57, 277-285. Davidson, A.J., Zon, L.I., 2004. The 'definitive' (and 'primitive') guide to zebrafish hematopoiesis. Oncogene 23, 7233-7246. de Jong, J.L., Zon, L.I., 2005. Use of the Zebrafish to Study Primitive and Definitive Hematopoiesis. Annu. Rev. Genet 39, 481-501. De Propris, M.S., Raponi, S., Diverio, D., Milani, M.L., Meloni, G., Falini, B., Foa, R., Guarini, A., 2011. High CD33 expression levels in acute myeloid leukemia cells carrying the nucleophosmin (NPM1) mutation. Haematologica 96, 1548-1551. De Ruijter, A.J.M., Van Gennip, A.H., Caron, H.N., Kemp, S., Van Kuilenburg, A.B.P., 2003. Histone deacetylases (HDACs): characterization of the classical HDAC family. Biochem J 370, 737-749. de Souza, R.F., Aravind, L., 2010. UMA and MABP domains throw light on receptor endocytosis and selection of endosomal cargoes. Bioinformatics 26, 1477-1480. Denef, N., Chen, Y., Weeks, S.D., Barcelo, G., Schupbach, T., 2008. Crag regulates epithelial architecture and polarized deposition of basement membrane proteins in Drosophila. 
Dev Cell 14, 354-364. Deng, C., Cheng, C.H., Ye, H., He, X., Chen, L., 2010. Evolution of an antifreeze protein by neofunctionalization under escape from adaptive conflict. Proc Natl Acad Sci U S A 107, 21593-21598. Detrich, H.W., 3rd, Kieran, M.W., Chan, F.Y., Barone, L.M., Yee, K., Rundstadler, J.A., Pratt, S., Ransom, D., Zon, L.I., 1995. Intraembryonic hematopoietic cell migration during vertebrate development. Proc Natl Acad Sci U S A 92, 10713-10717. Detrich, H.W., III, Westerfield, M., Zon, L.I., 1999. Overview of the Zebrafish System. In Methods in Cell Biology, Overview of the Zebrafish system, Vol. 59. (Detrich, H. W., III, Westerfield, M., & Zon, L. I., Eds.), Elsevier Academic Press, San Diego, pp. 3-10. Detrich, H.W., Yergeau, D.A., 2004. Comparative genomics in erythropoietic gene discovery: synergisms between the Antarctic icefishes and the zebrafish, in: Detrich, H.W., Westerfield, M., Zon, L.I. (Eds.), Methods in Cell Biology, The Zebrafish, 2nd edition: Genetics, Genomics, and Informatics, Vol. 77. Elsevier Academic Press, San Diego, pp. 475-503. DeVries, A.L., Eastman, J.T., 1978. Lipid sacs as a buoyancy adaptation in an Antarctic fish. Nature 271, 352-353. di Prisco, G., Cocca, E., Parker, S., Detrich, H., 2002. Tracking the evolutionary loss of hemoglobin expression by the white-blooded Antarctic icefishes. Gene 295, 185-191. Ding, Y.L., Xu, C.W., Wang, Z.D., Zhan, Y.Q., Li, W., Xu, W.X., Yu, M., Ge, C.H., Li, C.Y., Yang, X.M., 2010. Over-expression of EDAG in the myeloid cell line 32D: induction of GATA-1 expression and erythroid/megakaryocytic phenotype. Journal of cellular biochemistry 110, 866-874. Dirks, W., Rome, D., Ringel, F., Jager, K., MacLeod, R.A., Drexler, H.G., 1999. Expression of the growth arrest-specific gene 6 (GAS6) in leukemia and lymphoma cell lines. Leuk Res 23, 643-651. Dodd, R.B., Meadows, W., Qamar, S., Johnson, C.M., Kronenberg-Versteeg, D., St George-Hyslop, P., 2016, to be published.
Structure of ligand bound CD33 receptor associated with Alzheimer's disease.

Dolznig, H., Habermann, B., Stangl, K., Deiner, E.M., Moriggl, R., Beug, H., Mullner, E.W., 2002. Apoptosis protection by the Epo target Bcl-X(L) allows factor-independent differentiation of primary erythroblasts. Curr Biol 12, 1076-1085. Drabsch, Y., ten Dijke, P., 2012. TGF-beta signalling and its role in cancer progression and metastasis. Cancer Metastasis Rev 31, 553-568. Dyson, H.J., Wright, P.E., 2005. Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Bio 6, 197-208. Dyson, H.J., Wright, P.E., 2016. Role of Intrinsic Protein Disorder in the Function and Interactions of the Transcriptional Coactivators CREB-binding Protein (CBP) and p300. The Journal of biological chemistry 291, 6714-6722. Eastman, J.T., 1993. Antarctic fish biology: evolution in a unique environment. Academic Press, San Diego. Eastman, J.T., Eakin, R.R., 2000. An updated species list for notothenioid fish (Perciformes; Notothenioidei), with comments on Antarctic species. Archive of Fishery and Marine Research 48, 11-20. Eberharter, A., Becker, P.B., 2002. Histone acetylation: a switch between repressive and permissive chromatin. Second in review series on chromatin dynamics. EMBO Rep 3, 224-229. El-Brolosy, M.A., Stainier, D.Y.R., 2017. Genetic compensation: A phenomenon in search of mechanisms. PLoS Genet 13, e1006780. Evans, C.J., Hartenstein, V., Banerjee, U., 2003. Thicker than blood: conserved mechanisms in Drosophila and vertebrate hematopoiesis. Dev Cell 5, 673-690. Ferreira, R., Ohneda, K., Yamamoto, M., Philipsen, S., 2005. GATA1 function, a paradigm for transcription factors in hematopoiesis. Molecular and cellular biology 25, 1215-1227. Forbes, S.A., Beare, D., Boutselakis, H., Bamford, S., Bindal, N., Tate, J., Cole, C.G., Ward, S., Dawson, E., Ponting, L., Stefancsik, R., Harsha, B., Kok, C.Y., Jia, M., Jubb, H., Sondka, Z., Thompson, S., De, T., Campbell, P.J., 2017. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res 45, D777-D783.
Frame, J.M., Lim, S.-E., North, T.E., 2017. Hematopoietic stem cell development: using the zebrafish to identify extrinsic and intrinsic mechanisms regulating hematopoiesis. In Methods in Cell Biology, The Zebrafish: Disease Models and Chemical Screens, 4th Edition. Vol. 138. (Detrich, H. W., III, Westerfield, M., & Zon, L. I., Eds.), Elsevier Academic Press, San Diego, pp. 165-184 Fransecky, L., Mochmann, L.H., Baldus, C.D., 2015. Outlook on PI3K/AKT/mTOR inhibition in acute leukemia. Mol Cell Ther 3, 2. Fu, L., Niu, B., Zhu, Z., Wu, S., Li, W., 2012. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150-3152. Galloway, J.L., Zon, L.I., 2003. Ontogeny of hematopoiesis: examining the emergence of hematopoietic cells in the vertebrate embryo. Current topics in developmental biology 53, 139-158. Ganis, J.J., Hsia, N., Trompouki, E., de Jong, J.L., DiBiase, A., Lambert, J.S., Jia, Z., Sabo, P.J., Weaver, M., Sandstrom, R., Stamatoyannopoulos, J.A., Zhou, Y., Zon, L.I., 2012. Zebrafish globin switching occurs in two developmental stages and is controlled by the LCR. Dev Biol 366, 185-194. Gardiner, M.R., Gongora, M.M., Grimmond, S.M., Perkins, A.C., 2007. A global role for zebrafish klf4 in embryonic erythropoiesis. Mech Dev 124, 762-774. Geiger, T.L., Rubnitz, J.E., 2015. New approaches for the immunotherapy of acute myeloid leukemia. Discov Med 19, 275-284. Gekas, C., Graf, T., 2013. CD41 expression marks myeloid-biased adult hematopoietic stem cells and increases with age. Blood 121, 4463-4472. Ghigliotti, L., Cheng, C.H., Pisano, E., 2016. Sex determination in Antarctic notothenioid fish: chromosomal clues and evolutionary hypotheses. Polar Biology 39.

Ghosh, J., Kapur, R., 2017. Role of mTORC1-S6K1 signaling pathway in regulation of hematopoietic stem cell and acute myeloid leukemia. Exp Hematol 50, 13-21. Giardine, B., Riemer, C., Hardison, R.C., Burhans, R., Elnitski, L., Shah, P., Zhang, Y., Blankenberg, D., Albert, I., Taylor, J., Miller, W., Kent, W.J., Nekrutenko, A., 2005. Galaxy: a platform for interactive large-scale genome analysis. Genome research 15, 1451-1455. Goecks, J., Nekrutenko, A., Taylor, J., Galaxy, T., 2010. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol 11, R86. Guan, Y., Zhu, Q., Huang, D., Zhao, S., Jan Lo, L., Peng, J., 2015. An equation to estimate the difference between theoretically predicted and SDS PAGE-displayed molecular weights for an acidic peptide. Sci Rep 5, 13370. Gupta, T., Mullins, M.C., 2010. Dissection of organs from the adult zebrafish. Journal of visualized experiments : JoVE 37, e1717. Haas, B.J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P.D., Bowden, J., Couger, M.B., Eccles, D., Li, B., Lieber, M., Macmanes, M.D., Ott, M., Orvis, J., Pochet, N., Strozzi, F., Weeks, N., Westerman, R., William, T., Dewey, C.N., Henschel, R., Leduc, R.D., Friedman, N., Regev, A., 2013. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nature protocols 8, 1494-1512. Hattangadi, S.M., Wong, P., Zhang, L., Flygare, J., Lodish, H.F., 2011. From stem cell to red cell: regulation of erythropoiesis at multiple levels by multiple proteins, RNAs, and chromatin modifications. Blood 118, 6258-6268. He, X., Zhang, J., 2006. Why do hubs tend to be essential in protein networks? PLoS Genet 2, e88. Heger, A., Holm, L., 2000. Rapid automatic detection and alignment of repeats in protein sequences. Proteins 41, 224-237. Heideman, M.R., Lancini, C., Proost, N., Yanover, E., Jacobs, H., Dannenberg, J.H., 2014. 
Sin3a-associated Hdac1 and Hdac2 are essential for hematopoietic stem cell homeostasis and contribute differentially to hematopoiesis. Haematologica 99, 1292-1303. Helantera, H., Uller, T., 2014. Neutral and adaptive explanations for an association between caste-biased gene expression and rate of sequence evolution. Frontiers in genetics 5, 297. Herrero, J., Muffato, M., Beal, K., Fitzgerald, S., Gordon, L., Pignatelli, M., Vilella, A.J., Searle, S.M., Amode, R., Brent, S., Spooner, W., Kulesha, E., Yates, A., Flicek, P., 2016. Ensembl comparative genomics resources. Database : the journal of biological databases and curation 2016. Hong, W., Nakazawa, M., Chen, Y.Y., Kori, R., Vakoc, C.R., Rakowski, C., Blobel, G.A., 2005. FOG-1 recruits the NuRD repressor complex to mediate transcriptional repression by GATA-1. EMBO J 24, 2367- 2378. Hossain, M.S., Larsson, A., Scherbak, N., Olsson, P.E., Orban, L., 2008. Zebrafish androgen receptor: isolation, molecular, and biochemical characterization. Biol Reprod 78, 361-369. 
Howe, K., Clark, M.D., Torroja, C.F., Torrance, J., Berthelot, C., Muffato, M., Collins, J.E., Humphray, S., McLaren, K., Matthews, L., McLaren, S., Sealy, I., Caccamo, M., Churcher, C., Scott, C., Barrett, J.C., Koch, R., Rauch, G.J., White, S., Chow, W., Kilian, B., Quintais, L.T., Guerra-Assuncao, J.A., Zhou, Y., Gu, Y., Yen, J., Vogel, J.H., Eyre, T., Redmond, S., Banerjee, R., Chi, J., Fu, B., Langley, E., Maguire, S.F., Laird, G.K., Lloyd, D., Kenyon, E., Donaldson, S., Sehra, H., Almeida-King, J., Loveland, J., Trevanion, S., Jones, M., Quail, M., Willey, D., Hunt, A., Burton, J., Sims, S., McLay, K., Plumb, B., Davis, J., Clee, C., Oliver, K., Clark, R., Riddle, C., Elliot, D., Threadgold, G., Harden, G., Ware, D., Begum, S., Mortimore, B., Kerry, G., Heath, P., Phillimore, B., Tracey, A., Corby, N., Dunn, M., Johnson, C., Wood, J., Clark, S., Pelan, S., Griffiths, G., Smith, M., Glithero, R., Howden, P., Barker, N., Lloyd, C., Stevens, C., Harley, J., Holt, K., Panagiotidis, G., Lovell, J., Beasley, H., Henderson, C., Gordon, D., Auger, K., Wright, D., Collins, J., Raisen, C., Dyer, L., Leung, K., Robertson, L., Ambridge, K., Leongamornlert, D., McGuire, S.,

Gilderthorp, R., Griffiths, C., Manthravadi, D., Nichol, S., Barker, G., Whitehead, S., Kay, M., Brown, J., Murnane, C., Gray, E., Humphries, M., Sycamore, N., Barker, D., Saunders, D., Wallis, J., Babbage, A., Hammond, S., Mashreghi-Mohammadi, M., Barr, L., Martin, S., Wray, P., Ellington, A., Matthews, N., Ellwood, M., Woodmansey, R., Clark, G., Cooper, J., Tromans, A., Grafham, D., Skuce, C., Pandian, R., Andrews, R., Harrison, E., Kimberley, A., Garnett, J., Fosker, N., Hall, R., Garner, P., Kelly, D., Bird, C., Palmer, S., Gehring, I., Berger, A., Dooley, C.M., Ersan-Urun, Z., Eser, C., Geiger, H., Geisler, M., Karotki, L., Kirn, A., Konantz, J., Konantz, M., Oberlander, M., Rudolph-Geiger, S., Teucke, M., Lanz, C., Raddatz, G., Osoegawa, K., Zhu, B., Rapp, A., Widaa, S., Langford, C., Yang, F., Schuster, S.C., Carter, N.P., Harrow, J., Ning, Z., Herrero, J., Searle, S.M., Enright, A., Geisler, R., Plasterk, R.H., Lee, C., Westerfield, M., de Jong, P.J., Zon, L.I., Postlethwait, J.H., Nusslein-Volhard, C., Hubbard, T.J., Roest Crollius, H., Rogers, J., Stemple, D.L., 2013. The zebrafish reference genome sequence and its relationship to the human genome. Nature 496, 498-503. Hruscha, A., Krawitz, P., Rechenberg, A., Heinrich, V., Hecht, J., Haass, C., Schmid, B., 2013. Efficient CRISPR/Cas9 genome editing with low off-target effects in zebrafish. Development 140, 4982-4987. Hubank, M., Schatz, D.G., 1999. cDNA representational difference analysis: a sensitive and flexible method for identification of differentially expressed genes. Methods Enzymol 303, 325-349. Hureau, J.C., 1966. Biologic de Chaenichthys rhinoceratus Richardson, et problème du sang incolore des Chaenichthyidae, poissons des mers australes. Bulletin de la Societe Zoologique de France 91, 735-751. Iuchi, I., Yamamoto, M., 1983. Erythropoiesis in the developing rainbow trout, Salmo gairdneri irideus: histochemical and immunochemical detection of erythropoietic organs. J Exp Zool 226, 409-417. 
Jacobs, N.L., Albertson, R.C., Wiles, J.R., 2011. Using whole mount in situ hybridization to link molecular and organismal biology. Journal of visualized experiments : JoVE 49, e2533. Janzen, V., Fleming, H.E., Waring, M.T., Milne, C.D., Scadden, D.T., 2006. Multifunctional role of caspase- 3 in regulating hematopoietic stem cells. Blood 108, 258a-258a. Jensen, L.J., Kuhn, M., Stark, M., Chaffron, S., Creevey, C., Muller, J., Doerks, T., Julien, P., Roth, A., Simonovic, M., Bork, P., von Mering, C., 2009. STRING 8--a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res 37, D412-416. Jeong, H., Mason, S.P., Barabasi, A.L., Oltvai, Z.N., 2001. Lethality and centrality in protein networks. Nature 411, 41-42. Jeswin, J., Joo, M.S., Jeong, J.M., Bae, J.S., Choi, K.M., Cho, D.H., Park, S.I., Park, C.I., 2018. The first report of siglec-3/CD33 gene in a teleost (rock bream, Oplegnathus fasciatus): An analysis of its spatial expression during stimulation to red seabream iridovirus (RSIV) and two bacterial pathogens. Dev Comp Immunol 84, 117-122. Jiang, J., Yu, H., Shou, Y., Neale, G., Zhou, S., Lu, T., Sorrentino, B.P., 2010. Hemgn is a direct transcriptional target of HOXB4 and induces expansion of murine myeloid progenitor cells. Blood 116, 711-719. Jin, H., Sood, R., Xu, J., Zhen, F., English, M.A., Liu, P.P., Wen, Z., 2009. Definitive hematopoietic stem/progenitor cells manifest distinct differentiation output in the zebrafish VDA and PBI. Development 136, 647-654. Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J.A., Charpentier, E., 2012. A programmable dual- RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-821. Jones, P., Binns, D., Chang, H.Y., Fraser, M., Li, W., McAnulla, C., McWilliam, H., Maslen, J., Mitchell, A., Nuka, G., Pesseat, S., Quinn, A.F., Sangrador-Vegas, A., Scheremetjew, M., Yong, S.Y., Lopez, R., Hunter, S., 2014. 
InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236-1240. Kafina, M.D., Paw, B.H., 2018. Using the Zebrafish as an Approach to Examine the Mechanisms of Vertebrate Erythropoiesis. Methods Mol Biol 1698, 11-36.

Kasper, L.H., Boussouar, F., Ney, P.A., Jackson, C.W., Rehg, J., van Deursen, J.M., Brindle, P.K., 2002. A transcription-factor-binding surface of coactivator p300 is required for haematopoiesis. Nature 419, 738-743.
Kawakami, K., Asakawa, K., Muto, A., Wada, H., 2016. Tol2-mediated transgenesis, gene trapping, enhancer trapping, and Gal4-UAS system. In Methods in Cell Biology, The Zebrafish: Genetics, Genomics, and Transcriptomics, 4th Edition, Vol. 135. (Detrich, H. W., III, Westerfield, M., & Zon, L. I., Eds.), Elsevier Academic Press, San Diego, pp. 19-36.
Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper, A., Markowitz, S., Duran, C., Thierer, T., Ashton, B., Meintjes, P., Drummond, A., 2012. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647-1649.
Kennett, J., 1977. Cenozoic evolution of Antarctic glaciation, the circum-Antarctic Ocean, and their impact on global paleoceanography. Journal of Geophysical Research 82, 3843-3860.
Kersey, P.J., Allen, J.E., Armean, I., Boddu, S., Bolt, B.J., Carvalho-Silva, D., Christensen, M., Davis, P., Falin, L.J., Grabmueller, C., Humphrey, J., Kerhornou, A., Khobova, J., Aranganathan, N.K., Langridge, N., Lowy, E., McDowall, M.D., Maheswari, U., Nuhn, M., Ong, C.K., Overduin, B., Paulini, M., Pedro, H., Perry, E., Spudich, G., Tapanari, E., Walts, B., Williams, G., Tello-Ruiz, M., Stein, J., Wei, S., Ware, D., Bolser, D.M., Howe, K.L., Kulesha, E., Lawson, D., Maslen, G., Staines, D.M., 2016. Ensembl Genomes 2016: more genomes, more complexity. Nucleic Acids Res 44, D574-580.
Kingsley, P.D., Greenfest-Allen, E., Frame, J.M., Bushnell, T.P., Malik, J., McGrath, K.E., Stoeckert, C.J., Palis, J., 2013. Ontogeny of erythroid gene expression. Blood 121, e5-e13.
Krantz, S.B., 1991. Erythropoietin. Blood 77, 419-434.
Krivega, I., Dean, A., 2016. Chromatin looping as a target for altering erythroid gene expression. Ann N Y Acad Sci 1368, 31-39.
Kruger, A., Ellerstrom, C., Lundmark, C., Christersson, C., Wurtz, T., 2002. RP59, a marker for osteoblast recruitment, is also detected in primitive mesenchymal cells, erythroid cells, and megakaryocytes. Developmental dynamics : an official publication of the American Association of Anatomists 223, 414-418.
Kruger, A., Somogyi, E., Christersson, C., Lundmark, C., Hultenby, K., Wurtz, T., 2005. Rat enamel contains RP59: a new context for a protein from osteogenic and haematopoietic precursor cells. Cell Tissue Res 320, 141-148.
Kuhn, D.E., O'Brien, K.M., Crockett, E.L., 2016. Expansion of capacities for iron transport and sequestration reflects plasma volumes and heart mass among white-blooded notothenioid fishes. Am J Physiol Regul Integr Comp Physiol 311, 649-657.
Kuhn, K.L., Near, T.J., Detrich, H.W., 3rd, Eastman, J.T., 2010. Biology of the Antarctic dragonfish Vomeridens infuscipinnis (Notothenioidei: Bathydraconidae). Antarctic Science 23, 18-26.
Kulkeaw, K., Ishitani, T., Kanemaru, T., Fucharoen, S., Sugiyama, D., 2010. Cold exposure down-regulates zebrafish hematopoiesis. Biochem Biophys Res Commun 394, 859-864.
Kunzmann, A., Caruso, C., di Prisco, G., 1991. Hematological Studies on a High-Antarctic Fish - Bathydraco marri Norman. J Exp Mar Biol Ecol 152, 243-255.
Kwan, K.M., Fujimoto, E., Grabher, C., Mangum, B.D., Hardy, M.E., Campbell, D.S., Parant, J.M., Yost, H.J., Kanki, J.P., Chien, C.B., 2007. The Tol2kit: a multisite gateway-based construction kit for Tol2 transposon transgenesis constructs. Developmental dynamics : an official publication of the American Association of Anatomists 236, 3088-3099.
Labun, K., Montague, T.G., Gagnon, J.A., Thyme, S.B., Valen, E., 2016. CHOPCHOP v2: a web tool for the next generation of CRISPR genome engineering. Nucleic Acids Res 44, W272-276.

Landrum, M.J., Lee, J.M., Benson, M., Brown, G., Chao, C., Chitipiralla, S., Gu, B., Hart, J., Hoffman, D., Hoover, J., Jang, W., Katz, K., Ovetsky, M., Riley, G., Sethi, A., Tully, R., Villamarin-Salomon, R., Rubinstein, W., Maglott, D.R., 2016. ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Res 44, D862-868.
Langmead, B., Trapnell, C., Pop, M., Salzberg, S.L., 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10, R25.
Lau, Y.T., Parker, S.K., Near, T.J., Detrich, H.W., 3rd, 2012. Evolution and function of the globin intergenic regulatory regions of the antarctic dragonfishes (Notothenioidei: Bathydraconidae). Molecular biology and evolution 29, 1071-1080.
Lee, S.H., Chiu, Y.C., Li, Y.H., Lin, C.C., Hou, H.A., Chou, W.C., Tien, H.F., 2017. High expression of dedicator of cytokinesis 1 (DOCK1) confers poor prognosis in acute myeloid leukemia. Oncotarget 8, 72250-72259.
Leichty, A.R., Pfennig, D.W., Jones, C.D., Pfennig, K.S., 2012. Relaxed genetic constraint is ancestral to the evolution of phenotypic plasticity. Integrative and comparative biology 52, 16-30.
Li, B., Dewey, C.N., 2011. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC bioinformatics 12, 323.
Li, C.Y., Zhan, Y.Q., Li, W., Xu, C.W., Xu, W.X., Yu, D.H., Peng, R.Y., Cui, Y.F., Yang, X., Hou, N., Li, Y.H., Dong, B., Sun, H.B., Yang, X.M., 2007. Overexpression of a hematopoietic transcriptional regulator EDAG induces myelopoiesis and suppresses lymphopoiesis in transgenic mice. Leukemia 21, 2277-2286.
Li, C.Y., Zhan, Y.Q., Xu, C.W., Xu, W.X., Wang, S.Y., Lv, J., Zhou, Y., Yue, P.B., Chen, B., Yang, X.M., 2004. EDAG regulates the proliferation and differentiation of hematopoietic cells and resists cell apoptosis through the activation of nuclear factor-kappa B. Cell death and differentiation 11, 1299-1308.
Liao, E.C., Paw, B.H., Peters, L.L., Zapata, A., Pratt, S.J., Do, C.P., Lieschke, G., Zon, L.I., 2000. Hereditary spherocytosis in zebrafish riesling illustrates evolution of erythroid beta-spectrin structure, and function in red cell morphogenesis and membrane stability. Development 127, 5123-5132.
Lin, C.S., Lim, S.K., D'Agati, V., Costantini, F., 1996. Differential effects of an erythropoietin receptor gene disruption on primitive and definitive erythropoiesis. Genes Dev 10, 154-164.
Lin, H.F., Traver, D., Zhu, H., Dooley, K., Paw, B.H., Zon, L.I., Handin, R.I., 2005. Analysis of thrombocyte development in CD41-GFP transgenic zebrafish. Blood 106, 3803-3810.
Lin, Y., Dobbs, G.H., 3rd, Devries, A.L., 1974. Oxygen consumption and lipid content in red and white muscles of Antarctic fishes. J Exp Zool 189, 379-386.
Lomako, V.V., Shilo, A.V., Kovalenko, I.F., Babiichuk, G.A., 2015. [Erythrocytes of hetero- and homoiothermal animals in natural and artificial hypothermia]. Zh Evol Biokhim Fiziol 51, 52-59.
Love, P.E., Warzecha, C., Li, L., 2014. Ldb1 complexes: the new master regulators of erythroid gene transcription. Trends Genet 30, 1-9.
Lu, J., Xu, W.X., Wang, S.Y., Jiang, Y., Li, C.Y., Cai, W.M., Yang, X.M., 2002. [Overexpression of EDAG-1 in NIH3T3 cells leads to malignant transformation]. Sheng Wu Hua Xue Yu Sheng Wu Wu Li Xue Bao (Shanghai) 34, 95-98.
Lu, J., Xu, W.X., Wang, S.Y., Zhan, Y.Q., Jiang, Y., Cai, W.M., Yang, X.M., 2001. Isolation and Characterization of EDAG-1, A Novel Gene Related to Regulation in Hematopoietic System. Sheng Wu Hua Xue Yu Sheng Wu Wu Li Xue Bao (Shanghai) 33, 641-646.
Lyons, S.E., Lawson, N.D., Lei, L., Bennett, P.E., Weinstein, B.M., Liu, P.P., 2002. A nonsense mutation in zebrafish gata1 causes the bloodless phenotype in vlad tepes. Proc Natl Acad Sci U S A 99, 5454-5459.
Maekawa, S., Iemura, H., Kuramochi, Y., Nogawa-Kosaka, N., Nishikawa, H., Okui, T., Aizawa, Y., Kato, T., 2012. Hepatic confinement of newly produced erythrocytes caused by low-temperature exposure in Xenopus laevis. The Journal of experimental biology 215, 3087-3095.

Marchler-Bauer, A., Derbyshire, M.K., Gonzales, N.R., Lu, S., Chitsaz, F., Geer, L.Y., Geer, R.C., He, J., Gwadz, M., Hurwitz, D.I., Lanczycki, C.J., Lu, F., Marchler, G.H., Song, J.S., Thanki, N., Wang, Z., Yamashita, R.A., Zhang, D., Zheng, C., Bryant, S.H., 2015. CDD: NCBI's conserved domain database. Nucleic Acids Res 43, D222-226.
Martin, G.S., 2003. Cell signaling and cancer. Cancer Cell 4, 167-174.
Matschiner, M., Colombo, M., Damerau, M., Ceballos, S., Hanel, R., Salzburger, W., 2015. The Adaptive Radiation of Notothenioid Fishes in the Waters of Antarctica. Springer International Publishing Switzerland.
Matschiner, M., Hanel, R., Salzburger, W., 2011. On the origin and trigger of the notothenioid adaptive radiation. PLoS One 6, e18911.
Maximow, A., 1909. Der Lymphozyt als gemeinsame Stammzelle der verschiedenen Blutelemente in der embryonalen Entwicklung und im postfetalen Leben der Säugetiere. Fol. Haematol. 8, 125-134.
McDevitt, M.A., Fujiwara, Y., Shivdasani, R.A., Orkin, S.H., 1997. An upstream, DNase I hypersensitive region of the hematopoietic-expressed transcription factor GATA-1 gene confers developmental specificity in transgenic mice. Proc Natl Acad Sci U S A 94, 7976-7981.
McGuckin, C.P., Forraz, N., Liu, W.M., 2003. Diaminofluorene stain detects erythroid differentiation in immature haemopoietic cells treated with EPO, IL-3, SCF, TGFbeta1, MIP-1alpha and IFNgamma. European journal of haematology 70, 106-114.
Medvinsky, A., Rybtsov, S., Taoudi, S., 2011. Embryonic origin of the adult hematopoietic system: advances and questions. Development 138, 1017-1031.
Miccio, A., Wang, Y., Hong, W., Gregory, G.D., Wang, H., Yu, X., Choi, J.K., Shelat, S., Tong, W., Poncz, M., Blobel, G.A., 2010. NuRD mediates activating and repressive functions of GATA-1 and FOG-1 during blood development. EMBO J 29, 442-456.
Montague, T.G., Cruz, J.M., Gagnon, J.A., Church, G.M., Valen, E., 2014. CHOPCHOP: a CRISPR/Cas9 and TALEN web tool for genome editing. Nucleic Acids Res 42, W401-407.
Montgomery, J., Clements, K., 2000. Disaptation and recovery in the evolution of Antarctic fishes. Trends in ecology & evolution 15, 267-271.
Mortensen, M., Ferguson, D.J., Edelmann, M., Kessler, B., Morten, K.J., Komatsu, M., Simon, A.K., 2010. Loss of autophagy in erythroid cells leads to defective removal of mitochondria and severe anemia in vivo. Proc Natl Acad Sci U S A 107, 832-837.
Mugal, C.F., Wolf, J.B., Kaj, I., 2014. Why time matters: codon evolution and the temporal dynamics of dN/dS. Molecular biology and evolution 31, 212-231.
Murayama, E., Kissa, K., Zapata, A., Mordelet, E., Briolat, V., Lin, H.F., Handin, R.I., Herbomel, P., 2006. Tracing hematopoietic precursor migration to successive hematopoietic organs during zebrafish development. Immunity 25, 963-975.
Naka, K., Hoshii, T., Muraguchi, T., Tadokoro, Y., Ooshio, T., Kondo, Y., Nakao, S., Motoyama, N., Hirao, A., 2010. TGF-beta-FOXO signalling maintains leukaemia-initiating cells in chronic myeloid leukaemia. Nature 463, 676-680.
Nakata, T., Ishiguro, M., Aduma, N., Izumi, H., Kuroiwa, A., 2013. Chicken hemogen homolog is involved in the chicken-specific sex-determining mechanism. Proc Natl Acad Sci U S A 110, 3417-3422.
Near, T.J., Parker, S.K., Detrich, H.W., 2006a. A genomic fossil reveals key steps in hemoglobin loss by the antarctic icefishes. Mol. Biol. Evol 23, 2008-2016.
Near, T.J., Parker, S.K., Detrich, H.W., 3rd, 2006b. A genomic fossil reveals key steps in hemoglobin loss by the antarctic icefishes. Molecular biology and evolution 23, 2008-2016.
Near, T.J., Pesavento, J.J., Cheng, C.H., 2003. Mitochondrial DNA, morphology, and the phylogenetic relationships of Antarctic icefishes (Notothenioidei: Channichthyidae). Mol Phylogenet Evol 28, 87-98.

Nguyen, D.H., Ball, E.D., Varki, A., 2006. Myeloid precursors and acute myeloid leukemia cells express multiple CD33-related Siglecs. Exp Hematol 34, 728-735.
Nishikawa, K., Kobayashi, M., Masumi, A., Lyons, S.E., Weinstein, B.M., Liu, P.P., Yamamoto, M., 2003. Self-association of Gata1 enhances transcriptional activity in vivo in zebra fish embryos. Mol Cell Biol 23, 8295-8305.
Notredame, C., Higgins, D.G., Heringa, J., 2000. T-Coffee: A novel method for fast and accurate multiple sequence alignment. Journal of molecular biology 302, 205-217.
O'Brien, K.M., Mueller, I.A., 2010. The unique mitochondrial form and function of Antarctic channichthyid icefishes. Integrative and comparative biology 50, 993-1008.
O'Leary, N.A., Wright, M.W., Brister, J.R., Ciufo, S., Haddad, D., McVeigh, R., Rajput, B., Robbertse, B., Smith-White, B., Ako-Adjei, D., Astashyn, A., Badretdin, A., Bao, Y., Blinkova, O., Brover, V., Chetvernin, V., Choi, J., Cox, E., Ermolaeva, O., Farrell, C.M., Goldfarb, T., Gupta, T., Haft, D., Hatcher, E., Hlavina, W., Joardar, V.S., Kodali, V.K., Li, W., Maglott, D., Masterson, P., McGarvey, K.M., Murphy, M.R., O'Neill, K., Pujar, S., Rangwala, S.H., Rausch, D., Riddick, L.D., Schoch, C., Shkeda, A., Storz, S.S., Sun, H., Thibaud-Nissen, F., Tolstoy, I., Tully, R.E., Vatsan, A.R., Wallin, C., Webb, D., Wu, W., Landrum, M.J., Kimchi, A., Tatusova, T., DiCuccio, M., Kitts, P., Murphy, T.D., Pruitt, K.D., 2016. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res 44, D733-745.
Oellacher, J., 1872. Beiträge zur Entwicklungsgeschichte der Knochenfische nach Beobachtungen am Bachforelleneie. Z. Wiss. Zool. 23, 373-421.
Omichinski, J.G., Trainor, C., Evans, T., Gronenborn, A.M., Clore, G.M., Felsenfeld, G., 1993. A small single-"finger" peptide from the erythroid transcription factor GATA-1 binds specifically to DNA as a zinc or iron complex. Proc Natl Acad Sci U S A 90, 1676-1680.
Onodera, K., Takahashi, S., Nishimura, S., Ohta, J., Motohashi, H., Yomogida, K., Hayashi, N., Engel, J.D., Yamamoto, M., 1997. GATA-1 transcription is controlled by distinct regulatory mechanisms during primitive and definitive erythropoiesis. Proc Natl Acad Sci U S A 94, 4487-4492.
Orkin, S.H., Zon, L.I., 1997. Genetics of erythropoiesis: induced mutations in mice and zebrafish. Annu Rev Genet 31, 33-60.
Orlacchio, A., Ranieri, M., Brave, M., Arciuch, V.A., Forde, T., De Martino, D., Anderson, K.E., Hawkins, P., Di Cristofano, A., 2017. SGK1 Is a Critical Component of an AKT-Independent Pathway Essential for PI3K-Mediated Tumor Development and Maintenance. Cancer Res 77, 6914-6926.
Osborne, C.S., Chakalova, L., Brown, K.E., Carter, D., Horton, A., Debrand, E., Goyenechea, B., Mitchell, J.A., Lopes, S., Reik, W., Fraser, P., 2004. Active genes dynamically colocalize to shared sites of ongoing transcription. Nat Genet 36, 1065-1071.
Paffett-Lugassy, N.N., Zon, L.I., 2005. Analysis of hematopoietic development in the zebrafish. Methods Mol. Med 105, 171-198.
Palis, J., 2014. Primitive and definitive erythropoiesis in mammals. Front Physiol 5, 3.
Park, S., Chapuis, N., Tamburini, J., Bardet, V., Cornillet-Lefebvre, P., Willems, L., Green, A., Mayeux, P., Lacombe, C., Bouscary, D., 2010. Role of the PI3K/AKT and mTOR signaling pathways in acute myeloid leukemia. Haematologica 95, 819-828.
Pascual-Anaya, J., Albuixech-Crespo, B., Somorjai, I.M., Carmona, R., Oisi, Y., Alvarez, S., Kuratani, S., Munoz-Chapuli, R., Garcia-Fernandez, J., 2013. The evolutionary origins of hematopoiesis and vertebrate endothelia. Dev Biol 375, 182-192.
Paul, S.P., Taylor, L.S., Stansbury, E.K., McVicar, D.W., 2000. Myeloid specific human CD33 is an inhibitory receptor with differential ITIM function in recruiting the phosphatases SHP-1 and SHP-2. Blood 96, 483-490.
Paw, B.H., Zon, L.I., 2000. Zebrafish: a genetic approach in studying hematopoiesis. Curr Opin Hematol 7, 79-84.

Peters, M.J., Parker, S.K., Grim, J., Allard, C.A.H., Levin, J., Detrich, H.W., 3rd, 2018. Divergent Hemogen genes of teleosts and mammals share conserved roles in erythropoiesis: Analysis using transgenic and mutant zebrafish. Biol Open.
Piskacek, M., Havelka, M., Rezacova, M., Knight, A., 2016. The 9aaTAD Transactivation Domains: From Gal4 to p53. PLoS One 11, e0162842.
Postlethwait, J.H., Woods, I.G., Ngo-Hazelett, P., Yan, Y.L., Kelly, P.D., Chu, F., Huang, H., Hill-Force, A., Talbot, W.S., 2000. Zebrafish comparative genomics and the origins of vertebrate chromosomes. Genome research 10, 1890-1902.
Raman, K., Damaraju, N., Joshi, G.K., 2014. The organisational structure of protein networks: revisiting the centrality-lethality hypothesis. Syst Synth Biol 8, 73-81.
Ransom, D.G., Haffter, P., Odenthal, J., Brownlie, A., Vogelsang, E., Kelsh, R.N., Brand, M., van Eeden, F.J., Furutani-Seiki, M., Granato, M., Hammerschmidt, M., Heisenberg, C.P., Jiang, Y.J., Kane, D.A., Mullins, M.C., Nusslein-Volhard, C., 1996. Characterization of zebrafish mutants with defects in embryonic hematopoiesis. Development 123, 311-319.
Reese, M.G., 2001. Application of a time-delay neural network to promoter annotation in the Drosophila melanogaster genome. Computers & chemistry 26, 51-56.
Reynolds, I.J., Hastings, T.G., 1995. Glutamate induces the production of reactive oxygen species in cultured forebrain neurons following NMDA receptor activation. The Journal of neuroscience : the official journal of the Society for Neuroscience 15, 3318-3327.
Rhodes, M.M., Kopsombut, P., Bondurant, M.C., Price, J.O., Koury, M.J., 2005. Bcl-x(L) prevents apoptosis of late-stage erythroblasts but does not mediate the antiapoptotic effect of erythropoietin. Blood 106, 1857-1863.
Rhodes, M.M., Kopsombut, P., Bondurant, M.C., Price, J.O., Koury, M.J., 2008. Adherence to macrophages in erythroblastic islands enhances erythroblast proliferation and increases erythrocyte production by a different mechanism than erythropoietin. Blood 111, 1700-1708.
Robinson, M.D., McCarthy, D.J., Smyth, G.K., 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139-140.
Rodriguez-Mari, A., Yan, Y.L., Bremiller, R.A., Wilson, C., Canestro, C., Postlethwait, J.H., 2005. Characterization and expression pattern of zebrafish Anti-Mullerian hormone (Amh) relative to sox9a, sox9b, and cyp19a1a, during gonad development. Gene Expr Patterns 5, 655-667.
Rosado, C.J., Buckle, A.M., Law, R.H., Butcher, R.E., Kan, W.T., Bird, C.H., Ung, K., Browne, K.A., Baran, K., Bashtannyk-Puhalovich, T.A., Faux, N.G., Wong, W., Porter, C.J., Pike, R.N., Ellisdon, A.M., Pearce, M.C., Bottomley, S.P., Emsley, J., Smith, A.I., Rossjohn, J., Hartland, E.L., Voskoboinik, I., Trapani, J.A., Bird, P.I., Dunstone, M.A., Whisstock, J.C., 2007. A common fold mediates vertebrate defense and bacterial attack. Science 317, 1548-1551.
Rossi, A., Kontarakis, Z., Gerri, C., Nolte, H., Holper, S., Kruger, M., Stainier, D.Y., 2015. Genetic compensation induced by deleterious mutations but not gene knockdowns. Nature 524, 230-233.
Rutschmann, S., Matschiner, M., Damerau, M., Muschick, M., Lehmann, M.F., Hanel, R., Salzburger, W., 2011. Parallel ecological diversification in Antarctic notothenioid fishes as evidence for adaptive radiation. Mol Ecol 20, 4707-4721.
Sabin, F.R., 2002. Preliminary note on the differentiation of angioblasts and the method by which they produce blood-vessels, blood-plasma and red blood-cells as seen in the living chick. 1917. J Hematother Stem Cell Res 11, 5-7.
Schoenfelder, S., Sexton, T., Chakalova, L., Cope, N.F., Horton, A., Andrews, S., Kurukuti, S., Mitchell, J.A., Umlauf, D., Dimitrova, D.S., Eskiw, C.H., Luo, Y., Wei, C.L., Ruan, Y., Bieker, J.J., Fraser, P., 2010. Preferential associations between co-regulated genes reveal a transcriptional interactome in erythroid cells. Nat Genet 42, 53-61.

Schwerte, T., Uberbacher, D., Pelster, B., 2003. Non-invasive imaging of blood cell concentration and blood distribution in zebrafish Danio rerio incubated in hypoxic conditions in vivo. The Journal of experimental biology 206, 1299-1307.
Sertori, R., Trengove, M., Basheer, F., Ward, A.C., Liongue, C., 2016. Genome editing in zebrafish: a practical overview. Brief Funct Genomics 15, 322-330.
Sharma, P.P., Kaluziak, S.T., Perez-Porro, A.R., Gonzalez, V.L., Hormiga, G., Wheeler, W.C., Giribet, G., 2014. Phylogenomic interrogation of arachnida reveals systemic conflicts in phylogenetic signal. Molecular biology and evolution 31, 2963-2984.
Shin, S.C., Ahn, D.H., Kim, S.J., Pyo, C.W., Lee, H., Kim, M.K., Lee, J., Lee, J.E., Detrich, H.W., Postlethwait, J.H., Edwards, D., Lee, S.G., Lee, J.H., Park, H., 2014. The genome sequence of the Antarctic bullhead notothen reveals evolutionary adaptations to a cold environment. Genome Biol 15, 468.
Sidell, B.D., O'Brien, K.M., 2006. When bad things happen to good fish: the loss of hemoglobin and myoglobin expression in Antarctic icefishes. The Journal of experimental biology 209, 1791-1802.
Snow, J.W., Orkin, S.H., 2009. Translational isoforms of FOG1 regulate GATA1-interacting complexes. The Journal of biological chemistry 284, 29310-29319.
Soding, J., Biegert, A., Lupas, A.N., 2005. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res 33, W244-248.
Soler, E., Andrieu-Soler, C., de Boer, E., Bryne, J.C., Thongjuea, S., Stadhouders, R., Palstra, R.J., Stevens, M., Kockx, C., van Ijcken, W., Hou, J., Steinhoff, C., Rijkers, E., Lenhard, B., Grosveld, F., 2010. The genome-wide dynamics of the binding of Ldb1 complexes during erythroid differentiation. Genes Dev 24, 277-289.
Song, J., Singh, M., 2013. From hub proteins to hub modules: the relationship between essentiality and centrality in the yeast interactome at different scales of organization. PLoS Comput Biol 9, e1002910.
Sood, R., English, M.A., Belele, C.L., Jin, H., Bishop, K., Haskins, R., McKinney, M.C., Chahal, J., Weinstein, B.M., Wen, Z., Liu, P.P., 2010. Development of multilineage adult hematopoiesis in the zebrafish with a runx1 truncation mutation. Blood 115, 2806-2809.
Soza-Ried, C., Hess, I., Netuschil, N., Schorpp, M., Boehm, T., 2010. Essential role of c-myb in definitive hematopoiesis is evolutionarily conserved. Proc Natl Acad Sci U S A 107, 17304-17308.
Spillman, J., Hureau, J.C., 1967. Observations sur les éléments figurés du sang incolore de Chaenichthys rhinoceratus Richardson, poisson téléostéen antarctique (Chaenichthyidae). Bulletin du Museum National d'Histoire Naturelle 38, 779-783.
Steck, T.L., 1974. The organization of proteins in the human red blood cell membrane. A review. J Cell Biol 62, 1-19.
Stein, S.J., Baldwin, A.S., 2013. Deletion of the NF-kappaB subunit p65/RelA in the hematopoietic compartment leads to defects in hematopoietic stem cell function. Blood 121, 5015-5024.
Suzuki, M., Moriguchi, T., Ohneda, K., Yamamoto, M., 2009. Differential contribution of the Gata1 gene hematopoietic enhancer to erythroid differentiation. Molecular and cellular biology 29, 1163-1175.
Suzuki, M., Shimizu, R., Yamamoto, M., 2011. Transcriptional regulation by GATA1 and GATA2 during erythropoiesis. Int J Hematol 93, 150-155.
Talbot, J.C., Amacher, S.L., 2014. A streamlined CRISPR pipeline to reliably generate zebrafish frameshifting alleles. Zebrafish 11, 583-585.
The UniProt Consortium, 2017. UniProt: the universal protein knowledgebase. Nucleic Acids Res 45, D158-D169.
Thompson, M.A., Ransom, D.G., Pratt, S.J., MacLennan, H., Kieran, M.W., Detrich, H.W., 3rd, Vail, B., Huber, T.L., Paw, B., Brownlie, A.J., Oates, A.C., Fritz, A., Gates, M.A., Amores, A., Bahary, N., Talbot, W.S., Her, H., Beier, D.R., Postlethwait, J.H., Zon, L.I., 1998. The cloche and spadetail genes differentially affect hematopoiesis and vasculogenesis. Dev Biol 197, 248-269.

Tijssen, M.R., Cvejic, A., Joshi, A., Hannah, R.L., Ferreira, R., Forrai, A., Bellissimo, D.C., Oram, S.H., Smethurst, P.A., Wilson, N.K., Wang, X., Ottersbach, K., Stemple, D.L., Green, A.R., Ouwehand, W.H., Gottgens, B., 2011. Genome-wide analysis of simultaneous GATA1/2, RUNX1, FLI1, and SCL binding in megakaryocytes identifies hematopoietic regulators. Dev Cell 20, 597-609.
Till, J.E., McCulloch, E.A., 1980. Hemopoietic stem cell differentiation. Biochim Biophys Acta 605, 431-459.
Traver, D., Paw, B.H., Poss, K.D., Penberthy, W.T., Lin, S., Zon, L.I., 2003. Transplantation and in vivo imaging of multilineage engraftment in zebrafish bloodless mutants. Nat Immunol 4, 1238-1246.
Trinchella, F., Parisi, E., Scudiero, R., 2008. Evolutionary analysis of the transferrin gene in Antarctic Notothenioidei: A history of adaptive evolution and functional divergence. Mar Genomics 1, 95-101.
Truett, G.E., Heeger, P., Mynatt, R.L., Truett, A.A., Walker, J.A., Warman, M.L., 2000. Preparation of PCR-quality mouse genomic DNA with hot sodium hydroxide and tris (HotSHOT). Biotechniques 29, 52, 54.
Ulyanova, T., Blasioli, J., Woodford-Thomas, T.A., Thomas, M.L., 1999. The sialoadhesin CD33 is a myeloid-specific inhibitory receptor. Eur J Immunol 29, 3440-3449.
UniProt Consortium, 2015. UniProt: a hub for protein information. Nucleic Acids Res 43, D204-212.
Van Etten, R.A., 2007. Oncogenic signaling: new insights and controversies from chronic myeloid leukemia. J Exp Med 204, 461-465.
Varki, A., Angata, T., 2006. Siglecs--the major subfamily of I-type lectins. Glycobiology 16, 1R-27R.
Vitale, C., Romagnani, C., Puccetti, A., Olive, D., Costello, R., Chiossone, L., Pitto, A., Bacigalupo, A., Moretta, L., Mingari, M.C., 2001. Surface expression and function of p75/AIRM-1 or CD33 in acute myeloid leukemias: engagement of CD33 induces apoptosis of leukemic cells. Proc Natl Acad Sci U S A 98, 5764-5769.
Vo, N., Goodman, R.H., 2001. CREB-binding protein and p300 in transcriptional regulation. The Journal of biological chemistry 276, 13505-13508.
Volkmann, K., Rieger, S., Babaryka, A., Koster, R.W., 2008. The zebrafish cerebellar rhombic lip is spatially patterned in producing granule cell populations of different functional compartments. Dev Biol 313, 167-180.
Wakabayashi, J., Yomogida, K., Nakajima, O., Yoh, K., Takahashi, S., Engel, J.D., Ohneda, K., Yamamoto, M., 2003. GATA-1 testis activation region is essential for Sertoli cell-specific expression of GATA-1 gene in transgenic mouse. Genes to cells : devoted to molecular & cellular mechanisms 8, 619-630.
Walter, R.B., Hausermann, P., Raden, B.W., Teckchandani, A.M., Kamikura, D.M., Bernstein, I.D., Cooper, J.A., 2008. Phosphorylated ITIMs enable ubiquitylation of an inhibitory cell surface receptor. Traffic 9, 267-279.
Wang, Y., Xiao, Z.J., Liu, P., Yang, C., Yang, R.C., Cai, Y.L., Han, Z.C., 2003. [Expression of vascular endothelial growth factor and its receptors KDR and Flt1 in acute myeloid leukemia]. Zhonghua Xue Ye Xue Za Zhi 24, 249-252.
Weinstein, B.M., Schier, A.F., Abdelilah, S., Malicki, J., Solnica-Krezel, L., Stemple, D.L., Stainier, D.Y., Zwartkruis, F., Driever, W., Fishman, M.C., 1996. Hematopoietic mutations in the zebrafish. Development 123, 303-309.
Wells, M., Tidow, H., Rutherford, T.J., Markwick, P., Jensen, M.R., Mylonas, E., Svergun, D.I., Blackledge, M., Fersht, A.R., 2008. Structure of tumor suppressor p53 and its intrinsically disordered N-terminal transactivation domain. Proc Natl Acad Sci U S A 105, 5762-5767.
Wells, R.M.G., Ashby, M.D., Duncan, S.J., Macdonald, J.A., 1980. Comparative-Study of the Erythrocytes and Hemoglobins in Nototheniid Fishes from Antarctica. J Fish Biol 17, 517-527.
West-Eberhard, M.J., 1989. Phenotypic plasticity and the origins of diversity. Annu. Rev. Ecol. Syst. 20, 249-278.
West-Eberhard, M.J., 2005. Developmental plasticity and the origin of species differences. Proc Natl Acad Sci U S A 102, 6543-6549.

Westerfield, M., 2000. The zebrafish book. A guide for the laboratory use of zebrafish (Danio rerio), 4th ed. Univ. of Oregon Press, Eugene.
Willard, S.S., Koochekpour, S., 2013. Glutamate, glutamate receptors, and downstream signaling pathways. International journal of biological sciences 9, 948-959.
Wu, S., Zhang, Y., 2007. LOMETS: a local meta-threading-server for protein structure prediction. Nucleic Acids Res 35, 3375-3382.
Xu, J., Zhang, Y., 2010. How significant is a protein structure similarity with TM-score = 0.5? Bioinformatics 26, 889-895.
Xu, Q., Cai, C., Hu, X., Liu, Y., Guo, Y., Hu, P., Chen, Z., Peng, S., Zhang, D., Jiang, S., Wu, Z., Chan, J., Chen, L., 2015. Evolutionary suppression of erythropoiesis via the modulation of TGF-beta signalling in an Antarctic icefish. Mol Ecol 24, 4664-4678.
Yang, J., Yan, R., Roy, A., Xu, D., Poisson, J., Zhang, Y., 2015. The I-TASSER Suite: protein structure and function prediction. Nature methods 12, 7-8.
Yang, L.V., Heng, H.H., Wan, J., Southwood, C.M., Gow, A., Li, L., 2003. Alternative promoters and polyadenylation regulate tissue-specific expression of Hemogen isoforms during hematopoiesis and spermatogenesis. Developmental dynamics : an official publication of the American Association of Anatomists 228, 606-616.
Yang, L.V., Nicholson, R.H., Kaplan, J., Galy, A., Li, L., 2001. Hemogen is a novel nuclear factor specifically expressed in mouse hematopoietic development and its human homologue EDAG maps to chromosome 9q22, a region containing breakpoints of hematological neoplasms. Mech Dev 104, 105-111.
Yang, L.V., Wan, J., Ge, Y., Fu, Z., Kim, S.Y., Fujiwara, Y., Taub, J.W., Matherly, L.H., Eliason, J., Li, L., 2006. The GATA site-dependent hemogen promoter is transcriptionally regulated by GATA1 in hematopoietic and leukemia cells. Leukemia 20, 417-425.
Yang, S., Ott, C.J., Rossmann, M.P., Superdock, M., Zon, L.I., Zhou, Y., 2016. Chromatin immunoprecipitation and an open chromatin assay in zebrafish erythrocytes. Method Cell Biol 135, 387-412.
Yang, T., Jian, W., Luo, Y., Fu, X., Noguchi, C., Bungert, J., Huang, S., Qiu, Y., 2012. Acetylation of histone deacetylase 1 regulates NuRD corepressor complex activity. The Journal of biological chemistry 287, 40279-40291.
Yang, Z., 2007. PAML 4: phylogenetic analysis by maximum likelihood. Molecular biology and evolution 24, 1586-1591.
Yates, A., Akanni, W., Amode, M.R., Barrell, D., Billis, K., Carvalho-Silva, D., Cummins, C., Clapham, P., Fitzgerald, S., Gil, L., Giron, C.G., Gordon, L., Hourlier, T., Hunt, S.E., Janacek, S.H., Johnson, N., Juettemann, T., Keenan, S., Lavidas, I., Martin, F.J., Maurel, T., McLaren, W., Murphy, D.N., Nag, R., Nuhn, M., Parker, A., Patricio, M., Pignatelli, M., Rahtz, M., Riat, H.S., Sheppard, D., Taylor, K., Thormann, A., Vullo, A., Wilder, S.P., Zadissa, A., Birney, E., Harrow, J., Muffato, M., Perry, E., Ruffier, M., Spudich, G., Trevanion, S.J., Cunningham, F., Aken, B.L., Zerbino, D.R., Flicek, P., 2016. Ensembl 2016. Nucleic Acids Res 44, D710-716.
Yergeau, D.A., Cornell, C.N., Parker, S.K., Zhou, Y., Detrich, H.W., 2005. bloodthirsty, an RBCC/TRIM gene required for erythropoiesis in zebrafish. Dev. Biol 283, 97-112.
Yergeau, D.A., Wingert, R.A., Zon, L.I., Detrich, H.W., 3rd, 2006. Hematopoietic tissues of the erythrocyte-null Antarctic icefishes contain proerythroblasts that fail to complete terminal differentiation. Manuscript in preparation.
Zeng, Y., Xu, J., Li, D., Li, L., Wen, Z., Qu, J.Y., 2012. Label-free in vivo flow cytometry in zebrafish using two-photon autofluorescence imaging. Optics letters 37, 2490-2492.
Zhang, M.J., Ding, Y.L., Xu, C.W., Yang, Y., Lian, W.X., Zhan, Y.Q., Li, W., Xu, W.X., Yu, M., Ge, C.H., Ning, H.M., Li, C.Y., Yang, X.M., 2012a. Erythroid differentiation-associated gene interacts with NPM1 (nucleophosmin/B23) and increases its protein stability, resisting cell apoptosis. The FEBS journal 279, 2848-2862.
Zhang, Y., Skolnick, J., 2005. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res 33, 2302-2309.
Zhang, Z., Xiao, J., Wu, J., Zhang, H., Liu, G., Wang, X., Dai, L., 2012b. ParaAT: a parallel tool for constructing multiple protein-coding DNA alignments. Biochem Biophys Res Commun 419, 779-781.
Zhao, Y., Ratnayake-Lecamwasam, M., Parker, S.K., Cocca, E., Camardella, L., di Prisco, G., Detrich, H.W., 1998a. The major adult α-globin gene of Antarctic teleosts and its remnants in the hemoglobinless icefishes: calibration of the mutational clock for nuclear genes. J. Biol. Chem 273, 14745-14752.
Zhao, Y., Ratnayake-Lecamwasam, M., Parker, S.K., Cocca, E., Camardella, L., di Prisco, G., Detrich, H.W., 3rd, 1998b. The major adult alpha-globin gene of antarctic teleosts and its remnants in the hemoglobinless icefishes. Calibration of the mutational clock for nuclear genes. The Journal of biological chemistry 273, 14745-14752.
Zheng, W.W., Dong, X.M., Yin, R.H., Xu, F.F., Ning, H.M., Zhang, M.J., Xu, C.W., Yang, Y., Ding, Y.L., Wang, Z.D., Zhao, W.B., Tang, L.J., Chen, H., Wang, X.H., Zhan, Y.Q., Yu, M., Ge, C.H., Li, C.Y., Yang, X.M., 2014. EDAG positively regulates erythroid differentiation and modifies GATA1 acetylation through recruiting p300. Stem Cells 32, 2278-2289.
Zon, L.I., 1995. Developmental biology of hematopoiesis. Blood 86, 2876-2891.
