Austral de Austral D. Joseph Th 10.1111/1755-0998.12588doi: differences to lead between the of thisversion citethisarticle and asVersion Record.Please copyediting, paginationbeen throughthe andproofreadingtypesetting, process,may which This article acceptedundergonehas been fullpeer forpublicationreview buthasnot and adaptation, bony fish, Keywords: trade,aquarium E-mail: Phone: +61 478 514097 University, PO Box WA U1987,Perth, 6845,Australia * King AbdullahUniversity Science of and Technology,Thuwal Arabia 23955,Saudi WA6845,Perth, Arabia University, KingAbdullah ofScienceandTechnology,Thuwal23955,Saudi Engineering 1 Aranda Running title: austriacus blacktailof aniconicthe RedSeareef Draft genomefish, theCover' submission 'From Molecular Ecology Resources type Article :From Cover the DateAccepted 19-Jul-2016 : Revised Date:05-Jul-2016 DateReceived 22-Nov-2015 : http://olabout.wiley.com/WileyCDA/Section/id-828039.html#terms article maybeusedfornon-commercialpurposesinaccordancewithWileyTermsandConditionsSelf-Archivingat its characteristics.MolecularEcologyResources,whichhasbeenpublishedinfinalformathttp://doi.org/10.1111/1755-0998.12588.This and Berumen,M.2016.DraftgenomeofaniconicRedSeareeffish,theblacktailbutterflyfish(Chaetodonaustriacus):Currentstatus This isthepeerreviewedversionoffollowingarticle:DiBattista,J.andWang,X.Saenz-Agudelo,P.Piatek,M.Aranda, Red Sea Correspondance: JosephCorrespondance: DiBattista, Department ofEnvironment Curtin andAgriculture,
is article isprotectedbyis article copyright. All rightsreserved. Accepted Article , 1 [email protected] 2 , Department Curtin of andAgriculture, Environment University, PO Box U1987, and Research Center, D
DiBattista ): c Chile, Valdivia 5090000,Chile
Michael L Michael Berumen urrent statusand urrent Draft genome of the blacktail genomeDraft ofthe blacktail Australia 1 ,2 * , WangXin ,
3
Instituto Ambientales yEvolutivas,Universidad deCiencias
ivision ivision Scienceand l andEnvironmental of Biologica
its characteristics its 1 1 , Pablo Saenz-Agudelo , 4 Computational Bioscience Research Center, butterflyfish endemism, endemism, 1,3
, Marek J. Piatek, J. Marek butterflyfish ( genomics, Indo-Pacific Chaetodon 4 , Manuel
This article isThis article by protected copyright. All rightsreserved. approaches sequencing particularly Latimeria radiations Howe obtained Osteichthyes), benefit Genetic Introduction Pacific congen related Red unique Sea environment potential to serve The quality and completenessofthe 28,926 shelter austriacus assembled generalquestto study B Abstract
Accepted Article areamongutterflyfish most the of iconic There are at least areatleast There
et . Using . high for
research
al. for ;
) (fami
and annotated the genome sequence ofand annotated genomesequence the multiple Amemiya
, a
(Chen (NGS) - r
2013 well quality eef n Arabian n Arabian region endemic species that is reliant oncoralreefs forand food
available bonyavailable fish er which
ly
ic fish
- ) is
as studied
,
C
2015;
species ofbutterflyfish
technologies aquaculture
rapidly fields ichlidae; protein es a resource for studies a resourceforstudies
5 et represent ions evolution of biogeography, , ,
000
are al.
Xiao
model of
2014 moving still - true reef in true fish species coding genes
research , Loh
as well as as as well et
more (common lacking
). ( systems al. f
et or (superclass Osteichthyes) Genome
from
2015)
al.
draft draft review . than
Genome
, 2008
despite
a comparison gene of sequences closely between simple related
c .
were predicted fromwere predicted genome 50% on
distributed - arp,
scal see )
, the
or coral
Cyprinus of
e sequences
Kosuri significant to to
co resources evolutionary
all the the
of increasingly
medical tropical many seaswith traits diverse and -
reef model fishesandrepresent a system evolution of
known C. austriacus , more broadly blacktail
and po &
carpio
Church from
advances for research
genome vertebrate pulation geneticspulation
novelties 13 non
b ; complex the
utterflyfish Xu reef fishreef ,
967 2014 -
suggests that it model b across tropical the Indo
s
(z et ony in
reference, a as a total of assembled scaffolds.
ebrafish,
) al. species, next
(
and
systems, fish c
adaptations adaptations to the 2014 oela organisms
-
( bioinf generation es Chaetodon canth, .
) ( were Danio We , superclass
adaptive
with ormatic has the
, genus first
rerio great
- ;
This article isThis article by protected copyright. All rightsreserved. also environmental minimal This one for genome of diversity specializ family analyses indicate thatthebutterflyfish from family diverged nearest theangelfish relative, its the next closest being 61% ofnutrition, source are 2008) 2002; Littlewood feeding behavio 2003; Lawton groupspeciose 130species) of(> reef fishbythe targeted ornamental often trade mechanismsthe adaptations the ofsome underlying coral thesefishto reefs. of al. et shark, (b ( rubripes far adaptations, waterbut thus a cold pufferfish northwest from the Pacific Ocean ( Tetraodon Tetraodon nigroviridis
Accepted Article luefin
From butterflyfish andThe conspicuous bannerfish inpa (family Chaetodontidae)
of
harbo s 2014 . Indeed,that of fishthe species corals feed directly onscleractinian primaryas their emi
Rhincodon typus the ( Pomaca
; t fre
una, a - van 2004 dePeer
rs enclosed most ). This doesno deficiency
- biogeographic shwater
based phylogeneticcomp based one Thunnus orientalis
et al. et variables
nthidae), nthidae), 33millionabout years ago r unique
, of
et al. et and e
the basin inflow 2013 ations in wrasses (family: Labridae)
highest ; Read cological nicheselectioncological and 2004; Pratchett 2005;GrahamCole 2007; ;
(Sofianos van de Peer 2004van dePeer )
at , display perspective, , ), a brackish), a pufferfishof rivers water fromthe to Sri Lanka China
geologically high the butterflyfi
the butterflyfishes,the et al. et
levels
north
; rates Nakamura
2003; a tremendous diversity of a tremendousdiversity colo
2015 t allow us to t usto allow identify
western arisons arisons among of of
shes butterflyfish
endemism
complex evaporation ) Ngugi ) are are the genomicclosest resources(
, ( a pelagic predator thea pelagic PacificOcean in northern Cole
et al. et boundary
et (Roberts & (Roberts
this family this ideal serve as an should model
(20%; (20%; et al. et
regions
al.
2013 for es
(Bellwood , the
2012; and are
marine
2008; Lewis Rotjan & 2008 of ) Bellwood specializ ,
genes under selection under orelucidate genes
prominent of and the latitudinal the
Raitsos Ormond Zekeria 1992; the
Indian organis
et al. et world ed reeffish r pattern world’s et al. et et al. et
et
in
gradients Ocean 2010) ms,
’ al.
s
fish 2008; Gregson
2010 oceans,
2013). , bodyshape, with largest fishlargest (whale
. D
assemblages is es ). Molecular ). Molecular also see
ue to the ue tothe
subject . in 12.9%
rticular, a
the The
( Wabnitz Takifugu )
et al. et , with the Red
Red
Spaink for to et al. et
of Sea.
Sea
This article isThis article by protected copyright. All rightsreserved. the congruent phylogeny markers would Nanninga differentiation and disjunctive et and upwelling DiBattista what uncertain Pleistocene and (DiBattista into (Roberts that scleractinian fi shes,
Acceptedal. Article
first The Our
a shallow cor 6 the
limits
2014 turbid
of
benefit 12.6% al
adjacent origin objective
draft
[RAD the
et
genera is
et , isolation ; off et
et distribution al. or dispersal Giles - glacial
how
( 12 al. water
al. corals
137
genome
al. for the
from whole - of 1992) between
Seq species 2014; 2016a
the (DiBattista Gulf
2016
high polychaetes, northeast et m) in
region
cycles ]
(DiBattista events, the
al.
this , separation for connection genome for
although
of b Giles of ) levels
. 2015 ) application recorded
of populations
, population
Some these study a
Aden
but
south
(Rohling some Red
or
African
et )
et
of
. is
sequencing)
to
The relatively 8.1% al.
possible was
or al. Sea some
et thought among endemism of
reef
test with from
as 2015 al.
2016a
19
importance
of to reef
genomics et
and of
for
far
fish finer 2016a ° of
the use NGS al.
the
; to coral
extant
barriers these to as e fish
southern S ;
sedentary
chinoderms,
species
1998; 20°
aenz - Indian Sheppard
NGS
Red scale to be the in
) approaches ,
.
reef
,
the
shed , Red
Butterflyfish
N species
species at -
Gulf
ultra Sea
of Agudelo
approaches Siddall
in
to hypotheses
least Ocean blacktail
organisms (Robert
Arabian these
Sea
light
the larval
organisms were -
of & conserved
are
is in
16.5% reef Oman Red Sheppard
et
barriers on (
that
maintained et
part, e.g. at
s dispersal
now
al. butterflyfish
coasts es
et the al. fish
Sea one
and related (
restricted
restriction Froukh al. are
(DiBattista 2003; for
the
at 2015
known exact
elements is
(Roberts time
de is
1992;
the particularly
1991 ascidians
(Smeed result still
supported
include
novo
)
to Bailey
. within larval
timing described
& This
) to
subject reef water DiBattista
,
(
Kochzius of site
as
Chaetodon
extend assembl [UCEs] et
et 1997;
,
area a
stage fish well
cold
2009) the
al.
al. and
and associated
narrow
flow by interesting
to
1992;
2016a Red as
connectivity. - of
as 5.8%
Kemp their water location the
( much y .
for
see
et endemic 2008 during
research Even genetic to
Sea,
(18 al.
Nanninga )
provide range
for .
2000) de ; DNA
2016a
more km)
given o
of bate r
)
This article isThis article by protected copyright. All rightsreserved. ofUniversity Scienceand sequencing, 45 μgwasfor provided mate CA using aliquotwasquantified extracted dsDNA aQubit (Invitrogen, HS Kit Assay Valencia,(Qiagen, CA) followingthe manufacturer's DNA protocol. Total from each tissue and animal wasnotavailable preferred to inbreduse an individual for genome sequencing to simplify assembly preserved andarchived genomic preparation library (KAUST ID: Tissue whole RS3552), the and specimen was (N Arabia A single DNA sequencing Methods and Material environment may (e.g. Sea, C. closely exclusively the austriacus
Accepted Articlelunulatus
). A Chaetodontidae Genomic for shotgun DNA libraries sequencingprepared were atthe King Abdullah
respectively provide DiBattista
related total of prepandtotal forlibrary 10μgofDNAwas provided downstream shotgun C. austriacus ). placed into aliquots fourseparate using aQiagen DNeasy BloodandTissue Kit 18.658617
in , Chaetodon
s
a and .
the
means sister
et
(Waldrop
C. al. northern
family
melapterus
taxa
°, E 2015 to
specimen was collected on March 5, 2013, near Ablo Island, Saudispecimen onMarch5,2013,nearAbloIsland, collected was
at California the Academyof Sciences (CAS236290).Althoughis it
austriacus identify .
40.826967 found Genomic DNAfromwas extracted 100mg approximately ofgill ). Technology (
found
and et
Future
al.
central
, together
genes
which 2016 in
Rüppell,
g the °). Portions filamentsof gill the removed were for enomic
)
Red
related , live Red
KAUST) Bioscience KAUST)Bioscience (BCL)Core Laboratory using with in - pair librarypair prep.
Sea the
in
1836 Sea. comparisons
rare
to the
Cora region
unique
This
is a Indian reas
one l lochaetodon
iconic (DiBattista
of
of adaptations
Ocean,
among co
twelve
- species occurrence
Pacific
these
subgenus, shallow et
to al. is
the distributed closely
2016a Ocean
and
water different
C.
hybridization )
,
, related
trifasci
and and
members , Carlsbad,
almost an inbred an inbred
Arabian has
fish atus
three
of
,
This article isThis article by protected copyright. All rightsreserved. Accepted weresequences identified the genome with by performing NCBIbacterial search BLAST a ( fish transcripts tissue extractions andnormaliz bestassemblyusingthe akm wasobtained mitogenomesfish to prior genome shortreadassembler the Abyss with assembly respectively. 11,207,168 read remaining kept default parameterswere as 2013 quality trimming,in resulting total of808,641,943paired which included Article The shotgun sequences De novogenome run PE Illumina KAUST, HiSeq2000at and all three sizes insert (3 FC (Cat# GelPlusprotocolPair outlined “Illumina in MatePairlibrarypreparation Nextera guide KAUST BCLfacility tostrandsubject d ofanrun PEonfive lanes Illumina HiSeq2000(v3 atKAUST. reagents) an Illumina
Paired on three ofPElanes Illumina )
for qualityfilteringfollowing withthe parameters: score >20). - 132 - end libraries were screenedmitochondrial for sequencesafter aligningavailable to TruSeq
- 9001DOC, Part#15035209 RevD.)”. The5
adapter trimming, B - i.e. pairs were retained 3 forthe assembly assembly isplacement of3
contaminants), such These asbacteria. putative bacterial contaminant DNA Sample Kit Preparation (Illumina, SanDiego,CA). Themate and Genotypic Te 777,393,511 paired (101 bp read length) -
end reads andsubjected weresequenced toq - pair readswereprocessedusing NextClip ing the was libraries the increased potential to sequence non
NextSeq500 at NextSeq500 at Genotypic - 5kb, 5 5kb, - trimming, low and chnology g - - . 8kb, and 8 reads containing 150.99 er size of63. size er After trimming, 33,333,612,19,362,215,and were quality filtered using SeqQCwere filtered using quality - 5 kb, 5 enomics following Nextera facility a Mate - - 8 kb,and 8 11kb - quality trimming end functions A further consequence ofanimalfurther A
- - t 0, 8kb was library lanerun onone of were prepared separately were prepared separately at the
Technology - y 32,17(softmatch), and the - 5kb, 5 - 11 kbinsert Gb
- of data (withdata of 8kb, and 8
. tool (
uality filteringuality and
Mate
Leggett The librarywas libraries, - pair libraries vers. - vers. 11kb) were
a 2.2 et al. Phred
1.5.2.; . , A
- This article isThis article by protected copyright. All rightsreserved. 2006 &2.1; Iwata Gotoh2012 modelsGene predicted were usinga combined approac Gene http://www.timetree.org/ Evolutionary fordifferent distances the species to Brawand comparative analysiswithselectedfish specieswasretrieved from Amemiya categories: DNA assemb conjunction with BLASTandCrossmatchidentified annotate the to repeatsinthegenome 1.4 (Wheeler annotat ( using models Repeatmodeler consensus vers. Recon We interspersed identified repeats and DNA lowcomplexity sequences in genomethe using Genome annotation is genome sequence usingfilled al. et dataset. sequences
Accepted Articlehttp://www.repeatmasker.org/RepeatModeler.html
;
1.0 (Edgar &Myers 2005) 2011
a GlimmerHMM
vers. nnotation nnotation ly. We used a hierarchical systemly. Weusedahierarchical toclassify ed based ona et et al. )
GapCloser with the followingwith the 1.0.8 (Bao & Eddy 2002), RepeatScout1.0.8 (Bao&Eddy2002), et al.et
(2014), (2014), Howe
- transposon, LTR,SI transposon,
function a function Assembled 2013). Repeatmasker - PRJNA292048
combination of RepBase repeat identification vers. vers. . )
and
3.0.3
1.12 nnotation et al. et c ab initio parameters: ontigs were .
,
The output wassubsequentlyThe output refineused to andclassify (Luo Kelley Kelley
(2013), Kasahara (2013), .
NEs, LINEs
et al. et
gene vers.
vers. et al. et
and phylogenetic comparison - joined
x 1,
2012) prediction (Augustus
1.0.8
4.0.5 (Chen2004) 2012 vers. - T 20, ,
C. austriacus . TheNCBI BioProjectaccession for the
using and simple repeats. and simple )
). . Identified repeat elements. Identified repeat were 20.02 (Bao vers. de novo
et al. For homology h ofhomology –
g 1 SSPACE
1.0.5 (Price
(2007), and Wang . T
repeat elements int he resulting scaffolds et al. et were retrieved fromwere retrieved
was also used in was also usedin - vers. BASIC - based prediction gene
2015) et al.et - based Data for the Data forthe
3.0.3
-
vers.
2005), and Dfam
, (SPALN et al. et et al.et Stanke o the different
2.0
and PILER (2015). (2015). (2013),
( were gap were gap Boetzer
et al. et vers. vers.
This article isThis article by protected copyright. All rightsreserved. Pfamthe annotations functional we used InterProScan chalumnae included the species fis against ofsever gene sets with , database (Table S1 database rem and - ( blasted (BlastP)a blasted 5 Amemiya ted against against ted the database,2) TreMBL a These gene These gene models were was retained,t and3) oval ofgenemodels moreoval with thantwostopcodons and<20 2014 Danio rerio ( ], Haas - zebrafish E. off of electricus - ) with anaverage of size 10,232 based and based We identified al fishal species withfully sequenced genomes to order identify in et al. et - searcher [ ucius Howe 2013 in Supporting in Supporting Information ) gainst Swissprotdatabase; the t 926 high quality gene926 highquality sets - species 5 2008 was retained whereaswas retained ) [ ( , and et al. et Ro Gallant he remaining models gene annotation were without blasted ab initio with ) toform acomprehensi ndeau - then then specific parameters for sub 270,127 exons with an coding270,127 exonswith average sequenc D. 53.7% and39.81%forexon 2013 default default to parameters UniProt inorder align the - et al.et rerio f annotated using a three et al.et gene were prediction integr rag ]) to ourassembled genome; t ( ed publicin the databases Jones ments. Augustus andGlimmerHMM 2014 ( Howe 2014 l l gene no models with hits suitable against et al.et ) , gene models with > withnohitor hits ], E. ). et al. et bp, and a gene density of density bp, andagene Latimeria chalumnae Latimeria (Table S1andTable S2 A final validationA final of 2014 lucius ve ofconsensus set models gene he best hit forhe besthit gene each model 2013 ab initio ) to predict domainsto predict base ( Rondeau ) - Electrophorus electricus Electrophorus . Toprovide further st s ep approach and intron gene predicti ated EVM using that we his provided et al. et bp the modelsthe was [ in length Amemiya s, used; used; t 2014 : 1) in and best the on. The were run were al ) e , hese hese l L. d on , we et et . This article isThis article by protected copyright. All rightsreserved. genebetween pairs (TableS3 analysis tool SynChro (Drillon against the recently published NileTilapia genome (Brawand 15), 656.4Mb(kmer 17) Thiskmer size. approach generated N coverage, is the actual of coverage genome,the mean Listhe r we thecoverage, applied formula M=N*(L isthe value mean forvalue data.Basedand coverage onthis the the kmer actual genome with plotted different kmer sizes(15, 17 smallused the paired insert accuracy satisfying and continuance of ourasse The mapping thelibraries, rates S2total (Fig. ofthe rate short a qualityIn order toevaluatethe of the assembled genome Genome v median (~ coverage againsthits curated proteinsfromhighly SWISS the performed analyzing distributionthe by ofhitlengths20,794genemodelssignificant with Accepted Article ll libraries BWAusing To further assessTo further of the quality ou We estimated approximatekmer genome the a using size distribution ofdistribution mapped read alidation a frequency a frequency histogram. distribution should The in Supporting Informationin Supporting - insert insert 82 paired %; see vers. vers. , and 65 - , had abroad end library todetermine frequency distribution kmer the using and 19). and 19). kmersize For each - end 0.7.5 (Li Durbin & 2010) Fig. S1 in Supporting in Supporting Information et al.et - 4.9 Mb(kmer 19)( pair sizes insert assembledon the genome a indicated library, which from200to ranged comparable 2014 r assembly weperfo in Supporting Information). ). Owing to the different). Owingtothe qualities of mate the range, butrange, ), allowing for 5,10 – mbly ( K + 1) / L, where M is L,where Mis K +1)/ the mean kmer genome estimatesof650.8Mb(kmer size were all Fig. - PROT andcalcula database Fig. S2 with default parameters. Themapping ). , S , we counted kmer the frequency and we assessed the we assessed mapping of rates rmed 3 W substantially higher than65%. in Supporting Information be unimo - e also assessed completeness also the e based based approach. Briefly, , in Supporting Information et al. et and 15 intervening genesand 15intervening microsynteny analyses ead length 2014) the using synteny 400 dal , bp, was96.5 where the , and ting their K is the peak - % in % in ). pair pair we ). This article isThis article by protected copyright. All rightsreserved. amountslower of can oftenvaryby+/ Chaetodon rainfordi estimates from other estimatedg based butterflyfish 2,000scaffolds than bp greater genomes. F on scaffolds 200bpin > length value 1)(Table Bioproject under the through CenterNational forBiotechnologyInformation (NCBI) ReadArchive Short fold coverage ofthegenome processing after (Table1).RawIllumina available reads are C. austriacus Quality Genome a R in Supporting Information). CEGMA were either genes.of core genesassessed eukaryotic by Theresults showedthat98%ofthecore of genome the CEGMA using Accepted Article esults After trimming and filtering, After trimming andfiltering, s increased increased - filtered Illumina reads . ssembly The N50 or example, or genomewas approximately0.71Gb genome generated genome generated 8 PRJNA292048 enome of0.65Gb size identified - fold in and N90 - Chaetodon (733 Mb completely or partiallyrepresentedinour genome S4 assembly (Table 10% sequencing fromread assemblies2015) (Guoetal. short an N50ofbp 136,950 the scaffoldthe repeat elements lengths ) (Hardie &Hebert ) (Hardie 2004 vers. vers. . (> 97% had ofbases (Nakamura These are values . species such species such as 150.99 the the remaining 2.5 (Parra 2.5 (Parra of genome the the Gb (Fig. S2 (Fig. contigs were et al. et . of approximately whichrepresented data, was achieved inbluefint et al. (N50 =170,231 (N50 reads Chaetodon aureofasciatus 2013) , whichwa in Supporting Information similar to 2007) Phred were assembled into ). . Total ass 21,086 Kmer in order quality scores > 20) other published bonyfish s slightly larger than s slightly kmerthe - based genomebased estimates size bp and embly ofthescaffolded size , to determine the presence N90 = 3,026 una , respectively 118,995 24,561 (685 Mb based on based ), but similar ), but to o f the diploid bp contigs due to ) and ) based based 212 ; t hese - - This article isThis article by protected copyright. All rightsreserved. Elements DNA andan in (SINEs) increase transposons Short in comparison to and Oreochromis niloticus elementstransposable (TEs) ofrepeat elementscontent el most abundantelements were DNA transposon contained 17.86% (127 repeatelementsMb) anaverage with GC content of 42.48%.The orderPerciformesthe ( (~51.3%), of annotatedgenes(> the models gene. assembledper scaffold rangedfrom1 to126,with7%ofthescaffolds onlyone containing bacterialobservable contamination inthefinalgap assembled coveragenone and withhigh few in contigs meaning singleton our genome, thatallscaffolds were wepresent here aBLASTnot have match inanyofthe There databasesunannotated. weredesignated were annotations forprovided 24,433genes 28,926 hig approachUsing acombined ofhomology Annotation Acceptedements inouranalysis (TableS4 Article Analysis oftherepetitiveelement content showed the that Pundamilia nyererei BLAST results sho , of 97.5% whichmorethan , as well as Long as as well Terminal (LTRs) Repeat elements,themarginally despite h confidence h confidence gene models. BLAST Stegaste Oryzias latipes s (~16.2%) , see Neolamprologus brichardiNeolamprologus wed high confidence database matches database wed highconfidence ( (~114 Mya (~114 Mya divergence 70%) matched70%) from sequences , Fig. as well as as well relative the proportions of mostabundant the , were , and 1). (Japanese ricefish) similar tothe closer related species Perciformes and TableS5 Oreochromis were (Table S1 are are indicative ofcon known sequences fromknown sequences Actinopterygii. The majority - based based and - based annotation againstmultiple annotation databasesbased s , which accountedfor>54% oftheclassified ) in Supporting Information). . differences observed However, clear were in Supporting Information (~5.1%) , - closed closed assembly. Astatotilapia and Long (LINEs) Interspersed and Long(LINEs)Interspersed Nuclear , whichshowedproportionsof reduced de novo the fishthe genera , which, like tamination. gene prediction weidentified gene prediction Chaetodon austriacus burtoni - The number ofgenes The 5 ) for Chaetodon, , Larimichthys Indeed, Indeed, Metriaclima zebra Metriaclima ~82 ). The overall The overall Contigs that didContigs that % of the gene % ofthegene there was no belong to such as genome deeper , This article isThis article by protected copyright. All rightsreserved. Accepted inhabit. that they for dispersal habitatcapacity, selection, and adaptation oceanicenvironmentsto the unique austriacus short comparison with (82 scaffold andarelatively(170,231 bp), high size propo arereleasedthey geneexpressed sets, longercoverage, reads to ‘modelstatus group’ resource ( genome Using an Discussion Article assembly approach. observedthe differences simplycould the dueto with inherent associated be biases each referred to here elementsrepetitive s among (~429 Mya divergence) 2; Table(Fig. S5 Mya divergence) differentTEs was substantially in t phylogenetic (~139 distance Myadivergence).The content overall and proportionsof relative % ofthe genes)willfacilitatepredicted the - term, this ; the first first ; the NGS approach we were able to sequence, to NGS assemble wewereable approach http://caus.reefgenomics.org in greater the Indo resource were derived fromreadsequencing assemblies only,which short means that . , closely related In the In the long draft draft That said,t Ctenopharyngodon idellus a linkage map genome foratropical fishreef available to more close ofthegaps for future genomic studies will will pecies mustpecies viewed cautiongiven be other with thatthe genomes - term, genomemay help shapenewhypotheses this he high (~212 genome coverage - enable comparative withgenomics thecongenersof research Pacific (Waldrop butterflyfish species he more distant to linearizethescaffolds ) has the and S6 (~237 Myadivergence) immed potential theChaetodontidae to elevate et al. et , in Supporting Information transcriptome toidentify sequencing ly with increased short related related species (DiBattista 2016 iate iate rtion of rtion proteinannotated use of these sequences for use ofthese ), which agenetic may basis reveal , andnewas , - a fold . nd annotate This et al.et ), large ), large N50 average such as , publically available and submitted - read sequencing sembly L. ). a butterflyfish D. chalumnae C rerio omparisons of - algorithms as ) related tothe coding . Inthe (~232 family genes genes C. This article isThis article by protected copyright. All rightsreserved. identify theunderpinnings of genetic traits involved the in of evolutionary programs sequencingrefinementsfurther ornamental fish trade assemblyamong studies approaches S6 inSupplementary Material species contained substantially lesstransposable elements thant TEsabundant withincreasing phylogenetic distance highlighted substantial differences intheoverall content elements showed selected similarity across fish species high to but closer related species, als genome models andgene assembly gene models C. austriacus fishsequenced not genomesare lackofthe closely support forfurther most, all of not This but thegenemodels. explainedby might be inpart T. genomesequenced (Brawand 24,559genesthe identified in hybridization this within group(Hobbs (Bellwood evolution ofabroader set withinthespecialized butcomplex taxa butterflyfish family Accepted Articlerubripes T We 28,926 protein coding predicted genesin his assembly , experimental experimental under conditions study controlled et al. (Aparicio from gene set usinggene set NCBI majority the nrdatabase, however,matched vast ofthe the curios 2010), as asreveal agenetic well forthe basis P here here related genomes erciformes species ( et et al. ( Delbeek http://reefbuilders.com/tag/hybrid provides 2002). The BLAST et al. et ) O. niloticus , although may this in bedue, biasinsequencingpart, and to a , 2013; 2013) Lawtonet al. would then have then would yet one ofthefirst . 2014), but 2014), but in public in public repositories, available on NCBI.phylogenetic classification Theof the . Moreover, o Moreover, , which lends further support to , whichlendsfurther the et al. et , next the relatedspecies fully closest with a 2013; DiBattis still still - based annotation of annotation theseprovidedbased genes draft draft ur comparison therepertoire ofrepetitive of C. austriacus broad applicationsbroad to lower thanlower 31,059 the predicted genes in . Specifically wefind allPerciformes that “genomic for blueprints” breeders inthe and - . , since most ofthe butterflyfish/ ability ability to spawn andsurvivein This information and, insomecases, ta relative proportionrelative et al. et , which is slightly , whichis higher than relative he zebra 2015) captive breeding ) high rates of high rates . fish genome (Table G . fidelity ofthefidelity enome scans to enome to scans , recently recently along withalong the ofmost the production o This article isThis article by protected copyright. All rightsreserved. Accepted prep,library Illumina sequencing,genome andgenome assembly, annotation. Ltd. with assistancetheir with CoreKAUST Laboratory Bioscience with Sivakumar Neelamegam andHicham for Mansour andclassificationannotation species Californiathe Academy of Sciences; contributions from Catania for withspecimen LuizRochaandDavid assistance at archiving Resources Core Gusti andMarine Coastal Lab andAmr support logistic funds toM FundsResearch (OCRF) AwardNo.CRG under Acknowledgements. Article and natural can which representsfraction anannotated ofourgenome. draft allelochemical familymayin this role onthe of depend proteins in CYP3Athe of detoxification specializ environments with it (i.e. cyanide fishing) fish particularlyaquaria, fortheobligatecora therefore (Lawton ed Dinesh Velayutham, . L butterflyfish anthropogenic pressures. anthropogenic . et al. et provide B that that result s . , wethankEric MasonDream at inSaudiArabia Divers , as well 9024 National, aswell GeographicSocietyGrant as a f ound insome (i.e.soft corals; foodsources Maldonadoetal.2016), their 2013) library prep a firstcritical indeterminingstep these species whether can adapt to . in , Recent studies have studieshave Recent also shownthat This research wasby This research the supported ofCompetitiveKAUST Office in addition to a . More generally,More ongoing perturbationstoshallow l oss ofhabitat Vinay Pantulwar Vinay and Illumina sequencing , pr as as well Yi Jin Liew Yi Jin for providing essential scripts for the nefariousthe methods fishing that sometimescome l livores, livores, may and coralbleaching are , - and 1 ovisioning theovisioning genome browser - 2012 Subashini mitigate . - We acknowledgealso important BER ; and GenotypicTechnologyPvt. This butterflyfish the degree the of specialization diet - 002 andbas the the for theirassistancewith ongoing a major threat to the - 11 toJ and eline eline research harvest the KAUSTthe . draft draft D - water coastal . D. ; the the genome of these For This article isThis article by protected copyright. All rightsreserved. Areviewof(2016a) contemporary ofendemism patterns waterfor shallow reef E,SinclairAgudelo P,Salas MR,JH, Gaither HobbsJP,Kahil M,Kochzius R,M, Myers G,Robitzch V,Saenz Paulay JD DiBattista Biogeography the at crossroads ofmarine provinces biogeographical inthe Sea.Arabian Berumen (2015) ML JD, HeS,Priest MA,Sinclair DiBattista Hobbs RochaLA, JP, Butterflyfishes, JC(2013) Delbeek fishes ontropical coral reefs. C Chen R (2015) ProtocolsCurrent inBioinformatics toidentifyrepetitiveChen NUsingRepeatMasker elements (2004) sequences. ingenomic forsubstrate adaptiveradiationinAfrican cichlid fish. Brawand D,Wagner CE,Li YI,Malinsky I,FanS, M,Keller usingcontigs SSPACE. M, Boetzer He fishes. of Evolutionary history butterflyfishesthe Chaetodontidae) the (f: feeding and rise ofcoral Bellwood DR, Klanten S, CowmanPF, Pratche in eukaryotic genomes. W,Bao RepbaseUpdate,ofrepetitive O(2015) Kojimaadatabase elementsKK,Kohany genomes.sequenced Bao Z,Eddy SRAutomated of denovoidentification (2002) sequence families repeat in Publishing. Springer R(2009)Bailey ofthe assemblyshotgun andanalysis genome of S, Aparicio Chapman E,PutnamJM,Dehal J,Stupka Chia N, genomecoelacanth insights intotetrapod provides evolution. Amemiya Alföldi J,LeeAP,FanS, CT, J W, S, W,Altschul Gish localalignment E,Lipman Miller D (1990)Basic Myers tool. search References ournal Accepted MS,Jones AJ,Pratchett GP(2008) ole Article Journal EvolutionaryBiology of of Mol , Roberts M, Bouwmeester M,Bouwmeester J, Bowen, Roberts BW, DF, Lozano Coker , nkel CV,nkel HJ, D,Pirovano Jansen Butler W Scaffolding pre (2011) ecular 42, On Bioinformatic Resources. 292 Ecosystem Sites to Geography: FromEcoregions 1601 . Captive care and of breeding butterflyfishes.coral reef Genome Research Genome When Biol Mobile Mobile - Bioinformatics 1614 ogy biogeographical provinces collide: hybridization fishes ofreef - Taylor Westneat RJ, M, WilliamsTH, Toonen S, Berumen ML Fish , . 215, DNA and 403 , , Chapter 4, 6, Fish , , , - 11. 27 410 12, 23 Philippe H,MacCallum (2013) al I,et Diversity andfunctional importance ofcoral , , eries 578 . 1269 335 Genomics Proteomics Bioinformatics tt MS,Konow L(2010) N, vanHerwerden - - , 579 Fugu rubripes Fugu 349 - 9 Unit 410. 1276. , 286 . . Nature - 307 . Nature , et al. et P, etal.(2002)Whole - 513, . T Science aylor TH,aylor Bowen BW, . 2 (2014) Thegenomic 375 , nd 496 edition, New - , 381. , 297 Journal of Journal 311 Biology of of Biology - Cortés D Cortés , - 1301 316 The African fauna in the - assembled . . - 1310. York, York, - F, Choat genome ‐ feeding - This article isThis article by protected copyright. All rightsreserved. AcceptedButterflyfishes, CRC Press, Boca Raton,FL. pp.48 butterflyfishes. Hobbs J andFisheries Sciences Aquatic DC,HebertHardie Program Alignments. toAssemble Spliced Automated(2008) gene structureannotation using EVidenceModeler eukaryotic and the W, SL,Zhu Allen JE,Orvis J, Haas BJ,Salzberg M, White Pertea O,BuellCR,Wortman JR Frontiers in Physiology analysis estimates ofthegenome of sizes Guo LT,WangWu SL,W,QJ, ZhouXG,XieFlowcytometry YJ(2015) and Zhang K Coral Reefs butterflyfish (Chaetodontidae) feeding rates andcoral consumption onthe BarrierGreat Reef. Gregson MA, Pratchett MS, Berumen ML,Goodm climate driven coral mortality. GrahamNAJ (2007) Evol andgenetics inthe kinship sponge reef EC,SaenzGiles convergent evolution organs. ofelectric WellsAnand R, GB,Pinch M Article LL, Gallant JR, Traeger Biol damselfishes M:Froukh T,Kochzius boundaries Species and lineagesin evolutionary the green blue Bioinformatics Edgar RC,MyersEW PILER: andclassification identification of (2005) genomic repeats. blocksand visualize synteny eukaryoticchromosomes. along G, Drillon Molecular Ecology Resources Using abutterflyfish genome ageneral for as tool RAD JD,DiBattista Saenz RedSeain the M,Rocha LA,ToonenRJ,Westneat Berumen (2016 ML JD DiBattista Red Sea. ogy ution - 2008, PA, vanHerwerden MS,Allen L,Pratchett GR (2013)Hybridizationamong Journal ofBiogeographyJournal , Carbone A,Fischer G(2014).SynChro: afastand toolto easy reconstruct 5 , , 27, 2487 , Choat JH, Gaither MR, Lozano , ChoatJH,Gaither Hobbs JP, Chromis viridis . 72, , Journal Journal of Biogeography 21, - 583 Agudelo P, 451 - 2502 PD (2004) Genome i152 - - Ecological versatility andthe feeding declineofcoral fishes following 591 Agudelo P,Piatek M,Wang X,Aranda M,BerumenML(submitted) - - 69 Pratchett MS,Berumen In: B Kapoor ML, 457 , . - Volkening JD, MoffettGN, Phillips H,ChenPH,NovinaCD, Jr., 6 158. . , . 144. Ravasi and . et al.et Mar , 61 Chromis atripectoralis , , ine 43, (2014) Nonhumanforthe Genomic basis genetics. 1636 T, Hussey N,Berumen ML (2015) Biol - Stylissa carteri Stylissa 423 size evolution infishes. evolution size Science , - Bemisia tabaci Bemisia 43, 1646. ogy Genome Biology Genome - 439. 13 , , 153, - 30. 344, an BA (2008) BA (2008) an 119 1522 in the Red Sea. - Seq studies Seq studies inspecialized reef fish. - - b B and Q(Hemiptera: Aleyrodidae). (Pomacentridae). 127. Cortés DF,Cortés Myers R,Paulay G, ) - On theoriginof endemicspecies , 1525. 9, PloS one R7. Canadian Journal ofCanadian Journal Relationships between (eds.) , Ecol Exploring seascapeExploring 9, e92621. Biology ofBiology J ogy and ournal ofournal Fish - mer This article isThis article by protected copyright. All rightsreserved. Acceptedgeographical adaptations tonovel toxin exposures inbutterflyfish. RS,Hoh E,Rimoldi GK Gadepalli JM, Ostrander (2016)Biochemical mechanisms for Maldonado A, LavadoR, KnustonWatanabe S,SlatteryM,Ankisetty S,GoldstoneJV, K, improvedmemory Luo R, Liu cichlids. s analysis reveals Loh YHE, LS,Mims Katz TD, Kocher YiSV,Streelman MC, Zootaxa Chaetodon Littlewood, D.T.J transform. andaccurate Li H,DurbinR(2010)Fast long preparationand read tool for Nextera longmate libraries. pair RM,Leggett BJ, Clavijo BG) artisanal fisheries. In: Lawt applications. S,Church, GM(2014)Kosuri Article records new eight ofcoral reef fishesfrom Arabia. Kemp J(2000) Research metagenomicfor sequences augmented andclustering byclassification Glimmer DR,LiuAL,with SLKelley Geneprediction Pop M,Salzberg B,Delcher (2012) genome vertebratedraft genome andinsights into evolution. S,Nakatani M, Y,QuKasahara Naruse K,Sasaki W, AhsanB, classification. A, Mitchell Nuka G P,BinnsJones D,ChangHY,M,LiW,C, Fraser McWill McAnulla Acids Research of version Spalnthatincorporates extended additional species H, Iwata 498 genomezebrafish reference anditsrelationship to thehuman sequence genome. MD,TorrojaHowe CF,Torrance K,Clark J, - , 269 503 on RJ, Pratchett MS,on RJ,Pratchett JC Delbeek (2013) , . Genome Biol Gotoh O(2012)Benchmarking spliced alignment programs Spaln2, an including - 779, , 291 Bioinformatics 40, and Chaetodontidae the Perciformes)(Teleostei : morphology. withreferenceto B, Xie Y,Li Z,HuangW, YuanJ(2012) Nat . Bioinformatics 1 e9. , Zoogeography offishes coralofthe Zoogeography the reef northeastern Gulf ofAden,with - 20. 40, ure ignatures of differentiationamid genomic inLakeMalawi polymorphism ., - efficient short McDonald SM McDonald e161. et al. et Methods ogy Biology of Butterflyfishes Clissold L, MD,CaccamoClissold Clark (2013) M , , (2014) (2014) 5:InterProScan genome 9 26, , , R113 , 30, Large 11 589 , - read 236 499 , - . 595. Gil - scale scale - - 1240. l AC de novo 507 . de , Cribb TH Berthelot C,Muffato al M,et (2013) - novo Harvesting ofHarvesting butterflyfishes for and aquarium read alignment Burrows with assembler. (eds: Pratchett (eds: Fauna of Arabia DNA synthesis: technologiesDNA and synthesis: SOAPdenovo2: anempirically (2004) - scale proteinfunctionscale GigaScience Nature - Bioinformatics Molecular phylogenetics of specific features. specific features. JT (2008) MS, Berumen ML, et al. et PloS one , , NextClip: ananalysis iam J, H, Maslen 447 18, (2007) Themedaka . , Nucleic Acids 1, , 293 Comparative 714 18 - , Wheeler - 11, , btt702 321 . - 719. Nucleic Nature e015 . The Kapoor . 4208. , 496 , This article isThis article by protected copyright. All rightsreserved. ML ( AcceptedSaenz Prog Rotjan RD, mixedmodels. F,HuelsenbeckJP(2003).MrBayes 3: phylogenetic inferenceRonquist Bayesian under PLoS One lucius C, BirdThegenome NH,KoopBF (2014) andlinkagemap thenorthern pike( of Rondeau EB,MinkleyDR, Leong JS,Messmer KR, AM,Jantzen JR, vonSchalburg Lemon sea Rohling EJ,FJ, Jorissen Fenton M, B of structure andRed Seabutterflyfishes angelfishes. Rober incidence ofterritoriality CM,Roberts Ormond (1992) RFG largest fish,world’s the whale shark: Haase CP,WebbDove AD DH, (2015) IIIRA, R,AhmadRead TD,SJ,Alam Joseph Petit Weil R,Vuong MT,M,Bhimani phytoplankton of seasonal succession theRedSea. DE,PradhanY,Bre Raitsos Article genomes. AL,JonesPrice NC,PevznerPA(2005) Denovoidentification families ofrepeat inlarge Island,BarrierLizard northern Great Reef. MSPratchett (2005) in eukaryotic genomes. G,BradnamParra CEGMA: K, KorfI(2007) annotate toaccurately genes apipeline core 388 an across antagonistic temperature Ngugi DK,AntunesA,BruneStingl Ecol genetic the populationpredict structure ofacoral fish thereef RedSea. in Nanninga GB,Saenz Proc visual ofmultiple changes pigment in genes the complete tuna. genomebluefin ofPacific Nakamura K,SaitohOshim Y,Mori - - level lowstands 500,000 ofthepast years. 405 ogy eedings ress 2015 - conserved): synteny between revealed the salmonid groupand Neoteleostei. the sister ts CM, Shepherd ARD,Ormond RFG (1992) Agudelo P,DiBattista JD, Piatek M,Gaither M,HarrisonH, NanningaG,Berumen , . 23, Ser Bioinformatics , ) Seascapegenetics along environmental gradients Arabianin the Peninsula: 9, Lewis SM(2008) ies of the of the 591 e102089. Bioinformatics , - 367 602 Nat , - Dietary overlap among coral . Agudelo P,Manica A,Berumen ML(2014) 73 ional Bioinformatics - , 91 – 21, a review. a review. . win RJW, G,Hoteit I (2013) Stenchikov Acad i351 , Impact predators ofcoral on tropical reefs. 19 emy of , - – 1572 Butterflyfish social 358. salinity gradient in Redthe Sea. Environ ertrand P, JP( ertrand Gnassen G,Caulet Rhincodon typus , a K, Mekuchi M,Sugaya T,et (2013) a K,Mekuchi al 23, Draft sequencing andassembly ofthegenometh of U (2012) - 1574. Sci Mar 1061 ences mental Nature ine - 1067. - Biogeography of p USA feeding butterflyfishes (Chaetodontidae)at PloS one Biol Large J Biol , ournal ournal of - 394, , Smith 1828. behavior, special tothe with reference ogy 110 ogy of - scale variation inassemblage , 162 , , 148, 11061 8 , - Biogeogr Fishes e64909 165 373 Environmental gradients - PeerJ PeerJ PrePrints . 11066 Mol elagic bacterioplankton elagic - Remote sensing the 382 , 1998) . 34 ecular aphy Mar . , Mol . 79 Magnitudes of - ine , ecular 93 19, Ecol Evolutionary . Esox Ecol 239 ogy , e1036 JS, JS, ogy - 250 , 21 . . , e This article isThis article by protected copyright. All rightsreserved. Accepted BioinformaticsGenomics Proteomics WuYu J(2015) Xiao J,ZhangZ, J, Acids Research Dfam: ofrepetitive DNA onprofile adatabase based models. hidden Markov TJ,Wheeler Clements R,JonesTA,Jurka J,Smit J,EddySR,Hubley AF, FinnRD (2013) adaptation. ( carp Wang Y,ZhangNingZ,LiZhaoQ., Y,Lu Y, (Subgenusbutterflyfishes Bowen(2016 BW WaldropJE, E,HobbsJP,Randall Ecology boundaries and the in Lake relationships Victoria cichlidadaptive radiation. (2013) Wagner CE,Keller I,Wittwer S,Selz OM,MwaikoS,GreuterL,SivasundarA,SeehausenO Species Wabnitz C(2003) polyploids. Yvan dePeer (2004) prediction initio ofalternative transcripts. M,Keller O,GunduzI,HayesA,Stanke Waack ab Morgenstern B(2006)AUGUSTUS: S, Article gene findingineukaryotes. M,SteiStanke Funct HJ, Spaink HP,Jansen D Res Sea circulation: 2.Three Sof Oceanol Smeed DA (1997) DA (2003) M, Siddall Rohling EJ, Almogi Arabia CRC,Sheppard SheppardALS (1991) of frominsights ddRADsequencing anemonefishes. ianos SS(2003) ianos earch Ctenopharyngodon idellus ional , Genome (No. 17) 12, , ogica , 22, 108 Sea Ge Nature genetics Genome Biol 3 787 - 170 , nomics Acta nkampWaack R,S,Morgenstern B(2004) - , 3066 - level fluctuations during thelast cycle.glacial wide RAD sequence data provide resolutionofspecies data unprecedented wide RAD sequence 41, - Cambridge, UNEP/Earthprint 798. . ) From Ocean Aquarium:The inMarine to GlobalTrade Ornamental Seasonal variation strait in the oftheflowof BabalMandab. , Phylogeography, population structure and evolution of coral An OceanicGeneral Circulation (OGCM) Model of investigation Red the D70 20, . , Tetraodon genomeTetraodon confirms 13 773 - - ogy irks RP(2014) irks , dimensional circulation RedSea.in the 82. Corallochaetodon 144 , Nucleic Acids Res - , 47 781 5 - , 156 , - Labin A,Hemleben C,Meis 250 ) provides insights its evolution into and vegetarian 625 . . DiBattista DiBattista JD . - A brief review of software tools forpangenomics.A briefreviewof tools software 631. . Corals and coral andCorals coral communities of Arabia. Advances in genomicsAdvances in fish. ofbony Nucleic AcidsResearch ). earch J . ournal of , RochaLA,K et al. et Takifugu , 32, Molecular Ecology (2015) Thedraftgeno W309 Biogeogr AUGUSTUS: awebserver for chner D,Schmeltzer chner I, Smeed findings: most fish are most are findings: fish ancient Nature - osaki RK osaki W312 J , ournal ournal of aphy 34, , . 423 W435 , , , 43, 24, Berumen ML, , 853 Geophys Molecular 1116 - Brief 6241 me thegrass of 439. Nucleic - - Fauna of 858 feeding - ings - 1129. 6255. . ical in This article isThis article by protected copyright. All rightsreserved. Accepted developed the manuscriptapprove paper. and ofthefinal analyzed the data J.D.D. ofand andM.L.B.conceived designed the genome J.D.D., study. Article contributions Author Genome URL: browser PRJNA292048 (ingenome FASTAf sequence setsThe data supporting results the of thisarticle, including the Accessibility Data 168 intheamongbutterflyfish species RedSea. four ZA Zekeria ofthe diversity commoncarp, Xu P, Zhang X, Wang X, P, Zhang LiJ,Liu X, . , Dawit Y . . X.W. annotated and M.A.validated, , Ghebremedhin S http://caus.reefgenomics.org Cyprinus carpio ormat), are available in areavailablein the BioProject: ormat), NCBI repository, G, Kuang (2014) Y,etal , Naser M Naser . Nat Mar , Videler Videler JJ ure ine and ine , Genet and analyzed the genome. Genome andgenetic sequence Freshwater Freshwater Res (2 ics 002) Chaetodon austriacus , 46 , Resource partitioning 1212 P.S. - 1219 earch - S. , . and M.J.P. and M.J.P. , All authors All authors 53, 163 - respectively, are genome sequence a Table 1. The N50 or N90 N90 indexN90 indexN50 Sequences >= Mb 1 Sequences >= 10 Kb Sequences >= 1 Kb Sequences >= 500 bp Percentage Averagelength Maximum length Sequences generated Total assembly size Assemblystatistics Mate pair8 Mate pair5 Mate pair3 Shotgun sequences Library Sequencing This article isThis article by protected copyright. All rightsreserved. Accepted Article ) above) which 50 or 90% length ( Chaetodon austriacus a a - - - of non 11 Kb 8 Kbinserts 5 Kbinserts statistics assembled contig and scaffold of the final index indicate the - ATGC characters inserts . of the genome shortest genome 3.80 Gb 1.47 Gb 2.30 Gb 163 , Size andassembly statistics. sequencing Gb 663,525,437 bp 282,798 bp Raw Raw data 21,086 bp 3,026 bp 5,641 bp Contigs 118,995 16,071 66,176 84,217 0 % 0 0 Coverage 230.06 1.47 2.88 4.85 X X X X 0.70 Gb 1.50 Gb 2.70 Gb 151 Size Gb Processed data 712,378,618 bp 1,950,664 bp 170,231 bp 24,561 bp 51,004 bp Scaffolds 6.85 % 12,937 13,441 13,967 8,012 18 Coverage 212.66 0.95 1.95 3.40 X X X X This article isThis article by protected copyright. All rightsreserved. genus.specific t The toallthegenes. respect different inth colors numbers inner number are the outer of andthe genes M) Dicentrarchus Larimichthys Perciformes. depict colors and associated Theletters followingthe genera match from ofclosely knownsequences fish to within related species genera order the gene models andknownproteins. The majority vast of andagainst by nr grouped depic genusto Fig. Accepted Articlehe Circos plot,which Cynoglossus 1 Genomic of composition , C) , H) , N) Stegastes Neolamprologus Tetraodon provides information percentage ofgenes the regarding matching toa , D) , O) Oreochromis Chaetodon austriacus , I) Fundulus Poecilia t phylogenomic relationships , E) , P) , J) Notothe e ringcorrespond outer to same the colors Oryzias Takifugu numbers the are . C. Genes wereclassified bybest hits nia , Q) austriacus , F) , K) Danio Haplochromis Maylandia , and R) genes have theirbest ir between predicted the percentage with : A) , L) Others. , G) Chaetodon Pundamilia The , B) , in This article isThis article by protected copyright. All rightsreserved. Accepted Article phylogenetic from distance species ( ourfocal Perciformes thedashed line), aswell (below the in change proportions with increasin Notespecies. Fig. 2 Relative proportion ofthemosttransposable prevalent element the generally higherthe of amount speciesfrom DNAtransposons in order the Chaetodon austriacus ). s inselectedfish g