!"#$#%&'()'*+,'-.//+(0(1)*$2'3()"&45'*14'6(5.+)5' ! 7*8+('-9,'"#$%!&$'()*!+,#-)$.!.-+.+-/(+%0!$%1!2$.0(1!1#02-(./(+%' ! Taxon Total Validated Species Genome length Overlap Capsid type Capsid Species Species with (ln) proportion flexible? overlap (ln) DNA viruses Acanthamoeba-polyphaga-mimivirus 1 0 Adenoviridae 44 12 12 10.47 -3.71 icosahedral no Anellovirus 5 1 1 8.26 -1.78 icosahedral no Ascoviridae 3 0 Asfarviridae 1 0 Bacillus-phage-GIL-sixteen-c 1 1 1 9.61 -3.05 no description ? Bacillus-virus-one 1 0 Baculoviridae 43 1 1 11.78 -4.79 rod shaped yesa Bicaudaviridae 2 0 Circoviridae 16 3 3 7.65 -1.78 icosahedral no Clostridium-phage-phiC-two 1 0 Corticovirus 1 1 1 9.22 -4.76 icosahedral no Fuselloviridae 5 3 3 9.69 -3.22 lemon-shaped yesb Geminiviridae 199 82 80 8.23 -1.54 icosahedral no Geobacillus-phage-GBSVone 1 1 1 10.45 -4.69 no description ? Globuloviridae 2 0 Gryllus-bimaculatus-nudivirus 1 0 Heliothis-zea-virus-one 1 0 Herpesviridae 47 26 26 11.97 -4.44 icosahedral no His-one-virus 1 0 His-two-virus 1 0 Inoviridae 25 18 17 8.88 -4.64 filamentous yes Iridoviridae 8 1 1 11.54 -5.31 icosahedral no Lipothrixviridae 8 2 2 10.62 -4.34 rod shaped yes Microviridae 55 13 12 8.56 -2.23 icosahedral no Myoviridae 71 35 35 11.37 -4.89 icosahedral no Nanoviridae 6 1 0 Nimaviridae 1 0 Papillomaviridae 66 13 13 8.97 -3.11 icosahedral no Parvoviridae 44 8 6 8.56 -2.14 icosahedral no Phycodnaviridae 8 1 1 12.72 -5.95 icosahedral no Plasmaviridae 1 1 1 9.39 -8.00 quasi-spherical yes Podoviridae 62 32 32 10.59 -3.58 icosahedral no Polydnaviridae 5 0 Polyomaviridae 17 2 2 8.57 -2.67 icosahedral no Poxviridae 25 7 7 12.12 -4.22 ovoid or brick yes Rudiviridae 4 0 Salmonella-phage-ST-seventyfour-B 1 1 1 10.60 -5.02 no description ? Sinorhizobium-phage-PBCfive 1 0 Siphoviridae 180 106 106 10.77 -4.55 icosahedral no Streptococcus-phage-SMP 1 0 Tectiviridae 2 1 1 9.61 -3.60 icosahedral no Xanthomonas-phage-OP-two 1 1 1 10.75 -4.78 no description ? Xanthomonas-phage-Xp-fiveteen 1 0 RNA viruses Acyrthosiphon pisum virus 1 0 Arenaviridae 18 2 1 9.24 -4.25 spherical/pleomorphic yes Arteriviridae 4 3 3 9.59 -3.18 icosahedralc no Astroviridae 6 4 4 8.83 -6.37 isometric nod Barnaviridae 1 0 Beet western yellows ST9-associated virus 1 0 Benyvirus 2 0 Birnaviridae 7 3 3 8.69 -2.68 icosahedral no Bornaviridae 1 1 1 9.09 -3.39 spherical yes Botrytis virus F 1 0 Botrytis virus X 1 0 Bromoviridae 25 9 4 9.05 -4.00 varies between genera ? Bunyaviridae 24 4 0 Caliciviridae 16 9 9 8.96 -6.78 icosahedral no Caulimoviridae 28 7 7 8.96 -4.77 varies between genera ? Cheravirus 2 1 0 Chrysoviridae 3 0 Closteroviridae 21 7 7 9.73 -5.78 rod shaped yes Comoviridae 18 7 1 9.33 -6.93 icosahedral no Coronaviridae 17 9 9 10.29 -4.36 icosahedrale no Cystoviridae 4 3 3 9.54 -5.60 icosahedral no Cytoplasmic citrus leprosis virus 1 0 Diaporthe ambigua RNA virus 1 1 0 Dicistroviridae 14 0 Endornavirus 5 1 0 Filoviridae 4 0 Flaviviridae 42 26 1 9.30 -6.31 icosahedralf no Flexiviridae 70 22 22 8.93 -3.56 rod shaped yes Furovirus 5 3 0 Fusarium graminearum dsRNA mycovirus 1 1 0 Hepadnaviridae 10 1 1 8.08 -0.68 icosahedral no Hepeviridae 1 0 Heterocapsa circularisquama RNA virus 1 0 Hordeivirus 1 1 1 9.23 -3.83 rod shaped yes Hypoviridae 4 0 Idaeovirus 1 1 0 Iflavirus 8 0 Kelp fly virus 1 0 Leviviridae 9 6 3 8.24 -3.74 icosahedral no Luteoviridae 18 8 8 8.64 -1.46 icosahedral no Marnaviridae 1 0 Narnaviridae 8 0 Nodaviridae 9 2 2 8.40 -2.87 icosahedral no Nora virus 1 0 Ophiovirus 3 0 Orthomyxoviridae 5 1 1 9.52 -3.30 spherical/pleomorphic yes Oyster mushroom spherical virus 1 0 Paramyxoviridae 33 3 3 9.65 -3.25 spherical/pleomorphic yes Partitiviridae 19 2 0 Pecluvirus 2 1 1 9.25 -3.97 rod shaped yes Picobirnavirus 1 Picornaviridae 46 12 0 Pomovirus 4 0 Potyviridae 69 42g 60 9.17 -3.65 filamentous yes Reoviridae 33 8g 13 10.03 -4.57 icosahedral no Retroviridae 47 11 11 9.19 -3.47 spherical yes Rhabdoviridae 22 4 0 Roniviridae 1 0 Schizochytrium single-stranded RNA virus 1 1 0 Sclerophthora macrospora virus A 1 1 1 8.68 -1.91 no description ? Sequiviridae 6 4g 2 9.40 -3.81 isometric no Sobemovirus 9 9 9 8.34 -1.85 icosahredal no Solenopsis invicta virus 2 1 0 Tenuivirus 2 0 Tetraviridae 4 0 Thielaviopsis basicola dsRNA virus 1 1 0 Tobamovirus 21 6 5 8.76 -4.91 rod shaped yes Tobravirus 3 1 1 9.25 -7.87 rod shaped yes Togaviridae 15 9 1 9.34 -6.36 icosahedral no Tomato torrado virus 1 0 Tombusviridae 41 9 9 8.42 -2.55 icosahedral no Totiviridae 24 3 3 8.74 -4.40 icosahedralh no Tymoviridae 14 5 5 8.76 -1.16 icosahedralh no Umbravirus 4 2 2 8.32 -1.65 nonei yes White bream virus 1 0 ! $Virion is within an occlusion body that may be polyhedral. bLemon-shaped and flexible according to ICTV website. cCore shell probably icosahedral. dRound with polyhedral symmetry. They have a distincty star shape on the capsid. eCore shell possibly icosahedral. fCore is icosahedral. gNumber of validated species includes those in the Firth papers (see main text). hSome show some general isometric symmetry. iNo conventional virion.
'
' -.//+(0(1)*$2'3()"&45!
34#!)$-5#-!6$2/#-(+.4$5#0!/4$/!7#!80#!&+-!())80/-$/(+%!(%!9(58-#!:!04$-#!/4#!0+;2$))#1!<=>?!&+)1!@A$B#-!#/!$)C!DEEFG!H($%5!#/!$)C!DEEIG!
J(B+&&!#/!$)C!DEEEK!@L5(--#M$6$)$!#/!$)C!DEE?G!N81$!#/!$)C!DEEOG!P&&$%/(%!#/!$)C!DEEOG!H($%5!#/!$)C!DEEOG!Q$%1#-!#/!$)C!DEERG!"+-$(0!#/!$)C!
DEEFKC!S%!+8-!24+(2#!$'+%5!10NTL!6$2/#-(+.4$5#0!74(24!04$-#!/4#!<=!>?;&+)1!@U$0V#%0!DEERKW!7#!2+%0(1#-#1!+%)*!0.#2(#0!74+0#!2$.0(1!
(0!(0+'#/-(2!@0##!6#)+7K!$%1!4$,#!$!3!%8'6#-!74(24!(0!B%+7%!$%1!4$0!6##%!2+%&(-'#1!@N81$!DEERKC!9+-!#$24!3!%8'6#-W!74#-#!'+-#!
/4$%!+%#!#X$'.)#!7$0!$,$()$6)#W!+%)*!+%#!-#.-#0#%/$/(,#!+&!#$24!5#%80!4$0!6##%!24+0#%!@N81$!DEERG!van Regenmortel!#/!$)C!DEEEKC!34#!
0#)#2/#1!00NTL!,(-80#0!$-#!+%#0!7(/4!0('()$-!2$.0(1!'+)#28)$-!7#(54/!/4$/!-#.-#0#%/!1(&-#%/!5#%#-$!$%1!$!7(1#!-$5#!+&!5#%+'#!)#%5/4C!
L'+%5!10NTL!,(-80#0!04$-(%5!/4#!<=!>?;&+)1W!$))!/4#!/4#+-#/(2$))*!.+00(6)#!3!%8'6#-0!&-+'!I!/+!:O!4$,#!6##%!+60#-,#1!#X2#./!&+-!>!$%1!
:DC!Y%#!#X.)$%$/(+%!.8/!&+-7$-1!/+!#X.)$(%!/4#0#!'(00(%5!3!%8'6#-0!(0!/4$/!/4#!2+$/!.-+/#(%0!+&!/4#!10NTL!,(-80#0!04$-(%5!<=>?!4$,#!$!
/#%1#%2*!/+!&+-'!0B#7#1!4#X+%0W!74(24!$-#!4#X$'#-(2!2)80/#-0!+&!/4#!2+$/!.-+/#(%!(/0#)&!@348'$%;U+''(B#!#/!$)C!:>>>KC!Z824!0B#7#1!
4#X+%0!$-#!.-#0#%/!(%!/4#!.-+2$.0(10!$0!04+7%!6*!IN!0/-82/8-#!-#2+%0/-82/(+%0!6+/4!$/!'#1(8'!$%1!4(54!-#0+)8/(+%!@U+%7$*!#/!$)C!
:>>FG!N+B)$%1![!"8-($)1+!:>>IG!\#-/0'$%!#/!$)C!DEE>G!H($%5!#/!$)C!DEEIG!Q$%1#-!#/!$)C!DEERG!]-$0$1!#/!$)C!:>>IG!3-80!#/!$)C!:>>OK!$%1!
04+7!$!/7+&+)1!0*''#/-*!(%0/#$1!+&!/4#!0(X&+)1!0*''#/-*!+60#-,#1!(%!'$/8-#!,(-80#0C!A#2$80#!+&!0*''#/-*!.-(%2(.)#0W!$!4#X+%!7(/4!
+%)*!$!/7+&+)1!0*''#/-*!'80/!2+%/$(%!#(/4#-!/4-##!+-!0(X!1(&-#%/!2+%&+-'$/(+%0!$%1!/4(0!7+8)1!)#$1!/+!$!-#0/-(2/(+%!+&!.+00(6)#!3!
%8'6#-0W!$22+-1(%5!/+!/4#!^8$0(;#^8(,$)#%2#!-8)#0!@U$0.$-![!=)85!:>ODK!/+!/4#!0#-(#0!3!_!I%!`!:W!74#-#!%!(0!$%*!(%/#5#-!@348'$%;
U+''(B#!#/!$)C!:>>>KC!34#!+%)*!B%+7%!$..$-#%/!#X2#./(+%!/+!/4(0!5#%#-$)(M$/(+%!(0!$!3_I!2$.0(1!0/-82/8-#!6#)+%5(%5!/+!.4$5# !D>! @"+-$(0!#/!$)C!DEEFKW!(%!74(24!$!/#%1#%2*!+&!/4#!2+$/!.-+/#(%!/+!&+-'!0B#7#1!4#X+%0!4$0!6##%!04+7%!@U4+(!#/!$)C!DEEOKC!<+7#,#-W!!D>!
@$%1!/4#-#&+-#!3!_!IK!4$0!%+/!6##%!(%2)81#1!(%!+8-!$%$)*0(0!&+-!9(58-#!:!0(%2#!(/!4$0!$!.-+)$/#!@$%(0+'#/-(2K!4#$1!$%1!2$%!6#!#X.#2/#1!/+!
4$,#!'+-#!-#)$X#1!2+%0/-$(%/0!7(/4!-#0.#2/!/+!5#%+'#!.$2B$5(%5W!$0!(/!4$0!6##%!04+7%!&+-!$%+/4#-!0('()$-!.4$5#W!3a!@P$-%04$7!#/!$)C!
:>?RKC!L!-#)$/#1%#00!6#/7##%!2$.0(10!7(/4!3!%8'6#-0!+&!aW!?!$%1!:I!7$0!$)0+!0855#0/#1!+%!/4#!6$0(0!+&!/4#!0+;2$))#1!b)+2$)!-8)#c!
07(/24(%5!'#24$%(0'!&+-!(2+0$4#1-$)!04#))!5#+'#/-*!@A#-5#-!#/!$)C!DEEEKC!34#0#!)+2$)!-8)#0!0.#2(&*!)+2$)!(%/#-$2/(+%!.$//#-%0!/4$/!2$%!
1(-#2/!/4#!0#)&;$00#'6)*!+&!,(-$)!2+$/!$%1!02$&&+)1(%5!0868%(/0!(%/+!1(&-#%/!2$.0(1!5#+'#/-(#0!@A#-5#-!#/!$)C!:>>aG!N+B)$%1!DEEEKC!
! Table S2. d#)$/(,#!(%2-#$0#!(%!2$.0(1!,+)8'#!2+'.$-#1!/+!5#%+'#!)#%5/4! ! ! ! !
Family Genus Species Abbrev. T- Next T- "(Tf)/"(Ti) Relative ln (Relative Genome ln Major Source Name number number = Rf/Ri (R = increase in increase in length (Genome capsid (Ti) (Tf) Radius) volume = volume) (bases) length) protein (Rf/Ri)3 mol. wht Myoviridae T4-like Synechococc Syn9 16 19 1.09 1.29 0.26 177300.00 12.09 48.7 a us phage Syn9 Myoviridae SPO1-like Bacillus SPO1 16 19 1.09 1.29 0.26 132500.00 11.79 45 b phage SPO1 Siphoviridae T5-like Enterobacter T5 13 16 1.11 1.37 0.31 121750.00 11.71 44 c ia phage T5 Myoviridae P2-like Enterobacter P2 7 13 1.36 2.53 0.93 33593.00 10.42 36.5 d ia phage P2 Siphoviridae lambda- Enterobacter # 7 13 1.36 2.53 0.93 48500.00 10.79 38 e like ia phage ! Podoviridae T7-like Enterobacter T7 7 13 1.36 2.53 0.93 39937.00 10.60 44 f ia phage T7 Podoviridae P22-like Enterobacter P22 7 13 1.36 2.53 0.93 41724.00 10.64 38 g ia phage P22 Myoviridae P2-like Enterobacter P4 4 7 1.32 2.32 0.84 11624.00 9.36 36.5 h ia phage P4 Circoviridae Circovirus Porcine PCV 1 4 2.00 8.00 2.08 1763.00 7.47 27.4 i circovirus-1 and -2 (average) Microviridae Microvirus ϕ"174 !$174 1 4 2.00 8.00 2.08 5400.00 8.59 48.4 j Parvoviridae Brevidenso Anopheles AgDNV 1 4 2.00 8.00 2.08 4139.00 8.33 40.5 k virus gambia densonucleo sis virus Circoviridae Gyrovirus Chicken CAV 1 4 2.00 8.00 2.08 2309.00 7.74 51.6 l Anemia Virus ! ! a(Weigele et al. 2007) b(Duda et al. 2006; Parker et al. 1983) c(Effantin et al. 2006) d(Dokland et al. 1992) e(Lander et al. 2008; Williams & Richards 1974) f(Agirrezabala et al. 2007; Steven et al. 1983) g(Jiang et al. 2003; Prasad et al. 1993) h(Dokland et al. 1992) i(Bennett et al. 2008; Crowther et al. 2003; van Regenmortel et al. 2000) j(McKenna et al. 1992) k(Bennett et al. 2008; Cheng et al. 2004; van Regenmortel et al. 2000) l(Crowther et al. 2003; van Regenmortel et al. 2000) !
Table S3. Virus mutation rates
"#$%! &'()*#! +,%-*'$!.-/$/-0)/1! 2$%3)%4'! 560'0)/1!-'0%7! 8%9%-%13%!64%:!
8SW! ;#40/,)-):'%! <=<<>?! !@! A=?BC &*',)-)-):'%! <=< I%$':1',)-):'%! <=M P%,),)-):'%! <=H! Q"! C=MKBC +-0F/(#B/,)-):'%! <=<>@H! &PVWJ! ?=A@BC ! ABC C=LBC &PVOJ! @BC .)3/-1',)-):'%! <=<<< ./*)/! C=?MBC A=KABC C= C=CHBC .'-'(#B/,)-):'%! <=<>HL! 5%J! @=ALBC 8%0-/,)-):'%! <=<>CC! OPJ! >=>BC ! IYJDC! C=HBC A=>HBC 5PJ! >=??BC Z@=@ABC >=C>BC 82J! K=@BC 2SJ! K=?CBC ! >=@CBC C=ABC 8F'7:/,)-):'%! <=<< C=< M=M@BC @=CHBC "/7'(/,)-64! <=< 44NSW! Y1/,)-):'%! <=< 5)3-/,)-):'%! <=CMCC! !#C?K! C=LBC C= :4NSW! I%-$%4,)-):'%! <= 5#/,)-):'%! <=< "K! A= 2)$F/,)-):'%! <= ! ']'3F!%10-#!9/-!'!4$%3)%4!)4!9-/(!'!4%$'-'0%!406:#!/-!(%0F/:=! 7560'0)/1!-'0%4!)1!0F%4%!$67*)3'0)/14!'-%!0#$)3'**#!%B$-%44%:!'4!0F%!16(7%-!/9!46740)060)/14!$%-! 163*%/0):%!4)0%!$%-!-/61:!/9!X%1/(%!-%$*)3'0)/1=!I/T%,%-[!0F%!,'*6%4!9/-!I;J[!!#C?K!'1:!0F%!0T/! 2'1\6^1!J2J!%40)('0%4!'-%!%B$-%44%:!'4!$%-!3%**!)19%30)/1!3#3*%!E0F%-%!('#!7%!(6*0)$*%!-/61:4!/9! X%1/(%!-%$*)3'0)/1!T)0F)1!'!4)1X*%!)19%30)/1!3#3*%G=!! 35)::*%!/9!0F%!-'1X%!/9!,'*6%4!X),%1!E<=HBC :_%!F',%!46((%:!0F%!:'0'!9-/(!ICSC!'1:!I>SA!40-')14=!! %5%'14!/9!,'*6%4!9/-!40-')14!IASA!'1:!ICSC=!! 9 J'*6%4!)1!"'7*%!C!/9!-%9%-%13%!E:),):%:!7#!X%1/(%!*%1X0F`!TF%-%!(/-%!0F'1!/1%!aX!,'*6%!)4!X),%1!0F%! (%'1!T'4!64%:G=!! XJ'*6%4!)1!"'7*%!>!/9!-%9%-%13%=! F"F)4!,'*6%[!)1!"'7*%!>[!)4!9-/(!'!:)99%-%10!406:#!0/!0F'0!X),%1!)1!EN-'R%!CLL>G=!! )"F%!/0F%-!,'*6%[!>=H>BC 0F'0!T%-%!46((%:!)1!0F%!A=K?BC \;'*36*'0%:!9-/(!0F%!$F%1/0#$)3!(60'10!9-%b6%13#!X),%1!)1!0F%!/-)X)1'*!-%9%-%13%!E;6%,'4!%0!'*=!A< ! ! &)X6-%!2C=!8%*'0)/14F)$!7%0T%%1!/,%-*'$!$-/$/-0)/1!'1:!X%1/(%!*%1X0F!)1!NSW! ,)-64%4!T)0F!'**!8%92%b!%10-)%4!)13*6:%:!E)=%=!1/0!/1*#!,'*):'0%:!/1%4!'4!)1!&)X6-%4!A!'1:! >!/9!0F%!(')1!0%B0G=! ! ! ! 0 7 8 9 10 11 12 13 14 -1 y = -1.4547x + 9.7741 R! = 0.40047 -2 -3 -4 dsDNA ssDNA Linear(dsDNA) -5 y = -0.0008x - 4.25 Linear(ssDNA) Ln(Overlap proportion) R! = 4.7E-07 -6 -7 -8 -9 Genome length (Ln(bases)) ! ! dsDNA Significance P = 0.996834 ssDNA Significance P = 0.127169 ! &)X6-%!2A=!8%*'0)/14F)$!7%0T%%1!/,%-*'$!$-/$/-0)/1!'1:!X%1/(%!*%1X0F!)1!0F%!9/6-! 0#$%4!/9!8SW!,)-64c!4)1X*%!40-'1:%:!$/4)0),%!4%14%[!4)1X*%!40-'1:%:!1%X'0),%!4%14%[! :/67*%!40-'1:%:!'1:!-%0-/0-'143-)7)1X=! All viruses families by groups - All annotation files ! 0 7.5 8 8.5 9 9.5 10 10.5 y = -2.7069x + 20.945 -2 R2 = 0.9183 y = -2.3119x + 16.628 R2 = 0.3992 dsRNA -4 y = -0.2862x - 2.2336 Retro R2 = 0.0033 ssRNAneg y = -0.8976x + 2.2842 ssRNApos -6 R2 = 0.0611 Linear (Retro) Linear (ssRNApos) Linear (dsRNA) -8 Ln(Overlap proportion) Linear (ssRNAneg) -10 -12 Genome length (Ln(bases)) ! ! dsRNA Significance P = 0.555201 Retroviridae Significance P = 0.184485 ssRNAneg Significance P = 0.893153 ssRNApos Significance P = 2.74E-05 ! &)X6-%!2>=!8%*'0)/14F)$!7%0T%%1!/,%-*'$!$-/$/-0)/1!'1:!X%1/(%!*%1X0F!)1!0F%!0T/! 0#$%4!/9!NSW!,)-64c!4)1X*%!40-'1:%:!'1:!:/67*%!40-'1:%:=!! ! ! ! All DNA viruses families - best annotation files 0 7 8 9 10 11 12 13 14 -1 y = -1.9208x + 13.701 R2 = 0.4919 -2 -3 dsDNA -4 ssDNA proportion) Linear (dsDNA) -5 Linear (ssDNA) Ln(Overlap -6 y = -0.3635x - 0.6034 R2 = 0.1197 -7 -8 -9 Genome length (Ln(bases)) ! ! :4NSW! Significance P = 0.135061 ssDNA Significance P = 0.12046! &)X6-%!2K=!8%*'0)/14F)$!7%0T%%1!/,%-*'$!$-/$/-0)/1!'1:!X%1/(%!*%1X0F!)1!8SW! ,)-64%4=!+,%-*'$4!3'0%X/-)4%:!'4!!"#$%"&'!EY10G[!TF%-%!/1%!X%1%!)4!3/($*%0%*#!T)0F)1! '1/0F%-[!'1:!()#$%"&'!E]B0G[!TF%-%!0T/!X%1%4!/,%-*'$!9/-!/1*#!$'-0!/9!0F%)-!*%1X0F4=! ! ! Internal-External overlap proportion: all RNA viruses families 0 7 7.5 8 8.5 9 9.5 10 10.5 11 -1 -2 y = -1.3993x + 8.9093 R 2 = 0.4147 -3 Ext -4 Int Linear (Int) -5 Linear (Ext) -6 Ln(Overlap proportion) y = -1.6323x + 10.429 -7 R2 = 0.2388 -8 -9 Genome length (Ln(bases)) ! ! Ext Significance P = 0.008321 Int Significance P = 0.007094! ! ! ! &)X6-%!2M=!8%*'0)/14F)$!7%0T%%1!/,%-*'$!$-/$/-0)/1!'1:!X%1/(%!*%1X0F!)1!NSW! ,)-64%4=!+,%-*'$4!3'0%X/-)4%:!'4!!"#$%"&'!EY10G[!TF%-%!/1%!X%1%!)4!3/($*%0%*#!T)0F)1! '1/0F%-[!'1:!()#$%"&'!E]B0G[!TF%-%!0T/!X%1%4!/,%-*'$!9/-!/1*#!$'-0!/9!0F%)-!*%1X0F4=! ! Internal-External overlap proportion: all DNA viruses families ! 0 7 8 9 10 11 12 13 14 -1 -2 -3 y = -0.4899x + 0.5482 R 2 = 0.2769 Ext -4 Int Linear (Int) -5 Linear (Ext) -6 Ln(Overlap proportion) y = -0.9259x + 4.7089 R 2 = 0.6299 -7 -8 -9 Genome length (Ln(bases)) ! ! Ext Significance P = 0.00576 Int Significance P = 0.000412 ! &)X6-%!2@=!8%*'0)/14F)$!7%0T%%1!(%'1!16(7%-!/9!/,%-*'$4!3/($'-%:!0/!X%1/(%! *%1X0F=!+$%1!3)-3*%4!'-%!9'()*#!(%'14!9/-!NSW!,)-64%4[!3*/4%:!3)-3*%4!'-%!9'()*#!(%'14! 9/-!8SW!,)-64%4! ! ! ! ! ! ! ! &)X6-%!2?=!8%*'0)/14F)$!7%0T%%1!(%'1!/,%-*'$!*%1X0F!'1:!X%1/(%!*%1X0F=!;)-3*%4c! (%'1!,'*6%4!9/-!9'()*)%4!/9!8SW!,)-64%4`!4b6'-%4c!(%'1!,'*6%4!9/-!9'()*)%4!/9!NSW! ,)-64%4=! ! ! ! ! ! &)X6-%!2H=!8%*'0)/14F)$!7%0T%%1!0F%!-'0)/!/9!16(7%-!/9!/,%-*'$4!3/($'-%:!0/!16(7%-! /9!X%1%4!E#!'B)4G[!'1:!X%1/(%!*%1X0F!EB!'B)4G=!&'()*#!(%'14!/9!/,%-*'$!'1:!X%1%! 16(7%-!'-%!64%:=!+$%1!3)-3*%4!'-%!NSW!,)-64%4[!3*/4%:!3)-3*%4!'-%!8SW!,)-64%4=! ! ! ! &)X6-%!2L=!8%*'0)/14F)$!7%0T%%1!/,%-*'$!$-/$/-0)/1!'1:!X%1/(%!*%1X0F!%B3*6:)1X!,%-#! 4F/-0!0%-()1'*!/,%-*'$4!E*%44!0F'1!@<7$G=! ! ! ! ! 0 7 8 9 10 11 12 13 -1 y = -0.8523x + 4.5808 R2 = 0.7121 -2 p < 0.001 -3 RNA DNA -4 Linear (RNA) Linear (DNA) Ln(Overlap proportion) -5 y = -1.7438x + 11.978 -6 R2 = 0.3271 p = 0.001 -7 -8 Genome length (Ln(bases)) ! ! ! &)X6-%!2C<=!8SW!,)-64!T)0F)1D9'()*#!-%*'0)/14F)$!7%0T%%1!/,%-*'$!$-/$/-0)/1!'1:! X%1/(%!*%1X0F=!+1*#!9'()*)%4!T)0F!'0!*%'40!%)XF0!X%1/(%4!'-%!)13*6:%:=! ! ! ! ! &)X6-%!2CC=!NSW!,)-64!T)0F)1D9'()*#!-%*'0)/14F)$!7%0T%%1!/,%-*'$!$-/$/-0)/1!'1:! X%1/(%!*%1X0F=!+1*#!9'()*)%4!T)0F!'0!*%'40!%)XF0!X%1/(%4!'-%!)13*6:%:=! ! ! ! DNA viruses species splitted by families (min 8 genomes) 0 7 8 9 10 11 12 13 Myoviridae -1 Podoviridae -2 Siphoviridae Adenoviridae -3 Herpesviridae Papillomaviridae -4 Geminiviridae -5 Microviridae Inoviridae -6 Linear (Geminiviridae) -7 Linear (Microviridae) Ln(Overlap proportion) Linear (Inoviridae) -8 Linear (Papillomaviridae) -9 Linear (Podoviridae) Linear (Siphoviridae) -10 Linear (Myoviridae) Genome length (Ln(bases)) Linear (Adenoviridae) ! ! Linear (Herpesviridae) ! ! "#$#%#&'#(! WX)--%d'7'*'[!e=[!J%*'db6%dD56-)%*[!f=!W=[!g/(%dD.6%-0'4[!.=[!23F%-%4[!2=!I=!_=[!;'-'d/[!f=! 5=!U!;'--'43/4'[!f=!P=!A