Title: Molecular dating of the emergence of anaerobic fungi and the impact of laterally acquired

Short title: Molecular dating and HGT of anaerobic gut fungi

Authors: Yan Wang*,†, Noha Youssef‡, M.B. Couger§, Radwa Hanafy‡, Mostafa Elshahed‡, Jason E. Stajich*,†

Affiliations:

* Department of Microbiology and Plant Pathology, University of California, Riverside, Riverside, California, 92521 USA.

† Institute for Integrative Genome Biology, University of California, Riverside, Riverside, California, 92521 USA.

‡ Department of Microbiology and Molecular Genetics, Oklahoma State University, Stillwater, Oklahoma, 74074 USA.

§ High Performance Computing Center, Oklahoma State University, Stillwater, Oklahoma, 74074 USA.

To whom correspondence may be addressed: Yan Wang, +1 951.386.5197, [email protected]; Jason E. Stajich, +1 951.827.2363, [email protected]

This Supplementary Materials PDF file includes: Figures S1-S6; Tables S1-S2

Supplementary figure legends:

Fig. S1. Maximum likelihood phylogenetic tree of Neocallimastigomycota using Chytridiomycota as the outgroup. All bootstrap values (out of 100) are labeled on the branches.

Fig. S2. Presence (dark gray) and absence (light gray) of the homologous gene families across the genomes (and transcriptomes) of Neocallimastigomycota and Chytridiomycota. The 4,824 gene families comprise universal homologous genes that are present in at least 21 of the 26 Neocallimastigomycota genomes (and transcriptomes) and missing from no more than 1 of the 5 included Chytridiomycota genomes. They also include the unique gene families that are strictly absent from all Chytridiomycota but encoded by the Neocallimastigomycota (missing from no more than 5 of the 26 taxa).
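The selection thresholds stated in the Fig. S2 legend can be sketched as a simple filter over a presence/absence table. This is a minimal illustration, not the authors' pipeline: the function name `select_families`, the taxon identifiers, and the toy gene families are all hypothetical, and only the numeric thresholds (21 of 26, 1 of 5, 5 of 26) come from the legend.

```python
from typing import Dict, List, Set, Tuple


def select_families(
    presence: Dict[str, Set[str]],
    neo_taxa: Set[str],
    chy_taxa: Set[str],
    min_neo_shared: int = 21,    # "at least 21 of the 26" (from the legend)
    max_chy_missing: int = 1,    # "no more than 1 of the 5" (from the legend)
    max_neo_missing_unique: int = 5,  # "no more than 5 of the 26" (from the legend)
) -> Tuple[List[str], List[str]]:
    """Split gene families into (shared, unique) lists.

    `presence` maps a family ID to the set of taxa that encode it.
    Families meeting neither rule are dropped.
    """
    shared: List[str] = []
    unique: List[str] = []
    for family, taxa in presence.items():
        neo_hits = len(taxa & neo_taxa)
        chy_hits = len(taxa & chy_taxa)
        if chy_hits == 0:
            # Candidate Neocallimastigomycota-unique family.
            if len(neo_taxa) - neo_hits <= max_neo_missing_unique:
                unique.append(family)
        elif (neo_hits >= min_neo_shared
              and len(chy_taxa) - chy_hits <= max_chy_missing):
            # Universal family shared with the Chytridiomycota.
            shared.append(family)
    return shared, unique


# Hypothetical taxon IDs standing in for the 26 + 5 genomes/transcriptomes.
neo = {f"neo{i}" for i in range(26)}
chy = {f"chy{i}" for i in range(5)}

# Toy presence table (invented for illustration only).
toy = {
    "famA": neo | {f"chy{i}" for i in range(4)},                 # 26 neo, 4 chy
    "famB": {f"neo{i}" for i in range(22)},                      # 22 neo, 0 chy
    "famC": {f"neo{i}" for i in range(20)} | chy,                # 20 neo, 5 chy
    "famD": neo | {f"chy{i}" for i in range(3)},                 # 26 neo, 3 chy
}

shared, unique = select_families(toy, neo, chy)
print(shared, unique)
```

In this toy table, famA passes the shared rule, famB passes the unique rule, and famC and famD are discarded (too few Neocallimastigomycota hits and too many missing Chytridiomycota, respectively). Because a family absent from every Chytridiomycota genome can never satisfy the "missing from no more than 1 of 5" condition, the two categories do not overlap.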

Fig. S3. Mid-point rooted phylogenetic tree of the “Cthe_2159” domain encoded by the Neocallimastigomycota (red). All 126 Neocallimastigomycota (AGF) copies form a single clade (red), indicating Clostridiales bacterium C5EMF8 (an obligate rumen bacterium) as the HGT donor, with strong maximum likelihood bootstrap support (98/100).

Fig. S4. Phylogenetic tree of the -like “Gal-Lectin” domain identified in Neocallimastigomycota. Clades are colored consistently with Figure 5 (Neocallimastigomycota in red, in blue, in green, and in brown).

Fig. S5. Phylogenetic tree of the AGF “Gal-Lectin” flanking domain “Glyco_transf_34” (rooted with bacterium, in purple). AGF homologs (red) cluster with other fungal taxa (black). Plant homologs are in green and in brown.

Fig. S6. Mid-point rooted phylogenetic tree of the “Rhamnogal_lyase” domain encoded by the Neocallimastigomycota (red). Labels are consistent with Figure 6.

[Figure S1 image: phylogenetic tree with tip labels and bootstrap values. Scale bar: 0.3.]

[Figure S2 image: presence/absence heat map. Group labels: Neocallimastigomycota, Chytridiomycota. Genus labels: Anaeromyces, Piromyces, Pecoramyces, Neocallimastix, Orpinomyces, Caecomyces, Feramyces. Key: Presence, Absence.]

[Figure S3 image: phylogenetic tree with tip labels and bootstrap values. Scale bar: 0.3.]

[Figure S4 image: phylogenetic tree with tip labels and bootstrap values. Scale bar: 0.5.]

[Figure S5 image: phylogenetic tree with tip labels and bootstrap values. Scale bar: 0.4.]

[Figure S6 image: phylogenetic tree with tip labels and bootstrap values. Annotation label: “Rhamnogal_lyase” domain from Plant.]
92 9466 Oryza_barthii_306 Oryza_sativa_subsp_indica_297 100Oryza_glumipatula_196 Oryza_brachyantha_252 Leersia_perrieri_270 Leersia_perrieri_271100 Musa_acuminata_subsp_malaccensis_76 95Musa_acuminata_subsp_malaccensis_1 96 95 100Musa_acuminata_subsp_malaccensis_2 Musa_acuminata_subsp_malaccensis_3 91 100Musa_acuminata_subsp_malaccensis_30 Musa_acuminata_subsp_malaccensis_280 100Musa_acuminata_subsp_malaccensis_111 Amborella_trichopoda_316 100Amborella_trichopoda_229 97 Selaginella_moellendorffii_170 100Selaginella_moellendorffii_318 Selaginella_moellendorffii_91 100Selaginella_moellendorffii_104 100 Selaginella_moellendorffii_155 100 100Selaginella_moellendorffii_167 Physcomitrella_patens_79 99 Physcomitrella_patens_137 100Vitis_vinifera_108 100Prunus_persica_151 Populus_trichocarpa_53 FS3C_FS3C_21844 S4B_S4B_28233 S4B_S4B_28230 S4B_S4B_28229 S4B_S4B_2823493 95FX4B_FX4B_33758 Orpsp1_1_Orpsp1_1_1190025100 100D3B_D3B_5094 93D4C_D4C_26786 Hef5_Hef5_1320873 100Hef5_Hef5_23402 93Hef5_Hef5_23400 60WSF2_WSF2_659 42100WSF3_WSF3_22420 WSF2_WSF2_663 Pirfi3_Pirfi3_316903 55Brit4_Brit4_25267 Iso3_Iso3_630892 100Iso3_Iso3_34868100 Brit4_Brit4_25271 99 A1_A1_12102 PirE2_1_PirE2_1_1707585 Anasp1_Anasp1_294545 80C3J_C3J_33041 O2_O2_35864 7058O2_O2_35857 C3J_C3J_33047100 C3G_C3G_17753 98 O2_O2_35856 98 C3G_C3G_17751 Hef5_Hef5_13209 100Hef5_Hef5_13210 Neosp1_Neosp1_665236 77Neosp1_Neosp1_665240 38Neosp1_Neosp1_665237 Neosp1_Neosp1_5010863734 59Neosp1_Neosp1_501077 98Neosp1_Neosp1_631879 97 57Neosp1_Neosp1_457438 99Neosp1_Neosp1_665249 95 Neosp1_Neosp1_669383 47G3_G3_4102 100 100 PirE2_1_PirE2_1_1846 PirE2_1_PirE2_1_17387 100PirE2_1_PirE2_1_17412 88D3A_D3A_29411 99 D4C_D4C_2762680 97 D3B_D3B_5534100 93 D3B_D3B_5535 FS3C_FS3C_43727 86S4B_S4B_37077 57 WSF2_WSF2_660 WSF3_WSF3_22418 94 “Rhamnogal_lyase” domain 65WSF3_WSF3_22405 100WSF3_WSF3_22400 WSF3_WSF3_22403 70C3J_C3J_7445 99Anasp1_Anasp1_290139 70 C3J_C3J_33043 C3G_C3G_17754 70C3J_C3J_33044 O2_O2_35865 D3B_D3B_19756 D3B_D3B_19760 
from Neocallimastigomycota D3B_D3B_19761 D3B_D3B_19765 D3B_D3B_19758 D3B_D3B_19766 D3B_D3B_19759 48D3B_D3B_19763 91D3B_D3B_19764 D3B_D3B_19757 95D3A_D3A_22374 D3B_D3B_22039 (Novel) 99D4C_D4C_11860 D3B_D3B_19762 D4C_D4C_17094 9896D4C_D4C_11858 D3A_D3A_22376 98Orpsp1_1_Orpsp1_1_1189773 92S4B_S4B_30369 FX4B_FX4B_8371100 99 YC3_YC3_22068 O2_O2_3585869 99O2_O2_35860 67 O2_O2_35861100 99O2_O2_35862 C3J_C3J_33042 99O2_O2_3586675 C3J_C3J_33049 87 100 C3J_C3J_33048 Anasp1_Anasp1_265102 100Neosp1_Neosp1_702533 34100G3_G3_12541 94Neosp1_Neosp1_678787 99WSF3_WSF3_16171 Pirfi3_Pirfi3_359929 82 Brit4_Brit4_2751279 100Iso3_Iso3_40960 Iso3_Iso3_24258 Hef5_Hef5_5448567 Hef5_Hef5_54486100 100Hef5_Hef5_54487 Neosp1_Neosp1_705818 100G3_G3_12543 100Orpsp1_1_Orpsp1_1_1187117 99 YC3_YC3_13304100 10099 97WSF2_WSF2_662 100WSF3_WSF3_22419 PirE2_1_PirE2_1_19213 PirE2_1_PirE2_1_9593 100Flavobacterium_daejeonense_21 100 Flavobacterium_glycines_22 100 Flavobacterium_sp_23 100 Pedobacter_glucosidilyticus_98 Leeuwenhoekiella_blandensis_MED217_32 100 100 Verrucomicrobia_bacterium_IMCC26134_100 60 Opitutaceae_bacterium_TSB47_46 99 Mucilaginibacter_paludis_DSM_18603_38 99 100 Chthoniobacter_flavus_Ellin428_5 100Opitutaceae_bacterium_TSB47_41 100Opitutaceae_bacterium_TSB47_42 90 Asticcacaulis_sp_YBE204_2 “Rhamnogal_lyase” domain Chthonomonas_calidirosea_6 Chthonomonas_calidirosea_767 100Chthonomonas_calidirosea_8 100 100Chthonomonas_calidirosea_T49_9 Mucilaginibacter_sp_PPCGB_2223_39 99 100 Granulicella_mallensis_MP5ACTX8_24 Opitutaceae_bacterium_TSB47_43 100 Opitutaceae_bacterium_TAV5_40 Asticcacaulis_sp_YBE204_3 100 from Bacterial group I 100 Asticcacaulis_excentricus_CB_48_1 100 Paraburkholderia_tropica_52 100Paenibacillus_mucilaginosus_3016_48 100Paenibacillus_mucilaginosus_K02_49 99 Paenibacillus_mucilaginosus_47 Paenibacillus_sp_Soil766_51 Verrucomicrobia_bacterium_IMCC26134_101 Pectobacterium_wasabiae_CFBP_3304_96 Unspecified domain from Bacteria Pectobacterium_wasabiae_95100 
97Pectobacterium_wasabiae_CFBP_3304_97 Pectobacterium_sp_94 94100Pectobacterium_parmentieri_WPP163_93 Pectobacterium_betavasculorum_58 99Pectobacterium_betavasculorum_59 100Pectobacterium_atrosepticum_54 Pectobacterium_atrosepticum_SCRI1043_57100 96Pectobacterium_atrosepticum_ICMP_1526_56 9095Pectobacterium_carotovorum_subsp_carotovorum_UGC32_90 100Pectobacterium_atrosepticum_53 31Pectobacterium_atrosepticum_55 Pectobacterium_carotovorum_subsp_carotovorum_80 7Pectobacterium_carotovorum_subsp_brasiliense_67 99Pectobacterium_carotovorum_subsp_brasiliense_70 5Pectobacterium_carotovorum_subsp_brasiliense_68 Pectobacterium_carotovorum_subsp_brasiliense_65 23Pectobacterium_carotovorum_subsp_carotovorum_83 3Pectobacterium_carotovorum_subsp_brasiliense_64 Pectobacterium_carotovorum_subsp_carotovorum_79 Pectobacterium_carotovorum_subsp_carotovorum_PCC21_89 3Pectobacterium_carotovorum_subsp_carotovorum_8114 Pectobacterium_carotovorum_subsp_brasiliense_72 40Pectobacterium_carotovorum_subsp_brasiliense_69 27Pectobacterium_carotovorum_subsp_brasiliensis_ICMP_19477_75 96Pectobacterium_carotovorum_60 47Pectobacterium_carotovorum_subsp_carotovorum_87 1276Pectobacterium_carotovorum_subsp_brasiliense_63 Pectobacterium_carotovorum_subsp_brasiliense_66 100 “Rhamnogal_lyase” domain 52Pectobacterium_carotovorum_subsp_brasiliense_74 Pectobacterium_carotovorum_subsp_brasiliense_71 44Pectobacterium_carotovorum_subsp_carotovorum_76 Pectobacterium_carotovorum_subsp_brasiliense_6278 Pectobacterium_carotovorum_subsp_brasiliense_73 Pectobacterium_carotovorum_subsp_odoriferum_91100 7050Pectobacterium_carotovorum_subsp_odoriferum_92 Pectobacterium_carotovorum_subsp_actinidiae_61 82 26 100Pectobacterium_carotovorum_subsp_carotovorum_78 552799Pectobacterium_carotovorum_subsp_carotovorum_82 Pectobacterium_carotovorum_subsp_carotovorum_85 from Bacterial group II (with HGT in Insects) Pectobacterium_carotovorum_subsp_carotovorum_86 97Pectobacterium_carotovorum_subsp_carotovorum_77 
Pectobacterium_carotovorum_subsp_carotovorum_ICMP_5702_8898 80 Pectobacterium_carotovorum_subsp_carotovorum_84 Erwinia_tracheiphila_18 100Erwinia_tracheiphila_PSU-1_20 Cedecea_neteri_str_ND14b_4 Dickeya_solani_14100 100 87Dickeya_chrysanthemi_10 Dickeya_dadantii_13 87RhiE_Dickeya_chrysanthemi_RH_I_CAD27359100 100Dickeya_solani_D_s0432-1_15 100 100Dickeya_chrysanthemi_Ech1591_12 100 Serratia_sp_99 Kluyvera_georgiana_ATCC_51603_25 96 100Dickeya_zeae_16 100Dickeya_zeae_EC1_17 Dickeya_chrysanthemi_Ech1591_11 88 Erwinia_tracheiphila_PSU-1_19 100Dendroctonus_ponderosae_1 99 Dendroctonus_ponderosae_2 Dendroctonus_ponderosae_3 Melissococcus_plutonius_33 78 Melissococcus_plutonius_ATCC_35311_36 Melissococcus_plutonius_S1_37100 100Melissococcus_plutonius_34 Melissococcus_plutonius_35 99 100Lactobacillus_helsingborgensis_29 98 99 Lactobacillus_sp_wkB8_31 100 100 Lactobacillus_ghanensis_DSM_18630_27 Lactobacillus_hamsteri_DSM_5661___JCM_6256_28 Unspecified domain from Bacteria Lactobacillus_ghanensis_DSM_18630_26 100 Lactobacillus_pentosus_KCA1_30 Paenibacillus_sp_Soil766_50 Pyrenochaeta_sp_DS3sAY3a_1 100 “RhgB_N” domain from Bacteria 100Pyrsp1_Pyrsp1_572958 99Parch1_Parch1_477556 97Macan1_Macan1_408090 99Didex1_Didex1_281537 100 100 Didma1_Didma1_17685 79 Parsp1_Parsp1_1145684 94 100Crypa2_Crypa2_51746 100Crypto1_Crypto1_502709 92 Venin1_Venin1_23265 100Venpi1_Venpi1_213522 100 Neonectria_ditissima Neodi1_Neodi1_328100 Rhyru1_1_Rhyru1_1_114943 Aureobasidium_pullulans 100 99 100Aurpu_var_pul1_Aurpu_var_pul1_283651 99 Aurpu_var_sub1_Aurpu_var_sub1_703636 Unspecified domain from 98 Neopa1_Neopa1_8910 100 100 Settu3_Settu3_154507 Hyspu1_1_Hyspu1_1_118609 98 Phcit1_Phcit1_197331 Aspergillus_oryzae_RIB40 100Aspergillus_flavus_NRRL3357 Rhizoctonia_solani_AG-1100 100Rhiso1_Rhiso1_3897 100Thacu1_Thacu1_719011 100RSOL_EUC63446 100 V565_KEP54040 100CerAGI_CerAGI_604328 100CerAGI_CerAGI_520626 88 Thacu1_Thacu1_665681 Pyrenochaeta_sp_DS3sAY3a_2 70 99 
Rhamnogalacturonate_lyaseB_Neonectria_ditissima_KPM43995 100Penicillium_steckii 63 100Pensub1_Pensub1_9802 Penra1_Penra1_280727 Unspecified domain from Dikarya XylPMI506_XylPMI506_515693 63 100Chaetomium_globosum_CBS_14851 100Chagl_1_Chagl_1_17472 100 Thiap1_Thiap1_583828 100Diaporthe_helianthi 67 100 Diaam1_Diaam1_8226 Stagonospora_sp_SRC1lsM3a 100Cucbe1_Cucbe1_393062 Tragib1_Tragib1_295752 100 100Trabet1_Trabet1_736816 (including Rhamnogalacturonate B & C ) 100 Trave1_Trave1_42375 100 Rhamnogalacturonate_lyaseC_Fusarium_langsethiae_KPA39151 100Cob_ENH77361 Mrafri1_Mrafri1_439242 Fusarium_oxysporum_20-278_4 Fusarium_oxysporum_20-278_2 Fusarium_oxysporum_Panama_20-278 Fusarium_oxysporum_20-278_3100 100Fusarium_oxysporum_20-278_1 Gibberella_fujikuroi_20-278 100100Gibberella_moniliformis_20-278 Fusarium_pseudograminearum_20-279 100100Gibberella_zeae_20-279 Nectria_haematococca_20-279 100 100 Dactylellina_haptotyla_23-283 100 Pestalotiopsis_fici_20-279 100 Eutypa_lata_20-278 56 Penicillium_rubens_16-271 100 Emericella_nidulans_87-259 78 Pseudocercospora_fijiensis_19-271 5388 Dothistroma_septosporum_21-274 Sphaerulina_musiva_21-273 97Macrophomina_phaseolina_23-273 72 100 Dipse1_Dipse1_9147 Botryosphaeria_parva_22-273 Botryosphaeria_parva_21-274 100Macrophomina_phaseolina_66-319 73 100Cochliobolus_heterostrophus_29-285 100Bipolaris_zeicola_20-278 87 Cochliobolus_sativus_24-282 100 Setosphaeria_turcica_22-278 Pyrenophora_teres_21-276 97 100Pyrenophora_tritici_repentis_21-277 Phaeosphaeria_nodorum_21-281 60Podospora_anserina_25-280 76Chaetomium_globosum_25-280 99Chaetomium_thermophilum_25-280 100 Myceliophthora_thermophila_25-280 98 100Neurospora_crassa_25-281 98100Neurospora_tetrasperma_25-281 Sordaria_macrospora_25-282 66 Togninia_minima_1-175 66 Gaeumannomyces_graminis_27-288 55 Eutypa_lata_21-278 Colletotrichum_orbiculare_25-285 100Colletotrichum_gloeosporioides_25-285 99 100Colletotrichum_graminicola_24-283 100Colletotrichum_higginsianum_25-285 
Verticillium_dahliae_24-277 100 Verticillium_alfalfae_7-212100 Aspergillus_niger_22-271100 100Aspergillus_niger_22-270 Aspergillus_kawachii_22-270 100 Aspergillus_oryzae_22-270100 100100Aspergillus_flavus_22-270 Rhamnogalacturonate_lyase_A_Aspergillus_arachidicola_PIG80164 Aspergillus_terreus_22-270 100 82 Penicillium_rubens_22-276 73 99Penicillium_roqueforti_23-271 73 Penicillium_digitatum_22-264 72 Dactylellina_haptotyla_22-270 100 Aspergillus_fumigatus_22-270 70100Neosartorya_fischeri_22-270 100 Aspergillus_nidulans_22-270 Aspergillus_clavatus_22-270 100 Marssonina_brunnea_20-273 98 Glarea_lozoyensis_19-272 100Thanatephorus_cucumeris_21-315 100 Thanatephorus_cucumeris_1-160 Serendipita_indica_19-269 Pyronema_omphalodes_20-262 100Gloeophyllum_trabeum_36-280 92 65 Gloeophyllum_trabeum_42-281 Thanatephorus_cucumeris_99-256 87 100Thanatephorus_cucumeris_157-335 67 100Thanatephorus_cucumeris_24-216 Thanatephorus_cucumeris_24-268 Moniliophthora_roreri_22-270 100 Coprinopsis_cinerea_23-269 99 “RhgB_N” domain from Earsca1_Earsca1_428861 96 75 Schizophyllum_commune_22-275 100 Schizophyllum_commune_21-269 Colletotrichum_graminicola_21-270 100Colletotrichum_higginsianum_21-270 100Colletotrichum_gloeosporioides_21-225 100 Colletotrichum_gloeosporioides_21-271100 100 Colletotrichum_orbiculare_21-270 Phaeosphaeria_nodorum_21-272 100 Drechslerella_stenobrocha_20-268 97 Dikarya, Oomycetes, and Bacteria 100Arthrobotrys_oligospora_20-268 Dactylellina_haptotyla_20-268 65Phytophthora_parasitica_46-294_1 80Phytophthora_parasitica_46-294_2 100Phytophthora_infestans_44-292 Phytophthora_parasitica_46-294_3 99Phytophthora_parasitica_46_295_1 100Phytophthora_parasitica_46-295100 Phytophthora_parasitica_46-295_2 100Phytophthora_sojae_46-294 96 Phytophthora_ramorum_45-294 (including Rhamnogalacturonate lyase A) 100 100Phytophthora_parasitica_10-256 100Phytophthora_parasitica_2-235 95 Phytophthora_sojae_1-141 Phytophthora_ramorum_27-275 100Phytophthora_parasitica_28-279_1 99 
100Phytophthora_parasitica_28-279_2 98Phytophthora_parasitica_28-279_3 Phytophthora_infestans_25-210 98Phytophthora_infestans_25-276100 100Phytophthora_sojae_26-276 Phytophthora_ramorum_25-276 100 99 Phytophthora_parasitica_29_279_192 Phytophthora_parasitica_29-279_2100 95Phytophthora_nicotianae_29-279 Phytophthora_ramorum_29-279_1 Phytophthora_ramorum_29-279_2100 96 Pythium_ultimum_26-278 100 Pythium_ultimum_25-276 83 100Thanatephorus_cucumeris_19-266_2 100Thanatephorus_cucumeris_19-266_1 100Thanatephorus_cucumeris_19-231_2 100 Thanatephorus_cucumeris_19-231_1 Thanatephorus_cucumeris_23-214 99 100 Thanatephorus_cucumeris_20-235 100 Serendipita_indica_23-267 78 100Heterobasidion_irregulare_22-266 100 Agaricus_bisporus_19-267 Schizophyllum_commune_18-262 95 Pseudomonas_fluorescens_21-268 92 Xanthomonas_albilineans_27-274 92Xanthomonas_oryzae_61-307 100Rhamnogalacturonate_lyaseA_Xanthomonas_fuscans_SOO43950 98 Xanthomonas_campestris_34-233 Actinoplanes_sp_42-285 100 Actinoplanes_missouriensis_31-278 Clostridium_saccharobutylicum_39-288 99 Opitutus_terrae_38-299 95Cochliobolus_heterostrophus_21-279 100Bipolaris_zeicola_21-279 100Cochliobolus_sativus_21-279 Pyrenophora_tritici_repentis_1-231 100100Pyrenophora_teres_21-279 100Setosphaeria_turcica_21-279 100 100 Leptosphaeria_maculans_21-275 99 Eutypa_lata_21-283 Pestalotiopsis_fici_21-279 99 Colletotrichum_gloeosporioides_21-275_1100 100Colletotrichum_gloeosporioides_21-275_2 100Colletotrichum_orbiculare_21-275 100 95 Colletotrichum_higginsianum_21-267 Nectria_haematococca_21-282 Marssonina_brunnea_22-279 100 Tuber_melanosporum_21-268 99Streptomyces_davawensis_40-289 100Streptomyces_ipomoeae_49-298 94Streptomyces_turgidiscabies_40-289 100 Streptomyces_sviceus_41-290 9397Streptomyces_avermitilis_40-289 99Streptomyces_scabiei_45-294 55 Streptomyces_viridochromogenes_35-284 97 Streptomyces_sviceus_51-301 100 Streptomyces_turgidiscabies_43-293 99 100Streptomyces_ipomoeae_31-281 100 Streptomyces_viridochromogenes_14-263 
Firmicutes_bacterium_39-291 Acinetobacter_nectaris_72-331

0.4 Table S1. List of Pfam domain names with annotated functions for the ones uniquely maintained or lost in Neocallimastigomycota (AGF) comparing to Chytridiomycota AGF Unique Domains AGF Lost Domains Domain Function Domain Function 2OG- Acyl-coenzyme A:6-aminopenicillanic AAT FeII_Oxy 2OG-Fe(II) oxygenase superfamily acid acyl- _3 Alpha/beta superfamily, 2OG- Abhydrolase_ hydrolytic of widely differing FeII_Oxy 2OG-Fe(II) oxygenase superfamily 5 phylogenetic origin and catalytic _4 function that share a common fold. Aspartic proteases are a catalytic type of 3-hydroxyanthranilic acid dioxygenase, protease enzymes that use an activated part of the kynurenine pathway for the Asp_protease water molecule bound to one or more 3-HAO degradation of tryptophan and the aspartate residues for of their biosynthesis of nicotinic acid peptide substrates. Asp_protease Acetyl-coenzyme A synthetase N- Aspartic proteases ACAS_N _2 terminus N-linked (asparagine-linked) glycosylation of is mediated by a highly conserved pathway in This is the catalytic region of aspzincins, , in which a lipid (dolichol Aspzincin_M Alg6_Alg a group of lysine-specific metallo- phosphate)-linked oligosaccharide is 35 8 endopeptidases in the M35 assembled at the endoplasmic reticulum membrane prior to the transfer of the oligosaccharide moiety to the target asparagine residues. This belongs to the family of , those acting on carbon- bonds other than peptide bonds, specifically in linear amidines. The systematic name of this enzyme class is allantoate amidinohydrolase. This enzyme participates in metabolism by facilitating the utilization This is a carbohydrate binding domain of as secondary nitrogen sources which has been shown in Allantoica Carb_bind under nitrogen-limiting conditions. Schizosaccharomyces pombe to be se While purine degradation converges to required for septum localisation in all vertebrates, its further degradation varies from species to species. 
and produce and using the uricolytic pathway. Allantoicase performs the second step in this pathway catalyzing the conversion of allantoate into ureidoglycolate and . ATP_sub CBAH Choloylglycine hydrolase family ATP synthase complex subunit h _h Carbohydrate-binding module family 10 (CBM10) is found in two distinct sets of proteins with different functions. Those ATP- mitochondrial ATPase complex subunit CBM_10 found in aerobic bacteria bind cellulose synt_10 ATP1 (or other carbohydrates); but in anaerobic fungi they are binding

domains, referred to as dockerin domains. The dockerin domains are believed to be responsible for the assembly of a multiprotein cellulase/hemicellulase complex, similar to the cellulosome found in certain anaerobic bacteria. Carbohydrate-binding module family 25 Atp11p is a molecular chaperone of the (CBM25) binds alpha- mitochondrial matrix that participates in CBM_25 glucooligosaccharides, particularly those ATP11 the biogenesis pathway to form F1, the containing alpha-1,6 linkages, and catalytic unit of the ATP synthase granular starch. This is a mannan-specific carbohydrate binding domain, previously known as the X4 module. Unlike other CBM_35 B12D NADH dehydrogenase (ubiquinone) carbohydrate binding modules, binding to causes a conformational change Carbohydrate-binding module family 6 (CBM6) is unusual in that is contains two substrate-binding sites, cleft A and cleft B. Cellvibrio mixtus endoglucanase 5A contains two CBM6 domains, the CBM6 domain at the C-terminus displays distinct ligand binding Band_7_ CBM_6 C-terminal region of band_7 specificities in each of the sustrate- C binding clefts. Both cleft A and cleft B can bind cello-oligosaccharides, laminarin preferentially binds in cleft A, xylooligosaccharides only bind in cleft A and beta1,4,-beta1,3-mixed linked glucans only bind in cleft B. Polysaccharide lyase family 4, domain BCDHK_ Mitochondrial branched-chain alpha- CBM-like III Adom3 ketoacid dehydrogenase kinase CBM26 is a carbohydrate-binding CBM26 BCS1_N Mitochondrial chaperone BCS1 module that binds starch CHB_HEX_ Chitobiase/beta-hexosaminidase C- Eukaryotic mitochondrial regulator Bot1p C_1 terminal domain protein CotH kinase protein, members of this family include the spore coat protein H Chalcone CotH (cotH). This protein is an atypical Chalcone like _2 protein kinase that phosphorylates CotB and CotG Members of this family probably act as chromate transporters. 
Members of this family are found in both bacteria and Carbohydrate-binding domain- Chromate Cthe_2159 archaebacteria. The proteins are containing protein _transp composed of one or two copies of this region. The alignment contains two conserved motifs, FGG and PGP. This family includes FeoA a small Cytokine-induced anti-apoptosis protein, probably involved in Fe2+ inhibitor 1, Fe-S biogenesis. Anamorsin, FeoA transport [1]. This presumed short CIAPIN1 subsequently named CIAPIN1 for domain is also found at the C-terminus cytokine-induced anti-apoptosis inhibitor of a variety of metal dependent 1, in humans is the homologue of yeast

transcriptional regulators. This suggests Dre2, a conserved soluble eukaryotic Fe- that this domain may be metal-binding. S cluster protein, that functions in In most cases this is likely to be either cytosolic Fe-S protein biogenesis. It is iron or manganese. found in both the cytoplasm and in the mitochondrial intermembrane space (IMS) Ferrous iron transport protein B C Cytochrome c oxidase biogenesis protein FeoB_C Cmc1 terminus Cmc1 like In molecular biology, the FLYWCH zinc finger is a zinc finger domain. It is found in a number of eukaryotic Cytochrome oxidase complex assembly FLYWCH proteins. FLYWCH is a C2H2-type zinc Coa1 protein 1 finger characterised by five conserved hydrophobic residues, containing the conserved sequence motif: Complex1 Gal_Lectin Galactose binding lectin domain NADH dehydrogenase (ubiquinone) _LYR Glycoside hydrolase family 11 CAZY This is a family of proteins carrying the GH_11 comprises enzymes with only Glyco_hydro Complex1 LYR motif of family Complex1_LYR, one known activity, xylanase (EC _11 _LYR_2 PF05347 likely to be involved in Fe-S 3.2.1.8). These enzymes were formerly cluster biogenesis in mitochondria. known as cellulase family G Glycoside hydrolase family 48 CAZY GH_48 comprises enzymes with several Glyco_hydro known activities; endoglucanase (EC COQ7 Ubiquinone biosynthesis protein COQ7 _48 3.2.1.4); cellobiohydrolase (EC 3.2.1.91). Glycoside hydrolase family 6 comprises enzymes with several known activities including endoglucanase (EC 3.2.1.4) and cellobiohydrolase (EC 3.2.1.91). These enzymes were formerly known as Glyco_hydro cellulase family B. The 3D structure of COX15- Cytochrome c oxidase _6 the enzymatic core of cellobiohydrolase CtaA II (CBHII) from the Trichoderma reesei reveals an alpha-beta protein with a fold similar to the ubiquitous barrel topology first seen in triose phosphate isomerase. 
Glycoside hydrolase family 8 CAZY GH_8 comprises enzymes with several Cytochrome c oxidase, the last enzyme known activities; endoglucanase (EC in the respiratory electron transport chain Glyco_hydro 3.2.1.4); lichenase (EC 3.2.1.73); COX6B of mitochondria (or bacteria) located in _8 chitosanase (EC 3.2.1.132). These the mitochondrial (or bacterial) enzymes were formerly known as membrane. cellulase family D. CtaG_Co GT-D Glycosyltransferase GT-D fold Cytochrome c oxidase x11 A leucine-rich repeat (LRR) is a protein structural motif that forms an α/β horseshoe fold. It is composed of LRR_5 repeating 20–30 amino acid stretches Cupin_4 Cupin superfamily protein that are unusually rich in the hydrophobic amino acid leucine. These repeats commonly fold together to form

a solenoid , termed leucine-rich repeat domain. Typically, each repeat unit has beta strand-turn- alpha helix structure, and the assembled domain, composed of many such repeats, has a horseshoe shape with an interior parallel and an exterior array of helices. One face of the beta sheet and one side of the helix array are exposed to solvent and are therefore dominated by hydrophilic residues. The region between the helices and sheets is the protein's hydrophobic core and is tightly sterically packed with leucine residues. Leucine-rich repeats are frequently involved in the formation of protein– protein interactions. This cupin like domain shares similarity Anaerobic ribonucleoside-triphosphate NRDD Cupin_8 to the JmjC domain, which catalyse a reductase novel modification PD-(D/E)XK nuclease family Cyto_hem PDDEXK_2 transposase. These proteins are Holocytochrome-c synthase e_lyase transposase proteins. Cytochrome b561 is an integral membrane protein responsible for Pollen_allerg This family contains allergens lol PI, PII Cytochro electron transport, binding two heme _1 and PIII from Lolium perenne. m_B561 groups non-covalently.[1] It is a family of ascorbate-dependent enzymes Pur_ac_phos Purple acid Phosphatase, N-terminal Mitochondrial ribosomal death- DAP3 ph_N domain associated protein 3 Deoxyribodipyrimidine photolyase This domain is usually found associated (DNA photolyase) is a DNA repair with PF00239 in putative enzyme. It binds to UV-damaged DNA DNA_pho Recombinase integrases/recombinases of mobile containing pyrimidine dimers and, upon tolyase genetic elements of diverse bacteria and absorbing a near-UV photon (300 to 500 phages. nm), breaks the cyclobutane ring joining the two pyrimidines of the dimer. Emopamil binding protein, encodes a non-glycosylated type I integral membrane protein of endoplasmic reticulum and shows high level Rhamnogalacturonate lyase ( EC:4.2.2.-) expression in epithelial tissues. 
The EBP degrades the rhamnogalacturonan I (RG- protein has emopamil binding domains, Rhamnogal_l I) backbone of pectin. This family EBP including the sterol acceptor site and the yase contains mainly members from plants, catalytic centre, which show Delta8- but also contains the plant pathogen Delta7 sterol isomerase activity. Human Erwinia chrysanthemi. sterol isomerase, a homologue of mouse EBP, is suggested not only to play a role in cholesterol biosynthesis, but also to affect lipoprotein internalisation. ETC_C1_ Rubrerythrin This domain has a ferritin-like fold NADH dehydrogenase (ubiquinone) NDUFA4

Integrase, Retroviral integrase (IN) is an enzyme produced by a retrovirus (such as HIV) that enables its genetic material to be integrated into the DNA of the infected . Retroviral INs are not to be confused with phage integrases, such as Electron transfer flavoprotein-ubiquinone rve λ phage integrase (Int) (see site-specific ETF_QO oxidoreductase, 4Fe-4S recombination). IN is a key component in the retroviral pre-integration complex (PIC). The complex of integrase bound to cognate viral DNA (vDNA) ends has been referred to as the intasome. Reverse transcriptase (RNA-dependent DNA polymerase), is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of Photolyases (EC 4.1.99.3) are DNA mobile elements, including FAD_bin RVT_2 repair enzymes that repair damage retrotransposons, retroviruses, group II ding_7 caused by exposure to ultraviolet light. introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognised by the PF00078 model. Carbohydrate esterase, sialic acid- FAD_bin SASA specific acetylesterase. Sialic acid FAD-binding domain ding_8 acetylesterase in autoimmunity Stealth protein CR1, Stealth_C1 is the first of several highly conserved regions on stealth proteins in metazoa and bacteria. There are up to four CR regions on all member proteins. CR1 carries a well-conserved IDVVYT sequence-motif. The domain is found in Ferric reductase like transmembrane tandem with CR2, CR3 and CR4 on component. This family includes a both potential metazoan hosts and common region in the transmembrane pathogenic eubacterial species that are Ferric_red proteins mammalian cytochrome B-245 Stealth_CR1 capsular polysaccharide uct heavy chain (gp91-phox), ferric phosphotransferases. 
reductase transmembrane component in yeast and respiratory burst oxidase from mouse-ear cress.
The CR domains appear on eukaryotic proteins such as GNPTAB, N-acetylglucosamine-1-phosphotransferase subunits alpha/beta. Horizontal gene-transfer seems to have occurred between host and bacteria of these sequence-regions in order for the bacteria to evade detection by the host innate immune system.
Stealth_CR2: Stealth_CR2 is the second of several highly conserved regions on stealth proteins in metazoa and bacteria. There are up to four CR regions on all member proteins. CR2 carries a well-conserved NDD sequence-motif. The domain is found in tandem with CR1, CR3 and CR4 on both potential metazoan hosts and pathogenic eubacterial species that are capsular polysaccharide phosphotransferases. The CR domains appear on eukaryotic proteins such as GNPTAB, N-acetylglucosamine-1-phosphotransferase subunits alpha/beta. Horizontal gene-transfer seems to have occurred between host and bacteria of these sequence-regions in order for the bacteria to evade detection by the host innate immune system.
FLILHELTA: protein of unknown function
YoeB_toxin: A toxin-antitoxin system is a set of two or more closely linked genes that together encode both a protein 'poison' and a corresponding 'antidote'. When these systems are contained on plasmids – transferable genetic elements – they ensure that only the daughter cells that inherit the plasmid survive after cell division. If the plasmid is absent in a daughter cell, the unstable antitoxin is degraded and the stable toxic protein kills the new cell; this is known as 'post-segregational killing' (PSK). Toxin-antitoxin systems are widely distributed in prokaryotes, and organisms often have them in multiple copies.
FPN1: Ferroportin1, that may play a role in iron export from the cell. This family may represent a number of transmembrane regions in Ferroportin1.
ZinT: There is strong evidence for involvement of the ZinT domain in zinc homeostasis and management of zinc in the periplasm. It may also facilitate zinc uptake from the environment through interactions with the znuABC zinc transporter. It is regulated by the metalloregulator gene Zur (zinc uptake regulator). The domain was originally discovered in the bacterial stress response to cadmium. Further studies have found that it binds to cadmium, zinc, nickel, and mercury, but not other common metals such as cobalt, copper, iron, and manganese. It may have a secondary function in managing heavy-metal toxicity.
Glyco_hydro_63: Glycosyl hydrolase family 63 (CAZY GH_63) is a family of eukaryotic enzymes. They catalyse the specific cleavage of the non-reducing terminal glucose residue from Glc(3)Man(9)GlcNAc(2). Mannosyl oligosaccharide glucosidase EC 3.2.1.106 is the first enzyme in the N-linked oligosaccharide processing pathway.
Glyco_hydro_63N: Glycosyl hydrolase family 63 (CAZY GH_63) is a family of eukaryotic enzymes. They catalyse the specific cleavage of the non-reducing terminal glucose residue from Glc(3)Man(9)GlcNAc(2). Mannosyl oligosaccharide glucosidase EC 3.2.1.106 is the first enzyme in the N-linked oligosaccharide processing pathway.

Glyco_transf_90: This family of glycosyl transferases are specifically (mannosyl) glucuronoxylomannan/galactoxylomannan -beta 1,2-xylosyltransferases, EC:2.4.2.-.
Glyoxal_oxid_N: Glyoxal oxidase N-terminus
HAGH_C: Hydroxyacylglutathione hydrolase C-terminus. Substrate binding occurs at the interface between this domain and the catalytic domain.
IDO: Indoleamine 2,3-dioxygenase. Indoleamine 2,3-dioxygenase is the first and rate-limiting enzyme of tryptophan catabolism through the kynurenine pathway, thus causing depletion of tryptophan which can cause halted growth of microbes as well as T cells.
IGR: IGR protein motif
INSIG: Insulin-induced protein, found in the endoplasmic reticulum and bind the sterol-sensing domain of SREBP cleavage-activating protein (SCAP), preventing it from escorting SREBPs to the Golgi. Their combined action permits feedback regulation of cholesterol synthesis over a wide range of sterol concentrations.
LIAS_N: LIAS_N is found as the N-terminal domain of the Radical_SAM family in the members that are lipoyl synthase enzymes, particularly the mitochondrial ones in metazoa but also those in bacteria.
LigB: Catalytic LigB subunit of aromatic ring-opening dioxygenase
Lipid_DES: Sphingolipid Delta4-desaturase (DES). Sphingolipids are important membrane signalling molecules involved in many different cellular functions in eukaryotes.
MAM33: Mitochondrial glycoprotein
MCD: This family consists of several eukaryotic malonyl-CoA decarboxylase (MLYCD) proteins. Malonyl-CoA, in addition to being an intermediate in the de novo synthesis of fatty acids, is an inhibitor of carnitine palmitoyltransferase I, the enzyme that regulates the transfer of long-chain fatty acyl-CoA into mitochondria, where they are oxidised.
Mg_trans_NIPA: Magnesium transporter NIPA
Mgm101p: Mitochondrial genome maintenance MGM101
Mit_KHE1: Mitochondrial K+-H+ exchange-related
Mitofilin: Mitochondrial inner membrane protein
MMADHC: Methylmalonic aciduria and homocystinuria type D protein
Mo-co_dimer: Mo-co oxidoreductase dimerisation domain. This domain is found in molybdopterin (Mo-co) oxidoreductases. It is involved in dimer formation, and has an Ig-fold structure.
MOSC_N: MOSC N-terminal domain, predicted sulfur-carrier domain
MRP_L53: 39S ribosomal protein L53
MRP-L20: Mitochondrial ribosomal protein subunit L20
MRP-L28: Mitochondrial ribosomal protein L28
MRP-L46: 39S mitochondrial ribosomal protein L46
MRP-L47: 39S mitochondrial ribosomal protein L47
MRP-S25: Mitochondrial ribosomal protein S25
MTP18: Mitochondrial 18 KDa protein
MWFE: NADH dehydrogenase (ubiquinone), is an enzyme of the respiratory chains of myriad organisms from bacteria to humans that falls under the H+ or Na+-translocating NADH Dehydrogenase (NDH) Family (TC# 3.D.1), a member of the Na+ transporting Mrp superfamily.
NAD_binding_6: Ferric reductase NAD binding domain
NADH-u_ox-rdase: NADH-ubiquinone oxidoreductase complex I, 21 kDa subunit
NDUF_B7: NADH dehydrogenase (ubiquinone)
NTPase_I-T: protein of unknown function
Ofd1_CTDD: Oxoglutarate and iron-dependent oxygenase degradation C-term
OHCU_decarbox: In molecular biology 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline decarboxylase (OHCU decarboxylase) EC 4.1.1.n1 is an enzyme involved in purine catabolism.
OPA3: Optic atrophy 3 protein, deficiency of which causes type III 3-methylglutaconic aciduria (MGA) in humans. This disease manifests with early bilateral optic atrophy, spasticity, extrapyramidal dysfunction, ataxia, and cognitive deficits, but normal longevity.
Pam17: Mitochondrial import protein Pam17
PDZ_1: These domains play a key role in the formation and function of signal transduction complexes.
Peptidase_M76: This is a family of metalloproteases. Proteins in this family are also annotated as Ku70-binding proteins.
PET117: PET assembly of cytochrome c oxidase
Pet127: Mitochondrial protein Pet127
Pet191_N: Cytochrome c oxidase
Pirin_C: Pirin C-terminal cupin domain
QCR10: Ubiquinol-cytochrome-c reductase complex subunit
Rad52_Rad22: The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways.
RdRP: RNA dependent RNA polymerase. This family of proteins are eukaryotic RNA dependent RNA polymerases. These proteins are involved in post transcriptional gene silencing where they are thought to amplify dsRNA templates.
Rib_5-P_isom_A: Ribose 5-phosphate isomerase A (phosphoriboisomerase A)
Ribosomal_L33: unknown
Ribosomal_L34: unknown
Ribosomal_L36: unknown
RPOL_N: DNA-directed RNA polymerase N-terminal. This domain has a role in interaction with regions of upstream promoter DNA and the nascent RNA chain, leading to the processivity of the enzyme. This domain undergoes a structural change in the transition from initiation to elongation phase. The structural change results in abolition of the promoter binding site, creation of a channel accommodating the heteroduplex in the active site and formation of an exit tunnel which the RNA transcript passes through after peeling off the heteroduplex.
S1-P1_nuclease: S1 P1 nuclease protein domain, which cleave RNA and single stranded DNA with no sequence specificity. They are found in both prokaryotes and eukaryotes and are thought to be associated in programmed cell death and also in tissue differentiation. Furthermore, they are secreted extracellular, that is, outside of the cell. Their function and distinguishing features mean they have potential in being exploited in the field of biotechnology.
SCO1-SenC: This family is involved in biogenesis of respiratory and photosynthetic systems.
SE: Squalene epoxidase. This domain is found in squalene epoxidase (SE) and related proteins which are found in taxonomically diverse groups of eukaryotes and also in bacteria. SE was first cloned from Saccharomyces cerevisiae where it was named ERG1. It contains a putative FAD binding site and is a key enzyme in the sterol biosynthetic pathway [1]. Putative transmembrane regions are found to the protein's C-terminus.
Sgf11: The Sgf11 family is a SAGA complex subunit in Saccharomyces cerevisiae. The SAGA complex is a multisubunit protein complex involved in transcriptional regulation. SAGA combines proteins involved in interactions with DNA-bound activators and TATA-binding protein (TBP), as well as enzymes for histone acetylation and deubiquitylation.
She9_MDM33: mitochondrial inner membrane proteins with a role in inner mitochondrial membrane organisation and biogenesis
Solute_trans_a: Organic solute transporter Ostalpha. This family is a transmembrane organic solute transport protein. In vertebrates these proteins form a complex with Ostbeta, and function as bile transporters [1]. In plants they may transport brassinosteroid-like compounds and act as regulators of cell death.
Spo12: The Spo12 protein plays a regulatory role in two of the most fundamental processes of biology, mitosis and meiosis, and yet its biochemical function remains elusive. Spo12 is a nuclear protein. Spo12 is a component of the FEAR (Cdc fourteen early anaphase release) regulatory network, that promotes Cdc14 release from the nucleolus during early anaphase. The FEAR network is comprised of the polo kinase Cdc5, the separase Esp1, the kinetochore-associated protein Slk19, and Spo12.
SPO22: SPO22/ZIP4 in yeast is a meiosis specific protein involved in sporulation. It has been shown to regulate crossover distribution by promoting synaptonemal complex formation.
Suc_Fer-like: Sucrase/ferredoxin-like. This family contains a number of bacterial and eukaryotic proteins approximately 400 residues long that resemble ferredoxin and appear to have sucrolytic activity.
TIM21: TIM21 interacts with the outer mitochondrial TOM complex and promotes the insertion of proteins into the inner mitochondrial membrane.
TMEM223: Transmembrane protein 223
UcrQ: Coenzyme Q – cytochrome c reductase
Ureidogly_lyase: Ureidoglycolate lyase, one of the enzymes that acts upon ureidoglycolate, an intermediate of purine catabolism, releasing urea.
VRR_NUC: unknown
YCII: YCII-related domain. This domain is suggested to play a role in transcription initiation (Bateman A per. obs.). This domain is named after the most conserved motif in the alignment.
zf-4CXXC_R1: Zinc-finger domain of monoamine-oxidase A repressor R1
zf-CHCC: NADH dehydrogenase (ubiquinone)

Table S2. Genome information of the animal hosts and diet plants used in the study to infer the genetic elements in Neocallimastigomycota that have a foreign origin.

| Name | Taxon | Version | Source |
|---|---|---|---|
| Elephant | Loxodonta africana (African savanna elephant) | Loxafr3.0 | Broad Institute |
| Horse | Equus caballus | EquCab2.0 | Wade et al. 2009 |
| Sheep | Ovis aries | Oar_v4.0 | International Sheep Genomics Consortium 2010 |
| Yak | Bos mutus (wild yak) | BosGru_v2.0 | Qiu et al. 2012 |
| Banana | Musa acuminata | DH-Pahang v2 | Martin et al. 2016 |
| Palm | Elaeis guineensis (African oil palm) | EG5 | Singh et al. 2013 |
| Bamboo | Phyllostachys heterocycla var. pubescens | v1.0 | Peng et al. 2013 |
| GoatGrass | Aegilops tauschii subsp. tauschii | Aet_MR_1.0 | Zimin et al. 2017 |
| Maize | Zea mays | B73 RefGen_v4 | Jiao et al. 2017 |
| Rice | Oryza sativa Japonica Group | Build 4.0 | Rice Annotation Project 2007 |
| Brome | Brachypodium distachyon | v2.0 | International Brachypodium Initiative 2010 |
| Sorghum | Sorghum bicolor | v3 | Paterson et al. 2009 |
| Arabidopsis | Arabidopsis thaliana | TAIR10 | Swarbreck et al. 2008 |
| | Physcomitrella patens | V1.1 | Rensing et al. 2008 |

References:

Broad Institute. Elephant Genome Project. (2018).

Jiao, Y. et al. Improved maize reference genome with single-molecule technologies. Nature 546, 524–527 (2017).

Martin, G. et al. Improvement of the banana ‘Musa acuminata’ reference sequence using NGS data and semi-automated bioinformatics methods. BMC Genomics 17, 1–12 (2016).

Paterson, A. H. et al. The Sorghum bicolor genome and the diversification of grasses. Nature 457, 551–556 (2009).

Peng, Z. et al. The draft genome of the fast-growing non-timber forest species moso bamboo (Phyllostachys heterocycla). Nat. Genet. 45, 456–461 (2013).

Qiu, Q. et al. The yak genome and adaptation to life at high altitude. Nat. Genet. 44, 946–949 (2012).

Rensing, S. A. et al. The Physcomitrella genome reveals evolutionary insights into the conquest of land by plants. Science 319, 64–69 (2008).

Singh, R. et al. Oil palm genome sequence reveals divergence of interfertile species in Old and New worlds. Nature 500, 335–339 (2013).

Swarbreck, D. et al. The Arabidopsis Information Resource (TAIR): gene structure and function annotation. Nucleic Acids Res. 36, D1009–D1014 (2008).

The International Brachypodium Initiative et al. Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature 463, 763–768 (2010).

The International Sheep Genomics Consortium et al. The sheep genome reference sequence: a work in progress. Anim. Genet. 41, 449–453 (2010).

The Rice Annotation Project. Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana. Genome Res. 17, 175–183 (2007).

Wade, C. M. et al. Genome sequence, comparative analysis, and population genetics of the domestic horse. Science 326, 865–867 (2009).

Zimin, A. V. et al. Hybrid assembly of the large and highly repetitive genome of Aegilops tauschii, a progenitor of bread wheat, with the MaSuRCA mega-reads algorithm. Genome Res. 27, 787–792 (2017).