Appendix A Identification and characterization of biosynthetic gene clusters from halophilic marine fungus Eurotium rubrum

Obul Bandapali1, Jens Frederik Teilfeldt Hansen2, Alisha Parveen3, Pradeep Phule4, Emmagouni Sharath Kumar Goud5, Suraj Kumar Acharya6, Jens Laurids Sørensen7 & Abhishek Kumar8,*
1 Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), German Cancer Consortium (DKTK), Hopp Children's Cancer Center (KiTZ), and Heidelberg University, Medical Faculty, D-69120 Heidelberg, Germany; [email protected]
2 Department of Biochemistry, McGill University, Montréal, QC H3G 0B1, Canada and Department of Chemistry and Bioscience, Aalborg University, Niels Bohrs Vej 8, DK-6700 Esbjerg, Denmark; [email protected]
3 Institute of Molecular Medicine and Cell Research, Albert Ludwigs University Freiburg, Stefan Meier Strasse 17, 79104 Freiburg, Germany; [email protected]
4 Institute of Bioinformatics, International Technology Park, Bangalore, 560066 India; [email protected]
5 Institute of Bioinformatics, International Technology Park, Bangalore, 560066 India; [email protected]
6 Institute of Bioinformatics, International Technology Park, Bangalore, 560066 India; [email protected]
7 Department of Chemistry and Bioscience, Aalborg University, Niels Bohrs Vej 8, DK-6700 Esbjerg, Denmark; [email protected]
8 Institute of Bioinformatics, International Technology Park, Bangalore, 560066 and Manipal Academy of Higher Education (MAHE), Manipal 576104, Karnataka, India; [email protected]
* Correspondence: [email protected]; Tel.: +91 80-2841-6140 (A.K.)

Mar. Drugs 2020, 18, x; doi: FOR PEER REVIEW www.mdpi.com/journal/marinedrugs

A. EruBGC2 - KK088412.1 - scaffold00002

B. EruBGC14 - KK088418.1 - scaffold00008

C. EruBGC25 - KK088444.1 - scaffold00034

D. EruBGC32 - KK088468.1 - scaffold00058

Fig S1. Overview of NRPS-encoding BGCs from E. rubrum genome.

A. EruBGC4 - KK088413.1 - scaffold00003

B. EruBGC10 - KK088415.1 - scaffold00005

C. EruBGC18 - KK088423.1 - scaffold00013 – No matches with known BGCs

D. EruBGC20 - KK088437.1 - scaffold00027 – No matches with known BGCs

E. EruBGC22 - KK088438.1 - scaffold00028

F. EruBGC29 - KK088450.1 - scaffold00040

G. EruBGC31 - KK088464.1 - scaffold00054

H. EruBGC34 - KK088474.1 - scaffold00064

Fig S2. Summary of NRPS-like BGCs from E. rubrum genome.

Fig S3. Overview of NRPS-like proteins from E. rubrum genome.

A. EruBGC3 - KK088412.1 - scaffold00002 i. Blast clusters

ii. known cluster

B. EruBGC5 - KK088413.1 - scaffold00003 i. Blast clusters

ii. known cluster

C. EruBGC8 - KK088414.1 - scaffold00004

i. Blast clusters

ii. known cluster

*BGC0000156: TAN-1612 / 1-(2,3,5,10-tetrahydroxy-7-methoxy-4-oxo-1,2,3,4-tetrahydroanthracen-2-yl)pentane-2,4-dione / desmethyl TAN-1612 (60% of genes show similarity), Polyketide:Iterative type I

**BGC0000168: viridicatumtoxin / previridicatumtoxin / 5-hydroxyanthrotainin / 8-O-desmethylanthrotainin (13% of genes show similarity), Polyketide:Iterative type I

D. EruBGC11 - KK088415.1 - scaffold00005

E. EruBGC13 - KK088417.1 - scaffold00007

F. EruBGC23 - KK088442.1 - scaffold00032

G. EruBGC24 - KK088442.1 - scaffold00032

H. EruBGC35 - KK088487.1 - scaffold00077

Fig S4. Overview of PKS-encoding BGCs from E. rubrum genome.

Fig S5. Annotation summary of PKS proteins from E. rubrum genome.

A. EruBGC1 - KK088412.1 - scaffold00002 – No matches with known BGCs

B. EruBGC6 - KK088413.1 - scaffold00003

C. EruBGC16 - KK088420.1 - scaffold00008 i. Blast clusters

ii. known cluster

D. EruBGC19 - KK088424.1 - scaffold00014 i. Blast clusters

ii. known cluster

E. EruBGC21 - KK088437.1 - scaffold00027 – No matches with known BGCs

F. EruBGC26 - KK088445.1 - scaffold00035

G. EruBGC27 - KK088446.1 - scaffold00036 – No matches with known BGCs

H. EruBGC33 - KK088470.1 - scaffold00060

I. EruBGC36 - KK088489.1 - scaffold00079

J. EruBGC15 - KK088420.1 - scaffold00008

Fig S6. Overview of terpene-producing BGCs from E. rubrum genome.

Fig S7. Sequence alignment of terpenoid synthase (EYE98762.1) with top homologs from other fungi. XP_025466343.1 - terpenoid synthase (Aspergillus sclerotioniger CBS 115572); CRL22192.1 - terpenoid synthase (Penicillium camemberti); XP_025453264.1 - terpenoid synthase (Aspergillus lacticoffeatus CBS 101883).

Fig S8. Protein sequence alignment of terpenoid synthase (EYE98616.1) with top homologs. EYE98616.1 shares 80.24%, 72.46% and 70.87% sequence identity with trichodiene synthase (KGO75161.1, Penicillium italicum), terpenoid synthase (PLN82674.1, Aspergillus taichungensis) and terpenoid synthase (XP_024673860.1, Aspergillus candidus), respectively.

Fig S9. Protein alignment of terpenoid synthase (EYE98616.1) with top homologs. EYE98616.1 shares 74.34%, 57.14% and 53.78% sequence identity with RJE22100.1 (farnesyl pyrophosphate synthetase, A. sclerotialis), KAE8384153.1 (isoprenoid synthase domain-containing protein, A. albertensis) and RJE18221.1 (farnesyl pyrophosphate synthetase, A. sclerotialis), respectively.

Fig S10. Alignment of farnesyl-diphosphate farnesyltransferase (EYE94773.1) with top homologs. It shares 94%, 82% and 81% amino acid sequence identity with putative squalene synthase (ODM24304.1) from A. cristatus, putative Farnesyl-diphosphate farnesyltransferase (CEL06587.1) from A. calidoustus and squalene synthetase (KKK19626.1) from A. ochraceoroseus, respectively.

Fig S11. Alignment of farnesyl-diphosphate farnesyltransferase (EYE94773.1) with top homologs. EYE94773.1 shares 94.12%, 81.97% and 81.33% sequence identity with ODM24304.1 (putative squalene synthase, A. cristatus), XP_024706925.1 (farnesyl-diphosphate farnesyltransferase, A. steynii IBT 23096) and KKK19626.1 (squalene synthetase, A. ochraceoroseus), respectively.

Fig S12. Protein alignment of terpenoid synthase (EYE92376.1) reveals 85% sequence identity with orthologs in Aspergillus strains. GAA88553.1 - A. kawachii IFO 4308; XP_025519208.1 - A. piperis CBS 112811; XP_025533848.1 - A. costaricaensis CBS 115574.

Fig S13. Sequence alignment of squalene cyclase (EYE91407.1) shows 55-57% sequence identity with orthologs from Aspergillus species. KAE8353390.1 - squalene cyclase (Aspergillus coremiiformis); RDK47893.1 - squalene cyclase (Aspergillus phoenicis ATCC 13157); XP_026631671.1 - squalene cyclase (Aspergillus welwitschiae).

Fig S14. Sequence alignment of terpenoid synthase (EYE91365.1) shows 62-65% sequence identity with orthologs from Aspergillus species. KAE8386459.1 - isoprenoid synthase domain-containing protein (Aspergillus alliaceus); XP_031902430.1 - isoprenoid synthase domain-containing protein (Aspergillus alliaceus); KAA8643853.1 - terpene cyclase (Aspergillus tanneri).

Fig S15. Alignment of phytoene synthase (EYE90127.1) shows 64-67% sequence identity with close homologs. XP_025536128.1 - phytoene synthase (Aspergillus costaricaensis CBS 115574); XP_025518135.1 - phytoene synthase (Aspergillus piperis CBS 112811); XP_025473898.1 - phytoene synthase (Aspergillus neoniger CBS 115656).

Fig S16. Protein alignment of acetyl-CoA synthetase-like protein (EYE95809.1) shows 68-79% sequence identity with close homologs from two Penicillium species and Aspergillus udagawae. OOQ91688.1 - putative AMP dependent /synthetase (Penicillium brasilianum); CDM36077.1 - AMP-dependent synthetase/ligase (Penicillium roqueforti FM164); GFF57015.1 - putative acyl-CoA synthetase YngI (Aspergillus udagawae).

Fig S17. Protein alignment of aldolase (EYE95810.1) shows 73.05%, 70.22% and 68.68% sequence identity with CDM36078.1, GFF57018.1 and XP_024698020.1, respectively. CDM36078.1 - Crotonase, core (Penicillium roqueforti FM164); GFF57018.1 - isoform 2 of hydroxymethylglutaryl-CoA , mitochondrial (Aspergillus udagawae); XP_024698020.1 - aldolase (Aspergillus campestris IBT 28561).

A. EruBGC7 (NRPS, indole) - KK088413.1 - scaffold00003

B. EruBGC17 (T1PKS, indole) - KK088422.1 - scaffold00012

C. EruBGC28 (T1PKS, NRPS-like) - KK088447.1 - scaffold00036

D. EruBGC30 (T1PKS, NRPS) - KK088453.1 - scaffold00043

Fig S18. Overview of hybrid BGCs from E. rubrum genome.

© 2020 by the authors. Submitted for possible open access publication under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
