Comparative Genomics of the Major Parasitic Worms

International Helminth Genomes Consortium

Supplementary Information

Introduction
Contributions from Consortium members
Methods
  1 Sample collection and preparation
  2.1 Data production, Wellcome Trust Sanger Institute (WTSI)
    DNA template preparation and sequencing
    Genome assembly
    Assembly QC
    Gene prediction
    Contamination screening
  2.2 Data production, McDonnell Genome Institute (MGI)
    Genome sequencing library preparation
    Genome assembly
    Assembly QC / Contamination screening
    Transcriptome sequencing and assembly
    Gene prediction
  2.3 Data production, Blaxter Nematode and Neglected Genomics (BaNG)
    Genome sequencing library preparation and sequencing
    Genome assembly
    Assembly QC
    Gene prediction
  3 Functional annotation
    Assigning protein names to predicted proteins
    Assigning GO terms to predicted proteins
  4 Repeat libraries and repeat-masking
  5 Regression model for genome size
  6 Mitochondrial genome analysis
  7 Defining high-quality ‘tier 1’ species for downstream analyses
  8 Compara database of gene families
    Construction of the in-house Compara database
    Identification of gene families, orthologs and paralogs
  9 Identification of synapomorphic gene families
  10 Phylogenetic analysis of candidate lateral gene transfers
  11 Network representation of gene families
  12 Phylogenetic tree based on gene family presence/absence
  13 Identification of gene family expansions
  14 Species Tree
  15 Novel domain combinations
  16 Ion Channels and ABC Transporters
  17 Proteases
  18 Kinase prediction
  20 Signal peptide for secretion and TM domains predictions
  21 InterPro and GO annotations
  22 Species-level functional enrichment (GO / InterPro / Pfam) analysis
  23 SCP/TAPS protein family
  24 GPCR analysis
  25 Metabolism
    Assigning ECs to predicted proteins and generating high-confidence EC predictions
    Reconstructing metabolic pathways and pathway hole-filling
    Analysis of KEGG metabolic modules and pathways
    Analysis of chokepoints in metabolic pathways
    Carbohydrate active enzymes (CAZymes)
  26 Identification of Potential Anthelmintic Drug Targets and Drugs
    Known anthelmintic drugs and compounds
    Dendrogram of known anthelmintic compounds
    Identifying potential helminth drug targets
    Identifying potential new anthelmintic drugs in ChEMBL
    Diversity analysis for creating a ‘diverse screening set’
    Identifying compounds available for purchase using ZINC15
    Self-organising map of compounds
Supplementary Results
  1. Genomic diversity in parasitic nematodes and platyhelminths
    1.1 Genome sequencing and assembly
      Sequencing strategy
      Genome assembly pipeline validation
      Assembly statistics
