Presentation on MEGAN

MEGAN: taxonomic binning of sequence data

Overview
• Introduction
• Why use MEGAN
• Using MEGAN
• Exercises

High throughput sequencing
Shot-gun sequencing → huge amounts of data
454 GS-FLX generates 400-600 million bp per run, with read lengths between 400 and 500 bp.
How can this amount of information be understood quickly?
→ classification of sequences

Sequence Classification
Sequence classification (binning) is the process of separating sequence data using specific information → creating bins.
This information can be based on:
- Similarity, e.g. MEGAN, SORT-ITEMS
- Phylogeny, e.g. tools like CARMA
- Functional annotation, e.g. GO classifiers
MEGAN uses similarity searches with BLAST to bin sequences into taxa.

MEGAN
Metagenome: the collective genome of all the microorganisms in an environment (Handelsman et al., 1998).
Metagenomics is the study of the metagenome using high throughput sequencing.
The question: how to determine species composition in a metagenomic dataset?
• Sequence comparison with known sequences from a database, e.g. Genbank.
• Metagenomic datasets contain many sequences, so manual inspection is impossible → MEGAN is your assistant.

MEGAN
[Figure] (Huson et al., Genome Research, 2007)

Why use MEGAN?
Easy to work with on a desktop / laptop computer.
Extra things needed: Java, a BLAST server (e.g. Bioportal)
MEGAN gives a visualization of BLAST results:
• Study diversity
• Compare samples
• Contamination filtering
• Special gene of interest
• Extraction of sequences based on taxonomic information
Bioportal: http://www.bioportal.uio.no/

The basics of MEGAN
MEGAN uses BLAST, a database and a taxonomy file.
• BLASTN: nucleotides against a nucleotide database.
• BLASTX: translated nucleotides against a protein database.
• Which database? One of the many available databases, such as the NCBI non-redundant database, or your own custom database.
• Taxonomy: NCBI taxonomy, or your own custom taxonomy.
The BLAST output file is used to bin sequences into specific taxa using the LCA assignment algorithm.

The basics of MEGAN
• The LCA algorithm = "Lowest Common Ancestor" algorithm
"In this approach, every read is assigned to some taxon. If the read aligns very specifically only to a single taxon, then it is assigned to that taxon. The less specifically a read hits taxa, the higher up in the taxonomy it is placed. Reads that hit ubiquitously may even be assigned to the root node of the NCBI taxonomy."
(the MEGAN manual)

The basics of MEGAN
The default LCA parameters are:
• Min Support → default = 5 reads per taxon
• Min Score → default bitscore = 35
• Min score / length → bitscore divided by the read length; default = 0
• Top percentage → the maximum percentage by which the score of a hit may fall below the best score achieved for a given read; default = 10
• Win score → if a win score is set, then, for a given read, if any match exceeds the win score, only matches exceeding the win score ("winners") are used to place the given read; default = 0

The basics of MEGAN
[Figure]
No hits: none of the BLAST hits reached the minimum bitscore.
Not assigned: the sequence did not have enough hits to be classified to a taxon (Min support & top percentage).

Playing with min support
[Figure: two taxonomic trees of the same dataset with the number of reads assigned to each taxon. One tree is collapsed to a few well-supported nodes (root 10, cellular organisms 56, Bacteria 93, Proteobacteria 215, Gammaproteobacteria 20, Mariprofundus ferrooxydans 22, Methanosarcina 10, Not assigned 107); the other is resolved down to species and strain level, with many taxa holding only 1 to 7 reads. The panels are labelled "= 10" and "= 1", presumably the min support values.]

An example
Comparison between reads assigned to Phosphorus metabolism and Nitrogen metabolism.
[Figure: taxonomic tree with read counts per taxon for the two categories, e.g. Bacteria 719, Proteobacteria 880, Gammaproteobacteria 293, Deltaproteobacteria 82, Epsilonproteobacteria 73, Endoriftia 69, Sulfurimonas 44.]
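The LCA assignment and its parameters, as described in the slides, can be illustrated with a small Python sketch. This is not MEGAN's implementation: the toy taxonomy, the function names (`lineage`, `lowest_common_ancestor`, `assign_read`, `bin_reads`) and the example reads are all hypothetical, and only Min Score, Top percentage and Min Support are modelled (Min score / length and Win score are left out). Min Support is applied here by counting reads assigned directly to a taxon and moving under-supported reads to "Not assigned", which is an assumption based on the wording of the slides.

```python
from collections import Counter

# Hypothetical toy taxonomy (child -> parent); "root" has no parent.
PARENT = {
    "Bacteria": "root",
    "Archaea": "root",
    "Proteobacteria": "Bacteria",
    "Gammaproteobacteria": "Proteobacteria",
    "Deltaproteobacteria": "Proteobacteria",
    "Methanosarcina": "Archaea",
}


def lineage(taxon):
    """Return the path from the root down to the given taxon."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path[::-1]


def lowest_common_ancestor(taxa):
    """Return the deepest node shared by the lineages of all taxa."""
    paths = [lineage(t) for t in taxa]
    lca = "root"
    for level in zip(*paths):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca


def assign_read(hits, min_score=35.0, top_percent=10.0):
    """Place one read, given its BLAST hits as (taxon, bitscore) pairs."""
    # Min Score: ignore hits below the bitscore threshold.
    kept = [(taxon, score) for taxon, score in hits if score >= min_score]
    if not kept:
        return "No hits"
    # Top percentage: keep hits within top_percent of the best bitscore.
    best = max(score for _, score in kept)
    cutoff = best * (1.0 - top_percent / 100.0)
    return lowest_common_ancestor([t for t, s in kept if s >= cutoff])


def bin_reads(reads, min_support=5, **lca_params):
    """Assign every read, then apply Min Support (simplified, see lead-in)."""
    assignment = {rid: assign_read(hits, **lca_params) for rid, hits in reads.items()}
    support = Counter(assignment.values())
    return {
        rid: taxon if taxon == "No hits" or support[taxon] >= min_support else "Not assigned"
        for rid, taxon in assignment.items()
    }


# Hypothetical example reads.
reads = {
    "r1": [("Gammaproteobacteria", 120.0), ("Deltaproteobacteria", 115.0)],
    "r2": [("Gammaproteobacteria", 200.0), ("Methanosarcina", 90.0)],
    "r3": [("Methanosarcina", 30.0)],
}
print(bin_reads(reads, min_support=1))
print(bin_reads(reads, min_support=2))
```

In this toy data, read r1 hits two classes almost equally well and so is placed on their common ancestor, while r2 has one clearly best hit and stays at that taxon. Raising `min_support` moves reads from sparsely populated taxa to "Not assigned", which is the effect shown in the "Playing with min support" slide.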
