The SHOGUN Machine Learning Toolbox (And Its R Interface)

The SHOGUN Machine Learning Toolbox (And Its R Interface)

Introduction Features Code Example Summary The SHOGUN Machine Learning Toolbox (and its R interface) S¨orenSonnenburg1;2, Gunnar R¨atsch2,Sebastian Henschel2, Christian Widmer2,Jonas Behr2,Alexander Zien2,Fabio De Bona2,Alexander Binder1,Christian Gehl1, and Vojtech Franc3 1 Berlin Institute of Technology, Germany 2 Friedrich Miescher Laboratory, Max Planck Society, Germany 3 Center for Machine Perception, Czech Republic fml Introduction Features Code Example Summary Outline 1 Introduction 2 Features 3 Code Example 4 Summary fml Introduction Features Code Example Summary Introduction What can you do with the SHOGUN Machine Learning Toolbox [6]? Types of problems: Clustering (no labels) Classification (binary labels) Regression (real valued labels) Structured Output Learning (structured labels) Main focus is on Support Vector Machines (SVMs) Also implements a number of other ML methods like Hidden Markov Models (HMMs) Linear Discriminant Analysis (LDA) Kernel Perceptrons fml Introduction Features Code Example Summary Support Vector Machine Given: Points xi 2 X (i = 1;:::; N) with labels yi 2 {−1; +1g Task: Find hyperplane that maximizes margin Decision function f (x) = w · x + b fml Introduction Features Code Example Summary SVM with Kernels SVM decision function in kernel feature space: N X f (x) = yi αi Φ(x) · Φ(xi ) + b (1) i=1 | {z } =k(x;xi ) Training: Find parameters α Corresponds to solving quadratic optimization problem (QP) fml Introduction Features Code Example Summary Large-Scale SVM Implementations Different SVM solvers employ different strategies Provides generic interface to 11 SVM solvers Established implementations for solving SVMs with kernels LibSVM SVMlight More recent developments: Fast linear SVM solvers LibLinear SvmOCAS [1] Support of Multi-Threading ) We have trained SVMs with up to 50 million training examples fml Introduction Features Code Example Summary Large Scale Computations Training time vs sample size fml Introduction Features Code Example Summary Large Scale Computations Training time vs sample size fml Introduction Features Code Example Summary Large Scale Computations Training time vs sample size fml Introduction Features Code Example Summary Large Scale Computations Training time vs sample size fml Introduction Features Code Example Summary Various Kernel Functions Kernels for real-valued data (a) Linear (b) Polynomial (c) Gaussian ) What if my data looked like... fml Introduction Features Code Example Summary Various Kernel Functions Kernels for real-valued data (d) Linear (e) Polynomial (f) Gaussian ) What if my data looked like... fml Introduction Features Code Example Summary Various Kernel Functions ...this?! TTTTGGAAGGAACGGAGGATCCGGATAGGCTTCATCCAAATATTCGATTATTGCCAAGCTTTCTGTCAGTGATAAGCCATTTATCACTAAAGTTGGAACTTTTTTCGCTGGGTTGTGCTTGACGAACTCTGCATTGTTCTGAAATAC AGCTATATTTCTATGGAAGTGGGAGTAGGGCAGGGGTGGATGGCAAATTGCCGGTTTGCCGGAAATTTTCAATTCCGGCAATTTACCGATTTACCGTTTGCCGAATTTTACCGATTAAAATTGAAATTCTGAAATTTCCAGAAAAA AATGTGCAAAACCACAAGTTGCCGAAAGTTTTCGGCAAATTGTGTTTTTCCGGCAAATCAGCGATTTGCCGTTCTGCCGAATTTTGCCAGAAATTTTCAATTCCGGCAATTTTCCAATTTACCGGAAATTTTCAATTCTAGCAATTT GCTGGAAATTTCAATTCAGGCAATATGCCGAATTTGCCGAAAATTTTCAATTCCGGCAATTTTCCAATTTGCCCGAAAATTTTGATTTTCATTAAATTTTCGATTTGCCGGAAATTGTAATTTTGGGCAAATTGCCGATTTACCGTTG CCAAAAGTTTTCGACAAATTGTGGTTTTTTCCGGCAAATCGGCAATTTGCCGCTAGGCCGAATTTTGCCGGAAATTTAAATTTCGGCACTTTGCCGATTTGCCGAAAATTTTCAATCCCGGCAATTTGCCGCTCTGGCGAATTTTGC CGGAAATTTTCAATTCCGGCAATTTTCCAATTTGCCGGAAATTTTCTAACCTTGGATTCTTCAGAAAATAAGTCGATTGGCCGGTATTCGTAGTCGATATTTTTCAGAGCAAGTGCTATAAATCAGTATTTTATAGAATTTCCAAGCTA AAATTACCGATGCGAACTCTCCAGGCACACGACGAACGCCAATATGAATATAAAATCGGTTTTGCCATGGAGTTAGAAGCTTCTAAAGAAGAAAATTGAAATATTTAAGCAATAAATGACATTTTGAGAAGAAATTAAATATGAAC ACGTGGGAAAAATCTTCAAAAACCCTAATTTTTCAGTTAAAAACTACTTTCTCCGAATAAAACCTCCATTTTCCTCAATTTTCAGCACTTTTTTATGTCAATTCCCACTTAAAATTCCCCTTTAACGGTCATTTTTCCTAAACTGTGC AGTTTCGCGTCGCCGTGTTTGCACACATTCCGTTTGATGCGCAGGTTTCCACGTTGTCCTCTTTTCTCCATTTTCTGTCTCTTTTAACGTGGATTTACCGATTTTTATCTCTGAAATTCTTAATTTTTATGGTATTTTTCGTAGTTTCTTT ATCAATTTAATCGAAATTCCAAGTTTTTTCATTGATTTTTTGCATTTTTGTCTGTTTTTTTTAAACATTCTTATATATCCACGTGTCATTTCCATAATTTCAGAACCTTCATTCACGCCGAGGAGCAGTTTTTGACTGGTTACCGACTATA AATCGGATATCATGTCGCTCCTTATCAAATTTGAATCAAAATCCGCTAGGGTTAAAGGTAAATATTTTGGTTTTTGGCATTATTTTTCAATAATTTCCCAATTTTCCAGGAATTTCGTTCCACCCAACTCGGCCATGGGTGCTCACATC GCTGCACAGTGGAGTCATTCAACTCTGGGACTACCGAATGTGCGTTCTTCTCGAGAAGTTCGACGAGCACGATGGTCCAGTGAGAGGAATTTGCTTCCATCATGATCAGCCGATCTTTGTGTCTGGAGGAGATGATTATAAGATTA AAGTGTGGAATTACAAGCAGAAACGCTGTATTTTCACACTTCTCGGACATTTGGATTATATTCGTACCACTTTCTTCCATCATAAGTATCCATGGATCATTTCTGCGTCCGACGATCAAACGGTTCGGATTTGGAATTGGCAGAGCC GAAACTCGATCGCCATCCTCACCGGACACAATCACTACGTCATGTGTGCTCAATTCCACCCAACCGAAGATCTTGTGGCTTCTGCCTCCTTGGATCAGACCGTCCGTGTTTGGGATATTTCTGGACTTCGGAAGAAGCAAATGCCT GGTGGAGGTGCTCCAAGCCGGCCAACCGGAGGGCAACAAGCTGAGCTATTCGGGCAACCTGATGCAGTTGTGAAGCATGTGCTAGAGGGACACGATCGTGGAGTCAACTGGGTTGCATTCCATCATACCAATCCAATTCTAGTC TCGGGATCCGATGATCGTCAAGTCAAGATTTGGCGTTACAATGAGACGAAAGCCTGGGAGCTCGACTCGTGCCGTGGACACTACAATAACGTCTCATCTGTGATCTTCCATCCAAATGCTGACTTGATTCTTTCCAACTCGGAGGA TAAGTCGATTCGTGTGTGGGATATGCAGAAGCGTACCAGTCTTCACGTTTTCCGACATGAAAATGAGCGTTTCTGGGTTTTGGCCGCCCATCCAAGCTTGAACATGTTCGCCGCAGGACATGACAATGGAATGGTCGTTTTTAAAA TTCAAAGAGAACGCCCGGCTTATTGTGTCAGTGATAATTTGGTTTTCTATGTGAAGGGACAACAAATTCGCAAGCTCGATTTGACAACCAAGTGAGTAGTTTGATGCCAAGAAAAGTGTGCCGCAAAATGCATTTTTGCACAGAA TTAGCGTTTTTTTTTTGCGAAAAAAAGCGTGCGAGAGTTGCGAAAATGCATTTTCCGGCGCACTGAATCATGTTTTTGTGGAGTCTCAGTAGTTTTTAAATGAAAAACTTTAAATGTGCCCCATTAAGCGAATTTCGTATTTAGAGC CTCCAAAAAATCGGAAAAAAGGCGTAATTTCAATAGCTAAATTCTCCAACTTCCACTGAAAATAGCAGAGTTTTTTTTGAGATTTATCAAATTTTTAGCTTTTCAAAAAATGTTTTCTGCTTTTTACGAACAAATCTTTTGTTAAAC AATGAAAATCGAATTTAAAATGACTAAAATTAACTTTTTCCGTCCACTGGTTGTGAAATGGGTTGAAATTGAAGAAATCAACGGGATTTTTCTGATTTTCTGAATATTTTTCTATTAGAAATCGGTTTTAAACCATTTTTTTTTACTAA AACCGATTTTTAATAGGAGAATATTTAGAAAATGAGAAAAATCCCGTTGATTTCTTCAATTTCAACCCATTTCACAACCAGTGGACGAAAAAAGTTAATTTTAGTCATTTTAAATTCGATTTTCACTGTTTAACAGAAGGTTTGTTC GTAAAATGCAGGAAAAAAATTTTTAAAGCTAAAAATTTGATAAATCTAAAAAATCTGTTATTTTCAGTACGAAGTTGGAGATCTTAGCTAAATTTAGAGCTTTTCTCCGGAAATTTTGAGGACTCTCACTACAAAATTCGTTATTTT TTGTACATTCAGAGGGGATGTACAAGCAAAAACCCCTCTAAAATCAAATTTTCAGCAAGGATGTTGCTCTGTGCAAGCTCCGTTACCCACAACCATTCATGCAACCATACTACTCTCTTTCCTTCAATCCTGCCGAGGGAACCTTC CTTCTCACAAGCCGTACTCACAATAAGGATCTATGCGCCTTTGAGCTGTACAAAGTTGCCAGTAACTCTGATGGAACCACTGAAGCTGCGTGCGTCAAGAGCACCGGAATCAATGCACTTTGGGTTGCTCGTAATCGTTTTGCAG TTCTTGACAAGGCTCAGAATGTGTCGTTGCGTGATTTGACGAACAAGGAACTTCGCAAACTCGAGAATATCAACACGGCTGTCGACGATATTTTCTATGCCGGAACTGGAATGCTCCTTTTGAGAAATGATGATGGGCTTCAACT GTTTGATGTTCAGCAAAAGATTGTGACTGCTTCAGTGAAAGTGTCGAAAGTTCGTTATGTGATTTGGAGCAAGTCTATGGAATTCGCAGCTCTTCTTTCAAAGCACACGCTGACCTTGGTCAACCGTAAATTGGAGATTCTGTGTA CCCAACAGGAGTCGACTCGTGTGAAGAGTGGAGCTTGGGATGATGACTCGGTATTCCTCTACACCACTAGCAATCACATCAAATATGCAATCAATTCTGGAGATTGTGGAATCGTTCGTACTCTGGATCTTCCTCTGTACATTCTG GCTATTCGAGGAAACGTGCTCTACTGCTTGAATCGTGACGCTACACCAGTTGAAGTTCCAATTGATAATTCTGATTACAAGTTTAAGCTTGCTTTGATTAACAAACGAATCGATGAAGTTGTCAACATGGTTCGATCCGCAAATCTC GTCGGTCAATCCATCATTGGATATCTCGAGAAAAAGGGTTACCCAGAGATTGCTCTTCACTTTGTCAAGGATGAGAAGACTCGCTTCGGGTTGGCAATTGAATGTGGAAACCTGGAAATCGCGTTGGAAGCCGCCAAGAAGCTC GACGAGCCAGCAGTCTGGGAGGCTCTCGGAGAGACTGCTCTTCTTCAAGGCAATCATCAAATTGTTGAGATGAGCTATCAACGAACGAAAAACTTTGAGAAGCTCTCGTTTTTGTATTTTGTTACTGGAAATACGGATAAGCTTG TCAAGATGATGAAGATTGCACAGGCTAGAAATGATGCTCACGGACATTTCCAGACAGCTCTCTACACTGGTGATGTTGAGGAGAGAGTTAAAGGTAAAAATGTGCAGCTGGATTTCAGAAGGGGGCAAAAGTCAAAGTTTTATT GATTTTTAGCTCCTCTTTGCGTATTTTTTTCGTGAATTTCTGCAAATTCTAGACGATTTTGAAATTATTGAACCTAAAAATGGTCCAAATCTACCGAATATCGTCATGAGATTTCTCAAAAAAAAATTTGGAAATCTACCAATTTTCAG CTAAAATCTCATATTTTGCCAACTTTTCAGTTTCGCAGCTGACGGAACTTGATTTTTATTCAATTATCGCTATTTAATACGTATTTTGGTCGTTTTAACGAATAAAAGTGTGATAGACGACCAAAATATGTATTAAAAAGTAATAATCA ATGACACACTAAAGTTGGCAAAATTTGAGATTTTAGCTGAAAATAGGTTTTTTTCCAAATCTCAAAACCGCCATAACTTTTTTTGTGAGTTTTCAGAAACTTTTCAGGATTTTTCATGACGAAATTCGTTACTTTTTGGGTCATTTTC AACAAAATAAAGTATAAAGTTAATTTTTCAGTGCTCCGAAACTGTGGACAAACTTCTCTGGCCTACCTCGCCGCAGCCACTCATGGATATTCTGCTGAAGCTGAGGAGTTGAAGGCTGAACTCGAGAGTCGTCAGCAACCAATCC CACCAGTTGATCCAAATGCTCGCCTCCTTGTCCCACCACCACCAGTTGCTCGTTTAGAAGAGAATTGGCCACTTCTTGCCTCGGCTCGAGGAACTTTCGATGCACAGCTTCTTGGATTAGGAGGACAATCTGCTCCAACTAATGTT GGCGGTGTGAAGCCTGCTGCTGCTGCCTTTGCTGTTATGGATGATGATGTTAGTTTATTGATTTCTGAGCTATCTGAGCTATAAAATATTAATTTTTGCAGGATGGTGATGTTGGAAATGAAGCATGGGGAGATGATGAGTATCTTGT CGGAGAAGATGGTGAACTTGATGTTGATGAAGGAGAAGGACCAGTTGATGGAGATGAGGAAGGTGGATGGGATGTGGATGATGATCTGGCTCTTCCTGATGTTACCGATGATCAAGGTGGAGATGATGATGAGGAAGTGGTTCC TAACCCAGCTCCAGCTGTCTCATCGGAATGGCCGAATGTTTCTCGTCTTGCTGCTGATCATGTTGCAGCTGGTTCATTCGGAACTGCCATCAAGCTTCTTCATGATACAATTGGAGTCGTTGAAGCAGCACCATTCAAGGATGTATT CGTGAAGGCTTATGCAGCTTCTAGACTCTCACATCGTGGATGGGGAGGACTCGGACCAGCTGGACCAGTGTCCATTCATCCGGTTCGTAATTTCCAGGAGGATAAGAATCATCTTCCAGTCGCTGCGTTCAAACTTTCTCAACTTG CAAAGAAACTGCAAAAGGCTTATCAAATGACGACGAATGGAAAGTTTGGAGATGCTGTCGAGAAACTGCGGGAGATTCTTCTATCTGTTCCACTGATCGTTGTGTCTTCGAAACAAGAAGTCGCCGAAGCGGAGCAGCTCATC ACAATCACACGGGAATATCTTGCAGCACTTCTCCTTGAAACTTATCGAAAAGATTTGCCAAAAACGAATTTGGAGGATGCGAAACGCAACGCCGAACTAGCTGCCTACTTTACACACTTTGAGCTTCAACCAATGCATCGTATCC TGACACTGAGATCTGCTATCAATACTTTCTTCAAAATGAAGCAAATGAAGACTTGTGCATCACTTTGCAAACGTCTATTGGAGCTCGGTACGTGCTTGCGTTGTTGATTTTTTTAAATTTTATTTCTAATTCTTAATTTTTTTTTCCAG CTCCAAAACCAGAAGTTGCTGCTCAAATCCGAAAAGTGCTGACTGCCGCCGAGAAGGATAACACTGACGCGCATCAGTTGACCTATGATGAGCATAATCCATTCGTCGTGTGCTCTCGTCAATTTGTGCCTCTTTACCGTGGACG TCCACTCTGTAAATGCCCGTATTGTGGAGCATCGTACTCTGAAGGGCTCGAGGGAGAAGTGTGCAACGTGTGCCAGGTTGCAGAAGTTGGAAAGAATGTTCTTGGGCTCCGAATCTCAACTCTGAAGAGCTAAAAGTTGTGAAA TATGATAATTTTTTCGTTTTTTGTGTGATATTAATTATTGTAATTTATTTGCCTTGTTCTTCCACAATTCCCATTGCTTTTGGTCCCTTCATATTATTTTCCCGTTGGGCGGCGGTTCATGTTTTCTATTTAAATTATGTGTTTTTAAAAATT

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    20 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us