Machine Learning – En Introduktion Josefin Rosén, Senior Analytical Expert, SAS Institute

Machine Learning – En Introduktion Josefin Rosén, Senior Analytical Expert, SAS Institute

Machine learning – en introduktion Josefin Rosén, Senior Analytical Expert, SAS Institute [email protected] Twitter: @rosenjosefin #SASFORUMSE Copyright © 2015, SAS Institute Inc. All rights reserved. Machine learning – en introduktion Agenda Vad är machine learning? När, var och hur används machine learning? Exempel – deep learning Machine learning i SAS Copyright © 2015, SAS Institute Inc. All rights reserved. Machine learning – vad är det? Wikipedia: Machine learning, a branch of artificial intelligence, concerns the construction and study of systems that can learn from data. SAS: Machine learning is a branch of artificial intelligence that automates the building of systems that learn from data, identify patterns, and make decisions – with minimal human intervention. Copyright © 2015, SAS Institute Inc. All rights reserved. Vad är vad egentligen? Statistics Pattern Computational Neuroscience Recognition Data Science Data Mining Machine AI Learning Databases Information Retrieval Copyright © 2015, SAS Institute Inc. All rights reserved. Machine learning – vad är det? ”Komplicerade metoder, men användbara resultat” Copyright © 2015, SAS Institute Inc. All rights reserved. När används machine learning? När modellens prediktionsnoggrannhet är viktigare än tolkningen av modellen När traditionella tillvägagångssätt inte passar, t ex när man har: fler variabler än observationer många korrelerade variabler ostrukturerad data fundamentalt ickelinjära eller ovanliga fenomen Copyright © 2015, SAS Institute Inc. All rights reserved. Beslutsträd Träningsdata Regression Neuralt nätverk Copyright © 2015, SAS Institute Inc. All rights reserved. Var används machine learning? Några exempel: Rekommendationsapplikationer Fraud detection Prediktivt underhåll Textanalys Mönster och bildigenkänning Den självkörande Google-bilen Copyright © 2015, SAS Institute Inc. All rights reserved. Statistics Pattern Computational Neuroscience Recognition Data Science Data Mining Machine AI Learning Databases Information Retrieval Copyright © 2015, SAS Institute Inc. All rights reserved. Machine Learning SUPERVISED UNSUPERVISED SEMI-SUPERVISED LEARNING LEARNING LEARNING Regression A priori rules Prediction and LASSO regression Clustering classification* Logistic regression k-means clustering Ridge regression Clustering* Mean shift clustering Decision tree Spectral clustering EM Gradient boosting Kernel density TSVM Random forests Don’t Sometimes NeuralKnow networks y estimation Manifold SVM Nonnegative regularizationknow y Naïve Bayes matrixknow y Autoencoders factorization Multilayer perceptron Neighbors Restricted Boltzmann PCA Data Mining Data Gaussian machines Kernel PCA processes Sparse PCA Singular value decomposition SOM *In semi-supervised learning, supervised prediction and classification algorithms are often combined with clustering. Copyright © 2015, SAS Institute Inc. All rights reserved. Deep learning Deep learning – att använda neurala nätverk med fler än två gömda lager Används framgångsrikt bl a inom mönsterigenkänning Bra på att extrahera features från ett dataset Copyright © 2015, SAS Institute Inc. All rights reserved. MNIST träningsdata 784 variabler bildar en 28x28 digital grid 784-dimensionell inputvektor X = (x1,…,x784) Varierande gråskala från 0 till 255 60,000 träningsbilder med label 10,000 testbilder utan label Copyright © 2015, SAS Institute Inc. All rights reserved. MNIST exempel Träna en stacked denoising autoencoder Extrahera representativa features från MNIST data Jämföra med PCA, två PCs Copyright © 2015, SAS Institute Inc. All rights reserved. Stacked denoising autoencoder Uncorrupted Output Features Target Layer Hidden Neurons h5 h4 Hidden Neurons Hidden layers h3 Hidden Neurons Extractable Features h2 Hidden Neurons h1 Hidden Neurons Partially Corrupted Input Features Input Layer Copyright © 2015, SAS Institute Inc. All rights reserved. Record ID Hidden Unit 1 Hidden Unit 2 1 0.98754 0.32453 2 0.76854 0.87345 3 0.87435 0.05464 ⋮ ⋮ ⋮ h3 Hidden Neurons Extractable Features h2 Hidden Neurons h1 Hidden Neurons Partially Corrupted Input Features Input Layer Record ID Pixel 1 Pixel 2 Pixel 3 Pixel 4 Pixel 5 Pixel 6 Pixel 7 Pixel 8 Pixel 9 Pixel 10 … 1 0 0 0 0 0 5 8 11 6 3 … 2 0 0 0 0 10 20 45 46 36 24 … 3 0 25 37 32 40 64 107 200 67 46 … ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱ Copyright © 2015, SAS Institute Inc. All rights reserved. Feature extraction – denoising autoencoder Copyright © 2015, SAS Institute Inc. All rights reserved. Feature extraction - PCA Copyright © 2015, SAS Institute Inc. All rights reserved. SAS machine learning algoritmer Neural networks Expectation maximization Decision trees Multivariate adaptive regression Random forests splines Associations and sequence Bayesian networks discovery Kernel density estimation Gradient boosting and bagging Principal components analysis Support vector machines Singular value decomposition Nearest-neighbor mapping Gaussian mixture models K-means clustering Sequential covering rule building DBSCAN Model ensembles Self-organizing maps Recommendations Local search optimization techniques such as genetic algorithms Copyright © 2015, SAS Institute Inc. All rights reserved. SAS-produkter som använder machine learning SAS Enterprise Miner SAS Text Miner SAS In-Memory Statistics for Hadoop SAS Visual Statistics SAS/STAT SAS/OR SAS Factory Miner Copyright © 2015, SAS Institute Inc. All rights reserved. Algoritm SAS EM-noder SAS procedurer Supervised learning algoritmer learning Supervised Regression High Performance Regression ADAPTIVEREG LARS GAM Partial Least Squares GENMOD Regression GLMSELECT HPGENSELECT HPLOGISTIC HHPQUANTSELECT HPREG LOGISTIC QUANTREG QUANTSELECT REG Beslutsträd Decision Tree ARBORETUM High Performance Tree HPSPLIT Random forest High Performance Tree HPFOREST Gradient boosting Gradient Boosting ARBORETUM Neurala nätverk AutoNeural HPNEURAL DMNeural NEURAL High Performance Neural Neural Network Support vector machine High Performance Support Vector Machine HPSVM Naïve Bayes HPBNET* Neighbors Memory Based Reasoning DISCRIM *PROC HPBNET kan lära sig olika nätverksstrukturer (naïve, TAN, PC, och MB) och automatiskt välja den bästa modellen Copyright © 2015, SAS Institute Inc. All rights reserved. Unsupervised learning algoritmer Algoritm SAS EM-noder SAS procedurer A priori rules Association Link Analysis K-means klustring Cluster FASTCLUS High Performance Cluster HPCLUS Spektral klustring Custom lösning genom Base SAS och procedurerna DISTANCE och PRINCOMP Kernel density estimation KDE Kernel PCA Custom lösning genom Base SAS och procedurerna CORR, PRINCOMP och SCORE Singular value decomposition HPTMINE IML Self organizing maps SOM/Kohonen Copyright © 2015, SAS Institute Inc. All rights reserved. Semi-Supervised learning algoritmer Algoritm SAS EM-noder SAS procedurer Denoising autoencoders HPNEURAL NEURAL Copyright © 2015, SAS Institute Inc. All rights reserved. Varför har machine learning fått ökat intresse? Big data Beräkningsresurser Kraftfulla datorer“Space is big. You just won't believe how Billig datalagringvastly, hugely, mind-bogglingly big it is” Douglas Adams i ”Liftarens guide till galaxen” Copyright © 2015, SAS Institute Inc. All rights reserved. Copyright © 2015, SAS Institute Inc. All rights reserved. Mer läsning • White papers http://www.sas.com/en_us/whitepapers/machine-learning-with-sas-enterprise-miner-107521.html http://support.sas.com/resources/papers/proceedings14/SAS313-2014.pdf • SAS-länkar http://www.sas.com/en_us/insights/analytics/machine-learning.html http://www.sas.com/en_us/insights/articles/analytics/introduction-to-machine-learning-five-things-the-quants-wish-we- knew.html • SAS Data Mining Community https://communities.sas.com/community/support-communities/sas_data_mining_and_text_mining/ • Big Data Matters Webinar Series: www.sas.com/bigdatamatters Copyright © 2015, SAS Institute Inc. All rights reserved. Tack! Copyright © 2015, SAS Institute Inc. All rights reserved. .

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    26 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us