Machine Learning

MACHINE LEARNING
Azure Reference Architecture

Zoiner Tejada
[email protected]
@zoinertejada
• Solliance Founder, CEO
• Author
• Microsoft MVP (Microsoft Azure)
• Azure Elite, Azure Insider
• CQURE Certified Security Professional
• Google Developer Expert (GDE)

AGENDA
You will learn:
• the key tools in the toolbox (data transformation, supervised learning modules, unsupervised learning modules)
• the value that Azure ML brings to the larger solution (such as classification, clustering and predictive analytics)
• how you train your model (if you have to at all) and how to validate your model
• how Azure ML integrates with your data pipeline

© DEVintersection. All rights reserved. http://www.DEVintersection.com

INTRO TO DATA SCIENCE
Keepin’ it stats-light

WHAT IS DATA SCIENCE
• Practice of obtaining insights from data
• Applies equally to small data and BIG data
• Structured and unstructured
• Multidisciplinary
  • Stats
  • Math
  • Operations
  • Signal processing
  • Linguistics
  • Database / Storage
  • Programming
  • Machine Learning
  • Scientific Computing

WHY NOW?
• Data has become a critical asset
• As volumes increase, it is getting harder to tease information and insight out of the data
  • Companies with more than 1k employees store an average of 235 TB of data
  • 50B connected devices expected by 2020
• Analyst expectations, such as those from Gartner, say it’s worth it
  • Organizations that invest in modern data infrastructure will financially outperform their peers by up to 20%
• Customers now expect data sophistication
  • Think “you might also like” on Amazon or Netflix’s recommended movies

ANALYTICS SPECTRUM
• Descriptive
• Diagnostic
• Predictive
• Prescriptive

DESCRIPTIVE ANALYTICS
• What is happening?
• Example
  • For a retail store, identify the customer segments for marketing purposes

DIAGNOSTIC ANALYTICS
• Why is it happening?
• Example
  • Understanding what factors are causing customers to leave a service (churn)

PREDICTIVE ANALYTICS
• What will happen?
• Example
  • Identify customers who are likely to upgrade to the latest phone

PRESCRIPTIVE ANALYTICS
• What should be done?
• Example
  • What’s the best offer to give to a customer who is likely to want that latest phone?

PROCESS
• Define the business problem
• Acquire and prepare data
• Develop the model
• Deploy the model
• Monitor model performance & tune

HOW DO MACHINES LEARN?
• The learning process is the same for humans and machines
• Divided into three components
  • Data input: use observation, memory, and recall to provide a factual basis for further reasoning
  • Abstraction: translate the data into broader representations
  • Generalization: use the abstraction to form a basis for action

KEY ML TERMS
• Knowledge representation
  • The formation of logical structures that assist with turning raw data into meaningful insights
• Observations/Examples
  • The raw data inputs, typically thought of as a tuple
• Features
  • An attribute or column in the example
• Model
  • How the computer summarizes the raw inputs
• Training
  • Fitting a particular model to a dataset
• Over-fitting
  • A model that performs well on the training dataset, but poorly when tested with other data
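The difference between training and over-fitting can be made concrete with a toy comparison. The sketch below is illustrative only: the data points, the function names `memorizer_predict` and `threshold_predict`, and the hand-set cutoff of 5 are assumptions and do not come from the slides. It compares a model that memorizes the training examples (1-nearest-neighbor) with a much simpler single-threshold model, scoring both on the training data and on unseen data.

```python
# Toy illustration of over-fitting; all data and names here are hypothetical.
# The underlying rule is "label 1 when x >= 5"; two training labels are noisy
# (x=3 is labeled 1 and x=7 is labeled 0).
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]

# Unseen examples that follow the underlying rule exactly.
test = [(0.5, 0), (1.4, 0), (2.8, 0), (3.2, 0), (4.4, 0),
        (5.6, 1), (6.8, 1), (7.2, 1), (8.4, 1), (9.5, 1)]


def memorizer_predict(x):
    """1-nearest-neighbor: copy the label of the closest training example."""
    nearest = min(train, key=lambda example: abs(example[0] - x))
    return nearest[1]


def threshold_predict(x, cutoff=5.0):
    """A much simpler model: one hand-set cutoff (an assumption, not learned)."""
    return 1 if x >= cutoff else 0


def accuracy(predict, examples):
    """Fraction of examples whose predicted label matches the true label."""
    hits = sum(1 for x, y in examples if predict(x) == y)
    return hits / len(examples)


memorizer_train = accuracy(memorizer_predict, train)
memorizer_test = accuracy(memorizer_predict, test)
threshold_train = accuracy(threshold_predict, train)
threshold_test = accuracy(threshold_predict, test)

print("memorizer: train", memorizer_train, "test", memorizer_test)
print("threshold: train", threshold_train, "test", threshold_test)
```

On this toy data the memorizer is perfect on the training set (1.0) but drops to 0.6 on the unseen examples, because it has also memorized the two noisy labels. The simpler threshold model scores 0.75 on the training set and 1.0 on the unseen examples.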
COMMON TECHNIQUES
• Classification
• Clustering
• Regression
• Simulation
• Content Analysis
• Recommendation

SUPERVISED VS. UNSUPERVISED
• Refers to the requirements of the algorithm
  • Does it need to be “trained” on a set of data before it can provide conclusions?
• Supervised algorithms need to be carefully trained on labeled examples before they can be shown other examples and provide results
• Unsupervised algorithms do not require training on labeled examples; they provide results given the data at hand

CLASSIFICATION ALGORITHMS
• Classify people or things into groups
• They classify (or predict) a “label” for an example
• The set of possible outcomes is typically known in advance
• Tools include
  • Decision trees
  • Logistic regression
  • Neural networks
• Supervised learning
• Can provide not just the classification, but also how a particular classification was reached

CLUSTERING ALGORITHMS
• Dividing a set of examples into homogeneous groups
• While they can also predict a “label” for an example, they are applied when the labels are not known in advance
  • In other words, you are discovering what groups exist in the data
• Tools include
  • K-means clustering
• Unsupervised learning

PATTERN DETECTION ALGORITHMS
• Identify frequent associations in the data
• Tools include
  • Association rules
• Unsupervised learning

REGRESSION ALGORITHMS
• Predict numerical outcomes
• Inputs may be categorical or numerical, but the output is typically a number
• Tools include
  • Linear regression
  • Neural networks
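As a minimal sketch of the linear regression tool named above, the following fits a straight line to one numeric input using ordinary least squares. The data values and the helper names `fit_line` and `predict` are hypothetical, chosen only to show the mechanics; this is not how Azure ML implements its regression modules.

```python
# Ordinary least squares for a single input feature; the data is hypothetical.
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Unnormalized co-variation of x with y, and variation of x with itself.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Numerical prediction for a new input value."""
    return slope * x + intercept


xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]  # these points lie exactly on y = 2x + 1

slope, intercept = fit_line(xs, ys)
prediction = predict(6, slope, intercept)
print(slope, intercept, prediction)
```

Because the sample points lie exactly on the line y = 2x + 1, the fit recovers a slope of 2 and an intercept of 1, and the prediction for x = 6 is 13. With real, noisy data the fitted line would only approximate the points.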
SIMULATION
• Model and optimize real-world processes
• Offers the opportunity to test many scenarios by adjusting model variables
• Tools include
  • Monte Carlo simulations
  • Markov chain analysis
  • Linear programming

CONTENT ANALYSIS
• Surface information and insights from content like text, audio and video
• Tools
  • Pattern recognition
  • Text mining
  • Image recognition
  • OCR

RECOMMENDATION
• Identify beneficial relationships and recommend items based on similarity between entities or between entities and items
• Common example is Amazon’s product recommendations
• Tools used
  • Collaborative filtering (similarity between users or between items)
  • Content analysis
  • Affinity (e.g. market basket analysis)

ENSEMBLE MODELS
• The latest approaches have realized
  • You can have a set of individually weak algorithms
  • Use them together to process data
  • The result can be far superior to even the best lone algorithm
• Tools used
  • Decision Forests (the data is split amongst many decision trees)
  • Boosted Decision Trees (the data in error is flowed through a chain of trees)

SUMMARY
• Defined data science and key machine learning terminology
• Described the data science process
• Enumerated the types of analytics
• Reviewed the many categories of algorithms
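The ensemble idea from this section, that several individually weak models can be combined into a stronger one, can be illustrated with a simple majority vote. This is a deliberately simplified stand-in: it is not a decision forest or a boosted decision tree, and the labels, model names and predictions below are made up for illustration.

```python
from collections import Counter

# Hypothetical true labels for six examples.
truth = [1, 0, 1, 1, 0, 0]

# Hypothetical predictions from three weak models; each is wrong on two examples.
weak_predictions = {
    "model_a": [0, 1, 1, 1, 0, 0],  # wrong on examples 0 and 1
    "model_b": [1, 0, 0, 0, 0, 0],  # wrong on examples 2 and 3
    "model_c": [1, 0, 1, 1, 1, 1],  # wrong on examples 4 and 5
}


def majority_vote(votes):
    """Return the label chosen by most of the models."""
    return Counter(votes).most_common(1)[0][0]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


# Combine the models example by example.
ensemble = [majority_vote(votes) for votes in zip(*weak_predictions.values())]

individual_scores = {name: accuracy(preds, truth)
                     for name, preds in weak_predictions.items()}
ensemble_score = accuracy(ensemble, truth)

print(individual_scores)
print("ensemble:", ensemble_score)
```

Each model alone gets four of the six examples right, yet the vote is correct on all six because the models make their mistakes on different examples. That independence of errors is an assumption built into this toy data; real ensembles only benefit to the degree that their members err differently.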
INTRO TO AZURE MACHINE LEARNING
Democratizing machine learning, with the power of the cloud

AZURE ML STUDIO
• Web-based UI for modeling experiments
• Typically requires an Azure account to design and run experiments

GUEST ACCESS
• Experiments can be shared with people who do not have an Azure account
• Guest access allows read-only viewing of experiments
• Does not allow them to be run

EXPERIMENTS
• The core “project” type in Azure ML Studio is the experiment
• Option for Blank
• Numerous templates/samples with which to get started

MODULES
• Experiments contain modules arranged in a flowchart fashion

MODULE HELP
• Getting help
  • Right-click a module and select Help to view documentation

MODULE COMMENTS
• Right-click a module and choose Edit Comment
• Add free-form text to document what the module accomplishes in the context of the experiment
• You can collapse the comments by clicking on the chevron (up arrow)

MODULE CATEGORIES
• Source Data
• ML Modules
• Operationalize Your Models
• Don’t Use

WINE QUALITY PREDICTION
• Type: Regression
• Candidate Algorithms:
  • Decision Tree
• Data Prep:
  • None
• Business Requirements:
  • Build a model that takes various characteristics of wine and predicts the quality score deemed by experts

DEMO
Tour of Azure ML Studio: A first experiment in Wine Quality

DATASET
• Data saved to your Azure ML workspace is stored as a dataset
• A Dataset is data that has been uploaded to Azure Machine Learning Studio
• Datasets are external to your experiment
• Azure ML provides ~40 sample datasets
DATATABLE
• Even if you upload data in another format, or specify a storage format such as CSV, ARFF, or TSV, the data is implicitly converted to a DataTable object whenever used by a module in an experiment.
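The idea of converting different storage formats into one common in-memory table can be sketched by analogy in plain Python. This is not the Azure ML DataTable API, whose internals the slides do not show; `load_table` is a hypothetical helper, and the column names and values are invented sample data.

```python
import csv
import io


def load_table(text, delimiter):
    """Parse delimited text into a list of row dictionaries keyed by column name."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    return [dict(zip(header, row)) for row in reader]


# The same hypothetical data, stored once as CSV and once as TSV.
csv_text = "alcohol,quality\n9.4,5\n9.8,6\n"
tsv_text = "alcohol\tquality\n9.4\t5\n9.8\t6\n"

table_from_csv = load_table(csv_text, ",")
table_from_tsv = load_table(tsv_text, "\t")

# Whatever the source format, downstream code sees the same structure.
print(table_from_csv == table_from_tsv)
print(table_from_csv[0])
```

Once both inputs are in the same in-memory shape, later steps no longer need to know which file format the data arrived in. The slide describes the same benefit for modules in an experiment.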
