Industry Applications of Machine Learning and Data Science

#1 Agile Predictive Analytics Platform for Today’s Modern Analysts Industry Applications of Machine Learning and Data Science Ralf Klinkenberg, Co-Founder & Head of Data Science Research, RapidMiner [email protected] www.RapidMiner.com Industrial Data Science Conference (IDS 2019), Dortmund, Germany, March 13th, 2019 Can you predict the future? ©2015©2016 RapidMiner, Inc. All rights reserved. - 2 - Predictive Analytics finds the hidden patterns in big data and uses them to predict future events. 473ms © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 3 - 3 Machine Learning: Pattern Detection, Trend Detection, Finding Correlations & Causal Relations, etc. from Data to Create Models that Enable the Automated Classification of New Cases, Forecasting of Events or Values, Prediction of Risks & Opportunities ©2015 RapidMiner, Inc. All rights reserved. - 4 - Predictive Analytics Transforms Insight into ACTION Prescriptive ACT Operationalize Predictive ANTICIPATE What will happen Diagnostic EXPLAIN Why did it happen Value Descriptive OBSERVE Analytics What happened ©2016 RapidMiner, Inc. All rights reserved. - 5 - Industry Applications of Machine Learning and Predictive Analytics ©2016 RapidMiner, Inc. All rights reserved. - 6 - Customer Churn Prediction: Predict Which Customers are about to Churn. Energy Provider E.ON: 17 Million Customers. => Predict & Prevent Churn => Secure Revenue, Less Costly Than Acquiring New Customers. © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 7 - Demand Forecasting: Predict Which Book Will be Sold How Often in Which Region. Book Retailer Libri: 33 Million Transactions per Year. => Guarantee Availability and Delivery Times. © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 8 - Predictive Maintenance: Predict Machine Failures before They Happen in order to Prevent Them, => Demand-Based Maintenance, Fewer Failures, Lower Costs © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 9 - Predictive Maintenance © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 10 - Sensor Data Log Data Meta Data Free Text Audio/Image Data © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 11 - Predictive Maintenance Customers Using RapidMiner for Predictive Maintenance, i.e. for Predicting & Preventing Machine Failures before they happen: • Major German Car Manufacturers: Text Analytics of Repair & Service Reports to Identify Car Quality & Car Maintenance Issues, Audio Analytics • Major European & South American Airplane Manufacturers and Major International Airplane Operators: Sensor Data & Text Mining Repair & Service Reports for Predictive Airplane Maintenance & Resource Allocation • Major European Cement Producer: Cement Mill Failure Prediction & Prevention • Major Chinese Energy Provider: Wind Turbine Failure Prediction & Prevention © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 12 - Engine Quality Prediction • Task: – Predict Engine Lifetime / Quality from Audio Data • Challenges: – Audio Feature Generation • Solution: – Automated Feature Generation & Selection – Automated Optimizers – Classification Problem © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 13 - Product Surface Quality Optimization • Task: – Image Analysis to Determine Product Surface Quality/Issues • Challenges: – Image Feature Construction – Light Conditions and Their Variety • Solution: – Automated Feature Generation & Selection – Automated Optimization – Classification Problem © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 14 - PRESED Predictive Sensor Data mining for Product Quality Improvement u Project Overview e . Project Overview d • Improve production quality in steel mills e s • Optimize the manufacturing process by identifying the e main causes of bad quality r p • Predict the quality of the product as soon as possible . to better characterize it and reduce the cost • Develop new methodologies to improve the w w Use Case w quality of the steel production by • Identifying causes for bad quality • Predict the quality of a product as soon as possible • Integrate expert knowledge for different production sites – Identifying causes for bad quality – Predict the quality of a product as soon as possible during the production process • RapidMiner building and providing the analytic server infrastructure 8 1 0 2 / • RapidMiner implementing the tools developed 1 0 Approach Challenges – 4 • Build a Big Data Infrastructure for product • Design a data structure to track the by the other partners 1 tracking and visualization material over the complete production 0 2 • Extract features from raw sensor data process / series • Tracking is difficult as the product – New algorithms for time series analysis 7 0 • Apply machine learning for quality changes shape (rolling, cutting) : prediction • Huge amount of raw data n • Build an analytics server for model (several hundred parameters, o – Visualization methods for provided data i t construction and management with a frequency of 1-10Hz over a 2-3 a • Enrich data with knowledge from experts year period) r u D Contact: [email protected] www.rapidminer.com Project partners: Funded by: RFSR-CT-2014-00031 © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 15 - PRESED „Predictive Sensor Data Mining for Product Quality Improvement“ Product Quality Prediction with Machine Learning on Time Seires Data Alain Sauvan / ArcelorMittal Fos-sur- Mer © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 16 - PRESED 10 Meter ▪ Product Variations ▪ Varying Form/Shape 2+ Kilometer – From 10m blocks to 2km rolls ▪ Large Volume of Recorded Data – Hundreds of sensors and parameters – Value sensing frequency of 1-10Hz – Production data of many years © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 17 - OPTIMAL MIXTURE OF INGREDIENTS? © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 18 - Optimizing Mixtures of Ingridients • Which mixtures will produce high quality products? • Which mixtures will lead to quality issues? • How much of particular expensive additives is needed? • How to lower costs while ensuring high product quality? • How to increase production process reliability & product quality? • What variables are correlated to product quality and how? • How to predict and ensure product quality? • How much of each ingredient is optimal? • How to configure the production process and machines? • => Automated Predictions & Alerts & Action Recommendations • => Lower Cost & Lower Risk & Higher Reliability & Higher Quality © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 19 - Optimizing Mixtures of Ingridients Customers Using RapidMiner for Optimizing Mixtures of Ingredients: • Major European Tire Manufacturer: Optimizing the rubber ingredients to optimize product quality and features (e.g. durability, adhesiveness, etc.) • Major European Metal Product/Part Producer (Components of Cars, Trains, and Appliances): Predicting & Optimizing Quality & Cost & Reliability (ensure required quality level while reducing cost of ingredients, e.g. reducing expensive additives as much as reasonable, but not beyond) © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 20 - EVERYTHING OK? © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 21 - ! © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 22 - BIG DATA CAN BE OVERWHELMING © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 23 - Detecting & Predicting Critical Situations • Big Data can be overwhelming – huge amounts of structured and unstructured data (e.g. texts) from many different sources • How to find the relevant information to look at? • How to effectively detect critical situations in a mass of data? • How to predict and prevent critical situations? • How to schedule maintenance early enough in advance? • => Automated Alerts & Action Recommendations • => Predicting and Preventing Machine Failures, Predictive and Preventive Maintenance • => Resource Optimization & Planning • => Lower Cost & Lower Risk & Higher Reliability & Higher Quality © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 24 - Detecting, Predicting, and Preventing Exceeded Emissions & Critical Situations Project “FEE” sponsored by the German Government © 2018 RapidMiner, GmbH & RapidMiner, Inc.: all rights reserved. - 25 - FEE Early Detection and Decision Assistance for Critical Situations in the Production Environment e Project Overview d . t Project Overview k • Chemical plants are highly automated e j production sites that produce a lot of o data r p • Data comes from various sources - (Sensor values, alarm logs, shift books) • Design and build a Big Data infrastructure to e e • Critical operation conditions can result in f . a cascade of warnings (alarm shower) w manage the data of production plants w w • Develop methods for early detection and prediction of critical situations • Develop methods to help users during critical situations or to avoid critical situations 7 1 • Build ad-hoc analysis functions to build 0 2 / 8 intervention strategies 0 – Goal of the Project Proposed Solution 4 1 • Design and build a Big Data infrastructure to • Usage of long-term data collections • Change from reactive to proactive action 0 manage the data of the production plants (~ 10 years) from production 2 / • Develop methods for early detection of critical • Offline data analytics with Big Data 9 situations

Industry Applications of Machine Learning and Data Science

Feature Selection for Gender Classification in TUIK Life Satisfaction Survey A

Effect of Distance Measures on Partitional Clustering Algorithms

An Analysis on the Performance of Rapid Miner and R Programming Language As Data Pre-Processing Tools for Unsupervised Form of Insurance Claim Dataset

Açık Kaynak Kodlu Veri Madenciliği Yazılımlarının Karşılaştırılması

Mining the Web of Linked Data with Rapidminer

Rapidminer Operator Reference Manual ©2014 by Rapidminer

ML-Flex: a Flexible Toolbox for Performing Classification Analyses

Machine Learning

A Comparative Study on Various Data Mining Tools for Intrusion Detection

The Forrester Wave

Magic Quadrant for Data Science Platforms Published: 14 February 2017 ID: G00301536 Analyst(S): Alexander Linden, Peter Krensky, Jim Hare, Carlie J

Rapidminer Studio