Industrial Analytics for Quality Improvement in Complex Systems

Dr. Kaibo Liu

Department of Industrial and Systems Engineering

University of Wisconsin-Madison

1 Lab for System Informatics and Data Analytics (SIDA) Background

• A.P. 2013-now, Department of Industrial and Systems Engineering, UW-Madison
• Ph.D. 2013, Industrial Engineering (Minor: ), Georgia Institute of Technology
• M.S. 2011, Statistics, Georgia Institute of Technology
• B.S. 2009, Industrial Engineering and Engineering Management, Hong Kong University of Science and Technology, Hong Kong

My Research & Expertise

Research Interests

System informatics and data analytics:
• Complex system modeling and performance assessment
• Data fusion for online process monitoring, diagnosis and prognostics
• Statistical learning, data mining, and decision making

Expertise

Multidisciplinary approach: Engineering, Statistics/Machine Learning, Operation Research, Control

Multi-disciplinary Research

• Sensor Measurement and Monitoring Strategy
• System Degradation Analysis and Prognostics
• Spatiotemporal Field Modeling and Prediction

Overall, my research goal is to make sense of big data for better decision making!

Sensor Measurement and Monitoring Strategy

Objective-oriented sensor system designs in complex systems

Objective
• Obtain an optimal sensor allocation design at minimum cost under different user-specified quality requirements

Approaches
• A best allocation subsets by intelligent search (BASIS) algorithm that intelligently searches for the optimal sensor allocation solution
• Features
  o Consider the trade-off among detection speed, fault diagnosis accuracy, and cost savings

Results Summary
• Ensure customer satisfaction by optimally designing the sensor allocation strategy
• The average cycle time, cost and inventory level can be greatly reduced
• Algorithms have been tested in several applications, e.g., the hot forming and the cap alignment processes
• Supported several students

[Figure: Effectively search for optimal sensor system design solutions]

Causation-based monitoring, diagnosis and control

Objective
• Transform from existing correlation-based techniques into a new causation-based quality control paradigm to achieve effective online quality monitoring and inference, root cause diagnosis, and proactive process control

Approaches
• Features
  o Engineering knowledge enhanced causal modeling
  o Causation-based online quality monitoring, inference, and diagnosis
  o Causation-based online feed-forward and feed-back process control

Results Summary
• Establish a series of causation-based monitoring, diagnosis and control techniques for quality improvement in complex systems
• Algorithms have been tested in the hot forming, the cap alignment, and the rolling processes, with improved efficiency, yield, and quality
• Supported several students

Online monitoring of Big Data Streams

Objective
• Create a new paradigm of dynamic data-driven modeling, sampling and monitoring schemes for Big Data Streams (e.g., video streams) subject to practical resource constraints

Approaches
• A self-updated statistical model to fully characterize the changing background
• A dynamic, data-driven sampling strategy
• A scalable and robust statistical process control method tailored for Big Data Streams
• Features
  o Scalability: linear complexity that ensures practical implementation
  o Adaptability: automatically localize the anomaly regions without any prior knowledge

[Figure: Examples of thermal profiles on the polishing pad during CMP process under different conditions]

Results Summary
• Establish a series of real-time monitoring methodologies that are tailored for Big Data Streams for quick anomaly detection (either cyber or physical) and localization
• Algorithms have been tested in various applications, e.g., diaper manufacturing, climate monitoring and solar flare detection
• Supported several students

[Figure: Maximize the detection capability with practical resource constraints]

Dynamic Data-Driven Modeling, Sampling and Monitoring for Real-Time Solar Flare Detection

• A dynamically updated spatial-temporal statistical model to fully characterize the changing background
• A dynamic sampling algorithm that actively decides which data streams to observe given the resource constraints
• A scalable and robust SPC to effectively combine the information from significant data streams to produce an overall global monitoring system

[Figure: DDDAS framework relating the original and updated solar images. Labels: Update Model, Sample data, Dynamic Sampling, Update sampling, SPC Chart, Update SPC. Panels: (a) Applications modeling; (b) Applications; (c) Application measurement systems and methods; (d) Mathematical and statistical algorithms]

Sensor Measurement and Monitoring Strategy

• Objective-Oriented Optimal Sensor Allocation Strategy: determine the minimum number of sensors needed given user-specified requirements
• Adaptive Sensor Allocation Strategy: adaptively adjust sensor allocation in a Bayesian network to enhance monitoring and diagnosis
• A Top-r based Adaptive Sampling Strategy: monitor normally distributed big data streams online under limited resources
• A Nonparametric Adaptive Sampling Strategy: monitor non-normal big data streams online under limited resources
• Effective Online Data Monitoring and Saving Strategy: intelligently select and record the most informative extreme values in the simulation data
• A Spatial Adaptive Sampling Procedure: leverage the spatial information to adaptively integrate two seemingly contradictory ideas (wide and deep searches)
• A Rank-based Sampling Algorithm by Data Augmentation: automatically augment information for unobservable variables based on the online observations
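To make the idea of adaptive sampling under limited resources concrete, the following is a simplified Python sketch in the spirit of a top-r rule. It is not the procedure from the slides: the two-sided local CUSUM statistics, the compensation value `delta` for unobserved streams, the assumed shift size `mu_min`, the alarm `threshold`, and the function names are all assumptions made for illustration. Only `q` of the `p` streams are observed at each step, and the alarm statistic sums the `r` largest local statistics.

```python
import numpy as np


def update_local_stats(w_pos, w_neg, x, observed, mu_min=1.5, delta=0.1):
    """One update of two-sided local CUSUM statistics for N(0, 1) streams.

    Observed streams use the log-likelihood ratio for a mean shift of
    +/- mu_min; unobserved streams receive a small compensation `delta`
    (an assumed value) so that they are eventually revisited.
    """
    inc_pos = np.where(observed, mu_min * x - mu_min**2 / 2, delta)
    inc_neg = np.where(observed, -mu_min * x - mu_min**2 / 2, delta)
    return np.maximum(w_pos + inc_pos, 0.0), np.maximum(w_neg + inc_neg, 0.0)


def monitor(streams, q, r, threshold, mu_min=1.5, delta=0.1, seed=0):
    """Return the first alarm time, or None if no alarm is raised.

    streams: array of shape (T, p); only q entries per row are ever read.
    """
    n_steps, p = streams.shape
    rng = np.random.default_rng(seed)
    w_pos, w_neg = np.zeros(p), np.zeros(p)
    observed_idx = rng.choice(p, size=q, replace=False)  # random initial layout
    for t in range(n_steps):
        mask = np.zeros(p, dtype=bool)
        mask[observed_idx] = True
        w_pos, w_neg = update_local_stats(w_pos, w_neg, streams[t], mask, mu_min, delta)
        local = np.maximum(w_pos, w_neg)
        global_stat = np.sort(local)[-r:].sum()  # sum of the r largest local statistics
        if global_stat > threshold:
            return t
        observed_idx = np.argsort(local)[-q:]  # observe the q most suspicious streams next
    return None


# Hypothetical data: 20 streams, stream 7 shifts upward at time 100.
rng = np.random.default_rng(1)
data = rng.normal(size=(300, 20))
data[100:, 7] += 3.0
print(monitor(data, q=4, r=2, threshold=15.0))
```

The compensation term is what keeps the sampling adaptive: a stream that is never observed slowly accumulates evidence until it is selected, while an observed in-control stream is reset toward zero.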

System Degradation Modeling and Prognostics

-enabled Condition-based Monitoring, Diagnosis, and Prognostics

Objective
• Leverage condition monitoring signals collected from multiple and heterogeneous sensors to better visualize and assess the current system health status and predict its future behavior in real time

Approaches
• Novel data fusion methods that select the best sensors and combine their information to construct health indices for system performance assessment and visualization, $h_{i,t} = f(\boldsymbol{x}_{i,\cdot,t})$
• Features
  o Combine data-driven approaches and engineering principles governing the underlying failure mechanism to ensure satisfactory performance

[Figure: Aircraft engine diagram]

Results Summary
• Establish a series of data fusion methodologies that are tailored for IoT-enabled service systems for health status visualization, characterization and prediction
• Algorithms have been tested in various applications, e.g., engine health monitoring, Alzheimer's disease and forklift management
• Supported several students

[Figure: Better health status characterization; Better fault diagnosis; Better RUL prediction]

Case Study – Engine RUL prediction

• Optimal weights $\boldsymbol{w}^*$: $h_i(t) = \boldsymbol{L}_i(t)\,\boldsymbol{w}^*$

Name    T24    T50    P30    Nf     Ps30   phi    NRf    BPR    htBleed  W31    W32
Value   0.13   0.37   -0.03  -0.05  0.23   -0.21  -0.08  0.16   0.12     -0.05  -0.16
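As a minimal sketch of how the weights above turn multisensor readings into a health index, the Python snippet below evaluates $h_i(t) = \boldsymbol{L}_i(t)\,\boldsymbol{w}^*$ as a matrix-vector product. The weights and sensor names are copied from the table; the function name `health_index`, the layout of the signal matrix (rows are time points, columns are sensors in table order), and the random example data are assumptions. Any preprocessing of the raw signals (e.g., scaling) is not specified on the slide and is left out.

```python
import numpy as np

# Sensor names and fusion weights as listed in the table.
SENSORS = ["T24", "T50", "P30", "Nf", "Ps30", "phi", "NRf", "BPR", "htBleed", "W31", "W32"]
WEIGHTS = np.array([0.13, 0.37, -0.03, -0.05, 0.23, -0.21, -0.08, 0.16, 0.12, -0.05, -0.16])


def health_index(signals, weights=WEIGHTS):
    """Linear health index h(t) = L(t) w.

    signals: array of shape (T, 11); row t holds the sensor readings at
    time t, ordered as in SENSORS (an assumed layout).
    Returns an array of shape (T,).
    """
    signals = np.asarray(signals, dtype=float)
    if signals.ndim != 2 or signals.shape[1] != len(weights):
        raise ValueError("signals must have shape (T, number of sensors)")
    return signals @ weights


# Hypothetical readings for one unit over five time points.
rng = np.random.default_rng(0)
L_i = rng.normal(size=(5, len(SENSORS)))
h_i = health_index(L_i)
print(h_i.shape)  # (5,)
```

A positive weight means the index rises with that sensor's reading and a negative weight means it falls; the weights with the largest magnitudes (T50, Ps30, phi) dominate the index.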

[Figure: real-time sensor information (T24, …, W32) is combined into a health index, which feeds the stochastic degradation models and Bayesian updating methods (Gebraeel, 2006) for remaining life prediction]

• The developed HI-QL improved the RUL prediction accuracy
  o by 64.83% compared with the best single sensor
  o by 20.7% compared with existing HI-based models

System Degradation Modeling and Prognostics

• Non-parametric data fusion model: does not need to know the parametric form of the degradation signal
• Semi-parametric data fusion model: unifies degradation modeling and prognostics
• SNR-based data fusion model: immune to the heterogeneous sensor challenges in terms of signal scales and measurement units
• Quantile regression-based data fusion model: recovers the underlying degradation status, with estimated fusion coefficients converging to the true values
• Sensory-based Failure Threshold Estimation: updates the failure threshold estimate of the in-field unit online
• Kernel trick for nonlinear data fusion model
• Generic data fusion model with automatic sensor selection
• Data fusion model for multiple failure modes
• Data fusion model when there are multiple environmental conditions
• Generic data fusion model when multisensor signals are asynchronous
• Dynamic control of degradation speed and RLD via workload adjustment

Smart Monitoring of Alzheimer’s Disease via Data Fusion, Personalized Prognostics, and Selective Sensing

Effectiveness of the existing screening methodology versus the new approach:
• Biomarkers (existing): expensive, e.g., $5000 per scan for PiB-PET
• Screening tests (existing): passive information collection: burden and complexity
• Smart monitoring (new approach): proactive information collection driven by accurate statistical models

[Figure: Proposed Smart Monitoring Method]

The model of AD trajectory [3]

Data-Driven Failure for Internet of Things (IoT) enabled Service Systems

[Figure: IoT-enabled service system. Equipment in the field sends sensing data through a communication network to a back-office processing center, which issues service alerts. Real-time on-line condition monitoring (CM) data on individual units is combined with historical off-line data on multiple units: CM signals over time for units #1, #2, …, #i, and time-to-failure data (failure event data, with failure cases and censored cases).]

Establish a core set of data-driven modeling, failure prognosis, and service decision-making methodologies for emerging Internet of Things (IoT) enabled service systems, particularly in the context of TMHNA

Big data analytics solutions to improve nuclear power plant efficiency: Online monitoring, visualization, prognosis, and maintenance decision making

Advance the ability to assess equipment condition and predict the remaining useful life (RUL) to support optimal maintenance decision making in nuclear power plants.

Spatiotemporal Field Modeling and Prediction

Real-time travel demand modeling and prediction in smart and connected cities

Objective
• Online prediction of the origin-destination (OD) demand in traffic networks
• Existing literature models the demand count data separately for different OD pairs without considering spatial correlations or domain knowledge

Approaches
• Propose a multivariate Poisson log-normal model with specific parametrization tailored to the traffic demand problem
• Capture the spatiotemporal correlations of the traffic demand across different routes and epochs and automatically cluster the routes based on the demand correlations
• The model is estimated using an Expectation-Maximization (EM) algorithm and applied for predicting future demand counts at the subsequent epochs

Results Summary
• The proposed method integrates traffic network domain knowledge and achieves a sparse estimation based on clusters of routes
• Estimate the parameters of the model accurately with the developed EM algorithm
• Has been applied on a real New York yellow taxi dataset
• Supported several students
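For readers unfamiliar with the multivariate Poisson log-normal model named above, here is a minimal Python sketch of its generic form: the log demand rates of the routes are jointly Gaussian, and the counts are conditionally independent Poisson draws. This is the textbook model only; the tailored parametrization, route clustering, and EM estimation from the slide are not reproduced. The function names, the three hypothetical routes, and the values of `mu` and `sigma` are assumptions for illustration.

```python
import numpy as np


def sample_poisson_lognormal(mu, sigma, n_epochs, seed=0):
    """Draw demand counts from a multivariate Poisson log-normal model.

    log-rates ~ N(mu, sigma) jointly across routes; counts ~ Poisson(exp(log-rates)).
    Returns an integer array of shape (n_epochs, number of routes).
    """
    rng = np.random.default_rng(seed)
    log_rates = rng.multivariate_normal(mu, sigma, size=n_epochs)
    return rng.poisson(np.exp(log_rates))


def expected_counts(mu, sigma):
    """Marginal mean of each count: E[Y_j] = exp(mu_j + sigma_jj / 2)."""
    return np.exp(np.asarray(mu) + 0.5 * np.diag(sigma))


# Hypothetical setting: routes 0 and 1 are strongly correlated, route 2 is independent.
mu = np.array([3.0, 3.0, 3.0])
sigma = np.array([
    [0.25, 0.225, 0.0],
    [0.225, 0.25, 0.0],
    [0.0, 0.0, 0.25],
])
counts = sample_poisson_lognormal(mu, sigma, n_epochs=5000)
corr = np.corrcoef(counts, rowvar=False)
print(np.round(corr, 2))
```

The off-diagonal entries of `sigma` are what carry the correlation between routes; with a diagonal `sigma` the model reduces to separate models per OD pair, which is the limitation of existing work noted in the Objective.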

Modeling of dynamic thermal fields via grid-based sensor networks

Objective
• Accurate modeling and estimation of the full-scale grain thermal field based on the grid-based sensor networks
• Challenges:
  o Grid-based but sparse sensor data
  o Spatiotemporal correlation structures
  o Local variability of grain temperature

Approaches
• Integrate physical dynamics model (for global profile) and spatiotemporal stochastic processes (for local profile)
• Develop a spatiotemporal transfer learning technique for 3D field estimation using sensor observations from several homogeneous data sources
• Estimate time-varying parameters in PDE models from the obtained data to acquire a more accurate description of the dynamics

[Figure: thermal fields $Y(s, t_1), Y(s, t_2), \ldots, Y(s, t_M)$ observed at times $t_1, t_2, \ldots, t_M$]

Results Summary
• The proposed methods integrate a physical dynamics model, a spatiotemporal statistical model, and an advanced machine learning technique to achieve an accurate estimation of the 3D thermal fields based on grid-based sensor networks
• Has been tested and verified on several real datasets for grain storage application

Other Research Projects

Operator activity index development and performance improvement

Objective
• Propose a generic approach to develop an effective composite index to identify high-performing operators on multiple dimensions

Approaches
• A new nonnegative principal component analysis (NPCA) approach with optimal balance
  o Best separation of operators
  o Comply with practical interpretation
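To illustrate what a nonnegative principal component looks like in practice, the sketch below maximizes the variance $w^\top S w$ subject to $\|w\| = 1$ and $w \ge 0$ with a projected power iteration and random restarts. This is a common heuristic, named plainly here as a stand-in; it is not claimed to be the NPCA approach from the slide. The function names, the restart count, and the simulated operator metrics are assumptions.

```python
import numpy as np


def nonnegative_pc(cov, n_starts=10, n_iter=500, tol=1e-12, seed=0):
    """Heuristic leading nonnegative principal component of a covariance matrix.

    Repeats w <- normalize(max(cov @ w, 0)) from several random nonnegative
    starting points and keeps the solution with the largest variance w' cov w.
    """
    cov = np.asarray(cov, dtype=float)
    rng = np.random.default_rng(seed)
    best_w, best_val = None, -np.inf
    for _start in range(n_starts):
        w = rng.random(cov.shape[0]) + 1e-3  # strictly positive start
        w /= np.linalg.norm(w)
        for _step in range(n_iter):
            v = np.maximum(cov @ w, 0.0)  # projection onto the nonnegative orthant
            norm = np.linalg.norm(v)
            if norm == 0.0:
                break
            v /= norm
            converged = np.linalg.norm(v - w) < tol
            w = v
            if converged:
                break
        val = w @ cov @ w
        if val > best_val:
            best_w, best_val = w, val
    return best_w


def composite_index(metrics, w):
    """Score each operator as a nonnegative weighted sum of standardized metrics."""
    z = (metrics - metrics.mean(axis=0)) / metrics.std(axis=0)
    return z @ w


# Hypothetical data: 50 operators measured on 4 activity metrics.
rng = np.random.default_rng(1)
metrics = rng.normal(size=(50, 4)) + rng.normal(size=(50, 1))  # shared activity factor
w = nonnegative_pc(np.cov(metrics, rowvar=False))
scores = composite_index(metrics, w)
print(np.round(w, 3), scores.shape)
```

Nonnegative weights keep the index interpretable: no metric can count against an operator for being higher, which is one way to read the "comply with practical interpretation" requirement.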

Results Summary
• Developed an OAI by combining worker metrics information to measure the activity of operators
• OAI by NPCA meaningfully explains the operator activity and also provides guidance for performance improvement
• Algorithms have been tested in the forklift operator activity analyses
• Supported several students

Obstructive Sleep Apnea Detection

Retail Site Location Analysis by Business Data Analytics

Objective
• Choose an optimal location for the opening of a new retail site

Approaches
• Estimate the new market shares of the company over the country if the new retail site is tentatively opened at different potential locations

The company of interest, which repairs and replaces gas station equipment, provided a dataset of more than 1 million detailed business transactions (about 8 GB in size) covering the past 5 years.

Results Summary
• Established a generic guideline on leveraging data analytics tools for resolving business issues when dealing with business big data
• Algorithms have been tested in a real case study involving choosing an optimal location for the opening of a new retail site
• Supported several students

Research Summary

[Figure: Industrial Big Data Analytics at the intersection of Engineering, Statistics, Operation Research, Data Mining, and Control]

Acknowledgement

Thank you! Questions? [email protected]
