Validating ML in the context of GMP
Prepared By: Shire Enterprise Analytics Aug 25, 2017
Acquiring knowledge by extracting patterns from raw data
cGMP – Current Good Manufacturing Practice
• GMP refers to the Good Manufacturing Practice Regulations promulgated by the US Food and Drug Administration
• Requires that manufacturers, processors, and packagers of drugs, medical devices, some food, and blood take proactive steps to ensure that their products are safe, pure, and effective.
• GMP regulations require a quality approach to manufacturing, to minimize or eliminate instances of contamination, mixups, and errors.
• Failure of firms to comply can result in very serious consequences including recall, seizure, fines, and jail time.
How do we achieve this
• GMP regulations address issues including recordkeeping, personnel qualifications, sanitation, cleanliness, equipment verification, process validation, and complaint handling.
• Most GMP requirements are very general and open-ended, allowing each manufacturer to decide individually how to best implement the necessary controls.
• Manufacturers interpret the requirements in a manner which makes sense for each individual business.
Reduce Risk and Uncertainty
• Predictable, Reliable, Repeatable
• SOPs
• Training
• Measuring & Testing
• Qualification of Technologies
• Root cause analysis
• Quality by Design (QbD)
• Discrete, Batch, Continuous
Processes and Technologies
Our integrated systems are diverse and complex
• Quality Systems
• Facilities and Equipment Systems
• Materials Systems
• Production Systems
• Packaging and Labeling Systems
• Laboratory Control Systems
• ML Systems (stand alone & embedded)
  −Regression
  −Classification
  −Dimensionality Reduction
  −Ensemble Learning
Machine Learning
• Machine Learning
  −To learn without being explicitly programmed
  −To learn and make predictions
  −To operate by building Models from Observations (Inputs) that make data-driven predictions or decisions expressed as Outputs
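As a minimal sketch of "building a Model from Observations to produce Outputs", the example below fits a straight line to a handful of input/output pairs by ordinary least squares and then predicts an unseen input. The observation values and the function names `fit_line` and `predict` are assumptions made up for illustration; they do not come from any Shire system.

```python
def fit_line(xs, ys):
    """Learn a slope and intercept from observations by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the inputs and co-variation of inputs with outputs.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical observations (invented for this example).
inputs = [1.0, 2.0, 3.0, 4.0]
outputs = [3.0, 5.0, 7.0, 9.0]

model = fit_line(inputs, outputs)   # the "learning" step
prediction = predict(model, 5.0)    # a data-driven prediction for an unseen input
print(prediction)
```

Nothing in the code states the rule that links inputs to outputs; the rule is recovered from the data, which is the sense of "learning without being explicitly programmed".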
Neurons - Building Blocks of the Nervous System
• 1906 Nobel Prize in Physiology or Medicine
Camillo Golgi, Santiago Ramón y Cajal
• 1963 Nobel Prize in Physiology or Medicine
Alan Lloyd Hodgkin, Andrew Huxley (Action Potential)
John Eccles (Synapse)
• 1970 Nobel Prize in Physiology or Medicine
Bernard Katz, Julius Axelrod, Ulf von Euler (Neurotransmitters)
• 1991 Nobel Prize in Physiology or Medicine
Erwin Neher, Bert Sakmann (Ion Channels)
Artificial Neuron
Neuron
Artificial neuron (Perceptron)
The 1st Perceptron
Anatomy & Physiology of the Perceptron
Weights & Bias & Activation Function
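The three parts named above (weights, bias, activation function) can be sketched in a few lines. This is an illustrative toy, not Rosenblatt's original hardware: it assumes a step activation, the classic perceptron learning rule, and a made-up task (logical AND). The function names are hypothetical.

```python
def step(z):
    """Activation function: fire (1) when the weighted sum reaches the threshold."""
    return 1 if z >= 0 else 0


def perceptron_output(weights, bias, features):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return step(z)


def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Perceptron learning rule: nudge weights and bias by the prediction error."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, target in zip(samples, labels):
            error = target - perceptron_output(weights, bias, features)
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias


# Assumed toy task: learn logical AND from four labeled examples.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [perceptron_output(weights, bias, s) for s in samples]
print(weights, bias, predictions)
```

AND is linearly separable, so a single perceptron can learn it. A task such as XOR is not, which is one reason multilayer networks became necessary.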
The 1st Perceptron
Inventor: Dr. Frank Rosenblatt
1950s
“The Navy revealed the embryo of an electronic computer today that it expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence … Dr. Frank Rosenblatt, a research psychologist at the Cornell Aeronautical Laboratory, Buffalo, said Perceptrons might be fired to the planets as mechanical space explorers”
SLP Architecture (Single Layer)
[Figure: four input features (1 to 4) feeding a classifier]
AI Winter (1969 – 1986)
MLP – Multilayer Perceptron
Known Known
f(x) = Unknown
Training the MLP (Supervised Learning)
Backpropagation – realized in 1974; rediscovered in 1986
Minimize error (cost or loss)
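To make "minimize error" concrete, here is a small sketch of backpropagation with full-batch gradient descent on a one-hidden-layer MLP. Every choice in it is an assumption for illustration: sigmoid activations, a squared-error loss, three hidden units, the XOR toy data, the learning rate, and the name `train_mlp`. It records the mean loss at each epoch so the decrease can be inspected.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w_hidden, b_hidden, w_out, b_out, x):
    """Return the hidden activations and the network output for one input."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output


def train_mlp(data, n_hidden=3, epochs=5000, lr=0.5, seed=0):
    """Full-batch gradient descent; returns the mean loss seen at each epoch."""
    rng = random.Random(seed)
    n_in, n = len(data[0][0]), len(data)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b_hidden = [0.0] * n_hidden
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b_out = 0.0
    losses = []
    for _ in range(epochs):
        g_w_hidden = [[0.0] * n_in for _ in range(n_hidden)]
        g_b_hidden = [0.0] * n_hidden
        g_w_out = [0.0] * n_hidden
        g_b_out = 0.0
        loss = 0.0
        for x, y in data:
            hidden, output = forward(w_hidden, b_hidden, w_out, b_out, x)
            loss += 0.5 * (output - y) ** 2
            # Error signal at the output unit (chain rule through the sigmoid).
            delta_out = (output - y) * output * (1.0 - output)
            g_b_out += delta_out
            for j in range(n_hidden):
                g_w_out[j] += delta_out * hidden[j]
                # Propagate the error signal back to hidden unit j.
                delta_h = delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                g_b_hidden[j] += delta_h
                for i in range(n_in):
                    g_w_hidden[j][i] += delta_h * x[i]
        losses.append(loss / n)
        # Step every parameter against its averaged gradient.
        b_out -= lr * g_b_out / n
        for j in range(n_hidden):
            w_out[j] -= lr * g_w_out[j] / n
            b_hidden[j] -= lr * g_b_hidden[j] / n
            for i in range(n_in):
                w_hidden[j][i] -= lr * g_w_hidden[j][i] / n
    return losses


# Assumed toy data: XOR, which a single-layer perceptron cannot represent.
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
losses = train_mlp(data)
print(losses[0], losses[-1])
```

How far the loss falls depends on the random initialization and the number of epochs, so a sketch like this should be read as a demonstration of the mechanism rather than a tuned model.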
Anatomy & Physiology of the Retina (Multilayered)
• Rods & Cones
• Horizontal Cells
• Bipolar Cells
• Amacrine Cells
• Ganglion Cells
Anatomy & Physiology of “Visual (Striate) Cortex”
• 1981 Nobel Prize in Physiology or Medicine
David H. Hubel, Torsten Wiesel (Information Processing in the Visual System)
Learning Complicated Concepts building from Simpler Concepts
Deep Net “Seeing” – Multilayered (semi-supervised)
A Hierarchy of Concepts
Acquiring knowledge by extracting patterns from raw data
CNN (Convolutional Neural Network)
Growth of ML Training Datasets
ImageNet Dataset
MNIST Dataset
CNNs are amazing but can be fooled
• Predicted at 99.99% confidence by CNN
The race is on…
1. Our labeled datasets were thousands of times too small.
2. Our computers were millions of times too slow.
3. We initialized the weights in a stupid way.
4. We used the wrong type of non-linearity.
Gartner says we are peaking…
Technology Convergence on the Cloud Fuels the Analytics rEvolution
Data Connectedness in the Grid Fuels a new kind of Discovery
Analytics COMMUNITY
A Community of Convergence…
Skill level in ML (Survey N=111)
Quality is job 1 – Safe, Pure, Effective
Consider ML as a new cog in the Machine…
ML
Not all ML is a Black Box…
Don’t be caught in the hype…
Conduct Software Verification & Validation as you would for any system…
Perform Standard Verification & Validation activities
Perform Platform Qualification – Q’s
DQ, IQ, OQ, PQ, MQ
Educate & Train
• Develop a level of internal expertise
Incorporate an established Data Science methodology into current processes
IBM’s Cross Industry Standard Process for Data Mining (CRISP-DM)
Microsoft’s Team Data Science Process
Create Quality Metrics for ML Models
Confusion Matrix for Classification
• Error = (FP+FN) / (TP+FN+FP+TN)
• Accuracy = (TP+TN) / (TP+FN+FP+TN)
• Precision = TP / (TP+FP)
• Recall = TP / (TP+FN)
• F1 = 2TP / (2TP + FP + FN)
ROC Curves
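The confusion-matrix formulas listed on this slide translate directly into code. The sketch below is one possible way to compute them; the function name `classification_metrics` and the example counts are hypothetical, and it assumes every denominator is nonzero (a production version would need to guard against empty classes).

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the slide's quality metrics from confusion-matrix counts.

    Assumes all denominators are nonzero.
    """
    total = tp + fn + fp + tn
    return {
        "error": (fp + fn) / total,
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts, invented for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Error and accuracy always sum to 1, so tracking both is redundant. Precision and recall are the more informative pair when the classes are imbalanced, for example when defective batches are rare.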
The future is fast becoming our new reality
Meet Sophia from Hanson Robotics
What do you think Sophia?
How would you react?
Thank you