PSI Webinar: What does the Data-Driven Hype mean for Pharma?

Richard Pugh Chief Data Scientist [email protected] @richatmango

PROPERTY of MANGO SOLUTIONS Agenda

Introductions

Data, hype & snake oil

Never mind the buzzwords

What this means for pharma

Summary


Introductions

My background

• Statistician working in pharmaceutical industry
• SAS > S+ > R
• Founded Mango in 2002
• Work across sectors
• Advice on data & analytics

Mango helps companies to deliver data-driven value, create a data-first culture and build a lasting capability

Data Science Section

DATA | EVIDENCE | DECISIONS

• Established in 2017
• Representatives from business, industry, government and academia
• Formed to address emerging topics that will impact the long-term success of data science as a profession
• Our remit is to be a professional body that represents data scientists in the UK

Data, hype & snake oil

The visibility and remit of analytics have changed over the last 20 years

Organisations believe that future success will depend largely on an ability to use data to make optimal business decisions and drive efficiencies


A data-driven company is one that generates value from data by integrating it into the DNA of its decision-making processes


So what changed?


The Gartner Hype Curve

[Figure: positions of Big Data, Data Science and Deep Learning on the Gartner hype curve, year by year from 2011 to 2018. Phases along the curve: Innovation Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, Plateau of Productivity.]

Who is driving the hype and why?

• Large technology vendors looking to sell highly-priced technical solution
• Large consulting firms looking to sell analytic consulting services
• Startups looking for funding


The story so far …

• Over the last 20 years, the hype has raised the profile of data & analytics
• Analytics is now a strategic topic for leadership
• The hype focused on a set of “buzzwords” which are now in common use
• What do these “buzzwords” mean, and how do they impact pharmaceutical analysis?


Never mind the buzzwords

AI | Big Data | Analytics | Machine Learning | Data Science | Statistics

You keep using that word …

… I do not think it means what you think it means

Data & Analytic Terminology

Impact on the business: Influence decision making or ensure visibility on performance

Analytics: Turn data into knowledge to be communicated to the business

Data: Information captured from internal or external sources, stored and managed on technical platforms

Data Dimensions

Categorised as the “3 (or 4) Vs”:

• Volume – Data Size
• Velocity – Real time decisions
• Variety – Structured, Unstructured
• Veracity – Data Quality


Hadoop

• Technology solution to scaling
• Considered “obsolete” on the 2017 Hype cycle
• Spawned a set of technologies to manage data across the “V” dimensions


Data & Analytic Terminology

Impact on the business: Influence decision making or ensure visibility on performance

Analytics: Turn data into knowledge to be communicated to the business

Big Data: A set of techniques that enable access to structured and unstructured data of varying size

Structured Data: Structured data in rectangular formats (tables, databases)

Unstructured Data: Non-standard data forms (image, text, video, sound, streams)

Data & Analytic Terminology

Impact on the business: Influence decision making

Business Intelligence: Delivering management information via reports and dashboards

Advanced Analytics: The application of data and math to model real-world processes and likely future outcomes

Descriptive Analytics: The use of reporting and basic analysis to summarise historical data using charts, tables and statistics, with results often displayed in dashboards and reports

Diagnostic Analytics: The use of modelling to understand the factors that influence a particular outcome, based on historical data

Predictive Analytics: The use of predictive modelling approaches to understand likely outcomes from a process under new circumstances

Prescriptive Analytics: The use of predictive modelling and optimisation to understand how to optimise future outcomes

Big Data: A set of techniques that enable access to structured and unstructured data of varying size

Structured Data: Structured data in rectangular formats (tables, databases)

Unstructured Data: Non-standard data forms (image, text, video, sound, streams)

Data & Analytic Terminology

Impact on the business: Influence decision making

Business Intelligence: Delivering management information via reports and dashboards

Advanced Analytics: The application of data and math to model real-world processes and likely future outcomes

Descriptive Analytics: The use of reporting and basic analysis to summarise historical data using charts, tables and statistics, with results often displayed in dashboards and reports

Statistical Modelling: Human-centric process of building models through data understanding using iterative mathematical approaches

Machine Learning: Compute-centric creation of mathematical models by learning about patterns in data and how they influence outcomes

Artificial Intelligence: A narrow set of machine learning algorithms that appear to have “human-like” qualities, typically applied to unstructured data

Big Data: A set of techniques that enable access to structured and unstructured data of varying size

Structured Data: Structured data in rectangular formats (tables, databases)

Unstructured Data: Non-standard data forms (image, text, video, sound, streams)

Data Science

• Term coined in 1997
• Professor Jeff Wu suggested it as an alternative to “statistics”


The growth of data science


Definitions of data science

Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician.


What is Data Science?

The proactive use of data and advanced analytics to drive better decision making


Data & Analytic Terminology

Data Science: The proactive application of data and advanced analytics to drive better decision making

Business Intelligence: Delivering management information via reports and dashboards

Advanced Analytics: The application of data and math to model real-world processes and likely future outcomes

Descriptive Analytics: The use of reporting and basic analysis to summarise historical data using charts, tables and statistics, with results often displayed in dashboards and reports

Statistical Modelling: Human-centric process of building models through data understanding using iterative mathematical approaches

Machine Learning: Compute-centric creation of mathematical models by learning about patterns in data and how they influence outcomes

Artificial Intelligence: A narrow set of machine learning algorithms that appear to have “human-like” qualities, typically applied to unstructured data

Big Data: A set of techniques that enable access to structured and unstructured data of varying size

Structured Data: Structured data in rectangular formats (tables, databases)

Unstructured Data: Non-standard data forms (image, text, video, sound, streams)

Warning: These definitions are badly misused and can vary greatly!


What this means for pharma

The pharma industry is, to some extent, already data-driven.

However, the data science “movement” is broadening the remit of analytics, which is impacting the role of analytics and its practitioners.

There are 5 ways in which Data Science is impacting pharma statisticians

REMIT: The interest in data science and AI is broadening the remit of analytic teams, opening up a wider range of challenges and moving the insight closer to the decision

ROLE: The changing remit of analytics can impact the role of practitioners as we’re increasingly asked to engage with the business and explore new opportunities

METHODS: A new focus on analytics enables practitioners to look at a broader range of analytic techniques to solve problems, including those based on machine learning approaches

DATA: Big data technologies are allowing a wider range of data sources to be collected and analysed, resulting in the use of new analytic approaches

TECH: All of this is having an impact on both the technology we employ, and the way in which we use technology in a more “DevOps” style to create repeatable insight generation


The broadening of the analytic remit is impacting our role and approach

• Increased visibility of data and analytics to support decision making
• Broader range of challenges to which analytics can be applied
• Increasing use of prescriptive analysis to drive decisions
• More openness to try new approaches and innovate

REMIT

Novartis is a data company, and I think data science is going to allow us to unlock even more insights across every element of our business. I'm incredibly excited about the power of these technologies to enable us to do even more as a company.

Vas Narasimhan, CEO


CONFIDENTIAL BETWEEN MANGO SOLUTIONS AND CIFAS The broadening of the analytic remit is impacting our role and approach

REMIT

• Quantitative Decision Making (QDM) framework built into development process
• Framework describes the quantitative characteristics of the proposed study design
• QDM helps understand probability of trial success
• QDM rolled out to all clinicians and clinical statisticians
• Data science tools built to facilitate and standardise process for analysts

Across our R&D, we are using AI to help us decipher a wealth of information with the aim of gaining a better understanding of the diseases we want to treat; identifying new targets for novel medicines; recruiting for and designing better clinical trials; driving personalised medicine strategies and speeding up the way we design, develop and make new drugs

Jim Weatherall, VP, Data Science & AI, R&D
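As a rough illustration of what "probability of trial success" can mean in a QDM-style framework, the sketch below computes the power of a two-arm trial at a fixed treatment effect, then averages that power over a normal prior on the effect (often called assurance). The slides do not say how the QDM framework performs this calculation. The normal prior, the known common standard deviation, and every function and parameter name here are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for quantiles and tail areas


def power(delta, sigma, n_per_arm, alpha=0.025):
    """One-sided power of a two-arm z-test at a fixed true effect `delta`.

    Assumes a known common standard deviation `sigma` (illustrative only).
    """
    se = sigma * sqrt(2.0 / n_per_arm)        # standard error of the difference
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)  # critical value for significance
    return STD_NORMAL.cdf(delta / se - z_crit)


def assurance(prior_mean, prior_sd, sigma, n_per_arm, alpha=0.025):
    """Probability of success averaged over a N(prior_mean, prior_sd^2) prior.

    Under this prior the observed difference is N(prior_mean, se^2 + prior_sd^2),
    so the chance that it clears the significance threshold has a closed form.
    """
    se = sigma * sqrt(2.0 / n_per_arm)
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    total_sd = sqrt(se ** 2 + prior_sd ** 2)
    return STD_NORMAL.cdf((prior_mean - z_crit * se) / total_sd)


if __name__ == "__main__":
    # Hypothetical design: assumed effect 0.5, SD 1, 64 patients per arm.
    print(f"power at the assumed effect: {power(0.5, 1.0, 64):.3f}")
    print(f"assurance with prior SD 0.2: {assurance(0.5, 0.2, 1.0, 64):.3f}")
```

When the power at the assumed effect is above 50%, acknowledging uncertainty about the effect lowers the probability of success, which is one reason a design team might look at both numbers.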


The nature of our roles is evolving as the remit of analytics changes

ROLE

• More emphasis on soft skills needed to engage with broader business
• Focus on modern techniques and technologies
• Curiosity to explore new approaches and answer new questions


The nature of our roles is evolving as the remit of analytics changes

ROLE

The Data Scientist is accountable for driving projects and improvement activities through technical consulting and value delivery leadership …

Advanced communications skills are critical for this role, with particular strengths required in distilling and communicating complex concepts …

… new and emerging data science technologies such as Machine/Deep Learning and Artificial Intelligence …

… will have a passion for discovering solutions hidden in large data sets and working with stakeholders to improve business outcomes …


New methodologies can be applied, initially in non-clinical space

METHODS

• Pharma analysis is primarily focused on (largely frequentist) statistical methods
• The use of machine learning and other techniques is becoming widespread, mostly in non-clinical areas
• The use of Bayesian methods in Clinical Statistics gives a good indication of the pace of change we could expect


New methodologies can be applied, initially in non-clinical space

METHODS

• ML methods used for subgroup analysis in clinical trials in respiratory
• AstraZeneca using DS and AI to help them recruit for, and design, better clinical trials
• Supervised learning techniques used in , high throughput screening
• Many examples of deep learning applied to images to create new endpoints for study
• Transfer learning to analyze molecular and imaging libraries as well as patient datasets to uncover complex biomarker patterns
• NLP and Deep Learning used to automate the classification and extraction of information from medical papers

https://blog.benchsci.com/pharma-companies-using-artificial-intelligence-in-drug-discovery

Big data technologies enable analysis of a wider variety of data formats

• Big data tech allows us to collect, store and manage new data sources
• Unstructured data sources such as text, image and video, as well as more traditional (but large) data sources such as genomics data
• Also supports analysis of data streams from devices (e.g. wearables, or devices that trigger alerts)

DATA

Example Uses of Wearables

• Asthma devices with IoT to monitor and analyse correct dosing
• Sensors used to supplement, or replace, pain endpoints
• Devices to monitor patient health
• Devices to trigger alerts on patient falls within healthcare facilities
• IoT used to trigger alerts for patients in the home


Demands on analysis changing our use of technology

TECH

As technology evolves, and the remit of analysis broadens, there is more demand to leverage innovations to improve drug development

• Push towards tech that offers modern capabilities, flexibility and scale (R, Py, Spark)
• Innovation around access to intelligence: advanced reporting, lightweight apps, interactivity

This changes the technology we use and our relationship with it

• Use of DevOps approaches becoming more commonplace where code is a primary output
• Collaborative working in teams around version control for efficient creation of common IP


Demands on analysis changing our use of technology

TECH

Creating applications to allow rich review of results

• Many initiatives looking at developing apps in technologies such as “Shiny” to provide richer experience for data review
• This includes initiatives to build applications to support FDA review of Tables, Figures and Listings

Reproducible research impacting reporting flows

• Programming approaches managing to separate components of reports to speed turnaround time to results from lock
• Approaches based on technologies such as R and Markdown, with concepts based on reproducible research (e.g. parameterised publication of standard reports)
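The idea of parameterised publication of standard reports can be sketched in a few lines. The slides name R and Markdown as the technologies in use; the example below swaps in Python's standard library purely to show the concept of one report template rendered many times with different parameters. The template layout, the function name and the study identifiers and numbers are all hypothetical.

```python
from string import Template

# A standard report written once as Markdown; $-placeholders are the parameters.
REPORT_TEMPLATE = Template(
    "# Summary report: $study_id\n"
    "\n"
    "Data cut-off: $cutoff\n"
    "\n"
    "| Arm | N | Mean change |\n"
    "| --- | --- | --- |\n"
    "$rows\n"
)


def render_report(study_id, cutoff, arm_summaries):
    """Fill the template for one study.

    `arm_summaries` is a list of (arm name, n, mean change) tuples.
    """
    rows = "\n".join(
        f"| {arm} | {n} | {mean:.2f} |" for arm, n, mean in arm_summaries
    )
    # substitute() raises KeyError if a parameter is missing, so an
    # incomplete report fails loudly instead of being published.
    return REPORT_TEMPLATE.substitute(study_id=study_id, cutoff=cutoff, rows=rows)


if __name__ == "__main__":
    # Hypothetical study identifiers and made-up numbers, for illustration only.
    summaries = [("Placebo", 50, -0.12), ("Active", 52, -0.87)]
    for study in ("STUDY-001", "STUDY-002"):
        print(render_report(study, "2019-01-31", summaries))
```

Keeping the template separate from the parameters means the same reviewed layout is reused for every study, and only the inputs change between runs.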


Summary

Over the last 20 years, the hype has raised the profile of, and expectations on, data & analytics.

The hype focused on a set of “buzzwords” which are now in common use.


Data & Analytic Terminology

Data Science: The proactive application of data and advanced analytics to drive better decision making

Business Intelligence: Delivering management information via reports and dashboards

Advanced Analytics: The application of data and math to model real-world processes and likely future outcomes

Descriptive Analytics: The use of reporting and basic analysis to summarise historical data using charts, tables and statistics, with results often displayed in dashboards and reports

Statistical Modelling: Human-centric process of building models through data understanding using iterative mathematical approaches

Machine Learning: Compute-centric creation of mathematical models by learning about patterns in data and how they influence outcomes

Artificial Intelligence: A narrow set of machine learning algorithms that appear to have “human-like” qualities, typically applied to unstructured data

Big Data: A set of techniques that enable access to structured and unstructured data of varying size

Structured Data: Structured data in rectangular formats (tables, databases)

Unstructured Data: Non-standard data forms (image, text, video, sound, streams)

There are 5 ways in which Data Science is impacting pharma statisticians

REMIT: The interest in data science and AI is broadening the remit of analytic teams, opening up a wider range of challenges and moving the insight closer to the decision

ROLE: The changing remit of analytics can impact the role of practitioners as we’re increasingly asked to engage with the business and explore new opportunities

METHODS: A new focus on analytics enables practitioners to look at a broader range of analytic techniques to solve problems, including those based on machine learning approaches

DATA: Big data technologies are allowing a wider range of data sources to be collected and analysed, resulting in the use of new analytic approaches

TECH: All of this is having an impact on both the technology we employ, and the way in which we use technology in a more “DevOps” style to create repeatable insight generation


Enabling your data-driven journey

Helping you to thrive on data science


Richard Pugh, Chief Data Scientist
[email protected]
@richatmango
linkedin.com/in/richatmango/

RSS Data Science Section
groups.io/g/datasciencesection
linkedin.com/company/rssdatascience

Mango Solutions
@MangoTheCat
www.mango-solutions.com
+44 1249 705450

London Office: Dawson House, 5 Jewry Street, London EC3N 2EX

Chippenham Office: Mango Solutions, 2 Methuen Park, Chippenham SN14 0GB
