Machine Learning and Cognitive Computing: A Proposed Framework to Navigate the Opportunities

By Ajay Bhilegaonkar

M.S. Electrical Engineering, University of Texas

Submitted to the System Design and Management Program in Partial Fulfillment of the Requirements for the Degree of

Master of Science in Engineering and Management

at the Massachusetts Institute of Technology

June 2016

© 2016 Ajay Bhilegaonkar. All rights reserved.

The author hereby grants to MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created.

Signature of Author: [Signature redacted]
Ajay Bhilegaonkar, System Design and Management Program, 06 May 2016

Certified by: [Signature redacted]
Jeanne Ross, Thesis Supervisor, Research Director and Principal Research Scientist, Center for Information Systems Research, MIT Sloan School of Management

Accepted by: [Signature redacted]
Patrick Hale, Director, System Design & Management Program

Machine Learning and Cognitive Computing: A Proposed Framework to Navigate the Opportunities

By Ajay Bhilegaonkar

Submitted to the System Design and Management Program on February, 2016 in Partial Fulfillment of the Requirements for the Degree of Master of Science in Engineering and Management

Abstract

The Machine Learning and Cognitive Computing universe is buzzing again. Recent significant events are special. There is also talk about the beginning of a general-purpose "Smart Machine Age".

Advances in computing power, storage capacity and machine learning / cognitive computing technologies have gained critical mass. This combination is driving significant growth and heavy investments. Cognitive computing is coming of age, the market is experiencing exponential growth, and there are literally thousands of startups competing to seize the opportunities and hundreds of products hitting the market every quarter. Businesses definitely need to pay attention.

But for a business professional, there is so much happening out there that it is extremely hard to decide which way to turn. CC/ML opportunities may have huge potential to improve business performance, or they may be opportunities to waste money. This is a major concern for large businesses and business professionals.

This thesis aims to develop an end to end framework to navigate CC/ML opportunities. The framework will guide a business professional to navigate the complex landscape of CC/ML and arrive at a solution approach recommendation.

Thesis Advisor: Jeanne Ross
Title: Research Director and Principal Research Scientist, Center for Information Systems Research, MIT Sloan School of Management

Acknowledgements

I would like to express my sincere gratitude towards my Thesis Advisor Dr. Jeanne Ross. This work would not have been possible without her guidance and advice.

I would like to sincerely thank SDM Director Patrick Hale for giving me the opportunity to participate in this program. Thanks to all the professors and students at MIT, from whom I learned.

Thanks to my wife for her support and patience throughout this program. I would also like to thank my family for encouragement.


Table of Contents

Chapter 1. Introduction
1.1 Shifting paradigm in CC/ML Universe
1.2 Hype and Excitement in CC/ML is evident
1.3 Recent growth of the CC/ML market is explosive
1.4 Research Motivation
1.5 Thesis Statement & Primary Research Objectives

Chapter 2. The State of the Art of Artificial Intelligence / Cognitive Computing / Machine Learning
2.1 Artificial Intelligence
2.2 Cognitive Computing
2.3 Davenport Cognitive Computing Categories 1
2.4 Machine Learning
2.5 Tenets of Machine Learning 8
2.6 Learning Mechanisms for Machine Learning
2.7 Supervised Learning
2.8 Simple example of supervised learning
2.9 Anthropomorphizing Technology adds to confusion

Chapter 3 CC/ML Use Cases
Use Case 1
Use Case 2
Use Case 3
Use Case 4
Use Case 5
Use Case 6
Use Case 7
Use Case 8
Use Case 9
Use Case 10
Use Case 11
Use Case 12

Chapter 4 Technology Landscape
4.1 Software Development Tools
4.2 Cloud Platforms and Services
4.3 Enterprise Scale Software Solutions
4.4 Cognitive Computing / Machine Learning Startups
4.5 Software Development Tools / Programming Languages
4.6 Cloud Development Platforms / Software as a Service
4.7 Traditional Enterprise Solutions
4.8 Startups

Chapter 5 Analysis Framework
5.1 Davenport Categories of Cognitive Computing Technologies 1
5.2 Davenport / Kirby Knowledge Task Survey
5.3 Original Davenport / Kirby Matrix
5.4 Technology Readiness Level
5.5 Modified Davenport / Kirby Matrix
5.6 TRL scores on Davenport Kirby Matrix
5.7 Accenture Model for Cognitive Computing 34
5.8 Realizing Business Value from CC/ML Projects is a Journey
5.9 Enterprise Capability Assessment for ML / CC
5.10 Proposed End to End Business Analysis Framework

Chapter 6 Analysis of Use Cases in Pharmaceutical Industry
6.1 Pathology Image / Data Analysis to discover new Cancer Treatments
6.2 Scientifically Aware Search
6.3 Chemical Synthesis Step Robustness through Machine Learning
6.4 Business Process Automation

Chapter 7. Conclusion
7.1 CC/ML Universe is Complex to Navigate
7.2 Systematic Analysis framework aids in navigating the landscape
7.3 Future Research

Bibliography
Appendix A - CC/ML Startup Ecosystem

Chapter 1. Introduction

In February 2011, IBM's Watson destroys human competitors on the TV game show "Jeopardy!". A 2015 book, "The Master Algorithm", declares that artificial intelligence and big-data technology will remake the world. In March 2016, AlphaGo (Google DeepMind's artificial intelligence program) wins at the ancient Chinese game of Go, defying millennia of basic human instinct and defeating Lee Sedol 2, world champion, 4 games to 1.

What is happening now is amazing because, when IBM's Deep Blue defeated world champion Garry Kasparov at chess in 1997, Deep Blue just followed pre-programmed instructions. Deep Blue did not come up with its own moves. In 2016, by contrast, AlphaGo astonished the world by doing just that. AlphaGo came up with moves that even experts could not have contemplated making and could not explain. Experts did not see the cleverness of such a move right away.

At first, Fan Hui, a Go expert, thought AlphaGo's move was rather odd. But then he saw its beauty 3. "It's not a human move. I've never seen a human play this move," he said. "So beautiful."

1.1 Shifting paradigm in CC/ML Universe

These and many other recent significant events in the CC/ML universe have a common theme. They demonstrate the high profile of Cognitive Computing / Machine Learning (CC/ML) applications and the potential they have to transform business computing profoundly. CC/ML is definitely worth studying now because, at this moment in time, these technologies are mature enough to do impressive things and to attract heavy investment from people and companies, yet immature enough that there are many pitfalls.

1.2 Hype and Excitement in CC/ML is evident

There is a lot of hype and a lot of excitement in the cognitive computing / machine learning domain. Since machines are demonstrating that they learn in a meaningful way and apply what they learn to problem solving, it is safe to say that CC/ML applications are coming of age.

Gartner does not stop at agreeing that high-profile cognitive computing and machine learning applications have arrived, but goes even further and makes a few bold assertions 4 about a 'smart machine big bang' and a 'smart machine age'.

Gartner, in 'Smart machines see major breakthroughs after decades of failure', refers to the confluence of three factors (hardware, deep neural nets, and massive data) as the "smart machine big bang". The report 4 concludes that these three forces have interacted explosively and have gained a critical mass that was previously unattainable.

As a part of the smart machine 'Big Bang', radical new hardware, massive amounts of data, and unprecedented advances in deep neural networks are creating a new, 75-year general-purpose technology cycle: The Smart Machine Age.

1.3 Recent growth of the CC/ ML market is explosive

Cognitive computing / machine learning is growing explosively as an industry. Gartner research 5 shows that advanced analytics (which overlaps with machine learning) is the fastest-growing segment in the business analytics software market.

IDC Research reports6 that by 2018 half of all consumers will interact with services based on cognitive computing / machine learning on a regular basis. Market revenue for cognitive software platforms will exceed $15 Billion by 2025 and overall market revenue for cognitive solutions will exceed $60 Billion by 2025.

[Figure: Cognitive Software Platform Forecast, 2014-2019. Revenue ($M) with Growth (%).]

Commercial cognitive software platforms have just begun to emerge on the market scene. This category of software, used to build "smart" applications and expert advisors, will grow rapidly over the next five years, enabling a multi-billion dollar intelligent applications market.

[Figure: Cognitive Systems Forecast, 2014-2019. Revenue ($M) with Growth (%). Legend: Total Market; On-premise/Other (CAGR 13.1); Public Cloud (CAGR 31.6).]

Figure 1.3a: Cognitive Platform and Solution growth 6

The cognitive computing / machine learning startup ecosystem is also witnessing huge investment and merger and acquisition activity. Bloomberg Beta 7 reports that there are over 2,529 separate firms participating in its 'Machine learning' category.

Select Cognitive Systems VC Funding
$700M in 2014 and 2015

Company                 Amount     Year
Palantir                $500M+     2014 (multiple rounds)
Kensho                  $15M       2014
Sentient Technologies   $103M      2014
Scaled Inference        $8M        2014
Highspot                $9.6M      2014
Digital Reasoning       $24M       2014
Saffron Technology      $7M        2014
Context Relevant        $13.5M     2014
Metamind                $8M        2014
Viv Labs                $12.5M     2015

Figure 1.3b: Venture Capital Funding in Cognitive Systems Market 6

There is a frenzy out there for top talent and top startup company acquisitions. (Notice Google's DeepMind Acquisition)

Notable Cognitive and AI Acquisitions

Acquirer       Vendor            Date        Amount
Google         DeepMind          Jan. 2014   >$400M
Microsoft      Equivio           Jan. 2015   ~$200M
Yahoo          Incredible Labs   Jan. 2014   NA
IBM            Cognea            May 2014    NA
Google         Wavii             Apr. 2013   >$30M
Walmart Labs   Inkiru            Jun. 2013   NA
Microsoft      Aorato            Nov. 2014   ~$200M
Google         Emu               Aug. 2014   NA
Intel          Indisys           Sep. 2013   >$26M
Twitter        Madbits           Jul. 2014   NA
Facebook       Wit.ai            Jan. 2015   NA
IBM            Alchemy API       Mar. 2015   NA

Figure 1.3c: Notable Cognitive Systems Mergers and Acquisitions 6

What is driving this exponential growth in the cognitive systems / machine learning / artificial intelligence domain? Why is this growth happening now? What are the enabling factors contributing to this state?

When research from Gartner 5, 8 and IDC 6 is synthesized, the following list emerges:

- Data growth explosion: IDC said that in 2011 there were 1.8 zettabytes (or 1.8 trillion GB) of information. In 2012 data volume reached 2.8 zettabytes, and IDC forecasts 40 zettabytes (ZB) by 2020.
- Ubiquity of unstructured content suitable for machine learning / cognitive techniques, where traditional software fails: IDC says unstructured content such as email, video and instant messages accounts for 90% of all digital information.
- Machine learning architectures are now possible: cloud computing and ever cheaper storage and in-memory processing, combined with extremely high-powered computing.
- Recent leaps in deep neural network technologies and natural language processing
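The IDC data volume figures quoted above imply a compound annual growth rate, which can be worked out directly. The sketch below is illustrative arithmetic only: it uses the 2011 and 2020 figures from the text, assumes decimal units (1 ZB = 10^21 bytes, 1 GB = 10^9 bytes), and the function name `implied_annual_growth` is hypothetical.

```python
# Figures quoted from IDC in the text above.
ZB_2011 = 1.8   # zettabytes of information in 2011
ZB_2020 = 40.0  # IDC forecast for 2020

# Assumption: decimal units, 1 ZB = 10^21 bytes and 1 GB = 10^9 bytes.
GB_PER_ZB = 10**21 // 10**9


def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that turns `start` into `end` over `years`."""
    return (end / start) ** (1.0 / years) - 1.0


rate = implied_annual_growth(ZB_2011, ZB_2020, 2020 - 2011)
print(f"1.8 ZB = {ZB_2011 * GB_PER_ZB:.2e} GB")
print(f"Implied annual growth 2011-2020: {rate:.1%}")
```

Under these assumptions the forecast corresponds to data volume growing by roughly 41% per year over the nine-year period.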

1.4 Research Motivation

A New York Times article 9 claims that machine learning or artificial intelligence is changing the way we use computing, ranging from globe-spanning computer systems to how you pay at the cafeteria. In the same article 9, Diane Greene of Google says 'Just teaching companies how to use A.I. will be a big business'.

"In the '80s, it was spreadsheets," said Andreas Bechtolsheim, a noted computer design expert who was Google's first investor. "Now it's what you can do with machine learning." He added: "Better maps and photos is just the start. It's going to be in life sciences, automobiles, everything."

This is not happening only in the venture capital arena or at Silicon Valley technology companies, though.

The McKinsey Quarterly executive guide 10 (3Q 2015) asserts that machine learning is no longer the preserve of artificial intelligence researchers and born-digital companies like Amazon, Google and Netflix.

Gartner 4, 5, 8 predicts that smart machine technologies will significantly impact virtually every industry sector over the next five years. I agree with these assessments.

Cognitive computing is coming of age, the market is experiencing exponential growth and there are literally thousands of startups competing to seize the opportunities and hundreds of products hitting the market every quarter. Businesses definitely need to pay attention.

But for a business professional, there is so much happening out there that it is extremely hard to decide which way to turn. CC/ML opportunities may have huge potential to improve business performance, or they may be opportunities to waste money. This is a major concern for large businesses and business professionals.

1.5 Thesis Statement & Primary Research Objectives

While there is a lot of hype and excitement, it is not clear what business benefits can be accrued. This thesis will help businesses navigate this complex landscape, so that business professionals can decide when, where and how to start leveraging these amazing technologies that are still immature in many aspects.

How can businesses navigate the opportunities presented by CC/ML to enhance business performance?

Specifically, what opportunities are currently available to large pharmaceutical companies? How do these companies position themselves to recognize and cash in on emerging opportunities?

This thesis answers the aforementioned questions by explaining relevant technology and terminology, describing broad CC/ML use cases, and categorizing technology providers. All of these are accomplished through literature review. Pharmaceutical industry specific use cases are identified through stakeholder interviews conducted with 2 research scientists and 6 Director / Associate Director level IT professionals. Based on the literature review and the thesis writer's industry experience, the thesis proposes a framework for making high-level decisions to invest in ML/CC opportunities. This framework is then applied to the pharmaceutical industry use cases identified through the interviews.

The thesis is organized as follows:

Chapter 2 introduces the concepts of artificial intelligence, cognitive computing and machine learning and describes the state of the art.

Chapter 3 provides examples of use cases across diverse business domains such as marketing and sales, retail, financial risk management, fraud detection, customer engagement, healthcare, pharmaceuticals, hoteling and others. Referring back to the concepts in Chapter 2, I highlight the business outcomes that have been achieved through machine learning applications and the more limited application (and hoped-for outcomes) of more human-like technology tools.

Chapter 4 introduces three categories to segment the machine learning / cognitive systems technology landscape. The chapter then provides high-level technical details that will help an IT professional quickly sift through possible development and delivery mechanisms for ML/AI.

Chapter 5 reviews the Davenport/Kirby and Accenture approaches to disambiguating the cognitive computing landscape, and modifies the Davenport/Kirby approach by including technology readiness level. Chapter 5 also introduces the need for, and a mechanism to conduct, an enterprise capability assessment. Finally, in order to navigate the opportunities and make vendor selection and investment decisions, the chapter proposes a 10-step analysis framework. Chapter 5 highlights the contributions made in this thesis toward disambiguating the cognitive computing landscape, looking at opportunities from a business case perspective and making informed decisions.

Chapter 6 applies the 10-step analysis framework proposed in Chapter 5 to pharmaceutical industry use cases and proposes a solution approach based on the results of that analysis.

Chapter 7 reflects on the learning from the research. It offers next steps for pharmaceutical companies that want to develop competency in ML/CC. It then generalizes the learning from pharmaceutical companies to recommend how other companies can start to take advantage of CC/ML.

Chapter 2. The State of the Art of Artificial Intelligence / Cognitive Computing / Machine Learning

"Viewed narrowly, there seem to be almost as many definitions of intelligence as there were experts asked to define it." - R. J. Sternberg, quoted in [The Oxford Companion to the Mind by R. L. Gregory 11]

[Word cloud; legible terms include: Automated Cars, Speech Processing, Robotics, Artificial Intelligence, Text Analytics, Natural Language Processing.]

Figure 2: Word Cloud Artificial Intelligence / Cognitive Computing / Machine Learning (not scientific)

Like intelligence, AI has many definitions, depending on who is defining it. The broader term Artificial Intelligence may refer to machine learning, cognitive computing, intelligent machines, smart machines, automated machines and so on.

There are other frequently used terms such as neural networks, random forests, robotics and computer vision, which all have at least some overlap, and in some cases a lot of overlap, with Artificial Intelligence, Cognitive Computing and Machine Learning.

2.1 Artificial Intelligence

Here is a Gartner definition 12 that I broadly agree with: AI is technology that appears to emulate human performance, typically by learning, coming to its own conclusions, appearing to understand complex content, engaging in natural dialogs with people, enhancing human cognitive performance (also known as cognitive computing) or replacing people on execution of non-routine tasks. Applications include autonomous vehicles, automatic speech recognition and generation, and detecting novel concepts and abstractions (useful for detecting potential new risks and helping humans quickly understand very large bodies of ever-changing information).

It is helpful to recognize that cognitive computing comes under the larger Artificial Intelligence umbrella, and that machine learning comes under the cognitive computing umbrella. CC/ML have the most relevant business applications; other areas are still immature.

Figure 2.1: Thesis writer's interpretation of Artificial Intelligence / Cognitive Computing / Machine Learning overlap (not scientific)

2.2 Cognitive Computing

Cognitive computing refers to systems that learn at scale, reason with purpose and interact with humans naturally. Rather than being explicitly programmed, they learn and reason from their interactions with us and from their experiences with their environment. 13

- Dr. John E. Kelly III, Sr. V.P. IBM Research and Solutions

IBM is credited with coining the term cognitive computing. IBM contends that prior information systems have been deterministic and that cognitive systems depart from older approaches in that they are probabilistic. They not only provide results as outputs but also generate hypotheses, reasoned arguments and recommendations about more complex and meaningful sections of data. Very importantly, cognitive systems deal with "unstructured data". Over 80% of all available data is unstructured data. IBM Watson, the first cognitive system, deals with unstructured data.

In 2011, Watson defeated humans in Jeopardy 4. Playing and winning Jeopardy required making sense of messy, unstructured data and answering subtle, complex and deliberately twisted questions intended for humans. Traditional programmable systems could not have made sense of Jeopardy.

2.3 Davenport Cognitive Computing Categories 1

In order to truly describe and cover the broad spectrum of technologies involved in Cognitive Computing, Thomas Davenport proposed six categories of Cognitive Computing Technologies 1.

Analytics / Predictive Analytics:

Technologies or smart systems that mostly deal with structured data. This data is usually in the form of numbers, for example financial transactions or sales data. Analytics technologies apply a broad spectrum of statistical or mathematical analysis techniques to numbers to find patterns, insights or trends. The continuously improving analysis techniques and the ability to compute faster and process more data have been significantly improving the field of analytics. Analytics technologies use data visualization techniques to communicate the insights that can lead to improved performance or system optimization.

Image Recognition / Computer Vision:

Image processing means acquiring images from the real world, representing them in the form of numbers, conducting analysis and then making decisions. Image processing has been around for a long time, but recent advances have been profound. Facebook's DeepFace now recognizes the faces of your friends. Google Brain recognizes cats in images from across the internet.

Rules and Business Rules:

Rules express business logic in a structured manner using simple constructs such as if/then statements. Expert systems were programmed using these types of business rules in order to emulate the decision-making ability of a human expert. Expert systems have been around since the 1970s and were the first successful demonstration of AI software. An expert system might employ a business rule that dictates how a customer is to be treated when applying for a mortgage. Whether the credit application is accepted or rejected would depend on a known threshold such as a credit score.
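A business rule of this kind can be written directly as an if/then statement. The following is a minimal sketch, not taken from any real expert system: the function name `decide_mortgage_application` and the threshold of 650 are hypothetical, chosen only for illustration.

```python
# Hypothetical threshold; a real lender would set its own value.
CREDIT_SCORE_THRESHOLD = 650


def decide_mortgage_application(credit_score: int,
                                threshold: int = CREDIT_SCORE_THRESHOLD) -> str:
    """Apply a single if/then business rule to a credit application."""
    if credit_score >= threshold:
        return "accept"
    return "reject"


for score in (720, 650, 580):
    print(score, decide_mortgage_application(score))
```

Note that the threshold is fixed by whoever writes the rule. Nothing here is learned from data, which is what separates this category from the machine learning category that follows.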

Machine Learning / Neural Networks:

These technologies build on data analytics with more advanced concepts, and also involve creating a model from data (model fitting). These models can also be self-learning. Approaches to machine learning include neural networks, deep neural networks, Bayesian classifiers, support vector machines and others.

Natural Language Processing:

NLP deals with human-computer interaction. Speech recognition, reading text content and deriving meaning from the processed textual information are all natural language processing. IBM Watson and Apple Siri are examples of systems that attempt to understand spoken languages and act on the information.

Complex Event Processing:

Event processing systems deal with real-time data from a diverse set of sources and types. They then attempt to aggregate the information, understand it and take action. CEP systems are widely used in the financial domain, for example in stock trading and credit fraud detection.

From a business analysis perspective, the broad categories of cognitive computing technologies listed above help cut through the fog.

2.4 Machine Learning

"A field of study that gives computers the ability to learn without being explicitly programmed." - Arthur Samuel 5

Machine learning often applies classical statistical methods to large amounts of data that were not digitally available until a decade ago, at a time when the computing power to process them did not exist either. Paradigm-shifting improvements in computing hardware and software are making advanced business applications conceivable and possible.

Advanced machine learning algorithms use many techniques including neural networks, support vector machines, genetic algorithms, random forests, decision trees and more recently deep learning. These techniques operate guided by lessons learned from existing information.

The core idea is machines learning on their own.

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. - Tom Mitchell, formal definition of machine learning 16

2.5 Tenets of Machine Learning 8

- Machine learning relates to extracting knowledge from data

- Machine learning is promising for unstructured data analysis problems where deterministic approaches such as traditional programming are not very successful

- Machine learning can deal with a variety of data types such as text, numbers, images, video, real-time social media streams etc.

- Machine learning use cases can be plotted on a spectrum wide and deep that ranges from personal digital assistants (Apple Siri), marketing, sales, financial fraud prediction and detection, investment management, drug development, disease diagnostics, R&D to virtually every industry sector imaginable. A survey of use cases can be found in chapter 3.

2.6 Learning Mechanisms for Machine Learning

There are various learning approaches based on availability of data and goals of the tasks to be learned.

Machine Learning approaches:

- Supervised learning - Semi supervised learning - Reinforcement learning - Unsupervised learning

The table below provides additional details about each of the approaches listed above.

Approaches employed in teaching machines to learn 16:

Approach: Description

Supervised learning: Inferring useful concepts or structure from observations (data) whose outcomes of interest (labels) are known to the learner in the form of a training data set that contains pairs of observations and outcomes.

Semi supervised learning: Inferring useful structure or concepts from observations (data) whose outcomes of interest (labels) are known to the learner for only part of the observations. One could consider semi supervised learning a sub-category of supervised learning, where a small fraction of the training data is labeled with outcomes of interest and the remaining data is not labeled.

Reinforcement learning: Reinforcement learning differs from supervised learning in that correct observation / outcome pairs (labeled data pairs) are never presented to the learner. Instead, based on the learner's current state and action, the learning context changes its state and provides a reward signal as feedback. The learner tries to maximize the reward.

Unsupervised learning: Inferring useful structure or concepts from observations (data) whose outcomes of interest (labels) are not known to the learner. Since the outcomes are not known, there is no error as in supervised learning, nor a reward signal as in reinforcement learning.

2.7 Supervised Learning

Gartner reports that over 95% of all known machine learning use cases utilize supervised learning in some fashion or form. Thus, for purposes of this research, I will focus on supervised learning approaches to machine learning.

Supervised learning deals with labeled pairs of inputs (observations) and outputs (outcomes). This dataset is often referred to as training data. Training data is used to teach the machine learning algorithm a known 'input to output' relationship. Using this taught relationship between observation and outcome, the machine learning algorithm is expected to produce outcomes for inputs that were not part of the training data.
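The train-then-predict pattern described above can be sketched with a one-nearest-neighbour learner, one of the simplest supervised methods. This is an illustrative sketch only: the technique is chosen for brevity rather than because the thesis prescribes it, and the training pairs, labels and function name are hypothetical.

```python
from typing import List, Tuple

# Hypothetical labeled training data: (observation, outcome) pairs.
# Observation = monthly spend in dollars, outcome = customer segment.
TRAINING_DATA: List[Tuple[float, str]] = [
    (20.0, "low"),
    (35.0, "low"),
    (90.0, "medium"),
    (110.0, "medium"),
    (300.0, "high"),
]


def predict(observation: float,
            training_data: List[Tuple[float, str]] = TRAINING_DATA) -> str:
    """Return the outcome of the training observation closest to the input."""
    _, nearest_label = min(
        training_data, key=lambda pair: abs(pair[0] - observation)
    )
    return nearest_label


# Inputs that were not part of the training data.
print(predict(28.0))   # closest labeled observation is 35.0
print(predict(250.0))  # closest labeled observation is 300.0
```

The learner never sees 28.0 or 250.0 during training; it produces an outcome for them only by reusing the relationship captured in the labeled pairs.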

The relationship between observation and outcome could be as simple as speed and time to destination or as complex as predicting weather. There are use cases where relationships are used to predict stock prices, flight arrival times, and consumer demand.

This relationship can be expressed in many ways. In supervised learning, many choices such as linear regression, artificial neural networks, decision trees or genetic algorithms can be utilized. Deep neural networks are the most recent and most advanced models in use.

2.8 Simple example of supervised learning

A mobile phone maker wants to know how much more a customer might be willing to pay if the number of features in a phone is increased. To keep it simple, let us denote the number of features by x. See figure 2.8. For x features, we want to predict how much a customer is willing to pay (y).

In this illustration only a few data points are shown to convey the concept and not get bogged down by numbers. In reality, this equation could be more complicated with thousands of independent variables.

A very simple linear regression can be used to fit this data. It is important to note that if the machine simply plugs data into the regression, it is not learning. The machine only learns if it develops and continues to change the regression equation based on new data as it receives more inputs and outputs. Many ML techniques use far more sophisticated and complex algorithms and massive volumes of data.
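The distinction drawn above, between plugging data into a fixed equation and revising the equation as data arrives, can be sketched as follows. This is a minimal illustration assuming a single input variable and ordinary least squares; the class name `OnlineLinearRegression` and the sample points are hypothetical.

```python
class OnlineLinearRegression:
    """Least-squares line y = m*x + c, refit as each new (x, y) pair arrives."""

    def __init__(self) -> None:
        # Running sums are enough to recompute the least-squares line.
        self.n = 0
        self.sum_x = 0.0
        self.sum_y = 0.0
        self.sum_xx = 0.0
        self.sum_xy = 0.0
        self.m = 0.0
        self.c = 0.0

    def learn(self, x: float, y: float) -> None:
        """Fold one observation into the sums and update slope and intercept."""
        self.n += 1
        self.sum_x += x
        self.sum_y += y
        self.sum_xx += x * x
        self.sum_xy += x * y
        denom = self.n * self.sum_xx - self.sum_x ** 2
        if denom == 0:
            # Fewer than two distinct x values: fall back to the mean of y.
            self.m, self.c = 0.0, self.sum_y / self.n
        else:
            self.m = (self.n * self.sum_xy - self.sum_x * self.sum_y) / denom
            self.c = (self.sum_y - self.m * self.sum_x) / self.n

    def predict(self, x: float) -> float:
        """Evaluate the current line at x."""
        return self.m * x + self.c


model = OnlineLinearRegression()
for x, y in [(1, 60), (2, 85), (3, 95), (4, 125)]:  # hypothetical data
    model.learn(x, y)
    print(f"after {model.n} points: y = {model.m:.2f}x + {model.c:.2f}")
```

Each call to `learn` changes the slope and intercept, so the equation used for prediction keeps moving as more input and output pairs are received.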

[Plot: y (amount a customer is willing to pay) against x (number of features), showing the data points and a fitted trend line labeled "Linear (y)".]

Figure 2.8: Machine Learning Regression Plot

The dotted blue line in the plot can be represented by y = mx + c. This line can be obtained by feeding the data to a mathematical routine that returns the slope m and the intercept c.
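One possible form of such a mathematical routine is the ordinary least-squares fit sketched below. The data points are hypothetical and are not the values plotted in Figure 2.8; the function name `fit_line` is likewise an assumption made for illustration.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return slope m and intercept c of the least-squares line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = co-variation of x and y divided by the variation of x.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    m = s_xy / s_xx
    c = mean_y - m * mean_x
    return m, c


# Hypothetical data: number of features vs. price a customer will pay.
features = [1, 2, 3, 4]
prices = [60, 80, 100, 120]

m, c = fit_line(features, prices)
print(f"y = {m}x + {c}")
print("predicted price for 5 features:", m * 5 + c)
```

With these made-up points the fitted slope is 20 and the intercept is 40, so each additional feature is associated with a 20 unit increase in the predicted price.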

2.9 Anthropomorphizing Technology adds to confusion

Recognizing the confusion over hyped terms and the unmet expectations of general-purpose AI over the past few decades, Gartner 17 recommended the following to cut through the fog:

Anthropomorphizing technology inaccurately creates marketing hype and unrealistic expectations. Use descriptive terms for technology. This approach will differentiate it from people (the human brain) and set accurate and achievable expectations.

IBM's own technical expert (Sr. VP of Research, not marketing) says 13:

"The success of cognitive computing will not be measured by Turing tests or a computer's ability to mimic humans. It will be measured in more practical ways, like return on investment, new market opportunities, diseases cured and lives saved."

This chapter introduced machine learning and explained it using a very simple real-life example. The chapter then provided a very brief overview of cognitive computing and demonstrated that a plethora of possible interpretations only adds to the confusion. The chapter defined artificial intelligence and went on to recommend avoiding comparison to humans, or anthropomorphizing technology.

Chapter 3 CC/ML Use Cases

Cognitive Computing / Machine Learning use cases can be plotted on a spectrum that runs wide and deep. I found use cases in the literature described as CC/ML use cases. Purists might argue that some of them are not really machine learning but should be categorized as predictive analytics or something else. I do not get into that debate since, as discussed in Chapter 2, I agree with the Davenport line of thinking that cognitive computing technologies lie on a spectrum. Rather, I pick use cases at different locations on this spectrum that use a variety of technologies in the CC/ML domain, across a wide range of sophistication and complexity.

The use cases are framed in a specific format, emphasizing the need, the solution and the business outcomes.

Use Case 1

Financial services companies monitor trading using unsupervised learning to predict and prevent fraud18

The Need: An estimated $15 billion is lost to trading fraud. Many financial companies needed improved mechanisms to predict likely fraud and prevent significant losses.

The Solution: BAE Systems' product NetReveal is used to detect trading fraud. NetReveal implements unsupervised learning in the form of self-organizing maps, a neural network method. These maps are used to detect unauthorized trading. The maps show features of traders' activity such as volume of trades, most frequently traded products, and other key risk indicators. The system uses a neural network to automatically cluster traders exhibiting similar behaviors.

The Outcomes: Automatic trader clustering based on risk indicators allows the NetReveal system to identify significant changes in trader behavior. An example would be a front-office trader behaving like a back-office trader.

The solution has been proven to identify unauthorized trading before significant losses occur and predicts trader misbehavior about six months ahead, giving the bank ample time to avoid actual losses.
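The clustering idea in Use Case 1 can be illustrated with a minimal sketch. It is not NetReveal's method: it swaps in plain k-means for self-organizing maps, and the trader names, the two features and all values are invented for illustration. Traders are grouped by their historical behavior, and a trader is flagged when a new observation falls into a different group than before.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means: alternate cluster assignment and centroid update."""
    centroids = [list(c) for c in initial_centroids]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids

# Invented features per trader, both scaled to 0..1:
# (relative trade volume, share of trades in the most traded product)
history = {
    "trader_a": (0.90, 0.30), "trader_b": (0.85, 0.35), "trader_c": (0.95, 0.25),
    "trader_d": (0.15, 0.90), "trader_e": (0.20, 0.85), "trader_f": (0.10, 0.95),
}

names = list(history)
points = [history[n] for n in names]
# Deterministic start: one seed point from each visibly different group (an assumption).
centroids = kmeans(points, [points[0], points[3]])
baseline = {n: nearest(history[n], centroids) for n in names}

# New observations: trader_a now behaves like the low-volume group.
latest = {"trader_a": (0.20, 0.80), "trader_d": (0.12, 0.92)}
flagged = [n for n, obs in latest.items() if nearest(obs, centroids) != baseline[n]]
print(flagged)  # prints ['trader_a']
```

A production system would use many more risk indicators and would need to separate genuine behavior shifts from noise; the sketch only shows why clustering makes a change in behavior visible.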

Use Case 2

Cross-channel fraud detection in credit card organizations 8

The Need: A US credit card organization with millions of user accounts needed to improve security and prevent fraud, but candidate solutions would adversely affect user experience. The company had in-house and multiple third-party commercial fraud risk scoring engines. These engines were not solving the problem. Growth in chip-based cards shifted fraud to alternate channels. Existing fraud scores were conflicting, resulting in poor user experience.

The Solution: Feedzai's random forest, SVM, and other machine learning models were used to create a detection system.

The Outcomes: Fraud detection rates improved by more than 40%. This resulted in a $125 million increase in savings. The system identified that 68% of fraud was cross-channel. The fraud detection models were fast.

Use Case 3

US bank anti-money-laundering (AML) improvement efforts18

The Need: A US bank had been observing an unacceptable level of false positives in anti-money-laundering alerts. The bank was frustrated by imperfect AML investigations. Financial auditors were losing confidence in the system and demanding more transparency.

The Solution: The bank deployed SAS' AML solution. The SAS-based solution applied a hybrid model of supervised and unsupervised regression and decision trees.

The Outcomes: The SAS-based system improved suspicious activity report (SAR) filing using a transparent and auditable process. Auditors were able to understand the system and found it more open and acceptable. The system saved $1 million in AML investigations in year one. It reduced the investigators' workload by 46%, since false positives were significantly reduced.

Use Case 4

Automatic real-time product recommendations at multiple online retailers19

The Need: Prevent the loss of sales opportunities for products that online retail customers might want to buy but are NOT actively looking for. Alternatively, customers may NOT know that those products are sold online or by this retailer. Provide a comprehensive buying experience to online customers. Predict customer needs and attempt to fulfill them.

The Solution: A recommender system based on proprietary machine learning algorithms is implemented. The recommender system uses customer features such as past browsing history, purchase history, social network history and many others. Additional features about the product and relevant externally available data are also used.

The Outcomes: Novel buying patterns based on previously undiscovered customer and product relationships were used to recommend additional products the customers might be interested in, creating complementary product sales. The recommender system also provided an end-to-end, personalized experience, as if the customer were able to see what other relevant items are stacked in the aisle of a physical store, realizing additional sales. Significant improvement in overall customer experience and customer engagement was achieved. The machine learning recommender system continuously learns from the successes and failures of its recommendations, further enhancing the outcomes.
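To make the recommender idea in Use Case 4 concrete, here is a minimal sketch based on item co-occurrence in past purchases. It is an assumption chosen for illustration, not the proprietary algorithms the use case refers to; the baskets and product names are invented, and it ignores browsing history, social data and learning from feedback.

```python
from collections import Counter, defaultdict
from itertools import permutations

def build_cooccurrence(baskets):
    """Count how often each ordered pair of distinct items was bought together."""
    co = defaultdict(Counter)
    for basket in baskets:
        for a, b in permutations(set(basket), 2):
            co[a][b] += 1
    return co

def recommend(co, owned, top_n=2):
    """Rank items the customer does not own by co-purchase counts with owned items."""
    scores = Counter()
    for item in owned:
        for other, count in co.get(item, Counter()).items():
            if other not in owned:
                scores[other] += count
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

# Invented purchase history.
baskets = [
    ["tent", "sleeping bag", "lantern"],
    ["tent", "sleeping bag"],
    ["tent", "stove"],
    ["lantern", "batteries"],
    ["sleeping bag", "pillow"],
]

co = build_cooccurrence(baskets)
print(recommend(co, {"tent"}))  # prints ['sleeping bag', 'lantern']
```

The customer who bought a tent was not searching for a sleeping bag, but the purchase history suggests it, which is the "products they are not actively looking for" case described in the need.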

Use Case 5

Placement of online advertisements at an internet search giant19, 20

The Need: A major Internet company needed to improve online advertising placement performance using ever-changing online consumer context. More specifically, it needed to personalize advertisements based on the ad words customers search for.

The Solution: Proprietary machine learning solutions were used to place online advertisements based on customer search words, previous customer online behavior, and additional known context.

The Outcomes: Significantly improved online advertisement performance, capturing a major market share for online ads. Monetizing a free online platform using readily available consumer data.

Use Case 6

Clustering and Segmentation19, 20

The Need: Go beyond traditional ethnographic customer clustering and market segmentation.

The Solution: Data-driven cluster analysis and segmentation approaches were used to group similar objects, behaviors, or whatever else is represented by the available consumer and market data. Machine learning algorithms were used to achieve this type of clustering and segmentation. Data was not limited to ethnographic information or physical location but took into account all available attributes and information, such as website visit history, social network activity, mobile access, and purchase history.

The Outcomes: Highly actionable and valuable insights. Easily identify consumer behavior patterns that are very challenging for traditional marketing efforts. Improved customer retention based on buying behavior, or lack thereof, for a specific brand or line of products. More focused marketing, based on insights generated by machine-learning-based clustering and segmentation.

Use Case 7

Predictive Maintenance of physical equipment20

The Need: Predict maintenance needs and avoid breakdowns, save money and improve profitability for ThyssenKrupp.

The Solution: ThyssenKrupp teamed up with CGI to connect thousands of sensors and systems in its elevators that monitor everything from motor temperature to shaft alignment, cab speed and door functioning. This data was stored in the cloud with Microsoft Azure Internet of Things services.

The Outcomes: Technicians have instant diagnostic capabilities and rich, real-time data visualization and predictions based on machine learning algorithms, improving uptime and reducing costs.
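As a rough illustration of how sensor data can point to a problem before a breakdown, the sketch below flags readings that deviate strongly from the recent trend. The source does not say which algorithms ThyssenKrupp uses; this is a simple statistical rule (a trailing-window z-score), not a learned model, and the temperature values are invented.

```python
from statistics import mean, stdev

def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that differ from the mean of the preceding
    `window` readings by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(readings)):
        recent = readings[i - window:i]
        mu, sigma = mean(recent), stdev(recent)
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged

# Invented motor temperature readings in degrees Celsius; index 7 is a spike.
motor_temp_c = [61.0, 60.5, 61.2, 60.8, 61.1, 60.9, 61.3, 75.0, 61.0, 60.7]

print(flag_anomalies(motor_temp_c))  # prints [7]
```

A predictive maintenance system would combine many such signals (shaft alignment, cab speed, door functioning) and learn which patterns actually precede failures.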

Use Case 8

Dynamic Price Optimization at retail organizations20

The Need: 7-Eleven and other retailers needed to set regular and promotional pricing based on real-time product demand.

The Solution: A model based on multiple market dynamics was used, built on KSS Retail market basket analytics and other services.

The Outcomes: Retailers are able to stay competitive by changing prices based on algorithms that take into account competitor pricing, supply and demand, and other external factors.

Use Case 9

Improve Customer Engagement20

The Need: Retailer Pier 1 Imports wanted to improve customer engagement using insights and data.

The Solution: Pier 1 used the cloud to pilot a predictive analytics solution based on Microsoft Azure Machine Learning and Microsoft Power Business Intelligence.

The Outcomes: Pier 1 Imports can use data insights to predict which products customers will want in the future, create a dynamic website using predictive modelling, and create more efficient and effective marketing campaigns.

Use Case 10

Using data to improve Drug Discovery and improve quality of life21

The Need:

Google and Stanford had a vision to "Discover effective drug treatments for a variety of diseases using data" to improve quality of life. High throughput screening (rapid automated screening of drug like compounds to test for desired therapeutic effects) is expensive and usually done in sophisticated research labs.

The Solution:

Virtual screening is a novel way to perform drug discovery tasks. Virtual screening involves training supervised classifiers to predict interactions between targets and small molecules. 259 publicly available datasets about biological processes, combining 37.8 million data points about 1.6 million compounds, were analyzed using machine learning models such as deep learning and multitask networks.

The Outcomes: Improved virtual drug screening. Multitask networks provide better accuracy than single-task networks. If pharmaceutical companies share data openly, society at large can benefit.

Use Case 11

Helping doctors make better cancer treatment choices at Memorial Sloan Kettering Hospital 22

The Need: There are over 400 types of cancers, massive literature and constantly updating treatment guidelines. Choosing among cancer treatments or even examining many treatment options is difficult and any assistance provided to the oncologist will potentially result in better care choices and eventually improved patient outcomes.

The Solution: IBM Watson is being trained at Memorial Sloan Kettering by clinicians and analysts to extract and interpret physicians' notes, lab results and clinical research. Memorial Sloan Kettering's expertise and experience with thousands of patients are the basis for teaching Watson how to translate data into actionable clinical practice based on a patient's unique cancer.

Potential Outcomes:

By combining Memorial Sloan Kettering's world-renowned cancer expertise with the analytical speed of IBM Watson, the tool has the potential to transform how doctors provide individualized cancer treatment plans and to help improve patient outcomes. Oncologists anywhere will be able to make more specific and nuanced treatment decisions more quickly, based on the latest data. IBM has now created a specialized product called Watson for Oncology based on the work at Memorial Sloan Kettering.

Use Case 12

Hilton's Connie is a robotic (virtual) concierge23

The Need: A concierge that is available all the time and aids Hilton's concierge staff.

The Solution: An IBM Watson-based solution uses natural language processing to grasp guest queries and then suggests local attractions and interesting sites using WayBlazer's travel database.

The Outcomes: Connie helps guests navigate around the hotel and helps guests find local restaurant information and tourist attractions. This will eventually lead to improved guest engagement.

This chapter outlined many use cases that span diverse business domains such as marketing and sales, retail, financial risk management, fraud detection, customer engagement, healthcare, pharmaceuticals, hotels and the Internet of Things. The solutions applied to meet business needs are equally diverse, and at times combinations of technology tools are applied. A review of use cases gives the reader a broader view of how cognitive computing / machine learning has been applied and what business outcomes have been achieved or are possible, but it cannot be exhaustive. Clearing the fog (created by the literature) of overhyped promises to reveal the possible business value remains elusive.

Chapter 4 Technology Landscape

Machine learning / cognitive computing, or broader artificial intelligence techniques, can be implemented using multiple software solutions. Some companies provide ML/AI techniques as their core offering; others may be playing catch-up, offering some ML/AI capabilities on top of their existing offerings.

Software Development Tools | Cloud Platforms and Services | Enterprise Scale Software Solutions

Figure 4: Classifying ML/AI Cognitive Computing Market

Technology delivery and development mechanisms for ML/AI are rapidly evolving. Some technology vendors offer programming languages; others offer programming libraries and subroutines. These tools facilitate 'code yourself' efforts. Entrenched players offer traditional 'download the software and install to use it' technology capabilities in the ML/AI space. These capabilities are often add-ons to existing domain-specific offerings. The latest and exponentially expanding industry trend, 'software as a service' or 'platform as a service with ML/AI services', is heating up and rapidly transforming the market. Chapter 1 surveys the growth potential.

4.1 Software Development Tools

This category can be further subdivided into offerings for core software programmers and developers and for non-programmers such as scientists, non-computer engineers and other knowledge workers.

As an example of offerings for non-programmers, scientific software Matlab is extensively used in academia, laboratory / research settings and in industrial research labs. These domain experts use Matlab libraries to build machine learning solutions.

On the other hand, machine learning libraries such as Apache Mahout do require some level of software development background and are not deliberately designed for non-computer domain experts.

The table below provides a glimpse of ML/AI development tools. Who might use each kind of tool, and why, is discussed in Sections 4.5 to 4.8.

| Language / Library / Framework | Description | Details |
|---|---|---|
| R | Free software environment for statistical computing and graphics | R is a language and environment for statistical computing and graphics. R is free under the GNU license. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering) and graphical techniques. R is highly extensible. R can produce high quality plots with ease that include mathematical symbols and formulae where needed. R compiles and runs on a wide variety of UNIX platforms and similar systems (FreeBSD and Linux), Windows and MacOS. |
| Matlab | Matlab is a mathematical software programming package, a product of Mathworks | Matlab can be augmented with the following toolboxes: Statistics and Machine Learning Toolbox; Neural Network Toolbox; Computer Vision System Toolbox; Fuzzy Logic Toolbox. Matlab has its own scripting language and also offers some drag and drop capabilities for programming. |
| Weka | Machine learning algorithms in Java | Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. Weka is well positioned to create new machine learning algorithms. |
| Jubatus | Distributed online machine learning framework | Jubatus is a machine learning library. It is also a distributed processing framework and a distributed online machine learning framework with fault tolerance. Jubatus includes these functions: Classification, Regression, Recommendation (Nearest Neighbor Search), Graph Mining, Anomaly Detection, Clustering, and Feature Vector Converter (fv_converter) for data preprocessing and feature extraction. |
| Apache Mahout | Mahout is Apache's machine learning algorithm library | Apache Mahout is an environment for quickly creating scalable, performant machine learning applications. Distributed under a commercially friendly Apache software license, Mahout comprises a core set of algorithms for clustering, classification and collaborative filtering that can be implemented on distributed systems. Mahout supports three basic types of algorithms or use cases to enable recommendation, clustering and classification tasks. One interesting aspect of Mahout is its goal to build a strong community for the development of new and fresh machine learning algorithms. |
| Apache Spark | General engine for processing large-scale datasets | One of Spark's main components is MLlib, which is Spark's machine learning library. The library works using the Spark engine to perform faster than MapReduce and can operate in conjunction with NumPy, Python's core scientific computing package, giving MLlib a great deal of flexibility to design new applications in these languages. Some of the algorithms included within MLlib are listed after this table. |

Algorithms included within MLlib:
- K-means clustering with K-means|| initialization
- L1- and L2-regularized linear regression
- L1- and L2-regularized logistic regression
- Alternating least squares collaborative filtering, with explicit ratings or implicit feedback
- Naive Bayes multinomial classification
- Stochastic gradient descent
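Two of the MLlib entries named in the table, L2-regularized linear regression and stochastic gradient descent, can be shown together in a few lines. The sketch below is plain Python written for illustration, not Spark or MLlib code; the function name, the hyperparameters and the toy data (points on the line y = 2x + 1) are assumptions.

```python
import random

def sgd_ridge(xs, ys, l2=0.0, lr=0.1, epochs=500, seed=0):
    """Fit y = w*x + b by stochastic gradient descent.
    Per-sample loss: 0.5*(w*x + b - y)**2 + 0.5*l2*w**2 (bias is not penalized)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each epoch
        for i in order:
            err = (w * xs[i] + b) - ys[i]
            w -= lr * (err * xs[i] + l2 * w)  # gradient with respect to w
            b -= lr * err                     # gradient with respect to b
    return w, b

# Toy data lying exactly on y = 2x + 1.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2 * x + 1 for x in xs]

w_plain, b_plain = sgd_ridge(xs, ys, l2=0.0)  # recovers roughly w = 2, b = 1
w_reg, b_reg = sgd_ridge(xs, ys, l2=0.5)      # the L2 penalty shrinks w toward 0
print(round(w_plain, 2), round(b_plain, 2))
```

The L2 penalty trades a little accuracy on the training data for smaller weights, which is the usual defense against overfitting when there are many features. Libraries such as MLlib distribute the same kind of update across a cluster.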

4.2 Cloud Platforms and Services

A recent New York Times article24, 'The race to control artificial intelligence and tech's future', reports that Amazon, Google, IBM and Microsoft are jockeying to become the go-to company for AI and are engaged in a "platform war".

Platforms are software / hardware offerings on which other entities (individual developers or companies) build. The relationship is symbiotic for all parties involved, including end users or consumers.

Companies such as Amazon and Microsoft are already ahead of Google and IBM in the sense that they offer full-stack solutions that range from the data storage layer all the way up to the data presentation layer.

To explain what a CC/ML cloud stack might look like, the following diagram, adapted from MIT Sloan's Bhagattjee25, shows the cloud technology layers in a full stack. CC/ML services are at the top layer.

Cognitive Computing / Machine Learning Services
Data Analytics Services
Data Management
Computing
Data Storage
Cloud Infrastructure

Figure 4.2a: CC/ML full cloud technology stack

Both Amazon and Microsoft offer Machine Learning as independent services (no need to use entire stack, 'presentation layer only' offerings possible) but are better positioned to offer full technology stack solutions or 'platform as a service' as shown above.

Amazon Machine Learning26.

The Amazon Machine Learning (AML) service makes it easy for developers of all skill levels to use machine learning technology. AML comes equipped with visualization tools and wizards that guide users through the process of creating machine learning (ML) models without having to learn complex ML algorithms and technology. AML makes accessing predictions easy using simple APIs (Application Programming Interfaces), eliminating the need to implement custom prediction generation code or to manage infrastructure. In essence, AML offers ML services and is attempting to democratize machine learning.

Microsoft Azure Machine Learning:

Microsoft claims that Azure Machine Learning is designed for applied machine learning. It is a fully managed cloud service that enables easy building, deployment and sharing of predictive analytics solutions. Azure ML provides unified access to the Azure cloud platform, ML developer tools (ML Studio) and a variety of Microsoft analytics tools such as its analytics suite. Microsoft Azure Machine Learning Studio (ML Studio) is a software development environment. ML Studio can publish models as web services. Machine learning services can be consumed by a variety of applications, such as a trend on a website or Microsoft native tools such as Excel. These services are portable across desktop and mobile platforms.

[Figure: ML Studio workflow. Experiments, modules, and datasets: create experiments; preprocess data (analysis and reduction); extract features; train and write models; test and iterate. Inputs: read BLOB, table, or text data from Azure storage. Outputs: write scored data.]

Figure 4.2b: Microsoft Azure Machine Learning Studio27

| Machine Learning Platform | Amazon | Microsoft Azure |
|---|---|---|
| Service type | Managed service, interactive visual tools | Interactive visual tools, fully managed service, Azure Machine Learning Studio web-based product |
| Service offered | Data analysis, model training, model evaluation | Data analysis, model training, model evaluation |
| Offering type | Cloud based (Amazon Cloud, AWS) | Cloud based (Microsoft Azure Cloud) |
| Dataset formats | CSV, Amazon Redshift, SQL | CSV, TSV, ARFF, SVMLight (a variety of data formats supported) |
| Data transformation | Built-in visualization and transformation tools | Visualization tools offered |
| Model | Built-in tools for model training, evaluation, and tuning | Training, evaluation and model scoring output inside ML Studio |
| Application Programming Interface | APIs to create, review, and delete data sources, models and evaluations | Request-response service, batch execution service, retraining API. Direct deployment of web service from ML Studio |
| SDK support | Desktop and mobile. SDK support for many languages such as PHP, .NET, Java, Python, Ruby | Scripting supported for R, Python, SQL |
| ML algorithms | Diverse set of industry standard algorithms supported, such as classification, regression | Classification, regression, decision trees, recommendation system and many more |
| Predictions | Batch prediction of offline data or real-time predictions | Batch prediction and web services |
| Pricing | Pay per use | Pay per use |
| Pricing details | Data analysis and model building: $0.42 per hour. Batch predictions: $0.12 per 1000 predictions (monthly). Real-time predictions: $0.0001 per prediction (monthly, and reserved capacity charge of $0.001 per 10 MB of reserved memory for real-time predictions). AWS price NOT included | Pricing is intertwined with Azure platform services, but a total price per month including ML can be calculated (monthly: $9.99 per seat per month, $ per hour for use of developer environment, $2 per compute hour, $0.50 per every 1000 transactions) |

Table 4.2: Head-to-head comparison of Amazon ML and Azure ML

AlchemyAPI Cloud Platform28,37.

Cloud-based platforms such as AlchemyAPI offer advanced machine learning and artificial intelligence capabilities as services. AlchemyAPI offers text and sentiment analysis services and a broad range of natural-language processing capabilities, including information and entity relationship extraction, all via a simple cloud-based interface. AlchemyAPI also offers deep learning as a service (deep neural networks) to classify images and other content such as text. AlchemyAPI has demonstrated that it can create knowledge graphs and base ontologies. AlchemyAPI covers a broad spectrum of advanced ML / AI technologies.

AlchemyAPI was recently acquired by IBM's Watson unit.

[Figure: How AlchemyAPI Works. AlchemyAPI sifts, cleans and stores the world's unstructured data. Smart app developers access and transform the optimized data via AlchemyAPI's simple REST APIs. Their users gain valuable insights to digitally disrupt their industries and drive business decisions. Names shown: Spiderbook, PULSAR, AlchemyVision, AlchemyData, AlchemyLanguage.]

Figure 4.2b: Alchemy API Illustration 37

Google Cloud Machine Learning Platform 38.

Google recently launched cloud machine learning products. Google Cloud Machine Learning offers modern machine learning services, pre-trained models and a platform to generate new models. The platform is based on neural networks and promises unmatched accuracy, scale and speed. It is also fully managed (a full-stack solution integrated with other Google Cloud products). Google offers a Vision API, a Speech API and a Translate API. Google is trying to establish itself in the enterprise cloud ecosystem.

IBM Watson Cognitive Computing Platform29.

Watson is a cognitive computing platform and is rapidly adding capabilities by acquisition and internal development. IBM is betting big on Watson and wants to incorporate Watson into every part of the business.

IBM Watson technology platform uses natural language processing and advanced machine learning to reveal insights from large amounts of unstructured data. Watson promises to answer customers' most pressing questions, quickly extract key information from all documents, and reveal insights, patterns and relationships across data.

Here are the current Watson product offerings:
- Watson Knowledge Studio (training Watson, annotating documents, building relationships among entities)
- Watson for Clinical Trial Matching (matching patients with clinical trials, useful for pharmaceutical and healthcare organizations)
- Watson for Oncology (evidence-based treatment options, useful for medical professionals and healthcare organizations)
- Watson Discovery Advisor (using data for cognitive insights, useful for research and development)
- Watson Explorer (enterprise search using cognitive computing)
- Watson Engagement Advisor (machine learning applied to self-service tasks)

4.3 Enterprise Scale Software Solutions

Enterprise-scale players in the database, data warehousing, data management and business intelligence space are increasingly offering machine learning capabilities on top of their offerings. These companies largely missed the cloud and software-as-a-service bandwagons, but are trying to stay relevant by reengineering their product lines. IBM is different in the sense that IBM Watson is a platform and ML / AI is its core offering.

| Company | Product | Description | Details |
|---|---|---|---|
| IBM | SPSS Modeler | Performing mining and predictive analysis, for both general and industry vertical purposes | SPSS Modeler helps users and systems make the right decision by integrating predictive analytics with decision management, scoring and optimization in an organization's processes and operational systems. Capabilities: analytical decision management, automated modelling, text analytics, entity analytics, social network analysis, geospatial analysis, modelling algorithms. |
| SAS | Enterprise Miner | Performing mining and predictive analysis, for both general and industry vertical purposes | Several ML techniques can be found within SAS' vast analytics platform, from the SAS Enterprise and Text Miner products to its SAS High-Performance Optimization offering. An interesting fact to consider is SAS' ability to provide industry and line-of-business approaches for many of its software offerings, encapsulating functionality with prepackaged vertical functionality. |
| SAP | HANA | SAP HANA, an in-memory platform that combines an ACID-compliant database with advanced data processing, application services, and flexible data integration services | Supports advanced analytics and includes machine learning support. |
| Teradata | Warehouse Miner | Enterprise data mining and big data solutions | Packages a set of data profiling and mining functions that includes machine learning algorithms alongside predictive and mining ones. The Warehouse Miner is able to perform analysis directly in the database without undergoing a data movement operation, which eases the process of data preparation. |

Table 4.3: Comparison of offerings by some enterprise players

4.4 Cognitive Computing/ Machine Learning Startups

Startups are not a product / market category but some discussion about startups is warranted. Organizations might be interested in buying products or investing in startups to solve very specific problems that others do not have solutions for.

As of early 2016, startups are moving away from building large technology platforms and focusing on solving specific business problems7.

'Appendix A' shows a snapshot of the CC/ML startup ecosystem.

The classification of CC/ML into three product / market categories proposed in this chapter disambiguates commercially available software products and software delivery mechanisms. Building on these categories, further analysis is conducted to arrive at solution approaches for a problem at hand.

This is done by asking which product category addresses which issues and suits which situations.

4.5 Software Development Tools / Programming Languages

CC/ML has been employed by researchers, statisticians and programmers for a long time. These workers have been using programming languages such as R and Python, or tools such as Matlab and its built-in machine learning and statistics toolboxes. Traditional programming languages have also been employed in these settings with special-purpose software libraries.

This approach is for organizations that are very well set up in terms of Information Technology resources. They have:

- Mature software engineering practices. They have deep expertise in building / maintaining both software and hardware infrastructure.

- Deep programming expertise, either in-house or available on demand.

- A track record of analyzing business problems and converting them into home-grown software solutions using programming languages.

- Machine learning experts writing code hand in hand with business domain experts.

Alternatively, this approach can be applied by organizations that have:

- Subject matter experts who are also adept at programming and are writing small-scale software code to support their own research or to solve complex business problems in their departments / domains.

These SME's have deep expertise in their own field and also have some experience in writing code for specific purposes, for example optimizing a step in the manufacturing process.

4.6 Cloud Development Platforms / Software as a Service

Development platforms usually offer a full stack of software / hardware solutions that starts at the bottom with data storage and goes all the way up to building algorithms (programming layer), creating predictions (execution layer) and visualizing results (presentation layer). Some platforms may offer partial capabilities, positioning themselves within the layers of the stack where they offer value.

From a CC/ML perspective, the programming, execution and presentation layers are of significance. Platforms offer programming tools, prebuilt machine learning algorithms, application programming interfaces (APIs) to connect to other applications, and so on.

These platforms are easy to use and do not require deep machine learning or programming expertise.

Cloud platforms are very suitable for organizations that are not fully resourced with IT capabilities. Such organizations do not need deep software engineering expertise or deep technology infrastructure expertise. Platforms offer a full stack of solutions that a small group in an organization can utilize to solve its business problems. Platforms are suitable for organizations that do not see technology as their core and do not want to carry the burden of technology overhead. These organizations want to be users of ML technologies to improve their core business performance. They do not want to be technology innovators or creators of new ML algorithms.

4.7 Traditional Enterprise Solutions

Large enterprise software companies also offer software solutions in the machine learning domain. Often, machine learning algorithms come as part of a broader suite of analytics or other enterprise applications. These can be domain-specific software solutions and can be limiting in the sense that they can only be applied to specific types of data or specific types of problems.

Traditional enterprise solutions require customization, configuration and implementation. Organizations using these types of tools do require well-resourced IT organizations, but require neither deep software engineering expertise nor deep programming expertise. They do need the ability to implement and support the underlying infrastructure, software applications and associated ecosystem. These tools are for organizations that recognize data as a business-critical asset and IT as an important investment, and that have the wherewithal to resource their IT organizations adequately to meet business goals.

4.8 Startups

Startups solve specific and niche problems. They offer algorithms, tools, software suites and sometimes consulting expertise. The advantage of going with startups is that they do not carry the overhead of large enterprise solutions and can be very flexible in meeting customer needs and adapting their offerings on demand. The disadvantage is that, in B2B software, their business viability and technology capabilities need to be properly evaluated before doing business with them. There are thousands of startups in a variety of business areas offering machine learning solutions.
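The guidance in Sections 4.5 to 4.7 can be condensed into a small decision sketch. It is a deliberate simplification for illustration: the function name, the two yes/no inputs and the order in which the rules are checked are assumptions, and a real assessment would weigh many more factors. Startups are left out because they are not one of the three product / market categories; they remain an option for niche problems.

```python
def suggest_category(deep_programming_expertise, well_resourced_it):
    """Map two coarse organizational traits to one of the three product categories.
    The rule order is an assumption made for this sketch."""
    if deep_programming_expertise:
        # Section 4.5: mature engineering practices or programming-savvy experts.
        return "Software development tools / programming languages"
    if well_resourced_it:
        # Section 4.7: can implement and support infrastructure, little custom code.
        return "Traditional enterprise solutions"
    # Section 4.6: technology is not the core business, so use ML as a service.
    return "Cloud development platforms / software as a service"

# A research group whose subject matter experts also write code.
print(suggest_category(deep_programming_expertise=True, well_resourced_it=False))

# A small business unit with no dedicated IT organization.
print(suggest_category(deep_programming_expertise=False, well_resourced_it=False))
```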

This Chapter introduced three categories to segment the CC/ML products / market. Chapter then provided high level technical details that aid business person quickly sift through possible development and delivery mechanism for CC/ML.

Chapter 5 Analysis Framework

Previous chapters provide definitions for various terms in CC/ML technologies, survey the broad spectrum of CC/ML use cases across diverse industries and categorize the CC/ML technology market. Now a business person reasonably understands CC/ML terms, use cases, and available product categories in the market. But a framework to conduct an end to end business analysis from identifying a business problem to recommending a specific product and implementation approach is even more valuable.

Understanding the terms, use cases, and product categories does not make applying ML/CC easy. In this chapter I develop a framework for analyzing and implementing CC/ML applications.

5.1 Davenport Categories of Cognitive Computing Technologies

From a business analysis perspective, broad categories of cognitive computing technologies listed below help cut through the fog and show a business person the whole spectrum of applications. As described in Chapter 2, we suggest that CC/ML apps can be categorized as follows:

- Analytics / Predictive Analytics
- Image Recognition / Computer Vision
- Rules and Business Rules
- Machine Learning / Neural Networks
- Natural Language Processing
- Complex Event Processing

Business problems where CC/ML can be applied usually involve one or more knowledge tasks whose performance can be improved.

5.2 Davenport / Kirby Knowledge Task Survey 32

In an MIT Sloan webcast, Davenport and Kirby outline knowledge tasks performed by humans, where CC/ML will increasingly aid humans by taking over portions of the tasks or entire tasks.

| Profession | Possible Computer Aided Tasks |
| --- | --- |
| Teacher / Professor | Online Content / Adaptive Learning |
| Lawyer | e-Discovery, Predictive coding etc. |
| Accountant | Automated audits and Tax |
| Radiologist | Automated Cancer Detection |
| Reporter | Automated Story telling |
| Marketer | Programmatic buying, focus groups, personalized emails, etc. |
| Financial Advisor | Robo-advisors |
| Architect | Automated drafting, design |
| Financial Asset Manager | Index funds, Trading |
| Pharmaceutical Scientist | Cognitive creation of new drugs |
| Private Equity Analyst | Analyze Venture Capital Investment Opportunities |

Table 5.2: Expanded Davenport Kirby Knowledge Task list

They take the discussion a step further by mapping knowledge task types on one axis and the required machine intelligence on the other, as shown in the original Davenport / Kirby matrix.

This matrix is useful for a business person. A business problem consists of one or more knowledge tasks. Tasks can be analyzed and placed appropriately on this matrix. For example, if the knowledge task involves analyzing numbers but also involves applying context and self-learning in some fashion, then "Machine Learning, Neural Networks" may be applicable.

5.3 Original Davenport / Kirby Matrix 32

| Task Type / Level of Intelligence | Human Support | Repetitive Task Automation | Context Awareness & Learning | Self-Aware Intelligence |
| --- | --- | --- | --- | --- |
| Analyze Numbers | BI, Data Visualization, Hypothesis driven analysis | Operational Analytics, Scoring, Model Management | Machine Learning, Neural Nets | Not yet |
| Digest words / Images | Character and Speech recognition | Image Recognition, Machine Vision | Q&A, NLP | Not yet |
| Perform original Tasks (Admin & Decisions) | Business Process Management | Rules Engines, RPA | Not yet | Not yet |
| Perform Physical Tasks | Remote Operation | Industrial robotics, Collaborative robotics | Fully Autonomous robots, vehicles | Not yet |

Figure 5.3: Original Davenport / Kirby Matrix (Machine IQ for Knowledge tasks. How smart are the machines?)

Notice that the original Davenport / Kirby matrix does not clearly articulate how capable existing technologies are of completing these tasks. For example, the matrix lists neural networks, but neural network technology is currently not ready for application.

Thus, to apply this matrix, it is worthwhile for technology experts to assess the readiness of a technology that might be applied to a given need. This readiness can be expressed with DoD TRL scores.

5.4 Technology Readiness Level 3

[Figure: TRL scale. Stages from lowest to highest readiness: Basic Technology Research; Research to Prove Feasibility; Technology Development; Technology Demonstration; System/Subsystem Development; System Test, Launch & Operations (TRL 9).]

Figure 5.4: DoD Technology Readiness Level

Definitions of DoD technology readiness levels can be seen in figure above and TRL Definition table below.

| Technology Readiness Level | Description |
| --- | --- |
| 1. Basic principles observed and reported | Lowest level of technology readiness. |
| 2. Technology concept and/or application formulated | Invention begins. Applications are speculative, and there may be no proof or detailed analysis to support the assumptions. |
| 3. Analytical and experimental critical function and/or characteristic proof of concept | Active R&D is initiated. |
| 4. Component and/or breadboard validation in laboratory environment | Basic technological components are integrated to establish that they will work together (low fidelity and ad hoc). |
| 5. Component validation in relevant environment | The basic technological components are integrated with reasonably realistic supporting elements so they can be tested in a simulated environment. |
| 6. System/subsystem model or prototype demonstration in a relevant environment | Representative model or prototype system, which is well beyond that of TRL 5, is tested in a relevant environment. |
| 7. System prototype demonstration in an operational environment | Prototype near or at planned operational system. Represents a major step up from TRL 6 by requiring demonstration of an actual system prototype in an operational environment. |
| 8. Actual system completed and qualified through test and demonstration | Technology has been proven to work in its final form and under expected conditions. In almost all cases, this TRL represents the end of true system development. |
| 9. Actual system proven through successful mission operations | Actual application of the technology in its final form and under mission conditions. |

Table 5.4: DoD Technology Readiness Levels explained

5.5 Modified Davenport/Kirby Matrix

The TRL scale and definitions are used to estimate a TRL score for each capability listed on the Davenport / Kirby matrix. The result is the Modified Davenport / Kirby Matrix. TRL provides a concise measurement of the state of the art for the listed cognitive capabilities. In this thesis, scientific rigor is not applied in assigning TRL scores; merely a first readout is provided. TRL scores might differ across industries and will change with time, since these technologies are evolving rapidly. Scores should be assigned by experts when used to evaluate CC/ML opportunities.

| Task Type / Level of Intelligence | Human Support | Repetitive Task Automation | Context Awareness & Learning | Self-Aware Intelligence |
| --- | --- | --- | --- | --- |
| Analyze Numbers | BI, Data Visualization, Hypothesis driven analysis | Operational Analytics, Scoring, Model Management | Machine Learning, Neural Nets | Not yet |
| TRL | 7-8, 8-9 | 7-8 | 5-6, 6-7 | 3-4 |
| Digest words / Images | Character and Speech recognition | Image Recognition, Machine Vision | Q&A, NLP | Not yet |
| TRL | 7-8, 8-9 | 7-8 | 5-6, 6-7 | 3-4 |
| Perform original Tasks (Admin & Decisions) | Business Process Management | Rules Engines, RPA | Not yet | Not yet |
| TRL | 7-8, 8-9 | 7-8 | 5-6 | 3-4 |
| Perform Physical Tasks | Remote Operation | Industrial robotics, Collaborative robotics | Fully Autonomous robots, vehicles | Not yet |
| TRL | 7-8, 8-9 | 7-8 | 5-6 | 3-4 |

Figure 5.5: Modified Davenport Kirby Matrix (Machine IQ for Knowledge tasks. How smart are the machines? TRL is Technology Readiness Level)

5.6 TRL scores on Davenport Kirby Matrix

As an example, data visualization and business analytics technologies are estimated to be at readiness levels 7 to 9. I made this assessment based on my experience and research. This score means these technologies are well demonstrated and applied in industry.

On the other hand, machine learning / neural network based approaches are estimated in the range of 5 to 7. This means that these technologies have been tested in operational environments, but these capabilities may be applied in actual field environments only in specific domains. They are not broadly available for full scale deployment in diverse areas.

Technologies such as self-aware intelligence are low fidelity and ad hoc at best, if not speculative, so they are estimated at technology readiness levels 3 to 4.

The TRL scores provided here are merely estimates made by the author. The intention is to give a high level idea of how these scores can be useful in quickly determining the market readiness of the available cognitive tools.

Companies should assign experts to compute these scores for their own industry domain and use them as tools to make more informed investment decisions.
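As an illustration of how such scores could be used in practice, the following is a minimal Python sketch that stores the first-pass TRL estimates of the Modified Davenport / Kirby matrix in a lookup table. The names `LEVELS`, `TRL_BY_TASK`, `trl_range` and `is_field_ready`, the shortened task type labels, and the readiness threshold of 7 are assumptions made for this example; they are not part of the Davenport / Kirby or DoD material. A cell listed as "7-8, 8-9" is stored as the overall range (7, 9).

```python
# Hypothetical encoding of the Modified Davenport / Kirby matrix (Figure 5.5).
# The TRL ranges are the author's first-pass estimates, not measured values.
LEVELS = [
    "Human Support",
    "Repetitive Task Automation",
    "Context Awareness & Learning",
    "Self-Aware Intelligence",
]

# One (low, high) TRL range per intelligence level, in the order of LEVELS.
TRL_BY_TASK = {
    "Analyze Numbers": [(7, 9), (7, 8), (5, 7), (3, 4)],
    "Digest Words / Images": [(7, 9), (7, 8), (5, 7), (3, 4)],
    "Perform Original Tasks": [(7, 9), (7, 8), (5, 6), (3, 4)],
    "Perform Physical Tasks": [(7, 9), (7, 8), (5, 6), (3, 4)],
}


def trl_range(task_type, level):
    """Return the estimated (low, high) TRL range for one matrix cell."""
    return TRL_BY_TASK[task_type][LEVELS.index(level)]


def is_field_ready(task_type, level, threshold=7):
    """Assumed rule of thumb: a cell is field ready if its lowest TRL meets the threshold."""
    low, _high = trl_range(task_type, level)
    return low >= threshold


if __name__ == "__main__":
    print(trl_range("Analyze Numbers", "Context Awareness & Learning"))
    print(is_field_ready("Analyze Numbers", "Human Support"))
```

An expert panel would replace the ranges with scores for its own industry domain; the threshold is a placeholder for whatever risk tolerance the organization chooses.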

The Modified Davenport / Kirby IQ matrix hints at a spectrum from task automation (no human involvement; the computer does the task) to augmentation (the computer collaborates with humans to complete the task), but it does not deal with the data complexity or work complexity of the knowledge tasks.

5.7 Accenture Model for Cognitive Computing 3

A challenging task for a business person is to incorporate CC/ML tools into existing and possibly legacy technology infrastructure. More importantly, it is hard to incorporate CC/ML tools into the human resource infrastructure.

It is helpful to ask questions about the types of tasks and subtasks cognitive systems might perform. Are they going to collaborate with the existing human resources? Will these systems fully automate some of the routine tasks? Or will there be a collaborative relationship between employees and cognitive systems? What is the right balance? Some of these questions are answered by analyzing the complexity of the work and the complexity of the data used in that work. The Accenture model is based on these parameters and helps a business person think through the questions above when incorporating CC/ML tools into the organization's human resources.

Cyrille Bataller and Jeanne Harris of Accenture suggest that cognitive tasks can be analyzed based on the complexity of the work on X scale and complexity of data on Y scale.

At the lower end of complexity scale, work can be mundane and clerical, such as credit decision making or claims processing. On the high end of complexity scale, work might require unpredictable decision making, discretion and originality, such as for an artist or a research scientist.

Data can be straightforward, structured and of manageable volume at the lower end of data complexity scale. But data can also be from diverse sources, variety of types, volatile and significant in volume, at the higher end of data complexity scale.

Routine, predictable work is on the left side of the X axis, and unpredictable, judgement based work is on the right side.

Low volume, structured data is on the lower end of the Y axis, and unstructured, high volume data is on the higher end.

If the work is not complex and the data is not complex, then full automation brings efficiency and consistency. Full automation is the optimal approach in this case. This represents the lower left corner of the Accenture matrix.

On the other hand, if the work is complex and data is also complex, then augmentation will help humans make better decisions, but humans still need to lead the process. This represents the top right corner of the Accenture matrix.

[Figure: Accenture matrix. X axis: Work Complexity; Y axis: Data Complexity. The four quadrants are listed below.]

Effectiveness (upper left): Support Seamless Integration and Collaboration
- Wide range of interconnected work activities
- Highly reliant on coordination and communication
- Example solutions: Virtual agents for consumers or for enterprise customer service; collaboration or workflow management

Innovation (upper right): Enable Creativity and Ideation
- Original, innovative work
- Highly reliant on deep expertise, experimentation, exploration and creativity
- Example solutions: Support for biomedical research; fashion design; music writing

Efficiency (lower left): Provide consistent low cost performance
- Routine work with little discretion
- Highly reliant on well-defined and well-understood criteria, rules and procedures
- Example solutions: Automated credit decisions; package delivery via drones

Expert (lower right): Leverage Specialized Expertise
- Judgment-oriented work
- Highly reliant on expertise and experience
- Example solutions: Expert system for medical diagnosis; legal or financial research

Figure 5.7: Work and Data complexity Matrix

X Scale of Work Complexity moves from Routine/ Predictable /Rules Based work to Ad Hoc/Unpredictable/Judgement Based Work

Y Scale of Data Complexity moves from Structured/Stable/Low Volume data to Unstructured/Volatile/ High Volume data.

Based on the current state of CC/ML, applications in the efficiency quadrant tend to have higher TRL scores than applications in the innovation quadrant.
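The quadrant logic of the work and data complexity matrix can be sketched in a few lines of Python. This is only an illustration: the function name `accenture_quadrant`, the boolean inputs, and the `AUTOMATION_GUIDANCE` table are assumptions made for this example, and reducing each axis to a yes/no judgement is a simplification of what is really a continuous scale. Guidance is listed only for the two corners discussed in the text.

```python
# Illustrative sketch of the work / data complexity matrix (Figure 5.7).
# Each axis is simplified to a yes/no judgement; this is an assumption of the sketch.

def accenture_quadrant(work_is_complex: bool, data_is_complex: bool) -> str:
    """Map work complexity (X axis) and data complexity (Y axis) to a quadrant."""
    if work_is_complex and data_is_complex:
        return "Innovation"      # upper right
    if work_is_complex:
        return "Expert"          # lower right
    if data_is_complex:
        return "Effectiveness"   # upper left
    return "Efficiency"          # lower left


# Guidance stated in the text for the two corner cases only.
AUTOMATION_GUIDANCE = {
    "Efficiency": "full automation",
    "Innovation": "augmentation, with humans leading the process",
}

if __name__ == "__main__":
    quadrant = accenture_quadrant(work_is_complex=False, data_is_complex=False)
    print(quadrant, "->", AUTOMATION_GUIDANCE.get(quadrant, "decide case by case"))
```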

5.8 Realizing Business Value from CC/ML Projects is a Journey

Before embarking on a cognitive computing / ML investment, an organization needs to recognize that realizing business value from these tools is a journey.

"AI / Cognitive computing is promising to be a scanner / sense-maker to uncover actionable business insights. But the reality is that applications using AI/CC approaches and technologies must be developed as a project." (Hadley Reynolds, Big Data and Cognitive Computing)

An enterprise capability assessment for CC/ML readiness should be the next order of business. This assessment can then guide the implementation approach and vendor selection. Our contribution is the enterprise capability assessment matrix shown below.

5.9 Enterprise Capability Assessment for ML / CC

Most ML / CC projects are computing projects that involve software / hardware and human resources. To ensure success of CC/ML projects, companies must have technical and project management capabilities.

An enterprise's technology capability can be measured against software / hardware infrastructure maturity and engineering process maturity. In Figure 5.9, technology capability increases as you move right on the X axis.

Enterprise domain expertise is equally critical to conceiving and implementing successful CC/ML projects. In Figure 5.9, the Y axis represents increasing domain expertise as you move upward.

Enterprise Technology Capability:

Mature technology capabilities are very important to implement successful software projects, let alone ML / CC projects. If an organization does not have mature technology capability, then it needs to use cloud based solutions. If the organization is not comfortable with that approach, then it needs to build technology capabilities internally. CC/ML tools available through SaaS approach are solid, and an organization might want to leverage SaaS and avoid reinventing the wheel.

Maturity of software engineering practices matters because most ML / CC projects will involve software engineering work unless pre-packaged software is available that solves the specific problem at hand. In most situations for science based organizations, this is not the case. Therefore, to implement innovative ML/CC projects, software engineering practices in the organization need to be mature enough to deliver a project successfully, even if the project idea is conceived by available subject matter experts. Availability of ML / CC expertise is also critical for achieving business outcomes from the projects.

Enterprise Domain Expertise:

Domain expertise is the subject matter expertise in the organization's industry domain. Subject matter expertise can range from very superficial to the extent that the organization is an innovator in the specific domain. As an example, if you are a generic drug maker, your knowledge of the specific drug manufacturing process is superficial at best. On the other hand, if you are the organization that developed the drug manufacturing process, your subject matter expertise is deep. Why is this important? AI / ML approaches are usually about changing the way tasks are done or improving a process. To innovate, you need a deeper level of understanding and expertise in that field. This is especially important for scientific companies, such as pharmaceuticals.

Thus, based on two success factors, namely Technology Capability and Domain Expertise, we propose a CC/ML vendor selection approach. This approach is more suitable for non-technology companies. Pharmaceutical companies are a good example, since at their core they are scientific organizations and not information technology companies.

Enterprise Capability Assessment Model:

[Figure: Enterprise capability matrix. X axis: Enterprise Technology Capability (Software / Hardware / SW Engineering); Y axis: Enterprise Domain Expertise. The four quadrants are listed below.]

Niche Players (upper left)
- Deep Domain / Subject Matter Expertise
- Deep / some CC/ML expertise
- Immature Enterprise IT Capabilities

Innovators (upper right)
- Mature Enterprise IT Capabilities
- Deep Domain / Subject Matter Expertise
- Deep CC/ML expertise
- Mature relationships with technology Consulting Partners

Consumers (lower left)
- Immature Enterprise IT Capabilities
- Lack of Domain / Subject Matter Expertise
- Lack CC/ML expertise

Followers (lower right)
- Mature Enterprise IT Capabilities
- Lack of or some Subject Matter expertise
- Lack of or some CC/ML Expertise

Figure 5.9: Enterprise Capability Model

Consumers should use SaaS

If you fall into the lower left quadrant, you should go Full Stack SaaS. You are purely in the Consumers model.

- Immature Enterprise IT Capabilities
- Lack of Domain / Subject Matter Expertise
- Lack CC/ML expertise

Innovators can use multiple solution approaches

Programming Tools / Enterprise Scale Software / SaaS / invest in Startups

If you are an established company with proven skills in technology (you lie in the top right quadrant), then you can try multiple approaches based on your needs. Such a large organization will need multiple approaches because it will have diverse needs.

- Mature Enterprise IT Capabilities
- Deep Domain / Subject Matter Expertise
- Deep CC/ML expertise
- Mature relationships with technology Consulting Partners

Niche Players should use Programming Tools and SaaS

These are companies with deep domain expertise but no IT establishments. Their business goals can be met with small scale internal projects or SaaS.

- Deep Domain / Subject Matter Expertise
- Deep / some CC/ML expertise
- Immature Enterprise IT Capabilities

Followers should use Enterprise scale software and SaaS

These are companies that have large established IT groups but lack domain or CC/ML expertise. They should use enterprise scale tools.

- Mature Enterprise IT Capabilities
- Lack of or some Subject Matter expertise
- Lack of or some CC/ML Expertise
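The four recommendations above can be summarized as a small lookup. The Python sketch below is an illustration under simplifying assumptions: the function name `capability_quadrant`, the `RECOMMENDED_APPROACHES` table, and the reduction of each axis to a yes/no judgement are choices made for this example, and CC/ML expertise is not modelled separately.

```python
# Illustrative sketch of the enterprise capability model (Figure 5.9).
# Each axis is reduced to a yes/no judgement, which is an assumption of this sketch.

RECOMMENDED_APPROACHES = {
    "Consumers": ["Full stack SaaS"],
    "Followers": ["Enterprise scale software", "SaaS"],
    "Niche Players": ["Programming tools", "SaaS"],
    "Innovators": [
        "Programming tools",
        "Enterprise scale software",
        "SaaS",
        "Invest in startups",
    ],
}


def capability_quadrant(mature_it: bool, deep_domain_expertise: bool) -> str:
    """Map technology capability (X axis) and domain expertise (Y axis) to a quadrant."""
    if mature_it and deep_domain_expertise:
        return "Innovators"      # upper right
    if mature_it:
        return "Followers"       # lower right
    if deep_domain_expertise:
        return "Niche Players"   # upper left
    return "Consumers"           # lower left


if __name__ == "__main__":
    quadrant = capability_quadrant(mature_it=True, deep_domain_expertise=False)
    print(quadrant, "->", ", ".join(RECOMMENDED_APPROACHES[quadrant]))
```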

5.10 Proposed End to End Business Analysis Framework

The Davenport / Kirby and Accenture models help disambiguate the CC/ML landscape, especially when Technology Readiness Levels are also introduced. We employ this approach and propose a framework that helps a business person navigate the complex CC/ML landscape. The framework guides a business person end to end, starting with a business problem and arriving at a recommended solution approach. The author's prior experience working in a large pharmaceutical company's IT organization weighs heavily in the design of this framework.

A framework for assessing cognitive computing opportunities and deciding implementation approach:

| Step | Description |
| --- | --- |
| 1 | Review use case, identify if there are distinct knowledge tasks. Example task list 'Table 1 Knowledge Task List' (Expanded Davenport / Kirby list) |
| 2 | Based on the task goals, categorize task(s) using Davenport categories |
| 3 | Determine level of intelligence needed to perform the task using the Davenport / Kirby Matrix |
| 4 | Broadly evaluate Technology Readiness Level based on the IQ decided in step 3 (Davenport / Kirby matrix and DoD TRL); expertise needed to assign TRL |
| 5 | Evaluate Work Complexity |
| 6 | Evaluate Data Complexity, then based on 5 and 6 decide scale of automation |
| 7 | Identify a Model as a general guideline (Accenture Model) |
| 8 | Perform Enterprise Capability Assessment for ML / CC readiness |
| 9 | Use model in step 7 and results of capability assessment to guide identification of implementation approaches suitable and available |
| 10 | Arrive at solution approaches |

Table 5.10: CC/ML end to end Business Analysis framework
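One way to keep an assessment disciplined is to record the findings of each step in order. The Python sketch below is a hypothetical helper, not part of the framework itself: the class name `Assessment`, its methods, and the shortened step labels are assumptions made for this example. It only enforces that steps are recorded in sequence from 1 to 10.

```python
from dataclasses import dataclass, field

# Shortened labels for the ten framework steps (wording abbreviated for this sketch).
STEP_LABELS = {
    1: "Identify knowledge tasks in the use case",
    2: "Categorize tasks (Davenport categories)",
    3: "Determine level of intelligence (Davenport / Kirby matrix)",
    4: "Estimate Technology Readiness Level",
    5: "Evaluate work complexity",
    6: "Evaluate data complexity and scale of automation",
    7: "Identify Accenture model quadrant",
    8: "Perform enterprise capability assessment",
    9: "Identify suitable implementation approaches",
    10: "Arrive at solution approaches",
}


@dataclass
class Assessment:
    """Hypothetical record of one use case taken through the ten steps."""
    use_case: str
    findings: dict = field(default_factory=dict)

    def next_step(self):
        """Return the next step number to complete, or None when all are done."""
        done = len(self.findings)
        return done + 1 if done < len(STEP_LABELS) else None

    def record(self, step: int, finding: str) -> None:
        """Store the finding for a step, rejecting out-of-order entries."""
        if step != self.next_step():
            raise ValueError(f"expected step {self.next_step()}, got {step}")
        self.findings[step] = finding


if __name__ == "__main__":
    a = Assessment("Example use case")
    a.record(1, "Two distinct knowledge tasks identified")
    print("Next:", STEP_LABELS[a.next_step()])
```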

In this chapter we reviewed the Davenport / Kirby and Accenture approaches to disambiguating the cognitive computing landscape. We modified the Davenport / Kirby approach by including technology readiness levels. We also introduced the need for, and a mechanism to conduct, an enterprise capability assessment. In conclusion, to navigate the opportunities and make vendor selection and investment decisions, we proposed a 10 step analysis framework.

Chapter 6 Analysis of Use Cases in Pharmaceutical Industry

There is a lot of excitement about leveraging CC/ML technologies in pharmaceutical companies, but there is a lack of real use cases converted into implementation projects. It is noteworthy that the pharmaceutical industry is typically slow to pick up advances in technology, and CC/ML is a recent phenomenon.

The pharmaceutical industry uses analytics widely and generates valuable business outcomes from it. But broad use of CC/ML technologies was not found in the pharmaceutical industry.

Informal discussions and limited stakeholder interviews were conducted to analyze use cases. These use cases mainly span R&D, from discovery and development to small scale manufacturing. They are representative of a large share of the knowledge work, especially in R&D and early stage manufacturing, that goes on in the pharmaceutical industry.

These use cases demonstrate some applicability and early adoption of CC/ML technologies in the pharmaceutical industry.

For example, the types of analysis a pharmaceutical scientist might do with pathology data are also representative of lab analysis work in other disease areas, although the data types may be different.

Representative pharmaceutical use cases are taken through the assessment framework proposed in Chapter 5. This assessment is done to help define the solution approach for these specific uses. In the process, I establish how the same approach can work for future, as yet unnamed opportunities in other industry domains, specifically for non-tech companies.

6.1 Pathology Image/ Data Analysis to discover new Cancer Treatments

The pathology image and laboratory data analysis use case represents knowledge tasks that discovery phase pharmaceutical scientists come across often. They have massive amounts of data thrown at them in various formats, and there is always a time crunch. Often a large amount of data is never analyzed, or is so massive that identifying possible novel relationships is simply impossible for human beings. A solution based on ML/CC would help focus scientific resources and potentially identify new drug mechanisms and relationships.

Pathology use case was taken through the end to end analysis framework. The results indicated that commercially available technology was not ready and a pilot should be done to demonstrate the business outcomes.

Step 1. The use case: Pathology reports and imaging analysis.

Step 1. The task and subtasks: A pharmaceutical scientist (medical professional, biologist) needs to perform automatic analysis of a large amount of pathology imaging and pathology lab report data:
- Image recognition and analysis / computer vision
- Text analytics
- Number analytics
- Search scanned documents for text (optical character recognition)
- Search 3 dimensional biological structures
- Use previous knowledge obtained from scientific literature and physicians' notes
- Discover new relationships of immune cell response to cancer in therapies where the immune system is harnessed to attack cancer cells
- Discover new approaches for cancer treatment

Step 2. Broad cognitive categories:
- Analytics
- Character and image recognition / Computer Vision
- Machine learning / Neural Networks (advanced)
- Natural language processing

Step 3. Level of intelligence: Repetitive task automation (indexing text and images); Q&A, Natural Language Processing (Context awareness & Learning).

Step 4. TRL score: TRL range 5 - 6.
- Character recognition (OCR) technology is commercially available
- Text search technology is commercially available
- Computer vision technology for pathology images is not commercially mature
- Image searching technology is not available
- Machine learning / neural network techniques for pathology data are not operationally demonstrated yet
- Technology to automatically process complex data to uncover hidden relationships between tumor and immune cells does not exist
There is no general purpose tool commercially available to address all of the above, which are required to solve the problem posed by the use case.

Step 5. Accenture model, work / data complexity: High data and work complexity.

Step 6. Accenture model, scale of automation: Full automation to uncover hidden relationships.

Step 7. Accenture model identification: Quadrant: Innovation.

Step 8. Enterprise capability assessment: Quadrant: Followers.
- Software and hardware infrastructure (IT): Well established
- Ability to analyze business case and build in house software: None
- IT is an important asset / data is a critical asset: Yes
- Subject matter expertise: Yes
- Deep learning expertise and resources available: Some

Step 9. Solution approaches: Conduct vendor assessment for Enterprise Cognitive Solution and Software as a Service tools.

Step 10. Recommended solution: Do a pilot of a cognitive solution that can:
- ingest, index and annotate data from diverse scientific sources
- provide built-in OCR or integrate with OCR tools
- perform deep natural language processing
- perform deep learning and hypothesis generation, and advise with confidence (probabilities)
Use a fully integrated tool such as IBM Watson for Oncology.

6.2 Scientifically Aware Search

Step 1. The use case: The pharmaceutical development group cannot find relevant knowledge assets. Knowledge resides in documents, databases, emails and diverse systems but is not easily accessible to scientists. Systems are locked down, or it is hard to find useful information in them. This leads to lost productivity in locating information and to repetition of work if relevant scientific information is not found.

Step 1. The task and subtasks: A pharmaceutical scientist needs to find scientific information to decide the course of action in a drug development experiment:
- Understand queries written by users in natural language
- Search internal / external websites
- Search internal / external documents
- Search scanned documents for text (optical character recognition)
- Search chemical structures / identify chemical structures (image recognition)
- Search electronic laboratory experiment repositories
- Search and retrieve information from laboratory results repositories
- Discover new relationships between knowledge entities, such as two lab experiments being similar
- Discover expertise, such as a person being an expert in copper removal methods
- Provide a unified view of knowledge across multiple data sources
- Enhance the scientific awareness of the search tool

Step 2. Broad cognitive categories:
- Analytics
- Character and image recognition
- Natural language processing

Step 3. Level of intelligence: Repetitive task automation (indexing text and images); Human Support (tagging knowledge assets appropriately, creating metadata where automatic metadata is not available or not possible).

Step 4. TRL score: TRL range 7 - 8.
- Character recognition (OCR) technology is commercially available
- Text search technology is commercially available
- Chemical structure image recognition technology is commercially available but is not foolproof
- Image searching technology is not available
- Search technology to conduct database searches is commercially available
- Technology to enhance scientific awareness of text content is commercially available (scientific vocabularies can be purchased)

Step 5. Accenture model, work / data complexity: Low to medium. Search and retrieval work has low complexity for document search and medium complexity for database search. Integrating multiple databases, search technology and character recognition technology appears to have medium complexity.

Step 6. Accenture model, scale of automation: Medium (humans still need to query data).

Step 7. Accenture model identification: Quadrant: Efficiency.

Step 8. Enterprise capability assessment: Quadrant: Followers.
- Software and hardware infrastructure (IT): Well established
- Ability to analyze business case and build in house software: None
- IT is an important asset / data is a critical asset: Yes
- Subject matter expertise: Yes

Step 9. Solution approaches: Conduct vendor assessment for enterprise search tools and Software as a Service tools.

Step 10. Recommended solution: Buy an enterprise search tool that:
- retrieves data from diverse sources
- integrates easily with scientific taxonomy tools
- integrates with OCR tools
- does at least shallow natural language processing
- has a proven track record
Do NOT go SaaS, since the scientific data is a critical asset and there is no precedent yet for storing this data in the cloud.

6.3 Chemical Synthesis Step Robustness through Machine Learning

Step 1. The use case: Chemical synthetic development groups need to scale up chemical drug manufacturing processes while maintaining the robustness of the processes.

Step 1. The task and subtasks: A pharmaceutical scientist (in this case a chemical development engineer) needs to use an existing dataset to identify critical parameters that can be controlled to improve and maintain process robustness when the process is scaled up:
- Deep domain expertise
- Conduct hundreds of Design of Experiment studies
- Identify critical control parameters
- Perform model fitting and simulation exercises
- Repeat for other processes

Step 2. Broad cognitive categories: Analytics (numerical).

Step 3. Level of intelligence: Human Support (choosing random forests, SVM or a neural network approach to fit models and predict process robustness).

Step 4. TRL score: TRL range 5 - 6 and 6 - 7.
- Advanced general purpose neural network technology for process scale up is NOT available

Step 5. Accenture model, work / data complexity: High.
- Complex multidimensional data (process sensors, chemical structures, environmental conditions, process conditions)
- Complex chemical reaction modelling processes

Step 6. Accenture model, scale of automation: Low (judgement needs to be made by a human).

Step 7. Accenture model identification: Quadrant: Expert.

Step 8. Enterprise capability assessment: Quadrant: Niche Players.
- Ability to analyze business case and build in house software: High if in process development domain
- Subject matter expertise: Yes, in the scientific domain of chemical development
- Machine learning expertise: Yes, applying for lab based research methods

Step 9. Solution approaches: Conduct assessment of development tools / programming languages.

Step 10. Recommended solution: Enable the user to program and create machine learning models using tools such as Matlab and R. Based on SME interest, make Python and machine learning libraries available. The small scale of the project and the available expertise enable the use of development tools and programming languages.

71 6.4 Business Process Automation

Step 1. The use case: Bring consistency, transparency, speed, and end to end tracking to laboratory sample testing operations. The current laboratory sample testing process is broken: it is mostly manual and highly inefficient. The R&D group has more than 1000 scientists actively conducting laboratory scale chemical experiments. Scientists need to test their experimental products using analytical labs across R&D. Testing labs are dispersed in many subgroups and serve either dedicated groups or multiple groups depending on the type, although all labs conduct laboratory sample testing.

Step 1. The task and subtasks: Pharmaceutical scientists (chemistry practitioners) need a better sample testing and management process:
- Create sample testing request
- Suggest suitable tests based on context, provide ability to pick laboratory tests
- Provide mechanism to uniquely label sample (barcode)
- Suggest laboratories that can conduct testing (based on real time demand)
- Provide ability to route sample
- Provide end to end 'chain of custody' view for the sample, from origin to destination
- Provide ability to track testing status, dashboard results based on context
- Provide state of the art user interface
- Submit results into laboratory data management systems
- Provide reports and dashboards for scientists and managers

Step 2. Broad Cognitive Categories:
- Rules and Business Rules Analytics
- Business Process Management (laboratory sample testing process management)
- Repetitive Task Automation (multiple subtasks such as tracking, printing, routing, reporting, updating status, and demand management can be automated)

Step 3. Level of Intelligence: Some human support is still required, but many tasks can be automated and significant productivity improvement can be achieved.

Step 4. TRL Score: TRL range 8-9, 7-8. Business Process Management software suites are commercially available and capable of meeting most needs.

Step 5. Accenture Model Work/Data Complexity: Low to medium. Search and retrieval work has low complexity for document search and medium complexity for database search. Integrating multiple databases, search technology, and character recognition technology appears to have medium complexity.

Step 6. Accenture Model Scale of Automation: Medium (humans still need to be involved).

Step 7. Accenture Model Identification: Quadrant: Efficiency; Quadrant: Followers.

Step 8. Enterprise Capability Assessment:
- Software and hardware infrastructure (IT): Well established
- Consulting partnership with tech companies: Well established
- Ability to analyze business case and build in house software: Low
- IT is an important asset / data is a critical asset: Yes
- SME: routine process automation, some expertise needed
- Data storage: Laboratory results need to be stored in controlled company infrastructure, which is available

Step 9. Solution Approaches: Conduct vendor assessment for enterprise BPM tools.

Step 10. Recommended Solution: Most BPM suites are Platform as a Service, with the ability to create workflows in the cloud, run software from the cloud, and eventually store the results of the operations in house. Go SaaS.
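Several of the subtasks listed for this use case (create a request, label the sample uniquely, suggest a laboratory based on current demand, and keep an end to end chain of custody) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not part of the assessed solution or of any BPM product: the class names, the routing rule in `suggest_lab` (pick the capable lab with the shortest queue), and the use of a random identifier in place of a barcode are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4


@dataclass
class Lab:
    """Hypothetical analytical lab: the tests it offers and its current queue."""
    name: str
    tests: set[str]
    queue_length: int = 0


@dataclass
class CustodyEvent:
    """One entry in the chain of custody for a sample."""
    timestamp: datetime
    location: str
    status: str


@dataclass
class SampleRequest:
    """Hypothetical sample testing request with a unique label and custody trail."""
    requester: str
    test: str
    # Assumption: a random unique ID stands in for the printed barcode.
    barcode: str = field(default_factory=lambda: uuid4().hex)
    custody: list[CustodyEvent] = field(default_factory=list)

    def record(self, location: str, status: str) -> None:
        """Append a custody event so the sample can be traced end to end."""
        self.custody.append(
            CustodyEvent(datetime.now(timezone.utc), location, status)
        )

    @property
    def status(self) -> str:
        """Latest known status, or 'created' if nothing has been recorded yet."""
        return self.custody[-1].status if self.custody else "created"


def suggest_lab(request: SampleRequest, labs: list[Lab]) -> Lab:
    """Assumed rule: among labs offering the test, pick the shortest queue."""
    capable = [lab for lab in labs if request.test in lab.tests]
    if not capable:
        raise ValueError(f"no lab offers test {request.test!r}")
    return min(capable, key=lambda lab: lab.queue_length)


def route(request: SampleRequest, labs: list[Lab]) -> Lab:
    """Route the sample to the suggested lab and log the hand-off."""
    lab = suggest_lab(request, labs)
    lab.queue_length += 1
    request.record(lab.name, "routed")
    return lab
```

With two hypothetical labs that both offer the requested test, `route` sends the sample to the one with the shorter queue, and the request's custody trail then holds a single "routed" event. A commercial BPM suite would replace all of this with configured workflows; the sketch only makes the data involved concrete.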

This chapter applied the 10-step analysis framework proposed in Chapter 5 to pharmaceutical industry use cases in order to recommend CC/ML solutions.

Chapter 7. Conclusion

Born-digital companies such as Amazon and Netflix have always been at the forefront of embracing, and at times inventing, new technologies such as cognitive computing / machine learning. CC/ML technologies have now reached an inflection point, and a Smart Machine Age has begun.

Key Takeaways:

- Non-technology companies can no longer stay on the sidelines and must take advantage of CC/ML opportunities.
- The CC/ML universe is complex to navigate.
- The analysis in this thesis helps a business person avoid becoming a victim of hype.
- As in any hype, there are many opportunities to make bad investment decisions, but there are also real opportunities to derive tangible business value.
- If a business problem involves humongous and complex data analysis, massive computing needs, and opportunities to recognize novel patterns, then CC/ML experts should be consulted to vet the opportunity for possible investment.
- Finally, if a problem is recognized as a CC/ML opportunity, then the framework in Chapter 5 should be used to arrive at a solution.

7.1 CC/ML Universe is Complex to Navigate

A variety of CC/ML technologies are available or hitting the marketplace every day, often with confusing names and hyped-up promises. Instead of anthropomorphized technology definitions that promise human-brain-like abilities, descriptive terms explaining what a technology does and how it does it are more helpful. Market readiness of CC/ML technologies varies by segment as well as by the domain in which they are applied. Multiple solution delivery mechanisms are available for CC/ML technologies.

A business person finds the CC/ML landscape complex. This thesis provided clear definitions, a survey of use cases across industries, and an approach to segmenting products and delivery mechanisms that cuts through the fog.

7.2 Systematic Analysis framework aids in navigating the landscape

The framework proposed in this thesis helps a business person navigate CC/ML opportunities. It helps identify the CC/ML category, the possible technologies, their market readiness, and a reasonable solution approach based on an enterprise's own capability. The assessment conducted for pharmaceutical industry use cases with this framework can be applied to use cases in other industry domains, especially for non-technology companies.
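To make the shape of such an assessment concrete, the ten steps can be held in a simple record. The sketch below is illustrative only: the field names are hypothetical, the values are abbreviated from the Section 6.4 business process automation assessment, and `suggest_approach` is an assumed simplification of how Steps 8 and 9 relate in the Chapter 6 use cases (high in-house build capability led to assessing development tools, low capability led to a vendor assessment). It is not a rule stated by the framework itself.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Assessment:
    """One use case recorded against the ten steps (field order = step order)."""
    use_case: str               # Step 1: use case, task and subtasks
    cognitive_categories: str   # Step 2: broad cognitive categories
    level_of_intelligence: str  # Step 3
    trl_score: str              # Step 4
    work_data_complexity: str   # Step 5 (Accenture Model)
    scale_of_automation: str    # Step 6 (Accenture Model)
    quadrant: str               # Step 7 (Accenture Model)
    enterprise_capability: str  # Step 8
    solution_approaches: str    # Step 9
    recommended_solution: str   # Step 10

    def steps(self) -> list[tuple[int, str, str]]:
        """Return (step number, field name, value) in framework order."""
        return [
            (number, f.name, getattr(self, f.name))
            for number, f in enumerate(fields(self), start=1)
        ]


def suggest_approach(in_house_build_capability: str) -> str:
    """Assumed simplification: strong in-house capability favours building,
    anything else points to a vendor assessment."""
    if in_house_build_capability.strip().lower() == "high":
        return "assess development tools / programming languages"
    return "conduct vendor assessment"


# Values abbreviated from the Section 6.4 assessment.
bpa = Assessment(
    use_case="Laboratory sample testing process automation",
    cognitive_categories="Business process management; repetitive task automation",
    level_of_intelligence="Some human support still required",
    trl_score="8-9, 7-8",
    work_data_complexity="Low to medium",
    scale_of_automation="Medium",
    quadrant="Efficiency",
    enterprise_capability="In-house build capability: Low",
    solution_approaches="Conduct vendor assessment for enterprise BPM tools",
    recommended_solution="Go SaaS",
)
```

Calling `bpa.steps()` yields the ten entries in order, which is convenient for printing one assessment next to another when the framework is applied to a new use case or industry domain.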

7.3 Future Research

The assessments made for pharmaceutical industry use cases should be validated with practitioners and implementers. The framework can be applied to other industry domains and validated, and then refined based on the learnings. Expert TRL assessments of the solutions available for specific industry domains and task categories can also be conducted.


Appendix A - CC/ML Startup Ecosystem

Company: X.AI
Product: Personal Assistant
Description: Magically schedule meetings.
Details: Scheduling meetings using machine learning techniques.

Company: Rethink Robotics
Product: Sawyer, Baxter, Intera Platform
Description: Physical task automation. Collaborative robots, software.
Details: Collaborative flexible robots for manufacturing and R&D. Software platform to train robots using context instead of coordinates, so that non-technical personnel can train the robots.

Company: Feedzai
Product: Feedzai Data Studio
Description: Financial fraud detection technology. Make commerce safe and improve end user experience.
Details: Reimagine financial fraud detection technology through machine intelligence. Data Studio is a software environment for financial risk professionals to do data science.

Company: Preact
Product: Customer churn prediction
Description: Sales improvements using machine learning.
Details: Customer churn prediction and prevention. Product usage analysis.

Company: LiftIgniter
Product: Marketing through Personalization
Description: Machine learning to improve personalized marketing.
Details: Improve click through rate, recommendations, engagement.

Company: Narrative Science
Product: Quill
Description: Create stories from data and insights.
Details: Natural language generation.

Company: MetaMind
Product: Vision, Language
Description: Image analysis and text analysis.
Details: Machine intelligence applied to image analysis and text analysis for business applications.

Company: Enlitic
Product: Data Driven Medicine
Description: Deep learning techniques to detect disease early. Contextualize imaging data by comparing it with large data sets and by analyzing clinical data and reports.
Details: Disease diagnostics using advanced machine learning for a broad spectrum of diseases instead of specialized computer programs.

Company: Deep Genomics
Product: Genomic Medicine Software SPIDEX
Description: Genome biology and precision medicine.
Details: Machine learning technologies to transform precision medicine, genetic testing, diagnostics, and development of therapies. SPIDEX is a comprehensive set of mutations and their predicted effects on RNA splicing across the entire human genome.
