Cassie Kozyrkov, Chief Decision Scientist, Google
@quaesita

What is machine learning?

Why do teams fail at machine learning?

Machine Learning
Data: Ingredients
Algorithms: Appliances
Models: Recipes
Predictions: Dishes

How does it work?
Support Vector Classifier, Decision Tree, Neural Network
Label: Y or N

Applied machine learning is a team sport

Meet the team: Decision-maker
[Diagram: Data (inputs), Decisions (outputs). Is it tasty? Y or N?]

Meet the team: Decision-maker, Software engineer
[Diagram: Programmer (expert), Model (recipe), Decisions (outputs). Y or N? Portrait: Ada, Countess of Lovelace, 1815-1852]

Meet the team: Decision-maker, Software engineer, Data engineer
[Diagram: Data (ingredients), Programmer (expert), Models (recipes), Decisions (outputs). Is it tasty? Y or N?]

Meet the team: Decision-maker, Software engineer, Data engineer, Descriptive analyst
[Diagram: Data (ingredients), Analyst (expert), Models (recipes), Decisions (outputs). Is it tasty? Y or N? Portraits: Karl Friedrich Gauss, 1777-1855; Florence Nightingale, 1820-1910]

Meet the team: Decision-maker, Software engineer, Data engineer, Descriptive analyst, Machine learning engineer
[Diagram: Data (ingredients), Algorithms (tools), Models (recipes), Decisions (outputs). Algorithms shown: SV Classifier, Decision Tree, Neural Network. Is it tasty? Y or N?]

Meet the team: Decision-maker, Software engineer, Data engineer, Descriptive analyst, Machine learning engineer, (Researcher)
[Diagram: Data (ingredients), Algorithms (tools), Models (recipes), Decisions (outputs). Algorithms shown: SV Classifier, Decision Tree, Neural Network, each answering "Is it tasty?" with Y or N]
[Diagram: Data (ingredients), Models (recipes), Decisions (outputs). Is it tasty? Three rows of answers: Y Y N Y N N]

Meet the team: Decision-maker, Software engineer, Data engineer, Descriptive analyst, Machine learning engineer, (Researcher), Statistician
[Diagram: Training, Validation, Test. Labels: Build, Make Recipes, Casual Taste Test, Final Taste Test, Add to Menu. Later builds of this slide add the scores 75%, 90%, 85%, then "Pass"]

Meet the team: Decision-maker, Software engineer, Data engineer, Descriptive analyst, Machine learning engineer, (Researcher), Statistician, (Reliability engineer)

What is data science?

A map of data science?
Descriptive Analytics: SQL. Machine Learning: Python. Statistical Inference: R.

A map of data science?
Descriptive Analytics: Histogram. Machine Learning: Neural network. Statistical Inference: Student’s t-test.

A map of data science!
Descriptive Analytics: None. Machine Learning: Many. Statistical Inference: Few.

A map of data science!
Descriptive Analytics: Get inspired. Machine Learning: Make a recipe. Statistical Inference: Decide wisely.

Seeing with Google Photos [beach]

Is machine learning right for you?

Meet the team: Decision-maker, Software engineer, Data engineer, Descriptive analyst, Machine learning engineer, (Researcher), Statistician, (Reliability engineer), (Ethicist)

Reliable or unreliable?

Technology is a lever.

Decisions at scale need skilled leaders.
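
The Training, Validation, and Test stages from the statistician slides can be sketched in a few lines of plain Python. This is only an illustration, not the speaker's code: the toy "sweetness" data, the three simple rule-fitting functions standing in for algorithms, the 60/20/20 split, and the 0.80 pass mark are all assumptions made for the example.

```python
import random

PASS_MARK = 0.80  # assumed performance criterion, fixed before any testing


def make_data(n, seed=0):
    """Hypothetical toy data: x is 'sweetness', label is 'tasty' (10% noise)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        tasty = x > 0.5
        if rng.random() < 0.1:  # flip some labels so no model is perfect
            tasty = not tasty
        rows.append((x, tasty))
    return rows


def accuracy(model, rows):
    """Fraction of rows where the model's Y/N answer matches the label."""
    return sum(model(x) == label for x, label in rows) / len(rows)


GRID = [i / 20 for i in range(21)]  # candidate cut-offs 0.00, 0.05, ..., 1.00


def fit_always_yes(train):
    return lambda x: True


def fit_yes_above(train):
    cut = max(GRID, key=lambda t: accuracy(lambda x: x > t, train))
    return lambda x: x > cut


def fit_yes_below(train):
    cut = max(GRID, key=lambda t: accuracy(lambda x: x < t, train))
    return lambda x: x < cut


ALGORITHMS = {
    "always_yes": fit_always_yes,
    "yes_above": fit_yes_above,
    "yes_below": fit_yes_below,
}


def run_pipeline(rows):
    n = len(rows)
    train = rows[: n * 6 // 10]
    validation = rows[n * 6 // 10 : n * 8 // 10]
    test = rows[n * 8 // 10 :]

    # Training: each algorithm turns the training data into one model.
    models = {name: fit(train) for name, fit in ALGORITHMS.items()}

    # Validation: compare the candidates on data they were not fit to.
    scores = {name: accuracy(m, validation) for name, m in models.items()}
    best = max(scores, key=scores.get)

    # Test: score the chosen model once and compare with the pass mark.
    test_accuracy = accuracy(models[best], test)
    return best, test_accuracy, test_accuracy >= PASS_MARK


best_name, test_accuracy, passed = run_pipeline(make_data(1000))
print(best_name, round(test_accuracy, 3), "pass" if passed else "fail")
```

The design choice worth noting is that the test data is touched exactly once, after the model has been chosen, and that the pass mark is written down before any score is computed.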
Design metrics that can’t be gamed.

Warning! Decision-makers make or break a data science project. Please train them.

Set performance criteria up front.

Would you stay at this hotel?

Data-inspired or data-driven?

Frame the decision context
1. What if you get no information?
2. What if you get full information?
3. What if you get partial information?
Role: Decision-maker

Get help! From a teammate. From data.

Meet the team: Decision-maker, Software engineer, Data engineer, Descriptive analyst, Machine learning engineer, (Researcher), Statistician, (Reliability engineer), (Ethicist), (Qualitative expert)

Help from the social sciences
1. Psychology
   a. Judgment & decision-making
   b. Social psychology
   c. Neuroeconomics
2. Economics
   a. Experimental game theory
   b. Behavioral economics
3. Managerial sciences
Role: Qualitative expert

Get help! From a teammate. From data.
Warning! Datasets used for inspiration can’t be used for rigorous decision-making.

Have your cake and eat it too.
Split your data.
[Diagram: ALL OF IT]
Empower everyone to look at data.

You are already a data analyst. Who, me? Yes, you.

Split your data.
[Diagram: ALL OF IT, split into Inspiration (everyone) and Rigor (experts only)]
Statistician: “Does it actually work?” “Let’s come to conclusions safely.”
Empower everyone to look at data. Be careful coming to conclusions.

Inspiration is cheap, rigor takes expertise.

Warning! Avoid rigor for rigor’s sake.

Meet the team: Decision-maker, Software engineer, Data engineer, Descriptive analyst, Machine learning engineer, (Researcher), Statistician, (Reliability engineer), (Ethicist), (Qualitative expert), Data science manager

Rigor should start with the decision-maker.

Why trust machine learning?

Why trust your student?

Catch memorization!
Don’t let overfitting take you by surprise; expect performance on examples you’ve studied from to be too good to be true.

Testing keeps you safe. It’s the key to responsible ML and AI. Make sure your solution actually works on relevant, new data.

Never forget the basics of learning and teaching!

Your ingredients matter.

“The world represented by your training data is the only world you can expect to succeed in.”

Where do data come from?

The future of work

We’re all being promoted.

The future of work is creativity.

Thank you

Summary
Decision intelligence engineering is a team sport.
The decision-maker’s role is vital in machine learning.
Data-driven decisions require up front context framing.
Ensure that your metrics can’t be gamed.
Splitting your data lets you have your cake and eat it too.
Inspiration is cheap, rigor takes expertise.
Rigor should come from the decision-maker.
Don’t reinvent the wheel with research.
Test carefully on relevant, new examples.
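
The "Catch memorization!" point can be made concrete with a deliberately bad student. The sketch below is an assumption-laden toy, not anything from the talk: `make_examples` and `fit_memorizer` are hypothetical helpers, and the "sweetness" data is invented. The model memorizes every studied example and answers N to anything unfamiliar, so its score on studied examples looks perfect while its score on new examples does not.

```python
import random


def make_examples(n, seed):
    """Hypothetical toy data: x is 'sweetness'; a dish is tasty when x > 0.5."""
    rng = random.Random(seed)
    return [(x, x > 0.5) for x in (rng.random() for _ in range(n))]


def fit_memorizer(train):
    """A 'student' that memorizes answers and says N to anything unfamiliar."""
    seen = dict(train)
    return lambda x: seen.get(x, False)


def accuracy(model, rows):
    """Fraction of rows where the model's Y/N answer matches the label."""
    return sum(model(x) == label for x, label in rows) / len(rows)


studied = make_examples(200, seed=1)   # examples the model learned from
new_data = make_examples(200, seed=2)  # fresh examples it has never seen

model = fit_memorizer(studied)
studied_accuracy = accuracy(model, studied)
new_accuracy = accuracy(model, new_data)

print("studied examples:", studied_accuracy)
print("new examples:", new_accuracy)
```

Only the second number says anything about whether the solution works, which is why the test has to use relevant, new data.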