
Soc400/500: Applied Social Statistics
Week 1: Introduction and Probability

Brandon Stewart¹

Princeton

August 31-September 4, 2020

¹These slides are heavily influenced by Matt Blackwell and Adam Glynn, with contributions from Justin Grimmer and Matt Salganik. Illustrations by Shay O'Brien.


Where We've Been and Where We're Going...

Last Week
- living that class-free, quarantine life

This Week
- course structure
- core ideas
- introduction to probability
- three big ideas in probability

Next Week
- random variables
- joint distributions

Long Run
- probability → inference → regression → causal inference

1. Course Structure: Overview; Ways to Learn; Final Details
2. Core Ideas: What is Statistics?; Preview: Connecting Theory and Evidence
3. Introduction to Probability: What is Probability?; Sample Spaces and Events; Probability Functions
4. Three Big Ideas in Probability: Marginal, Joint and Conditional Probability; Bayes' Rule; Independence


Welcome and Introductions

The tale of two classes: Soc400/Soc500 Applied Social Statistics

I ...
- am an Assistant Professor in Sociology.
- am trained in political science and statistics.
- do research in methods and statistical text analysis.
- love doing collaborative research.
- talk very quickly.

Your Preceptors
- Emily Cantrell
- Alejandro Schugurensky


Overview
- Goal: train you in statistical thinking.
- Fundamentally a graduate course for sociologists, but also useful for research in other fields, policy evaluation, industry, etc.
- A difficult course, but with many resources to support you.
- When we are done, you will be able to teach yourself many things.
- The syllabus is a useful resource, including the philosophy of the class.


Specific Goals
- critically read and reason about quantitative social science using linear regression techniques.
- conduct, interpret, and communicate results from analysis using multiple regression.
- explain the limitations of observational data for making causal claims and distinguish between identification and estimation.
- understand the logic and assumptions of several modern designs for making causal claims.
- write clean, reusable, and reliable R code in tidyverse style.
- feel empowered working with data.


Why R?
- It will give you super powers (but not at first).
- It is free and open source.
- It is the de facto standard in many applied statistical fields.

Artwork by @allison_horst

Why RMarkdown?

Artwork by @allison_horst



Mathematical Prerequisites
- No formal prerequisites.
- Balance of rigor and intuition:
  - no rigor for rigor's sake.
  - we will tell you why you need the math, but also feel free to ask.
  - the course focuses on how to reason about statistics, not just memorize guidelines.
- We will teach you any math you need as we go along.
- Crucially, though: this class is not about innate statistical aptitude, it is about effort.
- We all come from very different backgrounds. Please have patience with yourself and with others.


Ways to Learn
- Pre-Recorded Lectures: learn broad topics (4–8 videos a week, ≈2.5 hours)
- Pre-Recorded Precept: learn data analysis skills, get targeted help on assignments
- Perusall: an annotation platform for videos
- Course Meetings: come together and discuss material
- Ed: ask questions of us and your classmates
- Office Hours: ask us even more questions, but (sort of) in-person
- Problem Sets: reinforce understanding of material, practice


Problem Sets
- Schedule (due Friday at 5PM eastern)
- Grading and solutions
- Collaboration policy
- You may find these difficult. Start early and seek help!
- Most important part of the class


Ways to Learn
- Pre-Recorded Lectures
- Pre-Recorded Precept
- Perusall
- Course Meetings
- Ed
- Office Hours
- Problem Sets
- Instructor Office Hours
- Final Exam Prep
- External Consulting
- Individual and Group Tutoring

Your Job: work hard and get help when you need it!

Staying in Touch


A Note on Reading
- Think of the lecture slides as primary reading.
- If you want material to read, come talk to me about recommendations.
- Suggested books (more in the syllabus!):
  - Angrist and Pischke. 2008. Mostly Harmless Econometrics
  - Aronow and Miller. 2019. Foundations of Agnostic Statistics
  - Blitzstein and Hwang. 2019. Introduction to Probability
- A somewhat obvious tip: don't skip the math!

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 15 / 70 Ask questions if you don’t know what’s going on! Investing a considerable amount of time in getting familiar with R and its various tools will pay off in the long run! Go over the lecture slides each week. This can be hard when you feel like you’re treading water and just staying afloat, but I wish I had done this regularly. It’s challenging but very doable and rewarding if you put the time in. There are plenty of resources to take advantage of for help. I found it helpful to read through the lecture slides again after I had opened the problem set. It made it easier to create connections between what we went through and how to do it. Go over your psets and the pset solutions the they are graded as a habit and figure out what you don’t know.

Advice from Prior Generations

Ask questions if you don’t know what’s going on!

Investing a considerable amount of time in getting familiar with R and its various tools will pay off in the long run!

Go over the lecture slides each week. This can be hard when you feel like you’re treading water and just staying afloat, but I wish I had done this regularly. It’s challenging but very doable and rewarding if you put the time in. There are plenty of resources to take advantage of for help.

I found it helpful to read through the lecture slides again after I had opened the problem set. It made it easier to create connections between what we went through and how to do it.

Go over your psets and the pset solutions the moment they are graded, as a habit, and figure out what you don’t know.

Outline of Topics

Outline in reverse order:

Causal Inference: inferring a counterfactual effect given an association.

Regression: estimating association.

Inference: estimating things we don’t know from data.

Probability: learning what data we would expect if we did know the truth.

Probability → Inference → Regression → Causal Inference

Attribution and Thanks

My philosophy on teaching: don’t reinvent the wheel; customize, refine, improve.

Huge thanks to those who have provided slides, particularly Matt Blackwell, Adam Glynn, Justin Grimmer, Jens Hainmueller, Erin Hartman, and Kevin Quinn.

Also thanks to those who have discussed this material with me at length, including Dalton Conley, Chad Hazlett, Gary King, Kosuke Imai, Matt Salganik, and Teppei Yamamoto.

Previous generations of preceptors have also been incredibly important: Clark Bernier, Elisha Cohen, Ian Lundberg, Simone Zhang, Alex Kindel, Ziyao Tian, and Shay O’Brien.

Thanks to Shay O’Brien for the many hand-drawn illustrations.

Welcome To Class!

Be sure to read the syllabus for more details.

Where We’ve Been and Where We’re Going...

Last Week

I living that class-free, quarantine life

This Week

I course structure

I core ideas

I introduction to probability

I three big ideas in probability

Next Week

I random variables

I joint distributions

Long Run

I probability → inference → regression → causal inference

1 Course Structure: Overview; Ways to Learn; Final Details

2 Core Ideas: What is Statistics?; Preview: Connecting Theory and Evidence

3 Introduction to Probability: What is Probability?; Sample Spaces and Events; Probability Functions

4 Three Big Ideas in Probability: Marginal, Joint and Conditional Probability; Bayes’ Rule; Independence

What is Statistics?

Branch of mathematics studying the collection and analysis of data.

The name statistic comes from the word state.

The arc of developments in statistics: 1) an applied scholar has a problem; 2) they solve the problem by inventing a specific method; 3) statisticians generalize and export the best of these methods.

Relatively recent field (started at the end of the 19th century).

Goal: principled guesses based on stated assumptions.

In practice, an essential part of research, policy making, political campaigns, selling people things...

Why study probability?

It enables inference.

In Picture Form

Data generating process → (probability) → Observed data

Observed data → (inference) → Data generating process

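The loop in the picture can be run as a tiny simulation sketch (my illustration, not from the slides): in the probability direction we take the data generating process as known and see what data it produces; in the inference direction we pretend we only see the data and estimate the truth from it. The coin probability 0.7 and the sample size are arbitrary choices for the example.

```python
import random

random.seed(42)

# Probability direction: take the data generating process as known
# (a coin that lands heads with probability 0.7) and ask what data
# we would expect to observe from it.
true_p = 0.7  # hypothetical "truth", chosen only for the example
data = [1 if random.random() < true_p else 0 for _ in range(10_000)]

# Inference direction: pretend we only see the data, and estimate
# the unknown feature of the data generating process from it.
estimate = sum(data) / len(data)
print(round(estimate, 2))  # lands close to 0.7
```

With a larger sample the estimate hugs the truth more tightly, which is the sense in which probability (truth → expected data) underwrites inference (data → guess about the truth).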
Statistical Thought Experiments

We start with probability. It allows us to contemplate the world under hypothetical scenarios.

I hypotheticals let us ask: is the observed relationship happening by chance, or is it systematic?

I it tells us what the world would look like under a certain assumption.

Most of the probability material is in the first two weeks, but we will return to these ideas periodically through the semester.

Example: The Lady Tasting Tea

The Story Setup (lady discerning about tea)

The Experiment (perform a taste test)

The Hypothetical (count possibilities)

The Result (boom she was right)

This became the Fisher Exact Test.

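The "count possibilities" step can be sketched in a few lines. The slide does not give the numbers, so this assumes the textbook version of the story: 8 cups of tea, 4 prepared milk-first, and the lady must say which 4 those are. If she is purely guessing, every way of picking 4 of the 8 cups is equally likely, so the chance of labeling all cups correctly is 1 over C(8, 4).

```python
from math import comb

# Assumed classic setup: 8 cups, 4 poured milk-first.
# The Hypothetical: under pure guessing, each of the C(8, 4) ways
# to pick 4 cups as "milk-first" is equally likely.
n_guesses = comb(8, 4)            # 70 equally likely guesses
p_all_correct = 1 / n_guesses     # exactly one guess is perfect

print(n_guesses)                  # 70
print(round(p_all_correct, 4))    # 0.0143
```

So getting all cups right by luck alone has probability about 1.4%, which is why her success was taken as evidence that she really could tell the difference.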
A Note on Fisher and the History of Statistics

The statistician in that story was Sir Ronald Fisher, arguably the most influential statistician of the 20th century. Besides founding key areas of statistics, Fisher was also one of the founders of population genetics. He was also a eugenicist and a racist.

Statistics has been used intermittently as a force for progress and a force against progress.

Preview: Connecting Theory and Evidence

“[Variables] empirically perform as theoretically predicted, by displaying statistically significant effects net of other variables in the right direction”

Lundberg, Johnson, and Stewart. Setting the Target: Precise Estimands and the Gap Between Theory and Empirics

The target tautology:

Research goals are defined by hypotheses about model coefficients

The goal is only defined within the statistical model

It becomes impossible to reason about other estimation strategies

This is our diagnosis for the source of many methodological problems.

Solution: state the research goal separately from the estimation strategy.

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 29 / 70 Connecting Theory and Evidence

[Flow diagram:] Theory or general goal → Theoretical estimands → Empirical estimands → Estimation strategies

Set a specific target: by argument (example tools: target population, causal contrast)
Link to observable data: by assumption (example tools: directed acyclic graphs, potential outcomes)
Learn how to estimate from the data we observe: by data (example tools: OLS regression, machine learning)

Evidence points to new questions (feeding back from estimation to theory)

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 29 / 70

We Covered. . .

Statistics as a field (the good, the bad and the ugly)
The probability and inference loop
Connecting theory and evidence through estimands
See you next time!

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 30 / 70

Where We’ve Been and Where We’re Going...

Last Week
I living that class-free, quarantine life

This Week
I course structure
I core ideas
I introduction to probability
I three big ideas in probability

Next Week
I random variables
I joint distributions

Long Run
I probability → inference → regression → causal inference

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 31 / 70

1 Course Structure: Overview; Ways to Learn; Final Details
2 Core Ideas: What is Statistics?; Preview: Connecting Theory and Evidence
3 Introduction to Probability: What is Probability?; Sample Spaces and Events; Probability Functions
4 Three Big Ideas in Probability: Marginal, Joint and Conditional Probability; Bayes’ Rule; Independence

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 32 / 70

From ‘Probably’ to Probability

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 33 / 70

Why Probability?

Helps us envision hypotheticals
Describes uncertainty in how the data are generated
Estimates the probability that something will happen
Thus: we need to know how probability gives rise to data

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 34 / 70

Intuitive Definition of Probability

While there are several interpretations of what probability is, most modern (post-1935 or so) researchers agree on an axiomatic definition of probability.

3 Axioms (Intuitive Version):
1 The probability of any particular event must be non-negative.
2 The probability of anything occurring among all possible events must be 1.
3 The probability of one of many mutually exclusive events happening is the sum of the individual probabilities.

All the rules of probability can be derived from these axioms. To state them formally, we first need some definitions.

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 35 / 70

Sample Spaces

To define probability we need to define the set of possible outcomes.

The sample space is the set of all possible outcomes, and is often written as S.

For example, if we flip a coin twice, there are four possible outcomes,

S = { {heads, heads}, {heads, tails}, {tails, heads}, {tails, tails} }

Thus the table in Lady Tasting Tea was defining the sample space. (Note we defined illogical guesses to have probability 0.)
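To make the two-flip example concrete, we can enumerate this sample space by brute force. A small illustrative Python sketch (not part of the course materials):

```python
from itertools import product

# Sample space for two coin flips: all ordered pairs of outcomes, |S| = 2 * 2 = 4.
S = list(product(["heads", "tails"], repeat=2))
for outcome in S:
    print(outcome)
```

Note that {heads, tails} and {tails, heads} are distinct outcomes because the flips are ordered.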

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 36 / 70

A Running Visual Metaphor

Imagine that we sample one apple from a bag. Looking in the bag we see: [illustration: four apples]

The sample space is: Ω = S = { , , , } [one entry per pictured apple]

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 37 / 70

Events

Events are subsets of the sample space. For example, if Ω = S = { , , , } [the four pictured apples]

then

{ , , } and { } are both events.

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 38 / 70

Events Are a Kind of Set

Sets are collections of things, in this case collections of outcomes.

One way to define an event is to describe the common property that all of the outcomes share. We write this as

{ω | ω satisfies the property they share},

Example: if A = {ω | ω has a leaf}, then A contains exactly the apples with a leaf (each labeled A in the illustration).
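The "outcomes satisfying a property" notation maps directly onto a set comprehension. A hedged sketch, using an invented encoding of the four apples where each outcome records whether the apple has a leaf:

```python
# Hypothetical encoding of the apple sample space: (label, has_leaf) pairs.
S = {("apple1", True), ("apple2", False), ("apple3", True), ("apple4", False)}

# A = {omega | omega has a leaf}, written as a set comprehension.
A = {omega for omega in S if omega[1]}
print(A)
```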

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 39 / 70

Complement

The complement of event A, denoted Ac, is also a set: Ac is everything in the sample space that is not in A.

{ , , } and { } are complements.

Ac = {ω ∈ S | ω ∉ A}.

Important complement: Sc = ∅, where ∅ is the empty set.
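With outcomes stored as a Python set, the complement is just set difference relative to the sample space. A sketch (the apple labels are invented):

```python
S = {"apple1", "apple2", "apple3", "apple4"}  # sample space
A = {"apple1", "apple2", "apple3"}            # an event

A_c = S - A        # complement: every outcome in S that is not in A
print(A_c)
print(S - S)       # the complement of S itself is the empty set
```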

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 40 / 70

Unions and Intersections

The union of two events, A and B, is the event that A or B occurs:

A ∪ B = {ω | ω ∈ A or ω ∈ B}. [Apple illustration: A ∪ B = { , , }]

The intersection of two events, A and B, is the event that both A and B occur:

A ∩ B = {ω | ω ∈ A and ω ∈ B}. [Apple illustration: A ∩ B = { }]
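Union and intersection likewise correspond to built-in set operators. A sketch continuing the invented apple labels:

```python
S = {"apple1", "apple2", "apple3", "apple4"}  # sample space
A = {"apple1", "apple2"}
B = {"apple2", "apple3"}

union = A | B         # outcomes in A or B (or both)
intersection = A & B  # outcomes in both A and B
print(union)
print(intersection)
```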

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 41 / 70

Operations on Events

We say that two events A and B are disjoint or mutually exclusive if they don’t share any elements, that is, A ∩ B = ∅.

An event and its complement, A and Ac, are by definition disjoint.

Sample spaces can have infinitely many events, where we will often write the different events using subscripts of the same letter: A1, A2, . . . , A∞ (e.g. imagine an event that is the count of some object).
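Disjointness has a direct set-level check. A small sketch, again with invented apple labels:

```python
A = {"apple1", "apple2"}
A_c = {"apple3", "apple4"}  # complement of A in a four-apple sample space

# An event and its complement never share elements.
print(A.isdisjoint(A_c))    # disjointness check
print((A & A_c) == set())   # equivalent check: empty intersection
```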

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 42 / 70

Probability Function

A probability function P(·) is a function defined over all subsets of a sample space S that satisfies the following three axioms:

1) P(A) ≥ 0 for all A in the set of all events. (nonnegativity) [Apple illustration: P( ) = -.5 violates this.]

2) P(S) = 1. (normalization) [Apple illustration: P({ , , , }) = 1.]

3) If events A1, A2, . . . are mutually exclusive, then P(A1 ∪ A2 ∪ · · · ) = P(A1) + P(A2) + · · · . (additivity) [Apple illustration: P( ∪ ) = P( ) + P( ) when the two events are mutually exclusive.]

All the rules of probability can be derived from these axioms. (See Blitzstein & Hwang, Def 1.6.1.)
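For a finite sample space we can spot-check all three axioms numerically. A sketch assuming equal weight on four invented outcomes:

```python
from itertools import combinations

S = ["a1", "a2", "a3", "a4"]        # invented four-outcome sample space
p = {omega: 0.25 for omega in S}    # equal weight on each outcome

def P(event):
    """Probability of an event (any subset of S) as the sum of outcome weights."""
    return sum(p[omega] for omega in event)

# Enumerate every subset of S and spot-check the axioms.
events = [set(c) for r in range(len(S) + 1) for c in combinations(S, r)]
assert all(P(A) >= 0 for A in events)   # 1) nonnegativity
assert P(set(S)) == 1.0                 # 2) normalization
A, B = {"a1"}, {"a2", "a3"}             # mutually exclusive events
assert P(A | B) == P(A) + P(B)          # 3) additivity (finite case)
```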

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 43 / 70

A Brief Word on Interpretation

Massive debate on interpretation:

Subjective Interpretation
I Example: The probability of drawing 5 red cards out of 10 drawn from a deck of cards is whatever you want it to be. But...
I If you don’t follow the axioms, a bookie can beat you.
I There is a correct way to update your beliefs given your assumptions about the data generating process.

Frequency Interpretation
I Probability is the relative frequency with which an event would occur if the process were repeated a large number of times under similar conditions.
I Example: The probability of drawing 5 red cards out of 10 drawn from a deck of cards is the frequency with which this event occurs in repeated samples of 10 cards.

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 44 / 70

A Brief Word on Interpretation

Massive debate on interpretation: Subjective Interpretation

I Example: The probability of drawing 5 red cards out of 10 drawn from a deck of cards is whatever you want it to be. But...

I If you don’t follow the axioms, a bookie can beat you

I There is a correct way to update your beliefs given your assumptions about the data generating process. Frequency Interpretation

I Probability is the relative frequency with which an event would occur if the process were repeated a large number of times under similar conditions.

Stewart (Princeton) Week 1: Introduction and Probability August 31-September 4, 2020 44 / 70 A Brief Word on Interpretation

Massive debate on interpretation: Subjective Interpretation

I Example: The probability of drawing 5 red cards out of 10 drawn from a deck of cards is whatever you want it to be. But...

I If you don’t follow the axioms, a bookie can beat you

I There is a correct way to update your beliefs given your assumptions about the data generating process. Frequency Interpretation

I Probability is the relative frequency with which an event would occur if the process were repeated a large number of times under similar conditions.

I Example: The probability of drawing 5 red cards out of 10 drawn from a deck of cards is the frequency with which this event occurs in repeated samples of 10 cards.

We Covered...

I Events and Sample Spaces

I Probability Functions and Three Axioms

Next: Three Big Ideas derived from the axioms that provide the rules of working with probability. See you next time!

Where We’ve Been and Where We’re Going...

Last Week

I living that class-free, quarantine life

This Week

I course structure

I core ideas

I introduction to probability

I three big ideas in probability

Next Week

I random variables

I joint distributions

Long Run

I probability → inference → regression → causal inference

Three Big Ideas

I Marginal, joint, and conditional probabilities

I Bayes’ rule

I Independence

Marginal and Joint Probability

So far we have only considered situations where we are interested in the probability of a single event A occurring. We’ve denoted this P(A). P(A) is sometimes called a marginal probability.

Suppose we are now in a situation where we would like to express the probability that an event A and an event B occur. This quantity is written as P(A ∩ B), P(B ∩ A), P(A, B), or P(B, A) and is the joint probability of A and B.

[Visual with card images omitted: the slide illustrates the equivalent notations P(·, ·) = P(· ∩ ·) = P(· ∩ ·).]

[Visual example with card images omitted: the pictured marginal probability is P(·) = 7/10 and the pictured joint probability is P(·, ·) = 4/10.]
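The pictured example can be reproduced with a toy calculation. This is a sketch in Python under an assumed composition of the hand: the card images are not recoverable here, so the 7-red / 4-red-face split below is taken from the stated answers, not from the slide itself.

```python
# Hypothetical reconstruction of the slide's 10-card picture.
# Assumption: 7 of the 10 cards are red, and 4 of those are also face cards.
hand = [("red", True)] * 4 + [("red", False)] * 3 + [("black", False)] * 3

n = len(hand)
n_red = sum(1 for color, _ in hand if color == "red")
n_red_face = sum(1 for color, face in hand if color == "red" and face)

print(f"P(red) = {n_red}/{n}")             # marginal probability, 7/10
print(f"P(red, face) = {n_red_face}/{n}")  # joint probability, 4/10
```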

Conditional Probability

The “soul of statistics”: if P(A) > 0, then the probability of B conditional on A is

P(B|A) = P(A, B) / P(A)

This implies that

P(A, B) = P(A)P(B|A) = P(B)P(A|B)
Hopefully this second formulation is intuitive!
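As a sanity check of these two formulas (not from the slides), we can enumerate a standard 52-card deck in Python with exact fractions; the events “red card” and “face card” are illustrative choices of A and B.

```python
from fractions import Fraction
from itertools import product

# A 52-card deck as (rank, suit) pairs -- an equally likely sample space.
ranks = list(range(2, 11)) + ["J", "Q", "K", "A"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

def prob(event):
    """P(event) under the uniform distribution on the deck."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

is_red = lambda c: c[1] in {"hearts", "diamonds"}
is_face = lambda c: c[0] in {"J", "Q", "K"}

p_A = prob(is_red)                               # P(A) = 1/2
p_AB = prob(lambda c: is_red(c) and is_face(c))  # P(A, B) = 6/52
p_B_given_A = p_AB / p_A                         # definition of conditional probability

# Product rule: P(A, B) = P(A) P(B|A)
assert p_AB == p_A * p_B_given_A
print(p_B_given_A)  # 3/13
```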

Conditional Probability: A Visual Example

[Visual with card images omitted: the slide illustrates P(·|·) = P(·, ·) / P(·).]
Law of Total Probability (LTP)

With 2 Events:

P(B) = P(B, A) + P(B, Aᶜ) = P(B|A)P(A) + P(B|Aᶜ)P(Aᶜ)

[Visual with card images omitted: P(·) = P(·, ·) + P(·, ·) = P(·|·) × P(·) + P(·|·) × P(·).]

In general, if {Aᵢ : i = 1, 2, 3, ...} forms a partition of the sample space, then

P(B) = Σᵢ P(B, Aᵢ) = Σᵢ P(B|Aᵢ)P(Aᵢ)

Example: Voter Mobilization

Suppose that we have put together a voter mobilization campaign and we want to know what the probability of voting is after the campaign: P(vote). We know the following:

P(vote|mobilized) = 0.75
P(vote|not mobilized) = 0.15
P(mobilized) = 0.6, and so P(not mobilized) = 0.4

Note that mobilization partitions the data. Everyone is either mobilized or not. Thus, we can apply the LTP:

P(vote) = P(vote|mobilized)P(mobilized) + P(vote|not mobilized)P(not mobilized)
        = 0.75 × 0.6 + 0.15 × 0.4
        = 0.51
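The same arithmetic as a minimal Python sketch (the variable names are mine, not from the slides):

```python
# Law of Total Probability applied to the mobilization example.
p_vote_given_mob = 0.75
p_vote_given_not = 0.15
p_mob = 0.6
p_not = 1 - p_mob  # mobilization status partitions the sample space

p_vote = p_vote_given_mob * p_mob + p_vote_given_not * p_not
print(round(p_vote, 2))  # 0.51
```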

Three Big Ideas

I Marginal, joint, and conditional probabilities

I Bayes’ rule

I Independence

Bayes’ Rule

Often we have information about P(B|A), but want P(A|B). When this happens, always think: Bayes’ rule.

Bayes’ rule: if P(B) > 0, then:

P(A|B) = P(B|A)P(A) / P(B)

Bayes’ Rule Mechanics

[Visual with card images omitted: the slide walks through P(·|·) = P(·|·)P(·) / P(·) term by term.]

Example: Race and Names

Note that the Census collects information on the distribution of names by race. For example, Washington is the most common last name among African-Americans in America:

I P(AfAm) = 0.132

I P(not AfAm) = 1 − P(AfAm) = 0.868

I P(Washington|AfAm) = 0.00378

I P(Washington|not AfAm) = 0.000061

We can now use Bayes’ Rule:

P(AfAm|Wash) = P(Wash|AfAm)P(AfAm) / P(Wash)

Example: Race and Names

Note we don’t have the probability of the name Washington. Remember that we can calculate it from the LTP since the sets African-American and not African-American partition the sample space:

P(AfAm|Wash) = P(Wash|AfAm)P(AfAm) / P(Wash)
             = P(Wash|AfAm)P(AfAm) / [P(Wash|AfAm)P(AfAm) + P(Wash|not AfAm)P(not AfAm)]
             = (0.00378 × 0.132) / (0.00378 × 0.132 + 0.000061 × 0.868)
             ≈ 0.9
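A minimal Python sketch of this Bayes-plus-LTP computation (the variable names are mine, not from the slides):

```python
# Bayes' rule with the denominator expanded by the Law of Total Probability.
p_afam = 0.132
p_not_afam = 1 - p_afam
p_wash_given_afam = 0.00378
p_wash_given_not = 0.000061

p_wash = p_wash_given_afam * p_afam + p_wash_given_not * p_not_afam  # LTP
p_afam_given_wash = p_wash_given_afam * p_afam / p_wash              # Bayes' rule
print(round(p_afam_given_wash, 2))  # 0.9
```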

Three Big Ideas

I Marginal, joint, and conditional probabilities

I Bayes’ rule

I Independence

Independence

Intuitive Definition: Events A and B are independent if knowing whether A occurred provides no information about whether B occurred.

Formal Definition: P(A, B) = P(A)P(B) =⇒ A⊥⊥B

With all the usual > 0 restrictions, this implies

P(A|B) = P(A)
P(B|A) = P(B)

Conditional Independence: P(A, B|C) = P(A|C)P(B|C) =⇒ A⊥⊥B|C

Independence is a massively important concept in statistics.
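A concrete instance of the formal definition (an illustration, not from the slides): for one roll of a fair die, let A = "even" and B = "greater than 4". Then P(A, B) = 1/6 = 1/2 × 1/3 = P(A)P(B), so A⊥⊥B.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die.
omega = range(1, 7)

def prob(event):
    """Probability of an event (a set of outcomes) under the uniform measure."""
    return Fraction(len(event), 6)

A = {w for w in omega if w % 2 == 0}   # even: {2, 4, 6}
B = {w for w in omega if w > 4}        # greater than 4: {5, 6}

# Formal definition: A and B are independent iff P(A ∩ B) = P(A) P(B).
print(prob(A & B) == prob(A) * prob(B))  # → True  (1/6 = 1/2 × 1/3)
```

Exact `Fraction` arithmetic sidesteps floating-point comparison issues when checking the product rule.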

Independence, the Heroic Assumption

Deploy with Caution!

Advanced Example: Building a Spam Filter

Suppose we have an email i, (i = 1, ..., N), which we represent with a series of J indicators for whether or not it contains a set of words:

x_i = (x_1i, x_2i, ..., x_Ji)

We want to classify these into one of two categories, spam or not:

{C_spam, C_not}

We have a set of labeled documents Y = (Y_1, Y_2, ..., Y_N) where Y_i ∈ {C_spam, C_not}.

Goal: Use what we've learned to build a model which can classify emails into spam and not spam.

Example: Building a Spam Filter

For each document, we will get to see x_i (the words in the document), and we would like to infer the category.

In other words, what we want is P(C_spam | x_i). Let's use Bayes' Rule!

P(C_spam | x_i) = P(x_i | C_spam) P(C_spam) / P(x_i)
               = P(x_i | C_spam) P(C_spam) / [P(x_i | C_spam) P(C_spam) + P(x_i | C_not spam) P(C_not spam)]

We used the law of total probability to work out the denominator.

Now there are only 4 pieces we need (2 for spam, 2 for not).

Estimating the Baseline Prevalence

Let's plug in some estimates based on our labeled emails.

Intuitively, P(C_spam) is the probability that a randomly chosen email will be spam:

P(C_spam) = No. Spam Emails / No. Emails

Because 'not spam' is the complement of spam, we know that:

P(C_not spam) = 1 − P(C_spam)

Note: this estimate is only good if our labeled emails are a random sample of all emails! More on this in future weeks.
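These two estimates are just counting; a minimal sketch with a hypothetical set of labels:

```python
# Hypothetical labels for five emails from our labeled set.
labels = ["spam", "not", "spam", "not", "not"]

p_spam = labels.count("spam") / len(labels)  # No. Spam Emails / No. Emails
p_not_spam = 1 - p_spam                      # complement rule

print(p_spam, p_not_spam)  # → 0.4 0.6
```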

Estimating the Language Model

Now we need P(x_i | C_spam), which we call the language model because it represents the probability of seeing any combination of the J words that we are counting from the emails.

Can we use the same strategy as before (just counting up emails)?

No! Remember x_i is a vector of J word indicators; that is 2^J possibilities!

We will make the heroic assumption of conditional independence:

P(x_i | C_spam) = ∏_{j=1}^{J} P(x_ij | C_spam)

Intuition: count the proportion of spam emails containing each word.

This is called the Naïve Bayes classifier because the conditional independence assumption is naïve.
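Under conditional independence the language model is just a product of per-word probabilities. A minimal sketch (the word list and the per-word probabilities below are made up for illustration, not estimated from real data):

```python
# Hypothetical per-word probabilities P(word present | spam), which would be
# estimated as the proportion of labeled spam emails containing each word.
p_word_given_spam = {"free": 0.60, "winner": 0.40, "seminar": 0.05}

def language_model(x, p_word):
    """P(x | C) under conditional independence: a product over the J indicators.

    x maps each word to 1 (present) or 0 (absent); an absent word
    contributes 1 - P(word | C) to the product."""
    p = 1.0
    for word, present in x.items():
        p *= p_word[word] if present else 1 - p_word[word]
    return p

x_i = {"free": 1, "winner": 1, "seminar": 0}
print(round(language_model(x_i, p_word_given_spam), 3))  # 0.6 × 0.4 × 0.95 → 0.228
```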

Estimating the Naïve Bayes Classifier

P(C_spam | x_i) = P(x_i | C_spam) P(C_spam) / [P(x_i | C_spam) P(C_spam) + P(x_i | C_not spam) P(C_not spam)]

The Naïve Bayes Procedure:

I Learn what spam emails look like to create a function that lets us plug in an email and get out a probability, P(x_i | C_spam)

I Guess how much spam there is overall, P(C_spam)

I Plug in values of x_i for new emails to score them by whether they are spam or not.

I . . . Profit?
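The whole procedure fits in a few lines. A sketch under the slide's assumptions, with a tiny made-up labeled corpus; the Laplace smoothing term `alpha` is an extra detail beyond the slides, added so that a word never seen in one class does not zero out the product:

```python
def train_naive_bayes(X, y, alpha=1.0):
    """Estimate P(C) and P(word | C) from labeled emails.

    X: list of dicts mapping word -> 0/1 indicators; y: 'spam'/'not' labels.
    alpha: Laplace smoothing (an addition beyond the slides)."""
    classes = {"spam", "not"}
    prior = {c: sum(lab == c for lab in y) / len(y) for c in classes}
    words = X[0].keys()
    p_word = {
        c: {
            w: (sum(x[w] for x, lab in zip(X, y) if lab == c) + alpha)
               / (sum(lab == c for lab in y) + 2 * alpha)
            for w in words
        }
        for c in classes
    }
    return prior, p_word

def posterior_spam(x, prior, p_word):
    """P(C_spam | x) by Bayes' rule, with the conditionally independent language model."""
    def joint(c):
        p = prior[c]
        for w, present in x.items():
            p *= p_word[c][w] if present else 1 - p_word[c][w]
        return p
    return joint("spam") / (joint("spam") + joint("not"))

# Tiny hypothetical training set: two spam emails, two not.
X = [{"free": 1, "meeting": 0}, {"free": 1, "meeting": 0},
     {"free": 0, "meeting": 1}, {"free": 0, "meeting": 1}]
y = ["spam", "spam", "not", "not"]

prior, p_word = train_naive_bayes(X, y)
print(posterior_spam({"free": 1, "meeting": 0}, prior, p_word) > 0.5)  # → True
```

With these counts the posterior for a "free"-containing email works out to 0.9, mirroring the four-piece decomposition on the slide: two estimated quantities for spam, two for not.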

Example: Building a Spam Filter

This was a really advanced example (it is okay if you didn't follow all of it!). It draws on all the probabilistic concepts we have introduced:

I Bayes' Rule

I Law of Total Probability

I Conditional Independence

It shares the basic structure of many models, particularly in its use of conditional independence.

This Week in Review

I Course logistics

I Core ideas in statistics

I Foundations of probability

I Three big probability concepts

Going Deeper:

I Blitzstein, Joseph K., and Hwang, Jessica. (2019). Introduction to Probability. CRC Press. http://stat110.net/

Next week: random variables!

References

Enos, Ryan D. (2015). "What the Demolition of Public Housing Teaches Us about the Impact of Racial Threat on Political Behavior." American Journal of Political Science.

Lundberg, Ian, Rebecca Johnson, and Brandon M. Stewart. (2020). "Setting the Target: Precise Estimands and the Gap between Theory and Empirics."

Salsburg, David. (2002). The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century.