Introductory statistics for biological engineers

Module 2, Lecture 7
20.109 Spring 2014
Agi Stachowiak, Shannon Hughes, and Leona Samson
Lecture by Zac Nagel, 4/8/14

What is an experiment?

• A test of a hypothesis
• A comparison of two conditions
• Seeing what happens when we change a variable
• Asking a question about the world

Typical Experimental Data:

[Figure: height (inches), male vs. female]

Did my experiment tell me anything?

• Did I get an answer?
• How sure am I of the answer?
• Can we all agree on how sure of the answer we are?

Experimental data speak with limited authority

Is this the cure for cancer? Yes

http://jimbuchan.com/

Experimental data speak with limited authority

Is this the cure for cancer? Eh…

http://tvbythenumbers.zap2it.com/

All Science is an approximation

• Measure twice, cut once

http://www.harborfreight.com/

http://www.neontommy.com/

All Science is an approximation

• Measure even more than twice? Why?
• → We never have all the data!

Very limited radar, satellite, and black-box pinger data had been used (as of the date of this lecture), together with mathematical modeling, to obtain a sort of confidence interval delineating the most likely location of Malaysia Airlines flight 370.

http://www.cbsnews.com/news/malaysia-airlines-flight-370-no-new-black-box-like-sounds-heard/

All Science is an approximation

• Measure even more than twice? Why?
• → We almost never have all the data!

Nate Silver has used statistics to predict the most likely outcome in elections with great success; at left, his last prediction before the 2012 presidential election. He happened to call 50 out of 50 states correctly this time: red states to Mitt Romney, blue states to Barack Obama. Note the predictions amounted to probabilities based on a sample of polling data.

http://www.laprogressive.com/election-2012-watch/

Can a sample of 12 people be used to draw conclusions about the global human population?

[Figure: height (inches), male vs. female]

Statistics is a powerful tool for deciding what can be concluded from data

An example: how long does a light bulb last?

http://www.clipartbest.com/clipart-Kijxd4riq

What real data look like:

[Figure: number of light bulbs vs. lifetime (hours)]

Mean is the average lifetime:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Where:
x̄ = Average value for x_i
x_i = Lifetime for a particular bulb
n = Total number of bulbs
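As a quick illustration of the mean formula above, the sketch below averages a small set of bulb lifetimes. The lifetime values and the helper name `mean` are made up for illustration; they are not data from the lecture.

```python
# Hypothetical bulb lifetimes in hours (illustrative values only).
lifetimes = [820.0, 910.0, 1005.0, 1100.0, 965.0]


def mean(values):
    """Return sum(x_i) / n, following the formula for the mean."""
    return sum(values) / len(values)


x_bar = mean(lifetimes)
print(f"mean lifetime = {x_bar:.1f} hours")
```

With these five made-up values the sum is 4800 hours, so the mean is 960 hours.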

Standard deviation measures how closely data cluster around the mean

[Figure: number of light bulbs vs. lifetime (hours), comparing a smaller and a larger standard deviation]

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Where:
s = Standard deviation
x̄ = Average value for x_i
x_i = Lifetime for a particular bulb
n = Total number of bulbs
n − 1 = Degrees of freedom
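The standard deviation formula above can be written out step by step. This is a minimal sketch using made-up lifetimes (not lecture data) and a hypothetical helper `sample_std`; note the division by n − 1 rather than n.

```python
import math
import statistics

# Hypothetical bulb lifetimes in hours (illustrative values only).
lifetimes = [820.0, 910.0, 1005.0, 1100.0, 965.0]


def sample_std(values):
    """Sample standard deviation with n - 1 degrees of freedom."""
    n = len(values)
    x_bar = sum(values) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


s = sample_std(lifetimes)
print(f"s = {s:.1f} hours")

# The standard library's statistics.stdev also divides by n - 1.
print(f"statistics.stdev = {statistics.stdev(lifetimes):.1f} hours")
```

Both lines should print the same value, since `statistics.stdev` uses the same n − 1 denominator.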

If we had all the data

x̄ → True mean, mu (μ)
s → True standard deviation, sigma (σ)

Since we almost never have all the data, how confident are we about our estimate of the mean?

Confidence Interval: a range of values within which there is a specified probability of finding the true mean

[Figure: number of light bulbs vs. lifetime (hours)]

• For a Gaussian curve, 68.3% of the data fall within 1 sigma of the mean, mu
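The 68.3% figure can be checked numerically. For a Gaussian, the fraction of values within k standard deviations of the mean is erf(k / √2); the sketch below evaluates this with the standard library. The helper name `fraction_within` is hypothetical.

```python
import math


def fraction_within(k):
    """Fraction of a Gaussian population within k sigma of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * fraction_within(k):.1f}%")
```

For k = 1 this gives the 68.3% quoted above; widening the range to 2 or 3 sigma captures progressively more of the data.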

Confidence Interval: a range of values within which there is a specified probability of finding the true mean

$$\mu = \bar{x} \pm \frac{t s}{\sqrt{n}}$$

μ = The true mean
t = Student's t
s = Standard deviation
n = Number of observations

Where do we get t? From a table of Student's t values, looked up by confidence level and degrees of freedom (DOF = n − 1).

Confidence Interval Example:

[Figure: height (inches), male vs. female]

x̄ = 67.9
s = 2.9
n = 7
DOF = 6
t_50% = 0.718

$$\mu_{50\%} = 67.9 \pm 0.8$$

99.9% Confidence interval is larger:

t_99.9% = 5.959

$$\mu_{99.9\%} = 67.9 \pm 6.5$$
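The two intervals in the example above follow directly from the formula. A minimal sketch, using the values quoted in the example (x̄ = 67.9, s = 2.9, n = 7, and the two Student's t values for 6 degrees of freedom); the helper name `confidence_half_width` is hypothetical.

```python
import math


def confidence_half_width(t, s, n):
    """Half-width of the confidence interval: t * s / sqrt(n)."""
    return t * s / math.sqrt(n)


x_bar, s, n = 67.9, 2.9, 7

# Student's t for DOF = 6, taken from the example (not computed here).
t_values = {"50%": 0.718, "99.9%": 5.959}

for level, t in t_values.items():
    half_width = confidence_half_width(t, s, n)
    print(f"{level} confidence: mu = {x_bar} +/- {half_width:.1f}")
```

The t values are typed in from the example rather than computed, which mirrors looking them up in a table.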

How do we know whether two means are different from one another?

[Figure: height (inches), male vs. female]

[Figure: number of people vs. height, two distributions labeled x̄1 and x̄2]

T-test for comparison of means:

$$t_{\text{calc}} = \frac{\bar{x}_1 - \bar{x}_2}{s_{\text{pooled}}} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$

$$s_{\text{pooled}} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$$
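The pooled standard deviation above combines the spread of both groups, weighting each by its degrees of freedom. The sketch below computes it from two raw samples; the height values and the helper name `pooled_std` are made up for illustration and are not the lecture's data.

```python
import math
import statistics


def pooled_std(sample1, sample2):
    """Pooled standard deviation of two samples (n1 + n2 - 2 degrees of freedom)."""
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)  # s1 squared, n1 - 1 denominator
    var2 = statistics.variance(sample2)  # s2 squared, n2 - 1 denominator
    return math.sqrt((var1 * (n1 - 1) + var2 * (n2 - 1)) / (n1 + n2 - 2))


# Hypothetical heights in inches (illustrative values only).
group_a = [70.0, 72.5, 74.0, 69.0, 76.0]
group_b = [64.0, 66.5, 68.0, 65.0, 70.0, 67.5, 63.0]

print(f"s_pooled = {pooled_std(group_a, group_b):.2f} inches")
```

Because each variance is weighted by n − 1, the larger group has more influence on the pooled value.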

$$t_{\text{calc}} = \frac{74.2 - 67.9}{7.6} \sqrt{\frac{5 \times 7}{5 + 7}}$$

t_calc = 1.41

t_table, 50% = 0.7
t_table, 90% = 1.8
(table values for n1 + n2 − 2 = 10 degrees of freedom)

→ Significant at the 50% confidence level, not at the 90% confidence level

→ There is a chance greater than 10% of observing a difference this large if the two populations were the same
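The comparison above can be reproduced in a few lines. This sketch plugs in the numbers from the example (means 74.2 and 67.9, s_pooled = 7.6, n1 = 5, n2 = 7) and compares the result with the two table values. The helper name `t_calc` is hypothetical, and `abs()` is an added assumption so that the order of the two groups does not matter.

```python
import math


def t_calc(x1, x2, s_pooled, n1, n2):
    """t statistic for comparing two means with a pooled standard deviation."""
    # abs() is an assumption added here; the formula is written as x1 - x2.
    return abs(x1 - x2) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))


t = t_calc(74.2, 67.9, 7.6, 5, 7)
print(f"t_calc = {t:.2f}")

# Table values quoted in the example (not computed here).
t_table = {"50%": 0.7, "90%": 1.8}

for level, critical in t_table.items():
    verdict = "significant" if t > critical else "not significant"
    print(f"{level} confidence level: {verdict}")
```

The difference counts as significant at a given confidence level only when t_calc exceeds the table value for that level.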

Be aware of assumptions

• The data adopt a normal distribution
• The data correspond to a representative sample of the total population
• The (self-reported) data are accurate

What can we do to increase our confidence?

$$t_{\text{calc}} = \frac{\bar{x}_1 - \bar{x}_2}{s_{\text{pooled}}} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$
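The sample sizes enter the formula above only through the factor √(n1 n2 / (n1 + n2)). The sketch below tabulates that factor for a fixed total of 12 observations, a total chosen here to match the 5 + 7 height example. It assumes the difference in means and s_pooled stay fixed as observations move between groups, which real data will only approximately satisfy. The helper name `size_factor` is hypothetical.

```python
import math


def size_factor(n1, n2):
    """The sample-size term sqrt(n1 * n2 / (n1 + n2)) in the t formula."""
    return math.sqrt(n1 * n2 / (n1 + n2))


total = 12
for n1 in range(1, total):
    n2 = total - n1
    print(f"n1 = {n1:2d}, n2 = {n2:2d}, factor = {size_factor(n1, n2):.3f}")

# Split of the fixed total that gives the largest factor.
best_n1 = max(range(1, total), key=lambda n1: size_factor(n1, total - n1))
print(f"largest factor at n1 = {best_n1}, n2 = {total - best_n1}")

# Doubling both groups increases the factor as well.
print(f"factor for 12 + 12 = {size_factor(12, 12):.3f}")
```

The factor peaks at the even split and grows when both groups get larger.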

• Seek larger t values by increasing n
  – For a given total number of observations (n1 + n2), t is maximized when n1 = n2
• More precise experimental technique
• Avoiding sampling bias, reporting bias
• Perform a second, complementary experiment

Tools for computing statistical significance

• Excel TTEST function (spreadsheet available on course page)
• Graphing software has built-in tools to perform t tests

Conclusions

• We are never totally certain of a result
• Statistics provide a way to tell others how certain we are
• Increasing the number of data points can increase certainty
• Doing a second independent experiment can be very helpful

Lies, Damn Lies, and Statistics

• What can go wrong?
  – Ill-defined question
  – Insufficient data
  – Over-interpretation of data
• The following slides show how data mining can result in spurious correlations; this illustrates a pitfall of working with large data sets. We often call a result significant if there is only a 5% chance of obtaining the result by chance; but if you "roll the dice" 1000 times, you will find quite a few examples of that 5% chance.
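The "roll the dice 1000 times" point can be made concrete with a small simulation. This is a sketch under stated assumptions: every one of the 1000 comparisons has no real effect, the comparisons are independent, and each therefore has a 5% chance of looking significant. The constants and the fixed random seed are choices made here for illustration.

```python
import random

ALPHA = 0.05    # chance that a no-effect comparison looks "significant"
N_TESTS = 1000  # number of comparisons tried

rng = random.Random(0)  # fixed seed so the run is repeatable

# Each comparison is a "dice roll" that comes up significant with probability ALPHA.
false_positives = sum(rng.random() < ALPHA for _ in range(N_TESTS))

expected = ALPHA * N_TESTS
p_at_least_one = 1 - (1 - ALPHA) ** N_TESTS

print(f"spurious 'significant' results in this run: {false_positives}")
print(f"expected number: {expected:.0f}")
print(f"chance of at least one: {p_at_least_one:.6f}")
```

Under these assumptions about 50 of the 1000 comparisons are expected to pass the 5% threshold by chance alone, and finding at least one is close to certain.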

Data Fail S & P 500

The Journal of Investing, Spring 2007, Vol. 16, No. 1: pp. 15-22

Examples of Gaussian Curves forming before your eyes

• Exhibit at the Boston Museum of Science https://www.youtube.com/watch?v=KtPr7iipUso

• Patterns of snow accumulation around an obstruction appear to generate a similar effect
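One way a bell-shaped curve can build up from many small random events is sketched below as a bead-drop simulation: each ball bounces left or right at every row of pegs and lands in a bin. Whether the museum exhibit works exactly this way is an assumption. The helper name `drop_balls`, the number of balls and rows, and the random seed are all choices made here for illustration.

```python
import random
from collections import Counter


def drop_balls(n_balls, n_rows, rng):
    """Count balls per bin; a ball's bin is its number of rightward bounces."""
    return Counter(
        sum(rng.random() < 0.5 for _ in range(n_rows))
        for _ in range(n_balls)
    )


rng = random.Random(1)  # fixed seed so the run is repeatable
n_rows = 12
bins = drop_balls(n_balls=2000, n_rows=n_rows, rng=rng)

# Text histogram: one '#' per 20 balls in each bin.
for k in range(n_rows + 1):
    print(f"bin {k:2d}: {'#' * (bins[k] // 20)}")
```

The bins near the middle fill fastest and the outer bins stay nearly empty, giving the familiar bell shape.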