Cambridge University Press 978-1-107-13057-9 - Truth or Truthiness: Distinguishing Fact from Fiction by Learning to Think Like a Data Scientist Howard Wainer Index More information

Index

Abbott v. Burke, 145 breast cancer, 155 Achilles/tortoise paradox, 143 Brewster, Kingman, 141, 174 Alam, S., 28 Bridgeman, Brent, 52n10 Almond, Russell, 185 Briggs, Derek, 57 Alyea, Hubert N., 192 Brownback, Sam, xv American Arbitration Association, 155 Brugger, Kenneth, 190 Andersson, Arne, 20 Bucknell, 75 angina, surgical treatment for (Fieschi), 60 Bulletin of the Seismological Society of Angoff, C., 131 America, The, 63n1 approximation, 11 Atkinson, Richard, 172n3 capture-recapture procedures, 7 Caputo, Ralph, 151 Bachmann, Michelle, 29n1 carcinogenic additives, 49 Balbi, Adriano, 123 “Case for SAT Words, The” (Murphy), 170 Bannister, Roger, 19, 20 Catalogtree, 94 Bayes methods, 179 causal effect Bayes’ Theorem, 156n5 average, 23, 40 Bayi, Filbert, 20 calculating, 23 Beardsley, Noah, 21n2 covariate information and, 41 Bench, Johnny, 112, 115, 118 defined, 23, 35, 40, 58 Berkeley, George, 79 estimating, 31 Bernstein, James, 189 interruptions and, 54 Bernstein, Jill, 189 measuring Bernstein, Joseph, 189 average, 23 Berra, Yogi, 115, 118 size, 43 Berry, Scott, 20 directional, 24 Bertin, Jacques, 98 missing variable and, 54 Bieber, Martin A., 188 summary, 35 boggle threshold (Haynes), 2 causality, 8, 22 Bojorquez, Manuel, 69 defined (Hume), 23n3 Bok, Derek, 57n16 longitudinal studies and, 26 Bowdoin College and optional SAT ordering cause and effect, 26 submissions, 72–4 Census, U.S. Decennial, 176–7 Bowie High School (El Paso, Texas), 76 chess reward (Ferdowski), 15, 15n1 Bowie Model, 76 China's industrial expansion, measuring, Braun, Henry, 22 101–5

© in this web service Cambridge University Press www.cambridge.org


choropleth, 129 core of, 3 strengths and weaknesses, 137 defined, 2 Claremont McKenna College, 75 scientist, 22 Clauser, Brian, 52n10, 56 Delaney Clause research, 49 clean air regulations, xiv delivery (birth) costs, 188 Cleveland, Bill, 2 DerSimonian, Rebecca, 57 Cobb, Leonard, 60 deviations Cochran, Mickey, 118 from the mean, plotting, 124 Coe, Sebastian, 20 Median Absolute (MAD), 112 communication, and empathy, 82–90 dewatering, 62 compound interest, 16 drilling, horizontal and vertical, 62 conjecture, and fact, 18 driving impaired, 189–90 control alcohol use, 189 experimental, 30 cell phone use, 189 group, 23, 40 drug trials, 41 counterfactuals and, 58 Duckworth, A. L., 26 outcomes, 23 Dupin, Charles, 122 through experimentation, 12 Durso, John, 71 coronary by-pass surgery, 32–40 correlation, and coincidence, 69 ecological fallacy (defined), 137 Cortot, Alfred, 21 education counterfactual, 4, 6, 11, 36 educational performance defined, 23, 58 decline in, 140 randomization and, 59 Promise Academy and, 161–6 covariates tests and, 141, 142 defined, 31, 38 financing importance of, 41 bond issue, 139–40 Cram, Peter, 188 property taxes, 145 creationism, 30n3 tenure and, 146–51 Cristie, Chris, 146 Education Research International, 25 Educational Testing Service (ETS), 153 data effect (defined), 23 big, 44, 152 Einstein, Albert, xvi, 109 cross-sectional, 26 El Guerrouj, Hicham, 20 defined, 45 electrocardiogram (ECG) costs, 189 evidence and, 45 Elliot, Herb, 20 high-dimensional, x Ellsworth, William, 69 longitudinal, 26 Emanuel, Rahm, 146n2 missing, ix, xi, 31–40, 72–7, 161–6 Emory, 75 -at-random, 73, 75, 76 evidence bias size and, 33 boggle threshold and, 2 causal effect and, viii characteristics of, 45, 192 deception potential in, 13 circumstantial, 61, 137n6 excluding subjects, 33 correlational, 25 imputation and, 33, 75, 158 defined, 45 defined, 109 hope (magical thinking) and, 2 uncertainty and, 32 reliability, 45
multivariate, 109 resistance to, xiv–xv observational validity, 45 causal conclusions and, 44 Evidence Centered Design (Mislevy, Steinberg, control and, 58 Almond), 185 science Evidence-Based-Medicine, xiii as an extension of, 3 evolution, xv


Fallin, Mary, 64 health inventory, 185 false positives, 152–60, 182 healthcare, controlling cost of, 188–9 Federalist #1 (Hamilton), 170 height distribution study (Galton), 16 feedback, measuring, 54–6 Helfgott, David, 19n1 Feinberg, Richard, 180 heroic assumptions (Rubin), 75 Ferdowski Tusi, Hakim Abu Al-Qasim, 15 Herschel, John Frederick William, 128 Fernandez, Manny, 76 Hindoostan, Playfair's of, 103 Fieschi, Davide, 60 hip replacement costs, 188 Fisher, Ronald, 22 Hobbes, Thomas, 138, 138n7 FiveThirtyEight, xiv Hofman, August Wilhelm von, 192 Fletcher, Joseph, x, 80, 123–9, 124n2 Holland, Paul, 22, 30, 31, 58 four-minute mile, 20–1 Holmes, Sherlock, 47 fracking (hydraulic fracturing), 62–8 homework ban, 45–6 defined, 62 How to Display Data Badly (Wainer), 91 horizontal, 63 Hsu, Jane, 45 seismic effects of, 63–71 Huckabee, Mike, xv Freedle, Roy, 50 Hume, David, 11, 22, 23n3, 26, 79 Freedle’s Folly, 50 ice cream and drownings, 43 Galbraith, John Kenneth, 145 ideas Galchen, Rivka, 66, 69 Dopeler Effect, 36n7 Galton, Francis, 16 practical, 9 Garcia, Lorenzo, 76–7 rapid, 4, 25n7 gas mileage, exponential, 17 theoretical, 9 gedanken experiments (defined), 8 truthiness and, 4 Gelman, Andrew, 129 inferences gene mutation letters, 83–90 based on averages within groups, 4 global warming, xiv, 70, 70n4, 71 beyond data, 192 graphical display, ix limits of, in observational studies, 68 defined, 107 longitudinal, and cross-sectional data, 26 line labels, 98–101 randomized, controlled experiments and, 59 in the media, 91–105 Inhofe, Jim, xv, 69, 70n4, 71 multidimensional, 80 Inside-Out Plot (Hartigan), 110–21 pie, 92–7 statistical, 79 Jolie, Angelina, 84–5 growth Journal of Happiness Studies, 25 exponential, 15–17 Journal of the American Medical Association, 24 measuring, requirements for, 161 Journey North Project (Annenberg/CPB Learner Guerry, Andre-Michel, 123 Online), 191 Jungeblut, Ann, 57 Haag, Gunder, 20
Haberman, Shelby, 179 Kahneman, Danny, 32n4 Hamilton, Alexander, 170 Kaiser, George, xv Hamm, Harold, xv Kant, Immanuel, 38 happiness Keino, Kip, 21n2 defined, 24 Keller, Randy, 64 performance and, 24–8 Kemeny, John, xii, 174 school success and, 25 Khak, A.A., 28 Harder, Donelle, 69 knowledge acquisition, exponential, 17 Hargadon, Fred, 82 Kolata, Gina, 188 Hartigan, John, 110 Hatfield, Kim, 69 Ladra, Sandra, 64 Haynes, Renée, 2 Laird, Nan, 57


Landy, John, 20 nothing is not always zero, 161–6 Lang, Lang, 21 Nurmi, Paavo, 20, 21 Laplace, Pierre-Simon, 8 LeMenager, Steve, 83, 85 Obama, Barack, 1 Li, Yundi, 21 obesity, 24, 31 Liszt, Franz, 19 observations Little, Rod, 77 extreme, vii, 11 Locke, John, 79 observational studies, viii lottery, 14 practicality and, 12 Louis XVI, 107 sample size and, vii Lowenthal, Jerome, 19n1 outcomes Lu, Xin, 188 actual, 23 Luhrmann, Tanya, 2 average, 58 control, 23 Main Street accidents analogy, 71 covariate information and, 40 mammograms, 155–60 dependent variable, 30 effectiveness of, 156, 156n5 measuring, 33 potential, 23, 35, 40 crimes against property (Quetelet), 124 sequentially randomized (split plot) criminal variables (Fletcher), 125 designs and, 27 French population (Montizon), 122 Ovett, Steve, 20 ignorance and crime (Fletcher), 125, 128 instruction and crime (Balbi and Pacini, Filippo, 122n1 Guerry), 123 Pacioli, Luca, 16 London cholera epidemic (Snow), 122 Page, Satchel, 192 male students and population (Dupin), 122 Pauli, Wolfgang, 107n11 thematic, 122 performance, and happiness, 24–8 as visual metaphors, x Pfeffermann, D., 98 marriage and life expectancy, 44 Piazza, Mike, 112, 115, 118 Mauer, Joe, 110–21 Picasso, Pablo, 34 Median Absolute Deviation (MAD), 112 Picturing the Uncertain World Mencken, H. L., 131 (Wainer), 91n1 Messick, Sam, 57 Playfair, William, 79, 103 Minard, Charles Joseph, 80 predictions, xiii Mislevy, Bob, 185 election, 1–2 Monarch Butterflies (Danaus plexippus), Princeton regional schools bond issue, 139–40 190–1 application responses, Montizon, Frére de, 122 82–3, 85 Morceli, Noureddine, 20 proficiency Murphy, James, 170 piano, 19–21 population size and, 20–1 National Assessment of Educational Progress Prokofiev, Sergei, 19 (NAEP), 54, 130, 140, 143 Promise Academy (Memphis TN), 161–6 National Board of Medical Examiners, 56 Public School 116 ( City NY), 45–6 National Rifle Association, xiv Naur, Peter, 2 Quality of Life (QOL) score, 33–40 near-sightedness and night lights, 44 Quants, xiv, 1 New Haven County employment, 98 Quetelet, Adolphe, 124 Neyman, Jerzy, 22 Quinn, P. D., 26 nicotine, addictiveness of, 70 Nightingale, Bob, 110 Rachmaninoff, Sergei, 19n1 Nightingale, Florence, 95 random assignment, 54, 68 “no causation without manipulation”, 23 controlled, and scientific method, viii “No Child Left Behind”, 175n1 credible assumptions and, 40, 48


randomization, 23 small area estimates, 177 importance of, 58 smoking missing variables and, 31, 68 life expectancy and, 30–1 reading and shoe size, 68 obesity and, 31 Reading/Language Arts Adequate Yearly Progress Snell, Peter, 20 (AYP, Tennessee), 162 Snow, John, 122 reason, and faith, 2 Sonata (Liszt), 19 Reckase, Mark, 98 Spearman-Brown curve, 177 reconnaissance-at-a-distance, 137 standard setting, 54–6 reliability (defined), 175n2 Standards for Educational and Psychological rescue therapy, 41 Testing, 154n3 Rodriguez, Ivan, 112, 115, 121 Statistical Breviary (Playfair), 103 Romney, Mitt, 2 statistics Rose of the Crimean War (Nightingale), 95 defined, xiii Rosenthal, Jaime, 188 moral (maps of), x, 81, 123–8 Rubin, Donald B., 9, 11, 75, 77 origins of, xiii Rubin’s Model for Causal Inference, 9, 11, 22 Statistics and Causal Inference (Holland), 22, 59 central idea of, 23 Steinberg, Linda, 185 control and, 12 stem-and-leaf display (Tukey), 115n5 effects of possible causes and, 29 Strayer, David, 189 potential outcome and, 35 studies Rule of 72 (Pacioli), 16 experimental as an approximation, 16n4 controlled, 30, 48n5 Runyon, Damon, 158 as gold standard for evidence, 29, 48, 59 random-assignment, 48 Salk vaccine, 60 observational studies and, 30 sample size, 11, 25 longitudinal, 26 Sanders, William, 164 observational, 29, 48n5, 61 Savage, Sam, 187 ancillary information and, 31 scatterplots, 127 controlled, 30 2008 presidential election (Gelman), 129 covariates and, 31 2012 presidential election, 130–4 designing, 65 gun ownership and gun violence, 129, 130–8 randomization and, 48 Scherr, G. H., 192 weakness of, 31 Scholastic Aptitude Test (SAT) subscores Bowdoin College and, 72–4 characteristics of, 178 changing meaningful, 178 March 2014, 141, 167–74 orthogonality, 178 October 2000, 50 reliability, 178–9 coaching for, 57 uses of, 178 scoring formula, 168 Schusterman, Lynn, xv Tancredo, Tom, xv scientific investigations, essentials of, 187 teacher evaluation value-added models, 75–6 Scopes, John, 147 Teague, Michael, 69 sequentially randomized (split plot) design, 27 tenure Serkin, Rudolf, 19 academic freedom and, 146 Serrano v. Priest, 145 as a job benefit, 147 Sessa, 15 value of, 147–51 Shahnameh (Ferdowski), 15 tests Shine (movie), 19n1 accommodating disabilities, 46–8, 49–52 Silver, Nate, xiv, 1 as contests, 171 Sinclair, Upton, xiv as measuring instruments, 172 Sinharay, Sandip, 54, 179 as prods, 172, 172n3, 180 skepticism, 44 cheating detection, 152–5


tests (cont.) Tukey, John, xiii, 16, 79, 109, 123 options for accused, 154–5 Tulane, 75 statistical evidence of, 154 Twain, Mark, 18 costs of, 184 examinee, 184 uncertainty fourth-grade mathematics, 144 physics and, xiii length of, and reliability, 175–85 statistics as science of, xiii prototypical licensing, 181–3 Uneducated Guesses (Wainer), 75 purposes of, 171 Urquhart, Fred, 190 scores and racial groups, 143 US News and World Report (USN&WR) college subscores, 178–80 rankings, 73–5 teaching to the, 56–7 unplanned interruptions in, 52–4 variables increased likelihood of, 53 dependent, 30 Texas Assessment of Knowledge and Skills highly correlated, 121 (TAKS), 76–7 mapping geographic, to other data, x Thinking Fast, Thinking Slow (Kahneman), 3n2 missing, and randomization, 31, 68 Third Piano Concerto multivariate, 109 Prokofiev, 19 noncausal, 23 Rachmaninoff, 19n1 qualitative, relationship between, 129 Thoreau, Henry David, 61, 137n6 random assignment and, 23 Tiller, R., 98 two or more (high-dimension), x Tommasini, Anthony, 19, 21 zero-centering, 124 truthiness Voltaire, xv claims Voss, Joan, 151 circumcision and cervical cancer, 6–7 counting pond fish, 7–8 Walker, Scott, 146n2 fetal conversations, 4–5 Wang, Yuja, 21 repeating kindergarten, 5–6 Watson, John, 47 clarity and, 107 weight gain, exponential, 17 defined, 2 Wu, Jeff, 2 evidence and, 3 educational issues, 140 Youngman, Henny, 176 intuition and, 2 rapid ideas and, 4 Zahra, S., 28 reptilian brain and, 3 Zeno, 143
