Judea Pearl - Book of Why (Chapters 0-5) - Review

Seminar: How Do I Lie With Statistics?

Patrick Dammann
MSc. Angew. Informatik, Universität Heidelberg
February 27, 2020

Abstract

In this review/summary of Judea Pearl's and Dana Mackenzie's book "The Book of Why", the most important aspects of the first five chapters are outlined. The book deals with the problems that arose when people tried to introduce causal reasoning into the field of statistics, which grew up in opposition to causal argumentation. It gives a brief history of how this development came about, and introduces modern methods that make the connection between both concepts really easy. In the fifth chapter, the problem is demonstrated using the smoking-lung-cancer debate from the middle of the 20th century.

Contents

1 Basics
  1.1 Ladder of Causation
  1.2 Causal Diagrams
2 History of Statistics
  2.1 Francis Galton
  2.2 Karl Pearson
  2.3 Sewall Wright
3 Bayes and Junctions
  3.1 Rule of Bayes
  3.2 Junctions in Causal Diagrams
4 Confounders
  4.1 RCTs
  4.2 Back-Door Criterion
5 The Smoke-Cancer-Debate
  5.1 Does smoking cause cancer?
  5.2 Smoking and newborns
6 Conclusion

Introduction

"The Book of Why" [1] by Judea Pearl and Dana Mackenzie is a popular science book about the introduction of real causal methods to the sciences and how they started to help liberate statistics from the mistakes of its early days. The main tool used here is the causal diagram, a graph-based model for causal relationships that Pearl (re-)discovered and improved over the past years.

The general question about the importance of causation, as opposed to the use of data on its own, can be shown via a little example: Suppose we have data about the number of fire fighters at an operation, compared to the damage the fire caused. Usually, a positive trend should be noticeable: the more fire fighters, the more damage. But without knowing the causal connections in the background, one cannot derive from the data whether lowering the number of fire fighters sent to an operation would lower the damage done by the fires or not.

In this summary/review, I will focus in each chapter on the two to three main points (in my eyes) and summarize them. In most (but not all) cases, I will abstain from adding my personal thoughts. While the numbering of my sections 1-5 matches the book, the titles of my summaries are my own. The sixth section then contains my personal conclusion.

1 Basics

1.1 Ladder of Causation

The guiding thread through the whole book is a metaphor called the ladder of causation, which describes three different problem classes seated on the three rungs of the ladder, each presenting new issues that cannot be solved using only the methods that suffice for overcoming obstacles on the lower rungs. These rungs are named "Association", "Intervention" and "Counterfactuals" and have the following structure:

The lowest rung, "Association", supports all questions whose answers can be found by looking at data alone. These include questions like "If a patient has a certain symptom X, how likely is it that he suffers from disease Y?" and "If a customer bought cookies, what are the odds that he also buys milk?". These are classic statistical queries whose results can only describe the relation between different observations, without intervening in the observed process. Addressing questions of this rung is natural to all higher animals, since it is needed to interact with the outside world at all.

The second rung, "Intervention", contains those problems that additionally imply some external control over the situation, like doing or preventing a certain action. This can, for example, be something like "Will my headache be cured if I take this medicine?" or "How will poverty rates behave when we introduce this new law?". This introduces some kind of causal thinking, since the main difference to rung one is that the queries cannot be answered with observational data alone, because such data shows only in which relations the variables might occur "in the wild", but not which variable influences which, and therefore not what would change if someone changed something.

The fire fighter example from above belongs to rung two. We can answer rung one questions like "How many fire fighters might have been there when the fire cost X$?", but to know the difference between seeing fewer fire fighters (maybe because of a small fire) and making fewer fire fighters handle a fire (which might have bad consequences), more knowledge is needed than the data itself supplies. This could, for example, be the fact that both variables might have the fire's magnitude as a common cause (which obviously yields the difference between seeing and enforcing fewer fire fighters) and the circumstance that fire fighters fight fires (which leads to opposing results when reducing their numbers).

Engaging in these questions requires more brainpower than the previous ones, so it is assumed that early humans were the first animals to develop the skill to grasp the consequences of their manipulation of the world, a skill that toddlers still learn in their early years today.

The highest rung, "Counterfactuals", deals with imagining worlds where things would have been different than in the current situation, hence the name COUNTERFACT(-UAL). Instead of making predictions about a general, statistical population and its behaviour under "normal" or tweaked conditions, one observes a specific situation (e.g. that of a specific individual) and then wants to know how the outcome of that exact scene would have changed if some details had differed. For example, possible queries could be "Was it the medicine that cured my headache?" (since it is equivalent to "Would my headache also have stopped if I hadn't taken the medicine?") or "Would Kennedy still be alive if he hadn't been shot?".

Questions from this class of problems still seem very natural to us humans, even though they involve worlds that do not exist. This is assumed to be a skill unique to humans (at least on Earth), and to lay the foundation not only of all fictional story-telling, but also of human invention in general. This is due to the fact that these questions are needed for understanding the world, since it is essential for the comprehension of e.g. a method to know what would have happened if the method hadn't been applied or had been applied differently.

Since the book's author, Judea Pearl, looks at and partially illuminates the issue from a computer scientist's point of view, he compares these different levels to the capabilities of modern artificial intelligence and finds two things:

On the one hand, a strong AI that learns like and interacts with humans on a natural level needs to be able to answer questions from all three rungs, since they are necessary for an understanding of the world and so deeply woven into the human mind that a robot's lack of understanding would greatly limit the possibilities of easy communication. On the other hand, state-of-the-art machine learning models still reside on rung one, in some cases maybe scratching the bottom of rung two, because most training algorithms do not involve modeling the world but only associating from what is seen. Therefore, a strong AI seems further away than the current, sci-fi-esque era might suggest, and will need more research involving causal reasoning in machine learning.

1.2 Causal Diagrams

To either work with and analyze causal models or implement them in learning machines, they need a representation. And since causality consists of asymmetric relations between different variables, we can illustrate a model as a directed graph where the nodes symbolize random variables (measurable or not) and every direct causal effect from one variable to another is shown as a directed edge. A graph with these features is called a "causal diagram", a method used heavily in this book.

In the following example, which is highly simplified to explain the nature of those causal diagrams, all variables are boolean (true/false) and if there is a causal effect between two variables, it is of the type "if A happens, B happens".

The idea is to model a shooting squad, and the causal relationships are easy: Iff the court orders the prisoner to die, the captain commands the shooting. Iff the captain commands a shooting, A shoots. Iff the captain commands a shooting, B shoots. Iff either A or B shot, the prisoner dies. There are no jammed guns, pacifistic or rampaging soldiers, missed bullets, suicidal prisoners or anything else in this scenario.

[Figure: causal diagram of the shooting squad; surviving node labels: court, captain]

[...] or "is made to" and setting it to a fixed value, thus erasing all external influences on it. Since classic statistics does not work on causal relationships but on data alone, this process cannot be expressed in classic notation, so a new operator is introduced, the do-operator: $P(\text{dead} \mid do(A = \text{true}))$. The answer to this query is obviously yes, since the rules that still apply make the prisoner die even from a single bullet.

To generate queries from rung three, a given situation must be described, followed by a question that involves facts that contradict the situation.
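
The shooting squad rules above are simple enough to be written down as a tiny program, which may make the difference between the rungs more tangible. The following is only a minimal sketch under my own assumptions: the function names `firing_squad` and `counterfactual`, the dictionary keys and the `do` argument are hypothetical and do not come from the book. Forcing a variable via `do` cuts it off from its usual cause, which is one simple way to mimic the do-operator in a deterministic model. The counterfactual query at the end ("the prisoner is dead; would he be dead if A had not shot?") is chosen here purely for illustration.

```python
from typing import Dict, List, Optional


def firing_squad(court_order: bool,
                 do: Optional[Dict[str, bool]] = None) -> Dict[str, bool]:
    """Evaluate the deterministic shooting squad model.

    `do` maps a variable name to a forced value. A forced variable
    ignores its usual cause, while everything downstream still
    follows the normal rules.
    """
    do = do or {}
    v: Dict[str, bool] = {}
    v["court"] = do.get("court", court_order)
    v["captain"] = do.get("captain", v["court"])    # captain follows the court
    v["A"] = do.get("A", v["captain"])              # A follows the captain
    v["B"] = do.get("B", v["captain"])              # B follows the captain
    v["dead"] = do.get("dead", v["A"] or v["B"])    # one bullet is enough
    return v


def counterfactual(observed: Dict[str, bool],
                   do: Dict[str, bool]) -> List[Dict[str, bool]]:
    """Answer a rung three query in three steps.

    1. Keep only the court orders consistent with the observation.
    2. Apply the hypothetical change `do` in those worlds.
    3. Return the resulting worlds.
    """
    consistent = [
        c for c in (False, True)
        if all(firing_squad(c)[name] == value
               for name, value in observed.items())
    ]
    return [firing_squad(c, do=do) for c in consistent]


# Rung two: A is made to shoot although the court gave no order.
forced = firing_squad(court_order=False, do={"A": True})
print(forced["dead"])      # True: a single bullet suffices
print(forced["captain"])   # False: forcing A tells us nothing about the captain

# Rung three: the prisoner is dead; what if A had not shot?
for world in counterfactual(observed={"dead": True}, do={"A": False}):
    print(world["dead"])   # True: B still follows the captain's command
```

Note that the intervention does not travel "upstream": setting A to true changes nothing about the captain or the court, which is exactly the difference between making A shoot and merely seeing A shoot.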
