A Commonsense Approach to Story Understanding

by Bryan Michael Williams

B.S., Massachusetts Institute of Technology (2016)

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology

June 2017

© Massachusetts Institute of Technology 2017. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, May 26, 2017

Certified by: Patrick Henry Winston, Ford Professor of Artificial Intelligence and Computer Science, Thesis Supervisor

Certified by: Henry Lieberman, Research Scientist, Thesis Supervisor

Accepted by: Christopher J. Terman, Chairman, Master of Engineering Thesis Committee

A Commonsense Approach to Story Understanding

by Bryan Michael Williams

Submitted to the Department of Electrical Engineering and Computer Science on May 26, 2017, in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering and Computer Science

Abstract

Story understanding is an essential part of human intelligence. Therefore, any complete model of human intelligence must be able to understand stories. Understanding stories requires commonsense knowledge. Many story understanding systems rely on rules to express and manipulate common sense, but the amount of common sense is vast and expressing significant amounts of common sense this way is tedious. Commonsense knowledge bases can help address this problem. My work demonstrates the power of combining a story-understanding system and a commonsense knowledge base. Genesis is a story-understanding system that models aspects of story understanding by reading and analyzing stories written in simple English. I have extended Genesis by incorporating ConceptNet, a large knowledge base of common sense, into its processing. Now, Genesis not only has access to the dozens of inference rules provided by Genesis users, but also tens of thousands of goal-related assertions provided by contributors to ConceptNet. Genesis uses ConceptNet's knowledge to infer characters' goals, in some cases allowing Genesis to understand stories with few or even zero user-defined rules. Genesis can now infer relevant events and connections never explicitly stated in the story in a manner unprecedented within the space of story understanding and machine reading. Genesis also uses ConceptNet to apply user-defined rules in a commonsense, more flexible fashion, extending the reach of a rule by factors ranging from ten to one hundred. These additions to the Genesis story-understanding system take steps toward practical AI-enabled systems that provide advice in areas that heavily depend on precedent, including law, medicine, and business.

Thesis Supervisor: Patrick Henry Winston Title: Ford Professor of Artificial Intelligence and Computer Science

Thesis Supervisor: Henry Lieberman Title: Research Scientist

Acknowledgments

Thank you to the Genesis group, which has been a regular source of support and inspiration for two years. Thank you to my family for all of their support and for encouraging me to do the M.Eng program. Thank you to my lovely girlfriend Sidni for everything she does. Henry Lieberman was an essential part of this project, providing the seeds for much of the work that follows. Thank you for making me a more ambitious, directed, and confident researcher. Patrick Winston has taught me how to write, speak, and think. He has provided both great freedom and necessary guidance. I am very grateful to have worked so closely with him. This research was supported, in part, by the Air Force Office of Scientific Research, Award Number FA9550-17-1-0081.

Contents

1 Introduction
   1.1 Overview
   1.2 Vision: Artificial General Intelligence
   1.3 The Importance of Common Sense
   1.4 Approach
   1.5 Contributions
   1.6 Chapter Descriptions

2 Genesis Background
   2.1 Overview
   2.2 Capabilities
   2.3 Limitations

3 ConceptNet Background
   3.1 Overview
   3.2 AnalogySpace
   3.3 Discussion

4 Connecting Genesis and ConceptNet

5 Common Sense Enabled Rule Matching
   5.1 Overview
   5.2 Implementation
   5.3 Discussion

6 ASPIRE
   6.1 Overview
   6.2 Framework
   6.3 Implementation
      6.3.1 Interaction with Genesis Rules
      6.3.2 Concept Extraction
      6.3.3 ASPIRE-created Properties
   6.4 Recommendations for Knowledge Organization

7 Results
   7.1 Common Sense Enabled Rule Matching
   7.2 ASPIRE
   7.3 Interaction with Other Genesis Capabilities
   7.4 Quantifying Gained Commonsense Knowledge
      7.4.1 Common sense enabled rule matching
      7.4.2 ASPIRE

8 Limitations and Future Work
   8.1 Limitations
   8.2 Future Work

9 Related Work
   9.1 Machine Learning Approaches
   9.2 Story Understanding and Common Sense
   9.3 Commonsense Knowledge Bases

10 Conclusion and Contributions

Appendix. Full Story Text and ConceptNet Justifications
   Bullying
   A Long Day
   Mexican-American War
   A Long Day, Alternate Ending

List of Figures

1.1 Mexican-American war
1.2 Limitations of Siri

2.1 Macbeth elaboration graphs

3.1 ConceptNet visualization

4.1 Genesis-ConceptNet connection diagram

6.1 Example ASPIRE cause graph

7.1 Bullying elaboration graphs
7.2 Bullying ConceptNet justifications
7.3 A Long Day elaboration graph
7.4 A Long Day ConceptNet justifications
7.5 Mexican-American war elaboration graph
7.6 Mexican-American war question answering; reprinted for reference
7.7 Equivalent Genesis goal rule

8.1 Alternate A Long Day elaboration graph

List of Tables

7.1 Rule amplification factors

Chapter 1

Introduction

“Only a small community has concentrated on general intelligence. No one has tried to make a thinking machine and then teach it chess - or the very sophisticated oriental board game Go...The bottom line is that we really haven’t progressed too far toward a truly intelligent machine. We have collections of dumb specialists in small domains; the true majesty of general intelligence still awaits our attack.”

—Marvin Minsky (Stork, 1997)

1.1 Overview

Story understanding is essential to artificial general intelligence. Here, I use the term "story" to refer to any related sequence of events; a traditional narrative structure is not necessary. Humans communicate in stories, and the ability to fully understand human communication would enhance all aspects of a computer's interaction with human text, speech, or even thoughts. Better machine readers, question answerers, and intelligent assistants are only the beginning. Although today's popular approaches to natural language processing tend to ignore commonsense knowledge, it is a vital component of story understanding. "Commonsense knowledge" here broadly refers to all of the everyday, obvious knowledge humans hold and often keep implicit when

communicating—ordinary facts like wings are used for flight and students typically desire good grades. Many story understanding systems rely on rules to express and manipulate common sense, but the amount of common sense is vast and expressing significant amounts of common sense this way is tedious. Commonsense knowledge bases can help address this problem.

Genesis is a story-understanding system whose rule-based approach to story understanding provides humanlike flexibility. Genesis is able to analyze a story and demonstrate its understanding in numerous ways, including question answering, summarization, hypothetical reasoning, and story alignment (Holmes & Winston, 2016; Winston, 2014). Its rule-based core and general knowledge representation are used to complete all these tasks and many others. However, its reliance on rules sacrifices breadth. Prior to my work, the user had to explicitly give Genesis commonsense rules that codify all the knowledge necessary for identifying connections between story events.

I have tackled the problem of common sense enabled story understanding by connecting Genesis to ConceptNet, a large commonsense knowledge base. Equipped with access to hundreds of thousands of commonsense assertions, Genesis can now analyze text in novel ways, such as using its background knowledge to infer implicit explanations for story events. These explanations can reference events never explicitly stated in the story. Genesis detects that these inferred explanations are relevant using the context its common sense provides. Moreover, Genesis provides a human-readable justification for every common-sense-assisted inference it makes, showing the user exactly what pieces of commonsense knowledge it believes are relevant to the story situation. ConceptNet's substantial size allows these features to function at a large scale, and its knowledge is general enough to be widely relevant, yet descriptive enough to allow for nontrivial inferences. Inferring nontrivial implicit events at a large scale is a new capability within the space of machine reading and story understanding in both rule-based and machine learning approaches.

Connecting Genesis and ConceptNet demonstrates the power of combining a story understanding system with a commonsense knowledge base. The user or programmer

is no longer required to specify all the commonsense knowledge Genesis should employ because Genesis contacts ConceptNet to retrieve commonsense knowledge relevant to the current story analysis. Genesis now uses both user-defined rules and ConceptNet knowledge together to analyze a story. While both representations express commonsense knowledge, ConceptNet's large size makes it a valuable repository of common sense that gives Genesis a great breadth of knowledge and reduces the human burden of supplying this knowledge. My work enables Genesis to directly apply ConceptNet knowledge that forms the equivalent of 30,000 rules. I also developed a way that Genesis can apply user-defined rules in a commonsense, more flexible fashion with the help of ConceptNet, making them approximately ten to one hundred times more powerful.

ConceptNet contains approximately 225,000 commonsense assertions. Although my work does not enable Genesis to explicitly use every type of ConceptNet's knowledge in all aspects of story processing, the connection I have established between the systems is highly general and extensible. Therefore, the architecture supports incorporating new kinds of ConceptNet assertions, enabling Genesis programmers to add to the thousands of assertions currently accessed.

To show the value of leveraging a commonsense knowledge base in a story-understanding system, consider the following example story describing the Mexican-American war, a conflict which occurred in the 1840s:

The United States is a country. Mexico is a country. The year is 1846. Manifest Destiny is popular in the United States. The United States has ambition, has greed, and wants to gain land. Mexico and the United States disagree over borders. The United States move into the disputed territory. Mexican forces attack the United States. The United States declares war on Mexico. Winfield Scott leads the United States army. The United States battles Mexico. The United States triumphs by capturing Mexico City. The United States defeats Mexico and wins the war. The Treaty of Guadalupe Hidalgo officially ends the war.

This summary of the war is plain, brief, and high-level, but resembles an elementary school textbook chapter summary. When Genesis analyzes this story with the help of ConceptNet, it infers the United States' goals from the explicit story events that cause and contribute to these goals. Genesis performed this analysis without any user-defined rules, relying only on ConceptNet knowledge. ConceptNet helps Genesis connect the United States' ambition to its goal of conquering its opponent, which it completes when it triumphs by capturing Mexico City. Similarly, Genesis infers that the United States conquers a nation by winning the war, and its goal of conquering a nation stemmed from its greed. Genesis concludes that the United States gains land by conquering. Note that the story does not explicitly state any causal connections, so all causal connections Genesis forms are due to the application of commonsense knowledge.

We can explore some of these inferences by asking Genesis questions, as shown in Figure 1.1. When asked “How did the United States gain land?”, Genesis responds with two story elements which were never explicitly stated in the story, but which it inferred using its common sense. When questioned about some of these inferred elements, it responds with further explanation, citing explicit story events to back up its inferences. The language “conquer opponent” and “conquer nation” does not appear in the story text, but Genesis has inferred that these terms are relevant and has connected these inferences to explicit story events. The grammar is not perfect, and Genesis’s conclusions are just one interpretation of the story’s events and depend on its commonsense knowledge. However, Genesis’s effective application of relevant knowledge, not the knowledge itself or the English generation, is the focus of my work and the example.

Supplying a story-understanding system with large amounts of commonsense knowledge allows it to comprehend its input more fully, approaching the understanding that humans have when reading the same input. Commonsense knowledge enables Genesis to make inferences that it would otherwise miss, including inferences that contain events and connections never explicitly stated in the story. With the additional commonsense knowledge ConceptNet provides, the ease of applying Genesis's analysis to a large corpus of naturally occurring text increases significantly. In the remainder of this introductory chapter, I examine the larger vision, my exact approach, and my specific contributions.

Figure 1.1: A series of questions and Genesis answers regarding the Mexican war story. Genesis answers the first question with two story elements which were never explicitly stated, but which it inferred using its common sense. The second and third questions ask Genesis about these inferences, and it backs them up with explicit story elements. Genesis's grammar in its English generation is not perfect, but the content is salient and the inferences are interesting.

1.2 Vision: Artificial General Intelligence

The development of artificial general intelligence, also known as strong artificial intelligence, aspires to produce a computational system that can meet or surpass the capabilities of humans in all or nearly all aspects of reasoning and intelligent behavior. This is in contrast with weak artificial intelligence, which focuses on designing

systems to solve problems restricted to specific domains. Many of the recent, popular advances in artificial intelligence (AI) fall under weak AI, including Siri, AlphaGo, and IBM Watson. These systems demonstrate extremely impressive abilities in their respective domains, and Watson and AlphaGo perform better than even the most expert humans. However, it is not clear how much these systems contribute to the goal of strong artificial intelligence because of their narrow domain and limited knowledge representation (Ferrucci et al., 2010; Levy, 2017; Silver et al., 2016). The knowledge representation, a way of representing information about the world, is a crucial part of an AI's generality.

These systems have been built to tackle a very narrow range of problems. While these problems are certainly difficult, the limited scope results in a high degree of engineering specialization. For instance, Siri can easily complete the task of showing me my friend Chase's contact information if I request it. However, if I ask Siri when I last talked to Chase, the assistant is unable to respond with a useful answer even though my phone and computer keep track of this information. Similarly, Siri on my Mac laptop can open my thesis document upon request without issue, but cannot answer the question of when I last changed my thesis. An example of an unsatisfactory interaction is shown in Figure 1.2. It's difficult to get the assistant to complete related tasks even when it has the necessary information because its knowledge representation is extremely shallow. Apple is secretive regarding the algorithms behind the system, but it seems as though Siri is completing rudimentary pattern matching rather than performing a deeper analysis of requests, their logical components, and how they relate to device data.

These limitations are common in large machine learning and natural language processing systems. Machine learning is generally focused on specific lower-level classification and regression tasks rather than higher-level reasoning and problem solving (Langley, 2011). Because these systems are usually designed to complete one sort of task, the knowledge representation is lightweight and lacks common sense. Moreover, knowledge used to solve one problem often cannot be used to solve a related but slightly different problem. Humans, by comparison, are extremely flexible and quick learners. We excel at generalization, abstraction, and analogy, allowing for nearly effortless knowledge transfer (Gentner & Markman, 1997). While there has been some recent research on applying a neural network trained for one problem to a related problem, the problem domain is limited to perception. It's not clear that such an approach could be used on higher-level problem-solving tasks (Oquab, Bottou, Laptev, & Sivic, 2014). Humans rely on a flexible knowledge representation to identify causal connections, learn from a small number of examples, and transfer knowledge between domains, all vital for generally intelligent behavior. Therefore, the knowledge representation is clearly of crucial importance in building a strong AI system.

Figure 1.2: An example of an unsatisfying interaction with Siri. Siri is able to complete many specific tasks, but lacks the flexibility and common sense to complete even small variations of these tasks.

1.3 The Importance of Common Sense

“To solve the hard problems in AI — natural language understanding, general vision, completely trustworthy speech and handwriting recognition — we need systems with common sense knowledge and flexible ways to use it. The trouble is that building such systems amounts to ‘solving AI.’ This notion is difficult to accept, but it seems that we have no choice but to face it head on.”

—Push Singh, co-founder of the Open Mind Common Sense project

(Singh, 1996)

However, the knowledge representation is not the whole story. Consider designing a language understanding system and feeding it the following example story from a dataset on teen bullying (MTV Over the Line, n.d.):

“My bf keeps asking where i am, what i am doing and who i am with!! He wont let me hang out with any guys even if it is for a school project and when i dont answer he tries to track me by facebook and and [sic] by my friends. Is that obsesive [sic]?????!!!!!!”

Ideally, the system would be able to understand the user’s input and respond with helpful advice. An adult human reader is easily able to comprehend the story and identify how the events are connected. However, slang and grammatical errors aside, reaching this understanding requires a stunning amount of commonsense reasoning so automatic in human cognition that it’s rarely apparent. The reader must realize that

• it's common for a person's friend to know that person's whereabouts to understand why the subject's boyfriend was interacting with the subject's friends,

• completing a project oftentimes requires spending time together to understand why the subject included the detail about the school project, and

• behaviors such as tracking a partner's location and controlling his or her social life are signs of an abusive relationship.

Other more mundane pieces of common sense, such as the knowledge that "hanging out" is a form of social interaction and Facebook is a social network containing geolocation data, are also required, but the three points listed directly above are more essential though perhaps less visible to humans. All of this commonsense knowledge is "fuzzy"—it's not always true that an abusive boyfriend tracks his partner's location, or that a boyfriend tracking his partner's location is abusive. Such loose restrictions

on this knowledge make it difficult to computationally pin down. However, humans employ this sort of commonsense knowledge routinely when communicating and processing information. Therefore, any strong artificial intelligence designed to interact with human speech, human writing, or even human minds must have some mechanism for representing and operating upon commonsense knowledge. I will refer to such an AI as common-sense-friendly.

Machine learning approaches are certainly useful and incredibly strong at pattern recognition and similar tasks. However, these systems are often too narrow to contribute to the goals of strong artificial intelligence on their own. Systems with a general knowledge representation that incorporates common sense are necessary for higher-level reasoning and more directly align with the goals of common-sense-friendly strong AI. My work takes steps towards the realization of such a system.

1.4 Approach

I tackled the problem of developing a common-sense-friendly artificial general intelligence by extending a story-understanding system with commonsense reasoning capabilities.

Genesis is a rule-based story-understanding platform which can read in any arbitrary sequence of events in simple English (a story) and analyze it in intelligent ways. Some of its features include question answering, hypothetical reasoning, analyzing a story from the perspective of different characters, and detecting analogous events in two different stories (Holmes & Winston, 2016; Winston, 2014). Genesis has a flexible knowledge representation which allows it to store knowledge about arbitrary story events, and its reasoning and problem-solving models are domain-agnostic as well. These appealing qualities are present because Genesis was designed to model general story understanding and human problem solving rather than complete any one specific task.

Detecting causal relationships is central to many of Genesis's features. Causal relationships describe how different events in the story are connected, and these

connections allow Genesis to understand why the events proceeded in the way they did. Prior to my work, Genesis acquired all of its knowledge about causal relationships from rules users constructed by hand. A user would have to manually tell Genesis that, for instance, if person A harms person B, person B gets upset. Genesis applies these rules while reading the story in order to make inferences and increase understanding. Genesis's dependence on these user-defined rules is one of its most prominent limitations. In order for Genesis to understand a story about a war, for example, a user would have to construct rules that describe the sort of causal relationships present in war-related events. Many of these rules would describe common sense, such as the fact that bigger armies help win wars.

When applying Genesis to a large corpus of “real-world,” naturally occurring text rather than carefully constructed example stories, requiring users to anticipate all the necessary commonsense knowledge needed to understand the stories becomes unfeasible. Instead, Genesis should be able to make the commonsense connections on its own, and the user can define rules that discuss the specifics of a particular domain or story. In this way, common sense and domain-specific knowledge are separated, and Genesis can be much more easily applied to large stories about arbitrary topics. Equipped with large amounts of commonsense knowledge, Genesis becomes much more “intelligent” on its own without the user’s explicit contributions, and can even make valuable connections within the story that the user did not expect.

ConceptNet is a large knowledge base of commonsense assertions. ConceptNet is organized into concepts, which are simple nouns, verbs, or noun/verb phrases, and relations between these concepts. ConceptNet has many different versions, but the one I made primary use of, ConceptNet 4, features approximately 225,000 assertions describing commonsense knowledge, all submitted directly by humans. I connected Genesis’s story processing to ConceptNet so Genesis can use ConceptNet’s knowledge while reading a story to make connections it would otherwise miss. ConceptNet is an ideal source of commonsense knowledge because it captures human commonsense knowledge instead of encyclopedic knowledge.

I applied ConceptNet’s knowledge within Genesis in two specific ways: common

sense enabled rule matching (CSERM), a method of more flexibly applying user-defined rules to story events using common sense, and ASPIRE, a system for inferring characters' goals from the actions they take within the story. These two applications are completely separate, illustrating different ways Genesis can take advantage of ConceptNet's knowledge. Both have a big impact. CSERM amplifies the applicability of each rule by a factor of approximately ten to one hundred, making each user-defined rule much more powerful. The ConceptNet knowledge ASPIRE incorporates into its processing is equivalent to approximately 30,000 Genesis rules. In both applications, Genesis provides justifications to the user explaining the inferences it made with the help of ConceptNet by citing the specific pieces of relevant knowledge.

While CSERM does use ConceptNet's collective knowledge, CSERM only expands existing user-defined rules and cannot stand on its own to aid story processing. Conversely, ASPIRE allows Genesis to leverage thousands of ConceptNet goal-related assertions to make inferences about character goals with no dependence on user-defined rules. My work does not currently incorporate the entirety of ConceptNet's knowledge into all aspects of Genesis processing. Applying ConceptNet knowledge in the standalone manner that ASPIRE does requires programmer diligence and is therefore currently limited to goal-related knowledge. However, the extensible connection I have established between the two systems supports querying for any assertion type of interest, paving the way for additional ConceptNet knowledge to be applied with no dependence on the user.

Much of my work focuses on significantly reducing the number of necessary user-defined commonsense rules. I am not advocating for abandoning Genesis rules, as they are still quite useful and express ideas ConceptNet cannot. Rather, I am trying to show the power of ConceptNet's commonsense knowledge and how much it offers Genesis. Both user-defined rules and ConceptNet assertions express commonsense knowledge, but ConceptNet contains several orders of magnitude more assertions than the dozens of currently constructed Genesis rules. Now, Genesis uses both forms of knowledge together to analyze a story.

With the commonsense reasoning that CSERM and ASPIRE provide, applying

Genesis's analysis to a corpus of naturally occurring text becomes significantly more feasible. Genesis has a rich set of features and analytical tools, and its ConceptNet-assisted commonsense reasoning makes causal connections which allow these capabilities to perform better with minimal user effort. This technology would be very useful in any domain that heavily depends on precedent, including law, medicine, and business. Attractive Genesis features including detecting the similarity between documents, finding higher-level user-defined story patterns such as revenge and Pyrrhic victory, and performing hypothetical reasoning could all help Genesis analyze a corpus to pick out documents or passages of interest (Holmes & Winston, 2016; Winston, 2014). Genesis's analysis capabilities are considerably more nuanced and explainable than comparable machine-learning approaches, and ConceptNet knowledge bolsters their performance and ease of use. Giving Genesis its own common sense means users don't have to spell out as much to Genesis and can instead spend their time describing domain-specific knowledge. My work is a step towards achieving this vision of a widely applicable system for humanlike analysis.

1.5 Contributions

My work in extending Genesis with commonsense knowledge has five primary contributions:

• Established a general, extensible connection between Genesis and ConceptNet which can easily generate user-friendly justifications for inferences

• Implemented common sense enabled rule matching in Genesis, allowing Genesis to amplify its rule set by using ConceptNet to better detect when a rule applies to a story event

• Articulated a framework for inferring the goals of characters from actions taken within the story

• Implemented ASPIRE and incorporated ConceptNet knowledge for detecting

both goal causation and goal completion, enabling Genesis to understand goal-oriented stories with a significantly reduced dependence on user-defined rules

• Demonstrated the feasibility of combining a story-understanding system and a commonsense knowledge base to infer nontrivial relevant events and connections never explicitly stated in the input text at a large scale

1.6 Chapter Descriptions

Chapter 2 gives general background about Genesis, covering its features and limitations at the inception of this work, and Chapter 3 gives background about ConceptNet, discussing the structure of its information and its different versions. Chapter 4 examines the connection between Genesis and ConceptNet I designed and implemented, focusing on its ease of use and extensibility. Chapter 5 explores CSERM, an initial application of ConceptNet's commonsense knowledge in Genesis which makes rules more flexible, and Chapter 6 introduces ASPIRE, a more ambitious application of ConceptNet knowledge focused on character goals. Chapter 7 shows the results of using CSERM and ASPIRE on three example stories and quantifies the amount of common sense Genesis has gained. Chapter 8 discusses some limitations of Genesis's newly accessible common sense and opportunities for future work. Chapter 9 discusses related work, including machine learning approaches to story understanding, prior attempts at giving story-understanding systems common sense, and additional commonsense knowledge bases. Finally, Chapter 10 concludes the thesis and articulates the main contributions of my work.

Chapter 2

Genesis Background

2.1 Overview

Genesis is a story-understanding system started by Patrick Winston in 2008 and developed with the help of many students. Genesis parses natural language input and forms its own internal representation of the story. "Story" here is used to refer to any sequence of events; it is not required to follow a traditional narrative format. At the inception of this work, Genesis obtained common sense solely from user-defined rules. Applying these rules allows Genesis to form causal connections between the story's events. Once Genesis has read the story and formed its internal representation, it can demonstrate its understanding in many ways and perform additional sorts of analysis at the user's request.

Genesis is implemented in Java and uses Katz's (1997) START parser to assist with reading in the input text. Genesis maps the results of START into its own inner language. This inner language expresses the essential contents of every sentence while omitting unnecessary details. The inner language uses four primary types of representation substrate to capture information within a sentence or phrase:

• Entities, which can be primitive nouns such as “baseball” or “love”

• Relations, which describe connections between entities. An example relation is “Matt harms Josh”

29 • Functions, which modify an entity and represent things such as prepositional phrases

• Sequences, which allow entities to be grouped together

Importantly, entities can be used to represent specific nouns, but functions, relations, and sequences are all entities as well. Therefore, entity is a general term used to describe Genesis's inner language representation of any sentence or sentence fragment. All story events are entities, but not all entities are story events because entities can represent the smaller components of a story event as well.

Genesis supports adding properties to an entity which describe its attributes. Properties are implemented as a hashmap—every property has a String name, its key, and an Object value. These are useful for keeping track of information about an entity over the course of Genesis's story processing, and I used properties to hold information about why and how an inferred entity was inferred in my common sense enabled rule matching and ASPIRE implementations.

Genesis uses WordNet to form a list of threads which describe the different senses, or glosses, of a word. Each thread contains a full hierarchy describing a chain of categories which the word fits into. For instance, WordNet provides "thing entity physical-entity object whole living-thing organism animal chordate vertebrate mammal placental carnivore canine dog" as a thread for dog. Genesis uses threads when matching story events to rules; if a story has a rule about animals and a story event contains a dog, the dog in the event would match the animal in the rule because the thread for dog tells Genesis that a dog is an animal. However, WordNet's hierarchies are relatively rigid, often not capturing commonsense usage of these words. Note that the hierarchy in the given dog thread does not contain "pet" even though dogs are often pets. ConceptNet helps address some of these limitations, better capturing people's commonsense ideas and associations with these words. WordNet is particularly limited with verb hierarchies, and ConceptNet does well with verbs (Miller, 1995).

Genesis relies heavily on making causal connections between the story's events to perform intelligent analyses on the story. Causal connections enable Genesis to view

30 the story as more than just a sequence of unrelated events; with these connections, Genesis reaches a much more complete and explicit understanding of the story. The elaboration graph is a central display in Genesis because it prominently features these causal connections. The graph separates story events into a set of boxes and illustrates causal relationships between two events by drawing a line between the corresponding boxes. There is an extensive color scheme for the boxes and lines which describes the types of entity represented by the boxes and the sorts of causal connection represented by the lines, but I omit that here because it’s not directly relevant to my work.
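To make the property and thread machinery described above concrete, here is a minimal sketch in Java. The class and method names are simplified stand-ins of my own, not the actual Genesis Entity API, which is considerably richer.

    import java.util.*;

    // Simplified stand-in for a Genesis entity: a property map plus one WordNet-style thread.
    class SimpleEntity {
        // Properties: a String key mapped to an arbitrary Object value, as described above.
        private final Map<String, Object> properties = new HashMap<>();
        // A thread: a chain of categories from most general to most specific.
        private final List<String> thread;

        SimpleEntity(List<String> thread) {
            this.thread = thread;
        }

        void addProperty(String key, Object value) {
            properties.put(key, value);
        }

        Object getProperty(String key) {
            return properties.get(key);
        }

        // Thread-based matching: a "dog" entity matches a rule about "animal"
        // because "animal" appears somewhere in the dog's thread.
        boolean isA(String category) {
            return thread.contains(category);
        }
    }

    class ThreadDemo {
        public static void main(String[] args) {
            SimpleEntity dog = new SimpleEntity(Arrays.asList(
                    "thing", "entity", "physical-entity", "object", "whole", "living-thing",
                    "organism", "animal", "chordate", "vertebrate", "mammal", "placental",
                    "carnivore", "canine", "dog"));
            dog.addProperty("inferred-by", "ASPIRE");   // illustrative property only
            System.out.println(dog.isA("animal"));      // true: the rule about animals applies
            System.out.println(dog.isA("pet"));         // false: WordNet's rigidity, as noted above
        }
    }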

2.2 Capabilities

Much effort has gone into giving Genesis a rich set of analytical features. Genesis can summarize stories, answer questions about them, analyze its own problem solving process in the form of a story about itself, perform hypothetical reasoning, identify story patterns in a story from user-defined templates, align two stories to identify analogous events, apply a cultural bias in its interpretation, apply rules specific to character personalities, and analyze a story from the perspective of one of the characters, among other capabilities (Holmes & Winston, 2016; Noss, 2017; Winston, 2014). Many of these features depend on the causal connections which Genesis has made, so if those connections are incomplete or of poor quality, the analysis suffers.

Figure 2.1 shows Genesis's interpretation of a simple English version of Macbeth given user-defined commonsense rules and story pattern templates. The lower pane shows the elaboration graph for the entire story, while the upper pane shows the elaboration graph specific to the story pattern of "Pyrrhic victory" which Genesis has identified in the story.

2.3 Limitations

Genesis is quite powerful, but has two major drawbacks. The first is that its parsing capabilities are somewhat limited. The START parser is used to turn English into Genesis's inner language. START can parse simple sentences without issue, but struggles as the complexity of the sentence rises. Therefore, Genesis can currently only be fed stories written in relatively simple English and cannot be readily applied to a large corpus of naturally occurring text. While this is an important problem, this is not the focus of my work.

Figure 2.1: A display of two elaboration graphs from Genesis's reading of a Macbeth summary in simple English. The bottom is the full elaboration graph, showing all explicit and inferred story entities, and the top displays just those events relating to the user-defined story pattern of "Pyrrhic Victory." From "The Genesis Story Understanding and Story Telling System: a 21st Century Step Toward Artificial Intelligence," by P. Winston, 2014. Reprinted with permission.

Instead, my work targets Genesis's need to be given user-defined rules that describe the necessary commonsense knowledge. This places a large burden on the user of Genesis because the user must tell Genesis banal pieces of common sense. For example, a user could express the fact that people get angry when they're harmed by using Genesis's rule notation to create the rule "If XX harms YY, YY becomes

angry," where XX and YY can match specific people in the input story. If Genesis is to be applied to longer stories or a large corpus of stories, the user would need to express all of the necessary knowledge required to produce useful understanding, which makes using Genesis much less attractive. Genesis is a great tool for automating story analysis, but if the user would need to first read all the stories herself to identify the necessary common sense, the analysis is no longer automated. These rules are of great importance to Genesis because, at the inception of this work, they were the only way that Genesis could make causal connections, and causal connections are vitally important for intelligent story analysis. The same ruleset is likely useful for many stories, especially within the same genre, so the user may be able to reuse or modify previously constructed rulesets instead of constructing a full ruleset for every story of interest. Still, ConceptNet improves Genesis's usability because it provides access to hundreds of thousands of commonsense assertions, removing the need for many user-defined rules in the first place. ConceptNet provides a common store for commonsense knowledge, and it is larger in size than any previously constructed Genesis ruleset.

My work strives to give Genesis part of the common sense necessary for identifying the obvious, standard causal connections that a human reader would make if given the same story. This helps the user focus on giving Genesis domain-specific rules used to express knowledge that isn't common sense. Furthermore, all of the existing story analysis capabilities are bolstered by the commonsense connections that Genesis makes using ConceptNet even in the absence of user-defined rules! Because commonsense knowledge is of great value to Genesis, my work focuses on giving Genesis access to large amounts of this knowledge and facilitating its use in story processing.

Chapter 3

ConceptNet Background

3.1 Overview

Figure 3.1: A visualization of a very small part of ConceptNet’s knowledge. Concepts are shown in circles and relations are shown as directed arrows between concepts. From “Digital intuition: Applying common sense using dimensionality reduction,” by C. Havasi, R. Speer, J. Pustejovsky, & H. Lieberman, 2009, IEEE Intelligent Systems, 24, p. 4. Copyright 2009 by IEEE. Reprinted with permission.

ConceptNet is a large semantic network of common sense developed by the Open Mind Common Sense (OMCS) project at MIT. The OMCS project was founded in

1999 with the mission of capturing the commonsense knowledge necessary to enable computers to reason in more humanlike ways. Marvin Minsky, Push Singh, Catherine Havasi, Rob Speer, and Henry Lieberman all were or are notable researchers involved in the project. OMCS started with a crowdsourcing approach to data collection, as acquiring commonsense knowledge by other means is not straightforward. OMCS has explored several representations of commonsense knowledge, with ConceptNet being the most influential (Havasi, Speer, Pustejovsky, & Lieberman, 2009; Liu & Singh, 2004; Singh, 2002; Singh, Barry, & Liu, 2004).

There are several versions of ConceptNet, but the representation for commonsense knowledge is shared. ConceptNet represents its knowledge using concepts and relations between them (note that, while user-defined story patterns such as those describing "Pyrrhic victory" are commonly referred to as "concepts" in several other Genesis texts, throughout this thesis I am using "concept" only in the ConceptNet sense of the word). Concepts can be nouns such as "computer", verbs such as "exercise," or short phrases such as "breathe fresh air" or "beanbag chair." Relations are used to describe a connection between two concepts. Some common relations are "Is A," "Used For," "Causes," "Desires," "At Location," and "Has Property." ConceptNet represents its commonsense knowledge using assertions, each of which consists of a left concept, a relation, and a right concept. For instance, the ConceptNet assertion "computer At Location office" represents the fact that computers are commonly found in offices. Every assertion is also associated with a score which represents the confidence in the truth of that assertion. Negative assertions are represented with a negative score (e.g. the assertion "person Desires broken leg" has a score of −1.85 in one version of ConceptNet). A feature is a ConceptNet assertion with one of its concepts missing, such as "person Desires " or " Used For build boat." Features are useful for asking ConceptNet open-ended questions. For example, "person Desires " could be used to ask ConceptNet for all the things it knows that people desire (or don't desire, if the score is negative) (Havasi et al., 2009; Liu & Singh, 2004).
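Purely as an illustration of the assertion and feature structure just described, the following Java records are my own shorthand, not ConceptNet's or Genesis's actual classes; the score for the first example assertion is a placeholder value.

    // Illustrative shorthand types; not ConceptNet's or Genesis's actual data structures.
    record Assertion(String leftConcept, String relation, String rightConcept, double score) {}

    // A feature is an assertion with one concept left blank, used for open-ended queries.
    record Feature(String leftConceptOrNull, String relation, String rightConceptOrNull) {}

    class ConceptNetExamples {
        public static void main(String[] args) {
            // "computer At Location office": computers are commonly found in offices (placeholder score).
            Assertion atLocation = new Assertion("computer", "AtLocation", "office", 1.0);
            // Negative assertions carry negative scores, as in the example from the text.
            Assertion brokenLeg = new Assertion("person", "Desires", "broken leg", -1.85);
            // "person Desires ___": ask ConceptNet for everything it knows that people desire.
            Feature personDesires = new Feature("person", "Desires", null);
            System.out.println(atLocation + " / " + brokenLeg + " / " + personDesires);
        }
    }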

OMCS has introduced many different versions of ConceptNet over the years, with development stretching back to 2000 (Liu & Singh, 2004; Speer & Havasi, 2012).

36 ConceptNet 4 was introduced in 2008, features multiple languages, and supports a web API which users can query to make use of commonsense knowledge. ConceptNet 4 contains about 22,000 English concepts connected in approximately 225,000 assertions and uses 27 relations. The knowledge in ConceptNet 4 and previous versions was all directly collected from humans. OMCS created a website where users could submit commonsense knowledge, either in free text or by using templates which constrain the format of the submission. The free text was parsed using a simple parser and converted into an assertion. If multiple users submitted the same assertion, the score for that assertion increased (or decreased, if the assertion is negative) to represent the increased confidence. Users could also rate previously submitted assertions, and these ratings influenced the scores of the assertions (Havasi et al., 2009).

ConceptNet 5 was introduced in 2012 and contains much more knowledge than previous versions of ConceptNet in both English and other languages. ConceptNet 5 has around 3.9 million concepts connected in 8.7 million assertions. This version of ConceptNet still contains knowledge collected from humans, but complements that knowledge with information from WordNet, Wikipedia, and several other sources (Speer & Havasi, 2012). It also supports queries through its web API, but this API is not as fully featured as previous versions of ConceptNet due to its larger size rendering some of the previous queries intractable. ConceptNet 5 uses a different, larger set of relations than previous versions.

ConceptNet contains a distinct sort of information—common sense. Its assertions are not limited to statements that are always, unequivocally true. For instance, ConceptNet 4 contains the assertion "sport At Location field." It's not always true that sports are found on a field, as plenty of sports are played indoors. However, this assertion is still a useful bit of common sense and although it would not apply to a story about a volleyball game, it could certainly help in understanding a story about a football game. Commonsense reasoning is importantly different from logical reasoning, as logical reasoning would combine the true assertions "volleyball Is A sport" and "sport At Location field" to form the conclusion "volleyball At Location field," which is rarely true because volleyball is usually played indoors or on a beach. Commonsense

reasoning is also not statistical. The score for "sport At Location field" is not determined by dividing the number of sports played on a field by the number of sports, as such an approach clumsily ignores context dependency. Instead, ConceptNet embraces the very nature of common sense—assertions can be ambiguous, contradictory, or context-dependent, but all of those properties apply to the common sense that humans possess as well. ConceptNet's lack of precision does mean applications that use ConceptNet must be careful about how they apply its knowledge, though. Chapter 6 discusses ASPIRE and the rather involved method I developed for allowing Genesis to use goal-related ConceptNet knowledge independently of user-defined rules.

3.2 AnalogySpace

In addition to strictly human-submitted knowledge, ConceptNet 4 offers a supplementary commonsense knowledge base, AnalogySpace. The original English knowledge base is fairly sparse, with an approximate average of ten assertions about any given concept (the variation is quite high, however—most concepts have six assertions or fewer, while some of the most popular concepts have thousands). This sparsity is not because most of the contained concepts are unrelated and disparate; rather, the collected knowledge about many concepts only describes a very small fraction of these concepts' properties. Simply put, ConceptNet doesn't know a lot about many concepts because it hasn't been told a lot about many concepts. AnalogySpace was created to address the sparsity problem and adds knowledge by cleverly applying a popular matrix algebra technique on the original knowledge base.

ConceptNet can be thought of as a three-dimensional matrix of assertions where the dimensions are the left concept, the relation, and right concept. The value in each matrix entry is the score of that assertion, or zero if that assertion is not present. This matrix is very sparse, but singular value decomposition can be used to fill in many of its gaps.

Singular value decomposition (SVD) is a matrix factorization method which separates a matrix into a sum of rank-one matrices. The partial sums of these terms progressively

increase in rank, and as each term is added, the sum better approximates the matrix. When the last term is added, the sum is exactly equal to the original matrix. AnalogySpace is formed by computing the SVD of the original version of ConceptNet and truncating the matrix sum so that the result is a low-rank approximation of the original matrix. This low-rank approximation ignores many of the details and idiosyncrasies of the data, instead focusing on bigger trends and relationships. This creates a transfer of information between similar concepts. For example, if the original knowledge base knows that apples and oranges have many of the same properties, but only knows that apples are used for juice, AnalogySpace has evidence that oranges are also used for juice. Therefore, although the assertion "orange Used For juice" has a score of 0 in the original matrix, the assertion has a higher score in the AnalogySpace matrix. This behavior is naturally produced by the SVD, a very powerful matrix operation (Havasi et al., 2009).

ConceptNet 4 provides separate API calls for querying the value of an AnalogySpace assertion versus traditional ConceptNet assertions. AnalogySpace also supports many other powerful queries such as calculating the similarity and relatedness of two concepts.
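Stated in standard SVD notation (a generic statement of the technique, not the specific flattening AnalogySpace uses), the truncated decomposition underlying this low-rank approximation is

$$ A = U \Sigma V^{\top} = \sum_{i=1}^{r} \sigma_i \, u_i v_i^{\top}, \qquad A_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{\top}, \quad k < r, $$

where $r$ is the rank of $A$ and the singular values satisfy $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0$. $A_k$ is the best rank-$k$ approximation of $A$ in the least-squares sense, and entries that are zero in $A$ can be nonzero in $A_k$; this is the "filling in" behavior described above.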

3.3 Discussion

ConceptNet's data is not perfect and does contain some noise. However, the ability for users to rate assertions submitted by others helped mitigate spam submissions. AnalogySpace was also used to detect suspicious data and suggest assertions for users to examine (Havasi et al., 2009; Speer, 2007).

All of ConceptNet's knowledge is binary, describing how two concepts are related. This makes it importantly different from a Genesis rule, which can have an unbounded number of antecedents and consequents. Genesis rules are more appropriate for describing domain-specific rules that typically have more nuance than just relating two concepts. ConceptNet knowledge is simpler and higher-level, but still quite valuable. These two ways of representing common sense, ConceptNet assertions and Genesis

39 rules, are compared at greater length in section 6.4.

ConceptNet 4 offers a broad and rich set of supported queries in its API, but ConceptNet 5 only offers a subset of these queries. Because ConceptNet 5 is significantly larger in size, many of the techniques ConceptNet 4 uses to support its queries are no longer performant in the following version, resulting in loss of support for many interesting queries. For instance, ConceptNet 4 can compute both concept relatedness and similarity using AnalogySpace. However, the approach used to generate AnalogySpace does not scale to ConceptNet 5. Instead, ConceptNet 5 computes concept relatedness using word embeddings derived from many different datasets, but does not support concept similarity because similarity is not captured by this technique (Speer & Chin, 2016). Word embeddings are discussed at greater length in section 9.1.

Similarity and relatedness are importantly different. “Desk” and “paperclip” are related because they’re both found in an office, but they are not similar because they have very different uses. “Desk” and “table” are similar, though, because they’re both used to support objects. I used both ConceptNet 4 similarity and ConceptNet 5 relatedness in my work, as explored further in Chapter 4. While ConceptNet 5’s larger size does allow it to represent more knowledge, its lack of support for many interesting queries is a drawback.

Both ConceptNet 4 and ConceptNet 5 contain commonsense knowledge, but ConceptNet 4's content is especially valuable because every assertion was explicitly contributed by a human. While resources such as WordNet and Wikipedia undoubtedly have their uses, the knowledge contained in them cannot always be considered common sense, and their presence in ConceptNet 5's knowledge dilutes the common sense that humans possess. Moreover, ConceptNet 5 does not contain all of the assertions that ConceptNet 4 does. For example, ConceptNet 5 contains the obscure knowledge that a peripatetic is a type of pedestrian while ConceptNet 4 does not, but ConceptNet 4 knows the more obvious fact that pedestrians are found on sidewalks while ConceptNet 5 does not. Because my focus is on the use of common sense in story understanding, not encyclopedic knowledge, ConceptNet 4 fit more naturally

40 into my work, but each knowledge base has its strengths and weaknesses. Therefore, my implementation primarily uses ConceptNet 4, only consulting ConceptNet 5 for its relatedness information as discussed in Chapter 4.

Chapter 4

Connecting Genesis and ConceptNet

I established a general connection between Genesis and ConceptNet so wide varieties of commonsense knowledge can be incorporated into story processing. I implemented several types of ConceptNet queries within Genesis, but designed the connection so it's easily extensible to other queries with different kinds of results. I also held user-friendliness as an important design goal when constructing the link between the two systems, as I want it to be as simple as possible for Genesis programmers to incorporate ConceptNet knowledge into their programs without worrying about the messy networking details.

ConceptNet 4 offers an HTTP API in which the client submits POST requests containing query parameters to the server and receives a JSON response. I used Java's built-in networking library to submit the requests and the Jackson Java library to parse the responses and turn them into Java objects (Jackson Project, n.d.). All of the networking calls are hidden "under the covers" in the ConceptNetClient class, which offers convenient and simple public methods enabling users to easily access and interpret ConceptNet information.

I modeled many of the ideas in ConceptNet as classes in Genesis, such as ConceptNetAssertion. These classes standardize the representation used for interaction with ConceptNet even before the query or networking stages. They also provide a convenient way to normalize the information the objects represent. For instance, the ConceptNetConcept class, used to represent a concept, "cleans up" its user-given

43 concept String before storing it to ensure the String is in the standardized format that the ConceptNet queries require. Because these model objects represent ideas in ConceptNet rather than query or networking details, they are visible to the rest of Genesis’s code.

All types of ConceptNet queries implement the ConceptNetQuery interface, which standardizes important information about each query. Every query knows what arguments it must send in the POST request, the object type used to represent the result of the query, how to parse the server's response into that object type, and what the default result of the query should be if there's any sort of error contacting the server or if ConceptNet does not have the necessary concepts in its knowledge base. Subtypes of ConceptNetQuery include ConceptNetAssertionQuery, ConceptNetSimilarityQuery, and ConceptNetConceptExistsQuery. ConceptNetAssertionQuery and ConceptNetSimilarityQuery have a result type of Double because they receive a score from the server, while ConceptNetConceptExistsQuery has a result type of Boolean.
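The following sketch conveys the shape of this design. The interface and class names here are simplified stand-ins, and the POST parameter names are assumptions; the actual ConceptNetQuery hierarchy in Genesis differs in detail.

    import java.util.Map;

    // Simplified stand-in for the ConceptNetQuery idea: each query knows its POST
    // arguments, how to parse the server's JSON, and its default result on failure.
    interface Query<R> {
        Map<String, String> postArguments();
        R parseResponse(String json);
        R defaultResult();
    }

    // Scores an assertion such as "computer AtLocation office".
    class AssertionQuery implements Query<Double> {
        private final String left, relation, right;

        AssertionQuery(String left, String relation, String right) {
            this.left = left;
            this.relation = relation;
            this.right = right;
        }

        public Map<String, String> postArguments() {
            // Parameter names are illustrative only, not the real ConceptNet 4 API fields.
            return Map.of("left", left, "relation", relation, "right", right);
        }

        public Double parseResponse(String json) {
            // The real implementation uses Jackson; a bare numeric response body is assumed here.
            return Double.parseDouble(json.trim());
        }

        public Double defaultResult() {
            return 0.0; // returned on a networking error or an unknown concept
        }
    }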

ConceptNetWideQuery, another subtype of ConceptNetQuery, is an abstract class which represents queries that expect a list of results rather than a single result. For instance, ConceptNetFeatureQuery extends ConceptNetWideQuery and is used to send open-ended queries about features to ConceptNet. As introduced in section 3.1, a feature in ConceptNet is an assertion with one of its concepts missing, such as “person Desires ” or “ Causes guilt.” A ConceptNetFeatureQuery takes in a feature, asks the server for all the ways this feature could be completed given its commonsense knowledge, and returns a list of scored, complete assertions back to the user. Such a query is useful if Genesis isn’t sure exactly what particular knowledge it may need in the future about, say, what people desire, so it wants to get all of ConceptNet’s knowledge describing people’s desires now and see what turns out to be relevant later.

Results of ConceptNet queries are packaged up into instances of the ConceptNetQueryResult class, which is visible to general Genesis code. This class contains the result of a query, such as a score Double value, and the original ConceptNetQuery that gave this result. The ConceptNetQueryResult class is useful because it can be easily turned into a ConceptNetJustification object. These justifications provide a simple way for Genesis to explain how a given bit of ConceptNet knowledge affected its story processing. ConceptNetJustification has a toJustificationString() method which returns a user-friendly justification String explaining a piece of ConceptNet knowledge in terms of the query that was used to obtain the knowledge and the result of the query. These justifications are used to show the user what knowledge was received from ConceptNet while processing a story and how that knowledge was used, making Genesis's commonsense knowledge as transparent as possible.

Figure 4.1: The organization of the Genesis-ConceptNet connection. The ConceptNetClient class sits as an intermediary and allows Genesis programmers to easily incorporate commonsense knowledge without worrying about any networking details. Icons courtesy of sharma and aLf at The Noun Project and are used under a CC-BY 3.0 license.

Instances of ConceptNetQuery are never created by the external Genesis programmer. Instead, ConceptNetClient constructs these queries and sends them off to the server. For instance, ConceptNetClient offers a howTrueIs() method which takes in a ConceptNetAssertion given by the user, constructs a ConceptNetQuery, uses this query to send a POST request to the ConceptNet 4 server, parses the returned JSON into a ConceptNetQueryResult object, and returns this object to the user. Figure 4.1 shows an example of this data flow. Genesis programmers looking to use ConceptNet knowledge need to interact only with the instances of the classes used to model ConceptNet ideas, ConceptNetJustification instances, and ConceptNetQueryResult instances; all other classes are used for networking details and are package-protected because they are part of the internal operation of ConceptNetClient.
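As a rough illustration of this data flow, a Genesis programmer might obtain an assertion score as in the snippet below. The argument format, static access, and accessor names are assumptions made for the sketch rather than ConceptNetClient’s exact API.

// Hypothetical usage sketch; the relation string and method names are assumed.
ConceptNetAssertion assertion =
        new ConceptNetAssertion("gain weight", "CausesDesire", "exercise");
ConceptNetQueryResult<Double> result = ConceptNetClient.howTrueIs(assertion);
double score = result.getResult();  // assumed accessor; the score reflects ConceptNet's confidence
// The result can later be converted into a ConceptNetJustification for display to the user.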

ConceptNetClient contains a caching layer which is invisible to the user. The

client maintains a Map between ConceptNetQuery instances and their corresponding ConceptNetQueryResult instance. When a ConceptNetQuery instance is formed in response to a user method call, the client first checks if the instance is contained in the cache. If so, the client returns the corresponding result to the user. If not, the ConceptNet server is contacted and a ConceptNetQueryResult instance is formed. Before the result is returned to the user, it is first placed into the cache so the result will be remembered. The cache is written to a file when Genesis is closed and read from the file when Genesis is opened, so it persists between Genesis sessions just as the START and WordNet caches do. The use of this cache assumes ConceptNet knowledge will not regularly change. The Genesis user interface allows the user to control whether or not the cache is being used and gives the ability to purge the cache entirely, so in the event of a ConceptNet update the user does have a way to ensure the latest knowledge is being used.
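A minimal, self-contained sketch of this caching behavior follows, with file persistence omitted. The type parameters Q and R stand in for ConceptNetQuery and ConceptNetQueryResult, and the networking call is passed in as a function; none of this is Genesis’s actual implementation.

import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch of the invisible query cache described above.
final class QueryCache<Q, R> {
    private final Map<Q, R> cache = new HashMap<>();
    private final Function<Q, R> contactServer;  // stand-in for the real networking call

    QueryCache(Function<Q, R> contactServer) {
        this.contactServer = contactServer;
    }

    R resultFor(Q query) {
        R cached = cache.get(query);
        if (cached != null) {
            return cached;                        // hit: the ConceptNet server is not contacted
        }
        R fresh = contactServer.apply(query);     // miss: contact the server...
        cache.put(query, fresh);                  // ...and remember the answer for next time
        return fresh;
    }
}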

Currently, nearly all of Genesis’s queries consult ConceptNet 4, not ConceptNet 5. ConceptNet 4 has a broader API than ConceptNet 5, so ConceptNet 4 is the only option to serve many of the queries ConceptNetClient offers. However, even when both versions support a certain query, using ConceptNet 4 is nearly always preferable because ConceptNet 4’s knowledge better aligns with human common sense while ConceptNet 5’s knowledge is more encyclopedic in nature. This distinction is discussed in more detail in section 3.3.

Both versions are used to answer queries about how similar two concepts are. The two use different methods to compute association between concepts—ConceptNet 4 computes similarity using the matrix factorization methods discussed in section 3.2, while ConceptNet 5 computes relatedness by comparing word embeddings formed from many different datasets (Havasi et al., 2009; Speer & Havasi, 2012) as discussed in sections 3.3 and 9.1. I’ve found that combining ConceptNet 4’s similarity score with ConceptNet 5’s relatedness score gives a more robust score than using either one alone, as the two complement each other’s weaknesses. Specifically, if the user asks the client for a similarity score for two concepts and the concepts are present in both versions of ConceptNet, the client returns the geometric mean of the scores

from the two versions. If either version is missing either concept, a score of zero is returned. The geometric mean was chosen instead of a simpler means of combination such as the average because the geometric mean biases the result towards the lower of the two scores. This method is more generous than returning the minimum, but less generous than returning the average. This query is currently the only place that ConceptNet 5 is being used in my implementation.

The resulting connection between the two systems is layered, general, and easy to use. Users of ConceptNetClient do not need to concern themselves with any networking details or how the client works internally, and the ConceptNetQuery infrastructure can easily be extended to support new types of queries and query results. These attractive qualities make incorporating arbitrary ConceptNet knowledge into the existing architecture a straightforward task for future Genesis researchers.
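The score combination described above amounts to the following small computation. This is an illustrative sketch rather than the actual client code, with a null score standing in for a concept that is missing from one of the two ConceptNet versions.

// Combine ConceptNet 4 similarity with ConceptNet 5 relatedness, as described above.
final class SimilarityCombinationSketch {
    static double combined(Double conceptNet4Score, Double conceptNet5Score) {
        // If either version is missing either concept, report no similarity.
        if (conceptNet4Score == null || conceptNet5Score == null) {
            return 0.0;
        }
        // The geometric mean biases the result toward the lower of the two scores:
        // more generous than the minimum, less generous than the average.
        return Math.sqrt(conceptNet4Score * conceptNet5Score);
    }
}

For example, scores of 0.9 and 0.4 combine to the geometric mean 0.6, noticeably lower than their average of 0.65.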

Chapter 5

Common Sense Enabled Rule Matching

5.1 Overview

Common sense enabled rule matching (CSERM) is an immediate application of ConceptNet which, for many stories, decreases the size of the Genesis ruleset necessary for identifying causal connections. In its default mode, Genesis’s rule matching is relatively limited. Consider a simple example story with a rule “If XX eats food, XX becomes full” and event “Matt eats food.” Genesis matches Matt eating food to the antecedent of the rule and infers that Matt is full because he ate food. If the event were “Matt eats an apple,” Genesis would also be able to infer that Matt is full because it consults WordNet when processing every entity in the story, and WordNet tells it that an apple is a type of food (along with it being an instance of produce, an edible fruit, a solid, etc.). However, the knowledge that WordNet provides is not always comprehensive, which can lead to Genesis failing to realize that a rule in its ruleset is relevant to a story event. If such a rule fails to fire, Genesis’s interpretation of the story lacks causal connections, and Genesis sees the story as less coherent than it actually is.

For instance, consider a different example story with a rule “If XX harms YY, YY becomes angry” and the event “Matt attacks Josh.” WordNet tells Genesis that

attacking someone is an action, but it does not specify that attacking is a type of harm. WordNet is very good at providing accurate hierarchical descriptions of nouns because its knowledge is encyclopedic rather than commonsense in nature. As discussed in section 2.1, WordNet knows that a dog is a canine and a mammal, but does not tell Genesis that dogs can be pets even though dogs are frequently pets. The hierarchies WordNet provides are especially weak with verbs.

ConceptNet addresses many of these weaknesses. Because ConceptNet describes commonsense knowledge, it provides information about how these words are actually used rather than providing strict hierarchical classifications. ConceptNet can tell Genesis that “attack” is similar to “harm” so that if there are rules that discuss harming, story events that contain attacking can still trigger these rules. In this way, Genesis can more effectively employ its ruleset using common sense because a rule applies not only to story events that exactly match the rule’s antecedents, but also to story events that are similar to the rule’s antecedents. This means the user does not have to construct nearly as many rules, and the rules she does create can discuss a general concept such as “harm” and still apply to any specific sort of harm that may be contained in the story, such as “attack.” With CSERM, the applicability of a rule can be amplified by a factor of approximately ten to one hundred.

5.2 Implementation

CSERM is an extension of Genesis’s normal rule matching which uses WordNet. The job of CSERM is to determine if two Genesis entities, one of which is a rule, are similar enough that the rule should fire. CSERM can be enabled or disabled by a switch in the Genesis user interface. This feature currently only compares the verbs in two relations for similarity. Recall that a Genesis entity is an inner language representation of a sentence fragment or sentence, and a relation is a type of entity that describes a connection between two entities (see section 2.1 for more information). For example, “Matt attacks Josh” is a relation with “attack” as its type, “Matt” as its subject, and

“Josh” as its object. I implemented CSERM specifically for relations because rules most commonly describe connections between relations. However, CSERM could be extended to support other types of entities if desired.

CSERM starts out by invoking traditional matching between the rule antecedent and the story event entity. If traditional matching succeeds, CSERM returns this result because there’s no need to incorporate ConceptNet knowledge. Otherwise, the next step is to form a structure mapping between the rule antecedent and the story event. Consider a rule “If XX harms YY, then YY becomes angry” and a story event “Matt attacks Josh.” The rule antecedent here is “XX harms YY.” Below is a textual representation of the inner language Genesis uses to represent this rule antecedent and story event, respectively:

(rel harm (ent xx-108) (seq roles (fun object (ent yy-100))))
(rel attack (ent matt-210) (seq roles (fun object (ent josh-216))))

“Rel” refers to a relation, “ent” to an entity, and “seq” to a sequence. “Roles,” “fun,” and “object” are used to describe the role the later objects play in the sentence (direct objects). The numbers following each of the entities XX, YY, Matt, and Josh are used to represent specific instances of objects with this name.

Structure mapping attempts to form a set of bindings between these two entities based on their structure. Structure mapping only focuses on the structure of the two entities, ignoring details such as the relation type or names of the people involved. Here, the structure mapping would produce the bindings Matt←→XX and Josh←→YY because these pairs of entities correspond with each other in the overall identical structure of the two relations. If the two relations did not have identical structure, an empty set of bindings would be returned. Because there can be multiple possible structure mappings between two given entities, a list of all structure mappings is calculated rather than just one. If the structure mapping fails, CSERM concludes there’s no match and returns. Otherwise, the list of structure mappings is filtered down to just the ones that are valid. Validity here means every entity is bound to exactly one other entity and the bindings agree with any bindings that have been accumulated from matching

previous antecedents in the same rule. Validity also requires that no binding maps a person to a non-person and that all pairs of non-people entities that are bound together match using traditional Genesis matching. We place these restrictions on the structure mapping to ensure that the story event and rule antecedent not only have identical structure, but also have logically corresponding components.

After filtering down the possible structure mappings as just described, the ConceptNetClient class is used to contact ConceptNet and get a similarity score for the types of each pair of relations (i.e. the relation verbs) in every structure mapping. As discussed in Chapter 4, the score returned from ConceptNetClient is actually a function of ConceptNet 4’s similarity score and ConceptNet 5’s relatedness score. If there is a structure mapping in the valid list that has relation pairs with similarity scores all greater than or equal to a cutoff value (0.33), the two relations are considered similar and the rule match succeeds. If no structure mapping has relation pairs that are all similar, the rule match fails.
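In rough Java terms, the cutoff test over the valid structure mappings looks like the sketch below. Representing a mapping as a list of verb pairs is an assumption made for illustration, and the similarity function stands in for the combined ConceptNet 4/5 score from Chapter 4.

import java.util.List;
import java.util.function.BiFunction;

// Illustrative sketch of the CSERM cutoff check, not Genesis's actual code.
final class SimilarityMatchSketch {
    static final double CUTOFF = 0.33;

    // Each valid structure mapping is represented as a list of {ruleVerb, storyVerb} pairs.
    static boolean ruleMatches(List<List<String[]>> validMappings,
                               BiFunction<String, String, Double> similarity) {
        for (List<String[]> mapping : validMappings) {
            boolean allPairsSimilar = true;
            for (String[] pair : mapping) {
                if (similarity.apply(pair[0], pair[1]) < CUTOFF) {
                    allPairsSimilar = false;
                    break;
                }
            }
            if (allPairsSimilar) {
                return true;   // one fully similar mapping is enough for the rule to fire
            }
        }
        return false;          // no mapping passed the cutoff, so the rule match fails
    }
}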

Once a rule match has been discovered, the procedure returns the associated bindings to signal that a successful match has been found. The procedure also attaches a ConceptNet justification property to these bindings to explain why this match was made (properties are discussed more in section 2.1). Because this match goes beyond using explicit Genesis rules or WordNet data and incorporates the subtler ConceptNet similarity information, it’s useful to capture a justification for the match so it can be displayed to the user and explain why Genesis made this causal connection. The ConceptNetQueryResult obtained from sending a similarity query to ConceptNet is converted into a ConceptNetJustification, and the rule that CSERM used as its base is added to this justification. This justification object is added as the value of the ConceptNet justification property on the returned bindings. When the causal connection is finally made that resulted from this match, a Genesis entity that represents the causal connection is instantiated. All of the accumulated properties on the bindings are added as properties on the causal connection entity.

5.3 Discussion

CSERM has two important limitations: it can introduce a significant slowdown on stories with large rulesets, and it relies on a cutoff value to determine how high a similarity score must be for two concepts to be treated as similar. Section 8.1 discusses the potential slowdown and a possible fix. Currently, the cutoff is 0.33, an experimentally chosen value. Section 8.2 discusses a method for making this cutoff value dynamically adjust in response to story processing rather than a static, carefully chosen constant.

I have observed CSERM to be quite powerful. The technique can amplify the applicability of a rule by a factor of ten to one hundred. Of course, this amplification factor is not uniform and depends on the chosen cutoff and the generality of the verbs within the rule. Section 7.4 discusses observed rule amplification factors and their derivations.

Genesis uses its “dereferencing” process to resolve equivalent entities or parts of entities into the same object. All entities added to the story are fed through this dereferencing process before they are actually added. However, dereferencing can result in entity properties getting lost if care is not taken to preserve them. I developed the notion of identifier properties to preserve entity properties. Identifier properties are properties intrinsic to the entity’s identity, meaning that an entity with an identifier property and an otherwise equivalent entity without that property will not be resolved into the same object during story processing. Any property can be set as an identifier property so that it will be used in this resolution process. The ConceptNet justification property is used as an identifier property.

The similarity information ConceptNet provides is useful in a different way than a resource such as WordNet or a thesaurus. WordNet is constrained by its hierarchical structure, and while a thesaurus does provide similar words, this similarity is very formal and doesn’t usually capture much commonsense knowledge about how these words are actually used. For instance, ConceptNet knows that words such as “deceive,” “threaten,” “rob,” “anger,” “attack,” “torture,” “dishonor,” “murder,” and

“embarrass” are all similar to “harm,” and a thesaurus does not. ConceptNet’s vast amount of commonsense knowledge influences what it sees as similar, resulting in a unique but useful variety of similarity. Other resources have their strengths, but I believe ConceptNet’s notion of similarity to be special.

CSERM is an effective way of expanding Genesis’s ruleset by applying its rules in a commonsense manner. Previously, a story’s events had to nearly exactly match rule antecedents in order for the rule to fire. Now, Genesis can apply its rules much more flexibly by identifying when story events are similar enough to a rule’s antecedents that the rule should fire. This allows the user to write more general rules that cover a wide variety of commonsense applications instead of specifying each specific application in its own rule. CSERM is an important step in cutting down the number of user-defined rules necessary for story understanding, but it still requires the user to construct rules. Chapter 6 examines a way that Genesis can use ConceptNet’s knowledge instead of Genesis rules to make causal connections, allowing for effective story understanding with drastically fewer or even zero user-given rules.

Chapter 6

ASPIRE

6.1 Overview

Goals are essential to human nature. Humans are constantly forming goals and taking actions towards them. Kenrick, Neuberg, Griskevicius, Becker, and Schaller articulated the fundamental-motives framework which describes how instinctual human goals such as self-preservation, finding a mate, and child rearing affect our biology, cognition, and behavior. These low-level goals can change the way the brain processes information, altering what people pay attention to and what people remember (Kenrick, Neuberg, Griskevicius, Becker, & Schaller, 2010). At a more conscious level, goals are an effective motivational tool. Task performance improves when specific, measurable goals are set and progress towards these goals is tracked (Locke & Latham, 2006).

Because goals drive humans at both a subconscious and conscious level, goals are prevalent in the stories humans tell as well. Schank recognized the importance of identifying character goals when processing natural language in 1977. Enabling Genesis to better understand goals, their causes, and actions taken towards completing them improves Genesis’s comprehension of goal-oriented stories. Furthermore, getting Genesis to understand these stories with a minimal set of user-defined rules is a big step towards allowing Genesis to read and understand naturally occurring stories with little user effort.

ConceptNet is a great resource for analyzing goals because it contains a multitude

of commonsense knowledge about what causes goals and what fulfills goals. The relations “Causes Desire,” “Motivated By Goal,” “Used For,” and “Has Subevent” are all particularly relevant to the goal domain. ConceptNet 4 contains knowledge such as reading is motivated by the goal of learning and needing money causes the desire to apply for a job. I have enabled Genesis to leverage this knowledge while processing a story to both identify events that cause characters to have a certain goal and keep track of the actions they take to complete these goals. By using the knowledge contained in ConceptNet, the ruleset necessary for understanding goal-centric stories can be quite minimal. Depending on the content of the story and how well it aligns with ConceptNet’s knowledge, the need for a user-defined ruleset can even be entirely eliminated!

I established ASPIRE’s framework of analyzing characters’ goals, their causes, and the actions taken to complete them, and I implemented the ASPIRE system in approximately 650 lines. ASPIRE works closely with ConceptNet, but is not dependent on it; it is able to function using Genesis rules and WordNet knowledge as well. However, ConceptNet makes ASPIRE much more powerful because it allows ASPIRE to function independently from user-defined rules; instead, ASPIRE uses common sense from ConceptNet, and the ConceptNet knowledge currently incorporated is equivalent to approximately 30,000 rules. With ASPIRE and ConceptNet, Genesis can much more completely identify the causal relationships between the story’s events with far less user effort, and all of Genesis’s powerful story analysis capabilities build off the understanding ASPIRE grants Genesis. Without ASPIRE and ConceptNet, Genesis would fail to see how many, if not all, of a goal-centric story’s events are connected, and any further story analysis would greatly suffer.

6.2 Framework

More specifically, ASPIRE works by maintaining a list of candidate character goals as Genesis sequentially reads the story, adding to the list when it detects that a story event might cause a character in the story to have a goal. ASPIRE analyzes

every received event to see if it causes a candidate character goal or contributes to any of the existing ones. It even applies this analysis to inferred events that ASPIRE itself created, allowing its inferences to build off one another. For instance, consider a story about a man named Sean that contains the event “Sean is gaining weight.” ASPIRE checks if this event causes a goal by looking at the rules Genesis was given and by consulting ConceptNet. To query ConceptNet, ASPIRE forms the feature “gain weight Causes Desire ___” and uses ConceptNetClient to get a list of scored ConceptNet assertions that complete this feature. In this case, ConceptNet knows that gaining weight causes the desire of exercising, so the returned list contains the assertion “gain weight Causes Desire exercise” among other relevant assertions. Therefore, ASPIRE adds the candidate character goal “Sean wants to exercise” to its list and continues reading the story.
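Sketched in self-contained Java, this goal-causation step looks roughly like the following. The map of completions stands in for the scored assertions returned by a ConceptNetFeatureQuery, and the cutoff mirrors the constant discussed later in this section; the class is illustrative, not ASPIRE’s actual code.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch of candidate-goal formation from a "Causes Desire" feature query.
final class GoalCausationSketch {
    static final double CUTOFF = 0.3;

    // "completions" maps each goal concept that completes the feature
    // "<event concept> Causes Desire ___" to its ConceptNet score.
    static List<String> candidateGoals(String character, Map<String, Double> completions) {
        List<String> goals = new ArrayList<>();
        for (Map.Entry<String, Double> completion : completions.entrySet()) {
            if (completion.getValue() >= CUTOFF) {
                // e.g. "exercise" yields the candidate goal "Sean wants to exercise"
                goals.add(character + " wants to " + completion.getKey());
            }
        }
        return goals;
    }
}

For “Sean is gaining weight,” the completions include “exercise,” so “Sean wants to exercise” joins the candidate list.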

Suppose the story continues and contains some later event “Sean rides his bike to work.” When it receives this event (and for every received event), ASPIRE iterates over its candidate character goal list and checks if the event contributes to each candidate character goal. To detect such a contribution, ASPIRE first tries to match the event to the candidate character goal using traditional Genesis matching. This would take care of events such as “Sean exercises,” which exactly matches the exercise candidate character goal, and “Sean works out,” which matches the exercise candidate character goal with the help of WordNet because “work out” fits cleanly under “exercise” in WordNet’s hierarchy. If this match succeeds, then ASPIRE concludes that the event contributes to the candidate character goal. Otherwise, ConceptNet is consulted because traditional matching failed. When processing the event “Sean rides his bike to work,” ASPIRE extracts the concept “ride bike” from the event, “exercise” from the candidate character goal, and uses ConceptNetClient to obtain the scores for the assertions “ride bike Motivated By Goal exercise,” “ride bike Used For exercise,” and “exercise Has Subevent ride bike.” If ConceptNet confirms any of these assertions, ASPIRE concludes that the event contributes to the candidate character goal; otherwise, with both traditional matching and ConceptNet failing to make any connection between the event and the goal, ASPIRE concludes the event does not

contribute to the goal.
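The contribution test just described can be summarized with the sketch below. The two predicates stand in for Genesis’s traditional matcher and for a ConceptNet assertion query whose score clears the cutoff; all names are illustrative rather than ASPIRE’s actual code.

import java.util.function.BiPredicate;

// Illustrative sketch of the goal-contribution test.
final class GoalContributionSketch {

    // Minimal three-argument predicate; java.util.function provides no such interface.
    interface AssertionTest {
        boolean holds(String leftConcept, String relation, String rightConcept);
    }

    static boolean contributesToGoal(String actionConcept, String goalConcept,
                                     BiPredicate<String, String> traditionalMatch,
                                     AssertionTest conceptNet) {
        // Try traditional Genesis matching (which already consults WordNet) first.
        if (traditionalMatch.test(actionConcept, goalConcept)) {
            return true;
        }
        // Otherwise fall back on ConceptNet's goal-related relations,
        // e.g. "ride bike Motivated By Goal exercise" or "exercise Has Subevent ride bike".
        return conceptNet.holds(actionConcept, "MotivatedByGoal", goalConcept)
                || conceptNet.holds(actionConcept, "UsedFor", goalConcept)
                || conceptNet.holds(goalConcept, "HasSubevent", actionConcept);
    }
}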

If ASPIRE concludes that a story event contributes to a candidate character goal, Genesis forms causal connections and adds new inferred events to the story based on ASPIRE’s analysis. Genesis instantiates the candidate character goal and explicitly adds it to the story because ASPIRE has inferred that this goal was present. Returning to the Sean example, Genesis adds “Sean wants to exercise” to the story. Genesis also forms causal connections between the inferred goal’s cause, the inferred goal, and the action taken that contributes to the inferred goal (i.e. “Sean gains weight,” “Sean wants to exercise,” and “Sean rides his bike to work,” respectively). Genesis goes one step further by instantiating the completion of the goal and adding it to the story as well (“Sean exercises”). Genesis explicitly adds the inferred goal and inferred goal completion to the story so that these inferences can interact with other Genesis rules that discuss the desire to exercise or exercise itself. Without explicitly acknowledging that Sean exercises when he rides his bike, Genesis rules about exercise would not fire because Genesis does not match “ride bike” with “exercise.” Furthermore, this explicit acknowledgment allows “Sean exercises” to feed back into ASPIRE and cause or contribute to other goals. For instance, Sean exercising could contribute to a candidate character goal “Sean wants to lose weight” which was formed earlier in the story. By explicitly instantiating both the inferred goal and the inferred goal completion and adding them to the story, ASPIRE makes plain the common sense that’s obvious to humans so that it can process these commonsense inferences just as it processes explicit story events. Genesis shows the causal connections connecting story events in its cause graph display, and the cause graph for the exercise example is shown in Figure 6.1.

Note that the inferred goal and inferred goal completion are only inserted into the story once ASPIRE sees an action taken that contributes to a candidate character goal. Candidate character goals on their own do not affect the story in any way—a character action must first prove a candidate goal credible before Genesis acts on ASPIRE’s analysis. In this way, ASPIRE is enabling Genesis to infer the middle element, a character goal, in a latent causal chain connecting the cause of the goal,

the goal, and an action taken towards the goal. If this chain is not complete, the inferred goal remains a candidate—a potential interpretation of the story’s events, but one that lacks sufficient evidence. It’s possible that Sean’s weight gain caused him to diet instead of exercise, or maybe he didn’t respond at all. Therefore, the candidate character goal “Sean wants to exercise” is just an idea Genesis is holding in the back of its mind until it sees evidence supporting it.

Figure 6.1: An example of a cause graph produced by ASPIRE. Cause graphs show how different story events are causally related. Story events are represented with circles, and arrows signify a causal relationship. Here, ASPIRE has made two inferences, “Sean wants to exercise” and “Sean exercises,” from two explicitly stated story events, “Sean is gaining weight” and “Sean rides his bike to work.”

The way ASPIRE analyzes goals is reminiscent of Marvin Minsky’s IF-DO-THEN rules proposed in his book Society of Mind (Minsky, 1988). If Sean is gaining weight (IF), he may want to exercise (DO); then, he rides his bike (THEN). In this chain, ASPIRE is inferring the DO from the IF and the THEN. We can shift this IF-DO-THEN chain over to apply to the goal completion as well: if Sean wants to exercise (IF), he may take the action of riding his bike (DO); then, he has exercised (THEN). Given the middle element in this chain (the action) and evidence that the IF is true (the cause of the candidate character goal), Genesis instantiates the IF (the candidate character goal) and the THEN (the goal completion).

6.3 Implementation

ASPIRE only requires processing one cause of a potential goal to form a candidate character goal. Similarly, ASPIRE only requires one goal contribution action to add the inferred character goal and goal completion to the story. While it is usually safe to add the inferred goal after seeing one potential cause, it’s far less safe to add the

inferred goal completion after only seeing one action because ASPIRE lacks the ability to distinguish actions that complete a goal from actions that merely contribute to a goal. However, it’s not immediately clear how ASPIRE would acquire such knowledge, as ConceptNet does not provide it. This limitation is discussed further in section 8.1.

ASPIRE uses ConceptNetClient to query ConceptNet for information about goals, specifically using the “Causes Desire” relation to infer goal causation and the “Motivated By Goal,” “Used For,” and “Has Subevent” relations to infer an action’s contribution to a goal. ConceptNetClient contacts ConceptNet 4 to gather this information as discussed in Chapter 4. ASPIRE repeatedly queries ConceptNet over the course of reading a story, receiving scores for assertions of interest. However, to decide whether an assertion is true and should be acted upon, ASPIRE compares the received score to a cutoff value. If the score is greater than or equal to the cutoff value, ASPIRE interprets the assertion as true; otherwise, the assertion is seen as false and does not affect story processing. Currently, the cutoff is a constant (0.3), just as the common sense enabled rule matching cutoff is fixed. However, section 8.2 discusses ways this value can be dynamically chosen.

ASPIRE detects both candidate character goals and goal contribution actions for every story event it receives. Events are not constrained to relate to just one goal—an event can cause multiple candidate character goals, and an event can contribute to multiple goals.

6.3.1 Interaction with Genesis Rules

In addition to ConceptNet knowledge, ASPIRE uses Genesis rules to detect when a story event results in a candidate character goal. When first initialized, ASPIRE reads over all the Genesis rules, checking if each rule’s consequent describes a character getting a goal (e.g. “If XX loves YY, XX may want to kiss YY”). If it does, ASPIRE stores the goal and its cause. When ASPIRE receives a story event and decides if this event causes any goals, it scans over this list of goals and their causes in addition to asking ConceptNet what it thinks. If the story event matches any of the stored goal causes, that goal is added as a candidate character goal. ASPIRE does not currently

support Genesis rules that describe goal contribution actions (e.g. “If XX wants to kiss YY, XX may embrace YY”), but the system could easily be extended to have this capability. When a story event matches a Genesis rule that results in a goal, a candidate character goal is formed which substitutes the bindings from the match into the goal. For example, if ASPIRE received the rule “If XX loves YY, XX may want to kiss YY” and the story event “Matt loves Jackie,” the formed candidate character goal would be “Matt wants to kiss Jackie.” ASPIRE performs this substitution to keep track of what character has this goal and what other people are involved. ASPIRE shouldn’t match a potential later story event “Matt kisses Helen” to his goal of kissing Jackie because the people do not match! These bindings are not required to be fully specified. Suppose ASPIRE was given the rule “If XX is angry, XX may want to harm YY” and the story event “Matt is angry.” Because YY has not been constrained to any particular person, ASPIRE forms the candidate character goal “Matt wants to harm someone,” which can match any story event that describes Matt harming a person such as “Matt harms Josh” or “Matt harms Jackie.”
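A small sketch of this binding substitution follows. The goal template, variable names, and the “someone” placeholder follow the examples above; the method is illustrative rather than ASPIRE’s actual code.

import java.util.Map;

// Illustrative sketch of substituting rule-match bindings into a goal consequent.
final class GoalInstantiationSketch {
    static String instantiateGoal(String goalTemplate, Map<String, String> bindings) {
        String goal = goalTemplate;
        for (Map.Entry<String, String> binding : bindings.entrySet()) {
            goal = goal.replace(binding.getKey(), binding.getValue());
        }
        // Variables left unbound by the match become the unconstrained "someone".
        return goal.replace("XX", "someone").replace("YY", "someone");
    }
}

With the consequent “XX wants to kiss YY” and the bindings XX→Matt and YY→Jackie, the sketch produces “Matt wants to kiss Jackie”; with only XX bound, “XX wants to harm YY” becomes “Matt wants to harm someone.”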

6.3.2 Concept Extraction

Extracting concepts from Genesis entities requires some care. As mentioned in section 3.1, although user-defined story patterns such as those describing “Pyrrhic victory” are commonly referred to as “concepts” in several other Genesis texts, here (and throughout this thesis) I am using “concept” only in the ConceptNet sense of the word. ASPIRE extracts concepts from story events when it queries ConceptNet to see what candidate character goals a story event may cause. It also extracts concepts from both the candidate character goal and the story event when determining if a story event contributes to a goal. I devised my method of concept extraction by studying the format of ConceptNet’s goal-related knowledge, observing common patterns, and constructing procedures that take advantage of these patterns in a somewhat cautious yet effective manner.

Nearly all of the concepts ASPIRE extracts from Genesis entities are verbs or

verb phrases rather than nouns or noun phrases. Verbs and verb phrases are simpler to extract because their meaning is less dependent on the rest of the sentence. For instance, it’s safe to extract the concept “drink beer” from “Matt drinks a beer on a hot day.” However, although ConceptNet has useful goal knowledge about nouns, it’s not always safe to extract a noun direct object as a concept. If ASPIRE sees that “Matt buys a book,” it’s appropriate to extract both “buy book” and “book” to see what goals Matt may have as a result, and ConceptNet would give ASPIRE assertions such as “book Causes Desire read.” However, if the event were instead “Matt loses his book,” while it would still be appropriate to extract “lose book” as a concept, “book” by itself would not be appropriate because ConceptNet provides knowledge about possessing a book, not losing it. I have not found a simple, comprehensive, computational way of deducing whether or not it’s safe to extract the noun direct object as a concept, so ASPIRE takes a cautious approach and only extracts verbs or verb phrases. There are two exceptions where the noun direct object is indeed extracted: if the verb is a form of “feel” (e.g. “sad” would be extracted from “Matt feels sad”) or of “have” (e.g. “compassion” would be extracted from “Matt has compassion”).
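The cautious extraction policy just described can be sketched as follows. The inputs are assumed to be already-lemmatized words from a simple subject-verb-object event, and the method is illustrative rather than ASPIRE’s actual code.

import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of cautious concept extraction.
final class ConceptExtractionSketch {
    static List<String> extractConcepts(String verb, String directObject) {
        List<String> concepts = new ArrayList<>();
        if (directObject == null) {
            concepts.add(verb);                    // "Sean exercises" -> "exercise"
            return concepts;
        }
        concepts.add(verb + " " + directObject);   // "Matt drinks a beer" -> "drink beer"
        // The direct object alone is extracted only after forms of "feel" and "have".
        if (verb.equals("feel") || verb.equals("have")) {
            concepts.add(directObject);            // "Matt feels sad" -> "sad"
        }
        return concepts;
    }
}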

Goal Causation

When ASPIRE uses ConceptNet to detect what goals a story event may cause, it can extract both verbs and verb phrases as concepts. Proper nouns never appear in extracted verb phrase concepts. ConceptNet doesn’t contain knowledge about specific people, places, or things, just general knowledge, so it’s not useful to form concept phrases that refer to specific story proper nouns. However, ASPIRE can still form goals related to specific characters using ConceptNet knowledge. Consider the story event “Matt loves Jackie.” ASPIRE detects this event has a proper noun as its direct object, so it extracts only the concept “love” and queries ConceptNet to see what desires it causes. ConceptNet responds with assertions such as “love Causes Desire kiss,” and ASPIRE forms the candidate character goal “Matt wants to kiss Jackie.” It’s important for the direct object, Jackie, to transfer from the cause of the goal to the goal itself. We wouldn’t want Matt’s love for Jackie to form the general goal

“Matt wants to kiss,” as this would match unrelated story events such as “Matt kisses Helen.” If a proper noun appears as the direct object in the cause of a goal, ASPIRE assumes it is also a direct object in the goal itself. This follows the observed structure of ConceptNet’s knowledge. In this proper noun case, ASPIRE extracts just the verb or the verb paired with any modifying adverbs as concepts.

Concept extraction for direct objects that are not proper nouns is handled differently. In this case, only direct object verb phrases, not sole verbs, are extracted as concepts. For example, ASPIRE would extract the concept “win game” from the story event “Matt wins the game,” and ConceptNet would give it knowledge such as “win game Causes Desire celebrate.” Common sense tells us in this situation Matt may want to celebrate, not celebrate the game specifically, so ASPIRE does not transfer the direct object “game” to the goal in this case. While there are rare cases when transferring the direct object would be appropriate, ConceptNet 4’s knowledge is almost always structured such that when the cause is a verb phrase, the goal concept in Causes Desire assertions explicitly includes any objects it takes. Therefore, there’s no need for any transfer of direct object. Sole verbs are not extracted as concepts in this case because the knowledge about just the verb may not be relevant when the object is taken into account (e.g. “fight” vs. “fight inflation”), and ASPIRE strives to be as specific and precise as possible. If the story event is intransitive, the verb by itself is extracted. Verb phrases consisting of the verb and any modifying adverbs are extracted as well.

Goal Contribution

Verbs, verb phrases, proper nouns, and transitivity also come into play during concept extraction for goal contribution detection. Here, ASPIRE must decide if a story event contributes to a candidate character goal. This task is somewhat more involved than goal causation concept extraction because concepts must be extracted from both the candidate character goal and the story event and matched together. There are three possible cases: both the story event and the goal are transitive, their transitivities do not match, or they are both intransitive.

Importantly, when matching a candidate character goal to a story event, the subjects of the two entities must match to ensure the story event describes the character of interest performing an action, not someone else. If the story event matches the candidate character goal using traditional Genesis matching, the event is identified as contributing to the goal and no concept extraction or ConceptNet communication is necessary. This traditional matching uses WordNet knowledge, so the goal “Matt wants to play a sport” would match the action “Matt plays hockey.” However, traditional Genesis matching is usually not sufficient, so ASPIRE must proceed with concept extraction.

Verb phrases are used whenever one of the entities is transitive, as ASPIRE prefers to be as precise as possible and match an entire phrase instead of just the verb. As mentioned in the previous section on goal causation concept extraction, verb concept phrases do not include any proper nouns because ConceptNet does not contain information about specific people, places, or things. Instead, verb phrases contain the verb with either a modifying adverb or a direct object that isn’t a proper noun.

When a story event or candidate character goal contains multiple possible concepts, ASPIRE continues extracting concepts from both and comparing the two until it detects goal contribution or it exhausts all possible concept pairings. ASPIRE currently does not continue querying ConceptNet after a match has been detected to find all possible ways that the action contributes to the goal, a decision made to improve performance. Such behavior could be easily implemented if desired.

Consider the case where both the story event and candidate character goal are transitive. ASPIRE first tries to extract direct object verb phrases from the event and goal and match them together. For instance, for the event “Matt plays violin” and candidate goal “Matt wants to make music,” ASPIRE extracts “play violin” and “make music” and ConceptNet tells it “play violin Motivated By Goal make music.” If ASPIRE fails to find a match, it checks if both the event and goal have the same proper noun as an object. If so, ASPIRE extracts just the verbs from the event and goal and queries ConceptNet to detect any causal relationship. This handles cases such as matching the event “Matt punches Josh” to the goal “Matt wants to hurt Josh”

using the knowledge “punch Motivated By Goal hurt.” If both the verb phrase and proper noun phases of matching fail in this doubly transitive case, ASPIRE concludes the event does not contribute to the goal.

Next, let us examine the case where the transitivity of the story event does not match the transitivity of the goal. It’s possible for an event to not match a goal’s transitivity but still contribute to the goal, such as the contribution of the event “Matt sings” to the goal “Matt wants to make music.” ASPIRE first tries to match direct object verb phrases of the transitive entity against the verb and any adverbial verb phrases from the intransitive entity. If this fails, ASPIRE concludes there is no goal contribution present. ASPIRE does not extract just the verb from the transitive entity because of its preference for specificity during concept extraction.

In the third and final case, both the story event and candidate character goal are intransitive. ASPIRE extracts just the verb and any adverbial verb phrases from both entities and queries ConceptNet to see if the two are related. For example, the story event “Matt lies down” would match “Matt wants to relax” through this final case because both “lie down” and “relax” are intransitive.
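The three cases just described boil down to a choice of which pair of extracted concepts to send to ConceptNet. The sketch below is illustrative rather than ASPIRE’s actual code; verb-phrase arguments are taken to be null when no non-proper-noun phrase is available, and the returned pair is ordered event concept first, goal concept second.

import java.util.Optional;

// Illustrative sketch of selecting the concepts to compare during goal-contribution detection.
final class ContributionConceptsSketch {
    static Optional<String[]> conceptsToCompare(
            String eventVerb, String eventPhrase, boolean eventTransitive,
            String goalVerb, String goalPhrase, boolean goalTransitive,
            boolean sameProperNounObject) {
        if (eventTransitive && goalTransitive) {
            // Case 1: try full verb phrases ("play violin" / "make music") ...
            if (eventPhrase != null && goalPhrase != null) {
                return Optional.of(new String[] {eventPhrase, goalPhrase});
            }
            // ... falling back to bare verbs only when both objects are the same proper noun
            // ("punch" / "hurt" for "Matt punches Josh" and "Matt wants to hurt Josh").
            return sameProperNounObject
                    ? Optional.of(new String[] {eventVerb, goalVerb})
                    : Optional.empty();
        }
        if (eventTransitive != goalTransitive) {
            // Case 2: the transitive side's verb phrase against the intransitive side's verb
            // ("sing" / "make music").
            String eventConcept = eventTransitive ? eventPhrase : eventVerb;
            String goalConcept = goalTransitive ? goalPhrase : goalVerb;
            return (eventConcept == null || goalConcept == null)
                    ? Optional.empty()
                    : Optional.of(new String[] {eventConcept, goalConcept});
        }
        // Case 3: both intransitive - compare the bare verbs ("lie down" / "relax").
        return Optional.of(new String[] {eventVerb, goalVerb});
    }
}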

Bindings

ASPIRE makes an important assumption about the structure of the ConceptNet knowledge used to form a candidate character goal or detect a goal contribution action. Throughout its processing, ASPIRE assumes at times that the bindings, or the subject and optional object between which a concept verb is nested, are consistent between the two concepts in an assertion. ASPIRE makes this assumption because it almost always matches the structure of ConceptNet knowledge, but this assumption does not invariably hold. ConceptNet unfortunately does not provide information about how the subject and objects a concept verb takes relate to the subject and objects of the other concept in an assertion, whereas Genesis rules make these bindings explicit. For example, ConceptNet 4 contains the assertion “lie Causes Desire discover truth.” Common sense tells us that if Matt lies to Josh, Josh will want to discover the truth. However, given the event “Matt lies to Josh,” ASPIRE would use this assertion

to form the candidate character goal “Matt wants to discover the truth” because it uses the same subject for the goal as was present in the goal cause. This limitation extends beyond goal causation—the assertion “hurt Has Subevent feel pain” also has two concepts with inconsistent bindings, and “Has Subevent” assertions are used to infer goal contribution.

Fortunately, the concepts in the vast majority of ConceptNet assertions have consistent bindings, so such an ASPIRE misunderstanding is a rarity. Furthermore, ASPIRE would only make inferences based on this misunderstanding if it sees supporting evidence in the story. For the faulty goal “Matt wants to discover the truth” to be inferred and inserted into the story, ASPIRE would need to see Matt taking actions to discover the truth. Because Matt was the one that lied, it’s not likely for a story to contain events that show Matt searching for the truth. Even if ASPIRE misapplies ConceptNet knowledge to form a nonsensical idea, it needs to see explicit evidence of this idea in the story to complete an inference. Standard stories do not provide ASPIRE with any evidence of these nonsensical ideas, so it’s unlikely for ASPIRE’s misunderstandings to ever affect the story.

6.3.3 ASPIRE-created Properties

Whenever ASPIRE instantiates an inferred goal, inferred goal completion, or an inferred causal connection and adds this created Genesis entity to the story, it adds a property to the entity marking it as created by the ASPIRE system. Entity properties are discussed more in section 2.1. If ConceptNet knowledge was necessary in making an instantiated causal connection entity, ASPIRE also adds a ConceptNet justification property to the entity with the relevant justification object as the value of this property. These properties are used to keep track of what ASPIRE inferred and why, useful information to display to the user and necessary for presenting the elaboration graph, a central Genesis display. These properties are all used as identifier properties so that they are not lost during Genesis’s dereferencing process. Identifier properties are further discussed in section 5.3.

ASPIRE adds inferred entities to the story as it’s reading. However, if an inference

is later explicitly stated in the story text, Genesis must erase the ASPIRE property and its optional ConceptNet justification from this entity because it has been explicitly confirmed and is no longer an inference. This is similar to when a human reader makes a prediction or connection in their head which is later explicitly confirmed.

6.4 Recommendations for Knowledge Organization

ASPIRE is exciting because it is the first time ConceptNet knowledge has been directly incorporated into Genesis story processing completely independently from user-defined rules. This milestone introduces a discussion about what sort of knowledge belongs in ConceptNet and what sort of knowledge is better expressed as a Genesis rule. While the two representations do have some overlap, there are important differences, and even when either representation is appropriate I believe certain style guidelines should be followed for future Genesis programmers and users to maintain useful organizational structure.

ConceptNet knowledge is more limited than Genesis rules. Section 3.3 discusses how ConceptNet only describes binary relationships, while Genesis rules can contain an unbounded number of consequents and antecedents. Section 6.3.2 notes that ConceptNet knowledge does not express the subjects and objects concepts take as unambiguously as a Genesis rule, which can result in some confusion when applying a piece of ConceptNet knowledge. Additionally, concepts in ConceptNet are not disambiguated—the same concept “bark” is used for both the sound a dog makes and the part of a tree. Therefore, if the knowledge to be expressed is vulnerable to any of these ConceptNet limitations, I recommend encoding the knowledge as a Genesis rule.

Moving beyond practical limitations, I also recommend following a small set of organizational principles. ConceptNet knowledge should be common sense. While this term lacks a precise definition, it should be widely known, rather general, and feel obvious. Beyond that, I don’t think many other restrictions should be applied to ConceptNet additions because it’s a general resource used for projects other than

Genesis. Genesis rules are better for capturing domain-specific knowledge and more complex ideas, and their representation has more nuance to allow for these use cases. There is not a sharp distinction between knowledge better suited for ConceptNet and knowledge better suited for Genesis, but I believe following these organizational principles allows for cleaner, more consistent data in each format.

Chapter 7

Results

In this chapter I describe the performance of common sense enabled rule matching (CSERM) and the ASPIRE system on three example stories of increasing complexity. These new capabilities allow Genesis to make connections it previously could not identify; moreover, Genesis justifies these connections with human-readable text that describes the relevant knowledge obtained from ConceptNet. Although I present only three examples here, ConceptNet has a vast amount of commonsense knowledge, and CSERM and ASPIRE assist Genesis with large amounts of this knowledge whenever it’s relevant in a story. This chapter also contains a derivation of approximate measures of the amount of common sense Genesis gains through interacting with ConceptNet in the ways I have implemented. Finally, I show an example of how my work allows other Genesis features to perform better. The full stories, rulesets, and ConceptNet justifications for the three example stories in this chapter are included in the Appendix.

7.1 Common Sense Enabled Rule Matching

Consider the following simple story about an instance of high-school bullying:

Matt and Josh are persons. Matt and Josh are in high school. Matt is insecure because Matt has low self-esteem. Matt bullies Josh. Josh

becomes angry. Josh makes a plan because Josh loathes Matt. Josh embarrasses Matt. Matt is sad. Matt stops bothering Josh.

A human reader is likely to see that Josh becomes angry due to Matt’s bullying, Josh embarrasses Matt as an attempt to get back at him, and this retaliation causes Matt to stop bullying Josh. A user of Genesis might construct the following ruleset to allow Genesis to make those connections:

• XX and YY are persons.
• XX may bully YY because XX is insecure.
• If XX annoys YY, XX angers YY.
• If XX angers YY, YY may hate XX.
• If XX dislikes YY, XX may harm YY.
• If XX harms YY, YY may stop harming XX.

However, this ruleset will not help Genesis very much in analyzing this story because its language differs from that of the story. The ruleset discusses general concepts such as “annoy,” “hate,” and “harm,” whereas the story never uses these terms explicitly. Instead, the story discusses concepts such as “bully,” “loathe,” “embarrass,” and “bother.” WordNet helps Genesis in matching some of these concepts, but the information it provides is not complete enough to allow Genesis to correctly apply all relevant rules to the story’s events. WordNet provides mostly hierarchical, strictly factual information, whereas many of the desired matches in this story are less precise and require common sense. WordNet is especially unhelpful in providing information about verbs, which are harder to categorize in a hierarchical manner. CSERM helps bridge this gap.

Two versions of the elaboration graph Genesis generated from reading this story are shown in Figure 7.1. In the top version, CSERM is turned off; in the bottom, turned on. Each event in the story, including some that were not explicitly stated but Genesis inferred and added to the story, is shown in a box. A line between two boxed events represents a causal connection that Genesis has established between the two events. Connections made with the help of ConceptNet are dotted. The elaboration graph uses an extensive color scheme to precisely specify the type of causal connection or inferred entity, which I omit because it is not directly relevant to

Figure 7.1: Two elaboration graphs generated by Genesis for the bullying story. In the top graph, CSERM is turned off; in the bottom, turned on. Connections made with the help of ConceptNet are dotted. CSERM doubles the number of events in the elaboration graph and increases the number of causal connections formed from three to ten.

my work; regardless of color, all dotted connections indicate ConceptNet knowledge was necessary in forming the connection.

CSERM nearly doubles the number of events in the elaboration graph because Genesis is now able to apply the rules to the story more completely. The number of causal connections identified increases from three to eight. With CSERM, Genesis is able to connect Matt’s initial bullying of Josh with Josh’s loathing of Matt, the loathing with Josh striking back, and the retaliation with Matt ending his bullying activity. Because these additional causal connections were made, the rest of Genesis’s features can thrive off the more complete reasoning substrate. For instance, summarization and question answering depend heavily on the identified causal connections, so CSERM has given Genesis the ability to better complete other reasoning tasks related to this story.

Figure 7.2: A justification produced by Genesis explaining why a common sense enabled rule match was made. Each justification names the causal connection, the rule used as a base for CSERM, and the relevant scored similarity knowledge.

Genesis is also able to explain why and how these additional connections were made. Figure 7.2 shows a section of the “ConceptNet Knowledge” pane of the Sources section which explains why Genesis believes Josh embarrassing Matt causes Matt to stop bothering Josh. The provided justification states that harm and bother are similar and harm and embarrass are similar, gives the ConceptNet scores for their similarity, and references the rule that is the basis of the similarity match. Note that these similarity scores are combinations of the ConceptNet 4 and ConceptNet 5 scores as described in Chapter 4. WordNet threads are used for a couple parts

of this story analysis, such as enabling “loathe” in the story text to match “dislike” in the story rules. These WordNet matches are typically straightforward. All of the ConceptNet matches provide justification in this panel, though, because these matches are typically more involved. This “ConceptNet Knowledge” pane is displayed alongside the “Commonsense Knowledge” pane, which holds the rules Genesis was given, and the “Story” pane, which holds the explicit story text.

7.2 ASPIRE

The following is a relatable, simple story titled “A Long Day” about a tired worker and his path towards progress:

Fred is a person. John is a person. John is Fred’s roommate. Fred works hard at the office. Fred leaves work. Fred buys beer. Fred comes home. Fred drinks beer and watches television. Fred lies down. John comes home. John plays loud music and wakes up Fred. Fred sighs from exasperation. Fred leaves the apartment and takes a walk. The quiet night helps Fred clear his mind. Fred calms down and heads home.

Although there are many possible interpretations of the story, a human reader may see the actions Fred takes after work as part of an effort to wind down after a hard day’s work. It’s also fairly natural to see that when Fred is disturbed by his roommate, he gets upset and takes a walk as part of an effort to calm down. A Genesis user might give it the following rule to help it better understand the story’s events:

• If XX is resting and YY disturbs XX, then XX feels anger towards YY.

Note that only one rule was given to Genesis, and neither of the events described in the rule’s antecedent—resting or being disturbed—is explicitly mentioned in the story. However, to the analyzer of this story, it may be obvious that actions such as watching television and lying down are things one does to relax, and that waking

someone up is a form of disturbing that person. This is all commonsense knowledge, and with ConceptNet’s help, Genesis can apply this knowledge to the story to comprehend it in a much more complete manner.

The elaboration graph for Genesis’s ASPIRE-assisted analysis of this story is shown in Figure 7.3. ASPIRE events and connections are represented in the elaboration graph using magenta: any causal connection formed by ASPIRE is drawn with a magenta line, and any inferred event that ASPIRE added to the story is shown in a magenta box. If the formation of an ASPIRE causal connection required ConceptNet knowledge, it is dotted. ASPIRE makes connections without the help of ConceptNet when it connects a goal completion action (“Fred watches television”) to the explicit goal completion (“Fred relaxes”) on its own. It can also make connections using Genesis rules. In these cases, the causal connection in the elaboration graph is solid rather than dotted. When Genesis analyzes this story without the help of ASPIRE and ConceptNet, it fails to make any causal connections, even when CSERM is turned on. This is not entirely surprising because the ruleset here is so sparse and does not relate to any explicit story events, but it highlights the power of ASPIRE and ConceptNet.

As evident from the elaboration graph, ASPIRE greatly increases Genesis’s comprehension of this story. Genesis now makes connections such as Fred bought beer because he wanted to relax and lay down because he wanted to rest, and his day of working hard at the office caused these desires to relax and rest. Because ASPIRE added the explicit goal completion event “Fred rests” to the story and CSERM sees that “disturb” and “wake up” are similar, Genesis is now able to apply the sole rule it was given to deduce that Fred became angry when John woke him up with his music. ASPIRE then uses this deduction to connect Fred’s anger to him sighing, taking a walk, and calming down. These later events were motivated by Fred’s goals of cooling off and releasing energy. Some of the justifications ASPIRE provides for its ConceptNet-assisted connections are shown in Figure 7.4.

Let’s consider an additional demonstration of ASPIRE’s abilities, this time on a story which feels a little more likely to be naturally encountered.

Figure 7.3: The elaboration graph generated by Genesis for “A Long Day.” All ASPIRE inferred events and connections are shown in magenta. Connections that used ConceptNet knowledge are dotted. Without ASPIRE and ConceptNet, Genesis fails to make any causal connections.

Figure 7.4: Justifications produced by ConceptNet for “A Long Day.” Each justification names a causal connection and the ConceptNet assertion(s) and associated score(s) that motivated forming the connection.

The following is a simple retelling of the Mexican-American war, a conflict which occurred in the 1840s:

The United States is a country. Mexico is a country. The year is 1846. Manifest Destiny is popular in the United States. The United States has ambition, has greed, and wants to gain land. Mexico and the United States disagree over borders. The United States moves into the disputed territory. Mexican forces attack the United States. The United States declares war on Mexico. Winfield Scott leads the United States army. The United States battles Mexico. The United States triumphs by capturing Mexico City. The United States defeats Mexico and wins the war. The Treaty of Guadalupe Hidalgo officially ends the war.

This summary of the war is plain, brief, and high-level, but resembles an elementary school textbook chapter summary. This time, let us explore how Genesis analyzes a story using ASPIRE when it is not given any rules. This means CSERM,

by its very design, cannot help here because Genesis lacks any rules to compare story events against for similarity. There is one goal explicitly stated in the story—“The United States wants to gain land”—but Genesis is not told how this relates to any of the other story events. Again, without the ASPIRE system enabled, Genesis does not make any causal connections. The elaboration graph of Genesis’s analysis of the story using ASPIRE is shown in Figure 7.5.

Figure 7.5: The elaboration graph produced by Genesis for the Mexican-American war story. All ASPIRE inferred events and connections are shown in magenta. Connections that used ConceptNet knowledge are dotted. Genesis was not given any user-defined rules to help analyze this story, instead relying entirely on ConceptNet.

ASPIRE allows Genesis to connect the United States’ ambition and greed to its actions taken against Mexico. Genesis’s inferences depend on its commonsense knowledge and are just one possible interpretation of the story’s events, and the English generation of some of its inferences is not perfect. Still, Genesis’s inferences are salient because the effective application of relevant knowledge, not the knowledge itself or the English generation, is the focus of my work and the example. ASPIRE helps Genesis connect the United States’ ambition to its desire to “conquer [its] opponent,” which it completes when it triumphs by capturing Mexico City. Conquering its opponent allows it to accomplish its goal of gaining land. Similarly, the United States “conquers [a] nation” by winning the war, and its goal of conquering a nation stemmed from its

greed. Conquering a nation also helps it gain land. ASPIRE is not currently able to recognize that the conquered opponent and conquered nation refer to the same object and resolve these two events together; section 8.1 discusses this further. The country's greed makes it want to get something, and this goal is accomplished when the United States gains land, an example of an ASPIRE inference enabling it to make an additional inference. None of the goal concepts (e.g. "conquer opponent," "get something") besides "gain land" appear in the story text, but Genesis has inferred that these concepts are relevant and has connected these inferences to explicit story events. Note that the story does not explicitly state any causal connections, so all causal connections Genesis forms are due to the application of commonsense knowledge. Genesis was able to make all of these connections and inferences without any user-specified rules; instead, it's relying just on ConceptNet knowledge to connect the United States' traits, goals, and actions.

7.3 Interaction with Other Genesis Capabilities

In this chapter I’ve been focusing on the elaboration graph to show how CSERM and ASPIRE increase the number of causal connections. However, causal connections are just the beginning. Causal connections are the base of the majority of Gene- sis’s features because they form its reasoning substrate, but these features, not the connections themselves, are the more visible and exciting part of the system. The additional causal connections CSERM and ASPIRE form improve the performance of many Genesis features even though the user has given Genesis less information through specifying rules. As an example, let us examine how Genesis answers some questions about the Mexican-American war story from section 7.2. Three questions and Genesis’s associated answers are displayed in Figure 7.6. When Genesis is asked “How did the United States gain land?”, it responds with two story elements that were never explicitly stated, but it inferred using its com- mon sense. While the grammar is not perfect, the content of the answers is the focus. When Genesis is asked about some of these inferred elements, it responds with

Figure 7.6 (for reference only; identical to Figure 1.1): A series of questions and Genesis answers regarding the Mexican-American war story. Genesis answers the first question with two story elements that were never explicitly stated but that it inferred using its common sense. The second and third questions ask Genesis about these inferences, and it backs them up with explicit story elements. Genesis's grammar in its English generation is not perfect, but the content is salient and the inferences are interesting.

further explanation, citing specific story events to back up its inferences. Without ASPIRE or any ConceptNet knowledge, Genesis would form zero causal connections and therefore couldn't answer any "Why" or "How" questions about this story! The causal connections CSERM and ASPIRE contribute are of great value to Genesis because so many of its capabilities rely on the connections formed between story events.

7.4 Quantifying Gained Commonsense Knowledge

It’s difficult to precisely express the amount of ConceptNet knowledge Genesis can now access in terms of equivalent Genesis rules for several reasons. Firstly, both CSERM and ASPIRE use cutoff values to test if a piece of ConceptNet knowledge is certain enough to be acted upon. Therefore, the amount of actualized common- sense knowledge depends on the settings of these cutoff values. However, I assume for the purpose of this analysis that the cutoff value is 0.3 for ASPIRE, a value which I have experimentally found to produce satisfactory results. I report results regard- ing CSERM with two cutoff values, 0.33 and 0.5, because this feature is especially sensitive to the choice of cutoff. Part of speech also complicates analysis. CSERM currently only detects when a rule verb and story event verb are similar. ASPIRE primarily extracts verbs and verb phrases from story events, although it can extract noun direct objects in some cases as discussed in section 6.3.2. Because these two systems focus on verbs when interacting with ConceptNet, not all of ConceptNet’s relevant knowledge is accessed; instead, Genesis currently primarily uses only the relevant knowledge that has verbs or verb phrases as concepts. Therefore, it’s necessary to perform part-of-speech analysis on the relevant ConceptNet knowledge when quantifying how much knowledge Genesis has gained. Performing part-of-speech analysis on ConceptNet concepts is somewhat approxi- mate. There is no straightforward, entirely reliable way to computationally determine the part of speech of a concept from its broader context in an assertion because as-

Table 7.1: Rule amplification factors for several concepts under different cutoff values. General concepts benefit more from CSERM.

Concept  | Cutoff value of 0.33 | Cutoff value of 0.5
Harm     | 144                  | 38
Observe  | 114                  | 17
Love     | 116                  | 16
Sprint   | 44                   | 6

Moreover, a concept may have multiple possible parts of speech, and an assertion could apply to more than one of these senses (e.g. "love Causes Desire marry" is true for both the noun and verb "love"). Therefore, I take a rather crude approach and treat any concept that can be a verb as a verb for the purposes of quantifying gained ConceptNet knowledge. These tallies are not meant to be precise counts, but rather ballpark estimates which show the magnitude of what Genesis has gained.
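The following is a minimal sketch, in Python, of the crude "can be a verb" test just described, assuming WordNet (accessed through NLTK) as the lexical resource; it is an illustration only, and the actual Genesis implementation, which is written in Java, may differ in detail.

    from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')

    def can_be_verb(concept):
        """Crude part-of-speech test: a concept counts as a verb if WordNet
        lists any verb sense for it. Multi-word concepts are looked up with
        underscores, the form WordNet uses for collocations."""
        return bool(wn.synsets(concept.replace(" ", "_"), pos=wn.VERB))

    # "love" has verb senses; "privacy" has none, so only "love" is tallied
    # as a verb under this approximation.
    print(can_be_verb("love"), can_be_verb("privacy"))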

7.4.1 Common Sense Enabled Rule Matching

I define the "rule amplification factor" for a verb v as the number of story event verbs that CSERM would detect as similar to v in a rule, thereby amplifying the rule to apply in many more contexts. I calculated the rule amplification factor that CSERM provides for four concepts: "harm," "observe," "love," and "sprint." This calculation counts the number of concepts that occur in both ConceptNet 4 and 5 whose similarity to the base concept surpasses the cutoff value. As described in Chapter 4, the similarity score I use is a function of ConceptNet 4's similarity and ConceptNet 5's relatedness. I limited the concepts to just verbs because CSERM currently only detects similar verbs. My approximate part-of-speech analysis performed surprisingly well for this calculation; inspecting the similarity results shows that true verbs make up nearly all of the results. These scores also only consider similarity between concepts that WordNet does not match together, as Genesis already uses WordNet for rule matching and I'm interested in exploring what ConceptNet adds to Genesis. The rule amplifications are shown in Table 7.1. Using a cutoff value of 0.33

produces amplification factors of around 100, while a cutoff of 0.5 lowers the amplification factor to around 25. General concepts such as "harm" have higher rule amplification factors, while more specific ones such as "sprint" do not benefit as much from CSERM. Many of the similar concepts at each cutoff value are highly relevant. For instance, similar concepts to "harm" at a cutoff of 0.33 include "deceive," "threaten," "rob," "anger," "attack," "torture," "dishonor," "murder," and "embarrass." Of course, many of the results do not meet this quality level. For instance, one of the similarity matches for "love" is "hate." The two are similar in many ways just as "hot" and "cold" are similar, but it's unlikely that a rule describing love is applicable to a story event describing hate. Other similarity results for "love" include "feel," "forgive," and "cuddle," all concepts that, while somewhat relevant, are not guaranteed to produce a satisfactory common sense enabled rule match. The cutoff value controls the quality of matches—higher cutoff values produce higher percentages of quality matches, but the number of total matches falls.
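The tally behind Table 7.1 can be framed as the following sketch; combined_similarity and wordnet_matches are hypothetical stand-ins for the Chapter 4 scoring function and Genesis's existing WordNet matching, not actual Genesis code, and candidate_verbs stands for the verbs present in both ConceptNet 4 and 5.

    def rule_amplification_factor(base_verb, candidate_verbs, cutoff,
                                  combined_similarity, wordnet_matches):
        """Count candidate verbs whose combined ConceptNet 4/5 similarity to
        base_verb meets the cutoff, skipping pairs that WordNet already
        matches (those amplifications are not credited to ConceptNet)."""
        count = 0
        for verb in candidate_verbs:
            if verb == base_verb or wordnet_matches(base_verb, verb):
                continue
            if combined_similarity(base_verb, verb) >= cutoff:
                count += 1
        return count

    # Under the thesis's scoring, a call such as
    # rule_amplification_factor("harm", candidate_verbs, 0.33, score_fn, wn_fn)
    # would reproduce the "Harm" row of Table 7.1.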

7.4.2 ASPIRE

ASPIRE uses ConceptNetClient to query ConceptNet for information about goals, specifically using the "Causes Desire" relation to infer goal causation and the "Motivated By Goal," "Used For," and "Has Subevent" relations to infer an action's contribution to a goal. Of the approximately 225,000 assertions ConceptNet 4 contains, around 20,000 are relevant to the goal information ASPIRE requests. More precisely, with "goal action" defined as an action taken that does indeed contribute to a goal, there are about 20,000 assertions that either:

• describe a cause of a goal for which ConceptNet knows at least one goal action, or

• describe goal actions of goals for which ConceptNet knows at least one cause.

I require knowing at least one cause and at least one action for each goal in this tally because ASPIRE must see both these sides of the goal in order to infer the goal

and influence story processing; if ConceptNet only knows one of these sides, ASPIRE could never use this information to make an inference in a story.

Figure 7.7: An example of an ASPIRE equivalent Genesis rule. Although equivalent Genesis rules are never explicitly constructed, formulating ASPIRE's use of ConceptNet knowledge as an equivalent Genesis rule is useful for comparative purposes. ASPIRE's use of ConceptNet currently incorporates about 30,000 equivalent Genesis rules into story processing.

However, these 20,000 assertions correspond to a greater number of equivalent Genesis rules. Equivalent Genesis rules are never explicitly constructed by ASPIRE, but it's helpful to frame ASPIRE's use of ConceptNet's knowledge in terms of equivalent Genesis rules for comparative purposes. ASPIRE requires seeing a cause of a goal and a goal action before inferring the presence of the goal, as shown in Figure 7.7. Therefore, there's a multiplicative effect between causes of goals and goal actions when calculating the number of equivalent Genesis rules because every cause of a goal is paired with every goal action. With this in mind, ConceptNet's knowledge contains approximately 90,000 equivalent goal-related Genesis rules under ASPIRE's framework for goal inference.

Although ConceptNet provides 90,000 rules relevant to ASPIRE, it cannot currently make effective use of all of them. ASPIRE primarily extracts verbs and verb phrases from Genesis entities, not nouns or noun phrases, so it's unlikely to use an assertion such as "humor Causes Desire laugh" because doing so requires extracting the noun "humor" as a concept. However, ASPIRE could be extended to more completely extract concepts and take advantage of all the assertions ConceptNet has to offer. Therefore, ASPIRE's use of ConceptNet effectively adds 90,000 potential commonsense rules to Genesis, but only 30,000 are verb-based and currently incorporated

into story processing. ConceptNet contains approximately 1,000 assertions describing causes of goals where both concepts are verbs or verb phrases, and approximately 11,000 assertions describing actions taken towards completing a goal where both concepts are verbs or verb phrases.

Note that this result is only a ballpark approximation due to systematic overestimation and underestimation in different parts of the computation. I did not tally the restricted noun-related knowledge that ASPIRE is able to take advantage of as discussed in section 6.3.2, but I also rather generously treated all concepts that can be verbs as verbs in every assertion of interest.
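The multiplicative tally described above can be sketched as follows; the two dictionaries are simplified stand-ins for the goal-related assertions ASPIRE retrieves, not the actual ConceptNetClient data structures.

    def equivalent_rule_count(causes_by_goal, actions_by_goal):
        """Estimate the number of equivalent Genesis rules that ASPIRE's
        ConceptNet knowledge represents. Each dictionary maps a goal concept
        (e.g. "relax") to the set of concepts that cause it or act toward it.
        Only goals with at least one cause AND at least one action can ever
        be inferred, and every cause pairs with every action, hence the
        product."""
        total = 0
        for goal, causes in causes_by_goal.items():
            actions = actions_by_goal.get(goal, set())
            if causes and actions:
                total += len(causes) * len(actions)
        return total

    # Toy example: one goal with 2 known causes and 3 known goal actions
    # contributes 2 * 3 = 6 equivalent rules.
    print(equivalent_rule_count(
        {"relax": {"work hard", "stress"}},
        {"relax": {"watch television", "take walk", "lie down"}}))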

Chapter 8

Limitations and Future Work

While ConceptNet, common sense enabled rule matching (CSERM), and ASPIRE certainly contribute a lot to Genesis, there are some important limitations to these additions. Several of these limitations stem from ConceptNet's imprecise semantics and limited depth of knowledge. ConceptNet is a great resource for giving computing applications the common sense necessary for interacting with humans, and its union with Genesis is a significant step toward artificial general intelligence. However, ConceptNet is not a panacea. Common sense is wide, ambiguous, and imprecise, making it hard to computationally pin down. Although ConceptNet provides important context that enables its applications to reach deeper understandings, the depth of the knowledge ConceptNet captures is still limited.

More specifically, there is a need for disambiguating concepts retrieved from ConceptNet, ASPIRE requires a more robust method of determining if an action completes a goal or if it is just a step towards goal completion, and CSERM can introduce a significant slowdown. There are also numerous ways of extending this work. I highlight ways that Genesis can interpret ConceptNet scores more methodically, can use more ConceptNet knowledge, and can expand its commonsense knowledge beyond ConceptNet. Note that other difficulties like parsing complex sentences must be addressed before Genesis can be applied to a large corpus of naturally occurring text, but I do not discuss those here because they are not directly relevant to my work.

8.1 Limitations

Unfortunately, ConceptNet does not contain any information about the sense in which a concept is being used. For instance, "dog Has Property bark" and "tree Has Property bark" are both assertions in ConceptNet 4, each referring to a different meaning of "bark." This means Genesis could apply ConceptNet knowledge in a situation where it's not appropriate because it's using a word in a different sense than the knowledge describes.

This issue is subtly present in the Mexican-American war story introduced in section 7.2. The United States reaches an advantage because it conquers its opponent, and the country's greed gives it a goal of getting something. ASPIRE sees the United States reaching an advantage as completing its goal of getting something because ConceptNet provides the knowledge "reach Used For get." However, this justification is not valid because "reach" is being used in its "succeed in achieving" definition in "reach advantage," while the justification applies to its "stretch out an arm in a specified direction in order to touch or grasp something" definition (Reach., 2017). This misapplication is pretty minor, but the issue can manifest itself in much more unfortunate ways. WordNet may be able to help with disambiguating the commonsense knowledge because it provides glosses that describe the sense of a word. Combining these glosses with ConceptNet concepts is not trivial, so this may remain as a fundamental limitation of ConceptNet knowledge and serve as a motivation to explore other commonsense knowledge bases. Disambiguation with WordNet has proven successful in the past with previous versions of ConceptNet, though, so the approach does show promise (Chen & Liu, 2011).
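As an illustration of the direction suggested here, and not a part of Genesis, NLTK's implementation of the Lesk algorithm can attach WordNet senses to a word in context; Lesk is only a simple baseline and may well pick the wrong sense for these particular sentences, but it shows the shape of a sense-agreement check.

    from nltk.tokenize import word_tokenize   # requires NLTK's punkt data
    from nltk.wsd import lesk                 # requires NLTK's wordnet data

    # Sense of "reach" in the story sentence versus the sense implied by the
    # physical-grasping reading of the assertion "reach Used For get".
    story_sense = lesk(word_tokenize("The United States reaches an advantage"),
                       "reach", pos="v")
    grasp_sense = lesk(word_tokenize("reach out an arm to grasp the cup"),
                       "reach", pos="v")

    for sense in (story_sense, grasp_sense):
        print(sense, sense.definition() if sense else None)
    # If the senses disagree, the assertion is probably being misapplied.
    print("senses agree:", story_sense == grasp_sense)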

ASPIRE is also unable to detect when two concepts received from ConceptNet refer to the same object in a story and resolve these concepts together. Genesis’s analysis of the Mexican-American war story introduced in section 7.2 has an example of this as well. ASPIRE connects the United States’ greed to its desire to conquer a nation as well as the United States’ ambition to its desire to conquer its opponent. Both of these goals are completed by actions the United States takes in the story, leading to the instantiation of both “The United States conquers [its] opponent” and

86 “The United States conquers [a] nation.” Of course, the conquered opponent and conquered nation both refer to Mexico, so while the knowledge about both conquering opponents and conquering nations is relevant in this case, ASPIRE would ideally recognize that these two goal instantiations are somewhat redundant. Unpacking the received concepts and keeping track of mappings between story entities and concepts could help address this problem, but this issue is more cosmetic and less pressing than the other limitations because ASPIRE is still applying its commonsense knowledge correctly. Querying ConceptNet for the similarity of “conquer nation” and “conquer opponent” could also be helpful in solving this problem. The ASPIRE system operates using an important simplifying assumption which does not always hold: if a goal has one or more actions taken towards completing the goal, the goal has been completed. As an example, suppose ASPIRE has formed the candidate character goal “Matt wants to kiss Jackie.” ASPIRE would connect the event “Matt embraces Jackie” to this goal because ConceptNet provides the knowledge “kiss Has Subevent embrace.” Because the goal has had an action taken towards it, ASPIRE would instantiate both the inferred goal “Matt wants to kiss Jackie” and the inferred goal completion “Matt kisses Jackie.” Instantiating the goal completion at this point is premature, though, and is incorrect if the story later indicates that Matt did not kiss Jackie after all. The problem worsens when ASPIRE makes additional inferences based on an incorrect inferred goal completion, allowing ASPIRE to form long chains of faulty inferences stemming from one or many misapplications of commonsense knowledge. For instance, suppose Genesis is given the following version of the “A Long Day” story from section 7.2, this time with an alternate ending:

Fred is a person. John is a person. John is Fred’s roommate. Fred works hard at the office. Fred leaves work. Fred buys beer. Fred comes home. Fred drinks beer and watches television. Fred lies down. John comes home. John plays loud music and wakes up Fred. Fred fights with John over his music. Tempers flare. John apologizes and they drink beer together.

And the same sole rule as earlier:

• If XX is resting and YY disturbs XX, then XX feels anger towards YY.

ASPIRE arrives at a very different interpretation of the story than before, as shown in Figure 8.1. Fred's anger and his fighting lead ASPIRE to incorrectly infer and complete several additional realized goals for Fred, including killing a person, joining the army, and dying; the complete set of ConceptNet justifications for this story can be found in the Appendix. These chains of inferences are all due to ASPIRE's coarse method of detecting goal completion. ASPIRE believes Fred kills a person because ConceptNet tells it that anger causes the goal of killing a person and fighting is evidence of trying to kill a person. This knowledge alone is not sufficient for concluding Fred does indeed kill a person, though—fighting does not necessitate any killing. It would be much better if ASPIRE knew, for any given goal, which sorts of actions complete that goal vs. which sorts of actions merely contribute to it. This requires further nuance in ASPIRE's knowledge, which it obtains through Genesis rules and ConceptNet. ConceptNet 4 does not provide this sort of nuance, so exploration into alternative commonsense knowledge bases is necessary to address this shortcoming. In the meantime, a possible defensive fix is to develop a setting for ASPIRE that allows the system to infer goals, but not assume and instantiate their completion.

Alternatively, the previous two issues of concept disambiguation and goal completion could be more simply solved by incorporating human interaction. Genesis could present a ConceptNet-assisted interpretation of a story to the user, and the user could inform it if it made any mistakes and why. In this way, Genesis could augment its commonsense knowledge, collecting the information it needs to be even more accurate directly from humans. After all, humans are taught much of what they know directly from other humans, especially at an early age. Humans could instruct Genesis in a similar manner, giving it the information it needs to correct its mistakes.

One final limitation is the slowdown that CSERM can introduce. If there are a large number of rules, Genesis's processing can slow significantly because of the

Figure 8.1: The elaboration graph produced by Genesis for the alternate version of "A Long Day." Here, the understanding ASPIRE reaches is deeply flawed, caused by a series of misapplications of ConceptNet knowledge.

overhead that the similarity checks add. ConceptNet 5's rate limiting worsens the problem. ConceptNet 5 only allows sixty requests per minute, and if Genesis exceeds this limit, it must pause and wait for the limit to expire before it can send its next query. Because similarity results are cached, this issue only occurs the first time a story is processed. Still, the slowdown is a significant annoyance when applying CSERM to longer stories with large rulesets. ConceptNet 5's relatedness data is available for download, so this issue could be solved by serving relatedness requests using local data rather than by querying the server.
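A minimal sketch of the two mitigations just mentioned, caching results and throttling requests, might look like the following; fetch_relatedness_from_server is a hypothetical stand-in for the actual query code, and the real rate-limit window may be enforced differently by the server.

    import time

    class RelatednessClient:
        """Cache relatedness scores and stay under a requests-per-minute limit."""

        def __init__(self, fetch_relatedness_from_server, max_per_minute=60):
            self.fetch = fetch_relatedness_from_server  # stand-in for the real query
            self.max_per_minute = max_per_minute
            self.cache = {}
            self.request_times = []

        def relatedness(self, concept_a, concept_b):
            key = tuple(sorted((concept_a, concept_b)))
            if key in self.cache:
                return self.cache[key]          # repeat lookups cost nothing
            now = time.time()
            self.request_times = [t for t in self.request_times if now - t < 60]
            if len(self.request_times) >= self.max_per_minute:
                # Wait until the oldest request in the window ages out.
                time.sleep(60 - (now - self.request_times[0]))
            score = self.fetch(concept_a, concept_b)
            self.request_times.append(time.time())
            self.cache[key] = score
            return score

Downloading the relatedness data and serving it from a local table, as suggested above, would remove the network dependency entirely and make even this throttling unnecessary.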

8.2 Future Work

All knowledge provided by ConceptNet is scored, with higher-confidence assertions receiving higher scores. These scores are useful because they allow the user of the knowledge to compare assertions and control more finely when an assertion is "true enough" to be actionable. Currently, CSERM and ASPIRE use score cutoffs to filter assertions, acting upon only those with scores greater than or equal to the cutoff. CSERM is more sensitive to its cutoff value than ASPIRE because similarity scores are continuous and the scores of the assertions ASPIRE receives have far less variability. These cutoffs have been experimentally chosen at the moment, but choosing them dynamically would increase the robustness of the implementation.

More specifically, the cutoff could adjust as the story is being read in response to the story's perceived coherence. I have completed work prior to this thesis that allows Genesis to assess how coherent it finds a story by using metrics such as the number of connected components in a graph of the causal relationships between story entities. The cutoff could decrease if Genesis finds the story relatively incoherent so that it could make more causal connections; conversely, the cutoff could increase if Genesis thinks it's understanding the story pretty well, as additional commonsense causal connections may just result in noise. Such a design does assume a universal expected level of coherence for every story Genesis reads, as Genesis makes changes to its internal processing to try to reach this coherence level. Although this assumption may not hold for all stories, it's likely true for a large majority, so I believe this approach to be worthy of exploration as a valuable way to avoid hard-coded cutoffs.
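A sketch of such an adjustment, assuming the causal graph is available as a list of edges and using networkx to count connected components, might look like the following; the target fraction, step size, and bounds are made-up illustrations, not values used in Genesis.

    import networkx as nx

    def adjust_cutoff(current_cutoff, causal_edges, num_events,
                      target_fraction=0.3, step=0.05,
                      lower_bound=0.2, upper_bound=0.6):
        """Nudge the score cutoff toward a target level of coherence.
        Coherence is approximated by the number of weakly connected components
        in the causal graph: many components relative to the number of events
        means a fragmented, incoherent reading."""
        graph = nx.DiGraph()
        graph.add_nodes_from(range(num_events))
        graph.add_edges_from(causal_edges)      # (cause_index, effect_index) pairs
        components = nx.number_weakly_connected_components(graph)
        fragmentation = components / max(num_events, 1)
        if fragmentation > target_fraction:     # too fragmented: accept more knowledge
            current_cutoff -= step
        else:                                   # coherent enough: be stricter
            current_cutoff += step
        return min(max(current_cutoff, lower_bound), upper_bound)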

Finally, there are a number of ways to extend Genesis's commonsense knowledge. A first target is using more of ConceptNet's knowledge. CSERM only uses ConceptNet's similarity information, and ASPIRE only uses the "Causes Desire," "Motivated By Goal," "Used For," and "Has Subevent" relations. However, there are numerous other relations that would provide valuable information to Genesis, with "Causes" being the most promising one. ConceptNet assertions with "Causes" as the relation describe general causal relationships (e.g., ConceptNet 4 has assertions such as "tickle

Causes laughter" and "read Causes learn"). This knowledge could directly wire into Genesis's story processing so Genesis could identify general causal relationships in the story on its own, moving beyond the goal-oriented causal connections that ASPIRE forms; a minimal sketch of this idea appears at the end of this section.

It should be noted that using ConceptNet's knowledge in a standalone manner with no dependence on user-defined rules requires careful thought. ASPIRE uses ConceptNet in this powerful fashion, and section 6.3.2 discusses the rather involved method I developed for extracting concepts from goal-related Genesis entities and incorporating goal-related ConceptNet knowledge. I arrived at this method through studying the structure of ConceptNet goal-related knowledge, observing common patterns, and constructing procedures that take advantage of these patterns as safely as possible. A similar process is necessary to incorporate other domains of ConceptNet knowledge into Genesis, especially because ConceptNet knowledge is in a fundamentally different format than a Genesis rule (its lack of bindings is discussed in section 6.3.2). The connection I've established between the two systems easily supports integrating new types of ConceptNet information, but determining exactly how new types of information should be integrated requires programmer effort.

ConceptNet is a valuable commonsense knowledge base which has proved very effective, but it's only the beginning. I mention several other commonsense knowledge bases worth exploring, such as Cyc and WebChild, in Chapter 9. ConceptNet does have its limitations as discussed in section 8.1, and incorporating other knowledge bases could address some of these shortcomings while simultaneously adding more knowledge for Genesis to reference as it processes stories. ConceptNet is a great step towards capturing the common sense necessary for building artificial general intelligence because it gives applications real-world context and knowledge. Connecting Genesis to ConceptNet is truly transformative because it tremendously expands the amount of information accessible to Genesis that is useful for processing stories. Still, although ConceptNet provides important context that enables its applications to reach deeper understandings, ConceptNet's limited depth of knowledge is a reason to explore alternatives in an effort to apply common sense in a more robust manner.
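Returning to the "Causes" idea raised at the start of this section, a minimal sketch of wiring such assertions into story processing might look like the following; the assertion table and the per-event concept extraction are simplified stand-ins for what Genesis and ConceptNetClient actually provide.

    def propose_causal_connections(event_concepts, causes_assertions, cutoff=0.5):
        """event_concepts: list of (event_index, concept) pairs in story order,
        e.g. [(0, "tickle"), (1, "laughter")]. causes_assertions maps
        (cause_concept, effect_concept) -> score, standing in for ConceptNet's
        "Causes" knowledge. Returns proposed (cause_event, effect_event, score)
        links between earlier and later story events."""
        connections = []
        for i, cause_concept in event_concepts:
            for j, effect_concept in event_concepts:
                if i >= j:      # only connect earlier events to later ones
                    continue
                score = causes_assertions.get((cause_concept, effect_concept), 0.0)
                if score >= cutoff:
                    connections.append((i, j, score))
        return connections

    # Toy example using the assertions quoted above.
    print(propose_causal_connections(
        [(0, "tickle"), (1, "laughter")],
        {("tickle", "laughter"): 1.0}))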

Chapter 9

Related Work

9.1 Machine Learning Approaches

The field of machine reading, or the application of machine learning techniques to natural language understanding, has grown in popularity recently. Machine learning natural language efforts have traditionally focused on lower-level problems such as topic detection rather than larger problems such as extracting causality from text, the focus of my work. However, the emergence of machine reading signifies a renewed interest in higher levels of comprehension.

Machine reading is advancing, supported by new benchmark datasets such as the Stanford Question Answering Dataset (SQuAD) (Rajpurkar, Zhang, Lopyrev, & Liang, 2016). Facebook has released a set of twenty machine reading question answering tasks, and Google DeepMind has published a large question answering dataset designed to test reading comprehension (Hermann et al., 2015; Weston et al., 2015). All three of these datasets consist of passages, associated questions, and correct answers, and these benchmarks have spurred a flurry of activity in neural network development and tuning (Hill, Bordes, Chopra, & Weston, 2015; Kumar et al., 2015; SQuAD leaderboard, n.d.). However, the answer for every question in all these datasets is always explicitly stated in the input text (besides when the question has a yes or no answer, in which case the story explicitly contains all relevant information needed to answer the question). These datasets require little inference and

commonsense knowledge, as all questions can be answered by combining information in different sections of the input text. There is no flexible knowledge representation, no explicit commonsense reasoning, and, I argue, a rather shallow level of understanding because the network never needs to respond to a question with an answer that is not explicitly stated in the story. These datasets are a good first step towards more general intelligence because they refocus natural language processing (NLP) research on higher-level reasoning tasks, but humans often leave commonsense knowledge implicit when they communicate. Therefore, any system designed to fully understand human communication must incorporate large amounts of background commonsense knowledge, and these datasets do not address this aspect of natural language understanding.

Another relevant area of NLP research is recognizing textual entailment (RTE). RTE tasks a model with deciding whether two sentences have an entailment (one can be inferred from the other), contradictory, or independent relationship. This task matters for story understanding because inference is central to it. Bowman, Angeli, Potts, and Manning (2015) have published a popular dataset of sentence pairs labeled with their entailment status, and this dataset has generated considerable interest in tackling the RTE problem (Rocktäschel, Grefenstette, Hermann, Kočiský, & Blunsom, 2015). RTE is valuable because of its focus on inference, but little work has gone into making inferences from a sentence or story rather than evaluating an inference from already-curated candidate pairs. When attempting to understand a story, a reader is not given potential inferences paired with corresponding story text for evaluation; they must generate their own inferences from the story's events, as my work does. Kolesnyk, Rocktäschel, and Riedel (2016) are nearly alone in their work on generating inferences from a given source sentence rather than evaluating the entailment of previously defined pairs. They use Bowman et al.'s dataset to train a neural network to generate an inference from a starting sentence. However, the resulting inferences are not very exciting. Examples include inferring "A man is driving a car" from the starting sentence "He is driving a beautiful red car" and "There are people looking at the sky" from the starting sentence "Space travellers

gaze up and strain to find the blue dot in their sky." While these inferences are correct and could at times be useful to a story-understanding system such as Genesis, they are rather plain. My work enables Genesis to make more elaborate goal-related inferences through ASPIRE's use of ConceptNet, and it's not clear that Kolesnyk et al.'s approach could produce similarly nuanced inferences. Furthermore, Genesis is able to produce user-friendly justifications explaining the inferences it made, something machine learning approaches lack.

While limited, there have been some machine reading approaches that use background knowledge to aid the analysis of input text. Nakashole and Mitchell use a large amount of background knowledge to assist with parsing. They focus on addressing two common parsing problems in their 2016 paper, prepositional phrase attachment ambiguity and relationships between compound nouns. When faced with parsing ambiguity, they consult background knowledge bases and corpora to detect how the ambiguity should be resolved. While this approach does incorporate background knowledge, the knowledge is used for parsing purposes only, and it's hard to characterize the knowledge used as common sense. My work focuses on using background commonsense knowledge not for parsing purposes, but to enhance story understanding by inferring nontrivial elements of the story never explicitly stated, a capability machine reading has not yet developed.

Weston, Chopra, and Bordes’ (2014) work with memory networks explores how a learning model can incorporate a long-term memory component. They apply such a model to a large-scale question answering task where the system was given fourteen million (subject, relation, object) triples and asked questions about them. The model was also applied to much smaller stories of approximately a dozen sentences and asked questions that required the model to combine knowledge from events early and late in the story. However, the paper does not discuss giving the model both the large assertion knowledge base and the input story together. Then, the model could be asked questions that require incorporating assertions from the knowledge base into the processing of the smaller story, similar to how my work combines commonsense knowledge with story analysis. Memory networks may not be well designed for such

a task.

Word embedding is a currently popular NLP technique for detecting the relationship between words. The idea is that every word can be represented by a high-dimensional vector, usually computed by using machine learning on a large dataset, and these vectors are called "word embeddings." The relatedness of two words can be computed by taking the dot product of these vectors. Google's Word2vec model is one popular implementation of the idea, and the ConceptNet team has also developed their own word embedding library (Mikolov, Chen, Corrado, & Dean, 2014; Speer & Chin, 2016). Word embeddings do not provide high-fidelity commonsense knowledge because they cannot explain the relationship between two concepts. ConceptNet knows assertions such as "wing Used For flight," not just that "wing" and "flight" are related. ConceptNet 5 uses word embeddings to generate relatedness scores, which common sense enabled rule matching (CSERM) incorporates. However, word embeddings were not sufficient on their own in this project, as ASPIRE requires high-fidelity commonsense knowledge beyond the numerical relationship of two concepts, and CSERM incorporates other similarity metrics besides word embeddings.
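As a small illustration of the dot-product relatedness just described, consider the following; the three-dimensional vectors are made up for the example, whereas real embeddings have hundreds of dimensions and are typically unit-normalized before the dot product is taken.

    import numpy as np

    # Made-up, low-dimensional stand-ins for real word embeddings.
    embeddings = {
        "wing":   np.array([0.9, 0.1, 0.3]),
        "flight": np.array([0.8, 0.2, 0.4]),
        "desk":   np.array([0.1, 0.9, 0.2]),
    }

    def relatedness(word_a, word_b):
        """Dot product of unit-normalized vectors (i.e. cosine similarity)."""
        a = embeddings[word_a] / np.linalg.norm(embeddings[word_a])
        b = embeddings[word_b] / np.linalg.norm(embeddings[word_b])
        return float(np.dot(a, b))

    print(relatedness("wing", "flight"))   # high: related concepts
    print(relatedness("wing", "desk"))     # lower: unrelated concepts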

9.2 Story Understanding and Common Sense

Story understanding has been a topic of interest in artificial intelligence for decades, although popularity has recently waned. Schank and Abelson proposed in 1995 that stories are a fundamental part of human memory, knowledge, and social communication. More recently, Winston (2011) articulated a similar view in his Strong Story Hypothesis, which states that story telling and understanding are central to human intelligence.

Despite the importance of stories, there have been few attempts at applying commonsense knowledge to story understanding. Gordon (2016) recently framed the problem of forming commonsense interpretations as a process of logical abduction. He developed a model which reads a small story and chooses the more likely of two possible relevant story inferences. The model arrives at its choice by consulting 136

hand-coded probabilistic commonsense axioms to generate scored hypotheses. He applied this approach to the Triangle-COPA benchmark question set. The model performed well on the benchmark, but it's not clear how scalable the approach is. The model requires carefully constructed commonsense axioms to arrive at an answer, and Gordon does not propose a way to generate these axioms using existing knowledge bases. Commonsense knowledge can be ambiguous, contradictory, and context-dependent, so its very nature may disagree with the probabilistic approach Gordon takes. My work uses ConceptNet to gain a wide breadth of commonsense knowledge and embraces the context dependency of common sense by applying ConceptNet knowledge only when the story shows evidence that such knowledge is relevant. Furthermore, Genesis can make its own commonsense inferences without being supplied options from which to choose, whereas Gordon's model only chooses between two possible inferences.

Mueller (2003) has decomposed the problem of story understanding into building multiple models of a story, giving a program multiple representations of commonsense knowledge, and combining the two using a satisfiability solver to produce a model of the story as output. This approach has some efficiency limitations, it's not clear how well it would scale to larger stories, more common sense, or a larger variety of stories, and the codified commonsense knowledge is currently rather small in size. Mueller has since further developed his event-calculus approach (2014), but his logic-based formulation of common sense may be too rigid and constrained to be universally applicable, as common sense can be ambiguous and contradictory by its very nature.

There are several versions of ConceptNet, but I primarily used ConceptNet 4 in my work. All of ConceptNet 4's knowledge is directly collected from humans, and this knowledge can be viewed in terms of Schank, Abelson, and Winston's beliefs regarding stories. Schank and Abelson argue that much of human knowledge is story-based, and Winston argues that stories are a fundamental part of human intelligence. In this light, humans may represent their commonsense knowledge using small story snippets. I might believe the assertion "person Desires privacy" because I have mentally retained many instances of specific people, including myself, desiring privacy.

With this interpretation, ConceptNet is a database of common sense because it's a database of story snippets. This means that when Genesis uses ConceptNet's knowledge to understand a story, it's consulting story snippets collected from humans to understand similar events in the current story.

These ideas are reminiscent of Minsky's K-Lines theory of memory (1980). Minsky hypothesized that when a certain idea is remembered, the brain reactivates some of the mental states present when that idea was originally conceived. Schank and Abelson similarly contend that new events are interpreted in terms of old stories. Genesis is taking a comparable approach with commonsense knowledge when it reasons about one story by consulting common sense collected from similar stories. In Genesis's case, these stories are not from its own past experience, but from the past experiences of many different humans.

9.3 Commonsense Knowledge Bases

There are a number of other commonsense knowledge bases besides ConceptNet, such as Cyc and WebChild (Guha & Lenat, 1994; Tandon, de Melo, Suchanek, & Weikum, 2014). There have also been efforts to extend ConceptNet with content from the Web, which increases the amount of knowledge in the knowledge base by several orders of magnitude (Tandon, De Melo, & Weikum, 2011). I considered these alternatives, but chose to primarily give Genesis ConceptNet 4's knowledge for several reasons. Most readily, ConceptNet's data is freely available via a web API, while resources such as Cyc are more restricted and require a license to use. Moreover, ConceptNet's biggest strength is that its data is collected directly from humans, which allows it to capture the nuances of common sense that would be hard to codify without going directly to humans. For instance, ConceptNet has assertions such as "competitive nature Causes Desire play sports" and "wake up Has First Subevent open eye," pieces of knowledge which are rarely explicitly expressed because they are so apparent to humans. Any sort of web-crawling, NLP-based method of collecting common sense may be able to collect more knowledge, but it's not obvious that the data would be of equal quality

(Havasi et al., 2009). Additionally, ConceptNet can numerically compute a score describing how similar two concepts are. While several other commonsense knowledge bases have some method of computing an association between concepts, it's rare for this association to capture similarity rather than relatedness. Similarity requires inspecting how the concepts are used, whereas relatedness captures how connected the concepts are. For instance, "desk" and "paperclip" are related because they're both found in an office, but they are not similar because they have very different uses. "Desk" and "table" are similar, though, because they're both used to support things. ConceptNet provides ways to query both relatedness and similarity, an attractive quality because this similarity capability is rare among knowledge bases (Havasi et al., 2009).

Chapter 10

Conclusion and Contributions

Rather than focusing on one narrow problem or a small set of problems, the Genesis system tackles the more general goal of story understanding. It uses a flexible knowledge representation to represent this understanding, and Genesis is able to demonstrate and use its understanding in a wide variety of ways including question answering, summarization, hypothetical reasoning, and story alignment.

Prior to my work, Genesis had little common sense. It needed to be told all of the knowledge it should use to understand a story in the form of rules, and this requirement was a large barrier to using Genesis in a real-world setting. Many of these rules expressed commonsense knowledge, such as "If person X harms person Y, Y becomes angry." Commonsense knowledge is plain and obvious to humans but needs to be explicitly given to computers to enable them to apply the sort of common sense that humans use implicitly. My work gives Genesis orders of magnitude more of the common sense that humans hold in their head and use so naturally that it's hard to observe. By consulting ConceptNet, a large commonsense knowledge base composed of human-submitted assertions, Genesis can make many more causal connections when reading a story with less user effort. These causal connections are incredibly valuable to Genesis because they allow Genesis to see a story as more than just a sequence of unconnected events, and many of Genesis's story analysis capabilities depend on its ability to detect the causal connections within a story.

Giving Genesis much more common sense is a notable step toward creating the sort of AI depicted in the movies: an artificial general intelligence capable of understanding human communication. Genesis's connection with ConceptNet makes applying Genesis to a large corpus of naturally occurring text more feasible. Users can now spend less time spelling out all the necessary common sense Genesis needs to understand the story, instead focusing on domain-specific rules describing niche information not universal enough to be considered common sense. This makes Genesis a better fit for applications and domains such as legal document review and understanding medical histories. Its rich set of analytical tools can now be applied with far less effort.

I have implemented two applications of ConceptNet's knowledge within Genesis, common sense enabled rule matching (CSERM) and ASPIRE, but I believe this is just the beginning. I have outlined how these two applications can be extended, and Genesis can also use other varieties of ConceptNet common sense in order to take full advantage of all the information ConceptNet has to offer. ConceptNet 4 contains 225,000 commonsense assertions. CSERM uses this collective knowledge to more flexibly apply existing user-defined rules, achieving amplification factors of ten to one hundred. ASPIRE uses thousands of goal-related assertions to infer character goals with no dependence on user-defined rules, and the ConceptNet knowledge it operates with is equivalent to approximately 30,000 rules. However, the vast majority of ConceptNet knowledge does not relate to goals, and Genesis is currently not using this knowledge to make inferences independent of user-defined rules. Therefore, much of ConceptNet's knowledge still remains untapped. Furthermore, Genesis can consult other commonsense knowledge bases with complementary strengths so that it can maximize the effectiveness of its common sense.

There are five main contributions of my work. I believe my most important contribution is demonstrating the power of combining a story-understanding system and a commonsense knowledge base. This union enables inferring relevant, nontrivial events and connections never explicitly stated in the input at a large scale, a novel capability in the space of story understanding and machine reading. However, there are several

other more specific contributions significant for the Genesis system's development. I have established an extensible and easy-to-use connection between Genesis and ConceptNet which allows Genesis to justify its inferences to the user by citing the relevant commonsense knowledge. I have implemented common sense enabled rule matching, an application of commonsense knowledge which allows Genesis to amplify its ruleset by applying rules when story events are similar to their antecedents. I have articulated a framework for inferring characters' goals from the actions they take throughout a story. Finally, I have implemented ASPIRE and incorporated ConceptNet knowledge so that Genesis can reach a complete understanding of goal-oriented stories with a significantly reduced dependence on user-defined rules.

References

Bowman, S. R., Angeli, G., Potts, C., & Manning, C. D. (2015). A large annotated corpus for learning natural language inference. arXiv preprint arXiv:1508.05326 .

Chen, J., & Liu, J. (2011). Combining ConceptNet and WordNet for Word Sense Disambiguation. In IJCNLP (pp. 686–694).

Ferrucci, D., Brown, E., Chu-Carroll, J., Fan, J., Gondek, D., Kalyanpur, A. A., . . . Welty, C. (2010). Building Watson: An overview of the DeepQA project. AI Magazine, 31 (3), 59–79.

Gentner, D., & Markman, A. B. (1997). Structure mapping in analogy and similarity. American Psychologist, 52 (1), 45.

Gordon, A. S. (2016). Commonsense Interpretation of Triangle Behavior. In AAAI (pp. 3719–3725).

Guha, R. V., & Lenat, D. B. (1994). Enabling agents to work together. Communi- cations of the ACM , 37 (7), 126–142.

Havasi, C., Speer, R., Pustejovsky, J., & Lieberman, H. (2009). Digital intuition: Applying common sense using dimensionality reduction. IEEE Intelligent Systems, 24 (4).

Hermann, K. M., Kocisky, T., Grefenstette, E., Espeholt, L., Kay, W., Suleyman, M., & Blunsom, P. (2015). Teaching Machines to Read and Comprehend. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 28 (pp. 1693–1701). Curran Associates, Inc. Retrieved from http://papers.nips.cc/paper/5945-teaching-machines-to-read-and-comprehend.pdf

Hill, F., Bordes, A., Chopra, S., & Weston, J. (2015). The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations. arXiv preprint arXiv:1511.02301.

Holmes, D., & Winston, P. (2016). Story-enabled hypothetical reasoning. In Advances in Cognitive Systems.

Jackson Project. (n.d.). Retrieved from https://github.com/FasterXML/jackson ([Online; accessed 2017-05-19])

Katz, B. (1997). Annotating the World Wide Web using Natural Language. In Computer-Assisted Information Searching on Internet (pp. 136–155).

Kenrick, D. T., Neuberg, S. L., Griskevicius, V., Becker, D. V., & Schaller, M. (2010). Goal-driven cognition and functional behavior: The fundamental-motives framework. Current Directions in Psychological Science, 19 (1), 63–67.

Kolesnyk, V., Rocktäschel, T., & Riedel, S. (2016). Generating Natural Language Inference Chains. arXiv preprint arXiv:1606.01404 .

Kumar, A., Irsoy, O., Su, J., Bradbury, J., English, R., Pierce, B., . . . Socher, R. (2015). Ask me anything: Dynamic memory networks for natural language processing. CoRR, abs/1506.07285 .

Langley, P. (2011). The changing science of machine learning. Machine Learning, 82 (3), 275–279.

Levy, S. (2017, May). The iBrain is Here: An exclusive inside look at how artificial intelligence and machine learning work at Apple. Retrieved from https://backchannel.com/an-exclusive-look-at-how-ai-and-machine-learning-work-at-apple-8dbfb131932b ([Online; posted 24-August-2016])

Liu, H., & Singh, P. (2004). ConceptNet: A Practical Commonsense Reasoning Toolkit. BT Technology Journal, 22 , 211–226.

Locke, E. A., & Latham, G. P. (2006). New directions in goal-setting theory. Current Directions in Psychological Science, 15 (5), 265–268.

Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2014). word2vec. Accessed 2014-04-15. https://code.google.com/p/word2vec.

Miller, G. A. (1995). WordNet: a lexical database for English. Communications of the ACM, 38 (11), 39–41.

Minsky, M. (1980). K-Lines: A Theory of Memory. Cognitive Science, 4 (2), 117–133.

Minsky, M. (1988). Society of Mind. Simon and Schuster.

MTV Over the Line. (n.d.). Unpublished raw data.

Mueller, E. T. (2003). Story understanding through multi-representation model construction. In Proceedings of the HLT-NAACL 2003 workshop on Text Meaning - Volume 9 (pp. 46–53).

Mueller, E. T. (2014). Commonsense reasoning: an event calculus based approach.

Morgan Kaufmann.

Noss, J. M. (2017). Who Knows What? Perspective-Enabled Story Understanding.

Oquab, M., Bottou, L., Laptev, I., & Sivic, J. (2014). Learning and transferring mid-level image representations using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1717–1724).

Rajpurkar, P., Zhang, J., Lopyrev, K., & Liang, P. (2016). Squad: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250 .

Reach. (2017). Oxford English Dictionary. Oxford University Press.

Rocktäschel, T., Grefenstette, E., Hermann, K. M., Kočiský, T., & Blunsom, P. (2015). Reasoning about entailment with neural attention. arXiv preprint arXiv:1509.06664.

Schank, R. C., & Wilensky, R. (1977). A goal-directed production system for story understanding. ACM SIGART Bulletin(63), 72–72.

Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., . . . Hassabis, D. (2016). Mastering the Game of Go with Deep Neural Networks and Tree Search. Nature, 529 (7587), 484–489.

Singh, P. (1996). Why AI Failed – The Past 10 Years.

Singh, P. (2002). The Public Acquisition of Commonsense Knowledge.

Singh, P., Barry, B., & Liu, H. (2004). Teaching machines about everyday life. BT Technology Journal, 22 (4), 227–240.

Speer, R. (2007). Learning common sense knowledge from user interaction and principal component analysis (Unpublished master's thesis). Massachusetts Institute of Technology.

Speer, R., & Chin, J. (2016). An ensemble method to produce high-quality word embeddings. arXiv preprint arXiv:1604.01692 .

Speer, R., & Havasi, C. (2012). Representing General Relational Knowledge in ConceptNet 5. In LREC (pp. 3679–3686).

SQuAD leaderboard. (n.d.). https://rajpurkar.github.io/SQuAD-explorer/. (Accessed: 2017-05-19)

Stork, D. G. (1997). HAL’s Legacy: 2001’s Computer as Dream and Reality. MIT Press.

Tandon, N., de Melo, G., Suchanek, F., & Weikum, G. (2014). WebChild: Harvesting

and Organizing Commonsense Knowledge from the Web. In Proceedings of the 7th ACM International Conference on Web Search and Data Mining (pp. 523–532).

Tandon, N., De Melo, G., & Weikum, G. (2011). Deriving a Web-Scale Common Sense Fact Database. In AAAI.

Weston, J., Bordes, A., Chopra, S., Rush, A. M., van Merriënboer, B., Joulin, A., & Mikolov, T. (2015). Towards AI-complete question answering: A set of prerequisite toy tasks. arXiv preprint arXiv:1502.05698 .

Weston, J., Chopra, S., & Bordes, A. (2014). Memory networks. arXiv preprint arXiv:1410.3916 .

Winston, P. H. (2011). The Strong Story Hypothesis and the Directed Perception Hypothesis.

Winston, P. H. (2014). The Genesis Story Understanding and Story Telling System: a 21st Century Step Toward Artificial Intelligence (Tech. Rep.). Center for Brains, Minds and Machines (CBMM).

This work includes data from ConceptNet 4 and ConceptNet 5. ConceptNet 5 was compiled by the Commonsense Computing Initiative. ConceptNet 5 is freely available under the Creative Commons Attribution-ShareAlike license (CC BY SA 4.0) from http://conceptnet.io. The included data was created by contributors to Commonsense Computing projects, contributors to Wikimedia projects, Games with a Purpose, Princeton University's WordNet, DBPedia, OpenCyc, and Umbel. ConceptNet 4 is available under the GNU General Public License at https://github.com/commonsense/divisi2/. When this work references specific ConceptNet data, it is referencing ConceptNet 4 unless ConceptNet 5 is explicitly stated.

Appendix. Full Story Text and ConceptNet Justifications

Bullying

Story

Matt and Josh are persons. Matt and Josh are in high school. Matt is insecure because Matt has low self-esteem. Matt bullies Josh. Josh becomes angry. Josh makes a plan because Josh loathes Matt. Josh embarrasses Matt. Matt is sad. Matt stops bothering Josh.

Genesis Rules

• XX and YY are persons.

• XX may bully YY because XX is insecure.

• If XX annoys YY, XX angers YY.

• If XX angers YY, YY may hate XX.

• If XX dislikes YY, XX may harm YY.

• If XX harms YY, YY may stop harming XX.

ConceptNet Justifications

Justification for Matt angers Josh because Matt bullies Josh:

• annoy, bully similar (0.37)

• rule XX angers YY because XX annoys YY.

Justification for Josh embarrasses Matt, probably because Josh loathes Matt:

• harm, embarrass similar (0.57)

• rule XX harms YY, probably because XX dislikes YY.

Justification for Josh angers Matt because Josh embarrasses Matt:

• annoy, embarrass similar (0.72)

• rule XX angers YY because XX annoys YY.

Justification for Matt stops bothering Josh, probably because Josh embarrasses Matt:

• harm, bother similar (0.55)

• harm, embarrass similar (0.57)

• rule YY stops harming XX, probably because XX harms YY.

A Long Day

Story

Fred is a person. John is a person. John is Fred's roommate. Fred works hard at the office. Fred leaves work. Fred buys beer. Fred comes home. Fred opens a beer and watches television. Fred lies down and falls asleep. John comes home. John plays loud music and wakes up Fred. Fred sighs from exasperation. Fred leaves the apartment and takes a walk. The quiet night helps Fred clear his mind. Fred calms down and heads home.

Genesis Rules

• If XX is resting and YY disturbs XX, then XX feels anger towards YY.

ConceptNet Justifications

Justification for Fred wants to relax because Fred works at the office hard:

• work hard CausesDesire relax (0.50)

Justification for Fred buys beer because Fred wants to relax:

• buy beer MotivatedByGoal relax (0.50)

• buy beer UsedFor relax (0.50)

Justification for Fred wants to rest because Fred works at the office hard:

• work CausesDesire rest (1.29)

Justification for Fred relaxes because Fred wants to rest:

• relax MotivatedByGoal rest (0.79)

• relax UsedFor rest (0.50)

• rest HasSubevent relax (1.16)

Justification for Fred wants to take break because Fred works at the office hard:

• work hard CausesDesire take break (0.79)

Justification for Fred rests because Fred wants to take break:

• rest UsedFor take break (0.79)

• take break HasSubevent rest (1.16)

Justification for Fred takes break because Fred wants to rest:

• take break UsedFor rest (0.50)

Justification for Fred takes break because Fred wants to relax:

• take break MotivatedByGoal relax (0.50)

Justification for Fred rests because Fred wants to relax:

• rest MotivatedByGoal relax (0.50)

• rest UsedFor relax (1.40)

• relax HasSubevent rest (0.50)

Justification for Fred relaxes because Fred wants to take break:

• relax UsedFor take break (0.50)

• take break HasSubevent relax (1.00)

Justification for Fred watches television because Fred wants to relax:

• watch television MotivatedByGoal relax (1.00)

• watch television UsedFor relax (0.50)

Justification for Fred watches television because Fred wants to rest:

• watch television UsedFor rest (0.50)

• rest HasSubevent watch television (0.50)

Justification for Fred lies down because Fred wants to rest:

• lie down MotivatedByGoal rest (0.50)

• rest HasSubevent lie down (1.16)

Justification for Fred lies down because Fred wants to relax:

• lie down MotivatedByGoal relax (0.50)

• relax HasSubevent lie down (0.50)

Justification for Fred falls asleep because Fred wants to rest:

• rest HasSubevent fall asleep (1.40)

Justification for Fred falls asleep because Fred wants to relax:

• relax HasSubevent fall asleep (1.00)

Justification for Fred feels anger towards John because John wakes up Fred, and Fred rests:

• disturb, wake up similar (0.34)

• rule XX feels anger towards YY because XX is resting, and YY disturbs XX.

Justification for Fred feels anger towards John because John wakes up Fred, and Fred relaxes:

• disturb, wake up similar (0.34)

• rest, relax similar (0.44)

• rule XX feels anger towards YY because XX is resting, and YY disturbs XX.

Justification for Fred wants to release energy because Fred feels anger towards John:

• anger CausesDesire release energy (0.50)

Justification for Fred sighs from exasperation because Fred wants to release energy:

• release energy HasSubevent sigh (0.50)

Justification for Fred releases energy because Fred wants to relax:

• release energy MotivatedByGoal relax (0.50)

Justification for Fred releases energy because Fred wants to rest:

• release energy MotivatedByGoal rest (0.50)

Justification for Fred wants to cool off because Fred feels anger towards John:

• anger CausesDesire cool off (0.50)

Justification for Fred sighs from exasperation because Fred wants to cool off:

• cool off HasSubevent sigh (0.50)

Justification for Fred cools off because Fred wants to rest:

• cool off UsedFor rest (0.50)

Justification for Fred cools off because Fred wants to relax:

• cool off UsedFor relax (0.50)

Justification for Fred takes a walk because Fred wants to relax:

• take walk UsedFor relax (0.79)

Justification for Fred takes a walk because Fred wants to cool off:

• take walk UsedFor cool off (0.50)

Justification for Fred calms down because Fred wants to cool off:

• cool off HasSubevent calm down (1.00)
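The assertions listed above are ConceptNet edges of the form concept, relation, concept, each annotated with a weight. As an illustration only, and not a description of how Genesis itself reaches ConceptNet, the following sketch retrieves such edges through ConceptNet 5's public web API; it assumes the Python requests library, and the weights returned by the live service need not match the values reported in this appendix.

# Illustrative only: fetch ConceptNet edges such as
# "work hard CausesDesire relax" through the public ConceptNet 5 web API.
# Genesis's own access path to ConceptNet may differ.
import requests

def causes_desire_edges(concept, lang="en"):
    """Return (start, end, weight) triples for CausesDesire edges
    starting at the given concept, e.g. "work hard"."""
    node = "/c/{}/{}".format(lang, concept.replace(" ", "_"))
    response = requests.get(
        "http://api.conceptnet.io/query",
        params={"start": node, "rel": "/r/CausesDesire", "limit": 50},
    )
    response.raise_for_status()
    return [
        (edge["start"]["label"], edge["end"]["label"], edge["weight"])
        for edge in response.json().get("edges", [])
    ]

if __name__ == "__main__":
    # Something like ("work hard", "relax", ...) should appear in the output.
    for start, end, weight in causes_desire_edges("work hard"):
        print("{} CausesDesire {} ({:.2f})".format(start, end, weight))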

Mexican-American War

Story

The United States is a country. Mexico is a country. The year is 1846. Manifest Destiny is popular in the United States. The United States has ambition, has greed, and wants to gain land. Mexico and the United States disagree over borders. The United States moves into the disputed territory. Mexican forces attack the United States. The United States declares war on Mexico. Winfield Scott leads the United States army. The United States battles Mexico. The United States triumphs by capturing Mexico City. The United States defeats Mexico and wins the war. The Treaty of Guadalupe Hidalgo officially ends the war.

Genesis Rules

None.

ConceptNet Justifications

Justification for The United States wants to conquer opponent because The United States has greed:

• greed CausesDesire conquer opponent (0.50)

Justification for The United States triumphs by capturing Mexico City because The United States wants to conquer opponent:

• conquer opponent HasSubevent triumph (0.50)

Justification for The United States wants to reach advantage because The United States has ambition:

• ambition CausesDesire reach advantage (0.79)

Justification for The United States conquers opponent because The United States wants to reach advantage:

• reach advantage HasSubevent conquer opponent (0.50)

Justification for The United States conquers opponent because The United States wants to gain land:

• conquer opponent UsedFor gain land (0.50)

Justification for The United States wants to get something because The United States wants to gain land:

• want CausesDesire get something (0.79)

Justification for The United States wants to get something because The United States has greed:

• greed CausesDesire get something (0.50)

Justification for The United States wants to conquer opponent because The United States has ambition:

• ambition CausesDesire conquer opponent (0.50)

Justification for The United States wants to conquer nation because The United States has greed:

• greed CausesDesire conquer nation (0.50)

Justification for The United States wins the war because The United States wants to conquer nation:

• conquer nation HasSubevent win war (0.50)

Justification for The United States conquers nation because The United States wants to gain land:

• conquer nation UsedFor gain land (0.50)
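Each justification in these listings pairs an inferred event or desire with the ConceptNet assertions that support it. Purely to illustrate that structure, the sketch below shows one possible representation; the class and field names are invented for this example and are not Genesis's internal data structures.

# Illustrative representation of a justification as listed in this
# appendix. The names here are invented for the sketch, not Genesis's.
from dataclasses import dataclass
from typing import List

@dataclass
class Assertion:
    start: str      # e.g. "greed"
    relation: str   # e.g. "CausesDesire"
    end: str        # e.g. "conquer opponent"
    weight: float   # e.g. 0.50

@dataclass
class Justification:
    conclusion: str            # the inferred story element
    premise: str               # the story element that triggered it
    evidence: List[Assertion]  # supporting ConceptNet edges

example = Justification(
    conclusion="The United States wants to conquer opponent",
    premise="The United States has greed",
    evidence=[Assertion("greed", "CausesDesire", "conquer opponent", 0.50)],
)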

A Long Day, Alternate Ending

Story

Fred is a person. John is a person. John is Fred’s roommate. Fred works hard at the office. Fred leaves work. Fred buys beer. Fred comes home. Fred opens a beer and watches television. Fred lies down and falls asleep. John comes home. John plays loud music and wakes up Fred. Fred fights with John over his music. Tempers flare. John apologizes and they drink beer together.

Genesis Rules

• If XX is resting and YY disturbs XX, then XX feels anger towards YY.

ConceptNet Justifications

Justification for Fred wants to relax because Fred works at the office hard:

• work hard CausesDesire relax (0.50)

Justification for Fred buys beer because Fred wants to relax:

• buy beer MotivatedByGoal relax (0.50)

• buy beer UsedFor relax (0.50)

Justification for Fred wants to rest because Fred works at the office hard:

• work CausesDesire rest (1.29)

Justification for Fred relaxes because Fred wants to rest:

• relax MotivatedByGoal rest (0.79)

• relax UsedFor rest (0.50)

• rest HasSubevent relax (1.16)

Justification for Fred wants to take break because Fred works at the office hard:

• work hard CausesDesire take break (0.79)

Justification for Fred rests because Fred wants to take break:

• rest UsedFor take break (0.79)

• take break HasSubevent rest (1.16)

Justification for Fred takes break because Fred wants to relax:

• take break MotivatedByGoal relax (0.50)

Justification for Fred takes break because Fred wants to rest:

• take break UsedFor rest (0.50)

Justification for Fred rests because Fred wants to relax:

• rest MotivatedByGoal relax (0.50)

• rest UsedFor relax (1.40)

• relax HasSubevent rest (0.50)

Justification for Fred relaxes because Fred wants to take break:

• relax UsedFor take break (0.50)

• take break HasSubevent relax (1.00)

Justification for Fred watches television because Fred wants to relax:

• watch television MotivatedByGoal relax (1.00)

• watch television UsedFor relax (0.50)

Justification for Fred watches television because Fred wants to rest:

• watch television UsedFor rest (0.50)

• rest HasSubevent watch television (0.50)

Justification for Fred lies down because Fred wants to relax:

• lie down MotivatedByGoal relax (0.50)

• relax HasSubevent lie down (0.50)

Justification for Fred lies down because Fred wants to rest:

• lie down MotivatedByGoal rest (0.50)

• rest HasSubevent lie down (1.16)

Justification for Fred falls asleep because Fred wants to relax:

• relax HasSubevent fall asleep (1.00)

Justification for Fred falls asleep because Fred wants to rest:

• rest HasSubevent fall asleep (1.40)

Justification for Fred feels anger towards John because John wakes up Fred, and Fred rests:

• disturb, wake up similar (0.34)

• rule XX feels anger towards YY because XX is resting, and YY disturbs XX.

Justification for Fred feels anger towards John because John wakes up Fred, and Fred relaxes:

• disturb, wake up similar (0.34)

• rest, relax similar (0.44)

• rule XX feels anger towards YY because XX is resting, and YY disturbs XX.

Justification for Fred wants to advance battle because Fred feels anger towards John:

• anger CausesDesire advance battle (0.50)

Justification for Fred fights with John over his loud music because Fred wants to advance battle:

• fight MotivatedByGoal advance battle (0.50)

Justification for Fred wants to destroy enemy because Fred feels anger towards John:

• anger CausesDesire destroy enemy (0.50)

Justification for Fred advances battle because Fred wants to destroy enemy:

• advance battle MotivatedByGoal destroy enemy (0.50)

Justification for Fred wants to fight because Fred feels anger towards John:

• anger CausesDesire fight (0.50)

Justification for Fred advances battle because Fred wants to fight:

• advance battle MotivatedByGoal fight (0.50)

• advance battle UsedFor fight (1.00)

Justification for Fred fights because Fred wants to advance battle:

• fight MotivatedByGoal advance battle (0.50)

Justification for Fred wants to die because Fred feels anger towards John:

• anger CausesDesire die (0.50)

Justification for Fred advances battle because Fred wants to die:

• advance battle UsedFor die (0.50)

Justification for Fred wants to release energy because Fred feels anger towards John:

• anger CausesDesire release energy (0.50)

Justification for Fred dies because Fred wants to release energy:

• release energy HasSubevent die (0.50)

Justification for Fred releases energy because Fred wants to rest:

• release energy MotivatedByGoal rest (0.50)

Justification for Fred releases energy because Fred wants to relax:

• release energy MotivatedByGoal relax (0.50)

Justification for Fred wants to kill because Fred feels anger towards John:

• anger CausesDesire kill (0.79)

Justification for Fred dies because Fred wants to kill:

• kill HasSubevent die (1.85)

Justification for Fred kills because Fred wants to die:

• kill MotivatedByGoal die (0.50)

Justification for Fred kills because Fred wants to advance battle:

• advance battle HasSubevent kill (0.50)

Justification for Fred wants to join army because Fred feels anger towards John:

• anger CausesDesire join army (0.50)

Justification for Fred kills because Fred wants to join army:

• join army HasSubevent kill (0.50)

Justification for Fred joins army because Fred wants to fight:

• join army MotivatedByGoal fight (0.50)

Justification for Fred wants to kill person because Fred feels anger towards John:

• anger CausesDesire kill person (0.79)

Justification for Fred joins army because Fred wants to kill person:

• join army UsedFor kill person (1.00)

Justification for Fred kills person because Fred wants to die:

• kill person MotivatedByGoal die (0.79)

Justification for Fred kills person because Fred wants to destroy enemy:

• destroy enemy HasSubevent kill person (0.79)

Justification for Fred kills because Fred wants to destroy enemy:

• kill MotivatedByGoal destroy enemy (0.50)

• destroy enemy HasSubevent kill (0.79)

Justification for Fred kills because Fred wants to kill person:

• kill person HasSubevent kill (0.79)

Justification for Fred dies because Fred wants to relax:

• relax HasSubevent die (0.50)

Justification for Fred dies because Fred wants to destroy enemy:

• destroy enemy HasSubevent die (0.79)

Justification for Fred advances battle because Fred wants to kill person:

• advance battle UsedFor kill person (0.50)

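Several justifications above apply a user-defined rule to a related but not identical concept and report a similarity score, for example "disturb, wake up similar (0.34)". As an illustration only, the sketch below obtains a concept-to-concept relatedness score from ConceptNet 5's public web API; the similarity values Genesis reports are not necessarily computed this way, and the Python requests library is assumed.

# Illustrative only: score how related two concepts are using the
# ConceptNet 5 web API. Genesis's reported similarity values
# (e.g. "disturb, wake up similar (0.34)") may be computed differently.
import requests

def relatedness(concept_a, concept_b, lang="en"):
    def node(concept):
        return "/c/{}/{}".format(lang, concept.replace(" ", "_"))
    response = requests.get(
        "http://api.conceptnet.io/relatedness",
        params={"node1": node(concept_a), "node2": node(concept_b)},
    )
    response.raise_for_status()
    return response.json()["value"]

if __name__ == "__main__":
    # A sufficiently high score would let a rule written about "disturb"
    # also apply when a character is woken up.
    print(relatedness("disturb", "wake up"))
    print(relatedness("rest", "relax"))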