Inferential Commonsense Knowledge from Text
Inferential Commonsense Knowledge from Text

by
Jonathan Michael Gordon

Submitted in Partial Fulfillment of the Requirements for the Degree
Doctor of Philosophy

Supervised by Professor Lenhart Karl Schubert
Department of Computer Science
Arts, Sciences and Engineering
Edmund A. Hajim School of Engineering and Applied Sciences

University of Rochester
Rochester, New York

To my parents.

Biographical Sketch

Jonathan Gordon was born in Wayne, NJ. He attended Vassar College, where he was advised by Nancy Ide. At Vassar, he worked on the annotation of the American National Corpus and, with Luke Hunsberger, performed research on learning models of musical chord progressions. He graduated with a Bachelor of Arts degree in Computer Science and matriculated at the University of Rochester. He was a visiting research assistant at the Institute for Creative Technologies at the University of Southern California, where he worked with H. Chad Lane on the automated analysis of student essays for an intelligent tutoring system. He was awarded the Master of Science degree. At Rochester, he pursued his doctoral research in artificial intelligence under the direction of Lenhart Schubert. The following publications were a result of work conducted during doctoral study:

J. M. Gordon, B. Van Durme, and L. K. Schubert. Weblogs as a source for extracting general world knowledge. In Y. Gil and N. F. Noy, editors, Proceedings of the Fifth International Conference on Knowledge Capture, Redondo Beach, CA, Sept. Association for Computing Machinery.

—. Learning from the Web: Extracting general world knowledge from noisy text. In V. Nastase, R. Navigli, and F. Wu, editors, Proceedings of the Workshop on Collaboratively-built Knowledge Sources and Artificial Intelligence, Atlanta, GA, July. Press.

—. Evaluation of commonsense knowledge with Mechanical Turk. In C. Callison-Burch and M.
Dredze, editors, Proceedings of the Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, Los Angeles, CA, June. Association for Computational Linguistics.

J. M. Gordon and L. K. Schubert. Quantificational sharpening of commonsense knowledge. In C. Havasi, D. B. Lenat, and B. Van Durme, editors, Proceedings of the Fall Symposium on Commonsense Knowledge, Arlington, VA, Nov. Press.

—. Discovering commonsense entailment rules implicit in sentences. In S. Pado and S. Thater, editors, Proceedings of the Workshop on Textual Entailment (TextInfer), Edinburgh, Scotland, July. Association for Computational Linguistics.

—. Using textual patterns to learn expected event frequencies. In J. Fan, R. Hoffman, A. Kalyanpur, S. Riedel, F. M. Suchanek, and P. P. Talukdar, editors, Proceedings of the Workshop on Automatic Knowledge Base Construction and Web-Scale Knowledge Extraction, Montréal, Quebec, Canada, June. Association for Computational Linguistics.

—. WordNet hierarchy axiomatization and the mass–count distinction. In D. A. Evans, M. van der Schaar, and P. Sheu, editors, Proceedings of the Seventh International Conference on Semantic Computing, Irvine, CA, Sept.

J. M. Gordon and B. Van Durme. Reporting bias and knowledge acquisition. In F. M. Suchanek, S. Riedel, S. Singh, and P. P. Talukdar, editors, Proceedings of the Workshop on Automated Knowledge Base Construction, San Francisco, CA, Oct. Association for Computing Machinery.

Acknowledgments

By the toil of others we are led into the preserve of things which have been brought from darkness into light.
On the Shortness of Life

This dissertation would not have been possible without many others, including
— my parents, for a life-long interest in what we can learn by reading;
— Molly, for her love and patience as I read and learned;
— the faculty and staff of the Department of Computer Science, for giving me the time, space, and support to pursue my research;
— my doctoral committee, James Allen, Gregory Carlson, and Daniel Gildea, for their instruction and their questions;
— my research collaborators, including Benjamin Van Durme, Fabrizio Morbini, and Adam Purtee, for sharing their ideas and their work;
— my advisor, Lenhart Schubert, for entirely too much of his time and knowledge.

Abstract

To enable human-level artificial intelligence, machines must have access to the same kind of commonsense knowledge about the world that people have. The best source of such knowledge is text – learning by reading. Implicit in linguistic discourse is information about what people assume to be possible or expect to happen. From these references, I obtain an extensive collection of semantically underspecified 'factoids' – simple predications and conditional rules. Using lexical-semantic resources and corpus frequencies, these factoids are generalized and partially disambiguated to form a collection of reasonable commonsense knowledge. Together with lexical axioms from the interpretation of WordNet, these probabilistic logical inference rules allow a reasoner to draw conclusions about everyday situations as might be encountered while reading a story or conversing with a person.

Contributors and Funding Sources

This work was supervised by a dissertation committee consisting of Professors Lenhart Schubert (advisor), James Allen, and Daniel Gildea of the Department of Computer Science and Professor Gregory Carlson of the Departments of Linguistics and Philosophy.
Work in Chapter and Appendix was completed with additional guidance from Benjamin Van Durme, while the discussion and measurement of reporting bias in Chapter is an expansion on points raised by Van Durme and benefited from feedback by Lenhart Schubert, Peter Clark, and Doug Downey. All other work conducted for the dissertation was completed by the student independently.

This material is based upon work supported by the awards Knowledge Representation and Reasoning Mechanisms for Explicitly Self-Aware Communicative Agents, General Knowledge Bootstrapping from Text, and Adapting a Natural Logic Reasoning Platform to the Task of Entailment Inference; a subcontract from the Friedland Group; and a gift from Bosch Research and Technology. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the above-named organizations.

Contents

Introduction
  Commonsense Knowledge
  Knowledge Representation
  Episodic Logic
  Overview of the Thesis

Knowledge Acquisition
  Knowledge Engineering
  Crowdsourced Knowledge Engineering
  Automatic Knowledge Extraction
  Episodic Logic Interpretation and Abstraction
  Chapter Summary

Text Sources
  Introduction
  Rates of Knowledge Extraction
  Knowledge Overlap
  Text Sources and Knowledge Quality
  Evaluation of Knowledge Quality
  Chapter Summary

Reporting Bias
  Introduction
  Measuring Reporting Bias
  Discussion
  Previous Approaches
  Addressing Reporting Bias
  Chapter Summary

Lore: Learning & Sharpening Implicit Knowledge
  Conditional Knowledge from Text
  Sharpening to Appropriate Form
  Learning Expected Event Frequencies
  Chapter Summary

Making & Evaluating Inferences with Commonsense Rules
  Evaluating Knowledge Extraction
  Knowledge for Reasoning
  Inferential Evaluation
  Chapter Summary

Conclusions
  Summary
  Knowledge for Text Understanding
  Future Work
  Final Remarks

References

Crowdsourced Evaluation & Filtering
  Introduction
  Experiments
  Conclusions

Learning Lexical Axioms
  Introduction
  Previous Work
  Acquiring Axioms
  Evaluation
  Reasoning with WordNet Axioms
  Conclusions
  Discussion and Future Work

List of Tables

The number of factoids extracted from Web and traditional corpora
Average quality of filtered factoids from Web sources
Textual support for body-part extractions
Textual support for extractions about city populations
Textual support for extractions about verbal events
Textual support for winning an Academy Award vs being nominated
Textual references to vehicular accidents
Average ratings and Pearson correlation for rules
Ratings of sharpened factoids
Quantifier conclusion weights
Verbalizations of certainty values associated with statements
Average ratings of inferences from factoids and sharpened axioms
Average ratings from Mechanical Turk
Axiom counts for WordNet as a whole and 'core' synsets
Distribution of ratings for WordNet axioms

List of Figures

The growth of unique factoids learned from Wikipedia and weblogs as more raw factoids are generated
Coverage of Wikipedia factoids by increasing amounts of the raw extractions from weblogs
Instructions for scaled judging
Frequency of ratings assigned to unfiltered factoids from both corpora
Counts for how many rules were assigned each rating by judges
Sharpened formula quality
Sharpened formula quantifier strength
Selected examples from inference with unsharpened factoids
Selected examples from inference with sharpened factoids
Frequency of strength ratings for inferences with factoids and sharpened knowledge
Selected examples of inferences with ReVerb extractions
The provided examples of good and bad factoids
Frequency of ratings in the highly correlated results of Round
Algorithm for axiomatizing WordNet's hypernym hierarchy
System output from the evaluation set
Baseline output from the evaluation set

Introduction

To think is to forget differences,