Constructing and Embedding Abstract Event Causality Networks from Text Snippets
Sendong Zhao∗, Quan Wang†, Sean Massung‡, ∗ Bing Qin∗, Ting Liu∗ , Bin Wang†, ChengXiang Zhai‡ ∗School of Computer Science and Technology , Harbin Institute of Technology †Institute of Information Engineering, Chinese Academy of Sciences ‡Computer Science Department, University of Illinois at Urbana-Champaign {sdzhao, bqin, tliu}@ir.hit.edu.cn, {wangquan, wangbin}@iie.ac.cn, {massung1,czhai}@illinois.edu
ABSTRACT causality is not only a kind of knowledge but also the basis In this paper, we formally define the problem of repre- behind reasoning and making sense of the unknown. As senting and leveraging abstract event causality to power a kind of knowledge, causality is an essential resource for downstream applications. We propose a novel solution to question answering and decision making. To be able to this problem, which build an abstract causality network and answer the question What causes tumors to shrink?,one embed the causality network into a continuous vector space. would require a large causality repository [18]. The abstract causality network is generalized from a specific However, how to leverage the power of event causality one, with abstract event nodes represented by frequently co- for downstream applications is very challenging and has occurring word pairs. To perform the embedding task, we never been seriously studied before. Some researchers tried design a dual cause-effect transition model. Therefore, the methods to use event causality for predicting events [25] proposed method can obtain general, frequent, and simple and generating future scenarios [15, 14]. Nevertheless, there causality patterns, meanwhile, simplify event matching. are three major drawbacks to such studies. First, they Given the causality network and the learned embeddings, focus solely on causality between specific events and fail our model can be applied to a wide range of applications such to discover general causality patterns. For example, they as event prediction, event clustering and stock market move- can only detect “A massive 8.9-magnitude earthquake hit → ment prediction. Experimental results demonstrate that 1) northeast Japan on Friday a large amount of houses the abstract causality network is effective for discovering collapsed ”, but cannot create the concise and abstract → high-level causality rules behind specific causal events; 2) the causality pattern “earthquake house collapse”. Second, embedding models perform better than state-of-the-art link event matching is the key to event prediction and future prediction techniques in predicting events; and 3) the event scenario generation. However, the underlying symbolic causality embedding is an easy-to-use and sophisticated nature of tuple matching in [25] or phrase matching in [15] feature for downstream applications such as stock market greatly limits the flexibility of event matching, and might movement prediction. degrade the accuracy of event prediction or future scenario generation. Third, although causality itself is an important resource for reasoning and prediction, the symbolic form of Keywords causality derived from [25, 15, 14] is hard to generalize to Causality, Event causality network, Event prediction, Em- other applications. bedding methods, Stock price movement prediction Due to these limitations, we propose to 1) build an abstract news event causality network from which we can 1. INTRODUCTION obtain general, frequent, and simple causality patterns; and 2) embed the causality network into a continuous vector Causality is the relation between one event (the cause) space to simplify event matching and make it easy to use and a second event (the effect), where the second event is for other applications, resulting in two benefits: 1) general understood as a consequence of the first [5]. Furthermore, causality patterns help people to better understand the ∗ Corresponding author high-level causality rules behind specific causal events; 2) compared to matching tuples or noun phrases, matching in continuous vector spaces is easy to manipulate and much Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed more flexible, which might result in better performance in for profit or commercial advantage and that copies bear this notice and the full citation event prediction or future scenario generation. Besides, on the first page. Copyrights for components of this work owned by others than event causality embedding may improve application tasks ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or such as stock market prediction for two reasons: 1) low- republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. dimensional dense vectors can effectively alleviate feature WSDM 2017, February 06-10, 2017, Cambridge, United Kingdom sparsity issues and causality gives a reasonable way to c 2017 ACM. ISBN 978-1-4503-4675-7/17/02. . . $15.00 embed events into low-dimensional dense vectors; and 2) DOI: http://dx.doi.org/10.1145/3018661.3018707 the causality also benefits from finding the causal factors of
335 (murder-42.1, people) !"#$%&"'( #)* +"' (capture, people) Abstract Causality Network (escape, prison) ! ",(0#1&'(,-'#$)(* +(,-' (send, prison) 8&;!#<&