Arxiv:2105.03157V1 [Cs.CL] 7 May 2021
Total Page:16
File Type:pdf, Size:1020Kb
CO-NNECT: A Framework for Revealing Commonsense Knowledge Paths as Explicitations of Implicit Knowledge in Texts Maria Becker, Katharina Korfhage, Debjit Paul, Anette Frank Department of Computational Linguistics, Heidelberg University mbecker|korfhage|paul|[email protected] Abstract paths can explicate implicit information conveyed by the text. Connections can either be direct, e.g. In this work we leverage commonsense knowl- given the sentences The car was too old and The edge in the form of knowledge paths to es- engine broke down, the concepts car and engine tablish connections between sentences, as a can be linked with a direct relation (singlehop path) form of explicitation of implicit knowledge. Such connections can be direct (singlehop car ! HASA ! engine; or indirect – here inter- paths) or require intermediate concepts (mul- mediate concepts are required to establish the link, tihop paths). To construct such paths we com- as between Berliners produce too much waste and bine two model types in a joint framework we Environmental protection should play a more im- call CO-NNECT: a relation classifier that pre- portant role, where the link between waste and envi- dicts direct connections between concepts; and ronmental protection requires a multihop reasoning a target prediction model that generates tar- path: waste ! RECEIVESACTION ! recycle ! get or intermediate concepts given a source concept and a relation, which we use to con- PARTOF ! environmental protection. struct multihop paths. Unlike prior work that We show that two complementary model types relies exclusively on static knowledge sources, can be combined to solve the two subtasks: (i) we leverage language models finetuned on for predicting singlehop paths between concepts, knowledge stored in ConceptNet, to dynam- we propose a relation classification model that is ically generate knowledge paths, as explana- very precise, but restricted to direct connections be- tions of implicit knowledge that connects sen- tween concepts; (ii) for constructing longer paths tences in texts. As a central contribution we design manual and automatic evaluation set- we rely on a target prediction model that can gen- tings for assessing the quality of the generated erate intermediate concepts and is thus able to paths. We conduct evaluations on two argu- generate multihop paths. However, the interme- mentative datasets and show that a combina- diate concepts can be irrelevant or misleading. To tion of the two model types generates mean- our knowledge, prior work has applied either re- ingful, high-quality knowledge paths between lation classification or target prediction models. sentences that reveal implicit knowledge con- We propose CO-NNECT, a framework that estab- veyed in text. lishes COmmonsense knowledge paths for CON- NECTing sentences by combining relation classi- arXiv:2105.03157v1 [cs.CL] 7 May 2021 1 Introduction fication and target prediction models, leveraging Commonsense knowledge covers simple facts their strengths and minimizing their weaknesses. about the world, people and everyday life, e.g., With CO-NNECT, we obtain high-quality knowl- Birds can fly or Cars are used for driving. It is edge paths that explicate implicit knowledge con- increasily used for many NLP tasks, e.g. for ques- veyed by the text. tion answering (Mihaylov et al., 2018), textual en- We focus on commonsense knowledge in Con- tailment (Weissenborn et al., 2018), or classifying ceptNet (Speer et al., 2017), a knowledge graph argumentative functions (Paul et al., 2020). In this (KG) that represents concepts (words or phrases) work, we leverage commonsense knowledge in the as nodes, and relations between them as edges, form of single- and multihop knowledge paths for e.g., hoven,USEDFOR, bakingi. As instances of the establishing connections between concepts from model types we use COREC (Becker et al., 2019), a different sentences in texts, and show that these multi-label relation classifier that predicts relation types and that we enhance with a pretrained lan- of-the-art results. To our knowledge, Becker et al. guage model; and COMET (Bosselut et al., 2019), (2019) is the only work that proposed a relation a pretrained transformer model that learns to gen- classification model specifically for ConceptNet erate target concepts given a source concept and a relations, which we adapt for our work. Besides relation. In contrast to models that retrieve knowl- COMET (Bosselut et al., 2019), the model used edge from static KGs (Mihaylov et al., 2018; Lin in our approach, Saito et al.(2018) perform tar- et al., 2019), both models are fine-tuned on Con- get prediction on ConceptNet using an attentional ceptNet and applied on the fly, to dynamically gen- encoder-decoder model. They improve the KB erate knowledge paths that generalize beyond the completion model of Li et al.(2016) by jointly static knowledge, allowing us to predict unseen scoring triples and predicting target concepts. knowledge paths. We compare our models to a baseline model that solely relies on static KGs. Utilizing commonsense knowledge paths. We evaluate our framework on two English argu- When using commonsense knowledge for question mentative datasets, IKAT (Becker et al., 2020) and answering (Mihaylov et al., 2018), commonsense ARC (Habernal et al., 2018), which offer annota- reasoning (Lin et al., 2019) or NLI (Kapanipathi tions that explain implicit connections between sen- et al., 2020), most approaches rely on paths re- tences. While knowledge paths have been widely trieved from static knowledge resources. In con- used in NLP downstream tasks, a careful evalua- trast, we propose a framework that in addition tion of these paths has not received much attention. makes use of dynamic knowledge provided by lan- As a central contribution of our work, we address guage models. Few other models have used knowl- this shortcoming by designing manual and auto- edge paths dynamically, e.g. Paul et al.(2020), matic settings for path evaluation: we evaluate the who enrich ConceptNet on the fly when classifying relevance and quality of the paths and their ability argumentative functions. to represent implicit knowledge in an annotation Wang et al.(2020) make use of language mod- experiment; and we compare the paths to the anno- els for question answering. They generate multi- tations of implicit knowledge in IKAT and ARC, hop paths by sampling random walks from Con- using automatic similarity metrics. ceptNet and finetune a language model on these Our main contributions are: i) we propose CO- paths to connect question and answers, improving NNECT, a framework that combines two comple- accuracy on two question answering benchmarks. mentary types of knowledge path prediction models Bosselut et al.(2021) generate knowledge paths that have previously only been applied separately;1 using a language model for zero-shot question an- ii) we show that commonsense knowledge paths swering, which they use to select the answer to generated with CO-NNECT effectively represent a question, surpassing performance of pretrained implied knowledge between sentences; iii) we pro- language models on SocialIQA (a multiple-choice pose an evaluation scheme that measures the qual- question answering dataset for probing machine’s ity of the knowledge paths, going beyond many ap- emotional and social intelligence in a variety of proaches that use knowledge paths for downstream everyday situations). Similarly, Chang et al.(2020) applications without analyzing their quality. incorporate knowledge from ConceptNet in pre- trained language models for SocialIQA. They ex- 2 Related Work tract keywords from question and answers, query ConceptNet for relevant triples, and incorporate In this work we combine relation classification them in their language models via attention. Their and target prediction for generating commonsense evaluation shows that their knowledge-enhanced knowledge representations over text. Relation model outperforms knowledge-agnostic baselines. classification covers a range of methods and learn- Finally, Paul and Frank(2020) propose an atten- ing paradigms for representing relations. A vari- tion model that encodes commonsense inference ety of neural architectures such as RNNs (Zhang rules and incorporates them in a transformer based et al., 2018), CNNs (Guo et al., 2019), sequence- reasoning cell, taking advantage of pretrained lan- to-sequence models (Trisedya et al., 2019) or lan- guage models and structured knowledge. Their guage models (Wu and He, 2019) achieved state- evaluation on two reasoning tasks shows that their 1The code for our framework can be found here: https: model improves performance over models that lack //github.com/Heidelberg-NLP/CO-NNECT. external knowledge. Hence, none of these sys- tems directly evaluates the quality of the generated et al., 2019), a transformer encoder-decoder based paths, but measure the effectiveness of common- on GPT-2 (Radford et al., 2019). Input to the model sense knowledge indirectly by evaluation on down- is a source concept cs and a relation ri. Then the stream tasks. We will address this shortcoming in pretrained language model is finetuned using Con- our work by carefully evaluating the quality of the ceptNet as labelled train set for the task of gener- generated paths. ating new concepts. Depending on the beam size, COMET can propose multiple targets per input 3 Enriching Texts with Commonsense instance. Knowledge Paths Datasets. To compare model performances, we evaluate COREC-LM and COMET on the