
Proceedings of the Fifth Conference on Human Computation and Crowdsourcing (HCOMP 2017)

Crowdsourcing a Parallel Corpus for Conceptual Analysis of Natural Language

Jamie C. Macbeth, Sandra Grandic
Department of Electrical and Computer Systems Engineering, Fairfield University
1073 North Benson Road, Fairfield, Connecticut 06824

Abstract

Computer users today are demanding greater performance from systems that understand and respond intelligently to human language as input. In the past, researchers proposed and built conceptual analysis systems that attempted to understand language in depth by decomposing a text into structures representing complex combinations of primitive acts, events, and state changes in the world the way people conceive them. However, these systems have traditionally been time-consuming and costly to build and maintain by hand. This paper presents two studies of crowdsourcing a parallel corpus to build conceptual analysis systems through machine learning. In the first study, we found that crowdworkers can view simple English sentences built around specific action words, and build conceptual structures that represent decompositions of the meaning of that action word into simple and complex combinations of conceptual primitives. The conceptual structures created by crowdworkers largely agree with a set of gold standard conceptual structures built by experts, but are often missing parts of the gold standard conceptualization. In the second study, we developed and tested a novel method for improving the corpus through a subsequent round of crowdsourcing; in this "refinement" step, we presented only conceptual structures to a second set of crowdworkers, and found that when crowdworkers could identify the action word in the original sentence based only on the conceptual structure, the conceptual structure was a stronger match to the gold standard structure for that sentence. We also calculated a statistically significant correlation between the number of crowdworkers who identified the original action word for a conceptual structure, and the degree of matching between the conceptual structure and a gold standard conceptual structure. This indicates that crowdsourcing may be used not only to generate the conceptual structures, but also to select only those of the highest quality for a parallel corpus linking them to natural language.

Introduction

Building systems to understand and respond to natural language input has been a goal of artificial intelligence research for decades, and recently researchers have applied novel machine learning techniques to vast text corpora to solve problems posed by natural language processing (NLP). But, as the authors of a recent survey of NLP research assert, "the truly difficult problems of semantics, context, and knowledge will probably require new discoveries" (Hirschberg and Manning 2015), and solutions to many problems in NLP still appear to be out of reach using existing corpora. Another rich tradition of work in cognitive artificial intelligence strives for human-like performance on language comprehension tasks by performing analyses of text that are driven by conceptual representations rather than grammar and syntax. Conceptual analysis systems attempt to transform a text into a non-linguistic representation that reflects how humans conceive of the physical and social situations represented by the language, to support the sort of memory retrieval and inference processes that humans perform when understanding language in depth.

A conceptual analyzer requires a mapping between lexical items—words and phrases—and the non-linguistic conceptual structures that form the elements of subsequent analyses, a subsystem traditionally built by hand. As with many efforts in symbolic artificial intelligence, the development of conceptual analysis systems must confront the knowledge engineering problem: the impracticality of building and maintaining such systems at scale manually, especially when machine learning technologies are available. However, to build conceptual analysis systems via machine learning will require at least an initial kernel of non-linguistic conceptual structures tied to language, which are unlikely to be gleaned from existing corpora (Schuler 2005; Miller 1995), because they comprise commonsense knowledge understood by all competent language users that typically goes unsaid.

For many years, crowdsourcing (Howe 2006) has been used to annotate datasets for natural language processing based on machine learning. More recently, crowdsourcing has even demonstrated promise for collecting narrative intelligence knowledge (Li et al. 2012). But there are great potential challenges to crowdsourcing a corpus for machine learning-based conceptual analysis: building conceptual structures is an abstract and complex process, even for experts, and crowdworkers likely will not have background knowledge of the conceptual representation—either its primitive elements, or the connective elements that allow one to build larger structures.

This paper presents studies of methods for leveraging crowdsourcing to develop a corpus for conceptual analysis.

We find that crowdworkers can build both simple and complex conceptual structures in a language-free representation called Conceptual Dependency. The conceptual structures they build are based on their understandings of simple sentences in English, and they largely matched and agreed with a set of gold standard conceptual structures created by experts. We also developed and tested a novel method for improving the corpus of conceptual structures through a subsequent round of crowdsourcing. We found that when we presented decomposed conceptual structures to a second set of crowdworkers, they could often "recompose" them and identify the action word in the original sentence, and this indicated that the conceptual structure was a strong match to the gold standard structure for that word. We calculated a statistically significant correlation between the fraction of times crowdworkers could recompose the conceptual structure and determine the original action word, and the degree of matching between the conceptual structure and a gold standard conceptual structure created by experts. This demonstrates that crowdsourcing can be used to generate a parallel corpus of conceptual structures tied to lexical items, and that it can be used to refine the corpus to achieve a quality comparable to that provided by experts knowledgeable in the representation.

Background and Prior Work

In the last century, experimental psychologists and psycholinguists who studied language behavior showed that when human listeners understand a sentence, they quickly forget its grammatical form or syntax (Sachs 1967), and whether a particular meaning was conveyed by a noun or verb (Johnson-Laird, Robins, and Velicogna 1974). Other related research found that humans tend to construct mental models (Johnson-Laird 1983) and imagery (Paivio 1971) rather than ontologies or structures in classical logic (Rosch 1975; Wason and Johnson-Laird 1972) in their language understanding processes. Based on these insights, many artificial intelligence researchers of the same period were motivated by psychological and cognitive validity instead of grammar- and syntax-directed analyses when building systems to understand the meaning behind language; they designed conceptual representations, built systems that analyzed natural language expressions into those representations, and they simulated human memory and common sense inference in systems that performed tasks such as narrative comprehension, paraphrasing, and summarization (Quillian 1968; Schank 1975; Anderson and Bower 1973).

One of the first demonstrations of conceptual analysis was a system called MARGIE (Schank et al. 1975), which analyzed natural language input and generated natural language output in the form of inferences and paraphrases. To analyze its input text, MARGIE built a representation of its meaning in a system called Conceptual Dependency (Schank 1972; 1975), a semantic primitive system (Schank 1972; Wilks and Fass 1992; Jackendoff 1983; Wierzbicka 1996; Wilks 1996; Winograd 1978) whose structures reflect the pictures and representations in people's minds about real-world acts and events, as well as the changes in the state of the world that result from them. The theory behind Conceptual Dependency proposes that the conceptual structures that a human understander of language manipulates are not isomorphic to the words, phrases, or sentences of their language, but that the thought process behind language understanding takes place in a "private" realm of a "language of thought," having different origins and, therefore, different characteristics from the spoken language of a human understander (Miller and Johnson-Laird 1976; Fodor 1975; Schank 1975). Conceptual Dependency (CD) decomposes meanings into complex structures based on combinations of "language-free" conceptual primitives, comprising a thought representation of the actual events separate from the language. Figure 1 shows examples of sentences and their conceptual analyses in CD. At the same time, building these systems by hand incurs the high cost of knowledge engineering (Feigenbaum 1977) of the symbolic structures that comprise the conceptual analysis.

Crowdsourcing is now well known as an inexpensive and fast method to collect human annotation of datasets for classifiers of natural language based on machine learning. When researchers have approached the viability of crowdsourcing for annotations of natural language corpora, the main question has been whether people recruited from crowds can provide performance and annotation quality comparable to experts (Callison-Burch 2009; Snow et al. 2008). Crowdsourcing has been used to collect commonsense knowledge, narrative intelligence knowledge, and annotations of FrameNet (Havasi, Speer, and Alonso 2007; Li et al. 2012; Boujarwah, Abowd, and Arriaga 2012; Chang et al. 2015). In all of these cases, however, the annotations and knowledge structures that crowdworkers provide are in the form of natural language. While studies have shown that crowdworkers can grasp language-free primitives in coherent ways to collect commonsense knowledge (Johnson-Laird and Quinn 1976; Macbeth and Barionnette 2016), no research to date has addressed the challenges of using crowdsourcing to collect complex structures based on these language-free primitives to support conceptual analysis systems.

Research Questions

We designed a study with crowdworkers as human participants to determine whether we could build a parallel corpus for conceptual analysis systems through crowdsourcing. We wanted to know if crowdworkers could be presented with natural language expressions—in our case, simple English sentences—and could, through our guided process, translate their conceptualizations of the meanings of the sentences into a non-linguistic, primitive decomposition-based conceptual representation. We were interested in answering the following research questions (RQs) in our investigation:

• RQ1: Capability. Can crowdworkers build simple conceptualizations in a language-free conceptual base like Conceptual Dependency?

• RQ2: Complexity. Can crowdworkers build complex conceptualizations by connecting simple conceptualizations through connectives?

• RQ3: Quality. Do crowdworkers reliably build conceptualizations that agree strongly with a gold standard generated by experts?

• RQ4: Refinement. Can subsequent rounds of crowdsourcing be used to filter out the best conceptualizations built by earlier rounds of crowdsourcing?

First Study

To address these questions, our first empirical study presented participants with simple English sentences, and asked them to build language-free conceptualizations corresponding to the acts in those sentences.

CD Primitives and the Instrument Case

We chose the Conceptual Dependency (CD) meaning representation for our study because it intends to be a non-linguistic conceptual representation, because it has relatively few primitives, and because it was applied successfully to natural language understanding systems over several decades (Schank and Abelson 1977; Schank and Riesbeck 1982; Lytinen 1992). CD represents meaning by decomposing it into complex combinations of primitive states, events, and acts. Although CD is usually presented as having 11 primitive acts and several methods for connecting primitive acts, here we focused on collecting conceptualizations built only from CD's six physical primitives, and its "instrument" case, which forms a connection between two primitive acts. The CD physical primitives are (these brief descriptions are from Schank (1975)):

PROPEL: To apply a force to.
MOVE: To move a body part.
INGEST: To take something to the inside of an animate object.
EXPEL: To take something from inside an animate object and force it out.
GRASP: To physically grasp an object.
PTRANS: To change the location of something.

We define a primitive conceptualization or simple conceptualization of an act to be one that specifies the primitive act, the actor, the object, and, occasionally, a direction case. A primitive conceptualization may also have an instrument case as an arc to another primitive conceptualization: this represents that the actor performs one primitive act as part of accomplishing another primitive act. The instrument case is frequently used in CD representations of physical action verbs, where the conceptual structure consists of two or more physical primitive act structures with the same actor and with certain acts being the instruments of others. We call a larger conceptualization composed of more than one primitive conceptualization in this way a complex conceptualization.

Sentences and Corresponding CD Diagrams

For our study we created six simple English sentences. We constructed sentences for which we thought participants would conceptualize more than one CD primitive act and would see one act as being the instrument of the other act. We also constructed six gold standard CD conceptual structures corresponding to these sentences, where each had two primitive acts, with one serving as the instrument for the other. The sentences and their corresponding gold standard CD diagrams were inspired by those in Schank (1975) and are shown in Figure 1.

For example, for the sentence "Lisa kicked the ball," the diagram is a PROPEL primitive conceptualization with Lisa as the actor and the ball as the object (Lisa PROPELing the ball). The PROPEL has a MOVE primitive conceptualization as its instrument, with Lisa as the actor, Lisa's foot as the object, and the ball as the "to" direction case (Lisa MOVEing her foot toward the ball).

Conceptualization Questionnaire

To study crowdworkers creating CD structures corresponding to the sentences, we presented participants with descriptions of the six CD physical primitives, and presented the sentence of interest as the target sentence. The descriptions of the CD physical primitives did not include the traditional names of the primitives given above, since we did not want participants to be biased by the names; instead each primitive was identified with a number.

We then asked crowdworkers a number of questions about how they conceptualized the meaning of the target sentence. The questionnaire had three parts. The first part asked questions about the details of the primary primitive act in the target sentence to determine the physical primitive, the actor, the object, and any directionality to the act. The second part of the questionnaire asked questions about a possible secondary primitive act by asking, "Did the actor do anything else as part of the target sentence?" and by asking if there was a second action corresponding to one of the other primitives that "made the first action happen." Following this was the third part of the questionnaire, which consisted of a series of six questions identical to those in the first part, except that they focused on the secondary primitive act instead of the primary primitive act. The questionnaire also asked for brief explanations of the choices of primitive acts, and provided a dynamic element showing a natural language gloss description of the conceptualization based on the user's entries in the form. Overall, the questions corresponded to the parts of the gold standard complex conceptualizations, which were composed of two simple, primitive conceptualizations connected by an "instrumental" link. The HTML interface to the questionnaire is shown in Figure 2.

We conducted the study on Amazon Mechanical Turk by creating six human intelligence tasks (HITs) of our questionnaire, one for each target sentence. We estimated the time to complete the task at 15 minutes, and Turkers were rewarded $1 for completing the task. We accepted all 20 submissions for each target sentence from 20 unique Mechanical Turk workers, for a total of 120 submissions, and no submissions were rejected.

[Figure 1 diagrams, rendered textually:
"Lisa kicked the ball.": Lisa PROPEL ball, with instrument Lisa MOVE foot toward ball.
"Amy took a deep breath.": Amy INGEST air toward lungs, with instrument Amy MOVE chest.
"Joe punched David.": Joe PROPEL David, with instrument Joe MOVE fist toward David.
"Michelle threw up her lunch.": Michelle EXPEL food toward outside, with instrument Michelle MOVE stomach.
"Richard seized the doorknob.": Richard GRASP doorknob, with instrument Richard MOVE hand toward doorknob.
"Bill walked to the store.": Bill PTRANS Bill toward store, with instrument Bill MOVE legs.]
Figure 1: Sentences and corresponding gold standard Conceptual Dependency diagrams. Double arrows point to the actor, and single arrows marked “o”, “D”, and “I” indicate the object, the “to” directional case, and the instrument case, respectively.
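To make the structure of these diagrams concrete, the following is a minimal Python sketch of a primitive conceptualization and of the gold standard structure for "Lisa kicked the ball"; the class name and field names are our own illustrative choices, not part of CD notation or any tooling described in the paper.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PrimitiveConceptualization:
    """A simple CD conceptualization: one primitive act plus its cases."""
    act: str                   # PROPEL, MOVE, INGEST, EXPEL, GRASP, or PTRANS
    actor: str                 # the entity the double arrow points to
    obj: Optional[str] = None  # the "o" (object) case
    direction: Optional[str] = None  # the "D" ("to" direction) case
    # The "I" (instrument) case links to another primitive conceptualization,
    # forming a complex conceptualization.
    instrument: Optional["PrimitiveConceptualization"] = None

# Gold standard for "Lisa kicked the ball." (Figure 1): Lisa PROPELs the
# ball, with Lisa MOVEing her foot toward the ball as the instrument.
lisa_kick = PrimitiveConceptualization(
    act="PROPEL", actor="Lisa", obj="ball",
    instrument=PrimitiveConceptualization(
        act="MOVE", actor="Lisa", obj="foot", direction="ball"))
```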

To qualify to perform the task, the Turkers were required to be "Masters" workers with at least 1000 HITs approved. Turkers used an average time of 10 minutes and 45 seconds and a median time of 6 minutes and 40 seconds to complete the task.

First Study Results

Building Simple Primitive Conceptualizations

For our RQ1, we wanted to know if participants were at least able to build simple conceptualizations in CD, a language-free conceptual base, through the questionnaire. Simple primitive conceptualizations consisted of a selection of a primitive act, specification of an actor and object, and possibly specification of a direction for the act. For the six sentences, all submissions by participants provided at least a primary primitive act and an actor. 115 out of 120 submissions by participants (96%) included an object for the primary primitive conceptualization.

Many submissions provided both primary and secondary primitive conceptualizations. For example, for "Lisa kicked the ball", 20 participants provided 40 primitive conceptualizations in total. The two most common primitives provided were PROPEL and MOVE, which matched primitives in the gold standard CD structure. As a second example, for "Amy took a deep breath," 20 participants provided 28 primitive conceptualizations. In that case, the two most common primitives provided were MOVE and INGEST, which matched primitives in the gold standard CD structure. Table 1 gives examples of the object and direction cases for the two most common primitive conceptualizations for the "Lisa" and "Amy" sentences.

Building Complex Conceptualizations

We wanted to know if participants were able to build complex conceptualizations that connected simple primitive conceptualizations through connectives like instrumentation (RQ2). Participants built complex conceptualizations corresponding to the sentences through the questionnaire. In Part 1, they provided a "primary primitive act" conceptualization, and, in Part 2, they provided a "secondary primitive act" that served as the instrument case for the primary primitive act. In each case they were required to choose one of the six physical primitives as corresponding to the action.

Nearly all participants provided a primary conceptualization, but the questionnaire gave participants the option of providing a second conceptualization or not. In a majority of cases, participants did provide a complex conceptualization corresponding to the sentence. In fact, 92 out of 120 (77%) of the submissions over the six sentences were complex conceptualizations. For "Lisa kicked the ball", all 20 of the conceptualizations provided were complex and consisted of a pair of sub-conceptualizations. For "Amy took a deep breath", however, only eight out of 20 conceptualizations were complex. In summary, the majority of conceptualizations crowdworkers provided were complex, but crowdworkers' willingness to provide a complex conceptualization was strongly dependent on the target sentence.

[Figure 2 interface text:]

Categories: Please use the following categories to identify primitive actions in the sentence below.
(1) A person or thing changes physical position or location.
(2) A person or thing moves a part of its body.
(3) A person or thing grabs hold of or becomes attached to another person or thing.
(4) A person or thing applies a force to, strikes, or collides with another person or thing.
(5) A person or thing is taken from or comes from inside another person or thing to the outside.
(6) A person or thing is forced (or forces itself) to go inside of another person or thing.

Target sentence: "Amy took a deep breath."

Part 1. Please answer the following questions about the meaning of the target sentence.
1. What was the category of the primary primitive action in the target sentence? Use the numbered categories above.
2. Provide a brief (at least one sentence) explanation of your answer to (1):
3. Who was the actor?
4. What was the object of the action (if any)?
5. Was there a directionality to the action? Was it directed toward someone/something?
6. If so, what object or what direction was the act aimed toward? (leave blank if unknown)
You described: "Amy forced air from the outside to the inside." Please feel free to modify your answers so that the description of the primitive action immediately above makes sense with respect to the target sentence. It will change dynamically as you change your answers.

Part 2. You just described the primary primitive action. Did the actor do anything else as part of the target sentence? Please read and answer the questions below.
1. Did the actor also change physical location or position to make that happen?
2. Did the actor also move a part of his/her/its body to make that happen?
3. Did the actor also grab hold of another person, object, or thing to make that happen?
4. Did the actor also strike or apply a force to someone or something to make that happen?
5. Did the actor also force something inside another person or thing to make that happen?
6. Did the actor also bring something from inside a person or thing to the outside to make that happen?

Part 3. If you answered "yes" to one or more questions in Part 2, please answer the following questions about the secondary action.
1. What was the category of the secondary primitive action? Which of the numbered categories above most closely matches your "yes" from Part 2?
2. Provide a brief (at least one sentence) explanation of your answer to (1):
3. Who was the actor?
4. What was the object of the action?
5. Was there a directionality to the action? Was it directed toward someone/something?
6. If so, what object or what direction was the act aimed toward? (leave blank if unknown)
You described: "Amy forced air from the outside to the inside by moving his/her lungs." Please feel free to modify your answers so that the descriptions of the primary and secondary actions immediately above make sense with respect to the target sentence. It will change dynamically as you change your answers.

Figure 2: The HTML user interface for the crowdsourcing task for the first study, showing the descriptions of the conceptual primitives as categories, and typical answers for the target sentence “Amy took a deep breath.” The “You Described” sections of the interface provide a natural language gloss description of the conceptualization based on the users’ entries in the form. These elements are scripted to dynamically update when the user updates their entries.

Target Sentence              CD Primitive Act   Object                                                 Direction
"Lisa kicked the ball."      MOVE               "foot", "leg"                                          "ball"
                             PROPEL             "ball"                                                 "ball"
"Amy took a deep breath."    MOVE               "diaphragm", "respiratory system", "chest", "lungs"    N/A
                             INGEST             "air", "breath", "oxygen", "process of inhaling"       "inward", "her lungs", "Amy"

Table 1: Examples of object and direction cases for the most common primitives provided by participants for two of the target sentences. Results are arranged by target sentence to show the two most common primitives for each sentence. No participant provided a direction case for a MOVE primitive for the sentence "Amy took a deep breath."

Quality of Conceptualizations

Regardless of whether or not crowdworkers can provide us with complex conceptualizations, there are key questions regarding the quality of the conceptualizations they provide (RQ3). We wanted to know if the conceptualizations they provided agreed with the gold standard conceptualization for that target sentence (shown in Figure 1). On the basis of which primitive conceptualizations the participants provided, for the target sentence "Lisa kicked the ball," 16 out of 20 participants created a conceptualization with PROPEL as the primary primitive and MOVE as the secondary primitive (MOVE as the instrument of a PROPEL), which agreed with the gold standard. Three more participants had MOVE as primary and PROPEL as secondary (PROPEL as an instrument of a MOVE), which disagreed with the gold standard only in the direction of the instrumental link. When the orientation of the instrumental link is disregarded, participants agreed with us and with each other in 19 out of 20, or 95%, of cases for the "Lisa" sentence.

However, for the target sentence "Amy took a deep breath", only 4 out of 20 conceptualizations resembled the gold standard expectation of Amy INGESTing air with Amy MOVEing her chest as an instrument case. The conceptualizations were dominated by a simple MOVE primitive conceptualization, which was given by eight participants.

For each conceptualization that was submitted, we calculated a score representing how closely the structure matched our gold standard structure for that sentence, which we called the gold standard match score. For each simple conceptualization structure there were four items: the actor case, the primitive type, the objective case, and the direction case. In participants' submissions, these cases were provided as free-form text, so, in order to accurately determine the agreement between the gold standard and the participants' submissions, we removed articles, possessives, and possessive pronouns like "the," "her," and "Lisa's" from the text answers to these questions. When both primitive types of a complex conceptualization were correct, we also counted the direction of the instrumental link as a factor in the matching score.

Overall, participants' submissions had an average gold standard match score of 65%. Not all of the disagreement could be attributed to the lack of a secondary primitive act in many submissions; when the agreement measurement was constrained to only those submissions that were complex conceptualizations, the agreement with the gold standard increased only modestly, to 74%. Clearly some of the disagreement was due to slightly different ways of referring to objects: for example, "diaphragm", "chest", "lungs", and "respiratory system" all seem to refer to the same part of the body for the MOVE primitive act involved in "Amy took a deep breath" (Table 1). But generally, this underscores the need for more sophisticated systems for singling out the best conceptual structures created and submitted by crowdworkers.
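To make the nine-point scoring concrete, here is a minimal sketch building on the PrimitiveConceptualization class sketched earlier; the normalization word list, the helper names, and the equal weighting of the nine items are our assumptions, since the paper does not give the exact implementation.

```python
def normalize(text):
    """Strip articles, possessives, and possessive pronouns before comparing."""
    if text is None:
        return None
    drop = {"the", "a", "an", "her", "his", "its", "lisa's", "amy's"}  # illustrative
    return " ".join(w for w in text.lower().split() if w not in drop)

def score_simple(sub, gold):
    """One point each for the actor, primitive type, object, and direction cases."""
    cases = ("actor", "act", "obj", "direction")
    return sum(1 for c in cases
               if normalize(getattr(sub, c)) == normalize(getattr(gold, c)))

def gold_standard_match_score(sub, gold):
    """Score a submission against a two-primitive gold structure (maximum 9)."""
    score = score_simple(sub, gold)
    if sub.instrument is not None and gold.instrument is not None:
        score += score_simple(sub.instrument, gold.instrument)
        # With both primitive types correct in this orientation, the direction
        # of the instrumental link also matches, contributing the ninth point.
        if sub.act == gold.act and sub.instrument.act == gold.instrument.act:
            score += 1
    return score
```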
Second Study

In our second study, we investigated whether subsequent rounds of crowdsourcing could be used to select only the best conceptualizations for building and refining a parallel corpus for conceptual analysis (RQ4). We developed a method for refining the conceptualizations that presents them to a second set of crowdworkers who attempt to "recompose" the action word from the original sentence. This sort of quality control step is common in studies of crowdsourcing knowledge for intelligent systems (Li et al. 2012; Boujarwah, Abowd, and Arriaga 2012).

Presenting Complex Conceptualizations

To present the conceptual structures to crowdworkers for refinement, we employed a simple natural language generation scheme, converting the structures into English sentence glosses for presentation to the crowdworkers.

The gloss of the decomposition was generated systematically by creating a sentence with two clauses corresponding to the two primitive conceptualizations that combined to form the complex conceptualization. The main clause of the sentence corresponded to the primary primitive act of the sentence, and was connected to the secondary primitive act of the sentence using "by", indicating its instrumental role. Each clause had the main actor and the object given in the conceptualization, and used a verb corresponding to the conceptual primitive. When a direction was given, a prepositional phrase was added to the clause with the preposition "toward" and the direction object. The second clause, beginning with "by", was added with a gerund form of the verb corresponding to the primitive, along with its object and direction object prepositional phrase. These glosses were identical to those shown in the "You Described" feedback in the interface to the first study. In this second study, we presented only sentences generated from the 92 complex conceptualizations that were collected in the first study.
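The sketch below illustrates this kind of template-based generation; the per-primitive verb phrases are rough guesses in the spirit of the example glosses shown in the figures, not the authors' actual templates.

```python
# Illustrative (past tense, gerund) verb phrases for each primitive; these
# are guesses based on the glosses shown in the figures, not the paper's
# actual templates.
VERBS = {
    "PROPEL": ("struck or applied a force to", "striking or applying a force to"),
    "MOVE":   ("moved", "moving"),
    "INGEST": ("took in", "taking in"),
    "EXPEL":  ("forced out", "forcing out"),
    "GRASP":  ("grabbed hold of", "grabbing hold of"),
    "PTRANS": ("changed the location of", "changing the location of"),
}

def clause(c, gerund=False):
    """Render one primitive conceptualization as a verb-object clause,
    adding a "toward" prepositional phrase when a direction is given."""
    parts = [VERBS[c.act][1 if gerund else 0], c.obj]
    if c.direction:
        parts += ["toward", c.direction]
    return " ".join(parts)

def gloss(c):
    """Main clause for the primary act, joined by "by" to a gerund clause
    for its instrument, as in the "You described" feedback."""
    text = f"{c.actor} {clause(c)}"
    if c.instrument is not None:
        text += f" by {clause(c.instrument, gerund=True)}"
    return text + "."

# gloss(lisa_kick) ->
#   "Lisa struck or applied a force to ball by moving foot toward ball."
```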
Each clause had read, “A person, object, or thing is forced (or forces itself) the main actor () and object given in the concep- to go inside of another person, object, or thing”, did not ac- tualization, and used a verb corresponding to the concep- count well for the possibility of INGESTing air or another tual primitive. When a direction was given, a prepositional gas. More generally, we recognized that our attempts to phrase was added to the clause with the preposition “toward” crowdsource conceptualizations for a corpus will provide an and the direction object. The second clause, beginning with opportunity to evaluate the suitability of the conceptual rep- “by”, was added with a gerund form of the verb correspond- resentation system itself. Earlier work on conceptual prim- ing to the primitive and object and direction object preposi- itive systems in artificial intelligence circles rarely sought tional phrase. These glosses were identical to those shown to determine whether the set of primitives were psychologi- in the “You Described” feedback in the interface to the first cally valid, whether they could be improved, or whether they study. In this second study, we presented only sentences gen- could even be reconceived from scratch using data gathered erated from the 92 complex conceptualizations that were in human-subjects experiments. generated in the first study. Since our motivation was the desire to reduce the time and costs of collecting such a parallel corpus, we consider the Recomposing the Action Word issue of the cost of using this technique. Because the recom- We presented crowdworkers with the generated sentences position task simply involves reading a sentence and pro- and asked them to provide an action word, (a verb in in- viding a single-word tag—data that likely can be obtained finitive form without “to”) as a tag for the action described from the crowd for fractions of a cent, or even for free as a in each sentence. The interface to this task is shown in Fig- qualification task—we focus on the cost of gathering the CD ure 3. This second study was also performed on Mechanical conceptualizations. Turk, where participants were rewarded $0.20 per HIT. We For each sentence in our study we accepted 20 CD dia- created 92 tagging tasks based on the sentences generated grams as submissions, each of which cost $1. However, we

[Figure 4, rendered textually:
Crowdsourced CD diagram: Lisa PROPEL ball, with instrument Lisa MOVE foot toward ball. Generated sentence: "Lisa struck or applied a force to the ball by moving his/her foot toward the ball." Action word: "kick". Recomposition score: 9/10. Gold standard match score: 9/9.
Crowdsourced CD diagram: Amy INGEST air, with instrument Amy MOVE diaphragm. Generated sentence: "Amy forced air from the outside to the inside by moving his/her diaphragm." Action words: "breathe", "inhale", "exhale". Recomposition score: 8/10. Gold standard match score: 7/9.]
Figure 4: The top scoring crowdsourced CD diagrams built by "decomposition" crowdworkers for two of the six original sentences ("Lisa kicked the ball" and "Amy took a deep breath"), along with the generated sentence which was presented to the "recomposition" workers, the expected action word(s), the recomposition score, and the gold standard matching score of the CD structure.

Discussion

Examining several different target sentences provided the opportunity to compare and contrast a variety of performances by crowdworkers on the task. Looking at the frequency with which crowdworkers created complex conceptualizations that agreed with our gold standard, and the degree to which crowdworkers' complex conceptualizations agreed with each other, sentences like "Lisa kicked the ball" were far more successful than other sentences, such as "Amy took a deep breath" and "Michelle threw up her lunch".

On further examination, we realized one reason for the discrepancy may have been that the description of the INGEST primitive that we provided to crowdworkers, which read, "A person, object, or thing is forced (or forces itself) to go inside of another person, object, or thing", did not account well for the possibility of INGESTing air or another gas. More generally, we recognized that our attempts to crowdsource conceptualizations for a corpus will provide an opportunity to evaluate the suitability of the conceptual representation system itself. Earlier work on conceptual primitive systems in artificial intelligence circles rarely sought to determine whether the set of primitives was psychologically valid, whether it could be improved, or whether it could even be reconceived from scratch using data gathered in human-subjects experiments.

Since our motivation was the desire to reduce the time and costs of collecting such a parallel corpus, we consider the issue of the cost of using this technique. Because the recomposition task simply involves reading a sentence and providing a single-word tag—data that likely can be obtained from the crowd for fractions of a cent, or even for free as a qualification task—we focus on the cost of gathering the CD conceptualizations.

For each sentence in our study we accepted 20 CD diagrams as submissions, each of which cost $1. However, we may not need 20 CD diagram submissions to receive one with a high gold standard matching score. We found that of the 120 CD diagrams received in our first study, 52 out of 120, or 43%, had gold standard matching scores of 8 or 9, where 9 was the highest score possible. If p represents this 43% probability, we can calculate the probability of receiving at least one high-scoring diagram in a set of n trials to be 1 − (1 − p)^n. If we set n to five trials (i.e., five CD diagram submissions), there is a 94% probability of receiving at least one high-scoring diagram. At our experimental reward rate, this would cost $5.00 per sentence, but we note that, in designing the tasks for this study, we focused simply on whether crowdsourcing was possible in this domain, and did not seek out lower per-assignment rewards.
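This arithmetic is easy to check; a few lines of Python reproduce the estimate:

```python
p = 52 / 120                     # chance one CD diagram scores 8 or 9 (about 43%)
n = 5                            # diagram submissions collected per sentence
at_least_one = 1 - (1 - p) ** n  # probability of at least one high scorer
print(f"{at_least_one:.0%}")     # prints "94%"; at $1 per submission, $5.00/sentence
```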
Conclusion and Future Work

In this investigation we set out to show that crowdsourcing may provide leverage against the task of collecting a parallel corpus for building conceptual analysis systems that can generate a language-free representation of the meaning of a text. Our empirical studies demonstrated that crowdworkers could provide us with a variety of both simple and complex conceptualizations. We found a statistically significant correlation between the quality of the conceptualizations produced by the crowd, and the fraction of crowdworkers who could figure out the original lexical item with only the decomposed conceptualization as input. This demonstrates that using multiple phases of crowdsourcing with our recomposition method can be an effective way to overcome the noise typically present in crowdsourcing platforms, and it encourages future work to build a parallel corpus matching natural language to high-quality non-linguistic conceptual structures. The studies described in this paper were limited to crowdsourcing the conversion of entire sentences into decomposed conceptual structures. However, our eventual goal is a conceptual analysis system, built using machine learning methods, that automates this conversion.

Our current study only examined the six physical primitives of Conceptual Dependency and the instrumentation connective, with a corpus of only six sentences. For a future evaluation and validation with greater generalizability, we can use a larger and more realistic test corpus of English (Ide and Macleod 2001) and explore techniques for increasing the efficiency and reducing the costs of crowdsourcing in comparisons with expert performance. Future studies will consider other CD primitives for mental and social acts (such as MTRANS), and connecting primitive acts through causation, or they may also consider conceptual representation frameworks different from CD altogether. Finally, crowdsourcing for even larger conceptualizations, such as those in scripts (Li et al. 2012; Schank and Abelson 1977), has yet to be attempted in a language-free conceptual base like CD. Future investigations of these issues will bring us closer to scaling up conceptual representation systems for computers to achieve human-like conceptual understanding of natural language, and, importantly, they will reduce the time and cost to build them.

Acknowledgments

We thank the anonymous reviewers for their thoughtful and insightful comments about our paper, and the organizers for providing us with an opportunity to revise our paper.

References

Anderson, J. R., and Bower, G. H. 1973. Human Associative Memory. Washington, DC: Winston.

Boujarwah, F.; Abowd, G.; and Arriaga, R. 2012. Socially computed scripts to support social problem solving skills. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1987–1996. ACM.

Callison-Burch, C. 2009. Fast, cheap, and creative: Evaluating translation quality using Amazon's Mechanical Turk. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 286–295. Association for Computational Linguistics.

Chang, N.; Paritosh, P.; Huynh, D.; and Baker, C. 2015. Scaling semantic frame annotation. In The 9th Linguistic Annotation Workshop, held in conjunction with NAACL-HLT, 1–10.

Feigenbaum, E. A. 1977. The art of artificial intelligence. 1. Themes and case studies of knowledge engineering. Technical Report STAN-CS-77-621, Stanford University, Department of Computer Science, Stanford, CA.

Fodor, J. A. 1975. The Language of Thought. Cambridge, MA: Harvard University Press.

Havasi, C.; Speer, R.; and Alonso, J. 2007. ConceptNet 3: A flexible, multilingual semantic network for common sense knowledge. In Recent Advances in Natural Language Processing.

Hirschberg, J., and Manning, C. D. 2015. Advances in natural language processing. Science 349(6245):261–266.

Howe, J. 2006. The rise of crowdsourcing. Wired Magazine 14(6):1–4.

Ide, N., and Macleod, C. 2001. The American National Corpus: A standardized resource of American English. In Proceedings of Corpus Linguistics, volume 3, 1–7. Lancaster University Centre for Computer Corpus Research on Language, Lancaster, UK.

Jackendoff, R. S. 1983. Semantics and Cognition, volume 8 of Current Studies in Linguistics. Cambridge, MA: MIT Press.

Johnson-Laird, P., and Quinn, J. 1976. To define true meaning. Nature 264:635–636.

Johnson-Laird, P.; Robins, C.; and Velicogna, L. 1974. Memory for words. Nature 251:704–705.

Johnson-Laird, P. N. 1983. Mental Models: Towards a Cognitive Science of Language, Inference, and Consciousness. Harvard University Press.

Li, B.; Appling, D. S.; Lee-Urban, S.; and Riedl, M. O. 2012. Crowdsourcing narrative intelligence. Advances in Cognitive Systems 2:25–42.

Lytinen, S. L. 1992. Conceptual dependency and its descendants. Computers & Mathematics with Applications 23(2):51–73.

Macbeth, J. C., and Barionnette, M. 2016. The coherence of conceptual primitives. In Proceedings of the Fourth Annual Conference on Advances in Cognitive Systems. The Cognitive Systems Foundation.

Miller, G. A., and Johnson-Laird, P. N. 1976. Language and Perception. Cambridge, MA: Belknap Press.

Miller, G. A. 1995. WordNet: A lexical database for English. Communications of the ACM 38(11):39–41.

Paivio, A. 1971. Imagery and Verbal Processes. New York, NY: Holt, Rinehart & Winston.

Quillian, M. R. 1968. Semantic memory. In Minsky, M., ed., Semantic Information Processing. Cambridge, MA: MIT Press. 227–270.

Rosch, E. 1975. Cognitive representations of semantic categories. Journal of Experimental Psychology: General 104(3):192.

Sachs, J. 1967. Recognition memory for syntactic and semantic aspects of connected discourse. Perception & Psychophysics 2(9):437–442.

Schank, R. C., and Abelson, R. P. 1977. Scripts, Plans, Goals and Understanding: An Inquiry Into Human Knowledge Structures. Hillsdale, NJ: L. Erlbaum Associates.

Schank, R. C., and Riesbeck, C. K. 1982. Inside Computer Understanding: Five Programs Plus Miniatures. Hillsdale, NJ: L. Erlbaum Associates Inc.

Schank, R. C.; Goldman, N. M.; Rieger III, C. J.; and Riesbeck, C. K. 1975. Inference and paraphrase by computer. Journal of the ACM 22(3):309–328.

Schank, R. C. 1972. Conceptual dependency: A theory of natural language understanding. Cognitive Psychology 3(4):552–631.

Schank, R. C. 1975. Conceptual Information Processing. New York, NY: Elsevier.

Schuler, K. K. 2005. VerbNet: A broad-coverage, comprehensive verb lexicon.

Snow, R.; O'Connor, B.; Jurafsky, D.; and Ng, A. Y. 2008. Cheap and fast—but is it good?: Evaluating non-expert annotations for natural language tasks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 254–263. Association for Computational Linguistics.

Wason, P. C., and Johnson-Laird, P. N. 1972. Psychology of Reasoning: Structure and Content. Harvard University Press.

Wierzbicka, A. 1996. Semantics: Primes and Universals. Oxford University Press, UK.

Wilks, Y., and Fass, D. 1992. The preference semantics family. Computers & Mathematics with Applications 23(2–5):205–221.

Wilks, Y. 1996. Good and bad arguments for semantic primitives. Technical report, Computing Research Laboratory, New Mexico State University, Las Cruces, New Mexico.

Winograd, T. 1978. On primitives, prototypes, and other semantic anomalies. In Proceedings of the 1978 Workshop on Theoretical Issues in Natural Language Processing, TINLAP '78, 25–32. Stroudsburg, PA, USA: Association for Computational Linguistics.
