Open Mind Common Sense: Crowd-Sourcing for Common Sense

Catherine Havasi, Robert Speer, Kenneth Arnold, Henry Lieberman, Jason Alonso, and Jesse Moeller
Common Sense Computing Initiative and MIT Media Lab
20 Ames Street, Cambridge MA 02139

Open Mind Common Sense (OMCS) is a freely available, crowd-sourced knowledge base of natural language statements about the world. The goal of Open Mind Common Sense is to provide intuition to AI systems and applications by giving them access to a broad collection of basic information and the computational tools to work with this data. For our system demo, we will present three aspects of the OMCS project: the OMCS knowledge base; the ConceptNet semantic network (Liu and Singh 2004; Havasi, Speer, and Alonso 2007); and the AnalogySpace algorithm (Speer, Havasi, and Lieberman 2008), which deals well with noisy, user-contributed data.

The OMCS Knowledge Base

Open Mind Common Sense takes a distributed approach to the problem of commonsense knowledge acquisition. The project allows the general public to enter commonsense knowledge into it, without requiring any knowledge of linguistics, artificial intelligence, or computer science. OMCS has been collecting commonsense statements from volunteers on the Internet since 2000. In that time, we've collected more than a million pieces of English-language commonsense data from more than 17,000 contributors. The OMCS project has expanded to other languages, with sites collecting knowledge in Portuguese, Korean, Japanese, and other languages.

The information contained in ConceptNet includes relations between everyday objects ("Books are used for reading."), information on people's priorities and goals ("People want to be respected."), and affectual information ("Arguments make people angry.").

ConceptNet

To make the knowledge in the OMCS corpus accessible to AI applications and techniques, we transform it into a semantic network called ConceptNet. ConceptNet is a graph whose edges, or assertions, express commonsense relationships between two short phrases, known as concepts. The edges are labeled from a defined set of relations, such as IsA, HasA, or UsedFor, expressing what relationship holds between the concepts. Each assertion additionally has a score to indicate its reliability, which increases either when a contributor votes for a statement through our Web site or when multiple contributors submit equivalent statements independently.

AnalogySpace

AnalogySpace (Speer, Havasi, and Lieberman 2008) is a matrix representation of ConceptNet that is "smoothed" using dimensionality reduction. It expresses the knowledge in ConceptNet as a matrix of concepts and the commonsense features that hold true for them, such as ". . . is part of a car" or "a computer is used for . . .".

Reducing the dimensionality of this matrix using truncated singular value decomposition has the effect of describing the knowledge in ConceptNet in terms of its most important correlations. It also provides a vector space that represents the concepts and the features simultaneously.

Figure 1: AnalogySpace discovers patterns in common sense knowledge and uses them for inference.

References

Havasi, C.; Speer, R.; and Alonso, J. 2007. ConceptNet 3: a flexible, multilingual semantic network for common sense knowledge. In Recent Advances in Natural Language Processing.

Liu, H., and Singh, P. 2004. ConceptNet: A practical commonsense reasoning toolkit. BT Technology Journal 22(4):211–226.

Speer, R.; Havasi, C.; and Lieberman, H. 2008. AnalogySpace: Reducing the dimensionality of common sense knowledge. In Proceedings of AAAI 2008.

Copyright (c) 2010, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
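
As an illustration of the ConceptNet and AnalogySpace descriptions above, the following is a minimal sketch of how scored assertions might be turned into a concept-by-feature matrix and smoothed with a truncated singular value decomposition. It is not the OMCS implementation: the toy assertions, their scores, the function names `build_matrix` and `truncated_svd`, and the choice of a dense NumPy SVD are all assumptions made for this example. A knowledge base with over a million statements would likely need a sparse solver instead.

```python
import numpy as np

# Hypothetical toy assertions: (concept, relation, concept, reliability score).
ASSERTIONS = [
    ("book", "UsedFor", "read", 3.0),
    ("magazine", "UsedFor", "read", 2.0),
    ("book", "IsA", "object", 1.0),
    ("magazine", "IsA", "object", 1.0),
    ("newspaper", "IsA", "object", 1.0),
    ("wheel", "PartOf", "car", 2.0),
    ("engine", "PartOf", "car", 2.0),
    ("wheel", "IsA", "object", 1.0),
]


def build_matrix(assertions):
    """Build a dense concept-by-feature matrix from scored assertions."""
    concepts, features, entries = {}, {}, []
    for left, relation, right, score in assertions:
        # Each assertion gives two (concept, feature) pairs, e.g.
        # "wheel" with "... is part of a car" and "car" with "a wheel is part of ...".
        pairs = (
            (left, ("right", relation, right)),
            (right, ("left", left, relation)),
        )
        for concept, feature in pairs:
            i = concepts.setdefault(concept, len(concepts))
            j = features.setdefault(feature, len(features))
            entries.append((i, j, score))
    matrix = np.zeros((len(concepts), len(features)))
    for i, j, score in entries:
        matrix[i, j] += score
    return matrix, concepts, features


def truncated_svd(matrix, k):
    """Keep only the k largest singular values and their vectors."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


matrix, concepts, features = build_matrix(ASSERTIONS)
u, s, vt = truncated_svd(matrix, k=2)

# Rank-2 reconstruction: the "smoothed" version of the original matrix.
smoothed = (u * s) @ vt

# Compare a cell that was never asserted with its smoothed value.
row = concepts["newspaper"]
col = features[("right", "UsedFor", "read")]
print("asserted:", matrix[row, col], "smoothed:", smoothed[row, col])
```

In this sketch the rows of `u * s` place concepts, and the columns of `vt` place features, in the same k-dimensional space, so a dot product between a concept vector and a feature vector gives the smoothed matrix entry for that pair.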
