
Evaluating vector-space models of analogy

Dawn Chen ([email protected]), Joshua C. Peterson ([email protected]), Thomas L. Griffiths (thomas griffi[email protected])

Department of Psychology, University of California, Berkeley, CA 94720 USA

Abstract

Vector-space representations provide geometric tools for reasoning about the similarity of a set of objects and their relationships. Recent machine learning methods for deriving vector-space embeddings of words (e.g., word2vec) have achieved considerable success in natural language processing. These vector spaces have also been shown to exhibit a surprising capacity to capture verbal analogies, with similar results for natural images, giving new life to a classic model of analogies as parallelograms that was first proposed by cognitive scientists. We evaluate the parallelogram model of analogy as applied to modern word embeddings, providing a detailed analysis of the extent to which this approach captures human relational similarity judgments in a large benchmark dataset. We find that some semantic relationships are better captured than others. We then provide evidence for deeper limitations of the parallelogram model based on the intrinsic geometric constraints of vector spaces, paralleling classic results for first-order similarity.

Keywords: analogy; word2vec; GloVe; vector space models

Introduction

Recognizing that two situations have similar patterns of relationships, even though they may be superficially dissimilar, is essential for intelligence. This ability allows a reasoner to transfer knowledge from familiar situations to unfamiliar but analogous situations, enabling analogy to become a powerful teaching tool in math, science, and other fields (Richland & Simms, 2015). Computational modeling of analogy has primarily focused on comparing structured representations that contain labeled relationships between entities (Gentner & Forbus, 2011). However, the questions of where these relations come from and how to determine that the relationship between one pair of entities is the same as that between another pair remain unsolved in such models. Some models, such as DORA (Doumas, Hummel, & Sandhofer, 2008) and BART (Lu, Chen, & Holyoak, 2012), try to learn relations from examples, but have only demonstrated success on comparative relations such as larger.

Another possibility is that the representations of entities themselves contain the information necessary to infer relationships between entities, so that relations do not need to be learned separately. An instantiation of this hypothesis is the parallelogram model of analogy (see Figure 1), first proposed by Rumelhart and Abrahamson (1973) over 40 years ago. In this model, entities are represented as points in a Euclidean space and relations between entities are represented as their difference vectors. Even though two pairs of points may be far apart in the space (i.e., they are featurally dissimilar), they are considered relationally similar as long as their difference vectors are similar. Although Rumelhart and Abrahamson found that this simple model worked well for a small domain of animal words with vectors obtained using multidimensional scaling, little progress was made on the parallelogram model after the initial proposal, with the exception of a handful of reasonably successful applications (see Ehresman & Wessel, 1978).

However, in the past few years, the parallelogram model was reincarnated in the machine learning literature through popular methods such as word2vec (Mikolov, Sutskever, Chen, Corrado, & Dean, 2013) and GloVe (Pennington, Socher, & Manning, 2014). These word representations enable verbal analogy problems such as king : queen :: man : ? to be solved through simple vector arithmetic: v_queen − v_king + v_man results in a vector very close (in terms of cosine distance) to v_woman. Word embeddings like word2vec and GloVe have also been used successfully in a variety of other natural language processing tasks, suggesting that these representations may indeed contain enough information for relations to be inferred from them directly. Recently, researchers in computer vision have been successful in extracting feature spaces that exhibit similar properties in both explicit (supervised) (Radford, Metz, & Chintala, 2015) and implicit (unsupervised) (Reed, Zhang, Zhang, & Lee, 2015) ways, yielding linearized semantic image transformations such as object rotations and high-level human face interpolations. The potential for applying the parallelogram model of analogy to vector space models thus appears to be domain-agnostic, broadly applicable to both semantic and perceptual domains. This suggests a promising cognitive model and provides the opportunity to evaluate a classic theory in large-scale, ecologically valid contexts.

Figure 1: The parallelogram model of analogy completes the analogy king : queen :: man : ? by adding the difference vector between king and queen to man. This forms a parallelogram in the underlying vector space.
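To make the vector arithmetic in Figure 1 concrete, the following is a minimal sketch (not the authors' implementation) of completing an analogy under the parallelogram model; the `embeddings` dictionary of numpy word vectors is a hypothetical stand-in for pre-trained word2vec or GloVe vectors.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two vectors.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def complete_analogy(a, b, c, embeddings):
    """Complete the analogy a : b :: c : ? under the parallelogram model.

    `embeddings` is a hypothetical dict mapping words to equal-length
    numpy arrays (e.g., 300-dimensional word2vec or GloVe vectors).
    """
    target = embeddings[b] - embeddings[a] + embeddings[c]
    candidates = (w for w in embeddings if w not in (a, b, c))
    # Return the vocabulary word whose vector is closest to the target.
    return max(candidates, key=lambda w: cosine(embeddings[w], target))

# Example usage (assuming the relevant words are in the vocabulary):
# complete_analogy("king", "queen", "man", embeddings)  # ideally "woman"
```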
In this paper, we evaluate the parallelogram model of analogy as applied to modern vector-space representations of words. Focusing on the predictions that this approach makes about the relational similarity of words, we provide a new dataset of over 5,000 comparisons between word pairs that exemplify 10 different types of semantic relations. We find that the parallelogram model captures human relational similarity judgments for some semantic relations, but not others. We then show that human relational similarity judgments for pairs of words violate the geometric constraints of symmetry and the triangle inequality, just as Tversky (1977) showed for judgments of first-order similarity between words. This creates a challenge for any vector space model that aims to make predictions about relational similarity.

Relational Similarity

One way to evaluate vector space models such as word2vec and GloVe as accounts of analogy is to compare their assessments of relational similarity – the similarity of the relation between one pair of words to that of another – with human judgments. A good foundation for this task is the SemEval-2012 Task 2 dataset (Jurgens, Turney, Mohammad, & Holyoak, 2012), which contains prototypicality scores based on human data for word pairs that exemplify 79 different semantic relations. These relations were taken from a taxonomy of semantic relations (Bejar, Chaffin, & Embretson, 1991) and are subtypes of 10 general types, such as CLASS-INCLUSION, SIMILAR, and CONTRAST. Participants were given three or four paradigmatic examples of a relation and asked to generate additional examples of the same relation. A total of 3,218 unique word pairs were generated for the 79 relations, with an average of 41 word pairs per relation. A prototypicality score for each participant-generated word pair was calculated based on how often a second group of participants chose the word pair as the best and worst example of the relation among a set of choices. Table 1 shows examples illustrating two representative subtypes of each of the ten general types of relations.

Table 1: Examples of word pairs instantiating each of two representative subtypes from each general relation type in the SemEval-2012 Task 2 dataset.

Relation type         Subtype                     Example
1. CLASS-INCLUSION    Taxonomic                   flower : tulip
                      Class:Individual            river : Nile
2. PART-WHOLE         Object:Component            car : engine
                      Collection:Member           forest : tree
3. SIMILAR            Synonymy                    car : auto
                      Dimensional Similarity      simmer : boil
4. CONTRAST           Contrary                    old : young
                      Reverse                     buy : sell
5. ATTRIBUTE          Item:Attribute              beggar : poor
                      Object:State                coward : fear
6. NON-ATTRIBUTE      Item:Nonattribute           fire : cold
                      Object:Nonstate             corpse : life
7. CASE RELATIONS     Agent:Instrument            soldier : gun
                      Action:Object               plow : earth
8. CAUSE-PURPOSE      Cause:Effect                joke : laughter
                      Cause:Compensatory action   hunger : eat
9. SPACE-TIME         Location:Item               library : book
                      Time:Associated item        winter : snow
10. REFERENCE         Sign:Significant            siren : danger
                      Representation              diary : person

According to the parallelogram model, two pairs of words (A : B and C : D) are relationally similar to the extent that their difference vectors (v_B − v_A and v_D − v_C) are similar. How appropriate is this geometric relationship for the various semantic relations? As a preliminary investigation of this question, we projected the 300-dimensional word2vec vectors into a two-dimensional space using principal components analysis separately for each relational subtype in the SemEval dataset, and visualized the difference vectors for the participant-generated word pairs from each relation. Figure 2 shows the difference vectors for the 20 relational subtypes that are shown in Table 1.

Examining the difference vectors for each relation shows that the parallelogram rule does not appear to capture all relations. CASE RELATIONS Agent:Instrument (e.g., farmer : tractor) shows a nearly perfect correspondence with what we would expect under the parallelogram model, with all difference vectors aligning. However, many of the relations appear to have no clear geometric pattern. Nevertheless, simply looking at projections of the difference vectors is not sufficient to evaluate the power of geometric models of relational similarity to capture various relations, because information is lost in the projections. What is required is a detailed evaluation of judgments of relational similarity between word pairs within each relation.
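As an illustration of the projection behind Figure 2, the sketch below fits a two-dimensional PCA to the word vectors of a single relational subtype and returns the projected difference vectors; `pairs` and `embeddings` are hypothetical placeholders, and this is our own reconstruction rather than the original analysis code.

```python
import numpy as np
from sklearn.decomposition import PCA

def project_difference_vectors(pairs, embeddings):
    """Two-dimensional difference vectors for one relational subtype.

    `pairs` is a hypothetical list of (A, B) word pairs for the subtype and
    `embeddings` maps words to numpy arrays. PCA is fit on the word vectors
    of this subtype, as described in the text, and each pair's difference
    vector is then computed in the 2-D projection (e.g., for plotting as
    arrows from a common origin).
    """
    words = sorted({w for pair in pairs for w in pair})
    pca = PCA(n_components=2).fit(np.array([embeddings[w] for w in words]))
    projected = {w: pca.transform(embeddings[w].reshape(1, -1))[0] for w in words}
    return np.array([projected[b] - projected[a] for a, b in pairs])
```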
Although the SemEval-2012 dataset contains prototypicality scores for the participant-generated word pairs within each relation, which have been interpreted as the relational similarities between the participant-generated pairs and the paradigmatic pairs, prototypicality is influenced by other factors such as the production frequencies of words (Uyeda & Mandler, 1980). Moreover, because participants were encouraged to focus on the relation illustrated by the paradigmatic examples, the prototypicality scores may not have much to do with the particular word pairs chosen as paradigmatic examples. Experiment 1 aims to address these problems.

Experiment 1: Benchmarking Relational Similarity

To overcome the limitations of the SemEval-2012 Task 2 dataset for our purposes, we collected a new large dataset that directly measures human judgments of relational similarity between word pairs, focusing on comparisons between word pairs with similar relations.

Participants  We recruited 823 participants from Amazon Mechanical Turk. Participants were paid $2.00 for the 20-minute study. We excluded 158 participants from the data analysis because they failed two or more of the attention checks (see below).
Figure 2: Visualizations of difference vectors for the 20 relational subtypes in Table 1, using two-dimensional projections of word2vec word vectors obtained separately for each relation using principal components analysis.

Stimuli  The stimuli for this study were taken from the SemEval-2012 Task 2 dataset. We were mainly interested in how people rate relational similarities between participant-generated word pairs within each of the 79 relational subtypes. However, because the total number of such "within-subtype" pairwise comparisons is enormous, we selected the most representative subtype for each relation type out of the two shown in Table 1; the subtype we chose is the first of each pair of subtypes listed there. We then randomly chose 30 word pairs out of the entire participant-generated set for each of the 10 subtypes and formed all possible within-subtype comparisons between these word pairs. This created a set of 4,350 within-subtype comparisons. Finally, in order to encourage participants to use the entire rating scale, we added 925 "between-subtype" comparisons, which are comparisons between word pairs from different subtypes within a type (e.g., Object:Component and Collection:Member, both subtypes of PART-WHOLE), and 925 "between-type" comparisons, which are comparisons between word pairs from the representative subtypes of different relation types (e.g., Object:Component and Taxonomic CLASS-INCLUSION).

Procedure  Participants were given instructions about relational similarity, which included an example of two word pairs with similar relationships (kitten : cat and chick : chicken) and an example of word pairs with dissimilar relationships (chick : chicken and hen : rooster). Participants then viewed two pairs of words side by side on each page and were asked to rate the similarity of the relationships shown by the two word pairs on a scale from 1 (extremely different) to 7 (extremely similar). They rated 100 comparisons in a random order, 70 of which were within-subtype, 15 between-subtype, and 15 between-type. The left-right order of the two word pairs on the screen was chosen randomly (but the order within each pair was of course maintained). After every 20 trials, there was an attention check question that asked participants to indicate whether two words were the same or different.

Results & Discussion  We obtained at least 10 good ratings for each comparison, with an average of 10.74 ratings per comparison. The mean rating across all comparisons was 4.52 (SD = 2.17). As expected, we obtained the highest relational similarity ratings for within-subtype comparisons (M = 5.01, SD = 1.98), mid-level ratings for between-subtype comparisons (M = 4.02, SD = 2.14), and the lowest ratings for between-type comparisons (M = 2.70, SD = 1.93).

We calculated relational similarity for each comparison using word2vec and GloVe word representations. We used the 300-dimensional word2vec vectors trained on the Google News corpus that were provided by Google (Mikolov et al., 2013) and the 300-dimensional GloVe vectors trained on the Common Crawl web corpus that were provided by Pennington et al. (2014). We tested two measures of similarity between difference vectors: cosine similarity and Euclidean distance. Specifically, for a given comparison between two word pairs A : B and C : D, letting r1 = v_B − v_A and r2 = v_D − v_C, we calculated the cosine similarity,

    r1 · r2 / (||r1|| ||r2||),

as well as a similarity measure based on Euclidean distance,

    1 − ||r1 − r2||.

Cosine similarity is typically used to measure similarity in vector spaces such as word2vec and GloVe. However, using Euclidean distance corresponds more closely to the original parallelogram model, in which not only the directions but also the lengths of the difference vectors need to be similar for two word pairs to be considered relationally similar.
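The two measures follow directly from these definitions; here is a minimal sketch, again assuming a hypothetical `embeddings` dictionary of word vectors rather than the authors' code.

```python
import numpy as np

def relational_similarity(a, b, c, d, embeddings, measure="cosine"):
    """Relational similarity between word pairs A : B and C : D.

    Implements the two measures defined in the text: cosine similarity
    between the difference vectors, and 1 minus their Euclidean distance.
    `embeddings` is a hypothetical dict of word vectors.
    """
    r1 = embeddings[b] - embeddings[a]
    r2 = embeddings[d] - embeddings[c]
    if measure == "cosine":
        return np.dot(r1, r2) / (np.linalg.norm(r1) * np.linalg.norm(r2))
    # Euclidean-based measure from the text.
    return 1.0 - np.linalg.norm(r1 - r2)
```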
Figure 3 shows Pearson's correlations between predicted relational similarity scores and average human relational similarity ratings on each relation type (including both within-subtype and between-subtype comparisons) for each vector space and similarity measure. There is considerable variation in how well word2vec and GloVe predict human relational similarity ratings. As might be expected from examining Figure 2, cosine similarity performs best on CASE RELATIONS (relation 7). However, cosine similarity fails completely on SIMILAR (relation 3), CONTRAST (relation 4), and NON-ATTRIBUTE (relation 6). Euclidean distance boosts performance on the latter two relations, though they still underperform relative to most other relations. Nevertheless, Euclidean distance does perform very well on SPACE-TIME (relation 9).

Figure 3: Pearson's r between human relational similarity ratings and model predictions on different relation types for (a) word2vec and (b) GloVe, using cosine similarity and Euclidean distance between difference vectors. The name and examples of each numbered relation type are shown in Table 1.
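The per-relation correlations plotted in Figure 3 can be computed with a sketch along these lines; the `comparisons` records are a hypothetical format, not the authors' data structure.

```python
from collections import defaultdict
from scipy.stats import pearsonr

def correlations_by_relation_type(comparisons):
    """Pearson's r between model predictions and mean human ratings,
    computed separately for each relation type.

    `comparisons` is a hypothetical list of records of the form
    (relation_type, model_prediction, mean_human_rating).
    """
    grouped = defaultdict(lambda: ([], []))
    for rel_type, prediction, rating in comparisons:
        grouped[rel_type][0].append(prediction)
        grouped[rel_type][1].append(rating)
    correlations = {}
    for rel_type, (predictions, ratings) in grouped.items():
        r, _ = pearsonr(predictions, ratings)
        correlations[rel_type] = r
    return correlations
```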
These results indicate that a single relational comparison strategy cannot capture all semantic relations in the spaces provided. It is unclear whether this reflects a limitation of the word embeddings or actual variation in human analogical strategies. Next, we turn to the broader question of the appropriateness of the class of geometric models in general for representing human relational similarity behavior.

Violations of Geometric Constraints

Distance metrics in vector spaces must obey certain geometric constraints, such as symmetry (the distance from x to y is the same as the distance from y to x) and the triangle inequality (if the distance between x and y is small and the distance between y and z is small, then the distance between x and z cannot be very large). Cosine similarity, used to measure similarity between word2vec representations, also obeys symmetry and an analogue of the triangle inequality (Griffiths, Steyvers, & Tenenbaum, 2007). However, psychological representations of similarity do not always obey these constraints (Tversky, 1977). A famous example is that people judge North Korea to be more similar to China than the other way around, a violation of symmetry. Griffiths et al. (2007) examined the word representations derived by latent semantic analysis (Landauer & Dumais, 1997), another well-known vector space model, and found that these representations are unable to account for violations of symmetry and the triangle inequality in human word association data. Nevertheless, all prior work has focused on first-order similarity between words, and second-order (relational) similarity between word pairs might be expected to follow a different pattern. In this section, we show that human judgments of relational similarity also do not satisfy the geometric constraints of symmetry and the triangle inequality. Vector space models such as word2vec and GloVe cannot account for these violations.

Experiment 2: Symmetry

In this experiment, we examined whether there were any pairs of word pairs for which participants' judgments of relational similarity changed when the presentation order was reversed. We might expect such asymmetry to occur when a word pair has multiple relations and shares one of its less salient relations with another word pair. For example, when presented with angry : smile – exhausted : run, one might think, "an angry person doesn't want to smile" and "an exhausted person doesn't want to run," but when presented with exhausted : run – angry : smile, one might think, "running makes a person exhausted, but smiling doesn't make a person angry." Thus, participants might give high relational similarity ratings in the first presentation order and low ratings in the second.

Participants  We recruited 1,102 participants from Amazon Mechanical Turk, who gave informed consent and were paid $1.00 for the 10-minute study. We excluded 99 participants from the data analysis because they failed two or more of the attention checks (see below).

Stimuli  We randomly selected 220 within-subtype, 220 between-subtype, and 60 between-type comparisons from all possible comparisons formed using the entire SemEval-2012 Task 2 dataset. We created two versions of each comparison, in which the order of the word pairs was switched.

Procedure  Participants were given instructions about relational similarity and the two examples used in Experiment 1 illustrating similar and dissimilar relationships. They saw one word pair in each comparison first and were asked to think of the relationship between the words. Then, after a 600 ms delay, the other word pair was shown and participants were asked to rate the similarity of the relationships on a 7-point scale. Participants rated 50 comparisons, including 22 within-subtype, 22 between-subtype, and 6 between-type comparisons. Each participant viewed each comparison in only one of its presentation orders. After every 10 trials, there was an attention check question that asked participants to indicate whether two words were the same or different.

Results & Discussion  We obtained about 50 ratings for each comparison in each presentation order. We conducted a t-test for each comparison to see if the two presentation orders resulted in significantly different relational similarity ratings; 77 of these t-tests were statistically significant at the .05 level. If presentation order did not matter for any of the comparisons, we would expect only 25 t-tests to be significant at the α = 0.05 level. Assuming that the t-tests are independent, a binomial test reveals that this deviation is statistically significant, p < .001.

Examining the comparisons for which different presentation orders resulted in significantly different relational similarity ratings confirms our guess as to when people's judgments of relational similarity might not obey symmetry. The previously mentioned example of angry : smile and exhausted : run indeed elicited higher ratings in the direction shown here (4.76 mean rating) than in the opposite direction (2.36 mean rating). As another example, people rated hairdresser : comb – pitcher : baseball as more relationally similar (6.10 mean rating) than pitcher : baseball – hairdresser : comb (4.84 mean rating). In the first presentation order, participants might be thinking that "a hairdresser handles a comb" and "a pitcher handles a baseball," whereas in the second presentation order, they might be thinking "a pitcher plays a specific role in baseball," which doesn't fit with hairdresser : comb.
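The significance analysis described above can be sketched as follows, assuming hypothetical dictionaries of per-comparison ratings; it pairs a t-test per comparison with a one-sided binomial test on the count of significant results.

```python
from scipy.stats import ttest_ind, binom

def symmetry_check(order1_ratings, order2_ratings, alpha=0.05):
    """Count comparisons rated differently across presentation orders.

    `order1_ratings` and `order2_ratings` are hypothetical dicts mapping
    each comparison to the list of ratings it received in each presentation
    order. Returns the number of significant t-tests and a one-sided
    binomial p-value for obtaining at least that many significant tests by
    chance, treating the t-tests as independent (as in the text).
    """
    n_significant = sum(
        ttest_ind(order1_ratings[c], order2_ratings[c]).pvalue < alpha
        for c in order1_ratings
    )
    n_comparisons = len(order1_ratings)
    # P(at least n_significant significant tests under the null).
    p_value = binom.sf(n_significant - 1, n_comparisons, alpha)
    return n_significant, p_value
```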
Experiment 3: Triangle Inequality

For this experiment, we created triads of word pairs for which we expected people's relational similarity judgments to violate the triangle inequality, such as nurse : patient, mother : baby, and frog : tadpole. This triad violates the triangle inequality because nurse : patient :: mother : baby is a good analogy (relationally similar), and so is mother : baby :: frog : tadpole, but nurse : patient :: frog : tadpole is not. In this example, the middle pair has multiple relations and shares one of them with the first pair and a different one with the last pair. We presented the two word pairs in each analogy together and asked participants to rate the quality of the analogy rather than relational similarity, because we wanted to encourage participants to consider the two relations together rather than using one relation as a reference.

Participants  We recruited 71 participants from Amazon Mechanical Turk, who gave informed consent and were paid $0.50 for the 5-minute study. This group of participants did not overlap with the participants in Experiment 2. We excluded 11 participants from the data analysis because they failed one of the attention checks (see below).

Stimuli  We created twelve triads of word pairs for which analogy quality judgments are likely to violate the triangle inequality. For every triad, the analogy formed between the first and third word pairs was expected to be rated low, and the other two analogies were expected to be rated highly.

Procedure  Participants were given instructions about verbal analogies and the two examples used in Experiments 1 and 2 as examples of good and bad analogies, respectively. They were then asked to rate the quality of each analogy on a scale from 1 (very bad) to 7 (very good). For each of the twelve triads, each participant viewed one of the three analogies. Each participant received four analogies formed between the first and second word pairs of various triads (analogy type 1-2), four formed between the second and third word pairs (type 2-3), and four formed between the first and third word pairs (type 1-3). Because two thirds of these analogies were expected to be rated highly, participants also viewed four "filler" analogies expected to be given low ratings. Finally, there were two attention check questions that asked participants to simply choose 1 (or 7) for a bad (or good) analogy.

Results & Discussion  We obtained 20 ratings for each analogy. We calculated the mean participant rating for each analogy and conducted a one-way between-subjects ANOVA to test if there was an effect of the analogy type (1-2, 2-3, or 1-3) on the mean analogy quality rating. This revealed a significant effect of analogy type, F(2, 33) = 45.57, p < .001. Post hoc comparisons using the Tukey HSD test indicated that the mean ratings for both type 1-2 analogies (M = 5.44, SD = .99) and type 2-3 analogies (M = 5.43, SD = .63) were significantly higher than the mean rating for type 1-3 analogies (M = 2.99, SD = .46), p < .001, whereas the mean ratings for type 1-2 and type 2-3 analogies did not differ significantly from each other. This is consistent with our expectation that type 1-2 and 2-3 analogies would both be rated highly, whereas type 1-3 analogies would receive low ratings. These results indicate that participants' analogy quality ratings violated the triangle inequality.
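The ANOVA and Tukey HSD analysis can be sketched as follows, with `ratings_by_type` as a hypothetical container for the mean ratings of each analogy type; this is an illustration of the analysis, not the authors' code.

```python
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

def analyze_analogy_ratings(ratings_by_type):
    """One-way ANOVA over analogy types, followed by Tukey HSD comparisons.

    `ratings_by_type` is a hypothetical dict mapping an analogy type label
    ("1-2", "2-3", or "1-3") to the list of mean ratings for analogies of
    that type.
    """
    f_stat, p_value = f_oneway(*ratings_by_type.values())

    # Flatten into the long format expected by pairwise_tukeyhsd.
    values = [v for vals in ratings_by_type.values() for v in vals]
    labels = [t for t, vals in ratings_by_type.items() for _ in vals]
    tukey = pairwise_tukeyhsd(values, labels, alpha=0.05)
    return f_stat, p_value, tukey
```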

We obtained the predicted relational similarity between the word pairs in each analogy by calculating the cosine similarity between difference vectors using word2vec and GloVe representations. We then conducted separate ANOVAs for the two representational spaces to test whether there was an effect of analogy type on the predicted relational similarity in each space. Neither ANOVA indicated a significant effect of analogy type: for word2vec, F(2, 33) = 1.20, p = .31; for GloVe, F(2, 33) = .24, p = .79. These results indicate that the predictions of relational similarity made by word2vec and GloVe do not violate the triangle inequality for our stimuli.

Because each participant contributed ratings to more than one analogy (although only one per triad), the observations in the first overall ANOVA that we conducted on the participant data are not entirely independent. Thus, we conducted a separate between-subjects ANOVA for each of the twelve triads to test if there was an effect of analogy type (1-2, 2-3, or 1-3) on the analogy quality ratings for each triad. All twelve ANOVAs were significant, with every p < .01. We then conducted post hoc comparisons using the Tukey HSD test. The pattern we expected was that type 1-2 and 2-3 analogies would have significantly higher ratings than type 1-3 analogies, but would not differ significantly from each other. We observed this pattern for seven of the twelve triads. For every one of the remaining triads, the mean ratings for the type 1-2 and type 2-3 analogies were higher than the mean rating for the type 1-3 analogy, but which differences were statistically significant varied among the triads. Figure 4 shows an example of one of the seven triads with the expected pattern and compares the mean participant ratings and predicted relational similarities for the three analogies.

Figure 4: Mean human ratings and predicted relational similarities (scaled to the range 0-1) for the triad lawyer : books, chemist : beakers, and librarian : books. Error bars indicate 1 SEM for scaled human ratings.

Conclusions

Our results provide a clearer picture of the utility of vector-space models of analogy. The parallelogram model makes good predictions of human relational similarity judgments for some relations, but is less appropriate for others. For example, consider the word pairs represented as vectors in Figure 2. As one would expect, the relation SIMILAR seems to be best represented by a short difference vector rather than by the direction of the difference vector. More generally, in more complex analogies with the source and target each consisting of many points in the vector space, one could imagine many ways of describing relationships between the two sets of points.

More challenging are the constraints posed by the geometric axioms. In our datasets, we found considerable violations of two of these axioms, which cannot be overcome through better embedding methods. In light of this, it would be interesting to follow the history of models of first-order similarity in considering the use of featural representations (Tversky, 1977), exploring methods of measuring similarity in vector spaces that are no longer subject to the constraints imposed by the metric axioms (Krumhansl, 1978), or reformulating the problem as probabilistic inference (Griffiths et al., 2007).

Acknowledgments. Supported by grant number FA9550-13-1-0170 from the Air Force Office of Scientific Research.

References

Bejar, I. I., Chaffin, R., & Embretson, S. (1991). A taxonomy of semantic relations. In Cognitive and psychometric analysis of analogical problem solving (pp. 55–91). Springer.
Doumas, L. A., Hummel, J. E., & Sandhofer, C. M. (2008). A theory of the discovery and predication of relational concepts. Psychological Review, 115(1), 1.
Ehresman, D., & Wessel, D. L. (1978). Perception of timbral analogies. Centre Georges Pompidou.
Gentner, D., & Forbus, K. D. (2011). Computational models of analogy. Wiley Interdisciplinary Reviews: Cognitive Science, 2(3), 266–276.
Griffiths, T. L., Steyvers, M., & Tenenbaum, J. B. (2007). Topics in semantic representation. Psychological Review, 114(2), 211.
Jurgens, D. A., Turney, P. D., Mohammad, S. M., & Holyoak, K. J. (2012). SemEval-2012 Task 2: Measuring degrees of relational similarity. In Proceedings of the First Joint Conference on Lexical and Computational Semantics – Volume 1 (pp. 356–364).
Krumhansl, C. (1978). Concerning the applicability of geometric models to similarity data: The interrelationship between similarity and spatial density. Psychological Review, 85, 450–463.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211.
Lu, H., Chen, D., & Holyoak, K. J. (2012). Bayesian analogy with relational transformations. Psychological Review, 119(3), 617.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems (pp. 3111–3119).
Pennington, J., Socher, R., & Manning, C. D. (2014). GloVe: Global vectors for word representation. In EMNLP (Vol. 14, pp. 1532–1543).
Radford, A., Metz, L., & Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.
Reed, S. E., Zhang, Y., Zhang, Y., & Lee, H. (2015). Deep visual analogy-making. In Advances in Neural Information Processing Systems (pp. 1252–1260).
Richland, L. E., & Simms, N. (2015). Analogy, higher order thinking, and education. Wiley Interdisciplinary Reviews: Cognitive Science, 6(2), 177–192.
Rumelhart, D. E., & Abrahamson, A. A. (1973). A model for analogical reasoning. Cognitive Psychology, 5(1), 1–28.
Tversky, A. (1977). Features of similarity. Psychological Review, 84(4), 327.
Uyeda, K. M., & Mandler, G. (1980). Prototypicality norms for 28 semantic categories. Behavior Research Methods & Instrumentation, 12(6), 587–595.