VerbNet Overview

Karin Kipper Schuler [email protected]

May 31st, 2009 Overview Real world applications need resources with rich syntactic and se- mantic representations.

• Most existing broad-coverage resources provide only a shallow semantic representation • Much richer representations are needed • The verbs are key elements in providing this

1 Overview

Natural language applications are currently limited to specific do- mains with hand-crafted lexicons.

• not available to the whole community • expensive and time-consuming to build

Most available broad-coverage resources either focus on syntax or on semantics and do not provide a clear association between the two.

2 Semantic representation must be tied to the syntactic information:

• Differences between syntactic frames can help:

Eng: John left the room. (exited) Port: John saiu do quarto. Eng: John left the book on the table. (left) Port: John deixou o livro na mesa. • But syntax alone is not sufficient:

Eng: John left the room. (exited) Port: John saiu do quarto. Eng: John left a fortune. (gave away) Port: John deixou uma fortuna.

3 Overview Predicate argument relations are of interest for NLP, providing gen- eralizations over data:

• Ronaldo scored a goal for the Brazilian team • A goal was scored by Ronaldo for the Brazilian team • Ronaldo wanted to score a goal for the Brazilian team

4 VerbNet connects semantics to syntax

Created with these ideas in mind

• computational verb lexicon • broad-coverage and domain-independent • clear association between syntax and semantics – lexical semantic information (pred argument structure) – syntactic frames and selectional restrictions – semantic predicates – links to WordNet senses, FrameNet frames, PropBank framesets • refinement of Levin classes to construct the entries

5 Outline

• Overview • Building blocks for VerbNet • VerbNet • Parameterized Action Representation (PARs) • Evaluation • Mappings to other Resources • Automatic techniques to extend coverage

6 Levin classes

Levin (1993)

• Verbs are grouped into classes • Each class is characterized by a set of syntactic patterns

John broke the jar / The jar broke / Jars break easily

John cut the bread / *The bread cut / Bread cuts easily

John hit the wall / *The wall hit / *Walls hit easily

• Hypothesis: syntax reflects implicit semantic components contact, directed motion, exertion of force, change of state

7 Example Levin class

Break Levin class - Change-of-state

break chip tear split crack splinter crash snap crush smash fracture shatter rip

8 Levin classes

• classes are not semantically homogeneous {braid,clip,file, powder,etc..} • classes are not completely syntactically homogeneous • verbs can be in multiple class listings • alternation contradictions – Carry verbs disallow conative but include {push, pull, shove, etc} also in Push/pull class which does take conative

9 Event Structure

Verbs refer to events which can be decomposed into a tripartite structure in a manner similar to Moens and Steedman (1988)

culmination

preparatory consequent process state

10 Verb classes and event structure

RUN class BREAK class (bounce, jog, jump, hop, run) (break, chip, crack, tear) preparatory consequent process (activity) state

culmination

HIT class (batter, kick, hit, slap)

11 Outline

• Overview • Building blocks for VerbNet • VerbNet • Parameterized Action Representation (PARs) • Evaluation • Mappings to other Resources • Automatic techniques to extend coverage

12 Characteristics of verbs: Verbs have complex meaning: key components can be made explicit

• have participants • space • verbs represent processes/events/states which are located in time • can be subdivided into sub-parts to capture during, end, results

13 Examples of verbs and their components

• RUN – express iterative activity, no culmination, or consequent – one participant – motion of participant is a semantic component – path is optional • HIT – express contact between two objects – happens momentarily, has a well defined end, has no consequent – has three participants • BREAK – express a change of state

14 VerbNet class entries

Kipper, Dang and Palmer, 2000

• verb classes based on Levin’s classification • classes defined by syntactic properties • capture generalizations about verb behavior • for each verb class – thematic roles – syntactic frames – selectional restrictions for the arguments in each frame – each frame includes semantic predicates with a time function

15 Hit class Class hit-18.1 Parent — Members bang (1,3), bash(1), batter(1,2,3), beat(2,5), ..., hit(2,4,7,10), kick(3), ... Themroles Agent Patient Instrument Selrestr Agent[+int control] Patient[+concrete] Instrument[+concrete] Frames Name Syntax Semantic Predicates Transitive Agent V Patient cause(Agent, E) ∧ “Paula hit the ball” manner(during(E),directedmotion,Agent) ∧ !contact(during(E), Agent, Patient) ∧ manner(end(E),forceful, Agent) ∧ contact(end(E), Agent, Patient) Transitive Agent V Patient cause(Agent, E) ∧ with Prep(with) Instrument manner(during(E),directedmotion,Agent) ∧ Instrument “Paula hit the ball with a !contact(during(E),Instrument,Patient) ∧ stick” manner(end(E),forceful, Agent) ∧ contact(end(E), Instrument,Patient)

16 Thematic roles

• use roles to provide as much information as possible for classes • thematic roles vs. generic arguments • specification of roles supplies part of the semantic description for the class Build-26.1 “The artist (Agent) carved a toy (Product) out of a piece of wood (Material)” • roles also help differentiate classes Admire-31.2: Experiencer and Theme Hit-18.1: Agent and Patient

17 Thematic roles

• small set of roles (Agent, Theme, Location, ...) • roles used across classes • certain roles need specific characteristics to be present (Patient → undergoes change) • thematic roles used are generally accepted

18 Selectional Restrictions

• semantic restrictions on the thematic roles – based on EuroWordNet concepts (Vossen 2003) – associate VerbNet with an ontology publicly available, widely used – IS-A hierarchy with multiple inheritance and no cycles • syntactic restrictions for the syntactic frames (e.g., sentential, plural)

19 Selectional Restrictions force int-control machine vehicle human animate animal natural plant body-part comestible machine concrete phys-obj artifact tool rigid garment solid non-rigid pointed shape elongated substance idea SelRestr abstract sound communication regionPP location place time object state scalar currency organization

20 Syntactic Frames Describe possible surface realizations for verbs in a class

• constructions such as transitive, intransitive, resultative, and a large set of Levin’s alternations • Examples: 1. Agent V Patient (John hit the ball) 2. Agent V at Patient (John hit at the window) 3. Agent V Patient[+plural] together (John hit the sticks together)

21 Semantic Predicates Semantics of a syntactic frame captured through a conjunction of semantic predicates

• each semantic predicate includes a time function showing at what stage in the event the predicate holds start(E), during(E), end(E), result(E) • similar to Moens and Steedman’s event decomposition • choice influenced because it was suitable for PARs (pre-conditions, post-conditions, and results) • semantic predicates can be: General (e.g.,motion and cause), Specific (e.g.,suffocate), or Variable (Prep)

22 Semantic Predicates

• relations between verbs (or verb classes) captured implicitly by the predicates for the class • aspect captured by the temporal function present in the predi- cates: – activities (e.g., run) have during(E) – bounded activities (e.g., hit) have during(E) and end(E) – accomplishments (e.g., break) have result(E)

23 Hit class Class hit-18.1 Parent — Members bang (1,3), bash(1), batter(1,2,3), beat(2,5), ..., hit(2,4,7,10), kick(3), ... Themroles Agent Patient Instrument Selrestr Agent[+int control] Patient[+concrete] Instrument[+concrete] Frames Name Syntax Semantic Predicates Transitive Agent V Patient cause(Agent, E) ∧ “Paula hit the ball” manner(during(E),directedmotion,Agent) ∧ !contact(during(E), Agent, Patient) ∧ manner(end(E),forceful, Agent) ∧ contact(end(E), Agent, Patient) Transitive Agent V Patient cause(Agent, E) ∧ with Prep(with) Instrument manner(during(E),directedmotion,Agent) ∧ Instrument “Paula hit the ball with a !contact(during(E),Instrument,Patient) ∧ stick” manner(end(E),forceful, Agent) ∧ contact(end(E), Instrument,Patient)

24 Hierarchical organization

Refinement of Levin classes

• verb classes are hierarchically organized – the original set of Levin classes has been further subdivided into additional subclasses which are more syntactic and semantically coherent – members have common semantic predicates, thematic roles, syntactic frames – a particular verb or subclass inherit from parent and may add more infor- mation

25 Transfer of Message

Class Transfer mesg-37.1 Parent — Members cite(1,3,4), demonstrate(1), ... Themroles Agent Topic Recipient Selrestr Agent[+animate] Topic[+message] Recipient[+animate] Frames Name Syntax Semantic Predicates Transitive Agent V Topic transfer info(during(E),Agent,?,Topic)∧ cause(Agent,E) “Wanda cited the author” Dative (to- Agent V Topic Prep(to) transfer info(during(E),Agent,Recipient,Topic) ∧ PP variant) Recipient cause(Agent,E) “Wanda cited the author to her students”

Class Transfer mesg-37.1-1 Parent Transfer mesg-37.1 Members quote(1), read(3) Themroles Selrestr Frames Name Syntax Semantic Predicates Dative (di- Agent V Recipient Topic transfer info(during(E),Agent,Recipient,Topic) ∧ transitive “Wanda quoted her cause(Agent,E) variant) students the author”

26 Transfer of Message – level 2

Class Transfer mesg-37.1-1 Parent Transfer mesg-37.1 Members quote(1), read(3) Themroles Agent Topic Recipient Selrestr Agent[+animate] Topic[+message] Recipient[+animate] Frames Name Syntax Semantic Predicates Transitive Agent V Topic transfer info(during(E),Agent,?,Topic)∧ cause(Agent,E) Dative (to- Agent V Topic Prep(to) transfer info(during(E),Agent,Recipient,Topic) ∧ PP variant) Recipient cause(Agent,E) Dative (di- Agent V Recipient Topic transfer info(during(E),Agent,Recipient,Topic) ∧ transitive cause(Agent,E) variant)

27 Related Work

• WordNet (Miller 1985; Fellbaum 1998) – predicate-argument structure • FrameNet (Baker et al. 1998) – verb groupings – frame elements vs. thematic roles • LCS database (Dorr 2001) – classes based on Levin – syntactic frames not explicit

28 Related Work

• Xtag (Xtag Research Group 2001) and ComLex (Comlex 1994) – provide detailed syntactic description

29 Outline

• Overview • Building blocks for VerbNet • VerbNet • Parameterized Action Representation (PARs) • Evaluation • Mappings to other Resources • Automatic techniques to extend coverage

30 Parameterized Action Representation (PAR)

Badler et al. (1999) Interface to agents in an animation system. Needs a semantically precise representation.

• Representation of actions – instructions to a virtual human – used in a simulated 3D environment • Represented as – parameterized structures – hierarchical organization

31 PARs include:

• action participants (agents/objects) • restrictions on the types of objects • kinematic and dynamic properties (path, manner, ..., force) • stages of the action (like Moens and Steedman event decomposition) – preparatory specifications – termination conditions – post-assertions

32 Example of the PAR inheritance hierarchy

contact/(par:contact)

hit/(manner:forcefully) touch/(manner:gently)

kick/(OBJ2:foot) hammer/(OBJ2:hammer)

A lexical/semantic hierarchy for actions of contact

33 Instantiated PAR: John hit the ball with a stick

activity : ACTION          agent : John     participants :   objects : ball, stick               preparatory spec :[get control of(John, stick)]             termination cond ,   :[contact(ball stick)]             post assertions :             duration :[momentarily]             path, motion, force             manner : forcefully     

34 PARs and VerbNet PARs for animating agents require precise semantics associated with syntax provided by VerbNet.

• participants of an action are the arguments of a verb • selectional restrictions on the arguments • event structure (during, end, result) • semantic components expressed by predicates

35 Aggregates

(Allbeck et al. 2002)

• VerbNet also used to describe actions of aggregate entities • actions decomposed by features based on Laban Movement Analysis (EMOTE system) • used in a playground scenario with a teacher and 8 kids • Examples of aggregate actions:

Aggregate actions

Gathering Dispersing Obj refer Formation Milling

assemble congregate dissipate scatter surround encircle

36 Aggregates

• PAR entry for assemble (as in “children assemble in the playground”)

assemble / ARG0-v / is_concrete(ARG0) is_plural(ARG0) !together_group(start(e),ARG0) transl_motion(during(e),ARG0) shape_enclosing(during(e),ARG0) effort_direct(during(e),ARG0) together_group(end(e),ARG0)

• VerbNet entry Class Herd-47.5.2 Parent — Members accumulate aggregate amass assemble cluster collect congregate convene flock gather group herd huddle mass Themroles Theme[+concrete +plural] Frames Name Example Syntax Semantics Intransitive The kids are assembling Theme V !together(start(E),physical,Theme) together(end(E),physical,Theme)

37 Outline

• Overview • Building blocks for VerbNet • VerbNet • Parameterized Action Representation (PARs) • Evaluation • Mappings to other Resources • Automatic techniques to extend coverage

38 Syntactic coverage against PropBank

Kipper, Snyder, Palmer 2004

• used Penn data, with PropBank annotation to verify coverage • 79k instances, 2,100 verbs, 185 classes • mapping between PB framesets and VN verb classes • mapping between PB argument labels and VN thematic roles • goal was to establish a baseline

39 Syntactic coverage against PropBank

escape−51.1

"move away from" leave.01 leave−51.2

arg0 (entity leaving) Theme arg1 (place left) Source arg2 (attribute)

keep−15.2

fulfill−13.4.1

"give" leave.02 future_having−13.3

arg0 (giver) Agent arg1 (thing given) Theme arg2 (benefactive) Recipient

40 Syntactic coverage against PropBank

Example: verb LEAVE wsj/05/wsj 0568.mrg 12 4: The tax payments will leave Unisys with $ 225 million in loss carry-forwards that will cut tax payments in future quarters .

[ARG0 The tax payments] [rel leave] [ARG2 Unisys] [ARG1 with 225 million] leave-51.2: Theme V NP Prep(with) Source future have-13.3: Agent V Recipient Prep(with) Theme

41 Syntactic coverage against PropBank

• VerbNet provided exact matches of syntactic frames in 84.67% of the instances (86.30% using a more relaxed criteria of matching, such as ignoring preposition mismatches, or allowing ARGMs to match against specific roles) Why not a 100%? • few problems in the mappings • different sense coverage in PropBank and VerbNet • different granularity of the arguments

Calibratable cos-45.6 The price of oil soared ⇐⇒ Oil soared in price

42 Outline

• Overview • Building blocks for VerbNet • VerbNet • Parameterized Action Representation (PARs) • Evaluation • Mappings to other Resources • Automatic techniques to extend coverage

43 Linking resources Many applications would benefit by merging the results of different lexical resources and annotation projects:

• compatibility between resources • inherent theoretical differences • the different levels of representation may be more than the sum of its parts, since inferences may be drawn from how the components interact

Semlink: develop computationally explicit connections between FrameNet, PropBank, and VerbNet.

44 Mappings between VerbNet and WordNet Each verb in VerbNet is mapped to its corresponding synset(s) in WordNet, if available.

LEAVE

escape−51.1 leave−51.2 fulfill−13.4.1future_having−13.3 keep−15.2 motion, direction motion, direction, has_possession, has_possession, be Prep change location transfer transfer, future_having

wn1 wn5 wn10 wn3 wn2 wn9 wn14

45 PropBank/VerbNet/WordNet

move away from leave.01 leave.02 give

escape−51.1 leave−51.2 fulfill−13.4.1future_having−13.3 keep−15.2 motion, direction motion, direction, has_possession, has_possession, be Prep change location transfer transfer, future_having

wn1 wn5 wn10 wn3 wn2 wn9 wn14

46 Mappings between VerbNet and FrameNet Two steps: 1. mappings between VerbNet verb senses to FrameNet; VN class VN member FN frame 9.1 arrange (diff. sense) 9.1 immerse Placing 9.1 lodge Placing 9.1 mount Placing 9.1 sling 2. mappings from VerbNet thematic roles to the FrameNet frame elements VNclass 9.1 FNclass “Placing” VN role FN frame element Agent Agent Agent Cause Destination Goal Theme Theme

47 Assigning Xtag trees to VerbNet

Ryant and Kipper (2004)

• VerbNet only describes declarative frames • Xtag provides detailed account of syntactic transformations • mapping VerbNet syntactic frames to Xtag trees extends VerbNet syntactic coverage while providing semantics for the Xtag trees • 104 VerbNet syntactic frames (out of 131) map to 19 Xtag tree families

48 Mappings

These resources are:

• Complementary • Redundancy is harmless, may even be useful • PropBank provides great training data • VerbNet provides clear links between syntax and semantics • FrameNet provides rich semantics • Together they give us the most comprehensive coverage

49 Outline

• Overview • Building blocks for VerbNet • VerbNet • Parameterized Action Representation (PARs) • Evaluation • Mappings to other Resources • Automatic techniques to extend coverage

50 Extending VerbNet’s members – LCS

Dorr (2001) Addition of members from the LCS database

• inspected 1,266 verbs present in the LCS database and not in VerbNet • 429 (426 lemmas) were initially integrated into our lexicon • verbs had been acquired automatically, data noisy

51 Automatic acquisition of verbs – Clusters

Kingsbury and Kipper (2003); Kingsbury (2004)

• used PropBank subcategorization frames (e.g., Arg0.V.Arg1) • 121 clusters from the EM algorithm (0 to 45 elements each) • 1,278 verbs which occurred at least 10 times in the PropBank annotation were used as data • 484 verbs were already in VerbNet class (824 potential candidates for inclusion in VerbNet classes)

52 Automatic acquisition of verbs – Clusters Results:

• 5.6% of the candidates were included in VerbNet • large clusters were not predictive of any classes • small clusters did not offer many candidates • 12.6% if using only “good clusters” • need better way to filter the clusters • impoverished features • senses predicted in VerbNet and PropBank are different

53 Extending VerbNet with WordNet

(Loper, Kipper and Palmer)

• use WordNet as a source of candidates for inclusion in VerbNet • use syntactic contexts of these verbs in Propbank • candidates are filtered based on the grammatical patterns and the relationship between those patterns and known members of VerbNet classes • 707 lemmas suggested, 849 senses • 208 lemmas, 255 senses integrated into the suggested classes • experiment done on version 1.5 of VerbNet

54 Extending VerbNet with WordNet Problems due to different goals of VerbNet and WordNet:

• verbs suggested with a preposition (e.g., chase after) • idiomatic expressions (e.g., shoot a line) • morphological variants (criminalise vs. criminalize) • archaic usages (e.g., coggle, aggroup)

55 Conclusion To achieve the detailed level of representation required for natural language applications we need resources capable of providing a rich semantic representation tied to syntax. VerbNet:

• broad-coverage, general purpose natural • focuses on both syntax and semantics and provides a clear asso- ciation between the two, necessary for characterizing verbs

56 Thank you!

57