Question Answering Based on Semantic Structures

Srini Narayanan
International Computer Science Institute
Center Street
Berkeley, CA
snarayan@icsi.berkeley.edu

Sanda Harabagiu
Department of Computer Science
University of Texas at Dallas
Richardson, TX
sanda@hlt.utdallas.edu

Abstract

The ability to answer complex questions posed in Natural Language depends on the depth of the available semantic representations and the inferential mechanisms they support. In this paper we describe a QA architecture where questions are analyzed and candidate answers generated by identifying predicate-argument structures and semantic frames from the input and performing structured probabilistic inference using the extracted relations in the context of a domain and scenario model. A novel aspect of our system is a scalable and expressive representation of actions and events based on Coordinated Probabilistic Relational Models (CPRM). In this paper we report on the ability of the implemented system to perform several forms of probabilistic and temporal inferences to extract answers to complex questions. The results indicate enhanced accuracy over current state-of-the-art QA systems.

Introduction

Current QA systems extract answers from large text collections by classifying the answer type they expect; using question keywords or patterns associated with questions to identify candidate answer passages; and ranking the candidate answers to decide which passage contains the exact answer. A few systems also justify the answer by performing abduction in first-order predicate logic (Moldovan et al.). This paradigm is limited by the assumption that the answer can be found because it uses the question words. Although this may happen sometimes, this assumption does not cover the common case where an informative answer is missed because its identification requires more sophisticated processing than named entity recognition and the identification of an answer type. Therefore we argue that access to rich semantic structures derived from domain models, as well as from questions and answers, enables the retrieval of more accurate answers as well as inference processes that explain the validity and contextual coverage of answers.

We consider several stages of deeper semantic processing for answering complex questions. A first step in this direction is the incorporation of semantic parsers that recognize predicate-argument structures or semantic frames when processing both questions and documents. A second step is the identification of a topic model that contributes to the interpretation of the question and generates a possible index in an offline battery of ontologies. The third step consists of building a scalable and expressive model of actions and events, which allows the sophisticated reasoning imposed by QA within complex scenarios. We embed the three forms of semantic representations and the inference they enable in a novel, flexible QA architecture that allows us to evaluate the impact of each new form of semantic information on the accuracy of answering complex questions.

The remainder of this paper is organized as follows. We first present the semantic knowledge that we extract from questions and answers, as well as our novel QA architecture. We then detail our model of event structure and present the types of inference that are associated with the event structure. Finally, we detail the results of our initial evaluations and summarize the conclusions.

Semantic Structures for QA

Processing complex questions involves the identification of several forms of complex semantic structures. First, we need to recognize the answer type that is expected, which is a rich semantic structure in the case of a complex question or a mere concept in the case of a factual question. Second, we need to identify the question class or the question pattern. Third, in the case of a complex question which is part of a scenario, we need to model the topic of the scenario.

At least three forms of information are needed for detecting the answer type: question classes and named entity classes; syntactic dependency information; and semantic information taking the form of (i) predicate-argument structures or semantic frames and (ii) the representation of the question topic. The following question illustrates the significance of each of the three forms of information:

    Q: What stimulated India's missile program?

The question stem "what" is ambiguous, as multiple answer types could be associated with a question pattern "What stimulated X?". To find candidate answers, the recognition of India and other related named entities, e.g. Indian, as well as the name of the Prithvi missile or its related program, is important. To better process question Q, the syntactic dependencies enable the recognition of predicate-argument structures. The predicate-argument structure of Q is:

    PREDICATE: Stimulate
    ARG0 (role = agent): ANSWER (part 1)
    ARG1 (role = thing increasing): India's missile program
    ARG2 (role = instrument): ANSWER (part 2)

The predicate-argument structure was built based on the definitions of the PropBank project (Kingsbury et al.). The structure indicates that the answer may have the role of agent or even the role of instrument. When additional information from FrameNet (Baker et al.) is used, we find that the answer may have four other semantic roles, derived as frame elements of two distinct frames:

    FRAME: Stimulate
    Frame element CIRCUMSTANCES: ANSWER (part 1)
    Frame element EXPERIENCER: India's missile program
    Frame element STIMULUS: ANSWER (part 2)

    FRAME: Subject_stimulus
    Frame element CIRCUMSTANCES: ANSWER (part 3)
    Frame element COMPARISON SET: ANSWER (part 4)
    Frame element EXPERIENCER: India's missile program
    Frame element PARAMETER: nuclear proliferation

None of these semantic roles are fully specified. To interpret the semantic information constrained by the thematic roles, we also need access to a topic model of the scenario in which the question is being asked. For example, for the question Q: How can a biological weapons program be detected?, the topic model consists of (a) a set of typical relations between topic concepts and (b) a set of possible paths of actions. As illustrated in the figure on question processing based on topic models, the identification of (a) predicate-argument structures and (b) semantic frames contributes to the recognition of the expected answer as well as to the formation of the topic model.

    Question PATTERN: How can X be detected?
    Question FOCUS: X = biological weapons program

    TOPIC MODEL
    Topic relations: [develop -- program], [produce -- biological agents],
                     [stockpile -- weapons], [deliver -- missiles]
    Possible paths of action:
      1) development --> production --> stockpiling --> delivery
      2) development --> acquisition --> stockpiling --> delivery

    Predicate-argument structure
    PREDICATE: = detect
    Arg0 (detector): Answer (1)
    Arg1 (detected): biological weapons program
    Arg2 (instrument): Answer (2)

    FOCUS Interpretation
      1) program for producing biological weapons
           PREDICATE: = produce
           Arg0 (producer): Answer
           Arg1 (product): biological weapons
      2) program for acquiring biological weapons
           PREDICATE: = acquire
           Arg0 (buyer): Answer
           Arg1 (object): biological weapons

    Figure: Question processing based on topic models

Question Q is mapped into its pattern and its focus, which has the role of the topic of the question. The document passages retrieved for the specific topic can be used to extract the most relevant topic relations, with the method detailed in the section on topic models. The event structure, detailed in a later section, enables the recognition of possible paths of action in the format of chains between the events lexicalized in the topic relations. The set of possible paths of action generates different interpretations of the question's focus, which facilitate the mapping of the original predicate-argument structure into other predicate structures in which the semantic type of the answer has less ambiguity. The figure illustrates the mapping of the predicate detect into the predicates produce and acquire, which can be extracted in parallel. This mapping, enabled by the topic model, corresponds to the decomposition of the original complex question into a set of less complex questions.

Because the model for event structure has the capability of incorporating domain knowledge in OWL-based representations and performs several forms of inference on this knowledge, it can be used to extract candidate answers from the passages retrieved by the topic relations. The QA architecture that takes advantage of these semantic structures and the inference they enable is illustrated in the architecture figure.

1 OWL is a markup language for the Semantic Web (http://www.semanticweb.org) which allows for the specification of ontologies and the semantic markup of documents on the web in an XML format.

    [Figure: QA architecture based on several forms of semantic structures. Panels: Question Processing, Document Processing, Answer Processing. Inputs: Documents, Ontologies. Output: Answer. Components shown include keyword recognition, syntactic parse, named entity recognition, identification of predicate-argument structures, identification of frame structures, recognition of answer type, indexing and retrieval based on lexico-semantic knowledge, recognition of event structure, and the probabilistic inference network.]

The syntactic parse is produced by the Collins parser (Collins), and the Named Entity Recognizer (NER) is an implementation of the NER reported in (Bikel et al.), whereas the predicate-argument structures and the frame elements are parsed with the techniques described in the section on predicate and frame structures. All four of these operations are performed both in the question processing module and in the document processing module. The topic model generated at question processing has three roles: it provides an index for the event structures to find ontological information; it refines the definition of the answer type; and it improves the quality of the retrieved answer passages because it makes topic-relevant relations available. The derivation of the topic model is based on the predicate-argument structures derived from the question, whereas the answer type and the event structures rely on the frame semantics available from questions and relevant passages. Because PropBank has higher lexical coverage than FrameNet, whenever the semantic frames cannot be recognized, the QA system falls back on the predicate-argument structure identified in questions and documents. This backoff mechanism enables indexing and retrieving relevant passages from document collections by using lexico-semantic knowledge and the recognition of the event structure referred to by questions and answers. The Probabilistic Inference Networks (PINs), described in a later section, select the answer structures and identify the answers to be returned.

Predicate and Frame Structures

Proposition Bank, or PropBank, is a one million word corpus annotated with predicate-argument structures, which were described in (Kingsbury et al.). The corpus consists of the Penn Wall Street Journal texts (www.cis.upenn.edu/~treebank). For every given predicate lexicalized by a verb, a set of arguments sequentially numbered from Arg0 to Arg5 were annotated. The general procedure was to select for each verb the roles that seem to occur most frequently and use these roles as mnemonics for the predicate. Generally, Arg0 would stand for agent and Arg1 for direct object or theme, whereas Arg2 represents indirect object, benefactive or instrument, but mnemonics tend to be verb specific. For example, the argument structure for the verb-predicate steal has Arg0-agent, Arg1-theme, Arg2-source and Arg3-beneficiary. Additionally, the argument may include functional tags from Treebank, e.g. ArgM-DIR indicates a directional, ArgM-LOC indicates a locative and ArgM-TMP stands for a temporal.

The FrameNet project annotates roles defined for each semantic frame. A frame is a schematic representation of situations involving various participants, props and other conceptual roles, all called Frame Elements (FEs). For example, the frame THEFT describes situations in which a Perpetrator takes Goods that belong to the Victim. The Means by which this is accomplished may also be expressed. The British National Corpus is used for annotations.

(Gildea and Jurafsky) and (Gildea and Palmer) report on the same statistical method that labels argument roles from PropBank or FEs from FrameNet on any English sentence that is syntactically parsed. Their method consists of two classification tasks: identifying the parse tree constituents corresponding to the predicate arguments or the FEs; and recognizing the role of the argument or FE. They introduced seven features that (a) were used for training both classifiers and (b) worked both for PropBank and FrameNet. In (Surdeanu et al.) seven additional features were proposed that enhanced the performance of the classifiers. By using both sets of features in our implementation, which uses the SVMlight software available from http://svmlight.joachims.org, we automatically transformed Question Q3 into the predicate-argument structure PAS(Q3) and the Frame Structure FS(Q3):

    Q3: What kind of nuclear materials were stolen from the Russian navy?

    PAS(Q3): What [Arg1: kind of nuclear materials] were [Predicate: stolen]
             [Arg2: from the Russian navy]?

    FS(Q3): What [GOODS: kind of nuclear materials] were
            [target-Predicate: stolen] [VICTIM: from the Russian navy]?

The expected answer, as predicted by PAS(Q3), is the Arg1 of the predicate steal when the Arg2 has the head Russian navy. Additionally, the answer needs to be in the same semantic class as nuclear materials. The FEs from FS(Q3) show that we should search for an FE with the role Goods whenever we find a target word of the frame STEAL. The paragraphs containing candidate answers are parsed similarly. For example, the correct answer A(Q3) is transformed into the predicate-argument structure PAS(A(Q3)) and the Frame Structure FS(A(Q3)):

    A(Q3): Russia's Pacific Fleet has also fallen prey to nuclear theft; in 1/96,
           approximately 7 kg of HEU was reportedly stolen from a naval
           base in Sovetskaya Gavan.

    PAS(A(Q3)): [Arg1(P1): Russia's Pacific Fleet] has [ArgM-DIS(P1): also]
                [Predicate(P1): fallen] [Arg1(P1): prey to nuclear theft];
                [ArgM-TMP(P2): in 1/96], [Arg1(P2): approximately 7 kg of HEU]
                was [ArgM-ADV(P2): reportedly] [Predicate(P2): stolen]
                [Arg2(P2): from a naval base] [Arg3(P2): in Sovetskaya Gavan]

    FS(A(Q3)): [VICTIM: Russia's Pacific Fleet] has also fallen prey to
               [GOODS: nuclear] [target-Predicate(P1): theft]; in 1/96,
               [GOODS(P2): approximately 7 kg of HEU] was reportedly
               [target-Predicate(P2): stolen] [VICTIM(P2): from a naval base]
               [SOURCE(P2): in Sovetskaya Gavan]

In PAS(A(Q3)) we identify two predicates, indexed P1 and P2. P2 is lexicalized with the same word-lemma as the predicate from Q3, thus its Arg1(P2), approximately 7 kg of HEU, provides the exact answer. Note that its Arg2(P2) is a naval base, which has a meronym relation with the previously mentioned NP Russia's Pacific Fleet, a meronym of Russian navy. The same meronymy needs to be resolved between the FE Victim of stolen and the FE Victim of theft in FS(A(Q3)). In the second case the meronymy is identified since the second frame identifies an event which is an example of the event identified by the first frame.
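The alignment between PAS(Q3) and PAS(A(Q3)) described above can be illustrated with a small sketch. It is not the system's implementation: the `Predicate` class, the `PART_OF` table, and the functions `is_part_of` and `extract_answer` are hypothetical names introduced here, the part-of table is toy data standing in for a lexical resource, and the argument spans were stripped of determiners and prepositions by hand. The semantic-class check on the answer (same class as nuclear materials) is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Predicate:
    """One predicate with its labeled arguments (role label -> text span)."""
    lemma: str
    args: dict = field(default_factory=dict)


# Toy part-of table standing in for a lexical resource (hypothetical data).
PART_OF = {
    "naval base": "russia's pacific fleet",
    "russia's pacific fleet": "russian navy",
}


def is_part_of(part, whole):
    """Return True if `part` equals `whole` or reaches it via PART_OF links."""
    node, target, seen = part.lower(), whole.lower(), set()
    while node not in seen:
        if node == target:
            return True
        seen.add(node)
        node = PART_OF.get(node)
        if node is None:
            return False
    return False


def extract_answer(question, answer_role, candidates):
    """Return the span filling `answer_role` in the first matching candidate.

    A candidate matches when it shares the question's predicate lemma and
    every other question argument is matched, up to part-of, by the
    candidate's argument with the same role label.
    """
    constraints = {r: v for r, v in question.args.items() if r != answer_role}
    for pred in candidates:
        if pred.lemma != question.lemma or answer_role not in pred.args:
            continue
        if all(r in pred.args and is_part_of(pred.args[r], v)
               for r, v in constraints.items()):
            return pred.args[answer_role]
    return None


# PAS(Q3): the Arg1 slot is the one being asked about.
q3 = Predicate("steal", {"Arg1": "kind of nuclear materials",
                         "Arg2": "Russian navy"})
# PAS(A(Q3)): predicates P1 and P2 from the answer passage (simplified).
p1 = Predicate("fall", {"Arg1": "prey to nuclear theft"})
p2 = Predicate("steal", {"Arg1": "approximately 7 kg of HEU",
                         "Arg2": "naval base",
                         "Arg3": "Sovetskaya Gavan"})

answer = extract_answer(q3, "Arg1", [p1, p2])
print(answer)
```

Only P2 shares the lemma steal with the question, and its Arg2 reaches Russian navy through two part-of links, so its Arg1 is returned. Matching on role labels alone is a simplification; the text additionally resolves the same meronymy at the frame level between the two Victim frame elements.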

Topic Models

In question processing, two objects need to be identified: the expected answer type and the focus of the question. For example, in question Q: How can a biological weapons program be detected?, the expected answer type is Manner-of-detection and the focus is biological weapons program. When processing complex questions, the role of the focus becomes more important, since it guides the recognition of the topic model associated with the question, which in turn enables the identification of partial answers and the relations between them. To identify the expected answer type we can rely on the question stem, e.g. How, and its associated semantic classes, or we can determine the answer type by using a combination of features associated with the question stem and one or more of the question words. For example, the question How long does it take to produce weapons of mass destruction? has the answer type Time Span, determined by the combination of the stem how and the adverb long. This information is much more relevant for identifying the expected answer type than the fact that the predicate take has the ArgM how long and the Arg produce weapons of mass destruction, which represents the focus of the question.

Complex questions rely on topic models for finding the answer, since it is unlikely that the exact answer to a complex question can be found in a text collection; it is more likely that partial answers can be detected and then combined to generate the most informative answer. We used an incremental topic representation that was introduced in (Harabagiu). Information about a topic is modeled through two incremental enhancements of the topic signatures introduced in (Lin and Hovy). The first enhancement determines a set of seed relations. The methodology considers: filtering out outliers of the terms identified as relevant with the statistical method based on likelihood ratio reported in (Lin and Hovy); morphological expansion of the nouns and verbs from the topic signature; semantic normalization through the NER and an offline ontology of words; and selection of the topic seeds with the same likelihood ratio method applied for acquiring the topic concepts. The seeds are the most relevant Verb-Noun pairs which have a predicate-argument relationship.

For question Q, words like say, have or identify were filtered out, leaving words like weapons, sarin and produce as the most relevant topic concepts. The morphological expansion added words like production, whereas the semantic normalization unified Russian and Iraqi into Nationality, and bomb or building into Artifact.

The seed relation that was selected for question Q is [develop -- program]. The relation is further used to produce a corpus of paragraphs related to the corpus, from which new topic relations can be extracted. Two types of relations are targeted: syntax-based relations, e.g. Verb-Subject, Verb-Object and Verb-Prepositional Attachment; and salience-based relations, which model long-dependency relations to a seed concept. The relations are ranked based on a methodology introduced in (Riloff): each relation is ranked based on its RelevanceRate and its Frequency. The Frequency of an extracted relation counts the number of times the relation is identified in the relevant paragraphs. The RelevanceRate is defined as

    $\mathrm{RelevanceRate} = \dfrac{\mathrm{Frequency}}{\mathrm{Count}}$

where Count measures the number of times an extracted relation is recognized in any paragraph considered.

This ranking allows us to select a new topic relation and to resume the topic modeling procedure, this time on a new corpus generated by the most recently discovered relation. We stop the discovery process when we have identified topic relations. Some of the topic relations discovered for question Q are illustrated in the figure on question processing based on topic models.

The second enhancement of topic representations, reported in (Harabagiu), considers the notion of topic theme, which associates clusters of topic relations with text segments. The segmentation is produced by the TextTiling algorithm (Hearst). The nominalization of the verb corresponding to the most relevant topic relation in a segment is considered to be linked to the nominalization from the following topic-relevant segment. Such segments are called themes, and the chains of nominalizations represent possible paths of actions. Two such paths are represented in the same figure.

From Semantic Extraction to Inference for QA

Semantic extraction allows us to identify predications in the input text. For processing complex questions, we further identify the question class or the question pattern, as well as relevant parts of the scenario, which we refer to as the topic model. A significant gap remains between (a) the unstructured and intuitively chosen tag sets used in FrameNet or PropBank and the relation names and clusters in the topic model and (b) a formal characterization of the interrelated events, actions, states and relations holding among them. The explicit representation of such frame semantic and event structure information is needed for the potential use of such resources for question answering.

In previous work (Chang et al.) we bridged the gap by defining a formalism that unpacks the shorthand of frames into structured event representations. This allows annotated FrameNet data to parameterize event simulations (Narayanan) that produce fine-grained, context-sensitive inferences. We have extended this work to further incorporate the topic model and theme described earlier. Currently, the list of extracted predicate-argument structures, the topic model and the answer type predicate are used to index into a set of parameterized event representations, instantiated to specific values based on the extracted predicate-argument bindings (see the figure "From Semantic Extraction to Inference"). The answer type predicate translates to a specific inference procedure.

The middle of the figure shows the representation of extracted predicate-argument bindings in our parameterized event formalism, Embodied Construction Grammar (ECG) (Bergen and Chang, in press), which maps annotations to event simulations. ECG is a constraint-based formalism similar in many respects to other unification-based linguistic formalisms such as HPSG or LFG (features, roles, constraints, simple and complex slots, subcasing and a self reference). ECG differs from other linguistically motivated proposals in the use of an evokes relation that models the priming of a background schema; in role inheritance that is lazy and explicitly specified; and in a complex network of conceptual schemas designed to map utterances to mental simulations in context to produce a rich set of inferences. It is thus ideally suited for our current goal of translating frames to conceptual representations. The figure (middle left) shows the theft schema instantiated to the bindings extracted from the answer passage. The figure (middle right) shows the schema instance enhanced with inferentially derived additional bindings.

    A(Q3): Russia's Pacific Fleet has also fallen prey to nuclear theft; in 1/96,
           approximately 7 kg of HEU was reportedly stolen from a naval
           base in Sovetskaya Gavan.

    FS(A(Q3)): [VICTIM: Russia's Pacific Fleet] has also fallen prey to
               [GOODS: nuclear] [target-Predicate(P1): theft]; in 1/96,
               [GOODS(P2): approximately 7 kg of HEU] was reportedly
               [target-Predicate(P2): stolen] [VICTIM(P2): from a naval base]
               [SOURCE(P2): in Sovetskaya Gavan]

    Middle left:
    SCHEMA INSTANCE: FN:THEFT
      Subcase_of: FN:Committing_Crime
      Subcase_of: FN:Take
      Evokes: FN:Crime_Scenario as FNC
      Roles
        VICTIM: "Russian Navy, Pacific Fleet, Naval Base"
        GOODS: "approx. 7 KG of HEU"
        SOURCE: "in Sovetskaya Gavan"

    Middle right:
    SCHEMA INSTANCE: FN:THEFT
      Subcase_of: FN:Committing_Crime
      Subcase_of: FN:Take
      Evokes: FN:Crime_Scenario as FNC
      Roles
        PERPETRATOR: ?x:AGENT
        VICTIM: "Russian Navy, Pacific Fleet, Naval Base"
        GOODS: "approx. 7 KG of HEU"
        SOURCE: "in Sovetskaya Gavan"
        MEANS: ?m
      OWN(?PERPETRATOR, "approx. 7KG HEU")

    Bottom (FN:THEFT CPRM, node labels): own(VICTIM, GOODS), at(SOURCE, GOODS),
      at(SOURCE, PERP), THEFT(?MEANS), own(PERP, GOODS), COMMITTING CRIME,
      EVOKE CRIME_SCENARIO, ENABLE, SELL(PERP, GOODS)

    Figure: From Semantic Extraction to Inference

The figure (bottom) shows a fragment of the event simulation for the theft frame (all the information in this simulation is generated from information in FrameNet). Preconditions and world states that obtain before the event include (a) the victim owns the goods, (b) the perpetrator is at the source and (c) the goods are at the source. The theft event can be a simple transition or can zoom in to a complex event with phases such as start, ongoing, finish, interrupt, cancel, resume and stop. Complex events can include monitoring and detection conditions as well as resource production, consumption and locking. The completion of theft results in (a) the perpetrator owning the goods and (b) the evocation of the crime scenario schema, which gets simulated if other conditions obtain (such as authorities noticing the crime). The effect of one action may probabilistically enable, disable, interrupt or terminate other possible events (for example, own provides evidence for the future sell event). Running the inference process for this example results in the identification of relevant unbound roles (perpetrator and means) and of highly probable new assertions and bindings (the perpetrator owns the goods after the theft), and suggests new scenario-based query expansion strategies. It is a result of updating the state variables after the new evidence (the extracted predicate-arguments) is asserted; this process is called filtering. The resultant state after executing the action is computed by (a) executing the action and identifying reachable states and (b) updating the state after the action to find the Maximum A Posteriori (MAP) probabilities. These procedures are amongst the important inference methods for structured stochastic processes and are directly supported by our implementation.

2 In general, as we argued in (Chang et al.), there is a considerable gap between FrameNet representations and computational models capable of inference. Our current efforts involve mainly manual translations from FrameNet frames to ECG representations. As FrameNet matures and the various Frame and FE relations grow and become systematized, we may be able to automate the process of going from Frames in FrameNet to event simulations. But that issue remains open.

Technically, the event structure implementation uses a factorized model of states based on Temporally Extended (aka Dynamic) Probabilistic Relational Models (Murphy; Pfeffer; Getoor et al.) that enable a variety of inferences that update and revise the state variables forward and backward in time. Central to the representation of actions and events is an event model called executing schemas, or x-schemas, motivated by research in both sensorimotor control and cognitive semantics (Narayanan). X-schemas are active structures based on Stochastic Petri Nets (Ciardo et al.) that cleanly capture sequentiality, concurrency and event-based asynchronous control. Our implementation integrates the PRM-based state model with the x-schema-based action model and is called Coordinated Probabilistic Relational Models, or CPRM. Our CPRM implementation, KarmaSIM, is linked to existing linguistic resources (FrameNet and WordNet) and to ontologies on the semantic web. To address the vexing issue of domain-specific Knowledge Acquisition (KA), in past work we have constructed automatic translators from OWL-based event and process ontologies (such as OWL-S) to the CPRM modeling framework KarmaSIM (Narayanan and McIlraith). WordNet, OpenCyc and SUMO are also available in OWL. For the experiments reported here, we used the OWL-based Teknowledge WMD ontology to instantiate the general frames obtained from FrameNet. The CPRM model populated with domain knowledge functions as a QA system component for answer extraction (see the architecture figure).

3 X-schemas have been shown to provide a cognitively motivated basis for modeling diverse event-structure-related linguistic phenomena, including aspectual inference (Chang et al.), metaphoric inference (Narayanan) and event-based reasoning in narrative understanding (Narayanan).

4 http://www.reliant.teknowledge.com/DAML/WMD.owl

We have developed a protocol that allows us to take predicates and frames extracted from the input text and perform a variety of causal and event-structure-related inferences for QA. Currently, the main API between the semantic extraction and inference components makes use of extracted predicate-argument structures, extracted topic models and a set of extracted answer-type predicates. The topic models provide an index into the CPRM model database compiled from existing FrameNet and Semantic Web (OWL-based) ontologies. CPRM models matching the topic model are retrieved and instantiated by the predicate-argument bindings specified by the semantic parse output. The answer-type predicates are mapped to specific structured probabilistic inference procedures afforded by the CPRM models. The next section outlines the currently implemented CPRM inference algorithms and their use for question and answer processing.

5 The compilation process is not completely automated, since none of the OWL ontologies were rich enough to cover our event structure model. For the experiment, we restricted any information added to the OWL-based ontologies to the class documentation strings provided in the ontology. We are currently trying to use semantic extraction to automatically generate this information from the documentation.

Inference With CPRMs for QA

Inference in structured probabilistic models of dynamic systems, as in the CPRM model, consists of the following kinds of computations. Here $X_t$ is a state variable at time $t$, lowercase $x_t$ is a value assignment and $y_t$ is an observation value at time $t$.

Filtering: Compute $P(X_t \mid y_{1:t})$. State update based on the observation sequence.

Prediction: Compute $P(X_{t+h} \mid y_{1:t})$. Predict the state at some future time $t+h$ based on the observation sequence up to time $t$.

Smoothing: Compute $P(X_{t-m} \mid y_{1:t})$. Recompute previously estimated states in the presence of current evidence.

MAP: Compute $\arg\max_{x_{1:t}} P(x_{1:t} \mid y_{1:t})$. Compute the best assignment of state values given the observation sequence.

Reachability: Given a CPRM $S$ with an initial state $X_t$ and a final state $X_f$, is $X_f \in R(S, X_t)$?

We compiled a list of complex, semantically rich answer types for high-frequency questions in the AQUAINT CNS data. The top four categories were: Support/Justification for a proposition; the ability of an agent to perform a specific act; temporal projection or predictions from a state; and hypothetical situations (including counterfactuals). In our model these map straightforwardly into the running of various inference procedures (including their sequential application) described above. For counterfactuals, we use the idea of model intervention proposed by (Pearl). The exact details of the algorithm for counterfactuals are outside the scope of this paper. The table below summarizes the various query types and the corresponding inference algorithms. We do not know of any previously implemented QA system going from text to inference that is capable of handling these kinds of questions.

6 AQUAINT is an ARDA-sponsored QA program. The Center for Non-Proliferation (CNS) data is a data source released to the AQUAINT project.

    Answer Type                Inference Type
    Just(Proposition)          MAP
    Ability(Agt, Act)          F, S
    Prediction(State)          P, R, MAP
    Hypothetical(I, State)     F, R_I

    Table: The type of answer required and the inference algorithm used in the CPRM model. Here MAP stands for Maximum A Posteriori estimation, F for filtering, S for smoothing, R for reachability and P for predictive inference. A comma-separated sequence indicates sequential application. The symbol I represents a specific intervention into the CPRM network (Pearl) as specified by the hypothetical condition. Computing reachability after the intervention is given by R_I.

Evaluating Semantically based QA

The previous sections described techniques to incorporate semantic components at increasing levels of depth and complexity. We now report on experiments conducted to evaluate the utility of these differing components. We report on results pertaining to the impact of the identification of semantic structures and of inference through CPRMs on a baseline state-of-the-art QA system that emerged after five years of TREC evaluations.

Evaluating semantic information

To evaluate our novel QA architecture, we have used a set of questions pertaining to four different topics: UN inspections; thefts in Russia's nuclear navy; the status of India's Prithvi ballistic missile project; and China's participation in non-proliferation regimes. For each topic we have created a gold standard consisting of: questions; one or several text spans considered correct answers by two independent judges; the syntactic parse produced by the Collins parser (Collins), which was manually corrected; the predicate-argument structures of the questions and its corresponding

answer, produced automatically and then corrected manually; the semantic frames, whenever they could be identified. The answers were extracted from the AQUAINT CNS corpus. The gold standard was used for evaluating the precision ($P_{Arg}$) and recall ($R_{Arg}$) of identifying the correct boundaries of predicate arguments. We have also computed an $F_1$ score as $F_{1,Arg} = \frac{2 \times P_{Arg} \times R_{Arg}}{P_{Arg} + R_{Arg}}$. Table lists the results. The Table also lists the precision of classifying the arguments ($P_{Role}$), the recall for argument classification ($R_{Role}$) and the corresponding $F_1$ score. The results are presented for two corpora: the PropBank section and AnswerBank, which represents our gold standard. Table presents similar results for recognizing the boundaries of frame elements (FEs) from FrameNet and for classifying their semantic roles.

Corpus        P_Arg     R_Arg     F1_Arg
PropBank
AnswerBank

Corpus        P_Role    R_Role    F1_Role
PropBank
AnswerBank

Table: Identification of predicate-argument structures

Corpus        P_FE      R_FE      F1_FE
FrameNet
AnswerBank

Corpus        P_Role    R_Role    F1_Role
FrameNet
AnswerBank

Table: Identification of frame structures

Evaluating the CPRM model for QA

We experimented with the QA system on the AQUAINT CNS data. Since there are no implemented QA systems that perform the kinds of complex inferences described above, our evaluation with respect to the current state-of-the-art baseline relates to the enhanced set of questions and answer types our system can handle. We wanted to calibrate the extent and type of inferences needed for different questions in the CNS scenario data, as well as the extent to which such inferences require manual domain model building. To this end we created a set of hand-annotated question-answer passages for the gold standard. We measured the performance of our system along the following dimensions: (1) How well did the CPRM domain models automatically constructed from the OWL ontologies fare when compared to the CPRM model manually constructed from gold standard CNS data? (2) How capable was our CPRM event model in performing a set of complex event-structure based inferences required for QA?

To test (1), we manually compiled CPRM domain models based on our core theory of events and on the gold standard annotations (we used a build/validate/test dataset). We compared this to the model semi-automatically generated from the OWL databases of WMD processes. For our first experiment, we looked at how many of the complex, semantically rich inference types could be made by our system for the two models. Figure shows the performance of the two systems on the CNS gold standard annotations (the results are for the test data of questions). Note that both the manually built and the OWL-based models perform reasonably well for the different inference types we looked at. This is somewhat encouraging, given that this is the first inference-based QA system that we are aware of that goes from textual input to inference. The main shortcoming of the OWL-derived models was that they lacked detailed specifications of the processes, their resource requirements, and a detailed list of agent abilities, preconditions, effects and maintenance conditions. We are seeking to overcome this deficiency through a variety of automatic techniques, semantic web resources, and Subject Matter Expert (SME) input, using the CPRM GUI to bootstrap and enhance the acquisition of domain-specific knowledge. However, results from these efforts remain future work.

[Figure: Percent correct by inference type (compared to gold standard). Manually generated from CNS data: Justification 87, Prediction 83, Ability 83, Hypothetical 73. OWL-based Domain Model: Justification 72, Prediction 63, Ability 69, Hypothetical 51.]

Figure: Performance of the CNS-based gold standard and OWL-derived CPRM models based on inference type

To test (2), we looked at the percentage of inferences by different types of event-structure inferences that had to be made to generate the answer for the questions in the gold standard annotations. The categories we looked at were aspectual inferences (phases of events, viewpoints, zoom-in, zoom-out), action and process-feature inferences (preconditions, effect, resources produced, consumed, locked), and metaphoric inferences (we only looked at Event Structure Metaphors (Lakoff)). We counted the number of inferences made by the human and by the model (the CNS-based manually built model) for each category in the annotated data. We looked at the precision (number of correct inferences) and recall (number of total made).

Component         Number    M_1f    M_2f
Aspectual
Action-feature
Metaphor

Table: Inferences broken by Event Structure component. $M_{1f}$ refers to the f-score of the manually constructed CNS gold-standard model, $M_{2f}$ to the model derived from OWL.

Table shows our initial results. Note that all three of the categories of inferences are fairly common in the data, and our initial results are quite encouraging. The more domain-general inference types regarding the aspectual and metaphoric inferences about events seem to fare reasonably well (recall that all these inferences are impossible in the state-of-the-art baseline QA system). The lower score for the action-feature inference seems to be tied to the lack of domain knowledge in our model regarding domain-specific process details, such as the specific resources for the production or dispersal of WMD. We expect this number to increase considerably with more domain-specific knowledge using the techniques described earlier. We are also conducting a detailed study of other important categories of event-related causal inferences.

Evaluating the Answers

The focus of our experiments was to measure the impact of the identification of semantic structures and of inference through CPRMs on state-of-the-art QA techniques that emerged after five years of TREC evaluations. As reported in (Moldovan et al.), most of the errors of QA systems are determined by (a) the incorrect identification of the expected answer type and (b) the inability to expand question keywords with the ideal words that enhance the retrieval of the candidate answers. Table lists the results obtained for the identification of correct answer types. The answer hierarchy (AH), comprising more than WordNet concepts and mapping into name classes, was the source of only of the correctly recognized answer types, in contrast with the more than that is correctly identified for factoid questions when processing TREC-like data. To evaluate the contribution of predicate-argument structures (PAS), we considered that the answer type can be defined not only as a semantic class, but also as an argument of a specific predicate. Whenever the answer would be recognized as the same argument of the same predicate or of a directly related predicate, we considered that the answer type is recognized correctly. Similarly, when the frame structures could be identified in the question and the answer, the answer type can be indicated by the frame element (FE), and its correct identification accounts for our resolution of a correctly predicted answer type. The topic models (TMs) contribute to the recognition of the answer type if any of the relations they induce pertains to the expected answer, which may be either the relation itself, a more complicated structure that includes any of the topic relations, or any concept that takes part in any topic relation but was not accessible directly from the question words. The event structure (ES) was considered a valid source for finding the answer type if any of the schemas that were instantiated contained at least a semantic class or relation that corresponds, even partially, to the answer structure, whereas the combination between ES and the inference procedures (Inf) determines the answer type either by considering only the semantic information available from the ES or by adding to it the answer types determined by inference. The results listed in Table show that the schema instantiations, through their very general semantic coverage, account for most of the answer types which are recognized, whereas the addition of answer types determined by inference accounts for almost of the correct answer types of the evaluated complex questions. When processing the test questions only with the AH, of the answers were correct. In contrast, when all the other semantic structures were available and probabilistic inference could be performed, of the extracted answers were correct. In future work we plan to investigate ways in which the semantic structures presented in this paper could improve the quality of paragraph retrieval and keyword selection.

7 We computed an f-score based on $\frac{2PR}{P+R}$ for both the CNS gold-standard based CPRM model and for the OWL derived model.
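As a minimal sketch of the measure referred to in footnote 7, the following Python fragment computes precision, recall and the f-score $\frac{2PR}{P+R}$ from a set of system outputs and a set of gold-standard items. The function name precision_recall_f1 and the toy argument spans are illustrative assumptions and are not part of the evaluated system. Precision and recall are taken here in their usual set-based sense (correct items over items produced, and correct items over gold items), which is assumed rather than spelled out in the text.

```python
def precision_recall_f1(predicted, gold):
    """Return (precision, recall, f-score) for two collections of hashable items.

    Assumption: an item counts as correct only if it matches a gold item exactly.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    # Guard against empty sets so the ratios stay defined.
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical data: argument boundaries as (start, end) token offsets.
gold_spans = {(0, 3), (4, 7), (8, 12), (13, 15)}
system_spans = {(0, 3), (4, 7), (9, 12)}

p, r, f = precision_recall_f1(system_spans, gold_spans)
print(f"P={p:.3f} R={r:.3f} F1={f:.3f}")  # prints P=0.667 R=0.500 F1=0.571
```

The same function could be applied per inference category (aspectual, action-feature, metaphor) by passing the inferences made by the model and those made by the human annotator, provided both are encoded as comparable items.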

AH        PAS       FS        TM

PAS+TM    FS+TM     ES+TM     ES+Inf

Table: Number of correct answer types identified by semantic information originating in the Answer Hierarchy (AH), the predicate-argument structure (PAS), the topic model (TM), the event structure (ES) and the CPRM inference (Inf) for a set of complex questions.

8 Directly related predicates are those that (a) belong to the same verb hierarchy in WordNet or (b) are arguments of the target predicate, either because they are infinitives or because they belong to a relative clause.

Issues and Discussion

The last few years have witnessed a good deal of activity on predicate extraction, aka semantic (Gildea and Jurafsky; Kingsbury et al.). Until now it has been unclear if and how predicate extraction might help in the performance of an actual NLP task. Often the intuitive justification offered was that predicate extraction was an intermediate step toward semantic inference (Gildea and Jurafsky). As far as we know, the results reported in this paper constitute the first demonstration that sophisticated textual analysis (including predicate-argument extraction) can be combined with deep semantic representation and inference models to enhance a state-of-the-art QA system to answer new question types that pertain to causal and temporal aspects of complex events. Importantly, we believe our work demonstrates a flexible architecture and methodology that harnesses the increasingly widespread availability of semantically motivated resources such as WordNet, FrameNet, and the Semantic Web. Our current efforts are directed at more effective knowledge acquisition and at expanding the coverage of the system, both in terms of the domain models and of the question and answer types supported. We believe that our flexible architecture and CPRM-based computational model for combining predicate and frame parsing with deep inference could point the way for building the next generation of semantically rich QA systems.

Acknowledgements

This work was funded by an ARDA AQUAINT grant. We would like to thank Steve Maiorano for the many discussions of ideas he shared with us in this project. Special thanks go to Jerry Feldman and the NTL group at ICSI and UC Berkeley. We are also grateful to the researchers and students working on the AQUAINT project in the Human Language Technology Research Institute at UT Dallas.

References

Collin F. Baker, Charles J. Fillmore and John B. Lowe. The Berkeley FrameNet Project. In Proceedings of COLING-ACL, Montreal, Canada.

Benjamin K. Bergen and Nancy Chang. In press. Simulation-based language understanding in Embodied Construction Grammar. In Construction Grammars: Cognitive and Cross-language dimensions. John Benjamins.

Daniel M. Bikel, Richard Schwartz and Ralph M. Weischedel. An Algorithm that Learns What's in a Name. Machine Learning Journal.

Nancy Chang, Srini Narayanan and Miriam R.L. Petruck. Putting frames in perspective. In Proc. Nineteenth International Conference on Computational Linguistics (COLING).

Ciardo, Gianfranco, Reinhard German and Christoph Lindemann. A characterization of the stochastic process underlying a stochastic Petri net. Software Engineering.

Michael Collins. A New Statistical Parser Based on Bigram Lexical Dependencies. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL).

Getoor, Lise, Nir Friedman, Daphne Koller and Avi Pfeffer. Learning probabilistic relational models. In Relational Data Mining, ed. by Dzeroski and Lavrac. SV.

Daniel Gildea and Daniel Jurafsky. Automatic Labeling of Semantic Roles. Computational Linguistics.

Daniel Gildea and Martha Palmer. The Necessity of Parsing for Predicate Argument Recognition. In Proceedings of the Meeting of the Association for Computational Linguistics (ACL), Philadelphia, PA.

Sanda Harabagiu. Incremental Topic Representations. In Proceedings of the International Conference on Computational Linguistics (COLING).

M. Hearst. TextTiling: Segmenting text into multi-paragraph subtopic passages. Computational Linguistics.

Chin-Yew Lin and Eduard Hovy. The Automated Acquisition of Topic Signatures for Text Summarization. In Proceedings of the International Conference on Computational Linguistics (COLING).

Paul Kingsbury, Martha Palmer and Mitch Marcus. Adding Semantic Annotation to the Penn TreeBank. In Proceedings of the Human Language Technology Conference (HLT), San Diego, California.

George Lakoff and Mark Johnson. Philosophy in the Flesh: The Embodied Mind and Its Challenge to Western Thought. Basic Books, New York.

Dan Moldovan, Marius Pasca, Sanda Harabagiu and Mihai Surdeanu. Performance Issues and Error Analysis in an Open-Domain Question Answering System. In Proceedings of the Meeting of the Association for Computational Linguistics (ACL), Philadelphia, PA.

Dan I. Moldovan, Christine Clark, Sanda M. Harabagiu and Steven J. Maiorano. COGEX: A Logic Prover for Question Answering. HLT-NAACL.

Dan Moldovan, Christine Clark, Sanda Harabagiu and Steven Maiorano. A Logic Prover for Question Answering. In Proceedings of the HLT-NAACL, Edmonton, Canada.

Murphy, Kevin. Dynamic Bayesian Networks: Representation, Inference and Learning. University of California, Berkeley dissertation.

Narayanan, Srini. Knowledge-based Action Representations for Metaphor and Aspect (KARMA). Computer Science Division, University of California at Berkeley dissertation.

Narayanan, Srini. Reasoning about actions in narrative understanding. In Proc. Sixteenth International Joint Conference on Artificial Intelligence (IJCAI). Morgan Kaufmann Press.

Narayanan, Srini and Sheila McIlraith. Analysis and Simulation of Web Services. In Computer Networks. Elsevier NH.

Pearl, Judea. Causality: Models, Reasoning and Inference. Cambridge University Press.

Pfeffer, Avi. Probabilistic Reasoning for Complex Systems. Stanford University dissertation.

Ellen Riloff. Automatically Generating Extraction Patterns from Untagged Text. In Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI).

Mihai Surdeanu, Sanda Harabagiu, Paul Aarseth and John Williams. Using Predicate-Argument Structures for Information Extraction. In Proceedings of the Meeting of the Association for Computational Linguistics (ACL), Sapporo, Japan.