
A State of the Art Technique in Semantic Analysis of Natural Utterances

Marija Brkić, Maja Matetić
Department of Informatics, Faculty of Arts and Sciences, University of Rijeka
Omladinska 14, 51000 Rijeka, Croatia
Telephone number: 051 345 034
Email: [email protected], [email protected]com.hr

Abstract: This paper presents computational approaches to the problem of semantic analysis, which is a crucial task in language technology applications, e.g. dialogue systems and machine translation. Explanations are supported by Croatian example sentences. The focus is on the syntax-driven semantic approach, which is based on the principle of compositionality and on the process of augmenting Context-Free Grammar rules. There are also cases where the meaning of a constituent is not based on the meaning of its parts in a straightforward compositional sense, and this paper presents a way of dealing with such cases as well.

I. INTRODUCTION

Language and speech resources are of crucial importance for research and development in language technology. It is equally important to develop language tools that help people communicate effectively in foreign countries or with foreigners. Most research is done on the English language. Since Croatian and other Slavic languages are essentially very different from English, language-specific tools need to be developed. Research on languages with similar characteristics should serve as guidance; consider [1], [2], [3], and [10].

A typical spoken language understanding system has a speech recognizer and a speech synthesizer. The speech recognition results are parsed into semantic forms by the sentence interpretation component. The sentence interpretation module often needs discourse analysis to track context and resolve ambiguities. The dialog manager is the central component; it communicates with the discourse analysis module, the sentence interpretation component and, lastly, the message generation module [6]. We will focus on the sentence interpretation module. The scope of semantic analysis is wide, as can be seen in [5], [8], [9], [11], and [12]. Our attention will be directed to one of the approaches to semantic analysis, the syntax-driven semantic analysis approach.

II. SYNTAX-DRIVEN SEMANTIC ANALYSIS

Semantic analysis is the process of relating syntactic structures (phrases, clauses, sentences, text) to their language-independent meanings. Idioms, being cultural elements, also have to be converted into relatively invariant meanings. They are special in that they consist of groups of words in a fixed order that have a particular meaning different from the meanings of each word understood on its own.

One of the approaches to semantic analysis is the syntax-driven approach. It is based on the principle of compositionality. The key idea of this principle is that the meaning of a sentence can be composed from the meanings of its parts. However, this idea should not be interpreted literally. The meaning of a sentence is not based just on the meanings of the words that make it up, but also on the grouping, ordering and relations among the words in the sentence.

The most commonly used mathematical system for modeling constituent structure in natural language is the Context-Free Grammar, or CFG. It consists of a set of rules or productions. These rules express the ways that symbols of the language can be grouped and ordered together.

Context-free grammar rules need to be augmented with semantic attachments, which specify how to construct a meaning representation of a construction from the meanings of its constituent parts [7]. These augmented rules have the following structure:

    A → α1 … αn {f(αj.sem, …, αk.sem)}    (1)

The meaning representation assigned to the construction A can be computed by running the function f on some subset of the semantic attachments of A's constituents.

Meaning representations will be presented in First Order Predicate Calculus (FOPC), a flexible and well-understood meaning representation language. Let us briefly describe FOPC. It provides three ways to represent an object: constants, functions, and variables. Constants refer to specific objects in the world (e.g. single capitalized letters or concrete words). FOPC functions are syntactically the same as single-argument predicates, but they are actually terms in that they refer to specific objects without having to associate a named constant with them. The last mechanism that refers to objects is the variable, which can be used to refer to an anonymous object or generically to all objects in a collection. Variables give us the ability to make inferences or assertions. These uses are made possible by quantifiers: the existential quantifier ∃ (there exists) and the universal quantifier ∀ (for all).

Let us now turn to the mechanisms used to state relations that hold among objects. Predicates are symbols that refer to the relations that hold among some fixed number of objects in a given domain. FOPC sentences can be assigned a value of True or False based on whether the propositions are in accord with the world or not. Sentences with universally quantified variables must be true under all possible substitutions, while those with existentially quantified variables must have at least one substitution that results in a true sentence. The various logical connectives give us the ability to create larger representations. In that respect, we can use the operators shown in Table I.

TABLE I
TRUTH TABLE GIVING THE SEMANTICS OF THE VARIOUS LOGICAL CONNECTIVES

    P      Q      ¬P     P ∧ Q  P ∨ Q  P ⇒ Q
    False  False  True   False  False  True
    False  True   True   False  True   True
    True   False  False  False  True   False
    True   True   False  True   True   True

Consider the sentence:

    Ana poslužuje jelo. (Ana serves meal.)    (2)

The concrete entities are represented by the FOPC constants Ana and Jelo. The lexical rules that introduce these words into the sentence are:

    ProperNoun → Ana {Ana}
    CommonNoun → jelo {Jelo}

The NPs (Noun Phrases) obtain their meaning representations from the meanings of their children. For non-branching grammar rules, such as those in our example (2), the semantic expression associated with the child is simply copied to the parent.

    NP → ProperNoun {ProperNoun.sem}
    NP → CommonNoun {CommonNoun.sem}

A generic event Posluživanje (Serving) involves a Poslužitelj (Server) and something Posluženo (Served):

    ∃e, x, y Isa(e, Posluživanje) ∧ Poslužitelj(e, x) ∧ Posluženo(e, y)    (3)

The formula in (4) presents the semantic attachment of the verb poslužuje:

    Verb → poslužuje {∃e, x, y Isa(e, Posluživanje) ∧ Poslužitelj(e, x) ∧ Posluženo(e, y)}    (4)

The meaning of the NP needs to be incorporated into the meaning of the verb, and the resulting representation needs to be assigned to VP.sem (the Verb Phrase semantic attachment). The variable y will be replaced with the logical term Jelo as the second argument of the Posluženo role of the Posluživanje event:

    ∃e, x Isa(e, Posluživanje) ∧ Poslužitelj(e, x) ∧ Posluženo(e, Jelo)    (5)

The VP semantic attachment must have two capabilities. It has to know which variables within the Verb's semantic attachment are to be replaced by the semantics of the Verb's arguments, and it has to be able to perform such a replacement.

This functionality is provided by a notational extension to FOPC called the lambda notation. The extended FOPC syntax includes expressions of the following form:

    λx P(x)    (6)

These lambda expressions undergo a process of lambda reduction, which means that they can be applied to logical terms to yield new FOPC expressions in which the formal parameter variables are bound to the specific terms. Applying (6) to the term A gives (7), which reduces to (8):

    λx P(x)(A)    (7)

    P(A)    (8)

One lambda expression can be used as the body of another. After the first reduction, the resulting expression is still a lambda expression. This technique is called currying, and it converts a predicate with multiple arguments into a sequence of single-argument predicates. It is worth noting that arguments are limited to FOPC terms.

    λx λy Near(x, y)    (9)

Now we can go back to our example in (2).

    Verb → poslužuje {λx ∃e, y Isa(e, Posluživanje) ∧ Poslužitelj(e, y) ∧ Posluženo(e, x)}    (10)

The attachment for our VP rule specifies a lambda application. The lambda expression is provided by Verb.sem and the argument is provided by NP.sem, i.e. Jelo.

    VP → Verb NP {Verb.sem(NP.sem)}    (11)

This lambda application binds the single formal parameter x of the lambda expression to the value in NP.sem.

    ∃e, y Isa(e, Posluživanje) ∧ Poslužitelj(e, y) ∧ Posluženo(e, Jelo)    (12)

What we still need is the semantic attachment for the S (Sentence) rule. It must incorporate an NP argument into the appropriate role in the event representation in VP.sem.

    S → NP VP {VP.sem(NP.sem)}    (13)

However, the lambda application performed at the VP rule resulted in a generic FOPC expression, which cannot be applied to a further argument. The Verb attachment has to consist of an embedded lambda expression to make the Poslužitelj role available for binding at the S level of the grammar.

    Verb → poslužuje {λx λy ∃e Isa(e, Posluživanje) ∧ Poslužitelj(e, y) ∧ Posluženo(e, x)}    (14)

The Verb attachment consists of a lambda expression inside a lambda expression. The outer expression provides the variable that is replaced by the first lambda reduction, while the inner one provides the variable that is replaced by the second lambda reduction. The ordering of variables in the multi-layer lambda expressions in the semantic attachment of the verb encodes facts about the expected location of a Verb's arguments in the syntax. This is fairly simple for English, which has relatively fixed word order, but not for Croatian. The parse tree for this example is shown in Figure 1.

Let us now look at the phrase 'Lijepa djevojka' (beautiful girl). An obvious and often incorrect proposal for the semantic attachment of the NP is illustrated in the following rules:

    Nominal → Adj Nominal {λx Nominal.sem(x) ∧ Isa(x, Adj.sem)}    (15)

    Adj → lijepa {Lijepa}    (16)

This yields the following fairly reasonable representation:

    λx Isa(x, Djevojka) ∧ Isa(x, Lijepa)    (17)

This is an example of intersective semantics, since the meaning of the phrase can be thought of as the intersection of the category stipulated by the nominal and the category stipulated by the adjective. Here it amounts to the intersection of the category of beautiful things with the category of girls.

Consider the example 'bivši prijatelj' (former friend).

    λx Isa(x, Prijatelj) ∧ Isa(x, Bivši)    (18)

This representation asserts that the person in question is a friend, which is not true, and it makes use of a fairly unreasonable category of former things. The best approach is to note the status of a specific kind of modification relation and replace this vague relation with some further procedure.

    Nominal → Adj Nominal {λx Nominal.sem(x) ∧ AM(x, Adj.sem)}    (19)

Applying this rule to 'Lijepa djevojka' results in the following formula, in which AM stands for adjective modifier:

    ∃x Isa(x, Djevojka) ∧ AM(x, Lijepa)    (20)

This proposal does not work with former friend, where the solution has to be based on the specific semantics of the adjectives and nouns in question.

In general, lexical rules provide content-level predicates and terms for meaning representations. The semantic attachments to grammar rules just put predicates and terms together in the right ways.

S: ∃e Isa(e, Posluživanje) ∧ Poslužitelj(e, Ana) ∧ Posluženo(e, Jelo)

NP    VP: λx ∃e Isa(e, Posluživanje) ∧ Poslužitelj(e, x) ∧ Posluženo(e, Jelo)

NP: Jelo

ProperNoun    Verb    CommonNoun: Jelo

Ana    poslužuje    jelo
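
The derivation summarized in this parse tree can be replayed as a minimal Python sketch. It assumes that FOPC formulas are plain strings and that lambda reduction is ordinary function application on curried Python lambdas; the names LEXICON, np_sem, vp_sem and s_sem are hypothetical and not part of the paper.

```python
# Sketch only: formulas are strings, lambda reduction is function application.

def verb_posluzuje():
    """Verb.sem as in rule (14):
    λx λy ∃e Isa(e, Posluživanje) ∧ Poslužitelj(e, y) ∧ Posluženo(e, x)
    The outer parameter x is bound first (object), the inner y second (subject).
    """
    return lambda x: lambda y: (
        f"∃e Isa(e, Posluživanje) ∧ Poslužitelj(e, {y}) ∧ Posluženo(e, {x})"
    )

# Lexical rules: each word is paired with its semantic attachment.
LEXICON = {
    "Ana": "Ana",                    # ProperNoun → Ana {Ana}
    "jelo": "Jelo",                  # CommonNoun → jelo {Jelo}
    "poslužuje": verb_posluzuje(),   # Verb → poslužuje {...}
}

def np_sem(word):
    # NP → ProperNoun | CommonNoun: non-branching, so the child's
    # semantics is copied to the parent unchanged.
    return LEXICON[word]

def vp_sem(verb, obj):
    # VP → Verb NP {Verb.sem(NP.sem)}, rule (11): first lambda reduction.
    return LEXICON[verb](np_sem(obj))

def s_sem(subj, verb, obj):
    # S → NP VP {VP.sem(NP.sem)}, rule (13): second lambda reduction.
    return vp_sem(verb, obj)(np_sem(subj))

meaning = s_sem("Ana", "poslužuje", "jelo")
print(meaning)
```

After the first reduction vp_sem still returns a function, which mirrors the point that the result of applying a curried lambda expression is itself a lambda expression until all parameters are bound.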

Figure 1. Parse tree with semantic attachments for Ana poslužuje jelo.

Up to now, we have focused only on declarative sentences. Let us consider the following examples: Ana poslužuje ručak. (Ana serves lunch.) Posluži ručak! (Serve lunch!) Da li Ana poslužuje ručak? (Does Ana serve lunch?) Tko poslužuje ručak? (Who serves lunch?) These sentences all contain propositions concerning the serving of lunch, but they differ with respect to the role they are intended to serve. The first of these sentences conveys factual information to the listener, the second is a request for an action, and the last two are requests for information. To differentiate between these sentences, we can introduce a set of operators which will be applied to the FOPC sentences: DCL (declaratives), IMP (imperatives), YNQ (yes-no questions), and WHQ (wh-questions).

By altering the basic sentence rule we have been using so far, we get a rule for declarative sentences:

    S → NP VP {DCL(VP.sem(NP.sem))}    (21)

Imperative sentences begin with a verb phrase and do not have an overt subject.

    S → VP {IMP(VP.sem(DummyYou))}    (22)

Yes-no questions consist of a sentence-initial auxiliary verb, followed by a particle, a subject noun phrase and then a verb phrase.

    S → Aux NP VP {YNQ(VP.sem(NP.sem))}    (23)

Yes-no questions can be answered quite simply, by determining whether the proposition is in the knowledge base, or whether it can be inferred from the knowledge base.

Wh-subject-questions are the last type we are going to deal with. They have the following attachment [7]:

    S → WhWord NP VP {WHQ(NP.sem.var, VP.sem(NP.sem))}    (24)

II. IDIOMS

As one could expect, the principle of compositionality does not work with idioms. An idiom is an expression whose meaning cannot be deduced from the meaning, grouping and ordering of its parts. It refers to a figurative meaning that is known only through conventional use [7]. Consider the following idioms: Kako prostreš, tako ćeš i leći (As you make your bed, so you must lie upon it), Mi o vuku, a vuk na vrata (Speak of the devil, and in he walks), Kuj željezo dok je vruće (Make hay while the sun shines), Tiha voda brege dere (Still waters run deep), To je kap koja je prelila čašu (That was the last straw), Trčati pred rudo (Jump the gun), etc.

The best way to deal with these expressions is to introduce new grammar rules designed for them. These rules mix lexical items with grammatical constituents. They also introduce semantic content that is not derived from any of the parts. Consider the following rule:

    S → trčati pred rudo {Prebrzo}    (25)

The constant Prebrzo should not be taken for granted as the meaning representation for this idiom. However, it illustrates that the meaning of the idiom has nothing to do with the meanings of its parts. Of course, in these rules for Croatian and similar languages, special attention should be paid to person marking [4] (Trčim pred rudo. Trčiš pred rudo.).

There are some idioms that allow for some variation, and those should be represented by more general rules. Let us take the following idiom as an example: Tresla se brda, rodio se miš (Much ado about nothing). We could define the following rule for this idiom:

    S → tresla se brda, rodio se miš {Pretjerivati}    (26)

However, somebody could say Tresla se brda, rodio se mali miš, and this rule would no longer work. That is the reason why we should make more general rules [7], in which miš_NP stands for a noun phrase headed by miš:

    S → tresla se brda, rodio se miš_NP {Pretjerivati}    (27)

III. SEMANTIC GRAMMARS

Grammars that are needed for compositional semantic analysis and that represent one of the ways of instantiating a syntax-driven approach in practical systems are known as semantic grammars, and they differ a lot from traditional grammars. The need for uniform semantic attachments often results in constituents that are at the right level of generality for the syntax, but at too high a level for semantic purposes. Let us consider the rule for the phrase kineski restoran (Chinese restaurant):

    Nominal → Adj Nominal {λx Nominal.sem(x) ∧ AM(x, Adj.sem)}    (28)

It results in the following meaning representation:

    ∃x Isa(x, Restoran) ∧ AM(x, Kineski)    (29)

This is just an indication that the nominal is modified by the adjective. The expression we want to represent means that food is prepared in a particular way. These problems can be solved by means of semantic grammars. Rules in these grammars are made no more general than is needed for sensible semantic analysis. The rule that could be used to parse the phrase 'kineski restoran' could be:

    RestaurantType → Nationality RestaurantType    (30)

Although convenient at first sight, semantic grammars need to be huge in order to be efficient. Consider the expression 'kanadski restoran' (Canadian restaurant). It matches the rule, although such an interpretation would be utterly incorrect. Such a phrase simply refers to a restaurant located in Canada.

Semantic grammars are useful when dealing with anaphors and ellipsis because they enable prediction. Unfortunately, they cannot be reused because they are domain-specific [7].

IV. CONCLUSION

Since Croatian syntax essentially differs from that of English, for which a great deal of research has been done, language-specific tools need to be developed. This paper outlines one of the approaches to semantic analysis, the syntax-driven approach. It is based on the principle of compositionality, while also taking into account the grouping, ordering and relations among words in the sentence. Context-free grammar rules augmented with semantic attachments specify how to construct a meaning representation of a construction from the meanings of its constituent parts. Meaning representations are presented in First Order Predicate Calculus. The outlined rules can be developed further. Completely new grammar rules are introduced for idioms, since they refer to figurative meaning. These rules mix lexical items with grammatical constituents, as shown through examples. Grammars needed for compositional semantic analysis differ a lot from traditional grammars and are known as semantic grammars.

REFERENCES

[1] Cussons, J., Džeroski, S., Erjavec, Ž., "Morphosyntactic Tagging of Slovene using Progol", Lecture Notes in Computer Science, Proceedings of the 9th International Workshop on Inductive Logic Programming, vol. 1634, 1999, pp. 68-79, www-ai.ijs.si/SasoDzeroski/files/1999_CDE_MorphosyntacticTaggingSlovene.pdf
[2] Drodzynski, W., Homola, P., Piskorski, J., Zinkevicius, V., "Adapting SProUT to processing Baltic and Slavonic languages", IESL'03 Workshop held in conjunction with the RANLP Recent Advances in Natural Language Processing, Borovets, Bulgaria, 2003.
[3] Erjavec, T., Džeroski, S., "Machine Learning of Morphosyntactic Structure: Lemmatizing Unknown Slovene Words", Applied Artificial Intelligence, 18: 17-41, 2004.
[4] Franks, S., "Slavic Languages", in: Cinque, G., Kayne, R. (ed.), Handbook of Comparative Syntax, 2005.
[5] Gildea, D. and Jurafsky, D., Automatic Labeling of Semantic Roles, Computational Linguistics 28(3), 2002, pp. 245-288.
[6] Huang, X., Acero, A., Hon, H., Spoken Language Processing. A Guide to Theory, Algorithm, and System Development, Prentice Hall, Upper Saddle River, New Jersey, 2001.
[7] Jurafsky, D., Martin, J. H., Speech and Language Processing. An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Prentice Hall, Upper Saddle River, New Jersey, 2000.
[8] Lin, D. and Pantel, P., Discovery of Inference Rules for Question Answering, Natural Language Engineering, 7(4), 2001, pp. 343-360.
[9] Lin, D. and Pantel, P., Induction of Semantic Classes from Natural Language Text, in Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2001, pp. 317-322.
[10] Nenadić, G., Vitas, D., Krstev, C., "Local Grammars and Compound Verb Lemmatization in Serbo-Croatian", in: Zybatow, G., Junghanns, U., Mehlhorn, G., Szucsich, L. (ed.), Current Issues in Formal Slavic Linguistics, Peter Lang, Frankfurt/Main, 2001, pp. 469-477.
[11] Riloff, E. and Thelen, M., A Rule-based Question Answering System for Reading Comprehension Tests, Workshop on Reading Comprehension Tests as Evaluation for Computer-Based Language Understanding Systems, ANLP/NAACL, 2000.
[12] Tsang, V., Stevenson, S. and Merlo, P., Crosslinguistic Transfer in Automatic Verb Classification, Proceedings of the 19th International Conference on Computational Linguistics (COLING), 2002, pp. 1023-1029.
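
As a supplementary illustration of the sentence-type operators in rules (21)-(23), the following minimal Python sketch wraps a VP meaning in DCL, IMP or YNQ and answers a yes-no question by looking the proposition up in a knowledge base. Several things here are assumptions made only for illustration and do not come from the paper: formulas are plain strings, the knowledge base is a Python set, a declarative is interpreted by adding its proposition to that set, and the function names are hypothetical. Inference from the knowledge base is not modeled.

```python
# Sketch only: operators tag a proposition; the knowledge base is a set of strings.

def vp_serve_lunch(subject):
    """VP.sem for 'poslužuje ručak', already reduced with the object Ručak:
    λy ∃e Isa(e, Posluživanje) ∧ Poslužitelj(e, y) ∧ Posluženo(e, Ručak)
    """
    return (
        f"∃e Isa(e, Posluživanje) ∧ Poslužitelj(e, {subject}) "
        f"∧ Posluženo(e, Ručak)"
    )

def declarative(np_sem, vp_sem):
    # S → NP VP {DCL(VP.sem(NP.sem))}, rule (21)
    return ("DCL", vp_sem(np_sem))

def imperative(vp_sem):
    # S → VP {IMP(VP.sem(DummyYou))}, rule (22): no overt subject.
    return ("IMP", vp_sem("DummyYou"))

def yes_no_question(np_sem, vp_sem):
    # S → Aux NP VP {YNQ(VP.sem(NP.sem))}, rule (23)
    return ("YNQ", vp_sem(np_sem))

knowledge_base = set()

def interpret(meaning):
    """Assumed interpretation: DCL stores a fact, YNQ looks it up."""
    operator, proposition = meaning
    if operator == "DCL":
        knowledge_base.add(proposition)
        return None
    if operator == "YNQ":
        return proposition in knowledge_base
    return None  # IMP and WHQ are not handled in this sketch

before = interpret(yes_no_question("Ana", vp_serve_lunch))
interpret(declarative("Ana", vp_serve_lunch))
after = interpret(yes_no_question("Ana", vp_serve_lunch))
print(before, after)
```

The same question is answered differently before and after the declarative is processed, which is the simple lookup case described for yes-no questions.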