<<

Statistical and CKY

Instructor: Wei Xu Ohio State University

Many slides from Ray Mooney and Michael Collins TA Office Hours for HW#2

• Dreese 390: - 03/28 Tue 10:00AM-12:00 noon - 03/30 Thu 10:00AM-12:00 noon - 04/04 Tue 10:00AM-12:00 noon Wuwei Lan

• Readings: - textbook http://ciml.info/dl/v0_99/ciml-v0_99-ch17.pdf - slide #28,29.30: https://cocoxu.github.io/courses/ 5525_slides_spring17/15_more_memm.pdf

Syntax

Syntactic Parsing Parsing

• Given a string of terminals and a CFG, determine if the string can be generated by the CFG: - also return a for the string - also return all possible parse trees for the string

• Must search space of derivations for one that derives the given string. - Top-Down Parsing - Bottom-Up Parsing Simple CFG for ATIS English

Grammar Lexicon S → NP VP Det → the | a | that | this S → Aux NP VP → book | flight | meal | money S → VP → book | include | prefer NP → Pronoun Pronoun → I | he | she | me NP → Proper-Noun Proper-Noun → Houston | NWA NP → Det Nominal Aux → does Nominal → Noun Prep → from | to | on | near | through Nominal → Nominal Noun Nominal → Nominal PP VP → Verb VP → Verb NP VP → VP PP PP → Prep NP Parsing Example

S

VP

Verb NP book that flight

book Det Nominal

that Noun

flight Top Down Parsing • Start searching space of derivations for the start symbol. S

NP VP

Pronoun Top Down Parsing

S

NP VP

Pronoun X book Top Down Parsing

S

NP VP

ProperNoun Top Down Parsing

S

NP VP

ProperNoun X book Top Down Parsing

S

NP VP

Det Nominal Top Down Parsing

S

NP VP

Det Nominal X book Top Down Parsing

S

Aux NP VP Top Down Parsing

S

Aux NP VP X book Top Down Parsing

S

VP Top Down Parsing

S

VP

Verb Top Down Parsing

S

VP

Verb

book Top Down Parsing

S

VP

Verb X book that Top Down Parsing

S

VP

Verb NP Top Down Parsing

S

VP

Verb NP

book Top Down Parsing

S

VP

Verb NP

book Pronoun Top Down Parsing

S

VP

Verb NP

book Pronoun X that Top Down Parsing

S

VP

Verb NP

book ProperNoun Top Down Parsing

S

VP

Verb NP

book ProperNoun X that Top Down Parsing

S

VP

Verb NP

book Det Nominal Top Down Parsing

S

VP

Verb NP

book Det Nominal

that Top Down Parsing

S

VP

Verb NP

book Det Nominal

that Noun Top Down Parsing

S

VP

Verb NP

book Det Nominal

that Noun

flight Bottom Up Parsing • Start searching space of reverse derivations from the terminal symbols in the string.

book that flight Bottom Up Parsing

Noun

book that flight Bottom Up Parsing

Nominal

Noun

book that flight Bottom Up Parsing

Nominal

Nominal Noun

Noun

book that flight Bottom Up Parsing

Nominal

Nominal Noun X Noun

book that flight Bottom Up Parsing

Nominal

Nominal PP

Noun

book that flight Bottom Up Parsing

Nominal

Nominal PP

Noun Det

book that flight Bottom Up Parsing

Nominal

Nominal PP

NP

Noun Det Nominal

book that flight Bottom Up Parsing

Nominal

Nominal PP

NP

Noun Det Nominal

book that Noun

flight Bottom Up Parsing

Nominal

Nominal PP

NP

Noun Det Nominal

book that Noun

flight Bottom Up Parsing

Nominal

Nominal PP S

NP VP

Noun Det Nominal

book that Noun

flight Bottom Up Parsing

Nominal

Nominal PP S

NP VP Noun Det Nominal X

book that Noun

flight Bottom Up Parsing

Nominal

Nominal PP X NP

Noun Det Nominal

book that Noun

flight Bottom Up Parsing

NP

Verb Det Nominal

book that Noun

flight Bottom Up Parsing

VP NP

Verb Det Nominal

book that Noun

flight Bottom Up Parsing

S

VP NP

Verb Det Nominal

book that Noun

flight Bottom Up Parsing

S

VP X NP

Verb Det Nominal

book that Noun

flight Bottom Up Parsing

VP

VP PP NP

Verb Det Nominal

book that Noun

flight Bottom Up Parsing

VP

VP PP X NP

Verb Det Nominal

book that Noun

flight Bottom Up Parsing

VP NP NP Verb Det Nominal

book that Noun

flight Bottom Up Parsing

VP NP

Verb Det Nominal

book that Noun

flight Bottom Up Parsing

S

VP NP

Verb Det Nominal

book that Noun

flight Top Down vs. Bottom Up

• Top down never explores options that will not lead to a full parse, but can explore many options that never connect to the actual . • Bottom up never explores options that do not connect to the actual sentence but can explore options that can never lead to a full parse. • Relative amounts of wasted search depend on how much the branches in each direction.

Syntax CYK Algorithm Parsing

• CKY (Cocke-Kasami-Younger) algorithm based on bottom-up parsing and requires first normalizing the grammar. • First grammar must be converted to (CNF) in which productions must have either exactly 2 non- terminal symbols on the RHS or 1 terminal symbol (lexicon rules). • Parse bottom-up storing formed from all substrings in a triangular table (chart). Dynamic Programming

• a general algorithm design technique for solving problems defined by recurrences with overlapping subproblems • first invented by Richard Bellman in 1950s • “programming” here means “planning” or finding an optimal program, as also seen in the term “linear programming” • Main idea: • setup a recurrence of smaller subproblems • solve subproblems once and record solutions in a table (avoid any recalculation)

ATIS English Grammar Conversion Original Grammar Chomsky Normal Form

S → NP VP S → NP VP S → Aux NP VP S → X1 VP X1 → Aux NP S → VP S → book | include | prefer S → Verb NP S → VP PP NP → Pronoun NP → I | he | she | me NP → Proper-Noun NP → Houston | NWA NP → Det Nominal NP → Det Nominal Nominal → Noun Nominal → book | flight | meal | money Nominal → Nominal Noun Nominal → Nominal Noun Nominal → Nominal PP Nominal → Nominal PP VP → Verb VP → book | include | prefer VP → Verb NP VP → Verb NP VP → VP PP VP → VP PP PP → Prep NP PP → Prep NP

Note that, although not shown here, original grammar contain all the lexical entires. Exercise CKY Parser Book the flight through Houston j= 1 2 3 4 5 i= 0 Cell[i,j] contains all 1 constituents (non-terminals) covering words 2 i +1 through j

3

4 CKY Parser Book the flight through Houston j= 1 2 3 4 5 i= 0 Cell[i,j] contains all 1 constituents (non-terminals) covering words 2 i +1 through j

3

4 CKY Parser

Book the flight through Houston

S, VP, Verb, Nominal, Noun None

NP Det

Nominal, Noun CKY Parser

Book the flight through Houston

S, VP, Verb, Nominal, VP Noun None

NP Det

Nominal, Noun CKY Parser

Book the flight through Houston

S, VP, Verb, S Nominal, VP Noun None

NP Det

Nominal, Noun CKY Parser

Book the flight through Houston

S, VP, Verb, S Nominal, VP Noun None

NP Det

Nominal, Noun CKY Parser

Book the flight through Houston

S, VP, Verb, S Nominal, VP Noun None None

NP Det None

Nominal, Noun None

Prep CKY Parser

Book the flight through Houston

S, VP, Verb, S Nominal, VP Noun None None

NP Det None

Nominal, Noun None

Prep PP

NP ProperNoun CKY Parser

Book the flight through Houston

S, VP, Verb, S Nominal, VP Noun None None

NP Det None

Nominal, Noun None Nominal

Prep PP

NP ProperNoun CKY Parser

Book the flight through Houston

S, VP, Verb, S Nominal, VP Noun None None

NP NP Det None

Nominal, Noun None Nominal

Prep PP

NP ProperNoun CKY Parser

Book the flight through Houston

S, VP, Verb, S Nominal, VP Noun None None VP

NP NP Det None

Nominal, Noun None Nominal

Prep PP

NP ProperNoun CKY Parser

Book the flight through Houston

S, VP, Verb, S Nominal, VP S Noun None None VP

NP NP Det None

Nominal, Noun None Nominal

Prep PP

NP ProperNoun CKY Parser

Book the flight through Houston

S, VP, Verb, S Nominal, VP VP S Noun None None VP

NP NP Det None

Nominal, Noun None Nominal

Prep PP

NP ProperNoun CKY Parser

Book the flight through Houston

S, VP, Verb, S S Nominal, VP VP S Noun None None VP

NP NP Det None

Nominal, Noun None Nominal

Prep PP

NP ProperNoun CKY Parser

Book the flight through Houston Parse S, VP, Verb, S S Nominal, VP VP Tree S Noun None None #1 VP

NP NP Det None

Nominal, Noun None Nominal

Prep PP

NP ProperNoun CKY Parser

Book the flight through Houston Parse S, VP, Verb, S S Nominal, VP VP Tree S Noun None None #2 VP

NP NP Det None

Nominal, Noun None Nominal

Prep PP

NP ProperNoun The Problem with Parsing: Ambiguity

INPUT: She announced a program to promote safety in trucks and vans

+ POSSIBLE OUTPUTS:

S S S S S S

NP VP NP VP NP VP NP VP She Sh e NP VP She NP VP She

announced NP She She announced NP announced NP announced NP

NP VP NP VP a program announced NP announced NP NP VP a program NP PP

to promote NP a progra m to promote NP PP in NP NP VP safety PP safety in NP a program trucks and va ns to promote NP in NP to promote trucks and vans NP safe ty

trucks and vans NP and NP NP and NP va ns vans NP and NP NP VP va ns NP VP sa fety PP a program a program in NP to promote NP PP trucks to promote NP safe ty in NP

trucks safety PP

in NP

trucks

And there are more...

Syntax

Probabilistic Context Free (PCFG)

split point

O(n3|N|3)

O(n2) for l, i choices

Parsing with Lexicalized CFGs S(saw)

I The new form of grammar looksNP(man) just like a Chomsky normalVP(saw) form CFG, but with potentially O( ⌃ 2 N 3) possible rules. DT(the) | NN(man)| ⇥ | | the man VP(saw) PP(with) I Naively, parsing an n word sentence using the dynamic programming algorithm will take O(n3 ⌃ 2 Vt(saw)N 3) time.NP(dog)But IN(with) NP(telescope) saw with ⌃ can be huge!! | | | | DT(the) NN(dog) DT(the) NN(telescope) | | the dog the telescope 2 3 I Crucial observation: at most O(n N ) rules can be p(t) = q(S(saw)⇥ | | 2 NP(man) VP(saw)) applicable to a given sentence w1,w2,...w! n of length n. q(NP(man) 2 DT(the) NN(man)) This is because any rules which⇥ containq(VP(saw) a lexical! VP(saw) item that PP(with) is ) ⇥ !1 not one of w1 ...wn, can be safelyq(VP(saw) discarded. Vt(saw) NP(dog)) ⇥ !1 q(PP(with) 1 IN(with) NP(telescope)) ⇥5 3 ! I The result: we can parse in O(n ...N ) time. ⇥ | |

Syntax

Dependency Parsing