Chart Parsing: the CYK Algorithm
Informatics 2A: Lecture 17

Bonnie Webber
School of Informatics
University of Edinburgh

28 October 2008

1 Problems with Recursive Descent Parsing
    Left Recursion
    Complexity
2 The CYK Algorithm
    Parsing as Dynamic Programming
    The CYK Algorithm
    Visualizing the Chart
    Properties of the Algorithm

Reading:
    J&M (2nd ed), ch. 13 (Sections 13.3 – 13.4)
    NLTK Tutorial, Chart Parsing and Probabilistic Parsing, pp. 1-8.


Motivation: Ambiguity

The parsing approach of Lecture 15 aimed to address a limited amount of local ambiguity: the problem of not being able to decide uniquely which grammar rule to use next in a left-to-right analysis of the input string, even if the string is not globally ambiguous.

By restructuring the grammar, the parser can make a unique decision, based on a limited amount of look-ahead.

We’ll now look at two other ways of handling ambiguity:
    Chart parsing: handling ambiguity with the parser alone;
    Probabilistic grammars: handling ambiguity with both grammar and parser.

Left Recursion

Recall that recursive descent parsing may require restructuring a grammar to eliminate left-recursive rules. But what if those rules reflect important structural and distributional properties of the language?

    NP → DET N
    NP → NPR
    DET → NP ’s

These rules generate English NPs with possessive modifiers such as:
    John’s sister
    John’s mother’s sister
    John’s mother’s uncle’s sister
    John’s mother’s uncle’s sister’s niece

Left Recursion (continued)

Tree structures for possessives, in bracketed form:

    John ’s sister
        [NP [DET [NP [NPR John]] ’s] [N sister]]
    John ’s mother ’s sister
        [NP [DET [NP [DET [NP [NPR John]] ’s] [N mother]] ’s] [N sister]]
    John ’s mother ’s uncle ’s sister
        [NP [DET [NP [DET [NP [DET [NP [NPR John]] ’s] [N mother]] ’s] [N uncle]] ’s] [N sister]]

When left-recursive rules are necessary, we can’t use recursive descent parsing.

Complexity

Recall other problems with recursive descent parsing:

    1 Structural ambiguity in the grammar and lexical ambiguity in the words can lead the parser down a path that will eventually fail (i.e., it cannot parse the whole input).
    2 The same sub-tree may be built several different times: when a path fails, the parser backtracks, undoes the structure, and starts again.

The complexity of this blind backtracking is exponential in the worst case, because of repeated re-analysis of the same sub-string.

We need a type of parser that solves these problems but does not require restructuring the grammar.
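The left-recursion problem can be reproduced with a toy recognizer. The following is a minimal sketch, not the lecture's own code: the function names, the dictionary encoding of the grammar, and the small noun lexicon are assumptions made for illustration. Expanding NP tries DET first, and DET immediately expands NP again at the same input position, so no input is ever consumed.

```python
# Toy recursive-descent recognizer; all names here are illustrative.
GRAMMAR = {
    "NP": [["DET", "N"], ["NPR"]],
    "DET": [["NP", "'s"]],          # left-recursive via NP -> DET N
    "NPR": [["John"]],
    "N": [["sister"], ["mother"], ["uncle"], ["niece"]],
}


def end_positions(symbol, tokens, pos):
    """Return every position where `symbol` can end if it starts at `pos`."""
    if symbol not in GRAMMAR:  # terminal: must match the next token
        if pos < len(tokens) and tokens[pos] == symbol:
            return [pos + 1]
        return []
    results = []
    for rhs in GRAMMAR[symbol]:
        positions = [pos]
        for child in rhs:  # expand the right-hand side left to right
            positions = [e for p in positions
                         for e in end_positions(child, tokens, p)]
        results.extend(positions)
    return results


def try_recognize(tokens):
    """Report whether recognition terminated or recursed without progress."""
    try:
        return len(tokens) in end_positions("NP", tokens, 0)
    except RecursionError:
        return "no progress: NP -> DET -> NP at the same position"


print(try_recognize(["John", "'s", "sister"]))
```

Reordering the NP alternatives does not help in this sketch, because the recognizer explores every alternative and so always reaches the left-recursive one.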


Dynamic Programming

With a CFG, a parser should be able to avoid re-analyzing a sub-string because such an analysis is independent of the rest of the parse.

[Figure: the sentence "The dog saw a man in the park" with its NP constituents marked.]

The search space explored by the parser can reflect this independence if we use a parser based on dynamic programming.

Dynamic programming is the basis for all chart parsing algorithms.

Parsing as Dynamic Programming

Dynamic Programming:
    Given a problem, systematically fill a table of solutions to sub-problems: this is called memoization.
    Once solutions to all sub-problems have been accumulated, solve the overall problem by composing them.

For parsing, the sub-problems are analyses of sub-strings, and they are memoized in a chart (aka well-formed substring table, WFST). This contains:
    constituents (sub-trees) that have been found, indexed by the start and end of the sub-strings they cover;
    hypotheses about what constituents could be found, indexed by the start and end of the sub-strings that suggest them.
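The saving from memoization can be made concrete with a toy count. This sketch is an illustration under stated assumptions, not a parser: it counts the binary bracketings of an n-word string, standing in for "analyses of a sub-string", and records how many times a span is analyzed. The function name `count_analyses` and the counting scheme are hypothetical.

```python
from functools import lru_cache


def count_analyses(n, memoize):
    """Count binary bracketings of an n-word string and the span visits needed."""
    visits = 0

    def analyses(i, j):
        nonlocal visits
        visits += 1                      # one (re-)analysis of span (i, j)
        if j - i == 1:
            return 1                     # a single word has one analysis
        return sum(analyses(i, k) * analyses(k, j) for k in range(i + 1, j))

    if memoize:
        # Rebinding the name makes the recursive calls go through the cache,
        # so each span is analyzed once and then looked up in the table.
        analyses = lru_cache(maxsize=None)(analyses)
    return analyses(0, n), visits


print(count_analyses(8, memoize=False))   # (429, 2187)
print(count_analyses(8, memoize=True))    # (429, 36)
```

Both runs find the same 429 bracketings, but the memoized run analyzes each of the 36 distinct spans of an 8-word string only once.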

Depicting a WFST/Chart

A well-formed substring table (aka chart) can be depicted as either a matrix or a graph. Both contain the same information.

When a WFST (aka chart) is depicted as a matrix:
    Rows and columns of the matrix correspond to the start and end positions of a span (i.e., starting right before the first word, ending right after the final one);
    A cell in the matrix corresponds to the sub-string that starts at the row index and ends at the column index. It can contain information about the type of constituent (or constituents) that span(s) the substring, pointers to its sub-constituents, and/or predictions about what constituents might follow the substring.

Depicting a WFST as a Matrix

         1       2       3       4          5       6
    0    V
    1            Prep            PP
    2                    Det     NP
    3                            N
    4
    5
         See     with    a       telescope  in      hand
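One possible in-memory encoding of the two depictions is sketched below. The dictionary-of-sets layout and the names `chart`, `edges` and `span_text` are assumptions for illustration; the cell contents are those of the "See with a telescope in hand" example.

```python
from collections import defaultdict

words = ["See", "with", "a", "telescope", "in", "hand"]

# Matrix view: cell (start, end) holds the labels spanning words[start:end].
chart = defaultdict(set)
chart[(0, 1)].add("V")
chart[(1, 2)].add("Prep")
chart[(2, 3)].add("Det")
chart[(3, 4)].add("N")
chart[(2, 4)].add("NP")
chart[(1, 4)].add("PP")

# Graph view: the same information as labelled edges (start, end, label).
edges = sorted((i, j, label)
               for (i, j), labels in chart.items()
               for label in labels)


def span_text(start, end):
    """Return the sub-string covered by the span from start to end."""
    return " ".join(words[start:end])


for start, end, label in edges:
    print(f"{label:5} ({start},{end})  {span_text(start, end)}")
```

Converting between the two views loses nothing, which is the sense in which the matrix and the graph contain the same information.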


Depicting a WFST as a Graph

When a WFST (aka chart) is depicted as a graph:
    nodes/vertices represent positions in the text string, starting before the first word, ending after the final word.
    arcs/edges connect vertices at the start and the end of a span to represent a particular substring. Edges can be labelled with the same information as in a cell in the matrix representation.

[Figure: vertices 1 2 3 4 for the substring "with a telescope", with edges Prep (1,2), Det (2,3), N (3,4), NP (2,4) and PP (1,4).]

Algorithms for Chart Parsing

Important members of the chart parsing family include:
    the CYK algorithm, which memoizes only constituents;
    three algorithms that memoize both constituents and predictions:
        a bottom-up chart parser
        a top-down chart parser
        the Earley algorithm

CYK Algorithm

CYK (Cocke, Younger, Kasami) is just a particular regime for recognizing and recording constituents in the chart (WFST).

The simplest version of CYK is for a CFG whose rules have at most two symbols on their RHS.

We can enter constituent A in cell (i, j) if there is a rule
    A → B
and B is found in cell (i, j), or if
    A → B C
and B is found in cell (i, k) and C is found in cell (k, j).

CYK is designed to guarantee that the parser only looks for rules that use a constituent from i to j after it has determined all the constituents that end at i. Otherwise something might be missed.

Chart Parsing with the CYK Algorithm

Let Close(X) = {B | B →* A, using unary productions, and A ∈ X}

Build CYK Chart (t, [w_1, ..., w_n]):
    for j ← 1 to n
        do t(j-1, j) ← Close({w_j})
    for k ← 1 to n
        for j ← k to n
            for m ← 1 to k-1
                do t(j-k, j) ← t(j-k, j) ∪ Close({A | A → B C
                       for some B ∈ t(j-k, j-m) and C ∈ t(j-m, j)})
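The Build CYK Chart pseudocode can be transcribed fairly directly. The following is a minimal sketch under stated assumptions: rules are indexed by their right-hand side in two hypothetical dictionaries (`unary` and `binary`), lexical rules are treated as unary productions over words, and the tiny grammar at the end is for illustration only.

```python
from collections import defaultdict


def close(symbols, unary):
    """Close(X): X plus every B that reaches some A in X via unary rules."""
    closed, agenda = set(symbols), list(symbols)
    while agenda:
        a = agenda.pop()
        for b in unary.get(a, ()):       # unary maps A to the set of B with B -> A
            if b not in closed:
                closed.add(b)
                agenda.append(b)
    return closed


def build_cyk_chart(words, unary, binary):
    """Fill t(i, j) following the Build CYK Chart pseudocode."""
    n = len(words)
    t = defaultdict(set)
    for j in range(1, n + 1):
        t[(j - 1, j)] = close({words[j - 1]}, unary)
    for k in range(2, n + 1):            # span length (k = 1 has no split point)
        for j in range(k, n + 1):        # end position of the span
            for m in range(1, k):        # length of the right-hand part
                found = {a
                         for b in t[(j - k, j - m)]
                         for c in t[(j - m, j)]
                         for a in binary.get((b, c), ())}
                t[(j - k, j)] |= close(found, unary)
    return t


# Illustrative grammar: Det -> the, N -> frogs, Nom -> N, NP -> Nom, NP -> Det Nom.
unary = {"the": {"Det"}, "frogs": {"N"}, "N": {"Nom"}, "Nom": {"NP"}}
binary = {("Det", "Nom"): {"NP"}}
chart = build_cyk_chart(["the", "frogs"], unary, binary)
print(sorted(chart[(0, 2)]))             # ['NP']
```

Because Close includes its argument, each lexical cell in this sketch also contains the word itself alongside its categories.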


CYK Schematic Diagram

CYK proceeds systematically left-to-right across the input string, looking back to see what constituents can now be built with what has been found.

[Diagram: string positions 0 1 2 3 4 5.]

This algorithm is complete and does recognition in time O(n^3).

Visualizing the Chart

    Grammatical rules        Lexical rules
    S → NP VP                Det → a | the               (determiner)
    NP → Det Nom             N → fish | frogs | soup     (noun)
    NP → Nom                 Prep → in | for             (preposition)
    Nom → N SRel             TV → saw | ate              (transitive verb)
    Nom → N                  IV → fish | swim            (intransitive verb)
    VP → TV NP               Relpro → that               (relative pronoun)
    VP → IV PP
    VP → IV
    PP → Prep NP
    SRel → Relpro VP

Nom: nominal (follows the determiner in an NP with determiner; occurs also in bare NP).
SRel: subject relative clause, as in the frogs that ate fish.
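The cubic bound can be checked by counting how often the innermost step of the three nested loops over k, j and m in Build CYK Chart is executed. This is a sketch of the count, assuming each such step costs at most a constant that depends on the grammar but not on n:

```latex
\[
\sum_{k=1}^{n} \sum_{j=k}^{n} \sum_{m=1}^{k-1} 1
  \;=\; \sum_{k=1}^{n} (n-k+1)(k-1)
  \;=\; \frac{(n-1)\,n\,(n+1)}{6}
  \;=\; O(n^3).
\]
```

For a four-word string such as the frogs ate soup, this gives 3 · 4 · 5 / 6 = 10 combination steps.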

Visualizing the Chart

The chart for the frogs ate soup starts out empty. The first word, the, is a Det, so det is entered in cell (0,1):

         1             2             3             4
    0    det
    1
    2
    3
         the           frogs         ate           soup


Visualizing the Chart (continued)

The word frogs is an N, so n is entered in cell (1,2), followed by nom and np.
    Unary branching rules: Nom → N, NP → Nom

Cells (0,1) and (1,2) then combine to give np in cell (0,2).
    Binary branching rule: NP → Det Nom

         1             2             3             4
    0    det           np
    1                  n, nom, np
    2
    3
         the           frogs         ate           soup

Visualizing the Chart (continued)

The word ate is a TV, so tv is entered in cell (2,3). The word soup is an N, so n, nom and np are entered in cell (3,4).

         1             2             3             4
    0    det           np
    1                  n, nom, np
    2                                tv
    3                                              n, nom, np
         the           frogs         ate           soup


Visualizing the Chart (continued)

Cells (2,3) and (3,4) combine to give vp in cell (2,4).
    Binary branching rule: VP → TV NP

Cells (1,2) and (2,4) combine to give S in cell (1,4).
    Binary branching rule: S → NP VP

         1             2             3             4
    0    det           np
    1                  n, nom, np                  S
    2                                tv            vp
    3                                              n, nom, np
         the           frogs         ate           soup

Visualizing the Chart (continued)

Cells (0,2) and (2,4) combine to give S in cell (0,4), which spans the whole string.
    Binary branching rule: S → NP VP

         1             2             3             4
    0    det           np                          S
    1                  n, nom, np                  S
    2                                tv            vp
    3                                              n, nom, np
         the           frogs         ate           soup

Directionality of CYK

Does CYK have to proceed left-to-right?

Could one get a similar guarantee that no constituent would be missed if one proceeded right-to-left? Yes.

[Diagram: string positions 0 1 2 3 4 5.]

Why might one want to proceed Right-to-Left?
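The directionality claim can be checked on the lecture grammar with a small experiment. This is a sketch under assumptions: the rule encoding (`UNARY`, `BINARY`), the helper names, and the treatment of lexical rules as unary productions over words are illustrative choices, and the two traversal orders are just one way to realize "left-to-right" and "right-to-left".

```python
from collections import defaultdict

# Lecture grammar, keyed by right-hand side (lexical rules treated as unary).
UNARY = {
    "a": {"Det"}, "the": {"Det"},
    "fish": {"N", "IV"}, "frogs": {"N"}, "soup": {"N"},
    "in": {"Prep"}, "for": {"Prep"},
    "saw": {"TV"}, "ate": {"TV"}, "swim": {"IV"},
    "that": {"Relpro"},
    "N": {"Nom"}, "Nom": {"NP"}, "IV": {"VP"},
}
BINARY = {
    ("NP", "VP"): {"S"}, ("Det", "Nom"): {"NP"}, ("N", "SRel"): {"Nom"},
    ("TV", "NP"): {"VP"}, ("IV", "PP"): {"VP"}, ("Prep", "NP"): {"PP"},
    ("Relpro", "VP"): {"SRel"},
}


def close(symbols):
    """Add everything that reaches a member of `symbols` by unary rules."""
    closed, agenda = set(symbols), list(symbols)
    while agenda:
        for parent in UNARY.get(agenda.pop(), ()):
            if parent not in closed:
                closed.add(parent)
                agenda.append(parent)
    return closed


def fill_cell(t, i, j):
    """Fill cell (i, j) from every split point; needs all shorter sub-spans."""
    for mid in range(i + 1, j):
        found = {a for b in t[(i, mid)] for c in t[(mid, j)]
                 for a in BINARY.get((b, c), ())}
        t[(i, j)] |= close(found)


def recognize(words, right_to_left=False):
    n = len(words)
    t = defaultdict(set)
    for j in range(1, n + 1):
        t[(j - 1, j)] = close({words[j - 1]})
    if right_to_left:   # start positions from the right, ends growing rightwards
        spans = [(i, j) for i in range(n - 2, -1, -1) for j in range(i + 2, n + 1)]
    else:               # end positions from the left, starts growing leftwards
        spans = [(i, j) for j in range(2, n + 1) for i in range(j - 2, -1, -1)]
    for i, j in spans:
        fill_cell(t, i, j)
    return {span: labels for span, labels in t.items() if labels}


sentence = "the frogs ate soup".split()
ltr = recognize(sentence)
rtl = recognize(sentence, right_to_left=True)
print(ltr == rtl, sorted(ltr[(0, 4)]))
```

In either order, a cell is filled only after every shorter span inside it, which is what prevents any constituent from being missed.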


From CYK Recognizer to CYK Parser

Can we tell from the CYK chart what the syntactic analysis (tree structure) is for the frogs ate soup? No.

We just have a chart recognizer, a way of determining whether a string belongs to the language generated by the grammar.

For a parser, we would have to record in another field which existing constituents were combined into the new constituent.

This record must be a list of constituent lists, as a sub-string can realize a constituent in more than one way. The following example has two different analyses as a VP:

    Put the block in the box on the table

Chart for the frogs ate soup, where (( )) marks the additional field holding the list of constituent lists:

    (0,1)  det
    (0,2)  np (( ))
    (0,4)  S (( ))
    (1,2)  n, nom (( )), np (( ))
    (1,4)  S (( ))
    (2,3)  tv
    (2,4)  vp (( ))
    (3,4)  n, nom (( )), np (( ))
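The extra field can be sketched as follows for the Put the block example. Everything about the grammar here is an assumption made for illustration: the lecture gives no grammar for this sentence, and `VNP` is an invented helper category that keeps all rules binary. Each chart entry stores a list of constituent lists, one per way of building that constituent.

```python
from collections import defaultdict

# Hypothetical binarized grammar for the example sentence (not from the lecture).
LEXICON = {"put": "V", "the": "Det", "block": "N", "box": "N", "table": "N",
           "in": "Prep", "on": "Prep"}
BINARY = {("VNP", "PP"): "VP", ("V", "NP"): "VNP", ("NP", "PP"): "NP",
          ("Det", "N"): "NP", ("Prep", "NP"): "PP"}


def parse_chart(words):
    """Build cells[(i, j)][label]: a list of constituent lists per label."""
    n = len(words)
    cells = defaultdict(lambda: defaultdict(list))
    for i, word in enumerate(words):
        cells[(i, i + 1)][LEXICON[word]].append([word])
    for j in range(2, n + 1):
        for i in range(j - 2, -1, -1):
            for mid in range(i + 1, j):
                for b in list(cells[(i, mid)]):
                    for c in list(cells[(mid, j)]):
                        parent = BINARY.get((b, c))
                        if parent:
                            # Record which constituents were combined.
                            cells[(i, j)][parent].append([(b, i, mid), (c, mid, j)])
    return cells


def trees(cells, label, i, j):
    """Enumerate bracketed trees for `label` over span (i, j) from the records."""
    for record in cells[(i, j)][label]:
        if len(record) == 1:             # lexical entry: [word]
            yield f"[{label} {record[0]}]"
        else:
            (b, bi, bj), (c, ci, cj) = record
            for left in trees(cells, b, bi, bj):
                for right in trees(cells, c, ci, cj):
                    yield f"[{label} {left} {right}]"


words = "put the block in the box on the table".split()
cells = parse_chart(words)
vp_trees = list(trees(cells, "VP", 0, len(words)))
for tree in vp_trees:
    print(tree)
```

Under this grammar the VP entry for the whole string holds two records, one attaching on the table to the verb phrase and one attaching it inside the box NP, so two trees are printed.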

Summary

Recursive descent parsing cannot handle left-recursive rules and is inefficient (exponential in the worst case).

Alternative: use dynamic programming and memoize sub-analyses in a chart to avoid duplicate work.

The chart can be visualized as a graph or as a matrix.

The CYK algorithm builds a chart in O(n^3) time. It is specified as a recognizer, but can be used as a parser if more information is recorded in the chart.
