The CYK Algorithm

Natural Language Processing

The membership problem
  • Problem: given a context-free grammar G and a string w
    – G = (V, Σ, P, S), where
      » V is a finite set of variables
      » Σ (the alphabet) is a finite set of terminal symbols
      » P is a finite set of rules
      » S is the start symbol (a distinguished element of V)
      » V and Σ are assumed to be disjoint
    – G is used to generate the strings of a language
  • Question: is w in L(G)?
  • J. Cocke, D. Younger, and T. Kasami independently developed an algorithm to answer this question.

Basics
  • Relies on the structure of the rules in a Chomsky Normal Form (CNF) grammar
  • Uses a "dynamic programming" or "table-filling" algorithm

Chomsky Normal Form
  • A normal form is described by a set of conditions that each rule in the grammar must satisfy
  • A context-free grammar is in CNF if each rule has one of the following forms:
    – A → BC   (two variables on the right side)
    – A → a    (a single terminal symbol)
    – S → λ    (the null string)
    where B, C ∈ V – {S}

Construct a Triangular Table
  • Each row corresponds to one length of substrings
    – Bottom row: strings of length 1
    – Second row from the bottom: strings of length 2
    – ...
    – Top row: the whole string w
  • Xi,i is the set of variables A such that A → wi is a production of G
  • To compute Xi,j for j > i, compare at most n pairs of previously computed sets:
      (Xi,i , Xi+1,j), (Xi,i+1 , Xi+2,j), ... , (Xi,j-1 , Xj,j)
    A variable A belongs to Xi,j if some rule A → BC has B in the first set of a pair and C in the second.
    For example, with i=1, j=5:
      (X1,1 , X2,5), (X1,2 , X3,5), ... , (X1,4 , X5,5)

  Table for a string w of length 5:

    X1,5
    X1,4  X2,5
    X1,3  X2,4  X3,5
    X1,2  X2,3  X3,4  X4,5
    X1,1  X2,2  X3,3  X4,4  X5,5
    w1    w2    w3    w4    w5

  Looking for pairs to compare: for X1,5, the first members X1,1 ... X1,4 run up the first column, and the second members X2,5 ... X5,5 run down the diagonal to the right.

Example CYK Algorithm
  • Show the CYK algorithm on the following example:
    – CNF grammar G
        S   → NP VP
        NP  → DET NP | NP NP | time | flies | arrow
        VP  → VP NP | VP PP | flies | like
        PP  → P NP
        DET → an
        P   → like
    – w is "time flies like an arrow"
    – Question: is "time flies like an arrow" in L(G)?

Constructing the Triangular Table
  • Calculating the bottom row, Xi,i:

    {NP}       {NP,VP}    {VP,P}     {DET}      {NP}
    (X1,1)     (X2,2)     (X3,3)     (X4,4)     (X5,5)
    time (1)   flies (2)  like (3)   an (4)     arrow (5)

  • Strings of length 2:
    – X1,2: (X1,1 , X2,2) gives {NP,S}
    – X2,3: (X2,2 , X3,3) gives {S}
    – X3,4: (X3,3 , X4,4) gives {}
    – X4,5: (X4,4 , X5,5) gives {NP}
  • Strings of length 3:
    – X1,3: (X1,1 , X2,3), (X1,2 , X3,3) gives {S}
    – X2,4: (X2,2 , X3,4), (X2,3 , X4,4) gives {}
    – X3,5: (X3,3 , X4,5), (X3,4 , X5,5) gives {VP,PP}
  • Strings of length 4:
    – X1,4: (X1,1 , X2,4), (X1,2 , X3,4), (X1,3 , X4,4) gives {}
    – X2,5: (X2,2 , X3,5), (X2,3 , X4,5), (X2,4 , X5,5) gives {S,VP}
  • The whole string:
    – X1,5: (X1,1 , X2,5), (X1,2 , X3,5), (X1,3 , X4,5), (X1,4 , X5,5) gives {S}

  Completed table:

    {S}
    {}         {S,VP}
    {S}        {}         {VP,PP}
    {NP,S}     {S}        {}         {NP}
    {NP}       {NP,VP}    {VP,P}     {DET}      {NP}
    time (1)   flies (2)  like (3)   an (4)     arrow (5)

  The start symbol S is in X1,5, so "time flies like an arrow" is in L(G).

CYK algorithm: Pseudocode

    let the input be a string S consisting of n characters: a1 ... an.
    let the grammar contain r nonterminal symbols R1 ... Rr.
    This grammar contains the subset Rs, which is the set of start symbols.
    let P[n,n,r] be an array of booleans, indexed as P[start, length, symbol].
    Initialize all elements of P to false.
    for each i = 1 to n
      for each unit production Rj → ai
        set P[i,1,j] = true
    for each i = 2 to n                  -- length of span
      for each j = 1 to n-i+1            -- start of span
        for each k = 1 to i-1            -- partition of span
          for each production RA → RB RC
            if P[j,k,B] and P[j+k,i-k,C] then set P[j,i,A] = true
    if any of P[1,n,x] is true (x is iterated over the set s, where s are all the indices for Rs) then
      S is a member of the language
    else
      S is not a member of the language
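
The table-filling procedure above can be sketched in Python on the example grammar. This is a minimal illustration rather than a reference implementation: the names BINARY_RULES, TERMINAL_RULES, cyk_table, and is_member are hypothetical, and the table is indexed by (start, end) as in the Xi,j notation, not by (start, length) as in the pseudocode.

```python
from itertools import product

# Example grammar from the text, split by rule shape (names are illustrative).
# Binary rules A -> B C, keyed by the right-hand side (B, C).
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("DET", "NP"): {"NP"},
    ("NP", "NP"): {"NP"},
    ("VP", "NP"): {"VP"},
    ("VP", "PP"): {"VP"},
    ("P", "NP"): {"PP"},
}
# Terminal rules A -> a, keyed by the terminal a.
TERMINAL_RULES = {
    "time": {"NP"},
    "flies": {"NP", "VP"},
    "arrow": {"NP"},
    "like": {"VP", "P"},
    "an": {"DET"},
}


def cyk_table(words):
    """Return X, where X[(i, j)] is the set of variables deriving words i..j (1-based)."""
    n = len(words)
    X = {}
    # Bottom row: X[i, i] holds every A with a rule A -> w_i.
    for i in range(1, n + 1):
        X[(i, i)] = set(TERMINAL_RULES.get(words[i - 1], set()))
    # Remaining rows, from substrings of length 2 up to the whole string.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            # Compare the pairs (X[i, k], X[k+1, j]) for every split point k.
            for k in range(i, j):
                for B, C in product(X[(i, k)], X[(k + 1, j)]):
                    cell |= BINARY_RULES.get((B, C), set())
            X[(i, j)] = cell
    return X


def is_member(words, start="S"):
    """Return True if the start symbol derives the whole word sequence."""
    if not words:
        return False  # the example grammar has no rule S -> λ
    return start in cyk_table(words)[(1, len(words))]
```

Calling is_member("time flies like an arrow".split()) returns True, matching the {S} entry at X1,5 in the worked table, while is_member("like an arrow".split()) returns False because that span only yields {VP,PP}.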