Backtrack Parsing Context-Free Grammar

Martin Kay
Stanford University and University of the Saarland

Problems with Regular Language

Is English a regular language?
Bad question! We do not even know what English is!
  Two eggs and bacon make(s) a big breakfast
  Can you slide me the salt?
  He didn't ought to do that
But no!
  I put the wine you brought in the fridge
  I put the wine you brought for Sandy in the fridge
  Should we bring the wine you put in the fridge out now?
  You said you thought nobody had the right to claim that they were above the law

Problems with Regular Language

You said you thought nobody had the right to claim that they were above the law

[You said you thought [nobody had the right [to claim that [they were above the law]]]]

Problems with Regular Language

Is English morphology a regular language?
Bad question! We do not even know what English morphology is!
  They sell collectables of all sorts
  This concerns unredecontaminatability
  This really is an untiable knot.
But probably!
(Not sure about Swahili, though)

Context-free Grammar

Nonterminal symbols ~ grammatical categories
Terminal symbols ~ words
Productions ~ (unordered) (rewriting) rules
Distinguished symbol

Not all that important
• Terminals and nonterminals are disjoint
• Distinguished symbol

Context-free Grammar

Nonterminal symbols ~ grammatical categories
  S, NP, VP, Det, N, V
Terminal symbols ~ words
  the, dog, cat, chased
Productions ~ (unordered) (rewriting) rules
  S → NP VP      Det → the
  NP → Det N     N → dog
  VP → V NP      N → cat
                 V → chased
Distinguished symbol
  S

[Figure: the tree is built one step at a time. S; then NP VP under S; then Det N under the first NP; then V NP under VP; then Det N under the second NP.]
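Since the deck is titled "Backtrack Parsing", the toy grammar above can be run through a small backtracking parser. The following is only a sketch under stated assumptions: the names GRAMMAR, parse, parse_seq and parses are illustrative and do not come from the slides, and the top-down strategy shown is one simple way to backtrack, not necessarily the one the lecture goes on to present. It would loop forever on a left-recursive grammar, which this grammar is not.

```python
# Toy grammar from the slides. Keys are nonterminals; anything that is not
# a key is treated as a terminal (a word). All names here are illustrative.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"]],
    "Det": [["the"]],
    "N": [["dog"], ["cat"]],
    "V": [["chased"]],
}


def parse(symbol, words, pos):
    """Yield (tree, next_pos) for every way `symbol` can start at words[pos]."""
    if symbol not in GRAMMAR:
        # Terminal: succeed only if it matches the next word.
        if pos < len(words) and words[pos] == symbol:
            yield symbol, pos + 1
        return
    # Nonterminal: try each production in turn. When a generator is
    # exhausted, control falls back to the next alternative (backtracking).
    for rhs in GRAMMAR[symbol]:
        for children, end in parse_seq(rhs, words, pos):
            yield (symbol, children), end


def parse_seq(symbols, words, pos):
    """Yield (list_of_trees, next_pos) for a sequence of symbols."""
    if not symbols:
        yield [], pos
        return
    for tree, mid in parse(symbols[0], words, pos):
        for rest, end in parse_seq(symbols[1:], words, mid):
            yield [tree] + rest, end


def parses(sentence):
    """Return all trees rooted in S that cover the whole sentence."""
    words = sentence.split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]


print(len(parses("the dog chased the cat")))  # 1
print(parses("the dog chased"))               # []
```

Generators keep the backtracking implicit: each alternative is explored only when the caller asks for another result, so no explicit stack of choice points is needed.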
[Figure: the completed tree, with words at the leaves.]

[S [NP [Det the] [N dog]] [VP [V chased] [NP [Det the] [N cat]]]]

Context-free Grammar

• Defines a language (set of strings) and
• Defines a corresponding set of structures

Structure and Semantics

2 + 3 * 4

[Figure: a tree over "num plus num times num" in which 3 times 4 forms a constituent with value 12, and adding 2 gives the value 14 at the root.]

Arithmetic Expressions

  sum → prod add_op sum
  sum → prod
  prod → base mul_op prod
  prod → base
  base → a | b | c ...
  base → ( sum )
  add_op → +
  add_op → -
  mul_op → *
  mul_op → /

[Figure: parse trees for sample expressions built from these rules.]

Embedding

Arbitrarily embedded parentheses are not possible in a regular language because the number of open parentheses that still require matching close parentheses can be greater than the number of states in the automaton, whatever that number is.

Hence, by the pumping lemma, an automaton with only a finite number of states cannot characterize the language.

Ambiguity

More than one structure can correspond to a single string.

  That can burst
  [S [NP [Det That] [N can]] [VP [Vi burst]]]
  [S [NP [Pro That]] [VP [aux can] [Vi burst]]]

Lots of Problems!

  That it rained slept soundly.
  That it rained were certainly true.
  Mary told the boy Mary
  His arguments convinces me.
  John told that it rained me
  Me told Mary it.

For the moment, just forget the problems.

Why "Context-free"?

[Figure: the tree for "the dog chased the cat" with one subtree marked.]

What can be in this subtree is determined only by this symbol and not by any of this "context".

Convenience

You can define something once, and then use it in many places in the grammar.

[Figure: parse trees for the two sentences below, each containing an embedded S.]

  John told me that it rained
  That it rained surprised John

How many (binary) trees with n terminals?

Any binary number represents a set of binary branching trees.

$2^{n-2}$

Exponential

[Figure: a row of up-right and up-left arrows labelled with the bits 1 1 1 0 0 1 1 0 1 0 0 1 1 0.]

Catalan Numbers

Terminals   Trees
1           1
2           1

Why?

Catalan Numbers: 3 terminals

The root splits the terminals between a left and a right subtree:
  left 1 terminal (1 tree), right 2 terminals (1 tree)
  left 2 terminals (1 tree), right 1 terminal (1 tree)
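The left/right split behind these counts can be made concrete by listing the trees themselves. This is a minimal sketch, assuming a tree is written as nested Python tuples; the function name binary_trees is hypothetical and not taken from the slides.

```python
def binary_trees(leaves):
    """Return every binary tree over the tuple `leaves`, keeping leaf order.

    A single leaf is its own tree; otherwise the root splits the leaves
    into a non-empty left part and a non-empty right part, and every left
    tree is paired with every right tree.
    """
    if len(leaves) == 1:
        return [leaves[0]]
    trees = []
    for split in range(1, len(leaves)):
        for left in binary_trees(leaves[:split]):
            for right in binary_trees(leaves[split:]):
                trees.append((left, right))
    return trees


for n in (1, 2, 3):
    words = tuple("abc"[:n])
    print(n, binary_trees(words))
```

For three leaves the two results are ('a', ('b', 'c')) and (('a', 'b'), 'c'), one for each way of placing the split.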
Catalan Numbers: 4 terminals

Terminals   Trees
1           1
2           1
3           2
4           5

  left 1, right 3:   1 × 2 = 2
  left 2, right 2:   1 × 1 = 1
  left 3, right 1:   2 × 1 = 2
                     total   5

Catalan Numbers: 5 terminals

Terminals   Trees
1           1
2           1
3           2
4           5

  left 1, right 4:   1 × 5 = 5
  left 2, right 3:   1 × 2 = 2
  left 3, right 2:   2 × 1 = 2
  left 4, right 1:   5 × 1 = 5
                     total  14

Catalan Numbers

The number of binary trees with n terminals
The number of arbitrarily branching trees with n nodes.

Terminals   Trees
1           1
2           1
3           2    (= 1*1 + 1*1)
4           5    (= 1*2 + 1*1 + 2*1)
5           14   (= 1*5 + 1*2 + 2*1 + 5*1)
6           42   (= 1*14 + 1*5 + 2*2 + 5*1 + 14*1)
...

$\mathrm{Cat}(n) = \frac{1}{2n-1}\binom{2n-1}{n}$

Any tree can be encoded in a binary tree with the same number of terminals as the original tree has nodes.

Catalan Numbers

There are n nonterminals in the binary tree for the n nodes in the original tree, except for the root.

Analyzing Sentences
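The table can be checked mechanically. The sketch below computes the counts with the sum-over-splits recurrence and compares them with the closed form, read here as Cat(n) = C(2n-1, n) / (2n-1); that reading is an assumption recovered from the slide layout, although it does agree with every row of the table. The function names are illustrative.

```python
from math import comb  # available from Python 3.8


def cat_recurrence(n):
    """Number of binary trees with n terminals, summing over root splits."""
    trees = [0, 1]  # trees[k] = count for k terminals; index 0 is unused
    for k in range(2, n + 1):
        # A root puts i terminals on the left and k - i on the right.
        trees.append(sum(trees[i] * trees[k - i] for i in range(1, k)))
    return trees[n]


def cat_closed(n):
    """Closed form, assuming Cat(n) = C(2n-1, n) / (2n-1)."""
    return comb(2 * n - 1, n) // (2 * n - 1)


print([cat_recurrence(n) for n in range(1, 7)])  # [1, 1, 2, 5, 14, 42]
print(all(cat_recurrence(n) == cat_closed(n) for n in range(1, 20)))  # True
```

Storing the earlier counts in a list keeps the computation polynomial, whereas enumerating the trees themselves grows exponentially with n.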