<<

Languages and Computation (COMP2012/G52LAC)
Lecture notes
Spring 2019

Thorsten Altenkirch, Venanzio Capretta, and Henrik Nilsson

March 1, 2019

Contents

1 Introduction
  1.1 Example: Valid Java programs
  1.2 Example: The halting problem
  1.3 Example: The λ-calculus
  1.4 P versus NP

2 Formal Languages
  2.1 Exercises

3 Finite Automata
  3.1 Deterministic finite automata
    3.1.1 What is a DFA?
    3.1.2 The language of a DFA
  3.2 Nondeterministic finite automata
    3.2.1 What is an NFA?
    3.2.2 The language accepted by an NFA
    3.2.3 The subset construction
    3.2.4 Correctness of the subset construction
  3.3 Exercises

4 Regular Expressions
  4.1 What are regular expressions?
  4.2 The meaning of regular expressions
  4.3 Algebraic laws
  4.4 Translating regular expressions into NFAs
  4.5 Summing up
  4.6 Exercises

5 Minimization of Finite Automata
  5.1 The table-filling algorithm
  5.2 Example of DFA minimization using the table-filling algorithm

6 Disproving Regularity
  6.1 The pumping lemma
  6.2 Applying the pumping lemma
  6.3 Exercises

7 Context-Free Grammars
  7.1 What are context-free grammars?
  7.2 The meaning of context-free grammars
  7.3 The relation between regular and context-free languages
  7.4 Derivation trees
  7.5 Ambiguity
  7.6 Applications of context-free grammars
  7.7 Exercises

8 Transformations of context-free grammars
  8.1 Equivalence of context-free grammars
  8.2 Elimination of useless productions
  8.3 Substitution
  8.4 Left factoring
  8.5 Disambiguating context-free grammars
  8.6 Elimination of left recursion
  8.7 Exercises

9 Pushdown Automata
  9.1 What is a PDA?
  9.2 How does a PDA work?
  9.3 The language of a PDA
  9.4 Deterministic PDAs
  9.5 Context-free grammars and push-down automata

10 Recursive-Descent Parsing
  10.1 What is parsing?
  10.2 Parsing strategies
  10.3 Basics of recursive-descent parsing
  10.4 Handling choice
  10.5 Recursive-descent parsing and left-recursion
  10.6 Predictive parsing
    10.6.1 First and follow sets
    10.6.2 LL(1) grammars
    10.6.3 Nullable nonterminals
    10.6.4 Computing first sets
    10.6.5 Computing follow sets
    10.6.6 Implementing a predictive parser
    10.6.7 LL(1), left-recursion, and ambiguity
    10.6.8 Satisfying the LL(1) conditions
  10.7 Beyond hand-written parsers: use parser generators
  10.8 Exercises

11 Turing Machines
  11.1 What is a Turing machine?
  11.2 Grammars and context-sensitivity
  11.3 The halting problem
  11.4 Recursive and recursively enumerable sets
  11.5 Back to Chomsky
  11.6 Exercises

12 λ-Calculus
  12.1 Syntax of λ-calculus
  12.2 Church numerals
  12.3 Other data structures
  12.4 Confluence
  12.5 Recursion
  12.6 The universality of λ-calculus
  12.7 Exercises

13 Algorithmic Complexity
  13.1 The Satisfiability Problem
  13.2 Time Complexity
  13.3 NP-completeness
  13.4 Exercises

A Model Answers to Exercises

1 Introduction

This module is about two fundamental notions in computer science, languages and computation, and how they are related. Specific topics include:

• Formal Languages

• Models of Computation

• Complexity Theory

The module starts with an investigation of classes of formal languages and related abstract machines, considers practical uses of this theory such as parsing, and finishes with a discussion on what computation is, what can and cannot be computed at all, and what can be computed efficiently, including famous results such as the Halting Problem and open problems such as P versus NP.

COMP2012/G52LAC builds on COMP1001/G51MSC Mathematics for Computer Scientists. It feeds into modules such as COMP3012/G53CMP Compilers, COMP3001/G53COM Computability, and COMP4001/G54FOP Mathematical Foundations of Programming. To give you a more concrete idea about what you will encounter in this module, as well as the broader significance of the module and a bit of historical context, we will illustrate with some examples.

1.1 Example: Valid Java programs

Consider the following Java fragment:

class Foo {
    int n;
    void printNSqrd() {
        System.out.println(n * n);
    }
}

As written using a text editor or as stored in a file, it is just a string of characters. But not every string is a valid Java program. For example, Java uses specific keywords, has rules for what identifiers must look like, and requires proper nesting, such as a definition of a method inside a definition of a class. This raises a number of questions:

• How to describe the set of strings that are valid Java programs?

• Given a string, how to determine if it is a valid Java program or not?

• How to recover the structure of a Java program from a “flat” string?

To answer such questions, we will study regular expressions and grammars to give precise descriptions of languages, and various kinds of automata to decide if a string belongs to a language or not. We will also consider how to systematically derive programs that efficiently answer this type of question, drawing directly from the theory. Such programs are key parts of compilers, web browsers and web servers, and in fact of any program that uses structured text in one way or another.

A little bit of history. Context-free grammars were invented by American linguist, philosopher, and cognitive scientist Noam Chomsky (1928–) in an attempt to describe natural languages formally. He also introduced the Chomsky Hierarchy which classifies grammars and languages and their descriptive power:

[Diagram: the Chomsky hierarchy as nested classes of languages, each contained in the one above, together with the corresponding machine models:]

    All languages
    Type 0 or recursively enumerable languages (Turing machines)
    Decidable languages
    Type 1 or context sensitive languages
    Type 2 or context free languages (pushdown automata)
    Type 3 or regular languages (finite automata)

1.2 Example: The halting problem

Consider the following program. Does it terminate for all values of n ≥ 1?

while (n > 1) {
    if even(n) {
        n = n / 2;
    } else {
        n = n * 3 + 1;
    }
}

This is not as easy to answer as it might first seem. Say we start with n = 7, for example:

7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1

Note how the numbers both increase and decrease in a way that is very hard to describe, which is exactly why it is so hard to analyse this program. The sequence involved is known as the hailstone sequence, and the Collatz conjecture says that the number 1 will always be reached. And, in fact, for all numbers that have been tried (all numbers up to 2^60), the sequence does indeed terminate. But so far, no one has been able to prove that it always will! The famous mathematician Paul Erdős even said: “Mathematics may not be ready for such problems.” (See Collatz conjecture, Wikipedia.)

The following important decidability result should then perhaps not come as a total surprise:

    It is impossible to write a program that decides if another, arbitrary, program terminates (halts) or not.

This is known as the Halting Problem and it is thus one example of an undecidable problem: the answer cannot be determined mechanically in general (1).

The undecidability of the Halting Problem was first proved by British mathematician, logician, and computer scientist Alan Turing (1912–1954). Turing proved this result using Turing Machines, an abstract model of computation that he introduced in 1936 to give a precise definition of what problems are “effectively calculable” (can be solved mechanically). Turing was further instrumental in the success of British code breaking efforts during World War II and is also famous for the Turing test to decide if a machine exhibits intelligent behaviour equivalent to, or indistinguishable from, that of a human. Andrew Hodges has written a very good biography of Turing: Alan Turing: the Enigma (http://www.turing.org.uk/turing/).

(1) This does not mean that it is impossible to compute the answer for every instance of such a problem. On the contrary, in specific cases, the answer can often be computed very easily, and programs that attempt to solve undecidable problems can be very useful. But if we do write such a program, we must necessarily be prepared to give up one way or another with a “don’t know” answer.
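As an aside, the program above is easy to transcribe into Haskell and experiment with. The sketch below is our own illustration (the names hailstone and steps are invented here, not part of the notes). Running it tests individual starting values only; it cannot, of course, settle the Collatz conjecture.

-- The list of values that n passes through, ending when (and if) it reaches 1.
hailstone :: Integer -> [Integer]
hailstone 1 = [1]
hailstone n
  | even n    = n : hailstone (n `div` 2)
  | otherwise = n : hailstone (n * 3 + 1)

-- Number of steps needed to reach 1; this does not terminate if the
-- conjecture fails for n, which is exactly what nobody can rule out.
steps :: Integer -> Int
steps = subtract 1 . length . hailstone

main :: IO ()
main = do
  print (hailstone 7)   -- [7,22,11,34,17,52,26,13,40,20,10,5,16,8,4,2,1]
  print (steps 27)      -- 111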

1.3 Example: The λ-calculus

The λ-calculus is a theory of pure functions. It is very simple, having only two constructs: definition and application of functions. The following is an example of a λ-calculus term:

(λx.x)(λy.y)

Like Turing machines, the λ-calculus is a universal model of computation. It was introduced by American mathematician and logician Alonzo Church (1903–1995), also in 1936, a little earlier than Turing’s machine.

Alan Turing subsequently became a PhD student of Alonzo Church. They proved that Turing machines and the λ-calculus are equivalent in terms of computational expressiveness. In fact, all proposed universal models of computation to date have proved to be equivalent in that sense. This is captured by the Church-Turing thesis:

    What is effectively calculable is exactly what can be computed by a Turing machine.

Functional programming languages, like Haskell, and many proof assistants implement (variations of) the λ-calculus. This is thus an example of a theory with very direct and practical applications.
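Since Haskell implements a variation of the λ-calculus, the example term above can be written down and run directly. The short sketch below is our own illustration (the names example and two are invented here). Haskell writes λx.e as \x -> e; the second definition gives a first taste of how data such as numbers can be encoded purely as functions, a topic treated properly in chapter 12 (Church numerals).

-- The term (λx.x)(λy.y): the identity function applied to the identity function.
example :: a -> a
example = (\x -> x) (\y -> y)

-- The Church numeral for 2: a function that applies its argument twice.
two :: (a -> a) -> a -> a
two = \f -> \x -> f (f x)

main :: IO ()
main = do
  print (example 42)    -- 42: the term reduces to the identity function
  print (two (+ 3) 10)  -- 16: applying (+3) twice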

1.4 P versus NP

Here is a seemingly innocuous question:

    Can every problem whose solution can be checked quickly by a computer also be solved quickly by a computer?

“Quickly” here means in time proportional to a polynomial in the size of the problem. Whether or not this is the case is known as the P versus NP problem, and it is likely the most famous open problem in computer science, dating back to the 1950s. Here, “P” refers to the class of problems that can be solved in polynomial time, while “NP” refers to problems that can be solved in nondeterministic polynomial time, and the question is thus whether these two classes of problems actually are the same, or P = NP.

There is an abundance of important problems where solutions can be checked quickly, but where the best known algorithm for finding a solution is exponential in the size of the problem. As an example, here is one, the Subset Sum Problem: Does some nonempty subset of a given set of integers sum to zero? For example, given {3, −2, 8, −5, 4, 9}, the nonempty subset {−5, −2, 3, 4} sums to 0. It is easy to check a proposed solution: just add all the numbers. If the initial set contains n integers, any proposed solution (being a subset) contains at most n integers, so we can sum all the elements with at most n additions, meaning the total time taken is proportional to n (assuming addition is a constant time operation). However, for finding a solution, no better way is known than essentially checking each possible subset one after another. As there are 2^n subsets of a set with n elements, this means finding a solution takes exponential time.

Whether or not there is a better way to solve the Subset Sum Problem might not seem particularly important, but if it were the case that P = NP, then this would have monumental practical implications. For example, public key cryptography, on which pretty much all secure Internet communication, such as HTTPS, hinges, would no longer provide adequate security, and the entire Internet security infrastructure would have to be redesigned and reimplemented. The question here is if it is possible to quickly find the prime factors of (very) large numbers. As long as that is not the case, public key cryptography is considered secure.
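To make the asymmetry between checking and finding concrete, here is a small Haskell sketch (our own illustration; the names checkSolution and solveSubsetSum are invented). Checking a proposed subset is a single pass over at most n numbers, whereas the naive search below literally enumerates all 2^n subsets; for the Subset Sum Problem no essentially better general algorithm is known.

import Data.List (subsequences)

-- Checking a proposed solution is cheap: one pass over at most n numbers.
checkSolution :: [Integer] -> Bool
checkSolution xs = not (null xs) && sum xs == 0

-- Finding a solution by brute force: try every one of the 2^n subsets.
solveSubsetSum :: [Integer] -> Maybe [Integer]
solveSubsetSum xs =
  case filter checkSolution (subsequences xs) of
    []      -> Nothing
    (s : _) -> Just s

main :: IO ()
main = do
  print (solveSubsetSum [3, -2, 8, -5, 4, 9])  -- Just [3,-2,-5,4]
  print (checkSolution [-5, -2, 3, 4])         -- True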

7 8 2 Formal Languages • The set of words that contain the same number of 0s and 1s modulo 2 (i.e., both are even or odd) is a language over Σ = {0, 1}. In this course we will use the terms language and word in a different way than in everyday language: • The set of palindromes using the English alphabet, e.g. words that read the same forwards and backwards like abba. This is a language over • A language is a set of words. {a, b,..., z}. • A word is a sequence, or string, of symbols. • The set of correct Java programs. This is a language over the set of UNICODE “characters” (which correspond to numbers between 0 and We will write ǫ for the empty word; i.e., a sequence of length 0. 17 · 216 − 1, less some invalid subranges, 1112062 valid encodings in all). This leaves us with the question: what is a symbol? The answer is: anything, but it has to come from an alphabet Σ that is a finite set. A common (and • The set of programs that, if executed on a Windows machine, prints the important) instance is Σ = {0, 1}. Note that ǫ will never be a symbol to avoid text “Hello World!” in a window. This is a language over Σ = {0, 1}. confusion. Mathematically we say: Given an alphabet Σ we define the set Σ∗ as set of Note the distinction between ǫ, ∅, and {ǫ}! words (or sequences) over Σ: the empty word ǫ ∈ Σ∗ and given a symbol x ∈ Σ • ǫ denotes the empty word, a sequence of symbols of length 0. and a word w ∈ Σ∗ we can form a new word xw ∈ Σ∗. These are all the ways elements on Σ∗ can be constructed (this is called an inductive definition). This • ∅ denotes the empty set, a set with no elements. unary ∗-operator is known as the Kleene star (or Kleene operator or Kleene • {ǫ} is a set with exactly one element: the empty word. closure). With Σ = {0, 1}, typical elements of Σ∗ are 0010, 00000000,ǫ. Note, that we In particular, note that ǫ is a different type (a sequence) from ∅ and {ǫ} (that only write ǫ if it appears on its own, instead of 00ǫ we just write 00. are both sets). Note further that Σ∗ by definition is always nonempty as the empty word ǫ An important operation on Σ∗ is concatenation. This is denoted by juxtapo- belongs to Σ∗ for any alphabet Σ, including Σ = ∅. Moreover, for any nonempty sitioning (or, if you prefer, by an “invisible operator”): given u, v ∈ Σ∗ we can alphabet Σ, Σ∗ is an infinite set. construct a new word uv ∈ Σ∗ simply by concatenating the two words. We can It is important to realise that although there are infinitely many words over define this operation by primitive recursion: a nonempty alphabet Σ, each word has a finite length. At first this may seem strange: how can it be that all elements of a set with infinitely many elements ǫv = v can be finite? A good way to think of an infinite set is as a process that can (xu)v = x(uv) generate a new element whenever we need one, as many times as we like2. But each such element can obviously be of finite size as we at any point in time Concatenation is associative and has unit ǫ: will only have asked for finitely many elements. Conversely, if we make a set u(vw) = (uv)w containing a single (notionally) “infinite” element, such as a number ∞ larger than any number except itself, or an infinitely long string, that does not make ǫu = u = uǫ the set itself infinite: it would still contain exactly one element. where u, v, w are words. We use exponent notation to denote concatenation of We can now define the notion of a language L over an alphabet Σ precisely: a word with itself. For example, u2 = uu, u3 = uuu, and so on. 
By definition, ∗ ∗ 3 L ⊆ Σ or equivalently L ∈P(Σ ) . u1 = u and u0 = ǫ, the unit of concatenation. Thus we can simplify repeated Here are some informal examples of languages: concatenation using familiar-looking laws. For example: u1u0u2 = u3. • The set {0010, 00000000,ǫ} is a language over Σ = {0, 1}. This is an Concatenation of words is extended to concatenation of languages by: example of a finite language. MN = {uv | u ∈ M ∧ v ∈ N} • The set of words with odd length over Σ = {1}. For example: • The set of words that contain the same number of 0s and 1s is a language M = {ǫ,a,aa} over Σ = {0, 1}. N = {b,c} 2 Indeed, this is exactly how infinite data structures, such as infinite lists, are realised in lazy languages like Haskell. MN = {uv | u ∈{ǫ,a,aa} ∧ v ∈{b,c}} 3 Given a set A, P(A) is the powerset of A; that is, the set of all possible subsets of A. For = {ǫb,ǫc,ab,ac,aab,aac} example, if A = {a, b}, then P(A) = { ∅, {a}, {b}, {a, b} }. The number of elements |A| of a set A, its cardinality, and the number of elements in its power set are related by |P(A)| = 2|A|. = {b,c,ab,ac,aab,aac} Hence powerset. Some important properties of language concatenation are:

9 10 • Concatenation of languages is associative: 1. Describe L in plain English. L(MN) = (LM)N 2. Enumerate all the words in L. 3. In general, for an arbitrary alphabet Σ1 and 0 ≤ m ≤ n, how many words ∗ • Concatenation of languages has zero ∅: are there in the language L1 = {w | w ∈ Σ1,m ≤ |w|≤ n}? That is, write down an expression for |L1|. L∅ = ∅ = ∅L 4. How many words would there be in L1 if Σ1 = Σ, m = 3, and n = 7? • Concatenation of languages has unit {ǫ}: Exercise 2.2 L{ǫ} = L = {ǫ}L Let the alphabet Σ = {a,b,c} and let L1 = {ǫ,b,ac} and L2 = {a,b,ca} be • Concatenation distributes through set union: two languages over Σ. Enumerate the words in the following languages, showing your calculations in some detail: L(M ∪ N) = LM ∪ LN 1. L3 = L1 ∪ L2 (L ∪ M)N = LN ∪ MN 2. L4 = L1{ǫ}(L2 ∩ L1) Note that concatenation does not distribute through intersection! Coun- 3. L5 = L3∅L4 terexample. Let L = {ǫ,a}, M = {ǫ}, N = {a}. Then: Exercise 2.3 L(M ∩ N)= L∅ = ∅ LM ∩ LN = {ǫ,a}∩{a,aa} = {a} Let the alphabet Σ = {a,b,c} and let L1 = {ǫ,b,bb} and L2 = {a,ab,abc} be two languages over Σ. Enumerate the words in the following languages, showing Exponent notation is used to denote iterated language concatenation: L1 = L, your calculations in some detail: 2 3 0 L = LL, L = LLL, and so on. By definition, L = {ǫ} (for any language, 1. L3 = L1 ∩ L2 including ∅), which is the unit for language concatenation (just as u0 = ǫ is the 2. L = (L {ǫ}L ) ∩ Σ∗ unit for concatenation of words). 4 2 1 The Kleene star can also be applied to languages. This intuitively means 3. L5 = L3∅ ∩ L4 language concatenation iterated 0 or more times: Exercise 2.4 ∞ L∗ = Ln Let the alphabet Σ = {a,b,c}. Enumerate the words in n[=0 L = {w | w ∈{ǫ,a,b,bc}∗, |w|≤ 3} Note that ǫ ∈ L∗ for any language L, including L = ∅, the empty language. As an example, if L = {a, ab}, then L∗ = {ǫ, a, ab, aab, aba, aaab, aaba,... }. Alternatively (and more abstractly), L∗ can be described as the least lan- guage (with respect to ⊆) that contains L and the empty word, ǫ, and is closed under concatenation: u ∈ L∗ ∧ v ∈ L∗ =⇒ uv ∈ L∗ Note the subtle difference between using the Kleene star on an alphabet Σ, a set of symbols, as in Σ∗, and on using the Kleene star on a language L ⊆ Σ∗, a set of words. While the result in both cases is a set of words, the types of the arguments to the two variants of the Kleene star operation differ.
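These operations are easy to experiment with on small finite languages, for instance in Haskell (which footnote 2 above mentions in connection with infinite data structures). The sketch below is our own illustration, not part of the notes: the names Word', Lang, concatLang, power and kleeneUpTo are invented, and the containers package is assumed for Data.Set. Words are ordinary strings and languages are sets of words; since L* is infinite in general, the sketch only computes its finite approximation, the union of L^0 up to L^k.

import qualified Data.Set as Set
import Data.Set (Set)

type Word' = String      -- a word is a string of symbols (primed to avoid Prelude's Word)
type Lang  = Set Word'   -- a (finite) language is a set of words

-- MN = { uv | u ∈ M, v ∈ N }
concatLang :: Lang -> Lang -> Lang
concatLang m n = Set.fromList [ u ++ v | u <- Set.toList m, v <- Set.toList n ]

-- L^n, with L^0 = {ǫ}
power :: Lang -> Int -> Lang
power _ 0 = Set.singleton ""
power l n = concatLang l (power l (n - 1))

-- Finite approximation of L*: the union of L^0, L^1, ..., L^k
kleeneUpTo :: Int -> Lang -> Lang
kleeneUpTo k l = Set.unions [ power l n | n <- [0 .. k] ]

main :: IO ()
main = do
  let m = Set.fromList ["", "a", "aa"]
      n = Set.fromList ["b", "c"]
  print (concatLang m n)  -- fromList ["aab","aac","ab","ac","b","c"], as in the example above
  print (kleeneUpTo 3 (Set.fromList ["a", "ab"]))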

2.1 Exercises

Exercise 2.1 Let the alphabet Σ = {3, 5, 7, 9}, and let the language L = {w | w ∈ Σ∗, 1 ≤ |w| ≤ 2}. (If w is a word, |w| denotes the length of that word. If X is a finite set, like an alphabet or finite language, |X| denotes the number of elements in that set, its cardinality.) Answer the following questions:

11 12 3 Finite Automata A transition table represents the transition function δ of a DFA; i.e., the value of δ(q, x) is given by the row labelled q in the column labelled x. In addition, Finite automata correspond to a computer with a fixed finite amount of mem- the initial state is identified by putting an arrow → to the left of it, and all ory4. We will introduce deterministic finite automata (DFA) first and then move final states are similarly identified by a star ∗. The inclusion of this additional to nondeterministic finite automata (NFA). An automaton will accept certain information makes a transition table a self-contained representation of a DFA. words (sequences of symbols of a given alphabet Σ) and reject others. The set Note that the initial state also can be final (accepting). For example, for a ′ of accepted words is called the language of the automaton. We will show that variation D of the DFA D where q0 also is final: the class of languages that are accepted by DFAs and NFAs is the same. δD′ 0 1 3.1 Deterministic finite automata →∗ q0 q1 q0 q1 q1 q2 3.1.1 What is a DFA? ∗ q2 q2 q2

A deterministic finite automaton (DFA) A = (Q, Σ, δ, q0, F ) is given by: Another way to represent a DFA is through a transition diagram. The tran- 1. A finite set of states Q sition diagram for the DFA D is: 0, 1 2. A finite set of input symbols, the alphabet, Σ 1 0 3. A transition function δ ∈ Q × Σ → Q 0 1 q0 q1 q2 4. An initial state q0 ∈ Q

5. A set of final states F ⊆ Q The initial state is identified by an incoming arrow. Final states are drawn with The initial states are sometimes called start states, and the final states are a double outline. If δ(q, x)= q′ then there is an arrow from state q to q′ that is sometimes called accepting states. labelled x. For another example, here is the transition diagram for the DFA D′: As an example consider the following automaton 1 0 0, 1 D = ({q0, q1, q2}, {0, 1},δD, q0, {q2}) where 0 1 q0 q1 q2 δD = {((q0, 0), q1), ((q0, 1), q0), ((q1, 0), q1), ((q1, 1), q2), ((q2, 0), q2), ((q2, 1), q2)} if we view a function as a set of argument-result pairs. Alternatively, we can An alternative to the double outline for a final state is to use an outgoing define it case by case: arrow. Using that convention, the transition diagram for the DFA D is: 0, 1 δD(q0, 0) = q1 1 0

δD(q0, 1) = q0

δD(q1, 0) = q1 0 1 q0 q1 q2 δD(q1, 1) = q2

δD(q2, 0) = q2 Here is an example of a larger DFA over the alphabet Σ = {a,b,c} repre-

δD(q2, 1) = q2 sented by a transition diagram: a a A DFA may be more conveniently represented by a transition table. The transition table for the DFA D is: a δ 0 1 c b D B D a c → q0 q1 q0 a q1 q1 q2 c c A b a F G ∗ q2 q2 q2 b b c c 4 However, that does not mean that finite automata are a good model of general purpose c a computers. A computer with n bits of memory has 2n possible states. That is an absolutely C E enormous number even for very modest memory sizes, say 1024 bits or more, meaning that describing a computer using finite automata quickly becomes infeasible. We will encounter a b better model of computers later, the Turing Machines. b b
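As a brief aside, the example DFA D can be written down and executed directly in Haskell. The sketch below is our own illustration rather than anything prescribed by the notes (the names State, deltaD, deltaHat and accepts are invented): deltaD is the case-by-case transition function given above, and folding it over a word corresponds to the extended transition function δˆ defined in section 3.1.2.

-- The example DFA D = ({q0,q1,q2}, {0,1}, deltaD, q0, {q2}).
data State = Q0 | Q1 | Q2 deriving (Eq, Show)

-- Transition function, case by case, exactly as for deltaD above.
deltaD :: State -> Char -> State
deltaD Q0 '0' = Q1
deltaD Q0 '1' = Q0
deltaD Q1 '0' = Q1
deltaD Q1 '1' = Q2
deltaD Q2 '0' = Q2
deltaD Q2 '1' = Q2
deltaD q  _   = q   -- symbols outside the alphabet are not expected; stay put

-- The extended transition function: run the DFA over a whole word.
deltaHat :: State -> String -> State
deltaHat = foldl deltaD

-- A word is accepted if the run from the initial state ends in a final state.
accepts :: String -> Bool
accepts w = deltaHat Q0 w == Q2

main :: IO ()
main = mapM_ (print . accepts) ["101", "0", "110", "011"]
  -- True, False, False, True, matching the examples in the text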

13 14 The states are named by capital letters this time for a bit of variation: Q = Finally, the machine would read the last 1 in the input word, moving to q2: {A,B,C,D,E,F,G}. While it is common to use symbols q , i ∈ N to name i 1 0 0, 1 states, we can pick any names we like. Another common choice is to use natural numbers; i.e., Q ⊂ N ∧ Q is finite. The representation of the above DFA as a transition table is: 0 1 q0 q1 q2 δ a b c

→ A B C A As q2 is a final state, the DFA D accepts the word w = 101, meaning 101 ∈ L(D). B B D A In the same way, we can determine that 0 ∈/ L(D), 110 ∈/ L(D), but 011 ∈ L(D). C E C A Verify this. Indeed, a little bit of thought reveals that ∗ D E C F ∗ E B D F L(D)= {w | w contains the substring 01} F B C G ∗ G B C F To make the notion of the language of a DFA precise, we now give a formal definition of L(A). First we define the extended transition function δˆ ∈ Q×Σ∗ → ′ 3.1.2 The language of a DFA Q. Intuitively, δˆ(q, w)= q if the machine starting from state q ends up in state q′ when reading the word w. Formally, δˆ is defined by primitive recursion: We will now discuss how a DFA accepts or rejects words over its alphabet of input symbols. The set of words accepted by a DFA A is called the language δˆ(q,ǫ) = q (1) L(A) of the DFA. Thus, for a DFA A with alphabet Σ, L(A) ⊆ Σ∗. δˆ(q, xw) = δˆ(δ(q, x), w) (2) To determine whether a word w ∈ L(A), the machine starts in its initial state. Taking the DFA D above as an example, it would start in state q0. We where x ∈ Σ and w ∈ Σ∗. Thus, xw stands for a nonempty word the first symbol indicate the state of a DFA by underlining the state name: of which is x and the rest of which is w. For example, if xw = 010, then x =0 1 0 0, 1 and w = 10. Note that w may be empty; e.g., if xw = 0, then x = 0 and w = ǫ. As an example, we calculate δˆD(q0, 101) = q1:

0 1 δˆD(q , 101) = δˆD(δD(q , 1), 01) by (2) q0 q1 q2 0 0 = δˆD(q0, 01) because δD(q0, 1) = q0 = δˆD(δD(q0, 0), 1) by (2) Then, whenever an input symbol is read from w, the machine transitions to ˆ a new state by following the edge labelled with this symbol. Once all symbols = δD(q1, 1) because δD(q0, 0) = q1 ˆ in the input word w have been read, the word is accepted if the state is final, = δD(δD(q1, 1),ǫ) by (2) ˆ meaning w ∈ L(A), otherwise the word is rejected, meaning w∈ / L(A). = δD(q2,ǫ) because δD(q1, 1) = q2 To continue with the example, suppose w = 101. The machine D would = q2 by (1) thus first read 1 and transition to a new state by following the edge labelled 1. ˆ As that edge in this case forms a loop back to state q0, the machine D would Using the extended transition function δ, we define the language L(A) of a transition back into state q0: DFA A formally: L(A)= {w | δˆ(q0, w) ∈ F } 1 0 0, 1 Returning to our example, we thus have that 101 ∈ L(D) because δˆD(q0, 101) = q2 and q2 ∈ FD. 0 1 q0 q1 q2 3.2 Nondeterministic finite automata The machine would then read 0 and transition to state q1 by following the edge 3.2.1 What is an NFA? labelled 0. We indicate this by moving the mark along that edge to q1: Nondeterministic finite automata (NFA) have transition functions that map a 0, 1 1 0 given state and an input symbol to zero or more successor states. We can think of this as the machine having a “choice” whenever there are two or more possible 0 1 transitions from a state on an input symbol. In this presentation, we will further q0 q1 q2

15 16 allow an NFA to have more than one initial state5. Again, we can think of this a,b 1 as the machine having a “choice” of where to start. An NFA accepts a word w a b if there is at least one possible way to get from one of the initial states to one a c of the final states along edges labelled with the symbols of w in order. 0 3 c 4 5 b a c It is important to note that although an NFA has a nondetermistic transition function, it can always be determined whether or not a word belongs to its 2 b language. Indeed, we shall see that every NFA can be translated into an DFA a,b that accepts the same language. The transition table for this NFA is: Here is an example of an NFA C that accepts all words over Σ = {0, 1} such δ a b c that the symbol before the last is 1: → 0 {1} {2} ∅ 0, 1 ∗ 1 {4} {3, 4} ∅ ∗ 2 {3, 4} {4} ∅ 3 {1} {2} {3} 1 0, 1 q0 q1 q2 →∗ 4 ∅ ∅ {5} 5 ∅ ∅ {4} A nondeterministic finite automaton (NFA) A = (Q, Σ, δ, S, F ) is given by: Note that this NFA has multiple initial states, multiple final states, one initial state that also is final, and that there in some cases are no possible successor 1. A finite set of states Q, states and in other cases more than one. 2. A finite set of input symbols, the alphabet, Σ, 3.2.2 The language accepted by an NFA 3. A transition function δ ∈ Q × Σ →P(Q), To see whether a word w ∈ Σ∗ is accepted by an NFA A, we have to consider 4. A set of initial states S ⊆ Q, all possible states the machine could be in after having read a sequence of input symbols. Initially, an NFA can be in any of its initial states. Each time an input 5. A set of final (or accepting) states F ⊆ Q. symbol is read, all successor states on the read symbol for each current possible Thus, in contrast to a DFA, an NFA may have many initial states, not just one, state become the new possible states. After having read a complete word w, if at and its transition function maps a state and an input symbol to a set of possible least one of the possible states is final (accepting), then that word is accepted, successor states, not just a single state. As an example we have that meaning w ∈ L(A), otherwise it is rejected, meaning w∈ / L(A). We will illustrate by showing how the NFA C rejects the word 100. We will C = ({q0, q1, q2}, {0, 1},δC, {q0}, {q2}) again mark the current states of the NFA by underlining the state names, but this time there may be more than one marked state at once. Initially, as q0 is where δC is given by the only initial state, we would have: δC 0 1 0, 1 → q0 {q0} {q0, q1} q1 {q2} {q2} 1 0, 1 ∗ q2 ∅ ∅ q0 q1 q2 Note that the entries in the table are sets of states, and that these sets may Each time when we read a symbol we look at all the marked states. We be empty (∅), here exemplified by the entries for state q2. Again, the (in this case only) initial state has been marked with → and the (in this case only) final remove the old markers and put markers at all the states that are reachable via state marked with ∗ to make this a self-contained representation of the NFA. an arrow marked with the current input symbol. This may include one or more Here is another example of an NFA, this time over the alphabet Σ = {a,b,c} states that were marked previously. It may also be the case that no states are and with states Q = {0, 1, 2, 3, 4, 5}⊂ N: reachable, in which case all marks are removed and the word rejected (as it no longer is possible to reach any final states). 
In our example, after reading 1, there would be two marked states, as there are two arrows from q0 labelled 1:

[Transition diagram of the NFA C with the current states q0 and q1 marked.]

5 Note that we diverge slightly from the definition in the book [HMU01], which uses a single initial state instead of a set of initial states. Permitting more than one initial state allows us to avoid introducing ǫ-NFAs (see [HMU01], section 2.5).

17 18 After reading 0, the next symbol in the word 100, there would still be two After reading c: marked states as the machine on input 0 can reach q0 from q0 and q2 from q1: a,b 0, 1 1 a b a c 0 3 c 4 5 1 0, 1 a q0 q1 q2 b c 2 b Note that one of the marked states is a final (accepting) state, meaning the a,b word read so far (10) is accepted by the NFA. However, there is one symbol left in our example word 100, and after having After reading a: read the final 0, the final state would no longer be marked because it cannot be a,b reached from any of the marked states: 1 a b 0, 1 a c 0 3 c 4 5 b a c q 1 0, 1 0 q1 q2 2 b a,b

The NFA C thus rejects the word 100. After reading c: For another example, consider the NFA at the end of section 3.2.1. Convince yourself that you understand how this NFA accepts the words ǫ, abcc, abcca, a,b 1 and rejects abccaac. We illustrate by tracing its operation on the word bacac. a b We start by marking all initial states. Then it is just a matter of systematically a c exploring all possibilities: 0 3 c 4 5 b a c a,b 1 2 b a b a,b a c 0 3 c 4 5 The machine thus rejects bacac as no final state is marked. In fact, as there are b a c no marked states left at all, this shows that this NFA will reject all words that 2 b start bacac. Can you find other such prefixes? a,b To define the extended transition function δˆ for NFAs we use a generalisation After reading b: of the union operation ∪ on sets over a (finite) set of sets:

a,b {A1, A2,...An} = A1 ∪ A2 ∪···∪ An 1 a b [ In the special cases of the empty set of sets and a one element set of sets: a c 0 3 c 4 5 a b c ∅ = ∅ {A} = A 2 b [ [ a,b As an example

After reading a: {{1}, {2, 3}, {1, 3}} = {1}∪{2, 3}∪{1, 3} = {1, 2, 3} a,b [ 1 Alternatively, we can define by comprehension, which also extends the a b operation to infinite sets of sets (althoughS we don’t need this here): a c 0 3 c 4 5 b a c B = {x | ∃A ∈ B.x ∈ A} [ 2 b a,b

19 20 We define δˆ ∈P(Q) × Σ∗ →P(Q) such that δˆ(P, w) is the set of states that The basic idea of this construction is to define a DFA whose states are sets are reachable from one of the states in P on the word w: of NFA states. A set of possible NFA states thus becomes a single DFA state. ˆ The DFA transition function is given by considering all reachable NFA states δ(P,ǫ) = P (3) for each of the current possible NFA states for each input symbol. The resulting δˆ(P, xw) = δˆ( {δ(q, x) | q ∈ P }, w) (4) set of possible NFA states is again just a single DFA state. A DFA state is final [ if that set that contains at least one final NFA state. where x ∈ Σ and w ∈ Σ∗. Intuitively, if P are the possible states, then δˆ(P, w) As an example, let us construct a DFA D(C) equivalent to C above: are the possible states after having read a word w. To illustrate, we calculate δˆC (q0, 100): D(C) = (P({q0, q1, q2}, {0, 1},δD(C), {q0}, FD(C)) ˆ ˆ δC ({q0}, 100) = δC ( {δC(q, 1) | q ∈{q0}}, 00) by (4) where δD(C) is given by: = δˆC (SδC (q0, 1), 00) δ 0 1 = δˆC ({q0, q1}, 00) D(C) = δˆC ( {δC(q, 0) | q ∈{q0, q1}}, 0) by (4) ∅ ∅ ∅ = δˆC (SδC (q0, 0) ∪ δC (q1, 0), 0) → {q0} {q0} {q0, q1} {q1} {q2} {q2} = δˆC ({q0}∪{q2}, 0) ∗ {q2} ∅ ∅ = δˆC ({q0, q2}, 0) {q0, q1} {q0, q2} {q0, q1, q2} = δˆC ( {δC(q, 0) | q ∈{q0, q2}},ǫ) by (4) ∗ {q0, q2} {q0} {q0, q1} = δˆC (SδC (q0, 0) ∪ δC (q2, 0), 0) ∗ {q1, q2} {q2} {q2} = δˆC ({q }∪∅,ǫ) 0 ∗ {q0, q1, q2} {q0, q2} {q0, q1, q2} = {q0} by (3) and F (all the states marked with ∗ above) by: Of course, we already knew this from the worked example above illustrating D(C) how the NFA C rejects 100. Make sure you see how the marked states after F = {{q2}, {q0, q2}, {q1, q2}, {q0, q1, q2}} each step coincides with the set of possible states in the calculation. D(C) ˆ The language of an NFA can now be defined using δ: The transition diagram is: ˆ L(A)= {w | δ(S, w) ∩ F 6= ∅} 0 0 {q0, q2} Thus, 100 ∈/ L(C) because 0 ˆ ˆ 1 δC(SC , 100) ∩ FC = δC ({q0}, 100) ∩{q2} = {q0}∩{q2} = ∅ {q0} {q0, q1} 0 1 3.2.3 The subset construction 1 {q0, q1, q2} DFAs can be viewed as a special case of NFAs; i.e., those for which the there is precisely one start state (S = {q0}) and for which the transition function always {q1} ′ 0, 1 returns singleton (one-element) sets (δ(q, x)= {q } for all q ∈ Q and x ∈ Σ). 1 The opposite is also true, however: NFAs are really just DFAs “in disguise”. 0, 1 {q2} ∅ 0, 1 We show this by for a given NFA systematically constructing an equivalent 0, 1 DFA; i.e., a DFA that accepts the same language as the given NFA. NFAs are thus no more powerful than DFAs; i.e., NFAs cannot describe more languages {q1, q2} than DFAs. However, in some cases, NFAs need a lot fewer states than the corresponding DFA, and they may be easier to construct in the first place. Accepting states have been marked by outgoing arrows. The subset construction: Given an NFA A = (Q, Σ, δ, S, F ) we construct the Note that some of the states (∅, {q1}, {q2}, {q1, q2}) cannot be reached from equivalet DFA: the initial state. This means that they can be omitted without changing the D(A) = (P(Q), Σ,δD(A),S,FD(A)) language. We thus obtain the following automaton: where

δD(A)(P, x) = {δ(q, x) | q ∈ P } (5) [ FD(A) = {P ∈P(Q) | P ∩ F 6= ∅} (6)
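As an aside, these definitions translate almost directly into code. The following Haskell sketch is our own illustration (the type Nfa and the names deltaHatN, acceptsN and subsetConstruct are invented, and the containers package is assumed): it implements the extended transition function δˆ of section 3.2.2, the acceptance test, and a subset construction that, like the examples here, only ever tabulates the subsets actually reachable from the initial states.

import qualified Data.Set as Set
import Data.Set (Set)

-- An NFA over Char with states q: transition function, initial and final states.
-- (The alphabet is left implicit in this representation.)
data Nfa q = Nfa
  { delta  :: q -> Char -> Set q
  , starts :: Set q
  , finals :: Set q
  }

-- Extended transition function (equations (3) and (4)): the states reachable
-- from any state in p on the word w.
deltaHatN :: Ord q => Nfa q -> Set q -> String -> Set q
deltaHatN _ p []      = p
deltaHatN n p (x : w) =
  deltaHatN n (Set.unions [ delta n q x | q <- Set.toList p ]) w

acceptsN :: Ord q => Nfa q -> String -> Bool
acceptsN n w =
  not (Set.null (deltaHatN n (starts n) w `Set.intersection` finals n))

-- Demand-driven subset construction over a given alphabet: starting from the
-- set of initial states, keep tabulating successors for newly met subsets.
subsetConstruct :: Ord q => [Char] -> Nfa q -> [(Set q, [(Char, Set q)])]
subsetConstruct sigma n = go [starts n] []
  where
    go [] done = reverse done
    go (p : todo) done
      | p `elem` map fst done = go todo done
      | otherwise             = go (todo ++ map snd row) ((p, row) : done)
      where
        row = [ (x, Set.unions [ delta n q x | q <- Set.toList p ]) | x <- sigma ]

-- The NFA C from section 3.2.1: accepts words whose second-to-last symbol is 1.
data QC = C0 | C1 | C2 deriving (Eq, Ord, Show)

nfaC :: Nfa QC
nfaC = Nfa d (Set.singleton C0) (Set.singleton C2)
  where
    d C0 '0' = Set.fromList [C0]
    d C0 '1' = Set.fromList [C0, C1]
    d C1 _   = Set.fromList [C2]
    d _  _   = Set.empty

main :: IO ()
main = do
  print (acceptsN nfaC "100")             -- False, as in the worked example
  print (acceptsN nfaC "0110")            -- True
  mapM_ print (subsetConstruct "01" nfaC) -- the four reachable DFA states of D(C)

Running subsetConstruct on C produces exactly the four reachable subsets {q0}, {q0,q1}, {q0,q2} and {q0,q1,q2}, matching the reduced automaton obtained above.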

21 22 0 0 We then proceed to tabulate δ for each of the new states for each x ∈ Σ, {q , q } D(N) 0 0 2 adding any further new states to the table: 1 {q0} {q0, q1} 0 δD(N) 0 1 2 1 → {q0} {q2} {q1, q3} ∅ 1 {q } {q } ∅ {q } {q0, q1, q2} 2 0 0 {q1, q3} ∅∪{q4} = {q4} {q0}∪∅ = {q0} {q0}∪{q4} = {q0, q4} ∅ ∅ ∅ ∅

1 Here, two new states emerge ({q4} and {q0, q4}), both final (because {q4}∩FN 6= It is possible to avoid having to perform calculations for states that cannot ∅ and {q0, q4} ∩ FN 6= ∅): be reached by carrying out the subset construction in a “demand driven” way. The idea is to start from the initial DFA state, which is just the set of initial δD(N) 0 1 2 NFA states S, and then only consider the DFA states (subsets of NFA states) → {q0} {q2} {q1, q3} ∅ that appear during the course of the calculations. We illustrate this approach {q2} {q0} ∅ {q0} by an example. Consider the following NFA N: {q1, q3} ∅∪{q4} = {q4} {q0}∪∅ = {q0} {q0}∪{q4} = {q0, q4} ∅ ∅ ∅ ∅ N = (QN = {q0, q1.q2, q3, q4}, ΣN = {0, 1, 2}, δN , SN = {q0}, FN = {q4}) ∗ {q4} ∗ {q , q } where δN is given by the transition diagram: 0 4 This process is repeated until no new states emerges. Tabulating for the last q1 two new states reveals that no further states emerge in this case and we are thus done, having only had to tabulate for 6 reachable out of the 32 DFA states: 1 1, 2 δD(N) 0 1 2 1 0, 2 q0 q3 q4 → {q0} {q2} {q1, q3} ∅ {q2} {q0} ∅ {q0} {q , q } ∅∪{q } = {q } {q }∪∅ = {q } {q }∪{q } = {q , q } 0, 2 0 1 3 4 4 0 0 0 4 0 4 ∅ ∅ ∅ ∅ ∗ {q4} ∅ ∅ ∅ q2 ∗ {q0, q4} {q2}∪∅ = {q2} {q1, q3}∪∅ = {q1, q3} ∅∪∅ = ∅

5 Note that N has 5 states which means that the DFA D(N) has |P(QN )| =2 = After double checking that we have not forgotten to mark any final states, 32 states. However, as we will see, only a handful of those 32 states can actually we can draw the transition diagram for D(N): be reached from the initial state SN of D(N). We would thus waste quite a bit 0, 1, 2 of effort if we were to tabulate all of them. 2 We start from SN = {q0}, the set of start states of N, and we we compute {δ(q, x) | q ∈ SN } for each x ∈ ΣN (equation (5)). In this case we get: 1 {q2} ∅ S δD(N) 0 1 2 0 2 → {q0} {q2} {q1, q3} ∅ 0, 2 0 {q } {q , q } 0, 1, 2 Whenever we encounterq a state P ⊆ Q of D(N) that has not been considered 0 0 4 1 before, we add P to the table, marking any final states as such. In this case, 2 three new DFA states emerge ({q2}, {q1, q3}, and ∅), none of which is final: 1 1 0 {q1, q3} {q4} δD(N) 0 1 2 → {q0} {q2} {q1, q3} ∅ Accepting states have been marked by outgoing arrows. {q2} {q1, q3} ∅

23 24 3.2.4 Correctness of the subset construction 3.3 Exercises We still have to convince ourselves that the subset construction actually works; i.e., that for a given NFA A it really is the case that L(A) = L(D(A)). We Exercise 3.1 start by proving the following lemma, which says that the extended transition Let the alphabet ΣA = {a,b} and consider the following DFA A: functions coincide: A = (QA = {0, 1, 2, 3}, ΣA, δA, q =0, FA = {1, 2}) Lemma 3.1 0 ˆ ˆ δA = {((0,a), 1), ((0,b), 2), ((1,a), 0), ((1,b), 3), ((2,a), 3), ((2,b), 0), δD(A)(P, w)= δA(P, w) ((3,a), 2), ((3,b), 1)} The result of both functions is a set of states of the NFA A: for the left-hand side because the extended transition function on NFAs returns a set of states, (Here tuple notation is used to define the mapping of the transition function and for the right-hand side because the states of D(A) are sets of states of A. δA; thus δA(0,a) = 1, δA(0,b) = 2, etc.) For the DFA A: 1. Draw its transition diagram. Proof. We show this by induction over the length of the word w, |w|. 2. Determine which of the following words belong to L(A): |w| = 0 Then w = ǫ and we have 1. ǫ ˆ 2. b δD(A)(P,ǫ) = P by (1) = δˆA(P,ǫ) by (3) 3. abaab 4. bababbba |w| = n + 1 Then w = xv with |v| = n. 3. Explicitly calculate δˆA(0,abba). ˆ ˆ δD(A)(P, xv) = δD(A)(δD(A)(P, x), v) by (2) 4. Describe the language that the automaton recognises in English. ˆ = δA(δD(A)(P, x), v) ind.hyp. = δˆA( {δA(q, x) | q ∈ P }, v) by (5) Exercise 3.2 ˆ = δA(SP, xv) by (4) Construct a DFA B over ΣB = {a,b,c,d} accepting all words where the number of a’s is a multiple of 3. E.g. abdaca ∈ L(B) (3 a’s), but ddaabaa∈ / L(B)  (4 a’s, 4 is not a multiple of 3). Explain your construction. In particular, explain We can now use the lemma to show why you chose to have the number of states you did, and explain the purpose (or “meaning”) of each state. Theorem 3.2 L(A)= L(D(A)) Exercise 3.3

Proof. For the alphabet ΣC = {a,b,c}, construct a DFA C that recognises all words w ∈ L(A) where the number of a’s is odd and the number of b’s is divisible by 3. (There ⇐⇒ Definition of L(A) for NFAs may thus be any number of c’s.) For example, a ∈ L(C) (odd number of a’s, δˆA(S, w) ∩ F 6= ∅ the number of b’s is 0 which is divisible by 3), cbcbabc ∈ L(C) (odd number of ⇐⇒ Lemma 3.1 a’s, the number of b’s is 3 which is divisible by 3), but ǫ∈ / L(C) (even number ˆ δD(A)(S, w) ∩ F 6= ∅ of a’s). Give a brief explanation of your construction, that clearly conveys the ⇐⇒ Definition of FD(A) key ideas, and give the transition diagram for your DFA as the final answer. ˆ δD(A)(S, w) ∈ FD(A) Exercise 3.4 ⇐⇒ Definition of L(A) for DFAs w ∈ LD(A) For the alphabet ΣD = {0, 1, 2, 3}, construct a DFA D that precisely rec-  ognizes the words for which the arithmetic sum of the constituent symbols is divisible by 5. For example, ǫ ∈ L(D) (there are no symbols in the empty string, Corollary 3.3 NFAs and DFAs recognise the same class of languages. the sum is thus 0 which is divisible by 5), 0 ∈ L(D) (the sum is again 0), and 23131 ∈ L(D) (2 + 3 + 1 + 3 + 1 = 10 which is divisible by 5), but 133 ∈/ L(D) Proof. We have noticed that DFAs are just a special case of NFAs. On the (1 + 3 + 3 = 7 which is not divisible by 5). Explain your construction. other hand the subset construction introduced above shows that for every NFA we can find a DFA that recognises the same language. 

25 26 Exercise 3.5 1. Construct a DFA D(B) equivalent to B using the “subset construction” and draw the transition diagram for D(B), ignoring unreachable states. Consider the following NFA A over Σ = {a,b,c}: A Clearly show each step of your calculations, e.g. in a transition table. b 2. Carry out a sanity check on your resulting DFA D(B) as follows. (a) Give two words over ΣB that are accepted by the NFA B and two a,b,c that are not. At least two of those should be four symbols long or c q1 q2 longer. a (b) Check that the DFA D(B) accepts the first two words and rejects the other two, exactly like the NFA B. Justify your answer by listing the q0 a sequence of states the DFA D(B) goes through for each word, and b stating whether or not the last state of that sequence is accepting. q3 q4 Exercise 3.7

Consider the following NFA C over ΣC = {a,b,c}: c b,c b,c 1. Which of the following words are accepted by A and which are not? (a) ǫ a (b) aaa 0 1 (c) bbc a (d) cbc a,c (e) abcacb 2. Construct a DFA D(A) equivalent to A using the “subset construction”. a,c Clearly show each step of your calculations in a transition table. b 3 Hint: Some of the 32 states (i.e., the 2|QA| =25 = 32 possible subsets of QA) that would arise by applying the subset construction blindly to A 2 b may be unreachable. You can therefore adopt a strategy where you only consider states reachable from the initial state, S . A b 4 3. Draw the transition diagram for D(A), ignoring unreachable states. a,c Exercise 3.6 1. Which of the following words are accepted by C and which are not? Consider the following NFA B over Σ = {0, 1}: B (a) ǫ (b) aa (c) bb q1 (d) abcabc 1 1 (e) abcabca

1 0 2. Describe the language accepted by C in English. q0 q3 q4 3. Construct a DFA D(C) equivalent to C using the “subset construction”. Clearly show each step of your calculations in a transition table. Indicate 0 0 which DFA state that is initial and which DFA states that are accepting. Only consider states reachable from the initial state of the resulting DFA. q2 4. Draw the transition diagram for D(C).

27 28 4 Regular Expressions 3. For each x ∈ Σ, x is a regular expression6.

To recapitulate, given an alphabet Σ, a language is a set of words L ⊆ Σ∗. So 4. If E and F are regular expressions then E + F is a . far, we have described languages either using set theory (explicit enumeration 5. If E and F are regular expressions then EF (juxtapositioning; just one or set comprehensions) or through finite automata. The key benefits of using after the other) is a regular expression. automata is that they can describe infinite languages (unlike enumeration) and that they directly give a mechanical way to determine language membership 6. If E is a regular expression then E∗ is a regular expression. (unlike comprehensions). However, from an automaton, it is not usually imme- 7 diately obvious what the language of that automaton is, and conversely, given 7. If E is a regular expression then (E) is a regular expression . a high-level description of a language, it is often not obvious if it is possible to These are all regular expressions. describe the language using a finite automaton. To illustrate, here are some examples of regular expressions: This section introduces regular expressions: a concise and much more direct way to describe languages. Moreover, a regular expression can mechanically be • ǫ translated into a finite automaton that accept precisely the language described. This opens up for many practical applications as languages can both be de- • hallo scribed and recognised with ease. In fact, the opposite is also true: given a finite • hallo + hello automaton, it is possible to translate that into an equivalent regular expression. Finite automata and regular expressions are thus interconvertible, meaning that • h(a + e)llo they describe the exact same class of languages: the regular languages or, ac- ∗ ∗ cording to the , type 3 languages (section 1.1). • a b One application of regular expressions is to define patterns in programs • (ǫ + b)(ab)∗(ǫ + a) such as grep. Given a regular expression and a sequence of text lines as input, grep returns those lines that match the regular expression, where matching As in arithmetic, there are conventions for reading regular expressions: means that the line contains a substring that is in the language denoted by the • ∗ binds stronger than juxtapositioning and +. For example, ab∗ is read regular expression. The syntax used by grep for regular expressions is slightly as a(b∗). Parentheses must be used to enforce the reading (ab)∗. different from the one used here, and grep further supports some convenient abbreviations. However, the underlying theory is exactly the same. • Juxtapositioning binds stronger than +. For example, ab + cd is read as Other applications for regular expressions include defining the lexical syntax (ab) + (cd). Parentheses must be used to enforce the reading a(b + c)d. of programming languages; i.e., what basic symbols, or tokens, such as identi- fiers, keywords, numeric literals look like, as well other lexical aspects such as 4.2 The meaning of regular expressions white space and comments. The context-free syntax (see section 7) of a program- ming language is then defined in terms of the tokens; i.e., the tokens effectively In the previous section, we defined the syntax of regular expressions, their form. constitute the alphabet of the language. We now proceed to define the semantics of regular expressions; i.e., what they In fact, regular expression matching has so many applications that many mean, what language a regular expression denotes. 
programming languages provide support for this capability, either built-in or To answer this question, first recall the definition of concatenation of con- via libraries. Examples include Perl, PHP, Python, and Java. In the past, some catenation of languages from section 2: of those implementations were a bit naive as the regular expressions were not compiled into finite automata. As a result, matching could be very slow, as L1L2 = {uv | u ∈ L1 ∧ v ∈ L2} explained in the paper Regular Expression Matching Can Be Simple And Fast We further recall the the Kleene star operation from the same section (2): (but is slow in Java, Perl, PHP, Python, Ruby, ...) [Cox07]. This paper is a very good read, and once you have read these lecture notes up to and including ∞ the present section, you will be able to appreciate it fully. L∗ = Ln n[=0 4.1 What are regular expressions? To each regular expression E over Σ we assign a language L(E) ⊆ Σ∗ as its Given an alphabet Σ (e.g., Σ = {a, b, c,..., z}), the syntax (i.e., form) of regular meaning or semantics. We do this by induction over the definition of the syntax: expressions over Σ is defined inductively as follows: 6 Note that the regular expression here is typeset in boldface, like a, to distinguish is from the corresponding symbol, like a, typeset in a type-writer font in this and the next section (and 1. ∅ is a regular expression. on occasion later on as well). Underlining is sometimes used as an alternative to boldface. 7 The parentheses have been typeset in boldface to emphasise that they are part of the 2. ǫ is a regular expression. syntax of the regular expression.

29 30 1. L(∅)= ∅ • h(a + e)llo: 2. L(ǫ)= {ǫ} By (3) and (4) we know L(a + e)= {a, e}. Thus, using (5) and language concatenation, we obtain: 3. L(x)= {x} where x ∈ Σ. L(h(a + e)llo) = L(h)L(a + e)L(llo) 4. L(E + F )= L(E) ∪ L(F ) = {uvw | u ∈ L(h) ∧ v ∈ L(a + e) ∧ w ∈ L(llo)} 5. L(EF )= L(E)L(F ) = {uvw | u ∈{h} ∧ v ∈{a, e} ∧ w ∈{llo}} 6. L(E∗)= L(E)∗ = {hallo, hello}

7. L((E))= L(E) • a∗b∗: Subtle points: In (1), the symbol ∅ is used both as a regular expression and By (6): as the empty set (empty language). Similarly, ǫ in (2) is used in two ways: as ∗ ∗ a regular expression and as the empty word. In (3), the regular expression is L(a ) = L(a) typeset in boldface to distinguish it from the corresponding symbol. In (6), the = {a}∗ ∗-operator is used both to construct a regular expression (part of the syntax) ∞ and as an operation on languages. In (7), the inner parentheses on the left-hand = {a}n side, typeset in boldface, are part of the syntax of regular expressions. n[=0 Let us now calculate the meaning of each of the regular expression examples ∞ from the previous section; i.e., the language denoted in each case: = {w1w1 ...wn | 1 ≤ i ≤ n, wi ∈{a}} n[=0 • ǫ: ∞ = {an} By (2): n[=0 n L(ǫ) = {ǫ} = {a | n ∈ N} Using (5) and language concatenation, this allows us to conclude: • hallo: Consider L(ha). By (3): L(a∗b∗) = L(a∗)L(b∗) = {uv | u ∈ L(a∗) ∧ v ∈ L(b∗)} L(h) = {h} = {uv | u ∈{am | m ∈ N} ∧ v ∈{bn | n ∈ N}} L(a) = {a} = {ambn | m,n ∈ N} Hence, by (5) and language concatenation (section 2): That is, L(a∗b∗) is the set of all words that start with a (possibly empty) L(ha) = L(h)L(a) sequence of a’s, followed by a (possibly empty) sequence of b’s. = {uv | u ∈ L(h) ∧ vL(a) • (ǫ + b)(ab)∗(ǫ + a): = {uv | u ∈{h} ∧ v ∈{a}} Let us analyse the parts: = {ha} L(ǫ + b) = {ǫ, b} Continuing the same reasoning we obtain: L((ab)∗) = {(ab)n | n ∈ N} L(hallo) = {hallo} L(ǫ + a) = {ǫ, a}

• hallo + hello: Thus, we have: From above we know L(hallo) = {hallo} and L(hello) = {hello}. By ∗ n N (4) we then get: L((ǫ + b)(ab) (ǫ + a)) = {u(ab) v | u ∈{ǫ, b} ∧ n ∈ ∧ v ∈{ǫ, a}}

L(hallo + hello) = {hallo}∪{hello}} In English: L((ǫ + b)(ab)∗(ǫ + a)) is the set of (possibly empty) sequences = {hallo, hello} of alternating a’s and b’s.
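As a small aside, the syntax and semantics just given can be transcribed into Haskell more or less verbatim. The sketch below is our own illustration (the type RegExp and the function langUpTo are invented names, and the containers package is assumed): the constructors mirror syntactic cases 1 to 6, parentheses (case 7) being simply the tree structure, and langUpTo computes the finite set of words of L(E) of length at most k, which is enough to experiment with the examples above even though L(E) itself may be infinite.

import qualified Data.Set as Set
import Data.Set (Set)

-- Syntax of regular expressions over Char.
data RegExp
  = Empty                -- ∅
  | Epsilon              -- ǫ
  | Sym Char             -- a single symbol
  | RegExp :+: RegExp    -- E + F
  | RegExp :.: RegExp    -- EF
  | Star RegExp          -- E*
  deriving Show

-- All words of L(E) of length at most k (a finite approximation of L;
-- only the Star case genuinely needs the bound).
langUpTo :: Int -> RegExp -> Set String
langUpTo _ Empty     = Set.empty
langUpTo _ Epsilon   = Set.singleton ""
langUpTo k (Sym c)   = if k >= 1 then Set.singleton [c] else Set.empty
langUpTo k (e :+: f) = langUpTo k e `Set.union` langUpTo k f
langUpTo k (e :.: f) = Set.fromList [ u ++ v
                                    | u <- Set.toList (langUpTo k e)
                                    , v <- Set.toList (langUpTo k f)
                                    , length (u ++ v) <= k ]
langUpTo k (Star e)  =
  Set.unions (take (k + 1) (iterate step (Set.singleton "")))
  where
    step ws = Set.fromList [ u ++ v
                           | u <- Set.toList (langUpTo k e)
                           , v <- Set.toList ws
                           , length (u ++ v) <= k ]

-- (ǫ + b)(ab)*(ǫ + a), the last example above
altAB :: RegExp
altAB = (Epsilon :+: Sym 'b') :.: Star (Sym 'a' :.: Sym 'b') :.: (Epsilon :+: Sym 'a')

main :: IO ()
main = do
  print (langUpTo 5 (Sym 'h' :.: (Sym 'a' :+: Sym 'e') :.: Sym 'l' :.: Sym 'l' :.: Sym 'o'))
    -- fromList ["hallo","hello"]
  print (langUpTo 4 altAB)  -- alternating a's and b's of length at most 4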

31 32 4.3 Algebraic laws 4.4 Translating regular expressions into NFAs The semantics of regular expressions not only allows us to find out the meaning Theorem 4.1 A regular expression E can be translated into an equivalent NFA of specific regular expressions, but also allows us to prove useful laws about reg- N(E) such that L(N(E)) = L(E). ular expression in general. Let us illustrate by proving the following distributive law for regular expressions: We refer to this translation as the “Graphical Construction”. It is a variation of the standard way of translating regular expressions into NFAs known as E(F + G)= EF + EG Thompson’s construction8. Note that E, F , G are variables standing for some specific but arbitrary regular expressions, and that = here is semantic (as opposed to syntactic) equality. That Proof. The proof is by induction on the syntax of regular expressions: is, what we need to prove is that a regular expression of the form E(F + G) and one of the form EF + EG always have the same meaning, i.e., denote the same 1. N(∅): language. We thus start from L(E(F + G)) and show step by step that this is equal to L(EF + EG) without making any assumptions about the constituent regular expressions E, F , and G, other than that their semantics is given by L(E) etc. which will reject everything (there are no final states). Thus:

L(E(F + G)) L(N(∅)) = ∅ = { Semantics of r.e. (concat.) } L(E)L(E + F ) = L(∅) = { Semantics of r.e. (+) } L(E)(L(F ) ∪ L(G)) 2. N(ǫ): = { Def. concat. of languages } {w1w2 | w1 ∈ L(E) ∧ w2 ∈ (L(F ) ∪ L(G))} = { Def. set union } {w1w2 | w1 ∈ L(E) ∧ w2 ∈{w | w ∈ L(F ) ∨ w ∈ L(G)}} = { Duality between sets and predicates } This automaton accepts the empty word but rejects everything else. Thus {w w | w ∈ L(E) ∧ (w ∈ L(F ) ∨ w ∈ L(G))} 1 2 1 2 2 L(N(ǫ)) = {ǫ} = { Conjunction (∧) distributes over disjunction (∨) } = L(ǫ) {w1w2 | (w1 ∈ L(E) ∧ w2 ∈ L(F )) ∨ (w1 ∈ L(E) ∧ w2 ∈ L(G))} = { Def. set union } {w1w2 | (w1 ∈ L(E) ∧ w2 ∈ L(F ))}∪{w1w2 | (w1 ∈ L(E) ∧ w2 ∈ L(G))} 3. N(x): = { Def. concat. languages (twice) } L(E)L(F ) ∪ L(E)L(G) = { Semantics of r.e. (conacat., twice) } L(EF ) ∪ L(EG) = { Semantics of r.e. (+) } This automaton only accepts the word x. Thus: L(EF + EG) L(N(x)) = {x} Other laws for regular expressions can be proved similarly, although induc- tion is sometimes needed. As an exercise, prove (some of) the following: = L(x)

ǫE = E 4. N(E + F ): We merge the diagrams for N(E) and N(F ) into one: Eǫ = E ∅E = ∅ E∅ = ∅ E + (F + G) = (E + F )+ G E(F G) = (EF )G (E∗)∗ = E∗ 8 ǫ + EE∗ = E∗ https://en.wikipedia.org/wiki/Thompson%27s construction

33 34 Intuitively, the NFAs for the subexpressions E and F are placed side by 5. N(EF ): Recall that L(EF )= L(E)L(F ). Thus, a word w ∈ L(EF ) iff w side. Thus if either of the NFA accepts a word, the combined NFA accepts can be divided into two words u and v, w = uv, such that u ∈ L(E) and this word. However, we have to ensure that the states of the constituent v ∈ L(F ). Consequently, if we have an NFA N(E) recognising L(E) and NFAs do not get confused with each other. We therefore have to use the another NFA N(F ) recognising L(F ), we can construct an NFA recog- disjoint union, defined as follows: nising words w = uv ∈ L(EF ) if we join the NFAs N(E) and N(F ) in sequence in such a way that the machine moves from N(E) to an initial A ⊎ B = {(0,a) | a ∈ A}∪{(1,b) | b ∈ B} state of N(F ) on the last symbol of a word u ∈ L(E). That is, each element of the disjoint union is “tagged” with an index Of course, it could be that ǫ ∈ L(E), meaning that one or more of the that shows from which of the two sets it originated. Thus the elements initial states of N(E) are accepting. In this case, for a word w = uv with of the constituent sets will remain distinct. The transition function of the u = ǫ, the machine needs to start in an initial state of N(F ) directly as combined NFA also has to be defined to work on tagged states. there is no last symbol of u = ǫ to move on. Thus, given the NFAs for the subexpressions E and F : Therefore, we consider two cases: ǫ∈ / L(E), meaning no initial state of N(E) is accepting, and ǫ ∈ L(E), meaning at least one initial state of

N(E) = (QE, Σ,δE,SE, FE ) N(E) is accepting. We start with the former:

N(F ) = (QF , Σ,δF ,SF , FF ) we construct the combined NFA for the regular expression E + F :

N(E + F ) = (QE+F , Σ,δE+F ,SE+F , FE+F ) where

QE+F = QE ⊎ QF ′ ′ δE+F ((0, q), x) = {(0, q ) | q ∈ δE(q, x)} ′ ′ δE+F ((1, q), x) = {(1, q ) | q ∈ δF (q, x)} The dashed lines here suggest “one or more”. So there could be one or more initial states and one or more final states in both N(E) and N(F ). SE+F = SE ⊎ SF This will be made precise shortly; the figure just conveys the general idea. FE+F = FE ⊎ FF We identify every state of N(E) that immediately precedes an accepting state; i.e., every state from which an accepting state can be reached on a single input symbol. We then join N(E) and N(F ) into a combined NFA It remains to prove L(N(E + F )) = L(E + F ). We first observe that N(EF ) by adding an edge from each of the identified states to all of the initial states of N(F ) for each symbol on which an accepting state of N(E) L(N(E + F )) = L(N(E)) ∪ L(N(F )) can be reached. The initial states of N(EF ) are the initial states of N(E) and the final states of N(EF ) are the final states of N(F ): because the initial states SE+F of the combined NFA is the (disjoint) union of the initial states of the constituent NFAs, and because the combined NFA accepts a word whenever one of the constituent NFAs does. The proof then proceeds by induction; that is, we assume that the trans- lation is correct for the subexpressions:

5. N(EF): Recall that L(EF) = L(E)L(F). Thus, a word w ∈ L(EF) iff w can be divided into two words u and v, w = uv, such that u ∈ L(E) and v ∈ L(F). Consequently, if we have an NFA N(E) recognising L(E) and another NFA N(F) recognising L(F), we can construct an NFA recognising words w = uv ∈ L(EF) if we join the NFAs N(E) and N(F) in sequence in such a way that the machine moves from N(E) to an initial state of N(F) on the last symbol of a word u ∈ L(E).

Of course, it could be that ǫ ∈ L(E), meaning that one or more of the initial states of N(E) are accepting. In this case, for a word w = uv with u = ǫ, the machine needs to start in an initial state of N(F) directly as there is no last symbol of u = ǫ to move on.

Therefore, we consider two cases: ǫ ∉ L(E), meaning no initial state of N(E) is accepting, and ǫ ∈ L(E), meaning at least one initial state of N(E) is accepting. We start with the former:

[figure]

The dashed lines here suggest "one or more". So there could be one or more initial states and one or more final states in both N(E) and N(F). This will be made precise shortly; the figure just conveys the general idea.

We identify every state of N(E) that immediately precedes an accepting state; i.e., every state from which an accepting state can be reached on a single input symbol. We then join N(E) and N(F) into a combined NFA N(EF) by adding an edge from each of the identified states to all of the initial states of N(F) for each symbol on which an accepting state of N(E) can be reached. The initial states of N(EF) are the initial states of N(E) and the final states of N(EF) are the final states of N(F):

[figure]

This ensures that the NFA N(EF), once it has read a word u accepted by N(E), is ready to try to accept the remainder v of a word w = uv

by effectively passing v to N(F), allowing the latter to try to accept the remaining part v of w from any of its initial states.

We now formalise this construction. The set of states of the combined NFA N(EF) is again given by the disjoint union of the states of N(E) and N(F) to avoid confusion, and the transition function δEF as well as the initial states SEF and final states FEF are defined accordingly. Thus, given the NFAs for the subexpressions E and F:

N(E) = (QE, Σ, δE, SE, FE)
N(F) = (QF, Σ, δF, SF, FF)

we construct the combined NFA for the regular expression EF:

N(EF) = (QEF, Σ, δEF, SEF, FEF)

where

QEF = QE ⊎ QF
δEF((0, q), x) = {(0, q′) | q′ ∈ δE(q, x)}
                 ∪ {(1, q′) | δE(q, x) ∩ FE ≠ ∅ ∧ q′ ∈ SF}
δEF((1, q), x) = {(1, q′) | q′ ∈ δF(q, x)}
SEF = {(0, q) | q ∈ SE}
FEF = {(1, q) | q ∈ FF}

Now let us consider the second case: at least one of the initial states of N(E) is accepting:

[figure]

As was discussed above, this simply means that we have to arrange that the initial states of N(F) also be initial states of the combined NFA N(EF):

[figure]

We thus refine the formal definition of the initial states of N(EF) to account for this, yielding a definition that covers both cases:

SEF = {(0, q) | q ∈ SE}
      ∪ {(1, q) | SE ∩ FE ≠ ∅ ∧ q ∈ SF}

It remains to prove L(N(EF)) = L(EF). From the construction above, it is clear that

L(N(EF)) = {uv | u ∈ L(N(E)) ∧ v ∈ L(N(F))}

The proof again proceeds by induction; that is, we assume that the translation is correct for the subexpressions:

L(N(E)) = L(E)
L(N(F)) = L(F)

Thus:

L(N(EF)) = {uv | u ∈ L(N(E)) ∧ v ∈ L(N(F))}    Above
         = {uv | u ∈ L(E) ∧ v ∈ L(F)}          Ind. hyp.
         = L(E)L(F)                             Lang. concat.
         = L(EF)                                By (5)
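Continuing the illustrative Haskell sketch from the N(E + F) case (again, not the notes' own code), the sequential composition N(EF) can be transcribed directly from the definition above, including the refined set of initial states covering the case ǫ ∈ L(E):

nfaConcat :: (Eq q1, Eq q2) => NFA q1 -> NFA q2 -> NFA (Either q1 q2)
nfaConcat e f = NFA qs step ss fs
  where
    qs = map Left (states e) ++ map Right (states f)
    -- On a symbol that could take N(E) into an accepting state, the
    -- combined NFA may instead jump to an initial state of N(F).
    step (Left p) x =
      map Left (delta e p x)
      ++ [ Right s | any (`elem` final e) (delta e p x), s <- initial f ]
    step (Right p) x = map Right (delta f p x)
    -- If some initial state of N(E) is accepting (ǫ ∈ L(E)), the
    -- initial states of N(F) are initial states of N(EF) as well.
    ss = map Left (initial e)
         ++ [ Right s | any (`elem` final e) (initial e), s <- initial f ]
    fs = map Right (final f)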

6. N(E∗): Recall that L(E∗) = L(E)∗. Thus, a word w ∈ L(E∗) iff w can be divided into a sequence of n ∈ N words ui, w = u1u2 . . . un, such that ∀i ∈ [1, n] . ui ∈ L(E). Consequently, if we have an NFA N(E) recognising L(E), we can construct an NFA recognising words w ∈ L(E∗) by connecting 0 or more NFAs N(E) in sequence in a similar way to what we did for the case N(EF) above:

[figure]

Here we use the * to informally suggest sequential composition of 0 or more NFAs. However, we need to construct a single NFA, and there is no upper bound on the number of NFAs N(E) that we need to connect in sequence. We resolve this by taking a single NFA N(E) and constructing an NFA for N(E∗) by making it loop back to all of its own initial states from each state that immediately precedes an accepting state. As we also need to allow for iteration 0 times, we further have to add one extra state that is both initial and final, thus accepting ǫ:

[figure]

We now formalise this construction. This time, the states of N(E∗) are almost the same as those of N(E). But we need one extra state to ensure that N(E∗) can accept the empty word, ǫ, and we have to make sure that this one extra state cannot be confused with any other state in N(E). We label the new state ǫ, suggestive of its role to ensure acceptance of the empty word, and we then form the states of N(E∗) using disjoint union to ensure states cannot be accidentally confused. Like before, we have to take this into account when defining the transition function δE∗ as well as the initial states SE∗ and final states FE∗ of the NFA N(E∗). Thus, given the NFA resulting from translating the subexpression E:

N(E) = (QE, Σ, δE, SE, FE)

we construct the NFA for the regular expression E∗:

N(E∗) = (QE∗, Σ, δE∗, SE∗, FE∗)

where

QE∗ = QE ⊎ {ǫ}
δE∗((0, q), x) = {(0, q′) | q′ ∈ δE(q, x)}
                 ∪ {(0, q′) | δE(q, x) ∩ FE ≠ ∅ ∧ q′ ∈ SE}
δE∗((1, ǫ), x) = ∅
SE∗ = SE ⊎ {ǫ}
FE∗ = FE ⊎ {ǫ}

It remains to prove L(N(E∗)) = L(E∗). Given the construction above, we claim that

L(N(E∗)) = {u1u2 . . . un | n ∈ N ∧ ∀i ∈ [1, n] . ui ∈ L(N(E))}

The intuition is that we can run through the automaton one or more times and that the new state ǫ allows the NFA to accept the empty word. The proof then again proceeds by induction; that is, we assume that the translation is correct for the subexpression:

L(N(E)) = L(E)

Thus:

L(N(E∗)) = {u1u2 . . . un | n ∈ N ∧ ∀i ∈ [1, n] . ui ∈ L(N(E))}    Above
         = {u1u2 . . . un | n ∈ N ∧ ∀i ∈ [1, n] . ui ∈ L(E)}       Ind. hyp.
         = ⋃n∈N L(E)^n                                              Lang. concat.
         = L(E)∗                                                    Def. Kleene star
         = L(E∗)                                                    By (6)
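In the same illustrative Haskell sketch (not from the notes), the N(E∗) construction adds one fresh state, here Nothing, playing the role of the new state ǫ, and loops back to the initial states from every state that can reach an accepting state on a single symbol:

nfaStar :: Eq q => NFA q -> NFA (Maybe q)
nfaStar e = NFA qs step ss fs
  where
    qs = Nothing : map Just (states e)
    -- Loop back to all initial states of N(E) from every state that can
    -- reach an accepting state on the current symbol.
    step (Just p) x =
      map Just (delta e p x)
      ++ [ Just s | any (`elem` final e) (delta e p x), s <- initial e ]
    step Nothing _ = []    -- the new state ǫ has no outgoing edges
    ss = Nothing : map Just (initial e)
    fs = Nothing : map Just (final e)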

7. N((E)) = N(E): Parentheses are just used for grouping and do not change anything. We need to prove L(N((E))) = L((E)). The proof is again by induction, so we assume L(N(E)) = L(E) and then we proceed as follows:

L(N((E))) = L(N(E))    By construction
          = L(E)       Induction hypothesis
          = L((E))     By (7)
□

It is worth pausing briefly to reflect on what we have just accomplished. In effect, we have implemented a compiler that translates regular expressions into NFAs, and we have proved it correct; that is, the translation preserves the meaning (here, the described language), which after all is what we generally expect of an accurate translation. Of course, it is a very simple compiler. Yet, in essence, it reflects how real tools that handle regular expressions work; for example, scanner (or lexer) generators such as Flex, Ragel, or Alex9. Moreover, while proving the correctness of compilers for typical programming languages is vastly more complicated than what we have seen here, there are methodological similarities, such as proof by induction over the structure of the language.

9 https://en.wikipedia.org/wiki/Lexical_analysis

Let us illustrate how to apply the Graphical Construction. As a first example, we construct N(a∗b∗). We start with the innermost subexpressions and then join the NFAs together step by step. The states are named according to how they will be named in the final NFA to make it easier to follow the derivation. It is fine to leave states unnamed until the end, and that is what normally is done. We begin with the NFA for a:

[Figure: the NFA N(a), with an edge labelled a from state 0 to state 4.]

The NFA for a∗ is obtained by adding a loop on a from state 0 to itself, as this state precedes a final state and is the only initial state, and by adding the extra state for accepting ǫ:

[Figure: the NFA for a∗: states 0 and 4 as before, a loop labelled a on state 0, and the extra initial and final state 5.]

The NFA for b∗ is constructed in the same way:

[Figure: the NFA for b∗: states 1 and 2 with an edge labelled b, a loop labelled b on state 1, and the extra initial and final state 3.]

Now we have to join these two NFAs in sequence:

[Figure: the NFAs for a∗ and b∗ side by side.]

We have to pay extra attention because the automaton for the subexpression a∗ contains a state that is both initial and final, namely state 5, resulting in "extra" initial states when composing that automaton with the automaton for the subexpression b∗:

[Figure: the NFA for a∗b∗ with states 0, 1, 2, 3, 4, and 5.]

The states 4 and 5 have manifestly become "dead ends": there is no way to reach a final state from either. For NFAs, such dead ends can simply be removed (along with associated edges) without changing the accepted language. If we do that, we obtain:

[Figure: the resulting four-state NFA with states 0, 1, 2, and 3.]

You may have noted that, though correct, this NFA is unnecessarily complicated. For example, the following NFA also accepts the language L(a∗b∗), but has fewer states:

[Figure: a two-state NFA with states 0 and 1.]

This is typical: the translation of regular expressions into NFAs does generally not yield the simplest possible automata.

If we are interested in obtaining the smallest possible machine, one approach is to first convert the resulting NFA into a DFA using the subset construction of section 3.2.3 and then minimize this DFA as explained in section 5. If we do this for the four-state NFA above, we obtain the following DFA:

[Figure: a three-state DFA with states 0, 1, and 2.]

As it happens, just applying the subset construction to the four-state NFA yields this DFA directly10: it is already minimal. Try it! It is quick and a good exercise.

10 Or one isomorphic to it: the states will probably be named differently.
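One way to "try it" programmatically: the following sketch (illustrative, reusing the NFA record from the earlier sketches in this section) runs an NFA on a word by tracking the set of states the automaton could currently be in.

import Data.List (nub)

-- A word is accepted iff some state reachable after reading the whole
-- word is final.  For the empty word this reduces to asking whether
-- some initial state is also final.
accepts :: Eq q => NFA q -> String -> Bool
accepts m w = any (`elem` final m) (foldl step (initial m) w)
  where
    step qs x = nub [ q' | q <- qs, q' <- delta m q x ]

Any of the automata above can be transcribed as an NFA value and then tested on sample words such as "aab" or "ba".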

Let us do one more, somewhat larger, example: constructing an NFA for ((a + ǫ)(b + c))∗. We again start with the innermost subexpressions and then join the NFAs together step by step. We have to (again) pay extra attention because the automaton for the subexpression (a + ǫ) contains a state that is both initial and final, resulting in "extra" initial states when composing that automaton with the automaton for the subexpression (b + c). Also, it makes sense to eliminate dead ends as soon as they occur, here before closing the loop due to the top-level Kleene star. The states are named according to how they will be named in the final NFA to make it easier to follow the derivation, but could be left unnamed until the end if you prefer.

First, an NFA for a + ǫ:

[Figure: the NFA for a + ǫ: states 0 and 6 with an edge labelled a, and the extra initial and final state 7.]

NFA for b + c:

[Figure: the NFA for b + c: states 1 and 2 with an edge labelled b, and states 3 and 4 with an edge labelled c.]

Join the above two NFAs to obtain an NFA for (a + ǫ)(b + c):

[Figure: the combined NFA with states 0, 1, 2, 3, 4, 6, and 7.]

Note that both state 1 and 3 remain initial states because the left automaton has an initial state that is also accepting, meaning we need to be able to get to the start states of the right automaton without consuming any input. States 6 and 7 have now manifestly become dead ends because there is no way to reach an accepting state from either. Let us remove them and all associated edges:

[Figure: the NFA for (a + ǫ)(b + c) after removing states 6 and 7.]

The last step is to carry out the construction corresponding to the ∗-operator. States 1 and 3 both immediately precede a final state, and we should thus add corresponding transition edges from those back to all initial states. There are three initial states, 0, 1, and 3. Thus we need an edge labelled b from 1 to each of 0, 1, and 3 (i.e., a loop back to itself on b) and an edge labelled c from 3 to each of 0, 1, and 3 (i.e., a loop back to itself on c). Additionally, we must not forget to add an extra initial state which is also final (here state 5) to ensure the NFA accepts ǫ.

[Figure: the final NFA for ((a + ǫ)(b + c))∗ with states 0, 1, 2, 3, 4, and 5.]

Note that the isolated state 5 is thus also part of the final automaton.

4.5 Summing up

From the previous section we know that a language given by a regular expression is also recognized by an NFA. What about the other way: can a language recognized by a finite automaton (DFA or NFA) also be described by a regular expression? The answer is yes:

Theorem 4.2 Given a DFA A there is a regular expression R(A) that recognizes the same language L(A) = L(R(A)).

We omit the proof (which can be found in [HMU01] on pp. 91-93). However, we conclude:

Corollary 4.3 Given a language L ⊆ Σ∗ the following are equivalent:

1. L is given by a regular expression.
2. L is the language accepted by an NFA.
3. L is the language accepted by a DFA.

Proof. We have that 1 =⇒ 2 by theorem 4.1. We know that 2 =⇒ 3 by the subset construction of section 3.2.3, and 3 =⇒ 1 by theorem 4.2. □

As indicated in the introduction, the languages that are characterised by any of the three equivalent conditions are called regular languages or type 3 languages.
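To round off sections 4.4 and 4.5, here is a hedged sketch of a top-level translation function assembling the per-case sketches given earlier into the simple compiler discussed in section 4.4 (compile, rename, and the integer state naming are my own illustrative choices, not the notes'):

-- Translate a regular expression to an NFA by structural recursion,
-- renaming states to Ints after each step so the result has one type.
compile :: RE -> NFA Int
compile Empty     = NFA [0]    (\_ _ -> []) [0] []    -- N(∅)
compile Epsilon   = NFA [0]    (\_ _ -> []) [0] [0]   -- N(ǫ)
compile (Sym c)   = NFA [0, 1] step         [0] [1]   -- N(x)
  where step 0 x | x == c = [1]
        step _ _          = []
compile (e :+: f) = rename (nfaUnion  (compile e) (compile f))
compile (e :.: f) = rename (nfaConcat (compile e) (compile f))
compile (Star e)  = rename (nfaStar   (compile e))

-- Rename states to their position in the state list.
rename :: Eq q => NFA q -> NFA Int
rename m = NFA [0 .. n - 1] delta' (map idx (initial m)) (map idx (final m))
  where
    n          = length (states m)
    idx q      = head [ i | (i, q') <- zip [0 ..] (states m), q' == q ]
    delta' i x = map idx (delta m (states m !! i) x)

For example, compile (Star ((Sym 'a' :+: Epsilon) :.: (Sym 'b' :+: Sym 'c'))) corresponds to ((a + ǫ)(b + c))∗ from the example above, and the result can be checked word by word with the accepts function sketched earlier.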

43 44 4.6 Exercises 5 Minimization of Finite Automata

Exercise 4.1 [FOR REFERENCE: MINIMIZATION NOT TAUGHT SPRING 2019] As we saw when translating regular expressions into NFAs, the resulting Give regular expressions defining the following languages over the alphabet Σ = automaton is not necessarily the smallest possible one. Similarly, when employ- {a,b,c}: ing the subset construction to translate an NFA into a DFA, the result is not 1. All words that contain exactly one a. always the smallest possible DFA. It is often desirable to make automatons as 2. All words that contain at least two bs. small as possible. For example, if we wish to implement an automaton, the implementation will be more efficient the smaller the automaton is. 3. All words that contain at most two cs. Given an automaton, the question, then, is how to construct an equivalent 4. All words such that all b’s appear before all c’s. but smaller automaton. Recall that two automatons are equivalent of they ac- 5. All words that contain exactly one b and one c (but any number of a’s). cept the same language. In the following we will study a method for minimizing 6. All words such that the number of a’s plus the number of b’s is odd. DFAs: the table-filling algorithm. 7. All words that contain the sequence abba at least once. Another interesting question is if there, in general, is one unique automaton that is the smallest equivalent one, or if there can be many distinct equivalent Exercise 4.2 automatons, none of which can be made any smaller. It turns out the answer is that the minimal equivalent DFA is unique up to naming of the states. This, in Using the formal definition of the meaning of regular expressions, compute turn, means that we have obtained a mechanical decision procedure for deter- the set denoted by the regular expression mining whether two regular languages are equal: simply convert their respective representation (be it a DFA, an NFA, or a regular expression) to DFAs and ∗ (aa + ǫb ∅)(b + c) minimize them. Because the minimal DFAs are unique, the languages are equal if and only if the minimal DFAs are equal. simplifying as far as possible. Provide a step-by-step account. Exercise 4.3 5.1 The table-filling algorithm ∗ Construct an NFA for the regular expression (a(b+c)) using the “graphical For a DFA (Q, Σ, δ, q0, F ), p, q ∈ Q are equivalent states if and only if, for all construction” from the lecture notes. Provide a step-by-step account. w ∈ Σ∗, δˆ(p, w) ∈ F ⇔ δˆ(q, w) ∈ F . If two states are not equivalent, then they For NFAs it is possible to omit “dead ends”, i.e., states from which no final are distinguishable. state possibly can be reached, without changing the language of the automaton. Consider the following DFA, where Σ = {a,b}, Q = {0, 1, 2, 3, 4, 5}, F = Do this as soon as dead ends emerges to reduce your work. {2, 3}: Exercise 4.4 a,b Systematically construct an NFA for the regular expression a a (a(∅∗ + b))∗(c + ǫ + ∅) 0 1 2 a b b by following the graphical construction from the lecture notes. Make sure it is clear how you undertake the construction by showing the major steps. Eliminate b 4 3 “dead ends” (states from which no final state can be reached) when they appear. a The states in the final NFA should be named, but as long as it is clear what you are doing, you can leave the states of intermediate NFAs unnamed. b 5 a b

The states 1 and 2 are distinguishable on ǫ because δˆ(1,ǫ)=1 ∈/ F while δˆ(2,ǫ)=2 ∈ F . Similarly, 0 and 1 are distinguishable on e.g. b because δˆ(0,b)= 4 ∈/ F while δˆ(1,b)=3 ∈ F . On the other hand, in this case, we can easily see that 4 and 5 are not distinguishable on any word because it is not possible to reach any accepting (final) state from either 4 or 5.
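As a small aside (an illustrative Haskell sketch, not part of the original notes), the extended transition function δ̂ used in the definition of equivalent states is just a fold of the one-step transition function over the word:

-- δ̂(q, w): run the DFA's one-step transition function over a word.
deltaHat :: (q -> Char -> q) -> q -> String -> q
deltaHat = foldl

-- p and q are equivalent iff, for every word w, δ̂(p, w) and δ̂(q, w)
-- are either both accepting or both non-accepting.  This quantifies
-- over infinitely many words, so it cannot be checked by enumeration;
-- the table-filling algorithm below decides it instead.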

The Table-Filling Algorithm recursively constructs the set of distinguishable pairs of states for a DFA. When all distinguishable state pairs have been identified, any remaining pairs of states must be equivalent. Such states can be merged, thereby minimizing the automaton. Assume a DFA (Q, Σ, δ, q0, F):

BASIS For p, q ∈ Q, if (p ∈ F ∧ q ∉ F) ∨ (p ∉ F ∧ q ∈ F), then (p, q) is a distinguishable pair of states. (The states p and q are distinguishable on ǫ.)

INDUCTION For p, q, r, s ∈ Q, a ∈ Σ, if (r, s) = (δ(p, a), δ(q, a)) is a distinguishable pair of states, then (p, q) is also distinguishable. (If the states r and s are distinguishable on a word w, then p and q are distinguishable on aw.)

Theorem 5.1 If two states are not distinguishable by the table-filling algorithm, then they are equivalent.

5.2 Example of DFA minimization using the table-filling algorithm

This section illustrates how the table-filling algorithm can be used to minimize a DFA when working by hand through a fully worked example. We will minimize the following DFA, where Σ = {a,b}, Q = {0, 1, 2, 3, 4, 5}, F = {2, 3}:

[Figure: the transition diagram of the DFA from section 5.1.]

First construct a table over all pairs of distinct states. That is, we do not consider pairs (p ∈ Q, p ∈ Q) because a state obviously cannot be distinguishable from itself. An easy way of constructing the table is to order the states (e.g. numerically or alphabetically), and then list all states except the last one in ascending order along the top, and all states except the first one in descending order down the left-hand side of the table. The resulting table for our 6-state DFA looks like this:

    0  1  2  3  4
5
4
3
2
1

Then mark the state pairs that are distinguishable according to the basis of the table-filling algorithm; i.e., the pairs where one state is accepting, and one is not. The accepting states of our DFA are 2 and 3. Thus the state pairs (0, 2), (0, 3), (1, 2), (1, 3), (2, 4), (2, 5), (3, 4), and (3, 5) have to be marked. List all remaining state pairs to the right: these are the potentially equivalent states that we now have to investigate further:

    0  1  2  3  4
5         x  x
4         x  x
3   x  x
2   x  x
1

Remaining pairs to investigate: (0, 1), (0, 4), (0, 5), (1, 4), (1, 5), (2, 3), (4, 5)

Recall that the induction step of the table-filling algorithm says that for p, q, r, s ∈ Q and a ∈ Σ, if (r, s) = (δ(p, a), δ(q, a)) is distinguishable (on some word w), then (p, q) is (on the word aw). If, during systematic investigation of all state combinations, we, from a state pair (p, q) on some input symbol a, reach a state pair (r, s) for which it is not yet known whether it is distinguishable or not, we record (p, q) under the heading for (r, s). If it later becomes clear that (r, s) is distinguishable, that means that (p, q) also is distinguishable, and recording this implication allows us to carry out the deferred marking at that point.

Investigate all potentially equivalent state pairs on all input symbols (unless we find that a pair is distinguishable, which means we can stop):

(0, 1): (δ(0, a), δ(1, a)) = (1, 2)   Distinguishable! Mark in table.
(0, 4): (δ(0, a), δ(4, a)) = (1, 5)   Unknown as yet. Add (0, 4) under (1, 5).
        (δ(0, b), δ(4, b)) = (4, 4)   Same state, no info.

Our table now looks as follows (we strike a line across the pairs we have considered):

    0  1  2  3  4
5         x  x
4         x  x
3   x  x
2   x  x
1   x

Considered so far (struck from the list): (0, 1), (0, 4)
Recorded under (1, 5): (0, 4)
Still to investigate: (0, 5), (1, 4), (1, 5), (2, 3), (4, 5)

We continue:

(0, 5): (δ(0, a), δ(5, a)) = (1, 5)   Unknown as yet. Add (0, 5) under (1, 5).
        (δ(0, b), δ(5, b)) = (4, 4)   Same state, no info.
(1, 4): (δ(1, a), δ(4, a)) = (2, 5)   Distinguishable! Mark in table.

The table now looks like this:

    0  1  2  3  4
5         x  x
4      x  x  x
3   x  x
2   x  x
1   x

Considered so far: (0, 1), (0, 4), (0, 5), (1, 4)
Recorded under (1, 5): (0, 4), (0, 5)
Still to investigate: (1, 5), (2, 3), (4, 5)
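The bookkeeping above can also be phrased as a fixed-point computation. The following Haskell sketch (illustrative; function and argument names are my own) marks the basis pairs and then repeatedly applies the induction step until no further pairs become distinguishable. The example DFA of this section is included so the result can be compared with the worked example; its transition function is read off from the computations above.

import Data.List (nub)

-- All ordered pairs of distinct states that the algorithm can show
-- distinguishable, computed as a fixed point of the induction step.
distinguishable :: Eq q => [q] -> (q -> Char -> q) -> (q -> Bool)
                -> [Char] -> [(q, q)]
distinguishable qs delta acc sigma = go basis
  where
    pairs = [ (p, q) | p <- qs, q <- qs, p /= q ]
    basis = [ (p, q) | (p, q) <- pairs, acc p /= acc q ]
    go marked
      | length marked' == length marked = marked
      | otherwise                       = go marked'
      where
        marked' = nub (marked ++
          [ (p, q) | (p, q) <- pairs, a <- sigma
                   , (delta p a, delta q a) `elem` marked ])

-- The example DFA of this section: states 0-5, accepting states 2 and 3.
exampleDelta :: Int -> Char -> Int
exampleDelta 0 'a' = 1
exampleDelta 0 'b' = 4
exampleDelta 1 'a' = 2
exampleDelta 1 'b' = 3
exampleDelta 2 _   = 2
exampleDelta 3 'a' = 2
exampleDelta 3 'b' = 3
exampleDelta 4 'a' = 5
exampleDelta 4 'b' = 4
exampleDelta 5 'a' = 5
exampleDelta 5 'b' = 4
exampleDelta q _   = q   -- fallback; not reached for states 0-5 on a or b

Here, distinguishable [0..5] exampleDelta (`elem` [2,3]) "ab" should leave exactly the pairs (2, 3) and (4, 5) (in either order) unmarked, matching the conclusion of the worked example.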

47 48 Now we have come to the state pair (1, 5). If we can determine that (1, 5) is 6 Disproving Regularity a distinguishable pair, then we also know that the pairs (0, 4) and (0, 5) are distinguishable: Regular languages are languages that can be recognized by a computer with finite memory. Such a computer corresponds to a DFA. However, there are (1, 5): (δ(1,a),δ(5,a)) = (2, 5) Distinguishable! many languages that cannot be recognized using only finite memory. A simple example is the language Thus we should mark (1, 5) along with and (0, 4) and (0, 5): L = {0n1n | n ∈ N} 01234 ———(0, 1) ———(0, 4) ———(0, 5) ———(1, 4) That is,the language of words that start with a number of 0s followed by the 5 x x x x same number of 1s. Note that this is different from L(0∗1∗), which is the lan- 4 x x x x guage of words of a sequences of 0s followed by a sequence of 1s but not neces- 3 x x ———(1, 5) (2, 3) (4, 5) sarily of the same length. (We know this to be regular because it is given by a 2 x x ———(0, 4) regular expression.) 1 x ———(0, 5) Why can L not be recognized by a computer with finite memory? Suppose we have 32 MiB of memory; that is, we have 32 ∗ 1024 ∗ 1024 ∗ 8 = 268435456 It remains to check the pairs (2, 3) and (4, 5): bits. Such a computer corresponds to an enormous DFA with 2268435456 states (imagine drawing the transition diagram!). However, this computer can only (2, 3): (δ(2,a),δ(3,a)) = (2, 2) Same state, no info. 268435456 268435456 (δ(2,b),δ(3,b)) = (2, 3) No point in adding (2, 3) below (2, 3). count to 2 − 1; if it reads a word starting with more than 2 − 1 (4, 5): (δ(4,a),δ(5,a)) = (5, 5) Same state, no info. 0s, it will necessarily lose count! The same reasoning applies whatever finite (δ(4,b),δ(5,b)) = (4, 4) Same state, no info. amount of memory we equip our computer with. Thus, an unbounded amount of memory is needed recognize L. (Of course, 2268435456−1 is a very large number We have now systematically checked all potentially equivalent state pairs. indeed, so for practical purposes the machine will almost certainly be able to Two pairs remain unmarked; i.e., we have not been able to show that they are count much further than we ever will need.) distinguishable: (2, 3) and (4, 5). We can therefore conclude that these states We will now show a general theorem called the pumping lemma (for regular are pairwise equivalent: 2 ≡ 3 and 4 ≡ 5. We thus proceed to merge these states languages) that allows us to prove that a certain language is not regular. by, informally, placing them “on top” of each other and “dragging along” the edges. The result is the following minimal DFA (where the merged states have 6.1 The pumping lemma been given the names 2 and 4): Theorem 6.1 Given a L, then there is a number n ∈ N such a,b that all words w ∈ L that are longer than n (|w| ≥ n) can be split into three words w = xyz s.t. a a,b 1. y 6= ǫ 0 1 2 2. |xy|≤ n b 3. ∀k ∈ N . xykz ∈ L 4 Proof. For a regular language L there exists a DFA A s.t. L = L(A). Let us assume that A has n states. If A accepts a word w with |w|≥ n, it must have a,b visited some state q twice: y

x z q

We choose q such that it is the first cycle; hence |xy|≤ n. We also know that y is nonempty (otherwise there is no cycle). Now, consider what happens if a word of the form xykz is given to the automaton. The automaton will clearly accept this word; it just has to go round the cycle k times, whatever k is, including k = 0. Thus ∀k ∈ N . xykz ∈ L. 

49 50 6.2 Applying the pumping lemma That is |xyyz| lies between two subsequent squares. But then it cannot be a square itself, and hence we have a contradiction to xyyz ∈ L. We conclude L is Theorem 6.2 The language L = {0n1n | n ∈ N} is not regular. not regular.  Given a word w ∈ Σ∗ we write wR for the word read backwards. E.g. abcR = Proof. Assume L would be regular. We will show that this leads to contra- bca. Formally this can be defined as diction using the pumping lemma. By the pumping lemma, there is an n such that we can split each word that ǫR = ǫ is longer than n such that the properties given by the pumping lemma hold. (xw)R = wRx Consider 0n1n ∈ L. This is certainly longer than n. We have that xyz = 0n1n and we know that |xy|≤ n, hence y can only contain 0s. Further, because y 6= ǫ, We use this to define the language of even length palindromes it must contain at least one 0. Now, according to the pumping lemma, xy0z ∈ L. R ∗ However, this cannot be the case because it contains at least one fewer 0s than Lpali = {ww | w ∈ Σ } 1s. Our assumption that L is regular must thus have been wrong.  It is easy to see that the language E.g. for Σ = {a, b} we have abba ∈ Lpali. Using the intuition that finite au- tomata can only use finite memory, it should be clear that this language is not n {1 | n is even} regular either: to check whether the 2nd half is the same as the 1st half read backwards, we have to remember the first half, however long it is. Indeed, we is regular (just construct the appropriate DFA or use a regular expression). can show: However what about n {1 | n is a square} Theorem 6.4 Given Σ= {a, b} we have that Lpali is not regular. where by saying n is a square we mean that is there is an k ∈ N s.t. n = k2. We may try as we like: there is no way to find out whether we have a got a square Proof. We use the pumping lemma: We assume that Lpali is regular. Now n n number of 1s by only using finite memory. And indeed: given a pumping number n we construct w = a bba ∈ Lpali, this word is certainly longer than n. From the pumping lemma we know that there is a n Theorem 6.3 The language L = {1 | n is a square} is not regular. splitting of the word w = xyz s.t. |xy| ≤ n and hence y may only contain a’s m n and because y 6= ǫ at least one. We conclude that xz ∈ Lpali where xz = a bba Proof. We apply the same strategy as above. Assume L is regular. Then where m

Using the 3rd condition we know that 6.3 Exercises

xyyz ∈ L Exercise 6.1 that is |xyyz| is a square. However we know that Apply the pumping lemma for regular languages to show that the following languages are not regular: 2 n = |w| n m n+m 1. L1 = {a b c | m,n ∈ N} over the alphabet Σ1 = {a,b,c}. = |xyz| (E.g., aabbbccccc ∈ L1, but aabbcc∈ / L1.) ∗ < |xyyz| because 1 ≤ |y| 2. L2 = {w ∈ Σ | #a(w)=2 × #b(w) ∧ #b(w)=2 × #c(w)} over the = |xyz| + |y| alphabet Σ = {a,b,c}, where #a(w) denotes the number of a’s in a word w, # (w) the number of b’s, etc. ≤ n2 + n because |y|≤ n b (E.g., abcaaba ∈ L2, but aaabbc∈ / L2, and aaaaabbc∈ / L2.)

To summarize, we have n2 < |xyyz| < (n + 1)2

51 52 7 Context-Free Grammars Recursion can also be indirect: the left-hand side non-terminal of a production can be reached again from the right-hand side via one or more other productions. This section introduces context-free grammars (CFGs) as a formalism to define This is very common: see the following example for an illustration. languages that is more general than regular expressions; that is, there are more As an example we define a grammar for the language of arithmetic expres- languages definable by CFGs than by regular expressions and finite automata. sions over a using only + and ∗. As we will see in section 7.2 where the language The class of languages definable by CFGs is known as the context-free languages described by a CFG is defined, the elements of this language are words like or type 2 languages (section 1.1). We will define the notion of automata corre- a + (a ∗ a) or (a + a) ∗ (a + a). On the other hand, words like a ++a or )(a, sponding to CFGs, the push down automata (PDA), later, in section 9. which manifestly do not correspond to well-formed arithmetic expressions, do Context-free grammars have an abundance of applications. A prominent ex- not belong to the language: ample is the definition of (aspects of) the syntax programming languages, like C, Java, or Haskell. Another application is the document type definition (DTD) Garith = ({E,T,F }, {(, ), a, +, ∗}, P, E) for the SGML-family markup languages, like XML and HTML. Applications of CFGs are discussed further in section 7.6. where P is given by:

P = {E → T, 7.1 What are context-free grammars? E → E + T, We start be defining what context-free grammars are, their syntax, deferring T → F, what CFGs mean, the languages they describe or their semantics, to section 7.2. A context-free grammar G = (N,T,P,S) is given by T → T ∗ F, F → a, • A finite set N of nonterminal symbols or nonterminals. F → (E)} • A finite set T of terminal symbols or terminals. Here, the choice E, T , F for the nonterminal symbols is meant to suggest Ex- • N ∩ T = ∅; i.e., the sets N and T are disjoint. pression, Term, and Factor, respectively. Note that some of the productions for • A finite set P ⊆ N × (N ∪ T )∗ of productions. A production (A, α), where E and T are immediately left-recursive, and that one of the productions for F A ∈ N and α ∈ (N ∪ T )∗ is a sequence of nonterminal and terminal recursively refer back to the start symbol S. The latter is an example of indirect symbols. It is written as A → α in the following. recursion. Somewhat unfortunately, T is used here both as one of the nonterminals and • S ∈ N: the distinguished start symbol. to denote the set of terminals, as per the conventions outlined above. Do not Nonterminals are also referred to as variables and consequently the set of let this confuse you: the nonterminal symbol T and the set of terminal symbols nonterminals is sometimes denoted by V . Yet another term for the the same are quite distinct! Occasional name clashes are a fact of life. thing is syntactic categories. The terminals are the alphabet of the language To save space, we may combine all the rules with the same left-hand side, defined by a CFG, and for that reason the set of terminals is sometimes denoted separating the alternatives with a vertical bar. Using this convention, our set of by Σ. Indeed, we will occasionally use that convention as well in the following. productions can be written Note that the right-hand side of a production may be empty. This is known P = {E → T | E + T, as an ǫ-production and written A → ǫ T → F | T ∗ F, Also note that it is perfectly permissible for the same nonterminal to occur F → a | (E)} both to the left and to the right of the arrow in a production. In fact, this is In practice, the set of productions is often given by just listing the productions, essential: if that were not allowed, CFGs would only amount to a (possibly) without explicit braces indicating a set: compact description of finite languages and be of little interest. A production for a nonterminal A where the same nonterminal is the first symbol of the right- E → T | E + T hand side, in the leftmost position, is called immediately left-recursive; e.g., T → F | T ∗ F A → Aα F → a | (E) where α ∈ (N ∪ T )∗. A production for a nonterminal A where the same non- Either way, these are just a more convenient ways to write down exactly the terminal is the last symbol of the right-hand side, in the rightmost position, is same set of productions. called immediately right-recursive; e.g., A → αA

53 54 7.2 The meaning of context-free grammars The relation derives in grammar G, derivation in 0 or more steps, is defined as:

∗ How can we check if a word w ∈ T is in the language of a grammar? We start ⇒∗ ⊆ (N ∪ T )∗ × (N ∪ T )∗ ∗ G with the start symbol S. Note that this is a word in (N ∪ T ) . If there is a ∗ ∗ α0 ⇒ αn ⇐⇒ α0 ⇒ α1 ⇒ ...αn−1 ⇒ αn production S → α, we can obtain a new word in (N ∪ T ) by replacing the S by G G G G the right-hand side α of the production. Should there be further nonterminals in the resulting word, the process is repeated by looking for a production where where n ∈ N. Thus α ⇒∗ α, as n may be 0. We will also occasionally use ⇒+ G G the left-hand side is one of those nonterminals and replacing that nonterminal 12 by the right-hand side of the production. This process is called a derivation. meaning derivation in one or more steps . A word α ∈ (N ∪ T )∗ such that S ⇒∗ α is called a sentential form. The It is often is the case that there is a choice between derivation steps as there G can be more than one nonterminal in a word and more than one production for language of a grammar, L(G) ⊆ T ∗, consists of all terminal sentential forms: a nonterminal. Any word w ∈ T ∗ derived in this way belongs to the language ∗ defined by the grammar. L(G)= {w ∈ T ∗ | S ⇒ w} G Let us consider our expression grammar Garith from section 7.1. In this case, the start symbol is E. The following is one possible derivation: A language that can be defined by a context-free grammar is called a context- free language (CFL). E ⇒ E + T Garith ⇒ T + T 7.3 The relation between regular and context-free lan- Garith ⇒ F + T guages Garith ⇒ a + T A grammar in which each production has at most one non-terminal symbol in Garith its right-hand side is linear. For example, ⇒ a + F Garith G1 = ({S}, {0, 1}, {S → ǫ | 0S1}, S) ⇒ a + (E) G arith is a . There are two special cases of linear grammars: ⇒ a + (T ) G arith • A linear grammar is left-linear if each right-hand side nonterminal is the ⇒ a + (T ∗ F ) Garith leftmost (first) symbol in its right-hand side. ⇒ a + (F ∗ F ) Garith • A linear grammar is right-linear if each right-hand side nonterminal is the ⇒ a + (a ∗ F ) rightmost (last) symbol in its right-hand side. Garith ⇒ a + (a ∗ a) For example, Garith Generally, given a grammar G, the symbol ⇒ stands for the relation derives G2 = ({S, A}, {0, 1}, {S → ǫ | A1, A → S0}, S) G in one step in grammar G or directly derives in grammar G. It has nothing to is left-linear, and do with implication. When the grammar used is clear from the context, it is conventional to drop the subscript G and simply use ⇒, read directly derives. G3 = ({S, A}, {0, 1}, {S → ǫ | 0A, A → 1S}, S) In the example above, we always replaced the leftmost nonterminal symbol. This is called a leftmost derivation. However, as was remarked before, this is is right-linear. Collectively, left-linear and right-linear grammars are called regu- not necessary: in general, we are free to pick any nonterminal for replacement. lar grammars because the languages they describe are regular. We will not prove An alternative would be to always pick the rightmost one, which results in this fact, but it is easy to see how right-linear grammars correspond directly to a rightmost derivation. Or we could pick whichever nonterminal seems more NFAs. For example, G3 corresponds to the following NFA: convenient to expand. The symbols ⇒ and ⇒ are sometimes used to indicate lm rm 0 leftmost and rightmost derivation steps, respectively. 
S A Given any grammar G = (N,T,P,S) we define the relation directly derives in grammar G as follows: 1 ∗ + ⇒ ⊆ (N ∪ T )∗ × (N ∪ T )∗ 12 ⇒ is the reflexive transitive closure of ⇒, and ⇒ is the transitive closure of ⇒. G G G G G αAγ ⇒ αβγ ⇐⇒ A → β ∈ P G

55 56 Thus we see that context-free grammars can be used to describe at least some The central point is that the parent-child relationship reflects a derivation regular languages. On the other hand, some of the languages that we have shown step according to a production in the grammar. For example, the production not to be regular are actually context-free. For example, the (linear) grammar F → (E) was used once in the derivation of a + (a ∗ a), and consequently there n n G1 above describes the language {0 1 | n ∈ N} that we proved (theorem 6.2) is a node labelled F in the derivation tree with child nodes labelled (, E, ), not to be regular. It is worth noting that if we allow left-linear and right-linear ordered from left to right. productions to be mixed, the resulting language is not necessarily regular. For In more detail, a tree is a derivation tree for a CFG G = (N,T,P,S) iff: example, the grammar 1. Every node has a label from N ∪ T ∪{ǫ}. ′ G1 = ({S, A}, {0, 1}, {S → ǫ | 0A, A → S1}, S) 2. The label of the root node is S. is equivalent to G1; i.e., describes the same, non-regular, language. Further, we proved (theorem 6.4) that the language of even-length palindromes 3. Labels of internal nodes belong to N. L = {wwR | w ∈{a,b}∗} pali 4. Ifanode n has label A and nodes n1, n2,..., nk are children of n, from left is not regular. The following context-free grammar is one way of defining this to right, with labels X1, X2,... Xk, respectively, then A → X1X2 ...Xk is a production in P . language, demonstrating that Lpali is a context-free language:

Gpali = ({S}, {a, b}, {S → ǫ | aSa | bSb}, S) 5. If a node n has label ǫ, then n is a leaf and the only child of its parent. So, what is the relation between the regular and context-free languages? Are Through the notion of the yield of a derivation tree, the relationship between there languages that are regular but not context-free? The answer is no: a derivation tree and corresponding derivations can be made precise: Theorem 7.1 All regular languages are context-free. • The string of leaf labels read from left to right, eliding any ǫ bar one if it Again, we do not give a proof, but the idea is that regular expressions can is the only remaining symbol, constitute the yield of the tree. be translated into (regular) context-free grammars. For example, a∗b∗ can be ∗ translated into: • For a CFG G = (N,T,P,S), a string α ∈ (N ∪ T ) is the yield of some derivation tree iff S ⇒∗ α. ({A, B}, {a,b}, {A → aA | B, B → bB | ǫ}, A) G As a consequence of theorem 7.1 and the fact that we have seen that there are at Note that the leaf nodes may be labelled with either terminal or nonterminal least some languages that are context-free but not regular, we have established symbols. The yield is thus a sentential form (see section 7.2) in general, and not that the regular languages form a proper subset of the context-free ones. necessarily a word in the language of the context-free grammar. To illustrate the above point, along with elision of superfluous ǫs from the 7.4 Derivation trees yield, consider the grammar A derivation in a context-free grammar induce a corresponding derivation tree S → AB that reflects the structure of the derivation: how each nonterminal was rewritten. A → aS | ǫ As an example, consider the tree representation of the derivation of a + (a ∗ a) B → Sb | ǫ in grammar Garith (section 7.2): E One derivation in this grammar is

E + T S ⇒ AB ⇒ aSB ⇒ aABB

T F The corresponding derivation tree is S F ( E ) A B a T a S T ∗ F A B F a

a

57 58 Here, the yield is the sentential form aABB. The grammar SAE allows expressions involving numbers to be derived, unlike We can continue the derivation, using the ǫ-productions for the nonterminals Garith. Also note that this grammar is simpler in that there only is one nonter- A and B: minal (E) involved at the level of expressions proper, as opposed to three (E, aABB ⇒ aBB ⇒ aB ⇒ a T , F ) for Garith. The derivation tree for the entire derivation is Consider the word 1 + 2 ∗ 3. The following derivation tree shows that this word belongs to L(SAE): S E A B E + E a S ǫ I E ∗ E A B D I I ǫ ǫ with the yield being just a. 1 D D For an example where the yield is empty, consider the derivation 2 3 S ⇒ AB ⇒ B ⇒ ǫ Note that the yield is the word above, 1 + 2 ∗ 3. But there is another tree with The derivation tree is the same yield: S E

A B E ∗ E

ǫ ǫ E + E I with yield ǫ. I I D

7.5 Ambiguity D D 3 A CFG G = (N,T,P,S) is ambiguous iff there is at least one word w ∈ L(G) such that there are 1 2 • two different derivation trees, or equivalently Thus, there is one word for which there are two derivation trees. This shows that the grammar SAE is ambiguous. • two different leftmost derivations, or equivalently As per the definition of ambiguity, another way to demonstrate ambiguity is • two different rightmost derivations to find two leftmost or two rightmost derivations for a word. It is easy to see that there is a one-to-one correspondence between a derivation tree and a leftmost for w. This is usually a bad thing because it entails that there is more than one and a rightmost derivation. To illustrate, here are the two leftmost derivations way to interpret a word; i.e., it leads to semantic ambiguity. for 1+2 ∗ 3, corresponding to the first and to the second tree respectively: As an example consider the following variation of a grammar for simple arithmetic expressions (SAE): E ⇒ E + E ⇒ I + E ⇒ D + E ⇒ 1+ E ⇒ 1+ E ∗ E lm lm lm lm lm SAE = (N = {E,I,D},T = {+, ∗, (, ), 0, 1,..., 9},P,E) ⇒ 1+ I ∗ E ⇒ 1+ D ∗ E ⇒ 1+2 ∗ E lm lm lm where P is given by: ⇒ 1+2 ∗ I ⇒ 1+2 ∗ D ⇒ 1+2 ∗ 3 lm lm lm E → E + E and | E ∗ E E ⇒ E ∗ E ⇒ E + E ∗ E ⇒ I + E ∗ E ⇒ D + E ∗ E | (E) lm lm lm lm | I ⇒ 1+ E ∗ E ⇒ 1+ I ∗ E ⇒ 1+ D ∗ E ⇒ 1+2 ∗ E lm lm lm lm I → DI | D ⇒ 1+2 ∗ I ⇒ 1+2 ∗ D ⇒ 1+2 ∗ 3 D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 lm lm lm

59 60 Note, one word, two different leftmost derivations. Thus the grammar is am- Now, if the interpretation (1 + 2) ∗ 3 is desired, explicit parentheses have to biguous. Exercise: Find the rightmost derivation corresponding to each tree. be used. The (only!) derivation tree for this word is: There are two reasons for why ambiguity is problematic. The first we already alluded to: semantic ambiguity. Suppose we wish to assign a meaning to a word E beyond the word itself. In our case, the meaning might be the result of evaluating T the expression, for instance. Then the first tree suggests that the expression should be read as 1+(2∗3), which evaluates to 7, while the second tree suggests T ∗ F the expression should be read as (1 + 2) ∗ 3, which evaluates to 913. Note how the bracketing reflects the tree structure in each case. We have one word with F I two different interpretations and thus semantic ambiguity. The other reason is that many methods for parsing, i.e. determining if a ( E ) D given word belongs to the language defined by a CFG, do not work for ambiguous grammars. In particular, this applies to efficient methods for parsing that usually E + T 3 are what we would like to use for that very reason. We return to parsing in section 10. T F Fortunately, it is often possible to change an ambiguous grammar into an unambiguous one that is equivalent; i.e., the language is unchanged. Such gram- F I mar transformations are discussed 8. But for now, let us note that Garith was carefully structured so as to be unambiguous. We can restructure SAE similarly: I D

E → E + T | T D 2 T → T ∗ F | F 1 F → (E) | I I → DI | D Note that the addition now is a subexpression (subtree) of the overall expression, which is what was desired. D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 Now there is only one possible derivation tree for 1 + 2 ∗ 3: 7.6 Applications of context-free grammars E An important example for context-free languages is the syntax of programming languages. For example, the specification of the Java programming language14 E + T [GJS+15] uses a context-free grammar to specify the syntax of Java. Another example is the specification of the syntax of the language used in the G53CMP T T ∗ F Compilers module [Nil16]. In practice, the exact details of how a grammar is presented often differs a F F I little bit from the conventions introduced here. The Java language specification I I D being a case in point. Further common variations include Backus-Naur form (BNF)15 and Extended Backus-Naur form (EBNF)16. But these differences are D D 3 entirely superficial. For example, the symbol ::= is often used instead of →. For another example, EBNF introduces a shorthand notation for “zero or more of”, 1 2 commonly denoted by enclosing a grammar symbol (or symbols) in curly braces or using a Kleene-star like notation; e.g.: Convince yourself that this is the case. Note that this tree corresponds to the reading 1 + (2 ∗ 3) of the expression; i.e. a reading where multiplication has A → ... {B} ... higher precedence than addition. This is, of course, is the standard convention. or 13 Strictly speaking, this is just a natural and often convenient convention, because it implies that the meaning of the whole can be understood in terms of the meaning of the parts in the ∗ obvious way implied by the structure of the tree. However, sometimes, in real compilers, it A → ...B ... might be necessary or convenient to parse things in a way that does not reflect the semantics 14 http://docs.oracle.com/javase/specs/ as directly. But then the parse tree is usually transformed later into a simplified, internal 15 https://en.wikipedia.org/wiki/Backus%E2%80%93Naur form Abstract Syntax Tree version (an ), the structure of which reflects the intended semantics as 16 https://en.wikipedia.org/wiki/Extended Backus%E2%80%93Naur form described here.

61 62 It is easy to see that this iterative construct really just is a shorthand for basic 7.7 Exercises productions as it readily can be expressed by introducing an auxiliary nonter- minal and a couple of associated productions, either using left recursion or right Exercise 7.1 recursion. The example above can be translated into the equivalent immediately left-recursive (section 7.1) productions Consider the following Context-Free Grammar (CFG) G: S → X | Y A → ...A ... 1 X → aXb | ǫ A1 → ǫ | A1B Y → cY d | ǫ

or into the equivalent immediately right-recursive productions S, X, Y are nonterminal symbols, S is the start symbol, and a, b, c, d are

A → ...A1 ... terminal symbols. 1. Derive the following words using the grammar G. Answer by giving the A1 → ǫ | BA1 entire derivation sequence from the start symbol S: where A1 is a the new, auxiliary, nonterminal. (a) ǫ Note that not all syntactic aspects of common programming languages are (b) aabb captured by the context-free grammar. For example, requirements regarding (c) cccddd declaring variables before they are used and type correctness of expressions 2. Does the string aaaddd belong to the language L(G) generated by the cannot be captured through context-free languages. However, at least in the grammar G? Provide a brief justification. area of programming languages, it is common practice to use the notion of 3. Give a set expression (using set comprehensions and operations on sets “syntactically correct” in the more limited sense of “conforming to the context- like union) denoting the language L(G). free grammar of the language”. Aspects such as type-correctness are considered separately. Exercise 7.2 Another application is the document type definition (DTD)17 for the SGML- family markup languages, like XML and HTML. A DTD defines the document Construct a context free grammar generating the language structure including legal elements and attributes. It can be declared inline, as m n n m n n a header of the document to which the definition applies, or as an external L = {(ab) (bc) (cb) (ba) | m,n ≥ 1}{d | n ≥ 0}∪{d | n ≥ 2} reference. over the alphabet {a,b,c,d} (parentheses are only used for grouping). Note that Extensions of context-free grammars are used in computer linguistics to set concatenation has higher precedence than set union. Explain your construc- describe natural languages. In fact, as mentioned in section 1.1, context-free tion. grammars were originally invented by Noam Chomsky for describing natural languages. As an example, consider the following set of terminals, each an En- Exercise 7.3 glish word: Consider the Context-Free Grammar (CFG) G = (N,T,P,S) where N = T = {the, dog, cat, that, bites, barks, catches} {S,X,Y } are the nonterminal symbols, T = {a,b,c} are the terminal symbols, We can then define a grammar G = ({S, N, NP, VI , VT , VP},T,P,S) where P S is the start symbol, and the set of productions P is: is the following set of productions: S → X | Y S → NP VP X → aXb | ab N → cat | dog Y → bY c | ǫ NP → the N | NP that V P Recall that the relation “derives directly in G” is a relation on strings of VI → barks | bites terminals and non-terminals; i.e. VT → bites | catches VP → VI | VT NP ⇒ ⊆ (N ∪ T )∗ × (N ∪ T )∗ G This grammar allows us to derive interesting sentences like: such that for all α, γ ∈ (N ∪ T )∗ the dog that catches the cat that bites barks αAγ ⇒ αβγ ⇐⇒ A → β ∈ P G 17 https://en.wikipedia.org/wiki/Document type definition List all pairs (φ, θ) of the relation ⇒ for the cases where either φ ∈{X,XY,aXbYc,cc} G or θ = a.

63 64 Exercise 7.4 8 Transformations of context-free grammars

Consider the following Context-Free Grammar (CFG) Exp: In this section, we discuss how context free grammars can be restructured sys- tematically without changing the language that they describe. There are many T → T + T | F reasons for doing this, such as simplifying a grammar, putting it into some F → F ∗ F | P specific form, or eliminating ambiguities. P → N(A) | (T ) | I N → f | g | h A → T | ǫ 8.1 Equivalence of context-free grammars I → DI | D Two grammars G1 and G2 are equivalent iff D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

L(G1)= L(G2) T , F , P , N, A, I, D are nonterminals; +, ∗, f, g, h, (, ), 0, 1,2, 3, 4,5, 6, 7, 8, 9 are terminals; T is the start symbol. Whenever a grammar is being transformed, it is assumed that the resulting 1. Derive the following words in the grammar Exp where possible. If it grammar is equivalent to the original one. is possible to derive the word, give the entire left-most derivation; i.e. For example, the following two grammars are equivalent: always expand the left-most non-terminal of the sentential form18. If it is not possible to derive the word, give a brief explanation as to why not. S → ǫ | A S → A (a) (789) G1: G2: (b) 7 + g(3 ∗ 5) ∗ (f()) A → a | aA A → ǫ | Aa (c) 1+2 ∗ 3) (d) 1 + 7(9) ∗ L(G1) = {a} = L(G2). The equivalence of CFGs is in general undecidable 2. Draw a derivation tree for the word 7 + (8 ∗ h(1)) + 9 in the grammar (section 11). Exp. 3. Draw another derivation tree for the word 7+(8∗h(1))+9 from 2. What 8.2 Elimination of uselsss productions does the fact that there are two different derivation trees for one word tell about the grammar Exp? In a context free grammar G = (N,T,P,S), a nonterminal X ∈ N is 4. Modify the relevant productions of the grammar Exp so that a function ∗ • reachable iff S ⇒ αXβ for some α, β ∈{N ∪ T }∗ symbol (one of f, g, h) can be applied to zero, one, or more arguments, separated by a single comma when there are more than one argument, • productive iff X ⇒∗ w for some w ∈ T ∗ instead of just zero or one argument. For example, it should be possible to derive words like Phrased differently, a nonterminal X is reachable if it occurs in some senten- f(2,g(),h(3 + 4)) tial form. Productions for nonterminals that are unreachable or unproductive, or where an unproductive nonterminal occurs on the right-hand side, are col- Explain your construction. lectively knows as useless productions. Any useless production can be removed from a grammar without changing the language. For example, consider the following grammar, where N = {S,A,B}, T = {a,b}, and S is the start symbol:

S → aAB | b A → aA | a B → bB

The nonterminal B is unproductive as there is no way to derive a word of only terminal symbols from it. This makes the productions S → aAB and B → bB useless. Removing those productions leaves us with:

S → b A → aA | a 18 Sentential form: word derivable from the start symbol.

65 66 But now A has clearly become an unreachable nonterminal, making the produc- 8.4 Left factoring tions A → aA and A → a useless too. If we remove those as well we obtain: Sometimes it is useful to factor out a common prefix of the right-hand sides of S → b a group of productions. This is called left factoring. Left factoring can make a grammar easier to read and understand as it captures a recurring pattern in one Of course, any terminal and nonterminals that no longer occur in any produc- place. It is sometimes also necessary for putting a grammar into a form suitable tions can also be eliminated. Thus the set of nonterminals is now just {S} and for use with certain parsing methods (section 10). the set of terminals just {b}. Consider the following two productions for A. Note the common prefix XY :

A → XYX | XYZZY 8.3 Substitution As a direct consequence of how derivation in a grammar is defined, it follows After left factoring: that an occurrence of a non-terminal in a right-hand side may be replaced by the A → BX | BZZY right-hand sides of the productions for that non-terminal if done in all possible ways. This is a form of substitution. B → XY For example, consider the following grammar fragment. We assume it in- cludes all productions for B, and we wish to eliminate the occurrence of B in 8.5 Disambiguating context-free grammars the right-hand side of the production for A: Given an ambiguous context-free grammar G, it is often possible to construct ′ A → XBY an equivalent but unambiguous grammar G . Some context-free languages are inherently ambiguous, meaning that every CFG generating the language is neces- B → C | D sarily ambiguous, but most languages of practical interest, such as programming B → ǫ languages, can be given unambiguous CFGs. In this section, we focus on how expression languages (like arithmetic expres- Note that there are three productions for B. By substitution, this grammar sions or regular expressions) can be given unambiguous CFGs by structuring fragment can be transformed into: the grammar to account for operator precedence and operator associativity. A → XCY | XDY | XY The following is a CFG for simple arithmetic expressions. For simplicity, we only consider the numbers 0, 1, and 2: B → C | D B → ǫ E → E + E | E ∗ E | E ↑ E | (E) | N N → 0 | 1 | 2 Thus we get one new production for A for each alternative for B. Note that we cannot necessarily remove the productions for B: there may be other occur- E and N are nonterminals, E is the start symbol, +, ∗, ↑, (, ), 0, 1, 2 are rences of B in the grammar (as the above was only a fragment). However, if terminals. The grammar is ambiguous. For example, the word 0 + 1 + 2 has two substitution renders some symbols unreachable, then the productions for those different leftmost derivations: symbols become useless and can consequently be removed (section 8.2). Substitution can of course also be performed on recursive productions. That E ⇒ E + E ⇒ N + E ⇒ 0+ E ⇒ 0+ E + E ⇒ 0+ N + E ⇒ 0+1+ E is often not very useful, though, as the productions just get bigger and more nu- lm lm lm lm lm lm ⇒ 0+1+ N ⇒ 0+1+2 merous without eliminating any nonterminals. For example, assume the follow- lm lm ing productions are all productions for A in some grammar. Note that A → Ab E ⇒ E + E ⇒ E + E + E ⇒ N + E + E ⇒ 0+ E + E ⇒ 0+ N + E is recursive (immediately left-recursive, as it happens): lm lm lm lm lm ⇒ 0+1+ E ⇒ 0+1+ N ⇒ 0+1+2 A → ǫ | Ab lm lm lm We now wish to construct an equivalent but unambiguous version of the We can substitute the right-hand sides of all productions for A for the one the above grammar by making it reflect the following conventions regarding occurrence of A in the right-hand side of A → Ab. That would leave us with: operator precedence and associativity: A → ǫ | b | Abb Operators Precedence Associativity Thus we did not gain anything here, unless there was some specific reason for ↑ highest right wanting a single production that shows that A can yield a single b. ∗ medium left + lowest left

67 68 For example, the word In our example, the operators + and ∗ are left-associative, and the operator 1+2 ∗ 2 ↑ 2 ↑ 2+0 ↑ is right-associative. We thus make the corresponding productions left- and should be read as right-recursive, respectively, arriving at the final version of the grammar: (1 + (2 ∗ (2 ↑ (2 ↑ 2)))) + 0 E → E + E1 | E1 That is, the structure of the (one and only) derivation tree should reflect this E1 → E1 ∗ E2 | E2 reading. E → E ↑ E | E To impart operator precedence on the grammar, it has to be stratified: First 2 3 2 3 the expressions have to be partitioned into different categories of expressions, E3 → (E) | N one category for each operator precedence level. We thus need to introduce N → 0 | 1 | 2 one nonterminal (or syntactic category) for each precedence level. Then it must be arranged so that expressions belonging to the category for operators of one This grammar is (again) unambiguous, but now allows words like 0 + 1 + 2. precedence level only occur as subexpressions of expressions belonging to the Verify that there only is one derivation tree, and that this has the desired left- category for operators of the next lower precedence level. Bracketing (enclosing branching structure! an expression in some form of parentheses) should have higher precedence than As an example, let us consider the word 1+2 ∗ 2 ↑ 2 ↑ 2 + 0 from above. any operator. Bracketed subexpressions should thus only be allowed as subex- Recall that we want the reading (1 + (2 ∗ (2 ↑ (2 ↑ 2)))) + 0. The derivation tree pressions of expressions in the category for the highest operator precedence. The for the word in the final version of the grammar is: expression enclosed in the parentheses, however, should be of the category for E the lowest operator precedence. In our example, there are three operator precedence levels, so we introduce E + E1 three additional nonterminals (E1, E2, E3) to stratify the grammar into four levels: one level for each operator precedence category, and one more for the E + E1 E2 innermost expression level, the subexpressions of expressions involving operators of the highest precedence. The resulting grammar is: E1 E1 ∗ E2 E3

E → E1 + E1 | E1 E2 E2 E3 ↑ E2 N E1 → E2 ∗ E2 | E2

E2 → E3 ↑ E3 | E3 E3 E3 N E3 ↑ E2 0

E3 → (E) | N N N 2 N E3 N → 0 | 1 | 2 1 2 2 N This grammar is fine in that it is unambiguous. But for practical purposes, it is inconvenient as expressions involving more than one operator at the same 2 level has to be explicitly bracketed. For example, the word 0 + 1 + 2 cannot be derived in this grammar (try it!), but a user of this little expression language Make sure you understand how the branching structure corresponds to the de- would have to write either (0 + 1) + 2 or 0 + (1 + 2). sired reading of the word; i.e., how precedence and associativity allowed for This is where operator associativity comes into the picture: by adopting a implicit bracketing. As an exercise, draw the derivation tree for the explicitly convention regarding associativity, expressions involving more than operator of bracketed word, and compare the structure of the two trees. The trees will not some specific precedence level can be implicitly bracketed. In our example, + is be the same, of course, as the words are not the same (as words, symbol by left-associative (as per standard mathematical conventions), which means the symbol), but the branching structure (what is a subexpression of what) will word 0 + 1 + 2 should be read as (0 + 1) + 2; i.e. the derivation tree for 0 +1+2 agree. should branch to the left, suggesting that 0 + 1 is a subexpression of the overall expression. 8.6 Elimination of left recursion Imparting of associativity is achieved by making productions for left-associ- + ative operators left-recursive, and those for right-associative operators right- A CFG is left-recursive if there is some non-terminal A such that A ⇒ Aα19. recursive, as this results in the desired left-branching or right-branching struc- Certain parsing methods cannot handle left-recursive grammars. An example ture, respectively, of the derivation tree. Note that the requirement that subex- is recursive decent parsing as described in section 10. If we want to use such pressions at one precedence level must all be expressions at next higher prece- a parsing method for parsing a language L = L(G) given by a left-recursive dence level is relaxed a little, but only in a way that does not make the grammar 19 + unambiguous. Recall that ⇒ means derives in one or more steps; i.e., the transitive closure of ⇒.

grammar G, then it first has to be transformed into an equivalent grammar G′ that is not left-recursive.

We first consider immediate left recursion (section 7.1); i.e., productions of the form A → Aα. We assume that α cannot derive ǫ.

The key idea of the transformation is simple. Let us first consider a simplified scenario, with one immediately left-recursive production for a nonterminal A and one non-recursive production:

A → Aα
A → β

Then observe that all strings derivable from A using these two productions have the form β(α)∗ once the last A has been replaced by β; i.e., β followed by zero or more α. Now it is easy to see that the following alternative grammar generates strings of exactly the same form:

A  → βA′
A′ → αA′ | ǫ

This is arguably a more direct way to generate strings of the form β(α)∗. In essence, the productions say: “start with β, and then tag on zero or more α”.

We now generalise and formalise this idea. In order to transform an immediately left-recursive grammar into an equivalent grammar that is not left-recursive, proceed as follows. For each nonterminal A defined by some left-recursive production, group the productions for A

A → Aα1 | Aα2 | ... | Aαm | β1 | β2 | ... | βn

such that no βi begins with an A. Then replace the A productions by

A  → β1A′ | β2A′ | ... | βnA′
A′ → α1A′ | α2A′ | ... | αmA′ | ǫ

Assumption: no αi derives ǫ.

To illustrate, consider the immediately left-recursive grammar

S → A | B
A → ABc | AAdd | a | aa
B → Bee | b

Applying the transformation rules above yields the following equivalent right-recursive grammar:

S  → A | B
A  → aA′ | aaA′
A′ → BcA′ | AddA′ | ǫ
B  → bB′
B′ → eeB′ | ǫ

Let us do a sanity check on the new grammar by picking a word in the language of the original grammar and make sure it can also be derived in the new grammar. The derivation of the word should ideally make use of all recursive productions in the grammar. Let us pick the word aabeeeecddbeec. The derivation tree for this word in the original grammar (the figure does not survive extraction) demonstrates that the word indeed is in the language generated by the grammar; because of the left-recursive productions for A and B it is strongly left-branching.

The derivation tree for the same word in the transformed grammar is correspondingly right-branching, with the A′ and B′ productions extending the tree to the right and ending in ǫ.

Note that, while the yield is the same in both cases, the structure of the trees is very different: the first tree is more left-branching, due to the left recursion, and the other more right-branching, due to the right recursion.

To eliminate general left recursion, the grammar is first transformed into an immediately left-recursive grammar through systematic substitution (section 8.3). After this the elimination scheme set out above can be applied.

Consider the following grammar. Note that no production is immediately left-recursive:

A → BaB
B → Cb | ǫ
C → Ab | Ac

The grammar is, however, left recursive because, for example, A ⇒ BaB ⇒ CbaB ⇒ AbbaB. Thus we have A ⇒+ Aα (for α = bbaB in this case), demonstrating that the grammar is left recursive.

To eliminate the left recursion, let us first transform this grammar into an equivalent immediately left-recursive grammar. Let us start by eliminating C by substituting all alternatives for C into the right-hand side of the production B → Cb. Note that this makes C an unreachable nonterminal, making the productions for C useless, meaning they can be removed (section 8.2):

A → BaB
B → Abb | Acb | ǫ

Then we can eliminate B wherever it occurs in the leftmost position of a right-hand side in the productions for A. The grammar is now immediately left-recursive:

A → AbbaB | AcbaB | aB
B → Abb | Acb | ǫ

Alternatively, B can be eliminated completely from the productions for A, making B an unreachable nonterminal, allowing the productions for B to be removed:

A → AbbaAbb | AcbaAbb | aAbb
  | AbbaAcb | AcbaAcb | aAcb
  | Abba | Acba | a

Let us go with the smaller version (fewer productions):

A → AbbaB | AcbaB | aB
B → Abb | Acb | ǫ

Only the productions for A are immediately left-recursive. Applying the transformation to eliminate left recursion gives us:

A  → aBA′
A′ → bbaBA′ | cbaBA′ | ǫ
B  → Abb | Acb | ǫ

Note that A appears to the left in B-productions; yet the grammar is no longer left-recursive. Why?

8.7 Exercises

Exercise 8.1

The grammar Exp from exercise 7.4 is ambiguous. Fix this problem; i.e., modify the grammar, without changing the language of the grammar, so that all words in the language have exactly one derivation tree. You do not need to prove that this holds for the resulting grammar, but you should explain what you did and why.

Exercise 8.2

1. Construct a simple, unambiguous grammar according to the following:

   • The integer literals are the only primitive expressions. An integer literal is either 0, or a non-empty word of decimal digits (0, 1, ..., 9) not starting with 0 and with a single optional minus sign (−) in front. E.g. 0, 1, 42, −234 are all valid integer literals, but 01, −0, −−1 are not.

   • There are four binary operators:

     Operators   Precedence    Associativity
     <           1 (lowest)    non-associative
     ⊕           2             left
     ⊗           3             left
     ↑           4 (highest)   right

   • Additionally, it should be possible to use parentheses for grouping in the standard way.

2. Draw the derivation tree for 42 < 0 ⊗ −10 ⊗ (1 ⊕ 7) ↑ 2 and verify that its structure reflects the specification above.

Exercise 8.3

The following context-free grammar is immediately left-recursive:

S → Sa | XbS | a
X → XXX | YYY | XYY | YYX
Y → cY | dY | e

S, X, and Y are nonterminals, S is the start symbol, and a, b, c, d, and e are terminals. Transform it into an equivalent right-recursive grammar. Explicitly show the result of the grouping step in addition to the final result after applying the actual transformation step to the productions that need transformation.
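The grouping-and-rewriting scheme of section 8.6 is mechanical, and it can help to see it as a small program. The following Haskell sketch is not part of the notes and is not a solution to the exercises; the names Sym, RHS, and eliminateImmediate are inventions for illustration only, and the empty list stands for ǫ.

type Sym = String
type RHS = [Sym]

-- Eliminate immediate left recursion for a single nonterminal a, given the
-- alphas (from a -> a alpha_i) and the betas (from a -> beta_j), introducing
-- a fresh nonterminal a'. Returns the new productions for a and for a'.
eliminateImmediate :: Sym -> Sym -> [RHS] -> [RHS] -> ([(Sym, RHS)], [(Sym, RHS)])
eliminateImmediate a a' alphas betas =
  ( [ (a,  beta  ++ [a']) | beta  <- betas  ]            -- a  -> beta_j a'
  , [ (a', alpha ++ [a']) | alpha <- alphas ] ++ [(a', [])] )  -- a' -> alpha_i a' | eps

-- Example: A -> A a | b becomes A -> b A', A' -> a A' | eps
example :: ([(Sym, RHS)], [(Sym, RHS)])
example = eliminateImmediate "A" "A'" [["a"]] [["b"]]

Evaluating example should yield ([("A",["b","A'"])], [("A'",["a","A'"]), ("A'",[])]), mirroring the β(α)∗ reading described above.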

73 74 9 Pushdown Automata where δ is given by the following equations:

[FOR REFERENCE: PDA NOT TAUGHT SPRING 2019] δ(q0, 0, #) = {(q0, 0#)} We will now consider a new notion of automata Pushdown Automata (PDA). δ(q0, 1, #) = {(q0, 1#)} PDAs are finite automata with a stack; i.e., a data structure that can be used δ(q0, 0, 0) = {(q0, 00)} to store an arbitrary number of symbols (hence PDAs have an infinite set of δ(q0, 1, 0) = {(q0, 10)} states) but which can be only accessed in a last-in-first-out (LIFO) fashion. δ(q0, 0, 1) = {(q0, 01)} The languages that can be recognized by PDA are precisely the context-free δ(q0, 1, 1) = {(q0, 11)} languages. δ(q0, ǫ, #) = {(q1, #)} δ(q0, ǫ, 0) = {(q1, 0)} 9.1 What is a pushdown automaton? δ(q0, ǫ, 1) = {(q1, 1)} δ(q1, 0, 0) = {(q1,ǫ)} A Pushdown Automaton P = (Q, Σ, Γ, δ, q0,Z0, F ) is given by the following δ(q1, 1, 1) = {(q1,ǫ)} data δ(q1, ǫ, #) = {(q2,ǫ)} δ(q,w,z) = ∅ everywhere else • A finite set Q of states, To save space we may abbreviate this by writing: • A finite set Σ of input symbols (the alphabet), • A finite set Γ of stack symbols, δ(q0,x,z) = {(q0, xz)} δ(q0,ǫ,z) = {(q1,z)} • A transition function δ(q1, x, x) = {(q1,ǫ)} δ(q1, ǫ, #) = {(q2,ǫ)} ∗ δ ∈ Q × (Σ ∪{ǫ}) × Γ →Pfin(Q × Γ ) δ(q,x,z) = ∅ everywhere else

Here Pfin(A) are the finite subsets of a set; i.e., this can be defined as where q ∈ Q, x ∈ Σ,z ∈ Γ. We obtain the previous table by expanding all the possibilities for q,x,z. Pfin(A)= {X | X ⊆ A ∧ Xis finite.} We draw the transition diagram of P by labelling each transition with a triple x,Z,γ with x ∈ Σ,Z ∈ Γ,γ ∈ Γ∗: Thus, PDAs are in general nondeterministic because they may have a choice of transitions from any state. However, there are always only finitely x, z, xz x, x, ǫ many choices.

• An initial state q0 ∈ Q, ǫ,z,z ǫ, #,ǫ q0 q1 q2 • An initial stack symbol Z0 ∈ Γ, • A set of final states F ⊆ Q. 9.2 How does a PDA work?

As an example we consider a PDA P that recognizes the language of even At any time the state of the computation of a PDA P = (Q, Σ, Γ, δ, q0,Z0, F ) is R length palindromes over Σ = {0, 1}: L = {ww | w ∈{0, 1}∗}. Intuitively, this given by: PDA pushes the input symbols on the stack until it guesses that it is in the middle and then it compares the input with what is on the stack, popping of • the state q ∈ Q the PDA is in, symbols from the stack as it goes. If it reaches the end of the input precisely at • the input string w ∈ Σ∗ that still has to be processed, the time when the stack is empty, it accepts. • the contents of the stack γ ∈ Γ∗. P0 = ({q0, q1, q2}, {0, 1}, {0, 1, #}, δ, q0, #, {q2}) Such a triple (q,w,γ) ∈ Q×Σ∗ ×Γ∗ is called an Instantaneous Description (ID). We define a relation ⊢⊆ ID × ID between IDs that describes how the PDA P can change from one ID to the next one. Because PDAs in general are nonde- terministic, this is a relation (not a function); i.e., there may be more than one possibility. There are two possibilities for ⊢: P

75 76 1. (q,xw,zγ) ⊢ (q′,w,αγ) if (q′, α) ∈ δ(q,x,z) states. Here it doesn’t play a role what the contents of the stack are at P the end. 2. (q,w,zγ) ⊢ (q′,w,αγ) if (q′, α) ∈ δ(q,ǫ,z) ∗ P In our example the PDA P0 would accept 0110 because (q0, 0110, #) ⊢ P0 In the first case the PDA reads an input symbol and consults the transition (q2,ǫ,ǫ) and q2 ∈ F . Hence we conclude 0110 ∈ L(P0). ′ function δ to calculate a possible new state q and a sequence of stack symbols On the other hand, because there is no successful sequence of IDs starting α that replaces the currend symbol on the top z. with (q0, 0011, #), we know that 0011 ∈/ L(P0). In the second case the PDA ignores the input and silently moves into a new state and modifies the stack as above. The input is unchanged. Acceptance by empty stack Consider the word 0110. What are possible sequences of IDs for P0 starting ∗ with (q0, 0110, #) ? L(P )= {w | (q0,w,Z0) ⊢ (q,ǫ,ǫ)} P

(q0, 0110, #) ⊢ (q0, 110, 0#) 1. with (q0, 0#) ∈ δ(q0, 0, #) P0 That is the PDA accepts the word w if there is any sequence of IDs starting ⊢ (q0, 10, 10#) 1.with(q0, 10) ∈ δ(q0, 1, 0) from (q0,w,Z0) and leading to (q,ǫ,ǫ), in this case the final state plays no P0 role. ⊢ (q1, 10, 10#) 2.with(q1, 1) ∈ δ(q0, ǫ, 1) P 0 If we specify a PDA for acceptance by empty stack we will leave out the ⊢ (q1, 0, 0#) 1. with (q1,ǫ) ∈ δ(q1, 1, 1) P0 set of final states F and just use P = (Q, Σ, Γ, δ, q0,Z0). ⊢ (q1, ǫ, #) 1. with (q1,ǫ) ∈ δ(q1, 0, 0) P0 Our example automaton P0 also works if we leave out F and use accep- ⊢ (q2,ǫ,ǫ) 2. with (q2,ǫ) ∈ δ(q1, ǫ, #) tance by empty stack. P0
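To make the example concrete, here is a hedged Haskell sketch (not part of the notes; the St type and the function names are made up) that simulates P0 directly on instantaneous descriptions, exploring all nondeterministic choices and accepting by final state:

data St = Q0 | Q1 | Q2 deriving (Eq, Show)

-- All successor IDs of a given ID (state, remaining input, stack).
step :: (St, String, String) -> [(St, String, String)]
step (Q0, x : w, z : s) = [ (Q0, w, x : z : s)     -- read x and push it, or
                          , (Q1, x : w, z : s) ]   -- guess that this is the middle
step (Q0, [],    z : s) = [ (Q1, [], z : s) ]      -- middle guess with no input left
step (Q1, x : w, z : s) | x == z = [ (Q1, w, s) ]  -- compare input with stack, pop
step (Q1, w, "#")       = [ (Q2, w, "") ]          -- bottom marker: move to final state
step _                  = []                       -- otherwise the PDA is stuck

accepts :: String -> Bool
accepts w = go (Q0, w, "#")
  where
    go cfg@(q, rest, _) = (q == Q2 && null rest) || any go (step cfg)

Under this sketch, accepts "0110" evaluates to True and accepts "0011" to False, matching the discussion of L(P0) above.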

∗ We can always turn a PDA that uses one acceptance method into one that We write (q,w,γ) ⊢ (q′, w′,γ) if the PDA can move from (q,w,γ) to (q′, w′,γ′) P uses the other. Hence, both acceptance criteria specify the same class of lan- in a (possibly empty) sequence of moves. Above we have shown that guages.

∗ (q0, 0110, #) ⊢ (q2,ǫ,ǫ). 9.4 Deterministic PDAs P0 We have introduced PDAs as nondeterministic machines that may have several However, this is not the only possible sequence of IDs for this input. E.g. alternatives how to continue. We now define Deterministic Pushdown Automata the PDA may just guess the middle wrong: (DPDA) as those that never have a choice. To be precise we say that a PDA P = (Q, Σ, Γ, δ, q0,Z0, F ) is deterministic (q0, 0110, #) ⊢ (q0, 110, 0#) 1. with (q0, 0#) ∈ δ(q0, 0, #) P0 (is a DPDA) iff ⊢ (q1, 110, 0#) 2. with (q1, 0) ∈ δ(q0, ǫ, 0) |δ(q,x,z)| + |δ(q,ǫ,z)|≤ 1 P0

∗ Remember, that |X| stands for the number of elements in a finite set X. We have shown (q0, 0110, #) ⊢ (q1, 110, 0#). Here the PDA gets stuck as there That is: a DPDA may get stuck but it has never any choice. P0 In our example the automaton P0 is not deterministic, e.g. we have δ(q0, 0, #) = is no state after (q1, 110, 0#). If we start with a word that is not in the language L (like 0011) then the {(q0, 0#)} and δ(q0, ǫ, #) = {(q1, #)} and hence |δ(q0, 0, #)| + |δ(q0, ǫ, #)| = 2. automaton will always get stuck before reaching a final state. Unlike the situation for finite automata, there is in general no way to trans- late a nondeterministic PDA into a deterministic one. Indeed, there is no DPDA that recognizes the language L! Nondeterministic PDAs are more power- 9.3 The language of a PDA ful than deterministic PDAs. ′ There are two ways to define the language of a PDA P = (Q, Σ, Γ, δ, q0,Z0, F ) However, we can define a similar language L over Σ = {0, 1, $} that can be (L(P ) ⊆ Σ∗) because there are two notions of acceptance: recognized by a deterministic PDA: Acceptance by final state L′ = {w$wR | w ∈{0, 1}∗}

That is, L′ contains palindromes with a marker $ in the middle, e.g. 01$10 ∈ L′.

L(P) = {w | (q0, w, Z0) ⊢∗_P (q, ǫ, γ) ∧ q ∈ F}

We define a DPDA P′ for L′:

That is the PDA accepts the word w if there is any sequence of IDs starting ′ ′ P = ({q0, q1, q2}, {0, 1, $}, {0, 1, #},δ , q0, #, {q2}) from (q0,w,Z0) and leading to (q,ǫ,γ), where q ∈ F is one of the final

77 78 where δ′ is given by: we define

′ δ (q0,x,z) = {(q0, xz)} x ∈{0, 1}, z ∈{0, 1, #} P (G) = ({q0}, {(, ), a, +, *}, {E,T,F, (, ), a, +, *}, δ, q0, E) ′ δ (q0, $,z) = {(q1,z)} z ∈{0, 1, #} ′ where δ (q1, x, x) = {(q1,ǫ)} x ∈{0, 1} ′ δ (q1, ǫ, #) = {(q2,ǫ)} δ(q0,ǫ,E) = {(q0,T ), (q0, E+T )} ′ δ (q,x,z) = ∅ everywhere else δ(q0,ǫ,T ) = {(q0, F ), (q0,T *F )} The transition graph is: δ(q0,ǫ,F ) = {(q0, a), (q0, (E))} x, z, xz x, x, ǫ δ(q0, (, () = {(q0,ǫ)} δ(q0, ), )) = {(q0,ǫ)}

δ(q0, a, a) = {(q0,ǫ)} $,z,z ǫ, #,ǫ q0 q1 q2 δ(q0, +, +) = {(q0,ǫ)}

δ(q0, *, *) = {(q0, ǫ)}
δ(q, x, z) = ∅ everywhere else

We can check that this automaton is deterministic. In particular the 3rd and 4th line cannot overlap because # is not an input symbol.

In contrast to PDAs in general, the two acceptance methods are not equivalent for DPDAs: acceptance by final state makes it possible to define a bigger class of languages. We will consequently always use acceptance by final state for DPDAs in the following.

How does P(G) accept a+(a*a)?

(q0, a+(a*a), E) ⊢ (q0, a+(a*a), E+T)
⊢ (q0, a+(a*a), T+T)
⊢ (q0, a+(a*a), F+T)

9.5 Context-free grammars and push-down automata ⊢ (q0, a+(a*a), a+T ) +(a*a) + Theorem 9.1 For a language L ⊆ Σ∗ the following two statements are equiv- ⊢ (q0, , T ) alent: ⊢ (q0, (a*a), T ) 1. L is given by a CFG G, L = L(G). ⊢ (q0, (a*a), F ) ⊢ (q0, (a*a), (E)) 2. L is the language of a PDA P , L = L(P ). ⊢ (q0, a*a), E))

⊢ (q0, a*a), T))
⊢ (q0, a*a), T*F))
⊢ (q0, a*a), F*F))
⊢ (q0, a*a), a*F))

To summarize: Context-Free Languages (CFLs) can be described by a Context-Free Grammar (CFG) and can be processed by a pushdown automaton. We will here only show how to construct a PDA from a grammar; the other direction is shown in [HMU01] (6.3.2, pp. 241). Given a CFG G = (V, Σ, P, S), we define a PDA

⊢ (q0, *a), *F )) P (G) = ({q0}, Σ, V ∪ Σ, δ, q0, S) ⊢ (q0, a), F )) where δ is defined as follows: ⊢ (q0, a), a)) δ(q0,ǫ,A) = {(q0, α) | A → α ∈ P } for all A ∈ V ⊢ (q0, ), )) δ(q0,a,a) = {(q0,ǫ)} for all a ∈ Σ. ⊢ (q0, ǫ, ǫ) We haven’t given a set of final states because we use acceptance by empty stack. Hence a+(a*a) ∈ L(P (G)). Yes, we use only one state! This above example illustrates the general idea: Take as an example w ∈ L(G) ⇐⇒ S ⇒∗ w G = ({E,T,F }, {(, ), a, +, *}, E, P ) ∗ ⇐⇒ (q0,w,S) ⊢ (q0,ǫ,ǫ) where the set of productions P are given by ⇐⇒ w ∈ L(P (G)) E → T | E + T The automaton we have constructed is very nondeterministic: Whenever we T → F | T * F have a choice between different rules the automaton may silently choose one of F → a | ( E ) the alternatives.
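Since the construction has only one state, its transition function can be written down directly. The following Haskell sketch is not from the notes (the names Prod, grammarG, and delta are inventions); it builds δ of P(G) for the expression grammar above, returning just the strings to be pushed, since the state is always q0.

import Data.Char (isUpper)

type Prod = (Char, String)

-- The expression grammar G: uppercase letters are nonterminals.
grammarG :: [Prod]
grammarG = [ ('E',"T"), ('E',"E+T"), ('T',"F"), ('T',"T*F"), ('F',"a"), ('F',"(E)") ]

-- delta x z: x is Nothing for an epsilon-move, z is the top stack symbol.
delta :: Maybe Char -> Char -> [String]
delta Nothing  z | isUpper z = [ alpha | (a, alpha) <- grammarG, a == z ]  -- expand A -> alpha
delta (Just x) z | x == z, not (isUpper z) = [""]                          -- pop matching terminal
delta _        _ = []                                                      -- stuck otherwise

For instance, delta Nothing 'E' yields ["T","E+T"], the two ways to expand E, and delta (Just 'a') 'a' yields [""], i.e., pop the matching terminal and push nothing.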

79 80 10 Recursive-Descent Parsing If given a string ababa, a top-down parser for this grammar would try to derive this string from the start symbol S and thus proceed as: 10.1 What is parsing? S ⇒ aSa ⇒ abSba ⇒ ababa According to Merriam-Webster OnLine20 parse means: If we draw the derivation tree after each derivation step, we see how the tree To resolve (as a sentence) into component parts of speech and de- gets constructed from the root downwards: scribe them grammatically. S S S S In a computer science context, we take this to mean answering whether or not a S a a S a a S a w ∈ L(G) b S b b S b for a CFG G by analysing the structure of w according to G; i.e. to recognise the language generated by a grammar G. a A parser is a program that carries out parsing. For a CFG, this amounts to a realisation of a PDA. For most practical applications, a parser will also return In contrast, a bottom-up parser would start from the leaves, and step by a structured representation of a word w ∈ L(G). This could be the derivation step group them together by applying productions in reverse: (or parse) tree for the word, or, more commonly, a simplified version of this, a so called Abstract Syntax Tree. Further, for a word not in the language, w∈ / L(G), ababa ⇐ abSba ⇐ aSa ⇐ S a practical parser would normally provide an error message explaining why. An This is a rightmost derivation in reverse. The derivation tree thus gets con- important application of parsers is in the front-end of compilers and interpreters structed from the bottom upwards: where they turn a text-based representation of a program into a structured representation for further analysis and translation or execution. S In this section we study how to systematically construct a parser from a given CFG using the recursive-decent parsing method. To make the discussion S a S a concrete, we implement the parsers we develop in Haskell. Thus you can easily try out the examples in this section yourself using a Haskell system such as S b S b b S b GHCi, and it is recommended that you do so! That said, we are essentially just using a small fragment of Haskell as a notation for writing simple functions, a a a a so it should not be difficult to follow the development even if you are not very The key difficulty of bottom-up parsing is to decide when to reduce. For example, familiar with Haskell. There are plenty of Haskell resources on-line, both for in this case, how does the parser know not to reduce neither the first input 21 learning and for downloading and installing Haskell systems . symbol a nor the second input symbol b to S, but wait until it sees the middle a and only then do the first reduction step? Such questions are answered by LR 10.2 Parsing strategies parsing theory22 We will not consider LR parsing further here, except noting that it is covered in more depth in the Compilers module [Nil16] that builds on There are two basic strategies for parsing: top-down and bottom up. much of the material in this module. Instead we turn our attention to recursive- • A top-down parser attempts to carry out a derivation matching the input decent parsing, which is a type of top-down parsing. starting from the start symbol; i.e., it constructs the parse tree for the input from the root downwards in preorder. 10.3 Basics of recursive-descent parsing • A bottom-up parser tries to construct the parse tree from the leaves up- Recursive-descent parsing is a way to implement top-down parsing. We are just wards by using the productions “backwards”. 
going to focus on the language recognition problem: w ∈ L(G)? This suggests the following type for the parser: Top-down parsing is, in essence, what we have been doing so far whenever we have derived some specific word in a language from the start symbol of a parser :: [Token] -> Bool grammar generating that language. For example, consider the grammar: Token is “compiler speak” for (input) symbol; i.e., an element of the alphabet. S → aSa | bSb | a | b Consider a typical production in some CFG G:

20 http://www.webster.com S → AB 21 http://www.haskell.org, http://learnyouahaskell.com 22 https://en.wikipedia.org/wiki/LR parser

81 82 Let L(X) be the language {w ∈ T ∗ | X ⇒∗ w}, X ∈ N. Note that parseS :: [Token] -> Maybe [Token] G parseS ts =

w ∈ L(S) ⇐ ∃w1, w2 . w = w1w2 case parseA ts of ∧ w1 ∈ L(A) Nothing -> Nothing ∧ w2 ∈ L(B) Just ts’ -> parseB ts’

That is, given a parser for L(A) and a parser for L(B), we can construct a parser This approach is called recursive-descent parsing because the parse func- tions (usually) end up being (mutually) recursive. What does this have to do for L(S) by asking the first parser if a prefix w1 of w belongs to L(A), and then with realising a PDA? Fundamental to the implementation of a recursive com- asking the other parser if the remaining suffix w2 of w belongs to L(B). If the answer two both questions is yes, then w belongs to L(S). putation is a that keeps track of the state of the computation and allows for However, we need to find the right way to divide the input word w! In subcomputations (to any depth). In a language that supports recursive functions general, there are |w| + 1 possibilities. We could, of course, blindly try them all. and procedures, the stack is usually not explicitly visible, but internally, it is a But as the prefix and suffix recursively also have to be split in all possible ways, central datastructure. Thus, a recursive-descent parser is a kind of PDA. and so on until we get down to individual akphabet symbols, it is clear that Let us develop this example a little further into code that can be executed. this approach would lead to a combinatorial explosion that would render such First, for simplicity, let us pick the type Char for token: a parser useless for all but very short words. type Token = Char Instead we need to let the input guide the search. To that end, we initially adopt the following idea: Recall that (basic) strings are just lists of characters in Haskell; that is, String = [Char] = [Token]. Thus, a string literal like "abcd" is just a shorthand notation • Each parser tries to derive a prefix of the input according to the produc- for the list of characters [’a’,’b’,’c’,’d’]. tions for the nonterminal Now, suppose the productions for A and B are the following

• Each parser returns the remaining suffix if successful, allowing this to be A → a passed to the next parser for analysis. B → b This gives us the following refined type for parsers: Thus, in this case, it is clear that if the parsing function parseA for the non- parseX :: [Token] -> Maybe [Token] terminal A sees input starting with an a, then that a is the desired prefix, and whatever remains of the input is the suffix that should be returned as part if Recall that Maybe is Haskell’s option type: the indication of having been able to successfully derive a prefix of the input data Maybe a = Nothing | Just a from the nonterminal in question. Otherwise, if the input does not start with an a, it is equally clear that it is not possible to derive any prefix of the input Of course, we should be a little suspicious: There could be more than one from the nonterminal A, and the parsing function must thus indicate failure. In prefix derivable from a non-terminal, and if so, how can we then know which essence, this is how the input is used to guide the search for the right prefix. one to pick? Picking the wrong prefix might make it impossible to derive the We can implement parseA in Haskell, using pattern matching, as follows: suffix from the non-terminal that follows. We will return to these points later. Now we can construct a parser for L(S) parseA :: [Token] -> Maybe [Token] parseA (’a’ : ts) = Just ts S → AB parseA_ =Nothing in terms of parsers for L(A) and L(B): The case and code for the parsing function parseB for the nonterminal B is of course analogous: parseS :: [Token] -> Maybe [Token] parseS ts = parseB :: [Token] -> Maybe [Token] case parseA ts of parseB (’b’ : ts) = Just ts Nothing -> Nothing parseB_ =Nothing Just ts’ -> Now we can evaluate parseA, parseB, and parseS on "abcd" with the fol- case parseB ts’ of lowing results: Nothing -> Nothing Just ts’’ -> Just ts’’ parseA "abcd" ⇒ Just "bcd" Note that the case analysis on the result of parseB ts’ simply passes on the parseB "abcd" ⇒ Nothing result unchanged. Thus we can simplify the code to: parseS "abcd" ⇒ Just "cd"

83 84 This tells us that a prefix of abcd can be derived from A, leaving a remaining Now consider a more challenging scenario: suffix bcd, that no prefix of abcd can be derived from B, but that a prefix of abcd also can be derived from S, leaving a suffix cd, as we would expect. S → aA | aBA A → aA | ǫ 10.4 Handling choice B → bB | ǫ

Of course, there are usually more than one one production for a nonterminal. In the parsing function parseS for nonterminal S, should parseA or parseB be Thus need a way to handle choice, as in called once a has been read? S → AB | CD We could try the alternatives in order; i.e., a limited form of backtracking: We are first going to consider the case when the choice is obvious, as in parseS (’a’ : ts) = case parseA ts of S → aAB | cCD Just ts’ -> Just ts’ That is, we assume it is manifest from the grammar that we can choose between Nothing -> productions with a one-symbol lookahead. case parseB ts of As an example, let us construct a parser for the grammar: Nothing -> Nothing Just ts’ -> parseA ts’ S → aA | bBA A → aA | ǫ Of course, the choice to try parseA first is arbitrary, a point we will revisit shortly. B → bB | ǫ Similarly, there are two alternatives for the nonterminal A. In fact, we already We are going to need one parsing function for each non-terminal: encountered this situation above, and as we did there, let us try to consume an input prefix (here a) if possible, and only if that is not possible succeed without • parseS :: [Token] -> Maybe [Token] consuming any input: • parseA :: [Token] -> Maybe [Token] parseA :: [Token] -> Maybe [Token] • parseB :: [Token] -> Maybe [Token] parseA (’a’ : ts) = parseA ts parseAts =Justts We again take type Token = Char for simplicity. Code for parseS. Note how the pattern matching makes use of a one-symbol The code for B is of course similar. This may seem like an obvious ordering: lookahead to chose between the two productions for S: after all, if we opted to “try” without consuming any input first, then that would parseS :: [Token] -> Maybe [Token] always succeed, and no other alternatives would ever be tried. Nevertheless, parseS (’a’ : ts) = picking the order we did still amounts to an arbitrary choice, and in fact it is parseA ts not always the right one. parseS (’b’ : ts) = The problem here is that limited backtracking is not an exhaustive search. case parseB ts of For many grammars, there simply is no one order that always will work, meaning Nothing -> Nothing that a parser that nevertheless commits to one particular order is liable to get Just ts’ -> parseA ts’ stuck in “blind alleys”. parseS = Nothing Consider the following grammar The code for parseA and parseB similarly make use of the one-symbol looka- S → AB head to chose between productions, but this time, because A ⇒ ǫ and B ⇒ ǫ, it A → aA | ǫ is not a syntax error if the next token is not, respectively, a and b. Thus both functions can succeed without consuming any input: B → ab parseA :: [Token] -> Maybe [Token] and corresponding parsing functions: parseA (’a’ : ts) = parseA ts parseAts =Justts parseA (’a’ : ts) = parseA ts parseAts =Justts parseB :: [Token] -> Maybe [Token] parseB (’b’ : ts) = parseB ts parseB (’a’ : ’b’ : ts) = Just ts parseBts =Justts parseBts =Nothing

85 86 10.5 Recursive-descent parsing and left-recursion parseS ts = Consider the grammar case parseA ts of A → Aa | ǫ Nothing -> Nothing Just ts’ -> parseB ts’ and the corresponding recursive-descent parsing function: Will it work? Let us try it on ab. Clearly derivable from the grammar: parseA :: [Token] -> Maybe [Token] parseA ts = S ⇒ AB ⇒ B ⇒ ab case parseA ts of Just (’a’ : ts’) -> Just ts’ However, if we run our parser on "ab": _ ->Justts parseS "ab" ⇒ Nothing Any problem? Yes, because the function calls itself without consuming any input, it will loop forever. Our parser thus says “no”. Why? Because The problem here is that the grammar is left-recursive. Recall that a this + parseA "ab" ⇒ Just "b" means that there is a derivation A ⇒ Aα for some nonterminal A. As each derivation step corresponds to one parsing function calling another, it is clear That is, the code for parseA committed to the choice A → a too early and will that recursive-descent parsing functions derived from a left-recursive grammar never try A → ǫ. That was the wrong choice in this case and the parser got will end up looping forever as soon as one of the parsing functions for a left- stuck in a “blind alley”. recursive non-terminal is invoked because no input is consumed before the same Would it have been better to try A → ǫ first? Then the parser would work function is entered again, directly or indirectly. for the word ab, but it would still fail on other words that should be accepted, Recursive-descent parsers thus cannot 23 deal with left-recursive grammars. such as aab. To successfully parse that word, parseA must somehow consume The standard way of resolving this is to transform a left-recursive grammar into the first a but not the second, and neither ordering of the productions for A will an equivalent grammar that is not left recursive as described in section 8.6, and achieve that. then deriving the parser from the non-left-recursive version of the grammar. One principled approach addressing this dilemma is to try all alternatives; i.e., full backtracking (aka list of successes): 10.6 Predictive parsing • Each parsing function returns a list of all possible suffixes. Type: In a recursive-decent parsing setting, we want a parsing function to be success- ful exactly when a prefix of the input can be derived from the corresponding parseX :: [Token] -> [[Token]] nonterminal. This can be achieved by: • Translate A → α | β into • Adopting a suitable parsing strategy, specifically regarding how to handle parseA ts = parseAlpha ts ++ parseBeta ts choice between two or more productions for one nonterminal.

• An empty list indicates no possible parsing. • Impose restrictions on the grammar to ensure success of the chosen parsing strategy. However: Predictive parsing is when all parsing decisions can be made based on a • Full backtracking is computationally expensive. lookahead of limited length, typically one symbol. We have already seen cases where predictive parsing clearly is possible; for example, this is manifestly the • In error reporting, it becomes difficult to pinpoint the exact location of a case when the right-hand side of each possible production starts with a distinct syntax error: where exactly lies the problem if it only after an exhaustive terminal, as here: search becomes apparent that there is no possible way to parse a word? In section 10.6, we are going to look at another principled approach that S → aB | cD avoids backtracking: predictive parsing. The price we have to pay is that the But we also saw that the choice is not always this obvious, and that if we grammar must satisfy certain conditions. But at least we will know statically, then make arbitrary choices regarding which order in which to try productions, at construction time if the parser is going to work or not. And if not, we can try the resulting parser is likely to be flawed. In the following, we are going to to modify the grammar (without changing the language) until the prerequisite conditions are met. First, however, we will consider the problem of left-recursion 23 At least not when implemented in the standard way described here. It is possible, by keep- and context-free grammars. ing track of more contextual information, to detect when no progress is being made and limit the recursion depth at that point. See https://en.wikipedia.org/wiki/Top-down parsing.

87 88 look into exactly when the next input symbol suffices to make all choices. As a parseX (t : ts) = consequence, if we are faced with a grammar were a one-symbol lookahead is | t ∈ first(α) -> parse α not enough, we will know this, and we can take corrective action, such as trying | t ∈ first(β) ∪ follow(X) -> parse β to transform the grammatr in a way that will reolve the problem. |otherwise ->Nothing Before we start, let us just give an example that illustrates that a one-symbol Of course, the branches must be mutually exclusive! Otherwise a one-symbol lookahead can be enough even if the RHSs start with nonterminals: lookahead is not enough to decide which choice to make. S → AB | CD A → a | b 10.6.1 First and follow sets C → c | d We will now develop tease ideas in more detail. The presentation roughly follows “The Dragon Book” [ASU86]. For a CFG G = (N, T, P, S): Here, if the input starts with an a or b, we should clearly attempt to parse by the production S → AB, and if it starts with a c or a d, we should attempt to first(α) = {a ∈ T | α ⇒∗ aβ} parse by A → CD. This suggests that the key is going to be an analysis of the G grammar: for each nonterminal, we need to know what symbols that may start follow(A) = {a ∈ T | S ⇒∗ αAaβ} words derived from that nonterminal. G ∗ More generally, consider productions for a nonterminal X ∪{$ | S ⇒ αA} G X → α | β where α, β ∈ (N ∪ T )∗, A ∈ N, and where $ is a special “end of input” marker. and the corresponding parsing code: To illustrate these definitions, consider the grammar: parseX (t : ts) = S → ABC B → b | ǫ | t ?? -> parse α A → aA | ǫ C → c | d | t ?? -> parse β | otherwise -> Nothing First sets: The questions is, what should the conditions be on the lookahead symbol t, first(C) = {c, d} here indicated by ??, to decide whether or not to parse by the production cor- first(B) = {b} responding to each case? first(A) = {a} The idea of predictive parsing is this: first(S) = first(ABC) • Compute the set of terminal symbols that can start strings derived from = [because A ⇒∗ ǫ and B ⇒∗ ǫ] each alternative, the first set. first(A) ∪ first(B) ∪ first(C) • If there is a choice between two or more alternatives, insist that the first = {a, b, c, d} sets for those are disjoint (a grammar restriction). • The right choice can now be made simply by determining to which alter- Follow sets: native’s first set the next input symbol belongs. follow(C) = {$} We can now refine the code to: follow(B) = first(C)= {c, d} ∗ parseX (t : ts) = follow(A) = [because B ⇒ ǫ] | t ∈ first(α) -> parse α first(B) ∪ first(C) | t ∈ first(β) -> parse β = {b, c, d} | otherwise -> Nothing But the situation could be a bit more involved as it sometimes is possible 10.6.2 LL(1) grammars to derive the empty word from a nonterminal, and the empty word does of course not begin with any symbol at all. For a concrete example, consider again Now consider all productions for a nonterminal A in some grammar: X → α | β, and suppose it can be the case that β ⇒∗ ǫ. A → α1 | α2 | ... | αn Clearly, the next input symbol could in this case be a terminal that can follow a string derivable form X, meaning we need to refine the parsing code In the parsing function for A, on input symbol t, we should parse according to further: αi if

89 90 • t ∈ first(αi). S → ABC | AB B → b | ǫ ∗ • t ∈ follow(A), if αi ⇒ ǫ A → aA | BB C → c | d

Thus, if: • Because B → ǫ is a production, B ∈ Nǫ.

• first(αi) ∩ first(αj )= ∅ for 1 ≤ i < j ≤ n, and • Because A → BB is a production and B ∈ Nǫ, additionally A ∈ Nǫ.

∗ • Because S → AB is a production, and A, B ∈ Nǫ, additionally S ∈ Nǫ. • if αi ⇒ ǫ for some i, then, for all 1 ≤ j ≤ n, j 6= i,

∗ • No more production with nullable RHSs. The set of nullable symbols Nǫ = – αj 6⇒ ǫ, and {S,A,B}. – follow(A) ∩ first(αj )= ∅ then it is always clear what do do! A grammar satisfying these conditions is said 10.6.4 Computing first sets to be an LL(1) grammar. For a CFG G = (N, T, P, S), the sets first(A) for A ∈ N are the smallest sets satisfying: 10.6.3 Nullable nonterminals first(A) ⊆ T In order to compute the first and follow sets for a grammar G = (N, T, P, S), we first need to know all nonterminals A ∈ N such that A ⇒∗ ǫ; i.e. the set first(A) = first(α) A→α ∈ P Nǫ ⊆ N of nullable nonterminals. [ Let syms(α) denote the set of symbols in a string α: For strings, first is defined as (note the overloaded notation): syms ∈ (N ∪ T )∗ →P(N ∪ T ) ∗ syms(ǫ) = ∅ first ∈ (N ∪ T ) →P(T ) syms(Xα) = {X} ∪ syms(α) first(ǫ) = ∅ first(aα) = {a} The set N is the smallest solution to the equation ǫ first(α), if A ∈ N first(Aα) = first(A) ∪ ǫ ∅, if A∈ / Nǫ Nǫ = {A | A → α ∈ P ∧ ∀X ∈ syms(α) .X ∈ Nǫ}  where a ∈ T , A ∈ N, and α ∈ (N ∪ T )∗. Note that A ∈ Nǫ if A → ǫ ∈ P because syms(ǫ) = ∅ and ∀X ∈ ∅ .... is trivially true. Also note that we really need to look for the smallest solution. The solutions can often be obtained directly by expanding out all definitions. If necessary, the equations can be solved by iteration in a similar way to how For example, consider a grammar S → SS | a. Nǫ = {S} is clearly a solution N is computed. Note that the smallest solution to a set equation of the type to the equation defining Nǫ for this grammar, but S is also clearly not nullable. ǫ X = X ∪ Y when there are no other constraints on X is simply X = Y . That is because Nǫ = {S} is not the smallest solution. The smallest solution in Consider (again): this case is Nǫ = ∅; i.e. there are no nullable nonterminals. We can now define a predicate nullable on strings of grammar symbols: S → ABC B → b | ǫ ∗ nullable ∈ (N ∪ T ) → Bool A → aA | ǫ C → c | d nullable(ǫ) = true

nullable(Xα) = X ∈ Nǫ ∧ nullable(α)

The equation for Nǫ can be solved iteratively as follows:

1. Initialize Nǫ to {A | A → ǫ ∈ P }.

2. If there is a production A → α such that ∀X ∈ syms(α) . X ∈ Nǫ, then add A to Nǫ. 3. Repeat step 2 until no further nullable nonterminals can be found. Consider the following grammar:

91 92 First compute the nullable nonterminals: Nǫ = {A, B}. Then the first sets: In general:

first(A) = first(aA) ∪ first(ǫ) X ⊆ Z ∧ Y ⊆ Z ⇐⇒ X ∪ Y ⊆ Z = {a}∪∅ = {a} Also, constraints like ∅ ⊆ X and X ⊆ X are trivially satisfied and can be first(B) = first(b) ∪ first(ǫ) omitted. = {b}∪∅ = {b} The constraints for our example can thus be written as: first(C) = first(c) ∪ first(d) {$} ⊆ follow(S) = {c}∪{d} = {c, d} first(BC) ∪ first(ǫ) ⊆ follow(A) first(S) = first(ABC) first(C) ⊆ follow(B) = [A ∈ Nǫ] first(ǫ) ∪ follow(S) ⊆ follow(C) first(A) ∪ first(BC) Using = [B ∈ Nǫ ∧ C∈ / Nǫ] first(A) ∪ first(B) ∪ first(C) ∪∅ first(ǫ) = ∅ = {a}∪{b}∪{c, d} = {a,b,c,d} first(C) = {c, d} first(BC) = first(B) ∪ first(C) ∪∅ 10.6.5 Computing follow sets = {b}∪{c, d} = {b,c,d}

For a CFG G = (N, T, P, S), the sets follow(A) are the smallest sets satisfying: the constraints can be simplified further: • {$}⊆ follow(S) {$} ⊆ follow(S) • If A → αBβ ∈ P , then first(β) ⊆ follow(B) {b,c,d} ⊆ follow(A) • If A → αBβ ∈ P , and nullable(β) then follow(A) ⊆ follow(B) {c, d} ⊆ follow(B) follow(S) ⊆ follow(C) A, B ∈ N, and α, β ∈ (N ∪ T )∗. (It is assumed that there are no useless symbols; i.e., all symbols can appear in Finally, looking for the smallest sets satisfying these constraints, we get: the derivation of some sentence.) Here is our example grammar again: follow(S) = {$} follow(A) = {b,c,d} S → ABC B → b | ǫ follow(B) = {c, d} A → aA | ǫ C → c | d follow(C) = follow(S)= {$} Constraints for follow(S): 10.6.6 Implementing a predictive parser {$} ⊆ follow(S) Let us now implement a predictive parser for our sample grammar: Constraints for follow(A) (note: ¬nullable(BC)): S → ABC B → b | ǫ first(BC) ⊆ follow(A) A → aA | ǫ C → c | d first(ǫ) ⊆ follow(A) For this grammar, as per the calculations above: follow(A) ⊆ follow(A) • Nullable symbols: Nǫ = {S,A,B} Constraints for follow(B) (note: ¬nullable(C)): • First sets: first(C) ⊆ follow(B) first(S) = {a,b,c,d} Constraints for follow(C) (note: nullable(ǫ)): first(A) = {a} first(ǫ) ⊆ follow(C) first(B) = {b} follow(S) ⊆ follow(C) first(C) = {c, d}

93 94 • Follow sets: parseB (t : ts) = | t ∈ {b} -> Just ts follow(S) = {$} | t ∈ {c, d} -> Just (t : ts) follow(A) = {b,c,d} | otherwise -> Nothing follow(B) = {c, d} Note how the case for b checks that the next input indeed is a b and if so follow(C) = {$} succeeds, consuming that one input symbol, while the ǫ-production again only checks the next input symbol without consuming it. Recall the “template code” for a parsing function for a pair of productions Finally, the productions for C are C → c | d, where both RHSs are trivially like X → α | β. If no RHS is nullable: not nullable. We compute first(c)= {c} and first(d)= {d}. The parsing function for C: parseX (t : ts) = | t ∈ first(α) -> parse α parseC (t : ts) = | t ∈ first(β) -> parse β | t ∈ {c} -> Just ts | otherwise -> Nothing | t ∈ {d} -> Just ts | otherwise -> Nothing If one RHS is nullable, say nullable(β): This can of course be simplified a little, which is the case whenever there are parseX (t : ts) = multiple productions with the RHSs being a single terminal: | t ∈ first(α) -> parse α | t ∈ first(β) ∪ follow(X) -> parse β parseC (t : ts) = |otherwise ->Nothing | t ∈ {c, d} -> Just ts | otherwise -> Nothing We can now implement predictive parsing functions for the nonterminals S, A, B, and C as follows in pseudo Haskell. The function for S is straightforward 10.6.7 LL(1), left-recursion, and ambiguity as there is only one production for S, thus no choice: As we have seen, the LL(1) conditions impose a number of restrictions on a parseS ts = grammar. In particular, no left-recursive or ambiguous grammar can be LL(1)! case parseA ts of Let us prove that a left-recursive grammar cannot be LL(1). Recall that a Just ts’ -> grammar is left-recursive iff there exists A ∈ N and α ∈ (N ∪ T )∗ such that case parseB ts’ of + A ⇒ Aα (section 8.6). We can assume without loss of generality that there are Just ts’’ -> no useless symbols and productions in the grammar as any useless productions parseC ts’’ can be removed from a grammar without changing the language (section 8.2). Nothing -> + ∗ Nothing It thus follows that a derivation A ⇒ w, w ∈ T , must also exist. Noting -> Let us assume that all derivations are leftmost. Clearly, Aα 6= w, and thus Nothing there must have been a choice at some point differentiating these two derivations. That is, there must exist some B ∈ N for which there are at least two distinct For A, there are two productions: A → aA | ǫ. Note that aA is trivially productions B → β1 | β2 such that not nullable, while ǫ trivially is nullable. We thus compute first(aA)= {a} and ∗ ∗ first(ǫ) ∪ follow(A) = ∅∪{b,c,d} = {b,c,d}. The parsing function for A thus A ⇒ Bγ ⇒ β1γ ⇒ Aα becomes: and ∗ ∗ parseA (t : ts) = A ⇒ Bγ ⇒ β2γ ⇒ w | t ∈ {a} -> parseA ts ∗ | t ∈ {b,c,d} -> Just (t : ts) Let us now observe that if there is a derivation α ⇒ β, then first(α) ⊇ | otherwise -> Nothing first(β). Let us also observe that if ¬nullable(α), then first(αβ) = first(α), and if nullable(α), then first(αβ) = first(α)∪first(β). (These should really be proved Note how the case for the ǫ-production only does checking on the next input as auxiliary lemmas, but they are fairly obvious.) symbol, but does not consume it. Now let us consider β1 and β2. If both nullable(β1) and nullable(β2), then For B, there are also two productions: B → b | ǫ. Note that b is trivially that is an immediate violation of the LL(1) conditions, so we need not consider not nullable, while ǫ trivially is nullable. We thus compute first(b) = {b} and that case further. 
Moreover, we can assume w 6= ǫ: if the only terminal string first(ǫ) ∪ follow(B)= ∅∪{c, d} = {c, d}. The resulting parsing function for B: derivable from A is ǫ, then, under the assumption of no useless productions, it

95 96 must be the case that both nullable(β1) and nullable(β2) which again violates Assume that a given grammar is ambiguous and pick two different leftmost the LL(1) conditions. Thus, because A ⇒+ w and w 6= ǫ, we have: derivations for some word w in the langauge of the grammar. Consider the first place where these derivations differ: first(A) ⊇ first(w) 6= ∅ ∗ ∗ S ⇒ a1 ...aiAα ⇒ a1 ...aiβ1α ⇒ a1 ...aiai+1 ...an = w ∗ Suppose ¬nullable(β1) and ¬nullable(β2). Because ¬nullable(β1), β1γ ⇒ Aα, and first(Aα) ⊇ first(A) by definition, we have: and ∗ ∗ S ⇒ a1 ...aiAα ⇒ a1 ...aiβ2α ⇒ a1 ...aiai+1 ...an = w first(β1) = first(β1γ) ⊇ first(Aα) ⊇ first(A) ⊇ first(w) Thus there are two productions A → β1 | β2 in the grammar. Assuming ∗ Because ¬nullable(β2) and β2γ ⇒ w, we have: ai+1 ...an 6= ǫ, we have the following possibilities:

first(β2) = first(β2γ) ⊇ first(w) • ai+1 ∈ first(beta1) and ai+1 ∈ first(beta2), meaning the LL(1) conditions are violated; Thus • One of β or β derives ǫ, implying a ∈ follow(A) and a ∈ first(beta ) first(β1) ∩ first(β2) ⊇ first(w) 6= ∅ 1 2 i+1 i+1 1 or ai+1 ∈ first(beta2), meaning the LL(1) conditions are violated either This proves that the intersection between the first sets of the RHSs of the two way; productions B → β1 | β2 is nonempty, and we have a violation of the LL(1) conditions. • Both β1 or β2 derive ǫ, a direct violation of the LL(1) conditions. Suppose nullable(β1) and ¬nullable(β2). The LL(1) conditions now require ∗ Assuming a ...a = ǫ, then it must be the case that both β1 and β2 are first(β ) ∪ follow(B) and first(β ) to be disjoint. Becasue A ⇒ Bγ (assuming no i+1 n 1 2 nullable, which violates the LL(1) conditions. uselss symbols or productions), follow(B) ⊇ first(γ). It follows that Thus, no ambiguous grammar can satisfy the LL(1) conditions, which means

first(β1) ∪ follow(B) ⊇ first(β1) ∪ first(γ) = first(β1γ) that an ambiguous grammar first must be transformed into an equivalent un- ambiguous grammar if we wish to develop an LL(1) parser for the described ∗ But then, because β1γ ⇒ Aα we have language. first(β γ) ⊇ first(Aα) ⊇ first(A) ⊇ first(w) 1 10.6.8 Satisfying the LL(1) conditions Thus, first(β1) ∪ follow(B) ⊇ first(w). But as before, as ¬nullable(β2), we have Not all grammars satisfy the LL(1) conditions. If we have such a grammar, and first(β2) ⊇ first(w). As first(w) 6= ∅, we can conclude that these two sets are not we wish to develop an LL(1) parser, the grammar first has to be transformed. In disjoint, and therefore the LL(1) conditions are violated. particular, left-recursion and ambiguity must be eliminated because the LL(1) Suppose instead ¬nullable(β1) and nullable(β2). The LL(1) conditions now conditions are necessarily violated otherwise (see section 10.6.7). There is no require first(β1) and first(β2) ∪ follow(B) to be disjoint. Again, follow(B) ⊇ point in computing first and follow sets (for the purpose of constructing a LL(1) first(γ). It follows that parser) before this is done. Of course, transforming a grammar in this way is not always possible: the grammar may be inherently ambiguous, for example first(β2) ∪ follow(B) ⊇ first(β2) ∪ first(γ) = first(β2γ) (section 8.5). But often transformation into an equivalent grammar satisfying ∗ Then, because β2γ ⇒ w we have the LL(1) conditions is possible. Section 8 covered a number of transformations, including ones for eliminating first(β1γ) ⊇ first(w) left recursion and disambiguating grammars. Sometimes other transformations are needed. A common problem is the following. Note that the productions are Thus, first(β2)∪follow(B) ⊇ first(w). But given ¬nullable(β1), we have first(β1) ⊇ not left recursive, nor would these rules in isolation make a grammar ambiguous first(w). As first(w) 6= ∅, we can again conclude that these two sets are not dis- as one of the rules derive a word with the terminal c, and the other one without joint, and therefore the LL(1) conditions are violated. the terminal c: Thus, no left-recursive grammar can satisfy the LL(1) conditions, which means that a left-recursive grammar first must be transformed into an equivalent S → aXbY | aXbY cZ grammar that is not left-recursive if we wish to develop an LL(1) parser for the language described by the grammar. This grammar is clearly not suitable for predictive parsing as the first sets for Let us now prove that no ambiguous grammar can be LL(1). Recall that a the RHSs of both productions are the same, {a}. But it is also clear that the grammar is ambiguous if a single word w can be derived in two (or more) essen- problem in this case is relatively simple: there is a common prefix of the RHSs tially different ways, for example there exist two different leftmost derivations of the two productions. Thus, we can try to postpone the choice by factoring for w. out the common prefix. That could be enough to satisfy the LL(1) conditions.

97 98 Thus, what we need here is left factoring (section 8.4). After left factoring: 10.8 Exercises

′ S → aXbY S Exercise 10.1 S′ → ǫ | cZ Consider the following Context-Free Grammar (CFG): As it turns out, this is now suitable for LL(1) parsing! S → ABB | BBC | CA 10.7 Beyond hand-written parsers: use parser generators A → aA | ǫ B → Bb | ǫ The restriction to LL(1) has a number of disadvantages: In many case a natural C → cC | d grammar like has to be changed to satisfy the LL(1) conditions. This may even be impossible: some context-free languages cannot be generated by any LL(1) S, A, B, and C are nonterminals, a, b, c, and d are terminals, and S is the start grammar. symbol. Luckily, there is a more powerful approach called LR(1). LR(1) is a bottom- 1. What is the set N of nullable nonterminals? Provide a brief justification. up method and was briefly discussed in section 10.2. In particular, in contrast ǫ to LL(1), LR(1) can handle both left-recursive and right-recursive grammars without modification. 2. Systematically compute the first sets for all nonterminals, i.e. first(S), The disadvantage with LR(1) and the related approach LALR(1) (which is first(A), first(B), and first(C), by setting up and solving the equations slightly less powerful but much more efficient) is that it is very hard to construct according to the definitions of first sets for nonterminals and strings of LR-parsers by hand. Hence there are automated tools that get the grammar as grammar symbols. Show your calculations. an input and produce a parser as the output. One of the first of those parser 3. Set up the subset constraint system that defines the follow sets for all generators was YACC for C. Nowadays one can find parser generators for many nonterminals, i.e. follow(S), follow(A), follow(B), and follow(C). Simplify languages such as JAVA CUP for Java [Hud99] and Happy for Haskell [Mar01]. where possible using the law However, there are also very serious tools based on LL(k) parsing technol- ogy, such as ANTLR (ANother Tool for Language Recognition) [Par05], that X ⊆ Z ∧ Y ⊆ Z ⇐⇒ X ∪ Y ⊆ Z overcome some of the problems of basic LL parsing and that provides as much automation as any other parser generator tool. It is true that a grammar still and the fact that constraints like ∅⊆ X and X ⊆ X are trivially satisfied may have to be changed a little bit to parseable, but that is true for LR parsing and can be omitted. too. In practice, once one become familiar with a tool, it is not that hard to 4. Solve the subset constraint system for the follow sets from the previous write a grammar in such a way so as to avoid the most common pitfalls from the question by finding the smallest sets satisfying the constraints. outset. Additionally, most tools provide mechanics such as declarative specifi- cation of disambiguation rules that allows grammars to be written in a natural Exercise 10.2 way without worrying too much about the details of the underlying parsing Consider the following Context-Free Grammar (CFG): technology. Today, when it comes to choosing a tool, the underlying parsing technology is probably less important than other factors such as tool quality, S → AS | AB supported development languages, feature set, etc. A → aA | ǫ B → BCDb | ǫ C → cD | ǫ D → dC | e

S, A, B, C, and D are nonterminals, a, b, c, d, and e are terminals, and S is the start symbol.

1. What is the set Nǫ of nullable nonterminals? Provide a brief justification.

2. Systematically compute the first sets for all nonterminals, i.e., first(S), first(A), first(B), first(C), and first(D), by setting up and solving the equations according to the definitions of first sets for nonterminals and strings of grammar symbols. Show your calculations.

99 100 3. Set up the subset constraint system that defines the follow sets for all 11 Turing Machines nonterminals; i.e., follow(S), follow(A), follow(B), follow(C), and follow(D). Simplify where possible using the law A Turing machine (TM) is a generalization of a PDA that uses a tape instead of a stack. Turing machines are an abstract version of a computer: they have X ⊆ Z ∧ Y ⊆ Z ⇐⇒ X ∪ Y ⊆ Z been used to define formally what is computable. There are a number of alter- native approaches to formalize the concept of computability (for example, the and by removing trivially satisfied constraints such as ∅⊆ X and X ⊆ X. λ-calculus (section 12), µ-recursive functions, the von Neumann architecture, and so on) but they all turn out to be equivalent. That this is the case for any 4. Solve the subset constraint system for the follow sets from the previous reasonable notion of computation is called the Church-Turing Thesis. question by finding the smallest sets satisfying the constraints. On the other side there is a generalization of context-free grammars called phrase structure grammars or just grammars. Here we allow several symbols on the left hand side of a production, e.g. we may define the context in which a rule is applicable. Languages definable by grammars correspond precisely to the ones that may be accepted by a Turing machine and those are called Type-0- languages or the recursively enumerable languages (or semidecidable languages) Turing machines behave differently from the previous machine classes we have seen: they may run forever, without stopping. To say that a language is accepted by a Turing machine means that the TM will stop in an accepting state for each word that is in the language. However, if the word is not in the language the Turing machine may stop in a non-accepting state or loop forever. In this case we can never be sure whether the given word is in the language; i.e., the Turing machine doesn’t decide the word problem. We say that a language is recursive (or decidable), if there is a TM that accepts it and always stop on any word. There are type-0-languages that are not recursive; the most famous one is the halting problem. This is the language of encodings of Turing machines that will always stop. There is no type of grammars that captures all recursive languages (and for theoretical reasons there cannot be one). However there is a subset of recursive languages, called the context-sensitive languages, that can be characterized by context-sensitive grammars. These are grammars where the left-hand side of a production is always shorter than the right-hand side. Context-sensitive lan- guages on the other hand correspond to linear bounded TMs, that use only a tape whose length can be given by a linear function over the length of the input.

11.1 What is a Turing machine?

A Turing machine M = (Q, Σ, Γ, δ, q0, B, F) is:

• A finite set Q of states;

• A finite set Σ of symbols (the alphabet);

• A finite set Γ of tape symbols s.t. Σ ⊆ Γ. This is the case because we use the tape also for the input;

• A transition function

  δ ∈ Q × Γ → {stop} ∪ Q × Γ × {L, R}

  The transition function defines how the machine behaves if it is in state q and the symbol on the tape is x. If δ(q, x) = stop then the machine stops; otherwise, if δ(q, x) = (q′, y, d), the machine gets into state q′, writes y on the tape (replacing x) and moves left if d = L or right if d = R;

• An initial state q0 ∈ Q;

• The blank symbol B ∈ Γ with B ∉ Σ. Initially, only a finite section of the tape containing the input is non-blank;

• A set of final states F ⊆ Q.

In [HMU01] the transition function is defined without an explicit option to stop, with type δ ∈ Q × Γ → Q × Γ × {L, R}. However, they allow δ to be undefined, which corresponds to our function returning stop.

The above defines deterministic Turing machines; for nondeterministic TMs the type of the transition function is changed to:

δ ∈ Q × Γ → P(Q × Γ × {L, R})

In this case, the transition function returning the empty set plays the role of stop. As for finite automata (and unlike for PDAs) there is no difference in the strength of deterministic and nondeterministic TMs.

As for PDAs, we define instantaneous descriptions for Turing machines: ID = Γ* × Q × Γ*. An element (γL, q, γR) ∈ ID describes a situation where the TM is in state q, the non-blank portion of the tape to the left of the head is γL, and the non-blank portion of the tape to the right, including the square under the head, is γR.

We define the next-state relation ⊢M similarly to PDAs:

1. (γL, q, xγR) ⊢M (γLy, q′, γR) if δ(q, x) = (q′, y, R)
2. (γLz, q, xγR) ⊢M (γL, q′, zyγR) if δ(q, x) = (q′, y, L)
3. (ǫ, q, xγR) ⊢M (ǫ, q′, ByγR) if δ(q, x) = (q′, y, L)
4. (γL, q, ǫ) ⊢M (γLy, q′, ǫ) if δ(q, B) = (q′, y, R)
5. (γLz, q, ǫ) ⊢M (γL, q′, zy) if δ(q, B) = (q′, y, L)
6. (ǫ, q, ǫ) ⊢M (ǫ, q′, By) if δ(q, B) = (q′, y, L)

The cases 3 to 6 are only needed to deal with the situation of having reached the end of the non-blank part of the tape.

We say that a TM M accepts a word if it goes into an accepting state; i.e., the language of a TM is defined as

L(M) = {w ∈ Σ* | (ǫ, q0, w) ⊢*M (γL, q′, γR) ∧ q′ ∈ F}

That is, the TM stops automatically if it goes into an accepting state. However, it may also stop in a non-accepting state if δ returns stop; in this case the word is rejected. A TM M decides a language if it accepts it and never loops (in the negative case).

To illustrate, we define a TM M that accepts the language

L = {aⁿbⁿcⁿ | n ∈ N}

This is a language that cannot be recognized by a PDA or be defined by a CFG. Let M = (Q, Σ, Γ, δ, q0, B, F) where

Q = {q0, q1, q2, q3, q4, q5, q6}
Σ = {a, b, c}
Γ = Σ ∪ {X, Y, Z, ␣}
q0 = q0
B = ␣
F = {q6}

and δ is given by:

δ(q0, ␣) = (q6, ␣, R)
δ(q0, a) = (q1, X, R)
δ(q1, a) = (q1, a, R)
δ(q1, Y) = (q1, Y, R)
δ(q1, b) = (q2, Y, R)
δ(q2, b) = (q2, b, R)
δ(q2, Z) = (q2, Z, R)
δ(q2, c) = (q3, Z, R)
δ(q3, ␣) = (q5, ␣, L)
δ(q3, c) = (q4, c, L)
δ(q4, Z) = (q4, Z, L)
δ(q4, b) = (q4, b, L)
δ(q4, Y) = (q4, Y, L)
δ(q4, a) = (q4, a, L)
δ(q4, X) = (q0, X, R)
δ(q5, Z) = (q5, Z, L)
δ(q5, Y) = (q5, Y, L)
δ(q5, X) = (q6, X, R)
δ(q, x) = stop everywhere else

The machine replaces an a by X (q0), then looks for the first b and replaces it by Y (q1), and then looks for the first c and replaces it by a Z (q2). If there are more c's left it moves left to the next a (q4) and repeats the cycle. Otherwise it checks whether there are no a's and b's left (q5) and, if so, goes into an accepting state (q6).

Graphically the machine can be represented by a transition diagram whose edges are labelled by (read-symbol, write-symbol, move-direction); the diagram (omitted here) has states q0, …, q6 with exactly the transitions listed above.
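The next-state relation is in effect an interpreter for instantaneous descriptions, and it can be phrased as a short program. The following Haskell fragment is only an illustrative sketch, not part of the notes: the names step and run, the use of Int for states, Char for tape symbols, a space for the blank, and the convention that the initial state is coded as 0 are all assumptions made for this example.

-- A minimal sketch (not from the notes) of the next-state relation for a
-- deterministic TM, using instantaneous descriptions (gammaL, q, gammaR).
-- The transition function returning Nothing plays the role of "stop".
type State = Int
type Sym   = Char

blank :: Sym
blank = ' '

type Delta = State -> Sym -> Maybe (State, Sym, Char)   -- 'L' or 'R'
type ID    = (String, State, String)                    -- (gammaL, q, gammaR)

-- One step of the relation; cases 3-6 of the definition show up as the
-- handling of an empty gammaL or gammaR, where a blank is read or exposed.
step :: Delta -> ID -> Maybe ID
step delta (gl, q, gr) =
  case delta q x of
    Nothing           -> Nothing                         -- delta says stop
    Just (q', y, 'R') -> Just (gl ++ [y], q', rest)
    Just (q', y, _  ) -> case reverse gl of              -- move left
                           []      -> Just ([],          q', blank : y : rest)
                           (z:gl') -> Just (reverse gl', q', z     : y : rest)
  where
    (x, rest) = case gr of { [] -> (blank, []); (c:cs) -> (c, cs) }

-- Iterate until delta stops; the caller then checks whether the final state
-- is accepting.  Like a real TM, this may of course fail to terminate.
run :: Delta -> String -> ID
run delta w = go ([], 0, w)            -- assumes the initial state is coded as 0
  where go i = maybe i go (step delta i)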

E.g. consider the sequence of IDs on aabbcc:

(ǫ, q0, aabbcc) ⊢ (X, q1, abbcc)
⊢ (Xa, q1, bbcc)
⊢ (XaY, q2, bcc)
⊢ (XaYb, q2, cc)
⊢ (XaYbZ, q3, c)
⊢ (XaYb, q4, Zc)
⊢ (XaY, q4, bZc)
⊢ (Xa, q4, YbZc)
⊢ (X, q4, aYbZc)
⊢ (ǫ, q4, XaYbZc)
⊢ (X, q0, aYbZc)
⊢ (XX, q1, YbZc)
⊢ (XXY, q1, bZc)
⊢ (XXYY, q2, Zc)
⊢ (XXYYZ, q2, c)
⊢ (XXYYZZ, q3, ǫ)
⊢ (XXYYZ, q5, Z␣)
⊢ (XXYY, q5, ZZ␣)
⊢ (XXY, q5, YZZ␣)
⊢ (XX, q5, YYZZ␣)
⊢ (X, q5, XYYZZ␣)
⊢ (XX, q6, YYZZ␣)

We see that M accepts aabbcc. Because M never loops, it does actually decide L.

11.2 Grammars and context-sensitivity

Let us define grammars G = (N, T, P, S) in the same way as context-free grammars were defined before, with the only difference that there may now be several symbols on the left-hand side of a production; i.e., P ⊆ (N ∪ T)⁺ × (N ∪ T)*. Here (N ∪ T)⁺ means that at least one symbol has to be present. The relation derives ⇒G (and ⇒*G) is defined as before:

⇒G ⊆ (N ∪ T)* × (N ∪ T)*
αβγ ⇒G αβ′γ ⇐⇒ β → β′ ∈ P

Also as before, the language of G is defined as:

L(G) = {w ∈ T* | S ⇒*G w}

We say that a grammar is context-sensitive (or type 1) if the right-hand side of a production is at least as long as the left-hand side. That is, for each α → β ∈ P, we have |β| ≥ |α|.

Here is an example of a context-sensitive grammar G = (N, T, P, S) with L(G) = {aⁿbⁿcⁿ | n ∈ N ∧ n ≥ 1} where

• N = {S, B, C}
• T = {a, b, c}
• P is the set of productions:

  S → aSBC
  S → aBC
  aB → ab
  CB → BC
  bB → bb
  bC → bc
  cC → cc

• S is the start symbol

We present without proof:

Theorem 11.1 For a language L ⊆ T* the following are equivalent:
1. L is accepted by a Turing machine M; i.e., L = L(M)
2. L is given by a grammar G; i.e., L = L(G)

Theorem 11.2 For a language L ⊆ T* the following are equivalent:
1. L is accepted by a Turing machine M whose tape length is bounded by a linear function of the length of the input; i.e., |γL| + |γR| ≤ f(|w|) where f(x) = ax + b with a, b ∈ N.
2. L is given by a context-sensitive grammar G; i.e., L = L(G)
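Returning to the example grammar above, here is one possible derivation of the word aabbcc, applying one production at each step (this derivation is not spelled out in the notes, it is added here as a worked example):

S ⇒ aSBC ⇒ aaBCBC ⇒ aaBBCC ⇒ aabBCC ⇒ aabbCC ⇒ aabbcC ⇒ aabbcc

Note how the production CB → BC, which has more than one symbol on its left-hand side, is needed to sort the Bs and Cs before they are rewritten to terminals.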

11.3 The halting problem

Turing showed that there are languages that are accepted by a TM (i.e., type 0 languages) but that are undecidable. The technical details of this construction are involved but the basic idea is simple and is closely related to Russell's paradox, which we have seen in MCS.

Let's fix a simple alphabet Σ = {0, 1}. As computer scientists we are well aware that everything can be coded up in bits, and hence we accept that there is an encoding of TMs in binary. That is, we can agree on a way to express every TM M as a string of bits ⌈M⌉ ∈ {0, 1}*. We assume that the string contains its own length at the beginning, so that we know when the encoding ends. This allows us to put both the TM and an input for it, one after the other, on the same tape: we can determine where the encoding of the machine ends and the subsequent input starts.

Now we define the following language

Lhalt = {⌈M⌉w | M halts on input w}

It is easy (although the details are quite daunting) to define a TM that accepts this language: we just simulate M and accept if M stops. However, Turing showed that there is no TM that decides this language.

Let us prove this by assuming that the language is decidable and deriving a contradiction from it. Suppose there is a TM H that decides Lhalt. That is, when run on a word v, H always terminates, and the final state is accepting if and only if v has the form v = ⌈M⌉w for some machine M and input w, and M terminates when run on w. In all other cases, that is, if v is in such form but M does not terminate on w, or if v is not in that form at all, H will terminate in a rejecting state.

Now using H we construct a new TM F that is a bit obnoxious. When we run F on input x, it computes H on the duplicated input xx. If H says yes (it accepts xx), then F goes into an infinite loop; otherwise, if H says no (it rejects xx), F stops.

What happens if we run F on its own code ⌈F⌉? Will it terminate or loop forever? If we consider each of the two possibilities, we get the contradictory conclusion that the opposite should be true: if we assume that it terminates, we can prove that it must loop; if we assume that it loops, we can prove that it must terminate!

Let us assume F on ⌈F⌉ terminates. By definition of F, this happens only if H applied to ⌈F⌉⌈F⌉ says no. But by definition of H, this means that F on ⌈F⌉ loops, contradicting the assumption.

Let us then assume that F on input ⌈F⌉ loops. This happens only if H applied to ⌈F⌉⌈F⌉ says yes. But this means that F on ⌈F⌉ terminates, again contradicting the assumption.

We reach a contradiction on both possible behaviours of F: the machine F cannot exist. We must conclude that our assumption that there is a TM H that decides Lhalt is false. We say Lhalt is undecidable.

We have shown that there is no Turing machine that can decide whether other Turing machines halt. Is this a specific shortcoming of TMs or a universal limitation of all computing models? Maybe we could find a more powerful programming language that overcomes this problem? It turns out that all computational formalisms (i.e., programming languages) that have actually been implemented are equal in power and can be simulated by each other. Besides TMs, all modern programming languages, the λ-calculus, µ-recursive functions and many others were proved to be equivalent.

The statement that all models of computability are equivalent is called the Church-Turing thesis because it was first formulated by Alonzo Church and Alan Turing in the 1930s. This is discussed further in section 12.6.

11.4 Recursive and recursively enumerable sets

When we work with Turing Machines, there is a distinction between a language being accepted or decided.

A machine M accepts a language L if, whenever we run M on an input w, the computation terminates in an accepting state if and only if w ∈ L. We say that M decides L if it accepts it and always terminates.

The difference is that a machine that just accepts a language but doesn't decide it may run forever on some words that do not belong to the language. If that happens, we have to wait forever to discover whether the word is in the language or not.

Definition 11.3 A language L is recursively enumerable if it is accepted by a Turing Machine.

The terminology comes from a different characterization of such languages. We can define a recursively enumerable set as a collection of values that can be produced by a total algorithm. Whenever we have a computable function f : N → A from the natural numbers to some set A, we say that the range of the function, U = {a ∈ A | ∃n : N, f(n) = a}, is recursively enumerable. We can also write U = {f(n) | n ∈ N}. The idea is that we can enumerate all elements of U by computing f(0), f(1), f(2), etc. In the case of languages, the set A is Σ*. We can prove that being a recursively enumerable subset of Σ* is equivalent to being accepted by a Turing Machine.

Definition 11.4 A language L is recursive if there is a Turing machine that accepts it and always terminates.

In general, a recursive set is a subset of some set A for which we can effectively determine membership. Whenever we have a computable function g : A → Bool from some set A to Booleans, we say that the preimage of true, {a ∈ A | g(a) = true}, is recursive.

Theorem 11.5 A subset U ⊆ A is recursive if and only if both U and its complement are recursively enumerable.

Proof. We won't see a formal proof using Turing Machines, but I'll outline the idea.

• In one direction, assume that U is recursive. So we have a function g : A → Bool that decides it.

  We can construct f : N → A by using some enumeration of all of A (we assume it is computably countable, otherwise it won't make sense to talk about computable functions on it). When computing the values of f we keep an index n; we go through all the elements of A one by one, and whenever we find an a such that g(a) = true, we set f(n) = a and increase n by one, now looking for the next element of A satisfying g. In this way we construct an f that generates all elements of U, so we have proved that U is recursively enumerable.

  In a similar way we can show that the complement of U is recursively enumerable, by systematically searching for the elements a ∈ A for which g(a) = false.

• In the other direction, assume that both U and its complement are recursively enumerable. We must prove that U is recursive.

  Saying that U is recursively enumerable means that there is a computable function f1 : N → A such that U = {f1(n) | n ∈ N}. Saying that the complement of U is also recursively enumerable means that there is a computable function f2 : N → A such that the complement equals {f2(n) | n ∈ N}.

  From f1 and f2 we can construct a function g : A → Bool that decides U. For a given element a ∈ A, we can search whether it belongs to U by running the two functions f1 and f2 repeatedly in parallel:

  – run f1 on 0; if f1(0) = a then we know that a ∈ U and we terminate giving the answer g(a) = true;
  – run f2 on 0; if f2(0) = a then we know that a is in the complement of U and we terminate giving the answer g(a) = false;
  – run f1 on 1; if f1(1) = a then we terminate with answer g(a) = true;
  – run f2 on 1; if f2(1) = a then we terminate with answer g(a) = false;
  – run f1 on 2; if f1(2) = a then we terminate with answer g(a) = true;
  – run f2 on 2; if f2(2) = a then we terminate with answer g(a) = false;
  – continue until you find an n such that either f1(n) = a or f2(n) = a.

  Because every a belongs to either U or its complement, we know for sure that this process will always terminate and produce the correct answer. We have constructed a function g that decides U, so U is recursive. □
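The interleaved search in the second half of this proof is easy to phrase as a program. The following Haskell fragment is only an illustrative sketch, not part of the notes: it assumes total enumerations f1 of U and f2 of its complement which together cover all of A; under these assumptions the search below terminates for every a.

-- Sketch of the decision procedure from Theorem 11.5 (an assumption-laden
-- illustration, not from the notes): given total enumerations f1 of U and
-- f2 of the complement of U, decide membership in U by dovetailing.
decide :: Eq a => (Integer -> a) -> (Integer -> a) -> a -> Bool
decide f1 f2 a = go 0
  where
    go n
      | f1 n == a = True       -- a was enumerated as an element of U
      | f2 n == a = False      -- a was enumerated as an element of the complement
      | otherwise = go (n + 1)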

In the previous section we proved that the Halting Problem is undecidable, therefore the set Lhalt = {⌈M⌉w | M halts on input w} is not recursive. It is easy to see that it is recursively enumerable: we just have to run M and see if it terminates.

There are many other problems that are semi-decidable but not decidable, that is, they can be expressed by a recursively enumerable language that is not recursive. We can prove this by reducing the Halting Problem (or any other already known undecidable problem) to them.
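As an illustration (this example is not in the notes), consider the language Lǫ = {⌈M⌉ | M halts on the empty input}. The Halting Problem reduces to it: given ⌈M⌉w, we can computably construct the machine Mw that first writes w on its tape and then behaves like M; then ⌈M⌉w ∈ Lhalt if and only if ⌈Mw⌉ ∈ Lǫ. By the reasoning made precise in Definition 11.6 and Theorem 11.7 below, Lǫ is therefore undecidable, although it is clearly recursively enumerable.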

Definition 11.6 A language L1 is reducible to a language L2 if there is a Turing Machine M such that:
• M terminates on every input;
• If we run M on an input w, the computation terminates with a word v on the tape such that w ∈ L1 if and only if v ∈ L2.

So L1 is reducible to L2 if we can translate (by a computable function) every question of membership of L1 into an equivalent membership question of L2.

Theorem 11.7 If Lhalt is reducible to some language L, then L is not decidable.

Proof. Suppose, towards a contradiction, that L is recursive. Then we have a Turing Machine M that decides L. We also know that there is a Turing Machine M′ that translates instances of the Halting Problem to L. But then, by composing M′ and M, we would be able to construct a machine H that decides the halting problem. But we know this is impossible. Therefore our assumption that L is recursive must be false. □

In general, if we know that a set V is not recursive and have another set U that we suspect is also not recursive, we can prove this fact by reducing V to U. Be very careful about the direction of the reduction: we reduce the problem that we already know to be undecidable to the one for which we want to prove undecidability.

11.5 Back to Chomsky

At the end of the course we should have another look at the Chomsky hierarchy, which classifies languages based on subclasses of grammars, or equivalently by the different types of automata that recognize them:

[Figure: the Chomsky hierarchy as nested classes: all languages ⊃ Type 0 (recursively enumerable) languages, accepted by Turing machines ⊃ decidable languages ⊃ Type 1 (context-sensitive) languages ⊃ Type 2 (context-free) languages, accepted by pushdown automata ⊃ Type 3 (regular) languages, accepted by finite automata.]

We have worked our way from the bottom to the top of the hierarchy: starting with finite automata, computation with a fixed amount of memory, via pushdown automata (finite automata with a stack), to Turing machines (finite automata with a tape). Correspondingly we have introduced different grammatical formalisms: regular expressions, context-free grammars and grammars.

Note that at each level there are languages that are on the next level but not on the previous: {aⁿbⁿ | n ∈ N} is Type 2 but not Type 3, {aⁿbⁿcⁿ | n ∈ N} is Type 1 but not Type 2, and the Halting Problem is Type 0 but not Type 1.

We could have gone the other way: starting with Turing machines and grammars and then introducing restrictions: Turing machines that only use their tapes as a stack describe Type 2 languages; Turing machines that never use the tape apart from reading the input describe Type 3 languages. Similarly, we have seen that context-free grammars restricted in specific ways describe precisely the regular languages (section 7.3).

Chomsky introduced his hierarchy as a classification of grammars; the relation to automata was only observed a bit later. This may be the reason why he introduced the Type-1 level, which is not so interesting from an automata point of view (unless you are into computational complexity; i.e., resource use, here linear use of memory). It is also the reason why, on the other hand, the decidable languages do not constitute a level: there is no corresponding grammatical formalism (we can even prove this).

11.6 Exercises

Exercise 11.1

Consider the Turing Machine defined formally by M = (Q, Σ, Γ, δ, q0, ␣, F) where

Q = {q0, q1, q2, q3, q4, q5}
F = {q5}
Σ = {a, b}
Γ = {a, b, X, Y, ␣}

with the transition function defined by:

δ(q0, a) = (q3, X, L)
δ(q0, x) = (q0, x, R) for x ∈ {b, X, Y}
δ(q0, ␣) = (q2, ␣, L)
δ(q1, b) = (q4, Y, L)
δ(q1, x) = (q1, x, R) for x ∈ {a, X, Y}
δ(q2, ␣) = (q5, ␣, R)
δ(q2, x) = (q2, x, L) for x ∈ {X, Y}
δ(q3, ␣) = (q1, ␣, R)
δ(q3, x) = (q3, x, L) for x ∈ {a, b, X, Y}
δ(q4, ␣) = (q0, ␣, R)
δ(q4, x) = (q4, x, L) for x ∈ {a, b, X, Y}

1. Draw M graphically as a transition diagram.
2. Write down the sequence of instantaneous descriptions when starting the machine M with input baab. Is this word accepted or rejected?
3. Write down the sequence of instantaneous descriptions when starting the machine M with input aba. Is this word accepted or rejected?
4. What is the language accepted by M?

Exercise 11.2

Construct a Turing Machine that, when run on any word in {a, b}*, rearranges the symbols so all the as come before all the bs. For example:

(ǫ, q0, aababba) ⊢* (aaaabbb, qf, ǫ)    (ǫ, q0, babbbaa) ⊢* (aaabbbb, qf, ǫ)

[Hint: Your machine could look for pairs of symbols in the order 'ba' and swap them to 'ab'; keep repeating this until there are no such pairs left.]

1. Draw the machine as a transition diagram.
2. Give a formal definition of the machine, defining the transition function.

Exercise 11.3

Suppose that you have the following information about ten formal problems (expressed as languages) P1, P2, …, P10:

• The Halting Problem is reducible to P1;
• P2 is reducible to P1;
• P1 is reducible to P5;
• The complement of P3 is recursively enumerable;
• P4 is recursively enumerable;
• P4 is reducible to the normalization problem for λ-calculus;
• P4 is reducible to P5;
• P6 is not recursively enumerable;
• P7 is recursive;
• P7 is reducible to P6;
• P7 is reducible to P8;
• The complement of P8 is not recursively enumerable;
• The normalization problem for λ-calculus is reducible to P9;
• P10 is reducible to P3;
• P10 is reducible to P4.

Note that the normalization problem for λ-calculus (see section 12) is undecidable.

Given this information, which of the languages P1 to P10 are:

1. Undecidable?
2. Recursively enumerable?
3. Recursive?

Note that the same language may be in more than one of these three classes. Some others may be in none. Justify your answers by explaining how your conclusions follow from the given information.

12 λ-Calculus

In traditional imperative programming (like C, Java, Python), we have a clear distinction between programs, which are sequences of instructions that are executed sequentially, and data, which are values given as input, stored in memory, manipulated during computation and returned as output. Programs and data are distinct and are kept separated. Programs are not modified during computation. (It is in theory possible to do it, because programs are stored in memory like any other data. However, it is difficult and dangerous.)

Functional Programming is a different paradigm of computation in which there is no distinction between programs and data. Both are represented by terms/expressions belonging to the same language. Computation consists in the reduction of terms to normal form. That includes terms that represent functions, that is, the programs themselves.

The pure realization of this idea is the λ-calculus. It is a pure theory of functions with only one kind of objects: λ-terms. They represent both data structures and programs.

The main idea is the definition of functions by abstraction. For example, we may define a function f on numbers by saying that f(x) = x² + 3. By this we mean that any argument to the function, represented by the variable x, is squared and added to 3. The use of variables is different from imperative programming: x is just a place-holder to denote any possible value, while in imperative programming variables represent memory locations containing values that can be modified.

We can specify the function f alternatively with the mapping notation:

f : x ⟼ x² + 3.

This is written in λ-notation as: f = λx.x² + 3. (In the functional programming language Haskell, it is \x -> x^2+3.)

While abstraction is the operation to define a new function, computing it on a specific argument is called application. We indicate it simply by juxtaposition:

f 5 = (λx.x² + 3) 5 → 5² + 3 →* 28.

As the example shows, the application of a λ-abstraction to an argument is computed by replacing the abstraction variable with the argument. This is called β-reduction and it is the basic computation step of the λ-calculus.

The λ-notation is convenient to define functions, but you may think that the actual computation work is done by the operations used in the body of the abstraction: squaring and adding 3. However, the λ-calculus is a theory of pure functions: terms are constructed using only abstraction and application, there are no other basic operations. At first, this looks like a rather useless system: no numbers, no arithmetic operations, no data structures, no programming primitives. The surprising fact is that we don't really need them. We don't need numbers (5, 3) and we don't need operations (squaring, +). They can all be defined as purely functional constructions, built using only abstraction and application!

12.1 Syntax of λ-calculus

The language of the λ-calculus is extremely simple: we start with variables and construct terms using only abstraction and application. Its BNF definition is as follows (assume x, y, z range over a given infinite set of variable names):

t := x | y | z | ···    variable names
   | λx.t               abstraction
   | t t                application

The β-reduction relation on terms is defined, for every pair of terms t1 and t2, as:

(λx.t1) t2 →β t1[x := t2].

The right-hand side means: in t1, substitute all occurrences of the variable x with the term t2. Substitution is actually quite tricky and its precise definition is a bit more complex than replacing every occurrence of x with t2: one has to be careful to manage variable occurrences properly.

We need some intermediate concepts. The first is α-equivalence, and it says that, because the variable in an abstraction is just a place holder for an argument, the names of abstracted variables do not matter. For example, the simplest function we can define is the identity id := λx.x, which takes an argument x and returns it unchanged. Clearly, if we use a different variable name, λy.y, we get exactly the same function. We say that the two terms (and any using different variables) are α-equivalent:

λx.x =α λy.y =α λz.z =α ···

We are free to change the name of the abstracted variable any way we like. However, we have to be careful to avoid variable capture. If the body of the abstraction contains other variables than the abstracted one, we can't change the name to those:

λx.y (x z) =α λw.y (w z) ≠α λy.y (y z) ≠α λz.y (z z).

The reason for this restriction is that changing the name of the abstracted variable from x to either y or z in this example would capture the occurrence of that variable, which was free in the original term (not bound by a λ-abstraction).

Formally, we define the set FV(t) of the variables that occur free in the term t by recursion on the structure of t:

FV(x) = {x}
FV(λx.t) = FV(t) \ {x}
FV(t1 t2) = FV(t1) ∪ FV(t2).

Another way in which a variable can be incorrectly captured is when we perform a substitution that puts a term under an abstraction that may bind some of its variables:

(λx.λy.x y) (y z) →β (λy.x y)[x := (y z)] ≠ λy.(y z) y.

If we replace x with (y z) in this way, the occurrence of the variable y (which was free before performing the β-reduction) becomes bound. This is incorrect. We should rename the abstraction variable before performing the substitution:

(λy.x y)[x := (y z)] =α (λw.x w)[x := (y z)] = λw.(y z) w.
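Since Haskell is used elsewhere in these notes, here is a small sketch (not from the notes) of how this syntax could be transcribed into Haskell. The constructor names, the use of String for variables, and the naive "priming" renaming strategy are all choices made just for this illustration; the substitution below already performs the renaming that the Barendregt convention, discussed next, is designed to guarantee.

-- A sketch (not from the notes) of lambda-terms, free variables,
-- capture-avoiding substitution and one beta-reduction step.
import Data.List (union, delete)

data Term = Var String | Lam String Term | App Term Term
  deriving (Show, Eq)

fv :: Term -> [String]
fv (Var x)     = [x]
fv (Lam x t)   = delete x (fv t)
fv (App t1 t2) = fv t1 `union` fv t2

-- Prime a name until it clashes with nothing in 'used'.
fresh :: String -> [String] -> String
fresh x used = if x `elem` used then fresh (x ++ "'") used else x

-- subst t x s  computes  t[x := s]
subst :: Term -> String -> Term -> Term
subst (Var y)     x s | y == x    = s
                      | otherwise = Var y
subst (App t1 t2) x s = App (subst t1 x s) (subst t2 x s)
subst (Lam y t)   x s
  | y == x        = Lam y t                          -- x is shadowed, stop here
  | y `elem` fv s = let y' = fresh y (x : (fv s `union` fv t))
                    in Lam y' (subst (subst t y (Var y')) x s)
  | otherwise     = Lam y (subst t x s)

-- One beta-reduction step at the root: (\x.t1) t2  reduces to  t1[x := t2]
beta :: Term -> Maybe Term
beta (App (Lam x t1) t2) = Just (subst t1 x t2)
beta _                   = Nothing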

To avoid problems with variable capture, we adopt the Barendregt variable convention: before performing substitution (or any other operation on terms), change the names of the abstracted variables so they are different from the free variables and from each other.

With this convention, we can give a precise definition of substitution by recursion on the structure of terms:

x[x := t2] = t2
y[x := t2] = y    if y ≠ x
(λy.t1)[x := t2] = λy.t1[x := t2]
(t0 t1)[x := t2] = t0[x := t2] t1[x := t2].

In the third case, the variable convention ensures that the variable y and the bound variables in t2 have already been renamed so they avoid captures. In more traditional formulations, one would add the requirements: "provided that y ≠ x and y doesn't occur free in t2".

Apart from some complication about substitution, the λ-calculus is extremely simple. It seems at first surprising that we can actually do any serious computation with it at all. But it turns out that all computable functions can be represented by λ-terms. We see some simple functions in this section and we will discover how to represent data structures in the next.

For convenience, we use some conventions that allow us to save on parentheses.

• λ-abstraction associates to the right, so we write λx.λy.x for λx.(λy.x);
• Application associates to the left, so we write (t1 t2 t3) for ((t1 t2) t3);
• We can use a single λ symbol followed by several variables to mean consecutive abstractions, so we write λx y.x for λx.λy.x.

Here are three very simple functions implemented as λ-terms:

• The identity function id := λx.x. When applied to an argument it simply returns it unchanged:

  id t = (λx.x) t → x[x := t] = t.

• The first projection function λx.λy.x. When applied to two arguments, it returns the first:

  (λx.λy.x) t1 t2 → (λy.x)[x := t1] t2 = (λy.t1) t2 → t1[y := t2] = t1.

  Remember, in reading this reduction sequence, that we are adopting the variable convention, so the variable y doesn't occur free in t1.

• The second projection function λx.λy.y. When applied to two arguments, it returns the second:

  (λx.λy.y) t1 t2 → (λy.y)[x := t1] t2 = (λy.y) t2 → y[y := t2] = t2.

  We can also say that the second projection is the function that, when applied to an argument t1, returns the identity function λy.y.

12.2 Church numerals

So far we have seen only some very basic functions that only return some of their arguments unchanged. How can we define more interesting computations? And first of all, how can we represent values and data structures? It is in fact possible to represent any kind of data by some λ-term.

Let's start by representing natural numbers. Their encodings in λ-calculus are called Church Numerals:

0 := λf.λx.x
1 := λf.λx.f x
2 := λf.λx.f (f x)
3 := λf.λx.f (f (f x))
···

A numeral n is a function that takes two arguments, denoted by the variables f and x, and applies f sequentially n times to x.

What is important is that we assign to every number a distinct λ-term in a uniform way. We must choose our representation so it is easy to represent arithmetic operations. The idea of Church numerals is nicely conceptual: numbers are objects that we use to count things, so we can define them as the counters of repeated application of a function.

Let's see if this representation is convenient from the programming point of view: can we define basic operations on it?

Let's start with the successor function, which increases a number by one:

succ := λn.λf.λx.f (n f x).

Let's test if it works on an example: if we apply it to 2 we should get 3:

succ 2 = (λn.λf.λx.f (n f x)) 2
→ λf.λx.f (2 f x) = λf.λx.f ((λf.λx.f (f x)) f x)
→ λf.λx.f ((λx.f (f x))[f := f] x) = λf.λx.f ((λx.f (f x)) x)
→ λf.λx.f ((f (f x))[x := x]) = λf.λx.f (f (f x)) = 3.

We have explicitly marked the substitutions in this reduction sequence: they are both trivial, substituting f with itself and x with itself. From now on, we'll do the substitutions on the fly, without marking them.

Other arithmetic operations can be defined by simple terms:

plus := λm.λn.λf.λx.m f (n f x)
mult := λm.λn.λf.m (n f)
exp := λm.λn.n m.

Verify by yourself that these terms correctly implement the addition, multiplication and exponentiation functions. For example:

plus 2 3 →* 5
mult 2 3 →* 6
exp 2 3 →* 8
exp 3 2 →* 9.

Exercise. Define a term isZero that tests if a Church numeral is 0 or a successor. It should have the following reduction behaviour:

isZero 0 →* true;    isZero (succ t) →* false    for every term t.

Surprisingly, two other basic functions are much more difficult to define: the predecessor and the (cut-off) subtraction functions. Try to define two terms pred and minus such that:

pred (succ n) →* n
pred 0 →* 0
minus n m →* n − m    if n ≥ m
minus n m →* 0        if n < m.

12.3 Other data structures

Other data types can be encoded in the λ-calculus.

Booleans  For truth values we may choose the first and second projections that we defined earlier:

true := λx.λy.x
false := λx.λy.y.

We must show how to compute the logical operators. For example, conjunction can be defined as follows:

and := λa.λb.a b false.

Let's verify that it gives the correct results when applied to Boolean values:

and true true = (λa.λb.a b false) true true →* true true false = (λx.λy.x) true false →* true
and true false = (λa.λb.a b false) true false →* true false false = (λx.λy.x) false false →* false
and false t = (λa.λb.a b false) false t →* false t false = (λx.λy.y) t false →* false

All the other logical operators could be defined if we had a conditional construct if-then-else. In fact this can be defined very easily:

if := λb.λu.λv.b u v.

Let's verify that it has the correct computational behaviour:

if true t1 t2 = (λb.λu.λv.b u v) true t1 t2 →* true t1 t2 = (λx.λy.x) t1 t2 →* t1
if false t1 t2 = (λb.λu.λv.b u v) false t1 t2 →* false t1 t2 = (λx.λy.y) t1 t2 →* t2.

Then we can define all logical connectives as conditionals, for example:

and := λa.λb.if a b false,    or := λa.λb.if a true b,    not := λa.if a false true.

As a curiosity (just a feature of the encoding, don't read anything deeply philosophical into it) notice that false is the true identity!

true id →* false

This is also a good example of the management of variables: true = λx.λy.x and id = λx.x. If we follow the variable convention, we shouldn't use x twice, so let's rename it in the identity: id = λz.z. Then

true id = (λx.λy.x) (λz.z) → λy.λz.z = λx.λy.y = false

where at the end we're free to rename the bound variables according to α-conversion.

Tuples  Pairs of λ-terms can be encoded by a single term: if t1 and t2 are terms, we define the encoding of the pair as

⟨t1, t2⟩ := λx.x t1 t2.

First and second projections are obtained by applying a pair to the familiar projections (or truth values) that we have already seen:

fst p = p (λx.λy.x)
snd p = p (λx.λy.y)

We can verify that they have the correct reduction behaviour:

fst ⟨t1, t2⟩ = ⟨t1, t2⟩ (λx.λy.x) = (λx.x t1 t2) (λx.λy.x) → (λx.λy.x) t1 t2 →* t1,
snd ⟨t1, t2⟩ = ⟨t1, t2⟩ (λx.λy.y) = (λx.x t1 t2) (λx.λy.y) → (λx.λy.y) t1 t2 →* t2.

Triples and longer tuples may be encoded as repeated pairs, for example ⟨t1, t2, t3⟩ := ⟨t1, ⟨t2, t3⟩⟩, or directly using the same idea as for pairs: ⟨t1, t2, t3⟩ := λx.x t1 t2 t3.

12.4 Confluence

A λ-term may contain several redexes. We have the choice of which one to reduce first. When we make one step of β-reduction, some of the redexes that were there in the beginning may disappear, some may be duplicated into many copies, and some new ones may be created. Although we say that β-reduction is a "simplification" of the term, in the sense that we eliminate a pair of consecutive abstraction and application by immediately performing the associated substitution, the resulting reduced term is not always simpler. It may actually be much longer and more complicated.

Therefore it is not obvious that, if we choose different redexes to simplify, we will eventually get the same result. It is also not clear whether the reduction of a term will eventually terminate.

The first property is anyway true. But there are indeed terms whose reduction does not terminate.

Theorem 12.1 (Confluence) Given any λ-term t, if t1 and t2 are two reducts of it, that is t →* t1 and t →* t2, then there exists a common reduct t3 such that t1 →* t3 and t2 →* t3.

Proof. TO BE COMPLETED. □

Definition 12.2 A normal form is a λ-term that doesn't contain any redexes. A term weakly normalizes if there is a sequence of reduction steps that ends in a normal form. A term strongly normalizes if any sequence of reduction steps eventually ends in a normal form.

There exist terms that do not normalize. The most famous one is a very short expression that reduces to itself:

ω := (λx.x x) (λx.x x) → (x x)[x := λx.x x] = (λx.x x) (λx.x x) = ω → ω → ···

There are also terms that grow without bound when we reduce them, for example:

(λx.x x x) (λx.x x x) → (λx.x x x) (λx.x x x) (λx.x x x) → (λx.x x x) (λx.x x x) (λx.x x x) (λx.x x x) → ···

Here is an example of a term that weakly normalizes but doesn't strongly normalize:

(λx.λy.x) (λz.z) ω.

This term applies the first projection function to two arguments. If we immediately reduce the application of the projection, the argument ω disappears and we are left with the identity function, which is a normal form:

(λx.λy.x) (λz.z) ω →* λz.z.

However, if we try to reduce the redex inside the second argument, we don't make any progress and we could continue reducing it forever.

12.5 Recursion

We have seen how to define some basic arithmetic functions as λ-terms: addition plus, multiplication mult, exponentiation exp. With a little effort you may be able to define the predecessor function and subtraction. What about (whole) division, maybe using Euclid's algorithm?

If we want to use the λ-calculus as a complete programming language, there should be a way to define any computable function. This is indeed possible. In fact there is a single λ-term, called the Y combinator, that allows us to use unrestricted recursion:

Y := λf.(λx.f (x x)) (λx.f (x x)).

This definition is inspired by the non-normalizing term ω. In fact we have that Y id →* ω.

The most striking property of Y is that it computes a fixed point for every term F:

Y F →* F (Y F).

This is not exactly true: to be precise, Y F →* (λx.F (x x)) (λx.F (x x)) =: fixF and fixF → F (fixF). If we keep reducing, we get an infinite sequence of applications of F:

fixF → F fixF →* F (F fixF) →* F (F (F fixF)) →* ···

To define a recursive function, we just need to encode a single step of it as a term F and then use Y to iterate that step as many times as is necessary to get a result. Here is, for example, the definition of the factorial:

factstep := λf.λn.if (isZero n) 1 (mult n (f (pred n)))
fact := Y factstep.

12.6 The universality of λ-calculus

The λ-calculus was invented to answer the question: "What does it mean that a problem is effectively solvable?" Up to that point the notion of a question being answerable by a precise method was left to intuition. Throughout the history of mathematics many difficult problems were posed. When someone claimed to have a solution, expert mathematicians would analyze it and come to an informed opinion about whether it was correct and precise.

But at the beginning of the 20th century, David Hilbert had formulated the challenge of defining exactly what we mean by a precise method: can we give a mathematically rigorous definition of what an effective procedure is? Both Alonzo Church and Alan Turing worked towards a realization of Hilbert's challenge and they both came up with a fully satisfying solution.

But their two theories looked completely different. Which one would be the "true" answer to Hilbert's question? It turned out that, beyond their formal difference, Turing machines and λ-terms are equally expressive. The same set of computable functions can be implemented in both. Turing himself proved this when he discovered Church's work, and he added a sketch of the proof as an appendix to his article about computable numbers.

Around the same time, other models of computation were proposed. Kurt Gödel formulated the notion of µ-recursive function: any function on the natural numbers defined by certain forms of recursive definition. Stephen Kleene, a student of Church, proved that these functions are exactly those definable in the λ-calculus and went on to develop a rich theory of computation based on them.

Another of Church's students, J. Barkley Rosser, was the first to clearly formulate the notion that these three definitions were equivalent realizations of the informal notion of an effective computation method: "All three definitions are equivalent, so it does not matter which one is used."

Later, when the first digital computers were constructed, the Hungarian-American mathematician John von Neumann proposed a model that is more realistic and describes the actual architecture of real computers: the register machine. This notion was also equivalent to the previous ones. Many other different characterizations have been proposed since, and they all turned out to be equivalent.

The statement that every effective definition of computable processes is equivalent to the λ-calculus (or to Turing machines or to µ-recursive functions and so on) is known as the Church-Turing Thesis (Stephen Kleene was the one who actually first formulated and named it).

The λ-calculus is not just a mathematical abstraction. All functional programming languages are based on (a typed version of) it. Lisp, ML, Haskell and OCaml are the most well-known. Functional programming has been so successful that traditional imperative languages are starting to introduce function abstraction as a basic feature in their definition. For example, JavaScript and Python contain first-class functions as part of their definition.

12.7 Exercises

Exercise 12.1

Consider the following λ-terms: nand-pair is a function on pairs of Booleans, nand-fun a function from Church Numerals to pairs of Booleans:

nand-pair = λp.⟨not (and (p true) (p false)), p true⟩
nand-fun = λn.n nand-pair ⟨false, false⟩

1. What values do the following terms reduce to?

   nand-pair ⟨true, true⟩ →* ?     nand-pair ⟨false, true⟩ →* ?
   nand-pair ⟨true, false⟩ →* ?    nand-pair ⟨false, false⟩ →* ?

2. Show the steps of reduction in the computation of (nand-fun 4). [You can use the previous reductions and those from the lecture notes as single steps.]

3. Give an informal definition of what nand-fun does: for which numbers n does (nand-fun n) →* ⟨true, true⟩?

Exercise 12.2

Write a λ-term that implements the following function:

thrFib : N → N
thrFib 0 = 0
thrFib 1 = 0
thrFib 2 = 1
thrFib (n + 3) = thrFib n + thrFib (n + 1) + thrFib (n + 2)

[Use the same idea that was used in the lecture to define the Fibonacci numbers: first write an auxiliary function that returns a triple of numbers and then extract the first.]

13 Algorithmic Complexity

The undecidability of the Halting Problem shows that there are tasks that are impossible to accomplish using any computing machine. This is an intrinsic limitation of computation. But even for some problems for which algorithmic solutions are known, it may be impossible to compute them effectively. In some cases, the search for a solution is so complex that it would take a time longer than the age of the universe to get the answer. We can always hope for dramatic improvements in the power and speed of computers, but some tasks seem to have an essentially intractable complexity.

In talking about the time complexity of algorithms, we want to abstract away from implementation details. The exact number of steps needed to compute a result may depend on the model of computation: the same algorithm, realized as a Turing Machine or as a λ-term, will result in different numbers of head operations of the machine and reduction steps of the λ-term. The exact time of the computation will depend on the specific machine on which it runs: even the exact same program in the exact same language will take different times on different computers.

But beyond these differing details, the complexity classes of algorithms are quite constant across different models of computation and different computing devices. A complexity class characterizes algorithms that have a relation between size of the input and time of computation that can be expressed by certain functions. Although complexity classes are traditionally formulated in terms of Turing Machines, we can show that realizations in different paradigms have similar behaviour.

The most important complexity classes are P and NP. P stands for Polynomial Complexity: a problem is in P if it can be solved by a Turing Machine whose running time is at most a polynomial in the size of the input. NP stands for Non-deterministic Polynomial Complexity: a problem is in NP if it can be solved by a Non-deterministic Turing Machine (NTM) whose running time is at most a polynomial in the size of the input. Remember that an NTM may have overlapping transitions: in certain configurations, the machine may have several allowed steps. In every run, it will randomly choose which transition to apply. A word is accepted if there exists one possible computation (among all allowed by the non-deterministic choices) that reaches an accepting state.

There are two alternative and equivalent ways of seeing the class NP. A problem is in NP if it is solvable in polynomial time by a parallel computer which is allowed to spawn several parallel computations. Each computation is independent and must terminate in polynomial time; it is sufficient that the solution is found by one of the computations. Although the running time is polynomial, there may be an exponential blow-up of the number of parallel computations that need to run simultaneously.

The second alternative explanation of the class NP is that a problem is in it if we can verify solutions in polynomial time. There is an algorithm that, when given as input an instance of the problem and a potential solution, terminates in polynomial time and gives a positive answer if and only if the solution is correct.

Intuitively, it seems obvious that a problem that belongs to NP doesn't necessarily belong to P: being able to find a solution in polynomial time by using non-deterministic or parallel computations, or being able to verify a solution in polynomial time, doesn't mean that we can generate a solution deterministically in polynomial time. There may be an exponential number of non-deterministic or parallel computations and an exponential number of potential solutions to verify.

However, until now nobody has managed to prove that P ≠ NP. It is the most famous open problem in theoretical computer science. There are strong reasons to believe that the two classes are distinct. There is a subclass of NP, called NP-complete, that contains problems that are in a sense universal with respect to this issue. A problem is NP-complete if it is in NP and every other problem in NP can be reduced to it in polynomial time. This means that if somebody finds a polynomial-time algorithm to solve one of these problems, then automatically all NP problems will be solvable in polynomial time. To date, thousands of different problems from disparate branches of computer science and mathematics have been proved to be NP-complete. It seems highly unlikely that we could solve them all at once with the same algorithm.

13.1 The Satisfiability Problem

The first problem to be proved NP-complete was Boolean Satisfiability (in short, SAT). An instance of the problem is a propositional formula, that is, an expression that uses variables, the propositional connectives for conjunction (∧), disjunction (∨) and negation (¬), and parentheses. An example is the formula (x1 ∨ ¬x2) ∧ ¬x1.

Solving an instance of SAT means determining if there is an assignment of truth values to the variables that makes the formula true. In the example above the assignment [x1 ↦ false, x2 ↦ false] is such a solution.

Now consider the following more complicated formula in three variables:

f = (¬x1 ∨ x2) ∧ (x3 ∨ ¬x2) ∧ ¬(x3 ∧ ¬x1) ∧ (x1 ∧ ¬x2 ∨ x2 ∧ ¬x3).

There is no assignment of truth values to x1, x2, x3 that makes f true. To check this, we must try all potential solutions:

[x1 ↦ true, x2 ↦ true, x3 ↦ true] ⟹ f ↦ false
[x1 ↦ true, x2 ↦ true, x3 ↦ false] ⟹ f ↦ false
[x1 ↦ true, x2 ↦ false, x3 ↦ true] ⟹ f ↦ false
[x1 ↦ true, x2 ↦ false, x3 ↦ false] ⟹ f ↦ false
[x1 ↦ false, x2 ↦ true, x3 ↦ true] ⟹ f ↦ false
[x1 ↦ false, x2 ↦ true, x3 ↦ false] ⟹ f ↦ false
[x1 ↦ false, x2 ↦ false, x3 ↦ true] ⟹ f ↦ false
[x1 ↦ false, x2 ↦ false, x3 ↦ false] ⟹ f ↦ false

In general, to look for a solution or to verify that there is no solution, we need to check all the possible assignments of truth values to the variables. If there are n variables in the formula, then we must check 2ⁿ assignments. Therefore checking them all takes a time that is exponential in the number of variables, and so exponential in the size of the input. On the other hand, it is straightforward to verify whether a given assignment is a solution or not.

13.2 Time Complexity

Let's define time complexity exactly using Turing Machines.

Definition 13.1 A Turing Machine M is said to have time complexity f, a function N → N, if, whenever we run M on an input of size n, it will take at most f(n) steps to terminate. The number of steps is measured by counting the movements of the reading/writing head of M, as given by the next-state relation ⊢M.

This is a worst-case measure of the running time of the machine: it is possible that M will terminate in fewer than f(n) steps on some (or all) inputs. So f provides just an upper bound. We are interested in getting the answer within a certain time constraint.

As we know, any "problem" can be expressed precisely as a language recognition task. Each instance of the problem is specified by giving a description in the form of a word/list of symbols. So defining complexity classes for problems is the same as defining them for languages.

Definition 13.2 A language L is in the class P if there is a Turing Machine M that decides L and has a time complexity expressible by a polynomial function.

In the previous two definitions, we were talking about deterministic Turing machines: in any given instantaneous description of M, the transition function determines uniquely the step to be performed.

We are also going to consider non-deterministic Turing machines. These may have several possible transitions in the same instantaneous description. The transition function has the type:

δ ∈ Q × Γ → P(Q × Γ × {L, R})

So, if M is in state q and the head is reading a symbol x, the transition δ(q, x) is a set of triples, each one specifying a different action. For example, if δ(q, x) = {(q1, y, L), (q2, z, R)}, then the machine may either write a y, move left and go into state q1, or write a z, move right and go into state q2. When we run the machine, in these situations it will randomly choose which of the several possible steps to perform. We say that such a machine accepts a word if there is one run (among the many possible) that leads to an accepting state.

Definition 13.3 A language L is in the class NP if there is a non-deterministic Turing Machine M that decides L and has a time complexity expressible by a polynomial function. In this case the polynomial bound must apply to all possible runs of M on a given input word.

Equivalently, we can express membership in the class NP using deterministic Turing machines that take as input both a word and a certificate that proves that the word belongs to L. A certificate can be any string of symbols that the machine may interpret as evidence.

Theorem 13.4 A language L is in the class NP if there is a deterministic Turing Machine M and a polynomial function f with the following properties:

• When we run M on a tape containing two inputs w and v (separated by a blank), it always terminates in a number of steps smaller than f(|w|);
• If w ∈ L, then there exists some v such that M gives a positive answer when run on w and v;
• If w ∉ L, then for every v, M always gives a negative answer when run on w and v.

A string v such that M gives a positive answer when run on w and v is a certificate or proof that w is in L. If you think of L as representing some problem and w as an instance of it, then v can be thought of as a solution for w.

For example, given an instance of SAT, a certificate is an assignment of truth values to the variables that makes the formula true. We can define a Turing Machine with polynomial time complexity that, when run on a formula and an assignment of values to the variables, determines whether the formula computes to true under that assignment.

Theorem 13.5 SAT is in NP.

13.3 NP-completeness

Among the NP problems there are some that are universal in the sense that all NP problems can be reduced to them in polynomial time.

Definition 13.6 A language L0 is NP-complete if it is in NP and, for every language L1 in NP, there exists a Turing Machine M that runs in polynomial time with the following property: when we run M on a word w1, it will terminate with the tape containing a word w0 such that w1 ∈ L1 if and only if w0 ∈ L0. We say that L1 is reduced to L0 in polynomial time.

Seeing L0 and L1 as encodings of problems P0 and P1, the definition says that we can turn every problem of P1 into a problem of P0 with the same solution.

The great turning point in the study of the time complexity of algorithms came with the discovery that there exist some NP-complete problems.

Theorem 13.7 (Stephen Cook, 1971) SAT is NP-complete.

Because of Cook's Theorem, many other problems have been shown to be NP-complete. They now run into the thousands. Here's a brief description of some famous ones.

The Travelling Salesman  A salesman must visit n different towns. He has a table giving the distances between any pair of towns. He has a budget that he can use to buy petrol, which will allow him to travel a maximum length. Is there a route around all the towns with length shorter than or equal to that allowed by the budget?

Subset Sum  Given a set of (positive and negative) integers, is there a non-empty subset of it that sums up to zero? This is a special case of the knapsack problem, which consists in maximizing the value of a set of objects that can fit into a knapsack with a limited capacity.

Graph Colouring  Given a graph and a fixed number of colours, is it possible to colour all the nodes of the graph so that two nodes of the same colour are not linked by an edge?

To this day, it is still unknown whether there are NP problems that are not in P. All attempts to find a polynomial algorithm that solves any NP-complete problem have failed. Also all attempts to prove that such an algorithm doesn't exist have failed.

If we were to discover such an algorithm, it would automatically give us a way to solve all NP-complete problems in polynomial time: we can just reduce them all to the one we can solve. For this reason, most mathematicians and computer scientists think that it is impossible, but the question is still open and may be the greatest mystery in computer science:

P = NP ?

13.4 Exercises

Exercise 13.1

This and the next exercise concern programming a SAT solver in Haskell. The first exercise is to write a program that checks whether a proposed solution to an instance of SAT is correct. Use the following type definitions:

data SAT = Var Int | Not SAT | And SAT SAT | Or SAT SAT

type Assignment = [Bool]

SAT represents Boolean formulas with variables Var 0, Var 1, Var 2, Var 3 and so on. An assignment is a list of Booleans, giving the values of some of the variables. For example [True,False,False,True] assigns the values: Var 0 = True, Var 1 = False, Var 2 = False, Var 3 = True. The other variables are left without value.

Write an evaluation function:

evaluate :: SAT -> Assignment -> Bool

It uses an assignment of values to variables to evaluate a formula. (If the formula contains variables with indices larger than or equal to the length of the assignment, you can leave it undefined.)

Exercise 13.2

This exercise concerns writing a program that decides the solvability of an instance of SAT. We will break the task down into several steps.

1. Write a function that gives the highest index of a variable occurring in a formula:

   varNum :: SAT -> Int

2. Write a function that generates all the assignments for all variables up to a given index:

   allAssign :: Int -> [Assignment]

   For example, allAssign 2 should give all the possible assignments for variables Var 0, Var 1 and Var 2:

   allAssign 2 = [[True,True,True], [True,True,False],
                  [True,False,True], [True,False,False],
                  [False,True,True], [False,True,False],
                  [False,False,True], [False,False,False]]

3. Write a function that, for a given formula, verifies if there is one assignment on which the formula evaluates to True:

   satisfiable :: SAT -> Bool

4. Define a function that actually returns the solution, if it exists:

   solution :: SAT -> Maybe Assignment

Exercise 13.3

Discuss informally the complexity properties of the programs that you wrote.

1. Give an informal description of the steps of computation of evaluate and explain why it runs in polynomial time in the length of the input. Use this to argue that SAT is in the class NP.

2. Give an informal description of the steps of computation of satisfiable and explain why it runs in exponential time in the length of the input. (If you managed to write a polynomial-time program, congratulations: you solved P = NP!)

3. Explain what it means that SAT is NP-complete and what relevance this has for the P = NP question.

A Model Answers to Exercises

Answer to Exercise 2.1

1. The language of all words over the alphabet {3, 5, 7, 9} of length at least one and at most two.

2. L = {3, 5, 7, 9, 33, 35, 37, 39, 53, 55, 57, 59, 73, 75, 77, 79, 93, 95, 97, 99}

3. |L1| = Σ_{i=m}^{n} |Σ1|ⁱ

   (Note that the "big sigma" here is the standard arithmetic sum operator.)

   While the answer above is fine, note that this is just a geometric series, for which the sum easily can be stated in closed form. See e.g. http://en.wikipedia.org/wiki/Geometric series, or note the following. Assuming r ≠ 1:

   (Σ_{i=m}^{n} rⁱ) + rⁿ⁺¹ = Σ_{i=m+1}^{n+1} rⁱ + rᵐ = rᵐ + r Σ_{i=m}^{n} rⁱ

   Thus:

   (Σ_{i=m}^{n} rⁱ)(1 − r) = rᵐ − rⁿ⁺¹

   Σ_{i=m}^{n} rⁱ = (rᵐ − rⁿ⁺¹) / (1 − r)

Finally, substituting |Σ1| for r, we conclude

   |L1| = (|Σ1|ᵐ − |Σ1|ⁿ⁺¹) / (1 − |Σ1|)

   when |Σ1| ≠ 1. If |Σ1| = 1, we have |L1| = n − m + 1.

   Strictly speaking, we should also note that neither of the above formulations covers the case when Σ1 = ∅ and m = 0. This is because 0⁰ is undefined. However, ∅⁰ = {ǫ}, which means |L1| = 1 (for any n ≥ 0). But this is a subtle technical point, and it is often the case that authors insist that an alphabet be a non-empty finite set (e.g. [HMU3]), making it kind of a moot point as well. Again, the simple answer |L1| = Σ_{i=m}^{n} |Σ1|ⁱ or something equivalent is fine.

4. |L1| = Σ_{i=3}^{7} |Σ|ⁱ = Σ_{i=3}^{7} 4ⁱ = 64 + 256 + 1024 + 4096 + 16384 = 21824, or

   |L1| = (4³ − 4⁷⁺¹) / (1 − 4) = (64 − 65536) / (−3) = 21824

Answer to Exercise 2.2

1. L3 = {ǫ,b,ac}∪{a,b,ca} = {ǫ,a,b,ac,ca}

2. L4 = {ǫ,b,ac}{ǫ}({a,b,ca}∩{ǫ,b,ac})= {ǫ,b,ac}{ǫ}{b} = {ǫ,b,ac}{b} = {b,bb,acb}

3. L5 = L3∅L4 = ∅L4 = ∅

127 128 Answer to Exercise 2.3 3. δˆA(0,abba) = δˆA(δA(0,a),bba) def. δˆA = δˆ (1,bba) because δ (0,a)=1 1. A A = δˆA(δA(1,b),ba) def. δˆA ˆ L3 = {ǫ,b,bb}∩{a,ab,abc} = δA(3,ba) because δA(1,b)=3 ˆ ˆ = ∅ = δA(δA(3,b),a) def. δA = δˆA(1,a) because δA(3,b)=1 = δˆA(δA(1,a),ǫ) def. δˆA 2. = δˆA(0,ǫ) because δA(1,a)=0 = 0 def. δˆA ∗ L4 = ({a,ab,abc}{ǫ}{ǫ,b,bb}) ∩ Σ = [ {ǫ} is the unit of concatenation of languages ] 4. L(A) contains all words over {a,b} in which the number of a’s is even ({a,ab,abc}{ǫ,b,bb}) ∩ Σ∗ and the number of b’s is odd, or vice versa. But that’s the same as saying = {a,ab,abc,ab,abb,abcb,abb,abbb,abcbb} ∩ Σ∗ all the words over {a,b} containing an odd number of symbols. Which ∗ in turn suggests there is a DFA with fewer states that accepts the same = {a,ab,abc,abb,abcb,abbb,abcbb} ∩ Σ language. (Can you find it?) = [Σ∗ is all possible words over Σ and thus the unit for intersection ] {a,ab,abc,abb,abcb,abbb,abcbb} Answer to Exercise 3.2 We need to count the number of a’s modulo 3, i.e. we need to keep track of whether the remainder when we divide the total number of a’s seen so far by 3. 3 is 0, 1, or 2. Thus we need 3 states. They are named 0, 1, and 2 below, to indicate said remainder. When any symbol other than a is read, the machine L5 = L3∅ ∩ L4 does not change state as the number of a’s seen remain unchanged. 0 should be = [ ∅ is the zero of concatenation of languages ] the accepting state because a remainder of 0 indicates that the number of a’s ∅ ∩ L4 seen is a multiple of 3. Note that 0 is a multiple of 3. Thus the empty string is = ∅ accepted, and the accepting state is thus also the initial state.

b,c,d Answer to Exercise 2.4 L = {ǫ,a,b,aa,ab,ba,bb,bc,aaa,aab,aba,abb,abc,baa,bab,bba,bbb,bbc,bca,bcb}

0 Answer to Exercise 3.1 a DFA A 1. a 1 b,c,d a a 0 1 2 a b b b b a b,c,d 2 3 a

w w ∈ L(A) Answer to Exercise 3.3 2. ǫ no We need to count the number of a’s modulo 2 and the number of b’s modulo b yes 3, i.e. we need to keep track of whether the remainder when we divide the total abaab yes number of a’s seen so far by 2 is 0 or 1, and whether the remainder when we bababbba no divide the total number of b’s seen so far by 3 is 0, 1, or 2. Thus we need 6 states. We name the states 00, 01, 02, 10, 11, and 12 to indicate the number

129 130 of a’s modulo 2 and the number of b’s modulo 3, respectively. (Of course, the names do not really matter, but it is helpful to be systematic and pick names 0 that reflect the meaning of each state.) State 00 is the initial state, and state 10 the one and only accepting state. 0 c 1 1 2 01 3 b 0 0 a 2 4 1 a 3 3 c b 11 2 a 00 10 b b 1 2 1 3 a 3 c b 12 c 2 c 3 2 a 1 a b 02 0 0

c As the question does not specifically ask for a transition diagram, any com- plete representation of a DFA equivalent to the one above is OK: Answer to Exercise 3.4 • Transition diagram (like above) We need one state for each possible remainder when dividing by 5, i.e. 5 states. Let us label each state with the remainder in question. State 0 is thus • Transition table with initial and accepting states clearly indicated both the initial and the only final state. We then just note that if the remainder • As a tuple (Q, Σ, δ, q , F ) with all five components completely defined when dividing the sum n of the symbols seen so far by 5 is r, and the next 0 symbol is i, then the remainder of n + i divided by 5 is just the remainder of However, an overly complicated solution, like too many states, is not acceptable. r + i divided by 5. For the sake of completeness, here is the transition table, with → indicating the start state and ∗ indicating the (in this case only) accepting state:

δD 0 1 2 3 →∗ 0 0 1 2 3 1 1 2 3 4 2 2 3 4 0 3 3 4 0 1 4 4 0 1 2 And finally, mathematically as a tuple:

D = (QD, ΣD,δD, 0, FD) where

QD = {0, 1, 2, 3, 4}

ΣD = {0, 1, 2, 3}

FD = {0}

131 132 and the transition function δD, grouping by state, defined by: or, if you prefer to group by input symbols instead:

δD(0, 0) = 0 δD(0, 0) = 0

δD(0, 1) = 1 δD(1, 0) = 1

δD(0, 2) = 2 δD(2, 0) = 2

δD(0, 3) = 3 δD(3, 0) = 3

δD(1, 0) = 1 δD(4, 0) = 4

δD(1, 1) = 2 δD(0, 1) = 1

δD(1, 2) = 3 δD(1, 1) = 2

δD(1, 3) = 4 δD(2, 1) = 3

δD(2, 0) = 2 δD(3, 1) = 4

δD(2, 1) = 3 δD(4, 1) = 0

δD(2, 2) = 4 δD(0, 2) = 2

δD(2, 3) = 0 δD(1, 2) = 3

δD(3, 0) = 3 δD(2, 2) = 4

δD(3, 1) = 4 δD(3, 2) = 0

δD(3, 2) = 0 δD(4, 2) = 1

δD(3, 3) = 1 δD(0, 3) = 3

δD(4, 0) = 4 δD(1, 3) = 4

δD(4, 1) = 0 δD(2, 3) = 0

δD(4, 2) = 1 δD(3, 3) = 1

δD(4, 3) = 2 δD(4, 3) = 2

Answer to Exercise 3.5 1. (a) ǫ ∈ L(A) (b) aaa ∈ L(A) (c) bbc ∈ L(A) (d) cbc∈ / L(A) (e) abcacb ∈ L(A)

2. Starting from SA = {q0, q1, q3}, the start state of D(A), we compute δˆA(SA, x) for each x ∈ ΣA. Whenever we encounter a state P ⊆ QA of D(A) that has not been considered before, we add P to the table and proceed to tabulate δˆA(P, x) for each x ∈ ΣA. We repeat the process until no new states are encountered. Finally, we identify the initial state (→ to the left of the state) and all accepting states (∗ to the left of the state). Note that a DFA state is accepting iff it contains at least one accepting NFA state (as this means it is possible to reach at least one accepting state on a given word, which means that word is considered to be in the language of the NFA).

133 134 δD(A) a b c b →∗ {q0, q1, q3} {q0, q1, q3}∪∅∪∅ {q0}∪{q1}∪{q4} {q0}∪{q2}∪{q3} = {q0, q1, q3} = {q0, q1, q4} = {q0, q2, q3} ∗ {q0, q1, q4} {q0, q1, q3}∪∅∪∅ {q0}∪{q1}∪∅ {q0}∪{q2}∪∅ = {q0, q1, q3} = {q0, q1} = {q0, q2} D b c ∗ {q0, q2, q3} {q0, q1, q3}∪∅∪∅ {q0}∪∅∪{q4} {q0}∪∅∪{q3} a = {q , q , q } = {q , q } = {q , q } 0 1 3 0 4 0 3 B E ∗ {q0, q1} {q0, q1, q3}∪∅ {q0}∪{q1} {q0}∪{q2} c a = {q0, q1, q3} = {q0, q1} = {q0, q2} a b,c ∗ {q0, q2} {q0, q1, q3}∪∅ {q0}∪∅ = {q0} {q0}∪∅ = {q0} b = {q , q , q } 0 1 3 a ∗ {q0, q4} {q0, q1, q3}∪∅ {q0}∪∅ = {q0} {q0}∪∅ = {q0} a A H b,c = {q0, q1, q3} ∗ {q , q } {q , q , q }∪∅ {q }∪{q } {q }∪{q } 0 3 0 1 3 0 4 0 3 a c b,c = {q , q , q } = {q , q } = {q , q } 0 1 3 0 4 0 3 a {q0} {q0, q1, q3} {q0} {q0} b (Note that we only needed to consider 8 states, a lot fewer than the C F 25 = 32 possible states in this case. 32 − 8 = 24 states are thus not a c b reachable from the initial state.) G Giving simple names to the states resulting from the subset construction can facilitate drawing the transition diagram: c δD(A) a b c

→∗ {q0, q1, q3} = A {q0, q1, q3} = A {q0, q1, q4} = B {q0, q2, q3} = C ∗ {q0, q1, q4} = B {q0, q1, q3} = A {q0, q1} = D {q0, q2} = E Answer to Exercise 3.6 ∗ {q0, q2, q3} = C {q0, q1, q3} = A {q0, q4} = F {q0, q3} = G ∗ {q0, q1} = D {q0, q1, q3} = A {q0, q1} = D {q0, q2} = E 1. Starting from SB = {q0}, the start state of D(B), we compute δˆB(SB , x) ∗ {q0, q2} = E {q0, q1, q3} = A {q0} = H {q0} = H for each x ∈ ΣB. Whenever we encounter a state P ⊆ QB of D(B) that ∗ {q0, q4} = F {q0, q1, q3} = A {q0} = H {q0} = H has not been considered before, we add P to the table and proceed to ∗ {q0, q3} = G {q0, q1, q3} = A {q0, q4} = F {q0, q3} = G tabulate δˆB(P, x) for each x ∈ ΣB. We repeat the process until no new {q0} = H {q0, q1, q3} = A {q0} = H {q0} = H states are encountered. Finally, we identify the initial state (→ to the 3. We can now draw the transition diagram for D(A): left of the state) and all accepting states (∗ to the left of the state). Note that a DFA state is accepting if it contains at least one accepting NFA state (as this means it is possible to reach at least one accepting state on a given word, which means that word is considered to be in the language of the NFA).

δD(B) 0 1

→ {q0} {q2} {q1, q3} {q2} {q0} ∅ {q1, q3} ∅∪{q4} = {q4} {q0}∪∅ = {q0} ∅ ∅ ∅ ∗ {q4} ∅ ∅ (In this case we only needed to consider 5 out of the 25 = 32 possible states. 32 − 5 = 27 states are thus not reachable from the initial state, and we do not need to worry about those.) We can now draw the transition diagram for D(B):

135 136 word state sequence last state 0, 1 10 A, C, E accepting (b) 110010 A, C, A, B, A, C, E accepting ǫ A not accepting 1 {q2} ∅ 1101 A, C, A, B, D not accepting 0 Thus the DFA D(B) behaves like the NFA B at least for these four words, which should give us some reassurance as to the correctness 0 of the answer. {q0} 0, 1 1 Answer to Exercise 3.7 1 0 {q1, q3} {q4} 1. (a) ǫ ∈ L(C)

Accepting states have been marked by outgoing arrows in this case. That (b) aa ∈ L(C) is an alternative to the double circle. (c) bb∈ / L(C) It is often convenient to give simple names to the states resulting from the (d) abcabc∈ / L(C) subset construction as referring to the states by writing out the subsets in full can be a bit long-winded. These names can then be used when (e) abcabca ∈ L(C) drawing the transition diagram: 2. All words over ΣC where the number of a’s is odd or the number of b’s is δD(A) 0 1 divisible by three (or both). → {q } = A {q } {q , q } 0 2 1 3 3. Starting from S = {0, 2}, the start state of D(C), we tabulate δ by {q } = B {q } ∅ C D(C) 2 0 computing the union of δ (q, x) over all q in a state of D(C) for each {q , q } = C ∅∪{q } = {q } {q }∪∅ = {q } C 1 3 4 4 0 0 symbol x in Σ , exploring new states as they emerge. ∅ = D ∅ ∅ C ∗ {q4} = E ∅ ∅

δD(C) a b c 0, 1 →∗ {0, 2} {1}∪{2} = {1, 2} {0}∪{3} = {0, 3} {0}∪{2} = {0, 2} ∗ {1, 2} {0}∪{2} = {0, 2} {1}∪{3} = {1, 3} {1}∪{2} = {1, 2} {0, 3} {1}∪{3} = {1, 3} {0}∪{4} = {0, 4} {0}∪{3} = {0, 3} 1 B D ∗ {1, 3} {0}∪{3} = {0, 3} {1}∪{4} = {1, 4} {1}∪{3} = {1, 3} 0 {0, 4} {1}∪{4} = {1, 4} {0}∪{2} = {0, 2} {0}∪{4} = {0, 4} ∗ {1, 4} {0}∪{4} = {0, 4} {1}∪{2} = {1, 2} {1}∪{4} = {1, 4} 0 A 0, 1 1 (Note that we only needed to consider 6 states. That is a lot fewer than the 25 = 32 possible states in this case. 32 − 6 = 26 states are thus not reachable from the initial state, and we do not need to worry about those.) 1 0 C E

Here the states were named using capital letters. But the choice is arbi- 4. We can now draw the transition diagram for D(C): trary, of course. 2. General hint: It is very easy to make simple mistakes when applying the subset construction. Thus, doing a sanity check on the result along the lines described here is always a good idea, even if you are not explicitly asked to, and even if you only do the check in your head! (a) The words 10 and 110010 are both accepted by the NFA B. The words ǫ and 1101 are both rejected by the NFA B.

137 138 c

L((aa + ǫb∗∅)(b + c)) {0,3} = {L(EF )= L(E)L(F )} b ∗ a L(aa + ǫb ∅)L(b + c) = {L(E + F )= L(E) ∪ L(F )} (twice) a (L(aa) ∪ L(ǫb∗∅))(L(b) ∪ L(c)) b {1,3} c = {L(EF )= L(E)L(F )} (three times) ∗ a (L(a)L(a) ∪ L(ǫ)L(b )L(∅))(L(b) ∪ L(c)) = {L(E∗) = (L(E))∗} {0,2} {1,2} b b (L(a)L(a) ∪ L(ǫ)L(b)∗L(∅))(L(b) ∪ L(c)) a = {L(x)= {x},L(∅)= ∅,L(ǫ)= {ǫ}} c b {1,4} c ({a}{a}∪{ǫ}{b}∗∅)({b}∪{c}) c a = {L∅ = ∅ (twice)} a ({a}{a}∪∅)({b}∪{c}) b = {Set union} {0,4} {a}{a}{b,c} = {Concatenation of languages} {aab, aac} c

Note that D(C) has the same structure as the DFA C above, except that this time each state corresponding to an odd number of a’s or the number Answer to Exercise 4.3 ∗ of b’s being divisible by 3 is an accepting state. Construct an NFA A for (a(b + c)) according to the lecture notes. Start with the innermost subexpressions and then join the NFAs together step by step. (I have named the states according to how they will be named in the final NFA to make it easier to follow the derivation. It is OK to leave states unnamed Answer to Exercise 4.1 to the end.) NFA for a: Note: these are not necessarily the only possibilities, nor necessarily the “simplest”. But they are all fairly simple, and your answers should not be much a more complicated. 0 6 1. (b + c)∗a(b + c)∗ 2. (a + c)∗b(a + c)∗b(a + b + c)∗ 3. (a + b)∗(ǫ + c)(a + b)∗(ǫ + c)(a + b)∗ NFA for b + c 4. (a + b)∗(a + c)∗ b 1 2 5. a∗(ba∗c + ca∗b)a∗ 6. c∗(a + b)((a + b)c∗(a + b)+ c)∗ 7. (a + b + c)∗abba(a + b + c)∗ c 3 4 Answer to Exercise 4.2 Join the above two NFAs to obtain an NFA for a(b + c):

b a 1 2

a 0 6

c a 3 4

139 140 The last step is to carry out the construction corresponding to the ∗-operator. for the subexpression ∅∗. (This is exactly what we should expect as the expres- States 1 and 3 both immediately precede a final state, and we should thus add sion is equivalent to ǫ.) corresponding transition edges from those back to all start states. But there is We can now consider the subexpression (a(∅∗ +b))∗. In addition to the NFA only one start state, 0, so only one edge from each. Additionally, we must not for ∅∗ constructed above, we first need NFAs for the atomic regular expressions forget to add an extra start state which is also final (here state 5) to ensure the a (to the left) and b. We form an NFA for ∅∗ + b by placing the NFAs for ∅∗ NFA accepts ǫ. Finally, state 6 is manifestly now a “dead end” and can thus be and b in parallel (to the right): eliminated: b 1 1 2 a a 2 3 0 b a b c c 4 5 3 4 Join the above two NFAs to obtain an NFA for a(∅∗ + b), keeping in mind that the left DFA does not accept ǫ (“sub-case 1” in the lecture notes): 5 a 1 Note that the isolated state 5 also is part of the same automaton. a 2 3 Answer to Exercise 4.4 States are named as they are introduced to make it easier to follow the b construction, but it is OK to leave states unnamed to the end. a 4 5 First construct an NFA for the subexpression ∅: It is now clear state 3 is a dead end, so it can be removed prior to carrying out the construction for the Kleene closure (not forgetting one extra state for accepting ǫ) leaving us with the following NFA A for (a(∅∗ + b))∗: 0 a a 1 Then construct an NFA for the Kleene closure ∅∗. As there are no accepting states and thus no states immediately preceding any accepting state, there are 2 b no “loop edges” to add. But we do have to add an initial and accepting state b to account for the fact that the automaton must accept the empty word, ǫ: a 4 5

0

6

1 The NFA B for (c + ǫ + ∅) is simply:

It is now clear that state 0 is a dead end, so we can eliminate it already now giving us the NFA

1

141 142 c 1. Straightforward, only the key steps are given. See below for an example 7 8 of a complete proof. If n is the constant of the pumping lemma, consider n m n+m a specific word a b c , where m ≥ 0. This word clearly belongs to L1, and it is sufficiently long to satisfy the requirements of the lemma. Once split into three parts according to the lemma, the middle part is going 9 to be non-empty and it is going to consist of a’s only. If the middle part is repeated, the resulting string is not going to have the right balance of a’s, b’s, and c’s, and thus it is not going to belong to L1.

2. Assume that L2 is a regular language. Then, according to the pumping lemma for regular languages, there exists a constant n such that any 10 word w ∈ L2 that has length at least n (|w|≥ n) can be split into three parts, w = xyz, as follows: It’s clear that state 10 will become a dead end, so it can be removed. 1. y 6= ǫ 2. |xy|≤ n Now, joining A with B (less state 10), while keeping in mind that A can N k accept ǫ which means states 7 and 9 will remain initial states, gives: 3. ∀k ∈ . xy z ∈ L2 11 4n 2n n Consider the word w = a b c . Clearly w ∈ L2. Moreover |w| =7n ≥ a n. The prerequisites of the lemma are thus fulfilled, and we know it is a a c possible to split w into three parts x, y, and z satisfying the conditions of 1 7 8 the lemma. Because our chosen word w starts with 4n a’s, and because a the combined length of the two first parts, x and y, is at most n, we 2 i j b b know that x and y consists solely of a’s. Thus x = a and y = a for some i, j ∈ N such that i + j ≤ n. Furthermore, because y is not empty b a 4 5 9 according to the lemma, we know that j > 0. z is whatever remains of w; i.e., z = a(4n−i−j)b2ncn. k b Now consider words of the form xy z. According to the pumping lemma, these words belong to L2 for any value of k. Let us consider a specific value for k, for example k = 0. The word xy0z should belong to L. But 6 xy0z = xz = aia(4n−i−j)b2ncn = a(4n−j)b2ncn. Because j > 0, it is now clear that there are fewer than twice as many a’s as b’s in this word, It’s now clear that states 1, 5, and 6 are all dead ends, so we can simplify which thus cannot belong to L2. We have reached a contradiction, and ∗ ∗ by removing them and obtain the final NFA for (a(∅ + b)) (c + ǫ + ∅): our assumption that L2 is a regular language must be wrong. Thus L2 is not a regular language, QED. a c a 7 8 Answer to Exercise 7.1 1. (a) a b 2 S ⇒ X by S → X G b ⇒ ǫ by X → ǫ b G a 4 9 11 Note how w is chosen: it is one specific word that obviously is a word in L2, but that depends on the constant n in such a way that the length of w is at least n whatever n is. It is completely wrong to assume that n is some particular value! All we know is that n exists. Also, note that the structure of w was chosen to facilitate the proof. For example, a word Note that there are three initial states: 2, 7, and 9. Finally, it is worth spending n (aaaabbc) is also a word in L2 with length at least n. But in this case, it is not going to be some time to ensure the final NFA makes sense by comparing the words it possible to show that all ways to divide the word into three parts that satisfy the constraints accepts with the words generated by the original regular expression. of the lemma are going to lead to a contradiction. By first establishing that n must be at least 7 (which can be done), we will see that a division of v into x = ǫ, y = aaaabbc, and z = (aaaabbc)n−1 always is a possibility, which in turn means that all words of the form xykz Answer to Exercise 6.1 do belong to L, and we do not get any contradiction.

143 144 or (The following explanation is very detailed to make it easy to follow what S ⇒ Y by S → Y G is going on. An explanation of the key ideas is sufficient for full marks as long ⇒ ǫ by Y → ǫ this explanation reflects a clear understanding of the construction.) G The grammar was constructed as follows. There are three constituent sets When giving derivation sequences in a context-free grammar, it is in the definition of the language L. Call them normally not necessary to justify every single step as it is fairly obvi- m n n m ous which production is being used. The justified derivation sequence LX = {(ab) (bc) (cb) (ba) | m,n ≥ 1} n above were just given for explanatory purposes. Thus, an answer like LY = {d | n ≥ 0} the following is perfectly OK too: n LZ = {d | n ≥ 2} S ⇒ X ⇒ ǫ G G Introduce a non-terminal for each, such that the set is generated by using that or even non-terminal as a start symbol; i.e. the nonterminal X corresponds to the set S ⇒ X ⇒ ǫ LX etc. Then observe that L is the concatenation of the sets LX and LY in union with LZ ; i.e. L = LX LY ∪ LZ . This is captured by the two productions: as it is clear which grammar we are referring to from the context. However, to make it very clear how derivations work, we will give S → XY | Z explicit justifications here. (b) The set LY , i.e. zero, one, or more d’s, is described by the following recursive S ⇒ X by S → X productions: G ⇒ aXb by X → aXb Y → dY | ǫ G ⇒ aaXbb by X → aXb The set LZ is similar, except there has to be at least two d’d. We can obtain G such a set by simply prefixing all words in the set L by two d’s. as follows: ⇒ aabb by X → ǫ Y G Z → ddY (c) S ⇒ Y by S → Y (Obviously, there is nothing wrong by doing it from scratch, not “reusing” the G ⇒ cY d by Y → cY d productions for Y .) G Finally, as the words in LX are strings with a balanced nesting of one pair ⇒ ccY dd by Y → cY d G of substrings (bc and cb) inside another (ab and ba), we need to introduce a ⇒ cccY ddd by Y → cY d “helper” non-terminal to deal with the inner nesting. Keeping in mind that G ⇒ cccddd by Y → ǫ each substring pair should occur at least once, we obtain: G X → abXba | abX1ba 2. No, aaaddd∈ / L(G). From the start symbol S, we can either derive X or X1 → bcX1cb | bccb Y . However, from X it is only possible to derive strings anbn, while from Y it is only possible to derive strings cndn, neither of which are words starting with a’s and ending with d’s. Answer to Exercise 7.3 n n N n n N 3. L(G)= {a b | n ∈ }∪{c d | n ∈ } For φ = X, the only possibility is A = X and α = γ = ǫ. There are two possible production for X, meaning β = aXb in one case and β = ab in the Answer to Exercise 7.2 other. For φ = XY , we can either take α = ǫ, A = X, and γ = Y , or α = X, The following is one possible grammar generating L: A = Y , and γ = ǫ. For the case where A = X there are two possible productions: X → aXb and X → ab, meaning that β = aXb in one case and β = ab in the S → XY | Z other. For the case A = Y there are also two possible productions, Y → bY c X → abXba | abX1ba and Y → ǫ. A similar analysis can be made for φ = aXbY c. For φ = cc we don’t X1 → bcX1cb | bccb get any possibilities, because there is no nonterminal symbol in the word, and Y → dY | ǫ thus no way to chose a production in P . For θ = a, finally, we note there is no Z → ddY production that yields a single a, and only one production that yields ǫ, Y → ǫ. Thus the only ways to directly derive a single a is from strings Ya and aY . 
We S, X, X1, Y , and Z are nonterminal symbols, S is the start symbol, and a, b, c, and d are terminal symbols.

145 146 summarize in the following table: Answer to Exercise 7.4

φ = αAγ α A γ A → β θ = αβγ 1. (a) Left-most derivation: X ǫ X ǫ X → aXb aXb T ⇒ F ⇒ P ⇒ (T ) ⇒ (F ) ⇒ (P ) ⇒ (I) ⇒ (DI) ⇒ (7I) ⇒ (7DI) X ǫ X ǫ X → ab ab ⇒ (78I) ⇒ (78D) ⇒ (789) XY ǫ X Y X → aXb aXbY XY ǫ X Y X → ab abY Sometimes left-most derivation steps are explicitly indicated by an XY X Y ǫ Y → bY c XbYc lm subscript as follows: XY X Y ǫ Y → ǫ X T ⇒ F ⇒ P ⇒ (T ) ⇒ ... ⇒ (789) aXbY c a X bY c X → aXb aaXbbY c lm lm lm lm lm aXbY c a X bY c X → ab aabbY c Note that if the goal is just to derive a word, any derivation will aXbY c aXb Y c Y → bY c aXbbY cc do, not just a left-most one (unless you’ve been explicitly asked to aXbY c aXb Y c Y → ǫ aXbc provide a left-most derivation as here, of course). However, being Ya ǫ Y a Y → ǫ a systematic and sticking to e.g. left-most derivations where possible aY a Y ǫ Y → ǫ a can help in avoiding mistakes. Anyway, to illustrate that there are other possibilities, here is a right-most derivation of the same word: The pairs in question are thus T ⇒ F ⇒ P ⇒ (T ) ⇒ (F ) ⇒ (P ) ⇒ (I) ⇒ (DI) (X,aXb), (X,ab), (XY,aXbY ), (XY,abY ), (XY,XbYc), (XY,X), rm rm rm rm rm rm rm (aXbY c,aaXbbY c), (aXbY c,aabbY c), (aXbY c,aXbbY cc), (aXbY c,aXbc), ⇒ (DDI) ⇒ (DDD) ⇒ (DD9) ⇒ (D89) ⇒ (789) rm rm rm rm rm (Ya,a), (aY,a) (b) Left-most derivation: T ⇒ T + T ⇒ F + T ⇒ P + T ⇒ I + T ⇒ D + T ⇒ 7+ T ⇒ 7+ F ⇒ 7+ F ∗ F ⇒ 7+ P ∗ F ⇒ 7+ N(A) ∗ F ⇒ 7+ g(A) ∗ F ⇒ 7+ g(T ) ∗ F ⇒ 7+ g(F ) ∗ F ⇒ 7+ g(F ∗ F ) ∗ F ⇒ 7+ g(P ∗ F ) ∗ F ⇒ 7+ g(I ∗ F ) ∗ F ⇒ 7+ g(D ∗ F ) ∗ F ⇒ 7+ g(3 ∗ F ) ∗ F ⇒ 7+ g(3 ∗ P ) ∗ F ⇒ 7+ g(3 ∗ I) ∗ F ⇒ 7+ g(3 ∗ D) ∗ F ⇒ 7+ g(3 ∗ 5) ∗ F ⇒ 7+ g(3 ∗ 5) ∗ P ⇒ 7+ g(3 ∗ 5) ∗ (T ) ⇒ 7+ g(3 ∗ 5) ∗ (F ) ⇒ 7+ g(3 ∗ 5) ∗ (P ) ⇒ 7+ g(3 ∗ 5) ∗ (N(A)) ⇒ 7+ g(3 ∗ 5) ∗ (f(A)) ⇒ 7+ g(3 ∗ 5) ∗ (f())

(c) It is not possible to derive 1 + 2 ∗ 3). There are only two productions that introduce parentheses, and they always introduce them as bal- anced pairs. Thus it is not possible to derive a string with a single unmatched parenthesis as in this case. (d) It is not possible to derive 1 + 7(9) because a number cannot be followed directly by a left (opening) parenthesis. The substring 7(9) must have been derived from T . There are only two productions that introduce parentheses. The production P → N(A) cannot have been used to derive 7(9) because, although P can be derived from T , 7 cannot be derived from N. The other possibility is the production P → (T ). But, as numbers are only derivable via I, this would only be possible if a word IP can be derived form T . This in turn is not possible as the I ultimately has to be derived from a T , and inspection then shows that the only terminals that can follow an I are +, ∗, and ).

147 148 2. Derivation tree: 4. Delete all old productions for A and add the following productions: T A → TL | ǫ T + T L → ,TL | ǫ T + T F Here, L is a new nonterminal symbol, and “,” is a new terminal symbol. F F P The productions for L generate argument lists of zero, one or more ar- P P I guments (any expression derivable from T ), each preceded by a comma. The production A → TL thus generates argument lists of one or more I ( T ) D arguments separated by commas, while the production A → ǫ takes care D F 9 of the case of zero arguments. 7 F ∗ F P P Answer to Exercise 8.1 The problem is that the grammar does not impart any associativity on the I N ( A ) operators + and ∗. Let us make them left-associative to address this. (Mak- D h T ing them right-associative would also work, but we cannot make them non- 8 F associative as that would change the language; e.g. the example word 7+(8 ∗ h(1)) + 9 would no longer belong to the language as non-associativity means P we have to use explicit bracketing.) We make those operators left-associative by I making the corresponding productions left-recursive: D T → T + F | F 1 F → F ∗ P | P 3. Another derivation tree: T Answer to Exercise 8.2 T + T 1. The following CFG GE is an unambiguous grammar satisfying the re- F T + T quirements. GE = (N,T,P,S) where:

P F F • N = {E1, E2, E3, E4, EP ,I,IT ,D,D1} I P P • T = {(, ), <, ⊕, ⊗, ↑, −, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9} D ( T ) I • P is given by: 7 F D E1 → E2 < E2 | E2

F ∗ F 9 E2 → E2 ⊕ E3 | E3

P P E3 → E3 ⊗ E4 | E4 I N ( A ) E4 → EP ↑ E4 | EP D h T EP → I | (E1) I → 0 | D I |−D I 8 F 1 T 1 T IT → DIT | ǫ P D → 0 | D1 I D1 → 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 D • S = E1 1 The fact that there are two different derivation trees for one word implies 2. Derivation tree for 42 < 0 ⊗−10 ⊗ (1 ⊕ 7) ↑ 2: that the grammar is ambiguous.

149 150 • Remaining productions: E1 Y → cY | dY | e E < E 2 2 Then, continuing with the general example, the productions for a nonter- E E minal A for which there are immediately left-recursive productions need to be 3 3 replaced with new productions for A and productions for A′, where A′ is a new name, as follows: E4 E3 ⊗ E4 ′ ′ A → β1A | ... | βnA EP E3 ⊗ E4 EP ↑ E4 ′ ′ ′ A → α1A | ... | αmA |ǫ

I E4 EP ( E1 ) EP In our case, this transformation needs to be applied to the productions for the nonterminals S and X D I E I E I 1 T P 2 S → XbSS′ | aS′ ′ ′ 4 D IT I − D1 IT E2 ⊕ E3 D1 IT S → aS | ǫ X → YYYX′ | YYXX′ D ǫ 0 1 D I E E 2 ǫ 1 T 3 4 X′ → XXX′ | YYX′ | ǫ

2 0 ǫ E4 EP Y → cY | dY | e Note in particular how the production X → XXX was transformed. The E I P transformation rule specifies that the right-hand side should be split after the first X, meaning that the “α-part” is XX in this case. It might be confusing I D I 1 T that this still starts with an X, but that is what the transformation rule says, and it is fairly easy to see why with a bit of thought. Then the XX is used to D1 IT 7 ǫ construct one of the right-recursive productions for the new nonterminal X′. 1 ǫ Answer to Exercise 10.1

Answer to Exercise 8.3 1. Nǫ = {S,A,B}. A is nullable because A → ǫ is a production. B is First identify the nonterminals for which there are immediately left-recursive nullable because B → ǫ is a production. S is nullable because S → ABB productions. Then group the productions for each such non-terminal into two is a production and both A and B are nullable. C is not nullable because groups: one where each RHS starts with the nonterminal in question (the im- the RHSs of all productions for C include a terminal (c or d), meaning mediately left-recursive terminals), and one where they don’t: it is clear ǫ cannot be derived from C. 2. Keeping in mind which non-terminals are nullable, we obtain the follow- A → Aα1 | ... | Aαm ing equations: A → β | ... | β 1 n first(A) = first(aA) ∪ first(ǫ) In our case, there are immediately left-recursive productions for S and X in = {a}∪∅ the given grammar. Grouping the productions as required yields: = {a} • Grouping of productions for S: S → Sa first(B) = first(Bb) ∪ first(ǫ) S → XbS | a = (first(B) ∪ first(b)) ∪∅ = first(B) ∪ first(b)

• Grouping of productions for X: first(C) = first(cA) ∪ first(d) X → XXX | XY Y = first(c) ∪ first(d) X → Y Y Y | YYX = {c}∪{d} = {c, d}

151 152 The solutions of the equations for first(A) and first(C) are manifest. As (note: nullable(B) and nullable(ǫ)): to the equation for first(B), we need only observe that it has the form X = X ∪ Y and that there are no further constraints on first(B). The first(B) ⊆ follow(B) smallest solution to such an equation is simply X = Y , so first(B)= {b}. follow(S) ⊆ follow(B) Now we can turn to setting up and solving the equation for first(S), again first(ǫ) ⊆ follow(B) keeping in mind which non-termainals are nullable: follow(S) ⊆ follow(B) first(S) = first(ABB) ∪ first(BBC) ∪ first(CA) first(BC) ⊆ follow(B) = (first(A) ∪ first(B) ∪ first(B) ∪ first(ǫ)) first(C) ⊆ follow(B) ∪ (first(B) ∪ first(B) ∪ first(C)) first(b) ⊆ follow(B) ∪ first(C) Constraints for follow(C) from the productions where C occurs in the = (first(A) ∪ first(B) ∪∅) RHS, i.e. ∪ (first(B) ∪ first(C)) ∪ first(C) S → BBC = ({a}∪{b}) ∪ ({b}∪{c, d}) ∪ ({c, d}) S → CA = {a,b,c,d} C → cC

Thus, we obtained a solution directly. (note: nullable(A) and nullable(ǫ)): 3. Note: very detailed account below for clarity. It is sufficient to just state first(ǫ) ⊆ follow(C) the constraints according to the definitions and then simplify. follow(S) ⊆ follow(C) Constraints for follow(S): first(A) ⊆ follow(C) {$} ⊆ follow(S) follow(S) ⊆ follow(C) Constraints for follow(A) from the productions where A occurs in the first(ǫ) ⊆ follow(C) RHS, i.e. follow(C) ⊆ follow(C)

S → ABB Using S → CA first(ǫ) = ∅ A → aA first(A) = {a} (note: nullable(BB) and nullable(ǫ)): first(B) = {b} first(C) = {c, d} first(BB) ⊆ follow(A) first(BB) = first(B) ∪ first(B) ∪ first(ǫ) follow(S) ⊆ follow(A) = {b}∪{b}∪∅ = {b} first(ǫ) ⊆ follow(A) first(BC) = first(B) ∪ first(C) ∪∅ follow(S) ⊆ follow(A) = {b}∪{c, d} = {b,c,d} first(ǫ) ⊆ follow(A) follow(A) ⊆ follow(A) and eliminating trivial constraints yields:

Constraints for follow(B) from the productions where B occurs in the RHS, i.e.

S → ABB S → BBC B → Bb

153 154 {$} ⊆ follow(S) 2. Keeping in mind which non-terminals are nullable, we obtain the follow- ing equations:

{b} ⊆ follow(A) first(A) = first(aA) ∪ first(ǫ) follow(S) ⊆ follow(A) = {a}∪∅ = {a} {b} ⊆ follow(B) follow(S) ⊆ follow(B) first(B) = first(BCDb) ∪ first(ǫ) {b,c,d} ⊆ follow(B) = (first(B) ∪ first(CDb)) ∪∅ = first(B) ∪ (first(C) ∪ first(Db)) follow(S) ⊆ follow(C) = first(B) ∪ first(C) ∪ (first(D) ∪∅) {a} ⊆ follow(C) = first(B) ∪ first(C) ∪ first(D) This is equivalent to first(C) = first(cD) ∪ first(ǫ) {$} ⊆ follow(S) = {c}∪∅ {b} ∪ follow(S) ⊆ follow(A) = {c} {b} ∪ follow(S) ∪{b,c,d} ⊆ follow(B) follow(S) ∪{a} ⊆ follow(C) first(D) = first(dC) ∪ first(e) which can be further simplified to the final constraints: = {d}∪{e} = {d, e} {$} ⊆ follow(S) {b} ∪ follow(S) ⊆ follow(A) The solutions of the equations for first(A), first(C), and first(D) are manifest. Recall that an equation of the form X = X ∪ Y , in the absence {b,c,d} ∪ follow(S) ⊆ follow(B) of other constraints on X, simplifies to X = Y when we are looking for {a} ∪ follow(S) ⊆ follow(C) the smallest solution. The equation for first(B) has the form X = X ∪ Y and there are no other constraints on first(B). The smallest solution is thus given by first(B) = first(C) ∪ first(D)= {c}∪{d, e} = {c,d,e}. 4. The smallest set satisfying the constraint for follow(S) is obviously just Now we can turn to setting up and solving the equation for first(S), again {$}. Substituting this into the remaining constraints makes the smallest keeping in mind which non-termainals are nullable: sets satisfying those obvious too. Thus: first(S) = first(AS) ∪ first(AB) follow(S) = {$} = (first(A) ∪ first(S)) ∪ (first(A) ∪ first(B)) follow(A) = {b}∪{$} = {b, $} = first(S) ∪ first(A) ∪ first(B) follow(B) = {b,c,d}∪{$} = {b, c, d, $} = first(S) ∪{a}∪{c,d,e} follow(C) = {a}∪{$} = {a, $} = first(S) ∪{a,c,d,e}

Again, an equatiom of the form X = X∪Y , with no further constraints on Answer to Exercise 10.2 first(S), meaning that the smallest solution is simply first(S)= {a,c,d,e}.

1. Nǫ = {S,A,B,C}. A is nullable because A → ǫ is a production. B is nullable because B → ǫ is a production. C is nullable because C → ǫ is 3. Note: very detailed account below for clarity. It is sufficient to just state a production. S is nullable because S → AB is a production and both the constraints according to the definitions and then simplify. A and B are nullable. D is not nullable because the right-hand sides of Constraints for follow(S). Note that S only appear in one RHS, of the all productions for D include a terminal (d or e), meaning it is clear ǫ production S → AS, where it appears last; i.e. the string following S is cannot be derived from D. just ǫ, and by definition we have nullable(ǫ). The constraints for S are

155 156 thus: (note: ¬nullable(b) and nullable(ǫ)): {$} ⊆ follow(S) first(b) ⊆ follow(D) first(ǫ) ⊆ follow(S) first(ǫ) ⊆ follow(D) follow(S) ⊆ follow(S) follow(C) ⊆ follow(D) Constraints for follow(A) follow from the productions where A occurs in Using the RHS, i.e. first(ǫ) = ∅ S → AS first(S) = {a,c,d,e} S → AB first(B) = {c,d,e} A → aA first(CDb) = first(C) ∪ first(D) (note: nullable(S), nullable(B), and nullable(ǫ)): = {c}∪{d, e} = {c,d,e} first(Db) = first(D)= {d, e} first(S) ⊆ follow(A) first(b) = {b} follow(S) ⊆ follow(A) first(B) ⊆ follow(A) and eliminating trivial constraints (of the types ∅ ⊆ X and X ⊆ X) yields: follow(S) ⊆ follow(A) first(ǫ) ⊆ follow(A) {$} ⊆ follow(S) follow(A) ⊆ follow(A) {a,c,d,e} ⊆ follow(A) Constraints for follow(B) follow from the productions where B occurs in follow(S) ⊆ follow(A) the RHS, i.e. {c,d,e} ⊆ follow(A) S → AB B → BCDb follow(S) ⊆ follow(B) (note: nullable(ǫ), ¬nullable(CDb)): {c,d,e} ⊆ follow(B) first(ǫ) ⊆ follow(B) {d, e} ⊆ follow(C) follow(S) ⊆ follow(B) follow(D) ⊆ follow(C) first(CDb) ⊆ follow(B) {b} ⊆ follow(D) Constraints for follow(C) follow from the productions where C occurs in follow(C) ⊆ follow(D) the RHS, i.e. Noting that follow(C) ⊆ follow(D) ∧ follow(D) ⊆ follow(C) implies B → BCDb follow(C) = follow(D), this is equivalent to: D → dC {$} ⊆ follow(S) (note: ¬nullable(Db) and nullable(ǫ)): {a,c,d,e} ∪ follow(S) ∪{c,d,e} ⊆ follow(A) first(Db) ⊆ follow(C) follow(S) ∪{c,d,e} ⊆ follow(B) first(ǫ) ⊆ follow(C) {d, e}∪{b} ⊆ follow(C) = follow(D) follow(D) ⊆ follow(C) which can be further simplified to the final constraints: Constraints for follow(D) follow from the productions where D occurs in {$} ⊆ follow(S) the RHS, i.e. {a,c,d,e} ∪ follow(S) ⊆ follow(A) B → BCDb {c,d,e} ∪ follow(S) ⊆ follow(B) C → cD {b,d,e} ⊆ follow(C) = follow(D)

157 158 2. Initially, the tape contains the word baab, is in state q0 and the head 4. The smallest set satisfying the constraint for follow(S) is obviously just points to the first b in the work. So the initial instantaneous description is {$}. Substituting this into the remaining constraints makes the smallest (ǫ, q0,baab). The computation then proceeds, according to the transition sets satisfying those obvious too. Thus: function, as follows:

follow(S) = {$} (ǫ, q0,baab) ⊢ (b, q0,aab) ⊢ (ǫ, q3,bXab) ⊢ (ǫ, q3, bXab) ⊢ (ǫ, q1,bXab) follow(A) = {a,c,d,e}∪{$} = {a, c, d, e, $} ⊢ (ǫ, q4, Y Xab) ⊢ (ǫ, q0,YXab) ⊢ (Y, q0,Xab) ⊢ (YX,q0,ab) ⊢ (Y, q3,XXb) ⊢ (ǫ, q3,YXXb) ⊢ (ǫ, q3, YXXb) ⊢ (ǫ, q1,YXXb) follow(B) = {c,d,e}∪{$} = {c, d, e, $} ⊢ (Y, q1,XXb) ⊢ (YX,q1,Xb) ⊢ (YXX,q1,b) ⊢ (YX,q4, XY ) follow(C) = {b,d,e} ⊢ (Y, q4,XXY ) ⊢ (ǫ, q4,YXXY ) ⊢ (ǫ, q4, YXXY ) ⊢ (ǫ, q0,YXXY ) follow(D) = {b,d,e} ⊢ (Y, q0,XXY ) ⊢ (YX,q0, XY ) ⊢ (YXX,q0, Y ) ⊢ (YXXY,q0,ǫ) ⊢ (YXX,q2, Y ) ⊢ (YX,q2, XY ) ⊢ (Y, q2,XXY ) ⊢ (ǫ, q2,YXXY ) ⊢ (ǫ, q2, YXXY ) ⊢ (ǫ, q5,YXXY )

Answer to Exercise 11.1 We ended in the accepting state q5, so the input word baab is accepted. 3. Starting the machine in input aba, we obtain the following sequence of 1. Diagram of Turing Machine M. The loops have a single set of symbols, instantaneous descriptions. representing several transitions. For example the loop around q0 has the transition x, x, R with x being b, X, or Y . This stands for the three (ǫ, q0,aba) ⊢ (ǫ, q3, Xba) ⊢ (ǫ, q1,Xba) ⊢ (X, q1,ba) ⊢ (ǫ, q4,XYa) transitions b,b,R; X,X,R; and Y,Y,R. ⊢ (ǫ, q4, XYa) ⊢ (ǫ, q0,XYa) ⊢ (X, q0,Ya) ⊢ (XY,q0,a) ⊢ (X, q3,YX) ⊢ (ǫ, q3,XYX) ⊢ (ǫ, q3, XYX) ⊢ (ǫ, q1,XYX) ⊢ (X, q1,YX) ⊢ (XY,q1,X) ⊢ (XYX,q1,ǫ)

x, x, L x, x, R The machine is in state q1 and is reading a blank. There are no transitions x = a,b,X,Y x = a,X,Y for blank from q1, so the machine stops. Because q1 is not an accepting state, the input word aba is rejected. 4. This machine accepts the words in {a,b} that contain the same number , , R of as and bs. The language accepted by this Turing Machine is q3 q1 ∗ L(M)= {w ∈{a,b} | #a(w) = #b(w)}

x, x, R where I used the notation #x(w) for the number of occurrences of the x = b,X,Y a,X,L b,Y,L symbol x in the word w. To see this, think about what happens in each of the states. You can characterize each state by performing some action: q0 q4 q : Look for the next a, replace it with X and change to q . If you don’t , , R x, x, L 0 3 x = a,b,X,Y find any more as, change to q2. q3: Go back to the beginning of the word and change to q1. , ,L q1: Look for the next b, replace it with Y and change to q4. If you don’t find any more bs, stop and reject. , , R q4: Go back to the beginning of the word and change to q0 (start again). q2 q5 q2: Go through the word from right to left, checking that it contains only Xs and Y s (if it contains some a or b, stop and reject), then move to q5.

x, x, L q5: Accept.

x = X, Y So the machine repeatedly searches for an a (state q0) and replaces it with X, then searches for a b (state q1) and replaces it with Y . It pairs them repeatedly; if it fails to pair an a with a b, it stops and reject. When it has finished all the as, it goes through the word again (state q2)

159 160 checking that there are no bs left. In that case it knows that each a has 2. The machine is formally defined as follows: been paired with a b with nothing left over; only in that case it moves to q5 and accepts. M = (Q, Σ, Γ, δ, q0, , ∅) where Q = {q0, q1, q2, } Σ= {a,b} Γ= {a, b, } Answer to Exercise 11.2 There are several ways of implementing a Turing Machine that performs the (The set of accepting states is empty, because the problem does not computation required. The following is very simple, but does what is necessary. require that we accept words in a certain language.) 1. This is the diagram illustrating the machine: The transition function is defined by:

δ(q0,a) = (q0,a,R) δ(q0,b) = (q1,b,R) δ(q1,a) = (q2,b,L) b,b,R δ(q1,b) = (q1,b,R) δ(q2,b) = (q0,a,L)

(There is no transition from q2 when reading an a. This never happens: q1 every time we are in q2 we are sure that we’re reading a b.)

Answer to Exercise 11.3 a,a,R Here is a diagram showing the reducibility relations between the languages b,b,R (HP is the halting problem, Nλ is the normalization problem for λ-terms).

HP q0 a,b,L

P P b,a,L 1 5

q2

P2 Nλ P9

P3 P10 P4

The machine starts at the beginning of the tape, skipping over the as in state q0. As soon as it finds a b, it changes to q1 and starts skipping over the bs. If it gets to the end of the word, it means that the symbols are in P6 P7 P8 the correct order and it stops. But if it finds an a while in q1, it knows that this a came after at least one b: we found a pair of symbols in the First of all, we already know some information directly: HP and Nλ are unde- wrong order. We replace this a with a b and we replace the b that came cidable and the hypotheses tell us that P7 is recursive, P4 and the complement before it with an a (state q2). We are now in the position before this pair of P3 is recursively enumerable, and P6 and the complement of P8 are not re- and we can return to q0 and start looking for another wrong pair. cursively enumerable. A language is decidable (or recursive) if both it and its complement are recursively enumerable, so we can immediately conclude that P6 and P8 are undecidable.

161 162 Remember how we can use the knowledge that a problem A is reducible to 3. As the previous calculation shows, nand-fun n iterates n times nand-pair, another problem B to determine its reducibility properties. If B is decidable, starting with the initial value hfalse, falsei. So nand-fun 0 will do zero we can use the reduction to decide A as well. Conversely, if A is undecidable, iterations and just return this initial value. The sequence of values that then B must be undecidable as well. the iteration goes through is obtained by repeating the steps of part (a). Similarly, if B is recursively enumerable, so is A. Conversely, if A is not Here are the values obtained (with the iteration number written on top recursively enumerable, neither is B. of the arrows): These derivations also hold for complements: if the complement of B is recursively enumerable, that so is the complement of A. hfalse, falsei −→1 htrue, falsei −→2 htrue, truei 3 4 5 As we know that HP and Nλ are undecidable, and they reduce to P1 and −→ hfalse, truei −→ htrue, falsei −→ htrue, truei P , respectively, we deduce that P and P are also undecidable. Now that we 6 7 8 9 9 1 9 −→ hfalse, truei −→ htrue, falsei −→ htrue, truei −→ · · · determined that P1 is undecidable, because it reduces to P5 we can deduce that P5 is also undecidable. We obtain htrue, truei for 2, 5 and 8. It is clear that there is a cycle that Because P4 is recursively enumerable and P10 is reducible to it, also P10 is repeats every three steps, so we have nand-fun n ∗ htrue, truei for those recursively enumerable. In addition, P10 reduces to P3, whose complement is numbers n of the form n =2+3 ∗ m for some m. recursively enumerable. This tells us that the complement of P10 is also recur- sively enumerable. We derived that both P and its complement are recursively 10 Answer to Exercise 12.2 enumerable, therefore P10 is recursive. The rest of the given information does not add anything to our knowledge. To implement thrFib in the λ-calculus, we need to see it as the iteration of For example, knowing that P2 is reducible to the undecidable problem P1 doesn’t a function a number of times starting with a given value. As in the case of the reveal anything about P2. Fibonacci function described in the lecture, we need to use an auxiliary function that returns three values at the same time (it was just two for Fibonacci). Then 1. Undecidable: P1, P5, P6, P8, P9 at each step we shift two of the values and add the sum of the three:

2. Recursively enumerable: P4, P7, P10 hn,m,ki 7−→ hm,k,n + m + ki.

3. Recursive: P7, P10 Informally, our auxiliary function is defined by

thrFibaux : N → N × N × N Answer to Exercise 12.1 thrFibaux 0= h0, 0, 1i thrFib thrFib 1. The function nand-pair takes a pair of Booleans p and returns a pair aux (n +1)= hm,k,n + m + ki where hn,m,ki = aux n of Booleans whose first element is the negation of the conjunction of In the λ-calculus we use the encoding of triples as the elements of p, and whose second element is simply the first of p. (Remember that p true is the first projection of p and p false is the second ha,b,ci = λx.x a b c projection.) So we have: With the projections defined by nand-pair htrue, truei ∗ hfalse, truei nand-pair hfalse, truei ∗ htrue, falsei nand-pair htrue, falsei ∗ htrue, truei nand-pair hfalse, falsei ∗ htrue, falsei fst = λt.t (λx.λy.λz.x), snd = λt.t (λx.λy.λz.y), trd = λt.t (λx.λy.λz.z).

2. Remember that Church numerals are just iterators, so the definition of (You can also encode triples as repeated pairs. It is not important how you do nand-fun tells us to iterate n times the function nand-pair starting with the it, as long as you can define projections.) initial value hfalse, falsei. Applied to the numeral 4= λf.λx.f (f (f (f x))) Using triples and their projections, we can define the auxiliary function in it produces the following reduction steps: λ-calculus: thrFib = λn.n step h0, 0, 1i nand-fun 4 = (λn.n nand-pair hfalse, falsei) 4 aux where step = λt.hsnd t, trd t, plus (fst t) (plus (snd t) (trd t))i 4 nand-pair hfalse, falsei = (λf.λx.f (f (f (f x)))) nand-pair hfalse, falsei ∗ nand-pair (nand-pair (nand-pair (nand-pair hfalse, falsei))) Finally, the function we want will just take the first component of the result of ∗ nand-pair (nand-pair (nand-pair htrue, falsei)) the auxiliary function: ∗ nand-pair (nand-pair htrue, truei) ∗ nand-pair hfalse, truei thrFib = λn.fst (thrFibaux n). ∗ htrue, falsei

163 164 Answer to Exercise 13.1 where bs = filter (evaluate x) The function evaluate can be defined by recursion on the structure of the (allAssign (varNum x)) SAT formula, using just the projection from the assignment to determine the value on the variables. Answer to Exercise 13.3 evaluate :: SAT -> Assignment -> Bool 1. The program evaluate traverses the whole structure of the input for- evaluate (Var i) a = a!!i mula. For the Not, And and Or operators it simply calls itself on the evaluate (Not x) a = not (evaluate x a) subformulas and then does a simple operation. In the variable case it evaluate (And x y) a = (evaluate x a) && (evaluate y a) just looks up the value in the assignment, which is linear in the length of evaluate (Or x y) a = (evaluate x a) || (evaluate y a) the assignment. So the number of operations executed at each step is at most linear in Note that in the variable case, if the index i is larger or equal to the length the length of the assignment. The number of steps is bound by the size of the assignment list, we would get an error. But we are assuming that we of the formula. So the overall complexity is at most quadratic on the size evaluate expressions with assignments that guarantee values for all variables of the input. occurring in the expression. 2. The function satisfiable applies evaluate to all the possible assign- ments of values to the variables. Each application of evaluate takes poly- Answer to Exercise 13.2 nomial time, but there are exponentially many assignments that need to n 1. The function varNum is also defined by recursion on the structure of the be tested; to be precise, for n variables there are 2 assignments. Be- expression, returning just the index for variables and the maximum of cause we are performing the evaluation for each of these assignments, the recursive calls for conjunctions and disjunctions. the overall complexity is also exponential. 3. The fact that SAT is NP-complete means that every NP problem can varNum :: SAT -> Int be reduced to it in polynomial time. This means that if we ever find a varNum (Var n) = n polynomial time algorithm for SAT, we would automatically be able to varNum (Not x) = varNum x solve all NP problems in polynomial time. This would prove that NP varNum (And x y) = max (varNum x) (varNum y) = P. varNum (Or x y) = max (varNum x) (varNum y) 2. Notice that the length of the assignment returned by allAssign n is n + 1 (it must contain all the possible values for variables Var 0,..., Var n, that is the first n + 1 variables). So allAssign 0 should return [[True],[False]]. We may use this as a base case. For the recursive case, we can just add either a True or a False in front of each of the list computed by the recursive call. allAssign :: Int -> [Assignment] allAssign 0 = [[True],[False]] allAssign n = [b:bs | b <- [True,False], bs <- allAssign (n-1)] (An alternative solution for the base case is allAssign (-1) = [[]].) 3. Evaluate the formula using all the possible assignments for its variables. If at least one of them returns True, then the formula is satisfiable. satisfiable :: SAT -> Bool satisfiable x = or (map (evaluate x) (allAssign (varNum x))) 4. From the list of all assignments, filter out those that return True. If this list is empty, then the formula is not satisfiable. If it is not empty, any of its elements (for example the head) is a solution. solution :: SAT -> Maybe Assignment solution x = if null bs then Nothing else Just (head bs)

165 166 References

[ASU86] Alfred V. Aho, Ravi Sethi, and Ullman Jeffrey D. Ullman. Compilers - Principles, Techniques, and Tools. Addison-Wesley, 1986. 90 [Cox07] Russ Cox. Regular expression matching can be simple and fast (but is slow in java, perl, php, python, ruby, ...). http://swtch.com/ rsc/regexp/regexp1.html, January 2007. 29 [GJS+15] James Gosling, Bill Joy, Guy Steele, Gilad Bracha, and Alex Buckley. The Java Language Specification. Oracle, Inc., Java SE 8 edition edition, 2015. 62 [HMU01] John E. Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman. Intro- duction to Automata Theory, Languages, and Computation. Addison Wesley, 2nd edition edition, 2001. 17, 44, 79, 103 [Hud99] Scott E. Hudson. Cup parser generator for java. http://www.cs.princeton.edu/∼appel/modern/java/CUP/, 1999. 99 [Mar01] Simon Marlow. Happy: The parser generator for haskell. http://www.haskell.org/happy/, 2001. 99 [Nil16] Henrik Nilsson. Compilers (G53CMP) — lecture notes. http://www.cs.nott.ac.uk/∼nhn/G53CMP, 2016. 62, 82 [Par05] Terence Parr. ANTLR: Another tool for language recognition. http://www.antlr.org/, 2005. 99

167