Automating Grammar Comparison

Ravichandhran Madhavan (EPFL, Switzerland) ravi.kandhadai@epfl.ch
Mikaël Mayer (EPFL, Switzerland) mikael.mayer@epfl.ch
Sumit Gulwani (Microsoft Research, USA) [email protected]

Viktor Kuncak (EPFL, Switzerland) viktor.kuncak@epfl.ch

Abstract

We consider from a practical perspective the problem of checking equivalence of context-free grammars. We present techniques for proving equivalence, as well as techniques for finding counter-examples that establish non-equivalence. Among the key building blocks of our approach is a novel algorithm for efficiently enumerating and sampling words and parse trees from arbitrary context-free grammars; the algorithm supports polynomial-time random access to words belonging to the grammar. Furthermore, we propose an algorithm for proving equivalence of context-free grammars that is complete for LL grammars, yet can be invoked on any context-free grammar, including ambiguous grammars.

Our techniques successfully find discrepancies between different syntax specifications of several real-world languages, and are capable of detecting fine-grained incremental modifications performed on grammars. Our evaluation shows that our tool improves significantly on the existing state-of-the-art tools. In addition, we used these algorithms to develop an online tutoring system for grammars that we then used in an undergraduate course on computer language processing. On questions involving grammar constructions, our system was able to automatically evaluate the correctness of 95% of the solutions submitted by students: it disproved 74% of the cases and proved 21% of them.

    S → S + S | S ∗ S | ID        S → ID E
                                  E → +S | ∗S | ε

Figure 1. Grammars recognizing simple arithmetic expressions. An example proven equivalent by our tool.

Categories and Subject Descriptors D.3.1 [Programming Languages]: Formal Definitions and Theory—Syntax; D.3.4 [Programming Languages]: Processors—Parsing, Compilers, Interpreters; K.3.2 [Computers and Education]: Computer and Information Science Education; D.2.5 [Software Engineering]: Testing and Debugging

General Terms Languages, Theory, Verification

Keywords Context-free grammars, equivalence, counter-examples, proof system, tutoring system

1. Introduction

Context-free grammars are pervasively used in verification and compilation, both for building input parsers and as a foundation of algorithms for model checking, program analysis, and testing. They also play an important pedagogical role in introducing fundamentals of formal language theory, and are an integral part of undergraduate computer science education. Despite their importance, and despite decades of theoretical advances, practical tools that can check semantic properties of grammars are still scarce, except for specific tasks such as parsing.

∗ This work is supported in part by the European Research Council (ERC) Project Implicit Programming and Swiss National Science Foundation Grant Constraint Solving Infrastructure for Program Analysis.

In this paper, we develop practical techniques for checking equivalence of context-free grammars. Our techniques can find counter-examples that disprove equivalence, and can prove that two context-free grammars are equivalent, much like a software model checker. Our approaches are motivated by two applications: (a) comparing real-world grammars, such as those used in production compilers, and (b) automating tutoring and evaluation of context-free grammars. These applications are interesting and challenging for a number of reasons.

Much of the front-ends of modern compilers and interpreters are automatically or manually derived from grammar-based descriptions of programming languages, more so with integrated language support for domain-specific languages. When two compilers or other language tools are built according to two different reference grammars, knowing how they differ in the programs they support is essential. Our experiments show that two grammars for the same language almost always differ, even if they aim to implement the same standard. For instance, we found using our tool that two high-quality standard Java grammars (namely, the Java grammar1 used by the Antlr v4 parser generator [1], and the Java language specification [2]) disagree on more than 50% of words that are randomly sampled from them.

1 github.com/antlr/grammars-v4/blob/master/java/Java.g4

Even though the differences need not correspond to incompatibilities between the compilers that use these grammars (since their handling could have been intentionally delegated to type checkers and other backend phases), the sheer volume of these discrepancies does raise serious concerns. In fact, most deviations found by our tool are not purposeful. Moreover, in the case of dynamic languages like Javascript, where parsing may happen at run-time, differences between parsers can produce diverging run-time behaviors. Our experiments show that (incorrect) expressions like “++ RegExp - this" discovered by our tool while comparing Javascript grammars result in different behaviors on Firefox and Internet Explorer browsers, when the expressions are wrapped inside functions and executed using the eval construct (see section 5).

Besides detecting incompatibilities, comparing real-world grammars can help identify portions of the grammars that are overly permissive. For instance, many real-world Java grammars generate programs like “enum ID implements char { ID }" and “private public class ID" (which were discovered while comparing them with other grammars). These imprecisions can be eliminated with little effort, without compromising parsing efficiency.

Furthermore, grammars are often rewritten extensively to make them acceptable by parser generators, which is laborious and error-prone. Parser generators have become increasingly permissive over the years to mitigate this problem. However, there still remains considerable overhead in this process, and there is a general need for tools that pin-point subtle changes in the modified versions (documented in works such as [33]). It is almost always impossible to spot differences between large real-world grammars through manual scanning, because the grammars typically appear similar, and even use the same name for many non-terminals. A challenge this paper addresses is developing techniques that scale to real-world grammars, which have hundreds of non-terminals and productions.

An equally compelling application of grammar comparison arises from the importance of context-free grammars in computer science education. Assignments involving context-free grammars are harder to grade and provide feedback on, arguably even more so than programming assignments, because of the greater variation in the possible solutions arising due to the succinctness of grammars. The complexity is further aggravated when the solutions are required to belong to subclasses like LL(1). For example, Figures 1 and 2 show two pairs of grammars that are equivalent. The grammars shown on the right are LL(1) grammars, and are reference solutions. The grammars shown on the left are intuitive solutions that a student comes up with initially.

    S → A ⇒ S | Int        S → Int G
    A → Int, A | Int       G → ⇒ Int G | , Int A | ε
                           A → , Int A | ⇒ Int G

Figure 2. Grammars defining well-formed function signatures over Int. An example proven equivalent by our tool.

Proving equivalence of these pairs of grammars is challenging because they do not have any similarity in their structure, yet recognize the same language. On the other hand, Figure 3 shows two grammars (written by students) that subtly differ from the grammars of Fig. 2. The smallest counter-example for the grammar shown in Fig. 3(a) is the string “Int ⇒ Int ⇒ Int". We invite the readers to identify a counter-example that differentiates the grammar of Fig. 3(b) from those of Fig. 2.

    (a) S → A ⇒ Int | Int      (b) S → A ⇒ S | Int
        A → S, Int | Int           A → S, S | Int

Figure 3. Grammars subtly different from the grammars shown in Fig. 2. The grammar on the left does not accept “Int ⇒ Int ⇒ Int".

In our experience, a practical system that can prove that a student’s solution is correct, and provide a counter-example if it is not, can greatly aid tutoring of context-free grammars. The state of the art for providing feedback on programming assignments is to use test cases (though there has been recent work on generating repair-based feedback [29]). We bring the same fundamentals to context-free grammar education. Furthermore, we exploit the large, yet under-utilized, body of theoretical research on decision procedures for equivalence of context-free grammars to develop a practical algorithm that can prove the correctness of solutions provided by the students.

Overview and Contributions. At the core of our system is a fast approach for enumerating words and parse trees of an arbitrary context-free grammar, which supports exhaustive enumeration as well as random sampling of parse trees and words. These features are supported by an efficient polynomial-time random access operation that constructs a unique parse tree for any given index. We construct a scalable counter-example detection algorithm by integrating our enumerators with a state-of-the-art parsing technique [25]. We develop and implement an algorithm for proving equivalence by extending decision procedures for subclasses of deterministic context-free grammars to arbitrary (possibly ambiguous) context-free grammars, while preserving soundness. We make the algorithm practical by performing numerous optimizations, and use concrete examples to guide the proof exploration. We are not aware of any existing system that supports both proving as well as disproving of equivalence of context-free grammars. The following are our main contributions:

• We present an enumerator for generating parse trees of arbitrary context-free grammars that supports the following operations: 1) a polynomial-time random access operation lookup(i, l) that, given an index i, returns the unique parse tree generating a word of length l corresponding to the index, and 2) sample(n, l), which generates n uniformly random samples from the parse trees of the grammar generating words of length l (Section 2).

• We use the enumerators to discover counter-examples for equivalence of context-free grammars.

• We integrate and extend the algorithms of Korenjak and Hopcroft [15], Olshansky and Pnueli [24], and Harrison et al. [12] for proving equivalence of LL context-free grammars to arbitrary context-free grammars. Our extensions are sound but incomplete. We show using experiments that the algorithm is effective on many grammars that are outside the classes with known decision procedures.

• We implement and evaluate an online tutoring system for context-free grammars. Our system is able to decide the veracity of 95% of the submissions, detecting counter-examples in 74% of the submissions, and proving correctness of 21% of the submissions.

• We evaluate the counter-example detection algorithm on 10 real-world grammars describing the syntax of 5 mainstream programming languages. The algorithm discovers deep, fine-grained errors, finding counter-examples with an average length of 35 and detecting almost 3 times more errors than a state-of-the-art approach.

2. Enumeration of Parse Trees and Words

A key ingredient of our approach for finding counter-examples is enumeration of words and parse trees belonging to a context-free grammar. Enumeration is also used in optimizing and improving the scope of our grammar equivalence proof engine. We model our enumerators as functions from natural numbers to the objects that are enumerated (parse trees or words), as opposed to viewing them as iterators over a collection of objects, as is typical in programming language theory. The enumerators we propose are bijective functions from natural numbers to parse trees in which the image and pre-image of any given value is efficiently computable in polynomial time (formalized in Theorem 1). The functions are partial if the set that is enumerated is finite. Using bijective functions to construct enumerators has many advantages; for example, it immediately provides a way of sampling elements from the given set. It also ensures that there is no duplication during enumeration. Additionally, the algorithm we present here can be configured to enumerate parse trees that generate words having a desired length.

Notations. A context-free grammar is a quadruple (N, Σ, P, S), where N is a set of non-terminals, Σ is a set of terminals, P ⊆ N × (N ∪ Σ)∗ is a finite set of productions, and S ∈ N is the start symbol. Let T denote the set of parse trees belonging to a grammar. We refer to sequences of terminals and non-terminals belonging to (N ∪ Σ)∗ as sentential forms of the grammar. If a sentential form has only terminals, we refer to it as a word, and also sometimes as a string. We adopt the usual convention of using greek letters α, β to represent sentential forms and upper-case latin characters to represent non-terminals. We use lower-case latin characters a, b, c etc. to represent terminals and w, x, y etc. to denote words. We introduce more notations as they are needed.

2.1 Constructing Random Access Enumerators

We use Enum[α] : N → T∗ to denote an enumerator for a sentential form α of the input grammar. The enumerators are partial functions from natural numbers to tuples of parse trees of the grammar, one rooted at every symbol in the sentential form. For brevity, we refer to the tuple as parse trees of sentential forms. We define Enum[α] recursively, following the structure of the grammar, as explained in the sequel.

For a terminal a belonging to a grammar, Enum[a] is defined as {0 → leaf(a)}. That is, the enumerator for a terminal a maps the first index to a parse tree with a single leaf node containing a, and is undefined for every other index. We now describe an enumerator for a non-terminal. Consider for a moment the non-terminal S of the grammar shown in Fig. 4. The parse trees rooted at S are constructed out of the parse trees that belong to the non-terminal A and the sentential form BA. Assume that we have enumerators defined for A and BA, namely Enum[A] and Enum[BA], that are functions from natural numbers to parse trees (a pair of them in the case of BA). Our algorithm constructs an enumerator for S compositionally, using the enumerators for A and BA.

Recall that we consider enumerators as bijective functions from natural numbers. So, given an index i, we need to define a unique parse tree of S corresponding to i (provided i is within the number of parse trees rooted at S). To associate a parse tree of S to an index i, we first need to identify a right-hand-side α of S and select a parse tree t of the right-hand-side. To determine a parse tree t of the right-hand-side α, it suffices to determine the index of t in the enumerator for α. Hence, we define a function Choose[A] : N → ((N ∪ Σ)∗ × N) for every non-terminal A, that takes an index and returns a right-hand-side of A, and an index for accessing an element of the right-hand-side.

    S → A | BA      ∀t ∈ {a, b}. Enum[t](i) = leaf(t) if i = 0
    A → a | aS      ∀A ∈ {S, A, B}. Enum[A](i) = node(A, Enum[α](j)), where (α, j) = Choose[A](i)
    B → b           Enum[BA](i) = (Enum[B](j), Enum[A](k)), where (j, k) = π(i, ∞, ∞)
                    Enum[aS](i) = (Enum[a](j), Enum[S](k)), where (j, k) = π(i, 1, ∞)

Figure 4. An example grammar and illustrations of the Enum functions for the symbols of the grammar. Choose and π are defined in Fig. 6 and Appendix A, respectively.
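For concreteness, the quadruple notation can be mirrored in a small data structure. The sketch below is our own illustrative encoding (the class and field names are not from the paper's tool), instantiated with the grammar of Fig. 4:

```python
from dataclasses import dataclass

# A CFG as the quadruple (N, Σ, P, S). Productions are pairs (A, α),
# where α is a tuple of symbols over N ∪ Σ.
@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset
    terminals: frozenset
    productions: tuple
    start: str

    def rhss(self, a):
        """Right-hand-sides of the non-terminal a, in declaration order."""
        return [alpha for (lhs, alpha) in self.productions if lhs == a]

# The grammar of Fig. 4: S → A | BA, A → a | aS, B → b.
fig4 = Grammar(
    nonterminals=frozenset({"S", "A", "B"}),
    terminals=frozenset({"a", "b"}),
    productions=(("S", ("A",)), ("S", ("B", "A")),
                 ("A", ("a",)), ("A", ("a", "S")),
                 ("B", ("b",))),
    start="S",
)

assert fig4.rhss("A") == [("a",), ("a", "S")]
assert all(lhs in fig4.nonterminals for lhs, _ in fig4.productions)
```

Every symbol in a right-hand-side is either a non-terminal or a terminal, so sentential forms are simply tuples over `nonterminals ∪ terminals`.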

We define the enumerator for a non-terminal as: Enum[A](i) = node(A, Enum[α](j)), where (α, j) = Choose[A](i); that is, as a node labelled A having the tuple Enum[α](j) as children.

In the simplest case, if A has n right-hand-sides α0, α1, ···, αn−1, the choose function Choose[A](i) could be defined as (αi%n, ⌊i/n⌋). This definition, besides being simple, also ensures a fair usage of the right-hand-sides of A by mapping successive indices to different right-hand-sides, which ensures that any sequence of enumeration of the words belonging to a non-terminal alternates over the right-hand-sides of the non-terminal. However, this definition is well defined only when every right-hand-side of A has an unbounded number of parse trees. For instance, consider the non-terminal A shown in Fig. 4. It has two right-hand-sides, a and aS, of which a has only a single parse tree. Defining Choose[A] as (αi%2, ⌊i/2⌋) is incorrect as, for example, Enum[A](2) maps to Enum[a](1), which is not defined. Therefore, we extend the above function so that it takes into account the number of parse trees belonging to the right-hand-sides, which is denoted using #t(α).

It is fairly straightforward to compute the number of parse trees of non-terminals and right-hand-sides in a grammar. For completeness, we show a formal definition in Fig. 5. We define #t as the greatest fix-point of the equations shown in Fig. 5, which can be computed iteratively starting with an initial value of ∞ for #t. As shown in the equations, the number of (tuples of) parse trees of a sentential form is the product of the number of parse trees of the symbols in the sentential form. The number of parse trees of a non-terminal is the sum of the number of parse trees of its right-hand-sides, and the number of parse trees of a terminal is one. Note that if the grammar has cycles, #t could be infinite in some cases.

    #t(α) = ∏_{i=0..n−1} #t(Mi), where α = M0 ··· Mn−1, n > 1
    #t(A) = ∑_{i=0..n−1} #t(αi), where A → α0 | ··· | αn−1
    #t(a) = 1, where a ∈ Σ

Figure 5. Equations for computing the number of parse trees of sentential forms. #t is initialized to ∞ for every non-terminal M.

Fig. 6 defines a Choose function, explained below, that can handle right-hand-sides with a finite number of parse trees. The definition guarantees that whenever Choose returns a pair (α, i), i is less than #t(α), which ensures that Enum[α] is defined for i. In Fig. 6, the right-hand-sides of the non-terminal A, α0, ···, αn−1, are sorted in ascending order of the number of parse trees belonging to them. We define b0 as zero and use b1, ···, bn to denote the number of parse trees of the right-hand-sides. (Note that #t(αi) is given by bi+1.) The index im is the smallest index (of Enum[A]) at which the m-th right-hand-side αm becomes undefined, which is determined using the number of parse trees of each right-hand-side as shown. Given an index i, Choose[A](i) first determines the right-hand-sides that need to be skipped, i.e., those whose enumerators are not defined for the index i, by finding a k such that ik ≤ i < ik+1. It then chooses a right-hand-side (namely αk+r) from the remaining n − k right-hand-sides whose enumerators are defined for the index i, and computes the index to enumerate from the chosen right-hand-side.

    Choose[A](i) =
        let A → α0 | ··· | αn−1 s.t. ∀ 1 ≤ m < n. #t(αm−1) ≤ #t(αm) in
        let b0 = 0, and b1, ···, bn = #t(α0), ···, #t(αn−1) in
        let im = bm·(n − m + 1) + ∑_{j=0..m−1} bj, for 0 ≤ m ≤ n, in
        let k be such that 0 ≤ k ≤ n − 1 and ik ≤ i < ik+1 in
        let q = ⌊(i − ik)/(n − k)⌋ and r = (i − ik) mod (n − k) in
        (αk+r, bk + q)

Figure 6. Choose function for a non-terminal A.

Most of the computations performed by Fig. 6, such as computing the number of parse trees of the right-hand-sides (and hence b0, ···, bn) and the indices i0, ···, in, and sorting the right-hand-sides of non-terminals by their number of parse trees, need to be performed only once per grammar. Therefore, for each index i, the Choose function only has to scan through the right-hand-sides to determine the value of k, and perform simple arithmetic operations to compute q and r. Note that the Choose function degenerates to the simple definition (αi%n, ⌊i/n⌋) presented earlier when #t is unbounded for every right-hand-side of A. The function also preserves fairness by mapping successive indices to different right-hand-sides of the non-terminals. For instance, in the case of the non-terminal A shown in Fig. 4, the Choose function maps index 0 to (a, 0) and index 1 to (aS, 0), but index 2 is mapped to (aS, 1), as a has only one parse tree, i.e., #t(a) = 1.

We now describe the enumerator for a sentential form α with more than one symbol. Let α = M1M2 ··· Mm. The tuples of parse trees belonging to the sentential form are the cartesian product of the parse trees of M1, ···, Mm. However, eagerly computing the cartesian product is impossible for most realistic grammars, because it is either unbounded or intractably large. Nevertheless, we are interested only in accessing a tuple at a given index i. Hence, it suffices to determine, for every symbol Mj, the parse tree tj that is used to construct the tuple at index i. The tree tj can be determined if we know its index in Enum[Mj]. Therefore, it suffices to define a bijective function π : N → N^m that maps a natural number (the index of Enum[α]) to a point in an m-dimensional space of natural numbers. The j-th component of π(i) is the index of Enum[Mj] that corresponds to the j-th parse tree of the tuple. In other words, Enum[M1 ··· Mm](i) could be defined as (Enum[M1](i1), ···, Enum[Mm](im)), where ij is the j-th component of π(i).

When m is two, the function π reduces to an inverse pairing function, that is, a bijection from natural numbers to pairs of natural numbers. Our algorithm uses only an inverse pairing function, as we normalize the right-hand-sides of the productions in the grammar to have at most two symbols. We use the well-known Cantor inverse pairing function [26]. But this function assumes that the two-dimensional space is unbounded in both directions, and hence cannot be employed directly when the number of parse trees generated by the symbols in the sentential form is bounded. We extend the inverse pairing functions to two-dimensional spaces that are bounded in one or both of the directions. The extended functions take three arguments: the index that is to be mapped, and the sizes of the x and y dimensions (or infinity if they are unbounded). We present a formal definition of the functions in Appendix A. Using the extended Cantor inverse pairing function π, we define the enumerator for a sentential form with two symbols as: Enum[M1M2](i) = (Enum[M1](i1), Enum[M2](i2)), where (i1, i2) = π(i, #t(M1), #t(M2)).

Termination of Random Access. Later, in section 2.3, we present a bound on the running time of the algorithm, but now we briefly discuss termination. If A is a recursive non-terminal, e.g., if it has a production of the form A → αAβ, the enumerator for A may recursively invoke itself, either directly or through other enumerators. However, for every index other than 0 and 1, the recursive invocations will always be passed a strictly smaller index. This follows from the definition of the Choose and the inverse pairing functions used by our algorithm. (In the case of the inverse pairing function, if π(i) = (j, k), j and k are strictly smaller than i for all i > 1.) For indices 0 and 1, the recursive invocations may happen with the same index. However, this will not result in non-termination if the following properties are ensured: (a) for every non-terminal, the right-hand-side chosen for index 0 is the first production in the shortest derivation starting from the non-terminal and ending at a word; (b) there are no unproductive non-terminals (which are non-terminals that do not generate any word) in the input grammar.

From Parse Trees to Words. We obtain enumerators for words using the enumerators for parse trees, by mapping the enumerated parse trees to words. However, when the input grammar is ambiguous, the resulting enumerators are no longer bijective mappings from indices to words. The number of indices that map to a word is equal to the number of parse trees of the word.
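The machinery of Fig. 5 and Fig. 6 composes into an index-to-word mapping as follows. The Python sketch below is our own illustration, not the authors' implementation: it uses a toy acyclic grammar so that every #t value is finite, returns words rather than parse trees, and substitutes a simple row-major pairing for the extended Cantor pairing of Appendix A.

```python
# Toy acyclic grammar (every #t finite): S → a | AB, A → a | b, B → c.
# Encoding is ours: non-terminal -> list of right-hand-side tuples.
G = {"S": [("a",), ("A", "B")], "A": [("a",), ("b",)], "B": [("c",)]}

def count(x):
    """#t of Fig. 5 (plain recursion suffices since G is acyclic)."""
    if isinstance(x, tuple):                      # sentential form: product
        p = 1
        for s in x:
            p *= count(s)
        return p
    if x not in G:                                # terminal: exactly one tree
        return 1
    return sum(count(r) for r in G[x])            # non-terminal: sum over alternatives

def choose(A, i):
    """Choose[A](i) of Fig. 6, for finite #t values.
    (A real implementation would precompute the sorted lists and indices.)"""
    rhss = sorted(G[A], key=count)                # ascending #t
    n = len(rhss)
    b = [0] + [count(r) for r in rhss]            # b_0 = 0, b_1 .. b_n
    im = [b[m] * (n - m + 1) + sum(b[:m]) for m in range(n + 1)]
    for k in range(n):
        if im[k] <= i < im[k + 1]:
            q, r = divmod(i - im[k], n - k)
            return rhss[k + r], b[k] + q
    raise IndexError(i)

def enum(sym, i):
    """Enum[sym](i), returning the generated word instead of a parse tree."""
    if sym not in G:
        assert i == 0                             # terminals are defined only at 0
        return sym
    rhs, j = choose(sym, i)
    if len(rhs) == 1:
        return enum(rhs[0], j)
    j1, j2 = j % count(rhs[0]), j // count(rhs[0])  # row-major pairing stands in
    return enum(rhs[0], j1) + enum(rhs[1], j2)      # for the extended Cantor pairing

words = [enum("S", i) for i in range(count("S"))]
assert words == ["a", "ac", "bc"]                 # a bijection: no duplicates
```

Successive indices alternate over the right-hand-sides of S until the smaller alternative a is exhausted, mirroring the fairness discussion above.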
7 shows a snippet from the Java grammar two dimension space is unbounded in both directions, and that results in this behavior. hence cannot be employed directly when the number of parse In the Fig. 7, the productions of the non-terminal Body are trees generated by the symbols in the sentential form are not shown for brevity. It generates all syntactically correct bounded. We extend the inverse paring functions to two bodies allowed for a class in a Java program. Consider the dimensional spaces that are bounded in one or both the enumeration of the words (or parse trees) belonging to the directions. The extended functions take three arguments, the non-terminal Prgm starting from index 1. A fair enumeration index that is to be mapped, and the sizes of the x and y strategy, such as ours, will try to generate almost equal num- dimensions (or infinity if they are unbounded). We present ber of words from the non-terminals QName and ClassDef. a formal definition of the functions in Appendix A. Using However, the lengths of the words generated for the same the extended Cantor’s inverse pairing function π we define index differ significantly between the non-terminals. For in- the enumerator for a sentential form with two symbols as: stance, the word generated from the non-terminal QName at Enum[M1M2](i) = (Enum[M1](i1), Enum[M2](i2)), where an index i has length i + 1. On the other hand, the lengths (i1, i2) = π(i, #t(M1), #t(M2)). of the words generated from the non-terminal ClassDef grow slowly relative to their indices, since it has many right-hand- Termination of Random Access. Later in section 2.3 we sides, and each right-hand-side is in turn composed of non- present a bound on the running time of the algorithm, but terminals having many alternatives. In essence, the words now we briefly discuss termination. 
If A is a recursive non- generated for the non-terminal Prgm will have long import terminal e.g, if it has a production of the form A → αAβ, declarations followed by very short class definitions. the enumerator for N may recursively invoke itself, either Moreover, this also results in reduced coverage of rules directly or through other enumerators. However, for every since the enumeration heavily reuses productions of QName, index other than 0 and 1, the recursive invocations will always but fails to explore many alternatives reachable through S3 → B1S2 | B2S1 unproductive non-terminals and rules that do not generate S → a | BS S2 → B1S1 any word (like the non-terminal B2 and rule S3 → B2S1 of B → b S1 → a Fig. 9(b)), and hence may have to be simplified. Observe that (a) B1 → b the transformer grammar is acyclic and generates only a finite (b) number of parse trees. Figure 8. (a) An example grammar. (b) the result of restrict- This transformation increases the sizes of the right-hand- l ing the grammar shown in (a) to words of size 3. sides a factor of , in the worst case. Therefore, for efficiency reasons, we construct the productions of the transformed grammar on demand, when it is required during the enumera- tion of a parse tree. [[N]]l = [[N → α1]]l ∪ · · · ∪ [[N → αn]]l, Sampling Parse Trees and Words. Having constructed enu- where N → α1, | · · · | αn merators with the above characteristics, it is straightforward ( {Nl → a} if l = 1 to sample parse trees and words of a non-terminal N having [[N → a]]l = l ∅ otherwise a given length . We uniformly randomly sample numbers in the interval [0, #t(N) − 1], and lookup the parse tree or l−1 [ word at the sampled index using the enumerators. Since we [[N → AB]]l = ({Nl → AiBl−i} ∪ [[A]]i ∪ [[B]]l−i) have a bijection from numbers in the range [0, #t(N) − 1] to i=1 parse trees of N, this approach guarantees a uniform random sampling of parse trees. However, sampling of words is guar- Figure 9. 
Transforming non-terminals and productions of a anteed to be uniform only if the grammar is unambiguous. In grammar to a new set of non-terminals and productions that general, the probability of choosing a word w of length l in a generate only words of length l. t×s sample of size s is equal to #t(N) , where t is the number of parse trees of the word w. ClassDef. We address this issue by extending the enumeration algorithm so that it generates only parse trees of words having 2.3 Running Time of Random Access a specified length, We accomplish this by transforming the The number of recursive invocations performed by Enum input grammar in such way that it produces only strings for generating a word of length l is equal to the number of that are of the required length, and use the transformed nodes and edges in the parse tree of the word, which is O(l) grammar in enumeration. The idea behind the transformation for a grammar in CNF [16]. The time spent between two is quite standard e.g, works such as [14] and [19] that develop successive invocations of Enum is dominated by the Choose theoretical algorithms for random sampling of unambiguous operation, whose running time is bounded by O(r · l · |i|2) grammars also resort to a similar approach. However, what since it performs a linear scan over the right-hand-sides of a is unique to our algorithm is using the transformation to non-terminal performing basic arithmetic operations over the construct bijective enumerators while guaranteeing random given index i. Therefore, we have the following theorem. access property for all words of the specified length. Fig. 8 illustrates this transformation on an example, which Theorem 1. Let G be a grammar in Chomsky’s Normal Form. is explained in detail below. For explanatory purposes, as- Let r denote the largest number of right-hand-sides of a non- sume that the input grammar is in Chomsky’s Normal Form terminal. 
The time taken to access a parse tree generating a (CNF) [16] which ensures that every right-hand-side of the word of length l at an index i of a non-terminal belonging to grammar is either a terminal or has two non-terminals. the grammar is upper bounded by O(r · |i|2 · l2), provided Fig. 9 formally defines the transformation. For every non- the number of parse trees generated by each non-terminal terminal N of the input grammar, the transformation creates and right-hand-side is precomputed. a non-terminal Nl that generates only those words of N that have a length l. The productions of Nl are obtained by Notice that the time taken for random access is polynomial transforming the productions of N. For every production of in the size of the input grammar, the number of bits in the the form N → a, where a is a terminal, the transformation index i (which is O(log i)), and the size of the generated creates a production Nl → a if l = 1. For every production word l. We now briefly discuss the complexity of computing of the form N → AB that has two non-terminals on the the number of parse trees (#t). For a grammar in CNF, the right-hand-side, the transformation considers every possible number of parse trees that generate a word of length l is l way in which a word of size l can be split between the O(r2 ) in the worst case. (For unambiguous grammars, it is two non-terminals, and creates a production of the form O(cl), where c is the number of terminals.) Thus, computing Nl → AiBl−i for each possible split (i, l − i). Additionally, the number of parse trees could, in principle, be expensive. the transformation recursively produces rules for the non- However, in practice, the number of parse trees, in spite of terminals Ai and Bl−i. The transformed grammar may have being large, is efficiently computable. 
Prior theoretical works on uniform random sampling for context-free grammars (such as [19]) assume that the input grammar is a constant, and that arithmetic operations take constant time. (Our approach matches the best known running time O(l log l) under these assumptions.) But this assumption is quite restrictive in the real world. For example, the Java 7 grammar has 430 non-terminals and 2447 rules when normalized to CNF, and the number of parse trees increases rapidly with the length of the generated word. In fact, for length 50, it is an 84-digit number (in base 10). Using numbers as big as these in computation introduces significant overhead which cannot be discounted. Our enumerators offer quite some flexibility in sampling by supporting random access. For example, we can sample only from a restricted range instead of using the entire space of parse trees. Since we ensure a fair usage of rules while mapping rules to indices, restricting the sizes of indices still provides a good coverage of rules. In fact, our implementation exposes a parameter for limiting the range of the sample space, which we found useful in practice.

3. Counter-Example Detection

We apply the enumerators described in the previous sections to find counter-examples for equivalence of two context-free grammars. We sample words (of length within a predefined range) from one grammar and check if they are accepted by the other, and vice versa. Bounding the length of words greatly aids in reducing the parsing overhead, especially while using generic parsers.

In section 5, we present detailed results about the efficiency and accuracy of counter-example detection. The results show that the implementation is able to enumerate and parse millions of words within a few minutes on large real-world grammars. In the sequel, we present an overview of the parsers used by the tool.

Parsing. We use a suite of parsers consisting of a CYK parser [16], the Antlr v4 parser [1] (in compiler and interpreter modes), and an LL(1) parser [3], and employ them selectively depending on the context. For instance, for testing large programming language grammars for equivalence, we compile the grammars to parsers (at runtime) using Antlr v4, which uses an adaptive LL(*) parsing algorithm [25], and use the parsers to check if the generated words are accepted by the grammar. The CYK parsing algorithm we implement in our tool is a top-down, non-recursive algorithm that memoizes the parsing information computed for the substrings of the word being parsed (using a trie data structure), and reuses the information on encountering the same substring again, during a parse of the same or another word. Though the top-down evaluation introduces some overheads compared to the conventional dynamic programming approach, it improves the performance of the CYK parser by orders of magnitude when used in batch mode to parse a collection of words using the same grammar.

We mostly rely on the optimized CYK parser for checking the correctness of students' solutions. We find that quite often the solutions provided by students are convoluted, and are tricky to parse using specialized parsers. For instance, for a grammar with productions S → a | B and B → aaBb | aB | ε, the performance of the Antlr v4 parser degenerates severely with the length of the word that is parsed.

4. Proving Equivalence

Our approach for proving equivalence is based on the algorithms proposed by Korenjak and Hopcroft [15], and extended by Olshansky and Pnueli [24] and Harrison et al. [12]. This family of algorithms is attractive because it works directly on context-free grammars without requiring conversions to other representations like push-down automata. Moreover, these algorithms come with strong completeness guarantees. Korenjak and Hopcroft [15] introduce a decision procedure for checking equivalence of simple deterministic grammars, which are LL(1) grammars in Greibach Normal Form (GNF). Olshansky and Pnueli [24] extend this algorithm to LL(k) grammars in GNF, while Harrison et al. [12] extend the algorithm in another direction, namely to decide equivalence of deterministic GNF grammars of which one is LL(1). (A subtle point to note is that an LL(k) grammar with epsilon productions may become LL(k+1) when expressed in GNF [28].)

Our approach extends the work of Olshansky and Pnueli [24] by incorporating several aspects of the algorithm of Harrison et al. [12]. The resulting algorithm is applicable to arbitrary context-free grammars, but at the same time is complete for LL grammars (our implementation is complete only for LL(2) grammars, since we limit the lookahead for efficiency reasons). Furthermore, we perform several extensions to the algorithm that improve its precision and performance in practice. In particular, we extend the approach to handle inclusion relations, which provides an alternative way of establishing equivalence when the equivalence query is not directly provable. We also introduce transformations that use concrete examples to dynamically refine the queries during the course of the algorithm. Our experiments show that the algorithm succeeds in 82% of the cases that passed all test cases, even on queries involving ambiguous grammars (see section 5).

(a)  S → aT             (b)  P → aR
     T → aTb | b             R → abb | aRb | b

Figure 10. GNF grammars for the language aⁿbⁿ. An example proven equivalent by our tool.

Algorithm. We use the grammars shown in Fig. 10 for the language aⁿbⁿ as a running example. Observe that the grammar shown on the right is ambiguous – it has two parse trees for aabb. We formalize the verification algorithm as a proof system that uses the inference rules shown in Fig. 11. We later discuss an extension to the algorithm that augments the rules with a fixed lookahead distance k. Fig. 12 illustrates the algorithm on our running example.

In the sequel, we make the following assumptions: (a) the input grammars have the same set of terminals, (b) the grammars do not have any epsilon productions, and (c) the non-terminals belonging to the grammars are unique. In our implementation, if an input grammar has epsilon productions, we remove them using the standard transformations [16]. However, in the case of LL(1) grammars we use the specialized, but expensive, algorithm introduced by Rosenkrantz and Stearns [28] that preserves the LL property. This ensures that the algorithm is complete for arbitrary LL(1) grammars, including those not in GNF.

Derivatives. We express our algorithm using the notion of a derivative of a sentential form, which is defined as follows. A derivative d : Σ* × (N ∪ Σ)* → 2^((N ∪ Σ)*) is a function that, given a word w and a sentential form α, computes the sentential forms β that remain immediately after deriving w from α. We formally define d using the notion of a leftmost derivation. We say that α derives β in one step, denoted α ⇒ β, iff β is obtained by replacing the leftmost non-terminal in α by one of its right-hand-sides. Let ⇒* denote the reflexive transitive closure of ⇒. For any non-empty string w and sentential form α,

  d(w, α) = {β}                                                              if α = wβ
  d(w, α) = {β | ∃x ∈ Σ*, A ∈ N, γ ∈ (N ∪ Σ)*. α ⇒* xAγ ⇒ wβ ∧ |x| < |w|}   otherwise

For example, if A → a and B → bB | b, then d(aab, AaB) = {ε, B}, and d(b, AaB) = ∅. Though derivatives are defined for any grammar, they are more efficiently computable when a grammar is normalized to GNF. For a grammar in GNF, every production of the grammar starts with a terminal. Hence, a word w that is derivable from a sentential form α would be derivable in at most |w| steps. We lift the derivative operation to a set of sentential forms ᾱ as: d̂(w, ᾱ) = ⋃_{α ∈ ᾱ} d(w, α).

Inference Rules. We consider two types of relations between sets of sentential forms: equivalence (≡) and inclusion (⊆). A relation ᾱ ≡ β̄ (or ᾱ ⊆ β̄) holds if the set of words generated by the sentential forms in ᾱ, i.e., ⋃_{α ∈ ᾱ} L(α), is equal to (or included in) the set of words generated by the sentential forms in β̄, i.e., ⋃_{β ∈ β̄} L(β). Though we are only interested in proving equivalence of sentential forms, our algorithm sometimes uses inclusion relations in the intermediate steps to establish equivalence. As a consequence, the approach can also be used to prove inclusion of grammars. But the rules do not guarantee completeness. The rules shown in Fig. 11 use judgements of the form C ⊢ ᾱ ⊆ β̄, where C is a set of relations which can be assumed to hold when deciding the truth of ᾱ ⊆ β̄. Every inference rule shown in Fig. 11 provides a set of judgements, given by the antecedents, which when established guarantee that the consequent holds. In other words, the antecedents provide a sufficient condition for the consequent. (Sometimes they are also necessary conditions.)

Consider the illustration shown in Fig. 12. Our goal is to establish that the start symbols of the two grammars are equivalent under an empty context, i.e., ∅ ⊢ [S ≡ P]. We prove this by finding a derivation for the judgement using the inference rules. In Fig. 12, the relations that are added to the context are marked with †. At any step in the derivation, we can assume that every relation marked in the preceding steps leading to the current step holds.

BRANCH Rule. Initially, we apply the BRANCH rule to [S ≡ P]. The rule asserts that a relation ᾱ op β̄ holds in a context C if, for every alphabet symbol a, the derivatives of ᾱ and β̄ with respect to a are related under the same operation. The correctness of this part is obvious: if two sentential forms are equivalent, the sentential forms that remain after deriving the first character ought to be equivalent. Additionally, the BRANCH rule allows the relation ᾱ op β̄ to be considered as valid when proving the antecedents. This is because the BRANCH rule also incorporates inductive reasoning. To prove ᾱ op β̄, the rule hypothesises that the relation holds for all words of length smaller than k, and attempts to establish that the relation holds for words of length k. It suffices for the antecedents to hold for all words of length less than k, since we peel off the first character from the words generated by ᾱ and β̄ by computing the derivative. Therefore, if the relation ᾱ op β̄ is encountered again during the proof of the antecedents, we know that it needs to hold only for words of length less than k, which holds by hypothesis.

An equivalent contrapositive argument is that if the relation ᾱ op β̄ has a counter-example, then the antecedents will have a strictly smaller counter-example. However, when ᾱ op β̄ is encountered during the proof of the antecedents, it need not be explored any further because it would not lead to the smallest counter-example. Harrison et al. [12] refer to this property, wherein the counter-examples of the newly created relations (antecedents) are strictly smaller than the counter-examples of the input relation (consequent) when they exist, as monotonicity. In our system, the only other rule that is monotonic is the SPLIT rule.

Applying the BRANCH rule to [S ≡ P] produces the relation [T ≡ R] for the terminal a, since T and R are the derivatives of S and P w.r.t. a, and produces the empty relation [∅ ≡ ∅] for the terminal b. The empty relation trivially holds, as asserted by rule EMPTY, and hence is not shown.

Equivalence to Inclusion. The INCLUSION rule reduces equivalence relations to pairs of inclusion relations (e.g. see relation 3 in Fig. 12). The DIST rule simplifies the inclusion relations by distributing the inclusion operation over the left-hand-sides, as illustrated on relation 7. These rules ensure that every relation generated during the algorithm is normalized to the form {α} ≡ {β} or {α} ⊆ β̄.

The TESTCASES rule applies to a relation of the form {α} ⊆ β̄. It samples a predefined set of words from α and

BRANCH
  ∀a ∈ Σ. C ∪ {ᾱ op β̄} ⊢ d̂(a, ᾱ) op d̂(a, β̄)
  ───────────────────────────────────────────
  C ⊢ ᾱ op β̄

INCLUSION
  C ⊢ ᾱ ⊆ β̄    C ⊢ β̄ ⊆ ᾱ
  ───────────────────────
  C ⊢ ᾱ ≡ β̄

DIST
  ∀1 ≤ i ≤ m. C ⊢ {αᵢ} ⊆ β̄
  ─────────────────────────
  C ⊢ ⋃ᵢ₌₁ᵐ αᵢ ⊆ β̄

INDUCT
  rel ∈ C    rel ⇒ ᾱ op β̄
  ───────────────────────
  C ⊢ ᾱ op β̄

SPLIT
  x ∈ ||A||    |γ| > 1    Ψ = ⋃ᵢ₌₁ᵐ ψᵢβᵢ    d̂(x, Ψ) = ⋃ᵢ₌₁ᵐ ρᵢβᵢ    ∀i. |βᵢ| > 0
  C′ = C ∪ {{Aγ} op Ψ}    C′ ⊢ {γ} op d̂(x, Ψ)    ∀i. C′ ⊢ {Aρᵢ} op {ψᵢ}
  ──────────────────────────────────────────────────────────────────────
  C ⊢ {Aγ} op Ψ

TESTCASES
  S ⊂ β̄    sample(n, α) ⊆ L(S)    C ⊢ {α} ⊆ S
  ───────────────────────────────────────────
  C ⊢ {α} ⊆ β̄

EMPTY1         EMPTY2
  ⊢ ∅ ≡ ∅        ⊢ ∅ ⊆ β̄

EPSILON
  C ⊢ ᾱ op β̄
  ───────────────────────────
  C ⊢ (ᾱ ∪ {ε}) op (β̄ ∪ {ε})

Figure 11. Basic inference rules of the verification algorithm. In the figure, op ∈ {≡, ⊆}, ||A|| is the set of shortest words derivable from A, and rel1 ⇒ rel2 is a syntactic implication check that holds if rel1 is stronger than rel2.
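The derivative d(w, α) used throughout Fig. 11 can be sketched for GNF grammars as follows. The string encoding of sentential forms (uppercase = non-terminal, lowercase = terminal) and the tiny grammar — the A/B example from the Derivatives paragraph — are assumptions of this sketch:

```python
# Hypothetical GNF grammar from the text: A -> a, B -> bB | b.
GNF = {"A": ["a"], "B": ["bB", "b"]}

def step(a, form):
    """Sentential forms remaining after deriving the terminal `a` from `form`."""
    if not form:
        return set()
    head, rest = form[0], form[1:]
    if head.islower():                 # a leading terminal must match `a` exactly
        return {rest} if head == a else set()
    # expand the leftmost non-terminal; in GNF every RHS starts with a terminal
    return {rhs[1:] + rest for rhs in GNF[head] if rhs[0] == a}

def d(w, form):
    """d(w, alpha): the forms that remain immediately after deriving the word w."""
    forms = {form}
    for a in w:
        forms = set().union(*[step(a, f) for f in forms])
    return forms
```

On the text's example, `d("aab", "AaB")` yields `{"", "B"}` (i.e. {ε, B}) and `d("b", "AaB")` is empty, matching the definition; because the grammar is in GNF, each of the |w| iterations consumes exactly one character.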

1 [S ≡ P]†  —BRANCH→  2 [T ≡ R]†  —BRANCH→  3 [Tb ≡ Rb ∪ bb] ∧ 4 [b ≡ b]
4 [b ≡ b]  —BRANCH→  5 [ε ≡ ε]  —EPSILON→  [∅ ≡ ∅]  —EMPTY→  proved
3 [Tb ≡ Rb ∪ bb]  —INCLUSION→  6 [Tb ⊆ Rb ∪ bb] ∧ 7 [Rb ∪ bb ⊆ Tb]
6 [Tb ⊆ Rb ∪ bb]  —TESTCASES→  8 [Tb ⊆ Rb]†  —SPLIT→  9 [b ⊆ b] ∧ 10 [T ⊆ R]  —INDUCT→  11 [b ⊆ b]  —∗→  proved
7 [Rb ∪ bb ⊆ Tb]  —DIST→  12 [Rb ⊆ Tb] ∧ 13 [bb ⊆ Tb]
13 [bb ⊆ Tb]†  —BRANCH→  14 [b ⊆ b] ∧ 15 [∅ ⊆ Tbb]  —∗→  proved
12 [Rb ⊆ Tb]†  —SPLIT→  16 [b ⊆ b] ∧ 17 [R ⊆ T]  —INDUCT→  18 [b ⊆ b]  —∗→  proved

Figure 12. Illustration of the application of the rules on the running example. A star (∗) denotes the application of one or more rules. Curly braces around singleton sets are omitted.

searches for a strict subset of β̄ that accepts all the samples. On finding such a subset S, it constructs a stronger relation {α} ⊆ S that implies the input relation. For instance, the rule reduces the relation 6: [Tb ⊆ Rb ∪ bb] to [Tb ⊆ Rb] using a set of examples. The rule uses an enumerator to sample words from sentential forms, and a parser to check if the sampled words are accepted by the sentential forms; in our implementation, this check is performed by a CYK parser extended to parse sentential forms.

The TESTCASES rule, despite being incomplete, is useful in practice. It makes the tool converge faster, and also helps in proving more queries by making other rules applicable. For instance, removing smaller sentential forms from a union may make the SPLIT rule (described shortly) applicable. (Fig. 18 of section 5 sheds light on the usefulness of the TESTCASES rule in practice.)

INDUCT Rule. The INDUCT rule asserts that all relations implied by the context hold. The implication check only uses syntactic equality of the sentential forms. In particular, for equality relations ᾱ ≡ β̄, we check if the context contains the same relation or β̄ ≡ ᾱ. For inclusion relations of the form ᾱ ⊆ β̄, we check if the context contains an equivalence relation between ᾱ and β̄, or an inclusion relation of the form ᾱ ⊆ S, where S has fewer sentential forms than β̄. For instance, in the derivation shown in Fig. 12, the relations 10 and 17 are implied by the relation 2: [T ≡ R], added to the context in step 2 during the application of the BRANCH rule, and hence are considered valid.

SPLIT Rule. The main purpose of the SPLIT rule is to prevent the sentential forms in the relations from becoming excessively long. The key idea behind the rule is to split the sentential forms that are compared (say [Aγ ≡ βδ]) into smaller chunks that are piece-wise equivalent, e.g. as [Aρ ≡ β] and [γ ≡ ρδ] (where ρ is a sentential form derived from β), while preserving completeness under some restrictions. It identifies the point at which to split by deriving a shortest word of A from the other side of the relation.

We apply this rule only to a relation whose left-hand-side is a singleton (since all relations will be reduced to this form). Let r1 be the relation {Aγ} op Ψ (with non-empty γ). Let x be one of the shortest words derived from A, denoted ||A||. The SPLIT rule requires that every sentential form in Ψ can be split into ψᵢβᵢ such that ψᵢ can derive x, and βᵢ is non-empty. (However, this requirement can be relaxed, as described shortly.) This implies that the derivative of Ψ w.r.t. the word x will preserve the suffix βᵢ. That is, d̂(x, Ψ) will be of the form ⋃ᵢ ρᵢβᵢ, where ρᵢ is a union of sentential forms corresponding to the derivative of ψᵢ.

Under the above conditions, the rule asserts that if γ op d̂(x, Ψ) and, for all i, Aρᵢ op ψᵢ hold, then so does r1. Furthermore, the rule allows assuming r1 while proving the antecedents. The requirement that all βᵢ are non-empty ensures the monotonicity of the rule. (Note that βᵢ cannot derive the empty string, as there are no epsilon productions.) If this requirement does not hold, the rule is still applicable, but we cannot add r1 to the context.

The soundness of this assertion is relatively easy to establish. For all i, Aρᵢ op ψᵢ implies Aρᵢβᵢ op ψᵢβᵢ (since we are concatenating the left- and right-hand-sides with the same sentential form). This entails that ⋃ᵢ Aρᵢβᵢ op ⋃ᵢ ψᵢβᵢ. We are also given that γ and ⋃ᵢ ρᵢβᵢ (which is d̂(x, Ψ)) are related by op, where op ∈ {≡, ⊆}. Substituting γ for ⋃ᵢ ρᵢβᵢ yields Aγ op ⋃ᵢ ψᵢβᵢ, which is the relation r1. Hence, the antecedents imply the consequent. However, the converse does not necessarily hold. It holds (for equivalence) when the grammars satisfy the suffix property: αβ ≡ γβ ⇒ α ≡ γ, and are strict deterministic [12] (which includes LL(1) grammars).

In the illustration shown in Fig. 12, the SPLIT rule is applied to relations 8 and 12. Consider the relation 8: [Tb ⊆ Rb]. The shortest word derivable from T is b (see Fig. 14). Since d(b, Rb) = {b}, we can deduce that ψ₁ is R, β₁ is b, and ρ₁ = {ε} (which are the sentential forms that remain after deriving b from R). The new relations created by the SPLIT rule are γ op d̂(b, Rb) and Aρ₁ op ψ₁, which correspond to [b ⊆ b] and [T ⊆ R]. Note that without the application of the SPLIT rule, the relation [Tb ⊆ Rb] would gradually grow with the application of the BRANCH rule and lead to non-termination.

Application Strategy and Termination Checks. In order to preserve termination and completeness of the algorithm for LL grammars, we adopt a specific strategy for applying the rules. We use the INCLUSION rule to convert an equivalence relation to inclusion relations only when at least one of the operands of the relation has size greater than one. Such cases will not arise if both the grammars are LL(1) (or LL(k) when the rules are augmented with a lookahead distance of k). We prevent the sentential forms from growing beyond a threshold by applying the SPLIT rule whenever the threshold is reached. We prioritize the application of the rules EMPTY, INDUCT, and TESTCASES, which simplify the relations, over the BRANCH rule. Note that the TESTCASES rule, which is incomplete, will not apply to LL(1) grammars, since inclusion relations will not be created during their proof exploration.

We use a set of filters to identify relations that are false and to terminate the algorithm. An important filter is the Length filter, which checks for every equivalence query {α} op {β} whether the length of the left sentential form α is larger than the length of the shortest word that can be generated by β, and vice versa. If this check fails, one of the sentential forms cannot generate the shortest word of the other and the relation does not hold. (Recall that the input grammars do not have epsilon productions.) We also use other filters, especially for inclusion relations, to quickly abort the search and report a failure. We elide the details for brevity.

The algorithm described above reduces to the algorithm of Korenjak and Hopcroft [15] for LL(1) grammars that are in GNF. Hence, our algorithm is a decision procedure for LL(1) grammars in GNF. However, our algorithm may not terminate for grammars outside this class, since the sentential forms in an inclusion relation can grow arbitrarily long. In our implementation, we abort the algorithm and return failure if the algorithm exhausts the memory resources or exceeds a parametrizable time limit (fixed as 10s in our experiments).

4.1 Incorporating Lookahead

In general, the SPLIT rule described above is incomplete. Recall that given a relation [Aγ ≡ ψ̄], the rule computes a word x of shortest length derivable from A, and equates γ with the derivative of ψ̄ with respect to x. This is because, since Aγ ⇒* xγ, the rule optimistically hypothesises that γ and d(x, ψ̄) are equivalent. However, if there are other derivations of the form Aγ ⇒* xβ where β ≠ γ, equating γ alone with d(x, ψ̄) could be incomplete. In the case of LL(1) grammars there is at most one derivation for a sentential form starting with a prefix x from a non-terminal A (if x is non-empty), which entails the completeness of the SPLIT rule. Interestingly, this property also holds for LL(k) grammars, provided we know the string w of length at least k − 1 that would follow x. In the sequel, we briefly describe the extensions we perform to the proof rules shown in Fig. 11 to exploit this property. Our extensions are based on the algorithm of Olshansky and Pnueli [24]. In essence, the extensions statically enumerate all strings of length smaller than k, referred to as lookahead strings, and use them to create fine-grained relations between sentential forms.

We perform two major extensions to the relations and sentential forms: (a) We qualify every relation with a (possibly empty) word x, which restricts the relations to only words having x as a prefix. For instance, α ≡x β holds iff α and β generate the same set of words having the prefix x. (b) We introduce two special types of sentential forms: prefix-restricted sentential forms (PRS), and grouped variables. A PRS is of the form [[x, α]], where x is a word and α is a sentential form. It allows only those derivations of α that will lead to a word having x as the prefix. A grouped variable is a disjoint union of two or more PRS that have different prefixes. A grouped variable allows all derivations that are possible through its individual members, akin to a union of sentential forms. PRS and grouped variables are formally defined by Olshansky and Pnueli [24]. They can be treated as any other sentential form, e.g. they can be concatenated with other sentential forms, used in derivative computation, and so on.
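To illustrate the interplay between the BRANCH rule and the inductive use of the context (the INDUCT idea), the following toy checker runs derivatives over the Fig. 10 grammars. The string encoding, the extra inequivalent Q/U grammar, and the fixed depth cut-off are assumptions of this sketch: without the SPLIT rule the sentential forms grow without bound, so this is only a bounded check, not a proof, whereas the real algorithm uses SPLIT and the termination checks instead of a depth bound:

```python
RULES = {"S": ["aT"], "T": ["aTb", "b"],          # Fig. 10 (a): a^n b^n
         "P": ["aR"], "R": ["abb", "aRb", "b"],   # Fig. 10 (b): a^n b^n, ambiguous
         "Q": ["aU"], "U": ["aUbb", "b"]}         # hypothetical, inequivalent grammar
SIGMA = "ab"

def step(a, form):
    """One-character derivative of a single GNF sentential form."""
    if not form:
        return set()
    head, rest = form[0], form[1:]
    if head.islower():
        return {rest} if head == a else set()
    return {rhs[1:] + rest for rhs in RULES[head] if rhs[0] == a}

def deriv(a, forms):
    return frozenset().union(*[step(a, f) for f in forms])

def equiv(alpha, beta, context=frozenset(), depth=0):
    pair = (alpha, beta)
    if pair in context:                  # INDUCT: relation already hypothesised
        return True
    if ("" in alpha) != ("" in beta):    # epsilon accepted on one side only
        return False
    if depth > 25:                       # crude stand-in for SPLIT/termination checks
        return True
    ctx = context | {pair}               # BRANCH: assume the consequent
    return all(equiv(deriv(a, alpha), deriv(a, beta), ctx, depth + 1)
               for a in SIGMA)
```

On the running example, `equiv({S}, {P})` holds up to the bound, while comparing S against the hypothetical Q is refuted by an actual mismatch along one derivative path.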
We extend the definition of the derivative d(w, α) so that it additionally accepts a string x, and refines the result of d(w, α) to include only those sentential forms that can derive the string x. That is, d(w, x, α) = {β | β ∈ d(w, α) ∧ (∃γ. β ⇒* xγ)}. We refer to this parameter x as a lookahead, as it is not consumed by the derivative but is used to select the sentential forms. We denote by d̂ the operation d lifted to a set of sentential forms.

We adapt the BRANCH and SPLIT rules shown in Fig. 11 to the extended domain of relations and sentential forms. (The other rules in Fig. 11 do not require any extensions.) We now discuss the extended BRANCH rule; for brevity, we present the extended SPLIT rule in Appendix B.

BRANCHEXT
  x = aw    ∀b ∈ Σ. C ∪ {ᾱ op_x β̄} ⊢ d̂(a, wb, ᾱ) op_{wb} d̂(a, wb, β̄)
  ──────────────────────────────────────────────────────────────────
  C ⊢ ᾱ op_x β̄

Similar to the BRANCH rule, the BRANCHEXT rule removes the first character a (of the words considered by the relation) from the sentential forms in ᾱ and β̄. However, unlike the BRANCH rule, which compares all the sentential forms left behind after deriving the first character, the BRANCHEXT rule looks ahead at the string wb that follows the character a to choose the sentential forms that have to be compared. Note that the derivative operation only returns the sentential forms that can derive the lookahead string wb.

Given a lookahead distance k, and two grammars with start symbols S1 and S2, we begin the algorithm with the initial set of relations S1 ≡_{w_{k−1}} S2, where w_{k−1} ranges over words of length ≤ k − 1. The grammars are equivalent if every relation is proven using the inference rules. In our implementation, we fix the lookahead distance as 2. Our implementation reduces to the algorithm of Olshansky and Pnueli [24] when the input grammars are LL(2) GNF grammars, and hence is complete for LL(2) grammars in GNF.

5. Experimental Results

We developed a grammar analysis system based on the algorithms presented in this paper, using the Scala programming language. All the experiments discussed in this section were performed on a 64-bit OpenJDK 1.7 VM running on the Ubuntu Linux operating system, executing on a server with two 3.5 GHz Intel Xeon processors and 128GB of memory.

5.1 Evaluations with Programming Language Grammars

Fig. 13 presents details about the benchmarks used in the evaluation. The column Lang. shows the list of programming languages chosen for evaluation. For each language we considered at least two grammars, which are denoted using the names shown in column B. For each language we hand-picked grammars that cover almost all features of the language and are expected to be identical. For example, in the case of Javascript and VHDL, the grammars we chose were supposed to implement the same standard, namely the ECMA standard and VHDL-93. In some cases, the grammars even use identical names for many non-terminals. The column Size shows the number of non-terminals and productions in each grammar when expressed in standard BNF form. The column Source shows the source of the grammars.

Language    B    Size        Source
C 2011      c1   (228, 444)  Antlr v4
            c2   (75, 269)   www.quut.com/c/ANSI-C-grammar-y.html
Pascal      p1   (177, 79)   ftp://ftp.iecc.com/pub/file/pascal-grammar
            p2   (148, 244)  Antlr v3
JavaScript  js1  (128, 336)  www-archive.mozilla.org/js/language/grammar14.html
            js2  (124, 278)  Antlr v4
Java 7      j1   (256, 530)  docs.oracle.com/javase/specs/jls/se7/html/jls-18.html
            j2   (229, 490)  Antlr v4
VHDL        v1   (286, 587)  tams-www.informatik.uni-hamburg.de/vhdl/vhdl.html
            v2   (475, 945)  Antlr v4

Figure 13. Benchmarks, their sizes as pairs of number of non-terminals and productions, and their sources. Antlr v4 and Antlr v3 denote the repositories github.com/antlr/grammars-v4/ and www.antlr3.org/grammar/.

Comparing Real-world Grammars. As an initial experiment, we compared the grammars belonging to the same programming language for equivalence. We ran the counter-example detector for 1 minute on each pair of grammars, fixing the maximum length of the enumerated words as 50. Fig. 14 shows the results of this experiment. The column Ctr. Exs. shows the number of counter-examples that were found in 1 min, and the column Samples shows the number of samples generated during counter-example detection.

Query      # Ctr. Exs.  # Samples  RA time
c1 ≡ c2    82           227        1ms
p1 ≡ p2    417          1053       0.2ms
js1 ≡ js2  75           150        0.8ms
j1 ≡ j2    133          240        1.5ms
v1 ≡ v2    41           52         2.7ms

Figure 14. Counter-examples found in 1 min when comparing grammars of the same programming language. The column RA time denotes the average time taken for one random access.

The column RA time shows the average time taken for accessing one word (of length between 1 and 50) uniformly at random. The results show that the operation is quite efficient, taking only a few milliseconds across all benchmarks.

Interestingly, as shown by the results, the grammars have a large number of differences even when they implement the same standard. In many cases, more than 40% of the sampled words are counter-examples. Manually inspecting a few counter-examples revealed that this is mostly due to rules that are more permissive than they ought to be. For instance, the string "enum ID implements char { ID }" is generated by j2 (the Antlr v4 Java grammar), but is not accepted by j1 [2]. The counter-examples were mostly distinct, but sometimes strings that have many parse trees occurred more than once.

From Counter-examples to Incompatibilities. Focusing on Javascript, we studied the usefulness of the counter-examples found by our tool in discovering incompatibilities between Javascript interpreters (embedded within browsers). In this experiment, we automatically converted every counter-example found by our tool (in one minute) while comparing Javascript grammars to a valid Javascript expression, wrapped the expression inside a function, and passed the function as a string to the eval construct. For example, for the counter-example "++ RegExp - this", we generate

the program eval("var k = function(){ ++ /ab*/ - this }"). This program, when executed, may either assign a function value to the variable k if the body of the function is parsed correctly, or throw an exception if the body has parse errors².

We executed the code snippets on the Mozilla Firefox (version 38), Google Chrome (version 43) and Microsoft Internet Explorer (version 11) browsers. On five counter-examples, the code snippet threw an exception (either a ParseError or a ReferenceError) in the Firefox and Chrome browsers, but terminated normally in Internet Explorer, assigning a function value to k. Exploiting this, we created the pure Javascript program shown in Fig. 15 that returns true in the Firefox/Chrome browsers and false in Internet Explorer.

try {
    eval("var k = function() { ++ /ab*/ - this }");
    false;
} catch(err) {
    true;
}

Figure 15. A Javascript program, created using a counter-example discovered by our tool, that returns true in the Firefox/Chrome browsers, and false in Internet Explorer.

In essence, the experiment highlights that in dynamic languages that support constructs like eval, where parsing may happen at run-time, differences in parsers will likely manifest as differences in run-time behaviors.

Discovering Injected Errors. In this experiment, we evaluate the effectiveness of our tool on grammars that have comparatively fewer, and subtler, counter-examples. Since grammars obtained from independent sources are likely to have many differences, in order to obtain pairs of grammars that almost recognize the same language, we resort to automatic, controlled tweaking of our benchmarks. We introduce 3 types of errors, as explained below. (Let Gm denote the modified grammar and G the original grammar.)

• Type 1 Errors. We construct Gm by removing one production of G chosen at random. In this case, L(Gm) ⊆ L(G). The inclusion is proper only if the production that is removed is not redundant.

• Type 2 Errors. We create Gm by choosing (at random) one production of G having at least two non-terminals, and removing (at random) one non-terminal from its right-hand-side. In this case, neither L(Gm) nor L(G) has to necessarily include the other.

• Type 3 Errors. We construct Gm as follows. We randomly choose one production of the grammar, say P, having at least two non-terminals, and also choose one non-terminal of the right-hand-side, say N. We then create a copy (say N′) of the non-terminal N that has every production of N except one (determined at random). We replace N by N′ in the production P.

The above error model has some correspondence to practical scenarios. For instance, most grammars we found do not enforce that the condition of an if statement should be a boolean-valued expression. They have productions like S → if E then E else E | ··· and E → E + E | E ≥ E | ···. Enforcing this property requires one or more type 3 fixes, as we need to create a copy E′ of E that does not have some productions of E, and use E′ in place of E to define an if condition.

We avoid injecting errors that can be discovered through small counter-examples using the following heuristic. We repeat the random error injection process until the modified grammar agrees with the original grammar on the number of parse trees (the function #t defined in Fig. 5) generating words of length ≤ 15. This ensures that the minimum counter-example, if it exists, is at least 15 tokens long. We relax this bound to 10 and 7 for the C and JavaScript grammars, respectively, since the approach failed to produce errors that satisfy larger bounds within reasonable time limits. We also ensured that the same error is not reintroduced. It is to be noted that the counter-example detection algorithm is not aware of the similarities between the input grammars, neither does it attempt to discover such similarities.

For each benchmark b and error type t, we create 10 defective versions of b, each containing one error of type t. In total, we create 300 defective grammars. In each case, we query the tool for the equivalence of the erroneous and the original versions, with a time out of 15 minutes. Fig. 16 shows the results of this experiment. We categorize the results based on the type of the error that was injected. For now, consider only the sub-columns labelled our.

The column Disproved shows the number of queries disproved, i.e., the cases where the defective grammar was identified to be not equivalent to the original version. (The maximum possible value for this column is 10.) Note that for this experiment we ran our tool only until it found one counter-example. The column Avg.Time/query shows the average time taken by the tool on queries where it found a counter-example. The column Avg.Ctr.Size shows the average length of the counter-examples discovered by the tool. The last row of the table summarizes the results by showing the total number of queries disproved, the average time taken to disprove a query, and the average length of a counter-example.

B      Disproved    Avg.Time/query     Avg.Ctr.Size
       our   cfga   our      cfga      our    cfga
Type 1 Errors
c1     10    7      12.7s    396.7s    29.1   10.0
c2     10    4      13.8s    325.0s    30.3   10.3
p1     10    7      6.8s     127.8s    39.3   15.0
p2     10    5      6.8s     329.2s    43.2   16.2
js1    10    0      10.9s    -         32.2   -
js2    10    9      9.6s     190.9s    31.2   8.1
j1     8     0      14.5s    -         41.1   -
j2     9     0      14.3s    -         32.1   -
v1     10    1      16.9s    810.4s    39.3   15.0
v2     10    0      23.4s    -         39.0   -
Type 2 Errors
c1     9     3      13.3s    319.1s    33.8   10.0
c2     10    6      9.1s     300.7s    35.6   10.3
p1     10    5      6.2s     358.5s    41.3   16.0
p2     10    5      7.9s     229.8s    40.0   15.8
js1    10    0      12.3s    -         33.8   -
js2    7     8      15.3s    52.8s     31.4   7.4
j1     7     0      16.3s    -         33.9   -
j2     9     0      15.1s    -         38.1   -
v1     10    2      16.4s    729.2s    43.7   15.0
v2     10    0      58.0s    -         35.8   -
Type 3 Errors
c1     5     4      37.2s    413.6s    17.8   10.3
c2     6     5      131.3s   361.2s    30.3   10.0
p1     10    3      11.0s    272.5s    34.8   15.0
p2     10    5      7.5s     526.8s    34.8   15.8
js1    5     0      198.6s   -         28.2   -
js2    5     2      34.0s    79.3s     33.2   7.5
j1     8     0      25.7s    -         35.4   -
j2     6     0      24.8s    -         36.3   -
v1     9     0      17.7s    -         38.6   -
v2     9     0      54.6s    -         37.3   -
Total  262   81     28.1s    342.6s    35.0   12.2

Figure 16. Identification of automatically injected errors, using our tool (our) and the implementation of [4] (cfga).

The results show that the tool was successful in disproving all queries except 3 for Type 1 Errors, and 92 out of 100 queries for Type 2 Errors, within a few seconds. For Type 3 Errors, which are quite subtle, the tool succeeded in finding counter-examples for 73 out of 100 queries, taking at most 200s. It timed out after 15 min in the remaining cases. We found that the tool generated millions of words before timing out on a query, across all benchmarks.

To put these results in perspective, we now present a comparison with the approach proposed by Axelsson et al. [4], which is also used in the more recent work of Creus and Godoy [7].

Comparison with Related Work. The approach proposed in [4] finds counter-examples for equivalence by constructing a propositional formula that is unsatisfiable iff the input grammars are equivalent up to a bound l, i.e., they accept (or reject) the same set of words of length ≤ l. The approach uses an incremental SAT solver to obtain a satisfying assignment of the formula, which corresponds to a counter-example for equivalence. We ran their tool cfgAnalyzer on the same set of equivalence queries constructed by automatically injecting errors in our benchmarks as described earlier, with the same time out of 15 minutes. We present the results obtained using their tool in Fig. 16, adjacent to our results, under the sub-column cfga. The cfgAnalyzer tool was run in its default mode, wherein the bound l on the length of the words is incremented in unit steps starting from 1 until a counter-example is found. (We tried adjusting the start length to 15 and greater, and also tried varying the length increments, but they resulted in worse behaviour. This is probably because of incremental solving, which may benefit from starting from 1.)

The results show that our tool outperforms cfgAnalyzer by a huge margin on these benchmarks. When aggregated over all benchmarks, our tool disproves 3 times more queries than cfgAnalyzer. Observe that on the Java, VHDL and first Javascript (js1) benchmarks, cfgAnalyzer timed out on almost all queries. In general, we found that the performance of cfgAnalyzer degrades with the length of the counter-examples and with the sizes of the grammars. On the other hand, as highlighted by the results in Fig. 16, our tool discovers large counter-examples within seconds.

To the credit of cfgAnalyzer, in the cases where it terminates, it finds the shortest counter-example (as a consequence of running it in the default mode). This, however, is not a limitation of our tool, since we can progressively search for smaller counter-examples by narrowing the range of the possible word lengths after discovering a counter-example.

5.2 A Tutoring System for Context-free Grammars

We implemented an online grammar tutoring system, available at grammar.epfl.ch, using our tool. The tutoring system offers three types of exercises: (a) constructing (LL(1) as well as arbitrary) context-free grammars from English descriptions, (b) converting a given context-free grammar to normal forms like CNF and GNF, and (c) writing leftmost derivations for automatically generated strings belonging to a grammar. Each class of exercises had about 20 problems with varying levels of difficulty.

For exercises (a) and (b), the system automatically checks the correctness of the grammars submitted by the users by comparing them to a pre-loaded reference solution for the question. The following are the possible outcomes of checking a solution: (i) the shortest counter-example that was found within the time limits, or (ii) a message that the grammar has been proved correct, or (iii) a message that the grammar passed all test cases but was not proved to be correct. The system also supports checking the LL(1) property and ambiguity of grammars.

Query        Refuted       Proved       Unprvd.    time/query
1395 (100%)  1042 (74.6%)  289 (20.7%)  64 (4.6%)  107ms

Figure 17. Summary of evaluating students' solutions.

Evaluation of the Verification Algorithm. Our tutoring system uses the verification algorithm described in section 4 to establish the correctness of the submissions for which no counter-examples are found within the given time limit and sample size. In our evaluation, there are 353 such submissions. The first row of Fig. 18 shows the results of using the algorithm, with all of its features enabled, on the 353 submissions. We used a time out of 10s per query. The system proved almost 82% of the queries, taking on average less than half a second per query (as shown by the column Time). The remaining columns further classify the queries that were verified, based on the nature of the grammars that are compared.

The column LL1 shows the number of queries in which the grammars that are compared are LL(1) when normalized to GNF. The algorithm of [15] is applicable only to these cases. The results show that only a meager 2% of the queries belong to this category. This is expected since even LL(1)

                     Query  Proved       Time   LL1     LL2         Amb
                     353    289 (81.9%)  410ms  7 (2%)  56 (15.9%)  101 (28.6%)
w/o TESTCASES rule   353    280          630ms  7       56          94

Figure 18. Evaluation of the verification algorithm on students' solutions.
Moreover, it also has experimental grammars may become non-LL(1) when epsilon productions support for generating hints (a feature outside the scope of are eliminated [28] (which is required by the verification this paper). The system offers a very intuitive syntax for algorithm). writing grammars, and also supports EBNF form that permits The column LL2 shows the number of queries in which the using regular expressions in right-hand-sides of productions. grammars compared are LL(2) but not LL(1), after conversion to GNF. About 16% of the queries belong this category. 5.3 Evaluations of the Algorithms in the Context of a This class is interesting because the algorithm of [24] is Tutoring System complete for these cases. (Although the algorithm of [12] is We used our tutoring system in a 3rd year undergraduate also applicable, it seldom succeeds for these queries since course on computer language processing. We summarize the it uses no lookhead). A vast majority (72%) of the queries results of this study in Fig. 17. The column Queries shows that are proven involved at least one grammar that is not the total number of distinct equivalence queries that the tool LL(2). In fact, about 28% of the queries involved ambiguous was run on. The system refuted 1042 queries by finding grammars. (Neither [24] nor [15] is directly applicable to counter-examples. (It was configured to enumerate at most this class of grammars, and [12] is likely to be ineffective.) 1000 words of length 1 to 11). Among the 353 submissions This indicates that without our extensions a vast majority for which no counter-example was found, the tool proved of the queries may remain unproven. We are not aware of the correctness of 289 submissions. For 64 submissions, the any existing algorithm that can prove equivalence queries tool was neither able to find a counter-example nor able involving ambiguous grammars. to prove correctness. 
In essence, the tool was to able to We also measure the impact of the TESTCASES inference decide the veracity of 95% of the submissions, and was rule, which uses concrete examples to refine inclusion rela- incomplete on the remaining 5% (in which cases we report tions (see section 4). The second row of Fig. 18 shows the that the student’s solution is possibly correct). The grammars results of running the verification algorithm without this rule. submitted by students on average had around 3 non-terminals Overall, the number of queries proven decreases by 9 when and 6 productions (the maximum was 9 non-terminals and this rule is disabled. The impact is mostly on queries involv- 43 productions). Moreover, at least 370 of the submissions ing ambiguous grammars. Moreover, the verifier is slower were ambiguous. We now present detailed results on the in this case as shown by the increase in the average time per effectiveness of the verification algorithm, which is, to our query. It also timed out on 25 queries after 10s. This is due knowledge, a unique feature of our grammar tutoring system. to the increase in the number and sizes of relations created during the verification algorithm. We measured a two fold Our approach has a comparable running time for sampling increase in the average number of sentential forms contained a word u.a.r, and is not restricted to uniform random sampling. in a relation. We are not aware of any implementations of these related works. 6. Related Work Enumeration in the Context of Testing. Grammar-based software testing approaches (such as [27], [22], [30], [21], Grammar Analysis Systems. Axelsson et al. [4] present a [13], [18], [20], [9], [11]) generate strings belonging to gram- constraint based approach for checking bounded properties mars describing the structure of the input, and use them to of context-free grammars including equivalence and ambigu- test softwares like refactoring engines and compilers. In con- ity. 
In section 5 we presented a comparison of our counter- trast to our objective, there the focus is on generating strings example detection algorithm with this work, which shows from grammars satisfying complex semantic properties, such that our approach does better especially when the counter- as data-structure invariants, type correctness etc., that will examples and grammars are large. Creus and Godoy [7] expose bugs in the software under test. present RACSO an online judge for context-free grammars. Purdom [27], and Malloy [21] present specialized algo- RACSO integrates many strategies for counter-example de- rithms for generating small number of test cases that result in tection including the approach of Axelsson et al. [4]. We semantically correct strings useful for detecting bugs. Maurer differ from this work in many aspects. For instance, our enu- [22], Sirer and Bershad [30], and Guo and Qiu [11] pro- merators support random access and uniform sampling, scale pose approaches for stochastic enumeration of strings from to large programming language grammars generating mil- probabilistic grammars where productions are weighted by lions of strings within seconds. Our system can additionally probabilities. The probabilities are either manually provided prove equivalence of grammars. (An empirical comparison or dynamically adjusted during enumeration. A difference with this work was not possible since their interface restricts compared to our approach is that they do not sample words the sizes of grammars that can be used while creating prob- by restricting their length (which is hard in the presence of lems, by requiring that non-terminals have to be upper case semantic properties), but control the frequency with which characters.) the productions are used. Decision Procedures for Equivalence. 
Decision proce- Hennessy [13], and Lämmel and Schulte [18] explore var- dures for restricted classes of context-free grammars have ious criteria for covering the productions of the grammar that been extensively researched [15], [24], [12], [28], [32], [23], can be beneficial in discovering bugs in softwares. Majumdar [5], [31]. For brevity we only highlight a few important works. and Xu [20], and Godefroid et al. [9] propose approaches Korenjak and Hopcroft [15], and later Bastien et al. [5] de- for selectively generating strings from a grammar that will veloped algorithms for deciding equivalence of simple gram- exercise a path in the program under test using symbolic mars. Rosenkrantz and Streans [28] introduced LL(k) gram- execution. mars and showed that their equivalence is decidable. Later, Daniel et al [8], and Kuraj and Kuncak [17] present generic Olshansky and Pnueli [24] proposed a direct algorithm for de- approaches for constructing enumerators for arbitrary struc- ciding equivalence of LL(k) grammars. Nijholt [23] presented tures, by way of enumerator combinators. They allow com- a similar result for LL-regular grammars, which properly con- bining simple enumerators using a set of combinators (such tain LL(k) grammars. Decision procedures for several proper as union and product) to produce more complex enumera- subclasses of deterministic grammars were studied by Harri- tors. These approaches (Kuraj and Kuncak [17] in particular) son et al. [12], and Valiant [32]. Sénizergues [31] showed that were an inspiration for our enumeration algorithm, which is equivalence of arbitrary deterministic grammars is decidable. specialized for grammars, and provides more functionalities We are not aware of any practical applications of these like polynomial time random access, and uniform random algorithms. We extend the algorithms of Olshansky and sampling. Pnueli [24], and Harrison et al. 
[12] to a sound but incomplete approach for proving equivalence of arbitrary grammars, and 7. Conclusions use it to power a grammar tutoring system. We present scalable algorithms for enumerating and sampling Uniform Sampling of Words. Hickey and Cohen [14], words (and parse trees) belonging to context-free grammars, and Mairson [19] present algorithms for sampling words using bijective functions from natural numbers to parse from unambiguous grammars uniformly at random (u.a.r). trees that provide random access to words belonging to a Gore et al. [10] develop a subexponential time algorithm for grammar in polynomial time. Our experiments show that the sampling words from (possibly ambiguous) grammars, where enumerators are effective in finding discrepancies between the probability of generating a word varies from uniform large, real world grammars meant to describe the same by a factor 1 + ,  ∈ (0, 1). Bertoni et al. [6] present an language, as well as for unraveling deep, subtle differences algorithm for sampling from a finitely ambiguous grammar between grammars, outperforming the available state of the in polynomial time. art techniques. We also show that the counter-examples serve as good test cases for discovering incompatibilities between [14] T. J. Hickey and J. Cohen. Uniform random generation of interpreters, especially for languages like Javascript. strings in a context-free language. SIAM J. Comput., 12(4): We also develop a practical system for proving equiva- 645–655, 1983 lence of arbitrary context-free grammars building on top of [15] A. J. Korenjak and J. E. Hopcroft. Simple deterministic prior theoretical research. We built a grammar tutoring sys- languages. In Symposium on Switching and Automata Theory tem, available at grammar.epfl.ch, using our algorithms. (Swat), pages 36–46, 1966 Our evaluations show that the system is able to decide the [16] D. Kozen. Automata and computability. 
Undergraduate texts correctness of 95% of the submissions, proving almost 82% in computer science. Springer, 1997. ISBN 978-0-387-94907-9 of grammars that pass all test cases. To our knowledge, this [17] I. Kuraj and V. Kuncak. Scife: Scala framework for efficient is the first tutoring system for grammars that can certify the enumeration of data structures with invariants. In Scala correctness of the solutions. This opens up the possibility of Workshop, pages 45–49, 2014 using our tool in massive open online courses to introduce [18] R. Lämmel and W. Schulte. Controllable combinatorial cover- grammars to large populations of students. age in grammar-based testing. In Testing of Communicating Systems, TestCom, pages 19–38, 2006 References [19] H. G. Mairson. Generating words in a context-free language [1] Antlr version 4. http://www.antlr.org/ uniformly at random. Inf. Process. Lett., 49(2):95–99, 1994 [2] Java 7 language specification. http://docs.oracle.com/ [20] R. Majumdar and R. Xu. Directed test generation using javase/specs/jls/se7/html/jls-18.html symbolic grammars. In Automated Software Engineering, [3] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Princiles, pages 553–556, 2007 Techniques, and Tools. Addison-Wesley, 1986. ISBN 0-201- [21] B. A. Malloy. An interpretation of purdom’s algorithm for 10088-6 automatic generation of test cases. In International Conference [4] R. Axelsson, K. Heljanko, and M. Lange. Analyzing on Computer and Information Science, pages 3–5, 2001 context-free grammars using an incremental SAT solver. [22] P. M. Maurer. Generating test data with enhanced context-free In Automata, Languages and Programming, ICALP, pages grammars. IEEE Software, 7(4):50–55, 1990 410–422, 2008. . URL http://dx.doi.org/10.1007/ [23] A. Nijholt. The equivalence problem for LL- and LR-regular 978-3-540-70583-3_34 grammars. pages 149–161, 1982 [5] C. Bastien, J. Czyzowicz, W. Fraczak, and W. Rytter. Prime [24] T. Olshansky and A. Pnueli. 
A direct algorithm for checking normal form and equivalence of simple grammars. Theor. equivalence of LL(k) grammars. Theor. Comput. Sci., 4(3): Comput. Sci., 363(2):124–134, 2006 321–349, 1977 [6] A. Bertoni, M. Goldwurm, and M. Santini. Random generation and approximate counting of ambiguously described combina- [25] T. Parr, S. Harwell, and K. Fisher. Adaptive LL(*) parsing: the torial structures. In STACS 2000, pages 567–580. 2000 power of dynamic analysis. In Object Oriented Programming Systems Languages & Applications, OOPSLA, pages 579–598, [7] C. Creus and G. Godoy. Automatic evaluation of context- 2014 free grammars (system description). In Rewriting and Typed Lambda Calculi RTA-TLCA, pages 139–148, 2014. . URL [26] S. Pigeon. Pairing function. http://mathworld.wolfram. http://dx.doi.org/10.1007/978-3-319-08918-8_10 com/PairingFunction.html [8] B. Daniel, D. Dig, K. Garcia, and D. Marinov. Automated [27] P. Purdom. A sentence generator for testing parsers. BIT testing of refactoring engines. In Foundations of Software Numerical Mathematics, pages 366–375, 1972 Engineering, pages 185–194, 2007 [28] D. J. Rosenkrantz and R. E. Stearns. Properties of deterministic [9] P. Godefroid, A. Kiezun, and M. Y. Levin. Grammar-based top down grammars. In Symposium on Theory of Computing whitebox fuzzing. In Programming Language Design and STOC, pages 165–180, 1969 Implementation, pages 206–215, 2008 [29] R. Singh, S. Gulwani, and A. Solar-Lezama. Automated feed- [10] V. Gore, M. Jerrum, S. Kannan, Z. Sweedyk, and S. R. Ma- back generation for introductory programming assignments. haney. A quasi-polynomial-time algorithm for sampling words In Programming Language Design and Implementation PLDI, from a context-free language. Inf. Comput., 134(1):59–74, pages 15–26, 2013 1997 [30] E. G. Sirer and B. N. Bershad. Using production grammars in [11] H. Guo and Z. Qiu. Automatic grammar-based test generation. software testing. 
In Domain-Specific Languages DSL, pages In Testing Software and Systems ICTSS, pages 17–32, 2013 1–13, 1999 [12] M. A. Harrison, I. M. Havel, and A. Yehudai. On equivalence [31] G. Sénizergues. L(a)=l(b)? decidability results from complete of grammars through transformation trees. Theor. Comput. Sci., formal systems. Theoretical Computer Science, 251(1–2):1 – 9:173–205, 1979 166, 2001 [13] M. Hennessy. An analysis of rule coverage as a criterion in [32] L. G. Valiant. Decision procedures for families of deterministic generating minimal test suites for grammar-based software. In pushdown automata. Technical report, University of Warwick, Automated Software Engineering, pages 104–113, 2005 Coventry, UK, 1973 [33] A. Warth, J. R. Douglass, and T. D. Millstein. Packrat parsers We define xskip(z) as (t, w), where t and w are defined as can support left recursion. In Symposium on Partial Evaluation follows: and Semantics-based Program Manipulation, PEPM, pages 2wx − x2 + x 103–110, 2008 t = b b b 2 2z + x2 + x  w = b b A. Cantor’s Inverse Pairing Functions for 2(xb + 1) Bounded and Unbounded Domains Define yskip(z) as (t, w), where The basic inverse pairing function π that maps a natural 2wy − y2 + y number in one dimensional space to a number in two dimen- t = b b b sional space that is unbounded along both directions [26]. 2 2z + y2 − y  π(z) = (x, y), where x and y are defined as follows: w = b b 2yb x = w − y Define bskip(z) as (t, w), where y = z − t (2w − 1)w − w2 − s + w (t, w) = simple(z) t = b b b 2 l m  p 2  where, simple(z) = (t, w) is a function defined as follows: r − r − 8z − 4sb + 4yb − 4xb  w =    2  w(w + 1) t = 2 r = 2wb + 1 $ √ %  8z + 1 − 1 w = x + y w = b b b 2 2 2 sb = xb + yb
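As a concrete illustration, the unbounded case can be transcribed directly into runnable code. The following Python sketch implements simple(z) and π(z) as defined above, together with the forward Cantor pairing function for checking the round trip; the names simple, unpair, and pair are ours, not identifiers from the paper.

```python
import math

def simple(z):
    # Diagonal index w of z, and the index t of the first
    # element on that diagonal (as defined above).
    w = (math.isqrt(8 * z + 1) - 1) // 2
    t = w * (w + 1) // 2
    return t, w

def unpair(z):
    # Basic inverse Cantor pairing: maps N onto N x N.
    t, w = simple(z)
    y = z - t
    x = w - y
    return x, y

def pair(x, y):
    # Forward Cantor pairing, the inverse of unpair.
    w = x + y
    return w * (w + 1) // 2 + y
```

Enumerating z = 0, 1, 2, ... walks the plane diagonal by diagonal: (0,0), (1,0), (0,1), (2,0), (1,1), (0,2), and so on.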

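Bijections of this kind between numbers and structures are what make random access into a grammar's language possible. As a simplified illustration of the counting idea (not the paper's algorithm), the following Python sketch counts the words of each length derivable from a sentential form and uses the counts to return the i-th word directly; GRAMMAR, count, and word_at are illustrative names, and the sketch assumes a grammar without epsilon productions or left recursion.

```python
from functools import lru_cache

# Toy grammar: S -> "a" | "b" | "a" S | "b" S, i.e. all non-empty
# words over {a, b}.  Nonterminals are dictionary keys; each
# alternative is a tuple of symbols.
GRAMMAR = {"S": [("a",), ("b",), ("a", "S"), ("b", "S")]}

@lru_cache(maxsize=None)
def count(form, n):
    # Number of words of length exactly n derivable from the
    # sentential form `form` (a tuple of symbols).
    if not form:
        return 1 if n == 0 else 0
    head, rest = form[0], form[1:]
    if head not in GRAMMAR:            # terminal symbol
        return count(rest, n - 1) if n >= 1 else 0
    return sum(count(alt + rest, n) for alt in GRAMMAR[head])

def word_at(form, n, i):
    # Return the i-th word (0-indexed) of length n derivable from
    # `form`, in the order induced by the order of alternatives.
    if not form:
        return ""
    head, rest = form[0], form[1:]
    if head not in GRAMMAR:
        return head + word_at(rest, n - 1, i)
    for alt in GRAMMAR[head]:
        c = count(alt + rest, n)
        if i < c:
            return word_at(alt + rest, n, i)
        i -= c
    raise IndexError(i)
```

Because count memoizes on (form, n), each word_at query descends the derivation once, picking the alternative whose block of indices contains i; this is the sense in which indexing into the language avoids enumerating all preceding words.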
We extend the Cantor’s inverse pairing function to two The above definitions use only arithmetic opera- dimensional spaces bounded along one or both directions. tions. (Note that we always compute floor or ceil of divisions The inverse pairing function π takes three arguments: the and square roots). These operations take at most quadratic number z that has to be mapped, and the bounds of the x time on the sizes of the inputs, and can be optimized even and y dimensions xb and yb (which could be ∞). xb is the further. Moreover, many multipliers and divisors are powers (inclusive) bound on the x-axis i.e, ∀x.x ≤ xb, and yb is the of 2, and hence can be computed by bit shift operations. (exclusive) bound on the y-axis i.e, ∀y.y < yb. We define π(z, xb, yb) = (x, y), where B. Extended Split Rule Fig. 19 shows the extended split rule that incorporates a finite x = w − y amount of lookahead. We assume that the lookahead distance y = z − t k is at least 2. We define Θk−1(γ) as the set of all words  of length at most k − 1 derivable from a sentential form γ. bskip(z) if z ≥ zb k−1  S j ∗  That is, Θk−1(γ) = {w | w ∈ Σ ∧γ ⇒ w}. Recall the xskip(z) if zx ≤ z < zb (t, w) = j=1 yskip(z) if zy ≤ z < zb definition of SPLIT shown in Fig. 11. The SPLITEXT rule has  simple(z) Otherwise a similar structure but it creates more constrained relations by specializing the SPLIT rule for every possible lookahead Θ (γ) where, zx, zy and zb are indices at which the bounds along string in k−1 . the x or y or both directions are crossed, respectively. The Let x be one of the shortest words derivable from A. Akin values are defined as follows: to the SPLIT rule, SPLITEXT rule applies only when Ψ can be expressed as a union of sentential forms ψi each of which yb(yb + 1) has a non-empty suffix of length at least two (represented zy = 2 using δiβi) that is preserved by its derivative with respect (xb + 1)(xb + 2) to x. 
The rule computes the derivative of ψi w.r.t x looking zx = 2 ahead at the strings w ∈ Θk−1(γ) (which are possible strings  that may follow x). We denote using ρw the derivative of (the yb(xb − yb + 1) + zy if xb > yb − 1 i  prefix of) ψi w.r.t x when the lookahead string is w. zb = (xb + 1)(yb − xb − 1) + zx if yb − 1 > xb The relations shown in the antecedents of the SPLITEXT  zy Otherwise rule are straightforward extensions of the antecedents asserted SPLITEXT m [ w x ∈ ||A|| Ψ = αiδiβi ∀w ∈ Θk−1(γ). d(x, w, αiδiβi) = ρi δiβi ∀0 ≤ i ≤ m. |δi| > 0, |βi| > 0 i=1 C0 = C ∪ {{Aγ} op Ψ} z   0 ˆ 0 [ w ∀w ∈ Θk−1(γ). C ` {γ} opw d(x, w, Ψ) ∀0 ≤ i ≤ m. C `  A[[w, ρi δi]] opz {αiδi}

w∈Θk−1(γ)

C ` {Aγ} opz Ψ

Figure 19. Extended Split Rule. Θk−1(γ) is the set of all words of length at most k − 1 derivable from the sentential form γ. ||A|| is the set of shortest words derivable from A. [[w, α]] denotes a prefix restricted sentential form defined by Olshansky and Pnueli [24].

w by the SPLIT rule that take into account the lookahead strings by op. The sentential form [[w, ρi δi]] used in the antecedent in Θk−1(γ). For instance, the antecedent relations on the left relations on the right denotes a prefix restricted sentential w assert that for any lookahead string w, the strings generated by form [24] that generates only those strings of ρi δi having γ and d(x, w, Ψ) starting with the prefix w should be related the prefix w.
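The set Θ_{k−1}(γ) can be computed by bounded exhaustive derivation. Below is a minimal Python sketch, illustrative only: GRAMMAR and theta are our names, the toy grammar is an assumption, and the pruning guarantees termination only for grammars in which every production either adds a terminal or shrinks the sentential form (it may not terminate for grammars with growing nonterminal-only cycles).

```python
# Toy grammar: S -> epsilon | "a" S, i.e. the language a*.
# Nonterminals are dictionary keys; alternatives are tuples of symbols.
GRAMMAR = {"S": [(), ("a", "S")]}

def theta(form, bound):
    # All words of length 1..bound derivable from the sentential form
    # `form` (a tuple of symbols), by leftmost expansion with pruning
    # of forms that already contain more than `bound` terminals.
    words, seen = set(), set()
    stack = [tuple(form)]
    while stack:
        sf = stack.pop()
        if sf in seen:
            continue
        seen.add(sf)
        if sum(1 for s in sf if s not in GRAMMAR) > bound:
            continue                   # cannot yield a short enough word
        i = next((j for j, s in enumerate(sf) if s in GRAMMAR), None)
        if i is None:                  # no nonterminal left: a word
            word = "".join(sf)
            if 1 <= len(word) <= bound:
                words.add(word)
            continue
        for alt in GRAMMAR[sf[i]]:     # expand the leftmost nonterminal
            stack.append(sf[:i] + tuple(alt) + sf[i + 1:])
    return words
```

For example, theta(("S",), k − 1) with k = 4 yields the words a, aa, and aaa, matching the definition of Θ_{k−1} above (which excludes the empty word, since j ranges from 1).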