An Attribute- Grammar Compiling System Based on Yacc, Lex, and C Kurt M

Total Page:16

File Type:pdf, Size:1020Kb

An Attribute- Grammar Compiling System Based on Yacc, Lex, and C Kurt M View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Digital Repository @ Iowa State University Computer Science Technical Reports Computer Science 12-1992 Design, Implementation, Use, and Evaluation of Ox: An Attribute- Grammar Compiling System based on Yacc, Lex, and C Kurt M. Bischoff Iowa State University Follow this and additional works at: http://lib.dr.iastate.edu/cs_techreports Part of the Programming Languages and Compilers Commons Recommended Citation Bischoff, Kurt M., "Design, Implementation, Use, and Evaluation of Ox: An Attribute- Grammar Compiling System based on Yacc, Lex, and C" (1992). Computer Science Technical Reports. 23. http://lib.dr.iastate.edu/cs_techreports/23 This Article is brought to you for free and open access by the Computer Science at Iowa State University Digital Repository. It has been accepted for inclusion in Computer Science Technical Reports by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. Design, Implementation, Use, and Evaluation of Ox: An Attribute- Grammar Compiling System based on Yacc, Lex, and C Abstract Ox generalizes the function of Yacc in the way that attribute grammars generalize context-free grammars. Ordinary Yacc and Lex specifications may be augmented with definitions of synthesized and inherited attributes written in C syntax. From these specifications, Ox generates a program that builds and decorates attributed parse trees. Ox accepts a most general class of attribute grammars. The user may specify postdecoration traversals for easy ordering of side effects such as code generation. Ox handles the tedious and error-prone details of writing code for parse-tree management, so its use eases problems of security and maintainability associated with that aspect of translator development. The translators generated by Ox use internal memory management that is often much faster than the common technique of calling malloc once for each parse-tree node. Ox is a Yacc/Lex/C preprocessor, and is designed to bring attribute grammars closer to the mainstream of Unix-based language development. Ox inherits all of the familiar syntax and semantics of Yacc, Lex, and C. It is relatively easy to convert programs between Ox code and "pure Yacc/Lex/C" code. Ox has been used to build a compiler for a small (eighty grammar rules) block-structured imperative programming language. This paper considers Ox's design, implementation, and use, evaluates the performance of Ox-generated evaluators, and makes recommendations for improvements in Ox. Disciplines Programming Languages and Compilers This article is available at Iowa State University Digital Repository: http://lib.dr.iastate.edu/cs_techreports/23 Design, Implementation, Use, and Evaluation of Ox:AnAttribute-Grammar Compiling System based on Yacc, Lex, and C by Kurt M. Bischo TR92-31 Decemb er, 1992 Department of Computer Science 226 Atanaso Hall Iowa State University Ames, Iowa 50011 c 1992 Kurt M. Bischo Contents 1 General Intro duction and User Manual 5 1.1 Overview of Use :: :: :: ::: :: :: :: :: ::: :: :: :: 5 1.2 Preliminary :: :: :: :: ::: :: :: :: :: ::: :: :: :: 6 1.3 Attribute Declarations :: ::: :: :: :: :: ::: :: :: :: 7 1.3.1 Semantics of Attribute Declarations :: ::: :: :: :: 8 1.4 Rules and Attribute Occurrences : :: :: :: ::: :: :: :: 8 1.5 Ox Attribute De nitions : ::: :: :: :: :: ::: :: :: :: 9 1.5.1 Inherited vs. Synthesized Attributes : ::: :: :: :: 9 1.5.2 Attribute Reference Sections in the Y- le :: :: :: :: 10 1.5.3 Attribute Reference Sections in the L- les :: :: :: 13 1.5.4 Cycles : :: :: :: ::: :: :: :: :: ::: :: :: :: 14 1.6 Translation into C Co de : ::: :: :: :: :: ::: :: :: :: 14 1.7 Temp oral Behavior of the Ox-generated Evaluator ::::::: 15 1.7.1 Stack Op erations : ::: :: :: :: :: ::: :: :: :: 15 1.7.2 Placement of Generated Co de : :: :: ::: :: :: :: 15 1.7.3 Decoration and the Ready Set : :: :: ::: :: :: :: 16 1.8 Programming Style :: :: ::: :: :: :: :: ::: :: :: :: 17 1.9 Postdecoration Traversals ::: :: :: :: :: ::: :: :: :: 17 1.9.1 Example: In x to Pre x Translation : ::: :: :: :: 18 1.9.2 General Description ::: :: :: :: :: ::: :: :: :: 20 1.10 Macros ::: :: :: :: :: ::: :: :: :: :: ::: :: :: :: 24 1.10.1 Macro De nitions : ::: :: :: :: :: ::: :: :: :: 24 1.10.2 Macro Uses :: :: ::: :: :: :: :: ::: :: :: :: 25 1.10.3 Example :: :: :: ::: :: :: :: :: ::: :: :: :: 25 1.11 Shell Command Sequences for Evaluator Development ::::: 26 1.11.1 Conventions of Naming Ox Output Files :: :: :: :: 26 1 1.11.2 Review: Combining the Outputs of Yacc and Lex :: 26 1.11.3 Combined Use of Ox, Yacc, and Lex ::: :: :: :: 27 1.11.4 Typical Command Sequences : :: :: ::: :: :: :: 27 1.12 Other Points and Features ::: :: :: :: :: ::: :: :: :: 28 1.12.1 Error Recovery :: ::: :: :: :: :: ::: :: :: :: 28 1.12.2 Memory Alignment ::: :: :: :: :: ::: :: :: :: 28 1.12.3 Stripping Ox Constructs :: :: :: :: ::: :: :: :: 28 1.12.4 Preventing Execution of Attribute De nition Co de :: 29 1.12.5 Control of Storage Allo cation in the Generated Evaluator 29 1.12.6 Parse Tree Statistics :: :: :: :: :: ::: :: :: :: 30 1.12.7 Adjusting the Sizes of Ox's Data Structures :: :: :: 31 1.13 Example: An Integer Calculator :: :: :: :: ::: :: :: :: 31 1.14 Example: A Binary Number Translator :: :: ::: :: :: :: 33 1.15 Example: Translation from In x to Post x and Pre x ::::: 37 2 General Design Criteria and Comparison with Other Sys- tems 40 2.1 Generalization of the Function of Yacc : :: ::: :: :: :: 40 2.2 Generality of the Class of Attribute Grammars Accepted ::: 41 2.3 Ox as a Prepro cessor : :: ::: :: :: :: :: ::: :: :: :: 41 2.4 Portability : :: :: :: :: ::: :: :: :: :: ::: :: :: :: 41 2.5 Comp eting Systems :: :: ::: :: :: :: :: ::: :: :: :: 42 2.5.1 The Delft System : ::: :: :: :: :: ::: :: :: :: 42 2.6 Performance :: :: :: :: ::: :: :: :: :: ::: :: :: :: 42 3 Design of the Ox-generated Evaluator 44 3.1 Top ological Sort Using Dep endee Counts : :: ::: :: :: :: 44 3.2 Static Phase of the Top ological Sort :: :: :: ::: :: :: :: 45 3.3 Parse-Tree No des : :: :: ::: :: :: :: :: ::: :: :: :: 47 3.4 Items on the Parse-Tree Stack :::::::::::::::::: 48 3.5 Addressing Attribute Instances and Dep endee Counts ::::: 48 3.6 Static Data Structures of the Generated Evaluator ::::::: 49 3.6.1 Table of Dep endee-Count-Initialization Lists :: :: :: 49 3.6.2 Table of Decoration-time-executed Co de Fragments :: 49 3.6.3 Table of Lists of Dep endents :: :: :: ::: :: :: :: 49 3.7 Solved Instances Belonging to Parentless Nonterminal No des : 50 3.8 Op erations on the Parse-Tree Stack :: :: :: ::: :: :: :: 50 2 3.8.1 Synchronization of Stacks : :: :: :: ::: :: :: :: 50 3.8.2 Bu ering of Images of New Terminal No des: Lo okahead 51 3.8.3 Creation of Terminal No des: Shifting : ::: :: :: :: 51 3.8.4 Creation of Nonterminal No des: Reduction ::::::: 51 3.9 Dynamic Phases of the Top ological Sort : :: ::: :: :: :: 52 3.9.1 Decoration : :: :: ::: :: :: :: :: ::: :: :: :: 52 3.10 Stack Implementation of the Ready Set :: :: ::: :: :: :: 53 3.11 Storage ManagementScheme :::::::::::::::::: 54 3.11.1 Parse-Tree Storage ::: :: :: :: :: ::: :: :: :: 54 3.11.2 Dep endee-Count Storage :: :: :: :: ::: :: :: :: 55 3.11.3 Pruning :: :: :: ::: :: :: :: :: ::: :: :: :: 55 3.11.4 Example: Stack Op erations and Pruning :: :: :: :: 56 4 Design and Implementation of the Evaluator Generator 59 4.1 Size of Source Co de :: :: ::: :: :: :: :: ::: :: :: :: 59 4.2 Lexical Analysis :: :: :: ::: :: :: :: :: ::: :: :: :: 59 4.2.1 Limits of Lex :: ::: :: :: :: :: ::: :: :: :: 59 4.2.2 Use of Several Scanners :::::::::::::::::: 60 4.2.3 Macro Pro cessing : ::: :: :: :: :: ::: :: :: :: 61 4.3 Syntax Analysis :: :: :: ::: :: :: :: :: ::: :: :: :: 61 4.4 Semantic Analysis : :: :: ::: :: :: :: :: ::: :: :: :: 61 4.4.1 Top ological Sort :: ::: :: :: :: :: ::: :: :: :: 62 4.4.2 Pruning Conditions ::: :: :: :: :: ::: :: :: :: 62 4.5 Co de Generation : :: :: ::: :: :: :: :: ::: :: :: :: 64 4.6 Validation of Ox : :: :: ::: :: :: :: :: ::: :: :: :: 64 5 Exp erience with Ox 65 5.1 Small Languages : :: :: ::: :: :: :: :: ::: :: :: :: 65 5.2 GPPL: A Blo ck-structured Imp erative Programming Language 65 5.2.1 General Description ::: :: :: :: :: ::: :: :: :: 65 5.2.2 Implementation: gc :: :: :: :: :: ::: :: :: :: 66 5.2.3 Performance of gc ::: :: :: :: :: ::: :: :: :: 67 5.2.4 Remark :: :: :: ::: :: :: :: :: ::: :: :: :: 70 6 Future Work 71 6.1 Easy Improvements in Storage Eciency : :: ::: :: :: :: 71 6.1.1 Dep endee Counts as Unsigned Nybbles ::: :: :: :: 71 3 6.1.2 Eliminating Attribute-less Leaf No des : ::: :: :: :: 72 6.1.3 Removing Indirection from Parse-Tree No des :: :: :: 72 6.1.4 Calculation of Space Savings :: :: :: ::: :: :: :: 73 6.2 Grammar-wide Semantic Analysis : :: :: :: ::: :: :: :: 74 6.2.1 Static CircularityTest :::::::::::::::::: 74 6.2.2 Solving More Instances at Reduction Time ::::::: 74 6.2.3 Global Static Top ological Sorting : :: ::: :: :: :: 74 6.3 Extensions to Ox Syntax and Semantics : :: ::: :: :: :: 75 6.3.1 Dep endees for Synthesized Occurrences of Tokens ::: 75 6.3.2 L- les not Based on Lex :: :: :: ::: :: :: :: 76 6.3.3 Defaults for Attribute De nitions : :: ::: :: :: :: 76 6.3.4 Automatic Construction of Syntax Trees :: :: :: :: 76 6.3.5 Traversals at Reduction Time : :: :: ::: :: :: :: 77 6.3.6 Rep etition of Traversals :: :: :: :: ::: :: :: :: 77 6.4 Facility for Interactive Debugging : :: :: :: ::: :: :: :: 77 6.5 Supp ort for Other Languages ::::::::::::::::::
Recommended publications
  • DC Console Using DC Console Application Design Software
    DC Console Using DC Console Application Design Software DC Console is easy-to-use, application design software developed specifically to work in conjunction with AML’s DC Suite. Create. Distribute. Collect. Every LDX10 handheld computer comes with DC Suite, which includes seven (7) pre-developed applications for common data collection tasks. Now LDX10 users can use DC Console to modify these applications, or create their own from scratch. AML 800.648.4452 Made in USA www.amltd.com Introduction This document briefly covers how to use DC Console and the features and settings. Be sure to read this document in its entirety before attempting to use AML’s DC Console with a DC Suite compatible device. What is the difference between an “App” and a “Suite”? “Apps” are single applications running on the device used to collect and store data. In most cases, multiple apps would be utilized to handle various operations. For example, the ‘Item_Quantity’ app is one of the most widely used apps and the most direct means to take a basic inventory count, it produces a data file showing what items are in stock, the relative quantities, and requires minimal input from the mobile worker(s). Other operations will require additional input, for example, if you also need to know the specific location for each item in inventory, the ‘Item_Lot_Quantity’ app would be a better fit. Apps can be used in a variety of ways and provide the LDX10 the flexibility to handle virtually any data collection operation. “Suite” files are simply collections of individual apps. Suite files allow you to easily manage and edit multiple apps from within a single ‘store-house’ file and provide an effortless means for device deployment.
    [Show full text]
  • Chapter # 9 LEX and YACC
    Chapter # 9 LEX and YACC Dr. Shaukat Ali Department of Computer Science University of Peshawar Lex and Yacc Lex and yacc are a matched pair of tools. Lex breaks down files into sets of "tokens," roughly analogous to words. Yacc takes sets of tokens and assembles them into higher-level constructs, analogous to sentences. Lex's output is mostly designed to be fed into some kind of parser. Yacc is designed to work with the output of Lex. 2 Lex and Yacc Lex and yacc are tools for building programs. Their output is itself code – Which needs to be fed into a compiler – May be additional user code is added to use the code generated by lex and yacc 3 Lex : A lexical analyzer generator Lex is a program designed to generate scanners, also known as tokenizers, which recognize lexical patterns in text Lex is an acronym that stands for "lexical analyzer generator.“ The main purpose is to facilitate lexical analysis The processing of character sequences in source code to produce tokens for use as input to other programs such as parsers Another tool for lexical analyzer generation is Flex 4 Lex : A lexical analyzer generator lex.lex is an a input file written in a language which describes the generation of lexical analyzer. The lex compiler transforms lex.l to a C program known as lex.yy.c. lex.yy.c is compiled by the C compiler to a file called a.out. The output of C compiler is the working lexical analyzer which takes stream of input characters and produces a stream of tokens.
    [Show full text]
  • Clostridium Difficile Infection: How to Deal with the Problem DH INFORMATION RE ADER B OX
    Clostridium difficile infection: How to deal with the problem DH INFORMATION RE ADER B OX Policy Estates HR / Workforce Commissioning Management IM & T Planning / Finance Clinical Social Care / Partnership Working Document Purpose Best Practice Guidance Gateway Reference 9833 Title Clostridium difficile infection: How to deal with the problem Author DH and HPA Publication Date December 2008 Target Audience PCT CEs, NHS Trust CEs, SHA CEs, Care Trust CEs, Medical Directors, Directors of PH, Directors of Nursing, PCT PEC Chairs, NHS Trust Board Chairs, Special HA CEs, Directors of Infection Prevention and Control, Infection Control Teams, Health Protection Units, Chief Pharmacists Circulation List Description This guidance outlines newer evidence and approaches to delivering good infection control and environmental hygiene. It updates the 1994 guidance and takes into account a national framework for clinical governance which did not exist in 1994. Cross Ref N/A Superseded Docs Clostridium difficile Infection Prevention and Management (1994) Action Required CEs to consider with DIPCs and other colleagues Timing N/A Contact Details Healthcare Associated Infection and Antimicrobial Resistance Department of Health Room 528, Wellington House 133-155 Waterloo Road London SE1 8UG For Recipient's Use Front cover image: Clostridium difficile attached to intestinal cells. Reproduced courtesy of Dr Jan Hobot, Cardiff University School of Medicine. Clostridium difficile infection: How to deal with the problem Contents Foreword 1 Scope and purpose 2 Introduction 3 Why did CDI increase? 4 Approach to compiling the guidance 6 What is new in this guidance? 7 Core Guidance Key recommendations 9 Grading of recommendations 11 Summary of healthcare recommendations 12 1.
    [Show full text]
  • Project1: Build a Small Scanner/Parser
    Project1: Build A Small Scanner/Parser Introducing Lex, Yacc, and POET cs5363 1 Project1: Building A Scanner/Parser Parse a subset of the C language Support two types of atomic values: int float Support one type of compound values: arrays Support a basic set of language concepts Variable declarations (int, float, and array variables) Expressions (arithmetic and boolean operations) Statements (assignments, conditionals, and loops) You can choose a different but equivalent language Need to make your own test cases Options of implementation (links available at class web site) Manual in C/C++/Java (or whatever other lang.) Lex and Yacc (together with C/C++) POET: a scripting compiler writing language Or any other approach you choose --- must document how to download/use any tools involved cs5363 2 This is just starting… There will be two other sub-projects Type checking Check the types of expressions in the input program Optimization/analysis/translation Do something with the input code, output the result The starting project is important because it determines which language you can use for the other projects Lex+Yacc ===> can work only with C/C++ POET ==> work with POET Manual ==> stick to whatever language you pick This class: introduce Lex/Yacc/POET to you cs5363 3 Using Lex to build scanners lex.yy.c MyLex.l lex/flex lex.yy.c a.out gcc/cc Input stream a.out tokens Write a lex specification Save it in a file (MyLex.l) Compile the lex specification file by invoking lex/flex lex MyLex.l A lex.yy.c file is generated
    [Show full text]
  • User Manual for Ox: an Attribute-Grammar Compiling System Based on Yacc, Lex, and C Kurt M
    Computer Science Technical Reports Computer Science 12-1992 User Manual for Ox: An Attribute-Grammar Compiling System based on Yacc, Lex, and C Kurt M. Bischoff Iowa State University Follow this and additional works at: http://lib.dr.iastate.edu/cs_techreports Part of the Programming Languages and Compilers Commons Recommended Citation Bischoff, Kurt M., "User Manual for Ox: An Attribute-Grammar Compiling System based on Yacc, Lex, and C" (1992). Computer Science Technical Reports. 21. http://lib.dr.iastate.edu/cs_techreports/21 This Article is brought to you for free and open access by the Computer Science at Iowa State University Digital Repository. It has been accepted for inclusion in Computer Science Technical Reports by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. User Manual for Ox: An Attribute-Grammar Compiling System based on Yacc, Lex, and C Abstract Ox generalizes the function of Yacc in the way that attribute grammars generalize context-free grammars. Ordinary Yacc and Lex specifications may be augmented with definitions of synthesized and inherited attributes written in C syntax. From these specifications, Ox generates a program that builds and decorates attributed parse trees. Ox accepts a most general class of attribute grammars. The user may specify postdecoration traversals for easy ordering of side effects such as code generation. Ox handles the tedious and error-prone details of writing code for parse-tree management, so its use eases problems of security and maintainability associated with that aspect of translator development. The translators generated by Ox use internal memory management that is often much faster than the common technique of calling malloc once for each parse-tree node.
    [Show full text]
  • Unix (And Linux)
    AWK....................................................................................................................................4 BC .....................................................................................................................................11 CHGRP .............................................................................................................................16 CHMOD.............................................................................................................................19 CHOWN ............................................................................................................................26 CP .....................................................................................................................................29 CRON................................................................................................................................34 CSH...................................................................................................................................36 CUT...................................................................................................................................71 DATE ................................................................................................................................75 DF .....................................................................................................................................79 DIFF ..................................................................................................................................84
    [Show full text]
  • 5 Command Line Functions by Barbara C
    ADAPS: Chapter 5. Command Line Functions 5 Command Line Functions by Barbara C. Hoopes and James F. Cornwall This chapter describes ADAPS command line functions. These are functions that are executed from the UNIX command line instead of from ADAPS menus, and that may be run manually or by automated means such as “cron” jobs. Most of these functions are NOT accessible from the ADAPS menus. These command line functions are described in detail below. 5.1 Hydra Although Hydra is available from ADAPS at the PR sub-menu, Edit Time Series Data using Hydra (TS_EDIT), it can also be started from the command line. However, to start Hydra outside of ADAPS, a DV or UV RDB file needs to be available to edit. The command is “hydra rdb_file_name.” For a complete description of using Hydra, refer to Section 4.5.2 Edit Time-Series Data using Hydra (TS_EDIT). 5.2 nwrt2rdb This command is used to output rating information in RDB format. It writes RDB files with a table containing the rating equation parameters or the rating point pairs, with all other information contained in the RDB comments. The following arguments can be used with this command: nwrt2rdb -ooutfile -zdbnum -aagency -nstation -dddid -trating_type -irating_id -e (indicates to output ratings in expanded form; it is ignored for equation ratings.) -l loctzcd (time zone code or local time code "LOC") -m (indicates multiple output files.) -r (rounding suppression) Rules • If -o is omitted, nwrt2rdb writes to stdout; AND arguments -n, -d, -t, and -i must be present. • If -o is present, no other arguments are required, and the program will use ADAPS routines to prompt for them.
    [Show full text]
  • Yacc a Tool for Syntactic Analysis
    1 Yacc a tool for Syntactic Analysis CS 315 – Programming Languages Pinar Duygulu Bilkent University CS315 Programming Languages © Pinar Duygulu Lex and Yacc 2 1) Lexical Analysis: Lexical analyzer: scans the input stream and converts sequences of characters into tokens. Lex is a tool for writing lexical analyzers. 2) Syntactic Analysis (Parsing): Parser: reads tokens and assembles them into language constructs using the grammar rules of the language. Yacc is a tool for constructing parsers. Yet Another Compiler to Compiler Reads a specification file that codifies the grammar of a language and generates a parsing routine CS315 Programming Languages © Pinar Duygulu 3 Yacc • Yacc specification describes a Context Free Grammar (CFG), that can be used to generate a parser. Elements of a CFG: 1. Terminals: tokens and literal characters, 2. Variables (nonterminals): syntactical elements, 3. Production rules, and 4. Start rule. CS315 Programming Languages © Pinar Duygulu 4 Yacc Format of a production rule: symbol: definition {action} ; Example: A → Bc is written in yacc as a: b 'c'; CS315 Programming Languages © Pinar Duygulu 5 Yacc Format of a yacc specification file: declarations %% grammar rules and associated actions %% C programs CS315 Programming Languages © Pinar Duygulu 6 Declarations To define tokens and their characteristics %token: declare names of tokens %left: define left-associative operators %right: define right-associative operators %nonassoc: define operators that may not associate with themselves %type: declare the type of variables %union: declare multiple data types for semantic values %start: declare the start symbol (default is the first variable in rules) %prec: assign precedence to a rule %{ C declarations directly copied to the resulting C program %} (E.g., variables, types, macros…) CS315 Programming Languages © Pinar Duygulu 7 A simple yacc specification to accept L={ anbn | n>1}.
    [Show full text]
  • Binpac: a Yacc for Writing Application Protocol Parsers
    binpac: A yacc for Writing Application Protocol Parsers Ruoming Pang∗ Vern Paxson Google, Inc. International Computer Science Institute and New York, NY, USA Lawrence Berkeley National Laboratory [email protected] Berkeley, CA, USA [email protected] Robin Sommer Larry Peterson International Computer Science Institute Princeton University Berkeley, CA, USA Princeton, NJ, USA [email protected] [email protected] ABSTRACT Similarly, studies of Email traffic [21, 46], peer-to-peer applica- A key step in the semantic analysis of network traffic is to parse the tions [37], online gaming, and Internet attacks [29] require under- traffic stream according to the high-level protocols it contains. This standing application-level traffic semantics. However, it is tedious, process transforms raw bytes into structured, typed, and semanti- error-prone, and sometimes prohibitively time-consuming to build cally meaningful data fields that provide a high-level representation application-level analysis tools from scratch, due to the complexity of the traffic. However, constructing protocol parsers by hand is a of dealing with low-level traffic data. tedious and error-prone affair due to the complexity and sheer num- We can significantly simplify the process if we can leverage a ber of application protocols. common platform for various kinds of application-level traffic anal- This paper presents binpac, a declarative language and com- ysis. A key element of such a platform is application-protocol piler designed to simplify the task of constructing robust and effi- parsers that translate packet streams into high-level representations cient semantic analyzers for complex network protocols. We dis- of the traffic, on top of which we can then use measurement scripts cuss the design of the binpac language and a range of issues that manipulate semantically meaningful data elements such as in generating efficient parsers from high-level specifications.
    [Show full text]
  • Lex and Yacc
    Lex and Yacc A Quick Tour HW8–Use Lex/Yacc to Turn this: Into this: <P> Here's a list: Here's a list: * This is item one of a list <UL> * This is item two. Lists should be <LI> This is item one of a list indented four spaces, with each item <LI>This is item two. Lists should be marked by a "*" two spaces left of indented four spaces, with each item four-space margin. Lists may contain marked by a "*" two spaces left of four- nested lists, like this: space margin. Lists may contain * Hi, I'm item one of an inner list. nested lists, like this:<UL><LI> Hi, I'm * Me two. item one of an inner list. <LI>Me two. * Item 3, inner. <LI> Item 3, inner. </UL><LI> Item 3, * Item 3, outer list. outer list.</UL> This is outside both lists; should be back This is outside both lists; should be to no indent. back to no indent. <P><P> Final suggestions: Final suggestions 2 if myVar == 6.02e23**2 then f( .. ! char stream LEX token stream if myVar == 6.02e23**2 then f( ! tokenstream YACC parse tree if-stmt == fun call var ** Arg 1 Arg 2 float-lit int-lit . ! 3 Lex / Yacc History Origin – early 1970’s at Bell Labs Many versions & many similar tools Lex, flex, jflex, posix, … Yacc, bison, byacc, CUP, posix, … Targets C, C++, C#, Python, Ruby, ML, … We’ll use jflex & byacc/j, targeting java (but for simplicity, I usually just say lex/yacc) 4 Uses “Front end” of many real compilers E.g., gcc “Little languages”: Many special purpose utilities evolve some clumsy, ad hoc, syntax Often easier, simpler, cleaner and more flexible to use lex/yacc or similar tools from the start 5 Lex: A Lexical Analyzer Generator Input: Regular exprs defining "tokens" my.flex Fragments of declarations & code Output: jflex A java program “yylex.java” Use: yylex.java Compile & link with your main() Calls to yylex() read chars & return successive tokens.
    [Show full text]
  • Time-Sharing System
    UNIXTM TIME-SHARING SYSTEM: UNIX PROGRAMMER'S MANUAL Seventh Edition, Volume 2B January, 1979 Bell Telephone Laboratories, Incorporated Murray Hill, New Jersey Yacc: Yet Another Compiler-Compiler Stephen C. Johnson Bell Laboratories Murray Hill, New Jersey 07974 ABSTRACT Computer program input generally has some structure; in fact, every computer program that does input can be thought of as de®ning an ``input language'' which it accepts. An input language may be as complex as a programming language, or as sim- ple as a sequence of numbers. Unfortunately, usual input facilities are limited, dif®cult to use, and often are lax about checking their inputs for validity. Yacc provides a general tool for describing the input to a computer program. The Yacc user speci®es the structures of his input, together with code to be invoked as each such structure is recognized. Yacc turns such a speci®cation into a subroutine that handles the input process; frequently, it is convenient and appropriate to have most of the ¯ow of control in the user's application handled by this subroutine. The input subroutine produced by Yacc calls a user-supplied routine to return the next basic input item. Thus, the user can specify his input in terms of individual input characters, or in terms of higher level constructs such as names and numbers. The user-supplied routine may also handle idiomatic features such as comment and con- tinuation conventions, which typically defy easy grammatical speci®cation. Yacc is written in portable C. The class of speci®cations accepted is a very gen- eral one: LALR(1) grammars with disambiguating rules.
    [Show full text]
  • Unix Programmer's Manual
    There is no warranty of merchantability nor any warranty of fitness for a particu!ar purpose nor any other warranty, either expressed or imp!ied, a’s to the accuracy of the enclosed m~=:crials or a~ Io ~helr ,~.ui~::~::.j!it’/ for ~ny p~rficu~ar pur~.~o~e. ~".-~--, ....-.re: " n~ I T~ ~hone Laaorator es 8ssumg$ no rO, p::::nS,-,,.:~:y ~or their use by the recipient. Furln=,, [: ’ La:::.c:,:e?o:,os ~:’urnes no ob~ja~tjon ~o furnish 6ny a~o,~,,..n~e at ~ny k:nd v,,hetsoever, or to furnish any additional jnformstjcn or documenta’tjon. UNIX PROGRAMMER’S MANUAL F~ifth ~ K. Thompson D. M. Ritchie June, 1974 Copyright:.©d972, 1973, 1974 Bell Telephone:Laboratories, Incorporated Copyright © 1972, 1973, 1974 Bell Telephone Laboratories, Incorporated This manual was set by a Graphic Systems photo- typesetter driven by the troff formatting program operating under the UNIX system. The text of the manual was prepared using the ed text editor. PREFACE to the Fifth Edition . The number of UNIX installations is now above 50, and many more are expected. None of these has exactly the same complement of hardware or software. Therefore, at any particular installa- tion, it is quite possible that this manual will give inappropriate information. The authors are grateful to L. L. Cherry, L. A. Dimino, R. C. Haight, S. C. Johnson, B. W. Ker- nighan, M. E. Lesk, and E. N. Pinson for their contributions to the system software, and to L. E. McMahon for software and for his contributions to this manual.
    [Show full text]