An Introduction to LEX and YACC

An Introduction to LEX and YACC

An Introduction to LEX and YACC SYSC-3101 1 Programming Languages CONTENTS CONTENTS Contents 1 General Structure 3 2 Lex - A lexical analyzer 4 3 Yacc - Yet another compiler compiler 10 4 Main Program 17 SYSC-3101 2 Programming Languages 1 GENERAL STRUCTURE 1 General Structure Program will consist of three parts: 1. lexical analyzer: scan.l 2. parser: gram.y 3. “Everything else”: main.c SYSC-3101 3 Programming Languages 2 LEX - A LEXICAL ANALYZER 2 Lex - A lexical analyzer %{ /* C includes */ }% /* Definitions */ %% /* Rules */ %% /* user subroutines */ Figure 1: Lex program structure SYSC-3101 4 Programming Languages Lex Definitions 2 LEX - A LEXICAL ANALYZER Lex Definitions • Table with two columns: 1. regular expressions 2. actions • ie: integer printf("found keyword INT"); • If action has more than one statement, enclose it within {} SYSC-3101 5 Programming Languages Regular Expressions 2 LEX - A LEXICAL ANALYZER Regular Expressions • text characters: a - z, 0 - 9, space... nn : newline. nt : tab. • operators: " \ [ ] ˆ - ? . * + | ( ) $ / { } % < > "..." : treat ’...’ as text characters (useful for spaces). n : treat next character as text character. : match anything. SYSC-3101 6 Programming Languages Regular Expressions 2 LEX - A LEXICAL ANALYZER • operators (cont): [...] : match anything within [] ? : match zero or one time, eg: ab?c ! ac, abc * : match zero or more times, eg: ab*c ! ac, abc, abbc... + : match one or more times, eg: ab+c ! abc, abbc... (...) : group ..., eg: (ab)+ ! ab, abab... | : alternation, eg ab|cd ! ab, cd fn,mg : repitition, eg a{1,3} ! a, aa, aaa fdefng : substitute defn (from first section). SYSC-3101 7 Programming Languages Actions 2 LEX - A LEXICAL ANALYZER Actions • ; ! Null action. • ECHO; ! printf("%s", yytext); • {...} ! Multi-statement action. • return yytext; ! send contents of yytext to the parser. yytext : C-String of matched characters (Make a copy if neccessary!) yylen : Length of the matched characters. SYSC-3101 8 Programming Languages Actions 2 LEX - A LEXICAL ANALYZER Figure 2: LEX Template %{ /* -*- c -*- */ #include <stdio.h> #include "gram.tab.h" %} extern YYSTYPE yylval; %% [0-9] { yylval.anInt = atoi((char *)&yytext[0]); return INTEGER; } . return *yytext; %% SYSC-3101 9 Programming Languages 3 YACC - YET ANOTHER COMPILER COMPILER 3 Yacc - Yet another compiler compiler %{ /* C includes */ }% /* Other Declarations */ %% /* Rules */ %% /* user subroutines */ Figure 3: Yacc program structure SYSC-3101 10 Programming Languages YACC Rules 3 YACC - YET ANOTHER COMPILER COMPILER YACC Rules • A grammar rule has the following form: A : BODY ; • A is a non-terminal name (LHS). • BODY consists of names, literals, and actions. (RHS) • literals are enclosed in quotes, eg: ’+’ 'nn' ! newline. 'n'' ! single quote. SYSC-3101 11 Programming Languages YACC Rules 3 YACC - YET ANOTHER COMPILER COMPILER • The rules: A : B ; A : C D ; A : E F G ; • can be specified as: A : B | C D | E F G ; SYSC-3101 12 Programming Languages YACC Rules 3 YACC - YET ANOTHER COMPILER COMPILER • Names representing tokens must be declared; this is most simply done by writing %token name1 name2 . • Define name1, name2,... in the declarations section. • Every name not defined in the declarations section is assumed to represent a nonterminal symbol. • Every nonterminal symbol must appear on the left side of at least one rule. SYSC-3101 13 Programming Languages Actions 3 YACC - YET ANOTHER COMPILER COMPILER Actions • the user may associate actions to be performed each time the rule is recognized in the input process, eg: XXX : YYY ZZZ { printf("a message\n"); } ; • $ is special! $n ! psuedo-variables which refer to the values returned by the components of the right hand side of the rules. $$ ! The value returned by the left-hand side of a rule. expr : ’(’ expr ’)’ { $$ = $2 ; } • Default return type is integer. SYSC-3101 14 Programming Languages Declarations 3 YACC - YET ANOTHER COMPILER COMPILER Declarations %token : declares ALL terminals which are not literals. %type : declares return value type for non-terminals. %union : declares other return types. • the type typedef union { body of union ... } YYSTYPE; is generated and must be included into the lex source so that types can be associated with tokens. SYSC-3101 15 Programming Languages Declarations 3 YACC - YET ANOTHER COMPILER COMPILER Figure 4: YACC Template %{ #include <stdio.h> %} %token <anInt> INTEGER %type <anInt> S E %union { int anInt; } %% S : E { printf( "Result is %d\n", $1 ); } ; %% yyerror( char * s ) { fprintf( stderr, "%s\n", s ); } SYSC-3101 16 Programming Languages 4 MAIN PROGRAM 4 Main Program Figure 5: Main template #include <stdio.h> #include <stdlib.h> extern int yyerror(), yylex(); #define YYDEBUG 1 #include "gram.tab.c" #include "lex.yy.c" main() { /* yydebug = 1; */ yyparse(); } SYSC-3101 17 Programming Languages 4 MAIN PROGRAM Figure 6: Running the example Microsoft Windows XP [Version 5.1.2600] (C) Copyright 1985-2001 Microsoft Corp. C:\DJGPP\etc\SYSC-3˜1>bison -d gram.y C:\DJGPP\etc\SYSC-3˜1>flex scan.l C:\DJGPP\etc\SYSC-3˜1>gcc main.c C:\DJGPP\etc\SYSC-3˜1>echo 1 + 2 | a.exe Result is 3 C:\DJGPP\etc\SYSC-3˜1> SYSC-3101 18 Programming Languages 4 MAIN PROGRAM Figure 7: Running the example using Visual C++ Microsoft Windows XP [Version 5.1.2600] (C) Copyright 1985-2001 Microsoft Corp. E:\YaccLex>bison -d gram.y E:\YaccLex>flex scan.l E:\YaccLex>cl main.c Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 12.00.8804 for 80x86 Copyright (C) Microsoft Corp 1984-1998. All rights reserved. E:\YaccLex>main.exe 2 + 1 Result is 3 SYSC-3101 19 Programming Languages.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    19 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us