
An Introduction to Lex and Yacc

SYSC-3101 Programming Languages

Contents

1 General Structure
2 Lex - A lexical analyzer
3 Yacc - Yet another compiler compiler
4 Main Program

1 General Structure

Program will consist of three parts:

1. lexical analyzer: scan.l
2. parser: gram.y
3. “Everything else”: main.c

2 Lex - A lexical analyzer

    %{
    /* C includes */
    %}
    /* Definitions */
    %%
    /* Rules */
    %%
    /* user subroutines */

Figure 1: Lex program structure

Lex Definitions

• Table with two columns:
  1. regular expressions
  2. actions
• eg:

      integer    printf("found keyword INT");

• If an action has more than one statement, enclose it within {}

Regular Expressions

• text characters: a - z, 0 - 9, space ...
• operators: " \ [ ] ^ - ? . * + | ( ) $ / { } % < >

  "..."  : treat '...' as text characters (useful for spaces).
  \      : treat next character as text character.
  .      : match any character except newline.
  \n     : newline.
  \t     : tab.

• operators (cont):

  [...]  : match any one of the characters within [].
  (...)  : group ..., eg: (ab)+ → ab, abab ...
  |      : alternation, eg: ab|cd → ab, cd
  {n,m}  : repetition, eg: a{1,3} → a, aa, aaa
  {defn} : substitute defn (from first section).
  *      : match zero or more times, eg: ab*c → ac, abc, abbc ...
  +      : match one or more times, eg: ab+c → abc, abbc ...
  ?      : match zero or one times, eg: ab?c → ac, abc
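The example strings listed for the operators can be checked with a small C helper. This is only a sketch: it uses the POSIX extended regular expression functions from <regex.h>, not Lex itself, and the two syntaxes merely overlap for the operators * + ? | (...) and {n,m}. Lex-only forms such as "..." quoting and {defn} substitution are not covered. The helper name full_match is made up for this illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the whole of `text` matches `pattern`, 0 if it does not,
 * and -1 if the pattern cannot be compiled.
 * The pattern is wrapped as ^(pattern)$ so that partial matches do not count.
 * full_match is a hypothetical helper name, not part of Lex or POSIX. */
int full_match(const char *pattern, const char *text)
{
    char anchored[256];
    regex_t re;
    int rc;

    /* room for "^(" + pattern + ")$" + terminating NUL */
    assert(strlen(pattern) + 5 <= sizeof anchored);
    snprintf(anchored, sizeof anchored, "^(%s)$", pattern);

    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

For instance, full_match("ab+c", "abbc") returns 1 while full_match("ab+c", "ac") returns 0, because + requires at least one b.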

Actions

• ;               → Null action.
• ECHO;           → printf("%s", yytext);
• {...}           → Multi-statement action.
• return yytext;  → send contents of yytext to the parser.

yytext : C-String of matched characters. (Make a copy if necessary!)
yyleng : Length of the matched characters.

    /* -*- c -*- */
    %{
    #include <stdlib.h>      /* atoi() */
    #include "gram.tab.h"
    extern YYSTYPE yylval;
    %}
    %%
    [0-9]   { yylval.anInt = atoi((char *)&yytext[0]);
              return INTEGER; }
    .       return *yytext;
    %%

Figure 2: LEX Template
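To make the two rules of Figure 2 more concrete, here is a hand-written C stand-in for what the generated scanner does with them. It is a sketch under several assumptions: the function names scan_string and next_token are invented, the token code 258 for INTEGER is a placeholder for whatever gram.tab.h really defines, and input comes from a string instead of stdin. Unlike the Lex rule ".", this sketch also returns newline characters as tokens.

```c
#include <ctype.h>
#include <stddef.h>

enum { INTEGER = 258 };                /* placeholder; the real code comes from gram.tab.h */

typedef union { int anInt; } YYSTYPE;  /* mirrors the %union of Figure 4 */

YYSTYPE yylval;                        /* value attached to the most recent token */

const char *scan_pos = "";             /* next character to be scanned */

/* Hypothetical helper: start scanning the given string. */
void scan_string(const char *text)
{
    scan_pos = text;
}

/* Hand-written equivalent of the two rules in Figure 2:
 *   [0-9]  store the digit's value in yylval.anInt and return INTEGER
 *   .      return the character itself as the token
 * Returns 0 at the end of the input, which is how a scanner signals
 * end of input to a yacc parser. */
int next_token(void)
{
    unsigned char c = (unsigned char)*scan_pos;

    if (c == '\0')
        return 0;
    scan_pos++;
    if (isdigit(c)) {
        yylval.anInt = c - '0';
        return INTEGER;
    }
    return c;
}
```

Scanning the string "1+2" this way yields INTEGER (with yylval.anInt set to 1), then '+', then INTEGER (with yylval.anInt set to 2), then 0.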

3 Yacc - Yet another compiler compiler

    %{
    /* C includes */
    %}
    /* Other Declarations */
    %%
    /* Rules */
    %%
    /* user subroutines */

Figure 3: Yacc program structure

YACC Rules

• A grammar rule has the following form:

      A : BODY ;

• A is a non-terminal name (LHS).
• BODY consists of names, literals, and actions. (RHS)
• literals are enclosed in quotes, eg: '+'

      '\n' → newline.
      '\'' → single quote.

• The rules:

      A : B C D ;
      A : E F ;
      A : G ;

• can be specified as:

      A : B C D
        | E F
        | G
        ;

• Names representing tokens must be declared; this is simply done by writing

      %token name1 name2 ...

  in the declarations section. This defines name1, name2, ... as tokens.
• Every name not defined in the declarations section is assumed to represent a nonterminal symbol.
• Every nonterminal symbol must appear on the left side of at least one rule.

Actions

• The user may associate actions to be performed each time the rule is recognized in the input process, eg:

      XXX : YYY ZZZ
              { printf("a message\n"); }
          ;

• $ is special!

      $n → pseudo-variables refer to the values returned by the components of the right hand side of the rules.
      $$ → The value returned by the left-hand side of a rule.

      expr : '(' expr ')'   { $$ = $2 ; }

• Default return type is integer.
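One way to picture $$ and $n is as positions on a stack of values that the parser maintains while it works. The C sketch below imitates what happens when the rule expr : '(' expr ')' { $$ = $2; } is reduced. It is a simplified illustration only: a real generated parser also keeps a stack of parser states, and all names here (value_stack, push_value, reduce_paren_expr) are invented.

```c
#include <assert.h>

#define STACK_MAX 64

int value_stack[STACK_MAX];   /* semantic values of the symbols seen so far */
int stack_depth = 0;          /* number of values currently on the stack */

/* Shift: push the value of a newly read symbol. */
void push_value(int v)
{
    assert(stack_depth < STACK_MAX);
    value_stack[stack_depth++] = v;
}

/* Reduce by  expr : '(' expr ')'  { $$ = $2; }  and return $$. */
int reduce_paren_expr(void)
{
    const int rhs_len = 3;                       /* '(' expr ')' */
    int *rhs;
    int result;

    assert(stack_depth >= rhs_len);
    rhs = &value_stack[stack_depth - rhs_len];   /* rhs[0] is $1, rhs[1] is $2, rhs[2] is $3 */
    result = rhs[1];                             /* $$ = $2 */
    stack_depth -= rhs_len;                      /* pop the right-hand side */
    push_value(result);                          /* push $$ in its place */
    return result;
}
```

After pushing the values for '(', an expr worth 42, and ')', a call to reduce_paren_expr() leaves a single value, 42, on the stack: the three right-hand-side values have been replaced by the value of the left-hand side.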

Declarations

%token : declares ALL terminals which are not literals.
%type  : declares return value type for non-terminals.
%union : declares other return types.

• the type

      typedef union { body of union ... } YYSTYPE;

  is generated and must be included into the lex source so that types can be associated with tokens.
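As a rough illustration of what the generated YYSTYPE looks like in C, the sketch below writes out by hand the typedef that a %union with two members would produce. The member anInt follows Figure 4; the member aName and the helper functions integer_value and name_value are assumptions added for this example. Because a union holds only one of its members at a time, yacc has to be told which member belongs to each token or non-terminal, which is the job of %token and %type.

```c
#include <string.h>

/* Hand-written counterpart of:  %union { int anInt; char aName[16]; }
 * anInt is taken from Figure 4; aName is a hypothetical second member. */
typedef union {
    int  anInt;
    char aName[16];
} YYSTYPE;

/* Store an integer value, as a scanner action would for an INTEGER token. */
YYSTYPE integer_value(int n)
{
    YYSTYPE v;
    v.anInt = n;
    return v;
}

/* Store a short name, truncating it to fit the fixed-size member. */
YYSTYPE name_value(const char *s)
{
    YYSTYPE v;
    strncpy(v.aName, s, sizeof v.aName - 1);
    v.aName[sizeof v.aName - 1] = '\0';
    return v;
}
```

Reading v.anInt from a value built with name_value() would give a meaningless number, which is why the member used for each symbol must be declared consistently.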

    %{
    #include <stdio.h>
    %}
    %union {
        int anInt;
    }
    %token <anInt> INTEGER
    %type <anInt> S E
    %%
    S : E   { printf( "Result is %d\n", $1 ); }
      ;
    /* the rules for E are not part of this template */
    %%
    int yyerror( char * s ) { fprintf( stderr, "%s\n", s ); return 0; }

Figure 4: YACC Template

4 Main Program

    #include <stdio.h>
    #include <stdlib.h>
    #define YYDEBUG 1
    extern int yyerror(), yylex();
    #include "gram.tab.c"
    #include "lex.yy.c"

    int main()
    {
        /* yydebug = 1; */
        yyparse();
        return 0;
    }

Figure 5: Main template

    Microsoft Windows XP [Version 5.1.2600]
    (C) Copyright 1985-2001 Microsoft Corp.

    C:\DJGPP\etc\SYSC-3~1>flex scan.l
    C:\DJGPP\etc\SYSC-3~1>bison -d gram.y
    C:\DJGPP\etc\SYSC-3~1>gcc main.c
    C:\DJGPP\etc\SYSC-3~1>echo 1 + 2 | a.exe
    Result is 3
    C:\DJGPP\etc\SYSC-3~1>

Figure 6: Running the example

    Microsoft Windows XP [Version 5.1.2600]
    (C) Copyright 1985-2001 Microsoft Corp.

    E:\YaccLex>flex scan.l
    E:\YaccLex>bison -d gram.y
    E:\YaccLex>cl main.c
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 12.00.8804 for 80x86
    Copyright (C) Microsoft Corp 1984-1998. All rights reserved.
    E:\YaccLex>main.exe
    1 + 2
    Result is 3

Figure 7: Running the example using Visual C++