Syntax-Directed Translation

CS308 Compiler Principles
Syntax-Directed Translation
Li Jiang
Department of Computer Science and Engineering
Shanghai Jiao Tong University

Phases of Compilation
[Figure: the compiler pipeline, divided into an analysis part and a synthesis part, with a symbol table: Source Language → Lexical Analyzer → Syntax Analyzer → Semantic Analyzer → Intermediate Code Generator → Intermediate Language → Code Optimizer → Target Code Generator → Target Language]

A Model of a Compiler Front End
• The lexical analyzer reads the source program character by character and returns the tokens of the source program.
• The parser creates the tree-like syntactic structure of the given program.
• The intermediate-code generator translates the syntax tree into three-address code.

Syntax-Directed Translation
• Associate semantic meaning with the grammar through semantic actions, which may:
  – generate intermediate code
  – put information into the symbol table
  – perform type checking
  – issue error messages
  – perform other activities; in fact, they may perform almost any activity.

Syntax-Directed Translation Cont'd
• Syntax-Directed Definitions:
  – associate a production rule with a set of attributes and semantic rules
  – give high-level specifications for translations
  – hide many implementation details, such as the order of evaluation of semantic rules
• Translation Schemes:
  – embed program fragments within production bodies
  – indicate the order of evaluation of the semantic actions associated with a production

Syntax-Directed Definition (SDD)
• A syntax-directed definition is an extension of a context-free grammar:
  – Each grammar symbol is associated with a set of attributes.
  – Each production is associated with a set of semantic rules.
• Attributes are divided into two kinds:
  – A synthesized attribute at a node is defined only in terms of attribute values at the node's children and at the node itself.
  – An inherited attribute at a node is defined in terms of attribute values at the node's parent, the node itself, and its siblings.
Imagine a parse tree!
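To make the parse-tree picture concrete, here is a minimal Python sketch of the two kinds of attributes. The `Node` class and the attributes `depth` (inherited, taken from the parent) and `size` (synthesized, computed from the children) are hypothetical names chosen only for illustration; they do not come from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A parse-tree node carrying one inherited and one synthesized attribute."""
    label: str
    children: List["Node"] = field(default_factory=list)
    depth: Optional[int] = None  # inherited: defined from the parent's depth
    size: Optional[int] = None   # synthesized: defined from the children's sizes


def annotate(node: Node, parent_depth: int = -1) -> None:
    """Fill in both attributes with a single depth-first traversal."""
    # Inherited attribute: available before the children are visited.
    node.depth = parent_depth + 1
    for child in node.children:
        annotate(child, node.depth)
    # Synthesized attribute: available only after all children are done.
    node.size = 1 + sum(child.size for child in node.children)


# Parse-tree shape for a production like E -> E1 + T, with E1 -> T.
tree = Node("E", [Node("E1", [Node("T")]), Node("+"), Node("T")])
annotate(tree)
```

After `annotate(tree)`, information has flowed down the tree for `depth` and up the tree for `size`, which is the distinction the definitions above draw.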
SDD Cont'd
• In a syntax-directed definition, each production A→α is associated with a set of semantic rules of the form
      b = f(c1, c2, …, cn)
  where f is a function and either
  – b is a synthesized attribute of A, and c1, c2, …, cn are attributes of the grammar symbols in the production (A→α), or
  – b is an inherited attribute of one of the grammar symbols in α, and c1, c2, …, cn are attributes of the grammar symbols in the production (A→α).

Attribute Grammar
• A semantic rule b = f(c1, c2, …, cn) indicates that the attribute b depends on the attributes c1, c2, …, cn.
• In a syntax-directed definition, a semantic rule may not only evaluate the value of an attribute, but also have side effects such as printing values.
• An attribute grammar is a syntax-directed definition without side effects.

SDD Example1
Production        Semantic Rules
L → E return      print(E.val)
E → E1 + T        E.val = E1.val + T.val
E → T             E.val = T.val
T → T1 * F        T.val = T1.val * F.val
T → F             T.val = F.val
F → ( E )         F.val = E.val
F → digit         F.val = digit.lexval
• Symbols E, T, and F are associated with a synthesized attribute val.
• The token digit has a synthesized attribute lexval (an integer value returned by the lexical analyzer).

SDD Example2
Production        Semantic Rules
E → E1 + T        E.loc = newtemp(), E.code = E1.code || T.code || add E1.loc,T.loc,E.loc
E → T             E.loc = T.loc, E.code = T.code
T → T1 * F        T.loc = newtemp(), T.code = T1.code || F.code || mult T1.loc,F.loc,T.loc
T → F             T.loc = F.loc, T.code = F.code
F → ( E )         F.loc = E.loc, F.code = E.code
F → id            F.loc = id.name, F.code = ""
Guess what happens.
• Symbols E, T, and F are associated with synthesized attributes loc and code.
• The token id has a synthesized attribute name.
• || is the string concatenation operator.

Annotated Parse Tree
A parse tree can be used to visualize the translation specified by an SDD.
• A parse tree showing the values of the attributes at each node is called an annotated parse tree.
• The process of computing the attribute values at the nodes is called annotating (or decorating) the parse tree.

Annotated Parse Tree Example
Input: 5+3*4
Parse tree (children are indented under their parent):
L
  E
    E
      T
        F
          digit
    +
    T
      T
        F
          digit
      *
      F
        digit

Annotated Parse Tree Example
Input: 5+3*4
L
  E.val=17
    E.val=5
      T.val=5
        F.val=5
          digit.lexval=5
    +
    T.val=12
      T.val=3
        F.val=3
          digit.lexval=3
      *
      F.val=4
        digit.lexval=4
What about these rules: E.code = E1.code || T.code || add E1.loc,T.loc,E.loc

Dependency Graph
• While an annotated parse tree shows the values of attributes, a dependency graph shows how those values can be computed.
• Semantic rules set up dependencies among attributes.
• The dependency graph determines the evaluation order of the semantic rules.
  – An edge from one attribute to another indicates that the value of the former is needed to compute the latter.

Dependency Graph Example
Input: 5+3*4
digit.lexval=5 → F.val=5 → T.val=5 → E.val=5
digit.lexval=3 → F.val=3 → T.val=3
digit.lexval=4 → F.val=4
T.val=3, F.val=4 → T.val=12
E.val=5, T.val=12 → E.val=17 → L

Inherited Attributes Example
Input: real p q
Production        Semantic Rules
D → T L           L.in = T.type
T → int           T.type = integer
T → real          T.type = real
L → L1 id         L1.in = L.in, addtype(id.entry, L.in)
L → id            addtype(id.entry, L.in)
• Symbol T is associated with a synthesized attribute type.
• Symbol L is associated with an inherited attribute in.

A Dependency Graph with Inherited Attributes
Input: real p q
Parse tree:
D
  T
    real
  L
    L1
      id (p)
    id (q)
Dependency graph:
T.type=real → L.in=real → L1.in=real
L.in=real, id.entry=q → addtype(q,real)
L1.in=real, id.entry=p → addtype(p,real)

S & L-Attributed Definitions
• We will look at two subclasses of syntax-directed definitions:
  – S-Attributed Definitions: only synthesized attributes are used in the syntax-directed definitions.
  – L-Attributed Definitions: both synthesized and inherited attributes are used, in a restricted fashion.
    • Dependency-graph edges can go from left to right, but not from right to left. Why?

S-Attributed Definitions
• S-Attributed Definitions: only synthesized attributes are used in the syntax-directed definitions
  – each rule computes an attribute for the nonterminal at the head of a production from attributes taken from the body of the production
  – the attributes can be evaluated by performing a post-order traversal of the parse tree. How?
  – can be implemented naturally with an LR parser
  – can also be implemented with an LL parser

Bottom-Up Evaluation of S-Attributed Definitions
• Put the values of the synthesized attributes of the grammar symbols into a parallel stack.
• Evaluate the values of the attributes during reductions.
Example: A → XYZ    A.a = f(X.x, Y.y, Z.z)    (all attributes are synthesized)
          before the reduction            after the reduction
          stack   parallel-stack          stack   parallel-stack
  top →   Z       Z.z             top →   A       A.a = f(X.x, Y.y, Z.z)
          Y       Y.y
          X       X.x

SDD Example Recall
Production        Semantic Rules
L → E return      print(E.val)
E → E1 + T        E.val = E1.val + T.val
E → T             E.val = T.val
T → T1 * F        T.val = T1.val * F.val
T → F             T.val = F.val
F → ( E )         F.val = E.val
F → digit         F.val = digit.lexval
• Symbols E, T, and F are associated with a synthesized attribute val.
• The token digit has a synthesized attribute lexval (an integer value returned by the lexical analyzer).

Canonical LR(0) Collection for the Grammar
(d stands for digit and r for return.)
I0:  L' → .L    L → .Er    E → .E+T    E → .T    T → .T*F    T → .F    F → .(E)    F → .d
     on L: I1, on E: I2, on T: I3, on F: I4, on (: I5, on d: I6
I1:  L' → L.
I2:  L → E.r    E → E.+T
     on r: I7, on +: I8
I3:  E → T.    T → T.*F
     on *: I9
I4:  T → F.
I5:  F → (.E)    E → .E+T    E → .T    T → .T*F    T → .F    F → .(E)    F → .d
     on E: I10, on T: I3, on F: I4, on (: I5, on d: I6
I6:  F → d.
I7:  L → Er.
I8:  E → E+.T    T → .T*F    T → .F    F → .(E)    F → .d
     on T: I11, on F: I4, on (: I5, on d: I6
I9:  T → T*.F    F → .(E)    F → .d
     on F: I12, on (: I5, on d: I6
I10: F → (E.)    E → E.+T
     on ): I13, on +: I8
I11: E → E+T.    T → T.*F
     on *: I9
I12: T → T*F.
I13: F → (E).

Bottom-Up Evaluation Example
• At each shift of digit, we also push digit.lexval into val-stack.
("-" in the val-stack column marks an empty slot.)
stack            val-stack   input     action    semantic rule
0                (empty)     5+3*4r    s6        d.lexval(5) into val-stack
0d6              5           +3*4r     F→digit   F.val=d.lexval
0F4              5           +3*4r     T→F       T.val=F.val
0T3              5           +3*4r     E→T       E.val=T.val
0E2              5           +3*4r     s8        push empty slot into val-stack
0E2+8            5-          3*4r      s6        d.lexval(3) into val-stack
0E2+8d6          5-3         *4r       F→digit   F.val=d.lexval
0E2+8F4          5-3         *4r       T→F       T.val=F.val
0E2+8T11         5-3         *4r       s9        push empty slot into val-stack
0E2+8T11*9       5-3-        4r        s6        d.lexval(4) into val-stack
0E2+8T11*9d6     5-3-4       r         F→digit   F.val=d.lexval
0E2+8T11*9F12    5-3-4       r         T→T*F     T.val=T1.val*F.val
0E2+8T11         5-12        r         E→E+T     E.val=E1.val+T.val
0E2              17          r         s7        push empty slot into val-stack
0E2r7            17-         $         L→Er      print(17), pop empty slot from val-stack
0L1              17          $         acc

Bottom-Up Eval. of S-Attributed Definitions
Production        Semantic Rules
L → E return      print(val[top-1])
E → E1 + T        val[ntop] = val[top-2] + val[top]
E → T             (none)
T → T1 * F        val[ntop] = val[top-2] * val[top]
T → F             (none)
F → ( E )         val[ntop] = val[top-1]
F → digit         (none)
• At each shift of digit, we also push digit.lexval into val-stack.
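The val-stack rules above can be tried out with a small Python sketch. It is not an LR parser: the shift/reduce sequence is copied by hand from the trace for 5+3*4r, and the sketch only replays it against a val-stack. The function name `replay`, the `(kind, argument)` encoding of actions, and the use of `None` for an empty slot are assumptions made for this illustration.

```python
EMPTY = None  # assumed placeholder for the "empty slot" pushed with non-digit tokens


def replay(actions):
    """Replay shift/reduce actions on a val-stack; return (printed values, final val-stack)."""
    val = []      # parallel stack of attribute values
    printed = []  # values emitted by print(...) in the rule for L -> E return
    for kind, arg in actions:
        if kind == "shift":
            # A digit pushes its lexval; any other token pushes an empty slot.
            val.append(arg if isinstance(arg, int) else EMPTY)
            continue
        top = len(val) - 1
        if arg == "E -> E + T":
            ntop = top - 2
            val[ntop] = val[top - 2] + val[top]
            del val[ntop + 1:]  # pop the slots of '+' and T
        elif arg == "T -> T * F":
            ntop = top - 2
            val[ntop] = val[top - 2] * val[top]
            del val[ntop + 1:]  # pop the slots of '*' and F
        elif arg == "F -> ( E )":
            ntop = top - 2
            val[ntop] = val[top - 1]
            del val[ntop + 1:]  # pop the slots of E and ')'
        elif arg == "L -> E return":
            printed.append(val[top - 1])
            val.pop()           # pop the empty slot of 'return'
        # E -> T, T -> F and F -> digit leave the val-stack unchanged.
    return printed, val


# Action sequence taken from the trace for the input 5+3*4r.
TRACE = [
    ("shift", 5), ("reduce", "F -> digit"), ("reduce", "T -> F"), ("reduce", "E -> T"),
    ("shift", "+"),
    ("shift", 3), ("reduce", "F -> digit"), ("reduce", "T -> F"),
    ("shift", "*"),
    ("shift", 4), ("reduce", "F -> digit"), ("reduce", "T -> T * F"),
    ("reduce", "E -> E + T"),
    ("shift", "return"), ("reduce", "L -> E return"),
]

print(replay(TRACE))  # ([17], [17])
```

In a real implementation the LR driver would produce these actions from its parsing table and run the matching code fragment at each reduction; the hand-written list stands in for that driver here.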
