Lesson 1: Symbol Tables 1


1. Introduction
2. Name spaces
3. Organization
4. Block structured languages
5. Perspective
6. tabla.c and tabla.h
7. Exercises

Readings:
  Scott, section 3.3
  Muchnick, chapter 3
  Aho, section 7.6
  Fischer, chapter 8
  Holub, section 6.3
  Bennett, section 5.1
  Cooper, sections 5.7 and B.4

12048 Compilers II - J. Neira – University of Zaragoza

The Symbol Table
• Why is it important?
  – Lexical analysis?
  – Syntactic analysis?
  – Semantic analysis?
  – Code generation?
  – Code optimization?
  – Execution?
  Fragments to consider:
    int a, 1t;
    while a then ...
    c := i > 1;
• Why is it particular and complex?
  – What information does it contain?
  – How/when is information included?
  – How/when is it accessed?
  – How/when is it deleted?
• Do interpreters need one?
• Do debuggers?
• Disassemblers?

1. Introduction
Symbol table: structure used by the compiler to store information (in the form of attributes) associated with symbols declared in the program.
• Conceptually it is a set of records
• Program dictionary
• Its organization is strongly influenced by syntactic as well as semantic aspects of the language at hand.
It can additionally include:
  – temporary symbols
  – labels
  – predefined symbols
• name: lexical analysis
• type: syntactic analysis
• scope: semantic analysis
• address: code generation
• Types available in the language: determine the CONTENTS of the table.
• Scope rules: determine the visibility of the symbols, i.e., the table's ACCESS MECHANISM.

Table contents
• Reserved words: they have a special meaning; they CANNOT be redefined.
    program begin end type var array if ...
• Predefined symbols: also have a special meaning, but can be redefined.
    sin cos get put read write integer! real!
• Literals, constants that denote a certain value
• Symbols generated by the compiler
    var a: record
             b,c : integer;
           end;
  – Generates symbol noname1 for the anonymous type corresponding to the record.
• Symbols defined by the programmer
  – Variables: type, place in memory, value? references?
  – Data types: description
  – Procedures and functions: address, parameters, result type
  – Parameters: type of variable, parameter class
  – Labels: place in the program

Query
When processing ... the compiler ...
• declarations: queries the table to prevent illegal duplication of symbol names.
• statements: queries the table to verify that the involved symbols are accessible and used correctly.

    const c = v;
• Does identifier c exist?

    var v : t;
• Does identifier v exist?
• Is type t declared?

    v := f (a, b + 1);
• Is f a function?
• Does the number of arguments agree with the number of parameters?
• Do the types of the arguments agree with those of the parameters?
• Is the use of arguments correct?

    v := e;
• Is v defined? Is it a variable? What is its address?
• Is e defined? Is it a variable?
• Is it the same type as v (or of a compatible type)?

Update
When processing ... the compiler ...
• declarations: updates the table to include new symbols.
• scopes: updates the table to modify the visibility of symbols.

    const c = v;
  – Include c with the type of v and its value
  – Assign a location in memory?

    var v : t;
  – Include v with type t
  – Assign it a position in memory

    end;
  – Delete (or hide) all symbols of the block (scope) that is closing.

    function f (i : integer) : integer;
  – Include f (as function)
  – Open a new scope
  – Include f (as assignment variable)
  – Include i (as parameter)

    procedure P (i : integer; var b : boolean);
  – Include P (procedure)
  – Open a new scope
  – Include i and b (parameters)

Requirements
• Speed: query is the most frequent operation. O(n)? O(log2(n))? O(1)?
• Easy maintenance: identifier deletion must be simple
  – Deletion is not random
• Efficiency in space management: a large amount of information is stored.
• Duplicate identifiers should be allowed: most languages allow it. It must be clear which are accessible at each moment.
• Flexibility: the possibility of defining types makes the declaration of variables arbitrarily complex.

Requirements
    program e;
    var a, b, c : char;
      procedure f;
      var a, b, c : char;
      ...
      procedure g;
      var a, b : char;
        procedure h;
        var c, d : char;
        ...
        procedure i;
        var b, c : char;
        ...
      ...
      procedure j;
      var b, d : char;
      ...
    ...

Symbols declared in each scope:
    e(): a, b, c, f, g, j
    f(): a, b, c
    g(): a, b, h, i
    h(): c, d
    i(): b, c
    j(): b, d

Scopes held in the symbol table while each block is being compiled:
    Program being compiled:  e()  f()  g()  h()  i()  j()  e()
    Symbol table:                           h()  i()
                                  f()  g()  g()  g()  j()
                             e()  e()  e()  e()  e()  e()  e()

2. Names
• Remember: conceptually, we store records with the name and attributes of each symbol.
    struct {
        char name[MAX];
        ...
    } table_entry;
• A storage and search mechanism by name is required.
• Alternative 1: define the name field as a vector of characters.
  [Figure: table with columns "name" and "attributes" and example entries base, indice, comienzo, x; every name field has space for the longest possible identifier]
• FORTRAN IV: number of significant characters severely limited (6). It is not so in modern languages.
• Variability in the length of names -> space is wasted.

Use of the Heap
• Alternative 2: define the name field as a pointer to char.
    struct {
        ...
        char *nombre;
        ...
    } entrada_tabla;
    ...
    e->nombre = strdup(nombre);
  [Figure: table with columns "name" and "attributes"; each name field points to a string (base, indice, comienzo, x) stored in the HEAP]
• Obtaining space may be slow
• Space reuse depends on the heap memory recovering mechanism.
• The requirements for a symbol table are simpler than those offered by heaps.

Name space
• Alternative 3: define the name field as an index in a vector of names.
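A minimal C sketch of Alternative 3, assuming names are stored back to back, each terminated by NUL, and that space is recovered by resetting the free index. The identifiers `espacio_nombres`, `entrada_tabla`, `nombre` and `MAX` come from the slides. The value chosen for `MAX` and the helpers `store_name`, `release_from` and `free_index` are hypothetical names introduced only for this illustration, not the contents of tabla.c.

```c
#include <assert.h>
#include <string.h>

#define MAX 1024                      /* capacity of the name space (assumed value) */

static char espacio_nombres[MAX];     /* all names, back to back, NUL-terminated */
static int  free_index = 0;           /* first unused position (hypothetical name) */

typedef struct {
    int nombre;                       /* index of the name inside espacio_nombres */
    /* ... other attributes ... */
} entrada_tabla;

/* Copies name into the name space.
 * Returns the index where it was stored, or -1 if it does not fit. */
static int store_name(const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name) + 1;    /* include the terminating NUL */
    if (len > (size_t)(MAX - free_index))
        return -1;                    /* space is not 'unlimited' */
    int idx = free_index;
    memcpy(&espacio_nombres[idx], name, len);
    free_index += (int)len;
    return idx;
}

/* Releases every name stored at or after idx, for example when the
 * scope that declared them closes. The space can then be reused. */
static void release_from(int idx)
{
    assert(idx >= 0 && idx <= free_index);
    free_index = idx;
}
```

Storing "base" and then "indice" would yield the indexes 0 and 5, and `&espacio_nombres[e.nombre]` recovers the text of an entry's name. Releasing from index 5 makes the next stored name start at 5 again, which is the sense in which administration is local and space can be reused.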
  [Figure: the name space holds the names base, indice, comienzo and x stored consecutively, followed by free space; the name field of each table entry is an index into it]
    struct {
        int nombre;
        ...
    } entrada_tabla;
    ....
    char espacio_nombres[MAX];
• Administration is local
• Space can be reused
• Space is not 'unlimited'

Name space
What is the appropriate size for the name space?
• Small: it may be insufficient
• Large: space may be wasted
• Solution: segmented name space
  – segment: name div s
  – index in segment: name mod s
  – name = segment * s + index
  [Figure: array of T pointers, each pointing to a vector of size s that stores names such as base, indice, comienzo, x]
• Overall size is limited by the size of the vector of pointers (T=50 pointers to 1024 chars = 50k)

3. Organization
• Three basic operations:
  – search()
  – include()
  – delete()
• Alternative 1: unordered list
    search()   O(n)
    include()  O(1)
    delete()   O(1)
  [Figure: an array of entries (name, attributes) with positions 0 to M-1, where n marks the position labelled "Include here"; and a linked list of entries with pointers p and u, where new entries are included at the end]

3. Organization
• Alternative 2: list ordered by name
  [Figure: the same array (positions 0 to M-1, n entries) and linked list (pointers p and u), kept ordered by name. Insertion where?]
                 array               linked list
    search()     O(log2(n))          O(n)
    include()    O(log2(n))+O(n)     O(n)+O(1)
    delete()     ?                   ?
  The order of inclusion is lost!

3. Organization
• Alternative 3: binary trees
    search()   O(log2(n))
    include()  O(log2(n))+O(1)
    delete()   ?
  Only if balanced!
• In the worst case, the cost of each operation is the same as the ordered list.
  [Figure: the trees built for "var i, j, k, l, m : integer;" and for "var i, m, j, l, k: integer;"; each is a chain of five nodes]
• There is no guarantee that the tree will be balanced (names are not random).

3. Organization
• Alternative 4: hash tables. Mechanism to randomly distribute an arbitrary number of items into a finite set of classes.
• Hash function h: associates a character sequence with a hash code = index in the table.
• Collisions: two different sequences may be associated with the same index.
  [Figure: table with positions 0 to M-1, where h('base') = 0, h('x') = i and h('comienzo') = j. Open questions: which h? how are collisions managed?]
• With an appropriate hash function and adequate collision management, you can search() in constant time.

The hash function h
• Desirable characteristics:
  – h(s) should depend only on s
  – Efficiency: it must be simple and easy to compute
  – Efficacy: it should produce small collision lists
    » Uniform: all indexes should be assigned with equal probability
    » Randomizing: similar names should go to different indexes
• Birthday paradox:
  – Given N names, and a table of size M
  – Let h be uniform and randomizing. The expected number of insertions before a collision is sqrt(pi * M / 2):
        M          sqrt(pi * M / 2)
        10         4
        365        24
        1,000      40
        10,000     125
        100,000    396
  – Random numbers between 0 and 100:
        84 35 45 32 89 1 58 16 38 69 5 90 16 53 61 ...
    Collision at the 13th number (16 appears for the second time).

Examples
Method of division
    function hash_add(s : string; M : integer) : integer;
    var i, k : integer;
    begin
      k := 0;
      for i := 1 to length(s) do
        k := k + ord(s[i]);
      hash_add := k mod M;
    end;
• Size = 19, ids A0...A199
• Size = 100, ids A0...A99
  [Figure: for each of the two cases, a histogram of the number of identifiers assigned to each table index by hash_add]
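The experiment behind the two cases above can be reproduced with a short C sketch. `hash_add` transcribes the Pascal function, assuming ASCII character codes for `ord`. The helpers `count_ids` and `used_indexes` are hypothetical names added only to tally the distribution; they are not part of the course code.

```c
#include <assert.h>
#include <stdio.h>

/* Additive hash: sum of the character codes of s, reduced modulo M. */
static unsigned hash_add(const char *s, unsigned M)
{
    unsigned k = 0;
    assert(s != NULL && M > 0);
    for (; *s != '\0'; s++)
        k += (unsigned char)*s;
    return k % M;
}

/* Fills counts[0..M-1] with the number of ids A0..A(n-1)
 * that hash_add sends to each index of a table of size M. */
static void count_ids(unsigned n, unsigned M, unsigned *counts)
{
    char id[16];
    for (unsigned i = 0; i < M; i++)
        counts[i] = 0;
    for (unsigned i = 0; i < n; i++) {
        snprintf(id, sizeof id, "A%u", i);
        counts[hash_add(id, M)]++;
    }
}

/* Number of indexes that received at least one id. */
static unsigned used_indexes(const unsigned *counts, unsigned M)
{
    unsigned used = 0;
    for (unsigned i = 0; i < M; i++)
        if (counts[i] > 0)
            used++;
    return used;
}
```

Calling `count_ids(100, 100, counts)` and then `used_indexes(counts, 100)` gives 28: the ids A0...A9 fall on indexes 13 to 22 and the ids A10...A99 on indexes 62 to 79, so 72 of the 100 indexes stay empty. Similar names produce nearby sums, so this function is not randomizing in the sense described above.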
