Language Grouse-3
Total Page:16
File Type:pdf, Size:1020Kb
SFU CMPT 379 Compilers Spring 2015 Assignment 3 (preliminary) Assignment due XXX, November XX, by 11:59pm. For this assignment, you are to expand your Grouse-2 compiler from assignment 2 to handle Grouse-3. Project submission format is the same as for assignment 2 (an archive of your “src” directory inside a folder named with your sfu username). Grouse-3 is backwards compatible with Grouse-2. All restrictions and specifications from Grouse-2 are still in force, unless specifically noted. I also omit unchanged productions in the Tokens and Grammar sections. If ". ." is listed as a possibility for a production, it means "all possibilities for this production from Grouse-2". Language Grouse-3 Tokens: punctuation → -> | … Grammar: S → globalDefinition* main block globalDefinition → tupleDefinition functionDefinition tupleDefinition → tuple identifier parameterTuple ; functionDefinition → func identifier ( parameterList ) -> parameterTuple body parameterTuple → ( parameterList ) | identifier // identifier must be a tuple name. parameterList → parameterSpecification ⋈ , // 0 or more comma-separated parameterSpecifications parameterSpecification → type identifier type → identifier | … // void usable only as a return type statement → returnStatement // allowed inside a function definition only. breakStatement // allowed inside a loop only. continueStatement // allowed inside a loop only. callStatement … callStatement → call functionInvocation ; // function return value is ignored returnStatement → return ; breakStatement → break ; continueStatement → continue ; forStatement → for ( forControlPhrase ) block forControlPhrase → index identifier of expression // expression is of array type element identifier of expression // expression is of array type ever count ( expression lessOp ) ? identifier lessOp expression // expressions are of type int lessOp → < | <= expression → functionInvocation subelementReference fresh identifier ( expressionList ) ; … functionInvocation → identifier ( expressionList ) subelementReference → expression . identifier 1. Tuples Tuples are a new type of type in Grouse-3. Each tuple type is an ordered sequence of specific types. In type- notation, tuple types are denoted “tuple(type1, type2, … )” or “t(type1, type2, …)” for short. Tuple types can be considered as parameterized types, where type1, type2, etc. are the parameters, also known as the subtypes. 1.1. Tuple type equivalence Two tuple types t(typea1, typea2, … , typeaN) and t(typeb1, typeb2, … , typebM) are considered equivalent if N=M and, for all i from 1 to N, typeai is equivalent to typebi . There is no other notion of tuple type equivalence; in particular, names of the types and the subtypes are not important. When two types are equivalent, we often say they are the same. As special cases, any unary tuple type t(type) is considered equivalent to the not-included-in-a-tuple type. Thus t(int) is considered to be the same type as int, and t(t(int, t(char))) is considered equivalent to t(int, char). Also, the nullary tuple type t() is considered equivalent to a new primitive type, known as “void”. Note that the Grouse programmer has no way other than the nullary tuple type to express the void type. Any tuple type with at least two subtypes is considered a nontrivial tuple type; nontrivial tuple types are reference types. The kind of equivalence described here is known as structural equivalence because it is the structure of the types that is compared; this is opposed to name equivalence where the names of the types and subtypes must also agree. 1.2. Trivial and non-trivial tuple types Since they are equivalent to (and implemented as) either void or another type, nullary and unary tuple types are considered trivial tuple types. The type void takes zero bytes. Any other tuple type—which has at least two subtypes--is considered a nontrivial tuple type. Nontrivial tuple types are reference types. 1.3. Naming a tuple type A tuple type may be anonymous or named. Names for tuple types are created by the grouse programmer with a tuple definition or a function definition. A tuple definition is of the form: tupleDefinition → tuple identifier parameterTuple ; With this definition, the programmer declares identifier to be a name for the tuple type denoted by parameterTuple. In the function definition functionDefinition → func identifier parameterTuple1 -> parameterTuple2 body the identifier is declared as a name for the tuple type denoted by parameterTuple2. In either definition, the parameterTuple that receives the name can either be a name declared in some other tupleDefinition or functionDefinition, or it can be a parenthesized comma-separated list of zero or more parameterSpecifications: parameterTuple→ ( parameterList ) | identifier parameterList → parameterSpecification ⋈ , parameterSpecification → type identifier For example, consider the following tupleDefinitions: tuple turkey (int a, float b, int c); tuple duck (int c, float xy, int d); tuple goose (int c, int d, float f); tuple ostrich (int c, int d, float f, float g); The first line defines turkey to be a name for the tuple type t(int, float, int). The second defines duck to be a name for the tuple type t(int, float, int). Note that turkey and duck thus refer to the same type (as defined in section 1.1). On the other hand, goose is defined to be a name for the tuple type t(int, int, float), so it is a different type. Ostrich is defined as t(int, int, float, float), so it’s a different bird altogether. Now consider: tuple robin (); tuple woodThrush (bool b); tuple variedThrush woodThrush; All of the above are legal tuple definitions, and it is permitted that they appear in this order inside a grouse file. In particular, robin is defined as a name for the nullary tuple type t(), woodThrush as a name for a unary tuple type t(bool). variedThrush is defined as being equivalent to woodThrush, so it is the type t(bool). The function definitions: func whistle(int a, int b)->(char c, char d) { … } func triller(int x) -> woodThrush { … } as one of their effects, create type names exactly as if one had defined: tuple whistle (char c, char d); tuple triller woodThrush; 1.4. Implementing tuples Trivial tuples are the same type as their subtype (or void); thus, one uses the subtype implementation. Nontrivial tuples are implemented as a reference type, with records having the standard 9-byte header (typeID, status, refcount). The typeID for tuples are integers starting with 100 and ascending from there. Each different tuple type must be given a different typeID (if you’d like to assign multiple typeIDs for a given tuple type, it is not necessary, but feel free to do so). The status bits are the same as for arrays and strings, but subtype-is-reference is set to 1 for any tuple iff at least one subtype is a reference type. The size of the record for a tuple type is 9 (for the header) + the sum of the sizes of the subtypes. For instance, t(int, float) would have a record size of 21 (= 9 + 4 + 8). In the grouse compiler, each tuple type should have its own instance of the TupleType class (or whatever you wish to call this class); TupleType implements Type and it has a reference to a symbol table that contains bindings that include the name and type of each subelement, along with a (positive) offset for each, and possibly an ordinal (a.k.a. position--in case you don’t want to have to derive this from the offsets). You can acquire this symbol table by placing a scope on the tuple type’s definition and visiting its subtree. TupleType should have an appropriate “equals” method that compares the details in the symbol tables of two TupleTypes. Recall that only the types and positions, and not the names, of the subtypes matter when comparing TupleTypes. 1.5. Visibility and distinctness of a tuple name Any name that is defined as a tuple name is visible throughout the file, including before the definition. For instance, this means that: tuple swainsonsThrush bluebird; tuple bluebird (int i, char c); is legal Grouse-3 and results in swainsonsThrush being a name for the type t(int, char). This is also true of tuple names defined in a functionDefinition. Identifiers defined as a tuple name (or function name) may not be used for any other purpose throughout the file. It may not be shadowed. 1.6. Tuple creation Nontrivial tuples are created with fresh or copy. For example, fresh bluebird(14, ‘r) creates a new bluebird tuple with entries 14 and ’r. This works with the bluebird definition in sec. 1.5 to create a new tuple record with i=14 and c=’r. (This tuple record would have 14 bytes, with i stored at offset 9 + 0 and c at offset 9+4. (Here the 9’s denote the header size.) copy also works to copy a tuple type. For example: imm v := fresh bluebird(14, ‘r); imm w := copy v; But keep in mind that the argument to copy can be any expression that gives a tuple type. A nontrivial tuple can also be created when a function returns. With whistle declared as in section 1.3 : imm v := whistle(4, 10); results in v being of type t(char, char), and v being assigned to a tuple record of this type created by the function whistle. 1.7. Tuple entries The entries (subelements) of a tuple are mutable and targetable. They are expressed in grouse as expression . identifier where expression has a tuple type, and identifier is the name of a subelement of that tuple type. (This is the place where the names of the subelements of a tuple type matter.) Although turkey and duck above refer to equivalent types, they have different subelement names, so the following are illegal in Grouse-3 (assuming myTurkey is of type turkey): myTurkey.xy fresh duck(10, 4.317, 12).b On the other hand, the following are legal: myTurkey.c fresh duck(10, 4.317, 12).xy 1.8.