Bali++ Language Specifications

CS212

Fall 2006

1 Introduction

1.1 Bali Specifications

As discussed in lecture, we can specify a language in terms of its syntax (spelling and structural rules) and semantics (meaning). Although formal description methods exist for both kinds of specifications, we will only formalize the Bali syntax, as developed in Section 2. Bali’s semantics are informally defined in Section 3.

1.2 Notation

We use the following notation throughout this document:

• Specific language elements, like keywords, operators, and punctuation, are shown as bold.

• Other terminals that can have a range of values, such as integers and characters, are shown as bolditalic.

• Non-terminals in production rules are shown as plain red.

• Square brackets are used to denote optional items. [ int ] indicates that the keyword int can be present but is not required.

• A single | denotes that either the item on the left or the right can be present. So a | b indicates that either a or b must be present, but not both.

• Parentheses are used to options. Thus, ( a | b ) indicates that c must be prefixed by either a or b. Square brackets can also be used for grouping with the difference that the group would be optional.

• An (*) is used to indicate zero or more occurrences of this item and a plus (+) is used to indicate one or more occurrences of this item.

• An arrow (→) indicates a production rule, which means can be expressed as.

(please refer also to the lecture notes for further information regarding regular expressions)

1 2 Bali Syntax

2.1 Program and Functions

program → [globals varBlock ;] func* program can have global variables and functions func → funcHeader ( varBlock ; )* statement function has header, optional local variables, and a statement funcHeader → type name ( [ varBlock ] ) header has return type, function name, and optional parameters

2.2 Variables and Naming

varBlock → type name ( , type name )* one or more variable definitions type → ( int | boolean | char ) primitive types name → (a-z|A-Z)(a-z|A-Z|_|0-9)* names must begin with a letter

2.3 Statements statement → { statement* } block of statements statement → if ( expression ) then statement [ else statement ] conditional statement statement → while ( expression ) statement while loop statement → expression; expression statement statement → print expression; output statement → return expression; return statement

2.4 Expressions

expression → expPart [ binop expPart ] basic expression expression → expPart := expression assignment expression → expPart ? expPart : expPart ternary/conditional operator expPart → unaryop expPart unary expPart → int | boolean |’char’ integer, boolean, or character constant expPart → readInt() integer input expPart → readChar() character input expPart → name variable access expPart → name ( [ expression ( , expression )* ] ) function call expPart → ( expression ) nested expression binop → arithmeticOp | comparisonOp | booleanOp binary operators arithmeticOp → + |-| * | / | % operators comparisonOp → == | > | < comparison operators booleanOp → && | || boolean operators (both are short-circuiting) unaryOp → - | ! unary operators

2 3 Bali Semantics

The following section details the semantics of the Bali language. If the following restrictions are violated, the compiler must throw a SemanticException (as opposed to a SyntaxException for violations of syntax above).

3.1 Reserved Keywords

The following keywords may not be used as variable or function names: globals, int, boolean, char, if, then, else, while, print, return, readInt, readChar, true, false.

3.2 Namespaces

A namespace is a set of unique identifiers (names). The following namespaces exist in Bali:

• global namespace (global variables and functions)

• local namespacess (arguments and local variables of a function)

Furthermore, the following semantic rules apply to these namespaces

• Declaring two or more symbols with the same name within the same namespace is not allowed.

• Symbols with the same name may be declared in different namespaces.

• Global variables and functions can be accessed anywhere where it is syntacticaly valid. Local and argument variables can only be accessed within the function in which they are defined. It is semantically invalid to access a function or variable that has not been defined in the program.

• If a variable access occurs in a function and its name was declared both in that function’s local variable names- pace and in the global name space, that variable access refers to the local variable not the global variable.

3.3 Expressions

Bali is a strongly typed language, and as such enforces restrictions on how expressions may be combined. Each expression has a type, and each operator will only work with specific types. These typing rules are given bellow.

Binary Operators

• Boolean operators accept and produce booleans.

• The equality comparison operator, ==, accepts two of the same type and produces a boolean result. All other comparison operators accept only integers and produce a boolean result.

• Arithmetic operators accept only integers and produce an integer. Division is integral for integers.

• Binary operations should be evaluated starting with the expression on the left and then the expression on the right. So for a + b, a should be evaluated first, then b, and then their results should be added.

3 Assignment Operator

The left side of an assignment operator must be a valid variable name and must be of the same type as the output of the expression on the right. It assigns the value of the expression on the right to the variable on the left and produces a result of the same type and value as the value assigned.

Ternary/Conditional Operator

The ternary operator is of the form ex1 ? ex2 : ex3. ex1 must be of type boolean. ex2 and ex3 must be of the same type T. Furthermore, the operator produces a result of T. If ex1 is true, then the operator evaluates only ex2. Otherwise, only ex3 is evaluated.

Unary Operators

The unary operator “negate” accepts and produces only integers and the unary operator “not” accepts and produces only booleans.

Constants

Any Java integer is a constant of type int. The keywords true and false are constants of type boolean. Any Java character delimited by single quotes (’) is a constant of type char.

Variables and Function Calls

• The built in function readInt requests an integer from the user and returns an integer. readChar requests a character from the user and returns a character.

• Variables have the type that they were defined to have.

• Function calls have the type of the return type of the function being called. Furthermore, function arguments should be evaluated from left to right. (For example, in the call “foo(x + 1,y + 2)” first “x + 1” should be evaluated and secondly y + 2 should be evaluated.)

3.4 Statements

• while statements require an expression of boolean type and have the same looping functionality as in Java.

• if statements require an expression of boolean type and have the same conditional functionality as in Java (just the added then keyword).

• A print statement requires an expression of integer or character type and then prints the output to SaM’s console.

• A return statement occuring within a particular function requires an expression of that function’s return type.

• An expression statement can have an expression with any return type, since the return value is eliminated.

4 3.5 Functions

• All functions contain an implicit return statement at the end of the function that returns 0 for integers, ’\0’ for characters, false for booleans. Thus a function will return something even if no specific return statement is encountered.

• There must be one function called main with a return type of int and no parameters. This function will be called on startup.

5