Appendix A: The BCPL Language

I've used BCPL as the language of illustration throughout most of this book because it is designed as a system-programming language and is especially suited to the special problems of compiler writing (a recurrent joke is that BCPL is designed as a language in which to write the BCPL compiler!). It is suitable for this purpose mainly because of the ease with which a program can manipulate pointers. In addition it is untyped - all BCPL values are treated merely as 'bitstrings' - so that a program can do just about anything you might require with a pointer. Recursion is efficient in BCPL (see chapter 13).

I have taken many liberties with BCPL syntax in examples, mainly inspired by the need to compress complicated algorithms into the space of a single page. I have used if-then-else in place of BCPL's test-then-or for no other reason than that the first construction will be more familiar to many readers; I have used elsf because it abbreviates the programs. I have used dot-suffix notation 'nodep.x' rather than 'nodep!x' or 'x^nodep' because I believe it will be more familiar to many readers. Apologies to BCPL fanatics (and to BCPL's designer), but I defend myself by saying that my task is to explain compiler algorithms, not the syntax of BCPL.

In what follows I explain only those portions of BCPL (and pseudo-BCPL) which I have used in examples. The language is much more powerful and elegant than I have made it seem - I commend its use to all readers.

Statements

BCPL provides several forms of iterative and conditional statements. In general, if there are two uses for a statement, BCPL provides two different syntaxes: hence while, until, repeatwhile, repeatuntil and repeat are all provided in BCPL. Semicolons aren't usually needed to separate statements, although I have used them where one statement follows another on a single line.
The then or do symbol is usually unnecessary, but I have generally included it to avoid confusing those readers unused to the language.

1. Assignment statement:
       <lhs-exprlist> := <exprlist>
   Each <lhs-expr> can be
       <name>
       <expr>!<expr>
       !<expr>
       <name>.<name>

2. Procedure call:
       <expr>()
       <expr>(<exprlist>)
   BCPL makes you indicate parameterless procedures by showing an empty parameter list; there is only call-by-value.

3. Iterative statements:
       while <expr> do <statement>
       until <expr> do <statement>
       <statement> repeat
       <statement> repeatwhile <expr>
       <statement> repeatuntil <expr>
   The test is made either before each execution of <statement> (while, until) or after it (repeatwhile, repeatuntil); the plain repeat form has no test and loops until control leaves it, for example by break.

4. Selection of cases:
       switchon <expr> into <statement>
   The <statement> must contain case labels of the form
       case <constant>:
       case <constant> .. <constant>:
       default:

5. Conditional statement (not strict BCPL):
       if <expr> then <statement>
       if <expr> then <statement> else <statement>
       if <expr> then <statement> elsf <expr> then ...

6. Compound statements:
       { <statement>* }
       { <declaration>* <statement>* }

7. Control statements:
       break    - exit from current iterative statement
       loop     - end present iteration of current iterative statement
       endcase  - exit from current switchon statement

Declarations

8. Variable declarations:
       let <namelist> = <exprlist>

9. Procedures and functions:
       let <name>() be <statement>
       let <name>(<namelist>) be <statement>
       let <name>() = <expr>
       let <name>(<namelist>) = <expr>
   Note that the <expr> in a function declaration is usually a valof expression.

Expressions

There are many kinds of BCPL operator, both binary and unary. I have assumed that all unary operators have the highest priority, that arithmetic operators come next with their conventional priorities, relational operators next, and finally logical operators. In cases of confusion I've used brackets.
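
The priorities just described can be made concrete with a small sketch. It is written in Python rather than BCPL, and it is only an illustration under stated assumptions: the names PRIORITY, UNARY and bracket are hypothetical, the numeric levels are arbitrary, every binary operator is taken to be left-associative, and ranking '&' above '|' is an assumption that the text does not state. The subscript and field-select operators are left out for brevity.

```python
# Binary operators, strongest binding first: arithmetic, then relational,
# then logical. Only the relative order matters; the numbers are arbitrary.
PRIORITY = {
    "*": 5, "/": 5, "rem": 5,
    "+": 4, "-": 4,
    "<": 3, "<=": 3, "=": 3, "\\=": 3, ">=": 3, ">": 3,
    "&": 2,   # assumption: '&' binds tighter than '|'
    "|": 1,
}

# Unary operators bind more tightly than any binary operator.
UNARY = {"@", "!", "+", "-", "\\"}


def bracket(tokens):
    """Return the expression in `tokens` with every operation bracketed."""
    pos = 0

    def operand():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok in UNARY:
            # A unary operator applies to the operand that follows it.
            return "(" + tok + operand() + ")"
        if tok == "(":
            inner = expression(1)
            pos += 1  # skip the closing ")"
            return inner
        return tok

    def expression(min_priority):
        nonlocal pos
        left = operand()
        # Absorb operators whose priority is at least min_priority.
        while pos < len(tokens) and PRIORITY.get(tokens[pos], 0) >= min_priority:
            op = tokens[pos]
            pos += 1
            # Asking for strictly higher priority on the right gives
            # left-associativity for operators of equal priority.
            right = expression(PRIORITY[op] + 1)
            left = "(" + left + " " + op + " " + right + ")"
        return left

    return expression(1)


print(bracket("a + b * c < d & \\ e".split()))
# prints (((a + (b * c)) < d) & (\e))
```

The sketch expects its tokens already separated by spaces; a real lexical analyser would of course do that work.
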
Conditional expressions have the lowest priority. In translation examples I have used an invented operator: the '++' operator which takes a string and appends a character. It is quite unrealistic, but I hope you see what it means.

10. Value of a statement:
       valof <statement>
   The <statement> must contain a resultis <expr> statement.

11. Unary operators:
       @<name>              /* address-of */
       !<expr>              /* contents-of */
       +<expr>, -<expr>     /* unary integer arithmetic */
       \<expr>              /* logical 'not' */

12. Function calls:
       <expr>()
       <expr>(<exprlist>)

13. Binary operators:
       <expr>!<expr>            /* subscript operator */
       <expr>.<name>            /* field-select operator */
       +, -, *, /, rem          /* integer arithmetic */
       <, <=, =, \=, >=, >      /* integer relations */
       &, |                     /* logical 'and' and 'or' */
       ++                       /* string concatenation */
   'V!n' is similar to V[n] in most other languages, 'P.a' is similar to SIMULA 67's P.a, PASCAL's P^.a and ALGOL 68's a of P.

14. Conditional expressions:
       <expr0> -> <expr1>, <expr2>
   If the value of <expr0> is true then <expr1> is evaluated: otherwise <expr2> is evaluated.

Appendix B: Assembly Code Used in Examples

I have been consistent in using a single-address, multi-register machine as my illustration in examples. The code format is:
       <operation-code> <register-number>, <address>
where an address is either a <number> or <number>(<register>). Examples:
       JUMPFALSE 1, 44
       ADD 1, 3217(7)
       ADDr 5, 2
Any register may be used as an address modifier or as an accumulator. There are a fixed number of registers (the exact number doesn't matter). Sometimes (e.g. JUMP) the register doesn't matter and is omitted, sometimes (e.g. FIXr) the address is omitted.

I find it best to divide instructions into different groups distinguished by a lower-case letter at the end of the operation code. This makes logically different operations, which on many real machines are implemented by widely differing instructions, visually distinct.
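
The code format above can be sketched as a small parser that splits a line into its operation code, group suffix, register and address. This is a minimal illustration in Python under stated assumptions, not a definitive reader for the notation: the names Instruction, SUFFIXES and parse are hypothetical, only well-formed lines with numeric operands are handled, and an operand written without a comma is assumed to be the address (as in JUMP 44), because the written form of an instruction whose address is omitted is not shown here.

```python
from typing import NamedTuple, Optional

# The lower-case group letters that may end an operation code.
SUFFIXES = set("rsnai")


class Instruction(NamedTuple):
    base: str                 # operation code without its suffix, e.g. "ADD"
    suffix: str               # group letter, or "" when there is none
    register: Optional[int]   # None when the register is omitted
    number: Optional[int]     # <number> part of the address, None if omitted
    modifier: Optional[int]   # register in <number>(<register>), if present


def parse(line: str) -> Instruction:
    """Parse '<operation-code> <register-number>, <address>'."""
    opcode, _, operands = line.strip().partition(" ")

    # A trailing lower-case letter marks the instruction group.
    if opcode[-1] in SUFFIXES:
        base, suffix = opcode[:-1], opcode[-1]
    else:
        base, suffix = opcode, ""

    # Assumption: with no comma, the single operand is the address.
    if "," in operands:
        reg_text, addr_text = (part.strip() for part in operands.split(",", 1))
    else:
        reg_text, addr_text = "", operands.strip()

    register = int(reg_text) if reg_text else None
    number = modifier = None
    if addr_text:
        head, paren, rest = addr_text.partition("(")
        number = int(head)
        if paren:
            # The address has the form <number>(<register>).
            modifier = int(rest.rstrip(")"))
    return Instruction(base, suffix, register, number, modifier)


print(parse("ADD 1, 3217(7)"))
print(parse("ADDr 5, 2"))
```

Keeping the suffix as a separate field mirrors the point made above: the group letter can be examined on its own, independently of the operation it qualifies.
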
The suffixes are:

- no suffix means a store-to-register operation (except STORE, which is of course register-to-store)
- 'r' means register-to-register
- 's' means register-to-store
- 'n' means the <address> part is to be interpreted as a number
- 'a' means the <address> part is to be interpreted as an address - this may seem just like 'n' but on some machines it isn't!
- 'i' means indirect addressing - the memory cell addressed contains the actual address to be used in the instruction.

Examples of the differences:

ADD 1, 2        means add the contents of store location 2 to register 1
ADDr 1, 2       means add the contents of register 2 to register 1
ADDs 1, 2       means add the contents of register 1 to store location 2
ADDn 1, 2(4)    means add the number which is formed by adding 2 and the
                contents of register 4, to register 1
ADDa 1, 2(4)    means add the address which is formed by combining 2 and
                the contents of register 4, to register 1

I hope the instruction-names are fairly indicative of their operation. In the examples I've mostly used LOAD, STORE, JSUB, SKIP?? and the arithmetic operations. Here is a table which may clarify matters:

Instruction       Explanation

LOAD              place a value in a register
STORE             place a value in a memory cell
INCRST            add one to store location
DECRST            subtract one from store location
STOZ              set all bits in store location to zero
STOO              set all bits in store location to one
INCSKP            increment store location; skip next instruction if
                  result is zero
DECSKP            decrement store location; skip next instruction if
                  result is zero
ADD               add two values
SUB               subtract one value from another
NEGr              negate value in register
MULT              multiply two values
DIV               divide one value by another

(I have used fADD, fSUB etc. to denote the floating-point analogue of these instructions. Likewise xSUB, xDIV etc. denote the 'exchanged' or 'reverse' variant - see chapter 5)

FIXr              convert floating-point number in register to fixed point
FLOATr            convert fixed-point number in register to floating point
SKIP              jump over the next instruction
SKIPLT            jump over the next instruction if register value is
                  less than (LT) store value
SKIPLE            ditto, but relation is 'less or equal'
SKIPNE            ditto, but relation is 'not equal'
SKIPEQ            ditto, but relation is 'equal'
SKIPGE            ditto, but relation is 'greater or equal'
SKIPGT            ditto, but relation is 'greater than'
JUMP              transfer control to indicated address
JUMPLT            transfer control only if register value is less than
                  (LT) zero
JUMPLE            ditto, but if less than or equal to zero
JUMPNE            ditto, but if not equal to zero
JUMPEQ            ditto, but if equal to zero
JUMPGE            ditto, but if greater than or equal to zero
JUMPGT            ditto, but if greater than zero
JUMPTRUE          transfer control if register contains special TRUE value
JUMPFALSE         ditto, but for FALSE value
JSUB L, a         transfer control to indicated address, storing the
                  address of the next instruction (the return address)
                  in the register
RETN , a          return to indicated address
PUSH p, a         transfer contents of address a to top of stack
                  indicated by register p, increase p
POP p, a          decrement stack pointer p, transfer contents of top of
                  stack to address a
PUSHSUB, POPRETN  analogous to JSUB, RETN but link address is on the
                  stack rather than in a register
INC p, a          add contents of location a to stack register p and
                  check that p is not outside bounds of stack space
DEC p, a          subtract contents of location a from stack pointer and
                  check limits of stack

Some of the instructions in this list may seem to have the same effect as others, but I have in general included an instruction for each simple machine-code operation which I have needed to illustrate. Using such a large instruction set makes my task easier than it would otherwise be, of course, but some real machines have larger sets still (e.g.