What is SaM?

• SaM is a simple stack machine designed to introduce you to compilers in 3-4 lectures • SaM I: written by me around 2000 – modeled vaguely after JVM SaM I Am • SaM II: complete reimplementation and major extensions by Ivan Gyurdiev and David Levitan (Cornell undergrads) around 2003 • Course home-page has – SaM jar file – SaM instruction set manual – SaM source code

SaM Screen-shot Stack machine • All data is stored in stack (or heap) – no data registers although there might be control registers • Stack also contains addresses • Stack pointer (SP) points to the first free location on stack • In SaM, stack addresses start at 0 and go up • Int/float values take one stack location

. . …. 102 SP 101 5 100 4 . …... . …... . 0

1 Program area in SaM Program area Stack machine HALT

…….. SaM 2 ……. • Stack machine is sometimes called a 0- 1 PUSHIMM 8 address machine 0 PUSHIMM 7 PC – arithmetic operations take operands from top of • Program area: stack and push result(s) on stack – contains SaM code – one instruction per location • (PC):

…. – address of instruction to be executed ADD 102 102 SP SP – initialized to 0 when SaM is booted up 101 5 101 100 4 100 9 • HALT: …... …... – Initialized to false (0) when SaM is booted up – Set to true (1) by the STOP command – Program execution terminates when HALT is set to true (1)

Program Execution

Program area Loader HALT …….. SaM 2 ……. • How do commands get into the Program 1 PUSHIMM 8 area of SaM? 0 PUSHIMM 7 PC • Loader: a program that can open an input Command : file of SaM commands, and read them into PC = 0; the Program area. while (HALT == 0) //STOP command sets HALT to 1 Execute command Program[PC]; //ADD etc increment PC

2 Interpreter and Loader

0 1 2 …….. Labels HALT Program area ……..

PUSHIMM 2 PUSHIMM 3 PUSHIMM ADD • SaM assembly instructions in program file Interpreter: PC PC = 0; can be given labels while (HALT == 0) foo: PUSHIMM 1 Execute command Program[PC]; ……. Loader (command-file): File Loc = 0; JUMP foo Open command-file for input; while (! EOF) { • SaM loader resolves labels and replaces Read a command from command-file; Store command in Program[Loc]; jump targets with addresses Loc++; 2 PUSHIMM 3 PUSHIMM ADD …….. }

Other SaM areas

Some SaM commands

• FBR: Frame Base Register (see later) • Heap: for dynamic storage allocation (malloc and free) SaM uses a version of Doug Lea’s allocator

3 ALU commands Classification of SaM commands • ADD,SUB,… • Arithmetic/logic commands: • DUP: duplicate top of stack (TOS) • ISPOS: – ADD,SUB,.. – Pop stack; let popped value be Vt • Load/store commands: – If Vt is positive, push true (1);otherwise push false (0) – PUSHIMM,PUSHIND,STOREIND,… • ISNEG: same as above but tests for negative value • RegisterStack commands: on top of stack – PUSHFBR,POPFBR, LINK,PUSHSP,… • ISNIL: same as above but tests for zero value on top of stack • Control commands: • CMP: pop two values Vt and Vb from stack; – JUMP, JUMPC, JSR, JUMPIND,… – If (Vb < Vt) push 1 – If (Vb = Vt) push 0 – If (Vb > Vt) push -1

Example Pushing values on stack SaM code to compute (2 + 3) • PUSHIMM c PUSHIMM 2 – “push immediate”: value to be pushed is in the PUSHIMM 3 instruction itself ADD – will push c on the stack (eg) PUSHIMM 4 SaM code to compute (2 – 3) * (4 + 7)

PUSHIMM -7 PUSHIMM 2 PUSHIMM 3 SUB PUSHIMM 4  Compare with postfix notion (reverse Polish) PUSHIMM 7 ADD TIMES

4 Load/store commands • PUSHOFF n: push value contained in location Stack[FBR+n] • SaM ALU commands operate with values on top • v = Stack[FBR + n] of stack. • Push v on Stack • What if values we want to compute with are somewhere inside the stack? -9 • Need to copy these values to top of stack, and SP SP 33 33 store them back inside stack when we are done. • Specifying address of location: two ways FBR FBR – address specified in command as some offset from FBR -9 PUSHOFF -1 -9 (offset mode) – address on top of stack (indirect mode) Stack[FBR –1] contains -9

• STOREOFF n: Pop TOS and write value • PUSHIND: into location Stack[FBR+n] – TOS has an address • TOS has a value v – Pop that address, read contents of that address • Pop it and write v into Stack[FBR + n]. and push contents on stack

SP SP -9 333 52 SP SP . . . PUSHIND . STOREOFF 2 52 -9 52 -9 -9 333 FBR FBR TOS is 52 Contents of location 52 is -9

Store 333 into Stack[FBR+2]

5 •STOREIND: Using PUSHOFF/STOREOFF – TOS has a value v; below it is address s – Pop both and write v into Stack[s]. • Consider simple language SL – only one method called main – only assignment statements

333 main( ){ SP 52 SP . . int x,y; We need to assign stack locations for “x” and “y” STOREIND x = 5; . . and read/write from/to these locations to/from TOS 52 -9 52 333 y = (x + 6); return (x*y); } TOS is value 333. Value 333 is written Below it is address 52. into location 52 Contents of location 52 is -9

Stack frame SaM code (attempt 1) main: PUSHIMM 0 //allocate space for return value PUSHIMM 0//allocate space for x ADDSP 3 • Sequence of stack locations for PUSHIMM 0//allocate space for y holding local variables of //code for x = 5; procedure PUSHIMM 5 – “x” and “y” STOREOFF 1 • In addition, frame will have a location for return value y //code for y = (x+6); y • Code for procedure must leave x PUSHOFF 1 x PUSHIMM 6 return value in return value slot FBR return value FBR return value • Use offsets from FBR to ADD address “x” and “y” STOREOFF 2 • Where should FBR point to //compute (x*y) and store in rv – let’s make it point to “return PUSHOFF 1 value” slot frame for main PUSHOFF 2 frame for main – we’ll change this later TIMES STOREOFF 0 ADDSP -2 //pop x and y STOP

6 Problem with SaM code RegisterStack Commands • How do we know FBR is pointing to the base of the frame when we start execution? • Need commands to save FBR, set it to base • Commands for moving contents of SP, FBR of frame for execution, and restore FBR to stack, and vice versa. when method execution is done. • Used mainly in invoking/returning from • Where do we save FBR? methods – Save it in a special location in the frame • Convenient to custom-craft some commands to make method invocation/return easier to y implement. x FBR saved FBR return value

main:PUSHIMM 0//space for rv LINK//save and update FBR ADDSP 2//space for x and y FBR Stack commands //code for x = 5; PUSHIMM 5 – PUSHFBR: push contents of FBR on stack STOREOFF 1 • Stack[SP] = FBR; //code for y = (x+6); y • SP++; PUSHOFF 1 x – POPFBR: inverse of PUSHFBR PUSHIMM 6 • SP--; ADD FBR saved FBR • FBR = Stack[SP]; STOREOFF 2 return value – LINK : convenient for method invocation //compute (x+y) and store in rv • Similar to PUSHFBR but also updates FBR so it points to PUSHOFF 1 location where FBR was saved PUSHOFF 2 TIMES frame for main • Stack[SP] = FBR; SP 89 SP • FBR = SP; 88 88 89 34 STOREOFF –1 • SP++; -1 87 -1 ADDSP –2//pop locals 34 88 POPFBR//restore FBR FBR FBR STOP

7 SP  Stack commands Control Commands • So far, command execution is sequential – PUSHSP: push value of SP on stack – execute command in Program[0] • Stack[SP] = SP; – execute command in Program[1] • SP++ –….. – POPSP: inverse of POPSP • For implementing conditionals and loops, we need • SP--; the ability to • SP = Stack[SP]; – skip over some commands – ADDSP n: convenient for method invocation • SP = SP + n – execute some commands repeatedly • For example, ADDSP –5 will subtract 5 from SP. • In SaM, this is done using • ADDSP n can be implemented as follows: – JUMP: unconditional jump – PUSHSP – PUSHIMM n – JUMPC: conditional jump – ADD • JUMP/JUMPC: like GOTO in PASCAL – POPSP

•JUMP t://t is an integer Example – Jump to command at Program[t] and execute commands from there on. SP – Implementation: PC  t SP SP -5 SP 0 SP -1 SP •JUMPC t: -5 -5 -5 -5 -5 5 – same as JUMP except that JUMP is taken only if the topmost value on stack is true; otherwise, execution continues with command after this one. PC 0 PC 1 PC 2 PC 3 PC 4 PC 5 – note: in either case, stack is popped. – Implementation: Program to find absolute value of TOS: • pop top of stack (Vt); 0: DUP • if Vt is true, PC  t else PC++ 1: ISPOS If jump is not taken, sequence of PC values 2: JUMPC 5 is 0,1,2,5 3: PUSHIMM –1 4: TIMES 5: STOP

8 Symbolic Labels • It is tedious to figure out the numbers of commands that are jump targets (such as STOP in example). Using JUMPC for conditionals • SaM loader allows you to specify jump targets using a symbolic label such as DONE in example above. • Translating if e then B1 else B2 • When loading program, SaM figures out the addresses of all jump targets and replaces symbolic names with those addresses. code for e JUMPC newLabel1 DUP DUP code for B2 ISPOS ISPOS JUMP newLabel2 JUMPC 5 JUMPC DONE newLabel1: code for B1 PUSHIMM –1 PUSHIMM –1 newLabel2: TIMES TIMES …… STOP DONE: STOP

PC  Stack Commands Example • Obvious solution: something like – PUSHPC: save PC on stack // not a SaM command • Stack[SP] = PC; • SP++; ……. • Better solution for method call/return: JSR foo //suppose this command is in Program[32] – JSR xxx: save value of PC + 1 on stack and jump to xxx ADD • Stack[SP] = PC +1; ….. • SP++; foo: ADDSP 5 //suppose this command is in Program[98] • PC = xxx ……. – JUMPIND: like “POPPC” (use for return from method call) JUMPIND//suppose this command is in Program[200] • SP--; ….. • PC = Stack[SP];

– JSRIND: like JSR but address of method is on stack • temp = Stack[SP]; Sequence of PC values: ….,32,98,99,…,200,33,34,….., • Stack[SP] = PC + 1; assuming stack just before JUMPIND is executed is same •PC = temp; as it was just after JSR was executed

9 SaM stack frame for CS 375 Protocol for call/return

mth local saved PC • Caller: …………. saved FBR – creates return value slot first local p – evaluates parameters from first FBR saved PC b to last, leaving values on stack mth local saved FBR return value –LINK …………. nth parameter stack grows saved PC – JSR to callee first local ………….. saved FBR – POPFBR //executed on return FBR saved PC p one frame for power – pop parameters saved FBR first parameter b • Callee: nth parameter stack grows return value return value – create space for local variables ………….. – execute code of callee power(b,p){ • Return from callee: first parameter – Evaluate return value and write return value if (p = = 0) return 1; into rv slot else return b*power(b,p-1); – Pop off local variables } – JUMPIND //return to caller

Writing SaM code Recursive code generation

Construct Code • Start by drawing stack frames for each method in integer PUSHIMM xxx your code. x PUSHOFF yy //yy is offset for x • Write down the FBR offsets for each variable and (e1 + e2) code for e1 return value slot for that method. code for e2 • Translate Bali code into SaM code in a ADD compositional way. Think mechanically. x = e; code for e STOREOFF yy

{S1 S2 … Sn} code for S1 code for S2 …. code of Sn

10 Recursive code generation(contd) Recursive code generation(contd)

Construct Code Construct Code

code for e PUSHIMM 0//return value slot if e then B1 else B2 f(e1,e2,…en) JUMPC newLabel1 Code for e1 code for B2 … JUMP newLabel2 Code for en newLabel1: LINK//save FBR and update it code for B1 JSR f newLabel2: POPFBR//restore FBR ADDSP –n//pop parameters newLabel1: JUMP newLabel1 while e do B; code for e newLabel2: ISNIL code for B JUMPC newLabel2 newLabel1: code for B code for e JUMPC newLabel2 JUMP newLabel1 newLabel2: Better code

Recursive code generation(contd)

Construct Code OS code for SaM

ADDSP c // c is number of locals • On a real machine f(p1,p2,…,pn){ code for B int x,…,z;//locals – OS would transfer control fEnd: //OS code to set up call to main B} to main procedure STOREOFF r//r is offset of rv slot – control returns to OS when ADDSP –c//pop locals off main terminates PUSHIMM 0 //rv slot for main JUMPIND//return to callee • In SaM, it is convenient to LINK //save FBR begin execution with code JSR main //call main that sets up stack frame POPFBR return e; code for e for main and calls main STOP JUMP fEnd//go to end of method – this allows us to treat main like any other procedure

11 Symbol tables Example Let us write a program to compute absolute value of an integer. • When generating code for a procedure, it is Symbol Table Bali: saved PC convenient to have a map from variable names to FBR saved FBR Variable Offset frame offsets main() { return value rv -1 return abs(-5); • This is called a “symbol table” } saved PC • For now, we will have Symbol Table abs (n) { saved FBR – one symbol table per procedure Variable Offset if ((n > 0)) return n; FBR n n-1 – each table is a map from variable names to offsets else return (n*-1); return value rv -2 • Symbol tables will also contain information like } types from type declarations (see later)

main() { main: main: Complete code return abs(-5); ADDSP 0 // 0 is number of locals ADDSP 0 // 0 is number of locals } code for “return abs(-5)” code for “abs(-5)” //OS code to set up call to main mainEnd: JUMP mainEnd PUSHIMM 0 //rv slot for main abs: PUSHOFF –1//get n STOREOFF -1//-1 is offset of rv slot mainEnd: LINK //save FBR ISPOS //is it positive ADDSP -0 //pop locals off STOREOFF -1//-1 is offset of rv JSR main //call main JUMPC pos//if so, jump to pos JUMPIND//return to callee ADDSP -0 //pop locals off POPFBR (1) JUMPIND//return to callee STOP PUSHOFF –1//get n (2) PUSHIMM –1//push -1 main: TIMES//compute -n main: main: //set up call to abs JUMP absEnd//go to end ADDSP 0 // 0 is number of locals ADDSP 0 // 0 is number of locals PUSHIMM 0//return value slot for abs PUSHIMM 0 pos: PUSHOFF –1//get n PUSHIMM 0 PUSHIMM –5//parameter to abs code for “-5” JUMP absEnd PUSHIMM -5 LINK//save FBR and update FBR LINK LINK absEnd: JSR abs//call abs JSR abs JSR abs STOREOFF –2//store into r.v. POPFBR POPFBR POPFBR //restore FBR JUMPIND//return ADDSP -1 ADDSP -1 ADDSP –1//pop off parameter JUMP mainEnd JUMP mainEnd //from code for return mainEnd: mainEnd: JUMP mainEnd STOREOFF -1//-1 is offset of rv slot STOREOFF -1//-1 is offset of rv slot mainEnd: ADDSP -0 //pop locals off ADDSP -0 //pop locals off STOREOFF -1//store result of call JUMPIND//return to callee JUMPIND//return to callee JUMPIND (3) (4)

12 //OS code to set up call to main fact: PUSHIMM 0 //rv slot for main PUSHOFF -1 //get n LINK //save FBR PUSHIMM 0 Factorial JSR main //call main EQUAL POPFBR JUMPC zer PUSHOFF -1 //get n Symbol Table STOP Saved PC PUSHIMM 0 // fact(n-1) main() { Saved FBR Variable Offset FBR main: PUSHOFF -1 return fact(5); rv rv -1 //code for call to fact(10) PUSHIMM 1 } PUSHIMM 0 SUB PUSHIMM 10 LINK LINK JSR fact Symbol Table JSR fact POPFBR fact(n) { Saved PC Variable Offset POPFBR ADDSP -1 if ((n ==0) return 1; Saved FBR FBR n n-1 ADDSP -1 TIMES //n*fact(n-1) else return (n*fact(n-1)); rv //from code for return JUMP factEnd } rv -2 JUMP mainEnd zer: PUSHIMM 1 //from code for function def JUMP factEnd mainEnd: factEnd: STOREOFF -1 STOREOFF -2 JUMPIND JUMPIND

Running SaM code

• Download the SaM interpreter and run these examples. • Step through each command and see how the computations are done. • Write a method with some local variables, generate code by hand for it, and run it.

13