Implementing Functions at the Machine Level

Subroutines/Functions

A subroutine is a program fragment that...
- Resides in user space (i.e., not in the OS)
- Performs a well-defined task
- Is invoked (called) by a user program
- Returns control to the calling program when finished

Virtues
- Reuse useful code without having to keep typing it in (and debugging it!)
- Divide a task among multiple programmers
- Use a vendor-supplied library of useful routines

Based on slides © McGraw-Hill Additional material © 2004/2005 Lewis/Martin Modified by Diana Palsetia

CIT 593 1 CIT 593 2

LC3 Opcode for CALLING a Subroutine

JSR/JSRR: saves the return address in R7, computes the starting address of the subroutine, and loads it into PC

JSR
- PC-relative mode (just like PC-relative LD/ST)
- Target address of the subroutine is incremented PC + offset

    15 14 13 12 | 11 | 10 9 8 7 6 5 4 3 2 1 0
     0  1  0  0 |  1 | PCoffset11

JSRR
- Base addressing mode
- Target address of the subroutine is obtained from Base Register
- Just like JMP (but PC is saved in R7)

    15 14 13 12 | 11 | 10 9 | 8 7 6 | 5 4 3 2 1 0
     0  1  0  0 |  0 |  0 0 | BaseR | 0 0 0 0 0 0

JSR

[Figure: datapath for JSR. IR = 0100101000000000 (JSR with PCoffset11 = 512) and PC = 0100000000011001. Note: this is the PC of the next instruction. The offset is sign-extended (SEXT) to 0000001000000000 and added to PC by the ALU; R7 receives 0100000000011001.]
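The JSR address computation (incremented PC plus sign-extended PCoffset11, with the incremented PC saved in R7) can be sketched in C. This is only an illustrative model, not part of the original slides; the names `sext11`, `exec_jsr`, and `JsrResult` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 11 bits (PCoffset11) of an LC-3 instruction word.
 * The result is kept as a 16-bit pattern so that adding it to the PC
 * wraps around modulo 2^16, as the hardware adder does. */
uint16_t sext11(uint16_t ir)
{
    uint16_t off = ir & 0x07FF;      /* bits 10..0 */
    if (off & 0x0400) {              /* bit 10 is the sign bit */
        off |= 0xF800;               /* copy the sign into bits 15..11 */
    }
    return off;
}

/* Hypothetical container for the two registers that JSR changes. */
typedef struct {
    uint16_t pc;   /* address of the first instruction of the subroutine */
    uint16_t r7;   /* return address */
} JsrResult;

/* Model of JSR: incremented_pc is the PC of the NEXT instruction. */
JsrResult exec_jsr(uint16_t incremented_pc, uint16_t ir)
{
    /* opcode must be 0100 and bit 11 must be 1 (JSR, not JSRR) */
    assert((ir >> 12) == 0x4 && (ir & 0x0800) != 0);

    JsrResult r;
    r.r7 = incremented_pc;                            /* save return address */
    r.pc = (uint16_t)(incremented_pc + sext11(ir));   /* PC + SEXT(offset)   */
    return r;
}
```

With the values from the JSR figure, `exec_jsr(0x4019, 0x4A00)` gives a new PC of x4219 and leaves x4019 in R7.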

Returning From a Subroutine

Use RET
- Just a special case of JMP, i.e., RET == JMP R7

Note
- If we use JMP to call a subroutine instead of JSR/JSRR, we can't use RET to return from the subroutine!
- Why not?

Example: 2's complement routine

                .ORIG x3000
    DoSomething1
                ...
                ; need to compute R4 = R1 - R3
    x3005       ADD R0, R3, #0    ; copy R3 to R0
    x3006       JSR TwosComp      ; negate
    x3007       ADD R4, R1, R0    ; add to R1
    x3008       BRz DoSomething2
                ...
    TwosComp    NOT R0, R0        ; R0 is the input to the routine
                ADD R0, R0, #1    ; add one
                RET               ; return to caller

    DoSomething2
                ...


JSRR

    15 14 13 12 | 11 | 10 9 | 8 7 6 | 5 4 3 2 1 0
     0  1  0  0 |  0 |  0 0 | BaseR | 0 0 0 0 0 0

Bit 11 is zero: this zero means "register mode".

[Figure: datapath for JSRR R5. IR = 0100000101000000. R5 = 0000010000011000 is loaded into PC. PC = 0100010000011001 before the jump (note: this is the PC of the next instruction), and R7 receives 0100010000011001.]

Virtues of JSRR?

Information regarding Subroutines

How to Pass Information To/From?
- In registers (simple, fast, but limited number)
  - E.g., R0 contains the input and the return value is also placed in R0
- In memory (many, but expensive)
  - This is the stack implementation
- Both

What should the User of the Function know?
- Address: or at least a label that will be bound to its address
  - In a high-level language we use the subroutine name
- Function: what it does
  - NOTE: The programmer does not need to know how the subroutine works, but what changes are visible in the machine's state after the routine has run
- Arguments: what they are and where they are placed
- Return values: what they are and where they are placed


Saving and Restoring Registers

Remember that any piece of code uses the same set of machine registers
- So we must maintain the state of the machine after the subroutine call

Called routine ⇒ "callee-save"
- Before start, save register(s) that will be altered (unless the altered value is desired by the calling program!)
- Before return, restore those same register(s)
- Values are saved by storing them in memory

Calling routine ⇒ "caller-save"
- If a register value is needed later, save the register(s) before calling the routine
  - This is much harder as you may need to know the internal workings of a function

By convention, callee-saved

Location for argument(s) and return value

Caller/Callee must agree on argument & return value location

Approach 1
- Every subroutine does what it likes
- Program needs to look at documentation for each one

Approach 2
- Define a consistent calling convention

E.g. LC-3 argument/ret-val location
- First 4 arguments passed in R0, R1, R2, R3
- Single value returned in a register other than R7
- Subsequent arguments passed in memory


Example of Callee Save for subroutines

    TwosComp:                   ; good to save R7 even if you don't use it
            ST  R7, SAVER7      ; save R7
            ST  R1, SAVER1      ; save R1
            NOT R0, R0          ; R0 is the input to the routine
            AND R1, R1, #0
            ADD R1, R1, #1
            ADD R0, R0, R1
            LD  R7, SAVER7      ; restore R7
            LD  R1, SAVER1      ; restore R1
            RET
    SAVER7: .FILL x0000
    SAVER1: .FILL x0000

Stack

Concept
A last-in first-out (LIFO) storage structure
- The first thing you put in is the last thing you take out
- The last thing you put in is the first thing you take out
- Not like an array, where you can access any item based on an index

Two main operations
PUSH: add an item to the stack
POP: remove an item from the stack


A Physical Stack

Coin holder

[Figure: a coin holder shown in four states: Initial State, After One Push, After Three More Pushes, After One Pop. The quarters are labeled by year (1995, 1996, 1998, 1982).]

Last quarter in is the first quarter out (LIFO)

A Software Stack implemented in Memory

- Assume section x3FFB to x3FFF in memory is used for the stack
- Data items don't move in memory, just our idea about where the TOP of the stack is (a register is used to keep track of the location)

    Address   Initial State   After One Push   After Three More Pushes   After Two Pops
    x3FFB     /////           /////            /////                     /////
    x3FFC     /////           /////            12  <- TOP                12
    x3FFD     /////           /////            5                         5
    x3FFE     /////           /////            31                        31  <- TOP
    x3FFF     /////           18  <- TOP       18                        18
    R6        x4000           x3FFF            x3FFC                     x3FFE

By convention, in LC3 R6 holds the Top of Stack (TOS) pointer


Basic Push and Pop Code

Push
        ADD R6, R6, #-1   ; decrement stack ptr
        STR R0, R6, #0    ; store data contained in R0 on the stack

Pop
        LDR R0, R6, #0    ; load data from TOS
        ADD R6, R6, #1    ; increment stack ptr

Note
- Stacks can grow in either direction (toward higher addresses or toward lower addresses)

What happens when we run out of space?

Overflow
- When we have no more locations and we try to push some data

Underflow
- The stack is empty and we try to pop the data

Somehow the programmer needs to learn about the overflow/underflow problem
- Java: Stack Overflow
- C: Segmentation fault
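The two-instruction push and pop sequences can be traced with a small C model. This is a sketch under stated assumptions: the stack region x3FFB to x3FFF is modeled by a five-word array, the field `r6` stands in for register R6, and the names `Lc3Stack`, `cell`, `stack_init`, `push`, and `pop` are hypothetical. No overflow or underflow handling is attempted here beyond a debugging assertion.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_LOW   0x3FFBu   /* lowest address reserved for the stack     */
#define STACK_EMPTY 0x4000u   /* R6 value when nothing has been pushed yet */

typedef struct {
    uint16_t mem[STACK_EMPTY - STACK_LOW];   /* models x3FFB..x3FFF */
    uint16_t r6;                             /* top-of-stack pointer */
} Lc3Stack;

/* Map a simulated address onto the backing array. */
uint16_t *cell(Lc3Stack *s, uint16_t addr)
{
    assert(addr >= STACK_LOW && addr < STACK_EMPTY);
    return &s->mem[addr - STACK_LOW];
}

void stack_init(Lc3Stack *s)
{
    s->r6 = STACK_EMPTY;
}

void push(Lc3Stack *s, uint16_t value)
{
    s->r6 -= 1;                  /* ADD R6, R6, #-1 */
    *cell(s, s->r6) = value;     /* STR R0, R6, #0  */
}

uint16_t pop(Lc3Stack *s)
{
    uint16_t value = *cell(s, s->r6);   /* LDR R0, R6, #0 */
    s->r6 += 1;                         /* ADD R6, R6, #1 */
    return value;
}
```

Pushing 18, then 31, 5, and 12 moves `r6` from x4000 to x3FFF and then to x3FFC; two pops return 12 and 5 and leave `r6` at x3FFE. The data words themselves never move, only the pointer does.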


POP routine w/ Underflow Detection

    POP:     LD  R1, EMPTY      ; R1 = -x4000
             ADD R2, R6, R1     ; R6 = Stack Pointer
             BRz Failure        ; check whether R6 = x4000
             LDR R0, R6, #0     ; POP value off the stack
             ADD R6, R6, #1     ; update the stack pointer
             RET
    Failure: AND R5, R5, #0
             ADD R5, R5, #1     ; R5 = 1 indicates that the POP was
                                ; not successful
             RET
    EMPTY:   .FILL xC000        ; -x4000 (stack limit)

General Implementation

Some processors use three registers to track:

Stack Pointer (SP): contains the addr of the top of the stack

Stack Base: contains the addr of the bottom location

Stack Limit: contains the addr of the other end of the stack (checks for PUSH error)

[Figure: e.g. a 3-entry stack. Limit marks the MAX end of the region, SP points at the top entry (8), and Base points at the BOTTOM entry (12).]
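The underflow test in the POP routine, together with the matching overflow test that a Stack Limit allows for PUSH, can be sketched in C as follows. This is an illustrative model only: the boolean return value plays the role of the success/failure flag that the routine keeps in R5, the region x3FFB to x3FFF is assumed from the earlier slide, and `BoundedStack`, `bounded_init`, `checked_push`, and `checked_pop` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_BASE  0x4000u   /* pointer value of an empty stack       */
#define STACK_LIMIT 0x3FFBu   /* lowest address the stack may occupy   */

typedef struct {
    uint16_t mem[STACK_BASE - STACK_LIMIT];
    uint16_t sp;
} BoundedStack;

void bounded_init(BoundedStack *s)
{
    s->sp = STACK_BASE;
}

/* Returns false (and changes nothing) if the stack is already full. */
bool checked_push(BoundedStack *s, uint16_t value)
{
    if (s->sp == STACK_LIMIT) {
        return false;                        /* overflow */
    }
    s->sp -= 1;
    s->mem[s->sp - STACK_LIMIT] = value;
    return true;
}

/* Returns false (and changes nothing) if the stack is empty.
 * The comparison mirrors ADD R2, R6, R1 / BRz Failure in the routine. */
bool checked_pop(BoundedStack *s, uint16_t *out)
{
    assert(out != NULL);
    if (s->sp == STACK_BASE) {
        return false;                        /* underflow */
    }
    *out = s->mem[s->sp - STACK_LIMIT];
    s->sp += 1;
    return true;
}
```

A caller checks the returned flag before trusting the popped value, in the same way that the assembly caller would inspect R5 after the routine returns.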


Uses of Stack

Passing arguments & return values for subroutine calls
- Instead of passing them through registers
- To perform recursive subroutines
- Allocating space for local variables inside a subroutine

Use as Temporary storage
- Computers without registers store temporary values during computation on a stack

Interrupt I/O
- More when we do I/O

Use: Passing arguments & return value in a subroutine

Use the stack for arguments & return values instead of registers
- Calling routine pushes arguments on the stack and calls the subroutine
- Called routine pops the passed arguments from the stack and pushes any return value onto the stack
- Finally, the calling routine then retrieves the return value(s) from the stack and continues execution

Why would this be helpful?
- This is especially helpful if a subroutine has many arguments
- Also, a stack is needed for recursion
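The push-arguments, call, pop-return-value protocol can be sketched in C with an explicit stack. This is a minimal illustration, not the course's calling convention: the type `ArgStack` and the functions `arg_push`, `arg_pop`, `max_on_stack`, and `call_max` are hypothetical, and the order in which the two arguments are pushed is an assumption.

```c
#include <assert.h>

#define STACK_WORDS 16

typedef struct {
    int mem[STACK_WORDS];
    int sp;               /* index of the top item; STACK_WORDS means empty */
} ArgStack;

void arg_push(ArgStack *s, int v)
{
    assert(s->sp > 0);            /* no room left would be an overflow */
    s->mem[--s->sp] = v;
}

int arg_pop(ArgStack *s)
{
    assert(s->sp < STACK_WORDS);  /* popping an empty stack is an underflow */
    return s->mem[s->sp++];
}

/* Callee: pops its two arguments and pushes the larger one as its result. */
void max_on_stack(ArgStack *s)
{
    int a = arg_pop(s);           /* pushed last, so popped first */
    int b = arg_pop(s);
    arg_push(s, (b > a) ? b : a);
}

/* Caller: pushes the arguments, calls, then retrieves the return value. */
int call_max(int a, int b)
{
    ArgStack s = { .sp = STACK_WORDS };
    arg_push(&s, b);
    arg_push(&s, a);
    max_on_stack(&s);
    return arg_pop(&s);
}
```

Because every call pushes its own arguments, nothing limits the number of arguments to the number of registers, and nested or recursive calls do not overwrite each other's values.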


Function Call scenario (Recursion Problem)

    Main     ...
             JSR Foo
    Next     ...
             HALT
    Foo      ST  R7, SaveR7
             AND R0, R0, #0
             ...
             JSR Foo
    After    ...
             LD  R7, SaveR7
             RET
    SaveR7   .FILL #0
    Counter  .FILL #0
             .END

First call to Foo
- SaveR7 contains address of Next

Second call to Foo
- SaveR7 contains address of After

First return from Foo
- Returns to After

Second return from Foo
- Returns to After again!!!

Solution to Recursion problem

For each subroutine call, we need a mechanism to distinguish its invocation
- This is known as an activation record

Activation Record contains
- Invocation-specific data (e.g., local variables, saved registers, arguments and returns)
- Need to store this information per function call and discard it when the function call is over
- Stack data structure fits this description well
  - When a function is called we store (push) the record on the stack
  - When the function call is over we discard (pop) the record from the stack
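The difference between one shared SaveR7 location and a stack of return addresses can be traced in C. This is only a sketch of the scenario above: the addresses assigned to `ADDR_NEXT` and `ADDR_AFTER` are made-up values, and both function names are hypothetical. Each function returns the address that the outer (first) invocation of Foo ends up returning to.

```c
#include <assert.h>

/* Hypothetical addresses of the labels Next and After. */
enum { ADDR_NEXT = 0x3002, ADDR_AFTER = 0x3010 };

/* One shared save location: the nested call overwrites it. */
int outer_return_with_single_slot(void)
{
    int save_r7;

    save_r7 = ADDR_NEXT;            /* first call:  ST R7, SaveR7 (R7 = Next)  */
    save_r7 = ADDR_AFTER;           /* second call: ST R7, SaveR7 (R7 = After) */

    int inner_return = save_r7;     /* inner LD R7, SaveR7 / RET */
    int outer_return = save_r7;     /* outer LD R7, SaveR7 / RET */

    assert(inner_return == ADDR_AFTER);
    return outer_return;            /* After again: Next has been lost */
}

/* One saved return address per invocation, kept on a stack. */
int outer_return_with_stack(void)
{
    int stack[2];
    int top = 0;

    stack[top++] = ADDR_NEXT;       /* first call pushes its return address  */
    stack[top++] = ADDR_AFTER;      /* second call pushes its return address */

    int inner_return = stack[--top];   /* inner invocation pops After */
    int outer_return = stack[--top];   /* outer invocation pops Next  */

    assert(inner_return == ADDR_AFTER);
    return outer_return;
}
```

The single-slot version sends the outer invocation back to After, while the stack version sends it back to Next, which is the behavior the caller in Main expects.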


Big Picture

[Figure: three snapshots of memory. Before call: the stack holds main's record. During call: func's record sits on top of main's record, and R5 points into func's record. After call: only main's record remains.]

Complete view of an activation record

Activation Record of a called function:
1. input parameters passed & return value
2. book keeping information
   - E.g. Caller's return address
   - Frame Pointer
3. local variables

Stack Frame/Record (top to bottom):
    Local var
    Book keeping
    Args + return value


Frame Pointer

Each function gets a record (a.k.a. frame), but we need to somehow
- Delineate one function's record from another, especially if we are in a recursive call
- Track within an activation record the local variables, book keeping info, and arguments

This tracking is done by what is called the frame pointer
- Frame pointer is the address which points to the top of the record/frame
- Once we know the frame pointer, all information can be accessed via frame pointer + offset

Note: By convention in LC3, R5 holds the frame pointer

Activation Record

Arguments
- The calling routine places the arguments on top of the stack before calling a function
- The callee will use the arguments from the stack to do its work

Book keeping
1. Caller's Frame Pointer
- Need to save the caller's frame pointer so that when control returns to the caller, it will be able to access its own local variables, arguments, etc.
  - If we destroy this value then we have trouble restarting the caller correctly when the callee finishes


Activation Record contd..

Book Keeping Record

2. Return address
- Save pointer to the next instruction in the calling function
- Convenient location to store R7
  - Especially helpful during recursive calls

3. Saved Registers
Save all registers that will be used for temporary work in the function

4. Returns
Every function always allocates a space for the return value on the activation record, whether or not it returns anything
- Even if the function return is void

Activation Record contd..

Local Variables
- Local variables are added after arguments and book keeping information
- They are added in the order they are declared
  - Therefore the variable declared last is always on the top of the stack (LIFO)

Compiler tracks each variable
- Just like the assembler, which maintains a symbol table for labels
- Compiler stores for every variable
  1. Name (identifier)
  2. Type
  3. Location in memory
  4. Scope

Symbol table columns: Name | Type | Offset

Compiler only stores the offset and uses R5 as the base address


Example local variables of main()

    int main()
    {
        double amt = 0;
        int p = 0;
        int t = 0;
        int r = 0;
        ...
        amt = p * t * r;
    }

LC3 initialization code:

        AND R0, R0, #0    ; R0 = 0
        STR R0, R5, #3    ; amt = 0
        STR R0, R5, #2    ; p = 0
        STR R0, R5, #1    ; t = 0
        STR R0, R5, #0    ; r = 0

Stack layout (top to bottom):

    R5 -> r
          t
          p
          amt

Compiler's Symbol Table

    Name   Type     Offset   Scope
    amt    double   3        main
    p      int      2        main
    t      int      1        main
    r      int      0        main

We know that R5 contains the frame pointer of function main(). Now variables in main can be accessed by: Frame Pointer + Offset

Example local variables of main() (contd..)

To calculate amt, we can now easily access local variables p, t, and r using the frame pointer R5
- Assigning amt = p * t * r;

        LDR R1, R5, #2    ; R1 = p
        LDR R2, R5, #1    ; R2 = t
        LDR R3, R5, #0    ; R3 = r
        MUL R1, R1, R2    ; R1 = p * t
        MUL R1, R1, R3    ; R1 = p * t * r
        STR R1, R5, #3    ; amt = R1

Note: Reserved Opcode is used to create the MUL instruction

Note: (explanation deviates from book section 12.5)
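The "Frame Pointer + Offset" access pattern can be mimicked in C by treating the frame pointer as a pointer into an array of words. This is a simplified sketch: `amt` is modeled as a single `int` word even though the source declares it `double`, the offsets are the ones listed in the symbol table, and the names `OFF_R`, `OFF_T`, `OFF_P`, `OFF_AMT`, and `compute_amt` are hypothetical.

```c
#include <assert.h>

/* Offsets from the frame pointer, as listed in the symbol table. */
enum { OFF_R = 0, OFF_T = 1, OFF_P = 2, OFF_AMT = 3 };

/* Computes amt = p * t * r using only frame-pointer-relative accesses.
 * fp plays the role of R5; r1..r3 play the roles of R1..R3. */
int compute_amt(int *fp)
{
    assert(fp != 0);

    int r1 = fp[OFF_P];      /* LDR R1, R5, #2 */
    int r2 = fp[OFF_T];      /* LDR R2, R5, #1 */
    int r3 = fp[OFF_R];      /* LDR R3, R5, #0 */

    r1 = r1 * r2;            /* MUL R1, R1, R2 */
    r1 = r1 * r3;            /* MUL R1, R1, R3 */

    fp[OFF_AMT] = r1;        /* STR R1, R5, #3 */
    return r1;
}
```

The function never needs to know where the frame sits in memory; moving `fp` to a different region relocates all four variables at once, which is what makes one compiled body work for every invocation.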

Example of a Complete Activation Record

    int func(int a, int b)
    {
        int w, x, y;
        .
        .
        .
        return y;
    }

Stack layout (top to bottom):

    R5 -> y                        locals
          x
          w
          Caller's Frame Pointer   bookkeeping
          return addr. (R7)
          return value
          a                        args
          b

    Name           Type   Offset   Scope
    b              int    7        func
    a              int    6        func
    "ret. value"   int    5        func
    w              int    2        func
    x              int    1        func
    y              int    0        func

a & b were placed on the stack by the caller

Function Call Example

    int main() {
        int x, y, val;
        x = 10;
        y = 11;
        val = max(x + 10, y);
        printf("%d", val);
        return 0;
    }

    int max(int a, int b)
    {
        int result = 0;
        if (b > a) {
            result = b;
        }
        else {
            result = a;
        }
        return result;
    }

Stack layout during the call to max (top to bottom):

    max's view:    "result"
                   Caller's FP (R5)
                   save R0
                   save R1
                   save R7
                   max's return value
                   "a"
                   "b"
    main's view:   "val"
                   "y"
                   "x"
                   ...
                   main's return value


Other Use: Non-Register Machines

LC3 is a three-address machine because it specifies all 3 address locations (i.e. 1 Dest and 2 Src). E.g. ADD R1, R2, R1

Some ISAs are zero-address machines (stack machines)
- They use the stack for Dst and Src operands and the instruction does not explicitly specify them
- In an all-memory machine, some location in memory holds the address of the top of the stack
  - The address stored is called the Stack Pointer (SP)

Example: Zero Address ISA

PUSH Instruction (format: PUSH | address)
- Adjusts the stack pointer (internally done), and pushes the value onto the stack
- Programmer can only read the stack pointer (just like in LC3 the programmer can read the condition code registers)

POP
- Is similar to PUSH, except that the value is removed from the top of the stack and put back into a memory location other than stack space (the location is specified by the address field)

ADD Instruction (format: ADD | 0...0)
- Takes the two values out of the stack (2 POP instructions)
- Then gives them to the ALU to compute the result
- The result is PUSHed back on the stack


Stack Machine: Multiply Add Example

Want to compute E = (A + B) * (C + D). Let's say A = 25, B = 17, C = 3 and D = 2

Stack Implementation

    Push A    (take data from memory and push it onto the stack section)
    Push B
    Add       (2 POPs, so that operands can be given to the ALU, and then the result is PUSHed back on the stack)
    Push C
    Push D
    Add
    Mult
    Pop E     (pop it off the stack and store it somewhere in memory)

Register Implementation

    LD  R0, A
    LD  R1, B
    ADD R0, R0, R1
    LD  R2, C
    LD  R3, D
    ADD R2, R2, R3
    MUL R0, R0, R2
    ST  R0, E

Does everyone see the usefulness of a register file?
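The stack implementation of E = (A + B) * (C + D) can be sketched as a tiny zero-address evaluator in C. This is an illustrative model only; `EvalStack`, `op_push`, `op_pop`, `op_add`, `op_mult`, and `eval_multiply_add` are hypothetical names, and the fixed depth of 8 is an arbitrary assumption.

```c
#include <assert.h>

#define DEPTH 8

typedef struct {
    int data[DEPTH];
    int count;            /* number of items currently on the stack */
} EvalStack;

void op_push(EvalStack *s, int v)
{
    assert(s->count < DEPTH);
    s->data[s->count++] = v;
}

int op_pop(EvalStack *s)
{
    assert(s->count > 0);
    return s->data[--s->count];
}

/* Zero-address ADD: two pops feed the ALU, the result is pushed back. */
void op_add(EvalStack *s)
{
    int b = op_pop(s);
    int a = op_pop(s);
    op_push(s, a + b);
}

/* Zero-address MULT: same pattern as ADD. */
void op_mult(EvalStack *s)
{
    int b = op_pop(s);
    int a = op_pop(s);
    op_push(s, a * b);
}

/* Runs the instruction sequence Push A, Push B, Add, Push C, Push D,
 * Add, Mult, Pop E and returns the value stored to E. */
int eval_multiply_add(int a, int b, int c, int d)
{
    EvalStack s = { .count = 0 };
    op_push(&s, a);
    op_push(&s, b);
    op_add(&s);
    op_push(&s, c);
    op_push(&s, d);
    op_add(&s);
    op_mult(&s);
    return op_pop(&s);
}
```

With A = 25, B = 17, C = 3, and D = 2 the sequence leaves 42 and then 5 on the stack before the multiply, so `eval_multiply_add(25, 17, 3, 2)` returns 210. Every operand passes through memory-resident stack slots, which is the cost that a register file avoids.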
