Introduction to Intermediate Code, Virtual Machine Implementation

Introduction to Intermediate Code, Virtual Machine Implementation

Where do we go from here? testing AST Type Introduction to Intermediate Code, virtual generation checking machine implementation Cleanup? Interpreter Lecture 19 in intermediate Virtual code/ assmblr machine out Prof. Fateman CS 164 Lecture 19 1 Prof. Fateman CS 164 Lecture 19 2 Details of MJ what the VM must support What is Intermediate Code? No single Details of the VM what IC to generate answer… • Any encoding further from the source and closer to the machine. Possibilities: – Same except remove line/column numbers! – Change all parameter or local NAMES to stack offset counts in intermediate Virtual – Change all global references (vars, methods) to vtable counts – Possible spew out Lisp as an IC. [My favorite: map every code/ assmblr machine language you encounter into Common Lisp. Macro defs make it out possible to define an intermediate level language “especially suited to IC” in Lisp. ] • Not (usually) machine code itself • Usually some extra “abstraction” – Imagine you have arbitrary numbers of registers for calculations. – Imagine you have “macro” instructions like x=a[i,j] Prof. Fateman CS 164 Lecture 19 3 Prof. Fateman CS 164 Lecture 19 4 What is Intermediate Code? Advantages Forms of Intermediate Code Virtual stack machine • Generally relatively portable between machines simplified “macro” machine • One IC may support several languages (a typical set might be C, C++, Pascal, Fortran) where the resources 3-address code needed are similar. (If you can support C, the rest of A :=B op C them are pretty easy.) Register machine models • Languages with widely-differing semantics will not fit in this restricted set and may require an extended IC. with unlimited number of registers, mapped to real • E.g. Java IC supporting Scheme is hard because registers later Scheme has 1st-class functions. Scheme IC supporting Tree form (= lisp program, more or less) Java is plausible structurally but might not be as efficient. Prof. Fateman CS 164 Lecture 19 5 Prof. Fateman CS 164 Lecture 19 6 Reminder of what we are doing.. From a Using Intermediate Code tree representation like our AST.. • We need to make some progress towards the • Instead of typechecking or interpreting the machine we have in mind code, we are traversing it and putting – Subject to manipulation, “optimization” together a list of instructions… generating IC – Perhaps several passes … that if run on a machine -- would execute – Or, we can generate assembler the program. – Or, we could plop down in absolute memory • In other words instead of interpreting or locations the binary instructions needed to execute typechecking a loop,or going through a loop the program “compile-and-go” system. This was a executing it, we write it out in assembler. common “student compiler” technique for Pascal and Fortran. Prof. Fateman CS 164 Lecture 19 7 Prof. Fateman CS 164 Lecture 19 8 The simplest example Just a simple stack machine (oversimplified) Compiling the MJ program segment … 3+4… Compiling (+ 3 4) The string "3+4" [too simple to have line/column numbers] (some-kind-of-compiler '(+ 3 4)) ;; I made-up name ☺ parses to (Plus (IntegerLiteral 3) (IntegerLiteral 4)) ((pushi 3) (pushi 4) (+)) typechecker approves and says it is of type int. Result is a list of assembly language instructions push immediate constant 3 on stack A program translating to lisp produces push immediate constant 4 on stack + = take 2 args off stack and push sum on stack (+ 3 4) Conventions: result is left on top of stack. Which could be executed… Location of these instructions unspecified. But we don’t really want Lisp, we want machine code. We could start Lengthy notes in simple-machine file describe virtual machine. from the Lisp or from the AST, i.e. …. (Plus …)… Prof. Fateman CS 164 Lecture 19 9 Prof. Fateman CS 164 Lecture 19 10 It doesn’t have to look like lisp if we write a Consider the file Simple-compiler.fasl printing program (pprint-code compiled-vector) ;; prints out… Load this file and you have functions for compiling pushi 3 pushi 4 methods, exps, statements. You can trace them. + mj-c-method compiles one method mj-c-exp compiles one expression mj-c-statement compiles one statement Each program calls “emit” to add to the generated program some sequence of instructions. Typically these are consed on to the front of a list intended to become the “body” of a method. This list which is then reversed, embroidered with other instructions, and is ready to assemble. Prof. Fateman CS 164 Lecture 19 11 Prof. Fateman CS 164 Lecture 19 12 Some FAQ. 1. Redundant work? FAQ 2. Where do the programs live? • Q: Going back to the AST for compiling, it • You might ask this question; how are the methods placed in memory? seems to me that I am re-doing things I • For now we do not have to say, but we could let a already did (or did 95%) for type-checking. “loader” determine where to put each code segment in memory. Each method can refer to instructions in its • A. You are right. Next time you write a body by a relative address; a call sets the program compiler (hehe) you will remember and maybe counter (PC) to the top of the method’s code. • In a “real” program, a loader would resolve the you will save the information some way. E.g. references to methods /classes/ etc defined save the environment / inheritance hierarchy, elsewhere. MJ has no such problems. (discuss why?) offsets for variables, type data used to • Or we could let all this live in some world where there is a symbol table (like Lisp) and lets us just grab the determine assembler instructions, etc. definition when we want to get it. (Dynamic loading, too) Prof. Fateman CS 164 Lecture 19 13 Prof. Fateman CS 164 Lecture 19 14 FAQ 3: Too many layers? The 2-pass assembler… (50 lines of code?) What does the assembler program do with a list of • Q. There are too many pieces up in the air. Where do I start? symbolic instructions – the reversed list mentioned previously? • A. Yes, we are faced with a multi-level target for understanding. 1. Turn a list of instructions in a function into a vector. • That’s why we took it in steps up to here. Two more levels, ;; count up the instructions and keep track of the labels closely tied together. ;; make a note of where the labels are (program counter • We produce code for the Assembler ;; relative to start of function). – Understand the assembler input / output ;; Create a vector of instructions. – Requires understanding VM ;; The transformed "assembled" program can support fast • The VM determines what instructions make sense to generate to ;; jumps forward and back. accomplish tasks (esp. call/return, get/set data (multiple-value-bind (length labels) • The programming language definition determines what we need. ;extract 2 items from 1st pass E.g. Where does “this” object come from? (asm-first-pass (fn-code fn)) – Look at output of translator (setf (fn-code fn) ;; put vector back instead of list – You see what to generate (or equivalent…) (asm-second-pass (fn-code fn) length labels)) Prof. Fateman CS 164 Lecture 19 15 Prof. Fateman CS 164 Lecture 19 16 fn)) What does the assembler do? The 1st pass counts up locations • Transforms a list of symbolic instructions and (defun asm-first-pass (code) labels into a vector. "Return the label assoc list" – Turn a list of instructions in a function into a vector. (let ((length 0) – When there is a jump <label>, replace with a jump (labels nil)) <integer> where the <integer> is the location of that (dolist (inst code) <label> (if (label-p inst) • First pass merely counts up the instructions and (push (cons inst length) labels) keeps track of the labels. Produces a lookup-table (incf length))) (e.g. assoc. list) for label integer mapping. labels)) • Second pass creates the vector with substitutions ;; could return (values length labels) ☺ Prof. Fateman CS 164 Lecture 19 17 Prof. Fateman CS 164 Lecture 19 18 The 2nd pass resolves labels, makes vector (defun asm-second-pass (code labels) "Put code into code-vector, adjusting for labels." (coerce The MJ Virtual Machine (map 'list #'(lambda (inst) (if (member (car inst) '(jump jumpn jumpz pusha call)) `(,(car inst) The easy parts ,(cdr (assoc (second inst) labels)) ,@(cddr inst)) inst)) (remove-if #'label-p code)) 'vector)) Prof. Fateman CS 164 Lecture 19 19 Prof. Fateman CS 164 Lecture 19 20 It is a stack-based architecture simulated It is a stack-based architecture simulated by a Lisp by a Lisp program. program • Some differences from CS61c MIPS/SPIM architecture How does this differ from the MIPS architecture in CS61c? 1. No registers for values of variables – Registers not used for passing arguments 2. (there are some implicit registers fp, sp, pc) – No “caller saved” or “callee saved” registers – Only implicit registers (e.g. program counter, frame pointer, stack pointer) – Debugging in the VM – I/O much different – No interrupts – No jump delay slot – No floating point co-processor – Undoubtedly more differences Prof. Fateman CS 164 Lecture 19 21 Prof. Fateman CS 164 Lecture 19 22 We set up an update-program-counter/ The outer exception handling controller execute loop (defun run-vm (vm) (defun run-vm (vm) ;; set up small utilities ;; set up small utilities (loop (handler-case (vm-fetch vm) ; Fetch instruction / update PC (case (car (vm-state-inst vm)) (loop ;; Variable/stack manipulation instructions ;;process stuff (lvar (vpush (frame (a1)))) ) (lset (set-frame (a1) (vpop))) (error (pe) ;;; etc etc (format t "~%Caught error: ~a" pe) (pushi (vpush (a1))) (if (vm-state-crash-hook vm) (pusha (vpush (a1))) (funcall (vm-state-crash-hook vm) vm) ;; Branching instructions: (vm-state-print vm))))) (jump (set-pc (a1))) ;;; etc ;; Function call/return instructions: ;; ….

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    7 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us