List of Lectures Lecture Content

List of lectures 1Introduction and Functional Programming 2Imperative Programming and Data Structures 3Parsing TDDA69 Data and Program Structure 4Evaluation Virtual Machines and Bytecode 5Object Oriented Programming 6Macros and decorators Cyrille Berger 7Virtual Machines and Bytecode 8Garbage Collection and Native Code 9Distributed Computing 10Declarative Programming 11Logic Programming 12Summary 2 / 46 How is a program interpreted? Lecture content Source code Parser Virtual Machines Types of virtual/hardware machines Parser Bytecode Abstract Syntax Tree Tree visitor Simple instruction set From AST to Bytecode Generator Source code ... Bytecode Interpreter Meta-circular evaluator Bytecode Virtual Machine Assembler Assembly Operating System CPU 3 / 46 4 / 46 What is a Virtual Machine? A Virtual Machine is a hardware or software emulation of a real or hypothetical computer system A system virtual machine emulates a Virtual Machines complete system and is intented to execute a complete operating system Examples: VirtualBox, VMWare, Parallels... A process/language virtual machine runs a single program in a single process Examples: JVM, CPython, V8, Dalvik... 6 Language Virtual Machine Native Environment vs Emulated Environment Operating System Source Code Input / Output Native Program Hardware Compiler Libraries Byte Code Virtual Machine Operating System Emulated Virtual Machine Virtual Machine Virtual Machine Input / Output Input / Output Emulated Emulated Program Hardware Interface Hardware Windows Linux Mac Bindings Libraries 7 8 Benefits of Virtual Machines Portability: Virtual Machines are compatible with various hardware platforms and Operating Systems Isolation: Virtual Machines are Types of virtual/hardware machines isolated from each other For running incompatible applications concurrently Encapsulation: computation is seperated from the operating system Beneficial for security 9 Types of virtual/hardware machines Register Machine: Formal definition Register Machine A (in)finite set of registers, which holds a Stack Machine single non-negative integer An instruction set, which defines the operation on registers Arithmetic, control, input/output... A state register, which holds the current instruction and its index Sequential list of labeled instructions which defines the program to be executed 11 12 Stack Machine: Formal definition Stack Machines vs Register Machines An (almost) infinite stack, which holds Stack Machines need more compact code integers Arithmetic instruction are smaller An instruction set, which defines the Stack Machines have simpler compiler, operation on the stack interpreters and a minimal processor state Arithmetic operations are always applied on the top two Stack Machines have a performance elements and the results is stored in the top disadvantages A state register, which holds the current More memory references, less caching of temporaries instruction and its index Higher cost of factoring out common subexpressions It has to be stored as a temporary variable Sequential list of labeled instructions Most common hardware are register machines which defines the program to be executed 13 14 Virtual Machines for Dynamic Typing Remember: With static typing, types are checked during compilation With dyamic typing, types are checked during execution Bytecode Implication Stack/Registers contains pointer to objects Function call convention 15 Bytecode For a virtual stack machine Instruction set Generate the bytecode Interpreting the bytecode Simple instruction set 17 Stack managment Arithmetic operators PUSH [constant_value] ADD, MUL... Push the constant on the stack Pop two arguments on the stack POP [number] Push the result Pop a certain numbers of variables from the stack 19 20 Variables Jumps LOAD [varname] JMP [idx] Push the value of variable Jump to execute instruction at the given index DCL [varname] IFJMP [idx] Declare the variable Pop the value and if true jump to [idx] STORE [varname] Get the value, store the result and push the value 21 22 Functions Objects CALL [arguments] LOAD_MEMBER [varname] Pop the function object and call it with the Pop the object and push the value of variable given number of arguments STORE_MEMBER [varname] RET Pop the object and value and store it and push Return the value To simplify the bytecode this can be implemented as function call 23 24 From AST to Bytecode With a tree visitor... It can take several pass: Find the variables Compute the jumps From AST to Bytecode Generate the code The real challenge is to map high level language to instructions 26 Arithmetic operation Functions 1+2 func(1) a-2 func(1,2) var a console.log('Hello world!') a=1 function func(a, b) { return a + b; var a = 1 } a+=2 a=b*c+2 a.b=c 27 28 Controls if(a) { console.log('test') } if(a) { console.log('hello') } else { console.log('world') } var a = 10; Bytecode Interpreter while(a) { a -= 1; } for(var a = 0; a < 10; a += 1) { console.log(a); } for(var a = 0; a < 10; a += 1) { console.log(a); if(a+b) { break; } } 29 Components of bytecode interpreter Verification Verifier Verifier checks correctness of Bytecode may come from buggy compiler or bytecode malicious source Every instruction must have a valid operation Dynamic checking of types, array bounds, code function arguments... Every instruction must have valid parameters Instruction executer Every branch instruction must branch to the start of some other instruction, not middle of instruction 31 32 Bytecode Interpreter Interpreter loop Standard virtual machine instruction_index = 0 instructions = [ ... ] interprets instructions stack = [ ] Perform run-time checks such as array bounds current_env = Environment() and types and function arguments while instruction_index < len(instructions): Possible to compile bytecode to native code next_instruction = instructions[instruction_index] (JIT: Just-In-Time) switch(next_instruction.opcode): Call native methods case ADD: .. Typically functions written in C case JMP: ... 33 34 Interpreter Example Function Call Print the absolute value of 4-1: Recursive call 1 PUSH 4 For a function call, instantiate a new 2 PUSH 1 interpreter loop 3 SUB Stack 4 DUP Push on the stack some information on how to 5 PUSH 0 restore the interpreter when returning 6 SUP More complex code, but higher performance 7 IFJMP 9 and flexibility 8 NEG Allow infinite recursion 9 LOAD 'print' 10 CALL 35 36 Function Call - Stack Handling Exceptions (1/2) instruction_index = 0 instructions = [ ... ] stack = [ ] Use the implementation language current_env = Environment() while instruction_index < len(instructions): next_instruction = instructions[instruction_index] exceptions switch(next_instruction.opcode): ... case CALL: env = Environment() Add exception information to the func = stack.pop() for arg in func.args: value = stack.pop() stack env.set(arg.name, value) stack.push( [ next_instruction, instructions, current_env ] ) next_instruction = 0 instructions = func.instructions current_env = env case RET: retval = stack.pop() info = stack.pop() stack.push(retval) next_instruction = info[0] instructions = info[1] current_env = info[2] 37 38 Handling Exceptions (2/2) instruction_index = 0 instructions = [ ... ] stack = [ ] current_env = Environment() while instruction_index < len(instructions): next_instruction = instructions[instruction_index] switch(next_instruction.opcode): ... case TRY_PUSH: Meta-circular evaluator stack.push(Exception(next_instruction.rescue_index, instructions)) case TRY_POP: stack.pop() case THROW: while True: info = stack.pop() if(info is Exception): instructions = info.instructions next_instruction = info.rescue_index 39 Meta-circular evaluator RPython An evaluator that is written in the RPython is a subset of Python same language that it evaluates is Variables contains value of at most one type Module globals are constants said to be metacircular (SICP 4.1). Restrictions on control flow, objects, Easy to extend the language exceptions... Make it easier to build sophisticated debuggers Meant to be easy to interpret More suited for building new languages 41 42 RPython and PyPy PyPy Virtual Machine Usually interpreters are written in a Source Code target platform language such as C Compiler PyPy VM But C is a complex and unsafe language, and interpreter tend to Byte Code get complicated PyPy uses RPython: PyPy VM PyPy VM PyPy VM to provide a set of tools for implementing interpreters for any language Windows Linux Mac a python implementation using those tools 43 44 Benefits of RPython over host language Conclusion Easier to develop Virtual Machines for interpreting If you know Python, you know RPython programs Easier to debug Stack machines are easier to Since you can run RPython program with a regular Python interpreter implement but slower than Factor the port to different register machines platform Virtual Machines introduce compilation overhead, but are faster to execute 45 46 / 46.

Load more