<<

List of lectures

1Introduction and Functional Programming 2Imperative Programming and Data Structures 3Parsing TDDA69 Data and Program Structure 4Evaluation Virtual Machines and Bytecode 5Object Oriented Programming 6Macros and decorators Cyrille Berger 7Virtual Machines and Bytecode 8Garbage Collection and Native Code 9Distributed Computing 10Declarative Programming 11Logic Programming 12Summary

2 / 46

How is a program interpreted? Lecture content

Source code Parser Virtual Machines Types of virtual/hardware machines Parser Bytecode Tree visitor Simple instruction set From AST to Bytecode Generator ... Bytecode Meta-circular evaluator Bytecode

Assembler Assembly CPU

3 / 46 4 / 46 What is a Virtual Machine?

A Virtual Machine is a hardware or software emulation of a real or hypothetical computer system A system virtual machine emulates a Virtual Machines complete system and is intented to execute a complete operating system Examples: VirtualBox, VMWare, Parallels... A process/language virtual machine runs a single program in a single process Examples: JVM, CPython, V8, ...

6

Language Virtual Machine Native Environment vs Emulated Environment

Operating System Source Code Input / Output

Native Program Hardware Libraries Code Virtual Machine Operating System Emulated Virtual Machine Virtual Machine Virtual Machine Input / Output Input / Output

Emulated Emulated Program Hardware Interface Hardware Windows Linux Mac Bindings Libraries

7 8 Benefits of Virtual Machines

Portability: Virtual Machines are compatible with various hardware platforms and Operating Systems Isolation: Virtual Machines are Types of virtual/hardware machines isolated from each other For running incompatible applications concurrently Encapsulation: computation is seperated from the operating system Beneficial for security

9

Types of virtual/hardware machines Register Machine: Formal definition

Register Machine A (in)finite set of registers, which holds a single non-negative integer An instruction set, which defines the operation on registers Arithmetic, control, input/output... A state register, which holds the current instruction and its index Sequential list of labeled instructions which defines the program to be executed

11 12 Stack Machine: Formal definition Stack Machines vs Register Machines

An (almost) infinite stack, which holds Stack Machines need more compact code integers Arithmetic instruction are smaller An instruction set, which defines the Stack Machines have simpler compiler, operation on the stack interpreters and a minimal processor state Arithmetic operations are always applied on the top two Stack Machines have a performance elements and the results is stored in the top disadvantages A state register, which holds the current More memory references, less caching of temporaries instruction and its index Higher cost of factoring out common subexpressions It has to be stored as a temporary variable Sequential list of labeled instructions Most common hardware are register machines which defines the program to be executed

13 14

Virtual Machines for Dynamic Typing

Remember: With static typing, types are checked during compilation With dyamic typing, types are checked during Bytecode Implication Stack/Registers contains pointer to objects Function call convention

15 Bytecode

For a virtual stack machine Instruction set Generate the bytecode Interpreting the bytecode Simple instruction set

17

Stack managment Arithmetic operators

PUSH [constant_value] ADD, MUL... Push the constant on the stack Pop two arguments on the stack POP [number] Push the result Pop a certain numbers of variables from the stack

19 20 Variables Jumps

LOAD [varname] JMP [idx] Push the value of variable Jump to execute instruction at the given index DCL [varname] IFJMP [idx] Declare the variable Pop the value and if true jump to [idx] STORE [varname] Get the value, store the result and push the value

21 22

Functions Objects

CALL [arguments] LOAD_MEMBER [varname] Pop the function object and call it with the Pop the object and push the value of variable given number of arguments STORE_MEMBER [varname] RET Pop the object and value and store it and push Return the value To simplify the bytecode this can be implemented as function call

23 24 From AST to Bytecode

With a tree visitor... It can take several pass: Find the variables Compute the jumps From AST to Bytecode Generate the code The real challenge is to map high level language to instructions

26

Arithmetic operation Functions

1+2 func(1) a-2 func(1,2) var a console.log('Hello world!') a=1 function func(a, b) { return a + b; var a = 1 } a+=2 a=b*c+2 a.b=c

27 28 Controls

if(a) { console.log('test') } if(a) { console.log('hello') } else { console.log('world') } var a = 10; Bytecode Interpreter while(a) { a -= 1; } for(var a = 0; a < 10; a += 1) { console.log(a); } for(var a = 0; a < 10; a += 1) { console.log(a); if(a+b) { break; } }

29

Components of bytecode interpreter Verification

Verifier Verifier checks correctness of Bytecode may come from buggy compiler or bytecode malicious source Every instruction must have a valid operation Dynamic checking of types, array bounds, code function arguments... Every instruction must have valid parameters Instruction executer Every branch instruction must branch to the start of some other instruction, not middle of instruction

31 32 Bytecode Interpreter Interpreter loop

Standard virtual machine instruction_index = 0 instructions = [ ... ] interprets instructions stack = [ ] Perform run-time checks such as array bounds current_env = Environment() and types and function arguments while instruction_index < len(instructions): Possible to compile bytecode to native code next_instruction = instructions[instruction_index] (JIT: Just-In-Time) switch(next_instruction.): Call native methods case ADD: .. Typically functions written in C case JMP: ...

33 34

Interpreter Example Function Call

Print the absolute value of 4-1: Recursive call 1 PUSH 4 For a function call, instantiate a new 2 PUSH 1 interpreter loop 3 SUB Stack 4 DUP Push on the stack some information on how to 5 PUSH 0 restore the interpreter when returning 6 SUP More complex code, but higher performance 7 IFJMP 9 and flexibility 8 NEG Allow infinite recursion 9 LOAD 'print' 10 CALL

35 36 Function Call - Stack Handling Exceptions (1/2)

instruction_index = 0 instructions = [ ... ] stack = [ ] Use the implementation language current_env = Environment() while instruction_index < len(instructions): next_instruction = instructions[instruction_index] exceptions switch(next_instruction.opcode): ... case CALL: env = Environment() Add exception information to the func = stack.pop() for arg in func.args: value = stack.pop() stack env.set(arg.name, value) stack.push( [ next_instruction, instructions, current_env ] ) next_instruction = 0 instructions = func.instructions current_env = env

case RET: retval = stack.pop() info = stack.pop() stack.push(retval) next_instruction = info[0] instructions = info[1] current_env = info[2]

37 38

Handling Exceptions (2/2)

instruction_index = 0 instructions = [ ... ] stack = [ ] current_env = Environment() while instruction_index < len(instructions): next_instruction = instructions[instruction_index] switch(next_instruction.opcode): ... case TRY_PUSH: Meta-circular evaluator stack.push(Exception(next_instruction.rescue_index, instructions)) case TRY_POP: stack.pop() case THROW: while True: info = stack.pop() if(info is Exception): instructions = info.instructions next_instruction = info.rescue_index

39 Meta-circular evaluator RPython

An evaluator that is written in the RPython is a subset of Python same language that it evaluates is Variables contains value of at most one type Module globals are constants said to be metacircular (SICP 4.1). Restrictions on control flow, objects, Easy to extend the language exceptions... Make it easier to build sophisticated debuggers Meant to be easy to interpret More suited for building new languages

41 42

RPython and PyPy PyPy Virtual Machine

Usually interpreters are written in a Source Code target platform language such as C Compiler PyPy VM But C is a complex and unsafe language, and interpreter tend to Byte Code get complicated PyPy uses RPython: PyPy VM PyPy VM PyPy VM to provide a set of tools for implementing interpreters for any language Windows Linux Mac a python implementation using those tools

43 44 Benefits of RPython over host language Conclusion

Easier to develop Virtual Machines for interpreting If you know Python, you know RPython programs Easier to debug Stack machines are easier to Since you can run RPython program with a regular Python interpreter implement but slower than Factor the port to different register machines platform Virtual Machines introduce compilation overhead, but are faster to execute

45 46 / 46