<<

• Implementing a language is a good way to learn more about programming languages • Interpreters are easier to implement than compilers, in general • Scheme is a simple language, but also a powerful one • Implementing it first in Scheme allows us to put off some of the more complex lower-level parts, like parsing and data structures • While focusing on higher-level aspects

• Simple syntax and semantics • “A meta-circular evaluator is a special • John McCarthy’s original Lisp had very case of a self- in which the little structure: existing facilities of the parent interpreter • Procedures CONS, CAR, CDR, EQ and are directly applied to the source code ATOM being interpreted, without any need for • Special forms QUOTE, COND, SET and additional implementation. Meta-circular LAMBDA evaluation is most common in the • Values T and NIL context of homoiconic languages”. • The rest of Lisp can be built on this • We’ll look at an adaptation from foundation (more or less) Abelson and Sussman,Structure and Interpretation of Computer Programs, MIT Press, 1996.

1 • is a property of some programming languages • We’ll not do all of Scheme, just enough • From homo meaning the same and icon for you to understand the approach meaning representation • We can us the same approach for an • A is homoiconic interpreter for Scheme in Python if its primary representation for programs • To provide reasonable efficiency, we’ll is also a data structure in a primitive use mutable-pairs type of the language itself • Few examples: Lisp, , Snobol

• Scheme calls a cons cell a pair GL% clisp [5]> l2 • Lisp always had special functions to ... (FOO B) change (aka destructively modify or [1]> (setq l1 '(a b)) [6]> (rplacd l1 '(2 3 4)) mutate) the components of a simple cons cell (A B) (FOO 2 3 4) • Can you detect a sentiment there? [2]> (setq l2 l1) [7]> l1 • RPLACA (RePLAce CAr) was Lisp’s function (A B) (FOO 2 3 4) to replace the car of a cons cell with a new [3]> (rplaca l2 'foo) [8]> l2 pointer (FOO B) (FOO 2 3 4) • RPLACD (RePLAce CDr) clobbered the cons cell’s cdr pointer [4]> l1 (FOO B)

2 > (define l1 '(a b c d)) > (set-cdr! l1 l1) • Scheme removed set-car! and set-cdr! from > l1 > l1 the language as of R6RS • They played to their ideological base here (a b c d) #0=(foo . #0#) • Or maybe just eating their own dog food > (set-car! l1 'foo) > (cadr l1) foo • R6RS is the Revised **6 Report on the > l1 Algorithmic Language Scheme > (caddr l1) R6RS does have a library, mutable-pairs, (foo b c d) foo • provides a new datatype for a mutable pair > (set-cdr! l1 '(2 3)) > (cadddr l1) and functions for it > l1 foo • mcons, mcar, mcdr, mlist, …set-mcar!, set- (foo 2 3) mcdr!

• Some languages are created/promoted by a • Scheme is standardized in the official company (e.g., Sun:Java, Microsoft:F#, IEEE standard and via a de facto Apple:Objective C) standard called the Revisedn Report on • But for a language to really be accepted, it the Algorithmic Language Scheme should be defined and maintained by the community • Or RnRS • And backed by a well-defined standard • Common versions: • That may be supported by a recognized • R5RS in 1998 standards organizations (e.g., IEEE, ANSI, • R6RS in 2007 W3C, etc)

3 > (define l1 (cons 1 (cons 2 empty))) > (mcar m1) > l1 1 • We’ll sketch out some rules to use in (1 2) > (set-car! l1 'foo) . . reference to undefined identifier: evaluating an s-expession > (define m1 (mcons 1 (mcons 2 set-car! empty))) > (set-mcar! l1 'foo) • Then realize them in Scheme > m1 . . set-mcar!: expects type as 1st argument, given: (1 2); • The (only) tricky part is representing an > (car l1) other arguments were: foo environment: binding symbols to values 1 > (set-mcar! m1 'foo) > m1 • Environments inherit from other > (car m1) {foo 2} . . car: expects argument of type environments, so we’ll consider an ; given {1 2} environment to be a set of frames • We’ll start with a global environment

• An environment is just a list of frames • An environment frame is just an • The first frame is the current (unordered) collection of bindings environment, the second is the one it • A binding has two elements: a inherits from, the third is the one the representing a variable and an object second inherits from, etc. representing its (current) value • The last frame is the global or top level • An environment might be represented as environment ( ( (x 100) (y 200) ) ( (a 1) (b 2) (x 2) ) ( (null ‘()) (empty ‘()) (cons …) …) )

4 Special forms are those that get evaluated • Self-Evaluating - Just return their value in a special, non-standard way • Numbers and strings are self • (quote X) – return X evaluating • (define X B) – bind X to evaluation of B • (lambda VARS BODY) - Make a procedure, • Symbol - Lookup closest binding in the write down VARS and BODY, do not evaluate current environment and return its • (set! X Y) – find X binding name, Y and set second element X to the return value • Raise an error if not found • (if X Y Z) – eval X and then eval either Y or Z

• Primitive: (F . ARGS) • We’re implementing an interpreter for • Apply by magic... Scheme in Scheme • User-defined: (F . ARGS) • The host language (Scheme) will do many details: data representation, reading, • Make a new environment frame printing, primitives (e.g., cons, car, +) • Extend to procedures frame • Our implementation will focus on a few • Bind arguments to formal parameters key parts: eval, apply, variables and • Evaluate procedure body in the new environments, user defined functions, etc. frame • Return its value

5 http://cs.umbc.edu/331/f11/code/scheme/mcs/ • define can only assign a variable to a value, i.e., 1st arg must be a symbol. • mcs.ss: simple Scheme subset Define functions like: • mcs_scope.ss: larger Scheme subset (define add1 (lambda (x) (+ x 1)) • mcs_basics.ss: 'library' of • lambda only allows one expression in functions body; for more use begin: (lambda (x) (begin (define y (* x x)) (* y y))) • readme.txt: short intro text • No set! to assign variables outside of the • session.txt: example of McScheme in local environment (e.g., global variables) use

Here’s a trivial read-eval-print loop: (define (mcscheme) ;; mcscheme read-eval-print loop (printf "mcscheme> ") (mcprint (mceval (read) global-env)) (mcscheme))

(define (mcprint x) ;; Top-level print: print x iff it's not void (or (void? x) (printf "~s~n" x))) The eval and apply operations have been fundamental from the start

6 (define (mceval exp env) (define (mcapply op args) (cond ((self-evaluating? exp) exp) (if (primitive? op) ((symbol? exp) (lookup exp env)) (do-magic op args) ((special-form? exp) (mceval (op-body op) (do-something-special exp env)) (extend-environment (else (mcapply (mceval (car exp) env) (op-formals op) (map (lambda (e) (mceval e env)) args (cdr exp))) ) ) ) (op-env op)))))

• In Scheme or Lisp, the representation of a • An environment is just a list of environment function has three parts: frames • A list of the names of its formal parameters • The last frame in the list is the global one • The expression(s) that make up the • The nth frame in the list extends the n+1th function’s body, i.e. the code to be evaluated • An environment frame records two things • The environment in which the function was • A list of variables bound in the environment defined, so values of non-local symbols can • The values they are bound to be looked up • Suppose we want to extend the global environ- • We might just represent a function as a list like ment with a new local one where x=1 and y=2 (procedure (x y) (+ (* 2 x) y) (… env …))

7 • Consider entering: • Consider entering:  (define foo 100)  (square foo)  (define square (lambda (x) (* x x))) • mcscheme evaluates square and foo in  (define x -100) the current environment and pushes a new • The environment after evaluating the first frame onto the environment in which x is three expressions would look like: bound to 100 ( ( (x . -100) ( (( x . 100 )) (square lambda (x)(* x x )) ((x . -100) (foo . 100) (square lsmbda (x)(* x x )) …) (foo . 100) ... ) ) )

(define (mceval exp env) (cond ((self-evaluating? exp) exp) ((symbol? exp) (lookup-variable-value exp env)) ((quoted? exp) (cadr exp)) ((assignment? exp) (eval-assignment exp env)) ((definition? exp) (eval-definition exp env)) ((if? exp) (eval-if exp env)) Take a look at the handout ((lambda? Exp) (make-procedure (cadr exp) (cddr exp) env)) ((begin? exp) (eval-sequence (cdr exp) env)) ((application? exp) (mcapply (mceval (car exp) env) (map (lambda (x)(mceval x env)) (cdr exp)))) (else (error "mceval: Unknown expression type" exp))))

8 (define (mcapply procedure arguments) (cond ((primitive-procedure? procedure) (define global-environment '()) (apply-primitive-procedure procedure arguments)) ((defined-procedure? procedure) (define (setup-environment) (eval-sequence (let ((initial-env (proc-body procedure) (extend-environment primitive-proc-names (extend-environment primitive-proc-objects (proc-parameters procedure) the-empty-environment))) arguments (define-variable! 'empty '() initial-env) (proc-environment procedure)))) initial-env)) (else (error "mceval: Unknown proc. type" procedure)))) (define the-global-environment (setup-environment))

(define (lookup-variable-value var env) (define (env-loop env)

(define (scan vars vals) (define (extend-environment vars vals base-env) (cond ((null? vars) (if (= (length vars) (length vals)) (env-loop (enclosing-environment env))) (cons (make-frame vars vals) base-env) ((eq? var (car vars)) (car vals)) (if (< (length vars) (length vals)) (else (scan (cdr vars) (cdr vals))))) (error "Too many arguments supplied" vars vals) (error "Too few arguments supplied" vars vals)))) (if (eq? env the-empty-environment) (error "Unbound variable" var) (let ((frame (car-frame env))) (define (make-frame variables values) (scan (frame-variables frame) (cons variables values)) (frame-values frame))))) (env-loop env))

9 (define (define-variable! var val env) (let ((frame (car-frame env))) (define (set-variable-value! var val env) (define (scan vars vals) (define (env-loop env) (cond ((null? vars) (define (scan vars vals) (add-binding-to-frame! var val frame)) (cond ((null? vars) (env-loop (enclosing-environment env))) ((eq? var (car vars)) (set-car! vals val)) ((eq? var (car vars)) (else (scan (cdr vars) (cdr vals))))) (set-car! vals val)) (if (eq? env the-empty-environment) (else (scan (cdr vars) (cdr vals))))) (error "Unbound variable -- SET!" var) (scan (frame-variables frame) (let ((frame (car-frame env))) (frame-values frame)))) (scan (frame-variables frame) (frame-values frame))))) (env-loop env))

10