Reading Assignment Lazy Evaluation
Reading Assignment

• MULTILISP: a language for concurrent symbolic computation, by Robert H. Halstead (linked from class web page)

© CS 538 Spring 2008

Lazy Evaluation

Lazy evaluation is sometimes called "call by need." We do an evaluation when a value is used, not when it is defined. Scheme provides for lazy evaluation:

(delay expression)

Evaluation of expression is delayed. The call returns a "promise" that is essentially a lambda expression.

(force promise)

A promise, created by a call to delay, is evaluated. If the promise has already been evaluated, the value computed by the first call to force is reused.

Example: An argument to a function is strict if it is always used. Non-strict arguments may cause failure if evaluated unnecessarily. Though and is predefined, writing a correct implementation for it is a bit tricky. The obvious program

(define (and A B)
  (if A
      B
      #f))

is incorrect, since B is always evaluated whether it is needed or not. In a call like

(and (not (= i 0)) (> (/ j i) 10))

unnecessary evaluation might be fatal.

With lazy evaluation, we can define a more robust and function:

(define (and A B)
  (if A
      (force B)
      #f))

This is called as:

(and (not (= i 0)) (delay (> (/ j i) 10)))

Note that making the programmer remember to add a call to delay is unappealing.

Delayed evaluation also allows us a neat implementation of suspensions. The following definition of an infinite list of integers clearly fails:

(define (inflist i)
  (cons i (inflist (+ i 1))))

But with the use of delay we get the desired effect in finite time:

(define (inflist i)
  (cons i
        (delay (inflist (+ i 1)))))

We need to slightly modify how we explore suspended infinite lists. We can't redefine car and cdr, as these are far too fundamental to tamper with. Instead we'll define head and tail to do much the same job:

(define head car)
(define (tail L)
  (force (cdr L)))

head looks at car values, which are fully evaluated. tail forces one level of evaluation of a delayed cdr and saves the evaluated value in place of the suspension (promise).
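To make this concrete, here is a small sketch that exercises the definitions above; take-n is a hypothetical helper (not part of the notes) that forces just the first n elements of a suspended list.

```scheme
; Assumes the inflist/head/tail definitions from the notes.
(define (inflist i)
  (cons i (delay (inflist (+ i 1)))))
(define head car)
(define (tail L) (force (cdr L)))

; take-n (hypothetical helper): force only the first n elements.
(define (take-n n L)
  (if (= n 0)
      '()
      (cons (head L) (take-n (- n 1) (tail L)))))

(display (take-n 5 (inflist 1)))   ; displays (1 2 3 4 5)
```

Only five promises are ever forced here; the rest of the "infinite" list remains an unevaluated suspension.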
Now a call like (inflist 1) creates

[ 1 | promise for (inflist 2) ]

Given

(define IL (inflist 1))

(head (tail IL)) returns 2 and expands IL into

[ 1 | 2 | promise for (inflist 3) ]

Exploiting Parallelism

Conventional procedural programming languages are difficult to compile for multiprocessors. Frequent assignments make it difficult to find independent computations. Consider (in Fortran):

      do 10 I = 1,1000
        X(I) = 0
        A(I) = A(I+1)+1
        B(I) = B(I-1)-1
        C(I) = (C(I-2) + C(I+2))/2
10    continue

This loop defines 1000 values for arrays X, A, B and C. Which computations can be done in parallel, partitioning parts of an array to several processors, each operating independently?

• X(I) = 0
Assignments to X can be readily parallelized.

• A(I) = A(I+1)+1
Each update of A(I) uses an A(I+1) value that is not yet changed. Thus a whole array of new A values can be computed from an array of "old" A values in parallel.

• B(I) = B(I-1)-1
This is less obvious. Each B(I) uses B(I-1), which is defined in terms of B(I-2), etc. Ultimately all new B values depend only on B(0) and I. That is, B(I) = B(0) - I. So this computation can be parallelized, but it takes a fair amount of insight to realize it.

• C(I) = (C(I-2) + C(I+2))/2
It is clear that even and odd elements of C don't interact. Hence two processors could compute even and odd elements of C in parallel. Beyond this, since both earlier and later C values are used in each computation of an element, no further means of parallel evaluation is evident. Serial evaluation will probably be needed for the even or odd values.
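The claim that the B recurrence collapses to a closed form can be checked in a few lines of Scheme; b-serial and b-closed are hypothetical names used only for this sketch.

```scheme
; b-serial computes B(I) by the serial recurrence B(I) = B(I-1) - 1.
(define (b-serial b0 i)
  (if (= i 0)
      b0
      (- (b-serial b0 (- i 1)) 1)))

; b-closed uses the closed form B(I) = B(0) - I, which needs no
; earlier B values, so every B(I) could be computed in parallel.
(define (b-closed b0 i)
  (- b0 i))

(display (b-serial 100 10))   ; displays 90
(newline)
(display (b-closed 100 10))   ; displays 90
```

The two definitions agree for every I, which is exactly the insight a parallelizing compiler would need to discover.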
Exploiting Parallelism in Scheme

Assume we have a shared-memory multiprocessor. We might be able to assign different processors to evaluate various independent subexpressions. For example, consider

(map (lambda (x) (* 2 x)) '(1 2 3 4 5))

We might assign a processor to each list element and compute the lambda function on each concurrently:

   1    2    3    4    5
   Processor 1 ... Processor 5
   2    4    6    8    10

How is Parallelism Found?

There are two approaches:
• We can use a "smart" compiler that is able to find parallelism in existing programs written in standard serial programming languages.
• We can add features to an existing programming language that allow a programmer to show where parallel evaluation is desired.

Concurrentization

Concurrentization (often called parallelization) is the process of automatically finding potential concurrent execution in a serial program. Automatically finding concurrent execution is complicated by a number of factors:

• Data Dependence
Not all expressions are independent. We may need to delay evaluation of an operator or subprogram until its operands are available. Thus in

(+ (* x y) (* y z))

we can't start the addition until both multiplications are done.

• Control Dependence
Not all expressions need be (or should be) evaluated. In

(if (= a 0)
    0
    (/ b a))

we don't want to do the division until we know a ≠ 0.

• Side Effects
If one expression can write a value that another expression might read, we probably will need to serialize their execution. Consider

(define rand!
  (let ((seed 99))
    (lambda ()
      (set! seed
            (modulo (* seed 1001) 101101))
      seed)))
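A quick sketch of why rand! forces serialization: each call both reads and writes seed, so the order of calls determines which "random" values are seen (this usage example is ours, not from the notes).

```scheme
; rand! from the notes, using standard modulo in place of mod.
(define rand!
  (let ((seed 99))
    (lambda ()
      (set! seed (modulo (* seed 1001) 101101))
      seed)))

; Two serial calls each see the other's update to seed,
; so they return different values.
(define r1 (rand!))
(define r2 (rand!))
(display (not (= r1 r2)))   ; displays #t
```

If the two calls instead ran concurrently against the same initial seed, both could compute the same value, which is exactly the hazard discussed next.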
Now in

(+ (f (rand!)) (g (rand!)))

we can't evaluate (f (rand!)) and (g (rand!)) in parallel, because of the side effect of set! in rand!. In fact if we did, f and g might see exactly the same "random" number! (Why?)

• Granularity
Evaluating an expression concurrently has an overhead (to set up a concurrent computation). Evaluating very simple expressions (like (car x) or (+ x 1)) in parallel isn't worth the overhead cost. Estimating where the "break even" threshold is may be tricky.

Utility of Concurrentization

Concurrentization has been most successful in engineering and scientific programs that are very regular in structure, evaluating large multidimensional arrays in simple nested loops. Many very complex simulations (weather, fluid dynamics, astrophysics) are run on multiprocessors after extensive concurrentization. Concurrentization has been far less successful on non-scientific programs that don't use large arrays manipulated in nested for loops. A compiler, for example, is difficult to run (in parallel) on a multiprocessor.

Concurrentization within Processors

Concurrentization is used extensively within many modern uniprocessors. Pentium and PowerPC processors routinely execute several instructions in parallel if they are independent (e.g., read and write distinct registers). These are superscalar processors. These processors also routinely speculate on execution paths, "guessing" that a branch will (or won't) be taken even before the branch is executed! This allows for more concurrent execution than if strictly "in order" execution is done. These processors are called "out of order" processors.

Adding Parallel Features to Programming Languages

It is common to take an existing serial programming language and add features that support concurrent or parallel execution. For example, versions of Fortran (like HPF, High Performance Fortran) add a parallel do loop that executes individual iterations in parallel. Java supports threads, which may be executed in parallel. Synchronization and mutual exclusion are provided to avoid unintended interactions.
Multilisp

Multilisp is a version of Scheme augmented with three parallel evaluation mechanisms:

• Pcall
Arguments to a call are evaluated in parallel.

• Future
Evaluation of an expression starts immediately. Rather than waiting for completion of the computation, a "future" is returned. This future will eventually transform itself into the result value (when the computation completes).

• Delay
Evaluation is delayed until the result value is really needed.

The Pcall Mechanism

Pcall is an extension to Scheme's function call mechanism that causes the function and its arguments to all be computed in parallel. Thus

(pcall F X Y Z)

causes F, X, Y and Z to all be evaluated in parallel. When all evaluations are done, F is called with X, Y and Z as its parameters (just as in ordinary Scheme). Compare

(+ (* X Y) (* Y Z))

with

(pcall + (* X Y) (* Y Z))

It may not look like pcall can give you that much parallel execution, but in the context of recursive definitions, the effect can be dramatic. Consider treemap, a version of map that operates on binary trees (S-expressions):

(define (treemap fct tree)
  (if (pair? tree)
      (pcall cons
             (treemap fct (car tree))
             (treemap fct (cdr tree)))
      (fct tree)))

Look at the execution of treemap on the tree

(((1 . 2) . (3 . 4)) . ((5 . 6) . (7 . 8)))

We start with one call that uses the whole tree. This splits into two parallel calls, one operating on ((1 . 2) . (3 . 4)) and the other operating on ((5 . 6) . (7 . 8)). Each of these calls splits into 2 calls, and finally we have 8 independent calls, each operating on the values 1 to 8.

Futures

Evaluation of an expression as a future is the most interesting of Multilisp's three mechanisms. If the computation of expr is not yet completed, you are forced to wait until computation is completed.
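Since pcall exists only in Multilisp, here is a serial sketch in ordinary Scheme of what pcall computes; pcall-sim is our hypothetical stand-in, which preserves pcall's result but not its parallelism.

```scheme
; pcall-sim evaluates f and all arguments (serially, since plain
; Scheme has no parallel evaluation) and then applies f, yielding
; the same value (pcall f x y z) would.
(define (pcall-sim f . args)
  (apply f args))

; treemap from the notes, with pcall replaced by pcall-sim.
(define (treemap fct tree)
  (if (pair? tree)
      (pcall-sim cons
                 (treemap fct (car tree))
                 (treemap fct (cdr tree)))
      (fct tree)))

; Doubles every leaf: '((1 . 2) . (3 . 4)) becomes ((2 . 4) . (6 . 8)).
(treemap (lambda (x) (* 2 x)) '((1 . 2) . (3 . 4)))
```

Because cons, car, and cdr here have no side effects, the order in which the recursive calls run cannot change the result, which is why Multilisp can safely evaluate them in parallel.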