Icon Analyst 8
Total Page:16
File Type:pdf, Size:1020Kb
In-Depth Coverage of the Icon Programming Language October 1991 Number 8 subject := OuterEnvir.subject object := OuterEnvir.object s_pos := OuterEnvir.s_pos In this issue … o_pos := OuterEnvir.o_pos fail String Synthesis … 1 An Imaginary Icon Computer … 2 end Augmented Assignment Operations … 7 procedure Eform(OuterEnvir, e2) The Icon Compiler … 8 local InnerEnvir Programming Tips … 12 InnerEnvir := What’s Coming Up … 12 XformEnvir(subject, s_pos, object, o_pos) subject := OuterEnvir.subject String Synthesis object := OuterEnvir.object s_pos := OuterEnvir.s_pos In the last issue of the Analyst, we posed the problem o_pos := OuterEnvir.o_pos of implementing a string-synthesis facility for Icon, using the ideas given earlier about modeling the string-scanning control suspend InnerEnvir.object structure. OuterEnvir.subject := subject Our solution is given below. First we need procedures OuterEnvir.object := object analogous to the procedure used for modeling string scanning. OuterEnvir.s_pos := s_pos In addition to a subject, there’s now an “object”, which is the OuterEnvir.o_pos := o_pos result of string synthesis. There now also are two positions, subject := InnerEnvir.subject one in the subject and one in the object. The global identifiers object := InnerEnvir.object subject, object, s_pos, and o_pos are used for these four s_pos := InnerEnvir.s_pos “state variables” in the procedures that follow. (In a real o_pos := InnerEnvir.o_pos implementation, these would be keywords.) fail The string synthesis control structure is modeled as end expr1 ? expr2 → Eform(Bform(expr1),expr2) Most of the procedures specified in the last issue of the are straightforward. Care must be taken, however, The procedures Bform() and Eform() are very similar Analyst to assure that values assigned to the positions are in range — to Bscan() and Escan() used in the last issue of the Analyst this is done automatically for the keyword &pos, but it must for modeling string scanning. There just are two additional be done explicitly for s_pos and o_pos. The procedures state variables to maintain. The subject gets its value from smove() and omove() illustrate this: expr1 as before, while the object starts out as the empty string. Both positions start at 1. procedure smove(i) record XformEnvir(subject, s_pos, object, o_pos) if s_pos + i >= 1 then suspend .subject[.s_pos:s_pos <– s_pos + i] global subject, object, s_pos, o_pos end procedure Bform(e1) local OuterEnvir procedure omove(i) OuterEnvir := if o_pos + i < 1 then fail XformEnvir(subject, s_pos, object, o_pos) suspend .object[.o_pos:o_pos <– o_pos + i] subject := e1 end object := "" s_pos := o_pos := 1 Note that smove() and omove() are the same except for the suspend OuterEnvir use of different variables. The Icon Analyst / 1 In addition, non-positive specifications for positions What Next? must be converted to positive ones. The procedure cvpos() takes care of this: If you’re interested in programming language design, you may wish to try your hand at adding additional string procedure cvpos(i, s) synthesis procedures. if i <= 0 then i +:= ∗s + 1 Another possibility is a facility that combines analysis if 1 <= i <= ∗s + 1 then return i and synthesis so that the subject is transformed as it’s ana- else fail lyzed. While this kind of string transformation has conceptual end appeal, it presents the problem of keeping track of where you are in a subject whose length may be changing. If you come This conversion is needed in all procedures whose up with some ideas, you can try them out without a great arguments are position specifications: investment in time and effort by using the modeling tech- procedure spos_(i) niques we’ve shown. return cvpos(i, subject) = s_pos end An Imaginary Icon Computer procedure stab(i) As mentioned in the last issue of the Analyst, the suspend .subject[.s_pos:s_pos <– interpretive implementation of Icon is based on the concept of cvpos(i, subject)] an imaginary Icon computer that performs the operations needed to execute an Icon program. end The instruction set of this imaginary computer is a bit procedure opos_(i) strange — it’s not something you’d want to build in hardware. return cvpos(i, object) = o_pos Rather, this imaginary computer forms a conceptual bridge between the semantics of Icon and the architecture of conven- end tional computers on which Icon must run. This implementa- procedure otab(i) tion technique is an old one. The first language in which the suspend .object[.o_pos:o_pos <– method was well documented was BCPL [1]. The macro cvpos(i, object)] implementation of SNOBOL4 also used this technique [2]. end Virtual Machine Language Again note the similarity of the procedures that deal with the The code for this Icon computer is called virtual ma- subject and the object. chine code and is described in some detail in the Icon imple- The three interesting procedures related to synthesis mentation book [3]. We won’t attempt to describe virtual are: machine code in detail here, but a few of its characteristics are procedure xswap() worth note. First, the imaginary Icon computer is stack-based. That suspend (object <–> subject) & (o_pos <– 1) & is, the operands of an operation are pushed on to a stack. The (s_pos <– 1) & &null operation pops its operands off the stack and pushes its result end in their place. procedure odelete(i) One curious aspect of the virtual machine language is that there is a separate instruction for every Icon operator — suspend (object[o_pos+:i] <– "") & .opos not just ones for arithmetic operations. There is an instruction end for subscripting, an instruction for activating a co-expression, procedure oplace(s) and so on. On the other hand, there is just one instruction for suspend (object[o_pos:o_pos] <– s) & invoking all functions and procedures. The difference be- .(o_pos <– o_pos + ∗s) tween the handling of operators and functions is a reflection end of the fact that Icon operators are fixed and their meanings cannot change during program execution: In all three, conjunction and reversible assignment are used to assure that the state variables are restored to their i + j former values if a suspended procedure call is resumed. always means addition. On the other hand, functions and 2 / The Icon Analyst procedures are first-class values. They start out as the values of global identifiers, but other values can be assigned to these proc main 1 identifiers. Consequently, local 0,000000,total local 1,000000,check write(s) local 2,000000,write con 0,004000,0.0 may be the invocation of the built-in function for writing, or con 1,010000,8,124,157,164,141,154,040,075,040 it may be something entirely different if another value has declend 2 filen sum.icn been assigned to write. line 1 3 The virtual machine language does not attempt to cope mark L1 pnull with generators. There is a single virtual machine instruction var 0 for the element generator (!x) just as there is one for the size real 0 operation (∗x). It’s left to the instruction (that is, its execution line 2 asgn by the imaginary Icon computer) to handle what goes on in unmark generation. lab L1 mark L2 As with any real computer, there are lots of other mark0 instructions: instructions to push values on the stack, instruc- pnull tions to reference variables, and instructions to transfer con- var 0 dup trol between places in a virtual machine program. var 1 You can look at the virtual machine code if you wish — pnull line 3 it’s found in ucode files. As described in the article in the keywd 18 preceding Analyst, ucode is the format used in procedure bang libraries. If you have a library of ucode files (perhaps from the invoke 1 plus Icon program library), you can print them or examine them asgn with a text editor — they are plain text files. You also can pop create ucode files using the –c option to the Icon interpreter, lab L3 efail as in lab L4 icont –c sum.icn unmark lab L2 mark L5 This produces a pair of ucode files, sum.u1 and sum.u2. var 2 The .u1 file contains virtual machine code for the str 1 var 0 program, while the .u2 file contains global program informa- line 4 tion. invoke 2 unmark lab L5 An Example pnull line 5 Consider the following Icon program that produces the pfail 4 sum of real numbers given in the standard input file. It also end removes any leading or trailing white space and ignores lines proc check 1 that do not contain valid real numbers: local 0,001000,s local 1,000000,tab procedure main() local 2,000000,many local 3,000000,real total := 0.0 local 4,000000,trim every total +:= check(!&input) con 0,020000,2,040,011 write("Total = ", total) con 1,002000,1,0 declend end line 7 mark L1 procedure check(s) var 0 line 8 s ? { bscan tab(many(' \t')) mark L2 var 1 return real(trim(tab(0), ' \t')) var 2 } cset 0 line 9 end . If this program is in the file sum.icn and is translated sum.u1 with the –c option, the file sum.u1 that results is as shown in the box at the right. The Icon Analyst / 3 There is a section of virtual machine code for each file, which contains global information about the program, procedure (main() and check() in this program). usually is short, as shown here: 1 Each procedure begins with proc, followed by infor- mation about identifiers and constants used in the procedure. version U8.3.000 impl local Identifiers are indicated by local, followed by three global 2 comma-separated fields. The first field associates an integer 0,000005,main,0 1,000005,check,1 (starting at 0) with the identifier.