Syntax-Directed Documentation for PL360
Total Page:16
File Type:pdf, Size:1020Kb
University of Tennessee, Knoxville TRACE: Tennessee Research and Creative Exchange The Harlan D. Mills Collection Science Alliance 1970 Syntax-Directed Documentation for PL360 Harlan D. Mills Follow this and additional works at: https://trace.tennessee.edu/utk_harlan Part of the Software Engineering Commons Recommended Citation Mills, Harlan D., "Syntax-Directed Documentation for PL360" (1970). The Harlan D. Mills Collection. https://trace.tennessee.edu/utk_harlan/25 This Article is brought to you for free and open access by the Science Alliance at TRACE: Tennessee Research and Creative Exchange. It has been accepted for inclusion in The Harlan D. Mills Collection by an authorized administrator of TRACE: Tennessee Research and Creative Exchange. For more information, please contact [email protected]. Programming Languages J.c. REYNOLDS, Editor Syntax-Directed Documentation A key principle in what follows is that the programming documentation process is managed solely on the basis of For PL360 the syntax of programs. The semantics of the documenta- tion, as embodied in programmer responses to interroga- tion, are not analyzed by the process except in mechanical HARLAN D. I~/[ILLS ways such as keyword indexing. In this way, a program- IBM, Gaithersburg , Maryland mer's responses are treated as "black messages" in the process, in analogy to the idea of a "black box." That is, a programmer's responses are requested, accepted, stored, The language PL360, together with its phrase structure gram- and later retrieved with no semantic analysis of their mar, is used as a concrete basis for illustrating an idea coiled contents. syntax-directed documentation. This idea is: (1) to use the phrase structure of a program to define the structure of a formal Syntactic Preliminaries documentation for that program; (2) to use the syntactic types We use the notation and definitions for PL360 in [1]. and identifiers in the resulting structure to trigger the automatic In defining documentation units and lines, the following formation of questions to the programmer, whose answers will device is used. First, denote the grammar in [1, in particu- become part of that documentation; and (3) to provide auto- lar, Appendix III, pp. 72-74] by G, which defines the matic storage and retrieval facilities so that other programmers language PL360, say L(G). This grammar G will be trans- who want to understand or modify the program can access the formed finitely into a new grammar G* such that resulting documentation, which is cross-indexed in various ways by syntactic types and objects. A small PL360 program, al- L(G*) = L(G), ready found in the literature, is worked out as an example. and such that G* contains syntactic entities we want to KEY WORDS AND PHRASES: documentation, syntax analysis, PL360, en- classify as documentation units and use to define lines. forced documentation, indexed documentation, automatic interrogation, phase The basis for the transformation of G into G* is a finite structured grammar, syntax-directed documentation, syntax processing number of elementary steps as follows. If X is any finite CR CATEGORIES: 3.59, 3.79, 4.19, 4.43 sequence of tokens and/or syntactic entities which occurs as part of the right side of a production rule in a grammar G ~, and (A) is not a syntactic entity in G k, we can define a Introduction new production (A) :: = X and substitute (A) for X in the PL360, due to the efforts of N~iklaus Wirth [1], is a right side of any rule we please in G k, to get a grammar phrase structured programming language which provides G ~+1. It is clear that L(G ~+1) = L(G k) by this construc- the facilities of a symbolic machine language for the IBM tion. Then, we consider a sequence 360 computers. It is defined by a recursive syntax and is G = G O,G 1,...,G = = G* implemented by a syntax-directed compiler consisting of a precedence syntax analyzer and a set of interpretation where n is the (finite) number of additional syntactic enti- rules, as discussed by Wirth and Weber in [2]. ties we want to be defined in G* which are not in G. Syntax-directed documentation refers to an automatic We note that, even though additional syntactic entities process which acquires programming documentation can easily be introduced in a grammar while retaining the through the syntactical analysis of a program, followed by identical language, the question of keeping it a precedence the interrogation of the originating programmer. This grammar [2] is a delicate matter. This general point is not documentation can be dispensed through reports or file pursued here. However, we use only transformations which query replies when other programmers later need to know label the entire right side of a rule; in this case the grammar the program structure and its details. obviously retains its precedence properties. The interrogation of an originating programmer consists In what follows, the grammar G is augmented to G* just of: (1) a relisting of the program text, with certain syntac- to provide a basis for invoking additional interpretation tic entities, which are classified as documentation units, set rules which define documentation files and generate ques- off typographically in lines and labeled with an ordinal tions. It will also be apparent that the same device can be coordinate system; and (2) a sequence of questions about useful in extending syntax processing beyond documenta- these documentation units. These questions are generated tion to questions of execution control and dynamic storage automatically, by completing prestored skeleton questions allocation in multiprogramming operating systems. For with coordinates and/or programmer generated identifiers. example, better use of core may arise if core is allocated to The programmer's responses to the questions are stored the machine code corresponding to syntactic entities such and indexed to these documentation units for retrieval. as "for statements" and "while statements" rather than 216 Communications of the ACM Volume 13 / Number 4 / April, 1970 simply arbitrary "pages" of machine code which may break referencing during programmer interrogation and later up such natural units of execution. examination. When programmers make an informal effort to arrange their programs for readability, they typically Documentation Units start each documentation unit, as defined above, on a new We classify as a documentation unit any right hand side line and use indentation to correspond in a general way of a rule which reduces to one of the following syntactic with syntactical nesting in the program. We recognize that entities in [1]: the problem is a subjective one, but we give a syntax- (SIMPLE STATEMENT} defined listing algorithm which is believed to satisfy the (STATEMENT) intuitive intentions observed in informal programming (DECL) efforts. (PROGRAM) For the purpose of typographical listing, we partition a There are 19 such documentation units given in Table I. If PL360 program or procedure into a string of substrings. the right hand side is already defined in G, it is used Each substring is to be a printed line, and the string of lines directly. Otherwise, a new syntactic entity is defined, with constitutes a listing of the program. Associated with each the understanding that G is augmented by each such de- line are two numbers: one which specifies its order in the finition, as described above. program or procedure, and one which corresponds to the indentation (or starting column) of the line. If a line ex- TABLE l DOCUMENTATION UNITS ceeds the width of paper available, its continuation is 1 <BLOCK> :~= <BLOCKBODY> END further indented a standard amount. 2 <CASE ST> ::= <CASE SEQ> END 3 <FOR ST> ::= FOR <ASS STEP> <LIMIT) <DO> <STATEMENT*) 4 <FUNCDECL?> The partition of a PL360 program or procedure into lines 5 <FUNC ID) 6 <FUNCST> ::= <FUNCI> ) is defined by marking the starting text for each documenta- 7 <GOTO ST> :~= GOTO <ID> 8 <IF THEN ELSE ST> ::= IF <CONO THEN> <TRUE PART) <STATEMENT*> tion unit, and each label, BEGIN, END, ELSE, and. (dot) 9 <IF THEN ST> :== IF <CONO THEN> <STATEMENT=> 10 <K MEG ASS> I[ <NULL ST) symbol. The lines are numbered consecutively. The inden- 12 <PKOC DECL> ::= <PRDC HO6> ~STATEMENT*> 13 <PNOC ID> tation number is the level of nesting of the documentation 14 <PROGRAM> 15 <SEG DECL> ::= <SEG HEAD> BASE <K REG> unit it begins, if any, based on syntactic membership as Ib <SYNOC2> 17 <T CELL ASS> :~= <T CELL> := <K RiG) IB <T DECLT) described above. The only lines not beginning a new docu- 19 <WHILE ST> ::= <WHILE> <COND DO> <STATEMENT*> mentation unit are BEGIN (in CASE statements), END, ELSE, and. (dot). In each case they are indented according In effect, this classification of documentation rules is a to the level of the documentation unit which they help convenience, for identifying productions, whose recog- define. Labels are given the indentation level of the pro- nition in an analysis, corresponds to having additional gram or procedure being listed. interpretation rules that deal with documentation proc- To refer to a line from outside a procedure, we qualify essing. Given a PL360 program, we consider every realization of the line numbers with the procedure name. While the con- cept of program is defined in PL360, no provision is made such documentation units, which can be structured on the for naming a program in the syntax. basis of syntactic membership, as follows. A documentation For convenience, we introduce a new basic symbol unit is a member of a second documentation unit if its pro- PROGRAM and the redefinition gram text is a subset of the program text of the second. It is an immediate member if it is not a member of any third (PROGRAM) ::= PROGRAM (ID) (STATEMENT) documentation unit, itself a member of the second.