UNIVERSITY OF LONDON

IMPERIAL COLLEGE OF SCIENCE AND TECHNOLOGY

DEPARTMENT OF COMPUTING

Microcomputers as Adaptive and Protective Interfaces in Computer Networks

by

David Michael Ireland

A thesis submitted for the Degree of Master of Philosophy

December 1989

Abstract

There is a growing trend for the direct use of computers by non-computer specialists. However, the interface presented to the user is often tailored more to the computer expert than to the 'novice user'. Therefore there is a need to make the man-machine interface more 'user friendly'. The features that are desirable in a user interface in order to meet this objective are discussed. The proposed solution is a 'protective interface', a separate entity from the application programs with which the user is interacting, acting as a buffer between the two. This protective interface would have the capability of adapting itself to suit the individual user, by forming a model of the user's interactions with the computer. A mechanism for implementing such an adaptive interface is proposed, based upon an interpreter controlled by a directed graph. The directed graph consists of a number of interconnected binary trees and the nodes of the graph represent the input expected, or the output to be generated. The interface is adaptive because it is not only controlled by the graph, but is also able to modify the graph. An implementation of the adaptive protective user interface is described which demonstrates the basic principles of the design. This demonstration implements the adaptive interface on a microcomputer, which is networked to a mainframe computer running the application programs. The [Job Control Language] of a time-sharing mainframe was chosen as the application with which to interface, because Job Control Language is the first point of contact of a novice user with the computer.

Contents

Abstract

Acknowledgements

Glossary

CHAPTER 1: INTRODUCTION

1.1 The need for simple interfaces
1.2 Overall project objectives
1.3 Outline description of system

CHAPTER 2: SURVEY OF RELATED WORK

2.1 Desirable characteristics of man-machine interfaces
2.2 Features of some existing interfaces
2.3 Adaptive programs
2.4 Job Control Languages

CHAPTER 3: OBJECTIVES AND BASIC PRINCIPLES OF DESIGN

3.1 Objectives
3.2 Proposed interface characteristics
3.3 Scope of Project
3.4 Protective Interface
3.5 Basic principles of design

CHAPTER 4: DESCRIPTION OF PROGRAMS

4.1 General Principles
4.2 The Digraph
4.3 Design Decisions and Alternatives
4.4 TREEM
4.5 RESTRUCT
4.6 BLDTRE
4.7 UNBUILD

CHAPTER 5: DEMONSTRATION PROGRAMS

5.1 CDCFILE
5.2 Macro-node editor
5.3 Summary

CHAPTER 6: FUTURE DEVELOPMENTS

6.1 An Integrated Package
6.2 Automatic tree-building
6.3 Tree Editor
6.4 Minor enhancements
6.5 Distributed Processing
6.6 A Vision of the Future

CHAPTER 7: SUMMARY AND CONCLUSIONS

7.1 Enhancing Current Styles of User Interface
7.2 Adaptive User Interfaces
7.3 Protective User Interfaces
7.4 Current Trends

BIBLIOGRAPHY

APPENDICES

A: Detailed description of programs
B: Building and running the programs
C: Description of macro-nodes
D: Macro-node digraph for CDCFILE
E: File Formats
F: Node Types and Their Actions

Acknowledgements

I would like to acknowledge the assistance of my supervisor Mr E.B. James for his guidance in the preparation of this thesis; my wife, Anne, for her encouragement and help; and the Science and Engineering Research Council for their financial support.

Glossary

Glossary of terms and abbreviations:

Byte "byte" is used to mean 8 bits throughout this document.

CAI Computer Assisted Instruction.

CAL Computer Assisted Learning.

CCP Console Command Processor - the command line interpreter of the CP/M operating system.

CDC Control Data Corporation - the name of the manufacturers of the large mainframe computer operated by the Imperial College Computer Centre.

CLI Command Line Interpreter.

CP/M Control Program/Monitor - an operating system widely used on 8-bit microcomputers, including the RML 380Z.

Digraph Directed graph.

GEC - GEC Computers manufacture a range of minicomputers.

ICCC Imperial College Computer Centre.

I/O Input/Output.

JCL Job Control Language - the "programming language" used to give commands to the command line interpreter of the Operating System.

MMI Man Machine Interface - the user interface.

NOS Network Operating System - the operating system run on the ICCC CDC mainframes.

OS Operating System.

OSCL Operating System Command Language - see JCL.

OSCRL Operating System Command and Response Language - see JCL.

OS4000 The operating system used on GEC 4000 range minicomputers.

RML Research Machines Ltd - the name of the manufacturers of a range of microcomputers.

RML 380Z An 8-bit microcomputer manufactured by Research Machines Ltd that is equipped with floppy disk drives and runs the CP/M operating system.

RMX86 Intel's real time operating system used on the Intel 8086 (iAPX) family of microprocessors.

RSX 11M Digital Equipment Corporation's real-time operating system for use on their PDP-11 range of minicomputers.

SIO4 The serial I/O port interface on the RML 380Z microcomputer (V.24/RS232 interface).

TAC Terminal Access Controller - a "switch" which allows a terminal to connect to one of several computers, by selecting from a menu.

TELEX The name of the time-shared access to the CDC mainframe computers running the NOS operating system.

TPA Transient Program Area - the area of memory used by CP/M for non-resident programs.

TREEM "Tree Matcher": a program developed by this project.

TX/Alter A full screen editor used on Intel's RMX 86 operating system.

TXED The text editor provided by RML for use on the RML 380Z microcomputer.

UNIX Bell Labs' portable operating system widely used on many minicomputers.

VDU Visual Display Unit.

VM/370 IBM's operating system for use on IBM 370 series computers.

VMS Digital Equipment Corporation's real time operating system for use on their VAX range of super-minicomputers.

VT Video Terminal.

1: INTRODUCTION

1.1 The Need for Simple Interfaces (Motivation)

In the past, the majority of (direct) computer users who were not computer specialists were engineers, scientists, mathematicians etc. However, this is changing: computers are increasingly being used by 'novice users' who do not have an engineering or scientific background (police, doctors, bank clerks etc.) and the introduction of microcomputers is accelerating this trend.

The advantage of bringing computing facilities direct to the end user is that hardware costs have fallen so much that the cost of employing staff to 'buffer' between the end user and the computer is now significantly larger than the cost of the hardware. It would therefore considerably reduce the cost of using computers if the end user could use them directly, but this will require a much more sophisticated user interface than is usual at the moment if the novice user is not to waste a lot of time 'fighting the system'.

The purpose of this project is to develop an interface that is significantly easier for inexperienced users to use than existing interfaces. It is hoped to make it possible to use the computer with no documentation or manuals. It is assumed that no changes will be made to existing software: the interface will communicate with the existing programs' interface and with the user, acting as a buffer between them.

The most important feature will be that it will be adaptive: it will adjust itself to each particular user and the complexity that they require.

The first point of contact between user and computer is usually with the operating system. This normally has to be so, because most users will need to use more than one application at some time and thus direct entry to a fixed application is not appropriate. The command and response language used to communicate with the operating system is referred to here as a Job Control Language (JCL). Nearly all current JCLs were designed for batch use, as a kind of simple programming language, rather than from the point of view, 'What sort of interface should the operating system present to the user?'.

There are many levels of user (i.e. different users have different amounts of experience of a system, and need to do different things) and with time any given user is likely to move from one level to another. Hence it is desirable that the interface be adaptive to take account of the fact that different users (and the same users on different occasions) will want to 'see' a different interface.

The trend towards networking computers together is adding to the problems of the computer user. It is now increasingly likely that the range of facilities available will be split over a number of computers. These may all be accessed from one terminal or workstation by the user. Although convenient, this may lead to a variety of interfaces being presented to the user, in succession or even concurrently.

1.2 Overall Project Objectives

It is possible that the ideal man-machine interface is natural language, although not necessarily so. However, this is at present impossible to achieve, even with large, fast computers, and since it is desired to design an interface that may be used on microcomputers, natural language can be ruled out. This project takes a more realistic aim: to develop a 'user-friendly' interface that is simple enough to be implemented on small systems. At the simplest level this could be a set of standards, but as the interface is intended to be adaptive the interface must be a program, which will probably be an interpreter, acting as a buffer between the user and the computer.


1.2.1 Flexible Interfaces

Experienced users require briefer prompts than novice users (who, in a conventional system, will have to use the 'help' facility a lot), and because they require more flexibility will want commands to have many parameters. However, it is desirable from the novice user's point of view to be able to omit most parameters (and let them assume sensible default values).

An individual user may correspond to different levels on different occasions, so it is better to talk about categories of use than categories of user. However, 'user-levels' will be used here, while bearing in mind that a real user may correspond to more than one user level.

1.2.2 Menu-Driven Interfaces

Menu-driven interfaces are often claimed to be easy to use, since the user has only to select an option displayed on the screen, rather than remembering what is available and what the correct keyword is. In addition, menu-driven interfaces can use a pointing device (e.g. mouse, light pen, tracker ball etc.) as the input device, instead of a keyboard. However, a menu alone is no substitute for proper help information, since the amount of information that can be fitted into a menu is usually too cryptic to help a novice user.

In order for menus to be comprehensible, they must restrict the number of options displayed on any one menu. A consequence of this is that where the user has potentially a large number of options it is necessary to go through more than one level of menu. This is slower and frustrating for a more experienced user, so there must be some way of by-passing all intermediate menus and invoking the required option immediately: i.e. there needs to be some sort of command line interface (or function keys?) in addition to the menu-driven interface. It is often forgotten in the desire to design easy to use interfaces for novice users that it is just as important to present experienced users with an interface that they regard as easy to use. One interface style is unlikely to suit all levels of user experience, or even the personal preferences of different users of similar experience. Therefore, some form of adaptable interface is highly desirable.

1.2.3 Role of Standards

One way of simplifying the user interface would be to standardise it. This will certainly help users move from one machine to another. However, it is not practical to standardise all possible commands because different machines may do different things. Also, standardisation always lags some time behind the latest ideas.

1.3 Outline Description of System

1.3.1 Modelling User Behaviour

In order for the user interface to be most helpful to the user, it must have some model of the way the user behaves, i.e. the sort of sequences of commands that the user types.

A good model will be able to anticipate what the user will do next, and will be able to make use of this in interpreting commands which are incomplete or have minor errors.

Obviously, if the user does something totally unexpected it is impossible to do anything to help him. It is important that the interface does not try too hard to force the user's input into one of the expected commands; it should be able to recognise when there is a departure from "normal behaviour".
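The idea of a behavioural model can be illustrated with a minimal sketch. It is not the mechanism developed in this thesis: it assumes a simple first-order model (counts of which command follows which), and the class name `UserModel`, the similarity cutoff of 0.75 and the sample commands are all hypothetical. The sketch shows both halves of the requirement: minor errors are corrected towards an expected command, but input that is too unlike anything expected is reported as a departure rather than forced into a match.

```python
from collections import Counter, defaultdict
from difflib import get_close_matches


class UserModel:
    """Counts which command tends to follow which (a first-order model)."""

    def __init__(self):
        self.follows = defaultdict(Counter)  # previous command -> counts of next command
        self.previous = None

    def record(self, command):
        """Update the model with a command the user actually issued."""
        self.follows[self.previous][command] += 1
        self.previous = command

    def expected(self):
        """Commands previously seen after the current one, most frequent first."""
        return [cmd for cmd, _ in self.follows[self.previous].most_common()]

    def interpret(self, typed, cutoff=0.75):
        """Return the expected command closest to `typed`, or None.

        None signals a departure from normal behaviour: the input is not
        forced into an expected command when the similarity is too low.
        """
        candidates = self.expected()
        if typed in candidates:
            return typed
        close = get_close_matches(typed, candidates, n=1, cutoff=cutoff)
        return close[0] if close else None


model = UserModel()
for cmd in ["edit", "compile", "run", "edit", "list", "edit", "compile", "run", "edit"]:
    model.record(cmd)

print(model.expected())           # ['compile', 'list']
print(model.interpret("compil"))  # compile
print(model.interpret("delete"))  # None
```

The cutoff is the point at which the interface stops 'trying too hard'; a real interface would presumably tune it, or ask the user to confirm a correction.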


1.3.2 Menu-Display

Our interface presents either a command line or a menu-driven interface to the user. In the latter case the options in the menu will be displayed in descending order of popularity, i.e. the most frequently used option will be displayed first. Since the menu display will change dynamically, there is a risk that a user who has come to expect an option in a particular position, and who is not actively reading the menu choices, will select an incorrect option after an automatic menu re-structuring phase. Therefore, it is preferable for the user to select the choice by giving the first few characters of the option, rather than selecting purely on position (by number or, to a lesser extent, by a pointing device such as a mouse).
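The following is a minimal sketch of such a menu, given only to make the two design points concrete: the display order follows frequency of use, while selection is by leading characters and so is unaffected by re-ordering. The class `AdaptiveMenu` and the option names are hypothetical and are not taken from the programs described later.

```python
from collections import Counter


class AdaptiveMenu:
    def __init__(self, options):
        self.options = list(options)
        self.uses = Counter()  # option -> number of times it has been selected

    def display_order(self):
        """Options by descending use count; ties keep their original order."""
        return sorted(self.options, key=lambda opt: -self.uses[opt])

    def select(self, typed):
        """Select an option by its first few characters.

        Returns the chosen option, or None if the prefix is empty,
        matches nothing, or matches more than one option.
        """
        prefix = typed.strip().lower()
        if not prefix:
            return None
        matches = [opt for opt in self.options if opt.lower().startswith(prefix)]
        if len(matches) != 1:
            return None
        self.uses[matches[0]] += 1
        return matches[0]


menu = AdaptiveMenu(["Copy file", "Delete file", "List files", "Print file"])
for typed in ["l", "li", "p"]:
    menu.select(typed)

print(menu.display_order())  # ['List files', 'Print file', 'Copy file', 'Delete file']
```

Because the same characters always select the same option, a user who has stopped reading the menu is not misled when the order changes.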

2: SURVEY OF RELATED WORK

This chapter provides a survey of previous work which is related to user interfaces including JCLs, adaptive programs and tree parsing.

2.1 Desirable characteristics of man-machine interfaces

This section summarises some of the statements and observations that others have made about features of a user interface that are desirable or necessary to promote efficient interaction. The desirable characteristics which will be used as the guiding rules in the design of our user interface are given in section 3.5, based upon the results of the studies summarised in this section.

[ZINN & CONKLIN 1970]

"Interactive mode considerations are especially important for the infrequent users. Four statements are made as working hypotheses to be demonstrated by studies at the University of Michigan".

1. A quick reply from the system will encourage effective use as a learning tool. The student-user should be able to question the system if he does not know where he is. If the processor begins to display material in too much detail then the user should be able to interrupt and change mode.

2. Simplicity and consistency, so that rules and conventions are easy to learn and recall.

3. The system should be engineered to do what it is good at, and let people attend to patterns, set the direction for investigation etc.

4. Flexibility. The user should feel encouraged to test tentative ideas. The system should provide an immediate reply to ambiguous and incomplete instructions, but let the user sketch ideas and fill in details later.

[BODENSCHER 1970]

The standard QWERTY keyboard layout is all right for typists, keypunch operators etc., but not for the "average terminal user" who searches for the buttons visually and mostly pushes them using one finger on each hand.

When communicating with the computer keywords must be typed in letter by letter, and often special symbols are used (e.g. $) whose meaning is not obvious. A special keyboard was made with the following characteristics:

- The keys were arranged in groups.

- Alphabetic keys were in alphabetic order, and numeric keys as on a calculator.

- There were no shift keys.

- The keys were located according to function and frequency of use.

- Operators (keywords) had a single key each.

- Keys could be pressed without obscuring their label.

The result of this re-design of the keyboard was that the number of key presses required was reduced by 50%; new users found it easier to use (less time was spent reading the manuals); experienced users found it about twice as fast as a conventional keyboard; and errors were reduced by about 25%.

[HUCKLE]

This survey of computer users in higher education and research divided the users into six groups:

- students of computer science

- students of other disciplines

- lecturers and teachers

- research workers

- administration staff

- computer centre staff

Of these the second group is the largest in number, but its members are not the largest users in terms of computer resources used. Activity is usually in short bursts, separated by long periods of inactivity, so the users forget how to use the computers. In addition the users may use more than one computer. This group is the most likely to benefit from sharing the resources of heterogeneous computers on a network. However, the interfaces to such networks are too complicated for this group of users at present. They need a uniform user-friendly interface, which may also be of benefit to other classes of user.

Survey of user behaviour:

Most users are deterred from using networks because of the interfaces provided, so there is no experience on which to base the design of user interfaces to computer networks. A basis for design was obtained by studying users of single-computer systems. Surveys were carried out on a DECsystem 10 at Hatfield Polytechnic to find out: the commands invoked, responses received, and circumstances leading to errors.

Most non-specialists (97%) use the BASIC or Fortran programming languages, or packages. Sessions consist mainly of creating, editing, listing and executing small programs (less than 30 lines of source code). Nearly two thirds (60%) of sessions involve compilation and execution. Files are usually listed in their entirety, and on the terminal rather than a printer. Only a small subset of the command language is used, and defaults are used heavily for parameters. Many users, especially first time users, have difficulty logging on. Specialists use collective specification of filenames (i.e. wildcards) widely, but frequently only one file was actually manipulated, so the wildcard mechanism is probably being used for abbreviating filenames.

Three things give problems to both specialists and non-specialists: (1) They tend to forget which subsystem they are in (e.g. Operating System command mode, BASIC, user program etc.), and enter commands intended for a different environment, despite having different prompts in the different environments/subsystems. (2) Users are careless when typing, frequently making mistakes. Sometimes this results in the system interpreting the command differently from what was intended instead of detecting the error, e.g. REPLASE may be interpreted as "REP LASE". The error messages generated are often unrelated to the real error. Users often omit significant spaces or other delimiters, or use the wrong delimiter. (3) Error messages are not explicit enough or are not helpful. Non-specialist users often ignore them, not realising that a command has not been executed. Sometimes the error messages are not sufficient to help even the specialist users, or are misleading. In many cases users were reluctant to use help facilities when prompted to do so. This may have been due to poor help facilities throughout the system leading to users losing confidence in them.

Interface design criteria:

The design aims were to produce an interface that was easy to learn and use. Novice users use the system too infrequently to remember many commands and their sequence of use. They are however willing to use more tedious methods if that means that less has to be remembered, e.g. occasional users may be more willing to retype an entire command line than to learn a sophisticated intra-line editor. Since users do not differentiate clearly enough between the operating system and different subsystems, despite the differing prompts, it is necessary to either structure the command language to emphasise the difference or else to reduce the number of subsystems. The second solution also helps with another problem: the same command in different environments may produce different results.

A comprehensive system of defaults is important. This means that sometimes the computer will not do what was intended, but this is outweighed by the advantages. If the system cannot surmise the user's intentions then it should prompt for additional information. If actions are irreversible then a fail-safe mechanism is needed. There should be a default filename etc. Some systems allow explicit specification of defaults; in others they are implicit. As each user's needs differ, and an individual user's needs change with time, it is better for each user to have their own set of default values, rather than a system-wide set, that can be changed by the user or on his behalf. Most operating systems allow parameter specifications by position, or by keyword. It would be better to allow both. The interface should avoid too much typing by the user and the arbitrary use of punctuation characters. Delimiters and special characters tend to confuse users, so it is better to use plain language keywords instead. Command names should be brief but meaningful. The requirements of experienced users (who will want to abbreviate etc.) should be a secondary consideration.

Users should be able to understand the information in system responses, so narrative error messages rather than error codes are required. Jargon should be avoided. Responses should be consistent, so that it is immediately obvious whether an operation was successful or not. The system should always give the same prompt when ready for a command. Responses need to be at a number of levels of detail, as experienced users want brief responses which often mislead inexperienced users. There should be good on-line information on commands etc. at a number of levels of detail.


The majority of students' sessions follow a general pattern: edit program, execute it, edit, execute ... etc. Therefore it is important to make it easy to edit and execute repeatedly. Non-specialist users frequently list a program before, during, or after editing it.

The command language should be extensible so that new facilities can be added in the future. It is desirable for users to be able to change their own interface to provide facilities tailored to their particular needs. This is usually provided by a procedure or macro facility; often the form of invocation is the same as for an 'ordinary' operating system command, so that users do not need to tell them apart.

Conclusion:

A survey of a user population of a single-computer system has been conducted to help assess user requirements. The design criteria of a new user interface may be summarised as follows. The user interface is required to:

1 Provide access to the required facilities

2 Provide interactive working, and batch working as a secondary consideration.

3 Be structured in a way to avoid confusion about subsystem environments.

4 Provide a comprehensive defaulting system.

5 Allow command parameters to be specified by both position and keyword.

6 Have a simple syntax and require the minimum of typing.

7 Have meaningful command names that can be abbreviated.

8 Provide meaningful error and system messages.

9 Provide a multi-level response language.

10 Have on-line system documentation that is easy to access and understand.

11 Allow users to repeatedly edit and execute files simply and effectively.

12 Allow users to define their own commands.
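Criteria 4, 5 and 7 above (defaults, parameters by position or keyword, and abbreviable command names) interact, and a small sketch may make the interaction clearer. It is an illustration only, not the interface surveyed or the one developed in this thesis: the `key=value` keyword syntax, the command table and the per-user defaults are all assumptions made for the example.

```python
def resolve_command(word, commands):
    """Expand an abbreviation to the one command name it uniquely prefixes."""
    if word in commands:
        return word
    matches = [name for name in commands if name.startswith(word)]
    if len(matches) == 1:
        return matches[0]
    raise ValueError(f"unknown or ambiguous command: {word!r}")


def parse(line, commands, user_defaults):
    """Parse 'NAME value ... key=value ...' into (name, parameters).

    `commands` maps each command name to its ordered parameter names.
    Precedence: keyword value, then positional value, then the user's default.
    """
    words = line.split()
    if not words:
        raise ValueError("empty command line")
    name = resolve_command(words[0].lower(), commands)
    param_names = commands[name]
    params = dict(user_defaults.get(name, {}))  # start from this user's defaults
    positional = [w for w in words[1:] if "=" not in w]
    if len(positional) > len(param_names):
        raise ValueError(f"too many parameters for {name}")
    params.update(zip(param_names, positional))  # parameters given by position
    for word in words[1:]:
        if "=" in word:                          # parameters given by keyword
            key, value = word.split("=", 1)
            if key not in param_names:
                raise ValueError(f"{name} has no parameter {key!r}")
            params[key] = value
    missing = [p for p in param_names if p not in params]
    if missing:
        raise ValueError(f"please supply: {', '.join(missing)}")
    return name, params


# Hypothetical command table and per-user defaults.
COMMANDS = {"list": ["file", "device"], "copy": ["source", "dest"],
            "compile": ["file", "language"]}
DEFAULTS = {"list": {"device": "terminal"}, "compile": {"language": "fortran"}}

print(parse("l prog.bas", COMMANDS, DEFAULTS))
print(parse("list device=printer prog.bas", COMMANDS, DEFAULTS))
```

In the first call the abbreviation `l` is unambiguous and the device falls back to the user's default; in the second the keyword overrides it. An ambiguous abbreviation such as `co` raises an error here, where an interactive interface would more naturally prompt for clarification.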


2.2 Features of Some Existing Interfaces

2.2.1 CAL/CAI

Most simple user interfaces have been developed for CAL and CAI (Computer Assisted Learning and Computer Assisted Instruction), where it is known that users will usually have no computer experience and there is no reason to force the users to learn anything about computers.

2.2.2 The Science Museum Terminal

For a number of years there has been an experimental simple user interface to the ICCC network of CDC mainframe computers operating in the Science Museum [IRELAND 78]. This has been very successful, but the range of interaction is limited to a few games. It works on a question and answer basis like CAL/CAI systems, and does not allow the user to take the lead in the conversation.

2.2.3 MM/1

Since this project is concerned with the user interface in general, rather than specifically JCL, no attempt will be made to produce another JCL. Instead the aim is to make an existing JCL easier to use. An approach very similar to this has been taken in the development of a user interface at NPL called MM/1 [SCHOFIELD, HILLMAN & RODGERS]. This is a standard interface: so although a user may not know which command to use to do something, he knows how to find out.

This approach has much to commend it when users may have to use several different computers, for different purposes, since it may be impossible to define a standard JCL that suits them all. However, there is no reason why they should not all share the same style and a common set of basic commands.


MM/1 caters for a range of user levels by allowing abbreviations and also by allowing all parameters to be given on one line, or the user can be prompted for each individually. It also includes a standard method for the user to obtain more information about the required input at any stage.

2.2.4 User-friendly Interfaces

[CARD et al 1970]

This research project had two purposes: to find out whether computer interrogation was acceptable to the patient and to compare the precision of the evidence collected by the computer with that collected by a specialist.

The patients interacted with the computer via a teleprinter that had only three response keys, marked "Yes", "No" and "Don't know/understand".

All the patients were given a standard questionnaire. None of them found using the computer unpleasant or alarming, and it was compared favourably with the consultants, being described as "polite", "friendly" and "understandable".

The precision of the answers was as good as that of the consultants; however, only simple questions were used - it was thought that the computer might not be so good if the questions were more complex.

2.2.5 Line editing

Line editing is normally restricted to deleting the last character (which may be used repetitively) and deleting the whole current input line. There is often a re-display line function (typically invoked by ), because on hard-copy terminals it is not possible to actually remove the erroneous character from the display, which results in a display that is difficult to read after a few corrections.

Deleting or inserting characters in the middle of a line is not normally supported on teletype or scroll-mode VDU interfaces. However it is sometimes supported on page-oriented screens such as the IBM 3270 type terminal.

2.2.6 Recalling previous command lines

It is often very useful if it is possible to recall a previously entered command in order to repeat it, or to edit it slightly and re-issue it if the entered command did not have the desired effect or was invalid, or if the user wishes to enter a sequence of similar commands. This feature can also be useful for checking on what exactly was entered into the system for the last few commands. Under RMX86 it is possible to recall previous command lines by pressing . This can be used repeatedly to step backwards, but it is not possible to step forwards again. Editing is only possible using the delete last character function. Under VMS it is possible to step both backwards and forwards through the command lines, and also to edit a line at any point before re-issuing it. Under OS4000 it is possible to recall the last command with the '?Q' command, but it may only be edited by deleting from the end of the line. Under UNIX, using the C shell, it is possible to recall previous lines, by specifying a string used within the line, or by displaying a menu of stored commands (i.e. the command history) and selecting one by number. It is possible to edit the line by using line-editing replace commands similar to those of the line-editor (i.e. substitute old-string with new-string).

2.2.7 Features of UNIX C Shell

In UNIX the command line interpreter that provides the user interface to the operating system is called a shell. There are two commonly used shells on UNIX - the Bourne shell and the C shell. The latter is so called because it is intended to be similar to the C programming language. In particular its flow control constructs are borrowed from the C programming language.

One of the features of the C shell is that it can maintain a history of commands that have been entered. This history may be displayed, and previous commands can be selected for re-submission with or without editing. This history feature is useful not only to pick out a long command that has to be repeated (possibly with minor modification), but also as a record or reminder of previous commands which have disappeared off the screen. In some versions of UNIX the history can be saved from one session to the next, greatly increasing its usefulness. The history can be used to obtain the effect of automatically customising commands: once a complete command has been entered it can be re-invoked by giving just enough characters at the start of a line to identify it.
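The recall facilities described here and in section 2.2.6 can be sketched in a few lines. This is an illustration of the general idea only; it does not reproduce the C shell's syntax or implementation, and the class `History` and its method names are hypothetical.

```python
class History:
    """A bounded list of previously entered command lines."""

    def __init__(self, limit=100):
        self.limit = limit
        self.lines = []

    def add(self, line):
        """Remember a command line, discarding the oldest when full."""
        self.lines.append(line)
        if len(self.lines) > self.limit:
            self.lines.pop(0)

    def recall_prefix(self, prefix):
        """Most recent line that starts with `prefix`, or None."""
        for line in reversed(self.lines):
            if line.startswith(prefix):
                return line
        return None

    def recall_containing(self, text):
        """Most recent line that contains `text` anywhere, or None."""
        for line in reversed(self.lines):
            if text in line:
                return line
        return None

    def substitute(self, old, new):
        """The last line with the first `old` replaced by `new`, or None."""
        if not self.lines or old not in self.lines[-1]:
            return None
        return self.lines[-1].replace(old, new, 1)


history = History()
for line in ["cc -o prog prog.c", "ls -l", "prog < data.txt"]:
    history.add(line)

print(history.recall_prefix("cc"))         # cc -o prog prog.c
print(history.recall_containing("-l"))     # ls -l
print(history.substitute("data", "test"))  # prog < test.txt
```

Recall by prefix is what gives the 'automatic customisation' effect noted above: a long command, once typed in full, can afterwards be invoked by its first few characters.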

Files containing commands to be read by the shell are called shell scripts. As in other operating systems these can be used to combine sequences of frequently used commands and to create customised commands.

If the "search path" is set up so that the user's own directories are searched before the system directories then the UNIX shell scripts may be used to override system-provided commands, thus allowing customisation of commands by the user, or the system administrator etc. Alternatively, the search path can be set so that the user’s directories are searched after the system directories, preventing accidental overriding of system commands.
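The effect of search-path order can be shown with a small sketch. A dictionary stands in for the file system, and the directory and command names are hypothetical; the only point illustrated is that the first directory on the path containing the command wins.

```python
def find_command(name, search_path, directories):
    """Return the first directory on `search_path` that contains `name`, or None.

    `directories` maps a directory name to the set of command names held
    in it, standing in for a real file system.
    """
    for directory in search_path:
        if name in directories.get(directory, set()):
            return directory
    return None


# Hypothetical directories: the user has a private version of "ls".
directories = {
    "/bin": {"ls", "cp"},
    "/home/anne/bin": {"ls", "backup"},
}

user_first = ["/home/anne/bin", "/bin"]
system_first = ["/bin", "/home/anne/bin"]

print(find_command("ls", user_first, directories))    # /home/anne/bin
print(find_command("ls", system_first, directories))  # /bin
```

With the user's directory first, the private `ls` overrides the system command; with the system directory first, it cannot do so by accident, while commands that exist only in the user's directory are still found.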

A newer development on some versions of UNIX is the 'visual shell'. This is a menu driven interface that is perceived as more appropriate to new users of UNIX than the standard obscure interfaces.

2.2.8 User Defined Symbols and Keys (Macros)

Facilities allowing users to define their own symbols or keys (commonly referred to as macro facilities) give the user an easily customisable interface. For example under VM/370 private commands can be generated by creating an 'exec' file, which will be invoked whenever its name is typed as a command (i.e. as the first item on a command line). This kind of facility is quite common, and is also available (for example) in UNIX (shell scripts), CP/M (submit files), etc. In other operating systems the mechanism may be slightly different. The same effect can be achieved under OS4000 by creating a command file, and then setting up a 'context pointer' which allows the file to be referred to by a string that looks like a command (as opposed to a file name). Under VAX/VMS a symbol may be set up to execute a file. Whenever this symbol is typed it is expanded into a full string which will execute the file. This mechanism is much more powerful as it is general purpose.

In general where commands are actually the names of files, there will be a number of directories that are searched in some particular order to find the file. Often, the user's own directory will be searched before system directories, allowing the user to override system commands with their own private versions. However this is not always the case, and sometimes the search order is under the control of the user (e.g. in Unix, see section 2.2.7).

Editors often allow the user to define macros, allowing complicated commands to be invoked with one or a few keystrokes, e.g. UDK (user defined keys) in Word-11, single character commands in TX/Alter.

2.3 Adaptive Programs (Tree Processing)

D. Partridge and E.B. James have successfully used a tree processing program as a tolerant Fortran compiler [PARTRIDGE 1972, PARTRIDGE & JAMES 1976]. This compiler front-end could recognise Fortran statements which were misspelled by comparing the program statement with a language description stored in a binary tree, and attempting to force a match.

An interesting feature of this tree processor is the 'confidence jump' mechanism, which by skipping over redundant characters in the input program statements (instead of checking them) was able to improve efficiency. This also meant that some mistakes were not even 'noticed' by the program.
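The idea of a confidence jump can be illustrated by a small sketch in Python. This is not the algorithm of Partridge and James, whose details are not reproduced here: the keyword list is an illustrative subset, a flat list stands in for the binary tree, and the function name is an assumption made for this sketch.

```python
KEYWORDS = ["DIMENSION", "CONTINUE", "COMMON", "FORMAT", "RETURN"]  # illustrative subset

def match_keyword(text, keywords=KEYWORDS):
    """Check only as many leading characters as are needed to single out one
    keyword, then jump over the remaining characters without checking them."""
    candidates = list(keywords)
    for i, ch in enumerate(text):
        candidates = [k for k in candidates if i < len(k) and k[i] == ch]
        if len(candidates) == 1:
            # Confidence jump: the rest of the input is accepted unchecked.
            return candidates[0]
        if not candidates:
            return None
    return None  # input ended while still ambiguous
```

In this sketch the misspelling "CONTUNIE" is accepted as CONTINUE, because only the first three characters are ever examined; the error in the skipped part is never noticed.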


2.4 Job Control Languages (Standard JCLs)

Many bodies have considered a standard (high level) JCL e.g. U.S. Department of Defence 1972; ANSI committee 1972; IFIP working conference 1974; CODASYL 1973. There are also a number of high level command languages that have been implemented, e.g. GCL at the UKAEA, UNIQUE at Nottingham University, ABLE at Bristol University, and WFL is available on Burroughs machines. There is also a PL/1-like JCL called JOL which is compiled into OS/360 JCL. Of these examples all except UNIQUE are based around batch job processing.

It should be noted that the purpose of this project is to develop a general purpose interface, applicable to a wider range of applications than just job control, and not to develop another JCL. The problem with standard JCLs is that different computers are different. At best a standard JCL can only be a compromise, but more likely each computer will have some facilities (X-Y plotter, microfilm/fiche printer, typesetter, audio output) that others do not, making a true standard impossible. Nevertheless it is certainly an advantage to have all JCLs based around a common core, so that the overall structure is the same, and where two computers can do the same thing the commands are identical on the two computers.

3: OBJECTIVES AND BASIC PRINCIPLES

This chapter sets out the objectives of our proposed adaptive user interface, and outlines the basic principles of the design. Not all of the features of the idealised system described here are achieved by the system as implemented: chapter 4 describes the actual partial implementation of the interface described here.

3.1 Objectives: Criteria for design

Ultimately the best test of a new interface is to compare the user's performance (speed of interaction, number of mistakes etc.) using the new interface with that using the old interface it is replacing. The problem is how to decide what features will improve an interface.

It is possible to design an improved interface by using past experience: comparing different interfaces, finding out which features are good and which bad, and trying to incorporate all the good features and none of the bad in the new interface. It is not necessary to compare two or more interfaces: it is possible to analyse an interface individually and pinpoint bad areas, and then try to improve these areas. This method of iterative enhancement is a standard method for improving the design of anything, and should not be ignored. It is a very strong reason for making any new interface (or anything else for that matter) as easy to modify as possible, so that changes in the design can be experimented with, and once decided upon easily implemented. The list of desirable features in section 2.1 is mostly based upon past experience of user interfaces.

Page 28 Objectives and Basic Principles Chapter 3

A more general approach is to attempt to mimic human-human communication. In this context a good user interface is one that satisfies Turing's test for 'intelligence' (i.e. the user should be unable to tell whether it is a computer or another human that he is communicating with).

In this approach the computer constructs a model of the user's behaviour, and uses this model in its interpretation of the user's input. Note that we are considering here the case of a user sitting at a terminal (teletype or VDU), so the only behaviour that the computer can observe is whatever the user types in at the keyboard.

Due to the limited timescale of this project, and the intention of implementing the interface on microcomputers, the model must of course be rather simple. Following the method used successfully by D. Partridge and E.B. James for tolerant matching of Fortran programs [PARTRIDGE & JAMES 1976], we have chosen to attempt to model the user's behaviour (i.e. the commands typed in to the computer) by using a binary tree. This tree stores all commands that are valid in the current context, and the tree is ordered so that it reflects the pattern of usage of these commands by the individual user.
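A usage-ordered binary tree of this kind might be sketched as follows, in Python. The node layout is an assumption made for illustration: each node carries one link to the alternative tried when its keyword fails (`alt`) and one link to the subtree followed when it matches (`follow`), together with a use count. None of these names are taken from the implemented programs.

```python
class Node:
    def __init__(self, keyword, alt=None, follow=None):
        self.keyword = keyword
        self.count = 0        # number of times this keyword has been matched
        self.alt = alt        # tried when this keyword fails to match
        self.follow = follow  # followed when this keyword matches

def match(node, token):
    """Walk the chain of alternatives; count and return the node that matches."""
    while node is not None:
        if node.keyword == token:
            node.count += 1
            return node
        node = node.alt
    return None

def reorder(first):
    """Re-link a chain of alternatives so the most used keyword is tried first."""
    chain = []
    while first is not None:
        chain.append(first)
        first = first.alt
    chain.sort(key=lambda n: n.count, reverse=True)  # stable: ties keep their order
    for node, following in zip(chain, chain[1:] + [None]):
        node.alt = following
    return chain[0] if chain else None

def keywords(first):
    """List the keywords of a chain in the order in which they would be tried."""
    out = []
    while first is not None:
        out.append(first.keyword)
        first = first.alt
    return out

# A user who mostly lists files ends up with "list" tried first.
root = Node("copy", alt=Node("delete", alt=Node("list")))
for token in ["list", "list", "delete", "list"]:
    match(root, token)
root = reorder(root)
```

After the re-ordering the alternatives are tried in the order list, delete, copy, which reflects this particular user's pattern of use.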

3.2 Proposed Interface Characteristics

3.2.1 Local and Remote User Interfaces

Since, in general, applications may be distributed and remote from the user, the distributed case is considered here. Local access can be considered to be a special case of this. However, the local access case may not generalise into the remote access case. A consequence of this is that it is more obvious that at least a part of the user interface must be split off and separated from the application.


3.2.2 Host Responses

In addition to accepting different styles of user input, depending on the experience of the user, it is necessary for the user interface to be able to adapt the host's responses to suit the user. This involves tailoring both the amount of information and in some cases the content of messages, in order to use words and terms familiar to the user.

Such a full implementation of the user interface would be very complex, and is not attempted here. However, a possible extension to the current work that would provide such a full implementation is discussed in chapter 6.

3.2.3 Concurrent or Parallel Activities

The user may wish to interact with more than one activity concurrently. This can be handled by mapping multiple logical terminals into one physical terminal. The user interface described here is intended to handle only one activity, and multiple copies of it would have to be run to support multiple conversations. It should be noted that where a user is interacting with several activities at once the user's level of experience for each activity may be different.

3.2.4 Communications Bandwidth

The bandwidth of the line connecting the local system to the host affects the amount of host interaction possible, if the interface is to be constrained to exhibit a reasonable response time. This in turn determines the power (CPU speed, storage space) necessary at the local system in order to provide the user interface. A casualty of limited communications bandwidth is that extensive use of full screen displays and menus is not possible (because it is too slow), unless the repertoire is sufficiently small and stable enough to enable storage at the local system.
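The cost of sending a full screen from the host can be estimated with a short calculation, sketched here in Python. The screen size of 24 lines by 80 columns, the line speeds and the figure of 10 bits per character (8 data bits plus start and stop bits) are assumptions chosen for illustration; they are not measurements from this project.

```python
def repaint_seconds(rows, columns, bits_per_second, bits_per_char=10):
    """Time to send one full screen of characters over a serial line.
    bits_per_char=10 assumes 8 data bits plus one start and one stop bit."""
    return rows * columns * bits_per_char / bits_per_second

# Assumed line speeds in bits per second.
times = {speed: repaint_seconds(24, 80, speed) for speed in (300, 1200, 9600)}
```

Under these assumptions a full screen takes 64 seconds at 300 bit/s, 16 seconds at 1200 bit/s and 2 seconds at 9600 bit/s, which suggests why menus would need to be held at the local system on the slower lines.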


3.3 Scope of Project

This section defines the boundaries of the area that this project addresses. In particular it builds upon existing research that has analysed use of various styles of interface and is seen as complementary to ongoing work in defining standards for user interfaces. The project concentrates on providing two key aspects of user interfaces that have been neglected to some extent in the past:

- Adaption: the user interface modifies its behaviour to suit the user at that point in time.

- Protection: the user interface provides a protective layer between the user and the applications.

3.3.1 User's Requirements

We have not carried out a survey ourselves as there have been many previous surveys and discussions of user requirements, some of which are summarised in section 2. However one requirement that we would like to emphasise is that the interface to the user should match his or her level of experience. Although this has been pointed out before, the attempts actually to implement a 'variable interface' have, in general, been restricted to providing alternative 'brief' and 'full' prompts and messages, and allowing commands to be abbreviated. (Note: multi-user systems usually have different levels of privilege, to protect other users. This feature is often the most developed! However it should not be confused with altering the style of the interface to suit the 'level' of the user). We would also like to point out that over a period of time any individual may pass through several categories that have been defined in previous surveys, and thus the interface to the same person should be changing with time.

Since most other user requirements have been tackled by other workers, this work concentrates on the idea of an "adaptive interface", which is personalised to each user, and changes as his experience changes. The user's experience may decrease as well as increase, due to less frequent use of an application (possibly because of increased use of another application or changes in the application).

3.3.2 Undo Facility

Ideally it should be possible to undo all the effects of any command given in error. Such an undo facility should enable the user to backtrack an arbitrary number of steps. The reason why this is desirable is that the user may not fully realise all of the effects of a command until after its execution has been confirmed.

An undo facility cannot be provided by the user interface alone. This facility can only be provided by the application, which knows all the hidden side effects (as well as the main effect) that must be reversed. It may in fact be impossible to undo some commands, or they may only be partially undone. Because this is one desirable function of a user interface that cannot be separated out into an independent user agent we have not attempted to provide it in our system. However it is highly desirable that applications are designed to have reversible commands so that an undo facility could be provided.

3.3.3 Confirm Option

It is not reasonable to request the user to confirm every action (after some feedback on the effects of the command). However if the user interface is intelligent enough to detect harmful or 'unlikely' commands then it could request the user to confirm that the action is intentional. It is impossible for the user interface to detect how potentially damaging a command is: this information can only come from the application. However the user interface may be able to detect unusual commands automatically, where 'unusual' commands are either infrequent commands or more common commands in an unusual context.
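One possible way of detecting 'unusual' commands is sketched below in Python. It is an illustration under stated assumptions rather than part of the implemented system: the class name, the context names and the 5% cut-off are all hypothetical, and the cut-off would in practice need to be tuned to the individual user.

```python
from collections import Counter, defaultdict

class UsageModel:
    def __init__(self, threshold=0.05):
        self.threshold = threshold          # assumed cut-off, to be tuned per user
        self.counts = defaultdict(Counter)  # context -> command -> times used

    def record(self, context, command):
        self.counts[context][command] += 1

    def is_unusual(self, context, command):
        """True if the command accounts for less than `threshold` of the
        commands previously given in this context (or has never been given)."""
        seen = self.counts[context]
        total = sum(seen.values())
        if total == 0:
            return True
        return seen[command] / total < self.threshold

model = UsageModel()
for _ in range(40):
    model.record("shell", "list")
for _ in range(10):
    model.record("shell", "delete")
for _ in range(30):
    model.record("editor", "insert")
model.record("editor", "delete")
```

Here "delete" is a common command in the hypothetical "shell" context but an unusual one in the "editor" context, so only the latter would lead to a request for confirmation.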


3.3.4 Standard JCLs

We have not attempted to define, or make use of, a 'standard JCL'. There are several groups currently attempting to define standards and we regard our work as being complementary to theirs. It is necessary, of course, to define a JCL for use with our system, and we would propose to use a standard JCL when one has emerged. However for the moment we have based our "initial JCL" on the JCL used on the ICCC mainframes to which our system is connected. It would be easy enough to change the initial JCL to anything else at any time, since its definition is read from a file, rather than being built in to the interface programs.

3.4 Protective Interface

The user interface is considered here to be separate from the application with which the user interacts, acting as a protective layer to shield the user from the 'raw' interface provided by the application. This enables us to concentrate on the user interface rather than the application. This approach also leads to a more consistent interface being presented over a variety of applications. From an implementation point of view sharing common code to handle aspects of the user interface should both reduce development effort on the application (in order to achieve a similar level of functionality in the user interface) and also reduce overall code sizes in the total system.


3.5 Basic Principles of Design

3.5.1 Desirable Features of a Man-Machine Interface

This section lists the set of desirable features that will be used as the guiding principles in the design of the user interface. These are derived from the work that is described in section 2.

1 Messages must not be cryptic. Explanations of errors etc. must be given, using keywords - not code numbers or letters.

2 There must be some (standard) way of finding out the form of input required at any stage.

3 Both expert and novice users should be catered for.

4 There should be a systematic way of finding further information on a query, without abandoning the current activity.

5 There should be a standard method of escape from a query.

6 Overall conventions should be consistent throughout the system (therefore the interface should be implemented at a very low level).

7 There should be an 'undo' facility.

8 Abbreviations (used by the user) should be expanded as confirmation of a valid command.

9 The user should be able to interrupt the computer at any time.


3.5.2 Adaptive Interface

The man-machine interface MM/1 (see section 2) caters for a range of user levels by allowing abbreviations and also by allowing all parameters to be on one line, or the user can be prompted for each individually. It also includes a standard method for obtaining more information if the user requires it.

An adaptive interface uses its past experience (of the individual user and of all users) to control such things as how brief or verbose prompts are, and to assess which commands the user is most likely to give at every stage, e.g. the new user will get help unless he specifically suppresses it (instead of the other way round), and non-unique input may be allowed in some circumstances. The interface would respond by echoing the most likely command, and checking with the user whether that was what was intended.

3.5.3 Separate User Interface from Application

It is important to separate the user interface from the application program. The interface forms a "protective" buffer between the user and the application. The advantages of this approach are:

- the user interface is not considered as an afterthought, but due attention is paid to its design.

- the user interface subsystem can be re-used for many applications, giving both a common style to all the applications and making the development of a sophisticated user interface more cost effective.

- several user subsystems may be developed, tailored to the needs or preferences of different users (e.g. command-line and menu-driven interfaces could be offered as options for the same application program).

- the user interface subsystem may be implemented on a separate processor. The advantages of this are considered in section 4.3.1.

These benefits are only fully obtained if a standard interface between the user interface subsystem and application program is defined. This interface could take many forms, for example subroutine calls, message exchanges, or text strings passed between separate processes. The last method can be used if the user interface is built as a front-end, without any modification of the application program. It also lends itself readily to implementing the user interface on a separate processor.

The disadvantage of this approach is that the front-end needs to have a model of the conversation in addition to the existing model in the application program. That is, it is necessary to input the tree defining the interaction with the application to the interface program when a new application is provided, and whenever any changes are made to the application that affect the user interface it is necessary for the interface definition to be kept in step. In addition to the extra implementation and maintenance effort that this entails it is also a source of possible errors. This duplication can only be avoided if the application program is designed from the start to use a separate user interface.

3.5.4 The "?" Command

At any stage when prompted for input the user can type a "?" to obtain a description of what commands are valid. If a keyword is expected, then a list of valid keywords is output, in decreasing order of use as determined by their order in the local tree. If there are more than ten possible keywords, then only the first ten are output followed by "etc." to indicate that there are more. This limit is set so that the user does not need to read through a long list, since it is most likely that the keyword he wants is in the first ten anyway, and to prevent losing text due to the scrolling of the text on the screen. A full text can be obtained with the "help" command. In our simple prototype version there are not usually many more than ten keywords to choose from. In larger systems it would probably be necessary to add some way of getting further lists of keywords (e.g. typing "?" again could produce the next ten keywords etc.).

If the expected input is not a keyword but is a filename or a number, then an appropriate message is output to the user. This message gives a description of the expected input; e.g. "Please enter an integer between 1 and 99".
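The output generated for "?" can be sketched as follows, in Python. The function names and the use of commas as separators are assumptions made for this illustration; the keywords are taken to be supplied already in tree order, i.e. in decreasing order of use.

```python
def describe_keywords(ordered_keywords, limit=10):
    """Return the text shown for "?": at most `limit` keywords, most used first,
    followed by "etc." if any keywords have been left out."""
    shown = ordered_keywords[:limit]
    text = ", ".join(shown)
    if len(ordered_keywords) > limit:
        text += ", etc."
    return text

def describe_integer(low, high):
    """Return the message shown for "?" when an integer in a range is expected."""
    return f"Please enter an integer between {low} and {high}"
```

Because both texts are derived from information already held for matching, nothing extra has to be typed in when the interface definition is created.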

It should be noted that all this output is generated automatically by the interpreter by scanning the local tree so it does not have to be explicitly typed in, in addition to the keywords for matching. It is, however, possible to override this automatic facility by including the character "?" as a keyword, which will then lead to whatever action is specified in the node network. (Typically this would be some output nodes, giving descriptions of the commands instead of just listing them.) This technique can also be used to hide some of the commands if for some reason it is desired to prevent the user obtaining a complete list of all commands.

3.5.5 Non-unique Abbreviations

The implemented system does not allow non-unique abbreviations. This approach was taken for safety: if non-unique abbreviations were allowed, then there is the possibility of user confusion since there is a danger of the effect of such an abbreviation changing as a side effect of the tree optimisation process.

It might seem useful where there are a large number of keywords all starting with the same letter to allow non-unique abbreviations. The tree re-structuring mechanism would mean that the most likely keyword would be chosen in such circumstances. However, if the pattern of use changed (permanently or temporarily), then the effect of the abbreviation would change. While this would lead to a reduction in the number of characters typed, once the new pattern of use has stabilised, it would probably lead to considerable user confusion.

3.5.6 Defaulting Options

It is standard practice to have a set of defaults for many parameters of a command, so that they do not all have to be specified. This is especially true for complex commands with many parameters. However this can present a problem for the novice user because if a command is entered without an optional parameter that was necessary to achieve the intended effect then the command will be accepted but will do something else. Generally speaking the defaults may be aimed at making the commonest use of the command the shortest, rather than making the novice user's most likely first use the shortest. This is less of a problem in a menu-driven interface where the user can see all the optional parameters and their defaults, and is therefore more aware of them.

One way around this is to insist on checking with the user on all parameters the first time a command is used. The set of parameters given by the user could then be used as the default in the future, instead of having a static system-wide set of defaults. Another possibility would be to define a hierarchy of digraphs, with users formed into groups, so that defaults etc. could be defined on a group basis instead of having to be on a system-wide basis.

Note that a new default should not be set up every time a user specifies a different parameter in a command, unless explicitly requested by the user. However if a different value is consistently requested over a period of time then it would obviously be more convenient if it became the default.
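The rule described above might be sketched as follows, in Python. The sketch interprets "consistently requested" as the same non-default value being given explicitly on a fixed number of consecutive uses; this interpretation, the class name and the run length are assumptions made for illustration.

```python
class AdaptiveDefault:
    def __init__(self, system_default, run_length=5):
        self.default = system_default
        self.run_length = run_length  # assumed meaning of "consistently"
        self.candidate = None         # non-default value currently being repeated
        self.run = 0                  # consecutive explicit uses of the candidate

    def use(self, value=None, make_default=False):
        """Return the value to apply; value=None means the parameter was omitted."""
        if value is None:
            # Accepting the default breaks any run of a different value.
            self.candidate, self.run = None, 0
            return self.default
        if make_default:
            # Explicit request by the user to change the default.
            self.default, self.candidate, self.run = value, None, 0
            return value
        if value == self.default:
            self.candidate, self.run = None, 0
            return value
        if value == self.candidate:
            self.run += 1
        else:
            self.candidate, self.run = value, 1
        if self.run >= self.run_length:
            self.default, self.candidate, self.run = value, None, 0
        return value
```

A single use of a different value therefore leaves the default unchanged, whereas an unbroken run of the same value eventually replaces it.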


3.5.7 "Forced" Matching

In fact, the interpreter works on the principle of accepting the first word in the tree (highest up) which does not fail to match. Thus a match may be achieved when ambiguous incomplete keywords are entered by the user. The tree is ordered so that this will normally be the most likely correct match for that user. The tree is re-ordered from time to time based on the "count" nodes. In addition, branches may be added automatically, so that the tree grows to reflect the particular user. As the user's pattern of interaction evolves the interface adapts to preserve the property of forcing a match against the currently most probable keyword.

Although the interpreter supports this capability it will only be used if the jump nodes are set up appropriately in the tree. Our demonstration does not make use of this capability (see section 3.5.5).
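The first-match rule, and the way its result depends on the ordering of the tree, can be sketched as follows in Python. A flat list ordered by use count stands in for one level of the tree; the function names, keywords and counts are hypothetical.

```python
def forced_match(ordered_keywords, typed):
    """Return the first keyword, in tree order, that `typed` does not fail to
    match, i.e. the first keyword of which `typed` is a prefix."""
    for keyword in ordered_keywords:
        if keyword.startswith(typed):
            return keyword
    return None

def reorder_by_count(counts):
    """Order keywords by decreasing use count (ties keep their existing order)."""
    return sorted(counts, key=lambda k: counts[k], reverse=True)

counts = {"delete": 9, "directory": 4, "dump": 1}
before = forced_match(reorder_by_count(counts), "d")

counts["directory"] += 10   # the user's pattern of use changes
after = forced_match(reorder_by_count(counts), "d")
```

In this sketch the abbreviation "d" selects "delete" before the change in usage and "directory" after it, which is the hazard that led to non-unique abbreviations being excluded from the demonstration (section 3.5.5).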

3.5.8 Feedback from User to Interface

The user interface will sometimes guess the user's intentions when presented with invalid input. It is obviously very dangerous to allow the guess to be acted upon without first checking with the user that the guess is indeed what was intended. However the user may find this very tedious if in the habit of entering numerous trivial errors. Again, this is a feature that needs to be tailored to the individual's preferences. It is arguable whether correcting a user's errors automatically leads to sloppiness on the part of the user.

3.5.9 Confirmation of Commands

If the user interface is going to correct, modify or expand what the user has actually input then it should tell the user what the new command looks like. This allows the user to check that the system is going to do what was intended by the user. It also informs/reminds the user of what the full command looks like. This helps to keep the user better aware of what is happening.

If the user interface has made any changes to the user's input then in order to be perfectly safe it would be necessary to check that expanded command with the user before acting on it. In practice this would be very tedious and time-consuming. It is thus better for the user interface to establish some 'confidence level', and only to query the user if this falls below some level. It is desirable that the confidence level at which queries are raised with the user should be tailored to the individual user. This could be under manual control, or automatically adaptive (but on what criteria?).

3.5.10 Responses from Computer to User

Some systems (e.g. Unix) are silent if successful. This lack of positive response may suit the expert, by keeping down the amount of information he has to read while not actually reducing the useful information content. However novice users might prefer a more positive acknowledgement that a command has executed successfully.

Silent successful completion is particularly likely to be a problem if a command is considered to be successful if it does nothing (e.g. delete a non-existent file), in which case it may be important to inform the user whether anything was done.

In the case of commands/programs that take a long time to execute with no visible output it is a good idea to output something to the console to tell the user that the system is still alive. This needs to be actively updated by the host computer - it is not adequate to rely on the terminal displaying a message that has to be cancelled by the computer if it is to provide any real user confidence.


3.5.11 Command Files: Hierarchical Structure of Filing System

Most operating systems / command languages work on the principle that the first token on a line is the name of a file to be executed (object code or O.S. commands), and the later tokens are parameters to be passed to the executing program. In most systems (e.g. CDC and IBM) this leads to a very large number of commands.

We propose to reduce the problem this presents by organising the commands as a tree. This means identifying a command by several tokens. (Or are they parameters for the first token? The distinction between the command and its parameters becomes hazy.) This reduces the number of possibilities at each stage.
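The reduction in the number of possibilities at each stage can be illustrated by a small sketch in Python. The command set is hypothetical, and nested dictionaries are used here only as a convenient stand-in for the tree.

```python
COMMANDS = {   # hypothetical command set, organised as a tree of tokens
    "file": {"copy": {}, "delete": {}, "rename": {}},
    "print": {"queue": {}, "cancel": {}},
    "job": {"submit": {}, "status": {}, "cancel": {}},
}

def choices(tree, tokens):
    """Return the tokens that are valid next, given the tokens typed so far."""
    node = tree
    for token in tokens:
        node = node[token]   # raises KeyError for an invalid token
    return sorted(node)

def count_commands(tree):
    """Count the complete commands (leaves) in the tree."""
    if not tree:
        return 1
    return sum(count_commands(sub) for sub in tree.values())
```

In this example there are eight complete commands, but the user never has to choose between more than three alternatives at any one stage.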

To a certain extent the capability for this structure already exists in operating systems that have a hierarchical file structure (e.g. OS 4000, UNIX, RMX/86), since commands are filenames, which consist of a concatenation of tokens, and the equivalent of our '?' command is to list a subdirectory. However some filenames in these systems are not command files; data files and command files are mixed together, which makes this approach unusable in practice at present.

3.5.12 Distinction Between a Command and its Parameters

It is difficult to know where to draw the line between distinct commands and commands with parameters that modify the effect of the command. The effect of parameters may be to select a particular section of code within a program, in which case there is very little difference. On the other hand a parameter may specify an item of data (e.g. file) on which a command is to act.

Conventionally the command (main command) comes first, followed by the parameters, but sometimes part of the command comes after the 'data' part, e.g. switches on commands to PIP in RSX 11M or CP/M.


3.5.13 Recalling Earlier Input Lines

Many systems have a way of recalling the last line, or one of the most recently entered lines, input to the system, e.g. "?Q" on OS 4000 (GEC 4000 series); MfRH on RMX 86 (Intel 8086-based systems); "!" on UNIX c-shell. This line may then be edited in the input buffer before being sent by pressing carriage-return or enter. This type of facility is described more fully in section 2.

However, this is always the last line input, whatever the context. It would be more useful if the last line returned to the input buffer was the last line entered while in the same context. For example, if the editor is invoked to edit a file, and at some later time this is exited to return to the command line interpreter of the operating system, then at this stage the "recall last input line" facility would recall the last command given to the operating system, rather than the last line given to the editor. In this particular case it would be the line that was used to invoke the editor.
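A context-sensitive recall facility of this kind might be sketched as follows, in Python. The class name, the context names and the example input lines are hypothetical.

```python
class RecallBuffer:
    def __init__(self):
        self.last = {}   # context name -> last line entered in that context

    def enter(self, context, line):
        """Record a line typed by the user while in the given context."""
        self.last[context] = line

    def recall(self, context):
        """Return the last line entered in `context`, or None if there is none."""
        return self.last.get(context)

buf = RecallBuffer()
buf.enter("os", "edit report.txt")   # invokes the editor
buf.enter("editor", "insert 5")
buf.enter("editor", "quit")          # returns to the operating system
```

After leaving the editor, a recall in the "os" context returns the line that invoked the editor rather than the "quit" that was the last line typed overall.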

Another possible extension to this type of facility would be to maintain a stack of the last few lines entered, instead of only one. This would require some mechanism for specifying which line to recall, such as following a control character with a digit, indicating how many lines back to go. Another possibility would be to allow repeated "last input line" characters (commands), each stepping one back and displaying. This second method, although slightly slower, would probably be easier to use in practice. This facility is provided by the Unix c-shell and the RMX-86 Human Interface.

A stack is used to record the progress through the tree, so that it is possible to backtrack up through it. By making use of this it is possible to re-create the last input line, or the one before etc., until the end of the stack is reached. The number of commands it is possible to go back depends on the size of the stack. (Note this stack could alternatively be regarded as a trace buffer). It is not possible to use backwards links in the digraph, rather than a stack, because the digraph may contain loops.

In order for this to work correctly it is necessary to store the values of parameters (see description of 'param' nodes) on the stack too. This is because it is possible to go round the network in a loop, corresponding to giving a similar command (with different parameters) several times. This applies even if there are other intervening commands.

The stack is always emptied whenever control passes through the root node, so it is impossible to backtrack past the root node. The root node may be deliberately included in the 'command loop', so that a small stack and tree may be used, or it may only occur at the beginning of a session, and/or when an error occurs.
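The behaviour of the stack can be sketched as follows, in Python. The class, the node names and the depth limit are assumptions made for illustration; in particular the sketch records one entry per node visited and does not attempt to reproduce the layout used by the interpreter.

```python
class TraceStack:
    def __init__(self, max_depth=16):
        self.max_depth = max_depth   # assumed limit; the oldest entry is dropped
        self.entries = []            # (node name, parameter value or None)

    def visit(self, node, param=None, is_root=False):
        """Record a visit to a node; passing through the root empties the stack."""
        if is_root:
            self.entries.clear()
            return
        self.entries.append((node, param))
        if len(self.entries) > self.max_depth:
            self.entries.pop(0)

    def backtrack(self):
        """Remove and return the most recent entry, or None at the end of the stack."""
        return self.entries.pop() if self.entries else None

stack = TraceStack()
stack.visit("root", is_root=True)
# The same two nodes are visited twice, in a loop, with different parameters.
stack.visit("print")
stack.visit("filename", "a.txt")
stack.visit("print")
stack.visit("filename", "b.txt")
```

Because the parameter values are stored alongside the nodes, backtracking distinguishes the two passes round the loop; once control passes through the root again nothing further can be recalled.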

3.5.14 Split-Screen Display

In the prototype version, use was made of the memory mapped video display, in that the normal scrolled display was restricted to the bottom 22 lines, and the top line was used for 'status' display. The second line contained a row of dashes to separate the two areas. This was done originally for debugging purposes, but it can be useful for the user to have a status display so that he knows when the computer is busy, and when it is prompting for input.

In the past many terminals did not lend themselves easily to this kind of split-screen display. Most microcomputers do, however, and it is probable that use will be made of this, as it is already being used widely in 'display terminals' used on mainframes (e.g. IBM 3270 type terminals).

This technique is especially popular with editors, since in this case there are two well-defined types of text to display: the text being edited, and the user's commands and the computer's responses.


The different types of text on the screen (input, output, status) could be separated by using different colours.

It should be noted that 'status' is actually the output from another process. It is common for input to and output from more than one process to be mixed on the screen. This may actually be beneficial if the user can interact happily without knowing which process each message is from/to (cf. "?P" on OS4000, which allows the user to explicitly direct his input to a named process).

3.5.15 Alternative Input Devices

Pointing devices, such as the mouse, light pen, tracker ball etc. can be used for graphical input and for selecting between options in a menu-driven environment. They cannot be used for text input and are therefore an addition to some other input device that does allow for the free format input of text, rather than a replacement. This other input device might be a keyboard, pad, speech recogniser etc. At the moment the keyboard is still the most common input device, and is likely to remain so for a few years yet (probably in conjunction with a mouse). The general principles of the proposed user interface would apply to any of these input devices, although the detailed description here assumes a keyboard, and contains features tailored to keyboard input. The principles described would also apply to voice-recognition input, where the ability to predict the likely words would help with the recognition process.

CHAPTER 4: DESCRIPTION OF PROGRAMS

This chapter describes the programs that were developed in order to demonstrate the principles outlined in chapter 3. These programs are a partial implementation of the proposed system. The shortcomings and possible enhancements are discussed in chapter 6.

4.1 General Principles

4.1.1 Scope of Demonstration System

A working system was developed along the lines suggested in Chapter 3 in order to demonstrate some of the features of the proposed system. This is a demonstration system rather than a complete implementation embodying all the features and principles outlined in Chapter 3. Chapter 6 will suggest how the current system could be extended in order to provide an enhanced user interface.

The demonstration system is based around the concept of using digraphs (directed graphs) to model 'conversations' between a user and an application. The system needed to create, modify and use these digraphs is split into a number of components implemented as separate programs. The two most important of these are an interpreter that implements the user interface and a re-structuring program which re-structures the digraph on the basis of information recorded in it by the interpreter. The other programs are concerned with the manual creation, inspection and modification of the digraphs.

Page 45 Description of Programs Chapter 4

4.1.2 A General-Purpose Interface

The method chosen for implementing the interface is intended to be suitable for a wide variety of applications, simply by changing the digraph, which is in effect an interface description/definition. All that is necessary is that it should be possible for the interpreter to intercept all communication between the user and the application program that the user is communicating with.

In some operating systems the interpreter could be run as a concurrent program in the time-sharing environment, with all the program's I/O "re-directed" to the interpreter. In operating systems like CP/M, the interface could be installed by changing the low-level terminal I/O routines to call the interpreter, which would then be in the path of all terminal I/O. However this is unlikely to be feasible on 8-bit microcomputers because there will not be sufficient program memory to run both the user interface program and a useful application. The particular system we have implemented uses another approach: the interpreter runs on a separate processor, interfacing to a program which runs on the main processor.

It is possible to increase the number of conversations (from two) so that, for example, the interface could deal with both the operating system and/or applications on the local processor and also on one or more other processors. The opportunities that this presents are discussed in chapter 6.

Although the implemented system supports only two conversations, it does have some extra node types accepted by the interpreter that involve both processors. These are for transferring files to and from the local floppy disk, thus allowing the user to transfer files between the mainframe and the microcomputer. This type of facility will be useful, we anticipate, in the networks of microcomputers that are likely to replace the time-sharing systems of today.


4.1.3 Dynamic Re-structuring of the Digraph

The structure of the digraph reflects the normal pattern of conversation between a user and a computer. The structure needs to be updated dynamically and automatically as more is learnt about the user's pattern of interaction and as the user's pattern of use changes. In general, users do not remain "novices" for ever but progress towards some level of expertise in their interactions with some applications, although remaining novices in their interactions in less frequently used areas. As they progress, their requirements of the user interface change. A novice user will normally prefer a verbose conversation with the computer, giving long detailed prompts and guidance on the expected input.

It should be noted that the same user may need to interact using the whole range of "experience levels" at any stage, according to how familiar he is with the current phase of the conversation. It is the norm for a user who is expert in the use of one or more packages to occasionally use others with which the user is not familiar. It is therefore essential that the user interface is able to move smoothly between the different user levels.

4.1.4 Constructing Complete Commands

One of the fundamental features of our system is that if a command line is incomplete, the system will assist the user in completing it, rather than simply discarding the user's first input. It does this by prompting for the missing parameters, one at a time, so that a command line can be built up, one token at a time, into a complete command that the system can act upon. This can be seen as an extension of the idea of line editing, which is provided on all computers to some degree.

The advantage of building a line token by token is that at each stage the "?" and "HELP" commands can be used to help decide what to type next.

Thus, a user is helped through the tree that defines all possible commands as a sequence of tokens.

4.1.5 Programs Developed and Their Functions

The actions carried out on or under the control of the digraphs can be split into several phases, and it is convenient to implement these as separate programs, although in principle they could all be combined into one program. The reasons for splitting the system into several programs are discussed further in section 4.3.4. The programs are summarised here and described individually in more detail in sections 4.4 to 4.7.

There are four programs. These are:-

TREEM (tree-matcher):

This is the interpreter which provides the adaptive user interface.

Its actions are controlled by a digraph stored in "internal format". Statistics on the usage of paths through the digraph are stored in the nodes.

RESTRUCT:

This re-structures a digraph that is in "internal format". It scans through the digraph, looking for local trees, and then re-orders the branches in each local tree so that the most frequently used ones are near the top. Jump nodes may be moved: all jump nodes are removed and may then be re-inserted so as to allow any specified minimum abbreviation length, which may or may not be the same as it was before.

BLDTRE:

The node-networks used by TREEM and RESTRUCT are in an "internal format", in memory during execution, and on disk between runs. This coded and compressed format could be created and edited by a normal text editor while on disk, but it would be very difficult to "program" correctly. Therefore another "input format" has been defined, and BLDTRE converts from this to internal format. This "input format" was designed to be simple and to be fairly close to the form of the digraph used internally by TREEM. However, it would be perfectly feasible to write other programs to convert from any CAL-like language to internal digraph form.

UNBUILD:

This program performs the inverse function of BLDTRE. It reads a file containing a digraph in the internal format used by TREEM and produces a text file that is printable and may be edited with any standard text editor. The text file produced is in the format accepted by BLDTRE. This program has two main purposes. One is that it is possible to produce a human-readable form of the digraph after it has been modified by TREEM and RESTRUCT in order to examine their effect on the digraph (mainly for testing purposes). The other is that it allows manual editing of the digraph, in order to clean it up or to quickly change its structure.
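The re-ordering step described for RESTRUCT can be illustrated with a minimal sketch. It is written in Python rather than the Fortran of the actual programs, and it assumes a much simplified picture in which each alternative of a local tree is a single record carrying a usage count and a down link to the next alternative. The names Branch, reorder_by_usage and keywords are hypothetical and do not come from the programs themselves; jump-node removal and re-insertion are not shown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Branch:
    """One alternative in a local tree (hypothetical simplification)."""
    keyword: str
    count: int = 0                    # usage statistic recorded by the interpreter
    down: Optional["Branch"] = None   # next alternative, tried if this one fails


def reorder_by_usage(top: Optional[Branch]) -> Optional[Branch]:
    """Re-link a chain of alternatives so the most used ones come first."""
    branches: List[Branch] = []
    node = top
    while node is not None:
        branches.append(node)
        node = node.down
    # The sort is stable, so equally used branches keep their original order.
    branches.sort(key=lambda b: b.count, reverse=True)
    for current, following in zip(branches, branches[1:]):
        current.down = following
    if branches:
        branches[-1].down = None
    return branches[0] if branches else None


def keywords(top: Optional[Branch]) -> List[str]:
    """List the keywords of a chain from top to bottom."""
    found = []
    while top is not None:
        found.append(top.keyword)
        top = top.down
    return found
```

With counts of 2, 7 and 2 recorded against three alternatives, the alternative used 7 times moves to the top and the other two keep their relative order.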


4.1.6 Output Categories: Computer Output, Input Echo, Status

The text on a VDU screen can be split into three categories. These are:

1: output from the application running on the host computer

2: an echo of what is typed at the keyboard

3: status information from the computer system(s) and communications network

For much of the time most of the text on a screen is usually of category 1 (output from the host computer). The major exception is when the user is entering a large quantity of text, for example when running an editor or word processor application, when category 2 (the user's input) may predominate.

Category 2 is necessary to provide feedback to the user, partly so that the user does not have to remember what he has typed so far, and partly in case of keying errors.

Category 3 is really different from category 1, and usually comes in smaller quantities. It may at any time be coming from a different process from that supplying category 1 output. However, the distinction between 1 and 3 is often hazy, largely due to the fact that they often cannot be distinguished on an ordinary VDU or teletype. The main exceptions are the IBM 3270-type terminals where some status information is displayed separately at the bottom of the display screen.

Most terminals and VDUs are based on the old teleprinters, which were intended to communicate with another human sitting at another teleprinter. With mechanical printers it is easiest to mix everything up together, but with VDUs this is no longer necessary. In fact, it has become common practice to separate text and command areas in editors (TECO, TXED, CREDIT, XEDIT), especially on microcomputers, which often use memory-mapped displays that lend themselves to efficient implementation of "full screen" editors.


On mainframes this splitting into areas is to be found on IBM 3270 type displays and similar terminals, which are usually split into a large output area, a small status area of one or two lines, and a one- or two- line input area. Since most computers accept input one line at a time, this is a convenient size of input area.

In our interface we have chosen to have input echo and computer output mixed, with status displayed separately on the top line. This is so that it can be used with any conventional VDU or hard copy terminal. In fact, it is only the user interface status that is displayed on the top line, and this can be discarded (with some loss of information to the user); the computer's status information is mixed with its output as normal.

4.1.7 Syntax vs Semantic Problems

The errors that a user makes in command lines fall into two categories:

1) The user knows what he wants to do, but does not get the exact syntax correct.

2) The user gives a valid command, but it does not do what he expected or wanted.

The protective interface can help with type 1 errors by prompting stage by stage, correcting typing errors etc.

Since the protective interface knows nothing about the application it cannot help with type 2 errors. It is important that the keywords used by the application are chosen sensibly. The interface can be used to change keywords, by mapping between different words on each side, if it is desired to alter these keywords from the user's point of view without changing the application code.


There are two different types of help available, corresponding to the two different types of problem:

1) Typing a "?" will cause the interpreter to list the contents of the local tree whose root is the current node. This has the effect of listing the keywords that are valid input at the current stage. This helps with getting the syntax right if the user can guess which keyword he wants to use. This is particularly useful as a "memory refresher" for users who did know what commands to use, but have forgotten the precise syntax since last using the system, or where the format has been changed slightly since they last used the system.

2) If it has been built into the application program (and ideally it always is), typing "HELP" will give some information on commands. This should specify what the actions of commands are. It should not simply list the commands available, since this is provided automatically by the protective interface.

4.1.8 Command Sequences

Operating systems usually provide some sort of 'submit file' feature, allowing a frequently used group of commands to be executed by giving only the name of a file containing the commands. This is an essential feature of a good user interface. Sometimes the submit file has to be 'called' (e.g. CP/M: 'SUBMIT file', NOS: '-file' or 'CALL file'), but a better scheme is to simply type the filename (e.g. as in UNIX, VM/370, CP/M+, VAX/VMS). This means that users can define their own commands which appear to be similar to inbuilt operating system commands, built up from other commands, thus providing in effect a very high level programming language suitable for users familiar only with this level of interaction.

This also allows the user to override the standard operating system command of the same name, to provide a different (customised) version.

In our system this type of feature is not implemented explicitly so much as implicitly. The system will come to recognise frequently used command sequences, and once started on such a sequence will tend to stick to it. Extensions to this mechanism to allow the naming of subtrees, providing the ability to define macro commands, are discussed in chapter 6.

4.2 The Digraph

The actions of the adaptive interface are controlled by a digraph of nodes. The adaptive nature of the interface is achieved by allowing the user interface to itself restructure the controlling digraph.

4.2.1 Introduction

The nodes within the digraph may be one of eleven types. The different types of nodes specify what action the user interface is to take, e.g. match user input against a character, output a character etc. A complete list of node types and descriptions of the actions that they specify is given in section 4.2.5. Each node has two outgoing links pointing to two alternative nodes. Many of the nodes (e.g. match against user input) have the concept of success or failure. The path to be followed depends on the "success" of the node's action. Success and failure are defined differently for different node types, and some node types do not have a failure mode.

4.2.2 Terminology

The terminology used in both this text and within the programs is rather loose: the terms "digraph" and "network" are used to describe the set of linked nodes that drive the interface interpreter. However, strictly speaking it is a multigraph, as it is possible for both links of a node to point to the same successor node, whereas in a true graph all edges or arcs are distinct. The out-degree of all nodes is two, and the in-degree is always at least one. However one or both of the links (out-going arcs) may be special links (implicitly connected to the fail node). There may be any number of arcs going into a node.

4.2.3 Local Trees

Within the digraph there are many instances of 'local trees', i.e. areas of the digraph that do not contain any loops. These correspond to the points at which a selection is made from a variety of input strings that are valid at the current phase of the conversation.

If all nodes other than MATCH nodes are ignored, and the links from the MATCH nodes to other node types are considered null, then the resulting 'islands' in the digraph will each be in the form of a binary tree:

    → command1 → param1.1 → param1.2
         ↓
      command2 → param2.1
         ↓
      command3 → param3.1 → param3.2
         ↓

However in a sense these are not true trees, even if their links into the rest of the digraph are ignored, because there are no true leaf nodes. All nodes have both right and down links, and although they may take a special null value this is effectively the same as a link to the fail node.


    → command1 → param1.1 → param1.2
         ↓          ↓          ↓
         ↓         fail       fail
         ↓
      command2 → param2.1
         ↓          ↓
         ↓         fail
         ↓
      command3 → param3.1 → param3.2
         ↓          ↓          ↓
         ↓         fail       fail
         ↓
        fail

Since there are no empty link fields in the nodes of these 'local trees' they cannot be made into threaded trees. Therefore it is necessary for the programs to maintain a stack of nodes traversed if they are to perform any backtracking.
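The need for an explicit stack can be illustrated with a short sketch that enumerates the keywords held in a local tree, much as the "?" command lists the valid input at the current stage. This is an illustration in Python, not the code of TREEM. The names Node, chain and list_keywords are hypothetical, and the sketch makes one simplifying assumption that differs from the real digraph: a null right link is taken to mark the end of a keyword, and a null down link means that there are no further alternatives.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    char: str
    right: Optional["Node"] = None   # followed when the character matches
    down: Optional["Node"] = None    # followed when the match fails


def chain(word: str) -> Node:
    """Build a run of nodes linked to the right, one per character."""
    head = Node(word[0])
    tail = head
    for ch in word[1:]:
        tail.right = Node(ch)
        tail = tail.right
    return head


def list_keywords(root: Optional[Node]) -> List[str]:
    """Enumerate every keyword in a local tree using an explicit stack."""
    found: List[str] = []
    stack: List[Tuple[Optional[Node], str]] = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        if node is None:
            continue
        # Remember the alternative (down link) so that we can backtrack to it
        # after the path to the right has been explored.
        stack.append((node.down, prefix))
        word = prefix + node.char
        if node.right is None:
            found.append(word)      # assumed end of keyword
        else:
            stack.append((node.right, word))
    return found
```

The stack holds the pending down links together with the characters matched on the way to them, which is the information a threaded tree would otherwise have provided.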

These local trees can be further partitioned into a tree of trees. Typically the top level tree represents the alternative commands expected, and within a branch representing a command there are further trees for the parameter options.


    → command1 → option1.1.1 → option1.2.1
      |               ↓             ↓
      |          option1.1.2   option1.2.2
      ↓
      command2 → option2.1.1
      |               ↓
      |          option2.1.2
      ↓
      command3 → option3.1.1 → option3.2.1
      |               ↓             ↓
      |          option3.1.2   option3.2.2
      |               ↓
      |          option3.1.3
      ↓

4.2.4 Local Function Nodes

Most of the node types in the digraph are concerned either with parsing the user's input or with generating output sent to the host. However, there are in addition some node types concerned with actions local to the user interface machine (a microcomputer in this case). In the example system described here these local function nodes allow the user to exit from the user interface program and return to the local system's operating system, and to transfer files in both directions between the local machine and the host.

To the user there is nothing explicit to indicate the difference between commands that have local effects instead of, or in addition to, having an effect at the host. The user sees a seamless interface that hides the distributed nature of the processes with which he is interacting.


4.2.5 Node types and Their Actions

This section describes the actions the user interface interpreter (TREEM) takes when it is processing each of the node types.

IMMED (Immediate Match)

The character stored in the node is compared with the next character from the specified input stream (channel/conversation). If the characters match, control is passed to the node on the right; if the match fails, control passes to the node below. If the match was successful, the character is echoed to the console (user output stream), whichever input stream it was read from, and the next character from the appropriate input stream is read. Note that if both characters are delimiters, as defined by the set of delimiters held in the delimiters array, then the match is successful even if the characters are not identical.

KOPY

Text is copied unmodified from one stream to the other (i.e. console to mainframe or vice versa). The stream number in the node specifies the source stream. The terminating carriage return is copied too. If the stream number is 1 then the first character is removed before copying the rest of the line from the user's console to the mainframe communication link port.

This is because the first character on the line is the "escape" character (e.g. "$") that was used to cause invocation of the KOPY node. This node is used when it is desired to allow the user to by-pass the interpreter, and give commands directly to the mainframe. The copied line is also echoed to the console.

NFAIL

Control is passed to a fail node after any kind of error. Normally there is only one fail node, and control passes to it when the user gives an invalid command. An error message and a list of valid keywords are displayed, then control is passed to the root node, ready to accept another command.

NTERM (terminator)

It is checked that the next character on the input channel specified is a line terminator (i.e. carriage return). Control is passed to the right if it is the end of line, and down if there is more text in the input buffer. It is in effect an IMMED node storing an end-of-line character, but it does not attempt to read another character from the input buffer if successful (whereas an IMMED node does).

JUMP

For efficiency, jump nodes may be inserted so that not all characters are checked for a match. The characters that are skipped over in the input are not echoed; instead, the characters that they should have been are echoed. This mechanism allows abbreviations and expands them. It also allows some types of spelling mistake to pass without being 'noticed'.

NPARAM (parameter)

This reads a string in from the channel specified, and stores it in the parameter block. There is a limit on the number of parameters that may be stored at one time (5 at present). The string is terminated by a delimiter (as specified by the delimiter set, normally space, comma or "="), or after 14 characters. If the conversation number indicates that the parameter is to be read from the console, but the input buffer is empty, then the user is prompted to supply one. If there is a delimiter at the start of where the parameter is expected in the input stream, then the current contents of the parameter entry in the parameter block are left unchanged. The delimiters before parameters are not specified explicitly in the tree, so they can be omitted. Control moves right.


NWRICH (write character)

This node type is used to write a single character out. The character stored in the node is output to the console (if conversation 1) or the mainframe (if conversation 2). Control moves right.

NWPARM (write parameter)

The parameter specified is written to the output channel specified. Control moves right. (The parameter has previously been written into the parameter block as a result of processing an NPARAM node.)

NEWLN (new line)

Writes a buffer to the specified output stream. Effectively the same as a NWRICH node containing a "new-line" character. Control passes right.

NGET (get line)

Gets a line from the console (if conversation 1) or the mainframe (if conversation 2). In the second case if more than one line is received then the last line is the one that will be stored in the input buffer, and previous lines will be lost (although they will be displayed on the console). When getting a line from the console receiving stops when carriage-return is pressed. When inputting from the mainframe receiving stops when an X-OFF character is received or after a time-out (nothing received for several seconds). Control passes to the right on 'normal' termination, and down if reception stopped due to a time-out.

LODFIL (load file)

This node is used for transferring files between the mainframe and the local microcomputer. If the conversation number is 1 then the file is transferred from the mainframe to the local microcomputer, and if the conversation number is 2 the transfer is in the other direction.
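The way in which control moves right on success and down on failure can be illustrated with a much reduced interpreter. The sketch below is in Python, not the Fortran of TREEM, and handles only three of the node types described above (IMMED, NTERM and NWRICH). Several things in it are assumptions made for the illustration: a null link stands for the fail node, a hypothetical DONE node ends the conversation, console echo is omitted, and the host keywords CATLIST and LOGOUT are invented. The helper names keyword, emit and run are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

DELIMITERS = {" ", ",", "="}   # the delimiter set named in the text
END_OF_LINE = "\r"             # carriage return terminates a line


@dataclass
class Node:
    kind: str                        # "IMMED", "NTERM", "NWRICH" or "DONE"
    char: str = ""
    right: Optional["Node"] = None   # success path; None stands for the fail node
    down: Optional["Node"] = None    # failure path; None stands for the fail node


def keyword(word: str, then: Node, otherwise: Optional[Node] = None) -> Node:
    """Chain of IMMED nodes for a non-empty word; the first node's down link
    leads to the next alternative."""
    head = then
    for ch in reversed(word):
        head = Node("IMMED", ch, right=head)
    head.down = otherwise
    return head


def emit(text: str, then: Node) -> Node:
    """Chain of NWRICH nodes that write text to the host."""
    head = then
    for ch in reversed(text):
        head = Node("NWRICH", ch, right=head)
    return head


def run(root: Node, line: str) -> Optional[str]:
    """Interpret one console line; return the text sent to the host, or None
    if a null link (the fail node) was reached."""
    text = line + END_OF_LINE
    pos = 0
    sent: List[str] = []
    node: Optional[Node] = root
    while node is not None:
        ch = text[pos] if pos < len(text) else END_OF_LINE
        if node.kind == "DONE":
            return "".join(sent)
        if node.kind == "IMMED":
            # Two delimiters match each other even when they are not identical.
            both_delimiters = ch in DELIMITERS and node.char in DELIMITERS
            if ch == node.char or both_delimiters:
                pos += 1
                node = node.right
            else:
                node = node.down
        elif node.kind == "NTERM":
            # Unlike IMMED, a successful NTERM does not consume a character.
            node = node.right if ch == END_OF_LINE else node.down
        elif node.kind == "NWRICH":
            sent.append(node.char)
            node = node.right
        else:
            raise ValueError("unknown node type: " + node.kind)
    return None


done = Node("DONE")
bye = keyword("BYE", Node("NTERM", right=emit("LOGOUT\r", done)))
root = keyword("DIR", Node("NTERM", right=emit("CATLIST\r", done)), otherwise=bye)
```

In this small digraph the user's keyword DIR is mapped to a different keyword on the host side, which is the kind of keyword mapping mentioned in section 4.1.7. A real digraph would return to the root node after the fail node rather than stopping.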


4.2.6 Jump Nodes

The purpose of jump nodes is different in some respects from that of the jump nodes in D. Partridge's adaptive Fortran pre-processor [PARTRIDGE 72, PARTRIDGE & JAMES 76]. Their purpose is the same in allowing some types of spelling mistake to go 'unnoticed', but they differ in that in the Fortran pre-processor their main purpose was to improve efficiency. In our implementation TREEM runs on a separate local processor, which is idle for much of the time while waiting for input, so efficiency is irrelevant. If TREEM ran concurrently with other programs on the same processor then efficiency might become an important factor.

No strategies for forced matching were incorporated into TREEM initially, and it is the jump nodes which provide a degree of tolerant matching, mostly in the form of abbreviations being allowed. It is the routines which insert the jump nodes (called by BLDTRE and RESTRUCT) which determine the types of abbreviation allowed.

4.3 Design Decisions and Alternatives

This section explains why some of the design routes were chosen.

4.3.1 Why put Protective Interface on a Separate Processor

The protective interface could have been implemented as a process on the mainframe running the applications.

However (for the system in question, the ICCC CDC mainframe) it would have been difficult to interpose the protective interface at the "user end": it would have been necessary to pass through the host OS to the user interface process. Thus it would be impossible to achieve our aim of providing a layer to protect the user from the OS interface. This may also be true of some other operating systems. It is desirable that the protective interface should communicate with the user directly, with no other interfaces between, since the protective interface is intended to provide a uniform user-friendly interface to the whole system (possibly even to the user's own programs).

By running the user interface on a separate local micro it is possible to inform the user of faults in the mainframe, or that it is down, instead of behaving unpredictably or simply appearing to be "dead".

The micro may be under the user’s direct control, and the mainframe may not. In this situation the user has some control over the interface he sees, but might not be able to control the interface used on the mainframe.

The response from the dedicated micro is constant, and often faster than from the mainframe. (With many mainframes the response time varies considerably with load).

The same basic interface can be used for a variety of mainframes - using different trees for each. It might be possible to identify the mainframe automatically, or if a switching function is included, select the appropriate mainframe automatically.

If communications costs to the mainframe are charged for on a quantity of data basis then it may be cheaper to have a 'filter' local to the user to remove unnecessary (incorrect) commands etc.

To summarise: running the interpreter in a separate processor has several advantages. The response time is fast, and consistent. It continues to function when the mainframe is down, even though all it can do then is to indicate that the mainframe is down (but it might be used for example as an editor, to prepare a file for sending to the mainframe later, when it is working again). It is easy to use the same interface, which may be used on several different mainframes. It is independent of changes to the main system, except for changes to the command/response language, which are easy to accommodate.

The interface programs could be used to interface to the operating system of the microcomputer. However this is probably not a practical proposition on an 8-bit microcomputer because of its limited memory. If the interpreter was memory resident there would be insufficient space for many application programs. If the interpreter was disk-based it would not be suitable for use by the applications as well as the operating system and would considerably slow the system.

It would be easy to incorporate the interpreter into any operating system which allows a process to be inserted into the I/O stream between the user and the operating system, e.g. UNIX.

The interface programs were implemented on a Research Machines 380Z running the CP/M operating system. This choice was dictated by its availability. Fortran was chosen as the programming language so that the programs would be as portable as possible. It would be possible to implement them on the mainframe to which they are providing the interface, at the expense of reduced performance.

4.3.2 Size of Tree: Store in Memory or on Disk ?

The current version of TREEM stores the whole of the tree in memory; up to 200 nodes are permitted, and each node takes ten bytes. The interpreter itself takes about 20k bytes, so it seems likely that the maximum tree size that can be supported on an 8-bit CP/M system will be about 20k bytes, or 2000 nodes. At present the immediate-match nodes that store characters to be matched against characters input, and the write-character nodes that store characters to be output, store only one character per node. Therefore a number of such nodes are necessary for each keyword to be matched or output. The whole of the currently available tree space (200 nodes) can be filled up with an interaction of only half a dozen lines or less. The total memory space that can be made available for the digraph on the 8-bit CP/M microcomputer used for the demonstration program is about 20k bytes, which would allow 2000 nodes. However even this is not enough for any realistic 'production' user interface.

The situation can be improved slightly by replacing the Fortran routines of the FLD module with assembler routines, which pack the data more efficiently. The minimum likely node size is:

    right link                  : 1½ or 2 bytes
    down link                   : 1½ or 2 bytes
    character/parameter number  : 1 byte
    count                       : 1 byte

    Total                       : 6 or 7 bytes

As this improves utilisation of data storage by only 67% at best, some or all of the tree will have to be stored on disk and/or fewer nodes must be used.

The number of nodes can be reduced by having one per block of text to be matched or output, instead of one per character. This will alter the way in which text is matched against input, but has almost no effect on text to be output.


Using Microsoft Fortran (F80) under CP/M the record length for random access files is fixed at 128 bytes. It should be possible to fit 16 nodes into a record, using 8 bytes per node:

    right link              2 bytes
    down link               2 bytes
    type & channel          1 byte
    letter/parameter no.    1 byte
    count                   1 byte
    spare                   1 byte

This will give 128 nodes per kbyte, so assuming that 200 kbytes is about the largest practicable disk file size for the tree, a limit of about 25 000 nodes is imposed (128 x 200 = 25 600). This should be sufficient for most applications.
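The 8-byte node layout and the 128-byte record can be illustrated with a short sketch using Python's struct module. This is an illustration only: the proposed implementation would use the routines of the FLD module. The function names are hypothetical, and the division of the "type & channel" byte into two 4-bit fields is an assumption, since the text does not say how that byte is shared.

```python
import struct

# right link, down link, type & channel, letter/parameter no., count, spare
NODE_FORMAT = "<HHBBBx"
NODE_SIZE = struct.calcsize(NODE_FORMAT)       # 8 bytes
RECORD_SIZE = 128                              # fixed record length under CP/M
NODES_PER_RECORD = RECORD_SIZE // NODE_SIZE    # 16 nodes


def pack_node(right, down, node_type, channel, letter, count):
    """Pack one node into 8 bytes (assumed split: 4 bits type, 4 bits channel)."""
    type_and_channel = (node_type << 4) | channel
    return struct.pack(NODE_FORMAT, right, down, type_and_channel, ord(letter), count)


def unpack_node(data):
    """Inverse of pack_node; the spare byte is ignored."""
    right, down, type_and_channel, letter, count = struct.unpack(NODE_FORMAT, data)
    return right, down, type_and_channel >> 4, type_and_channel & 0x0F, chr(letter), count


def locate(node_index):
    """Return (record number, byte offset within the record) for a node."""
    record, slot = divmod(node_index, NODES_PER_RECORD)
    return record, slot * NODE_SIZE
```

With two-byte links a node index can range up to 65 535, which comfortably covers a tree file of the size discussed above.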

The nodes would be read as an array of logicals, with the routines in the FLD module doing the packing and unpacking of nodes.

4.3.3 Graph Representation

The graph representation used (an RLINK and a DLINK in each node) was chosen because much of the graph can be split into binary trees.

Alternative common methods of graph representations (which could be implemented by changing only the routines in the FLD module) are:

1: Adjacency Matrix:

This would use (for 200 vertices) 200 x 200 x 1 bit = 40,000 bits. As the graph we wish to store is directed, this figure cannot be halved.

2: Adjacency List:


This is basically the same as the method used, but since there are always only two adjacent nodes (in the "correct direction") a linked list is not necessary; just two fields.

3: Adjacency Multilists:

In this case the technique is to use one node per edge. All the usual methods of storing adjacency multilists allow for each node to have an indeterminate number of in-edges and out-edges. However since our graph is based upon binary trees the out-degree is always two. The storage space used for 200 nodes is 200 x 2 x 16 = 6,400 bits for the linking information, to which the data storage requirements must be added. In fact for 200 nodes it is possible to use 8 bit integers instead of 16 bit integers, which halves this figure. It is therefore a considerably more compact representation than the adjacency matrix.
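The storage figures quoted in this section can be checked with a few lines of arithmetic. The function names below are hypothetical and the sketch counts linking information only, as the text does.

```python
def adjacency_matrix_bits(vertices):
    """One bit per ordered pair of vertices (the graph is directed)."""
    return vertices * vertices


def two_link_bits(vertices, bits_per_link=16):
    """Two outgoing links per node, each stored as an integer."""
    return vertices * 2 * bits_per_link


matrix = adjacency_matrix_bits(200)      # 40000 bits
links_16 = two_link_bits(200)            # 6400 bits
links_8 = two_link_bits(200, 8)          # 3200 bits
```

The matrix grows with the square of the number of nodes while the two-link form grows linearly, so the gap widens quickly for larger digraphs.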

4.3.4 Separate Programs

The interpreter and re-structuring process were implemented as separate programs for several reasons. The main one was that it would be inefficient for the interpreter to re-structure the digraph continually; it makes more sense to schedule the re-structuring periodically while the interface is not in use (e.g. overnight). This also means that changes in the interface behaviour can only occur between sessions and not during sessions. This has both advantages and disadvantages, which are discussed later. Another advantage is that the computing resources required by the separate programs individually are less than would be required for an interpreter with built-in re-structuring. Although the scheduling of the re-structuring process was under manual control in our demonstration system, in principle it could be under the control of the interpreter.

4.3.5 Why Macro Nodes?

The immediate-match nodes used 'internally' by the interpreter contain only one character each. This is not a convenient form in which to input the digraph, so a set of 'macro nodes' has been defined. These are similar to the 'micro nodes' used internally, but the 'match' macro nodes contain strings, which are expanded into a sequence of immediate-match nodes automatically by the program BLDTRE.

Also where it is desired to create a lot of alternative keywords (which will become a 'local tree' in the digraph) there is a 'tree' type macro node. This is analogous to a 'case statement' in programming languages, whereas 'MATCH' macro nodes are analogous to IF THEN ELSE statements.
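The expansion of a 'match' macro node into single-character nodes can be sketched as follows. This is an illustration in Python, not the BLDTRE code: the Node fields and the meaning given to the two links (down followed on a match, right followed on a failure) are assumptions made for the example, as is the function name expand_match.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    char: str                     # the single character this node matches
    down: Optional[int] = None    # assumed: next node when the character matches
    right: Optional[int] = None   # assumed: alternative node when it does not


def expand_match(text: str, nodes: List[Node], on_fail: Optional[int] = None) -> int:
    """Append one immediate-match node per character; return the first index."""
    if not text:
        raise ValueError("a match macro node needs a non-empty string")
    first = len(nodes)
    last = first + len(text) - 1
    for offset, ch in enumerate(text):
        index = first + offset
        nodes.append(Node(char=ch,
                          down=index + 1 if index < last else None,
                          right=on_fail if offset == 0 else None))
    return first


nodes: List[Node] = []
start = expand_match("END", nodes)   # one macro node becomes three nodes
print([n.char for n in nodes])
```

In this sketch a three-letter keyword occupies three nodes, which shows why a macro-node description is a more convenient input form than the internal one.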

4.4 TREEM

4.4.1 Brief Description of Interpreter

The program that interfaces between the user and the mainframe is controlled by a digraph. In fact it can be considered that the program is an interpreter and that the digraph is a specialised form of programming code. Taking the analogy further, the macro-node description of a digraph is the source code, which is compiled by BLDTRE into an intermediate code to be interpreted by TREEM. This has similarities with, for example, semi-compiled Pascal, where the compiler produces P-code, which is then interpreted. The macro-node description does not look like a conventional programming language, but it does have some similarity to CAL languages. However an important difference between TREEM and a CAL-interpreter is that TREEM can modify the digraph that controls it.

TREEM can also be considered to be a table-driven interpreter, since the tree can be regarded as a table (which is effectively how it is implemented), where again the table may be modified by the interpreter (in theory - in practice a separate program is used).

Each node in the tree is one of the following types: immediate-match, copy, fail, terminal (effectively an immediate-match for end of line), jump, parameter, write-parameter, write-character, new-line (write a new-line character) or get. Most of these can refer to any conversation (channel). The conversation number is usually 1 for the user, 2 for the mainframe, and 3 for a local file.

The main part of the interpreter is simply a case statement inside an endless loop, which selects the appropriate routines to call, according to the node type of the current node.
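The shape of such a loop can be sketched as below. This is a much reduced illustration in Python, not the TREEM source: only four node kinds are handled, the link semantics (down on success, right on failure) are assumed, backtracking is omitted, and the loop ends after one input line instead of running endlessly.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    kind: str                     # "match", "write", "jump", "terminal" or "fail"
    char: str = ""
    down: Optional[int] = None    # assumed: followed on success
    right: Optional[int] = None   # assumed: followed when a test fails


def run(nodes: Dict[int, Node], start: int, line: str) -> Optional[str]:
    """Interpret the digraph against one input line; None means no match."""
    current: Optional[int] = start
    pos = 0
    out = []
    while current is not None:
        node = nodes[current]
        if node.kind == "match":            # immediate-match on one character
            if pos < len(line) and line[pos] == node.char:
                pos += 1
                current = node.down
            elif node.right is None:
                return None
            else:
                current = node.right
        elif node.kind == "terminal":       # immediate-match for end of line
            if pos == len(line):
                current = node.down
            elif node.right is None:
                return None
            else:
                current = node.right
        elif node.kind == "write":          # write-character
            out.append(node.char)
            current = node.down
        elif node.kind == "jump":
            current = node.down
        else:                               # "fail" or an unknown kind
            return None
    return "".join(out)


# Hypothetical digraph: input "A" produces "x", input "B" produces "y".
demo = {
    0: Node("match", "A", down=1, right=2),
    1: Node("write", "x", down=4),
    2: Node("match", "B", down=3),
    3: Node("write", "y", down=4),
    4: Node("terminal"),
}
print(run(demo, 0, "A"), run(demo, 0, "B"), run(demo, 0, "C"))
```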

4.4.2 Multiple Conversations

Another area where TREEM differs from a CAL interpreter (as well as in its adaptive behaviour) is that instead of there being just one conversation (with the user), there are several.

In the first version of TREEM there are two conversations:

1: with the user

2: with the mainframe

Another will be added later:

3: with the local O.S. (CP/M CCP)

The enhancement to support the third conversation is discussed in section 6.4. In the current implementation, there is already limited support for the third conversation, but only for reading or writing files on the local (microcomputer's) disks. The routines in the "IO" module interface these conversations to actual devices via the Fortran I/O system (for conversation number 1) and assembler routines (for conversation number 2).

Eventually TREEM could be put in place of the basic I/O system, interfacing all I/O to and from the console. However a program of this size would reduce the memory available for application or utility programs (CP/M transient programs) to an extent that it would not be a reasonable option. In general it is unlikely that any kind of useful adaptive interface could co-exist with application programs on an 8-bit processor, although it would certainly be feasible on 16-bit processors with their larger addressing range.

4.4.3 Incorporating the Digraph into TREEM

In order to save the time taken by TREEM in reading the digraph and control files in at the beginning of every run, it is possible to halt execution (with control C), and save the interpreter together with the already-loaded digraph by using the command (e.g.) SAVE 108 X.COM, where X is to be the name of the combined interpreter and digraph. This method is described in appendix B.

If a program with the digraph bound in like this is produced then although it will initialise more quickly and will run as a stand-alone program (with no associated data files) it will not be possible to accumulate the node counts and it will take no account of any changes to the data files. Thus this creates a “fixed" program which has lost any adaptive features.

4.5 RESTRUCT

The RESTRUCT program is responsible for restructuring the digraph on the basis of the information recorded in the nodes by TREEM. Re-structuring consists mainly of re-ordering the branches of the local trees in the digraph, and possibly moving and inserting jump nodes.
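Re-ordering the branches of one local tree might be sketched as follows. This Python fragment is an illustration only: it assumes that the alternatives of a local tree are chained through their right links, that each carries a usage count, and that the most frequently used branch should come first. The Branch type and the function names are hypothetical, and the counts are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Branch:
    label: str
    count: int = 0                # usage count recorded for this branch
    right: Optional[int] = None   # assumed: link to the next alternative


def chain(nodes: Dict[int, Branch], head: Optional[int]) -> List[int]:
    """Collect the node numbers of the alternatives, in their current order."""
    order = []
    current = head
    while current is not None:
        order.append(current)
        current = nodes[current].right
    return order


def reorder(nodes: Dict[int, Branch], head: Optional[int]) -> Optional[int]:
    """Relink the alternatives by descending count; return the new head."""
    order = sorted(chain(nodes, head), key=lambda i: nodes[i].count, reverse=True)
    if not order:
        return head
    for here, following in zip(order, order[1:] + [None]):
        nodes[here].right = following
    return order[0]


local_tree = {
    0: Branch("DOWNLOAD", count=2, right=1),
    1: Branch("UPLOAD", count=9, right=2),
    2: Branch("END", count=5),
}
new_head = reorder(local_tree, 0)
print([local_tree[i].label for i in chain(local_tree, new_head)])
```

Because the sort is stable, branches with equal counts keep their existing relative order, which avoids needless changes to the interface.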

4.5.1 Re-structuring the Tree(s)

What actually makes the interface adaptive is the automatic re-structuring of the digraph. This re-structuring could be done by the interpreter, or by a process which runs concurrently with the interpreter. However this could slow down the response time, unless it could be run at a lower priority - to make use of idle time. It could also lead to many rapid fluctuations in the digraph which would probably confuse or annoy the user.

The method chosen was to run a re-structuring program at regular intervals. This could be run between user sessions, or possibly during idle time while the interpreter is waiting for a response from user or mainframe. Running RESTRUCT as a separate program, between user sessions, was chosen as the initial method, since isolation of the programs makes design and debugging easier.

4.6 BLDTRE

This section describes the tree-builder program.

4.6.1 Entering A Digraph

It is always necessary or at least desirable to be able to input and edit a tree directly. This is done by having an alternative representation (to the internal form) which is easily understandable by humans. In fact it is rather similar to a CAL/CAI language, except that there are two "conversations" instead of just one.

A program has been written (BLDTRE) which will take a digraph description in this language, analyse it for errors, and convert it into "dump" format after checking. The "dump" format is very similar to internal format (the format in which it is stored in memory by TREEM), but is suitable for storage as a disk file. The file's format is described in appendix E.

It is intended that once a digraph is in use it can be updated automatically, although it is possible to perform manual updates by using BLDTRE in conjunction with UNBUILD. However BLDTRE is necessary in order to set up the digraph initially and thus start with a fully functional interface.

4.6.2 Using BLDTRE

BLDTRE is used to build trees or digraphs of nodes, which can then be used by the tree/digraph matching program, TREEM.

TREEM is an interpreter whose actions are controlled by a digraph. It can "talk" to a user via a console and to a mainframe computer via a serial port. It takes the digraph initially from a file called TREELOAD.DAT. BLDTRE takes a digraph "program" from the file NET.DAT and creates a file TREELOAD.DAT which is in a form that can be used by TREEM directly.

After producing the files TREELOAD.DAT and MESSAGES from a run of BLDTRE they will be used automatically by TREEM when it is next run provided that they are on the logged-on disk.

The form of the network description required for input to BLDTRE is described in appendix C.

It should also be noted that the 'macro-nodes' used as input to BLDTRE are different from the nodes used by TREEM. In general, the macro nodes may be expanded into a number of nodes as used by TREEM. There is a limit of 200 nodes in the present implementation, which implies a limit of around 100 macro-nodes.

4.7 UNBUILD

The UNBUILD program performs the inverse operation of BLDTRE. It takes an internal format digraph and dumps it out as text format macro-node digraph. This permits manual alteration of the digraph in the absence of any program that provides a capability to directly edit the internal format digraph. It also provides a printable representation of the digraph for debugging purposes.

The format in which the output of UNBUILD is produced is directly compatible with the format required for input to BLDTRE (described in appendix C).

5: DEMONSTRATION PROGRAMS

Chapter four described the general purpose programs that are controlled by or act upon the digraphs. This chapter outlines two demonstration uses of the programs. That is, it describes two simple digraphs that have been developed: one provides a simple file transfer capability between the microcomputer and the mainframe; the other is a simple editor for manipulating the macro-node descriptions themselves.

5.1 CDCFILE

A program was developed that to the user appears to be a file transfer program allowing the transfer of text files between the Research Machines 380Z microcomputer and the ICCC CDC mainframes. This program, CDCFILE, is in fact a copy of TREEM (i.e. the interface interpreter) with a digraph incorporated that allows logon to the mainframe, a few of the more common mainframe O.S. commands, such as listing directories, and file transfer. This application resulted in the addition of an extra node type to the system, in order to support the file transfer function. A full description of this digraph is given in appendix D.

The microcomputer is connected to the mainframe via a Terminal Access Controller (TAC). (The TAC allows terminals to access a number of different computers). CDCFILE automatically sets up the connection through the TAC, if it is not already set up, without any interaction from the user. It will then find out whether the user is logged on or not, and request the username and password to log on if required. (The program does not force a logoff on exit, in case the user wants to use the program several times during a session). Once logged on the user can upload or download files, and the NOS commands to list the file catalogue or print a file are also handled. In the latter case CDCFILE provides a command "PRINT" which is translated into the NOS equivalent (QUEUE,=:PS). An "escape" mechanism is provided to allow the user to send any NOS command: if a line is preceded with the $ symbol it is sent unaltered to the mainframe.

A sample session is shown below. The user's input and the computer's output appear on the left; comments are shown in square brackets on the right.

    A>CDCFILE
    Attempting to log on
    Please give your user number:
    umacg26 mypassw                  [Giving the extra parameter now means that CDCFILE will not prompt for it]
    What do you want to do?
    Valid input is:                  [Automatic list of local tree]
    DOWNLOAD
    UPLOAD
    CATLIST
    STOP
    END
    QUICAT
    QSTATUS
    EXIT
    $
    HELP
    PRINT
    What do you want to do?
    help                             [One of the above options explains the meaning of commands]
    Possible commands are: -
    DOWNLOAD: transfers file from CDC
    UPLOAD: transfers file to CDC
    CATLIST, QUICAT, QSTATUS: NOS commands
    Any valid NOS command, preceded by $
    What do you want to do?
    down                             [No filename given so ask for it]
    File name?
    afile
    File downloaded
    What do you want to do?
    end
    A>

5.2 Macro-node Editor

Originally the macro-node files were created and edited using the standard text editor supplied on the microcomputer. Any errors made in the format of the file were detected when BLDTRE was run to process the macro-node digraph.

A program was developed that would allow the syntax checking of the digraph description as it was entered. This program is again a tailored version of TREEM. It is of course able to check only for syntax errors on a line by line basis; errors such as links to non-existent nodes cannot be tested for until the description is complete. These tests are still performed by the BLDTRE program.

5.3 Summary

The two examples of macro-node digraphs given in this chapter illustrate the form of the digraph description files, and also provide working examples of the adaptive protective interface. In both cases the applications are quite small and simple; they need to be in order to be within the capability of the prototype demonstration system.

The first example of use of the interface is closer to its original intended purpose, in that it provides a 'protective' interface to the operating system of a mainframe computer. This example also performs a practical function in providing the file transfer facility between the mainframe and the microcomputer that supports the protective interface, thus making it a useful program that fulfils a genuine requirement.

The second example demonstrates some of the flexibility of the design, which allows it to be applied in this case to its own support. This has some similarity to the implementation of a compiler in the language it is compiling, allowing it to compile itself. TREEM could be adapted to any task which requires comparison of events with a description of choices that can be represented in tree form.

6: FUTURE DEVELOPMENTS

This chapter considers ways in which the existing implemented system might be enhanced in the future.

6.1 An Integrated Package

The current system is implemented as a suite of four programs. There would be several advantages in integrating the separate programs into a single package. This is not really feasible using the hardware employed for the prototype demonstration, because of the limitations in memory addressing range of the 8-bit microcomputer. However as more powerful microcomputers become the norm this is unlikely to be a problem in the near future.

6.2 Automatic Tree-Building

Re-structuring trees automatically is fairly straightforward. However, to take full advantage of the technique of storing the command descriptions in a digraph that may be altered at run-time, it would be an advantage if commands could be added dynamically.

This presents a problem: when is a command added? Obviously it should only be added if it is not invalid. So one possible strategy is to pass on all unknown commands, and if the command is accepted as valid by the host then it is added to the tree. However, in some cases it can be difficult to judge whether the host considers a command to be valid.
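The strategy of passing on unknown commands and adding those that the host accepts can be sketched as below. This is a Python illustration under strong simplifying assumptions: the digraph is stood in for by a plain set of command names, and host_accepts is a hypothetical callback that hides the difficult part, namely judging from the host's response whether the command was valid.

```python
from typing import Callable, Set


def submit(command: str, known: Set[str],
           host_accepts: Callable[[str], bool]) -> bool:
    """Return True if the command is valid; remember it if it was new."""
    if command in known:
        return True
    if host_accepts(command):     # hypothetical judgement of the host's reply
        known.add(command)        # in the real system: add a branch to the digraph
        return True
    return False


known_commands = {"CATLIST", "QUICAT"}


def fake_host(command: str) -> bool:
    """Stand-in for the mainframe, used only to exercise the sketch."""
    return command in {"CATLIST", "QUICAT", "QSTATUS"}


print(submit("QSTATUS", known_commands, fake_host))   # accepted and learned
print(submit("XYZZY", known_commands, fake_host))     # rejected, not learned
```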

6.3 Tree-Editor

As an alternative to the UNBLD/BLDTRE program combination a tree-editor would provide a better way to change the structure of the digraph, and to examine it. It is necessary to provide this type of facility because it is impossible to make the re-structuring process completely automatic, and in a teacher/student type application it is useful if the teacher is able to modify the digraph in order to change the interface presented to the student.

This digraph editing facility allows commands to be added or removed.

For example, if the digraph is getting too big then some branches can be removed to reduce its size, little used branches being removed first. The statistics collected by TREEM in the nodes mean that it is possible to find out which commands are little used. UNBLD could be used first to obtain a listing with frequency counts of the nodes.

BLDTRE & UNBLD would be kept in addition to the editor, or their functions would be incorporated into the editor, so that it is possible to enter and list digraphs in one go; for example, UNBLD could produce a listing on the lineprinter if it is desired to look through the digraph without using the terminal. The macro-node format also allows for storage of the digraph on any device that can store normal text. Thus it may be important for porting digraphs between different systems.

Commands will allow the user to move around the digraph. The current local section will be displayed on the video screen (which could be a memory-mapped screen or a VDU). It will be possible to remove branches, but it should be noted that there may be more than one path to a node, so this may cause problems at "run-time" (i.e. when TREEM is interpreting the digraph). With the current system using UNBLD/BLDTRE this type of error is detected at "compilation time" by BLDTRE.

As a minimum, the tree editor would support the basic movement commands to move down, right and back to allow the user to select any node in the digraph, and commands to change the fields of a node including the right and down links, and to add and delete nodes. The backward movement would be achieved by using a stack in the same way that TREEM performs backtracking. The validity of the digraph, that is whether there are any nodes missing, or existing nodes with no path to them, can be checked on a command and/or when exiting the editor. This is preferable to checking after each modification for two reasons. One is that a full check of every modification is very inefficient. The other is that where a user has a batch of changes to make, it is easier to make all changes directly to the required state, even if this means that the digraph is in an inconsistent state while the changes are in progress. If the digraph has to be kept valid at every step, then the user of the editor has more work because the order in which changes are made becomes important.
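A validity check of the kind described (missing nodes, and nodes with no path to them) might be sketched as follows. The Python below is illustrative only: the digraph is reduced to a table of (down, right) link pairs, and the function name check_digraph is hypothetical. It would be run on command or on exit, not after every change.

```python
from typing import Dict, Optional, Set, Tuple

Links = Tuple[Optional[int], Optional[int]]   # (down, right) links of one node


def check_digraph(nodes: Dict[int, Links], root: int) -> Tuple[Set[int], Set[int]]:
    """Return (missing, unreachable) node numbers for the given digraph."""
    # Missing: any link target, or the root itself, that is not in the table.
    missing = {link for links in nodes.values() for link in links
               if link is not None and link not in nodes}
    if root not in nodes:
        missing.add(root)
    # Unreachable: nodes never visited when walking the links from the root.
    seen: Set[int] = set()
    stack = [root]
    while stack:
        current = stack.pop()
        if current in seen or current not in nodes:
            continue
        seen.add(current)
        stack.extend(link for link in nodes[current] if link is not None)
    return missing, set(nodes) - seen


example = {
    0: (1, 2),
    1: (None, None),
    2: (5, None),      # node 5 does not exist
    3: (1, None),      # nothing links to node 3
}
missing, unreachable = check_digraph(example, 0)
print(missing, unreachable)
```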

6.4 Minor Enhancements

This section considers some modifications which could be made to the existing programs without significantly changing the basic design.

6.4.1 Enhanced Line Editing

As an extension of the "line-editor", commands can be added to display the line so far, and to delete the last token.

6.4.2 Full Duplex Communication

The prototype system operated in a half-duplex mode, in that it would only accept input from either side when it was expecting it. It thus alternated between transferring data in one direction and then the other. Unfortunately this is not entirely satisfactory in some circumstances. It is possible for input from either the user or the host computer to be lost if it arrives when not expected.

At the least, concurrent input processes that read and buffer input until it is read by the interpreter are highly desirable. This would solve the problem of the possibility of lost data. However it does not allow the possibility of an asynchronous 'interrupt' from either party. This type of feature is required, for example, to allow the user to abort a requested action, or to suppress unwanted information.
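Buffering by concurrent input processes can be sketched with one reader thread per conversation feeding a shared queue. This is a Python illustration using the standard threading and queue modules, not a design for the 8-bit prototype; the function name start_reader and the sample input lines are hypothetical, and the conversation numbers follow the convention of 1 for the user and 2 for the mainframe. As noted above, buffering alone does not provide an asynchronous interrupt.

```python
import queue
import threading
from typing import Iterable, Tuple


def start_reader(channel: int, source: Iterable[str],
                 inbox: "queue.Queue[Tuple[int, str]]") -> threading.Thread:
    """Read lines from one conversation and buffer them for the interpreter."""
    def pump() -> None:
        for line in source:
            inbox.put((channel, line))   # nothing is lost if it arrives early

    worker = threading.Thread(target=pump, daemon=True)
    worker.start()
    return worker


inbox: "queue.Queue[Tuple[int, str]]" = queue.Queue()
readers = [
    start_reader(1, ["catlist"], inbox),   # conversation 1: the user
    start_reader(2, ["READY"], inbox),     # conversation 2: the mainframe
]
for reader in readers:
    reader.join()

# The interpreter would take items from the queue whenever it is ready.
buffered = sorted(inbox.get() for _ in range(2))
print(buffered)
```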

6.4.3 Translation of Computer's Output

In the prototype adaptive interface we have concentrated on translating the user's input, where necessary. However, as has already been stated in chapters 2 and 3, it is desirable that the protective interface should also translate the text going to the user.

6.4.4 Parallel Conversations and Windows

Although from the point of view of the interface it is dealing with two 'conversations', one with the user and one with the host computer, from the user's point of view he has only one conversation. However it is common practice for a user to be able to interact with several applications simultaneously, typically using a multi-windowing display with one window for each conversation. The prototype adaptive interface caters for only one user conversation. Since to a large extent the parallel conversations are independent of each other, extension of the adaptive interface to cater for this environment does not require any major changes to the architecture. The area where the parallel independent conversations do interact is in the management of the windows, which is in effect another user conversation, but in this case with a local management function rather than an application.

6.4.5 Interaction With Local Operating System

The demonstration programs went slightly further than providing an interface to a host time-sharing computer, in that file transfer between the local microcomputer and the host mainframe was provided. However the system did not extend as far as providing a protective interface to the local computer system. This is technically more difficult to achieve because the operating system of the local microcomputer does not lend itself to inserting a process in the console input/output path.

However it is intended to extend the protective interface to provide a limited interaction with the local system, thus adding a third conversation. Initially conversation number 3 will work in one direction only (output) by writing commands to a file (TREEMCOM.SUB) and then submitting it. To do this the interface program will have to be put at the high end of store and the space available for application programs reduced (by reducing the HIMEM pointer).

6.4.6 Defining Macro Commands

A very useful feature of current operating systems is the ability to put a number of commands together in a file and invoke them all by using the file as a command. This is often referred to as a "SUBMIT" or "EXEC" facility. The equivalent in this adaptive user interface would be separating out a commonly used path in the digraph, naming it, and providing the ability to invoke this path segment by the name.

It is not entirely trivial to analyse the digraph for commonly used command sequence segments because they may be used with different parameters. To be useful any macro command facility must allow for parameterisation. In addition the fact that a "name" must be chosen (and parameter syntax defined) means that this process cannot be entirely automated: the user would need to have some control.

6.5 Distributed Processing

One of the advantages of separating out the user interface from the application specific part of a program is that in a distributed processing environment the user interface may be implemented at (or near) the user's terminal. This may reduce the apparent response time to the user, in addition to reducing communication costs and offloading the application processor.

6.5.1 Remote Terminal Access

If user interfaces are tailored to the individual user, either automatically as suggested here or by some explicit configuration process, then the advantages of locating the user interface at the user's terminal become greater. The style of user interface is then completely under the user's control and is divorced from the application. It is the user's responsibility to provide an interface to suit himself, not the application designer's. Therefore, the user can have whatever idiosyncratic user interface he considers to be optimal, which is common across all applications accessed, and the applications need only implement one (standardised) interface. This concept is similar to that of the "Virtual Terminal" communication protocols being defined within the ISO and elsewhere, but is a higher level of abstraction.

6.5.2 Physical Location of User's Profile

Since in a distributed processing environment the user may potentially access applications on many different hosts from many different terminals, the question arises of where the user's profile should be stored. If kept at the most frequently used application node or terminal, then it must be possible to transfer it if another application or terminal is used. Another possibility is to store user profiles on dedicated nodes on the network, so that a third party is always involved in the communication between the user and the application.

Yet another possibility is to store the user profile on some removable medium that is carried around by the user and plugged into the terminal. The medium might be completely passive (e.g. floppy disk, bubble memory pack) or might be active (e.g. smart card containing its own microprocessor). The latter option has much to commend it if the user profile store is also to be used for authentication, i.e. to act as a key to both the terminal and the application.

This very personalised interface does, however, suffer from a problem: it is unable to take advantage of what it might learn from other similar users, particularly when interfacing to the same application. Therefore, some form of hybrid approach is likely to be ultimately the best solution.

6.6 A Vision of the Future

This section explores a view of what an adaptive protective user interface of the future might look like, developing further the ideas outlined for the prototype.

The most obvious logical progression is that as microcomputers become smaller and more portable computer users could carry their own "interface box" with them. The interface box would act as the user's agent, providing a personalised interface to the user and making use of a standard interface into any number of different service-providing computers. The user interface would be customised to its owner, and would adapt with them over time. It could also be expanded to build in additional facilities such as a notepad which would be common to all the services the user accesses.

A variation on this theme would be to provide standard terminals for accessing computers and computer networks which would accept a "smart card" defining the user's interface. This would avoid the need for the user to carry the bulk of the keyboard and display, but still provide an interface that is carried around with the user and could therefore be used to provide the customised interface in a variety of locations. The card could also be used for authentication for control of access to services and for billing purposes.

Another way of avoiding the bulk of keyboard and screen (and also providing computer access to some categories of disabled people) is to use voice recognition. Voice access might also be more acceptable to "technophobes" who would nevertheless benefit from some computer services. One of the characteristics of the demonstration adaptive user interface is that it is often able to predict the next most likely input. This ability to predict the most likely words could be used to good effect to provide context-sensitive voice recognition - it is much simpler and more reliable to choose between a handful of options than any known word in a vocabulary.

7: SUMMARY AND CONCLUSIONS

7.1 Enhancing Current Styles of User Interfaces

The need for improved man-machine interactions has been well documented in many places. Most approaches to meeting this need have concentrated on providing a 'simple' interface aimed at the novice or infrequent user. A different approach has been proposed: a user interface that attempts to adapt itself to the individual user. This shows promise as a method of providing an acceptable interface to a range of computer users, rather than being tailored specifically to a restricted class of users.

7.2 Adaptive User Interfaces

A mechanism for implementing an adaptive user interface has been described, based upon binary trees. This adaptive user interface consists of an interpreter whose actions are controlled by a digraph, which can itself be modified by the user interface. Thus, the user interface is able to 'learn' or adapt itself to suit the characteristics of the individual user.

Demonstration programs were implemented to prove that the concept would work when applied to the task of providing a protective interface to a large mainframe's Job Control Language using a microcomputer. However the demonstration programs are limited to verifying the basic principles - they do not provide everything that was detailed as desirable in Chapter 3. Thus they are only demonstration prototypes; they are not viable "production" systems.

7.3 Protective User Interfaces

A key feature of the demonstration programs was that they operated as a "protective layer" between the user and the application programs. Furthermore the interface programs executed on a microcomputer local to the user and remote from the host computer executing the application programs. This allows the processing power necessary for sophisticated user interfaces to be distributed out to the periphery of a computer network, offloading the central host processors and the communications network.

This separation of user interface from application program allows the user to have more control over the style of man-machine interface presented and to tailor it to the individual. The microcomputer then acts as the user's agent when accessing application programs, which no longer each need their own sophisticated (but differing) user interfaces. This model of distributed computing is simply the classic client-server model with the user interface acting as the client to one or more services available to the user on remote host computers.

7.4 Current Trends

We are already seeing a greater emphasis on improved "user-friendliness" of commercial software, particularly those packages aimed at the Personal Computer mass market. As costs of processing and memory continue to fall we can expect to see an increasing potential for providing advanced user interfaces. This commercial recognition of the importance of the user interface and the improving economics of providing it, coupled with the ever increasing diversity of the computer user population, should mean that there will be a place in the marketplace for adaptive user interfaces in the future.

BIBLIOGRAPHY

ADDIS, T.R., BOSTON, D.W. & UNDERWOOD, M.J. (1970) 'An Interactive System using a Simple Spoken Word Recogniser', IEE Conference on Man-Computer Interaction p29 (1970)

AMBROZY, D. (1971) 'On Man-Computer Dialogue', I.J. of Man-Machine Studies 3 (4) p375 (October 1971)

ANTONELLI, D.C. (1970) 'Terminal Design - A Challenge To Human Factors', IEE Conference on Man-Computer Interaction p95 (1970)

BACKUS, J. (1978) 'Can Programming be Liberated from the von Neumann Style? A Functional Style and its Algebra of Programs', Comm ACM 21 (8) p613-641 (August 1978)

BARRON, D.W. (1974) 'Job Control Languages and Job Control Programs', Computer Journal 17 (3) p282-286 (1974)

BARROW, H.G. (1970) 'The Development of a Real World Interface', IEE Conference on Man-Computer Interaction p89-94 (1970)

BEATTIE, J.D. (1969) 'Natural Language Processing by Computer', I.J. of Man-Machine Studies 1 (3) p311 (July 1969)

BENNETT, J.L. (1972) 'The User Interface in Interactive Systems', Annual Review of Information Science and Technology 7 p159-196 (1972)

BIERMANN, A.W. & KRISHNASWAMY, R. (1976) 'Constructing Programs from Example Computations', IEEE Trans. on Soft. Eng. SE-2 (3) p141 (September 1976)

BITZER, D.L. & EASLEY, J.A. (1965) 'PLATO : A Computer-Controlled Teaching System', Computer Augmentation of Human Reasoning p89-103 Eds: Sass, M.A. & Wilkinson, W.D. (1965)

BOBROW, D.G. (1963) 'Syntactic Analysis of English by Computer - A Survey', AFIPS - Proc. Fall Joint Computer Conference 24 p365 (1963)

BOBROW, D.G. (1964) 'Natural Language Input for a Computer Problem Solving System', PhD Thesis Department of Mathematics, MIT (1964)

BOBROW, D.G. (1967) 'Problems in Natural Language Communication with Computers', IEEE Trans. on Human Factors in Electronics 8 p52 (1967)

BODENSCHER, H. (1970) 'A Console Keyboard for Improved Man-Machine Interaction', IEE Conference on Man-Computer Interaction p196 (1970)

BOIES, S.J. (1974) 'User Behaviour on an Interactive Computer System', IBM Systems Journal 13 p2-18 (1974 : No.1)

BORNAT, R. & BRADY, J.M. (1976) 'Using Knowledge in the Computer Interpretation of Handwritten FORTRAN Coding Sheets', I.J. of Man-Machine Studies 8 (1) p13 (January 1976)

BROWN, T. & KLERER, M. (1975) 'The Effect of Language Design on Time- Sharing Operational Efficiency', IJ. of Man-Machine Studies 1 (2) p233 (February 1975)

BRUNT, R.F. & TUFFS, D.E. (1976) 'A User-Oriented Approach to Control Languages', Software Practice & Experience 6 (1) p93-108 (January 1976)

CARD, W.I., CREAN, G.P., EVANS, C.R., JAMES, B.W., NICHOLSON, M., WATKINSON, G. & WILSON, J. (1970) 'On-Line Interrogation of Hospital Patients by a Time-sharing Terminal with Computer/Consultant Comparison Analysis', IEE Conference on Man-Computer Interaction p141 (1970)


CARLISLE, J.H. (1970) 'Comparing Behaviour at Various Computer Display Consoles in Time-Sharing Legal Information', Rand Corp., Santa Monica, California (AD712695) (1970)

COX, D.WJ. (1975) 'Job Control Language - The Way Forward', Computer Bulletin p28 (March 1975)

CRAFT, P.C.R. (1970) 'Some Aspects of Human Factors in Terminal Design', IEE Conference on Man-Computer Interaction p77-81 (1970)

DAVIES, D.W. et. al. (1974) 'Human Factors in Interactive Teleprocessing Systems', Proc. ICCC p491-496 (1974)

DENERT, E. (1977) 'Specification and Design of Dialogue Systems with State Diagrams', Proc. Int. Comp. Symp. p417-424 (Liege : 1974)

DOWSEY, M.W. (1970) 'A Language to Facilitate Computer-Aided Instruction', IEE Conference on Man-Computer Interaction p72 (1970)

DZIDA, W., HERDA, J. & ITZFELDT, W.D. (1978) 'User-Perceived Quality of Interactive Systems', IEEE Trans. on Soft. Eng. SE-4 (4) p270 (July 1978)

EASON, K.D. (1974) 'The Manager as a Computer User', Ap. Eng. 5 p9-14 (1974)

EASON, K.D. (1976) 'Understanding the Needs of the Naive Computer User', Computer Journal 19 p3-7 (1976)

EDMONDS, E.A. (1974) 'A Process for the Development of Software for Non-Technical Users as an Adaptive System', Gen. Syst. 19 p215-217 (1974)

EDMONDS, E.A. (1978) 'Adaptable Man/Machine Interfaces for Complex Dialogues', Proceedings: Eurocomp 1978 p639-646 Online Conferences


EDMONDS, E.A. & LEE, J. (1974) 'An Appraisal of Some Problems of Achieving Fluid Man/Machine Interaction', Proc. Eurocomp. p635-645 (1974)

EMBLEY, D.W. et. al. (1978) 'A Procedure for Predicting Program Editor Performance from the User's Point of View', Int. Journal Man-Machine Studies 10 p639-650 (November 1978)

ENSLOW, P.H. (1975) 'OSCL (1) Activity in Europe', Operating Systems Review 9 (2) p16-17 (1975)

EVANS, C.R. (1972) 'An Automated Medical History-Taking Project - A Study in Man-Computer Interaction', NPL Computer Science Report No.55 (1972)

EVANS, C.R., WILSON, J., CARD, W.I., CREAN, G.P., WILSON, J.B., NICHOLSON, M. & WATKINSON, G. (1971) 'A Study of On-Line Interrogation of Hospital Patients by a Time-Sharing Terminal with Computer/Consultant Comparison Analysis', NPL Report COM52 (1971)

FEENEY, W.R. & HOOD, J. (1977) 'Adaptive Man Computer Interfaces Information Systems which Take Account of User Style', Comput. Pers. 6 p4-10 (1977)

FELDMAN, J.A. (1979) 'High Level Programming for Distributed Computing', Comms. of ACM 22 (6) p353-368 (June 1979)

FLORENTIN, J.J. (1977) 'Automatic Generation of Interactor Programs', Proc. Displays of Man-Machine Systems (Lancaster : 1977)

GAINES, B.R. (1972) 'Axioms for Adaptive Behaviour', I.J. of Man-Machine Studies 4 (2) p169 (April 1972)

GILB, T. & WEINBERG, G.M. (1977) Humanized Input : Techniques for Reliable Keyed Input (1977)


GOULD, J.D. (1975) 'Some Psychological Evidence on How People Debug Computer Programs', I.J. of Man-Machine Studies 7 (2) p151 (February 1975)

GRIGNETTI, M.C. & MILLER, D.C. (1970) 'Modifying Computer Response Characteristics to Influence Command Choice', IEE Conference on Man-Computer Interaction p201-205 (1970)

HARTLEY, J.R., SLEEMAN, D.H. & WOODS, P. (1970) 'Assisting the Computer to Assist Instruction', IEE Conference on Man-Computer Interaction p218-227 (1970)

HILL, I.D. (1972) 'Wouldn't it be nice if we could write Computer Programs in Ordinary English - Or would it?', Computer Bulletin p306 (June 1972)

HILLMAN, A.L. & SCHOFIELD, D. (1977) 'EDIT, An Interactive Network Service', Software Practice & Experience 7 p595-611 (1977)

HUCKLE, B.A. 'Designing a Command Language for Inexperienced Users', Hatfield Polytechnic

HUMMEL, H. (1976) 'LEKTOR - A List-Oriented, Machine-Independent Programming System for Conversational Applications', Software Practice & Experience 6 (4) p447 (October - December 1976)

IRELAND, D.M. (1978) 'A Study of the Science Museum Terminal', MSc Thesis Department of Computing and Control, Imperial College, London University

JAMES, E.B. (1979) 'Personal Computing Systems for University Research', Imperial College Computer Centre, Internal Report (November 1979)

JAMES, E.B. (1980) 'The user interface', British Computer Society Journal (Feb 1980)

JAMES, E.B. & IRELAND, D.M. (1981) 'Microcomputers as Protective Interfaces in Computing Networks', Software Practice & Experience (1981)


JAMES, E.B. & IRELAND, D.M. (1980) 'Programming Techniques for User-Friendly Interfaces to Operating Systems', Internal Report Imperial College Computer Centre (1980)

JAMES, E.B. & PARTRIDGE, D.P. (1972) 'Machine Intelligence : The Best of Both Worlds?', I.J. of Man-Machine Studies 4 (1) p23-31 (January 1972)

JAMES, E.B. & PARTRIDGE, D.P. (1976) 'Tolerance to Inaccuracy in Computer Programs', The Computer Journal 19 (3) p207-212 (1976)

JERVIS, M.W. (1970) 'The Use of Alpha-Numeric Cathode Ray Tube Data Displays in CEGB Power Stations', IEE Conference on Man-Computer Interaction p35 (1970)

KENNEDY, T.C.S. (1974) 'The Design of Interactive Procedures for Man-Machine Communication', I.J. of Man-Machine Studies 6 (3) p309 (May 1974)

KENNEDY, T.C.S. (1975) 'Some Behavioural Factors Affecting the Training of Naive Users of an Interactive Computer System', I.J. of Man-Machine Studies 7 (6) p817 (November 1975)

KERNIGHAN, B.W. & MASHEY, J.R. (1979) 'The UNIX Programming Environment', Software Practice & Experience 9 p1-15 (1979)

KNOWLTON, K. (1962) 'Sentence Parsing with a Self-Organizing Heuristic Program', PhD Thesis MIT (1962)

LEA, W.A. (1970) 'Towards Versatile Speech Communication with Computers', I.J. of Man-Machine Studies 2 (2) p107 (April 1970)

LEE, R.I. (1973) 'A Model for Computer Aided Programming', PhD Thesis University of London (1973)

LONGUET-HIGGINS, C. & ISARD, S.D. (1970) 'The Monkey's Paw', IEE Conference on Man-Computer Interaction p83-88 (1970)


LONGUET-HIGGINS, H.C. & ORTONY, A. (1968) 'The Adaptive Memorization of Sequences', Machine Intelligence 3 D. Michie (ed.) University Press, Edinburgh (1968)

LUCAS, R.W., CREAN, G.P., KNILL-JONES, R.P. & CARD, W.I. (1976) 'An Adaptive System for the Interrogation of Hospital Patients', Proc. Conf. on the Appl. of Elec. in Medicine p161-170 (1976)

McCARTHY, J. (1959) 'Programs with Common Sense', Proc. Symposium on Mechanisation of Thought Processes (NPL : 1959)

MADSON, J. (1979) 'CCL - A High-Level Command Language', Software Practice & Experience 9 p25-30 (1979)

MARTENS, H.H. (1959) 'Two Notes on Machine "Learning"', Inf. Contr. 2 p364-374 (1959)

MARTIN, J.N.T. & MORTON, J. (1970) 'Interrogation by Naive Users', IEE Conference on Man-Computer Interaction p206 (1970)

MASHEY, J.R. (1976) 'Using a Command Language as a High-Level Programming Language', Proc. Second Int. Conf. on Soft. Eng. p169-176 (1976)

MICHIE, D. (1980) 'Expert Systems', The Computer Journal 23 (4) p369-376 (November 1980)

MILLER, L.A. (1974) 'Programming by Non-Programmers', I.J. of Man-Machine Studies 6 (1974)

MILLER, L.H. (1977) 'A Study in Man-Machine Interaction', National Computer Conference p409-421 (1977)

MILLER, R. (1978) 'UNIX - A Portable Operating System?', Operating Systems Review 12 (3) p23-37 (1978)


MOORE, L. (1975) 'The Feasibility of a Transportable Job Organisation Language', MPhil Thesis University of London

MOORE, L. (1979) 'Design for a Transportable Job Organisation Language', The Computer Journal 22 (4) p296-302 (1979)

NEWMAN, I.A. (1973) 'The UNIQUE Command Language for Portable Job Control', Datafair Proceedings p353-357 (1973)

NEWMAN, I.A. (1976) 'Machine Independent Command Language', Computer Bulletin 2 (9) p18-19 (September 1976)

NEWMAN, I.A. (1978) 'Personalised User Interfaces to Computer Systems', Proceedings: Eurocomp 1978 p473-486 Online Conferences

NEWMAN, I.A. et al. (1975) 'Machine Independent Command Language - A Framework for Development', Computer Bulletin 2 (4) p14-15 (June 1975)

OSSANNA, J.F. & SALTZER, J.H. (1970) 'Technical and Human Engineering Problems in Connecting Terminals to a Time-Sharing System', AFIPS FJCC 37 p355-362 (1970)

PARTRIDGE, D. (1972) 'Heuristic Methods in the Analysis of Program Statements', PhD Thesis Department of Computing and Control, Imperial College, University of London (August 1972)

PARTRIDGE, D.P. & JAMES, E.B. (1974) 'Natural Information Processing', I.J. of Man-Machine Studies 6 (2) p205 (March 1974)

PARTRIDGE, D.P. & JAMES, E.B. (1976) 'Compiling Techniques to Exploit the Pattern of Language Use', Software Practice & Experience 6 (4) p527 (October-December 1976)

PRICE, W.L. (1970) 'The Viability of Computer Transcription of Machine Shorthand', IEE Conference on Man-Computer Interaction p1 (1970)


RAPHAEL, B. (1964) 'SIR : A Computer Program for Semantic Information Retrieval', PhD Thesis Department of Mathematics, MIT (1964)

RAPHAEL, B. (1964) 'A Computer Program which "Understands"', Proc. AFIPS Joint Computer Conference 26 p577-589 (Fall 1964)

REISNER, P. (1977) 'Use of Psychological Experimentation as an Aid to Development of a Query Language', IEEE Trans. on Soft. Eng. SE-3 (3) p218 (May 1977)

REYNOLDS, C.F. (1970) 'CODIL', IEE Conference on Man-Computer Interaction p211 (1970)

RIDSDALE, B. (1970) 'The Non-Specialist User and the Computer Terminal', IEE Conference on Man-Computer Interaction (1970)

RITCHIE, D.M. & THOMPSON, K. (1974) 'The UNIX Time-Sharing System', Comm. ACM 17 (7) p365-375 (1974)

SCHOFIELD, D., HILLMAN, A.L. & RODGERS, J.L. 'MM/1, A Man-Machine Interface - A Contribution Towards a Standard', NPL Report

SELF, J. (1977) 'Artificial Intelligence Techniques in Computer Aided Instruction', Austral. Comp. J. 9 p118-127 (1977)

SHACKEL, B. (1969) 'Man-Computer Interaction - The Contribution of the Human Sciences', Ergonomics 12 p485-498 (1969)

SIBLEY, E.H. (1976) 'Economic Justification of an OSCL/OSRL', Operating Systems Review 10 (4) p7-15 (October 1976)

SOWRY, J. (1973) 'A High Level Language for Processing Binary Trees', MSc Thesis Computing and Control Collection, Imperial College, University of London (1973)


SPARKES, J.J. (1969) 'Pattern Recognition and a Model of the Brain', I.J. of Man-Machine Studies 1 (3) p263-278 (July 1969)

TAGG, (1978) 'A Command Language for CAL', CALNEWS 2 p10 (November 1978)

TEITELMAN, W. (1972) '"Do What I Mean" : The Programmer's Assistant', Computers and Automation p8-11 (April 1972)

TURING, A.M. (1950) 'Can the Machine Think?', Mind 59 p430 (1950)

VELICHNO, V.M. & ZAGORUYKO, N.G. (1970) 'Automatic Recognition of 200 Words', I.J. of Man-Machine Studies 2 (3) p223 (July 1970)

WALTHER, G.H. & O'NEIL, H.F. (1974) 'On-Line User-Computer Interface - The Effects of Interface Flexibility, Terminal Type, and Experience on Performance', National Computer Conference p379-384 (1974)

WATSON, W.W. (1974) 'User Interface Design Issues for a Large Interactive System', National Computer Conference p357-364 (1974)

WEIZENBAUM, J. (1966) 'ELIZA - A Computer Program for the Study of Natural Language Communication between Man and Machine', Comm. ACM 9 p36 (January 1966)

WEIZENBAUM, J. (1967) 'Contextual Understanding by Computers', Comm. ACM 10 (8) (August 1967)

YOUNGS, E.A. (1974) 'Human Errors in Programming', I.J. of Man-Machine Studies 6 (3) p361 (May 1974)

ZADEH, L.A. (1963) 'On the Definition of Adaptivity', Proc. IEEE (correspondence) 51 p469 (1963)


ZINN, K.L. & CONKLIN, J.W. (1970) 'Programming Languages and Operating Systems for Specific Instructional Environments', IEE Conference on Man-Computer Interaction p57-64 (1970)

APPENDIX A : DETAILED DESCRIPTION OF PROGRAMS

A.1 Introduction

This appendix provides a description of the programs down to the level of each procedure. The programs were written in Microsoft Fortran on a Research Machines Ltd (RML) 380Z. The Fortran source is not reproduced here. The program source was stored on two CP/M format 8-inch double sided floppy disks, which are identified by the colour of their label (orange or red). For further description of the organisation of the files over the disks see section B.2 in appendix B.

A.2 Module Definition

In this section a 'module' is used to mean all the routines contained in one source file. It should be noted that this is not the same as a module in the sense used by the Microsoft compilers and linkers used on the RML 380Z. The Fortran compiler actually makes each routine a separate module.

A.3 Module Hierarchy

Apart from the main programs and the standard libraries there are six modules:

TRE - tree operations

NOD - node operations

FLD - field operations

CHR - character operations

STK - stack operations

IO - I/O operations

Page 98 Detailed Description of Programs Appendix A

The module hierarchy is shown below (each module may use any of the procedures in the module below it):

main programs:      TREEM, RESTRUCT, BLDTRE (with BLDRTNS), UNBUILD

support libraries:  TRE

                    CHR, NOD

                    STK, FLD

                    IO

A.4 Tracing

Each module (in the sense defined in section A.2) has a 'trace level' associated with it. There is an array of logical variables (TRACE) which controls the output of trace information. If the flag in this array corresponding to a module's trace level is set to 'trace', then all calls to routines in that module are recorded, along with their parameters and results.

This trace information is written to the logical unit specified by the variable ITRAFL, so that trace information can be directed either to the console or to a file. When TREEM starts it reads the file "TRACECTL.TRM" on the logged on disk, which should contain 10 logicals used to set the trace level flags and an integer which is used to set ITRAFL. If the file does not exist then all tracing is disabled.

The trace levels and their associated modules are:

1 display stack usage

2 main programs (TREEM, RESTRUCT, BLDTRE, UNBUILD)

3 TRE

4 IO

5 NOD

6 FLD

7 CHR

8 STK

9 on-screen display of tree traversal

10 not used
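The start-up handling of the trace control file can be sketched as follows. The sketch is in Python purely for illustration (the original programs were in Fortran and their source is not reproduced), and the file layout it assumes, ten whitespace-separated logicals followed by the unit number, is a guess; the names load_trace_control and tracing are hypothetical.

```python
import os

NUM_TRACE_LEVELS = 10  # one flag per trace level


def load_trace_control(path):
    """Return (flags, unit); every flag is off if the control file is missing."""
    if not os.path.exists(path):
        return [False] * NUM_TRACE_LEVELS, None
    with open(path) as handle:
        tokens = handle.read().split()
    # Assumed layout: ten logicals (T/F or .TRUE./.FALSE.), then the unit number.
    flags = [tok.upper().lstrip(".").startswith("T")
             for tok in tokens[:NUM_TRACE_LEVELS]]
    unit = int(tokens[NUM_TRACE_LEVELS])
    return flags, unit


def tracing(flags, level):
    """Trace levels are numbered from 1, as in the list above."""
    return flags[level - 1]
```

A routine in module NOD, for example, would consult tracing(flags, 5) before writing anything to the trace unit.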

A.5 Module/Routine Descriptions

Functional descriptions of the four main programs are given in chapter 4. This section of the appendix briefly describes the design of the programs and the common base modules or libraries that they use.


A.5.1 Program TREEM

Source file: TREEM .FOR Source disk: Orange (side 1)

Trace level: 2

This main program checks whether any parameters were given on the invocation command line: if there is a filename then this is opened as the digraph file. If there are no parameters then the default file "NETWORK.NOD" is opened as the digraph file. If the file "TRACECTL.TRM" is found then it is read to determine which trace flags should be set and the unit number to use for the trace file. The trace will normally be to unit 6, which is opened as file "TREEM.TRA". The digraph is read in from the digraph file.

The program then enters the main loop, where it processes the current node according to the description in appendix F. When a QUIT node is processed the loop is terminated, the digraph file that was read in is renamed to have an extension of "OLD", and the digraph is written out to the filename from which it was read ("NETWORK.NOD" if no filename was specified when TREEM was invoked).
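The rename-then-write step performed when a QUIT node is processed can be sketched as below. This is an illustrative Python fragment, not the original Fortran; the function name save_with_backup is hypothetical, and it is an assumption that an earlier "OLD" file is simply overwritten.

```python
import os


def save_with_backup(path, data):
    """Rename the existing file to <name>.OLD, then write the new contents."""
    base, _extension = os.path.splitext(path)
    backup = base + ".OLD"
    if os.path.exists(path):
        # Assumption: any earlier backup is replaced without warning.
        os.replace(path, backup)
    with open(path, "w") as handle:
        handle.write(data)
    return backup
```

The previous version of the digraph therefore survives one run, which gives a simple way of recovering from an unwanted adaptation.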

A.5.2 Program RESTRUCT

Source file: RESTRUCT .FOR Source disk: Red (side 2) Trace level: 2

RESTRUCT gets any filename parameter and opens it as the digraph file. If no filename was given as a parameter to RESTRUCT then it defaults to opening the file "NETWORK.NOD". It then reads the trace flags and unit number from the file "TRACECTL.RST"; if not found then all trace flags are set to false. The file "RESTRUCT.TRA" is opened for use as the normal trace file, using unit 6. The digraph is read in. The user is then prompted to enter the number of significant characters in keywords (i.e. the minimum number of characters that will be accepted as a command).

Then the digraph is traversed. Whenever a local tree is found all the jump nodes are removed, the branches are re-ordered using the information on number of passes that TREEM has stored in them, and then the jump nodes are re-inserted. Note that if the number of significant characters in keywords has been changed then the jump nodes will be re-inserted at different points along the branches. The program has a set of "visited" flags which are used to mark when a node has been visited.
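The re-ordering step might look like the sketch below, which handles only one level of a local tree: a chain of alternative branches joined by their down-links. It is written in Python for illustration and is not the RESTRUCT code; the Node class, the use of None for the fail node, and the function name reorder_alternatives are all assumptions.

```python
class Node:
    def __init__(self, letter, count=0):
        self.letter = letter
        self.count = count   # number of successful passes recorded by TREEM
        self.right = None    # next node after a successful match
        self.down = None     # alternative tried on failure (None = fail node)


def reorder_alternatives(head):
    """Relink a chain of down-linked alternatives, most used first."""
    chain = []
    node = head
    while node is not None:
        chain.append(node)
        node = node.down
    # The sort is stable, so equally used branches keep their old order.
    chain.sort(key=lambda n: n.count, reverse=True)
    for upper, lower in zip(chain, chain[1:]):
        upper.down = lower
    if chain:
        chain[-1].down = None
    return chain[0] if chain else None
```

Because TREEM tries alternatives from the top down, moving the most frequently matched branch to the head of the chain shortens the average search.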

Once the digraph has been completely traversed a check is run over all nodes to find out if there are any that have not been visited. If any such nodes are found then a warning message that there is no path to them is printed. RESTRUCT does not attempt to repair faults in the digraph: if any are found then it is necessary to use UNBUILD to return to text-editable form, make any necessary changes manually, and then re-build with BLDTRE.
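The check for nodes with no path to them amounts to marking every node reached during the traversal and then listing the remainder. A minimal Python sketch follows; the representation of the digraph as a mapping from node identifier to its two links, the value 0 for the fail node, and the name unreachable_nodes are assumptions made for the example.

```python
FAIL = 0  # assumed identifier of the fail node


def unreachable_nodes(links, root):
    """links maps node id -> (right_link, down_link); return ids never visited."""
    visited = set()
    pending = [root]
    while pending:
        node = pending.pop()
        if node == FAIL or node in visited:
            continue
        visited.add(node)          # the "visited" flag for this node
        right, down = links[node]
        pending.append(right)
        pending.append(down)
    return sorted(set(links) - visited)
```

With links = {1: (2, 3), 2: (0, 0), 3: (0, 0), 4: (2, 0)} and root 1, node 4 is reported, since nothing points to it.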

Upon completion the original input file is renamed to have an extension of "OLD" and then the restructured digraph file is written out to the filename used for input ("NETWORK.NOD" by default).

A.5.3 Program BLDTRE

Source file: BLDTRE .FOR, BLDRTNS.FOR Source disk: Red (side 1) Trace level: 2

This program takes two optional parameters. The first is the filename containing the macro node definitions. The second is the name of the internal format digraph file to be written. If not given then the first defaults to "MACROS.NET" and the second to "NETWORK.NOD". The trace flags and unit number are read from the file "TRACECTL.BLD" if it exists, defaulting to all trace flags false if the file is not found.


BLDTRE then parses the source file, expanding the macro-nodes into their equivalent sets of internal digraph nodes. It will then prompt for the number of significant characters in keywords, and traverse the digraph inserting jump nodes. While traversing the digraph a "visited" flag is set for every node visited. A check is then made that every node was visited; if any were not then a warning is printed that there is no path to the node. A correlation table between the internal nodes and input macro-nodes is kept, so it is possible to identify the offending macro-node in the warning message.

BLDTRE then prompts the user to specify which characters will be accepted as delimiters between keywords and parameters (e.g. space, comma etc.) and the character used to signal newlines in the macro-node source file.

If there is already a file with the name to be used for writing the digraph file (default "NETWORK.NOD") then it is renamed to have an extension of "OLD", then the digraph file is written.

A.5.4 Program UNBUILD

Source file: UNBUILD .FOR Source disk: Red (side 2) Trace level: 2

UNBUILD can take two filenames as parameters. The first is the name of the file containing the digraph, and defaults to "NETWORK.NOD" if not given. The second is the name of the file to dump the macro-node listing to, and defaults to "MACROS.DMP" if not given.

The file "UNBLD.TRA" is opened for use as the trace file (if the default unit 6 is used) and the trace control file "TRACECTL.UNB" is read to set the trace control flags and trace file unit number.


UNBUILD then reads in the digraph and traverses it, converting the internal nodes to corresponding macro-nodes. It keeps a set of "visited" flags for the nodes and reports any unconnected nodes at the end. Note that the macro-node listing may bear no resemblance to the original macro-node description used to create the digraph in the first place.

A.5.5 Module TRE

Source file: TRE .FOR Source disk: Orange (side 1)

Trace level: 3

The routines in this module manipulate local trees and the branches in them. Routines in modules NOD and FLD are used to move about the trees, and to access fields in the nodes. Two stacks are used for backtracking: stack number 2 is used for traversing a local tree (as normal), and in addition stack number 1 is used for traversing the whole network.

Subroutine JMPLNK

This subroutine is used to insert jump nodes into all the local trees in the network. It has one parameter, which specifies the number of significant characters in keywords, i.e. the minimum length to which keywords typed in by the user may be abbreviated and still be accepted by TREEM.

A breadth first search (ref ***) (using stack 1) is used to find the root nodes of all the local trees in the network. For each local tree JUMPIN is called to insert the jump nodes into the local tree, and then LTREE is called to list the local tree, so that a side effect of using JMPLNK is that all lists of keywords defined by the network are displayed.


Subroutine JUMPIN

This routine inserts jump nodes into a local tree. It is called from JMPLNK with two parameters: the root node of a local tree and the number of significant characters in keywords (see description of JMPLNK).

The object is to find any sequences of immediate-match nodes which do not have any alternative branches (i.e. all their down-links point to the fail node), for a number of nodes that is greater than SIGCHR (the number of significant characters). It is the routine ISCNRT which scans along branches, counting the number of nodes for which there are no alternative branches. If the count returned by ISCNRT is greater than SIGCHR then the routine INSJMP is called to insert a jump node into the branch.

JUMPIN has to find the starting nodes of all branches in the local tree (i.e. all immediate-match nodes that can only be approached from above - not from the left). It does this by pushing all such nodes onto stack 2, starting with the root node of the local tree and working down. A loop then pops these starting nodes off stack 2 and uses ISCNRT to find out if a jump node should be inserted. When any other branches are discovered their starting nodes are pushed onto stack 2.

Subroutine INSJMP

This subroutine is called by JUMPIN to insert a jump node into a branch. It has three parameters: the first node in a sequence of immediate-match nodes with no alternative branches, the node at the end of the sequence, and the number of significant characters in keywords.

The jump node has to be inserted the specified number of characters after the first node. The last node may either be an immediate-match node (if the character sequence is in the middle of a keyword) or some other type of node (if the character sequence is at the end of a keyword). In the second case the letter field of the jump node is set to a space (i.e. delimiter), so that the jump node has the effect of allowing the keyword to be abbreviated at this point. In the first case the letter field is set to the same as the letter field of the last node in the sequence, so that when this letter is encountered in the input keyword (by TREEM) matching will continue from that character onwards.

The down-link of the jump node is set to point to the last node in the sequence, which is the node to be jumped to when TREEM is traversing the branch, looking for a match to a keyword.

The jump node is marked as ‘visited’, so that JMPLNK does not try to process the new jump node.
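The insertion described above can be sketched in Python as follows. This is a reconstruction from the prose rather than the Fortran routine: the Node class, the use of None for the fail node, the name insert_jump, and in particular the choice to link the rest of the branch through the jump node's right-link are assumptions.

```python
class Node:
    def __init__(self, kind, letter=" "):
        self.kind = kind      # "IMMED", "JUMP" or any other node type
        self.letter = letter
        self.right = None     # None stands in for the fail node
        self.down = None
        self.visited = False


def insert_jump(first, last, sigchr):
    """Insert a jump node after sigchr characters of the run first..last."""
    node = first
    for _ in range(sigchr - 1):   # walk to the last significant character
        node = node.right
    jump = Node("JUMP")
    # Mid-keyword run: resume on the last node's letter.
    # End-of-keyword run: a space, so the keyword may be abbreviated here.
    jump.letter = last.letter if last.kind == "IMMED" else " "
    jump.down = last              # the node to be jumped to
    jump.right = node.right       # assumed: full spellings continue along the branch
    node.right = jump
    jump.visited = True           # so the network scan does not process it
    return jump
```

For the keyword PRINT with three significant characters, the jump node lands between I and N, so that "PRI" followed by a delimiter is accepted as well as the full spelling.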

Function ISCNRT

This function scans along a branch starting at the node specified by its first parameter, until a non immediate-match node is encountered (i.e. end of branch) or one of the down-links does not point to the fail node (i.e. there is an alternative branch). The node at which scanning stopped is returned as the function's result, and the count of the nodes scanned over is returned in the second parameter.

Since this function is called indirectly by JMPLNK (via JUMPIN) all nodes scanned over are marked as 'visited' and the last node is pushed onto stack 1.
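The scanning rule can be illustrated by the following Python sketch, which covers only the scan itself (the 'visited' marking and the push onto stack 1 are left out). The Node class, the constant names and the function name scan_run are assumptions; a Python function returns both results as a pair, where the Fortran function returns the count through its second parameter.

```python
from dataclasses import dataclass
from typing import Optional

IMMED = "IMMED"  # immediate-match node type


@dataclass
class Node:
    kind: str
    letter: str = " "
    right: Optional["Node"] = None  # None stands in for the fail node
    down: Optional["Node"] = None


def scan_run(start):
    """Return (node at which scanning stopped, number of nodes scanned over)."""
    node, count = start, 0
    while node is not None and node.kind == IMMED and node.down is None:
        count += 1
        node = node.right
    return node, count
```

A count greater than the number of significant characters is the condition under which a jump node is worth inserting into the run.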

Subroutine JMPOUT

This subroutine removes all jump nodes from a local tree. It has one parameter: the root node of a local tree. It is called by the program RSTRCT, which does a breadth first search of the whole network analogous to JMPLNK, and removes jump nodes with JMPOUT prior to re-structuring and re-inserting jump nodes (possibly at different points in the local trees - which is why it is necessary to be able to remove them).

The local tree is scanned and any jump nodes are removed and returned to the "pool of unused nodes" (by routine RETNOD). The count value that was in the jump node is copied into the sequence of immediate-match nodes that follow it, since they will have been skipped and hence their counts do not reflect the number of times a successful match was made to that sequence.

Subroutine LTREE

This subroutine lists on the console the keywords that are contained in a local tree. When called by TREEM (in response to some invalid input or to the "?" command) it has the effect of giving a list of the keywords that are valid at that point in the interaction. LTREE is also used by RSTRCT as confirmation that a local tree has been processed, and to indicate the order of keywords before and after re-structuring the local tree.

It has two parameters; the root node of the local tree and the maximum number of lines to be output. When it is desired to list the whole of the local tree the second parameter should be set to a large number (larger than the total number of keywords represented by the local tree). If it is set to a fairly low number, e.g. 10, then only the first 10 keywords in the local tree will be listed (if there are more than 10), followed by "etc." to indicate to the user that there are more keywords but they have not been listed.
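The listing with a line limit can be sketched as below. The Python is illustrative only: the Node class and the names keywords and list_tree are hypothetical, a keyword is taken to end where the right-link is the fail node (None here), and the case of one keyword being a prefix of another is ignored.

```python
class Node:
    def __init__(self, letter, right=None, down=None):
        self.letter = letter
        self.right = right   # next character of the same keyword
        self.down = down     # alternative branch (None = fail node)


def keywords(node, prefix=""):
    """Yield the keywords of a local tree, topmost branch first."""
    while node is not None:
        word = prefix + node.letter
        if node.right is None:
            yield word
        else:
            yield from keywords(node.right, word)
        node = node.down


def list_tree(root, max_lines):
    """Return at most max_lines keywords, then "etc." if more remain."""
    lines = []
    for word in keywords(root):
        if len(lines) == max_lines:
            lines.append("etc.")
            break
        lines.append(word)
    return lines
```

Calling list_tree with a very large second argument gives the full list, which corresponds to the behaviour wanted for the "?" command.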

This feature of limiting the number of keywords that LTREE will list is used by TREEM when giving some valid keywords to the user after some invalid input. The limit prevents a large amount of output being produced if the user makes a mistake at a point where there is a large choice of keywords. (When the user types "?" then LTREE is called with a large "maximum number of lines" parameter, so that the whole list of possible keywords will be given). Note that local trees should normally be ordered with the most frequently used branches at the top, so LTREE will list keywords in decreasing order of use.

A.5.6 Module NOD

Source file: NOD .FOR

Source disk: Orange (side 1) Trace level: 5

These routines are used for moving between nodes and for adding nodes to the tree. The routines in module FLD are used to access the fields in a node.

Function MOVRGT

This function takes one parameter; a node identifier, and returns one result; the identifier of the node to the right of the specified node. It is always used with the 'current node' (ICNODE) both as the parameter, and the result. Besides returning the value of the 'right-link' of a node this routine has two other main purposes: the current node identifier is pushed onto stack number 2, which is used for backtracking purposes, and if the count-flag is on then the 'count' field in the node is incremented to indicate a successful traversal of the node. The reason for having a count-flag to control this counting is that while it is required to count successful matches while in TREEM, the counts should not be affected by BLDTRE, UNBLD or RESTRUCT. It is desirable that all programs should use the same underlying routines to access the nodes, and this flag makes that possible. (TREEM sets it to 'true', the other programs set it to 'false').
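The role of the count-flag can be shown with a small Python sketch. It is not the Fortran routine; the class Tree, its field names and the method name move_right are hypothetical, and the tracing code is left out.

```python
class Tree:
    def __init__(self, right_links, count_flag):
        self.right_link = dict(right_links)   # node id -> id of node to the right
        self.count = {node: 0 for node in right_links}
        self.count_flag = count_flag          # true in TREEM, false elsewhere
        self.stack2 = []                      # backtracking stack

    def move_right(self, node):
        """Return the node to the right, recording the move for backtracking."""
        self.stack2.append(node)
        if self.count_flag:
            self.count[node] += 1             # one more successful traversal
        return self.right_link[node]
```

With the flag off, a maintenance program can walk the same structure through the same routine without disturbing the usage statistics on which the adaptation depends.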

The rest of the code in the function MOVRGT is concerned with the trace options. If trace level 5 is activated then the old and new node identifiers, and their types, are written to the trace file for program de-bugging purposes.

If trace level 9 is activated then TREEM will give a two-dimensional display of the route through the tree that it is traversing. The graphics routine PLOTCH ('plot-character') is used to plot the letter in the node on the screen to the right of the previously displayed node. If the node is not of the 'immediate-match' (IMMED) or 'write-character' (WRITCH) type then the 'letter' field of the node contains a special character which indicates the node type, and this is the character that is plotted. The co-ordinates of the screen position where the letter was plotted are stored on stack number 3, as they will be needed if TREEM backtracks.

LASTND always records the last node processed. It is not used for backtracking.

Function MOVDWN

This function is analogous to MOVRGT. The only differences are that it is used to move the current node down instead of right, and the count field is not incremented since a move down implies an unsuccessful match.

If trace level 9 is activated then the node contents are displayed on the screen below the last node.

Function MOVBAK

This function is used for backtracking. It moves the current node back to the node that was processed before on a previous pass. It does this by popping a node identifier off stack 2 (which is where MOVRGT and MOVDWN push node identifiers as the current node is moved). LASTND is set to the current node before moving back, so that while backtracking LASTND records the last node processed (as it does while scanning forwards through the tree).


If trace level 9 is on, then the appropriate screen co-ordinates are popped off stack 3, so that the display on the screen will backtrack correctly too. It is not necessary to plot the contents of the node on the screen as it should already be there from the forward pass, but if a different branch is tried after backtracking the display will continue from the correct point.
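The interplay between the forward moves and MOVBAK on stack 2 can be sketched as follows. The Python is illustrative; the function names move and move_back are hypothetical, and the screen co-ordinates kept on stack 3 are omitted.

```python
stack2 = []  # node identifiers pushed by the forward moves


def move(current, target):
    """Common part of a move right or down: remember where we came from."""
    stack2.append(current)
    return target


def move_back(current):
    """Return (previous node, last node processed), as MOVBAK is described."""
    last_node = current          # LASTND is set before moving back
    return stack2.pop(), last_node
```

Each call to move_back undoes exactly one forward move, so repeated calls retrace the route to the point where an untried alternative branch exists.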

Function IADDRT

This function is used to add a new node to the right of the specified node. The second parameter specifies what type the new node is to be. This information is passed on to NEWNOD (in module FLD) which is called to create the new node. IADDRT is normally used to add a node to the right of the current node, making the new node the current node.

It is assumed that the current node does not have any more nodes to the right of it (right link points to 'fail node'): IADDRT is used for adding nodes to a branch that is being grown, it cannot be used for inserting nodes into the middle of an existing branch.

NEWNOD initialises some of the fields of the new node (see description of NEWNOD). The right link and down link are both set to point to the fail node by IADDRT.
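A sketch of growing a branch to the right is given below. It is in Python for illustration; the list-of-dictionaries node store, the value 0 for the fail node, the fields set by the stand-in for NEWNOD, and the explicit precondition check are all assumptions (the text states the precondition but does not say that it is checked).

```python
FAIL = 0  # assumed: identifier 0 is reserved for the fail node


def new_node(nodes, kind):
    """Stand-in for NEWNOD: allocate a node and return its identifier."""
    nodes.append({"type": kind, "count": 0})
    return len(nodes) - 1


def add_right(nodes, current, kind):
    """Add a node to the right of current and return the new identifier."""
    # The branch must end at current; this check is an addition of the sketch.
    assert nodes[current]["right"] == FAIL, "cannot insert mid-branch"
    new_id = new_node(nodes, kind)
    nodes[new_id]["right"] = FAIL
    nodes[new_id]["down"] = FAIL
    nodes[current]["right"] = new_id
    return new_id
```

Assigning the returned identifier to the current node reproduces the normal usage, in which the new node becomes the current node.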

If trace level 9 is active then the new node will be displayed on the screen to the right of the last node (as for MOVRGT).

Function IADDBW

This function is used to add a node below the current node. It is very similar to IADDRT. IADDBW is only used while a tree is being built to add a new branch at the bottom of a tree, it cannot be used to insert a new branch half way down.


A.5.7 Module FLD

Source file: FLD .FOR

Source disk: Orange (side 1) Trace level: 6

These routines are used to access the fields of nodes, and also to create or delete nodes. Since all access to the nodes is via this module it is possible to change between storing the nodes in memory or on disk by changing this module only. The version described here stores the whole network in memory while the program (TREEM, BLDTRE etc.) is running. In between runs the network is stored on a file on a diskette. Routines to load a network from disk, or dump it to disk, are also included in this module.

Function NODTYP

This integer function takes one parameter: a node identifier (an integer). It returns the contents of the 'type' field of the specified node.

The node identifier is checked for validity by calling RNGCHK, which will stop the program if the node identifier is out of range. If trace level 6 is active then the node identifier and the contents of the type field are output to the trace file.

Function IRLINK

This integer function takes an integer (a node identifier) as a parameter, and returns the value stored in the 'right-link' field of the specified node.

Function IDLINK

This integer function returns the value of the down-link field of the node specified by the integer parameter.

Function LETTER

This integer function returns the character stored in the 'letter' field of the node specified by the parameter, in A1 format.

Function IPARNO

This integer function returns the number stored in the 'parameter-number' field of the node specified as the (integer) parameter.

This field is actually the same as the letter field, in order to save storage space, since there is no node type which requires both a letter field and a parameter-number field. However, no check is made that the node type is compatible with a call to IPARNO.

Function KOUNT

This integer function returns the value stored in the 'count' field of the node specified by the argument. The count field is used to record the number of times an immediate-match node is processed successfully by TREEM.

Function ICHAN

This integer function returns the 'channel number' stored in the specified node. The channel number determines which I/O stream certain node types (get, write buffer etc.) act upon. It is stored in the same field as the count, since there is no node type which requires both fields.

Subroutine SETTYP

This subroutine takes two integer parameters. The first is a node identifier, the second is the 'type' which is to be inserted into the type field of the specified node. RNGCHK is called to check that the node identifier is valid. No check is made on the type field. If trace level 6 is active then the routine name, the node identifier, and the value being stored in the type field are output to the trace file.

Subroutine SETRLK

The node specified by the first parameter has its right-link field set to the value specified by the second parameter.

Subroutine SETDLK

The down-link of the node specified by the first (integer) parameter is set to the value specified by the second (integer) parameter.

Subroutine SETLET

This subroutine is used to store a character in the letter field of a node. The first (integer) parameter is the node identifier, the second (integer) parameter should contain a character in A1 format that is to be stored in the letter field of the node.

Subroutine SETPNO

This subroutine takes two integer parameters. The first is a node identifier, the second is a parameter number that is to be stored in the parameter number field of the node.

Subroutine SETCNT

This subroutine sets the count-field of the node specified by the first integer parameter to the value specified by the second integer parameter.

Subroutine SETCHN

This subroutine is used for setting the channel number (or stream number, or conversation number) in nodes that have this field. The first parameter (an integer) is the node identifier, the second parameter (an integer) is the channel number. This is not checked, but should be in the range 1 to 3.

Subroutine RNGCHK

This subroutine is called by all the other routines to check that a node identifier is in range (i.e. is valid). If it is not then an error message is displayed on the console and the program is stopped.
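The access pattern of this module (every accessor validates the node identifier, and two pairs of fields share storage) can be sketched as follows. This is a Python illustration, not the Fortran source: the capacity `MAXNOD`, the valid range 1 to `MAXNOD`, and the use of `SystemExit` to model stopping the program are all assumptions.

```python
MAXNOD = 100  # assumed capacity of the in-memory network


class NodeStore:
    """Parallel arrays indexed by node identifier (index 0 unused)."""

    def __init__(self):
        self.type_ = [0] * (MAXNOD + 1)
        self.letter = [0] * (MAXNOD + 1)  # shared with the parameter number
        self.count = [0] * (MAXNOD + 1)   # shared with the channel number

    def rngchk(self, node):
        # Model of RNGCHK: stop the program on an invalid identifier.
        if not 1 <= node <= MAXNOD:
            raise SystemExit(f"RNGCHK: node identifier {node} out of range")

    def nodtyp(self, node):
        self.rngchk(node)
        return self.type_[node]

    def settyp(self, node, value):
        self.rngchk(node)
        self.type_[node] = value

    def setlet(self, node, char):
        self.rngchk(node)
        self.letter[node] = char

    def iparno(self, node):
        # Reads the same storage as the letter field; no type check is made.
        self.rngchk(node)
        return self.letter[node]

    def setchn(self, node, channel):
        self.rngchk(node)
        self.count[node] = channel

    def kount(self, node):
        # Reads the same storage as the channel number.
        self.rngchk(node)
        return self.count[node]
```

Because all access goes through these routines, replacing the lists with disk-backed storage would only affect this one class, which mirrors the design point made at the start of this module description.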


A.5.8 Module STK

Source file: STK.FOR

Source disk: Orange (side 1) Trace level: 8

This module provides three stacks for use by the other modules. They are used for storing node identifiers for back-tracking and network scanning purposes (stacks 1 and 2), and for saving the co-ordinates of points plotted on the console screen (stack 3).

In general the stacks can be used for storing any integers. The number of stacks can be changed by changing the declarations for the common area /STACK/, and by changing the upper bound in the DO loop in INITSK. The sizes of the stacks can also be changed by modifying the declarations, and the stack overflow test in PUSH. The stacks must all be the same size, and must be sufficiently large in relation to the network size and the largest local tree for stack overflow not to occur. If trace level 1 is active then some information to help select an appropriate stack size is given - see description of RSTSTK.

Subroutine INITSK

This routine initialises the stacks by setting their associated stack pointers to zero (the stacks work "upwards"). It also sets the "maximum stack size used" variables to zero.

Subroutine RSTSTK

This routine resets (empties) a stack. The stack number is specified as a parameter. Before the stack pointer and maximum size variables are reset the value in the maximum size variable is output, if trace level 1 is active. This variable records the maximum value that the stack pointer has reached since it was last reset.

Whenever significant changes are made to the maximum network size the programs TREEM, BLDTRE, UNBLD and RESTRUCT should all be run initially with trace level 1 active, and a generous stack size. It is then possible to find out the maximum stack sizes actually used by the programs, and the necessary modifications can be made to STK to reduce the sizes of the stack arrays to a reasonable level again.

Function EMPTY

This logical function takes a stack number as a parameter and returns 'true' if the specified stack is empty, or otherwise 'false'. (The associated stack pointer is zero if the stack is empty).

Function IPOP

This integer function is used to pop a value from a stack. It takes a stack number as a parameter and returns the top value on the stack. The stack pointer is decremented. A test for stack underflow is made, in case the stack was empty. The program is halted if this error is detected.

Subroutine PUSH

This routine pushes an integer value onto a stack. It takes two parameters; the first is the stack number, the second is the value to be pushed onto the stack.

The stack pointer is incremented. A test for stack overflow is made. If it has overflowed then a message is output and the program stopped, otherwise the value is put onto the top of the stack.


If trace level 8 is active then the stack number, value, and current position of the stack top (i.e. the stack pointer) are output to the trace file.
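The stack module as a whole can be summarised by the following Python sketch (the original is Fortran). The stack size `STKSIZ` is an assumed value, `SystemExit` models stopping the program, and the return value of `rststk` is added for convenience; the original outputs the figure to the trace instead.

```python
NSTACK = 3    # three stacks, as in the module described above
STKSIZ = 50   # assumed size; all stacks share the same size


class Stacks:
    def __init__(self):
        # Model of INITSK: pointers and high-water marks start at zero.
        self.data = [[0] * STKSIZ for _ in range(NSTACK)]
        self.ptr = [0] * NSTACK
        self.maxused = [0] * NSTACK

    def rststk(self, n):
        """Empty stack n and return the high-water mark since the last reset."""
        used = self.maxused[n - 1]
        self.ptr[n - 1] = 0
        self.maxused[n - 1] = 0
        return used

    def empty(self, n):
        return self.ptr[n - 1] == 0

    def push(self, n, value):
        i = n - 1
        self.ptr[i] += 1                      # the stacks grow upwards
        if self.ptr[i] > STKSIZ:
            raise SystemExit("PUSH: stack overflow")
        self.data[i][self.ptr[i] - 1] = value
        self.maxused[i] = max(self.maxused[i], self.ptr[i])

    def ipop(self, n):
        i = n - 1
        if self.ptr[i] == 0:
            raise SystemExit("IPOP: stack underflow")
        value = self.data[i][self.ptr[i] - 1]
        self.ptr[i] -= 1
        return value
```

Running a program with a generous `STKSIZ` and inspecting the value returned by `rststk` corresponds to the sizing procedure recommended under RSTSTK.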

A.5.9 Module CHR

Source file: CHR.FOR

Source disk: Orange (side 1) Trace level: 7

The routines in this module are used for performing tests on characters. The characters are assumed to be in A1 format in integer variables, which is the way that the programs TREEM, BLDTRE, UNBLD and RESTRUCT store characters. Although this wastes some RAM (since it is possible to pack two characters into one integer, or alternatively to store them in logical variables which only occupy one byte) this scheme makes the processing of characters easier (with a consequent saving in the program size, which will partially offset the extra memory used for data), and it improves the portability of the programs.

Function INDLMS

This logical function takes one parameter, which is a character. (The character may be stored in either an integer or a logical variable). The function returns 'true' if the character is in the delimiter set, or 'false' if it is not.

The delimiter set is defined by the integer array IDELIM (in common block /DELIMS/). Up to five delimiter characters may be defined in the present system. To use fewer than five, one of the delimiter characters should be repeated to fill all five positions in IDELIM. To increase the number of delimiters the declaration in the BLOCK DATA subprogram should be changed (that is where IDELIM is initialised), and the upper bound of the DO loop in INDLMS should also be changed.

Function CHMTCH

This logical function compares two characters, which are passed as parameters. The two parameters may be either integer or logical variables. If the two characters match then 'true' is returned, if they do not then 'false' is returned, except in the special case where both are delimiter characters (but not necessarily the same character), in which case 'true' is returned.

The function INDLMS is used to test both characters to see if they are delimiter characters.
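The matching rule can be sketched in Python as follows. The contents of `IDELIM` here are an assumed example (with the space repeated to fill the fifth position), and characters are represented as one-character strings rather than integers in A1 format.

```python
# Assumed delimiter set; the space is repeated to fill all five positions.
IDELIM = [" ", ",", "(", ")", " "]


def indlms(ch):
    """Return True if ch is in the delimiter set."""
    return ch in IDELIM


def chmtch(a, b):
    """Return True if a and b match, treating all delimiters as equivalent."""
    if indlms(a) and indlms(b):
        return True
    return a == b
```

With this rule a comma in the user's input matches a space stored in the tree, but a delimiter never matches an ordinary character.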

Function SPCLCH

When the tree is being displayed on the console as it is being scanned, nodes that are not immediate-match or write-character nodes are represented by special characters. The set of special characters is in common block /ITYPCH/ (the characters are set up by the BLOCK DATA subprogram). The logical function SPCLCH tests whether the parameter (a character stored in an integer in A1 format) is one of the special characters, and returns 'true' if it is, or otherwise 'false'.

The upper bound of the DO loop in this function must be the same as the number of special characters defined.


A.5.10 Module IO

Source file: IO.FOR

Source disk: Orange (side 1) Trace level: 4

The routines in this module are used to provide character-oriented I/O, since Fortran I/O is line-oriented. Three I/O streams (channels) are defined. Channel 1 is the console and keyboard (user I/O stream), channel 2 is the SIO4 interface of the RML 380Z (which is connected to the mainframe), and channel 3 is the file on logical unit 4. This file is 'TREEM.OUT' by default, but may be specified as anything else by giving a filename on the same line as the TREEM command. (See description of the main program of TREEM).

This stream (channel 3) is used when it is desired to use TREEM to create a file from the user's input, instead of interfacing to another computer. An example of this type of use of TREEM is the validation of macro-node network descriptions, as they are input. Although any given application of TREEM will probably only use two of the I/O streams, it is possible for an application to use all three (e.g. when being used to interface between a user and a mainframe, a log file could be created using stream 3).

Subroutine INITIO

This routine initialises the IO module by setting the 'position pointers' associated with the output buffers to zero, to indicate that they are empty (otherwise the first buffers output would start with garbage, unless they were all individually cleared with CLRBUF before the first call to WRITCH). The input position pointers are not initialised, so it is important that NEXTCH should not be called before the first call to READLN for that channel.

Function NEXTCH

This function returns the next character in the input stream specified by the parameter. If the input stream (channel) is invalid, i.e. not in the range 1 to 3, then the diagnostic 'INVALID UNIT' is output to the console and the program is stopped. Otherwise the next character in the input buffer is obtained, and converted to upper case if it is in lower case. No check is made on whether the program is attempting to read past the end of the valid data in the input buffer - the program should use the logical function ENDLN to test for end of line before using NEXTCH. There is also no check that anything has been read into the input buffer - the program must call READLN before the first call to NEXTCH and whenever ENDLN returns 'true' and more input is required.

Subroutine READLN

This subroutine has one parameter, which gives the stream from which another line is to be read. This routine should be called whenever it is desired to fill the appropriate input buffer (the function NEXTCH is then used to obtain characters from the input buffer). The associated end-pointer is set to the number of characters in the buffer, and the position-pointer is set to zero, so that NEXTCH will read from the start of the buffer. The global variable ICHAR is set to the first character in the input buffer by calling NEXTCH once.

If the channel number is one then the line is read from the user's keyboard, if three then from the file on logical unit 4. If end of file is detected then the global variable FINIS is set to 'true'. A check is made to see if the first character is control-S, and the program is stopped if it is. This feature is provided so that the program can be stopped tidily from the keyboard. If control-C is used then CP/M re-boots immediately, without calling the Fortran termination routine which closes files etc. The function BUFLEN (in IOLIB) is used to find out how many characters were read in (since Fortran does not give any indication of how many characters are read).

Reading from the mainframe is a special case. If the channel specified is '2' then the routine GETS4 (in module GETS4) is called. This routine reads consecutive lines into input buffer 2, overwriting each line with the following one. The routine exits when there has been no output from the mainframe for a certain time interval, or when a control-Q is received. At this point the last non-blank line received is in the input buffer (and the variable INEND2 has been set to the number of characters in the input buffer). All previous lines will have been lost.

Function ENDLN

This logical function takes a channel number as a parameter. If the channel number is invalid (not in the range 1 to 3) then the program is stopped after the message 'ENDLN: INVALID UNIT' has been output to the console. Otherwise the function returns 'true' if all the characters in the specified input buffer have been read (by NEXTCH), or 'false' if there are more characters in the input buffer. This function should always be called before NEXTCH, since if an attempt is made to read past the end of the line NEXTCH will return garbage.
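The calling discipline for READLN, ENDLN and NEXTCH can be illustrated with a small Python model of a single input channel. The class and helper names are hypothetical, the line source is a list standing in for the keyboard or file, and the initial call to NEXTCH that sets ICHAR, the control-S check and the channel validation are omitted.

```python
class InputChannel:
    """Toy model of one input buffer with its position and end pointers."""

    def __init__(self, lines):
        self.lines = list(lines)  # stand-in for the keyboard or a file
        self.buf = ""
        self.pos = 0
        self.end = 0
        self.finis = False        # plays the role of the global FINIS

    def readln(self):
        if not self.lines:
            self.finis = True     # end of file detected
            self.buf, self.pos, self.end = "", 0, 0
            return
        self.buf = self.lines.pop(0)
        self.end = len(self.buf)  # end pointer = number of characters read
        self.pos = 0              # NEXTCH will read from the start

    def endln(self):
        return self.pos >= self.end

    def nextch(self):
        # As described above, no check is made here: callers must use endln first.
        ch = self.buf[self.pos].upper()
        self.pos += 1
        return ch


def read_line_chars(chan):
    """Fill the buffer, then drain it, testing endln before every nextch."""
    chan.readln()
    chars = []
    while not chan.endln():
        chars.append(chan.nextch())
    return "".join(chars)
```

Calling `read_line_chars` on a channel holding the line `list,f=myfile` yields `LIST,F=MYFILE`, reflecting the conversion to upper case.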

Subroutine WRITCH

This subroutine is used to write a character to an output buffer. It has two parameters: the channel number and the character. If the channel is 1 or 3 then the character will not actually be output until the buffer is output by WRIBUF. If the channel is 2 then the character is output immediately to the serial port using the subroutine OUTS4.

If the channel number is invalid or an attempt is made to write beyond the end of an output buffer then the program is stopped.

Subroutine WRIBUF

This subroutine takes one parameter: the channel number. If this is 1 or 3 then the appropriate output buffer (containing one line of text) is output to the console or the file on unit 4. If the channel number is 2 then a carriage return, line feed, and three nulls are output (via subroutine OUTS4) to the SIO4 serial interface (which is connected to the mainframe). Since characters are sent immediately on channel 2 there is no buffer of characters to be sent. The appropriate output position pointer is set to zero so that the output buffer is ready to accept another line of text.

If the channel number specified was not in the range 1 to 3 then the message 'WRIBUF: ILLEGAL UNIT' is output and the program stopped.

Subroutine CLRBUF

This subroutine can be used to clear an output buffer, thus preventing the output of any characters written with WRITCH since the last call to WRIBUF. It will not work correctly if the channel number is 2, since any characters written to channel 2 will have already been sent to the mainframe, instead of being stored in a buffer until there is a complete line, like the other channels.

The channel number which is to have its output buffer cleared is specified as a parameter. The program is stopped if this is not in the range 1 to 3. No error message is given if the channel is 2, although CLRBUF will not have any effect if it is.
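The difference between the buffered channels (1 and 3) and the unbuffered channel 2 can be sketched in Python as below. The buffer size `BUFSIZ`, the `sent` lists standing in for the devices, and the exact wording of the error messages other than 'WRIBUF: ILLEGAL UNIT' are assumptions.

```python
BUFSIZ = 80  # assumed output buffer size


class Output:
    def __init__(self):
        self.buf = {1: [], 3: []}           # channel 2 has no buffer
        self.sent = {1: [], 2: [], 3: []}   # what has reached each device

    def writch(self, chan, ch):
        if chan == 2:
            self.sent[2].append(ch)         # sent immediately to the serial port
        elif chan in (1, 3):
            if len(self.buf[chan]) >= BUFSIZ:
                raise SystemExit("WRITCH: buffer overflow")
            self.buf[chan].append(ch)
        else:
            raise SystemExit("WRITCH: invalid unit")

    def wribuf(self, chan):
        if chan == 2:
            # Carriage return, line feed and three nulls end the line.
            self.sent[2].extend(["\r", "\n", "\0", "\0", "\0"])
        elif chan in (1, 3):
            self.sent[chan].append("".join(self.buf[chan]))
            self.buf[chan] = []             # ready for another line
        else:
            raise SystemExit("WRIBUF: ILLEGAL UNIT")

    def clrbuf(self, chan):
        if chan not in (1, 2, 3):
            raise SystemExit("CLRBUF: invalid unit")
        if chan != 2:
            self.buf[chan] = []             # no effect is possible on channel 2
```

In this model a character written to channel 1 and then cleared never appears in `sent[1]`, whereas a character written to channel 2 has already been sent by the time `clrbuf(2)` is called.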


A.5.11 Module GETS4

Source file: GETS4.MAC

Source disk: Orange (side 2) Trace level: none

GETS4

This assembler routine is used for receiving messages from the mainframe via the microcomputer's serial I/O port ("SIO4"). No standard device drivers for receiving from the serial port were supplied with the Fortran package.

GETS4 receives the characters from the serial port, putting them into the input buffer, until either a timeout occurs or the special character control-Q (marking end of computer's message) is received. As the characters are received they are also echoed to the top leftmost position on the screen - giving the user an indication of when text is being received without the raw text actually being visible to the user.

This use of control-Q takes advantage of a facility the CDC mainframe has for controlling automatic input from a paper tape reader on a teletype. If the "TAPE" command is issued to the NOS operating system then it uses control characters to start and stop the tape reader. Thus it is possible to get the mainframe to signal when it has finished its output and is waiting for input, avoiding the problems that occur when using timeouts alone.

The timeout used to determine when the computer has stopped sending is long. (It is configurable, but is normally around five seconds). It has to be long to allow for the pauses that sometimes occur during output. However the use of the control character mechanism described above (i.e. using the TAPE command) means that normally the timeout mechanism is not brought into use, and the user does not have long delays before TREEM switches back to reading his keyboard.
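The termination logic of GETS4 can be modelled in Python as follows. This is a sketch of the described behaviour, not of the assembler source: the serial port is replaced by a list of (delay in seconds, character) pairs, the screen echo is omitted, and the function name is hypothetical.

```python
CTRL_Q = "\x11"  # ASCII DC1, sent by the mainframe when it wants input


def gets4_sketch(events, timeout=5.0):
    """Return the last non-blank line received before control-Q or a timeout.

    `events` is a sequence of (delay, char) pairs; `delay` is the silence
    that preceded the character.
    """
    last_line = ""
    current = ""
    for delay, ch in events:
        if delay > timeout:
            break                      # the line went quiet: give up waiting
        if ch == CTRL_Q:
            break                      # end of the computer's message
        if ch == "\r":
            if current.strip():
                last_line = current    # each line overwrites the previous one
            current = ""
        elif ch != "\n":
            current += ch
    if current.strip():
        last_line = current            # an unterminated prompt still counts
    return last_line
```

With control-Q in use the loop normally ends as soon as the mainframe prompts for input, so the five second timeout is only reached when the control character is lost or not sent.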

SCRL22

This subroutine has no parameters. It puts the RML 380Z screen into a mode where the bottom 22 lines scroll and the top two lines are fixed, and writes a row of dashes along the second line. This mode is normally used by TREEM (except when graphically plotting tree parsing on the screen, using trace level 9); the top line is used for status.

This subroutine was developed for use specifically by TREEM, hence it is included in the GETS4 module as part of the special TREEM I/O rather than being included in the general purpose IOLIB library intended for general use. The effects of this routine can be cancelled by use of the SCRLVT or GRAFVT routines (in IOLIB).

SCRL22 works by writing the number of lines to scroll (22) into location NLINES (FF00H) and by writing the address of the first character to be scrolled in the VT memory area (F080H for the start of the third line) into the word at location TOP (FF01H).
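The address arithmetic can be checked with a short Python sketch. Two values are assumptions inferred from the figures quoted above rather than stated in the text: that the VT memory area starts at F000H, and that each screen line occupies 40H bytes (so that the third line starts at F080H). The memory is modelled as a dictionary, and the word is stored low byte first, as on the Z80.

```python
VT_BASE = 0xF000      # assumed start of the VT memory area
LINE_STRIDE = 0x40    # assumed number of bytes per screen line
NLINES = 0xFF00       # location holding the number of lines to scroll
TOP = 0xFF01          # word holding the address of the first scrolled character


def scrl22_sketch(memory, fixed_lines=2, total_lines=24):
    """Write the scrolling parameters into `memory` and return the TOP value."""
    first = VT_BASE + fixed_lines * LINE_STRIDE
    memory[NLINES] = total_lines - fixed_lines
    memory[TOP] = first & 0xFF          # low byte
    memory[TOP + 1] = first >> 8        # high byte
    return first
```

With the default arguments the sketch stores 22 at FF00H and the address F080H at FF01H, matching the values given above.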

A.5.12 Library IOLIB

Source files: ASCCHR.MAC, BUFLEN.MAC, FILERTNS.MAC, FIXFN.FOR, GRAFRTNS.FOR, SIO4RTNS.MAC, TBUFF.MAC, VTRTNS.MAC

Source disk: Orange (side 2) Trace level: none

IOLIB is a library of Fortran-callable routines for performing I/O functions not supported by the standard libraries. It provides routines for graphics output to the VT screen, single character I/O to both screen/keyboard and the SIO4 serial port, and CP/M file manipulation.

ASC, IASC

This function takes a single integer argument containing a character in A1 format and returns an integer in the range 0 to 127 which corresponds to the ASCII value of the character. The inverse of this function is CHR/ICHR. In Microsoft Fortran an integer containing a character in A1 format contains two characters (the first is a space); this function strips off the unused character. ASC must be declared as integer.

CHR, ICHR

This function takes an integer argument in the range 0 to 127 and returns an integer containing the character in A1 format. (It adds a space in the first character, to conform to Microsoft Fortran conventions). For example, to produce a carriage-return in A1 format use ICHR(13). CHR must be declared as integer.

BUFLEN, LENBUF

This function takes a dummy argument which is ignored and returns an integer which is the number of bytes transferred in the last I/O operation (Fortran read/write) less one. It is intended to be used immediately after reading a line from the console or disk: it then gives the number of characters that were read (the terminating carriage-return is not stored in the buffer). BUFLEN must be declared as integer.

LOOKUP

This function returns a logical. It takes two parameters: a string and an integer. The first parameter must contain exactly 11 characters: 8 for a filename and 3 for a filetype (both blank filled from the right). The second parameter must be in the range 0 to 4, and is used to specify a disk drive (0 = logged-on disk, 1 = drive A, 2 = drive B, 3 = drive C, 4 = drive D). LOOKUP returns TRUE if the specified file exists. LOOKUP must be declared as logical.

e.g.
      LOGICAL FEXIST,LOOKUP
      FEXIST=LOOKUP('AFILE   EXT',1)

This will return TRUE if the file "A:AFILE.EXT" exists.

GRAFVT

This subroutine takes no parameters. It puts the RML 380Z screen into graphics mode: the top 20 lines are cleared and scrolling is restricted to the bottom four lines. The subroutine to return to normal full screen scrolling is SCRLVT.

SCRLVT

This subroutine takes no parameters. It puts the RML 380Z into full screen scrolling mode (the normal mode), reversing the effect of GRAFVT.

OPENVT

This subroutine has no parameters. It opens the VT memory area to the CPU. The screen will be blanked while open - the display processor cannot access it. While the VT memory area is open the CPU can read and write the memory directly (e.g. using PEEK and POKE). The screen remains blank until CLOSVT is called. See the COS manual for more details on writing to the VT memory area. e.g. CALL POKE(X'F000',IASC('A')) will plot the character 'A' in the top left corner of the screen.

CLOSVT

This subroutine has no parameters. It closes the VT memory area, after it has been opened by OPENVT.

PLOTCH

This subroutine takes three integer parameters. It plots a character on the screen, which is assumed to have already been put in graphics mode by a call to GRAFVT. The first parameter is the X co-ordinate, in the range 1 to 40. The second parameter is the Y co-ordinate, in the range 1 to 20. The third parameter is the character to be plotted; it should be an integer holding the character in A1 format. If either co-ordinate is less than 1 then PLOTCH prints an error message and aborts. If either co-ordinate is too large then PLOTCH returns immediately with no error indication.

OUTCHR, VTOUT

This subroutine takes one integer parameter. The parameter should contain a character in A1 format. The character is output directly to the VT screen at the current cursor position, by-passing the Fortran I/O subsystem.

KBDIN

This subroutine takes one integer parameter. It returns a character typed at the keyboard, if there is one in the keyboard input buffer. If not then it returns zero. The character is returned in A1 format.

Note that it is possible to use this routine to allow the interruption of a program, since control-C and control-F will be checked for whenever this routine is called. Thus it can be used as a "dynamic breakpoint" inside long loops etc. (if a routine such as this one is not called within program loops then the only way to interrupt the program is by resetting the microcomputer).

e.g.
C     WAIT UNTIL A KEY IS STRUCK
   10 CALL KBDIN(ICHAR)
      IF (ICHAR.EQ.0) GOTO 10

S4OUT

This subroutine takes one integer parameter. The parameter should contain a character in A1 format, which is output directly to the SIO4 serial interface.

SIO4IN

This subroutine takes one integer parameter. It returns a character received over the SIO4 serial interface, if there is one in the input buffer, otherwise it returns zero. The character is returned in A1 format.

SETS4

This subroutine takes one integer parameter. It sets up the SIO4 interface at the baud rate given by the parameter. The codes for various baud rates are documented in the COS manual. e.g. CALL SETS4(0) sets the serial interface to 110 baud, suitable for use with the Olivetti printer. CALL SETS4(1) sets the serial interface to 300 baud, suitable for "telex" access to the CDC mainframe.

ERASE

This subroutine takes two parameters: a string and an integer. The first should contain a filename and extension in 11 characters (each blank filled from the right if less than 8 or 3 characters respectively). The second parameter is the disk drive number in the range 0 to 4 (0 for logged-on disk, 1 to 4 for drives A to D). If the specified file is found then it is erased. If the file is not found then there is no failure indication.

e.g. CALL ERASE('FILE2   E2 ',2) will erase the file "B:FILE2.E2" if it exists.

RENAME

This subroutine takes three parameters: two strings and an integer. The first is the filename of an existing file. The second is the name to change it to. The third is the disk drive number. The rename will fail with no warning if the first named file does not exist or the second already exists. The LOOKUP function should be used first to check for both these possible errors.

e.g. CALL RENAME('OLDNAME OLD','NEWNAME NEW',0)

will rename "OLDNAME.OLD" on the logged-on disk to "NEWNAME.NEW".

TBUFF

This subroutine takes two parameters, an integer and a logical array, which are both used to return results. This routine is intended for finding out what, if anything, was typed on the program invocation line. The first parameter returns the number of characters typed after the program name. The second parameter, which should be a logical array of dimension 127, will be filled with any characters that were typed on the command line following the program name. The terminating carriage return is not included, and the last character is followed by a zero byte (ASCII null). Subsequent bytes are not initialised. See CP/M Interface Guide, page 10a.

e.g. LOGICAL BUFF(127)

CALL TBUFF

TFCB, TFCB2

These subroutines take two parameters. The first is an array of 11 logicals into which they will put a filename in fixed format (8 characters for the filename and 3 for the extension, each with trailing spaces). The second is an integer into which they put a disk drive number (0 for the logged-on drive, 1 to 4 for disk drives A to D). These filenames are derived from parameters on the program invocation line: TFCB returns the first program parameter and TFCB2 the second program parameter. If no parameter was given then the filename will contain a space in the first character. See CP/M Interface Guide, page 10.

e.g.
      LOGICAL FILNAM(11),SPACE
      DATA SPACE/' '/
      CALL TFCB(FILNAM,IDRIVE)
      IF (FILNAM(1).EQ.SPACE) GOTO 900
      CALL OPEN(IUNIT,FILNAM,IDRIVE)

FIXFN

This subroutine takes three parameters. The first two are strings and the third is an integer. The first parameter should be a string containing a filename in free format, using the syntax as would normally be typed in a CP/M command (e.g. "A:MYFILE.TXT"). The filename must be terminated with a space. FIXFN returns the filename and drive number in fixed format, suitable for use in the standard OPEN routine (or the RENAME, ERASE and LOOKUP routines provided by this library). The second parameter must be an array of at least 11 logicals, into which the filename and extension will be stored. The disk drive number (in the range 0 to 4) is returned in the third parameter. If an error was encountered in trying to convert the filename (e.g. the disk drive letter was not in the range A to D) then a negative disk drive number is returned.

e.g.
      LOGICAL FN(15),FILNAM(11)
      WRITE (1,1000)
 1000 FORMAT(' File name?')
      READ (1,2000) (FN(I),I=1,14)
 2000 FORMAT(14A1)
      FN(15)=' '
      CALL FIXFN(FN,FILNAM,NDRIVE)
      IF (NDRIVE.LT.0) GOTO 900
      CALL OPEN(LUNIT,FILNAM,NDRIVE)
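The conversion that FIXFN performs, from a free-format CP/M filename to the fixed 11-character form plus a drive number, can be sketched in Python. The function name, the conversion to upper case, the particular negative value returned and the length checks are assumptions; the original documentation only states that a negative drive number signals an error.

```python
def fixfn_sketch(free):
    """Return (fixed_name, drive); a negative drive signals an error."""
    text = free.split(" ", 1)[0].upper()   # the name is terminated by a space
    drive = 0                              # 0 = logged-on disk
    if len(text) >= 2 and text[1] == ":":
        letter = text[0]
        if letter not in "ABCD":
            return " " * 11, -1            # drive letter out of range
        drive = "ABCD".index(letter) + 1   # 1 to 4 for drives A to D
        text = text[2:]
    name, _, ext = text.partition(".")
    if not name or len(name) > 8 or len(ext) > 3:
        return " " * 11, -1
    # 8 characters of name and 3 of extension, each padded with spaces.
    return name.ljust(8) + ext.ljust(3), drive
```

For example, the input `A:MYFILE.TXT ` gives the fixed name `MYFILE  TXT` with drive number 1, which is the form expected by LOOKUP, ERASE and RENAME.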

A.5.13 Library FORLIB

Source files: DSKDRV.MAC, INIT.MAC

Source disk: Orange (side 2) Trace level: none

FORLIB is the Fortran library supplied as part of the Microsoft Fortran development package. This library was modified slightly in the following ways:


DSKDRV

The disk driver was modified to discard line-feeds when reading from disk. The reason for this is that when writing to disk the standard disk driver appends a carriage-return, line-feed pair to the end of a record. However when reading with the standard disk device driver the carriage-return acts as the record terminator, and the line-feed shows up as an extra character on the front of all records except the first one in the file. The consequence of this is that if the standard device driver is used it is necessary to use different FORMAT statements when reading the first and subsequent records from a file. In addition the FORMAT statements for writing and reading the records have to differ. The effect of modifying the device driver is that the same FORMAT statement can be used when reading or writing any record, thus simplifying the Fortran source (and also making it behave like Fortran systems on other computers). A consequence of this is that the programs MUST be used with this modified Fortran library - they will not work correctly with the standard Microsoft Fortran library.

The disk driver is in file DSKDRV.MAC. The change was made to the formatted read routine (DSKFRD) shortly after label DSKFR1. Immediately after the check for an end-of-file character (control-Z) a check for a line-feed character (control-J) is made. If the character is a line-feed then a jump back to DSKFR1 is made so that the character is not stored and the next character is read.
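The effect of the modification can be demonstrated with a Python model of the record-reading loop. This is an illustration of the behaviour described above, not a translation of DSKFRD; the function name and the `discard_lf` switch (which selects between the modified and the standard behaviour) are hypothetical.

```python
CR, LF, CTRL_Z = "\r", "\n", "\x1a"


def read_records(data, discard_lf=True):
    """Split file contents into records terminated by carriage-return."""
    records = []
    current = ""
    for ch in data:
        if ch == CTRL_Z:
            break                 # end-of-file character
        if discard_lf and ch == LF:
            continue              # modified driver: skip it, read the next one
        if ch == CR:
            records.append(current)
            current = ""
        else:
            current += ch
    return records
```

With `discard_lf=False` every record after the first begins with a line-feed, which is why the standard driver needs different FORMAT statements for the first and subsequent records.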

INIT

The initialisation procedure was also changed. The RML 380Z has two memory locations (HIMEM1 and HIMEM2) which specify the top of memory available to programs. One of these (HIMEM1) is the bottom of the CP/M BDOS (Basic Disk Operating System), and must never be exceeded by any programs. This is the value used in the standard INIT routine to determine where to place the stack (which starts at highest available memory and works down) used by the Fortran run-time support. INIT was modified to use HIMEM2 instead. HIMEM2 can be modified by programs that are loaded at the top of memory, just below the CP/M CCP (Console Command Processor). Thus it is possible to load code which will remain resident during the execution of these Fortran programs, and the CP/M command line interpreter (CCP) also remains resident. The most obvious effect of this is that the program returns immediately to the CP/M prompt on exit, without having to re-boot from disk. This will also allow for direct communication between TREEM and CP/M in the future.

LIB80 was used to replace the DSKDRV and INIT modules in FORLIB with the modified versions.

A.6 Global Data

This section documents the global data shared by two or more modules.

All global data is stored in Fortran named common blocks.

CHARS

Used by: TREEM,BLDTRE,BLDRTNS

A set of commonly used characters, such as carriage-return, line-feed, space, period, colon, question-mark.

CNTFLG

Used by: TREEM,RESTRUCT,BLDTRE,NOD

A boolean flag used to control whether counting of node traversals is enabled or not.


ECHO

Used by: TREEM

The buffer used to echo the user's input.

ERRFU

Used by: BLDTRE,BLDRTNS

The logical unit number of the file used to print error messages on when building the digraph.

FILES

Used by: TREEM,RESTRUCT,BLDTRE,UNBUILD,BLDRTNS,TRE,NOD,FLD,CHR,IO,STK

The set of logical unit numbers for all the files used by the programs.

ICHAR

Used by: TREEM

The current input character.

ICONV

Used by: BLDTRE,BLDRTNS

Used to store the "conversation number" parameter of macro-nodes during digraph building.


IDELIM

Used by: TREEM,BLDTRE,CHR

This defines the set of delimiter characters, used to separate keywords and/or parameters. These are typically characters such as space, comma and parentheses.

INBUF

Used by: TREEM,IO,GETS4

This block contains the three input buffers and associated variables for the three conversations.

ITYPCH

Used by: FLD,CHR

For node types which do not need a character field, the field is set to whatever character this array contains for that node type. (This is done as a debugging aid.)

LASTND

Used by: TREEM,UNBUILD,TRE,NOD

Used to keep the node number of the last node traversed, giving an easily accessible record without having to pop and push the number from the stack.


LINKS

Used by: BLDTRE,BLDRTNS

Used to store down and right links during digraph building.

LINENO

Used by: BLDTRE,BLDRTNS

Used to store the current input file line number during digraph building, for use in error messages if an error with the source file is found.

LROOT

Used by: BLDTRE,BLDRTNS

Used to store the node number of the root of a local-tree, while building a digraph from macro-nodes.

MACNO

Used by: BLDTRE,BLDRTNS

Used to store the current macro-node number.

MODES

Used by: BLDTRE

Used to store the root macro-node number and the type of the current macro-node.


MPTRS

Used by: BLDTRE,BLDRTNS

An array of pointers to macro-nodes used during tree building. Indexed by an internal node number it gives the corresponding macro-node number.

MSGS

Used by: TREEM,BLDTRE,UNBUILD

The array of messages that TREEM can output.

NDPTRS

Used by: BLDTRE,BLDRTNS

An array of pointers to nodes used during tree building. Indexed by a macro-node number it gives the corresponding internal node number.

NLCHAR

Used by: TREEM,BLDTRE,UNBUILD

A single character, defining the "newline" (carriage-return equivalent) character. Typically contains

NODES

Used by: TREEM,RESTRUCT,BLDTRE,UNBUILD,TRE,NOD,FLD

The arrays used to store the nodes of the digraph. One array for each field of the node - the node number is the index for the arrays.
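The "one array per field" layout can be sketched as follows. This is a Python illustration rather than the Fortran common block; the field names, and the helper names make_node_store, put_node and get_node, are hypothetical, and the five fields are taken from the node record described in appendix E.

```python
MAX_NODES = 200  # node limit quoted for the present implementation


def make_node_store(size=MAX_NODES):
    """Create parallel arrays, one per node field (index 0 unused)."""
    return {
        "type": [0] * (size + 1),
        "right": [0] * (size + 1),
        "down": [0] * (size + 1),
        "char": [" "] * (size + 1),
        "count": [0] * (size + 1),
    }


def put_node(store, n, ntype, right, down, char):
    """Fill in node n; the traversal count starts at zero."""
    store["type"][n] = ntype
    store["right"][n] = right
    store["down"][n] = down
    store["char"][n] = char
    store["count"][n] = 0


def get_node(store, n):
    """Gather the fields of node n from the parallel arrays."""
    return (store["type"][n], store["right"][n], store["down"][n],
            store["char"][n], store["count"][n])
```

Parallel arrays suit a language without record types: a node is simply an index that is valid in every array.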


NTYPES

Used by: TREEM,RESTRUCT,BLDTRE,UNBUILD,BLDRTNS,TRE,STK

Defines the constant numeric values used for all the node types.

OUTBUF

Used by: TRE,IO

Contains the three output buffers and associated variables (such as current buffer length) for the three conversations.

PARAMS

Used by: TREEM

Used to store the parameters that are read by param nodes.

STACK

Used by: STK

Arrays of integers and "stack pointers". Three independent stacks are provided. They are used for storing node numbers when traversing the digraph, to allow backtracking, and for storing the X,Y coordinates of characters plotted on the screen, so that the two-dimensional plotting display of tree traversal that is enabled by trace level 9 can also back-track.


TRACE

Used by: TREEM,RESTRUCT,BLDTRE,UNBUILD,BLDRTNS,TRE,NOD,FLD,CHR,IO,STK

The set of trace flags used to control whether tracing is enabled for each of the modules. An array of 10 logicals, with one used for each module.

TYPES

Used by: BLDTRE,BLDRTNS

The names of macro-node types in character string form.

TYPNAM

Used by: UNBUILD

The names of node types in character form.

VSTD

Used by: RESTRUCT,BLDTRE,UNBUILD,TRE

An array of flags (one for each node) used to mark when the node has been visited. These are required by the digraph traversal algorithms.

VT

Used by: TREEM,NOD

This block contains the X and Y coordinates of the current cursor position on the screen, when plotting out tree traversal in two-dimensional form (i.e. trace level 9 enabled).

APPENDIX B:

BUILDING AND RUNNING THE PROGRAMS

B.1 Program Maintenance

This appendix describes how to compile, link and locate the programs TREEM, BLDTRE, RESTRUCT and UNBUILD. This information is necessary if any changes are made to the source code of any of the modules used in these programs. The information in the appendix refers to the diskettes that were in use as at October 1980. It is, of course, possible to transfer the files to other diskettes (or other media) and to group them differently.

Note: These diskettes contain some documentation on using the programs, in files with the extension '.DOC' or '.TEX':

USETREEM on side 2 of Orange disk documents using TREEM

USEBLD on side 1 of Red disk documents using BLDTRE

IOLIB on side 2 of Orange disk documents IOLIB routines

FORLIB on side 2 of Orange disk documents changes to FORLIB

The .TEX files should be run through the TXED text formatter before printing or reading.

All modules have a 'source disk' (given for each module in Appendix A), on which the source file is held. Whenever a module is compiled (or assembled), the relocatable object file produced must be copied onto the other disks. On the same diskettes as each of the four main programs there is a 'submit file' that will link and locate the necessary modules to provide the final program. These files are called LINKTRM (produces TREEM), LINKBLD (produces BLDTRE), LINKRST (produces RESTRUCT) and LINKUNB (produces UNBUILD).

B.2 Disk Organisation

The source files are too large to be contained on one floppy disk. They are spread over three sides of two floppy disks. One of these disks has a red label and the other an orange label. The disks are referred to here by the colours of their labels.

Side 1 of the orange disk contains most of the support routines for access to the digraph, and the main program for the tree-matcher (interpreter). Side 1 of the red disk contains support routines and the main program for building the digraphs. Side 2 of the red disk contains two programs, one for listing/dumping a digraph and the other for restructuring one. Submit files to create the four programs are on the appropriate disk sides. It should be noted that building any program other than TREEM requires both disks to be installed. Both disks are CP/M system disks.

B.3 Linking the Programs

In order to make an executable version of any of the four programs TREEM, BLDTRE, RESTRUCT or UNBUILD it is necessary to compile all the modules that are used by the program and then link them. It will normally only be necessary to re-compile any modules whose source file has been changed. The commands to link the programs (using the Microsoft linker L80) are:


TREEM:

L80 TREEM,TRE/S,NOD/S,FLD,CHR,IO,STK,GETS4,IOLIB/S,FORLIB/S,C:TREEM/N,/U/E

BLDTRE:

L80 BLDTRE,BLDRTNS,TRE/S,NOD/S,FLD/S,STK/S,IO/S,IOLIB/S,BLDTR/N,CHR,/E

RESTRUCT:

L80 C:RESTRUCT,IO/S,TRE/S,NOD/S,FLD/S,STK/S,IOLIB/S,C:RESTRUCT/N,/U,/E

UNBUILD:

L80 C:UNBUILD,NOD/S,FLD/S,STK/S,C:UNBUILD/N,IOLIB/S,/U,/E

In order to assist with the linking operation the disks containing the source files for the programs also contain 'submit' files to link the programs. There are four submit files, one for each program:

LINKTRM.SUB   links TREEM

LINKBLD.SUB   links BLDTRE

LINKRST.SUB   links RESTRUCT

LINKUNB.SUB   links UNBUILD

So, for example, the command 'SUBMIT LINKTRM' will make an executable version of TREEM from the Fortran source files by linking all the appropriate modules (as listed above).

To build all programs from scratch all Fortran source files on the disks should be compiled, and then the four submit files submitted to create the four executable programs.


It is important in all cases to use the FORLIB library that is on this disk set. It is a modified form of the standard FORLIB library; these programs will not work correctly with the standard Microsoft-supplied library. The differences are documented in appendix A.

B.4 Embedding the digraph into TREEM

When BLDTRE is run to produce a digraph it produces the file NETWORK.NOD. If this file is on the logged-on disk when TREEM is run then it is automatically read as the default digraph file, unless a filename was specified on the TREEM command line. This will also be used as the default name for the digraph file written by TREEM on exit.

It is possible to speed up TREEM initialisation by building this data into the program rather than reading it from a file on program start-up. CDCFILE is an example of TREEM which has been set up to work this way. In addition to speeding up TREEM initialisation, the program also works "stand-alone": only the program file is necessary, and there are no additional data files that have to be copied with it.

To create a version of the interpreter with the data built in, run TREEM as normal, but then halt it the first time it requests input by using control-C. The interpreter, complete with its embedded digraph and other data, can now be saved using the command "SAVE 123 X.COM", where X is to be the name of the new program. Note that this program is now committed to using its built-in digraph; it cannot read it from a file. Therefore it is essential to keep a copy of the original TREEM as well, in order to allow future modification of the digraph.

APPENDIX C:

DESCRIPTION OF MACRO-NODES

This appendix describes the format and meaning of the macro-nodes that are used in the text file that forms the input to the BLDTRE program.

C.1 Creating a Digraph for use by TREEM

BLDTRE is a program that builds trees or networks (digraphs) of nodes which can then be used by the tree (/network/digraph) matching interpreter TREEM. BLDTRE takes a network 'program' from the file MACROS.NET and creates a file NETWORK.NOD which is in a form that can be used by TREEM.

TREEM is an interpreter whose actions are controlled by a network of nodes. It can 'talk' to both the user and a mainframe computer (via a serial port). The network of nodes is initially loaded from the file NETWORK.NOD, which is normally created by BLDTRE. This file will automatically be used by TREEM when it is run, provided the file is on the logged-on disk at the time. It is possible to rename this file, and then specify the filename as a parameter to TREEM, if several different digraphs are in use (for different purposes).

In order to save the time taken by TREEM to read in these files at the start of every run it is possible to produce a version of TREEM with the network 'built-in'. To do this, halt execution of TREEM (with ctrl-C) the first time it asks for input, and save the combined interpreter and network by saving memory to disk with the command 'SAVE 123 X.COM', where X is the name to be given to the new fixed-network interpreter. Note that it is now impossible to modify the network, so it is essential to retain an original copy of TREEM and the network definition file. Note that these programs with an embedded digraph file for start-up will still write out the digraph (as NETWORK.NOD) on exit; this digraph file will have the "count" fields of the nodes updated as a result of this run of TREEM.

The TREEM interpreter creates a trace file called TREEM.TRA on the logged-on disk, so this disk must not be read-only, otherwise a BDOS error will be generated.

The following sections describe the form of the network description required for input to BLDTRE.

C.2 Using BLDTRE

The input to BLDTRE is in the format of 'macro-nodes'. These nodes are not the same as the ones used internally by TREEM; in general they represent more complex operations, or combinations of the nodes used internally by TREEM. The following pages describe the action of each type of macro-node in turn.

The general form of a macro-node for input to BLDTRE is:

nodenum type convno rightlink downlink data

Where 'nodenum' is an integer between 1 and 500 and is the node identifier, 'type' is the node type, 'convno' is 1 or 2 (the conversation number, described later), 'rightlink' is the node to which control passes if the action of this node is successful, 'downlink' is the node number to which control is to be passed if the action of this node fails, and 'data' is a field whose purpose depends on the node type.

The downlink field is ignored for the node types OUTPUT, WRPAR and COPY but must still be present (and conventionally zero). The only node type which does not require both link fields is QUIT, for which anything after the type field is ignored.
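The single-line form can be illustrated by a small parser. This is a sketch in Python, not BLDTRE's Fortran; the names MacroNode and parse_macro_node are hypothetical. It assumes that the type and the conversation number are written together (as in MATCH1 or GET2 in the examples of this appendix) and that fields are separated by white space. The multi-line TREE form is not handled.

```python
from typing import NamedTuple


class MacroNode(NamedTuple):
    nodenum: int
    type: str
    convno: int      # 0 is used here for QUIT, which has no conversation
    rightlink: int
    downlink: int
    data: str


def parse_macro_node(line):
    """Parse one single-line macro-node (illustrative sketch only)."""
    fields = line.split(None, 2)
    nodenum = int(fields[0])
    if not 1 <= nodenum <= 500:
        raise ValueError("node number must be between 1 and 500")
    type_field = fields[1]
    if type_field == "QUIT":
        # Anything after the type field is ignored for QUIT.
        return MacroNode(nodenum, "QUIT", 0, 0, 0, "")
    # Assumption: the last character of the type field is the
    # conversation number.
    name, convno = type_field[:-1], int(type_field[-1])
    if convno not in (1, 2):
        raise ValueError("conversation number must be 1 or 2")
    if len(fields) < 3:
        raise ValueError("both link fields must be present")
    rest = fields[2].split(None, 2)
    if len(rest) < 2:
        raise ValueError("both link fields must be present")
    data = rest[2] if len(rest) > 2 else ""
    return MacroNode(nodenum, name, convno, int(rest[0]), int(rest[1]), data)
```

The data field is taken as the remainder of the line, so that messages containing spaces survive intact.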


The 'macro nodes' used in the input to BLDTRE differ from those used internally by TREEM (which are described in appendix ?). In general the macro-nodes may be expanded into a number of TREEM nodes. There is a limit of 200 nodes in the present implementation of TREEM, which implies a maximum of around 100 macro-nodes. If an attempt is made to create too large a network then BLDTRE will abort with the error message 'no more space for tree'.

C.3 Errors in Macro-node Digraph

If there is a syntax error in a macro node then it is listed on the console, with an arrow underneath pointing to the first field in which an error was detected. Once the complete input file of all macro-nodes has been read a list of any nodes referenced but not defined is printed.

A listing file is produced called MACROS.PRN, but this does not include the error messages (at present), which are output to the console. In general errors do not stop BLDTRE from producing the output files TREELOAD.DAT and MESSAGES.DAT, but the network produced will be incomplete.

It is however still possible to use TREEM with this incomplete network, although any attempt to pass control to a missing section of the network will cause control to pass to the fail node instead.

C.4 Local Trees

The concept of 'local trees' within the network, for matching against input from the user or mainframe and branching to an appropriate part of the network, is an important feature of the TREEM/BLDTRE system.

These local trees are usually initially created using the TREE type macro-node. However small trees can also be created using MATCH type macro-nodes.


C.5 Macro Node Actions

C.5.1 MATCH

e.g. 230 MATCH1 232 235 REPLACE

The string in the data field is compared with the next characters in the appropriate input buffer (the user's input buffer in this case, because conversation 1 is specified). If they do not match at the first character ('R' in this case) then control passes to the downlink node (235). If they match completely then control passes to the rightlink node (232). If they match at the first character or more, but mismatch later, then control passes to the fail node.
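The three possible outcomes of a MATCH macro-node can be summarised in a few lines. The sketch below is a Python illustration of the rule as described, not the interpreter's code; the function name match_outcome is hypothetical, and "match completely" is taken to mean that the input buffer begins with the whole data string.

```python
def match_outcome(expected, buffer):
    """Return 'right', 'down' or 'fail' for a MATCH macro-node.

    expected is the (non-empty) string in the data field; buffer holds
    the next characters of the input buffer. Illustrative sketch only.
    """
    if not buffer or buffer[0] != expected[0]:
        return "down"    # mismatch at the first character
    if buffer.startswith(expected):
        return "right"   # complete match
    return "fail"        # matched at first, mismatched later
```

For the example node, input beginning with REPLACE would go right (232), input such as UPLOAD would go down (235), and input such as READ would go to the fail node.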

Default Fail Strategy

If control passes from a MATCH1 node to the fail node then the interpreter TREEM scans back to the start of the local tree, prints the input string that did not match, and also prints all possible strings that would have been matched by the local tree. Then TREEM scans back two nodes before the root of the local tree, and recommences from that point. Usually a local tree will have been preceded by an OUTPUT1 node giving a prompt to the user followed by a GET1 node to get the user's input. Therefore the effect of this strategy is that if the user gives an unrecognised keyword then a list of valid keywords is given and then the user is re-prompted for input.

If control passes from a MATCH2 node to the fail node then the action is basically similar in that the received string and a list of expected strings are output to the user, but control then resumes from node 1. In other words this is treated as an unrecoverable error. The information output to the user would be better stored in a file somewhere, as it tells a system programmer what went wrong but may not be of much use to the user.


C.5.2 TREE

e.g. 3 TREE2 0 0
/ 7 I.C.C.C. TAC
/ 200 IDLE
/ 200 /
/ 85 JOB ACTIVE
*

TREE macro nodes use more than one line. The first line is of conventional format: the downlink specifies the node to which control will pass if none of the following strings match the input string, and the rightlink gives the node to which control should pass if the appropriate input string is empty. The strings from which the tree branches are to be constructed then follow, one to a line. Each line starts with a '/', and the last line is followed by a line starting with a '*'. Each branch includes a macro node number after the '/', which is the node to which control should pass if the branch matches (a rightlink).

In the example above it is the next string from conversation 2 (the mainframe) which is being matched. If the input string is empty, or no match is successful, then control is passed to the FAIL node (macro node 0).
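The branch lines of a TREE macro-node, and the choice of destination, might be modelled as below. This is a Python sketch under stated assumptions, not BLDTRE or TREEM code: the function names are hypothetical, branch lines are assumed to begin with '/' and the list to end at a line beginning with '*' (as in the example), and the branches are tried in order by simple prefix comparison, whereas the real system builds a tree of single-character nodes.

```python
def parse_tree_branches(lines):
    """Turn the branch lines of a TREE macro-node into (text, node) pairs."""
    branches = []
    for line in lines:
        if line.startswith("*"):      # terminator line ends the tree
            break
        if not line.startswith("/"):
            raise ValueError("branch line must start with '/'")
        target, _, text = line[1:].strip().partition(" ")
        branches.append((text, int(target)))
    return branches


def tree_dispatch(branches, rightlink, downlink, input_string):
    """Choose the next macro-node for a given input string."""
    if not input_string:
        return rightlink              # empty input: take the rightlink
    for text, target in branches:
        if input_string.startswith(text):
            return target             # first branch that matches
    return downlink                   # nothing matched: take the downlink
```

Applied to the example, a mainframe response beginning with IDLE would pass control to macro-node 200.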

C.5.3 GET

e.g. 205 GET1 210 220

GET nodes do not have a data field. In the case above one line is accepted from the user (conversation number 1). If the conversation is 2 then a response from the mainframe would be accepted, and the last non-blank line received would be stored in the input buffer for conversation 2.

If the node is successful (i.e. a line was received) then control passes to the right link (node 210 in this case). If the user inputs a blank line, then the node fails and control passes to the downlink; node 220 in this example. There is no timeout on input from the user, but a GET2 node will fail if no input has been received from the mainframe within about 5 seconds.

Default Fail Strategy:

Control is passed to node 1 (the root of the whole network), if the downlink is specified as zero (or one).

C.5.4 OUTPUT

e.g. 10 OUTPUT1 165 0 What do you want to do ?!

This causes the text in the data field to be output to the user (as in this case) or the mainframe (if conversation number is 2). Exclamation marks (!) cause the buffer to be output (i.e. have the effect of a carriage return), otherwise the data is not output immediately, but is instead appended to the current contents of the output buffer. This allows a complete line to be built up by several OUTPUT and WRPAR nodes.

Control always passes to the right link. The downlink must be present but is ignored. In the listing file the number in the downlink field is the message number.
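The way a line is assembled from several nodes can be shown with a small model of one output buffer. The class below is a Python sketch, not TREEM code; the name OutputBuffer and its attributes are hypothetical, and each exclamation mark is assumed to flush whatever has accumulated so far.

```python
class OutputBuffer:
    """Model of one conversation's output buffer (illustrative only)."""

    def __init__(self):
        self.pending = ""   # text appended but not yet output
        self.sent = []      # lines actually output, in order

    def write(self, data):
        """Append data; each '!' outputs the buffer and empties it."""
        for ch in data:
            if ch == "!":
                self.sent.append(self.pending)
                self.pending = ""
            else:
                self.pending += ch
```

For example, writing "OLD," then a file name, then "/ND!" would produce a single output line containing all three pieces, which is how an OUTPUT, WRPAR, OUTPUT sequence builds one command.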


C.5.5 PARAM

e.g. 300 PARAM1 305 360 1

The data field must contain an integer in the range 1 to MAXPAR (5 in the demonstration programs), which specifies the parameter number that this node refers to. The next string in the specified input buffer (in this case the user's input buffer) is saved as one of a number of parameters; parameter 1 in this case. A subsequent WRPARx y z 1 will write it out again to conversation x. Also LODFIL uses parameters set up this way to determine filenames.

This node type fails if the input buffer is empty.

Default Fail Strategy:

Control is passed to node 1.

C.5.6 WRPAR (write parameter)

e.g. 310 WRPAR2 315 0 1

This node type writes out the parameter specified in the data field, which should have been previously stored by a PARAM node. In the case above the parameter is output to the mainframe. The downlink field is ignored, but must be present.

C.5.7 COPY

e.g. 224 COPY2 1 0

If the conversation number is 1 then one line is read from the console and sent unaltered to the mainframe. If the conversation number is 2 then text from the mainframe is copied to the user's screen until the mainframe has stopped sending any characters for a period. There are actually two timeouts: there is a long one (a number of seconds) while waiting for the first character from the mainframe. Once characters have been received there is a shorter timeout acting on the period between characters (around one second). Both timeouts are configurable but cannot be changed dynamically (i.e. TREEM has to be re-built if new values are required).

This node type is intended to be used for copying a single line through, although it will copy multiple lines. (Since there may be pauses between lines sent from the mainframe the timeouts should be re-configured if COPY is going to be used for copying multiple lines).
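The interaction of the two timeouts can be made concrete with a small simulation. The sketch below is in Python and is not TREEM code; the function name copy_until_quiet is hypothetical, and the default values (5 seconds for the first character, 1 second between characters) are assumptions chosen only to illustrate "a number of seconds" and "around one second".

```python
def copy_until_quiet(arrivals, first_timeout=5.0, gap_timeout=1.0):
    """Return the text that a COPY from the mainframe would pass on.

    arrivals is a list of (time_in_seconds, character) pairs in time
    order, with time measured from the start of the COPY.
    """
    copied = []
    last_time = 0.0
    for t, ch in arrivals:
        # Long timeout before the first character, short one afterwards.
        limit = gap_timeout if copied else first_timeout
        if t - last_time > limit:
            break            # the line has gone quiet: stop copying
        copied.append(ch)
        last_time = t
    return "".join(copied)
```

With the short inter-character timeout, a pause between two lines from the mainframe ends the copy, which is why the timeouts would need to be re-configured for multi-line copying.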

Control always passes to the right link. The data field is not used.

C.5.8 LODFIL (load file)

e.g. 350 LODFIL1 355 0 1

A file is transferred from the mainframe to the RML 380Z microcomputer if the conversation number is 1 (as here). The file is transferred the other way (i.e. from the local file system to the host computer port) if the conversation number is 2. An integer in the range 1 to 5 in the data field gives the number of the parameter that holds the file name.

Control is passed to the right link (355) if the file transfer is successful, and to the downlink (0: the fail node) if the file name is invalid. The file name must be a valid CP/M type filename.

Default Fail Strategy:

Control is passed to the root node (node 1).


C.5.9 QUIT

e.g. 500 QUIT

This node type has no fields other than the node number and the type. It causes the interpreter to stop, and return to the (CP/M) operating system.

APPENDIX D:

MACRO-NODE DIGRAPH FOR CDCFILE

This appendix describes an application program based upon the TREEM adaptive user interface program.

D.1 A Program to Control File Transfer Between Microcomputer and Mainframe

CDCFILE is a program that runs on a Research Machines Ltd 380Z microcomputer, which is connected via its

time-sharing line of the CDC mainframes at Imperial College. The program

allows files to be transferred between the two computers in either direction.

The line used is a normal 300 baud asynchronous RS232/V.24 line intended for connecting a terminal to the time-sharing network of the mainframe. The rate of transfer of a file is about 1 kbyte per minute.
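For comparison, the raw capacity of such a line can be estimated with a short calculation. The Python sketch below assumes 10 bits on the line per character (one start bit, eight data bits and one stop bit); that framing is an assumption, as are the names used, and the calculation says nothing about why the observed rate is lower.

```python
BAUD = 300           # line speed in bits per second
BITS_PER_CHAR = 10   # assumed framing: 1 start + 8 data + 1 stop bit


def line_capacity_bytes_per_minute(baud=BAUD, bits_per_char=BITS_PER_CHAR):
    """Upper bound on characters per minute for an asynchronous line."""
    return baud // bits_per_char * 60
```

Under that assumption the line could carry at most 1800 characters per minute, so the observed figure of about 1 kbyte per minute is somewhat over half of the raw capacity.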

There are no checks made on the data transferred, but in practice it has proven to be reliable. Only text files of ASCII characters may be transferred.

The disadvantages of using this approach, as compared with using a high speed synchronous line, are that it is relatively slow (comparable to using cassettes for file storage on the microcomputer), and it is not as secure or reliable.

The advantages are that no special software is required for the mainframe and no special hardware is required (other than the connecting lead, assuming that the microcomputer is equipped with a serial interface). It is therefore quick to implement and may readily be adapted to work with other mainframes. In addition it is possible to use dial-up telephone lines, in conjunction with modems, to access mainframes for which a direct line is impossible or unavailable.

CDCFILE is in fact a copy of a general purpose interpreter TREEM (tree-matcher), complete with an internally-stored network of nodes which control its actions. Appendix B describes how to build a version of TREEM with an embedded digraph. The following section describes the macro-node digraph that was used to build CDCFILE.

CDCFILE creates a trace file called CDCTRACE.DAT on the logged-on disk. Therefore this disk must be write-enabled, otherwise a BDOS error will be generated when running CDCFILE.

D.2 Summary of Digraph for CDCFILE

The digraph for CDCFILE is structured around two main "local trees" (see section 4.2.3). One of these interprets the mainframe's prompts and the other interprets the user's commands. On startup (at macro-node 1) CDCFILE sends a blank line (i.e. carriage-return) to the mainframe and then enters the first of these local trees to establish whether it is connected to the mainframe via the Terminal Access Controller (TAC) and, if so, whether it is logged on. If not connected then CDCFILE connects to the mainframe without any user interaction. If not logged on then the user is prompted to supply the necessary username and password.

Once logged on to the mainframe the user is asked for a command and the second main local tree is entered to determine the user's response. A limited set of mainframe commands is supported: listing the file catalogue (CATLIST/QUICAT), printing a file (PRINT), and querying the printer queue (QSTATUS). A special command ($) can be used to pass any NOS command straight through to the mainframe. In addition to the upload and download commands, which provide the main functions of this program, there is also a help command, and any of EXIT, STOP or END will terminate the program.


get mainframe prompt and match
    -> connect through TAC
    -> logon

get user's command and match
    -> NOS command
    -> upload
    -> download

D.3 Sample Listing of MACROS.NET

Here is a listing of the input file to BLDTRE used to create a network for controlling file transfer between the RML 380Z and the I.C.C.C. CDC mainframes. This is the network incorporated into CDCFILE. There are two main local trees, one rooted at node 3 matches responses from the mainframe, and the one rooted at node 210 matches commands from the user.

;
; DIGRAPH FOR TALKING TO CDC
;
; PROD MAINFRAME
;
1 OUTPUT2 2 0 !
;
; GET MAINFRAME'S RESPONSE
;
2 GET2 3 55
3 TREE2 0 0
/ 7 I.C.C.C. TAC
/ 200 IDLE
/ 200 /
/ 85 JOB ACTIVE
/ 80 READY
/ 52 RVICE
/ 80 RECOVER
/ 200 *REENTER
/ 95 USER
/ 135 XXX
/ 52 SERVICE-
/ 93 = ILL
/ 93 =ARG
*
7 OUTPUT1 8 0 Attempting to log on!
8 OUTPUT2 2 0 T!
;
; MATCH MAINFRAME'S RESPONSE
;
52 OUTPUT1 53 0 Telex not available.!
53 OUTPUT1 500 0 Try again later.!
55 OUTPUT1 60 0 Line appears to be dead.!
60 OUTPUT1 65 0 Is it connected correctly ?!
65 GET1 70 60
70 MATCH1 1 75 YES
75 MATCH1 77 0 NO
77 OUTPUT1 500 0 Exiting from program!
80 OUTPUT2 2 0 TAPE!
85 GET2 3 1
93 OUTPUT1 2 0 Illegal/invalid command!
;
; LOG ON TO MAINFRAME
;
95 OUTPUT1 100 0 Please give your user number;!
100 GET1 105 95
105 PARAM1 110 95 1
110 WRPAR2 115 0 1
115 OUTPUT2 120 0 ,
120 PARAM1 125 135 2
125 WRPAR2 130 0 2
130 OUTPUT2 150 0 !
135 OUTPUT1 140 0 and your password ?;!
140 GET1 120 135
150 GET2 155 1
155 MATCH2 160 3 USER
160 OUTPUT1 165 0 Invalid user number or password.!
165 OUTPUT1 95 0 Try again.!
;
; GET USER'S COMMAND
;
200 OUTPUT1 205 0 What do you want to do ?!
205 GET1 210 200
;
; MATCH USER'S RESPONSE
;
210 TREE1 0 0
/ 300 DOWNLOAD
/ 400 UPLOAD
/ 222 CATLIST
/ 500 STOP
/ 500 END
/ 229 QUICAT
/ 231 QSTATUS
/ 500 EXIT
/ 242 $
/ 255 HELP
/ 272 PRINT
*
222 OUTPUT2 224 0 CATLIST
224 COPY2 1 0
229 OUTPUT2 224 0 QUICAT
231 OUTPUT2 224 0 Q,QT=PR
242 COPY1 244 0
244 COPY2 1 0
255 OUTPUT1 257 0 Possible commands are:!
257 OUTPUT1 259 0 DOWNLOAD : transfers file from CDC!
259 OUTPUT1 261 0 UPLOAD : transfers file to CDC!
261 OUTPUT1 263 0 CATLIST,QUICAT,QSTATUS : NOS commands!
263 OUTPUT1 200 0 Any valid NOS command, preceded by '$'!
272 OUTPUT2 274 0 QUEUE,
274 PARAM1 275 276 2
275 WRPAR2 278 276 2
276 WRPAR2 278 280 1
278 OUTPUT2 279 0 =PS!
279 COPY2 1 0
280 OUTPUT1 282 0 File name ?!
282 GET1 284 280
284 PARAM1 276 0 1
;
; DOWNLOAD FILE
;
300 PARAM1 305 360 1
305 OUTPUT2 310 0 OLD,
310 WRPAR2 315 0 1
315 OUTPUT2 320 0 /ND!
320 GET2 325 305
325 MATCH2 330 0 /
330 OUTPUT2 350 0 LIST/DT=PTAPE!
350 LODFIL1 355 0 1
355 OUTPUT1 1 0 File downloaded.!
360 OUTPUT1 365 0 File name ?!
365 GET1 300 360
;
; UPLOAD FILE
;
400 PARAM1 405 480 1
405 OUTPUT2 410 0 NEW,
410 WRPAR2 415 0 1
415 OUTPUT2 420 0 /ND!
420 GET2 425 405
425 MATCH2 430 0 /
430 OUTPUT2 435 0 TEXT!
435 GET2 440 430
440 MATCH2 445 442 ENTER
442 MATCH2 445 0 /
445 LODFIL2 448 0 1
448 GET2 450 450
450 OUTPUT2 455 0 PACK!
455 GET2 460 450
460 MATCH2 465 0 /
465 OUTPUT1 470 0 File uploaded.!
470 OUTPUT2 2 0 REPLACE!
480 OUTPUT1 485 0 File name ?!
485 GET1 400 480
500 QUIT

APPENDIX E:

FILE FORMATS

E.1 Internal Digraph File

The first line of TREELOAD.DAT contains the pointer to the list of free nodes and the root node, in that order. Then follow 200 lines, one per node. The first number on each line is the node type plus either 100 or 200 depending on whether the conversation number is 1 or 2. The second and third numbers are the right link and downlink. The fourth field is the data field, which contains a character in 'match' and 'write-character' nodes, a parameter number for 'parameter', 'write-parameter' and 'load-file' nodes, and is a non-alphanumeric character for all other node types. The fifth field is a count field, which is initialised to zero for all nodes.

The node types are:

1 match (one character)
4 fail
5 write (one character)
6 read parameter
7 write parameter
8 copy line
9 output buffer
11 get line (i.e. read into buffer)
12 quit (stop interpreter program)
13 write message (the downlink stores the message number)
14 load file
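The encoding of the first number on each node line can be illustrated as follows. This Python sketch is not part of the programs described; the names NODE_TYPES and decode_type_field are hypothetical, and it assumes that every stored node carries a conversation number of 1 or 2.

```python
# Node type numbers as listed above.
NODE_TYPES = {
    1: "match", 4: "fail", 5: "write", 6: "read parameter",
    7: "write parameter", 8: "copy line", 9: "output buffer",
    11: "get line", 12: "quit", 13: "write message", 14: "load file",
}


def decode_type_field(value):
    """Split the first number of a node line into (conversation, type)."""
    convno, ntype = divmod(value, 100)   # hundreds digit is the conversation
    if convno not in (1, 2):
        raise ValueError("conversation number must be 1 or 2")
    if ntype not in NODE_TYPES:
        raise ValueError("unknown node type")
    return convno, NODE_TYPES[ntype]
```

So a first field of 101 would denote a match node on the user's conversation, and 214 a load-file node on the mainframe's conversation.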


E.2 Message File

The messages referred to by node type 13 are stored in the random access file MESSAGES.DAT. It is important that this file is not modified directly with the text editor, since the records are all padded to a fixed length (128 bytes) with NUL characters, and the TXED text editor removes these.
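Fixed-length records make random access a matter of simple arithmetic, which is why the padding must be preserved. The sketch below is a Python illustration, not the Fortran used in the programs; the function names are hypothetical, and it assumes that messages are numbered from 1 and that the text is plain ASCII.

```python
RECORD_LENGTH = 128  # bytes per record, padded with NUL characters


def pack_message(text):
    """Pad a message with NULs to exactly one record."""
    raw = text.encode("ascii")
    if len(raw) > RECORD_LENGTH:
        raise ValueError("message longer than one record")
    return raw.ljust(RECORD_LENGTH, b"\x00")


def read_message(file_bytes, number):
    """Fetch message `number` (counting from 1, an assumption)."""
    start = (number - 1) * RECORD_LENGTH
    record = file_bytes[start:start + RECORD_LENGTH]
    if len(record) < RECORD_LENGTH:
        raise IndexError("no such record")
    return record.rstrip(b"\x00").decode("ascii")
```

If an editor strips the NUL padding, every record after the first edited one starts at the wrong offset and the message numbers no longer select the intended text.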

E.3 Macro Nodes

Form of macro node (in file NET.DAT):

nodenum type cn rlink dlink data

nodenum : is the node number, which must be an integer between 1 and 500.

type : is one of the following: MATCH, GET, QUIT, OUTPUT, PARAM, WRPAR, LODFIL

cn : (conversation number) is 1 (the user) or 2 (the mainframe)

rlink : is the number of another node (in the same file). Control will be passed to this node if the action of the node is successful.

dlink : is the number of another node, or zero. Control will be passed to this node if the action of the node fails.

data : the contents of this field depend on the node type

If a line starts with a semi-colon (;) or asterisk (*) then it is ignored so that comments may be inserted in the file.

There is an implicit node zero, which is a "fail" node, and depending on the type of node passing control to it, there is some sort of recovery action.

Processing starts at node 1.

E.4 Trace Control Files

Each of the four programs will attempt to read a trace control file during initialisation, which is used to determine which trace levels should be enabled and where the trace should be written to. Trace levels are described in appendix A. Each program has its own trace control file:

program     trace control

TREEM       TRACECTL.TRM

RESTRUCT    TRACECTL.RST

BLDTRE      TRACECTL.BLD

UNBUILD     TRACECTL.UNB

The format of these files is in every case a single line containing 10 logicals, one for each trace level, followed by a number selecting where the trace is written (… .TRA), or 3 for the console.
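Reading such a line can be sketched in Python as follows. The layout assumed here (ten T/F characters, white space, then an integer) is inferred from the example line "TFFFFFFFTF 6" given in this section; the function name is hypothetical and the meaning of destination numbers other than 3 is not interpreted.

```python
def parse_trace_control(line: str):
    """Split a trace control line into enabled levels and a destination."""
    flags_text, dest_text = line.split()
    if len(flags_text) != 10 or set(flags_text) - {"T", "F"}:
        raise ValueError("expected ten T/F logicals")
    # Trace levels are numbered from 1 in the order the flags appear.
    enabled = [level for level, flag in enumerate(flags_text, start=1)
               if flag == "T"]
    return enabled, int(dest_text)


levels, destination = parse_trace_control("TFFFFFFFTF 6")
```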


Thus a typical trace control file, if levels 1 and 9 are enabled and all others are not, might be:

TFFFFFFFTF 6

APPENDIX F:

NODE TYPES AND THEIR ACTIONS

This is a description of what the user interface interpreter (TREEM) does when the current node is of the specified type.

F.1 IMMED (Immediate match node)

The letter contained in the node is compared with the current character from either conversation, depending on the conversation number specified in the node.

If the letters match, control is passed to the node to the right, otherwise to the node below. In addition, if the match is successful, the character is echoed to the console and the next character is read from one of the input buffers.

It should be noted that if both characters are delimiters but not the same, the match could be counted as successful, although at present it is not.
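The matching rule above can be sketched in Python. This is an illustration only, not the TREEM source: the names `MatchNode` and `step` and the two-node network are hypothetical, and the choice between the two conversations is left out.

```python
from typing import Dict, List, NamedTuple, Tuple


class MatchNode(NamedTuple):
    char: str   # letter stored in the node
    right: int  # node taken when the match succeeds
    down: int   # node taken when the match fails (0 = fail node)


def step(node: MatchNode, text: str, pos: int,
         echo: List[str]) -> Tuple[int, int]:
    """Apply one immediate-match node; return (next node, new position)."""
    if pos < len(text) and text[pos] == node.char:
        echo.append(text[pos])      # successful match: echo the character
        return node.right, pos + 1  # and advance to the next input character
    return node.down, pos           # failure: same character, node below


# Hypothetical network: node 1 tries 'L', falling through to node 2 ('R').
net: Dict[int, MatchNode] = {1: MatchNode("L", 3, 2), 2: MatchNode("R", 4, 0)}
```

With the input "RUN", node 1 fails and passes control down to node 2 without consuming a character. Node 2 then matches, echoes "R" and passes control right.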

F.2 IENDND (End Node)

This is now obsolete. It marked the end of a branch, and control left the main loop of the program and execution of the constructed command was initiated. It is not used in the present system in which the nodes form a network rather than a tree.

Page 164 Node Types and their Actions Appendix F

F.3 KOPY

Text is copied almost unmodified from the console to the mainframe if the conversation number is 1, or the other way round if it is 2. One line, including the carriage return, is copied, and if it is from console to mainframe, the first character is removed. It is used to allow the user to by-pass the interpreter and give commands directly to the mainframe. The line is echoed to the console.

F.4 NFAIL

Control is passed to a fail node after any kind of error. At present there is only one fail node, and control passes to it when the user gives an invalid command. An error message and a list of valid keywords is displayed, then control is passed to the root-node, ready to accept another command.

F.5 NTERM

Checks that the input is at the end of a line and passes control to the right or down as appropriate. In effect it is an IMMED node with an end-of-line character, but it does not try to read another character from an input buffer if successful.

F.6 JUMP

For efficiency, jump nodes may be inserted so that not all characters are checked for a match. The characters that are skipped over are not echoed; the characters that they should have been are echoed instead. This mechanism also allows abbreviations, and expands them, and allows some spelling mistakes - without noticing them. (See Partridge.)


F.7 NPARAM

Read a string in from either conversation and store it in the parameter block. Up to five parameters may be stored (at present). The string is terminated by a delimiter (space, comma or "=") or after fourteen characters.

If input is from the console (user) and the parameter is not present then the user is prompted to supply one. If there is a delimiter at the start of where the parameter is expected, it is skipped over. The delimiters before parameters are not specified explicitly in the tree, so they can be omitted. (Control moves right.)
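The rules for delimiting a parameter can be sketched in Python as below. This is a minimal illustration, not the TREEM routine: the function name is hypothetical, prompting the user is omitted, and a single leading delimiter is assumed to be skipped.

```python
DELIMITERS = " ,="   # space, comma or "=" as listed in the text
MAX_LEN = 14         # a parameter also ends after fourteen characters


def read_param(text: str, pos: int = 0):
    """Return (parameter, new position), skipping one leading delimiter."""
    if pos < len(text) and text[pos] in DELIMITERS:
        pos += 1  # delimiter in front of the parameter is skipped over
    start = pos
    while (pos < len(text) and text[pos] not in DELIMITERS
           and pos - start < MAX_LEN):
        pos += 1
    return text[start:pos], pos


first, next_pos = read_param(" FILE1,FILE2")
second, _ = read_param(" FILE1,FILE2", next_pos)
```

An empty string returned by `read_param` corresponds to the case in which the parameter is not present and the user would be prompted for it.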

F.8 NWRICH (Write character)

The character stored in the node is output to the console (conversation number 1) or to the mainframe (conversation number 2). Control moves right.

F.9 NWPAR (Write parameter)

The parameter specified (1 to 5) is written to channel 1 or 2, from the parameter block. Control moves right.

F.10 NEWLN (New line)

Writes a buffer (either 1 or 2). Effectively the same as NWRICH with a "new-line" character. Control passes right.


F.11 NGET

Gets a line from the console or mainframe. In the second case, if more than one line is received then the last line is the one that will be stored in the input buffer (INBUF2). Receiving stops when there have been no characters for several seconds. At present, control always moves right; in future, no response after a period of time will cause control to move down.

F.12 LODFIL (load file)

This node is used for transferring a file between the CDC (local file) and the 380Z (file on floppy disc).

Channel=1

Direction is from CDC to 380Z

Call GETFIL (assembler routine) to accept characters from SIO4 and store them in memory above TREEM. It returns on receiving a punch-off character (control, or where there has been nothing for about five or ten seconds).

Use PEEK to read from memory and write a line at a time to the file (on unit 9). For the moment the file name is fixed as FROMCDC.DAT

Close file


Channel=2

From RML to CDC

Open TOCDC.DAT on unit 9. Read lines and send them using S4OUT.

Close file. Do not terminate with ETX (control C).

In both cases there will be a maximum line length (of 128 characters). NULs and DELs will be ignored, and so will some other control characters.
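The filtering applied to each transferred line might look like the following Python sketch. It is illustrative only: the function name is hypothetical, only NUL and DEL are dropped because the other ignored control characters are not listed, and truncating over-long lines is an assumption, since the text does not say how they are handled.

```python
MAX_LINE = 128            # maximum line length given in the text
IGNORED = {0x00, 0x7F}    # NUL and DEL


def clean_line(raw: bytes) -> bytes:
    """Drop NUL and DEL bytes, then cut the line to MAX_LINE bytes.

    Assumption: over-long lines are truncated; the text does not say
    whether they are truncated, split or rejected.
    """
    kept = bytes(b for b in raw if b not in IGNORED)
    return kept[:MAX_LINE]


cleaned = clean_line(b"GE\x00T,\x7fFILE")
```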
