ICL Technical Journal Volume 4 Issue 4
Total Page:16
File Type:pdf, Size:1020Kb
TECHniCAl j o u R m i Volume 4 Issue 4 November 1985 iC L T C p u n ip p i The ICL Technical Journal is published twice a year by 1/¼ in n n i Peter Peregrinus Ltd. on behalf of International JOURrlnL Computers Ltd. Editor J. Howlett ICL House, Putney, London SW15 1SW, UK Editorial Board J. Howlett (Editor) CJ. Hughes H.M. Cropper (British Telecom Research Laboratories) D.W. Davies K.H. Macdonald G.E. Felton J.M. Pinkerton M.D. Godfrey E.C.P. Portman All correspondence and papers to be considered for publication should be addressed to the Editor. 1986 subscription rates: annual subscription £16.00 UK, £19.00 overseas, airmail supplement £8.00, single copy £10.00. Cheques should be made out to ‘Peter Peregrinus Ltd.’, and sent to Peter Peregrinus Ltd., Station House, Nightingale Road, Hitchin, Herts. SG5 ISA, UK, telephone: Hitchin (s.t.d. 0462) 53331. The views expressed in the papers are those of the authors and do not necessarily represent ICL policy. Publisher Peter Peregrinus Ltd. PO Box 8, Southgate House, Stevenage, Herts. SGI 1HQ, UK This publication is copyright under the Berne Convention and the Inter national Copyright Convention. All rights reserved. Apart from any copying under the UK Copyright Act 1956, part 1, section 7, whereby a single copy of an article may be supplied, under certain conditions, for the purposes of research or private study, by a library of a class prescribed by the UK Board of Trade Regulations (Statutory Instruments 1957, No. 868), no part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means without the prior permission of the copyright owners. Permission is, however, not required to copy abstracts of papers or articles on condition that a full reference to the source is shown. Multiple copying of the contents of the publication without permission is always illegal. ©1985 International Computers Ltd. Printed by A.McLay & Co. Ltd., London and Cardiff ISSN 0142-1557 JCL TCCHNCfll JOURnflL Contents Volume 4 Issue 4 Foreword Asa W. Lanum 351 History of the ICL content-addressable file store (CAFS) J.W.S. Carmichael 352 History of the CAFS relational software A.T.F. Hutt 358 The CAFS system today and tomorrow G.McC. Haworth 365 Development of the CAFS-ISP controller product for Series 29 and 39 systems N. Macphail 393 CAFS-ISP: issues for the application designer R.M. Tagg 402 Using secondary indexes for large CAFS databases P.R. Wiles 419 Creating an end-user CAFS service C.E.H. Corbin 441 Textmaster — a document-retrieval system using CAFS- ISP M.H. Kay 455 ICL Technical Journal November 1985 349 CAFS and text: the view from academia L. Burnard 468 Secrets of the sky: the IRAS data at Queen Mary College D. Walker 483 CAFS file-correlation unit E. Babb 489 Patents relating to the ICL content-addressable file store CAFS 504 Notes on the authors 506 Subject index to Volume 4 511 Author index to Volume 4 518 350 ICL Technical Journal November 1985 Foreword On the 21st April 1985 it was announced that the Queen’s Award for Technological Achievement had been presented to ICL in recognition of the successful innovation represented by the development of CAFS-ISP. The problem of accessing large amounts of data in a structured manner, so as to present usable information within a reasonable time, began to be recognised as early as 1962. The CAFS system is a combined hardware and software solution to this problem. Additionally, it is especially well positioned because throughout the development there has been a clear understanding that any technological breakthroughs achieved in solving the problem must fit into the real world of existing customer data and equipment. The CAFS-ISP product clearly represents a breakthrough worthy of the Queen’s Award, and as such is one of the unique technological advances in the data-processing industry today. As a vehicle it is just now being recognised by other organisations trying to provide comparable data-access capabilities. This issue of the ICL Technical Journal reviews the history of the development of CAFS from the first recognition and formalisation of the problem through to the evolution of the physical solution, and also the history of the supporting software. In addition it illustrates the use of the product in the real world, showing how this has extended from the original closely defined problem of data access to include the processing of text and other forms of information by real users to solve real problems. Asa W. Lanum Director and General Manager, ICL Applied Systems ICL Technical Journal November 1985 351 History of the ICL content-addressable file store (CAFS) J.W.S. Carmichael ICL Mainframe Systems, Slough, Berkshire Abstract The paper is a chronological review of the development of CAFS from an original notion through successive stages of research, development and integration to a standard ICL product. It is essentially the author's historical review included in the 1985 Report of the CAFS Group Working Party of the ICL Computer Users Association. 1 Introduction The origin of CAFS as a concept was the explicit recognition that the activities of a living organisation of co-operating people are not totally ordered, so that their interactions must be regarded as an interplay of order and disorder. The use of information to maintain their organised behaviour is therefore also characterised by an element of disorder, in its fundamental sense of intrinsic unpredictability, and so, whenever a large amount of data is stored on magnetic media, some applications will require that it be accessed by search. The traditional approach to solving this problem has been to build indexes to cope with the more predictable searches. This is fine as far as it goes, but not all searches can be predicted, and if one tries to build indexes to cope with every eventuality - by, in effect, totally inverting the file - the task of managing the indexes, particularly when the data is volatile, can become intolerable. Two very relevant quotations from Vic Mailer1 are: ‘There are, in fact, instances where the indexes can occupy between two and four times the volume occupied by the data to which they refer a perfectly absurd situation’. \ .. It should not be assumed that indexable operations cover the totality of useful functions ... Indexes are really very primitive projections of files and consequently their utility should not always be taken for granted’. Content addressing, or associative processing, is a natural way of trying to circumvent these difficulties. Very natural, in fact, since we are all very familiar with the marvellous effects we can achieve using these techniques in retrieval from the memories in our heads. Laymen have always assumed that 352 ICL Technical Journal November 1985 computers work this way; look at the computers in 2001, Doctor Who or Blake’s Seven to see how ordinary people think they behave. In the early 1960s there was much speculation in the academic world on the potential value of associative stores, but few suggestions regarding prac ticable techniques for making such a store of adequate capacity at acceptable cost. In 1962 Gordon Scarrott, working with Roy Mitchell in the Ferranti computer department, was stimulated by this situation to point out to their management that if a store, of adequate capacity to be useful, conceptually rotates as a consequence of its operating principle - for example a magnetic drum or a delay line - then the mean time required for access by search is half the revolution time, exactly the same as for access to a known address. Hence associative access can be achieved with existing technology, without an additional access time penalty. However, in 1962 the concept of a ‘data base’ had not become common and consequently the need for store access by automated search was not widely recognised. Moreover, there were no resources available for exploratory development, and so no physical work was initiated at the time. 2 Initial project In 1969 George Coulouris, then lecturer in computer science at Imperial College, suggested to Gordon Scarrott a joint project with ICL’s Research & Advanced Development Centre to identify the intrinsic requirements for file storage and to propose ways for meeting such requirements - a suggestion that was immediately accepted. The late Roy Mitchell, then a senior member of the staff of RADC, took responsibility for the project, collaborating with John Evans, one of George Coulouris’s senior students. At that time moving-head disc storage devices were not in common use. Nevertheless, in December 1969 Roy Mitchell had proposed a combination of disc store, key store and comparator, search evaluation unit and retrieval unit. He coined the term ‘CAFS’ to refer to the system, a name by which this combination of mechanisms is still known. Initially there was some confusion about whether the ‘A’ stood for ‘addressable’ or ‘addressed’, but this issue has long been resolved in favour of ‘addressable’ since the system permits addressing by content but does not enforce it. It is noteworthy that most of the individual events (i.e. comparisons) in the CAFS searching system take place as soon as the data becomes available from the disc store, so that Roy Mitchell’s design was in effect a ‘data flow’ system as now defined. But of course the term ‘data flow’ was not used in 1969. Since CAFS is an engineering innovation, its most basic feature, the autonomous searching mechanism, should be regarded as a novel and advantageous synthesis of ends and means. Gordon Scarrott and Roy Mitchell proposed the means in 1962, George Coulouris and John Evans ICL Technical Journal November 1985 353 recognised the ends in 1969, and Roy Mitchell proposed the synthesis in 1969.