RECORDS MANAGEMENT ATTRIBUTES IN INTERNATIONAL OPEN DOCUMENT EXCHANGE STANDARDS

by

HAROLD ANTHONY GREGSON

B. A., University of Victoria, 1971

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF

THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF ARCHIVAL STUDIES

in

THE FACULTY OF GRADUATE STUDIES

(School of Library, Archival and Information Studies)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA

October 1995

© Harold Anthony Gregson 1995

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia Vancouver, Canada

DE-6 (2/88)

Abstract

The thesis is a study of the ability of international open document exchange standards to capture the attributes of the archival document, or record, in its electronic form, using a set of decontextualized attributes developed by diplomatics. Open document exchange is the ability to exchange documents in their complete form between heterogeneous electronic document management systems. Three standards are examined: ISO 8613 Open Document Architecture (ODA), ISO 10166 Document Filing and Retrieval (DFR), and ISO 8879 Standard Generalized Markup Language (SGML) as realized in the Text Encoding Initiative. As distributed computing becomes common, there is a growing need for such standards, but their usefulness to archivists and recordskeeping systems depends on their ability to recognize the record.

Open document exchange standards function by setting broad descriptive requirements without prescribing the specific implementation by any given system. In describing documents, they are therefore in need of a terminology that is free of any particular records . The thesis proposes that the archival science of diplomatics is capable of providing a complete set of decontextualized descriptors, or attributes, that encompass all aspects of the archival document or record. Since diplomatics is based on a scientific analysis of documents, and makes use of terminology that has long been in use by records creators and keepers, it is proposed that diplomatics terminology should be treated as a surrogate international standard in defining the attributes of the document profile and the logical and layout structure of documents. Diplomatics concepts can then be used as the basis for describing records attributes in the document profile of electronic records management systems. The thesis demonstrates this proposition by means of a thesaurus which maps attributes and concepts of SGML, ODA, and DFR against diplomatic terminology.

Table of Contents

Abstract  ii
Table of Contents  iii
List of Figures  iv
Acknowledgments  v

INTRODUCTION  1
    Role of Communication Standards  7
    Problem of Data Structure  12
    The Need for Decontextualized Attributes  13
    New Types of Documents  15
    Definition of the Record  15

CHAPTER ONE: ELECTRONIC DOCUMENT MANAGEMENT SYSTEMS
    Definition  27
    Structure of an EDMS  28
    Document Profile  31
    Role of Metadata  33
    Distributed Database Management  34
    Open Document Processing and Role of Standards  35

CHAPTER TWO: DIPLOMATICS  41
    Juridical System  44
    Facts  45
    Creation Process  49
    Procedures  50
    Phases of Procedure  51
    Persons  53
    Intrinsic Elements & Extrinsic Elements  55
    Transmission  67
    Other Aspects  72

CHAPTER THREE: INTERNATIONAL OPEN DOCUMENT EXCHANGE STANDARDS  77
    Standard Generalized Markup Language (SGML)  77
    Open Document Architecture (ODA)  94
    Document Filing and Retrieval (DFR)  107

CHAPTER FOUR: A THESAURUS OF RECORD ATTRIBUTES  111
    Rules & Terms  112
    Glossary Sources  183

CONCLUSION  185

BIBLIOGRAPHY  192

List of Figures

Figure 1: Basic Processing Model  36
Figure 2: Sample Business Letter for SGML TEI Text Encoding  87
Figure 3: Sample SGML TEI Encoded Document  88
Figure 4: ODA Logical and Layout Structure  98
Figure 5: ODA Correspondence between Logical and Layout Objects including Content Portion  98
Figure 6: ODA Content Architecture and Layout for Diplomatic Elements  99

Acknowledgements

I would like to acknowledge the interest and support of Clive Smith, World Bank Group Archivist, and Ana Flavia Fonseca, head of Document Management, in permitting me the time and opportunity to pursue the subject of this thesis, and those of my colleagues at the World Bank involved in the development of the electronic document management system who so freely gave of their time to discuss the various issues involved. I would like to thank Susan Rogers for her personal support.

INTRODUCTION

The subject of this thesis is the recognition within international open document exchange communication standards of those elements of form that permit the archival document to be distinguished as a record. The archival document is here understood to be "a document made or received in the course of activity as a means for and residue of it, and preserved for reference or in pursuance of legal obligation by its creator or legitimate agent, or successor."1

International open document communication standards are those standards promulgated by the International Organization for Standardization (ISO) in order to facilitate document interchange between heterogeneous electronic systems. Specifically, the standards considered in this thesis are ISO 8613 Open Document Architecture Standard (ODA), ISO 10166 Document Filing and Retrieval Standard (DFR), and ISO 8879 Standard Generalized Markup Language (SGML).

The thesis takes as its starting point the need to define the record, without which neither electronic records management systems nor an exploration of diplomatic concepts would have any point. In turn, the record cannot be explained without a definition of the document, a particularly important distinction given the indiscriminate use of the term in the electronic systems world. The thesis then goes on to explain the context of open document exchange standards, which is electronic records management, and the Open Systems Model, which in turn form a context of their own for the standards. Finally, the stage for an examination of the standards themselves, and the mapping of their attributes and concepts to a thesaurus of diplomatic terminology, is set by an explanation of diplomatics.

1 Duranti, Luciana, Managing Electronic Documents: Making Sense Out of Chaos or "Records Management is Dead! Long Live Records Management!" (Presentation to World Bank, April 27 1993, Washington, D.C.): 9. The term document will be used here to include all types of recorded materials, of which archival documents, or records, are only one type. This is a necessary distinction to bear in mind because open document exchange standards are designed to capture all sorts of documents, and not just records.


The spread of distributed information technology systems has compelled many organizations to rely in the conduct of their affairs on different types of computers, peripherals, and software applications manufactured according to proprietary specifications. Such distributed information systems offer great benefits in the use of information, but these can be realized only if the information is "easily accessible and capable of being amended (e.g. annotated and updated) by all the possible organizational users."2 For example, with many different word processing or spreadsheet packages, how can a document be transmitted and received in the same format? With data, text, images, and graphics now frequently incorporated into the same document, each of which is generated by a software application of its own, the problem is compounded.

The purpose of open document exchange standards can be defined more specifically as providing computerized document management systems with the ability to exchange a record in its complete form, which diplomatics defines as "one that contains all the elements it is supposed to contain according to the administrative and legal system."3 The most complete form of a record is the original, which diplomatics defines as not only a complete (or perfected) record in terms of form, but also the first to be issued by the creator. Without the element of completeness, the reliability4 of a record is in doubt. For instance, a contract that differs from the first to be created and enacted in use of typefaces and layout of text is unlikely to be accepted in a court of law as anything but a simple copy, or a "mere transcription of the content from the original".5 The inability to transmit a record between heterogeneous computer systems in its complete form therefore has serious implications for everyone who must create and manage records. In the words of Alvin Tedjamulia, Executive Vice President of SoftSolutions Technology, "The purpose of document management is to know that an original is an original, who touches it, and when."6

2 Advisory Committee for the Co-ordination of Information Systems (ACCIS), Strategic Issues for Electronic Records Management: Towards Open Systems Interconnection (New York: United Nations, 1992): 3.

3 Duranti, Managing Electronic Documents, 9.

4 Reliability is the quality of trustworthiness conferred on a record "by its degree of completeness and the degree of control on its creation procedure and/or its author's reliability." Ibid., 9.

5 Luciana Duranti, "Diplomatics: New Uses for an Old Science (Part I)," Archivaria 28 (Summer 1989): 3.

Standardization has inevitably emerged as the obvious solution to this problem, and the movement known as Open Systems Interconnection (OSI) is a part of that solution. The objective of OSI has been "to facilitate interworking between different organizations (or between parts of large organizations) - even when the organizations (or their parts) have very different kinds of computer technology."7 OSI is backed by various international standards-setting bodies such as CCITT (International Telegraph and Telephone Consultative Committee) and ISO (International Organization for Standardization), national standards bodies such as ANSI (American National Standards Institute) and BSI (British Standards Institution), manufacturers, and end user organizations. OSI has been the active promoter of a number of international communication standards intended to standardize the exchange and storage of electronic documents.

The context of open document exchange is electronic records management (ERM). Electronic management of records applies to the whole spectrum of records-making and recordskeeping functionalities spanning the entire life-cycle of records. An ERM system comprises a full range of records management processes including the creation and identification, appraisal, control and use, and disposition of documents.8 At first sight, these processes are in many ways similar to records management in the paper environment, but there are some fundamental differences.

In order to make information available, an ERM system must be capable of transferring records from a store of documents to the user or to other stores. The record must also be moved from the creator or modifier to the document store.9 This raises issues of security, access, and physical transfer that are relatively uncomplicated from a technical point of view (although intellectually equally important) in the paper records environment. Transfer of records becomes even more complex where the exchange must take place between different platforms or networks. This is hardly a problem where the physical medium of the record always has the same characteristics, as in the case of paper. ERM systems, however, must be capable of handling a great many different formats, as well as be capable of relating electronic records to paper records.

6 Andy Reinhardt, "Managing the New Document," Byte (August 1994): 93.

7 ACCIS, Strategic Issues, 3.

8 ACCIS, Strategic Issues, 6.

9 Ibid., 11.

ERM also goes much further than traditional systems in the management of records. "Traditional records management systems have evolved in the context of custodians managing finished documents in aggregations of files and records series, essentially automating the paper system where the sheer volume of paper records to be managed precludes any finer degree of distinction."10 Records management in paper-based record environments is limited to the functions of use, maintenance and disposition of records. Electronic records management systems have the capability of controlling the actual creation and manipulation of records. "Electronic document management Systems ... on the market today focus on enabling the author or members of a workgroup to manage the contents of an electronic document throughout the creation phase when multiple parts may be under revision separately but are eventually merged into a single, finished edition."11 The creation phase in turn may be broken down into a number of functional requirements including the entry of data, editing, assembling of document parts, distribution, collaborative processing, import of external documents and storage.12

The context of an ERM system is the record itself. The management of records shifts to the item level as opposed to the more traditional file or dossier level. Whereas in paper recordkeeping systems documents aggregate themselves into the physical equivalent of the intellectual relationship between them, within ERM systems the relationship between any one record and another must be precisely defined to the system beforehand if the record is to be managed at all. This entails managing the record at the moment of creation.

10 Diane Hopkins, Karl Lawrence, Irene Travis et al., Extending ARM Requirements at the World Bank, (internal report, Washington: World Bank, 1994): 1.

11 Hopkins et al., Extending ARM Requirements, 1.

12 EDMS Integration Team, Electronic Document Management System: Delivery Schedule Mapping to Functional Requirements, (internal report, Washington: World Bank, 1994): 1-15.


The international communications standards within which records management functionalities may be addressed are known as open document standards in that they permit documents to be created using software modeled to standards that are themselves "open", i.e. they can be exchanged and manipulated over a network. The question then becomes how the record can be defined and recognized within electronic document management systems employing open document communication standards. How must these standards capture the record, and what must archivists know about these standards to appraise their application to records systems?

Before tackling this question, one more definition is needed. Oddly enough, there is no specific definition for a records system. A system itself has been defined as "a somewhat vague term that usually refers to a combination of components working together. For example, a computer system usually includes both hardware and software."13 While this may be true, a more accurate definition would describe a system as an architecture of components brought together to achieve a particular end. The components may be human and non-human. For instance, a typical electronic office records system today usually comprises staff responsible for the drafting and approval of records using desktop workstations, printers for output of hard copies for distribution, a dedicated computer for centralized storage of electronic records, administrative support staff responsible for maintaining classification schemes and filing hard copies, retention schedules for the disposition of records, policies and procedures designed to control the form, distribution, access to, and storage of records, and documentation designed to describe and maintain the computerized components.

It is important to note that the hardware and software comprising the computerized component are only a part of the records system, even though it is easy to think of them as being the system itself. This is because, in an ERM, many of these functions, including records creation, maintenance, and use, are all performed using computer equipment and software programs. But certain parts of the system, such as system documentation and the policies and procedures controlling the creation and maintenance of records, are necessarily independent of the ERM (as comprised of hardware and software) in that they govern its very existence by determining its uses and rules.14

13 Philip E. Margolis, The Random House Personal Computer Dictionary (New York: Random House, 1991): 452.

An implied qualification of system components is that all the parts that participate in the system are competent, that is, are "adequately qualified or capable" of carrying out their part. Competence, in this sense, is a term that can be applied to both human and non-human elements. For example, a structural member of a bridge is considered "competent" if it is able to withstand the load forces that it is designed to transmit as part of the structural design. This engineering definition of competence can be applied to the non-human elements of an ERM, that is, the hardware and software. The archival meaning of competent stems from the idea of a competence, defined as "the sphere of functional responsibility entrusted to an office or officer".15 From an archival viewpoint, therefore, competent means capable of carrying out a function by reason of being entrusted with that responsibility. If these two definitions are united within the context of a system, competent means capable of carrying out a designated responsibility towards the fulfillment of a general purpose or function. Such a definition can apply equally to both human and non-human components.

Before a records system per se can be defined, two further conclusions must be drawn about the nature of systems. First, a system cannot be defined by any one component. The reason components are brought together in a system in the first place is to accomplish an end that could not be achieved by any one component acting on its own. There is more to this than functional reality.

Archival requirements of reliability and authenticity imply that no one element of a records system can be completely self-justifying. If the records created within a records system are to be deemed reliable and authentic, then the system itself must be able to guarantee these qualities. A recordskeeping system must therefore consist of elements that are able to, so to speak, bear witness against each other in order to ensure that a record is authentic and reliable. This includes those elements, such as system documentation and policies and procedures, that permit the user to understand the system as a whole and judge its purposes. We must therefore define a recordskeeping system as an architecture of competent records-creators, together with their equipment and support mechanisms, governed by policies and procedures for the management of documents made and received in the course of business, designed to ensure the reliability, authenticity and completeness of archival documents (or records) in the course of their creation, maintenance, use, and disposition.

14 This would be as true of paper records systems as computerized systems.

15 School of Library, Archival and Information Studies, Select List of Archival Terminology (unpublished glossary for Master of Archival Studies Program, Vancouver, University of British Columbia, 1990): 5.

An important proviso here is that open document exchange standards cannot in themselves form a complete recordskeeping system but can be only part of a recordskeeping system in that they are a mechanism to capture and exchange documents. Therefore, all the elements required to guarantee reliability and authenticity cannot be met by the mechanism of document exchange. For instance, open document exchange systems are not designed to impose rules of governance on communicating systems.16 Their function is confined to communicating the completeness of a record.

David Bearman has outlined system functionalities that he maintains are essential to guaranteeing the archival integrity of an electronic recordskeeping system, or in other words, the reliability and authenticity of its records. Since, however, Bearman's functionalities are intended to capture the archival qualities of an electronic recordskeeping system, they are not relevant to a discussion of open document communication standards per se for the reason that such standards are only the messenger between systems and are not concerned with defining the recordskeeping system itself.

Role of Communication Standards

Standards may be defined in general as "publicly available definitions resulting from international, national, or industrial agreement." Such a broad definition applies to standards of any sort, communications or otherwise, and there are many thousands covering almost every conceivable manufacturing object and process. The aim of all standards is to impose some degree of consistency and as such, they are hardly new. The ancient world had its standard measures of weight and size for coins and many different types of goods. Communications is an area where there has always been a great deal of standardization of necessity, in writing, language, and the equipment and instruments used to communicate. It is no less true of the forms used to convey documentary meaning, such as a letter or a legal contract. This is a point of fundamental importance in discussing the desirability of standards and the applicability of standardization as a concept. The question is not whether to apply standards but how to define something that is already implicit.

16 In fact, as standards, quite the opposite is true.

Standards tend to fall into two broad categories: informal standards, and those that are agreed through some structured process of development. Language itself might be taken as an example of an informal standard where a people have come to agree on accepted vocabulary and grammar.17 Informal standards evolve with a great deal of trial and error and spread because "something just works" and everyone can use it. The more formal standards are those that are agreed by standards-setting bodies, which may be established within an industry, at the national level, or at the international level. These standards evolve through a complex, bureaucratic process which is closed to all but the chosen stakeholders and experts. The relationship between the formal and the informal is dynamic: the one may borrow from the other. But the formal process is slower to develop, and tends to protect vested economic interests simply because it tends to arise at a later stage, after the genetic free-for-all, when the stakeholders have stabilized and are in a position where their continued survival depends on their ability to define and defend their interests. This does not mean that formal standards will necessarily prevail. It must be said that in the world of information and computer technology, standards will continue to come and go for as long as the industry remains in a state of explosive technological development.18 The international communications standards that are the subject of this thesis may well be obsolete or non-starters five or ten years from now.19

17 Although language, too, has its more formal standards, such as authoritative dictionaries, and even standards-setting bodies, such as the Académie française.

18 The Internet is an example of the informal paradigm of standards development where people continually put out ideas onto the Internet without any ability to control access through such mechanisms as copyright or pricing. In the Internet free-for-all, standards have emerged, such as HTML for the document profile. There is no telling how long they will survive or how far they will go because they have developed as need has arisen and do not respond to vested interests.

19 In fact, the ODA standard has already been declared a non-starter by many even though it has been the subject of years of development by international standards-making bodies.

In the realm of computer communications, the aim of standards is to permit different systems to communicate. From this point of view, standards have been defined more specifically as "standard sets of rules of cooperation between peer entities that govern two areas: service provision to the end-systems; and peer cooperation with an entity on the other end-system involved."20 They apply to programming languages, operating systems, data formats, protocols, and electrical interfaces.

20 Peter Henshall, Opening Up OSI: an illustrated introduction (Chichester, Sussex: Ellis Horwood, 1992): 50.

The fact that standards are rule-based is fundamental to our understanding of their requirements, for to be capable of standardization, a thing must itself be capable of being reduced to rules, or "principles to which actions conform."21 In other words, for archival documents to be captured by international communication standards, they must be capable of being reduced to a set of rules. That archival documents do, in fact, embody a set of rules, a set of rules, moreover, that is capable of being translated into the machine environment of computer communications, is the sine qua non of this thesis.

21 Oxford Modern English Dictionary, 1992, s.v. "rule."

A second fundamental point to grasp about standards is that they are independent of any one system. The way that the Open Systems Interconnection Reference Model (OSI) permits retrieval of a file from a distant site is illustrative. The standard for this service, File Transfer, Access and Management (FTAM), nowhere sets out a command, "transfer this file". The software that would be based on FTAM might have such a command, but the standard itself defines only a set of more precise services that, when utilized in a processing order, will bring about file transfer. How the software program presents this service to the user or chooses to access the services is outside the standard because such considerations can be determined by the system itself. It is not unlike the choice of a telephone: telecommunications standards ensure that the device meets system specification for connection and operation, and have nothing to do with such matters as the choice of manufacturer or the uses to which the system may be put because these have no bearing on the ability of the system to communicate. They would, in any case, be too arbitrary to define. In other words, system-independent means independent of any particular context.22 If international communication standards are to address the communication of records, we must therefore assume that records have a set of characteristics that are free of any particular documentary context.23 Sets of characteristics that are free of any particular context are called metadata, or, broadly speaking, information about information.
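The distinction drawn here between the primitive services a standard defines and the user-facing command a vendor builds on top of them can be illustrated with a minimal sketch. The service names (`select`, `open`, `read`, `close`, `deselect`) and the `InMemoryFileService` class are hypothetical stand-ins chosen for illustration; they are not taken from the text of the FTAM standard.

```python
from typing import Dict, Optional, Protocol


class FileService(Protocol):
    """Hypothetical primitive services of the kind a standard might define."""

    def select(self, name: str) -> None: ...
    def open(self) -> None: ...
    def read(self) -> bytes: ...
    def close(self) -> None: ...
    def deselect(self) -> None: ...


class InMemoryFileService:
    """One possible implementation; how it works is outside the standard."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = files
        self._selected: Optional[str] = None
        self._is_open = False

    def select(self, name: str) -> None:
        if name not in self._files:
            raise FileNotFoundError(name)
        self._selected = name

    def open(self) -> None:
        if self._selected is None:
            raise RuntimeError("no file selected")
        self._is_open = True

    def read(self) -> bytes:
        if not self._is_open or self._selected is None:
            raise RuntimeError("file not open")
        return self._files[self._selected]

    def close(self) -> None:
        self._is_open = False

    def deselect(self) -> None:
        self._selected = None


def transfer_file(remote: FileService, name: str) -> bytes:
    """A vendor's 'transfer this file' command, composed from the primitives."""
    remote.select(name)
    remote.open()
    try:
        return remote.read()
    finally:
        # Always release the file, even if the read fails.
        remote.close()
        remote.deselect()


remote_store = InMemoryFileService({"contract.txt": b"Original contract text"})
copy = transfer_file(remote_store, "contract.txt")
```

In this sketch only the `FileService` interface plays the role of the standard; `transfer_file` is the kind of convenience command that each software product is free to present in its own way.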

International communication standards are concerned with the ability to store, retrieve and manipulate documents located at remote sites, that is, between networks. A remote site means not just one that is distant but one that is serviced by a different system. A system that consists of many end users all connected together using the same system (as on a Wide Area Network (WAN) or Local Area Network (LAN)) is what is called a sub-system. International communication standards are not concerned with how sub-systems operate because a sub-system has no internal differences to reconcile. Sub-systems connected through gateways form networks. This is the realm of the international communication standard, where the need to reconcile differences between sub-systems creates a need for a common language.

There are several types of standards. International standards are those accepted by an international standards-setting body such as the ISO or the CCITT. National standards are those that are agreed by national standards-setting bodies, such as ANSI or CSI. Very often the national bodies work in conjunction with international bodies, and a national standard can be recognized as international. Then there are industry standards which have been established by industry groups. And finally there are de facto standards which have emerged within an industry and become accepted. IBM has been a source of some of these in the computer industry; the Internet, another. The international open document exchange standards within which records management functionalities may be addressed are known as open document standards in that they permit documents to be created using software modeled to standards that are themselves "open", i.e. they can be exchanged and manipulated over a network. The question then becomes how the record can be defined and recognized within electronic document management systems employing international open document communication standards. How must these standards capture the record and what must archivists know about these standards to appraise their application to records systems?

22 A rose is a rose is a rose, wrote Gertrude Stein. She may have been talking about ISO standards.

23 Standards agree on those elements that can be agreed. By agreeing upon them, they impart to them a certain universality, but the reverse is not necessarily true. For something to become standardized does not mean it must have universality to begin with. Standardization produces universality as an end product, but as a process it is not dependent on universality. A journey is not defined by its destination.

The international open document exchange standards that are here discussed have all received sanction by the ISO, and have been the subject of broad discussion and/or development, with actual implementations. ODA (Open Document Architecture, ISO 8613) and SGML (Standard Generalized Markup Language, ISO 8879) are both concerned with capturing the structure and representation of documents. DFR (Document Filing and Retrieval, ISO 10166) is a storage standard, discussed here because the storage of documents is in many ways the flip side of managing their representation and structure in that the object to be retrieved must mirror what was created.24

24 H. Fanderl, K. Fischer, and J. Kamper, "The Open Document Architecture: From Standardization to the Market," IBM Systems Journal 31, no. 4 (1992): 732.

These standards address various aspects of electronic document management within a model known as Open Systems Interconnection (OSI). ODA and SGML are designed to encode the structure and representation of documents so that they can be transferred between networks. They are part of a family of standards which contribute different functionalities to the OSI. These other standards are not examined in this thesis. The Virtual Terminal Service (VT) is another architecture and presentation standard designed to permit end user display of applications running on different networks, such as spreadsheet or word-processing packages.25 Transfer standards include the File Transfer, Access and Management Service (FTAM), which provides "a generic way of getting files from one host computer system to another and also handles the problem of data conversion between different computers",26 and the Message Handling Service (MHS), which is a world-wide electronic mail service that will be able to handle not only interpersonal messages but also the transfer of different parts of documents.27 Information management standards include the Directory Service (X.500), which was designed as a sort of world-wide yellow/white pages and has been expanded to include access to other types of information, such as staff, organizations and documents. Another information management service is the Information Resource Dictionary System (IRDS), which, with the Remote Database Access (RDA) standard and the Structured Query Language 2 (SQL2), is intended to provide access to distributed database management.28 Finally, the OSI Management Service is intended to manage communication resources such as networks.29

25 ACCIS, Strategic Issues, 25.

26 Ibid., 35.

The heart of an electronic document management system is the document profile for it

is this which controls document definition and the document structure. The document profile is

composed of attributes, each of which defines a particular characteristic of a document.30 The

attributes are defined - or not defined - in various ways by the different communication

standards for their own purposes. For those standards concerned with records management,

document attributes should be defined in a way that is useful for archivists, and one of the

purposes of the thesis is the development of a thesaurus of standard attributes that can be

mapped to the standards for document architecture and representation.
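The mapping idea can be pictured with a minimal sketch. It assumes a hypothetical thesaurus of three standard attribute names and two invented local vocabularies; none of these names come from ODA, DFR, or SGML, and they serve only to show how locally named attributes might be normalized to a shared set of terms.

```python
# Hypothetical thesaurus: standard attribute name -> local synonyms.
# All names are illustrative assumptions, not taken from any standard.
THESAURUS = {
    "author": {"author", "originator", "written_by"},
    "date": {"date", "creation_date", "dated"},
    "addressee": {"addressee", "recipient", "to"},
}


def normalize_profile(local_profile):
    """Map locally named attributes onto the standard names.

    Returns (normalized, unmapped) so that unknown attributes are
    reported rather than silently dropped.
    """
    normalized, unmapped = {}, {}
    for name, value in local_profile.items():
        for standard, synonyms in THESAURUS.items():
            if name.lower() in synonyms:
                normalized[standard] = value
                break
        else:
            # No thesaurus entry matched this local attribute name.
            unmapped[name] = value
    return normalized, unmapped


# Two invented systems describing similar documents in their own terms.
system_a = {"Originator": "H. Gregson", "creation_date": "1995-10-01"}
system_b = {"written_by": "H. Gregson", "to": "Registrar", "colour": "blue"}

print(normalize_profile(system_a))
print(normalize_profile(system_b))
```

Attributes with no entry in the thesaurus are returned separately, which reflects the point that some attributes are simply not defined by a given standard.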

Problem of Data Structure

One of the most daunting aspects of developing the document profile is that data

processing deals with information in a highly structured manner, as tables. Documents must

27 Ibid., 35.

28 Ibid., 58.

29 Ibid., 59.

30 It must be borne in mind that SGML is different from ODA in that it does not use a document profile. SGML flags document characteristics by means of a markup language embedded in the document itself and defines attributes in a much narrower sense than ODA. In other words, SGML permits the compilation of a user-driven vocabulary of document characteristics into a syntax that can be exchanged and manipulated over a network. Nonetheless, the problem of standardizing the description of document characteristics remains regardless of the implementation. The Document Type Definition (DTD) or header used by SGML may be considered the conceptual equivalent of a profile.

therefore be translated into this same, highly structured environment in order to be manipulated. "Once data is placed in documents, it becomes frozen in a medium that no longer allows its analysis or reuse."31

Information managed in tabular or numeric format, arranged in columns or tables, is much easier to handle and is today easily available (e.g. airline reservation system). Such systems, however, cannot address documentary forms which comprise the bulk of the world's information. "Most information does not fit into a tabular model."32 Documents resist the sort of re-use to which tabular information lends itself, because information once in document form is intended to be static. Although word processors might appear to address the movement of information, they actually address only the appearance of the document. Word processing is incapable of recognizing the components that make up a document, such as the signature, the address, or the text, as independent entities. For this reason, "documents remain linear strings of words and pictures, with no inherent organization below the file level."33 But the problem with document creation in a document management system is that documents may be assembled from parts of other documents or require specific components, such as electronic signatures. In this respect, "for document-based information to become . . . dynamic, it must possess a structure and organization analogous to the rows and columns in tabular information."34
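The contrast can be made concrete with a small sketch. It assumes an invented letter with three named components (address, text, signature); the component names and the `assemble` helper are hypothetical and only illustrate how named parts, unlike a linear string, can be retrieved and reused.

```python
# A document as a word processor sees it: one linear string.
linear_letter = "Registrar, UBC\nPlease find the report attached.\nH. Gregson"

# The same document with its components named, analogous to columns in a table.
structured_letter = {
    "address": "Registrar, UBC",
    "text": "Please find the report attached.",
    "signature": "H. Gregson",
}


def assemble(components, order):
    """Build a new linear document from selected named components."""
    return "\n".join(components[name] for name in order)


# Reuse: a memo borrows the text and signature but supplies its own address.
memo = dict(structured_letter, address="Dean of Graduate Studies")
print(assemble(memo, ["address", "text", "signature"]))

# The signature can be retrieved directly by name; in the linear string it
# could only be guessed at by position.
print(structured_letter["signature"])
```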

The Need for Decontextualized Attributes

It is retrieval strategy that is probably the key to an ERM. Open systems are intended to simplify the search for records by making it possible to standardize record attributes. The standards and the ERMs that use them, however, tend to reflect the purposes and context of their creators, with considerable variation in the types and definitions of attributes. "The

31 Lani Hajagos, Documents and SGML (the Standard Generalized Markup Language standard for document processing), UNIX Review 11 (No. 3, Mar. 1993): 38.

32 Ibid., 38.

33 Ibid., 39. File has here the data processing meaning of "a collection of records that all deal with the same sort of data." Advisory Committee for the Co-ordination of Information Systems (ACCIS), Management of electronic records: issues and guidelines (New York: United Nations, 1990): 154.

34 Hajagos, Documents and SGML, 39.

organization of a document database," as one writer puts it, "is almost always a direct result of the pre-defined structure imposed on the document collection by the author."35 Moreover, documents can be obtained from a variety of internal and external sources which have not been designed or pre-processed to work within existing information systems. For example, international standards make possible the exchange of documents between disparate systems.

The possibility of document interchange between electronic document management systems therefore creates the need for an agreed standard of metadata that can be used to define documents. Documents have thus to be defined not only within the context of the system where they are generated, but also in terms of a system where they may be received.

The spread of personal computer-based computing is another aspect of this same problem, since it encourages the decentralization of document creation through such activities as distributed editing36 and the development of collaborative tools. To operate at the PC level, the document profile must therefore become both "transparent," or user friendly, to individual records creators and also capable of accommodating their personal foibles. It is precisely the freedom of being able to do things one's own way that has made personal computing so popular and productive. Yet the profile must also be capable of imposing standardization of attributes in terms of coping with decentralization and exchange between heterogeneous records systems. This problem is exacerbated by the complexity of modern bureaucracy and its procedures.

A further dimension to the need for decontextualized attribute definitions is the need for attributes that are both explicit and machine-readable in order to manipulate document structure. To be accessible to a large audience in a heterogeneous computer environment, a

"standard methodology for expressing a document content model, as well as a standard syntax for describing the model and marking document contents, is required."37

35 Thomas K. Kolopolous, ed., Handbook of Document Management Systems Evaluation and Design (Boston, Mass.: Delphi Consulting Group, 1991): 7.

36 The ability to edit a document which is on a remote system (remote editing) or edit a document simultaneously (joint editing). Ute Bormann and Carsten Bormann, Standards for open document processing: current state and future developments, Computer Networks and ISDN Systems 21 (North Holland: Elsevier Science Publishers BV, 1991): 158.

37 Hajagos, Documents and SGML, 38.

retrieval, the distinction between them is apt to be blurred, if not altogether lost. The first task is therefore to distinguish records from documents.

There are many definitions of a document. From the retrieval point of view alone, a document has been defined as "a structure of syntactically normalized, semantically resolved propositions . . ."39 From a purely data processing viewpoint, a document has been defined as "a transaction set or message."40 A broader records management definition defines document as

"a single record or manuscript item."41 The World Bank has defined document within its own electronic document management system as "an identifiable unit of information that must be managed as one entity to support a Bank business function."42 According to the Bank, this could be a "coherent set of pages used to convey a complete message," but it could equally well be a spreadsheet table, a database row, or a single page.43 A document may also be a data object, defined as "any collectivity of data operated upon by a software system as a logical entity, for example a record, a document, an image or a software routine."44 A similarly broad definition is offered by the Oxford Companion to the Law: "Anything on which signs have been marked to record or transmit any information, a category including books, letters, deed, title-deeds, maps, plans, drawings, photographs and the like."45

Within open document communication standards, documents are defined to suit the

particular purposes of the standard. For instance, the Document Filing and Retrieval Standard

(DFR), which is concerned with the access and storage of information at a remote site, defines

39 Karen Sparck Jones, Assumptions and Issues in Text-based Retrieval, in Text-based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval, Paul S. Jacobs, ed. (Hillsdale, New Jersey: Lawrence Erlbaum Associates, 1992), 160.

40 ACCIS, Management of electronic records, 150.

41 Ibid., 150.

42 Information, Technology and Facilities Department: Information Engineering (ITFIE), ITF Staff Paper No. 12: Information Management Architecture: FY 94. Harold Steyer, ed. (Internal report prepared by the World Bank, Washington, D.C.: World Bank, 1993): 99

43 Ibid., 5.

44 ACCIS, Management of electronic records, 147.

45 David M. Walker, Oxford Companion to Law (Oxford: Clarendon Press, 1980), 371.

a document as "a structured amount of information that can be filed, retrieved, and interchanged consisting of a DFR-object-class of the DFR-object."46 A DFR document consists of content together with attributes which are associated with the content. The Open

Document Architecture standard (ODA), on the other hand, defines a document as "a structured amount of information intended for human perception, that can be interchanged as a unit between users and/or systems."47 SGML defines a document as a Document Type Definition and an instance or actual occurrence of text. The document profile itself is a document in its own right in that it can be exchanged by ODA, as can the Document Type Definition of

SGML. In effect, profiles and DTDs are virtual documents in that they are rules for the formulation of a document.

The conceptual relationships between information, documents and records can be modeled by an adaptation of the Tree of Porphyry which sets them forth as a taxonomy.48

Porphyry derives all documents from the summum genus of information. Information was defined by Samuel Johnson as "intelligence given," or knowledge shared by giving it to someone as a message. The Oxford Dictionary defines information as "something told; knowledge; items of knowledge, news."49 Within an EDMS, this communication takes place across a physical or logical communications switch in the electronic system.

Intelligence must be communicated to qualify as information. Dreams and thoughts kept in our own minds are, therefore, not information in this sense. Yet a note made to ourselves about these thoughts and dreams or an account of them told to someone qualifies our recollections as information. The important point is that at the heart of the whole concept of

46 ISO/IEC JTC 1/SC 18 Text and Office Systems Secretariat: USA (ANSI), Revised Text of DIS 10166-1: Information Technology - Text and Office Systems - Document Filing and Retrieval (DFR) - Part 1: Abstract Service Definition and Procedures (New York: ANSI, 1991): 5.

47 Canadian Standards Association, CAN/CSA-Z243.221-90 (ISO 8613-1:1989) Information Processing - Text and Office Systems - Office Document Architecture (ODA) and Interchange Format - Part 1: Introduction and General Principles (Rexdale, Ontario: Canadian Standards Association, 1990): 5.

48 This adaptation of the Tree of Porphyry, which was originally intended to explain inorganic chemistry, was first proposed by Trevor Livelton in Public Records: A Study in Archival Theory (unpublished Master of Archival Studies thesis, UBC 1991): 75.

49 Oxford Modern English Dictionary, s.v. "information."

information is a principle of objectification, whereby the intelligence we wish to convey is somehow separated from ourselves by putting it into a form that is capable of communication.

Intelligence must also have a comprehensible meaning: raw, unstructured data (as opposed to simply garbled data) is not intelligence. Going further, intelligence must therefore be comprehensible, but not necessarily in only human-readable terms. Intelligence can also be comprehensible only by the computer itself.

The second level of Porphyry's taxonomy captures this distinction in the differentia of recorded and non-recorded information. The act of recording information creates a document, if not, as yet, a record. But what does it mean, to record information? What is involved in the capture of information in a document?

Diplomatics approaches this problem by defining the document as "the expression of ideas in a form both objectified (physical) and syntactic (governed by rules of arrangement). A document's components are: (1) a message; (2) a medium; (3) an intellectual codification of ideas (information configuration: text, image, etc.); and (4) logical arrangement of the internal

elements (intellectual form)."50 These elements might be said to conceptually define any piece

of information, recorded or unrecorded, but they must be present as both physical and

intellectual realities to become recorded information or a document. In other words, all these

elements must be brought together in one place for a document to come into existence.

These various elements of the document may be taken for granted as purely conceptual

distinctions where they are brought together by the unitary nature of the paper document. But

when it comes to dealing with electronic document management systems and open document

exchange communication standards, the message, medium, intellectual codification, and logical

arrangement become independent attributes, or groups of attributes, that must each be

physically defined to the document profile before a given document can be created, exchanged,

and retrieved. This is because the computer is essentially a "brainless" brain that forces us to

define conceptually any object or task that it is to perform, and break it down into its

constituent elements, as attributes, in the case of objects, or in the case of tasks, algorithms. If

50 Duranti, Managing Electronic Documents, 9.

these elements are independent, then each must be defined before they can be realized in the document profile.
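A minimal sketch of this requirement follows. It treats the four components of the diplomatic definition as independent attribute groups and reports which are still undefined. The class name, field names, and sample values are hypothetical and are not drawn from any of the standards under discussion.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DocumentComponents:
    """The four components named in the diplomatic definition of a document."""

    message: Optional[str] = None                    # intellectual content
    medium: Optional[str] = None                     # e.g. "magnetic disk"
    intellectual_codification: Optional[str] = None  # e.g. "text"
    logical_arrangement: Optional[str] = None        # e.g. "letter"


def missing_components(doc):
    """Return the names of the components that have not yet been defined."""
    return [f.name for f in fields(doc) if getattr(doc, f.name) is None]


# A draft with only two of the four components defined.
draft = DocumentComponents(message="Budget approved.",
                           intellectual_codification="text")
print(missing_components(draft))

# All four components defined: nothing is missing.
complete = DocumentComponents("Budget approved.", "magnetic disk", "text", "letter")
print(missing_components(complete))
```

On paper the four components arrive together; in this sketch each has to be supplied separately before the list of missing components is empty.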

MESSAGE

If a document is an attempt to communicate ideas, then it must be about something, and the message is no more, nor less, than this, the intelligence or intellectual content of the document, be it machine or human comprehensible. This definition of message is captured by the telecommunications meaning of message where the document is composed of an envelope and a body.51 The body will contain a human-readable message, but the header will contain encoding that will permit the receiving application to interpret the content so that it can be read.
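As a rough analogy only, Python's standard `email` package shows the same division between a body and header fields that tell the receiving application how to decode it. The sketch uses Internet mail conventions rather than the OSI Message Handling Service discussed here, and the sample text is invented.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Build a message: set_content stores the body and writes the header fields
# (Content-Type with its charset, Content-Transfer-Encoding) that describe it.
msg = EmailMessage()
msg["Subject"] = "Minutes"
msg.set_content("Réunion à 10 h", charset="utf-8")

# Serialize to bytes, as the message would travel between systems.
wire = msg.as_bytes()

# The receiving side relies on the header fields to interpret the body.
received = message_from_bytes(wire, policy=policy.default)
print(received.get_content_type())     # media type declared in the header
print(received.get_content_charset())  # character encoding declared in the header
print(received.get_content().strip())  # body decoded using that declaration
```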

Open document communication standards tend to associate message with the human readable content. For instance, ODA defines content as "the information conveyed by the document, other than the structural information, and that is intended for human perception."52

DFR defines content as "the prime information content of a DFR object".53 Whatever the definition, the independence of content, or message, from structure is a physical reality in open document interchange, where content can be swapped about and imported into various documents.54 For example, the aim of SGML is to permit the information in a report to be used in another document with quite another purpose.55 To do so, however, the information in a document must be conveyed by means of a structure, and it is this that documents bring to information and the reason that structure, and not the intelligence per se in a document, is the focus of open document standards.

51 Henshall, Opening Up OSI, 104.

52 ISO, ODA: Part 1: Introduction and General Principles, 4.

53 ISO, DFR: Part 1: Abstract Service Definition and Procedures, 5.

54 One is tempted to say data, but data is defined as "information formatted in a special way", and also as the plural of a datum, a single piece of information, which does not get us much further ahead. Definitions from Philip Margolis, The Random House Personal Computer Dictionary (New York, Random House, 1991): 112.

55 Indeed, that is one of the characteristics of content, that it takes on the meaning of its context.


MEDIUM

It is, of course, impossible to contemplate the transmission of intelligence or knowledge without giving consideration to the medium. From a data processing point of view, the medium is the electronic telecommunications switch that transmits the message as digital or analogue data. From an archival point of view, for the message to be transmitted as a document, it must be somehow captured or fixed in a storage medium in much the same way that a photographic print is "fixed" in the development process. A medium is therefore the

"physical material or substance upon which information can be recorded or stored,"56 and includes both those that are relatively permanent, and those that are ephemeral. Writing is thus

"fixed" on paper; voice can be recorded on tape; data is "written" to magnetic or optical disks.

Electronic media encompass a very broad range of various storage media such as tape, diskettes, and hard drives.57

To be fixed in a medium, a message must be "written to" the medium, or physically attached. In traditional paper or parchment recordskeeping media, ink was the most common way of affixing the message.58 Electronic data may be affixed to the storage medium by magnetic or optical means.

INTELLECTUAL CODIFICATION

The means of representation, or the codification of information, partly dictated by the physical constraints of the media, is the third, fundamental aspect of a document.59 The use of

56 SLAIS, Glossary, 12.

57 In networks, media refers to the various types of cables linking workstations together. Margolis, 291.

58 The term formatting, as in formatting a disk, is not the same as affixing a message. To format a document meant to prepare it to receive a message, by drawing lines or making holes. Electronic storage media are formatted with the same purpose in mind, to make them capable of physically bearing a message. In medieval documents, formatting also had the purpose of indicating the purpose of a record. For example, privileges usually had holes, and charters were often a particular size. The purpose of this was partly to overcome the problems of functional illiteracy. Symbols could be used to achieve the same end, by use of colour codes etc.

59 The term format must be distinguished from representation. The mode of representation is a broad term that takes in various forms of expression, such as writing, film etc, or what might also be called media. Format is the particular choice of codification available within each form of expression. Writing uses words; it can also use pictographs. Film uses 35 mm, video etc. EDMS will use digital or analogue formats to encode content.

text, raster or geographical graphics, voice, or digital or analogue data are all forms of

intellectual codification by which the message may be represented in electronic documents and

must therefore be transmitted by open document standards. As with the message, the

intellectual codification may be machine or human readable. For instance, digital or analogue

data is not itself human-readable except with great difficulty, although it may be used to

produce a human-readable form of intellectual representation, such as a map, or words. For

example, a word-processing application will use a formatting language to encode instructions

that will permit a screen display of a document.

LOGICAL STRUCTURE

Finally, information must have some form of logical structure in order to be interpreted

for use. The logical structure is actually two types of structure, a structure determined by

meaning, and a structure required by physical requirements of the medium, or a layout

structure. Since ODA and SGML both distinguish between these two types of structure, the

distinction will be retained henceforth. The logical structure consists of elements such as

sentences, paragraphs, and footnotes that are determined by the content of the message.

Graphics may be part of a logical structure as illustrations or tables. A relational database will

have a logical structure in the form of tables or fields.

The concept of logical structure is extended by compound documents that combine

active links with databases or other documents in order to incorporate elements into a single

document, such as graphics, text and data. The links that make possible the compounding of

the various parts of the document must be considered a part of the logical structure as well as

the various elements themselves.

Only in the very broadest sense can the layout structure of a document be considered

logical, but since both tangible and electronic media place physical constraints on the ways and

amount of information that may be recorded on them, choices must be made and intellectually

accepted that determine the structure of the message. The traditional layout conventions of

book publishing, such as page breaks, margins and the positioning of text and page numbers on

the page, for instance, represent a creative and intellectual response to the

physical constraints of paper and printing machinery in an effort to balance aesthetics, textual

clarity, and economics.
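A minimal sketch of the two structures, assuming an invented list of content portions and an arbitrary page capacity of two portions per page: the logical view groups content by meaning, and the layout view groups the same content by the physical constraint of the page. The function and field names are hypothetical and do not reproduce ODA or SGML syntax.

```python
# Content portions tagged with their logical role (determined by meaning).
content = [
    ("paragraph", "Records are a species of document."),
    ("footnote", "Livelton, Public Records, 73-75."),
    ("paragraph", "They arise from practical activity."),
    ("paragraph", "They exist in aggregations."),
]


def logical_structure(portions):
    """Group content by logical role, ignoring any physical constraint."""
    grouped = {}
    for role, text in portions:
        grouped.setdefault(role, []).append(text)
    return grouped


def layout_structure(portions, per_page=2):
    """Group the same content into pages of a fixed capacity."""
    texts = [text for _, text in portions]
    return [texts[i:i + per_page] for i in range(0, len(texts), per_page)]


print(logical_structure(content))
print(layout_structure(content))
```

Changing `per_page` alters the layout structure without touching the logical one, which is the separation the two standards retain.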

From an archival viewpoint, the limitation with all these views of the document is that they do not address the nature of the document as a record. The definitions driven by retrieval are content-driven. In effect, they define the document by its subjects, as a mere container of topical information. The definitions used by international standards define the document in terms of its structure for purposes of processing but go no further. They are communication

standards, designed to facilitate the exchange of the document as information. But to address the nature of the document as a record for the purpose of establishing whether it is a record, it

is necessary to go further.

Like documents, records have been variously defined. The data processing definition

hardly distinguishes between a record and a document, and calls both a "set of related data or

words, treated as a unit",60 or "in database management systems, a complete set of

information."61 This type of record is composed of discrete fields which, brought together in

some intellectual aggregation (such as an address, which might be made up of fields for name,

street, city, and postal code), form a complete unit. The data processing record really

corresponds to the file.

Each of the fields can also be manipulated individually and aggregated through various

search techniques to form other data processing records.
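The following sketch illustrates this data processing sense of record with invented address data. The field names follow the example in the text; the `select` and `project` helpers are hypothetical stand-ins for whatever search techniques a given system provides.

```python
# Each data processing record is a set of named fields.
records = [
    {"name": "A. Smith", "street": "12 Main St", "city": "Vancouver", "postal_code": "V6T 1Z1"},
    {"name": "B. Jones", "street": "4 Oak Ave", "city": "Victoria", "postal_code": "V8W 1P6"},
    {"name": "C. Wong", "street": "9 Elm Rd", "city": "Vancouver", "postal_code": "V5K 0A1"},
]


def address(record):
    """Aggregate the fields into the complete unit: an address."""
    return f'{record["name"]}, {record["street"]}, {record["city"]} {record["postal_code"]}'


def select(rows, **criteria):
    """Search on individual fields and return the matching records."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]


def project(rows, *field_names):
    """Form new records from a chosen subset of fields."""
    return [{k: r[k] for k in field_names} for r in rows]


print(address(records[0]))
print(project(select(records, city="Vancouver"), "name", "postal_code"))
```

The new records produced by `project` exist only as the result of a search, which is the kind of manipulation that the archival view of the record as evidence does not contemplate.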

Another data processing definition of record permits the record to be defined as a data

structure which might consist of a combination of other data objects, for example, a number of

different types of numbers and a character string. Regardless of the definition, since all

information in a computer must be stored as one type of file or another, the purpose of the data

processing definition of record is to permit the management of data for whatever processing

purposes are required, such as retrieval, storage, and manipulation. This is quite opposite to the

purpose of the archival management of the record as evidence. The fact that a record is

intended as evidence implies preservation and not manipulation.

60 ACCIS, Management of electronic records, 176.

61 Margolis, Personal Computer Dictionary, 402.


An open-ended definition of the record is offered by Robek, Maedke and Brown according to whom a record is "Recorded information of any kind and in any form."62 The problem with this definition is that it offers nothing that the definition of document does not already cover, implying that all documents must therefore be records.

At this point it is useful to turn once more to the Tree of Porphyry as adapted by

Trevor Livelton63 as a means of conceptualizing the relationship between document and record. According to Livelton, records are really a species of document. The recorded form of information then becomes the subalternum genus of the document itself which in turn divides into two differentia of its own: those documents that are made or received, and those that are neither made nor received. Records are an infima species of those documents that are made or received in that they are produced in the course of some practical activity or transaction.
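The successive divisions of the taxonomy can be sketched as a sequence of tests. This is only an illustration of the reasoning above: the three yes/no questions and the returned labels are a simplification introduced here, not Livelton's own formulation.

```python
def classify(recorded, made_or_received, in_practical_activity):
    """Walk the adapted taxonomy from the highest genus to the lowest species."""
    if not recorded:
        # Information that was communicated but never fixed in a medium.
        return "information (non-recorded)"
    if not made_or_received:
        return "document (neither made nor received)"
    if not in_practical_activity:
        return "document (made or received)"
    # Made or received in the course of a practical activity or transaction.
    return "record"


# A spoken remark: communicated, but not recorded.
print(classify(recorded=False, made_or_received=False, in_practical_activity=False))

# A letter received in the course of business.
print(classify(recorded=True, made_or_received=True, in_practical_activity=True))
```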

Practical activity has been defined as "an activity whose purpose is not the activity itself but the production of effects capable of influencing situations." 64 This definition of a practical activity is related closely to that of a transaction. The ACCIS glossary offers a broad definition of a transaction as "information, communicated to other people in the course of business, via a store of information available to them."65 The SAA Glossary defines transaction more strictly as "an act, or several interconnected acts, in which more than one person is concerned, and by which the relations of such persons between themselves are altered."66 There are two fundamental points to note about this definition. First, that there must be a relationship between at least two people, the sender, or creator of information, and the recipient. This relationship must be

62 Wilmer O. Maedke, Mary F. Robek, and Gerald F. Brown, Information and Records Management (Mission Hills, CA: Glencoe/McGraw-Hill, 1987): 568. This vague definition reflects the management problem of capturing records where there are not the resources to be more exacting.

63 Livelton, Public Records, 73 -75.

64 Duranti and Eastwood, Preservation of the Integrity of Electronic Records, 1.

65 ACCIS, Management of electronic records, 184.

66 Lewis J. Bellardo, and Lyn Lady Bellardo. A Glossary for Archivists, Manuscript Curators, and Records Managers (Chicago: Society of American Archivists, 1990): 35.


substantive, or have a separate and independent existence.67 A transaction, then, is an act that takes place within a context, which in this case, must be a substantive relationship. The second point to note is that a transaction consists of an act. The SAA definition hints at the nature of this act as something "by which the relations ... are altered" between the people participating as sender and recipient. Transactions are therefore the result of practical activities since the purpose of these is the "production of effects capable of influencing situations." The converse, however, is not necessarily true: not all practical activities result in transactions.

A practical activity may be undertaken without producing any change in the status of those participating. An example would be the writing of a reminder to oneself.

There is broad agreement on the fundamental nature of the record as recorded information that manifests a transaction or a practical activity. Schellenberg defined the record as "All books, papers, maps, photographs, or other documentary materials, regardless of physical form or characteristics, made or received by any public or private

institution in pursuance of its legal obligations or in connection with the transaction of

its proper business [italics not those of citation] and preserved or appropriate for

preservation by that institution or its legitimate successor as evidence of its functions,

policies, decisions, procedures, operations or other activities or because of the information

value of the data contained therein."68 Jenkinson's definition of archives is similar in its

essentials to Schellenberg's idea of a record: "A document which may be said to belong to

the class of Archives is one which was drawn up or used in the course of an administrative

or executive transaction (whether public or private) of which itself formed a part; and

subsequently preserved in their own custody for their own information by the person or

67 For this reason, the mere giving of information without the context of a substantive relationship would not qualify as a transaction. For example, newscasts, such as hurricane warnings, frequently compel listeners to take action, but the information would not qualify as a transaction between the broadcaster and the listener because there is no substantive relationship between them (which is not to say the broadcaster is absolved of responsibility for their acts). The listener's life may be altered by the news, but the relationship between the broadcaster and the listener would remain insubstantial and unaltered because it has no existence outside the broadcast itself.

68 T. R. Schellenberg, Modern Archives: Principles and Techniques (Chicago: University of Chicago Press, 1956; Midway Reprint, 1975): 16.

persons responsible for that transaction and their legitimate successors."69 The ACCIS glossary defines the record as "recorded information, regardless of form or medium created, received and maintained by an agency, institution, organization or individual in pursuance of its legal obligations or in the transaction of business."70 Similarly, the Society of American

Archivists' glossary defines the record as "a document created or received and maintained by an agency, organization or individual in pursuance of legal obligations or in the transaction of business."71

The idea that a record is a document created or received in the course of a transaction or a practical activity has been carried over into the realm of electronic records. Within the electronic environment, David Bearman has defined the record as "any communication between one person and another, between a person and a store of information available to others, back from the store of information to a person or between two computers programmed to exchange data in the course of business."72

Finally, it remains to establish the relationship between records and archival documents.

The distinction has revolved around the precise meaning of "preserved" in the definition of the record. By preserved, Jenkinson meant the retention of documents by their creators for their own purposes. Schellenberg, on the other hand, saw preservation to mean retention for purposes other than those legal and administrative purposes for which the record was originally created, that is, for the secondary values of research. The debate has been important because it has been used as a basis for distinguishing archives from records in that, according to

Schellenberg, archives are a special group of records which have been selected by archivists on the basis of values other than those for which they were created.

Livelton sets this debate aside by equating archives with records. "Selection," he writes, "as has been seen, is implicit in the notion of preservation [in that records must first be

69 Hilary Jenkinson, A Manual of Archival Administration, revised edition (London: Percy, Lund & Humphries & Co., 1937): 10.

70 ACCIS, Management of electronic records, 176.

71 Society of American Archivists, Glossary, 28.

72 David Bearman, Managing Electronic Mail, Archives and Manuscripts 22 (No. 1): 38.

selected to be preserved]. And, by leaving it implicit - that is, by refraining from qualifying the notion of preservation - this reformulation of the traditional definition can accommodate, albeit tacitly, both the 'records' and the 'archives' sides of Schellenberg's distinctions between the agents, the recipients, and the purposes of selection for preservation."73 In other words,

Livelton's infima species of records is synonymous with archives. The archival document, therefore, defined as a document produced in the course of a practical activity, is synonymous with record.

One other point remains to be established about the nature of the archival document, namely that records always exist in aggregations. "Archival documents or records are necessarily composed of documents and the complex of their relationships. Because of this, any document, of any nature, which acquires relationships with a group of archival documents or records, is to be considered a record itself, following the fundamental rule which governs every collectivity, according to which each individual entity acquires the nature and characteristics of the whole to which it belongs."74 All records are subject to the concept of the archival bond, defined as "the relationship that, because of the circumstances of their creation, records have with their creator, with the activity in which they participate, and among themselves. The archival bond is originary (it comes into existence when the record is made or received), necessary (it exists for every record), and determined (it is characterized by the purpose of the record)."75

Before proceeding to the issue of the completeness of a record, it is advisable to get a better picture of the electronic context by examining the electronic records management system (ERM) and the role and nature of standards in more depth.

73 Livelton, Public Records, 69.

74 Duranti and Eastwood, Preservation of the Integrity of Electronic Records, 4.

75 Ibid., 4.

CHAPTER ONE

ELECTRONIC DOCUMENT MANAGEMENT SYSTEMS

Definition

An ERM is the data processing environment in which archival documents are created, identified, and preserved within a recordskeeping system. In the sense that the definition of archival documents includes those both made and received, an ERM must make provision for those records produced outside its electronic boundaries by independent recordskeeping systems.

Electronic records management is the control of documents through electronic means.

There are two levels of electronic document management. The less complex level is concerned with retrieval only, or the management of documents as "discrete objects" i.e. documents that cannot be modified. This is the focus of bibliographic retrieval systems and optical imaging systems. The more complex level attempts to go beyond management for retrieval only, to control of documents at their creation. Such systems permit documents to be created on the system as well as managed throughout their life cycle.

Electronic document management must be distinguished from electronic records management. This distinction lies in the fact that, as we have seen, not all documents are records. Electronic document management is concerned with the management of documents defined in their broadest sense, with the document as a form of information that includes both bibliographical materials and records. Electronic records management, on the other hand, is concerned only with the management of records as a species of documents. Just as electronic document management is the context of electronic records management, so it is with the systems themselves. Electronic document management systems (EDMS) are designed to manage all types of documents. An ERM might therefore be defined as the functionality of an EDMS that is specifically designed to manage records. In discussing the design and structure of an ERM, it is appropriate for present purposes to refer to the design of an EDMS.76

76 The discussion of an EDMS owes much to the author's experience at the World Bank in Washington, DC, where a major electronic document management system is being designed and implemented at the time of writing. The World Bank is a special agency of the United Nations devoted to fostering economic development in second and third world countries. The system being designed will eventually serve some 10,000 employees. The World Bank EDMS consists of software for the creation of


Structure of an EDMS

Electronic Document Design Methodology (EDDM) addresses "the creation of a flexible system of architecture, the preservation of system investment, the formulation of EDMS-based user needs, the resolution of EDMS data management anomalies, the integration of diverse data, and the minimization of risk."77 This is the world into which the archivist must venture.

There are five components to an EDMS.

1. Architectural design consists of the hardware and software infrastructure within which the EDMS will function. This includes operating systems, networks, hardware platforms, input devices, output devices, software utilities and environments, remote connections, and integrated facilities for the collection and dissemination of information.78 For instance, the architecture may specify the technical requirements for such functions as workflow, shared filing, document management, group authoring, and electronic mail.79

2. Database control and organization is the identification and definition of structured access paths. This is a necessary feature of larger systems, although in smaller systems retrieval can be based on direct content search or, in the case of images, by a single identifier.

3. Application analysis and design identifies how the organization will use and react to the system. From a systems design viewpoint, this is considered the most difficult part.80 For

documents, a document store managed by a database, and an imaging system to capture paper documents received from outside the Bank.

77 Kolopoulos, Handbook of EDMS, 6.

78 Architecture is defined as "a specification which determines how something is constructed, defining functional modularity as well as the protocols and interfaces which allow communication and cooperation among modules." ACCIS, Management of electronic records, 136. System architecture should not be confused with system configuration, which is defined as "the arrangement of a computer system or network defined by the nature, number, and the chief characteristics of its functional units. More specifically, the configuration may refer to a hardware configuration or a software configuration." Ibid., 144. A graphics monitor and a video adapter could be a minimal configuration; the use of international communication standards could be part of the software configuration.

79 Organization and Business Practices Dept., Information Engineering, Electronic Document Management System (presentation by Robert Patt-Corner and Ronald Cutier, Washington, World Bank, 1994): 3.

80 Kolopoulos, Handbook of EDMS, 28-29.

example, in the World Bank, the document management regime has been divided up into four management domains within which documents are created, accessed, and stored. These domains are the personal domain, the work group domain, the business unit domain, and the institutional domain. Each has different rules for the creation, access and storage of documents.

The World Bank Group Archives control documents in the business and institutional domains where they can no longer be altered and are intended for long-term preservation.

4. The document definition is the fundamental element that must be specific to each system. The document definition comprises an array of physical and intellectual elements such as the amount of space it may take up, type of document, and delineating features. Not only must the document itself be defined as an entity, but also, each different type of document must be defined if it is to be created and retrieved. The identification of document types is a critical task that requires, first, a rigorous analysis of document types as they are used across an organization as well as between organizations, then, the definition of attributes that will be mapped to the profile. This can be time-consuming.81

5. Document structure is the structure of the document, including both its logical components and the logical interactions of various document types and document objects,82 and its layout or physical organization. The logical structure or architecture is the way the document is perceived by a user, which may differ from its actual functioning or form. For instance, a paragraph or a chapter is a typical element of logical structure which can only be defined by the writer. The EDMS must define how a paragraph or chapter is to be encoded and retrieved, although the actual instance will be at the discretion of the writer.

81 For instance, a loan or project at the World Bank has been estimated to involve dozens of different document types, all of which must be defined to the EDMS.

82 A data object is a type of data structure which consists of a data type (text, image etc.) packaged with programming that enables it to perform certain functions towards a given end. For example, a report consisting of text and tables could be programmed to get the tables from a certain spreadsheet. A document object is a document which also contains programming to make it perform in a certain way. A document type is simply a document that is used frequently enough to be identified as a type. Its form may be more or less strictly prescribed. In the World Bank, for instance, a President's Report is a document type used to present a loan proposal to the Board of Directors and is common to all loans, with its own prescriptive characteristics. An invoice, on the other hand, is a generic document type whose form is not formally prescribed in Bank procedure.


The problem of logical structure is complicated by compound documents in that they consist of a mixture of information configurations, such as text, sound, graphics and data drawn from different applications. For instance, a report could include text taken from several different reports, graphics uploaded from an image bank in a photo library, and tables drawn from a spreadsheet. The logical structure therefore also includes the links between different types of data that may be used in the document and the objects from which they are drawn.

The links may not be just between different types of data, but between different objects, or data structures that include functionalities.83 For instance, a document may be made up of various parts, each of which is a different document in its own right. It may also have notes or annotations that may also be kept separately and which may be attached at some point in the writing or co-editing.

The layout of a document consists of its organization into pages, including the use of running heads, typefaces etc.

It is important to recognize the distinction between formatting and document structure. Standards such as SGML and ODA are able to preserve the original author format of the document, but "formatting is only useful for viewing of document information in its complete, unabridged version. Once a user begins to extract parts of the document in the original, formatting is no longer available to facilitate comprehension. In fact it begins to hinder the process of understanding the connections between differing pieces of information that are no longer connected in an authored format."84

There are only two ways to resolve this problem of loss of document organization. Either all documents must be placed into a single repository with a minimum of grammatical structure, or every document collection must be reconciled through a common structure. The second alternative favours the user but places a much greater burden on the system and is much more difficult to implement.85

83 Margolis, Personal Computer Dictionary, 329. For example, a table from a spreadsheet consists of data or numbers that may be manipulated using various commands or functions. This is opposed to the inclusion of an image which cannot be manipulated.

84 Kolopoulos, Handbook of EDMS, 7.

85 Ibid., 8.


It is obviously the second alternative that faces the archivist if the system is to respect both the needs of the creator and the retrieval requirements of users. In fact, these are likely to be synonymous if records are to be created from parts of other documents. The question then becomes, what should be the basis of a document structure that can be translated into machine-readable form, into tables that will permit not only the formatting of documents but their creation and manipulation?

Document definition and the document structure are the features of the EDMS design that fall within the ken of the archivist.

Document Profile

The EDMS manages the document throughout its entire life by defining as attributes all the features of a document that reflect its physical and intellectual nature, as well as those others that determine its use, maintenance and disposition, in a structure called the document profile. The document profile is defined as "control information that is associated with a specific document."86 There are no specific rules for a profile, except that the profile tends to contain characteristics that apply to the document as a whole, as does the ODA profile. The profile consists of descriptive rules that may be applied at the time of creation or later, when a document has been finalized and is ready for such final capture processes as imaging, but the rules must be agreed at the outset, during system design. One of the most familiar document profiles is the e-mail header. The header in an e-mail message always identifies the sender of the message, the addressee, and the time, and provides proof of the transaction.87

An attribute is defined as "a property or characteristic of one or more entities, for example, colour, weight, sex."88 In terms of records, the author, date, or title of a document

86 ITFIE, Information Management Architecture, 5.

87 The distinction must be made between the document profile and the ability of an application to format. Document profiles include the logical and layout characteristics of a document which in its original form, as created on the native application, may be defined by the application itself (e.g. maximum number of characters in a field, range of typefaces etc) but to qualify as a profile, they must be resident in the system external to any one application and capable of being imposed on any document generated in the system.

88 ACCIS, Management of electronic records, 138.

are elements that may also be construed as attributes. Crucial to an understanding of their function is that attributes are managed independently of content. They can be read or changed without changing the content of the document.

The document profile constitutes a family of different groups or types of attributes that control the management of the document. Attributes fall into various groups. Contextual attributes may define provenance, such as the organizations and individuals responsible for the document, the type of business involved, the document type, or dates of creation. Management attributes specify access, security, storage and classification, and the extent to which a document may be processed or modified. Physical attributes may define types of data structures (such as graphics, or text), the size of the file, and interchange formats. Attributes may also be mandatory or optional, user-defined or system-defined.
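The grouping of attributes described above can be illustrated with a minimal sketch in Python. It is not drawn from any of the standards under discussion: the attribute names, the class names, and the helper function are hypothetical and serve only to show a profile whose attributes are grouped, flagged as mandatory or optional and as user- or system-defined, and managed separately from content.

```python
from dataclasses import dataclass, field
from enum import Enum


class Group(Enum):
    CONTEXTUAL = "contextual"   # provenance, business, document type, dates
    MANAGEMENT = "management"   # access, security, storage, classification
    PHYSICAL = "physical"       # data structures, file size, interchange format


@dataclass(frozen=True)
class AttributeDef:
    name: str
    group: Group
    mandatory: bool = False
    system_defined: bool = False


@dataclass
class Document:
    content: bytes
    profile: dict = field(default_factory=dict)


# Hypothetical profile definition; the attribute names are illustrative only.
PROFILE_DEFS = [
    AttributeDef("originating_unit", Group.CONTEXTUAL, mandatory=True, system_defined=True),
    AttributeDef("document_type", Group.CONTEXTUAL, mandatory=True, system_defined=True),
    AttributeDef("title", Group.CONTEXTUAL),
    AttributeDef("access_level", Group.MANAGEMENT, mandatory=True),
    AttributeDef("file_size", Group.PHYSICAL, system_defined=True),
]


def missing_mandatory(doc):
    """Return the names of mandatory attributes that have no value yet."""
    return [d.name for d in PROFILE_DEFS if d.mandatory and d.name not in doc.profile]


doc = Document(content=b"Dear Sir, ...")
doc.profile["originating_unit"] = "Archives"
doc.profile["document_type"] = "letter"
content_before = doc.content
doc.profile["access_level"] = "restricted"  # the profile changes; the content does not
```

In this sketch the profile is an ordinary dictionary held beside the content, so an attribute can be read or changed without the content being touched, which is the property the text identifies as crucial.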

Each open document exchange standard has its own requirements and therefore its own set of attributes. For instance, the Document Filing and Retrieval Standard (DFR) has its own attributes, which are defined as data items that identify a DFR-object, describe its DFR-content, help control access to it, or are in some way associated with the DFR-object.89

Without doubt, the document profile is at the very heart of the electronic document management system. It is here that the record itself will be defined and on the basis of those attributes, retrieved and managed.

The profile is independent of the content of the document. For instance, when a document is created, its profile is the first thing to be defined and modified. This provisional profile, parts of which will be system-defined (such as identification of originating unit, business process, or document type) and others optional (such as a title), is subsequently linked to the content, and together they form the document. In the same way, if the document is not preserved, all that may be retained will be the profile; the content will be eradicated, leaving an empty shell.

The concept of a document profile is not confined only to electronic systems: it is characteristic of all documents. For example, everyone who writes a letter does so with certain rules in mind about how it should look in order to be understood as a letter, i.e. the inclusion of

89 ISO, DFR: Part 1: Abstract Service Definition and Procedures, 5.

a salutation and closing etc. In this case, the "profile" is assumed or understood by the creator. Medieval notaries developed formularia which set out the rules for the creation of various types of documents by prescribing standard formulas, such as wordings or "boilerplate" text. These are the ancestors of the style guides and standard document forms that are today a common feature of life wherever a document is required as evidence of a transaction and are a form of profiling.90 In electronic information systems, the document profile resides as a functionality that must be invoked before a document is to be created.

Role of Metadata

The attributes are metadata, defined as information about information. The concept of metadata presumes that all information is made up of two parts: a conceptual form, or its description, and the actual occurrence of information fitting that description. A particular instance of the class of information indicated by the attribute type is called an attribute value.

For instance, the attribute "date of creation" will define what constitutes a date of creation and in what form it may be recorded as data; the entry of an actual date of creation will be an attribute value. The attribute "author's name" becomes a particular item of metadata and the actual name of an author becomes a specific occurrence.
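The distinction between an attribute and an attribute value may be sketched as follows. The example is illustrative only: the class name is invented, and the choice of ISO 8601 calendar dates (YYYY-MM-DD) as the recording form is an assumption made for the sketch, not a requirement of any standard discussed here.

```python
from datetime import date


class AttributeType:
    """Describes what a value must look like (the metadata)."""

    def __init__(self, name, parse):
        self.name = name
        self.parse = parse  # callable that turns raw input into a valid value

    def value(self, raw):
        """Return an attribute value; raise ValueError if raw does not fit."""
        return self.parse(raw)


# Assumed recording form for dates: ISO 8601 calendar dates (YYYY-MM-DD).
DATE_OF_CREATION = AttributeType("date of creation", date.fromisoformat)

# An attribute value is one occurrence that fits the description.
created = DATE_OF_CREATION.value("1995-10-01")
```

An entry such as "October 1995" does not fit the assumed form and is rejected with a ValueError, which shows how the description constrains every occurrence.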

There is no limit to the characteristics and functionalities of a document that can be established in the profile but defining the attributes is by no means easy. The document profile forms a relational database. This means that each attribute must be unique, cannot be confused with another, and must consist of a highly specific, quantifiable value that can be manipulated by the EDMS with consistent accuracy and results. It is obviously preferable to avoid as far as possible user-defined attributes whose values are subject to human inconsistency, incompleteness, or inaccuracy. This is the problem of transparency: the more automatic the values of the profile, the less reliance on user input and the easier it is to use. Ideally, the profile should be defined in such a way that it depends on the user as little as possible. In the design of

90 It might well be argued that it is virtually impossible to sit down and write anything without quickly becoming aware of rules of formulation. This is true even of personal diaries, which have a recognized form even though they are not intended to be communicated to anyone but the author. The truth seems to be that communication itself, whether public or private, has implicit rules, and writing in particular, as one of the most structured forms of communication, especially so.

any database, the definition of the metadata is an exacting process that will determine ultimate success, all other elements being equal.

It is therefore appropriate here to be careful about the use of the term metadata. Unless there are exponential leaps in artificial intelligence, EDMS will be successful only in dealing with highly definable data that is as little dependent on user-definition as possible. For instance, no EDMS is capable of grasping the abstract concept of a record series. The identification of a record series depends on the ability to recognize a number of physical and intellectual characteristics whose aggregation is arbitrary and whose occurrence cannot be defined with consistency. While series may be defined as metadata, one must be careful to recognize that this merely includes human-generated metadata in an automated system without contributing to or enhancing its accuracy. Metadata, in the case of a document profile, should be thought of as including only those attributes that can be defined in terms of automatic system manipulation.91

Distributed Database Management

It is very likely that the EDMS will operate within the environment of distributed data processing in which some or all of the processing, storage and control functions, in addition to input and output functions, are situated in different places and connected by transmission facilities. The EDMS may function as a distributed database, which is a logical database that has been divided among physical locations within a distributed information system,92 or in other words, is the same database operating out of different physical locations. For example, in the World Bank, a system called Integrated Records and Archives Management (IRAMS) is used to control the accessioning of inactive records by the Archives, the receipt of project records by information service centers, and the documenting of Bank reports and publications by the Internal Documents Unit. Each of these units operates out of different locations and has different requirements, all of which must be captured as attributes in IRAMS.

91 This example leaves aside the question of whether series would continue to exist in an electronic document management system.

92 ACCIS, Management of electronic records, 149.


A more complex manifestation of the concept of distributed data processing is the federated database. A federated database is a system which enables searching and/or display of structured data stored in decentralized, heterogeneous databases. Software and data structures do not have to be compatible, but the fields in the databases must have compatible semantics and these must be defined in a consolidated repository. A federated database system is made up of cooperating but autonomous databases.93
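The requirement that fields have compatible semantics defined in a consolidated repository can be sketched briefly. The database names, field names, and records below are hypothetical; the sketch assumes only what the definition states, namely that the databases are autonomous and structurally different but share meanings that are recorded in one place.

```python
# Consolidated repository: shared semantic name -> local field name per database.
REPOSITORY = {
    "creator": {"archives": "originator", "library": "author"},
    "created": {"archives": "date_created", "library": "pub_date"},
}

# Two autonomous databases with incompatible structures (hypothetical data).
DATABASES = {
    "archives": [{"originator": "Loans Dept", "date_created": "1994-03-01"}],
    "library": [
        {"author": "Loans Dept", "pub_date": "1994"},
        {"author": "Research Dept", "pub_date": "1993"},
    ],
}


def federated_search(semantic_field, wanted):
    """Search every database for records whose mapped field equals wanted."""
    hits = []
    for db_name, records in DATABASES.items():
        local_field = REPOSITORY[semantic_field].get(db_name)
        if local_field is None:
            continue  # this database does not hold the concept
        for record in records:
            if record.get(local_field) == wanted:
                hits.append((db_name, record))
    return hits


results = federated_search("creator", "Loans Dept")
```

Neither database is altered to suit the other; only the repository knows that "originator" and "author" carry the same meaning.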

The control of documents within a distributed data processing environment relies on a database management system. A database management system is a software product that controls a data structure containing interrelated data stored so as to optimize accessibility, control redundancy, and offer multiple views of the data to multiple application programs. Database management systems also implement data independence to varying degrees.94

At the heart of the database management system are the data dictionary and the data directory. The data dictionary is a repository of information about the definition, structure, and usage of data, or in other words, a library of metadata. It does not contain the actual data itself. In effect, the data dictionary contains the name of each data type (element), its definition (size and type), where and how it is used, and its relationship to other data. The data directory is a structured description of the relationships between data in a database, such as cross-reference information showing which programs access which data or which departments within an organization receive which reports.95
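The division of labour between the data dictionary and the data directory can be shown in a small sketch. The element names, sizes, and program names are invented for illustration and do not describe any actual system.

```python
# Data dictionary: metadata about each data element, never the data itself.
DATA_DICTIONARY = {
    "author": {"type": "text", "size": 80,
               "definition": "person responsible for the document"},
    "date_created": {"type": "date", "size": 10,
                     "definition": "date the document was made"},
    "document_type": {"type": "text", "size": 40,
                      "definition": "recognized type of the document"},
}

# Data directory: cross-references showing which programs access which elements.
DATA_DIRECTORY = {
    "accessioning": ["author", "date_created"],
    "retrieval": ["author", "document_type"],
}


def programs_using(element):
    """Return, in sorted order, the programs that access a data element."""
    if element not in DATA_DICTIONARY:
        raise KeyError(f"{element!r} is not defined in the data dictionary")
    return sorted(p for p, elems in DATA_DIRECTORY.items() if element in elems)
```

A call such as programs_using("author") consults the dictionary for the definition and the directory for the cross-references; no document data is read at any point.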

Open Document Processing and the Role of Standards

Distributed data processing, whether consisting of only one database operating out of different physical locations, or different databases that cooperate as a single system, raises the problem of standardization. This is exacerbated in a situation where databases that are not part of the same system must try to communicate and share the same information. Open document processing is the use of standardized information formats to create a common understanding

93 ITFIE, Information Management Architecture, 99.

94 ACCIS, Management of electronic records, 149.

95 Ibid., 149.

between originator and recipient about the information being interchanged.96 International communication standards are meant to facilitate the exchange of documents between different systems, which may or may not mean different organizations or, in archival terms, records-creators.

Figure 1. The Basic Processing Model97

[Figure: a source document type is related to a result document type by a mapping specification; a source document is transformed into a result document by a processing step.]

Open document processing functions by separating the information in the document from its layout or organization. In more specific terms, this requires "the clean separation between the document logic and the layout control of a document, i.e., the separation between the original document information (i.e. the structure and semantic categories of the information that the author has in mind and wants to convey to the mind of the reader) and the control information for processing steps to be performed on this original information, particularly control information for an automatic formatting program. This separation is important because ... the creation of a document need not be performed at the same place or at the same time as the further processing steps of this document."98

The heart of open document interchange is the automatic processing step which transforms the source information into the corresponding result information. This concept can be clearly understood from the basic processing model in which a source document goes through an automatic processing step in order to be received as a result document. Both source

96 Bormann & Bormann, Standards for open document processing, 149.

97 Ibid., 152.

98 Ibid., 149.

and result documents are generated from a pre-determined document type which must be mapped to the specifications of the automatic processor in order to be transmitted and interpreted correctly.99 For this to happen between communicating databases, the attributes of both the result and the source document must be agreed in the document profile, or in other words, standardized.

Since many documents share similar characteristics, the support of sets of rules which can be used to control, through standardization, the creation and processing of specific documents of the same type is a necessity. These rules can be used to define both source and result documents.
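The basic processing model may be sketched in a few lines. The sketch is a simplification made under stated assumptions: the element names and the layout directives are invented, and neither ODA nor SGML prescribes this representation. It shows only the separation the model depends on, in which a source document carries logical elements and a mapping specification, held apart from the document, determines the layout of the result.

```python
# Source document type: the logical elements a document of this type may contain.
SOURCE_TYPE = {"title", "paragraph"}

# Mapping specification: how each logical element becomes a layout element.
# The layout directives are invented for illustration.
MAPPING = {
    "title": lambda text: "== " + text.upper() + " ==",
    "paragraph": lambda text: "    " + text,
}


def process(source_document):
    """Automatic processing step: source document -> result document."""
    result = []
    for element, text in source_document:
        if element not in SOURCE_TYPE:
            raise ValueError(f"element {element!r} is not part of the source document type")
        result.append(MAPPING[element](text))
    return result


source = [("title", "Annual Report"), ("paragraph", "Lending rose this year.")]
result = process(source)
```

Because the source document contains no layout information, a recipient holding a different mapping specification could produce a differently formatted result from the same source.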

Commercial applications of open document processing distinguish two application environments: the publishing environment, and the office environment. The publishing environment is concerned with the processing and distribution of publications of all kinds; the office environment provides for "the processing, forwarding and turn-around of business documents (such as letters, forms, and reports) as a routine form of communication . . . including bi-directional routine forms of communications . . ."100 These two environments have different emphases. In the publishing environment, the emphasis is on interchange of manuscripts between author and publisher, and between publisher and printer. The formatting of the document for layout is of prime importance. The author in the publishing environment has no control over layout, which is controlled by the publisher. The document product will last a long time, so the time taken to prepare the document is of less importance, and the product is likely to be reproduced in different formats. In the office environment, instead, the main emphasis is "on the blind interchangeability" of documents. That is, "it must be ensured that documents can be interchanged routinely between arbitrary originators and recipients without special pre-agreements between them and without restricting the ability to process the document by the recipient."101 There can be no delays in producing documents so there cannot be elaborate encoding of processing requirements. Another major difference is that the

99 Ibid., 152.

100 Ibid., 150.

101 Ibid., 150.

originator, or author, of the document usually wants strict control over the layout of the document. The directives that define the document layout must therefore be interchanged together with the document content.

The extended processing model of open document processing accommodates the need for extensions to enable documents to have the functionalities of both the office and publishing document environments. This is made possible by the use of extensions to the document modeling standards of ODA and SGML and fosters the convergence of these two basic standards. In the extended processing model, the source and result documents are defined by pre-determined conceptual documents, and then mapped to the specification of the automatic processor provided by the standard. In order not to complicate the description of documents by adding all the necessary information needed to define extensions, the functionalities of the extensions are added in additional processing steps mapped from a transit document.102

The most important standards for the purposes of open document exchange are those set by the International Organization for Standardization (ISO). These standards are part of a whole set of standards that govern the complete process of data interchange between computer systems. At the heart of ISO communications is the Open Systems Interconnection (OSI) Reference Model. The OSI model is a communication reference model that has been defined by ISO. It is a seven-layered communications protocol103 intended as a standard for the development of communications systems worldwide.

From bottom to top, the layers of the OSI model are:

• Layer 1 - Physical Layer

The physical layer defines the actual set of wires, plugs and electrical signals that connect the sending and receiving devices to the network.

• Layer 2 - Data Link Layer

102 Ibid., 158-159.

103 A protocol is an "agreed format for transmitting data between two devices. The protocol determines ... the type of error checking to be used; how the sending device will indicate that it has finished sending a message; and how the receiving device will indicate that it has received a message." Margolis, Personal Computer Dictionary, 389.


The data link layer is responsible for gaining access to the network and transmitting the physical block of data from one device to another. It includes the error checking necessary to ensure an accurate transmission.

• Layer 3 - Network Layer

The network layer establishes the connection between two parties that are not directly connected together. For example, this layer is the common function of the telephone system.

• Layer 4 - Transport Layer

The transport layer is responsible for converting messages into the structures required for transmission over the network. A high level of error recovery is also provided in this layer.

• Layer 5 - Session Layer

The session layer establishes and terminates the session, queues the incoming messages and is responsible for recovering from an abnormally terminated session.

• Layer 6 - Presentation Layer

The presentation layer is used to convert one data format to another, for example, one word processor format to another or one database format to another.

• Layer 7 - Application Layer

The application layer is the top layer. It is the set of messages that application programs use to request data and services from each other. Electronic mail and query languages are examples of this layer.

The OSI model is not complete because it does not define standards for user applications that lie above and beyond the Application Layer. OSI does not attempt to define standards for types of application such as spreadsheets, or word processing or EDMS. What it does provide is the ability for these packages to communicate.

The OSI model depends for its functionality on a number of principles. First, the layers of the model must be exactly duplicated for both sender and receiver. Secondly, each layer must be self-sufficient, or independent in its functioning from any other layer; that is, each layer must be complete in itself. To make use of its functionalities, one has merely to plug in; it is not necessary to know anything about the layers beneath. This means that the model is essentially modular: standards can be developed at each layer without affecting the other layers or having to redesign all or part of the whole model. Finally, the same feature of the document profile that separates the description of data from its actual occurrence applies to standards. Standards are metadata meant to govern the design of an actual occurrence or implementation.
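The first two principles can be illustrated with a minimal sketch in Python. It is not drawn from any OSI implementation: the layer names are those listed above, but the bracketed wrappers and the function names `send` and `receive` are invented for illustration. Each layer adds and removes only its own wrapper, and the receiver can recover the message only if it has the same stack of layers as the sender.

```python
# Illustrative only: the layer names come from the OSI model described above;
# the bracketed wrappers are invented for this sketch and are not protocol data.
LAYERS = [
    "Physical", "Data Link", "Network", "Transport",
    "Session", "Presentation", "Application",
]  # Layer 1 (bottom) to Layer 7 (top)


def send(message: str) -> str:
    """Pass a message down the stack: each layer wraps what it gets from above."""
    data = message
    for name in reversed(LAYERS):  # Application wraps first, Physical last
        data = f"[{name}]{data}[/{name}]"
    return data


def receive(frame: str) -> str:
    """Pass a frame up the stack: each layer strips only its own wrapper."""
    data = frame
    for name in LAYERS:  # Physical unwraps first, Application last
        prefix, suffix = f"[{name}]", f"[/{name}]"
        if not (data.startswith(prefix) and data.endswith(suffix)):
            raise ValueError(f"receiver has no matching {name} layer")
        data = data[len(prefix):len(data) - len(suffix)]
    return data
```

Because no layer inspects another layer's wrapper, any one of them could be replaced without touching the rest, which is the sense in which the model is modular.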

The document management standards that are the subject of this thesis are those found at the top of the OSI Model, Layer Seven, Applications. These standards address different types and functionalities of document communication and form, a considerable family. SGML (Standard Generalized Markup Language) is designed to permit the exchange of documents that are to be published. ODA (Office Document Architecture) is designed to handle office documents. ASN.1 (Abstract Syntax Notation One) is a standard language for defining data structures, or the way data may be encoded using binary encoding rules. CCF (Common Communication Format) is a standard for the transfer of bibliographic cataloguing and abstracting information, originally produced by the UNESCO General Information Program for transferring data between computer systems. MARC AMC is another standard used to transfer bibliographical and archival data. Facsimile transmissions are governed by their own set of standards, while the X.500 standards govern remote directory services. Retrieval from remote storage on servers is governed by the DFR (Document Filing and Retrieval) standard and the FTAM (File Transfer, Access and Management) standard. The transfer of graphics and images is governed by yet other standards.

CHAPTER TWO

DIPLOMATICS

Electronic document management takes place at the document level, and archivists need a tool that will help them work at the level of the document. The science of modern diplomatics, as proposed by Luciana Duranti, offers just such a tool. As Duranti points out,

"The boundary lines between the two disciplines is to be found in the series, the fonds, the archives as a complex of documents, as a whole, which constitutes the area of archival science. Instead, the single document, the elemental archival unit, is the area of diplomatics."104

The general theory of diplomatics defines diplomatics as "the discipline which studies the genesis, forms, and transmission of archival documents, and their relationship with the facts represented in them and with their creator, in order to identify, evaluate, and communicate their true nature."105 By focusing on the "true nature" of documents, this definition goes well beyond the original purpose of diplomatics as it developed up until the French Revolution, which was "strictly linked to the need to determine the authenticity106 of documents, for the ultimate purpose of ascertaining the reality of the rights or truthfulness of the facts represented in them,"107 and even further beyond the nineteenth century use of historical diplomatics as a tool of documentary criticism.

The ability to determine the authenticity of documentary sources by the study of their forms and genesis will remain fundamental to the value of diplomatics. Indeed, "as public officials who are professionally knowledgeable of the nature of records, archivists still have an important role to play in guaranteeing the authenticity of documents and may see that role

104 Duranti, Diplomatics, Archivaria 28, 10.

105 Ibid., 10.

106 The diplomatically authentic document is defined as "those which are written according to the practice of the time and place indicated in the text, and signed with the name(s) of the person(s) competent to create them." Ibid., 17.

107 Duranti, Diplomatics, Archivaria 28, 17.

grow in significance as they acquire machine-readable records."108 But changes in the circumstances of document creation and the role of archives have led to a reassessment of the potential applications of diplomatics. Duranti writes that the principles and methods of diplomatics as they were developed in the nineteenth century cannot be readily applied to modern documents because of the "plurality and fragmentation of our sources, and because the formalism of old bureaucracies has atrophied in modern ones, creating forms of documents which do not often lend themselves to systematic analysis and description." Yet despite a proliferation of laws and administrative bodies, the application of diplomatics is favoured by the "growing uniformity of laws, regulations, and structures, and of the way these activities are carried out because of the standardization promoted by records management, which is vital to an elephantine bureaucracy, and because freedom of information, underlining the accountability of administrative bodies and the citizen's right to control their activities, favour a better organization and determines the spreading of the knowledge of our social system, knowledge which is losing its elitist character."109 Of particular consequence is the recognition that the boundary between records management and archives is a nineteenth century historical aberration that can have no place in the management of electronic records.

Duranti believes that there is a particularly urgent need to apply diplomatic principles to electronic documents where the central concern should be to ensure that records are not only authentic, but also, even more important, reliable.110 "The easiness of electronic records creation and the level of autonomy that it has provided to records creators, coupled with an exhilarating sense of freedom from the claims of bureaucratic strictures, procedures and forms, have determined the sloppiest records creation in the history of records making. Too many persons and too many records forms generated in too many different contexts participate in the same transaction; too much information is recorded; too many duplicates are preserved; and

108 Ibid., 23.

109 Ibid., 9.

110 A record is considered reliable "when it can be treated as fact in itself, that is, as the entity of which it is evidence." Luciana Duranti, Reliability and Authenticity: the Concepts and their Implications (unpublished paper, 1995): 3.

too many different technologies are used. In a word, electronic records, as presently generated, might be authentic, but they are certainly not reliable."111

These potentials are inherent in diplomatics as a science defined not by time and place, or by its application for historical, legal, or administrative purposes, but by the nature of its subject, the archival document. The archival document is broadly defined as a document made or received by a physical or juridical person in the course of a practical activity.112 This definition distinguishes the archival document from the broader category of the written document, which is defined as "evidence ... produced on a medium ... by means of a writing instrument ... or of an apparatus for fixing data, images and/or voices,"113 the term "written" referring not to the physical act of writing, but rather to the "purposes and intellectual results of writing: that is, the expression of ideas in a form which is both objectified (documentary) and syntactic (governed by rules of arrangement)."114 Diplomatics posits that all written documents convey their information by means of rules of representation which are in themselves evidence of the intention to convey information. "These rules, which we call the form, reflect the political, legal, administrative, and economic structures, culture, habits, myths of society, and constitute an integral part of the written document, because they formulate or condition the idea or facts which we take to be the content of the documents." The important point is that these rules are independent of content. "The form of a written document is ... the whole of its characteristics which can be separated from the determination of the particular subjects, persons, or places it is about."115 This separation of form and content is of profound significance for document management because it shifts the focus away from the document as an object of generic information retrieval to management of the document as a record, or archival document.

111 Ibid., 9-10.

112 Duranti, Diplomatics, Archivaria 28, 15.

113 Ibid., 15.

114 Ibid., 15.

115 Ibid., 15.

The diplomatic concept of form is a broad concept that should not be confused with the familiar connotations of form as an "arrangement of parts", a "shape", or a formula.116 The diplomatic concept of form is actually the expression of a system of elements of which the document is the physical and intellectual manifestation. These elements are the juridical system, or social system organized according to a system of rules which constitute the context of document creation and give it meaning and relevance; the act, or the movement of the will that gives origin to the document; the persons who participate in the creation of the document; and the procedures, or the genetic process by which the document is drawn up. All these elements are given expression in the documentary form, "which allows document creation to achieve its purpose by embracing all the relevant elements and showing their relationship."117

At the very heart of diplomatics lies the idea that all documents can be analyzed and understood in terms of this system of elements. Conceptually, that is, in terms of general diplomatics, or the theory of diplomatics, these elements are universal in their application and independent of any context of time and place; in other words, they are decontextualized in nature.

If this is true, then these system elements would also have the character of metadata, that is, their definition is independent of any particular occurrence of the data. In terms of special diplomatics, or the critical application of diplomatic theory to specific situations, these elements should then become capable of application within electronic document management systems as metadata within the register or document profile that can be used to define the data entry rules for the capture of records.
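As a minimal sketch of what this could mean in practice, the four system elements can be written as fields of a document profile whose definition exists independently of any particular record. The class name, field names, and the rule that every element must be filled in are assumptions made for illustration; they are not taken from ODA, DFR, or SGML.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DiplomaticProfile:
    """Hypothetical document profile built from the four diplomatic elements."""
    juridical_system: str = ""   # context of creation, e.g. an organization
    act: str = ""                # the act the document gives expression to
    persons: dict = field(default_factory=dict)  # role name -> person
    procedure: str = ""          # procedure by which the document is drawn up


def missing_elements(profile: DiplomaticProfile) -> list:
    """Return the names of the elements left empty at data entry."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]
```

A data entry rule for the capture of records could then be as simple as refusing to register a document while `missing_elements` returns anything.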

JURIDICAL SYSTEM

Diplomatics holds that all documents are created within the context of binding rules according to which social groups organize themselves. A juridical system is defined as a "collectivity based on a system of rules."118 The system of rules is the legal system, and could

116 Oxford Modern English Dictionary, 415.

117 Luciana Duranti, Diplomatics: New Uses for an Old Science (part IV), Archivaria 31 (Winter 1990-1991): 10.

118 Luciana Duranti, Diplomatics: New Uses for an Old Science, Archivaria 29 (Winter 1989-90): 5.

be composed of customs, statutes, traditions, or even beliefs.119 It is the juridical system which is the ultimate context of document creation, permitting us to recognize a document as something we might expect to encounter under given circumstances, and thus imparting relevance and meaning. Understood as an abstraction, the concept of the juridical system is extremely flexible in its application and has the decontextualized nature of a theoretical construct: the system of laws could be those of any form of organization known to man as long as it has rules in some form: a modern bureaucratic state, a primitive tribe, a trade, a cult, or a family, at any time, in any place. Nor is the juridical system restricted to any particular physical or intellectual form of documents: the documents could be written or oral or be sacred objects or tokens.

A juridical system is the broad context within which the creation of a record is validated; however, it is an abstract context that should not be confused with the recordskeeping system itself, or its rules or means of control and regulation. The juridical system provides the broad legal context which sanctions the existence of a recordskeeping system. It cannot, therefore, be the recordskeeping system itself. A policy and procedure manual governing the maintenance of a recordskeeping system is not an element of the juridical system per se, but a belief that written rules are necessary to demonstrate and exercise control is reflected in the existence of such documents and may be characteristic of the juridical system. The distinction between the juridical system and a recordskeeping system applies equally to open document exchange standards as instruments of the recordskeeping system.

FACTS

Another vital diplomatic concept is that of facts. Diplomatics holds that all archival documents are created with a specific purpose in mind, that is, with intentional consequences, and therefore embody deliberate facts or, in diplomatic parlance, acts.120 Apart from the written archival document having a medium, form and content, it also implies "either the presence of a

119 For instance, the World Bank is a juridical system in its own right whose creation of documents is governed by such factors as the Articles of Agreement, its own administrative laws and organizational competencies, the relationship with the legal systems of its members, and the rules and customs of banking and international financial transactions.

120 Facts themselves need not be deliberate.

fact and a will to manifest [the fact], or of a will to give origin to a fact. It also indicates a purpose. In fact, the existence of something written, directly or potentially, determines consequences, that is, it can create, preserve, modify or extinguish situations."121

Juridical systems attempt to anticipate facts, which are events or occurrences that have consequences. Those occurrences caused by humans are known as human facts; those, such as natural disasters, over which humans have no control, are known as natural facts.122 Human facts which are a result of a deliberate intention are critical to the concept of the archival document. "Among human facts in general, the special type of fact which results from a will determined to produce it is called an action or act. The operation of will distinguishes an act from any other general fact. . . In other words, an act is a fact originated by a will to produce exactly the effect that it produces."123

For a document to be a record, it must manifest an act, or be the expression of a deliberate will to act. Those acts which are limited simply to the accomplishment of the act, known as mere acts, do not give rise to records: a decision to go for a walk, for example, would qualify as a mere act if the only intent was to get a little exercise. To give rise to a record, an act must have the character of a transaction, defined as "a declaration of will directed towards obtaining effects recognized and guaranteed by the juridical system."124 The making out of an invoice, the drawing up of a bylaw, or the swearing of an oath are all examples of types of actions that would qualify as transactions because they can only be undertaken if their conceptual meaning is anticipated and accepted by the juridical system. For this reason, all the actions of public bureaucracies are construed as transactions.

Because of the relationship between acts and records, without an understanding of the act in which it is involved, it is difficult to know the nature of a document. Diplomatics has

121 Duranti, Diplomatics, Archivaria 28, 16.

122 The pejorative phrase "facts of life" reflects this fundamental distinction, where humans are thought not to have control over certain parts of their own behaviour because of the role of inheritance, physiology, and instinct.

123 Duranti, Diplomatics, Archivaria 28, 6.

124 Ibid., 7.

developed a taxonomy of decontextualized acts that is intended to encompass the whole range of potential administrative situations in which documents might participate as records.

If the nature of records is determined by their participation in acts, their value as evidence depends on the kind of relationships they have with those acts. On this basis, it is possible to distinguish the relative weight of documents in so far as they participate in a given act. Thus, documents which give direct expression to the act (i.e. without which the act could not exist) are called "dispositive", and are considered juridically relevant from a legal point of view, for example, a loan agreement, title deed, or sentence of consecration. The same is said of probative documents, which attest that an act occurred, for example, an oath or an invoice.

Documents whose written form is not required but which are generated spontaneously in carrying out an act, such as a consultant report or a discretionary memorandum of advice, are "supporting" and can only contribute indirect evidence of the actions they concern. Finally, "narrative" documents that are about actions but neither take part in them, attest to them, nor support them, have no force as evidence of them.125
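The four categories above could be carried in a document profile as a controlled vocabulary. The following is a sketch only: the enumeration, the mapping, and the wording of the weights paraphrase the preceding paragraphs and are not attributes defined by ODA, DFR, or SGML.

```python
from enum import Enum


class DocumentFunction(Enum):
    """Relationship of a document to the act (hypothetical attribute values)."""
    DISPOSITIVE = "dispositive"  # gives direct expression to the act
    PROBATIVE = "probative"      # attests that the act occurred
    SUPPORTING = "supporting"    # generated in carrying out the act
    NARRATIVE = "narrative"      # merely about the act


# Evidentiary weight of each category, paraphrasing the text above.
EVIDENTIARY_WEIGHT = {
    DocumentFunction.DISPOSITIVE: "juridically relevant",
    DocumentFunction.PROBATIVE: "juridically relevant",
    DocumentFunction.SUPPORTING: "indirect evidence",
    DocumentFunction.NARRATIVE: "no force as evidence",
}


def evidentiary_weight(label: str) -> str:
    """Look up the weight for a category given as a string such as 'probative'."""
    return EVIDENTIARY_WEIGHT[DocumentFunction(label)]
```

For example, `evidentiary_weight("supporting")` returns `"indirect evidence"`, while an unrecognized label raises a `ValueError` and so cannot be entered silently.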

The categorization of documents by their degree of relevance as evidence is tremendously important in analyzing the documents of organizations bureaucratic in nature.

Diplomatics recognizes that bureaucratic facts have a special character: "they are juridical acts directed to the obtainment of effects recognized and guaranteed by the system, that is, they are transactions."126 Only those documents that are "reliable and complete, that is, able to convey information, capable of being used in a transaction, and of reaching the purposes for which they have been produced, are transactions," and can be called records in this context.127

The distinction lies in whether a document is the result of a procedure or a process. A procedure is "the body of written or unwritten rules whereby a transaction is effectuated, and comprises the formal steps to be undertaken in carrying out a transaction."128 The documents created in the course of procedure are "at one" with it and must be identified as such so that

125 Duranti, Diplomatics, Archivaria 29, 9.

126 Ibid., 12.

127 Ibid., 12.

various activities of an organization. Precisely because it is a term that can be defined to describe individual circumstances, "business process" has a rather variable meaning.131 This, in itself, is not necessarily a serious disadvantage. As Luciana Duranti has pointed out, recordskeeping systems exist to serve the aims and purposes of records creators - not vice versa. Provided the system is able to tie records to acts in such a way that the records creator can recognize them and is able to distinguish records involved in one type of transaction from another, the theory of acts has been realized in practice.

CREATION PROCESS

Closely allied to the principle that all records give expression to acts is the idea that all bureaucratic documents are the result of a creation process or procedures. Two conceptual procedures are common to all records: they reflect, respectively, the moment of action and the moment of documentation. The moment of documentation is that point in time when the action is documented. The moment of action is when the decision is taken to act and the command to prepare a document is given. These two moments are recognized in international open document exchange standards by such attributes as SGML Release Date and ODA Creation Date and Time, and Release Date and Time (SEE Thesaurus - dates).

Depending on the type of act, these two moments may occur simultaneously in the same document. For instance, a dispositive document unites the moment of action and the moment of documentation. These are separated in the probative document, where the document records an action that has already taken place. In medieval documents, the moment of action and the moment of documentation were usually united in the same document. A single document would encompass an entire act, from start to finish. Modern documentation, on the other hand, is fragmented: bureaucratic procedures are complex, involving many different acts, all of which may generate documents of their own.
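The relation between the two moments can be sketched as a comparison of two dates held in a document profile. The function and its argument names are hypothetical, and the sketch does not say which ODA or SGML date attribute should hold which moment. Treating a documentation date earlier than the action date as an error is an assumption made here, not a rule stated by diplomatics.

```python
from datetime import date


def moments_relation(action: date, documentation: date) -> str:
    """Report whether the two moments are united or separated.

    'united' matches the dispositive case described above; 'separated'
    matches the probative case, where the action has already taken place.
    """
    if documentation < action:
        # Assumption of this sketch: an act cannot be documented before it occurs.
        raise ValueError("moment of documentation precedes moment of action")
    return "united" if documentation == action else "separated"
```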

131 For example, business process at the World Bank is an attribute of the document profile. The value consists of cost-accounting codes assigned by the Controller, e.g. Lending - Pre-Appraisal (of loans). Some of these lump several different activities together, others go into considerable detail. The advantage of the use of business process in this context is that it encompasses the entire organization, reflects activities as they are understood and carried out, and, since it is tied directly to budgets and expenditures, must be used by everyone and is likely to be maintained and policed on a regular basis.

PROCEDURES

All records are the result of a procedure, which is defined as a series of formal steps undertaken to give effect to an action. Procedures are peculiar to every type of act and the juridical context in which the act takes place, but diplomatic analysis suggests that all procedures can be classified into four basic conceptual categories: organizational, instrumental, executive, and constitutive. Organizational procedures are "aimed at the establishment of organizational structure and internal procedures, and their maintenance, modification or extinction."132 Instrumental procedures are those connected with opinions or advice,133 while executive procedures allow for "the regular transaction of affairs within limits, and according to norms already established by a different authority."134 Constitutive procedures are those which create, extinguish, or modify the exercise of power, and comprise a family of sub-types consisting of procedures of concession, limitation, and authorization.135 [SEE Thesaurus - procedures for a complete list].

The identification of the specific procedures peculiar to the organization is essential to understanding the interrelationships of documents within a given act and ought to be captured by the electronic document management system if the interrelationships between records are to be reconstructed. Of course, procedures cannot be captured without identification and understanding of the act to which they give expression. Identification of the act must therefore come first, then the procedures.

The fundamental aim of diplomatics is to determine the extent to which the archival document, or record, is a legitimate product of its context, the action which is realized within that context, and the procedures that put it into effect. These must be made manifest in the

132 Luciana Duranti, Diplomatics: New Uses for an Old Science (Part IV), Archivaria 31 (Winter 1990-91): 19.

133 Ibid., 19.

134 Ibid., 19.

135 Ibid., 19.

document itself. But with more complicated types of acts, such as acts on procedure, complex and continuative acts, many iterations and types of records may be involved, and no one record will embody the entire act. Some type of mechanism must therefore be identified that can electronically identify the act and the procedure and key these to the document.

Electronic recordskeeping systems execute processes in order to permit programs to be run, and procedures or routines that permit programs to execute a particular task. These are not defined in terms of bureaucratic acts, but in terms of how data is handled by the system.136

Yet in order to recognize records, an electronic recordskeeping system must be able to acknowledge the bureaucratic procedures that lead to records creation. "Ultimately," writes Duranti, because it is an essential constituent of reliability, "the goal is to have restructured business procedures in which the records making and keeping function is a highly regulated and integral part of the usual and ordinary conduct of affairs."137

As with acts, no mechanism exists in ODA, DFR, or SGML to specifically flag procedures although ODA offers the catch-all of "User Specific Codes". There is no theoretical reason why this level of document control could not be imposed. On a practical level, the main reason against it is the problem of transparency: in the process of creating documentation, will the user be faced with a complicated header with all sorts of mandatory fields that must be filled in before anything further can be accomplished? Further, with something as unique and often subtle in its working manifestation as procedure, can the user be trusted to enter reliable data? While this is not a basis of theoretical objection, it is a real constraint on present-day systems. Again, the answer would seem to be that the more prescriptive, or dictated by legal requirements, the procedure may be, the more likely an attribute of procedure is to be sought.

PHASES OF PROCEDURE

An additional complication is the sequencing of document creation in phases. A phase is a step in a procedure. Diplomatics posits a taxonomy of phases that breaks all procedures

136 For example, a system may automatically and routinely invert cumulative indexes of terms as they are added, or transfer certain records offline in order to save space. A process has been described as an envelope within which a program runs, the system assigning it a number and performing various "bookkeeping" functions. Margolis, Personal Computer Dictionary, 383.

137 Duranti, Reliability and Authenticity, 10.

down into the same conceptual steps. The initiative phase is composed of acts which initiate the procedure. The preliminary phase, or inquiry, consists of the collection of information needed to evaluate the application. In the consultative phase, the information is evaluated and prepared for decision, which is taken in the deliberation phase. The deliberation phase may be followed by a phase of deliberation control, where persons who are not the authors of the decision check the decision for enforceability and compliance with policy and administrative norms. Finally, the decision is put into effect in the phase of execution.138 [SEE Thesaurus - phases of procedure for a complete list].

ODA, DFR, and SGML have no attributes that account for phases of procedure.

Control over the phases of procedure is, however, essential to the reliability of documents. In the electronic environment, it is necessary, at a minimum, to be able to demonstrate control over the initiation, deliberation, and execution phases of procedure.139 Simply identifying the phase of procedure would not, however, be any indication of reliability. Control is a matter of being able to indicate that rules were being obeyed in carrying out the phase of procedure. The presence, for instance, of appropriate signatures, use of correct document types, and dates that reflect moments of action and documentation are all means of indicating that at each phase of procedure, the necessary controls were in effect and that the rules of procedure were being respected. Again, the term "business process" might here be useful as a profile attribute to capture the phases. It may be broken down by various degrees of refinement depending on the formality and complexity of the act and the prescriptive nature of the documents, a question that is determined by the business needs of each organization.140 Rather than become prescriptive and attempt to assign a specific attribute for phases of procedure, it is preferable to let the need for reliability of any given recordskeeper determine the degree to which the

138 Duranti, Diplomatics, Archivaria 31, 15.

139 Duranti and Eastwood, Preservation of the Integrity of Electronic Records, 22.

140 For example, the business processes defined by the World Bank often encompass phases of procedure that are, however, sufficiently complex in themselves to justify a separate cost accounting code. For instance, the pre-appraisal stage in the granting of a loan is the equivalent of the initiation phase of procedure while the appraisal stage is the equivalent of the phase of consultation. Each of these has a distinct set of documents and rules.

concept of business process might be used as an attribute to break down their activities into acts, procedures, and phases.
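Should a recordskeeper decide that phases do need to be tracked, the minimum requirement described above (control over initiation, deliberation, and execution) could be checked along the following lines. This is a sketch under stated assumptions: the enumeration follows the six phases named in this section, but the names `Phase`, `REQUIRED`, and `uncontrolled_phases` are invented, and how "control" is established for a phase is left to the organization.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The six conceptual phases of procedure, in order (names are illustrative)."""
    INITIATIVE = 1
    INQUIRY = 2
    CONSULTATION = 3
    DELIBERATION = 4
    DELIBERATION_CONTROL = 5
    EXECUTION = 6


# The minimum set over which control must be demonstrable, per the text above.
REQUIRED = {Phase.INITIATIVE, Phase.DELIBERATION, Phase.EXECUTION}


def uncontrolled_phases(controlled) -> list:
    """Return the required phases, in order, for which no control is recorded."""
    return sorted(REQUIRED - set(controlled))
```

An empty result means the minimum has been met; it says nothing about the optional phases, which remain a matter of business need.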

PERSONS

"Persons are the central element in any document."141 Archival documents are the result of deliberate acts, and acts need persons to exist and be manifest. Diplomatics identifies four conceptual roles, or persons who participate in a document. The first is the author, who is "competent for the creation of the document, which is issued by him or by his command, or in his name."142 There are really two authors, even if they may be the same person: the author responsible for the act and the author responsible for the document itself.143 The writer is the person responsible "for the tenor and articulation of the writing,"144 the one who actually draws the document up. Again, the writer is frequently the same person as the author, but the conceptual role is different. Every act is addressed to someone, defined as the addressee, who is "the person to whom the document is directed."145 As with the author, there may be two addressees: the addressee of the act, or the person to whom the act is directed, and the addressee of the document, or the person to whom the document is actually directed. Duranti believes that electronic documents should also distinguish between the addressee of the act and those who are merely copied, a group called the receivers.146 The reason for this distinction is that in electronic mail, the addressees may all be lumped together into a header under "to:" and "cc" with the distinction between the two not always clear, whereas in paper documents, the receivers were usually those whose names were appended at the end of the document as part of the secretarial notes (SEE Thesaurus - secretarial notes). A fourth group of persons are those who in some way validate the signature as witnesses or the form of the document as

141 Duranti, Diplomatics, Archivaria 30, 5.

142 Ibid., 5.

143 The author of the act can also be identified from those who are responsible for preparing accompanying documentation or attachment.

144 Duranti, Diplomatics, Archivaria 30, 7.

145 Ibid., 6.

146 Duranti and Eastwood, Preservation of the Integrity of Electronic Records, 20.

countersigners. Of the four groups, the author, writer and addressee are essential for a record to exist.
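The distinction between addressee and receivers in electronic mail can be sketched with the Python standard library `email` package. The mapping used here is an assumption made for illustration: "To" is read as the addressee and "Cc" as the receivers, although, as noted above, the distinction is not always that clear in practice. "From" is reported only as an originator, since a mail header alone cannot say whether that person is the author or the writer. The function name is hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses


def persons_from_header(raw_message: str) -> dict:
    """Split mail header addresses into diplomatic roles (illustrative mapping)."""
    msg = message_from_string(raw_message)

    def addresses(header_name: str) -> list:
        # get_all returns every occurrence of the header; getaddresses
        # parses them into (display name, address) pairs.
        return [addr for _, addr in getaddresses(msg.get_all(header_name, []))]

    return {
        "originator": addresses("From"),  # author or writer: not decidable here
        "addressee": addresses("To"),     # assumed: person the act is directed to
        "receivers": addresses("Cc"),     # assumed: those who are merely copied
    }
```

A profile built this way would still need human or procedural confirmation before the roles could be treated as reliable.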

The concept of persons is flexible, however precisely formulated. "In a diplomatic context, as well as in a legal one, persons are the subject of rights and duties; they are entities recognized by the juridical system as capable of or having the potential to act legally."147 This concept of person is legal in nature: the persons involved in an act are defined in terms of their competencies and responsibilities, and not as individual human beings. They derive their existence from their recognition within the juridical system. They may be single individuals, or a collectivity. The author of the act, for example, may be a corporate body as broadly defined as the state, such as the Government of Canada; the addressee of the act may be a collectivity such as the people, or a congregation; the writer is often an individual, but may also be an organization, such as a department, or a committee, or a work group. Nor need the persons be human. An electronic system or program is quite capable of producing a record provided it is operating as a juridical person, that is, "capable of acting... as having the will that can create, maintain, modify, or extinguish situations."148 For instance, an ATM (Automatic Teller Machine) is capable of producing a valid record of cash transactions because it is designed to interact independently with the client through its artificial intelligence and to produce records that are recognized by banks and the legal system.

Compared to the concepts of the juridical system and the acts, the persons lend themselves to relatively direct translation into a precise set of attributes in the document profile and in those international document exchange standards where they are present in one form or another. They have the added advantage of being at least familiar to document creators and users, even though there is a strong possibility of confusion with bibliographical definitions of persons that are not designed to achieve the same purpose.149 This is a practical consideration

148 Duranti, Preservation of the Integrity of Electronic Records, 16.

149 For instance, the bibliographical definition of author is based on information that must be cited as literally as possible from the publication itself, whereas the diplomatic definition is based on an understanding of the administrative, legal, and historical context of the document and may have to be inferred. This is because a publication is not a record per se - it does not manifest an act that has any other consequences than the writing itself. In so far as the information is not intended to effect a predictable

where the occurrence is unpredictable, as, for instance, in the case of documents authored by individuals, and data entry is dependent on the document creator.

ODA, DFR, and SGML use the broad category "Originators" to capture the concept of persons. All have an attribute called "Authors" which ODA/DFR defines as "the name(s) of the person(s) or organization(s) responsible for the preparation of the intellectual content of the document."150 There is some ambiguity here with ODA Owners, who are responsible for "the content of the document."151 ODA Authors is really the equivalent of the author of the document, whereas ODA Owners is equivalent to the author of the act. ODA Preparers is more clearly equivalent to the writer because they are defined as those who are responsible for the physical preparation of the document. ODA also recognizes a further attribute called Organizations, which is intended to associate the "originating organization" with the contents of the document. This may appear to be designed to accommodate a corporate version of the author of the act, as opposed to an authoring individual, and is an attribute of provenance. But once again there is ambiguity, because Organizations could also be confused with Owners. There is no recognition whatsoever within ODA, SGML, or DFR of addressees or receivers, nor of witnesses and countersigners.
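The correspondences just described can be laid out as a small lookup table. The following is only an illustrative sketch in Python: the attribute names (Authors, Owners, Preparers, Organizations) are those cited above from ODA, but the table itself, the wording of the notes, and the helper function unmapped_persons are assumptions introduced here for illustration and are not part of any of the standards.

```python
from typing import Dict, List, Tuple

# Diplomatic persons discussed in this chapter.
DIPLOMATIC_PERSONS: List[str] = [
    "author of the act",
    "author of the document",
    "writer",
    "addressee",
    "receiver",
    "witness",
    "countersigner",
]

# ODA profile attribute -> (closest diplomatic person, note on the fit).
# This table is an interpretation, not a definition taken from ODA.
ODA_TO_DIPLOMATIC: Dict[str, Tuple[str, str]] = {
    "Authors": ("author of the document", "responsible for intellectual content"),
    "Owners": ("author of the act", "ambiguous: overlaps with Authors"),
    "Preparers": ("writer", "responsible for physical preparation"),
    "Organizations": ("author of the act", "corporate form; may be confused with Owners"),
}


def unmapped_persons() -> List[str]:
    """Return the diplomatic persons that no ODA attribute in the table covers."""
    covered = {person for person, _note in ODA_TO_DIPLOMATIC.values()}
    return [person for person in DIPLOMATIC_PERSONS if person not in covered]


if __name__ == "__main__":
    for attribute, (person, note) in ODA_TO_DIPLOMATIC.items():
        print(f"ODA {attribute}: {person} ({note})")
    print("Not represented:", ", ".join(unmapped_persons()))
```

Laid out this way, the gap noted above becomes visible at a glance: the addressee, receiver, witness, and countersigner have no counterpart in the table.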

INTRINSIC AND EXTRINSIC ELEMENTS

Diplomatics holds that "the form of a document reveals and perpetuates the function it serves."152 This is to say that the context of the juridical system which gives the document

result, the act of authorship or publication is therefore a simple act in that the act of publication is its own fulfillment. Unlike records, publications are self-contained in that they do not depend on a contextual understanding of their creator or on a relationship to a genetic process or to other records for their meaning.

150 Canadian Standards Association: CAN/CSA-Z243.224-90 (ISO 8613-4: 1989). Information Processing - Text and Office Systems - Office Document Architecture (ODA) and Interchange Format - Part 4: Document (Rexdale, Ontario: Canadian Standards Association, 1990): 12.

151 DFR, while recognizing the ODA definition of Owners, also defines Owners as "a security subject who possesses rights to a specific DFR-object." This indicates custody of the object in terms of ownership or rights of possession, which is not the strict equivalent of author of the act. ISO/IEC JTC 1/SC 18 Text and Office Systems Secretariat: USA (ANSI). Revised Text of DIS 10166-1, Information Technology - Text and Office Systems - Document Filing and Retrieval (DFR) - Part 1: Abstract Service Definition and Procedures. (New York: ANSI, 1991): 7.

152 Duranti, Diplomatics, Archivaria 32, 6.

meaning, the act that is the cause of the document, the persons, and the genetic process of procedure are expressed together in the form of the document. An analysis of the form of a document will therefore tell us what function a document serves and whether or not it can be trusted as evidence. This analysis is inductive: diplomatic criticism works upwards, from the document form to the context of creation. It does not depend on the prior accumulation of information about the context of the document, on historical, legal and administrative research, in order to determine the nature of the document. Such research clearly has a role in elucidating the context, but the function and context of a record must be directly evidenced in the form of the document itself.

There are several implications in this approach to document form. First, diplomatic analysis has to begin with the identification of document types, since the ultimate test of meaning is found in the form. Therefore, in the design of an electronic document management system, this is clearly the focus of the document definition phase. Only when all the various types of documents have been identified for any given procedure can their interrelationships and roles in the act be mapped out.

Secondly, the separation of form from content means that records must be identified by their formal constituents and not by the information they convey. While an electronic document management system must certainly be capable of retrieving documents on the basis of their subject, this function is conceptually quite separate from their identification as records.

Thirdly, the separation of form and content is the conceptual equivalent of the separation of the document profile from the document contents (or context from content) in the model of open document interchange.

Documents have both a physical form, or external make-up, composed of what are called the extrinsic elements, and an intellectual form, comprised of the intrinsic elements, which are the document's "internal articulation."153 "From a conceptual point of view, it may be said that intrinsic elements of form are those which make a document complete, and extrinsic elements are those which make it perfect, that is, capable of accomplishing its purpose."154

The recognition that documents have structure is not unique to diplomatics. Hajagos, for example, points out that "in fact documents do have an implicit structure. Letters consists of an addressee, salutation, body, and signature. Books contain chapters, sections, subsections, and so on . . ."155 David Bearman defines documentary form as the "structure internal to the individual record dictating what data will be present for specific types of transactions and facilitate its recognition and use by signaling to readers, by means of typography, data structures, and electronic links, where particular information will be located."156

Diplomatic theory goes one step further in asserting that the form of archival documents in fact follows a predictable pattern that can be captured in a model. The diplomatic model of document form consists of a typical, ideal document comprised of all the elements which documents can be expected to include, "the most regular and complete", independent of provenance or difference in purpose. "Once the elements of this ideal form have been analyzed and their specific function identified, their variations and presence or absence in existing documentary forms will reveal the administrative function of the documents manifesting those forms."157

The great value of the diplomatic model of document form is that it is composed of elements, or rules, that are independent of context, i.e. are decontextualized. It therefore becomes possible to use the elements of intellectual and physical form identified in the model to establish standard definitions that could be used to define document attributes in international standards and in the document profile of electronic document management systems. In effect, this amounts to proposing the equivalent of an idealized document profile.

154 Ibid., 6. Definitions of the various extrinsic and intrinsic elements will be found in the Glossary. The following discussion will focus on their adaptability to international standards and their applicability in the electronic document environment.

155 Hajagos, Documents and SGML, 39.

156 David Bearman, Record Systems as the Locus of Provenance (paper presented to the Ontario Association of Archivists Conference on Archives and Automation, May 13, 1993): 4.

157 Duranti, Diplomatics, Archivaria 32, 6.


The intrinsic elements which determine the intellectual form are "considered to be the integral components of its intellectual articulation: the mode of presentation of the document's content, or the parts determining the tenor of the whole."160 They consist of three groups or "ideal sub-structures" of elements which tend to appear together without any particular juxtaposition to each other: the protocol, the text, and the eschatocol.

THE PROTOCOL

The first of these, the protocol, consists of that part of the document which contains the administrative context of the action, consisting of an indication of the persons involved, time and place of documentation, subject, and any initial formulae. The protocol tends to appear at the beginning of the document and consists of the entitling, date, invocation, superscription, inscription, salutation, subject, the formula perpetuatis and the appreciation.

Dates can be both topical (of place, i.e. "signed in the City of Victoria in the Province of British Columbia") and chronological. The date also captures the moment of action and the moment of documentation. Dates are important because they capture the relationship between the author/writer and the fact or act in question. With traditional paper records, the date is usually added at the outset of compilation, but with electronic messages, the date is either captured by being automatically indicated in the header or is included in the system at the moment of transmission.161

The name of the author of a record is an essential element. It may be captured by several different intrinsic elements. The entitling is the part of the protocol that comprises the name, title, capacity or address of the physical or juridical person issuing the document or of which the author of the document is an agent. In contemporary paper documents, the entitling takes the form of the letterhead and is not as important for the reliability of a record as the signature. But in electronic messaging systems, the name of the person issuing the record is added automatically to the document at the moment of action, or actual transmission. The entitling becomes an electronic address from which the message is sent, and thus, "juridically, the person from whose address the message is sent is its author and writer, unless an

160 Duranti, Diplomatics, Archivaria 32, 6.

161 Ibid., 19.

attestation is attached to the record that would unequivocally demonstrate who its author/writer is, such as an electronic seal."162 The superscription is the mention of the author of the document and/or of the action and often appears as the initial wording of the text (e.g. "I, Samuel Doe . . ."). Nowadays, the superscription is often a part of the entitling. An example of a document where it appears by itself is a contract where the superscription names the first party. In electronic messages, the superscription cannot take the place of the entitling, which indicates both the writer and author and is automatically added by the system.

The inscription is another part of the protocol that comprises the name, title, and address of the addressee of the document and/or of the action and is thus essential to the existence of a record. With paper records, there is usually one addressee for each record with copies sent to others on a distribution list. Electronic messages, however, may be sent to multiple addressees simultaneously and copied at the same time to distribution lists of receivers. The distinction between the two must be carefully maintained because receivers are not, by definition, objects of an act.
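How an electronic message header might be read so as to keep the two groups apart can be sketched briefly. The example below uses the standard Python email library; the sample message, the addresses, and the function name split_recipients are hypothetical. It rests on the assumption that the "To" field holds the addressees and the "Cc" field holds the receivers, which, as noted earlier in this chapter, is not always a safe assumption in practice.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical message used only for illustration.
RAW_MESSAGE = """\
From: Registrar <registrar@example.org>
To: Jane Roe <jane.roe@example.org>
Cc: Records Office <records@example.org>, Samuel Doe <samuel.doe@example.org>
Subject: Notice of decision

The application is approved.
"""


def split_recipients(raw: str) -> dict:
    """Separate addressees (To) from receivers (Cc) in a raw message.

    Assumption: the sender used To for the persons the act is directed to
    and Cc for those who are merely copied.
    """
    message = message_from_string(raw)
    addressees = [addr for _name, addr in getaddresses(message.get_all("To", []))]
    receivers = [addr for _name, addr in getaddresses(message.get_all("Cc", []))]
    return {"addressees": addressees, "receivers": receivers}


if __name__ == "__main__":
    print(split_recipients(RAW_MESSAGE))
```

A document profile that stored only a single undifferentiated list of recipients would lose exactly the information this function keeps separate.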

Other elements of the protocol are not essential to the capture of electronic records although they may continue to appear. The invocation is the mention of the name of God, but it may occur in modern documents where a document claims to be invoked in the name of something, such as the people or the law. The symbol or logotype that makes up a letterhead can be treated as an invocation where it is included for purely symbolic purposes.163 The salutation is a form of greeting characteristic of letters. The formula perpetuatis is a formula typical of medieval and modern documents conferring titles or privileges. It consists of a statement that the rights put into existence by the document are not circumscribed by time, e.g. in perpetuum. The appreciation is a sort of prayer for the realization of the content of the document, e.g. "looking forward to

162 Duranti, Preservation of the Integrity of Electronic Records, 19.

163 The invocation can be viewed as a way of evidencing the abstraction of the juridical system by indicating the moral authority under which the document is drawn up, such as "In the Name of God". Modern documents, especially business documents, have no such abstract elements in them that might be explicitly interpreted as an appeal to a moral authority greater than themselves, although the use of a coat of arms by an elected government might be taken to symbolize the system of beliefs that underlie the rule of a given state.


THE TEXT

The second group of elements, the text, is the central part of the document "where we find the manifestation of the will of the author, the evidence of the act, or the memory of it,"164 and consists of the preamble, the notification, the exposition, the disposition, and the final clauses.

The preamble is that part of the document that expresses the ideal motivation for action, such as a citation of law or regulations or opinions which justify the act. The preamble is often a formula and can be treated as "boilerplate" text. The notification is the publication of the purport of a document, whose purpose is to express that the act consigned to the document is communicated to all who may be affected by it as well as those who are directly concerned. Usually the notification follows the preamble. The notification is typical of letters patent or a proclamation. The exposition is a part of the text where the immediate circumstances of the act are expressed or explained, and the reasons for the act given. The disposition is the expression of the will or judgment of the author, what the author wants done or intends to do. The final clauses are found within or following the disposition. [SEE Thesaurus - final clauses for a complete definition and list.]

THE ESCHATOCOL

The third group, the eschatocol, closes the document and consists of the corroboration, the complimentary clause, the attestation, qualification of signature and secretarial notes.

The corroboration is the mention of the measures taken to make the document reliable and authentic. The complimentary clause is a brief formula expressing respect, such as "sincerely yours". The attestation is the subscription of those persons who took part in the issuing of the document, and is the substance and core of the eschatocol. Attestations include the signatures of the author, the countersigners, and witnesses. As has been indicated above, the role of the attestation has been affected by electronic messages, where it is an alternative to the author and writer named in the entitling. The qualification of signature is the mention of the title and capacity of the signer that accompanies the signatures of attestation, e.g. "Division Chief," "Chairman of the Board". The secretarial notes follow the qualification of signatures and

164 Duranti, Diplomatics, Archivaria 32, 12.

can comprise a number of elements, such as the initials of the typist, distribution lists, and mention of enclosures. Like other intrinsic elements, the secretarial notes may now be treated as objects or attachments, making them individual documents in their own right with their own sets of attributes. For instance, secretarial notes can include distribution lists that in DFR are a separate object attached to the document. Moreover, filing and retrieval of electronic documents is a much more complex business, involving a standard of its own, DFR, and many more attributes than are traditionally associated with classification of paper documents. These elements may be captured by such attributes as DFR References-to-Other-Objects, but their association is by no means self-evident and must be individually mapped.

In terms of the structure of electronic documents, the intrinsic elements belong to the logical structure, i.e. that part of the document whose components are determined by, and determine, meaning, and are separated out from the layout structure, which is determined by, and determines, the physical arrangement of the document.

The intrinsic elements are purely intellectual elements that are entirely independent of physical structure. SGML approaches the division of the intrinsic elements most closely with its conceptual groupings of front, body, and back matter for office documents, and indeed, for most documents. In reality, the SGML groupings have little to do with the diplomatic groupings (SEE Chapter Three, SGML), since they tend to be based on a publication paradigm. DFR is not at all concerned with content, but the generic logical structure of ODA is capable of recognizing the groupings of protocol, text, and eschatocol if so required, although this has to be done by document classes or types. SGML is probably most capable of recognizing the various intrinsic elements because of its ability to code individual textual features, such as a signature, or standard clauses.165
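The idea of coding individual intrinsic elements can be illustrated with a short sketch. The example below is written in Python and uses XML as a stand-in for SGML, since XML can be parsed with the standard library. The element names (protocol, entitling, attestation, and so on) are hypothetical tags chosen to mirror the diplomatic vocabulary of this chapter; they are not drawn from the Text Encoding Initiative or from any published document type definition, and the sample letter is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: tag names follow the diplomatic terms, not any real DTD.
LETTER = """
<record>
  <protocol>
    <entitling>Office of the Registrar</entitling>
    <date>1995-10-02</date>
    <inscription>Jane Roe</inscription>
    <salutation>Dear Ms. Roe,</salutation>
  </protocol>
  <text>
    <exposition>Your application was received on 1 September.</exposition>
    <disposition>The application is approved.</disposition>
  </text>
  <eschatocol>
    <complimentary_clause>Sincerely yours,</complimentary_clause>
    <attestation>Samuel Doe</attestation>
    <qualification_of_signature>Division Chief</qualification_of_signature>
  </eschatocol>
</record>
"""


def elements_by_group(source: str) -> dict:
    """List the tagged intrinsic elements under each of the three sub-structures."""
    root = ET.fromstring(source)
    return {group.tag: [element.tag for element in group] for group in root}


def find_element(source: str, name: str) -> list:
    """Return the content of every element with the given tag name."""
    root = ET.fromstring(source)
    return [element.text for element in root.iter(name)]


if __name__ == "__main__":
    print(elements_by_group(LETTER))
    print(find_element(LETTER, "attestation"))
```

Because the coding is carried within each document, a search for a given element, such as the attestation, has to be run against every document in turn.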

Extrinsic Elements

The extrinsic elements break down into six groups of physical elements: the medium, the script, the language, special signs, seals, and annotations.

165 It must be remembered that in order to retrieve them through SGML, each tag must be searched individually by document because SGML coding is done by document. With ODA the search is faster because the search is by attributes in the profile: whatever profile (which may be common to any number of documents) has that attribute, the corresponding document will be retrieved.


The medium consists of a group of features that physically carry the message and includes such considerations as the material, the format, and the sort of preparation given the material for receiving the message. In the past, this meant an examination of the physical type of writing medium, such as parchment or paper, its size and shape, or format, and the way the surface was prepared for writing with rules or lines. With electronic records, the meaning is ambiguous. Generally, examination of the physical medium means identification of the type of electronic storage object, such as magnetic tape, CD-ROMs, or hard disks. The format becomes a description of the way the physical medium has been prepared or formatted to accept the information, for instance, magnetic or optical. But medium is also conflated with mode of representation, such as graphical formats, which are better treated under script. It can also have other meanings: in the design of networks, medium means the type of cabling used to connect stations.

The script determines the "layout and articulation" of the discourse,166 and includes such elements as layout, pagination and formatting, types of scripts, handwriting, typefaces and inks, paragraphing, punctuation, abbreviations and initialisms, erasures and corrections, and formulae for the composition of the text. Where electronic documents are concerned, the script may also be interpreted to include computer software "because it determines the layout and articulation of the discourse, and can provide information about provenance, procedures, processes, uses, modes of transmission, and last, but not least, authenticity."167 The script can also be extended so far as to include the system documentation, since such information determines how the system operates and the way that it articulates records and gives them form.

Modern documents are increasingly complex as physical and intellectual objects and perhaps nowhere is this more evident than with the script. The script has been subdivided into

166 Duranti, Diplomatics, Archivaria 32, 7.

167 Ibid., 7. Open standards, by their very nature, avoid prescribing how a system chooses to process a document and therefore are unlikely to specify software needed to interpret a document. Thus, ODA does not specify particular word processing applications; it only prescribes that whatever word processing application is used, it must be able to handle ODA encoding. Open document exchange standards confine themselves to description only, leaving it up to the system to decide how the object described should be processed, according to its own rules and capabilities.

elements of content articulation, or the elements of writing and their arrangement, and content configuration, or the mode of expression of the content, which is equivalent to its intellectual representation as a map, or a graphic, or text. Content articulation includes all the elements of document architecture, including the layout, and recognition of various textual or content structures, such as paragraphs. A document architecture is defined by ODA as "rules for defining the structure of documents..."168

The script of electronic documents might also be divided into two groups which relate to each other as layers. The first layer is composed of human-readable elements: the image itself with its layout and logical elements such as paragraphing, headers, and elements of presentation, such as typefaces, use of bolding, etc. How these are actually encoded by the application is the second layer of script elements, which permits the application to articulate the document for human comprehension. It may be accessible to the user but would appear as screens of encoding that would not be readily understandable to any but a trained eye. A third layer might be the system information required by the application itself, such as how it configures the terminal, reads the operating system, and relates to other applications. This layer might be inaccessible altogether to the user.

In so far as document exchange standards are concerned with preserving the integrity of records, the elements of content articulation and representation must be of vital concern. It is less easy to argue that standards such as ODA should be concerned with how applications interpret the requirements, because to do so would require the standard to specify applications and other particular requirements which would limit their universality. ODA does, however, have the ability to accommodate the needs of particular documents through the use of document application profiles (DAPs), which are designed to permit a communicating system to determine if it can handle a particular type of document. (SEE Chapter Three - ODA).

An element of the script that is not dealt with by traditional diplomatics is the concept of links. A link may be defined as "a pointer to another record."169 A link may be

168 ISO, ODA, Part 1, 5

169 Margolis, Personal Computer Dictionary, 270. Links should not be confused with the concept of inserts. An insert is a form of copy that is inserted into the body of another document. The pasting of other documents into the body of electronic documents is very common, but this is not a link in the sense

embedded in a document template that is used to pull in data from a database into the document. Links may also be used to bring several documents together into a single virtual document.170 The idea of links should be assigned to the script, and more specifically, to the area of content articulation, because they are a physical means of joining the parts of a document together rather than an intellectual element in their own right, such as the entitling or the text. For example, an e-mail may consist entirely of pointers from a system account to a central mail box or records store. The presence of links may be essential to the completeness of a record, even if the links are invisible to the user. DFR, ODA and SGML all contain attributes designed to capture the idea of links: DFR User-References-to-Other-Objects and DFR User-Reference, as well as ODA External References, could all be used to capture links.
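The dependence of a virtual document on its links can be shown with a minimal sketch in Python. The record store, the identifiers, and the function assemble_virtual_document are all hypothetical; the sketch does not reproduce how DFR or ODA resolve references, and only illustrates why an unresolved pointer leaves a record incomplete.

```python
from typing import Dict, List

# Hypothetical records store keyed by identifier.
STORE: Dict[str, str] = {
    "doc-001": "Covering memorandum.",
    "doc-002": "Attached report.",
}


def assemble_virtual_document(links: List[str], store: Dict[str, str]) -> dict:
    """Resolve each pointer in order and report any that cannot be followed."""
    parts: List[str] = []
    missing: List[str] = []
    for reference in links:
        if reference in store:
            parts.append(store[reference])
        else:
            missing.append(reference)
    # The virtual document is complete only if every link resolved.
    return {"parts": parts, "missing": missing, "complete": not missing}


if __name__ == "__main__":
    print(assemble_virtual_document(["doc-001", "doc-002"], STORE))
    print(assemble_virtual_document(["doc-001", "doc-009"], STORE))
```

In the second call the pointer to doc-009 cannot be followed, so the assembled document is flagged as incomplete even though nothing visible to the user need have changed.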

The element of language is a third basic extrinsic element that goes beyond simply the tongue of particular communities to include the vocabulary, phraseology, and styles of different social groups. In electronic documents, language must also deal with machine-readable languages, since some records (e.g. an EDI purchase) are substantively machine transactions. ODA and SGML are designed for human-readable documents; there is no assumption that they are intended to exchange documents that would be machine-readable only, even though, to be exchanged at all, a document must be completely machine-readable.

The special signs are a group of extrinsic elements that reveal the various persons participating in the document, such as symbols, personal marks, stamps and phrases. Like the secretarial notes and signatures, electronic special signs may exist as objects in their own right that become attachments. The seals are a little used group of extrinsic elements today, although the concept of encrypted seals can be applied to electronic documents and, if so, would again be

of a pointer, or way of viewing another document without actually incorporating it into the body of a document. SEE Thesaurus - copy for a definition of insert.

170 A virtual record is defined as "pointers needed to create documents." Duranti & Eastwood, Preservation of the Integrity of Electronic Records, 15. A definition more characteristic of data processing as opposed to archives, keeping in mind the more limited meaning of record, is "the characteristics of an entity as perceived by the user, regardless of how they have been physically represented in a database. Thus an employee would have one virtual record, but may have numerous physical records linked together to accommodate repeating addresses, jobs held, benefits received, etc." ACCIS, Management of electronic records, 186.

an attachment. Like the special signs, seals may have significance for security and access to documents: for instance, an electronic access code is a special sign, while a seal can be a way of securing a document against tampering or further knowledge. They have a role in determining the form of transmission171 of a document and thus in ensuring its authenticity.

The last group of extrinsic elements are the annotations. They are an important constituent of reliability because they "represent the conjunction between elements of intellectual form and of procedure. Thus, they are a bridge between the completeness aspect of a record and the procedural control on its creation."172 Annotations fall into three groups.

Annotations of execution may be included in a document after its compilation as part of putting it into effect, or its execution. These include annotations of authentication that are the express, legal recognition that a record or the signatures on it are what they claim to be. Annotations of execution may also include annotations of registration, which are a reference to a transcription of a record made in a register by an office different from the one creating the record. Both of these types of annotations of execution are peculiar to particular record forms. For instance, a copy of a birth certificate obtained from a government registry of births will indicate that it is an authentic copy.

Another family of annotations are those added in the course of carrying out subsequent steps in a transaction, such as question marks, initials, dates of hearings or readings, or queries. They are made on received documents by the offices that carry out the related transaction, so they are made on finished documents. Annotations of handling include those of instruction, which are the mention of previous handling, directions for transmission, classification etc., dates of hearings or readings (for formal documents of this type), and signs made by commentators or readers of the text indicating opinions or comments. Annotations of management include registry numbers, classification codes, cross-references, dates of receipt, and names of recipients or of the office receiving the document.

171 Form of transmission is the form that a record has when it is made or received. Duranti, Preservation of Integrity of Electronic Records, 12.


The elements of documentary form come together in the concept of the status of transmission, which is concerned with the ability of the document to give effect to the act it embodies. The status of transmission defines the fundamental concepts of the original, the draft, and the copy. The original is defined as the form of the document that is perfect, in that it has all the elements necessary to give it the completeness of form required by the juridical system in which it was created, especially those of content articulation and annotations,174 is able to give effect to the act it embodies, and is the first to be issued.175

A draft is a sketch or outline of a definitive text prepared for purposes of correction. By definition it lacks completeness and effectiveness. Anything other than an original is therefore a draft.

A copy reproduces to a greater or lesser degree, depending on the type of copy, the form of the original but lacks primitiveness, and is unable to give effect to the act.

Diplomatics distinguishes between several different types of copy. An imitative copy can reproduce the original or the draft in almost every respect except in primitiveness.176 In the paper paradigm, copies made on copiers are imitative and can be virtually indistinguishable from the original. On the other hand, a simple copy is a transcription of the contents of a document without respect to form. Notes transcribed from a report, for example, are a simple copy.177 A copy-in-the-form-of-an-original is a copy that is created when two originals of the same document, addressed to the same person and having the same date, are sent to the addressee in two subsequent deliveries, with the first considered the original, and the second, the copy.178 Another type of copy is the authentic copy, where officials who are authorized to execute such a function validate a copy so that it is capable of being used as evidence. Birth certificates are an example of a document which is often

174 Duranti, Preservation of the Integrity of Electronic Records, 5.

175 Duranti, Diplomatics, Archivaria 28, 19.

176 Ibid., 20.

177 Ibid., 21.

178 Ibid., 19.

issued as an authentic or "certified" copy.179 Another form of authentic copy is the vidimus, which is an insert in another document whose purpose is to guarantee the conformity of the copy to the original.180 (SEE Thesaurus - Copy for a complete list of all the types of copies.)

The concept of the status of transmission applies no less to electronic documents than to paper records. The principles of perfection and primitiveness both apply, but with fresh complications. Duranti has maintained that all electronic documents are created as drafts and received by the user as originals "in consideration of the fact that the records received contain elements automatically added by the system which are not included in the document sent, and which make them complete and effective."181 An example of this is an e-mail where the text is created at the workstation of the author but the electronic mail application does not add the date and time of the moment of action until the actual moment of dispatch. In order to recognize the draft, the profile must be able to recognize versions, which means the ability to capture the date of creation, name of the author, persons involved in commenting on the draft, queries, and number of the version.182 Records received from outside the system become originals once they are physically attached to it by means of a profile which must be complete enough to capture all the original elements that make them complete, reliable and authentic.

179 Ibid., 21.

180 Ibid., 21.

181 Duranti, Diplomatics, Archivaria 33, 10.

182 Duranti & Eastwood, Preservation of the Integrity of Electronic Records, 26.

ODA, DFR, and SGML all contain a profile attribute, Status, designed to accommodate the states of draft and original. But since the concept of original requires the presence of other elements to bring it to completeness, merely indicating that something is an original is not sufficient. Completeness, for instance, is conveyed by such attributes as ODA Title, Start Date and Time, Subject, Authors, and Owners, and not merely by the attribute Status. SGML, ODA and DFR all contain an attribute for Revision History to capture how the document has been changed, while DFR attempts to track various versions through the attributes Version-Root, Next-Version, and Previous-Version. The concept of version control is, in fact, control over the drafting of a document.
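The idea that version attributes amount to control over drafting can be illustrated with a minimal Python sketch. It assumes a simplified in-memory profile: the class `Profile`, its field names, and the helpers `add_version` and `version_chain` are hypothetical and only loosely modeled on the DFR attributes Version-Root, Next-Version, and Previous-Version and on the Status attribute. It is not an implementation of ISO 10166.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Profile:
    """Hypothetical, simplified document profile (not the ISO 10166 data model)."""
    doc_id: str
    author: str
    status: str = "draft"                    # "draft" or "original"
    version_root: Optional[str] = None       # loosely after DFR Version-Root
    previous_version: Optional[str] = None   # loosely after DFR Previous-Version
    next_version: Optional[str] = None       # loosely after DFR Next-Version


def add_version(store: Dict[str, Profile], previous_id: str, new_id: str, author: str) -> Profile:
    """Register new_id as the draft that follows previous_id."""
    previous = store[previous_id]
    if previous.next_version is not None:
        raise ValueError(f"{previous_id} already has a successor")
    root = previous.version_root or previous.doc_id
    new = Profile(new_id, author, version_root=root, previous_version=previous_id)
    previous.next_version = new_id
    store[new_id] = new
    return new


def version_chain(store: Dict[str, Profile], doc_id: str) -> List[str]:
    """Return the drafting history from the root version to the latest one."""
    current = store[store[doc_id].version_root or doc_id]
    chain = [current.doc_id]
    while current.next_version is not None:
        current = store[current.next_version]
        chain.append(current.doc_id)
    return chain


store = {"memo-v1": Profile("memo-v1", "A. Writer")}
add_version(store, "memo-v1", "memo-v2", "A. Writer")
final = add_version(store, "memo-v2", "memo-v3", "B. Reviewer")
final.status = "original"   # assumption: the version actually dispatched
history = version_chain(store, "memo-v2")
```

Starting from any member of the chain, the root and the successive drafts can be recovered, which is the sense in which the version attributes document the drafting process.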

Open document exchange standards are designed to exchange documents, or parts of documents, between heterogeneous networks so they can be processed, re-processed or read. This means that they must be capable of actually capturing the characteristics of a document or the document part as opposed to simply adding additional information. For instance, an electronic messaging system merely takes a message and adds address and routing information. Open document standards are designed to encode the structure of the document and, in some cases, add contextual and management attributes. SGML is used to exchange all or part of a document by encoding its textual features. ODA is capable of encoding both the logical and layout structure and adds additional information in the form of attributes for security, access, author and originator, and dates of action and documentation. But to judge the ability of open document standards to capture the original against a paper paradigm would be misleading because of the definition of document. Both ODA and SGML define a document as consisting of a profile or document type definition (DTD) and an instance of content, with the DTD and the profile both capable of being exchanged in their own right as documents. The attributes or tags are therefore actually part of an effective document and must be considered part of the original because without them, the document cannot be exchanged and, therefore, comprehended. This can be seen from the way both ODA and DFR deal with the concept of draft and original as versions and editions. A DFR version is "a DFR-document specified by the user as a derivation of one or more other DFR-documents by means of specific DFR-attributes."183 Printing the document out, of course, merely produces an imitative copy without the profile or DTD information.

If attributes or DTD tags are to be considered part of the actual document, then attribute management becomes an issue in determining the authenticity of a document. How and when attributes are created, and by whom and when they are modified, will have to be captured in the profile. DFR, in fact, recognizes this function with such attributes as Attribute-create-date-and-time and Attribute-modified-date-and-time.

183 ISO, DFR: Part 1: Abstract Service Definition and Procedures, 20.

Mode of Transmission

The mode of transmission is "the method by which a record is communicated over space or time."184

Electronically, the mode of transmission therefore refers to the way a record is sent as data. The authenticity of a record is dependent on keeping track of the transmission process. With paper records, this involves capturing indications of how the document was handled by courier, messenger, and postal services, its distribution, and procedures for filing of incoming and outgoing documents by means of date stamps, registration marks, and the rigorous application of administrative procedures for the receipt and sending of documents. The critical factor is security. The recordskeeping system must be able to guarantee that a record was sent or received as intended. In electronic systems, security is a question of articulating the circumstances under which a document may be sent or received, and of providing an audit trail.185

Form of Transmission

The form of transmission refers to the form the record has when it is made or received. It is again an essential constituent of authenticity, because it should be possible to guarantee that a record was received in the same form in which it was sent. Form, in this case, refers to the presence of the persons, and certain extrinsic and intrinsic elements, all of which must be captured in the document profile or identified within the document itself. The intrinsic elements consist of the entitling, which captures the name of the author of the document, the inscription, which captures the name of the addressee of the document, the subject, and the date and time of transmission. In addition, there must be various annotations comprising a classification code, and, where applicable, a registry number. In paper documents, it was possible to include other devices such as watermarks, seals, and special signs peculiar to the author such as a personal monogram. The use of an encrypted seal or the inclusion of a graphic whose meaning is shared only by the author and the addressee are two ways that such elements could be included in electronic documents.186

184 Ibid., 12.

185 Ibid., 25.

OTHER ASPECTS

The elements of a document traditionally recognized by diplomatics are able to accommodate almost all the needs of electronic documents if they are to be recognized as records. But there are some aspects where diplomatics needs to recognize some additional elements that are peculiar to electronic documents.

Security and Access

The first of these are the concepts of security and access as they arise in electronic systems. Security has been defined as a technique for ensuring that data stored in a computer cannot be read or compromised.187 This is an essential element of authenticity where the manner of preservation and custody of documents are concerned,188 and such elements of physical form as seals and special signs, as well as control over the creation and execution of the document and the recognition of certain document forms, such as letters closed, are elements long recognized by diplomatics as implying some measure of security. There is a great difference, however, in the fact that paper records, because their form is physically bounded by their medium, can be secured from all interference simply by locking them away or restricting their circulation. While electronic records can certainly be secured by encryption or restricting access, they are more open to interference because their medium is volatile and dynamic, and because of the possibility of remote access on a network. Security must be specifically built in: the author of a document must have permission to access the document, and must be able to prove to the system that they have such a privilege.

Security is a question of securing both the system and the document. Even though system security is an important element of reliability, security of the system cannot be a specific concern of open document exchange standards; security of the document through access rights, however, is a vital concern.189 ODA, for example, defines document architecture on the basis of whether a document can be read or modified.

186 Ibid., 25-26.

187 Margolis, Personal Computer Dictionary, 426.

188 Duranti & Eastwood, Preservation of the Integrity of Electronic Records, 25.

Access rights are usually assigned on the basis of a privilege to read, modify, delete, create, move, or destroy documents. Access rights therefore also become a way of defining the persons participating in a document. The writer, for instance, as the person who gives the document intellectual articulation, is to some extent defined by the privilege of creating or modifying a document; a countersigner may have access only on the basis of their ability to modify the document with their approval.

Access usually takes the form of two attributes: an access list, which includes the name of the person granting access, and an authentication, or proof that the person seeking access is who they claim to be, in the form of a password or an encrypted name. ODA, DFR, and SGML all provide for access by means of such attributes as ODA Access Rights and Authorization, and DFR Access-List. SGML has an attribute for security classification of the document itself, called Sensitivity.

189 Access is defined in traditional archival terms as "the availability of records/archives for consultation as a result of both legal authorization and the existence of finding aids." ACCIS, Management of electronic records, 136. In data processing terms, it is defined as "a privilege to use computer information in some manner." Margolis, Personal Computer Dictionary, 4.

Access is an aspect of document management that is needed throughout the life of a document, from creation to use and disposition. In the sense that it is not necessarily assigned after the creation of a document, access control does not fit the strict definition of an annotation of management or handling. In fact, access control should probably be considered as an extrinsic element in its own right. The reason is that access control is the ability of a document to be physically opened for inspection. In paper records, this would be accomplished by first granting access to the container in which the document was held, and then opening out the document to the light so that the process of comprehension could begin. Access control is therefore a system surrogate for an element which is integral to the physical format of the document and is not simply an intellectual privilege. Nonetheless, in terms of electronic records management, access control takes the form of a system privilege. It is proposed to define access control as follows:

The authorization to enter an electronic data store with the specific privilege to either view or process data in some way, or to administer or modify the system itself. The authorization usually takes the form of a list of those who have been granted access together with their specific privileges. The access list may be a part of an application, or attached to an individual object, such as a document, or object class.190
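The proposed definition, a list of persons together with their specific privileges, checked only after the person has proved who they are, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the class `AccessList` and its methods are hypothetical, authentication is reduced to a boolean flag, and nothing here reproduces the ODA Access Rights or DFR Access-List attributes themselves.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Privileges named in the text: read, modify, delete, create, move, destroy.
PRIVILEGES = {"read", "modify", "delete", "create", "move", "destroy"}


@dataclass
class AccessList:
    """Hypothetical access list attached to a single document."""
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, person: str, *privileges: str) -> None:
        """Record the privileges granted to one person."""
        unknown = set(privileges) - PRIVILEGES
        if unknown:
            raise ValueError(f"unknown privileges: {sorted(unknown)}")
        self.grants.setdefault(person, set()).update(privileges)

    def permits(self, person: str, privilege: str, authenticated: bool) -> bool:
        """Access requires both proof of identity and a listed privilege."""
        return authenticated and privilege in self.grants.get(person, set())


acl = AccessList()
acl.grant("writer", "create", "read", "modify")
acl.grant("countersigner", "read", "modify")
acl.grant("clerk", "read")
```

In this sketch the privileges held by each name also characterize the role that person plays in the document, which mirrors the observation that access rights help define the persons participating in it.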

Archival Bond

A third element that needs to be recognized is the relationship that exists between documents, which is called the archival bond.191 As Duranti points out, diplomatics deals with the archival document itself, whereas archives are concerned with aggregations. But whereas paper records have tended to aggregate themselves because of the physical nature of the paper medium, electronic records must have their links established between them.

Duranti points out that the act of classification is essential to making a record out of a document, because without it, the records do not acquire the essential quality of interrelationship.192

ODA, SGML and DFR all contain attributes that are designed to capture the archival bond. DFR deals with documents as groups, identifying the relationship between members, as well as permitting the capture of relationships to documents outside the group through such attributes as User-references-to-Other-Objects. ODA also contains an attribute for Reference to Other Documents, as well as document class descriptions that permit the establishment of relationships between documents of a given type. These attributes can also be used to identify virtual series which will, however, be far more flexible than the traditional paper series.193

190 An alternative view is to regard access as an aspect of persons for the reason that access rights are a privilege and privileges may only be granted to juridical persons. In an electronic document management system, it is not possible to define persons without rights of access which therefore become synonymous with their competence, or ability to act in some fashion. In that case, the access list proposed here might be viewed as a mechanism against which the access rights might be validated.

191 The archival bond has been defined as "The relationship that, because of the circumstances of their creation, records have with their creator, with the activity in which they participate, and among themselves. The archival bond is originary (it comes into existence when the record is made or received), necessary (it exists for every record), and determined (it is characterised by the purpose of the record)." Duranti & Eastwood, Preservation of the Integrity of Electronic Records, 4.

192 This insight into the fundamental importance of classification was obtained from a seminar given by Luciana Duranti at the World Bank, Washington DC, August 28, 1995.

Document Management Domains

A final aspect of electronic document exchange that diplomatics does not identify specifically is the concept of document management domains. These are defined as "space defined by the boundaries of an electronic document management system within which records are created, modified, used, and destroyed. The space may be divided into several areas depending on the status of transmission and the access rights."194 Document management spaces are divided up into general (or institutional) space, group space, and individual space. The general space is "that part of the system that is accessible to all members of the organization, managed according to established record making and record keeping rules by the competent staff, and that contains the central filing system of the organization, including the linkages with related records in other media. The primary characteristic of the general space is that no record that has crossed its boundaries can thereafter be manipulated."195 The group space is shared by all those who share the same competence and contains many draft versions of records. The individual space is accessible to individual members.

The division of space on the basis of access control and status of transmission means that document management domains must be captured by several attributes. All those used by ODA, DFR and SGML to capture access and the status of transmission may therefore be used for this purpose. A specific annotation of management called Domain should probably be added to the profile that would consist of a code for each part of the system, on the basis of which access control and the status of transmission could be automatically assigned.
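How a single Domain code might drive the automatic assignment of status and access can be sketched in Python. Everything named here is an assumption made for illustration: the codes `INDIVIDUAL`, `GROUP`, and `GENERAL`, the privileges attached to each space, the treatment of records in the general space as originals, and the profile keys `Domain`, `Status`, and `Access-Rights` are hypothetical and are not drawn from ODA, DFR, or SGML.

```python
from typing import Dict, FrozenSet, NamedTuple


class DomainDefaults(NamedTuple):
    status: str                 # status of transmission assumed for the space
    privileges: FrozenSet[str]  # what users of the space may do with a record


# Hypothetical codes for the three spaces described in the text.
DOMAINS: Dict[str, DomainDefaults] = {
    "INDIVIDUAL": DomainDefaults("draft", frozenset({"read", "modify", "delete"})),
    "GROUP": DomainDefaults("draft", frozenset({"read", "modify"})),
    # No record that has crossed into the general space can be manipulated.
    "GENERAL": DomainDefaults("original", frozenset({"read"})),
}


def apply_domain(profile: dict, domain_code: str) -> dict:
    """Return a copy of the profile with status and access derived from Domain."""
    defaults = DOMAINS[domain_code]
    updated = dict(profile)
    updated["Domain"] = domain_code
    updated["Status"] = defaults.status
    updated["Access-Rights"] = sorted(defaults.privileges)
    return updated


draft = apply_domain({"Title": "Budget memo"}, "GROUP")
filed = apply_domain(draft, "GENERAL")
```

Moving the record from the group space to the general space changes only the Domain code; the status and the access rights follow from it, which is the behaviour the proposed annotation is meant to achieve.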

193 The archival bond should not be confused with the concept of links which are physical links to the parts of a record.

194 Duranti & Eastwood, Preservation of the Integrity of Electronic Records, 23.

195 Ibid., 23.


Conclusion

As we have seen, apart from the juridical system, which is by definition an abstraction, almost all of the main elements of diplomatic analysis of the record - the facts, procedures, persons, and intrinsic and extrinsic elements, together with such key concepts as authenticity and reliability - can be captured in open document exchange standards.

Some new elements such as links, security, access control, domain, and the archival bond should be explicitly recognized by diplomatics and added to the document profile in order to ensure that the functionality of the record within the electronic recordskeeping environment is fully realized. While this is conceptually possible, it is necessary to take a closer look at ODA, SGML, and DFR to see just how they function and what their specific limitations and advantages might be.

CHAPTER THREE

INTERNATIONAL OPEN DOCUMENT EXCHANGE STANDARDS

ISO 8879 Standard Generalized Markup Language (SGML)

SGML was adopted by the ISO in 1986. It is "an international standard for the description of marked-up electronic text. More exactly, SGML is a metalanguage, that is, a means of formally describing a language, in this case, mark-up language."196 Markup has traditionally been associated with publishing where manuscripts had to be marked up for typesetting with indications of typeface, font size, use of bold face, indentations, paragraphing and other formatting requirements. In this traditional sense, markup was the equivalent of encoding a text in one form so that it could be translated into the typeset form. As the publication process became more automated and integrated, electronic markup languages were extended to include printing and editing.197 As a form of encoding, markup has now come to mean "any means of making explicit an interpretation of a text."198

A markup language must specify "what markup is permitted, what markup is required, how markup is to be distinguished from text, and what the markup means."199 As a markup language, SGML is a "method of modeling document contents and identifying structural and content elements."200 It was designed for the publishing environment201 in that it permits authors to mark up their documents through the use of a generalized language that utilizes a syntax of human-readable (or character) codes.202

196 C.M. Sperberg-McQueen and Lou Burnard, eds., Guidelines for the Encoding and Interchange of Machine-Readable Texts, Draft: Version 1.1. (Chicago and Oxford: Association for Computers and the Humanities (ACH), Association for Computational Linguistics (ACL), Association for Literary and Linguistic Computing (ALLC) - Text Encoding Initiative, 1990): 9.

197 The Chicago Manual of Style for Electronic Manuscripts is an example.

198 Sperberg-McQueen & Burnard, Guidelines, 9. The ISO standard defines markup as "text that is added to the data of a document in order to convey information about it." ISO, International Standard ISO 8879 Information Processing: Text and Office Systems: Standard Generalized Markup Language (SGML) (Geneva: International Organization for Standardization, 1986): 14.

199 Ibid., 10.

200 Hajagos, Documents and SGML, 38.

SGML has several advantages over earlier markup languages in that the coding is human-readable and can therefore be embedded by the author, and the markup syntax is rigorously defined so that it can be processed like a program by a compiler. Furthermore, the markup syntax is generalized so that the tokens are not related to any particular publishing context, and has a meta-language that can be used to "anticipate new mark-up constructs".203 Perhaps most important of all from the point of view of standardization, SGML is intended to be "future-proof," or independent of hardware, software, and all applications.204

SGML uses several strategies to become future-proof. The metalanguage of descriptive markup means that documents can be processed by many different types of applications. A second data processing strategy that makes SGML data independent is its use of a general purpose mechanism called string substitution that lets a communicating system know a particular string should be replaced by another.205 Of greatest interest to archivists, however, is a third strategy, document types. A document type defines a document by its constituent parts and associated structure and consists of a "class of documents having similar characteristics."206 For instance, a report might consist of a title and an author followed by an abstract. This definition of a report may serve for all reports, and anything which does not have these basic constituents would not be a report. The use of document types enables documents of the same type to be processed in a more uniform way without defining them to each system over and over again.207

201 Understood broadly in the sense of a document-producing environment where the author does not require strict control over the appearance of the document.

202 Authors in this sense means anyone who encodes a text, which could be a scholar encoding an historical manuscript, a writer preparing a report, or an editor.

203 ACCIS, Strategic Issues, 30.

204 Lou Burnard, The Text Encoding Initiative: Towards an Extensible Standard for Encoding of Texts, in Electronic Information Resources and Historians: European Perspectives, Seamus Ross and Edward Higgs, eds. (St. Katharinen: Max-Planck-Institut für Geschichte in Kommission bei Scripta Mercaturae Verlag, 1993): 106.

205 Sperberg-McQueen & Burnard, Guidelines, 11.

206 ISO, SGML, 10.

The purpose of all this is to facilitate the transmission of documents between dissimilar systems and to maximize their informational value by permitting the manipulation of their contents in an automated environment. Since it is mainly a descriptive language, however, SGML cannot be used to process documents even though it can contain processing markup. This is because it "does not contain semantic definitions for controlling further processing steps on documents."208 For instance, SGML itself cannot be used to edit a document. Applications that are SGML conformant, however, can be used to perform operations on an SGML-encoded text such as editing, linking or displaying texts in hypertext systems, formatting and printing, downloading to databases, content analysis, or collation, among many potential uses.209

SGML is based on two main principles: descriptive markup predominates and is kept strictly separate from processing instructions; and markup is formally defined for each document. Descriptive markup simply uses codes to "provide names to categorize different parts of a document"210 such as for paragraph or . By contrast, procedural markup specifies what operations are to be carried out on a document such as "insert em dash one quad right, then skip one line, then indent right margin etc."211
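The separation of descriptive markup from processing can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the element names `title`, `para`, and `note` are hypothetical and are not taken from any tag set, the document is held as a list of named parts rather than as SGML syntax, and the two rule tables stand in for two independent applications.

```python
from typing import Callable, Dict, List, Tuple

# Descriptive markup: each part of the text is only *named*, not formatted.
# The element names are hypothetical and not taken from any tag set.
document: List[Tuple[str, str]] = [
    ("title", "Annual Report"),
    ("para", "Revenue rose during the year."),
    ("note", "Figures are unaudited."),
]

# Processing rules live outside the document, one set per application.
print_rules: Dict[str, Callable[[str], str]] = {
    "title": lambda text: text.upper(),
    "para": lambda text: "    " + text,
    "note": lambda text: "[" + text + "]",
}
index_rules: Dict[str, Callable[[str], str]] = {
    "title": lambda text: "T:" + text,
}


def process(doc: List[Tuple[str, str]], rules: Dict[str, Callable[[str], str]]) -> List[str]:
    """Apply one application's rules; elements it has no rule for are skipped."""
    return [rules[name](text) for name, text in doc if name in rules]


printed = process(document, print_rules)
indexed = process(document, index_rules)
```

The same named parts serve a formatter and an indexer without any change to the document, whereas procedural instructions of the "skip one line, then indent" kind would bind the text to a single way of presenting it.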

207 Sperberg-McQueen & Burnard, Guidelines, 11.

208 Bormann & Bormann, Standards, 151. At the time this was written, a separate standard that would permit the specification of style information for controlling processing steps was underway (Document Style Semantics and Specification Language, or DSSSL).

209 Sperberg-McQueen & Burnard, Guidelines, 3.

210 Ibid., 10.

211 Descriptive markup is defined as "Markup that describes the structure and other attributes of a document in a non-system specific manner, independently of any processing that may be performed on it. In particular, it uses tags to express the element structure." SGML actually recognizes four kinds of markup. Apart from descriptive markup (tags), it recognizes references, markup declarations, and processing instructions. ISO, SGML, 2, 14.

In essence, SGML "enables data to move between media by describing documents by their structural elements or textual features rather than their visual format."212 Thus, in accordance with the basic processing model, SGML would permit several paragraphs of a report, or a glossary in a source document, to be imported as a result document without having to copy the whole document or even a page with its original layout or logical organization. Using SGML, it is possible to mark up the various textual elements of the content of a document, so that these can be exchanged and re-processed without having to worry about the physical layout. It is important to note that elements are intellectually determined or logical in nature, and are not physical or layout divisions such as page breaks imposed by the necessities of presentation (such as limitations on page size).

Textual features divide up into structural and non-structural elements.213 Structural elements include such components as the parts of a book, such as the front matter (consisting of the table of contents, copyright page etc.) or the body (consisting of chapters and sections). Non-structural elements might include individual words, paragraphs, passages of text, names and dates, typographically highlighted phrases, basic editorial changes, pre-existing annotations, bibliographic citations, lists (such as glossaries and indexes) and hypertextual features such as simple links and cross-references.214

A fundamental point to grasp is that although elements are named, SGML provides no way of knowing the meaning of a particular element: its only concern is to define the relationship between one element and other element types. For example, the elements Title and Author may be associated with the element document type, but what Title and Author may specify has to be defined by the encoder.215 Finally, elements take the form of tags or names.

Elements are only one type of descriptive markup component. SGML also defines entities and attributes. An entity is "a named part of a marked up document, irrespective of any structural considerations".216 Entities are essentially free-floating units, or collections of characters, that are not related to the structure of the text and are best managed as a single unit. A typical entity is a photograph, or a book chapter, or a file. Attribute has a more restricted meaning within SGML. An SGML attribute is used to describe "information which is in some sense descriptive of a specific element but is not regarded in itself as an element."217

212 Hajagos, Documents and SGML, 38.

213 Guidelines, 71.

214 Burnard, Text Encoding Initiative, 111.

215 Sperberg-McQueen & Burnard, Guidelines, 12.

An SGML document consists of a prologue and a document instance. The prologue is divided into two parts: the SGML declaration, and the document type definition (DTD). The declaration "establishes the environment in which the document operates" by permitting the user to specify "basic facts about the dialect of SGML being used"218 that the system applications will need to read the SGML file. These include such parameters as the character sets (e.g. ASCII, EBCDIC) used to encode data, types of SGML delimiters, length of tag names, use of symbols etc.219 The SGML declaration is invisible to the reader.

216 Ibid., 27.

217 Ibid., 24.

218 Ibid., 29.

219 Hajagos, Documents and SGML, 39.

The Document Type Definition (DTD) is at the heart of the way SGML describes texts. The DTD consists of a standard header or type of profile that defines all the textual components used to describe a text in the form of tags, attributes associated with particular tags, and entities and how they are to be interpreted for a particular document type. In order to process an SGML text, an application validates the tags against the DTD. In this way, an SGML document is self-defining and can be processed by any application that complies with the SGML standard.220 The DTD also defines the relationship between tags by permitting the establishment of hierarchical structures and permitting tags to exist or be "nested" at different levels of the hierarchy. In fact, an SGML DTD is a hierarchy of tags with most tags subordinated to other tags. Tags may also be aggregated into small objects with an internal structure called crystals. For example, an address, containing tags for the street address, postal code, and city etc. is a commonly used crystal.221
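Validation of nested tags against a document type can be sketched in Python. The sketch is a toy stand-in for a DTD and rests on several assumptions: the table `CONTENT_MODEL`, the tag names, and the function `validate` are hypothetical, each tag simply lists the child tags it must contain in order, and real SGML content models are considerably richer (optional and repeatable elements, attributes, entities).

```python
from typing import Any, Dict, List, Tuple

# A node is (tag, content); content is either text or a list of child nodes.
Node = Tuple[str, Any]

# Toy stand-in for a DTD: each tag lists the child tags it must contain, in order.
# Tags missing from the table may hold text only.
CONTENT_MODEL: Dict[str, List[str]] = {
    "report": ["title", "author", "address", "abstract"],
    "address": ["street", "city", "postcode"],   # a "crystal" in the text's terms
}


def validate(node: Node) -> List[str]:
    """Return a list of error messages; an empty list means the node conforms."""
    tag, content = node
    expected = CONTENT_MODEL.get(tag)
    if expected is None:
        return [] if isinstance(content, str) else [f"<{tag}> may contain text only"]
    if isinstance(content, str):
        return [f"<{tag}> must contain {expected}"]
    found = [child[0] for child in content]
    errors = [] if found == expected else [f"<{tag}> expected {expected}, found {found}"]
    for child in content:
        errors.extend(validate(child))
    return errors


good: Node = ("report", [
    ("title", "Status of Transmission"),
    ("author", "A. Writer"),
    ("address", [("street", "1 Main St"), ("city", "Vancouver"), ("postcode", "X0X 0X0")]),
    ("abstract", "A short summary."),
])
bad: Node = ("report", [("title", "Untitled"), ("abstract", "No author given.")])
```

The check concerns only which named parts are present and how they nest; it says nothing about what a title or an author means, which remains for the encoder to define.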

The DTD should not be confused with a relational database that is independent of the encoded text. For instance, the ODA standard establishes a document profile that is completely separate from the content in that it can be searched independently of content. But the SGML DTD is directly tied to the encoding of the text through the process of validation. This is because SGML is concerned only with what is actually present in the text, and not with context which, in the case of records, often consists of information that is not necessarily present in the text but must be inferred. For instance, the author that appears in the SGML DTD must be the name of the author that appears as part of the text. In diplomatic terms, this author is more likely to be the writer, or the person responsible for giving the text intellectual articulation. If the aim is to capture the author of the act and the author of the document, who may be different from the writer, the TEI SGML treatment of the tag is inadequate because this is information that will not appear in the text but may have to be inferred from the administrative context of the document.

The document instance is "the data and markup for a hierarchy of elements that conforms to a document type definition,"222 or in other words, it is the actual marked-up text and contains all the data and tags needed to delineate each element. The document instance may exist in the same file as the DTD and the SGML Declaration, but document instances can also refer to files elsewhere that contain the DTD and the declaration.

An SGML document, then, consists of a text, "or any stretch of natural language", marked up by tags, associated with a header, or document type definition.223 Although the SGML standard does not specify a set of tags, unlike ODA or DFR, tags may be bundled into groups or tagsets for specific types of documents, and these may be standardized. This is the aim of the Text Encoding Initiative (TEI), a project sponsored and organized by several professional associations in the field of computer-assisted literary and linguistic research.224

221 Sperberg-McQueen & Burnard, Guidelines, 88.

222 ISO, SGML, 13.

223 Burnard, Text Encoding Initiative, 108.

The aim of TEI is to "deliver a fully specified set of Guidelines which will enable researchers in any discipline to interchange texts and datasets in machine-readable form . .

. 1,225 On the theory that many texts are similar in their basic structure but differ in their components or sub-elements, the TEI is identifying standard tagsets, or base tagsets, for a number of different types of texts. That is, certain elements needed to describe their structure are similar, but these must be used in conjunction with tags unique to each type of text. For instance, a dictionary entry consists of tags for form, sense, related entry, and etymology, while the base tag set of a memo would include references to other documents, local filing references, and a subject field. "The resulting document becomes a template for validating existing documents and creating new ones."226 Thus far, the TEI has identified base tag sets for prose, verse, drama, transcribed speech, dictionary entries, terminological entries, and letters or memos.227

The base tagsets characteristic of particular types of texts may be combined with core tag sets of elements that research by TEI into different types of texts indicates are common to all texts. These include core structural elements of front matter, body, and back matter, and basic non-structural features which occur freely in the text such as paragraphs, highlighting, lists, bibliographic citations, foreign words or expressions, terms, cited words and glosses, abbreviations, notes, entries, numbers and dates, and crystals.228 Others include pre-existing annotations.229 The core and base tag sets may be

224 The TEI is sponsored by the Association for Computational Linguistics (ACL), the Association for Literary and Linguistic Computing (ALLC), and the Association for Computing and the Humanities (ACH). Burnard, Text Encoding Initiative, 105.

225 Ibid., 105.

226 Hajagos, Documents and SGML, 40.

227 Burnard, Text Encoding Initiative, 109.

228 Sperberg-McQueen & Burnard, Guidelines, 71-89.

229 Burnard, Text Encoding Initiative, 111.


combined with user defined tags in the DTD in an approach that has been characterized as the "Chicago pizza": the user is offered a standardized set of tags to which the user-defined tags may be added as a kind of topping.
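The combination of core, base, and user-defined tag sets can be pictured as a simple union of vocabularies. The following is a minimal sketch only: the tag names and the functions `permitted_tags` and `undeclared` are hypothetical illustrations, not names taken from the TEI Guidelines.

```python
# Hypothetical tag names for illustration; the TEI Guidelines define the real ones.
CORE_TAGS = {"front", "body", "back", "p", "list", "note", "date"}
MEMO_BASE_TAGS = {"subject", "fileRef", "docRef"}  # assumed base tag set for a memo
USER_TAGS = {"routingSlip"}                        # assumed user-defined "topping"


def permitted_tags(core, base, user=frozenset()):
    """Return the full tag vocabulary that a DTD built this way would declare."""
    return set(core) | set(base) | set(user)


def undeclared(tags_used, vocabulary):
    """Return, in sorted order, the tags used in a document but not declared."""
    return sorted(set(tags_used) - vocabulary)


vocabulary = permitted_tags(CORE_TAGS, MEMO_BASE_TAGS, USER_TAGS)
# A dictionary tag such as "etymology" is not part of the memo vocabulary.
print(undeclared(["p", "subject", "routingSlip", "etymology"], vocabulary))
```

In this sketch the DTD acts as the template against which a document is checked: a tag from another base tag set is reported as undeclared.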

The aim of the TEI is to publish guidelines for the DTD of specific types of

documents. An early draft of the Guidelines230 contains a TEI DTD for Office Documents.

In keeping with SGML's emphasis on structure, document types are not defined by

function. "Although office documents can be classified according to a number of criteria

and arranged in various classes (e.g. memorandum, business letter, report, minutes etc.),

their commonalities rather than their specific features have been stressed in order to devise

a single general structure for office documents".231

The TEI DTD divides the office document into three parts: front matter, text body, and back matter. At first it might be thought that this division corresponds roughly to the intrinsic elements of protocol, text, and eschatocol, but the similarity is, in fact, almost nonexistent. The text body consists of the core tag set mentioned above plus a subject line, salutation and "signoff features" characteristic of correspondence. The back matter is held to comprise bibliography, glossary and index features. On the grounds that these are features common to many other documents, the TEI DTD for office documents focuses mainly on the front matter, which it considers a sort of document profile relating to the "special aspects of office documents", namely, production and storage of documents, document distribution, action requested and deadlines, and status and history of the document as a version or draft.232 The DTD consists of a core tag set of these elements, some of which draw on ODA and the X.400 standards.

SGML TEI Office Document Type Definition

tag name encoding format

Document Type Title

230 Sperberg-McQueen & Burnard, Guidelines, 289.

231 Ibid., 188.

232 Ibid., 188.


Document Date Author Abstract Table of Contents Language Revision History

In addition to the core tag set, the office document DTD includes the following elements:

Document Reference Additional User Specific Codes References to Other Documents In Reply To Local File System Reference File Name Location Access Rights User Comments Subject Keywords Creation Date Originator Preparer Authorizing Person Primary Recipient Secondary Recipient Other User Information Document Status Sensitivity Number of Pages

Included in the recipient tags is a sub-element which is interpreted as possibly including such actions as "action", "information", "opinion", "visa", "filing" etc.233 These action tags are merely a mixture of standard filing and routing actions that would take the form of archival or procedural annotations and do not describe the act or the transaction itself.

233 Ibid., 189-190.


Included in the front matter are a number of "crystals" or minor tagsets that can be invoked as a group to encode dates, individual persons, postal addresses, and electronic addresses.

Person name Organization Address

Some or all of these tags might be used to encode an office document. A TEI encoded document consists of a DTD, which defines what tags will be used, a header, which provides information needed to describe the process of encoding and for bibliographic citation, and the actual encoded text. An example is provided in Figures 2 and 3 following.


FIGURE 3 Sample of SGML TEI Encoded Document

This letter from the Government of Barbados to the World Bank (see facing page, Figure 2) has been coded with SGML tags according to the sample provided in the draft Guidelines, pp. 241-243, using tags identified for office documents. The SGML encoding is divided into two parts, the header and the text. It does not include the DTD, which would define all the permissible tags to be used in the encoding. Bold has been used to identify the first instance of each tag; tags are encoded in level pairs. The tags are listed in a hierarchy of elements that nest one within the other, beginning with the most all-encompassing. Square brackets have been used to denote supplied information for purposes of elucidation. Comments are provided as footnotes in order to avoid cluttering the format of the encoding.
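The requirement that tags occur in pairs and nest one within the other can be checked mechanically. The sketch below is an illustration under stated assumptions: it presumes fully tagged markup with no SGML tag omission or minimization, and the tag names in the samples are hypothetical rather than taken from the TEI DTD.

```python
import re

# Matches a start tag or an end tag and captures the slash and the tag name.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^>]*>")


def check_nesting(markup):
    """Return True if every start tag is closed in reverse order of opening."""
    stack = []
    for closing, name in TAG.findall(markup):
        if not closing:
            stack.append(name)            # start tag: remember it
        elif not stack or stack.pop() != name:
            return False                  # end tag without a matching start tag
    return not stack                      # anything left open is an error


# Hypothetical tag names, used only to exercise the check.
nested = "<text><front><date>19 May 1993</date></front><body><p>Dear Mr. Abe</p></body></text>"
crossed = "<text><body></text></body>"
print(check_nesting(nested), check_nesting(crossed))
```

A conforming SGML parser does far more than this, since it validates each element against the content model declared in the DTD; the sketch covers only the pairing and nesting described above.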

234 Letter, L. Erskine Sandiford to Mr. Yoshiaki Abe, 19 May, 1993: machine readable transcript transcribed by Tony Gregson small (ca. 2 Kb) 5 January 1995235 Washington DC World Bank Archives Jan 5 1995 L. Erskine Sandiford [letter to Yoshiaki Abe] Barbados

234 The TEI header is intended to provide information that can be used in bibliographic citation and to document the circumstances of the encoding.

235 Creation date is the date of the encoding or the creation of the transcription.


FIG 3: SAMPLE OF SGML TEI ENCODED DOCUMENT (CONT'D)

19 May 1993. l p. Example of transcription of a received document by World Bank. l T. Gregson Jan 5 1995 encoded SGML format

236 L. Erskine Sandiford, Prime Minister and Minister of Finance and Economic Affairs Government of Barbados Ministry of Finance & Economic Affairs

Government Headquarters, Bay Street
19 May 1993 Mr. Yoshiaki Abe, Country Director, Latin America and the Caribbean Region International Bank for Reconstruction and Development

236 The text consists of front matter, the body, and back matter. No back matter has been included here because the letter does not have any of the SGML TEI standard components defined as back matter such as indexes, glossaries etc.


FIG 3: SAMPLE OF SGML TEI ENCODED DOCUMENT (CONT'D)

237 1818 H Street N.W. Washington D.C. 20433 U.S.A.
Human Resources Project Dear Mr. Abe

I wish to nominate His Excellency, Dr. Rudi Webster, Barbados' Ambassador to the United States of America, as the Expert to sign on Barbados' behalf the Statutory Committee Report.

Yours faithfully L. Erskine Sandiford

From the point of view of capturing archival documents, there are many problems with the office document DTD as envisaged by the TEI. First and foremost, it is bibliographical in nature, i.e. it treats office documents as self-contained document entities, independent of their interrelationship with other documents for meaning and their role in a transaction. The document is to be cited as if it were a publication with a Title Statement and a Publication Statement when the letter has no title and cannot be said to have been published except in the most general sense of that word in that it has been "formally announced" or "read".238

Secondly, the selection of tags is poor in the contextual information that indicates

the relationship of the document to the transaction of which it is a record, and those who

are responsible. In particular, there is no tag to identify the act in which the document

took part (the making of a loan to Barbados), or the procedure (the negotiation of terms).

237 The address is an example of a crystal, a small object that has an internal structure consisting of a number of standard tags.

238 Oxford Modern English Dictionary, s.v. "publish."


There is also inadequate identification of those responsible for the document and of those who participated in the transaction. The header does not use the tag for organization as part of the Source Description, where information may be supplied for contextual purposes or citation that is not found in the document text. The tags for organization are confined to the body only, where they consist of a literal transcription, or document instance, of the text. Similarly, the header does not make a distinction between the author of the act (the Government of Barbados) and the author of the document (L. Erskine Sandiford). Again, the addressee is a textual tag to be transcribed literally and there is no distinction made between the addressee of the act (the World Bank) and the addressee of the document (Mr. Yoshiaki Abe).
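The distinction between the persons of the act and the persons of the document can be made explicit in a profile structure. What follows is a minimal sketch; the class name `RecordPersons` and its field names are assumptions made for illustration and do not belong to the TEI DTD or to any standard. The values are those of the sample letter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordPersons:
    """Hypothetical profile fields separating the act from the document."""
    author_of_act: str
    author_of_document: str
    addressee_of_act: str
    addressee_of_document: str


letter = RecordPersons(
    author_of_act="Government of Barbados",
    author_of_document="L. Erskine Sandiford",
    addressee_of_act="World Bank",
    addressee_of_document="Mr. Yoshiaki Abe",
)
print(letter.author_of_act, "/", letter.author_of_document)
```

Holding these four values apart from the transcribed text is what would allow the organizations to be recorded even where they do not appear literally in the body of the letter.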

The tags also fail to capture other essential records attributes. There is nothing to

capture the status of transmission which would indicate whether the letter was an original

or a copy.239

SGML has a reputation for extreme flexibility in tackling problems of the encoding

of all types of documents, including archival documents. Writes one researcher, "... it has

proved remarkably difficult to find problems for which a solution could not be expressed

in SGML."240 But SGML has broader limitations if it is ever to be useful for the capture of

archival documents.

SGML was designed for the publishing environment in that it permits authors to

mark up their documents through the use of a generalized language for formatting by the

publisher. As has been pointed out earlier, the publishing environment assumes that the

creator of the document has no interest in the eventual appearance or format of the

document but is interested only in controlling content. This assumption runs contrary to

the definition of the archival document which is "a complete document . . . that contains

all the elements it is supposed to contain according to the administrative and legal

system."241 Since these elements include the layout and logical structure as intrinsic and

239 As it is, the SGML transcription is really nothing more than a simple copy or a literal transcription of the contents of a document.

240 Burnard, Text Encoding Initiative, 106.

241 Duranti, Managing Electronic Records, 9.

extrinsic elements, SGML as it is presently conceived is not suited to the encoding of records. Because it is so important to avoid embedding format information in structured documents using SGML, the format must be defined externally for particular presentations. These formats are then "dynamically associated" with structural elements to create a view of the information that is useful to the reader.242 Such an approach gives SGML encoded documents great flexibility in presentation, but this is not always desirable if what is wanted is a view of the document as it is going to appear or as it originally existed.
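The effect of defining format externally can be illustrated with a short sketch. It assumes a document held as structure only and two invented style tables; the names `SCREEN_STYLE`, `PRINT_STYLE`, and `render` are hypothetical and stand for no actual SGML tool.

```python
# Structure only: (element name, content) pairs with no formatting information.
document = [("subject", "Human Resources Project"), ("salutation", "Dear Mr. Abe")]

# Two external style tables, each associating an element name with a format.
SCREEN_STYLE = {"subject": str.upper, "salutation": str}
PRINT_STYLE = {"subject": lambda s: f"Re: {s}", "salutation": lambda s: f"{s},"}


def render(doc, style):
    """Apply an external style table to each structural element in turn."""
    return [style.get(name, str)(content) for name, content in doc]


print(render(document, SCREEN_STYLE))
print(render(document, PRINT_STYLE))
```

The same structure yields two different presentations, and nothing in the document itself records which of them, if either, reproduces the appearance of the original.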

To resolve this problem, SGML has to be capable of addressing "both avenues of document creation and maintenance, structure-oriented and format-oriented. SGML offers a structured approach, but programs such as Pagemaker and Word Perfect are format-oriented which address only the appearance."243 As pointed out earlier, the development of DSSSL (Document Style Semantics and Specification Language) may enable SGML to handle layout by adding DSSSL statements.244

The philosophy of the SGML Text Encoding Initiative itself is problematic because it assumes that all SGML documents are only interpretations. "No claim to absolute authority is made by any encoder, nor should ever be; the TEI scheme merely allows encoders to 'come clean' about what they have perceived in a text, to whatever degree seems appropriate."245 This means that all such transcriptions are nothing more than simple copies with notes and could not be records in themselves since there is no guarantee that they capture the original. While this may be a necessity in a scholarly environment where texts are being transcribed for research purposes, it is inappropriate in a records-creation environment. Records are not interpretations but must have the quality of impartiality.

Impartiality is "the characteristic of archival documents that they are created for limited,

242 Hajagos, Documents and SGML, 41.

243 Ibid., 41.

244 B.C. Watson, R. J. Davis, ODA and SGML: An Assessment of Co-existence Possibilities, Computer Standards and Interfaces 11 (1990/91): 174.

245 Burnard, Text Encoding Initiative, 111

specific and immediate purposes of an administrative legal nature, not in order to instruct posterity."246

246 Duranti, Managing Electronic Documents, 16.


ISO 8613 Office Document Architecture (ODA)247

ODA is a standard that deals more specifically with the needs of the office environment, where it provides mechanisms for describing structures, standardized semantics for controlling document layout, and syntax definitions for interchanging this information.248 ODA "defines interchange formats, concepts to represent the structure of the information in a document, and the meaning of a set of formatting parameters."249 The purpose of ODA is to provide for the interchange of documents in order to permit presentation either as intended by the creator, or to allow processing, such as editing or reformatting, or both.250 Interchange, in the ODA model, is assumed to follow the basic processing model, an automatic process of transferring a document "from an originating system to a receiving system,"251 but in the case of ODA, this automatic process is known as "blind document interchange" because it is assumed that "revisability and layout are preserved just based on the knowledge that both systems comply to the international standard."252 While ODA is used to encode the structure of a document, it must be used with two other standard interchange formats to actually exchange the information. The Open Document Interchange Format (ODIF) is used to define a machine-readable bit stream representation of the document while SGML is used to define a human-readable representation of the document in a format called Open Document Language (ODL).253

247 ODA has also been published using the title Open Document Architecture by CCITT and ECMA. Borman & Borman, Standards, 151.

248 Ibid., 150.

249 Fanderl et al. , The Open Document Architecture, 734.

250 ISO, ODA: Part 1, 1.

251 Ibid., 7.

252 Fanderl et al. , The Open Document Architecture, 734.

253 Ibid., 736.


ODA is designed to deal with compound documents which may consist of text, geometric graphics, and raster graphics.254 It distinguishes three document architectures, all of which may be interchanged but which lend themselves to different purposes: formatted, processable, and formatted processable. Formatted documents are "read only", or in other words, intended only for presentation. Their interchange is designed to ensure that the same layout is preserved in all systems. Processable documents permit human editing or can be otherwise modified by machine-controlled processes that may change the content or structure. Formatted processable documents not only preserve the original layout in interchange, but may also be edited or restructured. The structure of a processable ODA document is designed to permit the processing of documents in three steps: editing, layout, and presentation (imaging).255
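The three document architectures differ in whether the layout is preserved in interchange and whether the document may be edited. A minimal sketch of that distinction follows; the enumeration and function names are assumptions for illustration and are not ODA terminology beyond the three architecture names themselves.

```python
from enum import Enum


class Architecture(Enum):
    FORMATTED = "formatted"
    PROCESSABLE = "processable"
    FORMATTED_PROCESSABLE = "formatted processable"


def preserves_layout(arch):
    """Formatted forms carry the original layout through interchange."""
    return arch in (Architecture.FORMATTED, Architecture.FORMATTED_PROCESSABLE)


def allows_editing(arch):
    """Processable forms may have their content or structure changed."""
    return arch in (Architecture.PROCESSABLE, Architecture.FORMATTED_PROCESSABLE)


for arch in Architecture:
    print(arch.value, preserves_layout(arch), allows_editing(arch))
```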

An ODA document consists of four parts: the logical structure (chapters, paragraphs, lists, diagrams etc.) and the relationship between logical elements, the layout structure (pagination and physical location of elements such as paragraphs), the content structure, representing different content types such as text and graphics, and the document profile, which contains descriptive attributes for filing and processing. ODA's use of content types permits other standards, such as graphic standards, to be used to define the contents.

DOCUMENT ARCHITECTURE

The fundamental concept behind ODA is document architecture, a set of rules that can be used to define the physical and intellectual structure of documents. Structure, in this case, means "the division and repeated subdivision of the content of a document into increasingly smaller parts."256 The parts are called objects and are organized into a hierarchy or tree. The purposes of ODA may therefore be restated more specifically in terms of the document architecture:

254 Ibid., 736.

255 Ibid., 734.

256 ISO, ODA: Part 1, 14.


• to permit the exchange of documents between heterogeneous environments so that "different types of content, including text, image, graphic, and sound can coexist within a document"; and

• to ensure that "the intentions of the document originator with respect to editing, formatting and presentation257 can be communicated most effectively."258

The rules that constitute the document architecture are based on three fundamental

assumptions about documents:

1. THE TWO VIEWS: all documents consist of a layout view, which is how the

document is physically organized into pages259, and a logical view which is how the

document is intellectually subdivided into units of meaning (e.g. paragraphs).260

The logical structure is usually embedded in the document by the author at the

time of creation and editing, whereas the layout structure is determined by a

formatting process such as a word processing application.

2. GENERIC AND SPECIFIC STRUCTURE: all documents have a "specific" structure

which is the one "the user may read", or in other words, is the human perceptible

structure, and an underlying "generic structure" which is "the template that guides

the creation of the document and that could be re-used for its amendment."261 The

concepts of generic and specific are applied to both layout and logical structures.

The generic structure represents properties that are common to a number of

documents, whereas the specific structure is an instance of the generic structure that

belongs only to a given document. Another way of looking at this is to

say that the generic structure represents the standard components that might be

common to a number of different documents, such as a title or pages, whereas the

257 Presentation is the "operation of rendering the content of a document in a form perceptible to a human being." Ibid., 10.

258 Ibid., 13.

259 Ibid., 19.

260 Ibid., 14.

261 Ibid., 13.


specific structure would be the particular instance of the title in a given

document.262 The generic structure controls the editing process in that only

structures conforming to those defined for the generic structure can be generated

as specific structures.263

3. DOCUMENT CLASSES: ODA posits the existence of document classes which it

defines as "a set of generic features that are common to a category of

documents."264 All document classes have both a generic layout and a generic

logical structure. ODA's concept of document classes is more complete than the

SGML definition of document type because it encompasses a greater range of

document features, including both layout and logical structures, enabling

documents to be defined with greater specificity.

The components of the layout and logical structures are managed as different types of objects. Those associated with the generic view are known as object classes; those associated with the specific view are known as objects. Hence, there are logical object classes, and logical objects, layout object classes and layout objects. A layout object class might consist of pages associated with headers, a layout object, a page; a logical object class might consist of chapters, a logical object, sections. The document architecture

subdivides layout and logical objects into a further hierarchy of subordinate entities:

document roots, composite objects, and basic objects, which are at the simplest level of

the architecture in having no subordinate structural entities. In addition to these, ODA also

specifically defines as object entities certain standard layout structures in the form of

blocks, frames, pages, and page sets. No such standard entities are defined for the logical

structure. The purpose of describing elements as objects is to dissociate them from any

one particular use.
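The way a generic structure constrains the specific structures that may be generated from it can be sketched as a check of objects against object classes. The class names and the `conforms` function below are hypothetical, and the sketch leaves out layout objects, content portions, and the attributes that ODA attaches to each object.

```python
# Assumed generic logical structure: each object class lists the classes
# permitted directly beneath it. Classes with no subordinates are basic objects.
GENERIC = {
    "report": ["title", "chapter"],
    "chapter": ["heading", "paragraph"],
    "title": [],
    "heading": [],
    "paragraph": [],
}


def conforms(obj, generic=GENERIC):
    """Check a specific structure, given as (class_name, children), recursively."""
    name, children = obj
    if name not in generic:
        return False
    return all(child[0] in generic[name] and conforms(child, generic)
               for child in children)


specific = ("report", [("title", []),
                       ("chapter", [("heading", []), ("paragraph", [])])])
misplaced = ("report", [("paragraph", [])])  # a paragraph outside any chapter
print(conforms(specific), conforms(misplaced))
```

Here the document root is the report, the chapter is a composite object, and the title, heading, and paragraph are basic objects with no subordinate entities.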

262 For instance, the generic logical element "title" is common to all reports, but the title, "Drilling for Water in Abidjan" is a specific instance of the logical structure for the document type, report.

263 Fanderl et al. , The Open Document Architecture, 735. This is one approach but another is to define the properties required by an editor to generate the document. For instance, instead of defining as a generic structure in the ODA document a map (an example of a geometric graphic), ODA would define how the word processing application is to interpret the requirement for a map.

264 ISO, ODA: Part 1, 13.


Figure 4

ODA Logical and Layout Structure265

[Diagram labels: section, heading, paragraph, content portion, block, page]

Figure 5

ODA Correspondence Between Logical and Layout Objects Including Content Portion266

[Diagram labels: report, chapter, heading, contents, figure]

265 From ISO, ODA: Part 1, 14.

266 From ISO, ODA: Part 1, 16.


FIGURE 6

ODA Content Architecture and Layout for Diplomatic Elements

This figure is intended to illustrate how diplomatic elements would be interpreted as part of an ODA document architecture in terms of logical and layout objects. The protocol is only one part of the document intrinsic elements. The others would be the text and the eschatocol.

[Diagram: the logical object section (protocol) is made up of the paragraphs entitling, date, invocation, superscription, and inscription; each paragraph's content portion corresponds to a layout object (block) on a page.]
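The correspondence in Figure 6 can be restated as a small data structure. This is a sketch under stated assumptions: each element of the protocol is treated as one logical paragraph placed in one layout block on a single page, and the function name `lay_out` and the dictionary keys are hypothetical.

```python
# The five elements of the protocol shown in Figure 6.
PROTOCOL = ["entitling", "date", "invocation", "superscription", "inscription"]


def lay_out(elements, page=1):
    """Pair each logical paragraph with one numbered layout block on a page."""
    return [{"logical": ("paragraph", name), "layout": ("block", number), "page": page}
            for number, name in enumerate(elements, start=1)]


for entry in lay_out(PROTOCOL):
    print(entry)
```

The text and the eschatocol would be handled in the same way as further sections of the logical structure.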


The logical and layout structures of an ODA document are in theory intended to be quite separate. This is important because the separation of layout and logical structures permits another layout to be applied to the same document when it has been exchanged between two different systems. But under some circumstances, layout may be driven by logical requirements, or presentation driven by either logical or layout requirements. For example, each section of a report may start on a new page, in which case, the formatting must recognize that a page break will be triggered by the end of each section. This dynamic relationship between logical and layout structures is captured by a document component in the form of directives called a style. ODA recognizes layout styles, defined as "a constituent of the document, referred to from a logical component, that guides the creation of a specific layout structure,"267 and presentation styles, defined as "a constituent . . . referred to from either a logical or a layout component which guides the format or appearance of a document."268 Presentation styles "aggregate information that concerns the formatting of content, such as fonts and line-spacing."269
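The example of a section that must start on a new page shows how a directive attached to the logical structure drives the layout process. The sketch below is illustrative only; `paginate` and its parameters are hypothetical and much simpler than an ODA layout style.

```python
def paginate(sections, lines_per_page, new_page_per_section=True):
    """Assign a page number to every line; sections is a list of lists of lines."""
    placed, page, used = [], 1, 0
    for section in sections:
        # Layout directive: a new section forces a page break on a used page.
        if new_page_per_section and used > 0:
            page, used = page + 1, 0
        for line in section:
            if used == lines_per_page:    # ordinary overflow onto the next page
                page, used = page + 1, 0
            placed.append((page, line))
            used += 1
    return placed


report = [["a1", "a2"], ["b1"]]
print(paginate(report, lines_per_page=3))
print(paginate(report, lines_per_page=3, new_page_per_section=False))
```

The logical structure is identical in both calls; only the directive changes, and with it the specific layout structure that results.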

ODA also structures the content of a document, defined as "the information conveyed by the document other than the structural information, and that is intended for human perception."270 Content architecture is "the rules for defining the internal structure and representation of the content of basic components in terms of a set of content elements, attributes and control functions, and guidelines for presentation of the content."271 The content architecture is used to determine document size, number of pages, and languages, basic elements of representation, such as letters, pels, or geometric graphic elements (lines, polygons etc.), and other attributes relating to content, such as access. In other words, ODA content refers not to the subjects that may be treated in the

267 Ibid., 8.

268 The difference between this and layout is not clear. Formatting is defined as "the carrying out of operations to determine the layout of a document." Ibid., 6.

269 Fanderl et al., The Open Document Architecture, 736.

270 ISO, ODA: Part 1, 4

271 Ibid., 16.

document, but to the way actual instances of information will be represented. For instance, words may be expressed as character text, or diagrams such as arcs or circles as geometric graphics.

Each such instance forms a content element that is associated with a basic logical or a basic layout object. For instance, the content element of character text may be associated with the logical element paragraph. A set of content elements that "belongs to", or is subordinated to, a logical or layout object is called a content portion.272 Each content portion may therefore have its own architecture or set of rules for defining its internal structure in terms of content elements, their characteristics, and how they may be processed. As with layout and logical features, it is possible to have common content features or object classes, keeping in mind that content is always associated with, and therefore part of, the layout and logical structure.

DAPs

Just as textual features of documents can be standardized by the use of SGML DTDs, so certain constituents of ODA documents can be standardized by means of document application profiles (DAPs). "Each of the DAPs specifies open document interchange within a certain class of applications [e.g. word-processing or imaging applications], through the definition of a set of features that are to be preserved with regard to document layout and processing behavior and its presentation in terms of ODA constituents."273 So far, DAPs have been issued for revisable teletext messages, the handling of text, raster and geometric graphics by certain word-processing applications, such as Word Perfect, and advanced formatting for sophisticated document applications such as computer-aided publishing.274 The DAPs are defined in terms of ODA functionalities and consist of globally registered identifiers that permit a receiving system to determine if it can handle the constituent.

272 Ibid., 15.

273 Fanderl et al., The Open Document Architecture, 737.

274 Ibid., 737.


An ODA document, then, is defined according to one of three document architectures, processable, formatted, or formatted processable. It is comprised of a hierarchy of constituents consisting of one or more generic layout structures, specific layout structures, layout styles, generic logical structures, specific logical structures, and presentation styles. These metaelements are each associated with various types of objects which instance the actual structure and consist of an object description, an object, a presentation style, a layout style, a content portion description, and a document profile.275

ODA DOCUMENT PROFILE

The profile of an ODA document consists of "a set of attributes associated with the document as a whole."276 An ODA attribute is "a property of a document or of a document constituent (e.g. a logical object, a layout object, a logical object class, a layout object class, a style or a content portion). It expresses a characteristic of the document or document component concerned, or a relationship with one or more documents or document components."277 Each attribute must be broken out according to the following criteria:

• classification (mandatory, non-mandatory, defaultable)

• permissible values divided into basic and non-basic values

• default values, if the value is defaultable.
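The three criteria listed above can be read as a rule for deciding what value an attribute takes in a given profile. The following is a minimal sketch: the class `AttributeSpec`, the function `resolve`, and the sample attributes and values are all hypothetical, and the division of permissible values into basic and non-basic is left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttributeSpec:
    name: str
    classification: str            # "mandatory", "non-mandatory" or "defaultable"
    permissible: frozenset         # the permissible values
    default: Optional[str] = None  # meaningful only when defaultable


def resolve(spec, supplied=None):
    """Return the value an attribute takes, or raise if a rule is broken."""
    if supplied is not None:
        if supplied not in spec.permissible:
            raise ValueError(f"{supplied!r} is not permitted for {spec.name}")
        return supplied
    if spec.classification == "mandatory":
        raise ValueError(f"{spec.name} is mandatory")
    if spec.classification == "defaultable":
        return spec.default
    return None                    # non-mandatory and absent


# Invented sample attributes and values, for illustration only.
status = AttributeSpec("status", "defaultable", frozenset({"draft", "final"}), "draft")
language = AttributeSpec("language", "mandatory", frozenset({"en", "fr"}))
print(resolve(status), resolve(status, "final"), resolve(language, "en"))
```

An absent defaultable attribute falls back on its default, while an absent mandatory attribute is reported as an error.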

The profile consists of three clusters of attributes: constituents (i.e. generic logical and layout structures, object classes etc.); processing and imaging attributes, known as characteristics; and document management attributes which apply to the document as a whole (e.g. author's name, title etc.).279 The profile may include or not include all the attributes and may or may not be exchanged with a document, but the profile may be exchanged by itself. [SEE Thesaurus for definitions of each element of the profile.]

List of ODA attributes

Document Constituents

275 ISO, ODA: Part 1, 4

276 Ibid., 18.

277 Ibid., 16.


Generic layout structure; Specific layout structure; Generic logical structure; Specific logical structure; Layout styles; Presentation styles; External document class; Resource document

Document Characteristics

Content architecture classes; Interchange format class; ODA version; Non-basic document characteristics; Profile character sets; Comments character sets; Alternative representation character sets; Document constituent attributes; Page dimensions; Medium types; Layout paths; Protection; Block alignments; Fill orders; Transparencies; Colours; Borders; Page positions; Types of coding; Coding attributes; Presentation features; Non-basic structure characteristics; Number of objects per page; Additional document characteristics; Unit scaling; Fonts listing

Document management attributes

Document description; Title; Subject; Document reference; Document type; Abstract; Keywords; Dates and times; Creation date and time; Local filing date and time;


Expiry date and time; Start date and time; Purge date and time; Release date and time; Revision history; Originators; Organizations; Preparers; Owners; Authors; Other user information; Copyright; Status; User-specific codes; Distribution list; Additional information; External references; Reference to other documents; Superseded documents; Local file references; Content attributes; Document size; Number of pages; Languages; Security information; Authorization; Security classification; Access rights

ODA clearly approaches the ability to capture all the diplomatic attributes of a record

more closely than SGML, at least as advanced by the Text Encoding Initiative, because it

is designed to capture both the physical and intellectual components of a record as can be

seen from Figure 5: ODA Correspondence Between Layout and Logical Objects, where

logical elements can be equated with elements of layout. ODA is particularly interesting

for its set of document management attributes. These map well to diplomatic elements in

many respects. Its attributes of dates are particularly sensitive from the point of action and

documentation as well as filing of the record. The attributes for persons, while lacking in

precision in some respects (SEE the discussion under persons in Chapter Two), are

comprehensive in their range and go well beyond the narrow bibliographical

interpretations of persons. It is also sensitive to the question of status of transmission, and

104' Chapter Three International Open Document Exchange Standards goes beyond traditional diplomatic requirements in making provisions for security and access. Above all, ODA, unlike the SGML TEI, makes a firm separation between form and content, which translates into elements of content configuration rather than interpreting subjects as unique elements. ODA therefore seeks to standardize documents, which makes it particularly suitable for documents of a prescriptive and formal nature, even though it is possible for these to have a processable status. In this respect, ODA is the diametric opposite of the SGML TEI, whose emphasis is on permitting the widest possible latitude in interpreting text in all its forms. This is just as well because ODA's document architecture is quite complex to handle. Once defined for any given type of document, it would not be practical to make frequent changes.

ODA vs SGML

Bormann and Bormann maintain that the differences between SGML and ODA are not that great. In fact, the two standards are being driven together by the emergence of desktop publishing, which is making the layout of office documents just as demanding as that of commercial publications. In the environment of information interchange, document contents are no longer "frozen" by the document form; as the walls between traditional forms of documents (commercial publication in all its variety and office documents) dissolve, the distinction between office and publishing disappears. For a single standard to prevail, it must be equally capable of addressing both environments. This development has resulted in a number of planned extensions to both SGML and ODA.

SGML is being extended so that it can include standardized semantics for controlling document layout. This is being accomplished by the development of a new standard designed to maintain a clear separation between logical document structure (as marked up in SGML) and instructions for the automatic creation of a page layout.

Document Style Semantics and Specification Language (DSSSL)279 maintains this separation by decoupling layout specifications from the document itself. Extensions planned for SGML include support for direct access to document components, musical notation, and type fonts. ODA is to be extended with support for security, improved layout, colour, different data types and computations, indexing, voice, time synchronization, annotations, hypertext, revision control, distributed editing, backward compatibility and a standard application programming interface that will permit ODA to work easily with the many tools users require, such as spreadsheets, graphics programs, and word processing.280 Such extensions should bring SGML and ODA into increasing compatibility.

279 ISO/IEC DP 10179 - Text Communication - Document style semantics and specification language. (Geneva: International Organization for Standardization, 1989). Cited in Bormann & Bormann, Standards, 162, fn.

These extensions interject another stage into document processing that elaborates the basic document processing model by introducing the concept of mapping the source document to a transit document.281

Neither ODA nor SGML operates in a pure document processing environment of its own; both must be accommodated to existing applications through converters which provide interchangeability of documents between different systems. SGML was premised on the belief that authors would continue to edit their work by means of their usual local document-processing systems, using them for both information and document markup. This is why SGML is human-readable. But this requires a knowledge of the markup codes, which can be time-consuming to acquire. Moreover, the codes have to be input accurately.

Similarly, PODA (Pilot ODA) was developed to transform local application document formats into the ODA format, a task which would otherwise be very time-consuming. Such approaches, however, can only be transitional because they sacrifice functionality in order to minimize the possibility of errors (as we recognize the neophyte linguist by his stilted grammar and attempts at commonplace conversation).

280 Bormann & Bormann, Standards, 154-158.

281 Ibid., 158.


ISO 10166 Document Filing and Retrieval (DFR)

Document filing and retrieval is the complement to document creation and exchange. DFR and ODA are intended to be complementary standards and share many of the same document management attributes. DFR is intended to provide "a large capacity document store to multiple users in a distributed office environment."282 It is not "an attempt to generalize all filestores in computing systems, but rather, filestores where clients and servers are on different nodes of a distributed system."283

DFR provides services between two "atomic" parts - the DFR-Server and the DFR-User.284 The DFR-Server provides access to a file store for the user by means of Filing Ports, Retrieval Ports, and Administration Ports. The filing and retrieval services supported by the DFR-configured server are called the DFR-Server Abstract Service, which includes information on how the user can make use of the service. Administration is considered a separate type of service.

The DFR server gives access to a DFR document store, which consists of a number of different types of document objects. These are DFR-Documents, the most basic objects, which could comprise any sort of document; DFR-Groups, or collections of documents; DFR-References, which provide a means to include a document in more than one group without making copies; and DFR-Search-Result-Lists, which merely contain information satisfying some search criteria.285

The DFR-Abstract-Service can be used to create, delete, modify, copy, move or simply read stored objects. The attributes of each object can also be created, copied, moved, stored, or modified, but the content of an object cannot be changed. In other words, the content of an object can be described by an attribute, and the description (and therefore the content) altered, but the actual instance of the content, such as the text of a document, cannot be changed. To do that, the document must be imported into an application environment where this is permitted, and then returned to the DFR document store.

282 ACCIS, Strategic Issues, 51.

283 Ibid., 51.

284 The DFR-Server is a file server, which is "a computer and storage device dedicated to storing files." Margolis, Personal Computing Dictionary, 428.

285 ACCIS, Strategic Issues, 52.

The DFR-Document consists of a set of attributes established in a profile, and a content, or actual body of the document. The attributes must establish the location of the document at the very least. DFR is not at all concerned with the content beyond its presence.

The definition of document could include part of a document, such as a page or part of a book, or a group of different documents, such as a number of documents that are pulled together into a single document object.286

A DFR-Group consists simply of a collection of DFR-objects that have some common characteristic, and again, could comprise different parts of a document, or a number of documents related by some common purpose or function. Whatever they are, they must all be stored in the same server. DFR does not allow for a distributed document store spread out over a number of different servers.287 The commonality is defined by attributes, while content is captured by a list of the Unique-Permanent-Identifiers (UPIs) of the members of the group. A DFR-Reference consists of pointers to DFR-objects that enable them to participate in more than one group. The reference is again a set of attributes combined with a pointer to the particular DFR-object, which could be a DFR-Document or DFR-Group or a DFR-Search-Result-List.
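The object types described above can be illustrated with a minimal sketch. It is not drawn from ISO 10166 itself: the class names, the `DocumentStore` methods and the integer UPIs are all hypothetical, chosen only to show attributes that can be modified alongside content that cannot, groups that hold lists of UPIs, and references that act as pointers.

```python
from dataclasses import dataclass, field
from itertools import count

_next_upi = count(1)  # hypothetical stand-in for server-assigned UPIs


@dataclass
class DFRObject:
    """Any stored object: a profile of attributes plus a unique identifier."""
    attributes: dict
    upi: int = field(default_factory=lambda: next(_next_upi), init=False)


@dataclass
class DFRDocument(DFRObject):
    content: bytes = b""  # the body; the store offers no way to edit it in place


@dataclass
class DFRGroup(DFRObject):
    member_upis: list = field(default_factory=list)  # group content is a list of UPIs


@dataclass
class DFRReference(DFRObject):
    target_upi: int = 0  # pointer to a document, group or search result list


class DocumentStore:
    """A single-server store (assumption: one instance stands for one server)."""

    def __init__(self):
        self._objects = {}

    def create(self, obj):
        self._objects[obj.upi] = obj
        return obj.upi

    def read(self, upi):
        return self._objects[upi]

    def modify_attributes(self, upi, **changes):
        # Only the profile changes; the content is left untouched.
        self._objects[upi].attributes.update(changes)

    def add_to_group(self, group_upi, member_upi):
        # Members must live in the same store as the group.
        if member_upi not in self._objects:
            raise KeyError("member must be stored on the same server as the group")
        self._objects[group_upi].member_upis.append(member_upi)

    def resolve(self, reference_upi):
        # Follow a reference to the object it points at.
        return self._objects[self._objects[reference_upi].target_upi]


store = DocumentStore()
report = store.create(DFRDocument({"DFR-Title": "Field report"}, content=b"text"))
appraisal = store.create(DFRGroup({"DFR-Title": "Appraisal"}))
audit = store.create(DFRGroup({"DFR-Title": "Audit"}))
store.add_to_group(appraisal, report)
# The same document joins a second group through a reference, not a copy.
pointer = store.create(DFRReference({}, target_upi=report))
store.add_to_group(audit, pointer)
```

In this sketch the report is stored once; the second group holds only the UPI of a reference, which resolves back to the original document.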

The function of DFR attributes is to "give support to the user in understanding DFR".288 They may come from a wide variety of sources, although DFR supports many attributes of ODA. The attributes break down into two groups: those used to manage the document within the DFR file store (the Basic Attribute Set), and those used to identify the object for management purposes outside the DFR store itself (the Extension Attribute Set).

286 For instance, a World Bank Staff Appraisal Report on a project consists of a number of different field reports and memorandums.

287 ACCIS, Strategic Issues, 53.

288 Ibid., 53.

288 Ibid., 52.

Basic Attribute Set

Attribute-type Name
DFR-UPI*
DFR-Object-Class
DFR-Document-Type
DFR-Title
DFR-Pathname*
DFR-Parent-Identification*
DFR-Referent-Deleted*
DFR-Membership-Criteria
DFR-Ordering
DFR-Resource-Limit
DFR-Resource-Used*
DFR-Number-Of-Group-Members*
Version-Name
DFR-Previous-Versions
DFR-Next-Version*
DFR-Version-Root*
DFR-External-Location
User-Reference
User-References-to-Other-Objects
DFR-Attributes-Create-Date-and-Time*
DFR-Content-Create-Date-and-Time*
DFR-Created-By*
DFR-Attributes-Modify-Date-and-Time*
DFR-Content-Modify-Date-and-Time*
Document-Date-and-Time
DFR-Reservation*
DFR-Reserved-By*
DFR-Access-List*

*Assigned by the DFR Server. The other attributes may be assigned by the user or the owner.
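The distinction between server-assigned and user-assignable attributes can be sketched as a simple validation step. The attribute names are those of the Basic Attribute Set listed above, with the asterisks read as marking server-assigned attributes; the function `apply_user_update` and its error behaviour are hypothetical and are not part of the DFR standard.

```python
# Attributes marked with an asterisk in the Basic Attribute Set.
SERVER_ASSIGNED = frozenset({
    "DFR-UPI", "DFR-Pathname", "DFR-Parent-Identification",
    "DFR-Referent-Deleted", "DFR-Resource-Used",
    "DFR-Number-Of-Group-Members", "DFR-Next-Version", "DFR-Version-Root",
    "DFR-Attributes-Create-Date-and-Time", "DFR-Content-Create-Date-and-Time",
    "DFR-Created-By", "DFR-Attributes-Modify-Date-and-Time",
    "DFR-Content-Modify-Date-and-Time", "DFR-Reservation",
    "DFR-Reserved-By", "DFR-Access-List",
})

# Unmarked attributes, which the user or the owner may assign.
USER_ASSIGNABLE = frozenset({
    "DFR-Object-Class", "DFR-Document-Type", "DFR-Title",
    "DFR-Membership-Criteria", "DFR-Ordering", "DFR-Resource-Limit",
    "Version-Name", "DFR-Previous-Versions", "DFR-External-Location",
    "User-Reference", "User-References-to-Other-Objects",
    "Document-Date-and-Time",
})


def apply_user_update(profile, changes):
    """Return a new profile with the user's changes applied.

    Hypothetical rule: a user request that touches a server-assigned
    attribute is rejected as a whole, leaving the profile unchanged.
    """
    for name in changes:
        if name in SERVER_ASSIGNED:
            raise PermissionError(f"{name} is assigned by the DFR server")
        if name not in USER_ASSIGNABLE:
            raise KeyError(f"{name} is not in the Basic Attribute Set")
    return {**profile, **changes}


profile = {"DFR-UPI": 42, "DFR-Title": "Draft"}
profile = apply_user_update(profile, {"DFR-Title": "Staff Appraisal Report"})
```

Rejecting the whole request rather than silently dropping the protected attribute is a design choice of this sketch only; a real server could behave differently.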

DFR Extension Attribute Set

These attributes are assigned by the Owner. They are a subset of the ODA Document Profile and therefore have definitions similar to those discussed under ODA.

Other-Titles
Subject
Document-Type
Document-Architecture-Class
Keywords
Create-Date-and-Time
Purge-Date-and-Time
Revision-Date-and-Time
Organizations
Preparers
Owners
Authors
Status
User-Specific-Codes
Superseded-Documents
Number-of-Pages
Languages

Since DFR is a document store, it is sensitive to the manipulation of documents, and therefore to the status of transmission. DFR achieves this through version management, in which it tracks documents by their derivation from a Version-Root and by their membership of groups, which may consist of a family of versions of the same document. It is very flexible in its ability to link with other objects, which may or may not be documents (the document being only one type of object). With ODA it shares the same sensitivity to moments of action and documentation through its different attributes for dates. DFR is also capable of capturing the persons, and the relationships between records, or the archival bond, through attributes such as User-References-to-Other-Objects. Because it is a document store, however, DFR picks up on the document once it has been created, so it is not concerned with the logical and layout structure and, in this respect, makes assumptions about the completeness of a record that ODA does not.
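Version management of this kind can be pictured as following links between profiles. The sketch below assumes, purely for illustration, that each object's profile records its DFR-Version-Root and a single DFR-Next-Version as UPIs, so that a family of versions forms a simple chain; the standard's actual value types and any branching of versions are not modelled.

```python
def version_family(profiles, upi):
    """Return the UPIs of all versions in the family of `upi`, oldest first.

    `profiles` maps a UPI to that object's attribute dictionary
    (a hypothetical in-memory stand-in for the DFR document store).
    """
    current = profiles[upi]["DFR-Version-Root"]
    chain, seen = [], set()
    # Walk forward from the root; `seen` guards against a malformed cycle.
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = profiles[current].get("DFR-Next-Version")
    return chain


profiles = {
    1: {"DFR-Version-Root": 1, "DFR-Next-Version": 2, "Version-Name": "draft"},
    2: {"DFR-Version-Root": 1, "DFR-Next-Version": 3, "Version-Name": "revised"},
    3: {"DFR-Version-Root": 1, "DFR-Next-Version": None, "Version-Name": "final"},
}
family = version_family(profiles, 2)  # starts from any member of the family
```

Starting from any member, the function recovers the whole chain, which is the information needed to judge a document's status of transmission relative to its earlier and later versions.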

CHAPTER FOUR A THESAURUS OF RECORD ATTRIBUTES

The Thesaurus is designed to equate document attributes, tags and structural features as identified in ODA, DFR and SGML with general and special diplomatic concepts. In so doing, the Thesaurus treats diplomatics as a de facto international standard of document management in its own right, that is, one sanctioned by widespread usage rather than any standards-setting body.

The Thesaurus relates attributes (or tags) in the document profile, as well as structural characteristics of documents (such as the logical structure), of ODA, DFR, and SGML TEI to standard diplomatic concepts. It also proposes new diplomatic elements that are not identified with any open document exchange standard at this time. These proposed elements are envisioned as attributes of the document profile and are indicated separately by underlining in bold, e.g. access control. The Thesaurus also includes general diplomatic concepts (such as the juridical system and the written document) that are too abstract to be directly capturable as tags or attributes in ODA, SGML or DFR but which are nonetheless necessary to an understanding of the theoretical relationship between diplomatics, electronic records management (ERM), and open document exchange.

KEY

S     SYNONYMOUS TERM OR ATTRIBUTE/TAG
N     NARROW TERM OR ATTRIBUTE/TAG
B     BROAD TERM OR ATTRIBUTE/TAG
R     RELATED TERM OR ATTRIBUTE/TAG
=     SEE ALSO
ARCH  ARCHIVAL (INCLUDING RECORDS MANAGEMENT) TERM
DIP   DIPLOMATIC TERM
ERM   ELECTRONIC RECORDS MANAGEMENT TERM
ODA   OPEN DOCUMENT ARCHITECTURE STANDARDS
SGM   STANDARD GENERALIZED MARKUP LANGUAGE
DFR   DOCUMENT FILING AND RETRIEVAL STANDARD
PRP   PROPOSED NEW DIPLOMATIC TERM
AU    DEFINITION SUPPLIED BY AUTHOR.
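The relationship codes in the key lend themselves to a simple data structure. The following is a minimal sketch of how one Thesaurus entry might be held in memory; the class `ThesaurusEntry` and its methods are hypothetical, and the example values are taken from the access control entry of the Thesaurus.

```python
from dataclasses import dataclass, field

# Relationship codes from the key.
RELATION_CODES = {"s": "synonymous", "n": "narrow", "b": "broad", "r": "related"}


@dataclass
class ThesaurusEntry:
    term: str
    # relation code -> source code (DIP, ODA, DFR, ...) -> list of terms
    relations: dict = field(default_factory=dict)

    def relate(self, code, source, *terms):
        """Record that `terms` from `source` stand in relation `code` to this entry."""
        if code not in RELATION_CODES:
            raise ValueError(f"unknown relation code: {code!r}")
        self.relations.setdefault(code, {}).setdefault(source, []).extend(terms)

    def terms_from(self, source):
        """All terms contributed by one source, keyed by relation code."""
        return {code: by_source[source]
                for code, by_source in self.relations.items()
                if source in by_source}


entry = ThesaurusEntry("access control")
entry.relate("n", "DFR", "Access-List")
entry.relate("n", "ODA", "Access Rights")
entry.relate("n", "SGML", "Access Rights")
entry.relate("b", "ERM", "security")
entry.relate("r", "DFR", "Authentication")
```

Keying each relation by its source code keeps the standards separate, so that the equivalents offered by a single standard can be read off without scanning the whole entry.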


RULES

1. "No standards equivalent" or "No diplomatics equivalent" means that there is no term, attribute, tag, or structural characteristic of documents at present identified within open document exchange standards or diplomatics that is synonymous, or is an exact match, with the entry term. In cases of partial equivalence, the term is assigned to either the Narrow term or the Broad term.

2. The citations used for each definition indicate a numbered source followed by the page reference, e.g. (1)-156 = ACCIS, Management of electronic records, page 156.

3. Where diplomatic terms have no actual attribute/tag or conceptual equivalent in ODA, SGML or DFR, the Thesaurus attempts to capture them at a broader level where there is some equivalency. For instance, constitutive acts have no equivalents except at the broader level captured in the term acts.

4. Both diplomatic terms and standards attributes/tags or document features are given for each level of the Thesaurus hierarchy, but where no standards equivalents or occurrences may be found, then only the diplomatic concept will be given.

5. Because of the lack of precise definitions in the Text Encoding Initiative, all SGML tags have been correlated with diplomatic concepts on an approximate basis.

6. All standards terms, including attribute names, tags, and concepts, are italicized; diplomatic and archival terms are in bold face.

7. Where the names of DFR and ODA attributes are synonymous (but not necessarily the meaning), the ODA spelling (which is unhyphenated) is preferred for the entry heading.
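The citation convention in rule 2 is regular enough to be read mechanically. The sketch below assumes that a citation is a bracketed source number, a hyphen, and one or more comma-separated page numbers; the regular expression, the function names and the one-entry `SOURCES` table (filled only with the source named in rule 2) are hypothetical. Citations without a page reference, such as (3) or (AU), are deliberately not matched.

```python
import re

# "(1)-156", "(22)-6,7" or "(8)- 7, 18": source number, hyphen, page list.
CITATION = re.compile(r"\((\d+)\)\s*-\s*(\d+(?:\s*,\s*\d+)*)")

# Only the source identified in rule 2 is filled in here.
SOURCES = {1: "ACCIS, Management of electronic records"}


def parse_citations(text):
    """Return (source_number, [pages]) for every citation found in `text`."""
    results = []
    for match in CITATION.finditer(text):
        source = int(match.group(1))
        pages = [int(p) for p in re.split(r"\s*,\s*", match.group(2))]
        results.append((source, pages))
    return results


def describe(source, pages):
    """Spell a citation out in the manner of rule 2."""
    title = SOURCES.get(source, f"source {source}")
    label = "page" if len(pages) == 1 else "pages"
    return f"{title}, {label} {', '.join(str(p) for p in pages)}"


cited = parse_citations("(1)-156")
spelled_out = describe(*cited[0])
```

Source numbers that are not in the table are reported by number only, since the numbered source list is not reproduced here.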


Abstract
ODA - An attribute of Document Description that contains information to summarize a document.(5)-9
SGML TEI - A summary of the content of the document as continuous prose.(7)-189

access control
DIP - Proposed definition: The right of entering an electronic data store with specific privileges to view or process data in some way, or to administer or modify the system itself. The authorization usually takes the form of a list of those who are entitled to access together with their specific privileges. Access is permitted on the basis of authentication. The access list may permit entry to a system, an application, or to individual objects, such as a document, or groups of objects. (AU)
s: No diplomatic equivalent.
n: DFR Access-List
   ODA Access Rights
   SGML Access Rights
b: ERM security
r: DFR Authentication

Access-List
DFR - This attribute identifies the security subjects allowed to access this DFR-Object specifying for each of them their respective access rights.(8)-88
SEE access; reliability; security

Access Rights
ODA - This attribute specifies the access right(s) to the document relating to its privacy, as defined by the current owner(s) of the document. (4)-14
SGML TEI - An element of the local file system reference in the tag set for the front matter of office documents that is intended to tag electronic access rights. (7)-189.
SEE access; reliability; security

acts
DIP - Movements of the will aimed to create, maintain, modify, or extinguish situations.(22)-3
• Among human facts in general, the special type of fact which results from a will determined to produce an act is called an action or act. The operation of will distinguishes an act from any other general fact. Therefore, all acts are also facts, but only facts generated by a determined will are acts. All archival documents express a fact which may be considered the transaction embodied in the record.(3)

mere act: An act in which the will is limited to the accomplishment of the act, without the intention of producing any other effect than the act itself: effect and act coincide.(11)-7

simple act: When the power of accomplishing the act is concentrated in one individual or organ we have a simple act. The will to produce the act is one will.(4)-21

collegial act: When the power of accomplishing the act is concentrated in a number of individuals acting with one will (for example, a circular signed by a number of ministries). (11)-13. A collegial act is a form of simple act.

collective act: Those acts produced by the identical wills of different individuals or organs, and resulting in one document.(11)-9

contracts: When the power of accomplishing the act belongs to two or more interacting parties (individuals, public bodies, states, state-and-individuals) we have a contract. Notwithstanding the difference in motivation and interests between the parties, their wills converge in one, aimed at producing one act. (11)-13

multiple act: Acts produced by the will of the same individual or organ but directed to different individuals or organs, and resulting in one document, e.g. a document giving merit increases to a number of employees. (11)-13

compound act: Acts composed of many different acts produced by the same individual or organ or by a number of individuals or organs, but all essential to the formation of some final act of which they are partial elements. The partial acts may concern the same or different subjects and may respond to convergent or contrasting interests, but each results in documents which are all necessary to the formation of the final document. The final product of the compound act may be further divided into continuative, complex acts and acts on procedure.(11)-14

acts on procedure: A form of compound act. When the final act derives from a series of different acts (which may be simple, compound, collegial, or collective, in sequence or parallel) produced by a number of different individuals and/or organs, which have equal or different motivation or interests and accomplish different functions. However, all these partial acts have the common aim of making possible the accomplishment of the final act.(11)-14

complex act: A type of compound act that occurs when individuals or organs which may have different motivation and interests but pursue the same function, produce a number of simple acts having the same content, all necessary to the accomplishment of the final act, e.g. all the series of approval needed for the appointment of a Dean.(11)-14

continuative act: A type of compound act in which the same individual or organ needs to manifest the same will more than once in order to produce the final and definitive act, so that the partial acts constituting the compound act are all identical, but the documents resulting from them are not (for example, a City Council's three subsequent deliberations of the same by-law.)(11)-14

Acts in general:
s: ST No standards equivalent
n: DIP SEE Glossary - acts - for a classification of different types of acts
b: DIP facts; juridical act; juridical fact
r: DIP functions; phases of procedure; procedure; process; transaction
   PRP Business Process linked to ODA Document Type

addressee
DIP - The person(s) to whom the document is directed.(22)-2
• Every document has two theoretical addressees: the person to whom the act is directed, and the person to whom the document itself is directed. These are not necessarily the same. There is no document without an addressee because documents result from actions and any action falls on somebody. An action may be directed to an entire collectivity, and in such a case the addressee of the related document may be all the people, or a social, ethnic, and religious group and so forth. (17)-6
s: ST No standards equivalent
n: DIP addressee of document; addressee of act
   ODA Distribution List
   SGML In Reply To; Primary Recipient; Secondary Recipient
b: DIP persons
r: DIP writer; author

addressee of the act
SEE addressee

addressee of the document
SEE addressee

annotations
DIP - Extrinsic elements consisting of additions to the record after its compilation. They can be distinguished in categories in relation to the procedural moment in the treatment of the affair in which they were added to the record in question. (22)-6
• For an electronic document, the document profile is the container of all annotations, but also of some elements of intellectual form.(22)-22
SEE annotations of execution; annotations of handling; annotations of management

annotations of execution
DIP - These are added in the execution phase, when the act is put into effect. They comprise annotations of authentication and registration. Authentication is the express, legal recognition that a record or the signature(s) on it is what it purports to be (particular to certain record forms). Registration is the reference to a transcription of the record made in a register by an office different from the one creating the record (particular to certain record forms).(22)-6
s: ST No standards equivalent
n: ODA Authorization; Expiry Date and Time; Release Date and Time; Start Date and Time
   SGML Authorizing Person
b: DIP annotations
r: DIP annotations of handling; annotations of management

annotations of handling
DIP - These are added during the handling of the matter. They comprise instructions, such as the mention of previous or following actions, directions for transmission, disposition, classification etc. They also include dates of hearings or readings, and signs added beside the text such as question marks or checks.(22)-6
s: ST No standards equivalent
n: ODA Additional Information; Revision History
   SGML User Info
b: DIP annotations
r: DIP annotations of execution; annotations of management

annotations of management
DIP - These are added to the document as means of controlling the document itself. They include registry numbers, classification codes identifying its relationship with other documents in the receiving or generating office, cross-references to related files, date of receipt, and name of the recipient, usually the stamp of the receiving office. (22)-6,7
s: ST No standards equivalent
n: DFR External-Location; Keywords; Local-Filing-Date-and-Time; Membership-Criteria; Number-of-Group-Members; Parent-Identification; Pathname; Purge-Date-and-Time; Referent-Deleted; Unique-Permanent-Identifier (UPI); User-Reference; Resource-Limit; Resource-Used
   ODA Local File References; Authorization; Security Classification; Abstract; interchange format class
   PRP Access Control
   SGML Document Reference (if defined as unique for each document); Local File System: Reference, File name, Location; Front Matter; encoding instructions; file header; revision history; User Comments; Source Description; Abstract; Authorizing Person; Sensitivity; Bibliographic File description
b: DIP annotations
r: DIP annotations of execution; annotations of handling

appreciation
DIP - A sort of prayer for the realization of the content of the document, e.g. "looking forward to, I appreciate etc."(19)-12
No standards occurrence; SEE protocol

archival bond
DIP - The relationship that, because of the circumstances of their creation, records have with their creator, with the activity in which they participate, and among themselves. The archival bond is originary (it comes into existence when the record is made or received), necessary (it exists for every record), and determined (it is characterized by the purpose of the record). (22)-4.
s: No standards equivalent
n: DFR Group; group-content; group-interrelationship; group-member; Root-Group; User-References-to-Other-Objects; Create-Date-and-Time; UPI; Document-Type; Parent-Identification; User-Reference
   ODA Reference to Other Documents; Document Type; Creation Date and Time; document description; document class description
   PRP business process
   SGML Creation Date; Document Reference; References to Other Documents; Additional User Specific Codes
b: DIP archival document
r: ARCH provenance

archival document
DIP - A document created or received by a physical or juridical person for the achievement of its purposes or in the exercise of its functions.(22)-4. All archival documents have facts, a purpose, consequences, and are the result of a genetic process.(3)
s: No standards equivalent
n: No standards occurrence
b: ERM record; document; object
r: DFR Document-Type; Object
   DP transaction
   DIP transaction
   ODA Document Type; Object Class
   SGM Document

attestation
DIP - The subscription of those persons who took part in the issuing of the record (i.e. the author, writer, countersigner, and/or witnesses). It might take the form of signatures. The attestation is the substance and core of the eschatocol. (19)-14,15
s: No standards equivalent
n: ODA logical object (where this is a signature); User Specific Codes
b: DIP persons
r: DIP eschatocol

attribute
DP - "A property or characteristic of one or more entities, for example, colour, weight, sex." In an EDMS, kinds of attributes include data attributes, display attributes and user attributes.(1)-138
• Information (including text and voice) that can be interchanged with a document but which will only be presented to the recipient if particular conditions arise, such as an explicit request.(9)-157
DFR has several different types of attributes: those that a server will execute on a mandatory basis (basic) and those that it executes optionally (extension). The various abstract operations also have their own attributes: search, security, produce and consume.(8)-7, 18.
• DFR attributes are managed independently of the content. Attributes can be read or changed. If an attribute is changed the content of that object is not changed. Attributes characterize an object, that is, each attribute provides a piece of information about, or derived from, the object to which it corresponds. Attributes affect storage and retrieval of an object and control access to it.(8)-21
• A data item that identifies a DFR-object, describes its DFR-content, helps control access to it, or in some way is associated with the DFR-object. (8)-5
ODA - An element of a constituent of a document that has a name and a value and that expresses a characteristic of this constituent or a relationship with one or more constituents.(2)-3
SGML - Information which is in some sense descriptive of specific element occurrences but not itself regarded as an element. (7)-24.

SEE ALSO tag

Attributes-Create-Date-and-Time
DFR - A DFR-specific, mandatory attribute, part of the basic attribute set, that contains the date and time when the mandatory attributes of a DFR object were stored in a DFR document store. A DFR server sets it to the current date and time during the create abstract operation.(8)-85
SEE authentic record; status of transmission

Attributes-Modified-By
DFR - A DFR-specific, mandatory attribute, part of the basic attribute set, that identifies the DFR user which has most recently modified the DFR attributes of a DFR object. It can be read only by a DFR user having at least extended-read access rights.(8)-86
SEE author; status of transmission; writer

Attributes-Modify-Date-and-Time
DFR - A DFR-specific, mandatory attribute, part of the basic attribute set, that contains the date and time when the attributes of a particular DFR object were last modified in a DFR document store. When a DFR object is created, this attribute is set to the current time. Subsequently, the DFR server maintains the attribute; it is not updated when the attributes that the server itself modifies or deletes are modified or deleted, i.e. DFR Pathname, DFR Parent Identification, DFR Referent Deleted, DFR Resource Used, and DFR Number-of-Group-Members.(8)-86
SEE authentic copy; status of transmission

authentication
DIP - The legal recognition that a signature is affixed by, and belongs to the person whose name it expresses, that a document is what it purports to be, or that a copy conforms to the original. Authentication may refer to one or more signatures, to an entire document, or to a copy of a document.(3)
s: DFR Authentication
   ODA Authorization
n: DIP authentic copy; copy-in-the-form-of-an-original; countersigner; witness
b: DIP authenticity
   ERM security
r: DIP author; copy; original; writer
   ERM access

authentic copy
DIP - A copy certified by officials authorized to execute such a function, so as to render it legally admissible in evidence. Also included are inserts - quoted or reported - in subsequent original documents in order to renew their effects, or because they constitute precedents of the legal act attested in the subsequent originals. The perfect form of insert is that called vidimus. An authentic copy in general, and a vidimus in particular, only guarantees the conformity of the copy to the original text. Thus, an authentic copy in the diplomatic sense is also an authentic copy in the legal sense but neither in diplomatics nor in law is it an authentic document. The authentication provides the copy with validity and the effects of the original, not with its forms, and it does not influence diplomatic, legal, or historical genuineness.(4)-21
s: No standards equivalent
n: DFR User-Reference; User-References-to-Other-Objects where it is a link
   DIP vidimus
   ODA References to Other Documents
b: DIP copy
r: DIP status of transmission

authentic record
DIP - A record whose genuineness can be assumed on the basis of one or more of the following: mode, form, and state of transmission, and manner of preservation and custody. (22)-12
s: No standards equivalent
n: DIP form of transmission; mode of transmission; status of transmission
   DFR Attributes-Create-Date-and-Time; Content-Create-Date-and-Time; Create-Date-and-Time; Attributes-Modify-Date-and-Time; Document-Date-and-Time; Next-Version; Previous-Versions; Revision-Date-and-Time; Status; User-References-to-Other-Objects; Version-Root
   ODA Document Date and Time; Status; interchange format class; ODA Version; Revision History; Security Information
   SGML Authorizing Person
b: DIP authenticity; authentication
r: DIP forgery
   EDMS security


authenticity
DIP - The extent to which a document is what it purports to be.(4)-17
s: No standards equivalent
n: DIP authentication; authenticity - diplomatic; authenticity - historical; authenticity - legal
b: DIP juridical system
r: DIP genuine document
   ERM security

Author
DIP - The person(s) competent for the creation of the document which is issued by them personally, by their command, or in their name. Usually, the author of a document coincides with the author of the act put into being or referred to by the document, because the person whose will has given origin to the act documented tends to be also the person competent for the creation of the related documentation.(17)-5,6
s: ODA Author = author of document; Owners = author of act
   DFR Authors = author of document; Owners = author of document; Created-By
   SGML Author; Authorizing Person
n: DFR Content-Modified-By
   DIP author of the act; author of the document
b: DIP persons
   ODA Originators
r: DIP attestation; countersigner; witness; writer
   DFR Organizations; Preparers
   ODA Organizations; Preparers
   SGM Preparer; Originator

Authors
DFR - A non-DFR specific, optional attribute, part of the extension attribute set, that defines name(s) of the person(s) and/or organization(s) responsible for the preparation of the intellectual content of the document. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, authors.(8)-91
ODA - An attribute of originators that identifies the name(s) of the person(s) and/or organization(s) responsible for the preparation of the intellectual content of the document. The value of this attribute consists of one or more entries, each with two optional parameters: personal name of author, and author's organization.(5)-12
SGML TEI - Names of those responsible for the intellectual content of the work, as given on the title page.
SEE author; juridical person; persons
SEE ALSO owners; preparers; originators; organizations


Authorization ODA - An attribute of security information that identifies the person or organization approving or authorizing the document.(5)-14 SEE annotation of handling; deliberation control; security

Authorizing Person SGML - An attribute of security information that identifies the person or organization approving or authorizing the document.(7)-190 SEE reliability; security SEE ALSO persons; juridical person

Back Matter SGML - That part of the core structural features of a document consisting of bibliography, glossary, and index features.(7)-88 Not relevant to records.

basic processing model DP - A model of open document interchange in which a source document goes through an automatic processing step in order to be received as a result document. But source and result documents are generated from a pre-determined document type which must be mapped to the specifications of the automatic processor in order to be transmitted and interpreted correctly.(9)-152 SEE ALSO source document; Document Type; extended processing model

Bibliographic File Description SGML TEI - Part of the file header intended to provide a citation to the source document. SEE annotations of management

business process DIP - A term that might be used to capture the concept of acts as carried out by any one juridical person in the course of their business by uniting the type of acts with the procedures needed to put them into effect. SEE acts

chronological date DIP - The time of the compilation of the document and/or of the action which the document concerns.(22)-5 s: ODA Dates and Times n: DFR Attributes-Modify-Date-and-Time; Content-Modify-Date-and-Time; Attributes-Create-Date-and-Time; Content-Create-Date-and-Time DIP moment of action; moment of documentation ODA Creation-Date-and-Time; Local Filing Date and Time; Expiry Date and Time; Start Date and Time; Purge Date and Time; Release Date and Time; Revision History


SGML Document Date; Revision History b: ODA Dates and Times r: DIP topical date

competence DIP - The authority and capacity of carrying out an act.(17)-8 s: DIP responsibility n: DIP author of the act ODA Authorization; Preparers SGML Originator b: DIP persons r: DIP acts; reliability

completeness DIP - The quality of a record that has all the elements of form required by the juridical system in which it is created. Completeness is conferred on a record by the presence of all required elements of its intellectual form, specifically the features of content articulation and the annotations.(22)-5 Completeness of electronic records is conferred by the presence of the following elements of intellectual form: chronological date, topical date, entitling, attestation, addressee, receivers, title or subject, and disposition.(22)-21 s: No standards equivalent n: DFR Document-Type; document architecture class; document content; UPI; Object-Class; Document-Date-and-Time; Title; Owners; Authors; Status; Distribution-List; Subject ODA document body; document profile; content; document architecture; Title; Start Date and Time; Distribution List; Status; Subject; Authors; Owners SGML document type declaration (DTD); document instance; Title; Creation Date; Originator; Author; Primary recipient; Document Status; annotations b: DIP reliability

r: DIP authenticity; archival document or record

complimentary clause DIP - A brief formula expressing respect, such as "sincerely yours". SEE text

compound document DP - A document containing a mixture of content types that may include text, sound, and raster or vector images.(1)-143 SEE ALSO User-References-to-Other-Objects; User-Reference
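The completeness entry above lists eight elements of intellectual form whose presence confers completeness on an electronic record. As a minimal sketch only, that list can be read as a checklist applied to a document profile. The element names are taken from the entry; the function names, the dictionary representation of a record, and the rule that an empty value counts as absent are assumptions made for illustration and are not part of any of the standards discussed.

```python
# Element names come from the completeness entry, (22)-21.
# Everything else (names, dict layout, "empty means absent") is hypothetical.
REQUIRED_ELEMENTS = (
    "chronological date",
    "topical date",
    "entitling",
    "attestation",
    "addressee",
    "receivers",
    "title or subject",
    "disposition",
)


def missing_elements(record):
    """Return the required elements that are absent or empty in the record."""
    return [name for name in REQUIRED_ELEMENTS if not record.get(name)]


def is_complete(record):
    """A record is treated as complete when no required element is missing."""
    return not missing_elements(record)


# Hypothetical usage: a record lacking an attestation is reported as incomplete.
sample = {name: "present" for name in REQUIRED_ELEMENTS}
sample["attestation"] = ""
print(missing_elements(sample))
```

A real system would draw these values from whichever profile attributes it maps to each element, which is precisely the mapping the narrower-term lists in the entry attempt to suggest.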

conceptual document DFR - A set of DFR-documents, considered to be "different versions" of the same document.(8)-5 SEE draft

constituent ODA - A set of attributes that is one of the following types: a document profile, an object description, an object class description, a presentation style, a layout style or a content portion description.(2)-4 SEE ALSO content articulation

Content DFR - The prime information content of a DFR-object. The nature of the DFR-content depends on the DFR-object class of the DFR-object.(8)-5 ODA - The information conveyed by the document, other than the structural information, and that is intended for human perception.(2)-4 SEE content SEE ALSO content architecture; content architecture class; content architecture level; content attributes; content articulation; content element; content portion; Content-Create-Date-and-Time; Content-Modified-By; Content-Modify-Date-and-Time

content DIP - The places, names and dates that may be mentioned apart from the syntactical and objective elements of form (persons, intrinsic and extrinsic elements, facts and procedures) that are intended to give a document meaning and effect. s: No standards equivalent n: DFR document content; Content-Create-Date-and-Time; Content-Modified-By; Content-Modify-Date-and-Time ODA Content; content architecture class; content portion; Document Size; Number of Pages; Languages SGML element declaration b: DIP intellectual form DFR content r: DFR object class DIP intrinsic elements ODA document body

content architecture ODA - Rules for defining the internal structure and representation of the content of basic components in terms of a set of content elements, attributes and control functions, and guidelines for the presentation of the content. (2)-4


• The rules governing the more detailed internal structure of a content portion associated with a logical or a layout object. The rules depend on the type of content and there can be only one content architecture for each object.(2)-16 SEE content articulation

SEE ALSO content architecture class; content architecture level

content architecture class ODA - The rules for defining the internal structure and representation of the content of basic components in one set of forms defined for each type of content element. Examples are formatted form, processable form and formatted processable form.(2)-4 SEE content articulation

SEE ALSO content architecture; content architecture level

content architecture level ODA - An identified subset of the features pertaining to a content architecture class.(2)-5 SEE content articulation

SEE ALSO content architecture; content architecture class

content articulation DIP - The elements of the writing and their arrangement, that is, what determines the distinction between a letter and a memo, or a chart and a map.(22)-2 s: No standards equivalent n: DFR document content; Document-Type ODA document constituents, generic layout structure; specific layout structure; generic logical structure; specific logical structure; layout styles; presentation styles; external document class; resource document; content architecture; content portion description; content element; document characteristics: content architecture classes; interchange format class; ODA version; non-basic document characteristics: profile character sets; comment character sets; alternative representation character sets; document constituent attributes: page dimensions; medium types; layout paths; protection; block alignments; fill orders; transparencies; colours; borders; page positions; types of coding; non-basic document characteristics: profile character sets; comments character sets; alternative representation; coding attributes; presentation features; non-basic structure characteristics: number of objects per page; additional document characteristics: unit scaling; fonts listing. SGML content articulation; document type definition; document type; document type declaration; document type definition; element; element declaration; element definition. b: DIP script; intellectual form r: DIP content configuration; annotations

content configuration DIP - The mode of expression of the content, e.g. graphics, text, images, or a combination.(22)-2 s: No standards equivalent n: DFR document content ODA content elements; presentation features; presentation styles SGML entity b: DIP intellectual form; script r: DIP content articulation; annotations

content attributes ODA - A group of document management attributes comprising Document-Size, Number-of-Pages, and Languages.(5)-13,14 SEE extrinsic elements; languages; content articulation SEE ALSO layout; content portion; content element; content

Content-Create-Date-and-Time DFR - A DFR-specific attribute, part of the basic attribute set, that contains the date and time when the DFR content of a DFR object was stored in a DFR document store. The server sets the attribute to the current date and time when the DFR content of the object is created.(8)-86 SEE chronological dates; moment of documentation SEE ALSO Content-Modified-By; Content-Modify-Date-and-Time

content element ODA - A basic element of the content of a document.(2)-5 • For content consisting of character text, the content elements are characters. In the case of images or graphics, the content elements are picture elements (also called pels) or geometric graphics elements (lines, arcs, polygons, etc.).(2)-15 SEE content articulation

SEE ALSO layout; content portion; content attributes

Content-Modified-By DFR - A DFR-specific attribute, part of the basic attribute set, that identifies the DFR user who has been most recently responsible for modifying the DFR content of a DFR object. It can only be read by a DFR user having at least extended-read access rights.(8)-87 SEE author; draft; writer SEE ALSO Content-Create-Date-and-Time; Content-Modify-Date-and-Time

Content-Modify-Date-and-Time DFR - A DFR-specific attribute, part of the basic attribute set, that contains the date and time when the DFR content of a DFR object was last modified in a DFR document store.


When a DFR object is created, this attribute is set to the current time. Subsequently, the DFR server maintains the attribute.(8)-86 SEE archival annotation; chronological dates; draft SEE ALSO Content-Create-Date-and-Time; Content-Modified-By

content portion ODA - The result of partitioning the content of a document according to its logical and/or layout structure.(2)-5 • A set of related content elements that belong to one basic logical object (if the document has any logical structure) and one basic layout object (if the document has any layout structure). It follows that a basic logical object has associated with it one or more content portions as does a layout object, and that any basic or composite object (logical or layout) has associated with it an integral number of content portions. Logical and layout objects do not always correspond (e.g. the arrangement of the content into sections and paragraphs need not correspond to pages).(2)-15 SEE content articulation

SEE ALSO layout; content element; content attributes

Contents SGML TEI - A table of contents, specifying the structure of a work and listing its constituents. Part of the front matter.(7)-45 SEE content

copy DIP - If a document is not an original or a draft, it is a copy.(4)-20 • Diplomatics recognizes several different types of copies:

copy-in-the-form-of-an-original: A type of copy that is created when two originals of the same document, addressed to the same person and having the same date, are sent to the addressee in two subsequent deliveries. The first delivery is considered to be the original, the second delivery is a copy in the form of an original.(4)-19

imitative copy. A form of copy which reproduces, completely or partially, not only the content but also the forms of a document, including extrinsic elements such as layout, script, special signs etc. of the original. A modern example is the photocopy. (4)-20

inserts or insets. SEE authentic copy

pseudo-original. A copy that attempts to imitate in every respect the extrinsic and intrinsic forms of an original but is not legally authorized and is therefore a forgery. (4)-21

simple copy: A simple copy is constituted by the mere transcription of the content of the original (e.g. notes taken from a report) and cannot have legal effects. This is the most common type of copy and is usually compiled as an aid to memory. (4)-21

Copies in general: s: No standards equivalent n: DFR Reference = virtual copy consisting of pointers to an object; Status


n: DIP authentic copy; copy-in-the-form-of-an-original; imitative copy; pseudo-original; simple copy; receivers ODA Distribution List; Reference to Other Documents; Status b: DIP status of transmission SGML Document Status r: DIP original; draft DP soft copy

core structural features SGML TEI - Basic structural features which are common to a large number of texts and which may be said to establish their principal gross structure or shape, e.g. the parts of a book: front matter, body, back matter.(7)-72 SEE office documents; header; script; layout; logical structure; text

core tag set SGML

SEE ALSO office documents

countersigner DIP - The signature following the subscription of the writer which has the special function of validating the physical and intellectual form of the document and of guaranteeing that the document was created according to the established procedure and signed by the appropriate person. The countersigner assumes responsibility only for the regularity of formation of the document and for its forms; that is, not for its content, and not for the wording chosen to express content, but for the presence in the document of all the elements required for its effectiveness.(17)-8 s: No standards equivalent n: DFR Authentication DIP attestation ODA Authorization b: DIP persons r: DIP reliability

Created-by DFR - A DFR-specific attribute, part of the basic attribute set, that identifies the DFR user which created the DFR object. It is not modified when the object is moved and can only be read by a DFR user having at least extended read-access right.(8)-86 SEE author; writer; juridical person SEE ALSO Dates and Times
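The DFR entries for Created-By, Content-Create-Date-and-Time, Content-Modify-Date-and-Time and Content-Modified-By all describe attributes that the server, rather than the user, sets and maintains. The following is a minimal sketch of that behaviour as the entries describe it, not an implementation of the DFR access protocol. The class names, method names and the injectable clock are hypothetical, and leaving Content-Modified-By empty until the first modification is an assumption, since the entry does not say what the attribute holds at creation.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class DfrObject:
    # Field names mirror the DFR attribute names; the class itself is hypothetical.
    content: bytes
    created_by: str
    content_create_date_and_time: datetime
    content_modify_date_and_time: datetime
    content_modified_by: Optional[str] = None  # assumption: empty until first change


class DocumentStore:
    """Toy document store that maintains the attributes on the user's behalf."""

    def __init__(self, clock=None):
        # The clock is injectable so the behaviour can be checked deterministically.
        self._clock = clock or (lambda: datetime.now(timezone.utc))
        self._objects = {}

    def create(self, name, content, user):
        now = self._clock()
        # On creation both timestamps are set to the current time.
        obj = DfrObject(
            content=content,
            created_by=user,
            content_create_date_and_time=now,
            content_modify_date_and_time=now,
        )
        self._objects[name] = obj
        return obj

    def modify(self, name, content, user):
        obj = self._objects[name]
        obj.content = content
        # Only the modification attributes change; created_by and the
        # creation timestamp are left untouched.
        obj.content_modified_by = user
        obj.content_modify_date_and_time = self._clock()
        return obj
```

Because the creator and the creation time are never rewritten by later changes, attributes of this kind are the ones the thesaurus relates to the diplomatic concepts of author, writer and moment of documentation.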

Creation-date-and-time DFR - An optional attribute, part of the extension attribute set, that specifies the date and, optionally, the time of day when the document was created. In the case of an ODA document, the value of this attribute can be taken from the document profile, where it is the equivalent of ODA creation date and time.(8)-90


ODA - An attribute of dates and times that specifies the date and, optionally, the time of day when the document was initially created.(5)-10 SEE dates; moment of action; moment of documentation

crystals SGML - Small objects with internal structure containing semantically constrained sorts of data. Typical crystals for office documents include names, addresses, and organizations. The address crystal consists of sub-elements for city, country, state, street, postbox, telephone etc.(7)-88,89

custody ARCH - The responsibility for care of archival material based on its physical possession.(21)-6 Custody does not always include legal ownership or the right to control access to records.(14)-9 SEE reliability

SEE ALSO access; custodial history; security

custodial history ARCH - The succession of offices or persons who had custody of a body of archival materials from its creation to its acquisition by an archives or manuscript repository. SEE security; reliability SEE ALSO provenance

dates

SEE chronological date; topical dates

Dates and Times ODA - A group of document management attributes that comprises Document Date and Time, Creation Date and Time, Local Filing Date and Time, Expiry Date and Time, Start Date and Time, Purge Date and Time, Release Date and Time, and Revision History.(5)-10,11

SEE dates

declarations SGML - A formal statement in simple syntax used to define different levels of data in a document. There are several types in SGML. (AU) SEE script; document type; content SEE ALSO element type declaration; entity; document type declaration; SGML declaration

descendant DFR - For a given DFR-group, any of the DFR-group members, and recursively, any descendant thereof.(8)-5 SEE draft

discretionary documents DIP - Documents which refer to an act where the document is not needed to bring the act into existence (dispositive document) or to prove the existence of an oral act (probative document) (\\)-l SEE narrative documents; supporting documents

disposition DIP - That part of the text of a document in which the author expresses their will or judgment. Here, the fact or act is expressly enunciated, usually by means of a verb able to communicate the nature of the action.(19)-12 SEE text

dispositive document DIP - If the purpose of the written form was to put into existence an act, the effects of which were determined by the writing itself (that is, if the written form was the essence and substance of the act), the document was called dispositive, e.g. contracts and wills.(29)-9 s: No standards equivalent n: DFR Document-Type ODA Document Type; document architecture class; User Specific Codes PRP business process SGML Document Type, document type definition b: DIP juridical relevance SGML office documents r: DIP narrative document; probative document; supporting document

Distribution List ODA - An attribute of other user information that specifies a list of intended recipients of a document. It has two parameters, "personal name of recipient" and "recipient's organization."(4)-12 SEE addressees; receivers

document DIP - The expression of ideas in a form both objectified (physical) and syntactic (governed by rules of arrangement). A document's components are: a message, a medium, an intellectual codification of ideas (information configuration: text, image, etc.), and a logical arrangement of the internal elements (intellectual form).(16)-9 s: No standards equivalent n: DFR document; object; object-class; object-content ODA document; document profile; generic document SGML document; document type declaration b: DIP written document DFR Group; Group-Member


r: DFR document-type ODA document type; document class; document class description SGML document type; document class; core tag set

document DFR - A structured amount of information that can be filed, retrieved, and interchanged consisting of a DFR-object-class of the DFR-object.(8)-5 • A DFR document consists of a DFR document content together with attributes which are associated with the content. A DFR document is contained in one document store. Consistency between copies is outside the scope of the standard and is the responsibility of the user.(8)-14 ODA - A structured amount of information intended for human perception, that can be interchanged as a unit between users and/or systems.(2)-5 SGML - A collection of information that is processed as a unit. A document is classified as being a particular document type.(23)-10 • A prologue and a document instance.(5)-29

SEE document

document architecture ODA - Rules for defining the structure of documents, in terms of a set of components and content portions, and the representation of documents in terms of constituents and attributes.(2)-5 • The structural information of a document consisting of the set of one or more of the following structures: specific logical structure, specific layout structure, generic logical structure and/or generic layout structure.(2)-5 • The document architecture provides for the representation of documents in three forms: formatted form, processable form, and formatted processable form.(2)-13 • The key concept in the document architecture is that of structure. Document structure is the division and repeated subdivision of the content of a document into increasingly small parts. The parts are called objects. The structure has the form of a tree. The document architecture permits two structures to be applied to a document: a logical structure and a layout structure. Either or both structures may be applied to a given document.(2)-14 SEE content articulation SEE ALSO document architecture attributes; document architecture level; document architecture class
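The idea that one body of content can carry two independent tree structures is easier to see in a small model. The sketch below is an illustration under stated assumptions, not ODA's interchange format: the class names, the `subordinates` field (borrowed from the attribute name mentioned under document architecture attributes) and the traversal helper are hypothetical. It builds a logical tree (section and paragraphs) and a layout tree (pages and blocks) over the same content portions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContentPortion:
    text: str


@dataclass
class StructureObject:
    """A node in either tree; basic objects hold portions, composites hold subordinates."""
    name: str
    subordinates: List["StructureObject"] = field(default_factory=list)
    portions: List[ContentPortion] = field(default_factory=list)

    def content_portions(self):
        """Collect the portions of this object and all its subordinates, in order."""
        found = list(self.portions)
        for child in self.subordinates:
            found.extend(child.content_portions())
        return found


def build_example():
    first = ContentPortion("First paragraph.")
    second = ContentPortion("Second paragraph.")
    # Logical structure: one section containing two paragraphs.
    logical_root = StructureObject("document logical root", [
        StructureObject("section", [
            StructureObject("paragraph", portions=[first]),
            StructureObject("paragraph", portions=[second]),
        ]),
    ])
    # Layout structure: the same portions placed in blocks on two pages.
    layout_root = StructureObject("document layout root", [
        StructureObject("page", [StructureObject("block", portions=[first])]),
        StructureObject("page", [StructureObject("block", portions=[second])]),
    ])
    return logical_root, layout_root


logical_root, layout_root = build_example()
print([p.text for p in logical_root.content_portions()])
print([p.text for p in layout_root.content_portions()])
```

Both traversals reach the same portions even though the section boundary and the page boundary fall in different places, which is the sense in which logical and layout objects need not correspond.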

document architecture attributes ODA - The set of attributes that applies to a logical object or a layout object depends on the type of object: different sets of attributes are defined for basic logical objects, composite logical objects, document logical root, blocks, frames, pages, page sets and document layout root. Document architecture attributes are independent of the type of content of the objects to which they apply. Examples are "object identifier" for all objects; "subordinates" for composite objects; layout directives such as "indivisibility", "offset", "separation", "position" (of blocks and frames) and "dimensions."(2)-16,17 SEE content articulation SEE ALSO document architecture; document architecture level; document architecture class

document architecture class DFR - An optional attribute, part of the DFR extension attribute set, that specifies the document architecture class used in the document. In the case of an ODA document, this can be taken from the document profile and is the equivalent of the ODA attribute, document-architecture-class.(8)-89 ODA - The rules for defining the structure and representation of documents in formatted form, processable form or formatted processable form.(2)-5 SEE content articulation SEE ALSO document architecture; document architecture level; document architecture attributes

document architecture level ODA - An identified subset of the features pertaining to a document architecture class.(2)-5 SEE content articulation SEE ALSO document architecture; document architecture class; document architecture attributes

document body ODA - The part of a document that may include a generic logical and layout structure, specific logical and layout structure, layout and presentation styles but excludes the document profile.(2)-5 SEE content articulation; intrinsic elements

document characteristics ODA - Those attributes in the document profile that permit a recipient to determine which capabilities are required for processing or imaging the document. They include: a specification of the form (formatted, processable or formatted processable); specification of the content architectures used; specification of character sets, fonts, styles, orientations and types of emphasis.(2)-18 SEE content articulation; intrinsic elements; form of transmission

document class ODA - A set of logical object class descriptions, layout object class descriptions, generic content portion descriptions, styles and a document profile, that specifies a set of documents with common characteristics.(2)-6 • A specification of the set of properties that are common to a group of similar documents. The specification consists of a set of rules to determine the values of the attributes that specify the common properties. These rules can be used to control


consistency among the documents making up the class, and to facilitate the creation of additional documents.(2)-17 SEE archival bond; content articulation; formularium

document class description ODA - The specification of a document class.(2)-6 SEE archival bond; content articulation; formularium

document content DFR - A body of information actually contained within the document, e.g. an office document, and not interpreted by DFR.(8)-5 • A body of information that has been provided to the DFR server for the purpose of storage. The server transfers the content of a DFR document to the user, and never interprets the content.(8)-14 SEE content

SEE ALSO office document

Document Date SGML TEI - The date of the text as given on the title page.(7)-74 SEE moment of action; moment of documentation

Document-Date-and-Time DFR - A non-DFR-specific attribute, part of the basic attribute set, that specifies the date and time that the DFR user associates with the DFR document or with a DFR reference. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, Document-Date-and-Time.(8)-87 ODA - An attribute of dates and times which specifies the date and, optionally, the time of day that the originator associates with the document.(5)-10 SEE moment of action; moment of documentation

document declaration SGML - The part of the SGML prologue which specifies basic facts about the dialect of SGML being used, e.g. the character set; length of identifiers. Usually held by the SGML processor in the form of compiled tables and is thus invisible to the user.(7)-29 SEE prologue

document description ODA - A group of document management attributes comprised of title, subject, document reference, document type, abstract, and keywords.(5)-9 SEE subject; annotations of management; title

document instance SGML - A marked up text.(7)-13 • The content of the document itself. It contains only text, markup and general entity references, and thus may not contain any new declarations.(7)-31 SEE content SEE ALSO declarations

document interchange DP - The capability of transmitting documents from one information system and receiving them in another system in a form in which they can be acted upon by the receiving system, often called "revisable" form.(1)-150 SEE form of transmission SEE ALSO imaging

document interchange architecture DP - The specification of rules and data streams necessary to interchange information in a consistent, predictable manner.(1)-150. SEE ALSO basic processing model; encoding; document declaration

document management domains SEE domain

document profile ODA - A set of attributes which specifies the characteristics of the document as a whole; an identified subset of the features pertaining to the document profile.(2)-16,17 • A set of attributes associated with a document as a whole. It represents reference information about the document and may repeat information in the document content, for example, the title and name of the author.(2)-16,17 • In addition to reference information such as title, date and author's name, which facilitates storage and retrieval of the document, the document profile contains a summary of the document architecture features that are used in the document, in order that a recipient can easily determine which capabilities are required for processing or imaging the document. The attributes representing the latter type of information are called document characteristics. The document profile may be interchanged alone.(2)-18 SEE document

SEE ALSO document characteristics; header; annotations

Document Reference ODA - An attribute of document description whose value is used to refer to the document from other documents.(5)-9 SGML TEI - An element of the front matter of a generic office document. (5)-189 SEE archival bond; annotations of management SEE ALSO classification


Document Size ODA - An attribute of content that represents the estimated size of the whole document, expressed as a number of 8-bit bytes. The size includes that of the document profile and the document body (if present).(5)-13 SEE annotations of management SEE ALSO Resource-Used; Resource-Limit

Document Status SGML TEI - An element of the front matter of a generic office document. (5)-190 SEE status of transmission

Document-Type DFR - A non-DFR-specific, optional attribute, part of the extension attribute set, that specifies the type of document, e.g. memorandum, letter, report, resource. This attribute specifies only an informal name; it does not specify a relation to a particular document class description. This attribute can be taken from the ODA document profile and is the equivalent of ODA document type.(8)-89 • A DFR-specific attribute, part of the basic attribute set, that contains an object identifier whose value defines the representation of the document content, for example, ODA or SGML, in the DFR access protocol. For a DFR reference, this attribute will only exist if the referent is a DFR document.(8)-81 ODA - An attribute of document description which specifies an informal name for a document, e.g. memorandum, letter, report, resource. It does not specify a relation to a particular document class description.(5)-9 SGML - A class of documents having similar characteristics (for example, journal, article, technical manual, or memo).(23)-10 SEE title; content configuration

document type declaration (DTD) SGML - A standard for a header that identifies an agreed upon document type, such as report, article, book, journal, and includes additional information needed to process the specified document. DTD is used as part of Standard Generalized Markup Language (SGML) which defines tags to mark parts of documents. The tags which identify the parts are interpreted in terms of the DTD.(1)-150 • The document type declaration specifies the document type definition against which the document instance is to be validated. Like the SGML declaration it may be held in the form of compiled tables within the SGML processor, or associated with it in some way which is invisible to the user or requires only that the name of the document type be specified before the document is validated.(7)-30 • At its simplest the document type declaration consists simply of a base document type definition (possibly also one or more concurrent document type definitions) which is prefixed to the document instance. More usually, the document type definition will be held in a separate file and invoked by a reference.(7)-30 • The motivating principle for the design of the DTD has been to allow but not to require structural constraints on documents. An encoded document is seen as comprising a header and a body. The header can contain SGML declarations and additional declarations required to conform to the Text Encoding Initiative. The body contains the encoded text itself.(7)-193 SEE document

SEE ALSO header; document type definition; element type declaration

document type definition (DTD) SGML - A formal specification for the structure of a document. SEE content articulation SEE ALSO document type declaration

documentation requirements EDMS - A functional aspect of the preservation requirements for electronic recordskeeping systems that requires systems to preserve a number of different aspects of the record:

content, structure and context: preservation of content plus any structure supported by the software in which the document was created, plus context whether assigned by creators (such as key terms or distribution lists) or by the system with reference to the business application in which the record participated.

documentation of processing: preservation of processing rules and schemas controlling views and permissions so that records as output products of specific processes can be understood with respect to the data known to the organization.

functionalities: for records with functionality, documentation of business application procedures as embodied in system scripts, rules, instructions, and routines, with maintenance whenever they change, so that records can be correctly associated with the status of the system at the time of record creation. Functionality embodied in live links and their representations should be launchable.(12)-21

documentation ARCH - The organization and processing of documents or data including location, identification, acquisition, analysis, storage, retrieval, presentation and circulation for the information of users.(1)-150 DP - An organized series of descriptive documents explaining the operating system and software necessary to use and maintain a file and the arrangement, content and coding of the data which it contains.(1)-150 SEE ALSO formularium

domain DIP - Space defined by the boundaries of an electronic document management system within which records are created, modified, used, and destroyed. The space may be divided into several areas depending on the status of transmission and the access rights.

general (or institutional) space: that part of the system that is accessible to all members of the organization, managed according to established record making and record keeping rules by the competent staff, and that contains the central filing system of the organization, including the linkages with related records in other media. The primary characteristic of the general space is that no record that has crossed its boundaries can thereafter be manipulated.(22)-23

group space: that part of the system that is accessible to all the individuals who share the same competence, horizontally or vertically, temporarily, or permanently. This is the space containing many draft versions of the same record, comments, notations etc.(22)-23

individual space: that part of the system that is accessible to individual members of the organization. The individual space within the organization's records system must be distinguished from the personal private space of the individual, which should also have a different electronic address. (22)-23

private space: that part of the records system in which records of a private nature are created and managed by the creator for their own ends, and which is accessible only to individuals as private persons. It should have a separate address. (22)-23 and (TG)

s: No standards or diplomatics equivalent
n: DFR Version; Status; Version-Root; Next-Version; Previous-Versions ODA Status; Revision History SGM Revision History; Status
b: No standards or diplomatics equivalent
r: DIP reliability

ERM - The intersection of a class of objects and a common set of rules that govern objects. Domains may also be defined by the action or purpose of the rules, e.g. document management domains in which the object class is documents and the purpose is management; security classification domains which are defined by the degree of confidentiality; information access domains defined by the location, medium and representation of the information; and security management domains defined as groupings of business processes with a common set of rules. (10)-6

draft DIP - Temporary version of a record, prepared for purposes of correction. (22)-9
s: DFR Version; Status ODA Status SGM Revision History
n: DFR Version-Root; Next-Version; Previous-Versions ODA Revision History
b: DFR conceptual document DIP status of transmission
r: DIP original; copy

element SGML - A component of the hierarchical structure defined by a document type definition; it is identified in a document instance by descriptive markup, usually a start-tag and end-tag.(23)-10 SGML TEI - A textual unit, viewed as a structural component. Different types of elements are given different names, but SGML provides no way of expressing the meaning of a particular type of element, other than its relationship to other element types. (7)-12.


• For instance, the element Preface may or may not occur within a larger element, Front Matter, and be composed of sub-elements such as Title and Text. Elements may be both structural (related to the organization of the document) and non-structural (related to its intellectual content), such as the element Author. Elements are identified by tags. SGML is in no way concerned with the semantics of elements, as these are software-dependent. Element should not be confused with attribute. SEE content articulation SEE ALSO attribute

element type declaration SGML - A markup declaration that contains the formal specification of the part of an element definition that deals with the content and markup minimization.(23)-10 SEE document type

element definition SGML - Application of specific rules that apply SGML to the markup of elements of a particular type. An element type definition includes a formal specification, expressed in element and attribute definition list declarations, of the content, markup minimization and attributes allowed for a specific element type. An element type definition is normally part of a document type definition. (23)-11 SEE content articulation

encoding instructions SGML - The part of the file header that contains the information required to interpret an SGML-conformant data file, such as normalization of source text, methods of resolving ambiguous punctuation, editorial comments, reference system, levels of encoding, and normalization of machine-readable text.(7)-55 SEE content articulation

entitling DIP - That part of the protocol comprising the name, title, capacity and address of the physical or juridical person issuing the document, or of which the author of the document is an agent. Today this corresponds to the letterhead.(3)
s: No standards equivalent
n: ODA generic logical structure; specific logical structure; content portion SGML element; element definition; element declaration
b: DIP protocol SGML front matter
r: DIP intrinsic elements

entity SGML - Together with elements and attributes, part of the descriptive markup of a document, consisting of a named part of a marked up document, irrespective of any structural considerations. (7)-27.


• An entity is a free-floating unit, such as a photograph, that is not part of the structure of the text. SEE content configuration SEE ALSO element; attribute

eschatochol DIP - That part of a document that contains the documentation context of the act, i.e. enunciation of the validation, indication of the responsibilities for documentation of the act, and the final formulae.(19)-11
s: No standards equivalent
n: DIP annotations of management; attestation; complimentary clause; corroboration; final clauses; qualification of signature; secretarial notes; special signs DFR document content; Author; Owners; Preparers; User-Specific-Codes; Access-List ODA Distribution List; generic logical structure; specific logical structure; content portion; Author; Owners; Preparers; User-Specific-Codes; Security Information SGML element; element definition; element declaration; Sensitivity; PRP receivers
b: DIP intrinsic elements
r: DIP extrinsic elements

execution phase SEE phases of procedure

executive procedures SEE procedures

extended processing model DP - An elaboration of the basic processing model for open document interchange which accommodates the needs for additional functionalities, such as the ability to handle graphics, required of both office and publishing environments in the ODA/SGML standards.(9)-158,159 SEE ALSO basic processing model

Expiry Date and Time ODA - An attribute of dates and times that specifies the date and, optionally, the time of day after which the document is considered to be invalid.(5)-10. SEE archival annotations

exposition DIP - A part of the text in which the substance is expressed, i.e. the narration of the concrete and immediate circumstances generating the act and/or the document. In documents resulting from procedures, whether public or private, the exposition may include the memory of the various procedural phases or be entirely constituted by the mention of one or more of them. (18)-13
s: No standards equivalent
n: DFR document content ODA generic logical structure; specific logical structure; content portion SGML element; element definition; element declaration
b: DIP text SGML body matter
r: DIP preamble; disposition

external document class ODA - A document class referred to by the document profile of an interchanged document containing no generic structure.(2)-6 SEE content articulation

external references ODA - A group of document management attributes comprising references to other documents, superseded documents, and local file references. (5)-13. SEE archival annotations

extrinsic elements DIP - Those elements of documentary forms which constitute the material make-up of the document and its external appearance.(32)-6
s: DIP physical form; no standards equivalent
n: DIP annotations; content articulation; content configuration; language; medium; script; seals; special signs DFR document content ERM layout; logical structure ODA document constituents: generic layout structure; specific layout structure; generic logical structure; specific logical structure; layout styles; presentation styles; external document class; resource document; document characteristics: content architecture classes; interchange format class; ODA version; non-basic document characteristics: profile character sets; comment character sets; alternative representation character sets; document constituent attributes: page dimensions; medium types; layout paths; protection; block alignments; fill orders; transparencies; colours; borders; page positions; types of coding; coding attributes; presentation features; non-basic structure characteristics: number of objects per page; additional document characteristics: unit scaling; fonts listing. SGML basic non-structural features; entity
b: DIP form
r: DIP intrinsic elements


External-Location DFR - A mandatory attribute, part of the basic attribute set, that contains a user-specified description of the location of an object stored outside any DFR document store. (8)-85 SEE annotations of management

facsimile DP - The exact image of a document transmitted electronically to another location.(1)-153 Paper fax machines digitize the image and are equipped to print out the document. Fax modems must send and receive an electronic disk file. ARCH - A reproduction of a document or item that is similar in appearance to, but not necessarily the same size as, the original.(27)-470. SEE ALSO document interchange; imaging; mode of transmission

facts DIP - Occurrences of human conduct and natural events that take place within a given juridical system. Facts whose consequences are not anticipated by the juridical system are considered juridically irrelevant; facts which are contemplated by the body of written or unwritten rules on which the juridical system is based, that is, the legal system, are qualified as juridically relevant.(11)-5 SEE acts SEE ALSO dispositive document; narrative document; probative document; supporting document

false document DIP - The concept of falsity refers to the presence of elements which do not correspond to reality. They refer to different elements of the document in a legal, diplomatic and historical sense. Legally and diplomatically, to say that a document is false is to say that the facts are untrue.(4)-18 SEE authenticity

fax SEE facsimile

file header SGML - "An electronic title page and preface" for SGML text-conformant files consisting of a bibliographic file description, encoding declarations, and revision history identifying the source text or chief source of information, and providing the basis for citation. This is not to be confused with the SGML prologue or the core structural features. (7)-53 SEE annotations of management; content articulation; status of transmission

File Name SGML TEI - An element of the front matter of an office document, part of Local File References. (7)-189 SEE annotations of management


Fill Orders ODA - A document constituent attribute, part of the non-basic document characteristics. (5)-7 SEE content articulation SEE ALSO filling

filling ODA - The storage of a document according to some defined method in order to facilitate retrieval. (2)-6

final clauses DIP - Formulae, part of the intrinsic elements, found within or following the disposition, the object of which is to ensure the execution of the act, to avoid its violation, to guarantee its validity, to preserve the rights of third parties, to attest the execution of the required formalities, and to indicate the means employed to give the document probative value.(19)-14. They are divided into the following groups:

clauses of injunction: those expressing the obligation of all those concerned to conform to the will of the authority.

clauses of prohibition: those expressing the prohibition to violate the enactment or oppose it.

clauses of derogation: those expressing the obligation to respect the enactment, notwithstanding other orders or decisions contrary to it, opposition, appeals or previous dispositions.

clauses of exception: those expressing situations, conditions or persons which would constitute an exception to the enactment.

clauses of obligation: those expressing the obligation of the parties to respect the act, for themselves and for their successors or descendants.

clauses of renunciation: those expressing consent to give up a right or a claim.

clauses of warning: those expressing a threat of punishment should the enactment be violated. They comprise two categories: 1) spiritual sanctions, comprising threats of malediction or anathema; 2) penal sanctions, comprising the mention of specific penal consequences.

promissory clauses: those expressing the promise of a prize, usually of a spiritual nature, for those who respect the enactment.

clauses of corroboration: those enunciating the means used to validate the document and guarantee its authenticity. The words vary according to the time and place, but the clauses are usually formulaic and fixed.

The final clauses in general:
s: No standards equivalent
n: DFR document content ODA generic logical structure; specific logical structure; content portion SGML element; element definition; element declaration
b: DIP text SGML body matter
r: DIP disposition; exposition; preamble

font EDMS - A design for a set of characters. A font is the combination of typeface and other qualities, such as size, pitch, and spacing.(25)-189 ODA - A set of character images normally with a common design and size.(5)-6 SEE content articulation

Fonts List ODA - An attribute of 'additional document characteristics' that specifies the character fonts used in the document.(5)-9 SEE content articulation

forgery SEE copy - pseudo-original

form DIP - The form of a written document is ... the whole of its characteristics which can be separated from the determination of particular subjects, persons, or places it is about. Any written document in the diplomatic sense contains information transmitted or described by means of rules of representation, which are themselves evidence of the intent to convey information: formulas, bureaucratic or literary style, specialized language, interview technique, and so on. These rules, which we call form, reflect political, legal, administrative, and economic structures, culture, habits, myths, and constitute an integral part of the written document because they formulate or condition the ideas or facts which we take to be the content of the documents. (4)-15 SEE extrinsic elements; intrinsic elements; persons; acts; procedures

form of transmission DIP - The form that the record has when it is made or received.(22)-12
s: No standards equivalent
n: DFR Document-Architecture-Class ODA Document Architecture Class; processes: formatted processable; processable; ODA version; interchange format class
b: DIP transmission; authentic record ODA document architecture ST source document; result document
r: status of transmission; mode of transmission; reliability

format DP, ARCH - A predetermined arrangement of characters, fields, lines, punctuation, page numbers, etc.(1)-156


• The display conventions and syntactic rules used to record commonly used data items such as dates, currencies etc.(10) - Glossary ERM - The semantics which define the rules for recording information contained in a document. These include electronic encoding schemes (ASCII), image formats (TIFF), presentation of print conversions (Postscript), creation tool conventions (Word Perfect, Excel etc.), and the differences between documents created by different versions of the same tool.(10) - Glossary SEE ALSO representation; medium

formatted form ODA - A form of representation of a document that allows the presentation of the document as intended by the originator and that does not support editing and (re)formatting.(2)-6 SEE form of transmission SEE ALSO processable; formatted processable

formatted processable ODA - A form of representation of the document that allows presentation of the document as intended by the originator and also supports editing and (re)formatting.(2)-6 SEE form of transmission SEE ALSO processable; formatted form

formatting ODA - The carrying out of operations to determine the layout of a document, i.e. the appearance of its content on a presentation medium. (2)-13 SEE content articulation

formularium DIP - Models of documents or instructions for their compilation, e.g. formulary, style guide, codebook.(AU) SEE documentation

front matter SGML TEI - The part of office documents that corresponds to a type of document profile. It consists of: a) production and storage information e.g. local file name; b) document distribution by post or electronic mail, e.g. originator; addressee; c) action request and reply deadline; d) status and history, e.g. draft, confidential, internal.(7)-188
• The essential features of the front matter of office documents are contained in the core tag set comprising the document type, title, document date, author, abstract, table of contents, language, and revision history. (7)-89. Optional features of the front matter in office documents include document reference, additional user specific codes, references to other documents, in reply to, local file system reference, subject field, keywords, creation date, originator, preparer, authorizing person, primary recipient, secondary recipient, other user information, document status, sensitivity, and number of pages.(7)-89,190. SEE persons; annotations; status of transmission

function DIP - The whole of the activities aimed to one purpose. When such activities, or part of them, are assigned to a person, they constitute a competence.(22)-4

general inscription DIP - A form of inscription in which the addressee is a larger, indeterminate entity, e.g. the citizens, the believers, or "To all to whom these presents shall come".(19)-12 SEE ALSO inscription; nominal inscription.

general space SEE document management domains

generic document ODA - A structured amount of information intended for the interchange of generic structures, and optionally associated styles and content portions, for use in the processing of interchanged documents. (2)-19
• A generic document consisting of a document profile and generic structures may be used to assist in the processing of interchanged documents and may be interchanged itself. (2)-19 SEE document SEE ALSO generic document structure; generic identifier; generic layout structure; generic logical structure

generic document structure ODA - The template that guides the creation of the document and that could be re-used for its amendment.(2)-13, 14
• The set of logical object classes and layout object classes associated with a document, and their relationships.(2)-18 SEE content articulation

generic identifier SGML - The technical term assigned by the application user for the name of an element type.(7)-12 SEE content articulation SEE ALSO element.
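
The relationship between elements and generic identifiers described in the element and generic identifier entries can be illustrated with a short sketch. It is an illustration only and is not drawn from ISO 8879 or the Text Encoding Initiative: Python's XML parser stands in for an SGML parser, and the tag names (document, frontmatter, preface, title, text, body) are hypothetical names modelled on the Preface and Front Matter example in the element entry.

```python
import xml.etree.ElementTree as ET

# Hypothetical document instance. The tag names are assumptions made for this
# sketch; they are not taken from any published document type definition.
SAMPLE = """
<document>
  <frontmatter>
    <preface>
      <title>Preface</title>
      <text>Opening remarks.</text>
    </preface>
  </frontmatter>
  <body>
    <text>Main text.</text>
  </body>
</document>
"""


def generic_identifiers(xml_text):
    """Return the distinct element type names, in order of first appearance."""
    root = ET.fromstring(xml_text.strip())
    names = []
    for element in root.iter():
        if element.tag not in names:
            names.append(element.tag)
    return names


def parent_map(xml_text):
    """Map each element type name to the set of element types that contain it."""
    root = ET.fromstring(xml_text.strip())
    parents = {}
    for parent in root.iter():
        for child in parent:
            parents.setdefault(child.tag, set()).add(parent.tag)
    return parents


print(generic_identifiers(SAMPLE))
print(parent_map(SAMPLE))
```

The second function reflects the point made in the element entry: the markup records only how one element type relates to others (here, that a text element may occur inside either a preface or a body), and says nothing about what the element means.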

generic layout structure ODA - The set of all the potential specific layout structures that are applicable to a document class. The generic layout structure comprises a set of rules from which specific logical objects can be derived during the editing process (e.g. a template for a page layout). (2)-18 SEE content articulation SEE ALSO result document type

generic logical structure ODA - The set of all specific logical structures that are applicable to a document class. The generic logical structure comprises a set of rules from which specific logical objects can be derived during the editing process (e.g. a style sheet).(2)-18 SEE content articulation SEE ALSO source document type

genuine document DIP - The quality of a record that it is truly what it purports to be. . . . Genuineness is conferred on a record on the basis of one or more of the following: mode, form and state of transmission, and manner of preservation and custody. (22)-12 SEE authentic record; custody; mode of transmission; form of transmission; preservation; status of transmission

Group DFR - A collection of DFR-Objects in a DFR-Document-Store which are called DFR-Group-Members of the DFR-Group. A DFR-Group consists of DFR-Attributes which are associated with the DFR-Group as a whole and a DFR-Group-Content which is a sequence of UPIs of all Members of the DFR-Group. (8)-18 SEE document; archival bond SEE ALSO Group-Interrelationships; Group-Content; Group-Member; Root-Group; Proper-Group; Parent-Identification

Group-Interrelationships DFR - A collection of DFR-objects in a DFR-document-store which are called DFR-group-members of the DFR-group. A DFR-group consists of DFR-attributes which are associated with the DFR-group as a whole and a DFR-group-object.(8)-2
• A DFR group can be either a root-group or a proper-group, the difference being that a root-group has no affiliation with any other group, while a proper-group is always a member of some other group (its parent).
• Any DFR object in a server can be reached through the root group because the DFR server only ever services one group.(8)-18
• A DFR group can be viewed as the root of a DFR object tree consisting of all the descendants of that DFR group. (8)-12 SEE document; archival bond SEE ALSO Group; Group-Content; Group-Member; Root-Group; Proper-Group


Group-Content DFR - A sequence of unique permanent identifiers (UPIs) identifying all DFR-members of the DFR-group.(8)-6 SEE document; archival bond SEE ALSO Group; Group-Interrelationships; Group-Member; Root-Group; Proper-Group
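
The Group, Group-Interrelationships and Group-Content entries together describe a tree: a single root-group, proper-groups that always belong to a parent group, and a group content made up of an ordered sequence of member UPIs. The following is a minimal sketch of that arrangement, assuming that UPIs can be represented as plain strings and that a document store can be modelled as an in-memory dictionary. The class, field and method names are hypothetical and are not taken from ISO 10166.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DfrObject:
    upi: str                             # identifier of the object (plain string assumed)
    parent: Optional[str] = None         # UPI of the parent group; None only for the root
    members: Optional[List[str]] = None  # group content (member UPIs); None for documents


class DocumentStore:
    """Hypothetical in-memory model of a store holding one root group."""

    def __init__(self, root_upi: str) -> None:
        self.root = root_upi
        self.objects = {root_upi: DfrObject(root_upi, None, [])}

    def add(self, upi: str, parent: str, is_group: bool = False) -> None:
        """Add a document or a proper-group as a member of an existing group."""
        group = self.objects[parent]
        if group.members is None:
            raise ValueError(f"{parent} is not a group")
        if upi in self.objects:
            raise ValueError(f"{upi} is already in the store")
        self.objects[upi] = DfrObject(upi, parent, [] if is_group else None)
        group.members.append(upi)

    def kind(self, upi: str) -> str:
        """Classify an object as a document, the root-group or a proper-group."""
        obj = self.objects[upi]
        if obj.members is None:
            return "document"
        return "root-group" if obj.parent is None else "proper-group"

    def descendants(self, upi: str) -> List[str]:
        """Return every object reachable from a group, depth first."""
        found = []
        for member in self.objects[upi].members or []:
            found.append(member)
            found.extend(self.descendants(member))
        return found


store = DocumentStore("root")
store.add("g1", "root", is_group=True)
store.add("d1", "g1")
store.add("d2", "root")
print(store.kind("g1"), store.descendants("root"))
```

Because every object other than the root is added through a parent group, every object in the sketch is reachable from the root group, which mirrors the statement in the Group-Interrelationships entry.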

Group-Member DFR - A DFR-object which is identified in the DFR-content of its parent DFR-group.(8)-6 SEE document; archival bond SEE ALSO Group; Group-Interrelationships; Group-Content; Root-Group; Proper-Group

group space SEE document management domain

hardcopy ARCH - A document or copy, usually on paper, as opposed to a microform or machine-readable record.(1)-157 DP - Printed copy of machine output in a visually readable form, e.g. printed reports, listings, documents, summaries etc.(1)-157 SEE medium SEE ALSO soft copy

handling annotations SEE annotations

header ARCH - A word or series of words, and/or page numbers that appear consistently at the top of the pages of a document, including copyright notices, company logos, and so on.(1)-158 DP - System-defined control information that precedes user data.(1)-158
• That portion of a message that contains control information for the message such as one or more destination fields, the name of the originating station, an input sequence number, a character string indicating the type of message, and a priority level for the message.(1)-158 SEE document profile

historical authenticity SEE authenticity

image ARCH - A reproduction of the subject matter copied, usually by photography. (1)-159 DP - An exact logical duplicate of a data item stored on a different physical medium. (1)-159
• A visually interpreted representation as displayed, plotted or printed.(1)-159
• A representation in storage by means other than the storage code of the computer or device in which it is held. Examples include the bit patterns of an alien code, and bit pattern matrices of such things as the punching pattern of a punched card or of a character to be displayed.(1)-159 SEE content configuration

image ODA - Representation of a document in a form perceptible to a human, for example, on paper or on a screen. (2)-20. SEE content configuration

imaging ERM - The document imaging process is concerned with presenting an image of the document in a form perceptible to a human, for example, on paper or on a screen. (2)-20. ODA - Imaging is a locally defined process that depends on the presentation device used. It is not part of the ODA standard for this reason.(2)-20

impartiality ARCH - The characteristic of archival documents that they are created for limited, specific and immediate purposes of an administrative-legal nature, not in order to instruct posterity. Therefore, they constitute reliable evidence of facts and events they relate to. Naturally, they contain the biases and idiosyncrasies of their creators, but, because they are not meant for dissemination, they have the capacity to reveal what actually happened. (16)-12 SEE ALSO archives

imitative copy SEE copy

inauthentic document DIP - The concept of inauthenticity refers to the absence of the requisites which provide authenticity i.e. legal or diplomatic but not historical. SEE authenticity

individual space SEE document management domain

initiative phase SEE phases of procedure


inquiry phase SEE phases of procedure

In Reply To SGML TEI - An element of the front matter of generic office documents that is not part of the core tag set.(7)-189 SEE addressee

inscription DIP - Documents in epistolary form usually present in their protocol the name, title and address of the addressee of the document and/or the action. It may be a nominal inscription or a general inscription.(19)-12
s: No standards equivalent
n: DFR document content ODA generic logical structure; specific logical structure; content portion SGML element; element definition; element declaration
b: DIP protocol SGML front matter
r: DIP entitling; dates; invocation; superscription; salutation; subject

inserts also insets SEE copy

instrumental procedures SEE procedures

intellectual control ARCH - The acquisition and creation of documentation required to access the informational content of records. Contrasted with administrative control. (1)-162

interchange format class ODA - The form of interchange suitable to a specific application. (2)-7 SEE annotations of management SEE ALSO application; document interchange; processing

interrelationship ARCH - The characteristic of archival documents that they are related among themselves by activities in which they participated and by the procedures and processes from which they have resulted.(16)-12 SEE ALSO archival bond


intellectual form DIP - The characteristics of the internal composition of the record. (22)-2 SEE annotations; content articulation; content configuration SEE ALSO extrinsic elements; intrinsic elements

intrinsic elements DIP - Elements of intellectual form which are considered to be the integral components of its intellectual articulation: the mode of expression of the document's content, or the parts determining the tenor of the whole. (19)-6 SEE eschatochol; protocol; text SEE ALSO extrinsic elements; intellectual form; physical form

juridical act DIP - When a juridical system takes into consideration in its body of rules not only the effects of human conduct but also the will determining it, we call that conduct a juridical act.(11)-6 SEE acts; juridical system

juridical fact DIP - An event, whether intentionally or unintentionally produced, whose results are taken into consideration by the juridical system in which it takes place.(11)-5 SEE acts; juridical system

juridical person DIP - An entity having the capacity or the potential to act legally and constituted either by a collection or succession of physical persons or a collection of properties. (17)-5 SEE juridical system; persons

juridical relevance DIP - The degree to which a document participates in a juridical act. Those that are directly involved, without which the act cannot exist (ad substantiam) or be proved to have taken place (probative), are called juridically relevant; those that are ancillary to the act, but still contribute to the act (supporting), are also juridically relevant. Those documents that have no bearing on the act (narrative) are called irrelevant.(TG) SEE acts; dispositive document; fact; narrative document; probative document; supporting document


juridical system DIP - A collectivity governed by rules which may be implicitly understood (e.g. beliefs, or customs), or explicit (e.g. codes of law). The system of rules is called a legal system.(11)-5 SEE acts; facts

language DIP - The style, wording and composition used in compiling the document.(19)-8
s: DFR Languages ODA Languages SGML Language; basic non-structural features
n: Not required
b: DIP extrinsic elements
r: DIP script

Languages DFR - A non-DFR specific attribute, part of the extension attribute set, that specifies the primary language(s) in which the content of the document is written. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, languages. (8)-92 ODA - An attribute of content that specifies the primary language(s) in which the content of the document is written. (5)-14 SEE language; script

Keywords ARCH - A word or group of words taken from the title or text of a document characterizing its content and facilitating its retrieval.(14)-19 SGML TEI - A tag that is part of the front matter of a generic office document. (7)-189 SEE annotations of management

layout ODA - A process whereby a document is organized into pages and all the physical constituents thereof (e.g. running heads, borders).(2)-18 SEE content articulation

layout styles ODA - A constituent of the document, referred to from a logical component, that guides the creation of a specific logical structure.(2)-8 SEE content articulation


links DIP - PRP - An extrinsic element providing a physical connection between the parts of a document. DP - In database management systems, a link is a pointer to another record. One or more records can be connected by inserting links.(25)-271
• In spreadsheet programs, linking refers to the ability of a worksheet to take its data for particular cells from another worksheet.(25)-270
• In many operating systems (UNIX for example), a link is a pointer to a file. Links make it possible to reference a file by several different names and to access a file without specifying a full path.(25)-271
s: No diplomatics equivalent
n: DFR User-References-to-Other-Objects; User-Reference ODA References to Other Documents SGML References to Other Documents
b: No standards or diplomatic equivalent
r: ERM compound document

Local File References ODA - An attribute of external references that specifies where a copy of the document may be found. It consists of one or more entries, one for each location where a copy of the document may be found.(5)-13. SGML TEI - A group of elements characteristic of generic office documents consisting of file name, location, access rights, user comments.(7)-189 SEE annotations of management

Local Filing Date and Time ODA - An attribute of dates and times that specifies the date and, optionally, the time of day when the document was filed. When more than one entry occurs, the last entry indicates the most recent local filing date and time.(5)-10 SEE annotations of management
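
The rule in the Local Filing Date and Time entry, that the last of several entries indicates the most recent filing, can be sketched as follows. This is an illustration under stated assumptions rather than the ODA encoding: each entry is assumed to be an ISO 8601 string giving a date with an optional time of day, and the function name is hypothetical.

```python
from datetime import datetime


def most_recent_filing(entries):
    """Return the most recent local filing entry as a datetime, or None.

    Assumes each entry is an ISO 8601 string (a date, optionally with a time)
    and relies on the rule that the last entry is the most recent one.
    """
    if not entries:
        return None
    return datetime.fromisoformat(entries[-1])


# Hypothetical filing history: the first entry has a date only, the second a date and time.
filings = ["1995-03-01", "1995-06-15T09:30"]
print(most_recent_filing(filings))
```

An entry without a time of day is read as midnight on that date, which is a convention of the parsing function used here and not something the entry specifies.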

Location (or directory) SGML TEI - Part of local file references in the standard set of tags that are not part of the core set for generic front office documents. It is intended to specify the directory where the file may be found.(TG) SEE annotations of management

logical DP - The way a data structure, hardware or software system, is perceived by an individual that may be different from its actual functioning or form. (1)-164


logical object ODA - An element of the specific logical structure of a document which may have a meaning that is significant to the application user, for example, chapter, section, paragraph. • Layout objects and logical objects, or in other words, the intellectual arrangement of the text and the physical arrangement or format of a document do not necessarily correspond. (2)-17 SEE content articulation

logical record DP - A compilation of related data elements referring to one person, place, thing or event that are treated as a unit.(1)-165 SEE ALSO record

logical structure ODA - The result of dividing and subdividing the content of a document into increasingly smaller parts, on the basis of the human-perceptible meaning of the content, for example, into chapters, sections, paragraphs.(2)-9 • All logical objects and associated content portions representing the logical hierarchy of a document.(2)-9 • The logical structure is independent of the layout structure in principle and is determined by the author and embedded in the document during the editing process. Attributes associated with the logical structure may control the formatting process or the layout of the document.(2)-17 SEE content articulation SEE ALSO document architecture

management annotations SEE annotations

medium DIP - The material carrying the message.(19)-7 • The physical substance to which the message of a document is affixed. The function of a document is to fix the message in a medium so that it can be preserved. s: ODA medium types n: No standard equivalents b: extrinsic elements r: format

Medium Types ODA - An attribute specifying non-basic attributes of medium type. It consists of one or more groups of parameter values for "nominal page size" and/or "side of sheet", and details of one non-basic medium type used in the document.(4)-7 SEE medium; SEE ALSO non-basic


mere act SEE acts

mode of transmission DIP - Method by which a document is transmitted through space and time.(22)-12 s: No standards equivalent n: No standards equivalent b: DIP transmission r: form of transmission; status of transmission; authenticity

moment of action DIP - That point in time when the decision to act is taken and the iussio or command to prepare a document is given. (AU) s: no standards equivalent n: DFR Content-Create-Date-and-Time; Creation-Date-and-Time; Contents-Modified-Date-and-Time; Attributes-Modified-Date-and-Time ODA Creation Date and Time; Release Date and Time; Expiry Date and Time; Start Date and Time SGML Document Date b: DIP status of transmission; phases of procedure - execution phase r: DIP moment of documentation

moment of documentation DIP - That point in time when the action is documented. In probative documents, this will always follow on the moment of action. In substantive documents, the moment of action and documentation are always the same.(AU) s: no standards equivalent n: DFR Content-Create-Date-and-Time; Creation-Date-and-Time; Contents-Modified-Date-and-Time; Attributes-Modified-Date-and-Time ODA Creation Date and Time; Release Date and Time; Expiry Date and Time; Start Date and Time SGML Document Date b: DIP status of transmission; probative document; dispositive document r: DIP moment of action

multiple acts SEE acts

narrative documents DIP - Written evidence of an activity that is juridically irrelevant. (29)-9 s: No standards equivalent n: DFR Document-Type ODA Document Type; document architecture class; User Specific Codes PRP business process SGML Document Type; document type definition b: DIP juridical relevance SGML office documents r: DIP dispositive document; probative document; supporting document

Next Versions DFR - A mandatory attribute, part of the basic attribute set, that is a multi-valued attribute. It is defined only for DFR documents and is updated by the DFR server each time a new version is declared having this DFR document as its previous version (in a create or modify abstract operation), or when an existing version is discarded (by a delete or modify abstract operation). The DFR user is prohibited from modifying this attribute explicitly. When the value of this attribute is read by the DFR user, only those documents to which the user has at least read access rights are included in the result of the DFR abstract operation. (8)-84 SEE draft; SEE ALSO Version; Version-Root; Previous-Versions; Version-Management; Superseded Documents
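The behaviour described for Next Versions can be illustrated with a minimal sketch. It is not taken from ISO 10166: the class VersionStore, its method names, and the use of plain strings for document identifiers and user names are all assumptions made for illustration. It shows only the three points stated in the entry: the store (not the user) maintains the next-version links, the links are adjusted when a version is deleted, and a read returns only those versions the user may at least read.

```python
class VersionStore:
    """Hypothetical in-memory stand-in for a DFR server's version bookkeeping."""

    def __init__(self):
        self.previous = {}  # document id -> ids declared as its previous versions
        self.next = {}      # document id -> ids of its next versions (store-maintained)
        self.readers = {}   # document id -> users holding at least read access

    def create(self, doc_id, readers, previous_versions=()):
        # The user declares the previous versions; the store updates the
        # Next Versions value of each of them as a side effect.
        self.previous[doc_id] = set(previous_versions)
        self.next[doc_id] = set()
        self.readers[doc_id] = set(readers)
        for prev in previous_versions:
            self.next[prev].add(doc_id)

    def delete(self, doc_id):
        # Discarding a version removes it from the links of its neighbours.
        for prev in self.previous.pop(doc_id):
            self.next[prev].discard(doc_id)
        for nxt in self.next.pop(doc_id):
            self.previous[nxt].discard(doc_id)
        del self.readers[doc_id]

    def read_next_versions(self, doc_id, user):
        # Only versions the user can at least read appear in the result.
        return sorted(v for v in self.next[doc_id] if user in self.readers[v])


store = VersionStore()
store.create("v1", readers={"alice", "bob"})
store.create("v2", readers={"alice"}, previous_versions=["v1"])
store.create("v3", readers={"alice", "bob"}, previous_versions=["v1"])
```

In this sketch a user without read access to "v2" would not see it listed among the next versions of "v1", which mirrors the filtering rule in the entry.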

nominal inscription A form of general inscription which refers to one or more specific persons by name. (19)-12 SEE inscription

non-basic ODA - A qualifier for attribute values, control function parameter values and other capabilities that are only allowed in document interchange in the context of a given document application profile (DAP) if their use is declared in the document profile.(2)-9

non-basic document characteristics ODA - A set of attributes that must be declared in the document profile to be exchanged. They comprise profile character sets, comment character sets, and alternative representation character sets. SEE ALSO document characteristics; non-basic

non-basic structure characteristics ODA - A set of attributes regarding the structure of the document that must be declared in the document profile to be exchanged. It comprises only one attribute, number of objects per page. SEE ALSO document structure; non-basic

notification DIP - The publication of the purport of a document whose purpose is to express that the act consigned to the document is communicated to all who may be affected by it as well as those who are directly concerned. Usually follows the preamble in a dispositive document and is recognized by such formulas as "Be it known" or "notum sit".(19)-13 s: No standards equivalent n: DFR document content ODA generic logical structure; specific logical structure; content portion SGML element; element definition; element declaration b: DIP protocol SGML front matter r: DIP entitling; dates; invocation; superscription; salutation; subject

non-processable document DP - A format for document representation, such as a bit-map pattern of a page image or a page image defined by a proprietary page definition language, that prevents the document from being edited or manipulated in another computer system, unless operating the same software.(1)-169 SEE form of transmission

Number-of-Group-Members DFR - A DFR-specific attribute, part of the basic attribute set, that specifies the number of members in a DFR group.(8)-83 SEE archival bond; compound document; document SEE ALSO group

Number of Objects per Page ODA - An attribute of non-basic structural characteristics that specifies the number of specific layout objects per page used in the document. This attribute is only specified if the number of objects per page exceeds the value specified by the document application profile. (5)-8 SEE content articulation

Number of Pages DFR - A non-DFR specific attribute, part of the extension attribute set, that specifies the number of pages in the specific layout structure (if any) of the document. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, number of pages. (8)-92 ODA - An attribute of content that specifies the number of pages in a specific layout structure (if any) of the document. (5)-14 SEE content articulation

object DM - A data element that includes both data and the methods or processes that act on that data.(26)-G-10 DFR - One of a set of information entities managed by a DFR-server. DFR-objects defined are DFR-documents, DFR-groups, DFR-references, and DFR-search-result lists.(8)-6 • A DFR object consists of attributes and content and is introduced into a document store by creating a DFR entry.(8)-13 • A DFR object is immediately contained by a DFR group known as the parent. There can be only one parent per object. Both the parent and the object itself are each members of the DFR group. (8)-13 • There can be only one object per DFR reference, but an object can have several different references.(8)-12 ODA - An element of a generic structure from which objects with common characteristics may be derived.(2)-9 SEE document; content articulation

object-class DFR - A DFR-specific attribute, part of the basic attribute set, that indicates the class of a DFR object and is associated with every DFR object.(8)-81 • A DFR-attribute indicating the class of a DFR-object, i.e. whether it is a DFR-document, group, reference or search-result list.(8)-6 ODA - Groups of similar logical (e.g. chapter or section hierarchy) or layout objects (e.g. size or style), or content objects (e.g. page headers, or footers). Object classes may include groups of entire documents such as memoranda, or a report in which case they may be called document classes.(2)-17 • An element of a generic structure from which objects with common characteristics may be derived.(2)-9 SEE document; content articulation SEE ALSO document class

object class description ODA - A set of attributes that specify the properties of an object including its relationships, if any, with other components.(2)-9 SEE document; content articulation SEE ALSO object

object content DFR - The actual information stored with the DFR object. The nature of the content depends on the object class. For a document group, content consists of a string of unique permanent identifiers (UPIs) for all its members. For a DFR document, the content is a body of information, e.g. an office document. The DFR content of a DFR Reference is a pointer to some other DFR object (Group, Document or Search Result List) called a referent.(8)-13 SEE content; document

object identifier DFR - The direct reference component of the DFR document content which is equivalent to the DFR Document-type attribute.(8)-14

object tree DFR - The DFR-object tree is the tree formed by a DFR-group and its descendants.(8)-6 SEE descendants
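The relationship between DFR objects, their single parent group, and the object tree can be sketched as follows. This is an illustrative model only, not the data structure defined in ISO 10166: the class name DfrObject, the field names, the string values used for the object class, and the function object_tree are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class DfrObject:
    upi: str            # unique permanent identifier (string form is an assumption)
    object_class: str   # "document", "group", "reference" or "search-result-list"
    parent: Optional["DfrObject"] = field(default=None, repr=False)
    members: List["DfrObject"] = field(default_factory=list)  # used by groups only

    def add_member(self, child: "DfrObject") -> None:
        # Only a group may contain other objects, and each object
        # has exactly one parent group.
        if self.object_class != "group":
            raise ValueError("only a DFR group can contain members")
        if child.parent is not None:
            raise ValueError("a DFR object can have only one parent")
        child.parent = self
        self.members.append(child)


def object_tree(group: DfrObject) -> List[str]:
    """Return the identifiers of a group and all its descendants, depth first."""
    result = [group.upi]
    for member in group.members:
        result.extend(object_tree(member))
    return result


root = DfrObject("root", "group")
files = DfrObject("g1", "group")
memo = DfrObject("d1", "document")
root.add_member(files)
files.add_member(memo)
```

Calling object_tree(root) walks the group and its descendants; the single-parent check in add_member reflects the rule that there can be only one parent per object.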

object type ODA - A property of every component that specifies which attributes are permitted in the description to which it applies and indicates the role of the component in the document architecture. (2)-9 SEE document architecture

ODIF - Office Document Interchange Format ODIF (ISO 8613-5) is a data stream defined in terms of a set of data structures, called "interchange data elements", which represent the constituents (document profile, object descriptions, object class descriptions, presentation styles, layout styles and content portion descriptions) of a document. ODIF uses the Office Document Language (ODL) to represent and process documents. ODL uses SGML names and markup conventions for representing the constituents and attributes of a document. (2)-21

office documents SGML TEI - Various classes of documents (e.g. reports, correspondence, memoranda) sharing common features. Office documents are divided into front matter, text and back matter. The essential features are contained in the front matter and are defined in a core tag set which corresponds to a header. Optional tags and crystals define other features. (7)-188,189 SEE ALSO front matter; text; back matter; crystals; document type; document profile

official record ARCH - An original record, or an authentic copy, whose written form is required administratively and/or legally (but not necessarily beyond the time necessary for its consequences to take place). (16)-10 • A record in law, having the legally recognized and judicially enforceable quality establishing some fact.(1)-170 SEE dispositive document; authentic copy

organizational procedure SEE procedures

Organizations DFR - A non-DFR specific attribute, part of the extension attribute set, that identifies the originating organization(s) associated with the document. In the case of an ODA document, the value of this attribute can be taken by the DFR user from the ODA document profile where it is the equivalent of the ODA attribute, organizations. (8)-90 ODA - An attribute of originators that identifies the originating organization(s) associated with the document.(5)-11 SGML TEI - A crystal identifying organizations as part of office documents. It includes tags for the name of the organization, division or department, and the address.(7)-191 SEE persons; juridical persons; SEE ALSO Originators


original Diplomatics examines the concept of originality and points out the common denominators of all originals, independently of time and place of creation. The first element of originality is that indicated by the English legal definition, which derives from its etymology: the Latin word originalis means primitive or first in order. The second necessary element is perfection. To be original, a document must be perfect, a term which both legally and diplomatically means complete, finished, without defect and enforceable. A perfect document is a document that is able to produce the consequences wanted by its author, and perfection is conferred on a document by its form.(4)-19 s: No standard equivalent n: DFR Status; version; Next-Version; Revision-Date-and-Time ODA Status; Start Date and Time; Security Information; Revision History; document architecture SGML Document Status; document type declaration b: DIP status of transmission r: DIP form of transmission; mode of transmission

Originators ODA - A group of document management attributes consisting of organizations, preparers, owners, and authors.(5)-11,12 SGML TEI - A tag that is part of the non-core tag set for the front matter of generic office documents.(7)-189 SEE persons; juridical persons

Other-Titles DFR - A non-specific DFR attribute that contains alternative titles for a DFR-object. In DFR, this attribute is taken from the ODA document profile.(8)-89 SEE title

other user information ODA - A group of document management attributes comprising copyright, status, user-specific codes, distribution list, and additional information.(5)-12 SEE annotations of management

Owner DFR - A security-subject, with owner access rights to a specific DFR-object.(8)-7 • A non-specific DFR attribute, part of the extension attribute set, that identifies the name(s) of the person(s) and/or organization(s) responsible for the content of the document. In the case of an ODA document, the value of this attribute can be taken by the DFR user from the ODA document profile where it is the equivalent of the attribute, owner. (8)-90 • The ability to modify the Access-List attribute and apply a committed reservation. Includes read-modify-delete access. A DFR-user creating a DFR object by create or copy abstract operation is automatically included in the DFR Access List attribute as the owner. (8)-24


ODA - An attribute of originators that identifies the name(s) of the person(s) and/or organization(s) responsible for the content of the document. This attribute consists of one or more entries, each with two optional parameters: personal name of owner and owner's organization.(5)-12. SEE persons; juridical persons SEE ALSO Originators

Page ODA - A layout component that corresponds to a rectangular area used for presenting the content of the document. (2)-10 SEE content articulation

SEE ALSO non-basic; Page Dimensions; Page Positions

Page Dimensions ODA - An attribute that specifies the non-basic values of the attribute "dimensions" of layout objects of type "page" used in the document. The value consists of one or more pairs of page dimensions.(4)-6 SEE content articulation SEE ALSO non-basic; Page; Page Positions

Page Positions ODA - An attribute that specifies the non-basic values of the attribute "page position" used in the document.(4)-8 SEE content articulation SEE ALSO non-basic; Page; Page Positions

parent DFR - Each DFR-object, except the DFR-root-group, is a DFR-group member of a DFR-group, which is termed its parent.(8)-7 SEE ALSO parent identification

Parent-identification DFR - A mandatory attribute, part of the basic attribute set, that identifies the DFR group of which the object is a member. Its value is equal to the unique permanent identifier of the DFR group that is the object's parent.(8)-82 SEE document; archival bond SEE ALSO Group

Personal Name SGML TEI - A crystal that contains a set of tags intended to encode the personal names in generic office documents. It consists of tags for title, personal name (forename, first name, and Christian name), family name (surname and last name) and a generational qualifier or other suffix (Jr., Sr. etc.).(7)-191 SEE persons

persons DIP - Entities who are the subject of rights and duties and as such are recognized by the juridical system as capable of, or having the potential for acting legally. Persons may be a collection or a succession or a private individual.(17)-5 s: no standards equivalent n: DFR Authors; Organizations; Owners; Preparers; Access List DIP author of document; author of act; addressee of act; addressee of document; writer; witness; countersigner ODA Originators: Authors; Organizations; Owners; Preparers; Authorization; Distribution List; Access Rights SGML Author; Preparer; Originator; Authorizing Person; Primary Recipient; Secondary Recipient; crystal - Personal Name b: DIP juridical persons r: DIP acts; attestation

phases of procedure Ideal or decontextualized series of formal steps common to every procedure by which the procedure is realized. These phases are, in order of their occurrence, the initiative, inquiry, consultation, deliberation, deliberation control, and execution.

initiation phase: Phase of procedure constituted by those acts, written and/or oral, which start the mechanism of the procedure. Examples of documents created in this phase are petitions, applications, claims, drafts, or bills. (18)-14

inquiry phase: Phase of procedure constituted by the collection of the elements necessary to evaluate the situation. Examples of documents created in this phase include surveys, estimates, curricula, technical reports, reference letters. (18)-14

consultation phase: A phase of procedure constituted by the collection of opinions and advice after all the relevant data have been assembled. Examples of documents created in this phase are agendas, minutes, memoranda, discussion papers. (18)-14

deliberation phase: A phase of procedure constituted by the final decision-making. Examples of documents created in this phase are appointment notices, contracts and laws. (18)-14

deliberation control phase: A phase of procedure constituted by the control exercised by a physical or juridical person different from the author of the document embodying the transaction, on the substance of the deliberation and/or on its forms. Sometimes, some form of control is necessary to insure the effectiveness of the deliberation and its enforceability. Examples of documents created in this phase are letters of transmission, memoranda, and definitive compilations of the documents embodying the transactions.(18)-14

execution phase: A phase of procedure constituted by all the actions which give formal character to the transaction (i.e. validation, communication, notification, publication). The documents created in this phase are the originals of those embodying the transactions. Examples are registrations, letters of transmission, letters to newspapers.(18)-15

Phases of procedure in general: s: No standards equivalent


n: ODA Start Date and Time; Expiry Date and Time (for execution phase) SGML Authorized By (for approvals at execution and deliberation control) b: DIP procedures r: DIP acts

physical form SEE extrinsic elements

preamble That part of a document that expresses the ideal motivation of the action. In modern legal documents, the preamble contains a citation of the laws, regulations, decrees or opinions on which the act rests. Today, just as in the past, it is possible to notice that some types of documentary form have their own specific, and often stereotyped, preamble. (19)-12,13 s: No standards equivalent n: DFR document content ODA generic logical structure; specific logical structure; content portion SGML element; element definition; element declaration b: DIP text SGML front matter r: DIP entitling; dates; invocation; superscription; salutation; subject

Preparers DFR - A non-DFR specific attribute, part of the extension attribute set, that identifies the name(s) of the person(s) and/or organization(s) responsible for the physical preparation of the document. In the case of an ODA document, this attribute can be taken from the document profile where it is the equivalent of the ODA attribute, preparers.(8)-90 ODA - An attribute of originators that identifies the name(s) of the person(s) and/or organization(s) responsible for the physical preparation of the document. The attribute consists of two or more entries, each with optional parameters. These are: a) personal name of preparer b) preparer organization.(5)-11 SGML TEI - A tag identifying the preparer of a document as part of the non-core tag set for generic front office documents.(7)-189 SEE writer; persons SEE ALSO originators

presentation ODA - The operation of rendering a document in a form perceptible to a human being. Typical presentation media are paper and video screens.(2)-13 SEE content articulation; medium; script SEE ALSO presentation features; presentation styles

presentation features ODA - An attribute that consists of one or more sets of presentation features used in the document. Each set pertains to a single content type and consists of presentation features that are specified as non-basic by the document profile. Presentation features consist of presentation attribute values, control function parameter values, sets of content elements, and their parameter values. The names of the sets of presentation features are: character presentation features, raster-graphics presentation features, and geometric-graphics presentation features.(4)-8 SEE content configuration SEE ALSO graphics; basic; basic layout object; basic layout component

presentation styles ODA - A constituent of the document, referred to from a basic logical or layout component, which guides the format and appearance of the document content. (2)-10 SEE content articulation SEE ALSO basic; basic layout object; basic layout component

preservation ARCH - The actions which enable the materials in archives to be retained for as long as they are needed, i.e. the basic functions of storing, protecting and maintaining records and archives in archival custody. (24)-476 SEE reliability SEE ALSO read-modify-delete; formatted; processing

Previous-Versions DFR - A DFR-specific attribute, part of the basic attribute set, that is multi-valued. It is defined only for DFR documents and is assigned by the DFR user when the document is declared a new version (in a create or modify abstract operation). It can then be modified by the DFR server provided that the document has not yet become a previous version for some other new version. After that, it cannot be modified. It is automatically updated if any specific previous version disappears (by means of a delete or modify abstract operation). When the value of this attribute is read by the DFR user, only those DFR documents to which the user has at least read access rights are included in the result of the abstract operation. (8)-84 SEE draft SEE ALSO Version; Version-Root; Next-Version; Version-Management; Superseded Documents

Primary Recipient SGML TEI - A tag, part of the non-basic core tag set for generic office documents. (7)-190 • This tag may correspond to the addressee of the document. (TG) SEE addressee

principle of provenance ARCH - Also known as respect des fonds, is "the principle of the arrangement of archival material that fonds of different provenance should not be intermingled."(21)-17 SEE ALSO archival bond; provenance

private document DIP - A document is private if it is created by a private person or by their command or in their name, that is, by a person performing functions considered to be private by the juridical system in which the person acts.(17)-16 SEE document management domain SEE ALSO public document

private space SEE document management domain; private document SEE ALSO public document

privilege DP - An indication of the access rights of a user or user program to the data of a computer system. If given a numeric value, it may be termed an "access control level. "(1)-173 SEE ALSO access

probative document DIP - If the purpose of the written form of the document was rather to produce evidence of an act which came into existence and was complete before being manifested in writing, the document was called probative, e.g. certificates and receipts.(29)-9 s: No standards equivalent n: DFR Document-Type ODA Document Type; document architecture class; User Specific Codes ERM business process SGML Document Type; document type definition b: DIP juridical relevance SGML office documents r: DIP dispositive document; narrative document; supporting document

procedure DIP - A body of written and unwritten rules whereby a transaction is effectuated and comprises the formal steps to be undertaken in carrying out a transaction. (3)

executive procedures. Those which allow for the regular transaction of affairs within limits and according to norms already established by a different authority.(18)-19

instrumental procedures. A type of procedure in which expressions of opinion are given or advice is sought. (18)-19

organizational procedures: Those aimed at the establishment of organizational structure and internal procedures, and their creation, modification, preservation, or extinction.(18)-19


constitutive procedures: Those procedures which create, extinguish, modify or preserve the exercise of power. They comprise several types: procedures of authorization: those which consent to the exercise of powers already held by a physical or juridical person. They do not create powers but remove limits to their exercise.

procedures of limitation: those which deprive physical or juridical persons of powers or faculties;

procedures of concession: those which create new situations and new powers for the addressee. (18)-12

Procedures in general: s: No standards equivalent n: DFR User-Specific Codes ODA User Specific Codes SGML Action PRP business process b: DIP acts r: phases of procedure; process

procedure of authorization SEE procedure

procedure of concession SEE procedure

procedure of limitation SEE procedure

process DIP - A series of motions or activities in general, carried out to set oneself to work and go towards each formal step of the procedure. (22)-10 • Processes do not create reliable records because of their spontaneity and lack of rules. (22)-10 SEE ALSO phases of procedure DP - An operating system concept that refers to the combination of a program being executed and bookkeeping information used by the operating system. Whenever a program is executed, the operating system creates a new process for it. The process is like an envelope for the program: it identifies the program with a process number and attaches other bookkeeping information to it. Multiprocessing systems can run several processes at the same time. There is usually a one-to-one match between a process and a program. Multitasking systems allow a single process to run one or more programs at the same time. (25)-382,383 • Typical processing assignments given records are filing, reorganizing files, updating, printing and so forth.

processes ODA - The editing, layout, and imaging of documents.(2)-20 SEE ALSO processable; formatted processable; formatted

processable DP - A format for document representation that supports the capacity to communicate a document between two computing systems so that the transferred document can be edited by the recipient.(1)-173 ODA - A document that has been edited and is suitable for interchange for purposes of either further editing or formatting in layout.(2)-18 SEE narrative documents; supporting documents; phases of procedure

processing ODA - The carrying out of operations on a document, including editing, reformatting, presentation, filing, and retrieval.(2)-10 SEE ALSO processable

Profile Character Sets ODA - This attribute specifies the graphic character sets, other than the character set specified [for values of document profile attributes] used in those document profile attributes that consist of character strings. (4)-6 SEE content articulation

Proper-Group DFR - Any DFR-group other than the DFR-root-group.(8)-6 SEE document; archival bond SEE ALSO Group; Group-Interrelationships; Group-Content; Group-Member; Root-Group

protocol ARCH - A formal document embodying the terms of a legal transaction. (1)-174 • A diplomatic document, especially the final text of a treaty or compact, signed by the negotiators and subject to subsequent ratification.(1)-174 DP - A formal set of conventions governing the orderly exchange of information between communicating devices by defining such things as connection establishment, security provision, data sequencing, error control, etc. Protocols achieve efficient line utilization by reducing the amount of information transferred by distinguishing between device control information and data. (1)-174 DIP - That part of a document which "sets the scene" or contains the administrative context of the action consisting of an indication of the persons involved, time and place, and subject, and initial formulae.(19)-11

Document in general. s: no standards equivalent n: DIP inscription; invocation; superscription; date; entitling; salutation; subject DFR document content; Author; Owners; Preparers; User-Specific-Codes; Access-List ODA Distribution List; generic logical structure; specific logical structure; content portion; Author; Owners; Preparers; User-Specific-Codes; Security Information SGML element; element definition; element declaration b: DIP intrinsic elements SGML front matter r: DIP extrinsic elements

provenance ARCH - The organization or person creating a fonds.(21)-15 SEE archival bond SEE ALSO principle of provenance

pseudo-original SEE copy SEE ALSO original

public document DIP - A document created by a public person, at their command or in their name, that is, if the will determining the creation of the document is public in nature. (17)-16 SEE document management domain

Purge Date and Time DFR - An optional attribute, part of the extension attribute set, that specifies the date, and optionally, the time of day after which the DFR-document can be purged from the DFR document store. In the case of an ODA document, the value of this attribute can be taken by the DFR user from the ODA document profile where it is the equivalent of the ODA attribute, creation date and time.(8)-90 ODA - An attribute of dates and times that specifies the date, and, optionally, the time of day after which the document can be purged from wherever it is stored.(5)-10 SEE annotations of management; chronological date SEE ALSO Dates and Times

qualification of signature The mention of the title and capacity of the signer that accompanies the signatures of attestation. (19)-15 s: No standards equivalent n: DFR document content ODA generic logical structure; specific logical structure; content portion SGML element; element definition; element declaration b: DIP eschatocol r: DIP entitling; dates; invocation; superscription; salutation; subject


query DP - A request for information from a database.(25)-392 SEE ALSO transaction DIP - A question, esp. expressing doubt or objection.(27)-879 SEE annotations of handling

read-modify-delete DFR - Access to a DFR-object that permits the user to delete or move that object. This includes read-modify status.(8)-24 SEE access control

receivers DIP - A proposed term indicating those persons who are copied on a distribution list as opposed to the addressees. The two groups must be distinguished. In traditional textual records, the receivers are usually listed at the end of the document. SEE ALSO addressee s: No standards equivalent; no diplomatic equivalent n: SGML Secondary Recipient b: DIP addressee of the document ODA Distribution List r: DIP copy; secretarial notes

record ARCH - Recorded information, regardless of form or medium created, received and maintained by an agency, institution, organization or individual in pursuance of its legal obligations or in the transaction of business. (1)-176 DIP - A complete and effective archival document. Completeness and effectiveness are provided by form. A complete document is one that contains all the elements it is supposed to contain according to the administrative and legal system. An effective document is a document capable of achieving its purposes.(16)-9 • Recorded transactions communicated to other people in the course of business via a store of information available to them.(11)-12 • Not all documents are records. Only those that fulfill the necessary requirements of form can be considered records; their content is irrelevant for diplomatic purposes. Documents are the genus, records the species. • Records arise from administrative activities which manifest themselves in series of acts. These acts, and their documentation, are governed by written or unwritten rules of procedure which are revealed in the forms of the records.(11)-10 • The necessary components of records are medium, content, form, persons, acts.(22)-3 SEE archival document; completeness SEE ALSO form; document; written document. DP - A set of related data or words, treated as a unit. (1)-176

recordskeeping system
ARCH - An architecture of competent records creators, together with their equipment and support mechanisms, governed by policies and procedures for the management of documents made and received in the course of business, designed to ensure the reliability, authenticity and completeness of archival documents (or records) in the course of their creation, maintenance, use, and disposition.
SEE ALSO records

Reference
DFR - A DFR object which acts as a link to another DFR object, which is called the referent of the DFR-reference.(8)-6
• A DFR reference consists of a DFR content containing a pointer to the referenced DFR object (the referent) and DFR attributes.(8)-15
• A DFR reference allows an object to participate in more than one DFR Group without requiring distinct copies of the object to be created. The content consists of a pointer to a referenced object.(8)-15
• The attributes of the reference are associated with the reference alone, while those of the referent are associated with the referent alone.(8)-16
SEE links; document
SEE ALSO compound document; reference content; referent

reference content
DFR - The information stored in a DFR-reference for the purposes of identifying the referent.(8)-6
SEE ALSO Reference; links; compound document; document
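The relationship between a DFR reference and its referent can be illustrated with a minimal Python sketch. It is not taken from ISO 10166: the class names, field names and identifier strings are hypothetical, chosen only to show one object appearing in two groups without being copied, with the reference carrying attributes of its own.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Document:
    upi: str                      # hypothetical identifier string
    content: str
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Reference:
    upi: str
    referent_upi: str             # the reference content: a pointer to the referent
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Group:
    upi: str
    members: List[Union[Document, Reference]] = field(default_factory=list)


def resolve(ref: Reference, store: Dict[str, Document]) -> Document:
    """Follow the pointer held in the reference content to the referent."""
    return store[ref.referent_upi]


# One document, filed once, but visible from two groups.
report = Document("upi-1", "Annual report text", {"Title": "Annual report"})
store = {report.upi: report}

finance = Group("upi-2", [report])
audit_ref = Reference("upi-4", report.upi, {"Title": "Annual report (audit file)"})
audit = Group("upi-3", [audit_ref])

referent = resolve(audit_ref, store)
```

Here `referent` is the same object that sits in the `finance` group, while the title held by `audit_ref` belongs to the reference alone and leaves the referent's title unchanged.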

References to Other Documents
ODA - An attribute of external references that specifies references to any other associated documents and consists of one or more entries.(5)-13
SGML TEI - A tag, part of the non-core tag set of generic office documents.
SEE annotations of management; archival bond; links
SEE ALSO compound documents

Referent
DFR - That DFR-Object to which a DFR-Reference refers.(8)-7
SEE ALSO Object; Reference; links; document

register
ARCH - A list of events, letters sent and received, actions taken, etc. usually in simple sequence, as by date or number, and then often serving as a finding aid to the records, such as a register of letters sent or a register of visitors.(1)-177
DIP - Among the various types of copies are registers in which documents are reported in extenso.(3)
DP - A storage device having a specified storage capacity such as a bit, a byte, or a computer word and usually intended for a special purpose.(1)-177
SEE ALSO copy

Release Date and Time ODA - This attribute specifies the date, and optionally, the time of day after which the document can be released from any restrictions specified in the attribute, Security Classification.^)-10 SEE security

reliability
DIP - A record endowed with trustworthiness. Specifically, trustworthiness is conferred on a record by its degree of completeness and the degree of control on its creation procedure and/or its author's reliability.(22)-9
• Where electronic records are concerned, in addition to the elements of intellectual form, the profile of every record must, to be reliable, include date, time, author, addressee, subject.(22)-22
• If received from outside, it must include date of receipt, time of receipt, date of further transmission, time of further transmission, author, addressee, classification code, and registry number (if applicable).(22)-22
• Control of access to document management domains is an important constituent of the reliability of electronic documents.(22)-23
SEE ALSO access; authenticity; completeness; document management domains; security

s: No standard equivalent
n: DFR Authors; Create-Date-and-Time; Created-By; Subject; Revision-Date-and-Time; Reserved-By; Reservation; Access-List; UPI; Document-Type; Document-Architecture-Class; Document Content ODA document architecture class; interchange format class; Subject; Document Type; Creation Date and Time; Revision History; Authors; Distribution List; Reference to Other Documents; Authorization; Access Rights; Security Classification; Local Filing Date and Time SGML document type declaration (DTD); document instance; Title; Creation Date; Originator; Author; Primary recipient; Document Status; annotations
b: No broader term
r: DIP authenticity; completeness
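The profile requirements listed under reliability lend themselves to a simple completeness check. The following Python sketch is offered only as an illustration under stated assumptions: the key names are hypothetical and do not come from ODA, DFR or SGML, and the two lists are applied exactly as stated in the entry, one for records made internally and one for records received from outside.

```python
from typing import Dict, List

# Elements the entry above requires in the profile of every electronic record.
MADE_INTERNALLY = ("date", "time", "author", "addressee", "subject")

# Elements required when the record is received from outside.
RECEIVED_FROM_OUTSIDE = (
    "date_of_receipt",
    "time_of_receipt",
    "date_of_further_transmission",
    "time_of_further_transmission",
    "author",
    "addressee",
    "classification_code",
)


def missing_elements(
    profile: Dict[str, str],
    received: bool = False,
    registry_applicable: bool = False,
) -> List[str]:
    """Return the required profile elements that are absent or empty."""
    required = list(RECEIVED_FROM_OUTSIDE if received else MADE_INTERNALLY)
    if received and registry_applicable:
        # The registry number is required only "if applicable".
        required.append("registry_number")
    return [name for name in required if not profile.get(name)]


memo = {
    "date": "1995-10-02",
    "time": "09:30",
    "author": "A. Writer",
    "addressee": "Registrar",
}
gaps = missing_elements(memo)
print(gaps)  # ['subject']
```

An empty result would mean that the profile carries every element named in the entry; it says nothing about the other constituents of reliability, such as control of access.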

representation
ARCH - The intellectual form in which information is presented for consumption by humans.
EDMS - The form in which information is presented for consumption. These forms include image, text, voice, video, tables and graphics.(10)-Glossary
SEE ALSO format; medium

Reservation DFR - A DFR-specific, mandatory attribute, part of the basic attribute set, that indicates whether a DFR object is reserved or not. This attribute is associated with each DFR object.(8)-87 SEE security

Reserved-by DFR - A DFR-specific, mandatory attribute, part of the basic attribute set, that identifies the security subject on whose behalf the DFR user has reserved this DFR object. It is absent when the DFR object is not reserved and can be read only by a DFR user having at least extended read-access rights.(8)-87 SEE security

Resource Document ODA - A generic-document containing one or more object class descriptions referred to by one or more object class descriptions of another document.(2)-11 SEE Document

Resource-Limit DFR - A mandatory attribute, part of the basic attribute set, that specifies the maximum resource to be used for a DFR object based upon accounting information. The resource limit includes the space required to store content (if a document), the DFR object tree (if a DFR group) and any associated attributes.(8)-83 SEE annotations of management; Resource-Used

Resource-Used DFR - A mandatory attribute, part of the basic attribute set, that contains information for accounting purposes based on resources used during some period of time, for example, the actual amount of storage used for the DFR object in the document store. The resource used includes both the space required to store content (the DFR object tree in the case of a DFR Group) and any associated attributes.(8)-83 SEE annotations of management; Resource-Limit

responsibility DIP - The obligation to answer for an act.(17)-8 SEE ALSO competence


result document N SEE source document

Revision-Date-and-Time DFR - A non-specific DFR attribute, part of the extension attribute set, that specifies the date and optionally, the time of day on which a revision of the DFR object occurred. In the case of an ODA document, the value of this attribute can be taken by the DFR user from the ODA document profile where it is the equivalent of the attribute, revision date and time.(8)-90

SEE draft; Revision History; status of transmission

Revision History
ODA - An attribute of dates and times that specifies the history of the document, indicating when, where and by whom the document was created and revised. The value of this attribute consists of a sequence of groups of parameters. Each group forms an entry in the history. The first group in the sequence provides information on the creation of the document. The last group in the sequence provides information on the current version of the document. Each group consists of the following optional parameters: a) revision date and time; b) version number; c) reviser(s); d) version reference; e) user comments.(5)-10,11
SGML TEI - Part of the file header of an SGML conformant data file comprising a description of the processes and interpretations that took place during the transfer of the text from the source to the data file, and a description of any subsequent editorial or other modifications made to the data file.(7)-55
SEE draft

roboratio
DIP - The most solemn phase of procedure in which the document is validated.(18)-13

Root-group DFR - The distinguished DFR-group within a DFR-document store having no ancestor and whose DFR-object tree encompasses all DFR objects in the DFR document store.(8)-6 SEE document; archival bond SEE ALSO Group; Group-Interrelationships; Group-Content; Group-Member; Proper-Group; Object-Tree; Document-Store

rogatio
DIP - A phase of procedure in which the request is made to compile a document which has been presented orally by the parties to the notary. The rogatio corresponds to the iussio expressed by public authorities, even if it has the diplomatic configuration of a contract.(18)-12

script
DIP - An extrinsic element concerned with the way the content of the document is physically articulated or presented by such means as handwriting, fonts, page layout, use of paragraphs etc. "Computer software may be considered part of the extrinsic element "script" because it determines the layout and articulation of the discourse, and can provide information about provenance, procedures, processes, uses, modes of transmission, and last, but not least, authenticity."(19)-7
SEE content articulation; content configuration
SEE ALSO language; medium; seals; special signs; annotations; specific document structure; software

secretarial notes
DIP - The qualification of signature may be followed by the secretarial notes (e.g. initials of the typist, mention of enclosures etc.) but usually it constitutes the last intrinsic element of documentary form.(19)-15
s: No standards equivalent
n: DFR Preparers ODA generic logical structure; specific logical structure; content portion, Prepared By SGML element; element definition; element declaration; Preparer
b: DFR document content DIP eschatocol
r: DIP entitling; dates; invocation; superscription; salutation; subject

Secondary Recipient
SGML TEI - A tag, part of the non-core tag set for generic office documents.(7)-190
• Use of secondary recipient could correspond to receivers. (TG)
SEE addressee; recipients
SEE ALSO Distribution List

security
ERM - Techniques for ensuring that data stored in a computer cannot be read or compromised. Most security measures involve data encryption and passwords. Where mode of transmission is concerned, articulation of the circumstances and manner of transmitting records from one space to another either automatically or manually, and of receiving records from outside in any of the spaces.
s: DIP No diplomatic equivalent ODA Security Information
n: DFR Access-List; Reserved-By; User; Reservation ODA Access Rights; Authorization; Security Classification SGML Authorizing Person; Sensitivity
b: ARCH custody; preservation
r: DIP mode of transmission

security classification
ODA - "This attribute specifies the security classification assigned by the document owner(s) relating to such aspects as the visibility, reproduction, storage, audit, and destruction requirements."(4)-14
SEE annotations of management
SEE ALSO security

simple act
SEE acts

simple copy
SEE copy

soft copy
DP - Data temporarily displayed on a video screen, in contrast to hardcopy, which is printed output from a computer.(1)-180
SEE ALSO view

software
DP - "Computer instructions or data. Anything that can be stored electronically or displayed on paper is software. The storage devices and display devices are hardware."(25)-435,436
SEE script

Source Description
SGML TEI - A part of the bibliographic file description intended to provide a usable bibliographic reference to the copy text used in preparing a machine-readable text, not necessarily a detailed description with the level of detail found in a library catalogue.(7)-55
SEE annotations of management
SEE ALSO back matter

source document
DP - A document containing information entered into a computer.(1)-181
ODA/SGML - In an open document processing system, such as ODA or SGML, the document that is transmitted. The source document is defined by a pre-determined description in processable terms and is the equivalent of the result document.(9)-149
• The attributes of the source and the result document are mapped to the automatic processor and must agree in their document profile.
SEE ALSO basic processing model; extended processing model; mapping; source document instance - SGML; specific logical structure


special signs
DIP - Extrinsic elements of documents that consist of the signs of the writer and subscribers and the signs of the chancery or records office. Examples of the signs of the writers include the symbols used by notaries as personal marks in the medieval period, corresponding to the modern notarial stamp, and the crosses used by some subscribers in place of their name. Examples of the signs of the registry and chancery include the rota and bene used by the papal chancery, and office and archival stamps.(19)-8,9
s: No standards equivalent
n: DFR User-Specific-Codes; document content ODA User Specific Codes; generic logical structure; specific logical structure; content portion SGML element; element definition; element declaration
b: DIP extrinsic elements
r: DIP attestation

specific document structure
ODA - The structure that the user may read.(2)-13,14
SEE script; content articulation

specific layout structure
ODA - A set of layout objects and associated content portions.(2)-11
SEE content articulation
SEE ALSO source document; generic logical structure; Number of Pages

specific logical structure
ODA - A set of logical objects and associated content portions.(2)-11
SEE content articulation
SEE ALSO source document; generic logical structure

Start Date and Time ODA - An attribute of dates and times that specifies the date and, optionally, the time of day after which the document is considered to be valid.(5)-10 SEE phases of procedure - execution phase; dates; completeness Status DFR - A non-DFR specific attribute, part of the extension attribute set, that specifies the document status, e.g. working paper, draft proposal etc. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, status.(8)-91 ODA - An attribute of other user information that specifies the document status, i.e. whether it is a draft, working paper, or original.(5)-12 SEE status of transmission


status of tradition
SEE status of transmission.

status of transmission
The primitiveness, completeness, and effectiveness of a record when it is initially set aside after being made or received.(22)-13
SEE ALSO completeness; copy; draft; original; dates

subject
DIP - An intrinsic element of documents following the inscription consisting of a statement that signifies what the document is about. The subject has been stated in some court records since the last century, but has generally been introduced into records of governmental bureaucracies and, by extension, into business records during this century.(19)-12
s: ODA Subject DFR Subject SGML Subject Field
n: DFR document content ODA generic logical structure; specific logical structure; content portion SGML element; element declaration; element definition
b: DIP protocol
r: DIP intrinsic elements

Subject
ODA - An attribute of document description that contains information to indicate the subject of the document.(5)-9
DFR - A non-specific DFR attribute, part of the extension attribute set, that contains information to indicate the subject of a DFR-object. In the case of an ODA document, the value of this attribute can be taken from the ODA document profile.(8)-89
SGML TEI - A tag, part of the non-core tag set for generic office documents.(7)-189
SEE subject

subscription
SEE attestation

Superseded Documents
ODA - "This attribute specifies reference(s) to document(s) superseded by the current document. It consists of one or more entries."(4)-13
SEE draft
SEE ALSO Version-Root; Previous-Version; Version-Management; Next-Version

superscription
DIP - A typical element of the protocol used to be the superscription, that is, the mention of the name of the author of the document and/or the action. Today the superscription tends to take the form of an entitling: sometimes, however, it coexists with the entitling. It still appears by itself in all contractual documents where it includes the mention of the first party, in declarative documents (those beginning with the first person pronoun followed by the name of the subscriber) and in holographic documents, such as wills, e.g. "This is the last will and testament of..."(19)-12
s: No standards equivalent.
n: DFR Authors; Owners ODA Authors; Owners; generic logical structure; specific logical structure; content portion SGML Author; element; element declaration; element definition
b: DIP protocol
r: DIP intrinsic elements; entitling

supporting documents DIP - Documents constituting written evidence of an activity which does not result in a juridical act but is itself juridically relevant. Examples would include working papers.(11)-19 s: No standards equivalent n: DFR Document-Type ODA Document Type; document architecture class; User Specific Codes ERM business process SGML Document Type; document type definition b: DIP juridical relevance SGML office documents r: DIP dispositive document; narrative document; probative document

tag

SGML - Descriptive markup.(23)-19

text DP - Text is words, sentences and paragraphs. The content of a word processing document is called text. Contrast with data, which is a precisely defined unit of information, such as name and address.(1)-184 DIP - An intrinsic element that is the part of the document that contains the action, including the considerations and circumstances that gave origin to it, and the conditions related to its accomplishment. s: no standards equivalent n: DIP preamble; notification; exposition; disposition; final clauses DFR document content ODA generic logical structure; specific logical structure; content portion SGML element; element definition; element declaration b: DIP intrinsic elements SGML body matter r: DIP extrinsic elements


text unit
ODA - A data structure representing a content portion description.(2)-12
SEE content articulation

title
The title of a document e.g. indenture, agreement, last will and testament, report.(3)
s: DFR Title; Version-Name ODA Title SGML Title
n: ODA Document Type; generic logical structure; specific logical structure; content portion SGML element; element definition; element declaration
b: DIP intrinsic elements SGML front matter
r: DIP extrinsic elements SGML bibliographic file description

Title DFR - A mandatory attribute, part of the basic attribute set, that gives the name of the DFR object as specified by the DFR user.(8)-81 ODA - An attribute of document description that gives the name of a document as specified by the author.(4)-9 SGML TEI - The title of the work, part of the front matter. (7)-74 SEE title SEE ALSO front matter

topical date The place as it appears in the document.(19)-11 s: No standards equivalent n: DFR User-Specific-Codes; Pathname ODA Local File References(?); User Specific Codes; generic logical structure; specific logical structure; content portion SGML Address; element; element definition; element declaration b: DIP protocol r: DIP dates

transaction
ARCH - Information, communicated to other people in the course of business, via a store of information available to them.(1)-185
DIP - Juridical acts directed to the obtainment of effects recognized and guaranteed by the system.(4)-12
• In a transaction, a person administers their own interests with other persons. Therefore, a transaction is an expression of autonomy of a physical or juridical person who self-disciplines their own conduct in a binding way.(11)-7
• Documents which are reliable and complete, that is, able to convey information, capable of being used in a transaction, and of reaching the purposes for which they were produced, are transactions.(11)-12
DP - Any business activity or request that is entered into a computer system. Orders, purchases, changes, additions and deletions are examples of transactions in an information system. Queries and other requests are also transactions to the computer, but are usually just acted upon and not recorded in the system. Transaction volume is a major factor in determining the size and speed of a computer system.(1)-184
SEE record
SEE ALSO transaction processing; query

transaction log
DP - A record of transactions performed.(1)-185
SEE ALSO transaction; transaction processing; record

transaction processing
DP - A type of computer processing in which the computer responds immediately to user requests. Each request is considered to be a transaction. Automatic teller machines for banks are an example of transaction processing.(25)-472
SEE ALSO query; transaction log; transaction

Transparencies ODA - This attribute specifies the non-basic values of the attribute "transparency" used in the document.(4)-7 SEE content articulation; script

Types of Coding ODA - This attribute specifies the non-basic values of the attribute "types of coding" used in the document.(4)-8 SEE content articulation; script

uniqueness ERM - A functional aspect of the capture requirements of electronic recordskeeping systems that requires the system to assign a unique identifier to all business records upon their creation or receipt consistent with organizationally established naming conventions and classification schemes.(12)-20 SEE ALSO accuracy; authenticity; completeness; evidential context; reliability

Unique-Permanent-Identifier
DFR - A DFR-specific attribute assigned to every DFR object by the DFR server to identify unambiguously a DFR-object within the DFR-document store.(8)-7
• A DFR-specific attribute, part of the basic attribute set, that is used by the DFR server to uniquely identify a given DFR document, DFR group, DFR reference, or DFR search-result list within the document store. Once assigned by the server, the UPI can never be changed or reassigned regardless of whether the object exists or is deleted.(8)-81
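The rule that a UPI is never changed or reassigned, even after its object is deleted, can be sketched as follows. This is an illustrative Python model only: the class, its methods and the identifier format are hypothetical and are not part of the DFR service definition. A counter that only moves forward is one simple way to guarantee that an identifier is never issued twice.

```python
import itertools
from typing import Dict


class DocumentStore:
    """Toy store whose server-side counter never hands out the same UPI twice."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)   # monotonically increasing
        self._objects: Dict[str, str] = {}

    def add(self, content: str) -> str:
        upi = f"upi-{next(self._next_id)}"   # hypothetical identifier format
        self._objects[upi] = content
        return upi

    def delete(self, upi: str) -> None:
        # Deleting the object does not release its identifier for reuse.
        del self._objects[upi]

    def exists(self, upi: str) -> bool:
        return upi in self._objects


store = DocumentStore()
first = store.add("draft minutes")
store.delete(first)
second = store.add("final minutes")
```

After the deletion, `first` no longer resolves to an object, yet `second` receives a fresh identifier, so an old citation of `first` can never silently point at a different record.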

User DFR - The consumer of services supplied by a DFR-server. At any time the user is acting as a security subject and takes on the privileges of that security subject. (8)-7 SEE security

User Comments SGML TEI - A tag that is part of the Local File References in the non-core tag set for generic office documents.(7)-189 SEE archival bond; links; compound documents SEE ALSO User-Reference-to-Other-Objects; User-Reference; User-Specific- Codes

User-Reference DFR - A non-DFR specific, mandatory attribute, part of the basic attribute set, that contains an identifier for a particular DFR object. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, document reference. The attributes User Reference and User Reference to Other Objects can be used to establish DFR user references between objects stored in a document store. That is, the value of the attribute User Reference is a DFR-user-specific identifier for a DFR object. This identifier can be stored in the attribute User References to Other Objects. User Reference is managed by the user and has a single value.(8)-85 SEE archival bond; links; compound documents SEE ALSO User Comments; User-Reference; User-Specific-Codes

User-Reference-to-Other-Objects
DFR - "A non-DFR specific attribute that is part of the basic attribute set and contains references to other DFR objects. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, User-References-to-Other-Objects. The attributes User-Reference and User-Reference-to-Other-Objects can be used to establish DFR user references between objects stored in a document store. That is, the value of the attribute User-Reference is a DFR-user-specific identifier for a DFR object. If later a value of the attribute User-References-to-Other-Objects is used (for example in a Search abstract operation), the referent will be identified. This attribute can contain one or many references to other object."(8)-85
ODA - SEE DFR
SEE business process; acts; annotations; archival bond; links; compound documents
SEE ALSO User Comments; User-Reference; User-Specific-Codes
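The way User-Reference and User-Reference-to-Other-Objects work together can be shown with a short Python sketch. It assumes, purely for illustration, that each DFR object is a dictionary of attributes and that a search is a plain scan of the store; the sample values and the function name are hypothetical, and the real Search abstract operation is not modelled.

```python
from typing import Dict, List

# Each object is represented only by its attributes.
contract = {"Title": "Supply contract", "User-Reference": "C-1995-07"}
memo = {
    "Title": "Cover memo",
    "User-Reference": "M-1995-12",
    "User-Reference-to-Other-Objects": ["C-1995-07"],
}
minutes = {"Title": "Meeting minutes", "User-Reference": "N-1995-03"}
document_store: List[Dict] = [contract, memo, minutes]


def find_referents(source: Dict, store: List[Dict]) -> List[Dict]:
    """Return the objects whose User-Reference is cited by the source object."""
    cited = set(source.get("User-Reference-to-Other-Objects", []))
    return [obj for obj in store if obj.get("User-Reference") in cited]


linked = find_referents(memo, document_store)
```

In this sketch `linked` contains only the contract, which is how a user-managed identifier can express a link between two records without any help from the server.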


User-Specific-Codes DFR - A non-DFR specific attribute, part of the extension attribute set, that specifies additional user-specific code(s) for a DFR object, e.g. contract number, project number, budget. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, user-specific-codes.(8)-92 ODA - An attribute of other user information that specifies additional user-specific codes, e.g. contract number, project number, budget code.(5)-12 SEE business process; acts; annotations; archival bond SEE ALSO User Comments; User-Reference; User-Reference-to-Other-Objects

Version DFR - The DFR-document specified by the user as a derivation of one or more other DFR-documents by means of specific DFR-attributes. • Each version of a DFR document is itself a document (an individual entry in document store). A set of all documents considered to be versions of the same document is a conceptual document.(8)-20 SEE status of transmission; draft SEE ALSO conceptual document; Unique-Permanent-Identifier; Next-Versions; Version-Name; Previous-Versions; Version-Root; Superseded Documents

Version-Management DFR - A set of DFR-specific attributes, Next Versions, Previous Versions, Unique Permanent Identifier, and Version Root, managed by the DFR server in a predefined way, in order to make it possible for the user to have a very flexible user-defined version structure, and to provide the user with DFR-server assistance while navigating through this structure or attempting modifications. Version management may be qualified as user defined and server-assisted.(8)-20 • A DFR-reference to a multi-version document always points to a particular version (that is, to one specific DFR document in the document store).(8)-21 • Versioning follows three patterns: linear ordering, with only one previous and one next version; tree model, where a conceptual document has several next versions; and directed graph model, where a version can be declared following more than one previous version.(8)-21 • It is always the DFR-user who declares some existing or newly-created document to be a version. • Versions of DFR groups, references or search-result lists are not defined.(8)-21 SEE status of transmission; draft SEE ALSO conceptual document; Unique-Permanent-Identifier; Next-Versions; Version-Name; Previous-Versions; Version-Root; Superseded Documents
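The three versioning patterns named above (linear ordering, tree model, directed graph model) can be told apart from the Previous Versions links alone. The following Python sketch is an illustration under assumptions: the mapping from a version to its list of previous versions, the version labels and the function name are all hypothetical and do not reproduce the DFR attribute encoding.

```python
from collections import Counter
from typing import Dict, List


def classify_versions(previous: Dict[str, List[str]]) -> str:
    """Classify a version structure from each version's previous-version links."""
    # Count how many next versions each version has.
    next_counts = Counter(p for prevs in previous.values() for p in prevs)

    if any(len(prevs) > 1 for prevs in previous.values()):
        return "directed graph"   # some version follows more than one previous version
    if any(count > 1 for count in next_counts.values()):
        return "tree"             # some version has several next versions
    return "linear"               # at most one previous and one next version


linear = {"v1": [], "v2": ["v1"], "v3": ["v2"]}
tree = {"v1": [], "v2": ["v1"], "v3": ["v1"]}
graph = {"v1": [], "v2": ["v1"], "v3": ["v1"], "v4": ["v2", "v3"]}

print(classify_versions(linear))  # linear
print(classify_versions(tree))    # tree
print(classify_versions(graph))   # directed graph
```

The check for multiple previous versions comes first because a directed graph may also contain branching, so the more general pattern has to take precedence.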


Version-Name DFR - A DFR-specific attribute, part of the basic attribute set, that is a free-form attribute intended for the DFR user's use and management. It is defined primarily for those DFR documents that are declared to be versions (in the sense of DFR version management), but it can also be used for any other DFR document. It can also appear in a DFR reference to a DFR document, normally as a copy of the corresponding attribute of the referent. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, version name.(8)-83 • Version-name is only of marginal value in reinforcing the uniqueness of version values.(8)-21 SEE status of transmission; draft SEE ALSO conceptual document; Unique-Permanent-Identifier; Next-Versions; Previous-Versions; Version-Management; Version-Root; Superseded Documents

Version-Root
DFR - A DFR-specific attribute, part of the basic attribute set, that is defined and has the same value for all DFR documents which are declared versions of the same conceptual document, and, optionally, for DFR references to these documents. It is assigned for the first time by the DFR server when a DFR document is declared to be a version. The unique permanent identifier (UPI) attribute of the version then becomes the value of the attribute, Version Root for both the old and the new version. The value of the DFR Version Root attribute is then systematically copied by the DFR server into the DFR Version Root attribute of each new version of the same conceptual document. The DFR Version Root attribute remains valid even when the "original version", bearing the UPI which is the value, has been deleted.(8)-84
• Version-root is the only attribute in common verified by the server.
SEE status of transmission; draft
SEE ALSO conceptual document; Unique-Permanent-Identifier; Next-Versions; Previous-Versions; Version-Management; Version-Name; Superseded Documents

view
DP - The data which a user with a given permission set is permitted to see in a database.(1)-186
• A capability to see, but not to add or change, data in a system.(1)-186
SEE content articulation

virtual file store
DP - A file that appears to be a single file but is actually two or more linked files.(1)-186,187

virtual record
ARCH - A set of instructions for the creation of a record.(4)
DIP - Pointers needed to create documents.(22)-15
• Instructions for creating documents.
DP - The characteristics of an entity as perceived by the user, regardless of how they have been physically represented in a database. Thus an employee would have one virtual record, but may have numerous physical records linked together to accommodate repeating addresses, jobs held, benefits received, etc.(1)-186

writer
The person who is responsible for articulating the intellectual form of the document.(17)-7
s: No standards equivalent
n: DFR Created-By; Preparers; Authors ODA Preparers SGML Author; Preparers
b: DIP Persons
r: DIP Author

witness DIP - One of the persons responsible for a document whose signature may serve to confer solemnity on a document, or to authenticate the signature of an author (either of the act, of the document, or both), or to validate the content of the document, or its compilation, or to affirm that an act for which both oral and written form is required took place.(17)-8 s: No standards equivalent. n: No standards equivalent. b: DIP persons r: attestation; countersigner

written document DIP - Evidence which is produced on a medium by means of a writing instrument or by an apparatus for fixing the data, images and/or voices. The attribute "written" is not used in diplomatics in its meaning of an act per se (e.g. drawn, scored, traced) but rather in the meaning that refers to the purpose and intellectual result of the action of writing. That is, a written document is the expression of ideas in a form that is objectified (i.e. removed from the writer) and syntactic (i.e. governed by rules of arrangement).(3) SEE document; archival document


GLOSSARY SOURCES

(1) Advisory Committee for the Coordination of Information Systems (ACCIS). Management of electronic records: issues and guidelines. New York: United Nations, 1990. Glossary.

(2) Canadian Standards Association. CAN/CSA-Z243.221-90 (ISO 8613-1: 1989) Information Processing - Text and Office Systems - Office Document Architecture (ODA) and Interchange Format - Part 1: Introduction and General Principles. Rexdale, Ontario: Canadian Standards Association, 1990. 3.0 Definitions.

(3) Schaeffer, Roy. Diplomatic Definitions. Unpublished notes. December 1992. In possession of author.

(4) Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part I). Archivaria 28 (Summer 1989), 7-27.

(5) Canadian Standards Association. CAN/CSA-Z243.224-90 (ISO 8613-4: 1989) Information Processing - Text and Office Systems - Office Document Architecture (ODA) and Interchange Format - Part 4: Document Profile. Rexdale, Ontario: Canadian Standards Association, 1990.

(6) International Organization for Standardization (ISO). Implementation of ISO/IEC 10027: 1990: Information technology - Information Resource Dictionary System (IRDS) framework. Geneva: ISO/IEC, 1990.

(7) Association for Computers and the Humanities (ACH), Association for Computational Linguistics (ACL), Association for Literary and Linguistic Computing (ALLC). Guidelines for the Encoding and Interchange of Machine-Readable Texts. Draft. Version 1.1. C.M. Sperberg-McQueen and Lou Burnard, eds. Chicago and Oxford: Text Encoding Initiative, 1990.

(8) ISO/IEC JTC 1/SC 18 Text and Office Systems Secretariat: USA (ANSI). Revised Text of DIS 10166-1, Information Technology - Text and Office Systems - Document Filing and Retrieval (DFR) - Part 1: Abstract Service Definition and Procedures. New York: ANSI, 1991.

(9) Bormann, Ute, and Carsten Bormann. Standards for open document processing: current state and future developments, in Computer Networks and ISDN Systems 21. North Holland: Elsevier Science Publishers BV, 1991, 149-163.

(10) World Bank, Information Management Architecture (IMA). May 24 1993 internal document.

(11) Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part II). Archivaria 29 (Winter 1989-90), 4-17.

(12) Bearman, David. Records Systems as the Locus of Provenance: Implications for Automation of Archival Control and Management of Electronic Records. Paper submitted to Ontario Association of Archivists, May 1993.

(13) Delphi Consulting Corp. Handbook of Document Management Systems Evaluation and Design. Thomas K Kolopolous, ed. Boston, Mass: Delphi Consulting Group, 1991.

(14) Bellardo, Lewis J. and Lyn Lady Bellardo. A Glossary for Archivists, Manuscript Curators, and Records Managers. Chicago: Society of American Archivists, 1990.


(15) Law, Margaret Henderson. Guide to Information Resource Dictionary Systems Applications: General Concepts and Strategic Systems Planning. Washington, DC: US Department of Commerce, 1988.

(16) Duranti, Luciana. Managing Electronic Documents: Making Sense out of Chaos or "Records Management is Dead! Long Live Records Management!". Presentation to World Bank, April 27 1993, Washington, DC.

(17) Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part III). Archivaria 30 (Summer 1990), 4-20.

(18) Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part IV). Archivaria 31 (Winter 1990-91), 10-25.

(19) Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part V). Archivaria 32 (Summer 1991), 6-24.

(20) Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part VI). Archivaria 33 (Winter 1991-92), 6-24.

(21) School of Library, Archival and Information Studies, University of British Columbia. Select List of Archival Terminology. 1990

(22) Duranti, Luciana and Terry Eastwood. The Preservation of the Integrity of Electronic Records. Unpublished draft of research. School of Library, Archival and Information Studies, UBC 1995.

(23) ISO. International Standard ISO 8879 Information Processing: Text and Office Systems: Standard Generalized Markup Language (SGML) (Geneva: International Organization for Standardization, 1986)

(24) Australian Association of Archivists. Keeping Archives. 2nd edition. Judith Mills, ed. Melbourne, Australia. Thorpe with Australian Society of Archivists Ltd. 1993

(25) Margolis, Philip E. The Random House Personal Computer Dictionary. New York: Random House, 1991.

(26) O'Brien, James A. Introduction to Information Systems in Business Management, sixth ed. Homewood, IL and Boston MA: Richard D. Irwin Inc., 1991

CONCLUSION

The subject of the thesis has been the application of diplomatics to the control of electronic records through the document profile, or its equivalent, in certain open document exchange standards. Central to this proposition is the idea that diplomatics offers a set of decontextualized metadata that can be applied to the document profile in the same manner that international document exchange standards can be used to establish requirements without prescribing an actual implementation. In effect, this is to suggest that diplomatics has the character of a standard. After examining the standards, it now seems more accurate to say that diplomatics is a sort of conceptual metastandard - a standard upon which standards can be based. Just as there is a direct relationship between general and special diplomatics, or between the theory of diplomatics and its application, so the same may be said of the relationship between the open document exchange standards discussed here and the implementation, through specific applications, of their descriptive requirements. There is, however, a fundamental difference, in that open document exchange standards are not, in themselves, a general theory - they are expressions of a theory of document exchange, namely, that documents have a structure that permits them to be mapped to an electronic system in such a way that they can be exchanged with accuracy, completeness, and manipulability. In the same way, diplomatics offers a theory of the archival document, or record, that can be used to model the descriptive attributes used by open document exchange standards such as ODA or DFR, but diplomatics itself is not a standard in the sense of being designed to be accessible to, and implemented by, a community of electronic recordskeeping system designers.
Standards that are specifically designed to capture the archival document remain to be written, although there are attempts in progress such as the Metadata Encapsulated Object (MEO) proposed by David Bearman, and the work which Luciana Duranti and Terry Eastwood are now doing with the US Dept. of Defense.

What of the nature of the document profile itself? Central to the attempt to examine the relationship between diplomatics and electronic records management through the medium of the document profile is the proposition that the conceptual, idealized document posited by diplomatics can be realized in the document profile of open communication standards. The document profile then becomes a surrogate, conceptual document or, in other words, a conceptual profile. The analysis of the standards indicates that diplomatics, in this respect, is filling a vacuum. Those attributes that exist seem to be there out of custom. They are there because the standards are designed to capture documents whose context and structure are taken for granted by the standards designers, just as one does not bother to explain what is assumed to be obvious. There seems to be no rationale behind the selection of these attributes apart from broad assumptions about the nature of the office and publishing environments in which documents are created or received. That ODA and DFR do not do a bad job of capturing characteristics of archival documents is perhaps due to their specific design as document management standards, in contrast to SGML, rather than to any theoretical definition of the document. This is evident from the self-referential definitions of document used by each of these standards, definitions which are designed to serve the purpose of the standard and are not conceived within the context of a broad and rigorous theoretical conception of the document, let alone the archival document or record.

The criticism can be extended to the document profile which, when compared against the characteristics of a document, turns out not to be defined so much in terms of a document as of a data object, a far broader concept. A profile is actually a means of describing and manipulating whatever object may be defined to its structure. In the absence of any rigorous theoretical construct of the document, the association of "document" with profile is therefore something of a misnomer. It would be truer to refer to an "object profile" and to define a family of profiles, one of which would be a document profile, and even a record profile. This is more than a mere taxonomic convenience. Just as the paper record, and its papyrus and parchment predecessors, came to displace the oral record as the predominant record form, perhaps we should see in the profile not simply a surrogate of the paper-defined conceptual document (an idea that the concept of the "virtual" document does not really escape) but an altogether new paradigm of record form, separating the model of the paper document from the idea of the record, which can be so many other things in the digital world. Thus, just as the medieval scribe scraped clean the parchment and ruled lines, so authors and writers will now define a profile.
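The idea of a family of profiles can be pictured with a minimal Python sketch. It is an illustration only: the class names, the field names, and the sample values are assumptions made here and are not drawn from ODA, DFR, or SGML. It shows a generic object profile from which a document profile and a record profile are specialized, the record profile adding an addressee and explicit links to related records.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectProfile:
    """Generic profile describing any data object (hypothetical base class)."""
    object_id: str
    attributes: dict[str, str] = field(default_factory=dict)


@dataclass
class DocumentProfile(ObjectProfile):
    """Profile of a document: adds an author and a date (illustrative fields)."""
    author: str = ""
    date: str = ""


@dataclass
class RecordProfile(DocumentProfile):
    """Profile of a record: adds an addressee and explicit links to related records."""
    addressee: str = ""
    archival_bond: list[str] = field(default_factory=list)


# Sample values are invented for the illustration.
memo = RecordProfile(
    object_id="rec-001",
    author="A. Clerk",
    date="1995-10-01",
    addressee="B. Manager",
    archival_bond=["rec-000"],
)
```

Because each profile inherits from the one above it, a record profile can be handled wherever an object profile is expected, which is one way of reading the claim that the document profile is only one member of a broader family.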

The thesis has proven to be as much about finding a way for archivists to approach the territory of electronic recordskeeping system design as it is about the nature of open document exchange standards as a tool of records management. Fundamental to its purposes has been to place the record within the context of the document profile as an instrument of document management, which is why the thesis takes pains to explain the nature and construction of electronic recordskeeping systems and the problem of document structure in data processing. This is to ask what the archivist really needs to know about electronic recordskeeping system design or, to put it another way, what special knowledge or viewpoint the archivist can contribute to the design process. The thesis assumes that the archivist needs above all to understand the record, and that it is this grasp of its nature and characteristics which provides a yardstick against which to measure the usefulness of open document exchange standards. But, in fact, it proves to be more than that: by positing the existence of an "open" document profile, the thesis demonstrates that diplomatics offers not only a critique of the profile, but an actual design tool, something that archivists and system designers can share in the form of requirements.

But if the general theory and methodology of diplomatics is relevant to the design of electronic recordskeeping systems, how effective is diplomatics as a design tool?

Underlying the concern of this thesis with diplomatics and open document exchange standards has been the interface between recordskeeping and the design of electronic recordskeeping systems. In effect, this is a problem of object design. There are two approaches to the design of an object. One is by trial and error; the other is to proceed from a theoretical model that may be applied through trial and error but nonetheless regards reality not as the be all and end all, but as only one of many possible manifestations. Archivists have tended to take an organic approach to theory - that what theory there may be is inherent in the archival object, as a fonds, or as a series, which is therefore permitted to define itself. For instance, the basic qualities of archives - naturalness, interrelationship, uniqueness, paternity - are all qualities that must be present in records before they can be considered an archives. In other words, these are found - as opposed to designed - qualities. This highlights a profound difference with the design of electronic recordskeeping systems, which are based upon a pre-defined set of requirements. There is no way an archive can be translated into a set of system requirements in the sense of found qualities. A designed archive is, by its very nature, a false archive.

How does diplomatics change this picture? Diplomatics is a way of defining the record-object. It is able to do so because, unlike archival theory, it does not assume that the object defines itself. The archival document consists of constituents some or all of which may be required by the juridical system in which it is generated in order to manifest completeness, reliability and authenticity. These requirements arise from a conceptual document, or model of a document, identified and tested over centuries of research and study. The point is that these requirements are not organic in the sense of permitting the archival document to define itself - they are not "found" qualities, but form a set of established characteristics that are manifested in reality in the variety of records forms and through the recordskeeping system in which they arise. Assuming the validity of the conceptual document model, diplomatics thus provides an invaluable bridge between electronic recordskeeping design methodology and archival theory.

This is not to say that diplomatics, as it now exists, is perfectly suited to describe electronic records. Some concepts, such as the juridical system, are apparently too abstract in their definition to be captured in concrete terms in the document profile, even if the system itself nonetheless exists within the context of a juridical system. Others, particularly the persons, and the intrinsic and extrinsic elements, translate readily into attributes and constituents of electronic records. The same may be said of the diplomatic concepts of reliability, authenticity and completeness, which underscore qualities that records must have if they are to have any value at all, regardless of how or where they are created, and that can be readily deduced from the presence or absence of identifiable attributes and recordskeeping processes. In some respects, particularly in regard to the logical structure of electronic records, diplomatic concepts such as content articulation have proven to be too generalized to handle all the elements, but this may be as much a criticism of the open document exchange standards in question (ODA in particular, which has been criticized on this score as being far too complex) as it is of diplomatic terminology.
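The point that such qualities can be deduced from the presence or absence of identifiable attributes can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions: the list of required attributes and the sample profile are hypothetical, and a real list would have to come from a diplomatic analysis of the records concerned.

```python
# Hypothetical attribute names; a real list would come from a diplomatic
# analysis of the records in a particular recordskeeping system.
REQUIRED_ATTRIBUTES = ("author", "addressee", "date", "act")


def missing_attributes(profile: dict[str, str],
                       required: tuple[str, ...] = REQUIRED_ATTRIBUTES) -> list[str]:
    """Return the required attributes that are absent or empty in the profile."""
    return [name for name in required if not profile.get(name, "").strip()]


def is_complete(profile: dict[str, str]) -> bool:
    """Treat a profile as complete when no required attribute is missing."""
    return not missing_attributes(profile)


# Invented sample profile with no addressee recorded.
sample_profile = {"author": "A. Clerk", "date": "1995-10-01", "act": "appointment"}
```

A profile that fails the check is not necessarily an incomplete record; the sketch only flags that the profile lacks the attributes from which completeness could be inferred.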

Then there are features of electronic systems, notably, access control, domain, and security, and of electronic documents, such as links, that cannot be anticipated in paper records, and others, such as the archival bond, which must be explicitly articulated in the profile where this might have arisen naturally amongst paper records as a result of their physical association in a file. The new elements should be added to the corpus of diplomatic elements of form and procedure.

On a conceptual level, diplomatics passes the test of the document profile rather easily. The main difficulty, in fact, would appear to lie less in diplomatics itself than in the lack of familiarity with it, particularly in North America, among archivists and system designers. There is, however, another factor hidden in the heart of diplomatics, and that is the actual process of defining the record. Records, by their very nature, are unique, that is, they are peculiar to their context. In the process of designing electronic recordskeeping systems, it is always necessary to define the document types as they are found in the particular recordskeeping systems in which they are created or received. Thus, even though diplomatics posits an idealized conceptual record, it is not a formula to be blindly translated into the reality of the document profile. The archivists and system designers working with the conceptual profile cannot escape developing a thorough understanding of the uniqueness of the records they are endeavouring to map, and of the procedures and circumstances which determine their character and validity. In itself, this is further proof of the extent to which general diplomatics shares the decontextualized nature of open document exchange standards: the application of special diplomatics to the study of particular documents is the problem of applying any standard, such as, for instance, SGML to the marking up of a particular type of document, or a particular implementation of DFR. There is a great deal of work to be done in designing the system, and it is all too easy to let go of fundamental concepts in an effort to accommodate the peculiarities of reality.

What of the standards themselves? Of the three ISO document exchange standards examined here, ODA and DFR reflect the characteristics of the archival document most closely in the attributes and structure of the profile. Their attributes for authors and the various types of dates map readily to diplomatic conceptions. ODA is particularly strong in its ability to capture the logical and layout structure of the document. But on the whole, there is much room for improvement. The document management attributes could be considerably expanded to capture diplomatic elements for persons (where they fail to identify an addressee or the various participants such as witnesses) and to identify acts, procedures and phases of procedure. As matters now stand, these elements can be inferred by adapting existing attributes, but if a true standard is to emerge that permits description of an archival document, there will have to be both more attributes and more explicitly defined ones. While ODA and DFR do include certain elements, such as access control and security, that are specifically characteristic of electronic documents, other aspects, such as the archival bond, are not easily recognized, and there is nothing to capture the concept of domain.

SGML is another case altogether, at least as manifested in the Text Encoding Initiative. Since SGML is a mark-up language, it is not intended to define any specific type of document. That is the function of specific encoding initiatives. The concept of the Document Type Definition, which is here treated as a conceptual profile, is a structure readily adapted to the description of any number of different types of documents, including records, and could presumably be adapted to capture logical and layout features by means of extensions. Since SGML permits any and all types of elements to be defined, there is no theoretical reason why a complete set of diplomatic attributes could not be defined to a DTD. This very flexibility, however, is in itself problematic, because it means that agreement on records attributes must be wrung from amongst disparate implementations whose proliferation SGML is specifically designed to encourage. In other words, SGML leaves archivists and system designers pretty well where they started, with agreement on a basic language but no agreement on what to say. The philosophy of text encoding itself is not suited to the capture of documents because every encoding is presumed to be an interpretation - which violates the very nature of the archival record.
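The observation that diplomatic attributes could in principle be defined as mark-up elements can be sketched briefly in Python. The sketch rests on several assumptions: XML is used as a stand-in for SGML because the Python standard library can parse it, and the element names (record, persons, author, addressee, witness) and sample values are invented for the illustration and belong to no published DTD.

```python
import xml.etree.ElementTree as ET

# Invented mark-up: the element names are hypothetical, not taken from any DTD.
SAMPLE = """<record>
  <persons>
    <author>A. Clerk</author>
    <addressee>B. Manager</addressee>
    <witness>C. Notary</witness>
  </persons>
  <date>1995-10-01</date>
  <text>Appointment confirmed.</text>
</record>"""


def persons(markup: str) -> dict[str, str]:
    """Map each role marked up under <persons> to the name it contains."""
    root = ET.fromstring(markup)
    group = root.find("persons")
    if group is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in group}
```

The sketch shows only that such elements can be declared and read back. It does nothing to settle which elements different implementations would agree to use, which is the difficulty described above.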

Finally, the thesis has used the mapping technique of the thesaurus to demonstrate the relevance of diplomatic concepts to open document exchange standards. In effect, the thesaurus is an attempt to turn the tables on system design by forcing data processing terminology to conform to a terminology of the archival document or record. The thesaurus demonstrates that this works quite well, not only as a means of establishing an authority file, but also as a means of revealing the limitations of a terminology where there are no equivalents at the synonym, narrow, broad, or related levels. It is to be hoped that such a thesaurus can be extended to electronic records management terminology as a whole, taking in the most widespread standards and including archival terms other than diplomatic concepts where these are relevant to system design.
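The way a thesaurus entry exposes gaps in a terminology can be sketched in Python. The data structure and the gaps method are assumptions made for the illustration; the sample values follow the "witness" entry of the thesaurus, which records no equivalent at the synonym and narrow levels.

```python
from dataclasses import dataclass, field

# The four levels at which the thesaurus looks for equivalents.
LEVELS = ("synonym", "narrow", "broad", "related")


@dataclass
class ThesaurusEntry:
    """One term with its equivalents at each level (hypothetical structure)."""
    term: str
    synonym: list[str] = field(default_factory=list)
    narrow: list[str] = field(default_factory=list)
    broad: list[str] = field(default_factory=list)
    related: list[str] = field(default_factory=list)

    def gaps(self) -> list[str]:
        """Return the levels at which no equivalent has been recorded."""
        return [level for level in LEVELS if not getattr(self, level)]


# Values follow the "witness" entry; empty lists stand for "No standards equivalent".
witness = ThesaurusEntry(
    term="witness",
    broad=["persons"],
    related=["attestation", "countersigner"],
)
```

Calling `witness.gaps()` lists the synonym and narrow levels, the two places where the entry records no equivalent.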

BIBLIOGRAPHY

BOOKS, ARTICLES AND REPORTS

Advisory Committee for the Co-ordination of Information Systems (ACCIS). Management of electronic records: issues and guidelines. New York: United Nations, 1990.

Association for Computers and Humanities (ACH), Association for Computational Linguistics (ACL), and the Association for Literary and Linguistic Computing (ALLC), Guidelines for the Encoding and Interchange of Machine-Readable Texts, draft: version 1.1, C.M. Sperberg-McQueen and Lou Burnard eds., (Chicago & Oxford: Text Encoding Initiative, 1990) 289 pp.

Barry, Richard E. "Best Practices" for Establishing Good, Defendable Practices and Procedures for Digital Document Management. Draft report submitted to World Bank. Arlington, VA: Barry Associates, 1993.

Barry, Richard E. Electronic Document and Records Management Systems: Towards a Methodology for Requirements Definition. Draft report submitted to World Bank. Arlington, VA: Barry Associates, 1993.

Barry, Richard E. Document Filing and Retrieval: ISO Standard 10166. Assessment report prepared for World Bank. Arlington, VA: Barry Associates, March 2 1993. 6 p.

Barry, Richard E. Open Systems Standards: Assessing Product Availability. Draft report circulated as part of World Bank presentation to Society for Worldwide Interbank Financial Telecommunication (SIBOS), Brussels, September 1992. Washington, D.C, September 1992.

Bearman, David. Issues Involved in Using SGML for Data Interchange. Archives and Informatics 8 No. 1 (Spring 1994), 74-79.

Bearman, David. Record Systems as the Locus of Provenance: Implications for Automation of Archival Control and Management of Electronic Records. Paper presented at Ontario Association of Archivists Conference on Archives and Automation, May 13, 1993, Toronto, Ontario. 28 pp.

Bellardo, Lewis J. and Lyn Lady Bellardo. A Glossary for Archivists, Manuscript Curators, and Records Managers. Chicago: Society of American Archivists, 1990.

Bormann, Ute, and Carsten Bormann. Standards for open document processing: current state and future developments, in Computer Networks and ISDN Systems. North Holland: Elsevier Science Publishers B.V., 1991, 149-163.

Braid, Andrew, From Babel to EDIL: the evolution of a standard for document delivery, Computer networks and ISDN Systems, 27 (1994), 367-374

Burnard, Lou, The Text Encoding Initiative: Towards an Extensible Standard for Encoding of Texts, in Electronic Information Resources and Historians: European Perspectives, Seamus Ross and Edward Higgs, eds. St. Katherinen: Max-Planck-Institut für Geschichte In Kommission bei Scripta Mercaturae Verlag, 1993, 105-118.

Cronk, Randall D., Unlocking Data's Content, Byte (Sept. 1993), 111-120.


Du Rea, Mary V., and J. Michael Pemberton, Electronic Mail and Electronic Data Interchange: Challenges to Records Management. Records Management Quarterly (Oct. 1994), 3-12.

Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part I). Archivaria 28 (Summer 1989), 7-27.

Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part II). Archivaria 29 (Winter 1989-90), 4-17.

Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part III). Archivaria 30 (Summer 1990), 4-20.

Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part IV). Archivaria 31 (Winter 1990-91), 10-25.

Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part V). Archivaria 32 (Summer 1991), 6-24.

Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part VI). Archivaria 33 (Winter 1991-92), 6-24.

Duranti, Luciana. Managing Electronic Documents: Making Sense Out of Chaos or "Records Management is Dead! Long Live Records Management!". Presentation to World Bank, April 27 1993, Washington, D.C.

Duranti, Luciana. Reliability and Authenticity: the Concepts and their Implications (unpublished paper, 1995)

Fanderl, H., K. Fischer, and J. Kamper, The Open Document Architecture: From standardization to the market. IBM Systems Journal 31, No. 4, 1992, 728-754.

Hajagos, Lani. Documents and SGML (the Standard Generalized Markup Language Standard for document processing), UNIX Review 11, No. 3 (Mar. 1993), 4 pp.

Hayes, Frank, SGML Comes of Age. UnixWorld (Nov. 1992), 99-100.

Jacobs, Paul S., ed. Text-Based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval. Hillsdale, N.J.: Lawrence Erlbaum Associates, 1992.

Kay, Russell, Objects in Use. Byte 19, No. 4 (April 1994), 99-104.

Law, Margaret Henderson. Guide to Information Resource Dictionary System Applications: General Concepts and Strategic Information Systems Planning. Washington, D.C: US Department of Commerce, 1988.

Margolis, Philip E. The Random House Personal Computer Dictionary. New York: Random House, 1991.

Miller, Michael, The Next Software Revolution, PC Magazine 12, No. 6 (Mar 1993), 2 pp.

Moore, James W., David Emery and Roy Rada, Language-Independent Standards. Communications of the ACM, 37 No. 12 (Dec. 1994), 17-35.


Morell, Jonathan, Standards and the market acceptance of information technology: An exploration of relationships, Computer Standards and Interfaces Vol. 16 (1994), 321-329.

Kilov, Haim, Information Modelling: a path to document analysis, MRE-2F049, Bellcore, internal research paper, April 1994, 13 pp.

Library of Congress, Workshop on Electronic Texts - Proceedings. James Daly, ed. Washington, DC: Library of Congress, 1992.

Mullins, Craig S., The Great Debate. Byte (April 1994), 85-96.

Murray, Philip C, Documentation Goes Digital. Byte (Sept. 1993), 121-129

Nordin, Brent, David T. Barnard, and Ian A. Macleod, A review of the Standard Generalized Markup Language (SGML), Computer Standards and Interfaces 15 (1993), 5-19.

O'Brien, James A. Introduction to Information Systems in Business Management, sixth ed. Homewood, IL and Boston MA: Richard D. Irwin Inc., 1991

Phillips, John T. Organizing and Archiving Files and Records on Microcomputers. Prairie Village, K.S.: Association of Records Managers and Administrators, 1992.

Piersoll, Kurt, A Close-Up of OpenDoc. Byte 19, No. 3 (March 1994), 183-188.

Reinhardt, Andy, Managing the New Document. Byte 19, No. 7 (August 1994), 91-104.

Rooney, Paula, Versatile electronic data delivery fuels corporate interest in SGML. PC Week 10 No. 8, 2pp.

Saffady, William. Managing Electronic Records. Prairie Village, K.S.: Association of Records Managers and Administrators, 1992.

School of Library, Archival and Information Studies, University of British Columbia. Select List of Archival Terminology. Unpublished glossary for Master of Archival Studies Program. 1990.

Sparck Jones, Karen. Assumptions and Issues in Text-based Retrieval, in Text-based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval. Paul S. Jacobs, ed. (Hillsdale, New Jersey: Lawrence Erlbaum Associates, 1992), 157-177.

Stein, Richard Marlon, Object Databases, Byte 19 No. 4 (April 1994), 74-84.

Taylor, Calvin. J., Object-oriented concepts for distributed systems. Computer Standards and Interfaces 15 (1993) 167-170.

Thompson, Craig, A reference model for object data management. Computer Standards and Interfaces 15 (1993), 121-147.

Vecchione, Anthony, How SGML Bridges Format Differences. Information Week (Mar. 29 1993), 22-23.

Walker, David M. The Oxford Companion to Law. Oxford: The Clarendon Press, 1980.


Watson, Bradley C, and Robert J. Davis. ODA and SGML: An Assessment of Co-existence Possibilities, Computer Standards and Interfaces 11 (1990/91): 169-176

Wilmott, Sam, Distinguishing Intelligence from Formatting, The SGML Newsletter. No. 20 (Dec. 1991), 6-10.

STANDARDS

ISO/IEC JTC 1/SC 18 Text and Office Systems Secretariat: USA (ANSI). Revised Text of DIS 10166-1, Information Technology - Text and Office Systems - Document Filing and Retrieval (DFR) - Part 1: Abstract Service Definition and Procedures. New York: American National Standards Institute (ANSI), 1991.

ISO/IEC JTC 1. Revised Text of DIS 10166-1, Information Technology - Text and Office Systems - Document Filing and Retrieval (DFR) - Part 1: Abstract Service Definition and Procedures. Draft. New York: International Standards Organization, 1991.

ISO. International Standard ISO 8613: Information Processing - Text and office systems - Office Document Architecture (ODA) and interchange format - Part 1: General. Geneva, Switzerland: International Standards Organization, 1989.

ISO. International Standard ISO 8613: Information Processing - Text and office systems - Office Document Architecture (ODA) and interchange format - Part 2: Document Structures. Geneva, Switzerland: International Standards Organization, 1989.

ISO. International Standard ISO 8613: Information Processing - Text and office systems - Office Document Architecture (ODA) and interchange format - Part 4: Document Profile. Geneva, Switzerland: International Standards Organization, 1989.

ISO. International Standard ISO 8613: Information Processing - Text and office systems - Office Document Architecture (ODA) and interchange format - Part 5: Office Document Interchange Format (ODIF). Geneva, Switzerland: International Standards Organization, 1989.

ISO. International Standard ISO 8879 Information Processing: Text and Office Systems: Standard Generalized Markup Language (SGML). Geneva: International Organization for Standardization, 1986.

ISO/IEC JTC1/SC21 Information Retrieval, Transfer and Management for OSI. Draft Recommendation X.903: Basic Reference Model of Open Distributed Processing - Part 3: Prescriptive Model. Reprinted in Computer Standards and Interfaces Vol. 15 (1993) 191-274

ISO/IEC JTC 1/SC 21 Information Retrieval, Transfer and Management for OSI. Information Technology - Basic Reference Model for Open Distributed Processing - Part 2: Descriptive Model, Committee Draft ISO/IEC CD 10746-2.2. New York: ANSI, 1993, reprinted in Computer Standards and Interfaces Vol. 15 (1993) 171-190.

VENDOR PRESENTATIONS

Digital Equipment Corporation. The Open Software Foundation Distributed Computing Environment: An Introduction. Slide print presentation to the World Bank prepared by Terry Tvirdik. Littleton, M.A.: Digital Equipment Corporation, March 31 1993.

Digital Equipment Corporation. The World Bank Open Systems Forum. Slide print presentation to World Bank on January 7, 1993 in Alexandria, VA.


WORLD BANK INTERNAL DOCUMENTS

Information, Technology and Facilities Department. Open Systems at the World Bank. Presentation by Hywel Davies, Director, to Society for Worldwide Interbank Financial Telecommunications (SIBOS), Brussels, September 1992. Washington, D.C: World Bank, August 27 1992. 17 p.

Information, Technology and Facilities Department. Information Management: Vision and Objectives based on User Needs. Internal report by Karl. O. Lawrence. Washington, D.C: World Bank June 11 1993. 20 p.

Information, Technology and Facilities Department: Information Engineering. ITF Staff Paper No. 12: Information Management Architecture: FY 94. Harold Steyer, ed. Internal report prepared by the World Bank, Washington, D.C: World Bank, 1993.

Information, Technology and Facilities Department: Information Engineering. Document Management System: Requirements Analysis. Internal report. Washington, D.C: World Bank, December 23, 1992.

Information, Technology and Facilities Department. Final Report of the Electronic Text/Image Review. Draft internal report prepared by Irene Travis et al. Washington, D.C: World Bank, March 22 1993.

Information, Technology and Facilities Department (ITF). Excalibur Project: Analysis and Recommendations. Internal report. Washington, D.C: World Bank, October 1992.

Information, Technology and Facilities Department (ITF). Document Management Technology Architecture. ITF Staff Paper prepared by Irene L. Travis. Washington, D.C: World Bank, December 1989.

Information, Technology and Facilities Department (ITF). Towards an Enterprise Document Management System Strategy and an Institutional Document Management System for the World Bank. Draft internal report prepared by Clifford A. Lynch. Washington, D.C: World Bank, September 25 1992.

Information, Technology and Facilities Department: Information Services Division. Seminar on Appraisal of Electronic Records: Report and Recommendations. Internal report by Irene Travis et al. Washington, D.C: World Bank, October 21, 1993.

Information Technology and Facilities Department (ITF). Developing Guidelines for Electronic Records: Report of a Project to Test the ACCIS TP/REM in Electronic Records Guidelines: A Manual for Policy Development and Implementation (ACCIS 89/018(b) 1989-07-17). Prepared by the Task Force on Electronic Records Management Information. Washington, D.C: World Bank, September 1989.
