Appeared in: Proc. of the 3rd International Conference on Information and Knowledge Management (CIKM) '94, Maryland, MD, 1994.

Storing HyTime Documents in an Object-Oriented Database

Klemens Böhm, Karl Aberer

GMD-IPSI

Dolivostraße 15, 64293 Darmstadt

Germany

fkb o ehm, ab [email protected]

Abstract

An open hypermedia-document storage system has to meet requirements that are not satisfied by existing systems: it has to support non-generic hypermedia document types [fn 1], i.e. document types enriched with application-specific semantics. It has to provide hypermedia-document access methods. Finally, it has to allow the exchange of hypermedia documents with other systems. On a technical level, an object-oriented database-management system, on a logical level, a well established ISO standard, namely HyTime, is used to satisfy the requirements mentioned above. By means of the example of documents incorporating hypertext structures we discuss the impact of taking such an approach on representation and processing within the database system.

1 Introduction

In the recent past the proliferation of the hypertext paradigm for representation of information has been facilitated by technological advances. The main difference between conventional documents and hypertext documents is that in the first case the document structure as perceived by the reader is linear. In the second case, there is a graph structure. To apply the hypertext paradigm hyperengines have been developed. Hyperengines administer the hypertext-specific structures such as nodes, links and anchors [fn 2] which will be referred to as hyperobjects. They contain the realization of generic operations. An example for a node operation is the calculation of the transitive closure, i.e. the identification of all nodes in a document that can be reached by link traversal from that node. Hyperengines differ in the internal representation of the hyperobjects. Furthermore, operations reflecting the hyperobjects' semantics may be more or less sophisticated. Usually systems for hypertext-document handling consist of three layers, a storage layer, an application layer and a presentation layer [DeS86]: hyperengines are the middle layer. With some hyperengines, the storage layer is made up with databases [ScS90, MaS92]. With other ones, this is not the case [SSS93].

In hypertext documents of different types the hyperobjects have special semantics in addition to the canonical features. In [SHT89] four spaces corresponding to the different design activities within an authoring process are identified: the content space, the planning space, the argumentation space and the rhetorical space. These spaces are part of SEPIA, a cooperative hypermedia authoring environment developed at our institute [Str+92]. An argumentation space inter alia contains facts (`datum') and assertions (`claim'). There are not only different node types, but also different link types: a so-link is a directed binary link from a datum to a claim. Figure 1 contains an example of a so-link. Other link types in the argumentation space are the contributes to-link and the contradicts-link. A contradicts-link links two nodes with discrepant content. In the other spaces the hyperobject's semantics likewise is special. In the sequel we will refer to an argumentation-space structure, i.e. a hypertext structure whose components are claims, so-links etc. as an argumentation-space document.

[Figure 1: Sample Structure in an Argumentation Space. Two nodes, "This article is about hypertext modelling." and "Using examples from the area of hypertext should ease understanding.", connected by a link labelled `so'.]

In existing systems hyperobjects' document-type-specific semantics tends to be hardcoded in the presentation layer. Hence, exchanging hyperdocuments of non-generic types, e.g. argumentation-space documents, between hyperengines or applying a hyperengine in different contexts is not yet conceivable. We for our part envisage a hyperengine supporting hyperobjects with partly special semantics based on a database system. We want to comply with a format for hyperdocuments satisfying the following basic requirements: non-genericity, orientation towards hypermedia document storage and processing, acknowledgement as an (international) standard. Hence, we have chosen SGML/HyTime.

[fn 1] In our terminology, with hypermedia documents the content of the nodes may be multimedia data as opposed to hypertext documents.

[fn 2] The edges of such a kind of graph structure are called links. Links do not necessarily link nodes in their entirety, but also structures within nodes as, say, several words or sentences. The link ends are called anchors [Con87].
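The transitive closure named in the Introduction as an example of a generic node operation can be sketched in a few lines. This is only an illustration, assuming that links are available as directed (source, target) pairs of node identifiers; the function name `reachable_nodes` and the sample identifiers are hypothetical and do not come from any hyperengine.

```python
from collections import deque


def reachable_nodes(links, start):
    """Return the set of nodes reachable from `start` by link traversal."""
    # Build an adjacency map from the directed (source, target) pairs.
    successors = {}
    for source, target in links:
        successors.setdefault(source, []).append(target)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical argumentation-space fragment: so-links as directed pairs.
links = [("datum1", "claim1"), ("claim1", "claim2"), ("datum2", "claim2")]
print(sorted(reachable_nodes(links, "datum1")))
```

The breadth-first traversal visits each link at most once, so the start node itself is reported only if a cycle leads back to it.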

The problems we approach and solutions we give are fairly orthogonal to the particular format chosen.

With SGML (`Standard Generalized Markup Language') [ISO86, Her94] document types can be defined. In essence, SGML document-type definitions (DTDs) are attributed grammars specifying the document structure. However, nothing is said about the semantics of document components, which are called elements in the SGML terminology. The HyTime Standard (`Hypermedia/Time-based Document Structuring Language') [ISO92, NKN91] basically is a list of SGML element-type definitions for, say, links or presentation schedules. These element types are referred to as architectural forms. Their semantics is fixed by the standard. In this article we focus on the basic link features the HyTime standard provides. This facilitates a comparison of our concepts to conventional approaches to hyperdocument storage. We are, however, not aware of any related work on this topic dealing with different hypertext-document types.

In this article we describe the database application framework for HyTime-document storage we are currently working on. We concentrate on the following aspects.

1. In [ISO92, Koe+93] it has been mentioned that HyTime processing can be accelerated by means of an internal representation. Our objective is to develop hypermedia-specific index structures to speed up access operations. We will explain that having more than one internal representation of HyTime-architectural forms' instances to choose from may be advantageous.

2. We do not restrict ourselves to a set of fixed hypertext structures: dynamic modifications of SGML/HyTime documents shall be doable. With our approach both the collection of documents and the set of document types can be modified at runtime.

3. Operations are part of the database application: In other words, the database has the semantic control over document components.

The platform on which realization will be based is an object-oriented database-management system (OODBMS), the OODBMS VODAK developed at our institute [Kla+93, KAN93]. By using a DBMS database features such as concurrency control or querying capabilities are available. Even though we limit ourselves to the description of the HyTime hyperlink features other facets of hypermedia documents reflected in HyTime, like spatial and temporal relationships, can be approached in the same way.

The structure of this article is the following: the next section is an overview of the HyTime conception together with examples from the hypertext area. In Section 4, our database-based approach to HyTime-document handling is discussed, and it is described how to realize the link features. We review and classify related work in Section 5. Section 6 is a brief summary, and we identify further research objectives.

2 HyTime

SGML. Structured documents may be seen as trees whose edges indicate how components are contained in each other. The tree corresponding to this article is depicted in Figure 2. This hierarchical structure is commonly referred to as the logical document structure. The nodes are the elements, the list of a node's children the content of the element. Internal nodes are nonterminal elements. All elements have a type, e.g. section or paragraph. With SGML the logical structure of documents of a certain type can be defined. In essence the content model of an element type is a regular expression specifying how the content of an element of that type may look like. An element-type name is also called generic identifier (GI). The element-type definitions in a DTD may be completed by the definition of attributes.

[Figure 2: Tree Structure Corresponding to this Document. Node labels by level: document; title, abstract, intro, hytime, ...; authorlist, affiliation, title, paragraph on hyperengines, paragraph on argumentation spaces, ...; author1, author2, ...]

Figure 3 contains a possible DTD for argumentation-space documents. Lines starting with `<!ELEMENT' define an element type together with its content model: instances of asdoc contain a list whose elements are either instances of element type node, so etc. CDATA is a terminal element type more or less comparable to the data type STRING. [...] For instance, elements of type node have an attribute of type ID and an attribute type of type (position|claim|datum). Attributes of type ID are unique identifiers of the element they belong to. An attribute of type IDREF is an ID reference, one of type IDREFS a list of ID references. EMPTY and #REQUIRED are SGML keywords being of minor importance in this context. Comments are enclosed in double hyphens.

    -- `asdoc' short for `argumentation-space document',
       `contra' is GI of `contradicts-link',
       `contrib' is GI of `contributes_to-link'
       (def.s omitted) --
    [...] type (position|claim|datum) #REQUIRED>
    [...] datum IDREF #REQUIRED>
    ...

Figure 3: Fragment of an SGML Document Type Definition for Argumentation-Space Documents

    [...] Using examples from the area of hypertext should ease understanding.
    [...] This article is about hypertext modeling.

Figure 4: SGML Example - Fragment of a Document Corresponding to the DTD in the Previous Figure.
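The statement that a content model is in essence a regular expression over element types can be made concrete with a small check. The sketch below is an illustration only: it encodes the child element types as a space-terminated string and matches it with Python's `re` module. The content model shown for asdoc is an assumption (the element declarations of Figure 3 are not reproduced here), and `conforms` is a hypothetical helper, not part of any SGML parser.

```python
import re


def conforms(content_model, child_types):
    """Check a list of child element-type names against a content model.

    `content_model` is a regular expression in which every element-type
    name is followed by exactly one space, so that names cannot run
    into each other (e.g. "so" versus "source").
    """
    encoded = "".join(name + " " for name in child_types)
    return re.fullmatch(content_model, encoded) is not None


# Assumed content model: any sequence of node, so, contra, contrib elements.
ASDOC_MODEL = r"((node|so|contra|contrib) )*"

print(conforms(ASDOC_MODEL, ["node", "node", "so"]))
print(conforms(ASDOC_MODEL, ["node", "section"]))
```

A real SGML parser additionally handles connectors such as `&` and inclusion or exclusion exceptions, which a plain regular expression does not cover.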

It is important to notice that with SGML it is basically the logical document structure that is described by means of an attributed grammar. One of the few provisions on the semantic level that is part of the standard is the semantics of attributes of type ID and IDREF. Figure 4 contains a fragment of a document in conformance with the DTD from Figure 3. In the sequel, we refer to documents conformant to a DTD as instances of the DTD. The document corresponds to the structure depicted in Figure 1. The example also illustrates that SGML is not limited to conventional text documents, but can also be used for hypermedia documents.

Basic Concepts of HyTime. The processing semantics of conventional document-element types such as section, chapter or paragraph is independent of the element type. For instance, for all of these element types there might be a method to display the textual content on the screen. With hypermedia documents the element types' semantics is more differentiated and not necessarily obvious: suppose that in an argumentation space one may navigate through a so-link only from the datum to the claim, but not in the opposite direction. Constraints of that kind make up the browsing semantics. The anchors of a contradicts-link, to give another example, are equivalent: the browsing semantics would therefore be different from the browsing semantics of instances of so-link. To reflect the browsing semantics of link elements in an SGML document one might introduce an attribute bearing that information. The interpretation of attribute values of that kind, however, is not standardized. If an application were to process instances of a DTD with such an attribute it would have to be adapted to the DTD by hand. Summing up, SGML DTDs are not well-suited as exchange formats for hypermedia documents. The HyTime standard has been designed to overcome these problems: it essentially is a list of SGML element-type definitions called element-type forms or architectural forms. The special feature of HyTime is that architectural forms' semantics is fixed by the standard. The semantics, e.g. the attributes' meaning, is partly described in natural language, in part it follows from the comment [fn 3] within the forms' definitions. Element types in SGML/HyTime DTDs may be specializations of HyTime element-type forms. The meaning of `specialization' in this context is twofold:

• The specialization of an element-type form may contain additional attributes. On the other hand, it need not contain all attributes of the element-type form. Consider the case of hypertext documents being so small that there is no need to navigate through them because they can be viewed in their entirety. In that case, the attributes extra and intra can be omitted.

• The ranges of the content and of the element-type form's attributes may be subsets of the original ranges in the HyTime standard.

A type definition for links, to continue the example, should contain the information in which direction navigation is possible. The relevant element-type form from the standard, whose name is ilink, has attributes to bear that kind of information. The names of these attributes are extra and intra. ilink (`independent link') is the element-type form of which link element-type definitions should be specializations. In Figure 4, we have shown how such a structure can be modelled using SGML only. We are now in the position to illustrate how the same structure can be represented using the HyTime element-type form ilink. Figure 5 contains the relevant portion of the DTD. The fact that so is a specialization of ilink is indicated by setting the attribute HyTime to ilink. Figure 6 is the document fragment corresponding to Figure 1 as an instance of the HyTime-DTD. The attributes displayed in the figure (except for reftype which will be explained in the following paragraph) are inherited from the element-type form ilink: anchrole introduces a label for each anchor, in this case DATUM and CLAIM. reftype specifies the types of the elements referenced by another attribute: "linkends anchors #SEQ" means that a value of linkends must be conformant to the content model of anchors, i.e. an admissible value of the linkends-attribute is the ID of a datum-element, followed by the ID of a claim-element. The attribute extra is to capture the browsing semantics. Here, we will not explain the meaning of the attribute value. In this context, it is sufficient to know that its meaning is specified by the standard.

    [...] anchrole NAMES #FIXED "DATUM CLAIM"
          linkends IDREFS #REQUIRED
          reftype CDATA #FIXED "linkends anchors #SEQ"
          extra ...>

Figure 5: Example - Fragment of a HyTime Document Type Definition

    [...] Using examples from the area of hypertext should ease understanding.
    [...] type=datum>This article is about hypertext modeling.
    [...] extra="A E" ...>

Figure 6: HyTime Example - Fragment of a HyTime Document Corresponding to the DTD in the Previous Figure.

Additional Terminology. As opposed to element-type forms containing both content's and attributes' definitions an attribute-list form is a list of attribute definitions only. Again, the standardization of the attributes' semantics is of importance. An architectural form is either an element-type form or an attribute-list form. The attribute reftype mentioned previously is part of a HyTime attribute-list form. An attribute of this type may be included in element-type definitions in HyTime documents. Again, the attribute values' meaning, e.g. how to interpret the value "linkends anchors #SEQ", is described in the standard. The HyTime standard consists of several modules. Each of them contains a list of architectural forms. The prologue of a HyTime document contains so-called support declarations stating which features, e.g. architectural forms, need be supported.

Note that attributes whose value is already fixed in the DTD can be omitted from document instances, such as attributes HyTime and anchrole in Figure 6.

[fn 3] The name `comment' is misleading, because in HyTime comment may have binding forces. Comments in HyTime element-type definitions can further restrict the range of attributes or of the content. Hence, an SGML document whose DTD contains a HyTime element-type form might be a correct SGML document, but not a valid HyTime document.

3 Requirements on HyTime Document Storage

SGML. In [ABH94, BAH93] our approach to SGML document storage has been described. There the following requirements have been identified inter alia:

• It shall be possible to alter parts of the documents without locking off the whole document for other authors. Namely, we envision our database application to be a platform also for authoring systems. Multi-authoring may be facilitated by concurrency control.

• The DBMS must have semantic control over versions of document elements and the generation of new version objects.

• The usage of multimedia types shall be supported by the underlying system [Rak+93].

• It is not worthwhile to be restricted to a fixed set of document types. On the contrary, it shall be possible to administer arbitrary DTDs. Furthermore, it occurs quite frequently that document types change over time. DTD handling should be as simple as possible.

In [ABH94] we have claimed that these requirements can best be met using an OODBMS. If document-file objects were left intact, full database functionality would not be achieved. With a generic document fragmentation differences between document types could not be reflected.

HyTime. The requirements for SGML document storage also hold true for HyTime-document storage. An additional requirement for HyTime documents is that the semantics of HyTime document components, which is fixed by the standard, should be reflected in the database. The database must have semantic control over HyTime objects as a prerequisite for our approach to query optimization [AbF93] and for semantic concurrency control [Mut93]. Because with HyTime operationality is extensive as compared to SGML it is advantageous to put some effort into an efficient realization of the HyTime application-independent processing.

4 A VODAK Application Framework for HyTime Document Storage

The objective of this chapter is to introduce the approach to HyTime-document storage we are currently pursuing. Basic concepts of the VODAK Modeling Language (VML) are reviewed in the next paragraph. More detailed information can be found in [KAN93].

Principles of the VODAK Modeling Language. Objects have properties and methods. This is mentioned to explain the difference between the usage of `property' and `attribute' in this article: attributes are SGML attributes; properties are the variable-like containers for the database objects' data content. The type of an object is its property- and method definitions. A class is a set of objects of the same type. Objects of the same type need not necessarily belong to the same class. A metaclass is a class whose instances are themselves classes. Symmetrically, we refer to the instances of the instances of a metaclass as metainstances. The definition of a class encloses the definition of its instances' type. This type is called the insttype of the class. A metaclass definition may enclose both the definition of its instances' type and of its metainstances' type. This second type is the instinsttype of the metaclass. An object has properties and methods that are defined as part of its class's insttype and its metaclass's instinsttype. Just as classes contain objects with common characteristics, metaclasses are in use to model common characteristics of classes, e.g. semantic relations between classes such as aggregation or specialization. In the context of role specialization, for example, a real-world object has several aspects [SeE90]. For instance, an object might have the generalization aspect `person' and the specialization aspect `patient'. In the modeling there are two classes PERSON and PATIENT. Two database objects correspond to each real-world object of that kind: an instance of PERSON and one of PATIENT. The instances of PERSON have person-specific properties and methods, the ones of PATIENT patient-specific ones. Class PATIENT is a role-specialization class, its instances are (role-)specialization instances. Analogously, PERSON is a generalization class, its instances are generalization instances. There is a metaclass whose instances are role-specialization classes and whose metainstances are role-specialization instances. The insttype and instinsttype of this metaclass contain the property- and method definitions that are necessary to administer a role-specialization relationship on the class- and on the instance-level, respectively. If the same semantic relationship occurs between other classes, e.g. CAR and SERVICE VEHICLE, one can fall back upon that metaclass, and the relationship need not be modeled anew.

SGML Layer. Our VML schema for SGML/HyTime documents consists of several layers. Layers stand for the different levels of specialization. A document element being instance of a HyTime element-type form has two aspects: the SGML aspect [fn 4] and the HyTime aspect. Within the SGML layer there is a corresponding class for each nonterminal element type. We call these classes SGML element-type classes. In Figure 7 there are element-type classes FOOTNOTE, SECTION, CHAPTER, SO-LINK, based on the assumption that the DTD contains element-type definitions footnote, section, chapter, so-link. In Figure 7, classes are represented as ellipses, "normal" objects are just dots.

[fn 4] Actually, SGML elements already have two aspects in our modeling, as explained in the previous paragraph, but in this article we abstract from this facet of the modeling.

[Figure 7 labels. Top row: TERMINAL, NONTERMINAL, CLINK, ILINK. SGML Layer: CDATA, FOOTNOTE, SECTION, CHAPTER, SO-LINK, and the object "so-link XY as SGML-object". HyTime Layer: FOOTNOTE, SO-LINK, and the object "so-link XY as HyTime-object". Arrow legend: instance-of; semantic relationship role-specialization-of.]

Figure 7: Overview of the Modelling
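The role-specialization relationship between PERSON and PATIENT can be approximated in Python, whose metaclasses are only loosely comparable to VML's. The sketch below is an illustration under that assumption: a metaclass carries the class-level administration, and each role object keeps a reference to its generalization instance. All names (`RoleSpecializationClass`, `specialize`, `generalization_class`, `generalization_instance`) are hypothetical and are not VML constructs.

```python
class RoleSpecializationClass(type):
    """Metaclass: its instances are role-specialization classes."""

    def specialize(cls, general_instance, **properties):
        # Class-level administration: every role-specialization class
        # names the generalization class it belongs to.
        if not isinstance(general_instance, cls.generalization_class):
            raise TypeError("instance does not belong to the generalization class")
        role = cls(**properties)
        # Instance-level administration: the role object points to the
        # generalization instance of the same real-world object.
        role.generalization_instance = general_instance
        return role


class Person:
    """Generalization class with person-specific properties."""

    def __init__(self, name):
        self.name = name


class Patient(metaclass=RoleSpecializationClass):
    """Role-specialization class with patient-specific properties."""

    generalization_class = Person

    def __init__(self, diagnosis):
        self.diagnosis = diagnosis

    def describe(self):
        return f"{self.generalization_instance.name}: {self.diagnosis}"


# Two objects correspond to one real-world object.
person = Person("Ada")
patient = Patient.specialize(person, diagnosis="flu")
print(patient.describe())
```

Reusing the metaclass for another pair of classes only requires setting `generalization_class`, which mirrors the remark that the relationship need not be modeled anew.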

The plain arrow connects an object with its class. The dashed arrows are from a role-specialization class to the corresponding generalization class or from a specialization instance to the generalization instance. One requirement is that DTDs may be inserted dynamically into the document base as they are not known ahead of time. Hence, element-type classes may be generated dynamically. Classes that are generated dynamically are represented by a dashed-line ellipse. Dynamic generation of element-type classes is possible because the instances of element-type classes are of the same type, independent from their particular element-type class. This in turn is possible because the processing semantics of nonterminal SGML-element types is generic, leaving aside the HyTime context. Terminal element types such as CDATA are DTD-independent. The corresponding classes to comprise, say, CDATA elements are part of the schema, e.g. class CDATA in Figure 7. The instances of different terminal element-type classes, on the other hand, are not of the same type: CDATA elements, to give an example, have a method to display textual content that does not necessarily make sense for other terminal element types, especially if datatypes for continuous media data are involved.

Interpreting SGML attributes of a freely-defined type is not part of the system. This is different for attribute types being part of ISO 8879-1986 such as the ones of type ID or IDREF. Experience shows that the non-hierarchical document structure induced by these attributes is of minor importance with conventional textual documents. This is the reason why the ID/IDREF-semantics will be used to illustrate how HyTime features shall be realized.

HyTime Layer. A HyTime document element has both SGML semantics and HyTime semantics: the SGML semantics, to give examples, is reflected by operations to navigate through the parse tree or to verify whether an element's content conforms to its content model. Operations reflecting HyTime semantics are of course different for individual architectural forms. In the database there are two objects corresponding to a HyTime document element, an SGML object and a HyTime object. The HyTime object is role-specialization instance of the SGML object. We call the role-specialization classes HyTime element-type classes. An overview of the modeling is in Figure 7. As classes in the SGML layer are created dynamically, this must also be the case for their role-specialization classes, i.e. classes in the HyTime layer. In principle, there is a metaclass for each HyTime element-type form in the HyTime layer. E.g. ILINK in Figure 7 corresponds to ilink, CLINK to clink [fn 5]. The definition of HyTime-specific operations is part of these HyTime metaclasses' instinsttype. Just as SGML element-type classes are essentially containers for elements of the same type, HyTime element-type classes are containers for the HyTime role-specialization instances: they contain the specializations of the SGML objects of one element type. Classes and objects in the HyTime layer are generated only when the relevant support declarations are part of the HyTime document's prologue.

It has been mentioned previously that role-specialization classes are instances of a metaclass for role specialization. This metaclass provides the properties and methods to administer role-specialization relationships for their instances and metainstances. On the other hand, however, metaclasses in the HyTime layer contain the type definition of the HyTime objects, i.e. HyTime-specific properties and methods. Hence, the type definitions in HyTime metaclasses must provide both properties and methods for both facets.

[fn 5] ilink and clink are the element-type forms being part of the HyTime Hyperlink module. ilink has already been explained to some degree. Here we just mention that clink ("contextual link") is also a link structure, but one simpler than ilink.

HyTime Index Structure. Document exchange formats need not necessarily be identical with the internal format. Namely, system peculiarities must be taken into account to find the optimal format. Instead of transforming the document into a format different from the one for SGML documents we suggest a more differentiated view: transforming documents into an internal format makes incremental changes more difficult. In other words, a view on this internal format would have to exist. With `internal format' we refer to a format different from the one for SGML documents. By introducing such an internal format, SGML functionality would have to be adapted. With our current approach to SGML-document storage SGML operations are integrated into the modelling in a natural way. To solve this problem we are developing HyTime-specific index structures: on the one hand the SGML format is preserved, on the other hand HyTime functionality is accelerated. Just as conventional index structures are adapted to changes of the data, modification of the HyTime-index structures likewise is triggered by update operations on the document.

Of course it would be possible that the SGML format, i.e. the properties of the objects in the SGML layer, could serve as a basis for the processing of the HyTime objects. Without the index structures the HyTime application-independent processing would have to rely on the SGML format, i.e. the properties of the SGML objects that at the same time are generalization instances of HyTime objects. For instance, consider a method for the element-type form ilink that, given an anchor-role name, returns the corresponding anchor. Based on the SGML format, its execution would be as follows: first, the SGML object's property corresponding to the attribute linkends would be accessed. By parsing the property value the character sequence corresponding to the ID of the anchor object can be determined. Next, the SGML objects are searched for the one with that ID. It is returned when found. However, in order to keep read-access operations cheap, HyTime attributes and content are not only stored as properties of the SGML objects, but also a second time as properties of the HyTime objects. For this redundant storage, another format is used. With our system incremental updates shall be possible. Whenever an SGML object's property is modified the corresponding properties of the HyTime object are also updated. This conception corresponds to general indexing mechanisms in databases: read operations are accelerated by means of redundant storage. When the data is altered the index structure is updated correspondingly. In the sequel, we exemplify these notions by means of the HyTime link features. For the moment, we limit ourselves to a basic set of features. We concede that with this restriction full HyTime expressiveness is not yet achieved. However, to model a basic set of hypertext structures, this is sufficient. Besides that, it is adequate to deal with the document structures of SEPIA.

Modeling HyTime Link Features. The instances of the HyTime element-type form ilink have an attribute linkends of type IDREFS. An SGML object representing such an element has a property containing the attribute value. The specialization instances, i.e. the HyTime objects, have a corresponding property whose type is a list of OIDs [fn 6]: each OID corresponds to an ID reference. It is important to notice that while OIDs identify database objects, SGML attributes of type ID or IDREF identify SGML document components, i.e. entities on another logical level. When the linkends attribute value of the corresponding SGML object is altered that list is updated. I.e. the OIDs of the anchor objects are identified and entered into the list. Thus, searching for the link anchors is not necessary within read operations, but instead direct access is possible. By explaining that SGML IDs can be replaced by VML OIDs we rather want the different conceptual levels to become evi- [...] sometimes hardly perceivable. Sometimes the mapping between the levels is less definite and depends to a stronger degree on the modeling language's constructs.

HyTime Attribute-List Forms. In this paragraph we briefly describe the way HyTime attribute-list forms are dealt with. The attribute reftype is used as an example. This particular HyTime attribute is described because it is, just as ilink, indispensable to model hypertext structures in HyTime. The attribute reftype is not an attribute of individual elements. It rather is a supplement of element-type definitions having an attribute of type IDREF. Thus, the reftype-semantics is element-type-specific: there is no need for a role-specialization object for each instance of the SGML element-type class. Instead one single role-specialization object of the SGML element-type class as a whole is sufficient to capture the reftype-semantics. The reftype-semantics is basically reflected in a method checking whether the instance of the IDREF attribute conforms to the reftype-value. The method is triggered whenever the corresponding IDREF attribute is updated.

"SGML Correctness" vs. "HyTime Correctness". An SGML document that is correct need not necessarily be conformant to the HyTime standard. For instance, leaving aside the compulsory HyTime comment, the ilink attribute anchrole is a list of names, and the attribute linkends is a list of SGML IDs. The fact that the number of names must equal the number of IDs is stated by the HyTime comment. An element with different numbers of anchor-role names and SGML IDs may be correct according to the SGML DTD. It is, however, not correct according to the HyTime standard. In that sense, we provide for flagging HyTime objects as correct or not correct. We see no other way to ensure that methods relying on the internal format of the HyTime attributes work correctly. Checking conformance of HyTime element-type forms' instances to the restrictions in the HyTime comment is part of the HyTime application-independent processing.

Further Refinements of the HyTime Layer. Up to this point it has been suggested that there is one metaclass for every HyTime element-type form. However, it is conceivable to have more than one metaclass corresponding to an element-type form. One reason might be to improve performance, another one to extend functionality. A special kind of independent links, to give an example, are binary links, i.e. links with two anchors. First, assume that the number of anchors is bigger than two. In that case, administering the anchors in the HyTime object by using a list is adequate. If, however, there are two anchors, using a list as the relevant property type of the HyTime object is generally slower than just having two properties, one for each anchor. On the other hand, there may be methods that are special for a particular link type. For instance, consider a method getOtherAnchor whose parameter is one anchor and whose return value is the other one. It makes sense for binary links only. Selecting the right metaclass, i.e. the appropriate internal format, can be accomplished by the system.

dent: SGML attributes (e.g. the linkends attribute) are on

The HyTime semantics is not only re ected in meth- a level that we call the logical one, the VML representation

o ds of HyTime architectural forms' instances. There are (such as the list of OIDs) is on a level that we refer to as the

no element-typ e forms for no des, i.e. elements referenced by physical one. With HyTime attributes from other HyTime

ilink-instances, to give an example. No des may b e elements mo dules the di erence b etween these conceptual levels is

of arbitrary typ es. On the other hand, no des have pro cess-

6

The typ e of the logical units' identi er in the VODAK OODBMS

ing semantics, to o. An example of an op eration capturing

is referred to with OID ("Ob ject Identi er"). 6

the semantics would be methods to calculate the transitive closure, another example would be a method identifying a node's adjoining links⁷, possibly only links of a certain type. In analogy to role-specialization objects in the HyTime layer bearing the semantics of HyTime element-type forms there are role-specialization objects with node semantics for all elements in documents with the ilink support declaration. In other words, in addition to metaclasses such as ILINK there are metaclasses such as INVERSE ILINK. Again, the metainstances inherit the node semantics from the metaclass.

5 Related Work

Research in various areas has contributed useful findings. In this section, they are brought in relation to our work.

Structured Documents. The relationship between structured documents and hypertext structures has been investigated before, e.g. in [QuV92]. With structured documents the structure of the document components is hierarchical. This need not be the case with hypertext structures. The editor Grif described in [QuV92] can handle both hierarchical and non-hierarchical document structures according to arbitrary DTDs. A proprietary format is used, but on the conceptual level its DTDs correspond more or less to SGML DTDs. While we are working on a storage system, Grif is an editor: because with an editor documents are only edited and not processed, the question whether HyTime-like concepts, i.e. certain document components' semantics, have been taken into account would not make sense.

Hyperengines. There exists a broad spectrum of data models for hypertext. Based on these, various hyperengines have been built. [CaG88, ScS90, MaS92, SSS93] is merely a selection. The data models reflect the generic, document-type independent semantics of document components from the hyperengine developers' point of view. With these hyperengines it is tacitly assumed that the document-type-specific semantics is part of the presentation layer, i.e. the application on top of the hyperengine. On the other hand, with the HyTime approach not the entire document-type-specific semantics needs to be realized anew for every document type by reimplementing the presentation layer. Namely, in parts it is already contained in the element-type definitions - the ones being specializations of HyTime architectural forms. One objective of a large part of these articles was a refinement of the generic semantics, which is reflected in the operations. In [MaS92], for example, several kinds of delete-operations for nodes are described: in case of "reckless delete" a node is deleted together with all links referencing it; in case of "content-based delete" a node is deleted, and links that previously referenced it now contain an invalid reference; and in case of "considerate delete" a node is deleted after verifying that there are no links referencing it. Operations such as these could be part of the HyTime application-independent processing for nodes. I.e. metainstances of INVERSE LINK could be deleted in three different ways. The objective of our work, however, is not to polish the operationality reflecting the semantics of hypermedia document components. Instead we want to investigate how to integrate it into the SGML/HyTime context.

Extensible Hypertext Systems. Taking into account the variety of existing hyperengines, in [WiL92] it is observed that no internal structure has succeeded in becoming a standard or quasi-standard. The authors conclude that further experimentation with hyperengines is necessary. According to the authors, this is considerably eased with the system described: in principle a collection of classes is made available from which hyperengines can be constructed. In the article the construction of two existing engines is summarized. In clear contrast to work discussed in the previous paragraph maximal flexibility is claimed. The downside is that one still has to do implementation work to arrive at a new hyperengine. However, we claim that the quest for the optimal internal structure is not the only core problem. Another important problem is to determine appropriate exchange formats for hypertext-documents. Furthermore, the optimal structure certainly depends on the storage layer on the one hand and on the frequency of the different kinds of operations on the other hand. Hence, we claim that there simply is not just one optimal structure. In a way, our approach is between this one and the conventional hyperengines described in the previous paragraph: flexibility is achieved by processing arbitrary HyTime DTDs. On the other hand, nothing needs to be reimplemented.

HyTime. A running system in the HyTime area is described in [Koe+93]: processing of a HyTime document consists of three phases. First, there is an SGML parsing process. Second, the HyTime engine called HyOctane does additional checking and creates internal structures. Third, the document is presented. Our long-term objective is to realize a relevant part of the HyTime semantics as part of a database application. With HyOctane databases are used as mere storage systems. It seems that full database functionality has not been envisaged.

An important question in this context is the one about HyTime's limits. As opposed to MHEG [KrC92, ISO93, Pri93] information concerning the presentation, e.g. whether in the user interface buttons or sliders are to be used, is not part of the document. In other words, a consensus on what a document is has not yet fully emerged. On another level, we see the problem that only a fragment of hypermedia-document components' semantics can be mapped on the database objects' attributes. There is ongoing research concerning document components' semantics that cannot be expressed with HyTime. In [ZhP92], to give an example, it is described how statecharts can be applied to describe the browsing semantics of hyperdocuments. On the other hand, in HyTime there are the attributes extra and intra to capture the browsing semantics. With these attributes it is merely the last traversal action that can be taken into account to determine the traversal actions allowed. The flexibility of statecharts is naturally higher: a sequence of traversal actions can be evaluated to identify those actions. The complexity of the HyTime standard would considerably increase if such information were also given within the document. A question that cannot unmistakably be answered is what part of the semantics, e.g. the browsing semantics, should be part of the document. It is also conceivable to leave this to the application or the "reader's" preferences.

⁷ With regard to the nodes' semantics a deeper view is advantageous. Generic operations such as the general calculation of the transitive closure are part of the SGML semantics: this is because to this end only the attributes of type ID/IDREF are needed. With HyTime the functionality may be more differentiated, e.g. the transitive closure for certain link types or anchor types can be calculated: this would be part of the HyTime semantics.
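The node-level operations mentioned in Sections 4 and 5 can be made concrete with a small sketch: identifying a node's adjoining links, calculating the transitive closure either generically or for one link type only (the distinction drawn in footnote 7), and the three delete variants of [MaS92]. The following Python fragment is an illustration under stated assumptions and not the VML implementation: the class HyperStore, its method names and the in-memory dictionaries are hypothetical, and links are treated as undirected connections between their anchors.

```python
# Illustrative sketch only: HyperStore and all method names are hypothetical
# and belong neither to VML/VODAK nor to the HyTime standard.
from collections import deque


class NodeInUseError(Exception):
    """Raised when a considerate delete finds links still referencing the node."""


class HyperStore:
    def __init__(self):
        self.nodes = {}  # node ID -> content
        self.links = {}  # link ID -> (link type, list of anchor node IDs)

    def add_node(self, node_id, content=""):
        self.nodes[node_id] = content

    def add_link(self, link_id, link_type, anchors):
        self.links[link_id] = (link_type, list(anchors))

    def adjoining_links(self, node_id, link_type=None):
        """IDs of links anchored at node_id, optionally of one link type only."""
        return [
            link_id
            for link_id, (kind, anchors) in self.links.items()
            if node_id in anchors and (link_type is None or kind == link_type)
        ]

    def transitive_closure(self, start_id, link_type=None):
        """Node IDs reachable from start_id (assumption: links are undirected)."""
        reached = {start_id}
        queue = deque([start_id])
        while queue:
            current = queue.popleft()
            for link_id in self.adjoining_links(current, link_type):
                for anchor in self.links[link_id][1]:
                    if anchor not in reached:
                        reached.add(anchor)
                        queue.append(anchor)
        reached.discard(start_id)
        return reached

    def reckless_delete(self, node_id):
        """Delete the node together with every link referencing it."""
        for link_id in self.adjoining_links(node_id):
            del self.links[link_id]
        del self.nodes[node_id]

    def content_based_delete(self, node_id):
        """Delete only the node; referencing links keep a now-invalid reference."""
        del self.nodes[node_id]

    def considerate_delete(self, node_id):
        """Delete the node only after verifying that no link references it."""
        if self.adjoining_links(node_id):
            raise NodeInUseError(node_id)
        del self.nodes[node_id]

    def dangling_links(self):
        """IDs of links referencing at least one node that no longer exists."""
        return [
            link_id
            for link_id, (_, anchors) in self.links.items()
            if any(anchor not in self.nodes for anchor in anchors)
        ]
```

Calling transitive_closure without a link type uses nothing but the ID references and thus corresponds to the generic case of footnote 7, whereas passing a link type restricts the traversal in the more differentiated manner attributed to HyTime. The scan over all links in adjoining_links is chosen for brevity; a store following the redundant-storage idea of Section 4 would presumably keep an inverse index per node instead.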

6 Conclusions

We have outlined our approach towards HyTime document storage. The HyTime standard is a "meta" standard: it can be used to define exchange formats for hypermedia documents. Some of the HyTime link features have been introduced. In the database, there are classes corresponding to element types derived from HyTime architectural forms. They contain the document components with operations reflecting the particular HyTime semantics. The internal structure of HyTime-database objects can be compared to indexing mechanisms within databases, thus facilitating fast access.

We rely on two VODAK conceptions: the distinction between types and classes and the notion of metaclasses. In principle each architectural form corresponds to a metaclass which, in turn, contains the type definition of its instances and metainstances. The instances, "normal" application classes, are containers for document components of the same type - and therefore with the same semantics. These classes can be generated dynamically. Our objective has been not to be restricted to a fixed set of document-type definitions.

The current stage of our work is the implementation phase. However, integrating the HyTime Scheduling Module into our database application poses as yet unsolved problems for the database architecture.

Acknowledgement. We thank Jürgen Wäsch for helpful comments.

References

[AbF93] K. Aberer, and G. Fischer, "Object-Oriented Query Processing: the Impact of Methods on Language, Architecture and Optimization", Arbeitspapiere der GMD No. 763, Sankt Augustin, 1993.

[ABH94] K. Aberer, K. Böhm, and C. Hüser, "The Prospects of Publishing Using Advanced Database Concepts", in Proceedings of Conference on Electronic Publishing, April 1994, eds., C. Hüser, W. Möhr, and V. Quint, pp. 469-480, John Wiley & Sons, Ltd., 1994.

[BAH93] K. Böhm, and K. Aberer, "Extending the Scope of Document Handling: the Design of an OODBMS Application Framework for SGML Document Storage", Arbeitspapiere der GMD No. 811, Sankt Augustin, 1993.

[CaG88] B. Campbell, and J.M. Goodman, "HAM: a General Purpose Hypertext Abstract Machine", in Communications of the ACM, July 1988, Vol. 31, No. 7, pp. 856-861.

[Con87] J. Conklin, "Hypertext: an Introduction and Survey", in IEEE Computer, 20 (9), Sept. 1987, pp. 17-41.

[DeS86] N. Delisle, and M. Schwartz, "Neptune: a Hypertext System for CAD Applications", in Proceedings of the ACM SIGMOD'86 Conference, Washington DC, U.S.A., May 1986, ACM Press, pp. 132-143.

[Her94] E. van Herwijnen, Practical SGML (second edition), Kluwer Academic Publishers, 1994.

[Hyp92] Proceedings of the ACM Conference on Hypertext, Milano, Italy, 1992, ACM Press.

[ISO86] Information Processing - Text and Office Systems - Standard Generalized Markup Language (SGML), ISO 8879-1986 (E), International Organization for Standardization, 1986.

[ISO92] Information Technology - Hypermedia/Time-based Structuring Language (HyTime), ISO/IEC 10744, 1992 (E), International Organization for Standardization, 1992.

[ISO93] Information Technology - Coded Representation of Multimedia and Hypermedia Information Objects (MHEG), ISO/IEC JTC 1/SC 29, International Organization for Standardization, 1993.

[KAN93] W. Klas, K. Aberer, and E. Neuhold, "Object-Oriented Modeling for Hypermedia Systems Using the VODAK Modeling Language (VML)", in Object-Oriented Database Management Systems, NATO ASI Series, Springer Verlag Berlin Heidelberg, August 1993.

[Kla+93] W. Klas et al., "VML - The VODAK Model Language Version 3.1", Technical Report, GMD-IPSI, July 1993.

[Koe+93] J.F. Koegel et al., "HyOctane: a HyTime Engine for an MMIS", in Proceedings of the ACM Conference on Multimedia 1993, ACM Press, pp. 129-136.

[KrC92] F. Kretz, and F. Colaitis, "Standardizing Hypermedia Information Objects", in IEEE Communications Magazine, May 1992, pp. 60-70.

[MaS92] M. Marmann, and G. Schlageter, "Towards a Better Support for Hypermedia Structuring: the HYDESIGN Model", in [Hyp92], pp. 232-241.

[NKN91] S.R. Newcomb, N.A. Kipp, and V.T. Newcomb, "The "HyTime" Hypermedia/Time-based Document Structuring Language", in Communications of the ACM, Nov. 1991, Vol. 34, No. 11.

[Mut93] P. Muth et al., "Semantic Concurrency Control in Object-Oriented Database Systems", in IEEE Data Engineering 1993, Vienna, Austria.

[Pri93] R. Price, "MHEG: an Introduction to the Future International Standard for Hypermedia Object Interchange", in Proceedings of the ACM Conference on Multimedia 1993, ACM Press, pp. 121-128.

[QuV92] V. Quint, and I. Vatton, "Combining Hypertext and Structured Documents in Grif", in [Hyp92], pp. 23-32.

[Rak+93] T.C. Rakow et al., "Using Object-Oriented Database Systems for Multimedia Applications", in it+ti-Informationstechnik und Technische Informatik, Themenheft "Multimedia/Hypermedia", Teil 2, Oldenbourg, Munich, June 1993, pp. 4-17.

[ScS90] H. Schütt, and N.A. Streitz, "HyperBase: a Hypermedia Engine Based on a Relational Database Management System", in A. Rizk, N.A. Streitz, and J. André (eds.), Hypertext: Concepts, Systems, and Applications (ECHT'90), Cambridge University Press, pp. 95-108.

[SeE90] A. Sernadas, and H.-D. Ehrich, "What is an Object, After All", in R. Meersman, W. Kent, and S. Khosla (eds.), Object-oriented Databases: Analysis, Design and Construction, North-Holland, 1991, pp. 39-69.

[SHT89] N.A. Streitz, J. Hannemann, and M. Thüring, "From Ideas and Arguments to Hyperdocuments: Travelling through Activity Spaces", in Proceedings of the Second ACM Conference on Hypertext (Hypertext'89), ACM Press, pp. 343-364.

[SSS93] D.E. Shackelford, J.B. Smith, and F.D. Smith, "The Architecture and Implementation of a Distributed Hypermedia Storage System", in Proceedings of the Fifth ACM Conference on Hypertext (Hypertext'93), Seattle, U.S.A., Nov. 1993, ACM Press, pp. 1-13.

[Str+92] N.A. Streitz et al., "SEPIA: a Cooperative Hypermedia Authoring Environment", in [Hyp92], pp. 11-22.

[WiL92] U.K. Wiil, and J.J. Leggett, "Hyperform: Using Extensibility to Develop Dynamic, Open and Distributed Hypertext Systems", in [Hyp92], pp. 251-261.

[ZhP92] Y. Zheng, and M.-C. Pong, "Using Statecharts to Model Hypertext", in [Hyp92], pp. 242-250.