Towards a Flexible Approach to Manage Varying and Altering

Towards a flexible approach to manage varying and altering information representations Tobias Haubold, Georg Beier, Wolfram Hardt Zwickau University of Applied Sciences, Informatics, Zwickau, Germany E-mail: [email protected] ence. Thus materials differ by means of the didactic design, Abstract prior knowledge, expert and non-skilled target audience or fundamental or advanced knowledge. Information and content is described using content or docu- To create such forms of content within a reasonable time ment models. Content production always has a particular pur- copy and paste is usually used to reuse content fragments by pose, target audience and language. Often various forms of rep- duplication. In addition the use of materials in international resentations are created in different data formats and all overlap cooperation demand translation and localization. in content and structure. Fragments of such content have relations On update or revision of content the consistency of par- across different origins. Such relations refer to a finer granularity tially redundant content fragments spread across different than the elementary unit of most management tools. Thus management tools are unable to provide automated support for such document files need to be ensured manually. The process of relations and consistency must be manually ensured during main- locating related content, detecting differences and collect- tenance. ing content for refreshing translation and localization must Using the example of learning and teaching content this pa- be done by hand and leads inevitably to time consuming per outlines key challenges of the problem domain, fundamental maintenance of content. characteristics of content production and management and neces- This paper focuses on requirements of content and doc- sary concepts of content and document models to address these ument models as well as content management approaches. problems. Related work on content and document models as well as management approaches is discussed and an approach for a 2. Key Challenges in Problem Domain repository architecture is presented to target these problems on management level. The process of producing learning and training content starts by defining a main learning path through the knowledge domain. According to this path the material is struc- 1. Motivation tured and created using an didactic approach (see [1]). Thus different documents are produced (e.g. presentation slides and scripts) sharing the same structure but contain different Digital learning and teaching materials are usually the representations of the content. During content maintenance result of target oriented content production processes. They the common structure must be ensured manually to keep the are created using appropriate editing tools (Office-Tools like materials consistent. Word, Powerpoint, Apache OpenOffice, LibreOffice, image During the production of materials with a similar target processing applications, special tools to produce SCORM- audience copy and paste is usually used to reuse content by based e-learning content, and more) and managed using the duplication. Thus content fragments are stored redundantly specific data format of the tools. in different document files. During update or revision all Every kind of knowledge transfer pursues an intention duplicated content fragments must be maintained manually and follows a didactic approach. Thus teaching in pres- to ensure consistency. The same applies to derived con- ence (presentation slides), self-study (lecture notes) and in- tent. Content fragments are duplicated and subsequently teractive online learning (web content) demand all different changed to better fit other purposes or author styles. Dur- kinds of content representations. The purpose of materials ing change of the comprised knowledge all derived content is significantly characterized by the learner as target audi- must be considered for manual maintenance. The work was funded by the European Union and the Free State of Multilingual materials with a high rate of change need Saxony. frequent refreshing translations and localizations. When the 566 latest updates are not yet available, one must deal with in- 4. Solution Approaches in Content Production completely translated and localized materials. Approaches to address these challenges involve docu- To address the key challenges content or document mod- ment and content models as well as content management els need the following mechanisms. approaches. 4.1. Content References 3. Fundamental Characteristics of Content The document model must provide support for content Production and Management references. To use them they must be allowed on the desired position and the type of content must be supported. 3.1. Reuse of Content There are two kinds of content references: references to files and references to file fragments. In the latter case just By using copy and paste content can be reused by dupli- the referenced fragment is included instead of the whole cation. This results in redundant storage and thus content file. This requires mechanisms to address content fragments updates have to be done manually on each copy to ensure defined by the referenced document model. consistency. Content references allow to reuse content without dupli- 4.2. Content Aggregations cation and redundant storage. Instead of the content itself just a reference to it is stored1. Whereas copy and paste is Mechanisms to aggregate content are based on content a feature of the editing tool, content references are a feature references and allow to compose documents out of other of the document model and must be supported by it. documents and document fragments. In addition to content inclusion the content is adapted to the current context if required, i.e. it is included on the desired position in the hi- 3.2. Granularity in Content Production erarchical structure of the document and the document wide layout and formatting settings are applied. Layout and for- In this context granularity is understood as the extend matting settings particularly defined for the included con- of content. Document models define concepts to structure tent are not adjusted. content (e.g. headings for hierarchical structures in text documents or slides in presentation software) and capture con- 4.3. Content Filtering tent (e.g. text in text boxes or paragraphs). Each concept represents a level of granularity. Thus there are levels with Mechanisms for content filtering allow for conditional finer and coarser granularity. content inclusion. Some kind of configuration settings spec- Especially in document oriented content production pro- ify whether or not particular content fragments are included cesses resulting document files are tailored to particular pur- or not. Usually such mechanisms evaluate meta data at- poses and have a coarse granularity. Within learning object tached to content fragments. based content production processes Hamel et al. states that the granularity should be relatively small [2]. In addition 4.4. Separation of Content and Presentation Duval et al. states that learning objects with finer granularity are more easily reusable [3]. There are two common approaches to separate content and presentation. First, the content contains meta data 3.3. Granularity in Content Management referencing presentation details. A presentation engines resolves these references and renders the content appro- All management tasks including versioning, categoriza- priately. Examples are the HyperText Markup Language tion, meta data management as well as translation are based (HTML) which references Cascading Style Sheets (CSS) on an elementary content management unit. Document files or Open Document (ODF) which defines format tem- Management Systems (DMS) and versioning tools use sim- plates referenced by content. ple files as management unit. Due to the coarse granularity Second, there are transformation based approaches. The of document files such tools are not suited to address the content is either described by an XML based document key challenges of section 2. Content Management Systems model or an implemented content model. Transforma- (CMS) use the concepts of their content model as manage- tions are described using the Extensible Stylesheet Lan- ment unit. Thus they provide finer levels of granularity but guage Transformations (XSLT) or an appropriate template prescribe particular content models. language. An XSLT or template processor takes the content and transformations as input and produces output doc- 1links in contrast serve as navigation uments. Examples are DocBook, DITA or CMSs. 567 5. Proposed Content Management Approach 5.3. Data Storage Layer To provide automated tooling support for the key chal- The main purpose of this layer is to persist data. All data lenges mention in section 2, content management must de- is stored as binary stream. A separate data storage layer is tect and manage content references and content aggrega- suited to support deduplication and distribution of data to tions and must support meta data for content filtering on a enable the use as distributed repository. fine level of granularity. The proposed approach focuses on a repository architecture to manage fine granular content 5.4. Repository Architecture fragments and to capture content references and aggregations in a document and content model independent manner. An adapter mechanism is used to map content into the proposed management model. File based adapters build 5.1.

Towards a Flexible Approach to Manage Varying and Altering

THE TEXT ENCODING INITIATIVE Edward

Towards a Content-Based Billing Model: the Synergy Between Access Control and Billing

3.4 the Extensible Markup Language (XML)

Never the Same Stream: Netomat, Xlink, and Metaphors of Web Documents Colin Post Patrick Golden Ryan Shaw

Streaming-Based Processing of Secured XML Documents

Standoff Markup Peter L

Enhancing the Security of Applications Using XML-Based Technologies

Twenty Years of Theological Markup Languages: a Retro- and Prospective by Race J

XML Base W3C Proposed Recommendation 20 December 2000

The Text Encoding Initiative: Flexible and Extensible Document Encoding

Markup Overlap: a Review and a Horse

Evaluation of Xpath Queries Against XML Streams Dan Olteanu