CEN WORKSHOP AGREEMENT

CWA 15778

February 2008

ICS 35.240.30

English version

Document Processing for Accessibility

This CEN Workshop Agreement has been drafted and approved by a Workshop of representatives of interested parties, the constitution of which is indicated in the foreword of this Workshop Agreement.

The formal process followed by the Workshop in the development of this Workshop Agreement has been endorsed by the National Members of CEN but neither the National Members of CEN nor the CEN Management Centre can be held accountable for the technical content of this CEN Workshop Agreement or possible conflicts with standards or legislation.

This CEN Workshop Agreement can in no way be held as being an official standard developed by CEN and its Members.

This CEN Workshop Agreement is publicly available as a reference document from the CEN Members National Standard Bodies.

CEN members are the national standards bodies of Austria, Belgium, Bulgaria, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Norway, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, Sweden, Switzerland and United Kingdom.

EUROPEAN COMMITTEE FOR STANDARDIZATION
COMITÉ EUROPÉEN DE NORMALISATION
EUROPÄISCHES KOMITEE FÜR NORMUNG

Management Centre: rue de Stassart, 36 B-1050 Brussels

© 2008 CEN All rights of exploitation in any form and by any means reserved worldwide for CEN national Members.

Ref. No.:CWA 15778:2008 D/E/F CWA 15778:2008 (E)

Contents

Foreword
1 Introduction
1.1 Scope
1.2 Purpose
1.3 Formal liaisons
1.4 Target audience
1.5 Opening new markets
2 Standards for document processing for accessibility
2.1 Problem Statement
2.2 Standards and structures
2.3 Current situation
2.4 Current state of the art
2.5 Current problems
2.6 Influencing factors in Document Processing for Accessibility
3 Formats for document processing
3.1 Printed paper
3.2 Printed Braille
3.3 Audio
3.4 ASCII Text
3.5 HTML documents
3.6 XML
3.7 Multi-type composite formats
4 Considerations for structuring documents
4.1 Define and use document style guidelines
4.2 Define and use structure guidelines
4.3 Edit / add structure where needed
4.4 Edit DRM settings
4.5 Adaptation
5 Conversion processes
5.1 Convert Multimedia Material to structured Multimedia Material
5.2 Convert structured Multimedia Material to XML
5.3 Convert Multimedia Material to XML
5.4 Convert traditional print to XML
5.5 Convert DTP to XML
5.6 Convert XML to Print
5.7 Convert XML to Braille
5.8 Convert XML to Large Print
5.9 Convert XML to HTML
5.10 Convert XML to structured Multimedia Material
5.11 Convert DTP to Multimedia Material
5.12 Convert Audio to structured Audio
5.13 Convert XML to XML
6 Scenarios introducing accessibility within publishing workflows
6.1 Scenario 1 - Delivering XML files
6.2 Scenario 2 - Accessibility enhancement in general
6.3 Scenario 3 - Increasing
6.4 Scenario 4 - Accessibility policy
6.5 Scenario 5 - Spoken documents for everyone
6.6 Scenario 6 - Accessible and protected PDFs
6.7 Scenario 7 - Working hand in hand
6.8 Scenario 8 - Accessible design
6.9 Scenario 9 - Accessibility on a large scale
6.10 Scenario 10 - What authors can do
6.11 Scenario 11 - Repair and adaptation
6.12 Common scenario requirements
6.13 Specific scenario requirements
7 Application-oriented scenario implementation
7.1 Harry Potter and the RNIB
7.2 Magazine and Newspaper distribution in the Netherlands
7.3 Time Warner and Dolphin Audio Publishing
7.4 Educational publishing in Austria
7.5 Best practice for distributing accessible content
8 Identified gaps and areas for further research
8.1 Descriptions & Requirements
8.2 Process & Content Modelling
8.3 Introducing and using metadata for accessibility purposes
8.4 Standards and personalisation of content
8.5 Licensing and technical protection measures
9 Conclusion and future work
Appendix A – Relevant standards
Appendix B – Relevant European organisations
Appendix C – Sustainability: network of interested parties for ongoing support and further development
Appendix D – Abbreviations List


Foreword

The decision for this CEN Workshop Agreement (CWA) was taken by the DPA Workshop at its kick-off meeting on 13 May 2005.

This CWA provides a first elaboration on how the accessibility of publishing content can be enhanced by altering existing publishing workflows and introducing accessibility considerations where appropriate. To reach this goal, for each step where accessibility is introduced, the relevant formats and conversions are detailed and new workflow items are described.

The document has been developed through the collaboration of a number of contributing partners such as publishers, blind organizations and universities. The names of the individuals and their affiliations that have expressed support for this CWA are available from the CEN/ISSS Secretariat.

The final draft of this CWA was put on CEN’s web site for a 60-day period of external comment between 06 July and 10 September 2007.

The formal process followed by the Workshop in the development of the CEN Workshop Agreement has been endorsed by the National Members of CEN but neither the National Members of CEN nor the CEN Management Centre can be held accountable for the technical content of the CEN Workshop Agreement or possible conflict with standards or legislation. This CEN Workshop Agreement can in no way be held as being an official standard developed by CEN and its members.

The final review/endorsement round for this CWA was successfully closed on 2007-09-18. The final text of this CWA was submitted to CEN for publication on 2007-11-30.

This CEN Workshop Agreement is publicly available as a reference document from the National Members of CEN: AENOR, AFNOR, ASRO, BDS, BSI, CSNI, CYS, DIN, DS, ELOT, EVS, IBN, IPQ, IST, LVS, LST, MSA, MSZT, NEN, NSAI, ON, PKN, SEE, SIS, SIST, SFS, SN, SNV, SUTN and UNI.

Comments or suggestions from the users of the CEN Workshop Agreement are welcome and should be addressed to the CEN Management Centre.


1 Introduction

1.1 Scope

Given the widespread adoption of ICT within the publishing industries, there is a general interest in the creation and provision of well-formatted digital documents. For those people who are dependent on accessible information, this interest is of central importance, and it is this convergence of interests that stimulated the creation of this Workshop. The WS/DPA has examined some of the ways in which this convergence is helping to build consensus and create new strategies and technologies for the provision of information in formats that are more accessible for everyone.

In the real world, publishers rely on accessibility experts and consider accessible information only at the end of the content production chain. This requires a considerable amount of effort to make information accessible for everyone, and it is a very hard problem to tackle. This workshop introduces accessibility as a design element in content production and provides guidelines and best practices on how more accessible documents can be produced. Another important issue is that the user requirements for accessible information are not well defined. In this work, we therefore base the elaboration on publishing use cases and scenarios that have been derived together with publishers in order to analyse the user requirements at least partly. Additionally, those scenarios provide specific examples of accessible information provision as an entry point for publishing stakeholders.

Sustaining the provision of useful services and meaningful accessible content can be considered vital to the growth of the Information Society as a whole. When designing, specifying and building applications and infrastructures to store accessible content, several apparently unrelated issues arise. How do we describe the knowledge and capabilities we possess and capture the repository of resources we can use to implement these capabilities? How do we describe the questions and problems of end users and content providers? How do we marry both within manageable and consistent frameworks? How do we re-apply this knowledge and combine these resources with new insights to solve new problems? How can we accelerate the processes described above and provide solutions to enable accessible information processing?

The DPA Workshop (CEN Workshop on Document Processing for Accessibility) brought together some of the key players working in the fields of publishing and accessibility. The topics addressed ranged from generic document and knowledge structures, through all aspects of accessible document processing, to Digital Rights Management and copyright issues. Perhaps the most striking aspect was the level of convergence between the needs of accessibility communities and those of content creators and providers. Indeed, when accessibility is introduced from the outset, the information needs of all consumers are better served, particularly as content providers seek innovative solutions for re-aggregating their content for new marketplaces.

1.2 Purpose

The DPA Workshop, as detailed in its business plan, has the following objectives:
• To bring together all the players in the information provision and e-publishing chain in order to achieve the critical mass needed to significantly enhance the provision of accessible information at a European level.
• To provide the guidelines needed to integrate accessibility approaches and workflows within the document management and publishing process, rather than as just a specialised additional service.
• To raise awareness and stimulate the adoption at local, regional, national and European levels of the emerging formats and standards for the provision of accessible information, and to find ways of ensuring that technological protection measures do not inadvertently impede legitimate access to information by people with print impairments.

Based on those objectives this document:
• describes the outcomes from the DPA Workshop activities
• provides an elaboration of relevant standards and their possible use in the publishing sector
• examines the different formats required for accessible information provision
• provides a systematic overview of relevant conversion processes and related structured information activities
• examines possible scenarios of use within the publishing sector
• provides real-life case studies and instances of best practice
• identifies areas for further research and systematisation

1.3 Formal liaisons

The Workshop has been initiated and supported by the EUAIN Project¹. The EUAIN network is a co-ordination action co-funded by the INFSO DG of the European Commission within the RTD activities of the Thematic Priority Information Society Technologies of the 6th Framework Programme. The EUAIN project, co-ordinated by DEDICON Amsterdam, aims to promote e-Inclusion as a core horizontal building block in the establishment of the Information Society by creating a network to bring together the different actors in the content creation and publishing industries around a common set of objectives relating to the provision of accessible information. Accessibility for print impaired people can be an increasingly integrated component of the document management and publishing process and should not be a specialised or additional service.

The DPA Workshop has established liaisons with the following activities relevant to its work programme:
• AIIM PDF Access Working Group
• CEN/ISSS Data Protection & Privacy Workshop
• CEN/ISSS Learning Technologies Workshop
• CEN/ISSS Dublin Core Metadata Workshop
• COST 219 TER
• EDeAN European Design for All e-Accessibility Network
• ETSI TC/HF ETSI Technical committee Human Factors
• ICTSB/DATSCG Design for All and Assistive Technologies Standardisation Co-ordination Group
• OASIS TCs concerning ODF and DITA
• W3C Web Accessibility Initiative (WAI)
• DAISY (NISO z39.86)
• ISO/IEC JTC 1 Special Working Group on Accessibility
• ISO/IEC JTC 1/SC 29 Coding of audio, picture, multimedia and hypermedia information
• ISO/IEC JTC 1/SC 7/WG 2 Software and Systems Documentation

1.4 Target audience

This CWA addresses accessibility in the publishing value chain and examines ways to introduce and enhance accessibility of publishing content inside publishing workflows. The intended audience includes actors and stakeholders within the publishing value chain (publishers, authors, content producers and distributors, publishing system developers and vendors) and the content accessibility area (specialised libraries, accessibility consultants, and accessible system developers and vendors).

The CWA aims to provide a first elaboration on how the accessibility of publishing content can be enhanced by altering existing publishing workflows and introducing accessibility considerations where appropriate. To reach this goal, for each step where accessibility is introduced, the relevant formats and conversions are detailed and new workflow items are described.

1 Contract number 511497, DG INFSO, EC, FP6. See http://www.euain.org

The primary target group of this CWA consists of actors in the publishing value chain. The document is structured in a form that enables publishers to get quick insights into what they need to do in order to produce accessible content. For that reason several specific scenarios are provided that, although not exhaustive, can serve as an entry point for publishers in their accessible content implementations.

1.5 Opening new markets

Accessible information is not a special type of information aimed at a specific group of a certain population. Accessible information is information that can be accessed by anyone, with or without a disability, aimed at a general market where anyone interested is a possible customer. Structured information is the first step in the accessible information process. A document whose internal structure can be defined and its elements isolated and classified, without losing sight of the overall structure of the information, is a document that can be navigated.

Most adaptive technology allows the user to access a document, and to read it following the "outer" structure of the original. But if the same information has also an "inner" structure that allows the adaptive device to distinguish between a phrase and a measure, between a paragraph and a sentence, highlighting particular annotations, then the level of accessibility (and therefore usability) of the whole document will be greatly enhanced, allowing the user to move through it in the same way as those without impairments do when looking at a printed document, and following the same integral logic.

In an ideal world, all documents made available in electronic formats should contain the internal structure that benefits everyone. Highly-structured documents are becoming more and more popular for reasons that very seldom pertain to making them accessible to persons with disabilities. The move to XML related formats and associated standards for metadata has provided an impetus for far greater document structuring than before. Whatever the reasons behind those decisions are, the use of highly-structured information is of great benefit to anybody accessing it for any purpose.

In recent years, the market for accessibility and assistive technologies has started to gain recognition. It is clear that the integration of accessibility notions into mainstream technologies would provide previously unavailable opportunities in the provision of accessible multimedia information systems. It would open up modern information services and provide them to all types and levels of users, in both the software and the hardware domain. Additionally, new consumption and production devices and environments can be addressed from such platforms and this would provide very useful information provision opportunities indeed, such as information on mobile devices with additional speech assistance.

It is equally clear that we remain at the very beginning of the move to incorporate accessibility within mainstream content processing environments.


2 Standards for document processing for accessibility

2.1 Problem Statement

Accessible solutions are required for anyone who requires assistance in using the mainstream solution. This could be because a user is blind, visually impaired, or impaired in some other way, and the term print impaired is often used in this context. Accessible solutions range from small assistive applications (such as screen magnifiers) to full scale operating systems and screen reading environments. The traditional problem with accessible solutions is that they are normally implemented as an afterthought or a piggy-back solution. This results in solutions that are not fully integrated (or not well integrated) with the mainstream solutions. These independent applications are then at a disadvantage whenever software versions or operating systems are updated. In order to make this integration process easier, and provide more intuitive designs for the future, it is essential that “design for all” and accessible design methodologies are widespread. Standards, policy and legislation also help ensure that accessible designers have a solid standard to meet to ensure future-proofing².

Notions of “accessibility” are normally equated with the adaptation and conversion of digital content, where this content can be made available. On a European level, and indeed often on a national level, much of the existing expertise on creating accessible adaptations of digital content is of a highly distributed nature. Within specialist organisations supporting print impaired people, within university research laboratories, and indeed within publishing houses, many automated tools have been designed and implemented at least partially to execute the necessary adaptation procedures. However, each automated tool has its own, highly specific, field of application. Furthermore, the knowledge required to build these very specific tools is equally distributed, so that there is currently very little re-use of either tools or knowledge. The content provider’s perspective on digitisation is further complicated by security issues. In the modern environment driven by the internet for content dissemination, security is a vital issue for publishers. DRM is a complex problem for all content holders. Every publisher’s content, client base and requirements are different, which often results in a personalised set of requirements for each case. As a result, agreements on accessibility are often negotiated on a case-by-case basis. Naturally, publishers have to be confident that any digital format is being delivered through secure gateways to only the people who are intended to receive it.

Accessibility can also be viewed from a wider angle. Being able to see content in whatever modality, perceive its context and attach a useful meaning to it requires that the user be able to access this content, its context and the relevant software application in a way that meets that particular user's consumption preferences. These preferences may become requirements over time, since we all get older. Being able to attach useful meanings to content lies at the very basis of the preservation and education of thought, and underpins culture, commerce and civilisation. Realising the potential for understanding that software and content unleash requires that we can gain access to that software without being hindered by high costs, complexity, lack of support or other barriers.

Given the differences between the traditional approach to accessibility and the wider view outlined above, we are in something of a transitional phase at this time. From the software producer, business community and the Open Source System community we see a move towards the inclusion of accessibility features into systems, tools and the programming languages themselves as system wide core functionalities (examples being KDE³, GNOME⁴, and Java Accessibility⁵). From the accessibility community we see a move towards more advanced and abstract descriptions of the procedures involved in moving from 'common' content towards content that has been processed so that it can be certified as accessible. A good example of such a move is the Web Content Accessibility Guidelines 1.0⁶ and 2.0⁷, which provide detailed guidelines on how to (re)structure and enhance websites and their content to ensure a sufficient level of accessibility.

2 See, for example, the work of EIDD-Design For All Europe, http://www.design-for-all.org

3 KDE Accessibility Project, http://accessibility.kde.org/

4 GNOME Accessibility Project, http://developer.gnome.org/projects/gap/GNOME-Accessibility.html

5 JAVA Accessibility, http://java.sun.com/j2se/1.4.2/docs/guide/access/

6 http://www.w3.org/TR/WCAG10/

The transitional stage described above involves relatively slow change when compared with general exhilarating technological developments. However, this relatively slow pace also creates an opportunity to take a step back and observe all the individual processes that touch upon the notion of accessibility. This allows us to explicate similarities and possible complementarities, a process of convergent gradualism if you like. The opportunity then arises to synchronise various efforts in the accessibility arena and offer them to end users and business as a ‘package’. Such a package contains scientific knowledge about accessibility, as well as technological knowledge about how to implement such notions. This package also contains detailed descriptions of the requirements of the end users, producers and distributors of content, as well as tools aiming towards market segments that rely on these requirements. Such an approach that aims to unify 'common' content, system, service and tool provision and the more 'specialised' content, system, service and tool provision, can be called Accessible Information Processing (AIP).

2.2 Standards and structures

Structured information is the first big step towards high-quality accessible information. A document whose internal structure can be defined and its elements isolated and classified, without losing sight of the overall structure of the document, is a document that can be navigated.

As noted above, most adaptive technology allows the user to access a document, and to read it following the "outer" structure of the original. If that structure is left to a range of visual cues, like bold capital letters for the title of a chapter and bold italics for the heading of a subchapter, the adaptive device will surely flatten that visual structure, leaving a document with no structure at all. But if the same document also has an "inner" structure that makes it possible for the adaptive device to distinguish between a paragraph and a footnote, between a chapter and a subchapter, then the level of accessibility of the whole document will be greatly enhanced, allowing the user to move through it in the same way those without disabilities do when looking at the printed document, following the same "logic".
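The difference between "outer" and "inner" structure can be illustrated with a minimal sketch. It assumes two hypothetical XML fragments carrying the same text: one marks chapters, sections and footnotes explicitly, the other expresses headings only through bold and italics. The element names (`book`, `chapter`, `section`, `title`, `para`, `footnote`) and the helper functions `outline` and `footnotes` are illustrative assumptions, not taken from any standard discussed in this CWA.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured markup: element names are illustrative only.
STRUCTURED = """
<book>
  <chapter>
    <title>Formats</title>
    <para>Printed paper relies on visual appearance.<footnote>See Appendix A.</footnote></para>
    <section>
      <title>Braille</title>
      <para>Braille output needs explicit structure.</para>
    </section>
  </chapter>
  <chapter>
    <title>Conversion</title>
    <para>XML can be converted to many outputs.</para>
  </chapter>
</book>
"""

# The same kind of text where headings exist only as visual cues.
VISUAL_ONLY = """
<doc>
  <p><b>FORMATS</b></p>
  <p>Printed paper relies on visual appearance.</p>
  <p><b><i>Braille</i></b></p>
  <p>Braille output needs explicit structure.</p>
</doc>
"""

# Elements treated as navigable containers (an assumption of this sketch).
HEADING_CONTAINERS = {"chapter", "section"}


def outline(xml_text):
    """Return (depth, title) pairs for every chapter or section found."""
    root = ET.fromstring(xml_text)
    entries = []

    def walk(element, depth):
        for child in element:
            if child.tag in HEADING_CONTAINERS:
                title = child.findtext("title", default="(untitled)")
                entries.append((depth, title))
                walk(child, depth + 1)

    walk(root, 1)
    return entries


def footnotes(xml_text):
    """Return the text of every footnote element, in document order."""
    root = ET.fromstring(xml_text)
    return [fn.text for fn in root.iter("footnote")]


if __name__ == "__main__":
    for depth, title in outline(STRUCTURED):
        print("  " * (depth - 1) + title)
    print("Navigable entries without inner structure:", outline(VISUAL_ONLY))
```

With explicit structure, a tool can build a navigable outline and tell a footnote from a paragraph. With visual cues alone, the same routine finds nothing to navigate, which is the flattening effect described above.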

In an ideal world, any document made available in electronic format should contain an internal structure that benefits everyone. Highly-structured documents are becoming more and more popular for reasons that very seldom have to do with making them accessible to persons with disabilities.

Some of the largest publishers are converting their old electronic texts into full XML documents so that they can look for certain bits of text to re-use in further editions, and avoid double-production of the same text. Whatever the reasons behind those decisions are, the use of highly-structured information is of great benefit to anybody accessing it for whatever purpose. Equally important is to structure information following standards, as this allows for:
• Consistency in the description of structural elements
• Understanding and predictability of structures
• Interaction with other standards
• Technical compliance with different devices
• Interchangeability of materials
• Flexibility and evolution.

Standards are needed for many reasons, but probably the most relevant one is that they help manufacturers to make their products accessible in a detailed, coherent way. The existence of standards, though, does not imply that accessibility will be implemented in the same way or with the same results in all products. The existence of a number of standards for producing the same product (a document) may occasionally lead to two different levels of accessibility for the same "accessible" final product. Even within the same standard it sometimes happens that some features are considered essential while others may be considered expendable. As a result, applying the same standard with different views about what is and is not needed to make a document accessible may produce a wide range of accessibility levels for the same product, making it fully accessible for some users and just slightly accessible for others.

7 http://www.w3.org/WAI/intro/wcag20


It is therefore important to decide beforehand the level of accessibility to apply to a certain document according to different variables, such as:
• The depth of the structure that the document allows/needs to make it sufficiently navigable
• The level of navigability actually needed by the potential users
• The resources available to make a document accessible

It may also happen that different standards are developed for the same purpose and, though they deliver the same level of accessibility, they are not compatible. This usually leads to confusion for manufacturers and service providers, while it also "divides" users between the different existing standards (we all remember the Beta vs. VHS example). We can distinguish between formal (or de jure) standards and de facto standards. The former are those that have been "formalised" by standards organisations, while the latter are technical solutions that have been adopted informally by users due to their usefulness and/or reliability. Among these de facto standards, we also have two categories: proprietary standards (those developed by a commercial company) and open standards (outside vendors' control, freely developed and updated by independent programmers).

A list of relevant standards for accessible information processing has been collated and is included in Appendix A. It quickly becomes clear that no single standard or set of standards can help fully to implement accessible information processing within mainstream workflows. We rely on the existence and the promotion of accessibility standards to prove that accessibility can be built from the first stage of production: that Design For All can be applied to emerging standards so that all the features needed to grant accessibility to the final product are built into the system right from the beginning, instead of the traditional approach of adding those features later. There are many different standardisation agencies around the world, but what is considered to be fully accessible in the United Kingdom may not be seen as accessible in the same degree in Australia.

2.3 Current situation

A publisher can create a product that can be marketed for everyone, that includes accessibility features right from the first stages of the creative process and that looks just like its non-accessible version. This does not apply to all types of information (printed information, for instance), but it certainly applies to electronic publishing, audio publishing, the movie industry, etc. Printed information relies too heavily on its visual appearance and cannot be distributed "as is" to people with a print disability, but printed information is usually the last presentational output stage. Many other formats containing the same information are likely to have been created. Those files are sometimes documented, structured, catalogued and stored for future use, and those files can be created in a way in which the text and images they contain, the structure they rely on, may be of use for people with disabilities.

One clear example of how accessible information need not necessarily look different or be marketed differently is the PDF format as demonstrated by Adobe PDF version 8 onwards⁸. A PDF document can be created and distributed either tagged or untagged. The former will allow print impaired people to read the document; the latter will be of use only to those who can see it. They both look the same, and they are equally expensive or inexpensive to create. Likewise, web pages can be designed in a way that can be accessed by or not. They have the same look and, for a programmer who knows which elements to use, making an accessible webpage is not a much bigger task than making an inaccessible one.

Therefore, what would make information providers decide to create their documents in a way that can be accessible to everyone? It could be to open up their market to anybody interested in the product they sell; or to comply with legislation⁹ that requires particular information to be made public in a format that is accessible to all; or simply to reach a much wider audience.

The publishing value chain may change significantly if publishers begin to ascribe greater value to the electronic files they use for printing. Tagged and structured electronic files not only have an intrinsic value (in the e-publishing sector) but may also be used to create other formats that people with or without disabilities may use or prefer to paper. For example, the cost of producing a structured talking book using these files together with good quality synthetic speech is minimal and would increase their potential market.

8 Adobe Reader from http://www.adobe.com and associated authoring/processing tools from http://www.adobe.com/products/acrobat/

9 Sullivan, J., (2007) Study on Copyright Limitations and Exceptions for the Visually Impaired, SCCR/15/7, WIPO, Geneva.

The "accessibility features" needed to make an electronic document usable for people with print disabilities are not "exclusive features". This expression has a two-fold meaning – a fully structured XML document will not exclude people without disabilities from using it just because it is structured and in a readable format. Besides, we do not need to create an exclusive document to be used only with special equipment (except when copyright exceptions may apply).

What is sometimes considered "accessible" is simply "usable": structure and navigation are features that make a document easier to use and more friendly to ANY user. When we add to a printed book a CD-ROM with the text or the audio (or both) we are not making it available only to people with disabilities. We are also helping students to carry the full text of the book anywhere on any portable device, and we are giving them the opportunity to cite parts of the book in their papers without typing them. We are allowing buyers of the book to listen to it while driving home after work, as well as giving a blind person the possibility of hearing its contents.

In some cases, electronic documents offer many more possibilities and a better reading experience than the printed edition. Books full of references to other books and authors, with words that can be found in a glossary or notes that are explained at the end of the book, are better used and enjoyed when they are published in an electronic format. When properly edited, links can be created from the table of contents to any chapter, subchapter or other significant part of the structure; to the glossary entry every time a glossary word is used in the book; or to the bibliography every time another author is mentioned. All this can be accomplished with a simple mouse click or by pressing the space bar. This enhances the value of certain books, and it is a format that has already been used by some publishers of reference material. These are not books for people with print disabilities, but they are (in most cases) perfectly accessible.

2.4 Current state of the art

Document processing and accessibility both have a very wide focus. It has therefore been decided to structure the elaboration of standards in this workshop according to the value chain in publishing, namely content creation, content production and content distribution. Following this workflow, the elaboration is structured in a comprehensive way, as each step in the publishing process is examined. A general publishing chain can be defined in simple terms as “economic activities that facilitate the creation, production, circulation, delivery and consumption of information-based products”10.

Figure 1 visualises the above elements and shows how the elaboration is organised. The content processing chain requires on one side actors that create content, a framework and infrastructure for content production, storage and management, and on the end user side, consumption and delivery interfaces. In each of these building blocks of the content processing chain, several actors interact in order to create, produce and consume content. Those interactions form processing workflows along publishing channels and products, leading to single or multiple channel publishing. Dealing with each content processing building block requires modelling content, users and their interactions in such a way that implicit steps and knowledge become explicit. Then, in each of these steps, accessibility requirements and enhancements need to be analysed in order to introduce accessibility right at the beginning rather than at the end of the content processing chain. This is exactly the methodology followed in the current CWA: in each of the workflow steps identified, actors and processes are analysed and accessibility requirements elaborated, links to useful and related standards are made explicit, and possible content accessibility enhancement steps are described in more detail. For feasibility reasons the number of workflows considered is limited, so that each of them can be analysed thoroughly.

10 South Africa Department of Arts, Culture, Science and Technology (1998), Final Report: The cultural industries growth strategy (CIGS): The South African publishing industry report


Figure 1 - Generic Publishing Process

2.4.1 Current practices in accessible publishing

2.4.1.1 Document Conversion

Much of the current practice in document conversion centres on the use of word processor files and conversion to formats such as Adobe PDF. There are various methods for achieving this, using plug-in type software11 for a host application or larger scale document conversion services12. There are methods which can improve the accessibility and metadata of these formats, but for them to be successful the author has a responsibility to oversee and ensure the quality of this process.

Other methods use desktop publishing applications such as Adobe InDesign13 or QuarkXpress14, although the latter cannot generate accessible PDF documents.

There is a lot of industry-wide interest in using XML for publishing. XML based languages have many practical advantages, including the dynamic reuse and repurposing of content and the development by publishers of XML-centric workflows, which promise structured content that can be output in the accessible formats required by people with disabilities15.

In certain countries this is due, in part, to changes in legislation where accessible content is a requirement16 but also to a growing recognition on the part of the publishing industry of the changes in the ways end users will access content due to technological advances17 and the need for users to be able to access published content in the medium of their choice.

11 http://www.adobe.com/support/downloads/detail.jsp?ftpID=1161

12 http://createpdf.adobe.com/

13 http://www.adobe.com/de/products/indesign/

14 http://www.quark.com/

15 http://www.webaim.org/intro/

16 http://www.ncd.gov/newsroom/publications/2004/inclusion_whitepaper.htm

17 http://www.audible.co.uk/


While the referenced examples relate primarily to Web accessibility, the issues encountered by users and the methodologies subsequently used to overcome barriers to access are relevant to the publishing industry18.

2.4.1.2 Trusted Intermediaries

Trusted intermediaries establish a personalised relationship between content holders and specialist organisations whereby publishers and agencies serving blind and partially sighted people work together in a secure and trusting environment to increase the quantity and timeliness of titles available in an accessible format. Within trusted intermediary frameworks, DRM is an enabler of controlled access. A number of different security methods are being developed or are already in use for making content available in this way.

As far as security is concerned, the higher the level, the more likely publishers are to allow content to be made available in accessible digital formats. At present, the security systems used are simple: they use basic encryption technologies with key exchange mechanisms. The potential for unauthorised release of content is considerable, although there are few recorded instances of this occurring. Once decrypted, content is available to anyone, authorised or not. The ability to attach content to particular devices, or better to provide access only to authorised users, requires a level of DRM sophistication that is not yet generally in place in services catering to the needs of visually impaired people19. Current examples of such practices can be found in countries like France20 and Austria21.

2.4.1.3 Enterprise content management (ECM)

ECM plays an important role because it will likely be the basis for future workflows in publishing environments as the tools and processes converge. In mid 2002, 20% of EU media and publishing enterprises with a website had content management systems in place, though only 3% had supply chain CMS22. As illustrated in Figure 1, the content production process consists of many different steps, and different actors are involved in those processes. ECM is the central element of these processes, since all the data can be captured, managed, stored, preserved and delivered within the organisation, which supports the publishing process. Integrating the accessibility requirements is therefore an important issue for ECM and publishing environments. Accessibility considerations should be taken into account during the whole publishing process rather than at the end as an add-on feature, and so correspond with the design for all principles23.
2.4.1.4 Electronic Publishing

The landscape of electronic publishing is constantly evolving, but as technology changes ever faster there is an ever increasing need for publishers to be able to archive their content, easily retrieve it and export it in a wide variety of necessary output formats. This is a complex and multi-layered process and, as with any workflow, there are certain points in the digital publishing chain that require specific knowledge and processing methodologies in order to manage them

18 http://www.afb.org/Section.asp?SectionID=4&TopicID=222&DocumentID=1224

19 http://www.indicare.org/tiki-read_article.php?articleId=169

20 du Bourguet, Guillaume; Guillon, Benoit; Burger, Dominique (2003): Helene: a collaborative server to create and securely deliver documents for the blind. Proceedings AAATE 2003

21 Miesenberger, K.; Ruemer, R.: Schulbuch Barrierefrei (Accessible School Books) - Co-operation Between Publishers and Service Providers in Austria. In: Proceedings of the 10th International Conference on Computers Helping People with Special Needs (ICCHP '06), Linz, Springer, pp. 32-39

22 OECD report on scientific publishing, DSTI/ICCP/IE(2004)11/FINAL

23 Darzentas, J., Miesenberger, K.: Design for All in Information Technology: a Universal Concern (Keynote), in: Andersen, K., V., Debenham, J., Wagner, R. (eds.): Database and Expert Systems Applications, 16th International Conference, DEXA 2005, Copenhagen, Denmark, August 2005, Proceedings, Springer LNCS 3588, Berlin/Heidelberg 2005, pp. 406 – 420


successfully. These include archiving data and legacy document usage24, data exchange, intellectual property rights and Digital Rights Management25, and finding the best practice models that work for the widest range of publishers26.

Though each publisher may have specific needs that differ in the broadest sense, there are common issues for all. These include, but are not limited to, the need for logical document structure that not only makes sense to readers but also makes it possible to retrieve, index and update a document by using metadata to identify the various parts of an electronic document. Processes such as these are essential in order to modify the document format or extract relevant parts of a document for insertion in another. Much technical work has therefore to be done to create methodologies and mechanisms to realise and develop these processes.

2.4.1.5 XML-Publishing

There is a transition underway from the limited tag set and functionality of SGML to XML based tools and procedures, and there are various XML based standards, specifications and initiatives for the printing and publishing industry27, one of the most powerful being DITA28.

DITA is an XML-based specification for modular and extensible topic-based information. DITA provides a model for defining and processing new information types as specialisations of existing types. DITA populates the model with an extensible hierarchy of standard types. DITA encourages re-use by reference either of topics or of fragments of topics. DITA topics:
• can be assembled in different combinations for many deliverables or output formats;
• are optimised for navigation and search;
• are well suited for concurrent authoring and content management.
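The idea of assembling the same topics into different deliverables can be pictured with a short Python sketch. It is a simplified illustration only: the topic markup is reduced to a title and a body, the names TOPICS, DELIVERABLES, assemble and titles are hypothetical, and no real DITA toolchain or DITA map syntax is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified topics; real DITA topics follow the DITA DTDs or schemas.
TOPICS = {
    "intro": "<topic id='intro'><title>Introduction</title><body><p>Welcome.</p></body></topic>",
    "braille": "<topic id='braille'><title>Braille output</title><body><p>Embossing notes.</p></body></topic>",
    "audio": "<topic id='audio'><title>Audio output</title><body><p>Narration notes.</p></body></topic>",
}

# Each deliverable is an ordered list of topic ids; the "intro" topic is reused by reference.
DELIVERABLES = {
    "braille_guide": ["intro", "braille"],
    "audio_guide": ["intro", "audio"],
}


def assemble(deliverable: str) -> ET.Element:
    """Build one output document by pulling in the referenced topics in order."""
    root = ET.Element("document", {"id": deliverable})
    for topic_id in DELIVERABLES[deliverable]:
        root.append(ET.fromstring(TOPICS[topic_id]))
    return root


def titles(doc: ET.Element) -> list[str]:
    """Return the topic titles of an assembled document, e.g. for a table of contents."""
    return [topic.findtext("title") for topic in doc.findall("topic")]
```

Because both deliverables reference the same "intro" topic rather than copying it, a correction to that topic would reach every output assembled from it.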

DITA is customisable, which allows for the introduction of specific semantics for specific purposes without increasing the size of other DTDs, and which allows the inheritance of shared design and behaviour and interchangeability with unspecialised content29.

2.5 Current problems

Although initiatives and projects for incorporating accessibility in publishing are ongoing30, the current situation relating to the publishing process is the following:
• There is a lack of knowledge of accessibility related standards and formats within the publishing industry. Experience of working with publishers also shows that publishers and the accessible content producing communities have different understandings of “what is structure”.
• There is also a lack of knowledge of publishing workflows and standards within specialist accessible content producing communities. This lack of understanding makes it hard for consultants and experts to give comprehensive suggestions to publishers.

24 http://www.dlib.org/dlib/january00/01hodge.html

25 http://www.publishers.org/press/pdf/DRMWhitePaper.pdf

26 http://www.tbs-sct.gc.ca/im-gi/references/pub/epub-topic_e.asp

27 http://publishing.xml.org/standards/index.shtml#ice

28 http://dita.xml.org/

29 http://xml.coverpages.org/DITA-OASIS-CFP.html

30 BMSG-533048/0001-V/10/2004: Project funded by the Austrian government for making educational material accessible using TEI-XML (creation), producing it in formats like large print, Braille, PDF and HTML (production) and distributing it electronically in a variety of formats, taking into account DRM (Digital Rights Management)


• There is a perception that the provision of digital files in alternative formats will compromise technical protection measures. In the digital world, where copyright infringements cost companies millions of Euro each year31, publishers are understandably afraid of losing control over the usage and distribution of their content.
• There is a perception that the provision of accessible format materials is expensive and time-consuming. This perception might also result from the fact that there is less knowledge of Accessible Information Processing outside the accessible content producing communities.
• There is a lack of knowledge of the scale of the accessible format market. It is hard for publishers to estimate the monetary cost-benefit ratio, since there are few comprehensive statistics and market estimates for accessible content products. One should also take into account that accessible content is usable by mainstream users as well, which enlarges the target market.
• There is no agreement on a single unified accessible format which meets the stakeholders' requirements. The file formats used by publishing companies are very specific to the requirements of the publishing environment. But when it comes to producing an accessible document with the same content, the requirements for the format are entirely different from those in the publishing field. The problem is to unify all the requirements under one common file format.

2.6 Influencing factors in Document Processing for Accessibility

This section describes the different factors that affect, on the one hand, the publishing process and, on the other hand, the results of that process: the documents and the way they can be accessed.
2.6.1 Structure of digital source file

When speaking of structure we have to distinguish between the implicit structure of a document (structure that is only visually recognisable) and the explicit structure of a document (semantic structure, or structure that describes the document on a meta level). Examples of implicit structure are the visual formatting of a text document through line breaks etc. Examples of explicit structure are the use of structure elements to describe the content: in HTML this would mean using a heading element (such as <h1>) for a heading, and in a Microsoft Word document it would mean using the “Heading 1” format option to mark a text as a heading. If a document has such an explicit structure, the information can be transformed without losing the structure of the document. Structure is also a fundamental prerequisite for accessible content.
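The difference between explicit and implicit structure can be illustrated with a minimal Python sketch that extracts an outline from HTML heading elements. The class and function names (OutlineParser, extract_outline) are hypothetical and the sample markup is invented for illustration; only the standard library HTML parser is used.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect (level, text) pairs for the heading elements h1 to h6."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None   # level of the heading currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def extract_outline(html):
    """Return the document outline that the explicit heading markup makes available."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


# Explicit structure: headings are marked up as headings.
explicit = "<h1>Formats</h1><p>Intro</p><h2>Braille</h2><p>Text</p>"
# Implicit structure: the "heading" is only bold text and cannot be recognised as such.
implicit = "<p><b>Formats</b></p><p>Intro</p>"
```

For the explicitly structured sample the outline can be recovered and carried into another format, whereas the visually formatted sample yields no outline at all.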

But since publishing is a very diverse process there are many different formats involved. For example, one author might hand his script to the publisher in Microsoft Word format while another author might do the same but use a TeX-based format. Within one file format, for example Microsoft Word, there is also a wide range of possibilities for structuring the document (e.g. by using Headings, Lists, etc.)32. In Desktop Publishing Tools (DTP), the structure is usually only made visible through visual formatting of the content. Other file formats do not support content structuring, or support it only minimally (e.g. plain text format). It can therefore be said that there is not one typical method of making content accessible but that each case must be looked at individually.

2.6.2 Different accessibility requirements of target user groups

The different target groups (blind33, visually impaired34, 35, or cognitively impaired36) have different needs in terms of how they access the information. These requirements differ even within the groups. For example, one visually impaired person needs the

31 http://www.ipi.org/ipi%5CIPIPublications.nsf/PublicationLookupFullText/E274F77ADF58BD08862571F8001BA6BF

32 http://www.webaim.org/techniques/word/

33 http://www.webaim.org/articles/visual/blind.php

34 http://www.webaim.org/articles/visual/lowvision.php

35 http://www.webaim.org/articles/visual/colorblind.php


material in a large print form, while another visually impaired person with a form of colour blindness might need the information with a different foreground-background colour setting. Besides that, the requirements for a deaf person are again totally different from those of the groups mentioned before. This variety of needs for different types of output creates a challenge for the producer of accessible information.

So it is important, particularly in the content authoring process, to consider the diverse range of users' needs and to ensure that they are met. In order to better understand how the publishing industry can meet the accessibility needs of their users, a consultation took place between the members and the stakeholders of the workshop. Based on the results of this consultation, the next chapter elaborates on possible scenarios for publishers introducing accessibility in their workflow.

2.6.3 Different types of content

Different modalities (text, image, video and audio) have different characteristics, purposes and uses. Every type of content also raises different issues in terms of accessibility. Images need a descriptive text to be accessible37 and video or audio sequences should have a textual transcript38.
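As an illustration of how the requirement for descriptive text could be checked automatically in HTML content, the following Python sketch lists images that carry no text alternative. The names AltTextChecker and images_without_alt are hypothetical, and the sketch makes a simplifying assumption: an empty alt attribute is reported as missing, although it may be intentional for purely decorative images.

```python
from html.parser import HTMLParser


class AltTextChecker(HTMLParser):
    """Record the src of every img element without a non-empty alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Assumption of this sketch: an absent or blank alt text counts as missing.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src", "(no src)"))


def images_without_alt(html):
    """Return the sources of all images that lack a descriptive text."""
    checker = AltTextChecker()
    checker.feed(html)
    checker.close()
    return checker.missing


sample = (
    '<img src="a.png" alt="Bar chart of audio book sales">'
    '<img src="b.png">'
    '<img src="c.png" alt=" ">'
)
```

A check of this kind can only detect that a description is absent; whether an existing description is adequate still has to be judged by a person.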

When modalities are wrapped in particular aggregations or bundles, as they must be in publishing workflows (such as PDF containing text and images), complexity and confusion increase greatly, because there are so many different aggregations and no universal agreement about the terms used and what they mean in each aggregation. This CWA begins the process of simplifying this specific source of confusion by establishing a number of common scenarios that can be used to support accessibility for different types of modality and aggregation.

2.6.4 Different publishing domains

Publishing processes in each domain (general, educational, scientific, etc.) have their own requirements. Some segments may be consumer-driven and some more producer-driven. For scientific content, stability and content review are important (as in scientific journals), and particular domains can have strong notational requirements that raise difficult representation issues, such as representing mathematics in device-independent formats.

Many general domains may require only content distribution with little consumer feedback – this is useful in that it provides easy ways for publishers to control the processes and apply resources but it is difficult in that any feedback from customers about what is needed is less direct.

The educational domain requires more flexibility so as to be able to meet the requirements of specific contexts, and in many institutions adaptation for particular learners needs to be carried out. In many places education is also rapidly shifting from a central distribution mode (i.e. lecture or book) to one where learner participation and learner authoring need to be provided for. This can create a serious challenge for publishing.

The content itself also varies among the different domains. A novel usually has a very flat structure, while a biology book for 10th graders has a lot of figures, tables and other elements.

2.6.5 Target audience of content

Cultural and religious issues notwithstanding, it is probably desirable that public content (governmental content, news, etc.) is accessible to the broadest range of people and audiences possible. Commercial content on the other hand may target particular niche markets (e.g. the product brochure of a machine manufacturer). Still, there are commercial advantages in expanding the range of a product to the widest market possible. The hidden demand for a product or feature or some particular content may be wider than the producer would anticipate (e.g. visually impaired, blind and/or dyslexic persons can also benefit from the production of structured audio books). A distinct advantage of creating accessible content is the enhanced usability for many other users who may not have a disability. In general we can say that accessibility and usability are two converging issues39.

36 http://www.webaim.org/articles/cognitive/

37 http://www.w3.org/TR/WAI-WEBCONTENT/#gl-provide-equivalents

38 http://www.washington.edu/accessit/articles?79

39 http://joshuakaufman.org/articles/pdf/Accessible_and_Usable_Web_Design-Kaufman.pdf


2.6.6 Digital Rights Management (DRM), usage restrictions and licence agreements

Blind, partially sighted and other print disabled people read electronic material by modifying the way in which it is presented, without modifying the content. They may do this through magnification, transformation into synthetic audio, or the use of a temporary, or "refreshable", Braille display. In some instances the software with which to make these changes is incorporated in mainstream packages, but the most flexible and adaptable solutions are achieved via dedicated "assistive technology" software. The term "assistive technology" is used in this document to refer to this form of access.

Digital rights management schemes, or the technological protection measures within them, can react to assistive technology as if it were an illicit operation. Thus, the DRM systems applied to e-Books and e-documents can prevent access by people who use assistive technology to read the screen or to control the computer40.

We see that usage restrictions and licence agreements can present a particular challenge for accessibility because making content accessible for some particular context or user may require adaptation of the content which in turn may require access to parts of the content or even modification of the content using tools other than those it is provided with. These agreements may be good for producers and intermediate suppliers (such as educational establishments) but unless crafted very carefully may not serve the needs of actual consumers well.

40 http://www.indicare.org/tiki-read_article.php?articleId=170


3 Formats for document processing

In order to systematically describe the processes needed for accessible information processing a three layer generic system architecture is used to categorise the formats. The formats used can be divided into three categories: input formats, interchange formats and output formats. Some of the formats belong to more than one category.
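The three categories can be pictured as overlapping sets. The following Python sketch is illustrative only: the set contents are copied from the format lists given in this chapter, while the names CATEGORIES and categories_of are hypothetical.

```python
# Format names as listed in this chapter for each of the three layers.
CATEGORIES = {
    "input": {
        "Printed paper", "Printed Braille", "ASCII Text",
        "HTML", "XML", "Multi-type composite formats",
    },
    "interchange": {
        "ASCII Text", "HTML", "XML", "Multi-type composite formats",
    },
    "output": {
        "Printed paper", "Printed Braille", "Audio", "ASCII Text",
        "HTML", "XML", "Multi-type composite formats",
    },
}


def categories_of(fmt):
    """Return the sorted names of all categories a format belongs to."""
    return sorted(name for name, members in CATEGORIES.items() if fmt in members)


# Formats that appear in every layer and can therefore pass through a whole system.
in_all_layers = set.intersection(*CATEGORIES.values())
```

In this representation Audio belongs to the output layer only, Printed Braille to the input and output layers, and XML to all three.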

Input formats refer to any format that is used to add information to the start of a process. In some cases this involves getting analog formats such as paper formats digitised in order to convert those to more accessible formats for computer use. In other cases, the input layer will be the output from another process and may be a purely machine readable format. For accessible information processing the following formats are often used for input to processes:
• Printed paper
• Printed Braille
• ASCII Text
• HTML
• XML
• Multi-type composite formats

Interchange and storage formats (sometimes referred to as representation formats) are extremely important. They are the formats in a system upon which the main logic or intelligence acts. For accessible information processing it is essential that this stage of the system is well designed. As systems and technologies become more advanced, these formats will likely reduce to purely XML formats of well specified international standardized multi-type composite formats. For the purpose of this document, the following formats have been identified as interchange formats:
• ASCII Text
• HTML
• XML
• Multi-type composite formats

Output formats describe the result returned from a completed process. These can be any of the available formats as they can deliver a format to an end user (be it an actor or another process).
• Printed paper
• Printed Braille
• Audio
• ASCII Text
• HTML
• XML
• Multi-type composite formats

3.1 Printed paper

Printed paper is an essential part of document production. It is the traditional end product of production processes, and can often be the input stage of a digitisation process. Printed paper is still the main format in our society for authors, magazine publishers etc. to make their content available to the public. How useful it is for generating alternative versions depends largely on the quality of the printed material.


3.1.1 Clear print

Many publishers are recognising the market need for (slightly larger format) books using 12 or 14 point text in medium or bold.

The resulting text is easier to read for the growing number of people who have difficulty reading smaller sized print. This is increasingly becoming a viable alternative.

3.1.2 Large Print (16 point or over)

Large Print is a format based on Printed Paper. A Large Print document can be a scaled copy of the original. Alternatively, if the electronic source document exists and is available in a word processor format (Microsoft Word, OpenOffice, StarOffice, ...), it is possible to scale the font size and, if necessary, change the font itself, the font colour and the background colour to satisfy the reader’s needs, and then print the document with these settings.
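The reader-specific settings mentioned above can be sketched as a small data structure. The Python example below is a hedged illustration, not a prescribed method: the class name LargePrintStyle, the default font "Arial" and the choice of CSS as output are assumptions; only the 16 point threshold is taken from the heading of this subsection.

```python
from dataclasses import dataclass

MIN_LARGE_PRINT_PT = 16  # threshold taken from "Large Print (16 point or over)"


@dataclass
class LargePrintStyle:
    """Reader-specific presentation settings for a large print edition."""
    font_size_pt: int = 16
    font_family: str = "Arial"    # assumed example of a clear, simple font
    foreground: str = "#000000"   # font colour
    background: str = "#FFFFFF"   # background colour

    def to_css(self) -> str:
        """Render the settings as a CSS rule; reject sizes below large print."""
        if self.font_size_pt < MIN_LARGE_PRINT_PT:
            raise ValueError(
                f"{self.font_size_pt}pt is below the {MIN_LARGE_PRINT_PT}pt large print threshold"
            )
        return (
            f"body {{ font-size: {self.font_size_pt}pt; "
            f"font-family: {self.font_family}; "
            f"color: {self.foreground}; "
            f"background-color: {self.background}; }}"
        )


# Example: yellow text on a black background at 18 point.
high_contrast = LargePrintStyle(font_size_pt=18, foreground="#FFFF00", background="#000000")
```

Keeping the settings separate from the content means that one structured source file can be printed for several readers with different needs.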

Many readers need larger print than conventional publications have. The font used should be clear and simple, medium or bold and with good line spacing. For large print documents, the process of conversion depends on the complexity of the original document and also on the quality requirements for the output document. For simple text documents such as novels designed to be read in a linear form, production of large print versions can be quite easy: once the original document’s content is available in word processing software, characters are increased to a predetermined font size. The software usually ensures that the text flows on to new pages depending on the page format chosen. The document's page numbers may change, so that the adapted book's page numbers no longer match the original page numbers. The page numbering may not be important for a novel but is likely to be relevant in a non-fiction or reference work. In this case the original page numbering is often marked in the text.

Complex documents require more care in their conversion to large print. Documents containing cross references need to be thoroughly checked because of the changes made to page numbers41. Tables of contents and indexes must be regenerated. Production of large print books may involve some of the techniques of prepress production normal to publishing.

3.2 Printed Braille

Braille has been in use by blind and visually impaired people since it was invented by Louis Braille in 1821. It exists as a code based on a series of 6 raised dots (scientific Braille uses 8 dots).
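The dot-based code can be made concrete with a short Python sketch. It relies on the Unicode Braille Patterns block, which starts at code point U+2800 and assigns one bit to each of the dots 1 to 8; the function names braille_cell and cell_count are hypothetical, and the sketch shows single cells only, not any national Braille code or contraction system.

```python
BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block


def braille_cell(dots):
    """Return the Unicode character for a set of raised dots numbered 1 to 8."""
    mask = 0
    for dot in dots:
        if not 1 <= dot <= 8:
            raise ValueError(f"dot numbers run from 1 to 8, got {dot}")
        mask |= 1 << (dot - 1)  # dot n corresponds to bit n-1
    return chr(BRAILLE_BASE + mask)


def cell_count(dots_per_cell):
    """Number of distinct patterns, including the empty cell, for a given cell size."""
    return 2 ** dots_per_cell


# The letters a, b and c in six-dot Braille: dot 1, dots 1-2, dots 1-4.
abc = braille_cell([1]) + braille_cell([1, 2]) + braille_cell([1, 4])
```

A six-dot cell allows 64 patterns and an eight-dot cell 256, which is one reason why eight-dot Braille is used where many distinct symbols are needed.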

Printed Braille is produced by physically pressing dots into paper so that the points can be read by the fingers of the blind reader. This is a mechanical process converting the digital file into the analog version. Documents can be produced on a machine like a typewriter which produces single pages, or on a printing machine driven by a computer feeding it a Braille file converted from a text file. Braille is usually printed on larger pages than conventional books in order to allow enough characters on a line to make reading efficient. The paper is thicker (200 g/m2 or more) to retain the dots through many readings. The result is that documents in Braille are much bigger than the conventional print version. For example, a Braille version of the Bible takes up the same shelf space as 24 box files. With precision equipment Braille can be printed on both sides of the paper (Interpoint technique), which more or less halves the amount of paper needed (but not its volume!). There are methods of reproducing printed Braille using the pages as moulds for thin sheets of special plastic (thermoplast technique), but the current practice is to keep the digital file and print off copies when they are required.

It is possible to convert existing Braille back to a digital Braille file by scanning, but this requires expensive machinery and there are only a couple of institutions regularly involved in this practice. It is being used to recover and archive rare documents such as music Braille where there is no easy route to a printed original which can be scanned. It is therefore a finite process and will be less relevant in the future.

Printed Braille output refers to any hard copy which represents Braille. It is the only universal medium for blind people to read books without the use of modern information technologies42. Despite the new alternatives which came with information and communication technologies, it is still very relevant to Accessible Information Processing. This is especially so when it comes to complex types of content, e.g. mathematics and music, as described in the following paragraphs.

41 Unless they are softcoded.

42 One should, however, be aware that only 10 % of the blind population can read Braille. This is due to the fact that elderly persons (who form the majority of blind people) cannot distinguish the dots anymore.


3.2.1 Braille Music

Braille music is a system for representing music notation in Braille code so that music can be read by visually impaired musicians. Almost all standard print music notations can also be written in Braille music notation. Braille music notation is an independent and well-developed notation system with its own conventions and syntax. This is constantly being expanded by a small group of transcribers around the world who communicate with each other to secure agreement as new instrumentation and musical figures are transcribed for the first time. It is therefore a genuinely universal notation standard.

Visually impaired musicians gain the same benefits by becoming musically literate through learning to read Braille music as do sighted musicians who learn to read printed music. It is therefore an important format in Accessible Information Processing (AIP).

3.2.2 Mathematics in Braille

The need to convey mathematics linearly, without the use of special typesetting and often with a limited character set as well, is a common one. Linearising mathematical formulas is necessary for Braille. Unlike music, there is no universal system in place for the linearisation or for the codes used. Furthermore, some systems rely on six-dot Braille, others on eight dots.

Sometimes the use of spoken mathematics is promoted. Here again there are no universal rules on how to pronounce (complex) formulas.

Mathematical Braille is largely country or region dependent. Therefore producers should bear in mind the location of the users they are producing for. Some of the better known codes are:
• The US Nemeth code43, 44
• The German Marburg code45
• The UK Braille Mathematics Notation46
• W3C's MathML47

In many circumstances the use of LaTeX for mathematics has been promoted. LaTeX is a typesetting system based on a combination of text and codes (somewhat similar to XML, but exclusively layout based). LaTeX has its own system for linearising mathematics, which can be very useful for higher education purposes and screen reading but not for printing on paper, as it is extremely verbose.
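As an illustration of such a linear notation, the solution formula for a quadratic equation (chosen here only as a familiar example; it is not taken from any of the codes listed above) is written in LaTeX source as a single line of text:

```latex
% A two-dimensional printed fraction, written as one linear string of characters:
x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
```

A screen reader or Braille display presents this source character by character. The nesting of the fraction and the square root is fully explicit, but the line is considerably longer than the printed formula.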

A complete overview of mathematics handling and conversion between "standards" has been made in the European Lambda project⁴⁸.

3.3 Audio

Audio files have been an integral part of specialist formats for some time. Originally magnetic tape and vinyl were used to distribute spoken materials to print impaired users. These analogue formats have mostly been replaced by digital formats such as WAV, MP3 and other MPEG formats.

43 More information on http://www.brailleauthority.org/Math-science.htm

44 A. Nemeth. The Nemeth Braille Code for Mathematics and Science Notation 1972 Revision. American Printing House for the Blind, 1972.

45 H. Epheser, D. Pograniczna, and K. Britz, Internationale Mathematikschrift für Blinde, Deutsche Blindenstudienanstalt, Marburg (Lahn) 1992.

46 http://www.bauk.org.uk/docs/bmn.pdf

47 http://www.w3.org/TR/MathML2/

48 http://www.lambdaproject.org


In modern workflows and supply chains, these are mostly packaged with some kind of multimedia framework such as SMIL⁴⁹ or MPEG⁵⁰, but audio can still be used in some very specific processes.

Audio books are by far the most popular medium for the distribution of alternative format publications. Not only do the specialist libraries around the world produce far more audio than all the other formats put together, but commercial audio is also becoming an increasingly important part of the output of many major publishing houses.

Most of the currently commercially available audio books are read by a professional narrator. Often these are actors but some authors read their own books which might make them more commercially attractive.

Most of the currently commercially available audio books are distributed in uncompressed audio format (PCM).

Commercial production using expensive narrators and professional studios meant that adding even a rudimentary structure was only a marginal cost. Most productions have at least chapter identifiers and many have chosen to add structure to a finer level, but always using the basic "one level tracks structure" of audio CDs.

Existing audio CDs⁵¹ can hold a maximum of 80 minutes per disc. This results either in a large number of discs for most large books or in abridged versions of those same works; the latter is an unacceptable solution for print disabled users wanting to access the full content of a book.
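The effect of the 80-minute limit can be estimated with a short calculation. The sketch below is illustrative only: the 20-hour narration, the 700 MiB data CD capacity and the bitrates are assumptions made for the example, not figures from this document. The second function anticipates the compressed formats discussed in the next paragraph.

```python
import math

CD_AUDIO_MINUTES = 80                 # maximum playing time of one audio CD
DATA_CD_BYTES = 700 * 1024 * 1024     # assumed capacity of one data CD (700 MiB)


def discs_needed(narration_hours: float) -> int:
    """Number of audio CDs needed for an unabridged narration."""
    return math.ceil(narration_hours * 60 / CD_AUDIO_MINUTES)


def data_cd_minutes(bitrate_kbps: int) -> float:
    """Playing time of compressed audio that fits on one data CD."""
    seconds = DATA_CD_BYTES * 8 / (bitrate_kbps * 1000)
    return seconds / 60


print(discs_needed(20))  # 15 discs for an assumed 20-hour book

# Ratio of compressed playing time to the 80 minutes of an audio CD
for kbps in (96, 128):   # assumed bitrates
    print(kbps, round(data_cd_minutes(kbps) / CD_AUDIO_MINUTES, 1))
```

Lower bitrates increase the ratio at the cost of audio quality, which is the trade-off an audio CD does not allow.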

MP3 versions of books can be produced from a human narration or automatically through computer generated synthetic speech. The range of existing synthetic speech engines is very wide. Many languages are available, though the most common languages tend to have the best speech quality because the level of investment in development reflects the potential market. A further advantage is that these files are considerably smaller than uncompressed audio, which allows producers to fit up to 10-15 times the standard duration of uncompressed audio CDs on one data CD (depending on the desired audio quality).

3.4 ASCII Text

ASCII refers to an internationally recognised text standard. In terms of Accessible Information Processing it is used to represent either a text file or codes which represent information in Braille. It is becoming increasingly obsolete as supply chains move towards XML formats, but ASCII remains the base element of much of the information archived by national specialist providers.

3.5 HTML documents

HTML is a very common output format. As a mark-up language, it gives structure a high importance, so when correctly formatted, an HTML document can be searched and browsed in a very logical manner, and headings, paragraphs, links, lists, and so on can be used for easy navigation. One of the advantages of HTML is that the files can be read in any web browser on any platform. A key disadvantage of HTML is that it is not possible to edit the content while reading it in a browser, as can be done when reading documents in a word processor. Such editing may sometimes be necessary and can be compared to adding notes in a traditional book⁵². HTML does, however, allow users to open the source file of the document and add information if they know how to do so.

A bigger disadvantage is that the layout of the information is mixed in the same HTML document with the information itself and its structure, which may be a handicap when accessing the content.
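The navigational value of correctly marked-up headings described in this section can be sketched as follows. This is a minimal illustration using Python's standard `html.parser` module; the class name `OutlineParser` and the sample document are invented for the example and are not part of any tool mentioned here.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collects (level, text) pairs for the h1-h6 headings of a document."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None   # level of the heading currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


SAMPLE = """<html><body>
<h1>Formats</h1><p>Intro text.</p>
<h2>Audio</h2><p>Some text.</p>
<h2>HTML <em>documents</em></h2><p>Some text.</p>
</body></html>"""

parser = OutlineParser()
parser.feed(SAMPLE)
for level, title in parser.outline:
    print("  " * (level - 1) + title)
```

A screen reader or reading application can offer the same kind of outline to its user only if the headings are tagged as headings rather than formatted as large bold text.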

49 SMIL: Synchronized Multimedia Integration Language, http://www.w3.org/AudioVideo/

50 MPEG Moving Picture Experts Group, http://www.chiariglione.org/mpeg/

51 This is due to the initial purpose of audio CDs: high-quality recording of music. On audio CDs no trade-off between quality and length of recording can be made.

52 Editing of HTML-based texts on the web can be done through wiki technology, a system for collaborative working on a single document, best known from the Wikipedia encyclopedia pages but applicable to other documents too (cf. EUAIN wiki below).


The Internet is becoming for many the dominant source for locating and retrieving information. Accessibility of websites and search engines (based mostly on HTML) is therefore essential to the needs of visually impaired people. Different guidelines⁵³ exist on how to structure HTML files correctly, as well as guidelines on how to make them fully accessible.

3.6 XML

The Extensible Markup Language (XML) is a W3C-recommended general-purpose markup language that supports a wide variety of applications. XML is a simplified subset of Standard Generalised Markup Language (SGML). Its primary purpose is to facilitate the sharing or the re-use of data across different areas of application. Formally defined languages based on XML allow diverse software to reliably understand information formatted and passed in these languages. Based on these standardised languages it is also possible to write converters to transform the information into other formats.

The format is important for accessible information processing as it provides interchange formats for converting information into accessible information. XML is the common basis of a whole family of descriptive languages. Two applications of XML (each technically defined by a DTD or schema) are the TEI standard and the already mentioned DAISY standard.

The Extensible Markup Language (XML) is a mark-up language for describing data regardless of its external appearance. XML focuses mainly on the structure of the content. In a separate process, numerous different transformation style guidelines can be applied to one XML document to present the information in the most suitable way. In the context of accessible information, XML has two main advantages over other formats: the first is that the elements of books, like chapters, paragraphs and headings, can be perfectly represented by XML's structural elements; the second is that, because XML is a layout-agnostic language, different transformation style guidelines may be applied to the same content to fit different users' needs and preferences.
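The separation of structure from presentation can be sketched with a few lines of code. The example below is a minimal illustration under stated assumptions: the element names `book`, `chapter`, `heading` and `paragraph` are invented for the example and do not belong to TEI, DAISY or any other schema, and Python's standard `xml.etree.ElementTree` module stands in for a real transformation tool.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, schema-free sample document
BOOK = """<book>
  <chapter>
    <heading>Printed Braille</heading>
    <paragraph>Braille is read by touch.</paragraph>
  </chapter>
  <chapter>
    <heading>Audio</heading>
    <paragraph>Audio books are widely used.</paragraph>
  </chapter>
</book>"""


def to_outline(root):
    """Presentation 1: a list of chapter headings for navigation."""
    return [chapter.findtext("heading") for chapter in root.findall("chapter")]


def to_html(root):
    """Presentation 2: a simple HTML rendering of the same content."""
    parts = []
    for chapter in root.findall("chapter"):
        parts.append(f"<h1>{escape(chapter.findtext('heading'))}</h1>")
        for paragraph in chapter.findall("paragraph"):
            parts.append(f"<p>{escape(paragraph.text)}</p>")
    return "\n".join(parts)


root = ET.fromstring(BOOK)
print(to_outline(root))
print(to_html(root))
```

Both outputs are derived from one unchanged source, so a further presentation (for example large print) would only need another function, not another copy of the content.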

This format is therefore of great importance for accessible information processing as it provides interchange formats for converting information to accessible outputs⁵⁴.

3.6.1 XHTML

XHTML is in principle a reformulation of the older HTML 4 specification. The formal specification of XHTML can be read at http://www.w3.org/TR/xhtml1/, but in the context of accessible materials a number of points can be made to help explain the concept. XHTML is an application of XML. This means that even though XHTML has the semantics of HTML, it has the syntax and the very strict rules of XML. There are both advantages and disadvantages compared to HTML.

Among the advantages is that there are a number of tools specifically made for working with XML, called XML parsers. These are built into software that uses or reads XHTML documents, and they make the processing of the document both easier and more accurate. Ironically, this creates a disadvantage for the user. Since XHTML documents must be valid (i.e. conform to the very strict rules of XML), they are difficult to create and edit by hand, and as a consequence most XHTML documents are machine-generated.
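The strictness of XML parsers can be demonstrated with a small experiment. The sketch below uses only Python's standard library; the two markup fragments and the helper names are made up for the example. It shows one XML rule, that every element must be closed, being enforced by an XML parser and ignored by a tolerant HTML parser.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

STRICT_OK = "<p>First line<br/>second line</p>"
SLOPPY = "<p>First line<br>second line</p>"   # <br> is never closed


def parses_as_xml(markup: str) -> bool:
    """True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TextCollector(HTMLParser):
    """Tolerant HTML parser that simply collects the text it finds."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def tolerant_text(markup: str) -> list:
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return collector.chunks


print(parses_as_xml(STRICT_OK))   # True
print(parses_as_xml(SLOPPY))      # False
print(tolerant_text(SLOPPY))      # ['First line', 'second line']
```

The strict behaviour is what makes automated conversion reliable, and it is also why a single missing end tag in a hand-edited file stops the process.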

Software that requires valid XHTML documents includes Digital Talking Book players and some e-text readers. It must be noted, however, that most Internet browsers, such as Internet Explorer 6 and 7, are very tolerant and can render XHTML documents even if they are not valid.

3.6.2 TEI XML

TEI is a specific implementation of the XML standard.

53 HTML standard. ISO/IEC 15445:2002 Information technology - Document description and processing languages - HyperText Markup Language (HTML). HTML 4.01 Specification (HyperText Markup Language), http://www.w3.org/TR/html401/

54 XML standard Extensible Markup Language (XML) 1.0 (Fourth Edition) 29 September 2006. http://www.w3.org/TR/REC-xml/


The Text Encoding Initiative (TEI) Guidelines are an international and interdisciplinary standard that enables libraries, museums, publishers, and individual scholars to represent a variety of literary and linguistic texts for online research, teaching, and preservation⁵⁵.

3.6.3 Daisy XML

The term “DAISY” covers a variety of standards and specifications that are maintained and supported by the DAISY Consortium (www.daisy.org). DAISY is an acronym that means “Digital Accessible Information System”. The term covers at least 3 different standards and specifications:

DAISY 2.02: A specification that defines the DAISY Digital Talking Book (DTB) format. DAISY 2.02 is by far the most widespread and popular of the DAISY formats and has gained support from both producers of digital talking books as well as producers of players and production software. When people talk about “a DAISY book” they are probably referring to DAISY 2.02.

DAISY 2.02 is based on XHTML and SMIL (Synchronised Multimedia Integration Language – also an application of XML). A number of different audio formats can be used in a DAISY 2.02 book – the most common one used being MP3. A DAISY 2.02 book can contain the full text of a printed work as well as pictures and other multimedia content, or it can be an audio only book.

The heart of a DAISY 2.02 book is the NCC file (Navigation Control Centre), which both presents itself to the user as the table of contents and works as the reading software’s main system file. The NCC is always present in a DAISY 2.02 book, even if the book is without text.

DAISY 2.02 books can be read using either a hardware player or a computer with software for this purpose. DAISY 2.02 is platform independent. A number of commercially available production tools for producing DAISY 2.02 books are presented on the DAISY Consortium website⁵⁶.

Daisy/NISO Standard 2005 (aka DAISY 3/ Z39.86): This standard covers a complex of sub-specifications and Document Type Definitions (DTDs) that together constitute a Digital Talking Book. Compared to DAISY 2.02 this standard is very advanced and can be used to represent any book in text and audio format.

The sub-systems in a DAISY 3 book include:

• The manifest, which contains a complete list of all the files that make up the DTB
• The spine, which defines the reading order of the DTB
• Tours, which define alternative reading orders of the DTB

The heart of the DAISY 3 book is the NCX file that functions in much the same manner as the NCC file does in a DAISY 2.02 book. Unlike the NCC, the NCX represents the structure of the book in a true hierarchy.
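The difference between a true hierarchy and a flat list can be sketched as follows. The fragment below is a deliberately simplified, NCX-like document: the namespace, identifiers, play order and content references of a real NCX file are omitted, so it should be read as an assumption-laden illustration rather than a conforming file. The code walks the nested navigation points and prints them as a flat list of (depth, label) pairs, which is roughly the information a flat list of levelled headings would carry.

```python
import xml.etree.ElementTree as ET

# Simplified NCX-like fragment (namespace and most attributes omitted)
NCX = """<ncx>
  <navMap>
    <navPoint><navLabel><text>Chapter 1</text></navLabel>
      <navPoint><navLabel><text>Section 1.1</text></navLabel></navPoint>
      <navPoint><navLabel><text>Section 1.2</text></navLabel></navPoint>
    </navPoint>
    <navPoint><navLabel><text>Chapter 2</text></navLabel></navPoint>
  </navMap>
</ncx>"""


def walk(nav_point, depth=1):
    """Yield (depth, label) for a navigation point and all its descendants."""
    yield depth, nav_point.findtext("navLabel/text")
    for child in nav_point.findall("navPoint"):
        yield from walk(child, depth + 1)


def flatten(ncx_source):
    """Return the nested navigation map as a flat list of (depth, label)."""
    nav_map = ET.fromstring(ncx_source).find("navMap")
    entries = []
    for top in nav_map.findall("navPoint"):
        entries.extend(walk(top))
    return entries


for depth, label in flatten(NCX):
    print("  " * (depth - 1) + label)
```

In the nested form the parent of each section is explicit in the document itself, whereas in a flat list it has to be inferred from the level numbers.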

Many of the features of DAISY 3 are drawn from Open eBook Publication Structure Version 1.2. DAISY 3 is a fairly new standard and books are not yet widely distributed by DTB producers. Some producers use DAISY 3 as a production and interchange format where the great richness in detail makes it ideal for XML/XSLT transformations to other formats, mainly DAISY 2.02 for distribution⁵⁷.

DTBook: The text component of a Z39.86 book consists of one or more DTBook files. The DTBook specification has, however, found widespread use outside the framework of the Digital Talking Book. DTBook is an element set that represents the European/American book tradition in an XML context. The element set consists of 79 elements, but can be extended with e.g. math, poetry etc. DTBook is especially suited as an interchange format and is widely used as such. Using XML tools like XSLT, a DTBook file can be transformed into PDF, RTF (Word), XHTML, HTML and virtually any other text format⁵⁸.

55 Guidelines for Electronic Text Encoding and Interchange (2002). http://www.tei-c.org/P4X/

56 The full specification can be seen here: http://www.daisy.org/z3986/specifications/daisy_202.html

57 The standard can be seen here: http://www.niso.org/standards/resources/Z39-86-2005.html

NIMAS (National Instructional Materials Accessibility Standard): A standard that was developed in 2002-2004 by the National File Format Technical Panel. NIMAS is a sub-set of Z39.86.

“NIMAS is a technical standard used by publishers to produce source files (in XML) that may be used to develop multiple specialised formats (such as Braille or audio books) for students with print disabilities.

The source files are prepared using Extensible Markup Language (XML) to mark up the structure of the original content and provide a means for presenting the content in a variety of ways and styles. For example, once a NIMAS fileset has been produced for printed materials, the XML and image source files may be used to create Braille, large print, HTML versions, DAISY talking books using human voice or text-to-speech, audio files derived from text-to-speech transformations, and more.

The separation of content from presentation is an important feature of the NIMAS approach. In most cases, a human will need to enhance the source files to provide additional features needed by diverse learners.

The various specialised formats created from NIMAS file sets may then be used to support a very diverse group of learners who qualify as students with print disabilities. It is important to note that most elementary and secondary educational publishers do not own all of the electronic rights to their textbooks and related core print materials and a copyright exemption allows them to deliver the electronic content of a textbook and the related core print materials to the NIMAC, a national repository which began operations on 12/3/06, as long as the publishers possess the print rights. NIMAS applies to instructional materials published on or after 7/19/06.”⁵⁹

3.6.4 MathML

Mathematical Markup Language (MathML) is an application of XML for describing mathematical notation and capturing both its structure and content. It aims at integrating mathematical formulas within other XML documents. It is a recommendation of the W3C Math Working Group.

3.7 Multi-type composite formats

These formats are especially important to the publishing industry because they are widely used and are part of the publishing process. In general we consider here:

• Educational multimedia documents

Educational publishing may require a greater degree of flexibility than other forms of publishing because of the direct engagement with the persons using the publication. Typically it is more interactive (as it must engage the learner’s attention in a more demanding context). Education often involves direct learner participation, authoring, and other processes such as assessment. Content and processes must match the learner’s particular context and requirements as closely as possible because the closer the match the better the learning outcome. This is very demanding.
Given the huge variety and individual nature of assistive technology and personal adaptations (such as cognitive adaptations) it is unrealistic to expect that content producers can tailor content for each circumstance or even that they can have the accessibility knowledge necessary to do so.

• Scientific, technical and medical documents (STM)

The global market for English-language STM (scientific, technical and medical) journals is about $5 billion. The industry employs 90,000 people globally, of which 40% or 36,000 are employed in the EU. Another 20–30,000 full time employees are indirectly supported⁶⁰. One of the main typesetting systems used is TeX, and the main document mark-up language and document preparation system is LaTeX. It is widely used by mathematicians, scientists, philosophers, engineers, and scholars in academia and the commercial world, and by others as a primary or intermediate format (e.g. translating DocBook and other XML-based formats to PDF) because of the quality of typesetting achievable by TeX. It offers programmable desktop publishing features and extensive facilities for automating most aspects of typesetting and desktop publishing, including numbering and cross-referencing, tables and figures, page layout and bibliographies⁶¹.

58 A comprehensive set of guidelines for applying DTBook mark-up is available from the Daisy Consortium (http://www.daisy.org/z3986/guidelines/sg-daisy3/structguide.htm). Other sources: Theory behind the DTBook DTD (http://www.daisy.org/publications/docs/theory_dtbook/theory_dtbook.html)

59 From the NIMAS web site (http://nimas.cast.org/about/nimas/index.html)

60 http://www.stm-assoc.org/storage/Scientific_Publishing_in_Transition_White_Paper.pdf

3.7.1 PDF

In the past, Adobe PDF files could be very inaccessible, especially to people using screen readers. When a PDF is made simply by attaching page images to one another, it is still completely inaccessible, as screen readers have no text to read. This began to change with Acrobat 5, when Adobe introduced the ability to tag PDF files for accessibility. Although PDF tags could not be manipulated as easily as HTML tags, they made the content more accessible to some users with screen readers. Adobe Reader 7 continues to improve the user's access to PDF files by offering the possibility to customise preferences extensively. Additionally, Adobe has included a DRM (Digital Rights Management) mechanism in Acrobat Reader, but there are also several other DRM plug-ins that work with Acrobat Reader.

Due to the major characteristic of PDF, the fact that it is rendered the same no matter which viewer or operating system it is viewed on, PDF has become the most popular format used by publishers and increasingly among other content creators. With PDF the difference between an accessible and an inaccessible document depends on proper usage of the programmes used to create PDF files. Much more education and training in creating accessible PDF is still needed.

3.7.2 QuarkXpress® file format

Quark® provides layout software called QuarkXpress® that is used for writing, editing, and typography with colour and pictures to produce rich final outputs for print and Web⁶². It is used by more than three million users worldwide. The software is used mainly for creative design and page layout. QuarkXpress Version 7 provides a Voluntary Product Accessibility Template (VPAT)⁶³ that details the accessibility features of Quark's product in order to help customers determine its compliance with Section 508.

With QuarkXPress users can import and export XML documents. With Quark Digital Media Server content can be stored in a central database. It can then be used in multiple forms according to the principles of multi-channel publishing. Quark XTensions software (plug-ins) can automate functions and eliminate repetitive steps with palettes, commands, tools, and menus. Tests with QuarkXPress 6.5 Passport (International Edition) showed that QuarkXPress was not able to import the TEI DTD. To tag the text of the book, a new, flat DTD had to be written. With the new DTD the mapping from layout formats to XML tags was possible. The content is then exported into an XML file. This is the basic version for the accessibility work.

3.7.3 InDesign file format

Adobe® markets InDesign® to produce professional page layouts⁶⁴, rich and complex documents, and outputs for multiple media. Adobe InDesign CS3 supports accessible cross-media publication, allowing export of InDesign documents into PDF, XHTML, and XML. Users can add tags and alternative text attributes to InDesign documents that support the production of accessible content in these exported formats⁶⁵.

InDesign from Adobe Inc. is a desktop publishing (DTP) application which can work with XML files. It is possible to import XML into InDesign and then prepare the document for output, e.g. as a printed book. This feature is an important step toward multi-channel and cross-media publishing. Tests with Adobe InDesign CS2 showed that it is possible to tag the text of the layout document. Further investigations are being done to map layout to structure efficiently. InDesign supports the mapping of text formats to XML tags, but the structure had to be added afterwards. The mapping feature can be used if the text is in a proper layout. Otherwise the user has to mark the specific text area (e.g. one chapter) and then assign the XML tag to the text.

61 http://en.wikipedia.org/wiki/LaTeX

62 http://www.quark.com/products/xpress/

63 http://www.quark.com/products/xpress/pdf/VoluntaryProductAccessibilityTemplate.pdf

64 http://www.adobe.com/products/indesign/

65 http://www.adobe.com/accessibility/products/indesign/

3.7.4 Digital Talking Book (DTB) documents

For many years, "talking books" have been made available to print-disabled readers on analog media such as phonograph records and audiocassettes. These media served their users well in providing human-speech recordings of a wide array of print material in increasingly robust and cost-effective formats. However, analog media are limited in several respects when compared to a printed book. Firstly, they are by their nature linear presentations, which leave much to be desired when reading reference works, textbooks, magazines, and other materials that are often accessed randomly. In contrast, digital media offer readers the ability to move around in a book or magazine as freely as (and more efficiently than) a sighted reader flips through a print book. Secondly, analog recordings do not allow users to interact with the book by placing bookmarks or highlighting material. A DTB offers this capability, storing the bookmarks and highlights separate from, but associated with, the DTB itself. Thirdly, talking book users have long complained that they do not have access to the spelling of the words they hear. As will be explained below, some DTBs will include a file containing the full text of the work, synchronised with the audio presentation, thereby allowing readers to locate specific words and hear them spelled. Finally, analog audio offers readers only one version of the document. If, for example, a book contains footnotes, they are either read where referenced, which burdens the casual reader with unwanted interruptions, or grouped at a location out of the flow of the text, making them difficult for interested readers to access. A DTB allows the user to easily skip over or read footnotes. The Digital Talking Book offers the print-disabled user a significantly enhanced reading experience, one that is much closer to that of the sighted reader using a print book.

The DTB goes far beyond the limits imposed on analog audio books because it can include not just the audio rendition of the work, but the full textual content and images as well. Because the textual content file is synchronised with the audio file, a DTB offers multiple sensory inputs to readers, a great benefit to, for example, learning-disabled readers or people with . Some visually impaired readers may choose to listen to most of the book, but find that inspecting the images provides information not available in the narrative flow. Others may opt to skip the audio presentation altogether and instead view the text file via screen-enlarging software. Braille readers may prefer to read some parts or the entire document via a refreshable Braille display device connected to their DTB player and accessing the textual content file.

Digital Talking Books are not tied to a single distribution medium. CD-ROMs will be used first but DTBs are portable to any digital distribution medium capable of handling the large files associated with digital audio recordings. Regardless of how a DTB is distributed, however, it will normally be in the context of a digital rights management system.

One implementation of DTBs is based on the DAISY standard (DAISY 2.02 / ANSI/NISO Z39.86-2005). DAISY books are described above in section 3.6.3.

This standard for creating digital content in structured multimedia is developed and maintained by the DAISY Consortium. Using XML text files and MP3 audio files, the DAISY format can create a range of text only, fully synchronised text and audio and audio-only books that are fully accessible and navigable for blind and visually impaired users as well as people with other disabilities such as dyslexia. It allows up to 6 levels of structure (chapter, subchapter, paragraph, and so on) as opposed to the one-level structure of commercial audio CDs, which makes it suitable for complex books like educational materials.

The DAISY standard has been adopted as the standard to be used by publishers in the United States of America to comply with the Instructional Materials Accessibility Act (2002). In Europe DAISY is also used by a wide range of alternative media publishers to create accessible material.


4 Considerations for structuring documents

Structured information is the first step in the accessible information process. A document whose internal structure can be defined and its elements isolated and classified, without losing sight of the overall structure of the information, is a document that can be navigated.

Most adaptive technology allows the user to access a document, and to read it following the "outer" structure of the original. But if the same information has also an "inner" structure that allows the adaptive device to distinguish between a phrase and a measure, between a paragraph and a sentence, highlighting particular annotations, then the level of accessibility (and therefore usability) of the whole document will be greatly enhanced, allowing the user to move through it in the same way as those without impairments do when looking at a printed document, and following the same integral logic. In an ideal world, all documents made available in electronic formats should contain this internal structure that benefits everyone. Highly-structured documents are becoming more and more popular due to reasons that very seldom pertain to making them accessible to people with disabilities.

The move to XML related formats and associated standards for metadata have provided an impetus for far greater document structuring than before. Whatever the reasons behind those decisions are, the use of highly-structured information is of great benefit to anybody accessing them for any purpose. In recent years, the market for accessibility and assistive technologies has started to gain recognition. It is clear that the integration of accessibility notions into mainstream technologies would provide previously unavailable opportunities in the provision of accessible multimedia information systems. It would open up modern information services and provide them to all types and levels of users, in both the software and the hardware domain. Additionally, new consumption and production devices and environments can be addressed from such platforms and this would provide very useful information provision opportunities indeed, such as information on mobile devices with additional speech assistance.

Structuring content is essential for its subsequent processing and transformation into accessible content. Document style sheets and guidelines need to be followed throughout the content creation process, so as to enable successful conversions into accessible content and eliminate costly post-editing tasks. For that reason, careful consideration of styles and usage agreements with stakeholders are of special importance. The next sections present the steps that need to be introduced in publishing workflows to enable accessible content production. The workflow steps listed below are also part of the scenarios introducing accessibility within publishing workflows (section 6) and are given here to allow the reader to refer back when reading that section.

4.1 Define and use document style guidelines

This activity defines the style guidelines that will be used by all actors in the accessible content processing value chain. Introducing style guidelines requires an agreement between all actors in the publishing chain. These style guidelines can be used to tag specific content and map it into a specific conversion. The consistent use of style guidelines through the whole publishing chain enables efficient processing and automatic structuring of content and thus significantly enhances its accessibility. It is not strictly necessary for all actors to use the same style guidelines, but in that case mappings between the style guidelines in use must exist (which is not always the case!) and be well known.
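The idea of mapping different style guidelines onto one shared set of structural tags can be sketched as follows. All style names, actor names and tags in this example are hypothetical; they only illustrate how a known mapping lets content from different actors be structured automatically, and how an unmapped style surfaces as something to resolve.

```python
# Hypothetical paragraph style names used by two actors,
# each mapped to the same shared structural tags.
STYLE_MAPS = {
    "author": {"Heading 1": "h1", "Heading 2": "h2", "Normal": "p"},
    "typesetter": {"ChapTitle": "h1", "SecTitle": "h2", "BodyText": "p"},
}


def tag_paragraphs(actor, paragraphs):
    """Turn (style, text) pairs into (tag, text) pairs.

    Styles without a mapping are collected separately so that they can be
    agreed on with the other actors instead of being silently dropped.
    """
    mapping = STYLE_MAPS[actor]
    tagged, unmapped = [], []
    for style, text in paragraphs:
        if style in mapping:
            tagged.append((mapping[style], text))
        else:
            unmapped.append(style)
    return tagged, unmapped


tagged, unmapped = tag_paragraphs(
    "typesetter",
    [("ChapTitle", "Formats"), ("Sidebar", "A note"), ("BodyText", "Some text.")],
)
print(tagged)     # [('h1', 'Formats'), ('p', 'Some text.')]
print(unmapped)   # ['Sidebar']
```

The list of unmapped styles is the part that needs human agreement; everything covered by the mapping can be converted without manual work.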

There are several style guidelines available; very commonly used are those based on the Chicago Manual of Style⁶⁶.

4.2 Define and use structure guidelines

In this activity, guidelines that define structure are agreed between the relevant actors. Structural guidelines are more abstract than style guidelines. They define how structure is tagged in the content. In the case of MS Word, structural guidelines will need to define how different headings are tagged. By consistently following the structural guidelines, automatic conversion can be achieved. Structure also deals with tagging of sections that contain graphics, images, drawings, other rich media, math, music etc. Structural guidelines also enable navigation on the structural elements and by doing so ease information retrieval and content consumption. A good starting point, with examples of how to handle images or drawings, can be found in the structure guidelines for DAISY at http://www.daisy.org/z3986/structure/

66 http://www.chicagomanualofstyle.org/home.html


4.3 Edit / add structure where needed

This activity deals with editing or adding structural elements that were either lost in the conversion process or not tagged in the first place because of inconsistent use of style guidelines. If it is a post-processing operation, it requires a lot of manual effort. Therefore, such tasks should be kept to a minimum.
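One way to keep this manual effort small is to let a script point out where structure is likely to be missing before anyone edits by hand. The following sketch is only one possible check, written under the assumption that the headings of a converted document are available as (level, title) pairs; it reports headings that jump more than one level deeper than their predecessor, which often indicates a heading that was lost or mis-tagged.

```python
def find_structure_gaps(headings):
    """Report headings whose level skips one or more levels.

    headings: list of (level, title) pairs in document order.
    Returns a list of (title, previous_level, level) tuples.
    """
    problems = []
    previous = 0   # nothing seen yet, so the first heading should be level 1
    for level, title in headings:
        if level > previous + 1:
            problems.append((title, previous, level))
        previous = level
    return problems


# Hypothetical outline of a converted document
converted = [
    (1, "Formats"),
    (3, "Braille Music"),      # level 2 heading is missing above this one
    (2, "Audio"),
    (3, "Compressed audio"),
]

for title, previous, level in find_structure_gaps(converted):
    print(f"'{title}' jumps from level {previous} to level {level}")
```

A check like this does not repair anything; it only narrows down the places an editor has to look at.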

In order to edit or add structure in this task, guidelines on how to add structure by hand need to be developed and used consistently as well. A list of relevant guidelines has been compiled by the EUAIN project and can be found in the literature⁶⁷.

4.4 Edit DRM settings

Unless very carefully handled, DRM has the potential to disable adaptation of content to make it accessible in another context. For example, a format may include text that is locked into the format, cannot be copied and is not accessible with assistive technology such as a screen-reader. The text may be only accessible as a bitmapped image, meaning that it cannot be simplified or rendered in some other format. Whatever the format and tool the content is produced with, it is unlikely that the designer will have considered and provided for all the contexts in which it may be used. For best results the separate modalities within material need to be available to other software and hardware tools that are able to make the material accessible for unforeseen contexts. This usually means leaving the text unlocked⁶⁸. This topic is for expansion in further work.

4.5 Adaptation

Adaptation for a context typically involves matching the content to the context and making changes to meet that context. This is much easier if the content exists in a form that can be taken apart (disaggregated), supplemented with other content (such as subtitling a video) and then put together again. The reason for this is that adaptations are often needed for only parts of the material and not for all of it. It may be that a visually impaired person cannot use the diagrams in the content, for example, so that it is necessary to include alternatives for the diagrams in the content.

In all cases it will be necessary to examine both the context AND the material to see if they match. If, in the production chain, this process takes place close to the user it is more likely that the materials will match the user’s needs. In some cases, such as described in scenario 11, the user’s requirements will be directly available. In others it will be necessary to incorporate many alternatives within the content (so that the learner can select the modalities she can use) but still to allow for the content to be adapted as late as possible to meet unanticipated user needs.

When the user’s requirements are not directly available, as in a pre-sale publication process, it is useful to test the material against such checkpoints as are provided in sets of guidelines elsewhere in this document. For web content the Web Accessibility Initiative Web Content Accessibility Guidelines can form a useful base to do this and a number of automated tools that can assist with this process are available. It is important to remember that doing this is using virtual or average requirements and will not meet all circumstances but does help.
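One checkpoint that automated tools commonly test is whether every image carries a text alternative. The following is a minimal sketch of such a check using only the Python standard library; the class and function names are invented for illustration, and a real checker covers many more checkpoints than this one.

```python
from html.parser import HTMLParser


class MissingAltChecker(HTMLParser):
    """Collects <img> tags that have no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing = []  # src values of images lacking an alt attribute

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            if "alt" not in attributes:
                self.missing.append(attributes.get("src", "(no src)"))


def images_without_alt(html_text):
    """Return the src of every image in html_text that has no alt attribute."""
    checker = MissingAltChecker()
    checker.feed(html_text)
    return checker.missing


sample = '<p><img src="chart.png"><img src="logo.png" alt="Publisher logo"></p>'
print(images_without_alt(sample))
```

An empty alt attribute is accepted here because it is the usual way of marking an image as decorative; whether that is appropriate for a given image still needs human judgement.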

67 http://wiki.euain.org/doku.php?id=wiki:guidelines_for_accessible_information_processing

68 This topic was studied in the European SEDODEL project (http://canada.esat.kuleuven.be/docarchwebsite/show.jsp?page=projects&id=SEDODEL) but due to rapid changes in DRM technology, the work should be continuously updated.


5 Conversion processes

This list of conversion processes is not exhaustive, but it should cover the most important conversions in accessible content processing workflows. The conversions listed below are also part of the scenarios introducing accessibility within publishing workflows (section 6) and are given here so that the reader can refer back to them when reading that section.

5.1 Convert Multimedia Material to structured Multimedia Material

By multimedia material we mean all content that consists of different media types. Multimedia content can be a Microsoft Word document that consists of text and images, but it can also be a video sequence. In most of the following scenarios the term Multimedia Material is used for PDF files.

Accessible PDF can be seen in some cases as an output format which is passed to the reader. Sometimes it might be an interchange format from which further transformations into other formats are performed.

If the accessible PDF is intended as the output format, creating tagged PDF files makes them accessible to standard screen readers that support tagged PDF (such as JAWS and Window-Eyes).

This circumvents the need for end users to learn how to use Adobe's embedded speech synthesiser.

However, it is not always easy to make PDF files directly accessible to screen readers. Documents with complex layouts can be extremely difficult, if not impossible, to convert into an accessible PDF file, due to the fact that the content does not linearise correctly. It can also be very challenging to make documents with extensive charts or with embedded videos accessible.

Converting PDF into accessible PDF may include the following steps (a detailed description of these steps can be found on the EUAIN Training Resource Centre, see footnote 69):

• Apply OCR to Image only PDF

• Converting Existing PDF Files into Tagged PDF Files

• Change Tag Type

• Add Alternate Text to Images

• Create New Tag

• Delete Tag

• Reorder Tags

• Reordering Tags Using the Order Tab

• Artefacts

• Adding Tags to Untagged PDF Files

• Add Tags to Documents Feature

• Add All Tags Manually

• TouchUp Reading Order Feature

• Adding Tags Using TouchUp Reading Order

69 http://wiki.euain.org/doku.php?id=wiki:processes:conversions:html_to_xml


5.2 Convert structured Multimedia Material to XML

By structured Multimedia Material we mean multimedia content that includes information on its structure. This can be a fully navigable audio document, a Microsoft Word document that makes correct use of headings etc., or a fully accessible video document with subtitling. In most of the scenarios we mean accessible PDFs when speaking of structured Multimedia Material.

When Multimedia Material is structured, it can be transformed into various output formats. In the case of PDFs it is generally more cost effective to start from the original word processing documents, if that is possible. Quite often, however, the original file used to create the PDF is unavailable. In that case an XML file can be created using Acrobat, but the file will probably be more complex and will require more post-processing and restructuring work to make it accessible.

If the document contains images, only the alternate description will be saved, not the image itself, and there are no tables in the HTML file, even if the table was an appropriately tagged data table in the original PDF file. More information can be found on the EUAIN Training Resource Centre.

5.3 Convert Multimedia Material to XML

PDF is widely recognised as a de facto standard for electronic end user documents. Public services, banks, insurance companies and others that distribute electronic documents to end users utilise the advantages of a format that is easily printed by the user and has a fixed, unchangeable layout. It is also widely recognised that PDF is unsuitable for anything other than visual presentation. Information in a PDF document is deliberately made inflexible and is not easily retrieved. In spite of these facts, one of the most common tasks for national providers of accessible information is the conversion of PDF to an accessible format. At the same time, this particular conversion is both costly and time-consuming.

There are two different approaches to extracting information from a PDF document. Since a PDF can be compared to a high-quality TIFF picture, the same kind of picture that results from a scanning process, it is possible to use the same OCR processes on PDF documents as on scanned paper documents, and the results are comparable. Most OCR software providers on the market recommend this method. The other approach is to try to extract the text information that is present in a PDF document. This is done by software that reads both the text content and the visual information in a PDF document and translates this information into the common attributes of a text document: paragraphs, headings, etc.
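To illustrate the second approach, the sketch below guesses structure from visual information alone. It assumes that an earlier extraction step has already produced a list of (text, font size) pairs; that input shape, the function name and the 1.2 threshold are assumptions made for this example, not the behaviour of any particular product.

```python
from collections import Counter


def classify_runs(runs):
    """Label each (text, font_size) run as 'heading' or 'paragraph'.

    Assumption: the most frequent font size is body text, and anything
    noticeably larger is a heading. Real documents need more rules.
    """
    body_size = Counter(size for _, size in runs).most_common(1)[0][0]
    labelled = []
    for text, size in runs:
        kind = "heading" if size > body_size * 1.2 else "paragraph"
        labelled.append((kind, text))
    return labelled


runs = [
    ("Annual report", 18.0),
    ("Revenue grew in all regions.", 10.0),
    ("Costs were stable.", 10.0),
]
for kind, text in classify_runs(runs):
    print(kind, "-", text)
```

A heuristic of this kind breaks down quickly on columns, tables and page headers, which is one reason the conversion remains labour-intensive.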

The result of both processes can then be exported into an XML file, which is the basis for further accessible information processing. Both methods require extensive use of human resources and are not easily automated. In particular, the rendering of visual components such as columns, tables, page headers and unusual fonts has proved very difficult to automate. Considering the large amount of human resources employed in this type of conversion, research effort devoted to improving these methods would be very well spent. It should be noted that conversions to accessible formats earlier in the production chain, when the conversion process is still easily automated, would have similar results.

5.4 Convert traditional print to XML

This activity refers to the transformation of printed paper into an electronic interchange format. Usually this is done by scanning the printed pages.

The result is a set of image files (one file per page unless the multiple page TIFF format is used). These image files can then be saved or be processed through optical character recognition (OCR) software. This software recognises the text in the images and transforms the images into a text file. Either by specific software or by hand this text can be converted into a structured XML file.
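The last step, turning recognised text into structured XML, can be sketched as follows. The element names (document, heading, para) and the rule that a short block in capitals is a heading are assumptions made for this illustration only; a production workflow would target a defined DTD or schema and usually needs manual correction.

```python
import xml.etree.ElementTree as ET


def text_to_xml(ocr_text):
    """Wrap blank-line-separated blocks of OCR text in simple XML elements.

    Assumption: a short block in capitals is a heading; the element names
    'document', 'heading' and 'para' are invented for this sketch.
    """
    root = ET.Element("document")
    for block in ocr_text.split("\n\n"):
        # Join lines that were broken at the page margin.
        text = " ".join(line.strip() for line in block.splitlines() if line.strip())
        if not text:
            continue
        is_heading = text.isupper() and len(text) < 60
        ET.SubElement(root, "heading" if is_heading else "para").text = text
    return root


ocr_text = (
    "CHAPTER ONE\n\n"
    "The harbour was quiet\nthat morning.\n\n"
    "Boats came in at noon."
)
print(ET.tostring(text_to_xml(ocr_text), encoding="unicode"))
```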

This output will refer to structured content which is either being fed into an XML reader which accesses the content and the structural data which surrounds that content or the XML is being fed into the next stage of an XML production stream for further conversion.

This process describes the first stage of a digitisation project where an analog format is being taken in and converted into a format (XML), which allows further conversion, storage or processing.


5.5 Convert DTP to XML

This, primarily multimedia, input will refer to a complex package of media files, which are held together by one governing structure. Well known DTP packages currently are QuarkXPress and Adobe InDesign. These documents also have the most complex formats that can be used in accessible information processing, and the ability to process these formats forms the key to integrating accessibility with mainstream workflows and processes.

In the case of Accessible Output from a Multimedia package, guidelines are often needed for describing images and other multimedia formats which cannot be directly translated into a suitable format.

XML output will refer to structured content which is either being fed into an XML reader which accesses the content and the metadata which surrounds that content or the XML is being fed into the next stage of an XML production stream for further conversion.

This process is likely to take some (or all) of the content in a multimedia package and process it into a more generic format (XML) for conversion into further formats. More information can be found on the EUAIN Training Resource Centre (see footnote 70).

5.6 Convert XML to Print

XML to printed paper conversion processes (specific to accessible information processing) refer to any process that uses XML data to prepare content for accessible use based on a paper format.

Primarily XML input will refer to part of an XML production stream where XML is used as the core interchange format. This being the case, it is likely that there will be some pre-processing, as content is rarely created by hand straight into XML. From XML, content can be converted to almost any format so it is a very good starting point in accessible information processes.

The transformation itself is usually done automatically by software. It uses specific transformation templates (XSLT) to convert the XML data into a readable format. The transformation templates contain information on font sizes, colours, and also layout information.

Printed-paper output refers to any hard copy that represents the information. It probably comes out of a printer or a photocopier. In terms of accessible information processing, printed-paper output can be a clear print output without sophisticated layout of the text, which makes the text easier to read.

This conversion process is likely to be one output branch of an XML production process that uses XML as its central archiving and interchange format. This would be the output node for one particular format.

5.7 Convert XML to Braille

A common conversion process in accessible information processing is that of a generic XML format to a specialist format with specific niche user requirements. As many specialist organisations move towards an XML based production process, this process will become commonplace.

Printed Braille output refers to any hard copy, which represents Braille. Braille can be created through various means. These means will probably involve a Braille Embosser or any other Braille printer. One should be aware that Braille printer files are machine dependent.

The transformation itself is usually done automatically by software. It uses a specific transformation style guideline to convert the XML data into the format, which is fed into the Braille printer. The style guideline must do the pre-formatting for Braille output. This includes layout information like adding line-breaks and other Braille print specific information. One of the trickier transformations is the generation of language dependent contractions.
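The pre-formatting step can be illustrated with a deliberately simplified sketch. It transliterates lower-case letters into Unicode braille patterns and wraps the result at a fixed number of cells per line. It handles uncontracted braille only: contractions, capital and number indicators and punctuation are left out, the line width of 12 cells is arbitrary, and a real embosser needs its own machine-dependent file format instead of Unicode characters.

```python
# Dot numbers for the 26 letters in uncontracted braille.
LETTER_DOTS = {
    "a": "1", "b": "12", "c": "14", "d": "145", "e": "15", "f": "124",
    "g": "1245", "h": "125", "i": "24", "j": "245", "k": "13", "l": "123",
    "m": "134", "n": "1345", "o": "135", "p": "1234", "q": "12345",
    "r": "1235", "s": "234", "t": "2345", "u": "136", "v": "1236",
    "w": "2456", "x": "1346", "y": "13456", "z": "1356",
}

BLANK_CELL = "\u2800"  # braille pattern with no dots, used as a space


def to_cell(dots):
    """Build a Unicode braille pattern: dot n sets bit n-1 above U+2800."""
    return chr(0x2800 + sum(1 << (int(d) - 1) for d in dots))


def to_braille_lines(text, cells_per_line=12):
    """Transliterate letters, then wrap on word boundaries.

    Characters without an entry in LETTER_DOTS are silently dropped, and a
    word longer than cells_per_line is left on a line of its own unbroken.
    """
    lines, current = [], ""
    for word in text.lower().split():
        cells = "".join(to_cell(LETTER_DOTS[ch]) for ch in word if ch in LETTER_DOTS)
        if not cells:
            continue
        candidate = cells if not current else current + BLANK_CELL + cells
        if current and len(candidate) > cells_per_line:
            lines.append(current)
            current = cells
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


print("\n".join(to_braille_lines("braille for all")))
```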

This conversion process is likely to be one output branch of an XML production process that uses XML as its central archiving and interchange format. This would be the output node for one particular format.

70 http://wiki.euain.org/doku.php?id=wiki:processes:conversions:multimedia_to_xml


5.8 Convert XML to Large Print

XML to printed paper conversion processes (specific to accessible information processing) refer to any process that uses XML data to prepare content for accessible use based on a paper format. Here this probably means large print representations.

Primarily XML input will refer to part of an XML production stream where XML is used as the core interchange format. This being the case, it is likely that there will be some pre-processing, as content is rarely created by hand straight into XML. From XML, content can be converted to almost any format, so it is a very good starting point in accessible information processes.

The transformation itself is usually done automatically by software. It uses a specific transformation style guideline to convert the XML data into a readable format. The style guideline contains information on font sizes, colours and layout. High-precision XML to print transformation will generally be done through the use of XSLT or XSL-FO style guidelines (cf. also below: XML->PDF transformation).

Printed-paper output refers to any hard copy, which represents the information. It is probably coming out of a printer or a photocopier. In terms of accessible information processing printed-paper output is likely to be a large print representation.

This is likely to be one output branch of an XML production process that uses XML as its central archiving and interchange format. This would be the output node for one particular format.

5.9 Convert XML to HTML

XML to HTML conversion processes (specific to accessible information processing) refer to any process that uses XML data to prepare content for use within web environments, be it a web site or just offline reading within a web browser.

Since both the input format and the output format can contain high amounts of structural data or structured information, we can assume that these processes are quite modern and that they represent good practice in accessible information processing.

In essence, XML input will refer to part of an XML production stream where XML is used as the core interchange format. This being the case, it is likely that there will be some pre-processing, as content is rarely created by hand straight into XML. XML content can be converted to almost any format (using e.g. XSLT or other processing systems such as Stilo-Omnimark), so it is a very good starting point for accessible information processes.

An output medium of HTML suggests that the content is being prepared for use within web environments. This means that the processing should take account of the relevant standards for accessible web content (e.g. the W3C's WCAG 1.0 and WCAG 2.0).

The HTML will possibly be published on the web or possibly as part of a Content Management System (CMS).
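As a small illustration of how structure and text alternatives can be carried from XML into HTML, the sketch below maps an invented XML vocabulary (document, heading, para, image) onto HTML elements using the Python standard library. The vocabulary and function name are assumptions for this example; production systems would more typically use XSLT, and the heading level is not validated here.

```python
import xml.etree.ElementTree as ET
from html import escape


def xml_to_html(xml_text):
    """Map a small invented XML vocabulary onto structural HTML elements."""
    root = ET.fromstring(xml_text)
    parts = []
    for node in root:
        if node.tag == "heading":
            level = node.get("level", "1")
            parts.append(f"<h{level}>{escape(node.text or '')}</h{level}>")
        elif node.tag == "para":
            parts.append(f"<p>{escape(node.text or '')}</p>")
        elif node.tag == "image":
            # Carry the text alternative through to the alt attribute.
            src = escape(node.get("src", ""), quote=True)
            alt = escape(node.get("description", ""), quote=True)
            parts.append(f'<img src="{src}" alt="{alt}">')
    return "\n".join(parts)


source = (
    '<document><heading level="2">Results</heading>'
    "<para>Sales &amp; costs</para>"
    '<image src="chart.png" description="Bar chart of sales"/></document>'
)
print(xml_to_html(source))
```

Because headings become real heading elements and image descriptions become alt attributes, the structural information in the XML remains available to assistive technology in the browser.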

This conversion process is likely to be one output branch of an XML production process that uses XML as its central archiving and interchange format. This would be the output node for one particular format.

5.10 Convert XML to structured Multimedia Material

XML to structured Multimedia Material conversion processes refer to processes in which, for example, XML data is transformed into the PDF format. PDF is a very common output format and has the advantage that the content cannot be changed by the user. It is also platform independent, which means that a PDF document will look the same on any computer and is not dependent on any specific reading software.

Though PDF output refers to an electronic document format it can also be used within accessible information processing for preparing documents for Large Print output. These Large Print PDF can either be read on a computer screen or, if allowed, be also printed out for use as Printed Paper output.

The transformation itself is usually done automatically by software. It uses a specific transformation style guideline to convert the XML data into a readable format. This implies that the processing should take account of the relevant PDF transformation (XSL-FO). The conversion style guideline contains information on font sizes, colours and layout. XSL-FO processing requires additional software called an FO processor (see footnote 71).
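To give an idea of what an FO processor receives as input, the sketch below builds a minimal XSL-FO document for large print output with the Python standard library. The page size, margins, the 18pt font size and the function names are illustrative assumptions; in a real workflow the FO document would normally be generated by an XSLT style sheet and then passed to an FO processor to produce the PDF.

```python
import xml.etree.ElementTree as ET

FO = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO)


def fo(tag):
    """Return the namespace-qualified name of an XSL-FO element."""
    return f"{{{FO}}}{tag}"


def large_print_fo(paragraphs, font_size="18pt"):
    """Build a minimal XSL-FO tree with one page master and large text."""
    root = ET.Element(fo("root"))
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "A4", "page-height": "297mm",
        "page-width": "210mm", "margin": "25mm",
    })
    ET.SubElement(page, fo("region-body"))
    sequence = ET.SubElement(root, fo("page-sequence"), {"master-reference": "A4"})
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})
    for text in paragraphs:
        block = ET.SubElement(flow, fo("block"), {
            "font-size": font_size, "line-height": "1.5", "space-after": "12pt",
        })
        block.text = text
    return ET.tostring(root, encoding="unicode")


print(large_print_fo(["Opening hours", "The library is open every weekday."]))
```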

In essence, XML input will refer to part of an XML production stream where XML is used as the core interchange format. This being the case, it is likely that there will be some pre-processing, as content is rarely created by hand straight into XML. XML content can be converted to almost any format, so it is a very good starting point for accessible information processes.

5.11 Convert DTP to Multimedia Material

This primarily multimedia input will refer to a complex package of media files, which are held together by one governing structure. Well known DTP packages currently are QuarkXPress and Adobe InDesign. These documents also have the most complex formats that can be used in accessible information processing, and the ability to process these formats forms the key to integrating accessibility with mainstream workflows and processes.

In the case of Accessible Output from a Multimedia package, guidelines are often needed for describing images and other multimedia formats which cannot be directly translated into a suitable format.

Usually the PDF output from DTP software is used for submission to the printing house for print output. New production techniques and also improvements and new features in the DTP software now make it possible to create a basic PDF output version that can be used as input to other conversion processes such as the conversion of Multimedia Material to structured Multimedia Material.

PDF is a very common output format and has the advantage that the content cannot be changed by the user. It is also platform independent, which means that a PDF document will look the same on any computer and is not dependent on any specific reading software.

5.12 Convert Audio to structured Audio

Audio files have been an integral part of specialist formats for some time. Currently all audio streams are available in, or easily convertible to, digital formats (WAV, MP3, other MPEG based formats etc.).

The conversion process includes the modification of the audio content and adding structural data to the audio content. In general, this implies cutting large files into smaller ones or producing a list of timing markers so that audio rendering can start at any desired point in a larger audio file.
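Producing the list of timing markers is simple arithmetic once the duration of each section is known. The sketch below assumes the durations are already available as (title, seconds) pairs; the function names and the dictionary layout are invented for this example and do not follow any particular structured audio standard.

```python
def timing_markers(sections):
    """Turn (title, duration_seconds) pairs into cumulative start and end times."""
    markers, start = [], 0.0
    for title, duration in sections:
        markers.append({"title": title, "start": start, "end": start + duration})
        start += duration
    return markers


def as_clock(seconds):
    """Format seconds as h:mm:ss for a human-readable navigation list."""
    whole = int(seconds)
    return f"{whole // 3600}:{whole % 3600 // 60:02d}:{whole % 60:02d}"


sections = [("Foreword", 95.0), ("Chapter 1", 1810.5), ("Chapter 2", 2204.0)]
for marker in timing_markers(sections):
    print(as_clock(marker["start"]), marker["title"])
```

With such markers a player can start rendering at the beginning of any section without the large audio file having to be cut into pieces.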

Structured Audio as an output format refers to audio content that is fully navigable (headings, chapters, paragraphs). This can be achieved by using relevant standards or frameworks for handling structured audio content, such as ANSI/NISO Z39.86 (DAISY) and NIMAS/DAISY. These frameworks can provide a combination of different output formats. In the case of DAISY this can be a textual representation of the content combined with audio. A special case is the addition of time markers to human-read audio files so that they can be rendered in small pieces of audio. This process is part of the production chain of hybrid (text + audio) DAISY books.

5.13 Convert XML to XML

Primarily XML input will refer to part of an XML production stream where XML is used as the core interchange format. This being the case, it is likely that there will be some pre-processing, as content is rarely created by hand straight into XML. From XML, content can be converted to almost any format, so it is a very good starting point in accessible information processes.

XML output will refer to structured content which is either being fed into an XML reader which accesses the content and the metadata which surrounds that content or the XML is being fed into the next stage of an XML production stream for further conversion.

In general this process is the core of many compound production processes. XML is being converted in some way. This could be an adaptation from one XML format to another (e.g. TEI to DAISY) or a change in presentation of an XML format, possibly through use of XSLT.

71 An example of this is the Altova Stylevision processor, http://www.altova.com/downloadtrialstylevision.html


Due to the fact that XML can also be used as storage format it is likely that different publishers may use their own XML format. For accessible information processing it might be necessary to transform one XML format into another to be able to apply the follow up conversions into the final delivery format. This might include a conversion from proprietary XML to an open standard such as TEI or DAISY.

The transformation itself is usually done automatically by software. It uses a specific transformation style guideline (XSLT) to convert the XML data into another XML format. To preserve all the structural information during the transformation, the people involved need to know the XML grammars concerned. Technically this means that both the input and the output structure definitions (DTDs or Schemas) must be known.
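The need to know both grammars can be made concrete with a small sketch. It renames elements according to a mapping table and reports every source element for which no mapping exists, so that structure is not lost silently. The element names on both sides are invented for this illustration and are not taken from any real DTD; in practice the same job is usually done with an XSLT style sheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a publisher's in-house element names to the
# names of a target vocabulary; a real mapping comes from both DTDs/schemas.
TAG_MAP = {"book": "document", "title": "h1", "section-title": "h2", "text": "p"}


def convert(element, unmapped):
    """Copy the tree, renaming elements and recording any without a mapping."""
    new_tag = TAG_MAP.get(element.tag)
    if new_tag is None:
        unmapped.add(element.tag)
        new_tag = element.tag  # keep the original name so nothing is dropped
    copy = ET.Element(new_tag, dict(element.attrib))
    copy.text, copy.tail = element.text, element.tail
    for child in element:
        copy.append(convert(child, unmapped))
    return copy


source = ET.fromstring(
    "<book><title>Tides</title><text>Intro.</text><sidebar>Note.</sidebar></book>"
)
unmapped = set()
result = convert(source, unmapped)
print(ET.tostring(result, encoding="unicode"))
print("No mapping for:", sorted(unmapped))
```

The list of unmapped elements is the useful by-product: it shows exactly where the input grammar contains structure that the output grammar has no agreed place for.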


6 Scenarios introducing accessibility within publishing workflows

The term "fully accessible" or "full accessibility" is always controversial. Accessibility is a mixture of two main factors – first, the type and degree of the print disability of the user, and second, the combination of "readability" and "navigability" of the information offered to the user. Some formats are only readable, but hardly navigable and some are both. Some users may have problems with colours and others with the size of the fonts used, and some with both.

As far as formats go, a pure ASCII file is perfectly readable with most (if not all) adaptive devices, but it is hard to navigate as there is usually no structure at all, neither visual nor spatial. An XML text (or HTML, XHTML, etc.), however, is both readable and navigable, as it allows for different degrees of internal structure. Some text formats (ASCII, markup languages and word processor documents) allow for customisation of the characters shown on the screen (both in size and appearance) and of the background colours. Others may be readable and navigable but not customisable (for example, a tagged PDF document read with Acrobat Reader: the size of the whole page can be changed, but it is not possible to change only parts of the text to the size or appearance that may be needed).

On this basis, pure images can be put at one end of the scale and a DAISY book at the other: a pure untagged PDF document is completely inaccessible for most print-disabled people (as much as a printed book is), while a full-audio, full-text DAISY book includes all the accessibility features that might be thought of. Between these two extremes the possibilities are endless. There are proprietary file formats that can produce accessible documents of good quality if they are used properly (PDF, Word, etc.).

But making documents accessible for customers does not mean that there is an obligation to satisfy everyone’s needs. That is never possible, not even with printed books – some people prefer smaller books, other people would like a slightly bigger font, while yet others prefer glossy to matte paper. To try to address as many customers as possible (according to economical, social, cultural and personal differences) publishers put in the market different editions of the same book – hardback editions, paperback editions, pocket editions… and they make as many translations of the book as needed to sell as many copies of it in as many countries as possible. All these different editions are considered to be in the same format – printed paper. But, in fact, they are "customised" versions of the same format, and some people may even consider them to be "different formats". And they all originate in the same electronic file, slightly or largely modified to meet the special characteristics of a certain edition. So publishers are already publishing the same book in different "formats" when needed.

Formats that are only "readable" are advisable only if there is no other possibility of producing an accessible version of a book. These formats are, for instance, pure ASCII files or a continuous WAV or MP3 file for an audio book. The book thus produced can be read, but the reading experience is far from satisfactory. Sometimes, if the book is not the type of book that is usually read sequentially, plain readable formats are of very little or no use at all.

On the other side of what an accessible book may be stand DAISY books. They can reproduce the experience of moving and browsing through a printed book for those who cannot read print. Navigation can be taken as far as the word level when needed and both text and audio (when they are both present) are perfectly synchronised. DAISY books have many layers of navigability that go from full-audio and full-text (on the top of the scale) to fully structured text-only or audio-only books. DAISY books can be produced directly from properly created XML files with the appropriate DTD.


6.1 Scenario 1 - Delivering XML files

We are an SME operating largely within one large EU country. All of our document design and structuring is out-sourced to different design agencies, depending on the type of materials we are publishing. We have been asked to provide XML source files to our national organisation for the blind in order to create accessible versions of the materials. Our material is largely educational and includes multimedia materials. How can we comply with this request?

In this scenario the libraries have to be specific in their wishes. E.g. do they want source documents or XML?

Publisher's structures (if not, structures will be lost when the material is exported)

Guidelines need to be produced instructing actors what to ask for in specific situations.

Actors

• National Publisher, Educational Publishing, Multimedia Publishing, SME

• Library for the blind

• Service Provider

Conversions

• Multimedia to structured Multimedia [5.1]

• Structured Multimedia to XML [5.2]

• Multimedia to XML [5.3]

• XML to XML [5.13]

Scenario description

There are three basic ways to create accessible information in this scenario:

• Convert the Multimedia files (e.g. PDFs) to XML, then add the structural information to the XML file and deliver it to the Library for the Blind

• Edit the structure within the Multimedia document and convert it to XML afterwards. Deliver the XML to the Library for the Blind

• Convert a multimedia document into a structured multimedia document and then into XML. This can then be delivered to the Library for the Blind


Figure 2 - Scenario 1 Delivering XML files


6.2 Scenario 2 - Accessibility enhancement in general

We are a large publisher operating in several markets across Europe and the US. We already have an enterprise content management system. To what extent can this be made more accessibility compliant so that we can create accessible output formats for national accessibility organisations in different countries? In some countries we are required by legislation to provide these materials, in others it is on a voluntary basis, but in each case we need to know how to do this.

In this scenario, the national accessibility organisations need to state which formats and files they require (e.g. source files or XML). Guidelines need to be produced instructing actors what to ask for in specific situations.

Actors

• International Publisher, General, Multimedia Publishing

• Large national specialist providers

Conversions

• Multimedia to XML [5.3]

• DTP to XML [5.5]

• DTP to Multimedia [5.11]

Scenario description

There are three possible ways:

• The DTP files are converted straight into a multimedia format (e.g. PDF) and then into XML, where the structure and accessibility information are added

• The DTP is directly converted into XML and the structure and accessibility information are added

• The structure is added within the DTP document (e.g. through XML tagging). These documents can then be converted into XML


Figure 3 - Scenario 2 Accessibility enhancement in general


6.3 Scenario 3 - Increasing web accessibility

We are a major scientific and technical publisher. Most of our material is already available online but we do not really pay attention to accessibility issues other than on a general level. Our customers tell us this is not enough, and much of our material remains inaccessible. How can we better distribute our materials over the web to our print impaired users?

Guidelines need to be produced instructing actors what to ask for in specific situations.

Actors

• International Multimedia STM Publisher

• Print impaired end users

Conversions

• XML to HTML [5.9]

Scenario description

In this case, the Publisher in question has a large body of content which they would like to make accessible on the web. This is a case where standards are increasingly important.

There are several relevant standards in this case:

• WCAG 1.0 (Web Content Accessibility Guidelines 1.0)

• WCAG 2.0 (Web Content Accessibility Guidelines 2.0)

There are several relevant organisations in this case:

• CEN/ISSS Workshop on Document Processing for Accessibility (WS/DPA)

• W3C (World Wide Web Consortium)

• WAI (Web Accessibility Initiative)

In order to make the incorporation of these standards as streamlined as possible, it is important to use a CMS suitable for the needs of both the target end users and the producing organisation. For non-XML documents the user might need access to information or transformation platforms to convert those documents into HTML as well.


Figure 4 - Scenario 3 Increasing web accessibility


6.4 Scenario 4 - Accessibility policy

We are a traditional print publisher but we have absolutely no experience in accessibility. We want to train our staff in making our materials accessible. What do we do?

Guidelines need to be produced instructing actors what to ask for in specific situations.

Actors

• International Multimedia STM Publisher

• Print impaired end users

Conversions

• Traditional print to XML [5.4]

Scenario description

This requires a plan to be put in place concerning an accessibility policy within the organisation. It is important that such a policy is aware of relevant standards, policies and legislation. These training materials are a good start, but communication with specialist organisations for the blind and visually impaired could also be useful.

From a technical perspective, the possibility of incorporating accessibility within existing XML processing streams should be investigated.

For all new materials, publishers should develop author guidelines for digital publishing workflows so that accessibility needs are incorporated within the process.


Figure 5 - Scenario 4 Accessibility policy

6.5 Scenario 5 - Spoken documents for everyone

We are a major player in the EU Spoken Book market. We are interested in the accessibility area, as we believe this could strengthen our hold on the market. How do we produce books that are accessible for everyone, especially people with dyslexia?

Actors

• EU audio book publisher

• National Specialist provider

Conversions

• Audio to structured Audio [5.12]

Scenario description

It is likely that the solution to this scenario is very simple. With the assistance of the national specialist provider in question, it should be possible to convert an audio book (WAV or MP3 based) into an audio book that contains structure for navigation. This would require the use of a standard such as DAISY.


Figure 6 - Scenario 5 Spoken documents for everyone


6.6 Scenario 6 - Accessible and protected PDFs

We are a major publishing conglomerate. Much of our content is made available in PDF format exclusively. How do we make PDFs accessible but still protected from unauthorised use? Is there software I can use?

Actors
• Major Publishing Conglomerate
• National Specialist Provider

Conversions
• Multimedia to structured Multimedia [5.1]

Scenario description

Accessible PDFs are a hot item at the moment and as a result there is a lot of activity on the internet:
• Adobe Accessibility Resource Centre [72]
• Creating Accessible PDF Documents with Adobe Acrobat [73]
• PDF Universal Access working group [74]

In this document, PDF is considered to be a multimedia format which is a packaged set of files of different formats structured into a document. In the conversion process 5.1 a PDF document is converted through tagging of the PDF into a structured multimedia document. The DRM within these documents can be considered as another element with specific and often conflicting requirements. The technical requirements for conversion are much the same as those for any multimedia package.

72 http://www.adobe.com/accessibility/

73 http://www.adobe.com/enterprise/accessibility/pdfs/acro7_pg_ue.pdf

74 http://www.aiim.org/standards.asp?ID=27861


Figure 7 - Scenario 6 Accessible and protected PDFs


6.7 Scenario 7 - Working hand in hand

We are an organisation providing accessible format materials. We often receive unstructured (or poorly structured) publisher source files and convert them to (DAISY) XML and Braille. What specifications can we give to publishers so that they can better structure their source files themselves, as this will make the process easier?

Actors
• Specialist organisation
• General National Publisher

Conversions
• Multimedia to XML [5.3]
• XML to Braille [5.7]
• XML to XML [5.13]

Scenario description

The communication of requirements between publishers and specialist organisations is difficult. In both directions, there can be complications because of the language used, priorities and different end user requirements. In order to make the communication process as seamless as possible, it is essential to start this process as early as possible, and to build up a relationship of trust and knowledge of each other’s requirements. International standards are also essential to ensure that there is a starting point and that other work can be re-used. In order to create a solution that is as widely applicable as possible, it is important to use a standardised XML dialect that supports automatic transformation.


Figure 8 - Scenario 7 Working hand in hand


6.8 Scenario 8 - Accessible design

We are a design agency working for several private and public organisations. They have asked us to build accessibility into our products, but how do we do this? We understand the principles of "Design For All" and web accessibility, but how does this apply to the documents we design?

Actors
• Design Agency
• Public and private service provider

Conversions
• Multimedia to XML [5.3]
• XML to structured Multimedia [5.10]

Scenario description

Multimedia documents are converted into XML. The information can then be structured and afterwards converted into multiple output formats.

Assuming the service provider is using modern software and modern XML production streams, it should be a case of adding new outputs to these streams for the major accessibility formats which are described in these training materials.
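The idea of adding new outputs to an existing XML stream can be sketched as follows: one structured source, several independent renderers. This is a minimal illustration only. The element names (`doc`, `head`, `p`), the sample content and the two renderer functions are assumptions, and a production stream would more likely use XSLT style sheets than hand-written Python.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical structured source; element names are illustrative.
SOURCE = """<doc>
  <head>Opening hours</head>
  <p>The office is open on weekdays.</p>
</doc>"""


def to_html(root):
    """Render headings and paragraphs as HTML elements."""
    parts = []
    for child in root:
        text = escape(child.text or "")
        if child.tag == "head":
            parts.append(f"<h1>{text}</h1>")
        elif child.tag == "p":
            parts.append(f"<p>{text}</p>")
    return "\n".join(parts)


def to_plain_text(root):
    """Render the same source as plain text, e.g. for a Braille translator."""
    lines = []
    for child in root:
        text = (child.text or "").strip()
        if child.tag == "head":
            lines.append(text.upper())
        elif child.tag == "p":
            lines.append(text)
    return "\n\n".join(lines)


root = ET.fromstring(SOURCE)
html_output = to_html(root)
text_output = to_plain_text(root)
```

Because both renderers read the same structured source, adding a further accessible output means adding one more function, not re-keying the content.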


Figure 9 - Scenario 8 Accessible design


6.9 Scenario 9 - Accessibility on a large scale

We are a national library and are responsible for archiving vast amounts of digital material. How can we be sure this material will be accessible? Should we follow some guidelines?

Actors
• Library acting as a service provider
• National Specialist Organisation for the Blind

Conversions
• XML to structured Multimedia [5.10]

Scenario description

Since there is an archive of digitised information, it is likely that an XML production stream has been used. If this is the case, then specific formats for the print impaired users in question need to be added to the output formats. If no XML processes have been put in place, it is likely that some manual intervention will be required to add metadata to the information in order to make it suitable for conversion to accessible formats. As a national library, it is also important that there is a degree of communication with the National Library for the Blind in your country.

Figure 10 - Scenario 9 Accessibility on a large scale


6.10 Scenario 10 - What authors can do

I am an author who publishes my own works/publishes as part of a collective. How do I make sure that everyone can read my books without me incurring too many costs?

Actors
• Author
• Accessibility specialist

Conversions
• Structured Multimedia to XML [5.2]
• Traditional print to XML [5.4]

Scenario description
• Together with an accessibility specialist, the author defines guidelines that support the author in creating accessible documents.
• Existing print material is converted into XML, which can then be exported into various output formats.


Figure 11 - Scenario 10 - What authors can do


6.11 Scenario 11 - Repair and adaptation

We are a disability support unit in a major university. Our staff members are experts in specific disabilities and our role is to support specific learners at the university by ensuring the multimedia content supplied for their learning is accessible to each learner, providing or recommending appropriate formats to match the access modes available to that learner. How can we adapt learning materials? What are the processes and what standards are available to support the processes?

Actors
• Disability Support Staff

Conversions
• Multimedia to structured Multimedia [5.1]

Scenario description

A Disability Support Staff member will work with the learner and the material, assessing the adaptations required to make the material accessible for the learner’s context. The learner’s requirements for the context might be expressed in a functional description (e.g. IMS Accessibility for LIP [75] or ISO/IEC JTC 1/SC 36 24751-2 Personal Needs and Preferences Statement [76]). There may also be environmental context requirements, such as might be expressed in a device profile.

Content produced by the Educator will be in the form of an aggregation, which might be HTML or SCORM or an IMS Content Package or other aggregation usable by a Learning Management System. A piece of content may itself contain multiple aggregations and formats such as PDF files, MPEG videos, text, HTML etc. Determining whether the content matches the functional requirements and what the needs for repair are might involve examination of metadata associated with the content (such as IMS AccessForAll Metadata [77] or ISO/IEC JTC 1/SC 36 24751-3 Digital Resource Descriptions [78]) or using automated software tools to determine accessibility properties of content.

Appropriate repair assistance tools might output machine and human-readable statements in the language EARL [79] or properties related to WCAG 2.0 or Section 508 or other such accessibility standards and also other properties.

Repair might involve disassembling an aggregation (such as IMS Content Packaging 1.2 [80]) into its constituent parts, providing alternate/supplemental resources and re-assembling. Repair might also involve provision of offline materials (such as the notes in large font at some specific time) or online services or offline services to accompany the use of the material.
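The first step of disassembling an aggregation can be sketched as listing the parts it declares. The example below assumes the aggregation is an IMS Content Package delivered as a zip archive with an `imsmanifest.xml` file at its root, and that resources are declared as `resource` elements with an `href` attribute. The function names are hypothetical, and namespaces are stripped so that the sketch does not depend on a particular version of the packaging schema.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace, e.g. '{ns}resource' -> 'resource'."""
    return tag.rsplit("}", 1)[-1]


def list_package_resources(package_bytes):
    """Return the href of every <resource> declared in imsmanifest.xml."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        manifest = ET.fromstring(package.read("imsmanifest.xml"))
    return [
        element.get("href")
        for element in manifest.iter()
        if local_name(element.tag) == "resource" and element.get("href")
    ]
```

Each listed part (a PDF, a video, an HTML page) can then be checked individually and, where necessary, given an alternative or supplemental resource before the package is re-assembled.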

75 http://www.imsglobal.org/accessibility/#acclip

76 http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=43603&scopelist=PROGRAMME

77 http://www.imsglobal.org/accessibility/accmdv1p0/imsaccmd_oviewv1p0.html

78 http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=43604&scopelist=PROGRAMME

79 http://www.w3.org/TR/EARL10-Schema/

80 http://www.imsglobal.org/content/packaging/


Figure 12 - Scenario 11 Repair and adaptation


6.12 Common scenario requirements

As is shown in the possible workflow for each scenario, there are common tasks that need to be covered when introducing accessibility into the publishing workflow. Those tasks include:
• Definition of document style guidelines that are agreed and disseminated to the actors.
• Reference conversion of documents into tagged, structured formats.
• Manual correction of conversion input/output (in cases where content creators did not follow the style guideline requirements).
• Extension of descriptions regarding visual data (images, videos) and other multimedia material.
• Subsequent on-demand conversion into accessible formats.

6.13 Specific scenario requirements

In order to make documents accessible for a specific group, specific format conversions need to take place. It is important in these format conversions that the structure, annotations, metadata and intellectual property rights of the parent documents are inherited by the child documents and not lost. Therefore, attention must be paid to using the right conversion order and toolkits. The strategy that seems most viable is to use a media-rich structured format (including annotations and metadata) as the reference and then to create the desired output by operating directly on that format. An additional advantage of this strategy is that this format can ideally also be used for backups and can be stored in the publishing archive or in libraries for long-term archiving.

The next sections will further elaborate the requirements found for the possible scenarios presented above. An analysis will be provided of the options for document style guidelines, reference conversion into a structured format, how to minimise manual correction and enhancement of the conversion output, how to extend descriptions of visual data and other multimedia material and how to convert the reference format into other accessible formats.

7 Application-oriented scenario implementation

7.1 Harry Potter and the RNIB

Actors involved: International Publisher, General, Printed paper books, Large, International Publisher, General, Audiobooks, Large, Public blind Service Provider

Conversions: Audio wav to ASCII, ASCII to printed Braille, Multimedia to printed Braille, Multimedia to printed paper, Multimedia to Multimedia

This case study concerns the simultaneous release of standard and accessible versions of a popular work of fiction in the United Kingdom. The latest Harry Potter book (Harry Potter and the Half-Blood Prince) was brought out simultaneously in normal print, large print, Braille, audio and DTB. The RNIB produced the Braille version; the publishers produced the normal print, large print and audio. A third actor produced the DTB.

7.1.1 Print version

Bloomsbury published both the standard print and large print versions of the book simultaneously. The large print version was produced in 16 point and was of good quality. A number of individuals seemed to encounter significant difficulties in securing large print copies of the book, despite the large print version being mentioned in press releases and included on the Nielsen database (the primary tool for booksellers in identifying availability). Potential customers for large print were encouraged to pursue the issue with mainstream booksellers.

The National Blind Children's Society (NBCS) also provided customised large print versions of the book to children, in a range of point sizes. They waited until the print version of the book was published and then scanned it. The customised large print service is only available to children and the print price is heavily subsidised by the charity. As with other accessible versions of the book (see below), the stated aim of the charities concerned is to provide the ‘same book, at the same time, at the same price’ to the print impaired end user. (List price of standard print book £16.99, list price of large print book £30.00, price of NBCS book £16.99 to individual children)


7.1.2 Braille

RNIB were required to undertake and pay for a security audit before the publication date. Once cleared, a representative from Bloomsbury arrived with the Word file of the book on a CD. The CD was loaded onto a computer that was not networked. The Word file was translated into a Braille file and a stock of hard copy Braille was produced. The artwork for the cover had been secured earlier than the text. The Braille version of the book was available at exactly the same time as the print version of the book. This was a major achievement, although the cost of production was still heavily subsidised by RNIB. (Price of Braille copy of book: £16.99, available from RNIB)

7.1.3 Analog Audio

The audio rights for the book were owned not by Bloomsbury but by an individual, Helen Nicholl. The actor Stephen Fry narrated the book (as he had done with previous Harry Potter titles). The book was recorded using industry standard, professional recording software (Pro Tools/Sadie). The digital master was then used to produce both a CDDA (Compact Disc Digital Audio) copy of the book (17 CDs long) and an audio-cassette version of the book. These audio versions of the book were available 6 weeks after the publication of the print book. This delay was apparently due to the work commitments of Stephen Fry. According to UK law you cannot use a copyright exception to produce an accessible version of a book if there is already an equivalently accessible version commercially available. For this reason the Stephen Fry cassette and CD version of the book is the only one available, albeit at a very high price. (List price of analog audio £55.00, list price of CDDA £65.00)

7.1.4 Daisy Audio

Daisy audio is deemed a significantly different format to both analog audio and CDDA, so RNIB was within the law to produce the book in Daisy.

There were several options available:
• Use the electronic file (used to produce the Braille) to produce a synthesised voice version of the book
• Use an RNIB narrator to produce a real voice recording of the book
• Use the Stephen Fry recording to produce a Daisy version of the book

The advantage of the first two options was that the Daisy version would have been available at the same time as the print version of the book. However, as previous Daisy versions of the series had all used the Stephen Fry audio version, it was decided to go for continuity over ‘same day publication’. The rights owner provided a copy of the CDDA version of the book. This was used to produce wave files of the recording, which were then converted into the Daisy book. The Daisy book is available for loan via the RNIB talking book service. It is also available for sale. The rights owner set an arbitrary limit of 250 copies (for sale). This will soon be reached and will need to be re-negotiated. Once again, the Daisy version of the book was given the same price as the standard print copy of the book. It was available for loan and sale 4 months after the original publication of the print book. (Daisy copy of book from RNIB £16.99)

7.2 Magazine and Newspaper distribution in The Netherlands

Actors involved: Public blind, Public partially sighted, Public dyslexics, National Publisher, General, Multimedia Publishing, Large

Conversions: ASCII to XML, XML to audio, XML to Braille, XML to printed paper, XML to XML, HTML to XML, Multimedia to XML

Dedicon, the Dutch library for the blind, automatically converts 37 newspapers to accessible XML formats, making them available at the same time as, if not before, the printed editions. Magazines still require some intervention by skilled staff to convert files to a common format, which sometimes causes short delays between print publication and the availability of accessible versions.

Dedicon, the Dutch library for print impaired readers, distributes 37 newspapers and 60 magazines in a special XML format. The website Anderslezen.nl is used as a digital distribution platform, which reduces delay in delivery to a minimum. Most of the newspapers are available at the same time as or even earlier than the printed versions. This is possible because conversion and distribution of the content is fully automated. Magazines cannot be delivered so quickly because some of the work still has to be done by hand. Production of the special “Dedicon” XML format by hand means that delivery times cannot be guaranteed. Nevertheless, delays are reasonable.


7.2.1 The production of newspapers

Publishers deliver their content in different formats, for example NewsML, NITF, SGML and special internally developed XML formats. The files are transferred by FTP to a Dedicon server, where they are converted automatically:

• The conversion editor/post processor (Dedicon software) or XSLT software converts the file into Dedicon-XML and sorts the content in a logical order.
• A DMD-file (document metadata, Dedicon software) is created. It contains metadata about the publisher, the newspaper, the date and the size of the file. It supports the transport and storage of metadata to Anderslezen (website and distribution platform).
• The Dedicon-XML is encrypted and placed, together with a CSS and an XDF-file (the CSS supports the customer's reader software), in an EXD-file (encrypted XML document).
• EXD- and DMD-files are placed on the website Anderslezen (distribution platform).
• The customer gets an automatically generated email with an attached EXD-file or a link.
• The EXD-file is decrypted and the file is opened by the Dedicon-reader.
• The customer can navigate through the document with the mouse or the keyboard.
• The customer can choose between different output formats: synthetic speech, Braille and large print, all of them produced locally from the XML files.
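The metadata step in the list above can be illustrated with a small sketch that records the four items the text names (publisher, newspaper, date and file size) as XML. The actual DMD file format is not described in this document, so the element names, the function name and the sample values below are purely hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date


def build_metadata_record(publisher, newspaper, issue_date, content):
    """Build a small XML metadata record for one converted issue.

    Element names are illustrative; they are not the real DMD format.
    """
    record = ET.Element("metadata")
    ET.SubElement(record, "publisher").text = publisher
    ET.SubElement(record, "newspaper").text = newspaper
    ET.SubElement(record, "date").text = issue_date.isoformat()
    ET.SubElement(record, "size").text = str(len(content))  # size in bytes
    return ET.tostring(record, encoding="unicode")


# Sample values are made up for illustration.
record = build_metadata_record(
    "Example Publisher", "Example Daily", date(2008, 2, 1), b"<doc/>"
)
```

Keeping the metadata in a separate, unencrypted record means the distribution platform can catalogue and route an issue without having to decrypt the content itself.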

7.2.2 The production of magazines

The content is delivered not only by FTP but also by CD or email. Dedicon receives many different formats, such as PDF, Word, QuarkXPress, InDesign and HTML. It is not possible to convert all of those formats automatically.

QuarkXPress is converted by hand in an Apple environment. The employee places the different parts of the content in the right order by using Textarch software. Automatic export would lead to an illogical content order.

InDesign is converted in a Windows environment. With the assistance of FIX software (developed by Dedicon) the employee places the different parts of the content in the right order.

The conversion editor/post processor (Dedicon software) converts the content into Dedicon XML.

The employee checks the quality of the Dedicon XML. The content has to be well formed and valid. In case of an error the employee repairs the content by hand.
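The well-formedness part of this check is easy to automate. The sketch below uses the XML parser from the Python standard library and is an assumption about how such a check could be done, not a description of the Dedicon tooling. Validity against a DTD or schema needs a validating parser, which the standard library does not provide, so that part is not covered here.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if xml_text is well formed, else the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return str(error)
    return None
```

For example, `well_formedness_error("<a><b></a>")` returns a message that includes the line and column of the mismatched tag, which tells the employee where to start the manual repair.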

Any information which is missing and relevant is added to the DMD file (document metadata, Dedicon software) by the employee. The rest of the process is similar to the production process of newspapers.

7.3 Time Warner and Dolphin Audio Publishing

Actors involved: International Publisher, General, Audiobooks, Large Service Provider

AFB Talking Books recently introduced a new technology developed in partnership with Time Warner AudioBooks and Dolphin Audio Publishing, for the best-selling author James Patterson's new novel, The Jester. The Jester appeared as an audio e-book in 2003 and included an unabridged CD audio version of The Jester. The audio e-book, basically a digital talking book (DTB), was made possible through standards developed by libraries for people who are blind from around the world through the international DAISY (Digital Accessible Information System) Consortium.

DTB technology allows large amounts of textual information and formatting to be stored, transcribed into a variety of formats, and easily navigated. The EaseReader software developed by Dolphin Audio Publishing synchronises audio to the text and plays/displays The Jester on desktop and laptop PCs. Readers can display the text of the book on the screen, fully synchronised with the audio of a professional narrator. Switching back and forth, or “toggling,” between print and audio versions of the same work, is also possible. Additionally, users can search both the entire text and audio for keywords and phrases. These features have a particularly broad appeal for travellers and commuters who may wish to read the text and listen to the audio independently or simultaneously, depending on their environment.

Electronic hardware manufacturers are already responding to the innovation. In the near future, audio e-book technology will be integrated into hand-held Personal Digital Assistants. Additionally, the Consumer Electronics Association is planning to integrate the DTB file format into CD player technology, allowing any CD player to access the audio portion of the audio e-book. This in itself would mark a significant advance, since an entire book’s worth of text and audio can fit onto one CD with the DTB file format.

7.4 Educational publishing in Austria

Actors involved: International Publisher - General - Printed paper books - SME, Service Provider - Public blind

Conversions: Multimedia to XML, XML to HTML, XML to Multimedia.

This case study outlines the co-operation between Austrian schoolbook publishers and service providers for people with special needs, to make books available in electronic formats.

7.4.1 Situation in Austria

In Austria, the Federal Ministry for Social Affairs and Generations is providing educational materials like schoolbooks and other materials for primary and secondary education. Blind and visually handicapped students - and hopefully soon in the future other print disabled students - can order books in accessible formats.

Until this project, publishers did not agree to hand over and distribute digital copies of books. The development of alternative formats started from printed books with scanning and OCR or, when lots of graphics and/or formal structures like maths were used, with typing. In this process structure was added to the book: headings were defined, and lists and other structural elements were assigned to the text. This was a very time-consuming process.

This situation motivated a project which addressed the following issues:
• a minimum set of structural elements which documents from publishers have to contain to make them usable for the production of books in alternative formats (e.g. Braille, large print, ebooks)
• know-how and handouts for publishers on how to implement structured design with these elements using standard desktop publishing (DTP) systems (InDesign, QuarkXPress)
• example new books and redesigned existing books, to learn how to do accessible document design in practice
• training materials, workshops and seminars to transfer the developed know-how to other publishers and design agencies
• a general agreement which gives the right of transferring books in electronic format to students with disabilities
• a Document Rights Management System to prevent the data from being misused in practice
• a workflow for the co-operation between schools/teachers, service providers, publishers and the ministry.

Publishers were interested in taking part because a) the new anti-discrimination legislation will ask for accessibility of school books and b) they experience general problems in the publishing process when they want to use sources for different publishing purposes (e.g. print, online, CD, audio/multimedia). This convergence of interests led to a strong partnership for the project named.

7.4.2 "Multi Channel Publishing"

Five publishers take part in the project. Each of them is responsible for designing or redesigning one of their books based on a predefined set of structural elements. This basic structural design defined in the project guarantees that the electronic version of the book can be used for the production of alternative formats. An analysis of the publishing process at the publishers’ sites showed that service providers can only start from the final print-ready version, as the content, which is approved by public authorities, changes up to this point. Today this final version is most often a PDF generated from a DTP tool (e.g. Adobe InDesign or QuarkXPress). Because of this, if electronic sources are to be usable by service providers, structured design has to be implemented in the DTP work.

7.4.3 Definition of structural elements for electronic versions of books

To be able to collect the data of the source document and convert it into an XML file, we used the element set of the TEI standard, in particular the TEI Lite DTD. The TEI’s Guidelines for Electronic Text Encoding and Interchange were first published in April 1994. This set of metadata is widely known by publishers and guarantees compatibility with, or convertibility to, other definitions in use like Daisy [Daisy 06]. Using TEI keeps the process close to the upcoming XML database schemes which publishers might use in the future when using database structures for processing their documents. The TEI Lite DTD still consists of over 120 elements for the tagging of books, most of them important for librarians. To simplify the work for all participating parties, a subset of those elements was selected. This subset consists of structural elements which are of general importance for structured document design and automatic content processing. This subset does not ask for special knowledge of accessible versions but can be seen as the basis for structured document design in general. Using this subset guarantees that the sources (or PDFs) can be used as a starting point for the production of accessible versions. In general this subset of the TEI Lite DTD comprises structural metadata elements for:
• Headings
• Divisions / Subdivisions
• Images
• Tables
• Notes
• Page breaks
• References
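A restricted element set of this kind lends itself to an automatic check. The sketch below reports every element in a document that falls outside an allowed set. The set shown uses TEI element names that correspond to the categories listed above (`head`, `div`, `figure`, `table`, `note`, `pb`, `ref`) plus a few containers, but the exact subset chosen by the project is not given in this document, so `ALLOWED_ELEMENTS` and the function name are assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed subset: TEI names matching the listed categories, plus containers.
ALLOWED_ELEMENTS = {
    "text", "body", "div", "head", "p",
    "figure", "table", "row", "cell",
    "note", "pb", "ref",
}


def elements_outside_subset(xml_text, allowed=ALLOWED_ELEMENTS):
    """Return the sorted names of elements that are not in the subset."""
    root = ET.fromstring(xml_text)
    found = set()
    for element in root.iter():
        name = element.tag.rsplit("}", 1)[-1]  # ignore any namespace
        if name not in allowed:
            found.add(name)
    return sorted(found)


sample = """<text><body><div>
  <head>Chapter 1</head>
  <p>Some text.<pb n="2"/></p>
  <list><item>Not in the subset</item></list>
</div></body></text>"""

unexpected = elements_outside_subset(sample)
```

Running such a check on a publisher's export shows quickly whether the file stays within the agreed structure or needs manual post-processing first.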

The subset also comprises administrative metadata elements (e.g. Edition, Year of Publishing, Author(s), Publisher, ...). Experience in the project showed that this DTD subset is sufficient to structure the content of the schoolbooks. After a short training, publishers were able to do the work by themselves. This subset also proved to be in accordance with new publishing systems based on XML databases.

7.4.4 Authoring Tools

After the definition of the XML DTD, knowledge was developed on how the authoring tools could support efficient and correct mark-up of documents during the layout process. Furthermore, routines for exporting the defined structure and layout data into XML were developed. The two most widely used authoring tools were examined in detail:

7.4.5 Example Books

The post-processing tasks are necessary because, as mentioned before, the exported files in some cases have no structure and there are also parts of some books that could not be exported (e.g. graphics made in the authoring systems). The post-processing tasks were:
• Adding structure to the XML
• Revising elements that were not exported properly
• Describing images

The result after the completion of the work is a valid XML version of the book. The next step is to convert the XML via style sheets into the target format. The style sheets for the conversion are freely available on the internet. They allow the XML file to be converted into an HTML file with one or multiple pages, or into a PDF file.

7.4.6 Training materials, seminars and workshops

Training materials have been developed which are now used in workshops and seminars to transfer the knowledge to as many publishers and design agencies as possible.

7.4.7 DRM-System

To make sure that the books are not used outside the designated user group, a DRM system was customised. The system consists of secure reader software and a USB dongle, which acts as the key. Every student gets a key and the software. The key has a code which allows the student to read the book if the key is plugged into the computer. This system has the advantage that the user is not bound to one specific computer or piece of hardware. The student can read the book, for example, at school but also in a learning group or at home. How the students get their books, and the detailed workflow between publishers and the service providers, is described in the next paragraph.

7.4.8 Workflow

To start the process, a teacher of a student with special needs orders a book in an accessible format. If the schoolbook service provider already has the book in stock, it will be provided directly to the student. Otherwise, the service provider asks the publisher for the electronic version of the book. The publisher sends its TEI-XML file to the service provider. The service provider produces the accessible version of the book. Printed (Braille/enlarged) copies are sent by standard mail. If an electronic document is ordered, the service provider encodes the files with the DRM system using the data from the student’s USB dongle. The book is placed on a server, from which the student can download it. When the student has the reader software installed and the dongle plugged in, the student can open the book and read it.

7.4.9 Agreement between Publisher and Service Provider

To ensure that the process works efficiently, an agreement between publishers and service providers has been drafted. The core articles of the agreement are:
• The publishers provide their electronic source documents
• It must be ensured that the books are only given to people with a special need
• A DRM system must therefore be used
• It must be a "closed" system with registered users

The agreement will be signed by every publisher and service provider. If a service provider needs a book from a publisher, it can ask for it under the conditions of the framework agreement.

7.4.10 Conclusion

The most important result of the project is the fact that handing over digital copies of print published documents is guaranteed in the future.

The project showed that it is technically feasible to create XML versions of books by using the print-ready version of a document. The experience also showed that the quality of the XML obtained by just using the functions provided by the authoring tools is not good enough. A lot of work has to be done afterwards in cleaning and revising the XML document. The people who perform this work will need some basic XML skills. It will also be a challenge to convince the publishers to create documents that can be exported into XML without a lot of additional effort. In some areas there are at the moment only limited possibilities to use sources from publishers, especially where books consist mainly of pictures, graphics and other visual content. Another challenge is the integration of non-text content like mathematical or chemical expressions.

The project made obvious that all publishers pass their layout data to the print office by using PDF. An important task for the future will be the development of a program to allow authoring systems to create PDF files that are either accessible or allow a conversion back into a useful format.

In any case these are only first, but important, steps towards multi-channel publishing. More work is needed for a more efficient production of different versions of one source document.

7.5 Best practice for distributing accessible content

The “ideal accessible information network” could be a structured and collaborative network of organisations producing accessible information for print disabled people. To improve its efficiency, it should be a technical network with normalised tools and practices, and with technical experts who keep working with publishers on innovative solutions.

Reliable technical solutions should be set up to distribute accessible electronic documents to print disabled people. These solutions must guarantee intellectual property rights without restricting access to information. This ideal accessible information network should be a trust network, where the actors are well identified and work responsibly and accountably. Exchanges with publishers should be enhanced and organised by trusted intermediaries to discuss intellectual property rights and structured files provision. These intermediaries should be legally acknowledged as public authorities.

Publishers should be provided with clear specifications on the file formats they can provide to trusted intermediaries. If necessary, guidelines or tools can be supplied to help publishers in integrating accessibility in their production chains.

Publishers should introduce accessibility in the contract they sign with their subcontractors and service providers. They should also guide their authors in creating structured information using the prescribed authoring tools.

When projects on electronic products are launched, accessibility should be introduced in the functional specifications and considered in the project design and realisation.[81]

7.5.1 Current examples of good practice

These examples of current good practice cover the following areas:
• Technical books
• School books
• Electronic books

7.5.2 Technical books

These are examples of best practice in distributing accessible content in the form of technical books.

7.5.2.1 O'Reilly Media Inc.

Technical books and articles have changed considerably with the advance of new information technologies. The example of O'Reilly Media Inc. illustrates how technical information can be disseminated in both paper books and electronic documents. O'Reilly Media was originally a technical writing consulting company. Today it is one of the best-known publishers of books for software developers, with its iconic “animal books” and “In a Nutshell” references.

O'Reilly's publishing activities are not limited to paper books; it also offers many on-line services such as its “Safari Books Online”[82] service, a web-based subscription service that offers a searchable reference library of computer books from different publishers. This on-line library allows subscribers to search across more than 3,000 books; parts of books or entire books can be read on-line, and the catalogue can be browsed by category. Chapters of books can be downloaded for viewing off-line.

O'Reilly has published a number of Open Books[83] – books with various forms of “open” copyright. These books may be out of print, or written by authors who wanted their books to be widely distributed under a particular open copyright. Through its Open Library project, the Internet Archive[84] is scanning and hosting PDF versions of O'Reilly open books.

A number of the open books are also available as HTML or PDF e-books on the O'Reilly web site. These documents are structured in chapters and sub-chapters and contain tables of contents for accessing information.[85]

To create such a variety of products and services around paper books, O'Reilly has set up a complete publication process starting from authors to the final products.

Authors are provided with very strict guidelines for the final book submission. The approved formats are:
- Microsoft Word for PC or Mac, tagged to O'Reilly's paragraph and character style template,
- XML tagged according to the DocBook Lite DTD,
- Adobe FrameMaker tagged according to the paragraph and character style tags in O'Reilly's templates.

Once the final draft is submitted by the authors and properly reviewed by a technical committee, the book is prepared for print by O'Reilly staff. Illustrations are redone by graphic artists and the cover is designed.

The camera-ready and press-ready material is then produced by the production group late in the production process. Adobe FrameMaker is used to prepare the final document that will be sent to press.[86]

81 http://wiki.euain.org/doku.php?id=wiki:distribution:distributing_content:best_practice

82 http://safari.oreilly.com/

83 http://www.oreilly.com/openbook/

84 http://www.archive.org

85 An example of an HTML open book can be found at: http://www.oreilly.com/catalog/debian/chapter/book/index.html

86 http://www.oreilly.com/oreilly/author/

This particular example shows a publisher who has created a rationalised multichannel publication process. Both paper books and digital publications are created from the authors' submissions. This process is well documented. It involves authors in the early stages and guides them through the editing process. Even if it takes place in a favourable context (the authors are computer scientists and O'Reilly specialises in technical publications), this example is encouraging for structured content publication, which in turn leads to accessibility.

7.5.3 School books

These are examples of best practice in distributing accessible content in the form of school books.

7.5.3.1 Bordas-Nathan electronic school books

The French publishers Bordas and Nathan[87][88] are major school book publishers in France. In addition to a very large catalogue of paper books for pupils, teachers and pedagogues, these publishers create multimedia products such as web sites and CD-ROMs for teachers and pupils. Since 2000, Bordas and Nathan have also been actively involved in electronic schoolbag projects.

BrailleNet[89] has studied the accessibility of an electronic History and Geography book for 15-year-old pupils published by Bordas. The electronic book content is based on the paper version, with extra multimedia documents such as Macromedia Flash animations or audio and video sequences. The publisher's aim in this project was also to explore multichannel publishing to deliver information to different target devices such as personal digital assistants (PDAs) and third-generation cell phones.

The final version of the electronic book for computer is a thick client application based on a modified version of Mozilla:
- the book's static content is encoded in XHTML,
- the user interface is described with XUL,
- and the major part of the application mechanisms is developed in JavaScript.

The electronic book for computer offers the following functionality:
- read textual content, view pictures, play video and audio documents,
- search content in the whole book using an integrated search engine,
- browse the table of contents of the book and move directly to a given part, chapter or lesson thanks to encoded links,
- annotate images using a minimal editing toolkit containing a brush and a colour selector,
- create and edit XHTML content for pupil’s personal homework.

BrailleNet carried out accessibility tests with common assistive technologies such as screen readers and magnifier software. The results were negative because of compatibility issues between Mozilla and the screen readers. Even if these technical problems could have been solved, the use of Macromedia Flash was also another important barrier to access.

However, the book's high level of structure and the large amount of semantic information added to it offered a good opportunity to improve its accessibility. It was decided to take the study further: the publisher let BrailleNet access the build chain of the application and develop solutions to create an accessible version of the book.

Both XHTML content and XUL content are automatically built from XML data. The source of the whole application is contained in a single XML document following a DTD developed by the publisher. This DTD is divided into two parts:
- a first subset is common to every electronic book produced by the publisher; it is close to XHTML and describes information such as paragraphs, divisions and images;
- a second subset is specific to a given book. It defines the grammar and the vocabulary of the book: which containers are used and how they can be nested. This part was particularly interesting because the publisher chose to structure information in a very semantic way in order to output media-specific structures later in the production process.

The publisher provided BrailleNet with this DTD and the XML source file of the application. It was interesting to note that the DTD already contained the necessary structures to add textual alternatives to images. However, all these alternatives were empty in the XML document. On the other hand, all the documents (textual documents, illustrative pictures, video and audio) were provided with a textual legend introducing and describing the content.
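
The situation described above (empty textual alternatives, but a legend on every document) suggests one possible repair step. The following is only a minimal sketch in Python: the element names `image`, `alt` and `legend` and the sample data are hypothetical and are not taken from the publisher's DTD, which is not reproduced in this CWA. A legend is not always an adequate alternative, so the result would still need editorial review.

```python
import xml.etree.ElementTree as ET

# Hypothetical source fragment; element names are assumptions for illustration only.
SAMPLE = """
<lesson>
  <image src="map.png"><alt></alt><legend>Map of Europe in 1914</legend></image>
  <image src="chart.png"><alt>Population chart</alt><legend>Figure 2</legend></image>
  <image src="photo.png"><alt/></image>
</lesson>
"""

def fill_empty_alternatives(root):
    """Copy the legend into each empty <alt>; return the src of images still lacking text."""
    unresolved = []
    for image in root.iter("image"):
        alt = image.find("alt")
        if alt is None:
            alt = ET.SubElement(image, "alt")
        if (alt.text or "").strip():
            continue  # an alternative already exists; leave it untouched
        legend = (image.findtext("legend") or "").strip()
        if legend:
            alt.text = legend
        else:
            unresolved.append(image.get("src"))
    return unresolved

root = ET.fromstring(SAMPLE)
missing = fill_empty_alternatives(root)
```

Images listed in `missing` have neither an alternative nor a legend and would have to be described by hand.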

87 http://www.nathan.fr/Multimedia/cartable/default.asp

88 http://www.editions-bordas.com/

89 http://www.braillenet.org/

With the publisher's DTD and XML source files, BrailleNet was able to develop a set of XSL style sheets to convert the XML source document into XML dtbook and then to an accessible XHTML book where a user can easily navigate with tables of contents, many internal links, textual alternatives to images etc.
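
The kind of transformation described above can be illustrated with a short sketch. BrailleNet used XSL style sheets; the example below instead uses Python's standard `xml.etree.ElementTree` module, purely so that it is self-contained, and it skips the intermediate dtbook step. The source vocabulary (`book`, `chapter`, `para`, `image`) is hypothetical and does not reproduce the publisher's DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured source; not the publisher's actual vocabulary.
SOURCE = """
<book title="History and Geography">
  <chapter id="c1" title="Europe in 1914">
    <para>Alliances divided the continent.</para>
    <image src="map.png" alt="Map of European alliances"/>
  </chapter>
  <chapter id="c2" title="The Great War">
    <para>The war began in August 1914.</para>
  </chapter>
</book>
"""

def to_xhtml(book):
    """Build an XHTML tree with a linked table of contents and image alternatives."""
    html = ET.Element("html", {"xmlns": "http://www.w3.org/1999/xhtml"})
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = book.get("title")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = book.get("title")
    toc = ET.SubElement(body, "ul", {"id": "toc"})
    for chapter in book.findall("chapter"):
        # One table-of-contents entry per chapter, linking to the chapter's anchor.
        item = ET.SubElement(toc, "li")
        link = ET.SubElement(item, "a", {"href": "#" + chapter.get("id")})
        link.text = chapter.get("title")
        section = ET.SubElement(body, "div", {"id": chapter.get("id")})
        ET.SubElement(section, "h2").text = chapter.get("title")
        for child in chapter:
            if child.tag == "para":
                ET.SubElement(section, "p").text = child.text
            elif child.tag == "image":
                # Carry the textual alternative over to the XHTML alt attribute.
                ET.SubElement(section, "img",
                              {"src": child.get("src"), "alt": child.get("alt", "")})
    return html

xhtml = to_xhtml(ET.fromstring(SOURCE))
output = ET.tostring(xhtml, encoding="unicode")
```

Because the source already identifies chapters and images semantically, the navigation aids are derived mechanically; no structure has to be guessed from layout.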

This example is encouraging for the integration of accessibility in the publishing industry. It shows how a publisher was led to add structure to its content to create innovative new products for its customers. It also shows how easy it was to create an accessible version of the electronic book from the publisher's XML source files. This demonstrates the technical convergence of publishers and accessibility in electronic documents, especially in the case of multichannel publishing. This technical convergence is also illustrated by the example of a mathematics book from the same publisher: it was decided that mathematical content would be stored in MathML instead of images so that it can be converted later in the build chain if necessary.

7.5.4 Electronic books

These are examples of best practice in distributing accessible content in the form of electronic books.

7.5.4.1 Numilog

Numilog is an electronic bookseller based in France that sells books to both French-speaking and English-speaking markets. The Numilog website offers its customers a large catalogue of titles (more than 23,000 available titles). Customers can choose the format they prefer from:
- Mobipocket format (PRC), readable on a computer with a Microsoft operating system, Palm, Pocket PC and smartphone;
- Microsoft Reader format (LIT), also readable on all the above platforms;
- PDF for PC and Mac.

The catalogue is composed of novels, books for children, documentaries, non-fiction books (computing, management, biology, law, economics, ...) and dictionaries.[90]

7.5.4.2 Relations with publishers

Numilog has business relations with publishers to negotiate the rights to distribute their books on the Internet, the selling price and, where applicable, source file provision. English and American publishers set the price Numilog must pay to obtain the rights to sell electronic books from their catalogue. Numilog is then free to decide the selling price and thus its profit margin.

The French publishers collaborating with Numilog have decided to apply the policy they use with conventional book distributors: book prices are set by the publishers themselves. This follows the French law on book prices, known as the “loi Lang”. This law was passed in 1981 and prescribes that anyone publishing or importing a book has to define its selling price. This price must be respected by all distributors. The law does not cover electronic books, but the publishers have decided to apply it to this particular case. This means that Numilog has to negotiate the selling price with publishers in order to keep a decent profit margin. The price of the electronic version is always lower than the price of the printed book. Most of the time, Numilog has to pay to obtain the electronic files from publishers, and the price fluctuates. Publishers almost always provide Numilog with files of their books. 90% of these files are prepress files in PDF, QuarkXPress or Adobe InDesign. PDF files are always optimised for press, with cutting lines and very large pictures. The other 10% are authors' files, usually in Microsoft Word. Sometimes Numilog must digitise books because publishers cannot provide digital files.

Numilog has chosen Adobe Content Server[91] to secure eBook distribution. Content Server is a Web-based system for publishers, distributors, libraries and booksellers. It automates the supply chain for eBooks and other media by providing:
- an interface for eBook publishing, distribution and procurement;
- a way to manage and protect digital rights;
- a secure repository with encryption of eBooks and authentication of transactions;
- a business-to-business transaction model, including selling eBooks to clients and procurement from vendors;
- a lending model for online libraries ...

Numilog has become an Adobe partner for this particular product and can sell licences and offer to host this service for a customer.

7.3.3. eBooks preparation

Numilog has set strict quality requirements for eBooks:
- file size must be optimised (finding a good ratio between quality and file size);
- all the eBooks must have a cover page;
- file textual content must not be provided as images (textual electronic books only) so that content can be searched, magnified and selected (but not copied, for security reasons);
- technical books must be structured and offer a convenient way to access information quickly;
- footnotes and endnotes must be hypertext links as often as possible;
- external references are checked to avoid broken links.

To prepare the eBooks, Numilog employs people to rework the files provided by publishers.
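
Requirements of this kind lend themselves to an automated pre-check before manual rework. The sketch below is hypothetical: the `EbookFile` record, its field names and the 20 MB size threshold are assumptions made for illustration, and nothing here describes Numilog's actual tooling.

```python
from dataclasses import dataclass, field
from typing import List

MAX_SIZE_MB = 20.0  # assumed threshold; the requirements above give no figure

@dataclass
class EbookFile:
    """Hypothetical summary of one eBook file, as produced by some earlier analysis step."""
    title: str
    size_mb: float
    has_cover: bool
    text_as_images: bool
    is_technical: bool = False
    has_navigation: bool = False   # e.g. bookmarks or a linked table of contents
    notes_total: int = 0
    notes_linked: int = 0
    broken_links: List[str] = field(default_factory=list)

def quality_problems(book: EbookFile) -> List[str]:
    """Return the list of requirements the file fails; an empty list means it passes."""
    problems = []
    if book.size_mb > MAX_SIZE_MB:
        problems.append("file size not optimised")
    if not book.has_cover:
        problems.append("missing cover page")
    if book.text_as_images:
        problems.append("text provided as images")
    if book.is_technical and not book.has_navigation:
        problems.append("technical book lacks navigation structure")
    if book.notes_linked < book.notes_total:
        problems.append("notes not hyperlinked")
    if book.broken_links:
        problems.append("broken external links")
    return problems

good = EbookFile("Novel", 2.5, has_cover=True, text_as_images=False)
scan = EbookFile("Scanned manual", 48.0, has_cover=False, text_as_images=True,
                 is_technical=True, notes_total=12, notes_linked=0,
                 broken_links=["http://example.org/gone"])
```

Several of these checks (text not delivered as images, navigable structure, working links) are the same ones an accessibility review would make.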

90 http://www.numilog.fr/

91 http://www.adobe.com/products/contentserver/index.html

7.5.4.3 Conclusion

This experience again shows a convergence of interests between the publishing world and the accessible information world, because Numilog's eBook quality requirements are close to accessibility requirements. Numilog faces the same issues that the accessible information world currently faces: publishers' files are not structured for electronic publication, and the additional cost of making structured information from unstructured files is prohibitive.

8 Identified gaps and areas for further research

In Section 1 we asked two fundamental questions:
• How do we describe the questions and problems of end users and content providers?
• How do we marry both within manageable and consistent frameworks?

This workshop has sought to provide some pointers towards answering these questions. As can be seen, there are many ongoing initiatives and projects which seek to incorporate accessibility within publishing processes. Largely through necessity, most of these initiatives have been undertaken by specialist organisations supporting print impaired people. In some instances, and sometimes in collaboration with publishers and other content providers, this has resulted in innovative practices and far greater access to information.

However, progress has been fragmented and often very slow. The provision of alternative format materials varies greatly from country to country (and within national boundaries), according to local conditions, the economic vagaries of provision and the type of impairment. These problems are well described elsewhere.

8.1 Descriptions & Requirements

Considerable work has been done by specialist organisations in establishing end user requirements, and a number of preferred output formats are well established. There is less consistency, however, in relating these requirements to reusable models for use within content processing environments. Combined with a historical tendency to separate alternative format production from mainstream production and to focus on separate specialised formats, this has led to a fragmented approach to implementing accessibility within both specialist and mainstream processes. Similarly, there is very little consistency in the work undertaken by content providers in this area and there is a corresponding fragmentation of effort. Such work as has been undertaken tends to focus on web authoring issues, and this addresses only one part of the accessible content processing chain. Indeed, an important outcome from this workshop is the realisation that generic processing models are required for this work to proceed with any degree of coherence.[92]

8.1.1 Further Research on Descriptions & Requirements

Systematic descriptions of end user requirements remain problematic. Such requirements are constantly changing and there has been little effort made to capture the dynamic nature of these requirements.

Further research is required to establish requirements for different types of impairments and to compare and examine where points of intersection might lead to collaborative efforts.

8.2 Process & Content Modelling

This workshop has sought to provide information about different scenarios and to point towards ‘real-life’ examples which have proven successful. The motivation for content providers to create accessible information will always vary but there is a clear need for generic processing frameworks which can make this as straightforward as possible.

It appears unrealistic to expect that any one format be accepted by all stakeholders and universally applied. Multimedia content processing can involve many different types of software and many different processes, thereby making it very difficult indeed to introduce accessible content processing at the right stages. No single input, representation or output format can contain these complexities. Given the general move towards distributed media, it is perhaps better to focus on building frameworks which enable accessible content processing, according to the local preferences and requirements of all the stakeholders. In short, accessibility is not a format or a product: it is a process.

Our modern use of multimedia information requires that information and services accommodate different presentations and interaction designs at the user interface level, on the basis of requirements that include user needs, preferences, personalisation, customisation, adaptation and constraints; characteristics of the tasks to be performed (e.g. repetitive, knowledge-intensive, collaborative); capabilities of available access devices; and contextual information. In providing this level of usability, a fundamental accessibility can be achieved which allows users to interact with content at a much deeper level.

92 See for example, the ongoing work of the ProAccess project (http://proaccess.euain.org), co-ordinated by the Italian Publishers Association and supported by the Federation of European Publishers.

This work may be at a relatively early stage, but most importantly it is a mainstream endeavour: the modelling and aggregation of content is a central concern for all those in the private and public sector.

The ever-increasing complexity of dealing with information from structured and unstructured media (images, sound, text, recordings, etc.), in several working modalities and in multitasking modes, makes adapting to context and content a necessity. Additionally, what is becoming more and more important is a universal, scalable, adaptive and customisable multidimensional interface to media content where appropriate media viewpoints/perspectives can be presented to the users adapted to their preferences, workflow constraints, and interaction models.

Existing approaches to user-content interaction are characterised by a lack of a holistic view to the complex problem of designing accessibility of interaction and content and they fail to look at the process of authoring, managing and delivering the content as being highly inter-woven. Also, the majority of these approaches either look at accessibility, personalisation or context of use problems; they do not deal with the more complex issue of user interaction and content presentation as a whole and the open-ended and frequently changing real world environments.

The various MPEG family members operate at different abstraction levels with some communication between these abstraction levels. The process of contriving a procedure to interface the various processing levels should be based on use. The difficulty lies in achieving a level of description of the user requirements that allows re-description in technological terms. This re-description ideally leads to specifications and ultimately implementations. These implementations ‘prove’ the viability of the concept: it is the proof of the hypothesis. The process of standardisation that runs in parallel with this ensures extraction of higher level descriptions and these are aggregated down to the earlier family members. Using this built-in feature to provide ‘slots’ for common and specialised accessibility requirements would create what we refer to as accessibility from scratch (see above). If embedded in the family tradition of the MPEG initiative, accessibility might become a commonly available feature instead of a workaround necessity.

The representation of the interplay between the various user groups should always remain accessible. If all relevant entities in a representation system remain accessible, creating meaningful mappings is a matter of connecting the appropriate entities. For this reason, accessibility from scratch is of fundamental importance.

8.2.1 Further research on process and content modelling

There is a need to develop open source frameworks to bridge the gap between original content design heuristics and intuitive multimodal interfaces required for content and communication systems.

Such frameworks would build in profile-based access to information, content and services, which not only bring together and extend state-of-the-art technologies for information access, but also conform to standards and guidelines available for accessibility, usability, scalability and adaptability.

There is a need to conduct basic research to establish the nature of the interaction between people with cognitive impairments and multimedia information. A critical and guiding factor is that the supply of information should be determined by the end user from a central content reserve, thereby allowing end users the freedom to explore the information as they see fit and to make their own choices regarding how the information is to be displayed, rather than through the sometimes discriminatory filtering processes of information gatekeepers.

8.3 Introducing and using metadata for accessibility purposes

People compress information. People decompress information. The compression procedure involves filtering out redundant information based on the perspective of the user. How do we decide which redundant data entities are relevant for the user? What to use? On what requirements are these redundant data entities based? Whose requirements? How do we marry the existence of these accessibility metadata entities with the requirements as described in “common” metadata entities? More importantly, how do we ensure a synchronised and therefore valid coupling between any kind of content and these metadata entities? How do we ensure that any metatags themselves remain accessible? What is the context of any accessibility metatags that are to be conceived?

How then can we make sure that the context remains consistent? If we describe the knowledge that is applied to enable processes to exist in a digital system that parallels analogue organisational systems, knowledge is transferred from the individual participants to a shared information framework. The use of knowledge can be separated into three parts: the body of information that is contained inside knowledge structures; static information about the knowledge processing, which is also known as meta-information or metatags; and dynamic information that is used to describe the processes and procedures to retrieve, transform or use the content. When introducing metatags that aim to address the needs of accessible information processing, it is mandatory to describe the procedures that will meaningfully interpret these meta tags to communicate the content in a way that enables every person to appreciate the content.

Creating meaningful mappings between the static redundant information - the meta tags - and the dynamic processes

Many people believe structure to be static: from a meta modelling perspective this is not the case. It is well known that if the representation of the information at hand is perceived by the system and mapped onto a framework, the information is then usable in a multitude of ways: and for this reason non-programmers will often promote the use of XML.

However, this mark up and the set of tools that surround it are simply a set of tools which exist to achieve this objective. If the architecture of the system does not answer the wider range of needs, requirements and questions, the mark up cannot paper over the cracks. In order to build extensibility into a system, the architecture should be such that every element used for processing the information is adaptable. This can be achieved by building a representation layer which builds an object oriented structure from the information and which is free to adapt the meta relationships and hierarchies intrinsic in that data genus. This is defined by identifying the parameters upon which the structure is built, and ensuring they are interconnected in such a way that promotes future adaptability without degrading the system: which is to say, using the right parameters for accessible information processing.

As noted in 8.1.1 above, the goal should be to anticipate the changes in user requirements. These changes can occur in the very nature of the requirements, such as new functional groups, or in the definition of the existing requirements, such as additional details. These aims should be pursued by adding redundant information in the form of meta tags, thus augmenting the quality of the content. The content itself and the existing meta tag structures, including their mapping to the meta modelling domain, are not allowed to change. From a meta modelling perspective, this allows us to meet changing requirements in the future, because if the requirements demand additional detail in the form of features or metadata, we can unveil the metadata that is available.

8.3.1 Further research on introducing and using metadata for accessibility purposes

Further research is required to identify and investigate the ways in which metadata can help achieve efficient and future-proof solutions to accessibility.

In order to make this perceived information useful, it must be represented within an architecture which allows the accessibility requirements to be questioned in more than one way. Such an architecture must both enable the core system to adapt to new and changing representation requirements and allow for (theoretically) infinite user requirements.

8.4 Standards and personalisation of content

Personalisation of media makes several demands on standards. In particular, it requires:
• that the modalities in content are identified
• that the modalities or adaptations of them that a user requires in the context are identified
• that the glue standards that enable these things to work exist

We perceive the world partly by using our senses. Modalities are the aspects or components of media or system interfaces that correspond with those senses and enable us to perceive them. For example a video usually has a visual aspect or component and this corresponds with the sense of sight. Without at least some match between the modalities available in the media and the senses a user has available at the time or access modes a user can use there can be no perception or use of the media. Therefore it is very useful if the modalities available in media or provided in an interface can be described. Doing so permits matching to the modalities a user has available or the authoring of adaptations to enable that matching.
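
The matching described above can be sketched in a few lines. The example is a simplified illustration in Python and does not implement any published standard: the `Adaptation` record, the `plan_access` function and the sample data are hypothetical, and the modality names are used only as plain labels.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Adaptation:
    name: str
    replaces: str   # modality of the original that this adaptation stands in for
    modality: str   # modality through which the adaptation itself is perceived

def plan_access(resource_modalities, adaptations, user_modalities):
    """Return (directly usable modalities, {missing modality: adaptation name or None})."""
    direct = resource_modalities & user_modalities
    plan = {}
    for missing in sorted(resource_modalities - user_modalities):
        # Pick the first adaptation that replaces the missing modality
        # with one the user can actually perceive.
        plan[missing] = next(
            (a.name for a in adaptations
             if a.replaces == missing and a.modality in user_modalities),
            None,
        )
    return direct, plan

# Hypothetical resource: a video with two authored adaptations.
video = {"visual", "auditory"}
adaptations = [
    Adaptation("audio description", replaces="visual", modality="auditory"),
    Adaptation("captions", replaces="auditory", modality="textual"),
]

blind_user = plan_access(video, adaptations, {"auditory", "tactile", "textual"})
deaf_user = plan_access(video, adaptations, {"visual", "textual"})
tactile_only = plan_access(video, adaptations, {"tactile"})
```

A `None` entry in the returned plan marks a modality for which no suitable adaptation exists yet, which is exactly where a new adaptation would need to be authored.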

There is not a simple precise match between modalities and senses because some senses are used in complex interactive ways – for example the sense of sight and reading – but there is a general broad matching.

Traditionally, ways of describing media on computers have developed in ways suited to the needs of computer design and not so well suited to use by people. An example is the Multipurpose Internet Mail Extensions standard for describing media formats (MIME types), designed for email but widely used for other purposes. This does not describe modalities. Only now are appropriate standards supporting modality description emerging.

ISO/IEC JTC 1/SC 36, the committee for Information Technology for Learning, Education and Training, has produced a standard for the description of access modalities and adaptations for those. This is Individualized Adaptability and Accessibility in E-learning, Education and Training (ISO/IEC 24751). It was initially developed in IMS and then internationalised in ISO. It is planned to be publicly available at the end of 2007.

The standard provides for the description and matching of resources to personal contexts/requirements. It has three parts: a Framework that shows how to use the other parts; a standard for the description of digital resources, Digital Resource Description (DRD); and a standard for the description of functional learner requirements, Personal Needs and Preferences (PNP), that enables matching of resources and adaptations to user requirements and specific contexts. Currently the vocabulary for modalities within the standard can be used to describe the visual, textual, auditory, tactile and olfactory modalities. Vocabularies for the description of adaptations for these (for example audio description) are more extensive, as these extend to representation forms used on computers and with assistive technology. Several new parts to the standard are under construction, including parts for the description of requirements and components for offline media and services and for places and events.

8.4.1 Further Research on personalisation of content

In order to use the modalities within media to match or adapt to user requirements, a few glue standards are needed. This is an area of active development in the standards.

To make it all work, best practices need to be developed for particular media. This CWA presents some of those best practices. Even more progress towards providing truly personalisable media could be obtained with the development of more formally described practices. Ontologies describing the use of different media components across different media types would be very useful here. For example, most media types contain alternatives that could be described and matched to a context as described above, but the media types all do it in different ways. What is needed is some way to operate across media types with the same principles.

IEEE Learning Technology Standards Committee is developing a standard currently called Resource Aggregation Model for Learning, Education and Training. This work is constructing a standard ontology that tackles this cross-media issue, though the precise requirements for adaptation for accessibility have not yet been addressed in that work. The work can be found on http://www.ieeeltsc.org/working-groups/wg11CMI/ramlet/

Alternatives within media types are further described elsewhere in this CWA.

8.5 Licensing and technical protection measures

It is evident that neither at international level nor at European level is there any requirement to provide exceptions to copyright protection facilitating access to protected materials by visually impaired people. Further, the measures on anti-circumvention of technical protection measures introduced by the WIPO Copyright Treaty (WCT) and the WIPO Performances and Phonograms Treaty (WPPT) in 1996 are not matched by any provision to accommodate exceptions to copyright protection. The European Union Copyright Directive does address this issue in article 6.4.1 but gives no indication how the conflict is to be resolved nor does it require resolution of the conflict for content made available via interactive digital services.

The legislative provisions found at national level are diverse. The reasons for this state of affairs include the following:

• The variety of digital file formats used by publishers
• The complexities of format and structure conversion and the provision of the corresponding resource requirement
• The concerns of publishers regarding the release to third parties of digital text files
• Concerns by publishers that they may be impeded from collaborating by not having the requisite rights to authorise conversion into particular formats, e.g. audio books

Security is a vital issue for publishers and technical protection measures are a complex issue for most content holders. Every publisher’s content, client base and requirements are different, which often results in a personalised set of requirements for each case. As a result, approaches to licensing and agreements on accessible formats are often negotiated on a case-by-case basis. Naturally, publishers have to be confident that any digital format is being delivered through secure gateways only to the people who are intended to receive it. For these reasons, there is a perception that the provision of digital files in alternative formats may compromise technical protection measures. This perception, combined with a widespread belief that the provision of accessible format materials is expensive and time-consuming, means that only limited progress has been made so far.

Much of the discussion around DRM and Accessibility has necessarily focused on the right of access versus the need to protect content. However, points of common interest exist and the development of trusted intermediary concepts can offer real-world solutions.

From a technical perspective, earlier problems relating to the digitisation of materials have been largely overcome and recent formats (such as XML, RDF, METS, MARC21 etc) provide a realistic basis for implementing the different aspects of this work. It is now possible to address the key concerns of content creators and providers and coherently to address issues such as: automation of document structuring, adherence to emerging standards, workflow support, digital rights management and secure distribution platforms.

As the lifetime of a book gets shorter and shorter, publishers frequently have to offer access to digital versions of that book. Taking this into account when constructing the layout brings us much closer to real accessibility in the wider sense. Indeed, it has been the accessibility community that has in many ways pioneered new structures for digital content, as these developments are often born of need.

Trusted intermediaries establish a personalised relationship between content holders and specialist organisations whereby publishers and agencies serving blind and partially sighted people work together in a secure and trusting environment to increase the quantity and timeliness of titles available in an accessible format. Within trusted intermediary frameworks, DRM is an enabler of controlled access. A number of different security methods are being developed or are already in use for making content available in this way.

As far as security is concerned, the higher the level, the more likely publishers are to allow content to be made available in accessible digital formats. At present, the security systems used are simple: they use basic encryption technologies with key exchange mechanisms. The potential for the release of content is considerable, although there are few recorded instances of this occurring. Once decrypted, content is available to anyone, authorised or not. The ability to attach content to particular devices, or better to provide access only to authorised users, requires a level of DRM sophistication that is not yet generally in place in services catering to the needs of visually impaired people.

8.5.1 Further research on licensing and technical protection measures

There is a need to examine and describe existing practice in this area, with particular focus on the implementation of trusted intermediary environments.

Further research is required to examine accessibility in the wider sense and to examine the requirements for modelling accessibility and DRM within emerging multimedia environments.

9 Conclusion and future work

Accessible content started as a niche market mainly targeting people with visual impairments and dyslexic users. Publishers have recently realised the potential market and are offering alternative book versions (audio, large print, etc.) of newly published editions together with the printed ones. Accessible content can be used in different situations and cover part of the changing requirements of the user population in the framework of ambient intelligence for content anytime, anywhere, and with any service. Usage examples include users on the move, multitasking environments that require hands-free, eyes-free access and browsing, learning and instructional scenarios, and many others.

Despite this, the full potential of accessible content has not yet been realised, owing to a mismatch between existing accessibility standards and their implementation in publishing processes. This CWA aims to reduce that gap and help realise the full market potential of accessible content. To that end, several real-life publishing scenarios have been analysed and the related actors, formats, conversions and standards presented.

From this analysis the following conclusions can be drawn:

(a) There are many common tasks that need to be covered when introducing accessibility into the publishing workflow, and these lead to common requirements. The tasks include defining document style sheets that are agreed and disseminated to the actors, manually correcting conversion input/output (where content creators did not follow the style sheet requirements), and extending the descriptions of visual data (images, videos) and other multimedia material.

(b) Specific format conversions need to take place in order to accommodate user needs. In these format conversions, the original structure, annotations, metadata and intellectual property rights of the parent documents need to be inherited by the child documents and not be lost. One of the strategies that seems most viable is to use a media-rich structured format (including annotations and metadata) as the reference and then create the desired output by operating directly on that format. An additional advantage of this strategy is that the format can ideally also be used for backups and stored in the publishing archive.

(c) Existing research, standards and technologies can be used to transform current publishing workflows into accessible content processing workflows. For this transformation to succeed, the barriers publishers face in deploying accessibility in their workflows must fall and the incentives for producing accessible content must increase. Barriers will fall when format conversions are automated and clear style sheets and guidelines are followed. Although a lot of research is still needed before accessible documents can be produced automatically, robustly and reliably from any source, there are already means to implement accessible content processing workflows economically. Incentives for producing accessible content may come from opening new markets for publishers, e.g. personalised information delivery, electronic distribution, and efficient storage, preservation, search and retrieval of publishing titles, from which a greater economic return is expected.

The analysis of accessible information provision also showed that some specific steps are needed to complement and extend the accessible information publishing process. These are:

• For software developers and ICT researchers: to develop frameworks and tools that support actors in content production in fulfilling the accessibility requirements, including automated conversion of both single and multi-type composite document formats into accessible documents, personalised presentation of content and adaptive content interfaces, and licensing and technical protection measures. The focus should be on solutions that address specific sector requirements (eGovernment, eLearning, medical documents, scientific documents, etc.).
• For accessibility researchers: to examine more closely and define the requirements of users with special needs in information provision and, based on that, to propose and/or adapt standards for the publishing and content management industry.
• For the content management industry: to put accessible information provision on its agenda and start proposing standards in this area, supporting open framework and standards development.
• For publishers and publishing associations: to work on elaborating the tools used by authors in the content authoring process and the systems used in content production and delivery, with the aim of providing detailed input for the ongoing development of accessible formats, accessibility software and standards, and to take part in future process and content modelling efforts for accessible information provision.

It is clear from the above that accessible information provision requires interdisciplinary efforts in order to be realised. We look forward to this realisation process and hope to attract the best solutions from each area and to see the barriers to accessing information reduced.
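The strategy in conclusion (b), keeping one structured reference format and deriving every output from it, can be sketched as follows. This is an illustrative sketch only: the element and attribute names of the sample document are hypothetical and follow no particular schema, and a production workflow would more likely use an established format and XSLT. The point shown is that structure, image descriptions and rights metadata carried by the reference document survive into each derived output.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical reference document; element and attribute names are
# illustrative assumptions, not part of any standard schema.
MASTER = """<book title="Sample title" rights="(c) Example Publisher">
  <chapter heading="First chapter">
    <para>Opening paragraph.</para>
    <image src="fig1.png" description="A bar chart of annual sales."/>
  </chapter>
</book>"""


def to_html(root):
    """Derive HTML, keeping headings, rights metadata and image descriptions."""
    title = escape(root.get("title", ""))
    rights = escape(root.get("rights", ""))
    lines = [
        "<html><head>",
        f"<title>{title}</title>",
        f'<meta name="rights" content="{rights}">',
        "</head><body>",
    ]
    for chapter in root.iter("chapter"):
        heading = escape(chapter.get("heading", ""))
        lines.append(f"<h1>{heading}</h1>")
        for node in chapter:
            if node.tag == "para":
                lines.append(f"<p>{escape(node.text or '')}</p>")
            elif node.tag == "image":
                src = escape(node.get("src", ""))
                alt = escape(node.get("description", ""))
                lines.append(f'<img src="{src}" alt="{alt}">')
    lines.append("</body></html>")
    return "\n".join(lines)


def to_text(root):
    """Derive plain text (e.g. as input to Braille or speech production)."""
    lines = [root.get("title", ""), f"Rights: {root.get('rights', '')}", ""]
    for chapter in root.iter("chapter"):
        lines.append(chapter.get("heading", ""))
        for node in chapter:
            if node.tag == "para":
                lines.append(node.text or "")
            elif node.tag == "image":
                # The image itself cannot be carried over, so its description is.
                lines.append(f"[Image: {node.get('description', '')}]")
    return "\n".join(lines)


root = ET.fromstring(MASTER)
html_output = to_html(root)
text_output = to_text(root)
print(text_output)
```

Both converters read the same parsed tree, so a correction made once in the reference document reaches every output format without further manual work.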

Appendix A – Relevant standards

Nr. | Title | Subject (if available) | URL

BP – PRD | | This document, derived from the guidelines for TEI Lite, provides an introduction to the recommendations of the Text Encoding Initiative (TEI), by describing a subset to, and extension of, the full TEI encoding scheme developed for marking up OUCS web pages and course documentation. | http://www.oucs.ox.ac.uk/oucsweb/tei-oucs.xml
BP – PRO | | I have prepared a set of XSLT specifications to transform TEI XML documents to HTML, and to XSL Formatting Objects. I have concentrated on TEI Lite, but adding support for other modules is fairly easy, and I am working my way through the TEI as applications come along. In the main, the setup has been used on `new' documents, i.e. reports and web pages that I have authored from scratch, rather than traditional TEI-encoded existing material. | http://xml.web.cern.ch/XML/www.tei-c.org/Stylesheets/teixsl.html
MPEG 21 | | MPEG-21 is developed within the International Standard Organisation (ISO) and aims at defining a normative open framework for multimedia delivery and consumption for use by all the players in the delivery and consumption chain | http://www.chiariglione.org/mpeg/standards/mpeg-21/mpeg-21.htm
Open eBook Publication Structure specification | | | http://www.idpf.org/specs.htm
ST – PRD | | The Text Encoding Initiative (TEI) Guidelines are an international and interdisciplinary standard that facilitates libraries, museums, publishers, and individual scholars represent a variety of literary and linguistic texts for online research, teaching, and preservation. | http://www.tei-c.org/
The International Standard Text Code (ISTC) | | The International Standard Text Code (ISTC) is developed by ISO Project 21047 and aims at a unique, international identification of individual textual works. | http://www.collectionscanada.ca/iso/tc46sc9/wg3.htm
AEN/CTN 139 | Computer applications for people with disabilities. Computer accessibility requirements. Software | | http://www.aenor.es/desarrollo/inicio/home/home.asp
AEN/CTN 153 | Audio description for visually impaired people. Guidelines for audio description procedures and for the preparation of audio guides | | http://www.aenor.es/desarrollo/inicio/home/home.asp
Authoring Tools Working Group (AUWG) | Authoring Tool Accessibility Guidelines 1.0 (ATAG 1.0) | | http://www.w3.org/TR/2000/REC-ATAG10-20000203/

Authoring Tools Working Group (AUWG) | Authoring Tool Accessibility Guidelines 2.0 (ATAG 2.0) | | http://www.w3.org/TR/ATAG20/
BS 7000-6:2005 | Design management systems. Managing inclusive design. Guide | | http://www.bsi-global.com/Quality_management/Design/bs7000-6.xalter
CAN/CSA-B659-01 | Design for Aging | | http://www.csa-intl.org/onlinestore/GetCatalogItemDetails.asp?mat=000000000002012683
Draft ISO/IEC 24751 | Individualised Adaptability and Accessibility in E-learning, Education and Training | The scope of this multi-part standard is to provide a common framework to facilitate matching of learner accessibility needs and preferences with appropriate learning resources and user interfaces. Part 1: Framework. Part 2: Access For All Personal Needs and Preferences Statement. Part 3: Access For All Digital Resource Description | Part 1: http://jtc1sc36.org/doc/36N1139.pdf (temporary URL); Part 2: http://jtc1sc36.org/doc/36N1140.pdf (temporary URL); Part 3: http://jtc1sc36.org/doc/36N1141.pdf (temporary URL)
Draft ISO/IEC 26513 | Software and systems engineering – User documentation requirements for documentation evaluators and testers. | To be developed by ISO/IEC JTC 1/SC 7/WG 2 – Software and systems documentation. Should contain both requirements and recommendations on all aspects of documentation evaluation and testing. | Estimated ISO publication in 2010.
Draft ISO/IEC 26514 | Software and systems engineering – User documentation requirements for documentation designers and developers. | Being developed by ISO/IEC JTC 1/SC 7/WG 2 – Software and systems documentation. Contains both requirements and recommendations on all aspects of documentation including planning, design, production and maintenance. Several clauses provide guidance on accessible documentation, notably 12.5 | Estimated ISO publication in 2008. http://www.jtc1-sc7.org/
EN 1332-4 | Identification Card Systems - Man-machine interface - Part 4: Coding of user requirements for people with special needs | Machine readable cards facilitate the provision of a growing variety of services across Europe. The purpose of EN 1332 is to increase the accessibility of these services for the benefit of consumers. This will be achieved by facilitating the inter-sector and cross border interpretability of machine-readable cards and to do so with the maximum possible degree of user-friendliness. EN 1332 addresses the needs of all users, including people with special needs, for example the aged, minors, the disabled, the visually impaired, those with learning difficulties, first time users, those not conversant with the local language. | http://www.bsi-global.com/en/Shop/Publication-Detail/?pid=000000000030009505

ETSI EG 202 416 | User interfaces: Set up procedures for mobile terminals and services | Being developed by ETSI HF STF 285 – Guidelines for set up procedures for mobile terminals and services. The ETSI Guidelines (EG) will include documentation and include recommendations for disabled and elderly users. | Available from ETSI web site free-of-charge. http://www.etsi.org/services_products/freestandard/home.htm
ETSI EG 202 417 | User education guidelines for mobile terminals and e-services | | Available from ETSI web site free-of-charge. http://www.etsi.org/services_products/freestandard/home.htm
IEEE standard | RAMlet – Reference model for resource aggregation. | | http://ieeeltsc.org/wg11CMI/ramlet/Pub/ also: http://www.ieeeltsc.org/working-groups/wg11CMI/ramlet/Pub/RAMLET_project_description.pdf/view
IMS standard | IMS AccessForAll Meta-data Specification 1.0 | | http://www.imsglobal.org/accessibility
IMS standard | IMS Learner Information Package Accessibility for LIP 1.0 | | http://www.imsglobal.org/accessibility
ISO 10075:1991 | Ergonomic principles related to mental work-load -- General terms and definitions | | http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=18045&ICS1=13&ICS2=180&ICS3=
ISO 14915-1:2002 | Software ergonomics for multimedia user interfaces -- Part 1: Design principles and framework | | http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=25578&scopelist=ALL
ISO 14915-2:2003 | Software ergonomics for multimedia user interfaces -- Part 2: Multimedia navigation and control | | http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=28583&scopelist=ALL
ISO 20282-1:2006 | Ease of operation of everyday products -- Part 1: Context of use and user characteristics | ISO 20282-1:2006 provides requirements and recommendations for the design of easy-to-operate everyday products, where ease of operation addresses a subset of the concept of usability concerned with the user interface by taking account of the relevant user characteristics and the context of use. | http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=34122&scopelist=PROGRAMME

ISO 2108:2005 | Information and documentation -- International standard book number (ISBN) | The purpose of ISO 2108:2005 is to establish the specifications for the International Standard Book Number (ISBN) as a unique international identification system for each product form or edition of a monographic publication published or produced by a specific publisher. It specifies the construction of an ISBN, the rules for its assignment and use, the metadata to be associated with the ISBN allocation, and the administration of the ISBN system. | http://www.isbn-international.org/en/index.html
ISO TS 20282-2:2006 | Ease of operation of everyday products -- Part 2: Test method | ISO 20282-2:2006 specifies a test method for measuring the ease of operation of "walk-up-and-use" products. The purpose of the test is to provide a basis for predicting the ease of operation of a walk-up-and-use product, including measures of its effectiveness and efficiency of operation, and the satisfaction of the intended user population in its expected context of use. | http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=36452&scopelist=PROGRAMME
ISO/AWI TR 22411 | Ergonomic data and guidelines for the application of ISO/IEC Guide 71 in standards related to products and services to address the needs of older persons and persons with disabilities | | http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=40933&scopelist=PROGRAMME
ISO/CD 9241-20 | Ergonomics of human system interaction -- Accessibility guideline for information communication equipment and services -- Part 20: General guidelines | | http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=40727&scopelist=PROGRAMME
ISO/DIS 9241-151 | Ergonomics of human-system interaction -- Part 151: Guidance on World Wide Web user interfaces | | http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=37031&scopelist=ALL
ISO/DIS 9241-171 | Ergonomics of human-system interaction -- Part 171: Guidance on software accessibility | | http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=39080&scopelist=ALL
ISO/DIS 9241-300 | Ergonomics of human-system interaction -- Part 300: Introduction to requirements and measurement techniques for electronic visual displays | | http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=40096&scopelist=PROGRAMME
ISO/IEC 11581 | Information technology -- User system interfaces and symbols-part 1-6 | | http://www.iso.org/iso/en/CatalogueListPage.CatalogueList?COMMID=4768&scopelist=ALL

ISO/IEC 18019:2004 | Guidelines for the design and preparation of user documentation for application software | Provides guidelines for the design and preparation of user documentation for application software. It describes how to establish what information users need, how to determine the way in which that information should be presented to the users, and how then to prepare the information and make it available. Contains recommendations on implementing accessibility for documentation (clause 4.2.6). |
ISO/IEC 18019:2004 | Software and system engineering -- Guidelines for the design and preparation of user documentation for application software | | http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=30804&ICS1=35&ICS2=80&ICS3=
ISO/IEC 26300:2006 OASIS standard | Information technology -- Open Document Format for Office Applications (OpenDocument) v1.0 | The OpenDocument specification defines an XML schema for office applications and its semantics. The schema is suitable for office documents, including text documents, spreadsheets, charts and graphical documents like drawings or presentations, but is not restricted to these kinds of documents. The schema provides for high-level information suitable for editing documents. It defines suitable XML structures for office documents and is friendly to transformations using XSLT or similar XML-based tools | http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office, http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=43485&scopelist=ALL
ISO/IEC CD 24756 | Information technology -- Algorithmic framework for determining accessibility for individual users of interactive systems | | http://www.iso.org/iso/en/CatalogueListPage.CatalogueList?COMMID=4768&scopelist=ALL
ISO/IEC CD TR 19766 | Information Technology -- Guidelines for the design of icons and symbols accessible to all users, including the elderly and persons with disabilities | | http://www.iso.org/iso/en/CatalogueListPage.CatalogueList?COMMID=4768&scopelist=ALL
ISO/IEC DTR 19765 | Information technology -- Survey of existing icons and symbols for elderly and disabled persons | | http://www.iso.org/iso/en/CatalogueListPage.CatalogueList?COMMID=4768&scopelist=ALL
ISO/IEC NP 24786-1 | Information Technology - User Interfaces - Accessible User Interface for Accessibility Setting on Information Devices -- Part 1: General and methods to start | | http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=41556&scopelist=ALL
ISO/IEC TR 19764 | Information technology -- Guidelines, methodology and reference criteria for cultural and linguistic adaptability in information technology products | | http://www.iso.org/iso/en/CatalogueListPage.CatalogueList?COMMID=4768&scopelist=ALL

ISO/TR 16982:2002 | Ergonomics of human-system interaction -- Usability methods supporting human-centred design | | http://www.iso.ch/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=31176&scopelist=
ISO/TS 16071:2003 | Ergonomics of human-system interaction -- Guidance on accessibility for human-computer interfaces | | http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=30858&scopelist=ALL
Italian Government | Law n. 4, January 9, 2004 - Provisions to support the access to information technologies for the disabled (also known as "The Stanca Act"). | | http://www.pubbliaccesso.gov.it/english/index.htm
JIS X 8341-1 | Japanese Industrial Standards Committee. | Guidelines for older persons and persons with disabilities -- Information and communications equipment, software and services -- Part 1: Common Guidelines | http://www.webstore.jsa.or.jp/webstore/Com/FlowControl.jsp?lang=en&bunsyoId=JIS+X+8341-1%3A2004&dantaiCd=JIS&status=1&pageNo=0
JISS0032 | Japanese Industrial Standards Committee | Guidelines for the elderly and people with disabilities - Visual signs and displays - Estimation of minimum legible size for a Japanese single character | http://www.webstore.jsa.or.jp/webstore/Com/FlowControl.jsp?lang=en&bunsyoId=JIS+S+0032%3A2003&dantaiCd=JIS&status=1&pageNo=0
JISS0033 | Japanese Industrial Standards Committee | Guidelines for the elderly and people with disabilities - Visual signs and displays - A method for color combination based on categories of fundamental colors as a function of age | http://www.webstore.jsa.or.jp/webstore/Com/FlowControl.jsp?lang=en&bunsyoId=JIS+S+0033%3A2006&dantaiCd=JIS&status=1&pageNo=0
JISZ8071 | Japanese Industrial Standards Committee | Guidelines for standards developers to address the needs of older persons and persons with disabilities | http://www.webstore.jsa.or.jp/webstore/Com/FlowControl.jsp?lang=en&bunsyoId=JIS+Z+8071%3A2003&dantaiCd=JIS&status=1&pageNo=0
Nordic Cooperation on Disability | Nordic Guidelines for Computer Accessibility | | http://trace.wisc.edu/docs/nordic_guidelines/nordic_guidelines.htm
OASIS standard | OASIS DITA Language Specification | The Darwin Information Typing Architecture (DITA) specification defines both a) a set of document types for authoring and organising topic-oriented information; and b) a set of mechanisms for combining and extending document types using a process called specialisation. | http://www.oasis-open.org/committees/download.php/12091/cd2.zip

Standard from: EDItEUR jointly with Association of American Publishers, Book Industry Communication and the Book Industry Study Group. | ONIX International | ONIX International is the international standard for representing and communicating book industry product information in electronic form, incorporating the core content which has been specified in national initiatives such as BIC Basic and AAP’s ONIX Version 1 | http://www.editeur.org/onix.html
Swedish Government, Office of the Disability Ombudsman | Guidelines for an accessible public administration | | http://www.tillganglig.se/start.asp?lang=en&sida=1450
US section 508 | | US Section 508, on the requirements for accessibility for public procurement. This act requires that all federal agencies' electronic and information technology be accessible to people with disabilities. | http://www.section508.gov/
User Agent Accessibility Guidelines Working Group (UAWG) | User Agent Accessibility Guidelines (UAAG) 1.0 | | http://www.w3.org/TR/2002/REC-UAAG10-20021217/
W3C Recommendation | EARL (W3), the Evaluation and Report Language. | The Evaluation and Report Language is a standardized vocabulary to express test results. | http://www.w3.org/TR/EARL10-Schema/
W3C Recommendation | Web Content Accessibility Guidelines 1.0 | WCAG 1.0 has 14 guidelines that are general principles of accessible design. Each guideline has one or more checkpoints that explain how the guideline applies in a specific area. | http://www.w3.org/WAI/intro/wcag.php
W3C Recommendation | Web Content Accessibility Guidelines 2.0 | Following WCAG makes Web content more accessible to the vast majority of users, including people with disabilities and older users, using many different devices including a wide variety of assistive technology. | http://www.w3.org/TR/2005/WD-WCAG20-20051123/
W3C Recommendation | XHTML (1.0) | W3C (X)HTML Working group | http://www.w3.org/TR/xhtml1/, Working group homepage: http://www.w3.org/MarkUp/
W3C Recommendation | XML standard | W3C XML standards | http://www.w3.org/TR/2006/REC-xml-20060816/ Working group homepage: http://www.w3.org/XML/

Appendix B – Relevant European organisations

Title | Subject | URL

AAP Open eBook Standards Project and AAP/Andersen Consulting eBook Study | The goal of the AAP Open ebook Standards Project was to recommend standards and requirements in the areas of Digital Rights Management [ref], Metadata and Numbering that will enable an open, competitive marketplace for ebook commerce on a large scale. The intention is to consider all aspects of the burgeoning ebook marketplace in developing standards and requirements to promote its growth. | http://www.publishers.org/digital/index.cfm
AXMEDIS (EC Project) | AXMEDIS is developing technologies to reduce the costs of digital content production, distribution and protections. It is an environment where digital content producers, aggregators and distributors can gain access to a wide range of digital contents. | http://www.axmedis.org/
BISG (Book Industry Study Group) | A membership-supported, not-for-profit research organisation comprised of organisations from every sector of the publishing community. Its goal is to provide accurate and current research information about the industry for its members and others. BISG produced jointly with the Association of American Publishers and Book Industry Communication the ONIX International standard. | http://www.bisg.org/
BrailleNet, France: | Digital Document Delivery for the Blind in France | http://www.braillenet.org/
DAISY Consortium | The DAISY Consortium's mission is to develop the International Standard and implementation strategies for the production, exchange, and use of Digital Talking Books in both developed and developing countries, with special attention to integration with mainstream technology, to ensure access to information for people with print disabilities. | http://www.daisy.org/
DBK, De Braillekrant, Belgium | Private foundation supported by the Katholieke Universiteit Leuven and Sensotec company, producing daily newspapers in Braille since 1993, in Daisy format since 2003 and in audio format starting April 2007. | http://www.braillekrant.be/
DEDICON Netherlands: | National Federation of Libraries for the Blind | http://www.dedicon.nl/
Digital Media Project | On the policy and legal side, new policies should be determined and legacy policies should be revised. On the technical side, a DRM platform should be designed offering the following main features | http://www.chiariglione.org/project/
Dolphin Audio Publishing, United Kingdom: | Multimedia solutions for the adaptive technology industries | http://www.yourdolphin.com/

ENABLES Enhanced Network Accessibility for the Blind and Visually Impaired | The specific objectives in this area include: Developing techniques that will convert existing non-accessible Web contents into accessible forms according to the user's need. Investigate multimodal representation for different contents, different applications and different user disabilities. Producing guidelines for creating multimodal representation and toolkits for developers to create accessible Web contents in general and images in particular. | http://www.enabledweb.org/AWC.htm
ETSI Task Force STF 286 | ETSI Human Factors designing access symbols to indicate special services for disabled users of ICT equipment | http://www.etsi.org/pressroom/Previous/2005/2005_05_stf286.htm
EUAIN (EC Project) | The EUAIN project (European Accessible Information Network) aims to promote e-Inclusion as a core horizontal building block in the establishment of the Information Society by creating a European Accessible Information Network to bring together the different actors in the content creation and publishing industries around a common set of objectives relating to the provision of accessible information. | http://www.euain.org/modules/wfsection/
FEP, Belgium: | Federation of European Publishers, Brussels. | http://www.fep-fee.be/
FORCE, Netherlands: | Independent foundation for education and support for the print impaired in developing countries | http://www.force-foundation.org.uk/
Institute Integriert Studieren, Austria | Austria-wide Institute for Information systems Supporting Print Disabled Students | http://www.integriert-studieren.jku.at/
International Digital Publishing Forum (IDPF), | The International Digital Publishing Forum (IDPF), formerly the Open eBook Forum (OeBF), is the trade and standards association for the digital publishing industry. | http://www.idpf.org/
ISO/IEC JTC 1 Special Working Group on Accessibility standards | Established by JTC 1 (10/2004) to: - establish user requirements for accessibility standards - prepare an inventory of existing accessibility standards & legislation - Identify gaps (and overlaps) in accessibility standardisation - Work with standards bodies (ISO, IEC, ITU, CEN, ETSI, etc) to prepare new standards | http://www.jtc1access.org/

National Council For the Blind of Ireland Media Conversion Service (NCBI MCS) | The MCS converts information and documents into formats accessible to people with vision impairments for a range of public and private clients. The MCS provides the following services: 1. Conversion into Braille and audio formats 2. Audio-description 3. MCS Consultancy and QA Services 4. Braille Training for the Pharmaceutical Sector | www.ncbi.ie

OASIS DITA Technical Committee: The purpose of the OASIS DITA Technical Committee (TC) is to define and maintain the Darwin Information Typing Architecture (DITA) and to promote the use of the architecture for creating standard information types and domain-specific markup vocabularies. http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=dita

OASIS ODF Technical Committee: http://www.oasis-open.org/committees/workgroup.php?wg_abbrev=odf-adoption

ONCE, Spain: Spanish National Organisation for the Blind. http://www.once.es/

Royal National Institute of the Blind (UK): New RNIB web site addressing software accessibility. http://www.rnib.org.uk/xpedio/groups/public/documents/publicwebsite/public_sachome.hcsp

Royal National Institute of the Blind (RNIB): UK’s leading charity offering information, support and advice to over two million people with sight problems. The Web Access Centre provides web designers and managers with the tools and resources needed to plan, build and test accessible websites. RNIB also offers paid-for web accessibility consultancy services including website audits, advice, presentations and seminars. The directory of accessible websites lists the sites that have passed the RNIB audit within the past year. www.rnib.org.uk

Society for Technical Communication (US): The STC is a US-based international organisation and has a Special Interest Group addressing the accessibility of documentation (A-SIG). http://www.stcsig.org/sn/index.shtml

SUTII, Poland: University Technical Research Department

TechDis (a JISC Advisory Service): The mission of the JISC TechDis Service is to support the education sector in achieving greater accessibility and inclusion by stimulating innovation and providing expert advice and guidance on disability and technology. The TechDis website features many aids for accessible document processing, including the Accessibility Essentials guides to inclusive use of Microsoft (R) Word and Adobe (R) PDFs. http://www.techdis.ac.uk

Technologie-Zentrum Informatik, University Bremen, Germany: Institute at the University Bremen with the main aim to develop cutting-edge technologies in computer science and engineering and transfer those into practice. Special focus on multimedia content accessibility and presentation. http://www.tzi.de/

Web Accessibility Business Case Documents: Produced by the Web Accessibility Initiative (WAI) Education and Outreach Working Group (EOWG), "Developing a Web Accessibility Business Case for Your Organisation" describes social, technical, financial, legal and policy aspects of Web accessibility. It is designed to help organisations develop their own customised business case for Web accessibility. It provides text that can be used as is, as well as guidance on identifying the most relevant factors for a specific organisation. http://www.w3.org/WAI/bcase/ http://www.w3.org/WAI/

Web Standards Project: The Web Standards Project is a grassroots coalition fighting for standards that ensure simple, affordable access to web technologies for all. http://www.webstandards.org/

Appendix C – Sustainability: network of interested parties for ongoing support and further development

Name Enterprise Country Function Email address

P. Abrahams Bloor Research United Kingdom Practice leader Accessibility & Usability [email protected]

Mr. A. Arch Online Accessibility Consulting Vision Australia Manager [email protected]

Mr. H. Aspelund Directorate for Health and Social Affairs, The Delta Centre Norway Adviser ICT [email protected] [email protected]

Mr. S. Ball JISC TechDis Service United Kingdom Senior Advisor [email protected]

Mrs. A. Bergman-Tahon Federation of European Publishers Belgium Director [email protected]

Mr. H. Bjarnø Knowledge Centre Denmark [email protected]

Mrs. T. Bogner Institut Integriert Studieren Austria [email protected]

L. Bowick Ministere de l’Agriculture et de la Peche France Networks and telecommunications [email protected]

Mr. M. Brauer Sun Microsystems Germany Technical Architect Software Engineering [email protected]

Mrs. Lino Brundu Alitha of Milan Italy Chairman [email protected]

Mr. Bruno [email protected]

Mr. D. Burger Association BrailleNet France [email protected]

Mrs. J. Clark [email protected]

Mr. M. Cooper Web Accessibility Specialist Web accessibility specialist [email protected]

Mr. D. Crombie DEDICON Netherlands Head International Projects [email protected]

Mrs. J. Darzentas University of the Aegean Greece [email protected]

Mr. D. Day OASIS DITA Technical Committee IBM Lead DITA Architect [email protected]

Mr. A. Egger University of Applied Sciences Austria [email protected]

Mr. J. Engelen Katholieke Universiteit Leuven Belgium [email protected]

Mrs. B. Fanning AIIM USA Director [email protected]

Mr. M. Ford Martin Ford Consultancy Italy Consultant [email protected]

Mr. T. Fraser Monotype Imaging Ltd United Kingdom Finance Director [email protected]

Mr. Dr. J. Friedrich IBM Germany Germany Program Manager ICT Standardization [email protected]

Mr. J. Garner INCITS/Information Technology Industry Council USA [email protected]

Cerys Giddings IBM UK Ltd UK [email protected]

Mrs. K. Grant [email protected]

Mr. C. Gravenhorst [email protected]

Mr. A. Haffner Technische Universität Dresden Germany Professur Mensch-Maschine-Kommunikation [email protected]

Mrs. Mayumi Handa RiverDocs Ireland Test Manager [email protected]

A.K. Heath Axelrod Access For All United Kingdom [email protected]

Mr. S. Herramhof [email protected]

Mr. R. Hodgkinson Institute of Scientific and Technical Communicators (UK) United Kingdom Consultant [email protected]

Mr. M. Horstmann TZI, University Bremen Germany [email protected]

Mr. A. Houser [email protected]

Mr. Dr. G. Ioannidis IN2 and TZI - University Bremen Germany Director [email protected] and [email protected]

Mr. H. Janczikowski [email protected]

Mr. G. Kerscher [email protected]

Mr. S. Klironomos FORTH/ ICS Greece Secretariat manager [email protected]

Mr. M. Koettstorfer [email protected]

Ms. Mira Koivusilta STAKES, R&D Centre for Health and Welfare Finland [email protected]

Mr. P. Korn Sun Microsystems United States Accessibility Architect [email protected]

Mr. N. Kovacs DIN Committee Information Technology Germany [email protected]

Mr. F.J. Martinez-Calvo ONCE Spain [email protected]

Dr. Thomas Kahlisch Deutsche Zentralbücherei für Blinde zu Leipzig Germany Director [email protected]

K. Lindelien Standards Norway Norway [email protected]

Mrs. L. McNamee Texthelp Systems Northern Ireland Marketing manager [email protected]

Mr. S. McGrenery CTO Ireland [email protected]

Mr. N. McKenzie DEDICON Netherlands [email protected]

Mr. D. Mann RNIB [email protected]

Mrs. M. McRae OASIS-OPEN [email protected]

Mr. C. Menezes UNESCO France Senior programme specialist [email protected]

Mr. F. Middelkoop DEDICON [email protected]

Mr. Dr. K. Miesenberger University of Linz Austria [email protected]

Mr. J. O Connor NCBI (National Council For The Blind Of Ireland) Ireland Web Accessibility Consultant [email protected]

Mr. H. O’Neill Central Remedial Clinic Ireland Project Coordinator [email protected]

Mr. R. Orme RNIB [email protected]

P. Permezel Cosmosbay-Vectis France Project manager [email protected]

Ms. C. Pollitt National Library for the Blind United Kingdom [email protected]

Mrs. F. Preteux [email protected]

Mr. K. Richter [email protected]

Mr. Roel van Gils [email protected]

Mr. Dr. R. Romero Fundacion SIDAR Spain [email protected]

Mr. R. Ruemer University of Linz Austria [email protected]

Mrs. A. Salaun European Commission Belgium [email protected]

S. Schotel DEDICON Netherlands [email protected]

Mr. Shadi Abou-Zahra W3C Web Accessibility Initiative France Web Accessibility Specialist for Europe [email protected]

Ms. S. Sollat C-LHISTOIRE Digital-Prod France Executive Producer [email protected]

Mr. F. van Stek DEDICON [email protected]

Mr. C. Stephan [email protected]

Mr. M. Straat Adobe [email protected]

Mr. C. Strobbe K.U.Leuven - Department of Electrical Engineering - Research Group on Document Architectures Belgium [email protected]

Paivi Tahkokallio STAKES, R&D Centre for Health and Welfare Finland [email protected]

Mr. D. Taylor Lightning Source UK Ltd United Kingdom Managing Director [email protected]

Mr. Malte Timmermann Sun Microsystems GmbH Germany Technical Architect Software Engineering [email protected]

Mr. T. Tontchev [email protected]

Mr. D. Tucker FORCE Foundation Netherlands [email protected]

Mr. S. Tyler RNIB [email protected]

Mr. L. Van den Berghe CEN/ISSS - Information Society Standardization System Belgium Workshop manager [email protected]

Mr. C. Walinn Danish National Library for the Blind Denmark [email protected]

Mrs. M. White [email protected]

Mr. R. Winiarsczyk Silesian University of Technology Poland head of research group [email protected]

Mr. T. Worthington [email protected]

Mr. J. Worsfold Dolphin Computer Access United Kingdom [email protected]

Mr. W. Wünschmann [email protected]

Mr. Dr. J. Rietveld Netherlands Standardization Institute Netherlands Standardization Consultant [email protected]

Appendix D – Abbreviations List

Abbreviation Text

AAATE Association for the Advancement of Assistive Technology in Europe

AIIM The Enterprise Content Management Association

AIP Accessible Information Processing

ANSI American National Standards Institute

ASCII American Standard Code for Information Interchange

CDDA Compact Disc Digital Audio

CD-ROM Compact Disc Read-Only Memory

CEN COMITÉ EUROPÉEN DE NORMALISATION

CMS Content Management System

CSS Cascading Style Sheets

CWA CEN Workshop Agreement

DAISY Digital Accessible Information System

DATSCG Design for All and Assistive Technologies Standardization Co-ordination Group

DEXA Database and Expert Systems Applications

DITA Darwin Information Typing Architecture

DPA Document Processing for Accessibility

DRM Digital Rights Management

DTB Digital Talking Book

DTD Document Type Definition

DTP Desktop publishing

EARL Evaluation and Report Language

EC European Commission

ECM Enterprise Content Management

EDeAN European Design for All e-Accessibility Network

EIDD European Institute for Design and Disability

ETSI European Telecommunications Standards Institute

EU European Union

EUAIN European Accessible Information Network

EXD Encrypted XML document

GNOME GNU Network Object Model Environment

GNU a recursive acronym for GNU's Not Unix

HTML Hypertext Markup Language

ICCHP International Conference on Computers Helping People with Special Needs

ICS International Classification for Standards

ICTSB Information Communication Technologies Standards Board

IEC International Electrotechnical Commission

IEEE Institute of Electrical and Electronics Engineers

IMS IMS Global Learning Consortium

ISO International Organization for Standardization

ISSS Information Society Standardization System

JAWS Job Access With Speech

JTC 1 Joint Technical Committee 1

KDE K Desktop Environment

LaTeX LaTeX is a document preparation system for high-quality typesetting.

LNCS Lecture Notes in Computer Science

LIP Learner Information Package

MARC21 Machine-Readable Cataloging

MathML Mathematical Markup Language

METS Metadata Encoding and Transmission Standard

MIME Multipurpose Internet Mail Extensions

MP3 MPEG-1 Audio Layer 3

MPEG Moving Picture Experts Group

NIMAS National Instructional Materials Accessibility Standard

NISO National Information Standards Organization

OASIS Organization for the Advancement of Structured Information Standards

OCR Optical character recognition

ODF OpenDocument Format

OECD Organisation for Economic Co-operation and Development

PC Personal Computer

PDF Portable Document Format

RDF Resource Description Framework

RNIB Royal National Institute of the Blind

RTD Research and Technology Development

RTF Rich Text Format

SC Subcommittee

SCORM Sharable Content Object Reference Model

SGML Standard Generalised Markup Language

SME Small and medium enterprises

SMIL Synchronized Multimedia Integration Language

STM Scientific Technical Medical

TC Technical Committee

TC/HF Technical committee Human Factors

TEI Text Encoding Initiative

TeX TeX is a typesetting language.

US United States

USB Universal Serial Bus

VHS Video Home System

VPAT Voluntary Product Accessibility Template

W3C World Wide Web Consortium

WAI Web Accessibility Initiative

WAV Waveform Audio File Format

WCAG Web Content Accessibility Guidelines

WCT WIPO Copyright Treaty

WG Working Group

WIPO World Intellectual Property Organization

WPPT WIPO Performances and Phonograms Treaty

XHTML Extensible Hypertext Markup Language

XML Extensible Markup Language

XSLT XSL Transformations
