Ecosystem infrastructure for smart and personalised inclusion and PROSPERITY for ALL stakeholders

D204.4 Document Conversion Engine and Implementation Guide

Project Acronym Prosperity4All Grant Agreement number FP7-610510

Deliverable number: D204.4
Work package number: WP204
Work package title: Modularization and Replicability of Transformation Engine
Authors: Lars Ballieu Christensen and Paul Cosma
Status: Final
Dissemination Level: Public
Delivery Date: 30/03/2017
Number of Pages: 42

Keyword List

Document conversion, accessibility, RoboBraille, MP3, DAISY, EPUB, Braille, OCR, RESTful API, C#, IIS, Hawk, SQL Server, HTTP, framework

Version History

Revision Date Author Organisation Description

1 01/01/2015 Saju Sathyan Sensus ApS Initial version

2 01/06/2015 Saju Sathyan Sensus ApS

3 01/09/2015 Milad Ruben Soro Sensus ApS Accessibility conversion added, database setup and entities defined

4 01/02/2016 Vlad Paul Cosma Sensus ApS E-book, HtmlToPdf conversion added

5 01/04/2016 Vlad Paul Cosma Sensus ApS Braille conversion added, solution refactored.

6 01/06/2016 Vlad Paul Cosma Sensus ApS Audio conversion added

7 01/07/2016 Vlad Paul Cosma Sensus ApS MSOffice, HTMLtoText conversion added

8 01/08/2016 Vlad Paul Cosma Sensus ApS Daisy conversion added

9 01/09/2016 Vlad Paul Cosma Sensus ApS Amara integration created. Translation and document structure recognition framework setup

10 01/11/2016 Vlad Paul Cosma Sensus ApS Refactoring solution and test cases.

11 31/01/2017 Vlad Paul Cosma Sensus ApS Extending Amara integration and wrapping up the overall solution.

12 31/01/2017 Lars Ballieu Christensen Sensus ApS Added executive summary. Review and final edit

13 13/03/2017 Manuel Ortega Moral; Gregg Vanderheiden ilunion and RtF Internal review. Suggestion to mention deployment with Cloud4all/GPII auto-personalization from profile in APfP infrastructure. Suggestion to focus on technical interface. Minor edits.

Ecosystem infrastructure for smart and personalised inclusion and PROSPERITY for ALL stakeholders www.prosperity4all.eu

14 30/03/2017 Lars Ballieu Christensen Sensus ApS Revised document in accordance with reviewer feedback.


Table of Contents

Executive Summary
1 Contribution to the global architecture
2 Introduction
3 System Architecture and Implementation
3.1 Component controllers
3.2 Class diagram
3.3 RoboBraille API Structure
3.4 Supported Requests
3.4.1 Post
3.4.2 GetJobStatus
3.4.3 GetJobResult
3.4.4 Delete
3.4.5 Other request methods
3.5 General workflow
3.6 Example sequence diagram
3.7 Valid set of input parameters
3.8 Amara integration
3.9 Non-functional stubs
3.9.1 Document Structure Recognition
3.9.2 Language-to-Language Translation
4 Installing, building and testing the solution
4.1 Testing the solution
4.2 Authentication
4.3 Data format
4.4 Postman
5 Installation Prerequisites
5.1 Microsoft Windows, Visual Studio and .NET
5.2 Folder Configurations
5.3 Sensus SB4
5.4 LibLouis
5.5 High-quality OCR engine
5.6 Tesseract 3.02 language data files for required files
5.7 Windows Speech and Installed Voices
5.8 Messaging
5.9 DAISY Pipeline 1 and 2, Lame, ImageMagick, eSpeak
5.9.1 Prerequisites
5.9.2 Quick Run
5.9.3 Extensive Setup
5.10 Calibre
5.11 Microsoft Office 2013
5.12 Database setup and connection
Annex I: Glossary

List of Tables

Table 1: Parameter value table for Accessible Conversions

Table 2: Parameter value table for Audio

Table 3: Parameter value table for Braille

Table 4: Parameter value table for Daisy

Table 5: Parameter value table for E-book

Table 6: Parameter value table for HTMLtoPDF

Table 7: Parameter value table for MSOfficeConversion

Table 8: Parameter value table for HTMLtoText

Table 9: Parameter value table for OcrConversion

Table 10: Parameter value table for Translation

Table 11: Parameter value table for DocumentStructureRecognition

Table 12: Parameter value table for VideoConversion


List of Figures

Figure 1: Overall Picture of Prosperity4all

Figure 2: Overall RoboBraille System Architecture

Figure 3: Class diagram for audio conversions

Figure 4: Sequence diagram for audio conversions


Executive Summary

RoboBraille is an automated document conversion service capable of transforming a wide range of documents into alternate formats. Furthermore, the service can be used to convert otherwise inaccessible or tricky formats into more accessible formats. During the P4All project, the RoboBraille service was reimplemented as an open-source solution and several enhancements were added. The open-source approach enables third parties to establish their own document conversion services and/or integrate the conversion services provided by the RoboBraille software complex into existing systems. It furthermore enables third parties to further enhance the service and to add new conversion capabilities. In terms of enhancements, integration with the PCF Amara media captioning service was implemented along with a web service API for system-to-system integration and stub interfaces for emerging conversion features such as semantic structure recognition and language-to-language translation. Using the web service API, any program can use information from the Cloud4all/GPII auto-personalization from profile (APfP) infrastructure to automatically specify the form and format for the returned material so that it will meet each user’s needs and preferences. The present deliverable documents the reimplemented version of RoboBraille, documents its conversion features and explains how the service may be built, tested and made operational. Whereas the proprietary version of RoboBraille developed and maintained by Sensus relies on a combination of commercial software, open-source projects and bespoke development, this deliverable provides information on alternative open-source solutions for OCR processing, text-to-speech and Braille transcription. However, potential implementers should be aware that using such open-source solutions will likely reduce both the capabilities of the resulting service and the quality of the output.
Several partners in the Prosperity4all project have expressed their interest in utilizing the document transformation capabilities offered by the RoboBraille service and are currently exploring integration options. To accommodate such partners, Sensus is hosting an operational version of the RoboBraille web service API complete with all back-end services. This version will be available to Prosperity4all partners for an extended period of time.


1 Contribution to the global architecture

Through an open-source adaptation of and several enhancements to the RoboBraille service, this work package adds media and material transformation capabilities to the global architecture as per the illustration below marked “WP 204 Media & Material Transformation Infrastructure”:

Figure 1: Overall Picture of Prosperity4all

As such, the delivery is task T204.3 of work package WP204. The interfaces developed as part of the delivery relate to aspects of T204.1, with an integration to PCF's Amara media transformation tools (T204.5), and of T204.4, with stub interfaces for semantic structure recognition and language-to-language translation. The delivery adds a set of document conversion capabilities to the overall architecture of Prosperity4all, including capabilities to establish services to convert documents into digital Braille, MP3 audio files, DAISY structured audio books and e-books. Sensus will furthermore serve the partners of the Prosperity4all project with such conversion capabilities through its operational implementation of the developed solution for an extended period of time following the conclusion of the project. Using the web service API, any program can use information from the Cloud4all/GPII auto-personalization from profile (APfP) infrastructure to automatically specify the form and format for the returned material so that it will meet each user’s needs and preferences. Several partners in the Prosperity4all project have expressed their interest in utilizing the document transformation capabilities offered by the RoboBraille service and are currently exploring integration options. The delivery has furthermore been added to the Developer Space for developers with an interest in establishing third-party conversion services.


2 Introduction

RoboBraille is an automated document conversion service capable of converting a range of document types into alternate formats such as MP3 files, digital Braille books, audio books and e-books. The service can also be used to convert otherwise inaccessible or tricky documents into more accessible formats. The primary users of the RoboBraille service are people with print impairments (e.g., the blind, partially sighted and dyslexic users, as well as users with learning disorders, cognitive disabilities, motor deficiencies, poor reading skills, poor language skills and more), as well as teachers, relatives and other resource people. The service is furthermore used by professionals (e.g., alternate media specialists, specialist librarians, etc.) to convert material on behalf of others. Finally, the service is used by mainstream users to support flexible learning, language learning or practical document conversion. This work package has resulted in an open-source reimplementation and modularisation of the RoboBraille service; the creation and publication of a web service API to the service to allow for system-to-system integration (with apps, digital libraries, learning portals, learning management systems and more); the development of a prototype user interface utilising the web service API; an identification of open-source alternatives to the various commercial components used in the commercial production versions of the service; and the development of interface components to allow for future enhancements of the service in areas such as improved document structure recognition, complex table recognition and transformation, and interfacing to video captioning services. The main purposes of the present document are to:

• Describe the overall systems architecture of the open-source reimplementation of the RoboBraille service.
• Document how the source code is retrieved, compiled and tested.
• Document the operational environment required for setting up a service based on the open-source reimplementation of RoboBraille.


3 System Architecture and Implementation

The overall system architecture of the RoboBraille Web API is illustrated below:

Figure 2: Overall RoboBraille System Architecture

The RoboBraille Web API solution is in the middle. It uses a database to store job information. The solution furthermore uses various conversion libraries to do the actual conversions. The RESTful API is the main entry/exit point of the solution, enabling clients to interact with the service via HTTP requests. Within the API part, each of the services (audio, Braille, DAISY, e-book ...) is represented by its own API Controller, some to a finer granularity. The architecture is layered Controller -> Repository -> Conversion Component for each service. The Controller classes are responsible for quality and security, and they also process the API requests. The Repository classes are responsible for saving the information to the database. Under the Repository classes, there are other classes that either use custom algorithms or the underlying third-party software components, executables, batch processes, libraries (.dlls) or web services to provide the necessary conversions.

3.1 Component controllers

The API component controllers are:

1. AccessibleConversion
2. Audio
3. Braille
4. Daisy
5. E-book
6. HTMLtoPDF
7. HTMLToText
8. MSOfficeConversion
9. OcrConversion
10. Translation
11. DocumentStructureRecognition
12. VideoConversion
13. RoboBrailleJob

For the purpose of this document, the names “Controller” and “Controllers” will be used when providing general information about all of the above components. Information relevant to each component will be provided by referencing the above list of names.

3.2 Class diagram

The diagram below illustrates a portion of the code and how it is linked within the solution. It can be considered a vertical cut that follows only the classes relevant for managing audio conversions. The AudioController makes use of the AudioJobRepository. The repository is responsible for managing the DataContext class, which interacts with the database. It also implements a more general interface, the RoboBrailleJobRepository, that is used by all Repository classes. The AudioJob is an extension of the Job class. All Repository classes use an extended version of the Job class to hold job information. Parameters of the AudioJob class, such as AudioFormat, VoicePropriety and AudioSpeed, use enumeration types to make sure only valid values are stored in those parameters. Finally, the AudioReplyQueue and the AudioJobSender are called by the Repository class to manage the conversion. They are responsible for sending the text to the AudioAgent and receiving the audio file from it. The AudioAgent is a standalone application that performs the actual conversion; it is outside the scope of the RoboBrailleWebApi solution and is provided as a standalone project in the source code. The AudioJob is created from the API post method in the Controller and is passed down to the Repository, which in turn forwards the relevant information to other utility classes such as AudioReplyQueue and AudioJobSender. In the end, the Repository stores the result in the database. All the work done by the Repository’s SubmitWorkItem() method is done in a separate thread.


Figure 3: Class diagram for audio conversions


3.3 RoboBraille API Structure

The Controllers reside under the “http://{url}:{port}/api” namespace. Each Controller can be referenced by appending the name of the Controller to the namespace (example: http://{url}:{port}/api/braille). After that, append the name of the method to be used: Post, GetJobStatus or GetJobResult. The method name Post is optional for all controllers except the MSOfficeConversion controller. There is also a “super” controller called RoboBrailleJob for the general-purpose methods “GetJobStatus” and “GetJobResult”. The RoboBrailleJob controller also supports the “DELETE” method. Please see the next part for more details about the individual API methods. All controllers have the following generic structure:

Listing 1: Web API generic structure

http://{url}:{port}/api/{ControllerName}/{MethodName}

The {MethodName} is optional when doing POST requests on all controllers except the “MSOfficeConversion” controller. It is also optional when doing a DELETE request.
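The generic structure above can be composed programmatically. The following Python sketch is illustrative only: the helper name build_api_url and the example host are assumptions and are not part of the RoboBraille solution.

```python
from urllib.parse import urlencode


def build_api_url(base_url, controller, method=None, **query):
    """Compose http://{url}:{port}/api/{ControllerName}/{MethodName}.

    build_api_url is a hypothetical helper, not part of the RoboBraille code.
    The method name may be omitted where the text above says it is optional.
    """
    parts = [base_url.rstrip("/"), "api", controller]
    if method:
        parts.append(method)
    url = "/".join(parts)
    if query:
        # Query parameters such as jobId are appended after a question mark.
        url += "?" + urlencode(query)
    return url
```

For instance, build_api_url("http://localhost:8080", "Braille", "GetJobStatus", jobId="...") yields a status URL of the form shown later in Listing 3.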

3.4 Supported Requests

These methods are used in all Controllers except the RoboBrailleJob Controller, which does not support Post requests.

Listing 2: Example of supported requests for a single Controller

GET: /api/Braille/GetTranslationTables
POST: /api/Braille/Post
GET: /api/Braille/GetJobStatus
GET: /api/Braille/GetJobResult
DELETE: /api/RoboBrailleJob/Delete

3.4.1 Post

The Post request contains all the necessary parameters for starting a job in the API. The file content and all the necessary parameters must be placed in the body of the POST request, and the POST content type must be "multipart/form-data". The request returns a unique jobID of the form “ca427b75-bb66-e511-91f0-1c6f65d84158”. This ID is used in the GetJobStatus and GetJobResult requests in order to track and retrieve the job.

If authentication is enabled then the Post request must be accompanied by an authorization header that contains the hawk token.
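As a minimal client-side sketch of such a request, the Python code below builds a multipart/form-data POST using only the standard library. It assumes that the form field names match the parameter names listed in section 3.7 (for example “FileContent” and “AudioLanguage”) and treats the Hawk token as an opaque, precomputed header value; the helper names are hypothetical. The request is only constructed here, not sent.

```python
import urllib.request
import uuid


def encode_multipart(fields, file_name, file_bytes, file_field="FileContent"):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        )
        chunks.append(header.encode("utf-8") + str(value).encode("utf-8") + b"\r\n")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    chunks.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def build_post_request(url, fields, file_name, file_bytes, hawk_header=None):
    """Build (but do not send) a POST request for a conversion job."""
    body, content_type = encode_multipart(fields, file_name, file_bytes)
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", content_type)
    if hawk_header:
        # Assumption: the caller has already computed the Hawk header value.
        request.add_header("Authorization", hawk_header)
    return request
```

Passing the returned object to urllib.request.urlopen would submit the job; the response body is then expected to carry the jobID described above.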

3.4.2 GetJobStatus

The GetJobStatus request is used to check whether the job has finished. It will return one of the following status codes:

Listing 3: Job status request and codes

Request url: http://{url}:{port}/api/{ControllerName}/GetJobStatus?jobId={your jobID}

Status codes:
Error = 0
Done = 1
Started = 2
Queued = 3
Processing = 4
Cancelled = 5

It is not mandatory to make use of all status codes; the level of detail can be fine-tuned. When programming against the API, it is recommended to always check the job status before retrieving the result.
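On the client side, the status codes can be mirrored in a small enumeration. The Python sketch below is one possible representation; the names JobStatus, parse_status and is_finished are hypothetical, and treating Cancelled as a finished state is an assumption, since the text only names the values 0 and 1 as end states.

```python
from enum import IntEnum


class JobStatus(IntEnum):
    """Client-side mirror of the status codes in Listing 3."""
    ERROR = 0
    DONE = 1
    STARTED = 2
    QUEUED = 3
    PROCESSING = 4
    CANCELLED = 5


def parse_status(raw):
    """Convert a raw status value (int or numeric text) to a JobStatus."""
    return JobStatus(int(str(raw).strip()))


def is_finished(status):
    """Return True when no further polling is useful.

    Assumption: Cancelled is treated as finished alongside Error and Done.
    """
    return parse_status(status) in (
        JobStatus.ERROR,
        JobStatus.DONE,
        JobStatus.CANCELLED,
    )
```

A client that does not need the full level of detail can simply compare against JobStatus.DONE and JobStatus.ERROR.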

3.4.3 GetJobResult

The GetJobResult request will return the converted file. Make sure that the GetJobStatus request returns the value 1 before attempting to retrieve the file result.

Listing 4: Job result request url example

Request url: http://{url}:{port}/api/{ControllerName}/GetJobResult?jobId={your jobID}

3.4.4 Delete

A job that is being processed can be deleted by using the HTTP DELETE method. The jobID of the job must be provided.

Listing 5: Delete job example

Delete url: http://{url}:{port}/api/robobraillejob?jobId={your jobID}


If authentication is enabled then the Delete request must be accompanied by an authorization header that contains the hawk token.
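A delete request along the lines of Listing 5 could be constructed as in the Python sketch below. The helper name build_delete_request is hypothetical, and the Hawk header value is assumed to be computed elsewhere. The request is only constructed, not sent.

```python
import urllib.request
from urllib.parse import urlencode


def build_delete_request(base_url, job_id, hawk_header=None):
    """Build (but do not send) an HTTP DELETE request for a job."""
    url = f"{base_url.rstrip('/')}/api/robobraillejob?" + urlencode({"jobId": job_id})
    request = urllib.request.Request(url, method="DELETE")
    if hawk_header:
        # Assumption: the Hawk header value has been computed by the caller.
        request.add_header("Authorization", hawk_header)
    return request
```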

3.4.5 Other request methods

Other request methods exist to support the Post request. They return valid values for required or optional input parameters of the Post request. For example, the request in Listing 6 returns the translation tables that are valid for the optional input parameter “translationtable” of the Braille Post request.

Listing 6: Example of other GET requests

/api/Braille/GetTranslationTables

3.5 General workflow

A workflow defines a single controller request. In order to achieve more complex conversions, it may be necessary to chain two or more controller requests. Chaining controller requests works by using the jobID from the previously completed request as the input for the next POST request. In this case, the file does not need to be downloaded and attached to the new request. A typical RoboBraille API workflow involves the following steps:

1. POST the job to the desired Controller.
2. Save the jobID in your application.
3. Call GetJobStatus with the saved jobID at predefined time intervals. Repeat until the GetJobStatus value equals 1 or 0.
4. Depending on the job status from step 3:
   a. If jobStatus = 0, notify the user that a conversion error has occurred.
   b. If jobStatus = 1, either call GetJobResult with the saved jobID to get the converted file, or use the jobID to start a new conversion.
5. End.
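The workflow steps can be sketched as a polling loop. In the Python example below, the HTTP calls are injected as callables (post_job, get_status, get_result) so that the control flow stays independent of any particular HTTP library; all names, the polling interval and the poll limit are assumptions for illustration.

```python
import time


def wait_for_job(get_status, job_id, interval_seconds=2.0, max_polls=150,
                 sleep=time.sleep):
    """Poll get_status(job_id) until it returns 0 (Error) or 1 (Done)."""
    for _ in range(max_polls):
        status = int(get_status(job_id))
        if status in (0, 1):
            return status
        sleep(interval_seconds)  # step 3: wait before asking again
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


def run_workflow(post_job, get_status, get_result, **wait_options):
    """Run steps 1 to 5: post, save the jobID, poll, then fetch or fail."""
    job_id = post_job()  # steps 1 and 2
    status = wait_for_job(get_status, job_id, **wait_options)
    if status == 0:  # step 4a
        raise RuntimeError(f"conversion error for job {job_id}")
    return get_result(job_id)  # step 4b
```

A poll limit is added so that a client does not wait indefinitely; the original workflow does not specify one. For chained conversions, the caller would use the jobID in a new POST instead of calling get_result.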

3.6 Example sequence diagram

Following the same example as for the class diagram, the following sequence diagram describes the order in which methods are called within the classes and how they interact with each other in order to achieve the desired result. The client application is outside the scope of the RoboBrailleWebApi solution. It can be a web app, a mobile app or any other backend service that needs a document to be converted. A simple example can be tried at:


Listing 7: Path to example client web application

http://{url}:{port}/RoboBrailleSPA/Index

The client application issues a POST request to the AudioController, with the audio job parameters and a text file that is to be converted. The AudioController dispatches the job to the AudioJobRepository, which first saves the new job to the database and returns the jobID to the client. Then the job begins processing asynchronously. Calling the getJobStatus method before the job result has been saved to the database returns status 2, which means that the job has started but is still processing. The job is delivered to the AudioJobSender, which first starts an AudioReplyQueue instance and forwards the reply address along with the necessary conversion parameters to the External Library, which in this case is the AudioAgent. The agent publishes the result to the reply queue, which in turn returns it to the Repository, which saves the result to the database. Now the getJobStatus method will show that the job is done (status 1), and the getJobResult request can be made to return the converted file. It will have the same name as the source file, but a different file extension. The figure below summarises the sequencing of an audio conversion. Note that all conversions follow the same scheme.


Figure 4: Sequence diagram for audio conversions

3.7 Valid set of input parameters

Note that certain valid values can be expressed in two ways, such as “Unicode= 2”. This means that either “Unicode” or “2” can be used to specify the same value. The following tables list the valid values for all the parameters of each controller.

Table 1: Parameter value table for Accessible Conversions

| Parameter Name | Valid Values | Notes |
| --- | --- | --- |
| FileContent | The input document can be of any common document format. Typically: PDF, png, jpeg. | Typical usage is aimed at OCR processing. Not needed if chained with a previous request. |
| TargetDocumentFormat | Output formats as supported by the OCR engine: "OFF_MSWord", "OFF_MSExcel", "OFF_RTF", "OFF_XML", "OFF_PDF", "OFF_PDFA", "OFF_Text", "OFF_CSV", "OFF_HTML", "OFF_NoConversion", "OFF_TIFF", "OFF_JPG", "OFF_J2K", "OFF_InternalFormat", "OFF_DOCX", "OFF_XLSX", "OFF_JBIG2", "OFF_ALTO", "OFF_EPUB" | |

Table 2: Parameter value table for Audio

| Parameter Name | Valid Values | Notes |
| --- | --- | --- |
| FileContent | A simple .txt file containing the text you want to be synthesized to speech | Not needed if chained with a previous request. |
| AudioLanguage | "enGB", "enUS", "bgBG", "daDK", "nlNL", "fiFI", "frFR", "deDE", "elGR", "klGL", "huHU", "isIS", "itIT", "ltLT", "nbNO", "plPL", "ptPT", "ptBR", "roRO", "ruRU", "slSI", "esES", "esCO", "svSE", "caES" | |
| SpeedOptions | Fastest = 8, Faster = 6, Fast = 3, Normal = 0, Slow = -3, Slower = -6, Slowest = -8 | Can be any value between -10 and 10 inclusive, but these are the recommended values. |
| FormatOptions | Mp3 = 1, Wav = 2, Wma = 4, Aac = 8 | |
| VoicePropriety | Male = 1, Female = 2, Older = 3, Younger = 4, Bilingual = 5, Cantonese = 6, Mandarin = 7, Taiwanese = 8, Castilian = 9, LatinAmerican = 10, Anne = 11, None = 0 | (Optional) Not needed for all voices. Default is None. |
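A client may want to check the SpeedOptions value before posting an audio job. The Python sketch below accepts either one of the recommended names from Table 2 or any integer between -10 and 10 inclusive; the function name parse_speed is hypothetical and the check is a client-side convenience, not a description of the server's own validation.

```python
RECOMMENDED_SPEEDS = {
    "Fastest": 8, "Faster": 6, "Fast": 3, "Normal": 0,
    "Slow": -3, "Slower": -6, "Slowest": -8,
}


def parse_speed(value):
    """Return the numeric speed for a recommended name or an in-range integer."""
    if isinstance(value, str) and value in RECOMMENDED_SPEEDS:
        return RECOMMENDED_SPEEDS[value]
    speed = int(value)
    if not -10 <= speed <= 10:
        raise ValueError(f"speed must be between -10 and 10 inclusive, got {speed}")
    return speed
```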

Table 3: Parameter value table for Braille

| Parameter Name | Valid Values | Notes |
| --- | --- | --- |
| FileContent | A .txt file | Not needed if chained with a previous request. |
| BrailleFormat | sixdot = 6, eightdot = 8 | |
| Language | enUEB = 0x1, enGB = 0x0809, enUS = 0x0409, daDK = 0x0406, nnNO = 0x0814, isIS = 0x040F, ptPT = 0x0816, itIT = 0x0410, frFR = 0x040C, deDE = 0x0407, roRO = 0x0418, esES = 0x0C0A, slSI = 0x0424, huHU = 0x040E, bgBG = 0x0402, svSE = 0x041D, elGR = 0x0408, plPL = 0x0415 | |
| Contraction | full = 1, small = 2, large = 3, large2 = 4, grade0 = 5, grade1 = 6, grade2b = 7, grade2i = 8, grade2 = 9, level0 = 10, level1 = 11, level2 = 12, level3 = 13, user = 14 | Specific contractions work with specific languages. |
| OutputFormat | None = 0, OctoBraille = 1, Unicode = 2, Pef = 3, NACB = 4 | |
| ConversionPath | texttobraille = 0, brailletotext = 1 | Default is texttobraille = 0 |
| CharactersPerLine | Number: 0 = no pagination, and greater than (>) 10 for a valid pagination | Default is 0 |
| LinesPerPage | Number: 0 = no pagination, and greater than (>) 2 for a valid pagination | Default is 0 |
| TranslationTable | A valid translation table from the getTranslationTables API method | (Optional) Note: By using a typed translation table, the braille controller will ignore the language and contraction parameters. |
| PageNumbering | none = 0, right = 1, left = 2 | (Optional) Default is none = 0. |
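The pagination rules in Table 3 (0 disables pagination; otherwise CharactersPerLine must be greater than 10 and LinesPerPage greater than 2) can be checked on the client before a Braille job is posted. The Python sketch below is a hypothetical helper and makes no claim about how the server itself validates these values.

```python
def validate_pagination(characters_per_line=0, lines_per_page=0):
    """Check Braille pagination values against the rules in Table 3."""
    if characters_per_line != 0 and characters_per_line <= 10:
        raise ValueError("CharactersPerLine must be 0 or greater than 10")
    if lines_per_page != 0 and lines_per_page <= 2:
        raise ValueError("LinesPerPage must be 0 or greater than 2")
    # Keys follow the parameter names used in Table 3.
    return {
        "CharactersPerLine": characters_per_line,
        "LinesPerPage": lines_per_page,
    }
```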

Table 4: Parameter value table for Daisy

| Parameter Name | Valid Values | Notes |
| --- | --- | --- |
| FileContent | A .docx document that contains only simple text. NO mathematical equations or symbols, pictures. | Not needed if chained with a previous request. |
| DaisyOutput | TalkingBook = 1, Epub3WMO = 2 | |

Table 5: Parameter value table for E-book

| Parameter Name | Valid Values | Notes |
| --- | --- | --- |
| FileContent | A .PDF file | Not needed if chained with a previous request. |
| format | mobi = 1, epub = 2 | |

Table 6: Parameter value table for HTMLtoPDF

| Parameter Name | Valid Values | Notes |
| --- | --- | --- |
| FileContent | a valid . file with only a table | Not needed if chained with a previous request. |
| size | a1 = 1, a2 = 2, a3 = 3, a4 = 4, a5 = 5, a6 = 6, a7 = 7, a8 = 8, a9 = 9, a10 = 10, letter = 11 | |

Table 7: Parameter value table for MSOfficeConversion

| Parameter Name | Valid Values | Notes |
| --- | --- | --- |
| FileContent | Word: .doc, .dot, .docx, .dotx, .docm, .dotm, .rtf, .odt, .txt, .htm, .html; Excel: .xls, .xlsx, .xlsm, .xlt, .xltm, .xltx, .csv, .odc; Powerpoint: .ppt, .pptx, .pptm, .pps, .ppsx, .ppsm, .odp; Visio: .vsd, .vsdm, .vsdx, .svg [.vsdm, .vsdx & .svg require Visio >= 2013]; Publisher: .pub | All of the input files can be converted to PDF. Only some files can be converted to .txt .html .rtf. Not needed if chained with a previous request. |
| MSOfficeOutput | PDF = 1, txt = 2, html = 4, rtf = 8 | If the file is a pptx and the conversion is a txt file then |

Table 8: Parameter value table for HTMLtoText

| Parameter Name | Valid Values | Notes |
| --- | --- | --- |
| FileContent | A valid .html file | Returns a .txt file |

Table 9: Parameter value table for OcrConversion

| Parameter Name | Valid Values | Notes |
| --- | --- | --- |
| FileContent | An image file | Returns a .txt file |
| OcrLanguage | “enUS”, “daDK” | More languages can be installed: see Tesseract OCR configuration |

Table 10: Parameter value table for Translation

| Parameter Name | Valid Values | Notes |
| --- | --- | --- |
| FileContent | A .txt file | Returns a .txt file |
| SourceLanguage | (as defined by the backend component) | |
| TargetLanguage | (as defined by the backend component) | |

Table 11: Parameter value table for DocumentStructureRecognition

| Parameter Name | Valid Values | Notes |
| --- | --- | --- |
| FileContent | A .PDF file | *not defined yet |

Table 12: Parameter value table for VideoConversion

| Parameter Name | Valid Values | Notes |
| --- | --- | --- |
| FileContent | A .txt file | Returns a .txt file |
| VideoUrl | A url where the video is hosted | |
| SubtitleLangauge | The desired language | |
| SubtitleFormat | "srt", "txt", "dfxp", "ssa", "sbv", "vtt" | |
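Section 3.7 notes that some values can be given either by name or by number (for example “Unicode” or “2”). A client can normalise both forms to the numeric value before building a request. The Python sketch below uses the Braille OutputFormat values from Table 3; the helper name resolve_value is hypothetical, and the case-insensitive name matching is an assumption rather than documented API behaviour.

```python
OUTPUT_FORMAT = {"None": 0, "OctoBraille": 1, "Unicode": 2, "Pef": 3, "NACB": 4}


def resolve_value(table, value):
    """Map a name ("Unicode") or a number ("2", 2, "0x2") to its numeric value."""
    text = str(value).strip()
    by_name = {name.lower(): number for name, number in table.items()}
    if text.lower() in by_name:  # assumption: names are matched case-insensitively
        return by_name[text.lower()]
    try:
        number = int(text, 0)  # base 0 accepts decimal and 0x-prefixed input
    except ValueError:
        raise ValueError(f"unknown value: {value!r}") from None
    if number not in table.values():
        raise ValueError(f"value not in table: {value!r}")
    return number
```

The same helper can be reused with any of the name-to-number lists in the tables above, such as the Language codes of Table 3, which are given in hexadecimal.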

3.8 Amara integration

As part of extending the current RoboBraille functionality, a new workflow has been integrated that makes use of Amara’s API (http://amara.org) to retrieve subtitles for videos. This workflow can be seen in the VideoConversion section of the project. This added component is responsible for retrieving subtitle files for videos submitted to the RoboBrailleWebApi. The videos can be video files, in which case they will be stored by RoboBraille until they are uploaded to Amara, or video URLs from popular video streaming websites (YouTube, Vimeo, Dailymotion). The process works as follows. If the uploaded video already exists in Amara and a subtitle is available, the component immediately retrieves the subtitle and provides it to the user. Otherwise, it returns the subtitle information provided by Amara. This information can be that the subtitle language exists but is not completed; that the requested subtitle language does not exist in Amara, in which case a request for subtitling in that language must be made; or that the video does not exist, in which case both the video and the subtitle request must be submitted.


This use case is not completely automated, since the subtitling must be done manually by Amara. It is also ideal to have a team in Amara that handles videos submitted by RoboBraille in order to ensure high throughput. The final part of the Amara integration concerns the conversion of PowerPoint presentations (pptx). If a request to convert a pptx to txt is made in the MSOfficeConversion Controller, RoboBraille will extract the text and the videos from the presentation and will create a subtitling request towards Amara for each extracted video. Therefore, there will be a main jobID for the MSOfficeConversion, and each video will have a jobID for the subtitling request made in the VideoConversion. The end result will be a zip file containing a text file with the extracted text, the original extracted video files, plus a subtitle for each video in the desired language.

3.9 Non-functional stubs

The final extension to RoboBraille concerns non-functional stubs. These are only frameworks for managing the conversions, without a concrete implementation for performing them.

3.9.1 Document Structure Recognition

Some research has been done to find a solution for recognizing the structure of documents. This is mostly relevant when a book is scanned and provided to students as reading material. OCR software does a decent job at recognizing the document’s words, but it cannot make assumptions about which part of the document is a paragraph, title or heading, or which words are in a list, table or multiple columns. Much research has been done into developing such solutions, and the document structure recognition stub uses the information gathered from this research to provide a possible interface towards making such a solution possible. Some researched solutions:

http://www.inftyreader.org/?p=766
http://primaresearch.org/www/assets/papers/ICDAR2013_Clausner_ReadingOrder.PDF
http://www.kanungo.com/pubs/spie03-layoutsurvey.PDF
https://PDFs.semanticscholar.org/5392/90b571b918da959fabaae7f605bb07850518.PDF

3.9.2 Language-to-Language Translation

The Language-to-Language translation stub concerns simple translation of a text from one language to another. Solutions such as the Google Translate API can be plugged into the stub to handle the translation process. This conversion can be chained with the Audio or Braille conversions in order to provide a translation together with the desired alternative format.

4 Installing, building and testing the solution

The source code for the solution can be found at: https://github.com/sensusaps/RoboBraille.Web.API

Clone the repository to the local machine, but do not build it yet. Once downloaded, the solution must be configured before it will build and run successfully. Please consult chapter 5 (Installation Prerequisites) below for information on the required components and configurations. Then proceed as follows:

1. Make sure all the prerequisites are met (database connection, folder configurations, etc.).
2. Open the solution with Visual Studio 2013 or newer.
3. First start the DaisyConversionRPC project (or run the .exe directly).
4. Build and run the solution within Visual Studio (alternatively, publish it to an IIS server).

4.1 Testing the solution

User testing can be done by following the test guides in the solution folder under RoboBraille.TestCases. The solution uses Swagger [http://swagger.io/] to show an overview of the Web API functionality. To see this overview on your local installation, use the following path:

Listing 8: Swagger path

http://{url}:{port}/swagger/ui/index

4.2 Authentication

Authentication can be disabled if it is not needed by removing the [Authorize] attribute above the Post method of each of the API controllers. Feel free to skip this section if you do not intend to have authentication on your service, for example if it is intended for free use or for running on your local network.

The RoboBraille API uses HAWK authentication, which is a token-based authentication mechanism. Every POST request must be accompanied by an “Authorization” header. For example:

Listing 8: Authorization header example

Authorization: Hawk id="d2b97532-e6c5-e401-8270-f0cef103cfd0", ts="1478882022", nonce="GvcLtm", mac="/Shlw/8PrbXcV8VzROQBpZ30c7KxIT9LCRgByM9DgeY="

Where:

• id: your unique user id, assigned to you by the service provider
• ts: Unix timestamp
• nonce: an automatically generated random string
• mac: message authentication code, a hash created using your API key

The main source of documentation for HAWK can be found at: https://github.com/hueniverse/hawk

Implementations of HAWK authentication are available in most major programming languages and can be used to generate the authentication header automatically. The required input is the user id, the API key and the hash algorithm, which is “sha256”. A sample request will look like this:

Listing 9: Post request with authorization header

POST {url}:{port}/api/audio
Authorization: Hawk id="d2b97532-e6c5-e401-8270-f0cef103cfd0", ts="1478882022", nonce="GvcLtm", mac="/Shlw/8PrbXcV8VzROQBpZ30c7KxIT9LCRgByM9DgeY="

For client applications built on top of the API, there are many code repositories and third-party libraries that can help you create the authorization header in the programming language of your choice.
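To illustrate what such libraries do, the following Python sketch builds a HAWK Authorization header using only the standard library. It follows the normalized request string described in the HAWK documentation and assumes that the server requires neither a payload hash nor an "ext" value, which matches the header fields shown above but has not been verified against a RoboBraille installation. The function name, the user id and the API key in the usage note are hypothetical.

```python
import base64
import hashlib
import hmac
import secrets
import time
from urllib.parse import urlsplit


def hawk_header(url, method, user_id, api_key, ts=None, nonce=None):
    """Build a Hawk Authorization header value (no payload hash, no ext)."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    resource = parts.path or "/"
    if parts.query:
        resource += "?" + parts.query
    ts = str(ts if ts is not None else int(time.time()))
    nonce = nonce or secrets.token_urlsafe(6)
    # Normalized request string: one field per line, with empty lines
    # for the unused payload hash and ext fields.
    normalized = "\n".join([
        "hawk.1.header", ts, nonce, method.upper(), resource,
        parts.hostname, str(port), "", "",
    ]) + "\n"
    digest = hmac.new(api_key.encode(), normalized.encode(), hashlib.sha256).digest()
    mac = base64.b64encode(digest).decode()
    return f'Hawk id="{user_id}", ts="{ts}", nonce="{nonce}", mac="{mac}"'
```

A call such as hawk_header("http://localhost:8080/api/audio", "POST", "my-user-id", "my-api-key") returns a string that can be sent as the value of the Authorization header. In production, an established HAWK library is preferable to a hand-written implementation.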

4.3 Data format

The recommended request encoding is multipart/form-data. Use the “Content-Type” HTTP header to specify the format of your request. The input parameters can also be passed as JSON objects.
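As a minimal sketch of the recommended encoding, the following Python function assembles a multipart/form-data body by hand and returns it together with the matching “Content-Type” value. The function name and the field names used in the usage note are hypothetical; the actual parameter names expected by each controller should be taken from the Swagger overview.

```python
import uuid


def encode_multipart(fields, file_field, filename, file_bytes):
    """Return (content_type, body) for a multipart/form-data request."""
    boundary = uuid.uuid4().hex
    lines = []
    # One part per plain form field
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            str(value).encode("utf-8"),
        ]
    # One part carrying the document to convert, then the closing boundary
    lines += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(lines)
```

For example, encode_multipart({"language": "en"}, "file", "sample.txt", b"Hello") yields a content type and a body that can be passed to any HTTP client as the “Content-Type” header and the request payload.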

4.4 Postman

Postman can be used to create test requests to the API. It is recommended because it supports the creation of HAWK authentication headers. https://www.getpostman.com/


5 Installation Prerequisites

Prior to running the solution on a local IIS server, the following components must be installed and configured in order to achieve the full functionality of the RoboBraille Web API solution.

5.1 Microsoft Windows, Visual Studio and .NET

The solution requires a server running Microsoft Windows. The minimum requirements are Windows 2008 Server Edition with a running IIS, .NET 4.0 and Visual Studio Runtime 2012. The solution is built using .NET 4.0; Visual Studio Runtime 2012 is required for running components such as Tesseract. Create the bin folder in the RoboBraille.WebApi project and add the necessary DLLs (the ones mentioned below).

5.2 Folder Configurations

The directory paths must correspond to the key-value pairs set inside the web.config of the solution. The current folder configuration can be found in the source code under the folder called “WorkingDirectory”. Change the local disk and source path as appropriate. The following values in the web.config file need to be mapped correctly:

• tessdatapath must point to the folder containing the training data for Tesseract OCR (only necessary if Tesseract is used as part of the solution).
• calibrepath must point to the installation path of the Calibre setup (only necessary if Calibre is used as part of the solution).
• BinDirectory must point to the project’s bin directory. When publishing to the server, it must point to the bin directory of the published solution.
• FileDirectory must point to the folder called WorkingDirectory within the downloaded RoboBrailleWebAPI solution.
• DistDirectory must point to the directory where the RoboBraille Web API can publish files to the web so that they can be sent as further requests to other APIs or used in similar actions (currently it is used to POST videos to the Amara API).
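The mapping can be checked before the first build with a short script. The following Python sketch assumes that the values are stored as <add key="..." value="..."/> entries inside the <appSettings> element of web.config; this is the usual ASP.NET layout but should be verified against the actual file. The function name is hypothetical.

```python
import os
import xml.etree.ElementTree as ET

REQUIRED_KEYS = ["tessdatapath", "calibrepath", "BinDirectory",
                 "FileDirectory", "DistDirectory"]


def check_folder_settings(config_xml):
    """Map each required key to 'missing', 'not found on disk' or 'ok'."""
    root = ET.fromstring(config_xml)
    # Assumption: settings live in <configuration><appSettings><add .../>
    settings = {
        add.get("key"): add.get("value", "")
        for add in root.findall("./appSettings/add")
    }
    report = {}
    for key in REQUIRED_KEYS:
        if key not in settings:
            report[key] = "missing"
        elif not os.path.exists(settings[key]):
            report[key] = "not found on disk"
        else:
            report[key] = "ok"
    return report
```

Passing the text of web.config to check_folder_settings returns one status per key, so that a missing or mistyped path is visible at a glance. Keys for components that are not used (for example tessdatapath without Tesseract) can be ignored.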

5.3 Sensus SB4

Sensus SB4 is a commercial Braille conversion library (DLL) designed and maintained by Sensus. In a RoboBraille implementation, SB4 may be replaced by other Braille conversion solutions, e.g. LibLouis. SB4 produces high-quality and accurate Braille in multiple languages. The software can be bought at www.sensus.dk. Please contact Sensus at [email protected] for purchasing and licensing information.

Once purchased, the SB4.dll needs to be placed in the bin directory of your solution together with the Sensus.Braille.dll. Only the Sensus.SB4.dll should be added as a reference within the Web API solution (in Visual Studio, right-click on the RoboBraille.WebApi project, select “Add/Reference” and browse to your bin directory to add the “Sensus.SB4.dll”).

5.4 LibLouis

LibLouis is an open-source Braille converter. Download a stable build of LibLouis from the download link below (or alternatively build and configure the LibLouis DLL yourself). Place the liblouis.dll in the bin directory of the solution and add a table subdirectory containing all the necessary conversion tables for LibLouis. Please consult the LibLouis website for further information. Download link: http://liblouis.org/downloads/

5.5 High-quality OCR engine

A high-quality, commercial OCR engine is recommended if RoboBraille is to convert images into accessible formats with reasonable accuracy. In a RoboBraille implementation, it can be replaced by other OCR solutions, e.g. Tesseract. This will, however, reduce the accuracy. The OCR engine needs to be installed and configured to run as a web service on the network and added as a service reference to the solution.

5.6 Tesseract 3.02 language data files for required files

Tesseract is an open-source OCR solution. Tesseract 3.02 comes installed as a NuGet package in the solution. It is used as an open-source alternative to a commercial OCR engine. Support for multiple languages must be installed and configured within the solution as needed. Currently supported languages are English and Danish.

Tesseract source for languages:

• https://github.com/tesseract-ocr/tesseract

Tesseract .NET:

• https://github.com/charlesw/tesseract

5.7 Windows Speech and Installed Voices

By default, the application will use the English voices installed and configured on your local Windows server. Additional SAPI 5 voices can be installed and configured.

5.8 Messaging

Make sure Message Queuing is installed and enabled. In Control Panel, under “Programs/Programs and Features/Turn Windows features on or off”, select the check box for “Microsoft Message Queue (MSMQ) Server” to enable it. Follow the provided link for more details: https://technet.microsoft.com/en-us/library/cc730960.aspx

In Control Panel, under “System and Security/Administrative Tools/Services”, Message Queuing (and, depending on the Windows version, optionally Message Queuing Triggers) must have “Status: Running” and “Startup Type: Automatic”.

Erlang must be installed for the messaging system to work. Installation link: http://www.erlang.org/download.html

Messaging must be enabled in the Services settings of your Windows machine. RabbitMQ must also be installed. It can be downloaded from: https://www.rabbitmq.com/

If your DAISY component is running on a separate machine, clustering must also be enabled and configured: https://www.rabbitmq.com/clustering.html

To configure and manage your RabbitMQ cluster, it is advisable to enable the RabbitMQ management console: https://www.rabbitmq.com/management.html
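Whether the messaging components are up can be verified with a simple connectivity check. The Python sketch below only tests that a TCP connection can be opened; it assumes RabbitMQ runs on the local machine with its default ports (5672 for AMQP and 15672 for the management console), which may differ in a customised installation. The function name is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: default RabbitMQ ports on the local machine
    for name, port in [("AMQP", 5672), ("management console", 15672)]:
        state = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"RabbitMQ {name} port {port}: {state}")
```

A reachable port shows only that something is listening; it does not confirm that the queues or a cluster are configured correctly.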

5.9 DAISY Pipeline 1 and 2, Lame, ImageMagick, eSpeak

DAISY Pipeline is an open-source solution for converting documents into DAISY structured audio books. LAME is a free demonstrator MP3 encoder. ImageMagick is an open-source image processing component. eSpeak is a compact open-source software speech synthesizer for English and other languages. Please follow the “Prerequisites” step before going through either the “Quick Run” or the “Extensive Setup”.

5.9.1 Prerequisites

1. Follow the instructions to download Java and run the .exe to install Java on your machine. Once Java is installed, you need to set environment variables that point to the correct installation directories.
2. Assuming you have installed Java in the C:\Program Files\java\jdk directory, go to this directory and copy its path.
3. Right-click on 'My Computer' and select Properties/Advanced system settings/Environment variables/System variables.
4. Choose the 'Path' variable and press the Edit button. Make sure the line ends with a “;”, paste in the path to your Java directory, and end with a “;”. Now do the same with the jre folder from your Java directory. Example: at the end of the Path variable, change your path to read ;C:\Program Files\Java\jdk1.8.0_60\bin;C:\Program Files\Java\jdk1.8.0_60\jre;
5. Add the felix.jar from the “WorkingDirectory\felix-framework-5.2.0\bin” folder to your Path environment variable following the same instructions (e.g. ;C:\Users\Administrator\Source\Repos\RoboBrailleWebApi\WorkingDirectory\felix-framework-5.2.0\bin;).
6. Install LAME: http://lame.sourceforge.net/download.php (for Windows users this build is recommended: http://lame.buanzo.org/Lame_v3.99.3_for_Windows.exe).
7. Install ImageMagick: http://www.imagemagick.org/. Be aware that only version 6.9.3-Q16 has been tested and is known to work!
8. Install eSpeak: http://espeak.sourceforge.net/download.html

There are two options for running the DAISY conversion: the “Quick Run” or the “Extensive Setup”. The “Quick Run” uses the “WorkingDirectory”, in which the DaisyConversionRPC is present along with a complete version of DAISY Pipeline 1 and DAISY Pipeline 2.
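Before choosing either option, the prerequisites above can be sanity-checked from a script. The following Python sketch reports which command-line tools are visible on the Path of the current process. The executable names (java, lame, espeak) are assumptions; an installer may use a different name or may not add its folder to the Path at all, in which case the tool is reported as not found even though it is installed.

```python
import shutil


def find_tools(names):
    """Map each executable name to its resolved path, or None if not on the Path."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Assumption: these are the executable names used by the installed tools
    for name, path in find_tools(["java", "lame", "espeak"]).items():
        print(f"{name}: {path or 'not found on Path'}")
```

Note that environment variable changes only apply to processes started after the change, so the check should be run from a newly opened console.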

5.9.2 Quick Run

For a quick run of the DAISY conversion component, simply run the “WorkingDirectory\DaisyConversionRPC\DaisyConversionRPC.exe” executable. The source code for this executable can be found in the GitHub repository under the “DaisyConversionRPC” project. Note that the messaging step (section 5.8) must be completed before the executable can run.

5.9.3 Extensive Setup

If you do not want to follow the “Quick Run” and wish to build the entire process from scratch, please read the following carefully. First build the “DaisyConversionRPC” project from the GitHub source code. Then follow the configuration steps provided by DAISY in order to set up a running service of DAISY Pipeline 1 and 2. Links are provided below:

• DAISY Pipeline 1: http://www.daisy.org/pipeline/download
• DAISY Pipeline 2: https://code.google.com/p/daisy-pipeline/downloads/detail?name=pipeline2-1.6.zip

Optional step: The current configuration allows the DAISY-related conversions to be installed on a separate machine. This requires installing Erlang and RabbitMQ on your Windows server and on the additional Windows machine, and configuring a messaging cluster between the two machines.


5.10 Calibre

Calibre is an open-source e-book management and conversion solution. The portable version is preferred, in particular for server installations. Either use the Calibre portable installation provided in the “WorkingDirectory” folder, or download and install Calibre. Then add a reference to the calibre.exe installation path in the project's web.config. Link: http://calibre-ebook.com/download

5.11 Microsoft Office 2013

In order to convert Microsoft Office documents, an installation of Office 2013 must be present on the server.

5.12 Database setup and connection

Use Microsoft SQL Server to create a database called “RoboBrailleJobDB” and run the latest version of the script called “RoboBrailleJobDB-(version).” from the folder named “Database Script”. After that, run the “demo-user.sql” script to add a default user. The current configuration uses a Microsoft SQL Server database with code first migrations enabled. Please check the following tutorial for help in configuring the solution: http://www.asp.net/web-api/overview/data/using-web-api-with-entity-framework/part-3. If it has not been set up automatically, check that the web.config file contains a connection string named “RoboBrailleJobDB” with the appropriate Data Source.
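The presence of the connection string can be checked with a few lines of Python. The sketch below assumes the standard ASP.NET layout, in which connection strings are <add name="..." connectionString="..."/> entries inside the <connectionStrings> element of web.config. The function names and the sample values in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET


def find_connection_string(config_xml, name="RoboBrailleJobDB"):
    """Return the connection string with the given name, or None if absent."""
    root = ET.fromstring(config_xml)
    for add in root.findall("./connectionStrings/add"):
        if add.get("name") == name:
            return add.get("connectionString")
    return None


def data_source(connection_string):
    """Extract the Data Source value from a connection string, or None."""
    for part in connection_string.split(";"):
        key, _, value = part.partition("=")
        if key.strip().lower() == "data source":
            return value.strip()
    return None
```

For a web.config whose entry reads connectionString="Data Source=.\SQLEXPRESS;Initial Catalog=RoboBrailleJobDB", find_connection_string returns that string and data_source extracts the server name, which can then be compared with the SQL Server instance that hosts the database.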


Annex I: Glossary

Abbreviation Full form 3D Three Dimensional 3GPP 3rd Generation Partnership Project A Activity AAA Authentication, Authorization or Accounting mechanisms AAATE Association for the Advancement of Assistive Technology in Europe AAL Ambient Assisted Living AccLIP Accessibility for LIP (Learning Information Package): IMS standard that defines a means to specify accessibility preferences and learner accommodations AccMD AccessForAll Metadata (IMS standard) ACfP* Auto-Configuration from Profile {This is improper expansion of ACfP} ACfP* Auto-Configuration from Preference AES Advanced Encryption Standard AfA Access For All: IMS specification describing adaptation or Personalisation of resources, interfaces and content to meet the needs of individuals AM Authorization Manager AmI Ambient Intelligence Anode Android Node.js AOD Assistance on Demand APfP* Auto-Personalisation from Preferences (mechanisms enabled by Cloud4all/GPII by which users are able to automatically adapt a device/application/service/resource according to their needs and preferences) API* Application Programming Interface Apps Applications ARIA Accessible Rich Internet Applications Suite AS Assistive Software ASIT Adapted Social Interaction Tool ASTA Android compatible Smart Tv Architecture AT* Assistive Technologies ATAG Authoring Tool Accessibility Guidelines ATM* Automated Teller Machine C4aA Cloud4all CA Consortium Agreement CA Coordination Action CAC Communication Advisory Committee


Abbreviation Full form CANARIE Canada’s Advanced Research and Innovation Network CAP Common Access Profile CAPTCHAS Completely Automated Public Turing test to tell Computers and Humans Apart CAS Context Aware Server CAT Computer Aided Translation CBA* Cost-Benefit Analysis CBT Computer-Managed Instruction CC/PP Composite Capabilities/Preference Profiles CCITT (Comité Consultatif International Téléphonique et Télégraphique) International Telegraph and Telephone Consultative Committee, one of the three sectors at the International Telecommunication Union (ITU) CEA* Cost-Effectiveness Analysis CEFIE Centre for Family Business and Entrepreneurship CEG Consumers Expert Group CEN* Comité Européen de Normalisation / European Committee for Standardization, one of the three official standardisation bodies in Europe CENELEC* Comité Européen de Normalisation Électrotechnique (European Committee for Electrotechnical Standardization) CERTH Centre for Research and Technology Hellas CH Dissemination Channel CMS Content Management System. A software that allows the management of text, media and other content and its display to a front-end. CPRD Convention on the Rights of Persons with Disabilities CPU Central Processing Unit CSA Cloud Security Alliance CSP Cloud Service Provider CSS Cascading Style Sheets CSS/XSS Cross-Site Scripting CSS3 Cascading Style Sheets – Version 3 CTR Common Terms Registry (now called Preference Terms Dictionary) D (or Del)* Deliverable DAISY Digital Accessible Information System. A standardised audio book format. DBSCAN Density-based spatial clustering of applications with noise (clustering algorithm) D-Bus Desktop - Bus DDA Disability Discrimination Act 1995: legislation in the United Kingdom; replaced by the Equality Act 2010 (except in Northern Ireland) DDD4R Deliverable (Document) Due Date for Review


Abbreviation Full form DDD4S Deliverable (Document) Due Date for Submission DDOS Distributed Denial of Service DDR Device Description Repository DfA Design for All DG Directorate-General DMR Demonstration Result DNS Domain Name Services DOM Document Object Model DOW* Description of Work DPA Devices, platforms and applications DR Development Result DRD Digital Resource Description DTS Draft Technical Specification DTV* Digital Television DVB MHP Digital Video Broadcasting Multimedia Home Platform E&LAC Ethical and Legal Advisory Committee EASTIN* European Assistive Technology Information Network (www.eastin.eu) EC European Commission ECA Event-Control-Action ECB Ethical Control Board EDeAN European Design for All eAccessibility Network EFER European Foundation for Entrepreneurship Research EM expectation-maximization (algorithm) EN European Standard (“Europäische Norm”) EN English ENISA European Network and Information Security Agency EPG Electronic Program Guide ePub open e-book standard by the International Digital Publishing Forum (IDPF) ERA European Research Area ES ETSI Standard ES Expert System ES Standard ETNA* European Thematic Network on Assistive Information and Communication Technologies ETSI European Telecommunications Standards Institute, one of the three official standardisation bodies in Europe


Abbreviation Full form EU European Union EULA End User License Agreement F2F* Face to Face FAQ Frequently Asked Questions FE further education (UK and Ireland); post-compulsory education distinct from higher education; called “continuing education” in the USA FM Flow Manager FONCE Fundación ONCE para la cooperación e inclusión social de personas con discapacidad FP Framework Program GDP Gross Domestic Product GNU Literally "GNU's Not Unix!" (a recursive acronym). Can refer to both the GNU Linux operating system and the GNU Project. GOK Technological objectives GPII* Global Public Inclusive Infrastructure GPL General Public License GPS Global Positioning System GSM Global System for Mobile GUI Graphical User Interface GUIDE Gentle user interfaces for elderly people GUMO General User Model Ontology HCI Human Computer Interaction HDM Hochschule Der Medien (Stuttgart Media University) HF Human Factors HID Human Interface Device HIDS Host Intrusion Detection System HiFi* High Fidelity HIIC High Impact Innovation Centre HIPS Host Intrusion Prevention System HMI Human Machine Interface HTML HyperText Markup Language, the standard markup language for creating web pages HTTP HyperText Transfer Protocol I/O Input/Output IaaS Infrastructure as a Service IAATIP International Alliance of Assistive Technology Information Providers IAM Identity and Access Management IBM International Business Machines Ecosystem infrastructure for smart and personalised inclusion and PROSPERITY for ALL stakeholders 32 www.prosperity4all.eu

Abbreviation Full form ICF International Classification of Functioning, Disability and Health: classification of health components of functioning and disability, published by the WHO ICT* Information and Communication Technologies ICT* Intelligent Communication Technologies ID Identity ID Internal Deliverable IDE Integrated Development Environment IdP Identity Provider IDRC International Development Research Centre IDS Intrusion Detection System IE Internet Explorer IEC International Electrotechnical Commission IETF Internet Engineering Task Force IMS IMS Global Learning Consortium (originally "Instructional Management Systems"); develops standards for learning technology INCITS InterNational Committee for Information Technology Standards IndieUI Independent User Interface IOS Internetwork Operating System iOS mobile operating system developed by Apple Inc. IP Internet Protocol IP Integrating Project

IP International Projects IP Office Technosite Project Management Office IPR* Intellectual Property Rights IPS Intrusion Prevention System IRR Internal Rate of Return ISACA Information Systems Audit and Control Association ISO* International Standardisation Organisation IT Information Technology ITE Integrated Translation Environment JISC UK public body that supports post-16 and higher education, and research (formerly Joint Information Systems Committee) JME Java Micro Edition JME Java platform, Mobile Edition JME/JSE Java Micro Edition/Java Standard Edition


Abbreviation Full form JSON* JavaScript Object Notation, is an open standard format that uses human-readable text to transmit data objects consisting of attribute–value pairs. JSON-LD JSON (for) Linked Data JTAG Joint Test Action Group Debugger JTC Joint Technical Committee KBS Knowledge Base System KMS Knowledge Management System KPI Key Project Indicator LAMP Linux, Apache HTTP Server, MySQL/MariaDB and PHP/Perl/Python (a web server solution stack based on free and open-source software; see also XAMPP) LCDUI Limited Connected Device User Interface LCMS Learning Content Management System LDAP Lightweight Directory Access Protocol LEC Local Ethics Committees LINQ Language Integrated Query LIP Learner Information Package LMS Learning Management System Lo/Me-Fi Low/Medium Fidelity LoFi Low Fidelity LOR Learning Object Repository LWUIT Lightweight User Interface Toolkit MA* Mobile Accessibility for Android MeFi Medium Fidelity MM* Matchmaker MMM MiniMatchmaker MMPT Match Maker Preference Tool mmusic Multiparty Multimedia Session Control MOODLE Modular Object-Oriented Dynamic Learning Environment MP3 MPEG-1 or MPEG-2 Audio Layer III, a patented encoding and compression format for digital audio MPRIS Media Player Remote Interfacing Specification MS Microsoft MS Milestone MSAA Microsoft Active Accessibility MVC Model-View-Controller MyUI Mainstreaming Accessibility through Synergistic User Modelling and Adaptability Ecosystem infrastructure for smart and personalised inclusion and PROSPERITY for ALL stakeholders 34 www.prosperity4all.eu

Abbreviation Full form N&Ps* Needs and Preferences N&P Set Needs and preferences set NAND Negated AND NATO North Atlantic Treaty Organization NDA Non Disclosure Agreement NFC* Near Field Communication NGO Non Governmental Organisation NIST National Institute of Science and Technology NP Needs and Preferences NP set Needs and preferences set. Adaptations and settings an individual needs or prefers in order to interact appropriately with a device and its software. In Cloud4all/GPII they are first decided by user through the PMT. NPV Net Present Value NVDA NonVisual Desktop Access (open source screen reader for Windows) OAEG Open Accessibility Everywhere Group OAF Open Accessibility Framework OATH Open Authentication OAuth Open Authorization Framework OCR Optical Character Recognition ODT word processing format of the Open Document Format for Office Applications (ODF), natively supported by OpenOffice and LibreOffice OER(s) open educational resource(s), free and openly licensed educational materials OS* Operating System OWASP Open Web Application Security Project OWL Web Ontology Language (W3C specification for describing and sharing ontologies on the World Wide Web) PaaS Platform as a Service PB Plenary Board PC* Personal Computer PC Project Coordinator PCA principal component analysis (a statistical procedure PCL Philips Consumer Lifestyle PCP* Personal Control Panel PDA Personal Digital Assistant PDF Portable Document Format, also standardised as ISO 32000-1:2008 PDT Portable Data Terminal


Abbreviation Full form PHP “PHP: Hypertext Preprocessor” (a recursive acronym), a server-side scripting language for web development PKI Public Key Infrastructure PM Person Month PMT* Needs and Preferences Management Tool PNP Personal Needs & Preferences POIs Point of Interests POS for .NET Point of Service for .NET PQ Picture Quality PS Preferences Server PSC Project Steering Committee PTD Preference Term Dictionary. This is the tool where all solutions’ settings are registered (Formerly Common Terms Registry) PWD Persons with disabilities QAM Quality Assurance Manager QCB Quality Control Board QCP Quality Control Plan QoS Quality of Service QR Quick Response QRcode Quick Response code R&D Research and Development R&D&I Research Development and Innovation RAM Random Access Memory Rb Rule-based RBAC Role-Based access control RBMM Rule Based Matchmaker RDF Resource Description Framework (W3C specification for Semantic Web data models) Relay Service A telecommunication service that allows e.g. Deaf and Hard of Hearing users to communicate with hearing users in real-time via a sign language interpreter or via text. REST* Representation State Transfer, a Web service design model RFDQ Reference Framework for the Description of Quality RFID Radio Frequency Identification RG General Results RGO Result from General Objective RMCP Risk management and contingency planning


Abbreviation Full form RNIB Royal National Institute of Blind People (UK charity) ROI Return Of Investment rtcweb Real-time communication in web browsers RTF Rich Text Format, a word processing format defined by Microsoft RtF-I Raising the Floor - International RUP Rational Unified Process S&T Scientific and Technological SAA Shopping/Alerting Aid SaaS Software as a Service SAB* Scientific Advisory Board SAML Security Assertions Markup Language SAT* Semantic Alignment Tool SAToGo Screen Access To Go SB4 SensusBraille 4. A Braille transcription library. SBRI Small Business Research Initiative, a funding programme by JISC TechDis in the UK SC Subcommittee SCORM Shareable Content Object Reference Model SDC Stiftung Digitale Chancen SDK* Software Development Kit SDP Software Defined Perimeter SEMA* Semantic Framework for Content and Solutions SG Security Gateway SLA Service Level Agreement SLA Solution Level Agreement SME Small and Medium Enterprices SMS/MMS Short Message Service / Multimedia Messaging Service SOA Service-Oriented Architecture SoA State Of the Art SOAP Simple Object Access protocol, is a protocol specification for exchanging structured information in the implementation of web services SOC Software On Chip SP* Sub-Project SP Leader Sub-Project Leader SP3 Sub Project 3 (of Cloud4all project) SPARQL Protocol and RDF Query Language


Abbreviation Full form SQL Structured Query Language SR Screen reader SR Solutions Registry SR Scientific Result SSL Secure Socket Layer SSO Single-Sing On SST Service Synthesizer Tool St Statistical STMM Statistical Matchmaker STREP Specific Targeted Research Project STS Security Token Service SW Software SWOT Strengths, Weaknesses, Opportunities and Threats TAG Target Audience Group TAP Think-Aloud Protocol TC Technical Committee TCP Transport Control Protocol TDL Trust in Data Life TE Technological objectives TECH Technosite TEE Trusted Execution Environment TLS Transport Layer Security TM Technical Manager TMS Translation Management System TR Technological Result TR Technical Report Transfor- Operations that map values from standard preference terms into application settings mations TRL Technology Readiness Level TS Technical Specification TTS* Text-To-Speech TUD Technische Universität Darmstadt TVM* Ticket Vending Machine UC Use Case UCD* User-Centered Design UI* User Interface Ecosystem infrastructure for smart and personalised inclusion and PROSPERITY for ALL stakeholders 38 www.prosperity4all.eu

Abbreviation Full form UIO User Interface Options UL User Listener UL Unified Listing UML Unified Modelling Language UNCRPD United Nations Convention on the Rights of Persons with Disabilities uPOS universal Point of Service UR User Requirement URC Universal Remote Console [ also Universal Remote Control ] URI Uniform Resource Identifier; superset of Uniform Resource Locator (URL) and Uniform Resource Name (URN) URL Uniform Resource Locator USB Universal Serial Bus UserML User Model Markup Language UTF Unicode Transformation Format Uuid Universally Unique IDentifier UX User experience V2 Subcommittee VDE Verband der Elektrotechnik VLC VideoLAN Client VM Virtual Machine VoIP Voice Over Internet Protocol VolumeTTS Text To Speech Engine Volume VUMS Virtual User Modelling and Simulation Standardisation W3C* World Wide Web Consortium WAI Web Accessibility Initiative WAM Warren Abstract Machine WAN Wide Area Network WCAG Web Content Accessibility Guidelines WebDAV Web Distributed Authoring and Versioning WG Working Group WHA World Health Assembly, the meetings of the World Health Organization (WHO) WHO World Health Organization WM Working Memory WP* Work Package WP Leader Work-Package Leader WS Web Service Ecosystem infrastructure for smart and personalised inclusion and PROSPERITY for ALL stakeholders 39 www.prosperity4all.eu

Abbreviation Full form WTH Willingness to Have WTP Willingness to Pay WURFL Wireless Universal Resource FiLe WWW World Wide Web WYSIWYG What You See Is What You Get XAMPP Cross-platform (“X”), Apache HTTP Server, MySQL/MariaDB, PHP/Perl/Python (a cross-platform web server solution stack based on free and open-source software; see also LAMP) XHTML Extensible HyperText Markup Language XML Extensible Markup Language, a specification for (defining) markup languages, by the World Wide Web Consortium (W3C) XSRF Cross-Site Request Forgery
