TEI Publisher Documentation

Wolfgang Meier, eXist Solutions GmbH
Magdalena Turska, eXist Solutions GmbH

October 8, 2021

Introduction

What TEI Publisher does ...

The motivation behind TEI Publisher was to provide a tool which enables scholars and editors to publish their materials without becoming programmers, but which also does not force them into a one-size-fits-all framework. Experienced developers will benefit as well by writing less code, avoiding redundancy, and improving maintenance and interoperability - to name just a few. TEI Publisher is all about standards, modularity, reusability and sustainability!

In Publisher, it all starts with your source documents - regardless of whether they are in TEI or another form of XML: DocBook, MS Word (DOCX) or JATS. No matter how the source material has been encoded, it can be easily transformed into a range of output formats for publication - from a modern web page that you can open on your laptop or mobile device, to an ebook, a PDF file or its LaTeX source.

TEI Publisher derives its name from TEI and the TEI Processing Model (PM). The Processing Model is a part of the TEI vocabulary and the TEI ODD specification format, described in the TEI P5 guidelines as well as in further chapters here. It defines how a TEI document should be rendered in different output formats and lies at the heart of TEI Publisher.

However, online editions require more than just a text transformation: the text needs to be embedded into an application, adding navigation, pagination, search, facsimile display and so on. The larger part of TEI Publisher deals with those aspects, providing all the necessary building blocks for an online edition.

Staying true to the spirit of code reuse and interoperability, TEI Publisher implements its functionalities as small "lego" blocks to be freely arranged and recombined. The technology making this possible is called Web Components. It is part of the HTML5 specification and natively implemented by many browsers. Users don’t need to dive into the details of this standard though: all you need to modify the example pages is some basic HTML knowledge. Only where the available components are not enough does a new use case need to be described and suitable new components implemented; these can then be incorporated into the existing component pool for everyone else to use. After all, our mantra is reuse, reuse, reuse and we want to turn TEI Publisher into a box of tools the entire community can benefit from.

Despite the elegant simplicity of this approach, various projects we realized in the past prove that TEI Publisher is:

1. powerful enough to cover complex transformation needs

2. a truly universal tool for any kind of digital edition

3. capable of generating high quality, camera ready material for book publishing

4. a sustainable and future-proof solution

5. suitable for any XML, not just TEI (this documentation is written in DocBook!)

e-editiones.org

Since its first incarnation in 2015, TEI Publisher has gained a substantial following, with numerous academic and commercial projects around the globe using it for their editorial and publishing needs. A grass-roots user initiative led in 2020 to the foundation of an international non-profit association, e-editiones.org, with a focus on further joint development of TEI Publisher, open standards and best practices for digital editions.

TEI Publisher development is only possible thanks to the generous contributions of developers, users and institutions willing to employ Open Source approaches so that the whole community can reuse and benefit from their work. A growing number of projects, from small to large, that have decided to publish their materials with TEI Publisher gives us all not only the opportunity but also the responsibility to make the project thrive for years to come and to make it a truly sustainable option for XML publishing! Consider joining e-editiones.org through your affiliated institution or individually to support our efforts.

We invite the community to contribute to the project - by means of code, ideas, documentation, tutorials and funding. You don’t have to be a developer to contribute; you can do so in a number of ways!

1. Check out the source code, modify it, document it, enhance it.


2. Create new or enhance existing example documents and ODDs so we present showcases for various TEI applications.

3. Report your issues, feature requests or ideas for discussion via the GitHub issue tracker.

4. Discuss with us on the e-editiones Slack chat, through the mailing list or via @EEditiones on Twitter.

5. Contribute to translations via Crowdin. Please contact us if your target language is not listed and you’d like to work on it.

6. Port back your customizations to the TEI Publisher code base so that others can use them too (or ask us to do it for you)!

7. Help and mentor others - publish teaching materials, answer questions on our Slack channel, mailing list and other forums.

8. Sponsor a concrete feature or fund a development grant.

Versions

TEI Publisher has been under active development since 2015. Once or twice a year a new major version is released, bringing important new features. Minor versions are released at shorter intervals and offer bug fixes, minor new features and improvements. The current major version of TEI Publisher is 7.0.0.

There’s a long list of ideas and features we’d like to see incorporated into TEI Publisher: from wider coverage of input and output formats (e.g. InDesign), CQL and DTS support, to various editing workflows and support for efficient hosting and maintenance of multiple editions. These ideas are in various stages of development - some already advanced, some in a conceptual phase, some waiting for funding and implementation. Coordination of future work is primarily conducted by e-editiones.org.

What’s new in TEI Publisher 7.0.0

Version 7 brought another major refactoring and restructuring of the TEI Publisher app, particularly regarding the server-side modules.

TEI Publisher now exposes a well-defined, clear API specification following the OpenAPI standard. This API is used in TEI Publisher by the client-side UI components, but it can equally well be used by independent software which harnesses the functionality exposed via the API without being forced to rely on Publisher’s client-side components.

On the server a new package, oas-router, reads the API specification and uses it to map HTTP requests to the XQuery functions which perform the actual API operations. Other packages, particularly the UI web components (pb-components) and the TEI Processing Model library (tei-publisher-lib), underwent the changes necessary to communicate with the new API while retaining full backwards compatibility.

Beyond the structural changes, a number of reported issues have been fixed and broader test coverage has been introduced for all packages. A sophisticated CI setup based on Docker has been created. An extensive test suite has been prepared for individual components. Furthermore, every API operation is independently tested against the specification, to ensure that e.g. parameter and response types correspond exactly to the definition.

Thanks to community contributions via Crowdin, TEI Publisher is now available in 20 languages.

See the chapter on Updating for information on how to update your app to take advantage of these developments.
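The idea of specification-driven routing described above can be illustrated with a small sketch. It is not the oas-router implementation (which runs as XQuery on the server); it only shows, in Python, how a description in the shape of an OpenAPI "paths" object can be used to map a request to a function. The paths, the operationId values and the handler functions are all invented for the example, as is the assumption that the operationId names the handler.

```python
import re

# Hypothetical miniature API description in the shape of an OpenAPI "paths" object.
SPEC = {
    "paths": {
        "/api/document/{doc_id}": {"get": {"operationId": "get_document"}},
        "/api/version": {"get": {"operationId": "get_version"}},
    }
}


def get_document(doc_id):
    """Hypothetical handler: would return the requested document."""
    return f"document {doc_id}"


def get_version():
    """Hypothetical handler: would return the application version."""
    return "7.0.0"


# Assumption for this sketch: each operationId is looked up in a table of handlers.
HANDLERS = {"get_document": get_document, "get_version": get_version}


def compile_routes(spec, handlers):
    """Turn every (path template, method) pair of the spec into a route."""
    routes = []
    for template, operations in spec["paths"].items():
        # "/api/document/{doc_id}" becomes a regex with one named group per parameter.
        pattern = re.compile(re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template))
        for method, operation in operations.items():
            routes.append((method.upper(), pattern, handlers[operation["operationId"]]))
    return routes


def dispatch(routes, method, path):
    """Find the route matching the request and call its handler."""
    for route_method, pattern, handler in routes:
        match = pattern.fullmatch(path)
        if match and route_method == method.upper():
            return handler(**match.groupdict())
    raise LookupError(f"no operation defined for {method} {path}")


ROUTES = compile_routes(SPEC, HANDLERS)
print(dispatch(ROUTES, "GET", "/api/document/letter1.xml"))
```

Because the routing table is derived from the specification, a request that the specification does not describe is rejected before any handler runs, which is what makes testing every operation against the specification practical.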

What’s new in TEI Publisher 6.0.0

Version 6 brought a major refactoring and restructuring of the TEI Publisher app libraries along with new specialized components and use case examples.

1. Web components overhaul: migration to LitElement and to the npm library

While invisible to users, this redesign greatly improves the modularity of Publisher-based applications. With Publisher web component releases published on npm, updating the user interface for all Publisher-based apps is just a question of changing a single variable in the configuration file.

Furthermore, Publisher’s library of web components - true to the basic idea of the Web Components standard - can be included in any HTML web page, e.g. embedded into an existing CMS or any other publishing solution, even if it’s not running eXist-db.

Similarly, if you prefer to write your own application using any of the popular frameworks like Angular, Vue or React, you can easily import the pb-components package from npm and use it directly in your project.

As a final consequence, this change decouples the component library from the TEI Publisher app. It is now possible to host multiple applications which depend on different versions of the component library within the same eXist-db instance without conflict, a point of importance for institutions with numerous projects.

2. Redesigned and simplified CSS styling customization

Encapsulation of styles offered by web components can be a mixed blessing and poses some challenges when customizing the aesthetics of components to fit a project. While some aspects of component styling remained inaccessible for customization in previous versions, Publisher 6 exposes the majority of styling properties via standard CSS files and theme variables. Stylesheets can also be specified within the ODD, as previously, or through pb-view component configuration attributes.

3. Extended internationalization

I18n support has been extended to cover not only the labels in HTML templates but also those within web components. A mechanism for project-specific language files extending the default Publisher label collection has been added. Thanks to community contributions via Crowdin, a number of new languages have been added and existing ones updated.

4. Subcorpora - new TEI Publisher data organization

Publisher’s pre-populated data collection is now split into the Playground and TEI Publisher demo collection areas, which illustrate how this mechanism could be used to host multiple subcorpora within a single TEI Publisher application.

5. New and improved web components

The pb-select-feature and pb-toggle-feature components have been extended to allow for interactive changing of display parameters (like switching between regularized and original spelling), which can then be processed on the client or the server side. New components have been created to handle MEI music notation as well as for web component API documentation and demo pages.

6. The user interface of the ODD editor has been improved.

7. An experimental incremental scroll mode has been introduced to improve performance for very long documents presented in single page mode.

Quickstart

Stay Home Learn TEI Publisher From Scratch

A 3-part online course was organized by e-editiones and led by Wolfgang Meier in June 2020. The course material, video recordings of all the sessions and a walk-through for the assignments are available for self-study. Find all the information on the workshop GitHub page.


Not available in PDF edition. Go to https://www.youtube-nocookie.com/embed/QuWrfAS2SWM to view.

Installation

TEI Publisher requires the eXist XML database to operate. It is distributed as an eXist application package, making it easy to install on any eXist database instance - either on your local machine or on any remote server. You can install eXist and TEI Publisher manually, as described below. Alternatively, use the provided docker image.

Installing into an eXist instance

Java

Before installing eXist, make sure you have Java installed on your machine. You can run java -version on a command line to check which version of Java you have. Make sure you have at least Java 8 (recommended: Java 11). Please note that java -version shows the full version string, so 1.8.0 or similar instead of just 8.

If you do not have Java installed, you can choose between a variety of different Java distributions for your operating system. While these are largely equivalent, so far we have had the smoothest installation experience across operating systems with the Zulu Community OpenJDK builds. In particular for Windows users, this provides the best out of the box experience.
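Because Java 8 and earlier report themselves as 1.x, the version string is easy to misread. The following is a minimal sketch of how such a string can be interpreted; the function names are made up for this example, and it expects only the version number itself (for instance 1.8.0_292), not the complete output of java -version.

```python
import re


def java_major_version(version_string):
    """Return the major Java version for strings such as "1.8.0_292" or "11.0.12"."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version_string.strip())
    if match is None:
        raise ValueError(f"unrecognised Java version string: {version_string!r}")
    first = int(match.group(1))
    second = match.group(2)
    # Java 8 and earlier use the 1.x scheme, so 1.8.0 means Java 8.
    if first == 1 and second is not None:
        return int(second)
    return first


def meets_minimum(version_string, minimum=8):
    """Check the version against the minimum named in the text above (Java 8)."""
    return java_major_version(version_string) >= minimum


for candidate in ("1.7.0_80", "1.8.0_292", "11.0.12"):
    print(candidate, java_major_version(candidate), meets_minimum(candidate))
```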

Download

Download an eXist distribution following the link on its homepage.

It is recommended that you set up an admin password when installing eXist but make sure to remember or store it securely!

Mac installation

On a Mac download the file with the .dmg extension, e.g. "eXist-db-5.x.x.dmg". Double clicking the downloaded .dmg file opens the installer; you only need to drag the eXist app icon over to the Applications folder. Once the installation has completed, you should find an app in your Applications folder which you can use to launch eXist.

Windows installation

On Windows download the file with the .jar extension, e.g. "exist-installer-5.x.x.jar". Double clicking the .jar should install eXist on your local system. It will launch an installer to guide you through basic settings. The default settings suggested by the installer provide a good starting point for most projects.

If double-clicking the .jar does not have any effect, there may be something wrong with your Java setup. The java binary needs to be in your %PATH% environment variable. You can also try to manually start the installer by opening a command prompt, changing to the directory where you downloaded the distribution and typing:

    java -jar exist-installer-5.x.x.jar

Once the installation is completed, you should find an eXist-db shortcut to launch eXist.

Unix installation

Download the file with the .jar extension, e.g. "exist-installer-5.x.x.jar". Double clicking the .jar should install eXist on your local system. It will launch an installer to guide you through basic settings. The default settings suggested by the installer provide a good starting point for most projects, so there’s no need to change anything. Once the installation is completed, you should find an eXist-db shortcut to launch eXist; otherwise navigate to the installation directory and run bin/startup.sh.

Some Linux users may prefer the plain .tar.bz2 package, which can just be untarred to any location. This package does not include an installer and eXist has to be launched on the command line: navigate into the untarred directory and run bin/startup.sh in a shell, skipping the jar installer step above. Ignore the next section, navigate directly to http://localhost:8080 and follow the steps for installing TEI Publisher via the dashboard described further below.

First launch

Once eXist is launched for the first time you should see (with the exception of some Unix system configurations described above) a splash window popping up, showing that default applications are being installed.

Figure 1: Splash Screen on eXist Startup

Upon first start, an additional configuration window (Figure 2) will pop up on Windows and Mac, allowing you to configure basic parameters. The default settings suggested provide a good starting point for most projects, so usually there’s no need to change anything. Clicking on Save will show a popup asking you to confirm the location of the data directory. Unless you have specific requirements, just agree to the suggestion of the configuration dialog.

Windows users will be asked if they would like to install eXist as a service. This is highly recommended to ensure that the database is correctly closed whenever the operating system shuts down.

If all went well, eXist should now be up and running in the background. Mac and Windows users should find a small eXist icon in their task bar. Right-clicking on it will reveal a menu (Figure 3).

Installing TEI Publisher

Clicking on Open Dashboard in the taskbar menu will open a browser and display eXist’s Dashboard: the central administrative hub for the database. Alternatively - e.g. when you chose the manual installation on Linux - you can also open a browser window and navigate to http://localhost:8080.

Log into the dashboard using the admin account and the password you chose during the installation (it will be empty by default). Use the left sidebar to navigate to the Package Manager. You’ll see two tabs: the first one lists the application packages currently installed, the second can be used to install additional packages from eXist’s public application repository. Switch to the Available tab and search the list for TEI Publisher. Once you find it, click on the little install icon (Figure 4).

After installing, you will find the TEI Publisher icon in the tab showing installed apps. Click on it to open TEI Publisher.

Figure 2: Configuration Dialog Showing on First Start

Using docker

If you do not want to install eXist yourself, you can use docker to run TEI Publisher. Docker is a tool to simplify the installation of applications and services. It creates a virtual environment including everything required for the service to run. Using our docker image, eXist will already be set up to include TEI Publisher as well as the Shakespeare and Van Gogh demo apps.

1. Install docker on your machine. Windows and Mac users may download the docker desktop app.

2. To download the image, run the following in a console:

    docker pull existdb/teipublisher:latest

3. Once the download is complete, you can run the image with the following command:

    docker run -p 8081:8080 -p 8444:8443 --name teipublisher existdb/teipublisher:latest

Startup should be fast because the database is already pre-populated. However, changes you make may not persist if the docker container is deleted or updated to a newer release. If you want to be sure that your changes are safe, you should specify a local volume for storing the database by adding:

    -v exist-data:/exist-data

Figure 3: Taskbar Launcher Context Menu

See below for an explanation of the parameters:

-p
    Maps a port on your local machine (8081 and 8444) to the port used by eXist within the container. eXist will always run on 8080 for HTTP and 8443 for HTTPS. If those ports are already occupied by different services on your machine, choose a different port for the first number.

-v
    Creates a named volume ("exist-data") for storing the database and mounts it inside the container at the path given after the colon. If you skip this, any changes to the database will be lost if you remove the docker container, update it or create a new one. With -v the data will be stored outside the container. If you just intend to play around a bit, you can skip the parameter.

--name
    Assigns a name to the container, so you can reference it in other docker commands, like docker stop. We’ll use the name in all commands below.

Figure 4: Installing TEI Publisher from the Package Manager

Once the container has started, you can access the eXist dashboard in your browser by navigating to http://localhost:8081

From the dashboard you can click on the TEI Publisher, Shakespeare or Van Gogh icons to open the corresponding applications.

4. To stop the container, run:

    docker stop teipublisher

5. To start the container again:

    docker start teipublisher

Note that when you restart a container, it will run in detached mode, so you won’t see any console output. You can view the output with the following command though:

    docker logs teipublisher
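The way the parameters from step 3 combine can be sketched in a few lines. This is only an illustration: the function name and its defaults are made up for the example, while the flags, ports, image name and mount path are the ones used in the steps above. It assembles the command as a list of arguments and prints it; it does not start docker itself.

```python
import shlex


def docker_run_command(name="teipublisher", http_port=8081, https_port=8444,
                       volume=None, image="existdb/teipublisher:latest"):
    """Assemble the docker run command for the TEI Publisher image."""
    args = [
        "docker", "run",
        # host port on the left, the port eXist uses inside the container on the right
        "-p", f"{http_port}:8080",
        "-p", f"{https_port}:8443",
    ]
    if volume is not None:
        # Named volume, mounted where the image keeps its database files.
        args += ["-v", f"{volume}:/exist-data"]
    args += ["--name", name, image]
    return args


# Throwaway container: changes are lost when the container is removed.
print(shlex.join(docker_run_command()))

# Persistent setup: the database lives in the named volume "exist-data".
print(shlex.join(docker_run_command(volume="exist-data")))
```

If port 8081 is already taken, passing a different http_port changes only the first number of the mapping, and the dashboard is then reached under that port on localhost.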

Other useful commands

docker container ps -a
    Lists all running and stopped containers.

docker volume ls
    Displays existing volumes (where your database is stored).

docker cp teipublisher:/exist-data .
    Copies the contents of the database data volume to the current directory on the local disk, so you can back it up. Note that this will copy the raw database files as created by eXist (not your XML, which is binary encoded inside those files). Also make sure you run the command after stopping the active container.

docker volume rm exist-data
    Removes the contents of the database data volume (in case you would like to start from scratch, deleting all changes you made).

Have a look at the docker documentation and cheatsheet for more commands.

Browsing Documents

The Start Page

The start page of TEI Publisher serves as an entry point to explore and experiment. On a newly installed TEI Publisher the main application panel offers the choice between browsing local collections directly or using the DTS API to access remote resources. A narrower panel to the right displays the list of ODD files provided with TEI Publisher.

Figure 5: Start Page Collections

The usual starting point is the TEI Publisher Demo Collection, with which users can explore a range of selected use cases demonstrating various genres, encoding styles and presentation layouts. Various customization aspects are handled using different ODDs and view templates. We suggest having a look at each of them to see what TEI Publisher can achieve out of the box.

Documents in this collection are preinstalled with the TEI Publisher and users are not allowed to write to it by default.

The Playground collection is the place to upload encoded documents and ODD files to experiment with various processing models and view templates. Unlike the Demo collection, the Playground features an upload box to import new documents. You can upload your own documents (e.g. TEI, DocBook or DOCX) and ODD files by either clicking on the Upload button or dragging and dropping files onto the upload panel. Read more on this subject in the Upload section.

You need to be logged in for most advanced actions like creating or editing ODDs. The login button to the right of the menu bar allows you to log in. By default, there’s a user named tei-demo with password demo , and a user tei with password simple .

TEI Publisher Demo collection

Experiment with the browsing, faceting, filtering and sorting features of TEI Publisher. This page consists of several main areas:

1. the facets panel

2. the list of documents currently installed with sorting and filtering controls

3. a panel showing the ODD files known to the application

4. an upload box to upload new documents

Have a look at the documents showcased here to get a sense of the possibilities that TEI Publisher offers. Click on a document title to proceed to the Document View.

Selected Use Cases

The document view can vary, sometimes substantially, depending on the sample document you are looking at. This is a natural consequence of TEI’s versatility and the broad scope of its application. It follows that the requirements for the document view - both its layout and composition and the processing rules governing the transformation of the text of the document itself - will differ to a great extent.

Figure 6: Browsing Demo collection

The sample documents included in TEI Publisher’s installation package do not exhaust its applications but rather aim to present some chosen use cases:

• Critik der reinen Vernunft, from the Deutsches Textarchiv corpus, presents a philosophical treatise originally published in print, and thus follows a ’traditional’ book structure with front pages, foreword and chapters. It nevertheless demonstrates very well Publisher’s capacities in typesetting, switching between the physical and logical structure of the document (just toggle Page View in the Settings panel) as well as the generation of multiple output formats from a single set of processing models in the ODD (try choosing the PDF or ePub options in the Download menu). Purchas his pilgrimages, from the EEBO-TCP project, while roughly similar in structure, is a much earlier work (1613) and demonstrates extensive use of marginal notes.

• Shakespeare’s Romeo and Juliet, from the Bodleian First Folio project, uses dedicated TEI elements to encode the structure of the play. It also showcases parallel transcription and facsimile alignment in its presentation, a technique which is of general application and could be used for any genre, not only dramatic texts.


• Correspondence corpora are common, yet very interesting, subjects for digital editions. Despite basic similarities in structure, the intended presentation may vary enormously depending on the period, scope and particular research perspective. We are presenting samples of:

  – A 15th century manuscript letter to Mikołaj Orlik demonstrating alignment between the Latin original and a parallel Polish translation,

  – A 16th century manuscript letter of Hernán Cortés showcasing parallel transcription/translation and facsimile view, and a transcription enhanced with commentaries and explicitly encoded transcriptional features,

  – A 16th century manuscript letter of Mauritius Ferber with a collapsible metadata panel in addition to the parallel transcription and facsimile view,

  – An early 19th century manuscript collocative dictionary of Polish, Bogactwa mowy polskiej, featuring interactive highlights for regions of interest of the facsimile when hovering over dictionary headwords,

  – A letter from Van Gogh to Paul Gauguin written in 1888. This intentionally reproduces the flexible column layout pioneered by the Vincent Van Gogh Letters online edition, which is a model example for correspondence.

  – A 20th century manuscript letter from Robert Graves where emphasis has been put on visualizing the rich encoding of semantic information in the letter, in particular geographic and prosopographical data.

The list of samples is expected to grow and we’d like to encourage contributions illustrating other genres and perspectives. We’d like to stress that preparing the showcases above has only been possible thanks to numerous projects releasing their sources openly, in particular the Bodleian First Folio, Deutsches Textarchiv, Vincent Van Gogh Museum and EEBO-TCP. We’d also like to thank William Graves and Anna Skolimowska for sharing their correspondence material.

Document View

The document view can vary, depending on the sample document you are looking at. Nevertheless some default functionality will be shared:


• the rightmost button in the toolbar opens the Settings panel. Here you can change the ODD being used for display as well as the view template (more about this later). By default, all sample documents apply the specific ODD which fits them best, but you can play around and select another ODD to see what happens.

Figure 7: The Settings Panel

• the leftmost toolbar button will open a table of contents (if the viewed document has a division structure)

• the Download menu allows you to download the currently viewed document in a variety of output formats. Not all output formats work equally well for all examples as we have not customized every example for every medium.

Experimenting with ODDs and page templates

All TEI Publisher’s sample documents are TEI XML files which are transformed into an HTML webpage for display in the browser. Two major factors determine what the final page is going to look like: an ODD and a page template.

We have already mentioned in the very first section that the TEI Processing Model lies at the heart of the Publisher - the ODD file associated with a document defines the rules for transforming the XML source file into HTML. A detailed discussion of the Processing Model can be found in the following chapters; for now it is sufficient to say that this is where the decisions are made whether a TEI element should be rendered inline, with a tooltip, or as a marginal note. Simplifying things a bit, the text of the document that you see rendered in your browser is the result of applying the rules from the ODD file to the source document.

Nevertheless, as we demonstrated in the section on selected sample documents, in the application context we certainly want more than just text, however nicely typeset. From basic navigation controls and a table of contents to facsimile display, critical apparatus, glossaries and maps - all of this and much more could be included in the final webpage. Following a divide and conquer approach, TEI Publisher defines such specialized page elements as small, reusable blocks, using the Web Components technology. Components can be used like common HTML elements, thus a page template is just an HTML fragment which organizes the building blocks needed for a specific page.

Looking more closely again at TEI Publisher’s Start page, we can now give more detail on what happens when any of the sample documents is loaded. On the right hand side there is a panel listing all ODD files available. Each of the sample documents includes a processing instruction which specifies the default ODD and page template for this document. You can check what they are in the Settings panel. For the Graves letter it would be the Graves’ Letters ODD and the Letter with map/facets template. It is easy to experiment with different page templates and ODDs just by changing these options in the Settings panel. An important caveat though is that not every page template makes sense for every document - after all, parallel alignment can only be successful if there is something to align, a map needs coordinates to display, page view needs information about page breaks and so on.
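The per-document defaults described above can be inspected outside TEI Publisher as well. The following is a minimal sketch, assuming that the processing instruction has the target teipublisher and carries odd and template pseudo-attributes; the file names graves.odd and letter.html in the sample are made up for illustration and are not taken from the distribution.

```python
import re
from xml.dom import minidom

# Hypothetical sample document; target name and file names are assumptions.
SAMPLE = (
    '<?teipublisher odd="graves.odd" template="letter.html"?>'
    '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
    "<text><body><p>Dear ...</p></body></text>"
    "</TEI>"
)


def default_view_settings(xml_text, target="teipublisher"):
    """Return the pseudo-attributes of the first matching processing instruction."""
    document = minidom.parseString(xml_text)
    # Processing instructions placed before the root element are children of the document node.
    for node in document.childNodes:
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE and node.target == target:
            # A PI has no real attributes; its content is plain text that looks like attributes.
            return dict(re.findall(r'([\w-]+)\s*=\s*"([^"]*)"', node.data))
    return {}


print(default_view_settings(SAMPLE))
```

A document without such an instruction yields an empty result, in which case an application would have to fall back on a default ODD and template of its own.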

Uploading your own documents

If you are reading this, in all likelihood you already have some documents of your own that you might want published, whether they are in TEI, DocBook, MS Word DOCX or another XML format. The first step is to upload them into the database. You need to be logged in and in the Playground area to do it (check the short info on the Start page for user name and password). Then uploading is just a question of dragging your documents onto the Upload area. They will become available in the document list immediately after the upload is completed.


Figure 8: The Settings Panel for the Graves Letter

Congratulations, now you can view your documents! Try to experiment and find the ODD and page template that best fit your needs and use them as a starting point for your own customization, if necessary. Once you are ready with these, you can generate your own application for your documents only, which packs just what is needed for publishing into a standalone application package.

If you attempt to upload a Microsoft Word document, the upload will automatically trigger upconversion of Word to TEI, using a custom ODD for the transformation. Please note that the focus of this conversion is to preserve the textual content, structure and basic semantics of the text, not to provide an authoritative mapping of the complete set of MS Word features to TEI. Refer to the DOCX handling section for more information.

Please bear in mind that while TEI Publisher aims to be a universal tool, specific components may make certain assumptions about the data they are getting. If your documents do not follow the same conventions, it may be necessary to adjust the parameters passed to the components from the page template, or the component logic.

Figure 9: The Upload Panel

For example, the table of contents component assumes that the document structure is represented by means of nested div elements and that section titles are given in head elements. If your project uses numbered divisions (div1, div2 etc.) instead, it may be advisable to change the encoding rather than customize all navigation, the table of contents and so on, but this is one of the very rare cases where TEI Publisher exposes any predilection for a particular flavour of TEI.

Similarly, the template for aligned transcription and translation is parametrized to accept an XPath expression pointing to the location of the transcription and the aligned translation. For your documents this expression would likely have to be adjusted (unless of course you also have Latin texts with a Polish translation structured in a similar way). Furthermore, to correctly display the corresponding translation fragment, a custom mapping function may need to be passed to the translation view (cf. the Van Gogh or Cortés letter templates for examples).
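The table of contents assumption can be made concrete with a short sketch. This is not the code of the TEI Publisher component; it is a Python illustration of the convention itself: titles are taken from the head of each div, and nesting depth follows the nesting of the div elements. The sample document is invented for the example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
DIV = f"{{{TEI_NS}}}div"
HEAD = f"{{{TEI_NS}}}head"

# Invented sample following the nested div / head convention.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
  <div><head>Chapter 1</head>
    <div><head>Section 1.1</head><p>...</p></div>
    <div><head>Section 1.2</head><p>...</p></div>
  </div>
  <div><head>Chapter 2</head><p>...</p></div>
</body></text></TEI>"""


def table_of_contents(element, depth=0):
    """Collect (depth, title) pairs for nested div elements in document order."""
    entries = []
    for child in element:
        if child.tag == DIV:
            head = child.find(HEAD)
            title = "".join(head.itertext()).strip() if head is not None else "(untitled)"
            entries.append((depth, title))
            entries.extend(table_of_contents(child, depth + 1))
        else:
            # Descend through wrappers such as text and body without adding a level.
            entries.extend(table_of_contents(child, depth))
    return entries


toc = table_of_contents(ET.fromstring(SAMPLE))
for depth, title in toc:
    print("  " * depth + title)
```

Run against a document that uses div1 and div2 instead, the same function returns an empty list, which is the practical effect of the convention: the text is still there, but the navigation has nothing to show.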

Supported XML vocabularies

TEI Publisher started as a publishing toolbox for TEI, but the principles of the TEI Processing Model were never limited to a single vocabulary, and Publisher very quickly extended support to other XML formats. Currently TEI, DocBook, JATS and MS Word DOCX are supported out of the box (DOCX via automated conversion to TEI on upload). A few specifics of TEI and DocBook are listed below, while DOCX is discussed at length in the following section.

TEI

In principle, any TEI document is supported by TEI Publisher and can be displayed with the default page template and ODD. Nevertheless, certain assumptions are made about the encoding of the basic structure of TEI documents for the purpose of navigation:

• page beginnings are encoded with <pb>

• column beginnings are encoded with <cb>

• structural divisions in the document are encoded with <div> elements

We acknowledge that TEI offers other ways to encode these features, e.g. the generic <milestone> element or specialized numbered division elements like <div1>, <div2>. TEI documents using alternative encodings will still be displayed as specified in the ODD; it is only for the sake of navigation and division-based full text search that we had to assume certain conventions in order to decide what to show as the next page, column or division. We believe our choice represents the most common way of using TEI but, for those who followed the path less travelled, the chapter on customization briefly discusses how to change the relevant functionality.

DocBook

DocBook support is demonstrated by the very document you are now reading, the documentation. It is written in DocBook and presented via the dedicated docbook.odd and the documentation page template. You will notice a custom processing instruction in the source code of this document which specifies which ODD and template to use. Experiment with changing the template and ODD via the Settings drawer to see how much impact this has on the display.
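As a rough illustration of such a processing instruction, the sketch below shows a minimal DocBook article that selects an ODD and a page template. It assumes that the instruction is named teipublisher and takes odd and template pseudo-attributes; the template file name is a hypothetical placeholder, not the one used by this documentation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed form of the processing instruction: target "teipublisher"
     with "odd" and "template" pseudo-attributes. The template file name
     is a hypothetical placeholder. -->
<?teipublisher odd="docbook.odd" template="my-template.html"?>
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
    <info>
        <title>Sample document</title>
    </info>
    <para>Some content.</para>
</article>
```

Settings stored in the document this way travel with the source file, whereas choices made in the Settings drawer only affect the current view.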


MS Word DOCX format conversion

Starting with version 5.0.0 of TEI Publisher, a new docx handling module is available for ingesting documents in docx format. The goal of this module is to provide a way to import Word documents while preserving their textual content, structure and basic semantics, not to provide an authoritative mapping of the complete set of MS Word features to TEI.

The docx format is relatively flat, so reconstructing the logical document structure (divisions, lists and similar) can only be based on certain heuristics. Likewise, it is impossible to deduce the semantics attributed to certain formatting decisions. For that reason TEI Publisher intentionally ignores many style properties: trying to preserve as much as possible would likely just add unnecessary "noise" and result in low-quality TEI.

A word about Word

A Word document is essentially a zip archive of several different XML files. These files store various parts: the text content, styles, embedded media files etc. The information most relevant for the import process has been extracted into a map, which is passed as a parameter to the ODD, so it is available for every element. Thus information about numbering styles can be accessed via the $parameters?nstyle(.) function, and testing whether a list is bulleted could be done by checking the value of $parameters?nstyle(.)/numFmt/@w:val. A full list of available functions and some hints on how to customize the default conversion ODD are provided at the end of this chapter.

Named tei:* styles

Named styles can be strong indicators of the semantics of a text fragment. Styles whose name starts with tei: are thus recognized as TEI elements with the same name. If a character sequence uses a style called tei:persName, it will be wrapped into a TEI <persName> element in the output, e.g. <persName>Johann Wolfgang Goethe</persName>. A place name should be marked with the style tei:placeName, and reconstructed text could be encoded by applying the style tei:supplied.

Headings and divisions

Word does not have a concept of text divisions and instead stores just flat lists of paragraphs, so the only way to reconstruct the logical structure is to use Word headings and the outline level associated with them to determine division boundaries. In the first pass, all paragraph styles starting with heading, title or subtitle generate a <tei:head> element. The outline level assigned to the heading is recorded as well.


Figure 10: MS Word archive structure

Subsequently, in a second pass through the generated output, divisions are generated based on the outline level: a <div> spans all text from a heading to the next heading on the same outline level, and the process is repeated for all headings within the division on a lower outline level.
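The effect of the two passes can be pictured with a small, made-up document. The fragment below is only an illustrative sketch: how the outline level is actually recorded on each heading is not shown, so the levels appear as comments, and the sample headings are hypothetical.

```xml
<!-- After the first pass: a flat sequence of headings and paragraphs.
     Outline levels are given in comments for illustration only. -->
<body xmlns="http://www.tei-c.org/ns/1.0">
    <head>Chapter 1</head>      <!-- outline level 0 -->
    <p>Introductory text.</p>
    <head>Section 1.1</head>    <!-- outline level 1 -->
    <p>Section text.</p>
    <head>Chapter 2</head>      <!-- outline level 0 -->
    <p>More text.</p>
</body>

<!-- After the second pass: each div spans from its heading to the next
     heading on the same outline level; lower levels are nested inside. -->
<body xmlns="http://www.tei-c.org/ns/1.0">
    <div>
        <head>Chapter 1</head>
        <p>Introductory text.</p>
        <div>
            <head>Section 1.1</head>
            <p>Section text.</p>
        </div>
    </div>
    <div>
        <head>Chapter 2</head>
        <p>More text.</p>
    </div>
</body>
```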

Lists

List structure needs to be reconstructed, very much like divisions, taking into consideration the list level associated with every item, which can be accessed via a call to $parameters?pstyle(.)//outlineLvl/@w:val.

Foot- and endnotes Footnotes are translated into TEI note elements. Endnotes are also supported and transformed into .

Tables

Simple tables are processed well, including cells spanning multiple columns. Row spans are not implemented yet.


Images Embedded images are stored into a subcollection starting with the name of the docx file being processed and suffixed with .media , eg.

ODD for docx

The ODD used for docx processing can be found in docx.odd. Users are free to extend the default ODD with additional heuristics. For example, a paragraph set entirely in bold could also be treated as a heading, or a left text indent may indicate a quote.

For testing purposes, a Word document is provided in data/doc/test.docx, which includes samples of the most important features like headings, lists, tables, notes and embedded images. Try uploading it via the upload panel as described in the upload section and check the conversion results. The behaviour of the conversion mostly follows the approach used in the TEI Stylesheets docx-to-tei transformation module and has been tested on the test files included there.

Parameter functions

The functions below can be used to retrieve styles or other information related to the current node. For more usage examples see docx.odd.

• cstyle: phrase level (characters, words or phrases) styles associated with the current node. Returns: w:style. Usage example: $parameters?cstyle(.)/name[starts-with(@w:val, 'tei:')]

• endnote: content of the endnote. Returns: w:endnote/w:p

• footnote: content of the footnote. Returns: w:footnote/w:p

• link: external link. Returns: rel:Relationship. Usage example: $parameters?link(.)/@Target

• nstyle: list style information associated with the current node. Returns: w:lvl. Usage example: $parameters?nstyle(.)/numFmt

• pstyle: paragraph level styles associated with the current node. Returns: w:style. Usage example: $parameters?pstyle(.)/name[matches(@w:val, 'quote', 'i')]

Processing Model transformations

While TEI Publisher already provides various ODDs and page templates targeting specific domains, it is likely that your project will require certain adjustments to fully meet your needs. One of the primary concerns in Publisher's design has been that customization is not only possible on various levels but also encouraged, and we aim to make it as simple as possible.

Very broadly, customization needs fall into two groups: changing the rules for document transformation (how the source document is translated into the output format) or changing the organization and styling of the rendered web page. In this chapter we'll concentrate on the former, document transformation, which primarily requires modifying the ODD with the TEI Processing Model. The latter would require adjustments to the page template. In both cases, it may be best to choose an existing ODD or page template as your starting point and adjust it.

ODD Customization

Creating Your First ODD

The general workflow for creating a customization is as follows:

1. upload a TEI sample document you want to format

2. create a new ODD

3. modify the ODD to match your requirements

For the purpose of this quickstart, we will reuse one of the pre-installed sample documents, but create a new ODD for it (while we will start from scratch with an empty ODD, it is also possible to generate one based on one or more sample TEI documents):

1. Log in and fill out the form at the bottom of the panel listing ODD files. Choose a name for the ODD, e.g. myletter (without a suffix), and a title, which will appear in the list after creation. Click on Create (not Create from examples). The newly created ODD should appear in the side panel.

2. In the document list, click on Letter #6 from Robert Graves to William Graves to open it in the document viewer.


3. Open the Settings panel (rightmost toolbar button, see above) and choose your ODD from the dropdown showing available ODDs. You may also change the HTML template in use to Default single text layout, though this is not strictly necessary.

4. The view should change and display the letter's content with only basic formatting applied. Since our ODD has just been created and is empty, we see the content with standard formatting applied. Our ODD by default inherits from teipublisher.odd, which in turn extends tei_simplePrint.odd. The latter is maintained by the TEI community and contains processing model declarations for the most important TEI elements. Thanks to this inheritance mechanism, many documents display nicely without requiring a lot of additional customization.

5. From the menu, select Admin / Edit ODD to open the visual ODD editor.

Modify the ODD

Changing processing models in the ODD is a powerful mechanism through which you can control all aspects of the transformation of your documents from the source XML format to all output formats: HTML, ePUB, PDF etc.

As already mentioned, it is considered best practice to chain ODD customizations together, changing or adding project-specific rules on top of a more generic ODD rather than copying them in extenso. ODD chaining allows for future upgrades, as your base ODDs may be updated by the standardization bodies which maintain them. Commonly, project ODDs extend teipublisher.odd, a generic TEI Publisher set of processing rules.

Beginning with version 3.0 of TEI Publisher, you have the choice between writing the ODD by hand or using a visual editor. Both approaches can be combined and mixed. The visual editor saves the ODD in a non-destructive way, preserving any information not related to the processing model. It is thus safe to switch between hand-editing the ODD and using the visual editor. Just make sure you reload the visual editor view after modifying the source XML and vice versa. That said, the visual editor is specifically tailored to editing processing models, so it will likely be the fastest and safest way to edit your ODD.

To customize the display of your document, it is crucial to understand its XML structure well. Each processing model needs to be aimed at a particular XML element, and sometimes is only meant for a specific XML context. For example, we might want to distinguish between headings of first and second level nested divisions, as they often represent titles of different text units: acts and scenes, or books and chapters.


We'll start with the Graves letter you already viewed with your custom ODD in the previous section. The display is quite simple and easy to read, but we might want to adjust it to follow common visual conventions for a letter, starting with displaying the dateline on the right-hand side and completely removing the page label which currently sits there. To create a processing model addressing this need we have to know three things:

• when should it be applied,

• what is supposed to happen

• and how should the text be formatted?

To answer the first question, you should familiarize yourself with the XML structure of the letter to find out how datelines are represented in TEI. In the tab displaying the letter, select Download / XML to open graves6.xml in eXide. A quick investigation of the TEI encoding reveals that the dateline resides in its eponymous tag <dateline>, which is nested in the <opener> part of the document, while page labels are encoded with <pb>.

We'll use the visual editor, but show the corresponding ODD XML below each screenshot. At the end of this chapter we'll describe how to edit the ODD XML code by hand.

First Steps

The visual ODD editor opens if you select Admin / Edit ODD from the menu while viewing a document. Alternatively you can click on the name of an ODD in the list of ODDs on the TEI Publisher entry page. A new tab opens, showing an action panel to the left, and the title of your ODD to the right.

The most recent versions of the ODD editor will look slightly different; nevertheless, they are functionally equivalent to the screenshots below, which were created in an earlier version.

We need to overwrite the processing model rules for <dateline>. Enter dateline into the input box next to the New button in the left panel and click the button. This will insert a processing model rule for <dateline> into the right panel. Because <dateline> already exists in the base ODD, tei_simplePrint.odd, you'll see a single model which was copied from the base ODD. The corresponding ODD XML looks like this:
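A minimal sketch of such an element spec follows. It assumes that the model copied from the base ODD uses the block behaviour; check tei_simplePrint.odd for the actual definition.

```xml
<!-- Sketch: element spec created for dateline. The block behaviour is an
     assumption about the model copied from the base ODD. -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="dateline" mode="change">
    <model behaviour="block"/>
</elementSpec>
```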


Figure 11: Screen after adding <dateline>

Let's cover some key concepts of the TEI processing model first. <elementSpec> primarily documents the structure, content, and purpose of an element. It is a core element in any ODD, but its schema-related functions are not relevant for the discussion here. What is important for us is that this is where processing models are defined. The @ident attribute of the <elementSpec> identifies the name of the element to which the spec (and therefore the processing model) applies.

An <elementSpec> may contain one or more <model> elements to specify the intended processing of this element. Every model maps the element to a behaviour. A behaviour denotes an abstract transformation function to be applied. The TEI guidelines currently list two dozen behaviours, e.g. paragraph, heading, note, inline, block. The last two are the most frequently used. How exactly a behaviour translates into the target output media may differ depending on media features and design decisions. TEI Publisher tries to implement them as generically as possible.

To change the model, expand it by clicking on the arrow to the left of the grey box. A form appears, allowing you to change the model configuration. In our example we are happy with what is happening with the dateline, so we don't need to change the behaviour, but we do want to fix how it is styled by aligning it to the right. A rendition can be defined in an <outputRendition>, so click on the + button next to Renditions. In the form input inserted below, enter your styling requirements in CSS.

The processing model uses <outputRendition> and CSS to define visual aspects. For output formats other than XML, the CSS is translated into the corresponding target language. It is thus best to limit the CSS to the most common typographical features, like bold, italic, color, underline etc. The general styling of the text should be done outside the ODD to maintain a clear separation of concerns.

Figure 12: Add a rendition for <dateline>

In the corresponding XML, the model now contains an <outputRendition> with the content text-align: right;.

To test your change, click on Save in the left panel and wait a second until a popup appears. Switch back to the tab with the Graves letter from which you opened the editor and refresh the browser window to see your changes applied. In case you do not see any change, make sure

1. you selected the correct ODD for viewing (check the Settings drawer)

2. if you made changes to outputRenditions only, you may need to clear your browser’s cached version. For most browsers, holding the Shift key while clicking on the Reload button does the job.
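For reference, the right-aligned dateline described above might look as follows in the ODD XML. This is a sketch under the assumption that the existing model uses the block behaviour; only the rendition text-align: right; is taken from the text.

```xml
<!-- Sketch: dateline with an output rendition aligning it to the right.
     The block behaviour is an assumption. -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="dateline" mode="change">
    <model behaviour="block">
        <outputRendition>text-align: right;</outputRendition>
    </model>
</elementSpec>
```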

Other behaviours

We would also like to hide the page breaks, as we do not have facsimiles available. Add a new element spec for <pb>. Again, the newly added spec already includes a model, here with behaviour break. Either change this behaviour to omit, or delete the existing model and insert a fresh one with behaviour omit.


Figure 13: Omit <pb>
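Hiding page breaks could be expressed in the ODD XML roughly as below. The sketch assumes mode="change" because the text states that a model for this element already exists in the base ODD.

```xml
<!-- Sketch: suppress page breaks entirely by mapping pb to the omit behaviour -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="pb" mode="change">
    <model behaviour="omit"/>
</elementSpec>
```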

Predicates and multiple models

Next up, we may want to highlight the various places and people occurring within the text. They are all marked up with the <name> tag, using different @type attributes. Create a new element spec for <name> and add some color to the names.

Figure 14: Color the <name> tags

In the XML for the entire <elementSpec>, the model's rendition is color: #FF9900;. This rule affects places and people alike, since both categories are marked up with the <name> tag. If we'd like to treat people and places differently, we need separate models for them and a mechanism to distinguish between the two. The processing model uses a predicate to make such distinctions: a model rule will only be used if the XPath expression in its predicate matches the current node being processed. Let's add another model and give it a predicate:

Figure 15: Distinguish places and people

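In the XML, the model with the predicate comes first and uses the rendition color: #0077FF;, while the model without a predicate follows with color: #FF9900;.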
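A sketch of an element spec with two such models is given below. The predicate value @type='place', the inline behaviour and the mode are assumptions for illustration; only the two colors and the order of the models come from the text.

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="name" mode="change">
    <!-- Model with predicate: must come first, otherwise it is never reached.
         The value 'place' is an assumed @type value. -->
    <model predicate="@type='place'" behaviour="inline">
        <outputRendition>color: #0077FF;</outputRendition>
    </model>
    <!-- Fallback model without predicate: applies to all other names -->
    <model behaviour="inline">
        <outputRendition>color: #FF9900;</outputRendition>
    </model>
</elementSpec>
```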


The order of models within the element spec is important. If you move the model with the predicate to the bottom, all names will appear in the same color again. This happens because the processor walks through the models until it finds the first one matching the current node. If the model without predicate is first, it will always win over the one with the predicate! Also, if there’s more than one matching model, only the first will be chosen.

Parameters

All behaviours accept one or more parameters, which are defined in the TEI guidelines. Every behaviour has an implicit parameter called content which, as the name suggests, specifies which part of the source document should be processed: by default it uses the nested content of the node. You may overwrite this default and assign it another value.

Some behaviours take other, specialized parameters. For example, the alternate behaviour accepts two parameters: default and alternate. An alternate switches between two alternative states. On the web this could take the form of a popup; in print it is usually implemented as a footnote.

To put this to the test, let's look at the <date> elements appearing within the letter. Most of them also specify a normalized date in their @when attribute. Seeing this may be helpful for the reader, for example, to know that the 19th mentioned in the postscript refers to 1957-12-19. However, we may want to present the normalized date in a more readable way. XPath has a function format-date for this purpose, and we could use it to show a representation of the date nicely formatted in the user's language.

Add a new element spec for <date>. You'll already see 4 predefined models. The first two are for print only, but the third one does indeed use behaviour alternate, which is exactly what we want. Change the parameter value for alternate to format the date:


Figure 16: Format the normalized date in @when
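The following sketch shows how the alternate parameter might be set. It is not the predefined set of four models: it contains only one alternate model plus a fallback, the predicate and picture string are assumptions, and the language is hard-coded to 'en' rather than taken from the user's settings.

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="date" mode="change">
    <!-- Assumed predicate: only dates whose @when is a complete xs:date -->
    <model predicate="@when castable as xs:date" behaviour="alternate">
        <!-- default: the date as written in the letter -->
        <param name="default" value="."/>
        <!-- alternate: normalized date formatted as day, month name, year -->
        <param name="alternate"
            value="format-date(xs:date(@when), '[D1o] [MNn] [Y0001]', 'en', (), ())"/>
    </model>
    <!-- Fallback for all other dates -->
    <model behaviour="inline"/>
</elementSpec>
```

The cast to xs:date is there because format-date expects a date value rather than an attribute string; partial dates such as a year and month only would fail the cast, hence the guarding predicate.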

Screencast

The screencast below recapitulates some of the modifications we just applied. It uses an older version of TEI Publisher, but the basic concepts and controls are still the same:


Figure 17: Screencast

Not available in PDF edition. Go to https://www.youtube.com/embed/avRO-b2BwUI?rel=0 to view.

Edit the ODD XML by hand

To switch to the XML source code of the currently edited ODD from within the visual editor, click on the button with the angle brackets in the toolbar of the left side panel. If you made changes in the form, you need to save first to update the ODD. The ODD XML will be opened in a new tab, showing eXist's browser-based editor, eXide. While eXide is sufficient for small edits, we really recommend using a specialized XML editor like oXygen for serious work on your TEI files. It will help you with many tasks, starting with syntax and documentation. You can edit ODDs stored in eXist using oXygen's WebDAV support or the eXist data source function.

If you edit the ODD XML by hand, there are some caveats you need to be aware of. The visual editor automatically checks whether there are existing <elementSpec>s for a new element in any of the ODDs your ODD inherits from. When editing by hand, you need to do this yourself. It's best to always have the base ODDs, tei_simplePrint.odd and teipublisher.odd, open on the side. Both are located in the same collection as your ODD, i.e. /db/apps/tei-publisher/odd. For example, to modify the element spec for <dateline>, check tei_simplePrint.odd, where you'll find an existing definition. Copy it over to your ODD and start modifying it. Pay attention to the @mode attribute on <elementSpec>: you must set it to change if you are overwriting an elementSpec which already exists in the inherited ODDs. If not, set it to add.
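The difference between the two @mode values can be sketched as follows. The schemaSpec attributes are assumptions about how a project ODD chains to teipublisher.odd, and myElement is a hypothetical element name used only to illustrate mode="add".

```xml
<!-- Fragment of a hand-edited ODD; the schemaSpec attributes are assumptions -->
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0" ident="myletter"
    source="teipublisher.odd" start="TEI">
    <!-- dateline already has a spec in tei_simplePrint.odd: use mode="change" -->
    <elementSpec ident="dateline" mode="change">
        <model behaviour="block">
            <outputRendition>text-align: right;</outputRendition>
        </model>
    </elementSpec>
    <!-- myElement (hypothetical) has no spec in the inherited ODDs: use mode="add" -->
    <elementSpec ident="myElement" mode="add">
        <model behaviour="inline"/>
    </elementSpec>
</schemaSpec>
```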

To test any changes, switch back to the tab in which you viewed your document (e.g. Graves’ letter) and select Admin / Recompile ODD from the menu.

Processing Model Syntax

TEI gives users a lot of freedom: there's always more than one way to encode your material! To maintain interoperability and sustainability, you need a way to formally describe the schema used, as well as to document editorial guidelines and transcription processes. TEI ODD was designed for the purpose of expressing all this in the TEI language itself. But how a document should be rendered was previously still considered to be the responsibility of external publishing software and could not be described within the ODD.

The advent of the TEI Processing Model changed this! The intended processing for all elements can now be expressed within the TEI vocabulary as part of the ODD, thus fulfilling its promise of One Document Does It All. Markup elements are mapped to a small set of abstract transformation functions, called behaviours. Basic styling features can be set directly within the ODD using CSS. The processing model is media-agnostic: behaviours and rendition styles are transparently translated into different output media types like HTML, XSL-FO, LaTeX, or ePUB. A single ODD can handle a multitude of output media types with just a few small adjustments.

<model> element

The <model> element is primarily used to document the intended processing for a given element. One or more of these elements may appear directly within an <elementSpec> element specification to define the processing anticipated for that element. Where multiple <model> elements appear, they are understood to document mutually exclusive processing scenarios, possibly for different outputs or applicable in different contexts.

A processing model defines on an abstract level how a given element may be transformed to produce one or more outputs. The model is expressed in terms of behaviours and their parameters, using high-level formatting concepts, such as block, inline, note or heading. A processing model is thus a template description, used to generate the code needed by the publishing application to process the source document into the required output.

The example below depicts a situation where a single model is defined for the <app> element. As neither @predicate nor @output is specified, this model applies to all contexts in which <app> may appear and to all possible outputs. Thus all <app> elements will be transformed into inline chunks of text containing only the contents of <app>'s <lem> child and omitting any <rdg> children.
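A minimal sketch of such a model follows. The @mode value is an assumption and depends on whether the ODDs you inherit from already define this element.

```xml
<!-- Sketch: one model for app, valid in every context and for every output.
     Only the lem child is processed; rdg children are ignored. -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="app" mode="add">
    <model behaviour="inline">
        <param name="content" value="lem"/>
    </model>
</elementSpec>
```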

<model> children and attributes:


• @predicate : the condition under which this model applies, given as an XPath Predicate Expression

• @behaviour : names the function which this processing model uses in order to produce output; possible values include: alternate, block, figure, heading, inline, link, list, note, paragraph

• @output : identifier of the intended output for which this model applies; applies to all output if no @output is present on a <model>

• @useSourceRendition : whether to obey any rendition attribute which is present in the source document

• @cssClass : one or more CSS class names which should be added to the resulting output element where applicable

• <param> : allows passing parameters to the @behaviour function; the parameters available depend on the behaviour in question; when parameters are not explicitly passed, default values are assumed; all behaviour functions use the current element as default content

• <outputRendition> : supplies information about the desired output rendition in CSS; its attribute @scope provides a way of defining pseudo-elements, e.g. first-line, first-letter, before, after

Model explicitly specifying the content parameter: for <app> entries only the content of the <lem> child is to be displayed (as an inline chunk of text).

Model specifying output rendition: the contents of <ex> elements are to be displayed in italic and wrapped in parentheses, using the renditions font-style: italic; for the element itself, content:"("; before it and content:")"; after it.

Sometimes different processing models are required for the same element in different contexts. For example, we may wish to process the <quote> element as an inline italic element when it appears inside a <p> element, but as an indented block when it appears elsewhere. To achieve this, we need to change the specification for the <quote> element to include two <model> elements: the first with a predicate and the rendition font-style: italic;, the second with the rendition margin-left: 2em;.

The first processing model will be used only for <quote> elements which match the XPath expression given as value of the @predicate attribute. Other occurrences of the element will use the second processing model. A set of multiple <model> statements is regarded as an alternation, and only the first model whose @predicate matches the current context is applied.
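The <ex> and <quote> cases described above can be sketched in ODD XML as follows. The @mode values and the predicate ancestor::p are assumptions, and the indentation is written with the standard CSS property margin-left.

```xml
<!-- Fragment with two element specs; @mode values are assumptions -->
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0" ident="example" start="TEI">
    <!-- ex: italic, wrapped in parentheses via before/after pseudo-elements -->
    <elementSpec ident="ex" mode="add">
        <model behaviour="inline">
            <outputRendition>font-style: italic;</outputRendition>
            <outputRendition scope="before">content:"(";</outputRendition>
            <outputRendition scope="after">content:")";</outputRendition>
        </model>
    </elementSpec>
    <!-- quote: inline italic inside a paragraph, indented block elsewhere -->
    <elementSpec ident="quote" mode="add">
        <model predicate="ancestor::p" behaviour="inline">
            <outputRendition>font-style: italic;</outputRendition>
        </model>
        <model behaviour="block">
            <outputRendition>margin-left: 2em;</outputRendition>
        </model>
    </elementSpec>
</schemaSpec>
```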

<model> output styling

The intended rendering for a particular behaviour of a processing model may be specified in one or more of the following three ways.

• the @cssClass attribute may be used to specify the name of a CSS style in an associated CSS stylesheet (read more on specifying CSS styles in the ODD ) which is to be applied to each occurrence of a specified element found (in a given context, for a specified output),

• the attribute @useSourceRendition may be used to indicate that the rendition specified in the source document should be applied,

• the styling to be applied may be specified explicitly as content of a child <outputRendition> element.

When more than one of these options is used, they are understood to be combined in accordance with the rules for multiple declarations of the styling language used.

It is strongly recommended that the use of <outputRendition> be limited to strictly editorial decisions, such as 'conjectures are to be displayed in square brackets', and not serve as a means to record all typesetting and layout-specific design choices. The latter are discussed in the Custom CSS styling chapter.

The processing model library translates the CSS styles into the target media format. Restrictions apply due to differences between the output formats: not all CSS properties are supported for every format. Please refer to the section on Output media settings for further information.
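The three styling options can be combined on a single model, as in the sketch below. The class name conjecture is hypothetical and would have to be defined in an associated CSS stylesheet; the choice of the <supplied> element and the @mode value are likewise assumptions for illustration.

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="supplied" mode="change">
    <!-- cssClass: hypothetical class defined in an external stylesheet;
         useSourceRendition: also honour renditions given in the source -->
    <model behaviour="inline" cssClass="conjecture" useSourceRendition="true">
        <!-- Editorial decision: conjectures are displayed in square brackets -->
        <outputRendition scope="before">content:"[";</outputRendition>
        <outputRendition scope="after">content:"]";</outputRendition>
    </model>
</elementSpec>
```

Keeping colors, fonts and spacing in the stylesheet behind the class name, and only the brackets in the ODD, follows the separation recommended above.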


<modelSequence> and <modelGrp>

Summary of elements that can be used to document one or more processing models for a given element:

• <model> describes the processing intended for a specific context

• <modelSequence> (sequence of processing models) a group of model elements documenting intended processing models for this element, to be acted upon in sequence

• <modelGrp> (processing model group) a group of model elements documenting intended processing models for this element

The <modelGrp> element may be used to group alternative <model> elements intended for a single kind of output. The <modelSequence> element is provided for the case where a sequence of models is to be processed, functioning as a single unit. A common use case would be to use <modelSequence> to generate a table of contents along with the reading text, as shown in the example below:
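A sketch of such a sequence follows. It assumes the index behaviour from the TEI guidelines with a type parameter of 'toc'; this behaviour is not among the behaviours listed in this chapter, so its availability in a given setup should be verified.

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="body" mode="change">
    <modelSequence>
        <!-- First model: emit a table of contents (assumed index behaviour) -->
        <model behaviour="index">
            <param name="type" value="'toc'"/>
        </model>
        <!-- Second model: emit the reading text itself -->
        <model behaviour="block"/>
    </modelSequence>
</elementSpec>
```

Unlike a plain list of <model> elements, where only the first match is used, both models in the sequence are applied to the same element, one after the other.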

Behaviours

The TEI guidelines document a number of default behaviours. TEI Publisher allows users to add their own behaviours, either within the ODD itself or by writing XQuery code. The following section lists the default behaviours.

Available Behaviours

Behaviour functions accept a range of parameters, depending on the function in question. Where these parameters are left unspecified in the <model>, default values are used. All functions take at least one parameter: content. It is added by default unless specified and contains the nested content of the currently processed node. You may change this by explicitly setting a content parameter inside the model.

In the parameter lists below we skip the content parameter, as it is available for every behaviour. Optional parameters are marked as optional in parentheses, followed by the output mode they apply to, if relevant.

alternate
Display alternating elements for displaying the preferred version and an alternative, both at once or by some method of toggling between the two. The concrete implementation depends on the output format.
Parameters:
• default: the content to display by default
• alternate: alternate content
• persistent (optional, web): show a persistent popup on click instead of a tooltip on hover if the parameter evaluates to an effective boolean value of true

anchor
Create an anchor to which you can link, identified by the given id.
Parameters:
• id: the id

block
Create a block structure, usually a div in HTML or fo:block in fo.

body
Create the body of a document. In HTML this will result in a <body> tag.

break
Create a line, column, or page break according to type.
Parameters:
• type: e.g. "page", "column", "line"
• label: e.g. "p. 13v"

cell
Create a table cell. If the @cols or @rows attribute is specified, the cell may span several columns/rows.

cit
Show a citation, with an indication of the source.
Parameters:
• source: the citation source

document
Start a new output document.

figure
Make a figure with the provided title argument as caption.
Parameters:
• title: a caption

graphic
Display the graphic retrieved from the given url.
Parameters:
• url: the url to load the graphic from
• width: the width of the graphic, e.g. "300px", "50%" ...
• height: the height of the graphic, e.g. "300px", "50%" ...
• scale: a scaling factor to apply. If specified, width and height will be output as percentage based on the scaling factor, which should be a number between 0 and 1.
• title: a title for the graphics element. Usually not shown directly.

heading
Creates a heading.
Parameters:
• level: the structural level of this heading. In HTML mode, this translates to <h1>, <h2> etc.

inline
Outputs an inline element.

link
Create a hyperlink.
Parameters:
• uri: the link url
• target: identifier of the tab to open the link in (only web output)

list
Creates an ordered or unordered list, depending on the type attribute (e.g. type="ordered"). If a label is present before each item, a description list is output instead, using the label as definition term.
Parameters:
• type: the type of list: use "ordered" for an enumerated list, or "custom" to specify item labels in combination with the n parameter on each listItem. The default is "unordered" for a list of bullet points.

listItem
Outputs an item in a list.
Parameters:
• n: a label to use for the item

metadata
Outputs a metadata section, e.g. a <head> in HTML.

note
Create a note, often out of line, depending on the value of place; could be "margin", "footnote", "endnote", "inline".
Parameters:
• place: defines the placement of the note, e.g. "margin", "footnote" ...
• label: the label to use for the footnote reference, usually a number.

omit
Do nothing, skip this element, do not process children.

paragraph
Create a paragraph.

row
Create a table row.

section
Create a new section in the output document. In HTML mode, this translates to a <section> element being output.

table
Create a table.

text
Output literal text.

title
Output the document title. In HTML mode, this creates a <title> element. In LaTeX, it adds the title to the document metadata.

webcomponent (TEI Publisher extension)
Outputs a custom HTML element (usually referencing a webcomponent) using the value of parameter name as tag name. All other parameters are copied into corresponding attributes (properties of the webcomponent).
Parameters:
• name: the tag name to use for the custom element. Must be a string value.

Including General CSS Styles

TEI Publisher is based on webcomponents, therefore styling of one document will not interfere with the styling of another document on the same page. All styles are strictly encapsulated within the component and do not "pollute" the global browser space. This also has a downside though: CSS rules defined outside the <pb-view> have no influence on the text styling inside the component (with some exceptions, mainly for properties which are inherited down the HTML tree, e.g. font-family).

However, putting all styling information into <outputRendition> tags within the ODD is also not a good idea: it adds a lot of redundancy and mixes editorial responsibilities with web design concerns. The recommended solution is therefore to use CSS classes for recurring styling aspects. TEI Publisher supports linking to an external CSS stylesheet from the encodingDesc/tagsDecl/rendition section of the ODD. Just specify a relative link in the @source attribute:
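A minimal sketch of such a link is given here. The file name my-edition.css is hypothetical; only the element path and the @source attribute come from the description above.

```xml
<encodingDesc xmlns="http://www.tei-c.org/ns/1.0">
    <tagsDecl>
        <!-- "my-edition.css" is a hypothetical file name; the link is relative -->
        <rendition source="my-edition.css"/>
    </tagsDecl>
</encodingDesc>
```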

The file should be stored in the same collection as the source ODD it is referenced from. The linked file should be a standard CSS stylesheet.

Note that unfortunately, editing renditions is not yet supported by the visual ODD editor, so you will have to add the corresponding elements to the ODD by hand.

Alternatively, you may use the same TEI element <rendition> with the @selector attribute to embed CSS rules directly in the ODD, with the CSS declarations as its content, e.g.

    font-family: serif; font-weight: 400;

Choose one of the two approaches, but do not mix them. In both cases make sure to recompile the ODD after changes as the CSS is merged into the generated code!

A new addition in Publisher 6.0 allows you to pass the external CSS file in the load-css attribute of pb-view. Recompiling the ODD is not necessary in this case; otherwise it is functionally equivalent to using the ODD rendition.
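The embedded variant with @selector might be sketched as follows. The selector .verse is a hypothetical class name and the scheme attribute is an assumption; the declarations are the ones quoted in the text.

```xml
<encodingDesc xmlns="http://www.tei-c.org/ns/1.0">
    <tagsDecl>
        <!-- ".verse" is a hypothetical class; @scheme="css" is assumed -->
        <rendition scheme="css" selector=".verse">font-family: serif; font-weight: 400;</rendition>
    </tagsDecl>
</encodingDesc>
```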

Output Media Settings

The library supports various output media formats and translates styles into the corresponding format. Currently the following output modes are supported and can be used in the @output attribute:

    web      Produces HTML output
    fo       Generates a PDF via XSL:FO
    latex    Creates a PDF via LaTeX
    print    An alias which applies to both fo and latex modes
    epub     Similar to web concerning features, but targeted at epub documents

The quality of the generated output may vary a lot for the fo and latex modes, depending on the type of input document. The following section provides more details on the configuration of the FO output option:

FO Output

When generating XSL:FO output, the implementation tries to translate the CSS rules specified for renditions into the corresponding XSL:FO formatting properties. Not all CSS properties are recognized or can be mapped to FO properties. Unknown properties defined in a rendition will be ignored.

The default rendering for headings, paragraphs and the like is defined by a separate CSS file. The implementation merges those defaults with the custom renditions given in the ODD. The library searches for default CSS styles in a file named <odd-name>.fo.css inside the specified output collection (in which the generated XQuery files are stored).

The style definitions are copied literally into attributes on the output XSL:FO elements, so any property which is a valid attribute for the corresponding element may be used. For example, teipublisher.fo.css contains:

    .tei-text { font-family: "Junicode"; hyphenate: true; }
    .tei-floatingText { padding: 6pt; }
    .tei-p { text-align: justify; }

Every XSL:FO document needs a master layout and a page sequence definition. Because those tend to be rather verbose as they include things like page margins etc., they are read from two XML files:

master.fo.xml Contains the layout master set

page-sequence.fo.xml Defines the main page sequence

The mechanisms for configuring FO output are still very much under development and we welcome suggestions by users.

LaTeX Output

The latex output mode produces good results for longer texts which fit well into the pre-defined LaTeX environments. The number of supported CSS properties is limited though:

• font-weight

• font-style

• font-variant

• font-size

• color

• text-decoration

• text-align

• text-indent

To create arbitrarily complex LaTeX output, you may want to use the <pb:template> extension to the ODD syntax. It is heavily used, for example, to generate the LaTeX version of this documentation. See also serafin.odd or vangogh.odd for examples.

TEI Publisher creates a default LaTeX prolog based on standard packages and settings. You may override the defaults by providing your own template within the ODD element spec for the TEI root element. See the example ODDs mentioned above. Note that TEI Publisher will generate some LaTeX macros for styles defined in <outputRendition>, which should be imported into the prolog. The styles are added to the default configuration map and can be accessed via $config('latex-styles'). Refer to the example ODDs and just copy/paste the corresponding lines.

This output mode requires a local installation of LaTeX on the machine running TEI Publisher. The examples have been tested on a default installation of MacTeX 2018. If you are not running MacTeX, you likely need to adjust the path to the LaTeX binary in the XQuery configuration module modules/config.xqm. Search for the variable $config:-command and adjust it to point to a binary of xelatex, pdflatex or lualatex.

ePub Output

The epub output mode extends the HTML mode. You may define general styling in an extra CSS file, located in resources/css/epub.css. This external stylesheet is included in all generated epub files and may be used to configure general settings like page breaks, hyphenation, font sizes etc.

Extensions to the Processing Model Specification

XQuery Instead of XPath

The implementation directly translates processing model instructions into an XQuery 3.1 module by generating executable XQuery code. This is straightforward, as the resulting XQuery closely resembles the specification in the ODD and is thus easy to debug. It also leads to very efficient code, which is as fast as, or even faster than, a hand-written, optimized transformation.

As a welcome side effect, any valid XQuery expression may be used wherever the spec expects an XPath expression, e.g. in predicates or parameters. For example, one can define variables inside a parameter using the standard XQuery let $x := ... return ... syntax.
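As a small illustration of a let expression inside a parameter, the following sketch binds an attribute to a variable before deciding what to output. The @when attribute and the fallback to the context node are chosen purely for illustration.

```xml
<model xmlns="http://www.tei-c.org/ns/1.0" behaviour="inline">
    <!-- XQuery inside @value: use @when if present, else the element itself -->
    <param name="content"
        value="let $date := @when return if ($date) then string($date) else ."/>
</model>
```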

Default Processing Model Rules

It is possible to define a default <elementSpec> to be applied to all elements which are not already matched by another <elementSpec>. By default, if no <elementSpec> is present for an element, its text content is output. To change this behaviour and omit the content of elements without a specification, you may want to define a default <elementSpec> as shown below:

You can also define models to be applied to all text nodes, e.g. if you need to normalize certain nodes:

Note that outputting text nodes is a performance-critical operation, so use this with care. Overly complex processing may dramatically slow down rendering.
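The two kinds of default rules described above could be sketched as follows. This assumes that the wildcard identifiers * and text() are what the implementation expects in @ident; the schemaSpec identifier and the long-s normalization are illustrative only.

```xml
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0" ident="my-odd">
    <!-- Assumed wildcard: applies to every element without its own elementSpec -->
    <elementSpec ident="*" mode="add">
        <model behaviour="omit"/>
    </elementSpec>
    <!-- Assumed wildcard for text nodes: replace long s by a round s -->
    <elementSpec ident="text()" mode="add">
        <model behaviour="text">
            <param name="content" value="translate(., 'ſ', 's')"/>
        </model>
    </elementSpec>
</schemaSpec>
```

Keeping the text rule to a single cheap string function is in line with the performance warning above.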

External Parameters

The script calling the processing model may pass external parameters into the ODD. They will be available in the variable $parameters, which is an XQuery map. Access parameters using ?, the XQuery lookup operator.

For example, one can use this feature to control how specific parts of the document are output, without having to define a separate output mode, which would result in much more code. Below we display a shortened header for the document, containing simply its title, but only if the parameter "header" is set to "short": ...

The <pb-view> webcomponent also lets you define arbitrary parameters to be passed to the ODD via <pb-param>. For example, the breadcrumbs shown above this documentation page are realized by setting a parameter mode, which can be queried in model predicates with $parameters?mode='breadcrumbs'.

If the parameter is set, the processing model rules in the ODD will output only the headings of all ancestor sections of the current division, ignoring everything else. This approach helps to reuse the same ODD for viewing specific aspects of the document.

A dedicated user interface webcomponent, <pb-toggle-feature>, exists for toggling between two values of a parameter. For example, a checkbox labelled "Diplomatic View" can set the value of $parameters?mode to diplomatic when on, and to norm otherwise.
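The pieces described in this section can be sketched as two separate fragments: a model testing an external parameter in the ODD, and the HTML side setting parameters. The XPath to the title, the id document1 and the attribute names on <pb-toggle-feature> (name, on, off) are assumptions made for illustration.

```xml
<!-- Fragment 1 (ODD): only applies if the parameter "header" is "short" -->
<model xmlns="http://www.tei-c.org/ns/1.0"
    predicate="$parameters?header='short'" behaviour="block">
    <param name="content" value=".//titleStmt/title"/>
</model>

<!-- Fragment 2 (HTML): pass mode=breadcrumbs to the ODD via pb-param -->
<pb-view src="document1">
    <pb-param name="mode" value="breadcrumbs"/>
</pb-view>

<!-- Fragment 3 (HTML): checkbox switching mode between two values -->
<pb-toggle-feature name="mode" on="diplomatic" off="norm">Diplomatic View</pb-toggle-feature>
```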

Code Templates and Custom Behaviours

The two dozen behaviours defined by the TEI processing model are enough to cover most HTML output tasks, but other output formats like LaTeX may require more customization and control over the generated output. The TEI Publisher library thus extends the processing model syntax with two custom elements for defining code templates.

While TEI Publisher does provide ways to write your own behaviours in XQuery and thus extend the ones defined in the guidelines, this should only be used as a last resort: custom XQuery behaviours limit the portability of the ODD and are bad for maintenance.

Avoiding custom behaviours works quite well for HTML output and we have realized complex projects with just two or three extension behaviours. Things start to become more difficult if you try to output LaTeX though: there are hundreds of packages to use, and users typically define their own macros or environments for all recurring typesetting tasks. For example, to print a TEI <persName>, experienced LaTeX users would normally create a corresponding \persName macro and handle the formatting details there. Unfortunately, out of the box the TEI processing model does not facilitate this level of customization.

Introducing <pb:template>

TEI Publisher thus supports an extension to the ODD syntax in its own namespace (http://teipublisher.com/1.0).

Within the ODD, a <model> may define a <pb:template> element containing a code template. The template is expanded first and the result is passed into the behaviour specified for the model, replacing the default content parameter accepted by all behaviours. The very simple case of outputting a <persName> in LaTeX could thus be written with the template:

    \persName{[[content]]}

The template can reference other parameters defined within the <model> by enclosing the parameter name in double brackets. In the example above we’re referencing the default parameter content, which contains the nested content of the <persName> tag. The parameter will be processed before it is passed into the template, so if <persName> contains nested TEI markup, the corresponding processing model rules will be applied first. The result of expanding the template then becomes the new content parameter to be passed to the behaviour (inline in the example above), which is processed in the normal way as defined in the TEI guidelines.

You may also specify additional parameters to be included in the template. For example, the TEI document may contain a glossary of terms which are referenced in the text via a @ref attribute on the element enclosing the text. In LaTeX this would translate to \glslink{ref}{text}, which can be easily produced by a <model> with the template:

    \glslink{[[ref]]}{[[content]]}

We define an additional parameter ref, which contains the id string from the @ref attribute, stripping out the leading '#'.

The templating mechanism is not limited to LaTeX, but may also be used to generate HTML or FO, for example, if you have to generate a complex HTML fragment to represent a single TEI element. This is hard and sometimes impossible to achieve without templates. We’ll see some examples in the next section.
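Put together, the two templates discussed above might appear in an ODD roughly as sketched here. The element name term for the glossary reference, the schemaSpec identifier and the use of @output="latex" are assumptions; the template strings and the ref parameter follow the text.

```xml
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0"
    xmlns:pb="http://teipublisher.com/1.0" ident="my-odd">
    <elementSpec ident="persName" mode="change">
        <model output="latex" behaviour="inline">
            <pb:template xmlns="">\persName{[[content]]}</pb:template>
        </model>
    </elementSpec>
    <!-- "term" is a hypothetical choice of element carrying @ref -->
    <elementSpec ident="term" mode="change">
        <model output="latex" behaviour="inline">
            <!-- strip the leading '#' from the reference -->
            <param name="ref" value="substring-after(@ref, '#')"/>
            <pb:template xmlns="">\glslink{[[ref]]}{[[content]]}</pb:template>
        </model>
    </elementSpec>
</schemaSpec>
```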

Defining New Behaviours in the ODD

By combining code templates with parameters we arrive at a very simple mechanism for defining new behaviours right inside the ODD. Take the TEI Publisher documentation as an example: it is written in DocBook 5 and transformed via ODD. The documentation includes some videos which are hosted on YouTube. In DocBook those are represented by <videodata> elements inside a <videoobject>:

Screencast
In the HTML output we would need to transform this into an <iframe>, so the reader can view the video embedded in the page. We can achieve this with a <pb:template> as sketched in the previous section, but it would be nice to turn this into a general-purpose behaviour, which we can re-use in other situations requiring an iframe. The TEI Publisher library allows us to define a behaviour right inside the ODD as follows:
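A sketch of such a declaration, assuming the element name <pb:behaviour> with @ident and @output, <pb:param> children, and a behaviour called iframe with the parameters src, width and height; the allowfullscreen attribute is purely illustrative.

```xml
<tagsDecl xmlns="http://www.tei-c.org/ns/1.0"
    xmlns:pb="http://teipublisher.com/1.0">
    <pb:behaviour ident="iframe" output="web">
        <!-- parameters without @value are supplied by the calling model -->
        <pb:param name="src"/>
        <pb:param name="width"/>
        <pb:param name="height"/>
        <!-- xmlns="" resets the default namespace so that iframe is not a TEI element -->
        <pb:template xmlns="">
            <iframe src="[[src]]" width="[[width]]" height="[[height]]"
                allowfullscreen="allowfullscreen"> </iframe>
        </pb:template>
    </pb:behaviour>
</tagsDecl>
```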

Note how we have to reset the namespace on the <pb:template>. This is required because the default namespace in an ODD document is the TEI namespace. You thus need to reset it whenever you want to output elements in another or no namespace inside a template. Without this, the iframe would end up in the TEI namespace. Web browsers will usually ignore this, but it is wrong nevertheless.

All behaviours should be included in the TEI header, or to be exact, in the <tagsDecl> inside the <encodingDesc>. You may have multiple behaviour declarations with the same @ident, given that they apply to different @output modes. Parameters specified via <pb:param> without a @value attribute are expected to be passed to the behaviour from the calling model. A parameter may be empty though. If you define an XPath expression as @value attribute, the result of the XPath evaluation will be used as the value for the parameter.

The new behaviour will be named iframe and takes three parameters: src, width and height. It can now be called from a <model> as follows:
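Calling the new behaviour could then look like this sketch, which assumes that the video URL and dimensions are stored in the DocBook attributes @fileref, @width and @depth of <videodata>.

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="videodata" mode="add">
    <model behaviour="iframe">
        <!-- the attribute choices below are assumptions about the source markup -->
        <param name="src" value="@fileref"/>
        <param name="width" value="@width"/>
        <param name="height" value="@depth"/>
    </model>
</elementSpec>
```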

For further code examples, please have a look at docbook.odd , which is used for viewing the documentation.

At the time of writing, the graphical ODD editor in TEI Publisher does not yet support defining your own behaviours via <pb:behaviour>. You thus have to make those changes in the source XML using eXide, oXygen or another XML editor. You can, however, use the graphical editor to continue editing the ODD afterwards. It is smart enough not to overwrite your hand-written code upon save.

Extension Modules

Where possible, developers should stick to the standard behaviours defined by the TEI guidelines, or use the <pb:behaviour> and <pb:template> extensions of the ODD syntax. However, there might be situations in which it is necessary to generate a specific type of complex output, which requires the full power of XQuery. To facilitate this, the implementation allows additional extension modules to be configured:

Configuration

Configuration is done via an XML file which must reside in the same collection as the source ODD files. It contains a series of <output> elements, each representing a particular output mode (e.g. web or print) via its @mode attribute. An <output> element without a mode groups modules available for all output modes. Each <output> element lists the modules to be loaded for the specified output mode. Each definition may optionally be limited to a specific ODD, whose name is specified in the @odd attribute.
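The structure just described could be sketched as below. The <output> element with @mode and @odd is taken from the description; the root element <modules>, the <module> element with its uri, prefix and at attributes, and the module my-functions.xql are assumptions for illustration.

```xml
<modules>
    <!-- no @mode: modules available to all output modes -->
    <output>
        <module uri="http://example.org/xquery/my-functions" prefix="my"
            at="my-functions.xql"/>
    </output>
    <!-- only loaded for web output, and only for docbook.odd -->
    <output mode="web" odd="docbook">
        <module uri="http://www.tei-c.org/tei-simple/xquery/ext-html"
            prefix="ext-html" at="ext-html.xql"/>
    </output>
</modules>
```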

Whenever the library tries to locate a processing model function for a given behaviour, it will first check any extension module it knows to see if it contains a matching function. One can thus overwrite the default functions as well as define new ones. An extension module may also contain general purpose XQuery functions you want to call from within an ODD parameter, e.g. for formatting a date, outputting a number etc. To make those functions available to all output modes, just skip the @mode attribute.

Implementing Custom Behaviours

To be recognized by the library, an extension function must accept at least 4 default arguments, plus any number of custom parameters. The required parameters are:

$config a map containing configuration information as well as function references to be called. The most important ones are $config?apply($config, $node) and $config?apply-children($config, $node, $content). Both are function items and, when called, continue processing with either a single $node or a sequence of nodes in $content.

$node the TEI element being processed at the moment

$class a list of HTML class names to be used. This includes automatically generated class names as well as those passed via @cssClass on a model item.

$content because content is defined for every model rule, it is always passed to a behaviour function (though it might be empty)

For all additional parameters, the processing model implementation tries to fill each custom parameter with a corresponding value by looking through the <param> children of the <model> in the ODD to find one with a name matching the variable name. If no matching parameter can be found, the function argument will be set to the empty sequence. You should not enforce a type or cardinality for any of the custom parameters as this may lead to unexpected errors. The parameters may be empty or contain more than one item.

For example, a custom behaviour called code for syntax highlighting in an extension module named ext-html.xql might look as follows:

    xquery version "3.1";

    (:~
     : Non-standard extension functions, mainly used for the documentation.
     :)
    module namespace pmf="http://www.tei-c.org/tei-simple/xquery/ext-html";

    declare namespace tei="http://www.tei-c.org/ns/1.0";

    declare function pmf:code($config as map(*), $node as element(), $class as xs:string+,
        $content as node()*, $lang as item()?) {
        (: Assumption: the result is wrapped in a pre element; adjust to your markup. :)
        <pre class="{$class}" data-language="{$lang}">
        {replace(string-join($content/node()), "^\s+?(.*)\s+$", "$1")}
        </pre>
    };

It defines one function, pmf:code, which can be called from the ODD as follows, provided that the ext-html.xql module has been configured as described in the previous section.
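A sketch of the calling side, assuming the source element is DocBook's <programlisting> and that its @language attribute holds the language name. The parameter name lang matches the $lang argument of the function.

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="programlisting" mode="add">
    <model behaviour="code">
        <!-- mapped by name to the $lang argument of pmf:code -->
        <param name="lang" value="@language"/>
    </model>
</elementSpec>
```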

Custom Behaviours Accepting User-Defined Parameters

Sometimes you may want to implement a generic behaviour which takes arbitrary parameters from the user. This means the parameter list of your behaviour will not be fixed. To facilitate this, a behaviour function may declare a final parameter $optional as map(*). If the processor finds <param> children in the model which cannot be mapped to an explicitly declared parameter, it stores all such extra parameters as key/value pairs in the $optional map.
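Seen from the ODD, this could look like the following sketch. The behaviour badge and all parameter names are hypothetical: the function is assumed to declare $label explicitly, so the two remaining parameters would end up in $optional under the keys color and icon.

```xml
<model xmlns="http://www.tei-c.org/ns/1.0" behaviour="badge">
    <!-- hypothetical: declared as $label in the behaviour function -->
    <param name="label" value="@type"/>
    <!-- hypothetical: not declared, so collected in the $optional map -->
    <param name="color" value="'red'"/>
    <param name="icon" value="'info'"/>
</model>
```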

Building a Sample-Based ODD

If you do not want to start a customized ODD from a blank template, you can alternatively generate one that covers the classes and elements of a selection of TEI files stored in TEI Publisher’s data collection. Simply select one or more sample documents in the list to the left, enter a name and title into the form and click on Create from examples. Note that if you haven’t removed the default example TEI files that shipped with TEI Publisher, their markup will be included in the constructed ODD as well.

This method uses the oddbyexample stylesheet that is part of the TEI consortium’s stylesheets. Users with a corpus of one or more TEI files can generate a custom ODD that contains explicit additions and deletions for all possible TEI modules, as well as <valList> elements for attribute values in the input corpus. By default, the basis of the comparison for determining which elements have been modified in the examples is the full tei_all.odd.

Advanced Use

To further tweak the building process you can call the functions of the odd-by-example.xql module from your own XQuery code. If you wish to generate your own basis for the comparison you can call the following function to store a compiled ODD in the default ODD location of TEI Publisher:

    obe:compile-odd(doc('../odd/my-file.odd'), 'my-file-name')

Due to a bug in the odd2odd.xsl stylesheet the output of this function is not always valid. Before using it for further processing, make sure that the resulting document is valid.

You can also modify the transformation parameters of:

    obe:process-example(doc('../data/test/myTEI.xml'), 'odd-name', 'simplePrint')

The above example uses simplePrint as a basis for building the new ODD. The full list of configurable options is:

Custom CSS styling

The CSS stylesheet resources/css/theme.css defines styles used by all pages of TEI Publisher and Publisher-generated applications. Nevertheless, users should not modify this file directly but instead create a project-specific CSS customization file and include it alongside theme.css.

This approach lets you selectively override certain styles and CSS variables from theme.css while remaining on the easy upgrade path for future TEI Publisher and pb-components updates.
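In an HTML page template, including a customization file alongside theme.css might be sketched as follows. The file name my-project.css is hypothetical, and the order (project file last, so that its rules win) is an assumption based on ordinary CSS cascading.

```xml
<head>
    <!-- shipped with TEI Publisher: leave untouched -->
    <link rel="stylesheet" href="resources/css/theme.css"/>
    <!-- hypothetical project file: loaded last so it can override theme.css -->
    <link rel="stylesheet" href="resources/css/my-project.css"/>
</head>
```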

Customizing web components styling

A web component completely shields its content, so it cannot be styled from outside. Web component styles remain encapsulated, preventing style contamination between individual components and the general application context. This is a blessing in general, as it allows components to be scripted and styled without fear of collision with other parts of the page, but it poses additional challenges when adjusting the look and feel of a component to fit a project’s theme.

For Publisher, encapsulation of web components means that definitions in theme.css or equivalent project customization CSS files cannot directly govern the styling of web components.

While some aspects of component styling remained inaccessible to customization in versions preceding Publisher 6, pb-components now expose style properties to the outside world via standard CSS variables. This way variables like --pb-footnote-color defined in theme.css can be accessed by, for example, the pb-view component and thus determine the color of the footnote marker in the rendered transcription.

Note that while you cannot change the inner appearance of a component except by setting its custom CSS properties, you can style the component itself within the HTML template, e.g. to position it within the layout of the page.

External stylesheets

The ODD specification allows for explicit declaration of an external CSS file which may define styles and CSS classes to be applied to transformed sources (in encodingDesc/tagsDecl/rendition), e.g.

Styles and classes from that file are loaded into the pb-view component and are thus accessible to its content.

External stylesheets for pb-view can also be specified via the load-css component configuration attribute. In this scenario, unlike with the ODD rendition, regenerating the ODD is not required for changes to the CSS file to be applied; otherwise both methods are functionally equivalent.
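As a sketch of the second approach, the stylesheet is named directly on the component. The document id document1 and the path odd/my-edition.css are hypothetical.

```xml
<!-- load-css: CSS file loaded into the component, no ODD recompilation needed -->
<pb-view src="document1" load-css="odd/my-edition.css"></pb-view>
```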

A broader discussion of using <rendition> for custom styles can be found in this section.

Page templates and pb-components

As described earlier, the various sample documents included in the TEI Publisher Demo collection differ not only in the ODD they use, but also in the general layout and composition of the page. They are based on different HTML templates, which can be found in the templates/pages collection of the TEI Publisher app. Each template assembles various building blocks in a slightly different way: some examples show a facsimile view next to the text, others a parallel display of transcription and translation, some include a map, and others showcase a collapsible metadata section. Which template is being used is determined by a processing instruction in the TEI sources of these examples.

The building blocks we mentioned are custom HTML elements. Each of them encapsulates a certain functionality and appearance. The map, the facsimile, but also the text view itself and all controls are custom HTML elements. They are like "Lego" blocks which can be freely moved around and rearranged without knowing anything about the internal implementation of the component.

Web Components

The technology enabling this Lego-like modular approach is a W3C standard called Web Components. It is already built into many browsers and support is improving quickly, reducing the need for external frameworks. There’s a growing collection of ready-to-use components available, e.g. the Polymer elements we use for menus, buttons, dropdowns etc. From version 6.0, TEI Publisher exposes its collection of Web Components targeted at creating digital editions as a separate pb-components package.

You do not need to know much about Web Components to use them in TEI Publisher. From a user perspective, a component looks like any other HTML element. You configure it by setting its properties via attributes. For example, the following HTML code snippet will display the first page/section of two completely different documents (to learn more about embedding Publisher output and components see further chapters):
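A sketch of such a page fragment, showing a TEI document and a DocBook document side by side. The endpoint URL, the document paths and the ids are hypothetical, and the src attribute linking each <pb-view> to a <pb-document> id is an assumption.

```xml
<pb-page endpoint="https://example.org/exist/apps/tei-publisher">
    <!-- document sources: referenced below by their id -->
    <pb-document id="document1" path="test/my-tei-document.xml" odd="dta" view="page"/>
    <pb-document id="document2" path="doc/my-docbook-document.xml" odd="docbook" view="div"/>
    <main>
        <!-- each view renders the first page/division of its document -->
        <pb-view src="document1"></pb-view>
        <pb-view src="document2"></pb-view>
    </main>
</pb-page>
```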

<pb-page>, <pb-document> and <pb-view> are three web components from the pb-components library, while <main> is a standard HTML5 tag. The name of a custom element must start with a prefix to distinguish it from standard HTML. This concept should be familiar to XML people. For TEI Publisher components, the prefix is always pb-. Components from other sources will use different prefixes, e.g. paper- and iron- for the Polymer collection.

The part of the page which uses TEI Publisher web components should always be wrapped in a <pb-page> element. This element determines the TEI Publisher server instance all other components will be communicating with (see the next section below). It is also responsible for some other initialization steps, e.g. loading the list of available user interface translations.

<pb-document> specifies a document source, which can then be referenced by id from other components. The component provides a way to configure basic properties governing the document’s default rendering, like the associated ODD file. In the example above, we define three properties for each document:

path the relative path to the XML document. This will be interpreted as relative to the data root collection of TEI Publisher, by default pointing to the data collection within the TEI Publisher app.

odd the name - without suffix - of the ODD to use for rendering the document. In the example, the first document is encoded in TEI and thus transformed through dta.odd, while the second is written in DocBook and passed through docbook.odd.

view this property determines how the document will be paginated if the user navigates forward/backward. Currently three possible methods are available:

1. div: displays one structural division (TEI div, DocBook section) at a time

2. page: displays the document page by page. This requires page break indicators to be present (<pb> in TEI, not supported for DocBook).

3. single: the entire document (or a selected fragment of it) is displayed at once

<pb-view> is the critical component in TEI Publisher: it provides the actual text view by transforming a part or the entirety of the source XML into HTML based on the processing model instructions in the ODD.

Because webcomponents are all about encapsulation, <pb-view> ensures that the styling of the text as governed by the ODD will be confined to the boundaries of the component. This makes it possible to display two completely heterogeneous texts (like the documentation and Kant’s Kritik) on the same page without styles contaminating each other. As a downside, encapsulation also poses some challenges, which we discussed in the section about CSS styling.

Webcomponent Documentation

To better understand the various components TEI Publisher provides, it is best to have a look at the small examples contained in the web components API documentation. The list of components may be overwhelming at first sight. However, you don’t need to learn them all. There are just a few components that should be understood before you start customizing. Their demo pages showcase a working example along with the code snippet which actually implements it. You can also get an editable live view if you click on the Edit Code button to the bottom right of each example. In particular you may want to look at the following examples:

pb-document shows the example given above in action

pb-view adds navigation buttons to read the document page by page

pb-facsimile display facsimiles via IIIF server and link to them

pb-leaflet-map show a map and link to geo coordinates

pb-load call a server-side XQuery to retrieve additional information about the document, in this case actors appearing in the play

pb-search execute a search on the database, retrieve the results and paginate through them

pb-grid dynamically add more columns to a horizontal grid of components

Some properties of pb-view and other components are boolean properties. In HTML5 this corresponds to an attribute without value, which is illegal in XML. If you want to preserve valid XML, just write the attribute with the same name and value, e.g. append-footnotes="append-footnotes".
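For illustration, the XML-safe spelling of a boolean attribute on a view; the document id document1 is hypothetical.

```xml
<!-- HTML5 would also accept the bare attribute: <pb-view src="document1" append-footnotes> -->
<pb-view src="document1" append-footnotes="append-footnotes"></pb-view>
```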

Communication between Components

To allow for maximum flexibility, nearly all of the TEI Publisher webcomponents communicate via events: this avoids hard wiring, and components may appear anywhere on the page. For example, the controls for paginating through a document do not directly talk to the document view: they just send an event, indicating the user’s wish to navigate backward or forward. Components listening for this event may then react to it by refreshing the text being displayed.

Since you can have multiple text views showing content from different sources, every event can be announced on a specific communication channel. This allows us to distinguish between different sources, e.g. two transcriptions being shown side by side. Most TEI Publisher components therefore accept two properties to configure the channel they are listening on or sending events to:

subscribe name of the channel to which this component subscribes. It will only react to events coming in via this channel.

emit name of the channel to which this component sends events.

If neither of these properties is given, the component will subscribe and emit to the global default channel. A component may also send to a different channel than it subscribes to, allowing chains of events. Common properties and methods accepted by many TEI Publisher components are defined in the class PbMixin.
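The following fragment is a minimal sketch of how two independent views could be kept apart with channels. The document paths, ODD names, ids and the channel names transcription and translation are all hypothetical; the subscribe and emit properties are the ones described above.

```html
<!-- Two source documents (paths and ODD names are placeholders) -->
<pb-document id="document1" path="test/example-transcription.xml" odd="example"></pb-document>
<pb-document id="document2" path="test/example-translation.xml" odd="example"></pb-document>

<!-- This control only talks to components on the "transcription" channel -->
<pb-navigation direction="forward" emit="transcription" subscribe="transcription">
    <button>Next page</button>
</pb-navigation>

<!-- Reacts to the navigation control above -->
<pb-view src="document1" subscribe="transcription" emit="transcription"></pb-view>

<!-- Lives on its own channel and therefore ignores the control above -->
<pb-view src="document2" subscribe="translation" emit="translation"></pb-view>
```

Because the second view subscribes to a different channel, paging through the first document leaves it untouched.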

Page Templates

TEI Publisher currently includes several different page templates, which combine the building blocks described above with off-the-shelf components to achieve a certain page layout and composition. If you look at the HTML code, you’ll see a mix of pb- elements and app- , paper- , iron- elements. The last three belong to the Polymer collection and you can find them documented in the public webcomponent registry . TEI Publisher components are not yet available there, though we may move some of the general purpose components there later.

To avoid redundancy, the page template files use eXist’s templating feature to pull in some repeating parts which are the same for all pages, for example the toolbar and the menu. You’ll find these files in the corresponding sub-collection:


The following page templates are currently available in TEI Publisher:

view.html the default template showing a single text view at the center

facsimile.html a template featuring a facsimile view to the right, retrieving images from a IIIF server

letter.html used for Graves’ letter, this template displays an additional map to the right and a list of places, people and organizations

translation.html shows a transcription and its translation side by side. Both are contained in the same TEI document and extracted via an XPath expression passed to <pb-view>

cortez-with-translation.html similar to translation.html but the alignment between transcription and translation is more complex as the translation contains no page breaks. The part of the translation corresponding to a given fragment of the transcription thus needs to be computed dynamically for each page, using an XQuery function (defined by the map attribute on <pb-view>).

cortez.html like above but with additional facsimile panel

vangogh.html demonstrates three columns by default: a metadata column, a transcription which can be switched to a diplomatic view including line endings, and a translation. Other columns, e.g. facsimile or commentary can be added dynamically.

dantiscus.html builds upon facsimile.html adding a collapsible pb-drawer element to display metadata from teiHeader section.

osinski.html similarly builds upon facsimile.html but a collapsible metadata section is realized with a pb-collapse element.

The page templates are meant as examples to be copied and modified by users. They were written to match a concrete example and are not intended to be universal. TEI is too heterogeneous to provide a one-size-fits-all solution. We thus believe that providing a wide range of practical examples is the best way to help users realize their own project. Only the generic template, view.html, should work with all example TEI documents.


Create your own Template

To create your own template:

1. open one of the existing templates in e.g. eXide (by clicking the links in the list above)

2. adjust the content attribute in the meta tag to a label that fits your template

3. save the file under a different name into the same collection ( templates/pages )
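Step 2 could look like the following minimal sketch. It assumes that the label shown for a template is taken from a meta tag named description in the head of the template; both that name and the label text are assumptions, so check the template you copied for the tag it actually uses.

```html
<head>
    <meta charset="utf-8"/>
    <!-- Assumption: this meta tag carries the label listed for the template -->
    <meta name="description" content="My custom reading view"/>
</head>
```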

Now reload the document you’d like the template to apply to and you’ll be able to switch to your new template, either by

• using the Template dropdown in the Settings panel

• adding a parameter template=mytemplate.html to the URL shown by the browser

• adding a processing instruction to a TEI document to make a specific template the default:

Handling Complex Alignments

It is often desirable to show two or more views of a document at the same time, for example to display the translation aligned with a given source fragment. In the simplest case, the transcription and translation may be aligned on the level of divisions or page breaks and one can simply use two <pb-view> elements referencing different starting points in the TEI document (this approach is implemented by the translation.html template for Serafin’s letter).

Unfortunately things are not always as simple as that. For example, even if the transcription contains page breaks or milestones which can be used to display a single page, the translation might not. One thus needs a different approach to compute the alignment between fragments. The logic of the alignment algorithm, however, depends heavily on the conventions used in the encoding. TEI allows a wide variety of alignment mechanisms and we do not want to limit the freedom of the editor by prescribing a particular method.

TEI Publisher thus implements a generic way to plug an XQuery function into the processing pipeline. The function takes the source element being processed as input and may replace it by its aligned equivalent. Such an equivalent may be another element or fragment from the same or a different document. The source element will usually point to the part of the transcription being displayed. The mapping function uses this as a starting point to determine an aligned fragment and returns it. The returned fragment will then be passed on through the processing model.

The XQuery mapping function should be defined in the module modules/map.xql . It takes an element as its only argument and may return any valid TEI fragment, which will become the input for further processing through the processing model. The local name of the mapping function can then be supplied in the map attribute of <pb-view> . As an illustration, the Van Gogh example uses such a pb-view for displaying the translation.

In the Van Gogh letters, the translation contains page breaks corresponding to page breaks in the original letter, but these use a different prefix for the xml:id . To align the translation with the transcription, we only need to adjust the id and retrieve the corresponding page break. The XQuery mapping function is thus rather simple:

declare function mapping:vg-translation($root as element()) {
    (: build the xml:id of the matching page break in the translation :)
    let $id := ``[pb-trans-`{$root/@f}`-`{$root/@n}`]``
    (: look it up in the same document :)
    let $node := root($root)/id($id)
    return
        $node
};

Note that returning the corresponding <pb> node of the translation is sufficient here as further processing will automatically extract the page fragment up to the next <pb> . More complex cases may require that the mapping function returns an arbitrary TEI fragment. Also note that the xpath attribute of the <pb-view> element in the template must still point to the source transcription ( div[@type='original'] in this case). It’s just the mapping function which translates a position in the source transcription to a corresponding fragment in the translation.

The letter by Cortez to Dantiscus sent from Mexico demonstrates a much more sophisticated alignment, determining the translation fragment to be shown by inspecting the ID range of the transcription. It illustrates the case where no milestone elements exist in the translation to explicitly mark page boundaries of the original; the mapping algorithm thus aims to display the closest corresponding fragment of the translated text.
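Returning to the simpler Van Gogh case, a view wired to the mapping function might be declared roughly as follows. This is a sketch under assumptions: the id document1, the channel name and the leading part of the XPath are hypothetical, while the map value is the local name of the mapping function and div[@type='original'] is the source division named above.

```html
<!-- xpath still selects the source transcription; map swaps in the aligned translation -->
<pb-view src="document1"
    xpath="//text/body/div[@type='original']"
    map="vg-translation"
    subscribe="transcription"></pb-view>
```

The view keeps following the pagination of the transcription, and only the content handed to the processing model is replaced.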

Server-side API

Many of the user interface web components need to communicate with the server to request data which has to be retrieved, processed and returned before the client-side components can display it to the user. For example, when the user clicks the Table of Contents button, the corresponding pb-load component sends an Ajax request to retrieve the list of chapters. Similarly, when the user types something in the autocomplete field, a request is sent after each keystroke to find matching terms. These requests need to be received and processed by the server running the eXist instance, and only then will data be returned to the browser.

Previous versions of TEI Publisher included a number of XQuery modules implementing the server-side functionality, located in the modules collection. In version 7, a new approach has been introduced: a formal API specification, based on the OpenAPI standard, defines all the server-side API endpoints available in Publisher. The image below presents a section of the API handling all ODD-related operations: creating, retrieving, updating, deleting, recompiling; syntax check for a code fragment (lint); retrieving a list of all available ODDs.

Figure 18: TEI Publisher API page - the odd endpoint


Advantages of the API-based approach

• clear specification and easy overview: users and developers will benefit from the clear specification for available functionality

• customizable: users can easily overwrite and extend existing API endpoints as well as add custom ones for project-specific functionality

• separation of concerns: server-side functionality can be used directly by any software system, not necessarily within the context of the TEI Publisher or using Publisher’s UI components; further changes in internal API implementation do not require any adjustments of the UI components

• standard-based: OAS compliance means we are using a well-known and documented standard and can benefit from existing tools for writing, testing and generating documentation

API endpoint groups

TEI Publisher 7 groups server-side functionality into several main sections:

documents Retrieve all or part of a document's content transformed to a target format; retrieve a document’s table of contents; delete a document.

collection Browse the collection hierarchy and upload new files.

odd Manage the ODD files stored in TEI Publisher: create, retrieve, update, delete, recompile; check syntax for a code fragment (lint); retrieve a list of available ODDs.

search Search: run a search through collections; list facets for the search results; list autocomplete options.

transform Transform content on the fly: convert a Word document to TEI or POST an XML document and get its preview rendered via ODD.

view View documents via an HTML template; list available templates.

apps Operations for managing apps: generate a standalone application or package and download an app.

dts Implementation of the DTS (Distributed Text Services) API.

user Login: authorize the user.

info Information about the server: the API version.

Custom API endpoints

Beyond the standard server-side functionality provided by TEI Publisher, careful provisions have been made to allow users to add their own API endpoints. The Custom API section demonstrates how project-specific functionality can be added within the OAS API scheme.

All that is required is to edit the modules/custom-api.json file and add specifications for custom endpoints in JSON format. The operationId property specifies the XQuery function to be called to process the request. The module containing the function definition needs to be imported into modules/custom-api.xql so that the request can be correctly resolved. The function implementation can either be added to the XQuery module modules/custom-api.xql itself or to a new module, which should be imported into this file to make it known. See the README of the oas-router package for more information about how to write a request handler function.

To reference a custom function declared directly in modules/custom-api.xql , use the prefix custom: in the operationId . For functions in imported modules, use the prefix defined in the import statement.

Language versions

TEI Publisher is already available in twenty languages, and this number keeps increasing thanks to the engagement of our user community. With version 6.0, i18n support has been greatly extended to cover not only labels and attribute values in HTML templates but also labels within web components. A mechanism for project-specific language files extending the default Publisher label collection has been added.

Thanks to community contributions via Crowdin, a number of languages have been added and existing ones are updated when needed. We welcome and encourage additions and amendments. Please consider Crowdin the master repository for translations and publish your contributions there instead of submitting a PR directly. Do not hesitate to get in touch if a language you’d like to support is not yet listed.

Figure 19: Custom API specification - show article

Crowdin users are only exposed to a form-based graphical user interface, but the i18n files use the JSON format to preserve their logical structure. The initial portion of the German localization file de.json is shown below for illustration. In TEI Publisher and in generated apps, translation files are by default loaded from the CDN together with pb-components . When a project needs to add new labels or change the wording of existing ones, the customization mechanism described below can be used.


Figure 20: Translation JSON file structure

Using i18n in page templates There are several scenarios for using i18n labels:

• directly within HTML element

• in an attribute

• passed in component-specific structure

HTML text node with pb-i18n

Any text fragment within an HTML element can be considered a target for i18n when wrapped in a <pb-i18n> component. <pb-i18n> elements listen for events emitted by <pb-lang> .

Documentation
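As a minimal sketch, the Documentation label could be wrapped as follows. The key menu.documentation is assumed here to be the entry holding this label; the text inside the element serves as the fallback shown before a translation is applied.

```html
<!-- The element content is replaced by the translation stored under the given key -->
<pb-i18n key="menu.documentation">Documentation</pb-i18n>
```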

Attributes (on HTML elements or custom web components)

When it is necessary to translate the value of an attribute, a mechanism based on data- attributes is used. Supply an additional data-i18n attribute specifying the key of the i18n label to use, preceded by the name of the attribute that needs to be translated in square brackets. In the example below it’s the heading attribute that needs to be filled with the translated version of the ODD Files label. The label is stored in the JSON file under odd.files , so [heading]odd.files needs to be used for the data-i18n attribute.

... content of the ODD list
When <pb-lang> is switched to Spanish, the <paper-card> heading will read Archivos ODD instead of ODD Files .
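A minimal sketch of the card discussed above, using the heading attribute and the [heading]odd.files key named in the text. The inner card-content wrapper is only illustrative.

```html
<!-- heading holds the default label; data-i18n names the attribute and the key to translate it with -->
<paper-card heading="ODD Files" data-i18n="[heading]odd.files">
    <div class="card-content">
        <!-- ... content of the ODD list -->
    </div>
</paper-card>
```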

Figure 21: Output for <paper-card> with translated @heading

Special web component properties configured via attributes

Some web components accept more complex configuration options via arrays passed in attributes. For example, the <pb-browse-docs> component allows you to specify a number of label/value pairs for the dropdown menus used for ordering and filtering the document list. Understandably, the labels should switch in line with changes to the language chosen via <pb-lang> . Therefore i18n translation keys like browse.title or browse.author are used instead of text values. Note that web components may have particular expectations regarding the data format, so consult the API documentation for each component.
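The sketch below illustrates the idea of passing label/value pairs as a JSON array in an attribute. The attribute names sort-options and sort-by, as well as the values title and author, are assumptions made for illustration; consult the pb-browse-docs API documentation for the names and format it really expects.

```html
<!-- Labels are i18n keys, not display text; values are hypothetical sort fields -->
<pb-browse-docs
    sort-by="title"
    sort-options='[
        {"label": "browse.title", "value": "title"},
        {"label": "browse.author", "value": "author"}
    ]'>
</pb-browse-docs>
```

Note the single quotes around the attribute value, which allow the JSON inside to keep its double quotes.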


Project specific i18n files

Projects commonly need their own internationalized labels for menu items, dialogs and other user interface elements. All these can be stored in JSON language files, following the same naming conventions and basic structure as the TEI Publisher ones. To use custom language files, you need to specify the path under which they can be found, relative to your application. The path expression requires placeholders for two parameters, ns and lng :

lng the language code for a selected language, e.g. de or fr

ns the namespace prefix used to distinguish between different collections of language files. By default, TEI Publisher expects custom language files to be in a namespace called app , though this can be configured.
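A path expression of this shape might be configured on the page's root component, as in the following sketch. It assumes that the setting is the locales attribute of pb-page; the directory resources/i18n is the one used in the example paths of this section.

```html
<!-- {{ns}} and {{lng}} are filled in at runtime with namespace and language code -->
<pb-page locales="resources/i18n/{{ns}}/{{lng}}.json">
    <!-- ... page content ... -->
</pb-page>
```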

So given the above configuration, TEI Publisher will search for a custom language file for, say, French in resources/i18n/app/fr.json . If you prefer a flat directory structure, you could change the locale to resources/i18n/{{ns}}_{{lng}}.json and TEI Publisher will look for a file resources/i18n/app_fr.json .

You may also define additional namespaces to be searched with the locale-fallback-ns parameter. If, for instance, a my-module namespace is configured this way, TEI Publisher will search for labels in the my-module namespace first, then fall back to the app namespace, and if the label still cannot be found, use TEI Publisher’s default namespace. The latter is called common and should not be overwritten.

The listing below demonstrates a fictional en.json with a custom set of labels. A new top-level key has been added, as well as three subkeys for the menu section. Now the project has access to a number of new i18n keys: menu.about , menu.contact , menu.statute and greeting . Furthermore, a value for menu.documentation has been specified. That key already existed in Publisher (set to ”Documentation”) but the version from the custom file will take precedence and be used in the custom app.

{
    "menu": {
        "documentation": "Docu",
        "about": "About",
        "contact": "Contact",
        "statute": "Statute"
    },
    "greeting": "Welcome"
}

Creating applications with the App Generator

Once you are happy with a certain ODD and HTML template customization, you can easily create a complete, standalone application. Such an application can be tailored further to fit your needs. The generated app can be downloaded as a portable xar package, to be installed into other eXist instances, or synchronised to disk for further development. It provides a fully functional application scaffolding based on TEI Publisher components and modules, pre-configured to use a certain ODD, page template and other settings.

Apps generated by early versions of TEI Publisher included everything they needed to run without TEI Publisher. This had some advantages, but also made it more difficult to update to newer releases. Starting with version 5 we are gradually moving towards a more lightweight app design, ultimately containing only resources which are truly specific to the app (ODDs, HTML templates, images), while all generic functionality will be provided by TEI Publisher and its libraries and packages. Version 6 has seen all web components extracted into the pb-components package, and in version 7 the server-side code has been based on a well-designed API specification with provisions for easy customization. Please make sure you are following our best practice recommendations for a smooth upgrade path.

To get started, click on App Generator in the menu bar and fill out the form. The following form fields are of particular importance:

ODD Here you select the ODD(s) to be used for transformations. If multiple ODDs are selected, the first one will be the default.

URL to identify the app This is the main identifier for your app and should be a globally unique URI. It does not need to correspond to any existing web site. eXist’s package manager is using this URI as a unique identifier for the app.

Abbreviation The abbreviation will be used as the name of the root collection of your app and as the last path component in any URL pointing to the app. It should be unique within one database instance.

Template The HTML template to be used as default for viewing a document. If you created a custom template or modified any of the existing templates, you likely want to select this here.

Data Collection Only specify something here if you have existing data inside the database or if you want to ship the data set as part of a second, separate app. In all other cases, leave this field empty.

Default View Default document view: whole document, page by page or division by division. If you would like to choose page by page, set this to ”By Page”, but please note that all your documents must be appropriately encoded with page breaks as <pb> elements. You will still be able to change the view settings in the HTML template, or users can do it via query parameters.

User/Password The user account which will own all application files. For security reasons, it is advisable to create a new account for every app.

Once you created the new application, log into it using the account details you provided. You can then upload XML documents using the upload panel in the sidebar.

Generated Code Overview


XQuery Code

The collection structure of the generated app follows the typical design of many eXist apps. Best practices for modifying the app are discussed in further sections of this document. You can find the code of your generated app within the /db/apps collection under the name you provided in the abbreviation field of the generator form.

modules Contains XQuery modules used by the app, including the copied TEI Publisher libraries in the modules/lib subcollection.

resources Contains a number of subcollections for various resources used by the app, such as images, fonts or JavaScript libraries. Of particular interest is the resources/odd subcollection where the app’s own ODD files are stored. All of the app styling is done via a set of modularized less stylesheets, residing in resources/css . The main file is style.less , which defines a number of core parameters. Ideally this should be the only file you ever need to modify.

templates HTML templates for the templating framework. Contains page templates as well as smaller components such as the menu or login pane.

transform This collection contains XQuery modules governing the transformations and styles generated from the app’s ODD. Its content will be overwritten with each ODD recompilation, so there is no point in modifying it. It is worth consulting, though, to gain a better understanding of the transformations done by the TEI Processing Model and to troubleshoot them. In case of an issue, it may help to know the files:

{myodd}-web.xql The transformation module generated for output mode ”web”.

{myodd}-web-module.xql Library module which calls the transformation module.

{myodd}.css CSS styles generated from the ODD.

Files start with the name of the odd and the output mode they belong to.


Modifying the App

When you are logged in, the Admin menu in the top navbar provides various links for ease of customization of your app:

Recompile ODD After changing the application’s ODD, click here to up- date the app.

Edit ODD Opens the application’s ODD in eXide for editing.

Using Multiple ODDs

For performance reasons, the mechanism used by generated apps to transform a document via an ODD is slightly different to the one in the main TEI Publisher app: while TEI Publisher resolves ODDs dynamically, generated apps use a static lookup method. The static method is much faster, resulting in better overall performance, but as a consequence ODDs are not automatically detected by the app, and need to be explicitly registered.

Each generated app will by default use a single ODD for all transformations but it is possible to add additional ODDs, e.g. to be used with a pb-view in a custom HTML template. If multiple ODDs are selected when generating the app, the app generator will take care of registering those ODDs. If you would like to add ODDs manually later, the procedure is as follows:

1. Upload the ODD to the resources/odd collection below your app root. This can either be done via eXide or the Upload panel on the start page of your app.

2. In modules/config.xqm find the variable called $config:odd-available and add the name of the new ODD to the sequence.

(:~
 : The main ODD to be used by default
 :)
declare variable $config:default-odd := "shakespeare.odd";

(:~
 : Complete list of ODD files used by the app. If you add another ODD to this list,
 : make sure to run modules/generate-pm-config.xql to update the main configuration
 : module for transformations (modules/pm-config.xql).
 :)
declare variable $config:odd-available := ("shakespeare.odd");

3. In eXide, open modules/generate-pm-config.xql (path relative to your app root) and execute it once. This will regenerate the XQuery module modules/pm-config.xql , which registers all the ODD modules known to the app.

4. Additionally you need to regenerate all ODDs used by the app by either

(a) Clicking the Regenerate all ODDs button on the start page of the app

(b) Via the API documentation page api.html by executing the POST version of the /api/odd endpoint.

Using different ODDs depending on the collection

By default the same ODD will be used for all documents within the app. It is possible though to organize documents into a hierarchy of collections beneath $config:data-root and use different ODDs and general view settings for each collection or document type.

To do so, search for the function config:collection-config , which by default returns an empty sequence, meaning that the default configuration should be used. Comment this out and enable the switch/case statement below to return a different config depending on the current collection. The $collection and $docUri parameters are relative paths, i.e. relative to $config:data-root . So for a single level hierarchy as used in TEI Publisher by default, $collection will be either test , playground or doc . For multi-level hierarchies it might also be e.g. volume1/transcripts . In this case a simple switch/case might not be enough, but you can just replace it with an if/then/else and apply any path matching you like.

declare function config:collection-config($collection as xs:string?, $docUri as xs:string?) {
    (: Return empty sequence to use default config :)
    ()

    (:
     : Replace line above with the following code to switch between different view configurations per collection.
     : $collection corresponds to the relative collection path (i.e. after $config:data-root).
     :)
    (:
    switch ($collection)
        case "playground" return
            map {
                "odd": "dodis.odd",
                "view": "body",
                "depth": $config:pagination-depth,
                "fill": $config:pagination-fill,
                "template": "facsimile.html"
            }
        default return
            ()
    :)
};

Instead of switching by collection, you could also configure different views depending on the type of document, e.g. by checking the type of the first div:

doc($config:data-root || "/" || $docUri)//tei:body/tei:div/@type

Exporting the Finished App To save your finished application or exchange it with other people, you need to save it as an application archive. Application archives use a standardized EXPath Packaging format: the resulting .xar file can be uploaded to any eXist instance via the Dashboard and the Package manager will take care of the deployment. There are two ways to create a .xar file from your application:

1. Use the Admin / Download App menu entry in the generated app to directly download a .xar

2. Synchronize the application to a directory on disk via Application / Synchronize in eXide

The first approach is recommended if all you need is a copy of your app on disk. A .xar is just a ZIP archive, so you can unpack it into a directory of your choice, which you can then commit to a version control system like git. However, if you continue to make changes inside the database (e.g. further work on the ODD), you may want to use the second method, i.e. call eXide’s synchronization. It requires that you have access to the file system of the server running eXist though, so it’s usually only an option if you run your own eXist instance locally. The synchronize steps in detail:

• Prerequisite: you need to have the Apache Ant build tool installed.


• Open one resource belonging to your application in eXide. It doesn’t matter which one. The only important thing is that the name of your app is displayed next to Current app: on the top right of the eXide window. If this is not the case, stop and check again!

• Click Application / Synchronize in the menu. It opens up a dialog with two fields: Start time and Target directory . When you synchronize the first time, empty the Start time field. Enter a valid, absolute directory path on your server machine into Target directory .

• Click the Synchronize button. This may take a moment, but you should see a list of written files at the bottom of the dialog afterwards.

• Change to the directory you specified for synchronize.

Watch the screencast below for the whole synchronization procedure.

For security reasons, the password you entered when creating the app is not stored in the database, so it cannot be synchronized to disk. To restore a password for your app, you thus need to edit the repo.xml file in the directory and add a @password attribute to the <permissions> element.

Building from a directory

Once you have a copy of your app extracted in a local directory, you can always rebuild a fresh .xar from it. In the simplest case it is sufficient to call ant inside the directory. For up-to-date build instructions which also cover more advanced uses, refer to the ”Building” section in the readme of the TEI Publisher repository .

Best Practice Recommendations

If you wish to customize the generated app further, it is worth keeping your changes separated from the generated code as much as possible, to allow for future alignment with newer versions of TEI Publisher. The generated app shares most of its XQuery libraries with the main TEI Publisher app. A copy of those is included in the lib/ collection of the generated app and should not be modified! This way you can later update the libraries to a newer TEI Publisher release without breaking your app. Including the libraries in the generated app creates some redundancy, but we chose to accept this trade-off to make it easier to view and modify everything relevant to the app. If you nevertheless find that modifications of lib/ modules are necessary, please consider whether your change would be generally beneficial for TEI Publisher and, if so, create a PR for TEI Publisher.

It is considered safe:

1. to modify all HTML templates below templates/ as well as index.html and search.html in the root of the app

2. to change XQuery modules in modules , excluding those in modules/lib

3. to add images, fonts or change i18n translations below resources

4. to add custom API functions (written in XQuery) to custom-api.json and custom-api.xql

The following core XQuery modules in every app are safe to be modified (all are stored in the modules subcollection):

config.xqm The main configuration file for the app. It contains a number of parameters which control important aspects.

custom-api.xql Custom API functions should be defined in this module - or a module imported by it. Also, HTML templating modules should be imported into this module to be registered with the TEI Publisher.

app.xql Add your own HTML templating functions in XQuery here if needed. With TEI Publisher 4.0 or later, there should be less of a need to extend this file as most of the application logic is realized via web components.

pm-config.xql This file is usually generated by calling modules/generate-pm-config.xql (see the corresponding section ). It defines the functions to be called for rendering TEI content via the processing model. It imports the modules generated from your ODD and assigns them to variables as function pointers. This approach is much more efficient than the dynamic lookups done by the main TEI Publisher app. It has been production tested on large web sites.

navigation-*.xql This group of modules contains functions relevant for the display and navigation of documents corresponding to a given document type

query-*.xql Modules with functions powering the search features

view.xql The main module of the app. This module initializes eXist’s templating system. The only case when it should be modified is if you want to add further XQuery library modules containing template functions.

Updating Applications

With TEI Publisher 7, we have redesigned the server-side API. Combined with the client-side reorganization brought by TEI Publisher 6, this will make future updates for generated applications a lot easier. Client-side UI and server-side API are now cleanly separated from any custom app elements. For apps created with TEI Publisher 6, updating to 7 requires only small modifications. Migrating from earlier versions, like 4 or 5, requires more effort and is described in further sections.

Upgrading from TEI Publisher 7 to 7.x

Upgrading an app generated by TEI Publisher 7 to another minor release involves the following general steps:

1. Update the webcomponents library (also see the FAQ). The version used is defined by a single variable in modules/config.xqm:

       declare variable $config:webcomponents := "1.24.12";

2. Update server-side modules: those live in modules/lib. You can safely remove the entire folder and replace it with the corresponding one from the new version.

Extra Steps for TEI Publisher 7.1

The new annotation editor in TEI Publisher 7.1 requires a number of files to be copied. If you do not intend to support web annotations in your app (we recommend having a separate app for this), you only need steps 1 and 2:

1. Copy modules/annotation-config.xqm into the corresponding location in your app

2. Copy templates/basic/controller.xql into the root of your app. This contains an important security fix. If you modified your controller yourself, just replace the last "else", which in the new version should start with:

       let $main :=
           if (matches($exist:path, "^/+api/+(?:odd|lint)")) then
               "api-odd.xql"
           else if (matches($exist:path, "/+tex$") or matches($exist:path, "/+api/+apps/+generate$")) then
               "api-dba.xql"
           else
               "api.xql"

3. Copy odd/annotations.odd to resources/odd in the generated app.

4. Copy templates/pages/annotate.html and templates/pages/annotate.css to the corresponding folders.

5. Also copy resources/scripts/annotations
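To see which requests the controller snippet in step 2 hands to which main module, the branching can be mirrored outside XQuery. The following JavaScript is only an illustrative sketch: the function name selectMainModule and the sample paths are assumptions made for this example, and it presumes that the three patterns behave the same in JavaScript as they do in XQuery's matches().

```javascript
// Mirrors the final "else" branch of controller.xql quoted in step 2.
// Illustration only: the real routing is done in XQuery by eXist-db.
function selectMainModule(path) {
  // ODD editing and linting requests
  if (/^\/+api\/+(?:odd|lint)/.test(path)) {
    return "api-odd.xql";
  }
  // Requests ending in /tex or addressed to the app generator
  if (/\/+tex$/.test(path) || /\/+api\/+apps\/+generate$/.test(path)) {
    return "api-dba.xql";
  }
  // Everything else
  return "api.xql";
}

// Hypothetical request paths, chosen only to exercise each branch.
for (const path of ["/api/odd/my.odd", "/api/apps/generate", "/api/document/a.xml/html"]) {
  console.log(path, "->", selectMainModule(path));
}
```

If your own controller routes additional paths, they would need to be merged into the first two conditions rather than appended after the final branch, which catches all remaining requests.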

Upgrading from TEI Publisher 6 to 7

To upgrade a custom application generated with TEI Publisher 6 to version 7, we recommend cloning the source code of both TEI Publisher 7 and the custom application into a local directory. The commands shown below were used on a Linux system for updating the Dodis demo app, but with appropriate adjustments to the syntax the same actions can be performed on a different operating system. Alternatively, the migration can be carried out using eXide’s file manager.

If you decide to work with the command line, as suggested here, clone TEI Publisher 7 into the directory one level above the one containing your own application source code:

    git clone https://github.com/eeditiones/tei-publisher-app.git
    cd tei-publisher-app
    git checkout v7.0.0

At this point we need to move to the directory where your own application source code is stored:

    cd ../my-custom-app

Copy files

Now we need to remove the old modules/lib directory and replace it entirely with the new one from TEI Publisher 7. We’ll also copy a number of files from the templates subdirectory of TEI Publisher:

    rm -rf modules/lib
    cp -r ../tei-publisher-app/modules/lib/ modules/
    cp ../tei-publisher-app/templates/basic/pre-install.xql .
    cp ../tei-publisher-app/templates/basic/post-install.xql .
    cp ../tei-publisher-app/templates/basic/controller.xql .
    cp ../tei-publisher-app/resources/css/theme.css resources/css
    cp ../tei-publisher-app/templates/basic/modules/custom-api.* modules/
    cp ../tei-publisher-app/templates/basic/modules/facets.xql modules/
    cp ../tei-publisher-app/templates/api.html templates/

Ideally, these files should not have been modified by you if you followed our earlier best practice recommendations. Otherwise you may need to reapply your changes.

Next, copy the navigation* and query* modules. These are intended to be customized, so you may have changed them in your custom app. Compare the versions and make sure you reapply your modifications, if any.

    cp ../tei-publisher-app/modules/navigation* modules/
    cp ../tei-publisher-app/modules/query*.xql modules/

Note that the naming of the query-*.xql files has changed to be consistent with the navigation-*.xql files.

The same considerations apply to the HTML templates for menus and the toolbar. Changes in these areas were quite minor, however, so you may alternatively postpone this step until you encounter concrete issues in your application:

    cp ../tei-publisher-app/templates/menu.* templates/
    cp ../tei-publisher-app/templates/toolbar.html templates/
    cp ../tei-publisher-app/templates/drawer.html templates/

Edit HTML templates

TEI Publisher 7 expects all HTML files to reside in the templates subfolder. Previous versions used a mix of locations, with some HTML files in the root of the app. You should thus first copy your index.html, search.html, error-page.html and any other custom HTML files into templates.

Also, because TEI Publisher 7 has a well-defined API to handle the communication between user interface components and server-side functionality, we need to change some of the URLs in the HTML templates we’re using. As a rule of thumb, all URLs that previously called XQuery modules directly should now start with api/ followed by the correct API path.

The main HTML template to be changed is index.html. Please search and replace the properties for the following webcomponents: should be changed into and should be changed into The last change should also be applied to search.html and you should change the <pb-load> loading the search results to read:

Also check templates/pages/view.html and any other page template your app uses. Search for calls to .xql or modules/ and replace them with appropriate API paths. If you were making direct calls to custom modules, e.g. from a pb-load component, these need to be added as custom API endpoints.

Update config.xqm

Finally, a few settings need to be added to the main configuration module, modules/config.xqm:

1. Change $config:webcomponents to at least version 1.13.0.

2. Check for missing variables or functions and copy these from tei-publisher-app/templates/basic/modules/config.xqm, in particular:

   • $config:odd-available
   • $config:odd-internal
   • config:collection-config()
   • config:default-config()
   • config:document-type()
   • config:get-document()

Check for other customizations

In most cases you will now be able to rebuild your custom app, redeploy it to eXist and test. If you encounter any error messages, they are most likely due to additional modifications you applied to your custom app. These are usually located in two main areas:

Custom HTML templating functions

Earlier versions of TEI Publisher relied heavily on eXist’s HTML templating framework and custom apps would add their own templating functions into modules/app.xql (or additional modules). This approach is still perfectly valid. Everything will work as expected if you’ve just extended modules/app.xql. However, if you have added other modules containing templating functions, you now need to import these explicitly into modules/custom-api.xql.

Custom XQuery main modules

If your custom app makes direct calls to XQuery main modules (e.g. via the pb-load component), these have to be rewritten. TEI Publisher 7 expects all API calls to go through the Open API router - direct calls to XQuery modules are no longer supported. Therefore you need to transform your code into functions which can be called via the Open API, e.g. by adding them to modules/custom-api.xql or importing them there.

Remove superfluous code

Now you can remove files which are no longer used:

    rm templates/toc.html
    rm templates/search-results.html
    rm modules/view.xql

Change API endpoints

Finally, to be able to view and test the API of your custom app, you should change the endpoints in the Open API specification files. In both modules/lib/api.json and modules/custom-api.json, change the url property in the following section:

    "servers": [
        {
            "description": "Endpoint for testing on localhost",
            "url": "http://localhost:8080/exist/apps/tei-publisher"
        }
    ],

Change the final part of the URL to match the name of your app instead of "tei-publisher". If you are on a remote server, adjust the whole URL accordingly.

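If you maintain several apps, the url property of the servers section can also be adjusted with a small script instead of by hand. The sketch below is a minimal, hypothetical helper (the name withAppName and the app name my-custom-app are not part of TEI Publisher) that replaces the last path segment of every server URL in an already parsed specification object. Reading and writing the two JSON files is deliberately left out.

```javascript
// Hypothetical helper: returns a copy of an Open API specification object
// in which the last path segment of each server URL is replaced by appName.
function withAppName(spec, appName) {
  return {
    ...spec,
    servers: spec.servers.map((server) => ({
      ...server,
      // A replacer function avoids special handling of "$" in appName.
      url: server.url.replace(/[^/]+\/?$/, () => appName),
    })),
  };
}

const spec = {
  servers: [
    {
      description: "Endpoint for testing on localhost",
      url: "http://localhost:8080/exist/apps/tei-publisher",
    },
  ],
};

const updated = withAppName(spec, "my-custom-app");
console.log(updated.servers[0].url);
// prints "http://localhost:8080/exist/apps/my-custom-app"
```

The helper returns a new object rather than modifying its argument, so the same parsed specification can be reused for several apps.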
Migrating from TEI Publisher 4, 5 or earlier

If you are migrating to version 7 from TEI Publisher 5, 4 or earlier, you also need to pay attention to the user interface redesign introduced by TEI Publisher 6. This makes things more difficult, and there are two possible approaches for updating:

• generate a new application with TEI Publisher 7 and merge your changes into the newly generated app. This is the recommended method. If your customizations were limited to e.g. ODD files, CSS styles, templates, index configurations or adding your own code in modules/, it should all go very smoothly.

• update your generated app by modifying the HTML templates you use and copying files from TEI Publisher.

The first option is strongly recommended and much easier in general. The second option is only for experienced users who have to update apps containing a lot of customized code.

Update by Generating a New Application

1. Upload your customized ODD to TEI Publisher 7.

2. If you created a custom HTML template for your document view, upload it to the templates/pages collection of TEI Publisher. You can use eXide’s file manager to do so.

3. Generate a new application using your custom ODD. Make sure to choose a different URL and short name to not confuse the old and new app.

4. Adjust modules/config.xqm variables, if necessary.

5. Selectively upload all other files you changed or resources you added to the generated app (data, CSS). For custom page templates and XQuery modules, follow the recommendations for updating from Publisher 6.

Update by copying

This approach is for experienced users only and not recommended. We mainly keep it here for reference. This guide was created for updating to TEI Publisher 6, so you will also need to follow the instructions for migrating from 6 to 7 and apply the steps not described below.

The following steps assume that you either

• have a copy of your app’s code on the filesystem and either a clone of TEI Publisher 6 or an app skeleton generated by TEI Publisher 6 (preferred) next to it

• or have your app and TEI Publisher installed in your database. In this case use eXide’s file manager to copy files.

In the following, the TEI Publisher 6 app you are copying from will be identified as $source.

1. Edit modules/config.xqm and add two variables at the top: $config:origin-whitelist and $config:webcomponents . $config:webcomponents should point to the latest version of pb-components.

    (:~
     : A list of regular expressions to check which external hosts are
     : allowed to access this TEI Publisher instance. The check is done
     : against the Origin header sent by the browser.
     :)
    declare variable $config:origin-whitelist := (
        "(?:https?://localhost:.*|https?://127.0.0.1:.*)"
    );

    (:~
     : The version of the pb-components webcomponents library to be used by this app.
     : Should either point to a version published on npm,
     : or be set to 'local'. In the latter case, webcomponents
     : are assumed to be self-hosted in the app (which means you
     : have to npm install it yourself using the existing package.json).
     : If a version is given, the components will be loaded from a public CDN.
     : This is recommended unless you develop your own components.
     :)
    declare variable $config:webcomponents := "0.9.11";

Other variables in config.xqm which may need to be updated are:

• $config:data-default
• $config:data-exclude
• $config:context-path

2. Copy all XQuery files from $source/modules/lib into the modules/lib of your application.

3. Copy $source/resources/css/theme.css into the same location in your application.

4. In all HTML files you changed for your app, check <head> and remove any <script> element or <link> doing an import like:

Instead add the following line to the head:

Also add a link to theme.css:

5. Copy $source/resources/css/theme.css

6. Copy $source/resources/i18n . The existing *.xml in your directory can be deleted - unless you added your own translated keys, in which case you would need to move them into the corresponding json format.

7. HTML files you did not change may just be overwritten by copying the corresponding versions from $source . In particular this includes files in the $source/templates/basic/templates subdirectory.

8. Copy $source/templates/basic/controller.xql into the root of your application - unless you made changes to this file yourself, in which case you would need to merge it.

9. Check the files in modules within your app: if you have not changed any of them, just copy the corresponding files from tei-publisher-app/modules. See if you need to merge the modified ones.

10. i18n translations are now applied client-side rather than server-side. This means you should drop all references to data-template="i18n:translate" and the i18n namespace from your HTML. Also remove the i18n module import from your modules/view.xql or overwrite the file if you have not modified it.

You may consult the diff of an actual update from Publisher 5 to 6 in the dodis-wall repository on GitHub: "update to publisher 6: update templates, configuration and styling".

Data

TEI Publisher ships its data files within the same application package. Nevertheless, separating your data from application code has many benefits, particularly for actively developed applications and data sets. This way, changes to your code can be deployed without redeploying and reindexing your data, and vice versa. It is also easier to maintain separate repositories (e.g. in Git) and differentiate privileges for editorial and developer teams.

While we would generally recommend separating data and code, some projects may still prefer to keep their data and application integrated in a single xar package for the sake of marginally easier distribution. The internal structure of the data collection can be arbitrary, though there are some considerations regarding index configuration to take into account.

Data collection

Two variables in config.xqm are used to configure the location of the data collection. $config:data-root specifies where in the collection hierarchy the data is stored. Only the top-level collection needs to be specified.

    (:~
     : The root of the collection hierarchy containing data.
     :)
    declare variable $config:data-root := $config:app-root || "/data";

Switching to a separate data package is as simple as changing this variable. For example, assuming that we store our data in /db/apps/lgpn-ling-data/data, the data root should be defined as:

    declare variable $config:data-root := '/db/apps/lgpn-ling-data/data';

In the variable $config:data-exclude you may specify a sequence of root elements which should be excluded from the list of documents shown in the browsing view, e.g. secondary data files like a taxonomy or entity lists.

You may find it helpful to create and build the data package starting from the template hosted in the e-editiones repository. The README provided there briefly explains the role of each file and how to adjust it to your needs.

It is critical to store the index configuration file, collection.xconf, in the correct location. Refer to the eXist-db documentation for details, but it is common practice to store the collection configuration in the data package and rely on the pre-install script (pre-install.xql) to copy the file to its required position in /db/system/config. A similar consideration applies if the full-text index makes use of an external function module, which in TEI Publisher and generated apps is commonly the case for facet and field definitions. This module needs to be stored before the index is applied, and usually it’s best to store it in the same location as collection.xconf.

Subcollections

Many editions will simply present a number of documents, e.g. a collection of letters like the Van Gogh demo or plays like the Shakespeare demo. The structure of the data is therefore very simple: all the files are stored on the same level in the data-root collection. Other publications, particularly those including heterogeneous material, may require a more complex organization. TEI Publisher itself is a good example: its data collection is further divided into three subcollections, doc (for documentation), test (for examples) and playground (for user-supplied material). You will note that it also contains other resources: image files and even an HTML file.

Figure 22: Structure of TEI Publisher’s data collection

Specify the $config:data-default variable in modules/config.xqm to set which collection should be used as a point of entry to your data. Whenever collection.html is present in that location, it will be used as a custom landing page displayed instead of a simple document listing. In the case of TEI Publisher itself, the landing page presents the Demo Collection, Playground and Documentation.

    (:~
     : The root of the collection hierarchy whose files should be displayed
     : on the entry page. Can be different from $config:data-root.
     :)
    declare variable $config:data-default := $config:data-root;

This approach can be extended further, say if you wanted to add prints and manuscripts subcollections to the playground and present them in a custom landing page. Just add the data and a corresponding collection.html to the playground collection. You can shape the custom landing page however you like, using the full power of HTML and eXist templating. A landing page like this could be created with the following HTML template, stored in data/playground/collection.html:

Custom landing page demonstrating sub-subcollections.

  • Prints

    This one is for prints.

  • Manuscripts

    This one is for manuscripts.

Figure 23: Subcollections

Figure 24: Fragment of the subcollections landing page

Processing instructions

The default view for a specific document can be configured via a processing instruction. Before displaying a document, TEI Publisher will check if a processing instruction exists at the start of the document, telling it which ODD and view template to use (along with other configuration parameters). For example, the following processing instruction associates the document with the view template translation.html, the ODD dantiscus.odd, and switches to a page-by-page display (along TEI page break boundaries):

odd The ODD file to use for rendering the document.

template The HTML view template to use. Default is view.html as configured in modules/config.xqm.

view Default view to show when browsing the document. Supported values are div, page or single:

   1. div: displays one structural division (TEI div, DocBook section) at a time
   2. page: displays the document page by page. This requires page break indicators to be present (<pb> in TEI, not supported for DocBook).
   3. single: the entire document (or a selected fragment of it) is displayed at once

When viewing the document by structural divisions, two additional settings control the amount of content displayed at a time:

depth When viewing entire divisions, the software tries to determine if it should show child divisions in separate pages or include them with the current div. <depth> indicates the nesting level up to which divisions should be shown separately. So setting it to "2" will result in divisions on level 3 or greater being shown together with their enclosing div.

fill If child divisions appear on separate pages, it may happen that the enclosing div contains just a heading or a single line of text. In this case, the algorithm will try to fill the page by showing the first child division as well. The <fill> parameter defines the number of elements which should at least be present on a page. If not, the software tries to fill it up.
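To make the parameters more tangible, the following JavaScript sketch parses the pseudo-attributes of such a processing instruction and applies the depth rule described above. It is an illustration rather than TEI Publisher's own code: the function names are invented, and the target name teipublisher shown in the comment is an assumption, since only the parameters themselves are described in this section.

```javascript
// Parse the pseudo-attributes of a processing instruction such as
//   <?teipublisher odd="dantiscus.odd" template="translation.html" view="page"?>
// (the target name "teipublisher" is an assumption, see the lead-in).
// "data" is the part of the instruction after the target name.
function parsePseudoAttributes(data) {
  const params = {};
  const pattern = /([\w-]+)\s*=\s*"([^"]*)"/g;
  for (const [, name, value] of data.matchAll(pattern)) {
    params[name] = value;
  }
  return params;
}

// The "depth" rule as described in the text: divisions nested deeper than
// "depth" are shown together with their enclosing division.
function isShownSeparately(level, depth) {
  return level <= depth;
}

const settings = parsePseudoAttributes(
  'odd="dantiscus.odd" template="translation.html" view="page"'
);
console.log(settings.odd, settings.template, settings.view);

// With depth set to 2, a level 2 division gets its own page, level 3 does not.
console.log(isShownSeparately(2, 2), isShownSeparately(3, 2));
```

Note that all values arrive as strings, so numeric settings like depth or fill would have to be converted before they are compared.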

Facet Search Configuration

Facets allow users to quickly navigate through a set of documents or query results by selecting from predefined categories or properties. This way, users can "drill down" into the set, reducing the number of displayed items with every step. For demonstration purposes, TEI Publisher configures two facets by default: "Genre" and "Language". You can see those to the left of the document list on the start page, or below the search box on the search result page.

Figure 25: Facets on the start page

From a user perspective, the main concept behind facets is the drill down: initially the user sees all facet values associated with the set of documents or search results displayed. The number behind each value denotes the number of items in the set having that particular facet value. As the user selects one facet value, the set necessarily becomes smaller, so non-matching facet values disappear and the numbers adjust accordingly.

Facets are a new feature in eXist 5.0. They are very fast because eXist creates them when indexing the document. No extra computation is needed when the user clicks on a facet to drill down into a displayed set: all information is already available in the index. To see a more complex example of facets in action, visit our Van Gogh demo.

If you would like to configure other or additional facets, you need to edit three files:

collection.xconf The collection.xconf tells eXist how to index the collection. The default configuration in TEI Publisher creates two Lucene full-text indexes for TEI on <tei:text> and <tei:div>. Each of those may have facets attached. Every facet must have a dimension and an expression attribute. The expression is an arbitrary XPath/XQuery string. For every element being indexed, the expression is evaluated once and the result defines the string values which will be associated with the specified facet dimension. For a description of how full-text indexes and facets are defined in the collection.xconf, please refer to the eXist documentation.

index.xql If you open the default collection.xconf, you’ll see that most facet expressions call a function nav:get-metadata. This function is declared in index.xql. By externalizing most of the code into a separate function, we can keep the index configuration clean and short. The default index.xql already does some advanced preprocessing, in particular for the "genre" facet: each of the sample documents references a central taxonomy (contained in data/taxonomy.xml). The references are resolved at indexing time by following the <catRef> element’s @target attribute. Note that we create a hierarchical facet, because e.g. "Philosophy" is a sub-category of "Prose". The code in function idx:get-genre will automatically include the super-category.

config.xqm While the other two files cover the server-side creation of facets, we also need a place where we define how facets should be displayed in the user interface. The main configuration module, config.xqm, declares a variable $config:facets. It should contain an array of maps, where each map defines the settings for one dimension, e.g.:

    (:
     : Display configuration for facets to be shown in the sidebar. The facets themselves
     : are configured in the index configuration, collection.xconf.
     :)
    declare variable $config:facets := [
        map {
            "dimension": "genre",
            "heading": "Genre",
            "max": 5,
            "hierarchical": true()
        },
        map {
            "dimension": "language",
            "heading": "Language",
            "max": 5,
            "hierarchical": false(),
            "output": function($label) {
                switch($label)
                    case "de" return "German"
                    case "es" return "Spanish"
                    case "la" return "Latin"
                    case "fr" return "French"
                    case "en" return "English"
                    default return $label
            }
        }
    ];

The map properties are as follows:

dimension The name of the dimension. Should correspond to the value of the @dimension attribute used in collection.xconf.

heading The heading to display above the facet values.

max Maximum number of facet values to be displayed initially. More can be shown if the user clicks on the Show All checkbox. Pass an empty sequence, i.e. (), to not limit the number and hide the checkbox.

hierarchical Defines if the facet is hierarchical, which means that only the top-level facet values in the hierarchy will be shown initially. If the user selects one top-level value, the interface will expand and show the sub-categories. For this to work, the facet must be configured as "hierarchical" in collection.xconf.

output A function which can be used to process the facet value before display. The facet label will be replaced by whatever the function returns.
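The interplay of counting, drill-down and the display settings can be illustrated outside of XQuery. The following JavaScript sketch is not TEI Publisher's implementation: the functions countFacetValues and visibleFacetValues and the sample documents are invented for this example, values are simply listed in order of first appearance, and null stands in for the empty sequence (). Unlike eXist, which takes the counts from the index, the sketch computes them on the fly.

```javascript
// Count how many documents carry each value of a facet dimension.
function countFacetValues(docs, dimension) {
  const counts = new Map();
  for (const doc of docs) {
    const value = doc[dimension];
    counts.set(value, (counts.get(value) ?? 0) + 1);
  }
  return counts;
}

// Apply the display settings "max" and "output" to the counted values.
function visibleFacetValues(config, counts, showAll = false) {
  const format = config.output ?? ((label) => label);
  const entries = [...counts].map(([label, count]) => ({
    label: format(label),
    count,
  }));
  // No limit if "max" is unset or the user asked to see all values.
  return showAll || config.max == null ? entries : entries.slice(0, config.max);
}

// Invented sample data: three documents with two facet dimensions each.
const docs = [
  { genre: "Prose", language: "de" },
  { genre: "Poetry", language: "en" },
  { genre: "Prose", language: "la" },
];

const languageFacet = {
  dimension: "language",
  max: 2,
  output: (label) => ({ de: "German", en: "English", la: "Latin" }[label] ?? label),
};

// Initially three languages occur, but only "max" of them are listed.
const initial = visibleFacetValues(languageFacet, countFacetValues(docs, "language"));

// Drill-down: selecting the genre "Prose" shrinks the set, so "en" disappears.
const prose = docs.filter((doc) => doc.genre === "Prose");
const afterDrillDown = visibleFacetValues(languageFacet, countFacetValues(prose, "language"));

console.log(initial, afterDrillDown);
```

Passing true as the third argument corresponds to the user ticking the Show All checkbox.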

Embedding TEI Publisher in other systems

Since version 6.0, all pb-components can be used outside TEI Publisher itself. The components can be embedded into any environment, e.g. a CMS or blog software (like WordPress or Drupal), or integrated into any modern front-end framework (like Vue, React or Angular). All that is needed is a TEI Publisher instance available on the web which stores the source TEI and provides a communication endpoint for the components to talk to.

Live Examples

The embedded example below demonstrates such a use case: it provides a sandbox running on codepen.io but communicates with the TEI Publisher instance on teipublisher.com which stores the documents. The magic happens in the endpoint attribute passed to <pb-page>, which tells the components where to talk to:

You can actually edit the code above: for example, try to change the path for the first document to test/F-rom.xml and the odd to shakespeare. See how the live view changes? And if you would like to read Romeo and Juliet in two-column mode, just add column-separator=".tei-cb" to the main <pb-view>.

Retrieving the whole document as a simple HTML

Embedding the result of applying a Processing Model transformation to a document is even simpler. Behind the scenes, TEI Publisher has a separate library part, which is essentially an implementation of the TEI processing model. This library can be used independently to retrieve the entire content of a TEI document as HTML, transformed through an ODD with processing instructions. All you need is a small XQuery which calls the library modules, setting the correct source document and ODD. Fortunately, TEI Publisher already contains a boilerplate XQuery script for this job, which you can call as follows in your browser:

    https://teipublisher.com/exist/apps/tei-publisher/api/document/test%2FF-rom.xml/html?odd=shakespeare.odd

This will retrieve the content of Shakespeare’s Romeo and Juliet as an HTML page, transformed through the ODD shakespeare.odd. For embedding an entire document in an iframe or similar, this should already be enough.

Please note that the / character in the document path test/F-rom.xml has to be URL-encoded as %2F.
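When such URLs are built by a script, the encoding does not have to be done by hand. The following JavaScript sketch assembles the URL shown above from its parts. The function name documentHtmlUrl is hypothetical, and the path layout is taken from the example URL rather than from a formal API description.

```javascript
// Hypothetical helper: builds the URL for retrieving a whole document as HTML.
// encodeURIComponent turns the "/" in the document path into "%2F".
function documentHtmlUrl(endpoint, docPath, odd) {
  const encodedPath = encodeURIComponent(docPath);
  const query = new URLSearchParams({ odd }).toString();
  return `${endpoint}/api/document/${encodedPath}/html?${query}`;
}

const url = documentHtmlUrl(
  "https://teipublisher.com/exist/apps/tei-publisher",
  "test/F-rom.xml",
  "shakespeare.odd"
);
console.log(url);
// prints "https://teipublisher.com/exist/apps/tei-publisher/api/document/test%2FF-rom.xml/html?odd=shakespeare.odd"
```

The resulting string can be used directly as the src of an iframe.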

Embedding webcomponents for navigation

For longer documents, embedding the entire content in a page may not be very user-friendly. A better way is to use the library of webcomponents provided by TEI Publisher. This way, we can show the content page by page or division by division, allowing the reader to navigate between sections.

Because webcomponents are part of the HTML5 standard and supported natively by most modern browsers, we can easily import the component library which is at the core of the TEI Publisher app and reuse the components it provides in other contexts. They should work in any HTML5 page, no matter if it was written by hand or is generated by PHP, Python or a CMS. For a start, the page should import two scripts in its header:

This imports the necessary pb-components libraries from the unpkg.com CDN, which is considered best practice for web sites. The second <script> tag imports all the components provided by TEI Publisher. Note that here pb-components@latest points to the latest available bundle, but you could specify a concrete version number to make sure your website uses a fixed release.

Now let’s actually use the components to display Shakespeare’s Romeo and Juliet: in the HTML <body>, include the following snippet:

In the <pb-page> endpoint attribute we need to provide a critical piece of information: the URL of the TEI Publisher instance for all the components to communicate with. If you have set up your own instance of eXist-db and TEI Publisher, you should change the URL to point to your instance. This is important because the components will expect the documents you want to display to be stored in the same instance.

<pb-document> defines the document to be displayed. The path is relative to the data root of the TEI Publisher instance. It also specifies the ODD to be used for the transformation.

<pb-view> is the main component for displaying the transformed content. It references the <pb-document> to use as source in its src attribute. The Shakespeare edition tags page breaks, so we switch to page-by-page view via the view attribute to show the user only one page at a time. The default would be to use a division-by-division view (view="div"), but you could also request the entire content at once using view="single".

<pb-navigation> adds forward/backward navigation buttons to the page, allowing the user to switch to the next/previous page of the document. You can use various types of buttons, but in this case we’re choosing a <paper-button> element with a chevron <iron-icon> (both <paper-button> and <iron-icon> are part of the standard Polymer elements library).

Another example of embedding TEI Publisher web components into a static website can be found on our demo blog. It was created with Hugo, a very popular website generator which offers hundreds of ready-made themes to choose from. Here content is written in Markdown, but it’s not a problem to embed TEI content served by TEI Publisher.
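Since the components are configured purely through attributes, markup of the kind described above can also be generated by a script, for instance during a static site build. The following JavaScript sketch assembles such a snippet as a string. The helper names, the id value document1 and the attribute names path and direction are assumptions based on the description above and should be checked against the pb-components API documentation before use. The navigation buttons are reduced to plain text to keep the example short.

```javascript
// Escape characters that are special inside a double-quoted HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Hypothetical helper: returns markup for a paginated document view.
// Attribute names other than endpoint, src and view are assumptions.
function embedSnippet({ endpoint, path, odd, view = "div" }) {
  const attr = escapeAttribute;
  return [
    `<pb-page endpoint="${attr(endpoint)}">`,
    `  <pb-document id="document1" path="${attr(path)}" odd="${attr(odd)}"></pb-document>`,
    `  <pb-navigation direction="backward">Previous</pb-navigation>`,
    `  <pb-navigation direction="forward">Next</pb-navigation>`,
    `  <pb-view src="document1" view="${attr(view)}"></pb-view>`,
    `</pb-page>`,
  ].join("\n");
}

const snippet = embedSnippet({
  endpoint: "https://teipublisher.com/exist/apps/tei-publisher",
  path: "test/F-rom.xml",
  odd: "shakespeare",
  view: "page",
});
console.log(snippet);
```

Omitting view falls back to "div", which matches the division-by-division default mentioned above.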

Creating Custom Web Components

In some cases an app may need to add its own web components to the collection provided by TEI Publisher. Please consider following an open-source-first approach, as we do, and contributing your component back to the community for others to use. For that reason, we recommend a setup here which leads to relatively smooth integration with pb-components in the future.

Start by forking and downloading the pb-extension-template package from its GitHub repository. You will find it already includes several configuration files to get you started on development.

Figure 26: Visual Studio Code overview of pb-extension-template

• package.json file which includes definitions and specifications for common tasks, like generating documentation or running a web server for local development

• src directory to put your custom web component code. It already includes pb-clipboard.js file as an example of a simple component

• index.html file which illustrates the simplest scaffolding for using your new components

• demo directory to place your demo files for testing and documentation purposes. It already contains the demo.js configuration file and the pb-clipboard.html demo for <pb-clipboard>.

• api.html file serving as a documentation starting point for your package

• rollup.config.js provides building specification to create a bundle that can be included in any web page.

To get started, run npm install. This will install the required dependencies: above all pb-components, but also the es-dev-server and web-component-analyzer packages, which are needed for local development and documentation.

If you are using Visual Studio Code, .vscode/component.code-snippets will provide you with code templates. If you create a new file in src, e.g. pb-foo.js, and start typing litelement, you will be offered a LitElement template. Otherwise you can copy the template manually, or copy the existing <pb-clipboard> element and tweak it.

Figure 27: Visual Studio Code snippet

pb-clipboard example

We will discuss the details of the <pb-clipboard> element to illustrate some basic concepts. You will find the exact code of this example in pb-extension-template/src/pb-clipboard.js.

Each custom component has a defined API: an interface it presents to the outside world. The API includes properties, methods (also called functions) and events, while the component itself bundles HTML markup along with local CSS and JavaScript into a single file.

Our example is a custom element designed to provide simple copy-to-clipboard functionality, helpful e.g. when providing a ready-made citation on a page. It consists of three parts: a label, the content to be copied and a button to copy. Thus, the render function, which actually displays this custom component, could look like this:

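    render() {
        return html`
            <h3>${translate(this.label)}</h3>
            <div>
                <slot></slot>
                <paper-icon-button icon="icons:content-copy" @click="${this._copy}"
                    title="${translate('clipboard.copy')}"></paper-icon-button>
            </div>
        `;
    }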


You will note that the render function uses not only regular HTML elements like <h3> or <div> but also components from the paper-* packages. To be able to use them, we need to import them explicitly. The same is true for the LitElement class itself and for an important interface from the pb-components package: pbMixin. All user interface components may need to be localized. For that reason we also import the translate method from TEI Publisher's i18n module.

LitElement and pbMixin must be imported for all custom components extending pb-components. The code listing below demonstrates how to correctly import all required classes and custom elements, as well as how to create a class signature that extends pbMixin.

    import { LitElement, html, css } from 'lit-element';
    import { pbMixin } from '@teipublisher/pb-components/src/pb-mixin';
    import { translate } from '@teipublisher/pb-components/src/pb-i18n';
    import '@polymer/paper-icon-button';
    import '@polymer/iron-icons';

    /**
     * A component with a button which copies the contained content to the clipboard.
     * Use for the typical 'quote this content as' hints on a webpage.
     *
     * @slot content - contains the actual content to copy to the clipboard
     */
    export class PbClipboard extends pbMixin(LitElement) {

With the imports and the render function sorted, there are two other static functions we need to take care of: properties and styles. pb-clipboard has just a single property: the label to display above the copy text. Nevertheless, it needs to explicitly declare the properties inherited from pbMixin, which is done via the ...super.properties notation.

    static get properties() {
        return {
            /**
             * Label to display above the text to be copied
             */
            label: {
                type: String
            },
            ...super.properties
        };
    }

You probably noticed that the button we added in the render function specifies what to do upon a click event via the @click attribute: @click="${this._copy}". In that case the protected _copy function of the element is called; for our simple pb-clipboard element it provides the core copy-to-clipboard functionality.

    /**
     * Copy text content from the <slot> to the clipboard
     */
    _copy() {
        const slot = this.shadowRoot.querySelector('slot');

        // first import nodes from the slot into a temporary div
        const content = document.createElement('div');
        slot.assignedNodes().forEach((node) => {
            content.appendChild(document.importNode(node, true));
        });

        // copy the innerText of the temp div into the clipboard
        navigator.clipboard.writeText(content.innerText);
    }

We glossed over yet another interesting function invoked in the render method: translate, which accepts as its argument a key identifying the corresponding label in the i18n language files. It could be one of the keys shipping with TEI Publisher, but for a new component we need a more specific label: a short, informative text to be displayed when hovering over the copy-to-clipboard button. Obviously, the label should change in line with the language setting for the whole application, which is why we need the i18n module in the first place.

    title="${translate('clipboard.copy')}"

Please refer to the chapter on i18n for an in-depth discussion of the subject. Here we will just mention that additional language files for new components should be placed in i18n/app, mimicking the location and format of the language files shipped with Publisher.
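To make the relation between a key such as clipboard.copy and the nested language file more tangible, here is a minimal sketch of a dotted-key lookup. It is not the code of the pb-i18n module: the lookup function, the fallback to the key itself and the in-memory messages object are assumptions made purely for illustration.

```javascript
// Hypothetical stand-in for a loaded language file.
const messages = {
  clipboard: {
    label: 'Quote as:',
    copy: 'Click to copy to clipboard'
  }
};

// Resolve a dotted key such as 'clipboard.copy' against the nested object.
// Returning the key itself when nothing is found is an assumption of this sketch.
function lookup(dictionary, key) {
  let node = dictionary;
  for (const part of key.split('.')) {
    if (node === null || typeof node !== 'object' || !(part in node)) {
      return key;
    }
    node = node[part];
  }
  return typeof node === 'string' ? node : key;
}

console.log(lookup(messages, 'clipboard.copy'));
console.log(lookup(messages, 'clipboard.missing'));
```

Each segment of the key selects one level of nesting, which is why the entries for a component are grouped under a common prefix. The entries used by pb-clipboard are listed next.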

    {
        "clipboard": {
            "label": "Quote as:",
            "copy": "Click to copy to clipboard"
        }
    }

And one last job is a little bit of styling to make things pretty.

    static get styles() {
        return css`
            :host {
                display: block;
            }
            h3 {
                margin: 0;
                font-size: .85em;
                font-weight: normal;
                color: #3a3a3a;
            }
            div {
                display: flex;
                align-items: center;
                padding: 0 16px;
            }
        `;
    }

The final directive at the very bottom of pb-clipboard.js is necessary to register the custom element with the browser.

    customElements.define('pb-clipboard', PbClipboard);
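The role of ...super.properties in the properties getter is easier to see in isolation. The sketch below uses stand-in classes instead of LitElement and pbMixin so that it runs without a browser; FakeElement, fakeMixin and the channel property are made-up names for illustration only.

```javascript
// Stand-in base class; in a real component this role is played by LitElement.
class FakeElement {
  static get properties() {
    return {};
  }
}

// Hypothetical mixin in the style of pbMixin: it contributes its own
// property declarations on top of whatever the base class declares.
const fakeMixin = (Base) => class extends Base {
  static get properties() {
    return {
      ...super.properties,
      channel: { type: String }
    };
  }
};

class MyClipboard extends fakeMixin(FakeElement) {
  static get properties() {
    return {
      label: { type: String },
      // without this spread, 'channel' would be missing from the declaration
      ...super.properties
    };
  }
}

console.log(Object.keys(MyClipboard.properties));
```

Because a static getter in a subclass replaces the inherited one, the spread is what carries the mixin's declarations over into the component.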

Testing pb-clipboard

Have a look at the supplied index.html example. Run npm start to launch a simple local server instance, which lets you test your work as you go, without bundling. Opening index.html through this server, you can see <pb-clipboard> in action.

[index.html listing: page title and heading "Using pb-clipboard"; the <pb-clipboard> element wraps the citation text below]

    John Doe: "The miracles of foobar", Paradise Publishers, Little Village, Stardate 46254.7


In order to use the element, you need to import the component code into your HTML page, as demonstrated by the <script> element with src="src/pb-clipboard.js". That is all that is needed to use <pb-clipboard> like any other HTML element. Its label property is specified via the label attribute, and the text to copy is simply the text content of the element. Pressing the button copies the citation text from the page; to test it, you can paste it into the input box below.

API documentation

Implementing a new feature is only half of the task - and sometimes the easier half. All pb-components come with their API documentation, and most of them also with one or more demo showcases. Have a look, for example, at the <pb-collapse> component for a simple one, and compare it with the much more elaborate sections devoted to <pb-toggle-feature> or <pb-select-feature>.

API documentation is generated automatically by the npm docs task. You just need to add documentation-style comments to your code. The example below illustrates how to nest code examples or specify slots and events so that they are picked up by the documentation generator.

    /**
     * This is a documentation-style comment
     *
     * with code example
     * ```html
     *
     *    Foo bar
     *
     * ```
     * slots specification:
     *
     * @slot - unnamed default slot
     * @slot foo-content - content to be shown when foo happens
     *
     * events specification:
     *
     * @fires pb-foo-open - Fires opening the foo section
     */


Demo

We encourage you to prepare a demo entry for each newly created component. Ideally, a demo presents one or more use cases which illustrate where and how you would use the component. Keep the demo page simple and minimal, so that other elements do not get in the way of understanding the code.

Save your demo files in the demo directory and register each of them in demo/demos.json with the demo filename and your chosen label for the tab (but please stick to Demo if you only have one).

    "pb-collapse": {
        "demo/pb-collapse.html": "Demo"
    }

If you supply a number of demo files, choose meaningful labels for the tabs, e.g.

    "pb-toggle-feature": {
        "demo/pb-toggle-feature.html": "Server-side with pb-view",
        "demo/pb-toggle-feature2.html": "Client-side",
        "demo/pb-toggle-feature3.html": "Server-side with pb-load"
    }

Bundling and distribution

Any new component must be explicitly imported by pb-extension-bundle.js.

    import './src/pb-clipboard.js';

Run the build:production npm script (npm run build:production) to generate a distribution bundle. The generated library (located in dist) will include everything, including the version of the pb-components library you are building upon and all dependencies. As a result, you can use it as a drop-in replacement for the pb-components package: in your custom project, for TEI Publisher itself or for Publisher-generated custom apps.

Using pb-extension-bundle in TEI Publisher or other apps

The created library can be used as a drop-in replacement for the default pb-components library. To do so:

1. clone tei-publisher-app or the generated app you would like to modify

2. edit package.json and replace the dependency for @teipublisher/pb-components with the replacement library

3. edit build.properties and change scripts.dir to point to the replacement library

4. call ant to build tei-publisher-app

For example, to use the git source of pb-extension-template in package.json, change the dependencies as follows:

    "dependencies": {
        "@teipublisher/pb-extension-template": "git+https://github.com/eeditiones/pb-extension-template#master"
    }

Then change build.properties to contain:

    scripts.dir=node_modules/@teipublisher/pb-extension-template/dist

Building tei-publisher-app should then copy scripts and resources from pb-extension-template instead of pb-components.

Make sure to adjust the repository link and the name of the extension module to the one you created. The example above assumes working directly with pb-extension-template, which is unlikely to be the case in real life.

Adding a custom vocabulary

As discussed in the opening chapter, publishing a corpus of documents online is much more than just transforming a single source document into the desired output format. To fully support a new XML vocabulary in TEI Publisher, several aspects need to be addressed:

• ODD with processing models

• default view and page template

• navigation, breadcrumbs and table of contents

• search and filtering: full text index definitions, facets and fields


The ODD and the processing models within it govern how a document in your new vocabulary is transformed into the range of available output formats: HTML, ePub etc. All other aspects are interconnected and depend on an understanding of what constitutes the basic unit of the text: TEI primarily considers divisions or pages, DocBook rather sections. Navigation in a TEI document will therefore switch between <div>s, or between XML fragments reconstructed between subsequent <pb>s, while talking about pages in DocBook documents makes no sense and <section>s are the main structural units.

This structure has consequences for further aspects: generating a TOC will analyze nested <div>/<head> structures in TEI, but <section>/<title> in DocBook. Likewise, the KWIC display will show matches in <div> context in TEI but in <section> context in DocBook, so Lucene indexes need to be defined on these elements in their appropriate namespaces. The same holds for facets and fields used for sorting and filtering. The sections below explain in more detail where and how to customize these aspects.

ODD

Create a new, blank ODD file which is not chained to any other ODD and add <elementSpec>s for the elements of your vocabulary. Make sure that you specify the correct namespace for your vocabulary. Optionally, you can also specify an external CSS file with style declarations for the classes and elements you will be using in the processing models of your ODD. See how it looks in the ODD for DocBook.

When adding models to the custom ODD for your vocabulary, it is recommended that at least one element applies the document behaviour. Usually it will be the top-level element or the main content-bearing one (like <article> in DocBook). This is not strictly required, but for print output via LaTeX or FO the document behaviour specifies the default prologue governing the basic setup for PDF. If you refer to docbook.odd you will note that the same effect is achieved by explicitly defining the prologue in the <pb:template> for article.

Case study: foo vocabulary

Let's consider an imaginary vocabulary called foo. All documents in this vocabulary belong to the http://foo.io namespace. A simple document could look as follows:

    My foo document About something very important for foo community.

Let's save this document in the playground collection as foo.xml. Unfortunately, a request to retrieve this document in Publisher fails with the error message "the server did not return any content":

    http://localhost:8080/exist/apps/tei-publisher/playground/foo.xml

Fixing this requires creating a new ODD for the foo vocabulary. Create a new ODD as described previously. Use the document behaviour for the <fooStart> element and perhaps <inline> with an <outputRendition> set to italic for the <foo> element. Make sure to specify the ODD for your vocabulary. A very bare-bones ODD file for the fictional foo vocabulary could look as follows. In XML form, the <outputRendition> for <foo> contains:

    font-style: italic;

With this in place we can reformulate our request to explicitly specify the ODD to use and a single view:

    http://localhost:8080/exist/apps/tei-publisher/playground/foo.xml?odd=foo&view=single

It is necessary to use the single view along with foo.odd. Otherwise, the app's default view would be used, which in TEI Publisher is normally set to div. As we already mentioned, the implementation of the view parameter needs to be vocabulary-specific to work. Since the foo vocabulary does not yet have its navigation customized, TEI Publisher will fall back to TEI and try to locate <tei:div> elements, which obviously cannot be found in our test document in the foo namespace.
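If such requests are assembled in client-side code rather than typed by hand, the standard URL API takes care of the query string. The following is only a sketch: documentUrl is a hypothetical helper, and the base address is the local instance used in the examples above.

```javascript
// Build a document URL with explicit rendering parameters (odd, view, ...).
function documentUrl(base, docPath, params) {
  const url = new URL(docPath, base);
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, value);
  }
  return url.toString();
}

const base = 'http://localhost:8080/exist/apps/tei-publisher/';
console.log(documentUrl(base, 'playground/foo.xml', { odd: 'foo', view: 'single' }));
```

Leaving out a parameter simply omits it from the query string, in which case the application defaults apply.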


Default view and page template

Rendering of the document is governed by a number of parameters, particularly view, ODD and template. Even when these parameters are not explicitly specified, TEI Publisher and the apps generated from it will fall back to the default values specified in modules/config.xqm.

• ODD: $config:default-odd

• view: $config:default-view

• template: $config:default-template

An alternative way to specify these is to use a processing instruction in the foo.xml document itself.

Lucene configuration

eXist-db and TEI Publisher make extensive use of the Lucene indexing engine. In particular, search, navigation, sorting and filtering heavily depend on full text indexes, facets and fields. It is therefore paramount to specify these correctly for your data collection. When supporting a new vocabulary, make sure to declare its namespace on the <index> element in collection.xconf.

Beyond this minor adjustment, creating and using facets and fields for a new vocabulary does not differ from the supported ones.

Navigation

We have extensively covered modifications to the ODD and page templates in earlier chapters. For vocabularies without out-of-the-box Publisher support it is necessary to customize the navigation as well. An implicit understanding of the document's structure is critical for many kinds of user interactions - from browsing through pages to creating the table of contents.

modules/navigation.xql is the main "control room" for all tasks related to navigation. You will note that it imports custom modules for all supported vocabularies: TEI, JATS and DocBook. All requests are dispatched to specialized modules, depending on the namespace of the document (cf. the config:document-type function).

    module namespace nav="http://www.tei-c.org/tei-simple/navigation";

    import module namespace tei-nav="http://www.tei-c.org/tei-simple/navigation/tei" at "navigation-tei.xql";
    import module namespace jats-nav="http://www.tei-c.org/tei-simple/navigation/jats" at "navigation-jats.xql";
    import module namespace docbook-nav="http://www.tei-c.org/tei-simple/navigation/docbook" at "navigation-dbk.xql";

Customizing a yet unsupported vocabulary requires the following steps:

• create a new navigation module for the new vocabulary (e.g. navigation-foo.xql); it should implement all the functions that navigation.xql dispatches to; you can use navigation-tei as a starting point for customization

• import it into navigation.xql

• adjust config:document-type function

• adjust nav:get-root
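The dispatch principle behind these steps can be pictured as a lookup from namespace to module. TEI Publisher implements it in XQuery; the JavaScript below is only a conceptual sketch. The TEI and DocBook namespace URIs are the standard ones and http://foo.io is the imaginary vocabulary from the case study, while the registry shape, the unit names and moduleFor are assumptions.

```javascript
// Registry of navigation modules keyed by namespace URI (illustrative only).
const TEI_NS = 'http://www.tei-c.org/ns/1.0';
const navigationModules = new Map([
  [TEI_NS, { type: 'tei', unit: 'div' }],
  ['http://docbook.org/ns/docbook', { type: 'docbook', unit: 'section' }],
  ['http://foo.io', { type: 'foo', unit: 'fooStart' }]
]);

// Pick the module for a document; unknown namespaces fall back to TEI,
// mirroring the fallback behaviour mentioned in the case study.
function moduleFor(namespace) {
  return navigationModules.get(namespace) || navigationModules.get(TEI_NS);
}

console.log(moduleFor('http://foo.io').type);
console.log(moduleFor('urn:example:unknown').type);
```

Registering a new vocabulary then amounts to adding one entry to the registry and providing the module it points to, which corresponds to the import and the config:document-type adjustment listed above.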

Search

Full text search follows the same modular approach that governs navigation. modules/query.xql is the main "control room", dispatching requests to functions in specialized modules. Study the implementation of query-db.xql or query-tei.xql before creating a dedicated module for your vocabulary. Make sure to import your module into query.xql.

Annotating Documents

Since version 7.1.0, TEI Publisher supports annotating TEI documents via a graphical, web-based interface. This means you can enhance existing TEI documents directly within TEI Publisher in a user-friendly environment where the XML code is neatly hidden from sight when not needed.


The annotation editor is not meant for creating documents from scratch but targets one of the most tedious and time-consuming tasks in any edition project: adding semantic, analytic or text-critical markup to an existing transcription. A typical workflow for many projects will resemble the following:

1. An initial transcription is created, either manually in a text editor, or using OCR or HTR tools like Transkribus. It contains the basic structural markup, i.e. divisions, headings, paragraphs etc.

2. That initial transcription is gradually enriched: textual features like emphasis can be marked, abbreviations expanded, corrections and regularizations applied; people, places or terms appearing in the text can be explicitly tagged and linked to an authority; dates or measures normalized. This is an iterative process: the resulting TEI encoding is constantly reviewed, which may result in further changes to be made.

Web annotations try to ease the burden of the enrichment phase by allowing editors to mark everything directly within a user-friendly environment, closely resembling the publication view. It is much easier to read and doesn't require half as much TEI experience as using an XML editor. Just use your mouse to highlight the text fragment to annotate, click on the annotation type and optionally select additional information, e.g. to link with an external authority file.

Prerequisites and considerations

• Browsers: The annotation editor requires a modern browser like Firefox, Chrome or a Chrome-based browser like the newest Edge on Windows 10. Safari will not work due to a known bug in text selection, which Apple seems reluctant to fix. Directly exporting the resulting text to a directory of the user's choice is currently only supported in Chrome.

• Division size: The editor should work fine for texts up to the length of a journal article or chapter. Really long texts (e.g. a whole book) will be automatically split by top-level divisions, but you may still experience a noticeable time lag for long divisions.


Annotations have been extensively tested, and should work reliably for tagging entities and other inline markup. You may want to keep an eye on the resulting TEI when testing annotations on your material. We appreciate bug reports and suggestions for improvement.

Annotations: the pen and paper of the digital world

How the annotation editor works is easy to understand if you imagine reviewing a text the old-fashioned way: print it out, then use a marker to highlight certain passages, write corrections over the text with a pen, or scribble notes into the margin. Once back at your computer, all the changes need to be transferred into the TEI. You may then continue with a second round of review by printing out the results again.

Within our electronic environment, the printout corresponds to the source TEI document and the marks and scribbles are annotations. Until you explicitly merge the annotations into the source TEI, they will be kept separate from the document they apply to. Once merged, a new version of the source TEI document is established and the editing process restarts.

Notably, TEI Publisher is able to detect existing markup in the base text when it is loaded. Any markup corresponding to a known annotation type will be recognized and displayed as such, and can therefore be modified. In this sense it is slightly different from working with pen and paper.

From the printout example it should be clear that you cannot change the printed text itself (unless you had some magic paper). Accordingly, annotations are not allowed to change the base text, i.e. the text "printed" on the screen. If you want to explicitly indicate a correction, you can use the corr or reg annotation type, which corresponds to the TEI encoding with choice/sic/corr or choice/orig/reg.

You can use annotations for virtually any kind of inline markup. The only strict requirement is that the base text, our source document, remains stable during an editing session, i.e. until annotations are merged. The reason is that annotations are anchored to a certain position in the document, so any change to it would mean that our coordinates might suddenly point to different fragments.
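A small sketch may help to see why position-anchored annotations depend on a stable base text. The annotation record below (a type plus character offsets) is a hypothetical structure chosen for illustration; it is not the internal format used by TEI Publisher.

```javascript
const baseText = 'Letter from William Graves to his sister.';

// Hypothetical annotation: character offsets into the base text plus a type.
const annotation = { type: 'person', start: 12, end: 26 };

// Return the text fragment an annotation currently points to.
function annotatedFragment(text, a) {
  return text.slice(a.start, a.end);
}

console.log(annotatedFragment(baseText, annotation));

// The same offsets applied to an edited base text no longer select the name.
const editedText = 'A letter from William Graves to his sister.';
console.log(annotatedFragment(editedText, annotation));
```

The second call still returns a fragment of the expected length, but it is shifted by the two characters inserted at the start, which is exactly the drift the stability requirement prevents.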
That said, there is also a special edit annotation type, which does allow you to perform limited editing of a text. It will "seamlessly" change the transcription text fragment without adding any explicit markup. Nevertheless, such a change to the TEI document is only applied when you merge and thereby establish a new base text. The limitation is that only text fragments can be edited: the boundaries of structural markup or annotations may not be crossed. This can be convenient for fixing a typo in a transcription that wasn't caught when the base text was prepared.

You also cannot use the annotation editor to modify the structural markup of the base text, i.e. block-level elements like divs, paragraphs, headings or notes. In case you need to change the structure, you have to merge your annotations first, saving a new version of the TEI document, and switch to an XML editor to change the XML there. The new document can then be reloaded into the annotation editor and you can continue.

As already indicated, it is possible to handle common nested encoding structures like <choice> or <app> as annotations. Nevertheless, the inner elements, i.e. <lem>, <rdg>, <sic>, <corr> or <orig>, <reg>, need to have simple text content: they cannot be annotated themselves (at least not in the current version).

Applying Annotations

To start annotating a document, select or upload one into the annotations collection in TEI Publisher. Selecting any document in this collection will load the annotation view. This view is split into three areas:

1. a sidebar to the left, initially showing only a toolbar with a disabled button for each annotation type

2. the annotation view in the middle, displaying the text content to be annotated

3. a preview panel to the right: this is initially empty, but will come to life once you start making annotations

To annotate a certain passage, select it with your mouse. The toolbar buttons in the left sidebar now become active. Choose an annotation by clicking the corresponding button or using a keyboard shortcut. Hovering over a button shows an info text with a brief description and the keyboard shortcut. Depending on the chosen annotation type, a form will appear beneath the toolbar. We distinguish three categories of annotations:

1. semantic annotations, which are (usually) connected to an external authority, e.g. for people, places, organizations or terms.

2. "toggle" annotations, which do not require additional information, so they simply switch an annotation on/off, e.g. for deletions or titles (not part of the default annotation editor setup).

3. analytic or text critical annotations, which may require further input from the user, e.g. a normalized date.


These categories are not mutually exclusive but may be helpful to explain the main functional differences between the three use scenarios.

Semantic Annotations

Semantic annotations include markup for entities like people, places or terms. Most editions will try to link those to a description of the entity, either provided by an external authority file or from a local resource. In any case you will want to reference a unique identifier to mark occurrences of the same entity throughout the edition.

When choosing a semantic annotation from the toolbar, you will thus see a dialog which runs a query against the authority file configured for the annotation type, e.g. Geonames for places or GND for people. The semantic annotation dialog displays the query results for you to choose from; the query sent initially corresponds to the text selection made. Often this may not find the relevant entity, but you can change the query in the input field at the top of the dialog and run the search again by pressing enter. Compare the screenshots above and below to note the difference when searching for William vs William Graves.

When you use the + sign to select an authority entry, its ID will be copied into the top input field in the left sidebar and a short description of the entity should appear beneath. The ID will be used as a reference in the resulting TEI, i.e. it will appear in an attribute attached to e.g. <persName> or another TEI element. The exact mapping between an annotation type and the TEI snippet produced can be freely configured (see below). Semantic annotations are stored in the annotation list immediately after you select an authority entry, so you don't have to click anywhere else.

Batch edits and suggestions

It is not uncommon that the same person is mentioned multiple times in a document. To simplify and speed up the editing process, the annotation editor tries to detect possible other occurrences of the currently selected entity throughout the document. A list of potential matches is presented at the bottom of the left sidebar.

The list is compiled by traversing the text shown on screen, searching for strings which may indicate the same entity. The actual query depends on the information provided by the authority used: e.g. if the authority supplies alternative names, those will be added to the strings searched for.

Each occurrence will show a checkbox to the left. Checking it will immediately tag the corresponding passage as another instance of the same entity. You can also check all boxes at once by clicking on the icon to the top right above the list. Moving your mouse over an entry will scroll the text to the corresponding occurrence, showing you the match in its context.
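The idea of compiling such a list can be sketched as a plain string search over the displayed text. This is a simplified illustration under stated assumptions, not the editor's actual algorithm: findOccurrences is a hypothetical function, matching is exact and case-sensitive, and longer names are preferred so that a short form inside a longer match is not reported twice.

```javascript
// Find non-overlapping occurrences of any of the given names in the text.
function findOccurrences(text, names) {
  const matches = [];
  // Search longer names first so that 'Ronald Reagan' wins over 'Reagan'.
  const byLength = names.filter((n) => n.length > 0).sort((a, b) => b.length - a.length);
  for (const name of byLength) {
    let index = text.indexOf(name);
    while (index !== -1) {
      const end = index + name.length;
      const overlaps = matches.some((m) => index < m.end && end > m.start);
      if (!overlaps) {
        matches.push({ start: index, end, match: name });
      }
      index = text.indexOf(name, end);
    }
  }
  // Report matches in document order.
  return matches.sort((a, b) => a.start - b.start);
}

const text = 'Ronald Reagan spoke first. Reagan then left.';
const occurrences = findOccurrences(text, ['Reagan', 'Ronald Reagan']);
console.log(occurrences.map((m) => text.slice(m.start, m.end)));
```

Each match carries the offsets needed to scroll to it or to tag it, which corresponds to the checkbox list described above.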


Existing annotations referencing the same entity will also be shown in the list - with their checkbox checked. Sometimes you may also see an entry with red dots beneath, indicating that this is an existing annotation referencing a different entity, which is therefore flagged as a potential inconsistency to review. For example, if you selected the authority entry for "Ronald Reagan", you may see other occurrences of "Reagan" which have already been annotated as "Nancy Reagan". You will see those occurrences underlined with red dots. You can switch them to "Ronald Reagan" by checking their box. Unchecking a box will remove the existing annotation, e.g. one that was applied incorrectly.

Toggle Annotations

Toggle annotations are applied immediately when you click the annotation button, since they do not require any further input. Good candidates might be titles or deletions.

It is worth noting that it is not the textual phenomenon that makes an annotation a toggle rather than a semantic or analytic one in our simple category scheme. It is rather the level of detail a particular project decides to encode. Encoding of titles can be a toggle if we decide to simply state that a text fragment is a title. It could become a semantic annotation if we would like to associate it with a bibliographical reference or other detailed information.

Analytic and Text Critical Annotations

These annotations are not connected to an authority reference, but typically require further input from the user. For example, a <date> annotation requires the editor to supply a normalized date, while normalizations, abbreviations and corrections need the regularized, expanded or corrected form, respectively. Apparatus entries will require all the variant readings.

For all these annotations an input form will be shown into which such information can be entered. The exact content of the form depends on the annotation type. To store the annotation, you must click on the save button beneath the form after filling in the required fields.

Modifying Annotations

The annotation view shows a colored line beneath the text spanned by each annotation. You can select text within an existing annotation, or across multiple different ones, and apply yet another annotation to it, thus creating nested annotations. However, the editor will ensure that you do not select partial elements, for example a range of text followed by half of a <persName>. This would result in invalid XML, so the editor will automatically expand your selection to include the entire <persName>.

To the right of the annotated text, a colored box indicates the type of the annotation. The sequence of boxes also indicates how annotations are nested: the ones appearing to the right wrap those to the left. In the example above the "church of San Francisco" is a reference to a place, while "San Francisco" refers to a person, a patron saint of the church. The <persName> element is nested in the <placeName>.

Click on the colored box to see details about the annotation. For semantic annotations, the details will include a summary of the entity drawn from the connected authority or the local TEI register. For analytic and text critical annotations, a table is shown, listing the attributes associated with the annotation.

The toolbar at the bottom of the popup allows you to delete or modify the annotation. Clicking the delete button will immediately remove the annotation, without asking for confirmation. Clicking on the pen icon will open the annotation in the left sidebar, where you can change it. For semantic annotations, the authority lookup dialog will pop up automatically and you can change the connected authority entry by selecting a different one. Again, semantic annotations are applied immediately, while analytic and text critical annotations require a click on the save button after making modifications.
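The automatic expansion of a selection that cuts through an existing annotation, as described at the start of this section, can be pictured with character ranges. The following sketch is an assumption about the principle only: expandSelection and the range objects are hypothetical, and the editor itself works on the document structure rather than on plain offsets.

```javascript
// Expand a selection until it no longer partially overlaps any annotation.
// Selections fully inside an annotation, or fully containing one, are fine:
// they lead to properly nested markup.
function expandSelection(selection, annotations) {
  let { start, end } = selection;
  let changed = true;
  while (changed) {
    changed = false;
    for (const a of annotations) {
      const overlaps = a.start < end && a.end > start;
      const contained = a.start >= start && a.end <= end;
      const containing = a.start <= start && a.end >= end;
      if (overlaps && !contained && !containing) {
        start = Math.min(start, a.start);
        end = Math.max(end, a.end);
        changed = true;
      }
    }
  }
  return { start, end };
}

const text = 'the church of San Francisco';
const annotations = [
  { type: 'place', start: 4, end: 27 },   // "church of San Francisco"
  { type: 'person', start: 14, end: 27 }  // "San Francisco"
];

// The user selects "of San", cutting the person annotation in half.
const expanded = expandSelection({ start: 11, end: 17 }, annotations);
console.log(text.slice(expanded.start, expanded.end));
```

After expansion the selection contains the whole person annotation and lies inside the place annotation, so a new annotation applied to it nests cleanly between the two.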

The Preview Pane

Whenever you add, remove or edit an annotation, the preview pane to the right will show you an updated preview of your changes. It shows

1. the resulting HTML, i.e. how your document would display in TEI Publisher if the annotations were merged, rendered through the ODD associated with the document

2. the TEI of the entire document if annotations were merged

3. the internal JSON structure representing the annotations (for debugging purposes)

4. the TEI fragments which would be changed if annotations were merged

The preview does not actually change the TEI source document: it applies the annotations in memory only, leaving the source untouched. Its purpose is to let you check whether the annotations will be applied in the correct way.

The annotation text view in the middle and the preview HTML panel to the right both render the document through ODD. However, they use different ODDs: the annotation view renders the text through a dedicated ODD (annotations.odd) to make sure that all relevant bits of the base text are displayed and that no information is hidden and therefore inaccessible to the editor. The preview HTML panel simply uses the ODD associated with the document (or the base teipublisher.odd otherwise).

Merge, Export and Undo

The toolbar on top of the preview pane provides important actions:

Reload source TEI: This will reload the source TEI XML into the annotation view, discarding any existing annotations. Use this if you had to change the TEI in an external editor and want to continue annotating.

Merge and save annotations to TEI: Merges the current set of annotations into the TEI and saves it, establishing a new version of the base document. The annotation view will be reloaded to reflect the new base.

Save and export TEI to a local file (Chrome only): As above, merges and saves the annotations, but additionally prompts for a local file into which a copy of the resulting TEI will be written.

Undo last change: Reverts the last action applied. For batch operations (if you clicked the "apply all" button above the occurrences list) this may comprise multiple annotations. The annotation view will be reloaded and the remaining annotations are re-applied.

Preview merge results: Forces the preview pane to refresh.

Configuring the Annotation Editor

The annotation editor is fully configurable. This includes

1. the connectors used for authority lookups

2. how authority entries can be extended with additional information

3. how annotations are mapped to TEI elements

If you would like to customize the editor for your own needs, we recommend generating a separate application just for annotation purposes. This makes it easier to configure, and you maintain a clear separation between data which is still being worked on and the final edition you want to show to users. To generate a separate application for annotations, select Admin/App Generator. In the form, make sure to choose the ODD called "Annotations" and select "Annotation Editing" as template.

Configuring Authorities

In its current state, TEI Publisher supports the following external authorities for entity lookup:

GeoNames: geonames.org - for places

GND: Gemeinsame Normdatei via lobid.org - for people, organizations and terms

Metagrid: metagrid.ch - for people

Airtable: airtable.com - arbitrary entities, each corresponding to a table

KBGA: Karl Barth Gesamtausgabe - for people, organizations, terms, abbreviations

Custom: A custom connector, which delegates to one or more authorities, but also searches a local TEI register

The HTML template defining the user interface of the annotations editor determines which authority is used for which annotation type and can be easily adjusted. Just edit the templates/pages/annotate.html file, which can be done in eXide (or any editor of your choice). A simple configuration looks as follows:
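A minimal sketch of such a configuration, using only the attributes this section describes (name, connector, prefix and, for GeoNames, user). The annotation type names, the prefixes and the GeoNames user name are illustrative placeholders, and the exact position of these elements inside annotate.html is left open here:

```html
<!-- Hypothetical values: adjust names, prefixes and the user to your own setup -->

<!-- Person annotations are looked up in the GND; the prefix is prepended
     to the ID returned by the authority to form the xml:id -->
<pb-authority name="person" connector="GND" prefix="gnd"></pb-authority>

<!-- Place annotations are looked up in GeoNames, which requires the name
     of a registered GeoNames user in the user attribute -->
<pb-authority name="place" connector="GeoNames" user="my-geonames-user" prefix="geo"></pb-authority>
```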

Each top-level pb-authority needs at least two attributes:

• a name indicating the annotation type for which it should be used

• a connector, which selects one of the authority connectors listed above

In addition, you may define a prefix: this will be prepended to the xml:id referenced by the resulting TEI element. For example, you may want to prefix the numeric IDs received from GND to obtain an XML ID like "gnd-124507514", since a number alone would not be a valid @xml:id.

Some connectors require additional configuration attributes. At the moment those are the GeoNames and Airtable connectors. GeoNames requires you to register a user, whose name is given in the user attribute. The Airtable and Custom connectors are described in more detail below.

Airtable Connector

airtable.com is not an authority file as such, but rather a commercial online database, allowing users to define their own tables and forms. It can therefore be turned into an authority, e.g. by defining a people or places table and connecting it with the airtable connector. Since any kind of data can be stored in an airtable, a slightly more complex configuration is necessary to specify how to access relevant tables and extract the required information. The example configuration for a single table retrieves authority entries from a table called ’Topics’ using the columns Name, Variants, Definition and Term Type.

Figure 47: "Info" template in action

The configuration uses the following attributes and child templates:

api-key: The personal API key obtained from your airtable.com user profile

base: The API key for your database

table: Name of the table to use

fields: A list of table column names to be retrieved as fields. Fields can be referenced in other configuration expressions using the ${column} notation

label: The field or expression to use as main label when displaying authority lookup results

filter: The airtable formula to be used for searching

template class="info": An HTML fragment to show when displaying a popover with annotation details in the annotation editor

template class="detail": An HTML fragment to show as the detailed match description in the dialog presenting the results of querying the authority register

tokenize: Defines which columns should be used when searching the text for other potential occurrences of the entity. For the search, the column values are split using the regular expression given in tokenize-regex

tokenize-regex: An optional regular expression used to split values before searching the text for other occurrences. This is useful if you have variant names separated by , or ; and each variant should be searched for separately.
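To illustrate how these settings fit together, here is a hedged sketch for the ’Topics’ table mentioned above. Several details are assumptions rather than documented facts: the annotation type name "term", the comma-separated syntax of fields and tokenize, the regular expression, and in particular the ${key} placeholder standing for the search string in the filter formula. The key values are placeholders to be replaced with your own.

```html
<!-- Sketch only: keys are placeholders; the filter formula and the
     list syntax of fields/tokenize are assumptions -->
<pb-authority name="term" connector="Airtable"
    api-key="YOUR_API_KEY" base="YOUR_BASE_KEY" table="Topics"
    fields="Name, Variants, Definition, Term Type"
    label="${Name}"
    tokenize="Name, Variants" tokenize-regex="\s*[,;]\s*"
    filter="FIND(LOWER('${key}'), LOWER({Name}))">
    <!-- Shown in the popover with annotation details -->
    <template class="info">
        <h3>${Name}</h3>
        <p>${Definition}</p>
    </template>
    <!-- Shown as match description in the authority lookup dialog -->
    <template class="detail">
        <p>${Variants}</p>
    </template>
</pb-authority>
```

With tokenize-regex set as above, a Variants value such as "topic; theme, subject" would be split into three separate search terms.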

Custom Connector

The custom connector delegates queries to one or more authorities, but most importantly it creates a copy of every authority entry you select and writes it to a local TEI register. For many editions the information provided by an external authority will not be sufficient. It may be incomplete, in the sense that it lacks important information relevant in the context of the edition, and many historical persons or places may not yet appear in any authority at all.

Using the custom connector gives you the possibility to extend the information about an entity, or to add new entities. This works as follows:

1. whenever you select an entry from the external authority, the custom connector creates a skeleton <person>, <place> or <category> TEI element in the local TEI registry (the exact location of the registry can be configured) using the information provided by the authority

2. you can extend the local entry with additional information, e.g. add a note to explain why the person is important in the context of the edition

3. any subsequent authority query will also search the local registry and matches found there will be ranked at the top of the results

4. to support entities which do not exist in an external authority, just create an entry in the local registry and assign it a local xml:id

To configure the custom connector, simply wrap it around the definition for the external authority you want to use. Wrapping the GND definition for the annotation type ’person’, for example, defines a custom connector which delegates queries to the GND connector, but also maintains a local registry. By default the local registry is stored in a TEI document residing in data/registry.xml. You can change the location by modifying the variable $anno:local-authority-file in the annotation configuration, modules/annotation-config.xqm. You can also change the function anno:insert-point to configure where exactly the local entity definitions will be stored.
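The wrapping described above can be sketched as follows. This is a minimal illustration that assumes the connector value is spelled "Custom" as in the list of authorities and that the nested definition repeats the annotation type; the prefix is a placeholder.

```html
<!-- Outer element: custom connector for the annotation type "person",
     which also maintains and searches the local TEI registry -->
<pb-authority name="person" connector="Custom">
    <!-- Inner element: the external authority to which queries are delegated -->
    <pb-authority name="person" connector="GND" prefix="gnd"></pb-authority>
</pb-authority>
```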

Configuring Annotations

Adding new annotation types or modifying existing ones involves the following three files within TEI Publisher or generated apps:

templates/pages/annotate.html: The HTML page for the annotation editor

modules/annotation-config.xqm: Defines how annotations are mapped to TEI

odd/annotations.odd: The ODD used to render the annotation text view. In generated apps this will live in resources/odd/annotations.odd.

Let’s assume we would like to support an annotation to tag foreign language passages with the TEI <foreign> element.

First, open templates/pages/annotate.html in eXide and locate the <div> with class toolbar. This contains a list of <paper-icon-button> elements. Like many elements in TEI Publisher, <paper-icon-button> is a webcomponent from the Polymer library, which renders a button compliant with Google’s material design. You could use any other button-type element here, but for consistency, let’s stick with <paper-icon-button>.

Figure 49: Configure a new annotation type

We define our new button with the following attributes. Make sure to include all of them:

class: must at least contain annotation-action. The toggle class indicates that this is a toggling annotation, which just switches between on and off and does not need further input. For semantic annotations you can also add authority, which would trigger the authority lookup.

title: a label to be shown when the mouse is moved over the button

data-type: indicates the annotation type

icon: the material design icon to use

data-shortcut: an optional keyboard shortcut: here we use command-shift-b for Mac and ctrl-shift-b on Windows

disabled: initially disables the button

If you reload the annotation editor in the browser, you should now see your new button. However, TEI Publisher does not yet know how to translate the new annotation into actual TEI markup. To fix this, open modules/annotation-config.xqm and search for the function anno:annotations. This is a long switch statement with cases corresponding to the different annotation types. Add a new case to the top:

    declare function anno:annotations($type as xs:string, $properties as map(*), $content as function(*)) {
        switch ($type)
            case "foreign" return
                <foreign xmlns="http://www.tei-c.org/ns/1.0">{$content()}</foreign>
            ...
    };

This will simply wrap a <foreign> tag around the existing content. Note that $content is passed in as a function pointer, so it is important to call it as $content(). Also note how we need to declare the TEI namespace. Without this namespace declaration, the element would not be recognized as TEI.

Now we can try our new annotation: select a passage, click on the button and see how the text is marked as an annotation. The color is chosen automatically. You can also check in the TEI preview pane that the annotation is correctly output as a TEI <foreign> element.
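For reference, the toolbar button described in this walkthrough might be written as in the following sketch. The title text, the icon name and the exact notation of the keyboard shortcut are assumptions chosen for illustration; only the attribute names and the class values come from the description above.

```html
<!-- Toggle button for the new "foreign" annotation type.
     Title, icon and shortcut notation are illustrative assumptions. -->
<paper-icon-button class="annotation-action toggle"
    title="Foreign language"
    data-type="foreign"
    icon="icons:flag"
    data-shortcut="⌘+shift+b,ctrl+shift+b"
    disabled="disabled"></paper-icon-button>
```

The value of data-type is what the switch statement in anno:annotations receives as $type, so the two must match.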

However, if you save the annotations, you’ll notice that, once the annotation view refreshes, the foreign text is rendered in italics, but not marked as an annotation! To fix this, we need to modify the ODD used for rendering the annotation view. Open odd/annotations.odd (named "Annotations"), either in eXide or the ODD editor. Add an <elementSpec> for element <foreign> with an inline model and a cssClass containing annotation annotation-foreign. If you like, you can also keep the <outputRendition>, which renders the text in italic. The key point is that you declare the two CSS classes annotation and annotation-foreign, because those tell TEI Publisher that the element should be treated as an annotation.

As a general guideline, the ODD rules for annotations should produce simple inline tags. They should output the content of the element without any modification. They may use <outputRendition> to change the appearance, or supply parameters other than content.

After saving the ODD, reloading the annotation editor should properly display the <foreign> as an annotation and we can start using our new annotation type.

However, we may not be entirely happy with the simple on/off toggle: it would be much better if we could also indicate the language used in <foreign>. This means we have to change our annotation from a mere toggle into one which can take an additional parameter. To achieve this, go back to templates/pages/annotate.html and remove the toggle class from the <paper-icon-button>. We would also like to provide an input field into which the user can enter the language. Scroll down to where the <form> is defined and add another input on top.
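The two changes to annotate.html could look like the following sketch. The button title, icon, shortcut notation and the input label are illustrative assumptions; the class values and name="lang" follow the walkthrough.

```html
<!-- In the toolbar: the button for the "foreign" annotation,
     now without the toggle class -->
<paper-icon-button class="annotation-action"
    title="Foreign language"
    data-type="foreign"
    icon="icons:flag"
    data-shortcut="⌘+shift+b,ctrl+shift+b"
    disabled="disabled"></paper-icon-button>

<!-- At the top of the form: input field for the language.
     The class names the annotation type this input belongs to. -->
<paper-input class="annotation-form foreign" name="lang" label="Language"></paper-input>
```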

Again we’re using a webcomponent for the input to get the material design look and feel. The class attribute is particularly important: it must always contain annotation-form, followed by the annotation type: foreign. The name is arbitrary, but should be unique and we need to remember it for later. The label should contain a short title, which will be shown next to the input.

Finally, we need to change our rule in modules/annotation-config.xqm to add the new parameter to the generated <foreign> element. Navigate to the file and change the anno:annotations function as follows:

    declare function anno:annotations($type as xs:string, $properties as map(*), $content as function(*)) {
        switch ($type)
            case "foreign" return
                <foreign xmlns="http://www.tei-c.org/ns/1.0" xml:lang="{$properties?lang}">{$content()}</foreign>
            ...
    };

The function receives all the parameters entered into the form fields in the $properties map. In the form we used name="lang", so the corresponding value is retrieved from the map with $properties?lang.

That’s all. You should now be able to modify your existing foreign annotation, enter a language and see how it is output into TEI.

Other Considerations

Web annotations require a clear mapping between the HTML view of the text in the browser and the TEI document in the database. It is thus important to respect the following restrictions:

1. TEI Publisher’s page-by-page display of a text will not work in the annotation editor

2. Likewise, the fill parameter, which causes TEI Publisher to fill up short fragments with content from nested divs, should be set to 0 for annotation purposes.

3. The annotation editor can operate on documents with multiple divisions and will show them one after the other. However, those need to be top-level divisions. Therefore the depth parameter should be set to 1.

To ensure this, TEI Publisher sets the following configuration for the annotate collection in config.xqm:

    declare function config:collection-config($collection as xs:string?, $docUri as xs:string?) {
        switch ($collection)
            (: For annotations we need to overwrite document-specific settings :)
            case "annotate" return
                map {
                    "template": "annotate.html",
                    "overwrite": true(),
                    "depth": 1,
                    "fill": 0
                }
            default return
                (: Return empty sequence to use default config :)
                ()
    };

Generated applications should choose similar settings. If you create a separate app just for annotations, the following function would suffice to impose the right settings for all documents:

    declare function config:collection-config($collection as xs:string?, $docUri as xs:string?) {
        map {
            "template": "annotate.html",
            "overwrite": true(),
            "depth": 1,
            "fill": 0
        }
    };

Figure 28: LitElement template

Figure 29: pb-clipboard in action on the es-dev-server

Figure 30: Initial part of the API tab of pb-collapse documentation

Figure 31: Demo file for pb-clipboard

Figure 32: ODD for DocBook

Figure 33: ODD for Foo

Figure 34: foo.xml rendered with foo.odd in single view

Figure 35: Annotation editor

Figure 36: Annotation playground

Figure 37: Annotation view

Figure 38: Annotation types

Figure 39: Annotating a person

Figure 40: Re-running the query

Figure 41: Inconsistency warning

Figure 42: Correction

Figure 43: Nested annotations

Figure 44: Annotation details

Figure 45: HTML preview