
TEI Publisher Documentation

Wolfgang Meier, eXist Solutions GmbH, Magdalena Turska, eXist Solutions GmbH

Introduction

What TEI Publisher does ...

The motivation behind TEI Publisher was to provide a tool that enables scholars and editors to publish their materials without becoming programmers, but that does not force them into a one-size-fits-all framework. Experienced developers benefit as well: they write less code, avoid redundancy, and improve maintenance and interoperability, to name just a few advantages. TEI Publisher is all about standards, modularity, reusability and sustainability!

In Publisher, it all starts with your source documents, regardless of whether they are in TEI or another XML format: DocBook, MS Word (DOCX) or JATS. No matter how the source material has been encoded, it can be easily transformed into a range of output formats for publication, from a modern web page that you can open on your laptop or mobile device, to an ebook, a PDF file or its LaTeX source.

TEI Publisher derives its name from TEI and the TEI Processing Model (PM). The Processing Model is part of the TEI vocabulary and the TEI ODD specification format, described in the TEI P5 guidelines as well as in later chapters here. It defines how a TEI document should be rendered in different output formats and lies at the heart of TEI Publisher.

However, online editions require more than just a text transformation: the text needs to be embedded into an application, adding navigation, pagination, search, facsimile display and so on. The larger part of TEI Publisher deals with those aspects, providing all the necessary building blocks for an online edition.

Staying true to the spirit of code reuse and interoperability, TEI Publisher implements its functionality as small "lego" blocks to be freely arranged and recombined. The technology making this possible is called Web Components. It is part of the HTML5 specification and natively implemented by many browsers. Users don't need to dive into the details of this standard though: all you need to modify the example pages is some basic HTML knowledge.
Only where the available components are not enough does a new use case need to be described and suitable new components implemented; these can then be incorporated into the existing component pool for everyone else to use. After all, our mantra is reuse, reuse, reuse, and we want to turn TEI Publisher into a box of tools the entire community can benefit from.

Despite the elegant simplicity of this approach, various projects we realized in the past prove that TEI Publisher is:

1. powerful enough to cover complex transformation needs
2. a truly universal tool for any kind of digital edition
3. capable of generating high quality, camera ready material for book publishing
4. a sustainable and future-proof solution
5. suitable for any XML, not just TEI (this documentation is written in DocBook!)

e-editiones.org

Since its first incarnation in 2015, TEI Publisher has gained a substantial following, with numerous academic and commercial projects around the globe using it for their editorial and publishing needs. A grass-roots user initiative led in 2020 to the foundation of an international non-profit association, e-editiones.org, with a focus on further joint development of TEI Publisher, open standards and best practices for digital editions.

TEI Publisher development is only possible thanks to the generous contributions of developers, users and institutions willing to employ Open Source approaches so that the whole community can reuse and benefit from their work. A growing number of projects, from small to large, that have decided to publish their materials with TEI Publisher gives us all not only the opportunity but also the responsibility to make the project thrive for years to come and to make it a truly sustainable option for XML publishing! Consider joining e-editiones.org through your affiliated institution or individually to support our efforts.

We invite the community to contribute to the project by means of code, ideas, documentation, tutorials and funding. You don't have to be a developer to contribute; you can do so in a number of ways!

1. Check out the source code, modify it, document it, enhance it.
2. Create new or enhance existing example documents and ODDs so we can present showcases for various TEI applications.
3. Report your issues, feature requests or ideas for discussion via the GitHub issue tracker.
4. Discuss with us on the e-editiones Slack chat, through the mailing list or via the @EEditiones Twitter account.
5. Contribute to translations via Crowdin. Please contact us if your target language is not listed and you'd like to work on it.
6. Port your customizations back to the TEI Publisher code base so that others can use them too (or ask us to do it for you)!
7. Help and mentor others: publish teaching materials, answer questions on our Slack channel, mailing list and other forums.
8. Sponsor a concrete feature or fund a development grant.

Versions

TEI Publisher has been under active development since 2015. Once or twice a year a new major version is released, bringing important new features. Minor versions are released at shorter intervals and offer bug fixes, minor new features and improvements. The current major version of TEI Publisher is 7.0.0.

There's a long list of ideas and features we'd like to see incorporated into TEI Publisher: from wider coverage of input and output formats (e.g. InDesign), CQL and DTS support, to various editing workflows and support for efficient hosting and maintenance of multiple editions. These ideas are in various stages of development: some already advanced, some in a conceptual phase, some waiting for funding and implementation. Coordination of future work is primarily conducted by e-editiones.org.

What's new in TEI Publisher 7.0.0

Version 7 brought another major refactoring and restructuring of the TEI Publisher app, particularly regarding the server-side modules. TEI Publisher now exposes a well-defined, clear API specification following the Open API standard. This API is used in TEI Publisher by client-side UI components, but it can equally well be used by independent software which harnesses the functionality exposed via the API without being forced to rely on Publisher's client-side components.

On the server a new package, oas-router, reads the API specification and uses it to map HTTP requests to XQuery functions which perform the actual API operations. Other packages, particularly the UI web components (pb-components) and the TEI Processing Model library (tei-publisher-lib), underwent the changes necessary to communicate with the new API while retaining full backwards compatibility.

Beyond the structural changes, a number of reported issues have been fixed and broader test coverage has been introduced for all packages. A sophisticated CI setup based on Docker has been created, and an extensive test suite has been prepared for individual components. Furthermore, every API operation is independently tested against the specification to ensure that, for example, parameter and response types correspond exactly to the definition.

Thanks to community contributions via Crowdin, TEI Publisher is now available in 20 languages. See the chapter on Updating for information on how to update your app to take advantage of these developments.
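The idea of routing requests from an API description can be illustrated with a small toy sketch. It is written in Python rather than XQuery and is not how oas-router is implemented; the paths, operation names and handler functions are all hypothetical and chosen only for illustration. Each path template in the description names an operation, and the router looks up the matching function and passes the path parameters to it.

```python
import re

# Hypothetical excerpt of an API description:
# path template -> HTTP method -> operation name.
SPEC = {
    "/api/document/{doc_id}": {"get": "get_document"},
    "/api/collection/{path}": {"get": "list_collection"},
}


def get_document(doc_id):
    """Hypothetical handler standing in for a server-side function."""
    return f"document {doc_id}"


def list_collection(path):
    """Hypothetical handler standing in for a server-side function."""
    return f"collection {path}"


# Operation names from the description resolve to callables.
HANDLERS = {"get_document": get_document, "list_collection": list_collection}


def template_to_regex(template):
    """Turn '/a/{name}' into a regex with one named group per parameter."""
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        # Even positions are literal text, odd positions are parameter names.
        pattern += re.escape(part) if index % 2 == 0 else f"(?P<{part}>[^/]+)"
    return re.compile("^" + pattern + "$")


def route(method, path):
    """Dispatch a request to the handler named in SPEC, or return None."""
    for template, operations in SPEC.items():
        match = template_to_regex(template).match(path)
        if match and method.lower() in operations:
            handler = HANDLERS[operations[method.lower()]]
            return handler(**match.groupdict())
    return None
```

Calling route("GET", "/api/document/test.xml") dispatches to get_document with doc_id set to "test.xml". The point of the design is that the description, not the code, is the single source of truth for which requests exist.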

What's new in TEI Publisher 6.0.0

Version 6 brought a major refactoring and restructuring of the TEI Publisher app libraries, along with new specialized components and use case examples.

1. Web components overhaul: migration to LitElement and to the npm library

While invisible to users, this redesign greatly improves the modularity of Publisher-based applications. With Publisher web component releases published on npm, updating the user interface for all Publisher-based apps is just a question of changing a single variable in the configuration file. Furthermore, Publisher's library of web components, true to the basic idea of the Web Components standard, can be included in any HTML webpage, e.g. embedded into an existing CMS or any other publishing solution, even if it is not running eXist-db. Similarly, if you prefer to write your own application using any of the popular frameworks like Angular, Vue or React, you can easily import the pb-components package from npm and use it directly in your project. As a final consequence, this change decouples the component library from the TEI Publisher app. It is now possible to host multiple applications which depend on different versions of the component library within the same eXist-db instance without conflict, a point of importance for institutions with numerous projects.

2. Redesigned and simplified CSS styling customization

The encapsulation of styles offered by web components can be a mixed blessing and poses some challenges when customizing the aesthetics of components to fit a project. While some aspects of component styling remained inaccessible for customization in previous versions, Publisher 6 exposes the majority of styling properties via standard CSS files and theme variables. Stylesheets can also be specified within the ODD, as previously, or through pb-view component configuration attributes.

3. Extended internationalization

I18n support has been extended to cover not only the labels in HTML templates but also those within web components. A mechanism for project-specific language files extending the default Publisher label collection has been added. Thanks to community contributions via Crowdin, a number of new languages have been added and existing ones updated.

4. Subcorpora: new TEI Publisher data organization

Publisher's pre-populated data collection is now split into the Playground and the TEI Publisher demo collection areas, which illustrate how this mechanism could be used to host multiple subcorpora within a single TEI Publisher application.

5. New and improved web components

The pb-select-feature and pb-toggle-feature components have been extended to allow for interactive changing of display parameters (like switching between regularized and original spelling), which can then be processed on the client or the server side. New components have been created to handle MEI music notation as well as web component API documentation and demo pages.

6. The user interface of the ODD editor has been improved.

7. An experimental incremental scroll mode has been introduced to improve performance for very long documents presented in single page mode.

Quickstart

«Stay Home Learn TEI Publisher From Scratch»

A 3-part online course was organized by e-editiones and led by Wolfgang Meier in June 2020. Course material, video recordings of all the sessions and a walk-through for the assignments are available for self-study. Find all the information on the workshop GitHub page.

Installation

TEI Publisher requires the eXist XML database to operate. It is distributed as an eXist application package, making it easy to install on any eXist database instance, either on your local machine or on a remote server. You can install eXist and TEI Publisher manually, as described below. Alternatively, use the provided docker image.

Installing into an eXist instance

Java

Before installing eXist, make sure you have Java installed on your machine. You can run java -version on a command line to check which version of Java you have. Make sure you have at least Java 8 (recommended: Java 11). Please note that java -version shows the full version string, so 1.8.0 or similar instead of just 8. If you do not have Java installed, you can choose between a variety of Java distributions for your operating system. While these are largely equivalent, so far we have had the smoothest installation experience across operating systems with the Zulu Community OpenJDK builds. For Windows users in particular, these provide the best out-of-the-box experience.
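The difference between the legacy version string (1.8.0) and the newer scheme (11.0.2) can be handled mechanically. The following is a minimal sketch, not part of TEI Publisher; the function names are made up for illustration. It extracts the major Java version from the text printed by java -version and compares it against the minimum stated above.

```python
import re


def java_major_version(version_output):
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found")
    first = int(match.group(1))
    second = int(match.group(2)) if match.group(2) else 0
    # Legacy scheme: "1.8.0_292" means Java 8.
    # Newer scheme: "11.0.11" means Java 11.
    return second if first == 1 else first


def meets_requirement(version_output, minimum=8):
    """Return True if the reported Java version is at least `minimum`."""
    return java_major_version(version_output) >= minimum
```

Note that java -version writes to standard error rather than standard output, so a script calling it (for example via subprocess.run with capture_output=True) has to read the stderr stream.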

Download

Download an eXist distribution following the link on its homepage. It is recommended that you set up an admin password when installing eXist, but make sure to remember it or store it securely!

Mac installation

On a Mac, download the file with the .dmg extension, e.g. "eXist-db-5.x.x.dmg". Double-click the downloaded .dmg file and drag the eXist app icon over to the Applications folder; no further steps are required. Once the installation has completed, you should find an app in your Applications folder which you can use to launch eXist.

Windows installation

On Windows, download the file with the .jar extension, e.g. "exist-installer-5.x.x.jar". Double-clicking the .jar should launch an installer which guides you through the basic settings. The default settings suggested by the installer provide a good starting point for most projects. If double-clicking the .jar does not have any effect, there may be something wrong with your Java setup: the java binary needs to be in your %PATH% environment. You can also try to start the installer manually by opening a command prompt, changing to the directory where you downloaded the distribution and typing:

java -jar exist-installer-5.x.x.jar

Once the installation is completed, you should find an eXist-db shortcut to launch eXist.

Unix installation

Download the file with the .jar extension, e.g. "exist-installer-5.x.x.jar". Double-clicking the .jar should launch an installer which guides you through the basic settings. The default settings suggested by the installer provide a good starting point for most projects, so there's no need to change anything. Once the installation is completed, you should find an eXist-db shortcut to launch eXist; otherwise navigate to the installation directory and run bin/startup.sh.

Some Linux users may prefer the plain .tar.bz2 package, which can just be untarred to any location. This package does not include an installer, so skip the .jar installer step above. eXist has to be launched on the command line: navigate into the untarred directory and run bin/startup.sh in a shell. Ignore the next section, navigate directly to http://localhost:8080 and follow the steps for installing TEI Publisher via the dashboard described further below.

First launch

Once eXist is launched for the first time you should see (with the exception of some Unix system configurations described above) a splash window popping up, showing that default applications are being installed:

Splash Screen on eXist Startup

Upon first start, an additional configuration window will pop up on Windows and Mac, allowing you to configure basic parameters. The default settings suggested provide a good starting point for most projects, so usually there's no need to change anything.

Configuration Dialog Showing on First Start

Clicking on Save will show a popup asking you to confirm the location of the data directory. Unless you have specific requirements, just accept the suggestion of the configuration dialog. Windows users will be asked if they would like to install eXist as a service. This is highly recommended to ensure that the database is correctly closed whenever the operating system shuts down.

If all went well, eXist should now be up and running in the background. Mac and Windows users should find a small eXist icon in their task bar. Right-clicking on it will reveal a menu:

Taskbar Launcher Context Menu

Installing TEI Publisher

Clicking on Open Dashboard in the taskbar will open a browser and display eXist's Dashboard: the central administrative hub for the database. Alternatively, e.g. if you chose the manual installation on Linux, you can open a browser window and navigate to http://localhost:8080. Log into the dashboard using the admin account and the password you chose during the installation (it is empty by default). Use the left sidebar to navigate to the Package Manager. You'll see two tabs: the first one lists the application packages currently installed, the second can be used to install additional packages from eXist's public application repository.

Switch to the Available tab and search the list for TEI Publisher. Once you find it, click on the little install icon. After installing you will find the TEI Publisher icon in the tab showing installed apps. Click on it to open the TEI Publisher.

Installing TEI Publisher from the Package Manager

Using docker

If you do not want to install eXist yourself, you can use docker to run TEI Publisher. Docker is a tool to simplify the installation of applications and services. It creates a virtual environment including everything required for the service to run. With our docker image, eXist is already set up to include TEI Publisher as well as the Shakespeare and Van Gogh demo apps.

1. Install docker on your machine. Windows and Mac users may download the docker desktop app.

2. To download the image, run the following in a console:

docker pull existdb/teipublisher:latest

3. Once the download is complete, you can run the image with the following command:

docker run -p 8081:8080 -p 8444:8443 --name teipublisher existdb/teipublisher:latest

Startup should be fast because the database is already pre-populated. However, changes you make may not persist if the docker container is deleted or updated to a newer release. If you want to be sure that your changes are safe, you should specify a local volume for storing the database by adding:

-v exist-data:/exist-data

The parameters have the following meaning: -p 8081:8080 maps port 8080 inside the container to port 8081 on your machine, and -p 8444:8443 does the same for port 8443; --name teipublisher assigns a name to the container so that you can refer to it in later commands; -v exist-data:/exist-data mounts a docker volume named exist-data at the path /exist-data inside the container.

Once the container has started, you can access the eXist dashboard in your browser by navigating to

http://localhost:8081

From the dashboard you can click on the TEI Publisher, Shakespeare or Van Gogh icons to open the corresponding applications.

4. To stop the container, run:

docker stop teipublisher

5. To start the container again:

docker start teipublisher

Note that when you restart a container, it will run in detached mode, so you won't see any console output. You can view the output with the following command though:

docker logs teipublisher

Other useful commands

Have a look at the docker documentation and cheatsheet for more commands.

Browsing Documents

The Start Page

The start page of TEI Publisher serves as an entry point to explore and experiment. On a newly installed TEI Publisher, the main application panel offers the choice between browsing local collections directly and using the DTS API to access remote resources. The narrower panel to the right displays the list of ODD files provided with TEI Publisher.

Start Page

Collections

The usual starting point is the TEI Publisher Demo Collection, in which users can explore a range of selected use cases demonstrating various genres, encoding styles and presentation layouts. Various customization aspects are handled using different ODDs and view templates. We suggest having a look at each of them to see what TEI Publisher can achieve out of the box.

Note Documents in this collection are preinstalled with TEI Publisher and users are not allowed to write to it by default.

The Playground collection is the place to upload encoded documents and ODD files to experiment with various processing models and view templates. Unlike the Demo collection, the Playground features an upload box to import new documents. You can upload your own documents (e.g. TEI, DocBook or DOCX) and ODD files by either clicking on the Upload button or dragging and dropping files onto the upload panel. Read more on this subject in the Upload section.

Note You need to be logged in for most advanced actions like creating or editing ODDs. The login button to the right of the menu bar allows you to log in. By default, there's a user named tei-demo with password demo , and a user tei with password simple .

TEI Publisher Demo collection

Experiment with the browsing, faceting, filtering and sorting features of TEI Publisher. This page consists of several main areas:

1. the facets panel
2. the list of documents currently installed, with sorting and filtering controls
3. a panel showing the ODD files known to the application
4. an upload box to upload new documents

Have a look at the documents showcased here to get a sense of the possibilities that TEI Publisher offers. Click on a document title to proceed to the Document View.

Browsing Demo collection

Selected Use Cases

The document view can vary, sometimes substantially, depending on the sample document you are looking at. This is a natural consequence of TEI's versatility and the broad scope of its application. It follows that the requirements for the document view, both its layout and composition and the processing rules governing the transformation of the text of the document itself, will differ to a great extent. The sample documents included in TEI Publisher's installation package do not exhaust its applications but rather aim to present some chosen use cases:

• Critik der reinen Vernunft, from the Deutsches Textarchiv corpus, presents a philosophical treatise, originally published in print and thus following a 'traditional' book structure with front pages, foreword and chapters. It nevertheless demonstrates very well Publisher's capacities in typesetting, in switching between the physical and logical structure of the document (just toggle Page View in the Settings panel) and in generating multiple output formats from a single set of processing models in the ODD (try choosing the PDF or ePub options in the Download menu).
• Purchas his pilgrimages, from the EEBO-TCP project, while roughly similar in structure, is a much earlier work (1613) and demonstrates extensive use of marginal notes.
• Shakespeare's Romeo and Juliet, from the Bodleian First Folio project, uses dedicated TEI elements to encode the structure of the play. It also showcases parallel transcription and facsimile alignment, a presentation which is of general application and could be used for any genre, not just dramatic texts.
• Correspondence corpora are common, yet very interesting, subjects for digital editions. Despite basic similarities in structure, the intended presentation may vary enormously depending on the period, scope and particular research perspective. We are presenting samples of:
• A 15th century manuscript letter to Mikołaj Orlik demonstrating alignment between the Latin original and a parallel Polish translation,
• A 16th century manuscript letter of Hernán Cortés showcasing parallel transcription/translation and facsimile view, with the transcription enhanced with commentaries and explicitly encoded transcriptional features,
• A 16th century manuscript letter of Mauritius Ferber with a collapsible metadata panel in addition to the parallel transcription and facsimile view,
• An early 19th century manuscript collocative dictionary of Polish, Bogactwa mowy polskiej, featuring interactive highlights for regions of interest of the facsimile when hovering over dictionary headwords,
• A letter from Van Gogh to Paul Gauguin written in 1888. This intentionally reproduces the flexible column layout pioneered by the Vincent Van Gogh Letters online edition, which is a model example for correspondence.
• A 20th century manuscript letter from Robert Graves where emphasis has been put on visualizing the rich encoding of semantic information in the letter, in particular geographic and prosopographical data.

Note The list of samples is expected to grow and we'd like to encourage contributions illustrating other genres and perspectives. We'd like to stress that preparing the showcases above has only been possible thanks to numerous projects releasing their sources openly, in particular the Bodleian First Folio, Deutsches Textarchiv, Vincent Van Gogh Museum and EEBO-TCP. We'd also like to thank William Graves and Anna Skolimowska for sharing their correspondence material.

Document View

The document view can vary, depending on the sample document you are looking at. Nevertheless some default functionality will be shared:

• the rightmost button in the toolbar opens the Settings panel. Here you can change the ODD being used for display as well as the view template (more about this later). By default, all sample documents apply the specific ODD which fits them best, but you can play around and select another ODD to see what happens.

The Settings Panel

• the leftmost toolbar button will open a table of contents (if the viewed document has a division structure)
• the Download menu allows you to download the currently viewed document in a variety of output formats. Not all output formats work equally well for all examples as we have not customized every example for every medium.

Experimenting with ODDs and page templates

All of TEI Publisher's sample documents are TEI XML files which are transformed into an HTML webpage for display in the browser. Two major factors determine what the final page will look like: an ODD and a page template.

We have already mentioned in the very first section that the TEI Processing Model lies at the heart of the Publisher: the ODD file associated with a document defines the rules for transforming the XML source file into HTML. A detailed discussion of the Processing Model can be found in the following chapters; for now it is sufficient to say that this is where it is decided whether a TEI element should be rendered inline, with a tooltip, or as a marginal note. Simplifying things a bit, the text of the document that you see rendered in your browser is the result of applying the rules from the ODD file to the source document.

Nevertheless, as we demonstrated in the section on selected sample documents, in the application context we certainly want more than just text, however nicely typeset. From basic navigation controls and a table of contents to facsimile display, critical apparatus, glossaries and maps: all of this and much more could be included in the final webpage. Following a divide and conquer approach, TEI Publisher defines such specialized page elements as small, reusable blocks, using the Web Components technology. Components can be used like common HTML elements, so a page template is just an HTML fragment which organizes the building blocks needed for a specific page.

Looking more closely again at TEI Publisher's Start page, we can now give more detail about what happens when any of the sample documents is loaded. On the right hand side there is a panel listing all ODD files available. Each of the sample documents includes a processing instruction which specifies the default ODD and page template for this document. You can check what they are in the Settings panel. For the Graves letter it would be the Graves' Letters ODD and the Letter with map/facets template.

The Settings Panel for the Graves Letter

It is easy to experiment with different page templates and ODDs just by changing these options in the Settings panel. An important caveat though is that not every page template makes sense for every document: after all, parallel alignment can only be successful if there is something to align, a map needs coordinates to display, page view needs information about page breaks and so on.
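To see which defaults a document declares, the processing instruction mentioned above can be read with a few lines of code. The sketch below makes several assumptions that are not stated in this section: that the instruction's target is called teipublisher, that it carries odd and template pseudo-attributes, and that the file names used in the example are valid. Check the source of a sample document for the actual form.

```python
import re
from xml.dom import minidom


def read_publisher_pi(xml_text, target="teipublisher"):
    """Return the pseudo-attributes of the first matching processing instruction.

    The target name "teipublisher" is an assumption made for this sketch.
    """
    doc = minidom.parseString(xml_text)
    # Processing instructions placed before the root element are
    # children of the document node itself.
    for node in doc.childNodes:
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE and node.target == target:
            # A PI has no real attributes; its data is free text that
            # conventionally looks like name="value" pairs.
            return dict(re.findall(r'([\w-]+)="([^"]*)"', node.data))
    return {}
```

For a document starting with a line such as <?teipublisher odd="graves.odd" template="letter.html"?> (hypothetical values), the function returns a dictionary with the keys odd and template. A document without such an instruction yields an empty dictionary.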

Uploading your own documents

If you are reading this, in all likelihood you already have some documents of your own that you would like to publish, whether they are in TEI, DocBook, MS Word DOCX or another XML format. The first step is to upload them into the database. You need to be logged in and in the Playground area to do so (check the short info on the Start page for user name and password). Uploading is then just a question of dragging your documents onto the Upload area. They will become available in the document list immediately after the upload has completed.

The Upload Panel

Congratulations, now you can view your documents! Try to experiment and find the ODD and page template that best fit your needs and use them as a starting point for your own customization, if necessary. Once you are ready with these, you can generate your own application for your documents only, which packs just what is needed for publishing into a standalone application package.

If you upload a Microsoft Word document, the upload will automatically trigger an upconversion of Word to TEI, using a custom ODD for the transformation. Please note that the focus of this conversion is to preserve the textual content, structure and basic semantics of the text, not to provide an authoritative mapping of the complete set of MS Word features to TEI. Refer to the DOCX handling section for more information.

Note Please bear in mind that while TEI Publisher aims to be a universal tool, specific components may make certain assumptions about the data they receive. If your documents do not follow the same conventions, you may need to adjust the parameters passed to the components from the page template, or the component logic itself. For example, the table of contents component assumes that the document structure is represented by nested div elements with section titles given in head elements. If your project uses numbered divisions (div1, div2 etc.) instead, it may be advisable to adjust your encoding to avoid customizing all navigation, the table of contents and so on. This is one of the very rare cases where TEI Publisher shows a preference for a particular flavour of TEI.

Similarly, the template for aligned transcription and translation is parametrized to accept an XPath expression pointing to the location of the transcription and the aligned translation. For your documents this expression would likely have to be adjusted (unless of course you also have Latin texts with Polish translations structured in a similar way). Furthermore, to correctly display the corresponding translation fragment, a custom mapping function may need to be passed to the translation view (cf. the Van Gogh or Cortés letter templates for examples).
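For the numbered divisions mentioned in the note above, one possible adjustment is to rename them to plain div elements before upload. The following is a minimal sketch using only the Python standard library, not a tool shipped with TEI Publisher; the function name is made up for illustration. Because numbered divisions are already nested inside each other, renaming the elements is enough to arrive at nested div elements.

```python
import re
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def unnumber_divisions(root):
    """Rename numbered divisions (div1 ... div7) to plain div, in place."""
    numbered = re.compile(r"^\{%s\}div[1-7]$" % re.escape(TEI_NS))
    for element in root.iter():
        # Comments and processing instructions have non-string tags; skip them.
        if isinstance(element.tag, str) and numbered.match(element.tag):
            element.tag = "{%s}div" % TEI_NS
    return root
```

A typical use would be to parse a file with ET.parse, call unnumber_divisions on the root element and write the tree back out. Attributes, head elements and all other content are left untouched.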

Supported XML vocabularies

TEI Publisher started as a publishing toolbox for TEI, but the principles of the TEI Processing Model were never limited to a single vocabulary, and Publisher very quickly extended its support to other XML formats. Currently TEI, DocBook, JATS and MS Word DOCX are supported out of the box (DOCX via automated conversion to TEI on upload). A few specifics of TEI and DocBook are listed below, while DOCX is discussed at length in the following section.

TEI

In principle, any TEI document will be supported by TEI Publisher and can be displayed with the default page template and ODD. Nevertheless, certain assumptions are made about the encoding of the basic structure of TEI documents for the purpose of navigation:

• page beginnings are encoded with pb
• column beginnings are encoded with cb
• structural divisions in the document are encoded with div elements

We acknowledge that TEI offers other ways to encode these features, e.g. the generic milestone element or specialized numbered division elements like div1, div2. TEI documents using alternative encodings will still be displayed as specified in the ODD; it is only for the sake of navigation and division-based full text search that we had to assume certain conventions in order to decide what to show as the next page, column or division. We believe our choice represents the most common way of using TEI but, for those who followed the path less travelled, the chapter on customization briefly discusses how to change the relevant functionality.

DocBook

DocBook support is demonstrated by this very document you are now reading, documentation. It is written in DocBook and presented via the dedicated docbook.odd and documentation. page template. You will notice a custom processing instruction in the source code of this document which specifies which ODD and template to use. Experiment with changing the template and ODD via the Settings drawer to see how much impact this has on the display.

MS Word DOCX format conversion

Starting with version 5.0.0 of TEI Publisher, a new docx handling module is available for ingesting documents in docx format. The goal of this module is to provide a way to import Word documents while preserving the textual content, structure and basic semantics of the text, not to provide an authoritative mapping of the complete set of MS Word features to TEI. The docx format is relatively flat, so reconstructing the logical document structure, such as divisions and lists, can only be based on heuristics. Likewise it is impossible to deduce the semantics attributed to certain formatting decisions. For that reason TEI Publisher intentionally ignores many style properties: trying to preserve as much as possible would likely just add unnecessary "noise" and result in low-quality TEI.

A word about Word

A Word document is essentially a zip archive of several different XML files. These files store the various parts: the text content, styles, embedded media files etc. The information most relevant for the import process has been extracted into a map, which is passed as a parameter to the ODD, so it is available for every element. Thus information about numbering styles can be accessed via the $parameters?nstyle(.) function, and testing whether a list is bulleted could be done by checking the value of $parameters?nstyle(.)/numFmt/@w:val . A full list of available functions and some hints on how to customize the default conversion ODD are provided at the end of this chapter .
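The check described above could be used in a model predicate. The following is only a sketch: the element name p, the behaviours chosen and the comparison value 'bullet' are assumptions made for illustration, so compare with docx.odd before relying on it:

```xml
<!-- Hypothetical rule: the w prefix must be bound to the WordprocessingML namespace in the ODD -->
<elementSpec ident="p" mode="change">
    <!-- Paragraphs whose numbering style uses a bullet format (value 'bullet' is assumed) -->
    <model predicate="$parameters?nstyle(.)/numFmt/@w:val = 'bullet'" behaviour="listItem"/>
    <!-- All other paragraphs -->
    <model behaviour="paragraph"/>
</elementSpec>
```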

MS Word archive structure

ODD for docx

The ODD used for docx processing can be found in docx.odd . Users are free to extend the default ODD with additional heuristics. For example, a paragraph set entirely in bold could also be treated as a heading, or a left text indent may indicate a quote. For testing purposes, a Word document is provided in data/doc/test.docx which includes samples of the most important features like headings, lists, tables, notes and embedded images. Try uploading it via the upload panel as described in the upload section and check the conversion results. The behaviour of the conversion mostly follows the approach used in the TEI Stylesheets docx-to-tei transformation module and has been tested on the test files included there.

Parameter functions

The functions below can be used to retrieve styles or other information related to the current node. For more usage examples see docx.odd .

Processing Model transformations

While TEI Publisher already provides various ODDs and page templates targeting specific domains, it is likely that your project will require certain adjustments to fully meet your needs. It has been one of the primary concerns in Publisher's design that customization is not only possible on various levels but also encouraged, and we aim to make it as simple as possible. Very broadly, we can group customization needs into two sets: changing the rules for document transformation (how the source document is translated into the output format) or changing the organization and styling of the rendered web page. In this chapter we'll concentrate on the former, document transformation, which primarily requires modification of the ODD with the TEI Processing Model . The latter would require adjustments of the page template . In both cases, it may be best to choose an already existing ODD or page template as your starting point and adjust it.

ODD Customization

Creating Your First ODD

The general workflow for creating a customization is as follows:
1. upload a TEI sample document you want to format
2. create a new ODD
3. modify the ODD to match your requirements

For the purpose of this quickstart, we will reuse one of the pre-installed sample documents, but create a new ODD for it (while we will start from scratch with an empty ODD, it is also possible to generate one based on one or more sample TEI documents ):
1. Log in and fill out the form at the bottom of the panel listing ODD files. Choose a name for the ODD, e.g. myletter (without a suffix) and a title, which will appear in the list after creation. Click on Create (not Create from examples ). The newly created ODD should appear in the side panel.
2. In the document list, click on Letter #6 from Robert Graves to William Graves to open it in the document viewer.
3. Open the Settings panel (rightmost toolbar button, see above) and choose your ODD from the dropdown showing available ODDs. You may also change the HTML template to Default single text layout , though this is not absolutely necessary.
4. The view should change and display the letter's content with only basic formatting applied, since our ODD has just been created and is empty. Our ODD by default inherits from teipublisher.odd , which in turn extends tei_simplePrint.odd . The latter is maintained by the TEI community and contains processing model declarations for the most important TEI elements. Thanks to this inheritance mechanism, many documents display nicely without requiring a lot of additional customization.
5. From the menu, select Admin / Edit ODD to open the visual ODD editor.

Modify the ODD

Changing processing models in the ODD is a powerful mechanism through which you can control all aspects of the transformation of your documents from the source XML format to all output formats: HTML, ePUB, PDF etc. As already mentioned, it is considered best practice to chain ODD customizations together, changing or adding project-specific rules on top of a more generic ODD rather than copying them in extenso. ODD chaining allows for future upgrades, as your base ODDs may be updated by the standardization bodies which maintain them. Commonly, project ODDs would extend teipublisher.odd , a generic TEI Publisher set of processing rules.

Beginning with version 3.0 of TEI Publisher, you have the choice between writing the ODD by hand or using a visual editor. Both approaches can be combined and mixed. The visual editor saves the ODD in a non-destructive way, preserving any information not related to the processing model. It is thus safe to switch between hand-editing the ODD and using the visual editor. Just make sure you reload the visual editor view after modifying the source XML and vice versa. That said, the visual editor is specifically tailored to editing processing models, so it will likely be the fastest and safest way to edit your ODD.

To be able to customize the display of your document it is crucial to understand its XML structure well. Each processing model needs to be aimed at a particular XML element and is sometimes only meant for a specific XML context: for example, we might want to distinguish between headings of first and second level nested divisions, as they often represent titles of different text units such as acts and scenes or books and chapters.

We'll start with the Graves letter you already viewed with your custom ODD in the previous section. The display is quite simple and easy to read, but we might want to adjust it to follow common visual conventions for a letter, starting with displaying the dateline on the right hand side and completely removing the page label which currently sits there. To create a processing model addressing this need we have to know 3 things:
• when should it be applied,
• what is supposed to happen,
• and how should the text be formatted?

To be able to answer the first question, you should familiarize yourself with the XML structure of the letter to find out how datelines are represented in TEI. In the tab displaying the letter, select Download / XML to open graves6.xml in eXide. A quick investigation of the TEI encoding will reveal that the dateline resides in its eponymous <dateline> tag, which is nested in the <opener> part of the document, while page labels are encoded with <pb> .

We'll use the visual editor, but show the corresponding ODD XML below each screenshot. At the end of this chapter we'll describe how to edit the ODD XML code by hand .

First Steps

The visual ODD editor opens if you select Admin / Edit ODD from the menu while viewing a document. Alternatively you can click on the name of an ODD in the list of ODDs on the TEI Publisher entry page. A new tab opens, showing an action panel to the left, and the title of your ODD to the right.

Note: The most recent versions of the ODD editor will look slightly different; nevertheless they are functionally equivalent to the screenshots below, which were created in an earlier version.

We need to overwrite the processing model rules for <dateline> . Enter dateline into the input box next to the New button in the left panel and click the button. This will insert a processing model rule for <dateline> into the right panel. Because a rule for <dateline> already exists in the base ODD, tei_simplePrint.odd , you'll see a single model which was copied from the base ODD.

Screen after adding <dateline>

The corresponding ODD XML looks like this:
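A minimal sketch of what such a freshly copied spec could look like, assuming the base ODD maps <dateline> to the block behaviour (the behaviour value is an assumption here):

```xml
<!-- mode="change" because the spec already exists in the inherited ODD -->
<elementSpec ident="dateline" mode="change">
    <!-- single model copied from the base ODD; "block" is assumed -->
    <model behaviour="block"/>
</elementSpec>
```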

Let's cover some key concepts of the TEI processing model first: <elementSpec> primarily documents the structure, content, and purpose of an element. It is a core element in any ODD, but its schema-related functions are not relevant for the discussion here. What is important for us is that this is where processing models are defined. The @ident attribute of the <elementSpec> identifies the name of the element to which the spec (and therefore the processing model) applies.

An <elementSpec> may contain one or more <model> elements to specify the intended processing of this element. Every model maps the element to a behaviour . A behaviour denotes an abstract transformation function to be applied. The TEI guidelines currently list two dozen behaviours, e.g. paragraph, heading, note, inline, block. The last two are the most frequently used. How exactly a behaviour translates into the target output media may differ depending on media features and design decisions. TEI Publisher tries to implement them as generically as possible.

To change the model, expand it by clicking on the arrow to the left of the grey box. A form appears, allowing you to change the model configuration. In our example we are happy with what is happening with the dateline, so we don't need to change the behaviour, but we do want to fix how it is styled by aligning it to the right. A rendition can be defined in an <outputRendition> , so click on the + button next to Renditions . In the form input inserted below, enter your styling requirements in CSS.

The processing model uses <outputRendition> and CSS to define visual aspects. For output formats other than HTML, the CSS is translated into the corresponding target language. It is thus best to limit the CSS to the most common typographical features, like bold, italic, color, underline etc. The general styling of the text should be done outside the ODD to maintain a clear separation of concerns.

Add a rendition for <dateline>

Again here's the corresponding XML:

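<elementSpec ident="dateline" mode="change">
    <!-- behaviour as inherited from the base ODD (assumed to be block) -->
    <model behaviour="block">
        <outputRendition>text-align: right;</outputRendition>
    </model>
</elementSpec>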

To test your change, click on Save in the left panel and wait a second until a popup appears. Switch back to the tab with the Graves letter from which you opened the editor and refresh the browser window to see your changes applied. In case you do not see any change, make sure that
1. you selected the correct ODD for viewing (check the Settings drawer),
2. your browser is not showing a cached version: if you made changes to outputRenditions only, you may need to clear your browser's cache. For most browsers, holding the Shift key while clicking on the Reload button does the job.

Other behaviours

We would also like to hide the page breaks as we do not have facsimiles available. Add a new element spec for <pb> . Again, the newly added spec already includes a model with behaviour break . Either change this behaviour to omit , or delete the existing model and insert a fresh one with behaviour omit .
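In ODD XML, hiding the page breaks could be sketched as follows, assuming page breaks are encoded with <pb> and a spec for it exists in the inherited ODDs:

```xml
<!-- mode="change" overrides the inherited spec for page breaks -->
<elementSpec ident="pb" mode="change">
    <!-- omit: produce no output at all for this element -->
    <model behaviour="omit"/>
</elementSpec>
```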

Omit


Predicates and multiple models

Next up, we may want to highlight the various places and people occurring within the text. They are all marked up with the <name> tag, using different @type attributes. Create a new element spec for <name> and supply some color to the names.

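Color the <name> tags

And the XML for the entire <elementSpec> :

<elementSpec ident="name" mode="change">
    <!-- reconstructed listing: the mode and behaviour values are assumed -->
    <model behaviour="inline">
        <outputRendition>color: #FF9900;</outputRendition>
    </model>
</elementSpec>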

This rule affects places and people alike, since both categories are marked up with the <name> tag. If we'd like to treat people and places differently, we'd need separate models for them and a mechanism to distinguish between the two. The processing model uses predicates to make such distinctions: a model rule will only be used if the XPath expression in its predicate matches the current node being processed. Let's add another model and give it a predicate:

Distinguish places and people

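<elementSpec ident="name" mode="change">
    <!-- reconstructed listing: the predicate value is assumed -->
    <model predicate="@type='place'" behaviour="inline">
        <outputRendition>color: #0077FF;</outputRendition>
    </model>
    <model behaviour="inline">
        <outputRendition>color: #FF9900;</outputRendition>
    </model>
</elementSpec>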

Important: The order of models within the element spec is important. If you move the model with the predicate to the bottom, all names will appear in the same color again. This happens because the processor walks through the models until it finds the first one matching the current node. If the model without predicate is first, it will always win over the one with the predicate! Also, if there's more than one matching model, only the first will be chosen.

Parameters

All behaviours accept one or more parameters which are defined in the TEI guidelines. Every behaviour has an implicit parameter called content , and, as the name suggests, it specifies which part of the source document should be processed: by default it uses the nested content of the node. You may overwrite this default and assign it another value.

Some behaviours take other specialized parameters. For example, the alternate behaviour accepts two parameters: default and alternate . An alternate switches between two alternative states. On the web this could take the form of a popup; in print it is usually implemented as a footnote.

To put this to the test, let's look at the <date> elements appearing within the letter. Most of them also specify a normalized date in their @when attribute. Seeing this may be helpful for the reader, for example, to know that the 19th mentioned in the postscript refers to 1957-12-19 . However, we may want to present the normalized date in a more readable way. XPath has a function format-date for this purpose, and we could use it to show a nicely formatted representation of the date in the user's language.

Add a new element spec for <date> . You'll already see 4 predefined models. The first two are for print only, but the third one does indeed use behaviour alternate , which is exactly what we want. Change the parameter value for alternate to format the date:

Format the normalized date in @when
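A sketch of such a model is shown below. It is not the predefined model from the base ODD: the predicate, the date picture string and the hard-coded language 'en' are assumptions chosen for illustration, and the other predefined models are left out.

```xml
<elementSpec ident="date" mode="change">
    <!-- Only format complete ISO dates; partial values in @when would not cast to xs:date -->
    <model predicate="@when castable as xs:date" behaviour="alternate">
        <!-- default state: the date as written in the text -->
        <param name="default" value="."/>
        <!-- alternate state: the normalized date as day, month name and year -->
        <param name="alternate" value="format-date(xs:date(@when), '[D] [MNn] [Y]', 'en', (), ())"/>
    </model>
    <!-- fallback for dates without a usable @when -->
    <model behaviour="inline"/>
</elementSpec>
```

To follow the user's language instead, the third argument of format-date would need to be supplied by the application rather than fixed in the ODD.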

Screencast

The screencast below recapitulates some of the modifications we just applied. It uses an older version of TEI Publisher, but the basic concepts and controls are still the same:

Screencast

Edit the ODD XML by hand

To switch to the XML source code of the currently edited ODD from within the visual editor, click on the button with the angle brackets in the toolbar of the left side panel. If you made changes in the form, you need to save first to update the ODD. The ODD XML will be opened in a new tab, showing eXist's browser-based editor, eXide . While eXide is sufficient for small edits, we really recommend using a specialized XML editor like oXygen for serious work on your TEI files. It will help you with many tasks, starting with syntax checking and documentation. You can edit ODDs stored in eXist using oXygen's WebDAV support or the eXist data source function.

Important: If you edit the ODD XML by hand, there are some caveats you need to be aware of. The visual editor automatically checks whether an <elementSpec> for a new element already exists in any of the ODDs your ODD inherits from. When editing by hand, you need to do this yourself. It's best to always have the base ODDs, tei_simplePrint.odd and teipublisher.odd , open on the side. Both are located in the same collection as your ODD, i.e. /db/apps/tei-publisher/odd . For example, before modifying the element spec for an element, check tei_simplePrint.odd : if you find a definition there already, copy it over to your ODD and start modifying it. Pay attention to the @mode attribute on <elementSpec> . You must set this to change if you are overwriting an elementSpec which already exists in the inherited ODDs. If not, set it to add . To test any changes, switch back to the tab in which you viewed your document (e.g. the Graves letter) and select Admin / Recompile ODD from the menu.
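The two values of @mode can be contrasted in a short sketch. The element name myElement is hypothetical, and the behaviours are chosen for illustration only:

```xml
<!-- Overrides a spec that already exists in an inherited ODD: mode must be "change" -->
<elementSpec ident="dateline" mode="change">
    <model behaviour="block">
        <outputRendition>text-align: right;</outputRendition>
    </model>
</elementSpec>

<!-- Hypothetical element with no spec in the inherited ODDs: mode is "add" -->
<elementSpec ident="myElement" mode="add">
    <model behaviour="inline"/>
</elementSpec>
```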

Processing Model Syntax

TEI gives users a lot of freedom: there's always more than one way to encode your material! To maintain interoperability and sustainability, you need a way to formally describe the schema used as well as document editorial guidelines and transcription processes. TEI ODD was designed for the purpose of expressing all this in the TEI language itself. But how a document should be rendered was previously still considered to be the responsibility of external publishing software and could not be described within the ODD. The advent of the TEI Processing Model changed this! The intended processing for all elements can now be expressed within the TEI vocabulary as part of the ODD, thus fulfilling its promise of One Document Does It All . Markup elements are mapped to a small set of abstract transformation functions, called behaviours . Basic styling features can be set directly within the ODD using CSS. The processing model is media-agnostic: behaviours and rendition styles are transparently translated into different output media types like HTML, XSL-FO, LaTeX, or ePUB. A single ODD can handle a multitude of output media types with just a few small adjustments.

<model> element

The <model> element is primarily used to document the intended processing for a given element. One or more of these elements may appear directly within an element specification to define the processing anticipated for that element. Where multiple <model> elements appear, they are understood to document mutually exclusive processing scenarios, possibly for different outputs or applicable in different contexts.

A processing model defines on an abstract level how a given element may be transformed to produce one or more outputs. The model is expressed in terms of behaviours and their parameters, using high-level formatting concepts, such as block , inline , note or heading . A processing model is thus a template description, used to generate the code needed by the publishing application to process the source document into the required output.

The example below depicts a situation where a single model is defined for the <app> element. As no @predicate or @output is specified, this model applies to all contexts in which <app> may appear and to all possible outputs. Thus all <app> elements will be transformed into inline chunks of text containing only the contents of <app>'s child <lem> and omitting any possible <rdg> children.
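A sketch of such a single-model spec follows. The element names app, lem and rdg come from the usual critical-apparatus vocabulary and are an assumption here, since the element names are not visible in this copy:

```xml
<elementSpec ident="app" mode="change">
    <!-- no @predicate and no @output: applies in every context and for every output -->
    <model behaviour="inline">
        <!-- process only the lem child; any rdg children are therefore left out -->
        <param name="content" value="lem"/>
    </model>
</elementSpec>
```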

<model> children and attributes:

• @predicate : the condition under which this model applies, given as an XPath predicate expression
• @behaviour : names the function which this processing model uses in order to produce output; possible values include: alternate, block, figure, heading, inline, link, list, note, paragraph
• @output : identifier of the intended output for which this model applies; the model applies to all outputs if no @output is present on a <model>
• @useSourceRendition : whether to obey any rendition attribute which is present in the source document
• @cssClass : one or more CSS class names which should be added to the resulting output element where applicable
• <param> : allows passing parameters to the @behaviour function; the parameters available depend on the behaviour in question; when parameters are not explicitly passed, default values are assumed; all behaviour functions use the current element as default content
• <outputRendition> : supplies information about the desired output rendition in CSS; its attribute @scope provides a way of defining ‘pseudo-elements’, e.g. first-line, first-letter, before, after

A model may specify the content parameter explicitly: for <app> entries, only the content of the <lem> child is to be displayed (as an inline chunk of text).

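Model specifying output rendition: contents of <ex> elements are to be displayed in italic and wrapped in parentheses:

<elementSpec ident="ex" mode="change">
    <model behaviour="inline">
        <outputRendition>font-style: italic;</outputRendition>
        <outputRendition scope="before">content:"(";</outputRendition>
        <outputRendition scope="after">content:")";</outputRendition>
    </model>
</elementSpec>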

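Sometimes different processing models are required for the same element in different contexts. For example, we may wish to process the <quote> element as an inline italic element when it appears inside a <p> element, but as an indented block when it appears elsewhere. To achieve this, we need to change the specification for the <quote> element to include two <model> elements as follows:

<elementSpec ident="quote" mode="change">
    <model predicate="parent::p" behaviour="inline">
        <outputRendition>font-style: italic;</outputRendition>
    </model>
    <model behaviour="block">
        <outputRendition>margin-left: 2em;</outputRendition>
    </model>
</elementSpec>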

The first processing model will be used only for <quote> elements which match the XPath expression given as the value of the @predicate attribute. Other <quote> occurrences will use the second processing model. A set of multiple <model> statements is regarded as an alternation, and only the first model whose @predicate matches the current context is applied.

output styling

The intended rendering for a particular behaviour of a processing model may be specified in one or all of the three following ways:
• the @cssClass attribute may be used to specify the name of a CSS style in an associated CSS stylesheet (read more on specifying CSS styles in the ODD ) which is to be applied to each occurrence of a specified element found (in a given context, for a specified output),
• the attribute @useSourceRendition may be used to indicate that the rendition specified in the source document should be applied,
• the styling to be applied may be specified explicitly as content of a child <outputRendition> element.

When more than one of these options is used, they are understood to be combined in accordance with the rules for multiple declaration of the styling language used. It is strongly recommended that the use of <outputRendition> be limited to strictly editorial decisions, such as 'conjectures are to be displayed in square brackets', and not serve as a means to record all typesetting and layout-specific design choices. The latter are discussed in the Custom CSS styling chapter.

The processing model library translates the CSS styles into the target media format. Restrictions apply due to differences between the output formats. Not all CSS properties are supported for every format. Please refer to the section on Output media settings for further information.

<modelSequence> and <modelGrp>

Summary of elements that can be used to document one or more processing models for a given element:
• <model> describes the processing intended for a specific context
• <modelSequence> (sequence of processing models) a group of model elements documenting intended processing models for this element, to be acted upon in sequence
• <modelGrp> (processing model group) a group of model elements documenting intended processing models for this element

The <modelGrp> element may be used to group alternative <model> elements intended for a single kind of output. The <modelSequence> element is provided for the case where a sequence of models is to be processed, functioning as a single unit. A common use case would be to use modelSequence to generate a table of contents along with the reading text, as shown in the example below:
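A sketch of such a sequence, modelled on the kind of example given in the TEI guidelines. The choice of <body> as the element and the 'toc' value of the type parameter are assumptions for illustration:

```xml
<elementSpec ident="body" mode="change">
    <modelSequence>
        <!-- first model: generate a table of contents -->
        <model behaviour="index">
            <param name="type" value="'toc'"/>
        </model>
        <!-- second model: output the reading text itself -->
        <model behaviour="block"/>
    </modelSequence>
</elementSpec>
```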

Behaviours

The TEI guidelines document a number of default behaviours. TEI Publisher allows users to add their own behaviours, either within the ODD itself or by writing XQuery code . The following section lists the default behaviours.

Available Behaviours

Behaviour functions accept a range of parameters, depending on the function in question. Where these parameters are left unspecified in the model, default values are used. All functions take at least one parameter: content . It is added by default unless specified and contains the nested content of the currently processed node. You may change this by explicitly setting a content parameter inside the model. In the parameter lists below we skip the content parameter as it is available for every behaviour. Optional parameters are marked as optional in parentheses, followed by the output mode they apply to, if relevant.

Including General CSS Styles

TEI Publisher is based on web components, therefore the styling of one document will not interfere with the styling of another document on the same page. All styles are strictly encapsulated within the component and do not "pollute" the global browser space. This also has a downside though: CSS rules defined outside the component have no influence on the text styling inside it (with some exceptions, mainly for properties which are inherited down the HTML tree, e.g. font-family ). However, putting all styling information into <outputRendition> tags within the ODD is also not a good idea: it adds a lot of redundancy and mixes editorial responsibilities with web design concerns. The recommended solution is therefore to use CSS classes for recurring styling aspects. TEI Publisher supports linking to an external CSS stylesheet from the encodingDesc/tagsDecl/rendition section of the ODD. Just specify a relative link in the @source attribute:
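A sketch of such a link inside the ODD header. The file name mystyles.css is a placeholder, not a file shipped with TEI Publisher:

```xml
<encodingDesc>
    <tagsDecl>
        <!-- relative link, resolved against the location of the ODD -->
        <rendition source="mystyles.css"/>
    </tagsDecl>
</encodingDesc>
```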

The file should be stored in the same collection as the source ODD it is referenced from. The linked file should be a standard CSS stylesheet. Note that, unfortunately, editing renditions is not yet supported by the visual ODD editor, so you will have to fall back to adding the corresponding elements to the ODD by hand. Alternatively, one may also use the same <rendition> element with the @selector attribute to embed CSS rules directly in the ODD.

font-family: serif; font-weight: 400;

Choose one of the two approaches, but do not mix them. In both cases make sure to recompile the ODD after changes as the CSS is merged into the generated code!

A new addition in Publisher 6.0 allows passing an external CSS file in the load-css attribute of pb-view . Recompiling the ODD is not necessary in this case; otherwise it is functionally equivalent to using an ODD rendition .
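In a page template this might look as follows. Both attribute values are placeholders: the src value is assumed to name the document component used on the page, and the CSS path depends on where your stylesheet is stored:

```xml
<!-- Hypothetical values: adjust src and the CSS path to your application -->
<pb-view src="document1" load-css="odd/mystyles.css"></pb-view>
```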

Output Media Settings

The library supports various output media formats and translates styles into the corresponding format. Currently the following output modes are supported and can be used in the @output attribute:

The quality of the generated output may vary a lot for the fo and latex modes, depending on the type of input document. The following section provides more details on the configuration of the FO output option:

FO Output

When generating XSL-FO output, the implementation tries to translate the CSS rules specified for renditions into the corresponding XSL-FO formatting properties. Not all CSS properties are recognized or can be mapped to FO properties. Unknown properties defined in a rendition will be ignored. The default rendering for headings, paragraphs and the like is defined by a separate CSS file. The implementation merges those defaults with the custom renditions given in the ODD. The library searches for default CSS styles in a file named after the ODD with the suffix .fo.css inside the specified output collection (in which the generated XQuery files are stored). The style definitions are copied literally into attributes on the output XSL-FO elements, so any property which is a valid attribute for the corresponding element may be used. For example, teipublisher.fo.css contains:

.tei-text {
    font-family: "Junicode";
    hyphenate: true;
}
.tei-floatingText {
    padding: 6pt;
}
.tei-p {
    text-align: justify;
}

Every XSL:FO document needs a master layout and a page sequence definition. Because those tend to be rather verbose as they include things like page margins etc., they are read from two XML files: The mechanisms for configuring FO output are still very much under development and we welcome suggestions by users.

LaTeX Output

The latex output mode produces good results for longer texts which fit well into the pre-defined LaTeX environments. The number of supported CSS properties is limited though:
• font-weight
• font-style
• font-variant
• font-size
• color
• text-decoration
• text-align
• text-indent

To create arbitrarily complex LaTeX output, you may want to use the <pb:template> extension to the ODD syntax. It is heavily used to e.g. generate the LaTeX version of this documentation. See also serafin.odd or vangogh.odd for examples.

TEI Publisher creates a default LaTeX prolog based on standard packages and settings. You may overwrite the defaults by providing your own template within the ODD element spec for the TEI root element. See the example ODDs mentioned above. Note that TEI Publisher will generate some LaTeX macros for styles defined in the ODD, which should be imported into the prolog. The styles are added to the default configuration map and can be accessed via $config('latex-styles') . Refer to the example ODDs and just copy/paste the corresponding lines.

This output mode requires a local installation of LaTeX on the machine running TEI Publisher. The examples have been tested on a default installation of MacTeX 2018. If you are not running MacTeX, you likely need to adjust the path to the LaTeX binary in the XQuery configuration module modules/config.xqm . Search for the variable $config:tex-command and adjust it to point to a binary of xelatex , pdflatex or lualatex .

ePub Output

The epub output mode extends the HTML mode. You may define general styling in an extra CSS file, located in resources/css/epub.css . This external stylesheet is included in all generated epub files and may be used to configure general settings like page breaks, hyphenation, font sizes etc.

Extensions to the Processing Model Specification

XQuery Instead of XPath

The implementation directly translates processing model instructions into an XQuery 3.1 module by generating executable XQuery code. This is straightforward, as the resulting XQuery closely resembles the specification in the ODD and is thus easy to debug. It also leads to very efficient code, which is as fast as or even faster than a hand-written, optimized transformation.

As a welcome side effect, any valid XQuery expression might be used wherever the spec expects an XPath expression, e.g. in predicates or parameters. For example, one can define variables inside a parameter using a standard XQuery let $x := ... return ... syntax.
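For instance, a parameter value could bind a variable before returning a result. This is only a sketch, and the normalization shown is an arbitrary illustration:

```xml
<model behaviour="inline">
    <!-- XQuery let/return used where the specification expects an XPath expression -->
    <param name="content" value="let $text := normalize-space(.) return upper-case($text)"/>
</model>
```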

Default Processing Model Rules

It is possible to define a default <elementSpec> to be applied to all elements which are not already matched by another elementSpec. By default, if no <elementSpec> is present for an element, its text content is output. To change this behaviour and omit the content of elements without a specification, you may want to define a default <elementSpec> as shown below:
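A sketch of such a default rule, assuming the wildcard identifier * is the convention used to address all otherwise unmatched elements:

```xml
<!-- Assumed convention: ident="*" stands for any element without its own elementSpec -->
<elementSpec ident="*" mode="add">
    <!-- suppress the content of all unmatched elements -->
    <model behaviour="omit"/>
</elementSpec>
```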

You can also define models to be applied to all text nodes, e.g. if you need to normalize certain nodes:
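A sketch of a text node rule, assuming the identifier text() is the convention for addressing text nodes. The normalization itself (replacing the long s) is an arbitrary illustration:

```xml
<!-- Assumed convention: ident="text()" addresses text nodes -->
<elementSpec ident="text()" mode="add">
    <model behaviour="text">
        <!-- replace every long s with a regular s before output -->
        <param name="content" value="translate(., 'ſ', 's')"/>
    </model>
</elementSpec>
```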

Note that outputting text nodes is a performance critical operation, so use with care. Too complex processing may dramatically slow down rendering.

External Parameters

The script calling the processing model may pass external parameters into the ODD. They will be available in the variable $parameters , which is an XQuery map. Access parameters using ? , the XQuery lookup operator. For example, one can use this feature to control how specific parts of the document are output, without having to define a separate output mode, which would result in much more code. Below we display a shortened header for the document, containing simply its title, but only if the parameter "header" is set to "short":

...
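A sketch of such a rule is given here. The element, the path to the title and the fallback behaviour are assumptions chosen for illustration:

```xml
<elementSpec ident="teiHeader" mode="change">
    <!-- used only when the calling script passes header=short -->
    <model predicate="$parameters?header = 'short'" behaviour="block">
        <!-- show nothing but the document title -->
        <param name="content" value="fileDesc/titleStmt/title"/>
    </model>
    <!-- fallback: regular header processing (behaviour assumed) -->
    <model behaviour="metadata"/>
</elementSpec>
```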

The web component also lets you define arbitrary parameters to be passed to the ODD via <pb-param> . For example, the breadcrumbs shown above this documentation page are realized by setting a parameter mode , which can be queried in model predicates with $parameters?mode='breadcrumbs' .
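In a page template, passing such a parameter could be sketched as follows, assuming the parameter is declared as a child of the viewing component (the src value is a placeholder):

```xml
<!-- The src value is a placeholder; the pb-param child carries the parameter -->
<pb-view src="document1">
    <pb-param name="mode" value="breadcrumbs"></pb-param>
</pb-view>
```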

If the parameter is set, the processing model rules in the ODD will output only the headings of all ancestor sections of the current division, ignoring everything else. This approach helps to reuse the same ODD for viewing specific aspects of the document. A dedicated user interface web component exists for toggling between two values of a parameter. The example below produces a checkbox which, when checked, sets the value of $parameters?mode to diplomatic , and otherwise to norm .

<pb-toggle-feature name="mode" on="diplomatic" off="norm">Diplomatic View</pb-toggle-feature>

Code Templates and Custom Behaviours

The two dozen behaviours defined by the TEI processing model are enough to cover most HTML output tasks, but other output formats like LaTeX may require more customization and control over the generated output. The TEI Publisher library thus extends the processing model syntax with two custom elements for defining code templates.

While TEI Publisher does provide ways to write your own behaviours in XQuery and thus extend the ones defined in the guidelines, this should only be used as a last resort: custom XQuery behaviours limit the portability of the ODD and are bad for maintenance. Avoiding custom behaviours works quite well for HTML output and we have realized complex projects with just two or three extension behaviours. Things start to become more difficult if you try to output LaTeX though: there are hundreds of packages to use, and users typically define their own macros or environments for all recurring typesetting tasks. For example, to print a TEI <persName> , experienced LaTeX users would normally create a corresponding \persName macro and handle the formatting details there. Unfortunately, out of the box the TEI processing model does not facilitate this level of customization.

Introducing <pb:template>

TEI Publisher thus supports an extension to the ODD syntax in its own namespace (http://teipublisher.com/1.0). Within the ODD, a <model> may contain a <pb:template> element holding a code template. The template is expanded first and the result is passed into the behaviour specified for the model, replacing the default content parameter accepted by all behaviours. The very simple case of outputting a <persName> in LaTeX could thus be written as:

\persName{[[content]]}

The template can reference other parameters defined within the <model> by enclosing the parameter name in double square brackets. In the example above we reference the default parameter content, which contains the nested content of the <persName> tag. The parameter is processed before it is passed into the template, so if <persName> contains nested TEI markup, the corresponding processing model rules are applied first. The result of expanding the template then becomes the new content parameter passed to the behaviour (inline in the example above), which is processed in the normal way as defined in the TEI guidelines. You may also specify additional parameters to be included in the template. For example, the TEI document may contain a glossary of terms which are referenced in the text through a @ref attribute. In LaTeX this would translate to \glslink{ref}{text}, which can easily be produced by the following template:

\glslink{[[ref]]}{[[content]]}

We define an additional parameter ref , which contains the id string from the @ref attribute, stripping out the leading '#'. The templating mechanism is not limited to LaTeX, but may also be used to generate HTML or FO, for example, if you have to generate a complex HTML fragment to represent a single TEI element. This is hard and sometimes impossible to achieve without templates. We'll see some examples in the next section.
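The substitution rule described above can be modelled in a few lines. The following Python sketch only illustrates the idea; it is not TEI Publisher's implementation, which lives in XQuery. The function name expand_template and the handling of undefined parameters (they expand to nothing) are assumptions made for this example.

```python
import re

PLACEHOLDER = re.compile(r"\[\[([^\]]+)\]\]")

def expand_template(template, params):
    """Replace each [[name]] placeholder with the value of parameter `name`."""
    def substitute(match):
        # Assumption: a parameter that was not supplied expands to nothing.
        return str(params.get(match.group(1), ""))
    return PLACEHOLDER.sub(substitute, template)

# 'ref' mimics a parameter whose value had the leading '#' stripped;
# 'content' stands for the already processed nested content.
params = {"ref": "#odd".lstrip("#"), "content": "ODD"}
print(expand_template(r"\glslink{[[ref]]}{[[content]]}", params))
```

The important point is the order of operations: parameters are evaluated first, and only their results are spliced into the template text.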

Defining New Behaviours in the ODD

By combining code templates with parameters we arrive at a very simple mechanism for defining new behaviours right inside the ODD. Take the TEI Publisher documentation as an example: it is written in DocBook 5 and transformed via ODD. The documentation includes some videos which are hosted on YouTube. In DocBook those are represented by <videodata> elements inside a <videoobject>:

Screencast

In the HTML output we would need to transform this into an <iframe>.

Note The default namespace has to be reset inside the template. This is required because the default namespace in an ODD document is the TEI namespace. You thus need to reset it whenever you want to output elements in another or no namespace inside a template. Without this, the iframe would end up in the TEI namespace. Web browsers will usually ignore this, but it would be wrong nevertheless.

All behaviours should be declared in the TEI header or, to be exact, in the <tagsDecl> inside the <encodingDesc>. You may have multiple behaviour declarations with the same @ident, provided that they apply to different @output modes. Parameters specified via <pb:param> without a @value attribute are expected to be passed to the behaviour from the calling model. A parameter may be empty though. If you define an XPath expression in the @value attribute, the result of evaluating it will be used as the value of the parameter. The new behaviour is named iframe and takes three parameters: src, width and height. It can now be called from a <model> as follows:

For further code examples, please have a look at docbook.odd , which is used for viewing the documentation.

Note At the time of writing, the graphical ODD editor in TEI Publisher does not yet support defining your own behaviours via <pb:behaviour>. You thus have to make those changes in the source XML using eXide, oXygen or another XML editor. You can, however, use the graphical editor to continue editing the ODD afterwards. It is smart enough not to overwrite your hand-written code upon save.

Extension Modules

Where possible, developers should stick to the standard behaviours defined by the TEI guidelines, or use the <pb:template> and <pb:behaviour> extensions of the ODD syntax. However, there might be situations in which it is necessary to generate a specific type of complex output which requires the full power of XQuery. To facilitate this, the implementation allows additional extension modules to be configured:

Configuration

Configuration is done via an XML file which must reside in the same collection as the source ODD files. It contains a series of elements, each representing a particular output mode (e.g. web or print) via its @mode attribute. An element without a mode groups modules available for all output modes. Each of these elements lists the modules to be loaded for the specified output mode. Each definition may optionally be limited to a specific ODD, whose name is given in the @odd attribute.

Whenever the library tries to locate a processing model function for a given behaviour, it will first check every extension module it knows to see if it contains a matching function. One can thus override the default functions as well as define new ones. An extension module may also contain general purpose XQuery functions you want to call from within an ODD parameter, e.g. for formatting a date or outputting a number. To make those functions available to all output modes, just omit the @mode attribute.
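The lookup order described above, extension modules first and built-in functions second, can be pictured with a small model. This is a hedged Python sketch, not the library's XQuery code: find_behaviour, the dictionaries standing in for modules and the sample behaviours are all invented for illustration.

```python
def find_behaviour(name, extension_modules, default_functions):
    """Return the function implementing `name`, preferring extension modules."""
    for module in extension_modules:      # checked in configured order
        if name in module:
            return module[name]
    return default_functions.get(name)    # None if the behaviour is unknown

# Hypothetical stand-ins for the built-in functions and one extension module.
defaults = {"inline": lambda text: f"<span>{text}</span>"}
ext_html = {
    "inline": lambda text: f"<em>{text}</em>",    # overrides a default
    "code": lambda text: f"<pre>{text}</pre>",    # adds a new behaviour
}

print(find_behaviour("inline", [ext_html], defaults)("x"))
```

Because the extension module is consulted first, its inline function shadows the default one, while behaviours it does not define still resolve to the defaults.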

Implementing Custom Behaviours

To be recognized by the library, an extension function must accept at least 4 default arguments, plus any number of custom parameters. The required parameters are $config, $node, $class and $content, in that order, as in the example function below. For each additional parameter, the processing model implementation looks through the children of the <model> in the ODD for one whose name matches the variable name and passes its value. If no matching parameter can be found, the function argument is set to the empty sequence. You should not enforce a type or cardinality on any of the custom parameters, as this may lead to unexpected errors: the parameters may be empty or contain more than one item. For example, a custom behaviour called code for syntax highlighting in an extension module named ext-html.xql might look as follows:

xquery version "3.1";

(:~
 : Non-standard extension functions, mainly used for the documentation.
 :)
module namespace pmf="http://www.tei-c.org/tei-simple/xquery/ext-html";

declare namespace tei="http://www.tei-c.org/ns/1.0";

declare function pmf:code($config as map(*), $node as element(), $class as xs:string+,
        $content as node()*, $lang as item()?) {
    {replace(string-join($content/node()), "^\s+?(.*)\s+$", "$1")}
};

It defines one function, pmf:code , which can be called from the ODD as follows, provided that the ext-html.xql module has been configured as described in the previous section .

Custom Behaviours Accepting User-Defined Parameters

Sometimes you may want to implement a generic behaviour which takes arbitrary parameters from the user. This means the parameter list of your behaviour will not be fixed. To facilitate this, a behaviour function may declare a final parameter $optional as map(*). If the processor finds parameters in the model which cannot be mapped to an explicitly declared parameter, it stores all such extra parameters as key/value pairs in the $optional map.
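The way parameters are matched to function arguments can be modelled as follows. This Python sketch is an illustration under stated assumptions, not the processor's actual code: call_behaviour and link are hypothetical names, None stands in for XQuery's empty sequence, and extra parameters are simply dropped when the function declares no optional argument.

```python
import inspect

def call_behaviour(func, config, node, css_class, content, model_params):
    """Call `func`, filling its custom parameters from the model's parameters."""
    custom = list(inspect.signature(func).parameters)[4:]  # skip the 4 defaults
    wants_optional = bool(custom) and custom[-1] == "optional"
    declared = custom[:-1] if wants_optional else custom
    # A declared parameter without a match is passed as None, standing in
    # for XQuery's empty sequence.
    args = [model_params.get(name) for name in declared]
    if wants_optional:
        extras = {k: v for k, v in model_params.items() if k not in declared}
        args.append(extras)
    return func(config, node, css_class, content, *args)

# Hypothetical behaviour: one declared parameter plus the catch-all map.
def link(config, node, css_class, content, target, optional):
    return target, optional

print(call_behaviour(link, {}, "ref", ["tei-ref"], "text",
                     {"target": "#n1", "rel": "external"}))
```

Here target is matched by name, while rel has no declared counterpart and therefore ends up in the catch-all map.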

Building a Sample-Based ODD

If you do not want to start a customized ODD from a blank template, you can alternatively generate one that covers the classes and elements of a selection of TEI files stored in TEI Publisher's data collection. Simply select one or more sample documents in the list to the left, enter a name and title into the form and click on Create from examples. Note that if you haven't removed the default example TEI files that ship with TEI Publisher, their markup will be included in the constructed ODD as well.

This method uses the oddbyexample. stylesheet that is part of the TEI Consortium's stylesheets. Users with a corpus of one or more TEI files can generate a custom ODD that contains explicit additions and deletions for all possible TEI modules, as well as for the attribute values found in the input corpus. By default, the basis of comparison for determining which elements have been modified in the examples is the full tei_all.odd.

Advanced Use

To further tweak the building process you can call the functions of the odd-by-example.xql module from your own XQuery code. If you wish to generate your own basis for the comparison, you can call the following function to store a compiled ODD in the default ODD location of TEI Publisher:

obe:compile-odd(doc('../odd/my-file.odd'), 'my-file-name')

Note Due to a bug in the odd2odd.xsl stylesheet the output of this function is not always valid. Make sure that only valid documents are used for further processing.

You can also modify the transformation parameters of:

obe:process-example(doc('../data/test/myTEI.xml'), 'odd-name', 'simplePrint')

The above example uses simplePrint as the basis for building the new ODD. The full list of configurable options is:


Custom CSS styling

The CSS stylesheet resources/css/theme.css defines the styles used by all pages of TEI Publisher and of Publisher-generated applications. Users should nevertheless not modify this file directly, but create a project-specific CSS customization file and include it alongside theme.css.

This approach lets you selectively override certain styles and CSS variables from theme.css while staying on an easy upgrade path for future TEI Publisher and pb-components updates.

Customizing web components styling

A web component completely shields its content, so it cannot be styled from outside. Web component styles remain encapsulated, preventing style contamination between individual components and the general application context. This is a blessing in general, as it allows components to be scripted and styled without fear of collision with other parts of the page, but it poses additional challenges when adjusting the look and feel of a component to fit a project's theme. For Publisher, encapsulation of web components means that definitions in theme.css or equivalent project customization CSS files cannot directly govern the styling of web components. While some aspects of component styling remained inaccessible for customization in versions preceding Publisher 6, pb-components now expose style properties to the outside world via standard CSS variables. This way, variables like --pb-footnote-color defined in theme.css can be accessed by e.g. the pb-view component and thus determine the color of the footnote marker in the rendered transcription. Note that while you cannot change the inner appearance of a component except by setting its custom CSS properties, you can style the component itself within the HTML template, e.g. to position it within the layout of the page.

External stylesheets

The ODD specification allows for the explicit declaration of an external CSS file which may define styles and CSS classes to be applied to transformed sources (in encodingDesc/tagsDecl/rendition), e.g.

Styles and classes from that file are loaded into the pb-view component and are thus available to its content. External stylesheets for pb-view can also be specified via the load-css component configuration attribute. In this scenario, unlike with the ODD rendition, regenerating the ODD is not required for changes to the CSS file to take effect. Otherwise both methods are functionally equivalent.

A broader discussion of custom styles can be found in this section.

Page templates and pb-components

As described earlier, the various sample documents included in the TEI Publisher Demo collection differ not only in the ODD they use, but also in the general layout and composition of the page. They are based on different HTML templates, which can be found in the templates/pages collection of the TEI Publisher app. Each template assembles various building blocks in a slightly different way: some examples show a facsimile view next to the text, others a parallel display of transcription and translation, some include a map and another showcases a collapsible metadata section. Which template is used is determined by a processing instruction in the TEI sources of these examples. The building blocks we mentioned are custom HTML elements. Each of them encapsulates a certain functionality and appearance. The map, the facsimile, but also the text view itself and all controls are custom HTML elements. They are like "Lego" blocks which can be freely moved around and rearranged without knowing anything about the internal implementation of the component.

Web Components

The technology enabling this Lego-like modular approach is a W3C standard called Web Components . It is already built into many browsers and support is improving quickly, reducing the need for external frameworks. There's a growing collection of ready-to-use components available, e.g. the Polymer elements we use for menus, buttons, dropdowns etc. TEI Publisher from version 6.0 exposes its collection of Web Components targeted at creating digital editions as a separate pb-components package. You do not need to know much about Web Components to use them in TEI Publisher. From a user perspective, a component looks like any other HTML element. You configure it by setting its properties via attributes. For example, the following HTML code snippet will display the first page/section of two completely different documents as you can see below in the embedded Codepen (to learn more on embedding Publisher output and components see further chapters )

<pb-page>, <pb-document> and <pb-view> are three web components from the pb-components library, while the remaining element is a standard HTML5 tag. The name of a custom element must start with a prefix to distinguish it from standard HTML. This concept should be familiar to XML people. For TEI Publisher components, the prefix is always pb-. Components from other sources use different prefixes, e.g. paper- and iron- for the Polymer collection.

The part of the page which uses TEI Publisher web components should always be wrapped in a <pb-page> element. This element determines the TEI Publisher server instance all other components communicate with (see the next section below). It is also responsible for some other initialization steps, e.g. loading the list of available user interface translations.

<pb-document> specifies a document source, which can then be referenced by id from other components. The component provides a way to configure basic properties governing the document's default rendering, like the associated ODD file. In the example above, we define three properties for each document:

<pb-view> is the critical component in TEI Publisher: it provides the actual text view by transforming a part or the entirety of the source XML into HTML based on the processing model instructions in the ODD.

Because web components are all about encapsulation, <pb-view> ensures that the styling of the text as governed by the ODD is confined to the boundaries of the component. This makes it possible to display two completely heterogeneous texts (like the documentation and Kant's Kritik) on the same page without styles contaminating each other. As a downside, encapsulation also poses some challenges, which we discussed in the section about CSS styling.

Webcomponent Documentation

To better understand the various components TEI Publisher provides, it is best to have a look at the small examples contained in the web components API documentation . The list of components may be overwhelming at first sight. However you don’t need to learn them all. There are just a few components that should be understood before you start customizing. Their demo pages showcase a working example along with the code snippet which actually implements it. You can also get an editable live view like the one above if you click on the Edit Code button to the bottom right of each example. In particular you may want to look at the following examples:

Caveat Some properties of pb-view and other components are boolean properties. In HTML5 this corresponds to an attribute without value, which is illegal in XML. If you want to preserve valid XML, just write the attribute with the same name and value, e.g. append-footnotes="append-footnotes" .

Communication between Components

To allow for maximum flexibility, nearly all of the TEI Publisher web components communicate via events: this avoids hard wiring, and components may appear anywhere on the page. For example, the controls for paginating through a document do not talk directly to the document view: they just send an event indicating the user's wish to navigate backward or forward. Components listening for this event may then react to it by refreshing the text being displayed. Since you can have multiple text views showing content from different sources, every event can be announced on a specific communication channel. This allows us to distinguish between different sources, e.g. two transcriptions being shown side by side. Most TEI Publisher components therefore accept two properties, subscribe and emit, to configure the channels they listen on and send events to. If neither of these properties is given, the component will subscribe and emit to the global default channel. A component may also send to a different channel than it subscribes to, allowing chains of events.
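The channel idea can be modelled independently of the browser. The following Python sketch is only an analogy for the event routing described above; the real components use DOM events, and the EventBus class, the channel names and the event payload are all hypothetical.

```python
from collections import defaultdict

DEFAULT_CHANNEL = "default"   # assumption: stand-in name for the global channel

class EventBus:
    """Routes events to the listeners subscribed to the same channel."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, callback, channel=DEFAULT_CHANNEL):
        self._listeners[channel].append(callback)

    def emit(self, event, channel=DEFAULT_CHANNEL):
        for callback in self._listeners[channel]:
            callback(event)

bus = EventBus()
shown = {"transcription": [], "translation": []}
bus.subscribe(shown["transcription"].append, channel="transcription")
bus.subscribe(shown["translation"].append, channel="translation")

# A pagination control that emits only on the "transcription" channel.
bus.emit({"direction": "forward"}, channel="transcription")
print(shown)
```

Only the view subscribed to the transcription channel receives the navigation event; the translation view is left untouched, which is what makes side-by-side displays possible.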

Common properties and methods accepted by many TEI Publisher components are defined in the class PbMixin.

Page Templates

TEI Publisher currently includes several different page templates, which combine the building blocks described above with off-the-shelf components to achieve a certain page layout and composition. If you look at the HTML code, you'll see a mix of pb- elements and app- , paper- , iron- elements. The last three belong to the Polymer collection and you can find them documented in the public webcomponent registry . TEI Publisher components are not yet available there, though we may move some of the general purpose components there later. To avoid redundancy, the page template files use eXist's templating feature to drag in some repeating parts which are the same for all pages, for example, the toolbar and the menu. You'll find these files in the corresponding sub-collection:

The following page templates are currently available in TEI Publisher:

Important Note The page templates are meant as examples to be copied and modified by users. They were written to match a concrete example and are not intended to be universal. TEI is too heterogeneous to provide a one-size-fits-all solution. We thus believe that providing a wide range of practical examples is the best way to help users realize their own projects. Only the generic template, view.html, should work with all example TEI documents.

Create your own Template

To create your own template:

1. open one of the existing templates in e.g. eXide (by clicking the links in the list above)
2. adjust the content attribute of the meta tag to a label that fits your template
3. save the file under a different name into the same collection (templates/pages)

Now reload the document you'd like the template to apply to and you'll be able to switch to your new template, either by

• using the Template dropdown in the Settings panel
• adding a parameter template=mytemplate.html to the URL shown by the browser
• adding a processing instruction to a TEI document to make a specific template the default:

Handling Complex Alignments

It is often desirable to show two or more views of a document at the same time, for example to display the translation aligned with a given source fragment. In the simplest case, the transcription and translation may be aligned on the level of divisions or page breaks and one can simply use two <pb-view> elements referencing different starting points in the TEI document (this approach is implemented by the translation.html template for Serafin's letter).

Unfortunately things are not always as simple as that. For example, even if the transcription contains page breaks or milestones which can be used to display a single page, the translation might not. One thus needs a different approach to compute the alignment between fragments. The logic of the alignment algorithm, however, depends heavily on the conventions used in the encoding. TEI allows a wide variety of alignment mechanisms and we do not want to limit the freedom of the editor by prescribing a particular method.

TEI Publisher thus implements a generic way to plug an XQuery function into the processing pipeline. The function takes the source element being processed as input and may replace it by its aligned equivalent. Such an equivalent may be another element or fragment from the same or a different document. The source element will usually point to the part of the transcription being displayed. The mapping function uses this as a starting point to determine an aligned fragment and returns it. The returned fragment is then passed on through the processing model.

The XQuery mapping function should be defined in the module modules/map.xql. It takes an element as its only argument and may return any valid TEI fragment, which will become the input for further processing through the processing model. The local name of the mapping function can then be supplied in the map attribute of <pb-view>. As an illustration, the Van Gogh example includes the following pb-view for displaying the translation:

In the Van Gogh letters, the translation contains page breaks corresponding to the page breaks in the original letter, but these use a different prefix for the xml:id. To align the translation with the transcription, we only need to adjust the id and retrieve the corresponding page break. The XQuery mapping function is thus rather simple:

declare function mapping:vg-translation($root as element()) {
    let $id := ``[pb-trans-`{$root/@f}`-`{$root/@n}`]``
    let $node := root($root)/id($id)
    return $node
};

Note that returning the corresponding node of the translation is sufficient here, as further processing will automatically extract the page fragment up to the next <pb>. More complex cases may require the mapping function to return an arbitrary TEI fragment. Also note that the xpath attribute of the <pb-view> element in the template must still point to the source transcription (div[@type='original'] in this case). It is just the mapping function which translates a position in the source transcription into a corresponding fragment in the translation. The letter by Cortez to Dantiscus sent from Mexico demonstrates a much more sophisticated alignment, determining the translation fragment to be shown by inspecting the ID range of the transcription. It illustrates the case where no milestone elements exist in the translation to explicitly mark the page boundaries of the original, so the mapping algorithm aims to display the closest corresponding fragment of the translated text.
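For readers less familiar with XQuery, the same id-rewriting idea can be sketched in Python with the standard library. This is an illustrative model only: the sample document, the pb-orig- prefix and the function name map_to_translation are assumptions, and only the pb-trans- prefix and the @f and @n attributes come from the mapping function shown above.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def map_to_translation(pb, document_root):
    """Return the page break in the translation matching the source `pb`."""
    target_id = f"pb-trans-{pb.get('f')}-{pb.get('n')}"
    for element in document_root.iter():
        if element.get(XML_ID) == target_id:
            return element
    return None   # no aligned page break found

# Hypothetical miniature document with one page break on each side.
doc = ET.fromstring(
    '<TEI>'
    '<div type="original"><pb f="1r" n="1" xml:id="pb-orig-1r-1"/></div>'
    '<div type="translation"><pb f="1r" n="1" xml:id="pb-trans-1r-1"/></div>'
    '</TEI>'
)
source_pb = doc.find("div[@type='original']/pb")
print(map_to_translation(source_pb, doc).get(XML_ID))
```

As in the XQuery version, the function returns only the aligned page break; extracting the page fragment that follows it is left to later processing.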

Server-side API

Many of the user interface web components need to communicate with the server to request data, which has to be retrieved, processed and returned before the client-side components can display it to the user. For example, when the user clicks the Table of Contents button, the corresponding pb-load component sends an Ajax request to retrieve the list of chapters. Similarly, when the user types something into the autocomplete field, a request is sent after each keystroke to find matching terms. These requests need to be received and processed by the server running the eXist instance, and only then is data returned to the browser.

Previous versions of TEI Publisher included a number of XQuery modules implementing the server-side functionality, located in the modules collection. In version 7, a new approach has been introduced. A formal API specification, based on the OpenAPI standard, defines all the server-side API endpoints available in Publisher. The image below presents the section of the API handling all ODD-related operations: creating, retrieving, updating, deleting and recompiling ODDs; linting a code fragment (syntax check); and retrieving a list of all available ODDs.

TEI Publisher API page - the odd endpoint

Advantages of the API-based approach

• clear specification and easy overview: users and developers will benefit from the clear specification of the available functionality
• customizable: users can easily override and extend existing API endpoints as well as add custom ones for project-specific functionality
• separation of concerns: server-side functionality can be used directly by any software system, not necessarily within the context of TEI Publisher or using Publisher's UI components; further changes in the internal API implementation do not require any adjustments of the UI components
• standards-based: OAS compliance means we are using a well-known and documented standard and can benefit from existing tools for writing, testing and generating documentation

API endpoint groups

TEI Publisher 7 groups server-side functionality into several main sections:

Custom API endpoints

Beyond the standard server-side functionality provided by TEI Publisher, careful provisions have been made to allow users to add their own API endpoints. The Custom API section demonstrates how project-specific functionality can be added within the OAS API scheme. All that is required is to edit the modules/custom-api.json file and add specifications for custom endpoints in JSON format. The operationId property specifies the XQuery function to be called to process the request. The function implementation can either be added to the XQuery module modules/custom-api.xql itself or to a separate module, which must be imported into modules/custom-api.xql so that the request can be resolved correctly. See the README of the oas-router package for more information about how to write a request handler function.

Custom API specification - show article

To reference a custom function declared directly in modules/custom-api.xql , use the prefix custom: in the operationId . For functions in imported modules, use the prefix defined in the import statement.
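The prefix convention can be pictured as a two-step lookup: the part before the colon selects a module, the part after it selects a function. The Python sketch below is a simplified model of that rule, not the oas-router implementation; resolve_operation, the art prefix and the handler functions are hypothetical.

```python
def resolve_operation(operation_id, modules):
    """Look up the handler named by an operationId of the form prefix:name."""
    prefix, _, local_name = operation_id.partition(":")
    try:
        return modules[prefix][local_name]
    except KeyError:
        raise LookupError(f"no handler registered for {operation_id!r}")

# Hypothetical handlers: one declared in custom-api.xql itself ("custom"),
# one in an imported module bound to the made-up prefix "art".
modules = {
    "custom": {"hello": lambda request: "Hello " + request["name"]},
    "art": {"show": lambda request: "article " + request["id"]},
}

print(resolve_operation("custom:hello", modules)({"name": "TEI"}))
print(resolve_operation("art:show", modules)({"id": "42"}))
```

An operationId whose prefix or function name is unknown cannot be resolved, which mirrors the requirement that every module be imported into modules/custom-api.xql.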

Language versions

TEI Publisher is already available in twenty languages, and this number keeps increasing thanks to the engagement of our user community. With version 6.0, i18n support has been greatly extended to cover not only labels and attribute values in HTML templates but also those within web components. A mechanism for project-specific language files extending the default Publisher label collection has been added. Thanks to community contributions via Crowdin, a number of languages have been added and existing ones are updated when needed. We welcome and encourage additions and amendments. Please consider Crowdin the master repository for translations and publish your contributions there instead of submitting a pull request directly. Do not hesitate to get in touch if a language you'd like to support is not yet listed. Crowdin users only see a form-based graphical user interface, but the i18n files use the JSON format to preserve their logical structure. The initial portion of the German localization file de.json is shown below for illustration.

Translation JSON file structure

In TEI Publisher and in generated apps, translation files are by default loaded from a CDN together with pb-components. When a project needs to add new labels or change the wording of existing ones, the customization mechanism described below can be used.

Using i18n in page templates

There are several scenarios for using i18n labels:

• directly within an HTML element
• in an attribute
• passed in a component-specific structure

HTML text node with pb-i18n

Any text fragment within an HTML element can be made a target for i18n by wrapping it in a <pb-i18n> component. <pb-i18n> elements listen for events emitted by <pb-lang>.

Documentation

Attributes (on HTML elements or custom web components)

When it is necessary to translate the value of an attribute a mechanism based on data- attributes is used. Supply additional data-i18n attribute specifying the key of the i18n label to use preceded by the name of the attribute that needs to be translated. In the example below it's the heading attribute that needs to be filled with the translated version of ODD Files label. The label is stored in JSON file under odd.files , so [heading]odd.files needs to be used for the data-i18n attribute.

... content of the ODD list

When <pb-lang> is switched to Spanish, the heading will read Archivos ODD instead of ODD Files.

Output for with translated @heading
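The [attribute]key convention used by data-i18n is easy to take apart. The Python sketch below models how such a value could be split and resolved against a nested label file. It is an illustration only and not the pb-components code; translate_attribute and the tiny Spanish label set are assumptions built from the example above.

```python
import re

def translate_attribute(data_i18n, labels):
    """Split '[attribute]key' and look the dotted key up in `labels`."""
    match = re.fullmatch(r"\[([^\]]+)\](.+)", data_i18n)
    if match is None:
        raise ValueError(f"expected '[attribute]key', got {data_i18n!r}")
    attribute, key = match.groups()
    value = labels
    for part in key.split("."):       # 'odd.files' -> labels['odd']['files']
        value = value[part]
    return attribute, value

# Minimal stand-in for a Spanish language file.
spanish = {"odd": {"files": "Archivos ODD"}}
print(translate_attribute("[heading]odd.files", spanish))
```

The first element of the result names the attribute to overwrite, the second is the translated text to put into it.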

Special web component properties configured via attributes

Some web components accept more complex configuration options via arrays passed in attributes. For example, the component listing the documents allows you to specify a number of label/value pairs for the dropdown menus used for ordering and filtering the document list. Understandably, the labels should switch in line with changes to the language chosen via <pb-lang>. Therefore i18n translation keys like browse.title or browse.author are used instead of text values. Note that web components may have particular expectations regarding the data format, so consult the API documentation for each component.

Project specific i18n files

It is a common need that projects will need their own internationalized labels for menu items, dialogs and other user interface elements. All these can be stored in JSON language files, following the same naming conventions and basic structure as TEI Publisher ones. To use custom language files, you need to specify the path under which they can be found relative to your application:

The path expression requires placeholders for two parameters, ns and lng. Given the above configuration, TEI Publisher will search for a custom language file for, say, French in resources/i18n/app/fr.json. If you prefer a flat directory structure, you could change the locale to resources/i18n/{{ns}}_{{lng}}.json and TEI Publisher will look for a file resources/i18n/app_fr.json. You may also define additional namespaces to be searched with the locale-fallback-ns parameter:

which means that TEI Publisher will search for labels in the my-module namespace first, then fall back to the app namespace and, if the label still cannot be found, use TEI Publisher's default namespace. The latter is called common and should not be overwritten. The listing below shows a fictional en.json with a custom set of labels. A new top-level key has been added as well as 3 subkeys for the menu section. The project now has access to a number of new i18n keys: menu.about, menu.contact, menu.statute and greeting. Furthermore, a value for menu.documentation has been specified. That key already existed in Publisher (set to "Documentation"), but the version from the custom file takes precedence and is used in the custom app.

{
    "menu": {
        "documentation": "Docu",
        "about": "About",
        "contact": "Contact",
        "statute": "Statute"
    },
    "greeting": "Welcome"
}
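Both the path placeholders and the namespace fallback order can be modelled in a few lines. The following Python sketch is illustrative only: the nested pattern resources/i18n/{{ns}}/{{lng}}.json is inferred from the paths quoted above, and the function names as well as the menu.download label are invented for the example.

```python
def locale_path(pattern, ns, lng):
    """Fill the {{ns}} and {{lng}} placeholders of a locale path pattern."""
    return pattern.replace("{{ns}}", ns).replace("{{lng}}", lng)

def lookup(labels, key):
    """Resolve a dotted key such as 'menu.about'; None if it is missing."""
    value = labels
    for part in key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value

def translate(key, namespaces, bundles):
    """Return the first match, searching the namespaces in the given order."""
    for ns in namespaces:
        value = lookup(bundles.get(ns, {}), key)
        if value is not None:
            return value
    return None

bundles = {
    "app": {"menu": {"documentation": "Docu", "about": "About"},
            "greeting": "Welcome"},
    # Stand-in for Publisher's default labels; 'download' is hypothetical.
    "common": {"menu": {"documentation": "Documentation",
                        "download": "Download"}},
}
order = ["my-module", "app", "common"]

print(locale_path("resources/i18n/{{ns}}/{{lng}}.json", "app", "fr"))
print(translate("menu.documentation", order, bundles))
print(translate("menu.download", order, bundles))
```

The custom value for menu.documentation wins because the app namespace is searched before common, while labels that exist only in the defaults are still found through the fallback.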

Creating applications with the App Generator

Once you are happy with a certain ODD and HTML template customization, you can easily create a complete, standalone application. Such an application can be tailored further to fit your needs. The generated app can be downloaded as a portable xar package, to be installed into other eXist instances, or synchronised to disk for further development. It provides a fully functional application scaffolding based on TEI Publisher components and modules, pre-configured to use a certain ODD, page template and other settings.

Note Early versions of the TEI Publisher included everything they needed to run without TEI Publisher. This had some advantages, but also made it more difficult to update to newer releases. Starting with version 5 we are gradually moving towards more lightweight app design, ultimately containing only resources which are truly specific to the app (ODDs, HTML templates, images), while all generic functionality will be provided by TEI Publisher and its libraries and packages. Version 6 has seen all web components extracted into pb-components package and in version 7 the server-side code has been based on a well designed API specification with provisions for easy customization. Please make sure you are following our best practice recommendations for a smooth upgrade path. To get started, click on App Generator in the menu bar and fill out the form. The following form fields are of particular importance: Once you created the new application, log into it using the account details you provided. You can then upload XML documents using the upload panel in the sidebar. 63

[Figure: Create an App]

Generated Code Overview

XQuery Code

The collection structure of the generated app follows the typical design of many eXist apps. Best practices for modifying the app are discussed in further sections of this document. You can find the code of your generated app within the /db/apps collection under the name you provided in the abbreviation field of the generator form.

Modifying the App

When you are logged in, the Admin menu in the top navbar provides various links for ease of customization of your app:

Using Multiple ODDs

For performance reasons, the mechanism used by generated apps to transform a document via an ODD is slightly different from the one in the main TEI Publisher app: while TEI Publisher resolves ODDs dynamically, generated apps use a static lookup method. The static method is much faster, resulting in better overall performance, but as a consequence ODDs are not automatically detected by the app and need to be explicitly registered. Each generated app will by default use a single ODD for all transformations, but it is possible to add additional ODDs, e.g. to be used with a pb-view in a custom HTML template. If multiple ODDs are selected when generating the app, the app generator will take care of registering those ODDs. If you would like to add ODDs manually later, the procedure is as follows:

1. Upload the ODD to the resources/odd collection below your app root. This can be done either via eXide or the Upload panel on the start page of your app.
2. In modules/config.xqm find the variable called $config:odd-available and add the name of the new ODD to the sequence.

       (:~
        : The main ODD to be used by default
        :)
       declare variable $config:default-odd := "shakespeare.odd";

       (:~
        : Complete list of ODD files used by the app. If you add another ODD to this list,
        : make sure to run modules/generate-pm-config.xql to update the main configuration
        : module for transformations (modules/pm-config.xql).
        :)
       declare variable $config:odd-available := ("shakespeare.odd");

3. In eXide, open modules/generate-pm-config.xql (path relative to your app root) and execute it once. This will regenerate the XQuery module modules/pm-config.xql, which registers all the ODD modules known to the app.
4. Additionally, you need to regenerate all ODDs used by the app, either by
   1. clicking the Regenerate all ODDs button on the start page of the app, or
   2. executing the POST version of /api/odd via the API documentation page api.html.

Using different ODDs depending on the collection

By default the same ODD will be used for all documents within the app. It is possible, though, to organize documents into a hierarchy of collections beneath $config:data-root and use different ODDs and general view settings for each collection or document type. To do so, search for the function config:collection-config, which by default returns an empty sequence, meaning that the default configuration should be used. Comment this out and enable the switch/case statement below it to return a different config depending on the current collection. The $collection and $docUri parameters are relative paths, i.e. relative to $config:data-root. So for a single-level hierarchy as used in TEI Publisher by default, $collection will be either test, playground or doc. For multi-level hierarchies it might also be e.g. volume1/transcripts. In this case a simple switch/case might not be enough, but you can just replace it with an if/then/else and apply any path matching you like.

    declare function config:collection-config($collection as xs:string?, $docUri as xs:string?) {
        (: Return empty sequence to use default config :)
        ()

        (:
         : Replace line above with the following code to switch between different view configurations per collection.
         : $collection corresponds to the relative collection path (i.e. after $config:data-root).
         :)
        (:
        switch ($collection)
            case "playground" return
                map {
                    "odd": "dodis.odd",
                    "view": "body",
                    "depth": $config:pagination-depth,
                    "fill": $config:pagination-fill,
                    "template": "facsimile.html"
                }
            default return
                ()
        :)
    };


Instead of switching by collection, you could also configure different views depending on the type of document, e.g. by checking the type of the first div:

    doc($config:data-root || "/" || $docUri)//tei:body/tei:div/@type
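As a language-neutral illustration of the path matching mentioned above for multi-level hierarchies, the following Python sketch maps a relative collection path to a set of view settings. In a real app the equivalent logic would be written in XQuery inside config:collection-config; the RULES table and the function name are hypothetical, and the settings for volume1/transcripts are made up for the example (only the playground entry mirrors the listing above).

```python
from typing import Optional

# Hypothetical rules, most specific path first. Each rule applies to the
# named collection and to everything nested below it.
RULES = [
    ("volume1/transcripts", {"odd": "shakespeare.odd", "view": "div", "template": "view.html"}),
    ("playground", {"odd": "dodis.odd", "view": "body", "template": "facsimile.html"}),
]


def collection_config(collection: Optional[str]) -> Optional[dict]:
    """Return the settings of the first matching rule; None means 'use the default config'."""
    if not collection:
        return None
    for prefix, settings in RULES:
        # Match the collection itself or any subcollection, but not siblings
        # that merely share a name prefix (e.g. "playground2").
        if collection == prefix or collection.startswith(prefix + "/"):
            return settings
    return None
```

Matching on whole path segments rather than on a plain string prefix avoids accidentally applying a rule to a sibling collection whose name merely begins with the same characters.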

Exporting the Finished App

To save your finished application or exchange it with other people, you need to save it as an application archive. Application archives use the standardized EXPath Packaging format: the resulting .xar file can be uploaded to any eXist instance via the Dashboard, and the Package Manager will take care of the deployment. There are two ways to create a .xar file from your application:

1. Use the Admin / Download App menu entry in the generated app to directly download a .xar
2. Synchronize the application to a directory on disk via Application / Synchronize in eXide

The first approach is recommended if all you need is a copy of your app on disk. A .xar is just a ZIP archive, so you can unpack it into a directory of your choice, which you can then commit to a version control system like git. However, if you continue to make changes inside the database (e.g. further work on the ODD), you may want to use the second method, i.e. call eXide's synchronization. It requires that you have access to the file system of the server running eXist, though, so it is usually only an option if you run your own eXist instance locally.

The synchronize steps in detail:

• Prerequisite: you need to have the Apache Ant build tool installed.
• Open one resource belonging to your application in eXide. It doesn't matter which one. The only important thing is that the name of your app is displayed next to Current app: on the top right of the eXide window. If this is not the case, stop and check again!
• Click Application / Synchronize in the menu. It opens up a dialog with two fields: Start time and Target directory. When you synchronize for the first time, empty the Start time field. Enter a valid, absolute directory path on your server machine into Target directory.
• Click the Synchronize button. This may take a moment, but you should see a list of written files at the bottom of the dialog afterwards.
• Change to the directory you specified for synchronization.

Watch the screencast below for the whole synchronization procedure.

Note: For security reasons, the password you entered when creating the app is not stored in the database, so it cannot be synchronized to disk. To restore a password for your app, you thus need to edit the repo.xml file in the directory and add a @password attribute to the element.
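Because a downloaded .xar is a plain ZIP archive, unpacking it does not require any eXist-specific tooling. The following is a minimal sketch using Python's standard zipfile module; the function name and the file names in the usage note are assumptions for illustration, and any unzip tool would do the same job.

```python
import zipfile
from pathlib import Path


def unpack_xar(xar_path: str, target_dir: str) -> list:
    """Extract a .xar package (a ZIP archive) into target_dir and return the member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(xar_path) as archive:
        archive.extractall(target)
        return archive.namelist()
```

For example, `unpack_xar("my-app.xar", "my-app")` (hypothetical file names) would leave the app's files, including repo.xml, in the my-app directory, ready to be committed to version control.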

[Screencast: Export an App]

Building from a directory

Once you have a copy of your app extracted in a local directory, you can always rebuild a fresh .xar from it. In the simplest case it is sufficient to call ant inside the directory. For up-to-date build instructions which also cover more advanced uses, refer to the "Building" section in the readme of the TEI Publisher repository.

Best Practice Recommendations

If you wish to further customize the generated app, it is worth keeping your changes as separate from the generated code as possible to allow for future alignment with newer versions of TEI Publisher. The generated app shares most of its XQuery libraries with the main TEI Publisher app. A copy of those is included in the lib/ collection of the generated app and should not be modified! This way you can later update the libraries to a newer TEI Publisher release without breaking your app. Including the libraries in the generated app creates some redundancy, but we chose to accept this trade-off to make it easier to view and modify everything relevant to the app. If you nevertheless find that modifications of lib/ modules are necessary, please consider whether your change would be generally beneficial for TEI Publisher and, if so, create a pull request for TEI Publisher.

It is considered safe:

1. to modify all HTML templates below templates/ as well as index.html and search.html in the root of the app
2. to change XQuery modules in modules, excluding those in modules/lib
3. to add images, fonts or change i18n translations below resources
4. to add custom API functions (written in XQuery) to custom-api.json and custom-api.xql

The following core XQuery modules in every app are safe to be modified (all are stored in the modules subcollection):

Updating Applications

With TEI Publisher 7, we have redesigned the server-side API. Combined with the client-side reorganization brought by TEI Publisher 6, this will make future updates for generated applications a lot easier. Client-side UI and server-side API are now cleanly separated from any custom app elements. For apps created with TEI Publisher 6, updating to 7 requires only small modifications. Migrating from earlier versions, like 4 or 5, requires more effort and is described in further sections.

Upgrading from TEI Publisher 7 to 7.x

Upgrading an app generated by TEI Publisher 7 to another minor release involves the following general steps:

1. Update the webcomponents library (also see the FAQ). The version used is defined by a single variable in modules/config.xqm:

       declare variable $config:webcomponents := "1.24.12";

2. Update server-side modules: those live in modules/lib. You can safely remove the entire folder and replace it with the corresponding one from the new version.

Extra Steps for TEI Publisher 7.1

The new annotation editor in TEI Publisher 7.1 requires a number of files to be copied. If you do not intend to support web annotations in your app (we recommend having a separate app for this), you only need steps 1 and 2:

1. Copy modules/annotation-config.xqm into the corresponding location in your app.
2. Copy templates/basic/controller.xql into the root of your app. This contains an important security fix. If you modified your controller yourself, just replace the last "else", which in the new version should start with:

       let $main :=
           if (matches($exist:path, "^/+api/+(?:odd|lint)")) then
               "api-odd.xql"
           else if (matches($exist:path, "/+tex$") or matches($exist:path, "/+api/+apps/+generate$")) then
               "api-dba.xql"
           else
               "api.xql"

3. Copy odd/annotations.odd to resources/odd in the generated app.
4. Copy templates/pages/annotate.html and templates/pages/annotate.css to the corresponding folders.
5. Also copy resources/scripts/annotations.

Upgrading from TEI Publisher 6 to 7

To upgrade a custom application generated with TEI Publisher 6 to version 7, we recommend cloning the source code of both TEI Publisher 7 and the custom application into a local directory. The commands shown below were used on a Linux system for updating the Dodis demo app, but with appropriate adjustments to the syntax the same actions could be performed on a different operating system. Alternatively, the migration can be executed using eXide's file manager. If you decide to work with the command line, as suggested here, clone TEI Publisher 7 into the directory one level above the one containing your own application source code:

    git clone https://github.com/eeditiones/tei-publisher-app.git
    cd tei-publisher-app
    git checkout v7.0.0

At this point we need to move to the directory where your own application source code is stored:

    cd ../my-custom-app

Copy files

Now we need to remove the old modules/lib directory and replace it entirely with the new one from TEI Publisher 7. We'll also copy a number of files from the templates subdirectory of TEI Publisher:

    rm -rf modules/lib
    cp -r ../tei-publisher-app/modules/lib/ modules/
    cp ../tei-publisher-app/templates/basic/pre-install.xql .
    cp ../tei-publisher-app/templates/basic/post-install.xql .
    cp ../tei-publisher-app/templates/basic/controller.xql .
    cp ../tei-publisher-app/resources/css/theme.css resources/css
    cp ../tei-publisher-app/templates/basic/modules/custom-api.* modules/
    cp ../tei-publisher-app/templates/basic/modules/facets.xql modules/
    cp ../tei-publisher-app/templates/api.html templates/

Ideally, these files should not have been modified by you, if you followed our earlier best practice recommendations. Otherwise you may need to reapply your changes. Next, copy the navigation* and query* modules. These are intended to be customized, so you may have changed them in your custom app. Compare the versions and make sure you reapply your modifications, if any.

    cp ../tei-publisher-app/modules/navigation* modules/
    cp ../tei-publisher-app/modules/query*.xql modules/

Note that the naming of the query-*.xql files has changed to be consistent with the navigation-*.xql files.

The same considerations apply to the HTML templates for menus and the toolbar. Nevertheless, changes in these areas were quite minor, so you may alternatively postpone this step until you encounter concrete issues in your application:

    cp ../tei-publisher-app/templates/menu.* templates/
    cp ../tei-publisher-app/templates/toolbar.html templates/
    cp ../tei-publisher-app/templates/drawer.html templates/

Edit HTML templates

Important: TEI Publisher 7 expects all HTML files to reside in the templates subfolder. Previous versions used a mix of locations with some HTML files in the root of the app. You should thus copy your index.html, search.html, error-page.html and any other custom HTML files into templates first.

Also, because TEI Publisher 7 has a well-defined API to handle the communication between user interface components and server-side functionality, we need to change some of the URLs in the HTML templates we're using. As a rule of thumb, all URLs which previously called XQuery modules directly should now start with api/ followed by the correct API path. The main HTML template to be changed is index.html. Please search and replace the properties for the following webcomponents:

should be changed into

and

should be changed into

The last change should also be applied to search.html, and you should change the loading of the search results to read:

Also check templates/pages/view.html and any other page template your app uses. Search for calls to .xql or modules/ and replace them with appropriate API paths. If you were making direct calls to custom modules, e.g. from a pb-load component, these need to be added as custom API endpoints.

Update config.xqm

Finally, a few settings need to be added to the main configuration module, modules/config.xqm:

1. Change $config:webcomponents to at least version 1.13.0.
2. Check for missing variables or functions and copy these from tei-publisher-app/templates/basic/modules/config.xqm, in particular:
   • $config:odd-available
   • $config:odd-internal
   • config:collection-config()
   • config:default-config()
   • config:document-type()
   • config:get-document()
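Checking the first requirement across several apps can be scripted. The sketch below is only an illustration and not part of TEI Publisher: the function names are made up, the regular expression assumes the declaration uses double quotes as in the listings in this document, and versions are assumed to be plain dotted numbers (pre-release suffixes are not handled).

```python
import re
from typing import Optional

MINIMUM = (1, 13, 0)  # minimum pb-components version named in step 1


def webcomponents_version(config_source: str) -> Optional[str]:
    """Extract the string assigned to $config:webcomponents, or None if it is not declared."""
    match = re.search(
        r'declare\s+variable\s+\$config:webcomponents\s*:=\s*"([^"]+)"',
        config_source,
    )
    return match.group(1) if match else None


def meets_minimum(version: str, minimum: tuple = MINIMUM) -> bool:
    """Compare dotted numeric versions part by part.

    The special value 'local' (self-hosted components) cannot be checked
    here and is accepted as-is.
    """
    if version == "local":
        return True
    return tuple(int(part) for part in version.split(".")) >= minimum
```

Comparing tuples of integers rather than raw strings matters here: as strings, "1.9.0" would sort after "1.13.0", which is the wrong order for version numbers.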

Check for other customizations

In most cases you will now be able to rebuild your custom app, redeploy it to eXist and test. If you encounter any error messages, they are most likely due to additional modifications you applied to your custom app. These are usually located in two main areas:

Remove superfluous code

Now you can remove files which are no longer used:

    rm templates/toc.html
    rm templates/search-results.html
    rm modules/view.xql

Change API endpoints

Finally, to be able to view and test the API of your custom app, you should change the endpoints in the OpenAPI specification files. In both modules/lib/api.json and modules/custom-api.json, change the url property in the following section:

    "servers": [
        {
            "description": "Endpoint for testing on localhost",
            "url": "http://localhost:8080/exist/apps/tei-publisher"
        }
    ],

Change the final part of the URL to match the name of your app instead of "tei-publisher". If you are on a remote server, adjust the whole url accordingly.
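Since the same change has to be made in two JSON files, it can also be scripted. The following is a minimal sketch, assuming the servers array has the shape shown above and that only its first entry needs to change; the function names and the default base URL parameter are assumptions for illustration.

```python
import json


def set_server_url(spec: dict, app_name: str,
                   base: str = "http://localhost:8080/exist/apps") -> dict:
    """Return a copy of an OpenAPI spec whose first server entry points at the given app."""
    updated = json.loads(json.dumps(spec))  # cheap deep copy of JSON-compatible data
    updated["servers"][0]["url"] = f"{base}/{app_name}"
    return updated


def rewrite_spec_file(path: str, app_name: str) -> None:
    """Apply set_server_url to a specification file in place."""
    with open(path, encoding="utf-8") as handle:
        spec = json.load(handle)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(set_server_url(spec, app_name), handle, indent=4)
```

Calling `rewrite_spec_file` once for modules/lib/api.json and once for modules/custom-api.json with your app's name covers both files. Note that rewriting a file this way also normalizes its indentation, which may produce a larger diff than a manual edit.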

Migrating from TEI Publisher 4, 5 or earlier

If you are migrating to version 7 from TEI Publisher 5, 4 or earlier, you also need to pay attention to the user interface redesign introduced by TEI Publisher 6. This makes things more difficult, and there are two possible approaches for updating:

• generate a new application with TEI Publisher 7 and merge your changes into the newly generated app. This is the recommended method. If your customizations were limited to e.g. ODD files, CSS styles, templates, index configurations or adding your own code in modules/, it should all go very smoothly.
• update your generated app by modifying the HTML templates you use and copying files from TEI Publisher.

The first option is strongly recommended and much easier in general. The second option is only for experienced users who have to update apps containing a lot of customized code.

Update by Generating a New Application

1. Upload your customized ODD to TEI Publisher 7.
2. If you created a custom HTML template for your document view, upload it to the templates/pages collection of TEI Publisher. You can use eXide's file manager to do so.
3. Generate a new application using your custom ODD. Make sure to choose a different URL and short name so as not to confuse the old and new app.
4. Adjust modules/config.xqm variables, if necessary.
5. Selectively upload all other files you changed or resources you added to the generated app (data, CSS). In the case of custom page templates and XQuery modules, follow the recommendations for updating from Publisher 6.

Update by copying

This approach is for experienced users only and not recommended. We mainly keep it here for reference. This guide was created for updating to TEI Publisher 6, so you will also need to follow the instructions for migrating from 6 to 7 and apply the steps not described below. The following steps assume that you either

• have a copy of your app's code on the filesystem and either a clone of TEI Publisher 6 or an app skeleton generated by TEI Publisher 6 (preferred) next to it
• or have your app and TEI Publisher installed in your database. In this case use eXide's file manager to copy files

In the following, the TEI Publisher 6 app you are copying from will be identified as $source.

1. Edit modules/config.xqm and add two variables at the top: $config:origin-whitelist and $config:webcomponents. $config:webcomponents should point to the latest version of pb-components.

       (:~
        : A list of regular expressions to check which external hosts are
        : allowed to access this TEI Publisher instance. The check is done
        : against the Origin header sent by the browser.
        :)
       declare variable $config:origin-whitelist := (
           "(?:https?://localhost:.*|https?://127.0.0.1:.*)"
       );

       (:~~
        : The version of the pb-components webcomponents library to be used by this app.
        : Should either point to a version published on npm,
        : or be set to 'local'. In the latter case, webcomponents
        : are assumed to be self-hosted in the app (which means you
        : have to npm install it yourself using the existing package.json).
        : If a version is given, the components will be loaded from a public CDN.
        : This is recommended unless you develop your own components.
        :)
       declare variable $config:webcomponents := "0.9.11";

   Other variables in modules/config.xqm which may need to be updated are:

   • $config:data-default
   • $config:data-exclude
   • $config:context-path

2. Copy all XQuery files from $source/modules/lib into the modules/lib of your application.
3. Copy $source/resources/css/theme.css into the same location in your application.
4. In all HTML files you changed for your app, check and remove any

Also add a link to theme.css:

5. Copy $source/resources/css/theme.css
6. Copy $source/resources/i18n. The existing *.xml files in your directory can be deleted, unless you added your own translated keys, in which case you would need to move them into the corresponding JSON format.
7. HTML files you did not change may just be overwritten by copying the corresponding versions from $source. In particular this includes files in the $source/templates/basic/templates subdirectory.
8. Copy $source/templates/basic/controller.xql into the root of your application, unless you made changes to this file yourself, in which case you would need to merge it.
9. Check the files in modules within your app: if you have not changed any of them, just copy the corresponding files from tei-publisher-app/modules. See if you need to merge the modified ones.
10. i18n translations are now applied client-side rather than server-side. This means you should drop all references to data-template="i18n:translate" and the i18n namespace from your HTML. Also remove the i18n module import from your modules/view.xql or overwrite the file if you have not modified it.

You may consult the diff of an actual update from Publisher 5 to 6 in the dodis-wall repository on GitHub: update to publisher 6: update templates, configuration and styling

Data

TEI Publisher ships its data files within the same application package. Nevertheless, separating your data from application code has many benefits, particularly for actively developed applications and data sets. This way changes to your code can be deployed without redeploying and reindexing your data and vice versa. It is also easier to maintain separate repositories (e.g. in Git) and differentiate privileges for editorial and developer teams. While we would generally recommend separating data and code, some projects may still prefer to keep their data and application integrated in a single xar package for the sake of marginally easier distribution. Internal structure of the data collection can be arbitrary, though there are some considerations regarding index configuration to take into account.

Data collection

Two variables in config.xqm are used to configure location of the data collection. $config:data-root specifies where in the collection hierarchy the data is stored. Only the top level collection needs to be specified.

    (:~
     : The root of the collection hierarchy containing data.
     :)
    declare variable $config:data-root := $config:app-root || "/data";

Switching to a separate data package is as simple as changing this variable, e.g. assuming that we store our data in /db/apps/lgpn-ling-data/data , the data-root should be defined as:

    declare variable $config:data-root := '/db/apps/lgpn-ling-data/data';

In the variable $config:data-exclude you may specify a sequence of root elements which should be excluded from the list of documents shown in the browsing view, e.g. secondary data files like a taxonomy or entity lists.

You may find it helpful to create and build the data package starting from the template hosted in the e-editiones repository. The README provided there briefly explains the roles of all the files and how to adjust them to your needs.

It is critical to store the index configuration file, collection.xconf, in the correct location. Refer to the eXist-db documentation for details, but it is common practice to store the collection configuration in the data package and rely on the pre-install script (pre-install.xql) to copy the file to its required position in /db/system/config. A similar consideration applies if the full-text index makes use of an external function module, which in TEI Publisher and generated apps is commonly the case for facet and field definitions. This module needs to be stored before the index is applied, and usually it is best to store it in the same location as collection.xconf.
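The "required position" follows from the way eXist mirrors the database hierarchy below /db/system/config. The short Python sketch below only computes that path for a given data collection; it is an illustration of the naming convention (one common placement, directly at the data collection), not the pre-install script itself, and the function name is made up.

```python
import posixpath

SYSTEM_CONFIG_ROOT = "/db/system/config"


def xconf_location(data_collection: str) -> str:
    """Return the path below /db/system/config that mirrors the given data collection."""
    return posixpath.join(
        SYSTEM_CONFIG_ROOT,
        data_collection.lstrip("/"),  # re-root the absolute collection path
        "collection.xconf",
    )
```

For the separate data package used as an example earlier, `xconf_location("/db/apps/lgpn-ling-data/data")` gives the location to which the pre-install script would copy the file.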

Subcollections

Many editions will simply present a number of documents, e.g. a collection of letters like the Van Gogh demo or plays like the Shakespeare demo. The structure of the data is therefore very simple: all the files are stored on the same level in the data-root collection. Other publications, particularly those including heterogeneous material, may require a more complex organization. TEI Publisher itself is a good example: its data collection is further divided into three subcollections: doc (for documentation), test (for examples) and playground (for user-supplied material).

[Figure: Structure of TEI Publisher's data collection]

You will note that it also contains other resources: image files and even an HTML file. Specify the $config:data-default variable in modules/config.xqm to set which collection should be used as a point of entry to your data. Whenever collection.html is present in that location, it will be used as a custom landing page displayed instead of a simple document listing. In the case of TEI Publisher it would be the Demo Collection, Playground and Documentation.

    (:~
     : The root of the collection hierarchy whose files should be displayed
     : on the entry page. Can be different from $config:data-root.
     :)
    declare variable $config:data-default := $config:data-root;

This approach can be extended further, say if you wanted to add prints and manuscripts subcollections to the playground and present them in a custom landing page. Just add the data and a corresponding collection.html to the playground collection. You can shape the custom landing page however you like, using the full power of HTML and eXist templating.

[Figure: Fragment of the subcollections landing page]

A landing page like this could be created with this HTML template stored in data/playground/collection.html.

Custom landing page demonstrating sub-subcollections.

Processing instructions

The default view for a specific document can be configured via a processing instruction. Before displaying a document, TEI Publisher will check if a processing instruction exists at the start of the document, telling it which ODD and view template to use (along with other configuration parameters). For example, the following processing instruction associates the document with the view template translation.html and the ODD dantiscus.odd, and switches to a page-by-page display (along TEI page break boundaries):

When viewing the document by structural divisions, two additional settings control the amount of content displayed at a time:

Facet Search Configuration

Facets allow users to quickly navigate through a set of documents or query results by selecting from predefined categories or properties. This way, users can "drill down" into the set, reducing the number of displayed items with every step. For demonstration purposes, TEI Publisher configures two facets by default: "Genre" and "Language". You can see those to the left of the document list on the start page, or below the search box on the search result page.

[Figure: Facets on the start page]

From a user perspective, the main concept behind facets is the drill down: initially the user sees all facet values associated with the set of documents or search results displayed. The number behind each value denotes the number of items in the set which have that particular facet value. As the user selects one facet, the set necessarily becomes smaller, so non-matching facet values will disappear and the numbers adjust accordingly.

Facets are a new feature in eXist 5.0. They are very fast because eXist creates them when indexing the document. No extra computation is needed when the user clicks on a facet to drill down into a displayed set: all information is already available in the index. To see a more complex example of facets in action, visit our Van Gogh demo. If you would like to configure other or additional facets, you need to edit three files:

Embedding TEI Publisher in other systems

Since version 6.0, all pb-components can be used outside TEI Publisher itself. The components can be embedded into any environment, e.g. a CMS or blog software (like WordPress or Drupal), or integrated into any modern front-end framework (like Vue, React or Angular). All that is needed is a TEI Publisher instance available on the web which stores the source TEI and provides a communication endpoint for the components to talk to.

Live Examples

The embedded example below demonstrates such a use case: it provides a sandbox running on codepen.io but communicates with the TEI Publisher instance on teipublisher.com which stores the documents. The magic happens in the endpoint attribute passed to , which tells the components where to talk to:

You can actually edit the code above: for example, try to change the path for the first document to test/F-rom.xml and the odd to shakespeare . See how the live view changes? And if you would like to read Romeo and Juliet in two-column mode, just add column-separator=".tei-cb" to the main .

Retrieving the whole document as a simple HTML

Embedding the result of applying a Processing Model transformation to a document is even simpler. Behind the scenes, TEI Publisher has a separate library part, which is essentially an implementation of the TEI processing model. This library can be used independently to retrieve the entire content of a TEI document as HTML, transformed through an ODD with processing instructions. All you need is a small XQuery which calls the library modules, setting the correct source document and ODD. Fortunately, TEI Publisher already contains a boilerplate XQuery script for this job, which you can call as follows in your browser:

    https://teipublisher.com/exist/apps/tei-publisher/api/document/test%2FF-rom.xml/html?odd=shakespeare.odd

This will retrieve the content of Shakespeare's Romeo and Juliet as an HTML page, transformed through the ODD shakespeare.odd. For embedding an entire document in an iframe or similar, this should already be enough. Please note that the / character in the path to the document test/F-rom.xml had to be URL-encoded as %2F.
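When such URLs are generated by a script rather than typed by hand, the encoding of the document path is easy to get wrong. The sketch below builds the URL shown above with Python's standard library; the function name is an assumption for illustration, while the endpoint, the path pattern and the odd parameter are taken from the example.

```python
from urllib.parse import quote, urlencode


def document_html_url(endpoint: str, doc_path: str, odd: str) -> str:
    """Build the URL that returns a whole document as HTML, transformed through the given ODD."""
    # safe="" makes quote() encode "/" as %2F, so the path stays one URL segment.
    encoded_path = quote(doc_path, safe="")
    query = urlencode({"odd": odd})
    return f"{endpoint.rstrip('/')}/api/document/{encoded_path}/html?{query}"


print(document_html_url(
    "https://teipublisher.com/exist/apps/tei-publisher",
    "test/F-rom.xml",
    "shakespeare.odd",
))
```

By default `quote` leaves the slash untouched, so passing `safe=""` is the detail that matters here; without it the server would see test and F-rom.xml as two separate path segments.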

Embedding webcomponents for navigation

For longer documents, embedding the entire content in a page may not be too user-friendly. A better way is to use the library of webcomponents provided by TEI Publisher. This way, we can show the content page by page or division by division, allowing the reader to navigate between sections.

Because webcomponents are part of the HTML5 standard and supported natively by most modern browsers, we can easily import the component library which is at the core of the TEI Publisher app and reuse the components it provides in other contexts. They should work in any HTML5 page, no matter if it was written by hand, is generated by PHP, Python or a CMS. For a start, the page should import two scripts in its header:

This imports necessary pb-components libraries from unpkg.com CDN which is considered the best practice for web sites. The second

Using pb-clipboard

John Doe: "The miracles of foobar", Paradise Publishers, Little Village, Stardate 46254.7

In order to use the element, you need to import the component code into your HTML page, as demonstrated with the