An Extension of the Arden Syntax to Facilitate Clinical Document Generation

Healthcare of the Future 65 T. Bürkle et al. (Eds.) © 2019 The authors and IOS Press. This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0). doi:10.3233/978-1-61499-961-4-65 An Extension of the Arden Syntax to Facilitate Clinical Document Generation Stefan KRAUS a,1, Dennis TODDENROTH a, Philipp UNBERATH a, Hans-Ulrich a b PROKOSCH , and Dirk HUESKE-KRAUS a Medical Informatics, University Erlangen-Nürnberg, Erlangen, Germany. b Philips Medical Systems, Böblingen, Germany. Abstract. While clinical information systems usually store patient records in database tables, human interpretations as well as information transfer between institutions often require that clinical data can be represented as documents. To automate document generation from patient data in conjunction with the rich computational facilities of clinical decision support, we propose a template-based extension of the Arden Syntax, and discuss the benefits and limitations observed during a pilot application for patient recruitment. While the original Arden Syntax supports string concatenation as well as the substitution of unnamed placeholders, we integrated an additional method based on embedding expressions into strings. A dedicated parser identifies the expressions and automatically substitutes them at runtime, which can for example be harnessed to display the most recent value from a time series. The resulting mechanism supports the generation of extensive clinical documents without the need to apply specific operators. To evaluate the proposed extension, we implemented an Arden module that identifies an intensive care patient cohort that conforms to the eligibility criteria of a clinical trial and outputs a concise patient overview in different document formats. While string interpolation in the original Arden standard has been tailored to clinical event monitoring, we interpret that our accessible approach usefully extends Arden's data-to-text capabilities. Future research might target the development of an interactive template editor that would hide the complexity of formatting directives and conditional expressions behind a graphical user interface, and explore how computer-linguistic formalisms might facilitate advanced features such as automatic inflections of verbs and nouns. Keywords. Clinical document generation, Arden Syntax, string interpolation, natural language generation 1. Introduction A clinical document can be defined as "a discrete electronic composition about an identified patient to be read or used by a human" [1]. Clinical information systems usually store electronic medical records (EMRs) in relational databases, where the corresponding clinical information is divided into entries in database tables. In order to support provider communication for a seamless clinical care, some parts of an EMR may be represented in the form of documents, such as transfer letters, consultant's reports, or radiology reports. To facilitate workflows between inpatient and outpatient settings, parts of an EMR may thus be converted to documents if required for information transfer 1 Corresponding Author, Stefan Kraus, Chair of Medical Informatics, University Erlangen-Nürnberg, Wetterkreuz 13, 91058 Erlangen, Germany; E-mail: [email protected]. 66 S. Kraus et al. / An Extension of the Arden Syntax to Facilitate Clinical Document Generation between departments or institutions, as in case of the Swiss electronic patient dossier [2]. Many clinical information systems provide their users with a means of generating documents, often based on a template system where data items can be inserted into document templates using placeholders. In its intensive care units (ICUs), University Hospital Erlangen (UHER) uses a commercial patient data management system (PDMS) [3], which provides such an integrated template system using placeholders for the automated generation of documents, either in plain text, Microsoft Word, or in portable document format (PDF). The placeholders of this template system provide limited filtering and preprocessing capabilities, thus it is only sufficient for basic document creation. In use cases that require advanced document generation, however, this approach quickly reaches its limits. This motivated an earlier research on alternative means of generating clinical documents [4], which was based on the Arden Syntax for Medical Logic Systems, a Health Level 7 standard for clinical decision support functions in the form of Medical Logic Modules (MLMs) [5]. Although MLMs are originally designed for clinical event monitoring [6], they can be used for multiple other applications in the medical domain. The Arden Syntax provides a rich set of language constructs and a time-stamped data type system, which are both tailored to the needs of processing EMR contents for implementing clinical decision support functions. This study builds on the above mentioned earlier research and explores the capability of Arden Syntax to generate clinical documents, based on the integration of an extension for a template-based text generation, which is also called string interpolation. The technical platform constitutes an experimental generalization of the Arden Syntax, termed PLAIN [7]. There are two pronounced differences between the Arden Syntax and PLAIN with respect to this study. First, Arden Syntax MLMs generally correspond to condition-action rules. PLAIN, in contrast, additionally supports the use of Arden Syntax statements and operators apart from condition-action rules, thus providing a kind of medical informatics scripting language. Second, PLAIN supports the use of other MLMs as user-defined functions (UDFs) that can be called in arbitrary expressions. Below we describe the characteristics of the proposed extension for string interpolation and its use in a real-world application at UHER, which generates documents for patient recruitment in a clinical trial. Moreover, we discuss the benefits and limitations of template-based document generation in contrast to ontology-based natural language generation. 2. Methods The Arden Syntax standard provides three different approaches to compose text blocks from templates and expressions. The first one is the FORMATTED WITH operator ([8], 9.8.2), shown in Figure 1 A), which uses a variety of placeholders such as %s, %d, and %f, which themselves provide various flags to control the formatting. The second one is the string concatenation operator ([8], 9.8.1), shown in Figure 1 B), which is expressed with a double pipe symbol "||". The third one is the STRING operator ([8], 9.8.3), shown in Figure 1 C), which takes a list of expressions as the argument and concatenates the string representations of all elements to a single string. S. Kraus et al. / An Extension of the Arden Syntax to Facilitate Clinical Document Generation 67 Figure 1: Examples A), B), and C) show the string interpolation approaches provided by the original Arden Syntax. Examples D) and E) show the additional approaches described in this study. We integrated an additional string interpolation approach, which does not require the use of an operator to substitute the placeholders, but embeds expressions directly into strings. The substitution is automatically performed as soon as the control flow within an MLM reaches a string with placeholders. As a delimiter, we enclosed each expression with a pair of curly braces, prefixed with a "$" symbol. This pragmatic convention was inspired by the Haxe programming language [9], which also constitutes the technical basis of the PLAIN prototype, but is also used in a variety of other general purpose languages like PHP or JavaScript. To implement the string substitution with patient- specific values during runtime, we integrated a placeholder parser that analyses the content of the particular delimiter, accepts exactly one single expression, evaluates it, and immediately replaces it with the string representation of the evaluation result. Figure 1 D) shows a placeholder that uses the LATEST operator ([8], ) and thus evaluates to the most recent value of the inflammation marker procalcitonin (PCT). In case the expression within a placeholder is a single variable name, the pair of curly braces can be omitted and it is sufficient to prefix the variable name with a "$" symbol, as shown in Figure 1 E). 3. Results The proposed new approach enables the automated generation of extensive clinical documents through expressions that are directly embedded within string templates, without the need to apply specific operators. The substitution of the placeholders is automatically performed as soon as the control flow reaches a string, and may be used repeatedly in order to progressively assemble more complex textual outputs. The method is currently evaluated in routine use at UHER since January 2019 in the context of a clinical study to identify a cohort of patients whose medications include specific antimycotics. For this purpose, an MLM retrieves the records of all patients that were admitted to three different ICUs as a single data structure, which is encoded in the PLAIN data markup language (PDML) [7], from a REST service connected to the data access interface of our local PDMS. The MLM then extracts those records where specific antimycotics were administered, and applies the string interpolation approach described in this study to generate a document containing an overview of all eligible patients. 68 S.

Load more