(EQL) – a Query Language for Archetype-Based Health Records
Total Page:16
File Type:pdf, Size:1020Kb
MEDINFO 2007 K. Kuhn et al. (Eds) IOS Press, 2007 © 2007 The authors. All rights reserved. EHR Query Language (EQL) – A Query Language for Archetype-Based Health Records Chunlan Ma, Heath Frankel, Thomas Beale, Sam Heard Ocean Informatics Pty. Ltd, Australia Abstract data that end user needs [4]. EHR data include thousands of facts to do with a patient’s clinical status, which can be OpenEHR specifications have been developed to standard- highly structured, semi-structured or non-structured, or ise the representation of an international electronic health most commonly, a mixture of all three. The lack of disci- record (EHR). The language used for querying EHR data pline commonly found in EHR data not only increases the is not as yet part of the specification. To fill in this gap, difficulties in storing them, but particularly the difficulty Ocean Informatics has developed a query language cur- in querying them. rently known as EHR Query Language (EQL), a declarative language supporting queries on EHR data. The openEHR specifications have been developed to stan- EQL is neutral to EHR systems, programming languages dardise the representation of an international electronic and system environments and depends only on the health record (EHR)1. Although there are implementations openEHR archetype model and semantics. Thus, in princi- based on these specifications in development, experience ple, EQL can be used in any archetype-based with querying archetype-based EHR data is still limited, to computational context. In the EHR context described here, the point where it is not clear what kinds of query lan- particular queries mention concepts from the openEHR guage(s) are even appropriate for information systems EHR Reference Model (RM). EQL can be used as a com- based on archetypes. mon query language for disparate archetype-based Since the openEHR specifications intend to define a stan- applications. The use of a common RM, archetypes, and a dardised EHR infrastructure, a query language for this companion query language, such as EQL, semantic infrastructure should aim to become an open specification interoperability of EHR information is much closer. This as well. The query language described here will be submit- paper introduces the EQL syntax and provides example ted to the openEHR foundation as a candidate openEHR clinical queries to illustrate the syntax. Finally, current query language. implementations and future directions are outlined. There are four requirements for an archetype-based query Keywords: language. First, the query language should be able to openEHR reference model, archetype, query language, express queries for requesting any data item from an electronic health record archetype-based system, i.e. data defined in archetypes and/or the underlying reference model . Second, the query Introduction language should be able to be used by both domain profes- sionals and software developers. Third, the query language The National E-Health Transition Authority of Australia should be portable, i.e., be neutral to system implementa- (NEHTA) defines an EHR query as a formal user or sys- tion, application environment and programming language. tem request for information to the EHR repository/ Lastly, the syntax should be neutral with respect to the ref- database that specifies the constraints on precisely what erence model, i.e. the common data model of the part(s) of the EHR content needs to be retrieved [1]. Due to information being queried. Particular queries will of the lack of any clear standards in EHR query services, course be specific to a reference model. NEHTA has identified it as a major area for review [2]. The current available query languages that might poten- Easy accessibility of data from an electronic health record tially be used to query openEHR data include the XML (EHR) is considered as one of the essential features of the Query Language (XQuery) [5] and the Structured Query EHRs that can enhance a hospital revenue cycle [3]. There Language (SQL) . XQuery uses eXtensible Mark-up Lan- are two major challenges commonly encountered in clini- guage (XML) as its underlying data model. It has rich cal data accessibility: one is that people who understand predefined functions and allows user-defined functions the clinical data best, such as health professionals, are not supporting the kind of clinical data requests required in the the ones most competent of querying the data; another EHR context. It is platform independent. Nevertheless, its challenge is that qualified SQL programmers or other main strength is also its main flaw: it is limited to purely query language programmers must spend much time on exploring the data, composing tedious code to provide the 1 http://www.openehr.org/ 397 C. Ma et al. / EHR Query Language (EQL) – A Query Language for Archetype-Based Health Records XML data environments. Direct use of XQuery for the Object Query Language), and the study of the archetypes openEHR EHR would require that all openEHR data must technology, openEHR RM and openEHR path mecha- be represented in XML format. However, openEHR is nisms. designed as an object-oriented framework, and allows for a What is EQL multitude of data representations, including as program- ming language persistent objects (e.g. in the form of Java EQL is a declarative query language developed exclu- objects in a product such as db4o2); as language neutral sively for expressing the queries used for searching and objects (such as in a database like Matisse3); as relational retrieving the clinical data found in archetype-based structures (governed by an object/relational mapping EHRs. It is applied to the openEHR EHR Reference Model layer), and in various XML storage representations (e.g. (RM) and the openEHR clinical archetypes, but the syntax XML blob or XML databases). XQuery is therefore prob- is generic across applications, programming languages, lematic, because the query syntax is directly tied to the system environment, and reference model. The EQL is representational format of the data. Considerable efforts designed as a common language used for expressing clini- would be required to convert openEHR data in each cal data requests across multiple openEHR-based deployment context to XML just for the purpose of query- applications. ing; each such transformation may well be custom, The EQL has two innovations: 1) utilizing the openEHR requiring special work on the part of the system path mechanism to represent the query criteria and implementers. returned results; and 2) using a ‘containment’ mechanism A further disadvantage is that XQuery is a native XML to indicate the data hierarchy and constrain the source data programming language requiring intimate familiarity of to which the query is applied. both users (in this case, health professionals and software OpenEHR path mechanisms developers) alike with XML and XQuery, something that cannot be assumed. OpenEHR path syntax is used to locate clinical statements and data values within them using Archetypes. An Arche- SQL in its standard form is also not a viable candidate for type is a computable expression of a clinical concept in the querying archetype-based EHR data, because it does not form of structured constraint statements, based on some support object-structured data, e.g. the data modelled in reference model [7], such as the openEHR RM which pro- archetypes. It has been found that considerable intellectual vides the support for clinical archetypes. Each archetype efforts are required when using SQL to search and retrieve has a global unique identifier and each node of this arche- clinical data for both individual subject care and clinical type has a unique archetype node identifier. The openEHR research studies [6]. architecture has a path mechanism that enables any node Object Query Language (OQL) is a query language for within a top level structure to be specified from the top of object-oriented databases. OQL was developed by the the structure using a "semantic" X-path compatible path. Object Data Management Group (ODMG4), which was The availability of such paths radically changes the avail- disbanded in 2001. Comparing with XQuery and SQL, able querying possibilities with health information, and is OQL would be the best candidate query language used for one of the major distinguishing features of openEHR [8]. archetype-based EHR data. However, it is complex and as Consequently, it is possible to locate any node in an arche- a result, it has not been widely implemented. For large type, including leaf data elements by using the archetype object models, such as the openEHR RMs, OQL query and archetype node identifiers within openEHR paths. statements can become extremely verbose. OQL uses an Features of the EQL object programming style dot-notation to express object members, while an XPath-based syntax is specified by The EQL features are listed below: openEHR to locate archetype data elements. Using OQL • Neutral expression syntax. EQL does not have any with archetype-based EHRs would require the use and dependencies on the underlying RM of the archetypes. translation between these two notation styles. It is neutral to system implementation and environ- ment. This is one of the distinguishing features of the To satisfy the aforementioned requirements of an arche- EQL. type-based query language, a new language – EHR Query Language (EQL) is under development. This paper intro- • Allows setting query criteria using archetype and node duces the EQL features and syntax. Example clinical query identifiers, data values within the archetypes, and class scenarios are used to demonstrate the use of the EQL attributes defined within the openEHR RM. expression. • Allows the returned results to be top-level archetyped RM objects, data items within the archetypes or RM Methods attribute values. EQL was developed based on the analysis of a set of clini- • Supports naming returned results. cal query scenarios, the study of the current available • Support queries with logical time-based data rollback.