A Java-based Multi-Institutional Medical Information Retrieval System Kang Wang Ph.D.*, drs F.J. van Wingerde*, Karen Bradshaw BS*, Peter Szolovits Ph.D.', Isaac Kohane MD Ph.D.* *Boston Children's Hospital Informatics Program +Laboratory for Computer Science, MIT Abstract is used only for communication with extemal sys- (Java-based Agglutination of Medical Informa- tems, and makes it difficult to enhance functionality tion) is designed as a framework for integrating het- without turning again to the vendor who supplied the erogeneous information systems used in healthcare proprietary protocols. Another popular strategy is to related institutions. It is one of the implementations create new data warehouses that either duplicate or under the W3-EMRS project [1]' aimed at using the supplant existing legacy systems. However, captur- World Wide Web (Web) to unify different hospital ing the full capabilities of the older systems often information systems. involves a long and complex analysis task to deter- JAMI inherited several design decisions from the first mine all the legacy data models. Also, solutions that W3-EMRS implementation described in [2], includ- duplicate data require solutions to difficult data syn- ing using the Web as the communication infiastruc- chronization problems. The warehousing strategy ture and HL7 as the communication protocol between therefore usually leads to many years' development the heterogeneous systems and the W3-EMRS sys- efforts and high costs. tems. In addition, JAMI incorporates the growing A sustainable integration system for healthcare insti- Java technologies and has a more flexible and efficient tutions must be open and scaleable. Therefore it architecture. This paper describes JAMI's architec- should be based on standard and widely supported ture and implementation. It also present two in- protocols. To reduce maintenance costs, the systems stances of JAMI, one for the integration of different should be easy to install and to update. The system hospital information systems and another for the inte- should also be platform independent in order to make gration of two heterogeneous systems within a single use of an institution's past investment on hardware hospital. Some important issues for the further devel- and software. opment of JAMI, including security and confidential- JAMI tries to reach these goals by using Web as the ity, data input and decision support are discussed. communication infrastructure and HL7 as the com- Motivations munication protocol both between the underlying In recent years, cost pressure from managed care has legacy systems and JAMI and between the compo- been causing consolidation of healthcare institutions. nents of JAMI. Furthermore, JAMI uses Java as the As a result, the integration of independently devel- implementation language and the component infra- oped information systems becomes one of the foci af structure JavaBean as the framework. Therefore, JAMI both the medical informatics research community and is easily adaptable to different institutions by adding, the information system providers. Even inside a sin- changing or configuring its components. Also, it is gle hospital, different departments usually have differ- possible to use standard Java-enabled Web browsers ent vendor specific information systems. Demands fir as JAMI's front end, which makes JAMI easy to ac- integrating these systems are also increasing. cess. The use of Web browsers enables JAMI to ac- Various integration strategies exist. One is to create cess volumes of medical resources on the Internet open interfaces among different vendor specific sys- such as MEDLINE and to take advantages of the for- tems, often based on the exchange of data via the HL7 matting capabilities of the HyperText Markup Lan- [3] protocol, and often with the intent of feeding all guage (HTML). data to a new data repository. Although this is a In the next section, some Web-based medical record welcome advance over previous systems that could systems are reviewed, including our first HTML- not communicate at all, the resulting systems still based implementation in the W3-EMRS project. suffer from many typical disadvantages. Most of the Some fundamental limitations of these systems are systems being integrated and most data repository presented. In Section 3, the architecture of JAMI is solutions remain platform dependent, locking the described. We discuss how Java technology enables organization into required support for old hardware as us to overcome the limitations outlined in Section 2. well as new for the repository. Some systems take Two modules of the architecture, the Common Medi- advantage of - architecture, but often the cal Record and the User Interface, are discussed in internal communication protocols are proprietary and more detail in Section 4. The advantages of compo- thus hinder further integration. In such systems, HL7 nent-oriented system development are illustrated. In Section 5, two instances of JAMI are described. They ' Visit http://www.emrs.org/ for more information are used to integrate multiple hospital information and demonstrations systems and multiple information systems within a

1091-8280/97/$5.00 0 1997 AMIA, Inc. 538 DB Figure 2: HTML-based W3-EMRS Figure 1: First Generation Web-Based Systems ing and translation of the CMR (e.g. vocabulary hospital respectively. Finally, some interesting issues translation of the problem list, merging the clinical that arose during the development of JAMI as well as notes from different sites and sorting them by date). planned future efforts are discussed in Section 6. At the end, the agglutinator translates the result into Previous work HTML and sends it back to the browser. In its infancy, the Web was mostly used for accessing There are some fundamental limitations brought by static information presented in the form of HTML the assumption that the Web browsers only under- pages. Today, some systems combine Web and data- stand HTML. First, HTML has very limited data base technologies to generate Web pages on the fly. types, namely text and images; furthermore, HTML To use a Web browser as the front end, a hospital supplies limited layout and formatting operations on data repository as the back-end and CGI programs' as these data types (e.g. paragraph, table, list). It is diffi- the mediator to retrieve (store) information from (to) cult, for example, to use these data types and opera- the repository characterized the first generation of sys- tions to present a live EKG-wave for an ICU applica- tems that used a Web front-end to medical records [4, tion. Although the functionality of some Web 5]. Usually, the first generation systems only dealt browsers (e.g. Netscape) can be expanded by using with single databases. The typical architecture cf extra programs (e.g. plug-ins) to handle new data these systems is shown in Figure 1. types, these extra programs are browser dependent The Web browser sends a query (e.g. get the last 24 and usually platform dependent as well. hours' lab results) to the Web server, which in turn Second, the user interaction model of HTML is very start a CGI program to handle the query. The CGI simple. Essentially, the only user interaction is to go program translates the query into the data manipula- to a certain place in a specified HTML file, which tion language (DML, e.g. SQL) of the underlying might be generated on the fly. Even though this sim- repository and retrieves the data. It then represents the plicity contributed largely to the popularity of Web, data in HTML and sends it back to the browser. it is a severe limitation for systems whose underlying Previous HTML-based implementations in the W3- domains, such as medicine, require complex logic EMRS project expanded the first generation Web and user interaction. This problem has been widely front ends to handle information retrieved from multi- recognized. Now, many web browsers allow scripts ple hospital information systems [2, 6]. The architec- to be embedded into HTML which define more so- ture has an abstract layer, called the Common Medi- phisticated user interaction. But again these scripting cal Record (CMR), above the underlying legacy in- languages are vendor specific. formation models. For each legacy system, a site Third, HTML only represents the appearance (view) server must be written which is responsible for the of the data but not the data-model itself. For example, translation between the legacy information model and if we want to use HTML to present a trend of a given the CMR. Figure 2 shows the architecture of the sys- lab parameter, the image data type must be used. But tem. if an abnormal point is noticed in the trend and the The Web browser sends queries in the form of HL7 user wants to know the value behind the point, the to a Web server (e.g. get the last 5 visits browser must go back to the Web server to get the for this patient). The Web server in turn invokes a value because the set of values behind the trend, CGI program (called the agglutinator) which distrib- which constitutes the model of the view, is not repre- utes the query to the participating site servers and sented by HTML. gathers the returned CMR encoded in HL7. Since the Fourth, since HTML can only be used to describe the implementation assumes that the browser only under- view of the underlying data model, the server must stands HTML, the agglutinator must do all the merg- do all the translations from HL7 to CMR to HTML. The resulting HTML file, together with the related http://hoohoo.ncsa.uiuc.edu/cgi/ image files, might use much more space than it is 539 tion systems. Tier one also contains the site servers that translate between the legacy data model and the CMR encoded in HL7. The site servers are also re- sponsible for authenticating the source of the request, and participating in by encryp- tion and enforcement of agreed-on security policies. Because many legacy systems have HL7 interfaces, the site servers are usually easy to create. ,eiver Tier two supports communication with multiple leg- acy systems by distributing queries to appropriate tier 1 site servers and assembling their individual re- vers sponses into a combined response returned to the requester. MUX, or multiplexor, refers to combining multiple data streams into one. Both requests and responses are encoded in HL7, and the security archi- tecture enhances reliability and confidentiality ef communications. This tier is also concerned with noting unavailability of any of the information Figure 3: Architecture of JAMI sources and strategies for finding alternative commu- necessary to store the data model itself. This makes nication paths in case of network difficulties. Because the HTML-based implementation demanding both on Tier 2 may be implemented on machines distinct the server's computation power and the network's from those supporting the other tiers, this architecture can improve perfonnance by distributing loads. speed. Compared with the HTML-based design, the load of Last but not least, the communication protocol cf tier two may be reduced by distributing a large part of Web, HTTP, is stateless. As a result, all the infor- the computation to the third tier. The load of the mation (called state) that is needed for generating a network is also reduced because the patient record can Web page (e.g. patient ID, user name and password be encoded more concisely in HL7 than in HTML. for login to the database) must be passed to the corre- sponding CGI-program. Even if a Web page itself The third tier is loaded as a Java applet to a Web- might not need the state, it still needs to store it browser running on the client. The HL7 module de- (usually in forms of hidden fields) in order to be able codes the HL7 messages into segments and fields, to pass it to the linked pages, whose generation needs upon which the CMR is built. The "Integrated the state. To keep and pass these states among large CMR" module integrates the CMR from diffnt numbers of Web-pages is a huge overhead, especially legacy systems according to the semantics of the rec- when the states are complex. There are extensions cf ords. For example, clinical notes from different hos- HTTP which tried to remedy this flaw. For example, pitals can be assembled into a chronological se- cookies are introduced to remember some simple quence, and vocabulary translations allow assembly states. But it is hard to use them to represent com- of a common list of prescribed medications or a joint plex states. list of diagnostic conclusions. The integrated CMR is presented by the Visual Presentation module. Due to the rapid development of the Web technolo- gies, it is now reasonable to assume that Web- Because HL7 is used as the communication protocol browsers not only understand HTML but they can both between MUX and the client-side applet and also run Java-applets. This new assumption leads to between MUX and the site servers. it is possible fr the new design of the W3-EMRS system as described the site servers to be MUXs, or interface engines, that in the next section. The new design overcomes the manage their own site servers. In this way, hierar- above mentioned limitations, which restricted the chies can be built to reflect the organizational struc- design of the HTML-based system of W3-EMRS. ture of the to be integrated institutions. Architecture One of the design decisions made for JAMI as shown The design of JAMI inherited several ideas from the in Figure 3 is that the computational tasks such as previous HTML-based design of W3-EMRS. These data fornat translation, vocabulary are accomplished include using Web as the communication infrastruc- as much as possible by the client, which are typically ture and HL7 as the communication protocol. JAMI a single user PCs or workstations. However, the also reused the CMR defined as part of the HTML- between client and server can be moved. For exam- based system and the Agglutinator functionality. ple, if we have a powerful server hardware and weak JAMI's three tiered architecture is in 3. client machines, the division line can be set between shown Figure Agglutination and Visual Presentation (VP). With Tier one contains the legacy data repositories. These Java's Remote Method Invocation mechanism, it is are mostly existing hospital or departmental infonna- relatively easy to implement this shift. 540 In the current implementation, all tiers except tier one In the following, we describe in more detail two in- have been implemented in JAVA. Tier one is closely stances of JAMI. The first is used to integrate multi- related to the legacy systems. An implementation ple disparate HIS. The second is used to integrate an language should be chosen according to the underly- ICU data repository with a HIS within a hospital. ing hardware platform and data repository. Multiple Hospitals It should be pointed out that most modules in the This instance is designed for the doctors in emer- architecture need to be adapted to the concrete integra- gency rooms (ER). It helps ER doctors to save their tion situations. In the next section, we detail the ad- precious time by releasing them from login to diffir- aptation process of two modules. ent systems for retrieving patient record. After login Common Medical Record and Visual Presentation to the JAMI system, the doctors input some basic The structure of the CMR depends largely on the data patient demographic information and then choose model of the underlying legacy systems and the target which parts of the patient record need to be retrieved users of the integrated system. However, experience (Figure 4). Requests are sent automatically to the showed that there are relatively constant elements if participating sites. Note that no effort is made to the CMR. With a repository of such elements, it is solve the master patient index (MPI) problem. In- easy to construct a CMR for a new case. stead, simple matching of the basic demographics In the first phase of the W3-EMRS project, the Bos- fields is used. Based on the returned, more compre- ton Collaborative defined a CMR which contains: hensive, patient demographics data (Figure 5) the Patient Demography, Visits History, Clinical Notes, clinicians decide whether the responding sites have Diagnosis/Problems, Medications, Allergies. Each relevant information for their patient. An MPI system has attributes which were mapped to all the can be plugged into the site servers to implement HIS of the participating hospitals. more flexible matching strategies. Another module which is highly dependent on the After the patient medical record is retrieved, the VP concrete integration systems is the Visual Presenta- module presents the data to the clinicians. Figure 6 tion module. The way the CMR should be displayed presents the problem list of the patient from two hos- and the way the users interact with the system de- pitals. Note that the screen Figure 6 is composed if pend, among others, on the user group (e.g. physi- two components, one for displaying the demograph- cians, nurses, administrators), where the system oper- ics and one for displaying the problem list. ates (e.g. Emergency Room, ICU) and the usage df Multiple Data Repositories within a Hospital the system (e.g. documentation, monitoring, decision This instance is designed for the physicians and support). Similar to the CMR module, visualization nurses in intensive care units (ICU). Many ICUs have elements can be identified which can be reused in their own vendor specific data repositories with high different integration systems. data density. Web front-end has been developed for In the implementation level, component methodol- integrating a specific ICU system and a specific HIS ogy is used to map the elements in the CMR module [7]. This instance of JAMI re-implements the system and Visual Presentation module to software compo- described in [7]. It makes full use of Java and is more nents. Concretely, Java Bean' is used in JAMI as the efficient and flexible. The use of JAMI architecture component infrastructure. makes it easier to be expanded and to be ported. Application examples A user chooses a patient from a patient list that is JAMI is being tested and evaluated for the integration automatically updated by the ADT system of the of the hospital information systems (HIS) of Boston underlying ICU application. Several screens are de- Children's Hospital, Beth-Israel Hospital and Mass fined that display different facets of the patient record. General Hospital in the Boston area [6]. JAMI also Figure 7 is part of the "current status" screen. It is used to integrate a vendor-specific ICU system shows the most important vital signal waves and the with the HIS at Children's Hospital. Other related latest BLOODGAS lab result. The lab data and wave efforts include using W3-EMRS to integrate the HIS data come from different data sources, but the difelir- of Beth-Israel and Deaconess Hospitals to meet the ence is transparent to the user. new information systems requirements imposed by Discussions the merger of these two institutions [9]. Additionally, Data input W3-EMRS is being used to merge the medical rec- The current implementation of JAMI is focused on ords of new-born babies with jaundice at Brigham & the viewer functionality. However, since HL7 is a bi- Women's Hospital, Beth Israel Hospital and Chil- directional communication protocol, there is no limi- dren's Hospital, and to implement guidelines based tation that excludes data input. In fact, by using Java on the data obtained from the merged records. at the client site, input of complex data structures (e.g. order entry) with complex validation logic can be realized. i http://splashjavasoft.com/beans/

541 PtinDaeLis One obstacle of implementing data input is the man- ...... Patient List: Frieda Alien = Demographics agement policy of the sites. It is generally more diffi- Notes cult to get the permission to write to a data reposi- First Nme: FRIEDj Visiting History Last Nme: ProbIems tory than to read from it. One of the possible solu- ALLI ___ Medications tions is to implement an extra site which stores user Sex: Allergies~~~Lab input such as annotations, notes, etc. Dat of Birth: 12/0o8]0_ YYYYiv SeIecteGet Data Security and Confidentiality Security and confidentiality is a general concem tf Figure 4: Patient Identification using the Internet, especially when sensitive data slte(s) chlbols- bi-_rs such as patient records are transmitted. The W3- Given 1e FRIEDA FRIEDA Felly Ie ALLEN EMRS project defined a comprehensive security and Dae of BIrth 8/03/28 8/03/28 confidentiality framework that includes authentication ISK F F 3 Mass. Ave. 3 Mass. Ave. of site servers, multiplexors and client sites, user Boston Boston 02157 02157 access based on the combination of designated pass- MA word and one time password and encrypted transmis- Hee Number 8005551211 61755i1212 sion of messages. The most important component cf Figure 5: Information from multiple the framework is a coherent security and confidential- HIS ity policy among the participating institutions [8]. FRIEDA ALLEN 1926/08/03 F Probl_s Decision Support and Guidelines Stre I tas.. t...... e...e...... lhl - - _ iiiism Having access to the complete historical medical rec- C.0-90__S-- _ii-I ord of the patient lets clinical applications like report T/n/02 makers, automated guidelines help deliver better care T/29/92 a T/20/02 OSEC TO THN ID UNEO by informing clinicians to previously unattainable ...... level. One of the current projects at Boston Chil- T/29/02 mEK bUC TB...... P...... T...... dren's Hospital is to create an automated guideline ir ..../9 urgent care of jaundiced newborns. Implementation ...... 6.../...... chibomrsm PRIHKR-HVPOTHYROIS.13 of this system on top of the W3-EMRS architecture UftIEI4EE ...... {.v.... .v .I .. 11 ...... allows us to retrieve and incorporate birth and mater- -i nal data from the neighboring birth-hospitals for con- Figure 6: Problem List sideration when recommending a course of action. Dablla. Fadh--- 1M/915AwWwwwaip Fr Such a system has the potential to shorten the length CG of time to search for the patient data and thus shorten 30-16 30:26 the time between the child being admitted and the RNE2 start of effective and appropriate treatment. 3016 Acknowledgments N 30.26 The project W3-EMRS has been supported by NLM, r. /' 1' iN 30:16 SpaceLabs Medical, Sun Microsystems and Oracle. N N.: '.1 . \f N.! References

1. Kohane IS, Greenspun P, Fackler J, Cimino C, ...... V;...... I:...... I v. Szolovits P. Building National Electronic Medical Rec- SBsaWs2$ 1 Q.3 _110.2 WML D:17. L 210 M9 _ ord Systems via the World Wide Web. JAMIA :17:4 1996;3:191-207. IMl 4.2 11.1 2. F.J. van Wingerde et al. Using HL7 and the World Figure 7: Integration of ICU and HIS data Wide Web for Unifying Patient Data from Remote Data- bases. In: Cimino JJ, ed. 1996 AMIA Annual Fall Sym- Symposium. Washington, DC, USA: Hanley & Belfus, posium. Washington, DC, USA: Hanley & Belfus, Inc; Inc; 1996:608-612. 1996:643-647. 7. Kang Wang, Isaac Kohane, Karen Bradshaw, James 3. Health Level Seven: An application protocol for elec- Fackler. A Real Time Patient Monitoring System on the tronic data exchange in healthcare environments; Ver- World Wide Web. In: Cimino JJ, ed. 1996 AMIA Annual sion 2.2,Health Level Seven. Chicago, Illinois; 1990. Fall Symposium. Washington, DC, USA: Hanley & 4. Cimino JJ, Socratous SA, Clayton PD. Internet as Belfus, Inc; 1996:729-732. Clinical Information System: Application development 8. Rind DM, Kohane IS, Szolovits P, Safran C, Chueh using the World Wide Web. JAMIA. 1995;2:273-284. HC, Bamett GO., Maintaining the Confidentiality of 5. Willard KE, Hallgren JH, Sielaff B, Connelly DP. The Medical Records Shared over the Intemet and World deployment of a World Wide Web (W3) based medical Wide Web, Accepted for publication in: Annals in In- information system. In: Gardner RM, ed. Symposium on Computer Applications in Medical Care. New Orleans, ternal Medicine 1997 Louisiana: Hanley & Belfus, Inc; 1995:771-775. 9. Halamka J, Safran C, Virtual Integration of the Beth 6. Isaac Kohane et al. Sharing Electronic Medical Rec- Israel and Deaconess Hospitals via the WWW, Proced- ords Across Multiple Heterogeneous and Competing ings of the 1997 AMIA Fall Symposium Institutions. In: Cimino JJ, ed. 1996 AMIA Annual Fall

542