Change Management and Synchronization of Local and Shared Versions of a Controlled Vocabulary
CHANGE MANAGEMENT AND SYNCHRONIZATION OF LOCAL AND SHARED VERSIONS OF A CONTROLLED VOCABULARY

A DISSERTATION SUBMITTED TO THE PROGRAM IN BIOMEDICAL INFORMATICS AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Diane Elizabeth Oliver
August 2000

© Copyright by Diane E. Oliver 2001
All Rights Reserved

I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
_______________________________________
Mark A. Musen, M.D., Ph.D., Principal Adviser (Stanford Medical Informatics)

I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
_______________________________________
Edward H. Shortliffe, M.D., Ph.D. (Stanford Medical Informatics)

I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
_______________________________________
Yuval Shahar, M.D., Ph.D. (Stanford Medical Informatics)

I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
_______________________________________
Gio Wiederhold, Ph.D. (Department of Computer Science)

Approved for the University Committee on Graduate Studies:
_______________________________________

Abstract

To share clinical data and to build interoperating computer systems that permit data entry, data retrieval, and data analysis, users and systems at multiple sites must share a common controlled clinical vocabulary (or ontology). However, local sites that adopt a shared vocabulary have local needs, and local-vocabulary maintainers make changes to the local version of that vocabulary.
If the local site is motivated to conform to the shared vocabulary, then the burden lies with the local site to manage its own changes and to incorporate changes from the shared version at periodic intervals. I call this process synchronization. In this dissertation, I present an approach to change management and synchronization of local and shared versions of a controlled vocabulary. I describe the CONCORDIA model, which comprises a structural model, a change model, and a log model to which the shared and local vocabularies conform. I demonstrate use of this model in the implementation of a synchronization-support tool that supports carefully controlled divergence. To evaluate my model and methods, I performed synchronization on a small test set of medical concepts in the subdomain of rickettsial diseases. The CONCORDIA model served as an effective approach for representation and communication of vocabulary change.

Acknowledgements

I thank Mark Musen, Yuval Shahar, Ted Shortliffe, and Gio Wiederhold for their contributions as members of my dissertation committee. I thank Octo Barnett for introducing me to the challenges of controlled medical vocabularies. I thank Keith Campbell, Alan Rector, Samson Tu, Ray Fergerson, Larry Fagan, John Gennari, Jeremy Wyatt, and Darlene Vian for valuable discussions and inspiration. I thank Lyn Dupre for editing this dissertation. Above all, I am grateful for the support from my family and friends. This work was funded by the Agency for Health Care Policy and Research, the Veterans Administration, and the National Library of Medicine.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
List of Boxes
1 Management of Change in Controlled Medical Vocabularies
    1.1 Hypothesis
    1.2 The Need for a Shared Controlled Medical Vocabulary
    1.3 The Problem of Local Divergence
        1.3.1 An Example of Divergence: The VA Lexicon
        1.3.2 Update of a Local Terminology: A Report from the Trenches
        1.3.3 Tradeoffs Between Autonomy and Conformance
        1.3.4 Reasons for Shared and Local Change
    1.4 Research Assumptions
    1.5 Research Approach
        1.5.1 Extended Vocabulary Model
        1.5.2 Synchronization
        1.5.3 System Architecture
            1.5.3.1 Concept and Change-Data Repositories
            1.5.3.2 Browsers and Editors
            1.5.3.3 Synchronization-Support Tool
    1.6 Evaluation
    1.7 Vocabulary Problems Not Addressed Herein
    1.8 Summary
    1.9 Guide for the Reader
2 Structural Models, Change Models, and Log Models in Existing Systems
    2.1 Controlled Medical Vocabularies and Frame-Based Knowledge-Representation Systems
    2.2 Structural Models
        2.2.1 Naming and Identification of Concepts
            2.2.1.1 Codes, Names, and Unique Identifiers
            2.2.1.2 Synonyms and Abbreviations
            2.2.1.3 Translation Between Coding Systems
            2.2.1.4 Translation Between Natural Languages
        2.2.2 Organization of Concepts
            2.2.2.1 Hierarchy
            2.2.2.2 Binary Relations
            2.2.2.3 User-Defined Facets and Cardinality
            2.2.2.4 Transitivity of Roles
            2.2.2.5 Individuals or No Individuals
            2.2.2.6 Inheritance and Subsumption
            2.2.2.7 Conjunction, Disjunction, and Negation
    2.3 Change Models
        2.3.1 Additions