Clinical Concept Extraction for Document-Level Coding
Sarah Wiegreffe1, Edward Choi1∗, Sherry Yan2, Jimeng Sun1, Jacob Eisenstein1
1Georgia Institute of Technology  2Sutter Health
∗Current Affiliation: Google Inc
[email protected], [email protected], [email protected], [email protected], [email protected]

Proceedings of the BioNLP 2019 workshop, pages 261–272, Florence, Italy, August 1, 2019. © 2019 Association for Computational Linguistics

Abstract

The text of clinical notes can be a valuable source of patient information and clinical assessments. Historically, the primary approach for exploiting clinical notes has been information extraction: linking spans of text to concepts in a detailed domain ontology. However, recent work has demonstrated the potential of supervised machine learning to extract document-level codes directly from the raw text of clinical notes. We propose to bridge the gap between the two approaches with two novel syntheses: (1) treating extracted concepts as features, which are used to supplement or replace the text of the note; (2) treating extracted concepts as labels, which are used to learn a better representation of the text. Unfortunately, the resulting concepts do not yield performance gains on the document-level clinical coding task. We explore possible explanations and future research directions.

1 Introduction

Clinical decision support from raw-text notes taken by clinicians about patients has proven to be a valuable alternative to state-of-the-art models built from structured EHRs. Clinical notes contain valuable information that the structured part of the EHR does not provide, and do not rely on expensive and time-consuming human annotation (Torres et al., 2017; American Academy of Professional Coders, 2019). Impressive advances using deep learning have allowed for modeling on the raw text alone (Mullenbach et al., 2018; Rios and Kavuluru, 2018a; Baumel et al., 2018). However, these approaches have some shortcomings: clinical text is noisy and often contains heavy amounts of abbreviations and acronyms, a challenge for machine reading (Nguyen and Patrick, 2016). Additionally, rare words replaced with "UNK" tokens for better generalization may be crucial for predicting rare labels.

Clinical concept extraction tools abstract over the noise inherent in surface representations of clinical text by linking raw text to standardized concepts in clinical ontologies. The Apache clinical Text Analysis and Knowledge Extraction System (cTAKES; Savova et al., 2010) is the most widely-used such tool, with over 1000 citations. Based on rules and non-neural machine learning methods and engineered for almost a decade, cTAKES provides an easily-obtainable source of human-encoded domain knowledge, although it cannot leverage deep learning to make document-level predictions.

Our goal in this paper is to maximize the predictive power of clinical notes by bridging the gap between information extraction and deep learning models. We address the following research questions: how can we best leverage tools such as cTAKES on clinical text? Can we show the value of these tools in linking unstructured data to structured codes in an existing ontology for downstream prediction?

We explore two novel hybrids of these methods: data augmentation (augmenting text with extracted concepts) and multi-task learning (learning to predict the output of cTAKES). Unfortunately, in neither case does cTAKES improve downstream performance on the document-level clinical coding task. We probe this negative result through an extensive series of ablations, and suggest possible explanations, such as the lack of word variation captured through concept assignment.

2 Related Work

Clinical Ontologies  Clinical concept ontologies facilitate the maintenance of EHR systems with standardized and comprehensive code sets, allowing consistency across healthcare institutions and practitioners. The Unified Medical Language System (UMLS) (Lindberg et al., 1993) maintains a standardized vocabulary of clinical concepts, each of which is assigned a concept unique identifier (CUI). The Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT) (Donnelly, 2006) and the International Classification of Diseases (ICD) (National Center for Health Statistics, 1991) build off of the UMLS and provide structure by linking concepts based on their relationships. The SNOMED ontology has over 340,000 active concepts, ranging from the fine-grained ("Adenylosuccinate lyase deficiency") to the extremely general ("patient"). The ICD ontology is narrower in scope, with around 13,000 diagnosis and procedure codes used for insurance billing. Unlike SNOMED, which has an unconstrained graph structure, ICD9 is organized into a top-down hierarchy of specificity (see Figure 1).

Figure 1: A subtree of the ICD ontology (figure from Singh et al., 2014).

Clinical Information Extraction Tools  There are several tools for extracting structured information from clinical text. Popular types of information extraction include named-entity recognition, identifying words or phrases in the text which align with clinical concepts, and ontology mapping, labelling the identified words and phrases with their respective clinical codes from an existing ontology.1 Of the tools which perform both of these tasks, the open-source Apache cTAKES is used in over 50% of recent work (Wang et al., 2017), outpacing competitors such as MetaMap (Aronson, 2001) and MedLEE (Friedman, 2000).

cTAKES utilizes a rule-based system for performing ontology mapping, via a UMLS dictionary lookup on the noun phrases inferred by a part-of-speech tagger. Taking raw text as input, the software outputs a set of UMLS concepts identified in the text and their positions, with functionality to map them to other ontologies such as SNOMED and ICD9. It is highly scalable, and can be deployed locally to avoid compromising identifiable patient data. Figure 2 shows an example cTAKES annotation on a clinical record.

Figure 2: An example of cTAKES annotation output with part-of-speech tags and UMLS CUIs for named entities.2

Clinical Named-Entity Recognition (NER)  Recent work has focused on developing tools to replace cTAKES in favor of modern neural architectures such as Bi-LSTM CRFs (Boag et al., 2018; Tao et al., 2018; Xu et al., 2018; Greenberg et al., 2018), varying in task definition and evaluation. Newer approaches leverage contextualized word embeddings such as ELMo (Zhu et al., 2018; Si et al., 2019). In contrast, we focus on maximizing the power of existing tools such as cTAKES. This approach is more practical in the near term, because the adoption of new NER systems in the clinical domain is inhibited by the amount of computational power, data, and gold-label annotations needed to build and train such token-level models, as well as considerations for the effectiveness of domain transfer and the necessity to perform annotations locally in order to protect patient data. Newer models do not provide these capabilities.

NER in Text-based Models  Prior works use the output of cTAKES as features for disease- and drug-specific tasks, but either concatenate the concepts as shallow features or substitute them for the text itself (see Wang et al. (2017) for a literature review). Weng et al. (2017) incorporate the output of cTAKES into their input feature vectors for the task of predicting the medical subdomain of clinical notes. However, they use the annotations as shallow features in a non-neural setting, combining them with the text representations by concatenating the two into one larger feature vector. In contrast, we propose to learn dense neural concept embedding representations and to integrate the concepts in a learnable fashion to guide the representation learning process, rather than simply concatenating them or using them as a text replacement. We additionally focus on a more challenging task setting.

Boag and Kané (2017) augment a Word2Vec training objective to predict clinical concepts. This work is orthogonal to ours, as it is an unsupervised "embedding pretraining" approach rather than an end-to-end supervised model.

Automated Clinical Coding  The automated clinical coding task is to predict, from the raw text of a hospital discharge summary describing a patient encounter, all of the ICD9 (diagnosis and procedure) codes which a human annotator would assign to the visit. Because these annotators are trained professionals, the assigned ICD codes serve as a natural label set for describing a patient record, and the task can be seen as a proxy for a general patient outcome or treatment prediction task. State-of-the-art methods such as CAML (Mullenbach et al., 2018) treat each label prediction as a separate task, performing many binary classifications over the many-thousand-dimensional label space. The model is described in more detail in the next section.

1 Ontology mapping also serves as a form of text normalization.
2 Figure from https://healthnlp.github.io/examples/.

Results on the extracted concepts are presented in Table 1. Note the difference in the number of annotations provided by using the SNOMED ontology compared to ICD9.

ICD9
  Total concepts extracted                        1,005,756
  Mean # extracted concepts per document              19.10
  Mean % words assigned a concept per document        1.26%
SNOMED
  Total concepts extracted                       28,090,075
  Mean # extracted concepts per document             532.76
  Mean % words assigned a concept per document       35.21%
Mean # tokens per document                          1513.00

Table 1: Descriptive statistics on concept extraction for the MIMIC-III corpus.
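Per-document statistics of the kind shown in Table 1 can be derived directly from an extractor's span output. The following is a minimal sketch; the (tokens, spans) input format and the example CUIs are illustrative assumptions, not the actual cTAKES output schema:

```python
# Sketch: concept-extraction statistics in the style of Table 1.
# Each document is a (tokens, spans) pair, where spans are hypothetical
# (start, end, cui) annotations over token positions.

def extraction_stats(docs):
    """Return (total concepts, mean concepts/doc, mean % tokens covered, mean tokens/doc)."""
    total_concepts = sum(len(spans) for _, spans in docs)
    mean_concepts = total_concepts / len(docs)
    pct_covered = []
    for tokens, spans in docs:
        covered = set()                      # token positions inside any concept span
        for start, end, _cui in spans:
            covered.update(range(start, end))
        pct_covered.append(100.0 * len(covered) / len(tokens))
    mean_pct = sum(pct_covered) / len(docs)
    mean_tokens = sum(len(tokens) for tokens, _ in docs) / len(docs)
    return total_concepts, mean_concepts, mean_pct, mean_tokens

# Two toy "notes" with made-up spans (CUIs here are illustrative only).
docs = [
    (["pt", "denies", "chest", "pain", "or", "dyspnea"],
     [(2, 4, "C0008031"), (5, 6, "C0013404")]),
    (["history", "of", "diabetes"],
     [(2, 3, "C0011849")]),
]
stats = extraction_stats(docs)
```

The same coverage measure explains the gap between the ICD9 and SNOMED rows of Table 1: broader ontologies match far more word positions per note.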
The label space is very large (tens of thousands of possible codes) and its frequency distribution is long-tailed; Rios and Kavuluru (2018b) find that CAML performs weakly on rare labels.

3 Problem Setup

Task Notation  A given discharge summary is

Base model  We evaluate against CAML (Mullenbach et al., 2018), a state-of-the-art text-based model for the clinical coding task. The model leverages a convolutional neural network (CNN) with per-label attention to predict the combination of codes to assign to a given discharge summary.
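The per-label attention step can be sketched as follows. This is an illustrative pure-Python version with made-up dimensions and random weights, not the authors' implementation (which also includes the embedding layer, the convolutional encoder producing H, and the training procedure):

```python
import math
import random

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def per_label_attention(H, U, W, b):
    """H: n x d encoded note (e.g., CNN feature vectors over n positions).
    U: L x d per-label attention vectors; W (L x d) and b (length L) form the
    output layer. Returns L probabilities, one independent binary task per code."""
    probs = []
    for u_l, w_l, b_l in zip(U, W, b):
        attn = softmax([dot(h, u_l) for h in H])   # weights over note positions
        # Label-specific document representation: attention-weighted sum of H.
        v = [sum(a * h[j] for a, h in zip(attn, H)) for j in range(len(H[0]))]
        probs.append(1 / (1 + math.exp(-(dot(v, w_l) + b_l))))  # sigmoid score
    return probs

random.seed(0)
n, d, L = 7, 4, 3   # note positions, hidden size, number of candidate codes
H = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
U = [[random.gauss(0, 1) for _ in range(d)] for _ in range(L)]
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(L)]
b = [random.gauss(0, 1) for _ in range(L)]
probs = per_label_attention(H, U, W, b)
```

Because each label has its own attention vector, each code can attend to a different region of the note, which is what distinguishes this design from a single pooled document vector.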