Biosurveillance Using Clinical Diagnoses and Social Media Indicators in Military Populations February 2017
PNNL-26271

CD Corley
JJ Harrison
S Volkova
J Mendoza
J Rounds
K Han
LE Charles

Prepared for the U.S. Department of Energy under Contract DE-AC05-76RL01830

Acknowledgments

This work was supported by the Defense Threat Reduction Agency under contract CB10082 to Pacific Northwest National Laboratory. The authors thank Commander Jean-Paul Chretien, Aaron Kite-Powell, and Vivek Khatri, who were at the Armed Forces Health Surveillance Branch, Defense Health Agency, for helpful discussions on the research and experimental design of this contract.

Acronyms and Abbreviations

AFHSC    Armed Forces Health Surveillance Center
BSVE     Biosurveillance Ecosystem
CDC      U.S. Centers for Disease Control
ETL      Extract Transform Load
EWMA     Exponentially Weighted Moving Average
GTSEDA   Generalized Time-Series Exploratory Data Analysis
ICD-9    International Classification of Disease, 9th edition
ILI      Influenza-like illness
ILINet   U.S. Outpatient Influenza-like Illness Surveillance Network
LDA      Latent Dirichlet Allocation
LIWC     Linguistic Inquiry and Word Count
LSTM     Long Short-Term Memory
MMWR     Morbidity and Mortality Weekly Report
NNDSS    National Notifiable Diseases Surveillance System (CDC)
SODA     Socrata Open Data API
STL      Seasonal Trend Loess
VADER    Valence Aware Dictionary and sEntiment Reasoner

Contents

Executive Summary
Acknowledgments
Acronyms and Abbreviations
1.0 Project Overview
  1.1 Objective
  1.2 Introduction
  1.3 Methods and Results
  1.4 Conclusion
2.0 Generalized Time Series Exploratory Data Analysis Application
  2.1 Introduction
    2.1.1 Data Sources
  2.2 User Interface
    2.2.1 Query Tab
    2.2.2 Select Tab
    2.2.3 Summary Tab
    2.2.4 Clustering Tab
    2.2.5 Within-Location Clustering Tab
    2.2.6 Alarms Tab
  2.3 Future Work and Discussion
    2.3.1 Data Sources
    2.3.2 Automation
    2.3.3 Modularization
    2.3.4 Further Output Integration
3.0 Military Biosurveillance: Studying Military Community Health, Well-being, and Discourse through the Social Media Lens
  3.1 Research Questions
  3.2 Background and Related Work
    3.2.1 Characteristics of the U.S. Military Population
    3.2.2 Studies on Military Populations
    3.2.3 Understanding Populations through Social Media
  3.3 Data
    3.3.1 Data Anonymization
    3.3.2 Sampling Military Users on Twitter
  3.4 Analysis and Results
    3.4.1 RQ1: Differences in Social Media Activities of Military vs. Control
    3.4.2 RQ2: Differences in Language Use between Military and Control
    3.4.3 RQ3: Trends of Sentiment for Military and Control
    3.4.4 RQ4: Topic Variations between Military and Control
    3.4.5 RQ5: Health-related Discourse between Military and Control
  3.5 Discussion
    3.5.1 Implications for Military Social Life
    3.5.2 Implications for Military and Public Health
    3.5.3 Limitations and Future Work
  3.6 Conclusion
  3.7 Acknowledgment
4.0 Military Biosurveillance: Predicting Influenza Dynamics with Neural Networks Using Signals from Social Media
  4.1 Motivation
  4.2 Approach
  4.3 Results
5.0 Chiron Computing Pipeline and Architecture
  5.1 Server Architecture and Description
  5.2 ETL Pipeline
    5.2.1 High-level ETL
    5.2.2 Enrichments
6.0 Understanding Readers’ Credibility Perceptions on Social Media Content: Case of Disease Outbreaks on Twitter
  6.1 Introduction
  6.2 Related Work
    6.2.1 Understanding Credibility in Social Media
  6.3 Study Goals