Infodemiology and Infoveillance Tracking Online Health Information and Cyberbehavior for Public Health
Total Page:16
File Type:pdf, Size:1020Kb
Infodemiology and Infoveillance Tracking Online Health Information and Cyberbehavior for Public Health Gunther Eysenbach, MD, MPH Introduction The vision explored in this paper is to provide real- time metrics (presented as graphics, indices, and maps) of nfodemiology, an emerging area of research at the public behavior, opinion, knowledge, and attitudes to crossroads of consumer health informatics and pub- public health offıcials and policymakers, based on textual lic health informatics, as well as infometrics and web I data harvested from the Internet. The amount of user- analytics tools, can be defıned as the science of distribu- tion and determinants of information in an electronic generated data on social media and other Internet-based venues “has made measurable what was previously im- medium, specifıcally the Internet, with the ultimate aim 1 to inform public health and public policy. Infodemiology measurable,” and opens up a new intriguing area of data (derived from unstructured, textual, openly accessi- research and development: the possibility to systemati- ble information produced and consumed by the public on cally mine, aggregate, and analyze these textual, unstruc- tured data, to inform public health and public policy. the Internet, such as blogs, websites, and query and nav- 1–3 igation data) can be collected and analyzed in near real- This emerging fıeld has been called infodemiology, time. We developed a proof-of-concept infoveillance sys- originally in the context of identifying and monitoring 2 tem called Infovigil, which can identify, archive, and misinformation, and later in a study that showed that 1 analyze health-related information from Twitter and Google searches predicted influenza outbreaks (later 4 other information streams from Internet and social me- popularized by the Google Flutrends application). An- 3 dia sources. The system was developed to demonstrate other term—infoveillance —has been used for applica- and explore the potential of infoveillance for measuring tions where infodemiology methods are employed for public attention, attitudes, behavior, knowledge, and in- surveillance purposes. formation consumption, as well as for syndromic surveil- The underlying idea of this emerging fıeld is to mea- lance, health communication, and knowledge translation sure the pulse of public opinion, attention, behavior, research. knowledge, and attitudes by tracking what people do and write on the Internet. The term technosocial predictive What Are Infodemiology and analysis also has been proposed,5 although “prediction” Infoveillance? does not capture the full range of possible applications of infodemiology. As will be discussed in more detail below, Imagine being a public health offıcial, eHealth researcher, infodemiology goes beyond forecasting (prediction) and or behavioral scientist, and being able to look at a dash- includes “nowcasting” (providing data for situational board telling you in real-time what people are doing or awareness on what the public does, knows, or feels about feeling, much as economists can look at the Dow Jones certain issues). stock index as a real-time measure of “what people are The term infodemiology is now widely used by others: doing” (buying or selling), or at the VIX (volatility index, also called “fear index”), which provides metrics for im- in January 2011, there were almost 5000 Google hits for plied feelings or attitudes, such as investor nervousness or the term (as an aside, it should be noted that reporting fear. and tracking the number of hits on Google for an emerg- ing concept is a little infodemiology study in itself: The emergence of a new concept or term can be visualized From the Centre for Global eHealth Innovation, Consumer Health & Public Health Informatics Lab, University Health Network, and the De- by plotting the number of occurrences on the web as a partment of Health Policy, Management, and Evaluation, University of knowledge translation or diffusion metric. We have Toronto, Toronto, Canada done a similar analysis to illustrate the uptake of the Address correspondence to: Gunther Eysenbach, MD, MPH, Centre for Global eHealth Innovation, Consumer & Public Health Informatics Lab, new, recommended term H1N1 versus the popularly University Health Network, 190 Elizabeth Street, Toronto M5G 2C4 Can- used “swine flu” during the 2009 H1N1 pandemic).6 ada. E-mail: [email protected]. 0749-3797/$17.00 Perhaps the term infodemiology is preferred because it doi: 10.1016/j.amepre.2011.02.006 intuitively conveys a few key concepts, including the fact S154 Am J Prev Med 2011;40(5S2):S154–S158 © 2011 American Journal of Preventive Medicine • Published by Elsevier Inc. Eysenbach / Am J Prev Med 2011;40(5S2):S154–S158 S155 Figure 1. The role of infodemiology in public health that (1) epidemiologic methods and terminology (e.g., While the Internet is currently the main source of such prevalence) can be used to study and describe informa- information, in principle any consumer health informat- tion in an information universe; (2) infodemiology pro- ics application (including, for example, personal health vides data for public health decision making (again sim- records, or even domotics applications such as intelligent ilar to the role of epidemiology); and (3) infodemiology kitchen appliances) and social media may produce data takes a population-perspective (implied by the word that may be harnessed for infodemiology and infoveil- demos, from Greek for people) on three different levels. lance approaches. First, people are the generators of this information; sec- Figure 1 further illustrates the relationship between ond, information has an effect on other people; and fı- epidemiology (the science of the determinants and distri- nally, it also reminds us of the fact that we have to look at bution disease) and infodemiology (the science of the populations of information units (e.g., many different determinants and distribution of information). The top websites) rather than an individual unit (e.g., web analyt- cycle shows how traditional epidemiologic methods (e.g., ics of a single website) in order to obtain meaningful and surveys, clinical data) and data from polls or focus groups robust data. inform public health professionals and policymakers and A more formal defınition of infodemiology is “the sci- affect public health interventions and policy decisions, ence of distribution and determinants of information in which—augmented by the media and public relations an electronic medium, specifıcally the Internet, or in a campaigns—then (hopefully) have an impact on popula- population, with the ultimate aim to inform public health tion behavior, attitudes, and ultimately health status. and public policy.”3 Infoveillance is “the longitudinal These outcomes are in turn picked up by traditional epi- tracking of infodemiology metrics for surveillance and demiologic research methods, which again inform public trend analysis.”3 With information we mean primarily health professionals and policymakers, and so on. This unstructured, textual, openly accessible information pro- cycle is often a time-consuming process; for example, duced and consumed by the public on the Internet. This it may take months or years to determine whether a can include, for example, search or navigation data (in- healthy-eating campaign has been successful in terms of formation demand), or postings (information supply) on affecting behavior, attitudes, knowledge, or even clinical websites, blogs, microblogs (Twitter), discussion boards, outcomes. However, in the age of the Internet and social or social media. It can also include data on what people media, changes in population behavior, public attitudes, browse, buy, and read on the Internet, or social network- public attention, or health status are often reflected in ing data (who we befriend or interact with) harvested immediate changes in information and communication from sites such as Facebook. Some of these data may patterns on the Internet.1,3 If these changes could be actually be accessible in a structured format, but usually picked up by infodemiology metrics, these data points infodemiology implies some sort of free-text analysis. could give some additional information to decision mak- May 2011 S156 Eysenbach / Am J Prev Med 2011;40(5S2):S154–S158 ers (we stress that infodemiology metrics cannot replace, ● mining status updates on microblogs such as Twitter but rather complement, traditional methods). for syndromic surveillance and situational awareness For example, changes in the health status of a popula- during a pandemic6; tion (more people getting influenza) leads to an increased ● identifying and monitoring of public health–relevant search for influenza-related websites, more clicks on ad- publications on the Internet (e.g., anti-vaccination vertisements for influenza websites, more tweets and sta- sites, but also news articles or expert-curated outbreak tus updates on Facebook saying “I’ve got a cold,” more reports)2; orders in Internet pharmacies, possibly more book sales ● detecting and quantifying disparities in health infor- of influenza-related books, and so on. These infodemiol- mation availability3; ogy data will obviously be confounded and influenced by ● tracking the effectiveness of health marketing campaigns; media reports (which in itself can be tracked and picked ● extracting user-generated health outcomes data up by infodemiology methods). An “epidemic of fear” (e.g., from sites such as PatientsLikeMe) to construct may