170 Current Drug Safety, 2012, 7, 170-175 Implemented Data Mining and Signal Management Systems on Spontaneous Reporting Systems’ Databases and their Availability to the Scientific Community - A Systematic Review

Luís Miguel de Almeida Vieira Lima*,1, Nuno Gonçalo Sales Craveiro Nunes2,3,8, Pedro Gonçalo Pires da Silva Dias4,5,6,7 and Francisco Jorge Batel Marques2,8

1University of Coimbra, Coimbra, Portugal 2AIBILI – Association for Innovation and Biomedical Research on Light and Image, Coimbra, Portugal 3Health Sciences Department, University of Aveiro, Aveiro, Portugal 4Beira Interior University, Covilhã, Portugal 5Polytechnic Institute of Leiria, Leiria, Portugal 6Center of Investigation in Human Motricity, Leiria, Portugal 7Research Center in Sports, Health Sciences and Human Development, Vila Real, Portugal 8School of Pharmacy, University of Coimbra, Coimbra, Portugal

Abstract: Adverse drug reactions’ spontaneous reporting systems are an important element in worldwide , gathering potentially useful information for post-marketing drug safety surveillance. Data mining and signal management systems, providing the capability of reading and interpreting these systems’ raw data (data that has not been subjected to processing or any other manipulation), improve its analysis process. In order for this analysis to be possible, both data mining and signal management systems and raw data should be available to researchers and the scientific community. The purpose of this work was to provide an overview of the spontaneous reporting systems databases reported in literature as having implemented a data mining and signal management system and the implementation itself, evidencing their availability to researchers. A systematic review was carried out, concluding that they are freely provided to researchers within institutions responsible for maintaining the spontaneous reporting systems, but not to most researchers within the scientific community. Keywords: Data mining, information system, pharmacovigilance, post-marketing safety surveillance, scientific community, signal detection software, spontaneous reporting system.

INTRODUCTION agencies, market authorization holders (MAH) and the scientific community [2-11] have struggled to develop and Spontaneous Reporting System (SRS) databases, enhance data mining and signal management systems containing suspected adverse drug reactions (ADR) reports, (DMSs), systematizing the process of analysing both are an important element in the worldwide previously existing and new spontaneous ADR reports. pharmacovigilance [1]. Despite intrinsic limitations [1-11], DMSs grew into an essential tool for an efficient detection of the amount of potentially useful information that they potential signals [11]. contain makes the process of searching for relevant cases among the stored information an important challenge for The objective of the present work is to compile drug safety researchers. These cases are often named as information that identifies and describes the SRS databases signals, defined by the World Health Organization as reported in the literature as having implemented a DMS and “reported information on a possible causal relationship the implementation characteristics themself, giving evidence between an adverse event and a drug, of which the of their availability to researchers within the scientific relationship is unknown or incompletely documented community. previously” [2], being its searching process known as signalling or signal detection. MATERIALS AND METHODS Due to the increasingly large number of spontaneous A Medline search from January 1995 to December 2010 reports stored in SRS databases, celerity and efficiency in the was carried out using the search strategy defined in Table 1. process of signalling for relevant cases are being compromised. To overcome the problem, drug regulatory All 210 titles were examined by the first author and reviewed independently by the second author. Relevant abstracts were extracted in order to identify potentially

*Address correspondence to this author at the University of Coimbra, relevant studies. Full papers of potentially relevant studies Coimbra, Portugal; Tel: 00351912570205; were obtained and assessed according to the inclusion and E-mail: [email protected] exclusion criteria. Studies were selected based on their

1574-8863/12 $58.00+.00 © 2012 Bentham Science Publishers Adverse Drug Reactions Spontaneous Reporting Systems Current Drug Safety, 2012, Vol. 7, No. 2 171

Table 1. MedLine Search Strategy

Information Source (Search Date) Step Number Search Terms Results

#1 Pharmacovigilance 2030 #2 Post marketing surveillance 1108 #3 Adverse drug reaction 340832 #4 Adverse event 24183 #5 #1 OR #2 OR #3 OR #4 363873 #6 Signal detection 33770 #7 Data mining 7575 Pubmed (25/07/2011) #8 Automated signaling 816 #9 #6 OR #7 OR #8 42019 #10 Software 130058 #11 Information System 250967 #12 #10 OR #11 347629 #13 #5 AND #9 AND #12 212 #14 Limits: Publication Date from 1995/01/01 210 referral of DMSs implemented on SRS databases [1-14]. • Database Only articles published in English were considered to the • Data origin present review. Bibliographies of included studies were searched for further relevant studies [15-17]. In complement, • Report submission (collecting data process) institutions were contacted and their websites were accessed, • Implemented DMS characterization: as a way to extend our information sources [18-28]. The selection process is illustrated in Fig. (1). o System identification – used DMS software Data were extracted by the first author using a o Signal detection methods - methods used to identify standardized extraction table and checked independently by drug-event associations that should be signalized for the second author. All data were review by other authors. further investigation Disagreement was resolved by discussion. The extraction o Data presentation - way the results of the previous table took into account the following variables: phase are displayed in order to facilitate its analysis for risk assessment by researchers or technicians.

210 papers screened

112 papers excluded based on title

98 papers identified for abstracts review 57 papers excluded based on abstract review

41 papers identified for full paper review

27 papers excluded following perusal of full paper

14 papers met inclusion criteria

Fig. (1). Identification of published evidence for review. 172 Current Drug Safety, 2012, Vol. 7, No. 2 Lima et al.

o Public availability Food and Drug Administration’s Adverse Event Reporting System receives data from health care • Data availability to the scientific community. professionals, consumers, MAH and other residual sources, Finally, a description was performed for each SRS such as literature reports. From 1969 to 2002, 60% of the database identified in the review and each DMS identified reports received were from health care professionals, 29% has having been implemented on the SRS databases. from consumers and the remaining from MAH and others [15]. Report submission can be distinguished according to its RESULTS nature: voluntary, performed essentially by health care professionals and consumers, which occurs through This review identified seven SRS databases. Six of them MedWatch, a web-based reporting tool that allows users to are property of regulatory or drug safety monitoring fill and send an Online Voluntary Reporting Form; and institutions and one is property of GlaxoSmithKline, a mandatory, performed by MAH that receive direct reports multinational pharmaceutical company. from health care professionals and consumers and are Table 2 presents a summary of their characteristics. obliged to further report them to the Food and Drug Administration using a specific formulary sent by mail or by an electronic process [19]. This system’s raw reported data UNITES STATES FOOD AND DRUG ADMINISTRAT- are freely available and paid reports of specific drug-event ION ADVERSE EVENTS REPORTING SYSTEM association analysis are available on demand [14]. The Food and Drug Administration’s Adverse Event Reporting System is a large database of voluntarily WORLD HEALTH ORGANIZATION INDIVIDUAL submitted adverse events, representing the most important CASE SAFETY REPORTS DATABASE SYSTEM resource to the study and identification of adverse reactions (VIGIBASE) to drugs and biological products in the United States [15]. It VigiBase is a large and global database of Individual Case is in practice since October 1997, when it replaced Safety Reports, maintained and developed by the Uppsala Spontaneous Reporting System, an electronic database that Monitoring Centre, in Sweden, that conjugates reports from the contained reports since 1969 [15]. Currently contains more national centres that integrate the World Health Organization than four million reports, considering data from previous International Drug Monitoring Programme. Currently, there are Spontaneous Reporting System [18]. more than 80 countries submitting their national reports to VigiBase. The top-5 contributors are the United States, the

Table 2. Identified SRS Databases

Databases Data Origin Report Submission Data Availability

United States (consumers, health Paper-based forms Food and Drug Administration’s Raw data available care professionals MEDWatch (electronic reporting Adverse Event Reporting System Reports available on demand and MAH) tools)

World Health Organization E2B format reports Individual Case Safety Reports VigiFlow (web-based reporting World Health Organization Raw data not available database - reports sent by national tool) VigiBase Reports available on demand agencies from more than 80 ASCII text files (old World Health member countries all over the world Organization format) Data from European Medicines European Medicines Agency Agency member states (regulatory Electronic reports (EVPM-ICSRs) Not available EudraVigilance agencies and MAH)

Medicines and Healthcare products Paper-based Yellow Cards Raw data not available United Kingdom (consumers, health Regulatory Agency 's Yellow Card Electronic Yellow Cards List of all reactions reported for a care professionals and MAH) Scheme (Electronic reporting tools ) particular drug

Raw data not available List on all reactions reported for a particular drug and all drugs Netherlands Pharmacovigilance Netherlands (consumers, health care Paper-based forms reported associated to a particular Centre Lareb professionals and MAH) Electronic reporting forms reaction Messages about possible signals detected available Guangdong Monitoring Centre Spontaneous reports from 21 cities Web-based reporting tool Not available Database in Guangdong province (China)

Worldwide GlaxoSmithKline 's GlaxoSmithKline No information available Not available spontaneous reports

Abbreviations: MAH, Market Authorization Holders; EVPM, EudraVigilance PostAuthorization Module; ICSR, Individual Case Safety Report. Adverse Drug Reactions Spontaneous Reporting Systems Current Drug Safety, 2012, Vol. 7, No. 2 173

United Kingdom, Germany, Australia and Canada [6, 7, 16]. was founded in 1991 with the purpose of maintaining the The system holds more than six million Individual Case Safety Netherlands national Spontaneous Reporting System (SRS) Reports [17]. on behalf of the Dutch Medicines Evaluation Board [6]. It collects ADRs reported by health care professionals, patients The submission of Individual Case Safety Reports occurs in three different ways [8]: and MAH through paper forms or online reporting forms [26]. • International Conference on Harmonization of Technical Lareb’s raw data are not made available outside the Requirements for Registration of Pharmaceuticals for Netherlands Pharmacovigilance Centre, but there is a large Human Use E2B format document variety of information that can be accessed on its website • Web-based reporting tool – VigiFlow [26], such as: • ASCII files in the old World Health Organization format • List of all reactions reported for a particular drug; (subset of the E2B format) • List of all drugs reported associated to a particular Vigibase’s raw data was not identified as being available, reaction; but paid overview or detailed reports are available on demand • Messages concerning the detection of possible [18]. signals. EUROPEAN MEDICINES AGENCY EUDRAVIGILANCE (EUDRAVIGILANCE) GUANGDONG MONITORING CENTRE DATABASE Guangdong Monitoring Centre Database of ADRs EudraVigilance is a central computer database created in contains reports from a population of over 82 million people 2001, containing ADR reports for drugs licenced in the from the 21 existing cities in the Chinese province of European Union. Reports are submitted electronically by Guangdong. In 2007, it contained over 40 thousand qualified member states’ regulatory agencies and pharmaceutical companies [22, 23], in compliance to the EudraVigilance Post- reports, and, in 2009, about three to five thousand new reports were added quarterly [9]. No information could be Authorization Module, designed for post-authorization found regarding the availability of Guangdong Monitoring Individual Case Safety Reports [22]. Centre Database’s data. No information was found on the availability of EudraVigilance’s data. GLAXOSMITHKLINE’S SPONTANEOUS REPORTS DATABASE MEDICINES AND HEALTHCARE PRODUCTS REGULATORY AGENCY YELLOW CARD SCHEME GlaxoSmithKline’s database contains spontaneously reported suspected ADRs and supports the Spontaneous reports have been collected in the United pharmacovigilance of the MAH’s marketed products. Until Kingdom since 1964 through the Yellow Card Scheme, 2007, it typically received more than 70 thousand developed and maintained by Medicines and Healthcare spontaneous reports a year [10]. No information was found products Regulatory Agency 's (that replaced, in 2003, the regarding the way reports are submitted and which entities Medical Devices Agency and the Medicines Control Agency), are meant to do so, as well as whether GlaxoSmithKline’s an agency of the Department of Health that operates post- internal data is available. marketing drug safety surveillance systems in the United Five different DSMs have been implemented in the Kingdom. Its database contains more than half a million reports previously presented SRS databases. In Table 3 each of suspected ADRs voluntarily submitted by healthcare implementation’s characteristics are summarized. professionals and patients by three available means [24, 25]. • Online Yellow Card Report; ORACLE HEALTH SCIENCES EMPIRICA SIGNAL™ • Yellow Card Form, available at pharmacies and General (EMPIRICA) Practitioner surgeries or downloadable; Empirica is the latest version of WebVDME, providing a • Yellow Card hotline. visual data mining environment for detecting signals, Medicines and Healthcare products Regulatory Agency's uncovering patterns and recognizing emerging trends in spontaneous adverse event report data. The resulting scores Yellow Card Scheme’s raw data is not available, but its website can be reviewed in tabular or graphical format [27]. provides lists of all suspected ADRs reported [25]. Food and Drug Administration’s Adverse Event NETHERLANDS PHARMACOVIGILANCE CENTRE Reporting System, GlaxoSmithKline and Medicines and LAREB (LAREB) Healthcare products Regulatory Agency's Yellow Card Scheme use Empirica. The first two implemented it with Lareb, that stands for ‘Landelijke Registratie en Multi-item Gamma Poisson Shrinker and its correspondent Evaluatie van Bijwerkingen’ (National Registration and measure of disproportionality, Empirical Bayes Geometric Evaluation of Adverse drug reactions) and goes by the name Mean [4], while the last one implemented it with of Netherlands Pharmacovigilance Centre Lareb since 2002, Proportional Reporting Ratios [14]. These implementations are only available inside each institution.

174 Current Drug Safety, 2012, Vol. 7, No. 2 Lima et al.

Table 3. DSM Implementation on Identified SRS Databases

Used Data Mining System Implementation DMS SRS Database Techniques/Measures of Data Presentation Availability Disproportionality

Food and Drug Administration’s Adverse EBGM Not identified Event Reporting System Oracle Health Sciences Medicines and Healthcare Tabular and graphic form Empirica Signal™ software products Regulatory resulting scores PRR Not identified Agency 's Yellow Card Scheme GlaxoSmithKline EBGM Not identified Bayesian Confidence World Health Organization Interactive visualization Propagation Neural IC Conditional Vigibase module Network software Netherlands Lareb’s graphical interface Pharmacovigilance Centre ROR No information available Not identified Lareb EudraVigilance Data Analysis System + European Medicines EudraVigilance Analysis PRR Graphical interface Not identified Toolkit Agency EudraVigilance

Guandong Quantitative Guangdong Monitoring Tabular and graphic form IC Not identified Signal Detection System Centre Database resulting scores Abbreviations: EBGM, Empirical Bayes Geometric Mean; IC, Information Component; PRR, Proportional Reporting Ratio; ROR, Reporting Odds Ratio.

BAYESIAN CONFIDENCE PROPAGATION NEURAL regulatory purposes and signal detection, using Proportional NETWORK DATA MINING SOFTWARE Reporting Ratios as the measure of disproportionality. EudraVigilance Analysis Toolkit, also an in-house Bayesian Confidence Propagation Neural Network data implementation, provides predefined table and graph formats mining software was developed by Orre and associates, in for report presentation, having many available functions for cooperation with the World Health Organization the manipulation of report results. No information was found Collaborative Centre for International Drug Monitoring, in regarding both modules’ availability [23]. order to data mine Vigibase for signal detection. It uses Bayesian Confidence Propagation Neural Network’s Information Component as its measure of disproportionality GUANGDONG QUANTITATIVE SIGNAL DETECT- and has a module for interactive visualization of query ION SYSTEM results that produces automatic presentation of results in Guandong Quantitative Signal Detection System, a web- graphical and table forms [17, 28]. This software was based quantitative signal detection system for ADRs, identified as being accessible to health professionals through comprises three software modules that prepare data, detect a paid access to a web-based module. associations, and generate reports. This in-house development was implemented over Guangdong Monitoring LAREB’S GRAPHICAL INTERFACE Centre Database and was based on the Guangdong ADR monitoring platform. It defined Information Component Lareb’s graphical interface was an in-house development analysis as the system’s measure of disproportionality [9]. meant to facilitate a fast and efficient screening of its No information was found on the way results are presented database for possible drug-drug interactions or drug-related to users, as well as on the availability of the system. syndromes using Reporting Odds Ratio as the measure of disproportionality [26]. No information was found regarding the system’s availability and the way it presents data for DISCUSSION analysis. A thorough and systematized analysis of SRS databases outside the institutions responsible for them depends on the EUDRAVIGILANCE DATA ANALYSIS SYSTEM + availability of these systems’ raw data, and the consequent EUDRAVIGILANCE ANALYSIS TOOLKIT possibility of analysing it by applying selected data mining and signal detection methods, and of DMSs, to accomplish EudraVigilance Data Analysis System is an in-house data analysis with selected data mining and signal detection implementation, designed to allow users to analyse safety methods. data collected in EudraVigilance in order to understand the safety profile of medicinal products. It provides a range of This review identified several obstacles on both aspects. analytical tools for measuring reporting compliance for Only Food and Drug Administration’s Adverse Event Adverse Drug Reactions Spontaneous Reporting Systems Current Drug Safety, 2012, Vol. 7, No. 2 175

Reporting System’s raw data are available, allowing its [5] Wilson AM, Thabane L, Holbrook A. Application of data mining interpretation by selected data mining and signal detection techniques in pharmacovigilance. Br J Clin Pharmacol; 2003 57(2): 127–34. methods to take place. World Health Organization allows [6] van Puijenbroek E, Diemont W, van Grootheest K. Application of paid access to a web-based data search module [17, 28], and quantitative signal detection in the Dutch spontaneous reporting Medicines and Healthcare products Regulatory and Lareb system for adverse drug reactions. Drug Saf 2003; 26 (5): 293-01. provide it for free [25,26], but the available data cannot be [7] Bate A. Bayesian confidence propagation neural network. Drug Saf manipulated and therefore no data mining and signal 2007; 30(7): 623-25. [8] Lindquist M. VigiBase, the WHO Global ICSR Database System: detection method can be applied. GlaxoSmithKline’s Basic Facts. Drug Information J 2008; 42: 409–19. database and Guangdong Monitoring Centre Database data [9] Li C, Xia J, Deng J, et al. A web-based quantitative signal were not identified as being available. Regarding the detection system on adverse drug reaction in China. Eur J Clin analysed DMS implementations, none was identified as Pharmacol 2009; 65: 729–41. [10] Almenoff J. Innovations for the future of pharmacovigilance. Drug being available, except for Bayesian Confidence Propagation Saf 2007; 30(7): 631-3. Neural Network software over Vigibase, accessible by health [11] Hauben M, Bate A. Decision support methods for the detection of professionals through a paid access to a web-based module adverse events in post-marketing data. Drug Discov Today 2009; [17, 28]. 14(7/8) 343-57. [12] Bate A, Lindquist M, Edwards IR, Orre R. A data mining approach for signal detection and analysis. Drug Saf 2002; 25(6): 393-7. CONCLUSION [13] Almenoff JS, Pattishall EN, Gibbs TG, DuMouchel W, Evans SJ, Yuen N. Novel statistical tools for monitoring the safety of Considering what has been exposed, it comes as a marketed drugs. Clin Pharmacol Ther 2007; 82(2): 157-66. conclusion that SRS databases and DMSs are freely [14] Evans SJ, Waller PC, Davis S. Use proportional reporting ratios available to researchers within institutions responsible for (PRRs) for signal generation from spontaneous adverse drug reaction reports. Pharmacoepidemiol Drug Saf 2001; 10: 483-6. maintaining the systems, but not to most researchers within [15] Diane K, Wysowski LS. Adverse drug event surveillance and drug the scientific community. This hinders the possibility of withdrawals in the United States, 1969-2002. Arch Intern Med performing adequate scientific studies on drug safety. 2005; 165: 1363-9. Available databases and DMSs should allow the spreading of [16] Almenoff J, Tonning JM, Gould AL. Perspectives on the use of researchers participating in the process of discovering data mining in pharmacovigilance. Drug Saf 2005; 28(11): 981- 1007. previously unknown drug safety related problems, creating [17] Orre R. On Data mining and classification using a bayesian additional knowledge from the available information or confidence propagation neural network, in Department of presenting it from other points of view. Numerical Analysis and Computer Science. Royal Institute of Technology: Stockholm, Sweden 2003. This review presents some limitations. It is to be [18] FDA. Adverse Event Reporting System (AERS). 2009 [accessed expected that most SRS databases have an implemented 12/9/2011]; Available from: http://www.fda.gov/Drugs/Guidanc DMS. It is also possible that other agencies have freely eComplianceRegulatoryInformation/Surveillance/AdverseDrugEffe cts/default.htm. available raw data. None of these issues were identified [19] FDA. MedWatch: The FDA Safety Information and Adverse Event through our research methodology. Reporting Program. 2011 [accessed 12/9/2011]; Available from: http://www.fda.gov/Safety/MedWatch/default.htm. [20] Sten Olsson GB. UR 54. Uppsala Reports 2011; ISSN 1651-9779. ACKNOWLEDGEMENT [21] UMC. The WHO Global ICSR database. 2011 [accessed Declared none. 12/9/2011]; Available from: http://www.who-umc.org/DynPage .aspx?id=101 247&mn1=7347&mn2=7252&mn3=7258. [22] EMA. Guideline on the use of statistical signal detection methods CONFLICT OF INTEREST in the Eudravigilance data analysis system. 2006 [accessed 12/9/2011]; Available from: http://www.ema.europa.eu/docs/en Declared none. _GB/document_library/Regulatory_and_procedural_guideline/200 9/11/WC500011434.pdf. [23] EMA. EudraVigilance 2011 [accessed 12/9/2011]; Available from: REFERENCES http://eudravigilance.ema.europa.eu [24] MHRA. Guidance Notes and Application Form for Access to

[1] Lu Z, Information technology in pharmacovigilance: Benefits, Yellow Card and Adverse Drug Reactions On line Information challenges, and future directions from industry perspectives. Drug Tracking (ADROIT) Data. 2004 [accessed 12/9/2011]; Available Healthc Patient Saf 2009: 1: 35–45. from: http://www.mhra.gov.uk/CON2023455.

[2] Bate A, Lindquist M, Edwards IR, et al. A Bayesian neural [25] MHRA. Yellow Card Scheme. 2011 12/9/2011]; Available from: network method for adverse drug reaction signal generation. Eur J http://yellowcard.mhra.gov.uk. Clin Pharmacol 1998; 54: 315-21. [26] Lareb. Netherlands Pharmacovigilance Centre Lareb. 2011

[3] EP van Puijenbroek AB, Leufkens HGM, Lindquist M, Orre M, [accessed 12/9/2011]; Available from: http://www.lareb.nl. Egberts ACG. A comparison of measures of disproportionality for [27] Phase-Forward. Post-marketing surveillance. 2010 [accessed signal detection in spontaneous reporting systems for adverse drug 12/9/2011]; Available from: http://www.phaseforward.com/produ reactions. Pharmacoepidemiol Drug Saf 2002; 11(1): 3-10. cts/safety/stratpharma

[4] Szarfman A, Machado SG, O'Neill RT. Use of screening [28] World Health Organization. UMC, Viewpoint Part 2, watching for algorithms and computer systems to efficiently signal higher-than- safer medicines. The Uppsala Monitoring Centre 2005. expected combinations of drugs and events in the US FDA’s spontaneous reports database. Drug Saf 2002; 25(6): 381-92.

Received: September 22, 2011 Revised: May 2, 2012 Accepted: May 17, 2012