Data Analytics Subgroup Report

Paolo Alcini - EMA (subgroup lead)
Gianmario Candore - EMA (subgroup lead)
Marek Lehmann - EMA
Luis Pinheiro - EMA
Antti Hyvärinen - Fimea
Hans Ovelgonne - CBG-MED
Mateja Sajovic - JAZMP
Panagiotis Telonis - EMA
Kevin Horan - HPRA (until May 2018)
Massimiliano Falcinelli - EMA (from September 2018)

See websites for contact details
Heads of Medicines Agencies: www.hma.eu
European Medicines Agency: www.ema.europa.eu
An agency of the European Union

Table of contents

Executive summary
    Data Standardisation
    Information technology
    Data manipulation
    Artificial intelligence
    Conclusions
    Acknowledgments

Data Analytics – Standardisation
1. Standardisation
    1.1. Why
    1.2. Objectives
    1.3. Defining the main concepts
    1.4. Overview
    1.5. Opportunities (or use) in regulatory activities
        1.5.1. Clinical trial domain
        1.5.2. Genomics domain
        1.5.3. Bioanalytics Omics domain
        1.5.4. Social media domain
        1.5.5. Observational data/Real World evidence (RWE) domain
        1.5.6. Spontaneous ADR
    1.6. Challenges in regulatory activities
    1.7. Regulatory implications
    1.8. Conclusions
    1.9. Recommendations
        1.9.1. Subgroups recommendations supporting the needs or standardisation
        1.9.2. Useful references

Data Analytics - Information Technology for Big Data
2. Information Technology
    2.1. Why
    2.2. Objectives
    2.3. Main concepts
        2.3.1. Big data
        2.3.2. Big data sources
        2.3.3. Big data formats
        2.3.4. Data analytics models
    2.4. Overview
        2.4.1. Data storage technologies
        2.4.2. Hadoop ecosystem
        2.4.3. Cloud big data storage
        2.4.4. Data integration technologies
        2.4.5. Data warehouses and data lakes
        2.4.6. Architecture
        2.4.7. Related concepts and technologies
    2.5. Opportunities
    2.6. Challenges
    2.7. Recommendations

Data Analytics – Data manipulation
3. Data manipulation
    3.1. Why
    3.2. Objectives
    3.3. Main concepts
    3.4. Glossary
    3.5. Overview
        3.5.1. Data types
        3.5.2. Reshaping Data
        3.5.3. Transforming Data
        3.5.4. Dealing with missing data
        3.5.5. Dealing with incorrect data
        3.5.6. Metadata
    3.6. Opportunities in regulatory activities
    3.7. Challenges in regulatory activities
    3.8. Recommendations
    3.9. Resources

Data Analytics – The impact of artificial intelligence on analytics in the regulatory setting
4. The impact of artificial intelligence on analytics in the regulatory setting
    4.1. Why
    4.2. Objectives
    4.3. Introduction
        4.3.1. Defining the main concepts
        4.3.2. Two approaches to AI
        4.3.3. Machine learning
        4.3.4. Deep learning
        4.3.5. Natural language processing
        4.3.6. Why AI is becoming popular
        4.3.7. Aim of the AI algorithms
        4.3.8. Which AI algorithm to use
        4.3.9. Summary of the main points
    4.4. Opportunities in regulatory activities
        4.4.1. Efficiency and automation