Final Version of Technical White Paper

Big Data Technical Working Groups White Paper
BIG 318062

Project Acronym: BIG
Project Title: Big Data Public Private Forum (BIG)
Project Number: 318062
Instrument: CSA
Thematic Priority: ICT-2011.4.4

D2.2.2 Final Version of Technical White Paper

Work Package: WP2 Strategy & Operations
Due Date: 28/02/2014
Submission Date: 14/05/2014
Start Date of Project: 01/09/2012
Duration of Project: 26 Months
Organisation Responsible of Deliverable: NUIG
Version: 1.0
Status: Final

Author name(s): Edward Curry (NUIG), Panayotis Kikiras (AGT), Andre Freitas (NUIG), John Domingue (STIR), Andreas Thalhammer (UIBK), Nelia Lasierra (UIBK), Anna Fensel (UIBK), Marcus Nitzschke (INFAI), Axel Ngonga (INFAI), Michael Martin (INFAI), Ivan Ermilov (INFAI), Mohamed Morsey (INFAI), Klaus Lyko (INFAI), Philipp Frischmuth (INFAI), Martin Strohbach (AGT), Sarven Capadisli (INFAI), Herman Ravkin (AGT), Sebastian Hellmann (INFAI), Mario Lischka (AGT), Tilman Becker (DFKI), Jörg Daubert (AGT), Tim van Kasteren (AGT), Amrapali Zaveri (INFAI), Umair Ul Hassan (NUIG)

Reviewer(s): Amar Djalil Mezaour (EXALEAD), Helen Lippell (PA), Marcus Nitzschke (INFAI), Axel Ngonga (INFAI), Michael Hausenblas (NUIG), Klaus Lyko (INFAI), Tim van Kasteren (AGT)

Nature:
R – Report
P – Prototype
D – Demonstrator
O – Other

Dissemination level:
PU - Public
CO - Confidential, only for members of the consortium (including the Commission)
RE - Restricted to a group specified by the consortium (including the Commission Services)

Project co-funded by the European Commission within the Seventh Framework Programme (2007-2013)

Revision history

Version | Date | Modified by | Comments
0.1 | 25/04/2013 | Andre Freitas, Aftab Iqbal, Umair Ul Hassan, Nur Aini (NUIG) | Finalized the first version of the whitepaper
0.2 | 27/04/2013 | Edward Curry (NUIG) | Review and content modification
0.3 | 27/04/2013 | Helen Lippell (PA) | Review and corrections
0.4 | 27/04/2013 | Andre Freitas, Aftab Iqbal (NUIG) | Fixed corrections
0.5 | 20/12/2013 | Andre Freitas (NUIG) | Major content improvement
0.6 | 20/02/2014 | Andre Freitas (NUIG) | Major content improvement
0.7 | 15/03/2014 | Umair Ul Hassan | Content contribution (human computation, case studies)
0.8 | 10/03/2014 | Helen Lippell (PA) | Review and corrections
0.91 | 20/03/2014 | Edward Curry (NUIG) | Review and content modification
0.92 | 06/05/2014 | Andre Freitas, Edward Curry (NUIG) | Added Data Usage and minor corrections
0.93 | 11/05/2014 | Axel Ngonga, Klaus Lyko, Marcus Nitzschke (INFAI) | Final review
1.0 | 13/05/2014 | Edward Curry (NUIG) | Corrections from final review

Copyright © 2012, BIG Consortium

The BIG Consortium (http://www.big-project.eu/) grants third parties the right to use and distribute all or parts of this document, provided that the BIG project and the document are properly referenced.

THIS DOCUMENT IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS DOCUMENT, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Table of Contents

1. Executive Summary
  1.1. Understanding Big Data
  1.2. The Big Data Value Chain
  1.3. The BIG Project
  1.4. Key Technical Insights
2. Data Acquisition
  2.1. Executive Summary
  2.2. Big Data Acquisition Key Insights
  2.3. Social and Economic Impact
  2.4. State of the Art
    2.4.1 Protocols
    2.4.2 Software Tools
  2.5. Future Requirements & Emerging Trends for Big Data Acquisition
    2.5.1 Future Requirements/Challenges
    2.5.2 Emerging Paradigms
  2.6. Sector Case Studies for Big Data Acquisition
    2.6.1 Health Sector
    2.6.2 Manufacturing, Retail, Transport
    2.6.3 Government, Public, Non-profit
    2.6.4 Telco, Media, Entertainment
    2.6.5 Finance and Insurance
  2.7. Conclusion
  2.8. References
  2.9. Useful Links
  2.10. Appendix
3. Data Analysis
  3.1. Executive Summary
  3.2. Introduction
  3.3. Big Data Analysis Key Insights
    3.3.1 General
    3.3.2 New Promising Areas for Research
    3.3.3 Features to Increase Take-up
    3.3.4 Communities and Big Data
    3.3.5 New Business Opportunities
  3.4. Social & Economic Impact
  3.5. State of the art
    3.5.1 Large-scale: Reasoning, Benchmarking and Machine Learning
    3.5.2 Stream data processing
    3.5.3 Use of Linked Data and Semantic Approaches to Big Data Analysis
  3.6. Future Requirements & Emerging Trends for Big Data Analysis
    3.6.1 Future Requirements
    3.6.2 Emerging Paradigms
  3.7. Sectors Case Studies for Big Data Analysis
    3.7.1 Public sector
    3.7.2 Traffic
    3.7.3 Emergency response
    3.7.4 Health
    3.7.5 Retail
    3.7.6 Logistics
    3.7.7 Finance
  3.8. Conclusions
  3.9. Acknowledgements
  3.10.

