Ref. Ares(2019)380292 - 23/01/2019

Funded by the Horizon 2020 Framework Programme of the European Union MAGNETO - Grant Agreement 786629

Deliverable D3.1 Title: Multimedia Pre-processing, Indexing and Mining Tools

Dissemination Level: PU
Nature of the Deliverable: R
Date: 19/12/2018
Distribution: WP3
Editors: QMUL
Reviewers: ICCS, THA
Contributors: ICCS, IOSB, UPV, EST, EUROB, VML, SIV

Abstract: This deliverable reports on the development, setup, configuration and preliminary performance evaluation of the MAGNETO pre-processing tools and Big Data software framework for mining, indexing and annotation of heterogeneous multimedia data. The tools and solutions described in this document have been developed as part of WP3 and will be further optimized and tuned based on feedback from end users and on performance evaluation during the testing phase. This document provides a comprehensive overview of tailored solutions for multimedia content indexing, pre-processing of sensor data, text and data mining, automatic language translation, and the big data framework.

* Dissemination Level: PU= Public, RE= Restricted to a group specified by the Consortium, PP= Restricted to other program participants (including the Commission services), CO= Confidential, only for members of the Consortium (including the Commission services) ** Nature of the Deliverable: P= Prototype, R= Report, S= Specification, T= Tool, O= Other


Disclaimer

This document contains material, which is copyright of certain MAGNETO consortium parties and may not be reproduced or copied without permission. The information contained in this document is the proprietary confidential information of certain MAGNETO consortium parties and may not be disclosed except in accordance with the consortium agreement. The commercial use of any information in this document may require a license from the proprietor of that information. Neither the MAGNETO consortium as a whole, nor any certain party of the MAGNETO consortium warrants that the information contained in this document is capable of use, or that use of the information is free from risk, and accepts no liability for loss or damage suffered by any person using the information. The contents of this document are the sole responsibility of the MAGNETO consortium and can in no way be taken to reflect the views of the European Commission.


Revision History

Date Rev. Description Partner

04/10/2018  1.1  Table of Content and Document Structure  QMUL
10/10/2018  1.2  Updated Table of Content and instructions  QMUL
11/10/2018  1.3  Foreground/Background subtraction section  QMUL
31/10/2018  1.4  Contribution towards DROP module  VML
31/10/2018  1.5  First version of contributions for Text Mining, Data Mining and Text Retrieval  IOSB
31/10/2018  1.6  Initial inputs for Low level and local features extraction  QMUL
08/11/2018  1.7  Added performance evaluation data  QMUL
11/11/2018  1.8  Contributions to Automatic Language Translation  EST
16/11/2018  1.9  Contributions towards Speech to text transformation  UPV
21/11/2018  2.0  Updated version of the deliverable with improved formatting and draft introduction  QMUL
24/11/2018  2.1  Initial contributions towards Big Data Framework  SIVECO
06/12/2018  2.2  Full overview of MAGNETO Big Data Framework  EUROB
10/12/2018  2.3  Revised version of final contribution towards Big Data Framework  EUROB
11/12/2018  2.4  First revision of the full document with improved formatting  QMUL
12/12/2018  2.5  Updates towards DROP module and performance evaluation  VML
12/12/2018  2.6  Updates for Text Mining, Data Mining and Text Retrieval  IOSB
14/12/2018  3.0  Integration of changes and updates. Added abstract, introduction, and conclusions.  QMUL
17/12/2018  3.1  Updated references, figures and tables. Release for the internal review  QMUL
20/12/2018  3.2  Internal Review comments  THA, ICCS
21/12/2018  3.3  Corrections, further revisions and final improvements. Final version for submission  QMUL


List of Authors

Partner Author

QMUL  Tomas Piatrik, Samadhi Wickrama Arachchilage
VML  Krishna Chandramouli
IOSB  Dennis Muller, Wilmuth Muler
EUROB  Roberto Gimenez
UPV  Francisco Perez
SIVECO  Denisa Becheru, Iacob Crucianu
EST  Pawel Walentynowicz
ICCS  Ioannis Loumiotis, Konstantinos Demestichas


Table of Contents

Revision History
List of Authors
Table of Contents
Index of figures
Index of tables
Glossary
Executive Summary
1. Introduction
2. Multimedia Content Indexing
2.1 Visual low level and local features
2.1.1 Problem Definition
2.1.2 Application to the MAGNETO use cases
2.1.3 Approach and configuration used for MAGNETO
2.2 Foreground/background subtraction
2.2.1 Problem Definition
2.2.2 Application to the MAGNETO use cases
2.2.3 Approach and configuration used for MAGNETO
2.2.4 Performance Evaluation
2.3 Speech to text transformation
2.3.1 Problem Definition
2.3.2 Application to the MAGNETO use cases
2.3.3 Approach and configuration used for MAGNETO
2.3.4 Performance Evaluation
3. Pre-processing of Sensor Data
3.1 DROP detection and tracking
3.1.1 Problem Definition
3.1.2 Application to the MAGNETO use cases
3.1.3 Approach and configuration used for MAGNETO
3.1.4 Performance Evaluation
4. Text and Data Mining


4.1 Text Mining Module
4.1.1 Problem Definition
4.1.2 Application to the MAGNETO use cases
4.1.3 Approach and configuration used for MAGNETO
4.1.4 Performance Evaluation
4.2 Data Mining Module
4.2.1 Problem Definition
4.2.2 Application to the MAGNETO use cases
4.2.3 Approach and configuration used for MAGNETO
4.2.4 Performance Evaluation
4.3 Text retrieval
4.3.1 Problem Definition
4.3.2 Application to the MAGNETO use cases
4.3.3 Approach and configuration used for MAGNETO
4.3.4 Performance Evaluation
4.4 Automatic Language Translation
4.4.1 Problem Definition
4.4.2 Application to the MAGNETO use cases
4.4.3 Approach and configuration used for MAGNETO
4.4.4 Performance Evaluation
5. Big Data Framework
5.1 MAGNETO Big Data Framework (MBDF)
5.1.1 Problem Definition
5.1.2 Application to the MAGNETO use cases
5.1.3 Approach and configuration used for MAGNETO
5.1.4 Performance Evaluation
6. Summary and conclusions


Index of figures

Figure 1: Complexity of the criminal investigation
Figure 2: Example of SIFT local points extracted from the CCTV image
Figure 3: Example of SURF features extracted for similarity matching within the CCTV footage
Figure 4: Instead of iteratively reducing the image size (left), the use of integral images allows the up-scaling of the filter at constant cost (right)
Figure 5: Results on other video footage samples
Figure 6: PSNR-θ plot of modelled background by RPCA vs. low-rank modelling. With energy value θ = .25 the optimality of the quality of the modelled background is ensured.
Figure 7: PSNR-θ plot of modelled background by RPCA vs. low-rank modelling. With energy value θ = .25 the optimality of the quality of the modelled background is ensured.
Figure 8: Speech to text transformation flow diagram
Figure 9: Identifying offenders by DROP based orchestration. Scene of the incident (image at the top), images of strong DROP correlation (middle), final image with the same drop and clear face of the perpetrator.
Figure 10: Input and output data of the DROP module
Figure 11: DROP definition process
Figure 12: DROP search process
Figure 13: Feature values computed for an example query DROP
Figure 14: a) Query pattern. b), c) Correct detections using the approach
Figure 15: False negative detections searching for blurred or occluded patterns. a) Query pattern. b) Blurred pattern. c) Partially occluded pattern.
Figure 16: False negative detections due to search-space reduction (marked in red) limiting the sliding window from detecting the pattern.
Figure 17: False positive detections
Figure 18: a) Example query DROP. b) Dominant colors and presence percentages computed for the query pattern. c) Input image. d) Reduced search-space cells marked in red.
Figure 19: a) query pattern 1, b) query pattern 2, c) results for query pattern 1, d) results for query pattern 2
Figure 20: (Wikipedia.org)
Figure 21: Pictorial representation of the Nymble model.


Figure 22: Conditional random field with hidden states s_1,…,s_n depending on all states o_1,…,o_n. (Cohn 2007)
Figure 23: Schematic representation of a linear threshold unit (LTU).
Figure 24: Schematic representation of the vanilla neural network.
Figure 25: An example AMR graph.
Figure 26: Anomaly detection in clustered data
Figure 27: architecture
Figure 28: MongoDB vs Cassandra performance comparison
Figure 29: On-premise Hadoop
Figure 30: Design pattern for MBDF - integration of MongoDB/Cassandra
Figure 31: Spark components
Figure 32: Integration of Apache Kafka with
Figure 33: Elasticsearch integration


Index of tables

Table 1: Time performance comparison (CPU sec.)
Table 2: Performance of the Speech to Text module
Table 3: Overview of cloud services charges
Table 4: Convolution matrix
Table 5: Result data for the first query DROP
Table 6: Contingency table of a classification task.
Table 7: Mean SMATCH-score of different AMR Parsers evaluated on the LDC dataset and The Little Prince dataset. Source: Schenkel


Glossary

AMR  Abstract Meaning Representation
API  Application Programming Interface
BoF  Bag of Features
CCTV  Closed Circuit Television
CDR  Call Data Records
CPU  Central Processing Unit
CQL  Cassandra Query Language
DCD  Dominant Color Descriptor
DoG  Difference of Gaussian
DROP  Distinctive Region or Pattern
ETL  Extract, Transform, Load
FDR  Financial Data Records
FDW  Foreign Data Wrapper
FS  File System
GATE  General Architecture for Text Engineering
GMM  Gaussian Mixture Models
GPU  Graphics Processing Unit
HDFS  Hadoop Distributed File System
HMM  Hidden Markov Model
HOG  Histogram Of Gradients
HSL  Hue, Saturation, Lightness
HTML  Hypertext Markup Language
IMAP  Internet Message Access Protocol
JSON  JavaScript Object Notation
JVM  Java Virtual Machine
LEA  Law Enforcement Agency / Law Enforcement Agencies
LGPL  Lesser General Public License
LPG  Language Processing Group
MBDF  MAGNETO Big Data Framework


MFCC  Mel Frequency Cepstral Coefficients
MPEG  Moving Picture Experts Group
MT  Machine Translation
NLP  Natural Language Processing
OS  Operating System
PCP  Principal Component Pursuit
PSNR  Peak Signal to Noise Ratio
QL  Query Language
RAM  Random Access Memory
RDD  Resilient Distributed Dataset
REST  Representational State Transfer
ROC  Receiver Operating Characteristics
RPCA  Robust Principal Component Analysis
SIFT  Scale Invariant Feature Transform
SMILE  Statistical Machine Intelligence and Learning Engine
SQL  Structured Query Language
SURF  Speeded-Up Robust Features
SVD  Singular Value Decomposition
SVM  Support Vector Machines
SW  Software
TR  Text Retrieval
UC  Use Case
VM  Virtual Machine
Weka  Waikato Environment for Knowledge Analysis
WER  Word Error Rate
XMPP  Extensible Messaging and Presence Protocol


Executive Summary

This deliverable provides a comprehensive and detailed overview of the MAGNETO solutions for pre-processing, mining, indexing, storing, and managing large amounts of heterogeneous data. The tools and solutions described in this document have been developed as part of WP3 and will be further optimized and tuned based on feedback received from end users and on the performance evaluation carried out during the testing phase. To be aligned with the organization of work in WP3, the presented solutions have been divided into five categories: multimedia content indexing, pre-processing of sensor data, text and data mining, automatic language translation, and the big data framework.

First, data fused into the system needs to be pre-processed and indexed in order to extract low- and mid-level features for further higher-level processing by modules developed in both WP3 and WP4. As part of the pre-processing phase, visual features are extracted from video and image data. In the case of video, foreground and background are separated in order to increase the accuracy of other video and image processing algorithms. For transforming speech from audio files into text, audio descriptors are extracted and the audio signal is further processed. All these processes are covered by the solutions described in Section 2. As part of the video pre-processing, a specially tailored video processing engine has been developed to enable searching and tracking of distinctive regions or patterns (e.g. a logo on a t-shirt or hat) through vast amounts of CCTV footage; it is described in Section 3.

After pre-processing and indexing of the multimedia data, dedicated machine learning approaches have been developed to enable higher-level data mining for knowledge discovery and text retrieval. This includes specific approaches for deriving information from structured and unstructured textual data, described in Section 4. The final step involves the development of a tailored framework for acquisition, storage, processing and management of vast amounts of heterogeneous data. The so-called MAGNETO Big Data Framework consists of a collection of state-of-the-art technologies that allow effective and secure storage, exchange, search and retrieval of all data fed to the MAGNETO system. The big data solutions carefully selected and tailored for the MAGNETO use cases are described in Section 5.

Each module in this deliverable is presented so as to provide an understanding of the problem faced, its application to the MAGNETO use cases, the technical approach and configuration used, and an initial performance evaluation against some existing solutions on the market. For those modules whose final performance analysis requires further integration or real data currently being collected by end users, the evaluation metrics and strategy are described instead. This deliverable serves as a solid foundation for the upcoming deliverable D3.2 on Modular and Scalable Tools for Evidence Collection.


1. Introduction

In the surveillance and security domain, a large number of audio-visual capturing devices are deployed across fixed locations in most urban areas of the world. This is further accompanied by vast amounts of multimedia data shared and stored either through big data databases or social applications. The terabytes of available multimedia data, originating from heterogeneous capturing devices including CCTV surveillance systems, content professionally captured by broadcasters, desktops, mobile devices, Internet webcams and social networks, need to be accessed, analysed, correlated and managed by security forces either automatically or manually. The need for developing effective solutions for data mining and evidence collection is therefore paramount.

Figure 1: Complexity of the criminal investigation

The ultimate mission of MAGNETO is to develop a tailored platform to assist law enforcement agencies in their investigations. Towards the effective investigation and analysis of criminal activities, investigators and forensic analysts often have to stitch together a strong collection of evidence materials obtained from a diversity of sources. The orchestration methodology for the combined multimedia content includes spatial, temporal and spatiotemporal alignment as well as synchronization. The objective of the resulting evidence collection is to offer forensic analysts and investigators a higher degree of freedom in assembling an event narrative that depicts not only the event itself but also the incidents leading up to and following it. Such orchestration would be helpful in crime prevention and in several types of investigation, including the identification of criminal offenders involved in the use cases that have been defined by the MAGNETO end users and reported in D2.1. These use cases and scenarios reflect real-world applications in which an acute need for multimedia analytics, media orchestration and a big data framework has been identified. With so many capture and display devices and the resulting heterogeneous data, the MAGNETO data mining and evidence collection solutions will provide a critical set of tools to security experts.

To meet the user requirements and accomplish the project's objectives, it is important to consider several processes and tools necessary to build an effective investigation platform for evidence collection. In particular, the modules and solutions developed in WP3 cover the following processes:

• Representation and indexing of multimedia data
• Pre-processing of sensor data
• Text and Data Mining
• Automatic Language Translation
• Big Data Foundation Services

This document is organized accordingly, covering all the mentioned aspects and processes towards an effective evidence collection toolkit.


2. Multimedia Content Indexing

The analysis and representation of visual information present at specific locations in the image is a fundamental task in image and video analysis. Having a suitable representation of visual features enables reasoning with the end goal of supporting object categorization, image region classification, foreground and background subtraction, image segmentation, and object tracking, among others. In processing audio data for speech recognition, systems usually perform two fundamental operations: signal modelling and pattern matching. Signal modelling is the process of converting the speech signal into a set of parameters. Pattern matching is the task of finding the parameter set from memory that most closely matches the parameter set obtained from the input speech signal. No matter which multimedia data is being processed, the extraction of low level and local features plays a fundamental role in multimedia content analysis tasks. This section gives an overview of the multimedia content indexing solutions implemented in MAGNETO for analyzing heterogeneous data in the process of evidence collection.

2.1 Visual low level and local features

2.1.1 Problem Definition

Many video analytics solutions rely on locating interest points, or keypoints, in each image and calculating a feature description from the pixel region surrounding the interest point. This is in contrast to methods such as correlation, where a larger rectangular pattern is stepped over the image at pixel intervals and the correlation is measured at each location. The interest point is the anchor point, and often provides the scale, rotational, and illumination invariance attributes for the descriptor; the descriptor adds more detail and more invariance attributes. Groups of interest points and descriptors together describe the actual objects, and this information can be used to tackle the one-shot learning problem. However, there are many methods and variations in feature description. Some methods use features that are not anchored at interest points, such as polygon shape descriptors, computed over larger segmented polygon-shaped structures or regions in an image. Other methods use interest points only, without using feature descriptors at all, and some methods use feature descriptors only, computed across a regular grid on the image, with no interest points at all. The following sub-sections describe the set of visual features that are being extracted for the video analytics solutions in the MAGNETO system.

2.1.2 Application to the MAGNETO use cases

The extraction of visual low level and local features is fundamental for any higher-level processing of visual data, such as object detection, recognition and tracking, video summarization, or face detection, to name a few. This module will therefore be used as a pre-processing step for all video analytics solutions developed within WP3, in particular for applications assisting Use Case 1 (Crime against Person & property - Prevention & investigation of bombing - attacks & riots) and Use Case 3 (Prevention and investigation of terrorist attacks).

2.1.3 Approach and configuration used for MAGNETO

The video analytics solutions developed in MAGNETO need a set of low level and local features in order to represent colour, edges, border crossings, simple textures, curves, holes, etc. in the image. The set of features used by the MAGNETO video analytics solutions contains standard descriptors such as colour histograms, colour moments, dominant colour, scalable colour, shape contour, shape region, homogeneous texture, texture browsing, and edge histogram. All these features are then used in the processing stage as part of the Bag of Features (BoF). BoF is inspired by the Bag of Words concept used in document classification. A bag of words is a sparse vector of occurrence counts of words; that is, a sparse histogram over the vocabulary. In computer vision, a bag of visual words is a sparse vector of occurrence counts over a vocabulary of local image features.

BoF typically involves two main steps. The first step is building the vocabulary, i.e. the set of bags of features; this is an offline process. In the second step, the features extracted from a given image are assigned to the bags created in the first step and aggregated into a histogram. This histogram is then used to classify the image or video frame.
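As an illustration of this two-step pipeline, the sketch below builds a visual vocabulary with k-means and encodes an image as a normalized word histogram. It assumes OpenCV (4.4 or later, for SIFT) and scikit-learn; the image paths and the vocabulary size k are illustrative placeholders, not the actual MAGNETO configuration.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(training_image_paths, k=200):
    """Offline step: cluster SIFT descriptors from training images into k visual words."""
    sift = cv2.SIFT_create()
    all_descriptors = []
    for path in training_image_paths:
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, des = sift.detectAndCompute(img, None)
        if des is not None:
            all_descriptors.append(des)
    stacked = np.vstack(all_descriptors).astype(np.float32)
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit(stacked)

def bof_histogram(image_path, vocabulary):
    """Online step: assign the image's descriptors to visual words and build a histogram."""
    sift = cv2.SIFT_create()
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, des = sift.detectAndCompute(img, None)
    hist = np.zeros(vocabulary.n_clusters, dtype=np.float32)
    if des is not None:
        for word in vocabulary.predict(des.astype(np.float32)):
            hist[word] += 1
        hist /= max(hist.sum(), 1.0)   # normalize so images of different sizes are comparable
    return hist
```

The resulting histogram can then be fed to any classifier (e.g. an SVM) to label the image or video frame.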

SIFT

The Scale Invariant Feature Transform (SIFT) is the most well-known method for finding interest points and feature descriptors, providing invariance to scale, rotation, illumination, affine distortion, perspective and similarity transforms, and noise1. Lowe demonstrates that by using several SIFT descriptors together to describe an object, there is additional invariance to occlusion and clutter, since if a few descriptors are occluded, others will be found. SIFT is a complete algorithm and processing pipeline, including both an interest point and a feature descriptor method. SIFT includes stages for selecting centre-surround circular weighted Difference of Gaussian (DoG) maxima interest points in scale space to create scale-invariant keypoints (a major innovation). The feature extraction step involves calculating a binned Histogram Of Gradients (HOG) structure from local gradient magnitudes into Cartesian rectangular bins, or into log polar bins using the GLOH variation, at selected locations centred around the maximal response interest points derived over several scales.

The descriptors are fed into a matching pipeline that uses the nearest distance ratio metric between the closest match and the second closest match; this considers a primary match and a secondary match together and rejects both matches if they are too similar, assuming that one or the other may be a false match. The local gradient magnitudes are weighted by a strength value proportional to the pyramid scale level, and then binned into the local histograms. In summary, SIFT is a very well thought out and carefully designed multi-scale localized feature descriptor. The Speeded-up Robust Features (SURF) method is inspired by SIFT, operates in a scale space and uses a fast Hessian detector based on the determinant maxima points of the Hessian matrix.
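A minimal OpenCV sketch of this nearest/second-nearest distance-ratio matching (Lowe's ratio test) is shown below; the 0.75 threshold and the two input images are illustrative assumptions.

```python
import cv2

def ratio_test_matches(img_query, img_scene, ratio=0.75):
    """Match SIFT descriptors and keep a match only when the best candidate is
    clearly better than the second best (Lowe's nearest distance ratio test)."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img_query, None)
    kp2, des2 = sift.detectAndCompute(img_scene, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for best, second in matcher.knnMatch(des1, des2, k=2):
        if best.distance < ratio * second.distance:
            good.append(best)
    return kp1, kp2, good
```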

1 Lowe, David G. “SIFT Distinctive Image Features from Scale-Invariant Keypoints.”International Journal of Computer Vision, Volume 60 Issue 2, November, 2004.


Figure 2: Example of SIFT local points extracted from the CCTV image

SURF

The speeded up robust features (SURF) is a local feature detector and descriptor that focuses on blob‐like structures in the image2. These structures can be found at corners of objects, but also at locations where the reflection of light on specular surfaces is maximal. This approach for interest point detection uses a Hessian matrix approximation. This leads to the use of integral images as made popular by Viola and Jones 3, which reduces the computation time drastically.

SURF describes the distribution of the intensity content within the interest point neighbourhood, similar to the gradient information extracted by SIFT. It builds on the distribution of first-order Haar wavelet responses in the x and y directions rather than on the gradient, exploits integral images for speed, and uses only a 64-dimensional descriptor, reducing the time for feature computation and matching.
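A small sketch of SURF extraction with OpenCV is shown below; note that SURF ships in the opencv-contrib xfeatures2d module and may be unavailable in builds that exclude patented algorithms, and the input file name is only a placeholder.

```python
import cv2

# SURF lives in the opencv-contrib package (cv2.xfeatures2d) and may be
# disabled in builds that exclude patented algorithms.
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400, extended=False)  # 64-D descriptors

img = cv2.imread("cctv_frame.png", cv2.IMREAD_GRAYSCALE)  # placeholder file name
keypoints, descriptors = surf.detectAndCompute(img, None)
print(len(keypoints), "keypoints detected")
```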

2 Bay, Herbert, Andreas Ess, Tinne Tuytelaars, and Luc Van Gool. “Speeded-Up Robust Features (SURF).” Computer Vision and Image Understanding Volume 110, Issue 3, June 2008. 3 P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple features,” in Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on, 2001.


Figure 3: Example of SURF features extracted for similarity matching within the CCTV footage

In order to be invariant to image rotation, the Haar wavelet responses in the x and y directions within a circular neighbourhood around the interest point are calculated using integral images for fast filtering. After computing the wavelet responses, they are represented as points in a space with the horizontal response strength along the abscissa and the vertical response strength along the ordinate.

Figure 4: Instead of iteratively reducing the image size (left), the use of integral images allows the up-scaling of the filter at constant cost (right)


BRIEF

The BRIEF descriptor uses a random distribution pattern of 256 point-pairs in a local 31x31 region for the binary comparison to create the descriptor4. One key idea with BRIEF is to select random pairs of points within the local region for comparison. BRIEF is a local binary descriptor and has achieved very good accuracy and performance in robotics applications5. BRIEF and ORB are closely related; ORB is an oriented version of BRIEF, and the ORB descriptor point-pair pattern is also built differently than BRIEF. BRIEF is known to be not very tolerant of rotation.

ORB stands for Oriented FAST and Rotated BRIEF and, as the name suggests, ORB is based on BRIEF and adds rotational invariance to BRIEF by determining corner orientation6. It should be noted that ORB is a highly optimized and very well engineered descriptor, since the ORB authors were keenly interested in compute speed, memory footprint, and accuracy. Many of the descriptors surveyed in this section are primarily research projects, with less priority given to practical issues, but ORB focuses on optimization and practical issues.
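For illustration, the sketch below extracts ORB keypoints and matches the binary descriptors with the Hamming distance; the image file names and the feature budget are placeholder assumptions.

```python
import cv2

orb = cv2.ORB_create(nfeatures=500)  # oriented, binary descriptors

img1 = cv2.imread("query.png", cv2.IMREAD_GRAYSCALE)   # placeholder file names
img2 = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Binary descriptors are compared with the Hamming distance, which is much
# cheaper than the L2 distance used for SIFT/SURF vectors.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(len(matches), "matches")
```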

Daisy Descriptor

The Daisy Descriptor is inspired by SIFT and GLOH-like descriptors, and is devised for dense-matching applications such as stereo mapping and tracking, reported to be about 40 percent faster than SIFT7. Daisy relies on a set of radially distributed and increasing size Gaussian convolution kernels that overlap and resemble a flower-like shape (Daisy). Daisy does not need local interest points, and instead computes a descriptor densely at each pixel, since the intended application is stereo mapping and tracking. Rather than using gradient magnitude and direction calculations like SIFT and GLOH, Daisy computes a set of convolved orientation maps based on a set of oriented derivatives of Gaussian filters to create eight orientation maps spaced at equal angles.
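A dense DAISY computation is available, for instance, in scikit-image; the sketch below shows the idea of computing descriptors on a regular grid without interest-point detection (the grid step and ring parameters are illustrative, not the MAGNETO settings).

```python
from skimage import io, color
from skimage.feature import daisy

img = color.rgb2gray(io.imread("frame.png"))  # placeholder file name
# Dense descriptors: one DAISY vector per grid location, no interest points needed.
descs = daisy(img, step=8, radius=15, rings=3, histograms=8, orientations=8)
print(descs.shape)  # (grid rows, grid cols, descriptor length)
```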

2.2 Foreground/background subtraction

Foreground/background subtraction can be defined as a segmentation of a video stream into foreground, which appears at unique moments in time, and the background, which is always present. It is an important video processing task that contributes to higher precision and accuracy when integrated with other video analytics solutions for forensic evidence collection.

2.2.1 Problem Definition

A vast range of video analysis tasks begins with background subtraction, which consists of the creation of a background model that allows distinguishing foreground pixels.

4 Calonder, Michael et al. “BRIEF: Computing a Local Binary Descriptor Very Fast.”Pattern Analysis and Machine Intelligence, Vol. 34, 2012. 5 Rublee, Ethan, Vincent Rabaud, Kurt Konolige, and Gary Bradski. “ORB: An Efficient Alternative to SIFT or SURF.” ICCV ’11 Proceedings of the 2011 International Conference on Computer Vision 2011. 6 Rublee, Ethan, Vincent Rabaud, Kurt Konolige, and Gary Bradski. “ORB: An Efficient Alternative to SIFT or SURF.” ICCV ’11 Proceedings of the 2011 International Conference on Computer Vision 2011. 7 Tola, E., V. Lepetit, and P. Fua. “A Fast Local Descriptor for Dense Matching.”Conference on Computer Vision and Pattern Recognition, 2008.


Processing on a per-pixel basis from the background is not only time-consuming but can also dramatically affect foreground region detection if region cohesion and contiguity are not considered in the model. Decomposition of a video scene into background and foreground is a very challenging but crucial process that can significantly improve the performance of many other video analytics solutions. The robust subspace approach based on a low-rank plus sparse matrix decomposition has shown a great ability to separate static parts from moving objects in video sequences. However, those models are still insufficient in realistic environments. The separation of locally moving or deforming image areas from a static or globally moving background is a focused video-processing task with manifold applications. In MAGNETO we implement a new method in which we regard the image sequence to be made up of the sum of a low-rank background matrix and a dynamic tree-structured sparse matrix, and solve the decomposition using our approximated Robust Principal Component Analysis method extended to handle camera motion. Furthermore, to reduce the curse of dimensionality and scale, low-rank background modelling is integrated via Column Subset Selection, which reduces the order of complexity, decreases computation time, and eliminates the huge storage need for large videos.

2.2.2 Application to the MAGNETO use cases

Despite the fact that the module has no direct interaction with the end user, the foreground/background subtraction module plays an important part in pre-processing the video data for other video analytics solutions developed for evidence collection, especially in UC1 and UC3. In particular, the module will provide input to video summarization, face detection and recognition, and soft biometrics fusion. Foreground/background modelling brings a number of benefits to the MAGNETO system: firstly, a robust estimation of the mostly static regions of the image is guaranteed; secondly, this approximation can in part handle variations in illumination in the background, such as a tree swaying backwards and forwards, which can be modelled by a few modes; thirdly, a low-rank approximation of the background can help distinguish between general motion in the scene – which can be due to camera movement – and locally varying motions caused by moving objects, since the background regions obey a single highly correlated motion pattern.

2.2.3 Approach and configuration used for MAGNETO

The approach proposed in MAGNETO addresses this fundamental task by leveraging and building on recent developments in the field of Robust Principal Component Analysis (RPCA)8. Specifically, the solution has been inspired by the critical breakthrough accomplished by Candès et al.9, where the authors provided a practical solution for the long-standing problem of recovering the low-rank and sparse parts of a large matrix made up of the sum of these two components. In the particular context of video processing, the 2-dimensional matrix 퐴 of size 푚 × 푛 stores the pixel information of a video sequence 퐼푗, 푗 = 1, … , 푛, or a set of images, by concatenating each frame 퐼푗 as a column 퐴푗 in 퐴. Then, the background part of the video sequence is modelled by the low-rank matrix, while the locally deforming parts constitute the sparse matrix component.

8 S. Erfanian Ebadi, V. Guerra Ones, and E. Izquierdo, “Efficient background subtraction with low-rank and sparse matrix decomposition,” in Image Processing (ICIP), 2015 IEEE International Conference on, September 2015

9 E. J. Candes, X. Li, Y. Ma, and J. Wright, “Robust principal component ` analysis?” J. ACM, vol. 58, no. 3, pp. 11:1–11:37, Jun. 2011


More specifically, they minimize a surrogate model using the ℓ1 and nuclear norms with the convex optimization problem named Principal Component Pursuit (PCP), whose solution, with high probability, recovers the low-rank (background) and the sparse (foreground) parts of the original matrix:

$$\min_{L,S}\; \|L\|_{*} + \lambda \|S\|_{1} \quad \text{subject to} \quad L + S = A$$

where 퐿 and 푆 are matrices of the same size as 퐴, ‖⋅‖∗ and ‖⋅‖1 are the nuclear norm (which is the ℓ1-norm of the singular values) and the ℓ1-norm respectively, and 휆 is a balancing parameter (assigned as $\lambda = 1/\sqrt{\max(m,n)}$). PCP has led to impressive results in background modelling and foreground detection for the detection and recognition of flying objects. Although this formulation leads to a computationally feasible solution, the complexity is still high, involving the calculation of many Singular Value Decompositions (SVD) for a very large matrix.
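For illustration only, the sketch below solves the PCP problem with the standard augmented Lagrange multiplier scheme (singular value thresholding for L, soft thresholding for S). It is a generic reference solver, not the optimized MAGNETO implementation described in the following subsections, and the step-size heuristic, iteration limit and tolerance are common default assumptions.

```python
import numpy as np

def soft_threshold(X, tau):
    """Proximal operator of the l1-norm (element-wise shrinkage)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def singular_value_threshold(X, tau):
    """Proximal operator of the nuclear norm (shrink the singular values)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(soft_threshold(s, tau)) @ Vt

def rpca_pcp(A, max_iter=500, tol=1e-7):
    """Decompose A into a low-rank L (background) and a sparse S (foreground)."""
    m, n = A.shape
    lam = 1.0 / np.sqrt(max(m, n))              # balancing parameter from the PCP formulation
    mu = 0.25 * m * n / np.abs(A).sum()         # common heuristic for the penalty parameter
    norm_A = np.linalg.norm(A, "fro")
    L = np.zeros_like(A); S = np.zeros_like(A); Y = np.zeros_like(A)
    for _ in range(max_iter):
        L = singular_value_threshold(A - S + Y / mu, 1.0 / mu)
        S = soft_threshold(A - L + Y / mu, lam / mu)
        residual = A - L - S
        Y += mu * residual                      # dual variable update
        if np.linalg.norm(residual, "fro") <= tol * norm_A:
            break
    return L, S
```

In practice, each column of A would be a vectorized video frame, so the columns of L give the estimated background and S contains the moving foreground pixels.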

In the proposed approach, it is assumed that the matrix 퐴 is decomposable; i.e., the matrix 퐴 is close to a matrix that can be written as the sum of a low-rank matrix 퐿 with singular vectors that are not spiky, and a sparse matrix 푆 with a uniform and random pattern of sparsity. This section addresses a number of critical issues and limitations of RPCA, which are: embedding global motion parameters in the model, i.e., estimation of the global motion parameters simultaneously with the foreground/background separation task; considering matrix block-sparsity rather than generic matrix sparsity as a natural feature in video processing applications; and, more critically, providing an extremely efficient algorithm to solve the low-rank/sparse decomposition task. The first model aims at video sequences captured by a moving camera, by estimating the global motion parameters while performing the targeted background/foreground separation task. The second model exploits the fact that in video processing applications the sparse matrix has a very special structure. In other words, the non-zero matrix entries are not randomly distributed, but build small blocks within the sparse matrix. Finally, the last solution targets the fact that RPCA approaches are computationally expensive. The proposed model introduces an extremely efficient “SVD-free” technique that enables real-time foreground separation.

2.2.3.1 휏-Decomposition

The robust subspace learning models via matrix decomposition found in the literature can mostly handle only video sequences captured by static cameras. In this section, an extension of the approximated RPCA model10 is proposed, with the introduction of domain transformations into the optimization task to compensate for the background motion caused by camera movement. Basically, it is assumed that the columns of 퐿 are linearly dependent up to a certain parametric transformation. Given a data matrix 퐴 whose columns are the frames of a video sequence captured by a moving camera, the decomposition of the matrix 퐴 is defined as:

$$A \circ \tau = L + S + G$$

where 퐺 is a matrix that contains the incomplete information and the corruption by outliers in the original video sequence, e.g., Gaussian noise.

10 Y. Peng, A. Ganesh, J. Wright, W. Xu, and Y. Ma, “RASL: robust alignment by sparse and low-rank decomposition for linearly correlated images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 11, pp. 2233–2246, 2012


퐴푗 ∘ 휏푗 denotes the 푗-th frame after transformation, parameterized by the vector $\tau_j \in \mathbb{R}^{\rho}$, where 휌 is the number of parameters fully describing the global motion model. Therefore, 휌 = 4 corresponds to a similarity, 휌 = 6 to an affine, and 휌 = 8 to a projective transformation.

The 푖-th geometric transformation is comprised of a parameter vector 휏푖, 푖 = 1, … , 푛, where different spatial transformations can be considered. 2D parametric transforms are used to model translation, rotation, and planar deformation of the background. Finally, multi-resolution incremental refinement11 is implemented to estimate these motion parameters. The algorithm developed for MAGNETO is computationally cheaper and handles the processing more efficiently than the algorithm based on the approximated RPCA formulation12.

Given the data matrix 퐴 and the Lagrange tuning parameter 휆, the following optimization function recovers a low-rank matrix 퐿, a sparse matrix 푆, and the motion parameter vector 휏 such that 퐴 ∘ 휏 ≈ 퐿 + 푆:

$$\min_{L,S,\tau}\; \|A \circ \tau - L - S\|_{F}^{2} + \lambda \|S\|_{1} \quad \text{subject to} \quad \operatorname{rank}(L) \le k$$

The first summand guarantees the approximation quality of the decomposition (minimizing the residual) and the second favours a sparse matrix solution 푆 with many zero elements (i.e. sparse enough). The parameter 휆 controls the contribution of each summand to the function to be minimized. 휆 needs to be set manually depending on the problem to be solved and increases the model's flexibility and generalizability to different scenarios. The model is tested using variations of this parameter in our experiments with the Receiver Operating Characteristic (ROC) performance evaluation. The developed method follows an alternating strategy, minimizing the function for the three parameters 퐿, 푆, and 휏 one at a time until the solution converges (in an iterative process) to a local optimum:

$$\tau^{t} = \arg\min_{\tau} \|A \circ \tau - L^{t-1} - S^{t-1}\|_{F}^{2}, \qquad L^{t} = \arg\min_{\operatorname{rank}(L) \le k} \|A \circ \tau^{t} - L - S^{t-1}\|_{F}^{2}, \qquad S^{t} = \arg\min_{S} \|A \circ \tau^{t} - L^{t} - S\|_{F}^{2} + \lambda \|S\|_{1}$$

The problem described above can be written as a weighted least squares minimization where the solutions 휏푖 have a closed-form. To calculate the rank-푘 matrix that is the nearest estimate of the matrix 퐴 ∘ 휏푡 − 푆푡−1 , SVD gives a closed-form solution as:

$$L^{t} = \sum_{i=1}^{k} \sigma_i U_i V_i^{T}$$

where the coefficients 휎푖 are the singular values and the vectors 푈푖 and 푉푖 are the left and right singular vectors of the matrix 퐴 ∘ 휏푡 − 푆푡−1, respectively.

11 R. Szeliski, Computer Vision: Algorithms and Applications, 1st ed. New York, NY, USA: Springer-Verlag New York, Inc., 2010 12 Y. Peng, A. Ganesh, J. Wright, W. Xu, and Y. Ma, “RASL: robust alignment by sparse and low-rank decomposition for linearly correlated images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 11, pp. 2233–2246, 2012


Finally, the matrix 푆푡 is updated using the parameter 휆 acting as a tuning (thresholding) parameter on the matrix 퐴 ∘ 휏푡 − 퐿푡; i.e., the elements of the matrix 퐴 ∘ 휏푡 − 퐿푡 whose magnitude is at most 휆 are considered zero.

2.2.3.2 Block-Sparsity

The formulation of the background modelling/foreground detection problem using the optimization function favours solutions where the matrix 푆 is sufficiently sparse. However, this formulation does not take into account the structure of sparsity in 푆, and therefore does not yield good results when the sparse pattern involves, for example, clusters of non-zero entries representing foreground objects. Indeed, in real-world video sequences the foreground pixels do not appear in the sparse matrix at random and scattered; strictly speaking, they appear in regions of pixels corresponding to foreground objects in a scene. It therefore makes sense to impose block-sparsity on the pixels of each video frame rather than on a whole column (whole frame) of the matrix 푆. This mathematical formulation favours solutions where the non-zero elements of the matrix 푆 appear in blocks, where each block can represent the natural shape of a foreground object. 푚푎푡(⋅) is defined as a mapping operator from the 푚-dimensional space into the 푤 × ℎ matrix, i.e. ℝ푚 → ℝ푤×ℎ.

In other words, 푚푎푡(퐴푗) is equal to the video frame 퐼푗. Hence the proposed technique solves the minimization problem for block-sparse matrices 푚푎푡(푆푗). Given a data matrix 퐴 whose columns are the frames of a video sequence captured by a moving camera and a Lagrange parameter 휆, the following optimization problem is minimized, recovering the background and foreground of the sequence in the matrices 퐿 and 푆, respectively:

$$\min_{L,S,\tau}\; \|A \circ \tau - L - S\|_{F}^{2} + \lambda \sum_{j=1}^{n} \|mat(S_j)\|_{2,1} \quad \text{subject to} \quad \operatorname{rank}(L) \le k$$

where the ℓ2,1-norm is defined as $\|A\|_{2,1} = \sum_{j} \|A_j\|_{2}$, i.e. the ℓ1-norm of the vector formed by taking the ℓ2-norms of the columns of the underlying matrix.
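As a small worked illustration of this norm and of why it produces block-structured (column-wise) sparsity, the sketch below computes the ℓ2,1-norm and its proximal operator in NumPy; it is only meant to clarify the definition, not to reproduce the full MAGNETO solver.

```python
import numpy as np

def l21_norm(M):
    """l2,1-norm: the l1-norm of the vector of column-wise l2-norms."""
    return np.linalg.norm(M, axis=0).sum()

def block_shrink(M, tau):
    """Proximal operator of tau * l2,1: shrink whole columns towards zero,
    so zeros (and non-zeros) appear in column blocks rather than scattered entries."""
    col_norms = np.linalg.norm(M, axis=0, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(col_norms, 1e-12), 0.0)
    return M * scale
```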

2.2.3.3 SVD-Free Solution to RPCA

Considering the particular case where the background in a video sequence does not change, it can be described by a rank-1 matrix, and the optimization problem of the approximated RPCA algorithm becomes:

$$\min_{L,S}\; \|A - L - S\|_{F}^{2} \quad \text{subject to} \quad \operatorname{rank}(L) = 1,\; card(S) \le \kappa$$

Note that the Lagrange parameter 휆 is left out, and instead the number of non-zero elements 푐푎푟푑(푆) is fixed. The cardinality 휅 acts as a hard-thresholding parameter that controls the quality of the reconstruction of 퐴 using the matrices 퐿 and 푆. The rank-1 restriction for 퐿 imposed in the optimization problem yields solutions where the columns of the matrix 퐿 can be written as 퐿푗 ← 훼 퐿1, 푗 = 1, … , 푛, where 퐿1 is the first column of 퐿 and 훼 is a scalar. This is based on the fact that the columns of a rank-1 matrix are not necessarily all equal, but rather linearly dependent (by a scalar factor). Based on this fact, the proposed solution assumes a particular rank-1 matrix 퐿 where all the column vectors are equal; i.e. 퐿 = 푙 ퟏ푇, where 푙 is a vector of size 푚 and ퟏ푇 = (1, … , 1). Note that any matrix of the form 푙 ퟏ푇 is a rank-1 matrix, but not all rank-1 matrices can be written by repeating the same vector in all the columns.


The main advantage of this special rank-1 matrix is that the proposed method proves the vector 푙 can be calculated without computing an SVD; therefore, this algorithm converges much faster, since the most expensive computation in the described algorithms is the SVD calculation step. The optimization model is as below:

$$\min_{l,S,\tau}\; \|A \circ \tau - l\,\mathbf{1}^{T} - S\|_{F}^{2} \quad \text{subject to} \quad card(S) \le \kappa$$

Similarly to the previous sections, the optimization process follows an alternating strategy:

$$l^{t} = \arg\min_{l} \|A \circ \tau^{t} - l\,\mathbf{1}^{T} - S^{t-1}\|_{F}^{2}, \qquad S^{t} = \arg\min_{card(S) \le \kappa} \|A \circ \tau^{t} - l^{t}\mathbf{1}^{T} - S\|_{F}^{2}$$

The matrix 푆푡 that solves the optimization problem above is obtained by hard-thresholding the matrix | 퐴 ∘ 휏푡 − 푙푡ퟏ푇 |: the 휅 entries with the largest magnitudes are kept, and all remaining entries are set to zero.
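A minimal NumPy sketch of this SVD-free idea is given below. It assumes a static camera (the motion parameters τ are ignored) and illustrative choices of κ and of the iteration count; the background is modelled as a rank-1 matrix with identical columns, so the update of l reduces to a row-wise mean and S is obtained by hard thresholding, as described above.

```python
import numpy as np

def svd_free_decompose(A, kappa, n_iter=10):
    """A: m x n matrix whose columns are vectorized frames.
    kappa: number of non-zero (foreground) entries kept in S."""
    S = np.zeros_like(A)
    l = np.zeros((A.shape[0], 1))
    for _ in range(n_iter):
        # Background L = l * 1^T with identical columns: the least-squares l
        # is simply the row-wise mean of A - S (no SVD needed).
        l = (A - S).mean(axis=1, keepdims=True)
        R = A - l                                   # residual w.r.t. the background
        # Hard threshold: keep only the kappa largest-magnitude residual entries.
        S = np.zeros_like(A)
        flat = np.argsort(np.abs(R), axis=None)[::-1][:kappa]
        idx = np.unravel_index(flat, R.shape)
        S[idx] = R[idx]
    return l, S
```

The columns of l·1^T give the estimated static background, while the non-zero entries of S mark the foreground pixels in each frame.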

2.2.4 Performance Evaluation

The initial version of the proposed solution was implemented in C++ and tested on a 64-bit PC with an Intel Core i7-4770 CPU @3.40GHz (single core) and 32GB of RAM. For the evaluations, 7 challenging background subtraction datasets are tested with our models, namely CDW13, BMC14, CM15, SAI16, i2R17, and MuHAVi-MAS18.

The CPU time consumption of the developed SVD-free model for performing the decomposition task is shown in Table 1, compared against the original RPCA and the approximated RPCA. All the algorithms were run with 5 iterations, and the tuning parameters were chosen to obtain maximal segmentation accuracy.

The proposed method can handle camera movement, various foreground object sizes, and slow-moving foreground pixels as well as sudden and gradual illumination changes in a scene. The qualitative and quantitative segmentation results outperform current state-of-the-art methods. The proposed SVD-free solution achieves more than double the amount of speed-up in computation time for the same performance target compared to its counterpart.

13 Antoine Vacavant, Laure Tougne, Thierry Chateau, and Lionel Robinault, “Background Models Challenge, Workshop of ACCV 2012,” Springer, Nov. 2012
14 Yi Wang, Pierre-Marc Jodoin, Fatih Porikli, Janusz Konrad, Yannick Benezeth, and Prakash Ishwar, “CDnet 2014: An Expanded Change Detection Benchmark Dataset,” in IEEE CVPR Change Detection workshop, United States, June 2014
15 Yaser Sheikh and Mubarak Shah, “Bayesian modeling of dynamic scenes for object detection,” PAMI, vol. 27, pp. 1778–1792, 2005
16 Sebastian Brutzer, Benjamin Höferlin, and Gunther Heidemann, “Evaluation of background subtraction techniques for video surveillance,” in Computer Vision and Pattern Recognition (CVPR) IEEE, 2011
17 Liyuan Li, Weimin Huang, Irene Yu-Hua Gu, and Qi Tian, “Statistical modeling of complex backgrounds for foreground object detection,” IEEE Transactions on Image Processing, vol. 13, no. 11, pp. 1459–1472, 2004
18 T. Brox and J. Malik, “Object segmentation by long term analysis of point trajectories,” in European Conference on Computer Vision (ECCV), Sept. 2010


Table 1: Time performance comparison (CPU sec.)

The proposed method is currently being tested on a dataset obtained from the London Metropolitan Police, containing large amounts of videos with various motion, parametric transformations, quality, motion blur, and deformation of scene elements, in order to find the optimal parameter settings. The proposed method is being optimized based on some of the most challenging scenes from the MET footage with very dynamic content while containing objects (people) without motion (see Figure 5).

Figure 5: Results on other video footage samples

Figure 6 shows the PSNR values obtained by using 20 values of θ linearly distributed in the range [.05, 1]. According to this, for all our tests, using 25% of the columns of A guarantees a very accurate model of the background, while a larger θ does not always result in a significant increase in PSNR. An important observation here, which demonstrates the advantage of using RPCA, is that as we introduce more frames to the background (i.e., we use a higher θ) we risk contaminating the background model with more foreground information; this is seen as fluctuations in Figure 6-(b), (c) and Figure 7-(e), (g), and (h). That means an optimal θ is rather a smaller one, which selects the most representative frames for the background of a sequence.


Figure 6: PSNR-θ plot of modelled background by RPCA vs. low-rank modelling. With energy value θ = .25 the optimality of the quality of the modelled background is ensured.


Figure 7: PSNR-θ plot of modelled background by RPCA vs. low-rank modelling. With energy value θ = .25 the optimality of the quality of the modelled background is ensured.

2.3 Speech to text transformation

Speech to text functionality can be defined as the transformation of an audio file or audio stream into text. Currently, there are many solutions capable of covering this purpose; some of them are open source, others are proprietary solutions. Most of them focus mainly on cloud computing, but there are also offline products able to process audio and generate the transcription. Due to MAGNETO's requirements, the speech to text module covers both offline and online conversion in order to address all the use cases and possible scenarios.

2.3.1 Problem Definition

LEA investigators deal with enormous and complex amounts of call data and recorded audio files, and it is impossible to listen to all of them to discover hidden or unsuspected connections within big datasets. This leads to expensive investigations, long delays in solving crimes, and difficulties in preventing them.


Multilingual conversations are hard to decipher and, usually, data sources are not contextualized. Since MAGNETO's platform provides new semantic technologies and augmented intelligence tools working on homogeneous indexed text content, the speech to text module brings to MAGNETO a solution that, combined with other modules, improves investigation capabilities; otherwise, the concern of society, victims and relatives rises. In addition, in order to avoid duplication and decrease the amount of data to process, information should be accessible to LEAs from any MAGNETO tool.

2.3.2 Application to the MAGNETO use cases

Although the speech to text module has no direct involvement with the end user, the audio-to-text transformation plays an important part in pre-processing multimedia data and in providing other MAGNETO modules with the appropriate text to cover other main functionalities such as translation, pattern identification, sentiment analysis and others that can be important for the process of evidence collection, especially in UC1 and UC3.

In particular, the speech to text transformation module brings to the MAGNETO platform the transcription of audio files. These transcriptions can be used by different internal modules, such as the automatic language translation module, able to work with up to 7 languages in any circumstances; the Text & Data Mining module, capable of offering multi-criteria searching through the indexed transcriptions; or the Big Data Foundation Services, which allow the inclusion of several algorithms for pattern detection or for sentiment analysis, identifying and extracting subjective information based on statistical relations rather than on linguistic analysis.

2.3.3 Approach and configuration used for MAGNETO

The module proposed in MAGNETO approaches the conversion of voice to text by leveraging different open source specifications and solutions. Basically, the proposed design is based on several cloud or local techniques to recognize isolated words, connected words, continuous speech or spontaneous speech, depending on the type of audio input. These techniques aim to "hear", "understand" and "act upon" spoken information, and typically follow four main stages:

1. Analysis
2. Feature extraction
3. Modelling
4. Testing

The performance of the speech to text module depends on the different techniques employed in the stages of recognition. Each solution investigated during the design phase of the speech to text module implements its own algorithms to perform the conversion: in some cases using segmental analysis, such as Mel Frequency Cepstral Coefficients (MFCCs) and Gaussian Mixture Models (GMM); in other cases using Support Vector Machines (SVMs), which have become popular and powerful in pattern identification; or using neural networks that apply complex models to convert the audio input into text.
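As an illustration of the segmental-analysis idea (MFCC features modelled by per-class GMMs), the sketch below shows the two fundamental operations mentioned in Section 2: signal modelling and pattern matching. It assumes the librosa and scikit-learn packages, a 16 kHz sampling rate and small illustrative model sizes; the file paths and class names are placeholders, not the MAGNETO configuration.

```python
import librosa
import numpy as np
from sklearn.mixture import GaussianMixture

def mfcc_features(wav_path, n_mfcc=13):
    """Signal modelling: turn a waveform into a sequence of MFCC vectors."""
    y, sr = librosa.load(wav_path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T   # shape: (frames, n_mfcc)

def train_class_model(example_wavs, n_components=8):
    """Fit one GMM on all MFCC frames of the examples of a single class."""
    X = np.vstack([mfcc_features(p) for p in example_wavs])
    return GaussianMixture(n_components=n_components, covariance_type="diag").fit(X)

def classify(wav_path, models):
    """Pattern matching: pick the class whose GMM explains the input best."""
    X = mfcc_features(wav_path)
    return max(models, key=lambda name: models[name].score(X))

# models = {"yes": train_class_model([...]), "no": train_class_model([...])}
# print(classify("unknown_utterance.wav", models))
```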

Taking into account that several solutions able to cover MAGNETO's requirements can currently be found, the speech to text module tries to unify these solutions in a single component, where the system can perform the conversion using different cloud solutions


in case MAGNETO's platform can be connected to the internet, or using an offline solution running locally in case of network or privacy restrictions. The following figure shows a diagram of the implemented module.

Figure 8: Speech to text transformation flow diagram

As it is expected that MAGNETO's platform will work in an isolated network, while the different cloud solutions provide APIs to perform the speech to text conversion, the proposed speech to text module has a submodule for configuring online third-party providers and a submodule based on the CMUSphinx19 offline converter with a set of dictionaries. With this structure, the speech to text component covers both approaches and can easily be used in different environments or scenarios by configuring a set of parameters.
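A minimal sketch of such an online/offline switch is shown below. It uses the third-party SpeechRecognition Python package purely for illustration (with Google Cloud as the online engine and CMU Sphinx/PocketSphinx as the offline one); it is not the actual MAGNETO component, and the function name and parameters are assumptions.

```python
import speech_recognition as sr

def transcribe(audio_path, online=True, language="en-US"):
    """Route the request to a cloud engine when network access is allowed,
    otherwise fall back to the local CMU Sphinx (PocketSphinx) engine."""
    recognizer = sr.Recognizer()
    with sr.AudioFile(audio_path) as source:
        audio = recognizer.record(source)
    if online:
        # Cloud engine: requires network access and provider credentials.
        return recognizer.recognize_google_cloud(audio, language=language)
    # Offline engine: requires the pocketsphinx package and local acoustic/language models.
    return recognizer.recognize_sphinx(audio, language=language)
```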

2.3.3.1 Online Processing component

The online processing component stores all the information required to establish a connection to third-party providers, and it is able to consume the different services exposed by cloud providers for speech to text conversion.

The main cloud providers with speech to text capabilities are:

• Google Cloud Speech: The Cloud Speech API provided by Google allows developers to convert audio into text by applying powerful neural network models. This API is able to work with more than 110 languages and variants, which can be interesting for MAGNETO's purposes. It is also capable of transcribing voice from microphones or audio files.

19 https://cmusphinx.github.io/wiki/


Thanks to Google's advanced algorithms based on deep learning and neural networks, the accuracy provided by this service is very high and it is improving continuously. Regarding real-time processing, the API can return the results of the conversion as the audio input arrives; in addition, a noise pre-processing algorithm is not required because the service already provides one.

• Wit.ai: The Wit Speech API provides an easy way to convert voice to text in JSON format. Wit tries to combine various state-of-the-art Natural Language Processing techniques and different speech recognition engines, and provides low latency and high robustness to surrounding noise in speech to text conversion. All this processing work is done in Wit cloud services through their API.

• Microsoft Bing Voice Recognition: Microsoft also provides an API with its own state-of-the-art algorithms to process spoken language. As with the previous providers, Bing also exposes different services allowing conversions, real-time processing, and speech-driven actions.

• IBM Speech to Text: The IBM Speech to Text service provides an API that uses IBM's speech-recognition capabilities to produce transcripts of spoken audio. This service converts speech from different languages and audio formats, returning a JSON with the transcription of the requested input. In this case IBM does not provide a single service endpoint; depending on the request location, they have different servers around the world to achieve the lowest latency.

Taking into account that MAGNETO's platform could use these cloud providers, an important thing to keep in mind is the services' pricing. The following table shows a summary of the reviewed providers' charges.

Provider | Free | Standard
Google Cloud | 60 minutes / month | 0.005€ / 15 seconds
Wit.ai | No limit |
Microsoft Bing | 5000 transactions / month | 3.374€ / 1000 transactions
IBM | 100 minutes / month | 0.013€ / minute

Table 2: Overview of cloud services charges

2.3.3.2 Offline Processing component

The MAGNETO project could manage sensitive and restricted data, and, as we have seen, all the cloud providers require the information to travel through the internet. In some cases this could be a problem, so MAGNETO's speech to text module should also cover offline conversions. To achieve this purpose, the designed module also provides an offline solution based on the CMU Sphinx library. For the conversion, this library takes the following steps:


1. Take a recorded waveform;
2. Split it into utterances at silences;
3. Try to recognize what is being said in each utterance.

All possible combinations of words are tried in order to match them with the audio, and the best matching combination is chosen. To keep the number of parameters manageable, the speech is divided into frames, typically 10 milliseconds long. For each frame, 39 numbers are extracted as a representation of the speech; this is called a feature vector.

Another important component of this library is its model, which describes a mathematical object gathering common attributes of spoken words. CMU Sphinx defines a model that deals with the main issues of any analytic scenario:

• how well the model describes reality,
• whether the model can be improved with respect to its internal modelling problems,
• how adaptive the model is if conditions change.

Accordingly, the converter follows a Hidden Markov Model (HMM) of speech, which has proven to be very practical for speech decoding. The core of the engine was designed to recognize speech using the combination of three entities corresponding to the speech structure: the acoustic model, the phonetic dictionary and the language model. Other concepts used are lattices, N-best lists, word confusion networks, speech databases and text databases. The final result is a transcription library able to convert voice into text with good performance in terms of accuracy and timing. To conclude, even though the library is based on different algorithms and models, it is important to take into consideration that a good set of dictionaries should be used in order to obtain good results, since the CMU Sphinx library does not use deep learning techniques or heavy machine learning algorithms like the cloud solutions do.

2.3.4 Performance Evaluation

The most common measure is the so-called word error rate (WER%). This performance measure is computed by comparing a reference transcription with the transcription output by the speech recognizer. From this comparison, it is possible to compute the number of errors, which typically belong to 3 categories:

• Insertions I (when the output of the ASR contains a word that is not present in the reference)
• Deletions D (a word is missed in the ASR output)
• Substitutions S (a word is confused with another one)

WER= (S+D+I)/N ,

where N is the number of words in the reference transcription.
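The WER can be computed with a standard word-level edit distance. The sketch below is a minimal illustration assuming simple whitespace tokenization and lower-casing, which is a simplification of what a production evaluation script would do.

```python
# Minimal sketch of a WER computation via word-level edit distance.
def word_error_rate(reference: str, hypothesis: str) -> float:
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # d[i][j] = minimum number of substitutions, deletions and insertions
    # needed to turn the first i reference words into the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)


print(word_error_rate("the suspect left the scene", "suspect left a scene"))  # 0.4
```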

To assess the quality of the speech to text module, a set of five short (one or two sentences long) samples of spoken English gathered in different scenarios has been processed. The results of the conversion have been compared with manual transcriptions prepared by native speakers. The samples used for the testing come from the audio edition of the British National Corpus, recorded in the 1990s,


and from YouTube, in order to also compare the results obtained with audio inputs of different quality. According to the WER we can evaluate the distance between the recognized text and the reference text, based on words and normalized by the length of the reference text. Therefore, a WER of 0% means perfect recognition, and a high rate means a poor conversion. The results of the performed tests are included in the following table:

                 TV broadcast   School lesson   Home conversation   YouTube Speech   YouTube Tutorial
Google Cloud     25.3%          39.5%           60%                 4.1%             4.7%
Wit.ai           40.1%          38.7%           63.8%               11.4%            11.9%
Microsoft Bing   38.8%          57.6%           76.7%               8.3%             7.8%
IBM              19.4%          28.6%           53.8%               6.9%             4.8%
CMUSphinx        39.8%          55.3%           68.9%               12.7%            11.9%

Table 3: Performance of the Speech to Text module (WER)

In all the performed tests, the online conversions achieve the lowest error rates; however, taking into account that these solutions run on large cloud architectures with many samples for training and feedback models, the results obtained with CMUSphinx could be used as a first approach for offline conversions in MAGNETO's prototype. Regarding audio quality, we can see that it is very important to provide well recorded audio files in order to obtain a good transcription. In the case of the audio files gathered from the British National Corpus, the samples were recorded on analogue media and subsequently digitized, so the final result is not comparable to present-day recordings.


3. Pre-processing of Sensor Data

3.1 DROP detection and tracking

The Distinctive Region or Pattern (DROP) module aims to search large datasets of CCTV footage for a distinctive visual pattern, which is unknown until runtime, when the user selects an arbitrary bounding box from a video frame through the system user interface. This query region is then used as a query by example for the search of similar patterns across large databases of CCTV footage.

3.1.1 Problem Definition

To illustrate a case of DROP based orchestration, a large corpus of CCTV and mobile data was used to identify an offender in a large city. The four images in Figure 9 below outline a common case of subject identification performed by security forces. In this case CCTV images are analysed and compared "manually". The distinctive "orange zip" (DROP) on the black jacket of the perpetrator (attacking another individual in the first image) can be used for person identification in subsequent images. Therefore, the first image, illustrating the criminal act, is used as the starting point for the investigation. The next two images represent supposed orchestrated images where standardized DROP metadata is used. Finally, the image at the bottom displays a picture containing the same DROP in which the face of the perpetrator can be clearly identified. It is expected that DROP orchestration will contribute to solving similar and more complex cases by performing automated processing of standardized metadata in all images and videos available for case solving.

Figure 9: Identifying offenders by DROP based orchestration. Scene of the incident (top image), images with strong DROP correlation (middle), final image with the same DROP and a clear view of the perpetrator's face (bottom).

3.1.2 Application to the MAGNETO use cases

The DROP module searches for a distinctive object or pattern in the footage to help investigators track and identify suspects as part of UC1. The input for the DROP module is a video reference, a frame number and the bounding box of the query region to search for. The output is an ordered set of tuples (video reference, frame number range, confidence) in descending confidence order. This ranked list of search results contains the most confident search result in the first element, with less confident results further down the list.



Figure 10: Input and output data of the DROP module

3.1.3 Approach and configuration used for MAGNETO

The DROP module combines a unique machine learning approach with a tailored cascade classification technique. The machine learning part is based on a random forest of decision trees, which is uniquely trained with the content of the video to be searched. Traditional methods train a forest with exemplars of the pattern or object class to be detected. By training the forest with the haystack, rather than the needle, it can be reused to detect any pattern that the user may choose.

The cascade classification approach involves two stages: definition and search. Firstly, the target DROP is described using colour and Haar-like features in order to define the conditions to be matched in the search stage. Figure 11 shows an overview of the class (DROP) definition process. A set of Haar-like features and colour measurements is obtained as the result of this stage.

Figure 11: DROP definition process

Subsequently, a sliding-window based search stage is performed in the target images. The search space in the image is initially reduced by defining an attentional grid using a dominant colour matching approach. Then, for each possible sliding-window in the attentional grid a set of conditions is assessed in order to compare the current region against the query pattern. If every condition is satisfied for a specific sliding-window the region covered by the window is marked as a positive detection. Figure 12 shows the steps involved in the search process.

Figure 12: DROP search process


In the proposed approach the following measurements are used to define a particular DROP: Average colour, Haar-like features and dominant colour descriptor (DCD).

Average colour: The average of the hue components of the query DROP is measured in the HSL colour space. This average will be compared with possible matches in the detection stage using a threshold defined by the user.

Haar-like features selection: In classical approaches, a systematic check of the capability to discriminate positive and negative samples from a training dataset is performed for each Haar-like feature. In contrast, the DROP approach developed for the MAGNETO use cases has been designed to search for an identified pattern without training data available. Thus, the Haar-like features are selected using a score function that assigns high values to features located over noticeable changes in grey intensities. The feature extraction process proposed here is based on the Viola-Jones algorithm, which uses Haar-like features, that is, a scalar product between the image and some Haar-like templates. In particular, the proposed training model extends the Haar-like templates to the HSV colour space mentioned in the previous section. More precisely, let I and P denote an image and a pattern, both of the same size N x N. The feature associated with pattern P of image I is defined by

\sum_{1 \le i \le N} \sum_{1 \le j \le N} I(i,j)\, \mathbf{1}_{P(i,j)\ \text{is white}} \;-\; \sum_{1 \le i \le N} \sum_{1 \le j \le N} I(i,j)\, \mathbf{1}_{P(i,j)\ \text{is black}}

To compensate for the effect of different lighting conditions, all images should be mean- and variance-normalized beforehand. Images with a variance lower than one, having little information of interest in the first place, are left out of consideration. The derived features are expected to hold all the information needed to characterize the given visual pattern. The visual patterns identified are rich in colour, and thus the extension of the Haar-like templates to the colour space is used to extract features. There is, however, another crucial element which lets this set of features take precedence: the integral image, which allows calculating them at a very low computational cost. Instead of summing up all the pixels inside a rectangular window each time, this technique mirrors the use of cumulative distribution functions. The pattern is presented in Figure 13.


Figure 13: Feature values computed for an example query DROP
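To make the integral-image idea concrete, the following NumPy sketch (illustrative only, not the MAGNETO implementation) computes a simple two-rectangle Haar-like feature; the template layout, sizes and the random test image are arbitrary assumptions.

```python
# Illustrative sketch: a two-rectangle Haar-like feature computed with an
# integral image, so that any rectangle sum costs four look-ups regardless of size.
import numpy as np


def integral_image(img: np.ndarray) -> np.ndarray:
    # ii[y, x] = sum of img[:y, :x]; padded with a leading row/column of zeros
    return np.pad(img, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)


def rect_sum(ii: np.ndarray, y: int, x: int, h: int, w: int) -> float:
    # Sum of the h x w rectangle with top-left corner (y, x)
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]


def haar_two_rect_vertical(ii: np.ndarray, y: int, x: int, h: int, w: int) -> float:
    # "White" upper half minus "black" lower half of a 2h x w template
    return rect_sum(ii, y, x, h, w) - rect_sum(ii, y + h, x, h, w)


img = np.random.rand(64, 64)          # stand-in for a grey (or hue) channel
ii = integral_image(img)
print(haar_two_rect_vertical(ii, 10, 10, 8, 16))
```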

Dominant colour descriptor (DCD): The DCD provides a compact description of the representative colours in an image or image region. The DCD is composed of the space coordinates that represent each colour, the percentage of presence of each colour in the assessed region, the colour variance, and the colour homogeneity in the image, as shown in the following equation.


F = \{\{c_i, p_i, v_i\}, s\}, \quad i = 1, 2, \dots, N

Here, c_i are the colour coordinates of each dominant colour in the hue-saturation-lightness (HSL) space, p_i is the percentage of presence of each dominant colour in the image, v_i is an optional value that describes the variation of the colour values of the pixels in a cluster around the corresponding representative colour, and s represents the overall spatial homogeneity of the dominant colours in the image. The number of colours used to describe the query DROP depends directly on the DROP itself and is therefore defined via an input parameter.

3.1.4 Performance Evaluation

The proposed approach has been tested for a selected single DROP query pattern. Figure 14 (a) shows the pattern used to extract the DROP characteristics, and Figures 14 (b) and (c) show two examples of positive detections obtained with the approach.


Figure 14: a) Query pattern. b) , c) Correct detections using the approach

The results are presented in terms of precision, recall and accuracy. The true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) are obtained from manual annotations over the dataset. Table 4 presents the confusion matrix obtained for the classification.

                       Predicted True   Predicted False
Actual class True      58               34
Actual class False     19               333

Table 4: Confusion matrix


Table 5 presents the precision, recall and accuracy obtained for the classification. The first column is the number of frames processed. The second and third columns are the number of frames where the pattern is present or absent respectively.

Frames   P    N     Precision   Recall (TPR)   Accuracy
444      92   352   0.7532      0.6304         0.8806

Table 5: Result data for the first query DROP

The main detected causes of false negatives (pattern not detected) are related to blurriness, occlusions and the search-space reduction. Figure 15 shows the query used in the DROP definition, a case of an undetected blurry pattern and an undetected partially occluded pattern. Figure 16 shows a case where the search space is reduced leaving part of the pattern outside, thus preventing the pattern from being detected.


Figure 15: False negative detections searching for blurred or occluded patterns. a) Query pattern. b) Blurred pattern. c) Partially occluded pattern.

Figure 16: False negative detections due to search-space reduction (marked in red) limiting the sliding window from detecting the pattern.

The 19 false positive cases can be reduced to 5 distinct cases spread over different consecutive frames of the video clip. Figure 17 shows incorrectly detected false positive cases. These particular cases could be solved by incorporating temporal information into the approach and by adjusting the current thresholds for colour similarity and Haar-like features.


Figure 17: False positive detections

The module has been optimised for an intensive search process. The search space for the sliding window has been reduced by measuring dominant colours on cells of a defined grid in the input image. A cell is marked for search if any of its dominant colours is similar to the dominant colours of the query pattern. Then, a sliding window is iterated through the search space and three main conditions are checked: colour average, Haar-like features, and DCD.

Attentional grid definition: The colour attentional grid is the first step in the detection stage, which aims to reduce the search space of the proposed sliding window approach. In this step, the input image is divided into an NxM grid. For each cell of the grid a number of dominant colours is computed according to a user input parameter. The dominant colours in the cell are compared to the dominant colours in the query DROP. A cell is marked for search if a considerable region in the cell contains colours similar to those of the query DROP. The similarity between dominant colours is measured using a weighted Euclidean distance in the HSL colour space. Figure 18 shows some results of the search space reduction using this approach.


Figure 18: a) Example query DROP. b) Dominant colors and presence percentages computed for the query pattern. c) input image. d) Reduced search-space cells marked in red.

The colour attentional grid discards a considerable number of regions of the image where there is no need for searching, and therefore reduces the computational time of the sliding window stage. This process can be performed offline and the result stored as media metadata.
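To illustrate the dominant-colour matching that drives the attentional grid, the sketch below is a rough approximation under stated assumptions: dominant colours are obtained with k-means per cell, and similarity uses a weighted Euclidean distance in HSL space. The weights, thresholds and helper names are hypothetical placeholders, not the module's actual parameters.

```python
# Hedged sketch (assumed helpers, not the MAGNETO module): marking attentional-grid
# cells whose dominant colours are close to those of the query DROP.
import numpy as np
from sklearn.cluster import KMeans

W = np.array([2.0, 1.0, 1.0])  # assumed weights for the H, S, L components


def dominant_colours(pixels_hsl: np.ndarray, k: int = 3) -> np.ndarray:
    """pixels_hsl: (n, 3) array of HSL pixels; returns the k cluster centres."""
    return KMeans(n_clusters=k, n_init=5, random_state=0).fit(pixels_hsl).cluster_centers_


def cell_matches(cell_hsl: np.ndarray, query_colours: np.ndarray,
                 k: int = 3, threshold: float = 0.15) -> bool:
    """True if any dominant colour of the cell is close to a query colour."""
    cell_colours = dominant_colours(cell_hsl.reshape(-1, 3), k)
    for c in cell_colours:
        # weighted Euclidean distance in HSL space
        dists = np.sqrt(((query_colours - c) ** 2 * W).sum(axis=1))
        if dists.min() < threshold:
            return True
    return False


# Example with random data standing in for one grid cell and a query pattern
rng = np.random.default_rng(0)
cell = rng.random((32, 32, 3))
query = dominant_colours(rng.random((24, 24, 3)).reshape(-1, 3))
print(cell_matches(cell, query))
```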


Sliding window: Once the attentional grid is defined, a sliding window iterates over the image to compare particular regions against the query pattern. The sliding window iterates over all possible coordinates and sizes fitting inside the attentional grid. The next paragraphs describe the conditions assessed for each iteration of the sliding window.

Colour average: The colour average of the region defined by the sliding window is the first condition assessed, due to the low computational time it requires compared to the other two conditions. The average of each component in the HSL colour space is computed for the region and compared to the average colour and the dominant colours of the query pattern. The thresholds used to consider a pair of colours similar are defined via a user input parameter. If the colour average condition is satisfied, the Haar-like condition is computed.

Haar-like features: The Haar-like features selected in the DROP definition are computed for each window that satisfies the colour average condition. Each Haar-like feature value computed for the current iteration of the sliding window is compared with the value obtained for the query DROP for the same feature. The accumulated difference is thresholded to define the similarity between the current window and the query DROP.

Dominant colour descriptor: Using DCDs, a final colour check against the query DROP is performed. In this case the image region defined by the current window is clustered into dominant colours. A DCD matching distance is computed over the window in order to check the presence and percentages of the dominant colours of the query DROP. The purpose of this condition is to reject the false positives accepted by the two previous conditions. Figure 19 shows the results of the proposed approach for two example patterns in a CCTV system. Each image has a resolution of approximately 700x600 pixels.


Figure 19: a) query pattern 1, b) query pattern 2, c) results for query pattern 1, d) results for query pattern 2


4. Text and Data Mining

Text mining is defined as the application of machine learning methods and algorithms in order to automatically derive information from weakly structured and unstructured textual data. In contrast to text mining, data mining strives for the discovery of patterns in structured data by means of methods from machine learning, statistics and database systems. It is one step in the process known as knowledge discovery in structured data. A concrete implementation of the methodologies of the two fields will form a crucial part of the MAGNETO platform. Relevant techniques, like part-of-speech tagging, shallow parsing, cluster analysis, anomaly detection and association rule mining, will be used to automatically analyse the large quantities of structured and unstructured data to extract previously unknown, interesting patterns such as groups of data records, unusual records, and dependencies between data records.

This chapter describes the module "Text & Data Mining" with its interface "Heterogenous Data Analysis" (I.3), specified in Chapter 4.2 of D2.2 "System Architecture, Data Modelling and Interfaces", and is organized as follows. Section 4.1 describes the text mining module and possible techniques to be used in MAGNETO. Section 4.2 focuses on the data mining module, while Section 4.3 concentrates on text retrieval for textual data. Section 4.4 contains the implementation of the automatic translation module. Finally, we note that the training of the various machine learning models is data driven; hence, it cannot be overstated that training data from the various use cases is indispensable.

4.1 Text Mining Module

An ongoing investigation produces vast amounts of textual data, some of which is composed in natural language with all its vagueness and ambiguity. In general, the data originates from many different sources and is therefore at best weakly structured and most likely unstructured. The purpose of the text mining module is to provide the methods and algorithms for retrieving high-quality information from these textual data, which can then be incorporated into the MAGNETO knowledge database.

4.1.1 Problem Definition

A core problem in deriving information from textual data is the identification of information fragments. Information fragments in this context are basic concepts such as names, locations, organizations and time specifications, but also relations between the former. In order to extract these fragments from the data, a linguistic analysis of the textual data is necessary, which clarifies the meaning of a given word in the context of the textual data.

4.1.2 Application to the MAGNETO use cases

The fundamental task of text mining, i.e. the retrieval of information from textual data, has possible applications in every use case. For instance, the mining of confiscated emails, documents confiscated during house searches, and transcribed telecom/audio data is a possible application of this module in UC1 and UC5.


4.1.3 Approach and configuration used for MAGNETO

The basic linguistic analysis includes, e.g., part-of-speech tagging, semantic role labelling (shallow parsing) and named entity recognition, to name just a few techniques. All of these rely on theoretical models, which are introduced in the remainder of this section. The result of the processing is an annotated document from which the relevant information can be extracted in the form of a conceptual graph of pre-processed forensic information. Finally, the information contained in the conceptual graph needs to be incorporated into MAGNETO's common representational model.

From the late 1980s to the mid-1990s, rule-based natural language processing (NLP) was replaced by the more versatile statistical NLP, which is based on algorithms from machine learning. However, the main idea has remained the same: classify word sequences and words with class tags, see Figure 20. This can be called the main task of classical NLP. The hope is that the granular insight into the role every word plays in a sentence allows the automatic retrieval of the information contained in the sentence. We describe statistical models dealing with this task in detail in Section 4.1.3.1. In the last decade a new approach using deep neural networks has been developed. This method does not concentrate on the role every word plays in a sentence but instead tries to retrieve the meaning of text data by automatically learning features through a data-driven methodology. Section 4.1.3.3 contains an introduction.

Figure 20: (Wikipedia.org)

4.1.3.1 Probabilistic Graphical Models (PGM)

PGMs encode the dependence of a set of random variables by using directed or undirected graphs. A graph G = (V, E), where V = {X_i} denotes the set of vertices and E = {(X_i, X_j) | i ≠ j} the set of edges, represents a PGM in the following way: every vertex represents a random variable and the edges in the graph visualize the dependence of these random variables. For a more rigorous theoretical treatment of the subject, the reader is referred to Cohn20.

The joint distribution of the involved random variables, i.e. of the vector X = (X_i), will be denoted by P(X). Since we are encoding the dependence of the random variables in the graph structure, which will be learned from observations, we denote the dependence on the observations o by writing P(X^G, o), where the superscript G indicates that the correlations are encoded in the graph structure. The estimation of the edges in the graph can be done by the classical maximum-likelihood method. This means that we are searching for an edge arrangement which produces the observed data with the highest probability; in formulas,

E^{*} = \arg\max_{E} P(X^{G}, o),

where 표 denotes the set of observations. In the following we describe some explicit models using PGMs.

20 T. Cohn, “Scaling conditional random fields for natural language processing”, 2007.


A Bayesian network is a directed acyclic graph, where the edges are interpreted as conditional probabilities. The joint distribution can be written as

P(X_1, \dots, X_N) = \prod_{i=1}^{N} P(X_i \mid X_{\pi_i}),

where \pi_i denotes the set of parent nodes of node X_i. In general, determining the joint distribution according to the maximum-likelihood method is computationally intensive since the complexity is of exponential type. In practical implementations the Bayes Ball algorithm reduces the complexity by identifying vertices which are mutually independent.


Figure 21: Pictorial representation of the Nymble model.

The introduction of more assumptions on the model yields more specific, well-studied graphical models. As an example, the Hidden Markov Model (HMM) shall be mentioned here, which has applications in part-of-speech tagging. In the HMM the random variables (X_i) are assumed to form a Markov process and, moreover, not all of the random variables are necessarily observed; those that are not are called hidden states. One specific example of this class is the Nymble model21 depicted in Figure 21, where the classification of a word in the input is based only on the previous word. Like most Markov models, the HMM suffers from the so-called label bias problem, which stems from the fact that a Markov process by its very nature does not depend on future states. The class of PGMs we present in the following avoids this problem.

The Conditional Random Field model is mentioned here as a prominent example of undirected PGMs.

Figure 22: Conditional random field with hidden states s_1,…,s_n depending on all states o_1,…,o_n. (Cohn 2007)

This time the joint distribution is given by

P(X \mid o) = \frac{1}{Z(o)} \prod_{c \in C} \psi_c(X_c \mid o),

where C is the set of all cliques contained in the underlying graph and the positive functions \psi_c(\cdot) are called clique potentials. We recall from graph theory that a clique is a complete subgraph. The constant

21 D.M. Bikel et al., “Nymble: a high-performance learning name-finder”, 1997.


푍(표) is a normalization constant to ensure that the left side of the equation is indeed a distribution, i.e. it sums up to 1. This equation can be rewritten into the form

P(X \mid o) = \frac{1}{Z(o)} \exp\left( \sum_{k} \lambda_k F_k(X, o) \right),

where F_k(X, o) = \sum_{c \in C} f_k(X_c, o, c) is an aggregated feature over all cliques, weighted by the observations and the cliques themselves. In general, the features f_k(\cdot) are real-valued functions. In natural language processing, however, it has turned out to be useful to choose them as indicator functions, i.e. functions that attain the values 0 or 1. The constants \lambda_k are parameters which need to be learned from the data.

4.1.3.2 From words to numbers

Many algorithms from machine learning need real numbers as input. Hence, a naturally arising problem in text mining is the embedding of text into numbers or vectors. The methodology dealing with this kind of problem is called word embedding. It aims at providing a representation of a given text in a low-dimensional vector space, avoiding the high-dimensional vector space representation which associates one dimension with every word.

Models with specially trained neural networks (see Section 4.1.3.3 for an introduction to neural networks) working on the task of word embedding are, for instance, word2vec, developed by a team of researchers led by Tomas Mikolov at Google, and GloVe, an open source project developed at Stanford University.
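As an illustration of how such embeddings could be trained and queried, the hedged sketch below uses the gensim implementation of word2vec (gensim 4.x parameter names assumed); the toy corpus and the parameter values are placeholders, not MAGNETO settings.

```python
# Hedged sketch: training a small word2vec model with gensim (4.x API assumed).
from gensim.models import Word2Vec

# Toy corpus; in MAGNETO the sentences would come from the pre-processed documents.
sentences = [
    ["suspect", "entered", "the", "building"],
    ["the", "suspect", "left", "the", "scene"],
    ["witness", "saw", "the", "vehicle"],
]

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)

vector = model.wv["suspect"]                         # 50-dimensional embedding of a word
similar = model.wv.most_similar("suspect", topn=3)   # nearest words in embedding space
print(vector.shape, similar)
```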

4.1.3.3 Deep Neural Networks

The basic idea of neural networks is to use linear models of the inputs as derived features and then model the target as a nonlinear function, the so-called activation function, of these features. If this procedure is applied repeatedly in several layers, the resulting neural network is called a deep neural network.

We begin by introducing a single hidden layer back-propagation network, also known as the "vanilla" neural network. It is based on the so-called linear threshold unit (LTU), see Figure 23, which first uses a linear model on the input parameters X = (X_i) and then applies an activation function σ(⋅).

Figure 23: Schematic representation of a linear threshold unit (LTU).

In formulas, we have for the output Y

Y = \sigma(\alpha_0 + (\alpha_1, \dots, \alpha_n)^T X),


where the activation function is a real-valued function attaining the values 0 and 1, thereby thresholding the linear model of the input variables. It turns out that in practice an activation function satisfying certain conditions such as differentiability is more tractable, and therefore the sigmoid, i.e. \sigma(x) := \frac{1}{1 + e^{-x}}, is a prominent choice. The vanilla neural network is now given by the use of several LTUs,

a subsequent linear model, and a final transformation of the outputs using a softmax function g(⋅), as depicted in Figure 24.

Figure 24: Schematic representation of the vanilla neural network.

Introducing the notation Z_m for the hidden units, the vanilla neural net is thus described by the formulas

Z_m = \sigma(\alpha_{0m} + \alpha_m^T X), \quad m = 1, \dots, M,

T_k = \beta_{0k} + \beta_k^T Z, \quad k = 1, \dots, K,

Y_k = g_k(T), \quad k = 1, \dots, K,

where the softmax function is given by g_k(T) := e^{T_k} / \sum_l e^{T_l}. A neural network is called a deep neural network if there are several layers of LTUs.
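For illustration, the forward pass of this vanilla network can be sketched in a few lines of NumPy; the dimensions and the random weights below are arbitrary placeholders, not a trained model.

```python
# Illustrative NumPy sketch of the forward pass of the "vanilla" network above.
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(t):
    e = np.exp(t - t.max())          # shift for numerical stability
    return e / e.sum()


def forward(x, alpha0, alpha, beta0, beta):
    z = sigmoid(alpha0 + alpha @ x)  # hidden units Z_m
    t = beta0 + beta @ z             # linear model T_k
    return softmax(t)                # outputs Y_k


p, M, K = 4, 8, 3                    # input, hidden and output dimensions
x = rng.random(p)
y = forward(x,
            alpha0=rng.normal(size=M), alpha=rng.normal(size=(M, p)),
            beta0=rng.normal(size=K), beta=rng.normal(size=(K, M)))
print(y, y.sum())                    # class probabilities summing to 1
```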

The parameters we have to fit to the observed data are \alpha_{0m}, \alpha_m, \beta_{0k} and \beta_k. The first method developed for fitting is the so-called back-propagation algorithm, sometimes also called the delta rule. Since this algorithm can be very slow, better approaches include conjugate gradients and variable metric methods, see Hastie et al.22 and the references therein. Finally, we mention the method of weight decay to deal with the issue of overfitting. By this we understand the addition of the summed squares of the parameters to the error function. This prevents the "explosion" of the parameters and therefore favours small values in the fitting process, which usually leads to improved predictions.

In deep neural networks there is a vast degree of freedom in connecting the LTU-units in the different layers, which naturally yields many different architectures. For instance, widely used models include the convolutional neural networks, long short-term memory (especially relevant for NLP) as a special case of recurrent neural networks, and recursive neural networks, to name the most prominent. A thorough introduction is contained in Patterson et al23.

22 T. Hastie et al, “The elements of statistical learning”, 2017. 23 J. Patterson et al, “Deep Learning: A Practitioner’s Approach”, 2017.


4.1.3.4 Abstract Meaning Representation

Abstract Meaning Representation (AMR)24 is a technique that tries to capture the meaning of a sentence in natural language. The goal is to represent the meaning with a rooted, directed, acyclic graph with labels on edges (relations) and leaves (concepts). Different parsers for AMR graphs are available, which perform traditional tasks such as named entity recognition, semantic role labelling and word sense disambiguation by using the techniques described in the previous sections.

An example of an AMR graph for the sentence “About 14,000 people fled their homes at the weekend after a local tsunami warning was issued.” is given by the following:

Figure 25: An example AMR graph.

4.1.3.5 Frameworks

For the implementation of the techniques described in Section 4.1.3 it is possible to choose from various open source platforms. For instance, the GATE25 (General Architecture for Text Engineering) Java suite is a framework developed at the University of Sheffield since 1995, which can be used for many natural language processing tasks. It is licensed under the LGPL and comes with a GUI called GATE Developer, which provides a graphical tool for the creation, measurement and maintenance of software pipelines for natural language processing. More important for the MAGNETO project is the product called GATE Embedded, which is an object-oriented framework implemented in Java. It is open source and allows for the embedding of the language processing functionality in applications, e.g. the MAGNETO platform.

24 L. Banarescu et al., "Abstract Meaning Representation for Sembanking", 2013. 25 H. Cunningham et al, “GATE: an Architecture for Development of Robust HLT Applications”, 2002.


Another widespread framework is the so-called Stanford CoreNLP developed by the Stanford Natural Language Processing Group, licensed under the GPL. It provides a wide range of the tools and algorithms used in NLP, which are accessible through a well-documented Java API.

Other frameworks which may be useful for the MAGNETO project are the Apache OpenNLP library, which again uses machine learning algorithms for the processing of natural language, and Deeplearning4j, which is a deep learning programming library for Java implementing, for instance, the models word2vec, doc2vec and GloVe. End-to-end solutions that directly produce Abstract Meaning Representation (AMR) graphs, which are a semantic formalism to encode the meaning of natural language, from given text include JAMR26, Neural AMR27, CAMR28 and AMREager29.

As the above-mentioned frameworks all use Java, we finally name the Python-based software library spaCy. It is licensed under the MIT license and offers all basic algorithms for natural language processing.
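As a brief illustration of how such a framework could be embedded, the sketch below uses spaCy for named entity recognition and part-of-speech tagging; it assumes the small English model en_core_web_sm has been downloaded and is not tied to any particular MAGNETO pipeline.

```python
# Hedged sketch: named entity recognition with spaCy
# (requires: pip install spacy && python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")

text = ("About 14,000 people fled their homes at the weekend "
        "after a local tsunami warning was issued.")
doc = nlp(text)

for ent in doc.ents:
    # prints detected entities with their labels, e.g. dates and quantities
    print(ent.text, ent.label_)

# Part-of-speech tags and dependency information are also available per token:
for token in doc[:5]:
    print(token.text, token.pos_, token.dep_)
```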

4.1.4 Performance Evaluation

Performance evaluation in an information retrieval scenario is usually done with a real-valued function p(Y, f(X)), depending on the system output f(X), where X denotes the input data, and the correct output Y. In the following we introduce three widely used measures called precision, recall and F1-score. An overview of the plethora of theoretically available measures is given in Song et al30.

In the context of classification problems, the output in the form of labels can be categorized using type 1 and type 2 errors from statistical hypothesis testing, see Table 6.

Total population        Condition positive   Condition negative
Predicted positive      True positive        False positive
Predicted negative      False negative       True negative

Table 6: Contingency table of a classification task.

The precision, also known as positive predictive value, is then defined by

precision = \frac{\#\text{True positives}}{\#\text{True positives} + \#\text{False positives}},

that is, as the number of true positives divided by the number of all predicted positives. It follows directly from the definition that the precision attains values in the interval [0,1], where the value 1

26 J. Flanigan et al., “A discriminative graph based parser for the abstract meaning representation”, 2014. 27 I. Konstas et al., "Neural amr: Sequence-to-sequence models for parsing and generation", 2017. 28 C. Wang et al., „Boosting transition-based AMR parsing with refined actions and auxiliary analyzers.”, 2015. 29 M. Damonte et al., “An Incremental Parser for Abstract Meaning Representation”, 2017. 30 M. Song et al, “Handbook of Research on Text and Web Mining Technologies”, 2009.


corresponds to the case in which there are no false positives – the optimal scenario. The recall, also known as sensitivity, is defined by

recall = \frac{\#\text{True positives}}{\#\text{True positives} + \#\text{False negatives}},

that is, as the number of true positives over the sum of the number of true positives and the number of false negatives. Like the precision, the recall attains values between 0 and 1, where 1 this time corresponds to the case in which there are no false negatives. Precision is a measure of quality, i.e. a measure of how many correct classifications were performed in contrast to wrong ones, whereas recall is a measure of quantity, meaning that it measures how many objects of the relevant class are classified correctly. The F1-score is the harmonic mean of precision and recall; in formulas,

F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}.

Thus, it attains values between 0 and 1, where the best value is 1, which is attained if and only if precision = recall = 1.
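The three measures can be computed directly from the counts of the contingency table. The short sketch below is an illustration and, as an example, reuses the counts of the DROP evaluation in Table 4 (TP = 58, FP = 19, FN = 34).

```python
# Minimal sketch computing precision, recall and F1-score from raw counts.
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


print(precision_recall_f1(58, 19, 34))  # approximately (0.7532, 0.6304, 0.6864)
```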

For the MAGNETO platform an end-to-end solution, like AMR graphs, seems to be the more suitable choice. Performance in this case is commonly measured with the so-called SMATCH score, which compares the candidate graph with the gold graph by finding the best alignment between the variable names and then computes the precision, recall and F1-score of the concepts and relations. A comparison of different AMR parsers is presented in Table 7. Moreover, the scores presented there illustrate the current state of the art of information extraction in text mining, and we note that this problem has not been solved yet.

Table 7: Mean SMATCH-score of different AMR Parser evaluated on the LDC dataset and The Little Prince dataset. Source: Schenkel31

31 P. Schenkel, “Untersuchung von maschinellen Lernansätzen im Bereich Textverstehen“, Bachelor Thesis KIT, 2018.


4.2 Data Mining Module

Methods from machine learning are applied to analyse large quantities of structured data. The goal is to find interesting patterns, unusual data and dependencies.

4.2.1 Problem Definition

In ongoing investigations, the LEAs are confronted with a huge amount of data. The search for structure and information in these data exceeds the capabilities of a human investigator by far and creates the demand for an automated processing chain.

4.2.2 Application to the MAGNETO use cases

Use Case 2 (Economic organized crime - Corruption detection) as well as Use Case 4 (Parallel illegal economic circuits of organized crime) benefit from mining the information in documents regarding the transfer of property, Call Data Records (CDR) and Financial Data Records (FDR). Moreover, detecting patterns in telecom meta-data might add insight to an investigation in any use case.

4.2.3 Approach and configuration used for MAGNETO

In this section, we outline the different techniques relevant for the implementation of the data mining process in the MAGNETO platform. We give an introduction to cluster analysis and the K-means clustering algorithm, explain the idea of anomaly detection, and discuss the technique named association rule mining, including one possible algorithm. Finally, we briefly mention possible frameworks which can be used for the implementation.

4.2.3.1 Cluster Analysis

Cluster analysis, also known as data segmentation, has manifold applications. The common denominator is always the segmentation of a set of given objects into subsets, or so-called clusters, so that the objects in a cluster are closer to one another than to objects assigned to a different cluster. "Closer" has to be understood in an abstract way and can also mean that objects in one cluster share a specific property that objects in other clusters do not. There is a plethora of clustering algorithms, but none is universally applicable. Therefore, a good understanding of the data at hand is indispensable. We content ourselves with describing the most popular clustering algorithm in this section. In any case, these algorithms heavily rely on the choice of the distance or dissimilarity measure between the objects, which is why we discuss possible measures first.

Most clustering algorithms expect as an input the proximity matrix D = (d_ij), which is given by the distances d_ij between object i and object j. We note that distance does not necessarily mean metric, since sometimes it is not symmetric or the triangle inequality does not hold. However, many algorithms assume that the distance satisfies the axioms of a metric. They assume therefore that the diagonal elements of the proximity matrix are zero, since every object has distance zero from itself, that its entries are nonnegative, that it is symmetric and that the triangle inequality holds. If the symmetry is not satisfied, the matrix D can be replaced by \frac{1}{2}(D + D^T). If the proximity matrix is not given a priori, one way to construct one based on attributes is the following. Say we are given the measurements x_ij for the objects i = 1, ..., N and attributes j = 1, ..., p. We define a dissimilarity d_j(x_ij, x_kj) between the objects for the jth attribute, and then the distance between objects x_i and x_k by


d(x_i, x_k) = \sum_j \lambda_j d_j(x_{ij}, x_{kj}),

where the constants \lambda_j \ge 0 can be used for weighting different attributes differently. If the measurements are real-valued, a common choice for d_j is the squared distance, that is

d_j(x_{ij}, x_{kj}) := (x_{ij} - x_{kj})^2,

whereas for categorical values the distance for every pair needs to be defined explicitly. For this a proximity matrix can be used, and a prominent choice is given by the matrix (1 - \delta_{ij}), which is 0 on the diagonal and 1 otherwise.

The fundamental approaches of cluster algorithms can be divided into probabilistic ones, e.g. mixture modelling and mode seeking, and combinatorial ones. In the following, we discuss combinatorial examples. Suppose we are given a set of objects {x_1, ..., x_N} and a set {1, ..., K} of clusters, K < N. A cluster algorithm then assigns one cluster to each object. This may be encoded in the so-called cluster assignment, also known as the encoder, C of the cluster algorithm, which is a mapping from the set of objects into the set of clusters. Now, one is interested in finding the encoder C* which minimizes the dissimilarities within every cluster according to the dissimilarities d(x_i, x_k) given by the user as described before. Theoretically, to find this encoder, one only needs to solve a minimization problem with a combinatorial optimization algorithm. The natural loss function for the task at hand is given by

W(C) := \sum_{k=1}^{K} \sum_{i, j \in C^{-1}(\{k\})} d(x_i, x_j).

However, it turns out that minimizing this loss function is computationally very expensive and therefore feasible only for small data sets. One way to circumvent this problem is to use so-called descent clustering methods. The idea is to start with one encoder which is gradually optimized in every iteration according to specific rules. One needs to keep in mind, however, that proceeding in this manner may only lead to a local minimum and not necessarily to the global one.

As one of the most popular iterative descent clustering methods, we present in the following the K-means algorithm. It is assumed that all objects are of quantitative type and that the dissimilarities are given by the squared Euclidean norm, i.e. d(x_i, x_j) := |x_i - x_j|^2. Then the loss function can be rewritten as

W(C) = \sum_{k=1}^{K} N_k \sum_{i \in C^{-1}(\{k\})} |x_i - \bar{x}_k|^2,

where \bar{x}_k denotes the mean vector associated with the kth cluster and N_k := \sum_{i=1}^{N} 1_{C(i)=k} is the size of cluster k. It is therefore equivalent to look for the encoder which minimizes, over all possible clusterings, the distances of the points in each cluster to their cluster mean. The next step is to notice that for a set S of observations the mean can be calculated by

\bar{x}_S = \arg\min_m \sum_{i \in S} |x_i - m|^2.


This leads to the equivalent enlarged minimization problem

\min_{C, m_1, \dots, m_K} \sum_{k=1}^{K} \sum_{i \in C^{-1}(\{k\})} |x_i - m_k|^2,

which can be solved locally by the following algorithm:

1. Suppose C is a given cluster assignment. Calculate the current cluster means for C, which can be done by minimizing the last formula with respect to m_1, ..., m_K.
2. Given the set of means from step 1, new clusters are formed by assigning each observation to the closest current cluster mean, i.e. C(i) = \arg\min_k |x_i - m_k|^2.
3. Repeat from step 1 until the clusters do not change.

Descriptions for other algorithms such as Nearest-Neighbours, Gaussian Mixtures as Soft K-means Clustering, and K-medoids can be found for example in Hastie et al32.
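As a hedged illustration, the K-means algorithm described above is available out of the box in scikit-learn, one of the frameworks listed in Section 4.2.3.4; the synthetic two-dimensional data below merely stands in for real structured records.

```python
# Hedged sketch: K-means clustering with scikit-learn on synthetic data.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Three synthetic groups of two-dimensional observations
X = np.vstack([
    rng.normal(loc=(0, 0), scale=0.3, size=(50, 2)),
    rng.normal(loc=(3, 3), scale=0.3, size=(50, 2)),
    rng.normal(loc=(0, 4), scale=0.3, size=(50, 2)),
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

print(kmeans.cluster_centers_)   # the cluster means m_1, ..., m_K
print(kmeans.labels_[:10])       # the encoder C applied to the first objects
print(kmeans.inertia_)           # the within-cluster sum of squares, i.e. the loss W(C)
```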

4.2.3.2 Anomaly Detection

The task of anomaly detection, also known as outlier detection, describes the process of singling out data points that differ greatly from the expected normal behaviour. These special points are called outliers or anomalies. In the process of their detection many ideas from statistics and especially machine learning can be reused, see Figure 26 for the use of cluster analysis. Moreover, techniques from extreme value theory, information theory and spectral theory may be applied in the search for data points that significantly differ from the expectation. Due to the high dependence of the utilized technique on the nature of the examined data, we content ourselves here with mentioning the survey article and the textbook by Aggarwal33, until later in the MAGNETO project when more data is available and the relevant techniques can be explicitly tailored to the specific use cases.

Figure 26: Anomaly detection in clustered data

32 T. Hastie et al, “The elements of statistical learning”, 2017.


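Purely as an indication of what such a component might look like once the data is available, the sketch below applies an Isolation Forest from scikit-learn, which is only one of many possible outlier detectors and not a design decision taken for MAGNETO.

```python
# Hedged sketch: outlier detection with an Isolation Forest on synthetic data.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 2))    # expected behaviour
outliers = rng.uniform(low=6, high=8, size=(5, 2))        # a few anomalous points
X = np.vstack([normal, outliers])

detector = IsolationForest(contamination=0.03, random_state=0).fit(X)
labels = detector.predict(X)          # +1 for inliers, -1 for outliers
print(np.where(labels == -1)[0])      # indices flagged as anomalies
```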

4.2.3.3 Association Rules

Association rule mining is one part of the data mining process. It originated as a popular tool for mining commercial databases. It aims at finding joint values of the variables X = (X_1, ..., X_p) which are observed with high frequency in the database. In the context of commercial databases, the random variables are assumed to be either 0 or 1 and are interpreted as "product i has been bought (= 1) or not (= 0)". In general, the goal is to find values of the random variables such that

P(X = (v_1, \dots, v_p))

becomes relatively large. Seen in this framework, association rule mining can be viewed as another incarnation of so-called "mode finding" or "bump hunting". A straightforward estimator for the above probability is the number of realizations of X equal to (v_i) divided by the number of all realizations. However, if the number of variables is high, all these estimators are small and this approach will not generate any insight.

As a result, the goal is altered. Instead of searching for values (v_i), we are looking for regions (s_i) such that

P(\bigcap_i \{X_i \in s_i\})

gets large. Approaches to solve this problem are given, for instance, in Section 4.2.5 in Hastie et al34. If the number of examined variables is high, this problem is still computationally too expensive, and we assume that the sets s_i consist of either just one element or the whole set. This leads to the standard formulation of the market basket problem: find a set K ⊂ {1, ..., p} such that

P(\bigcap_{k \in K} \{X_k = 1\})

is large. We note that by introducing dummy variables the variables X_k can be assumed to be 0-1 valued. The set K is called an item set and the intersection \bigcap_{k \in K} \{X_k = 1\} is called a conjunction. One way to estimate the probability P(\bigcap_{k \in K} \{X_k = 1\}) is by the fraction of observations in the database for which the conjunction is true, i.e.

\frac{1}{N} \sum_{i=1,\dots,N} \prod_{k \in K} x_{ik} =: T(K),

where x_{ik} is the value of X_k in the ith observation. T(K) is called the support of the item set. The first step of association rule mining is now defined as the search for all item sets K_l whose support is greater than a given threshold t.

33 C. Aggarwal, „Data Mining The Textbook“, (2015). 34 T. Hastie et al, “The elements of statistical learning”, 2017.


One algorithm which yields item sets with high support is the so-called a priori algorithm. It exploits the following two facts:

- the cardinality #{K | T(K) > t} is small,
- for any item set L such that L ⊂ K, we have T(L) ≥ T(K),

by passing over the data in an iterative way. In the first pass it computes the support of all single-item sets and discards all with low support. In the second pass it computes the support of all two-item sets, considering only the leftover items from the first pass. The item sets with low support are again discarded and the same idea is repeated with the three-item sets. If the threshold t is high enough, this algorithm will terminate in reasonable time.

Finally, the high-support item sets K yield the set of association rules. We partition the set K = A ∪ B into the disjoint subsets A and B, and say

퐴 ⇒ 퐵.

The support 푇(퐴 ⇒ 퐵) of the rule 퐴 ⇒ 퐵 is defined by the support 푇(퐾) of the set 퐾 and can be interpreted as an estimate of the probability 푃(퐴 푎푛푑 퐵) of simultaneously realizing both item sets in a randomly selected market basket. The confidence 퐶(퐴 ⇒ 퐵) of the rule is defined as its support over the support 푇(퐴) of 퐴, i.e.

C(A \Rightarrow B) := \frac{T(A \Rightarrow B)}{T(A)},

which can be interpreted as an estimate of P(B|A). The lift L(A ⇒ B) of the rule is the confidence divided by the expected confidence

L(A \Rightarrow B) := \frac{C(A \Rightarrow B)}{T(B)},

and is an estimate of the association measure \frac{P(A \text{ and } B)}{P(A)P(B)}, which is 1 if and only if A and B are stochastically independent. We have a look at the following example:

퐾 = {푑푖푎푝푒푟푠, 푏푒푒푟}, 퐴 = {푑푖푎푝푒푟푠}, 퐵 = {푏푒푒푟}, 푇(퐴 ⇒ 퐵) = 0.02, 퐶(퐴 ⇒ 퐵) = 0.56.

In the example, beer and diapers are bought together in 2% of all datasets, which is indicated by the support. The confidence shows that if diapers are bought, then 56% of the time beer was also purchased.

The rules with high confidence and high support are found by calculating the confidence for the results of the a priori algorithm. We note that in this way we miss the rules with high confidence but low support, which is known as the rare item problem. Many more algorithms are available, and Aggarwal35 contains an in-depth treatment.

35 C. Aggarwal, „Data Mining The Textbook“, (2015).
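For illustration, the support, confidence and lift of a candidate rule can be computed directly from a transaction table. The sketch below uses a toy basket dataset mirroring the diapers/beer example; the numbers are made up for the illustration and unrelated to the figures quoted above.

```python
# Minimal sketch computing support, confidence and lift for a rule A => B
# from a small set of transactions (toy data, not MAGNETO records).
transactions = [
    {"diapers", "beer", "bread"},
    {"diapers", "beer"},
    {"bread", "milk"},
    {"diapers", "milk"},
    {"beer", "milk"},
]


def support(itemset: set, data: list[set]) -> float:
    return sum(itemset <= t for t in data) / len(data)


def confidence(a: set, b: set, data: list[set]) -> float:
    return support(a | b, data) / support(a, data)


def lift(a: set, b: set, data: list[set]) -> float:
    return confidence(a, b, data) / support(b, data)


A, B = {"diapers"}, {"beer"}
print(support(A | B, transactions))    # T(A => B) = 0.4
print(confidence(A, B, transactions))  # C(A => B) = 0.67 (approximately)
print(lift(A, B, transactions))        # 1.11 (approximately)
```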


4.2.3.4 Frameworks

Software packages implementing the techniques described in the previous sections are, for instance:

- Weka36 (Waikato Environment for Knowledge Analysis), developed at the University of Waikato, New Zealand, licensed under the GNU General Public License, written in Java, offers an API.
- SMILE37 (Statistical Machine Intelligence and Learning Engine), written in Java and Scala, offers an API.
- scikit-learn38, an open source Python library for machine learning algorithms.

4.2.4 Performance Evaluation

For clustering algorithms and anomaly detection, widespread evaluation measures are again precision, recall and F1-score, see Section 4.1.4. To evaluate the results of clustering, the interpretation of the different positives and negatives is based on the comparison of the result C of the cluster algorithm with a manual clustering M. True positives are the number of pairs grouped together in both M and C, false positives are the number of pairs grouped together in C but not in M, and false negatives are the number of pairs grouped together in M but not in C.

Besides the concepts of support, confidence and lift used to evaluate an association rule, there are more subjective measures, which cannot be defined in a quantitative manner. For instance, the unexpectedness of a rule describes the deviation from the expected behaviour and identifies rules that are more interesting. Another measure is the actionability, which weights an association rule according to its impact on a profitable consequence in real life.

The methods that will be implemented in MAGNETO highly depend on the datasets provided by the LEAs. An evaluation of the different frameworks and techniques has therefore not been conducted yet.

4.3 Text retrieval

Given a corpus of unordered text documents, the task of text retrieval (TR) can be defined as using a user query (i.e., a description of the user's information need) to identify a subset of documents that can satisfy the user's information requirements39.

4.3.1 Problem Definition

Retrieval of documents in a large library of forensic texts is a basic functionality needed by the services and the users in the LEAs. Text mining services have to be supplied with a relevant corpus of documents from which to extract information, for example a subset of the vast amount of OSINT data found on the internet. Users need to filter the large corpus of documents to find the relevant documents for a specific question. The text documents available are heterogeneous in content and structure. Therefore, the search engine should provide full-text search. The problem of full-text search is the likelihood that it retrieves many documents that are not relevant to the intended search question. In order to further reduce the amount

36 https://www.cs.waikato.ac.nz/ml/weka/ 37 https://haifengl.github.io/smile/ 38 https://scikit-learn.org 39 ChengXiang Zhai, Sean Massung: Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining, Morgan & Claypool, 30.06.2016


of hits, the search must support topical, temporal, metric and geo-based criteria, even if not all documents will possess the relevant information.

4.3.2 Application to the MAGNETO use cases

The module targets all use cases in which text documents are relevant. Text documents may be emails, web pages, Twitter messages, confiscated documents or chat protocols.

4.3.3 Approach and configuration used for MAGNETO

An efficient search on a huge amount of data requires the use of indices to organize the data and speed up the search process. An index is a sorted list of references to data or documents. In a relational database structure, an index refers to a column of a database table. When organizing vast amounts of data, the relational database model shows serious performance issues for search operations that require joins over large database tables. For unstructured data, NoSQL databases have proven to be more suitable and more performant. An index can be based on full text (excluding search-irrelevant words like articles and conjunctions, so-called stop words) or on other metadata properties, such as author, keywords, date and geolocation. Deciding how to index requires knowledge about the queries to be expected.

For creating indices and performing searches on them, there are several open source solutions that have proven to be reliable and fast. The most common open source retrieval software library for full-text search is Apache Lucene40, which is the basis for the two most widespread search platforms: Apache Solr41 and Elasticsearch42. Elasticsearch provides all Lucene features through JSON, Java and REST APIs. It is a document-oriented database solution, so it serves as a NoSQL database storing documents in the JSON format as sets of key-value pairs. The set of keys is not necessarily fixed and may differ from document to document, ensuring the flexibility to store different document types. Nevertheless, it is advisable to agree on a set of keys to be used for a certain document type, otherwise it may be difficult to compose a query on a corpus containing different documents. The benefits of Elasticsearch are the ability to search for terms in large unstructured data stocks and to cope with fuzzy search queries. Further advantages of Elasticsearch are scalability, fast performance on search queries, and support for multi-lingual documents by using mixed-language fields.

40 http://lucene.apache.org/ 41 http://lucene.apache.org/solr/ 42 https://www.elastic.co/

Before performing search queries, the documents must be ingested into Elasticsearch. There are several possibilities, e.g. (a minimal ingestion sketch is given after this list):

• Ingest a single document using the REST interface: documents have to be formatted as JSON and can then be ingested via the PUT interface. • Using Logstash, the central dataflow engine in the Elastic Stack for gathering, enriching and unifying data regardless of format or schema. The name of this engine reflects its initial purpose of ingesting log files generated by the OS or other software.

40 http://lucene.apache.org/ 41 http://lucene.apache.org/solr/ 42 https://www.elastic.co/


In fact, however, Logstash is suitable for any text file that has a fixed format. There is a long list of plugins for different formats43, some of which might be interesting for MAGNETO: o Imap: reads mail from an IMAP server o Irc: reads events from an IRC (chat) server o Twitter: reads events from the Twitter Streaming API o Xmpp: receives events over the XMPP/Jabber protocol (a common chat protocol) • Using a crawler integration that is compatible with Elasticsearch, e.g. GOPA, FSCrawler or the IMAP/POP3/Mail importer: o FSCrawler44: the File System (FS) crawler allows indexing documents (PDF, Open Office, …) inside a local file system and over SSH. It supports all document formats covered by TIKA45 (HTML, Microsoft Office, Open Office, PDF), which is used to convert the document to a plain text stream (stripping the formatting) and extract the metadata. o GOPA crawler: an open source spider project for indexing a website • Using a Hadoop integration and connecting it to a Hadoop database (Elasticsearch for Hadoop) • Using an integration into the Drupal content management system
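As an illustration of the first option above, the following minimal sketch (not the MAGNETO implementation) ingests a single JSON document into Elasticsearch via its REST PUT interface using Python and the requests library; the host, the index name "crime-documents" and the field names are assumptions chosen to mirror the index fields discussed later in this section.

# Minimal sketch: single-document ingestion via the Elasticsearch REST PUT interface.
# Host, index name and field names are assumptions for illustration only.
import requests

doc = {
    "case": "case-042",
    "author": "investigator-17",
    "crime_category": "fraud",
    "date": "2018-11-05",
    "location": {"lat": 41.3851, "lon": 2.1734},
    "text": "Unformatted full text of the ingested document ...",
}

# PUT /<index>/_doc/<id> creates or updates the document with the given id
resp = requests.put(
    "http://localhost:9200/crime-documents/_doc/42",
    json=doc,
    headers={"Content-Type": "application/json"},
)
resp.raise_for_status()
print(resp.json()["result"])  # "created" on first ingestion, "updated" afterwards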

Ingesting a document into Elasticsearch is equivalent to adding it to one or more indices. The definition of the index used should be based on the functional concepts derived from the use cases and modelled in the MAGNETO Common Representational Model. The index for documents describing crimes should contain fields for:

• case in which the document was obtained/created • author of the document • creator of the document (LEA department) • subject (pdf metadata field) • keywords • crime category • location of an event described • date of the document • date of the event that the document refers to • persons involved in the event • weapons involved • persons involved as suspects • persons involved as victims • terms used (full text index) • unformatted text for full text search

43 See https://www.elastic.co/guide/en/logstash/current/input-plugins.html 44 https://media.readthedocs.org/pdf/fscrawler/latest/fscrawler.pdf

45 The Apache Tika™ toolkit detects and extracts metadata and text from over a thousand different file types. All of these file types can be parsed through a single interface, making Tika useful for search engine indexing, content analysis, translation, and much more. For more information refer to https://tika.apache.org/.



Query results are returned in descending order of relevance. The relevance score of each document is represented by a positive floating-point number46. The higher the score is, the more relevant the document is. How that score is calculated depends on the type of query clause. Different query clauses are used for different purposes: a fuzzy query defines the score by calculating how similar the spelling of the found word is to the original search term; a terms query calculates the percentage of terms that were found.
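To make the above concrete, the following sketch (same assumptions as the ingestion example: local host, hypothetical "crime-documents" index and field names) issues a fuzzy query and a terms query against the REST search endpoint and prints the relevance score of each hit.

# Minimal sketch: fuzzy and terms queries; hits come back sorted by _score.
# Host, index and field names are assumptions for illustration only.
import requests

fuzzy_query = {
    "query": {"fuzzy": {"text": {"value": "explosivs", "fuzziness": "AUTO"}}}
}
terms_query = {
    "query": {"terms": {"crime_category": ["fraud", "corruption"]}}
}

for body in (fuzzy_query, terms_query):
    resp = requests.post(
        "http://localhost:9200/crime-documents/_search",
        json=body,
        headers={"Content-Type": "application/json"},
    )
    resp.raise_for_status()
    for hit in resp.json()["hits"]["hits"]:
        print(hit["_score"], hit["_source"].get("case"))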

4.3.4 Performance Evaluation

For assessing the performance of a search engine, typical measures are query response time, user effort, and retrieval effectiveness. The effectiveness of retrieval is measured in terms of precision and recall47. Precision is the ratio of retrieved relevant documents to the total number of retrieved documents. Recall is the ratio of retrieved relevant documents to the total number of relevant documents in the database. Text retrieval is an empirically defined problem in the sense that there is no objective definition of the relevance of a document with respect to a specific query. As a result, measuring effectiveness is costly due to the large amount of human labour involved in judging relevance, and it requires experts with competency in the topic of the query.
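Expressed as formulas (standard definitions, added here for clarity):

\mathrm{Precision} = \frac{|\,\text{relevant} \cap \text{retrieved}\,|}{|\,\text{retrieved}\,|},
\qquad
\mathrm{Recall} = \frac{|\,\text{relevant} \cap \text{retrieved}\,|}{|\,\text{relevant}\,|}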

4.4 Automatic Language Translation

The automatic translation module allows translating texts from Italian, Polish, Romanian, Spanish or Catalan (the exact set of languages is subject to discussion within the consortium and depends on the minimum amount of data provided per language by the users) into English automatically, without human intervention. The module operates on-premises. Access is provided via an HTTP interface.

4.4.1 Problem Definition

To improve the efficiency of police operations, specialised software analyses information obtained from various sources. The results of the analysis optimise the process of searching for evidence, thereby reducing the labour costs of the employees. The system's algorithms are configured for English, while the information comes in different languages. The lack of an English version of the texts significantly slows down the investigation process.

Human translation is typically an expensive and slow process. Nowadays, the volume of information increases several times a year, and the use of human translation can create a bottleneck in the effectiveness of organisations. This mostly affects organisations that process large amounts of information. Precious minutes are spent translating even small amounts of text; if the text volume grows to hundreds or thousands of words, it becomes nearly impossible to rely on human translation.

46 https://www.elastic.co/guide/en/elasticsearch/guide/current/relevance-intro.html

47 Hull, D. (1993). Using statistical testing in the evaluation of retrieval experiments. In Proceedings of the ACM SIGIR conference, vol. 16 (pp. 329–338)


The majority of machine translation solutions available on the market are cloud-based or carry licensing limitations. In both cases they cannot be used in MAGNETO, which requires an on-premises installation with no external connection to the internet and without severe licensing limitations. Therefore, the preferred approach is to train a proprietary engine based on open-source technology. The main task of the automatic translation module is to ensure instantaneous translation of texts of any volume, thus providing the analytical system with the information necessary for analysis.

Further developments of the automatic translation module (after the MAGNETO implementation) might include:

• Training additional machine translation engines based on newly acquired text corpora and terminology databases.

• Improved interaction between the automatic translation module and the search modules within MAGNETO.

• Adding the capability of professional human translation for the relevant parts of the texts. Texts considered relevant evidence in an investigation are a good example of cases where it is better to rely on human translation.

4.4.2 Application to the MAGNETO use cases

Under MAGNETO, the automatic translation module generates a translation, which is then used in two ways:

1. The translated text is analysed by the system for all MAGNETO use cases. 2. In case clues are found that allow the investigation to proceed, the translated text is then reviewed manually, side by side with the original.

4.4.3 Approach and configuration used for MAGNETO

The idea behind statistical machine translation comes from information theory. A document is translated according to the probability distribution that a string e in the target language (for example, English) is the translation of a string f in the source language (for example, French). The translation model gives the probability that the source string is the translation of the target string, and the language model gives the probability of seeing that target language string. This splits the problem into two smaller ones. Finding the best translation is done by picking the candidate that gives the highest probability.
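In the standard noisy-channel formulation (added here for clarity), the best translation e* of a source string f is obtained by combining the translation model P(f|e) with the language model P(e):

e^{*} \;=\; \underset{e}{\arg\max}\; P(e \mid f) \;=\; \underset{e}{\arg\max}\; P(f \mid e)\, P(e)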

The probability weightings provide the framework for a table that associates a real number between zero and one with every possible pairing of a source and target language string. The number of possible permutations is extremely large, so the table will be enormous and will require a powerful computer to produce translations quickly and accurately. The translation engine addresses the following three computational challenges:

• Language model probability: conditioned on which one, two, three or more words tend to follow or precede a certain word. The larger the corpus, the more accurate the estimates will be.


• Translation model probability: based on fertility (the number of words generated from a source word), distortion (which predicts the target word's position), and translation probabilities (the number of phrases or sentences likely to be produced). • A search method that maximises the quality of the translated product: a decoder for phrase-based, hierarchical and syntax-based translation.

The machine translation module will be built based on the MOSES/OpenNMT48 engine or any other system that fits the project requirements.

• Supported languages: Italian, Polish, Romanian, Spanish, German or Catalan, Portuguese • Engine deployment: locally on client servers • Server configuration: 4+ cores, 128+ GB of RAM. • License: unlimited use • Target translation language: English • Main subjects: general, crime investigation

The number of parallel texts loaded into the engine starts at 1,000,000 bilingual segments per language pair.

4.4.4 Performance Evaluation

Machine translation (MT) supports multi-threaded operation, enabling faster decoding on multi-core machines. The MT applications are memory-intensive and require significant amounts of RAM.

To train a translation system we use parallel data (text translated into two different languages) which is aligned at the sentence level. The training process is performed using the tools and utilities of the system of choice. To prepare the data for training the translation system, we have to perform the following steps (a small illustrative sketch follows the list):

• tokenisation: spaces have to be inserted between, e.g., words and punctuation. • truecasing: the initial words in each sentence are converted to their most probable casing. This helps reduce data sparsity. • cleaning: long sentences and empty sentences are removed as they can cause problems with the training pipeline, and obviously misaligned sentence pairs are removed.
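The following sketch illustrates these three pre-processing steps in Python; it is not the MAGNETO pipeline (which will rely on the tools of the chosen MT system), and the tokenisation rule, length limit and length-ratio threshold are assumptions for illustration.

# Minimal sketch of SMT training-data pre-processing: tokenisation, truecasing
# and cleaning. Thresholds and the tokenisation rule are illustrative assumptions.
import re
from collections import Counter

def tokenise(line):
    # Insert spaces between words and punctuation by splitting the line into
    # word tokens and individual punctuation marks.
    return re.findall(r"\w+|[^\w\s]", line, flags=re.UNICODE)

def build_truecase_model(sentences):
    # Count surface forms away from sentence-initial position, where casing is
    # dictated by the sentence start rather than by the word itself.
    counts = Counter(tok for sent in sentences for tok in sent[1:])
    best = {}
    for tok, n in counts.items():
        key = tok.lower()
        if key not in best or n > counts[best[key]]:
            best[key] = tok  # most probable casing of this word
    return best

def truecase(sent, model):
    # Convert the sentence-initial token to its most probable casing.
    return ([model.get(sent[0].lower(), sent[0])] + sent[1:]) if sent else sent

def keep_pair(src_tokens, tgt_tokens, max_len=80, max_ratio=9.0):
    # Cleaning: drop empty, over-long and grossly length-mismatched pairs,
    # which are likely mis-aligned.
    if not src_tokens or not tgt_tokens:
        return False
    if len(src_tokens) > max_len or len(tgt_tokens) > max_len:
        return False
    longer, shorter = max(len(src_tokens), len(tgt_tokens)), min(len(src_tokens), len(tgt_tokens))
    return longer / shorter <= max_ratio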

The language model used to ensure fluent output is built with the target language, so we will be using English as the output language. We then perform the training of the translation system and its subsequent tuning. Tuning requires a small amount of parallel data, separate from the training data. Quality evaluations of the trained engine are performed using synthetic metrics such as the Bilingual Evaluation Understudy (BLEU) score and are used only for judging which of the dataset configurations provides better engine output. The performance of the automatic language translation module as a component of MAGNETO is assessed via the overall MAGNETO performance indicators (such as evidence search and precision).

48 http://opennmt.net/


5. Big Data Framework

The Big Data Framework is the keystone of all data processing, storage and serving functionalities of MAGNETO. It is a module specifically designed and engineered for MAGNETO, based on state-of-the-art open source software, and it combines different modules in order to cover the full range of MAGNETO needs.

When analysing the MAGNETO needs and requirements for serving a large diversity of situations and of modules interacting with each other, we found that there is currently no single product able to fulfil the project's expectations. For this reason, we analysed standalone databases with lightweight interfaces (such as Cassandra, MongoDB or CouchDB) and full big data frameworks (such as Hortonworks or Cloudera). In the end, a customised approach was followed, after studying and comparing the performance and capabilities of the best data frameworks on the market as well as the different implementations made by data-intensive internet companies such as Facebook, Netflix or LinkedIn, which regularly publish advances and new architectures in this area and from which most of the open source software available today derives.

5.1 MAGNETO Big Data Framework (MBDF)

The MBDF module will be composed of four different layers, three of them devoted to the different needs of data acquisition, storage and processing, and a fourth one that will provide management, security and operational continuity:

• Data Acquisition: this module will be implemented based mainly on the Apache Kafka stream processing platform. It will also include specific tools and configurations for interoperability and for semi-automated migration and extraction from legacy systems (Apache Sqoop, etc.). Depending on further analysis at the implementation phase, a HAProxy/NGINX/Tornado load balancer may be added at the top of the data acquisition layer. • Data Storage: this module will provide the MAGNETO storage capabilities for all the different needs of the project, not only for data storage but also for configuration and other related knowledge databases. At present, three different database systems are foreseen: Cassandra49 as the main database for real-time transactional operations, MongoDB50 for specific use cases (geolocated data, prototyping, etc.) and HDFS for data lake storage and batch processing. In order to provide federated access, the Metacat service will be used. In addition, and for compatibility/direct integration of legacy software, a PostgreSQL51 RDBMS will also be included. • Data Processing: this layer, mainly based on Apache Spark52, will provide functionalities such as real-time stream processing (using Spark Streaming) and the Elasticsearch53 search engine.

49 http://cassandra.apache.org/ 50 https://www.mongodb.com/ 51 https://www.postgresql.org/ 52 https://spark.apache.org/ 53 https://www.elastic.co/


Depending on the finally implemented architecture, other smaller modules will also be included (Apache Flink54, Storm55 or Kylin56). • Data Management: this layer will be in charge of ensuring the correct operation of the whole MBDF and will be based mainly on Apache Mesos57, Apache ZooKeeper58, and other security tools such as Apache Metron59 or Sentry60.

As aforesaid, MBDF will be based on three interoperable database systems for big data storage, each of them covering a different use case and offering compatible functionalities:

• Cassandra: currently the state of the art in big data framework databases, a column-oriented NoSQL database without a single point of failure (unlike MongoDB and HDFS) that also scales extremely well. However, it does not handle well cases where the type of data to be processed is not defined in advance, and processing of geolocated data is not provided natively. It should nevertheless be the database system used for write-intensive applications. • MongoDB: similarly to Cassandra, a very mature and functional NoSQL database. It is the choice for smaller workloads (where it performs slightly better than Cassandra) and when the type and features of the data have not been defined in advance (as Cassandra requires). MongoDB is a JSON document-based database system that, like Cassandra, integrates seamlessly with Hadoop. • HDFS: the Hadoop file system will cover the need for a "data lake" in MBDF, adequate for storing petabytes of information and for batch processing. However, it does not provide the lightning speed of the two databases above. It is appropriate for long-term storage of information (cases going cold) and for batch analysis of data.

Hadoop is the obvious choice for storage of all types of data, especially video and images, for which neither Cassandra nor MongoDB is best suited. However, its latency is high, hence the need to provide additional storage options such as Cassandra and MongoDB. This approach should work well with any final design of the MAGNETO architecture, whether based on enterprise buses or on microservices, since MBDF will be offered as a modular box with a set of defined APIs/interfaces for interacting with the different components.

The objective of the above design of MBDF is to be able to cater for all MAGNETO project needs, no matter whether they are high-volume real-time situational awareness analyses or batch-oriented analytical solutions for the analysis of cases over a span of time. Finally, and as introduced above, MAGNETO will also include a PostgreSQL database:

54 https://flink.apache.org/ 55 http://storm.apache.org/ 56 http://kylin.apache.org/ 57 http://mesos.apache.org/ 58 https://zookeeper.apache.org/ 59 http://metron.apache.org/ 60 https://sentry.apache.org/


• PostgreSQL is a relational database with strong transaction guarantees and, although it can also scale beyond a single node (Postgres-XL), for storing petabytes of data as required by MAGNETO, NoSQL databases are a much better choice. However, PostgreSQL will play a supporting role for other processes and uses of MAGNETO, and it will be accessed in a federated way like the databases above.

5.1.1 Problem Definition

The Big Data Framework module in MAGNETO addresses the following project needs:

• Providing storage for large amounts of data (potentially over petabytes); • Storing these data in a format, and with a level of accessibility, that supports the rest of the MAGNETO tools as well as third-party external tools; • Providing the system backbone for data storage, not only for the cases but also for the configuration of the other modules, and monitoring capabilities/storage for the overall MAGNETO framework; • Ensuring the consistency, availability and partition tolerance of stored data in the best possible way for each different need (the CAP problem, which is the reason why different databases are used); • Potentially providing computing capabilities to the rest of the modules for big data analysis and queries.

Moreover, the design should address the end users' expressed needs in terms of high availability of results, consistency, real-time concurrent access and reasonable response times.

5.1.2 Application to the MAGNETO use cases

Three main applications of the big data framework are foreseen in MAGNETO:

• Stream processing: for real-time storage/processing of data (mainly related to events), providing real-time situational awareness and alerts. • Batch processing: for processing new and historical data in order to identify relevant correlations among events/datasets, applying the MapReduce (or a similar) processing paradigm. • System configuration storage, log analytics and overall performance analysis of the MAGNETO framework.

For example, for the five foreseen MAGNETO use cases, the following needs will be addressed as a subset of the above general use cases:

• Use Case 1: Crime against persons & property – Prevention & investigation of bombing attacks & riots The module will provide support/cater for: o Collection of data; o Analysis and correlation of the information; o Processing and verification of hypotheses; o Advanced searching and meaningful, multimodal presentation of the results.

• Use Case 2: Economic organized crime – Corruption detection


The module will address: o Reduced processing time through seamless heterogeneous data integration. o A common information ingest mode associated with visualization and summarization methods. o The extraction of relevant information and generation of case-related knowledge, as well as of hidden relationships and correlations between the different pieces of information. o The generation of cross-case knowledge networks and their visualization for easy navigation. o A common information space/knowledge base and easy secure access to it via intuitive HMIs and a portal. o Improved innovative processes and procedures, allowing more efficient handling of cases.

• Use Case 3: Prevention and investigation of terrorist attacks The module will address: o full source-merging capability allowing cross searches, despite possible obstacles such as data volumes, heterogeneous formats, multilingualism, and alphabets (Latin, Cyrillic, Arabic, Korean, Chinese ideograms, etc.)

• Use Case 4: Parallel illegal economic circuits of organized crime The module will address: o Intelligent management, exploitation and correlation of the data that are accessible by the LEA. The main functions that will support the Police in their fight against illegal trade will include correlation of available data and extraction (from these data) of patterns confirming suspicions and indicating that a crime has indeed been committed (or is in progress).

• Use Case 5: Identity crime The module will address: o the increase of data sharing in order to exploit data matching, machine learning, device recognition, data analytics, and audio and behavioral analytics.

5.1.3 Approach and configuration used for MAGNETO

The Big Data Framework will be delivered as a single-deployment module. To avoid customisation needs, it will most probably be delivered as a bundle of Virtual Machines (VMs) so that it can scale over a cluster of nodes. Details may be found in deliverable D2.261 (System Architecture, Data Modelling and Interfaces) as well as in deliverable D2.3 (Refined System Architecture and Representation Model), which is expected to be delivered in month 14.

61 MAGNETO, "Deliverable D2.2: System Architecture, Data Modelling and Interfaces," 2018


5.1.3.1 MBDF Data Acquisition Layer

Apache Kafka

The core component of this layer is Apache Kafka62, an open source message-broker middleware which relies on Apache ZooKeeper; both are maintained by the Apache Software Foundation63, and Kafka originated at LinkedIn64 in 2011. Apache Kafka is designed for high-volume publish-subscribe messages and streams, meant to be durable, fast and scalable. In essence, Kafka provides a durable message store, similar to a log, run in a server cluster, that stores streams of records in categories called topics. Apache claims that "a single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients". The figure below shows the overall Apache Kafka architecture:

Figure 27: Apache Kafka architecture

Every received message consists of a key, a value and a timestamp. Unlike RabbitMQ, Kafka uses a "dumb broker" and requires a certain amount of intelligence from consumers in order to consume the messages addressed to them: Kafka retains all messages for a set amount of time, and consumers are responsible for tracking their own position in each log. Apache Kafka requires additional services in order to run (Apache ZooKeeper, which is not trivial to set up and configure). What is relevant is that Apache Kafka can support a large number of consumers and retain large amounts of data with little overhead.
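The sketch below (an illustration under stated assumptions, not the MBDF code) shows this basic producer/consumer pattern with the kafka-python client: a producer publishes JSON events to a topic, and a consumer in a consumer group reads them back while tracking its own offsets. Broker address, topic and group names are assumptions.

# Minimal Kafka producer/consumer sketch using the kafka-python client.
# Broker address, topic and group names are assumptions for illustration.
import json
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("magneto.events", {"sensor": "cam-7", "event": "motion_detected"})
producer.flush()  # make sure the record reaches the broker

consumer = KafkaConsumer(
    "magneto.events",
    bootstrap_servers="localhost:9092",
    group_id="ingest-demo",
    auto_offset_reset="earliest",          # replay the retained log from the start
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    consumer_timeout_ms=5000,              # stop iterating when no new messages arrive
)
for record in consumer:
    print(record.topic, record.offset, record.value)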

One of the special features of Apache Kafka is that it writes all messages to disk as soon as it receives them: provided that the primary broker received the message, it should not be possible for it to get lost, even if there is a severe failure at the broker.

62 https://kafka.apache.org/ 63 https://www.apache.org/ 64 https://www.linkedin.com/


Moreover, brokers can have replicas, and a message can optionally be considered as enqueued only once it has also been replicated to all the replicas in the group. This persistence-by-design architecture provides the maximal security level one could expect against software crashes and is of special value in MAGNETO. In addition, Apache Kafka has recently added Kafka Streams, which positions itself as an alternative to streaming platforms such as Apache Spark, Apache Beam65/Google Cloud Dataflow and Spring Cloud Data Flow. During the MBDF implementation it will be studied whether Kafka Streams or Spark Streaming is more adequate for real-time processing of data.

Some of the messaging scenarios that are best for Kafka are for example:

• Stream from A to B without complex routing, with maximal throughput (100k/sec+), delivered in partitioned order at least once. • When an application needs access to stream history, delivered in partitioned order at least once. Kafka is a durable message store and clients can get a “replay” of the event stream on demand, as opposed to more traditional message brokers where once a message has been delivered, it is removed from the queue. • Stream Processing • Event Sourcing

Apache Kafka has also taken significant steps towards the microservices architecture paradigm: while it only ships a Java client, there is a growing catalogue of community open source clients and ecosystem projects, as well as an adapter SDK allowing customised system integrations to be built. Much of the configuration is done via .properties files or programmatically. Although synchronous request-response calls are required when the requester expects an immediate response, integration patterns based on eventing and asynchronous messaging provide maximum scalability and resiliency. Some of the world's most scalable architectures, such as those of LinkedIn and Netflix, are based on event-driven, asynchronous messaging. On the downside, Apache Kafka requires a non-negligible amount of work to set up and run. Apache Kafka is programmed in Scala and thus requires a Java Virtual Machine (JVM) to run.

Other tools that will be available at the data acquisition layer, to be used as convenient for the different applications, are:

Apache Flume https://flume.apache.org/

Apache Flume66 is a distributed, reliable, and available system for efficiently collecting, aggregating and moving large amounts of log data from many different sources to a centralized data store. The use of Apache Flume is not restricted to log data aggregation: since data sources are customizable, Flume can be used to transport massive quantities of event data including, but not limited to, network traffic data, social-media-generated data, email messages and pretty much any possible data source.

65 https://beam.apache.org/ 66 https://flume.apache.org/


Apache Sqoop

Similarly to Apache Flume, Sqoop67 is used to transfer data, but in this case between Apache Hadoop and structured datastores (such as Cassandra and MongoDB, also used in MBDF) and relational databases. The major difference between Sqoop and Flume is that Sqoop is used for loading data from relational databases into HDFS, while Flume is used to capture a stream of moving data.

Finally, a proxy/load balancer/persistent connection server will be added at the top of the Data Acquisition Layer, based on HAProxy, NGINX and Tornado servers.

5.1.3.2 MBDF Data Storage Layer

The key modules of the MBDF Data Storage Layer are:

Metacat

Metacat68 is a federated service providing a unified REST/Thrift interface to access metadata of various data stores (in our case Cassandra, MongoDB and HDFS). It is important to note that Metacat does not materialise the storage itself; the respective metadata stores are still the source of truth for the schema metadata. Metacat also publishes all the information about the datasets to Elasticsearch for full-text search and discovery. The primary purpose of Metacat is thus to give a place to describe the data so that we can fully use it in MAGNETO. In a nutshell, Metacat provides a solution to three problems:

• Federated view of all metadata systems (Cassandra, Mongo, HDFS); • Metadata discovery • Arbitrary metadata storage about datasets.

Apache Cassandra

Apache Cassandra69 is currently the leading NoSQL distributed database management system, driving many internet business applications and offering a good balance of continuous availability, high scalability and performance (linear scalability), strong security and operational simplicity. It is designed to manage large amounts of structured data over a matrix of commodity servers, similarly to Hadoop. Its key aspects are:

- No single point of failure, ensuring 100% availability. The absence of a master-slave structure, unlike traditional architectures (e.g. MongoDB, HDFS), means there is zero impact on the system if a particular node goes down. - Operational simplicity for the lowest total cost of ownership. - Best-in-class scalability and performance among NoSQL platforms.

67 http://sqoop.apache.org/ 68 https://github.com/Netflix/metacat 69 http://cassandra.apache.org/


In terms of performance, Cassandra is unbeatable in write speed, while providing read performance similar to MongoDB. Moreover, with low workloads MongoDB performs slightly better, while with higher workloads, and as the cluster size grows, Cassandra largely outperforms MongoDB.

While Cassandra works very well as a highly fault-tolerant backend for online systems, it is not as analytics-friendly as Hadoop. Deploying Hadoop on top of Cassandra creates the ability to analyse data in Cassandra without having to first move that data into Hadoop, since moving data out of Cassandra into Hadoop and HDFS is a complicated and time-consuming process. Thus, Hadoop on Cassandra gives organisations a convenient way to obtain specific operational analytics and reporting from relatively large amounts of data residing in Cassandra, in near real-time fashion.

Cassandra (similarly to MongoDB) integrates well with HDFS, and with that integration it is possible to have the best of both worlds: time-based and real-time applications running on Cassandra, and batch-based analytics and time-unbounded queries over Hadoop. As aforesaid, Cassandra removes the single points of failure associated with HDFS (and MongoDB), namely the NameNode and the Job Tracker.

With regard to the CAP theorem, Cassandra favours availability and accepts eventual consistency, although it allows this trade-off to be adjusted via tuneable consistency (see the sketch below). Cassandra is best suited for processing online transactional workloads with a large number of interactions and concurrent traffic while minimising the resources needed.
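The sketch below (illustrative only; keyspace, table, data and node addresses are assumptions) shows how such tuneable consistency is expressed with the Python cassandra-driver: a per-statement consistency level such as QUORUM trades some availability for stronger consistency on that write.

# Minimal Cassandra sketch with the Python cassandra-driver: keyspace, table
# and a write executed at QUORUM consistency. Names and addresses are assumed.
from datetime import datetime
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
session = cluster.connect()
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS magneto
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
""")
session.set_keyspace("magneto")
session.execute("""
    CREATE TABLE IF NOT EXISTS events (
        case_id text, event_time timestamp, sensor_id text, payload text,
        PRIMARY KEY ((case_id), event_time)
    ) WITH CLUSTERING ORDER BY (event_time DESC)
""")

insert = SimpleStatement(
    "INSERT INTO events (case_id, event_time, sensor_id, payload) "
    "VALUES (%s, %s, %s, %s)",
    consistency_level=ConsistencyLevel.QUORUM,  # tuneable consistency per statement
)
session.execute(insert, ("case-042", datetime.utcnow(), "cam-7", "motion detected"))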

MongoDB

MongoDB70 is the other NoSQL database to be integrated in MBDF in order to cover some areas where Cassandra does not perform well. It can play a complementary role, or the same role as Cassandra, in combination with HDFS. In fact, if we analyse the prevalence of both databases over time using Google Trends, we can see that although Cassandra is superior to MongoDB in many aspects, it has not managed to gain pre-eminence over MongoDB over the years, as can be seen below; lately its popularity is even slightly declining. That is one of the reasons why we consider that including MongoDB as a complementary cornerstone of MBDF is a safe operational measure for long-term support and operations.

The second reason for including MongoDB alongside Cassandra is that it is the better choice for certain use cases, especially the following ones, where it outperforms Cassandra:

• When you are not certain what format the data will take or what the final database structure will be (it grows along the way) • For prototyping with unknown data designs • Easier administration of large datasets with potential for rapid growth • Handling geolocated data is a native functionality in MongoDB, whereas Cassandra requires external plugins and is not as straightforward or efficient as MongoDB in this respect (a minimal geospatial sketch is given after this list).
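As an illustration of the last point, the sketch below (assumed connection string, database and collection names, invented sample data) uses pymongo to create a 2dsphere index and run a proximity query, which MongoDB supports natively.

# Minimal MongoDB geospatial sketch with pymongo: 2dsphere index and $near query.
# Connection string, database, collection and sample data are assumptions.
from pymongo import MongoClient, GEOSPHERE

client = MongoClient("mongodb://localhost:27017")
events = client["magneto"]["geo_events"]
events.create_index([("location", GEOSPHERE)])

events.insert_one({
    "case_id": "case-042",
    "description": "vehicle sighting",
    "location": {"type": "Point", "coordinates": [2.1734, 41.3851]},  # lon, lat
})

nearby = events.find({
    "location": {
        "$near": {
            "$geometry": {"type": "Point", "coordinates": [2.17, 41.38]},
            "$maxDistance": 2000,  # metres
        }
    }
})
for doc in nearby:
    print(doc["case_id"], doc["description"])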

70 https://www.mongodb.com/


Figure 28: MongoDB vs Cassandra performance comparison

On the downside, and due to its architecture, MongoDB is not suited to a large number of concurrent writes, where Cassandra clearly outperforms it thanks to not having a single master/point of failure. In read performance both perform roughly equally; MongoDB even outperforms Cassandra with low workloads, while the opposite happens with higher ones. MongoDB is oriented towards consistency and performance, while Cassandra is oriented towards performance and availability, as aforesaid.

Other fundamental differences between MongoDB and Cassandra are as follows:

• MongoDB is easy to start and work with for throughputs of up to 100k ops/s. Beyond that, the slope of performance gain vs. infrastructure is around 20°, while in the case of Cassandra this slope is around 35-50°. • The cost of sharding for MongoDB is extremely high, with a typical shard cluster consisting of 9-11 servers at minimum; to achieve a similar benefit from Cassandra only 3 servers are needed. Because of this, MongoDB will be used for smaller datasets/tasks, and a multiple-sharding approach will not be offered for it in MAGNETO, this being reserved for Cassandra.

Other important differences between Cassandra and MongoDB that will determine which database is to be used for each specific use case are:

• Data Model: MongoDB has a rich and expressive data model, which is 'object-oriented' or 'data-oriented' and can easily represent any data structure in the domain of the user. Data can have properties and can be nested within each other over multiple levels. On the other side, Cassandra follows a more traditional model with a table structure of rows and columns; data are more structured and each column has a specific type, which is declared when the table is created. Comparing the two, MongoDB tends to provide the richer data model.


• Master Node and Single Point of Failure: in MongoDB there is only one master node in a cluster, which controls a number of slave nodes. When the master node goes down, one of the slave nodes is elected as master; this process takes about 10 to 30 seconds, during which the cluster is down and cannot accept any input. On the contrary, Cassandra has multiple master nodes in a cluster; if one goes down, another takes its place, so there is no effect on the cluster and it is always available. Thus, Cassandra has higher availability than MongoDB, which is important for high workloads and large numbers of concurrent users. • Secondary Indexes: MongoDB has secondary indexes as a first-class construct, which makes it very easy to index any property of the data stored in the database and, in turn, easy to query. In comparison, Cassandra has only cursory support for secondary indexes: they are limited to single columns and equality comparisons. If the foreseen use of MBDF needs secondary indexes and flexibility in the query model, then MongoDB is better than Cassandra. • Scalability: in MongoDB there is only one master node, and only this master node accepts input; all the other nodes are used for output, so data written to the slave nodes has to pass through the master node. Differently, Cassandra has multiple master nodes, which are used to input data into the other nodes; therefore, the more master nodes a cluster has, the better it will scale. Thus, Cassandra has higher scalability than MongoDB and will always be selected in MBDF when we foresee scalability needs for a use case. • Query Language: MongoDB, as of now, has no dedicated query language; queries are structured as JSON fragments. On the contrary, Cassandra has its own query language, CQL, which is very similar to SQL, making it easier to use for querying. • Aggregation: MongoDB has a built-in aggregation framework, used to run ETL pipelines to transform the data stored in the database. This framework supports small and medium data traffic; as complexity increases, however, it becomes difficult to debug. Cassandra does not have a built-in aggregation framework and depends on external tools such as Apache Spark or Hadoop. Given the current structure that we propose for MBDF, we cannot say at the present moment that one is preferred over the other. • Data Schema: in MongoDB, a user has the ability to alter the enforcement of any schema on the database; each database can have a different structure, and it is up to the program or application to interpret the data. This can be extremely useful in certain cases. Cassandra, by contrast, provides static typing: the user needs to define the type of each column at the beginning.

Another important reason for considering MongoDB is that semi-structured data formats like JSON are gaining traction, and MongoDB is natively compatible with this format.


Finally, like Cassandra, MongoDB also integrates very well with HDFS and can work with it in the same way in general scenarios, such as the two below:

1) Batch aggregation: in this scenario data is pulled from MongoDB (or Cassandra) and processed within Hadoop via one or more MapReduce jobs. Data may also be brought in from additional sources within these MapReduce jobs to develop a multi-datasource solution. The output from these MapReduce jobs can then be written back to MongoDB (Cassandra) for later querying and ad-hoc analysis. Applications built on top of MongoDB can then use the information from the batch analytics to present it to the end user or to drive other downstream features. 2) Data Warehouse: in a typical production scenario, an application's data may live in multiple datastores, each with its own query language and functionality. To reduce complexity in these scenarios, Hadoop can be used as a data warehouse and act as a centralized repository for data from the various sources. In this situation, periodic MapReduce jobs load data from MongoDB (Cassandra) into Hadoop; this could be in the form of "daily" or "weekly" data loads pulled from MongoDB via MapReduce. Once the data from MongoDB is available within Hadoop, along with data from other sources, the larger dataset can be queried. Data analysts then have the option of using either MapReduce or Pig to create jobs that query the larger datasets incorporating data from MongoDB.

Apache Hadoop (HDFS)

The purpose of the Hadoop Distributed File System71 (HDFS) is to provide the capability to store large amounts of data and to process them for analytical purposes rather than in real time (the domain of Cassandra and MongoDB). Hadoop comes in handy to address the petabyte-scale needs of MAGNETO. The following image shows a classical on-premise deployment of Hadoop:

71 https://hadoop.apache.org/


Figure 29: On-premise Hadoop

Apache Hadoop is a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Originally designed for computer clusters built from commodity hardware (like Cassandra, still the common use), it has also found use on clusters of higher-end hardware. All the modules in Hadoop are designed with the fundamental assumption that hardware failures are common occurrences and should be automatically handled by the framework. The current stable version of Hadoop is 3.1.1, released on the 19th of November 2018. We will use the v3 branch in MAGNETO over the v2 branch due to significant improvements, especially related to Spark processing.

The core of Apache Hadoop consists of a storage part, known as the Hadoop Distributed File System (HDFS), and a processing part, which is the MapReduce programming model. HDFS is a distributed file system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster. Hadoop splits files into large blocks and distributes them across nodes in a cluster. It then transfers packaged code into the nodes to process the data in parallel. This approach takes advantage of data locality, where nodes manipulate the data they have access to. This allows the dataset to be processed faster and more efficiently than in a more conventional supercomputer architecture that relies on a parallel file system where computation and data are distributed via high-speed networking.

The term Hadoop has come to refer not just to the aforementioned base modules and sub-modules, but also to the ecosystem, i.e. the collection of additional software packages that can be installed on top of or alongside Hadoop.


Such packages include Apache Pig72, Apache Hive73, Apache HBase74, Apache Phoenix75, Apache Spark, Apache ZooKeeper, Cloudera Impala76, Apache Flume, Apache Sqoop, Apache Oozie77 and others. Several of them will be used in the MBDF processing layer. While Cassandra and MongoDB are NoSQL databases ideal for high-speed, online transactional data, Hadoop is a big data analytics system that focuses on data warehousing and data lake use cases.

The following figure presents a simplified design pattern for MBDF, integrating MongoDB/Cassandra with a data lake based on HDFS. In deliverable D2.2 and in the expected D2.3 it will be further detailed and refined, integrating the rest of the components (load balancing at the top of the Data Acquisition Layer, federated database stores via Metacat, etc.).

Figure 30: Design pattern for MBDF - integration of MongoDB/Cassandra

The key insights of this configuration are summarized below:

• Data streams are ingested to a pub/sub message queue (MBDF Data Acquisition Layer), which routes all raw data into HDFS. Processed events that drive real-time actions, such as situational awareness analysis, are routed to MongoDB/Cassandra for immediate consumption by operational applications.

72 https://pig.apache.org/ 73 https://hive.apache.org/ 74 https://hbase.apache.org/ 75 https://phoenix.apache.org/ 76 https://www.cloudera.com 77 http://oozie.apache.org/


• Distributed processing frameworks such as Spark jobs materialize batch views from the raw data stored in the Hadoop data lake. • MongoDB/Cassandra exposes these models to the operational processes, serving queries and updates against them with real-time responsiveness. • The distributed processing frameworks can re-compute analytics models, against data stored in either HDFS or MongoDB/Cassandra, continuously flowing updates from the operational database to analytics views.

Finally, as an additional supporting database, PostgreSQL will also be federated into the MBDF Data Storage Layer.

PostgreSQL

The reasons for including PostgreSQL78 among the databases supported by MAGNETO are mainly its ease of operation, the direct integration of some legacy modules that do not operate with NoSQL databases, and additional support for some MAGNETO use cases (such as internal management of information). PostgreSQL can also be used for tackling subsets of big data problems, especially those that do not have a massive number of data updates, require joins, can substantially fit on one machine, or can conveniently be sharded by some key while still maintaining the business value.

As aforesaid, although some implementations exist for clustering PostgreSQL to manage large amounts of data (Postgres-XL, Greenplum, etc.), they will not be implemented in MAGNETO. The reason is that PostgreSQL is a scale-up architecture and is not naturally distributed across multiple nodes in a cluster. One positive aspect of PostgreSQL is that, by using its Foreign Data Wrapper (FDW) feature, it supports integration with external databases, especially the three that will form the core of MBDF: Cassandra, MongoDB and Hadoop. Finally, note that Amazon AWS Redshift, built on top of PostgreSQL, is a very powerful data warehouse, but it is still limited in some aspects due to the inherent architecture of PostgreSQL (for example, load balancing is complex).

5.1.3.3 MBDF Processing Layer

This layer will be in charge of providing real-time and batch analysis of the data incorporated into MBDF. Gravitating mainly around Apache Spark, it will also incorporate an Elasticsearch engine for convenience of operations. This is the most interesting layer for the purpose of this deliverable, although the rest have been included for the sake of convenience and overall understanding.

Apache Spark

Apache Spark79 is a general-purpose and lightning-fast cluster computing system that also offers high-level APIs. It is up to 100x faster than Hadoop MapReduce when running in memory, and over 10x faster when running from disk. It is written in Scala. The main features of Apache Spark are:

78 https://www.postgresql.org/ 79 https://spark.apache.org/


• In-memory computation: Spark has a DAG execution engine which facilitates in-memory computation and acyclic data flow, resulting in high speed. • Fault tolerance: through the Spark RDD abstraction. Spark RDDs are designed to handle the failure of any worker node in the cluster, thus ensuring that data loss is reduced to zero. • Real-time stream processing: unlike Hadoop, Apache Spark can handle and process real-time data. • Lazy evaluation: all Spark RDD transformations are lazy in nature; a transformation does not produce its result right away, but instead a new RDD is formed from the existing one, thus increasing the efficiency of the system.

Figure 31: Spark components

Spark offers a rich set of resources that covers almost all of Hadoop's components. It can perform both batch processing and real-time processing. It also has its own SQL engine, Spark SQL80, covering the features of both SQL and Hive. Spark also has a machine learning library81 (MLlib) and can deliver graph processing using the GraphX component. Spark has rich resources for handling data and, most importantly, it is 10-20x faster than Hadoop's MapReduce. It attains this speed of computation through its in-memory primitives: the data is cached in memory (RAM) and all computations are performed in-memory.

Spark integrates easily with different cluster managers, especially with Apache Mesos, which is the one intended for managing MBDF. Moreover, Apache Spark is compatible with many file systems and databases, namely the ones targeted in the MBDF Storage Layer (HDFS, MongoDB and Cassandra), making it the perfect solution for the data processing. Apache Spark is becoming the de-facto option in the internet industry; Netflix, for example, has been migrating to Apache Spark over the last years. Apache Spark offers a number of advantages over Pig, such as stability and a bigger and more innovative community.

In comparison with other big data frameworks, such as Facebook's Presto, none offers the suitability and wide capabilities of Apache Spark.

80 https://spark.apache.org/sql/ 81 https://spark.apache.org/mllib/


The Apache Spark community is several times larger than Presto's, with over 1,400 contributors; Spark SQL alone has over 450. By contrast, Presto has 164 distinct contributors on GitHub and 229 distinct email addresses. Presto is used primarily by a handful of Silicon Valley technology companies, whereas Spark is widely used across the world.

In terms of performance, AtScale compared Spark, Impala, Hive and Presto in their latest benchmark, in 2016 Q482. Spark was the fastest query engine in 7 out of the 13 queries, and Impala won 6 out of 13; Presto and Hive performed worst. Spark was faster than Presto in most cases (11 out of the 13 queries), and in some cases more than 4x faster.

Finally, it is important to highlight that Apache Kafka integrates directly with Apache Spark, using Spark Streaming for real-time stream processing of data in real-time applications. Once the data is processed, the results can be stored in the selected database or published to another Kafka topic (a minimal sketch is given after Figure 32).

Figure 32: Integration of Apache Kafka with Apache Spark
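A minimal PySpark Structured Streaming sketch of this integration is given below; it assumes the Spark-Kafka connector package is available on the cluster and uses a hypothetical broker address and topic, writing the decoded stream to the console (in MBDF the sink could instead be Cassandra, HDFS or another Kafka topic).

# Minimal sketch: Spark Structured Streaming reading from a Kafka topic.
# Broker and topic are assumptions; requires the spark-sql-kafka connector.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("mbdf-kafka-stream-sketch").getOrCreate()

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "kafka1:9092")
          .option("subscribe", "magneto.events")
          .load())

# Kafka delivers binary key/value pairs; cast the value to a readable string.
decoded = events.select(col("value").cast("string").alias("json_value"))

query = (decoded.writeStream
         .format("console")     # placeholder sink for the illustration
         .outputMode("append")
         .start())
query.awaitTermination()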

Elasticsearch

As already introduced in section 4.3, Elasticsearch83 is a highly scalable open source full-text search and analytics engine based on Apache Lucene. It provides distributed, multitenant-capable text search with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed alongside a data-collection and log-parsing engine called Logstash, and an analytics and visualization platform called Kibana. The three products are designed for use as an integrated solution, referred to as the "Elastic Stack" (formerly the "ELK stack"). They will be provided in MBDF for some specific use cases.

It is interesting to note that Elasticsearch combines especially well with Cassandra (and HDFS). By combining both, it is possible to achieve search latencies that approach real-time responsiveness and to open access to all the benefits of Elasticsearch's established ecosystem of REST APIs, plugins and other solutions such as Kibana.

82 https://cdn2.hubspot.net/hubfs/488249/Asset%20PDFs/Benchmark_BI-on-Hadoop_Performance_Q4_2016.pdf 83 https://www.elastic.co/


Figure 33: Elasticsearch integration

In combination with Apache Spark, Elasticsearch can deliver real-time indexing, search and data analysis. Moreover, since version 2.1 Spark includes native Elasticsearch support (Elasticsearch Hadoop), so Spark can be used to work with JSON Elasticsearch documents. Further refinements of the data processing layer, such as using Apache Kylin, Storm, Flink or other data processing frameworks, are considered in the detailed architecture for the big data framework in D2.2 and in the expected D2.3.

5.1.3.4 MBDF Data Management Layer

The Data Management Layer includes several components necessary for running, configuring, orchestrating and ensuring the operation of MBDF.

Apache ZooKeeper: acts as a coordination service and provides a shared database (namely, a hierarchical key/value tree) for Apache Kafka, so that all entities involved in the exchanges can get and modify the state of the system (a minimal sketch is given below). It can easily integrate with HAProxy/NGINX for load balancing.
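The sketch below (illustrative only; the ensemble addresses and znode paths are assumptions) uses the kazoo client to show this hierarchical key/value tree in practice: a piece of shared configuration is written under a path and read back together with its version.

# Minimal ZooKeeper sketch with the kazoo client: writing and reading shared
# state in the hierarchical key/value tree. Hosts and paths are assumptions.
from kazoo.client import KazooClient

zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")
zk.start()

zk.ensure_path("/mbdf/config")
if zk.exists("/mbdf/config/ingest_rate"):
    zk.set("/mbdf/config/ingest_rate", b"1000")
else:
    zk.create("/mbdf/config/ingest_rate", b"1000")

value, stat = zk.get("/mbdf/config/ingest_rate")
print("ingest_rate =", value.decode(), "version =", stat.version)

zk.stop()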

Logstash and Kibana: part of the ELK Stack84, they will provide MBDF support for centralized logging, search and analysis of all types of information in real time, especially related to the rest of the servers, clusters, Docker containers or applications. The ELK stack is widely used in industry, for example at Netflix, LinkedIn, Tripwire or Medium, for different uses and applications.

Genie: Genie85 is an open source distributed job orchestration engine developed by Netflix. It provides a RESTful API to run a variety of big data jobs on Hadoop, Spark, Presto, Sqoop, etc. It also provides APIs for managing distributed processing clusters and the commands and applications which run on them (like Mesos), and it will be one of the pieces used to orchestrate the jobs in MBDF.

84 https://www.elastic.co/elk-stack 85 https://netflix.github.io/genie/


Apache Mesos: an open source cluster manager that orchestrates workloads in a distributed environment through dynamic resource sharing and isolation. It is important to note that Mesos also provides capabilities for the overall MAGNETO Docker needs, in combination with Kubernetes. The key features of Apache Mesos86 are:

• Highly scalable (over 10,000 nodes) • High availability of Master and slave nodes using Zookeeper • Natively supports Docker • Provides support for several high-level frameworks like Apache Spark or Hadoop.

Apache Mesos can work in coalition with Apache Ambari, which is the specific manager for Hadoop clusters. Apache Ambari is specific software for provisioning, monitoring and managing Apache Hadoop clusters. It provides an easy-to-use management interface and a set of RESTful APIs.

In addition, other security tools such as Apache Ranger87, Metron and Sentry will be considered and (potentially) integrated into MBDF.

5.1.4 Performance Evaluation

The performance of the MBDF module will be measured during the testing phase. It will mainly relate to the capabilities to ingest and process data and will therefore be reported for the following parameters:

• Number of real-time data acquisition packets per second; • Number of DB reads/writes per second under different workloads; • Number of DB reads/writes per second under different configurations (number of nodes); • Read/write latency for data; • Number of Spark operations per second; • Memory usage for Spark; • Latency in data acquisition and in the processing of operations.

To obtain the above measures and compare them with standard values, different tools will be used, either specific ones (e.g. the Yahoo! Cloud Serving Benchmark) or customized scripts.

86 http://mesos.apache.org/ 87 https://ranger.apache.org/


6. Summary and conclusions

This deliverable gives a detailed overview of the techniques and solutions developed under WP3 (Heterogeneous Data Mining and Evidence Collection). The document presents the initial work carried out towards building the intelligent MAGNETO toolkit for the collection of evidence during the investigation process. In particular, the deliverable introduces the following modules, which will be further refined and optimised in order to assist police investigators in searching for clues through vast volumes of heterogeneous multimedia data:

• Visual low level and local feature extraction • Foreground and background subtraction in videos • Speech to text transformation • Detection and tracking of distinctive regions or patterns of interest in video • Text Mining • Data Mining • Text retrieval • Automatic Language Translation • MAGNETO Big Data Framework

All presented solutions have been developed and tailored for the MAGNETO use cases. Initial tests and evaluations have shown the effectiveness of the chosen algorithms in the targeted tasks, and comparisons show that MAGNETO is using state-of-the-art solutions to deliver high performance. Further development will focus on optimization work in order to achieve the best possible accuracy when working with real data. This will require rigorous testing and further tuning of the algorithms, keeping users in the loop of the test-development cycle.
