Knowledge Discovery from Data Streams

Total Page:16

File Type:pdf, Size:1020Kb

Knowledge Discovery from Data Streams The Fourth International Workshop on Knowledge Discovery from Data Streams The rapid growth in information science and technology in general and the com- plexity and volume of data in particular have introduced new challenges for the research community. Databases are growing incessantly and many sources pro- duce data continuously. In most real world applications, the process generating the data is not strictly stationary. In many cases, we need to extract some sort of knowledge from this continuous stream of data. Examples include customer click streams, telephone records, large sets of web pages, multimedia data, scientific data, and sets of retail chain transactions. These sources are called data streams. Learning from data streams is an incremental task that requires incremental al- gorithms that take drift into account. The goal of this workshop is to promote an interdisciplinary forum for re- searchers who deal with sequential learning, anytime learning, real-time learn- ing, online learning, temporal and spatial learning, etc. from data streams. The workshop papers present significant contributions in learning decision trees, asso- ciation rules, clustering, preprocessing, feature selection, etc. from data streams and related themes. This is the fourth workshop in this series. The workshop web page is avail- able at http://www.machine-learning.eu/iwkdds-2006/. There were 17 sub- missions. The program committee accepted 9 full papers, 4 short papers and 1 poster paper. This year the workshop has a special emphasis on learning from sensor net- works in ubiquitous environments. In that direction, the workshop includes also an invited talk, by Hillol Kargupta, one of the leading researchers in distributed mining data streams. We would like to thank the ECML/PKDD 2006 organizers, the authors of the submitted papers, and the members of the Program Committee for all efforts to make this workshop possible. J. Gama, R. Klinkenberg and J. Aguilar Organization Program Chairs Jo˜aoGama, University of Porto, Portugal Ralf Klinkenberg, University of Dortmund, Germany Jes´usS. Aguilar-Ruiz, University of Seville, Spain Program Committee Michaela Black, University of Ulster, Coleraine, Northern Ireland, UK Andre Carvalho, University of Sao Paulo, Brazil Pedro Domingos, University of Washington, Seattle, WA, USA Francisco Ferrer, University of Seville, Spain Mohamed Gaber, Monash University, Victoria, Australia Jo˜aoGama, University of Porto, Portugal Ray Hickey, University of Ulster, Coleraine, Northern Ireland, UK Hillol Kargupta, University of Maryland, Baltimore, MD, USA Ralf Klinkenberg, University of Dortmund, Germany Jeremy Z. Kolter, Stanford University, CA, USA Miroslav Kubat, University Miami, FL, USA Mark Last, Ben-Gurion University, Israel Mark Maloof, Georgetown University, Washington, DC, USA S. Muthu Muthukrishnan, Rutgers University and AT&T Research, USA Masayuki Numao, Osaka University, Japan Pedro Pereira Rodrigues, University of Porto, Portugal Josep Roure, Carnegie Mellon University, Pittsburgh, PA, USA Jes´usS. Aguilar-Ruiz, University of Pablo de Olavide, Spain Bernhard Seeger, University Marburg, Germany Elaine Parros Sousa, University of Sao Paulo, Brazil Min Wang, IBM Watson Research Center, Hawthorne, NY, USA Wei Wang, University of North Carolina, Chapel Hill, NC, USA Xiaoyang Wang, University of Vermont, Burlington, VT, USA Gerhard Widmer, University of Linz, Austria Philip S. Yu, IBM Watson Research Center, Yorktown Heights, NY, USA Sponsors The workshop is supported by Kdubiq - Knowledge Discovery in Ubiquitous Environments - European Coordinated Action of FP6 and Adaptive Learning Systems II Project (POSI/EIA/55340/2004). ii Table of Contents Invited Talk Peer-to-Peer Distributed Data Stream Mining and Monitoring Hillol Kargupta, University of Maryland, Baltimore, MD, USA . 1 Contribution Papers Accurate Schema Matching on Streams Szymon Jaroszewicz, Lenka Ivantysynova and Tobias Scheffer ................3 High-Order Substate Chain Prediction Based on Massive Sensor Outputs Nguyen Viet Phuong and Takashi Washio...................................13 Online Prediction of Clustered Streams Pedro Pereira Rodrigues and Joao Gama....................................23 Model Averaging via Penalized Regression for Tracking Concept Drift Kyupil Yeon, Moon Sup Song, Yongdai Kim and Cheolwoo Park. .33 Incremental Training of Markov Mixture Models Andreas Kakoliris and Konstantinos Blekas .................................47 Improving Prediction Accuracy of an Incremental Algorithm Driven by Error Margins Jos´edel Campo-Avila´ , Gonzalo Ramos-Jim´enez, Jo˜aoGama and Rafael Morales- Bueno ......................................................................57 Analyzing Data Streams by Online DFT Alexander Hinneburg, Dirk Habich and Marcel Karnstedt....................67 Early Drift Detection Method Manuel Baena-Garcia, Jos´edel Campo-Avila´ , Ra´ulFidalgo-Merino, Albert Bifet, Ricard Gavald`a and Rafael Morales-Bueno .................................77 Mining Frequent Items in a Stream using Flexible Windows Toon Calders, Nele Dexters and Bart Goethals ..............................87 iii Hard Real-time Analysis of Two Java-based Kernels for Stream Mining Rasmus Pedersen ...........................................................97 Dynamic Feature Space and Incremental Feature Selection for the Classification of Textual Data Streams Ioannis Katakis, Grigorios Tsoumakas and Ioannis Vlahavas. .107 User Constraints over Data Streams Carlos Ruiz Moreno, Myra Spiliopoulou and Ernestina Menasalvas . 117 StreamSamp DataStream Clustering Over Tilted Windows Through Sampling Baptiste Csernel, Fabrice Clerot and Georges H´ebrail . 127 Structural Analysis of the Web Pratyus Patnaik and Sudip Sanyal .........................................137 Author Index. .145 iv Peer-to-Peer Distributed Data Stream Mining and Monitoring Hillol Kargupta University of Maryland - Baltimore County, Baltimore, MD, USA (Invited Talk) Abstract: Distributed data mining (DDM) deals with the problem of analyzing distributed, possibly multi-party data by paying attention to the computing, communication, storage, and human factors-related issues in a distributed environment. Unlike the conventional off-the-shelf centralized data mining products, DDM systems are based on fundamentally distributed algorithms that do not necessarily re- quire centralization of data and other resources. DDM technology is finding in- creasing number of applications in many domains. Examples include data driven pervasive applications for mobile and embedded devices, grid-based large scale scientific and business data analysis, security and defense related applications involving analysis of multi-party possibly privacy-sensitive data, and peer-to- peer data stream mining in sensor and file-sharing networks. This talk will focus on peer-to-peer (P2P) distributed data stream mining and monitoring. It will first discuss the foundation of approximate and exact P2P algorithms for data analysis. Then it will present a class of P2P algorithms for eigen-analysis and clustering in details. The talk will end with a discussion on the future directions of research on P2P data mining. 1 2 ÙÖØ ËÑ ÅØ Ò ÓÒ ËØÖÑ× ½ ¾ ¾ ËÞÝÑÓÒ ÂÖÓ×ÞÛÞ ¸ ÄÒÁÚÒØÝ×ÝÒÓÚ ¸ÌÓ× Ë«Ö ½ ÆØÓÒÐ ÁÒ×ØØÙØ Ó Ì Ð ÓÑÑÙÒ ØÓÒ׸ ÏÖ×Û¸ ÈÓÐÒ¸ ×׺ÙѺ Ù ¾ ÀÙÑ ÓÐعÍÒÚ Ö×ØØ ÞÙ ÖÐÒ¸ ÖÑÒÝ Ú ÒØÝ×ݸ×Ö ÒÓÖÑ Ø ºÙ¹ ÖÐ Òº ×ØÖغ Ï Ö ×× Ø ÔÖÓÐ Ñ Ó ÑØ Ò! ÑÔ Ö ØÐÝ Ó Ù¹ Ñ ÒØ × Ñ× Ó Ø ×ØÖ Ñ× Ò ÐÖ! Ø× ×º ÁÒ×ØÒ ¹Ð Ú Ð × Ñ ÑØ Ò! Ð!ÓÖØÑ× ÒØ Ý Ð# ÐÝ ÓÖÖ ×ÔÓÒ Ò × ØÛ Ò Ø¹ ØÖÙØ × Ý ÕÙÒØ ÝÒ! Ø ×ÑÐÖØÝ Ó Ø Ö ÓÖÖ ×ÔÓÒÒ! ÚÐ٠׺ ÀÓÛ¹ Ú Ö¸ Ü Ø Ð ÙÐØÓÒ Ó Ø × ×ÑÐÖØ × Ö ÕÙÖ × ÔÖÓ ××Ò! Ó ÐÐ Ø× Ö ÓÖ×&Û × Ò ×Ð ÓÖ Ø ×ØÖ Ñ׺ Ï Ú× ×Ø ÑØ Ò! Ð!ÓÖØÑ ØØ Ù× × ÓÒÐÝ ×ÑÐÐ ×ÑÔÐ Ó Ö ÓÖ׸ Ò × Ý Ø !ÙÖÒØ ØÓ ÑØ Ø ÑÓ×Ø ×ÑÐÖ ØØÖÙØ × ÛØ ! ÔÖÓÐØݺ Ì Ñ ØÓ Ò ÔÔÐ ØÓ ÒÝ !Ú Ò ´ ÓÑÒØÓÒ Ó µ ×ÑÐÖØÝ Ñ ØÖ × ØØ Ò ×ØÑØ ÖÓÑ ×ÑÔÐ ÛØ ÓÙÒ ÖÖÓÖ) Û ÔÔÐÝ Ø Ð!ÓÖØÑ ØÓ × Ú ÖÐ Ñ ØÖ ×º Ï !Ú Ö!ÓÖÓÙ× ÔÖÓ Ó Ó Ø Ñ ØÓ ³× ÓÖÖ ØÒ ×× Ò Ö ÔÓÖØ ÓÒ ÜÔ ÖÑ Ò Ø× Ù×Ò! ÐÖ! Ø× ×º ½ ÁÒØÖÓ ÙØ ÓÒ ÁÒ ÑÒÝ ÔÖØÐ ×ØÙØÓÒ׸ Ø ÑÒÒ ÔÖÓ ××× Ö ÔÖ ÝÔÖÔÖÓ¹ ××Ò ×ØÔ Ò Û ÑÙÐØÔÐ ØÖÓÒÓÙ× ØÖÒ×ØÓÒ Ø×× Ö Òع ÖØ ÒØÓ ×ÒÐ Ø ÛÖÓÙ׺ ÁÒØÖØÓÒ ÖÕÙÖ× Ø ÒجØÓÒ Ó ×¹ ÑÒØ ÓÖÖ×Ô ÓÒÒ× ØÛÒ ØØÖÙØ× ÖÓ×× ÑÙÐØÔÐ ØÖÒ×ØÓÒ Ø× ×Ñ× ¾º ÁÒ ÔÖØÐ ÔÔÐØÓÒ× ØØ ÓØÒ ÒÚÓÐÚ ÐÖ Ø×׸ ×Ñ ÑØÒ × ØÓÙ×Ø׸ ÓØÒ ÙÖØÖ ÒÖ Ý ÆØ Ó ÙÑÒØØÓÒº Ê×Ö ÖÚÓÐÚ× ÖÓÙÒ Ø ÕÙ×ØÓÒ ÓÛ ×Ñ ÑØÒ Ò ×ÙÔ¹ Ô ÓÖظ ÓÖ ÚÒ ÙØÓÑظ «ØÚÐݺ ËÑ ÑØÒ ÐÓÖØÑ× ÒÖÐÐÝ ÑØ ÐÑÒØ× Ó Ø ×ØÒØ ×Ñ× ×Ù ØØ ×ÓÑ ×ÑÐÖØÝ ÖØÖÓÒ × ÑÜÑÞº ËѹÐÚÐ ÑØÖ× ÙØÐÞ ×ÑÐÖØÝ ÖØÖÓÒ ØØ ÖÖ× ØÓ ×Ñ ÒÓÖÑØÓÒ ×Ù × ØØÖÙØ ÒÑ׸ ×ÖÔØÓÒ׸ ÓÖ Ø ×Ñ׳ ×ØÖÙ¹ ØÙÖº ÍÒÓÖØÙÒØÐݸØÚÐÐ ×Ñ ÒÓÖÑØÓÒ × ÓØÒ Æغ ×× Ö× Ò Û ØØÖÙØ ÒÑ× Ö ÓÔÕÙ Ò ÐÙ× ÓÒ ØÖ ×ÑÒØ× Ò ÓÒÐÝ ÓÙÒ Ý Ò×Ô ØÒ Ø ØØÖÙØ׳ ÚÐÙ׺ ÁÒ×ØÒ¹ÐÚÐ ×Ñ ÑØÖ× ÕÙÒØÝ Ø ×ÑÐÖØÝ Ó ØØÖÙØ× Ý ÓÑÔÖÒ ÔÖÓÔ ÖØ× Ó ØÖ ÚÐÙ׺ ÁÒ ÓÖÖ ØÓ ÙÖÒØ ØØ Ø ÔÖÓ Ù ÑÔÔÒ Ò Ø ÑÜÑÞ× Ø Ó×Ò ×ÑÐÖØÝ ÙÒØÓÒ¸ Ò×ØÒ¹ÐÚÐ ÑØÖ× Ò ØÓ ÜÖ× Ø Ð×Ø ÓÒ ÒØÖ Ô×× ÓÚÖ Ø Ø×׺ ÓÑÔÐØ Ô×× × ÙÒÖ×ÓÒÐ ÓÖ ØÖÒ×ØÓÒÐ Ø×× ØØ ÖÓÖ ¹×Ô Ø ×ØÖÑ× × ØÝ Ó ÙÖ¸ ÓÖ Ò×ØÒ¸ Ò ØÐÓÑÑÙÒØÓÒ Ò ÒÒ ÔÔÐØÓÒ× Ò Ò ×ÐÝ ÖØÐ Ò׸ Ò ÖÓÛÒØÓ Ø ÓÖÖ Ó ÙÒÖ× Ó ØÖÝØ׺ Ì ¹Ó ×ÓÐÙØÓÒ Ó ÔÖÓ ××Ò 3 ÓÒÐÝ ×ÑÐÐ ×ÑÔÐ Ó ØÖÒ×ØÓÒ× Ö×ÙÐØ× Ò ÐÓ×× Ó ÒÝ ÙÖÒØ ÓÒ Ø ÓÔØÑÐØÝ Ó Ø ÔÖÓ Ù Ñغ ÃÒÓÛÒ ÑØÓ × ØØ ÙØÐÞ ×ÑÔÐÒ Ó ÒÓØ ÙÖÒØ ÒÝ ÔÖÓÔ ÖØ× Ó Ø ÖØÖÚ ÑÔÔÒ׺ Ï ÓÖÑÐÞ Ø ÑØÒ ÔÖÓÐÑ Ò ÛÝ ØØ × ÓØ ÑØÑØÐÐÝ ÖÓÖÓÙ× Ò ÐÓ×ÐÝ Ø ØÓ ÔÖØÐ ÔÔÐØÓÒ׺ Ï Ú× ×Ñ ÑØÒ ÐÓÖØѸ Û× ÙÖÒØ ØÓ ÔÖÓ Ù ÒÖ¹ÓÔØÑÐ ÑØ Û ØÐ Ø ØÖÑ ÒÖ¹ÓÔØÑÐ Ý ÑÒ× Ó ÔÖÓÐ×Ø ÓÙÒ× ÓÒ ØØÖÙØ ×ÑÐÖØݺ Ù×ÙÐ ×ÑÐÖØÝ ÑØÖ ÓÖ Ø ÑØÒ ÔÖÓÐÑ Ø Ò × × ÖÕÙÖ¹ ÑÒØ ÓÖ ÒÝ ×ÑÔÐÒ ÐÓÖØѸ Ò ×ÒÒ ×Ù ÔÖÓÐѹ×Ô ¬ ÑØÖ ÐÖÐÝ × ÐÐÒÒ ××Ù Ò Ø×к ÌÖÓÖ¸ Û ×Ò ÓÙÖ ×ÓÐÙØÓÒ ØÓ ÒÖ ÛØ Ö×Ô Ø ØÓ Ø ×ÑÐÖØÝ ÖØÖÓÒ ÛØ ÓÑÒØÓÒ× Ó ×ÚÖÐ ×ѹÐÚÐ Ò Ò×ØÒ¹ÐÚÐ ÖØÖ Ò
Recommended publications
  • Knowledge Discovery in Databases: PKDD 2007
    springer.com J.N. Kok, J. Koronacki, R. Lopez de Mantaras, S. Matwin, D. Mladenic (Eds.) Knowledge Discovery in Databases: PKDD 2007 11th European Conference on Principles and Practice of Knowledge Discovery in Databases, Warsaw, Poland, September 17-21, 2007, Proceedings Series: Lecture Notes in Artificial Intelligence The two premier annual European conferences in the areas of machine learning and data mining have been collocated ever since the ?rst joint conference in Freiburg, 2001. The European Conference on Machine Learning (ECML) traces its origins to 1986, when the ?rst European Working Session on Learning was held in Orsay, France. The European Conference on Principles and Practice of KnowledgeDiscoveryinDatabases(PKDD) was? rstheldin1997inTrondheim, Norway. Over the years, the ECML/PKDD series has evolved into one of the largest and most selective international conferences in machine learning and data mining. In 2007, the seventh collocated ECML/PKDD took place during September 17–21 on 2007, XXIV, 644 p. the centralcampus of WarsawUniversityand in the nearby Staszic Palace of the Polish Academy of Sciences. The conference for the third time used a hierarchical reviewing process. We Printed book nominated 30 Area Chairs, each of them responsible for one sub-?eld or several closely related Softcover research topics. Suitable areas were selected on the basis of the submission statistics for ECML 114,99 € | £103.50 | $159.00 /PKDD 2006 and for last year’s International Conference on Machine Learning (ICML 2006) to [1]123,04 € (D) | 126,49 € (A) | CHF ensure a proper load balance amongtheAreaChairs.AjointProgramCommittee(PC) 153,65 wasnominatedforthe two conferences, consisting of some 300 renowned researchers, mostly eBook proposed by the Area Chairs.
    [Show full text]
  • Curriculum Vitae – Wouter Duivesteijn
    Curriculum Vitae – Wouter Duivesteijn Date of birth December 09, 1984 Place of birth Rotterdam, the Netherlands Nationality Dutch Email address [email protected] Website http://wwwis.win.tue.nl/~wouter/ Academic career September 2016 Assistant Professor Data Mining at Technische Universiteit Eindhoven (the Nether- - now lands). October 2015 Postdoctoraal Bursaal at Universiteit Gent (Belgium), working on the FORSIED project: - August 2016 FORmalising Subjective Interestingness in Exploratory Data mining. August 2015 Honorary Research Associate at University of Bristol (UK), working on the FORSIED - September 2015 project: FORmalising Subjective Interestingness in Exploratory Data mining. August 2013 Wissenschaftlicher Mitarbeiter at Technische Universität Dortmund (Germany), in - July 2015 Sonderforschungsbereich 876: Verfügbarkeit von Information durch Analyse unter Ressourcenbeschränkung (Collaborative Research Center SFB 876: Providing Informa- tion by Resource-Constrained Data Analysis). July 2009 PhD candidate in the Algorithms group at Leiden Institute of Advanced Computer - July 2013 Science, Universiteit Leiden (the Netherlands), under the supervision of Joost N. Kok and Arno Knobbe (graduation date: September 17, 2013). Education 2019 University Teaching Qualification (UTQ), Technische Universiteit Eindhoven. 2009-2013 PhD in Computer Science, Universiteit Leiden (graduated on September 17, 2013). 2008-2009 MSc in History and Philosophy of Science, Universiteit Utrecht (aborted for PhD position). 2005-2008 MSc in Applied Computing Science, Universiteit Utrecht (graduated on October 15, 2008). 2005-2007 MSc in Mathematical Sciences, Universiteit Utrecht (graduated on October 30, 2007). 2002-2005 BSc in Mathematics, Universiteit Utrecht (graduated on July 7, 2005). 2002-2005 BSc in Computer Science, Universiteit Utrecht (graduated on July 7, 2005). 1996-2002 Gymnasium, C.S.G. Blaise Pascal, Spijkenisse.
    [Show full text]
  • Global Impact of COVID-19 Restrictions on the Surface Concentrations of Nitrogen Dioxide and Ozone Christoph A
    Global Impact of COVID-19 Restrictions on the Surface Concentrations of Nitrogen Dioxide and Ozone Christoph A. Keller1,2, Mat. J. Evans3,4, K. Emma Knowland1,2, Christa A. Hasenkopf5, Sruti Modekurty5, Robert A. Lucchesi1,6, Tomohiro Oda1,2, Bruno B. Franca7, Felipe C. Mandarino7, M. 5 Valeria Díaz Suárez8, Robert G. Ryan9, Luke H. Fakes3,4, Steven Pawson1 1NASA Global Modeling and Assimilation Office, Goddard Space Flight Center, Greenbelt, MD, USA 2Universities Space Research Association, Columbia, MD, USA 3Wolfson Atmospheric Chemistry Laboratories, Department of Chemistry, University of York, York, 10 YO10 5DD, UK 4National Centre for Atmospheric Science, University of York, York, YO10 5DD, UK 5OpenAQ, Washington, DC, USA 6Science Systems and Applications, Inc., Lanham, MD, USA 7Municipal Government of Rio de Janeiro, Rio de Janeiro, Brazil 15 8Secretaria de Ambiente, Quito, Ecuador 9School of Earth Sciences, The University of Melbourne, Australia Correspondence to: Christoph A. Keller ([email protected]) Abstract. Social-distancing to combat the COVID-19 pandemic has led to widespread reductions in air pollutant emissions. Quantifying these changes requires a business-as-usual counterfactual that accounts 20 for the synoptic and seasonal variability of air pollutants. We use a machine learning algorithm driven by information from the NASA GEOS-CF model to assess changes in nitrogen dioxide (NO2) and ozone (O3) at 5,756 observation sites in 46 countries from January through June 2020. Reductions in NO2 coincide with timing and intensity of COVID-19 restrictions, ranging from 60% in severely affected cities (e.g., Wuhan, Milan) to little change (e.g., Rio de Janeiro, Taipei).
    [Show full text]
  • Open Data Science to Fight COVID-19
    Open Data Science to fight COVID-19: Winning the 500k XPRIZE Pandemic Response Challenge Miguel Angel Lozano1 ( ), Oscar` Garibo i Orts2 , Eloy Pi~nol2 , Miguel Rebollo2 , Kristina Polotskaya3 , Miguel Angel Garcia-March2 , J. Alberto Conejero2 , Francisco Escolano1 , and Nuria Oliver4 1 University of Alicante, Alicante, Spain [email protected] 2 IUMPA and VRAIN. Universitat Polit`ecnicade Val`encia,Val`encia.Spain 3 University Miguel Hern´andez,Elche, Spain 4 ELLIS (European Lab. for Learning and Intelligent Systems) Unit Alicante, Spain Abstract. In this paper, we describe the deep learning-based COVID-19 cases predictor and the Pareto-optimal Non-Pharmaceutical Intervention (NPI) prescriptor developed by the winning team of the 500k XPRIZE Pandemic Response Challenge, a four-month global competition organized by the XPRIZE Foundation. The competition aimed at developing data- driven AI models to predict COVID-19 infection rates and to prescribe NPI Plans that governments, business leaders and organizations could implement to minimize harm when reopening their economies. In addition to the validation performed by XPRIZE with real data, the winning models were validated in a real-world scenario thanks to an ongoing collaboration with the Valencian Government in Spain. We believe that this experience contributes to the necessary transition to more evidence- driven policy-making, particularly during a pandemic. Keywords: SARS-CoV-2 · Computational Epidemiology · Data Science for Public Health · Recurrent Neural Networks · Non-Pharmaceutical Interventions · Pareto-front optimization 1 Introduction During a pandemic, predicting the number of infections under different circum- stances is important to inform public health, health care and emergency system responses. Different approaches to predict the evolution of a pandemic have been proposed in the literature, including traditional compartmental meta-population models {such as SIR or SEIR [12], complex network [18], agent-based individual [9] and purely data-driven time series forecasting [23] models.
    [Show full text]
  • LNAI 3201) and the Proceedings of the 8Th European Conferences on Principles and Practice of Knowledge Discovery in Databases (LNAI 3202)
    Lecture Notes in Artificial Intelligence 3201 Edited by J. G. Carbonell and J. Siekmann Subseries of Lecture Notes in Computer Science Jean-François Boulicaut Floriana Esposito Fosca Giannotti Dino Pedreschi (Eds.) Machine Learning: ECML 2004 15th European Conference on Machine Learning Pisa, Italy, September 20-24, 2004 Proceedings 13 Series Editors Jaime G. Carbonell, Carnegie Mellon University, Pittsburgh, PA, USA Jörg Siekmann, University of Saarland, Saarbrücken, Germany Volume Editors Jean-François Boulicaut INSA Lyon LIRIS CNRS FRE 2672, 69621 Villeurbanne Cedex, France E-mail: [email protected] Floriana Esposito University of Bari Department of Computer Science Via Orabona 4, 70126 Bari, Italy E-mail: [email protected] Fosca Giannotti Science and Technology Institute Knowledge Discovery and Delivery (KDD) Via G. Moruzzi 1, 56124 Pisa, Italy E-mail: [email protected] Dino Pedreschi University of Pisa Department of Computer Science Via F. Buonarroti 2, 56125 Pisa, Italy E-mail: [email protected] Library of Congress Control Number: 2004111517 CR Subject Classification (1998): I.2, F.2.2, F.4.1, H.2.8 ISSN 0302-9743 ISBN 3-540-23105-6 Springer Berlin Heidelberg New York This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer.
    [Show full text]
  • ECML PKDD 2018 Workshops DMLE 2018 and Iotstream 2018 Dublin, Ireland, September 10–14, 2018 Revised Selected Papers
    Communications in Computer and Information Science 967 Commenced Publication in 2007 Founding and Former Series Editors: Phoebe Chen, Alfredo Cuzzocrea, Xiaoyong Du, Orhun Kara, Ting Liu, Krishna M. Sivalingam, Dominik Ślęzak, and Xiaokang Yang Editorial Board Simone Diniz Junqueira Barbosa Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Rio de Janeiro, Brazil Joaquim Filipe Polytechnic Institute of Setúbal, Setúbal, Portugal Ashish Ghosh Indian Statistical Institute, Kolkata, India Igor Kotenko St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, St. Petersburg, Russia Takashi Washio Osaka University, Osaka, Japan Junsong Yuan University at Buffalo, The State University of New York, Buffalo, USA Lizhu Zhou Tsinghua University, Beijing, China More information about this series at http://www.springer.com/series/7899 Anna Monreale • Carlos Alzate et al. (Eds.) ECML PKDD 2018 Workshops DMLE 2018 and IoTStream 2018 Dublin, Ireland, September 10–14, 2018 Revised Selected Papers 123 Editors Anna Monreale Carlos Alzate KDDLab IBM Research - Ireland University of Pisa Dublin, Ireland Pisa, Pisa, Italy Workshop Editors see next page ISSN 1865-0929 ISSN 1865-0937 (electronic) Communications in Computer and Information Science ISBN 978-3-030-14879-9 ISBN 978-3-030-14880-5 (eBook) https://doi.org/10.1007/978-3-030-14880-5 Library of Congress Control Number: 2019933158 © Springer Nature Switzerland AG 2019 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
    [Show full text]
  • The Technological Gap Between Virtual Assistants And
    The Technological Gap Between Virtual Assistants and Recommendation Systems Dimitrios Rafailidis Yannis Manolopoulos Maastricht University Aristotle University of Thessaloniki Maastricht, The Netherlands Thessaloniki, Greece [email protected] [email protected] Abstract—Virtual assistants, also known as intelligent conver- allowed us to talk to computers via commands. Intelligent sational systems such as Google’s Virtual Assistant and Apple’s Conversational Agents (virtual assistants) have allowed us Siri, interact with human-like responses to users’ queries and fin- not to just talk to machines, but also accomplish our daily ish specific tasks. Meanwhile, existing recommendation technolo- gies model users’ evolving, diverse and multi-aspect preferences tasks. For example, Google Assistant, Apple’s Siri, Amazon to generate recommendations in various domains/applications, Alexa, and Microsoft Cortana have revolutionized the way we aiming to improve the citizens’ daily life by making suggestions. interact with phones and machines. These virtual assistants The repertoire of actions is no longer limited to the one-shot are termed as “dialogue systems often endowed with human- presentation of recommendation lists, which can be insufficient like behaviour”, and they have started becoming integral parts when the goal is to offer decision support for the user, by quickly adapting to his/her preferences through conversations. Such an of people’s lives. Although both recommendation systems interactive mechanism is currently missing from recommendation and virtual assistants are based on various machine learning systems. This article sheds light on the gap between virtual strategies, there is a large technological gap between them [4], assistants and recommendation systems in terms of different [5].
    [Show full text]
  • Machine Learning and Knowledge Discovery in Databases European Conference, ECML PKDD 2019 Würzburg, Germany, September 16–20, 2019 Proceedings, Part I
    Lecture Notes in Artificial Intelligence 11906 Subseries of Lecture Notes in Computer Science Series Editors Randy Goebel University of Alberta, Edmonton, Canada Yuzuru Tanaka Hokkaido University, Sapporo, Japan Wolfgang Wahlster DFKI and Saarland University, Saarbrücken, Germany Founding Editor Jörg Siekmann DFKI and Saarland University, Saarbrücken, Germany More information about this series at http://www.springer.com/series/1244 Ulf Brefeld • Elisa Fromont • Andreas Hotho • Arno Knobbe • Marloes Maathuis • Céline Robardet (Eds.) Machine Learning and Knowledge Discovery in Databases European Conference, ECML PKDD 2019 Würzburg, Germany, September 16–20, 2019 Proceedings, Part I 123 Editors Ulf Brefeld Elisa Fromont Leuphana University IRISA/Inria Lüneburg, Germany Rennes, France Andreas Hotho Arno Knobbe University of Würzburg Leiden University Würzburg, Germany Leiden, the Netherlands Marloes Maathuis Céline Robardet ETH Zurich Institut National des Sciences Appliquées Zurich, Switzerland Villeurbanne, France ISSN 0302-9743 ISSN 1611-3349 (electronic) Lecture Notes in Artificial Intelligence ISBN 978-3-030-46149-2 ISBN 978-3-030-46150-8 (eBook) https://doi.org/10.1007/978-3-030-46150-8 LNCS Sublibrary: SL7 – Artificial Intelligence © Springer Nature Switzerland AG 2020, corrected publication 2020 The chapter “Heavy-Tailed Kernels Reveal a Finer Cluster Structure in t-SNE Visualisations” is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/). For further details see license information in the chapter. This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
    [Show full text]
  • Dragi Kocev – Curriculum Vitae
    Jožef Stefan Institute Department of Knowledge Technologies, Jamova cesta 39, Ljubljana, Slovenia Dragi Kocev T +386 1 477 3639 u +386 1 477 3315 B [email protected] Curriculum Vitae Í kt.ijs.si/DragiKocev Work experience 2016- Researcher, Dept. of Knowledge Technologies, Jožef Stefan Institute 2014-2015 Visiting research fellow, Dept. of Informatics, Universita degli studi di Bari, Italy 2011-2014 Post-doctoral researcher, Dept. of Knowledge Technologies, Jožef Stefan Institute 2008-2011 Research assistant, Dept. of Knowledge Technologies, Jožef Stefan Institute Education 2011 PhD in Computer Science, IPS Jožef Stefan, Ljubljana, Slovenia Dissertation: Ensembles for predicting structured outputs 2005 BSc in Enginnering, Faculty of Electrical Engineering, Skopje, Macedonia Thesis: Inductive querying environment for learning predictive clustering trees Projects 2014-2017 MAESTRA: Learning from Massive, Incompletely annotated, and Structured Data, FP7 FET Open Xtrack, grant no. ICT-2013-612944; co-coordinator 2014-2016 HBP: Human Brain Project, FET Flagship, grant no. 604102 2008-2012 PHAGOSYS: Systems biology of phagosome formation and maturation - modulation by intracellular pathogens, FP7 STREP, grant no. HEALTH-F4-2008-223451 2005-2008 IQ: Inductive Queries for Mining Patterns and Models, FP6 STREP, grant no. IST- 2004-516169 2009-2012 Data mining for integrative data analysis in systemic biology, basic research project, ARRS, grant no. J2-2285 2013 Structured annotation, storage and retrieval of images and videos, bilateral research
    [Show full text]
  • Matija Piškorec
    Curriculum vitae PERSONAL INFORMATION Matija Piškorec Fancevljev prilaz 2, 10000 Zagreb, Croatia +385 91 516 7547 +385 91 516 7547 [email protected] matijapiskorec.github.io Gender Male | Date of birth 27 August 1986 WORK EXPERIENCE July 2019 onwards (expected to Research assistant at the Centre for Excellence, Zagreb, Croatia finish in July 2020) Faculty of Electrical Engineering and Computing Unska 3, Zagreb, Croatia Working on research project “Datacross”. My duties include statistical modeling on social network data and developing of bioinformatics web services. July 2013 – July 2019 Research assistant at the Ruder¯ Boškovic´ Institute, Zagreb, Croatia Ruder¯ Boškovic´ Institute Bijenickaˇ cesta 54, Zagreb, Croatia Working in the group of Tomislav Šmuc on problems considering complex networks, machine learning, data mining, knowledge discovery. Working on FP7 project “Foundational Research on Multilevel Complex Networks and Systems” (MULTIPLEX) and project of Croatian Science Foundation “Machine Learning Algorithms for Insightful Analysis od Complex Data Structures”. February 2012 – July 2013 Expert associate on FP7 project “Forecasting Financial Crises (FOC)” Ruder¯ Boškovic´ Institute Bijenickaˇ cesta 54, Zagreb, Croatia Collection of macro-level (country) data and design and implementation of forecasting model based on that data. Collaboration with Jožef Stefan Institute in Ljubljana in developing non- financial risk modeling methodology that will provides additional view on the build-up and emergence of financial crises. February 2012 – July 2013 Expert associate on FP7 project “e-Laboratory for Collaborative Interdisci- plinary Research in Data Mining and Data Intensive Sciences-Enlarged Eu- ropean Union (e-LICO)” Ruder¯ Boškovic´ Institute Bijenickaˇ cesta 54, Zagreb, Croatia My main task was to design and implement predictive model for execution-time estimation of data mining workflows.
    [Show full text]
  • 8. CURRICULUM VITAE and PUBLICATIONS PART 1 1A
    8. CURRICULUM VITAE AND PUBLICATIONS PART 1 1a. Personal details Full name Title First name Second name(s) Family name Associate Bernhard Markus Pfahringer Professor Present position Professor Organisation/Employer The University of Waikato Contact address Hillcrest Road Hillcrest Hamilton Post code 3240 Work telephone 07-838 4041 Mobile 021-027 021 03 Email [email protected] Personal website http://www.cs.waikato.ac.nz/˜bernhard (if applicable) 1b. Academic qualifications 1995, Dr.techn., Computer Science, Vienna University of Technology 1985, Dipl.-Ing., Computer Science, Vienna University of Technology 1c. Professional positions held 2015 - present: Professor, Computer Science Department, University of Waikato 2013 - present: Head of Department, Computer Science, University of Waikato 2007 - 2014: Associate Professor, Computer Science Department, University of Waikato 2000 - 2007: Senior Lecturer, Computer Science Department, University of Waikato 1997 - 2000: Research Fellow, Austrian Research Institute for AI 1996 - 1997: Postdoctoral Research Fellow, Computer Science, University of Waikato 1992 - 1998: Research Fellow, Austrian Research Institute for AI, Machine Learning 1986 - 1991: Research Assistant, Inst. for Medical Cybernetics and AI, University of Vienna 1985 - 1991: Research Associate, Austrian Research Institute for AI, Expert Systems 1d. Present research/professional speciality Machine Learning, data mining. 1e. Total years research experience: 24 years 1f. Professional distinctions and memberships (including honours,
    [Show full text]
  • Lecture Notes in Artificial Intelligence 6913
    Lecture Notes in Artificial Intelligence 6913 Subseries of Lecture Notes in Computer Science LNAI Series Editors Randy Goebel University of Alberta, Edmonton, Canada Yuzuru Tanaka Hokkaido University, Sapporo, Japan Wolfgang Wahlster DFKI and Saarland University, Saarbrücken, Germany LNAI Founding Series Editor Joerg Siekmann DFKI and Saarland University, Saarbrücken, Germany Dimitrios Gunopulos Thomas Hofmann Donato Malerba Michalis Vazirgiannis (Eds.) Machine Learning and Knowledge Discovery in Databases European Conference, ECML PKDD 2011 Athens, Greece, September 5-9, 2011 Proceedings, Part III 13 Series Editors Randy Goebel, University of Alberta, Edmonton, Canada Jörg Siekmann, University of Saarland, Saarbrücken, Germany Wolfgang Wahlster, DFKI and University of Saarland, Saarbrücken, Germany Volume Editors Dimitrios Gunopulos University of Athens, Greece E-mail: [email protected] Thomas Hofmann Google Switzerland GmbH, Zurich, Switzerland E-mail: [email protected] Donato Malerba University of Bari “Aldo Moro”, Bari, Italy E-mail: [email protected] Michalis Vazirgiannis Athens University of Economics and Business, Greece E-mail: [email protected] ISSN 0302-9743 e-ISSN 1611-3349 ISBN 978-3-642-23807-9 e-ISBN 978-3-642-23808-6 DOI 10.1007/978-3-642-23808-6 Springer Heidelberg Dordrecht London New York Library of Congress Control Number: Applied for CR Subject Classification (1998): I.2, H.2.8, H.2, H.3, G.3, J.1, I.7, F.2.2, F.4.1 LNCS Sublibrary: SL 7 – Artificial Intelligence © Springer-Verlag Berlin Heidelberg 2011 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks.
    [Show full text]