Snailtrail for Latency-Sensitive Workloads
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Network Traffic Profiling and Anomaly Detection for Cyber Security
Network traffic profiling and anomaly detection for cyber security Laurens D’hooge Student number: 01309688 Supervisors: Prof. dr. ir. Filip De Turck, dr. ir. Tim Wauters Counselors: Prof. dr. Bruno Volckaert, dr. ir. Tim Wauters A dissertation submitted to Ghent University in partial fulfilment of the requirements for the degree of Master of Science in Information Engineering Technology Academic year: 2017-2018 Acknowledgements This thesis is the result of 4 months work and I would like to express my gratitude towards the people who have guided me throughout this process. First and foremost I’d like to thank my thesis advisors prof. dr. Bruno Volckaert and dr. ir. Tim Wauters. By virtue of their knowledge and clear communication, I was able to maintain a clear target. Secondly I would like to thank prof. dr. ir. Filip De Turck for providing me the opportunity to conduct research in this field with the IDLab research group. Special thanks to Andres Felipe Ocampo Palacio and dr. Marleen Denert are in order as well. Mr. Ocampo’s Phd research into big data processing for network traffic and the resulting framework are an integral part of this thesis. Ms. Denert has been the go-to member of the faculty staff for general advice and administrative dealings. The final token of gratitude I’d like to extend to my family and friends for their continued support during this process. Laurens D’hooge Network traffic profiling and anomaly detection for cyber security Laurens D’hooge Supervisor(s): prof. dr. ir. Filip De Turck, dr. ir. Tim Wauters Abstract— This article is a short summary of the research findings of a creation of APT2. -
Unravel Data Systems Version 4.5
UNRAVEL DATA SYSTEMS VERSION 4.5 Component name Component version name License names jQuery 1.8.2 MIT License Apache Tomcat 5.5.23 Apache License 2.0 Tachyon Project POM 0.8.2 Apache License 2.0 Apache Directory LDAP API Model 1.0.0-M20 Apache License 2.0 apache/incubator-heron 0.16.5.1 Apache License 2.0 Maven Plugin API 3.0.4 Apache License 2.0 ApacheDS Authentication Interceptor 2.0.0-M15 Apache License 2.0 Apache Directory LDAP API Extras ACI 1.0.0-M20 Apache License 2.0 Apache HttpComponents Core 4.3.3 Apache License 2.0 Spark Project Tags 2.0.0-preview Apache License 2.0 Curator Testing 3.3.0 Apache License 2.0 Apache HttpComponents Core 4.4.5 Apache License 2.0 Apache Commons Daemon 1.0.15 Apache License 2.0 classworlds 2.4 Apache License 2.0 abego TreeLayout Core 1.0.1 BSD 3-clause "New" or "Revised" License jackson-core 2.8.6 Apache License 2.0 Lucene Join 6.6.1 Apache License 2.0 Apache Commons CLI 1.3-cloudera-pre-r1439998 Apache License 2.0 hive-apache 0.5 Apache License 2.0 scala-parser-combinators 1.0.4 BSD 3-clause "New" or "Revised" License com.springsource.javax.xml.bind 2.1.7 Common Development and Distribution License 1.0 SnakeYAML 1.15 Apache License 2.0 JUnit 4.12 Common Public License 1.0 ApacheDS Protocol Kerberos 2.0.0-M12 Apache License 2.0 Apache Groovy 2.4.6 Apache License 2.0 JGraphT - Core 1.2.0 (GNU Lesser General Public License v2.1 or later AND Eclipse Public License 1.0) chill-java 0.5.0 Apache License 2.0 Apache Commons Logging 1.2 Apache License 2.0 OpenCensus 0.12.3 Apache License 2.0 ApacheDS Protocol -
Optimizing Resource Utilization in Distributed Computing Systems For
THESE` DE DOCTORAT DE L’ETABLISSEMENT´ UNIVERSITE´ BOURGOGNE FRANCHE-COMTE´ PREPAR´ EE´ A` L’UNIVERSITE´ DE FRANCHE-COMTE´ Ecole´ doctorale n°37 Sciences Pour l’Ingenieur´ et Microtechniques Doctorat d’Informatique par ANTHONY NASSAR Optimizing Resource Utilization in Distributed Computing Systems for Automotive Applications Optimisation de l’utilisation des ressources dans les systemes` informatiques distribues´ pour les applications automobiles These` present´ ee´ et soutenue publiquement le 04-02-2021 a` Belfort, devant le Jury compose´ de : MR CERIN CHRISTOPHE Professeur a` l’Universite´ Sorbonne Paris Nord President´ MR CHBEIR RICHARD Professeur a` l’Universite´ de Pau et des Pays de l’Adour Rapporteur MME BENBERNOU SALIMA Professeur a` l’Universite´ Paris-Descartes Rapporteur MR MOSTEFAOUI AHMED Maˆıtre de conferences´ a` l’Universite´ de Franche-Comte´ Directeur de these` MR DESSABLES FRANC¸ OIS Ingenieur´ chez Groupe PSA Codirecteur de these` DOCTORAL THESIS OF THE UNIVERSITY BOURGOGNE FRANCHE-COMTE´ INSTITUTION PREPARED AT UNIVERSITE´ DE FRANCHE-COMTE´ Doctoral school n°37 Engineering Sciences and Microtechnologies Computer Science Doctorate by ANTHONY NASSAR Optimizing Resource Utilization in Distributed Computing Systems for Automotive Applications Optimisation de l’utilisation des ressources dans les systemes` informatiques distribues´ pour les applications automobiles Thesis presented and publicly defended in Belfort, on 04-02-2021 Composition of the Jury : CERIN CHRISTOPHE Professor at Universite´ Sorbonne Paris Nord President -
60 Recipes for Apache Cloudstack
60 Recipes for Apache CloudStack Sébastien Goasguen 60 Recipes for Apache CloudStack by Sébastien Goasguen Copyright © 2014 Sébastien Goasguen. All rights reserved. Printed in the United States of America. Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472. O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://safaribooksonline.com). For more information, contact our corporate/ institutional sales department: 800-998-9938 or [email protected]. Editor: Brian Anderson Indexer: Ellen Troutman Zaig Production Editor: Matthew Hacker Cover Designer: Karen Montgomery Copyeditor: Jasmine Kwityn Interior Designer: David Futato Proofreader: Linley Dolby Illustrator: Rebecca Demarest September 2014: First Edition Revision History for the First Edition: 2014-08-22: First release See http://oreilly.com/catalog/errata.csp?isbn=9781491910139 for release details. Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of O’Reilly Media, Inc. 60 Recipes for Apache CloudStack, the image of a Virginia Northern flying squirrel, and related trade dress are trademarks of O’Reilly Media, Inc. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly Media, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps. While every precaution has been taken in the preparation of this book, the publisher and authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein. -
Hortonworks Data Platform May 29, 2015
docs.hortonworks.com Hortonworks Data Platform May 29, 2015 Hortonworks Data Platform : Data Integration Services with HDP Copyright © 2012-2015 Hortonworks, Inc. Some rights reserved. The Hortonworks Data Platform, powered by Apache Hadoop, is a massively scalable and 100% open source platform for storing, processing and analyzing large volumes of data. It is designed to deal with data from many sources and formats in a very quick, easy and cost-effective manner. The Hortonworks Data Platform consists of the essential set of Apache Hadoop projects including MapReduce, Hadoop Distributed File System (HDFS), HCatalog, Pig, Hive, HBase, Zookeeper and Ambari. Hortonworks is the major contributor of code and patches to many of these projects. These projects have been integrated and tested as part of the Hortonworks Data Platform release process and installation and configuration tools have also been included. Unlike other providers of platforms built using Apache Hadoop, Hortonworks contributes 100% of our code back to the Apache Software Foundation. The Hortonworks Data Platform is Apache-licensed and completely open source. We sell only expert technical support, training and partner-enablement services. All of our technology is, and will remain free and open source. Please visit the Hortonworks Data Platform page for more information on Hortonworks technology. For more information on Hortonworks services, please visit either the Support or Training page. Feel free to Contact Us directly to discuss your specific needs. Except where otherwise noted, this document is licensed under Creative Commons Attribution ShareAlike 3.0 License. http://creativecommons.org/licenses/by-sa/3.0/legalcode ii Hortonworks Data Platform May 29, 2015 Table of Contents 1. -
Performance Prediction of Data Streams on High-Performance
Gautam and Basava Hum. Cent. Comput. Inf. Sci. (2019) 9:2 https://doi.org/10.1186/s13673-018-0163-4 RESEARCH Open Access Performance prediction of data streams on high‑performance architecture Bhaskar Gautam* and Annappa Basava *Correspondence: bhaskar.gautam2494@gmail. Abstract com Worldwide sensor streams are expanding continuously with unbounded velocity in Department of Computer Science and Engineering, volume, and for this acceleration, there is an adaptation of large stream data processing National Institute system from the homogeneous to rack-scale architecture which makes serious con- of Technology Karnataka, cern in the domain of workload optimization, scheduling, and resource management Surathkal, India algorithms. Our proposed framework is based on providing architecture independent performance prediction model to enable resource adaptive distributed stream data processing platform. It is comprised of seven pre-defned domain for dynamic data stream metrics including a self-driven model which tries to ft these metrics using ridge regularization regression algorithm. Another signifcant contribution lies in fully-auto- mated performance prediction model inherited from the state-of-the-art distributed data management system for distributed stream processing systems using Gaussian processes regression that cluster metrics with the help of dimensionality reduction algorithm. We implemented its base on Apache Heron and evaluated with proposed Benchmark Suite comprising of fve domain-specifc topologies. To assess the pro- posed methodologies, we forcefully ingest tuple skewness among the benchmark- ing topologies to set up the ground truth for predictions and found that accuracy of predicting the performance of data streams increased up to 80.62% from 66.36% along with the reduction of error from 37.14 to 16.06%. -
ACNA2011: Apache Rave: Enterprise Social Networking out of The
Apache Rave Enterprise Social Networking Out Of The Box Ate Douma, Hippo B.V. Matt Franklin, The MITRE Corporation November 9, 2011 Overview ● About us ● What is Apache Rave? ● History ● Projects and people behind Rave ● The Project ● Demo ● Goals & Roadmap ● More demos and examples ● Other projects using Rave ● Participate Apache Rave: Enterprise Social Networking Out Of The Box About us Ate Douma Matt Franklin Chief Architect at Lead Software Engineer at Hippo B.V. The MITRE Corporation's Center of Open source CMS and Portal Software Information & Technology Apache Champion, Mentor and Committer Apache PPMC Member and Committer of Apache Rave of Apache Rave [email protected] [email protected] [email protected] [email protected] [email protected] twitter: @atedouma twitter: @mattfranklin Apache Rave: Enterprise Social Networking Out Of The Box What is Apache Rave? Apache Rave (incubating) is a lightweight and extensible Web and Social Mashup engine, to host, serve and aggregate Gadgets, Widgets and general (social) network and web services with a highly customizable Web 2.0 friendly front-end. ● Targets Enterprise-level intranet, extranet, portal, web and mobile sites ● Can be used 'out-of-the-box' or as an embeddable engine ● Transparent integration and usage of OpenSocial Gadgets, W3C Widgets, …, ● Built upon a highly extensible and pluggable component architecture ● Will enhance this with context-aware cross-component communication, collaboration and content integration features ● Leverages latest/open standards and related open source -
Evaluating the Impact of Streaming Systems Design on Application Performance Alessio Pagliari
Evaluating the impact of streaming systems design on application performance Alessio Pagliari To cite this version: Alessio Pagliari. Evaluating the impact of streaming systems design on application performance. Data Structures and Algorithms [cs.DS]. Université Côte d’Azur, 2021. English. NNT : 2021COAZ4011. tel-03273377 HAL Id: tel-03273377 https://tel.archives-ouvertes.fr/tel-03273377 Submitted on 29 Jun 2021 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. THÈSE DE DOCTORAT Évaluer l'impact de la conception des systèmes de streaming sur la performance des applications Alessio PAGLIARI Laboratoire d’Informatique, Signaux et Systèmes de Sophia Antipolis (I3S) Présentée en vue de l’obtention Devant le jury, composé de : du grade de docteur en Informatique Jean-Marc Pierson, Professeur, Université Paul Sabatier Toulouse 3 d’Université Côte d’Azur Guillaume Pierre, Professeur, Université de Rennes 1 Pietro Michiardi, Professeur, Eurecom Dirigée par : Fabrice Huet / Fabrice Huet, Professeur, Université Côte d’Azur Guillaume Urvoy-Keller, Professeur, -
Apache Samza
Apache Samza Martin Kleppmann Definition vehicles, or the writes of records to a database. Apache Samza is an open source frame- Stream processing jobs are long- work for distributed processing of high- running processes that continuously volume event streams. Its primary design consume one or more event streams, goal is to support high throughput for a invoking some application logic on wide range of processing patterns, while every event, producing derived output providing operational robustness at the streams, and potentially writing output massive scale required by Internet com- to databases for subsequent querying. panies. Samza achieves this goal through While a batch process or a database a small number of carefully designed ab- query typically reads the state of a stractions: partitioned logs for messag- dataset at one point in time, and then ing, fault-tolerant local state, and cluster- finishes, a stream processor is never based task scheduling. finished: it continually awaits the arrival of new events, and it only shuts down when terminated by an administrator. Many tasks can be naturally ex- Overview pressed as stream processing jobs, for example: Stream processing is playing an increas- • aggregating occurrences of events, ingly important part of the data man- e.g., counting how many times a agement needs of many organizations. particular item has been viewed; Event streams can represent many kinds • computing the rate of certain events, of data, for example, the activity of users e.g., for system diagnostics, report- on a website, the movement of goods or ing, and abuse prevention; 1 2 Martin Kleppmann • enriching events with information the scalability of Samza is directly at- from a database, e.g., extending user tributable to the choice of these founda- click events with information about tional abstractions. -
Real-Time Stream Processing for Big Data
it – Information Technology 2016; 58(4): 186–194 DE GRUYTER OLDENBOURG Special Issue Wolfram Wingerath*, Felix Gessert, Steffen Friedrich, and Norbert Ritter Real-time stream processing for Big Data DOI 10.1515/itit-2016-0002 1 Introduction Received January 15, 2016; accepted May 2, 2016 Abstract: With the rise of the web 2.0 and the Internet of Through technological advance and increasing connec- things, it has become feasible to track all kinds of infor- tivity between people and devices, the amount of data mation over time, in particular fine-grained user activi- available to (web) companies, governments and other or- ties and sensor data on their environment and even their ganisations is constantly growing. The shift towards more biometrics. However, while efficiency remains mandatory dynamic and user-generated content in the web and the for any application trying to cope with huge amounts of omnipresence of smart phones, wearables and other mo- data, only part of the potential of today’s Big Data repos- bile devices, in particular, have led to an abundance of in- itories can be exploited using traditional batch-oriented formation that are only valuable for a short time and there- approaches as the value of data often decays quickly and fore have to be processed immediately. Companies like high latency becomes unacceptable in some applications. Amazon and Netflix have already adapted and are mon- In the last couple of years, several distributed data pro- itoring user activity to optimise product or video recom- cessing systems have emerged that deviate from the batch- mendations for the current user context. -
A Novel Cloud Broker-Based Resource Elasticity Management and Pricing for Big Data Streaming Applications
A Novel Cloud Broker-based Resource Elasticity Management and Pricing for Big Data Streaming Applications by Olubisi A. Runsewe Thesis submitted to the Faculty of Graduate and Postdoctoral Studies in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electronic Business School of Electrical Engineering and Computer Science Faculty of Engineering University of Ottawa c Olubisi A. Runsewe, Ottawa, Canada, 2019 Abstract The pervasive availability of streaming data from various sources is driving todays' enterprises to acquire low-latency big data streaming applications (BDSAs) for extracting useful information. In parallel, recent advances in technology have made it easier to collect, process and store these data streams in the cloud. For most enterprises, gaining insights from big data is immensely important for maintaining competitive advantage. However, majority of enterprises have difficulty managing the multitude of BDSAs and the complex issues cloud technologies present, giving rise to the incorporation of cloud service brokers (CSBs). Generally, the main objective of the CSB is to maintain the heterogeneous quality of service (QoS) of BDSAs while minimizing costs. To achieve this goal, the cloud, al- though with many desirable features, exhibits major challenges | resource prediction and resource allocation | for CSBs. First, most stream processing systems allocate a fixed amount of resources at runtime, which can lead to under- or over-provisioning as BDSA demands vary over time. Thus, obtaining optimal trade-off between QoS violation and cost requires accurate demand prediction methodology to prevent waste, degradation or shutdown of processing. Second, coordinating resource allocation and pricing decisions for self-interested BDSAs to achieve fairness and efficiency can be complex. -
Configuring Data Stream Processing
Numéro National de Thèse : 2019LYSEN050 THESE de DOCTORAT DE L’UNIVERSITE DE LYON opérée par l’École Normale Supérieure de Lyon École Doctorale N◦512 Ecole Doctorale en Informatique et Mathématiques de Lyon Discipline : Informatique Soutenue publiquement le 23/09/2019, par : Alexandre DA SILVA VEITH Quality of Service Aware Mechanisms for (Re)Configuring Data Stream Processing Applications on Highly Distributed Infrastructure Mécanismes prenant en compte la qualité de service pour la (re)configuration d’applications de traitement de flux de données sur une infrastructure hautement distribuée Devant le jury composé de : Omer RANA Professeur Université de Cardiff (UK) Rapporteur Patricia STOLF Maître de Conférences Université Paul Sabatier Rapporteure Hélène COULLON Maître de Conférences IMT Atlantique Examinatrice Frédéric DESPREZ Directeur de Recherche Inria Examinateur Guillaume PIERRE Professeur Université Rennes 1 Examinateur Laurent LEFEVRE Chargé de Recherche Inria Directeur Marcos DIAS DE ASSUNCAO Chargé de Recherche Inria Co-encadrant ii Contents Acknowledgments vii French Abstract ix 1 Introduction 1 1.1 Challenges in Data Stream Processing (DSP) Applications Deployment . .3 1.1.1 Challenges in Edge Computing . .4 1.1.2 Challenges in Cloud Computing . .4 1.1.3 Challenges in DSP Operator Placement . .4 1.1.4 Challenges in DSP Application Reconfiguration . .4 1.2 Research Problem and Objectives . .5 1.3 Evaluation Methodology . .5 1.4 Thesis Contribution . .6 1.5 Thesis Organisation . .6 2 State-of-the-art and Positioning 9 2.1 Introduction . .9 2.2 Data Stream Processing Architecture and Elasticity . 10 2.2.1 Online Data Processing Architecture . 10 2.2.2 Data Streams and Models .