Data Platforms
Recommended publications
-
Redis and Memcached
Redis and Memcached. Speaker: Vladimir Zivkovic, Manager, IT. June 2019. Problem Scenario • Web site users wanting to access data extremely quickly (< 200ms) • Data being shared between different layers of the stack • Cache web page sessions • Research and test feasibility of using Redis as a solution for storing and retrieving data quickly • Load data into Redis to test ETL feasibility and performance • Goal: get sub-second response for API calls for retrieving data Why Redis • In-memory key-value store, with persistence • Open source • Written in C • “It can handle up to 2^32 keys, and was tested in practice to handle at least 250 million keys per instance.” - http://redis.io/topics/faq • Most popular key-value store - http://db-engines.com/en/ranking History • REmote DIctionary Server • Released in 2009 • Built in order to scale a website: http://lloogg.com/ • The web application of lloogg was an ajax app to show the site traffic in real time. It needed a DB handling fast writes and a fast “get latest N items” operation. Redis Data types • Strings • Lists • Sets • Sorted Sets • Hashes • Bitmaps • HyperLogLogs • Geospatial Indexes Redis protocol • redis[“key”] = “value” • Values can be strings, lists or sets • Push and pop elements (atomic) • Fetch arbitrary set and array elements • Sorting • Data is written to disk asynchronously Memory Footprint • An empty instance uses ~ 3MB of memory. • 1 million small key => string value pairs use ~ 85MB of memory. • 1 Million Keys => Hash value, representing an object with 5 fields, -
Changing the Game: Monthly Technology Briefs
Changing the Game: Monthly Technology Briefs, April 2011. Tablets and Smartphones: Levers of Disruptive Change. Read the Capgemini Chief Technology Officers’ Blog at www.capgemini.com/ctoblog. All 2010 shipment reports tell the same story - of an incredible increase in the shipments of both Smartphones and Tablets, and of a corresponding slowdown in the conventional PC business. Smartphone sales exceeded even the most optimistic forecasts of experts, with a 74 percent increase from the previous year – around a battle between Apple and Google Android for supremacy at the expense of traditional leaders Nokia and RIM BlackBerry. It was the same story for Tablets with 17.4 million units sold in 2010 led by Apple, but once again with Google Android in hot pursuit. Analyst predictions for shipments suggest that the tablet market will continue its exponential growth curve to the extent that even the usually cautious Gartner think that by 2013 there will be as many Tablets in use in an enterprise as PCs with a profound impact on the IT environment. On February 7, as part of the Gartner ‘First Thing Monday’ series under the title ‘The Digital Natives are Restless, The impending Revolt against the IT Nanny State’ Gartner analyst Jim Shepherd stated: “I am regularly hearing middle managers and even senior executives complaining bitterly about IT departments that are so focussed on the global rollout of some monolithic solution that they have no time for new and innovative technologies that could have an immediate impact on the business. -
Mysql Replication Tutorial
MySQL Replication Tutorial. Lars Thalmann, Technical lead, Replication, Backup, and Engine Technology. Mats Kindahl, Lead Developer, Replication Technology. MySQL Conference and Expo 2008. Concepts. MySQL Replication: Why? 1. High availability: possibility of fail-over 2. Load-balancing/scale-out: query multiple servers 3. Off-site processing: don’t disturb master. How? Snapshots (backup): 1. Client program mysqldump, with log coordinates 2. Using backup: InnoDB, NDB. Binary log: 1. Replication: asynchronous pushing to slave 2. Point-in-time recovery: roll-forward. Terminology. Master MySQL Server • Changes data • Has binlog turned on • Pushes binlog events to slave after slave has requested them. Slave MySQL Server • Main control point of replication • Asks master for replication log • Gets binlog event from master. Binary log • Log of everything executed • Divided into transactional components • Used for replication and point-in-time recovery. Terminology. Synchronous replication • A transaction is not committed until the data has been replicated (and applied) • Safer, but slower • This is available in MySQL Cluster. Asynchronous replication • A transaction is replicated after it has been committed • Faster, but you can in some cases lose transactions if master fails • Easy to set up between MySQL servers. Configuring Replication. Required configuration – my.cnf: Replication Master: log-bin, server_id. Replication Slave: server_id. Optional items in my.cnf – What -
Synopsys: Large Graph Analytics in the SAP HANA Database Through Summarization
SynopSys: Large Graph Analytics in the SAP HANA Database Through Summarization. Michael Rudolf (1), Marcus Paradies (1), Christof Bornhövd (2), Wolfgang Lehner (3). (1) SAP AG, Walldorf, Germany; (2) SAP Labs, LLC, Palo Alto, CA 94304, USA; (3) Database Technology Group, TU Dresden, Germany. ABSTRACT Graph-structured data is ubiquitous and with the advent of social networking platforms has recently seen a significant increase in popularity amongst researchers. However, also many business applications deal with this kind of data and can therefore benefit greatly from graph processing functionality offered directly by the underlying database. This paper summarizes the current state of graph data processing capabilities in the SAP HANA database and describes our efforts to enable large graph analytics in the context of our research project SynopSys. With powerful graph pattern matching support at the core, we envision OLAP-like evaluation functionality exposed to the user in the form of easy-to-apply graph summarization templates. By combining them, the user is able to produce concise summaries of large graph-structured datasets. We also point out open questions and challenges that we plan to tackle in the future developments on our way towards large graph analytics. -
Modeling and Analyzing Latency in the Memcached System
Modeling and Analyzing Latency in the Memcached System. Wenxue Cheng (1), Fengyuan Ren (1), Wanchun Jiang (2), Tong Zhang (1). (1) Tsinghua National Laboratory for Information Science and Technology, Beijing, China; (1) Department of Computer Science and Technology, Tsinghua University, Beijing, China; (2) School of Information Science and Engineering, Central South University, Changsha, China. March 27, 2017. Abstract: Memcached is a widely used in-memory caching solution in large-scale searching scenarios. The most pivotal performance metric in Memcached is latency, which is affected by various factors including the workload pattern, the service rate, the unbalanced load distribution and the cache miss ratio. To quantify the impact of each factor on latency, we establish a theoretical model for the Memcached system. Specifically, we formulate the unbalanced load distribution among Memcached servers by a set of probabilities, capture the burst and concurrent key arrivals at Memcached servers in the form of batching blocks, and add a cache miss processing stage. Based on this model, algebraic derivations are conducted to estimate latency in Memcached. The latency estimation is validated by intensive experiments. Moreover, we obtain a quantitative understanding of how much improvement in latency performance can be achieved by optimizing each factor and provide several useful recommendations for optimizing latency in Memcached. Keywords: Memcached, Latency, Modeling, Quantitative Analysis. 1 Introduction. Memcached [1] has been adopted in many large-scale websites, including Facebook, LiveJournal, Wikipedia, Flickr, Twitter and Youtube. In Memcached, a web request will generate hundreds of Memcached keys that will be further processed in the memory of parallel Memcached servers. With this parallel in-memory processing method, Memcached can extensively speed up and scale up searching applications [2]. -
Histcoroy Pyright for Online Information and Ordering of This and Other Manning Books, Please Visit Topwicws W.Manning.Com
Copyright. For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact Special Sales Department, Manning Publications Co., 20 Baldwin Road, PO Box 761, Shelter Island, NY 11964. ©2017 by Manning Publications Co. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps. Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine. Development editor: Cynthia Kane. Review editor: Aleksandar Dragosavljević. Technical development editor: Stan Bice. Project editors: Kevin Sullivan, David Novak. Copyeditor: Sharon Wilkey. Proofreader: Melody Dolab. Technical proofreader: Doug Warren. Typesetter and cover design: Marija Tudor. ISBN 9781617292576. Printed in the United States of America 1 2 3 4 5 6 7 8 9 10 – EBM – 22 21 20 19 18 17. Part 1. -
Dashboard Design for Real-Time Situation Awareness
Dashboard Design for Real-Time Situation Awareness. Stephen Few, author of Information Dashboard Design. Few if any recent trends in business information delivery have inspired as much enthusiasm as dashboards. When they work, they provide a powerful means to tame the beast of data overload. Despite their popularity, however, most dashboards live up to only a fraction of their potential. They fail, not because of poor technology—at least not primarily—but because of poor design. The more critical that information is to the well-being of the business, the more grievous is the failure, because the remedy is so readily available. The term “dashboard” refers to a single screen information display that is used to monitor what’s going on in some aspect of the business. The key word is “monitor.” A dashboard presents the key data that you must efficiently monitor to maintain awareness of what’s going on in your area of responsibility. Most dashboards are used to monitor information once a day, because more frequent use is unnecessary given the rate at which the information changes and speed at which responses must be made. Some jobs, however, require constant monitoring in real time, or close to it, because the activities that you track are happening right now and delays in responding can’t be tolerated. There is perhaps no better example of this type of dashboard than one that monitors the brisk and sometimes harried activities of a call center. Much like air traffic control systems or cockpits in airplanes, call center dashboards must be designed to support real-time “situation awareness.” They must grab your attention when it’s needed, they must make it easy to spot what’s most important in a screen full of data, and they must give you the means to understand what’s happening and respond without delay. -
Beyond Relational Databases
EXPERT ANALYSIS BY MARCOS ALBE, SUPPORT ENGINEER, PERCONA. Beyond Relational Databases: A Focus on Redis, MongoDB, and ClickHouse. Many of us use and love relational databases… until we try and use them for purposes which aren’t their strong point. Queues, caches, catalogs, unstructured data, counters, and many other use cases, can be solved with relational databases, but are better served by alternative options. In this expert analysis, we examine the goals, pros and cons, and the good and bad use cases of the most popular alternatives on the market, and look into some modern open source implementations. Beyond Relational Databases. Developers frequently choose the backend store for the applications they produce. Amidst dozens of options, buzzwords, industry preferences, and vendor offers, it’s not always easy to make the right choice… Even with a map! -
Business Intelligence: a Discussion on Platforms, Technologies, and Solutions
Business Intelligence: A Discussion on Platforms, Technologies, and Solutions. Tutorial, RCIS 2013-Paris May 29-31. Introduction • Presenters: – Noushin Ashrafi – Jean-Pierre Kuilboer • Time and Discussion Frame – 90 minutes • Audience can ask questions any time during presentation. 4/19/2013 RCIS 2013-Paris May 29-31. Tutorial overview: Knowledge Discovery from Data • The thrust of the tutorial is to examine Business Intelligence (BI) platforms to develop business analytics applications. • Specifically, we will address: – Requirements, development, and capabilities of BI. – Self-service, enterprise, and cloud BI. – Confusing BI concepts such as platform, infrastructure, technology and architecture. – Solutions by three well-known vendors: Microsoft, Tableau Software and IBM. Overview of Business Intelligence • BI is revolutionizing decision making and information technology across all industries. This phenomenon is largely due to the ever-increasing availability of data. • The explosive volumes of data are available in both structured and unstructured formats, and are analyzed and processed to become information within context, hence providing relevance and purpose to the decision making process. What is Business Intelligence? • BI is a content-free expression, so it means different things to different people. • BI is neither a product nor a service – BI refers to people, processes, technologies and practices used to support business decision making. What is Business Intelligence? • BI is an umbrella term that combines architectures, technology, analytical tools, applications, and methodologies to help transform data to information, to knowledge, to decisions, and finally to action. -
Two Node Mysql Cluster
Two Node MySQL Cluster. CONTENTS: 1.0 EXECUTIVE SUMMARY; 2.0 BUSINESS CHALLENGES; 3.0 MYSQL CLUSTER; 3.1 CLIENTS/APIS; 3.2 SQL NODE; 3.3 DATA NODE; 3.4 NDB MANAGEMENT NODE; 3.5 CHALLENGES; 3.6 SOLUTION; 4.0 REFERENCES. 1.0 EXECUTIVE SUMMARY This white paper describes the challenges involved in deploying a 2-node highly available MySQL Cluster, with a proposed solution. For the sake of users reading this document, it also describes in brief the main components of the MySQL Cluster which are necessary to understand the paper overall. The solution relies on the Linux HA framework (Heartbeat/Pacemaker), so the white paper is best understood with knowledge of the Linux HA framework. 2.0 BUSINESS CHALLENGES The MySQL Cluster demands at least 4 nodes to be present for deploying a highly available MySQL database cluster. The typical configuration of any enterprise application is a 2-node solution (Active-Standby mode or Active-Active mode). The challenge lies in fitting the MySQL Cluster nodes into the 2 nodes offering the application services and making it work in that configuration with no single point of failure. 3.0 MYSQL CLUSTER The intent of this section is to briefly mention the important actors and their roles in the overall MySQL Cluster. For more information the reader can refer to the MySQL reference documents from its official site (http://dev.mysql.com/doc/index.html). MySQL Cluster is a technology that enables clustering of in-memory databases in a “shared-nothing system”. -
Data Platforms Map from 451 Research
[Figure: Data Platforms Map, January 2016. Grid columns 1–6, rows A and B. Labeled regions include the relational zone and the non-relational zone, with edges pointing towards enterprise search, e-discovery, and SIEM. Key: general purpose, specialist analytic.] -
SAP HANA Client Interface Programming Reference for SAP HANA Platform Company
PUBLIC. SAP HANA Platform 2.0 SPS 04. Document Version: 1.1 – 2019-10-31. SAP HANA Client Interface Programming Reference for SAP HANA Platform. © 2019 SAP SE or an SAP affiliate company. All rights reserved. Content: 1 SAP HANA Client Interface Programming Reference. 2 Configuring Clients for Secure Connections. 2.1 Server Certificate Authentication. 2.2 Mutual Authentication. Implement Mutual Authentication. 2.3 Configuring the Client for Client-Side Encryption and LDAP. 3 Connecting to SAP HANA Databases and Servers. 3.1 Setting Session-Specific Client Information. 3.2 Use the User Store (hdbuserstore). 4 Client Support for Active/Active (Read Enabled). 4.1 Connecting Using Active/Active (Read Enabled). Client Requirements For A Takeover. 4.2 Hint-Based Statement Routing for Active/Active (Read Enabled). 4.3 Forced Statement Routing to a Site for Active/Active (Read Enabled). Implement Forced Statement Routing to a Site for Active/Active