What's Really New with Newsql?
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Redis and Memcached
Redis and Memcached Speaker: Vladimir Zivkovic, Manager, IT June, 2019 Problem Scenario • Web Site users wanting to access data extremely quickly (< 200ms) • Data being shared between different layers of the stack • Cache a web page sessions • Research and test feasibility of using Redis as a solution for storing and retrieving data quickly • Load data into Redis to test ETL feasibility and Performance • Goal - get sub-second response for API calls for retrieving data !2 Why Redis • In-memory key-value store, with persistence • Open source • Written in C • It can handle up to 2^32 keys, and was tested in practice to handle at least 250 million of keys per instance.” - http://redis.io/topics/faq • Most popular key-value store - http://db-engines.com/en/ranking !3 History • REmote DIctionary Server • Released in 2009 • Built in order to scale a website: http://lloogg.com/ • The web application of lloogg was an ajax app to show the site traffic in real time. Needed a DB handling fast writes, and fast ”get latest N items” operation. !4 Redis Data types • Strings • Bitmaps • Lists • Hyperlogs • Sets • Geospatial Indexes • Sorted Sets • Hashes !5 Redis protocol • redis[“key”] = “value” • Values can be strings, lists or sets • Push and pop elements (atomic) • Fetch arbitrary set and array elements • Sorting • Data is written to disk asynchronously !6 Memory Footprint • An empty instance uses ~ 3MB of memory. • For 1 Million small Keys => String Value pairs use ~ 85MB of memory. • 1 Million Keys => Hash value, representing an object with 5 fields, -
Mysql Replication Tutorial
MySQL Replication Tutorial Lars Thalmann Technical lead Replication, Backup, and Engine Technology Mats Kindahl Lead Developer Replication Technology MySQL Conference and Expo 2008 Concepts 3 MySQL Replication Why? How? 1. High Availability Snapshots (Backup) Possibility of fail-over 1. Client program mysqldump 2. Load-balancing/Scale- With log coordinates out 2. Using backup Query multiple servers InnoDB, NDB 3. Off-site processing Don’t disturb master Binary log 1. Replication Asynchronous pushing to slave 2. Point-in-time recovery Roll-forward Terminology Master MySQL Server • Changes data • Has binlog turned on Master • Pushes binlog events to slave after slave has requested them MySQL Server Slave MySQL Server • Main control point of replication • Asks master for replication log Replication • Gets binlog event from master MySQL Binary log Server • Log of everything executed Slave • Divided into transactional components • Used for replication and point-in-time recovery Terminology Synchronous replication Master • A transaction is not committed until the data MySQL has been replicated (and applied) Server • Safer, but slower • This is available in MySQL Cluster Replication Asynchronous replication • A transaction is replicated after it has been committed MySQL Server • Faster, but you can in some cases loose transactions if master fails Slave • Easy to set up between MySQL servers Configuring Replication Required configuration – my.cnf Replication Master log-bin server_id Replication Slave server_id Optional items in my.cnf – What -
Modeling and Analyzing Latency in the Memcached System
Modeling and Analyzing Latency in the Memcached system Wenxue Cheng1, Fengyuan Ren1, Wanchun Jiang2, Tong Zhang1 1Tsinghua National Laboratory for Information Science and Technology, Beijing, China 1Department of Computer Science and Technology, Tsinghua University, Beijing, China 2School of Information Science and Engineering, Central South University, Changsha, China March 27, 2017 abstract Memcached is a widely used in-memory caching solution in large-scale searching scenarios. The most pivotal performance metric in Memcached is latency, which is affected by various factors including the workload pattern, the service rate, the unbalanced load distribution and the cache miss ratio. To quantitate the impact of each factor on latency, we establish a theoretical model for the Memcached system. Specially, we formulate the unbalanced load distribution among Memcached servers by a set of probabilities, capture the burst and concurrent key arrivals at Memcached servers in form of batching blocks, and add a cache miss processing stage. Based on this model, algebraic derivations are conducted to estimate latency in Memcached. The latency estimation is validated by intensive experiments. Moreover, we obtain a quantitative understanding of how much improvement of latency performance can be achieved by optimizing each factor and provide several useful recommendations to optimal latency in Memcached. Keywords Memcached, Latency, Modeling, Quantitative Analysis 1 Introduction Memcached [1] has been adopted in many large-scale websites, including Facebook, LiveJournal, Wikipedia, Flickr, Twitter and Youtube. In Memcached, a web request will generate hundreds of Memcached keys that will be further processed in the memory of parallel Memcached servers. With this parallel in-memory processing method, Memcached can extensively speed up and scale up searching applications [2]. -
View Company Overview
Real Business Relies on MariaDB™ About MariaDB Milestones MariaDB frees companies from the costs, constraints and complexity of proprietary databases, enabling them to reinvest in what matters most – rapidly developing innovative, Oct 2014 MariaDB customer-facing applications. MariaDB uses pluggable, purpose-built storage engines to Corporation formed support workloads that previously required a variety of specialized databases. With complexity and constraints eliminated, enterprises can now depend on a single complete database for all their needs, whether on commodity hardware or their cloud of choice. Deployed in minutes for transactional or analytical use cases, MariaDB delivers unmatched operational agility without Jan 2015 MariaDB MaxScale 1.0 sacrificing key enterprise features including real ACID compliance and full SQL. Trusted by organizations such as Deutsche Bank, DBS Bank, Nasdaq, Red Hat, The Home Depot, ServiceNow and Verizon – MariaDB meets the same core requirements as proprietary databases Oct 2015 MariaDB at a fraction of the cost. No wonder it’s the fastest growing open source database. Server 10.1 MariaDB at a Glance Dec 2016 MariaDB ColumnStore 1.0 Tens of millions of users worldwide Amount saved by Fastest growing Growing partner $ M 70+ ecosystem 9 migrating from April 2017 1st user 1 database Oracle conference: M|17 Customers May 2017 MariaDB TX 2.0 MariaDB Solutions MariaDB Services MariaDB TX Remote DBA Migration Services MariaDB AX Enterprise Architect Consulting Nov 2017 MariaDB AX Technical Support Training -
Mariadb / Mysql for Web Developers
www.fromdual.com MariaDB / MySQL for Web Developers Web Developer Congress 2020, remote Oli Sennhauser Senior MariaDB & MySQL Consultant at FromDual GmbH https://www.fromdual.com/presentations 1 / 27 About FromDual GmbH www.fromdual.com Support Consulting remote-DBA Training 2 / 27 Contents www.fromdual.com MariaDB / MySQL for Web Developers ➢ Databases ➢ Connecting to the database ➢ Basic database queries (SELECT) ➢ Changing Data (DML) ➢ Transactions ➢ Error Handling and Debugging ➢ Joining Tables ➢ Indexing 3 / 27 What are databases for? www.fromdual.com ● Primarily: Relational DBMS (RDBMS) ● Storing Business Information: ● CRM, ERP, Accounting, Shop, Booking, etc. ● What are they NOT for (non optimal)? ● Logs → Files, Logstash ● Images, PDFs, huge texts → Filer, Solr ● Trash → Waste bin 4 / 27 Different types of databases www.fromdual.com ● Flat files, CSV, ISAM ● Hierarchical database ● Relational databases (RDBMS) ● Network databases ● Object Oriented databases (OODBMS) ● Object Relational DBMS (ORDBMS) ● Graph databases ● Column Stores (MariaDB CS) ● "Document" Stores (JSON, MongoDB) ● Wide Column Stores (Cassandra, HBase) 5 / 27 Common Relational DBMS www.fromdual.com ● MariaDB ● more in the Web-Client-Server field (LAMP) ● MySQL ● more in the Web-Client-Server field (LAMP) ● PostgreSQL ● more in the fat-Client-Server Business Software field ● SQLite ● Not a real "Client-Server-DBMS" → Library ● Embedded DBMS (Industry, Firefox, etc.) 6 / 27 Connection to the DBMS www.fromdual.com ● GUI (MySQL Workbench, HeidiSQL) ● CLI (mariadb, -
Let's Talk About Storage & Recovery Methods for Non-Volatile Memory
Let’s Talk About Storage & Recovery Methods for Non-Volatile Memory Database Systems Joy Arulraj Andrew Pavlo Subramanya R. Dulloor [email protected] [email protected] [email protected] Carnegie Mellon University Carnegie Mellon University Intel Labs ABSTRACT of power, the DBMS must write that data to a non-volatile device, The advent of non-volatile memory (NVM) will fundamentally such as a SSD or HDD. Such devices only support slow, bulk data change the dichotomy between memory and durable storage in transfers as blocks. Contrast this with volatile DRAM, where a database management systems (DBMSs). These new NVM devices DBMS can quickly read and write a single byte from these devices, are almost as fast as DRAM, but all writes to it are potentially but all data is lost once power is lost. persistent even after power loss. Existing DBMSs are unable to take In addition, there are inherent physical limitations that prevent full advantage of this technology because their internal architectures DRAM from scaling to capacities beyond today’s levels [46]. Using are predicated on the assumption that memory is volatile. With a large amount of DRAM also consumes a lot of energy since it NVM, many of the components of legacy DBMSs are unnecessary requires periodic refreshing to preserve data even if it is not actively and will degrade the performance of data intensive applications. used. Studies have shown that DRAM consumes about 40% of the To better understand these issues, we implemented three engines overall power consumed by a server [42]. in a modular DBMS testbed that are based on different storage Although flash-based SSDs have better storage capacities and use management architectures: (1) in-place updates, (2) copy-on-write less energy than DRAM, they have other issues that make them less updates, and (3) log-structured updates. -
Beyond Relational Databases
EXPERT ANALYSIS BY MARCOS ALBE, SUPPORT ENGINEER, PERCONA Beyond Relational Databases: A Focus on Redis, MongoDB, and ClickHouse Many of us use and love relational databases… until we try and use them for purposes which aren’t their strong point. Queues, caches, catalogs, unstructured data, counters, and many other use cases, can be solved with relational databases, but are better served by alternative options. In this expert analysis, we examine the goals, pros and cons, and the good and bad use cases of the most popular alternatives on the market, and look into some modern open source implementations. Beyond Relational Databases Developers frequently choose the backend store for the applications they produce. Amidst dozens of options, buzzwords, industry preferences, and vendor offers, it’s not always easy to make the right choice… Even with a map! !# O# d# "# a# `# @R*7-# @94FA6)6 =F(*I-76#A4+)74/*2(:# ( JA$:+49>)# &-)6+16F-# (M#@E61>-#W6e6# &6EH#;)7-6<+# &6EH# J(7)(:X(78+# !"#$%&'( S-76I6)6#'4+)-:-7# A((E-N# ##@E61>-#;E678# ;)762(# .01.%2%+'.('.$%,3( @E61>-#;(F7# D((9F-#=F(*I## =(:c*-:)U@E61>-#W6e6# @F2+16F-# G*/(F-# @Q;# $%&## @R*7-## A6)6S(77-:)U@E61>-#@E-N# K4E-F4:-A%# A6)6E7(1# %49$:+49>)+# @E61>-#'*1-:-# @E61>-#;6<R6# L&H# A6)6#'68-# $%&#@:6F521+#M(7#@E61>-#;E678# .761F-#;)7-6<#LNEF(7-7# S-76I6)6#=F(*I# A6)6/7418+# @ !"#$%&'( ;H=JO# ;(\X67-#@D# M(7#J6I((E# .761F-#%49#A6)6#=F(*I# @ )*&+',"-.%/( S$%=.#;)7-6<%6+-# =F(*I-76# LF6+21+-671># ;G';)7-6<# LF6+21#[(*:I# @E61>-#;"# @E61>-#;)(7<# H618+E61-# *&'+,"#$%&'$#( .761F-#%49#A6)6#@EEF46:1-# -
The Business Case for In-Memory Databases
THE BUSINESS CASE FOR IN-MEMORY DATABASES By Elliot King, PhD Research Fellow, Lattanze Center for Information Value Loyola University Maryland Abstract Creating a true real-time enterprise has long been a goal for many organizations. The efficient use of appropriate enterprise information is always a central element of that vision. Enabling organizations to operate in real-time requires the ability to access data without delay and process transactions immediately and efficiently. In-memory databases, (IMDB) which offer much faster I/O than on-disk database technology deliver on the promise of real-time access to data. Case studies demonstrate the value of real-time access to data provided by in-memory database systems. Organizations are increasingly recognizing the value of incorporating real- time data access with appropriate applications. In-memory databases, an established technology, have traditionally been used in telecommunications and financial applications. Now they are being successfully deployed in other applications. The overall increases in data volumes which can slow down on-disk database management systems have driven this shift. Additionally, increased computer processing power and main memory capacities have facilitated more ubiquitous in-memory databases which can either standalone or serve as a cache for on-disk databases—thus creating a hybrid infrastructure. Introduction: The Real-Time Enterprise For the last decade, the real-time enterprise has been a strategic objective for many organizations and has been the stimulus for significant investment in IT. Building a real-time enterprise entails implementing access to the most timely and up-to-date data, reducing or eliminating delays in transaction processing and accelerating decision- making at all levels of an organization. -
Failures in DBMS
Chapter 11 Database Recovery 1 Failures in DBMS Two common kinds of failures StSystem filfailure (t)(e.g. power outage) ‒ affects all transactions currently in progress but does not physically damage the data (soft crash) Media failures (e.g. Head crash on the disk) ‒ damagg()e to the database (hard crash) ‒ need backup data Recoveryyp scheme responsible for handling failures and restoring database to consistent state 2 Recovery Recovering the database itself Recovery algorithm has two parts ‒ Actions taken during normal operation to ensure system can recover from failure (e.g., backup, log file) ‒ Actions taken after a failure to restore database to consistent state We will discuss (briefly) ‒ Transactions/Transaction recovery ‒ System Recovery 3 Transactions A database is updated by processing transactions that result in changes to one or more records. A user’s program may carry out many operations on the data retrieved from the database, but the DBMS is only concerned with data read/written from/to the database. The DBMS’s abstract view of a user program is a sequence of transactions (reads and writes). To understand database recovery, we must first understand the concept of transaction integrity. 4 Transactions A transaction is considered a logical unit of work ‒ START Statement: BEGIN TRANSACTION ‒ END Statement: COMMIT ‒ Execution errors: ROLLBACK Assume we want to transfer $100 from one bank (A) account to another (B): UPDATE Account_A SET Balance= Balance -100; UPDATE Account_B SET Balance= Balance +100; We want these two operations to appear as a single atomic action 5 Transactions We want these two operations to appear as a single atomic action ‒ To avoid inconsistent states of the database in-between the two updates ‒ And obviously we cannot allow the first UPDATE to be executed and the second not or vice versa. -
Oracle Vs. Nosql Vs. Newsql Comparing Database Technology
Oracle vs. NoSQL vs. NewSQL Comparing Database Technology John Ryan Senior Solution Architect, Snowflake Computing Table of Contents The World has Changed . 1 What’s Changed? . 2 What’s the Problem? . .. 3 Performance vs. Availability and Durability . 3 Consistecy vs. Availability . 4 Flexibility vs . Scalability . 5 ACID vs. Eventual Consistency . 6 The OLTP Database Reimagined . 7 Achieving the Impossible! . .. 8 NewSQL Database Technology . 9 VoltDB . 10 MemSQL . 11 Which Applications Need NewSQL Technology? . 12 Conclusion . 13 About the Author . 13 ii The World has Changed The world has changed massively in the past 20 years. Back in the year 2000, a few million users connected to the web using a 56k modem attached to a PC, and Amazon only sold books. Now billions of people are using to their smartphone or tablet 24x7 to buy just about everything, and they’re interacting with Facebook, Twitter and Instagram. The pace has been unstoppable . Expectations have also changed. If a web page doesn’t refresh within seconds we’re quickly frustrated, and go elsewhere. If a web site is down, we fear it’s the end of civilisation as we know it. If a major site is down, it makes global headlines. Instant gratification takes too long! — Ladawn Clare-Panton Aside: If you’re not a seasoned Database Architect, you may want to start with my previous articles on Scalability and Database Architecture. Oracle vs. NoSQL vs. NewSQL eBook 1 What’s Changed? The above leads to a few observations: • Scalability — With potentially explosive traffic growth, IT systems need to quickly grow to meet exponential numbers of transactions • High Availability — IT systems must run 24x7, and be resilient to failure. -
Data Platforms Map from 451 Research
1 2 3 4 5 6 Azure AgilData Cloudera Distribu2on HDInsight Metascale of Apache Kaa MapR Streams MapR Hortonworks Towards Teradata Listener Doopex Apache Spark Strao enterprise search Apache Solr Google Cloud Confluent/Apache Kaa Al2scale Qubole AWS IBM Azure DataTorrent/Apache Apex PipelineDB Dataproc BigInsights Apache Lucene Apache Samza EMR Data Lake IBM Analy2cs for Apache Spark Oracle Stream Explorer Teradata Cloud Databricks A Towards SRCH2 So\ware AG for Hadoop Oracle Big Data Cloud A E-discovery TIBCO StreamBase Cloudera Elas2csearch SQLStream Data Elas2c Found Apache S4 Apache Storm Rackspace Non-relaonal Oracle Big Data Appliance ObjectRocket for IBM InfoSphere Streams xPlenty Apache Hadoop HP IDOL Elas2csearch Google Azure Stream Analy2cs Data Ar2sans Apache Flink Azure Cloud EsgnDB/ zone Platforms Oracle Dataflow Endeca Server Search AWS Apache Apache IBM Ac2an Treasure Avio Kinesis LeanXcale Trafodion Splice Machine MammothDB Drill Presto Big SQL Vortex Data SciDB HPCC AsterixDB IBM InfoSphere Towards LucidWorks Starcounter SQLite Apache Teradata Map Data Explorer Firebird Apache Apache JethroData Pivotal HD/ Apache Cazena CitusDB SIEM Big Data Tajo Hive Impala Apache HAWQ Kudu Aster Loggly Ac2an Ingres Sumo Cloudera SAP Sybase ASE IBM PureData January 2016 Logic Search for Analy2cs/dashDB Logentries SAP Sybase SQL Anywhere Key: B TIBCO Splunk Maana Rela%onal zone B LogLogic EnterpriseDB SQream General purpose Postgres-XL Microso\ Ry\ X15 So\ware Oracle IBM SAP SQL Server Oracle Teradata Specialist analy2c PostgreSQL Exadata -
ACID Compliant Distributed Key-Value Store
ACID Compliant Distributed Key-Value Store #1 #2 #3 Lakshmi Narasimhan Seshan , Rajesh Jalisatgi , Vijaeendra Simha G A # 1l [email protected] # 2r [email protected] #3v [email protected] Abstract thought that would be easier to implement. All Built a fault-tolerant and strongly-consistent key/value storage components run http and grpc endpoints. Http endpoints service using the existing RAFT implementation. Add ACID are for debugging/configuration. Grpc Endpoints are transaction capability to the service using a 2PC variant. Build used for communication between components. on the raftexample[1] (available as part of etcd) to add atomic Replica Manager (RM): RM is the service discovery transactions. This supports sharding of keys and concurrent transactions on sharded KV Store. By Implementing a part of the system. RM is initiated with each of the transactional distributed KV store, we gain working knowledge Replica Servers available in the system. Users have to of different protocols and complexities to make it work together. instantiate the RM with the number of shards that the user wants the key to be split into. Currently we cannot Keywords— ACID, Key-Value, 2PC, sharding, 2PL Transaction dynamically change while RM is running. New Replica Servers cannot be added or removed from once RM is I. INTRODUCTION up and running. However Replica Servers can go down Distributed KV stores have become a norm with the and come back and RM can detect this. Each Shard advent of microservices architecture and the NoSql Leader updates the information to RM, while the DTM revolution. Initially KV Store discarded ACID properties leader updates its information to the RM.